

{"id":20004,"date":"2018-07-12T04:07:10","date_gmt":"2018-07-12T04:07:10","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=20004"},"modified":"2018-07-12T04:07:10","modified_gmt":"2018-07-12T04:07:10","slug":"hcatalog-reader-writer","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/","title":{"rendered":"Learn Apache HCatalog Reader Writer With Example"},"content":{"rendered":"<p><span style=\"font-weight: 400\">In our last <strong>HCatalog tutorial<\/strong>, we discussed the <strong>input-output interface<\/strong>. Today, we will learn about the HCatalog Reader Writer. In this HCatalog blog, we will see how HCatalog supports parallel input and output without using MapReduce.<\/span><\/p>\n<p><span style=\"font-weight: 400\">In short, it offers a way to read data from a Hadoop cluster or to write data into a Hadoop cluster.<\/span><\/p>\n<p>So, let&#8217;s start with the HCatalog Reader Writer.<\/p>\n<h2><span style=\"font-weight: 400\">What is HCatalog Reader Writer?<\/span><\/h2>\n<p><span style=\"font-weight: 400\">HCatalog offers a data transfer API for parallel input and output without using <strong>MapReduce<\/strong>. 
Using a basic storage abstraction of tables and rows, this API offers a way to read data from a Hadoop cluster or to write data into a Hadoop cluster.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The data transfer API has three essential classes:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><strong>HCatReader<\/strong> \u2013 Reads data from a Hadoop cluster.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><strong>HCatWriter<\/strong> \u2013 Writes data into a Hadoop cluster.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><strong>DataTransferFactory<\/strong> \u2013 Generates reader and writer instances.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The data transfer API also has some auxiliary classes:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">ReadEntity<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">ReaderContext<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">WriteEntity<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">WriterContext<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The HCatalog data transfer API is designed to facilitate the integration of external systems with Hadoop.<\/span><br \/>\n<span style=\"font-weight: 400\">Note that HCatalog is not thread-safe.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">HCatReader<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Reading is a two-step process. The first step occurs on the master node of an external system. 
The second step is done in parallel on multiple slave nodes.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Reads are done on a \u201cReadEntity\u201d. Before we start to read, we need to define the ReadEntity from which to read, which is done through ReadEntity.Builder. We can specify a database name, table name, partition, and a filter string.<\/span><\/p>\n<p><strong>Example of HCatalog Reader:<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\">ReadEntity.Builder builder = new ReadEntity.Builder();\nReadEntity entity = builder.withDatabase(\"mydatabase\").withTable(\"mytable\").build();<\/pre>\n<p><span style=\"font-weight: 400\">The above code defines a ReadEntity object (\u201centity\u201d) for a table named \u201cmytable\u201d in a database named \u201cmydatabase\u201d, which we can use to read all the rows of this table. Note that the table must exist in HCatalog before this operation starts.<\/span><\/p>\n<p><span style=\"font-weight: 400\">After defining a ReadEntity, we obtain an instance of HCatReader using the ReadEntity and the cluster configuration:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">HCatReader reader = DataTransferFactory.getHCatReader(entity, config);<\/pre>\n<p><span style=\"font-weight: 400\">The next step is to obtain a ReaderContext from the reader:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">ReaderContext cntxt = reader.prepareRead();<\/pre>\n<p><span style=\"font-weight: 400\">All of the above steps occur on the master node. The master node then serializes this ReaderContext object and sends it to all the slave nodes. 
The slave nodes then use this reader context to read data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">\/\/ cntxt is the ReaderContext received from the master node\nfor (InputSplit split : cntxt.getSplits()) {\n    HCatReader reader = DataTransferFactory.getHCatReader(split, cntxt.getConf());\n    Iterator&lt;HCatRecord&gt; itr = reader.read();\n    while (itr.hasNext()) {\n        HCatRecord read = itr.next();\n    }\n}<\/pre>\n<h2><span style=\"font-weight: 400\">HCatWriter<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Similarly, writing is a two-step process in which the first step occurs on the master node and the second occurs in parallel on slave nodes.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Writes are done on a \u201cWriteEntity\u201d, which we can construct in a fashion similar to reads:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">WriteEntity.Builder builder = new WriteEntity.Builder();\nWriteEntity entity = builder.withDatabase(\"mydatabase\").withTable(\"mytable\").build();<\/pre>\n<p><span style=\"font-weight: 400\">The above code creates a WriteEntity object (\u201centity\u201d) that we can use to write into a table named \u201cmytable\u201d in the database \u201cmydatabase\u201d.<\/span><\/p>\n<p><span style=\"font-weight: 400\">After creating a WriteEntity, the next step is to obtain a WriterContext:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">HCatWriter writer = DataTransferFactory.getHCatWriter(entity, config);\nWriterContext info = writer.prepareWrite();<\/pre>\n<p><span style=\"font-weight: 400\">All of the above steps occur on the master node. 
The master node then serializes the WriterContext object and makes it available to all the slaves.<\/span><\/p>\n<p><span style=\"font-weight: 400\">On the slave nodes, we need to obtain an HCatWriter using the WriterContext:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">HCatWriter writer = DataTransferFactory.getHCatWriter(context);<\/pre>\n<p><span style=\"font-weight: 400\">The write method of the writer takes an iterator as its argument:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">writer.write(hCatRecordItr);<\/pre>\n<p><span style=\"font-weight: 400\">The writer calls next() on this iterator in a loop and writes out all the records attached to the iterator.<\/span><\/p>\n<p>So, this was all about the HCatalog Reader Writer. We hope you liked our explanation.<\/p>\n<h2><span style=\"font-weight: 400\">Conclusion<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Hence, we have covered the whole concept of the HCatalog Reader Writer, along with an HCatalog Reader example and an HCatalog Writer example. If you have any doubts regarding HCatalog Reader Writer, feel free to ask in the comment section.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In our last HCatalog tutorial, we discussed the input-output interface. Today, we will learn HCatalog Reader Writer. 
In this HCatalog blog, we will learn how HCatalog manages for parallel input and output even without&#46;&#46;&#46;<\/p>\n","protected":false},"author":7,"featured_media":21062,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24],"tags":[5532,5533,5539,5544,5547],"class_list":["post-20004","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-hcatalog","tag-hcatalog-reader","tag-hcatalog-reader-writer","tag-hcatalog-writer","tag-hcatreader","tag-hcatwriter"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Learn Apache HCatalog Reader Writer With Example - DataFlair<\/title>\n<meta name=\"description\" content=\"HCatalog Reader Writer tutorial, What is HCatalog Reader, What is HCatalog Writer, HCatalog Reader Example, HCatalog Writer Example\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Learn Apache HCatalog Reader Writer With Example - DataFlair\" \/>\n<meta property=\"og:description\" content=\"HCatalog Reader Writer tutorial, What is HCatalog Reader, What is HCatalog Writer, HCatalog Reader Example, HCatalog Writer Example\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2018-07-12T04:07:10+00:00\" \/>\n<meta property=\"og:image\" 
content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/07\/HCatalog-Reader-Writer-01.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Learn Apache HCatalog Reader Writer With Example - DataFlair","description":"HCatalog Reader Writer tutorial, What is HCatalog Reader, What is HCatalog Writer, HCatalog Reader Example, HCatalog Writer Example","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/","og_locale":"en_US","og_type":"article","og_title":"Learn Apache HCatalog Reader Writer With Example - DataFlair","og_description":"HCatalog Reader Writer tutorial, What is HCatalog Reader, What is HCatalog Writer, HCatalog Reader Example, HCatalog Writer Example","og_url":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2018-07-12T04:07:10+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/07\/HCatalog-Reader-Writer-01.jpg","type":"image\/jpeg"}],"author":"DataFlair 
Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd"},"headline":"Learn Apache HCatalog Reader Writer With Example","datePublished":"2018-07-12T04:07:10+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/"},"wordCount":624,"commentCount":0,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/07\/HCatalog-Reader-Writer-01.jpg","keywords":["Hcatalog reader","HCatalog Reader Writer","HCatalog Writer","HCatReader","HCatWriter"],"articleSection":["HCatalog Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/","url":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/","name":"Learn Apache HCatalog Reader Writer With Example - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/07\/HCatalog-Reader-Writer-01.jpg","datePublished":"2018-07-12T04:07:10+00:00","description":"HCatalog Reader Writer tutorial, What is HCatalog Reader, What is HCatalog Writer, HCatalog Reader Example, HCatalog Writer Example","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/07\/HCatalog-Reader-Writer-01.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/07\/HCatalog-Reader-Writer-01.jpg","width":1200,"height":628,"caption":"HCatalog Reader Writer"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/hcatalog-reader-writer\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"HCatalog Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/hcatalog\/"},{"@type":"ListItem","position":3,"name":"Learn Apache HCatalog Reader Writer With Example"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. 
Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"DataFlair Team specializes in creating clear, actionable content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Backed by industry expertise, we make learning easy and career-oriented for beginners and pros alike.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam3\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/20004","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=20004"}],"version-history":[{"count":0,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/20004\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/21062"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=20004"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=20004"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=20004"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}