

{"id":13697,"date":"2018-04-23T07:20:45","date_gmt":"2018-04-23T07:20:45","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=13697"},"modified":"2018-04-23T07:20:45","modified_gmt":"2018-04-23T07:20:45","slug":"pig-architecture","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/pig-architecture\/","title":{"rendered":"Apache Pig Architecture &#8211; Learn Pig Hadoop Working"},"content":{"rendered":"<p><span style=\"font-weight: 400\">To write a Pig script, we need the Pig Latin language. We also need an execution environment in which to run the script. <\/span><\/p>\n<p><span style=\"font-weight: 400\">So, in this article &#8220;Introduction to Apache Pig Architecture&#8221;, we will study the complete architecture of<strong> Apache Pig<\/strong>. We cover its components, the Pig Latin Data Model, and the Pig Job Execution Flow in depth.<\/span><\/p>\n<h2>What is Apache Pig Architecture?<\/h2>\n<p>Pig Latin is the language we use in Pig to analyze data in <strong>Hadoop<\/strong>. It is a high-level data processing language that offers a rich set of data types and operators for performing operations on the data.<\/p>\n<p>To perform a particular task, programmers write a Pig script in Pig Latin and run it using one of Pig\u2019s execution mechanisms: the Grunt shell (interactive mode), a script file (batch mode), or Pig embedded in a host program (embedded mode). UDFs (user-defined functions) extend the language rather than being an execution mechanism of their own.<\/p>\n<p>To produce the desired output, these scripts go through a series of transformations applied by the Pig framework once they are submitted for execution.<\/p>\n<p>Internally, Pig converts these scripts into a series of <strong>MapReduce<\/strong> jobs, which makes the programmer\u2019s job easier. 
Here is the architecture of Apache Pig.<\/p>\n<div id=\"attachment_14487\" style=\"width: 1090px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-1.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-14487\" class=\"wp-image-14487 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-1.png\" alt=\"Architecture of Apache Pig\" width=\"1080\" height=\"1080\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-1.png 1080w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-1-150x150.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-1-300x300.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-1-768x768.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-1-1024x1024.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-1-100x100.png 100w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/a><p id=\"caption-attachment-14487\" class=\"wp-caption-text\">Architecture of Apache Pig<\/p><\/div>\n<h2><span style=\"font-weight: 400\">Apache Pig Components<\/span><\/h2>\n<p><span style=\"font-weight: 400\">There are several components in the Apache Pig framework. Let\u2019s study these major components in detail:<\/span><\/p>\n<h3><span style=\"font-weight: 400\">i. Parser<\/span><\/h3>\n<p><span style=\"font-weight: 400\">First, all Pig scripts are handled by the Parser. The Parser checks the syntax of the script, performs type checking, and carries out other miscellaneous checks. 
The Parser\u2019s output is a DAG (directed acyclic graph) that represents the Pig Latin statements and logical operators.<\/span><\/p>\n<p><span style=\"font-weight: 400\">In this DAG (the logical plan), the logical operators of the script are the nodes and the data flows are the edges.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">ii. Optimizer<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Next, the logical plan (DAG) is passed to the logical optimizer, which carries out logical optimizations such as projection and pushdown (for example, pruning unused columns and applying filters earlier).<\/span><\/p>\n<h3><span style=\"font-weight: 400\">iii. Compiler<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The compiler then compiles the optimized logical plan into a series of MapReduce jobs.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">iv. Execution engine<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Finally, the MapReduce jobs are submitted to Hadoop in sorted order. These jobs are then executed on Hadoop to produce the desired results.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Pig Latin Data Model<\/span><\/h2>\n<div id=\"attachment_14504\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Pig-Latin-Data-Model-01.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-14504\" class=\"wp-image-14504 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Pig-Latin-Data-Model-01.jpg\" alt=\"Apache Pig architecture\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Pig-Latin-Data-Model-01.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Pig-Latin-Data-Model-01-150x79.jpg 150w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Pig-Latin-Data-Model-01-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Pig-Latin-Data-Model-01-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Pig-Latin-Data-Model-01-1024x536.jpg 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-14504\" class=\"wp-caption-text\">Apache Pig architecture &#8211; Pig Latin Data Model<\/p><\/div>\n<p><span style=\"font-weight: 400\">The Pig Latin data model is fully nested and allows complex non-atomic data types such as map and tuple. Let\u2019s discuss this data model in detail:<\/span><\/p>\n<h3><span style=\"font-weight: 400\">i. Atom<\/span><\/h3>\n<p><span style=\"font-weight: 400\">An atom is any single value in Pig Latin, irrespective of its data type. It is stored as a string and can be used as either a string or a number. The atomic types in Pig are int, long, float, double, chararray, and bytearray. A field is a piece of data or a simple atomic value in Pig.<\/span><\/p>\n<p><span style=\"font-weight: 400\">For Example \u2212 \u2018Shubham\u2019 or \u201825\u2019<\/span><\/p>\n<h3><span style=\"font-weight: 400\">ii. Tuple<\/span><\/h3>\n<p><span style=\"font-weight: 400\">A tuple is a record formed by an ordered set of fields, and the fields can be of any type. A tuple is similar to a row in an RDBMS table.<\/span><br \/>\n<span style=\"font-weight: 400\">For Example \u2212 (Shubham, 25)<\/span><\/p>\n<h3><span style=\"font-weight: 400\">iii. Bag<\/span><\/h3>\n<p><span style=\"font-weight: 400\">A bag is an unordered collection of tuples, which need not be unique. Each tuple can have any number of fields (flexible schema). We represent a bag with \u2018{}\u2019. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">For Example \u2212 {(Shubham, 25), (Pulkit, 35)}<\/span><\/p>\n<p><span style=\"font-weight: 400\">When a bag is a field in a relation, it is known as an inner bag.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Example \u2212 (Shubham, 25, {(9826022258, Shubham@gmail.com)})<\/span><\/p>\n<h3><span style=\"font-weight: 400\">iv. Map<\/span><\/h3>\n<p><span style=\"font-weight: 400\">A map (or data map) is a set of key-value pairs. The key must be of type chararray and should be unique, while the value can be of any type. We represent a map with \u2018[]\u2019.<\/span><\/p>\n<p><span style=\"font-weight: 400\">For Example \u2212 [name#Shubham, age#25]<\/span><\/p>\n<h3><span style=\"font-weight: 400\">v. Relation<\/span><\/h3>\n<p><span style=\"font-weight: 400\">A relation is a bag of tuples. Relations in Pig Latin are unordered, so there is no guarantee that tuples are processed in any particular order.<\/span><\/p>\n<p>So, this was all about Apache Pig Architecture. We hope you like our explanation.<\/p>\n<h2><span style=\"font-weight: 400\">Conclusion<br \/>\n<\/span><\/h2>\n<p>In this article, we have seen the whole Apache Pig Architecture in detail. If you have any questions about Apache Pig Architecture, feel free to ask in the comment section.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In order to write a Pig script, we do require a Pig Latin language. Moreover, we need an execution environment to execute them. 
So, in this article &#8220;Introduction to Apache Pig Architecture&#8221;, we will&#46;&#46;&#46;<\/p>\n","protected":false},"author":7,"featured_media":42875,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[40],"tags":[864,868,2790,4454,8130,9299,9414,9496,9511],"class_list":["post-13697","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-pig","tag-apache-pig-architecture","tag-apache-pig-components","tag-compiler","tag-execution-engine","tag-learn-apache","tag-optimizer","tag-parser","tag-pig-architecture","tag-pig-latin-data-model"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Apache Pig Architecture - Learn Pig Hadoop Working - DataFlair<\/title>\n<meta name=\"description\" content=\"Apache Pig Architecture - Learn architecture of Apache Pig with Pig components - Parser, Optimizer,Compiler, Execution Engine. Learn Pig Job Execution Flow\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/pig-architecture\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Apache Pig Architecture - Learn Pig Hadoop Working - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Apache Pig Architecture - Learn architecture of Apache Pig with Pig components - Parser, Optimizer,Compiler, Execution Engine. 
Learn Pig Job Execution Flow\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/pig-architecture\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2018-04-23T07:20:45+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Apache-Pig-Architecture-01.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Apache Pig Architecture - Learn Pig Hadoop Working - DataFlair","description":"Apache Pig Architecture - Learn architecture of Apache Pig with Pig components - Parser, Optimizer,Compiler, Execution Engine. Learn Pig Job Execution Flow","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/pig-architecture\/","og_locale":"en_US","og_type":"article","og_title":"Apache Pig Architecture - Learn Pig Hadoop Working - DataFlair","og_description":"Apache Pig Architecture - Learn architecture of Apache Pig with Pig components - Parser, Optimizer,Compiler, Execution Engine. 
Learn Pig Job Execution Flow","og_url":"https:\/\/data-flair.training\/blogs\/pig-architecture\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2018-04-23T07:20:45+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Apache-Pig-Architecture-01.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/pig-architecture\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/pig-architecture\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd"},"headline":"Apache Pig Architecture &#8211; Learn Pig Hadoop Working","datePublished":"2018-04-23T07:20:45+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/pig-architecture\/"},"wordCount":720,"commentCount":0,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/pig-architecture\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Apache-Pig-Architecture-01.jpg","keywords":["Apache pig Architecture","Apache Pig Components","Compiler","Execution Engine","Learn Apache","Optimizer","Parser","Pig Architecture","Pig Latin Data Model"],"articleSection":["Pig 
Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/pig-architecture\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/pig-architecture\/","url":"https:\/\/data-flair.training\/blogs\/pig-architecture\/","name":"Apache Pig Architecture - Learn Pig Hadoop Working - DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/pig-architecture\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/pig-architecture\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Apache-Pig-Architecture-01.jpg","datePublished":"2018-04-23T07:20:45+00:00","description":"Apache Pig Architecture - Learn architecture of Apache Pig with Pig components - Parser, Optimizer,Compiler, Execution Engine. Learn Pig Job Execution Flow","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/pig-architecture\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/pig-architecture\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/pig-architecture\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Apache-Pig-Architecture-01.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Apache-Pig-Architecture-01.jpg","width":1200,"height":628,"caption":"Apache Pig Architecture - Learn Pig Hadoop Working"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/pig-architecture\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"Pig 
Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/pig\/"},{"@type":"ListItem","position":3,"name":"Apache Pig Architecture &#8211; Learn Pig Hadoop Working"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd","name":"DataFlair 
Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"DataFlair Team specializes in creating clear, actionable content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. Backed by industry expertise, we make learning easy and career-oriented for beginners and pros alike.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam3\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/13697","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=13697"}],"version-history":[{"count":0,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/13697\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/42875"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=13697"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=13697"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=13697"}],"curies":[{"
name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}