

{"id":13303,"date":"2018-04-16T08:42:45","date_gmt":"2018-04-16T08:42:45","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=13303"},"modified":"2018-04-16T08:42:45","modified_gmt":"2018-04-16T08:42:45","slug":"hadoop-pig-tutorial","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/","title":{"rendered":"Hadoop Pig Tutorial: A Comprehensive Guide to Pig Hadoop"},"content":{"rendered":"<p><span style=\"font-weight: 400\">When it comes to analyzing large sets of data and representing them as data flows, we use Apache Pig. It is an abstraction over MapReduce. So, in this Hadoop Pig Tutorial, we will discuss the whole concept of Hadoop Pig. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Apart from the introduction, the tutorial covers the history of Pig, the need for it, its architecture, and its features. We will also see some comparisons, such as <strong>Pig Vs Hive<\/strong>, Apache Pig Vs SQL, and Hadoop Pig Vs MapReduce.<\/span><\/p>\n<p>So, let&#8217;s start the Hadoop Pig Tutorial.<\/p>\n<h2><span style=\"font-weight: 400\">What is Hadoop Pig?<\/span><\/h2>\n<p>Hadoop Pig is an abstraction over MapReduce. We use it to analyze large sets of data and to represent them as data flows. Generally, we use it with <strong>Hadoop<\/strong>. By using Pig, we can perform all the data manipulation operations in Hadoop.<\/p>\n<p><span style=\"font-weight: 400\">In addition, Pig offers a high-level language for writing data analysis programs, which we call Pig Latin. One of the major advantages of this language is that it offers several operators. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">Using these operators, programmers can develop their own functions for reading, writing, and processing data.<\/span><br \/>\n<span style=\"font-weight: 400\">Pig has the following key properties:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Ease of programming<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Complex tasks made up of multiple interrelated data transformations are explicitly encoded as data flow sequences, which makes them easy to write, understand, and maintain.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Optimization opportunities<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The way in which tasks are encoded permits the system to optimize their execution automatically, which allows users to focus on semantics rather than efficiency.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Extensibility<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Users can create their own functions to do special-purpose processing.<\/span><br \/>\n<span style=\"font-weight: 400\">Hence, to analyze data using Apache Pig, programmers write scripts in the Pig Latin language. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Internally, all these scripts are converted to Map and Reduce tasks. This is done by a component called the Pig Engine, which accepts Pig Latin scripts as input and converts them into MapReduce jobs.<\/span><\/p>\n<p>Next in the Hadoop Pig Tutorial is its history.<\/p>\n<h2><span style=\"font-weight: 400\">Hadoop Pig Tutorial &#8211; History<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Apache Pig was developed as a research project at Yahoo in 2006, to create and execute MapReduce jobs on every dataset. In 2007, Pig was open sourced through the Apache Incubator. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">The first release of Apache Pig came out in 2008. Pig then graduated as an Apache top-level project in 2010.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Why Do We Need Apache Pig?<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Programmers who are not comfortable with Java often struggle to work with Hadoop, especially when writing MapReduce tasks. Pig is a boon for all such programmers because:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Using Pig Latin, programmers can perform MapReduce tasks easily, without having to write complex code in Java.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">It also helps reduce the length of code, since Pig uses a multi-query approach. For example, an operation that would require about 200 lines of code (LoC) in Java can often be done in as few as 10 LoC in Apache Pig. Overall, Pig is said to reduce development time by almost 16 times.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">If you are familiar with SQL, Pig is easy to learn, because Pig Latin is an SQL-like language.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">It offers many built-in operators to support data operations such as joins, filters, ordering, and many more. It also offers nested data types that are missing from MapReduce, such as tuples, bags, and maps.<\/span><\/li>\n<\/ul>\n<p>Further in the Hadoop Pig Tutorial, let&#8217;s understand where we can use Pig.<\/p>\n<h2><span style=\"font-weight: 400\">Hadoop Pig Tutorial &#8211; Using Pig<\/span><\/h2>\n<p><span style=\"font-weight: 400\">There are several scenarios where we can use Pig. 
For example:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">When data loads are time sensitive.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">When processing various data sources.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">When we require analytical insights through sampling.<\/span><\/li>\n<\/ul>\n<h2><span style=\"font-weight: 400\">Where Not to Use Pig?<\/span><\/h2>\n<p><span style=\"font-weight: 400\">There are also some scenarios where Pig is not a good fit:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">When the data is completely unstructured, such as video, audio, and readable text.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">When time constraints exist, since Pig is slower than hand-written MapReduce jobs.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">When more power is required to optimize the code.<\/span><\/li>\n<\/ul>\n<h2><span style=\"font-weight: 400\">Architecture of Hadoop Pig<\/span><\/h2>\n<p><span style=\"font-weight: 400\">The image below shows the architecture of Apache Pig.<\/span><\/p>\n<div id=\"attachment_13306\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-01.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-13306\" class=\"wp-image-13306 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-01.jpg\" alt=\"Hadoop Pig Tutorial\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-01.jpg 1200w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-01-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-01-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-01-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Architecture-of-Apache-Pig-01-1024x536.jpg 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-13306\" class=\"wp-caption-text\">Apache Pig Architecture: Hadoop Pig Tutorial<\/p><\/div>\n<p><span style=\"font-weight: 400\">As the image shows, there are several components in the Hadoop Pig framework. The major components are:<\/span><\/p>\n<h3><span style=\"font-weight: 400\">i. Parser<\/span><\/h3>\n<p><span style=\"font-weight: 400\">At first, all Pig scripts are handled by the Parser. The Parser checks the syntax of the script, performs type checking, and carries out other miscellaneous checks. Its output is a DAG (directed acyclic graph) that represents the Pig Latin statements and logical operators.<\/span><\/p>\n<p><span style=\"font-weight: 400\">In this DAG (the logical plan), the logical operators of the script are represented as nodes and the data flows are represented as edges.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">ii. Optimizer<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Next, the DAG is passed to the logical optimizer, which carries out logical optimizations such as projection and pushdown.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">iii. Compiler<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The compiler compiles the optimized logical plan into a series of MapReduce jobs.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">iv. 
Execution Engine<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Finally, the MapReduce jobs are submitted to Hadoop in sorted order. They are then executed on Hadoop, producing the desired results.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Hadoop Pig Tutorial &#8211; Pig Features<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Now it is time in the Hadoop Pig Tutorial to learn the features that make Pig what it is. Pig has several features, such as: <\/span><\/p>\n<h3><span style=\"font-weight: 400\">i. Rich set of operators <\/span><\/h3>\n<p><span style=\"font-weight: 400\">To perform a variety of operations, Pig offers many operators, such as join, sort, filter, and many more.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">ii. Ease of programming <\/span><\/h3>\n<p><span style=\"font-weight: 400\">If you are good at SQL, it is easy to write a Pig script, because Pig Latin is similar to SQL.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">iii. Optimization opportunities<\/span><\/h3>\n<p><span style=\"font-weight: 400\">In Apache Pig, the execution of tasks is optimized automatically. As a result, programmers need to focus only on the semantics of the language.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">iv. Extensibility <\/span><\/h3>\n<p><span style=\"font-weight: 400\">With Pig, it is easy to read, process, and write data using the existing operators. Users can also develop their own functions.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">v. UDFs <\/span><\/h3>\n<p><span style=\"font-weight: 400\">With Pig, we can create User-defined Functions (UDFs) in other programming languages such as Java, and invoke or embed them in Pig scripts.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">vi. Handles all kinds of data <\/span><\/h3>\n<p><span style=\"font-weight: 400\">Pig generally analyzes all kinds of data. 
This includes both structured and unstructured data. Moreover, it stores the results in <strong>HDFS<\/strong>.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Recommended Skills prior to learning Pig<\/span><\/h2>\n<p><span style=\"font-weight: 400\">The following skills are recommended before learning Pig:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Basic knowledge of the Linux operating system <\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Fundamental programming skills<\/span><\/li>\n<\/ul>\n<h2><span style=\"font-weight: 400\">Pig Vs MapReduce<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Some major differences between Hadoop Pig and <strong>MapReduce<\/strong> are:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><b>Apache Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Pig is a data flow language.<\/span><\/p>\n<ul>\n<li><b>MapReduce<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">MapReduce is a data processing paradigm.<\/span><\/p>\n<ul>\n<li><b>Hadoop Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Pig is a high-level language.<\/span><\/p>\n<ul>\n<li><b>MapReduce<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">MapReduce is low level and rigid.<\/span><\/p>\n<ul>\n<li><b>Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">In Apache Pig, performing a Join operation is pretty simple.<\/span><\/p>\n<ul>\n<li><b>MapReduce<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">In MapReduce, it is quite difficult to perform a Join operation between datasets.<\/span><\/p>\n<ul>\n<li><b>Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">With a basic knowledge of SQL, any novice programmer can work conveniently with Pig.<\/span><\/p>\n<ul>\n<li><b>MapReduce<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">To work with MapReduce, exposure to Java is essential.<\/span><\/p>\n<ul>\n<li><b>Hadoop Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Pig uses a 
multi-query approach, thereby reducing the length of the code to a great extent.<\/span><\/p>\n<ul>\n<li><b>MapReduce<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">MapReduce needs almost 20 times as many lines to perform the same task.<\/span><\/p>\n<ul>\n<li><b>Apache Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Pig requires no separate compilation step. At execution time, Pig operators are converted internally into MapReduce jobs.<\/span><\/p>\n<ul>\n<li><b>MapReduce<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">MapReduce jobs have a long compilation process.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Hadoop Pig Vs SQL<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Here are the major differences between Apache Pig and SQL.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><b>Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Pig Latin is a procedural language.<\/span><\/p>\n<ul>\n<li><b>SQL<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">SQL is a declarative language.<\/span><\/p>\n<ul>\n<li><b>Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">In Pig, the schema is optional. We can store data without designing a schema. 
However, the fields are then referred to by position, as $0, $1, and so on.<\/span><\/p>\n<ul>\n<li><b>SQL<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">In SQL, a schema is mandatory.<\/span><\/p>\n<ul>\n<li><b>Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">In Pig, the data model is nested relational.<\/span><\/p>\n<ul>\n<li><b>SQL<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">In SQL, the data model is flat relational.<\/span><\/p>\n<ul>\n<li><b>Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">In Pig, there is limited opportunity for query optimization.<\/span><\/p>\n<ul>\n<li><b>SQL<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">In SQL, there is more opportunity for query optimization.<\/span><br \/>\n<span style=\"font-weight: 400\">Also, Apache Pig Latin:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Offers splits in the pipeline.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Allows developers to store data anywhere in the pipeline.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Declares execution plans.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Offers operators to perform ETL (Extract, Transform, and Load) functions.<\/span><\/li>\n<\/ul>\n<p>Any doubts so far in the Hadoop Pig Tutorial? Please comment.<\/p>\n<h2><span style=\"font-weight: 400\">Apache Pig Vs Hive<\/span><\/h2>\n<p><span style=\"font-weight: 400\">We use both Pig and Hive to create MapReduce jobs. At times, <strong>Hive<\/strong> operates on HDFS in the same way as Pig does. Here we list a few significant points that set Apache Pig apart from Hive.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><b>Hadoop Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Apache Pig uses a language called Pig Latin. 
It was originally created at Yahoo.<\/span><\/p>\n<ul>\n<li><b>Hive<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Hive uses a language called HiveQL. It was originally created at Facebook.<\/span><\/p>\n<ul>\n<li><b>Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Pig Latin is a data flow language.<\/span><\/p>\n<ul>\n<li><b>Hive<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">HiveQL is a query processing language.<\/span><\/p>\n<ul>\n<li><b>Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Pig Latin is a procedural language that fits the pipeline paradigm.<\/span><\/p>\n<ul>\n<li><b>Hive<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">HiveQL is a declarative language.<\/span><\/p>\n<ul>\n<li><b>Apache Pig<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Pig can handle structured, unstructured, and semi-structured data.<\/span><\/p>\n<ul>\n<li><b>Hive<\/b><\/li>\n<\/ul>\n<p>Hive is mostly for structured data.<\/p>\n<h2><span style=\"font-weight: 400\">Applications of Pig<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Data scientists generally use Apache Pig for tasks involving ad-hoc processing and quick prototyping. Other applications include:<\/span><\/p>\n<ol>\n<li><span style=\"font-weight: 400\"> Processing huge data sources such as weblogs.<\/span><\/li>\n<li><span style=\"font-weight: 400\"> Performing data processing for search platforms.<\/span><\/li>\n<li>Processing time-sensitive data loads.<\/li>\n<\/ol>\n<p>So, this was all on the Hadoop Pig Tutorial. We hope you like our explanation.<\/p>\n<h2><span style=\"font-weight: 400\">Conclusion &#8211; Hadoop Pig Tutorial<\/span><\/h2>\n<p><span style=\"font-weight: 400\">We have seen the whole concept of Hadoop Pig in this Hadoop Pig Tutorial. Apart from its uses, we have also seen where not to use it. We have also seen the skills recommended before learning it. 
If you have any doubts regarding Apache Pig, feel free to ask in the comment section.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>When it comes to analyzing large sets of data and representing them as data flows, we use Apache Pig. It is an abstraction over MapReduce. So, in this Hadoop&#46;&#46;&#46;<\/p>\n","protected":false},"author":7,"featured_media":35518,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[40],"tags":[863,864,1052,16671,5307,6999,9495,9502,16672,9505,9521,9522,9524,9525,16166],"class_list":["post-13303","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-pig","tag-apache-pig","tag-apache-pig-architecture","tag-applications-of-pig","tag-hadoop-pig-tutorial","tag-hadoop-pig-vs-sql","tag-introduction-to-apache-pig","tag-pig-applications","tag-pig-features","tag-pig-hadoop","tag-pig-history","tag-pig-use-cases","tag-pig-vs-hive","tag-pig-vs-mapreduce","tag-pig-vs-sql","tag-why-pig"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Hadoop Pig Tutorial: A Comprehensive Guide to Pig Hadoop - DataFlair<\/title>\n<meta name=\"description\" content=\"Apache Hadoop Pig Tutorial:Pig Introduction,History,Architecture,Applications, Features, Difference between Apache Pig Vs Hive, Pig vs SQL, Pig vs MapReduce\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Hadoop Pig Tutorial: A Comprehensive Guide to Pig Hadoop - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Apache Hadoop Pig 
Tutorial:Pig Introduction,History,Architecture,Applications, Features, Difference between Apache Pig Vs Hive, Pig vs SQL, Pig vs MapReduce\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2018-04-16T08:42:45+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Hadoop-Pig-Tutorial-01.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Hadoop Pig Tutorial: A Comprehensive Guide to Pig Hadoop - DataFlair","description":"Apache Hadoop Pig Tutorial:Pig Introduction,History,Architecture,Applications, Features, Difference between Apache Pig Vs Hive, Pig vs SQL, Pig vs MapReduce","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/","og_locale":"en_US","og_type":"article","og_title":"Hadoop Pig Tutorial: A Comprehensive Guide to Pig Hadoop - DataFlair","og_description":"Apache Hadoop Pig Tutorial:Pig Introduction,History,Architecture,Applications, Features, Difference between Apache Pig Vs Hive, Pig vs SQL, Pig vs MapReduce","og_url":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2018-04-16T08:42:45+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Hadoop-Pig-Tutorial-01.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd"},"headline":"Hadoop Pig Tutorial: A Comprehensive Guide to Pig Hadoop","datePublished":"2018-04-16T08:42:45+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/"},"wordCount":1600,"commentCount":0,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Hadoop-Pig-Tutorial-01.jpg","keywords":["apache pig","Apache pig Architecture","Applications of Pig","Hadoop Pig Tutorial","Hadoop Pig vs SQL","Introduction to Apache Pig","Pig Applications","Pig Features","Pig Hadoop","Pig History","Pig use cases","pig vs hive","Pig vs MapReduce","Pig vs SQL","Why Pig"],"articleSection":["Pig Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/","url":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/","name":"Hadoop Pig Tutorial: A Comprehensive Guide to Pig Hadoop - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Hadoop-Pig-Tutorial-01.jpg","datePublished":"2018-04-16T08:42:45+00:00","description":"Apache Hadoop Pig Tutorial:Pig Introduction,History,Architecture,Applications, Features, Difference between Apache Pig Vs Hive, Pig vs SQL, Pig vs MapReduce","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Hadoop-Pig-Tutorial-01.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/04\/Hadoop-Pig-Tutorial-01.jpg","width":1200,"height":628,"caption":"Hadoop Pig Tutorial: A Comprehensive Guide to Pig Hadoop"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/hadoop-pig-tutorial\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"Pig Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/pig\/"},{"@type":"ListItem","position":3,"name":"Hadoop Pig Tutorial: A Comprehensive Guide to Pig Hadoop"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. 
Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"DataFlair Team specializes in creating clear, actionable content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Backed by industry expertise, we make learning easy and career-oriented for beginners and pros alike.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam3\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/13303","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=13303"}],"version-history":[{"count":0,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/13303\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/35518"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=13303"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=13303"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=13303"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}