

{"id":431,"date":"2016-06-13T14:02:03","date_gmt":"2016-06-13T14:02:03","guid":{"rendered":"http:\/\/data-flair.training\/blogs\/?p=431"},"modified":"2018-11-19T16:19:07","modified_gmt":"2018-11-19T10:49:07","slug":"hadoop-mapreduce-flow","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/","title":{"rendered":"Hadoop MapReduce Flow \u2013 How data flows in MapReduce?"},"content":{"rendered":"<h2>1. Objective<\/h2>\n<p><strong><a href=\"http:\/\/data-flair.training\/blogs\/hadoop-introduction-comprehensive-tutorial-guide-beginners\/\">Hadoop<\/a><\/strong> MapReduce processes huge amounts of data in parallel by dividing a job into a set of independent tasks (sub-jobs). In Hadoop, MapReduce works by breaking the processing into two phases: Map and Reduce. This tutorial explains the complete Hadoop MapReduce flow.<\/p>\n<p>This MapReduce tutorial covers the end-to-end Hadoop\u00a0MapReduce\u00a0flow. It addresses how Hadoop MapReduce works, how data is processed when a MapReduce job is submitted, which steps are involved in MapReduce job execution, and how those steps run one after another. 
Let&#8217;s answer all these questions with a detailed look at the Hadoop MapReduce flow.<\/p>\n<div id=\"attachment_42793\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-42793\" class=\"size-full wp-image-42793\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow.jpg\" alt=\"Hadoop MapReduce Flow \u2013 How data flows in MapReduce?\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow-1024x536.jpg 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow-520x272.jpg 520w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-42793\" class=\"wp-caption-text\">Hadoop MapReduce Flow \u2013 How data flows in MapReduce?<\/p><\/div>\n<h2>2. 
What is MapReduce?<\/h2>\n<p><strong>MapReduce<\/strong> is the data processing layer of Hadoop (the other layers are <strong><a href=\"http:\/\/data-flair.training\/blogs\/comprehensive-hdfs-guide-introduction-architecture-data-read-write-tutorial\/\">HDFS<\/a><\/strong>, the storage layer, and <strong><a href=\"http:\/\/data-flair.training\/blogs\/hadoop-yarn-tutorial\/\">YARN<\/a><\/strong>, the resource management layer).\u00a0MapReduce is a programming paradigm designed for processing huge\u00a0volumes of data in parallel by dividing the job (the submitted work)\u00a0into a set of independent tasks (sub-jobs). You only need to write your custom code (business logic) in the form MapReduce expects, and the engine takes care of the rest.<\/p>\n<h3>2.1. Hadoop MapReduce Flow &#8211; MapReduce Video Tutorial<\/h3>\n<p>We have also provided a video tutorial for a better understanding of the internals of the Hadoop MapReduce flow.<\/p>\n<h2>3. How Hadoop MapReduce Works?<\/h2>\n<div id=\"attachment_1347\" style=\"width: 610px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/12\/hadoop-mapreduce-data-flow-execution.gif\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-1347\" class=\"wp-image-1347 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/12\/hadoop-mapreduce-data-flow-execution.gif\" alt=\"hadoop mapreduce data flow execution\" width=\"600\" height=\"338\" \/><\/a><p id=\"caption-attachment-1347\" class=\"wp-caption-text\">MapReduce Data Flow<\/p><\/div>\n<p><strong>MapReduce<\/strong> is the heart of<strong> Hadoop<\/strong>. It is a programming model designed for processing huge\u00a0volumes of data (both structured and unstructured) in parallel by dividing the work into a set of independent tasks. Let&#8217;s discuss how MapReduce works internally.<\/p>\n<h3>3.1. 
MapReduce Internals<\/h3>\n<p>MapReduce is the combination of two processing idioms, <strong>Map<\/strong> and <strong>Reduce<\/strong>, in each of which we can specify our custom business logic. Map is the first phase of processing, where we specify the complex logic, business rules, and costly code. Reduce is the second phase of processing, where we specify lightweight processing such as aggregation or summation.<\/p>\n<h3>3.2. Step by step MapReduce Job Flow<\/h3>\n<p>The data processed by MapReduce should be stored in <strong>HDFS<\/strong>, which divides the data into blocks and stores them in a distributed manner across the cluster. For more details about HDFS,\u00a0<a href=\"http:\/\/data-flair.training\/blogs\/comprehensive-hdfs-guide-introduction-architecture-data-read-write-tutorial\/\">follow this HDFS comprehensive tutorial<\/a>. Below are the steps of the MapReduce data flow:<\/p>\n<ul>\n<li><strong>Step 1:<\/strong>\u00a0One block is processed by one<strong><a href=\"http:\/\/data-flair.training\/blogs\/mapper-in-hadoop-mapreduce\/\"> mapper<\/a><\/strong> at a time (more precisely, each mapper processes one input split, which by default corresponds to one HDFS block). In the mapper, a developer can specify their own business logic as per the requirements. In this manner, Map runs on all the nodes of the cluster and processes the data blocks in parallel.<\/li>\n<li><strong>Step 2:<\/strong>\u00a0The output of the mapper, also known as intermediate output, is written to the local disk. It is not stored on HDFS because it is temporary data, and <a href=\"http:\/\/data-flair.training\/blogs\/hdfs-data-write-operation\/\"><strong>writing to HDFS<\/strong> <\/a>would create unnecessary replicated copies.<\/li>\n<li><strong>Step 3:<\/strong>\u00a0The output of the mapper is shuffled to the <strong><a href=\"http:\/\/data-flair.training\/blogs\/reducer-in-hadoop-mapreduce\/\">reducer<\/a><\/strong> node (a normal slave node, called a reducer node because the reduce phase runs there). 
The shuffling\/copying is a physical movement of data over the network.<\/li>\n<li><strong>Step 4:<\/strong> Once all the mappers have finished and their output has been shuffled to the reducer nodes, this intermediate output is merged &amp; sorted and then provided as input to the reduce phase.<\/li>\n<li><strong>Step 5:<\/strong> Reduce is the second phase of processing, where the user can specify their own custom business logic as per the requirements. The input to a reducer comes from all the mappers. The output of the reducer is the final output, which is written to HDFS.<\/li>\n<\/ul>\n<p>In this manner, a MapReduce job is executed over the cluster. All the complexities of distributed processing, such as data\/code distribution, <strong><a href=\"http:\/\/data-flair.training\/blogs\/hadoop-high-availability-tutorial\/\">high availability<\/a><\/strong>, fault tolerance, and data locality, are handled by the framework. The user only needs to concentrate on their own business requirements and write custom code for the specified phases (map and reduce).<\/p>\n<h3>3.3. Data Locality<\/h3>\n<p><strong>Data locality<\/strong> is one of the most innovative principles: it says to move the algorithm close to the data rather than moving the data. Since the data can be in the range of petabytes, moving it across the network is not practical, so the algorithms \/ user code are moved to the location where the data is present. In summary, moving computation is cheaper than moving data. To learn data locality in detail,<a href=\"http:\/\/data-flair.training\/blogs\/data-locality-hadoop-mapreduce\/\"> follow this quick guide.<\/a><\/p>\n<h2>4. 
Conclusion<\/h2>\n<p>In conclusion, the data flow in MapReduce is the combination of Map and Reduce.\u00a0<strong>Map<\/strong> takes a set of data and converts it into another set of data, where individual elements are broken down into tuples (<a href=\"http:\/\/data-flair.training\/blogs\/key-value-pairs-hadoop-mapreduce\/\"><strong>key\/value pairs<\/strong><\/a>). <strong>Reduce<\/strong> then takes the output of Map as its input and combines those tuples by key, producing a new value for each key. In this manner, a MapReduce job is executed over the cluster, and all the complexities of distributed processing are handled by the framework.<br \/>\nIf you have any query related to the Hadoop MapReduce data flow, please feel free to share it with us.<br \/>\n<strong>References:<\/strong><br \/>\n<a href=\"http:\/\/hadoop.apache.org\" target=\"_blank\" rel=\"noopener noreferrer\">hadoop.apache.org<\/a><br \/>\n<strong>See Also-<\/strong><\/p>\n<ul>\n<li><a href=\"http:\/\/data-flair.training\/blogs\/shuffling-sorting-hadoop-mapreduce\/\">Shuffling and Sorting in MapReduce<\/a><\/li>\n<li><a href=\"http:\/\/data-flair.training\/blogs\/mapreduce-job-optimization-performance-tuning-techniques\/\">MapReduce Job Optimization and Performance Tuning<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>1. Objective Hadoop MapReduce processes a huge amount of data in parallel by dividing the job into a set of independent tasks (sub-job). In Hadoop, MapReduce works by breaking the processing into phases: Map&#46;&#46;&#46;<\/p>\n","protected":false},"author":6,"featured_media":42793,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[37],"tags":[782,2589,3313,3406,4798,6881,8537,8541,8556,8564],"class_list":["post-431","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-mapreduce","tag-apache-hadoop","tag-cloudera-hadoop","tag-data-flow-mechanism-in-mapreduce","tag-data-processing-in-mapreduce","tag-flow-of-data-in-mapreduce","tag-internal-flow-of-mapreduce","tag-mapreduce","tag-mapreduce-data-processing","tag-mapreduce-model-of-data-processing","tag-mapreduce-working"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Hadoop MapReduce Flow \u2013 How data flows in MapReduce? 
- DataFlair<\/title>\n<meta name=\"description\" content=\"Hadoop MapReduce Flow covers How Mapreduce works, internals of mapreduce job execution flow,map phase,reduce phase,mapreduce shuffling-sorting,data locality\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Hadoop MapReduce Flow \u2013 How data flows in MapReduce? - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Hadoop MapReduce Flow covers How Mapreduce works, internals of mapreduce job execution flow,map phase,reduce phase,mapreduce shuffling-sorting,data locality\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2016-06-13T14:02:03+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-11-19T10:49:07+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta 
name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Hadoop MapReduce Flow \u2013 How data flows in MapReduce? - DataFlair","description":"Hadoop MapReduce Flow covers How Mapreduce works, internals of mapreduce job execution flow,map phase,reduce phase,mapreduce shuffling-sorting,data locality","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/","og_locale":"en_US","og_type":"article","og_title":"Hadoop MapReduce Flow \u2013 How data flows in MapReduce? - DataFlair","og_description":"Hadoop MapReduce Flow covers How Mapreduce works, internals of mapreduce job execution flow,map phase,reduce phase,mapreduce shuffling-sorting,data locality","og_url":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2016-06-13T14:02:03+00:00","article_modified_time":"2018-11-19T10:49:07+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89"},"headline":"Hadoop MapReduce Flow \u2013 How data flows in MapReduce?","datePublished":"2016-06-13T14:02:03+00:00","dateModified":"2018-11-19T10:49:07+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/"},"wordCount":868,"commentCount":9,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow.jpg","keywords":["apache hadoop","Cloudera Hadoop","Data Flow Mechanism in MapReduce","Data processing in MapReduce","Flow of Data in MapReduce","Internal Flow of MapReduce","MapReduce","MapReduce Data Processing","MapReduce model of data processing","MapReduce Working"],"articleSection":["MapReduce Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/","url":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/","name":"Hadoop MapReduce Flow \u2013 How data flows in MapReduce? 
- DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow.jpg","datePublished":"2016-06-13T14:02:03+00:00","dateModified":"2018-11-19T10:49:07+00:00","description":"Hadoop MapReduce Flow covers How Mapreduce works, internals of mapreduce job execution flow,map phase,reduce phase,mapreduce shuffling-sorting,data locality","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/hadoop-mapreduce-flow.jpg","width":1200,"height":628,"caption":"Hadoop MapReduce Flow \u2013 How data flows in MapReduce?"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/hadoop-mapreduce-flow\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"MapReduce Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/mapreduce\/"},{"@type":"ListItem","position":3,"name":"Hadoop MapReduce Flow \u2013 How data flows in MapReduce?"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. 
Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"The DataFlair Team provides industry-driven content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our expert educators focus on delivering value-packed, easy-to-follow resources for tech enthusiasts and professionals.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam2\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/431","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=431"}],"version-history":[{"count":5,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/431\/revisions"}],"predecessor-version":[{"id":42796,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/431\/revisions\/42796"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/42793"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=431"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=431"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=431"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}