MapReduce is the data processing layer of Hadoop. It is the framework for writing applications that process the vast amount of data stored in the HDFS.
In Hadoop, Job is divided into multiple small parts known as Task.
In Hadoop, “MapReduce Job” splits the input dataset into independent chunks which are processed by the “Map Tasks” in a completely parallel manner. Hadoop framework sorts the output of the map, which are then input to the reduce tasks.
Both the input and output of the job is stored in a filesystem. Hadoop framework deals with task scheduling, monitoring, and re-executing the failed tasks.