What is a map-only job in Hadoop MapReduce?


  • Author
    Posts
    • #5399
      DataFlair Team
      Spectator

      What is a map-only job?
      What types of problems can be solved with a map-only job?
      What is the difference between a MapReduce job and a map-only job?
      Which performs better, a MapReduce job or a map-only job?

    • #5400
      DataFlair Team
      Spectator

      A map-only job is a MapReduce job that runs no reduce tasks. Each Mapper does all of its work on its own InputSplit, and there is nothing for a Reducer to do. You get this behaviour by calling job.setNumReduceTasks(0) on the Job object. The number of output files then equals the number of Mappers, because each Mapper writes its own file.

      Use a map-only job when no aggregation or grouping across records is required and you only need to apply the same operation to each piece of data independently, for example filtering records or converting them from one format to another.

      The output of a MapReduce job is sorted by key, while the output of a map-only job is not sorted, because sorting happens in the shuffle and sort step that feeds the Reducer. In a map-only job, the Mapper output is written directly to HDFS. In a MapReduce job, the Mapper output (the intermediate output) is first stored on local disk and then transferred to the Reducers. Skipping that step saves time and reduces cost.
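
      The difference described above can be illustrated with a small plain-Java model. This is only a sketch and does not use the Hadoop API: the class name MapOnlySketch, the methods runMapOnly and runMapReduce, and the sample data are all hypothetical, and a TreeMap stands in for the shuffle and sort step. It shows that the map-only path yields one unsorted output per split, while the map-reduce path yields a single result grouped and sorted by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Plain-Java model of the two job shapes; it does NOT use the Hadoop API.
// All names here are hypothetical and exist only for illustration.
final class MapOnlySketch {

    // Map-only: each split is handled by its own mapper and written out as-is,
    // so there is one output "file" per split and records keep their input order.
    static List<List<String>> runMapOnly(List<List<String>> splits,
                                         Function<String, String> mapper) {
        List<List<String>> outputFiles = new ArrayList<>();
        for (List<String> split : splits) {
            List<String> file = new ArrayList<>();
            for (String record : split) {
                file.add(mapper.apply(record));
            }
            outputFiles.add(file);
        }
        return outputFiles;
    }

    // Map-reduce: mapper output from all splits is grouped by key and sorted
    // (the TreeMap stands in for shuffle and sort), then a reducer counts
    // how many times each key occurred.
    static Map<String, Integer> runMapReduce(List<List<String>> splits,
                                             Function<String, String> mapper) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> split : splits) {
            for (String record : split) {
                counts.merge(mapper.apply(record), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> splits = Arrays.asList(
                Arrays.asList("pear", "apple"),
                Arrays.asList("fig"),
                Arrays.asList("apple", "kiwi"));
        Function<String, String> upper = String::toUpperCase;

        // Three splits in, three unsorted outputs out.
        System.out.println(runMapOnly(splits, upper));
        // One result, grouped by key and in sorted key order.
        System.out.println(runMapReduce(splits, upper));
    }
}
```

      In the model, counting occurrences per key needs the grouping step, so it cannot be done map-only, whereas upper-casing each record needs nothing beyond the mapper.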

      A map-only job performs better: it is much quicker because it skips the Reducer, whose computation takes time and depends entirely on the work the job has to do. That also makes it more efficient, provided the problem does not need a reduce step.

      Follow the link to learn more about: Map-Only job in Hadoop
