What is a map-only job in Hadoop MapReduce?

This topic has 1 reply, 1 voice, and was last updated 5 years, 8 months ago by DataFlair Team.

Author

Posts
- September 20, 2018 at 3:03 pm #5399
  
  DataFlair Team
  Spectator
  
  What is a map-only job?
  What types of problems can be solved with a map-only job?
  What is the difference between a MapReduce job and a map-only job?
  Which performs better: a MapReduce job or a map-only job?
- September 20, 2018 at 3:03 pm #5400
  
  DataFlair Team
  Spectator
  
  A job with no reduce phase is a map-only job. Each mapper does all of the work on its InputSplit, and nothing is left for a reducer to do. You get this by calling job.setNumReduceTasks(0) in the driver. The number of output files then equals the number of mappers.
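
  For illustration, here is a minimal driver sketch showing where the zero-reducer setting goes. It assumes the org.apache.hadoop.mapreduce API; the class names MapOnlyDriver and UpperCaseMapper, the upper-casing logic, and the use of args[0] and args[1] as input and output paths are hypothetical choices for the example, not something taken from the original post.

  ```java
  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.NullWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  // Hypothetical example driver: upper-cases every input line, with no reduce phase.
  public class MapOnlyDriver {

      // Hypothetical mapper: each record is handled on its own, so no reducer is needed.
      public static class UpperCaseMapper
              extends Mapper<LongWritable, Text, Text, NullWritable> {
          @Override
          protected void map(LongWritable offset, Text line, Context context)
                  throws IOException, InterruptedException {
              context.write(new Text(line.toString().toUpperCase()), NullWritable.get());
          }
      }

      public static void main(String[] args) throws Exception {
          Job job = Job.getInstance(new Configuration(), "map-only example");
          job.setJarByClass(MapOnlyDriver.class);
          job.setMapperClass(UpperCaseMapper.class);

          // Zero reducers: mapper output goes straight to the job's output path.
          job.setNumReduceTasks(0);

          job.setOutputKeyClass(Text.class);
          job.setOutputValueClass(NullWritable.class);

          // Assumption: input and output paths are passed on the command line.
          FileInputFormat.addInputPath(job, new Path(args[0]));
          FileOutputFormat.setOutputPath(job, new Path(args[1]));

          System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
  }
  ```

  With this setting, each mapper writes its own output file (named like part-m-00000), which is why the file count matches the mapper count.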
  
  Use a map-only job when no aggregation across records is required and you only need to apply the same operation to each piece of data independently, for example filtering records or converting them to another format.
  
  The output of a MapReduce job is sorted by key, while the output of a map-only job is not sorted, because sorting is done in the shuffle and sort step that feeds the reducers. In a map-only job the mapper output is written directly to HDFS. In a MapReduce job the mapper output, i.e. the intermediate output, is first stored on local disk and then transferred to the reducers. Skipping that step saves time and reduces cost.
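
  The difference in output can be sketched with a small plain-Java simulation. This is only a toy model, not Hadoop code: the class MapOnlySketch, its methods, and the sample records are all hypothetical, and a TreeMap stands in for the shuffle and sort step.

  ```java
  import java.util.ArrayList;
  import java.util.List;
  import java.util.Map;
  import java.util.TreeMap;

  // Toy simulation only; no Hadoop classes are used and all names are hypothetical.
  class MapOnlySketch {

      // The "map" step: upper-case each record of one input split.
      static List<String> map(List<String> split) {
          List<String> out = new ArrayList<>();
          for (String record : split) {
              out.add(record.toUpperCase());
          }
          return out;
      }

      // Map-only: one output "file" per split, records left in input order.
      static List<List<String>> runMapOnly(List<List<String>> splits) {
          List<List<String>> files = new ArrayList<>();
          for (List<String> split : splits) {
              files.add(map(split));
          }
          return files;
      }

      // Map + reduce: mapper output from all splits is merged, ordered by key
      // (the TreeMap plays the role of shuffle and sort), and counted per key.
      static Map<String, Integer> runMapReduce(List<List<String>> splits) {
          Map<String, Integer> counts = new TreeMap<>();
          for (List<String> split : splits) {
              for (String key : map(split)) {
                  counts.merge(key, 1, Integer::sum);
              }
          }
          return counts;
      }

      public static void main(String[] args) {
          List<List<String>> splits = List.of(
                  List.of("pear", "apple"),
                  List.of("fig", "apple"));
          System.out.println(runMapOnly(splits));   // [[PEAR, APPLE], [FIG, APPLE]]
          System.out.println(runMapReduce(splits)); // {APPLE=2, FIG=1, PEAR=1}
      }
  }
  ```

  The map-only result has one list per split with records in their original order, while the map-and-reduce result is a single collection ordered by key.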
  
  A map-only job performs better when the problem does not need a reducer: it is much quicker because it skips the shuffle, sort, and reduce steps. How much time a reducer adds depends entirely on the work it is required to do. A map-only job is also more efficient for the same reason.
  
  Follow the link to learn more about: Map-Only job in Hadoop