This topic contains 2 replies, has 1 voice, and was last updated by dfbdteam3 1 year, 6 months ago.

Viewing 3 posts - 1 through 3 (of 3 total)
  • #5812

    dfbdteam3
    Moderator

    When is the number of reducers set to 0 in MapReduce, and why?

    #5814

    dfbdteam3
    Moderator

    We set the number of reducers to 0 when the job needs no aggregation. Calling job.setNumReduceTasks(0) in the driver means that no reducer executes and no aggregation takes place. Such a job is called a “map-only job” in Hadoop.
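    As a rough illustration of the call above, here is a minimal driver sketch for a map-only job. It assumes the Hadoop MapReduce client libraries are on the classpath. The class names MapOnlyDriver and UpperCaseMapper, the job name, and the use of args[0] and args[1] as input and output paths are hypothetical and chosen only for this example.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

// Hypothetical driver: the class and mapper names are illustrative only.
public class MapOnlyDriver {

    /** Hypothetical mapper that upper-cases each input line. */
    public static class UpperCaseMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            // With zero reducers, whatever is written here is the final output.
            context.write(NullWritable.get(), new Text(line.toString().toUpperCase()));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "map-only example");
        job.setJarByClass(MapOnlyDriver.class);
        job.setMapperClass(UpperCaseMapper.class);

        // Zero reducers: no shuffle, no sort, no reduce phase.
        job.setNumReduceTasks(0);

        // No reducer class is set, so these describe the mapper's output types.
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));   // assumed input path
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // assumed output path

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

    No reducer class is registered here. With the number of reduce tasks at 0, each map task writes its own output file to the job's output directory.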
    Map-Only job
    In a map-only job, each map task does all the work on its InputSplit and no reducer runs, so the mapper output is the final output. In a regular job, a shuffle and sort phase sits between the map and reduce phases. It sorts the map output by key (in ascending order by default) and then groups the values that share the same key.

    This phase is very expensive. If the reduce phase is not required, we should avoid it, because avoiding it eliminates the shuffle and sort phase as well. This also reduces network congestion: during the shuffle, mapper output travels across the network to the reducers, and when the data size is huge, a large volume of data has to move.
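    To make the difference concrete, here is a small plain-Java sketch that imitates the two job shapes in memory. It does not use Hadoop at all. The class MapOnlySketch, its method names, and the word-count style mapper are assumptions made for illustration, and a TreeMap stands in for the shuffle and sort phase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration (no Hadoop involved) of what skipping the reduce phase changes. */
final class MapOnlySketch {

    /** Hypothetical mapper: emits one "WORD<TAB>1" record per whitespace-separated token. */
    static List<String> map(String line) {
        List<String> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(token.toUpperCase() + "\t1");
            }
        }
        return out;
    }

    /** Zero reducers: mapper records are the final output, in emission order. */
    static List<String> runMapOnly(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.addAll(map(line));
        }
        return out;
    }

    /** With a reducer: records are sorted by key, grouped, then aggregated. */
    static List<String> runMapReduce(List<String> lines) {
        // The TreeMap stands in for shuffle and sort: keys end up ordered and grouped.
        Map<String, Integer> grouped = new TreeMap<>();
        for (String record : runMapOnly(lines)) {
            String[] kv = record.split("\t");
            grouped.merge(kv[0], Integer.parseInt(kv[1]), Integer::sum);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : grouped.entrySet()) {
            out.add(e.getKey() + "\t" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = List.of("b a", "a c");
        System.out.println(runMapOnly(input));   // unsorted, one record per token
        System.out.println(runMapReduce(input)); // sorted by key, counts summed
    }
}
```

    For the input lines "b a" and "a c", the map-only run keeps four records in the order the mapper emitted them, while the map-and-reduce run yields three records sorted by key with the two A records summed. The extra sorting and grouping step is the work a map-only job avoids.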

    In a regular MapReduce job, mapper output is written to local disk before it is sent to the reducers, but in a map-only job this output is written directly to HDFS. This saves further time and reduces cost as well.


    #5815

    dfbdteam3
    Moderator

    The number of reducers can be set to 0 in the driver class with job.setNumReduceTasks(0). The job then has no reduce phase, only a map phase. This is called a map-only job.

    Map-only job:
    A map-only job has only a map phase. The mapper output is stored directly on HDFS rather than on local disk, and it is the final output. As there is no reduce phase, no aggregation or sorting is done. In a regular MapReduce job, the map output is shuffled and sorted on its way to the reducers, which needs good network bandwidth when the data is huge. As a map-only job has no shuffle and sort, it causes less network congestion.
