Can we set the number of reducers to zero in MapReduce?


    • #5627
      DataFlair Team
      Spectator

      Can we set the number of reducers to 0?
      How do we set the number of reduce tasks to zero?
      What is the minimum number of reducers in MapReduce?

    • #5629
      DataFlair Team
      Spectator

      Yes. We can set the number of reducers to 0 in Hadoop, and it is a valid configuration.
      When the number of reducers is set to 0, no reduce phase is executed; the output of the mappers is treated as the final output and written to HDFS.
      Following are the ways to set the number of reducers to 0:
      By setting the configuration property mapred.reduce.tasks = 0 (mapreduce.job.reduces in newer Hadoop versions)

      job.setNumReduceTasks(0);

      where job is an instance of the JobConf class (or Job in the new MapReduce API), which is used to configure the MapReduce job.

      A job in which the number of reducers is set to 0 is also known as a map-only job.
      In a map-only job, each map task does all the work on its InputSplit and no reducer runs. Between the map and reduce phases there is a shuffle and sort phase, which sorts the map output keys in ascending order and groups the values that share the same key. This phase is very expensive, so if the reduce phase is not required we should avoid it. Avoiding the reduce phase also eliminates the shuffle and sort, which reduces network congestion: during shuffling the mapper output travels across the network to the reducers, and when the data size is huge a large amount of data has to travel.
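
      A minimal sketch of such a map-only job driver, assuming the new org.apache.hadoop.mapreduce API; the class names MapOnlyJob and PassThroughMapper and the input/output paths are placeholders, not part of any standard library:

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.io.NullWritable;
      import org.apache.hadoop.io.Text;
      import org.apache.hadoop.mapreduce.Job;
      import org.apache.hadoop.mapreduce.Mapper;
      import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
      import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

      public class MapOnlyJob {

          // Hypothetical mapper: passes every input line through unchanged.
          public static class PassThroughMapper
                  extends Mapper<Object, Text, Text, NullWritable> {
              @Override
              protected void map(Object key, Text value, Context context)
                      throws java.io.IOException, InterruptedException {
                  context.write(value, NullWritable.get());
              }
          }

          public static void main(String[] args) throws Exception {
              Job job = Job.getInstance(new Configuration(), "map-only example");
              job.setJarByClass(MapOnlyJob.class);
              job.setMapperClass(PassThroughMapper.class);

              // Zero reducers: the mapper output is written directly to HDFS
              // and the shuffle and sort phase is skipped entirely.
              job.setNumReduceTasks(0);

              job.setOutputKeyClass(Text.class);
              job.setOutputValueClass(NullWritable.class);
              FileInputFormat.addInputPath(job, new Path(args[0]));
              FileOutputFormat.setOutputPath(job, new Path(args[1]));
              System.exit(job.waitForCompletion(true) ? 0 : 1);
          }
      }

      If the driver implements Tool, the same effect can also be achieved from the command line with -D mapreduce.job.reduces=0.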


    • #5632
      DataFlair Team
      Spectator

      The number of reducers can be set to zero if there is no need for a reduce phase, since the reducer is generally used for data consolidation or aggregation rather than heavy computation.

      If no reducer is defined, the output generated by the map tasks is treated as the final output and stored in HDFS.

    • #5633
      DataFlair Team
      Spectator

      Yes, we can set the number of reducers to zero. This makes it a map-only job, and the data is not sorted but is written directly to HDFS.
      job.setNumReduceTasks(0);

      If we want the output from the mapper to be sorted, we can use the identity reducer.
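
      A minimal sketch of that, assuming the new org.apache.hadoop.mapreduce API, where the base Reducer class itself behaves as an identity reducer (the old API provides org.apache.hadoop.mapred.lib.IdentityReducer):

      // With at least one reduce task, the map output passes through the
      // shuffle and sort phase, so keys reach the output in sorted order.
      job.setReducerClass(org.apache.hadoop.mapreduce.Reducer.class);
      job.setNumReduceTasks(1);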
