What is Reducer in Hadoop?

    • #6334
      DataFlair Team
      Spectator

      What is a Reducer in MapReduce?
      How does the Reducer work in Hadoop MapReduce?
      What can we do in the Reducer of Hadoop MapReduce?

    • #6336
      DataFlair Team
      Spectator

      The Reducer is the second phase of data processing in Hadoop MapReduce. It takes as input the intermediate (key, value) pairs that the mappers produce and store on local disk. Several reducers can run in parallel because they are independent of each other. In the Reducer we typically perform aggregation or summation over the values that share a key.

      The Reducer has 3 phases:
      Shuffle: The output of all the mappers is transferred to the reducer.
      Sort: The input from the different mappers is merged and sorted by key. Sorting runs in parallel with the shuffle phase, as the mapper outputs arrive.
      Reduce: The reducer task aggregates the values for each key and produces the required output according to the business logic implemented. The output of the reducer is written to HDFS and is not re-sorted.
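
      The three phases can be sketched in plain Python. This is only a rough simulation of the idea, not the Hadoop API: the function names (shuffle, sort_and_group, reduce_counts, run_reducer) and the word-count data are made up for illustration, and a real job would implement the reduce step in a Reducer class.

```python
from collections import defaultdict


def shuffle(mapper_outputs):
    """Shuffle: collect the (key, value) pairs emitted by every mapper."""
    pairs = []
    for output in mapper_outputs:
        pairs.extend(output)
    return pairs


def sort_and_group(pairs):
    """Sort: order the pairs by key and group the values that share a key."""
    grouped = defaultdict(list)
    for key, value in sorted(pairs, key=lambda kv: kv[0]):
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce: the business logic of this example, a sum per key."""
    return key, sum(values)


def run_reducer(mapper_outputs):
    grouped = sort_and_group(shuffle(mapper_outputs))
    return [reduce_counts(key, values) for key, values in grouped.items()]


# Hypothetical intermediate output of two mappers (word count).
mapper_outputs = [
    [("hadoop", 1), ("reducer", 1)],
    [("hadoop", 1), ("mapper", 1)],
]
result = run_reducer(mapper_outputs)
print(result)  # [('hadoop', 2), ('mapper', 1), ('reducer', 1)]
```

      In this sketch the reduce function sees each key once, together with all of its values, which is what makes aggregation in the Reducer possible.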

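      By default, the number of reducers is set to 1. The user can set the number of reducers for a job, for example with Job.setNumReduceTasks(int).
      The right number of reducers is about 0.95 or 1.75 * (number of nodes * maximum number of containers per node). With 0.95, all reducers can launch at once and start transferring map outputs as the maps finish. With 1.75, the faster nodes finish a first round of reducers and launch a second wave, which balances the load better.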
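
      As a quick worked example of this rule of thumb, the arithmetic can be sketched as below. The helper name suggested_reducers and the cluster size (10 nodes, 8 containers per node) are assumptions chosen only for illustration.

```python
def suggested_reducers(nodes, max_containers_per_node):
    """Return the (0.95x, 1.75x) reducer counts for a cluster."""
    slots = nodes * max_containers_per_node
    # round() avoids off-by-one results from floating-point multiplication.
    return round(0.95 * slots), round(1.75 * slots)


# Hypothetical cluster: 10 nodes, at most 8 containers per node.
low, high = suggested_reducers(nodes=10, max_containers_per_node=8)
print(low, high)  # 76 140
```

      Either value is a starting point; the number actually used would still be passed to the job explicitly.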

