What is the difference between a reducer and a combiner in Hadoop MapReduce?


    • #6290
      DataFlair Team
      Spectator

      How do the Combiner and the Reducer compare in Hadoop?
      How is a Combiner different from a Reducer?

    • #6291
      DataFlair Team
      Spectator

      Both Reducer and Combiner are conceptually the same thing. The difference is when and where they are executed.

      • A Combiner is executed (optionally) after the Mapper phase, on the same node that runs the Mapper, so no network I/O is involved. For this reason it is also known as a local Reducer. It does work similar to a Reducer's, i.e. it groups data, in order to reduce network traffic between the Mapper and Reducer nodes.
      • A Reducer, on the other hand, is executed after the data from multiple Mappers has been partitioned and, based on a partitioning algorithm, shuffled to various nodes across the network. Because the data consists of key/value pairs, values with the same key always land at the same Reducer.
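      The effect described above can be sketched with a small word count. This is a plain-Python simulation, not the Hadoop API: the function names (map_words, sum_by_key, run_job) and the two input splits are hypothetical, and the same summing function is assumed to serve as both combiner and reducer.

```python
from collections import defaultdict

def map_words(split):
    """Mapper: emit (word, 1) for every word in one input split."""
    return [(word, 1) for word in split.split()]

def sum_by_key(pairs):
    """Sum values per key; used here as both the combiner and the reducer."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

def run_job(splits, use_combiner):
    shuffled = []  # records that would cross the network to the reducers
    for split in splits:
        mapped = map_words(split)
        # The combiner sees only the output of this one mapper.
        shuffled.extend(sum_by_key(mapped) if use_combiner else mapped)
    # The reducer sees the values for each key from every mapper.
    return dict(sum_by_key(shuffled)), len(shuffled)

splits = ["a b a a", "b b a c"]  # hypothetical input splits, one per mapper
plain_result, plain_records = run_job(splits, use_combiner=False)
combined_result, combined_records = run_job(splits, use_combiner=True)
print(plain_result, plain_records)
print(combined_result, combined_records)
```

      In this sketch both runs produce the same counts, but the run with the combiner hands fewer records to the shuffle step, which is the saving in network traffic the answer refers to.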


    • #6292
      DataFlair Team
      Spectator

      The Combiner is the reducer of a single input split.
      If a Combiner is specified, it processes the key/value pairs produced from one input split on the mapper node, before that data is written to local disk.
      If a Reducer is specified, it processes the key/value pairs from all of the given data that are sent to it, on the reducer node.

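      Note: When using a Combiner, we need to check whether the processing can be done incrementally, i.e. whether combining partial results gives the same answer as processing all the values at once. Only if it can should we use a Combiner. For example, a sum can be built from partial sums, but an average cannot be built from partial averages.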
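      To illustrate the note, here is a minimal plain-Python sketch (again not Hadoop code; the values in mapper_outputs are made up) of an operation that cannot be combined directly and one common way to make it incremental: have the combiner emit (sum, count) pairs and let the reducer divide once at the end.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Hypothetical values for a single key, grouped by the mapper that produced them.
mapper_outputs = [[1, 2, 3], [10]]

# Reference answer: the mean over all values, as a reducer alone would compute it.
true_mean = mean([v for values in mapper_outputs for v in values])

# Not incremental: each combiner emits a local mean, the reducer averages the means.
mean_of_means = mean([mean(values) for values in mapper_outputs])

# Incremental: each combiner emits (sum, count); the reducer adds the partials
# and divides only once.
partials = [(sum(values), len(values)) for values in mapper_outputs]
total = sum(s for s, _ in partials)
count = sum(c for _, c in partials)
combined_mean = total / count

print(true_mean, mean_of_means, combined_mean)
```

      Here the mean of the local means differs from the true mean because the two mappers saw different numbers of values, while the (sum, count) version matches it.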
