What is the difference between reducer and combiner in Hadoop MapReduce?

This topic contains 2 replies, has 1 voice, and was last updated by  dfbdteam3 1 year, 6 months ago.

Viewing 3 posts - 1 through 3 (of 3 total)
  • #6290

    dfbdteam3
    Moderator

    How do the Combiner and the Reducer compare in Hadoop?
    How is a Combiner different from a Reducer?

    #6291

    dfbdteam3
    Moderator

    A Reducer and a Combiner are conceptually the same thing: both aggregate the values that share a key. The difference is when and where they are executed.

    • The Combiner is executed (optionally) after the Mapper phase, on the same node that runs the Mapper, so no network I/O is involved. For this reason it is also known as a Local Reducer. It does much the same work as a Reducer, i.e. it groups the Mapper's output by key, in order to reduce network traffic between the Mapper and Reducer nodes.
    • The Reducer, on the other hand, is executed after the output of multiple Mappers has been partitioned and shuffled across the network to the various Reducer nodes according to a partitioning algorithm. When the data consists of key/value pairs, all values with the same key always land at the same Reducer.
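To make the difference concrete, here is a minimal sketch in plain Python. It is a simulation, not the Hadoop API, and every name in it (`map_words`, `sum_values`, `run_job`, the sample `splits`) is hypothetical. It assumes a word-count job in which the same summing logic serves as both Combiner and Reducer, and it counts how many records would have to be shuffled across the network with and without the Combiner.

```python
from collections import defaultdict


def map_words(split):
    """Mapper: emit (word, 1) for every word in one input split."""
    return [(word, 1) for line in split for word in line.split()]


def sum_values(pairs):
    """Shared reduce logic: sum the values of each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def run_job(splits, use_combiner):
    """Simulate one job; return (final counts, number of shuffled records)."""
    shuffled = []  # records that would cross the network to the reducers
    for split in splits:
        mapped = map_words(split)
        # Combiner: the reduce logic applied locally to one mapper's output.
        shuffled.extend(sum_values(mapped) if use_combiner else mapped)
    # Reducer: sees the values for each key from all mappers.
    return dict(sum_values(shuffled)), len(shuffled)


# Two hypothetical input splits, each a list of text lines.
splits = [["a b a", "a c"], ["b b a"]]
plain_result, plain_records = run_job(splits, use_combiner=False)
combined_result, combined_records = run_job(splits, use_combiner=True)
print(plain_result == combined_result, plain_records, combined_records)
# prints: True 8 5
```

The final counts are identical in both runs; only the number of records sent to the reduce step changes (8 without the Combiner, 5 with it for this sample input).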


    #6292

    dfbdteam3
    Moderator

    The Combiner acts as the reducer for a single input split.
    If one is specified, the Combiner processes the key/value pairs of one input split on the mapper node, before that data is written to local disk.
    If one is specified, the Reducer processes all the key/value pairs of the given data that are routed to its reducer node.

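    Note: When using a combiner, we need to check whether the computation can be done incrementally, i.e. whether partial results can be merged without changing the final answer (the operation is commutative and associative, as with a sum or a count). Only then can we use a combiner.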
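As an illustration of this note, the following sketch (plain Python again, with hypothetical names and made-up sample values) shows what can go wrong when an operation that is not incremental is used as a combiner. It assumes two mappers emitted values for the same key and compares a naive "mean of means" with a version in which the combiner emits (sum, count) partials instead.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical values emitted for one key by two different mappers.
mapper_outputs = [[1, 2, 3], [10]]

# Reference answer: the mean over all values, with no combiner involved.
true_mean = mean([v for part in mapper_outputs for v in part])

# Unsafe: using mean as the combiner and again as the reducer.
naive = mean([mean(part) for part in mapper_outputs])

# Safe: the combiner emits (sum, count) partials, which can be merged
# incrementally; the reducer divides only once, at the very end.
partials = [(sum(part), len(part)) for part in mapper_outputs]
total, count = map(sum, zip(*partials))
safe = total / count

print(true_mean, naive, safe)
# prints: 4.0 6.0 4.0
```

The mean itself cannot be combined directly, but its ingredients (sum and count) can, which is one common way to restructure such a job so that a combiner still applies.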
