This topic contains 1 reply, has 1 voice, and was last updated by  dfbdteam3 1 year, 6 months ago.

Viewing 2 posts - 1 through 2 (of 2 total)



    The number of combiners per node is 1. Can this be altered?




    The combiner in MapReduce is also known as a "mini-reducer". Its primary job is to process the output data from the mapper before it is passed to the reducer. It runs after the mapper and before the reducer, and its use is optional.
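To make the "mini-reducer" idea concrete, here is a minimal plain-Python sketch of the mapper, combiner and reducer stages for word count. It is only a simulation and does not use the Hadoop API; the function names (`map_words`, `combine`, `reduce_counts`) and the sample input are assumptions made up for illustration.

```python
from collections import Counter

def map_words(line):
    """Mapper: emit one (word, 1) pair per word in the line."""
    return [(word.lower(), 1) for word in line.split()]

def combine(pairs):
    """Combiner: sum the counts per word locally, on the mapper side."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())

def reduce_counts(pairs):
    """Reducer: sum the counts per word over everything it receives."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Hypothetical input split, not taken from the original post.
split = ["hadoop spark hadoop", "hadoop hive"]
mapped = [pair for line in split for pair in map_words(line)]
combined = combine(mapped)

print(len(mapped), "pairs would be sent to the reducer without a combiner")
print(len(combined), "pairs are sent when the combiner runs first")
print(reduce_counts(mapped) == reduce_counts(combined))
```

The reducer produces the same totals either way; the combiner only shrinks the number of pairs that have to travel from the mapper to the reducer.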

    Theoretically speaking, it's harmless to have more combiners as long as your function is commutative and associative (like the sum in the word count example).

    But think about it: if you have more combiners on a mapper node, isn't the original purpose of the combiner (reducing the volume of data sent to the reducer) defeated, or at least compromised?

    • Because the individual combiners receive their data streams separately, the level of compression will not be as good as what one combiner could have achieved.
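A quick way to see why the function has to be commutative and associative is to compare a sum with a mean. The sketch below is plain Python, not Hadoop code, and the values and the way they are chunked are assumptions chosen for illustration. Summing partial sums gives the right answer, while averaging partial averages does not.

```python
def reduce_sum(values):
    """Sum is commutative and associative, so it is safe as a combiner."""
    return sum(values)

def reduce_mean(values):
    """A plain mean is not associative, so it is not safe as a combiner."""
    return sum(values) / len(values)

values = [1, 2, 3, 4, 5, 6]
# Hypothetical split of the mapper output across two combiner runs.
chunks = [values[:4], values[4:]]

sum_direct = reduce_sum(values)
sum_via_combiner = reduce_sum([reduce_sum(chunk) for chunk in chunks])

mean_direct = reduce_mean(values)
mean_via_combiner = reduce_mean([reduce_mean(chunk) for chunk in chunks])

print(sum_direct, sum_via_combiner)    # the two sums agree
print(mean_direct, mean_via_combiner)  # the two means differ
```

With a sum, the result does not depend on how the data was grouped before it reached the reducer, which is exactly the property the word count example relies on.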

    Take the WordCount example:

    Say the original InputSplit supplied to the mapper contains the word "Hadoop" 10 times, and you have 3 combiners (assuming that were possible).

    The best output the combiners can then produce is:
    ("hadoop", 4)
    ("hadoop", 3)
    ("hadoop", 3)

    However, had we used just one combiner, the output from this mapper node could have been:
    ("hadoop", 10)
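The WordCount scenario above can be simulated in a few lines of plain Python. This is only a sketch, not Hadoop code: the `combine` helper and the 4/3/3 division of the ten occurrences among three streams are assumptions for illustration.

```python
def combine(pairs):
    """Sum the counts per word within a single combiner's input stream."""
    totals = {}
    for word, count in pairs:
        totals[word] = totals.get(word, 0) + count
    return list(totals.items())

# The mapper emits ("hadoop", 1) ten times.
mapper_output = [("hadoop", 1)] * 10

# Hypothetical case: three combiners, each seeing only part of the stream.
streams = [mapper_output[0:4], mapper_output[4:7], mapper_output[7:10]]
three_combiners = [pair for stream in streams for pair in combine(stream)]

# One combiner seeing the whole mapper output.
one_combiner = combine(mapper_output)

print(three_combiners)  # three records go to the reducer
print(one_combiner)     # a single record goes to the reducer
```

Both cases carry the same total count, but the three-combiner case sends three records to the reducer where one would do.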

    Moreover, going just by the literal meaning of "combiner", it's more intuitive to think of it as a single instance per node.

    Follow the link to learn more about Combiner in Hadoop
