Can the number of combiners be changed?
September 20, 2018 at 11:48 am #4696 DataFlair Team (Spectator)
The number of combiners per node is 1. Can this be altered?
September 20, 2018 at 11:48 am #4697 DataFlair Team (Spectator)
Combiner
The combiner in MapReduce is also known as a "mini-reducer". Its primary job is to process the mapper's output before it is passed to the reducer. It runs after the mapper and before the reducer, and its use is optional. Theoretically speaking, having more combiners is harmless as long as your function is commutative and associative (as in the word count example).
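To see why the combiner function needs to be commutative and associative, here is a minimal plain-Python sketch (not Hadoop code). The helper `mean` and the way the values are grouped are assumptions made purely for illustration. It compares combining partial results for a sum, which is safe, with combining partial results for a mean, which is not.

```python
def mean(values):
    """Arithmetic mean; used here as an example of an unsafe combiner function."""
    return sum(values) / len(values)

values = [1, 2, 3, 10]

# Hypothetical grouping: two partial combines over unequal slices.
groups = [values[:1], values[1:]]

# Sum is commutative and associative: partial sums combine to the same total.
total_direct = sum(values)
total_partial = sum(sum(g) for g in groups)

# Mean is not associative: a mean of partial means differs from the true mean.
mean_direct = mean(values)
mean_partial = mean([mean(g) for g in groups])

print(total_direct, total_partial)  # 16 16
print(mean_direct, mean_partial)    # 4.0 3.0
```

A function like sum gives the same answer no matter how the data is grouped, which is why it can be applied as a combiner any number of times without affecting the result.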
But consider: if you had several combiners on a mapper node, wouldn't that compromise the original purpose of the combiner, which is to reduce the volume of data sent to the reducer?
Because each combiner would receive its own separate stream of data, the overall reduction would not be as good as what a single combiner could achieve.
Take the WordCount example:
Suppose the word "hadoop" appears 10 times in the InputSplit supplied to the mapper, and suppose you have 3 combiners (assuming that were possible). The best combiner output would then be:
("hadoop", 4)
("hadoop", 3)
("hadoop", 3)
With just one combiner, however, the output from this mapper node could have been:
("hadoop", 10)
Moreover, going by the literal meaning of "combiner", it is more intuitive to think of it as a single instance per node.
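The WordCount comparison above can be sketched as a small plain-Python simulation (not Hadoop code). The `combine` function and the 4/3/3 split of the mapper output are assumptions chosen to mirror the example, not part of any Hadoop API.

```python
from collections import Counter

def combine(pairs):
    """Sum the counts per key, as a WordCount combiner would."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())

# The mapper emits ("hadoop", 1) once per occurrence: 10 records in total.
mapper_output = [("hadoop", 1)] * 10

# One combiner sees the whole mapper output.
single = combine(mapper_output)

# Hypothetical: three combiners each see a separate slice (4, 3 and 3 records).
slices = [mapper_output[0:4], mapper_output[4:7], mapper_output[7:10]]
multiple = [pair for s in slices for pair in combine(s)]

# The reducer reaches the same total either way; only the data volume differs.
reduced_single = combine(single)
reduced_multiple = combine(multiple)

print(single)    # [('hadoop', 10)]
print(multiple)  # [('hadoop', 4), ('hadoop', 3), ('hadoop', 3)]
```

The final count is 10 in both cases, but the three-combiner variant sends three records to the reducer instead of one.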
Follow the link to learn more about Combiner in Hadoop