What is the difference between a Reducer and a Combiner in Hadoop MapReduce?
September 20, 2018 at 5:38 pm #6290 DataFlair Team (Spectator)
What is the difference between a Combiner and a Reducer in Hadoop?
How does a Combiner differ from a Reducer?
September 20, 2018 at 5:38 pm #6291 DataFlair Team (Spectator)
A Reducer and a Combiner are conceptually the same thing. The difference is when and where they are executed.
- A Combiner runs (optionally) on the Mapper's output, on the same node that ran the Mapper, so it involves no network I/O. For this reason it is also known as a local reducer. Like a Reducer, it aggregates values by key; its purpose is to reduce the network traffic between the Mapper and Reducer nodes.
- A Reducer, on the other hand, runs after the output of all Mappers has been partitioned and then shuffled across the network to the Reducer nodes according to the partitioning algorithm. Because the data consists of key/value pairs, all values with the same key always land at the same Reducer.
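To make the "when and where" distinction concrete, here is a minimal sketch in plain Python. It only simulates the data flow of a word count; it does not use the Hadoop API, and the names `map_words`, `sum_values`, and `run_job`, along with the sample splits, are made up for illustration. The same summing function serves as both combiner and reducer, and the length of the `shuffled` list stands in for the number of records that would cross the network.

```python
from collections import defaultdict

def map_words(split):
    """Mapper: emit (word, 1) for every word in one input split."""
    return [(word, 1) for line in split for word in line.split()]

def sum_values(pairs):
    """Sum the values per key; used here as both combiner and reducer."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

def run_job(splits, use_combiner):
    shuffled = []  # stands in for the records sent over the network
    for split in splits:
        mapped = map_words(split)
        # The combiner sees only one mapper's output, on that mapper's node.
        shuffled.extend(sum_values(mapped) if use_combiner else mapped)
    # The reducer sees every value for a key, from all mappers.
    return dict(sum_values(shuffled)), len(shuffled)

# Two hypothetical input splits, each a list of lines.
splits = [["a b a", "a c"], ["b b a"]]
plain, plain_records = run_job(splits, use_combiner=False)
combined, combined_records = run_job(splits, use_combiner=True)
print(plain == combined, plain_records, combined_records)  # True 8 5
```

In this toy run the final counts are identical either way, but the combiner cuts the shuffled records from 8 to 5. In a real job the saving depends on how often keys repeat within a single mapper's output.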
September 20, 2018 at 5:38 pm #6292 DataFlair Team (Spectator)
A Combiner acts as the reducer for a single input split.
If one is specified, the Combiner processes the key/value pairs produced from one input split on the mapper node, before that data is written to local disk.
If one is specified, the Reducer processes all the key/value pairs for its keys, from every Mapper, on the reducer node.
Note: Before using a Combiner, check whether the computation can be done incrementally, that is, whether applying it to partial groups of values and then to the partial results gives the same answer as applying it once to all the values (the operation should be commutative and associative). Use a Combiner only if it can.
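As a rough illustration of the note above, the following plain-Python sketch (again not Hadoop code; the values and the helper `mean` are made up) contrasts a sum, which can safely be computed incrementally, with a mean, which cannot be combined directly. It also shows one common workaround, assuming you are free to change what the combiner emits: carry (sum, count) pairs and divide only at the end.

```python
def mean(values):
    return sum(values) / len(values)

# Values for one key, as emitted by two different mappers (made-up numbers).
mapper_outputs = [[1, 2, 3], [10]]
all_values = [v for part in mapper_outputs for v in part]

# Sum is associative and commutative: combining per mapper is safe.
assert sum(sum(part) for part in mapper_outputs) == sum(all_values)

# Mean is not: a mean of per-mapper means differs from the true mean.
mean_of_means = mean([mean(part) for part in mapper_outputs])
true_mean = mean(all_values)
print(mean_of_means, true_mean)  # 6.0 4.0

# Workaround: combine into (sum, count) pairs, divide only in the reducer.
partials = [(sum(part), len(part)) for part in mapper_outputs]
total, count = map(sum, zip(*partials))
fixed_mean = total / count
print(fixed_mean)  # 4.0
```

Using the mean function itself as a combiner would silently give 6.0 instead of 4.0 here, which is the kind of error the note warns about.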