What is a Partitioner in Hadoop? Where does it run, on the mapper or the reducer?
Explain the partitioner. Where does the partitioner run? Does it run on the mapper node or the reducer node?
-
In MapReduce, the input data is fed to the mappers as InputSplits. Each mapper processes its InputSplit and produces intermediate output as key-value pairs.
These key-value pairs are then partitioned by key. Records with the same key from all the mappers go to the same partition. By default, HashPartitioner is applied to the intermediate output, so the partition is derived from the hash code of the key. The number of partitions equals the number of reducers set for the job. All records in the same partition go to the same reducer for further processing.
The partitioner runs on the mapper node, since it is applied to the output of each map task.
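To make the idea concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It mirrors the rule the default hash partitioning uses (hash code of the key, sign bit masked off, modulo the number of reducers). The class name PartitionSketch, the sample keys, and the reducer count of 3 are assumptions made up for illustration, not part of any real job.

```java
public class PartitionSketch {

    // Same rule as default hash partitioning: clear the sign bit so the value
    // is non-negative, then take the remainder by the number of reducers.
    public static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3; // hypothetical number of reducers for the job

        // Hypothetical intermediate keys emitted by two separate map tasks.
        String[][] mapOutputs = {
            {"apple", "banana", "cherry"},
            {"banana", "apple", "date"}
        };

        // Each map task partitions its own output locally, on the mapper side.
        for (int m = 0; m < mapOutputs.length; m++) {
            for (String key : mapOutputs[m]) {
                int partition = getPartition(key, numReduceTasks);
                System.out.println("mapper " + m + ": key=" + key
                        + " -> partition " + partition);
            }
        }
    }
}
```

Running it should show that a key such as "apple" gets the same partition number from both simulated mappers, which is why all records for one key end up at the same reducer.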
Follow the link for more detail: Partitioner in Hadoop