What is a Partitioner in Hadoop? Where does it run, on the mapper or the reducer?

  • Author
    Posts
    • #6155
      DataFlair Team
      Spectator

      Explain the Partitioner. Where does it run: on the mapper node or on the reducer node?

    • #6158
      DataFlair Team
      Spectator

      In MapReduce, the input data is divided into InputSplits, and each map task processes one InputSplit and emits intermediate output as key-value pairs.
      These key-value pairs are then partitioned based on the key, so records with the same key from all mappers land in the same partition. By default, HashPartitioner is applied to the intermediate output; it picks the partition from the hash code of the key. The number of partitions equals the number of reducers set for the job, and all records in one partition go to the same reducer for further processing.
      The Partitioner runs on the mapper node, because it is applied to the output of each map task before that output is sent to the reducers.
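
The hash partitioning described above can be sketched in plain Java, without any Hadoop dependency. The formula in `partitionFor` mirrors what the default HashPartitioner computes; the class name `PartitionSketch`, the helper `assign`, and the sample keys are made up for illustration. In a real job this logic would live in a subclass of Hadoop's `Partitioner`, in its `getPartition(key, value, numPartitions)` method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; not part of Hadoop.
class PartitionSketch {

    // Same formula as the default HashPartitioner: clear the sign bit so the
    // value is non-negative, then take the remainder by the number of reducers.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Groups map-output keys by the partition (reducer index) they fall into.
    static Map<Integer, List<String>> assign(List<String> keys, int numReduceTasks) {
        Map<Integer, List<String>> partitions = new TreeMap<>();
        for (String key : keys) {
            int p = partitionFor(key, numReduceTasks);
            partitions.computeIfAbsent(p, k -> new ArrayList<>()).add(key);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Sample keys as several mappers might emit them; duplicates are intentional.
        List<String> keys = Arrays.asList("apple", "banana", "apple", "cherry", "banana");
        assign(keys, 3).forEach((p, ks) ->
                System.out.println("partition " + p + " -> " + ks));
    }
}
```

Because the partition depends only on the key and the reducer count, every occurrence of a key ends up in the same partition no matter which mapper emitted it.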

      Follow the link for more detail: Partitioner in Hadoop
