Does the Partitioner run in its own JVM or share one with another process?


    • #4724
      DataFlair Team
      Spectator

      As we know, the Partitioner runs on the Mapper node.
      What condition triggers a call to the Partitioner?

      Does the Partitioner run in its own JVM, or does it share a JVM with some other process?

    • #4725
      DataFlair Team
      Spectator

      Condition to call a Partitioner
      A MapReduce job splits its input data set, and each map task processes one split and emits a list of key-value pairs. That map output is then sent to the reduce tasks, which run the user-defined reduce function on it. Before the reduce phase, the map output is partitioned by key and sorted. The Partitioner is therefore invoked on the map side, once for each key-value pair the map emits, and its choice matters only when the job has more than one reduce task.

      Partitioning ensures that all the values of a single key go to the same reducer, where they are grouped together. By determining which reducer is responsible for a particular key, the Partitioner in Hadoop MapReduce routes the map output to that reducer. A well-chosen partition function also spreads the keys roughly evenly across the reducers, although heavily skewed keys can still unbalance the load.
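
The key-to-reducer mapping described above can be sketched in plain Java. This is only an illustration, not Hadoop code: the class name `PartitionSketch` and the sample keys are hypothetical, and the method mirrors the arithmetic used by Hadoop's default `HashPartitioner` (hash code with the sign bit cleared, modulo the number of reduce tasks) without depending on the Hadoop libraries.

```java
// Sketch only: plain Java with no Hadoop dependency.
// PartitionSketch is a hypothetical name used for illustration.
class PartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner:
    // clear the sign bit of the hash code, then take the remainder
    // modulo the number of reduce tasks.
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3; // assumed: a job configured with three reducers
        String[] keys = {"apple", "banana", "apple", "cherry", "banana"};
        for (String key : keys) {
            // Repeated keys always print the same reducer index.
            System.out.println(key + " -> reducer " + getPartition(key, numReduceTasks));
        }
    }
}
```

Because the result depends only on the key and the reducer count, every occurrence of a key lands in the same partition, whichever map task emitted it. How evenly the keys spread depends on the quality of `hashCode()` and on the key distribution.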

      The Partitioner, the Mapper, and the Combiner all execute on the Mapper node, in the single JVM designated for the map task. The Partitioner does not get a JVM of its own.

      To learn more about the Hadoop Partitioner, see: Hadoop Partitioner – Internals of MapReduce Partitioner
