Does the Partitioner run in its own JVM or share one with another process?

This topic contains 1 reply, has 1 voice, and was last updated by  dfbdteam3 1 year, 6 months ago.

  • #4724

    dfbdteam3
    Moderator

    As we know, the Partitioner runs on the mapper node.
    What is the triggering condition for calling the Partitioner?

    Does the Partitioner run in its own JVM, or does it share a JVM with some other process?

    #4725

    dfbdteam3
    Moderator

    Condition to call a Partitioner
    A MapReduce job splits its input data set, and each map task processes one split and emits a list of key-value pairs. The output of the map phase is then sent to the reduce tasks, which apply the user-defined reduce function to it. Before the reduce phase, the map output is partitioned by key and sorted. The Partitioner is therefore called for each key-value pair that a mapper emits, provided the job has reducers to route that output to.

    Partitioning ensures that all the values for a single key go to the same reducer, where they are grouped together. By determining which reducer is responsible for a particular key, the Partitioner in Hadoop MapReduce directs each mapper output record to that reducer. This also spreads the map output across the reducers, and the spread is even when the keys are well distributed.
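As a rough illustration of the key-to-reducer mapping described above, here is a minimal sketch in plain Java with no Hadoop dependency. It uses the same arithmetic as Hadoop's default HashPartitioner (the key's hash code, masked to be non-negative, modulo the number of reduce tasks). The class name `PartitionSketch`, the helpers `partitionFor` and `assign`, and the sample keys are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; it does not use the Hadoop API.
public class PartitionSketch {

    // Hash partitioning: mask off the sign bit so the result is never
    // negative, then take the remainder by the number of reduce tasks.
    public static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Groups map output keys by the reducer index they are assigned to.
    public static Map<Integer, List<String>> assign(List<String> keys, int numReduceTasks) {
        Map<Integer, List<String>> byReducer = new TreeMap<>();
        for (String key : keys) {
            int partition = partitionFor(key, numReduceTasks);
            byReducer.computeIfAbsent(partition, p -> new ArrayList<>()).add(key);
        }
        return byReducer;
    }

    public static void main(String[] args) {
        // Sample keys are made up; repeated keys always land in the same partition.
        List<String> mapOutputKeys =
                Arrays.asList("apple", "banana", "apple", "cherry", "banana");
        System.out.println(assign(mapOutputKeys, 3));
    }
}
```

Because the partition depends only on the key and the reducer count, every occurrence of a key is sent to the same reducer. How evenly the keys spread depends on how well their hash codes are distributed.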

    So the Partitioner does not run in a JVM of its own: the Partitioner, the Mapper, and the Combiner can all execute on the mapper node in the single JVM designated for the Mapper.

    To learn more about the Hadoop Partitioner, follow the link: Hadoop Partitioner – Internals of MapReduce Partitioner
