Condition to call a Partitioner
A MapReduce job takes an input data set and divides it into splits. In the map phase, each map task processes one split and outputs a list of key-value pairs. The output of the map phase is then sent to the reduce tasks, which apply the user-defined reduce function to it. Before the reduce phase begins, however, the map output is partitioned on the basis of the key and then sorted.
Partitioning therefore guarantees that all the values for a single key are grouped together and go to the same reducer. This in turn permits an even distribution of the map output across the reducers, provided the keys themselves spread evenly over the partitions. By determining which reducer is responsible for a particular key, the Partitioner in Hadoop MapReduce directs each mapper output record to that reducer.
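The routing described above can be illustrated with a minimal sketch in plain Java, with no Hadoop dependency. The getPartition formula is the one used by Hadoop's default HashPartitioner; the class name PartitionSketch, the route helper, and the sample data are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Hadoop.
class PartitionSketch {

    // Same formula as Hadoop's default HashPartitioner: clear the sign bit
    // so the hash is non-negative, then take the remainder modulo the
    // number of reduce tasks.
    static int getPartition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Assumed helper: groups map output first by partition number, then by
    // key in sorted order, collecting every value emitted for that key.
    static Map<Integer, Map<String, List<Integer>>> route(
            List<Map.Entry<String, Integer>> mapOutput, int numReduceTasks) {
        Map<Integer, Map<String, List<Integer>>> partitions = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : mapOutput) {
            int p = getPartition(pair.getKey(), numReduceTasks);
            partitions
                .computeIfAbsent(p, k -> new TreeMap<>())
                .computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                .add(pair.getValue());
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Made-up map output: word-count style (key, 1) pairs.
        List<Map.Entry<String, Integer>> mapOutput = List.of(
            Map.entry("a", 1), Map.entry("b", 1),
            Map.entry("a", 1), Map.entry("c", 1));
        // Prints {0={c=[1]}, 1={a=[1, 1]}, 2={b=[1]}}
        System.out.println(route(mapOutput, 3));
    }
}
```

Both values for the key "a" land in partition 1, so a single reducer sees them together. In a real job the same logic would live in a subclass of Hadoop's Partitioner, and the sort and grouping would be performed by the framework rather than by a helper like route.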
The Partitioner, the Mapper, and the Combiner can all execute on the mapper node, in the single JVM designated for the Mapper.