Is a reduce-only job possible in Hadoop?

This topic has 2 replies, 1 voice, and was last updated 7 years, 10 months ago by DataFlair Team.

Author

Posts
- September 20, 2018 at 4:48 pm #5939
  
  DataFlair Team
  Spectator
  
  I have heard about map-only jobs.
  What about a reduce-only job? Is a reduce-only job possible in Hadoop?
- September 20, 2018 at 4:48 pm #5940
  
  DataFlair Team
  Spectator
  
  A reduce-only job is not possible. If we look at the internal flow of data from HDFS storage to the Mapper, the out-of-the-box (OOTB) components InputFormat, InputSplit, and RecordReader run in sequence to supply the input data as key-value pairs to the Mapper first.
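  
  The input flow described above can be illustrated with a small plain-Java sketch. It does not use any Hadoop classes: `InputFlowSketch` and `readRecords` are hypothetical names, and the (line offset, line text) record shape is an assumption modeled on how text input is commonly presented to a Mapper.
  
  ```java
  import java.util.AbstractMap;
  import java.util.ArrayList;
  import java.util.List;
  import java.util.Map;
  
  // Plain-Java illustration only; no Hadoop classes are involved.
  class InputFlowSketch {
  
      // Hypothetical stand-in for a RecordReader working on one split of text.
      // It turns raw text into (offset of line start, line content) pairs.
      // Assumes '\n' line endings and one byte per character.
      static List<Map.Entry<Long, String>> readRecords(String splitText) {
          List<Map.Entry<Long, String>> records = new ArrayList<>();
          long offset = 0;
          for (String line : splitText.split("\n")) {
              records.add(new AbstractMap.SimpleEntry<Long, String>(offset, line));
              offset += line.length() + 1; // +1 accounts for the newline
          }
          return records;
      }
  
      public static void main(String[] args) {
          // Each record printed here is what a map function would receive as input.
          for (Map.Entry<Long, String> record : readRecords("a\nbb\n")) {
              System.out.println(record.getKey() + " -> " + record.getValue());
          }
      }
  }
  ```
  
  The point of the sketch is only that key-value records are produced for the map side of the job; nothing in this flow delivers raw input directly to a reducer.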
  
  The Reducer takes the set of intermediate key-value pairs produced by the Mapper as its input. It then runs a reduce function on each key and its values to generate the output. The output of the Reducer is therefore the final output, which is stored in HDFS. Usually, the reducer performs an aggregation or summation kind of computation.
  
  The Reducer has three primary phases:
  
  Shuffle: In this phase, for each reducer, the Hadoop framework fetches the relevant partition of the output of all the Mappers over HTTP.
  Sort: In this phase, the framework groups the Reducer's inputs by key.
  The shuffle and sort phases occur simultaneously.
  Reduce: After shuffling and sorting, the reduce task aggregates the key-value pairs. In this phase, the framework calls the reduce(Object, Iterator, OutputCollector, Reporter) method for each <key, (list of values)> pair in the grouped inputs.
  A reduce-only job would have to skip the steps above, because there would be no Mapper output to shuffle and sort, and Hadoop does not allow that.
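  
  The three phases can be sketched in plain Java to show why the reducer depends on mapper output. This is a simulation under stated assumptions, not Hadoop code: `ReducePhasesSketch` and all of its method names are hypothetical, the partitioning rule is assumed to be simple hash partitioning, and the reduce function is assumed to be a sum.
  
  ```java
  import java.util.AbstractMap;
  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.List;
  import java.util.Map;
  import java.util.TreeMap;
  
  // Plain-Java simulation only; no Hadoop classes are involved.
  class ReducePhasesSketch {
  
      // Helper to build one intermediate (key, value) pair.
      static Map.Entry<String, Integer> pair(String key, int value) {
          return new AbstractMap.SimpleEntry<String, Integer>(key, value);
      }
  
      // Assumed hash partitioning: decides which reducer a key belongs to.
      static int partitionFor(String key, int numReducers) {
          return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
      }
  
      // Shuffle + sort for one reducer: pull this reducer's partition from
      // every mapper's output and group the values by key in sorted key order.
      static TreeMap<String, List<Integer>> shuffleAndSort(
              List<List<Map.Entry<String, Integer>>> mapOutputs, int partition, int numReducers) {
          TreeMap<String, List<Integer>> grouped = new TreeMap<>();
          for (List<Map.Entry<String, Integer>> oneMapper : mapOutputs) {
              for (Map.Entry<String, Integer> p : oneMapper) {
                  if (partitionFor(p.getKey(), numReducers) == partition) {
                      grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
                  }
              }
          }
          return grouped;
      }
  
      // Reduce: invoked once per key with all of that key's values (a sum here).
      static int reduce(String key, List<Integer> values) {
          int sum = 0;
          for (int v : values) {
              sum += v;
          }
          return sum;
      }
  
      // Runs all three phases for a single reducer.
      static TreeMap<String, Integer> runReducer(
              List<List<Map.Entry<String, Integer>>> mapOutputs, int partition, int numReducers) {
          TreeMap<String, Integer> result = new TreeMap<>();
          shuffleAndSort(mapOutputs, partition, numReducers)
                  .forEach((key, values) -> result.put(key, reduce(key, values)));
          return result;
      }
  
      public static void main(String[] args) {
          List<List<Map.Entry<String, Integer>>> mapOutputs = Arrays.asList(
                  Arrays.asList(pair("b", 1), pair("a", 1)), // output of mapper 1
                  Arrays.asList(pair("a", 1)));              // output of mapper 2
          System.out.println(runReducer(mapOutputs, 0, 1)); // prints {a=2, b=1}
      }
  }
  ```
  
  If `mapOutputs` is an empty list, `runReducer` returns an empty map: with no mapper output there is nothing to shuffle, sort, or reduce.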
- September 20, 2018 at 4:48 pm #5942
  
  DataFlair Team
  Spectator
  
  No, a reduce-only job can't exist, because the reducer's input is the intermediate data, in the form of key-value pairs, from the Mapper. Since the only source of its input is the mapper output, and the reducer produces the final output by applying an aggregation or summation kind of function, the reduce phase cannot run without a map phase before it.