Can we pass output of one reducer as input to another mapper?

This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

- September 20, 2018 at 11:24 am #4626
  
  DataFlair Team
  Spectator
  
  I am running two jobs back to back, with the output of the first job used as the input to the second. Can I do this without writing the first job's output to HDFS? I am concerned that the intermediate write creates unnecessary overhead.
- September 20, 2018 at 11:24 am #4627
  
  DataFlair Team
  Spectator
  
  Yes, this is possible. When you run the application from the command line, give the input and output paths in the right sequence, so that the output path of the first job is the input path of the second. This is exactly what you do when you have multiple Mapper and Reducer classes. Note that with this approach the first job's output is still written to a file, which the second job then reads. Also make sure your second mapper can handle the reducer output as key-value pairs: if you read the file with TextInputFormat, the key is the byte offset of each line from the beginning of the file and the value is the entire line.
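
  To make the key/value point concrete, here is a minimal plain-Java sketch of the records a second mapper would see when it reads the first job's output line by line. It does not use any Hadoop classes. The class and method names (`TextInputModel`, `readRecords`, `parseLine`) are hypothetical, and it assumes the first job wrote `key<TAB>value` lines with `\n` line endings.

  ```java
  import java.nio.charset.StandardCharsets;
  import java.util.AbstractMap.SimpleEntry;
  import java.util.ArrayList;
  import java.util.List;
  import java.util.Map;

  // Plain-Java model (no Hadoop classes) of the records a second job's mapper
  // receives when it reads the first job's text output line by line.
  class TextInputModel {

      // Returns (byte offset of line start, line text) pairs, mirroring the
      // key/value shape described above. Assumes "\n" line endings.
      static List<Map.Entry<Long, String>> readRecords(String fileContents) {
          List<Map.Entry<Long, String>> records = new ArrayList<>();
          long offset = 0;
          for (String line : fileContents.split("\n")) {
              records.add(new SimpleEntry<>(offset, line));
              // +1 accounts for the newline byte that terminated this line
              offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
          }
          return records;
      }

      // The second mapper has to recover the first job's key and value itself.
      // Assumption: the line has the form key<TAB>integer.
      static Map.Entry<String, Integer> parseLine(String line) {
          String[] parts = line.split("\t", 2);
          return new SimpleEntry<>(parts[0], Integer.parseInt(parts[1]));
      }

      public static void main(String[] args) {
          String firstJobOutput = "apple\t3\nbanana\t12\ncherry\t7\n";
          for (Map.Entry<Long, String> record : readRecords(firstJobOutput)) {
              Map.Entry<String, Integer> parsed = parseLine(record.getValue());
              System.out.println(record.getKey() + " -> "
                      + parsed.getKey() + "=" + parsed.getValue());
          }
      }
  }
  ```

  For the sample input in `main`, the keys are the offsets 0, 8 and 18, not the words. The original key only becomes available after the line is parsed, which is why the second mapper must split the value itself.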
- September 20, 2018 at 11:24 am #4629
  
  DataFlair Team
  Spectator
  
  The ChainReducer class allows multiple Mapper classes to be chained after a Reducer, within the Reducer task.
  
  The Mapper classes are invoked in a chained (or piped) fashion for each record output by the Reducer: the output of the first Mapper becomes the input of the second, and so on until the last Mapper. The output of the last Mapper is written to the task's output.
  
  The best part of this feature is that the Mappers in the chain do not need to know that they are executed after the Reducer or as part of a chain. This makes it possible to write reusable, specialized Mappers that can be combined to perform composite operations within a single task.
  
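  The ChainMapper and ChainReducer classes can be used together to compose Map/Reduce jobs of the form [MAP+ / REDUCE MAP*], that is, one or more Mappers, then a single Reducer, then zero or more Mappers. The immediate benefit of this pattern is a dramatic reduction in disk IO, because the records passed between chained elements stay within the task instead of being written out as intermediate job output.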
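
  The shape of a [MAP+ / REDUCE MAP*] job can be sketched in plain Java. This is an in-memory model only, not the Hadoop API: `ChainModel`, `Stage`, `Pair`, `pipe`, `runJob` and the example stages are all hypothetical names, the reducer is assumed to be a simple sum per key, and the shuffle is replaced by a sorted map. What it illustrates is that each record is handed from one chained element to the next without an intermediate file.

  ```java
  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.Collections;
  import java.util.List;
  import java.util.Locale;
  import java.util.Map;
  import java.util.TreeMap;

  // In-memory model of a [MAP+ / REDUCE MAP*] job. Not Hadoop code.
  class ChainModel {

      // A key/value record passed between chained elements.
      static final class Pair {
          final String key;
          final int value;
          Pair(String key, int value) { this.key = key; this.value = value; }
          @Override public String toString() { return key + "=" + value; }
      }

      // One chained mapper: takes a record, emits zero or more records.
      interface Stage { List<Pair> apply(Pair in); }

      // Pipes a single record through the chain, starting at element i.
      // Whatever the last element emits is collected in `out`.
      static void pipe(Pair rec, List<Stage> chain, int i, List<Pair> out) {
          if (i == chain.size()) { out.add(rec); return; }
          for (Pair next : chain.get(i).apply(rec)) pipe(next, chain, i + 1, out);
      }

      static List<Pair> runJob(List<String> lines, List<Stage> mapChain,
                               List<Stage> postReduceChain) {
          // MAP+: every input line enters the mapper chain as (line, 1).
          List<Pair> mapped = new ArrayList<>();
          for (String line : lines) pipe(new Pair(line, 1), mapChain, 0, mapped);

          // Stand-in for shuffle plus REDUCE: sum the values per key, sorted by key.
          Map<String, Integer> sums = new TreeMap<>();
          for (Pair p : mapped) sums.merge(p.key, p.value, Integer::sum);

          // MAP*: each reducer output record goes straight into the next chain.
          List<Pair> output = new ArrayList<>();
          for (Map.Entry<String, Integer> e : sums.entrySet())
              pipe(new Pair(e.getKey(), e.getValue()), postReduceChain, 0, output);
          return output;
      }

      // Example stages, each unaware of its position in the chain.
      static final Stage TOKENIZE = in -> {
          List<Pair> out = new ArrayList<>();
          for (String w : in.key.split("\\s+")) if (!w.isEmpty()) out.add(new Pair(w, 1));
          return out;
      };
      static final Stage LOWERCASE = in ->
          Collections.singletonList(new Pair(in.key.toLowerCase(Locale.ROOT), in.value));
      static final Stage KEEP_FREQUENT = in ->
          in.value >= 2 ? Collections.singletonList(in) : Collections.<Pair>emptyList();
      static final Stage UPPERCASE = in ->
          Collections.singletonList(new Pair(in.key.toUpperCase(Locale.ROOT), in.value));

      // Word count with two mappers before the reducer and two after it.
      static List<Pair> demo() {
          return runJob(
              Arrays.asList("Hadoop maps", "hadoop reduces", "Maps chain"),
              Arrays.asList(TOKENIZE, LOWERCASE),
              Arrays.asList(KEEP_FREQUENT, UPPERCASE));
      }

      public static void main(String[] args) {
          System.out.println(demo()); // prints [HADOOP=2, MAPS=2]
      }
  }
  ```

  Note that the model has exactly one reduce step. A second reduce would need a second grouping pass over all keys, which is why the pattern stops at REDUCE MAP* and a second job is still needed when another Reducer is required.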
  
  One more important point: there is no need to specify the output key/value classes for the ChainReducer itself. They are taken from the setReducer or addMapper call for the last element in the chain.