Use case for Map only job

This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

- September 20, 2018 at 3:45 pm #5604
  
  DataFlair Team
  Spectator
  
  What are the use cases of a Map-only job?
- September 20, 2018 at 3:45 pm #5606
  
  DataFlair Team
  Spectator
  
  A Map-only job is used when no aggregation or summation is needed, so there is no Reducer to execute. Each Map task does all of its work on its own InputSplit, and nothing is left for a Reducer.
  
  For image processing, there is no need to run a reducer. The same holds for some preprocessing techniques in data mining: when chaining different jobs, there are situations where no key is needed (the key is a NullWritable) and only the values matter. In that situation we can run the job without reducers, as long as we don’t need the output in a single file.
  
  Another case is data cleanup on something like an HBase table. Read each row in your mapper, and if it matches some condition, delete it. No reduce step is needed here.
  
  You have a classification model that you build every day, and you need to classify all your data with that classifier. There is no need for a reduce step: you just load the classifier from the distributed cache (or from a remote resource like a DB), do the classification inside the map() function of your mapper, and write the result somewhere.
- September 20, 2018 at 3:45 pm #5609
  
  DataFlair Team
  Spectator
  
  There are many cases where a Map-only job is needed, that is, where there is no Reducer to execute. Each Map task does all of its work on its own InputSplit, and nothing is left for a Reducer. This is achieved by calling job.setNumReduceTasks(0).
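
  As a rough illustration of that call, here is a minimal driver sketch. It assumes the newer org.apache.hadoop.mapreduce API with the Hadoop client libraries on the classpath, the default text input format, and input and output paths passed as arguments. The class name MapOnlyDriver and the job name are made up for this example, and the base Mapper class is used only because it passes records through unchanged.

  ```java
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  // Hypothetical driver class; the name is an assumption for this sketch.
  public class MapOnlyDriver {
      public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();
          Job job = Job.getInstance(conf, "map-only example");
          job.setJarByClass(MapOnlyDriver.class);

          // The base Mapper is an identity mapper: it emits each input
          // record as-is. A real job would plug in its own Mapper subclass.
          job.setMapperClass(Mapper.class);

          // Zero reduce tasks: no shuffle, no sort, no Reducer.
          // Map output is written straight to the output directory.
          job.setNumReduceTasks(0);

          // With the default text input, keys are byte offsets and
          // values are lines of text.
          job.setOutputKeyClass(LongWritable.class);
          job.setOutputValueClass(Text.class);

          FileInputFormat.addInputPath(job, new Path(args[0]));
          FileOutputFormat.setOutputPath(job, new Path(args[1]));

          System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
  }
  ```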
  
  Generally, a Map-only job is needed when we don’t need any aggregation, or in other words when we just have to parse the data.
  The following are some cases where no reducer is involved:
  
  1. Performing delete operations on data when some condition matches. No reducer is needed for this.
  
  2. Take the word count problem as an example. Instead of counting words, we have to replace each word with its length. The mapper alone can do this replacement.
  
  3. For image processing, there is no need to initiate a reducer. Likewise, in some preprocessing techniques in data mining, when chaining different tasks, there are situations where only the values are needed and not a key (the key is a NullWritable). In that situation, we can run the tasks without reducers, as long as we don’t need the output in a single file.
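
  The word-length case in item 2 can be sketched in plain Java, without any Hadoop dependency, to show why no reducer is needed: each line is transformed on its own. The class and method names below are hypothetical, and splitting on whitespace is an assumption about what counts as a word. In a real job this logic would sit inside the map() method of a Mapper.

  ```java
  import java.util.StringJoiner;

  // Hypothetical stand-in for the body of a map() call.
  class WordLengthMapper {

      // Replaces every whitespace-separated word in a line with its length.
      // The result depends only on this one line, so nothing has to be
      // grouped or combined across records afterwards.
      static String replaceWordsWithLengths(String line) {
          StringJoiner out = new StringJoiner(" ");
          for (String word : line.trim().split("\\s+")) {
              if (!word.isEmpty()) {
                  out.add(Integer.toString(word.length()));
              }
          }
          return out.toString();
      }

      public static void main(String[] args) {
          System.out.println(replaceWordsWithLengths("map only job"));
      }
  }
  ```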
  
  Alternatively, any case that can be expressed as a simple SELECT with a WHERE clause on a table is a likely candidate for a Map-only job.
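
  To make the SELECT/WHERE comparison concrete, here is a small plain-Java sketch that treats each record as one CSV line. The column layout (id, name, age), the age threshold, and all names are assumptions made up for this example. Roughly, it mirrors SELECT id, name FROM t WHERE age >= 18: a row that fails the condition is simply not emitted.

  ```java
  import java.util.ArrayList;
  import java.util.List;

  // Hypothetical stand-in for a filtering and projecting mapper.
  class SelectWhereMapper {
      static final int MIN_AGE = 18; // assumed WHERE threshold

      // Returns the projected record, or null when the row is filtered out
      // (a real map() call would just not write anything in that case).
      static String mapRecord(String csvLine) {
          String[] fields = csvLine.split(",");
          if (fields.length < 3) {
              return null; // malformed row: skip it
          }
          int age;
          try {
              age = Integer.parseInt(fields[2].trim());
          } catch (NumberFormatException e) {
              return null; // non-numeric age: skip it
          }
          if (age < MIN_AGE) {
              return null; // WHERE age >= 18
          }
          return fields[0].trim() + "," + fields[1].trim(); // SELECT id, name
      }

      // Applies mapRecord to every record of one input split.
      static List<String> runOverSplit(List<String> records) {
          List<String> out = new ArrayList<>();
          for (String record : records) {
              String mapped = mapRecord(record);
              if (mapped != null) {
                  out.add(mapped);
              }
          }
          return out;
      }
  }
  ```

  Each row is kept or dropped independently of all others, which is why a reducer adds nothing here.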
  
  Note:
  If you turn off the reducer, the number of output files equals the number of mappers, and the output files are named part-m-00000, part-m-00001, and so on.
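
  The naming pattern in the note can be illustrated with a few lines of plain Java. This only reproduces the file-name format for a given number of map tasks. It does not call Hadoop, and the class and method names are hypothetical.

  ```java
  import java.util.ArrayList;
  import java.util.List;
  import java.util.Locale;

  // Hypothetical helper that lists the expected map-only output file names.
  class MapOnlyOutputNames {

      // One file per map task: "part-m-" followed by a five-digit task index.
      static List<String> outputFileNames(int numMapTasks) {
          List<String> names = new ArrayList<>();
          for (int i = 0; i < numMapTasks; i++) {
              names.add(String.format(Locale.ROOT, "part-m-%05d", i));
          }
          return names;
      }

      public static void main(String[] args) {
          outputFileNames(3).forEach(System.out::println);
      }
  }
  ```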
  
  To learn more about Map-only jobs, visit: Map-only job in Map-Reduce