Use case for Map only job

Viewing 2 reply threads
  • Author
    Posts
    • #5604
      DataFlair Team
      Spectator

      What are the use cases of a Map-only job?

    • #5606
      DataFlair Team
      Spectator

      A Map-only job is appropriate when no aggregation or summation is needed, so there is nothing for a Reducer to execute. Each Map task does all of its work on its own InputSplit, and no Reducer runs.

      Image processing is one example: there is no need to run a reducer. The same holds for some preprocessing techniques in data mining where different jobs are chained. In those situations only the values are needed, and the key can be a NullWritable. The job can then run without reducers, provided the output does not have to end up in a single file.

      Another example is data cleanup on something like an HBase table: read each row in your mapper, and delete it if it matches some condition. No reduce step is needed here.

      You have a classification model that you build every day, and you need to classify all your data with that classifier. There is no need for a reduce step: you load the classifier from the distributed cache (or from a remote resource like a DB), do the classification inside the map() function of your mapper, and write the result somewhere.

    • #5609
      DataFlair Team
      Spectator

      There are many cases where a Map-only job is needed, that is, where there is no Reducer to execute.
      The Map does all of its work on its InputSplit and nothing is left for a Reducer.
      This is achieved by setting the number of reduce tasks to zero with job.setNumReduceTasks(0).

      Generally, a Map-only job is needed when no aggregation is required, or, put differently, when the data only has to be parsed.
      The following are some cases where no reducer is involved:

      1. Performing delete operations on data when some condition matches. No reducer is needed for this.

      2. Take a variant of the word count problem: instead of counting words, we replace each word with its length. The mapper alone can do this replacement.

      3. For image processing, there is no need to start a reducer. Likewise, in some data mining preprocessing techniques where different tasks are chained, only the values are needed and the key can be a NullWritable. In that situation the tasks can run without reducers, as long as the output does not have to be a single file.
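
As a rough illustration of case 2, here is a minimal sketch of the per-record logic such a mapper might apply. It is written in plain Java without any Hadoop classes, so it only simulates the map step; the class name WordLengthSketch and its methods are hypothetical and not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a mapper's per-record logic; no Hadoop classes are used.
class WordLengthSketch {

    // Replace every whitespace-separated word in one input line with its length.
    static String mapLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return ""; // blank lines produce empty output
        }
        StringBuilder out = new StringBuilder();
        for (String word : trimmed.split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(word.length());
        }
        return out.toString();
    }

    // Each line is transformed on its own, so nothing has to be grouped or reduced.
    static List<String> mapAll(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            result.add(mapLine(line));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : mapAll(Arrays.asList("hello big data", "map only job"))) {
            System.out.println(s);
        }
    }
}
```

Because every output line depends only on its own input line, there is no cross-record state, which is what makes a reducer unnecessary here.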

      Put another way, any case that could be handled with a simple SELECT and a WHERE clause on a table is likely a candidate for a Map-only job.
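
To make the SELECT/WHERE analogy concrete, the sketch below filters and projects records one at a time, the way a mapper could. It is plain Java rather than Hadoop code, and the "id,name,age" record layout, the class name SelectWhereSketch, and the method name are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a row-by-row filter, similar in spirit to
//   SELECT name FROM people WHERE age >= minAge
// Assumed record layout (not from the original post): "id,name,age".
class SelectWhereSketch {

    static List<String> selectNamesWhereAgeAtLeast(List<String> rows, int minAge) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            String[] fields = row.split(",");
            if (fields.length != 3) {
                continue; // skip malformed rows
            }
            int age;
            try {
                age = Integer.parseInt(fields[2].trim());
            } catch (NumberFormatException e) {
                continue; // skip rows whose age is not a number
            }
            if (age >= minAge) {      // the WHERE part
                out.add(fields[1].trim()); // the SELECT part
            }
        }
        return out;
    }
}
```

Each row is kept or dropped without looking at any other row, so the same logic could sit inside a map() function with the reduce phase switched off.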

      Note:
      If you turn off the reducer, the number of output files equals the number of mappers, and the files are named part-m-00000, part-m-00001, and so on.
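
The naming pattern in the note can be sketched as follows. This is only an illustration of the part-m-NNNNN pattern with a five-digit, zero-padded task number; the class PartFileNames and its method are hypothetical helpers, not Hadoop APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: lists the file names a map-only job with the
// given number of map tasks would be expected to write, one per mapper.
class PartFileNames {

    static List<String> mapOnlyOutputNames(int numMapTasks) {
        List<String> names = new ArrayList<>();
        for (int task = 0; task < numMapTasks; task++) {
            // "m" marks map output; the task number is zero-padded to five digits.
            names.add(String.format(Locale.ROOT, "part-m-%05d", task));
        }
        return names;
    }
}
```

For example, three map tasks would give part-m-00000, part-m-00001, and part-m-00002.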

      To learn more about Map-only jobs, visit: Map-only job in Map-Reduce
