How many instances of RecordReader will run for a specific MapReduce job?


    • #6195
      DataFlair Team
      Spectator

      How many instances of RecordReader will run for a specific MapReduce job?

    • #6197
      DataFlair Team
      Spectator

      The InputFormat defines the input splits, i.e. the logical division of the data, but the actual reading of the data is done by the RecordReader.

      The RecordReader generates key-value pairs from its split, and these pairs are given as input to the map task.

      public abstract RecordReader<K, V>
      createRecordReader(InputSplit split, TaskAttemptContext context)
      throws IOException, InterruptedException;

      The splits are calculated by getSplits(). Each map task passes its split to the createRecordReader() method of the InputFormat to obtain a RecordReader, and the key-value pairs that this RecordReader produces are passed to and processed by the Mapper.

      If there are N splits, the job uses N RecordReader instances and N map tasks to process them, i.e. one RecordReader per map task.
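      To make the one-reader-per-split relationship concrete, here is a minimal plain-Java sketch. It does not use the Hadoop API: Split, LineReader, and countReaders are hypothetical stand-ins for InputSplit, RecordReader, and the framework's per-task loop, and splitting by a fixed number of lines is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RecordReaderCountSketch {

    /** Hypothetical stand-in for an InputSplit: the line range [start, end). */
    static final class Split {
        final int start;
        final int end;

        Split(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Hypothetical stand-in for a RecordReader bound to exactly one split. */
    static final class LineReader {
        private final List<String> lines;
        private final int end;
        private int next;

        LineReader(List<String> lines, Split split) {
            this.lines = lines;
            this.next = split.start;
            this.end = split.end;
        }

        boolean hasNext() {
            return next < end;
        }

        String nextLine() {
            return lines.get(next++);
        }
    }

    /** Plays the role of getSplits(): cuts the input into logical chunks. */
    static List<Split> getSplits(int totalLines, int linesPerSplit) {
        if (linesPerSplit <= 0) {
            throw new IllegalArgumentException("linesPerSplit must be positive");
        }
        List<Split> splits = new ArrayList<>();
        for (int start = 0; start < totalLines; start += linesPerSplit) {
            splits.add(new Split(start, Math.min(start + linesPerSplit, totalLines)));
        }
        return splits;
    }

    /** Runs one simulated map task per split and returns how many readers were created. */
    static int countReaders(List<String> lines, int linesPerSplit) {
        int readers = 0;
        for (Split split : getSplits(lines.size(), linesPerSplit)) {
            LineReader reader = new LineReader(lines, split); // one reader per split
            readers++;
            while (reader.hasNext()) {
                reader.nextLine(); // a real Mapper would process the record here
            }
        }
        return readers;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println("splits:  " + getSplits(lines.size(), 2).size());
        System.out.println("readers: " + countReaders(lines, 2));
    }
}
```

      With five input lines and two lines per split, the sketch produces three splits and therefore creates three readers, mirroring the N splits, N RecordReader instances, N map tasks rule above.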

      Follow the link for more detail: RecordReader in Hadoop
