Do the RecordReader and Mapper run in the same JVM or in different JVMs?

  • Author
    Posts
    • #6082
      DataFlair Team
      Spectator

      Do the RecordReader and Mapper run in the same JVM or in different JVMs?

    • #6086
      DataFlair Team
      Spectator

      In MapReduce, a Mapper only understands key-value pairs, so the input data must be converted into key-value pairs before it reaches the Mapper. This is done by the InputSplit and the RecordReader.

      An InputSplit represents a chunk of the input files. It is a logical division of the data, not a physical one. The split size is configurable, so the user can set it based on the size of the data. The number of splits equals the number of map tasks.

      The RecordReader converts the data into key-value pairs. The start is the byte offset at which the RecordReader begins generating key-value pairs, and the end is the position at which it stops reading. It keeps reading from the InputSplit until the whole split has been consumed.

      The InputFormat class is responsible for creating the splits and dividing them into records. By default a split matches the HDFS block size, typically 64 MB or 128 MB depending on the Hadoop version and configuration. The getSplits() method of the InputFormat class computes the splits, and each split is handed to a map task. The map task passes its split to the createRecordReader() method to obtain the RecordReader for that split. The RecordReader then loads the split and converts it into key-value pairs.

      The RecordReader and the Mapper run on the same node as part of the same map task, so they both run in the same JVM. A Reducer is different: if a Reducer shared a JVM with a Mapper, any failure in the Reducer would kill that JVM and Hadoop would have to rerun the map phase, which would be very inefficient.
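
The split computation described in the reply above can be sketched in plain Java. This is only an illustration, assuming the file-based case: the class name SplitSketch and its method signatures are hypothetical, and the logic is modeled on the split-size formula and the 10% "slop" rule used by Hadoop's FileInputFormat rather than copied from it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper modeled on the split logic of a file-based InputFormat.
class SplitSketch {
    // Assumption: the last split may be up to 10% larger than splitSize.
    static final double SPLIT_SLOP = 1.1;

    // splitSize = max(minSize, min(maxSize, blockSize))
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Returns {start, length} pairs, one per logical split of a single file.
    static List<long[]> computeSplits(long fileSize, long splitSize) {
        List<long[]> splits = new ArrayList<>();
        long remaining = fileSize;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(new long[] {fileSize - remaining, splitSize});
            remaining -= splitSize;
        }
        if (remaining != 0) {
            // Whatever is left becomes the final split.
            splits.add(new long[] {fileSize - remaining, remaining});
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long splitSize = computeSplitSize(128 * mb, 1, Long.MAX_VALUE);
        for (long[] s : computeSplits(300 * mb, splitSize)) {
            System.out.println("start=" + s[0] + " length=" + s[1]);
        }
    }
}
```

With a 128 MB split size, a 300 MB file yields three splits (128 MB, 128 MB and 44 MB), and therefore three map tasks, each with its own RecordReader.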

    • #6088
      DataFlair Team
      Spectator

      The Mapper is the first processing stage in MapReduce.
      A Mapper takes only one key-value pair as input at a time, so if the input contains 100 key-value pairs, the map method is called 100 times.
      The RecordReader is the class that actually loads the data and converts it into key-value pairs according to the InputFormat.
      The RecordReader and the Mapper run on the same node within the same map task, so they both run in the same JVM.
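
To make the "one map call per record" point concrete, here is a minimal, self-contained sketch of a map task's read loop. It does not use the Hadoop libraries: SimpleLineReader and MapTaskSketch are hypothetical stand-ins, and only the method names nextKeyValue(), getCurrentKey() and getCurrentValue() are borrowed from Hadoop's RecordReader API. The reader and the map function are called from the same loop in the same process, which is why they share one JVM.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical line reader: key = byte offset of the line, value = line text.
class SimpleLineReader {
    private final byte[] data;
    private int pos = 0;
    private long key;
    private String value;

    SimpleLineReader(String text) {
        this.data = text.getBytes(StandardCharsets.UTF_8);
    }

    // Advances to the next record; returns false once the input is exhausted.
    boolean nextKeyValue() {
        if (pos >= data.length) {
            return false;
        }
        int start = pos;
        while (pos < data.length && data[pos] != '\n') {
            pos++;
        }
        key = start;
        value = new String(data, start, pos - start, StandardCharsets.UTF_8);
        if (pos < data.length) {
            pos++; // skip the newline
        }
        return true;
    }

    long getCurrentKey() { return key; }

    String getCurrentValue() { return value; }
}

class MapTaskSketch {
    // Stand-in for a user-defined map method: sees exactly one pair per call.
    static String map(long key, String value) {
        return key + "\t" + value.toUpperCase(Locale.ROOT);
    }

    // The task loop: pull one record from the reader, hand it to map, repeat.
    static List<String> run(SimpleLineReader reader) {
        List<String> out = new ArrayList<>();
        while (reader.nextKeyValue()) {
            out.add(map(reader.getCurrentKey(), reader.getCurrentValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : run(new SimpleLineReader("a\nbb\nccc\n"))) {
            System.out.println(line);
        }
    }
}
```

Three input lines produce three map calls, with keys 0, 2 and 5 (the byte offsets of each line).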
