Do the Record Reader and Mapper run in the same JVM or in different JVMs?

This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

- September 20, 2018 at 5:05 pm #6082
  
  DataFlair Team
  Spectator
  
  Do the Record Reader and Mapper run in the same JVM or in different JVMs?
- September 20, 2018 at 5:05 pm #6086
  
  DataFlair Team
  Spectator
  
  In MapReduce, Mappers understand only key-value pairs, so the input data must be converted into key-value pairs before it is passed to a Mapper. This is done by the InputSplit and the RecordReader.
  
  An InputSplit divides the input files into chunks. The split size is configurable, so the user can set it based on the size of the data. The number of splits equals the number of map tasks. A split is a logical division of the data, not a physical one.
  
  The RecordReader converts the data into key-value pairs. The start is the byte offset at which the RecordReader begins generating key-value pairs, and the end is the position at which it stops reading. It keeps reading from the InputSplit until the whole split has been consumed.
  
  The InputFormat class is responsible for creating splits and dividing them into records. By default, the split size follows the HDFS block size, typically 64 MB or 128 MB. The getSplits() method of the InputFormat class computes the splits, and each split is assigned to a map task. The map task passes its split to the createRecordReader() method to obtain a RecordReader for that split. The RecordReader then loads the split and converts it into key-value pairs.
  
  The RecordReader is created and driven by the map task itself, so the RecordReader and the Mapper run in the same JVM on the same node. A reducer, by contrast, runs as a separate task. If a reducer shared a JVM with a mapper, any failure in the reducer would kill that JVM, and Hadoop would have to re-run the map phase, which would be very inefficient.
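The split and record-reading steps described in this reply can be sketched in plain Python. This is only a simulation, not the Hadoop API: `get_splits` and `read_records` are hypothetical names, and the boundary handling (a split that does not start at byte 0 skips its first line, and a reader finishes the line that crosses its end offset) is an assumption modeled on how line-oriented readers usually behave.

```python
def get_splits(file_size, split_size):
    """Return logical (start, end) byte ranges; no data is copied or moved."""
    return [(start, min(start + split_size, file_size))
            for start in range(0, file_size, split_size)]


def read_records(data, start, end):
    """Yield (byte_offset, line) pairs for the lines owned by one split."""
    pos = start
    if start != 0:
        # Assumption: the line at or after a non-zero start is read by the
        # previous split, so skip ahead to just past the next newline.
        newline = data.find(b"\n", start)
        pos = len(data) if newline == -1 else newline + 1
    # Read every line that begins at or before the end offset, even if the
    # line itself runs past the end of the split.
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        line_end = len(data) if newline == -1 else newline
        yield pos, data[pos:line_end].decode("utf-8")
        pos = line_end + 1


data = b"alpha\nbeta\ngamma\ndelta\n"
splits = get_splits(len(data), 8)  # one map task per split
records_per_split = [list(read_records(data, s, e)) for s, e in splits]
```

With these assumptions every line is produced exactly once, keyed by its byte offset, even though the split boundaries fall in the middle of lines.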
- September 20, 2018 at 5:06 pm #6088
  
  DataFlair Team
  Spectator
  
  Mapper is the first processing stage in MapReduce.
  A Mapper takes only one key-value pair as input at a time, so if there are 100 key-value pairs, the map method is called 100 times.
  RecordReader is the class that actually loads the data and converts it into key-value pairs according to the InputFormat.
  Because the RecordReader and the Mapper are part of the same map task on the same node, they both run in the same JVM.
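The one-call-per-record behavior can be illustrated with a small Python simulation. It is a sketch under stated assumptions rather than Hadoop code: `LineReader`, `CountingMapper`, and `run_map_task` are hypothetical names, and a single Python process stands in for the JVM of one map task.

```python
import os


class LineReader:
    """Stand-in for a record reader: turns lines into (byte_offset, line)."""

    def __init__(self, lines):
        self.lines = lines
        self.pid = os.getpid()  # process in which the reader was created

    def __iter__(self):
        offset = 0
        for line in self.lines:
            yield offset, line
            offset += len(line.encode("utf-8")) + 1  # +1 for the newline


class CountingMapper:
    """Stand-in for a mapper: receives exactly one key-value pair per call."""

    def __init__(self):
        self.calls = 0
        self.pid = os.getpid()  # process in which the mapper was created

    def map(self, key, value):
        self.calls += 1
        return (value.upper(), 1)


def run_map_task(lines):
    """One map task: create the reader, then call map() once per record."""
    reader = LineReader(lines)
    mapper = CountingMapper()
    output = [mapper.map(key, value) for key, value in reader]
    return reader, mapper, output


lines = ["record %d" % i for i in range(100)]
reader, mapper, output = run_map_task(lines)
```

Here the task itself constructs the reader and feeds its records to the mapper in a loop, which is why both objects live in the same process: 100 records lead to 100 `map` calls, and `reader.pid` equals `mapper.pid`.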