Apache Hadoop forum: Do the Record Reader and the Mapper run in the same JVM or in different JVMs?
- This topic has 2 replies, 1 voice, and was last updated 5 years, 6 months ago by DataFlair Team.
September 20, 2018 at 5:05 pm | #6082 | DataFlair Team (Spectator)
Do the Record Reader and the Mapper run in the same JVM or in different JVMs?
September 20, 2018 at 5:05 pm | #6086 | DataFlair Team (Spectator)
In MapReduce, a Mapper understands only key-value pairs, so the input data must be converted into key-value pairs before it reaches the Mapper. This conversion is handled by the InputSplit and the RecordReader.
The InputFormat divides the input files into chunks called InputSplits. The split size is configurable, so the user can tune it to the size of the data. The number of map tasks equals the number of splits. A split is a logical division of the data, not a physical one.
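As a rough illustration of how split size and split count relate, the sketch below mirrors the rule that Hadoop's FileInputFormat applies, max(minSize, min(maxSize, blockSize)), together with the 10% slack it allows on the last split. The class and method names (SplitSizeSketch, countSplits) are made up for this example, the file is assumed to be splittable, and the code is plain Java rather than the Hadoop API.

```java
final class SplitSizeSketch {
    // The last split may be up to 10% larger than the split size.
    static final double SPLIT_SLOP = 1.1;

    /** Split size rule: clamp the block size between the configured min and max. */
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    /** Number of splits (and therefore map tasks) for one splittable file. */
    static int countSplits(long fileLength, long splitSize) {
        int splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining > 0) {
            splits++; // whatever is left becomes the final split
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long splitSize = computeSplitSize(128 * mb, 1, Long.MAX_VALUE);
        System.out.println("split size (MB): " + splitSize / mb);
        System.out.println("splits for a 300 MB file: " + countSplits(300 * mb, splitSize));
        System.out.println("splits for a 130 MB file: " + countSplits(130 * mb, splitSize));
    }
}
```

With a 128 MB block size and no extra limits, a 300 MB file gives three splits and so three map tasks, while a 130 MB file stays a single split because the remainder fits inside the slack.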
The RecordReader converts the data in a split into key-value pairs. The split's start is the byte offset at which the RecordReader begins generating key-value pairs, and its end is the position at which it stops reading. The RecordReader keeps reading from the InputSplit until the whole split has been consumed.
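To make the start and end offsets concrete, here is a minimal sketch of a line-oriented reader over an in-memory byte array. It is loosely modelled on how Hadoop's LineRecordReader treats split boundaries (a split that does not start at offset 0 skips its first, possibly partial, line, and a split reads through the line that crosses its end), but LineReaderSketch and its methods are hypothetical names and none of this is the Hadoop API.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class LineReaderSketch {
    /** Index just past the next newline at or after pos, or data.length if there is none. */
    static int endOfLine(byte[] data, int pos) {
        int i = pos;
        while (i < data.length && data[i] != '\n') {
            i++;
        }
        return Math.min(i + 1, data.length);
    }

    /** Emits (byte offset, line text) pairs for the split [start, start + length). */
    static List<Map.Entry<Long, String>> readSplit(byte[] data, int start, int length) {
        int end = start + length;
        int pos = start;
        if (start != 0) {
            // The previous split's reader owns the line that crosses into this split.
            pos = endOfLine(data, pos);
        }
        List<Map.Entry<Long, String>> records = new ArrayList<>();
        while (pos <= end && pos < data.length) {
            int next = endOfLine(data, pos);
            int textEnd = (data[next - 1] == '\n') ? next - 1 : next; // drop the newline
            String line = new String(data, pos, textEnd - pos, StandardCharsets.UTF_8);
            records.add(new SimpleImmutableEntry<Long, String>(Long.valueOf(pos), line));
            pos = next;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "aa\nbbb\ncc\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readSplit(data, 0, 5)); // first split
        System.out.println(readSplit(data, 5, 5)); // second split
    }
}
```

The second split begins in the middle of "bbb", so its reader skips ahead to "cc"; the first split's reader has already read "bbb" in full, and every line is produced exactly once.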
The InputFormat class is responsible for creating the splits and dividing them into records. By default, the split size matches the HDFS block size (64 MB in Hadoop 1.x, 128 MB in Hadoop 2.x and later). The getSplits() method of the InputFormat class computes the splits, and each split is assigned to a map task. The map task passes its split to the createRecordReader() method to obtain a RecordReader for that split. The RecordReader then loads the split and converts it into key-value pairs.
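The flow described above (compute splits, create one reader per split, one map task per split) can be sketched in plain Java. This is a simplified model, not the Hadoop API: the static methods only borrow the names getSplits and createRecordReader, the Split class is hypothetical, and splits are expressed as ranges of record indices instead of byte ranges to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class InputFormatFlowSketch {
    /** Hypothetical stand-in for an InputSplit: a logical range of record indices. */
    static final class Split {
        final int start;
        final int length;

        Split(int start, int length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Stand-in for getSplits(): carve the input into logical ranges. */
    static List<Split> getSplits(int totalRecords, int recordsPerSplit) {
        List<Split> splits = new ArrayList<>();
        for (int s = 0; s < totalRecords; s += recordsPerSplit) {
            splits.add(new Split(s, Math.min(recordsPerSplit, totalRecords - s)));
        }
        return splits;
    }

    /** Stand-in for createRecordReader(): one reader per split. */
    static Iterator<String> createRecordReader(List<String> input, Split split) {
        return input.subList(split.start, split.start + split.length).iterator();
    }

    /** One simulated map task per split; returns the records each task saw. */
    static List<List<String>> runTasks(List<String> input, int recordsPerSplit) {
        List<List<String>> perTask = new ArrayList<>();
        for (Split split : getSplits(input.size(), recordsPerSplit)) {
            Iterator<String> reader = createRecordReader(input, split);
            List<String> seen = new ArrayList<>();
            while (reader.hasNext()) {
                seen.add(reader.next());
            }
            perTask.add(seen);
        }
        return perTask;
    }

    public static void main(String[] args) {
        System.out.println(runTasks(Arrays.asList("a", "b", "c", "d", "e"), 2));
    }
}
```

Five records with two records per split produce three splits and therefore three simulated map tasks, each draining only the reader for its own split.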
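The RecordReader and the Mapper belong to the same map task: the task creates the RecordReader for its split and feeds each record it produces directly to the Mapper, so both run in the same JVM. Reducers, in contrast, run as separate tasks. If a reducer shared a JVM with a mapper, any reducer failure would kill that JVM and Hadoop would have to rerun the map phase, which would be very inefficient.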
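A small sketch may help show why the two share a JVM. In Hadoop, Mapper.run() is essentially a loop that calls context.nextKeyValue() and then map() on the current key and value, so reading and mapping are ordinary method calls inside one task. The plain-Java model below imitates that loop; Reader, UpperCaseMapper and runMapTask are hypothetical names, and recording the thread name is just a device for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SameJvmSketch {
    /** Hypothetical stand-in for a RecordReader that reuses one value object. */
    static final class Reader {
        private final String[] lines;
        private int next = 0;
        final StringBuilder value = new StringBuilder();
        String threadName; // thread that last called nextKeyValue()

        Reader(String... lines) {
            this.lines = lines;
        }

        boolean nextKeyValue() {
            threadName = Thread.currentThread().getName();
            if (next >= lines.length) {
                return false;
            }
            value.setLength(0);
            value.append(lines[next++]);
            return true;
        }
    }

    /** Hypothetical stand-in for a Mapper. */
    static final class UpperCaseMapper {
        final List<String> output = new ArrayList<>();
        String threadName; // thread that last called map()

        void map(StringBuilder value) {
            threadName = Thread.currentThread().getName();
            output.add(value.toString().toUpperCase(Locale.ROOT));
        }
    }

    /** The map task: read a record, hand it straight to the mapper, repeat. */
    static void runMapTask(Reader reader, UpperCaseMapper mapper) {
        while (reader.nextKeyValue()) {
            mapper.map(reader.value);
        }
    }

    public static void main(String[] args) {
        Reader reader = new Reader("alpha", "beta");
        UpperCaseMapper mapper = new UpperCaseMapper();
        runMapTask(reader, mapper);
        System.out.println(mapper.output);
        System.out.println(reader.threadName.equals(mapper.threadName));
    }
}
```

The mapper receives the very object the reader filled in, with no serialization or network hop in between, which is only possible because both live in the same process.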
September 20, 2018 at 5:06 pm | #6088 | DataFlair Team (Spectator)
The Mapper is the first processing stage in MapReduce.
A Mapper takes only one key-value pair as input at a time, so if the input contains 100 key-value pairs, the map() method is called 100 times.
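The one-call-per-record behaviour is easy to check with a counter. The sketch below is plain Java with hypothetical names (MapCallCountSketch, CountingMapper); it assumes newline-terminated text records, so each key is the running byte offset.

```java
import java.util.ArrayList;
import java.util.List;

final class MapCallCountSketch {
    /** Hypothetical mapper that only counts how often map() is invoked. */
    static final class CountingMapper {
        int calls = 0;

        void map(long key, String value) {
            calls++;
        }
    }

    /** Feeds every record to the mapper once and returns the number of map() calls. */
    static int run(List<String> records) {
        CountingMapper mapper = new CountingMapper();
        long offset = 0;
        for (String record : records) {
            mapper.map(offset, record);
            offset += record.length() + 1; // +1 for the assumed trailing newline
        }
        return mapper.calls;
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            records.add("record-" + i);
        }
        System.out.println(run(records)); // prints 100
    }
}
```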
The RecordReader is the class that actually loads the data and converts it into key-value pairs according to the InputFormat.
Because the RecordReader and the Mapper run inside the same map task on the same node, they both run in the same JVM.