- This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.
September 20, 2018 at 5:09 pm | #6114 | DataFlair Team (Spectator)
What is the need for key-value pairs in Hadoop MapReduce?
Why does MapReduce use key-value pairs to process the data?
September 20, 2018 at 5:09 pm | #6116 | DataFlair Team (Spectator)
Let's take the word count program as a simple example. Each mapper receives the offset of a line as the key and the entire line as the value. We split the line into words and write (word, 1) as the output of the mapper. The reducer then receives the word as its input key along with an iterable of values, which we traverse to get the total count for that word. Before the reducer receives its input, Hadoop shuffles and sorts the mapper output. As a result, the reducer gets its input sorted by key, with the output of all the mappers for the same key grouped together. If we did not produce everything as key-value pairs, we would have no way to combine the records for the same key from all the mappers and hand them to a single reducer.
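To make the flow concrete, here is a rough sketch of the same word count in plain Python. It does not use the Hadoop API; it only imitates the map, shuffle/sort and reduce steps in memory. The function names (map_line, shuffle_and_sort, reduce_word, word_count) are made up for this illustration, and the offset is a character offset rather than the byte offset Hadoop would supply.

```python
from collections import defaultdict


def map_line(offset, line):
    """Mapper: key is the line offset (unused here), value is the line text."""
    for word in line.split():
        yield (word, 1)  # emit (word, 1) for every word in the line


def shuffle_and_sort(pairs):
    """Imitates the shuffle/sort step: group values by key, then sort by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_word(word, counts):
    """Reducer: traverse the values for one key and total them."""
    return (word, sum(counts))


def word_count(lines):
    mapped = []
    offset = 0
    for line in lines:
        mapped.extend(map_line(offset, line))
        offset += len(line) + 1  # +1 for the newline that ended the line
    return [reduce_word(word, counts) for word, counts in shuffle_and_sort(mapped)]


if __name__ == "__main__":
    print(word_count(["the quick fox", "the lazy dog", "the fox"]))
    # [('dog', 1), ('fox', 2), ('lazy', 1), ('quick', 1), ('the', 3)]
```

Note that shuffle_and_sort can only group the records because every mapper output has a key. Without the key there would be nothing to group on.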
September 20, 2018 at 5:09 pm | #6117 | DataFlair Team (Spectator)
Hadoop is used to process structured, semi-structured and unstructured data, so the schema is not static. If the schema were static, we could use columns and rows to process the data set. For this reason, Hadoop uses the key-value concept in MapReduce to analyze the data.
Keys and values are not intrinsic properties of the data; they are chosen by the data analyst in order to analyze/process the data. So, to do any analysis using MapReduce, we have to specify the keys and values, i.e. what we are looking for (the key) and what it is worth (the value).
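As a small illustration of keys and values being the analyst's choice, the sketch below runs two different "jobs" over the same raw records just by changing what the mapper emits. This is plain Python, not Hadoop code. The record layout ("date user bytes") and the names RECORDS, bytes_by_user, requests_by_date and run_job are all invented for the example, and the reduce step is assumed to be a simple sum.

```python
from collections import defaultdict

# Hypothetical raw records in the form "date user bytes_downloaded".
RECORDS = [
    "2018-09-19 alice 120",
    "2018-09-19 bob 300",
    "2018-09-20 alice 80",
]


def bytes_by_user(record):
    """Question: how much did each user download? Key = user, value = bytes."""
    date, user, size = record.split()
    yield (user, int(size))


def requests_by_date(record):
    """Question: how many requests per day? Key = date, value = 1."""
    date, user, size = record.split()
    yield (date, 1)


def run_job(records, mapper):
    """Group mapper output by key, then sum the values for each key."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    return {key: sum(values) for key, values in sorted(grouped.items())}


if __name__ == "__main__":
    print(run_job(RECORDS, bytes_by_user))     # {'alice': 200, 'bob': 300}
    print(run_job(RECORDS, requests_by_date))  # {'2018-09-19': 2, '2018-09-20': 1}
```

The input data is identical in both runs. Only the mapper's choice of key and value changes, and that choice decides what the job computes.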