What is the need for key value in MapReduce Hadoop?


    • #6114
      DataFlair Team
      Spectator

      What is the need for key-value pairs in Hadoop MapReduce?
      Why does MapReduce use key-value pairs to process data?

    • #6116
      DataFlair Team
      Spectator

      Let's take word count as a simple example. In a word count program, each mapper receives the byte offset of a line as the key and the entire line as the value. We split the line into words and write (word, 1) as the mapper output. Before the reducer runs, Hadoop shuffles and sorts the mapper output by key, so the reducer receives its input in sorted key order, with the values emitted by all mappers for the same key grouped together. The reducer therefore gets the word as its input key and an iterable of values, which it traverses to compute the total count for that word. If the mappers did not emit key-value pairs, there would be no key by which to group the records from all the mappers and route them to a single reducer.
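To make the flow above concrete, here is a minimal sketch that imitates the three phases in plain Python. It does not use Hadoop at all: the function names (`map_line`, `shuffle_and_sort`, `reduce_counts`, `word_count`) and the sample lines are made up for illustration, and lowercasing the words is an extra assumption, not something the post requires.

```python
from itertools import groupby
from operator import itemgetter


def map_line(offset, line):
    """Mapper: the key is the line's byte offset (unused here), the value is the line."""
    for word in line.split():
        # Assumption for this sketch: treat words case-insensitively.
        yield (word.lower(), 1)


def shuffle_and_sort(pairs):
    """Imitate shuffle/sort: order pairs by key, then group all values per key."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [value for _, value in group]


def reduce_counts(word, counts):
    """Reducer: traverse the values for one key and add them up."""
    return (word, sum(counts))


def word_count(lines):
    mapped = []
    offset = 0
    for line in lines:
        mapped.extend(map_line(offset, line))
        offset += len(line.encode("utf-8")) + 1  # +1 for the newline
    return [reduce_counts(word, counts) for word, counts in shuffle_and_sort(mapped)]


result = word_count(["the quick brown fox", "the lazy dog", "the fox"])
print(result)
# [('brown', 1), ('dog', 1), ('fox', 2), ('lazy', 1), ('quick', 1), ('the', 3)]
```

Notice that `shuffle_and_sort` can only bring the three `("the", 1)` pairs together because every record carries a key. Without the key there would be nothing to sort or group on.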

    • #6117
      DataFlair Team
      Spectator

      Hadoop is used to process structured, semi-structured, and unstructured data, so the schema is not static. If the schema were static, we could process the data set in terms of rows and columns. Because it is not, Hadoop uses the key-value concept in MapReduce to analyze the data.

      Keys and values are not intrinsic properties of the data; they are chosen by the data analyst to frame how the data is analyzed or processed. So, to do any analysis using MapReduce, we have to specify the keys and values, i.e. what we are looking for (the key) and what it is worth (the value).
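As a rough illustration of the point that the analyst chooses the key, the sketch below runs two different "jobs" over the same raw records just by changing which field the mapper emits as the key. Everything here is hypothetical: the record layout (`date,city,temperature`), the sample values, and the helper names (`map_by_city`, `map_by_date`, `run_job`) are invented for this example and are not Hadoop APIs.

```python
from collections import defaultdict

# Hypothetical raw records: date,city,temperature
records = [
    "2023-01-01,Pune,31",
    "2023-01-01,Delhi,19",
    "2023-01-02,Pune,33",
    "2023-01-02,Delhi,21",
]


def map_by_city(record):
    """Question: how hot did each city get? Key = city, value = temperature."""
    _date, city, temp = record.split(",")
    return (city, int(temp))


def map_by_date(record):
    """Question: how hot was each day? Key = date, value = temperature."""
    date, _city, temp = record.split(",")
    return (date, int(temp))


def run_job(records, mapper, reducer):
    """Group mapper output by key, then apply the reducer to each key's values."""
    grouped = defaultdict(list)
    for record in records:
        key, value = mapper(record)
        grouped[key].append(value)
    return {key: reducer(values) for key, values in sorted(grouped.items())}


max_by_city = run_job(records, map_by_city, max)
max_by_date = run_job(records, map_by_date, max)
print(max_by_city)  # {'Delhi': 21, 'Pune': 33}
print(max_by_date)  # {'2023-01-01': 31, '2023-01-02': 33}
```

The input never changes. Only the choice of key does, and that choice decides which records end up together in front of the reducer.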
