What if a reducer cannot handle the input data coming from the mapper?

  • Author
    Posts
    • #4954
      DataFlair Team
      Spectator

      As we all know, data with the same key goes to the same reducer.

      Let’s say we have a 20-node cluster in which each node has a 3 TB disk.

      Suppose that after the mappers have processed the data, I have 7 TB of data, all of which has the same key and therefore needs to go to the same reducer.

      How will the reducer handle this 7 TB of data when each slave node has only 3 TB of disk?

      Will this data be processed on different machines? Can a single reducer run across multiple slave nodes?

    • #4955
      DataFlair Team
      Spectator

      In this case, I would use a composite key to handle the scenario.
      Ideally, though, if all of your data has the same key, that key is not a good choice for the data set.
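
The composite-key idea can be illustrated with a small simulation. This is only a sketch in plain Python, not Hadoop code: `stable_hash`, `partition_plain`, `partition_composite`, `run_job`, and the values `NUM_REDUCERS = 20` and `NUM_SALTS = 4` are all hypothetical names and numbers chosen for illustration. It assumes the reduce logic is a sum, so partial results computed per (key, salt) can be merged in a second pass.

```python
import zlib
from collections import Counter, defaultdict

NUM_REDUCERS = 20  # assumption: one reducer per node of the 20-node cluster
NUM_SALTS = 4      # hypothetical number of salt buckets for the hot key


def stable_hash(text):
    """Deterministic hash, standing in for the key's hash code."""
    return zlib.crc32(text.encode("utf-8"))


def partition_plain(key):
    """Same key -> same reducer, as in hash partitioning."""
    return stable_hash(key) % NUM_REDUCERS


def partition_composite(composite_key):
    """Hypothetical custom partitioner: offset the reducer by the salt."""
    natural_key, salt = composite_key
    return (stable_hash(natural_key) + salt) % NUM_REDUCERS


def run_job(records, make_key, partitioner):
    """Toy shuffle + reduce: returns ({key: sum}, records per reducer)."""
    sums = defaultdict(int)
    load = Counter()
    for index, (key, value) in enumerate(records):
        out_key = make_key(key, index)
        sums[out_key] += value
        load[partitioner(out_key)] += 1
    return dict(sums), load


# 1,000 records that all share one key, like the 7 TB case in the question.
records = [("hot", 1)] * 1000

# Plain key: every record lands on a single reducer.
plain_sums, plain_load = run_job(records, lambda k, i: k, partition_plain)

# Composite key (key, salt): the salt is assigned round-robin per record.
salted_sums, salted_load = run_job(
    records, lambda k, i: (k, i % NUM_SALTS), partition_composite
)

# Second pass: drop the salt and merge the partial sums per natural key.
final_sums = defaultdict(int)
for (natural_key, _salt), partial in salted_sums.items():
    final_sums[natural_key] += partial

print(len(plain_load), len(salted_load), dict(final_sums))
# prints: 1 4 {'hot': 1000}
```

With the plain key, one reducer receives all 1,000 records. With the composite key, the same records are split across four reducers (250 each), and the second pass restores the single total. This only helps when the reduce operation can be computed from partial results, as a sum or count can.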
