Explain Data Locality in Hadoop?
- This topic has 2 replies, 1 voice, and was last updated 5 years, 6 months ago by DataFlair Team.
September 20, 2018 at 3:56 pm #5683 DataFlair Team (Spectator)
What does the term "Data Locality" mean in Hadoop?
What is data locality? What is the need for data locality in Hadoop MapReduce?
September 20, 2018 at 3:56 pm #5685 DataFlair Team (Spectator)
What does the term Data Locality mean in Hadoop?
Data locality is one of the design principles of Hadoop. Under this principle, moving large volumes of data is avoided wherever possible. Instead, the computation code is moved to the DataNode that holds the data, processes the data there, and writes the output to HDFS.
What is data locality? What is the need for data locality in Hadoop MapReduce?
Data locality means that the framework tries to run each MapReduce task on the DataNode that stores its input. As a result, small computation code (KBs) travels across the network rather than huge data sets (GBs or TBs), which makes better use of network resources and reduces the time needed to complete a MapReduce job.
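To put rough numbers on the KBs-versus-TBs argument above, here is a small back-of-the-envelope sketch. The link speed, code size, and data size are assumed values chosen only for illustration; they are not taken from any real cluster, and the calculation ignores protocol overhead, parallel links, and replication.

```python
# Back-of-the-envelope comparison: shipping job code vs. shipping the data.
# All three constants below are illustrative assumptions, not measurements.

def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time to push size_bytes over a link, ignoring overhead."""
    return size_bytes * 8 / link_bits_per_second

LINK_BPS = 1e9               # assumed 1 Gbit/s network link
CODE_BYTES = 100 * 1024      # assumed 100 KB of job code
DATA_BYTES = 1 * 1024 ** 4   # assumed 1 TB of input data

code_time = transfer_seconds(CODE_BYTES, LINK_BPS)
data_time = transfer_seconds(DATA_BYTES, LINK_BPS)

print(f"code: {code_time:.6f} s")
print(f"data: {data_time / 3600:.2f} h")
print(f"data takes about {data_time / code_time:,.0f} times longer")
```

Under these assumptions the code moves in under a millisecond, while the data would need roughly 2.4 hours on a single link, which is the gap that data locality is meant to avoid.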
September 20, 2018 at 3:57 pm #5687 DataFlair Team (Spectator)
Hadoop works on huge volumes of data, so it is not feasible to move that much data over the network.
Hadoop therefore follows the principle of moving the algorithm to the data rather than the data to the algorithm. This is called data locality.
So whenever a MapReduce job is invoked, the logic usually goes to the data for computation rather than the data being moved to the MapReduce job.
Running the map code on the node where the data resides significantly reduces the amount of data that has to cross the network.
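The idea of sending the logic to the data can be sketched as a toy task-placement routine. This is not Hadoop's actual scheduler, which is considerably more involved; the function name `assign_map_tasks`, the block IDs, the node names, and the slot counts are all hypothetical and exist only to illustrate the preference for a node that already holds the block.

```python
# Toy illustration of locality-aware task placement (NOT Hadoop's real scheduler).
# block_locations: which nodes hold a replica of each block (hypothetical data).
# free_slots: how many tasks each node can still accept (hypothetical data).

def assign_map_tasks(block_locations, free_slots):
    """Return {block: (node, is_local)}, preferring nodes that hold the block."""
    slots = dict(free_slots)  # copy so the caller's dict is not modified
    assignments = {}
    for block, replicas in block_locations.items():
        # First choice: a node that stores a replica and has a free slot.
        node = next((n for n in replicas if slots.get(n, 0) > 0), None)
        is_local = node is not None
        if node is None:
            # Fallback: any node with a free slot; the block must be read remotely.
            node = next((n for n, s in slots.items() if s > 0), None)
            if node is None:
                raise RuntimeError("no free slots left in the cluster")
        slots[node] -= 1
        assignments[block] = (node, is_local)
    return assignments

block_locations = {
    "blk_1": ["node1", "node2"],
    "blk_2": ["node2", "node3"],
    "blk_3": ["node1", "node3"],
}
free_slots = {"node1": 1, "node2": 1, "node3": 0, "node4": 1}

assignments = assign_map_tasks(block_locations, free_slots)
for block, (node, is_local) in assignments.items():
    print(block, "->", node, "(local)" if is_local else "(remote read)")
```

In this made-up setup the first two blocks are processed on nodes that already store them, and only the third has to be read over the network because none of its replica holders has a free slot.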