Forums › Apache Hadoop › Why replication is done in HDFS (Hadoop)
This topic has 7 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.
September 20, 2018 at 4:57 pm #6010 · DataFlair Team (Spectator)
What is the need for replication in HDFS (Hadoop Distributed File System)?
September 20, 2018 at 4:57 pm #6011 · DataFlair Team (Spectator)
Replication in HDFS increases the availability of data at any point in time. If a node containing a block of data used for processing crashes, the same block can still be read from another node; this is possible only because of replication.
Replication is one of the major factors in making HDFS a fault-tolerant system.
September 20, 2018 at 4:57 pm #6014 · DataFlair Team (Spectator)
The most important feature of HDFS is availability of data, and the replication of data blocks exists to achieve that objective.
Replication ensures that the same data block is present on more than one DataNode, with the addresses of all replicas stored in the NameNode, so that if one DataNode goes down the block can still be retrieved from another DataNode.
Replication therefore keeps data available at all times, even in the case of node failure, making the system fault tolerant.
September 20, 2018 at 4:57 pm #6017 · DataFlair Team (Spectator)
Data availability is the most important feature of HDFS, and it is possible because of data replication.
Suppose a data block is stored on only one DataNode; if that node goes down, we might lose the data. Replication addresses this problem.
With replication, we store each data block on more than one node, so that if one node goes down the data is still available on another node.
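The number of replicas the answers above describe is configurable per cluster (and per file). A minimal, illustrative `hdfs-site.xml` fragment setting the default replication factor (the property name `dfs.replication` is real; 3 is HDFS's default value):

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <!-- default: each block is stored on 3 DataNodes -->
    <value>3</value>
  </property>
</configuration>
```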
September 20, 2018 at 4:57 pm #6018 · DataFlair Team (Spectator)
How does the NameNode handle the failure of DataNodes in Hadoop?
September 20, 2018 at 4:57 pm #6019 · DataFlair Team (Spectator)
How does the NameNode handle DataNode failures in Hadoop HDFS?
September 20, 2018 at 4:58 pm #6020 · DataFlair Team (Spectator)
The NameNode holds metadata about the DataNodes, e.g. which DataNodes store each block's replicas and the replication factor of each file (mentioning only the features relevant to this question).
Every DataNode constantly sends a heartbeat to the NameNode; this is how the NameNode knows that the DataNode is working. If, for any reason, a DataNode stops sending heartbeats, the NameNode concludes that the node is down and makes sure the blocks on that DataNode get replicated onto other nodes. If the node later starts sending heartbeats again, the NameNode rebalances the replicas so the replication factor is restored. This is how the NameNode handles DataNode failure in Hadoop HDFS.
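The heartbeat bookkeeping described above can be sketched in a few lines. This is an illustrative simulation, not Hadoop's actual implementation: the class name `HeartbeatMonitor` and the timeout value are made up (HDFS's real dead-node interval is configurable and much longer).

```python
# Hypothetical sketch of NameNode-style heartbeat tracking.
HEARTBEAT_TIMEOUT = 10.0  # seconds; illustrative value only

class HeartbeatMonitor:
    def __init__(self):
        self.last_seen = {}  # DataNode id -> timestamp of last heartbeat

    def heartbeat(self, node_id, now):
        """Record that a DataNode reported in at time `now`."""
        self.last_seen[node_id] = now

    def dead_nodes(self, now):
        """Nodes whose last heartbeat is older than the timeout are considered dead."""
        return [n for n, t in self.last_seen.items()
                if now - t > HEARTBEAT_TIMEOUT]

monitor = HeartbeatMonitor()
monitor.heartbeat("dn1", now=0.0)
monitor.heartbeat("dn2", now=0.0)
monitor.heartbeat("dn1", now=8.0)    # dn1 keeps reporting; dn2 goes silent
print(monitor.dead_nodes(now=12.0))  # -> ['dn2']
```

Once a node appears in `dead_nodes`, the NameNode would schedule re-replication of the blocks that node held, as the answer above explains.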
September 20, 2018 at 4:58 pm #6022 · DataFlair Team (Spectator)
Each DataNode constantly communicates with the NameNode by sending it a heartbeat message periodically.
If the NameNode stops receiving these heartbeats, it considers that DataNode dead and no longer sends new requests to it. If the replication factor is greater than 1, the blocks that were on the dead DataNode can be recovered from the other DataNodes where replicas are available, which is what provides data availability and fault tolerance.
The NameNode coordinates the re-replication of data blocks from one DataNode to another. However, the replication data transfer happens directly between DataNodes; the data never passes through the NameNode.
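The coordination step, choosing which surviving replica to copy from and which node to copy to, can be sketched as below. This is a simplified illustration under assumed names (`rereplicate` is not a real HDFS API); real HDFS block placement also considers racks, node load, and free disk space.

```python
# Sketch: picking a (source, target) pair for one under-replicated block.
def rereplicate(block_locations, live_nodes, dead_node):
    """Return (source, target) for copying a block off a dead node,
    or None if no surviving replica exists (the block is lost)."""
    # surviving replicas: holders of the block that are still alive
    holders = [n for n in block_locations if n != dead_node and n in live_nodes]
    if not holders:
        return None
    source = holders[0]
    # target: any live node that does not already hold the block
    candidates = [n for n in live_nodes if n not in block_locations]
    target = candidates[0] if candidates else None
    return (source, target)

live = ["dn1", "dn3", "dn4"]
print(rereplicate(["dn1", "dn2"], live, dead_node="dn2"))  # -> ('dn1', 'dn3')
```

Note that the function only names the pair; consistent with the answer above, the actual block bytes would then flow directly from `source` to `target`, never through the NameNode.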