Explain NameNode High Availability in HDFS?

This topic has 1 reply, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.


Author

Posts
- September 20, 2018 at 5:06 pm #6091
  
  DataFlair Team
  Spectator
  
  What do you mean by the High Availability of a NameNode in Hadoop?
- September 20, 2018 at 5:06 pm #6093
  
  DataFlair Team
  Spectator
  
  In earlier versions of Hadoop (1.x), the NameNode was a single point of failure.
  Hadoop 2 overcomes this by maintaining one active NameNode and one standby NameNode.
  
  The standby NameNode holds all the metadata needed to take over when a failover happens.
  
  DataNodes send both block reports and heartbeats to the active NameNode as well as the standby NameNode (not to be confused with the Secondary NameNode, which is a separate checkpointing role).
  
  The active NameNode writes every namespace change as an edit log entry to a shared location (the JournalNodes). The standby NameNode watches this shared location and, whenever a new edit appears, reads it and applies it to its own namespace, so the active and standby NameNodes stay in sync.
  Only the active NameNode can write to the shared location.
  
  Each of the NameNode machines in the cluster maintains a persistent session in ZooKeeper. If the machine crashes, the ZooKeeper session will expire, notifying the other NameNode that a failover should be triggered.
  
  If the current active NameNode crashes, another NameNode may take a special exclusive lock in ZooKeeper indicating that it should become the next active, and the transition then takes place.
  
  For more details, see: NameNode High Availability in Hadoop
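
The mechanism described in the reply above can be illustrated with a small in-memory toy model. This is only a sketch, not Hadoop code: it does not use any Hadoop or ZooKeeper API, and the class names `SharedEditLog`, `LockService`, and `NameNode` are hypothetical. The epoch number used to reject a stale writer is an assumption of this sketch, loosely modeled on the idea that the JournalNodes accept edits from only one writer at a time.

```python
class SharedEditLog:
    """Toy stand-in for the JournalNodes: an append-only log with a single writer."""

    def __init__(self):
        self.edits = []        # edits[i] is the edit with transaction id i + 1
        self.writer_epoch = 0  # highest epoch handed out so far

    def new_epoch(self):
        # A NameNode that becomes active claims a higher epoch,
        # which fences off any previous writer.
        self.writer_epoch += 1
        return self.writer_epoch

    def append(self, epoch, edit):
        if epoch != self.writer_epoch:
            raise PermissionError("only the current active NameNode may write")
        self.edits.append(edit)

    def read_from(self, applied):
        # Return every edit after the first `applied` ones.
        return self.edits[applied:]


class LockService:
    """Toy stand-in for ZooKeeper: one exclusive lock tied to a session."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, name):
        if self.holder is None:
            self.holder = name
        return self.holder == name

    def expire_session(self, name):
        # When the holder's session expires, its lock is released.
        if self.holder == name:
            self.holder = None


class NameNode:
    def __init__(self, name, log, lock):
        self.name, self.log, self.lock = name, log, lock
        self.namespace = set()
        self.applied = 0    # number of edits applied to the local namespace
        self.epoch = None   # set only once this node has become active

    def tail_edits(self):
        # Standby behaviour: read new edits and apply them locally.
        for op, path in self.log.read_from(self.applied):
            if op == "mkdir":
                self.namespace.add(path)
            self.applied += 1

    def try_become_active(self):
        if not self.lock.try_acquire(self.name):
            return False
        self.tail_edits()                  # catch up before taking over
        self.epoch = self.log.new_epoch()  # fence the previous writer
        return True

    def mkdir(self, path):
        # Active behaviour: log the edit first, then apply it locally.
        self.log.append(self.epoch, ("mkdir", path))
        self.tail_edits()


log, lock = SharedEditLog(), LockService()
nn1, nn2 = NameNode("nn1", log, lock), NameNode("nn2", log, lock)

nn1_active = nn1.try_become_active()   # nn1 takes the lock
nn2_blocked = nn2.try_become_active()  # nn2 stays standby
nn1.mkdir("/data")
nn2.tail_edits()                       # standby stays in sync

lock.expire_session("nn1")             # nn1 crashes, its session expires
nn2_active = nn2.try_become_active()   # nn2 takes the lock and goes active
nn2.mkdir("/logs")
```

In this sketch, if `nn1` came back and called `nn1.mkdir(...)` with its old epoch, the log would raise `PermissionError`, which mirrors the rule that only the active NameNode can write to the shared location.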