Explain NameNode High Availability in HDFS?
September 20, 2018 at 5:06 pm #6091 DataFlair Team (Spectator)
What do you mean by the High Availability of a NameNode in Hadoop?
-
September 20, 2018 at 5:06 pm #6093 DataFlair Team (Spectator)
In Hadoop 1.x, the NameNode was a single point of failure: if it went down, the whole HDFS cluster was unavailable until it came back.
Hadoop 2 removes this problem by running one active NameNode and one standby NameNode. The standby NameNode keeps all the metadata it needs to take over when a failover occurs.
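DataNodes send both block reports and heartbeats to the active NameNode as well as the standby NameNode (the standby is not the same thing as the Secondary NameNode, which only performs checkpoints). This keeps the standby's view of block locations up to date.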
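To make the reporting pattern concrete, here is a minimal Python sketch. It is a toy in-memory model, not Hadoop code: the `NameNode` and `DataNode` classes, their methods, and the IDs such as `nn1` and `blk_1` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NameNode:
    """Toy NameNode that only tracks heartbeats and block locations."""
    name: str
    state: str  # "active" or "standby"
    block_map: dict = field(default_factory=dict)       # block id -> set of DataNode ids
    last_heartbeat: dict = field(default_factory=dict)  # DataNode id -> logical time

    def receive_heartbeat(self, datanode_id: str, now: int) -> None:
        self.last_heartbeat[datanode_id] = now

    def receive_block_report(self, datanode_id: str, block_ids: list) -> None:
        for block_id in block_ids:
            self.block_map.setdefault(block_id, set()).add(datanode_id)


@dataclass
class DataNode:
    node_id: str
    blocks: list

    def report_to(self, namenodes: list, now: int) -> None:
        # Report to every NameNode, regardless of whether it is active or standby.
        for nn in namenodes:
            nn.receive_heartbeat(self.node_id, now)
            nn.receive_block_report(self.node_id, self.blocks)


active = NameNode("nn1", "active")
standby = NameNode("nn2", "standby")
for dn in (DataNode("dn1", ["blk_1", "blk_2"]), DataNode("dn2", ["blk_2"])):
    dn.report_to([active, standby], now=1)

# Both NameNodes now hold the same block locations.
print(standby.block_map == active.block_map)
```

Because the standby already knows where every block lives, it does not have to wait for a fresh round of block reports before it can serve requests after a failover.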
The active NameNode writes every namespace change as an edit log entry to a shared location (a group of JournalNodes). The standby NameNode watches this shared edit log and, whenever new edits appear, reads and applies them to its own namespace, so the active and standby NameNodes stay in sync.
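The write and read sides of the shared edit log can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not the real JournalNode protocol: `JournalNode`, `log_edit`, and `tail_edits` are hypothetical names, an edit is treated as durable once a majority of JournalNodes acknowledge it, and the standby here simply merges whatever the reachable JournalNodes hold.

```python
class JournalNode:
    """Toy JournalNode: stores edit log entries keyed by transaction id."""

    def __init__(self, up: bool = True):
        self.up = up
        self.edits = {}  # txid -> operation string

    def write(self, txid: int, op: str) -> bool:
        if not self.up:
            return False
        self.edits[txid] = op
        return True


def log_edit(journal_nodes, txid, op):
    """Active side: the edit counts as durable once a majority acknowledges it."""
    acks = sum(jn.write(txid, op) for jn in journal_nodes)
    return acks > len(journal_nodes) // 2


def tail_edits(journal_nodes, namespace, last_applied_txid):
    """Standby side: apply, in txid order, every edit newer than the last one applied."""
    merged = {}
    for jn in journal_nodes:
        if jn.up:
            merged.update(jn.edits)
    for txid in sorted(t for t in merged if t > last_applied_txid):
        namespace.append(merged[txid])
        last_applied_txid = txid
    return last_applied_txid


# Three JournalNodes, one of them down: a majority (2 of 3) is still reachable.
journal = [JournalNode(), JournalNode(), JournalNode(up=False)]
committed = [
    log_edit(journal, 1, "mkdir /data"),
    log_edit(journal, 2, "create /data/a.txt"),
]

standby_namespace = []
last_txid = tail_edits(journal, standby_namespace, last_applied_txid=0)
print(committed, standby_namespace, last_txid)
```

Tracking the last applied transaction id lets the standby call `tail_edits` repeatedly and pick up only the edits it has not seen yet.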
Only the active NameNode can write to the shared location. Each of the NameNode machines in the cluster maintains a persistent session in ZooKeeper. If a machine crashes, its ZooKeeper session expires, notifying the other NameNode that a failover should be triggered.
If the current active NameNode crashes, the other NameNode can take a special exclusive lock in ZooKeeper indicating that it should become the next active, and the transition then takes place.
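The session-and-lock mechanism described above can be illustrated with a small Python sketch. It does not talk to a real ZooKeeper ensemble: `ToyZooKeeper` is a hypothetical in-memory stand-in that models only one exclusive lock tied to its owner's session, and `nn1` and `nn2` are made-up NameNode IDs.

```python
class ToyZooKeeper:
    """In-memory stand-in for ZooKeeper; models a single session-bound lock."""

    def __init__(self):
        self.sessions = set()
        self.lock_holder = None

    def open_session(self, owner):
        self.sessions.add(owner)

    def try_acquire_lock(self, owner):
        # Only a live session can take the lock, and only one holder at a time.
        if owner in self.sessions and self.lock_holder is None:
            self.lock_holder = owner
        return self.lock_holder == owner

    def expire_session(self, owner):
        # The lock disappears together with its owner's session.
        self.sessions.discard(owner)
        if self.lock_holder == owner:
            self.lock_holder = None


zk = ToyZooKeeper()
states = {}

# Both NameNodes start up; the first one to take the lock becomes active.
for nn in ("nn1", "nn2"):
    zk.open_session(nn)
    states[nn] = "active" if zk.try_acquire_lock(nn) else "standby"
startup_states = dict(states)

# nn1's machine crashes, so its session expires and the lock is released.
zk.expire_session("nn1")
states["nn1"] = "down"

# The surviving NameNode takes the lock and transitions to active.
if zk.try_acquire_lock("nn2"):
    states["nn2"] = "active"

print(startup_states, states)
```

Tying the lock to the session is what makes failure detection automatic in this model: no one has to release the lock explicitly when the holder dies.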
For more detail follow: NameNode High Availability in Hadoop