What is active and passive NameNode in Apache Hadoop?

This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.


Author

Posts
- September 20, 2018 at 5:35 pm #6281
  
  DataFlair Team
  Spectator
  
  Why is a passive NameNode needed in Hadoop?
- September 20, 2018 at 5:35 pm #6283
  
  DataFlair Team
  Spectator
  
  The NameNode maintains the HDFS metadata. It knows which blocks make up each file and which DataNodes hold each block. Because the DataNodes continuously send heartbeats to the NameNode, the NameNode can detect when a DataNode fails, and it is responsible for restoring the replication factor of the affected blocks. It arranges for a replica that is not corrupted to be copied to some other DataNode.
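
The heartbeat and re-replication behaviour described above can be sketched roughly as follows. This is only a toy model, not Hadoop code: the class `ToyBlockManager`, its methods, and the 30-second timeout are all assumptions made up for illustration.

```python
HEARTBEAT_TIMEOUT = 30.0  # seconds; illustrative value, not a Hadoop default


class ToyBlockManager:
    """Hypothetical stand-in for the NameNode's view of DataNodes and blocks."""

    def __init__(self, replication_factor=3):
        self.replication_factor = replication_factor
        self.last_heartbeat = {}   # DataNode id -> time of its last heartbeat
        self.block_locations = {}  # block id -> set of DataNode ids holding a replica

    def heartbeat(self, datanode, now):
        self.last_heartbeat[datanode] = now

    def add_replica(self, block, datanode):
        self.block_locations.setdefault(block, set()).add(datanode)

    def live_datanodes(self, now):
        # A DataNode counts as alive only if it reported recently enough.
        return {dn for dn, ts in self.last_heartbeat.items()
                if now - ts <= HEARTBEAT_TIMEOUT}

    def check_replication(self, now):
        """Return (block, source, target) copy instructions for under-replicated blocks."""
        live = self.live_datanodes(now)
        plan = []
        for block in sorted(self.block_locations):
            holders = self.block_locations[block]
            healthy = holders & live
            missing = self.replication_factor - len(healthy)
            if not healthy or missing <= 0:
                continue  # either no surviving replica to copy from, or nothing to do
            source = sorted(healthy)[0]
            targets = sorted(live - holders)[:missing]
            for target in targets:
                plan.append((block, source, target))
            self.block_locations[block] = healthy | set(targets)
        return plan


manager = ToyBlockManager(replication_factor=2)
for dn in ("dn1", "dn2", "dn3"):
    manager.heartbeat(dn, now=0.0)
manager.add_replica("blk_1", "dn1")
manager.add_replica("blk_1", "dn2")

# dn1 stops reporting; dn2 and dn3 keep sending heartbeats.
manager.heartbeat("dn2", now=40.0)
manager.heartbeat("dn3", now=40.0)
plan = manager.check_replication(now=45.0)
print(plan)
```

In this sketch the manager only produces copy instructions; the data itself would move between DataNodes, from a surviving replica to the new target.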
  
  In Hadoop 1.0, there is only one NameNode, which makes it a single point of failure. If the NameNode fails, the cluster loses access to all the information about which blocks are stored on which DataNodes, so HDFS becomes unavailable.
  
  So in Hadoop 2.0, two NameNodes are maintained: one active and one passive (standby) NameNode.
  
  The passive NameNode maintains enough metadata to take over when the active NameNode fails.
  
  This is done by maintaining shared storage between the active and passive NameNodes. Only the active NameNode has write access to this shared storage, which prevents a split-brain scenario in which both the active and the passive NameNode write to the same storage.
  
  When the active NameNode makes any change to the file system namespace, it writes the change to the edit log on the shared storage. The passive NameNode constantly reads the edit log from the shared storage and applies the changes made by the active NameNode to its own copy of the file system namespace. When a failover happens, the passive NameNode makes sure it has applied every change in the edit log, so that its namespace is fully synchronized, before it declares itself the active NameNode.
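
The edit log mechanism above can be illustrated with a small simulation. It is a sketch under stated assumptions, not how Hadoop implements it: `SharedEditLog`, `ToyNameNode`, and the `("create", path)` / `("delete", path)` operations are hypothetical names invented for this example.

```python
class SharedEditLog:
    """Hypothetical shared storage: only the current writer may append."""

    def __init__(self):
        self.entries = []
        self.writer = None  # name of the NameNode that currently has write access

    def append(self, node_name, op):
        if node_name != self.writer:
            # Rejecting other writers is what prevents split-brain in this toy model.
            raise PermissionError(f"{node_name} does not have write access")
        self.entries.append(op)


class ToyNameNode:
    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.namespace = set()  # paths that currently exist
        self.applied = 0        # number of edit log entries already applied

    def _apply(self, op):
        kind, path = op
        if kind == "create":
            self.namespace.add(path)
        elif kind == "delete":
            self.namespace.discard(path)

    def write(self, op):
        # Active role: record the change in the shared log, then apply it locally.
        self.log.append(self.name, op)
        self._apply(op)
        self.applied = len(self.log.entries)

    def catch_up(self):
        # Standby role: replay every entry that has not been applied yet.
        for op in self.log.entries[self.applied:]:
            self._apply(op)
        self.applied = len(self.log.entries)

    def become_active(self):
        # Synchronize fully first, and only then take over write access.
        self.catch_up()
        self.log.writer = self.name


log = SharedEditLog()
nn1 = ToyNameNode("nn1", log)
nn2 = ToyNameNode("nn2", log)

nn1.become_active()
nn1.write(("create", "/data/a.txt"))
nn1.write(("create", "/data/b.txt"))
nn1.write(("delete", "/data/a.txt"))
nn2.catch_up()                        # the standby applies the edits as it goes
nn1.write(("create", "/data/c.txt"))  # one more edit the standby has not seen yet

nn2.become_active()                   # failover: catch up, then take over
print(sorted(nn2.namespace))
```

After the failover in this sketch, a late write from `nn1` raises `PermissionError`, which mirrors the rule that only the active NameNode may write to the shared storage.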
  
  Follow the link for more detail: HDFS NameNode High Availability
- September 20, 2018 at 5:36 pm #6285
  
  DataFlair Team
  Spectator
  
  In Hadoop 2.0, we have two Namenodes – Active Namenode and Passive Namenode.
  
  The NameNode maintains and manages the slave nodes (DataNodes) and assigns tasks to them. The NameNode stores the metadata of HDFS. It also executes file system namespace operations such as opening, closing, and renaming files and directories. All replication factor details are maintained in the NameNode.
  This metadata is kept in memory on the master for faster retrieval.
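
As a rough illustration of metadata held in memory, the sketch below keeps a path-to-blocks map in a dictionary and performs a rename purely on that map. The class `ToyNamespace` and its fields are assumptions for this example and do not correspond to Hadoop's internal data structures.

```python
class ToyNamespace:
    """Hypothetical in-memory metadata: path -> block list and replication factor."""

    def __init__(self):
        self.files = {}

    def create(self, path, blocks, replication=3):
        self.files[path] = {"blocks": list(blocks), "replication": replication}

    def rename(self, src, dst):
        # Works for a single file or for every file under a directory.
        src = src.rstrip("/")
        moved = {}
        for path in list(self.files):
            if path == src or path.startswith(src + "/"):
                moved[dst + path[len(src):]] = self.files.pop(path)
        if not moved:
            raise FileNotFoundError(src)
        self.files.update(moved)


ns = ToyNamespace()
ns.create("/logs/a.txt", ["blk_1", "blk_2"])
ns.create("/logs/b.txt", ["blk_3"], replication=2)
ns.rename("/logs", "/archive")
print(sorted(ns.files))
```

The rename here touches only the dictionary keys; the block lists are unchanged, which is why such namespace operations can be served quickly from memory.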
  
  The active NameNode is the primary NameNode that serves the cluster. The passive NameNode is a standby that keeps the same metadata as the active NameNode. When the active NameNode goes down, the passive NameNode takes its place in the cluster. Hence, the cluster is not left without a NameNode, and the NameNode is no longer a single point of failure.
  
  The NameNode is crucial in HDFS because it maintains and manages the slave nodes and stores all the metadata. Hence, to achieve high availability in Hadoop from the NameNode perspective as well, we have a passive NameNode.
  
  Follow the link for more detail: HDFS NameNode High Availability