What is active and passive NameNode in Apache Hadoop?

    • #6281
      DataFlair Team
      Spectator

      Why is a passive NameNode needed in Hadoop?

    • #6283
      DataFlair Team
      Spectator

      The NameNode maintains the HDFS metadata: the file system namespace and, for every file, which blocks it consists of and which DataNodes hold each block. Because DataNodes continuously send heartbeats to the NameNode, it can detect when a DataNode fails, and it is then responsible for restoring the replication factor of the affected blocks: it has an uncorrupted replica copied to another DataNode.
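
      The heartbeat and re-replication behaviour described above can be sketched roughly as follows. This is a toy in-memory model, not HDFS code: the class `BlockManagerSketch`, its method names, the 30-second timeout, and the replication factor of 3 are all assumptions made for illustration.

```python
HEARTBEAT_TIMEOUT = 30.0   # seconds; illustrative value, not an HDFS default
REPLICATION_FACTOR = 3     # assumed target number of replicas per block


class BlockManagerSketch:
    """Toy in-memory model of NameNode block tracking; not real HDFS code."""

    def __init__(self, datanodes):
        # DataNode name -> time of its most recent heartbeat
        self.last_heartbeat = {dn: 0.0 for dn in datanodes}
        # block id -> set of DataNodes holding a replica
        self.block_locations = {}

    def add_block(self, block_id, holders):
        self.block_locations[block_id] = set(holders)

    def heartbeat(self, datanode, now):
        self.last_heartbeat[datanode] = now

    def expire_dead_nodes(self, now):
        """Drop nodes whose heartbeat is too old and return a copy plan."""
        dead = sorted(dn for dn, seen in self.last_heartbeat.items()
                      if now - seen > HEARTBEAT_TIMEOUT)
        plan = []
        for dn in dead:
            del self.last_heartbeat[dn]
            plan.extend(self._re_replicate(dn))
        return plan

    def _re_replicate(self, dead):
        """Plan (block, source, target) copies for under-replicated blocks."""
        live = set(self.last_heartbeat)
        plan = []
        for block_id in sorted(self.block_locations):
            holders = self.block_locations[block_id]
            holders.discard(dead)
            missing = REPLICATION_FACTOR - len(holders)
            if not holders or missing <= 0:
                continue  # no surviving replica to copy from, or nothing to do
            source = min(holders)
            for target in sorted(live - holders)[:missing]:
                plan.append((block_id, source, target))
                holders.add(target)
        return plan


nn = BlockManagerSketch(["dn1", "dn2", "dn3", "dn4"])
nn.add_block("blk_1", ["dn1", "dn2", "dn3"])
nn.add_block("blk_2", ["dn2", "dn3", "dn4"])
for dn in ("dn1", "dn3", "dn4"):       # dn2 stops sending heartbeats
    nn.heartbeat(dn, now=40.0)
plan = nn.expire_dead_nodes(now=45.0)
print(plan)
```

      In this sketch the NameNode only plans the copies; the data itself would move from one DataNode to another.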

      In Hadoop 1.0 there is only one NameNode, which makes it a single point of failure: if the NameNode fails, the cluster loses access to all of this metadata and HDFS is unavailable until the NameNode is restored.

      Hadoop 2.0 therefore runs two NameNodes: one active and one passive (standby) NameNode.

      The passive NameNode keeps enough up-to-date metadata to take over when the active NameNode fails.

      This is done by maintaining shared storage between the active and passive NameNodes. Only the active NameNode has write access to this shared storage, which prevents a split-brain scenario in which both NameNodes write to the same storage at once.
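
      One way to picture the single-writer rule is a shared log that hands out an increasing epoch number and rejects writes from any older epoch. The post does not say how write access is enforced, so the epoch scheme here is an assumption, and `SharedEditLog`, `become_writer`, and `StaleWriterError` are hypothetical names, not Hadoop APIs.

```python
class StaleWriterError(Exception):
    """Raised when a NameNode that is no longer the writer tries to append."""


class SharedEditLog:
    """Toy shared storage that accepts appends from one writer at a time."""

    def __init__(self):
        self.epoch = 0      # epoch of the current writer
        self.entries = []

    def become_writer(self):
        # Claiming the log bumps the epoch, which fences any previous writer.
        self.epoch += 1
        return self.epoch

    def append(self, writer_epoch, entry):
        if writer_epoch != self.epoch:
            raise StaleWriterError(
                f"epoch {writer_epoch} is stale; current epoch is {self.epoch}")
        self.entries.append(entry)


log = SharedEditLog()
nn1_epoch = log.become_writer()          # nn1 is the active NameNode
log.append(nn1_epoch, "mkdir /a")

nn2_epoch = log.become_writer()          # failover: nn2 takes over
log.append(nn2_epoch, "mkdir /b")

try:
    log.append(nn1_epoch, "mkdir /c")    # the old active wakes up and tries to write
    rejected = False
except StaleWriterError:
    rejected = True
```

      The late write from the old active NameNode is refused, so the log never contains interleaved edits from two writers.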

      When the active NameNode changes the file system namespace, it writes the change to the edit log on the shared storage. The passive NameNode constantly reads that edit log and applies the changes to its own copy of the namespace. When a failover happens, the passive NameNode first makes sure it has applied every change in the edit log, and only then declares itself the active NameNode.
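
      The edit-log replay and the catch-up step before failover can be sketched like this. It is a simplified model under stated assumptions: the shared storage is a plain Python list, the namespace is a set of paths, and `NameNodeSketch`, `apply_edit`, and the `create`/`delete` operations are hypothetical names chosen for the example.

```python
def apply_edit(namespace, edit):
    """Apply one (operation, path) edit to a namespace held as a set of paths."""
    op, path = edit
    if op == "create":
        namespace.add(path)
    elif op == "delete":
        namespace.discard(path)
    else:
        raise ValueError(f"unknown operation: {op}")


class NameNodeSketch:
    """Toy NameNode that replays a shared edit log; not real HDFS code."""

    def __init__(self, shared_log):
        self.shared_log = shared_log   # list standing in for shared storage
        self.namespace = set()
        self.applied = 0               # number of edits applied so far
        self.active = False

    def write(self, op, path):
        if not self.active:
            raise RuntimeError("only the active NameNode may write edits")
        edit = (op, path)
        self.shared_log.append(edit)
        apply_edit(self.namespace, edit)
        self.applied += 1

    def tail(self, max_edits=None):
        """Apply edits from the shared log that have not been applied yet."""
        pending = self.shared_log[self.applied:]
        if max_edits is not None:
            pending = pending[:max_edits]
        for edit in pending:
            apply_edit(self.namespace, edit)
        self.applied += len(pending)

    def become_active(self):
        self.tail()                    # catch up fully before taking over
        self.active = True


shared_log = []
active = NameNodeSketch(shared_log)
active.become_active()
standby = NameNodeSketch(shared_log)

active.write("create", "/a")
active.write("create", "/b")
standby.tail(max_edits=1)              # the standby lags one edit behind
active.write("delete", "/a")

active.active = False                  # simulate a crash of the active NameNode
standby.become_active()
print(sorted(standby.namespace))
```

      Because `become_active` replays the remaining edits first, the new active NameNode starts from the same namespace the old one had.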

      Follow the link for more detail: HDFS NameNode High Availability

    • #6285
      DataFlair Team
      Spectator

      In Hadoop 2.0 there are two NameNodes: the active NameNode and the passive NameNode.

      The NameNode maintains and manages the slave nodes (DataNodes), instructing them to create, delete, and replicate blocks, and it stores the metadata of HDFS. It also executes file system namespace operations such as opening, closing, and renaming files and directories, and it keeps the replication factor details for all files.
      This metadata is held in memory on the master for faster retrieval.

      The active NameNode is the primary NameNode that serves the cluster. The passive NameNode is a standby that keeps its metadata in sync with the active one. When the active NameNode goes down, the passive NameNode takes over its role. Hence the cluster is not left without a NameNode, and the failure of a single NameNode does not bring HDFS down.

      The NameNode is crucial in HDFS because it manages the slave nodes and stores all the metadata. A passive NameNode is therefore needed to make HDFS highly available from the NameNode's perspective.

      Follow the link for more detail: HDFS NameNode High Availability
