What is active and passive NameNode in Hadoop?

This topic has 3 replies, 1 voice, and was last updated 7 years, 8 months ago by DataFlair Team.

- September 20, 2018 at 4:18 pm #5824
  
  DataFlair Team
  Spectator
  
  Why does Hadoop need a passive NameNode?
- September 20, 2018 at 4:19 pm #5826
  
  DataFlair Team
  Spectator
  
  The NameNode maintains the file system tree and the metadata for all the files and directories in it. It persists this information on the local file system in two forms: the namespace image and the edit log. If the NameNode fails, the entire file system becomes unavailable. Metadata loss can be guarded against by also writing the metadata to an NFS mount, or by running a Secondary NameNode, which periodically merges the namespace image with the edit log of the active NameNode. Neither measure removes the single point of failure, though, because the cluster still has only one NameNode serving requests. After a failure, a replacement NameNode cannot serve requests until three things have happened:
  
  1) The namespace image has been loaded into memory.
  2) The edit log has been replayed.
  3) Enough block reports have been received from the DataNodes.
  
  On a large cluster these steps can take 30 minutes or more.
  
  So, to provide High Availability, Hadoop 2.x adds a second NameNode in a standby configuration. This requires the following changes:
  
  1) The NameNodes must share the edit log through highly available shared storage. When the standby NameNode starts, it reads to the end of the shared edit log and then keeps reading new entries as the active NameNode writes them. (ZooKeeper is generally used to coordinate automatic failover.)
  2) DataNodes must send block reports to both the active and the standby (passive) NameNode.
  3) Clients must be configured to handle NameNode failover.
  
  So a passive NameNode is required to make the NameNode highly available.
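
The shared edit log idea from the reply above can be illustrated with a toy model. This is only a sketch: `SharedEditLog` and `ToyNameNode` are hypothetical names, not Hadoop classes, and the "namespace" is reduced to a set of paths. It shows a standby reading to the end of the log once and then picking up only the new entries.

```python
from dataclasses import dataclass, field


@dataclass
class SharedEditLog:
    """Toy stand-in for the shared edit log (hypothetical, not a Hadoop class)."""
    entries: list = field(default_factory=list)

    def append(self, op, path):
        self.entries.append((op, path))


@dataclass
class ToyNameNode:
    """Keeps an in-memory namespace and remembers how far it has read the log."""
    namespace: set = field(default_factory=set)
    applied: int = 0  # number of log entries already replayed

    def apply(self, op, path):
        if op == "create":
            self.namespace.add(path)
        elif op == "delete":
            self.namespace.discard(path)

    def write(self, log, op, path):
        # Active role: record the change in the shared log, then apply it locally.
        log.append(op, path)
        self.apply(op, path)
        self.applied = len(log.entries)

    def catch_up(self, log):
        # Standby role: replay every entry not yet seen, up to the end of the log.
        for op, path in log.entries[self.applied:]:
            self.apply(op, path)
        self.applied = len(log.entries)


log = SharedEditLog()
active, standby = ToyNameNode(), ToyNameNode()

active.write(log, "create", "/data/a.txt")
active.write(log, "create", "/data/b.txt")
standby.catch_up(log)   # initial read to the end of the log
active.write(log, "delete", "/data/a.txt")
standby.catch_up(log)   # later tailing replays only the new entry

print(sorted(standby.namespace))  # ['/data/b.txt']
```

Because the standby already holds the replayed namespace, it skips most of the cold-start work listed in the reply; the real mechanism additionally has to deal with storage availability and fencing, which this sketch leaves out.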
- September 20, 2018 at 4:19 pm #5828
  
  DataFlair Team
  Spectator
  
  Prior to Hadoop 2.x, the NameNode was a single point of failure: if the primary NameNode went down, the complete file system became unavailable.
  
  To overcome this issue, Hadoop 2.x introduced a feature that addresses the single point of failure by providing the option of running two redundant NameNodes in the same cluster in an Active/Passive (standby) configuration.
  
  The passive NameNode can take over as the primary NameNode and serve client requests without significant interruption when the active NameNode fails or is taken down for planned maintenance.
  
  A few architectural changes are required to implement this. The active NameNode must share its edit log with the standby NameNode, either through highly available shared storage or through the Quorum Journal Manager (QJM).
  
  Also, DataNodes must send block reports to both NameNodes (active and standby).
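
For readers wondering what the QJM setup looks like in configuration, here is a small sketch that prints an `hdfs-site.xml` fragment. The property names are the ones I know from the Apache Hadoop HA documentation and should be checked against the Hadoop version in use; the nameservice name `mycluster`, the NameNode IDs, and all host names are made-up placeholders.

```python
from xml.sax.saxutils import escape

NAMESERVICE = "mycluster"  # hypothetical logical name for the NameNode pair
NAMENODES = {              # hypothetical NameNode IDs and RPC addresses
    "nn1": "namenode1.example.com:8020",
    "nn2": "namenode2.example.com:8020",
}
# Hypothetical JournalNode hosts that hold the shared edit log under QJM.
JOURNAL_URI = (
    "qjournal://jn1.example.com:8485;jn2.example.com:8485;"
    "jn3.example.com:8485/" + NAMESERVICE
)


def ha_properties():
    """Return the HA-related hdfs-site.xml properties as a dict."""
    props = {
        "dfs.nameservices": NAMESERVICE,
        f"dfs.ha.namenodes.{NAMESERVICE}": ",".join(NAMENODES),
        "dfs.namenode.shared.edits.dir": JOURNAL_URI,
        # Tells HDFS clients how to find whichever NameNode is currently active.
        f"dfs.client.failover.proxy.provider.{NAMESERVICE}":
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    }
    for nn_id, address in NAMENODES.items():
        props[f"dfs.namenode.rpc-address.{NAMESERVICE}.{nn_id}"] = address
    return props


def to_xml(props):
    """Render the properties in the <configuration>/<property> layout."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


print(to_xml(ha_properties()))
```

This covers only the properties that tie the two NameNodes, the shared edits directory, and the client failover provider together; a working cluster needs further settings (fencing, for example) that are not shown here.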
- September 20, 2018 at 4:19 pm #5830
  
  DataFlair Team
  Spectator
  
  In Hadoop 1.x, the NameNode was a single point of failure (SPOF): when the NameNode was down, the entire Hadoop cluster was down. To overcome this problem, Hadoop 2.x introduced the NameNode High Availability feature, in which there are two NameNodes, namely
  1) Active
  2) Passive NameNodes.
  The active NameNode is the one that serves the running cluster, and the passive NameNode is a standby NameNode.
  DataNodes send block reports to both NameNodes, so both have the same block information. When the active NameNode goes down, the passive NameNode replaces it and manages the cluster, so the cluster stays available with little interruption.
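
The block-report point can be sketched in a few lines as well. Everything here is a toy under stated assumptions: `BlockTrackingNameNode`, `send_block_report`, and `client_locate` are hypothetical names, and real failover also involves fencing the old active and promoting the standby, which is skipped.

```python
class BlockTrackingNameNode:
    """Toy NameNode that keeps block locations in memory only."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.block_map = {}  # block id -> set of DataNode names

    def receive_block_report(self, datanode, blocks):
        for block in blocks:
            self.block_map.setdefault(block, set()).add(datanode)

    def locate(self, block):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return sorted(self.block_map.get(block, ()))


def send_block_report(datanode, blocks, namenodes):
    # Each DataNode reports to every reachable NameNode, active and standby alike.
    for nn in namenodes:
        if nn.alive:
            nn.receive_block_report(datanode, blocks)


def client_locate(block, namenodes):
    # Client-side failover: try each configured NameNode until one answers.
    for nn in namenodes:
        try:
            return nn.name, nn.locate(block)
        except ConnectionError:
            continue
    raise RuntimeError("no NameNode available")


active = BlockTrackingNameNode("nn1")
standby = BlockTrackingNameNode("nn2")
pair = [active, standby]

send_block_report("dn1", ["blk_1", "blk_2"], pair)
send_block_report("dn2", ["blk_2"], pair)

before = client_locate("blk_2", pair)
active.alive = False  # simulate the active NameNode going down
after = client_locate("blk_2", pair)

print(before)  # ('nn1', ['dn1', 'dn2'])
print(after)   # ('nn2', ['dn1', 'dn2'])
```

Since the block map lives only in memory, a standby that had not been receiving reports would have to wait for fresh ones after taking over; reporting to both NameNodes is what lets the second lookup succeed immediately.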