Explain Single point of Failure in Hadoop?


Viewing 2 reply threads
  • Author
    Posts
    • #6005
      DataFlair Team
      Spectator

      What is the single point of failure in Apache Hadoop 1, and how is it resolved?
      What is Hadoop's single point of failure?
      How is the single point of failure resolved in Hadoop 2?

    • #6006
      DataFlair Team
      Spectator

      The single point of failure problem means that if the NameNode fails, the entire Hadoop cluster becomes unavailable. In practice this is a rare scenario, because the NameNode is typically run on high-end, reliable hardware.

      In Hadoop 1.x, if the NameNode failed, an administrator had to manually recover the metadata from the Secondary NameNode's checkpoint and bring the cluster back up.

      With the introduction of High Availability in Hadoop 2.0, the SPOF problem was solved by running two NameNodes, one in active mode and one in standby mode. Both NameNodes maintain the same metadata, and if the active NameNode fails, the standby NameNode automatically takes over. No manual intervention is required, as the architecture is designed for this, which ensures availability with no downtime.
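      As an illustration, the two NameNodes of an HA pair are declared in hdfs-site.xml under a logical nameservice. This is a minimal sketch, not a complete HA configuration; the nameservice name `mycluster`, the NameNode IDs `nn1`/`nn2`, and the host names are hypothetical:

      ```xml
      <configuration>
        <!-- Logical name for this HA nameservice (hypothetical) -->
        <property>
          <name>dfs.nameservices</name>
          <value>mycluster</value>
        </property>
        <!-- The two NameNodes that make up the HA pair -->
        <property>
          <name>dfs.ha.namenodes.mycluster</name>
          <value>nn1,nn2</value>
        </property>
        <!-- RPC address of each NameNode (example hosts) -->
        <property>
          <name>dfs.namenode.rpc-address.mycluster.nn1</name>
          <value>namenode1.example.com:8020</value>
        </property>
        <property>
          <name>dfs.namenode.rpc-address.mycluster.nn2</name>
          <value>namenode2.example.com:8020</value>
        </property>
      </configuration>
      ```

      A full setup also needs shared edit-log storage (e.g. a JournalNode quorum) and a failover proxy provider, which are omitted here for brevity.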

    • #6008
      DataFlair Team
      Spectator

      The single point of failure in a Hadoop cluster is the NameNode. While the loss of any other machine (intermittently or permanently) does not result in data loss, NameNode loss results in cluster unavailability. The permanent loss of NameNode data would render the cluster’s HDFS inoperable.

      There is an optional Secondary NameNode that can be hosted on a separate machine. It is only a helper for the NameNode:
      it fetches the edit logs from the NameNode at regular intervals and applies them to its copy of the fsimage.
      Once it has a new fsimage, it copies it back to the NameNode, which uses that fsimage at the next restart, reducing startup time.
      The Secondary NameNode's whole purpose is to create checkpoints of the HDFS metadata, which is why it is also known as the Checkpoint Node.
      However, it cannot replace the NameNode if the NameNode fails.
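      For reference, how often these checkpoints happen is tunable in hdfs-site.xml. The two properties below are standard Hadoop settings; the values shown are the usual defaults, included here only as a sketch:

      ```xml
      <configuration>
        <!-- Checkpoint at least once per hour (value in seconds) -->
        <property>
          <name>dfs.namenode.checkpoint.period</name>
          <value>3600</value>
        </property>
        <!-- ...or sooner, once this many uncheckpointed transactions accumulate -->
        <property>
          <name>dfs.namenode.checkpoint.txns</name>
          <value>1000000</value>
        </property>
      </configuration>
      ```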

      High Availability for the NameNode was introduced with the Hadoop 2.x release. In a typical HA cluster, two separate machines are configured as NameNodes. At any point in time, exactly one of the NameNodes is in an Active state, and the other is in a Standby state. The Active NameNode is responsible for all client operations in the cluster, while the Standby simply acts as a slave, maintaining enough state to provide a fast failover if necessary.
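      The automatic takeover described above is driven by ZooKeeper-based failover controllers. A sketch of the properties that enable it, assuming a hypothetical nameservice `mycluster` and example ZooKeeper hosts:

      ```xml
      <configuration>
        <!-- Turn on automatic failover via the ZKFailoverController -->
        <property>
          <name>dfs.ha.automatic-failover.enabled</name>
          <value>true</value>
        </property>
        <!-- ZooKeeper ensemble used for leader election (example hosts) -->
        <property>
          <name>ha.zookeeper.quorum</name>
          <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
        </property>
      </configuration>
      ```

      Without automatic failover enabled, an administrator can still switch roles manually with the `hdfs haadmin` command.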
