Understand HDFS Feature – Fault Tolerance

1. Definition

Fault tolerance in HDFS refers to the system's ability to keep working under unfavorable conditions and to handle such situations. HDFS is highly fault tolerant. It handles faults through replica creation: copies of the user's data are created on different machines in the HDFS cluster, so whenever any machine in the cluster goes down, the data can be accessed from another machine that holds a copy.

HDFS also maintains the replication factor: if a machine suddenly fails, it creates new replicas of that machine's data on other available machines in the cluster. To learn more about the world's most reliable storage layer, follow this HDFS introductory guide.
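As a quick illustration (a minimal sketch, assuming a reachable cluster, the standard Hadoop Java client, and a hypothetical file path), a file's replication factor can be read and changed through the FileSystem API; if a replica is lost or the target is raised, HDFS re-replicates the blocks in the background:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationFactor {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath.
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/user/demo/sample.txt"); // hypothetical path

        // Replication factor currently recorded for this file.
        short current = fs.getFileStatus(file).getReplication();
        System.out.println("replication factor: " + current);

        // Request a higher target; the NameNode schedules the extra
        // replicas asynchronously on other available machines.
        fs.setReplication(file, (short) (current + 1));
    }
}
```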

Looking for HDFS hands-on practice? Follow this tutorial: Top 10 Useful Hdfs Commands Part-I

2. How is HDFS Fault Tolerance Achieved?

HDFS achieves fault tolerance through replication. Whenever a user stores a file in HDFS, the file is first divided into blocks, and these blocks are distributed across different machines in the HDFS cluster. Replicas of each block are then created on other machines in the cluster; by default, HDFS keeps 3 copies of each block (a replication factor of 3). So if any machine in the cluster goes down or fails, the user can still access the data from other machines that hold replicas of the file's blocks. This distributed storage also makes reads fast, since a block can be served by any machine holding a replica.
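The sketch below (assuming a running HDFS cluster and the standard Hadoop Java client; the path and contents are illustrative) writes a file with an explicit replication factor of 3 and then asks the NameNode which machines hold each block:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockPlacementDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/user/demo/data.txt"); // hypothetical path

        // Create the file with replication factor 3 (the HDFS default).
        try (FSDataOutputStream out = fs.create(file, (short) 3)) {
            out.writeUTF("some user data");
        }

        // Each BlockLocation lists every DataNode holding a replica
        // of that block; any one of them can serve a read.
        FileStatus status = fs.getFileStatus(file);
        for (BlockLocation loc : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println("block @ " + loc.getOffset() + " on: "
                    + String.join(", ", loc.getHosts()));
        }
    }
}
```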

3. Example of HDFS Fault Tolerance

Suppose a user stores a file named FILE. This FILE is divided into blocks, say B1, B2, and B3, which are sent to the master. The master distributes these blocks to slaves, say S1, S2, and S3, and the slaves then replicate the blocks onto other slaves in the cluster, say S4, S5, and S6, so that multiple copies of each block exist. Say S1 contains B1 and B2, S2 contains B2 and B3, S3 contains B3 and B1, S4 contains B2 and B3, S5 contains B3 and B1, and S6 contains B1 and B2. Now suppose slave S4 crashes, so the blocks it held, B2 and B3, become unavailable there. We don't have to worry, because B2 and B3 can still be read from another slave, such as S2. Hence even in unfavorable conditions our data is not lost, which is why HDFS is highly fault tolerant.
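To make the example concrete, here is a small self-contained sketch (plain Java, not HDFS code; the block and slave names simply mirror the example above) that simulates the replica map and shows that every block is still readable after S4 crashes:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FaultToleranceExample {
    public static void main(String[] args) {
        // Block -> slaves holding a replica, mirroring the example above.
        Map<String, List<String>> replicas = Map.of(
                "B1", List.of("S1", "S3", "S5", "S6"),
                "B2", List.of("S1", "S2", "S4", "S6"),
                "B3", List.of("S2", "S3", "S4", "S5"));

        Set<String> crashed = Set.of("S4"); // slave S4 goes down

        for (String block : List.of("B1", "B2", "B3")) {
            // Serve the read from any replica on a live slave.
            String source = replicas.get(block).stream()
                    .filter(slave -> !crashed.contains(slave))
                    .findFirst()
                    .orElse("LOST");
            System.out.println(block + " served from " + source);
        }
    }
}
```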

4. What were the issues in legacy systems?

In legacy systems such as an RDBMS, all read and write operations performed by the user happened on a single machine. If unfavorable conditions struck that machine, such as a RAM crash, hard-disk failure, or power outage, users had to wait until the issue was manually corrected; while the machine was down, they could not access their data at all. Legacy systems also typically stored data only in the range of gigabytes, so to increase storage capacity one had to buy a new server machine. Storing a huge amount of data therefore meant buying many server machines, which made the setup very expensive.

9 Responses

  1. jass says:

    Why is HDFS the world’s most reliable storage system? What are the alternatives to HDFS? Can we use NFS instead?

  2. james says:

    We were using NFS earlier and recently switched to HDFS. HDFS is much more reliable, and I like the data-locality property, which improved job performance enormously.

  3. marcus says:

    Is there any other way of handling faults apart from replication?

  4. Weto says:

    Good material on the fault tolerance feature of HDFS. Good to see that Hadoop is so fault tolerant. Please share something on the high availability feature as well.

  5. Zetelvg says:

    There are many features of HDFS, like reliability, scalability, high availability, and fault tolerance. Nice explanation of the HDFS fault tolerance feature.

  6. Zethcqe says:

    The fault tolerance feature of HDFS in Hadoop is explained very well. It covers every concept needed to show how HDFS is fault tolerant.

  7. Zetprvn says:

    Fault tolerance is one of the key features of HDFS, and you have explained it very well.
    Thanks

  8. Setxije says:

    Fault tolerance is a key feature of Hadoop HDFS that has made Hadoop so popular, and you have explained it very nicely.
    Thanks

  9. Petrewa says:

    I read other blogs on this topic but found the DataFlair blog the best. Keep sharing similar material!
