Why HDFS stores data using commodity hardware despite higher chance of failures?

This topic has 1 reply, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

Viewing 1 reply thread

Author

Posts
- September 20, 2018 at 12:17 pm #4799
  
  DataFlair Team
  Spectator
  
  <div class=”post”>
  
  How Fault Tolerance is achieved in HDFS?
  What is default replication factor?
  
  </div>
- September 20, 2018 at 12:18 pm #4800
  DataFlair Team
  Spectator
  There are some reasons that HDFS stores data using commodity hardware despite the higher chance of failures:
  - HDFS is highly fault-tolerant.HDFS provides fault tolerance by replicating the data blocks and distributing it among different DataNodes across the cluster. By, default, replication factor is set to 3 which is configurable. In Hadoop HDFS, Replication of data solves the problem of data loss in unfavorable conditions like crashing of the node, hardware failure and so on. So, when any machine in the cluster goes down, then the client can easily access their data from another machine which contains the same copy of data blocks.
  - It provides distributed processing so that every datanode have sufficient process to do.
  - It has replicas of the block on different datanode so it is economical to store data on commodity hardware.
  - It provide HIGH AVAILABILITY features which mean that availability of data in all condition even in the case of machine failure.
  For more detail please follow: HDFS Tutorial
Author

Posts

Viewing 1 reply thread

You must be logged in to reply to this topic.