Why does HDFS perform replication, even though it consumes a lot of space?

    • #5024
      DataFlair Team
      Spectator

      Why does HDFS perform replication, even though it results in data redundancy?

    • #5025
      DataFlair Team
      Spectator

      HDFS provides a reliable, scalable, and fault-tolerant data storage system.

      When a client writes data to HDFS, the data is split into blocks, and each block is stored as 3 replicas (the default replication factor is 3; this can be configured) on different data nodes. In a multi-rack cluster, these replicas are also spread across different racks.
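As a rough illustration of the placement described above, the following Python sketch picks data nodes for the replicas of one block. It is a simplification and not Hadoop's actual code: the function name `place_replicas`, the `topology` dictionary, and the node and rack names are all hypothetical, and it assumes the writer runs on a data node. It loosely follows the commonly documented default behaviour (first replica on the writer's node, second on a node in a different rack, third on another node in that second rack).

```python
import random


def place_replicas(topology, writer_node, replication=3, rng=random):
    """Choose data nodes for one block (simplified, hypothetical model).

    topology maps a rack name to the list of data nodes in that rack.
    """
    node_rack = {node: rack for rack, nodes in topology.items() for node in nodes}
    chosen = [writer_node]  # replica 1: the writer's own node
    local_rack = node_rack[writer_node]
    remote_racks = [rack for rack in topology if rack != local_rack]

    if replication >= 2 and remote_racks:
        # replica 2: a node in a different rack, to survive a rack failure
        remote_rack = rng.choice(remote_racks)
        second = rng.choice(topology[remote_rack])
        chosen.append(second)
        if replication >= 3:
            # replica 3: another node in the same rack as replica 2
            others = [n for n in topology[remote_rack] if n != second]
            if others:
                chosen.append(rng.choice(others))

    # Fallback: fill any remaining slots with unused nodes picked at random.
    remaining = [n for n in node_rack if n not in chosen]
    rng.shuffle(remaining)
    while len(chosen) < replication and remaining:
        chosen.append(remaining.pop())
    return chosen


if __name__ == "__main__":
    topology = {
        "rack1": ["dn1", "dn2", "dn3"],
        "rack2": ["dn4", "dn5", "dn6"],
    }
    print(place_replicas(topology, writer_node="dn1"))
```

In a real cluster the replication factor is not passed around like this; the cluster-wide default comes from the `dfs.replication` property in hdfs-site.xml.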

      So even if one of the data nodes containing a particular block goes down, a copy of that block is still available on another data node. In this way, the failure of a single data node does not cause data loss.
      Also, whenever a data node fails, its blocks are re-replicated to other data nodes so that the replication factor is restored.
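The recovery step can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how the NameNode is implemented: `re_replicate`, the `block_locations` dictionary, and the block and node names are hypothetical, and new targets are picked in list order rather than by any real placement policy.

```python
def re_replicate(block_locations, live_nodes, failed_node, replication=3):
    """Drop a failed node from every block and top replicas back up.

    block_locations maps a block id to the set of nodes holding a replica.
    The sets are modified in place and the mapping is returned.
    """
    survivors = [n for n in live_nodes if n != failed_node]
    for block, holders in block_locations.items():
        holders.discard(failed_node)  # the replica on the failed node is gone
        candidates = [n for n in survivors if n not in holders]
        # Copy the block to new nodes until the target factor is met
        # or no eligible node is left.
        while len(holders) < replication and candidates:
            holders.add(candidates.pop(0))
    return block_locations


if __name__ == "__main__":
    blocks = {
        "blk_1": {"dn1", "dn2", "dn3"},
        "blk_2": {"dn2", "dn3", "dn4"},
    }
    nodes = ["dn1", "dn2", "dn3", "dn4", "dn5"]
    print(re_replicate(blocks, nodes, failed_node="dn2"))
```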

      The data nodes that store these blocks run on inexpensive commodity hardware, so Hadoop provides a cost-effective and fault-tolerant data processing system.
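To put a number on the space cost raised in the question, here is a small worked sketch. The helper names `raw_storage_tb` and `block_count` are hypothetical. The sketch assumes the default replication factor of 3 and a 128 MB block size (the default in Hadoop 2 and later; both values are configurable), and it ignores any other overhead.

```python
import math


def raw_storage_tb(logical_tb, replication=3):
    """Raw disk space consumed when every block is stored `replication` times."""
    return logical_tb * replication


def block_count(file_size_mb, block_size_mb=128):
    """Number of blocks a file is split into; the last block may be partial."""
    return math.ceil(file_size_mb / block_size_mb)


if __name__ == "__main__":
    one_tb_in_mb = 1024 * 1024
    blocks = block_count(one_tb_in_mb)
    print("blocks in a 1 TB file:", blocks)
    print("block replicas stored:", blocks * 3)
    print("raw TB needed for 100 TB of data:", raw_storage_tb(100))
```

So replication roughly triples the disk requirement at the default setting, which is the price paid for the fault tolerance described in this thread.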

    • #5026
      DataFlair Team
      Spectator

      Hadoop is designed to analyze (or perform a set of actions on) a small number of very large files (terabytes or petabytes in size).

      To store such large files, commodity hardware is preferred because it is cost-effective. Nowadays, data availability matters more than the disk space it consumes.

      If data nodes or racks fail, the data remains available on other data nodes or racks. HDFS places replicas so that data stays available when such failures occur.
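A back-of-the-envelope calculation shows why extra copies help so much. The sketch below rests on a strong assumption that does not hold exactly in practice: each node holding a replica fails independently, with the same probability, before the block can be re-replicated. The function name `block_loss_probability` and the 1% figure are purely illustrative.

```python
def block_loss_probability(p_node_failure, replication=3):
    """Chance that all replicas of one block are lost at the same time,
    assuming independent node failures with equal probability."""
    return p_node_failure ** replication


if __name__ == "__main__":
    p = 0.01  # illustrative per-node failure probability
    for r in (1, 2, 3):
        print(f"replication={r}: {block_loss_probability(p, r):.0e}")
```

Under this assumption each additional replica multiplies the chance of losing a block by the per-node failure probability. Correlated failures, such as a whole rack losing power, break the independence assumption, which is why replicas are spread across racks.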

      Hence, to keep data available at all times, HDFS replicates data.
