How will you balance the disk space usage on an HDFS cluster?


    • #4767
      DataFlair Team
      Spectator

      How to balance the disk space usage on an HDFS cluster?

    • #4769
      DataFlair Team
      Spectator

      To keep disk usage even across nodes, Hadoop applies its own space balance policy when it places data. That policy cannot prevent every imbalance: adding new nodes or deleting large amounts of data, for example, can leave some DataNodes much fuller than others. For those cases there is the HDFS balancer, which rebalances space usage among the cluster's DataNodes.

      Hadoop space balance policy
      There are three space-balance-related parameters in Hadoop:

      – Balanced space preference fraction

      – Balanced space threshold

      – Balance bandwidth control
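
      As a rough illustration of how the first two parameters could interact, here is a minimal Python sketch. It is not Hadoop's actual code: the function name choose_volume and its arguments are hypothetical, and it assumes the threshold is a number of bytes and the preference fraction is a probability between 0 and 1. The third parameter, bandwidth control, only limits how fast balancing traffic may move and is not modeled here.

```python
import random


def choose_volume(free_bytes, threshold, preference_fraction, rng=random):
    """Pick the index of a volume for a new block.

    Simplified, hypothetical sketch of a space-aware choice:
    free_bytes          -- free space per volume, in bytes
    threshold           -- max free-space gap still treated as "balanced"
    preference_fraction -- share of new blocks sent to the roomier volumes
    """
    lowest = min(free_bytes)
    if max(free_bytes) - lowest <= threshold:
        # All volumes are within the threshold of each other: treat them
        # as balanced and accept any of them.
        return rng.randrange(len(free_bytes))

    # Split volumes into those close to the minimum and those with more room.
    low = [i for i, free in enumerate(free_bytes) if free <= lowest + threshold]
    high = [i for i, free in enumerate(free_bytes) if free > lowest + threshold]

    # Send roughly `preference_fraction` of new blocks to the roomier group.
    group = high if rng.random() < preference_fraction else low
    return rng.choice(group)
```

      With a preference fraction of 1.0 every new block goes to the volumes with more free space, and with 0.0 none do, so values in between trade balancing speed against loading only a few disks.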

      HDFS Balancer
      Because block placement has to weigh several competing considerations, data might not be spread uniformly across the DataNodes. HDFS therefore provides a tool for administrators that analyzes block placement and rebalances data across the DataNodes.

      Note: The HDFS balancer has to be started manually; it does not run automatically in the background.
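
      To make the analysis step concrete, the sketch below shows one way to decide which DataNodes need rebalancing: compare each node's utilization with the cluster average and flag nodes that deviate by more than a threshold percentage. It is an illustration only, not the balancer's source code; the function name classify_nodes, the node names and the sample numbers are made up.

```python
def classify_nodes(nodes, threshold_pct=10.0):
    """Flag DataNodes whose utilization is far from the cluster average.

    nodes maps a node name to (used_bytes, capacity_bytes).
    Returns (average_utilization_pct, over_utilized, under_utilized).
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(capacity for _, capacity in nodes.values())
    average = 100.0 * total_used / total_capacity

    over, under = [], []
    for name, (used, capacity) in nodes.items():
        utilization = 100.0 * used / capacity
        if utilization > average + threshold_pct:
            over.append(name)    # candidate source of blocks to move
        elif utilization < average - threshold_pct:
            under.append(name)   # candidate target for moved blocks
    return average, sorted(over), sorted(under)


# Hypothetical cluster: two nearly full nodes and one freshly added node.
cluster = {"dn1": (80, 100), "dn2": (75, 100), "dn3": (5, 100)}
average, over, under = classify_nodes(cluster)
```

      On a real cluster the balancer is started with the hdfs balancer command, for example hdfs balancer -threshold 10, and the bandwidth it may use can be changed at runtime with hdfs dfsadmin -setBalancerBandwidth followed by a value in bytes per second.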

      To learn more about the HDFS Disk Balancer, follow the link: HDFS Disk Balancer – Learn how to Balance Data on DataNode
