What is Disk Balancer in Apache Hadoop?

    • #6208
      DataFlair Team
      Spectator

      Why is Disk Balancer needed in Hadoop?
      How do you enable Disk Balancer in Hadoop?

    • #6209
      DataFlair Team
      Spectator

      Disk Balancer is a command-line tool that distributes data evenly across all the disks of a DataNode.
      It first composes a set of instructions describing the data movement, called a plan; the plan can later be executed against an operational DataNode.

      To enable Disk Balancer, set dfs.disk.balancer.enabled to true in hdfs-site.xml. By default, it is disabled (the default can vary between Hadoop releases, so check hdfs-default.xml for your version).
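
As a rough illustration of that setting, here is a minimal Python sketch that shows what the property looks like in hdfs-site.xml and checks whether it is set to true. The sample XML string and the helper name disk_balancer_enabled are assumptions made for this example and are not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Sample hdfs-site.xml content (illustrative) with the property set to true.
HDFS_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.disk.balancer.enabled</name>
    <value>true</value>
  </property>
</configuration>
"""


def disk_balancer_enabled(xml_text):
    """Return True if dfs.disk.balancer.enabled is set to true in the XML text."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == "dfs.disk.balancer.enabled":
            return prop.findtext("value", "").strip().lower() == "true"
    # Property not present: this sketch reports it as not enabled.
    return False


print(disk_balancer_enabled(HDFS_SITE))
```

In a real cluster you would read the file from the Hadoop configuration directory instead of a string; the DataNodes need to pick up the changed configuration before the setting takes effect.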

      When it creates a plan, Disk Balancer generates two output files:
      <nodename>.before.json: contains the state of the cluster as read from the NameNode.
      <nodename>.plan.json: contains the plan for the specific node.

      The volume-choosing policy decides which disk a DataNode uses when a new block is written to HDFS. Two policies are available: round-robin (the default) and available-space (which chooses a volume based on free space).
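
To make the difference between the two policies concrete, here is a small Python sketch. It is a simplified model and not Hadoop's implementation: the class and function names are hypothetical, and the available-space version simply picks the volume with the most free space, whereas HDFS's own available-space policy is more involved.

```python
class RoundRobinChooser:
    """Cycle through the volumes in order, regardless of how full each one is."""

    def __init__(self):
        self._next = 0  # index of the volume to try first on the next call

    def choose(self, free_bytes, block_size):
        n = len(free_bytes)
        for i in range(n):
            idx = (self._next + i) % n
            if free_bytes[idx] >= block_size:
                self._next = (idx + 1) % n
                return idx
        raise RuntimeError("no volume has room for the block")


def choose_by_available_space(free_bytes, block_size):
    """Simplified assumption: pick the volume with the most free space."""
    idx = max(range(len(free_bytes)), key=lambda i: free_bytes[i])
    if free_bytes[idx] < block_size:
        raise RuntimeError("no volume has room for the block")
    return idx


free = [100, 500, 100]  # free bytes on three volumes (made-up numbers)
rr = RoundRobinChooser()
print([rr.choose(free, 10) for _ in range(4)])
print(choose_by_available_space(free, 10))
```

Round-robin spreads writes evenly by count, so disks of different sizes or ages can still drift apart in usage, which is one reason a separate tool for rebalancing existing data is useful.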

      Follow the link to learn more about: Disk balancer in Hadoop

    • #6211
      DataFlair Team
      Spectator

      Need for Disk Balancer in Hadoop:
      Data on a DataNode can end up spread unevenly across its disks for various reasons, such as a large number of writes and deletes or a disk replacement. This can lead to significant imbalance within a DataNode. The existing HDFS Balancer does not handle this situation, because it balances data across the DataNodes of the cluster rather than across the disks inside one DataNode. Hence, to balance data across all the disks of a DataNode, we need Disk Balancer.
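
One way to picture imbalance within a DataNode is to compare each disk's usage ratio with the node-wide ideal ratio. The Python sketch below does that; the function names and the numbers are made up for illustration, and the formula is an assumption modelled on the "data density" idea rather than a statement of what Disk Balancer computes internally.

```python
def volume_data_density(used, capacity):
    """Per-disk deviation from the node-wide ideal usage ratio.

    A positive value means the disk is under-used; a negative value means it is over-used.
    """
    ideal = sum(used) / sum(capacity)
    return [ideal - u / c for u, c in zip(used, capacity)]


def node_data_density(used, capacity):
    """Sum of absolute per-disk deviations; 0 means perfectly balanced."""
    return sum(abs(d) for d in volume_data_density(used, capacity))


# Two 1000 GB disks: one nearly full, one mostly empty (for example, after a disk replacement).
print(node_data_density([800, 200], [1000, 1000]))
# The same total data spread evenly across both disks.
print(node_data_density([500, 500], [1000, 1000]))
```

The first node holds the same amount of data as the second, so a cluster-wide balancer sees nothing to fix, yet its disks are clearly uneven.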

      Disk Balancer is a command-line tool that distributes data evenly across all the disks of a DataNode.

      To enable Disk Balancer in Hadoop:
      By default, Disk Balancer is not enabled.
      To enable it, set dfs.disk.balancer.enabled to true in the hdfs-site.xml file.

