On what basis does the NameNode distribute blocks across the DataNodes?

  • Author
    Posts
    • #5100
      DataFlair Team
      Spectator

      Explain the data block placement policies. On what factors are blocks distributed in HDFS?

    • #5102
      DataFlair Team
      Spectator

      The strategy by which Hadoop distributes data blocks across the cluster is based on a trade-off between data reliability, write bandwidth, and read bandwidth.

      • The basic placement policy tries to place the first replica on the node where the client is running (if the client is running outside the cluster, a random DataNode is chosen, avoiding nodes that are too busy or too full).
      • The second replica is placed on a different rack from the first one (also known as off-rack).
      • The third replica is placed on the same rack as the second one, but on a different DataNode.

      Further replicas are placed on random nodes in the cluster, while trying to avoid putting too many replicas on the same rack.
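To make the placement rules above concrete, here is a minimal Python sketch. It is only an illustration, not Hadoop's actual implementation: the function name `choose_replica_nodes` and the `topology` dictionary (rack name mapped to node names) are made up for this example, and it ignores node load, free space, and per-rack replica limits, which the real NameNode also takes into account.

```python
import random

def choose_replica_nodes(topology, writer=None, replication=3, rng=None):
    """Pick target nodes for one block (illustrative sketch only).

    topology: hypothetical dict mapping rack name -> list of node names.
    writer:   node the client runs on, or None if it is outside the cluster.
    """
    rng = rng or random.Random()
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    chosen = []

    def pick(preferred):
        # Choose an unused node from the preferred pool; if that pool is
        # exhausted, fall back to any unused node in the cluster.
        pool = [n for n in preferred if n not in chosen]
        if not pool:
            pool = [n for n in rack_of if n not in chosen]
        if pool:
            chosen.append(rng.choice(pool))

    # Never ask for more replicas than there are nodes.
    target = min(replication, len(rack_of))

    if target >= 1:
        # First replica: the writer's own node, else a random node.
        pick([writer] if writer in rack_of else list(rack_of))
    if target >= 2:
        # Second replica: a node on a different rack from the first.
        first_rack = rack_of[chosen[0]]
        pick([n for n in rack_of if rack_of[n] != first_rack])
    if target >= 3:
        # Third replica: same rack as the second, different node.
        second_rack = rack_of[chosen[1]]
        pick(topology[second_rack])
    while len(chosen) < target:
        # Further replicas: random remaining nodes.
        pick(rack_of)
    return chosen

if __name__ == "__main__":
    cluster = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
    print(choose_replica_nodes(cluster, writer="dn1"))
```

With a client running on `dn1`, the first entry is always `dn1` and the other two are the two nodes of `rack2` in some order, so losing either rack still leaves at least one replica.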

      Apart from the basic policy, Hadoop also has a Balancer daemon, which redistributes blocks by moving them from over-utilized DataNodes to under-utilized DataNodes while still respecting the block placement policy.
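As a rough illustration of what "over-utilized" and "under-utilized" mean here, the sketch below compares each node's utilization with the cluster-wide average plus or minus a threshold. The function name and the input dictionaries are hypothetical, and the 10 percent default is an assumption chosen for the example; the real Balancer also has to select which blocks to move and keep the placement policy intact, which is not shown.

```python
def classify_datanodes(used, capacity, threshold=10.0):
    """Split nodes into over- and under-utilized sets (illustrative sketch).

    used, capacity: hypothetical dicts mapping node name -> space in the same unit.
    threshold:      allowed deviation from cluster utilization, in percentage points.
    """
    # Cluster utilization: total used space as a percentage of total capacity.
    cluster_util = 100.0 * sum(used.values()) / sum(capacity.values())
    over, under = [], []
    for node in used:
        node_util = 100.0 * used[node] / capacity[node]
        if node_util > cluster_util + threshold:
            over.append(node)    # candidate source of block moves
        elif node_util < cluster_util - threshold:
            under.append(node)   # candidate destination of block moves
    return cluster_util, over, under

if __name__ == "__main__":
    used = {"dn1": 90, "dn2": 50, "dn3": 10}
    capacity = {"dn1": 100, "dn2": 100, "dn3": 100}
    print(classify_datanodes(used, capacity))
```

In this example the cluster is 50 percent full, so `dn1` (90 percent) counts as over-utilized and `dn3` (10 percent) as under-utilized, while `dn2` is left alone.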
