Explain Rack Awareness in Hadoop?

Viewing 2 reply threads
  • Author
    Posts
    • #6332
      DataFlair Team
      Spectator

      What is Rack Awareness in Hadoop?
      What is the need for Rack Awareness in Hadoop HDFS?

    • #6333
      DataFlair Team
      Spectator

      A rack is a set of computers (nodes) that are physically placed together. It describes how the machines are physically located.
      Hadoop runs on a cluster of computers spread across many racks. The major advantage of rack awareness is that it makes the system highly available and fault tolerant.
      By default, each block has three replicas. Generally, one replica is placed on the rack where the block is written and the other two are placed on a different rack, so no single rack holds every replica. This way, even if a complete rack goes down for any reason, at least one replica is still available.
      This optimization of replica placement is what distinguishes HDFS from most other distributed file systems.
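
As a rough illustration of the fault-tolerance argument above, the sketch below counts how many replicas of a block survive when one whole rack fails. The rack names and the two layouts are hypothetical and chosen only for the example. How many replicas remain depends on which rack fails, so the function reports the worst case.

```python
from collections import Counter


def replicas_left_after_rack_failure(replica_racks, failed_rack):
    """Count replicas that remain readable when one whole rack is lost."""
    return sum(1 for rack in replica_racks if rack != failed_rack)


def worst_case_survivors(replica_racks):
    """Fewest replicas left after the failure of any single rack."""
    per_rack = Counter(replica_racks)
    return len(replica_racks) - max(per_rack.values())


# Hypothetical layout: one replica on /rack1, two on /rack2.
spread = ["/rack1", "/rack2", "/rack2"]
# Hypothetical layout without rack awareness: all three on /rack1.
packed = ["/rack1", "/rack1", "/rack1"]

print(worst_case_survivors(spread))  # 1
print(worst_case_survivors(packed))  # 0
```

With the replicas spread over two racks, the block stays readable whichever rack is lost. With all replicas on one rack, a single rack failure makes the block unavailable.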

      Follow the link to learn more about Rack Awareness in Hadoop

    • #6335
      DataFlair Team
      Spectator

      A rack is a set of computers that serve as the storage machines on which Hadoop runs. Rack awareness is the knowledge of how the DataNodes are distributed across the racks of a Hadoop cluster.
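
One common way to give Hadoop this knowledge is a topology script, referenced by the `net.topology.script.file.name` property in `core-site.xml`. Hadoop calls the script with one or more host addresses as arguments and reads a rack path for each from standard output. The following is a minimal sketch of such a script. The addresses and rack names in `RACK_BY_HOST` are made up for illustration, and a real cluster would load them from its own inventory.

```python
#!/usr/bin/env python3
import sys

# Hypothetical host-to-rack table (illustrative values only).
RACK_BY_HOST = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
}
# Rack path returned for hosts that are not in the table.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Print one rack path per line for each address given as an argument.
    print("\n".join(resolve(sys.argv[1:])))
```

Hosts that are missing from the table fall back to `/default-rack`, so an incomplete table places those nodes on one shared rack instead of failing.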

      By default, Hadoop makes 3 replicas of a block. The first replica is stored on the local node, the second on a node in a different rack, and the third on a different node in the same rack as the second. When a block is re-replicated and only one replica exists, the new replica is placed on a different rack.
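
The placement rule described above can be sketched as follows. This is a simplified illustration, not the NameNode's actual code: the function name `place_replicas`, the `rack_of` dictionary, and the four-node cluster are assumptions made for the example. It also ignores details such as node load and free space, and it raises an error where a real cluster would fall back to another choice.

```python
import random


def place_replicas(writer, rack_of, rng=random):
    """Pick three target nodes: local node, remote rack, same remote rack.

    writer  -- node the client writes from (assumed to be a DataNode)
    rack_of -- dict mapping node name to rack path (hypothetical input)
    """
    # First replica: the writer's own node.
    first = writer
    local_rack = rack_of[first]

    # Second replica: any node on a rack other than the writer's.
    remote_nodes = sorted(n for n in rack_of if rack_of[n] != local_rack)
    if not remote_nodes:
        raise ValueError("need at least two racks")
    second = rng.choice(remote_nodes)

    # Third replica: a different node on the same rack as the second.
    same_remote = [n for n in remote_nodes
                   if rack_of[n] == rack_of[second] and n != second]
    if not same_remote:
        raise ValueError("remote rack needs at least two nodes")
    third = rng.choice(same_remote)

    return [first, second, third]


# Hypothetical four-node cluster on two racks.
cluster = {"n1": "/rack1", "n2": "/rack1", "n3": "/rack2", "n4": "/rack2"}
targets = place_replicas("n1", cluster)
print(targets)
```

Only one copy of the block has to cross between racks during the write, yet the replicas still end up on two racks.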

      The advantages of rack awareness are:
      1) Lower write cost and higher read speed
      2) Data protection against rack failure
      3) Better use of network bandwidth
      4) High availability of data
      5) Reliability of data

      Follow the link to learn more about Rack Awareness in Hadoop
