What is Rack awareness?

    • #4734
      DataFlair Team
      Spectator

      What is a rack? What is rack awareness? How is Hadoop rack-aware?

    • #4736
      DataFlair Team
      Spectator

      A rack is a collection of machines, typically around 40-50, all connected through the same network switch. If that switch goes down, every machine in the rack drops off the network, and we say the rack is down.

      An important fact about racks is that communication between machines in the same rack is fast: latency is lower and more bandwidth is available. If the machines are located on different racks, latency is higher and the traffic consumes the more limited bandwidth between racks.

      Rack awareness – Rack awareness means knowing the network topology, that is, which rack each machine sits in. With that knowledge, the system can take the physical location of machines into account when it decides where data is stored and where it is read from.
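
In Hadoop, this topology knowledge is typically supplied by an administrator through a script named in the `net.topology.script.file.name` property: Hadoop passes one or more IP addresses or hostnames as arguments and reads one rack path per argument from standard output. A minimal sketch of such a script is shown here; the address table and rack names are hypothetical and made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a rack topology script (the host table is hypothetical)."""
import sys

# Hypothetical mapping from DataNode address to rack path.
RACK_BY_HOST = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
}

DEFAULT_RACK = "/default-rack"  # fallback for hosts missing from the table


def resolve(host):
    """Return the rack path for one host, or the default rack if unknown."""
    return RACK_BY_HOST.get(host, DEFAULT_RACK)


if __name__ == "__main__":
    # Print one rack path per argument, in the same order as the arguments.
    for host in sys.argv[1:]:
        print(resolve(host))
```

A real cluster would more likely derive the rack from a subnet or an inventory file than from a hard-coded table.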

      How is Hadoop rack aware?
      In Hadoop, the NameNode keeps track of which rack each DataNode belongs to and where the replicas of every data block are stored. Rack awareness helps distribute the replicated blocks so that the system stays healthy when a machine, a few machines, or even a complete rack goes down: Hadoop can keep working without that rack. Data blocks are normally replicated so that at least one copy of each block is available on a different rack.
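
The placement idea above can be sketched in a few lines. This illustrates the commonly documented default HDFS policy for a replication factor of 3 (first replica on the writer's node, second on a node in a different rack, third on another node in that same remote rack). It is not Hadoop's actual implementation, and the function name, cluster layout, and node names are hypothetical.

```python
import random


def place_replicas(writer, racks, rng=random):
    """Pick three nodes for a block so that the copies span two racks.

    writer: name of the node writing the block
    racks:  dict mapping rack name -> list of node names
    rng:    source of randomness (the random module or a random.Random)
    """
    # Find the rack that contains the writer.
    local_rack = next(r for r, nodes in racks.items() if writer in nodes)

    # Replica 1: the writer's own node.
    first = writer

    # Replica 2: a node on a different rack. The rack needs at least
    # two nodes so that replica 3 can also be placed there.
    remote_racks = [r for r, nodes in racks.items()
                    if r != local_rack and len(nodes) >= 2]
    if not remote_racks:
        raise ValueError("need a second rack with at least two nodes")
    remote_rack = rng.choice(remote_racks)
    second = rng.choice(racks[remote_rack])

    # Replica 3: a different node on the same remote rack as replica 2.
    third = rng.choice([n for n in racks[remote_rack] if n != second])

    return [first, second, third]


if __name__ == "__main__":
    # Hypothetical two-rack cluster.
    cluster = {
        "/rack1": ["node1", "node2"],
        "/rack2": ["node3", "node4"],
    }
    print(place_replicas("node1", cluster))
```

With this layout, losing all of `/rack1` still leaves two copies on `/rack2`, and losing all of `/rack2` still leaves the copy on the writer's node.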

      When a node goes down, Hadoop looks for the nearest node that still holds a copy of the data block, which keeps network latency low.
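
"Nearest" can be made concrete by treating the cluster as a tree and counting hops between nodes. The sketch below assumes a simple two-level layout in which each location is written as `/rack/node`; under that assumption the distance is 0 for the same node, 2 for two nodes on the same rack, and 4 for nodes on different racks. The function names and node paths are hypothetical.

```python
def distance(a, b):
    """Hops between two locations written as '/rack/node'."""
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")
    # Count the leading path components the two locations share.
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    # Each side climbs from its own node up to the deepest shared ancestor.
    return (len(pa) - common) + (len(pb) - common)


def nearest_replica(reader, replicas):
    """Return the replica location with the fewest hops from the reader."""
    return min(replicas, key=lambda loc: distance(reader, loc))


if __name__ == "__main__":
    # Hypothetical reader and replica locations.
    copies = ["/rack2/node3", "/rack1/node2", "/rack2/node4"]
    print(nearest_replica("/rack1/node1", copies))
```

A reader on `/rack1/node1` would be served from `/rack1/node2` here, because a same-rack copy avoids crossing the link between racks.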


      For more details, please refer to: Rack awareness in Hadoop
