What is a Block Scanner in HDFS?

  • Author
    Posts
    • #6148
      DataFlair Team
      Spectator

      What is a Block Scanner in Hadoop?
      Why is a Block Scanner needed?

    • #6149
      DataFlair Team
      Spectator

      A Block Scanner is a background service that runs on each DataNode and identifies corrupt blocks stored on that node.
      During a write operation, the client computes a checksum for the data it sends, and the DataNode verifies that checksum before storing the block. This detects corruption introduced during data transmission.
      When the same data is later read from HDFS, the client recomputes the checksum over the data it receives and compares it with the checksum returned by the DataNode. A mismatch indicates corruption that occurred while the data was stored on the DataNode.
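To make the read-side check concrete, here is a minimal sketch in Python. It is not HDFS code: the function names are hypothetical, the 512-byte chunk size is assumed to mirror the HDFS default for `dfs.bytes-per-checksum`, and `zlib.crc32` (plain CRC-32) stands in for the CRC32C variant that HDFS typically uses.

```python
import zlib
from typing import List, Optional

# Assumption: one checksum per 512-byte chunk, similar to the HDFS default.
BYTES_PER_CHECKSUM = 512


def chunk_checksums(data: bytes, chunk_size: int = BYTES_PER_CHECKSUM) -> List[int]:
    """Return one CRC-32 value per fixed-size chunk of data."""
    return [
        zlib.crc32(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def first_corrupt_chunk(data: bytes, stored: List[int],
                        chunk_size: int = BYTES_PER_CHECKSUM) -> Optional[int]:
    """Recompute checksums and return the index of the first mismatching chunk.

    Returns None when every chunk matches its stored checksum.
    """
    computed = chunk_checksums(data, chunk_size)
    for index, (actual, expected) in enumerate(zip(computed, stored)):
        if actual != expected:
            return index
    # A differing chunk count also counts as corruption (truncated or extra data).
    if len(computed) != len(stored):
        return min(len(computed), len(stored))
    return None
```

A reader would call `first_corrupt_chunk` with the bytes it received and the checksums stored alongside the block. Checksumming small chunks rather than the whole block narrows down where the damage is.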
      Read-time verification alone would miss corruption in blocks that are rarely read. Therefore, every DataNode periodically runs a block scanner that verifies all the blocks stored on that node. A corrupt replica is reported to the NameNode, which can have it replaced from a healthy replica before any client tries to read it. With the block scanner service, HDFS can proactively identify and fix corruption.
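The periodic scan described above can be sketched as a toy model in Python. Everything here is an assumption for illustration: `ToyBlockScanner`, its methods, and the block IDs are hypothetical, one CRC-32 covers a whole block, and reporting to the NameNode is reduced to returning a list. In a real cluster the scan interval is governed by the `dfs.datanode.scan.period.hours` property in hdfs-site.xml.

```python
import zlib
from typing import Dict, List, Tuple


class ToyBlockScanner:
    """Toy stand-in for a DataNode's block store plus its scanner."""

    def __init__(self) -> None:
        # block_id -> (bytes on "disk", checksum recorded at write time)
        self._blocks: Dict[str, Tuple[bytes, int]] = {}

    def store(self, block_id: str, data: bytes) -> None:
        """Write a block and record its checksum, as happens at write time."""
        self._blocks[block_id] = (data, zlib.crc32(data))

    def corrupt(self, block_id: str, data: bytes) -> None:
        """Simulate bit rot: change the stored bytes but keep the old checksum."""
        _, checksum = self._blocks[block_id]
        self._blocks[block_id] = (data, checksum)

    def scan_once(self) -> List[str]:
        """Verify every stored block and return the IDs of corrupt ones."""
        return sorted(
            block_id
            for block_id, (data, checksum) in self._blocks.items()
            if zlib.crc32(data) != checksum
        )
```

Calling `scan_once()` on a schedule is the essence of the idea: corruption is found by the DataNode itself, not by whichever client happens to read the block first.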
