Why HDFS block size is large ?

This topic has 2 replies, 1 voice, and was last updated 7 years, 8 months ago by DataFlair Team.

Viewing 2 reply threads

Author

Posts
- September 20, 2018 at 2:21 pm #5133
  
  DataFlair Team
  Spectator
  
  FileSystem block size is in KBs while disk block size is in bytes but why HDFS block size is large ?
- September 20, 2018 at 2:21 pm #5135
  
  DataFlair Team
  Spectator
  
  HDFS blocks are large compared to disk blocks, because to minimize the cost of seeks. If we have many smaller size disk blocks, the seek time would be maximum (time spent to seek/look for an information). And also, having multiple small sized blocks is the burden on name node/master, as ultimately the name node stores metadata, so it has to save this disk block information.
  
  If the Data Block is large enough, the time it takes to transfer the data from the disk can be significantly longer than the time to seek to the start of the block. Thus, transferring a large file made of multiple blocks operates at the disk transfer rate.
  
  For each block we need a Mapper. So, in the case of small-sized blocks, there will be a lot of Mappers. Each will be processing the data, which isn’t efficient.
  
  Follow the link to learn more about HDFS Data Blocks
- September 20, 2018 at 2:21 pm #5136
  
  DataFlair Team
  Spectator
  
  HDFS blocks are large, the reason is to lower the seek time(the time to locate the head of a file to read it completely).
  
  With smaller Data Block we have larger no of seek time and lesser number of transfer time, however, we wanted to reverse this process, i.e lesser no of seek time and more no.of transfer time( seek time/transfer time = .01), which is only possible with larger block sizes. Many times we won’t be interested to read the complete file and just find the seek, to process the files.
  
  With HDFS we deal with large files and quick processing hence helps in having lesser seek time.
  Also due to this large block size in HDFS (64mb), MapReduce to processes large single block file easily at one time.
  
  Follow the link to learn more about HDFS Blocks in Hadoop
Author

Posts

Viewing 2 reply threads

You must be logged in to reply to this topic.

Why HDFS block size is large ?

About DataFlair

Trending Data Science Courses

Free Big Data Courses

Trending Programming Courses

Trending Web Dev Courses

Trending Courses

Trending Python Courses

Trending Java Courses

Trending DSA Courses