The ideal HDFS block size is one that is neither too large (say, 1 GB or more) nor too small (say, 10–20 KB), and the size of the input data is the deciding factor. HDFS exists to deal with Big Data (terabytes or petabytes), so if we keep the block size small, the number of blocks becomes very large, and managing that many blocks and their metadata creates significant overhead and congestion, which is certainly not desirable.
On the other hand, with too large a block size we may lose much of the benefit of a distributed file system, where blocks are processed in parallel: fewer, larger blocks mean less parallelism, and the system may have to wait a very long time for a single Mapper to finish processing its block.
For example, let’s say we need to process 1 petabyte of data. In this case a 64 MB block size may not be ideal, since roughly 16 million blocks would be created, which is difficult to manage; a better choice here might be 128 MB or even 256 MB.
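The block counts above can be checked with a quick back-of-the-envelope calculation (a minimal sketch, assuming 1 PB = 1024⁵ bytes; the block sizes are the illustrative values from the example, not HDFS defaults):

```python
# Back-of-the-envelope block counts for 1 PB of data
# at several candidate HDFS block sizes.
PB = 1024 ** 5  # 1 petabyte in bytes
MB = 1024 ** 2  # 1 megabyte in bytes

for block_mb in (64, 128, 256):
    blocks = PB // (block_mb * MB)
    print(f"{block_mb:>4} MB block size -> {blocks:,} blocks")

# Output:
#   64 MB block size -> 16,777,216 blocks
#  128 MB block size -> 8,388,608 blocks
#  256 MB block size -> 4,194,304 blocks
```

Doubling the block size halves the number of blocks the cluster must track, which is why moving from 64 MB to 128 MB or 256 MB eases the metadata burden at petabyte scale.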