Throughput is the amount of work done per unit of time.
HDFS provides high throughput for the following reasons:
In Hadoop, a file is divided into blocks, and these blocks are processed in parallel and independently of one another. Because of this parallel processing, HDFS achieves high throughput.
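A minimal sketch of this idea (plain Python, not the Hadoop API): because each block can be processed independently, the per-block tasks can run in parallel and their results combined afterwards, which is exactly what a MapReduce job does over HDFS blocks.

```python
# Toy illustration (not Hadoop code): independent "blocks" of a file
# are processed in parallel, the way map tasks run on separate HDFS blocks.
from concurrent.futures import ThreadPoolExecutor

def count_words(block: str) -> int:
    # Stand-in for a map task: works on one block, independent of the others.
    return len(block.split())

# A file split into blocks (real HDFS blocks default to 128 MB; tiny here).
blocks = ["to be or not to be", "that is the question", "whether tis nobler"]

# Each block is independent, so all tasks can run concurrently.
with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
    per_block = list(pool.map(count_words, blocks))

total = sum(per_block)
print(per_block, total)  # per-block counts, then the combined result
```

The key property is that no task waits on another; adding more blocks (and workers) raises throughput rather than latency per block.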
HDFS is based on the Write Once, Read Many model: data written once cannot be modified. This simplifies data coherency issues and therefore provides high-throughput data access.
Apache Hadoop works on the Data Locality principle, which states: move computation to the data instead of moving data to the computation. This reduces network congestion and therefore enhances overall system throughput.
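Data locality can be sketched as a scheduling choice (an illustrative simulation, not Hadoop's actual scheduler; the block IDs and node names are made up): prefer to run each task on a node that already holds a replica of its block, so no block is copied over the network.

```python
# Toy sketch of the Data Locality principle: schedule each task on a node
# that already stores the block, falling back to a remote node only if needed.
block_locations = {           # block id -> nodes holding a replica (hypothetical)
    "blk_1": {"nodeA", "nodeB"},
    "blk_2": {"nodeB", "nodeC"},
    "blk_3": {"nodeA", "nodeC"},
}
free_nodes = {"nodeA", "nodeB", "nodeC"}

assignments = {}
for block, replicas in block_locations.items():
    # Prefer a free node that already stores the block (a data-local task).
    local = replicas & free_nodes
    node = min(local) if local else min(free_nodes)  # fallback: any free node
    assignments[block] = node
    free_nodes.discard(node)

print(assignments)  # here every task lands where its block lives: no network copy
```

When every task is data-local, the network carries only small results instead of whole blocks, which is where the throughput gain comes from.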