What is throughput in Hadoop?

This topic has 1 reply, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

Viewing 1 reply thread

Author

Posts
- September 20, 2018 at 12:14 pm #4782
  
  DataFlair Team
  Spectator
  
  How HDFS provide high throughput in Hadoop?
  What is throughput in Apache Hadoop?
- September 20, 2018 at 12:14 pm #4783
  DataFlair Team
  Spectator
  Throughput is the amount of work done in a unit time.
  
  Because of bellow reasons HDFS provides good throughput:
  - In hadoop, the task is divided among different blocks, the processing is done parallel and independent to each other. so because of parallel processing, HDFS has high throughput.
  - The HDFS is based on Write Once and Read Many Model, it simplifies the data coherency issues as the data is written once can’t be modified and therefore, provides high throughput data access.
  - Apache Hadoop works on Data Locality principle which states that move computation to data instead of data to computation, this reduces network congestion and therefore, enhances the overall system throughput.
Author

Posts

Viewing 1 reply thread

You must be logged in to reply to this topic.