Why Apache Spark is faster than Hadoop

This topic has 1 reply, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

Viewing 1 reply thread

Author

Posts
- September 20, 2018 at 4:02 pm #5718
  
  DataFlair Team
  Spectator
  
  I have read that Spark is 100 times faster than Apache Hadoop. Both are distributed computing frameworks, so how can Spark be 100x faster than Hadoop?
- September 20, 2018 at 4:02 pm #5720
  
  DataFlair Team
  Spectator
  
  Apache Spark is faster than Apache Hadoop MapReduce for the following reasons:
  
  1) Apache Spark provides in-memory computing. Spark is designed to transform data in memory, which reduces the time spent on disk I/O, whereas MapReduce writes intermediate results to disk and reads them back for the next step.
  
  2) Spark uses a Directed Acyclic Graph (DAG) of operations, which lets it optimize the whole job and pipeline consecutive transformations within a stage, rather than running a chain of separate map and reduce jobs as in the MapReduce model.
  
  3) Apache Spark's core is written in Scala. Scala and Java both compile to JVM bytecode, so Scala is not inherently faster than Java, but Scala's immutable collections and functional style make concurrent code easier to write safely, whereas in Java we typically manage threads explicitly to achieve parallel execution.
  
  For more details, please refer to: Comparison between Apache Spark and Apache Hadoop.
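
Points 1 and 2 above can be illustrated with a toy sketch in plain Python. This is not Spark or Hadoop code: `mapreduce_style` and `LazyDataset` are hypothetical names invented for this example, and the JSON temp files only stand in for the intermediate output that MapReduce materializes between jobs. The idea it tries to show is that the first approach pays a disk round trip per step, while the second records the steps as a (linear) DAG and runs them in one in-memory pass when a result is requested.

```python
import json
import os
import tempfile


def mapreduce_style(numbers, workdir):
    """Toy model: each step is a separate job that writes its output to disk."""
    steps = [
        lambda xs: [x * 2 for x in xs],             # job 1: map
        lambda xs: [x for x in xs if x % 3 == 0],   # job 2: filter
    ]
    data = list(numbers)
    disk_round_trips = 0
    for i, step in enumerate(steps):
        path = os.path.join(workdir, f"stage_{i}.json")
        with open(path, "w") as f:
            json.dump(step(data), f)   # write the intermediate result
        with open(path) as f:
            data = json.load(f)        # the next job reads it back
        disk_round_trips += 1
    return sum(data), disk_round_trips


class LazyDataset:
    """Toy model: record transformations, run them only when an action is called."""

    def __init__(self, source, ops=()):
        self.source = source
        self.ops = tuple(ops)

    def map(self, fn):
        # Nothing is computed here; the step is only appended to the plan.
        return LazyDataset(self.source, self.ops + (("map", fn),))

    def filter(self, pred):
        return LazyDataset(self.source, self.ops + (("filter", pred),))

    def sum(self):
        # Action: chain all recorded steps into one streaming in-memory pass.
        stream = iter(self.source)
        for kind, fn in self.ops:
            stream = map(fn, stream) if kind == "map" else filter(fn, stream)
        return sum(stream)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(mapreduce_style(range(10), workdir))
    print(LazyDataset(range(10)).map(lambda x: x * 2).filter(lambda x: x % 3 == 0).sum())
```

Both approaches compute the same total; the difference is that the first touches the disk once per step, while the second touches it not at all. Real Spark still writes shuffle data to disk between stages, so this sketch only models the case where consecutive transformations can be pipelined.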