Why Apache Spark is faster than Hadoop

Viewing 1 reply thread
  • Author
    Posts
    • #5718
      DataFlair Team
      Spectator

      I have read that Spark is 100 times faster than Apache Hadoop. Both are distributed computing frameworks, so how can Spark be 100x faster than Hadoop?

    • #5720
      DataFlair Team
      Spectator

      Apache Spark is faster than Apache Hadoop MapReduce for the following reasons:

      1) Apache Spark provides in-memory computing. Spark is designed to transform data in memory and can keep intermediate results there, which reduces the time spent on disk I/O. MapReduce, by contrast, writes intermediate results to disk after each job and reads them back for the next one.

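      2) Spark builds a Directed Acyclic Graph (DAG) of the whole computation before running it. This lets the scheduler optimize the job as a unit and pipeline consecutive transformations into as few stages as possible, rather than running a chain of separate map and reduce jobs as in the MapReduce model.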

      3) Apache Spark core is written in Scala. Scala runs on the JVM just like Java, so the language is not faster in itself, but its immutable collections and functional style make concurrent code easier to write safely. In Java, parallel execution is typically coded explicitly with threads.
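
Points 1 and 2 above can be illustrated with a toy model in plain Python. This is only a sketch, not Spark or Hadoop code: `mapreduce_style` and `LazyDataset` are hypothetical names made up for this example. The first function runs each step as a separate pass and writes the intermediate result to disk, the way a chain of MapReduce jobs would. The class records transformations lazily and applies all of them to each record in a single in-memory pass when `collect()` is called. It only models simple per-record transformations; a real Spark job still starts a new stage wherever data must be shuffled.

```python
import json
import os
import tempfile


def mapreduce_style(records, steps):
    """Run each step as its own pass, persisting intermediates to disk."""
    data = list(records)
    disk_round_trips = 0
    with tempfile.TemporaryDirectory() as tmp:
        for i, step in enumerate(steps):
            data = step(data)
            # Write the intermediate result out, then read it back in,
            # as the next job in a chain would have to do.
            path = os.path.join(tmp, f"stage_{i}.json")
            with open(path, "w") as f:
                json.dump(data, f)
            with open(path) as f:
                data = json.load(f)
            disk_round_trips += 1
    return data, disk_round_trips


class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not Spark's API)."""

    def __init__(self, source, plan=()):
        self._source = source
        self._plan = plan  # recorded chain of (kind, function) steps

    def map(self, fn):
        # Nothing runs here; the step is only added to the plan.
        return LazyDataset(self._source, self._plan + (("map", fn),))

    def filter(self, fn):
        return LazyDataset(self._source, self._plan + (("filter", fn),))

    def collect(self):
        # The whole plan is applied record by record in one pass,
        # with no intermediate dataset written anywhere.
        out = []
        for x in self._source:
            keep = True
            for kind, fn in self._plan:
                if kind == "map":
                    x = fn(x)
                elif not fn(x):
                    keep = False
                    break
            if keep:
                out.append(x)
        return out


steps = [
    lambda d: [x * 2 for x in d],
    lambda d: [x for x in d if x % 3 == 0],
    lambda d: [x + 1 for x in d],
]
disk_result, round_trips = mapreduce_style(range(10), steps)

lazy_result = (
    LazyDataset(range(10))
    .map(lambda x: x * 2)
    .filter(lambda x: x % 3 == 0)
    .map(lambda x: x + 1)
    .collect()
)

print(disk_result, round_trips)  # [1, 7, 13, 19] 3
print(lazy_result)               # [1, 7, 13, 19]
```

Both approaches give the same answer, but the first one pays for three disk round trips while the second touches each record once in memory. That difference, multiplied across iterative or multi-step jobs, is where most of the reported speedup comes from.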

      For more details, please refer to: Comparison between Apache Spark and Apache Hadoop.
