Benefits of Spark over MapReduce or Spark vs MapReduce?


    • #5972
      DataFlair Team
      Spectator

      What is the difference between Apache Spark and Apache Hadoop (MapReduce)?
      I want to know the core differences between them.

    • #5975
      DataFlair Team
      Spectator
      1. Spark is easier to program and needs far less hand coding, whereas MapReduce requires a lot of boilerplate code even for simple jobs.
      2. Spark has an interactive mode (a shell), whereas MapReduce has no built-in interactive mode; it was developed for batch processing.
      3. Spark is a general-purpose cluster computing engine: it supports streaming, machine learning, and batch processing, whereas Hadoop MapReduce is a batch engine only.
      4. Spark can run batch processing jobs about 10 to 100 times faster than Hadoop MapReduce, depending on the workload; the largest gains come when the data fits in memory.
      5. Spark is built on an abstraction called the RDD (Resilient Distributed Dataset), which gives it a rich set of operations, whereas MapReduce has no comparable dataset abstraction and exposes only map and reduce functions over key-value pairs.
      6. Spark achieves lower latency by caching partial or complete results in memory across distributed nodes, whereas MapReduce writes intermediate results to disk.
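
      To make points 1 and 5 a bit more concrete, here is a toy word count in plain Python. It does not use Spark or Hadoop at all: `ToyRDD` is a hypothetical stand-in class, and only its method names (`flatMap`, `map`, `reduceByKey`, `collect`) are borrowed from the RDD API. Unlike a real RDD it is eager and runs on one machine, so treat it as a sketch of the programming style, not of the engine.

```python
from collections import defaultdict

# --- MapReduce style: every phase is written out by hand ---------------

def mapper(line):
    # Emit a (word, 1) pair for each word in the line.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Group all values by key, as the framework would between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    # Collapse the grouped values for one key into a single count.
    return key, sum(values)

def wordcount_mapreduce(lines):
    pairs = [pair for line in lines for pair in mapper(line)]
    return dict(reducer(k, vs) for k, vs in shuffle(pairs).items())

# --- RDD style: one chained expression over a dataset abstraction ------

class ToyRDD:
    """Hypothetical, eager, single-machine imitation of an RDD."""

    def __init__(self, items):
        self.items = list(items)

    def flatMap(self, f):
        return ToyRDD(y for x in self.items for y in f(x))

    def map(self, f):
        return ToyRDD(f(x) for x in self.items)

    def reduceByKey(self, f):
        acc = {}
        for key, value in self.items:
            acc[key] = f(acc[key], value) if key in acc else value
        return ToyRDD(acc.items())

    def collect(self):
        return list(self.items)

def wordcount_chained(lines):
    return dict(
        ToyRDD(lines)
        .flatMap(str.split)
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b)
        .collect()
    )
```

      Both functions return the same counts for the same input; the difference is how much plumbing the programmer has to write and keep track of.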

      For a detailed comparison between Spark and Hadoop MapReduce, please refer to:
      Spark vs Hadoop MapReduce
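
      The caching point in the list above (point 6) can also be illustrated with a small plain-Python model. Again, this is not Spark or Hadoop code: `ToyStorage`, `run_disk_based`, and `run_cached` are hypothetical names, and the "storage" is just a list that counts how often it is read in full. The assumption is an iterative job that makes several passes over the same input.

```python
class ToyStorage:
    """Hypothetical stand-in for slow distributed storage; counts full reads."""

    def __init__(self, records):
        self._records = list(records)
        self.reads = 0

    def load(self):
        self.reads += 1
        return list(self._records)


def parse(line):
    # Stand-in for an expensive parsing or cleaning step.
    return int(line.strip())


def run_disk_based(storage, passes):
    # Each pass behaves like an independent job: reload and re-parse the input.
    totals = []
    for k in range(1, passes + 1):
        data = [parse(line) for line in storage.load()]
        totals.append(sum(value * k for value in data))
    return totals


def run_cached(storage, passes):
    # Load and parse once, keep the result in memory, reuse it on every pass.
    cached = [parse(line) for line in storage.load()]
    return [sum(value * k for value in cached) for k in range(1, passes + 1)]
```

      With three passes, the disk-based version reads the input three times and the cached version reads it once, while both produce the same totals. The price of the cached version is that the parsed data has to stay in memory for the whole run.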

    • #5978
      DataFlair Team
      Spectator

      a) Spark needs more RAM, whereas MapReduce needs more disk space for big data processing. On the cloud, Spark will usually outperform MapReduce.
      b) Hadoop has been around since 2005, yet there is still a shortage of MapReduce experts on the market. What does this mean for Spark, which has only been around since 2010? It may have a gentler learning curve, but far fewer skilled practitioners are available for it than for Hadoop MapReduce.
      c) Spark’s compatibility with data types and data sources is the same as Hadoop MapReduce’s.
      d) Spark and Hadoop MapReduce both have good fault tolerance, but Hadoop MapReduce is slightly more tolerant.
      e) Spark security is still in its infancy; Hadoop MapReduce has more security features and projects.

      For a more detailed study of Apache Spark and MapReduce, read: Difference between Apache Spark and Hadoop MapReduce.
