This topic contains 2 replies, has 1 voice, and was last updated by  dfbdteam5 9 months, 4 weeks ago.

Viewing 3 posts - 1 through 3 (of 3 total)
  • Author
    Posts
  • #5972

    dfbdteam5
    Moderator

    What is the difference between Apache Spark and Apache Hadoop (MapReduce)?
    I want to know the core differences between them.

    #5975

    dfbdteam5
    Moderator
    1. Spark is easy to program and doesn't require much hand coding, whereas MapReduce is harder to program and requires a lot of hand coding.
    2. Spark has an interactive mode, whereas MapReduce has no built-in interactive mode; MapReduce was developed for batch processing.
    3. For data processing, Spark supports streaming, machine learning, and batch processing, whereas Hadoop MapReduce offers only a batch engine. Spark is a general-purpose cluster computing engine.
    4. Spark can execute batch processing jobs about 10 to 100 times faster than Hadoop MapReduce, depending on the workload and on how much of the data fits in memory.
    5. Spark uses an abstraction called RDD (Resilient Distributed Dataset), which gives it a rich set of high-level operations, whereas MapReduce exposes only the low-level map and reduce functions.
    6. Spark achieves lower latency by caching partial/complete results in memory across distributed nodes, whereas MapReduce writes its intermediate results to disk.
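
    To make points 1 and 5 a bit more concrete, here is a minimal pure-Python sketch of a word count written two ways. It is not real Hadoop or Spark code: `ToyRDD` and all function names are hypothetical, and the method names only loosely mirror Spark's `flatMap`, `map`, and `reduceByKey`. The first version spells out every phase by hand, the second chains operations on one dataset abstraction.

```python
from collections import defaultdict

lines = ["spark is fast", "mapreduce is batch", "spark is general"]

# --- MapReduce style: each phase is written out by hand ---
def map_phase(line):
    # Emit a (word, 1) pair for every word in the line.
    return [(word, 1) for word in line.split()]

def shuffle_phase(pairs):
    # Group values by key, as a framework would between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Sum the counts collected for one key.
    return key, sum(values)

def mapreduce_word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(k, v) for k, v in groups.items())

# --- RDD-like style: a hypothetical toy class, NOT Spark's API ---
class ToyRDD:
    def __init__(self, items):
        self.items = list(items)

    def flat_map(self, fn):
        # Apply fn to each item and flatten the results.
        return ToyRDD(x for item in self.items for x in fn(item))

    def map(self, fn):
        return ToyRDD(fn(item) for item in self.items)

    def reduce_by_key(self, fn):
        # Combine all values that share a key using fn.
        acc = {}
        for key, value in self.items:
            acc[key] = fn(acc[key], value) if key in acc else value
        return ToyRDD(acc.items())

    def collect(self):
        return list(self.items)

def rdd_word_count(lines):
    return dict(
        ToyRDD(lines)
        .flat_map(str.split)
        .map(lambda word: (word, 1))
        .reduce_by_key(lambda a, b: a + b)
        .collect()
    )

print(mapreduce_word_count(lines) == rdd_word_count(lines))
```

    Both functions return the same counts; the difference is only in how much plumbing the programmer writes.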

    For a detailed comparison between Spark and Hadoop MapReduce, please refer to:
    Spark vs Hadoop MapReduce
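
    The caching idea in point 6 can also be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions (single process, no cluster); `LazyDataset`, `expensive_step`, and `compute_calls` are hypothetical names and not part of Spark. It counts how often a result is recomputed when it is reused with and without caching.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not Spark's API)."""

    def __init__(self, compute):
        self._compute = compute   # function that produces the data
        self._cached = None       # in-memory copy of the result, if any
        self._use_cache = False
        self.compute_calls = 0    # how many times the data was recomputed

    def cache(self):
        # Ask for the result to be kept in memory after the first computation.
        self._use_cache = True
        return self

    def collect(self):
        if self._use_cache and self._cached is not None:
            return self._cached
        self.compute_calls += 1
        result = self._compute()
        if self._use_cache:
            self._cached = result
        return result


def expensive_step():
    # Placeholder for a costly transformation.
    return [x * x for x in range(5)]


uncached = LazyDataset(expensive_step)
cached = LazyDataset(expensive_step).cache()

# Reuse each dataset three times, as an iterative job would.
for _ in range(3):
    uncached.collect()
    cached.collect()

print(uncached.compute_calls)  # 3
print(cached.compute_calls)    # 1
```

    The uncached dataset redoes the work on every reuse, while the cached one computes once and then serves the result from memory, which is where the latency benefit for iterative jobs comes from.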

    #5978

    dfbdteam5
    Moderator

    a) For big data processing, Spark needs more RAM, whereas MapReduce needs more disk space. On the cloud, where memory can be provisioned as needed, Spark will generally outperform MapReduce.
    b) Hadoop has been around since 2005, yet there is still a shortage of MapReduce experts on the market. What does this mean for Spark, which has only been around since 2010? It may have a gentler learning curve, but there are still far fewer skilled Spark practitioners than Hadoop MR experts.
    c) Spark's compatibility with data types and data sources is the same as Hadoop MapReduce's.
    d) Spark and Hadoop MapReduce both have good failure tolerance, but Hadoop MapReduce is slightly more tolerant.
    e) Spark security is still in its infancy; Hadoop MapReduce has more security features and projects.

    For a more detailed study of Apache Spark and MapReduce, read "Difference between Apache Spark and Hadoop MapReduce".
