Forums › Apache Spark › Why does the picture of Spark come into existence?
This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.
September 20, 2018 at 10:10 pm #6443 — DataFlair Team (Spectator)
What is the need for Apache Spark?
What are the drawbacks of Apache Hadoop?
September 20, 2018 at 10:10 pm #6444 — DataFlair Team (Spectator)
Let’s first discuss some major issues with Apache Hadoop:
1. Problems handling large numbers of small files
2. Slow processing speed
3. Support for batch processing only
4. No support for real-time data processing
There are more limitations of Apache Hadoop; to learn them all, follow the link: 13 Big Limitations of Hadoop
To overcome these issues, Apache Spark came into the picture. Another reason behind the evolution of Apache Spark is that the existing general-purpose computing engines could each handle only one kind of workload, so each carried built-in limitations.
For example:
1. Hadoop MapReduce is limited to batch processing.
2. Apache Storm / S4 supports only stream processing.
3. Apache Impala / Apache Tez allows only interactive processing.
4. Neo4j / Apache Giraph supports only graph processing.
Therefore, using them together reduces efficiency and increases complexity. So there was a strong demand for a single powerful engine that could process data in real time (streaming) as well as in batch mode, respond with sub-second latency, and perform in-memory processing.
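The value of one engine serving both modes is that the same logic gives the same answer whether the data arrives all at once or incrementally. This is a toy sketch in plain Python (not the Spark API) of that idea: a streaming-style job that folds micro-batches into a running state converges to exactly what a batch job computes in one pass.

```python
from collections import Counter

def batch_count(records):
    """Batch mode: process the complete dataset in a single pass."""
    return Counter(word for line in records for word in line.split())

def streaming_count(chunks):
    """Streaming mode: fold incoming micro-batches into running state."""
    state = Counter()
    for chunk in chunks:  # each chunk arrives over time
        state.update(word for line in chunk for word in line.split())
    return state

data = ["spark hadoop", "spark storm", "spark"]
# Both modes agree on the final word counts.
assert batch_count(data) == streaming_count([data[:2], data[2:]])
```

In Spark this unification is real, not just conceptual: the same DataFrame-style transformations can be applied to a static dataset or to a stream, which is what the single-engine demand above is about.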
To meet this demand, the Apache Software Foundation introduced Apache Spark: a powerful open-source engine offering real-time stream processing, interactive processing, graph processing, and in-memory processing as well as batch processing, combined with high speed, ease of use, and a standard interface. This is what sets Hadoop vs Spark apart, and it also drives the comparison of Spark vs Storm.
For detailed insights about Apache Spark, follow the link: Apache Spark – A Complete Spark Tutorial for Beginners