Why Apache Spark?
This topic has 1 reply, 1 voice, and was last updated 6 years ago by DataFlair Team.
September 20, 2018 at 10:02 pm (#6433) by DataFlair Team (Spectator)
What is the need for Apache Spark?
September 20, 2018 at 10:02 pm (#6434) by DataFlair Team (Spectator)
Before Spark, there were already many general-purpose cluster computing tools, for example Hadoop MapReduce, Apache Storm, Apache Impala, and Apache Giraph. But each one is limited in its functionality:
1. Hadoop MapReduce allows only batch processing.
2. For stream processing, only Apache Storm / S4 can be used.
3. For interactive processing, we need Apache Impala / Apache Tez.
4. For graph processing, we opt for Neo4j / Apache Giraph.
Therefore, no single engine could perform all of these tasks together. Hence there was a big demand for a powerful engine that can process data in real time (streaming) as well as in batch mode, respond with sub-second latency, and perform in-memory processing.
This is where Apache Spark comes into the picture. It is a powerful open-source engine that offers interactive processing, real-time stream processing, graph processing, in-memory processing, and batch processing, combining very high speed, ease of use, and a standard interface in a single engine.
There is much more to learn about Spark. For a complete introduction, follow the link: Apache Spark – A Complete Spark Tutorial for Beginners