Explain Spark Architecture in brief.
- In the real world, Apache Spark operates in a master/slave fashion, with one central coordinator and many distributed workers.
- The central coordinator is called the 'Driver', while each distributed worker is called an 'executor'.
- The Driver communicates with a large number of executors.
- The Driver program runs in its own Java process, and each executor also runs in its own Java process.
- The Driver and the executors together are known as a 'Spark Application'.
- A Spark application is launched on a cluster using a Cluster Manager.
- Spark has its own built-in cluster manager, called the Standalone Cluster Manager.
- However, one can also run Spark on two popular open-source cluster managers: Hadoop YARN and Apache Mesos.
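The launch step described above goes through `spark-submit`, where the `--master` flag selects the cluster manager. A minimal sketch, assuming a hypothetical application file `my_app.py` and placeholder hostnames:

```shell
# The same application can be launched under each cluster manager;
# only the --master URL changes. Hostnames and ports below are
# illustrative placeholders, not real endpoints.

# Standalone Cluster Manager (Spark's built-in manager)
spark-submit --master spark://master-host:7077 my_app.py

# Hadoop YARN (cluster deploy mode runs the Driver inside the cluster)
spark-submit --master yarn --deploy-mode cluster my_app.py

# Apache Mesos
spark-submit --master mesos://mesos-host:5050 my_app.py
```

In each case the Cluster Manager allocates executors on the workers, and the Driver then schedules tasks on those executors.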
Spark Driver --> Cluster Manager (Standalone, YARN, Mesos) --> Workers (executors)
In reality, there are many workers below the Cluster Manager, but for simplicity only one executor is shown.
For a detailed description of the Apache Spark Ecosystem, refer to Components of Apache Spark Ecosystem.