What is DAG in Apache Spark?
What is DAG – Directed Acyclic Graph in Spark?
How is DAG used for job execution in Spark?
DAG: Directed Acyclic Graph
A DAG is a graph data structure whose edges are directed and which contains no loops or cycles. In Apache Spark, a DAG represents the dependencies between RDDs: each transformation creates a dependency between a parent RDD and its child.
RDD transformations form a Directed Acyclic Graph, which the DAGScheduler splits into stages once an action is performed.
Stages are sequences of RDD operations that have no shuffle in between.
A stage is a set of tasks that run in parallel. In the end, every stage has only shuffle dependencies on other stages and may compute multiple operations inside it.
For more details visit:
DAG in Apache Spark