What is DAG in Apache Spark?

    • #5598
      DataFlair Team
      Spectator

      What is DAG – Directed Acyclic Graph in Spark?
      How is DAG used for job execution in Spark?

    • #5602
      DataFlair Team
      Spectator

      DAG: Directed Acyclic Graph
      A DAG is a graph data structure whose edges are directed and which contains no loops or cycles. In Apache Spark, the DAG represents the dependencies between RDDs: each transformation creates a dependency between the RDD it produces and the RDD(s) it was derived from.
      RDD transformations form a Directed Acyclic Graph, which the DAGScheduler splits into stages once an action is performed.
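      To make the idea concrete, here is a minimal sketch in plain Python of how transformations can record dependencies without computing anything until an action is called. This is only a toy model: `ToyRDD`, `lineage`, and the `double` helper are hypothetical names made up for illustration, not Spark's API.

```python
# Toy model only: ToyRDD and its methods are hypothetical, not Spark's API.
class ToyRDD:
    def __init__(self, name, parents=(), fn=None, data=None):
        self.name = name
        self.parents = list(parents)  # directed edges: this node -> its parents
        self.fn = fn                  # how to compute this node from its parent
        self.data = data              # only set for source nodes

    # Transformations: record a dependency, compute nothing.
    def map(self, f):
        return ToyRDD(f"map({self.name})", [self],
                      lambda rows: [f(x) for x in rows])

    def filter(self, pred):
        return ToyRDD(f"filter({self.name})", [self],
                      lambda rows: [x for x in rows if pred(x)])

    def lineage(self):
        """Return node names from the source up to this node."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.name)
            node = node.parents[0] if node.parents else None
        return list(reversed(chain))

    # Action: walk the recorded dependencies and actually compute.
    def collect(self):
        if not self.parents:
            return list(self.data)
        return self.fn(self.parents[0].collect())


calls = []

def double(x):
    calls.append(x)  # track when the function really runs
    return x * 2

source = ToyRDD("source", data=[1, 2, 3, 4])
result = source.map(double).filter(lambda x: x > 4)

calls_before_action = len(calls)  # nothing has been computed yet
output = result.collect()         # the action triggers the computation
```

      In this sketch, chaining `map` and `filter` only builds the graph; `double` is not called until `collect()` runs. Real Spark RDDs behave analogously, but the scheduling itself is far more involved.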

      A stage is a sequence of RDD operations that have no shuffle in between.
      Each stage is a set of tasks that run in parallel. In the end, every stage has only shuffle dependencies on other stages and may compute multiple operations inside it.
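      The stage-splitting rule described above can be sketched in plain Python as follows. This is a simplified illustration for a linear chain of operations, not the DAGScheduler's actual algorithm: the `pipeline` list and the `split_into_stages` function are hypothetical, and the shuffle flags are assumptions based on the usual behaviour of these operations (for example, `reduceByKey` and `groupByKey` typically require a shuffle, while `map` and `filter` do not).

```python
# Toy model only: pipeline and split_into_stages are hypothetical
# illustrations, not Spark's DAGScheduler API.

# Each entry is (operation name, whether it needs a shuffle).
pipeline = [
    ("textFile", False),
    ("flatMap", False),
    ("map", False),
    ("reduceByKey", True),   # shuffle boundary
    ("filter", False),
    ("groupByKey", True),    # shuffle boundary
    ("map", False),
]

def split_into_stages(ops):
    """Group consecutive operations, starting a new stage at each shuffle."""
    stages = [[]]
    for name, needs_shuffle in ops:
        # Start a new stage only if the current one already holds operations.
        if needs_shuffle and stages[-1]:
            stages.append([])
        stages[-1].append(name)
    return stages

stages = split_into_stages(pipeline)
for i, stage in enumerate(stages):
    print(f"Stage {i}: {' -> '.join(stage)}")
```

      Here the seven operations fall into three stages, because two of them need a shuffle. Operations inside one stage can be pipelined together on each partition, which is why a stage may compute multiple operations.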

      For more details visit:
      DAG in Apache Spark
