What is Directed Acyclic Graph in Apache Spark?

Viewing 1 reply thread
  • Author
    Posts
    • #5987
      DataFlair Team
      Spectator

      Explain the Directed Acyclic Graph (DAG) in Spark.
      What is the function of the Directed Acyclic Graph in Spark?

    • #5989
      DataFlair Team
      Spectator

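      In mathematical terms, a Directed Acyclic Graph (DAG) is a directed graph with no cycles: every edge points from one vertex to another, and following the edges never leads back to the starting vertex. In Spark, the DAG represents the set of operations applied to an RDD. Transformations are lazy, so nothing runs until an action is called on an RDD. At that point Spark creates the DAG and submits it to the DAG scheduler. Because the whole DAG is available before anything executes, Spark can plan the job as a whole rather than one operation at a time. The DAG scheduler divides the operators into stages of tasks, starting a new stage wherever a shuffle is required. Each stage consists of tasks, one per partition of the input data. Within a stage, the DAG scheduler pipelines operators together, so a chain of operations such as map and filter runs in a single pass over each partition.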
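
To make the stage division concrete, here is a minimal sketch in plain Python. It does not use Spark: `Op`, `build_stages` and `is_acyclic` are hypothetical names invented for this illustration, and the only rule modelled is that an operation needing a shuffle (a "wide" operation) starts a new stage, while other operations are pipelined into their parent's stage.

```python
from dataclasses import dataclass
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


@dataclass(frozen=True)
class Op:
    """One node of a toy operation graph (hypothetical, not a Spark class)."""
    name: str
    parents: tuple = ()   # names of the operations this one reads from
    wide: bool = False    # True if the operation needs a shuffle


def build_stages(ops):
    """Group operations into stages, starting a new stage at each shuffle."""
    by_name = {op.name: op for op in ops}
    graph = {op.name: set(op.parents) for op in ops}
    # Parents come before children; a cycle raises CycleError here.
    order = list(TopologicalSorter(graph).static_order())
    stage_of = {}
    for name in order:
        op = by_name[name]
        parent_stage = max((stage_of[p] for p in op.parents), default=0)
        # A wide operation cannot be pipelined with its parents.
        stage_of[name] = parent_stage + 1 if op.wide else parent_stage
    stages = {}
    for name in order:
        stages.setdefault(stage_of[name], []).append(name)
    return [stages[i] for i in sorted(stages)]


def is_acyclic(ops):
    """Return True if the operations form a valid DAG."""
    try:
        build_stages(ops)
        return True
    except CycleError:
        return False


# A word-count style chain: only reduce_by_key needs a shuffle.
word_count = [
    Op("text_file"),
    Op("flat_map", ("text_file",)),
    Op("map", ("flat_map",)),
    Op("reduce_by_key", ("map",), wide=True),
    Op("map_values", ("reduce_by_key",)),
]
stages = build_stages(word_count)
print(stages)
```

In this toy model the first three operations land in one stage and the shuffle plus everything after it in a second one, which mirrors how a chain of pipelined operators is cut at a shuffle boundary. Real Spark scheduling involves much more (task placement, retries, caching), none of which is modelled here.
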
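      Spark also achieves fault tolerance through the Directed Acyclic Graph: because the graph records the lineage of every RDD, a lost partition can be recomputed by replaying the operations that produced it. The DAG makes optimization possible as well, since Spark sees the complete chain of operations before executing any of them. This is why using the DAG gives better performance.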
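
The fault-tolerance idea can be sketched the same way. The class below is a toy written for this post, not Spark's RDD: `LineageDataset` and its methods are hypothetical names, and the assumption is simply that the source data is durable while computed partitions can be lost. Each dataset remembers the chain of functions that produced it, so a missing partition is rebuilt by replaying that chain on the matching source partition.

```python
class LineageDataset:
    """Toy dataset that remembers how it was derived (hypothetical, not Spark's RDD)."""

    def __init__(self, source_partitions, transforms=()):
        self.source_partitions = source_partitions  # assumed durable input
        self.transforms = tuple(transforms)         # lineage: functions in order
        self.cache = {}                             # computed partitions, may be lost

    def map(self, fn):
        # Lazy: nothing is computed, the lineage just grows by one step.
        return LineageDataset(self.source_partitions, self.transforms + (fn,))

    def compute_partition(self, index):
        # Recompute from the source only if the partition is missing.
        if index not in self.cache:
            rows = self.source_partitions[index]
            for fn in self.transforms:
                rows = [fn(row) for row in rows]
            self.cache[index] = rows
        return self.cache[index]

    def collect(self):
        # Action: forces every partition to be computed.
        result = []
        for index in range(len(self.source_partitions)):
            result.extend(self.compute_partition(index))
        return result


numbers = LineageDataset([[1, 2], [3, 4]])
derived = numbers.map(lambda x: x * 2).map(lambda x: x + 1)

first = derived.collect()
derived.cache.pop(1)        # simulate losing partition 1
second = derived.collect()  # partition 1 is rebuilt from its lineage
print(first == second)
```

Only the lost partition is recomputed; partition 0 is served from the cache. That is the property the lineage gives you: recovery without replicating the computed data.
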

      To learn how the DAG is created, how fault tolerance is achieved through it, and how the DAG optimizer works, read DAG in Apache Spark.
