In mathematical terms, a Directed Acyclic Graph (DAG) is a directed graph with no cycles. In Spark, the DAG holds the set of all operations applied to an RDD. When an action is called on an RDD, Spark creates the DAG and submits it to the DAG scheduler. Only after the DAG is built does Spark create the query optimization plan. The DAG scheduler divides the operators into stages of tasks; a stage is comprised of tasks based on partitions of the input data, and the DAG scheduler pipelines operators together within a stage.
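To make the laziness concrete, here is a minimal toy sketch (not the real Spark API) of an RDD-like class: transformations such as `map` and `filter` only record a new node in the lineage DAG, and nothing executes until an action like `collect` walks the graph. The class and method names are illustrative assumptions, not Spark internals.

```python
# Toy sketch of lazy DAG building: transformations record lineage,
# the action walks the lineage and runs everything. NOT the Spark API.

class ToyRDD:
    def __init__(self, data=None, parent=None, op=None):
        self._data = data      # only the source RDD holds data
        self.parent = parent   # edge back through the lineage DAG
        self.op = op           # deferred transformation for this node

    def map(self, f):
        # Lazy: return a new DAG node, compute nothing yet.
        return ToyRDD(parent=self, op=lambda part: [f(x) for x in part])

    def filter(self, pred):
        return ToyRDD(parent=self, op=lambda part: [x for x in part if pred(x)])

    def collect(self):
        # Action: walk back to the source, then apply ops in order.
        chain, node = [], self
        while node.parent is not None:
            chain.append(node.op)
            node = node.parent
        data = node._data
        for op in reversed(chain):
            data = op(data)
        return data

rdd = ToyRDD(data=[1, 2, 3, 4, 5])
pipeline = rdd.map(lambda x: x * 2).filter(lambda x: x > 4)  # DAG built, nothing run
print(pipeline.collect())  # action triggers execution -> [6, 8, 10]
```

Because the whole chain is visible before anything runs, a scheduler can pipeline `map` and `filter` into a single stage, which is exactly what the DAG scheduler does for narrow transformations.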
Fault tolerance in Spark is achieved using the DAG: because the lineage of every RDD is recorded, a lost partition can be recomputed by replaying the transformations from the source data. The DAG also makes query optimization possible, so we get better performance by using it.
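The recovery idea can be sketched in a few lines, assuming deterministic transformations over a re-readable source (the `build_lineage` and `compute` helpers below are hypothetical names, not Spark functions):

```python
# Toy sketch of lineage-based fault tolerance: if a computed partition
# is lost, replay the recorded transformations from the source.

def build_lineage(source, *ops):
    """Record the source and the transformations without running them."""
    return {"source": source, "ops": list(ops)}

def compute(lineage):
    """Replay the lineage; deterministic ops give the same result every time."""
    data = list(lineage["source"])
    for op in lineage["ops"]:
        data = op(data)
    return data

lineage = build_lineage(
    [1, 2, 3, 4],
    lambda xs: [x + 1 for x in xs],    # first transformation
    lambda xs: [x * 10 for x in xs],   # second transformation
)

result = compute(lineage)     # [20, 30, 40, 50]
result = None                 # simulate losing the computed partition
recovered = compute(lineage)  # recompute from lineage instead of replicating data
print(recovered)              # [20, 30, 40, 50]
```

This is why Spark can avoid replicating intermediate data: the lineage in the DAG is enough to rebuild any lost piece.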
To learn more about how to create a DAG, how fault tolerance is achieved through the DAG, and the working of the DAG optimizer, read DAG in Apache Spark.