Explain Lazy Evaluation in Spark

Viewing 1 reply thread
  • Author
    Posts
    • #6397
      DataFlair Team
      Spectator

      What is lazy evaluation in Spark?
      How is the concept of lazy evaluation used in Spark?

    • #6398
      DataFlair Team
      Spectator

      In Apache Spark, lazy evaluation means that “an operation is not performed until its result is actually required”. Put another way, “evaluation is deferred for as long as possible”.
      Let’s understand this behaviour with an example. A simple application has to go through the following steps to execute successfully:
      1. Loading the file.
      2. Applying a filter on some condition.
      3. Mapping the data with some logic.
      4. Grouping or reducing the data.
      5. Saving the resulting data at a given location.

      Here, only the fifth step matters to the user. The user has no interest in running steps 1 to 4 for their own sake, but they are needed in order to execute step 5. Steps 1 to 4 merely transform the data from one form to another by applying some logic; such operations are called transformations. Step 5 is the only step that produces an actual result; such operations are called actions. For a transformation, Spark does nothing except record an entry in the DAG (directed acyclic graph) until an action is called. The action behaves like a trigger: it tells Spark that results must now be produced and saved to some location, and Spark then starts executing the application on the basis of the DAG it has built.
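
The record-then-trigger behaviour described above can be imitated in a few lines of plain Python. This is only a toy model and not Spark's API or implementation: the class `LazyDataset`, the helper `load_lines`, and the sample log lines are all hypothetical names invented for illustration. Transformations just append a step to a plan, and the source is read only when the action `collect` is called.

```python
class LazyDataset:
    """Toy stand-in for a Spark dataset (hypothetical, not Spark's API)."""

    def __init__(self, source, plan=()):
        self._source = source  # callable that returns an iterator of records
        self._plan = plan      # recorded steps, playing the role of the DAG

    # Transformations: record a step and return a new dataset. No data is read.
    def filter(self, predicate):
        step = lambda records: (r for r in records if predicate(r))
        return LazyDataset(self._source, self._plan + (step,))

    def map(self, func):
        step = lambda records: (func(r) for r in records)
        return LazyDataset(self._source, self._plan + (step,))

    # Action: the trigger. Only now is the source read and the plan executed.
    def collect(self):
        records = self._source()
        for step in self._plan:
            records = step(records)
        return list(records)


load_calls = []  # lets us observe when the "file" is actually read


def load_lines():
    load_calls.append("read")
    return iter(["error: disk full", "info: started", "error: timeout"])


# Steps 1 to 3 of the example: load, filter, map. Nothing runs yet.
errors = (
    LazyDataset(load_lines)
    .filter(lambda line: line.startswith("error"))
    .map(str.upper)
)
calls_before_action = len(load_calls)

# The action triggers execution of everything recorded so far.
result = errors.collect()
calls_after_action = len(load_calls)
```

In this sketch `calls_before_action` is 0 and `calls_after_action` is 1: building the chain of transformations never touched the source, and only the action did.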

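      This lazy behaviour improves cluster performance. Because Spark sees the whole chain of transformations before it runs anything, it can plan the job as a whole, for example by pipelining consecutive transformations into a single pass over the data and by skipping work whose results are never requested.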
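
One way to see why deferring work can save resources is with Python's own lazy iterators. The sketch below is an analogy rather than Spark code: `expensive_parse` and the threshold are made-up names and values, and the built-in `map` and `filter` stand in for transformations while `islice` plus `list` plays the role of an action that asks for only two results.

```python
from itertools import islice

processed = []  # every record the expensive step actually touched


def expensive_parse(record):
    """Hypothetical costly transformation; we log each call to count the work."""
    processed.append(record)
    return record * 10


source = range(1, 1_000_001)  # one million input records

# "Transformations": these only build a lazy pipeline, nothing is computed.
parsed = map(expensive_parse, source)
large = filter(lambda value: value > 30, parsed)

# "Action": request just the first two results.
first_two = list(islice(large, 2))
```

Here `first_two` is `[40, 50]` and `processed` holds only the records 1 to 5, so the remaining 999,995 records were never parsed. An eager version that parsed everything first would have done all of that work for nothing.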

      For complete information on lazy evaluation in Apache Spark, refer to: Lazy Evaluation in Apache Spark – A Complete guide
