What are the benefits of lazy evaluation in Spark?

  • Author
    Posts
    • #6369
      DataFlair Team
      Moderator

      Discuss the benefits of lazy evaluation in Apache Spark.

    • #6370
      DataFlair Team
      Moderator

      Apache Spark uses lazy evaluation, which provides the following benefits:

      1) Transformations on an RDD, and loading data into an RDD, are not executed immediately; Spark waits until it sees an action. Because transformations and data loading are lazily evaluated, Spark can utilize resources more efficiently.

      2) Spark uses lazy evaluation to reduce the number of passes it has to take over the data by grouping operations together. In MapReduce, the developer has to spend a lot of time deciding how to group operations in order to minimize the number of MapReduce passes. In Spark, there is no benefit to writing a single complex map instead of chaining together many simple operations: users can organize their Spark program into small operations, and Spark manages all of them efficiently through lazy evaluation.

      3) Lazy evaluation helps optimize disk and memory usage in Spark.

      4) In general, when doing computation on data, we have to consider two things: space and time complexity. With Spark's lazy evaluation we can reduce both, since actions are triggered only when the data is actually required. This reduces overhead.

      5) It also saves computation and increases speed. Lazy evaluation plays a key role in reducing calculation overhead: only the necessary values are computed instead of the whole dataset (it all depends on the action, and on a few transformations as well).
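      The ideas above can be sketched in plain Python (this is a conceptual illustration, not Spark's actual API or implementation): a tiny dataset class whose transformations only record work in a lineage, so nothing runs until an action, and all chained operations are applied in a single pass over the data.

      ```python
      # Conceptual sketch of lazy evaluation (plain Python, not PySpark).
      # Class and method names are hypothetical, chosen to mirror RDD-style chaining.

      class LazyDataset:
          def __init__(self, data, ops=None):
              self.data = data
              self.ops = ops or []  # recorded transformations (the "lineage")

          def map(self, f):
              # Transformation: just record it; no pass over the data happens here.
              return LazyDataset(self.data, self.ops + [("map", f)])

          def filter(self, pred):
              # Also lazy: recorded, not executed.
              return LazyDataset(self.data, self.ops + [("filter", pred)])

          def collect(self):
              # Action: a single pass applies all chained operations per element,
              # instead of one full pass per transformation (benefit 2 above).
              out = []
              for x in self.data:
                  keep = True
                  for kind, fn in self.ops:
                      if kind == "map":
                          x = fn(x)
                      elif kind == "filter" and not fn(x):
                          keep = False
                          break
                  if keep:
                      out.append(x)
              return out

      rdd = LazyDataset(range(10)).map(lambda x: x * 2).filter(lambda x: x > 10)
      # No computation has happened yet; only the lineage of 2 operations is stored.
      print(rdd.collect())  # the action triggers the single pass: [12, 14, 16, 18]
      ```

      Real Spark works the same way in spirit: `map` and `filter` build up a DAG of transformations, and only an action such as `collect()` or `count()` causes the data to be scanned.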

      To learn more about lazy evaluation, go through this link: Lazy Evaluation in Apache Spark – A Quick Guide
