Why is an RDD immutable?

This topic has 1 reply, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

Viewing 1 reply thread

Author

Posts
- September 20, 2018 at 5:07 pm #6095
  
  DataFlair Team
  Spectator
  
  Why is an RDD immutable?
- September 20, 2018 at 5:07 pm #6097
  
  DataFlair Team
  Spectator
  
  The main reasons are:
  – Immutable data is always safe to share across multiple processes as well as multiple threads, because no one can modify it underneath another reader.
  – Since an RDD is immutable, we can recreate it at any time from its lineage graph.
  – If a computation is time-consuming, we can cache the resulting RDD, which improves performance. Because the RDD never changes, the cached copy never goes stale.
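
To make the lineage and caching points concrete, here is a minimal single-machine sketch in plain Python. It is not Spark's API: `ToyRDD`, `drop_cached_data`, and the `computations` counter are hypothetical names made up for this illustration, and only the method names `map`, `cache`, and `collect` are borrowed from the real RDD interface. It assumes each dataset only needs to remember its parent and the function that derives it.

```python
class ToyRDD:
    """Hypothetical stand-in for an RDD; a toy model, not Spark's implementation."""

    def __init__(self, source=None, parent=None, fn=None):
        # Source data is frozen into a tuple so it cannot be modified later.
        self._source = tuple(source) if source is not None else None
        self._parent = parent    # lineage: which dataset this one came from
        self._fn = fn            # lineage: how to derive it from the parent
        self._use_cache = False
        self._cached = None      # internal memo only; the visible data never changes
        self.computations = 0    # counts real recomputations, for illustration

    def map(self, fn):
        # A transformation never touches self; it returns a new dataset
        # that records self as its parent.
        return ToyRDD(parent=self, fn=fn)

    def cache(self):
        # Ask for the computed result to be kept after the first collect().
        self._use_cache = True
        return self

    def drop_cached_data(self):
        # Simulates losing cached data, for example when a worker fails.
        self._cached = None

    def collect(self):
        if self._cached is not None:
            return list(self._cached)
        if self._parent is None:
            data = self._source
        else:
            # Rebuild from lineage: recompute from the parent's data.
            self.computations += 1
            data = tuple(self._fn(x) for x in self._parent.collect())
        if self._use_cache:
            self._cached = data
        return list(data)


numbers = ToyRDD(source=[1, 2, 3])
squares = numbers.map(lambda x: x * x).cache()

first = squares.collect()    # computed from the parent
second = squares.collect()   # served from the cache, no recomputation
squares.drop_cached_data()   # pretend the cached copy was lost
third = squares.collect()    # rebuilt from lineage, same result as before
```

Because `numbers` can never change, all three calls return the same list, and rebuilding after the simulated loss is guaranteed to give the same answer as the cached copy did.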
  
  Please add more points if I am missing something.
  RDDs are also fault-tolerant and lazily evaluated. For more information, read Fault tolerance in Spark and Lazy evaluation in Spark.