Explain transformation and action in RDD in Apache Spark.

  • Author
    Posts
    • #5937
      DataFlair Team
      Moderator

      Explain the operations of an RDD in Spark.
      Define transformations and actions on an Apache Spark RDD.

    • #5938
      DataFlair Team
      Moderator

      Transformations are RDD operations that produce a new RDD from an existing one, e.g. map, filter, reduceByKey. In other words, a transformation is a function that takes an RDD as input and returns one or more RDDs as output (most return exactly one; randomSplit, for example, returns several). RDDs are immutable, so the input RDD is never changed; the transformation only describes how to compute the new RDD from it. Transformations are lazy: calling one does not execute anything, it only records the step in the RDD's lineage. The recorded transformations run only when an action is invoked.

      Actions are RDD operations that produce non-RDD values. In other words, any RDD operation that returns something other than an RDD is an action. An action triggers execution of the pending transformations and either returns a result to the driver program or writes it to external storage (foreach returns nothing and is run for its side effects). Simply put, an action evaluates the RDD lineage graph. E.g. collect, reduce, count, foreach.
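
      To make the lazy/eager split concrete, here is a minimal sketch in plain Python. It is a toy model, not the Spark API: the class ToyRDD and the helper double are hypothetical names made up for illustration, and the whole thing runs on a single machine. The method names map, filter, collect and count are chosen only to mirror the real RDD operations of the same names.

```python
from typing import Any, Callable, Iterable, Iterator, List, Optional


class ToyRDD:
    """Plain-Python stand-in for an RDD (hypothetical; not the Spark API)."""

    def __init__(
        self,
        source: Optional[Iterable[Any]] = None,
        parent: Optional["ToyRDD"] = None,
        step: Optional[Callable[[Iterator[Any]], Iterator[Any]]] = None,
        name: str = "parallelize",
    ) -> None:
        self._source = source  # data, only set on the root of the lineage
        self._parent = parent  # the ToyRDD this one was derived from
        self._step = step      # how to compute this ToyRDD from its parent
        self.name = name

    # --- Transformations: return a NEW ToyRDD and compute nothing ---
    def map(self, f: Callable[[Any], Any]) -> "ToyRDD":
        return ToyRDD(parent=self, step=lambda it: (f(x) for x in it), name="map")

    def filter(self, pred: Callable[[Any], bool]) -> "ToyRDD":
        return ToyRDD(parent=self, step=lambda it: (x for x in it if pred(x)), name="filter")

    def lineage(self) -> List[str]:
        """Names of the recorded steps, from the root to this ToyRDD."""
        chain = [] if self._parent is None else self._parent.lineage()
        return chain + [self.name]

    def _compute(self) -> Iterator[Any]:
        # Walk the lineage from the root and apply each recorded step.
        if self._parent is None:
            return iter(self._source)
        return self._step(self._parent._compute())

    # --- Actions: evaluate the lineage and return a non-ToyRDD value ---
    def collect(self) -> List[Any]:
        return list(self._compute())

    def count(self) -> int:
        return sum(1 for _ in self._compute())


calls: List[int] = []  # records every time the mapped function really runs


def double(x: int) -> int:
    calls.append(x)
    return x * 2


numbers = ToyRDD(source=[1, 2, 3, 4, 5])
doubled = numbers.map(double)           # transformation: nothing runs yet
big = doubled.filter(lambda x: x > 4)   # transformation: nothing runs yet

calls_before_action = len(calls)        # double has not been called so far
result = big.collect()                  # action: the whole lineage is evaluated now
```

      In this sketch, double is not called until collect runs, numbers itself is never modified, and big.lineage() lists the recorded steps in order. In real PySpark the equivalent chain would start from a SparkContext, along the lines of sc.parallelize([1, 2, 3, 4, 5]).map(...).filter(...).collect().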
