Explain transformation and action in Spark


  • Author
    Posts
    • #6395
DataFlair Team
      Spectator

Define transformation and action in Apache Spark RDD.

    • #6396
DataFlair Team
      Spectator

Before we start with Spark RDD Operations, let us take a deep dive into RDDs in Spark.

Apache Spark RDD supports two types of operations:

      Transformations
      Actions

Now let’s discuss them in detail:

Transformation: These operations transform an existing RDD into a new RDD by applying some logic, such as mapping, filtering, grouping, or reducing.
ex: rdd2 = rdd1.groupByKey()
Here the data of rdd1 is in key-value pairs; the operation groups the data by key and creates a new RDD named rdd2.

Action: These are the operations that evaluate the overall application and produce its results, which is the main objective of the application, such as getting a count, storing filtered or mapped data, or printing to the console.
ex: rdd2.saveAsTextFile("file_path")

The above operation saves rdd2 to the specified path.

Transformation operations are lazy in nature, and actions are the trigger. If we load data and apply some kind of filtering, mapping, or grouping, Spark just makes an entry for each transformation in a DAG (Directed Acyclic Graph), i.e. a flow of data. It does not perform any computation until an action is applied to the data. This behaviour of Spark is known as lazy evaluation. Lazy evaluation increases overall cluster performance by letting Spark optimize the execution plan and resource utilization.

      For detailed information of Spark RDD Operations with examples, follow the link: Spark RDD Operations-Transformation & Action with Example
