What are shared variables in Apache Spark?
Explain shared variables in Spark.
What is the need for shared variables in Apache Spark?
Shared variables are variables that can be used in parallel operations. By default, when Apache Spark runs a function in parallel as a set of tasks on different nodes, it ships a separate copy of each variable used in the function to every task, and updates made to those copies on the executors are never propagated back to the driver. Sometimes, however, a variable needs to be shared across tasks, or between tasks and the driver program. Spark supports two types of shared variables: broadcast variables, which cache a read-only value in memory on all nodes, and accumulators, which are variables that are only "added" to, such as counters and sums.
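As a minimal sketch of both kinds, the snippet below uses Spark's Scala API (the object name, the lookup map, and the accumulator name are all illustrative): it broadcasts a small lookup table to all executors and uses a long accumulator to count keys that are missing from it.

```scala
import org.apache.spark.sql.SparkSession

object SharedVariablesExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SharedVariablesExample")
      .master("local[*]") // local mode, just for illustration
      .getOrCreate()
    val sc = spark.sparkContext

    // Broadcast variable: a read-only value cached once per executor,
    // instead of being shipped with every task closure.
    val countryNames = sc.broadcast(Map("IN" -> "India", "US" -> "United States"))

    // Accumulator: tasks can only add to it; its value is read on the driver.
    val unknownCodes = sc.longAccumulator("unknown country codes")

    val codes = sc.parallelize(Seq("IN", "US", "FR", "IN"))
    val names = codes.map { code =>
      countryNames.value.getOrElse(code, {
        unknownCodes.add(1) // counts codes missing from the broadcast map
        "Unknown"
      })
    }

    names.collect().foreach(println) // the action that triggers the job
    println(s"Unknown codes seen: ${unknownCodes.value}")

    spark.stop()
  }
}
```

One caveat worth remembering: Spark guarantees that each accumulator update is applied exactly once only when the update happens inside an action. Inside transformations such as map (as above), a re-executed task may apply its updates again, so accumulator values driven from transformations should be treated as approximate.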