What are shared variables in Apache Spark?

    • #6025
      DataFlair Team
      Moderator

      Explain shared variables in Spark.
      What is the need for shared variables in Apache Spark?

    • #6027
      DataFlair Team
      Moderator

      Shared variables are variables that can be shared across parallel operations. By default, when Apache Spark runs a function in parallel as a set of tasks on different nodes, it ships a separate copy of each variable used in the function to every task, and updates made to those copies are never propagated back to the driver. Sometimes, however, a variable needs to be shared across tasks, or between tasks and the driver program. Spark supports two types of shared variables: broadcast variables, which cache a read-only value in memory on all nodes, and accumulators, which are variables that tasks can only “add” to, such as counters and sums.
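
      A minimal PySpark sketch of both kinds of shared variable, assuming PySpark is installed and a local Spark runtime is available (the application name and lookup data are made up for illustration):

      ```python
      from pyspark import SparkContext

      sc = SparkContext("local[2]", "shared-variables-demo")

      # Broadcast variable: a read-only lookup table cached once per node,
      # instead of being shipped with every task.
      lookup = sc.broadcast({"a": 1, "b": 2})

      # Accumulator: tasks may only add to it; the driver reads the total.
      counter = sc.accumulator(0)

      def to_value(key):
          counter.add(1)                  # count records processed by tasks
          return lookup.value.get(key, 0)  # read the broadcast value

      result = sc.parallelize(["a", "b", "a"]).map(to_value).collect()
      # result is [1, 2, 1]; counter.value is 3 once the action has run
      sc.stop()
      ```

      Note that the accumulator's value is only reliable on the driver after an action such as `collect()` has executed; tasks themselves cannot read it.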
