Name the two types of shared variable available in Apache Spark.

  • Author
    Posts
    • #5270
      DataFlair Team
      Spectator

      Name the two types of shared variable available in Apache Spark.

    • #5272
      DataFlair Team
      Spectator

      There are two types of shared variables available in Apache Spark:
      (1) Accumulators: used to aggregate information from the tasks back to the driver program.
      (2) Broadcast variables: used to efficiently distribute large read-only values to the worker nodes.

      When we pass a function to Spark, say to filter(), the function can use variables that are defined outside it but within the driver program. However, when the task is submitted to the cluster, each task running on a worker node gets its own copy of those variables, and updates made to those copies are not propagated back to the driver program.

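      Accumulators and broadcast variables address these drawbacks: accumulators let us get updated values back to the driver program, while broadcast variables ship a large read-only value to each worker node once instead of sending a copy with every task.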
