What is SparkSession in Apache Spark?

    • #5634
      DataFlair Team
      Spectator

      What is the need for SparkSession in Spark?
      What are the responsibilities of SparkSession?

    • #5635
      DataFlair Team
      Spectator

      Since Apache Spark 2.0, SparkSession has been the unified entry point for Spark applications.

      Prior to 2.0, SparkContext was the entry point for Spark jobs. The RDD was the main API then, created and manipulated through the SparkContext. Every other API required its own context: SQLContext for SQL, StreamingContext for streaming, and HiveContext for Hive.
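
      As a minimal sketch of that pre-2.0 style in Scala (the app name, local master, and people.json path are illustrative placeholders, not part of the original answer):

      import org.apache.spark.{SparkConf, SparkContext}
      import org.apache.spark.sql.SQLContext

      // Spark 1.x style: a separate context per API
      val conf = new SparkConf().setAppName("Pre20Sketch").setMaster("local[*]")
      val sc = new SparkContext(conf)              // entry point for RDDs
      val sqlContext = new SQLContext(sc)          // separate context just for SQL/DataFrames
      // val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)  // yet another for Hive
      // val ssc = new org.apache.spark.streaming.StreamingContext(sc, Seconds(1))  // and for streaming

      val rdd = sc.parallelize(Seq(1, 2, 3))        // RDDs come from SparkContext
      val df = sqlContext.read.json("people.json")  // hypothetical input path; DataFrames via SQLContext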

      From 2.0 onward, the Dataset API and the DataFrame API (a DataFrame is simply a Dataset of Rows) stand alongside the RDD as Spark's standard units of data abstraction. Most user-defined code is written and evaluated against the Dataset and DataFrame APIs as well as RDDs.

      So a new entry point built around these APIs was needed, and that is why SparkSession was introduced. SparkSession also exposes the functionality previously spread across the separate contexts: SparkContext, SQLContext, StreamingContext, and HiveContext.
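
      Here is a minimal sketch of the unified entry point in Scala; the app name, input path, and the enableHiveSupport() call are illustrative assumptions, not prescribed by the answer above:

      import org.apache.spark.sql.SparkSession

      // Spark 2.0+: one unified entry point
      val spark = SparkSession.builder()
        .appName("SparkSessionSketch")
        .master("local[*]")
        .enableHiveSupport()   // optional; requires Hive classes on the classpath
        .getOrCreate()

      import spark.implicits._

      // DataFrame and Dataset APIs hang directly off the session
      val df = spark.read.json("people.json")   // hypothetical input path
      val ds = Seq(1, 2, 3).toDS()

      // The underlying SparkContext is still reachable for raw RDD work
      val rdd = spark.sparkContext.parallelize(Seq("a", "b", "c"))

      spark.stop()

      Note that SparkSession.builder().getOrCreate() reuses an existing session if one is already running or creates a new one otherwise, so the same pattern works in standalone applications and interactive shells alike.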
