
  • #5634

    dfbdteam5
    Moderator

    What is the need for SparkSession in Spark?
    What are the responsibilities of SparkSession?

    #5635

    dfbdteam5
    Moderator

    Starting with Apache Spark 2.0, SparkSession is the new entry point for Spark applications.

    Prior to 2.0, SparkContext was the entry point for Spark jobs. The RDD was the main API then, and it was created and manipulated through the SparkContext. Every other API required its own context: SQL needed a SQLContext, streaming needed a StreamingContext, and Hive needed a HiveContext.
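
    As a rough sketch of that pre-2.0 style (the app name and the people.json path are just placeholders), an application had to juggle several separate objects:

        import org.apache.spark.{SparkConf, SparkContext}
        import org.apache.spark.sql.SQLContext

        // Entry point for RDDs
        val conf = new SparkConf().setAppName("Pre20Style").setMaster("local[*]")
        val sc   = new SparkContext(conf)

        // A separate context just for SQL / DataFrames
        // (HiveContext and StreamingContext would be yet more objects)
        val sqlContext = new SQLContext(sc)

        val numbers = sc.parallelize(Seq(1, 2, 3))          // RDD created via SparkContext
        val people  = sqlContext.read.json("people.json")   // DataFrame created via SQLContext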

    But from 2.0, the Dataset API and its untyped view, the DataFrame (a Dataset of Row), join the RDD as the standard APIs and the basic units of data abstraction in Spark. User-defined code is written and evaluated against the Dataset and DataFrame APIs as well as RDDs.
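
    For illustration, here is a minimal spark-shell-style sketch of the Dataset/DataFrame relationship (the Person case class and the sample rows are made up):

        import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

        case class Person(name: String, age: Long)

        val spark = SparkSession.builder()
          .appName("DatasetSketch")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        val ds: Dataset[Person] = Seq(Person("Ann", 34), Person("Bob", 28)).toDS()  // typed Dataset
        val df: DataFrame       = ds.toDF()   // a DataFrame is just a Dataset of Row
        val rdd = ds.rdd                      // the underlying RDD is still reachable
        df.filter($"age" > 30).show()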

    So a new entry point built to handle these APIs was needed, which is why SparkSession was introduced. SparkSession combines the functionality of the earlier contexts: it wraps the SparkContext and subsumes SQLContext and HiveContext, and the newer Structured Streaming API is also driven through it (the old DStream-based StreamingContext remains a separate object).
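
    A minimal sketch of the unified entry point (the app name and people.json path are illustrative, and enableHiveSupport() assumes Hive classes are on the classpath):

        import org.apache.spark.sql.SparkSession

        val spark = SparkSession.builder()
          .appName("UnifiedEntryPoint")
          .master("local[*]")
          .enableHiveSupport()   // optional: brings in the old HiveContext functionality
          .getOrCreate()

        val sc  = spark.sparkContext               // the SparkContext is still there underneath
        val rdd = sc.parallelize(Seq(1, 2, 3))     // RDD API, formerly via SparkContext
        val df  = spark.read.json("people.json")   // DataFrame API, formerly via SQLContext

        df.createOrReplaceTempView("people")
        spark.sql("SELECT count(*) FROM people").show()   // SQL without a separate SQLContext

    Note that getOrCreate() returns an already-running session if one exists, so application code and libraries can share the same entry point instead of each constructing their own contexts.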

