SparkSession vs SparkContext in Apache Spark


  • Author
    Posts
    • #5697
      DataFlair Team
      Spectator

      What is the difference between SparkSession and SparkContext in Apache Spark?
      Where should we use SparkSession / SparkContext?

    • #5699
      DataFlair Team
      Spectator

      Spark Context:
      Prior to Spark 2.0.0, SparkContext was the channel through which all Spark functionality was accessed.
      The Spark driver program uses the SparkContext to connect to the cluster through a resource manager (YARN or Mesos).
      A SparkConf object is required to create the SparkContext; it stores configuration parameters such as appName (to identify your Spark driver), the number of cores, and the memory size of the executors running on the worker nodes.

      In order to use the SQL, Hive, and Streaming APIs, separate contexts (SQLContext, HiveContext, StreamingContext) had to be created on top of the SparkContext, as sketched after the example below.

      Example:
      Creating the SparkConf:

      import org.apache.spark.{SparkConf, SparkContext}

      val conf = new SparkConf()
        .setAppName("RetailDataAnalysis")
        .setMaster("spark://master:7077")
        .set("spark.executor.memory", "2g")

      Creating the SparkContext:
      val sc = new SparkContext(conf)
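
      As noted above, separate contexts were built on top of this SparkContext for SQL, Hive, and Streaming. A minimal pre-2.0 sketch (the 10-second batch interval and the commented-out HiveContext line are illustrative assumptions):

      import org.apache.spark.sql.SQLContext
      import org.apache.spark.streaming.{Seconds, StreamingContext}

      // SQL / DataFrame queries needed their own context:
      val sqlContext = new SQLContext(sc)

      // DStream-based streaming needed yet another context:
      val ssc = new StreamingContext(sc, Seconds(10))

      // Hive tables required a HiveContext as well:
      // val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)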

      Spark Session:

      From Spark 2.0.0 onwards, SparkSession provides a single point of entry to interact with the underlying Spark functionality and
      allows programming Spark with the DataFrame and Dataset APIs. All the functionality available with the SparkContext is also available through the SparkSession.

      In order to use the SQL, Hive, and Streaming APIs, there is no need to create separate contexts, as the SparkSession includes all of these APIs; see the sketch after the configuration example below.

      Once the SparkSession is instantiated, we can configure Spark’s run-time config properties.

      Example:

      Creating the SparkSession:

      import org.apache.spark.sql.SparkSession

      val spark = SparkSession
        .builder
        .appName("WorldBankIndex")
        .getOrCreate()

      Configuring properties:
      spark.conf.set("spark.sql.shuffle.partitions", 6)
      spark.conf.set("spark.executor.memory", "2g")
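
      As mentioned above, the single SparkSession covers what the separate contexts used to do. A minimal sketch (the JSON path and the table name are made-up examples):

      // DataFrame API through the session:
      val df = spark.read.json("examples/people.json")
      df.createOrReplaceTempView("people")

      // SQL API through the same session, with no SQLContext needed:
      val adults = spark.sql("SELECT name FROM people WHERE age > 21")

      // The underlying SparkContext is still reachable when RDD APIs are required:
      val sc = spark.sparkContext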

      From Spark 2.0.0 onwards, it is better to use the SparkSession, as it provides access to all the functionality that the SparkContext does, and it additionally exposes the DataFrame and Dataset APIs.
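
      For illustration, a short Dataset example through the SparkSession (the Indicator case class and the sample values are hypothetical):

      import spark.implicits._

      // A typed Dataset built from an in-memory sequence, filtered with the Dataset API:
      case class Indicator(country: String, value: Double)
      val ds = Seq(Indicator("IN", 7.2), Indicator("US", 2.3)).toDS()
      ds.filter(_.value > 3.0).show()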

      For more details, please refer:
      SparkContext
