Name the components of the Apache Spark ecosystem.


    • #6138
      DataFlair Team
      Spectator

      What are the components of the Spark ecosystem?

    • #6141
      DataFlair Team
      Spectator

      Apache Spark consists of the following components:
      1. Spark Core
      2. Spark SQL
      3. Spark Streaming
      4. MLlib
      5. GraphX

      Spark Core: Spark Core contains the basic functionality of Spark, including components for task scheduling, memory management, fault recovery, and interacting with storage systems. Spark Core is also home to the API that defines resilient distributed datasets (RDDs), which are Spark’s main programming abstraction. It also provides many APIs for building and manipulating these RDDs.
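To make the RDD idea more concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: the `ToyRDD` class and all of its method names are hypothetical and only mimic the shape of the real thing. It shows the two properties described above: transformations are recorded rather than run immediately, and a lost partition can be rebuilt from its lineage instead of from a stored copy.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not the Spark API).

    A dataset is either a source holding partitions, or a parent dataset
    plus the per-partition function that derives it (its lineage).
    """

    def __init__(self, partitions=None, parent=None, fn=None, name="source"):
        self._partitions = partitions  # only set on a source dataset
        self._parent = parent          # dataset this one is derived from
        self._fn = fn                  # list -> list, applied per partition
        self.name = name

    @classmethod
    def parallelize(cls, data, num_partitions=2):
        """Split a local collection into roughly equal partitions."""
        data = list(data)
        size = max(1, -(-len(data) // num_partitions))  # ceiling division
        parts = [data[i:i + size] for i in range(0, len(data), size)]
        return cls(partitions=parts)

    def map(self, f):
        # Lazy: only records what to do, nothing is computed yet.
        return ToyRDD(parent=self, fn=lambda part: [f(x) for x in part], name="map")

    def filter(self, pred):
        # Lazy as well.
        return ToyRDD(parent=self, fn=lambda part: [x for x in part if pred(x)], name="filter")

    def num_partitions(self):
        if self._parent is None:
            return len(self._partitions)
        return self._parent.num_partitions()

    def compute_partition(self, index):
        """Rebuild one partition by replaying the lineage from the source."""
        if self._parent is None:
            return list(self._partitions[index])
        return self._fn(self._parent.compute_partition(index))

    def collect(self):
        """Action: triggers the computation and gathers all partitions."""
        result = []
        for i in range(self.num_partitions()):
            result.extend(self.compute_partition(i))
        return result

    def lineage(self):
        """Names of the steps from the source to this dataset."""
        if self._parent is None:
            return [self.name]
        return self._parent.lineage() + [self.name]


rdd = ToyRDD.parallelize(range(1, 11), num_partitions=3)
squares_of_evens = rdd.filter(lambda x: x % 2 == 0).map(lambda x: x * x)

result = squares_of_evens.collect()                # runs filter, then map
rebuilt = squares_of_evens.compute_partition(1)    # "recover" one partition
steps = squares_of_evens.lineage()
```

In real Spark the partitions live on different machines and the scheduler decides where each one is computed. The sketch leaves all of that out and keeps only the lazy, lineage-based evaluation.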

      Spark SQL: Spark SQL provides an interface for working with structured data. It allows querying with SQL as well as the Apache Hive variant of SQL (HQL), and it supports many data sources, including Hive tables, Parquet, and JSON.

      Spark Streaming: Spark Streaming is the Spark component that enables processing of live streams of data, such as log files generated by web servers or queues of messages.
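Spark Streaming handles a live stream by cutting it into small batches and running a batch job on each one. The sketch below illustrates that micro-batch idea in plain Python. It is not the Spark Streaming API: `micro_batches` and `running_word_counts` are hypothetical helper names, and batches are cut by record count here for simplicity, whereas Spark Streaming cuts them by time interval.

```python
from collections import Counter
from itertools import islice


def micro_batches(stream, batch_size):
    """Group an iterable of records into consecutive batches of batch_size."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def running_word_counts(lines, batch_size):
    """Yield a snapshot of cumulative word counts after each batch."""
    totals = Counter()
    for batch in micro_batches(lines, batch_size):
        for line in batch:
            totals.update(line.split())
        yield dict(totals)  # copy, so earlier snapshots are not mutated


# A finite list stands in for the live stream.
lines = ["spark core", "spark sql", "spark streaming", "mllib graphx", "spark"]
snapshots = list(running_word_counts(lines, batch_size=2))
```

Each snapshot reflects everything seen up to the end of that batch, which is roughly how a stateful streaming word count behaves.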

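      MLlib: Spark comes with a library of common machine learning functionality called MLlib. It provides algorithms for classification, regression, clustering, and collaborative filtering.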

      GraphX: GraphX is a library for manipulating graphs (e.g., a social network’s friend graph) and performing graph-parallel computations.

      To learn more about Apache Spark, see: Components of Spark
