What is a DataFrame in Apache Spark?

  • Author
    Posts
    • #6382
      DataFlair Team
      Spectator

      Define DataFrame in Spark.
      Describe the Apache Spark DataFrame.

    • #6383
      DataFlair Team
      Spectator

      DataFrame in Spark
      A DataFrame is a distributed collection of data organized into named columns. It is conceptually similar to a table in a relational database, with additional optimizations applied under the hood. We can construct DataFrames from a variety of sources, such as Hive tables, structured data files, external databases, or existing RDDs.
      Apache Spark aims to provide a simple API for distributed data processing in general-purpose programming languages (Java, Python, Scala). It enables distributed data processing through functional-programming transformations on distributed collections of data (RDDs).

      The DataFrame API is available in Scala, Java, Python, and R.

      For more detail, see: Spark SQL DataFrame Tutorial – An Introduction to DataFrame
