What are DataFrames in Spark?

    • #6391
      DataFlair Team
      Moderator

      Explain DataFrames in Spark.

    • #6392
      DataFlair Team
      Moderator

      Introduction
      The name DataFrame combines two words, data and frame: the data has to fit into some kind of frame. The frame can be thought of as a schema, as in a relational database.

      In Spark, a DataFrame is a distributed collection of data, spread across the nodes of a cluster and organized under a schema. We can think of it as data arranged in rows and named columns. A DataFrame can be created from Hive tables, JSON files, CSV files, other structured data, or raw data that can be given a structure. We can also create a DataFrame from an RDD if a schema can be applied to that RDD.
      A temporary view or table can also be created from a DataFrame, since it has both data and a schema. We can then run SQL queries against that view/table; because the schema is known, Spark can optimize these queries before running them.
      A DataFrame is also evaluated lazily (Lazy Evaluation): transformations are not executed until an action asks for a result, which allows better resource utilization.

      For more detail on DataFrames, refer to: Spark SQL DataFrame Tutorial – An Introduction to DataFrame
