What are DataFrames in Spark?

This topic has 1 reply, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

Viewing 1 reply thread

Author

Posts
- September 20, 2018 at 9:39 pm #6391
  
  DataFlair Team
  Spectator
  
  Explain DataFrames in Spark.
- September 20, 2018 at 9:40 pm #6392
  
  DataFlair Team
  Spectator
  
  Introduction
  The name DataFrame combines two words, data and frame: the data has to fit into some kind of frame. We can think of the frame as a schema, much like the schema of a table in a relational database.
  
  In Spark, a DataFrame is a distributed collection of data, spread across the nodes of a cluster and organized according to a schema. We can think of it as data arranged in rows and named columns. A DataFrame can be created from Hive tables, JSON files, CSV files, other structured data, or raw data that can be given a structure. We can also create a DataFrame from an RDD, provided a schema can be applied to that RDD.
  Because a DataFrame has both data and a schema, we can also register it as a temporary view or table and run SQL queries against it. Spark SQL optimizes these queries before executing them, which can make them faster than equivalent hand-written RDD code.
  A DataFrame is also evaluated lazily (lazy evaluation): transformations are not computed until an action requests a result, which makes better use of resources.
  
  For more detail on DataFrames, see: Spark SQL DataFrame Tutorial – An Introduction to DataFrame