RDDs vs DataFrames

This topic has 1 reply, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

- September 20, 2018 at 12:24 pm #4832
  
  DataFlair Team
  Spectator
  
  What is the difference between RDDs and DataFrames?
- September 20, 2018 at 12:26 pm #4841
  
  DataFlair Team
  Spectator
  
  DataFrame: A DataFrame is a distributed collection of data organized into named columns and rows. It is conceptually equivalent to a table in a relational database, but with richer optimizations. It is a data abstraction with a domain-specific language (DSL) that applies to structured and semi-structured data. Like a data frame in R, it has a two-dimensional, table-like structure: each column holds the values of one variable, columns may have different types (for example numeric, boolean, or string), and each row holds one value for each column.
  
  For more details about DataFrames, please refer to: DataFrame in Spark
  
  RDD: An RDD (Resilient Distributed Dataset) is an immutable, distributed collection of objects (records). Conceptually, it holds references to partitioned objects: the data in an RDD is logically partitioned across many servers so that the partitions can be computed on different nodes of the cluster. RDDs are fault tolerant, i.e. lost partitions are recomputed automatically in the case of failure. The underlying data can be loaded from external sources by the user, such as a JSON file, CSV file, text file, or a database via JDBC, and does not need any specific structure.
  
  For more details about RDDs, please refer to: RDD in Spark
  
  For a detailed comparison of RDDs and DataFrames, see: RDD vs DataFrame vs DataSet
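
To make the contrast in the reply above more concrete, here is a small toy model in plain Python. It is only a sketch and does not use Spark: the classes `ToyRDD` and `ToyDataFrame` and all of their methods are hypothetical names invented for illustration. The RDD side shows an immutable, partitioned collection whose partitions can be recomputed from their lineage; the DataFrame side shows rows that carry a schema of named, typed columns.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple

# Toy model only: these classes are hypothetical and are NOT the Spark API.


@dataclass(frozen=True)
class ToyRDD:
    """Immutable, partitioned collection that remembers how it was derived."""
    source: Optional[Tuple[Tuple[Any, ...], ...]] = None  # base partitions
    parent: Optional["ToyRDD"] = None                     # lineage link
    fn: Optional[Callable[[Any], Any]] = None             # per-record transform

    @staticmethod
    def parallelize(records, num_partitions):
        # Split the records round-robin into a fixed number of partitions.
        parts = tuple(tuple(records[i::num_partitions])
                      for i in range(num_partitions))
        return ToyRDD(source=parts)

    def map(self, fn):
        # Returns a new ToyRDD; the original is left untouched (immutability).
        return ToyRDD(parent=self, fn=fn)

    def num_partitions(self):
        if self.parent is None:
            return len(self.source)
        return self.parent.num_partitions()

    def compute_partition(self, index):
        # Rebuild one partition by replaying the lineage. A lost partition
        # could be recovered the same way, without touching the others.
        if self.parent is None:
            return list(self.source[index])
        return [self.fn(x) for x in self.parent.compute_partition(index)]

    def collect(self):
        out = []
        for i in range(self.num_partitions()):
            out.extend(self.compute_partition(i))
        return out


@dataclass(frozen=True)
class ToyDataFrame:
    """Rows plus a schema of named, typed columns."""
    schema: Tuple[Tuple[str, type], ...]
    rows: Tuple[Tuple[Any, ...], ...]

    def __post_init__(self):
        # Unlike the toy RDD, every row is checked against the schema.
        for row in self.rows:
            if len(row) != len(self.schema):
                raise ValueError("row length does not match schema")
            for value, (name, col_type) in zip(row, self.schema):
                if not isinstance(value, col_type):
                    raise TypeError(
                        f"column {name!r} expects {col_type.__name__}")

    def select(self, *names):
        # Columns are addressed by name rather than by position.
        cols = [name for name, _ in self.schema]
        idx = [cols.index(name) for name in names]
        return ToyDataFrame(
            tuple(self.schema[i] for i in idx),
            tuple(tuple(row[i] for i in idx) for row in self.rows),
        )


records = [("alice", 34), ("bob", 45), ("carol", 29)]

# RDD style: opaque records, positional access inside user functions.
rdd = ToyRDD.parallelize(records, num_partitions=2)
older = rdd.map(lambda rec: (rec[0], rec[1] + 1))

# DataFrame style: the same data with named, typed columns.
df = ToyDataFrame((("name", str), ("age", int)), tuple(records))
names = df.select("name")
```

In this sketch the RDD knows nothing about what is inside each record, so the `map` function has to index into the tuple by position, while the DataFrame can validate rows and select columns by name because it carries a schema. That schema is also what allows an engine to optimize queries over a DataFrame.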