September 20, 2018 at 3:10 pm #5432 DataFlair Team (Spectator)
Explain the RDD properties.
September 20, 2018 at 3:11 pm #5435 DataFlair Team (Spectator)
- RDD (Resilient Distributed Dataset) is the basic abstraction in Apache Spark.
- An RDD is an immutable, partitioned collection of elements, distributed across the cluster, that can be operated on in parallel.
- Each RDD is characterized by five main properties.
- The first three properties describe lineage:
1. A list of partitions.
2. A list of dependencies on other (parent) RDDs.
3. A function to compute each partition.
- The last two properties are optional and are used for optimization during execution:
4. Preferred locations for computing each partition (e.g. the block locations of an HDFS file); this is about data locality.
5. A partitioner for key/value RDDs (e.g. hash partitioning), which determines how records are distributed when data is shuffled.
Examples:
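To make the five properties concrete, here is a minimal pure-Python sketch of them as an interface. It is a toy model and not Spark's API: the names `ToyRDD`, `ListRDD`, and `collect` as written here are hypothetical and exist only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Callable, Iterator, List, Optional


class ToyRDD(ABC):
    """Hypothetical toy model of an RDD; this is not Spark's API."""

    # Lineage: properties 1 to 3.
    @abstractmethod
    def partitions(self) -> List[int]:
        """1. The list of partition indices."""

    @abstractmethod
    def dependencies(self) -> List["ToyRDD"]:
        """2. The parent RDDs this RDD is derived from."""

    @abstractmethod
    def compute(self, partition: int) -> Iterator:
        """3. Produce the elements of one partition."""

    # Optimization: properties 4 and 5 are optional, so they have defaults.
    def preferred_locations(self, partition: int) -> List[str]:
        """4. Hosts where this partition is best computed (none by default)."""
        return []

    def partitioner(self) -> Optional[Callable[[object], int]]:
        """5. Function mapping a key to a partition index (none by default)."""
        return None

    def collect(self) -> list:
        # Evaluate every partition in order and concatenate the results.
        return [x for p in self.partitions() for x in self.compute(p)]


class ListRDD(ToyRDD):
    """A source RDD over in-memory chunks, one chunk per partition."""

    def __init__(self, chunks):
        self._chunks = [list(c) for c in chunks]

    def partitions(self):
        return list(range(len(self._chunks)))

    def dependencies(self):
        return []  # a source has no parents

    def compute(self, partition):
        return iter(self._chunks[partition])


rdd = ListRDD([[1, 2], [3, 4, 5]])
print(rdd.partitions(), rdd.dependencies(), rdd.collect())
```

Only the three lineage methods are abstract in this sketch; the two optimization properties fall back to "no preference" and "no partitioner", which mirrors their optional status in the list above.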
#HadoopRDD:
- HadoopRDD provides core functionality for reading data stored in Hadoop (e.g. HDFS, HBase, Amazon S3) using the older MapReduce API (org.apache.hadoop.mapred).
- Properties of HadoopRDD:
1. List of partitions: one per HDFS block (more precisely, one per input split)
2. List of dependencies on parent RDDs: none
3. Function to compute each partition: read the corresponding HDFS block
4. Optional preferred locations: the HDFS block locations
5. Optional partitioner: none
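The HadoopRDD properties above can be mimicked in a few lines of plain Python. This is only a sketch under loud assumptions: `Block` and `ToyHadoopRDD` are hypothetical names, the "blocks" are in-memory lists rather than HDFS data, and nothing here calls Spark or Hadoop.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Block:
    """Hypothetical stand-in for an HDFS block: its lines and the hosts storing it."""
    lines: List[str]
    hosts: List[str]


class ToyHadoopRDD:
    """Toy model only; it does not read from Hadoop."""

    def __init__(self, blocks: List[Block]):
        self._blocks = blocks

    def partitions(self) -> List[int]:
        # 1. One partition per block.
        return list(range(len(self._blocks)))

    def dependencies(self) -> list:
        # 2. A source RDD has no parents.
        return []

    def compute(self, partition: int) -> Iterator[str]:
        # 3. Read the block that backs this partition.
        return iter(self._blocks[partition].lines)

    def preferred_locations(self, partition: int) -> List[str]:
        # 4. Prefer the hosts that already hold the block (data locality).
        return self._blocks[partition].hosts

    def partitioner(self):
        # 5. No partitioner.
        return None


blocks = [
    Block(lines=["INFO start", "ERROR disk full"], hosts=["node1", "node2"]),
    Block(lines=["INFO done"], hosts=["node3"]),
]
hadoop_rdd = ToyHadoopRDD(blocks)
print(hadoop_rdd.partitions(), hadoop_rdd.preferred_locations(0))
```

A scheduler could use `preferred_locations` to run the task for partition 0 on `node1` or `node2`, so the block does not have to be moved over the network.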
#FilteredRDD:
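- Properties of FilteredRDD:
1. List of partitions: the same number of partitions as the parent RDD
2. List of dependencies on parent RDDs: a one-to-one dependency on the parent (each partition depends on the same partition of the parent)
3. Function to compute each partition: compute the parent's partition and then filter it
4. Optional preferred locations: none of its own (ask the parent)
5. Optional partitioner: none (in Spark itself, filter keeps the parent's partitioner if there is one, since filtering does not change keys)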
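The FilteredRDD properties above can be sketched the same way. Again this is a pure-Python toy, not Spark code: `ToySourceRDD` and `ToyFilteredRDD` are hypothetical names, and the parent is an in-memory source whose host names are made up for the example.

```python
from typing import Callable, Iterator, List


class ToySourceRDD:
    """Hypothetical parent RDD: in-memory chunks, each tagged with one host."""

    def __init__(self, chunks: List[list], hosts: List[str]):
        self._chunks, self._hosts = chunks, hosts

    def partitions(self) -> List[int]:
        return list(range(len(self._chunks)))

    def dependencies(self) -> list:
        return []

    def compute(self, partition: int) -> Iterator:
        return iter(self._chunks[partition])

    def preferred_locations(self, partition: int) -> List[str]:
        return [self._hosts[partition]]

    def partitioner(self):
        return None


class ToyFilteredRDD:
    """Toy filtered RDD: every property is derived from the parent."""

    def __init__(self, parent, predicate: Callable[[object], bool]):
        self._parent, self._predicate = parent, predicate

    def partitions(self) -> List[int]:
        # 1. Same partitions as the parent.
        return self._parent.partitions()

    def dependencies(self) -> list:
        # 2. One-to-one dependency on the single parent.
        return [self._parent]

    def compute(self, partition: int) -> Iterator:
        # 3. Compute the parent's partition, then filter it lazily.
        return (x for x in self._parent.compute(partition) if self._predicate(x))

    def preferred_locations(self, partition: int) -> List[str]:
        # 4. No locations of its own: ask the parent.
        return self._parent.preferred_locations(partition)

    def partitioner(self):
        # 5. None in this sketch, following the list above.
        return None


parent = ToySourceRDD([[1, 2, 3], [4, 5, 6]], hosts=["node1", "node2"])
evens = ToyFilteredRDD(parent, lambda x: x % 2 == 0)
print([list(evens.compute(p)) for p in evens.partitions()])
```

Because partition `i` of the filtered RDD needs only partition `i` of the parent, a lost partition can be recomputed on its own by replaying the lineage, without touching the other partitions.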
For the features of RDDs, see RDD Features in Spark.
For detailed information on RDDs, see the tour on RDD in Spark.