Free Online Certification Courses – Learn Today. Lead Tomorrow. › Forums › Apache Spark › Explain keys() operation in Apache Spark.

This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

September 20, 2018 at 2:35 pm #5214 DataFlair Team
Explain keys() operation in Apache Spark.

September 20, 2018 at 2:36 pm #5220 DataFlair Team
keys() is a transformation. When applied to a pair RDD (an RDD of key-value tuples), it returns a new RDD containing only the keys.

val rdd1 = sc.parallelize(Seq((2,4),(3,6),(4,8),(5,10),(6,12),(7,14),(8,16),(9,18),(10,20)))
val rdd2 = rdd1.keys
rdd2.collect

Output: Array[Int] = Array(2, 3, 4, 5, 6, 7, 8, 9, 10)

September 20, 2018 at 2:37 pm #5223 DataFlair Team
Example 2 (duplicate keys are present in the data set): keys() does not deduplicate, so every key appears in the result as many times as it appears in the input.

val rdd1 = sc.parallelize(Seq((2,4),(3,6),(4,8),(2,6),(4,12),(5,10),(5,40),(10,40)))
val rdd2 = rdd1.keys
rdd2.collect

Output: Array[Int] = Array(2, 3, 4, 2, 4, 5, 5, 10)
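The behavior above can be sketched without a Spark cluster: in Spark, keys() on a pair RDD is defined as mapping each (key, value) tuple to its first element. The sketch below illustrates that equivalence with a plain Scala Seq standing in for the RDD, so no SparkContext is needed; the data values are taken from Example 2.

```scala
// Minimal sketch, assuming only plain Scala collections (no Spark).
// keys() on a pair RDD behaves like map(_._1) over (key, value) tuples.
object KeysSketch {
  def main(args: Array[String]): Unit = {
    // Same pairs as Example 2 above, including duplicate keys
    val pairs = Seq((2, 4), (3, 6), (4, 8), (2, 6), (4, 12), (5, 10), (5, 40), (10, 40))

    // Extract the key (first element) of every tuple
    val keys = pairs.map(_._1)

    // Duplicates are preserved: keys() performs no deduplication
    println(keys.mkString(", "))   // 2, 3, 4, 2, 4, 5, 5, 10
  }
}
```

Because keys() is just a map, it is a narrow transformation: no shuffle occurs, and the result keeps the order and multiplicity of the input partitions.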