Free Online Certification Courses – Learn Today. Lead Tomorrow. › Forums › Apache Spark › Explain keys() operation in Apache Spark.

This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

September 20, 2018 at 2:35 pm #5214 DataFlair Team
Explain keys() operation in Apache Spark.

September 20, 2018 at 2:36 pm #5220 DataFlair Team
keys() is a transformation. When applied to a pair RDD (an RDD of key-value tuples), it returns a new RDD containing only the keys.

val rdd1 = sc.parallelize(Seq((2,4),(3,6),(4,8),(5,10),(6,12),(7,14),(8,16),(9,18),(10,20)))
val rdd2 = rdd1.keys
rdd2.collect

Output: Array[Int] = Array(2, 3, 4, 5, 6, 7, 8, 9, 10)

September 20, 2018 at 2:37 pm #5223 DataFlair Team
Example 2 (duplicate keys are present in the data set): keys() does not deduplicate, so every key appears in the result as many times as it appears in the input.

val rdd1 = sc.parallelize(Seq((2,4),(3,6),(4,8),(2,6),(4,12),(5,10),(5,40),(10,40)))
val rdd2 = rdd1.keys
rdd2.collect

Output: Array[Int] = Array(2, 3, 4, 2, 4, 5, 5, 10)
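The behavior above can be sketched without a Spark cluster: in Spark, keys() on a pair RDD is defined as mapping each (key, value) tuple to its first element. The sketch below illustrates that equivalence with a plain Scala Seq standing in for the RDD, so no SparkContext is needed; the data values are taken from Example 2.

```scala
// Minimal sketch, assuming only plain Scala collections (no Spark).
// keys() on a pair RDD behaves like map(_._1) over (key, value) tuples.
object KeysSketch {
  def main(args: Array[String]): Unit = {
    // Same pairs as Example 2 above, including duplicate keys
    val pairs = Seq((2, 4), (3, 6), (4, 8), (2, 6), (4, 12), (5, 10), (5, 40), (10, 40))

    // Extract the key (first element) of every tuple
    val keys = pairs.map(_._1)

    // Duplicates are preserved: keys() performs no deduplication
    println(keys.mkString(", "))   // 2, 3, 4, 2, 4, 5, 5, 10
  }
}
```

Because keys() is just a map, it is a narrow transformation: no shuffle occurs, and the result keeps the order and multiplicity of the input partitions.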