Explain countByKey() operation

    • #5057
      DataFlair Team
      Moderator

      Explain countByKey() operation.

    • #5058
      DataFlair Team
      Moderator

      countByKey() is an action, not a transformation.
      > Returns (key, count) pairs, where count is the number of elements with that key.

      From:
      http://data-flair.training/blogs/rdd-transformations-actions-apis-apache-spark/#38_CountByKey

      It is available on RDDs whose elements are two-component tuples, that is, (key, value) pairs. It counts the number of elements for each distinct key, ignoring the values themselves, and returns the result to the driver as a local Map of (key, count) pairs.
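
      To make the counting concrete, here is a minimal sketch of the same per-key count using plain Scala collections, with no Spark involved. The object name CountByKeySketch and the helper countByKeyLocal are hypothetical names chosen for illustration; they are not part of the Spark API.

```scala
object CountByKeySketch {
  // Hypothetical helper (not a Spark API): groups the pairs by key and
  // counts how many elements fall under each key. The values are ignored.
  def countByKeyLocal[K, V](pairs: Seq[(K, V)]): Map[K, Long] =
    pairs.groupBy(_._1).map { case (key, group) => key -> group.size.toLong }

  def main(args: Array[String]): Unit = {
    val pairs = Seq(("Spark", 78), ("Hive", 95), ("spark", 15), ("HBase", 25),
      ("spark", 39), ("BigData", 78), ("spark", 49))
    // Keys are compared exactly, so "Spark" and "spark" are counted separately.
    println(countByKeyLocal(pairs))
  }
}
```

      The values (78, 95, ...) never influence the result; only the number of occurrences of each key matters.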

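      // Pair RDD of (key, value) tuples; keys are case-sensitive, so "Spark" and "spark" are distinct keys
      val rdd1 = sc.parallelize(Seq(("Spark",78),("Hive",95),("spark",15),("HBase",25),("spark",39),("BigData",78),("spark",49)))
      // Action: counts the elements for each key and returns a local Map to the driver
      rdd1.countByKey()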


      Output:

      scala.collection.Map[String,Long] = Map(Hive -> 1, BigData -> 1, HBase -> 1, spark -> 3, Spark -> 1)
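
      Because countByKey() brings the whole map back to the driver, it is best suited to data with a modest number of distinct keys. When there are very many keys, a commonly used alternative is mapValues followed by reduceByKey, which keeps the counts in an RDD instead. The sketch below assumes Spark is on the classpath and runs with a local master; the object name CountByKeyDistributed, the helper countsAsRdd, and the app name are hypothetical.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object CountByKeyDistributed {
  // Hypothetical helper: replaces every value with 1L and sums per key.
  // The result stays distributed as an RDD rather than a driver-side Map.
  def countsAsRdd(pairs: RDD[(String, Int)]): RDD[(String, Long)] =
    pairs.mapValues(_ => 1L).reduceByKey(_ + _)

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for illustration only.
    val spark = SparkSession.builder()
      .appName("countByKey-sketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    val rdd1 = sc.parallelize(Seq(("Spark", 78), ("Hive", 95), ("spark", 15),
      ("HBase", 25), ("spark", 39), ("BigData", 78), ("spark", 49)))

    // collect() is used here only because the sample data is tiny.
    countsAsRdd(rdd1).collect().foreach(println)

    spark.stop()
  }
}
```

      On the sample data this yields the same counts as the map above, but as (key, count) elements of an RDD that can be transformed further or written out without passing through the driver.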
