filter transformation in Spark

Viewing 2 reply threads
  • Author
    Posts
    • #5015
      DataFlair Team
      Moderator

      Explain the filter transformation.

    • #5018
      DataFlair Team
      Moderator
      • The filter() transformation in Apache Spark takes a function as input.
      • It returns an RDD containing only the elements that pass the condition specified by the input function.

      Example:

      val rdd1 = sc.parallelize(List(10, 20, 40, 60))
      val rdd2 = rdd1.filter(x => x != 10)
      rdd2.collect().foreach(println)

      Output

      20
      40
      60

      For more transformations, see Transformation and Action in Apache Spark.

    • #5022
      DataFlair Team
      Moderator

      filter() returns a new dataset formed by selecting those elements of the source on which the function returns true, i.e. only the elements that satisfy a predicate. A predicate is a function that accepts a parameter and returns a Boolean value, either true or false. filter() keeps the elements that pass the condition and filters out those that don't, so the new RDD is the set of elements for which the predicate returns true.

      From: Operation on RDD in Spark.
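
      The predicate semantics described above can be sketched with a plain Scala collection, which filters the same way an RDD does (just eagerly, without a SparkContext); the names `nums`, `isLarge`, and `kept` are illustrative, not from the original post:

      ```scala
      // A predicate is a function Int => Boolean.
      val isLarge: Int => Boolean = x => x > 15

      val nums = List(10, 20, 40, 60)

      // filter keeps only the elements for which the predicate returns true.
      val kept = nums.filter(isLarge)  // List(20, 40, 60); 10 is filtered out
      ```

      On an RDD the call looks identical (`rdd.filter(isLarge)`), but it is a lazy transformation: the predicate runs only when an action such as collect() is invoked.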
