Explain the flatMap() transformation in Apache Spark.

Viewing 2 reply threads
  • Author
    Posts
    • #4979
      DataFlair Team
      Moderator

      Explain the flatMap() transformation in Apache Spark.

    • #4980
      DataFlair Team
      Moderator
      • flatMap() is used when you want to produce multiple elements (values) for each input element.
      • Like map(), flatMap() takes a function as its input.
      • The function returns a collection (for example a list or an array) that can be iterated over, so it can produce 0 or more output elements for each input element. flatMap() then flattens these collections into a single RDD.
      • A simple use of flatMap() is splitting an input line (string) into words.

      Example

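      // Build an RDD of three lines (sc is the SparkContext provided by spark-shell)
      val fm1 = sc.parallelize(List("Good Morning", "Data Flair", "Spark Batch"))
      // Split each line on spaces; flatMap flattens the resulting arrays into one RDD of words
      val fm2 = fm1.flatMap(y => y.split(" "))
      // Collect to the driver before printing so the output appears locally and in order
      fm2.collect().foreach(println)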

      Output is as follows:

      Good
      Morning
      Data
      Flair
      Spark
      Batch
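
In the example above every line yields at least one word. The "0 or more" case can be sketched with plain Scala collections, whose flatMap flattens in the same way. This is only an illustration, not Spark code: the names `input`, `words`, `perLine` and `allWords` are made up for it, and with Spark the same function could be passed to flatMap() on an RDD.

```scala
// Plain Scala List standing in for an RDD (assumption for illustration only).
val input = List("Good Morning", "", "Spark")

// Hypothetical helper: returns 0 or more words per line.
// "".split(" ") yields one empty string, so empty tokens are filtered out.
def words(line: String): Seq[String] =
  line.split(" ").filter(_.nonEmpty).toSeq

// Number of elements each input line contributes: 2, 0 and 1.
val perLine = input.map(l => words(l).length)

// flatMap concatenates the per-line results; the empty line contributes nothing.
val allWords = input.flatMap(l => words(l))

println(perLine)   // List(2, 0, 1)
println(allWords)  // List(Good, Morning, Spark)
```

The empty input line simply disappears from the result, which is why flatMap() can also be used to drop elements.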

    • #4981
      DataFlair Team
      Moderator

      flatMap() does a job similar to map(). The difference is that the function passed to flatMap() returns a collection of 0 or more elements (a list, array or other sequence) for each input element, and flatMap() flattens these collections so that the result contains the individual elements rather than nested collections.
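
The difference between the two can be sketched side by side. The snippet below uses a plain Scala List instead of an RDD so that it runs without a cluster (an assumption made for illustration). The names `greetings`, `mapped` and `flattened` are hypothetical.

```scala
// Plain Scala List standing in for an RDD (assumption for illustration only).
val greetings = List("Good Morning", "Data Flair")

// map: exactly one output per input, so the result stays nested.
val mapped: List[List[String]] = greetings.map(_.split(" ").toList)

// flatMap: the per-line lists are concatenated into one flat list.
val flattened: List[String] = greetings.flatMap(_.split(" ").toList)

println(mapped)     // List(List(Good, Morning), List(Data, Flair))
println(flattened)  // List(Good, Morning, Data, Flair)

// For these collections, flatMap gives the same result as map followed by flatten.
println(mapped.flatten == flattened)  // true
```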

      For a detailed study of Apache Spark transformations and actions, refer to:
      http://data-flair.training/blogs/rdd-transformations-actions-apis-apache-spark/
