Explain pipe() operation in Apache Spark

    • #5450
      DataFlair Team
      Moderator

      Explain pipe() operation in Apache Spark

    • #5452
      DataFlair Team
      Moderator
      • It is a transformation.


      def pipe(command: String): RDD[String]
      Return an RDD created by piping elements to a forked external process.
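
To make "piping elements to a forked external process" concrete, here is a minimal Spark-free sketch in Python of what happens for the elements of one partition. The helper name `pipe_partition` is hypothetical and is not part of Spark; it only imitates the behaviour described above, under the assumption that each element travels as one line of text. In PySpark the real call would be `rdd.pipe(command)`.

```python
import shlex
import subprocess


def pipe_partition(elements, command):
    """Imitate pipe() for one partition (illustrative helper, not a Spark API).

    Every element is written to the child process's stdin as one line,
    and every line the child prints becomes one string in the result.
    """
    # One element per line on the child's standard input.
    stdin_text = "".join(f"{element}\n" for element in elements)

    # Split the command into argv tokens; no shell is involved in this sketch.
    completed = subprocess.run(
        shlex.split(command),
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,  # raise if the external process exits with an error
    )

    # One output line per resulting element, always as a string.
    return completed.stdout.splitlines()
```

For instance, on a Unix-like system `pipe_partition(["spark", "pipe"], "tr a-z A-Z")` should return `['SPARK', 'PIPE']`. Spark itself forks the external process once per partition rather than once per element, which is why the sketch feeds all elements to a single child process.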

      • Spark programs are generally written in Scala, Java, or Python. When that is not enough and you want to process the data with code written in another language, such as R, Spark provides a general mechanism in the form of the pipe() method.
      • The pipe() method is defined on RDDs.
      • With pipe(), you can write a transformation of an RDD in which an external program or script reads each element of the RDD from standard input as a String.
      • The program writes its results to standard output as Strings, and each line of output becomes one element of the returned RDD.
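
The external side of the contract (read elements from standard input, write results to standard output) can be sketched as follows. This is only an illustration: the function name `square_lines` and the assumption that every element is an integer are hypothetical choices, and the in-memory streams at the bottom stand in for the real stdin and stdout.

```python
import io


def square_lines(source, sink):
    """Read one element per line from source and write its square to sink."""
    for line in source:
        value = line.strip()
        if not value:
            continue  # ignore blank lines instead of failing on them
        # Elements always arrive as text, so parse before computing.
        sink.write(f"{int(value) ** 2}\n")


# Demo with in-memory streams standing in for stdin and stdout.
demo_output = io.StringIO()
square_lines(io.StringIO("1\n2\n3\n"), demo_output)
print(demo_output.getvalue(), end="")  # prints 1, 4 and 9 on separate lines
```

Saved as a script (say a hypothetical `square.py`) that ends with `square_lines(sys.stdin, sys.stdout)`, this could be passed to Spark as `rdd.pipe("python3 square.py")`, assuming the interpreter and the script are available on every worker node. The results come back as Strings, so a later `map` is needed to turn them into numbers again.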

      For more transformations on RDDs, see: Apache Spark Operations
