Explain pipe() operation in Apache Spark

This topic has 1 reply, 1 voice, and was last updated 7 years, 10 months ago by DataFlair Team.

- September 20, 2018 at 3:13 pm #5450
  
  DataFlair Team
  Spectator
  
  Explain pipe() operation in Apache Spark
- September 20, 2018 at 3:13 pm #5452
  DataFlair Team
  Spectator
  - It is a transformation.
  def pipe(command: String): RDD[String]
  Return an RDD created by piping elements to a forked external process.
  - Spark programs are generally written in Scala, Java, or Python. When that is not enough and one wants to pipe the data through a program written in another language, such as R, Spark provides a general mechanism in the form of the pipe() method.
  - Spark provides the pipe() method on RDDs.
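  - With Spark's pipe() method, one can write a transformation of an RDD in which an external program reads each element of the RDD from its standard input as a String.
  - The program writes its results to standard output as Strings, and each line of that output becomes an element of the resulting RDD.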
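  The points above can be illustrated with a minimal sketch. It assumes a local Spark session and a Unix-like system where the `tr` command is available on every worker; the object name `PipeExample` and the sample words are made up for illustration and are not part of the original answer.

  ```scala
  import org.apache.spark.sql.SparkSession

  // Hypothetical example object; the name and sample data are assumptions.
  object PipeExample {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("PipeExample")
        .master("local[*]") // local mode, for illustration only
        .getOrCreate()
      val sc = spark.sparkContext

      // A small RDD of Strings split across two partitions.
      val words = sc.parallelize(Seq("spark", "pipe", "example"), numSlices = 2)

      // Each partition forks its own `tr` process. Every element is written
      // to that process's standard input as one line, and every line the
      // process writes to standard output becomes an element of the new RDD.
      val upper = words.pipe("tr a-z A-Z")

      // pipe() is a transformation, so nothing runs until an action is called.
      upper.collect().foreach(println) // prints SPARK, PIPE, EXAMPLE (one per line)

      spark.stop()
    }
  }
  ```

  The same pattern should apply to a script written in R or any other language, provided the script reads lines from standard input, writes lines to standard output, and is installed on every worker node.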
  For more transformations on RDDs, see: Apache Spark Operations