What is DStream in Apache Spark Streaming?

    • #6380
      DataFlair Team
      Moderator

      Define the basic abstraction of Spark Streaming.
      How can we form a DStream?

    • #6381
      DataFlair Team
      Moderator

      Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs representing a continuous stream of data. A DStream can be created either from live input data (for example, data from HDFS, Kafka, or Flume) or by transforming existing DStreams with operations such as map, window, and reduceByKeyAndWindow.
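
To make the "sequence of RDDs" idea concrete, here is a minimal sketch in plain Python. It does not use the Spark API: plain lists stand in for RDDs, and the helper names (`discretize`, `map_stream`, `window`) and the sample events are made up for illustration. It only shows how a stream is cut into batches and how a map followed by a sliding window behaves on those batches.

```python
from collections import Counter


def discretize(events, batch_interval):
    """Group (timestamp, value) events into consecutive batches.

    Each batch is a plain list standing in for one RDD (an assumption of
    this toy model, not how Spark stores data).
    """
    if not events:
        return []
    last_time = max(t for t, _ in events)
    n_batches = int(last_time // batch_interval) + 1
    batches = [[] for _ in range(n_batches)]
    for t, value in events:
        batches[int(t // batch_interval)].append(value)
    return batches


def map_stream(batches, fn):
    """Apply fn to every element of every batch, in the spirit of map."""
    return [[fn(x) for x in batch] for batch in batches]


def window(batches, length):
    """For each batch, merge it with up to (length - 1) preceding batches."""
    out = []
    for i in range(len(batches)):
        start = max(0, i - length + 1)
        out.append([x for b in batches[start:i + 1] for x in b])
    return out


# Hypothetical input: (arrival time in seconds, value)
events = [(0.2, "a"), (0.7, "b"), (1.1, "a"), (2.5, "c"), (2.9, "a")]

batches = discretize(events, batch_interval=1.0)
upper = map_stream(batches, str.upper)
windowed_counts = [Counter(w) for w in window(upper, length=2)]

for i, counts in enumerate(windowed_counts):
    print(i, dict(counts))
```

In real Spark Streaming the batching, mapping, and windowing are done by the framework on distributed RDDs; the sketch only mirrors the shape of the computation.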

      Internally, a DStream is characterized by a few basic properties:

      1. A list of other DStreams that the DStream depends on.
      2. A time interval at which the DStream generates an RDD.
      3. A function that is used to generate an RDD after each time interval.
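
The three properties above can be sketched as a small toy class. This is plain Python, not Spark code: `ToyDStream`, `make_source`, and the sample data are hypothetical, and lists again stand in for RDDs. The attribute names are chosen only to line up with the three properties in the list.

```python
class ToyDStream:
    """Toy stand-in for a DStream; plain lists stand in for RDDs."""

    def __init__(self, dependencies, slide_duration, compute_fn):
        self.dependencies = dependencies      # property 1: parent DStreams
        self.slide_duration = slide_duration  # property 2: interval between RDDs
        self.compute_fn = compute_fn          # property 3: builds the RDD for a time

    def compute(self, time):
        """Return the batch for `time`, or None if no batch is due then."""
        if time % self.slide_duration != 0:
            return None
        # Ask each parent for its batch at the same time, then derive ours.
        parent_batches = [p.compute(time) for p in self.dependencies]
        return self.compute_fn(time, parent_batches)

    def map(self, fn):
        """Derive a new stream whose only dependency is this one."""
        return ToyDStream(
            dependencies=[self],
            slide_duration=self.slide_duration,
            compute_fn=lambda time, parents: [fn(x) for x in parents[0]],
        )


def make_source(data_by_time, slide_duration):
    """Input stream with no dependencies; reads batches from a dict."""
    return ToyDStream(
        dependencies=[],
        slide_duration=slide_duration,
        compute_fn=lambda time, _parents: list(data_by_time.get(time, [])),
    )


# Hypothetical data: one batch every 2 time units
source = make_source({0: [1, 2], 2: [3], 4: [4, 5, 6]}, slide_duration=2)
doubled = source.map(lambda x: x * 2)

results = {t: doubled.compute(t) for t in range(0, 5)}
print(results)
```

Here `doubled` depends on `source`, inherits its interval, and supplies its own function for producing each batch, so a batch appears only at times that are multiples of the interval.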

      For a complete introduction, refer to: Apache Spark DStream (Discretized Streams)
