Explain Spark Streaming

    • #6426
      DataFlair Team
      Spectator

      Explain Spark Streaming

    • #6427
      DataFlair Team
      Spectator

      Spark Streaming
      A data stream is data that arrives continuously, as an unbounded sequence. For further processing, streaming divides this continuously flowing input into discrete units. Spark Streaming provides low-latency processing and analysis of streaming data.

      Spark Streaming was added to Apache Spark in 2013. It enables scalable, fault-tolerant stream processing of live data streams. Data can be ingested from many sources such as Kafka, Apache Flume, Amazon Kinesis, or TCP sockets, and processed with complex algorithms expressed through high-level functions such as map, reduce, join, and window. Finally, the processed data can be pushed out to filesystems, databases, and live dashboards.

      Internally, Spark Streaming receives live input data streams and divides them into batches. These batches are then processed by the Spark engine to generate the final stream of results, also in batches.
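      The sketch below shows this end to end: a classic word count over a TCP socket, written against the DStream API in Scala. The host, port, and the 5-second batch interval are placeholder choices for illustration, not required values:

      ```scala
      import org.apache.spark.SparkConf
      import org.apache.spark.streaming.{Seconds, StreamingContext}

      object NetworkWordCount {
        def main(args: Array[String]): Unit = {
          // The batch interval (5s here) is what slices the live stream into batches
          val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
          val ssc  = new StreamingContext(conf, Seconds(5))

          // Ingest a live text stream from a TCP socket (placeholder host/port)
          val lines = ssc.socketTextStream("localhost", 9999)

          // Express the computation with high-level functions: flatMap, map, reduceByKey
          val counts = lines
            .flatMap(_.split(" "))
            .map(word => (word, 1))
            .reduceByKey(_ + _)

          // A sliding-window variant: counts over the last 30s, recomputed every 10s
          val windowed = counts.reduceByKeyAndWindow(_ + _, Seconds(30), Seconds(10))

          // Push each batch of results out (to the console here; could equally be a
          // filesystem or database)
          counts.print()
          windowed.print()

          ssc.start()             // start receiving data and processing batches
          ssc.awaitTermination()  // run until stopped
        }
      }
      ```

      To try it locally, a tool such as `nc -lk 9999` can act as the socket source.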

      Its basic abstraction is the Discretized Stream, or DStream for short, which represents a stream of data divided into small batches. DStreams are built on Spark RDDs, Spark’s core data abstraction, which also lets Spark Streaming integrate with other Apache Spark components such as Spark MLlib and Spark SQL, as the sketch below shows.
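      Because a DStream is just a sequence of RDDs, each batch can be handed to other Spark components directly. The following sketch, modeled on the common foreachRDD pattern, runs Spark SQL over every batch of the same placeholder socket source (names such as `words` and the temp view are illustrative):

      ```scala
      import org.apache.spark.SparkConf
      import org.apache.spark.sql.SparkSession
      import org.apache.spark.streaming.{Seconds, StreamingContext}

      object SqlOverDStream {
        def main(args: Array[String]): Unit = {
          val conf = new SparkConf().setMaster("local[2]").setAppName("SqlOverDStream")
          val ssc  = new StreamingContext(conf, Seconds(10))

          val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))

          // Drop down to the underlying RDD of each batch and query it with Spark SQL
          words.foreachRDD { rdd =>
            val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate()
            import spark.implicits._

            val wordsDF = rdd.toDF("word")
            wordsDF.createOrReplaceTempView("words")
            spark.sql("SELECT word, COUNT(*) AS total FROM words GROUP BY word").show()
          }

          ssc.start()
          ssc.awaitTermination()
        }
      }
      ```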

      For more information on Spark Streaming, follow the link: Spark Streaming Tutorial for Beginners
