

{"id":2891,"date":"2017-06-17T09:13:24","date_gmt":"2017-06-17T09:13:24","guid":{"rendered":"http:\/\/data-flair.training\/blogs\/?p=2891"},"modified":"2018-11-16T14:47:09","modified_gmt":"2018-11-16T09:17:09","slug":"apache-spark-dstream-discretized-streams","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/","title":{"rendered":"Apache Spark DStream (Discretized Streams)"},"content":{"rendered":"<h2>1. Objective<\/h2>\n<p>This Spark tutorial walks you through the <a href=\"http:\/\/data-flair.training\/blogs\/apache-spark-tutorial\/\"><strong>Apache Spark<\/strong><\/a> <strong>DStream<\/strong>. First, we will see what Spark Streaming is, and then what a DStream is in Apache Spark. This Apache Spark blog also discusses Discretized Stream operations, i.e. stateless and stateful transformations and output operations, as well as input DStreams and receivers.<\/p>\n<div id=\"attachment_42370\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-42370\" class=\"size-full wp-image-42370\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1.jpg\" alt=\"Apache Spark DStream (Discretized Streams)\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1-768x402.jpg 768w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1-1024x536.jpg 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1-520x272.jpg 520w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-42370\" class=\"wp-caption-text\">Apache Spark DStream (Discretized Streams)<\/p><\/div>\n<h2>2. Introduction to DStream in Apache Spark<\/h2>\n<p>In this section, we will learn about DStream: its role and responsibilities in Spark Streaming, and the methods it provides for handling live streams of data.<br \/>\nAs an extension to the Apache Spark API, <a href=\"http:\/\/data-flair.training\/blogs\/apache-spark-streaming-comprehensive-guide\/\"><strong>Spark Streaming<\/strong><\/a> is a fault-tolerant, high-throughput system for processing live streams of data. Spark Streaming takes input from various sources such as <strong><a href=\"http:\/\/data-flair.training\/blogs\/apache-flume-tutorial\/\">Flume<\/a>, <a href=\"http:\/\/data-flair.training\/blogs\/comprehensive-hdfs-guide-introduction-architecture-data-read-write-tutorial\/\">HDFS<\/a>,<\/strong> and <strong>Kafka<\/strong>, and then sends the processed data to file systems, databases, or live dashboards. The input data stream is divided into batches, and the final stream of results is also generated in batches.<\/p>\n<p>Spark<strong> DStream (Discretized Stream)<\/strong> is the basic abstraction of <strong>Spark Streaming<\/strong>. A DStream represents a continuous stream of data. It is either an input stream received from a source such as <em>Kafka, Flume, Kinesis, or TCP sockets<\/em>, or a data stream generated by transforming an input stream. At its core, <strong><em>a DStream is a continuous series of RDDs (the core Spark abstraction)<\/em><\/strong>. 
Each RDD in a DStream contains the data from a certain interval.<\/p>\n<p>Any operation applied on a DStream translates to operations on the underlying RDDs. The DStream API hides most of these details and gives the developer a higher-level API for convenience. As a result, Spark DStream makes it easier to work with streaming data.<\/p>\n<p><a href=\"http:\/\/data-flair.training\/blogs\/apache-spark-streaming-fault-tolerance\/\">Spark Streaming offers fault-tolerance<\/a> properties for DStreams similar to those for RDDs: as long as a copy of the input data is available, it can recompute any state derived from it using the <a href=\"http:\/\/data-flair.training\/blogs\/directed-acyclic-graph-dag-in-apache-spark\/\">lineage of the RDDs<\/a>. By default, received data is replicated on two nodes. As a result, Spark Streaming can tolerate single worker failures.<\/p>\n<h2>3. Apache Spark DStream Operations<\/h2>\n<p>Like RDDs, Spark DStreams support two types of operations: transformations and output operations.<\/p>\n<h3>i. Transformation<\/h3>\n<p>There are two types of transformations in DStream:<\/p>\n<ul>\n<li>Stateless Transformations<\/li>\n<li>Stateful Transformations<\/li>\n<\/ul>\n<h4><strong style=\"font-family: Verdana, Geneva, sans-serif\">a. Stateless Transformations<\/strong><\/h4>\n<p>In a stateless transformation, the processing of each batch does not depend on the data of previous batches. <em>Stateless transformations<\/em> are simple RDD transformations applied to every batch, that is, to every RDD in a DStream. They include common <a href=\"http:\/\/data-flair.training\/blogs\/apache-spark-rdd-transformations-actions\/\"><strong>RDD transformations<\/strong><\/a> such as <em>map(), filter(), and reduceByKey()<\/em>.<br \/>\nAlthough these functions look as if they apply to the whole stream, <em><strong>each DStream is composed of many RDDs (batches)<\/strong><\/em>, and each stateless transformation is applied separately to each RDD. For example, <em>reduceByKey()<\/em> reduces data within each batch, not across batches.<\/p>\n<p>Stateless transformations can also combine data from multiple DStreams within each time step. 
For example, key\/value DStreams have the same join-related transformations as RDDs: <em>cogroup(), join(), leftOuterJoin()<\/em>, and so on.<\/p>\n<p>Using these operations on DStreams performs the underlying RDD operations separately on each batch.<\/p>\n<p>If the built-in <em>stateless transformations<\/em> are insufficient, DStreams provide an advanced operator called <strong>transform()<\/strong>, which lets you operate directly on the RDDs inside them. <em>transform()<\/em> applies an arbitrary RDD-to-RDD function to the DStream. This function is called on each batch of data in the stream to produce a new stream.<\/p>\n<h4><strong style=\"font-family: Verdana, Geneva, sans-serif\">b. Stateful Transformations<\/strong><\/h4>\n<p><em>Stateful transformations<\/em> are operations on DStreams that track data across time: they use data or intermediate results from previous batches to compute the result of the current batch.<\/p>\n<p>The two main types are <strong>windowed operations<\/strong>, which act over a sliding window of time periods, and <strong>updateStateByKey()<\/strong>, which is used to track state across events for each key (e.g., to build up an object representing each user session).<\/p>\n<p>Follow this link to <a href=\"http:\/\/data-flair.training\/blogs\/apache-spark-streaming-transformation-operations\/\">read about DStream transformations in detail, with examples.<\/a><\/p>\n<h3>ii. Output Operation<\/h3>\n<p>Once the data has been transformed, output operations are performed on it in Spark Streaming. Common output operations include print() and the save operations such as saveAsTextFiles(). print() is useful for debugging: it takes the first 10 elements from each batch of the DStream and prints them. Once the program has been debugged, output operations can be used to save the results. The save operations, such as saveAsTextFiles(), take a path prefix to save files under and an optional suffix.<\/p>\n<h2>4. 
Input DStreams and Receivers<\/h2>\n<p>An input DStream is a DStream representing the stream of input data received from a streaming source. Every input DStream (except file streams) is associated with a Receiver (<a href=\"http:\/\/data-flair.training\/blogs\/why-you-should-learn-scala-introductory-tutorial\/\">Scala<\/a> doc,\u00a0Java doc) object, which receives the data from a source and stores it in Spark\u2019s memory for processing.<\/p>\n<p>Spark Streaming provides two categories of built-in streaming sources:<\/p>\n<ul>\n<li><strong><em> Basic sources &#8211;<\/em><\/strong>\u00a0Sources that are directly available in the<em> StreamingContext API.<\/em> Examples: file systems and socket connections.<\/li>\n<li><strong><em> Advanced sources &#8211;<\/em><\/strong>\u00a0Sources such as <em>Kafka, Flume, and Kinesis<\/em> that are available through extra utility classes and therefore require linking against extra dependencies.<\/li>\n<\/ul>\n<p>For example:<\/p>\n<ul>\n<li><strong><em>Kafka:<\/em><\/strong> the artifact required for Kafka is <em>spark-streaming-kafka-0-8_2.11.<\/em><\/li>\n<li><strong><em>Flume:<\/em><\/strong> the artifact required for Flume is <em>spark-streaming-flume_2.11.<\/em><\/li>\n<li><strong><em>Kinesis:<\/em><\/strong> the artifact required for Kinesis is <em>spark-streaming-kinesis-asl_2.11.<\/em><\/li>\n<\/ul>\n<p>To receive multiple streams of data in parallel, you can create multiple input DStreams. This creates multiple receivers, which receive the data streams simultaneously. Each receiver runs as a long-running task in a Spark worker\/executor, so it occupies one of the cores allocated to the Spark Streaming application. The application therefore needs enough cores to run the receivers as well as to process the received data.<\/p>\n<h2>5. Conclusion<\/h2>\n<p>In conclusion, just as Spark provides RDDs, Spark Streaming provides a high-level abstraction known as\u00a0DStream. A DStream represents a continuous stream of data. Internally, a DStream is represented as a sequence of RDDs. 
DStreams can be created either from input sources such as Kafka and Flume, or by applying transformations to existing DStreams.<\/p>\n<p><em>For any query about Spark DStream (Discretized Streams), do leave a comment in the section below.<\/em><br \/>\n<strong>See Also-<\/strong><\/p>\n<ul>\n<li><a href=\"http:\/\/data-flair.training\/blogs\/apache-spark-streaming-checkpoint\/\">Spark Streaming Checkpointing<\/a><\/li>\n<li><a href=\"http:\/\/data-flair.training\/blogs\/apache-storm-vs-apache-spark-streaming-comparison-guide\/\">Apache Storm vs Spark Streaming<\/a><\/li>\n<\/ul>\n<p><strong><a href=\"https:\/\/en.wikipedia.org\/wiki\/Apache_Spark\">Reference for Spark<\/a><\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>1. Objective This Spark tutorial, walk you through the Apache Spark DStream. First of all, we will see what is Spark Streaming, then, what is DStream in Apache Spark. Discretized Stream Operations i.e Stateless&#46;&#46;&#46;<\/p>\n","protected":false},"author":6,"featured_media":42370,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[987,3971,4085,13055,13104,13130],"class_list":["post-2891","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-spark","tag-apche-spark-dstream","tag-discretized-streams","tag-dstream-transformations","tag-spark-dstream","tag-spark-rdd","tag-spark-streaming"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Apache Spark DStream (Discretized Streams) - DataFlair<\/title>\n<meta name=\"description\" content=\"Discretized Streams in Apache Spark cover what is DStream in spark, DStream stateless &amp; stateful transformation,Spark DStream output operation,Input DStream\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta 
property=\"og:title\" content=\"Apache Spark DStream (Discretized Streams) - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Discretized Streams in Apache Spark cover what is DStream in spark, DStream stateless &amp; stateful transformation,Spark DStream output operation,Input DStream\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2017-06-17T09:13:24+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-11-16T09:17:09+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Apache Spark DStream (Discretized Streams) - DataFlair","description":"Discretized Streams in Apache Spark cover what is DStream in spark, DStream stateless & stateful transformation,Spark DStream output operation,Input DStream","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/","og_locale":"en_US","og_type":"article","og_title":"Apache Spark DStream (Discretized Streams) - DataFlair","og_description":"Discretized Streams in Apache Spark cover what is DStream in spark, DStream stateless & stateful transformation,Spark DStream output operation,Input DStream","og_url":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2017-06-17T09:13:24+00:00","article_modified_time":"2018-11-16T09:17:09+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89"},"headline":"Apache Spark DStream (Discretized Streams)","datePublished":"2017-06-17T09:13:24+00:00","dateModified":"2018-11-16T09:17:09+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/"},"wordCount":975,"commentCount":0,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1.jpg","keywords":["Apche Spark DStream","Discretized Streams","DStream Transformations","Spark DStream","spark rdd","spark streaming"],"articleSection":["Apache Spark Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/","url":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/","name":"Apache Spark DStream (Discretized Streams) - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1.jpg","datePublished":"2017-06-17T09:13:24+00:00","dateModified":"2018-11-16T09:17:09+00:00","description":"Discretized Streams in Apache Spark cover what is DStream in spark, DStream stateless & stateful transformation,Spark DStream output operation,Input DStream","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2017\/06\/apache-spark-dstream-1.jpg","width":1200,"height":628,"caption":"Apache Spark DStream (Discretized Streams)"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/apache-spark-dstream-discretized-streams\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"Apache Spark Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/spark\/"},{"@type":"ListItem","position":3,"name":"Apache Spark DStream (Discretized 
Streams)"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"The 
DataFlair Team provides industry-driven content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. Our expert educators focus on delivering value-packed, easy-to-follow resources for tech enthusiasts and professionals.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam2\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/2891","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=2891"}],"version-history":[{"count":5,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/2891\/revisions"}],"predecessor-version":[{"id":42371,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/2891\/revisions\/42371"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/42370"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=2891"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=2891"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=2891"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}