Explain different transformations in DStream in Apache Spark Streaming.
September 20, 2018 at 4:48 pm, #5941, DataFlair Team (Spectator)
Define the various types of transformations in Apache Spark Streaming.
Explain the kinds of transformations available on a Spark Streaming DStream.
September 20, 2018 at 4:48 pm, #5944, DataFlair Team (Spectator)
The transformations available on a DStream in Apache Spark Streaming are:
1-map(func) — Return a new DStream by passing each element of the source DStream through a function func.
2-flatMap(func) — Similar to map, but each input item can be mapped to 0 or more output items.
3-filter(func) — Return a new DStream by selecting only the records of the source DStream on which func returns true.
4-repartition(numPartitions) — Changes the level of parallelism in this DStream by creating more or fewer partitions.
5-union(otherStream) — Return a new DStream that contains the union of the elements in the source DStream and otherDStream.
6-count() — Return a new DStream of single-element RDDs by counting the number of elements in each RDD of the source DStream.
7-reduce(func) — Return a new DStream of single-element RDDs by aggregating the elements in each RDD of the source DStream using a function func (which takes two arguments and returns one). The function should be associative and commutative so that it can be computed in parallel.
8-countByValue() — When called on a DStream of elements of type K, return a new DStream of (K, Long) pairs where the value of each key is its frequency in each RDD of the source DStream.
9-reduceByKey(func, [numTasks]) — When called on a DStream of (K, V) pairs, return a new DStream of (K, V) pairs where the values for each key are aggregated using the given reduce function. The optional numTasks argument sets the number of parallel tasks used for the grouping.
10-join(otherStream, [numTasks]) — When called on two DStreams of (K, V) and (K, W) pairs, return a new DStream of (K, (V, W)) pairs with all pairs of elements for each key.
11-cogroup(otherStream, [numTasks]) — When called on a DStream of (K, V) and (K, W) pairs, return a new DStream of (K, Seq[V], Seq[W]) tuples.
12-transform(func) — Return a new DStream by applying an RDD-to-RDD function to every RDD of the source DStream.
13-updateStateByKey(func) — Return a new “state” DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values for the key.
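
To make the per-batch behaviour of a few of these transformations concrete, here is a small sketch in plain Python. It is a toy model, not the Spark API: a "DStream" is assumed to be a list of batches, and each batch (a list) stands in for the RDD produced in one batch interval. The helper names (d_map, d_flat_map, d_filter, d_count, d_reduce_by_key, d_update_state_by_key) are hypothetical and exist only for this illustration.

```python
# Toy model of DStream semantics. NOT the Spark API.
# Assumption: a "DStream" is a list of batches; each batch is a list that
# stands in for the RDD of one batch interval.

def d_map(stream, func):
    # map(func): apply func to every element of every batch
    return [[func(x) for x in batch] for batch in stream]

def d_flat_map(stream, func):
    # flatMap(func): each input item yields 0 or more output items
    return [[y for x in batch for y in func(x)] for batch in stream]

def d_filter(stream, func):
    # filter(func): keep only the records for which func returns true
    return [[x for x in batch if func(x)] for batch in stream]

def d_count(stream):
    # count(): one single-element batch per input batch
    return [[len(batch)] for batch in stream]

def d_reduce_by_key(stream, func):
    # reduceByKey(func): aggregate the values of each key within each batch
    out = []
    for batch in stream:
        acc = {}
        for key, value in batch:
            acc[key] = func(acc[key], value) if key in acc else value
        out.append(sorted(acc.items()))
    return out

def d_update_state_by_key(stream, update):
    # updateStateByKey(func): state is carried across batches.
    # update(new_values, previous_state_or_None) returns the new state;
    # in this sketch, returning None drops the key.
    state = {}
    out = []
    for batch in stream:
        new_values = {}
        for key, value in batch:
            new_values.setdefault(key, []).append(value)
        for key in set(state) | set(new_values):
            new_state = update(new_values.get(key, []), state.get(key))
            if new_state is None:
                state.pop(key, None)
            else:
                state[key] = new_state
        out.append(sorted(state.items()))
    return out

# Two batches of text lines, as if received in two batch intervals.
lines = [["a b a"], ["b c"]]
words = d_flat_map(lines, str.split)
pairs = d_map(words, lambda w: (w, 1))

# Word count per batch versus a running word count across batches.
per_batch = d_reduce_by_key(pairs, lambda x, y: x + y)
running = d_update_state_by_key(pairs, lambda new, old: sum(new) + (old or 0))

print(per_batch)  # counts restart in every batch
print(running)    # counts accumulate over batches
```

The point of the comparison at the end is that reduceByKey only sees the records of the current batch, while updateStateByKey keeps a value per key from one batch to the next. In real Spark Streaming, the stateful version also needs a checkpoint directory to be configured.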