> reduce() is an action. It is wide operation (i.e. shuffle data across multiple partitions and output a single value)
> It takes function as an input which has two parameter of the same type and output a single value of the input type.
> i.e. combine the elements of RDD together.
Example 1 :
val rdd1 = sc.parallelize(1 to 100)
val rdd2 = rdd1.reduce((x,y) => x+y)
val rdd2 = rdd1.reduce(_ + _)
rdd2: Int = 5050
val rdd1 = sc.parallelize(1 to 5)
val rdd2 = rdd1.reduce(_*_)
It takes function with two arguments an accumulator and a value which should be commutative and Associative in mathematical nature. It reduces a list of element s into one as a result. This function produces same result when continuously applied on same set of RDD data with multiple partitions irrespective of elements order. It is wide operation.