
  • #5888

    dfbdteam5
    Moderator

    What is the exact difference between the reduce and fold operations in Spark?

    #5889

    dfbdteam5
    Moderator

    Reduce:
    reduce walks through the elements of a collection, applying your function to
    pairs of elements: the running result is combined with the next element in
    the sequence until a single value remains.

    def reduce(f: (T, T) => T): T
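
    A minimal sketch of reduce (assuming a SparkContext sc is available, e.g. in
    spark-shell; the numbers are made up for illustration):

    // Sum the elements of an RDD pairwise with reduce
    val numbers = sc.parallelize(List(1, 2, 3, 4, 5))
    val total = numbers.reduce((a, b) => a + b)   // total = 15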

    Fold:
    fold works much like reduce: it also aggregates over a collection by
    repeatedly applying an operation, but it starts from a caller-supplied
    initial (zero) value. In Spark the zero value is applied once per partition
    and once more when the partition results are merged, so it should be a
    neutral element for the operation.

    def fold(zeroValue: T)(op: (T, T) => T): T
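
    To illustrate the per-partition zero value (a made-up sketch with a
    deliberately non-neutral zero):

    // With 3 partitions, the zero value 10 is added once per partition
    // and once more for the final merge: (10+1) + (10+2) + (10+3) + 10 = 46
    val rdd = sc.parallelize(List(1, 2, 3), 3)
    val surprising = rdd.fold(10)(_ + _)   // 46, not 16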

    Example:

    Finding the employee with the maximum salary in a given RDD

    // Sample employee data: (name, salary)
    val employeeData = List(("Ram", 1000.0), ("Vishnu", 2000.0), ("Ravi", 7000.0))
    val employeeRDD = sc.makeRDD(employeeData)

    // Zero value for fold: a dummy employee with the lowest possible salary
    val dummyEmployee = ("ABC", 0.0)

    // Keep whichever of the accumulator and the current employee earns more
    val maxSalaryEmployee = employeeRDD.fold(dummyEmployee)((acc, employee) =>
      if (acc._2 < employee._2) employee else acc)
    println("employee with maximum salary is " + maxSalaryEmployee)
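
    For comparison, the same lookup written with reduce (assuming the same
    employeeRDD as above): no dummy/zero element is needed, but reduce fails on
    an empty RDD, whereas fold simply returns its zero value.

    // Same max-salary lookup with reduce: no initial value required
    val maxWithReduce = employeeRDD.reduce((acc, employee) =>
      if (acc._2 < employee._2) employee else acc)
    println("employee with maximum salary is " + maxWithReduce)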

    For more actions on Apache Spark RDDs, refer to RDD Operations.

