With the static typing of Datasets, a developer can catch errors at compile time, which saves time and cost.
2) Run-time Safety: Dataset APIs are expressed as lambda functions over typed JVM objects, so any mismatch of typed parameters is detected at compile time. Analysis errors are also detected at compile time when using Datasets, again saving developer time and cost.
3) Performance and Optimization: Dataset APIs are built on top of the Spark SQL engine, which uses Catalyst to generate optimized logical and physical query plans, providing both space and speed efficiency.
4) For processing needs such as high-level expressions, filters, maps, aggregations, averages, sums, SQL queries, columnar access, and lambda functions on semi-structured data, Datasets are the best fit.
5) Datasets provide rich semantics, high-level abstractions, and domain-specific APIs.
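
To make the points in this list concrete, here is a minimal Scala sketch. It assumes a local Spark installation with the Scala 2 API. The `Person` case class, the `DatasetSketch` object, the `adultNames` helper, and the sample rows are hypothetical names invented for this illustration. It shows a typed filter and map written as lambdas, a high-level aggregation, and a call to `explain` to print the plans that Catalyst produces.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions.avg

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int, city: String)

object DatasetSketch {
  // Typed transformation: each lambda receives a Person object, so a
  // misspelled field (for example p.agee) or comparing p.age with a String
  // is rejected by the Scala compiler before the job ever runs.
  def adultNames(people: Dataset[Person]): Dataset[String] = {
    import people.sparkSession.implicits._ // encoders for String, Person, ...
    people
      .filter((p: Person) => p.age >= 18)
      .map((p: Person) => p.name)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is good enough for a demonstration.
    val spark = SparkSession.builder()
      .appName("dataset-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val people: Dataset[Person] = Seq(
      Person("Ann", 34, "Oslo"),
      Person("Bob", 12, "Oslo"),
      Person("Cy", 51, "Rome")
    ).toDS()

    // Lambda-based, compile-time-checked pipeline.
    adultNames(people).show()

    // High-level aggregation using column expressions on the same Dataset.
    val avgAgeByCity = people.groupBy($"city").agg(avg($"age").as("avg_age"))

    // Print the logical and physical plans generated by Catalyst.
    avgAgeByCity.explain(true)
    avgAgeByCity.show()

    spark.stop()
  }
}
```

Note that the compile-time guarantee applies to the typed, lambda-based calls. The column expressions in the aggregation (`$"city"`, `$"age"`) are resolved by Spark during analysis, so a misspelled column name there surfaces only when the program runs.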