What are the advantages of Datasets in Spark?
- This topic has 1 reply, 1 voice, and was last updated 6 years ago by DataFlair Team.
September 20, 2018 at 9:48 pm #6410 DataFlair Team (Spectator)
What are the benefits of Datasets in Apache Spark?
-
September 20, 2018 at 9:48 pm #6411 DataFlair Team (Spectator)
1) Static typing:
Because a Dataset is statically typed, a developer can catch errors at compile time, which saves time and cost.
2) Run-time type safety:
The typed Dataset API is expressed as lambda functions over typed JVM objects, so any mismatch of typed parameters is detected at compile time. Analysis errors, such as referring to a field that does not exist, are also caught at compile time when using Datasets instead of surfacing at run time, which saves developer time and cost.
3) Performance and optimization:
The Dataset API is built on top of the Spark SQL engine, which uses Catalyst to generate optimized logical and physical query plans. This gives Datasets both space and speed efficiency.
4) Datasets are the best fit for workloads that need high-level expressions, filters, maps, aggregations, averages, sums, SQL queries, or columnar access, and for applying lambda functions to semi-structured data.
5) Datasets provide rich semantics, high-level abstractions, and domain-specific APIs.
For a complete introduction to Datasets and all their features, see the Spark Dataset Tutorial.
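To make the points above concrete, here is a minimal Scala sketch. It is only an illustration under some assumptions: the `Employee` case class, the `DatasetBenefits` object, its helper methods, and the sample rows are all hypothetical names invented for this example, and it assumes Spark SQL is on the classpath and runs in local mode.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.avg

// Hypothetical record type, invented for this illustration only.
case class Employee(name: String, dept: String, salary: Double)

object DatasetBenefits {

  // Builds a small typed Dataset from made-up sample rows.
  def sample(spark: SparkSession): Dataset[Employee] = {
    import spark.implicits._
    Seq(
      Employee("Asha", "eng", 120.0),
      Employee("Ben", "eng", 100.0),
      Employee("Chen", "ops", 80.0)
    ).toDS()
  }

  // Points 1 and 2: the lambdas receive an Employee, so a misspelled
  // field such as `e.salry` is rejected by the Scala compiler. With an
  // untyped column string like select("salry"), the mistake would only
  // show up at run time as an AnalysisException.
  def wellPaidNames(spark: SparkSession, ds: Dataset[Employee]): Dataset[String] = {
    import spark.implicits._
    ds.filter(e => e.salary >= 100.0).map(_.name)
  }

  // Point 4: a high-level relational aggregation (average per group).
  def avgSalaryByDept(ds: Dataset[Employee]): DataFrame =
    ds.groupBy("dept").agg(avg("salary").as("avg_salary"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dataset-benefits-sketch")
      .master("local[*]") // assumption: local mode, for illustration
      .getOrCreate()

    val employees = sample(spark)
    wellPaidNames(spark, employees).show()

    val averages = avgSalaryByDept(employees)
    // Point 3: print the logical and physical plans produced by Catalyst.
    averages.explain(true)
    averages.show()

    spark.stop()
  }
}
```

The typed `filter` and `map` calls cover the compile-time checks, while `groupBy`/`agg` shows that the same Dataset can still be used through the higher-level relational API. Calling `explain(true)` is a simple way to see the plans Catalyst generates for a query.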