What are the advantages of Datasets in Spark?

    • #6410
      DataFlair Team
      Moderator

      What are the benefits of Datasets in Apache Spark?

    • #6411
      DataFlair Team
      Moderator

      1) Static typing:
      Because a Dataset is statically typed, a developer can catch many errors at compile time instead of at run time, which saves time and cost.
      2) Type safety:
      The typed Dataset API is expressed in terms of lambda functions and typed JVM objects, so any mismatch of typed parameters is detected at compile time.
      Analysis errors, such as referring to a field that does not exist, are also caught at compile time when using the typed API,
      rather than surfacing when the job runs.
      3) Performance and optimization:
      The Dataset API is built on top of the Spark SQL engine. It uses Catalyst to generate optimized logical and physical query plans, and encoders store objects in Tungsten's compact binary format, which provides both space and speed efficiency.
      4) High-level processing:
      Datasets are a good fit for workloads that involve high-level expressions, filters, maps, aggregations, averages, sums,
      SQL queries, and columnar access, and for applying lambda functions to semi-structured data.
      5) Rich semantics:
      Datasets provide rich semantics, high-level abstractions, and domain-specific APIs.
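
      To make points 1 to 3 concrete, here is a minimal Scala sketch. It assumes the spark-sql library is on the classpath and that a local master is acceptable; the Person case class, the DatasetBenefitsSketch object, and the sample rows are hypothetical names and data chosen only for illustration.

```scala
import org.apache.spark.sql.{Dataset, Encoders, SparkSession}

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int)

object DatasetBenefitsSketch {

  // Typed transformations: each lambda receives a Person, so a typo such as
  // p.agee, or comparing p.name with a number, is rejected by the compiler.
  def adultNames(people: Dataset[Person]): Dataset[String] =
    people
      .filter((p: Person) => p.age >= 18)
      .map((p: Person) => p.name)(Encoders.STRING)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dataset-benefits-sketch")
      .master("local[*]") // assumption: run locally for the demo
      .getOrCreate()
    import spark.implicits._ // supplies the encoder needed by toDS()

    // Made-up sample data.
    val people: Dataset[Person] =
      Seq(Person("Ann", 34), Person("Bob", 12), Person("Cid", 51)).toDS()

    val adults = adultNames(people)
    adults.show()

    // Prints the logical and physical plans that Catalyst produced.
    adults.explain(true)

    spark.stop()
  }
}
```

      By contrast, an untyped call such as people.toDF().select("agee") compiles and only fails with an AnalysisException once Spark analyzes the query at run time.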

      For a complete introduction to Datasets and all of their features, see the Spark Dataset Tutorial.
