A Dataset is an immutable, strongly typed collection of objects that are mapped to a relational schema.
At the core of the Dataset API is the encoder, which is responsible for converting between JVM objects and
their tabular representation. The tabular representation is stored in Spark’s internal binary format, which allows operations to be carried out directly on serialized data and improves memory utilization. Spark can also automatically generate encoders for a wide variety of types, including primitive types (e.g. String, Integer, Long) and Scala case classes. In addition, the Dataset API offers many functional transformations (e.g. map, flatMap, filter).
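
The ideas above can be illustrated with a minimal Scala sketch. It assumes the `spark-sql` library is on the classpath and that a local master is acceptable for experimentation. The `Person` case class, the `DatasetSketch` object, and the sample data are hypothetical names introduced only for this example. The sketch shows an encoder derived for a case class, a typed Dataset built from local objects, and the `filter`, `map`, and `flatMap` transformations.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical record type, used only for illustration.
// Defined at the top level so Spark can derive an encoder for it.
case class Person(name: String, age: Long)

object DatasetSketch {

  // Builds a typed Dataset from local objects and applies functional transformations.
  def adultNames(spark: SparkSession, people: Seq[Person]): Array[String] = {
    // Brings implicit encoders for primitives and case classes into scope.
    import spark.implicits._

    val ds: Dataset[Person] = people.toDS()

    ds.filter(_.age >= 18) // keep only adults (typed lambda, checked at compile time)
      .map(_.name)         // Dataset[Person] => Dataset[String]
      .collect()           // bring the results back to the driver
  }

  // flatMap may emit zero or more output elements per input element.
  def nameVariants(spark: SparkSession, people: Seq[Person]): Array[String] = {
    import spark.implicits._
    people.toDS()
      .flatMap(p => Seq(p.name, p.name.toUpperCase))
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local, in-process Spark session is sufficient for this sketch.
    val spark = SparkSession.builder()
      .appName("dataset-sketch")
      .master("local[*]")
      .getOrCreate()

    // An encoder can also be requested explicitly; its schema is the
    // relational (tabular) view of the case class.
    val personEncoder: Encoder[Person] = Encoders.product[Person]
    println(personEncoder.schema.treeString)

    val sample = Seq(Person("Ann", 34L), Person("Bob", 17L))
    adultNames(spark, sample).foreach(println)
    nameVariants(spark, sample).foreach(println)

    spark.stop()
  }
}
```

Importing `spark.implicits._` is the usual way to let the compiler find encoders automatically. The explicit `Encoders.product[Person]` call appears here only to make the schema mapping visible.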