Yes, Apache Pig handles both data with a schema and schema-less data. Here is how:
– If the schema includes only the field name, the field's data type defaults to bytearray.
– If a field is named, we can access it either by its name or by positional notation. If the field name is missing, we can access it only by positional notation, i.e. $ followed by the index number.
– When an operation combines relations (JOIN, COGROUP, and so on), the resulting relation has a null schema if any of the input relations is missing a schema.
– Further, if the schema is null, Pig treats every field as a bytearray and determines the field's real data type dynamically.
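
The points above can be tried out with a short Pig Latin script. This is only a minimal sketch: the file names (students.txt, scores.txt), the comma delimiter, and the relation and field names are assumptions made up for illustration, not part of any real dataset.

```pig
-- Hypothetical input files; names and layout are assumptions for illustration.

-- Schema with field names only: each field defaults to bytearray.
students = LOAD 'students.txt' USING PigStorage(',') AS (name, age);
DESCRIBE students;   -- both fields should be reported as bytearray

-- Named fields can be referenced by name or by position.
by_name = FOREACH students GENERATE name, age;
by_pos  = FOREACH students GENERATE $0, $1;

-- No schema at all: only positional notation is available.
scores = LOAD 'scores.txt' USING PigStorage(',');
first_two = FOREACH scores GENERATE $0, $1;
DESCRIBE scores;     -- Pig should report that the schema is unknown

-- Combining a relation that has a schema with one that does not
-- gives a result whose schema is null.
joined = JOIN students BY $0, scores BY $0;
DESCRIBE joined;     -- schema should again be unknown
```

Running the DESCRIBE statements in the Grunt shell is an easy way to check which schema, if any, Pig has attached to each relation.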