How does Apache Pig deal with schema and schema-less data?

  • Author
    Posts
    • #4608
      DataFlair Team
      Spectator

      How does Pig deal with schema and schema-less data?

    • #4609
      DataFlair Team
      Spectator

      Apache Pig can work with data both with and without a schema. Here is how it handles each case:

      – If the schema gives only the field names and no types, Pig treats each field as a bytearray by default.
      – If a field has a name, we can access it either by that name or by positional notation, i.e. $ followed by the index number ($0, $1, and so on). If the field has no name, we can access it only by positional notation.
      – When an operation combines several relations (JOIN, COGROUP, and so on), the resulting relation has a null schema if any of the input relations has no schema.
      – When the schema is null, Pig treats every field as a bytearray and determines the real data type of each field dynamically at run time.
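
The points above can be sketched in a short Pig Latin script. This is only an illustration: the file name `students.txt`, its tab-separated layout, and the field names `name`, `age`, and `gpa` are assumptions made up for the example, not something from the original question.

```pig
-- Hypothetical input: students.txt, tab-separated, e.g. "John<TAB>18<TAB>4.0"

-- Schema with names and types: fields keep the declared types.
A = LOAD 'students.txt' AS (name:chararray, age:int, gpa:float);
DESCRIBE A;

-- Schema with names only: DESCRIBE should report every field as bytearray.
B = LOAD 'students.txt' AS (name, age, gpa);
DESCRIBE B;

-- Named fields can be referenced by name or by position.
B1 = FOREACH B GENERATE name, $1;

-- No schema at all: DESCRIBE should report that the schema is unknown,
-- and fields can be referenced only by position.
C = LOAD 'students.txt';
DESCRIBE C;
C1 = FOREACH C GENERATE $0, $2;

-- Fields of a schema-less relation are bytearrays; an explicit cast
-- states the type we expect before doing arithmetic on them.
C2 = FOREACH C GENERATE $0, (int)$1 + 1;

-- Joining a relation that has a schema with one that has none:
-- the result should have a null (unknown) schema.
D = JOIN A BY name, C BY $0;
DESCRIBE D;
```

Running the DESCRIBE statements in the Grunt shell is a quick way to check which of these cases a given relation falls into before writing further transformations against it.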
