Pig Latin Operators and Statements – A Complete Guide

1. Objective

In our previous blog, we covered the Apache Pig introduction and the Pig architecture in detail. This article covers the basics of Pig Latin operators: arithmetic, comparison, type construction, and relational operators. We will also discuss Pig Latin statements, with an example.

2. What is Pig Latin?

Pig Latin is the language used to analyze data in Hadoop with Apache Pig. An interpreter layer transforms Pig Latin statements into MapReduce jobs, which Hadoop then processes. Pig Latin is a simple language with SQL-like semantics, which makes it easy to use productively. It has a rich set of functions for data manipulation. Furthermore, you can extend these by writing user-defined functions (UDFs) in Java.

3. Pig Latin Operators

a. Arithmetic Operators

These Pig Latin operators are the basic mathematical operators. Each entry below gives the operator, its description, and an example.

  • + (Addition): Adds the values on either side of the operator. Example: if a = 10 and b = 30, a + b gives 40.
  • - (Subtraction): Subtracts the right-hand operand from the left-hand operand. Example: if a = 40 and b = 30, a - b gives 10.
  • * (Multiplication): Multiplies the values on either side of the operator. Example: if a = 40 and b = 30, a * b gives 1200.
  • / (Division): Divides the left-hand operand by the right-hand operand. Example: if a = 40 and b = 20, a / b gives 2.
  • % (Modulus): Divides the left-hand operand by the right-hand operand and returns the remainder. Example: if a = 40 and b = 30, a % b gives 10.
  • ? : (Bincond): Evaluates a Boolean expression and takes three operands, in the form x = (expression) ? value1 if true : value2 if false. Example: b = (a == 1) ? 40 : 20; if a = 1 the value of b is 40, and if a != 1 the value of b is 20.
  • CASE WHEN THEN ELSE END (Case): Equivalent to a nested bincond. Example: CASE f2 % 2 WHEN 0 THEN 'even' WHEN 1 THEN 'odd' END
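
The arithmetic operators above can be combined in a single FOREACH statement. The following is a minimal sketch, not a definitive script: the file name numbers.txt, its two-integer layout, and all aliases are assumptions made for illustration. The CASE expression assumes a Pig release that supports it (0.12 or later).

```pig
-- Hypothetical input: numbers.txt holds two comma-separated integers per line.
nums = LOAD 'numbers.txt' USING PigStorage(',') AS (a:int, b:int);

-- Apply each arithmetic operator to every row.
-- a / b is integer division here because both fields are declared as int.
-- The bincond and the CASE expression are each wrapped in parentheses.
calc = FOREACH nums GENERATE
    a + b AS total,
    a - b AS difference,
    a * b AS product,
    a / b AS quotient,
    a % b AS remainder,
    ((a > b) ? 'a is larger' : 'a is not larger') AS comparison,
    (CASE a % 2 WHEN 0 THEN 'even' ELSE 'odd' END) AS parity;

DUMP calc;
```

Giving each generated field a name with AS keeps the output schema readable when you run DESCRIBE on the result.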

b. Comparison Operators

The comparison operators of Pig Latin are listed below, each with its description and an example.

  • == (Equal): Checks whether the values of the two operands are equal. If they are, the condition is true. Example: if a = 10 and b = 20, then (a == b) is not true.
  • != (Not equal): Checks whether the values of the two operands differ. If the values are equal, the condition is false; otherwise it is true. Example: if a = 10 and b = 20, then (a != b) is true.
  • > (Greater than): Checks whether the value of the left operand is greater than that of the right operand. If it is, the condition is true. Example: if a = 10 and b = 20, then (a > b) is not true.
  • < (Less than): Checks whether the value of the left operand is less than that of the right operand. If it is, the condition is true. Example: if a = 10 and b = 20, then (a < b) is true.
  • >= (Greater than or equal to): Checks whether the value of the left operand is greater than or equal to that of the right operand. If it is, the condition is true. Example: if a = 20 and b = 50, then (a >= b) is not true.
  • <= (Less than or equal to): Checks whether the value of the left operand is less than or equal to that of the right operand. If it is, the condition is true. Example: if a = 20 and b = 20, then (a <= b) is true.
  • matches (Pattern matching): Checks whether the string on the left-hand side matches the regular-expression constant on the right-hand side. Example: f1 matches '.*df.*'
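
Comparison operators are most often used as the condition of a FILTER statement. The sketch below assumes a hypothetical file people.txt with a name and an age on each line; the file, the field names, and the aliases are illustrative only.

```pig
-- Hypothetical input: people.txt holds name,age per line.
people = LOAD 'people.txt' USING PigStorage(',') AS (name:chararray, age:int);

-- Keep rows whose age is at least 18.
adults = FILTER people BY age >= 18;

-- Keep rows whose age is anything other than 30.
not_thirty = FILTER people BY age != 30;

-- Keep rows whose name is exactly 'Dataflair'.
exact = FILTER people BY name == 'Dataflair';

-- Keep rows whose name contains the substring 'df'.
contains_df = FILTER people BY name matches '.*df.*';
```

Note that matches uses Java regular expressions and the pattern has to match the whole string, which is why the example wraps df in .* on both sides.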

c. Type Construction Operators

The following list describes the type construction operators of Pig Latin.

  • () (Tuple constructor): Constructs a tuple. Example: (Dataflair, 20)
  • {} (Bag constructor): Constructs a bag. Example: {(Dataflair, 10), (training, 25)}
  • [] (Map constructor): Constructs a map. Example: [name#DF, age#12]
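
The same constructors can be applied to fields inside a FOREACH statement. This is a small sketch along the lines of the examples in the Apache Pig documentation; the file students.txt and its schema are assumptions for illustration.

```pig
-- Hypothetical input: students.txt holds name,age,gpa per line.
students = LOAD 'students.txt' USING PigStorage(',')
           AS (name:chararray, age:int, gpa:float);

-- Tuple constructor: wraps name and age into one tuple field.
as_tuple = FOREACH students GENERATE (name, age);

-- Bag constructor: builds a bag holding a single (name, age) tuple.
as_bag = FOREACH students GENERATE {(name, age)};

-- Map constructor: builds a map with name as the key and gpa as the value.
as_map = FOREACH students GENERATE [name, gpa];
```

When a map is built from fields, the key and value are separated by a comma, whereas a map constant is written with #, as in [name#DF, age#12].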

d. Relational Operators

The following list describes the relational operators of Pig Latin, grouped by purpose.

Loading and Storing

  • LOAD: Loads data from the file system (local/HDFS) into a relation.
  • STORE: Stores a relation to the file system (local/HDFS).

Filtering

  • FILTER: Removes unwanted rows from a relation.
  • DISTINCT: Removes duplicate rows from a relation.
  • FOREACH, GENERATE: Transforms the data based on columns of data.
  • STREAM: Transforms a relation using an external program.

Grouping and Joining

  • JOIN: Joins two or more relations.
  • COGROUP: Groups the data in two or more relations.
  • GROUP: Groups the data in a single relation.
  • CROSS: Creates the cross product of two or more relations.

Sorting

  • ORDER: Arranges a relation in sorted order based on one or more fields.
  • LIMIT: Returns a limited number of tuples from a relation.

Combining and Splitting

  • UNION: Combines two or more relations into a single relation.
  • SPLIT: Splits a single relation into two or more relations.

Diagnostic Operators

  • DUMP: Prints the contents of a relation on the console.
  • DESCRIBE: Describes the schema of a relation.
  • EXPLAIN: Shows the logical, physical, or MapReduce execution plans used to compute a relation.
  • ILLUSTRATE: Shows the step-by-step execution of a series of statements.
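
Several of the relational operators above are typically chained into one data flow. The following sketch counts records per city and stores the five largest groups. The input file sample_data.txt, its schema, the output directory top_cities, and all aliases are assumptions chosen for illustration.

```pig
-- Hypothetical input: sample_data.txt holds id,name,contact,city per line.
sample_data = LOAD 'sample_data.txt' USING PigStorage(',')
              AS (id:int, name:chararray, contact:chararray, city:chararray);

-- FILTER: drop rows that have no city.
with_city = FILTER sample_data BY city IS NOT NULL;

-- GROUP: collect the rows of each city into one bag.
by_city = GROUP with_city BY city;

-- FOREACH ... GENERATE: emit one row per city with its record count.
counts = FOREACH by_city GENERATE group AS city, COUNT(with_city) AS residents;

-- ORDER and LIMIT: keep the five cities with the most records.
ordered  = ORDER counts BY residents DESC;
top_five = LIMIT ordered 5;

-- STORE: write the result to a hypothetical output directory.
STORE top_five INTO 'top_cities' USING PigStorage(',');
```

Each statement takes the relation produced by the previous one, so the script reads top to bottom as a pipeline.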

4.  Pig Latin – Statements

Statements are the basic constructs for processing data with Pig Latin.

  • Statements work with relations and can include expressions and schemas.
  • Every statement ends with a semicolon (;).
  • Statements perform different operations using the Pig Latin operators.
  • Except for LOAD and STORE, a Pig Latin statement takes a relation as input and produces another relation as output.
  • When you enter a LOAD statement in the Grunt shell, Pig only performs semantic checking. To see the contents of the relation, use the DUMP operator. The MapReduce job that actually loads the data runs only after the DUMP operation.

For Example
The following Pig Latin statement loads data into Apache Pig.

grunt> Sample_data = LOAD 'sample_data.txt' USING PigStorage(',') AS
( id:int, name:chararray, contact:chararray, city:chararray );
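
To illustrate that the LOAD statement alone does not run a job, the sketch below follows it with two diagnostic operators. It reuses the relation from the example above and is written as a script, without the grunt> prompt.

```pig
Sample_data = LOAD 'sample_data.txt' USING PigStorage(',')
              AS (id:int, name:chararray, contact:chararray, city:chararray);

-- Prints only the declared schema; the data is not read at this point.
DESCRIBE Sample_data;

-- Triggers execution and prints the tuples on the console.
DUMP Sample_data;
```

DESCRIBE is a cheap way to check a schema, while DUMP is what actually makes Pig read the file.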

So, this was all in Pig Latin Tutorial. Hope you like our explanation.

5. Conclusion

Thus, in this Pig Latin tutorial, we discussed how the Pig Latin language is used to analyze data in Hadoop and how its statements are transformed into MapReduce jobs. We also saw that it offers a set of operators and functions for data manipulation. Finally, Pig Latin statements are the basic constructs for data processing.