Pig Latin Operators and Statements – A Complete Guide
1. Objective
In our previous blog, we covered the introduction to Apache Pig and the Pig architecture in detail. This article covers the basics of Pig Latin operators: arithmetic, comparison, type construction, and relational operators. We will also discuss Pig Latin statements, with an example.
2. What is Pig Latin?
Pig Latin is the language used to analyze data in Hadoop with Apache Pig. An interpreter layer transforms Pig Latin statements into MapReduce jobs, and Hadoop then processes these jobs. Pig Latin is a simple language with SQL-like semantics, so anyone familiar with SQL can use it productively. Pig Latin has a rich set of built-in functions for data manipulation. Furthermore, it is extensible: you can write user-defined functions (UDFs) in Java.
3. Pig Latin Operators
a. Arithmetic Operators
These Pig Latin operators are the basic mathematical operators.
Operator  Description  Example
+  Addition − It adds the values on either side of the operator.  If a = 10 and b = 30, a + b gives 40.
-  Subtraction − It subtracts the right-hand operand from the left-hand operand.  If a = 40 and b = 30, a - b gives 10.
*  Multiplication − It multiplies the values on either side of the operator.  If a = 40 and b = 30, a * b gives 1200.
/  Division − It divides the left-hand operand by the right-hand operand.  If a = 40 and b = 20, a / b gives 2.
%  Modulus − It divides the left-hand operand by the right-hand operand and returns the remainder.  If a = 40 and b = 30, a % b gives 10.
? :  Bincond − It evaluates a Boolean expression and takes three operands: variable x = (expression) ? value1 if true : value2 if false.  b = (a == 1) ? 40 : 20; if a = 1 the value of b is 40, and if a != 1 the value of b is 20.
CASE WHEN THEN ELSE END  Case − This operator is equivalent to a nested bincond.  CASE f2 % 2 WHEN 0 THEN 'even' WHEN 1 THEN 'odd' END
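To see these operators in context, here is a minimal sketch that applies them inside a FOREACH ... GENERATE statement. The file name numbers.txt, the relation names, and the field names a and b are assumptions made only for this illustration. The CASE expression also assumes a Pig release that supports it (0.12 or later).

[php]-- Hypothetical input file: two comma-separated integers per line, e.g. 40,30
nums = LOAD 'numbers.txt' USING PigStorage(',') AS (a:int, b:int);

calc = FOREACH nums GENERATE
           a + b AS total,
           a - b AS difference,
           a * b AS product,
           a / b AS quotient,       -- integer division, because a and b are ints
           a % b AS remainder,
           ((a == 1) ? 40 : 20) AS picked,   -- bincond, wrapped in parentheses
           (CASE a % 2 WHEN 0 THEN 'even' WHEN 1 THEN 'odd' END) AS parity;

DUMP calc;[/php]

For the input row 40,30 the product is 1200 and the remainder is 10, which matches the examples in the table.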
b. Comparison Operators
This table contains the comparison operators of Pig Latin.
Operator  Description  Example
==  Equal − It checks whether the values of two operands are equal. If they are, the condition is true.  If a = 10 and b = 20, then (a == b) is not true.
!=  Not equal − It checks whether the values of two operands are equal. If they are not equal, the condition is true; otherwise it is false.  If a = 10 and b = 20, then (a != b) is true.
>  Greater than − It checks whether the value of the left operand is greater than that of the right operand. If yes, the condition is true.  If a = 10 and b = 20, then (a > b) is not true.
<  Less than − It checks whether the value of the left operand is less than that of the right operand. If yes, the condition is true.  If a = 10 and b = 20, then (a < b) is true.
>=  Greater than or equal to − It checks whether the value of the left operand is greater than or equal to that of the right operand. If yes, the condition is true.  If a = 20 and b = 50, then (a >= b) is not true.
<=  Less than or equal to − It checks whether the value of the left operand is less than or equal to that of the right operand. If yes, the condition is true.  If a = 20 and b = 20, then (a <= b) is true.
matches  Pattern matching − It checks whether the string on the left-hand side matches the regular-expression constant on the right-hand side.  f1 matches '.*df.*'
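Comparison operators are most often used in the condition of a FILTER statement. The sketch below shows a few of them; the file people.txt and its fields are hypothetical and serve only as an illustration.

[php]-- Hypothetical input file: id,name,age per line
people = LOAD 'people.txt' USING PigStorage(',') AS (id:int, name:chararray, age:int);

adults     = FILTER people BY age >= 18;               -- greater than or equal to
not_first  = FILTER people BY id != 1;                 -- not equal
df_names   = FILTER people BY name matches '.*df.*';   -- pattern matching
young_df   = FILTER people BY (age < 30) AND (name matches '.*df.*');

DUMP young_df;[/php]

Note that matches uses Java regular expressions and has to match the whole string, which is why the pattern starts and ends with .* here.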
c. Type Construction Operators
The following table describes the type construction operators of Pig Latin.
Operator  Description  Example
()  Tuple constructor operator − This operator constructs a tuple.  (Dataflair, 20)
{}  Bag constructor operator − To construct a bag, we use this operator.  {(Dataflair, 10), (training, 25)}
[]  Map constructor operator − This operator constructs a map.  [name#DF, age#12]
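As a rough sketch, the three constructors can be used inside FOREACH ... GENERATE to build complex values from simple fields. The file students.txt and its schema are assumptions for this example only.

[php]-- Hypothetical input file: name,age,gpa per line
students = LOAD 'students.txt' USING PigStorage(',') AS (name:chararray, age:int, gpa:float);

as_tuple = FOREACH students GENERATE (name, age);     -- tuple built from two fields
as_bag   = FOREACH students GENERATE {(name, age)};   -- bag holding one tuple
as_map   = FOREACH students GENERATE [name, gpa];     -- map: name is the key, gpa the value

DUMP as_map;[/php]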
d. Relational Operations
The following table describes the relational operators of Pig Latin.
Operator  Description
Loading and Storing
LOAD  It loads data from the file system (local/HDFS) into a relation.
STORE  It stores a relation to the file system (local/HDFS).
Filtering
FILTER  It removes unwanted rows from a relation.
DISTINCT  It removes duplicate rows from a relation.
FOREACH, GENERATE  It transforms the data based on the columns of data.
STREAM  It transforms a relation using an external program.
Grouping and Joining
JOIN  It joins two or more relations.
COGROUP  It groups the data in two or more relations.
GROUP  It groups the data in a single relation.
CROSS  It creates the cross product of two or more relations.
Sorting
ORDER  It arranges a relation in sorted order based on one or more fields.
LIMIT  It gets a limited number of tuples from a relation.
Combining and Splitting
UNION  It combines two or more relations into one relation.
SPLIT  It splits a single relation into two or more relations.
Diagnostic Operators
DUMP  It prints the contents of a relation on the console.
DESCRIBE  It describes the schema of a relation.
EXPLAIN  It shows the logical, physical, and MapReduce execution plans used to compute a relation.
ILLUSTRATE  It shows the step-by-step execution of a series of statements on a sample of the data.
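To show how several relational operators fit together, here is a small sketch of a typical pipeline: load, filter, group, aggregate, sort, limit, and store. The input file, the output directory top_cities, and the relation names are hypothetical.

[php]-- Hypothetical input file: id,name,contact,city per line
customers = LOAD 'sample_data.txt' USING PigStorage(',')
            AS (id:int, name:chararray, contact:chararray, city:chararray);

valid    = FILTER customers BY id > 0;                   -- drop unwanted rows
by_city  = GROUP valid BY city;                          -- one group per city
counts   = FOREACH by_city GENERATE group AS city, COUNT(valid) AS total;
ranked   = ORDER counts BY total DESC;                   -- largest cities first
top5     = LIMIT ranked 5;                               -- keep only five tuples

-- Write the result to a (hypothetical) output directory
STORE top5 INTO 'top_cities' USING PigStorage(',');[/php]

Each statement takes the relation produced by an earlier statement as its input, so the script reads as a step-by-step data flow.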
4. Pig Latin – Statements
Statements are the basic constructs for processing data with Pig Latin.
 Statements work with relations and include expressions and schemas.
 Every statement terminates with a semicolon (;).
 Statements perform their operations using the Pig Latin operators.
 Except for LOAD and STORE, a Pig Latin statement takes a relation as input and produces another relation as output.
 When we enter a LOAD statement in the Grunt shell, Pig only performs semantic checking. To view the contents of the relation, we use the DUMP operator. The MapReduce job that actually loads the data runs only when the DUMP operation executes.
For Example
The following Pig Latin statement loads data into Apache Pig.
[php]grunt> Sample_data = LOAD 'sample_data.txt' USING PigStorage(',') AS
( id:int, name:chararray, contact:chararray, city:chararray );[/php]
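As a short sketch of the lazy execution described above, the script below repeats the LOAD statement and then adds DESCRIBE and DUMP. It assumes that sample_data.txt exists in the current file system location.

[php]-- At this point Pig only parses and checks the statement; no job runs yet.
Sample_data = LOAD 'sample_data.txt' USING PigStorage(',') AS
( id:int, name:chararray, contact:chararray, city:chararray );

-- Prints the schema of the relation; this does not start a job either.
DESCRIBE Sample_data;

-- Triggers the actual execution and prints the tuples on the console.
DUMP Sample_data;[/php]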
So, that was all for this Pig Latin tutorial. We hope you liked our explanation.
5. Conclusion
Thus, in this Pig Latin tutorial, we discussed how the Pig Latin language is used to analyze data in Hadoop and how its statements are transformed into MapReduce jobs. We also saw that it offers a set of operators and functions for data manipulation. Finally, we saw that Pig Latin statements are the basic constructs for data processing.