Hive Join | HiveQL Select Joins Query | Types of Join in Hive
In Apache Hive, for combining specific fields from two tables by using values common to each one we use Hive Join – HiveQL Select Joins Query. However, we need to know the syntax of Hive Join for implementation purpose.
So, in this article, “Hive Join – HiveQL Select Joins Query and its types” we will cover syntax of joins in hive. Also, we will learn an example of Hive Join to understand well.
Moreover, there are several types of Hive join – HiveQL Select Joins: Hive inner join, hive left outer join, hive right outer join, and hive full outer join. We will also learn Hive Join tables in depth.
Apache Hive Join – HiveQL Select Joins Query
Basically, for combining specific fields from two tables by using values common to each one we use Hive JOIN clause.
In other words, to combine records from two or more tables in the database we use JOIN clause. However, it is more or less similar to SQL JOIN. Also, we use it to combine rows from multiple tables.
Moreover, there are some points we need to observe about Hive Join:
- In Joins, only Equality joins are allowed.
- However, in the same query more than two tables can be joined.
- Basically, to offer more control over ON Clause for which there is no match LEFT, RIGHT, FULL OUTER joins exist in order.
- Also, note that Hive Joins are not Commutative
- Whether they are LEFT or RIGHT joins in Hive, even then Joins are left-associative.
a. Apache Hive Join Syntax
Following is a syntax of Hive Join – HiveQL Select Clause.
join_table:
table_reference JOIN table_factor [join_condition]
| table_reference {LEFT|RIGHT|FULL} [OUTER] JOIN table_reference
join_condition
| table_reference LEFT SEMI JOIN table_reference join_condition
| table_reference CROSS JOIN table_reference [join_condition]
b. Example of Join in Hive
Example of Hive Join – HiveQL Select Clause.
However, to understand well let’s suppose the following table named CUSTOMERS.
Table.1 Hive Join Example
ID | Name | Age | Address | Salary |
1 | Ross | 32 | Ahmedabad | 2000 |
2 | Rachel | 25 | Delhi | 1500 |
3 | Chandler | 23 | Kota | 2000 |
4 | Monika | 25 | Mumbai | 6500 |
5 | Mike | 27 | Bhopal | 8500 |
6 | Phoebe | 22 | MP | 4500 |
7 | Joey | 24 | Indore | 10000 |
Also, suppose another table ORDERS as follows:
Table.2 – Hive Join Example
OID | Date | Customer_ID | Amount |
102 | 2016-10-08 00:00:00 | 3 | 3000 |
100 | 2016-10-08 00:00:00 | 3 | 1500 |
101 | 2016-11-20 00:00:00 | 2 | 1560 |
103 | 2015-05-20 00:00:00 | 4 | 2060 |
Type of Joins in Hive
Basically, there are 4 types of Hive Join. Such as:
- Inner join in Hive
- Left Outer Join in Hive
- Right Outer Join in Hive
- Full Outer Join in Hive
So, let’s discuss each Hive join in detail.
a. Inner Join
Basically, to combine and retrieve the records from multiple tables we use Hive Join clause. Moreover, in SQL JOIN is as same as OUTER JOIN. Moreover, by using the primary keys and foreign keys of the tables JOIN condition is to be raised.
Furthermore, the below query executes JOIN the CUSTOMER and ORDER tables. Then further retrieves the records:
hive> SELECT c.ID, c.NAME, c.AGE, o.AMOUNT
FROM CUSTOMERS c JOIN ORDERS o
ON (c.ID = o.CUSTOMER_ID);
Moreover, we get to see the following response, on the successful execution of the query:
Table.3 – Inner Join in Hive
ID | Name | Age | Amount |
3 | Chandler | 23 | 1300 |
3 | Chandler | 23 | 1500 |
2 | Rachel | 25 | 1560 |
4 | Monika | 25 | 2060 |
b. Left Outer Join
On defining HiveQL Left Outer Join, even if there are no matches in the right table it returns all the rows from the left table.
To be more specific, even if the ON clause matches 0 (zero) records in the right table, then also this Hive JOIN still returns a row in the result. Although, it returns with NULL in each column from the right table.
In addition, it returns all the values from the left table. Also, the matched values from the right table, or NULL in case of no matching JOIN predicate.
However, the below query shows LEFT OUTER JOIN between CUSTOMER as well as ORDER tables:
hive> SELECT c.ID, c.NAME, o.AMOUNT, o.DATE FROM CUSTOMERS c LEFT OUTER JOIN ORDERS o ON (c.ID = o.CUSTOMER_ID);
Moreover, we get to see the following response, on the successful execution of the HiveQL Select query:
Table.4 – Left Outer Join in Hive
ID | Name | Amount | Date |
1 | Ross | NULL | NULL |
2 | Rachel | 1560 | 2016-11-20 00:00:00 |
3 | Chandler | 3000 | 2016-10-08 00:00:00 |
3 | Chandler | 1500 | 2016-10-08 00:00:00 |
4 | Monika | 2060 | 2015-05-20 00:00:00 |
5 | Mike | NULL | NULL |
6 | Phoebe | NULL | NULL |
7 | Joey | NULL | NULL |
c. Right Outer Join
Basically, even if there are no matches in the left table, HiveQL Right Outer Join returns all the rows from the right table.
To be more specific, even if the ON clause matches 0 (zero) records in the left table, then also this Hive JOIN still returns a row in the result. Although, it returns with NULL in each column from the left table
In addition, it returns all the values from the right table. Also, the matched values from the left table or NULL in case of no matching join predicate.
However, the below query shows RIGHT OUTER JOIN between the CUSTOMER as well as ORDER tables.
notranslate"> hive> SELECT c.ID, c.NAME, o.AMOUNT, o.DATE FROM CUSTOMERS c RIGHT OUTER JOIN ORDERS o ON (c.ID = o.CUSTOMER_ID);
Moreover, we get to see the following response, on the successful execution of the HiveQL Select query:
Table.5 – Right Outer Join in Hive
ID | Name | Amount | Date |
3 | Chandler | 1300 | 2016-10-08 00:00:00 |
3 | Chandler | 1500 | 2016-10-08 00:00:00 |
2 | Rachel | 1560 | 2016-11-20 00:00:00 |
4 | Monika | 2060 | 2015-05-20 00:00:00 |
d. Full Outer Join
The major purpose of this HiveQL Full outer Join is it combines the records of both the left and the right outer tables which fulfills the Hive JOIN condition. Moreover, this joined table contains either all the records from both the tables or fills in NULL values for missing matches on either side.
However, the below query shows FULL OUTER JOIN between CUSTOMER as well as ORDER tables:
hive> SELECT c.ID, c.NAME, o.AMOUNT, o.DATE FROM CUSTOMERS c FULL OUTER JOIN ORDERS o ON (c.ID = o.CUSTOMER_ID);
Moreover, we get to see the following response, on the successful execution of the query:
Table.6 – Full Outer Join in Hive
ID | Name | Amount | Date |
1 | Ross | NULL | NULL |
2 | Rachel | 1560 | 2016-11-20 00:00:00 |
3 | Chandler | 3000 | 2016-10-08 00:00:00 |
3 | Chandler | 1500 | 2016-10-08 00:00:00 |
4 | Monika | 2060 | 2015-05-20 00:00:00 |
5 | Mike | NULL | NULL |
6 | Phoebe | NULL | NULL |
7 | Joey | NULL | NULL |
3 | Chandler | 3000 | 2016-10-08 00:00:00 |
3 | Chandler | 1500 | 2016-10-08 00:00:00 |
2 | Rachel | 1560 | 2016-11-20 00:00:00 |
4 | Monika | 2060 | 2015-05-20 00:00:00 |
This was all about HiveQL Select – Apache Hive Join Tutorial. Hope you like our explanation of Types of Joins in Hive.
Conclusion
As a result, we have seen what is Apache Hive Join and possible types of Join in Hive- HiveQL Select. Also, we have seen several examples to understand Joins in Hive well. Moreover, we hope this article “Hive Join” will help a lot. Still, if any doubt occurs feel free to ask in the comment section.
Did you like this article? If Yes, please give DataFlair 5 Stars on Google
correct your inner join example output you given Address column name insteed of Amount name
Thank-you for mentioning the mistake. we appreciate you notice that. Also, we have updated it correctly. Hope it helps!
Right outer join and Full outer join example outputs are not correct.
The result of Full Outer Join show here is not correct
There should not be any duplicates as shown below:
SELECT
A.ID, A.NAME, A.AGE, B.ODID, B.ODATE, B.AMOUNT
FROM
SUBH_CUST1 A
FULL OUTER JOIN
SUBH_ORDER1 B
ON A.ID = B.CUST_ID;
+——-+———–+——–+———+————————+———–+
| a.id | a.name | a.age | b.odid | b.odate | b.amount |
+——-+———–+——–+———+————————+———–+
| 1 | Ross | 32 | NULL | NULL | NULL |
| 2 | Rachel | 25 | 101 | 2016-11-20 00:00:00.0 | 1560 |
| 3 | Chandler | 23 | 100 | 2016-10-08 00:00:00.0 | 1500 |
| 3 | Chandler | 23 | 102 | 2016-10-08 00:00:00.0 | 3000 |
| 4 | Monika | 25 | 103 | 2015-05-20 00:00:00.0 | 2060 |
| 5 | Mike | 27 | NULL | NULL | NULL |
| 6 | Phoebe | 22 | NULL | NULL | NULL |
| 7 | Joey | 24 | NULL | NULL | NULL |
+——-+———–+——–+———+————————+———–+
i like it bro