Introduction to Impala UNION Clause with Example

1. Impala UNION Clause – Objective

When we need to combine the results of two queries in Impala, we use the Impala UNION clause. This article introduces the clause and covers its syntax, its types, and an example, so that you can understand it well.

So, let’s start the Impala UNION Clause tutorial.

2. Introduction to Impala UNION Clause

Basically, we use the Impala UNION clause to combine the result sets of multiple queries. By default, the result sets are combined as if the DISTINCT operator was applied, so duplicate rows are removed.

3. Syntax

The syntax for using the Impala UNION clause is:

query_1 UNION [DISTINCT | ALL] query_2

4. Usage

The UNION keyword by itself is the same as UNION DISTINCT. Prefer UNION ALL where practical, because eliminating duplicates can be a memory-intensive process for a large result set. This applies especially where duplicate values are acceptable, or when you know the different queries in the union will not produce any duplicates.
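
The difference between the two forms can be sketched with a small runnable example. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Impala (the UNION and UNION ALL semantics shown are standard SQL), and the tables t1 and t2 and their rows are made up for the demonstration.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; t1, t2 and their rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (city TEXT);
    CREATE TABLE t2 (city TEXT);
    INSERT INTO t1 VALUES ('delhi'), ('mumbai');
    INSERT INTO t2 VALUES ('mumbai'), ('kolkata');
""")

# Plain UNION removes duplicates: 'mumbai' appears only once.
distinct_rows = conn.execute(
    "SELECT city FROM t1 UNION SELECT city FROM t2"
).fetchall()

# UNION ALL skips de-duplication: 'mumbai' appears twice.
all_rows = conn.execute(
    "SELECT city FROM t1 UNION ALL SELECT city FROM t2"
).fetchall()

print(len(distinct_rows), len(all_rows))  # prints: 3 4
```

The extra pass that drops the repeated row is the work that UNION ALL avoids, which is why it is usually the cheaper choice on large result sets.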

In Impala 1.4 and higher, an ORDER BY clause applied to a UNION ALL or UNION query no longer requires a LIMIT clause. To make the ORDER BY and LIMIT clauses apply to the entire result set, turn the UNION query into a subquery, SELECT from the subquery, and put the ORDER BY clause at the end, outside the subquery.
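
The subquery pattern for sorting and limiting the whole combined result might look like the sketch below. As an assumption for the sake of a runnable example, it again uses Python's sqlite3 module in place of Impala, with hypothetical two-column students and users tables and an illustrative subquery alias named combined.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; tables and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER, name TEXT);
    CREATE TABLE users (id INTEGER, name TEXT);
    INSERT INTO students VALUES (1, 'shubham'), (2, 'monika'), (3, 'kajal');
    INSERT INTO users VALUES (1, 'Ankur'), (2, 'Shubham'), (3, 'vishal');
""")

# The UNION ALL lives inside a subquery; ORDER BY and LIMIT sit outside it,
# so they apply to the combined result rather than to a single branch.
query = """
    SELECT id, name FROM (
        SELECT id, name FROM students
        UNION ALL
        SELECT id, name FROM users
    ) AS combined
    ORDER BY id, name
    LIMIT 3
"""
top_rows = conn.execute(query).fetchall()

for row in top_rows:
    print(row)
```

Only three rows come back in total, drawn from both tables in id order, instead of three rows per table.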

5. Example

Let’s suppose we have a table named Students in the database my_db. Its contents are:

[quickstart.cloudera:21000] > select * from Students;
Query: select * from Students

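+----+-----------+-----+--------------+--------+
| id | name      | age | address      | salary |
+----+-----------+-----+--------------+--------+
| 1  | shubham   | 32  | delhi        | 20000  |
| 9  | Pulkit    | 23  | Gandhi nagar | 28000  |
| 2  | monika    | 25  | mumbai       | 15000  |
| 4  | revti     | 25  | indore       | 35000  |
| 7  | Vaishnavi | 25  | Goa          | 23000  |
| 6  | mehul     | 22  | hyderabad    | 32000  |
| 8  | Rishabh   | 22  | chennai      | 31000  |
| 5  | shreyash  | 23  | pune         | 30000  |
| 3  | kajal     | 27  | alirajpur    | 40000  |
+----+-----------+-----+--------------+--------+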

Fetched 9 row(s) in 0.59s
Similarly, assume we have another table named Users. Its contents are:

[quickstart.cloudera:21000] > select * from Users;
Query: select * from Users

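+----+---------+-----+----------+--------+
| id | name    | age | address  | salary |
+----+---------+-----+----------+--------+
| 3  | vishal  | 54  | Banglore | 55000  |
| 2  | Shubham | 44  | Banglore | 50000  |
| 4  | Mansi   | 64  | kolkata  | 60000  |
| 1  | Ankur   | 34  | kolkata  | 40000  |
+----+---------+-----+----------+--------+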

Fetched 4 row(s) in 0.59s
So, here is an example of the Impala UNION clause. Each of the two queries sorts the records of its table by id and limits the output to 3 rows. The UNION clause then combines the two result sets.

[quickstart.cloudera:21000] > select * from Students order by id limit 3
union select * from Users order by id limit 3;
On executing the above query, we get the following output.

Query: select * from Students order by id limit 3 union select
* from Users order by id limit 3

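+----+---------+-----+-----------+--------+
| id | name    | age | address   | salary |
+----+---------+-----+-----------+--------+
| 2  | monika  | 25  | mumbai    | 15000  |
| 3  | vishal  | 54  | Banglore  | 55000  |
| 1  | Ankur   | 34  | kolkata   | 40000  |
| 2  | Shubham | 44  | Banglore  | 50000  |
| 3  | kajal   | 27  | alirajpur | 40000  |
| 1  | shubham | 32  | Delhi     | 20000  |
+----+---------+-----+-----------+--------+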

Fetched 6 row(s) in 3.11s

6. Conclusion

In this tutorial, we covered the Impala UNION clause: its introduction, syntax, usage, and an example. Still, if you have any doubts, feel free to ask in the comment section.
