Impala ORDER BY Clause – Syntax & Example

1. Objective

In our last tutorial, we discussed Impala Group by Clause. Here, we are going to study Impala Order By Clause, helps to sort the data in an ascending or descending order, based on one or more columns in Impala. So, in this article, we will discuss, how we can sort data with Impala ORDER BY clause. Also, we will cover its syntax, usage, and example to understand it well.
So, let’s start Impala order By Clause.

Impala ORDER BY Clause

Impala ORDER BY Clause

2. Impala ORDER BY Clause

Basically, sorts the result set based on the values from one or more columns we use the familiar ORDER BY clause of a SELECT statement.
In other words, we use the Impala ORDER BY clause to sort the data in an ascending or descending order, based on one or more columns. However, it also happens that some databases sort the query results in ascending order by default.

However, the entire result set must be produced and transferred to one node before the sorting can happen hence it turns out as a relatively expensive operation for distributed queries. One more disadvantage is it demands more memory capacity than a query without ORDER BY.
Read about Impala Use Cases and Applications
Also, the query can appear slower, even if takes approximately the same time to finish with or without the ORDER BY clause. It happens because until all processing is finished, no results are available, rather than results coming back gradually as rows matching the WHERE clause are found. Hence, if we need the first N results from the sorted result set, that involves the LIMIT clause. That reduces network overhead and the memory requirement on the coordinator node.
It is also very important to note that the LIMIT clause is optional (rather than required) for queries that use the ORDER BY clause, in Impala 1.4.0 and higher. Because to perform the sort Impala automatically uses a temporary disk work area since the sort operation would otherwise exceed the Impala memory limit for a particular DataNode.

Get the most demanding skills of IT Industry - Learn Hadoop

3. A Syntax of Impala Order By Clause

Here, is the syntax of Impala ORDER BY clause, below;

ORDER BY col_ref [, col_ref ...] [ASC | DESC] [NULLS FIRST | NULLS LAST]
col_ref ::= column_name | integer_literal

There is one more way to use this clause. That is ORDER BY column_name. It means, if we want to sort by the first column of the result set, we can also specify ORDER BY 1. Here 1 is column_name. Similarly, we use  ORDER BY 2 to sort by the second column, and so on. But the only condition for it is, the number must be a numeric literal, not some other kind of constant expression.

However, make sure we can only use ORDER BY column_number when the query explicitly lists the columns in the SELECT list, not with SELECT * queries.
Let’s read about Impala SQL – Basics of Impala Query Language
Note: The query does succeeds even if the argument is some other expression, but the order of results is undefined.

Moreover, using the keywords ASC or DESC, we can arrange the data in the table in ascending or descending order respectively.

Also, there are two commands, NULLS FIRST and NULLS LAST. Basically, all the null values in the table are arranged in the top rows, if we use NULLS FIRST. Whereas, the rows containing null values will be arranged last if we use NULLS LAST.

Hadoop Quiz

4. Impala Order By Clause Example

For Example,

So, let us suppose we have a table named Employees in the database my_db. Its contents are −

[quickstart.cloudera:21000] > select * from Employees;
Query: select * from Employees
idnameageaddresssalary
3kajal27alirajpur40000
1shubham32delhi20000
2monika25mumbai15000
6mehul22hyderabad32000
4revti25indore35000
5shreyash23pune30000

Fetched 6 row(s) in 0.51s
Afterward, we will use the order by clause to arrange the data of Employees table in ascending order, of their id’s.
Let’s Discuss Steps to Install Impala on Linux
For Example,

[quickstart.cloudera:21000] > Select * from Employees ORDER BY id ASC;
On executing, the above query produces the following output.

Query: select * from Employees ORDER BY id ASC 

idnameageaddresssalary
1shubham32delhi20000
2monika25mumbai15000
3kajal27alirajpur40000
4revti25indore35000
5shreyash23pune30000
6mehul22hyderabad32000

Fetched 6 row(s) in 0.56s
After ascending order, we will see how to use the order by clause to arrange the data of Employees table in descending order.
For Example,

[quickstart.cloudera:21000] > Select * from Employees ORDER BY id desc;
On executing, the above query produces the following output.

Query: select * from Employees ORDER BY id desc

idnameageaddresssalary
6mehul22hyderabad32000
5shreyash23pune30000
4revti25indore35000
3kajal27alirajpur40000
2monika25mumbai15000
1shubham32delhi20000

Fetched 6 row(s) in 0.54s
So, this was all about Impala Order By Clause. Hope you like our explanation.

5. Conclusion

As a result, we have seen the whole concept of Impala ORDER BY clause. Still, if any doubt occurs in how to sort data, feel free to ask in the comment section.
Related Topic- Impala Quiz
For reference

Leave a Reply

Your email address will not be published. Required fields are marked *

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.