Impala ORDER BY Clause – Syntax & Example
In our last tutorial, we discussed Impala Group by Clause. Here, we are going to study Impala Order By Clause, helps to sort the data in an ascending or descending order, based on one or more columns in Impala. So, in this article, we will discuss, how we can sort data with Impala ORDER BY clause. Also, we will cover its syntax, usage, and example to understand it well.
So, let’s start Impala order By Clause.
2. Impala ORDER BY Clause
Basically, sorts the result set based on the values from one or more columns we use the familiar ORDER BY clause of a SELECT statement.
In other words, we use the Impala ORDER BY clause to sort the data in an ascending or descending order, based on one or more columns. However, it also happens that some databases sort the query results in ascending order by default.
However, the entire result set must be produced and transferred to one node before the sorting can happen hence it turns out as a relatively expensive operation for distributed queries. One more disadvantage is it demands more memory capacity than a query without ORDER BY.
Read about Impala Use Cases and Applications
Also, the query can appear slower, even if takes approximately the same time to finish with or without the ORDER BY clause. It happens because until all processing is finished, no results are available, rather than results coming back gradually as rows matching the WHERE clause are found. Hence, if we need the first N results from the sorted result set, that involves the LIMIT clause. That reduces network overhead and the memory requirement on the coordinator node.
It is also very important to note that the LIMIT clause is optional (rather than required) for queries that use the ORDER BY clause, in Impala 1.4.0 and higher. Because to perform the sort Impala automatically uses a temporary disk work area since the sort operation would otherwise exceed the Impala memory limit for a particular DataNode.
3. A Syntax of Impala Order By Clause
Here, is the syntax of Impala ORDER BY clause, below;
ORDER BY col_ref [, col_ref ...] [ASC | DESC] [NULLS FIRST | NULLS LAST] col_ref ::= column_name | integer_literal
There is one more way to use this clause. That is ORDER BY column_name. It means, if we want to sort by the first column of the result set, we can also specify ORDER BY 1. Here 1 is column_name. Similarly, we use ORDER BY 2 to sort by the second column, and so on. But the only condition for it is, the number must be a numeric literal, not some other kind of constant expression.
However, make sure we can only use ORDER BY column_number when the query explicitly lists the columns in the SELECT list, not with SELECT * queries.
Let’s read about Impala SQL – Basics of Impala Query Language
Note: The query does succeeds even if the argument is some other expression, but the order of results is undefined.
Moreover, using the keywords ASC or DESC, we can arrange the data in the table in ascending or descending order respectively.
Also, there are two commands, NULLS FIRST and NULLS LAST. Basically, all the null values in the table are arranged in the top rows, if we use NULLS FIRST. Whereas, the rows containing null values will be arranged last if we use NULLS LAST.
4. Impala Order By Clause Example
So, let us suppose we have a table named Employees in the database my_db. Its contents are −
[quickstart.cloudera:21000] > select * from Employees; Query: select * from Employees
Fetched 6 row(s) in 0.51s
Afterward, we will use the order by clause to arrange the data of Employees table in ascending order, of their id’s.
Let’s Discuss Steps to Install Impala on Linux
[quickstart.cloudera:21000] > Select * from Employees ORDER BY id ASC; On executing, the above query produces the following output.
Query: select * from Employees ORDER BY id ASC
Fetched 6 row(s) in 0.56s
After ascending order, we will see how to use the order by clause to arrange the data of Employees table in descending order.
[quickstart.cloudera:21000] > Select * from Employees ORDER BY id desc; On executing, the above query produces the following output.
Query: select * from Employees ORDER BY id desc
Fetched 6 row(s) in 0.54s
So, this was all about Impala Order By Clause. Hope you like our explanation.
As a result, we have seen the whole concept of Impala ORDER BY clause. Still, if any doubt occurs in how to sort data, feel free to ask in the comment section.
Related Topic- Impala Quiz