Impala GROUP BY Clause – Zero-Length Strings

Stay updated with the latest technology trends while you're on the move - Join DataFlair's Telegram Channel

1. Objective

It is possible to arrange identical data into groups in Impala. For that, we use Impala Group BY clause. So, in this article, we will discuss, how we can arrange data with Impala Group BY clause. Also, we will cover its syntax, usage as well as an example to understand it well.
So, let’s start Impala Group by Clause.

Impala GROUP BY Clause - Zero-Length Strings

Impala GROUP BY Clause – Zero-Length Strings

2. Impala GROUP BY Clause

Basically, Impala GROUP BY Clause specifies the GROUP BY clause in queries that use aggregation functions. Like COUNT(), SUM(), AVG(), MIN(), and MAX(). Moreover, specify the names of all the columns that do not participate in the aggregation operation in the GROUP BY clause.
In other words, to arrange identical data into groups, we use the Impala GROUP BY clause in collaboration with SELECT statement.
Let’s Learn Impala ORDER BY Clause

If these professionals can make a switch to Big Data, so can you:
Rahul Doddamani Story - DataFlair
Rahul Doddamani
Java → Big Data Consultant, JDA
Follow on
Mritunjay Singh Success Story - DataFlair
Mritunjay Singh
PeopleSoft → Big Data Architect, Hexaware
Follow on
Rahul Doddamani Success Story - DataFlair
Rahul Doddamani
Big Data Consultant, JDA
Follow on
I got placed, scored 100% hike, and transformed my career with DataFlair
Enroll now
Deepika Khadri Success Story - DataFlair
Deepika Khadri
SQL → Big Data Engineer, IBM
Follow on
DataFlair Web Services
You could be next!
Enroll now

3. Syntax of Group By Clause in Impala

Here, is the syntax of Impala GROUP BY clause, below;

select data from table_name Group BY col_name;

4. Complex Type Considerations

There are several complex data types available in CDH 5.5 / Impala 2.3 and higher. Like STRUCT, ARRAY, and MAP.
Basically, we cannot reference these columns directly in the ORDER BY clause. So, we use join notation to “unpack” the elements of the complex type when we query a complex type column. Moreover, we can include an ORDER BY clause to control the order in the result set of the scalar elements from the complex type within the join query.

5. Zero-Length Strings in Group By

Impala considers zero-length strings (“”), NULL, and space to all the different values for purposes of several clauses. Like DISTINCT and GROUP BY.

Read about Impala LIMIT Clause with syntax

Hadoop Quiz

6. Examples of Impala Group By Clause

Let’s Understand by Impala Group By example,

Let us suppose we have a table named Employees in the database my_db. Its contents are −

[quickstart.cloudera:21000] > select * from Employees;
Query: select * from Employees
idnameageaddresssalary
1shubham32delhi20000
2monika25mumbai15000
3kajal27alirajpur40000
4revti25indore35000
5shreyash23pune30000
6mehul22hyderabad32000

Fetched 6 row(s) in 0.51s
Further, using GROUP BY query, we can get the total amount of salary of each Employee.

[quickstart.cloudera:21000] > Select name, sum(salary) from Employees Group BY name;

On executing, the above query we get the following output.
Read about Impala UNION Clause with Example
Query: select name, sum(salary) from Employees Group BY name

name sum(salary)
shubham20000
mehul32000
Shreyash 30000
monika15000
revti35000
kajal40000

Fetched 6 row(s) in 1.75s
However, now suppose that this table has multiple records.

idnameageaddresssalary
1shubham32delhi20000
2shubham32delhi2000
3monika25mumbai15000
4kajal27alirajpur40000
5revti25indore35000
6revti25indore3000
7shreyash23pune30000
8mehul22hyderabad32000

So, using the Group By clause again, we can get the total amount of salaries of the employees, considering the repeated entries of records.

Select name, sum(salary) from Employees Group BY name;

So, above query gives the following output while execution.
Do you know about Impala OFFSET Clause with Example

Query: select name, sum(salary) from Employees Group BY name 

name sum(salary)
shubham22000
mehul32000
Shreyash 30000
monika15000
revti38000
kajal40000

Fetched 6 row(s) in 1.75s

So, this was all in the Impala Group By Clause tutorial. Hope you like our explanation.

7. Conclusion

As a result, we have seen the whole concept of Impala GROUP BY Clause. Still, if any doubt occurs in how to arrange data in groups, feel free to ask in the comment section.
See Also- Impala DISTINCT Operator
For reference

Leave a Reply

Your email address will not be published. Required fields are marked *

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.