Site icon DataFlair

HiveQL Select – Group By Query | Group By Clause

HiveQL Group By Query | Group By Clause

HiveQL Group By Query | Group By Clause

In Apache Hive Tutorial, for grouping particular column values mentioned with the group by Query. Basically, we use Hive Group by Query with Multiple columns on Hive tables.

However, we need to know the syntax of HiveQL group by query to implement it. So, in this article, we will learn what is Hive Query – Group by Query, syntax, and an example of HiveQL Select Group By Clause to understand with JDBC Program.

What is Hive Query?

However, for the ETL purpose on top of Hadoop file system Hive offers SQL type querying language. Also, to work with tables, databases queries Hive Query language (HiveQL) offers SQL type environment in Hive.

In addition, to perform different type data manipulations and querying it is possible to have a different type of clauses associated with Hive. Especially, for better connectivity with different nodes outside the environment. Also, HIVE offers JDBC connectivity.

Also, there are several features of Hive queries offers. Such as:

HiveQL Select

Basically, for grouping particular column values mentioned with the group by query, Group by clause use columns on Hive tables. However, column name does not matter, since for whatever the name we are defining a Group By query will selects and display results by grouping the particular column values.

i.  Group by Query Syntax

However,  see below the syntax of GROUP BY Clause:
SELECT [ALL | DISTINCT] select_expr, select_expr, …
FROM table_reference
[WHERE where_condition]
[GROUP BY col_list]
[HAVING having_condition]
[ORDER BY col_list]]
[LIMIT number];

ii. Group by Query Example

Also, to understand well, see an example below. Although, let’s suppose an employee table. Basically, it includes Id, Name, Salary, Designation, and Dept fields. However, to retrieve the number of employees in each department Generate a query.
Table 1- Group By Clause Example

ID Name Salary Designation Dept
1201 Ross 45000 Tech manager TP
1202 Rachel 45000 Proofreader PR
1203 Monika 40000 Technical writer TP
1204 Mike 45000 Proofreader PR
1205 Joey 30000 OP Admin Admin

Moreover, BY using the above scenario, the following query retrieves the employee details.
hive> SELECT Dept,count(*) FROM employee GROUP BY DEPT;

Technology is evolving rapidly!
Stay updated with DataFlair on WhatsApp!!

Also, we get to see the following response to the successful execution of the query:

Table 2 – Group By Query 

Dept Count(*)
Admin 1
PR 2
TP 3

iii. JDBC Program

Moreover, to apply the Group By clause for the given example, here is the JDBC program is given below.
For example,

import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;
public class HiveQLGroupBy
      {
         private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";  
         public static void main(String[] args) throws SQLException
              {
// Register driver and create driver instance
                   Class.forName(driverName);     
// get connection
                   Connection con = DriverManager.
                   getConnection("jdbc:hive://localhost:10000/userdb", "", "");     
// create statement
                   Statement stmt = con.createStatement();     
// execute statement
                   Resultset res = stmt.executeQuery(“SELECT Dept,count(*) ” + “FROM employee GROUP BY DEPT; ”);
                   System.out.println(" Dept \t count(*)");     
                   while (res.next())
                     {
                             System.out.println(res.getString(1) + " " + res.getInt(2));
                     }
                  con.close();
            }
     }

Also, use the following commands to compile and execute this program. Moreover, save the program in a file named HiveQLGroupBy.java.

$ javac HiveQLGroupBy.java
$ java HiveQLGroupBy

iv. Group By Clause – Output

However, here is the possible output of Group By Query.
Table 3 – Group By Query 

Dept Count(*)
Admin 1
PR 2
TP 3

So, this was all about Apache HiveQL Select – Group By Query Tutorial. Hope you like our explanation of Hive Group by Clause.

Conclusion

As a result, we have seen the whole concept of HiveQL Select -Group By query in Apache Hive, with a group by query example & syntax, we also discuss JDBC program with its output to understand HiveQL – Group By clause well.

In our next tutorial, we will study hive Oder By Query in detail. Still, if you have any query, feel free to ask in the comment section.

Exit mobile version