HiveQL Select – Group By Query | Group By Clause
In Apache Hive Tutorial, for grouping particular column values mentioned with the group by Query. Basically, we use Hive Group by Query with Multiple columns on Hive tables.
However, we need to know the syntax of HiveQL group by query to implement it. So, in this article, we will learn what is Hive Query – Group by Query, syntax, and an example of HiveQL Select Group By Clause to understand with JDBC Program.
What is Hive Query?
However, for the ETL purpose on top of Hadoop file system Hive offers SQL type querying language. Also, to work with tables, databases queries Hive Query language (HiveQL) offers SQL type environment in Hive.
In addition, to perform different type data manipulations and querying it is possible to have a different type of clauses associated with Hive. Especially, for better connectivity with different nodes outside the environment. Also, HIVE offers JDBC connectivity.
Also, there are several features of Hive queries offers. Such as:
- For example data modeling. Basically, for the creation of databases, tables, etc.
- Moreover, ETL functionalities. For example, Extraction, Transformation, and Loading data into tables.
- Also, it offers joins to merge different data tables.
- While it comes to ease of code, it offers user-specific custom scripts.
- Also, it offers a faster-querying tool on top of Hadoop.
HiveQL Select
Basically, for grouping particular column values mentioned with the group by query, Group by clause use columns on Hive tables. However, column name does not matter, since for whatever the name we are defining a Group By query will selects and display results by grouping the particular column values.
i. Group by Query Syntax
However, see below the syntax of GROUP BY Clause:
SELECT [ALL | DISTINCT] select_expr, select_expr, …
FROM table_reference
[WHERE where_condition]
[GROUP BY col_list]
[HAVING having_condition]
[ORDER BY col_list]]
[LIMIT number];
ii. Group by Query Example
Also, to understand well, see an example below. Although, let’s suppose an employee table. Basically, it includes Id, Name, Salary, Designation, and Dept fields. However, to retrieve the number of employees in each department Generate a query.
Table 1- Group By Clause Example
ID | Name | Salary | Designation | Dept |
1201 | Ross | 45000 | Tech manager | TP |
1202 | Rachel | 45000 | Proofreader | PR |
1203 | Monika | 40000 | Technical writer | TP |
1204 | Mike | 45000 | Proofreader | PR |
1205 | Joey | 30000 | OP Admin | Admin |
Moreover, BY using the above scenario, the following query retrieves the employee details.
hive> SELECT Dept,count(*) FROM employee GROUP BY DEPT;
Also, we get to see the following response to the successful execution of the query:
Table 2 – Group By Query
Dept | Count(*) |
Admin | 1 |
PR | 2 |
TP | 3 |
iii. JDBC Program
Moreover, to apply the Group By clause for the given example, here is the JDBC program is given below.
For example,
import java.sql.SQLException; import java.sql.Connection; import java.sql.ResultSet; import java.sql.Statement; import java.sql.DriverManager; public class HiveQLGroupBy { private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver"; public static void main(String[] args) throws SQLException { // Register driver and create driver instance Class.forName(driverName); // get connection Connection con = DriverManager. getConnection("jdbc:hive://localhost:10000/userdb", "", ""); // create statement Statement stmt = con.createStatement(); // execute statement Resultset res = stmt.executeQuery(“SELECT Dept,count(*) ” + “FROM employee GROUP BY DEPT; ”); System.out.println(" Dept \t count(*)"); while (res.next()) { System.out.println(res.getString(1) + " " + res.getInt(2)); } con.close(); } }
Also, use the following commands to compile and execute this program. Moreover, save the program in a file named HiveQLGroupBy.java.
$ javac HiveQLGroupBy.java
$ java HiveQLGroupBy
iv. Group By Clause – Output
However, here is the possible output of Group By Query.
Table 3 – Group By Query
Dept | Count(*) |
Admin | 1 |
PR | 2 |
TP | 3 |
So, this was all about Apache HiveQL Select – Group By Query Tutorial. Hope you like our explanation of Hive Group by Clause.
Conclusion
As a result, we have seen the whole concept of HiveQL Select -Group By query in Apache Hive, with a group by query example & syntax, we also discuss JDBC program with its output to understand HiveQL – Group By clause well.
In our next tutorial, we will study hive Oder By Query in detail. Still, if you have any query, feel free to ask in the comment section.
Did we exceed your expectations?
If Yes, share your valuable feedback on Google