HiveQL SELECT Statement – JDBC Program With Syntax and Example

1. Objective

In Apache Hive, to process and analyze structured data in a Metastore, we have Hive Query Language (HiveQL) as a query language. Moreover, to retrieve the data from a table we use HiveQL SELECT statement. However, there are many more insights we can learn about the HiveQL SELECT statement. So, in this blog, we will learn how to use the HiveQL SELECT statement with WHERE clause. Also, we will see  JDBC Program, syntax as well as an example of HiveQL SELECT statement to understand it well.

What is HiveQL SELECT Statement

What is HiveQL SELECT Statement

2. HiveQL (Hive Query Language)

Generally, to write Hive queries, Hive offers a command line interface (CLI). It is possible by using Hive Query Language (HiveQL). Moreover, we can say HQL syntax is similar to the SQL syntax that most data analysts are familiar with.
In addition, from the complexity of MapReduce programming basically, Hive’s SQL-inspired language separates the user. Also, it reuses familiar concepts from the relational database world. Like tables, rows, columns, and schema, to ease learning.
Must Learn – Apache Hive Architecture & Components in detail
Moreover, here most of the interactions tend to take place over a command line interface (CLI).
However, there are four file formats which Hive supports. Such as TEXTFILE, SEQUENCEFILE, ORC and RCFILE (Record Columnar File).

  • Basically, for single user metadata storage Hive uses derby database.
  • Whereas Hive uses MYSQL for multiple user Metadata or shared Metadata case.
Hadoop Quiz

Get the most demanding skills of IT Industry - Learn Hadoop

3. HiveQL SELECT Statement

Basically, to retrieve the data from a table we use HiveQL SELECT statement. Moreover, WHERE clause works similar to a condition. Further, using the condition it generally filters the data and gives you a finite result. In addition, all the built-in functions and operators generate an expression, that fulfills the condition.

i. Syntax

Here is the syntax of HiveQL SELECT Statement:

SELECT [ALL | DISTINCT] select_expr, select_expr, ...
FROM table_reference
[WHERE where_condition]
[GROUP BY col_list]
[HAVING having_condition]
[CLUSTER BY col_list | [DISTRIBUTE BY col_list] [SORT BY col_list]]
[LIMIT number];

Learn about Difference between Hive Partitioning vs Bucketing in detail

ii. Example

So, let’s suppose as given below we have the employee table. Basically, it includes fields named Id, Name, Salary, Designation, and Dept. Further, to retrieve the employee details who earn a salary of more than Rs 30000  generate a query.

ID                               NameSalaryDesignationDept
1201Ross 45000Technical managerTP
1202Rachel45000ProofreaderPR
1203Chandler40000Technical writerTP
1204Monika40000Hr AdminHR
1205Mike 30000        Op Admin Admin

However, using the above scenario the following query retrieves the employee details are:
hive> SELECT * FROM employee WHERE salary>30000;
Let’s discuss Difference between Hive vs HBase
Now, we get to see the following response to the successful execution of the query:

ID                               NameSalaryDesignationDept
1201Ross 45000Technical managerTP
1202Rachel45000ProofreaderPR
1203Chandler40000Technical writerTP
1204Monika40000Hr AdminHR

iii. JDBC Program

However, now see for the given example the JDBC program to apply where clause is as follows.

import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;
public class HiveQLWhere
    {
         private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";   
         public static void main(String[] args) throws SQLException
          {   
// Register driver and create driver instance
              Class.forName(driverName);      
// get connection
              Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/userdb", "", "");   
// create statement
              Statement stmt = con.createStatement();     
// execute statement
              Resultset res = stmt.executeQuery("SELECT * FROM employee WHERE salary>30000;");  
              System.out.println("Result:");
              System.out.println(" ID \t Name \t Salary \t Designation \t Dept ");      
              while (res.next())
                  {
System.out.println(res.getInt(1) + " " + res.getString(2) + " " + res.getDouble(3) + " " + res.getString(4) + " " + res.getString(5));
                  }
con.close();
  }
}

However, save the program in a file named HiveQLWhere.java. Moreover, to compile and execute this program, we can use the following commands.
$ javac HiveQLWhere.java
$ java HiveQLWhere
Read more about Java Programming Language in detail

iv. Output

ID                               NameSalaryDesignationDept
1201Ross 45000Technical managerTP
1202Rachel45000ProofreaderPR
1203Chandler40000Technical writerTP
1204Monika40000Hr AdminHR

So, this was all in HiveQL Select Statement Tutorial. Hope you like our explanation.

4. Conclusion

As a result, we have seen the whole concept of  HiveQL SELECT statement with WHERE clause. Also, we have seen syntax as well as HiveQL SELECT Statement example to understand well with JDBC Program. However, if you have any query, feel free to ask in the comment section.
See Also- Impala vs Hive
For reference

Leave a Reply

Your email address will not be published. Required fields are marked *