Top 7 Hive DML Commands with Syntax and Examples

Boost your career with Free Big Data Courses!!

Wishing to load, insert, retrieve, update, or delete data in the Hive tables?

After learning basic Commands in Hive, let us now study Hive DML Commands. Hive Data Manipulation Language commands are used for inserting, retrieving, modifying, deleting, and updating data in the Hive table.

In this article, we will learn Hive DML commands. There are many Hive DML commands like LOAD, INSERT, UPDATE, etc. We will explore each of these DML commands individually, along with their syntax and examples.

Before moving towards the Hive DML commands, let us first see the short introduction to Hive Query Language.

Hive Query Language(HQL)

Hive Query Language is a language used in Hive, similar to SQL, to process and analyze unstructured data.

Hive Query Language is easy to use if you are familiar with SQL. The syntax of Hive QL is very similar to SQL with slight differences.

Hive QL supports DDL, DML, and user-defined functions.

Introduction to Hive DML commands

Hive DML (Data Manipulation Language) commands are used to insert, update, retrieve, and delete data from the Hive table once the table and database schema has been defined using Hive DDL commands.

The various Hive DML commands are:

  1. LOAD
  2. SELECT
  3. INSERT
  4. DELETE
  5. UPDATE
  6. EXPORT
  7. IMPORT

Technology is evolving rapidly!
Stay updated with DataFlair on WhatsApp!!

Let us now learn each DML command individually.

[ps2id id=’LOAD-command’ target=”/]1. LOAD Command

The LOAD statement in Hive is used to move data files into the locations corresponding to Hive tables.

  • If a LOCAL keyword is specified, then the LOAD command will look for the file path in the local filesystem.
  • If the LOCAL keyword is not specified, then the Hive will need the absolute URI of the file.
  • In case the keyword OVERWRITE is specified, then the contents of the target table/partition will be deleted and replaced by the files referred by filepath.
  • If the OVERWRITE keyword is not specified, then the files referred by filepath will be appended to the table.

Want to practice these Hive commands yourself? Follow the Hive-3.1.2 installation guide to install the latest Hive version on your system.

Syntax:

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)];

Example:

Here we are trying to load data from the ‘dab’ file in the local filesystem to the ‘emp_data’ table.

LOAD statement Hive DML Command

SELECT_statment_to_view_data_loaded_in_hive_table Hive DML Command

[ps2id id=’SELECT-command’ target=”/]2. SELECT COMMAND

The SELECT statement in Hive is similar to the SELECT statement in SQL used for retrieving data from the database.

Syntax:

SELECT col1,col2 FROM tablename;

Example:

SELECT_statement - Hive-DML-Command

[ps2id id=’INSERT-command’ target=”/]3. INSERT Command

The INSERT command in Hive loads the data into a Hive table. We can do insert to both the Hive table or partition.

a. INSERT INTO

The INSERT INTO statement appends the data into existing data in the table or partition. INSERT INTO statement works from Hive version 0.8.

Syntax:

INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;

Example:

Here in this example, we are trying to insert the data of ‘emp_data’ table created above into the table ‘example’.

Creating_table_example - Hive DML Commands

INSERT statement to load data into table “example”.

INSERT_INTO_statement - Hive-DML-Command

Displaying_result_of_insert_into - Hive DML Commands

b. INSERT OVERWRITE

The INSERT OVERWRITE table overwrites the existing data in the table or partition.

Syntax:

INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, ..) [IF NOT EXISTS]] select_statement FROM from_statement;

Example:

Here we are overwriting the existing data of the table ‘example’ with the data of table ‘dummy’ using INSERT OVERWRITE statement.

INSERT_OVERWRITE_statement - Hive DML Commands

Executing_insert_overwrite - Hive DML Commands

By using the SELECT statement we can verify whether the existing data of the table ‘example’ is overwritten by the data of table ‘dummy’ or not.

displaying_result_of_insert_overwrite - Hive DML Command

c. INSERT .. VALUES

INSERT ..VALUES statement in Hive inserts data into the table directly from SQL. It is available from Hive 0.14.

Syntax:

INSERT INTO TABLE tablename [PARTITION (partcol1[=val1], partcol2[=val2] ...)] VALUES values_row [, values_row ...];

Example:

Inserting data into the ‘student’ table using INSERT ..VALUES statement.

INSERT_VALUES_statement - Hive DML Commands

displaying_result_of_insert_values_statement - Hive DML commands

[ps2id id=’DELETE-command’ target=”/]4. DELETE command

The DELETE statement in Hive deletes the table data. If the WHERE clause is specified, then it deletes the rows that satisfy the condition in where clause.

The DELETE statement can only be used on the hive tables that support ACID.

Syntax:

DELETE FROM tablename [WHERE expression];

Example:

In the below example, we are deleting the data of the student from table ‘student’ whose roll_no is 105.

DELETE_statement - Hive DML Commands

executing_DELETE_statement - Hive DML Commands

By using the SELECT statement we can verify whether the data of the student from table ‘student’ whose roll_no is 105 is deleted or not.displaying_result_after_delete - Hive DML Commands

[ps2id id=’UPDATE-command’ target=”/]5. UPDATE Command

The update can be performed on the hive tables that support ACID.

The UPDATE statement in Hive deletes the table data. If the WHERE clause is specified, then it updates the column of the rows that satisfy the condition in WHERE clause.

Partitioning and Bucketing columns cannot be updated.

Syntax:

UPDATE tablename SET column = value [, column = value ...] [WHERE expression];

Example:

In this example, we are updating the branch of the student whose roll_no is 103 in the ‘student’ table using an UPDATE statement.

UPDATE-statement - Hive DML Commands

executing_update_statement - Hive DML Commands

By using the SELECT statement we can verify whether the branch of the student whose roll_no is 103 in the ‘student’ table is updated or not.

displaying_result_after_update - Hive DML Commands

[ps2id id=’EXPORT-command’ target=”/]6. EXPORT Command

The Hive EXPORT statement exports the table or partition data along with the metadata to the specified output location in the HDFS.

Metadata is exported in a _metadata file, and data is exported in a subdirectory ‘data.’

Syntax:

EXPORT TABLE tablename [PARTITION (part_column="value"[, ...])]
TO 'export_target_path' [ FOR replication('eventid') ];

Example:

Here in this example, we are exporting the student table to the HDFS directory “export_from_hive”.

EXPORT-statement - Hive DML Commands

executing_export_statement

The table successfully exported. You can check for the _metadata file and data sub-directory using ls command.

displaying_exported_table_directory_Hive DML Commands

[ps2id id=’IMPORT-command’ target=”/]7. IMPORT Command

The Hive IMPORT command imports the data from a specified location to a new table or already existing table.

Syntax:

IMPORT [[EXTERNAL] TABLE new_or_original_tablename [PARTITION (part_column="value"[, ...])]]
FROM 'source_path' [LOCATION 'import_target_path'];

Example:

Here in this example, we are importing the data exported in the above example into a new Hive table ‘imported_table’.

IMPORT-statement-Hive-DML-Commands

executing_import_statement - Hive DML Commands

Verifying whether the data is imported or not using hive SELECT statement.

displaying_import_table_result

Summary

After reading this article, I hope now you are familiar with the Hive DML commands.

The article explained how to load data into the Hive table, insert data into the Hive table, and delete rows from the hive table. In addition, we have studied how to update the particular row column in a table. We have also seen the Hive IMPORT and EXPORT statement with the help of an example.

Do you love DataFlair efforts? Please review us here.

Any queries while practicing Hive commands?

Ask us freely.

Keep Practicing!!

You give me 15 seconds I promise you best tutorials
Please share your happy experience on Google

follow dataflair on YouTube

2 Responses

  1. Rahul says:

    What exactly are the differences between load and import commands?

    • MJ says:

      you cant perform overwrite in import command, I guess. However same is possible with “load data”.
      Please correct me if I am wrong.

Leave a Reply

Your email address will not be published. Required fields are marked *