Hive Create Table Commands and Examples

Boost your career with Free Big Data Courses!!

In this article, we will learn how to create tables in Apache Hive. The article explains the syntax for creating Hive Non-ACID transaction tables as well as ACID transaction tables in Hive.

You will also explore the properties which you have to set true for creating an ACID Hive transaction table. Firstly we will see how to create a Non-ACID transaction table. Then we will see how to create ACID hive transaction tables.

 

What are ACID Transactions?

ACID stands for the 4 traits of the database transactions that are Atomicity, Consistency, Isolation, and Durability.

  • Atomicity: It ensures that an operation will either succeed completely or fails. No operation will leave partial data.
  • Consistency: It ensures that once an application performs an operation then the results of that particular operation is visible to it in its every subsequent operation.
  • Isolation: It ensures that the incomplete operation by one user does not cause any unexpected side effects for other users.
  • Durability: It ensures that once an operation is complete then it will be preserved even in the case of machine or system failure.

Up until Hive 0.13, at the partition level, atomicity, consistency, and durability were provided. Isolation can be provided by starting any locking mechanisms like ZooKeeper or in memory. The transaction was added in Hive 0.13 that provides full ACID support at the row level.

Thus, one application can add rows while the other is reading data from the same partition without getting interfering with each other.

Create Non-ACID transaction Hive Table

The syntax for creating Non-ACID transaction table in Hive is:

CREATE TABLE [IF NOT EXISTS] [db_name.] table_name [(col_name data_type [COMMENT col_comment], ... [COMMENT col_comment])] [COMMENT table_comment] [ROW FORMAT row_format] [FIELDS TERMINATED BY “”] [STORED AS file_format] [LOCATION hdfs_path];

In the above syntax:

  • ROW FORMAT DELIMITED specifies Hive that when it encounters a new line character, then that means a new record.
  • FIELDS TERMINATED BY ‘,’ informs Hive about what delimiter we are using in our data files to separate each column.
  • STORED AS TEXTFILE informs Hive about what type of file to expect.

Example of Hive Non- ACID Transaction Table

For creating a table, first we have to use the database in which we want to create the table. In this example, I am creating a table in the database “dataflair”.

 

Now I am creating a table name “employ_detail” in the database “dataflair”.

 

You can insert data into the Non-ACID transaction table by using LOAD command. I am inserting data into my employ_detail as:

 

 

 

Now to see the data in the table, you can use the SELECT statement as:

hive create table

 

 

In this way, we can create Non-ACID transaction Hive tables.

Let us now see how to create an ACID transaction table in Hive.

Create ACID Transaction Hive Table

For creating ACID transaction tables in Hive we have to first set the below mentioned configuration parameters for turning on the transaction support in Hive.

1. Setting Configuration Parameters

At client side:

SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
Set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

At server side (Metastore):

SET hive.compactor.initiator.on=true;
SET hive.compactor.worker.threads=1;

You can set these configuration properties either in the hive-site.xml file or in the start of the session before any query runs.

2. Table Properties

For using a table in ACID writes ( such as insert, update, delete) then we have to set the table property “transactional=true”. Once we define a table as an ACID table via TBLPROPERTIES (“transactional”=”true”), then we cannot convert it back to a non-ACID table by changing TBLPROPERTIES (“transactional”=”false”). This is not allowed.

Syntax for Creating ACID Transaction Hive Table:

The ACID transaction Hive table currently supports only ORC format.

CREATE TABLE [IF NOT EXISTS] table_name (Colm_name datatype, Colm_name datatype, …...) CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS STORED AS ORC TBLPROPERTIES (transaction=true, ...);

Let us now see an example where we create a Hive ACID transaction table and perform INSERT.

Example of Hive ACID transaction Table

In the below example, we are creating a Hive ACID transaction table name “employ”.

For creating a Hive table, we will first set the above-mentioned configuration properties before running queries.

create table in hive

Now, we will create ‘employ’ table as:

 

Now, we will insert data into the employ table using INSERT INTO statement as:

SELECT Statement in Hive

Using select statement to check the data is inserted or not:

Thus, in this manner we can create ACID transactions tables in Hive.

Summary

I hope after reading this Hive Create Table article, you now understand what the ACID transaction is? And how we can create Non-ACID and ACID transaction tables in Hive. The article explains the syntax and the configuration parameters to be set for creating an ACID table through an example. You can implement this example in your own system.

We work very hard to provide you quality material
Could you take 15 seconds and share your happy experience on Google

follow dataflair on YouTube

Leave a Reply

Your email address will not be published. Required fields are marked *