Apache Hive Data Types – Learn Quickly

Stay updated with the latest technology trends while you're on the move - Join DataFlair's Telegram Channel

1. Objective

In our previous blog, we have discussed the Apache Hive introduction and Hive Architecture in detail. Now in this blog, we are going to cover Apache Hive Data Types with examples. We will discuss different types of data types in Hive: Hive Primitive Data type, Hive Complex Data types, Hive Literals, Hive Column Datatypes etc. We will also cover the data types that fall into these categories.

Apache Hive Data Types

Apache Hive Data Types

2. Apache Hive Data Types

Hive Data types are used for specifying the column/field type in Hive tables.

Hadoop Quiz
If these professionals can make a switch to Big Data, so can you:
Rahul Doddamani Story - DataFlair
Rahul Doddamani
Java → Big Data Consultant, JDA
Follow on
Mritunjay Singh Success Story - DataFlair
Mritunjay Singh
PeopleSoft → Big Data Architect, Hexaware
Follow on
Rahul Doddamani Success Story - DataFlair
Rahul Doddamani
Big Data Consultant, JDA
Follow on
I got placed, scored 100% hike, and transformed my career with DataFlair
Enroll now
Deepika Khadri Success Story - DataFlair
Deepika Khadri
SQL → Big Data Engineer, IBM
Follow on
DataFlair Web Services
You could be next!
Enroll now

3. Types of Data Types in Hive

Mainly Hive Data Types are classified into 5 major categories, let’s discuss them one by one:

a. Primitive Data Types in Hive

Primitive Data Types also divide into 4 types which are as follows:

  • Numeric Data Type
  • Date/Time Data Type
  • String Data Type
  • Miscellaneous Data Type

Let’s install Apache Hive in 5 min
Let us now discuss these Hive Primitive data types one by one-

Introduction to Hive Primitive Data Types

Primitive Data Types in Hive

i. Numeric Data Type

The Hive Numeric Data types also classified into two types-

  • Integral Data Types
  • Floating Data Types

* Integral Data Types
Integral Hive data types are as follows-

  • TINYINT (1-byte (8 bit) signed integer, from -128 to 127)
  • SMALLINT (2-byte (16 bit) signed integer, from -32, 768 to 32, 767)
  • INT (4-byte (32-bit) signed integer, from –2,147,483,648to 2,147,483,647)
  • BIGINT (8-byte (64-bit) signed integer, from –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807)

* Floating Data Types
Floating Hive data types are as follows-

  • FLOAT (4-byte (32-bit) single-precision floating-point number)
  • DOUBLE (8-byte (64-bit) double-precision floating-point number)
  • DECIMAL (Arbitrary-precision signed decimal number)

ii. Date/Time Data Type

The second category of Apache Hive primitive data type is Date/Time data types. The following Hive data types comes into this category-

  • TIMESTAMP (Timestamp with nanosecond precision)
  • DATE (date)
  • INTERVAL

iii. String Data Type

String data types are the third category under Hive data types. Below are the data types that come into this-

  • STRING (Unbounded variable-length character string)
  • VARCHAR (Variable-length character string)
  • CHAR (Fixed-length character string)

iv. Miscellaneous Data Type
Miscellaneous data types has two types of Hive data types-

  • BOOLEAN (True/false value)
  • BINARY (Byte array)

b. Complex Data Types in Hive

In this category of Hive data types following data types are come-

  • Array
  • MAP
  • STRUCT
  • UNION

let’s now discuss the Hive Complex data types with the example-

Hive Complex data types

Hive Complex data types

i. ARRAY

An ordered collection of fields. The fields must all be of the same type.
Syntax: ARRAY<data_type>
E.g. array (1, 2)

ii. MAP

An unordered collection of key-value pairs. Keys must be primitives; values may be any type. For a particular map, the keys must be the same type, and the values must be the same type.
Syntax: MAP<primitive_type, data_type>
E.g. map(‘a’, 1, ‘b’, 2).

iii. STRUCT

A collection of named fields. The fields may be of different types.
Syntax: STRUCT<col_name : data_type [COMMENT col_comment],…..>
E.g. struct(‘a’, 1 1.0),[b] named_struct(‘col1’, ‘a’, ‘col2’, 1,  ‘col3’, 1.0)

iv. UNION

A value that may be one of a number of defined data. The value is tagged with an integer (zero-indexed) representing its data type in the union.
Syntax: UNIONTYPE<data_type, data_type, …>
E.g. create_union(1, ‘a’, 63)

c. Column Data Types in Hive

Column Hive data types are furthermore divide into 6 categories:

  • Integral Type
  • Strings
  • Timestamp
  • Dates
  • Decimals
  • Union Types

Let us discuss these Hive Column data types one by one-

Hive Column Data Types

Hive Column Data Types

i. Integral type

In this category of Hive data types following 4 data types are come-

  • TINYINT
  • SMALLINT
  • INT/INTEGER
  • BIGINT

By default, Integral literals are assumed to be INT. When the data range exceeds the range of INT, we need to use BIGINT.  If the data range is smaller than the INT, we uses SMALLINT. And TINYINT is smaller than SMALLINT.

TablePostfixExample
TINYINTY100Y
SMALLINTS100S
BIGINTL100L

ii. Strings

The string data types in Hive, can be specified with either single quotes (‘) or double quotes (“). Apache Hive use C-style escaping within the strings.

Data TypeLength
VARCHAR1 to 65355
CHAR255

* VARCHAR
Varchar- Hive data types are created with a length specifier (between 1 and 65355). It defines the maximum number of characters allowed in the character string.
* Char
Char – Hive data types are similar to VARCHAR. But they are fixed-length meaning that values shorter than the specified length value are padded with spaces but trailing spaces are not important during comparisons. 255 is the maximum fixed length.

iii. Timestamp

Hive supports traditional UNIX timestamp with operational nanosecond precision. Timestamps in text files use format ”YYYY-MM-DD HH:MM:SS.fffffffff” and “yyyy-mm-dd hh:mm:ss.ffffffffff”.

iv. Dates

DATE values are described in particular year/month/day (YYYY-MM-DD) format.E.g. DATE ‘2017-­01-­01’. These types don’t have a time of day component. This type supports range of values for 0000-­01-­01 to 9999-­12-­31.

v. Decimals

In Hive DECIMAL type is similar to Big Decimal format of java. This represents immutable arbitrary precision. The syntax and example are below:
Apache Hive 0.11 and 0.12 has the precision of the DECIMAL type fixed. And it’s limited to 38 digits.
Apache Hive 0.13 users can specify scale and precision when creating tables with the DECIMAL data type using DECIMAL (precision, scale) syntax.  If the scale is not specified, then it defaults to 0 (no fractional digits). If no precision is specified, then it defaults to 10.

CREATE TABLE foo (
a DECIMAL, -- Defaults to decimal(10,0)b DECIMAL(9, 7)
b DECIMAL(9, 7)
)

vi. Union Types

Union Hive data types are the collection of heterogeneous data types. By using create union we can create an instance. The syntax and example are below:

CREATE TABLE union_test(foo UNIONTYPE<int, double, array<string>, struct<a:int,b:string>>);
SELECT foo FROM union_test;
{0:1}
{1:2.0}
{2:["three","four"]}
{3:{"a":5,"b":"five"}}
{2:["six","seven"]}
{3:{"a":8,"b":"eight"}}
{0:9}
{1:10.0}

d. Literals Data Types in Hive

In Hive data types following literals are used:

Apache Hive Literals Data Types

Apache Hive Literals Data Types

  • Floating Point Types
  • Decimal Type

i. Floating Point Types

These are nothing but numbers with decimal points. This type of data is composed of the DOUBLE data types in Hive.

ii. Decimal Type

This type is nothing but floating point value with the higher range than the DOUBLE data type. The decimal type range is approximate -10-308 to 10308.

e. Null Value Data Types in Hive

In this Hive Data Types, missing values are represented by the special value NULL.

So, this was all in Hive Data Types. Hope you like our explanation.

3. Conclusion – Hive Data Types

Hence, in this Apache Hive data type tutorial, we have discussed all the data types in Hive in detail. Hope this blog will help you to understand all the data types in hive easily. In the next section, we will discuss the Hive Operators in detail.
See Also:

Leave a Reply

Your email address will not be published. Required fields are marked *

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.