Configure Hive Metastore to MySQL – Apache Hive Tutorial

1. Objective

This Hive tutorial describes how to configure Hive Metastore to MySQL. Hive stores its metadata (schema-related information, partitioning information, etc.) into the database, Hive is shipped with Derby database. Derby is an embedded database backed by local disk. Derby is a single threaded database which doesn’t allow multiple connections, it is not production ready. In this tutorial, we will change the Metastore of Hive to MySQL.

Configure Hive Metastore to MySQL - Apache Hive Tutorial

Configure Hive Metastore to MySQL – Apache Hive Tutorial

2. What is Hive?

Apache Hive is a data warehouse on the top of Hadoop. Using Hive we can run ad-hoc queries for the analysis of data. Hive saves us from writing complex Map-Reduce jobs, rather than that we can submit merely SQL queries. Hive converts SQL queries into MapReduce job and submits the same to the cluster.

Configure Hive Metastore to MySQL

Configure Hive Metastore to MySQL

Hive is very fast and scalable and is highly extensible. The hive consists of a huge user base, with the help of Hive thousands of jobs on the cluster can be run by hundreds of users at a time. As Hive is similar to SQL, hence it becomes very easy for the SQL developers to learn and implement Hive Queries.

3. Steps to Configure Hive Metastore to MySQL

Follow the steps given below to easily configure Hive Metastore to MySQL-

i. Install MySQL

$sudo apt-get install mysql-server

ii. Copy MySQL connector to lib directory

Download MySQL connector (mysql-connector-java-5.1.35-bin.jar) and copy it into the $HIVE_HOME/lib directory
Note: $HIVE_HOME refers hive installation directory

iii. Edit / Create configuration file hive-site.xml

Add following entries in the hive-site.xml (present in $HIVE_HOME/conf)

<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/hcatalog?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>your_username</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>your_password</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>

Now start hive terminal, it will connect to MySQL. Now you can open multiple hive connections, which was not possible with Derby database.

So, this was all about configuring Hive Metastore to MySQL. Hope you like our explanation.

4. Conclusion

Hence, in this tutorial, we discussed a brief introduction of Apache Hive. moreover, we saw how to configure Hive Metastore to MySQL. Still, if you have any query, you can ask in the comment tab.

Related Links – 

No Responses

  1. Bianca says:

    Thank you for explaining Derby which stores metadata of Hive. Nice explanation.

Leave a Reply

Your email address will not be published. Required fields are marked *