HBase Troubleshooting – Problem, Cause & Solution
In this HBase tutorial, we will learn about HBase troubleshooting. Several kinds of problems may occur while working with HBase, so in this article, "HBase Troubleshooting", we will discuss common problems along with their causes and solutions.
So, let’s start HBase Troubleshooting.
i. Problem 1
Problem Statement: Thrift Server Crashes after Receiving Invalid Data.
If a Thrift server receives a large amount of invalid data, it may crash due to a buffer overrun.
To check the validity of the data it receives, the Thrift server allocates memory. If there is a large amount of invalid data, it may need to allocate more memory than is available. This happens because of a limitation in the Thrift library itself.
To prevent such crashes, we can use the framed and compact transport protocols. These protocols are disabled by default, because enabling them may require changes to our client code. The two options to add to our hbase-site.xml are hbase.regionserver.thrift.framed and hbase.regionserver.thrift.compact.
Set each of these to true, as in the XML below. Using the hbase.regionserver.thrift.framed.max_frame_size_in_mb option, we can also specify the maximum frame size.
<property>
  <name>hbase.regionserver.thrift.framed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.regionserver.thrift.framed.max_frame_size_in_mb</name>
  <value>2</value>
</property>
<property>
  <name>hbase.regionserver.thrift.compact</name>
  <value>true</value>
</property>
ii. Problem 2
Problem Statement: Master server initializes, but region servers do not
Communication between the master and region servers occurs through their IP addresses. The master listens to find out whether the region servers are running, but in this two-way communication, the region servers keep reporting their IP address to the master as 127.0.0.1 (localhost), so the master cannot reach them.
As a solution, remove the machine's fully qualified hostname from the 127.0.0.1 (localhost) line of the hosts file.
Hosts file location: /etc/hosts
What to change:
Open /etc/hosts and look for lines like:
127.0.0.1 fully.qualified.regionservername regionservername localhost.localdomain localhost
::1 localhost3.localdomain3 localdomain3
Modify them so that the region server name no longer maps to 127.0.0.1:
127.0.0.1 localhost.localdomain localhost
::1 localhost3.localdomain3 localdomain3
iii. Problem 3
Problem Statement: Couldn’t find my address: XYZ in the list of Zookeeper quorum servers
1. The ZooKeeper server throws an error that names the server (xyz) and fails to start.
2. HBase attempts to start a ZooKeeper server on some machine, but that machine cannot find itself in the quorum configuration.
1. The solution is to replace the hostname with the hostname presented in the error message.
2. Alternatively, if we have a DNS server, we can set the relevant configurations in hbase-site.xml.
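As a minimal sketch, the DNS-related hbase-site.xml properties are hbase.zookeeper.dns.interface and hbase.zookeeper.dns.nameserver; the interface name eth0 and the nameserver host below are placeholder values to be replaced for your environment:

```xml
<!-- Network interface a quorum member uses to determine its own hostname -->
<property>
  <name>hbase.zookeeper.dns.interface</name>
  <value>eth0</value><!-- placeholder: the interface on your machines -->
</property>
<!-- Nameserver consulted when resolving that hostname -->
<property>
  <name>hbase.zookeeper.dns.nameserver</name>
  <value>ns.example.com</value><!-- placeholder: your DNS server host or IP -->
</property>
```

With these set, each quorum member resolves its own name consistently instead of relying on the local hosts file.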
iv. Problem 4
Problem Statement: Created Root Directory for HBase through Hadoop DFS
1. We need to run the HBase migration script.
2. Upon running it, the migration script responds that there are no files in the root directory.
– Using the Hadoop Distributed File System, create a new directory for HBase. Two possibilities are:
1) The root directory does not exist.
2) A previously running HBase instance initialized it earlier.
Ensure that the HBase root directory does not currently exist, or that it was initialized by a previous run of an HBase instance. To delete the HBase root directory, use Hadoop DFS, for example: hadoop fs -rm -r /hbase (assuming /hbase is the path configured as hbase.rootdir). HBase then creates and initializes the directory by itself.
v. Problem 5
Problem Statement: Zookeeper session expired events
1. HMaster or HRegionServer processes shut down by throwing exceptions.
2. We can find the actual exceptions thrown by inspecting the logs.
The following log excerpt shows the exception thrown because of a ZooKeeper session expired event:
WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x278bd16a96000f to sun.nio.ch.SelectionKeyImpl@355811ec
java.io.IOException: TIMED OUT
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
WARN org.apache.hadoop.hbase.util.Sleeper: We slept 79410ms, ten times longer than scheduled: 5000
INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server hostname/IP:PORT
INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/IP:PORT remote=hostname/IP:PORT]
INFO org.apache.zookeeper.ClientCnxn: Server connection successful
WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x278bd16a96000d to sun.nio.ch.SelectionKeyImpl@3544d65e
java.io.IOException: Session Expired
    at org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:589)
    at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:709)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: ZooKeeper session expired
- The RAM size is 1 GB by default, so we should maintain more than 1 GB of RAM for long-running imports.
- We must increase the session timeout for ZooKeeper.
- To increase the ZooKeeper session timeout, modify the following property in hbase-site.xml, which is present in the hbase/conf folder.
- The session timeout is 60 seconds by default; we can change it to 120 seconds.
<property>
  <name>zookeeper.session.timeout</name>
  <value>120000</value>
</property>
<property>
  <name>hbase.zookeeper.property.tickTime</name>
  <value>6000</value>
</property>
So, this was all about troubleshooting in HBase. We hope you liked our explanation.
Conclusion: HBase Troubleshooting
Hence, in this HBase Troubleshooting tutorial, we saw solutions to various problems that can occur while working with HBase. Still, if you have any doubt regarding HBase troubleshooting, ask in the comment tab.