HBase Troubleshooting – Problem, Cause & Solution

In this HBase tutorial, we will learn HBase troubleshooting. Several kinds of problems may occur while working with HBase. So, in this article, “HBase Troubleshooting”, we will discuss different problems along with their causes and solutions, so that we can resolve these errors and failures.

So, let’s start HBase Troubleshooting.

HBase Troubleshooting

i. Problem 1

Problem Statement: Thrift Server Crashes after Receiving Invalid Data.
If a Thrift server receives a large amount of invalid data, it may crash due to a buffer overrun.

Cause:
The Thrift server allocates memory in order to check the validity of the data it receives. So, if there is a large amount of invalid data, it may need to allocate more memory than is available. This is due to a limitation in the Thrift library itself.

Solution:
To prevent these crashes, we can use the framed and compact transport protocols. By default, these protocols are disabled, because enabling them may require changes to our client code. The two options to add to our hbase-site.xml are:

  • hbase.regionserver.thrift.framed,
  • hbase.regionserver.thrift.compact

Set each of these to true, as in the XML below. We can also specify the maximum frame size, using the hbase.regionserver.thrift.framed.max_frame_size_in_mb option.

<property>
 <name>hbase.regionserver.thrift.framed</name>
 <value>true</value>
</property>
<property>
 <name>hbase.regionserver.thrift.framed.max_frame_size_in_mb</name>
 <value>2</value>
</property>
<property>
 <name>hbase.regionserver.thrift.compact</name>
 <value>true</value>
</property>
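
After editing hbase-site.xml, it can help to confirm that both options really are set to true. The following is only a minimal sketch in Python: the helper names and the SAMPLE text are made up for illustration, and it assumes the file uses the usual `<configuration>` root element around its `<property>` entries.

```python
import xml.etree.ElementTree as ET

# Option names taken from the list above.
REQUIRED_TRUE = (
    "hbase.regionserver.thrift.framed",
    "hbase.regionserver.thrift.compact",
)

def read_properties(xml_text):
    """Return {name: value} for every <property> element in the config text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value
    return props

def missing_thrift_settings(xml_text):
    """List the required Thrift options that are absent or not set to true."""
    props = read_properties(xml_text)
    return [n for n in REQUIRED_TRUE if props.get(n, "").lower() != "true"]

# Hypothetical config that enables framed transport but forgets compact.
SAMPLE = """<configuration>
  <property>
    <name>hbase.regionserver.thrift.framed</name>
    <value>true</value>
  </property>
</configuration>"""

print(missing_thrift_settings(SAMPLE))
# prints ['hbase.regionserver.thrift.compact']
```

To check a real file, we could pass the result of `open(path).read()` instead of SAMPLE, where the path to hbase-site.xml depends on our installation.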

ii. Problem 2

Problem Statement: Master server initializes but region servers do not initialize
Basically, the communication between the Master and the region servers occurs through their IP addresses. Here, the Master is told that the region servers are running at the IP address 127.0.0.1.

Cause:
Basically, in the two-way communication between the region servers and the Master, each region server continuously informs the Master server that its IP address is 127.0.0.1.

Solution:
As a solution, remove the server name from the localhost entry in the hosts file, so that the name no longer resolves to 127.0.0.1.
Hosts file location: /etc/hosts

What to change:
Open /etc/hosts and go to these lines:
127.0.0.1 fully.qualified.regionservername regionservername localhost.localdomain localhost
::1 localhost3.localdomain3 localdomain3

Then, modify the above configuration to:
127.0.0.1 localhost.localdomain localhost
::1 localhost3.localdomain3 localdomain3
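
To spot this misconfiguration quickly, we can scan the hosts file for server names that sit on a loopback line. This is a rough sketch only: the function name is made up, and it assumes that any name starting with "localhost" is a legitimate loopback alias while everything else on such a line is suspicious.

```python
import ipaddress

def loopback_aliases(hosts_text):
    """Return names mapped to a loopback address, ignoring localhost* names."""
    suspicious = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # not an address line, skip it
        if not addr.is_loopback:
            continue
        for name in fields[1:]:
            # Assumption: names beginning with "localhost" are expected here.
            if not name.startswith("localhost"):
                suspicious.append(name)
    return suspicious

BEFORE = "127.0.0.1 fully.qualified.regionservername regionservername localhost.localdomain localhost\n"
AFTER = "127.0.0.1 localhost.localdomain localhost\n"

print(loopback_aliases(BEFORE))
# prints ['fully.qualified.regionservername', 'regionservername']
print(loopback_aliases(AFTER))
# prints []
```

On a real machine, we could call `loopback_aliases(open("/etc/hosts").read())` and expect an empty list once the entry is fixed.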

iii. Problem 3

Problem Statement: Couldn’t find my address: XYZ in the list of Zookeeper quorum servers

Cause:
1. The ZooKeeper server is not able to start, and it throws an error that contains the server name, such as xyz.
2. Also, HBase attempts to start a ZooKeeper server on some machine, but that machine is not able to find itself in the quorum configuration.

Solution:
1. The solution is to replace the hostname in the quorum configuration with the hostname that is presented in the error message.
2. If we have a DNS server, we can also set the below configurations in hbase-site.xml:
           – hbase.zookeeper.dns.interface
           – hbase.zookeeper.dns.nameserver
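
The check that fails here is essentially "does one of my own names appear in the quorum list?". A small sketch of that comparison is shown below. It is not HBase's actual code: the function names and the example hostnames are hypothetical, and it assumes the quorum is a comma-separated list of plain hostnames, as in the hbase.zookeeper.quorum property.

```python
import socket

def find_self_in_quorum(quorum, local_names):
    """Return the quorum entries that match one of this machine's names."""
    entries = [h.strip().lower() for h in quorum.split(",") if h.strip()]
    names = {n.lower() for n in local_names}
    return [h for h in entries if h in names]

def local_hostnames():
    """Names this machine reports for itself (short name and FQDN)."""
    return {socket.gethostname(), socket.getfqdn()}

# Hypothetical quorum value and machine names, for illustration only.
QUORUM = "zk1.example.com, zk2.example.com, zk3.example.com"

print(find_self_in_quorum(QUORUM, {"zk2.example.com"}))
# prints ['zk2.example.com']
print(find_self_in_quorum(QUORUM, {"xyz"}))
# prints []
```

An empty result for `find_self_in_quorum(QUORUM, local_hostnames())` on a machine that is supposed to run ZooKeeper suggests that the name in the quorum configuration and the name the machine reports for itself do not match.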

iv. Problem 4

Problem Statement: Created Root Directory for HBase through Hadoop DFS
1. After we create a new root directory for HBase through Hadoop DFS, HBase reports that we need to run the HBase migrations script.
2. Upon running it, the HBase migrations script responds that there are no files in the root directory.

Cause:
– We created a new directory for HBase ourselves, using the Hadoop Distributed File System. HBase expects one of two possibilities:

1) The root directory does not exist.
2) The root directory was initialized before, by a previous run of an HBase instance.

Solution:
Ensure that the HBase root directory either does not currently exist or has been initialized by a previous run of an HBase instance.
Step 1:
Use Hadoop dfs to delete the HBase root directory.
Step 2:
HBase then creates and initializes the directory by itself.
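
As a sketch of Step 1, the snippet below only builds and prints the HDFS shell commands instead of running them, because deleting the root directory removes everything stored under it. The path /hbase is an assumption; the real path is whatever hbase.rootdir points to in our cluster, and the function name is made up for this example.

```python
import shlex

def reset_rootdir_commands(rootdir):
    """Build the HDFS shell commands to check for, then delete, the root directory."""
    return [
        ["hdfs", "dfs", "-test", "-d", rootdir],  # exit code 0 if the directory exists
        ["hdfs", "dfs", "-rm", "-r", rootdir],    # recursive delete
    ]

# Dry run: print the commands so we can review them before executing anything.
for cmd in reset_rootdir_commands("/hbase"):
    print(shlex.join(cmd))
# prints:
# hdfs dfs -test -d /hbase
# hdfs dfs -rm -r /hbase
```

Once the directory is gone, starting HBase again lets it create and initialize the root directory on its own, as described in Step 2.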

v. Problem 5

Problem Statement: Zookeeper session expired events

Cause:
1. The HMaster or HRegion servers shut down by throwing exceptions.
2. Moreover, if we observe the logs, we can find out the actual exceptions that were thrown.
The following shows the exceptions thrown because of a ZooKeeper session expired event:

Log file excerpt:

WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x278bd16a96000f to sun.nio.ch.SelectionKeyImpl@355811ec
java.io.IOException: TIMED OUT
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
WARN org.apache.hadoop.hbase.util.Sleeper: We slept 79410ms, ten times longer than scheduled: 5000
INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server hostname/IP:PORT
INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/IP:PORT remote=hostname/IP:PORT]
INFO org.apache.zookeeper.ClientCnxn: Server connection successful
WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x278bd16a96000d to sun.nio.ch.SelectionKeyImpl@3544d65e
java.io.IOException: Session Expired
at org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:589)
at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:709)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: ZooKeeper session expired

Solution:

  1. By default, the RAM size is 1 GB. So, for long-running imports, we have to maintain a RAM capacity of more than 1 GB.
  2. We must increase the session timeout for ZooKeeper.
  3. To increase the session timeout of ZooKeeper, we have to modify the following properties in “hbase-site.xml”, which is present in the hbase/conf folder.
  4. By default, the session timeout is 60 seconds. We can change it to 120 seconds (the value is given in milliseconds).
<property>
   <name>zookeeper.session.timeout</name>
   <value>120000</value>
</property>
<property>
   <name>hbase.zookeeper.property.tickTime</name>
   <value>6000</value>
</property>
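
The two properties are related. A ZooKeeper server limits the session timeout that a client can negotiate: by default the minimum is 2 times tickTime and the maximum is 20 times tickTime. So, with a tickTime of 6000 ms, the largest session timeout the server grants is 120000 ms, that is, 120 seconds. The arithmetic can be sketched as follows; the function name is made up, and the 2000 ms tickTime in the last call is only an illustrative smaller value.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms,
                               min_ms=None, max_ms=None):
    """Clamp a requested timeout to ZooKeeper's default 2x..20x tickTime bounds."""
    low = 2 * tick_time_ms if min_ms is None else min_ms
    high = 20 * tick_time_ms if max_ms is None else max_ms
    return max(low, min(requested_ms, high))

# With tickTime = 6000 ms, a 120-second request is granted in full.
print(negotiated_session_timeout(120000, 6000))
# prints 120000

# A larger request is capped at 20 * 6000 ms.
print(negotiated_session_timeout(1200000, 6000))
# prints 120000

# With a smaller tickTime, the same 120-second request is cut down.
print(negotiated_session_timeout(120000, 2000))
# prints 40000
```

This is why raising zookeeper.session.timeout alone may not be enough: if tickTime stays small, the server silently grants a shorter timeout than the one we asked for.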

So, this was all about Troubleshooting in HBase. Hope you like our explanation.

Conclusion: HBase Troubleshooting

Hence, in this HBase Troubleshooting tutorial, we saw the causes of and solutions to various problems that can often occur while working with HBase. Still, if you have any doubt regarding HBase Troubleshooting, ask in the comment tab.
