HBase Operations: Read and Write Operations

1. HBase Operations

Today, in this HBase article “HBase Operations: Read and Write”, we will learn how HBase reads and writes data. HBase has two basic operations: read and write. Moreover, in this HBase tutorial, we will look at two major components involved in these operations, HFile and the META Table.
So let’s start HBase Operations.

2. HBase Operations: Read and Write

Basically, two major components play a vital role in both the read and the write operations of HBase: HFile and the META Table. So let’s study both in detail:

i. HFile

HFile is the lowest level of the HBase architecture: it is the file format in which tables physically exist on disk.
Some key points about HFile:

  • The row key is the primary identifier.
  • Keys are stored in lexicographical order.
  • Data is stored and split across the nodes according to this order.
  • An HFile is allocated to exactly one region.
  • Rows are stored in an HFile as KeyValues, sorted on disk.
  • When the MemStore accumulates more data than its limit, the entire sorted set is written to a new HFile in HDFS.
  • HBase uses multiple HFiles per column family; they contain the actual cells, that is, the KeyValue instances.
  • Each HFile stores the highest sequence number as a meta field, so the system knows where it previously ended and where to continue next.
  • An HFile contains a multi-layered index, which allows HBase to search for data without having to read the whole file.
  • HDFS replicates the WAL and HFile blocks.
  • This replication of HFile blocks happens automatically.
  • By default, I/O in HBase happens at the HFile block level, and a block is 64 KB.
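
The ordering rule in the list above is easy to get wrong, because row keys are compared byte by byte and not as numbers. The following is a minimal sketch in plain Python (no HBase client involved) of how such keys sort. The key names and the `padded` helper are made up for illustration, and zero-padding is only one common workaround, not something this article prescribes.

```python
# Row keys are compared as raw bytes, so b"row10" sorts before b"row2".
row_keys = [b"row1", b"row2", b"row10", b"row20", b"row3"]

# Python compares bytes objects lexicographically, the same ordering rule
# that the article describes for keys in an HFile.
sorted_keys = sorted(row_keys)
print(sorted_keys)


def padded(n: int, width: int = 4) -> bytes:
    """Hypothetical helper: build a zero-padded key such as b"row0010"."""
    return b"row" + str(n).zfill(width).encode("ascii")


# With zero-padding, byte order and numeric order agree.
padded_keys = sorted(padded(n) for n in (1, 2, 10, 20, 3))
print(padded_keys)
```

Because data is split across nodes in this same order, the choice of row key format also decides which rows end up next to each other.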

Moreover, the HRegionServer manages the HFiles that together make up each HRegion it serves.

ii. META Table

META Table is one of the major components of HBase Operations.
An HBase Read operation needs to know which HRegionServer to access for the actual data, so the Read operation uses the META Table.
Moreover, the META Table is kept up to date as regions change, for example when a region splits or moves to another server, so the next Read finds the current location.

  • META Table is an HBase table that keeps a list of all regions in the system.
  • It is organized like a B-tree.
  • Its structure is as follows:

Key: Region start key, Region id
Values: RegionServer
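
To make the key and value structure above concrete, here is a minimal sketch of a region lookup. It is an in-memory model in plain Python, not the HBase client API: the `meta` list, the server names, and the `locate` function are all hypothetical, and it assumes the first region starts at the empty key.

```python
from bisect import bisect_right

# Hypothetical META contents: (region start key, region id) -> RegionServer.
# Entries are kept sorted by start key; the first region starts at b"".
meta = [
    ((b"", 1), "rs1.example.com"),
    ((b"g", 2), "rs2.example.com"),
    ((b"p", 3), "rs3.example.com"),
]
start_keys = [key[0] for key, _ in meta]


def locate(row_key: bytes) -> str:
    """Return the RegionServer whose region covers row_key."""
    # The covering region is the last one whose start key is <= row_key.
    idx = bisect_right(start_keys, row_key) - 1
    return meta[idx][1]


print(locate(b"apple"))
print(locate(b"grape"))
print(locate(b"zebra"))
```

A binary search over sorted start keys is enough here because each region covers the range from its own start key up to the next region's start key.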

3. HBase Write Path

The following steps occur when a client issues a Write command:

  • First, for fault tolerance, the write is recorded in the Write Ahead Log (WAL). Hence, HBase always has the WAL to fall back on if an error occurs while writing data.
  • As soon as the log entry is made, the data is forwarded to the MemStore, which lives in the RAM of the region server's node. Because the write lands in memory, this step is fast, which is one reason HBase writes can be quicker than writes in a typical relational database (RDBMS).
  • Afterward, when the MemStore is full, its data is flushed to an HFile. The HFile itself is stored in HDFS.
  • Further, an ACK (acknowledgement) is sent to the client as confirmation as soon as the write is complete, that is, once the data is in the WAL and the MemStore. The client does not wait for the flush to an HFile.
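
The write steps above can be sketched as a small in-memory model. This is an illustration in plain Python, not HBase code: the `RegionStore` class, its attribute names, and the entry-count limit are assumptions made for the example. One simplification to note: the sketch flushes inline inside `put`, whereas the article's point is only that a flush happens once the MemStore is over its limit.

```python
class RegionStore:
    """Toy model of the write path: WAL -> MemStore -> HFile."""

    def __init__(self, memstore_limit: int = 3):
        self.wal = []        # append-only log of every write
        self.memstore = {}   # in-memory buffer of recent writes
        self.hfiles = []     # each flushed file is a sorted list of (key, value)
        self.memstore_limit = memstore_limit  # hypothetical limit, in entries

    def put(self, row_key: bytes, value: str) -> str:
        self.wal.append((row_key, value))  # 1. log the write first
        self.memstore[row_key] = value     # 2. then buffer it in memory
        if len(self.memstore) >= self.memstore_limit:
            self.flush()                   # 3. flush when the buffer is full
        return "ACK"                       # 4. confirm to the client

    def flush(self) -> None:
        # The whole buffer is written out as one new file, sorted by row key.
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore.clear()


store = RegionStore(memstore_limit=2)
acks = [store.put(b"row2", "x"), store.put(b"row1", "y"), store.put(b"row3", "z")]
print(acks)
print(store.hfiles)
print(store.memstore)
```

After the three writes, the first two rows sit in one sorted file and the third is still only in memory and in the log, which is why the log is needed for recovery.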

4. HBase Read Path

The read process starts when a client sends a request to HBase. The request first goes to ZooKeeper, which tracks the state of the distributed system that HBase is part of.

  • ZooKeeper holds the location of the META Table, which is hosted on an HRegionServer. Hence, when a client makes a request, ZooKeeper returns the address of that table.
  • Afterward, the process continues to the META Table on that HRegionServer. There it gets the address of the region that holds the data to be read.
  • Further, the process checks the BlockCache, which holds data from previous reads. If a user queries the same records again, the client gets the data almost immediately: when the data is found there, the process returns to the client with the data as the result.
  • If the data is not found in the BlockCache, the process searches the MemStore, which holds recent writes that have not yet been flushed to an HFile. If the data is found there, the process returns to the client with the data as the result.
  • Furthermore, if the data is still not found, the process moves on to search within the HFiles. The data will be located there, so the process takes the required data and moves forward.
  • The data taken from the HFile is now the most recently read data, and the user may read it again. For that reason it is written to the BlockCache, so that the client can access it instantly the next time.
  • Finally, once the search is complete and the data has been written to the BlockCache, the process returns the required data to the client along with an ACK.
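
The lookup order described above (BlockCache, then MemStore, then HFile) can be sketched as follows. This is a toy model in plain Python and not the HBase API: the `RegionReader` class and its fields are hypothetical, each HFile is modeled as a sorted list of pairs, and the cache stores whole values where real HBase caches blocks. The sketch also stops at the first hit and leaves out any reconciling of versions across these sources.

```python
from bisect import bisect_left


class RegionReader:
    """Toy model of the read path: BlockCache -> MemStore -> HFile."""

    def __init__(self, memstore: dict, hfiles: list):
        self.block_cache = {}     # data kept from previous reads
        self.memstore = memstore  # recent writes still in memory
        self.hfiles = hfiles      # sorted lists of (key, value), newest last

    def get(self, row_key: bytes):
        if row_key in self.block_cache:          # 1. previously read data
            return self.block_cache[row_key], "BlockCache"
        if row_key in self.memstore:             # 2. recent, unflushed writes
            return self.memstore[row_key], "MemStore"
        for hfile in reversed(self.hfiles):      # 3. files, newest first
            keys = [k for k, _ in hfile]
            i = bisect_left(keys, row_key)       # files are sorted by key
            if i < len(keys) and keys[i] == row_key:
                value = hfile[i][1]
                self.block_cache[row_key] = value  # keep it for the next read
                return value, "HFile"
        return None, "not found"


reader = RegionReader({b"row3": "z"}, [[(b"row1", "y"), (b"row2", "x")]])
first = reader.get(b"row1")
second = reader.get(b"row1")
print(first, second, reader.get(b"row3"))
```

The second read of the same row is served from the cache, which is the effect the article describes for repeated queries.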

So, this was all about HBase Operations. Hope you like our explanation.

5. Conclusion

Hence, in this HBase Operations tutorial, we have seen how HBase performs Read and Write operations internally. Moreover, we discussed two major components involved in these operations, HFile and the META Table. However, if you have any doubts, feel free to ask in the comments.