

{"id":16276,"date":"2018-06-13T04:20:22","date_gmt":"2018-06-13T04:20:22","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=16276"},"modified":"2018-06-13T04:20:22","modified_gmt":"2018-06-13T04:20:22","slug":"hbase-architecture","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/","title":{"rendered":"HBase Architecture &#8211; Regions, Hmaster, Zookeeper"},"content":{"rendered":"<p><span style=\"font-weight: 400\">In this <strong>HBase tutorial<\/strong>, we will learn about HBase Architecture. We will look at the 3 major components of HBase: HMaster, Region Server, and ZooKeeper. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Along with this, we will see how the HBase components work together, the <strong>HBase MemStore<\/strong>, and HBase compaction. This tutorial also covers the advantages and limitations of HBase Architecture.<\/span><\/p>\n<p>So, let&#8217;s start with HBase Architecture.<\/p>\n<h2><span style=\"font-weight: 400\">What is HBase Architecture?<\/span><\/h2>\n<p><span style=\"font-weight: 400\">HBase Architecture follows a master-slave design with 3 types of servers: HBase HMaster, Region Server, and <strong>ZooKeeper<\/strong>. Let\u2019s start with Region Servers: these servers serve data for reads and writes. <\/span><\/p>\n<p><span style=\"font-weight: 400\">That means clients communicate directly with HBase Region Servers when accessing data. The HBase Master process handles region assignment as well as DDL (create, delete tables) operations. 
Finally, ZooKeeper, which is a separate coordination service and not a part of <strong>HDFS<\/strong>, maintains the live cluster state.<\/span><\/p>\n<div id=\"attachment_16284\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Components.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16284\" class=\"wp-image-16284 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Components.png\" alt=\"HBase Architecture\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Components.png 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Components-150x79.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Components-300x157.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Components-768x402.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Components-1024x536.png 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-16284\" class=\"wp-caption-text\">What is HBase Architecture<\/p><\/div>\n<p><span style=\"font-weight: 400\">In addition, the data that a Region Server manages is stored on the Hadoop DataNodes, and all HBase data is stored in <strong>HDFS<\/strong> files. <\/span><span style=\"font-weight: 400\">Region Servers are collocated with the HDFS DataNodes, which enables data locality for the data they serve. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Here, data locality means keeping the data close to where it is needed. 
Note that HBase data is local when it is written, but after a region moves, its data is not local until a compaction rewrites its files.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Moreover, the NameNode maintains metadata for all the physical data blocks that make up the files.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">HBase Architecture &#8211; Regions<\/span><\/h2>\n<p><span style=\"font-weight: 400\">In HBase Architecture, a region consists of all the rows between the start key and the end key assigned to that region. Regions are assigned to nodes in the HBase cluster, which we call \u201cRegion Servers\u201d. <\/span><\/p>\n<p><span style=\"font-weight: 400\">These servers serve the data for reads and writes. A single Region Server can serve approximately 1,000 regions. Within each region, HBase keeps the rows sorted by row key.<\/span><\/p>\n<div id=\"attachment_16285\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Regions.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16285\" class=\"wp-image-16285 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Regions.png\" alt=\"HBase Architecture\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Regions.png 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Regions-150x79.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Regions-300x157.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Regions-768x402.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Regions-1024x536.png 1024w\" sizes=\"auto, (max-width: 
1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-16285\" class=\"wp-caption-text\">HBase Architecture &#8211; Regions<\/p><\/div>\n<p><span style=\"font-weight: 400\">A Region Server is responsible for handling, managing, and executing <strong>HBase read and write operations<\/strong> on the set of regions assigned to it. The default maximum size of a region was 256MB in early HBase releases (later releases use a much larger default), and we can configure it as per requirement.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">HBase Architecture &#8211; HMaster<\/span><\/h2>\n<p><span style=\"font-weight: 400\">The HBase Master (HMaster) is responsible for region assignment as well as DDL (create, delete tables) operations.<\/span><\/p>\n<p><span style=\"font-weight: 400\">There are two main responsibilities of a master in HBase architecture:<\/span><\/p>\n<div id=\"attachment_16286\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Hmaster.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16286\" class=\"wp-image-16286 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Hmaster.png\" alt=\"HBase Architecture\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Hmaster.png 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Hmaster-150x79.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Hmaster-300x157.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Hmaster-768x402.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/HBase-Hmaster-1024x536.png 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-16286\" 
class=\"wp-caption-text\">The Architecture of HBase &#8211; HMaster<\/p><\/div>\n<p><strong>a. Coordinating the region servers<\/strong><br \/>\n<span style=\"font-weight: 400\">The master assigns regions on startup. It also re-assigns regions for recovery or load balancing.<\/span><br \/>\n<span style=\"font-weight: 400\">The master also monitors all RegionServer instances in the HBase cluster.<\/span><\/p>\n<p><strong>b. Admin functions<\/strong><br \/>\n<span style=\"font-weight: 400\">It acts as the interface for creating, deleting, and updating tables in HBase.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">ZooKeeper in HBase Architecture<\/span><\/h2>\n<p><span style=\"font-weight: 400\">HBase uses ZooKeeper as a distributed coordination service to maintain server state in the HBase cluster. <\/span><\/p>\n<p><span style=\"font-weight: 400\">ZooKeeper tracks which servers are alive and available, and it provides server failure notifications. To guarantee a common shared state, ZooKeeper uses consensus. 
<\/span><\/p>\n<div id=\"attachment_16287\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Zookeeper.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16287\" class=\"wp-image-16287 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Zookeeper.png\" alt=\"HBase Architecture\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Zookeeper.png 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Zookeeper-150x79.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Zookeeper-300x157.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Zookeeper-768x402.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Zookeeper-1024x536.png 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-16287\" class=\"wp-caption-text\">HBase Architecture &#8211; Zookeeper<\/p><\/div>\n<h2><span style=\"font-weight: 400\">How Do HBase Components Work?<\/span><\/h2>\n<p><span style=\"font-weight: 400\">As we know, HBase uses ZooKeeper to coordinate shared state information among the members of the distributed system. The active HMaster and the Region Servers each connect to ZooKeeper with a session. For active sessions, ZooKeeper maintains ephemeral nodes by using heartbeats. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">An ephemeral node is a znode that exists only as long as the session that created it is active; the znode is deleted when the session ends.<\/span><\/p>\n<div id=\"attachment_16300\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/How-the-Components-Work-Together.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16300\" class=\"wp-image-16300 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/How-the-Components-Work-Together.png\" alt=\"HBase Architecture\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/How-the-Components-Work-Together.png 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/How-the-Components-Work-Together-150x79.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/How-the-Components-Work-Together-300x157.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/How-the-Components-Work-Together-768x402.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/How-the-Components-Work-Together-1024x536.png 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-16300\" class=\"wp-caption-text\">HBase Architecture &#8211; working of Components<\/p><\/div>\n<p><span style=\"font-weight: 400\">In addition, each Region Server in HBase Architecture creates an ephemeral node. The HMaster monitors these nodes to discover available Region Servers. <\/span><\/p>\n<p><span style=\"font-weight: 400\">It also monitors these nodes for server failures. HMasters compete to create an ephemeral node, and to make sure that only one master is active, ZooKeeper treats the first one to succeed as the active master. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">The active HMaster sends heartbeats to ZooKeeper, while the inactive HMaster listens for notifications that the active HMaster has failed.<\/span><\/p>\n<p><span style=\"font-weight: 400\">If a Region Server or the active HMaster fails to send a heartbeat, its session expires and the corresponding ephemeral node is deleted. Listeners for updates are then notified of the deleted nodes. <\/span><\/p>\n<p><span style=\"font-weight: 400\">The active HMaster listens for Region Servers and recovers their regions when one fails. Likewise, the inactive HMaster listens for the failure of the active HMaster, and if the active HMaster fails, the inactive HMaster becomes active.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">HBase Architecture &#8211; Read or Write<\/span><\/h2>\n<p><span style=\"font-weight: 400\">The first time a client <strong>reads or writes to HBase<\/strong>:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">The client gets the Region Server that hosts the META table from ZooKeeper.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">The client then queries that .META. server to get the Region Server corresponding to the row key it wants to access. The client caches this information along with the META table location.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Finally, it gets the row from the corresponding Region Server.<\/span><\/li>\n<\/ul>\n<h2><span style=\"font-weight: 400\">HBase META Table<\/span><\/h2>\n<p><span style=\"font-weight: 400\">The META table is a special HBase catalog table. 
It holds the location of the regions in the HBase cluster.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">It keeps a list of all regions in the system.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">The structure of the .META. table is as follows:<\/span><\/li>\n<\/ul>\n<ol>\n<li><span style=\"font-weight: 400\"><strong>Key:<\/strong> region start key, region id<\/span><\/li>\n<li><span style=\"font-weight: 400\"><strong>Values:<\/strong> RegionServer<\/span><\/li>\n<\/ol>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">It is like a B-tree.<\/span><\/li>\n<\/ul>\n<h2><span style=\"font-weight: 400\">Region Server Components in HBase Architecture<\/span><\/h2>\n<p><span style=\"font-weight: 400\">A Region Server runs on an HDFS DataNode and has the following components:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><strong>WAL<\/strong><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The Write Ahead Log (WAL) is a file on the distributed file system. It stores new data that has not yet been persisted to permanent storage, and it is used for recovery in the case of failure.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><strong>BlockCache<\/strong><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">It is the read cache. It stores frequently read data in memory. When the cache is full, the least recently used data is evicted.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><strong>MemStore<\/strong><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">It is the write cache. It stores new data that has not yet been written to disk. 
The data is sorted before it is written to disk.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><strong>HFiles<\/strong><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">These files store the rows as sorted KeyValues on disk.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">HBase Write Steps (1)<\/span><\/h2>\n<p><span style=\"font-weight: 400\">When the client issues a put request, the first step is to write the data to the write-ahead log (WAL):<\/span><br \/>\n<span style=\"font-weight: 400\">&#8211; All edits are appended to the end of the WAL file, which is stored on disk.<\/span><br \/>\n<span style=\"font-weight: 400\">&#8211; If a server crashes, the WAL is used to recover not-yet-persisted data.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">HBase Write Steps (2)<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Once the data is written to the WAL, it is placed in the MemStore. After that, the put request acknowledgment returns to the client.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">HBase MemStore<\/span><\/h2>\n<p><span style=\"font-weight: 400\">The MemStore stores updates in memory as sorted KeyValues, the same way they would be stored in an HFile. There is one MemStore per column family. 
The updates are sorted per column family.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Compaction in HBase Architecture<\/span><\/h2>\n<div id=\"attachment_16288\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Compaction.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16288\" class=\"wp-image-16288 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Compaction.png\" alt=\"HBase Architecture\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Compaction.png 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Compaction-150x79.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Compaction-300x157.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Compaction-768x402.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Compaction-1024x536.png 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-16288\" class=\"wp-caption-text\">Compaction in HBase Architecture<\/p><\/div>\n<p><span style=\"font-weight: 400\">To reduce storage and the number of disk seeks needed for a read, HBase combines HFiles. This process is called compaction. It selects a few HFiles from a region and combines them. There are two types of compaction:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><strong>Minor Compaction<\/strong><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">As you can see in the image, HBase automatically picks smaller HFiles and rewrites them into fewer, bigger HFiles. This process is called minor compaction. It performs a merge sort to combine the smaller HFiles into bigger ones. 
<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><strong>Major Compaction<\/strong><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">In major compaction, as you can see in the image, HBase merges and rewrites all the HFiles of a region into one new HFile per column family. In this process, it drops deleted as well as expired cells.<\/span><\/p>\n<p><span style=\"font-weight: 400\">However, disk I\/O and network traffic may become congested during this process. Hence, it is generally scheduled during periods of low load.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Region Split in HBase<\/span><\/h2>\n<div id=\"attachment_16289\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Region-Split.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16289\" class=\"wp-image-16289 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Region-Split.png\" alt=\"HBase Architecture\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Region-Split.png 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Region-Split-150x79.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Region-Split-300x157.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Region-Split-768x402.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Region-Split-1024x536.png 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-16289\" class=\"wp-caption-text\">Region Split in HBase<\/p><\/div>\n<p><span style=\"font-weight: 400\">Whenever a region becomes too large, it splits into two child regions. 
Each child region represents exactly half of the parent region. The split is then reported to the HMaster. <\/span><\/p>\n<p><span style=\"font-weight: 400\">However, the same Region Server continues to serve both child regions until the HMaster allocates them to a new Region Server for load balancing.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">HDFS Data Replication<\/span><\/h2>\n<p><span style=\"font-weight: 400\">All writes and reads go to and from the primary node. HDFS replicates the write-ahead logs as well as the HFile blocks, and this replication happens automatically. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Moreover, HBase relies on HDFS to provide data safety because HDFS stores its files. <\/span><br \/>\n<span style=\"font-weight: 400\">When data is written to HDFS, one copy is written locally. It is then replicated to a secondary node, and a third copy is written to a tertiary node.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">HBase Crash Recovery<\/span><\/h2>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Whenever a Region Server fails, ZooKeeper notifies the HMaster about the failure.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">The HMaster then distributes and allocates the regions of the crashed Region Server to many active Region Servers. 
Also, to recover the MemStore data of the failed Region Server, the HMaster distributes the WAL to all the Region Servers.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Each Region Server then re-executes the WAL to rebuild the MemStore for the failed region\u2019s column family.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Re-executing the WAL means re-applying all the changes that were made and stored in the MemStore; this works because the data in the WAL is written in chronological order. <\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Therefore, the MemStore data for all column families is recovered once all the Region Servers have executed the WAL.<\/span><\/li>\n<\/ul>\n<h2><span style=\"font-weight: 400\">Advantages of HBase Architecture<\/span><\/h2>\n<p><span style=\"font-weight: 400\">HBase Architecture offers the following benefits:<\/span><br \/>\n<strong>a. Strong consistency model<\/strong><br \/>\n<span style=\"font-weight: 400\">&#8211; When a write returns, all readers will see the same value.<\/span><br \/>\n<strong>b. Scales automatically<\/strong><br \/>\n<span style=\"font-weight: 400\">&#8211; Regions split automatically when data grows too large.<\/span><br \/>\n<span style=\"font-weight: 400\">&#8211; It uses HDFS to spread and replicate data.<\/span><br \/>\n<strong>c. Built-in recovery<\/strong><br \/>\n<span style=\"font-weight: 400\">&#8211; It uses the Write Ahead Log for recovery.<\/span><br \/>\n<strong>d. Integrated with Hadoop<\/strong><br \/>\n<span style=\"font-weight: 400\">&#8211; MapReduce on HBase is straightforward.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Limitations With Apache HBase<\/span><\/h2>\n<p><strong>a. 
Business continuity reliability<\/strong><br \/>\n<span style=\"font-weight: 400\">&#8211; Write Ahead Log replay is very slow.<\/span><br \/>\n<span style=\"font-weight: 400\">&#8211; Crash recovery is slow and complex.<\/span><br \/>\n<span style=\"font-weight: 400\">&#8211; Major compaction causes I\/O storms.<\/span><\/p>\n<p>So, this was all about HBase Architecture. Hope you like our explanation.<\/p>\n<h2><span style=\"font-weight: 400\">Conclusion &#8211; HBase Architecture<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Hence, in this HBase architecture tutorial, we saw the whole concept of HBase Architecture. Moreover, we saw the 3 HBase components: Region Server, HMaster, and ZooKeeper.<\/span><\/p>\n<p><span style=\"font-weight: 400\">We also discussed the advantages &amp; limitations of HBase Architecture. So, if you have any doubt regarding HBase Architecture, feel free to ask through the comment tab.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this HBase tutorial, we will learn the concept of HBase Architecture.\u00a0Moreover, we will see the 3 major components of HBase, such as HMaster, Region Server, and ZooKeeper. 
Along with this, we will see&#46;&#46;&#46;<\/p>\n","protected":false},"author":7,"featured_media":18261,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[23],"tags":[341,1074,2701,5393,5416,5427,5433,5455,5501,5559,8279,11468,11469,16433],"class_list":["post-16276","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-hbase","tag-advantages-of-hbase-architecture","tag-architecture-in-hbase","tag-compaction","tag-hbase-architecture","tag-hbase-crash-recovery","tag-hbase-first-read-or-write","tag-hbase-hmaster","tag-hbase-meta-table","tag-hbase-write-steps","tag-hdfs-data-replication","tag-limitations-with-apache-hbase","tag-region-server-components","tag-region-split-in-hbase","tag-zookeeper-the-coordinator"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>HBase Architecture - Regions, Hmaster, Zookeeper - DataFlair<\/title>\n<meta name=\"description\" content=\"HBase Architecture, what is HBase architecture, Regions, Hmaster, HBase Zookeeper, HBase meta data,Advantages of HBase Architecture.disadvantages of HBase\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/hbase-architecture\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"HBase Architecture - Regions, Hmaster, Zookeeper - DataFlair\" \/>\n<meta property=\"og:description\" content=\"HBase Architecture, what is HBase architecture, Regions, Hmaster, HBase Zookeeper, HBase meta data,Advantages of HBase Architecture.disadvantages of HBase\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/hbase-architecture\/\" \/>\n<meta 
property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2018-06-13T04:20:22+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/06\/HBase-Architecture-01.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"HBase Architecture - Regions, Hmaster, Zookeeper - DataFlair","description":"HBase Architecture, what is HBase architecture, Regions, Hmaster, HBase Zookeeper, HBase meta data,Advantages of HBase Architecture.disadvantages of HBase","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/","og_locale":"en_US","og_type":"article","og_title":"HBase Architecture - Regions, Hmaster, Zookeeper - DataFlair","og_description":"HBase Architecture, what is HBase architecture, Regions, Hmaster, HBase Zookeeper, HBase meta data,Advantages of HBase Architecture.disadvantages of HBase","og_url":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2018-06-13T04:20:22+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/06\/HBase-Architecture-01.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd"},"headline":"HBase Architecture &#8211; Regions, Hmaster, Zookeeper","datePublished":"2018-06-13T04:20:22+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/"},"wordCount":1798,"commentCount":2,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/06\/HBase-Architecture-01.jpg","keywords":["Advantages of HBase Architecture","architecture in HBase","Compaction","hbase architecture","HBase Crash Recovery","HBase First Read or Write","HBase HMaster","HBase Meta Table","HBase Write Steps","HDFS Data Replication","Limitations with Apache HBase","Region Server Components","Region Split in HBase","ZooKeeper: The Coordinator"],"articleSection":["HBase Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/hbase-architecture\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/","url":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/","name":"HBase Architecture - Regions, Hmaster, Zookeeper - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/06\/HBase-Architecture-01.jpg","datePublished":"2018-06-13T04:20:22+00:00","description":"HBase Architecture, what is HBase architecture, Regions, Hmaster, HBase Zookeeper, HBase meta data,Advantages of HBase Architecture.disadvantages of HBase","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/hbase-architecture\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/06\/HBase-Architecture-01.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/06\/HBase-Architecture-01.jpg","width":1200,"height":628,"caption":"HBase Architecture - Regions, Hmaster, Zookeeper"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/hbase-architecture\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"HBase Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/hbase\/"},{"@type":"ListItem","position":3,"name":"HBase Architecture &#8211; Regions, Hmaster, Zookeeper"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. 
Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"DataFlair Team specializes in creating clear, actionable content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Backed by industry expertise, we make learning easy and career-oriented for beginners and pros alike.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam3\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/16276","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=16276"}],"version-history":[{"count":0,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/16276\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/18261"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=16276"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=16276"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=16276"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}