HBase Security: Kerberos Authentication & Authorization

In this article, “HBase Security: Authentication & Authorization”, we will learn how Kerberos is used with Hadoop and HBase to provide user authentication, and how HBase authorization grants users permissions for particular actions on a specified set of data.

We will also discuss HDFS and ZooKeeper SASL, the HBase ACL, and the HBase shell commands used for security.

So, let’s explore the HBase Security tutorial.

HBase Security: Authentication & Authorization

HBase security means protecting HBase against sniffers, unauthenticated or unauthorized users, and network-based attacks. However, it cannot protect against authorized users, for example one who accidentally deletes all the data.

HBase can be configured to provide user authentication, which ensures that only authenticated users can communicate with HBase. The authentication system is implemented at the RPC level and is based on the Simple Authentication and Security Layer (SASL), which supports Kerberos among other mechanisms.

SASL allows authentication, encryption negotiation, and/or message integrity verification on a per-connection basis.

After enabling User Authentication, the next step is to give an admin the ability to define a series of User Authorization rules which allow or deny particular actions.

The authorization system, also known as the Access Controller coprocessor or Access Control List (ACL), is available from HBase 0.92 (CDH4) onward.

It lets an admin define the authorization policy (Read/Write/Create/Admin) for a specified user, with table/family/qualifier granularity.

Kerberos in HBase Security

Kerberos is a network authentication protocol. It uses secret-key cryptography to provide strong authentication for client/server applications.

The Kerberos protocol uses strong cryptography (AES, 3DES, …) so that a client can prove its identity to a server (and vice versa) across an insecure network connection.

Once a client and server have used Kerberos to prove their identities, they can also encrypt all of their communications to ensure privacy and data integrity.

i. Ticket exchange protocol

At a high level, accessing a service with Kerberos involves three steps:

a. Kerberos Authentication

First, the HBase client authenticates itself to the Kerberos Authentication Server and receives a Ticket Granting Ticket (TGT).

b. Kerberos Authorization

Next, the client requests a service ticket from the Ticket Granting Server. If the TGT sent with the request is valid, the Ticket Granting Server issues a service ticket and a session key.

c. Service Request

Finally, the client uses the service ticket to authenticate itself to the server that provides the service it wants to use (e.g. HDFS, HBase, …).
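
The three steps above can be modeled with a small Python sketch. It is a toy illustration only, not the real Kerberos protocol: tickets are signed with an HMAC instead of being encrypted, timestamps and expiry are left out, and every name in it (`issue_tgt`, `issue_service_ticket`, `verify_service_ticket`, the keys, and the principal) is hypothetical.

```python
import hashlib
import hmac
import json
import secrets

# Hypothetical long-term keys. In real Kerberos these live in the KDC database
# and in service keytabs; here they are just random bytes.
TGS_KEY = secrets.token_bytes(32)    # shared by the Authentication Server and the TGS
HBASE_KEY = secrets.token_bytes(32)  # shared by the TGS and the HBase service


def _seal(payload: dict, key: bytes) -> dict:
    """Attach an HMAC so only a holder of `key` can create or verify the ticket."""
    body = json.dumps(payload, sort_keys=True).encode()
    return {"body": payload, "mac": hmac.new(key, body, hashlib.sha256).hexdigest()}


def _is_valid(ticket: dict, key: bytes) -> bool:
    body = json.dumps(ticket["body"], sort_keys=True).encode()
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, ticket["mac"])


def issue_tgt(principal: str) -> dict:
    """Step a: the Authentication Server hands out a Ticket Granting Ticket."""
    return _seal({"principal": principal, "type": "TGT"}, TGS_KEY)


def issue_service_ticket(tgt: dict, service: str):
    """Step b: the TGS checks the TGT, then issues a service ticket and a session key."""
    if not _is_valid(tgt, TGS_KEY):
        raise PermissionError("invalid TGT")
    session_key = secrets.token_hex(16)
    payload = {
        "principal": tgt["body"]["principal"],
        "service": service,
        "session_key": session_key,  # real Kerberos encrypts this; the toy does not
    }
    return _seal(payload, HBASE_KEY), session_key


def verify_service_ticket(ticket: dict, service: str) -> str:
    """Step c: the service validates the ticket and learns who the client is."""
    if not _is_valid(ticket, HBASE_KEY) or ticket["body"]["service"] != service:
        raise PermissionError("invalid service ticket")
    return ticket["body"]["principal"]


tgt = issue_tgt("alice@EXAMPLE.COM")
ticket, session_key = issue_service_ticket(tgt, "hbase")
print(verify_service_ticket(ticket, "hbase"))
```

The point of the sketch is the chain of trust: the service never talks to the Authentication Server, it only checks that the ticket was produced by someone holding the key it shares with the Ticket Granting Server.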

HBase, HDFS, ZooKeeper SASL

Since HBase depends on HDFS and ZooKeeper, a secure HBase relies on a secure HDFS and a secure ZooKeeper. This means that the HBase servers need to create a secure service session, as described above, to communicate with HDFS and ZooKeeper.

All the files written by HBase are stored in HDFS. As in Unix filesystems, the access control provided by HDFS is based on users, groups, and permissions.

Similarly, ZooKeeper has an Access Control List (ACL) on each znode that allows read/write access to users based on user information.
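
To make the Unix-style model concrete, the following Python sketch checks a permission the way a classic owner/group/other mode does. It is an illustration under stated assumptions, not HDFS code: the function name `can_access`, the users, the group, and the mode `0o750` are all hypothetical.

```python
def can_access(mode: int, owner: str, group: str,
               user: str, user_groups: set, want: str) -> bool:
    """Check one of 'r', 'w', 'x' against a Unix-style mode such as 0o750."""
    bit = {"r": 4, "w": 2, "x": 1}[want]
    if user == owner:
        shift = 6  # owner bits
    elif group in user_groups:
        shift = 3  # group bits
    else:
        shift = 0  # "other" bits
    return bool((mode >> shift) & bit)


# Hypothetical directory owned by user "hbase", group "hadoop", mode rwxr-x---.
print(can_access(0o750, "hbase", "hadoop", "hbase", {"hbase"}, "w"))   # True
print(can_access(0o750, "hbase", "hadoop", "alice", {"hadoop"}, "w"))  # False
print(can_access(0o750, "hbase", "hadoop", "bob", {"staff"}, "r"))     # False
```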

HBase ACL

Once our users are authenticated via Kerberos, we can be sure that the username we receive belongs to one of our trusted users.

Sometimes, however, this is not enough granularity, for example when we want to control whether a specified user is able to read or write a table. For that, HBase offers an authorization mechanism that restricts access for specific users.

To enable this feature, we must enable the Access Controller coprocessor by adding it to hbase-site.xml under the master and region server coprocessor classes. A coprocessor is code that runs inside each HBase Region Server and/or Master.
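
As a rough sketch of what that configuration might look like, the Python snippet below builds the relevant hbase-site.xml properties and prints them. The property names and the `AccessController` class name are the ones commonly documented for HBase, but treat them as assumptions and verify them against the reference guide for your version, since a secure cluster usually needs further settings that are not shown here.

```python
import xml.etree.ElementTree as ET

# Assumed class name of the Access Controller coprocessor.
ACCESS_CONTROLLER = "org.apache.hadoop.hbase.security.access.AccessController"

# Assumed property names; check them against your HBase version.
properties = {
    "hbase.security.authorization": "true",
    "hbase.coprocessor.master.classes": ACCESS_CONTROLLER,
    "hbase.coprocessor.region.classes": ACCESS_CONTROLLER,
}

configuration = ET.Element("configuration")
for name, value in properties.items():
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value

# ET.indent exists from Python 3.9; older versions print everything on one line.
if hasattr(ET, "indent"):
    ET.indent(configuration)

xml_fragment = ET.tostring(configuration, encoding="unicode")
print(xml_fragment)
```

The printed `<property>` elements are meant to be merged into the existing `<configuration>` section of hbase-site.xml on the master and region servers, followed by a restart.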

  • Rights management and _acl_ table

To manage user rights, the HBase shell provides a couple of commands that an admin can use:

grant <user> <rights> [table] [family] [qualifier]
revoke <user> [table] [family] [qualifier]

An admin can also restrict user access based on the table schema:

  • Give User-W only read rights to Table-X/Family-Y
(grant 'User-W', 'R', 'Table-X', 'Family-Y')
  • Give User-W full read/write rights to Qualifier-Z
(grant 'User-W', 'RW', 'Table-X', 'Family-Y', 'Qualifier-Z')

In addition, an admin can grant global rights, which operate at the cluster level, such as creating tables, balancing regions, or shutting down the cluster:

  • Give User-W the ability to create tables
(grant 'User-W', 'C')
  • Give User-W the ability to manage the cluster
(grant 'User-W', 'A')
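
The way these scopes combine can be sketched in a few lines of Python. This is a simplified model, not the Access Controller's actual logic: it assumes that a grant at a broader scope (global, table, family) also covers everything beneath it, and the `grant` and `is_allowed` helpers are hypothetical.

```python
# grants maps (user, table, family, qualifier) to a set of permission letters.
# None means "not restricted at this level", so (user, None, None, None) is global.
grants = {}


def grant(user, perms, table=None, family=None, qualifier=None):
    """Record a grant at the given scope, replacing any earlier grant there."""
    grants[(user, table, family, qualifier)] = set(perms)


def is_allowed(user, action, table, family, qualifier):
    """Allow the action if any scope from global down to the qualifier grants it."""
    scopes = [
        (user, None, None, None),          # global
        (user, table, None, None),         # whole table
        (user, table, family, None),       # whole column family
        (user, table, family, qualifier),  # single qualifier
    ]
    return any(action in grants.get(scope, set()) for scope in scopes)


grant("User-W", "R", "Table-X", "Family-Y")
grant("User-W", "RW", "Table-X", "Family-Y", "Qualifier-Z")

print(is_allowed("User-W", "R", "Table-X", "Family-Y", "Other"))        # True
print(is_allowed("User-W", "W", "Table-X", "Family-Y", "Other"))        # False
print(is_allowed("User-W", "W", "Table-X", "Family-Y", "Qualifier-Z"))  # True
```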

All the permissions are stored in a table called _acl_, which is created by the Access Controller coprocessor. The row key of this table is the table name specified in the grant command.

The _acl_ table has just one column family, and each qualifier describes the granularity of rights for a particular table/user.
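
To picture that layout, the sketch below maps one grant to a single cell of the _acl_ table. The exact encoding is an assumption made for illustration: the family name `l`, the comma-separated qualifier, and the helper `acl_cell` are hypothetical and may not match what HBase actually writes.

```python
ACL_FAMILY = "l"  # assumed name of the single column family


def acl_cell(user, rights, table, family=None, qualifier=None):
    """Return (row_key, column, value) for one grant on a table."""
    # The qualifier carries the user plus the optional family and qualifier,
    # which is how it can express the granularity of the grant.
    parts = [user] + [p for p in (family, qualifier) if p is not None]
    return table, ACL_FAMILY + ":" + ",".join(parts), rights


print(acl_cell("User-W", "R", "Table-X", "Family-Y"))
print(acl_cell("User-W", "RW", "Table-X", "Family-Y", "Qualifier-Z"))
```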

  • Access Controller under the hood

The Access Controller coprocessor uses its ability to intercept each user request and checks whether the user has the rights to execute the operation. To do so, it needs to query the _acl_ table for each operation.

However, this lookup can have a negative impact on performance. The solution is to use the _acl_ table for persistence and ZooKeeper to speed up the rights lookup.
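
The idea of keeping a persistent store plus a fast lookup path can be sketched as follows. This is only a model of the pattern: a plain dictionary stands in for the ZooKeeper-backed copy of the permissions, and the classes `AclStore` and `CachedAccessChecker` are hypothetical.

```python
class AclStore:
    """Stand-in for the persistent _acl_ table; counts how often it is read."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def put(self, user, table, rights):
        self.rows[(user, table)] = set(rights)

    def get(self, user, table):
        self.reads += 1
        return self.rows.get((user, table), set())


class CachedAccessChecker:
    """Answers permission checks from memory, falling back to the store once."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def grant(self, user, table, rights):
        self.store.put(user, table, rights)      # persist first
        self.cache[(user, table)] = set(rights)  # then refresh the cached copy

    def is_allowed(self, user, table, action):
        key = (user, table)
        if key not in self.cache:
            self.cache[key] = self.store.get(user, table)
        return action in self.cache[key]


store = AclStore()
checker = CachedAccessChecker(store)
checker.grant("User-W", "Table-X", "R")
print(checker.is_allowed("User-W", "Table-X", "R"))  # True
print(store.reads)                                   # 0
```

Because every grant updates the cached copy as well as the store, repeated checks for the same user and table never touch the persistent table again.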

Commands for HBase Security

i. grant

This command grants specific rights, such as read, write, execute, and admin, on a table to a certain user.
The syntax for grant:

hbase> grant <user> <permissions> [<table> [<column family> [<column qualifier>]]]

We can grant zero or more privileges to a user from the set RWXCA, where:
R – read privilege
W – write privilege
X – execute privilege
C – create privilege
A – admin privilege
For example, here we grant all the privileges to a user named ‘Dataflair’.

hbase(main):018:0> grant 'Dataflair', 'RWXCA'
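
When such permission strings are built by scripts, it can help to validate them before passing them to the shell. The helper below is a hypothetical convenience, not part of HBase: it expands a string of RWXCA letters into privilege names and rejects unknown letters.

```python
PRIVILEGES = {
    "R": "read",
    "W": "write",
    "X": "execute",
    "C": "create",
    "A": "admin",
}


def describe_permissions(spec: str) -> list:
    """Expand a string such as 'RW' into privilege names, in RWXCA order."""
    unknown = sorted(set(spec) - set(PRIVILEGES))
    if unknown:
        raise ValueError(f"unknown privilege letters: {unknown}")
    return [PRIVILEGES[letter] for letter in PRIVILEGES if letter in spec]


print(describe_permissions("RWXCA"))
print(describe_permissions("WR"))
```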

ii. revoke

We use the revoke command to revoke a user’s access rights on a table.
The syntax for revoke:

hbase> revoke <user>

For example, the command below revokes all the permissions from the user named ‘Dataflair’.

hbase(main):006:0> revoke 'Dataflair'

iii. user_permission

We use this command to list all the user permissions for a particular table.
The syntax for user_permission:

hbase> user_permission 'tablename'

For example, the command below lists all the user permissions of the ‘emp’ table.

hbase(main):013:0> user_permission 'emp'

So, this was all about HBase Security. We hope you liked our explanation.

Conclusion: HBase Security

In this HBase Security tutorial, we have seen that security in HBase adds two features. The first protects our data against sniffers and other network attacks by using Kerberos to authenticate users and encrypt communications between services. The second lets an admin define user authorization rules that allow or deny particular actions for specific users.

We have also seen the main HBase shell commands used for security: grant, revoke, and user_permission. If you have any doubts regarding HBase Security, ask in the comment section.
