Impala Security – Latest Impala Security Guidelines for 2019

1. Impala Security With Guidelines

In our last tutorial, we studied Impala SQL. Today, we talk about Impala Security, which is essential to understand while working with Impala. We will discuss the categories of security features and then go through the security guidelines for Impala in detail.

So, let’s start the Impala Security tutorial.

2. What is Impala Security?

Impala includes a fine-grained authorization framework for Hadoop, based on the open source Sentry project. Sentry authorization was added in Impala 1.1.0. Together with the Kerberos authentication framework, Sentry takes Hadoop security to the level required by highly regulated industries such as healthcare, financial services, and government.

Moreover, Impala has an auditing capability. Impala generates the audit data, the Cloudera Navigator product consolidates the audit data from all nodes in the cluster, and Cloudera Manager lets you filter, visualize, and produce reports.

Impala security features have several objectives. They prevent accidents or mistakes that could disrupt application processing, delete or corrupt data, or reveal data to unauthorized users. They harden the system against malicious users trying to gain unauthorized access or perform other disallowed operations. Finally, the auditing feature provides a way to confirm that no unauthorized access occurred and to detect any such attempts.

For production deployments in large organizations that handle important or sensitive data, this is a critical set of features. It also sets the stage for multi-tenancy, where multiple applications run concurrently and are prevented from interfering with each other.

3. Category of Impala Security Features

These security features fall into 3 broad categories:

  1. Authorization
  2. Authentication
  3. Auditing

a. Authorization

For authorization, Impala relies on the open source Sentry project. When authorization is not enabled, Impala performs all read and write operations with the privileges of the impala user, which is suitable for a development/test environment but not for a secure production environment. When authorization is enabled, Impala uses the OS user ID of the user who runs impala-shell or another client program, and associates various privileges with each user.
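To make the idea of associating privileges with users more concrete, here is a minimal Python sketch that renders a Sentry-style policy file with a [groups] section (group to roles) and a [roles] section (role to privileges). The group name analysts, the role name sales_read, the server name server1, and the database sales are all hypothetical placeholders, and the helper render_policy is ours, not part of Impala or Sentry.

```python
# Sketch only: build the text of a Sentry-style policy file.
# Every name below (analysts, sales_read, server1, sales) is a made-up placeholder.

def render_policy(groups, roles):
    """Return policy-file text with a [groups] and a [roles] section."""
    lines = ["[groups]"]
    for group, role_names in groups.items():
        # Each OS/Hadoop group maps to one or more roles.
        lines.append(f"{group} = {', '.join(role_names)}")
    lines.append("")
    lines.append("[roles]")
    for role, privileges in roles.items():
        # Each role maps to one or more privilege strings.
        lines.append(f"{role} = {', '.join(privileges)}")
    return "\n".join(lines) + "\n"

groups = {"analysts": ["sales_read"]}
roles = {"sales_read": ["server=server1->db=sales->table=*->action=SELECT"]}

policy_text = render_policy(groups, roles)
print(policy_text)
```

In this sketch, members of the analysts group would be allowed to run SELECT on every table in the sales database and nothing else. A real deployment would choose its own names and keep the file where the Impala daemons can read it.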

b. Authentication

For authentication, Impala relies on the Kerberos subsystem.
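As a small illustration of what this looks like from the client side, the Python sketch below builds an impala-shell command line that asks for Kerberos authentication with the -k option. The host name impalad-1.example.com, the port 21000, and the helper kerberized_shell_command are assumptions for the example, and the user is assumed to already hold a valid Kerberos ticket (for instance from kinit).

```python
import shlex

def kerberized_shell_command(impalad_host, query, port=21000):
    """Build an impala-shell argument list that requests Kerberos auth (-k)."""
    return [
        "impala-shell",
        "-k",                            # authenticate with Kerberos
        "-i", f"{impalad_host}:{port}",  # which impalad to connect to
        "-q", query,                     # run one query and exit
    ]

# Hypothetical host name; replace with a real Impala node.
cmd = kerberized_shell_command("impalad-1.example.com", "SHOW DATABASES")
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run on a machine where impala-shell is installed; the sketch only prints the command so that it runs anywhere.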

c. Auditing

This feature provides a way to look back and diagnose whether there were any attempts to perform unauthorized operations. We can use this information to track down suspicious activity and to see where authorization policies need to change. The audit data produced by this feature is collected by the Cloudera Manager product, which then presents it in a user-friendly form. This feature was added in Impala 1.1.1.

4. Security Guidelines for Impala

To harden a cluster running Impala against accidents and mistakes, and against malicious attackers trying to access sensitive data, follow these steps:

  • Secure the root account. The root user can tamper with the impalad daemon, read and write the data files in HDFS, log into other user accounts, and access other system services that are beyond the control of Impala.
  • Restrict membership in the sudoers list (in the /etc/sudoers file). Users who can run the sudo command can do many of the same things as the root user.

  • Ensure that the Hadoop ownership and permissions for Impala data files are restricted.
  • Ensure that the Hadoop ownership and permissions for Impala log files are restricted.
  • Ensure that the Impala web UI (available by default on port 25000 on each Impala node) is password-protected.
  • Create a policy file that specifies which Impala privileges are available to users in particular Hadoop groups. If necessary, create the associated Linux groups using the groupadd command.
  • The Impala authorization feature builds on the HDFS file ownership and permissions mechanism, so it helps to know that mechanism as background. If necessary, create the associated Linux users using the useradd command, and add them to the appropriate groups with the usermod command.
  • Design your databases, tables, and views so that policy rules can stay simple and consistent. For example, if all tables for an application live in a single database, a privilege on that database can cover all of them.
  • Enable authorization by running the Impala daemons with the -server_name and -authorization_policy_file options on all nodes.
  • Set up authentication using Kerberos to make sure users really are who they say they are.
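Two of these guidelines can be sketched in a few lines of Python: checking that a directory is not open to "other" users, and assembling the -server_name and -authorization_policy_file options. This is only a sketch under stated assumptions. The check looks at local file modes, not HDFS permissions, and a temporary directory stands in for an Impala log directory. The server name server1 and the policy file path are hypothetical, and both helper functions are ours.

```python
import os
import stat
import tempfile

def is_world_accessible(path):
    """Return True if 'other' users have any read, write, or execute bit on path."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IRWXO)

def impalad_authorization_flags(server_name, policy_file):
    """Return the two startup options that enable authorization."""
    return [
        f"-server_name={server_name}",
        f"-authorization_policy_file={policy_file}",
    ]

# A throwaway directory stands in for an Impala log directory.
with tempfile.TemporaryDirectory() as log_dir:
    os.chmod(log_dir, 0o750)  # owner and group only
    restricted = not is_world_accessible(log_dir)
    os.chmod(log_dir, 0o755)  # readable by everyone
    exposed = is_world_accessible(log_dir)

# Hypothetical server name and policy file location.
flags = impalad_authorization_flags("server1", "/user/hive/warehouse/auth-policy.ini")
print(restricted, exposed)
print(" ".join(flags))
```

For data files stored in HDFS, the equivalent check would be made against HDFS ownership and permissions rather than the local filesystem.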

So, this was all about Impala Security. We hope you liked our explanation.

5. Conclusion – Impala Security

In this tutorial, we have seen an overview of Apache Impala security along with the latest security guidelines. We studied the 3 A’s: Authorization, Authentication, and Auditing. If you have any doubts, feel free to ask in the comment section.
