Apache Ambari Architecture – Working With Example

Today, in this Apache Ambari tutorial, we will discuss the Ambari architecture. We will look at how Ambari works, with examples, and at the applications that make up Apache Ambari.

So, let’s start with the Apache Ambari architecture.

What is Ambari Architecture?

In simple words, the Ambari Server collects data from across the cluster. Each host runs a copy of the Ambari Agent, which allows the Ambari Server to control that host.
Below you can see a picture of Ambari architecture:

Figure: Apache Ambari Architecture

To access cluster information and perform cluster operations, Ambari Web calls the Ambari REST API, which is served by the Ambari Server. After a user authenticates to Ambari Web, the application authenticates to the Ambari Server.

Communication between the browser and the server then occurs asynchronously through the REST API.

In addition, the Web UI keeps calling the Ambari REST API, and each call resets the session timeout. As a result, Ambari Web sessions do not time out automatically. However, we can configure Ambari to time out after a period of inactivity.
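As a rough illustration of such a REST call, the Python sketch below builds (but does not send) a request that lists the clusters known to the server. The host name, the admin/admin credentials, and the helper name build_ambari_request are placeholders made up for this example. The /api/v1/clusters path and the X-Requested-By header reflect the Ambari REST API as commonly documented, so verify them against your Ambari version.

```python
import base64
import urllib.request


def build_ambari_request(host, user, password, path="/api/v1/clusters", port=8080):
    """Build a GET request for the Ambari REST API without sending it.

    All argument values used below are placeholders, not real defaults.
    """
    url = f"http://{host}:{port}{path}"
    # HTTP Basic authentication: base64("user:password")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    # Ambari expects this header on modifying requests; it is harmless on GET.
    request.add_header("X-Requested-By", "ambari")
    return request


# Hypothetical host and credentials, for illustration only.
req = build_ambari_request("ambari.example.com", "admin", "admin")
```

To actually send the request against a running server, pass req to urllib.request.urlopen and parse the JSON body of the response.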

Working of Apache Ambari

Ambari follows a master/slave architecture. The master node instructs the slave nodes to perform certain actions and to report back the state of every action.

The master node is also responsible for keeping track of the state of the infrastructure. For this, it uses a database server, which can be configured during setup.
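The master/slave flow can be pictured with a small toy model. This is only a sketch of the idea, not Ambari's actual code: the class names ToyAgent and ToyServer, the action name, and the table layout are all invented for illustration, and an in-memory SQLite database stands in for the database server chosen during setup.

```python
import sqlite3


class ToyAgent:
    """Stands in for a slave node: runs an action and reports its state."""

    def __init__(self, host):
        self.host = host

    def execute(self, action):
        # A real agent would run a command here; the toy always succeeds.
        return {"host": self.host, "action": action, "status": "COMPLETED"}


class ToyServer:
    """Stands in for the master node: instructs agents and records state."""

    def __init__(self, agents):
        self.agents = agents
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE action_state (host TEXT, action TEXT, status TEXT)"
        )

    def run_everywhere(self, action):
        # Instruct every agent, then persist the state each one reports back.
        for agent in self.agents:
            report = agent.execute(action)
            self.db.execute(
                "INSERT INTO action_state VALUES (?, ?, ?)",
                (report["host"], report["action"], report["status"]),
            )
        self.db.commit()

    def state(self):
        query = "SELECT host, action, status FROM action_state ORDER BY host"
        return self.db.execute(query).fetchall()


server = ToyServer([ToyAgent("node2"), ToyAgent("node1")])
server.run_everywhere("START_DATANODE")
```

Calling server.state() afterwards returns one row per host, which is the kind of bookkeeping the master node delegates to its database.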

The diagram below shows the high-level architecture of Ambari and how it works:

Figure: Ambari architecture

At its core, Apache Ambari consists of the following applications:

  • Ambari server
  • Ambari agent
  • Ambari web UI
  • Database

Applications of Apache Ambari

i. Ambari server

The Ambari server (ambari-server) is the entry point for all administrative activities on the master server. It is a shell script that internally calls Python code, ambari-server.py, and routes all requests to it.

The ambari-server program offers several entry points, selected by the parameters passed to it:

  • Daemon management
  • Software upgrade
  • Software setup
  • LDAP/PAM/Kerberos management
  • Ambari backup and restore
  • Miscellaneous options

So, let’s learn all the entry points in detail:

a. Daemon management

The daemon management mode is activated when the script is invoked from the command line with the start, stop, reset, or restart argument.

For example, to start the Ambari background server, we can run the following command:
ambari-server start

b. Software upgrade

Once Ambari is installed, we can use this mode to upgrade the Ambari server itself. It triggers when we call the ambari-server program with the upgrade flag. If we want to upgrade the entire Ambari stack, we can pass the upgradestack flag:

For example:
ambari-server upgrade

c. Software setup

Once Ambari is downloaded from the internet (or installed via YUM or APT), we need to do a preliminary setup of the software. This mode triggers when we pass the setup flag to the program.

During setup, this mode asks us various questions that we need to answer. Until we finish this step, Ambari cannot be used to manage our servers:

For example:
ambari-server setup

d. LDAP/PAM/Kerberos management

LDAP stands for Lightweight Directory Access Protocol. Enterprises commonly use it for identity management. To use LDAP-based authentication, we need the following flags:

  • setup-ldap (to set up LDAP properties with Ambari) and
  • sync-ldap (to synchronize data from the LDAP server)

For example:
ambari-server setup-ldap
ambari-server sync-ldap

PAM stands for Pluggable Authentication Modules. It sits at the core of authentication and authorization on UNIX and Linux operating systems. If we want to use PAM-based access for Ambari, we need to run the program with the setup-pam option.

If we later want to move from LDAP to PAM-based authentication, we need to run it with migrate-ldap-pam:
ambari-server setup-pam
ambari-server migrate-ldap-pam

Kerberos is another advanced authentication and authorization mechanism, and it is very helpful in networked environments. On large-scale servers, it simplifies Authentication, Authorization, and Auditing (AAA). If we want to use Kerberos for Ambari, we can use the setup-kerberos flag:

For example:
ambari-server setup-kerberos

e. Ambari backup and restore

We can enter this mode if we want to take a snapshot of the current Ambari installation (excluding the database). It supports both backup and restore, which are invoked via the backup and restore flags:
ambari-server backup
ambari-server restore

f. Miscellaneous options

Apart from these, the server program offers other options, which we can list with the -h (help) flag.
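To picture how a single program can route different parameters to different modes, here is a minimal Python sketch of a dispatcher. It is not the real ambari-server.py: the names MODE_BY_ACTION, build_parser, and dispatch are hypothetical, and only the action names and mode groupings come from the sections above.

```python
import argparse

# Action names as described in this article, grouped by the mode they trigger.
MODE_BY_ACTION = {
    "start": "daemon management",
    "stop": "daemon management",
    "reset": "daemon management",
    "restart": "daemon management",
    "upgrade": "software upgrade",
    "setup": "software setup",
    "setup-ldap": "LDAP/PAM/Kerberos management",
    "sync-ldap": "LDAP/PAM/Kerberos management",
    "setup-pam": "LDAP/PAM/Kerberos management",
    "migrate-ldap-pam": "LDAP/PAM/Kerberos management",
    "setup-kerberos": "LDAP/PAM/Kerberos management",
    "backup": "backup and restore",
    "restore": "backup and restore",
}


def build_parser():
    # "toy-ambari-server" is a made-up program name for this sketch.
    parser = argparse.ArgumentParser(prog="toy-ambari-server")
    parser.add_argument("action", choices=sorted(MODE_BY_ACTION))
    return parser


def dispatch(argv):
    """Return (mode, action) for a command line such as ["start"]."""
    args = build_parser().parse_args(argv)
    return MODE_BY_ACTION[args.action], args.action


mode, action = dispatch(["setup-ldap"])
```

A real entry point would call a handler function per action instead of returning a label, but the routing idea is the same.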

ii. Ambari Agent

The Ambari Agent runs on every node that we want to manage with Ambari. It periodically sends heartbeats to the master node, and ambari-server uses this agent to execute many of its tasks on the servers.
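The heartbeat idea can be sketched in a few lines of Python. This is a simplified model under assumed names and values: HeartbeatTracker, the 30-second timeout, and the host names are invented for illustration and do not reflect Ambari's real agent protocol or defaults.

```python
class HeartbeatTracker:
    """Toy model of a master that tracks when each agent last checked in."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record(self, host, now):
        # Called whenever a heartbeat arrives from an agent.
        self.last_seen[host] = now

    def lost_hosts(self, now):
        # Hosts whose last heartbeat is older than the timeout.
        return sorted(
            host
            for host, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


tracker = HeartbeatTracker(timeout_seconds=30)
tracker.record("node1", now=0.0)   # node1 checks in once, then goes silent
tracker.record("node2", now=25.0)  # node2 checks in more recently
lost = tracker.lost_hosts(now=40.0)
```

Timestamps are passed in explicitly so the example stays deterministic; a live tracker would read them from a clock such as time.monotonic().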

iii. Ambari web interface

The Ambari web interface is one of the most powerful features of Ambari. The Ambari server running on the master host exposes this web application on port 8080, and access to it is protected by authentication.

Once we log in to this web portal, we can view and control all aspects of our Hadoop clusters.

iv. Database

To keep track of the state of the entire Hadoop infrastructure, Ambari uses an RDBMS, and it supports several of them. We choose the database to use when we set up the Ambari server for the first time.

At the time of writing, Ambari supports the following databases:

  • PostgreSQL
  • Oracle
  • MySQL or MariaDB
  • Embedded PostgreSQL
  • Microsoft SQL Server
  • SQL Anywhere
  • Berkeley DB

So, this was all about Ambari Architecture. We hope you liked our explanation.

Conclusion – Ambari Architecture

Hence, in this Ambari Architecture tutorial, we have seen the complete Ambari architecture and how it works in detail. Moreover, we discussed the applications of Ambari along with examples. Still, if you have any doubts regarding Ambari Architecture, ask in the comments.
