Apache Ambari Architecture – Working With Example
1. Objective – Architecture of Apache Ambari
Today, in this Apache Ambari tutorial, we will discuss the Ambari architecture. We will see how Ambari works, walk through an example, and cover the applications of Apache Ambari.
So, let's start with the Apache Ambari architecture.
2. What is Ambari Architecture?
In simple words, the Ambari Server collects data from across the cluster. Each host runs a copy of the Ambari Agent, and that Agent allows the Ambari Server to control the host.
[Figure: Ambari architecture]
Basically, to access cluster information and perform cluster operations, Ambari Web calls the Ambari REST API (accessible from the Ambari Server). After a user authenticates to Ambari Web, the application in turn authenticates to the Ambari Server. Communication between the browser and the server then occurs asynchronously through the REST API.
In addition, the Web UI keeps calling the Ambari REST API, and each call resets the session timeout. Hence, Ambari Web sessions do not time out automatically. However, we can configure Ambari to time out after a period of inactivity.
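To make the REST interaction more concrete, here is a minimal sketch of how a client could prepare an authenticated call to the Ambari REST API. It only builds the request and does not send it. The host name ambari.example.com, the admin/admin credentials, and the helper name build_ambari_request are assumptions for illustration; the /api/v1/clusters path and HTTP Basic authentication are standard for Ambari.

```python
import base64
import urllib.request


def build_ambari_request(base_url, path, user, password):
    """Build (but do not send) an authenticated GET request to the Ambari REST API."""
    # HTTP Basic authentication: base64("user:password")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    url = f"{base_url.rstrip('/')}/api/v1/{path.lstrip('/')}"
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Basic {token}")
    # Ambari requires this header on modifying calls (POST/PUT/DELETE);
    # it is harmless on a GET.
    req.add_header("X-Requested-By", "ambari")
    return req


# Hypothetical host and credentials, for illustration only.
request = build_ambari_request(
    "http://ambari.example.com:8080", "clusters", "admin", "admin"
)
print(request.get_method(), request.full_url)
# To actually send it against a running server: urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes the sketch easy to try without a running Ambari Server.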
3. Working of Apache Ambari
Ambari follows a master/slave architecture. The master node instructs the slave nodes to perform certain actions and to report back the state of every action. The master node is also responsible for keeping track of the state of the infrastructure. For this, it uses a database server, which can be configured at setup time.
[Figure: high-level architecture of Ambari, showing how Ambari works]
At its core, Apache Ambari consists of the following applications:
- Ambari server
- Ambari agent
- Ambari web UI
4. Applications of Apache Ambari
i. Ambari server
Generally, the Ambari server (ambari-server) is the entry point for all administrative activities on the master server. It is a shell script that internally uses Python code, ambari-server.py, and routes all requests to it.
The Ambari server offers several entry points, which are selected by passing different parameters to the ambari-server program:
- Daemon management
- Software upgrade
- Software setup
- LDAP/PAM/Kerberos management
- Ambari backup and restore
- Miscellaneous options
So, let’s learn all the entry points in detail:
a. Daemon management
The daemon management mode is activated when the script is invoked from the command line with the start, stop, reset, or restart argument.
For example, to start the Ambari background server, we invoke the script with the start argument, that is, ambari-server start.
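As a minimal sketch, and assuming ambari-server is on the PATH of the master host, the daemon arguments listed above could be wrapped in a small Python helper. The function names and the dry_run parameter are hypothetical and exist only for illustration; only the four arguments themselves come from the text.

```python
import subprocess

# The four daemon management arguments named in the text.
DAEMON_ACTIONS = ("start", "stop", "reset", "restart")


def daemon_command(action):
    """Return the argv list for a daemon management action."""
    if action not in DAEMON_ACTIONS:
        raise ValueError(f"not a daemon management action: {action!r}")
    return ["ambari-server", action]


def run_daemon_action(action, dry_run=True):
    """Run the action, or only report the command line when dry_run is True."""
    cmd = daemon_command(action)
    if not dry_run:
        # Needs an installed Ambari server and sufficient privileges.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)


print(run_daemon_action("start"))
```

With dry_run left at its default, the helper only reports the command it would run, so it can be tried on a machine without Ambari installed.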
b. Software upgrade
We can use this mode to upgrade the Ambari server itself once Ambari is installed. It is triggered when we call the ambari-server program with the upgrade flag. If we want to upgrade the entire stack of Ambari, we can pass the upgradestack flag instead.
c. Software setup
Moreover, once Ambari is downloaded from the internet (or installed via YUM or APT), we need to do a preliminary setup of the software. This mode is triggered when we pass the setup flag to the program. During setup, it asks us various questions that we need to answer. Note that Ambari cannot be used for any kind of management of our servers until we finish this step.
d. LDAP/PAM/Kerberos management
LDAP stands for Lightweight Directory Access Protocol. Basically, we use it for identity management in enterprises. To use LDAP-based authentication, we need the following flags:
- setup-ldap (for setting up LDAP properties with Ambari)
- sync-ldap (to synchronize data from the LDAP server)
PAM stands for Pluggable Authentication Modules. Generally, it sits at the core of authentication and authorization on UNIX and Linux operating systems. If we want to leverage PAM-based access for Ambari, we need to run the program with the setup-pam option. Moreover, if we then want to move from LDAP to PAM-based authentication, we need to run it with migrate-ldap-pam.
Furthermore, Ambari can use Kerberos as another advanced authentication and authorization mechanism. It is very helpful in networked environments, and on large-scale servers it simplifies Authentication, Authorization, and Auditing (AAA). We can use the setup-kerberos flag if we want to use Kerberos for Ambari.
e. Ambari backup and restore
Basically, we can enter this mode if we want to take a snapshot of the current installation of Ambari (excluding the database). It supports both backup and restore operations, which are invoked via the backup and restore flags respectively.
f. Miscellaneous options
Apart from these, the server program offers some other options, which we can list with the -h (help) flag.
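The entry points above can be summarized as a lookup from the first argument to the mode it triggers. The following is only an illustrative sketch of that routing idea, not the actual code of ambari-server.py. The table is built from the flags named in this section, and the names ENTRY_POINTS and route are hypothetical.

```python
# Mode -> arguments, as described in the entry point walk-through above.
ENTRY_POINTS = {
    "daemon management": ("start", "stop", "reset", "restart"),
    "software upgrade": ("upgrade", "upgradestack"),
    "software setup": ("setup",),
    "LDAP/PAM/Kerberos management": (
        "setup-ldap",
        "sync-ldap",
        "setup-pam",
        "migrate-ldap-pam",
        "setup-kerberos",
    ),
    "backup and restore": ("backup", "restore"),
}


def route(argv):
    """Return the mode selected by the first command-line argument."""
    if argv:
        for mode, arguments in ENTRY_POINTS.items():
            if argv[0] in arguments:
                return mode
    # Anything else (including -h) falls under the miscellaneous options.
    return "miscellaneous options"


for args in (["start"], ["sync-ldap"], ["backup"], ["-h"]):
    print(args[0], "->", route(args))
```

A table like this keeps the mapping in one place, so adding a new flag would not require touching the routing logic.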
ii. Ambari Agent
The Ambari Agent runs on all the nodes that we want to manage with Ambari. Basically, this program periodically sends heartbeats to the master node. The ambari-server in turn uses this agent to execute many of its tasks on the servers.
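The heartbeat idea can be illustrated with a small model. This is a simplified sketch of heartbeat tracking in general, not Ambari's own implementation: the class name, the 30-second timeout, and the host names are all assumptions chosen for the example.

```python
class HeartbeatTracker:
    """Toy model of a master that tracks when each agent last reported in."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # host name -> time of the last heartbeat

    def record_heartbeat(self, host, now):
        """Called whenever an agent's heartbeat reaches the master."""
        self.last_seen[host] = now

    def lost_hosts(self, now):
        """Hosts whose last heartbeat is older than the timeout."""
        return sorted(
            host
            for host, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


# Hypothetical hosts and timeout; times are plain seconds for readability.
tracker = HeartbeatTracker(timeout_seconds=30)
tracker.record_heartbeat("node1", now=0)
tracker.record_heartbeat("node2", now=0)
tracker.record_heartbeat("node1", now=25)  # node1 keeps reporting, node2 goes silent
print(tracker.lost_hosts(now=40))
```

Passing the current time in explicitly, instead of reading the system clock, keeps the model deterministic and easy to reason about.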
iii. Ambari web interface
The Ambari web interface is one of the most powerful features of the Ambari application. This web application is exposed by the Ambari server program running on the master host, and we can access it on port 8080. It is protected by authentication.
Once we log in to this web portal, we can view and control all aspects of our Hadoop clusters.
Further, to keep track of the state of the entire Hadoop infrastructure, Ambari supports multiple RDBMSs. We can choose the database we want to use when we set up the Ambari server for the first time.
At the time of writing, Ambari supports the following databases:
- MySQL or MariaDB
- Embedded PostgreSQL
- Microsoft SQL Server
- SQL Anywhere
- Berkeley DB
So, this was all about the Ambari architecture. We hope you liked our explanation.
5. Conclusion – Ambari Architecture
Hence, in this Ambari architecture tutorial, we have seen the complete Ambari architecture and how it works in detail. Moreover, we discussed the applications of Ambari along with an example. Still, if you have any doubts regarding the Ambari architecture, ask in the comment tab.