Apache Mesos Tutorial – Architecture and Working
1. Objective – Apache Mesos Tutorial
In this Apache Mesos tutorial, we will learn what Apache Mesos is, why it is needed, how its architecture is organized, and what its main components are. We will also walk through how Apache Mesos works to build an in-depth understanding of it.
So, let’s start the Apache Mesos tutorial.
2. What is Apache Mesos?
Apache Mesos is an open source cluster manager that handles workloads in a distributed environment through dynamic resource sharing and isolation.
It is well suited to deploying and managing applications in large-scale cluster environments. Mesos groups the existing resources of the machines/nodes in a cluster into a single unit that a variety of workloads can draw from. This is known as node abstraction in Mesos, and it removes the overhead of dedicating specific machines to different workloads.
It serves as a resource management platform for Hadoop and other Big Data clusters. Companies such as Twitter, Xogito, and Airbnb use Apache Mesos. Its two-level scheduler distinguishes the platform and allows distributed applications such as Apache Spark, Apache Kafka, and Apache Cassandra to share the same cluster.
In a way, Apache Mesos is the opposite of virtualization: in virtualization one physical resource is divided into multiple virtual resources, while in Mesos multiple physical resources are pooled into a single virtual resource.
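The pooling idea can be illustrated with a minimal sketch. The `Node` class, the `pool_resources` function, and the node sizes are hypothetical names and values chosen for illustration; they are not part of the Mesos API. The sketch shows only the conceptual cluster-wide view; an individual task still has to fit on a single machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One machine in the cluster (hypothetical type for illustration)."""
    name: str
    cpus: int
    mem_gb: int


def pool_resources(nodes):
    """Sum per-node resources into one cluster-wide view."""
    return {
        "cpus": sum(n.cpus for n in nodes),
        "mem_gb": sum(n.mem_gb for n in nodes),
    }


# Three machines of different sizes, viewed as one pool.
nodes = [Node("node-1", 4, 4), Node("node-2", 8, 16), Node("node-3", 2, 8)]
pool = pool_resources(nodes)
print(pool)  # {'cpus': 14, 'mem_gb': 28}
```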
3. The need for Apache Mesos
Many resource managers exist today, such as Hadoop on Demand, batch schedulers (e.g. Torque), and VM schedulers (e.g. Eucalyptus). The problem for Hadoop-like workloads is data locality: it was compromised by the static partitioning of nodes, and because a job holds its nodes for its full duration, the utilization of the system suffered. Mesos addresses these drawbacks with fine-grained sharing and two-level scheduling.
Mesos shares resources in a fine-grained manner, which lets frameworks achieve data locality by taking turns reading the data stored on each machine.
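As a rough illustration of how a framework might use fine-grained offers to preserve data locality, the sketch below picks only an offer from a machine that already stores the task's input. The offer dictionaries, the host names, and the `pick_local_offer` function are assumptions made for this example and do not reflect the real Mesos offer format.

```python
def pick_local_offer(offers, data_hosts):
    """Return the first offer from a host that stores the data, else None."""
    for offer in offers:
        if offer["host"] in data_hosts:
            return offer
    # No local offer this round: the framework can wait for its next turn.
    return None


# Hypothetical offers from two machines.
offers = [
    {"host": "node-1", "cpus": 2, "mem_gb": 2},
    {"host": "node-2", "cpus": 1, "mem_gb": 1},
]
# Hosts assumed to hold the input data for the task.
data_hosts = {"node-2", "node-3"}

chosen = pick_local_offer(offers, data_hosts)
print(chosen["host"])  # node-2
```

Because resources are handed out in small units rather than whole static partitions, a framework that finds no local offer in one round gives up little by waiting for the next one.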
4. The architecture of Apache Mesos
The diagram below shows the key components of Apache Mesos:
i. Mesos Master
The Mesos master is the heart of the cluster and is responsible for keeping it highly available. It hosts the primary user interface, which provides information about the resources available in the cluster. The master is the central source of information about all running tasks and keeps all task-related data in memory. For completed tasks it retains only a fixed amount of memory, which lets the master serve the user interface and task data with minimal latency.
ii. Mesos Agent
The Mesos agent holds and manages the container that hosts the executor (in Mesos, everything runs inside a container). It manages the communication between the local executor and the Mesos master, acting as an intermediary between them. The agent publishes information about the host it runs on, including data about running tasks and executors, the available resources of the host, and other metadata. It also guarantees the delivery of task status updates to the schedulers.
iii. Mesos Framework
A Mesos framework has two parts: the scheduler and the executor. The scheduler registers itself with the Mesos master and in turn receives a unique framework ID. It is the scheduler's responsibility to launch tasks when its resource requirements and constraints match an offer received from the Mesos master. It is also responsible for handling task failures and errors. The executor runs the tasks launched by the scheduler and reports back the status of each task.
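The division of work between the two parts can be sketched as follows. This is a simplified model under stated assumptions, not the real Mesos scheduler or executor API: the class names, method names, and dictionary shapes are hypothetical. Only the status strings `TASK_RUNNING`, `TASK_FINISHED`, and `TASK_FAILED` mirror task states that Mesos actually reports.

```python
class SketchScheduler:
    """Hypothetical scheduler: matches pending tasks against an offer."""

    def __init__(self, pending):
        self.framework_id = None
        self.pending = list(pending)  # tasks still waiting for resources
        self.statuses = {}            # latest known status per task name

    def registered(self, framework_id):
        # The master hands back a unique ID when the scheduler registers.
        self.framework_id = framework_id

    def resource_offer(self, offer):
        """Return the pending tasks that fit into the offer, in order."""
        cpus, mem_gb = offer["cpus"], offer["mem_gb"]
        launched, waiting = [], []
        for task in self.pending:
            if task["cpus"] <= cpus and task["mem_gb"] <= mem_gb:
                cpus -= task["cpus"]
                mem_gb -= task["mem_gb"]
                launched.append(task)
            else:
                waiting.append(task)
        self.pending = waiting
        return launched

    def status_update(self, task_name, state):
        self.statuses[task_name] = state


class SketchExecutor:
    """Hypothetical executor: runs a task and reports its status."""

    def run(self, task, scheduler):
        scheduler.status_update(task["name"], "TASK_RUNNING")
        try:
            task["work"]()
            scheduler.status_update(task["name"], "TASK_FINISHED")
        except Exception:
            scheduler.status_update(task["name"], "TASK_FAILED")


scheduler = SketchScheduler([
    {"name": "ok", "cpus": 1, "mem_gb": 1, "work": lambda: None},
    {"name": "boom", "cpus": 1, "mem_gb": 1, "work": lambda: 1 / 0},
    {"name": "big", "cpus": 8, "mem_gb": 1, "work": lambda: None},
])
scheduler.registered("framework-0001")  # made-up ID for illustration

executor = SketchExecutor()
for task in scheduler.resource_offer({"cpus": 4, "mem_gb": 4}):
    executor.run(task, scheduler)

print(scheduler.statuses)  # {'ok': 'TASK_FINISHED', 'boom': 'TASK_FAILED'}
```

The task named `big` does not fit in the offer, so it stays in `scheduler.pending` until a larger offer arrives.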
Some of the frameworks that run on Mesos are:
- Chronos- It is a fault-tolerant job scheduler for a Mesos cluster that supports complex job topologies.
- Marathon- It automatically handles hardware or software failures to ensure that an app is “always on”.
- Aurora- It is a service scheduler that runs long-running services and takes advantage of Mesos’ scalability, fault-tolerance, and resource isolation.
- Hadoop- It is for data processing.
- Spark- It is for data processing.
- Jenkins- It is a continuous integration server that can dynamically launch workers on a Mesos cluster depending on the workload.
5. How does the Mesos framework work?
The picture below depicts how a framework is scheduled to run a task.
- Step 1: Agent 1 reports to the master that it has 4 CPUs and 4 GB of memory available. The master then invokes the allocation policy module.
- Step 2: The master then sends framework 1 a description of what is available on agent 1.
- Step 3: The framework replies with information about the tasks to run on agent 1. It submits 2 tasks, using <2 CPUs, 1 GB RAM> for the first task and <1 CPU, 2 GB RAM> for the second task.
- Step 4: The master sends the tasks to the agent, which allocates the appropriate resources to the framework's executor. Any resources left over (here 1 CPU and 1 GB of memory) can be used by another framework.
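The four steps above can be traced with simple arithmetic. The following is a minimal sketch using the numbers from the example; the `remaining_after` function and the dictionary layout are hypothetical and only model the bookkeeping, not the actual messages exchanged in Mesos.

```python
def remaining_after(offer, tasks):
    """Subtract the tasks' requirements from an offer; reject overcommit."""
    cpus = offer["cpus"] - sum(t["cpus"] for t in tasks)
    mem_gb = offer["mem_gb"] - sum(t["mem_gb"] for t in tasks)
    if cpus < 0 or mem_gb < 0:
        raise ValueError("tasks need more than the offer contains")
    return {"cpus": cpus, "mem_gb": mem_gb}


# Step 1: agent 1 reports its free resources to the master.
agent1 = {"cpus": 4, "mem_gb": 4}

# Step 2: the master offers those resources to framework 1.
offer = dict(agent1)

# Step 3: framework 1 answers with two tasks to run on agent 1.
tasks = [
    {"cpus": 2, "mem_gb": 1},  # first task
    {"cpus": 1, "mem_gb": 2},  # second task
]

# Step 4: the master sends the tasks to the agent; the rest stays free
# and can be offered to another framework.
leftover = remaining_after(offer, tasks)
print(leftover)  # {'cpus': 1, 'mem_gb': 1}
```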
In this Apache Mesos tutorial, we discussed what Mesos is and why it is needed. We also covered the architecture of Mesos and how a Mesos framework works. If you still have any questions about this Mesos tutorial, ask in the comment tab.