Deep Dive into Hadoop YARN Node Manager – A Yarn Tutorial

In this Hadoop Yarn node manager tutorial, we will discuss the NodeManager in Yarn: how it interacts with the ResourceManager and how it launches and manages containers. We will also cover the different NodeManager components and how auxiliary services run in the NodeManager.

Introduction to Hadoop Yarn Node Manager

Conceptually, the NodeManager is a more generic, efficient, and flexible version of the TaskTracker from the Hadoop 1 architecture.

In contrast to the fixed number of slots for map and reduce tasks in MRv1, the NodeManager in MRv2 works with a number of dynamically created resource containers. There is no hard-coded split into Map and Reduce slots as in MRv1.

A container is a collection of resources such as memory, CPU, disk, and network I/O. The number of containers on a node is determined by configuration parameters and the total amount of node resources. The NodeManager is the slave (worker) daemon of Yarn.
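To make the relationship between node resources and container count concrete, here is a minimal sketch under simplifying assumptions: every container has the same size, and only memory and virtual cores are considered. The node capacity corresponds to what an administrator would set in yarn-site.xml through yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores. The Resource class, the max_containers function, and all numbers are hypothetical names and values chosen for illustration; the real scheduler may weigh resources differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical resource bundle: memory in MB plus virtual cores."""
    memory_mb: int
    vcores: int


def max_containers(node: Resource, per_container: Resource) -> int:
    """Upper bound on equally sized containers that fit on one node."""
    if per_container.memory_mb <= 0 or per_container.vcores <= 0:
        raise ValueError("container size must be positive")
    by_memory = node.memory_mb // per_container.memory_mb
    by_cpu = node.vcores // per_container.vcores
    # The scarcer resource decides how many containers fit.
    return min(by_memory, by_cpu)


# Illustrative values only (assumptions, not recommended settings).
node = Resource(memory_mb=65536, vcores=16)
small_container = Resource(memory_mb=2048, vcores=1)
print(max_containers(node, small_container))  # 16: CPU is the limiting resource
```

In this example the node could hold 32 containers by memory alone, but only 16 by CPU, so the CPU limit wins.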

Hadoop Yarn Node Manager

The Hadoop Yarn NodeManager is the per-machine (per-node) framework agent that is responsible for containers: it monitors their resource usage and reports it to the ResourceManager.

In addition to overseeing the container lifecycle, the NodeManager tracks the health of the node on which it is running and controls auxiliary services that different YARN applications may use at any point in time.

The NodeManager can execute any computation that makes sense to the ApplicationMaster simply by creating a container for each task.

The above architecture diagram gives a detailed view of the NodeManager components.

Yarn NodeManager Components

This section of the Hadoop Yarn node manager tutorial describes the Yarn NodeManager components in detail:

1. NodeStatusUpdater

On startup, this component registers with the ResourceManager (RM) and sends information about the resources available on the node. Subsequent NM-RM communication provides updates on container statuses, such as the containers running on the node, completed containers, and so on.

In addition, the RM may signal the NodeStatusUpdater to potentially kill already running containers.
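The registration and status exchange described above can be sketched as a toy simulation. This is not the real YARN protocol: ToyResourceManager, ToyNodeStatusUpdater, the state strings, and the kill_requests field are all hypothetical names invented for illustration. The sketch only shows the shape of the interaction: register once, then report container statuses repeatedly and act on any kill signal that comes back.

```python
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for the RM; not the real YARN API."""
    nodes: dict = field(default_factory=dict)
    kill_requests: dict = field(default_factory=dict)

    def register(self, node_id, memory_mb, vcores):
        self.nodes[node_id] = {"memory_mb": memory_mb, "vcores": vcores,
                               "containers": {}}

    def heartbeat(self, node_id, container_statuses):
        # Record the latest statuses reported by the node.
        self.nodes[node_id]["containers"] = dict(container_statuses)
        # The response may name containers the RM wants killed.
        return self.kill_requests.pop(node_id, [])


class ToyNodeStatusUpdater:
    """Hypothetical NM-side component that talks to the toy RM."""

    def __init__(self, node_id, rm, memory_mb, vcores):
        self.node_id = node_id
        self.rm = rm
        self.containers = {}  # container id -> "RUNNING" | "COMPLETE" | "KILLED"
        rm.register(node_id, memory_mb, vcores)  # one-time registration on startup

    def send_heartbeat(self):
        to_kill = self.rm.heartbeat(self.node_id, self.containers)
        for container_id in to_kill:
            if self.containers.get(container_id) == "RUNNING":
                self.containers[container_id] = "KILLED"
        return to_kill


rm = ToyResourceManager()
updater = ToyNodeStatusUpdater("node-1", rm, memory_mb=8192, vcores=4)
updater.containers.update({"c1": "RUNNING", "c2": "COMPLETE"})
updater.send_heartbeat()               # RM now knows about c1 and c2
rm.kill_requests["node-1"] = ["c1"]    # RM decides to kill c1
updater.send_heartbeat()               # kill signal arrives with the response
print(updater.containers)
```

Note that the kill signal travels on the heartbeat response in this sketch, so the node only learns about it on its next status report.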

2. Container Manager

This is the core component of the NodeManager. It manages the containers running on the node through a set of sub-components, each of which performs a subset of the functionality needed to manage those containers.

3. Container Executor

This component interacts with the underlying operating system to securely place the files and directories needed by containers, and subsequently to launch and clean up the processes corresponding to containers in a secure manner.

4. NodeHealthChecker Service

The NodeHealthCheckerService checks the health of the node by running a configured script at regular intervals. It also monitors the health of the disks specifically by periodically creating temporary files on them.

Any change in the health of the system is reported to the NodeStatusUpdater, which in turn passes the information on to the RM.
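The disk check described above (creating temporary files to see whether a disk is still usable) can be sketched as follows. This is a minimal illustration, not the NodeManager's actual implementation; the function names dir_is_healthy and healthy_dirs and the probe file prefix are assumptions made for this example. In a real cluster, the directories of interest would presumably be those configured under yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs.

```python
import os
import tempfile


def dir_is_healthy(path: str) -> bool:
    """Return True if a small temporary file can be created and written in path."""
    try:
        # The probe file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=path, prefix="nm-health-") as probe:
            probe.write(b"ok")
            probe.flush()
        return True
    except OSError:
        # Missing directory, read-only filesystem, full disk, and similar errors.
        return False


def healthy_dirs(paths):
    """Keep only the directories that pass the write probe."""
    return [p for p in paths if dir_is_healthy(p)]


with tempfile.TemporaryDirectory() as good_dir:
    missing_dir = os.path.join(good_dir, "does-not-exist")
    usable = healthy_dirs([good_dir, missing_dir])
    print(usable == [good_dir])
```

A periodic caller could compare the list of usable directories between runs and report any change, which matches the flow of passing health changes on to the NodeStatusUpdater.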

5. Security

6. WebServer

Exposes the list of applications, information about the containers running on the node at a given point in time, node-health related information, and the logs produced by the containers.

Hadoop Yarn Node Manager Auxiliary Service

In the case of MapReduce applications, the Map and Reduce tasks are executed inside containers. However, between the Map and Reduce tasks (i.e., outside the containers) there is the ‘Shuffle and Sort’ phase.

The actions within this phase must be additionally specified to YARN as a NodeManager auxiliary service. Below is an illustration.
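As a rough mental model, an auxiliary service is a named, long-lived service inside the NodeManager that is notified when applications start and finish on the node. The sketch below is a toy model under that assumption; ToyAuxService, ToyAuxServiceRegistry, and their method names are hypothetical and do not mirror the real YARN interfaces. For reference, the MapReduce shuffle is usually enabled in yarn-site.xml by setting yarn.nodemanager.aux-services to mapreduce_shuffle and yarn.nodemanager.aux-services.mapreduce_shuffle.class to org.apache.hadoop.mapred.ShuffleHandler.

```python
class ToyAuxService:
    """Hypothetical auxiliary service that tracks which applications use it."""

    def __init__(self, name):
        self.name = name
        self.active_apps = set()

    def application_started(self, app_id):
        self.active_apps.add(app_id)

    def application_finished(self, app_id):
        self.active_apps.discard(app_id)


class ToyAuxServiceRegistry:
    """Hypothetical registry of the services configured on one node."""

    def __init__(self):
        self._services = {}

    def register(self, service):
        if service.name in self._services:
            raise ValueError(f"duplicate auxiliary service: {service.name}")
        self._services[service.name] = service

    def on_app_start(self, app_id, requested_services):
        for name in requested_services:
            if name not in self._services:
                raise KeyError(f"auxiliary service not configured: {name}")
            self._services[name].application_started(app_id)

    def on_app_finish(self, app_id):
        for service in self._services.values():
            service.application_finished(app_id)


registry = ToyAuxServiceRegistry()
shuffle = ToyAuxService("mapreduce_shuffle")
registry.register(shuffle)

registry.on_app_start("app_0001", ["mapreduce_shuffle"])
print(sorted(shuffle.active_apps))   # ['app_0001']
registry.on_app_finish("app_0001")
print(sorted(shuffle.active_apps))   # []
```

The point of the design is that the service outlives any single container, which is why the shuffle data of a finished Map task can still be served to Reduce tasks.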

To summarize briefly, the key function of the NodeManager is to launch containers at the request of the ApplicationMaster (AM). On receiving a container launch request, the NM verifies the request and authorizes the user before assigning resources.
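The order of checks in the summary above (verify the request, authorize the user, then assign resources) can be sketched as a small toy model. Everything here is an assumption made for illustration: LaunchRequest, ToyContainerManager, the token comparison, and the result strings are hypothetical and much simpler than the real security mechanisms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaunchRequest:
    """Hypothetical container launch request sent by an AM."""
    container_id: str
    user: str
    token: str
    memory_mb: int


class ToyContainerManager:
    """Hypothetical NM-side handler for launch requests."""

    def __init__(self, valid_tokens, allowed_users, free_memory_mb):
        self.valid_tokens = valid_tokens      # container id -> expected token
        self.allowed_users = allowed_users    # users permitted on this node
        self.free_memory_mb = free_memory_mb
        self.running = {}

    def start_container(self, request):
        # Step 1: verify that the request is genuine.
        if self.valid_tokens.get(request.container_id) != request.token:
            return "REJECTED: invalid token"
        # Step 2: authorize the user.
        if request.user not in self.allowed_users:
            return "REJECTED: user not authorized"
        # Step 3: only now assign resources and launch.
        if request.memory_mb > self.free_memory_mb:
            return "REJECTED: insufficient resources"
        self.free_memory_mb -= request.memory_mb
        self.running[request.container_id] = request
        return "LAUNCHED"


manager = ToyContainerManager(
    valid_tokens={"c1": "secret-1", "c2": "secret-2"},
    allowed_users={"alice"},
    free_memory_mb=4096,
)
print(manager.start_container(LaunchRequest("c1", "alice", "secret-1", 2048)))
print(manager.start_container(LaunchRequest("c2", "mallory", "secret-2", 1024)))
```

Because the checks run before any resources are touched, a rejected request leaves the node's free memory unchanged.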