How HDFS helps namenode in scaling?
- This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.
September 20, 2018 at 11:33 am #4657 DataFlair Team, Spectator
What is HDFS Federation?
How does it help the NameNode scale?
September 20, 2018 at 11:33 am #4658 DataFlair Team, Spectator
Before getting to HDFS Federation, let us first discuss scalability.
The primary benefit of Hadoop is its scalability: one can easily scale the cluster by adding more nodes.
There are two types of scalability in Hadoop: vertical and horizontal.
Vertical scalability
It is also referred to as “scale up”. In vertical scaling, you increase the hardware capacity of an individual machine. In other words, you add more RAM or CPU to your existing system to make it more robust and powerful.
Horizontal scalability
It is also referred to as “scale out” and is basically the addition of more machines to the cluster. In horizontal scaling, instead of increasing the hardware capacity of an individual machine, you add more nodes to the existing cluster and, most importantly, you can add them without stopping the system. Therefore there is no downtime, green zone, or anything of the sort while scaling out, and in the end you have more machines working in parallel to meet your requirements.
To learn more about scalability, follow: HDFS Scalability
HDFS has two main layers:
1. Namespace – manages directories, files and blocks. It supports file system operations such as creation, modification, deletion and listing of files and directories.
2. Block Storage – provides operations such as creation, deletion, modification and getting the location of blocks. It also takes care of replica placement and replication.
Architecture without HDFS Federation
DataNodes can be scaled both vertically and horizontally, but the NameNode could only be scaled vertically, not horizontally. The architecture without HDFS Federation has multiple DataNodes but only one NameNode (one namespace) for all of them. Because that single NameNode holds the metadata for the entire namespace, this limits the number of blocks, files, and directories supported on the file system.
To overcome this limitation, HDFS Federation was introduced.
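To get a feel for why a single NameNode becomes the ceiling, here is a rough back-of-the-envelope sketch. It assumes the NameNode keeps every file, directory and block as an object on its heap, and it uses about 150 bytes per object, which is a commonly quoted rule of thumb and not a measured figure. The function names and the cluster sizes are made up for illustration.

```python
# Rough heap estimate for NameNode metadata.
# ASSUMPTION: ~150 bytes of heap per namespace object (file, directory or
# block). This is a rule of thumb, not a value taken from this thread.
BYTES_PER_OBJECT = 150


def namenode_heap_bytes(files: int, directories: int, blocks: int) -> int:
    """Estimate the heap one NameNode needs to hold the whole namespace."""
    return (files + directories + blocks) * BYTES_PER_OBJECT


def heap_per_namenode(files: int, directories: int, blocks: int,
                      namenodes: int) -> float:
    """Estimate per-NameNode heap if the namespace is split evenly."""
    return namenode_heap_bytes(files, directories, blocks) / namenodes


if __name__ == "__main__":
    # Hypothetical cluster: 100M files, 10M directories, 150M blocks.
    cluster = (100_000_000, 10_000_000, 150_000_000)
    single = namenode_heap_bytes(*cluster)
    split = heap_per_namenode(*cluster, namenodes=4)
    print(f"one NameNode:   {single / 1e9:.2f} GB of metadata")
    print(f"four NameNodes: {split / 1e9:.2f} GB of metadata each")
```

With one NameNode the only way to fit more objects is a bigger machine (vertical scaling). Splitting the namespace across several NameNodes divides the same load, which is the horizontal scaling that federation provides. Real namespaces are rarely split evenly, so treat the second number as a best case.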
Architecture with HDFS Federation
In order to scale the NameNode horizontally, federation uses multiple independent NameNodes/namespaces. The NameNodes are independent and do not require coordination with each other. All the DataNodes are used as common storage for blocks by all the NameNodes, and each DataNode registers with all the NameNodes in the cluster. The set of blocks that belong to a single namespace is called a block pool, and DataNodes store blocks for all the block pools in the cluster.
Therefore, HDFS Federation overcomes the previous limitation of not being able to scale the NameNode horizontally. It also provides isolation between namespaces.
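The relationships described above can be sketched as a small toy model. This is only an illustration of the idea, not how HDFS is implemented: the class names, host names, namespace labels and the short block pool IDs such as "BP-1" are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NameNode:
    """One independent namespace; it owns exactly one block pool."""
    namespace: str
    block_pool_id: str
    # Hosts of the DataNodes that have registered with this NameNode.
    datanodes: list = field(default_factory=list)


@dataclass
class DataNode:
    """Common storage: holds blocks for every block pool in the cluster."""
    host: str
    # block_pool_id -> set of block ids stored for that pool
    pools: dict = field(default_factory=dict)

    def register_with_all(self, namenodes):
        """Register with every NameNode and prepare storage for its pool."""
        for nn in namenodes:
            nn.datanodes.append(self.host)
            self.pools.setdefault(nn.block_pool_id, set())

    def store_block(self, block_pool_id, block_id):
        """Store a block under the pool of the namespace it belongs to."""
        self.pools[block_pool_id].add(block_id)

    def block_report(self, block_pool_id):
        """Blocks this DataNode would report to the pool's NameNode."""
        return sorted(self.pools[block_pool_id])


if __name__ == "__main__":
    namenodes = [NameNode("users", "BP-1"), NameNode("logs", "BP-2")]
    datanodes = [DataNode("dn1"), DataNode("dn2")]
    for dn in datanodes:
        dn.register_with_all(namenodes)

    datanodes[0].store_block("BP-1", "blk_1")
    datanodes[0].store_block("BP-2", "blk_7")

    for nn in namenodes:
        print(nn.namespace, "sees DataNodes", nn.datanodes)
    print("dn1 report for BP-1:", datanodes[0].block_report("BP-1"))
```

Note that the two NameNode objects never reference each other. Each one only knows its own block pool, while every DataNode knows about all of them, which mirrors the "independent NameNodes, shared DataNodes" design.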
September 20, 2018 at 11:33 am #4659 DataFlair Team, Spectator
HDFS Federation addresses the limitation of the prior HDFS architecture, which allows only a single namespace for the entire cluster, by adding support for multiple NameNodes/namespaces to the HDFS file system. “Federated” means that each nameservice is independent and does not require coordination with other nameservices.
Some details about federated HDFS:
1. Small changes to the NameNode; most of the changes are in the DataNodes, configuration, and tools
2. Namespace and block management remain in the NameNode
3. DataNodes provide storage services for all the NameNodes: they send periodic heartbeats and block reports to all the NameNodes, and send block received/deleted notifications for a block pool to the corresponding NameNode
4. The Balancer works with multiple namespaces
5. A NameNode can be added to or removed from a federated cluster
6. A single configuration is used for all the nodes in the cluster
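As an illustration of the single shared configuration, the sketch below generates the federation-related part of an hdfs-site.xml from Python. The property names `dfs.nameservices`, `dfs.namenode.rpc-address.<nameservice>` and `dfs.namenode.http-address.<nameservice>` are the ones used by the Hadoop federation documentation. The nameservice IDs, host names and port numbers are assumptions chosen for the example, and a real cluster needs more properties than these.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: example ports only; use whatever your cluster actually uses.
RPC_PORT = 8020
HTTP_PORT = 9870


def federation_properties(nameservices):
    """Build hdfs-site.xml properties for a federated cluster.

    `nameservices` maps a nameservice ID to its NameNode host.
    Both the IDs and the hosts are hypothetical in this sketch.
    """
    props = {"dfs.nameservices": ",".join(nameservices)}
    for ns_id, host in nameservices.items():
        props[f"dfs.namenode.rpc-address.{ns_id}"] = f"{host}:{RPC_PORT}"
        props[f"dfs.namenode.http-address.{ns_id}"] = f"{host}:{HTTP_PORT}"
    return props


def to_hdfs_site_xml(props):
    """Render the properties in the <configuration>/<property> layout."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    props = federation_properties({"ns1": "nn-host1", "ns2": "nn-host2"})
    print(to_hdfs_site_xml(props))
```

Because the same file lists every nameservice, it can be distributed unchanged to all nodes, and each DataNode learns from it which NameNodes it has to register with.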
To learn more about HDFS Federation, follow: HDFS Federation Tutorial