Hadoop is the most popular Big data framework, which is still the backbone of Big Data industry, it can handle large datasets across the cluster of servers using simple programming models. The Hadoop framework provides distributed storage, resource management, and computation across various clusters of servers.
It can be deployed on a single node or multi-node cluster. But the real power of Hadoop comes on a multi-node cluster. It utilizes the power of distributed processing. But for following purpose Hadoop can be run on a single node cluster:
In Single Node cluster, all the essential daemons (like NameNode, DataNode, Resource Manager and Node Manager) run on the same machine but on different ports. In distributed mode, they run on different machines (master and slave).
The Replication Factor of single node cluster will be 1. It is single node because it is only one machine and a cluster since all the daemons of the cluster run on the single machine. The application developed for production cluster is also tested on single node cluster. Single node cluster is also useful for development & QA purpose.