Installation of Hadoop 3.x on Ubuntu on Single Node Cluster


1. Objective

In this tutorial on the installation of Hadoop 3.x on Ubuntu, we will set up a pseudo-distributed, single-node Hadoop 3.x cluster. The steps cover how to install Java, how to install SSH and configure passwordless SSH, how to download Hadoop, how to set up the Hadoop configuration (the .bashrc file, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml), how to start the Hadoop cluster, and how to stop the Hadoop services.



2. Installation of Hadoop 3.x on Ubuntu

Before starting the Hadoop 3.x installation on Ubuntu, it helps to be familiar with the key features added in Hadoop 3 that distinguish it from Hadoop 2.

2.1. Java 8 installation

Hadoop requires a working Java installation. Let us start with the steps for installing Java 8:

a. Install Python Software Properties

sudo apt-get install python-software-properties

b. Add Repository

sudo add-apt-repository ppa:webupd8team/java

c. Update the source list

sudo apt-get update

d. Install Java 8

sudo apt-get install oracle-java8-installer

e. Check if java is correctly installed

java -version
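
To confirm that the reported version is Java 8, the major version can be parsed from the first line of the java -version output. The following is a minimal sketch; java_major is a hypothetical helper name, not part of Java or Hadoop, and it assumes the version appears in double quotes as in typical java -version output.

```shell
# Hypothetical helper: extract the major Java version from one line of `java -version` output.
java_major() {
  # $1 is a line such as: java version "1.8.0_181"
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;   # legacy scheme: 1.8.0_181 -> 8
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;   # newer scheme: 11.0.2 -> 11
  esac
}

# `java -version` prints to stderr, so redirect it before reading the first line.
if command -v java >/dev/null 2>&1; then
  line=$(java -version 2>&1 | head -n 1)
  echo "Detected Java major version: $(java_major "$line")"
else
  echo "java not found on PATH"
fi
```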

2.2. Configure SSH

SSH is used for remote login. Hadoop requires SSH to manage its nodes, that is, the remote machines and also the local machine if you want to run Hadoop on it. Let us now see the SSH setup for Hadoop 3.x on Ubuntu:

a. Install SSH and pdsh

sudo apt-get install ssh
sudo apt-get install pdsh

b. Generate Key Pairs

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

c. Configure passwordless ssh

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

d. Change the permissions of the file that contains the key

chmod 0600 ~/.ssh/authorized_keys

e. Check SSH to localhost

ssh localhost
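
Running the key-append step more than once adds duplicate lines to authorized_keys. The following is a minimal sketch of a repeatable version of steps c and d; authorize_key is a hypothetical helper name, and it assumes the key pair was generated as id_rsa in the given directory, as in step b.

```shell
# Hypothetical helper: append the public key to authorized_keys only if it is not already present.
authorize_key() {
  ssh_dir=$1                        # e.g. "$HOME/.ssh"
  pub="$ssh_dir/id_rsa.pub"
  auth="$ssh_dir/authorized_keys"
  [ -f "$pub" ] || { echo "missing $pub" >&2; return 1; }
  touch "$auth"
  # -q quiet, -x whole-line match, -F fixed string (no regex interpretation)
  grep -qxF -e "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
  chmod 0600 "$auth"
}

# Only act when a public key already exists; generate one first with ssh-keygen otherwise.
if [ -f "$HOME/.ssh/id_rsa.pub" ]; then
  authorize_key "$HOME/.ssh"
fi
```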

2.3. Install Hadoop

a. Download Hadoop

Download the Hadoop release tarball used in this tutorial, hadoop-3.0.0-alpha2.tar.gz.

b. Untar Tarball

tar -xzf hadoop-3.0.0-alpha2.tar.gz

2.4. Hadoop Setup Configuration

a. Edit .bashrc

Open the .bashrc file, which is located in the user's home directory:

nano ~/.bashrc

Add the following parameter:

export HADOOP_PREFIX="/home/dataflair/hadoop-3.0.0-alpha2"

Then reload the file:

source ~/.bashrc
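
The later steps refer to HADOOP_HOME, which the export above does not define, and Hadoop 3 scripts generally expect HADOOP_HOME rather than HADOOP_PREFIX. The following is a minimal sketch of two further lines for .bashrc, assuming the same install path as above; adding bin and sbin to PATH is an optional convenience, not something the tutorial requires.

```shell
# Assumed install location, taken from the HADOOP_PREFIX value used in this tutorial.
export HADOOP_HOME="/home/dataflair/hadoop-3.0.0-alpha2"
# Put the hadoop/hdfs commands (bin) and the start/stop scripts (sbin) on PATH.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```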

b. Edit hadoop-env.sh

Edit the configuration file hadoop-env.sh (located in HADOOP_HOME/etc/hadoop) and set JAVA_HOME:

export JAVA_HOME=/usr/lib/jvm/java-8-oracle/

c. Edit core-site.xml

Edit the configuration file core-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:
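
A minimal sketch of typical single-node entries for core-site.xml, written to a scratch directory so that nothing is overwritten. The values are assumptions: hdfs://localhost:9000 is the address commonly used for pseudo-distributed setups, and the hadoop.tmp.dir path /home/dataflair/hdata is taken from the reader comments at the end of this article. CONF_DIR and hadoop-conf-sketch are hypothetical names.

```shell
# Scratch directory (hypothetical name); copy the file to HADOOP_HOME/etc/hadoop after review.
CONF_DIR="${CONF_DIR:-./hadoop-conf-sketch}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Default filesystem URI: the HDFS NameNode on this machine (assumed port 9000) -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <!-- Base directory for Hadoop data (assumed path); adjust to your own user -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/dataflair/hdata</value>
  </property>
</configuration>
EOF
```

If hadoop.tmp.dir is set, the directory must be writable by the user that runs Hadoop, and the NameNode has to be formatted after the change.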


d. Edit hdfs-site.xml

Edit the configuration file hdfs-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:
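
A minimal sketch of a typical single-node hdfs-site.xml, written to a scratch directory. The replication factor of 1 is an assumption that fits a cluster with a single DataNode; CONF_DIR and hadoop-conf-sketch are hypothetical names.

```shell
# Scratch directory (hypothetical name); copy the file to HADOOP_HOME/etc/hadoop after review.
CONF_DIR="${CONF_DIR:-./hadoop-conf-sketch}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- One copy of each block, since a single-node cluster has only one DataNode -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
```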


e. Edit mapred-site.xml

If the mapred-site.xml file is not available, create it from the template:

cp mapred-site.xml.template mapred-site.xml

Edit the configuration file mapred-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:
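
A minimal sketch of a typical mapred-site.xml, written to a scratch directory. It assumes MapReduce jobs should run on YARN, which matches the YARN services started later in this tutorial; a given Hadoop 3.x release may need further properties. CONF_DIR and hadoop-conf-sketch are hypothetical names.

```shell
# Scratch directory (hypothetical name); copy the file to HADOOP_HOME/etc/hadoop after review.
CONF_DIR="${CONF_DIR:-./hadoop-conf-sketch}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Run MapReduce jobs on YARN instead of the local job runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
```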


f. Edit yarn-site.xml

Edit the configuration file yarn-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:
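
A minimal sketch of a typical single-node yarn-site.xml, written to a scratch directory. It assumes only the shuffle auxiliary service is needed so that MapReduce jobs can run on the NodeManager; CONF_DIR and hadoop-conf-sketch are hypothetical names.

```shell
# Scratch directory (hypothetical name); copy the file to HADOOP_HOME/etc/hadoop after review.
CONF_DIR="${CONF_DIR:-./hadoop-conf-sketch}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Enable the shuffle service that MapReduce needs on each NodeManager -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
```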



2.5. How to Start the Hadoop services

Let us now see how to start the Hadoop cluster:

The first step to starting up your Hadoop installation is formatting the Hadoop filesystem which is implemented on top of the local filesystem of your “cluster”. This is done as follows:

a. Format the namenode

bin/hdfs namenode -format

NOTE: Format the NameNode only once, when you first install Hadoop. Do not format it again on a Hadoop filesystem that is already in use, because that deletes all your data from HDFS.
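
One way to avoid an accidental reformat is to check for existing NameNode metadata first. The following is a minimal sketch that only prints a recommendation and does not run the format itself. needs_format is a hypothetical helper name, and the default NAME_DIR is an assumption based on hadoop.tmp.dir being /home/dataflair/hdata, the path mentioned in the reader comments at the end of this article.

```shell
# Hypothetical helper: succeed (exit 0) only when the NameNode directory is not formatted yet.
needs_format() {
  name_dir=$1
  # A formatted NameNode directory contains current/VERSION.
  [ ! -f "$name_dir/current/VERSION" ]
}

# Assumed location: with hadoop.tmp.dir=/home/dataflair/hdata the name directory defaults to dfs/name below it.
NAME_DIR="${NAME_DIR:-/home/dataflair/hdata/dfs/name}"

if needs_format "$NAME_DIR"; then
  echo "No NameNode metadata in $NAME_DIR; it is safe to run: bin/hdfs namenode -format"
else
  echo "NameNode already formatted in $NAME_DIR; skip the format to keep HDFS data"
fi
```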

b. Start HDFS Services

sbin/start-dfs.sh

If starting the HDFS services gives an error, set ssh as the default remote command for pdsh:

echo "ssh" | sudo tee /etc/pdsh/rcmd_default

c. Start YARN Services

sbin/start-yarn.sh

d. Check how many daemons are running

Let us now see whether the expected Hadoop processes are running:

jps

2961 ResourceManager
2482 DataNode
3077 NodeManager
2366 NameNode
2686 SecondaryNameNode
3199 Jps
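
The check against this list can also be scripted. The following is a minimal sketch that reads jps output and prints any of the five daemons shown above that is absent; missing_daemons is a hypothetical helper name.

```shell
# Hypothetical helper: read `jps` output on stdin and print each expected daemon that is absent.
missing_daemons() {
  out=$(cat)
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare against the whole second field so "NameNode" does not match "SecondaryNameNode".
    printf '%s\n' "$out" | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' || echo "$d"
  done
}

# jps ships with the JDK; skip the check if it is not on PATH.
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons
else
  echo "jps not found on PATH"
fi
```

No output from missing_daemons means all five daemons were found.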


2.6. How to Stop the Hadoop services

Let us learn how to stop Hadoop services now:

a. Stop YARN services

sbin/stop-yarn.sh

b. Stop HDFS services

sbin/stop-dfs.sh


Browse the web interface for the NameNode; by default, it is available at:

NameNode – http://localhost:9870/

Browse the web interface for the ResourceManager; by default, it is available at:

ResourceManager – http://localhost:8088/

Run a MapReduce job

We are now ready to run our first Hadoop MapReduce job, using the Hadoop word count example.



Responses

  1. Sarvottam Patel says:

    Hey the web interface for namenode is not working.

  2. haranesh says:

    NameNode is not working. Still, it is a very helpful article for freshers who want to work with Hadoop 3.x.

    • Ted Cahall says:

      Take the reference to hadoop.tmp.dir as /home/dataflair/hdata out of core-site.xml. Mine was working before that addition and stopped, even though I had it in HADOOP_HOME and the hdata directory was there with the correct permissions. As soon as I removed that reference, the NameNode began working again. Overall it was a good, well-detailed article on installation of Hadoop 3.x on Ubuntu on a single node cluster.

    • Ted says:

      One of the reasons I could not get the NameNode to start was that after I added the
      to my core-site.xml file, I forgot to format the HDFS file system. Once I did that, it worked.
      It is better to use a ‘hdata’ or some data directory under your HADOOP_HOME since by default the HDFS directory will be built under /tmp and be removed during system reboot.

  3. Sachin says:

    Error while formatting the namenode
    Cannot create directory /home/dataflair/hdata/dfs/name/current
    Please help with detailed settings in any of the files while configuration
