How to Install Hadoop 2.7 on Ubuntu | Hadoop Installation Steps


1. Install Hadoop 2.7 on Ubuntu Tutorial: Objective

This tutorial explains how to install and configure Hadoop 2.7.x on Ubuntu. It walks step by step through deploying Hadoop on a single server (a single-node cluster) running Ubuntu. With this quick start you can install, configure, and run Hadoop 2.7 in under 10 minutes. The installation also enables YARN, so that besides MapReduce you can run other kinds of applications, such as Spark.

2. How to Install Hadoop 2.7 on Ubuntu?

This section covers the installation and configuration of Hadoop 2.7.x on Ubuntu step by step. Follow the steps given below:

2.1. Prerequisites to install Hadoop 2.7 on Ubuntu

If you are on Windows or macOS, create a virtual machine with VMware Player or Oracle VirtualBox and install Ubuntu in it.

I. Install Oracle Java 8

a. Install Python Software Properties

[php]sudo apt-get install python-software-properties[/php]

b. Add Repository

[php]sudo add-apt-repository ppa:webupd8team/java[/php]

c. Update the source list

[php]sudo apt-get update[/php]

d. Install Java

[php]sudo apt-get install oracle-java8-installer[/php]

Note: The webupd8team/java PPA has since been discontinued, and newer Ubuntu releases provide add-apt-repository through the software-properties-common package rather than python-software-properties. If the steps above fail, installing the openjdk-8-jdk package from the standard repositories is an alternative; in that case point JAVA_HOME at that installation when editing hadoop-env.sh.

II. Setup Password-less SSH

a. Install OpenSSH Server & OpenSSH Client

[php]sudo apt-get install openssh-server openssh-client[/php]

b. Generate Public & Private Key Pairs

[php]ssh-keygen -t rsa -P ""[/php]

c. Configure password-less SSH

[php]cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys[/php]

d. Check by SSH to localhost

[php]ssh localhost[/php]
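
If `ssh localhost` still asks for a password, one common cause is that the key files are too widely accessible, since OpenSSH in its default strict mode ignores an authorized_keys file that other users can write to. The sketch below tightens the permissions; the helper name `fix_ssh_perms` is hypothetical and not part of the original steps.

```shell
# Hypothetical helper: restrict an SSH directory to its owner.
fix_ssh_perms() {
    ssh_dir="$1"
    chmod 700 "$ssh_dir"
    # authorized_keys may not exist yet if step c was skipped
    if [ -f "$ssh_dir/authorized_keys" ]; then
        chmod 600 "$ssh_dir/authorized_keys"
    fi
}

# Apply to the current user's SSH directory, if present
if [ -d "$HOME/.ssh" ]; then
    fix_ssh_perms "$HOME/.ssh"
fi
```

After running it, `ssh localhost` should log in without prompting, provided the key was generated with an empty passphrase as in step b.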

2.2. Configure, Set Up and Install Hadoop 2.7 on Ubuntu

I. Download Hadoop

https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz

II. Extract the Tarball

[php]tar xzf hadoop-2.7.1.tar.gz[/php]

Note: All the required jars, scripts, configuration files, etc. are in the extracted hadoop-2.7.1 directory, which serves as the Hadoop home directory.

III. Setup Configuration

a. Edit .bashrc

Edit the .bashrc file in the user’s home directory and add the following lines:

[php]export HADOOP_PREFIX=/home/hdadmin/hadoop-2.7.1
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_PREFIX/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib/native"[/php]

Note: After this step, restart the terminal (or run source ~/.bashrc) so that the environment variables take effect.
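
To confirm that the variables are visible in the new terminal, a small check like the following can help. It is only a sketch; the function name `check_hadoop_env` is hypothetical, and it inspects just HADOOP_PREFIX and PATH.

```shell
# Hypothetical helper: report whether the Hadoop variables from .bashrc are visible.
check_hadoop_env() {
    if [ -z "$HADOOP_PREFIX" ]; then
        echo "HADOOP_PREFIX is not set"
        return 1
    fi
    case ":$PATH:" in
        *":$HADOOP_PREFIX/bin:"*)
            echo "HADOOP_PREFIX=$HADOOP_PREFIX (bin is on PATH)" ;;
        *)
            echo "HADOOP_PREFIX is set, but $HADOOP_PREFIX/bin is not on PATH"
            return 1 ;;
    esac
}

check_hadoop_env || echo "Open a new terminal or run: source ~/.bashrc"
```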

b. Edit hadoop-env.sh

Edit hadoop-env.sh (located in etc/hadoop inside the Hadoop installation directory) and set JAVA_HOME:

[php]# Replace the path with the root of your Java installation
export JAVA_HOME=/usr/lib/jvm/java-8-oracle[/php]
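
If you are unsure where Java is installed, the root can usually be derived from the location of the `javac` binary. The following is a sketch under that assumption; `jdk_root_from_javac` is a hypothetical helper name, and `readlink -f` is the GNU form available on Ubuntu.

```shell
# Hypothetical helper: turn the resolved path of javac into a JDK root,
# e.g. /usr/lib/jvm/java-8-oracle/bin/javac -> /usr/lib/jvm/java-8-oracle
jdk_root_from_javac() {
    dirname "$(dirname "$1")"
}

# Resolve symlinks such as /usr/bin/javac, if a JDK is installed
if command -v javac >/dev/null 2>&1; then
    jdk_root_from_javac "$(readlink -f "$(command -v javac)")"
else
    echo "javac not found on PATH"
fi
```

The printed directory is a candidate value for JAVA_HOME in hadoop-env.sh.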

c. Edit core-site.xml

Edit core-site.xml (located in etc/hadoop inside the Hadoop installation directory) and add the following entries:

[php]<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hdadmin/hdata</value>
</property>
</configuration>[/php]

Note: You must have read and write privileges on /home/hdadmin/hdata; otherwise, specify a location where you do.
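
The directory named in hadoop.tmp.dir can be created and checked ahead of time. This is a minimal sketch; `ensure_writable_dir` is a hypothetical helper name.

```shell
# Hypothetical helper: create a directory if needed and confirm it is writable.
ensure_writable_dir() {
    mkdir -p "$1" || return 1
    if [ -w "$1" ]; then
        echo "ok: $1"
    else
        echo "not writable: $1"
        return 1
    fi
}

# Usage with the path from core-site.xml:
#   ensure_writable_dir /home/hdadmin/hdata
```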

d. Edit hdfs-site.xml

Edit hdfs-site.xml (located in etc/hadoop inside the Hadoop installation directory) and add the following entries:

[php]<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>[/php]

e. Edit mapred-site.xml

Copy mapred-site.xml.template (located in etc/hadoop inside the Hadoop installation directory) to a file named mapred-site.xml in the same directory, then edit mapred-site.xml and add the following entries:

[php]<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>[/php]
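
The copy step described above might look like this. It is a sketch only; `create_mapred_site` is a hypothetical helper that refuses to overwrite a mapred-site.xml that already exists.

```shell
# Hypothetical helper: create mapred-site.xml from the template without
# overwriting an existing file. $1 is the etc/hadoop directory.
create_mapred_site() {
    conf_dir="$1"
    if [ -e "$conf_dir/mapred-site.xml" ]; then
        echo "mapred-site.xml already exists, leaving it untouched"
    else
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
}

# Usage, assuming HADOOP_PREFIX is set as in .bashrc:
#   create_mapred_site "$HADOOP_PREFIX/etc/hadoop"
```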

f. Edit yarn-site.xml

Edit yarn-site.xml (located in etc/hadoop inside the Hadoop installation directory) and add the following entries:

[php]<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>[/php]

2.3. Start the Cluster

I. Format the name node:

[php]hdfs namenode -format[/php]

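NOTE: Format the NameNode only once, when you first install Hadoop. Formatting it again creates a fresh, empty file system and makes the data already stored in HDFS inaccessible.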
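
To avoid formatting by accident a second time, the format command can be guarded by a check for existing NameNode metadata. The sketch below assumes dfs.namenode.name.dir is left at its default of ${hadoop.tmp.dir}/dfs/name, as in the configuration above; `namenode_is_formatted` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the NameNode under the given hadoop.tmp.dir
# already looks formatted. Assumes dfs.namenode.name.dir is left at its
# default of ${hadoop.tmp.dir}/dfs/name.
namenode_is_formatted() {
    [ -f "$1/dfs/name/current/VERSION" ]
}

# Usage with the hadoop.tmp.dir from core-site.xml:
#   if namenode_is_formatted /home/hdadmin/hdata; then
#       echo "already formatted, skipping"
#   else
#       hdfs namenode -format
#   fi
```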

II. Start HDFS Services:

[php]start-dfs.sh[/php]

III. Start YARN Services:

[php]start-yarn.sh[/php]

IV. Check whether services have been started

[php]jps
NameNode
DataNode
ResourceManager
NodeManager
SecondaryNameNode
[/php]
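
Rather than comparing the list by eye, the `jps` output can be checked for the five expected daemons. This is a sketch; `missing_daemons` is a hypothetical helper that prints the name of every daemon it does not find.

```shell
# Hypothetical helper: read `jps` output on stdin and print each expected
# daemon that is missing. jps prints one "<pid> <name>" pair per line.
missing_daemons() {
    running=$(awk '{print $2}')
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
            echo "$daemon"
        fi
    done
}

# Usage:
#   jps | missing_daemons
```

No output means all five daemons are running. If one is missing, its log file in the logs directory of the Hadoop installation usually states why it failed to start.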

2.4. Run MapReduce Jobs

I. Run word count example:

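[php]hdfs dfs -mkdir /data
# <file> stands for a local text file to upload
hdfs dfs -put <file> /data
# The output directory (/data-out) must not exist before the job runs
yarn jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /data /data-out
hdfs dfs -cat /data-out/*[/php]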
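
For a small input file, the job's result can be cross-checked locally. The pipeline below is a rough local equivalent, assuming the example splits words on whitespace and writes tab-separated word and count pairs sorted by word; `local_wordcount` is a hypothetical helper, not part of Hadoop.

```shell
# Hypothetical helper: count whitespace-separated words on stdin and print
# "word<TAB>count" lines, sorted by word.
local_wordcount() {
    tr -s ' \t\r\f' '\n' | grep -v '^$' | LC_ALL=C sort | uniq -c | awk '{print $2 "\t" $1}'
}

# Usage, with the same local file that was uploaded to /data:
#   local_wordcount < /path/to/local/file
```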


2.5. Stop the Cluster

I. Stop HDFS Services:

[php]stop-dfs.sh[/php]

II. Stop YARN Services:

[php]stop-yarn.sh[/php]

That covers installing Hadoop 2.7 on Ubuntu in about 10 minutes. If you have questions about the installation, ask them in the comment section below. We would also like to hear your feedback on this tutorial.


DataFlair Team


Freddie says:
December 7, 2016 at 10:54 am
It’s an amazing post in support of all the online people; they will take
advantage from it I am sure.
- Data Flair says:
  August 24, 2018 at 12:13 pm
  You are the best Freddie. Thank you for sharing such a positive experience on Hadoop installation. This post is for our loyal readers so that they can gain more Hadoop Knowledge easily.
  Keep learning from Data Flair
  Visit again
AJ says:
January 26, 2017 at 3:38 am
Cannot format hdfs namenode
achu13 says:
March 26, 2017 at 3:06 pm
start-dfs.sh
17/03/26 20:34:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/hadoopacc/hadoop/logs/hadoop-hadoopacc-namenode-Achu.out
localhost: starting datanode, logging to /home/hadoopacc/hadoop/logs/hadoop-hadoopacc-datanode-Achu.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoopacc/hadoop/logs/hadoop-hadoopacc-secondarynamenode-Achu.out
17/03/26 20:34:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
Namenode and datanodes are not starting!!! i am getting this error whenever i try to start !!please help me!!
- DataFlair Team says:
  March 28, 2017 at 7:38 am
  The message you see is only a warning, not an error. It means the native Hadoop library could not be loaded for your platform, so Hadoop falls back to its built-in Java classes. You can ignore it.
  Run the jps command to check whether all the Hadoop daemons are running or not
  - rupesh says:
    January 29, 2018 at 1:19 pm
    I am also facing the same issue: the DataNode and NameNode are not running.
    On running jps, only NodeManager and ResourceManager are shown.
daniel says:
April 5, 2018 at 11:27 am
Hii i am trying to execute the “jps” command but got the error
help me plz
hduser@ubuntu:/home/daniel$ jps
The program ‘jps’ can be found in the following packages:
* openjdk-8-jdk-headless
* openjdk-9-jdk-headless
Try: sudo apt install
dinesh tak says:
August 12, 2018 at 9:41 am
I have installed Hadoop. I want to show the NameNode and DataNode in Ambari. What steps should I perform? Please send me the steps.
