12 frequently used Hadoop HDFS Commands with Examples & usage

Practice the most frequently used Hadoop HDFS commands to perform operations on HDFS files/directories with usage and examples.

In this Hadoop HDFS commands tutorial, we will learn the remaining frequently used HDFS commands. With them we can perform HDFS file operations such as creating empty files, testing paths, viewing file contents, changing file permissions, appending to files, and merging files.

Hadoop HDFS Commands Tutorial

Hadoop HDFS commands are used to perform operations on HDFS and to manage the files stored on HDFS clusters. In this Hadoop fs commands tutorial, we will discuss basic Hadoop shell commands and frequently used Hadoop commands, with examples and descriptions.

If you have any questions about this Hadoop HDFS tutorial, please leave a comment.

Before interacting with HDFS, you need to deploy Hadoop. Follow this detailed tutorial to install and configure Hadoop.

1. touchz

Hadoop touchz Command Usage:

hadoop fs -touchz /directory/filename

Hadoop touchz Command Example:

In this example, we create a new file ‘file1’ with a size of 0 bytes in the newDataFlair directory of HDFS.

To check for the file, use the ls command to list the files and directories.

Hadoop touchz Command Description:

The touchz command creates a file in HDFS with a size of 0 bytes. In the usage above, directory is the name of the directory where the file will be created, and filename is the name of the new file.
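
As a minimal sketch of the create-then-list workflow described above, the helper below creates an empty file and lists its parent directory. The function name `hdfs_touch_and_list` is hypothetical, and the path `/newDataFlair/file1` in the usage note is an assumption, since the exact path from the original example is not shown.

```shell
# Hypothetical helper: create a zero-length file in HDFS, then list its directory.
# Usage: hdfs_touch_and_list <hdfs-directory> <filename>
hdfs_touch_and_list() {
  # -touchz creates a 0-byte file; it returns an error if the file
  # already exists with non-zero length.
  hadoop fs -touchz "$1/$2" || return 1
  # -ls lists the directory so the new entry can be checked.
  hadoop fs -ls "$1"
}
```

For example, `hdfs_touch_and_list /newDataFlair file1` should show ‘file1’ with a size of 0 in the listing.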

2. test

Hadoop test Command Usage:

hadoop fs -test -[defsrwz] <path>

Hadoop test Command Example:

hdfs dfs -test -e sample
hdfs dfs -test -z sample
hdfs dfs -test -d sample

Hadoop test Command Description:

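The test command is used for file test operations.

It prints nothing. The result is reported through the exit status of the command.

It returns 0 if the check given by the option succeeds (for example, the path exists, is a directory, or has zero length), and 1 otherwise.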

Option	Description
-d	Checks whether the given path is a directory; returns 0 if it is a directory.
-e	Checks whether the given path exists; returns 0 if the path exists.
-f	Checks whether the given path is a file; returns 0 if it is a file.
-s	Checks whether the path is not empty; returns 0 if the path is not empty.
-r	Returns 0 if the path exists and read permission is granted.
-w	Returns 0 if the path exists and write permission is granted.
-z	Checks whether the file size is 0 bytes; returns 0 if the file is 0 bytes long.
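
Because the result comes back as an exit status, the test command is usually combined with shell conditionals. The sketch below shows one way to do that; the function name `hdfs_path_kind` is hypothetical and not part of Hadoop.

```shell
# Hypothetical helper: classify an HDFS path using the exit status of -test.
# An exit status of 0 means the check passed; non-zero means it did not.
hdfs_path_kind() {
  if ! hadoop fs -test -e "$1"; then
    echo "missing"
  elif hadoop fs -test -d "$1"; then
    echo "directory"
  elif hadoop fs -test -z "$1"; then
    echo "empty file"
  else
    echo "non-empty file"
  fi
}
```

Calling `hdfs_path_kind sample` prints one of the four labels, depending on what ‘sample’ is in HDFS.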

3. text

Hadoop text Command Usage:

hadoop fs -text <src>

Hadoop text Command Example:

Here in this example, we are using the text command to display the ‘sample’ zip file in text format.

Hadoop text Command Description:

The Hadoop fs shell command text takes the source file and outputs the file in the text format. It detects the encoding of the file and decodes it to plain text.

The allowed formats are zip and TextRecordInputStream.

4. stat

Hadoop stat Command Usage:

hadoop fs -stat [format] <path>

Hadoop stat Command Example:

In the example below, we use the stat command to print information about the file ‘test’ in the dataflair directory of HDFS.

Hadoop stat Command Description:

The Hadoop fs shell command stat prints the statistics about the file or directory in the specified format.

Formats:

%b – file size in bytes
%g – group name of owner
%n – file name
%o – block size
%r – replication
%u – user name of owner
%y – modification date

If the format is not specified then %y is used by default.
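
Several format specifiers can be combined in one format string. The sketch below is one possible combination. The function name `hdfs_stat_summary` and the labels inside the format string are illustrative choices, and the path `/dataflair/test` in the usage note is an assumption about where the example file lives.

```shell
# Hypothetical helper: print name, size, replication, and modification date.
# Usage: hdfs_stat_summary <hdfs-path>
hdfs_stat_summary() {
  # Quote the format so the shell passes it to hadoop as a single argument.
  hadoop fs -stat "name=%n bytes=%b replication=%r modified=%y" "$1"
}
```

For example, `hdfs_stat_summary /dataflair/test` prints the four values on one line.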

5. usage

Hadoop usage Command Usage:

hadoop fs -usage <command>

Hadoop usage Command Example:

Hadoop usage Command Description:

The Hadoop fs shell command usage returns the help for an individual command.

6. help

Hadoop help Command Usage:

hadoop fs -help [command]

Hadoop help Command Example:

Hadoop help Command Description:

The Hadoop fs shell command help shows help for all the commands or the specified command.

7. chmod

Hadoop chmod Command Usage:

hadoop fs -chmod [-R] <mode> <path>

Hadoop chmod Command Example:

Here we are changing the file permission of file ‘testfile’ present on the HDFS file system.

Hadoop chmod Command Description:

The Hadoop fs shell command chmod changes the permissions of a file.

The -R option applies the permission change recursively through the directory structure.

The user must be the owner of the file or superuser.
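
The sketch below wraps both forms of the command, with and without -R. The function name `hdfs_set_mode` and its third argument are hypothetical, and the paths in the usage note are assumptions.

```shell
# Hypothetical helper: set an octal mode on an HDFS path, optionally recursively.
# Usage: hdfs_set_mode <mode> <hdfs-path> [recursive]
hdfs_set_mode() {
  if [ "${3:-}" = "recursive" ]; then
    # -R walks the directory tree and changes every entry below the path.
    hadoop fs -chmod -R "$1" "$2"
  else
    hadoop fs -chmod "$1" "$2"
  fi
}
```

For example, `hdfs_set_mode 644 /testfile` changes a single file, while `hdfs_set_mode 750 /DataFlair recursive` changes a directory and everything under it.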

8. appendToFile

Hadoop appendToFile Command Usage:

hadoop fs -appendToFile <localsrc> ... <dest>

Hadoop appendToFile Command Example:

In the example below, we append localfile1 and localfile2 from the local filesystem to the file named ‘apendfile’ in the DataFlair directory on the HDFS filesystem.

Hadoop appendToFile Command Description:

The Hadoop fs shell command appendToFile appends the content of one or more local files, specified in localsrc, to the given destination file on HDFS.

The destination file is created if it does not already exist.
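
As a rough sketch of the multi-file case, the helper below takes the HDFS destination first and any number of local files after it. The function name `hdfs_append` is hypothetical, and the path `/DataFlair/apendfile` in the usage note is an assumption based on the example description.

```shell
# Hypothetical helper: append one or more local files to a single HDFS file.
# Usage: hdfs_append <hdfs-destination> <local-file>...
hdfs_append() {
  dest="$1"
  shift
  # On the hadoop command line the local sources come first
  # and the HDFS destination comes last.
  hadoop fs -appendToFile "$@" "$dest"
}
```

For example, `hdfs_append /DataFlair/apendfile localfile1 localfile2` appends both files in the order given. The command also accepts `-` as the source to read from standard input, as in `echo "one more line" | hadoop fs -appendToFile - /DataFlair/apendfile`.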

9. checksum

Hadoop checksum Command Usage:

hadoop fs -checksum <src>

Hadoop checksum Command Example:

Here we are checking the checksum of file ‘apendfile’ present in DataFlair directory on the HDFS filesystem.

Hadoop checksum Command Description:

The Hadoop fs shell command checksum returns the checksum information of a file.

10. count

Hadoop count Command Usage:

hadoop fs -count [options] <path>

Hadoop count Command Example:

Hadoop count Command Description:

The Hadoop fs shell command count counts the number of directories, files, and bytes under the paths that match the specified file pattern.

Options:
-q – shows quotas (a quota is a hard limit on the number of names and the amount of space used for an individual directory)
-u – limits the output to quotas and usage only
-h – shows sizes in a human-readable format
-v – shows a header line
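
Without -q, each line of count output has four columns: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME. The sketch below labels those columns with awk. The function name `hdfs_count_summary` is hypothetical, and the helper assumes the path contains no spaces.

```shell
# Hypothetical helper: label the default columns printed by -count.
# Each output line is: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
hdfs_count_summary() {
  hadoop fs -count "$1" |
    awk '{ printf "%s: %s dirs, %s files, %s bytes\n", $4, $1, $2, $3 }'
}
```

For example, `hdfs_count_summary /DataFlair` prints one labelled line for the directory. Adding -v to a plain `hadoop fs -count` call shows the same column names as a header.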

11. find

Hadoop HDFS find command usage:

hadoop fs -find <path> … <expression>

Hadoop find Command Example:

Here in this example, we are trying to find ‘copytest’ file in HDFS.

Hadoop HDFS find command description:

The Hadoop fs shell command find finds all files that match the specified expression. If no path is specified, then it defaults to the present working directory. If an expression is not specified, then it defaults to -print.
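
A common expression is -name, which matches on the file name. The sketch below searches a subtree by name; the function name `hdfs_find_by_name` is hypothetical, and starting the search at `/` in the usage note is an assumption.

```shell
# Hypothetical helper: search an HDFS subtree for entries matching a name pattern.
# Usage: hdfs_find_by_name <start-path> <name-pattern>
hdfs_find_by_name() {
  # Quote the pattern so the local shell does not expand wildcards first.
  hadoop fs -find "$1" -name "$2" -print
}
```

For example, `hdfs_find_by_name / copytest` searches the whole filesystem for ‘copytest’. The -iname expression works the same way but ignores case.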

12. getmerge

Hadoop HDFS getmerge command usage:

hadoop fs -getmerge <src> <localdest>

Hadoop getmerge Command Example:

Here in this example, we are merging the copytest, file1, and sample file present in HDFS into a single file ‘MergeFile’ on the local filesystem.

Hadoop HDFS getmerge command description:

getmerge command merges a list of files in a directory on the HDFS filesystem into a single local file on the local filesystem.
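
The sketch below merges several HDFS sources into one local file. The function name `hdfs_merge_to_local` is hypothetical, the HDFS paths in the usage note are assumptions, and the -nl option is an addition not used in the original example: it appends a newline after each source file.

```shell
# Hypothetical helper: merge HDFS files into one file on the local filesystem.
# Usage: hdfs_merge_to_local <local-destination> <hdfs-source>...
hdfs_merge_to_local() {
  merged="$1"
  shift
  # -nl adds a newline after each source file so records do not run together.
  hadoop fs -getmerge -nl "$@" "$merged"
}
```

For example, `hdfs_merge_to_local MergeFile /copytest /file1 /sample` writes the three files, in that order, into ‘MergeFile’ in the current local directory.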

Summary

I hope that after reading this article you are able to use HDFS commands to perform operations on the Hadoop filesystem. The article has explained essential HDFS commands, including test, stat, chmod, count, etc.

Now it's time to learn a few more concepts to master HDFS thoroughly.

To learn more about the world’s most reliable storage layer, follow this HDFS introductory guide.

For any queries or feedback regarding Hadoop commands, leave a comment in the section below. I hope you like the Hadoop Commands tutorial.
