12 frequently used Hadoop HDFS Commands with Examples & usage


Practice the most frequently used Hadoop HDFS commands to perform operations on HDFS files/directories with usage and examples.

In this Hadoop HDFS commands tutorial, we cover more of the important and frequently used HDFS commands. With them you can perform HDFS file operations such as creating empty files, testing paths, viewing file contents, changing file permissions, appending to files, and merging files.

Hadoop HDFS Commands Tutorial

Hadoop HDFS commands are used to perform operations on HDFS and to manage the files stored on HDFS clusters. In this Hadoop fs commands tutorial, we discuss basic Hadoop shell commands and frequently used Hadoop commands, with examples and descriptions.

If you have any questions about this Hadoop HDFS tutorial, please leave a comment.

Before interacting with HDFS, you need to deploy Hadoop. Follow this detailed tutorial to install and configure Hadoop.

12 Frequently Used Hadoop HDFS Commands

1. touchz

Hadoop touchz Command Usage:

hadoop fs -touchz /directory/filename

Hadoop touchz Command Example:

In this example, we create a new zero-byte file ‘file1’ in the newDataFlair directory of HDFS.

[Screenshot: touchz command]

To check for the file, use the ls command to list the files and directories.

[Screenshot: ls command used to check the file]

Hadoop touchz Command Description:

The touchz command creates a file in HDFS with a size of 0 bytes. Here, directory is the name of the directory in which the file is created, and filename is the name of the new file. The command returns an error if the file already exists with a non-zero length.
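As a minimal sketch of the description above, the helper below creates an empty file and then confirms that it has zero length. The function name `make_empty_file` and the path in the usage comment are illustrative assumptions, not part of Hadoop.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a zero-length file in HDFS and verify it.
make_empty_file() {
  local path="$1"
  # -touchz creates the file with a size of 0 bytes
  hadoop fs -touchz "$path" || return 1
  # -test -z exits with status 0 only when the file has zero length
  hadoop fs -test -z "$path"
}

# Example call (the path is an assumption):
#   make_empty_file /newDataFlair/file1 && echo "created"
```

Checking with `-test -z` afterwards is optional; it is included only to show how the two commands can be combined in a script.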


2. test

Hadoop test Command Usage:

hadoop fs -test -[defsrwz] <path>

Hadoop test Command Example:

hdfs dfs -test -e sample
hdfs dfs -test -z sample
hdfs dfs -test -d sample

Hadoop test Command Description:

The test command is used for file test operations.

It prints nothing and reports the result through its exit status: 0 if the check passes, and a non-zero status otherwise.

Options:
-d  –  checks whether the path is a directory; returns 0 if it is a directory
-e  –  checks whether the path exists; returns 0 if the path exists
-f  –  checks whether the path is a file; returns 0 if it is a file
-s  –  checks whether the path is not empty; returns 0 if the path is not empty
-r  –  returns 0 if the path exists and read permission is granted
-w  –  returns 0 if the path exists and write permission is granted
-z  –  checks whether the file size is 0 bytes; returns 0 if the file is 0 bytes long
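Because each option returns 0 on success, the test command fits directly into shell conditionals. The sketch below classifies a path using that exit status; the function name `hdfs_path_kind` is a hypothetical helper introduced only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify an HDFS path using the exit status of -test.
hdfs_path_kind() {
  local path="$1"
  if hadoop fs -test -d "$path"; then
    echo "directory"      # -d returned 0: the path is a directory
  elif hadoop fs -test -f "$path"; then
    echo "file"           # -f returned 0: the path is a regular file
  else
    echo "missing"        # neither check passed
  fi
}

# Example call (the path is an assumption):
#   hdfs_path_kind sample
```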

3. text

Hadoop text Command Usage:

hadoop fs -text <src>

Hadoop text Command Example:

Here in this example, we are using the text command to display the ‘sample’ zip file in text format.

[Screenshot: text command]

Hadoop text Command Description:

The Hadoop fs shell command text takes the source file and outputs the file in the text format. It detects the encoding of the file and decodes it to plain text.

The allowed formats are zip and TextRecordInputStream.

4. stat

Hadoop stat Command Usage:

hadoop fs -stat [format] <path>

Hadoop stat Command Example:

In the example below, we use the stat command to print information about the file ‘test’ in the dataflair directory of HDFS.

[Screenshot: stat command]

Hadoop stat Command Description:

The Hadoop fs shell command stat prints the statistics about the file or directory in the specified format.

Formats:

%b  –  file size in bytes
%g  –  group name of owner
%n  –  file name
%o  –  block size
%r  –  replication
%u  –  user name of owner
%y  –  modification date

If the format is not specified then %y is used by default.
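Several format specifiers can be combined in one format string. The sketch below prints the name, size, replication, owner and group, and modification date of a path in a single call; the function name `hdfs_file_summary` and the chosen layout are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print name, size in bytes, replication,
# owner:group and modification date for one HDFS path.
hdfs_file_summary() {
  # The format string is quoted so the shell passes it as one argument
  hadoop fs -stat "%n %b %r %u:%g %y" "$1"
}

# Example call (the path is an assumption):
#   hdfs_file_summary /dataflair/test
```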


5. usage

Hadoop ‘usage’ Command Usage:

hadoop fs -usage <command>

Hadoop usage Command Example:

[Screenshot: usage command]

Hadoop usage Command Description:

The Hadoop fs shell command usage returns the help for an individual command.

6. help

Hadoop help Command Usage:

hadoop fs -help [command]

Hadoop help Command Example:

[Screenshot: help command]

Hadoop help Command Description:

The Hadoop fs shell command help shows help for all the commands or the specified command.


7. chmod

Hadoop chmod Command Usage:

hadoop fs -chmod [-R] <mode> <path>

Hadoop chmod Command Example:

Here we are changing the file permission of file ‘testfile’ present on the HDFS file system.

[Screenshot: chmod command]

[Screenshot: checking the mode of the file]

Hadoop chmod Command Description:

The Hadoop fs shell command chmod changes the permissions of a file.

The -R option applies the change recursively through the directory structure.

The user must be the owner of the file or a superuser.
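A short sketch of a recursive permission change follows. It uses the octal mode 750 (full access for the owner, read and execute for the group, nothing for others) and then lists the directory entry so the new mode can be checked. The function name `restrict_dir` and the chosen mode are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply mode 750 to a directory and everything below it.
restrict_dir() {
  local dir="$1"
  # -R walks the directory tree and applies the mode to every entry
  hadoop fs -chmod -R 750 "$dir" || return 1
  # -ls -d lists the directory entry itself, which shows the new mode
  hadoop fs -ls -d "$dir"
}

# Example call (the path is an assumption):
#   restrict_dir /DataFlair
```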

8. appendToFile

Hadoop appendToFile Command Usage:

hadoop fs -appendToFile <localsrc> … <dest>

Hadoop appendToFile Command Example:

In the example below, we append localfile1 and localfile2 from the local filesystem to the file named ‘apendfile’ in the DataFlair directory on the HDFS filesystem.

[Screenshot: appendToFile command]

[Screenshot: checking the result of appendToFile]

Hadoop appendToFile Command Description:

The Hadoop fs shell command appendToFile appends the content of one or more local files, given as localsrc, to the destination file on HDFS.

The destination file is created if it does not already exist.
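Besides local files, appendToFile also accepts `-` as the source, in which case it reads from standard input. The sketch below uses that to append a single line of text; the function name `append_line` is a hypothetical helper, and the destination path in the usage comment is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append one line of text to an HDFS file.
# A localsrc of "-" tells appendToFile to read from standard input.
append_line() {
  local dest="$1" text="$2"
  printf '%s\n' "$text" | hadoop fs -appendToFile - "$dest"
}

# Example call (the path is an assumption):
#   append_line /DataFlair/apendfile "one more line"
```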

9. checksum

Hadoop checksum Command Usage:

hadoop fs -checksum <src>

Hadoop checksum Command Example:

Here we are checking the checksum of file ‘apendfile’ present in DataFlair directory on the HDFS filesystem.

[Screenshot: checksum command]

Hadoop checksum Command Description:

The Hadoop fs shell command checksum returns the checksum information of a file.
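One possible use of the checksum is comparing two HDFS files without downloading them. The sketch below assumes that the checksum value is the last whitespace-separated field of the command's output; the function name `same_checksum` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when two HDFS files report the same checksum.
# Assumption: the checksum value is the last field of each output line.
same_checksum() {
  local a b
  a="$(hadoop fs -checksum "$1" | awk '{ print $NF }')"
  b="$(hadoop fs -checksum "$2" | awk '{ print $NF }')"
  # Fail if either checksum is empty, otherwise compare them
  [ -n "$a" ] && [ -n "$b" ] && [ "$a" = "$b" ]
}

# Example call (the paths are assumptions):
#   same_checksum /DataFlair/apendfile /backup/apendfile && echo "match"
```

Treat a mismatch with care: the reported value can depend on how the file was written (for example, its block size), so two files with identical content may still report different checksums.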

10. count

Hadoop count Command Usage:

hadoop fs -count [options] <path>

Hadoop count Command Example:

[Screenshot: count command]

Hadoop count Command Description:

The Hadoop fs shell command count counts the number of directories, files, and bytes under the paths that match the specified file pattern.

Options:
-q  –  shows quotas (a quota is the hard limit on the number of names and the amount of space used for individual directories)
-u  –  it limits output to show quotas and usage only
-h  –  shows sizes in a human-readable format
-v  –  shows header line
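Without options, count prints one line per path with the columns DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME. The sketch below labels the first three columns with awk; the function name `hdfs_count_summary` and the output labels are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: label the columns of a plain -count call.
# Assumes the default column order: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME.
hdfs_count_summary() {
  hadoop fs -count "$1" |
    awk '{ print "dirs=" $1, "files=" $2, "bytes=" $3 }'
}

# Example call (the path is an assumption):
#   hdfs_count_summary /DataFlair
```

The column order changes when options such as -q are added, so this parsing applies only to the plain form of the command.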

11. find

Hadoop HDFS find command usage:

hadoop fs -find <path> … <expression>

Hadoop find Command Example:

Here in this example, we are trying to find ‘copytest’ file in HDFS.

[Screenshot: find command]

Hadoop HDFS find command description:

The Hadoop fs shell command find finds all files that match the specified expression. If no path is specified, then it defaults to the present working directory. If an expression is not specified, then it defaults to -print.
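A typical expression combines -name with -print. The sketch below wraps that combination; the function name `find_by_name` is a hypothetical helper, and the search root in the usage comment is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list paths under a root whose base name matches a pattern.
find_by_name() {
  local root="$1" pattern="$2"
  # -name matches the base name of each path; -print writes each match to stdout
  hadoop fs -find "$root" -name "$pattern" -print
}

# Example call (the search root is an assumption):
#   find_by_name / copytest
```

Quoting the pattern matters when it contains wildcards such as `*`, because otherwise the local shell may expand it before Hadoop sees it.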

12. getmerge

Hadoop HDFS getmerge command usage:

hadoop fs -getmerge <src> <localdest>

Hadoop getmerge Command Example:

In this example, we merge the copytest, file1, and sample files in HDFS into a single file ‘MergeFile’ on the local filesystem.

[Screenshot: getmerge command]

[Screenshot: displaying the content of MergeFile]

Hadoop HDFS getmerge command description:

getmerge command merges a list of files in a directory on the HDFS filesystem into a single local file on the local filesystem.
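The sketch below merges the files of one HDFS directory into a local file and reports how many lines the result contains. It uses the optional -nl flag, which adds a newline at the end of each source file so that files are not joined mid-line. The function name `merge_to_local` and the paths in the usage comment are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: merge an HDFS directory into one local file
# and print the number of lines in the merged result.
merge_to_local() {
  local src_dir="$1" local_file="$2"
  # -nl appends a newline after each source file
  hadoop fs -getmerge -nl "$src_dir" "$local_file" || return 1
  wc -l < "$local_file"
}

# Example call (the paths are assumptions):
#   merge_to_local /DataFlair ./MergeFile
```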

Summary

I hope that after reading this article you are able to use HDFS commands to perform operations on the Hadoop filesystem. The article has explained several essential HDFS commands, including test, stat, chmod, and count.

Now it's time to learn a few more concepts to master HDFS thoroughly.

To learn more about the world’s most reliable storage layer, follow this HDFS introductory guide.

For any queries or feedback regarding Hadoop commands, just leave a comment in the section below. I hope you like the Hadoop commands tutorial.
