12 Frequently Used Hadoop HDFS Commands with Examples & Usage
Practice the most frequently used Hadoop HDFS commands to perform operations on HDFS files/directories with usage and examples.
In this Hadoop HDFS commands tutorial, we cover more of the important and frequently used HDFS commands. With them we can perform HDFS file operations such as creating empty files, testing paths, viewing file contents, printing file statistics, changing file permissions, appending to files, and merging files.
Hadoop HDFS Commands Tutorial
Hadoop HDFS commands are used to perform various HDFS operations and to manage the files present on HDFS clusters. In this Hadoop fs commands tutorial, we will discuss the basic Hadoop shell commands and other frequently used Hadoop commands, with examples and descriptions.
If you have any questions about this Hadoop HDFS tutorial, please leave a comment.
Before interacting with HDFS, you need to deploy Hadoop. Follow this detailed tutorial to install and configure Hadoop.
1. touchz
Hadoop touchz Command Usage:
hadoop fs -touchz /directory/filename
Hadoop touchz Command Example:
In this example, we create a new file ‘file1’ in the newDataFlair directory of HDFS with a size of 0 bytes.
To check for the file, use the ls command to list the files and directories.
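The two steps just described could look like the following sketch. The path /newDataFlair/file1 is assumed from the example text, and the guard around the commands is an addition so that the snippet does nothing on a machine without the hadoop client.

```shell
# Paths assumed from the example text; adjust them for your cluster.
hdfs_dir="/newDataFlair"
hdfs_file="$hdfs_dir/file1"

# Only talk to HDFS when the hadoop client is on the PATH.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -touchz "$hdfs_file"   # create a zero-byte file
  hadoop fs -ls "$hdfs_dir"        # list the directory to confirm the file is there
fi
```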
Hadoop touchz Command Description:
The touchz command creates a file in HDFS with a size of 0 bytes. Here, directory is the name of the HDFS directory in which the file is created, and filename is the name of the new file.
Read: Read write Operations in HDFS
2. test
Hadoop test Command Usage:
hadoop fs -test -[defswrz] <path>
Hadoop test Command Example:
hdfs dfs -test -e sample
hdfs dfs -test -z sample
hdfs dfs -test -d sample
Hadoop test Command Description:
The test command is used for file test operations.
It reports its result through the exit status rather than through printed output.
It returns 0 if the check selected by the option is true, and 1 otherwise (for example, -e on a path that does not exist).
Option | Description |
-d | Returns 0 if the path is a directory. |
-e | Returns 0 if the path exists. |
-f | Returns 0 if the path is a file. |
-s | Returns 0 if the path is not empty. |
-r | Returns 0 if the path exists and read permission is granted. |
-w | Returns 0 if the path exists and write permission is granted. |
-z | Returns 0 if the file has a length of 0 bytes. |
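Because the result comes back as an exit status, the test command is usually combined with shell conditionals. The sketch below chains the -e, -d and -z checks from the table; hdfs_kind is a hypothetical helper name introduced here for illustration, not part of Hadoop.

```shell
# Classify an HDFS path using the exit status of `hadoop fs -test`.
# hdfs_kind is a hypothetical helper name, not part of Hadoop.
hdfs_kind() {
  if ! hadoop fs -test -e "$1"; then
    echo "missing"
  elif hadoop fs -test -d "$1"; then
    echo "directory"
  elif hadoop fs -test -z "$1"; then
    echo "empty file"
  else
    echo "non-empty file"
  fi
}

# Example call (needs a reachable cluster):
#   hdfs_kind sample
```

Keep in mind that any failure, such as a cluster that cannot be reached, also produces a non-zero status, so this helper would report "missing" in that case.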
3. text
Hadoop text Command Usage:
hadoop fs -text <src>
Hadoop text Command Example:
Here in this example, we are using the text command to display the ‘sample’ zip file in text format.
Hadoop text Command Description:
The Hadoop fs shell command text takes the source file and outputs the file in the text format. It detects the encoding of the file and decodes it to plain text.
The allowed formats are zip and TextRecordInputStream.
4. stat
Hadoop stat Command Usage:
hadoop fs -stat [format] <path>
Hadoop stat Command Example:
In the example below, we use the stat command to print information about the file ‘test’ in the dataflair directory of HDFS.
Hadoop stat Command Description:
The Hadoop fs shell command stat prints the statistics about the file or directory in the specified format.
Formats:
%b – file size in bytes
%g – group name of owner
%n – file name
%o – block size
%r – replication
%u – user name of owner
%y – modification date
If the format is not specified then %y is used by default.
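As a rough illustration, the format specifiers can be combined into a single labelled line. The path /dataflair/test is assumed from the example text, and the labels in the format string are arbitrary.

```shell
# Path assumed from the example text; adjust it for your cluster.
hdfs_path="/dataflair/test"

# Format string built only from the specifiers listed above; the labels are arbitrary.
stat_format="name=%n size=%b replication=%r blocksize=%o owner=%u group=%g modified=%y"

# Only talk to HDFS when the hadoop client is on the PATH.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -stat "$hdfs_path"                  # default format: modification date only
  hadoop fs -stat "$stat_format" "$hdfs_path"   # one labelled line with every field
fi
```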
For more commands, see the top 10 HDFS commands in Part 1 of this HDFS tutorial.
5. usage
Hadoop usage Command Usage:
hadoop fs -usage <command>
Hadoop usage Command Example:
Hadoop usage Command Description:
The Hadoop fs shell command usage returns the help for an individual command.
6. help
Hadoop help Command Usage:
hadoop fs -help [command]
Hadoop help Command Example:
Hadoop help Command Description:
The Hadoop fs shell command help shows help for all the commands or the specified command.
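A short sketch of the usage and help commands side by side. The subcommand name ls is only an illustrative choice; any fs shell command name could be substituted.

```shell
# "ls" is just an illustrative subcommand; any fs shell command name works.
cmd="ls"

# Only run when the hadoop client is on the PATH.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -usage "$cmd"   # short usage line for one command
  hadoop fs -help "$cmd"    # fuller description of one command
  hadoop fs -help           # help for every fs shell command
fi
```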
7. chmod
Hadoop chmod Command Usage:
hadoop fs -chmod [-R] <mode> <path>
Hadoop chmod Command Example:
Here we are changing the file permission of file ‘testfile’ present on the HDFS file system.
Hadoop chmod Command Description:
The Hadoop fs shell command chmod changes the permissions of a file.
The -R option changes file permissions recursively through the directory structure.
The user must be the owner of the file or a superuser.
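A minimal sketch of both forms of chmod follows. The file name testfile comes from the example text; its location, the directory /DataFlair and the octal modes are assumptions chosen for illustration.

```shell
# Paths and modes are illustrative; testfile is the name used in the example text.
hdfs_file="/testfile"
hdfs_dir="/DataFlair"

# Only talk to HDFS when the hadoop client is on the PATH.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -chmod 644 "$hdfs_file"     # owner read/write, group and others read-only
  hadoop fs -chmod -R 755 "$hdfs_dir"   # apply recursively to a directory tree
  hadoop fs -ls "$hdfs_file"            # the first column shows the new permissions
fi
```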
8. appendToFile
Hadoop appendToFile Command Usage:
hadoop fs -appendToFile <localsrc> ... <dest>
Hadoop appendToFile Command Example:
In the example below, we append localfile1 and localfile2 from the local filesystem to the file named ‘apendfile’ in the DataFlair directory on the HDFS filesystem.
Hadoop appendToFile Command Description:
The Hadoop fs shell command appendToFile appends the content of one or more local files specified in localsrc to the given destination file on HDFS.
The destination file is created if it does not already exist.
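The following sketch creates two small local files and appends them in one call. The file names follow the example text; the scratch directory, the file contents and the HDFS location /DataFlair/apendfile are assumptions.

```shell
# Create two small local files in a scratch directory (names follow the example text).
workdir=$(mktemp -d)
printf 'first line\n'  > "$workdir/localfile1"
printf 'second line\n' > "$workdir/localfile2"

# Destination assumed from the example text; adjust it for your cluster.
hdfs_target="/DataFlair/apendfile"

# Only talk to HDFS when the hadoop client is on the PATH.
if command -v hadoop >/dev/null 2>&1; then
  # Append both local files, in the order given, to the HDFS file.
  hadoop fs -appendToFile "$workdir/localfile1" "$workdir/localfile2" "$hdfs_target"
  hadoop fs -cat "$hdfs_target"   # print the result to verify the append
fi
```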
9. checksum
Hadoop checksum Command Usage:
hadoop fs -checksum <src>
Hadoop checksum Command Example:
Here we are checking the checksum of the file ‘apendfile’ present in the DataFlair directory on the HDFS filesystem.
Hadoop checksum Command Description:
The Hadoop fs shell command checksum returns the checksum information of a file.
10. count
Hadoop count Command Usage:
hadoop fs -count [options] <path>
Hadoop count Command Example:
Hadoop count Command Description:
The Hadoop fs shell command count counts the number of files, directories, and bytes under the paths that match the specified file pattern.
Options:
-q – shows quotas (a quota is the hard limit on the number of names and the amount of space used for an individual directory)
-u – it limits output to show quotas and usage only
-h – shows sizes in a human-readable format
-v – shows header line
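Since count prints plain columns, its output is easy to post-process. The sketch below extracts just the file count; hdfs_file_count is a hypothetical helper name and /DataFlair is an assumed path.

```shell
# Print only the number of files under an HDFS path.
# hdfs_file_count is a hypothetical helper name, not part of Hadoop.
hdfs_file_count() {
  # Without -q the columns are: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
  hadoop fs -count "$1" | awk '{ print $2 }'
}

# Example calls (need a reachable cluster; /DataFlair is an assumed path):
#   hdfs_file_count /DataFlair
#   hadoop fs -count -q -h -v /DataFlair   # quotas, human-readable sizes, header line
```

The helper deliberately omits -q and -v, because both change the column layout that the awk step relies on.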
11. find
Hadoop HDFS find command usage:
hadoop fs -find <path> … <expression>
Hadoop find Command Example:
In this example, we are trying to find the ‘copytest’ file in HDFS.
Hadoop HDFS find command description:
The Hadoop fs shell command find finds all files that match the specified expression. If no path is specified, then it defaults to the present working directory. If an expression is not specified, then it defaults to -print.
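A small sketch of a name-based search using the -name and -print expressions. hdfs_find_name is a hypothetical helper name, and starting the search at the root path / is an assumption.

```shell
# Search an HDFS subtree for entries with a given name.
# hdfs_find_name is a hypothetical helper name, not part of Hadoop.
hdfs_find_name() {
  # $1 = starting path, $2 = name or glob pattern
  hadoop fs -find "$1" -name "$2" -print
}

# Example calls (need a reachable cluster):
#   hdfs_find_name / copytest
#   hdfs_find_name / 'copy*'    # quote patterns so the local shell does not expand them
```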
12. getmerge
Hadoop HDFS getmerge command usage:
hadoop fs -getmerge <src> <localdest>
Hadoop getmerge Command Example:
In this example, we are merging the copytest, file1, and sample files present in HDFS into a single file ‘MergeFile’ on the local filesystem.
Hadoop HDFS getmerge command description:
The Hadoop fs shell command getmerge merges the files in an HDFS source directory, or a list of HDFS source files, into a single file on the local filesystem.
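The merge could be wrapped in a small helper, sketched below. merge_to_local is a hypothetical name, the HDFS paths in the example call are assumed, and the optional -nl flag is added here so that each merged file ends with a newline.

```shell
# Merge one or more HDFS sources into a single local file.
# merge_to_local is a hypothetical helper name, not part of Hadoop.
merge_to_local() {
  local_dest=$1   # first argument: local destination file
  shift           # remaining arguments: HDFS source files or directories
  hadoop fs -getmerge -nl "$@" "$local_dest"   # -nl adds a newline after each file
}

# Example call (needs a reachable cluster; the HDFS paths are assumed):
#   merge_to_local MergeFile /copytest /file1 /sample
```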
Summary
I hope that after reading this article you are able to use HDFS commands to perform operations on the Hadoop filesystem. The article has explained a set of essential HDFS commands, including test, stat, chmod, count, and getmerge.
Now it's time to learn a few more concepts to master HDFS thoroughly.
To learn more about the world’s most reliable storage layer, follow this HDFS introductory guide.
For any queries or feedback regarding Hadoop commands, just leave a comment in the section below. I hope you liked this Hadoop commands tutorial.