In this Hadoop HDFS commands tutorial, we will learn the remaining frequently used HDFS commands. With them we can perform HDFS file operations such as copying a file, changing file permissions, viewing file contents, changing file ownership, and creating directories. To learn more about the world’s most reliable storage layer, follow this HDFS introductory guide.
2. Hadoop HDFS Commands Tutorial
Hadoop file system shell commands are used to perform various HDFS operations and to manage the files stored on HDFS clusters. In this Hadoop HDFS commands tutorial, we will discuss frequently used HDFS commands along with their usage and description. The Hadoop file system shell commands are invoked through the bin/hdfs script (as hdfs dfs) or through the bin/hadoop script (as hadoop fs).
hdfs dfs -touchz /user/dataflair/dir2
Creates a zero-length file at the given path, with the current time as its timestamp. The command fails if a file already exists at the path, unless that file is already zero length.
hdfs dfs -test -[defsz] URI
hdfs dfs -test -e sample
hdfs dfs -test -z sample
hdfs dfs -test -d sample
The test command performs file test operations.
It reports its result through the exit status rather than printed output: the exit status is 0 if the check succeeds and non-zero otherwise.
-d: returns 0 if the path is a directory.
-e: returns 0 if the path exists.
-f: returns 0 if the path is a file.
-s: returns 0 if the path is not empty.
-z: returns 0 if the file is zero length.
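Because -test communicates through its exit status, it is usually placed inside a shell conditional. The following is a minimal sketch of that pattern. The function name describe_hdfs_path is hypothetical and not part of Hadoop, and the commented example path is only illustrative.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): classify an HDFS path
# by checking the exit status of successive `hdfs dfs -test` calls.
describe_hdfs_path() {
    path="$1"
    if ! hdfs dfs -test -e "$path"; then
        echo "missing"
    elif hdfs dfs -test -d "$path"; then
        echo "directory"
    elif hdfs dfs -test -z "$path"; then
        echo "empty file"
    else
        echo "non-empty file"
    fi
}

# Example call (requires a configured Hadoop client):
# describe_hdfs_path /user/dataflair/dir1/sample
```

Checking -e first keeps a missing path from being reported as a non-empty file.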
hdfs dfs -text <source>
hdfs dfs -text /user/dataflair/dir1/sample
Takes a source file and outputs the file in text format. The allowed formats are zip and TextRecordInputStream.
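hdfs dfs -stat [format] <path>
hdfs dfs -stat /user/dataflair/dir1
Prints information about the path. An optional format string selects what is printed; if no format is given, the modification date is shown. The format accepts, among others, the following specifiers:
%b: File size in bytes
%o: Block size
%y, %Y: Modification date. %y prints a UTC date as "yyyy-MM-dd HH:mm:ss", and %Y prints milliseconds since January 1, 1970 UTC.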
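The format specifiers can be combined in one -stat call and the result used in a script. The sketch below estimates how many blocks a file occupies. It assumes that %b prints the file size in bytes and %o the block size in bytes, as the current Hadoop documentation describes. The function name hdfs_block_count is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): estimate the number of HDFS
# blocks a file occupies. Assumes %b is the size in bytes and %o is the
# block size in bytes. Intended for files, not directories.
hdfs_block_count() {
    # Fetch both fields with a single -stat call, separated by one space.
    stat_line=$(hdfs dfs -stat "%b %o" "$1") || return 1
    size=${stat_line%% *}
    block_size=${stat_line##* }
    # Ceiling division: a partially filled last block still counts as a block.
    echo $(( (size + block_size - 1) / block_size ))
}

# Example call (requires a configured Hadoop client):
# hdfs_block_count /user/dataflair/dir1/sample
```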
hdfs dfs -tail [-f] <filename2>
hdfs dfs -tail /user/dataflair/dir1/sample
hdfs dfs -tail -f /user/dataflair/dir1/sample
Shows the last 1 KB of the file on stdout. With -f, the command keeps printing new data as it is appended to the file, as in Unix.
hdfs dfs -chown [-R] [OWNER][:[GROUP]] URI [URI ]
hdfs dfs -chown -R dataflair /opt/hadoop/logs
Changes the owner of files. With -R, the change is applied recursively through the directory structure. The user must be a superuser.
hdfs dfs -chmod [-R] mode,mode,... <path>...
hdfs dfs -chmod 777 /user/dataflair/dir1/sample
Changes the permissions of files. With -R, the change is applied recursively through the directory structure. The user must be the owner of the file or a superuser.
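The mode 777 in the example is octal: each digit encodes the read (4), write (2), and execute (1) bits for the owner, the group, and others, in that order. As a small illustration, the sketch below expands an octal mode into the familiar rwx notation. The function name octal_to_symbolic is hypothetical, and the script runs locally without contacting HDFS.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): expand an octal mode such as
# 640 into rwx notation, one digit each for owner, group, and others.
octal_to_symbolic() {
    mode="$1"
    result=""
    # Split the mode into single digits and translate each one.
    for digit in $(echo "$mode" | sed 's/./& /g'); do
        case "$digit" in
            0) bits="---" ;;
            1) bits="--x" ;;
            2) bits="-w-" ;;
            3) bits="-wx" ;;
            4) bits="r--" ;;
            5) bits="r-x" ;;
            6) bits="rw-" ;;
            7) bits="rwx" ;;
            *) echo "invalid mode: $mode" >&2; return 1 ;;
        esac
        result="$result$bits"
    done
    echo "$result"
}

octal_to_symbolic 777   # prints rwxrwxrwx
octal_to_symbolic 640   # prints rw-r-----
```

A mode of 777 grants every permission to every user, so a narrower mode such as 640 is often a safer choice for data files.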
hadoop fs -appendToFile <localsource> ... <dst>
hadoop fs -appendToFile /home/dataflair/Desktop/sample /user/dataflair/dir1/sample
Appends a single source or multiple sources from the local file system to the destination file. It can also read input from standard input, when the source is given as -, and append it to the destination file.
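Reading from standard input makes it possible to append the output of another command without creating a temporary local file. The sketch below appends one timestamped line to an HDFS file by passing - as the source. The function name append_log_line is hypothetical, and the commented example path is only illustrative.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): append one timestamped line
# to an HDFS file. The "-" source makes -appendToFile read standard input.
append_log_line() {
    target="$1"
    shift
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" |
        hadoop fs -appendToFile - "$target"
}

# Example call (requires a configured Hadoop client):
# append_log_line /user/dataflair/dir1/sample "job finished"
```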
hadoop fs -checksum URI
hadoop fs -checksum /user/dataflair/dir1/sample
Returns the checksum information of a file.
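One use of the checksum is to check whether two HDFS files are likely to hold the same data. The sketch below compares the checksum field of two files. It assumes that each output line consists of the path, the algorithm name, and the checksum in hexadecimal, separated by tabs. The function name same_hdfs_checksum is hypothetical. Note that HDFS checksums depend on settings such as the block size, so files with identical content but different settings may still produce different checksums.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): succeed when two HDFS files
# report the same checksum. Assumes tab-separated output of the form
# <path> <algorithm> <checksum-hex>, with the checksum in the third field.
same_hdfs_checksum() {
    first=$(hadoop fs -checksum "$1" | awk -F '\t' '{ print $3 }')
    second=$(hadoop fs -checksum "$2" | awk -F '\t' '{ print $3 }')
    # An empty checksum field means no checksum was available; treat as no match.
    [ -n "$first" ] && [ "$first" = "$second" ]
}

# Example call (requires a configured Hadoop client; paths are placeholders):
# same_hdfs_checksum /path/to/first /path/to/second && echo "checksums match"
```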
hdfs dfs -count [-q] <paths>
hdfs dfs -count /user/dataflair
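Counts the number of directories, files, and bytes under the paths that match the specified file pattern. The output columns are DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME. With -q, quota columns are printed as well.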
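The raw output of -count is a row of unlabeled numbers, so a script may want to label the fields. The sketch below assumes the default layout without -q, where the columns are the directory count, the file count, the content size in bytes, and the path name. The function name summarize_hdfs_count is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): label the columns printed by
# `hdfs dfs -count` (without -q): directories, files, bytes, path.
# Assumes the path contains no whitespace.
summarize_hdfs_count() {
    hdfs dfs -count "$1" |
        awk '{ printf "dirs=%s files=%s bytes=%s path=%s\n", $1, $2, $3, $4 }'
}

# Example call (requires a configured Hadoop client):
# summarize_hdfs_count /user/dataflair
```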
3. Related Links
- Top 10 Useful Hdfs Commands Part-IV
- Introduction to HDFS – World’s Most Reliable Storage System
- Hadoop HDFS data read and write operations.