Top 10 Hadoop HDFS Commands Part-I – An HDFS Tutorial


1. Objective

In this tutorial, we will learn the most important and frequently used Hadoop HDFS commands. With them we can perform HDFS file operations such as copying files, changing file permissions, viewing file contents, changing file ownership, and creating directories.

2. HDFS Introduction

Hadoop HDFS is a distributed file system that provides redundant storage for very large files, typically in the range of terabytes to petabytes. To learn more about the world’s most reliable storage layer, follow this HDFS introductory guide.

Before working with HDFS, you need to deploy Hadoop. Follow this guide to install and configure Hadoop.

3. Hadoop HDFS Commands

This section of the Hadoop HDFS command tutorial discusses the top 10 HDFS commands along with their usage, description, and examples. Hadoop file system shell commands are used to perform various HDFS operations and to manage the files stored on HDFS clusters. All the Hadoop file system shell commands are invoked by the bin/hdfs script. Some examples below use the equivalent hadoop fs form.

3.1. version

Command Usage

version

Command Example

hdfs version

Description
Prints the Hadoop version

3.2. mkdir

Command Usage

mkdir [-p] <path>

Command Example

hdfs dfs -mkdir /user/dataflair/dir1

Description
Takes path URIs as arguments and creates directories.
With the -p option, it also creates any parent directories in the path that are missing (like mkdir -p in Linux). Without -p, the command fails if a parent directory does not exist.
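As a minimal sketch of the -p behaviour, the snippet below creates a nested directory and then checks that it exists. The helper name make_hdfs_dir and the path /user/dataflair/dir1/reports/2024 are made up for illustration and are not part of Hadoop; the snippet assumes a configured hdfs client on the PATH.

```shell
# make_hdfs_dir is a hypothetical helper, not part of Hadoop.
make_hdfs_dir() {
  # -p creates any missing parent directories and succeeds if the path already exists
  hdfs dfs -mkdir -p "$1" || return 1
  # -test -d exits with status 0 only if the path exists and is a directory
  hdfs dfs -test -d "$1" && echo "ready: $1"
}

# Run the example only when an hdfs client is actually installed.
if command -v hdfs >/dev/null 2>&1; then
  make_hdfs_dir /user/dataflair/dir1/reports/2024   # illustrative path
else
  echo "hdfs client not found on PATH; nothing was run" >&2
fi
```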

3.3. ls

Command Usage

ls <path>

Command Example

hdfs dfs -ls /user/dataflair/dir1

Description
Lists the contents of the directory specified by path, showing the name, permissions, owner, size, and modification date of each entry.

Command Example

hdfs dfs -ls -R

Description
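Behaves like -ls, but recursively displays entries in all subdirectories of the path. When no path is given, as in this example, the listing starts from the user's HDFS home directory.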
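Each line of a listing starts with the permission string, whose first character is d for a directory and - for a regular file. A small sketch that uses this to summarize a recursive listing is shown below; summarize_listing is a hypothetical helper name, and the awk filter simply ignores any line that does not start with d or -.

```shell
# summarize_listing is a hypothetical helper, not part of Hadoop.
# It reads `hdfs dfs -ls` output on stdin and counts directories and files
# by looking at the first character of the permission column.
summarize_listing() {
  awk '
    /^d/ { dirs++ }    # directory entries start with "d"
    /^-/ { files++ }   # regular files start with "-"
    END  { printf "dirs=%d files=%d\n", dirs, files }
  '
}

# Example usage against a real cluster (assumes the path exists):
#   hdfs dfs -ls -R /user/dataflair | summarize_listing
```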

3.4. put

Command Usage

put <localSrc> <dest>

Command Example

hdfs dfs -put /home/dataflair/Desktop/sample /user/dataflair/dir1

Description
Copies the file or directory from the local file system to the destination within the DFS.

3.5. copyFromLocal

Command Usage

copyFromLocal <localSrc> <dest>

Command Example

hdfs dfs -copyFromLocal /home/dataflair/Desktop/sample /user/dataflair/dir1

Description
Similar to the put command, but the source is restricted to a local file reference.

3.6. get

Command Usage

get [-crc] <src> <localDest>

Command Example

hdfs dfs -get /user/dataflair/dir2/sample /home/dataflair/Desktop

Description
Copies the file or directory in HDFS identified by the source to the local file system path identified by the local destination.
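To see put and get working together, one option is a round trip: upload a local file, download it again, and compare the two copies. The sketch below assumes a configured hdfs client and an existing HDFS target directory; roundtrip_check is a hypothetical helper name, not a Hadoop command.

```shell
# roundtrip_check is a hypothetical helper, not part of Hadoop.
# Usage: roundtrip_check <local file> <existing HDFS directory>
roundtrip_check() {
  src=$1
  hdfs_dir=$2
  name=$(basename "$src")
  tmpdir=$(mktemp -d) || return 1

  # Upload; -f overwrites an existing file of the same name in HDFS
  hdfs dfs -put -f "$src" "$hdfs_dir/" || return 1
  # Download the copy into a fresh, empty local directory
  hdfs dfs -get "$hdfs_dir/$name" "$tmpdir/" || return 1

  # cmp -s prints nothing and exits 0 only when both files are byte-identical
  cmp -s "$src" "$tmpdir/$name"
  status=$?
  rm -rf "$tmpdir"

  if [ "$status" -eq 0 ]; then
    echo "round trip OK"
  else
    echo "round trip MISMATCH"
  fi
  return "$status"
}
```

With the paths used in this tutorial, a call would look like: roundtrip_check /home/dataflair/Desktop/sample /user/dataflair/dir1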

Command Example

hdfs dfs -getmerge /user/dataflair/dir2/sample /home/dataflair/Desktop

Description
Retrieves all files in HDFS that match the source path and concatenates them into a single merged file in the local file system, identified by the local destination.

Command Example

hadoop fs -getfacl  /user/dataflair/dir1/sample
hadoop fs -getfacl -R  /user/dataflair/dir1

Description
Shows the Access Control Lists (ACLs) of files and directories. If a directory has a default ACL, getfacl also displays the default ACL.
Options:
-R: Lists the ACLs of all files and directories recursively.
path: The file or directory to list.

Command Example

hadoop fs -getfattr -d /user/dataflair/dir1/sample

Description
Displays the extended attribute names and values, if any, for a file or directory.
Options:
-R: Recursively lists the attributes for all files and directories.
-n name: Displays the value of the named extended attribute.
-d: Displays all the extended attribute values associated with the pathname.
-e encoding: Encodes values after retrieving them. The valid encodings are “text”, “hex”, and “base64”. Values encoded as text strings are enclosed in double quotes ("), and values encoded as hexadecimal and base64 are prefixed with 0x and 0s, respectively.
path: The file or directory.
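The three -e encodings are easier to picture on a concrete value. The sketch below does not call HDFS at all; it uses standard local tools (od, tr, base64) to build the text, hex, and base64 forms of the sample value hi, with the quoting and prefixes described above. The sample value is made up, and the exact letter case that HDFS prints for hex digits is not asserted here.

```shell
# Illustration only: builds the three encoded forms locally, without HDFS.
value='hi'   # made-up sample attribute value

# text: the value enclosed in double quotes
text_form="\"$value\""

# hex: "0x" followed by the bytes of the value as hexadecimal digits
hex_form="0x$(printf '%s' "$value" | od -An -tx1 | tr -d ' \n')"

# base64: "0s" followed by the base64 encoding of the value
b64_form="0s$(printf '%s' "$value" | base64)"

echo "text:   $text_form"
echo "hex:    $hex_form"
echo "base64: $b64_form"
```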

3.7. copyToLocal

Command Usage

copyToLocal <src> <localDest>

Command Example

hdfs dfs -copyToLocal /user/dataflair/dir1/sample /home/dataflair/Desktop

Description
Similar to the get command, except that the destination is restricted to a local file reference.

3.8. cat

Command Usage

cat <file-name>

Command Example

hdfs dfs -cat /user/dataflair/dir1/sample

Description
Displays the contents of the file on the console (stdout).

3.9. mv

Command Usage

mv <src> <dest>

Command Example

hadoop fs -mv /user/dataflair/dir1/purchases.txt /user/dataflair/dir2

Description
Moves the file or directory indicated by the source to destination, within HDFS.

3.10. cp

Command Usage

cp <src> <dest>

Command Example

hadoop fs -cp /user/dataflair/dir2/purchases.txt /user/dataflair/dir1

Description
Copies the file or directory identified by the source to destination, within HDFS.
