10 Most Frequently Used Hadoop Commands With Examples
In this blog, we explore the most frequently used Hadoop commands. These commands perform common HDFS file operations, such as copying a file, moving a file, displaying a file's contents, and creating directories. We begin with a short introduction and then walk through each command with an example.
Top Hadoop Commands
Hadoop can store petabytes of data using HDFS. HDFS is a distributed file system that stores data ranging from structured to unstructured, and it provides redundant storage for very large files. Hadoop offers various commands for performing file operations on HDFS. Let us take a look at some of the important ones.
List of Hadoop Commands
Command Name: version
Command Usage: version
hadoop version
Description: This command shows the version of Hadoop that is installed.
Command Name: mkdir
Command Usage: mkdir <path>
hdfs dfs -mkdir /user/dataflair/dir1
Description: This command takes <path> as an argument and creates a directory at that path. Without the -p option, the parent directory must already exist.
Command Name: ls
Command Usage: ls <path>
hdfs dfs -ls /user/dataflair
Description: This command displays the contents of the directory specified by <path>. It shows the name, permissions, owner, size and modification date of each entry.
Command Name: ls -R
Command Usage: ls -R <path>
hdfs dfs -ls -R /user
Description: This command behaves like ls but also lists the entries in all subdirectories recursively.
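The listing is plain text, so it can be post-processed with standard tools. The sketch below is only an illustration: it assumes the usual column layout of the listing (permissions, replicas, owner, group, size, date, time, path), and the sample line, including its owner, group, size, and timestamp, is made up. The helper name size_and_path is hypothetical.

```shell
# Illustrative listing line; the owner, group, size, and timestamp are made up.
sample_line='-rw-r--r--   3 dataflair supergroup       1366 2024-01-15 10:30 /user/dataflair/dir1/sample.txt'

# Read `hdfs dfs -ls` style output on stdin and print "<size> <path>" per entry.
# Assumed field layout: permissions, replicas, owner, group, size, date, time, path.
# Lines with fewer than 8 fields (such as the "Found N items" header) are skipped.
# Paths that contain spaces would need extra handling.
size_and_path() {
  awk 'NF >= 8 { print $5, $8 }'
}

printf '%s\n' "$sample_line" | size_and_path
# prints: 1366 /user/dataflair/dir1/sample.txt
```

On a real cluster, the same helper could be fed directly: hdfs dfs -ls -R /user | size_and_path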
Command Name: put
Command Usage: put <localsrc> <dest>
hdfs dfs -put /home/sample.txt /user/dataflair/dir1
Description: This command copies a file from the local file system to the given destination in HDFS.
Command Name: copyFromLocal
Command Usage: copyFromLocal <localsrc> <dest>
hdfs dfs -copyFromLocal /home/sample /user/dataflair/dir1
Description: This command is similar to put, except that the source must refer to a local file.
Command Name: get
Command Usage: get <src> <localdest>
hdfs dfs -get /user/dataflair/dir1 /home
Description: This Hadoop shell command copies the file in HDFS identified by <src> to the location in the local file system identified by <localdest>.
Command Name: getmerge
Command Usage: getmerge <src> <localdest>
hdfs dfs -getmerge /user/dataflair/dir1/sample.txt /user/dataflair/dir2/sample2.txt /home/sample1.txt
Description: This HDFS command retrieves all the files that match the source paths in HDFS and merges them into a single file in the local file system, identified by the local destination. In the example above, the two HDFS files are merged into /home/sample1.txt.
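To get a feel for what the merged file looks like, here is a purely local analogy that uses cat in place of getmerge. It does not contact HDFS. The file contents are made up, and the assumption that the sources are joined in the order they are listed is part of the sketch.

```shell
# Work in a throwaway directory so nothing outside it is touched.
workdir=$(mktemp -d)
printf 'line from sample\n'  > "$workdir/sample.txt"
printf 'line from sample2\n' > "$workdir/sample2.txt"

# Local stand-in for: hdfs dfs -getmerge <src1> <src2> <localdest>
# The two sources are concatenated into one destination file.
cat "$workdir/sample.txt" "$workdir/sample2.txt" > "$workdir/merged.txt"

# Show the merged result, then clean up.
merged=$(cat "$workdir/merged.txt")
printf '%s\n' "$merged"
rm -rf "$workdir"
```

The merged file simply contains the first file's line followed by the second file's line.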
Command Name: getfacl
Command Usage: getfacl [-R] <path>
hadoop fs -getfacl /user/dataflair/dir1
hadoop fs -getfacl -R /user/dataflair/dir1
Description: This Hadoop command shows the Access Control Lists (ACLs) of files and directories. If a directory has a default ACL, the command displays that as well.
Options: -R: It recursively lists the ACLs of all files and directories.
Command Name: getfattr
Command Usage: getfattr [-R] {-n name | -d} [-e encoding] <path>
hadoop fs -getfattr -d /user/dataflair/dir1
Description: This HDFS command displays the extended attribute names and values, if any, for the specified file or directory.
Options:
-R: It lists the attributes for all files and directories recursively.
-n name: It shows the value of the named extended attribute.
-d: It shows all the extended attribute values associated with the pathname.
-e encoding: It encodes values after retrieving them. The valid encodings are "text", "hex", and "base64". Values encoded as text strings are enclosed in double quotes (" "), values encoded as hexadecimal are prefixed with 0x, and values encoded as base64 are prefixed with 0s.
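The three encodings are easier to picture with a concrete value. The following sketch reproduces the formats locally with standard tools (printf, od, base64) and does not contact HDFS. The attribute value "hello" is hypothetical, and the exact output of a real cluster may differ in detail.

```shell
# Hypothetical attribute value; HDFS is not contacted in this sketch.
value='hello'

# text: the value wrapped in double quotes
as_text=$(printf '"%s"' "$value")
# hex: the 0x prefix followed by the bytes in hexadecimal
as_hex="0x$(printf '%s' "$value" | od -An -tx1 | tr -d ' \n')"
# base64: the 0s prefix followed by the base64 form of the bytes
as_base64="0s$(printf '%s' "$value" | base64 | tr -d '\n')"

printf 'text:   %s\nhex:    %s\nbase64: %s\n' "$as_text" "$as_hex" "$as_base64"
# text:   "hello"
# hex:    0x68656c6c6f
# base64: 0saGVsbG8=
```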
Command Name: copyToLocal
Command Usage: copyToLocal <src> <localdest>
hdfs dfs -copyToLocal /user/dataflair/dir1 /home
Description: This command is similar to get, except that the destination must refer to a local file.
Command Name: cat
Command Usage: cat <file-name>
hdfs dfs -cat /user/dataflair/dir1/sample.txt
Description: This Hadoop shell command prints the contents of the file to stdout (the console).
Command Name: mv
Command Usage: mv <src> <dest>
hdfs dfs -mv /user/dataflair/dir1/sample.txt /user/dataflair/dir2
Description: This Hadoop shell command moves the file from the specified source to the destination within HDFS.
Command Name: cp
Command Usage: cp <src> <dest>
hdfs dfs -cp /user/dataflair/dir2/sample.txt /user/dataflair/dir1
Description: This Hadoop shell command copies the file or directory from the given source to the destination within HDFS.
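The commands above are often used together. Below is a minimal sketch of such a sequence, assuming the paths from the earlier examples. The run helper, the DRY_RUN variable, the file name sample-copy.txt, and the /tmp destination are all hypothetical and exist only for this illustration. By default the script just prints each command, so it is safe to try on a machine without a cluster.

```shell
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

base=/user/dataflair

run hdfs dfs -mkdir -p "$base/dir1" "$base/dir2"       # create both directories
run hdfs dfs -put /home/sample.txt "$base/dir1"        # local file system -> HDFS
run hdfs dfs -cat "$base/dir1/sample.txt"              # print the uploaded file
run hdfs dfs -cp "$base/dir1/sample.txt" "$base/dir2"  # copy within HDFS
run hdfs dfs -mv "$base/dir2/sample.txt" "$base/dir2/sample-copy.txt"  # rename within HDFS
run hdfs dfs -get "$base/dir2/sample-copy.txt" /tmp    # HDFS -> local file system
```

When running for real (DRY_RUN=0), consider adding set -e at the top so the script stops at the first failing command.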
There are many more commands under “$HADOOP_HOME/bin/hadoop fs” than the ones discussed in this tutorial. We have covered the frequently used basic commands to get you started. If you get stuck, type the following:
$HADOOP_HOME/bin/hadoop fs -help commandName
This will display a short usage summary of the command specified.
If you still have any questions about Hadoop commands, ask in the comment section and we will get back to you.