Hadoop getmerge Command – Learn to Execute it with Example

In this blog, we discuss the Hadoop file system shell command getmerge. It merges any number of files stored in HDFS and writes the result to a single file in the local file system. So, let’s start with the Hadoop getmerge command.

Hadoop getmerge Command

Usage:

hdfs dfs -getmerge [-nl] [-skip-empty-file] <src> <localdest>

Takes a source directory (or one or more source files) and a local destination file as input. Concatenates the files in src and writes the result to the local destination file. Optionally, we can use -nl to add a newline character at the end of each file. We can use the -skip-empty-file option to avoid unwanted newline characters for empty files.

Example of getmerge command

hdfs dfs -getmerge /user/dataflair/dir1/sample.txt /user/dataflair/dir2/sample2.txt /home/sample1.txt
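The same command also accepts a directory as the source, in which case every file in that directory is concatenated. The sketch below wraps that form in a small shell function. The function name merge_hdfs_dir is hypothetical (it is not part of Hadoop), and it assumes a Hadoop release whose getmerge supports the -skip-empty-file option.

```shell
# merge_hdfs_dir is a hypothetical helper name, not part of Hadoop.
# It merges every file under an HDFS directory into one local file.
#   $1 = HDFS source directory
#   $2 = local destination file
merge_hdfs_dir() {
  if [ "$#" -ne 2 ]; then
    echo "usage: merge_hdfs_dir <hdfs-src-dir> <local-dest-file>" >&2
    return 1
  fi
  # -nl appends a newline after each file's content;
  # -skip-empty-file keeps empty files from adding stray newlines.
  hdfs dfs -getmerge -nl -skip-empty-file "$1" "$2"
}
```

For example, calling merge_hdfs_dir /user/dataflair/dir1 /home/merged.txt would merge everything under dir1 into the local file /home/merged.txt (both paths are illustrative).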

Why Do We Use Hadoop getmerge Command?

The getmerge command in Hadoop merges files stored in HDFS into a single file in the local file system.

The command is useful for downloading the output of a MapReduce job, which is usually split across multiple part-* files, into a single local file. We can then use this local file for other operations, such as loading it into an Excel spreadsheet for presentation.
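As a rough sketch of that workflow, the function below pulls a job's output directory into one local file and reports how many lines it contains. The function name fetch_job_output and any paths passed to it are hypothetical. It assumes the job wrote plain-text output, so a line count is a meaningful sanity check.

```shell
# fetch_job_output is a hypothetical helper name, not part of Hadoop.
#   $1 = HDFS output directory of the job (holds the part-* files)
#   $2 = local file to write the merged output to
fetch_job_output() {
  job_output_dir="$1"
  local_file="$2"

  # Concatenate every file in the output directory into one local file.
  hdfs dfs -getmerge "$job_output_dir" "$local_file" || return 1

  # Count lines in the merged result as a quick sanity check.
  # tr strips the padding that some wc implementations add.
  line_count=$(wc -l < "$local_file" | tr -d '[:space:]')
  echo "Merged $job_output_dir into $local_file ($line_count lines)"
}
```

The function stops with a non-zero status if getmerge itself fails, so the line count is only printed for a merge that completed.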

Conclusion

We conclude that getmerge is a very useful HDFS file system shell command. In practice, we can use it to merge the output of a MapReduce program into a single local file.

If you still have any questions about the Hadoop getmerge command, ask in the comment section.
