Hadoop - getmerge Command
Last Updated: 04 Aug, 2025
The hdfs dfs -getmerge command in Hadoop merges multiple files stored in HDFS (Hadoop Distributed File System) into a single output file and writes that file to the local file system. This is useful when:
- You have multiple small files in HDFS and want to combine them into one.
- You want to fetch processed output files from HDFS to your local system in a single file for further use.
Example: Suppose we have two files, file1.txt and file2.txt, that we want to merge into a single file named output.txt in our local file system.
Step 1: Check Content of Files
Before merging, let's look at the content of both files. They are copied into HDFS in Step 3.
(Figure: content of file1.txt)
(Figure: content of file2.txt)
We will merge these two files into one.
Step 2: Create a Directory in HDFS
Create a directory in HDFS (e.g., /Hadoop_File) to hold the files:
hdfs dfs -mkdir /Hadoop_File
Step 3: Copy Files from Local to HDFS
Copy both file1.txt and file2.txt from the local system to HDFS:
hdfs dfs -copyFromLocal /home/dikshant/Documents/hadoop_file/file1.txt /Hadoop_File
hdfs dfs -copyFromLocal /home/dikshant/Documents/hadoop_file/file2.txt /Hadoop_File

Now both files are inside the /Hadoop_File directory in HDFS. You can verify this by listing the files:
hdfs dfs -ls /Hadoop_File
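Steps 2 and 3 can also be wrapped in a small script that is safe to run more than once. The following is only a sketch: the run and upload_inputs helpers and the DRY_RUN variable are hypothetical names, and it assumes the -p option of -mkdir and the -f option of -copyFromLocal are available in your Hadoop version. By default it only prints the commands; set DRY_RUN=0 to execute them on a machine with an HDFS client.

```shell
# Dry-run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it only when DRY_RUN is 0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

hdfs_dir="/Hadoop_File"
local_dir="/home/dikshant/Documents/hadoop_file"

# Hypothetical helper covering Step 2, Step 3 and the listing check.
upload_inputs() {
  # -p: do not fail if the directory already exists
  run hdfs dfs -mkdir -p "$hdfs_dir"
  for name in file1.txt file2.txt; do
    # -f: overwrite a copy left over from an earlier run
    run hdfs dfs -copyFromLocal -f "$local_dir/$name" "$hdfs_dir"
  done
  run hdfs dfs -ls "$hdfs_dir"
}

upload_log="$(upload_inputs)"
printf '%s\n' "$upload_log"
```

In dry-run mode the script prints four lines, one per hdfs command, which makes it easy to review the paths before touching the cluster.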

Step 4: Syntax of -getmerge Command
hdfs dfs -getmerge [-nl] <source_path1> <source_path2> ... <local_destination_file>
- -nl: Appends a newline character at the end of each file as it is merged, so the content of one file never runs into the next.
- <source_path>: One or more HDFS files or directories to merge.
- <local_destination_file>: Path in the local file system where the merged file will be created.
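To make the effect of -nl concrete, the sketch below emulates the merge on local files with ordinary shell tools. It does not call HDFS at all: local_getmerge is a hypothetical helper, the sample files are made up, and it assumes that -nl appends one line feed after each source file.

```shell
# Local emulation of the -getmerge concatenation (no HDFS involved).
# local_getmerge is a hypothetical helper, not part of Hadoop.
local_getmerge() {
  add_nl=0
  if [ "$1" = "-nl" ]; then
    add_nl=1
    shift
  fi
  # The last argument is the destination; all earlier ones are sources.
  for dest in "$@"; do :; done
  remaining=$(( $# - 1 ))
  : > "$dest"                        # start from an empty destination
  for src in "$@"; do
    [ "$remaining" -gt 0 ] || break  # stop before the destination itself
    cat "$src" >> "$dest"            # append the source bytes unchanged
    if [ "$add_nl" -eq 1 ]; then
      printf '\n' >> "$dest"         # -nl: one line feed after each source
    fi
    remaining=$(( remaining - 1 ))
  done
}

workdir="$(mktemp -d)"
printf 'alpha' > "$workdir/file1.txt"   # sample data without a trailing newline
printf 'beta'  > "$workdir/file2.txt"

local_getmerge -nl "$workdir/file1.txt" "$workdir/file2.txt" "$workdir/output.txt"
cat "$workdir/output.txt"   # prints alpha and beta on separate lines
```

The argument order mirrors the syntax above: optional -nl first, then the sources, then the destination last.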
Step 5: Merge Files Using -getmerge
Now merge file1.txt and file2.txt from HDFS into a single file output.txt in the local system:
hdfs dfs -getmerge -nl /Hadoop_File/file1.txt /Hadoop_File/file2.txt /home/dikshant/Documents/hadoop_file/output.txt
Step 6: Verify the Output
Check whether the files have been merged successfully:
cd /home/dikshant/Documents/hadoop_file
ls
cat output.txt
You should now see the content of file1.txt followed by the content of file2.txt. Because we used -nl, a newline is appended after each file, so the two never run together.
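Beyond reading the output by eye, a size comparison against the local originals gives a quick sanity check. This is a sketch under stated assumptions: size_of and check_merge_size are hypothetical helpers, the temporary files stand in for file1.txt, file2.txt and output.txt, and the check assumes -nl added exactly one line feed per source file.

```shell
# Size of a local file in bytes.
size_of() {
  wc -c < "$1" | tr -d ' '
}

# check_merge_size MERGED SRC...
# Succeeds when MERGED is as large as all sources plus one line feed
# per source, which is what a merge with -nl is assumed to produce.
check_merge_size() {
  merged="$1"
  shift
  expected=0
  for src in "$@"; do
    expected=$(( expected + $(size_of "$src") + 1 ))
  done
  [ "$(size_of "$merged")" -eq "$expected" ]
}

# Stand-in data so the sketch runs without a cluster.
vdir="$(mktemp -d)"
printf 'first file\n'  > "$vdir/file1.txt"
printf 'second file\n' > "$vdir/file2.txt"
# Stand-in for the file that -getmerge -nl would have written.
{
  cat "$vdir/file1.txt"; printf '\n'
  cat "$vdir/file2.txt"; printf '\n'
} > "$vdir/output.txt"

if check_merge_size "$vdir/output.txt" "$vdir/file1.txt" "$vdir/file2.txt"; then
  echo "size check passed"
else
  echo "size check failed"
fi
```

If the source files already end with a newline, as in this stand-in data, the extra line feed from -nl shows up as a blank line between the two parts of the merged file.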

Key Points to Remember
- If you omit -nl, the files are concatenated byte for byte. When a file does not end with a newline, its last line runs into the first line of the next file.
- You can merge an entire directory instead of individual files:
hdfs dfs -getmerge -nl /Hadoop_File /home/dikshant/Documents/hadoop_file/output.txt
- This will merge all files inside /Hadoop_File into output.txt.
- The merged file is always stored in the local file system, not back in HDFS.
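Whether -nl is needed depends on how each source file ends. The sketch below shows the fused-line problem on local sample data and a simple check for a trailing newline. The ends_with_newline helper and the part-0 and part-1 file names are hypothetical, and plain cat stands in for a merge without -nl.

```shell
# Succeeds when the file is non-empty and its last byte is a line feed.
ends_with_newline() {
  [ -s "$1" ] && [ "$(tail -c 1 "$1" | wc -l | tr -d ' ')" -eq 1 ]
}

tmp="$(mktemp -d)"
printf 'a,1\nb,2' > "$tmp/part-0"   # last line has no trailing newline
printf 'c,3\n'    > "$tmp/part-1"

# Plain concatenation, standing in for a merge without -nl.
cat "$tmp/part-0" "$tmp/part-1" > "$tmp/merged.txt"
cat "$tmp/merged.txt"               # second line reads b,2c,3

for f in "$tmp/part-0" "$tmp/part-1"; do
  if ends_with_newline "$f"; then
    echo "$(basename "$f"): ends with a newline"
  else
    echo "$(basename "$f"): no trailing newline, consider -nl"
  fi
done
```

On a cluster, the same check could be applied to the output of hdfs dfs -cat for each source file before deciding whether to pass -nl.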