Merge, Join and Concatenate DataFrames using Pandas
Last Updated :
11 Jul, 2025
Dataframe is a two-dimensional data structure having multiple rows and columns. In a Pandas DataFrame, the data is aligned in the form of rows and columns only. A dataframe can perform arithmetic as well as conditional operations. It has a mutable size. This article will show how to join, concatenate, and merge in Pandas.
Python Merge, Join, and Concatenate DataFrames Using Pandas
Below are the different ways and approaches by which we can merge, join, and concatenate in Pandas in Python:
Concatenate DataFrames
concat()
function does all of the heavy lifting of performing concatenation operations along an axis while performing optional set logic (union or intersection) of the indexes (if any) on the other axes. In this example, three DataFrames (df1
, df2
, and df3
) with identical columns but different indices are created. Using pd.concat()
, these Pandas DataFrames are vertically stacked, resulting in a combined DataFrame where rows from each original DataFrame follow one another.
Python
# Creating first dataframe
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
'B': ['B0', 'B1', 'B2', 'B3'],
'C': ['C0', 'C1', 'C2', 'C3'],
'D': ['D0', 'D1', 'D2', 'D3']},
index=[0, 1, 2, 3])
# Creating second dataframe
df2 = pd.DataFrame({'A': ['A4', 'A5', 'A6', 'A7'],
'B': ['B4', 'B5', 'B6', 'B7'],
'C': ['C4', 'C5', 'C6', 'C7'],
'D': ['D4', 'D5', 'D6', 'D7']},
index=[4, 5, 6, 7])
# Creating third dataframe
df3 = pd.DataFrame({'A': ['A8', 'A9', 'A10', 'A11'],
'B': ['B8', 'B9', 'B10', 'B11'],
'C': ['C8', 'C9', 'C10', 'C11'],
'D': ['D8', 'D9', 'D10', 'D11']},
index=[8, 9, 10, 11])
# Concatenating the dataframes
pd.concat([df1, df2, df3])
Output:
A B C D
0 A0 B0 C0 D0
1 A1 B1 C1 D1
2 A2 B2 C2 D2
3 A3 B3 C3 D3
4 A4 B4 C4 D4
5 A5 B5 C5 D5
6 A6 B6 C6 D6
7 A7 B7 C7 D7
8 A8 B8 C8 D8
9 A9 B9 C9 D9
10 A10 B10 C10 D10
11 A11 B11 C11 D11
Merge DataFrames
DataFrames Merge Pandas provides a single function, merge(), as the entry point for all standard database join operations between DataFrame objects. In this example, two DataFrames (left
and right
) are created with a common key column 'Key'. The pd.merge()
function merges these DataFrames based on their common 'Key' column using an inner join, resulting in a combined DataFrame containing only the rows where the key values match in both DataFrames.
Python
# Dataframe created
left = pd.DataFrame({'Key': ['K0', 'K1', 'K2', 'K3'],
'A': ['A0', 'A1', 'A2', 'A3'],
'B': ['B0', 'B1', 'B2', 'B3']})
right = pd.DataFrame({'Key': ['K0', 'K1', 'K2', 'K3'],
'C': ['C0', 'C1', 'C2', 'C3'],
'D': ['D0', 'D1', 'D2', 'D3']})
# Merging the dataframes
pd.merge(left, right, how='inner', on='Key')
Output:
Inner Merge:
Key A B C D
0 K0 A0 B0 C0 D0
1 K1 A1 B1 C1 D1
Left Merge
The pd.merge()
function with how='left'
performs a left join, merging the left
DataFrame with the right
DataFrame on the 'Key' column. All rows from the left
DataFrame are retained, and matching rows from the right
DataFrame are appended with their respective values.
Python
left_merged = pd.merge(left, right, how='left', on='Key')
print(left_merged)
Output:
Left Merge:
Key A B C D
0 K0 A0 B0 C0 D0
1 K1 A1 B1 C1 D1
2 K2 A2 B2 NaN NaN
3 K3 A3 B3 NaN NaN
Right Merge
The pd.merge()
function with how='right'
performs a right join, merging the left
DataFrame with the right
DataFrame on the 'Key' column. All rows from the right
DataFrame are retained, and matching rows from the left
DataFrame are appended with their respective values.
Python
right_merged = pd.merge(left, right, how='right', on='Key')
print(right_merged)
Output:
Right Merge:
Key A B C D
0 K0 A0 B0 C0 D0
1 K1 A1 B1 C1 D1
2 K4 NaN NaN C4 D4
3 K5 NaN NaN C5 D5
Outer Merge
The pd.merge()
function with how='outer'
performs a full outer join, merging the left
DataFrame with the right
DataFrame on the 'Key' column. This results in a DataFrame that contains all rows from both the left
and right
DataFrames, filling in missing values with NaN where there isn't a match.
Python
outer_merged = pd.merge(left, right, how='outer', on='Key')
print(outer_merged)
Output:
Outer Merge:
Key A B C D
0 K0 A0 B0 C0 D0
1 K1 A1 B1 C1 D1
2 K2 A2 B2 NaN NaN
3 K3 A3 B3 NaN NaN
4 K4 NaN NaN C4 D4
5 K5 NaN NaN C5 D5
Join DataFrames
In this example, both left
and right
DataFrames are indexed with keys 'K0', 'K1', 'K2', and 'K3'. The left.join(right)
operation merges the DataFrames on their indices, combining columns 'A' and 'B' from left
with columns 'C' and 'D' from right
based on the shared index.
Python
left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
'B': ['B0', 'B1', 'B2', 'B3']},
index=['K0', 'K1', 'K2', 'K3'])
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
'D': ['D0', 'D1', 'D2', 'D3']},
index=['K0', 'K1', 'K2', 'K3'])
# Joining the dataframes
left.join(right)
Output
A B C D
K0 A0 B0 C0 D0
K1 A1 B1 C1 D1
K2 A2 B2 C2 D2
K3 A3 B3 C3 D3
Similar Reads
How To Concatenate Two or More Pandas DataFrames? In real-world data the information is often spread across multiple tables or files. To analyze it properly we need to bring all that data together. This is where the pd.concat() function in Pandas comes as it allows you to combine two or more DataFrames in: Vertically (stacking rows on top of each o
3 min read
Concatenate Pandas DataFrames Without Duplicates Concatenating Pandas DataFrames refers to combining multiple DataFrames into one, either by stacking them vertically (rows) or horizontally (columns). However, when the DataFrames being combined have overlapping rows or columns, duplicates can occur. The core idea is simple: use the pd.concat method
4 min read
Append Pandas DataFrames Using for Loop When dealing with large datasets, we often need to combine dataframes into single dataframe. Usually concat() is used along with the for loop to append the dataframes. Let us consider an example:Pythonimport pandas as pd import numpy as np # Create some example DataFrames dataframes = [pd.DataFrame(
4 min read
Merge/Join Two Dataframes on Multiple Columns in Pandas When working with large datasets, it's common to combine multiple DataFrames based on multiple columns to extract meaningful insights. Pandas provides the merge() function, which enables efficient and flexible merging of DataFrames based on one or more keys. This guide will explore different ways to
6 min read
Pandas Join Dataframes Joining DataFrames is a common operation in data analysis, where you combine two or more DataFrames based on common columns or indices. Pandas provides various methods to perform joins, allowing you to merge data in flexible ways. In this article, we will explore how to join DataFrames using methods
4 min read
How to combine two DataFrames in Pandas? While working with data, there are multiple times when you would need to combine data from multiple sources. For example, you may have one DataFrame that contains information about a customer, while another DataFrame contains data about their transaction history. If you want to analyze this data tog
3 min read