How to Merge Two Pandas DataFrames on Index

Last Updated : 12 Nov, 2024

Merging two pandas DataFrames on their index is useful when the datasets share the same row identifiers but hold different columns. The core idea is to align the rows of both DataFrames by index label and combine their columns into one unified DataFrame.

To merge two pandas DataFrames on their index, you can use the merge() function with the left_index and right_index parameters set to True. Alternatively, you can use the join() method or the concat() function, both of which also align rows on the index.

1. Merging Two Pandas DataFrames on Index using merge()

The pd.merge() function combines two DataFrames by matching their index labels when you set left_index=True and right_index=True. Rows whose labels match are merged into a single row that holds the columns of both DataFrames.

  • Types of Joins: Pandas supports different join types for merging data, such as inner, outer, left, and right joins. By default, pd.merge() performs an inner join, meaning it keeps only the rows that have matching indices in both DataFrames. You can specify the join type with the how parameter.
  • Aligning Data: When merging on the index, each row in the resulting DataFrame aligns based on the index value rather than a specific column, making this approach ideal when your row labels are the critical point of reference.
Python
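import pandas as pd

# create student and marks dataframes that share some index labels
data1 = pd.DataFrame({'id': [1, 2, 3, 4], 'name': ['manoj', 'manoja', 'manoji', 'manij']},
                     index=['one', 'two', 'three', 'four'])
data2 = pd.DataFrame({'s_id': [1, 2, 3, 6, 7], 'marks': [98, 90, 78, 86, 78]},
                     index=['one', 'two', 'three', 'six', 'seven'])

# inner join on the index: keeps only the labels present in both dataframes
print(pd.merge(data1, data2, left_index=True, right_index=True))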

Output:

Merge Two Pandas DataFrames on Index using merge()
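
The join types selected with the how parameter can be compared side by side. The following is a minimal sketch, assuming two small illustrative DataFrames (left and right are hypothetical names, not part of the example above) that share only the labels 'a' and 'b'.

```python
import pandas as pd

# Hypothetical sample data: the two frames share only the labels 'a' and 'b'
left = pd.DataFrame({'x': [1, 2, 3]}, index=['a', 'b', 'c'])
right = pd.DataFrame({'y': [10, 20, 40]}, index=['a', 'b', 'd'])

# Default (how='inner'): only labels present in both frames are kept
inner = pd.merge(left, right, left_index=True, right_index=True)

# how='outer': union of the labels, NaN where one side has no row
outer = pd.merge(left, right, left_index=True, right_index=True, how='outer')

# how='left': every label from the left frame, NaN for unmatched right columns
left_join = pd.merge(left, right, left_index=True, right_index=True, how='left')

print(inner)
print(outer)
print(left_join)
```

Only the how argument changes between the three calls, so switching join types does not require restructuring the data.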

2. Merging two Pandas DataFrames on Index using join()

By default, the join() method performs a left join on the index. In this case, all rows from the left DataFrame (data1) are kept, and matching rows from the right DataFrame (data2) are added. Rows in data1 with no match in data2 get NaN in the columns that come from data2.

Python
# import pandas module
import pandas as pd

# create student dataframe
data1 = pd.DataFrame({'id': [1, 2, 3, 4],
                      'name': ['manoj', 'manoja', 'manoji', 'manij']},
                     index=['one', 'two', 'three', 'four'])


# create marks dataframe
data2 = pd.DataFrame({'s_id': [1, 2, 3, 6, 7],
                      'marks': [98, 90, 78, 86, 78]},
                     index=['one', 'two', 'three', 'six', 'seven'])

# join two dataframes
print(data1.join(data2))

Output:

Merge two Pandas DataFrames on Index using join() 
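
The join() method also accepts a how argument, and it needs suffixes when both DataFrames contain a column with the same name. The following is a small sketch, assuming two illustrative DataFrames (first and second are hypothetical names) that both have a 'score' column.

```python
import pandas as pd

# Hypothetical frames that share a column name ('score') and one index label
first = pd.DataFrame({'score': [90, 85, 70]}, index=['one', 'two', 'three'])
second = pd.DataFrame({'score': [88, 60]}, index=['one', 'four'])

# Without suffixes, join() raises a ValueError for the overlapping column name.
# how='inner' keeps only the labels found in both frames.
joined = first.join(second, how='inner', lsuffix='_first', rsuffix='_second')
print(joined)
```

The suffixes are only applied to overlapping column names, so columns that are unique to one DataFrame keep their original names.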

3. Merging Two Pandas DataFrames on Index using concat()

With axis=1, the concat() function places the DataFrames side by side and aligns their rows on the index. Its default join='outer' keeps every index label from both DataFrames, filling in NaN where a label is missing from one of them.

Python
# import pandas module
import pandas as pd

# join two dataframes with concat
print(pd.concat([data1, data2], axis=1))

Output:

Merging Two Pandas DataFrames on Index using concat()
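
If only the shared index labels are wanted, concat() can be given join='inner'. The following sketch compares the two settings, assuming small illustrative DataFrames (first and second are hypothetical names).

```python
import pandas as pd

# Hypothetical sample data: the two frames share only the labels 'a' and 'b'
first = pd.DataFrame({'x': [1, 2, 3]}, index=['a', 'b', 'c'])
second = pd.DataFrame({'y': [10, 20, 40]}, index=['a', 'b', 'd'])

# Default join='outer': all labels from both frames, NaN for the gaps
outer = pd.concat([first, second], axis=1)

# join='inner': only the labels present in every frame
inner = pd.concat([first, second], axis=1, join='inner')

print(outer)
print(inner)
```

Unlike merge() and join(), concat() takes a list, so the same call works for more than two DataFrames.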

Key Takeaways:

  • Merging on Index: Use merge(), join(), or concat() to merge two DataFrames based on their row indices.
  • Join Types: You can specify different join types (inner, outer, left, or right) depending on whether you want to keep only matching rows or include all rows from one or both DataFrames.
  • Flexibility: The merge() function is more flexible than join() and concat(), allowing for more complex merging operations like merging on both columns and indices simultaneously.
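
The flexibility point can be illustrated by matching a column in one DataFrame against the index of another. This is a minimal sketch, assuming made-up data (orders, customers, and their column names are hypothetical and chosen only for illustration).

```python
import pandas as pd

# Hypothetical data: 'orders' stores the customer label in a column,
# while 'customers' uses that same label as its index
orders = pd.DataFrame({'customer': ['c1', 'c2', 'c1'],
                       'amount': [250, 100, 75]})
customers = pd.DataFrame({'city': ['Pune', 'Delhi']}, index=['c1', 'c2'])

# Match the left frame's 'customer' column against the right frame's index
combined = pd.merge(orders, customers, left_on='customer', right_index=True)
print(combined)
```

Each order row picks up the city of its customer, which avoids having to reset or set an index before merging.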

Recommended Article: Pandas Merging, Joining, and Concatenating

