PythonForBeginners.com

Learn By Example

Concatenate DataFrames in Python

Author: Aditya Raj
Last Updated: October 14, 2022

We use dataframes in Python to handle and analyze tabular data. In this article, we will discuss how to concatenate two or more pandas dataframes.

How to Concatenate DataFrames in Python?

To concatenate two or more dataframes in Python, we can use the concat() function defined in the pandas module. The concat() function takes a list of dataframes as its first input argument and, by default, concatenates them vertically.

We can also concatenate the dataframes horizontally using the axis parameter of the concat() function. The axis parameter has a default value of 0, which denotes that the dataframes will be concatenated vertically. If you want to concatenate the dataframes horizontally, you can pass the value 1 to the axis parameter.

After execution, the concat() function returns the resultant dataframe. The input dataframes are not modified.
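
As a quick illustration of the axis parameter, here is a minimal sketch. The dataframes df_a and df_b and their values are made up for this example and are not the CSV files used later in the article.

```python
import pandas as pd

# Two small hypothetical dataframes with identical column names.
df_a = pd.DataFrame({"Roll": [11, 12], "Marks": [85, 95]})
df_b = pd.DataFrame({"Roll": [13, 14], "Marks": [75, 78]})

# axis=0 (the default) stacks the rows of df_b below the rows of df_a.
stacked = pd.concat([df_a, df_b])
print(stacked.shape)  # (4, 2)

# axis=1 places the columns of df_b to the right of the columns of df_a.
side_by_side = pd.concat([df_a, df_b], axis=1)
print(side_by_side.shape)  # (2, 4)
```

Vertical concatenation adds rows, so the row count grows from 2 to 4. Horizontal concatenation adds columns, so the column count grows from 2 to 4.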

Concatenate DataFrames Vertically in Python

To concatenate two dataframes vertically in Python, you first need to import the pandas module using the import statement. After that, you can concatenate the dataframes using the concat() function as follows.

import pandas as pd

# Read the two input files (assumed to be in the working directory).
df1 = pd.read_csv("grade1.csv")
print("First dataframe is:")
print(df1)
df2 = pd.read_csv("grade2.csv")
print("second dataframe is:")
print(df2)
# Stack the rows of df2 below the rows of df1.
df3 = pd.concat([df1, df2])
print("Merged dataframe is:")
print(df3)

Output:

First dataframe is:
   Class  Roll    Name  Marks Grade
0      1    11  Aditya     85     A
1      1    12   Chris     95     A
2      1    14     Sam     75     B
3      1    16  Aditya     78     B
4      1    15   Harry     55     C
5      2     1    Joel     68     B
6      2    22     Tom     73     B
7      2    15    Golu     79     B
second dataframe is:
   Class  Roll        Name  Marks Grade
0      2    27       Harsh     55     C
1      2    23       Clara     78     B
2      3    33        Tina     82     A
3      3    34         Amy     88     A
4      3    15    Prashant     78     B
5      3    27      Aditya     55     C
6      3    23  Radheshyam     78     B
7      3    11       Bobby     50     D
Merged dataframe is:
   Class  Roll        Name  Marks Grade
0      1    11      Aditya     85     A
1      1    12       Chris     95     A
2      1    14         Sam     75     B
3      1    16      Aditya     78     B
4      1    15       Harry     55     C
5      2     1        Joel     68     B
6      2    22         Tom     73     B
7      2    15        Golu     79     B
0      2    27       Harsh     55     C
1      2    23       Clara     78     B
2      3    33        Tina     82     A
3      3    34         Amy     88     A
4      3    15    Prashant     78     B
5      3    27      Aditya     55     C
6      3    23  Radheshyam     78     B
7      3    11       Bobby     50     D

If all the dataframes have the same number of columns and the column names are also the same, the resultant dataframe has the same number of columns as the input dataframes. You can observe this in the example above.
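
You may have noticed that the merged dataframe above keeps the original row labels, so the labels 0 to 7 appear twice. If you prefer a fresh index, you can pass ignore_index=True to concat(). The following is a minimal sketch with made-up dataframes (df_a and df_b are hypothetical and are not the article's CSV files).

```python
import pandas as pd

# Hypothetical dataframes, each with its own default index 0, 1.
df_a = pd.DataFrame({"Name": ["Aditya", "Chris"], "Marks": [85, 95]})
df_b = pd.DataFrame({"Name": ["Harsh", "Clara"], "Marks": [55, 78]})

# By default, each input keeps its own row labels, so labels repeat.
kept = pd.concat([df_a, df_b])
print(list(kept.index))  # [0, 1, 0, 1]

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([df_a, df_b], ignore_index=True)
print(list(renumbered.index))  # [0, 1, 2, 3]
```

Repeated labels are not an error, but they can be surprising when you later select rows by label, so renumbering is often the safer choice.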

However, if a dataframe has fewer columns than the other dataframes, the resultant dataframe contains NaN in the missing columns for the rows that come from that dataframe. You can observe this in the following example.

import pandas as pd

# df1 has no Name column; df2 has one.
df1 = pd.read_csv("grade_with_roll.csv")
print("First dataframe is:")
print(df1)
df2 = pd.read_csv("grade_with_name.csv")
print("second dataframe is:")
print(df2)
# Columns missing from an input are filled with NaN for its rows.
df3 = pd.concat([df1, df2])
print("Merged dataframe is:")
print(df3)

Output:

First dataframe is:
   Roll  Marks Grade
0    11     85     A
1    12     95     A
2    13     75     B
3    14     75     B
4    16     78     B
5    15     55     C
6    20     72     B
7    24     92     A
second dataframe is:
   Roll      Name  Marks Grade
0    11    Aditya     85     A
1    12     Chris     95     A
2    13       Sam     75     B
3    14      Joel     75     B
4    16       Tom     78     B
5    15  Samantha     55     C
6    20      Tina     72     B
7    24       Amy     92     A
Merged dataframe is:
   Roll  Marks Grade      Name
0    11     85     A       NaN
1    12     95     A       NaN
2    13     75     B       NaN
3    14     75     B       NaN
4    16     78     B       NaN
5    15     55     C       NaN
6    20     72     B       NaN
7    24     92     A       NaN
0    11     85     A    Aditya
1    12     95     A     Chris
2    13     75     B       Sam
3    14     75     B      Joel
4    16     78     B       Tom
5    15     55     C  Samantha
6    20     72     B      Tina
7    24     92     A       Amy

If the dataframes have different column names, each distinct column name gets its own column in the resultant dataframe. The value in such a column is NaN for the rows that come from dataframes that do not have that column.
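
The behaviour described above can be sketched with two small made-up dataframes (with_roll and with_name are hypothetical names for this example). The sketch also shows the join parameter: with join="inner", concat() keeps only the columns that all the inputs share, so no NaN values are introduced.

```python
import pandas as pd

# Hypothetical dataframes that share only the Marks column.
with_roll = pd.DataFrame({"Roll": [11, 12], "Marks": [85, 95]})
with_name = pd.DataFrame({"Name": ["Sam", "Joel"], "Marks": [75, 75]})

# Default join="outer": every distinct column is kept, gaps become NaN.
outer = pd.concat([with_roll, with_name], ignore_index=True)
print(list(outer.columns))            # ['Roll', 'Marks', 'Name']
print(outer["Name"].isna().tolist())  # [True, True, False, False]

# join="inner": only the columns common to all inputs are kept.
inner = pd.concat([with_roll, with_name], join="inner", ignore_index=True)
print(list(inner.columns))            # ['Marks']
```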


Concatenate DataFrames Horizontally in Python

To concatenate dataframes horizontally, we pass the value 1 to the axis parameter of the concat() function. After execution, the concat() function returns the horizontally concatenated dataframe as shown below.

import pandas as pd

df1 = pd.read_csv("grade_with_roll.csv")
print("First dataframe is:")
print(df1)
df2 = pd.read_csv("grade_with_name.csv")
print("second dataframe is:")
print(df2)
# axis=1 places the columns of df2 to the right of the columns of df1.
df3 = pd.concat([df1, df2], axis=1)
print("Merged dataframe is:")
print(df3)

Output:

First dataframe is:
   Roll  Marks Grade
0    11     85     A
1    12     95     A
2    13     75     B
3    14     75     B
4    16     78     B
5    15     55     C
6    20     72     B
7    24     92     A
second dataframe is:
   Roll      Name  Marks Grade
0    11    Aditya     85     A
1    12     Chris     95     A
2    13       Sam     75     B
3    14      Joel     75     B
4    16       Tom     78     B
5    15  Samantha     55     C
6    20      Tina     72     B
7    24       Amy     92     A
Merged dataframe is:
   Roll  Marks Grade  Roll      Name  Marks Grade
0    11     85     A    11    Aditya     85     A
1    12     95     A    12     Chris     95     A
2    13     75     B    13       Sam     75     B
3    14     75     B    14      Joel     75     B
4    16     78     B    16       Tom     78     B
5    15     55     C    15  Samantha     55     C
6    20     72     B    20      Tina     72     B
7    24     92     A    24       Amy     92     A

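When concatenating horizontally, concat() matches the rows by their index labels. If the dataframes being concatenated have the same index labels, as in the above example where both are labelled 0 to 7, the resultant dataframe will not have any new NaN values. However, if a dataframe has fewer rows than the other dataframe, or its index labels differ, the resultant dataframe will have NaN values in the unmatched rows. This occurs when the join parameter is set to "outer", which is its default value. Note also that columns present in both dataframes, such as Roll, Marks, and Grade in the above example, appear twice in the result.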
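
Here is a minimal sketch of horizontal concatenation when the rows do not line up. All three dataframes are made up for this example. It assumes only the documented behaviour that concat() with axis=1 aligns rows on index labels.

```python
import pandas as pd

# Hypothetical inputs: three rows, two rows, and three relabelled rows.
longer = pd.DataFrame({"Roll": [11, 12, 13]})
shorter = pd.DataFrame({"Name": ["Aditya", "Chris"]})
relabelled = pd.DataFrame({"Name": ["Aditya", "Chris", "Sam"]},
                          index=[10, 11, 12])

# join="outer" (default): the row missing from `shorter` is filled with NaN.
outer = pd.concat([longer, shorter], axis=1)
print(outer["Name"].isna().tolist())  # [False, False, True]

# join="inner": only index labels present in every input are kept.
inner = pd.concat([longer, shorter], axis=1, join="inner")
print(inner.shape)  # (2, 2)

# Same row count but different labels: no rows match, so NaN appears.
misaligned = pd.concat([longer, relabelled], axis=1)
print(misaligned.shape)  # (6, 2)

# Resetting the index first pairs the rows up by position.
aligned = pd.concat([longer, relabelled.reset_index(drop=True)], axis=1)
print(aligned.shape)  # (3, 2)
```

The third case shows that an equal number of rows is not enough on its own. The index labels also have to match, which is why reset_index(drop=True) is a common step before a horizontal concatenation.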

Conclusion

In this article, we have discussed how to concatenate two pandas dataframes in Python. To concatenate more than two dataframes, you just need to add the extra dataframes to the list that is given as input to the concat() function.
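
For instance, three dataframes can be concatenated in a single call. This is a small sketch with made-up data, and the list name grades is hypothetical.

```python
import pandas as pd

# A list of three hypothetical dataframes with the same columns.
grades = [
    pd.DataFrame({"Class": [1], "Name": ["Aditya"]}),
    pd.DataFrame({"Class": [2], "Name": ["Joel"]}),
    pd.DataFrame({"Class": [3], "Name": ["Tina"]}),
]

# concat() accepts any number of dataframes in the list.
combined = pd.concat(grades, ignore_index=True)
print(combined["Name"].tolist())  # ['Aditya', 'Joel', 'Tina']
```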

