Append or Concatenate Two DataFrames in Python Polars
Last Updated: 21 Aug, 2024
Polars is a fast DataFrame library implemented in Rust that provides efficient ways to work with large datasets. Whether we need to append rows or concatenate columns, Polars offers several methods for the task.
Setting Up Your Environment
Before diving into the examples, ensure you have Polars installed. If not, you can install it using pip:
pip install polars
Basic DataFrame Creation
Let's create a simple DataFrame to use in the examples that follow:
Python
import polars as pl
# Create a DataFrame
df = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [24, 30, 22],
    "gender": ["F", "M", "M"],
    "city": ["New York", "Dallas", "Chicago"]
})
print(df)
Output
shape: (3, 4)
┌─────────┬─────┬────────┬──────────┐
│ name    │ age │ gender │ city     │
│ ---     │ --- │ ---    │ ---      │
│ str     │ i64 │ str    │ str      │
├─────────┼─────┼────────┼──────────┤
│ Alice   │ 24  │ F      │ New York │
│ Bob     │ 30  │ M      │ Dallas   │
│ Charlie │ 22  │ M      │ Chicago  │
└─────────┴─────┴────────┴──────────┘
Loading Data into Polars DataFrame
Suppose we have a data.csv file. We can load its contents into a Polars DataFrame with read_csv:
Python
import polars as pl
# Read the CSV file into a DataFrame
df = pl.read_csv('data.csv')
print(df)
Output:
shape: (10, 4)
┌─────────┬─────┬────────┬──────────┐
│ name    │ age │ gender │ city     │
│ ---     │ --- │ ---    │ ---      │
│ str     │ i64 │ str    │ str      │
├─────────┼─────┼────────┼──────────┤
│ Alice   │ 24  │ F      │ New York │
│ Bob     │ 30  │ M      │ Dallas   │
│ Charlie │ 22  │ M      │ Chicago  │
│ David   │ 25  │ M      │ Dallas   │
│ Eve     │ 28  │ F      │ Phoenix  │
│ Frank   │ 33  │ M      │ New York │
│ Grace   │ 27  │ F      │ Chicago  │
│ Hank    │ 27  │ M      │ Phoenix  │
│ Ivy     │ 26  │ F      │ Dallas   │
│ Jack    │ 31  │ M      │ Chicago  │
└─────────┴─────┴────────┴──────────┘
Methods to Append or Concatenate DataFrames in Polars
- Using pl.concat for Concatenation
- Using vstack for Appending Rows
Using pl.concat for Concatenation
The pl.concat function in Polars allows us to concatenate multiple DataFrames either vertically (by adding rows) or horizontally (by adding columns). The direction is chosen with the how parameter, which defaults to "vertical".
Example: Concatenating Two DataFrames Vertically
Python
import polars as pl
# Create two DataFrames
df1 = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [24, 30, 22],
    "gender": ["F", "M", "M"],
    "city": ["New York", "Dallas", "Chicago"]
})
df2 = pl.DataFrame({
    "name": ["David", "Eve", "Frank"],
    "age": [25, 28, 33],
    "gender": ["M", "F", "M"],
    "city": ["Dallas", "Phoenix", "New York"]
})
# Concatenate the DataFrames vertically
df_combined = pl.concat([df1, df2])
print(df_combined)
Output:
A DataFrame of shape (6, 4): the three rows of df1 followed by the three rows of df2.
Example: Concatenating Two DataFrames Horizontally
Python
import polars as pl
# Create two DataFrames
df1 = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [24, 30, 22],
})
df2 = pl.DataFrame({
    "gender": ["F", "M", "M"],
    "city": ["New York", "Dallas", "Chicago"]
})
# Concatenate the DataFrames horizontally
df_combined = pl.concat([df1, df2], how="horizontal")
print(df_combined)
Output:
A DataFrame of shape (3, 4) with the columns name, age, gender and city, in that order.
Using vstack for Appending Rows
The vstack method appends one DataFrame to another vertically (row-wise). By default it returns a new DataFrame and leaves the original unchanged.
Example: Appending Two DataFrames in Python Polars
Python
import polars as pl
# Create two DataFrames
df1 = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [24, 30, 22],
    "gender": ["F", "M", "M"],
    "city": ["New York", "Dallas", "Chicago"]
})
df2 = pl.DataFrame({
    "name": ["David", "Eve", "Frank"],
    "age": [25, 28, 33],
    "gender": ["M", "F", "M"],
    "city": ["Dallas", "Phoenix", "New York"]
})
# Append df2 to df1
df_combined = df1.vstack(df2)
print(df_combined)
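Output:
A DataFrame of shape (6, 4): the rows of df1 followed by the rows of df2, the same result as the vertical pl.concat example.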
Conclusion
Polars provides efficient and easy-to-use functions for appending or concatenating DataFrames. We can use pl.concat for both vertical and horizontal concatenation, or vstack specifically for appending DataFrames vertically. Polars can also load data from CSV files with read_csv, which makes it straightforward to bring existing data into these workflows.