How to Add a Column to a Polars DataFrame Using .with_columns()
Last Updated: 02 Sep, 2024
The .with_columns() method in Polars allows us to add one or more columns to a DataFrame. Unlike traditional methods that modify the DataFrame in place, .with_columns() returns a new DataFrame with the added columns, preserving immutability. This method is highly versatile, allowing us to create new columns based on existing ones, use constant values, or even apply complex transformations.
In this article, we’ll explore different methods to add a new column to a Polars DataFrame using .with_columns().
Install Polars
First, make sure that Polars is installed on the system.
pip install polars
Basic Usage of .with_columns()
To get started, let's first understand the syntax of the .with_columns() method.
In the example below, we create a new column "Age_in_5_Years" by adding 5 to the "Age" column. The .with_columns() method is passed a list of expressions, each representing a column to be added.
Python
import polars as pl
# Create a sample DataFrame
df = pl.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35]
})
# Add a new column
new_df = df.with_columns([
    (pl.col("Age") + 5).alias("Age_in_5_Years")
])
print(new_df)
Output
[Image: Add column to Polars DataFrame using .with_columns()]
Example 1: Adding a Constant Value Column
One of the simplest use cases for .with_columns() is adding a column with a constant value. This is useful when we need to add metadata or a static identifier to each row.
Python
import polars as pl
# Create a sample DataFrame
df = pl.DataFrame({
    "Product": ["A", "B", "C"],
    "Price": [100, 150, 200]
})
# Add a constant value column
new_df = df.with_columns([
    pl.lit("₹").alias("Currency")
])
print(new_df)
Output
[Image: Adding a constant value to a Polars DataFrame]
Here, we added a "Currency" column with the constant value "₹".
Example 2: Creating a Column Based on Multiple Existing Columns
We can create a new column by performing operations on multiple existing columns. For instance, let's create a "Total_Cost" column by multiplying the "Price" column by a "Quantity" column.
Python
import polars as pl
# Create a sample DataFrame (the Quantity values are illustrative)
df = pl.DataFrame({
    "Product": ["A", "B", "C"],
    "Price": [100, 150, 200],
    "Quantity": [2, 1, 3]
})
# Add a column computed from two existing columns
new_df = df.with_columns([
    (pl.col("Price") * pl.col("Quantity")).alias("Total_Cost")
])
print(new_df)
Output
[Image: Adding a column from existing ones]
This example demonstrates how to create a "Total_Cost" column by multiplying "Price" and "Quantity".
Example 3: Conditional Column Creation
The .with_columns() method allows us to create columns conditionally based on existing data. Let's create a column "Category" that categorizes products as "Expensive" if their price is above 150 and "Affordable" otherwise.
Python
import polars as pl
# Create a sample DataFrame
df = pl.DataFrame({
    "Product": ["A", "B", "C"],
    "Price": [100, 150, 200]
})
# Add a conditional column
# Strings are wrapped in pl.lit() because bare strings passed to
# then()/otherwise() are interpreted as column names
new_df = df.with_columns([
    pl.when(pl.col("Price") > 150)
    .then(pl.lit("Expensive"))
    .otherwise(pl.lit("Affordable"))
    .alias("Category")
])
print(new_df)
Output
[Image: Adding a column based on a condition in a Polars DataFrame]
Here, we categorized products based on their price.
Example 4: Adding Multiple Columns Simultaneously
We can add multiple columns in one go using the .with_columns() method by passing a list of expressions.
Python
import polars as pl
# Create a sample DataFrame
df = pl.DataFrame({
    "Product": ["A", "B", "C"],
    "Price": [100, 150, 200]
})
# Add multiple columns
new_df = df.with_columns([
    (pl.col("Price") * 1.1).alias("Price_with_Tax"),    # price plus 10% tax
    (pl.col("Price") - 50).alias("Discounted_Price")    # price minus a flat discount of 50
])
print(new_df)
Output
[Image: Adding multiple columns in one go in a Polars DataFrame]
This example shows how to add multiple columns: "Price_with_Tax" and "Discounted_Price".
Example 5: Adding a Column with a Custom Function
Sometimes we may need to add a column based on a custom function. We can achieve this using the .map_elements() method inside .with_columns().
Python
import polars as pl
# Create a sample DataFrame
df = pl.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35]
})
# Define a custom function
def age_category(age):
    if age < 30:
        return "Young"
    else:
        return "Mature"
# Add a column using a custom function
# return_dtype states the output type explicitly so Polars does not have to infer it
new_df = df.with_columns([
    pl.col("Age").map_elements(age_category, return_dtype=pl.String).alias("Category")
])
print(new_df)
Output
[Image: Add column to a Polars DataFrame using a function]
This example adds a "Category" column based on a custom function that categorizes people by age.
Conclusion
The .with_columns() method in Polars is a powerful and flexible way to add new columns to a DataFrame. Whether we are adding constant values, performing calculations, creating conditional columns, or applying custom functions, .with_columns() provides an intuitive and efficient interface for DataFrame manipulation. With the examples provided, we can now confidently add columns to our Polars DataFrames in a variety of scenarios.