Adding a New Variable to a Pandas DataFrame
Last Updated: 13 Jun, 2022
In this article, let's learn how to add a new variable (column) to a pandas DataFrame using the assign() method and square brackets.
Pandas is a Python package that offers data structures and operations for manipulating numerical data and time series. It is popular mainly because it makes importing and analyzing data much easier. A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). The data, rows, and columns are its three main components. Here we will see two different methods for adding new variables to a pandas DataFrame.
Method 1: Using pandas.DataFrame.assign() method
This method creates new columns for a DataFrame. It returns a new object containing all the original columns as well as the new ones. Existing columns that are re-assigned are overwritten in the returned object.
Syntax: DataFrame.assign(**kwargs)
- **kwargs : dict of {str: callable or Series}. The keywords are the column names. If a value is callable, it is computed on the DataFrame and the result is assigned to the new column. The callable must not modify the input DataFrame. If a value is not callable (for example, a Series, scalar, or array), it is simply assigned.
Returns: A new DataFrame is returned with the new columns as well as all the existing columns.
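As a minimal sketch of the behaviour described above, the snippet below passes both a callable and a plain value to assign(), and checks that the original DataFrame is left untouched. The sample data and the column name Total are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'TeamA': [10, 20, 30], 'TeamB': [1, 2, 3]})

# A callable receives the DataFrame and returns the new column's values
with_total = df.assign(Total=lambda d: d['TeamA'] + d['TeamB'])

# A non-callable value (here a list) is assigned directly;
# re-using an existing name overwrites that column in the result
overwritten = df.assign(TeamB=[7, 8, 9])

print(with_total)
print(overwritten)

# The original df is unchanged: it still has only TeamA and TeamB
print(list(df.columns))
```

Because assign() returns a new object each time, the result has to be bound to a name (or re-assigned to df) for the new column to be kept.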
Example
In this example, we import the NumPy and pandas packages and set a seed so that the same random data is generated each time. Ten scores between 30 and 99 (the upper bound of randint() is exclusive) are generated for each of three teams. The assign() method is then used to create another column: the keyword we pass (TeamD) becomes the name of the column, and its value is the data assigned to it. The result is a new DataFrame, df2, containing the new column in addition to the existing ones.
Python3
# import packages
import numpy as np
import pandas as pd
# setting a seed
np.random.seed(123)
# creating a dataframe
df = pd.DataFrame({'TeamA': np.random.randint(30, 100, 10),
                   'TeamB': np.random.randint(30, 100, 10),
                   'TeamC': np.random.randint(30, 100, 10)})
print('Before assigning the new column')
print(df)
# using assign() method to add a new column
scores = np.random.randint(30, 100, 10)
df2 = df.assign(TeamD=scores)
print('After assigning the new column')
print(df2)
Output:
Method 2: Using [] to add a new column
In this method, instead of using assign(), we use square brackets ([]) to create a new variable or column in an existing DataFrame. The syntax goes like this:
dataframe_name['column_name'] = data
column_name is the name of the new column to be added in our dataframe.
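A short sketch of this syntax, using made-up data and hypothetical column names (Bonus, Total): the right-hand side can be a scalar, which is repeated for every row, or a value computed from existing columns. Unlike assign(), the assignment changes the DataFrame in place.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'TeamA': [10, 20, 30], 'TeamB': [1, 2, 3]})

# A scalar is repeated for every row
df['Bonus'] = 5

# A column can also be derived from existing columns
df['Total'] = df['TeamA'] + df['TeamB'] + df['Bonus']

# df itself now contains the two new columns
print(df)
```

When the right-hand side is a list or array instead of a scalar, its length has to match the number of rows in the DataFrame.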
Example
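The resulting DataFrame has the same structure as with the assign() method: a new column called TeamD is added, holding the scores for TeamD. The difference is that square brackets modify the existing DataFrame in place instead of returning a new one. Note that the random scores in this example are drawn from 100 to 149, so the TeamD values differ from those in the first example.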
Python3
# import packages
import numpy as np
import pandas as pd
# setting a seed
np.random.seed(123)
# creating a dataframe
df = pd.DataFrame({'TeamA': np.random.randint(30, 100, 10),
                   'TeamB': np.random.randint(30, 100, 10),
                   'TeamC': np.random.randint(30, 100, 10)})
print('Before assigning the new column')
print(df)
# using [] to add a new column
scores = np.random.randint(100, 150, 10)
df['TeamD'] = scores
print('After assigning the new column')
print(df)
Output: