Pandas DataFrame drop() Method

Last Updated : 04 Dec, 2024

The drop() method in Pandas removes specified labels from the rows or columns of a DataFrame. It can either return a new DataFrame with the changes or modify the original in place. Because drop() works on both rows and columns, the axis determines which one is affected: the default, axis=0, drops rows, and axis=1 drops columns. Alternatively, the index and columns keywords name the target directly. For example, let us take a sample DataFrame and drop a column and a row:

Python

import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [4, 5], 'C': [7, 8]})
print("Original DataFrame:\n", df)

df_dropped_col = df.drop(columns=['B']) # Dropping a column
print("\nDataFrame after dropping column 'B': \n", df_dropped_col)

df_dropped_row = df.drop(index=1) # Dropping a row
print("\nDataFrame after dropping row with index 1:\n", df_dropped_row)

Output

Original DataFrame:
    A  B  C
0  1  4  7
1  2  5  8

DataFrame after dropping column 'B':
    A  C
0  1  7
1  2  8

DataFrame after dropping row with index 1:
    A  B  C
0  1  4  7

Understanding the Syntax and Parameters of drop() Method

The drop() method has a straightforward syntax that provides flexibility in specifying what to remove:

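DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

Here:
labels: a single label or a list of row or column labels to drop; used together with axis
axis: 0 (or 'index') drops rows and 1 (or 'columns') drops columns
index: row labels to drop; equivalent to passing labels with axis=0
columns: column labels to drop; equivalent to passing labels with axis=1
level: for a MultiIndex, the level (by name or position) from which the labels are removed
inplace: if True, the original DataFrame is modified and None is returned; if False (the default), a new DataFrame is returned
errors: either 'raise' (the default) or 'ignore'. With 'ignore', row or column labels that do not exist are skipped instead of raising a KeyError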
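
The parameters above can be tried out on a small DataFrame. The following is a minimal sketch, reusing the sample data from the first example; the column name 'Z' is a made-up label that deliberately does not exist, and the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [4, 5], 'C': [7, 8]})

# labels + axis and the columns keyword are two spellings of the same operation
by_axis = df.drop(labels=['B'], axis=1)
by_keyword = df.drop(columns=['B'])
print(by_axis.equals(by_keyword))

# errors='ignore' skips the missing label 'Z' instead of raising a KeyError
ignored = df.drop(columns=['B', 'Z'], errors='ignore')
print(list(ignored.columns))

# With the default errors='raise', a missing label raises a KeyError
try:
    df.drop(columns=['Z'])
    raised = False
except KeyError:
    raised = True
print(raised)
```

Using index= or columns= tends to read more clearly than labels= with axis=, since the call itself says which dimension is affected.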

How to Drop Rows Using Index Labels?

Dropping rows by index label is useful when you know exactly which rows need to be removed: only the specified rows are dropped, and the rest are left untouched. This helps refine a dataset by eliminating irrelevant or erroneous entries without altering its overall structure.

Rows are usually accessed by their index values, so we pass the index label, or a list of labels, of the rows to be dropped.

Python

import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
df_dropped = df.drop(0, axis=0)
print(df_dropped)

Output

   A  B  C
1  2  5  8
2  3  6  9

This code returns a new DataFrame without the row labelled 0. The original df is unchanged because inplace defaults to False.
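
Several rows can be dropped in one call by passing a list of labels. The sketch below, which assumes the same three-row DataFrame, also shows that the original keeps all of its rows and that the remaining index labels are not renumbered unless reset_index() is called; the names df_renumbered and the choice of rows are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Pass a list to drop several rows at once; drop() returns a new DataFrame
df_dropped = df.drop(index=[0, 2])
print(df_dropped)

# The original DataFrame still has all three rows
print(len(df))

# The surviving row keeps its label 1; reset_index(drop=True) renumbers from 0
df_renumbered = df_dropped.reset_index(drop=True)
print(list(df_renumbered.index))
```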

Dropping the Columns by Label

Here we drop columns by passing their labels through the labels parameter together with axis=1. Note that labels accepts either a single value or a list of values.

Python

import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

df_dropped = df.drop(['B','C'], axis=1)
print(df_dropped)

Output

   A
0  1
1  2
2  3

How to Drop Columns Using Column Names?

Consider a DataFrame with four columns, from which we want to remove two. The inplace parameter determines whether the change is applied directly to the original DataFrame (inplace=True) or a new, modified DataFrame is returned (inplace=False, the default).

Python

import pandas as pd
df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30], 'City': ['New York', 'Los Angeles'], 'Country': ['USA', 'USA']})

# Drop the columns 'Age' and 'Country'
df.drop(columns=['Age', 'Country'], inplace=True)
print(df)

Output

    Name         City
0  Alice     New York
1    Bob  Los Angeles

As the output shows, the two columns Age and Country have been dropped from the DataFrame. Because inplace=True was used, the original DataFrame itself was modified.
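
One detail worth checking when using inplace=True is the return value: the call returns None, so assigning its result back to df would discard the data. A minimal sketch, assuming the same four-column DataFrame (the variable name result is illustrative), contrasts this with plain reassignment:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30],
                   'City': ['New York', 'Los Angeles'], 'Country': ['USA', 'USA']})

# With inplace=True the method returns None, so do not assign its result
result = df.drop(columns=['Age'], inplace=True)
print(result)
print(list(df.columns))

# Reassigning the returned DataFrame achieves the same effect without inplace
df = df.drop(columns=['Country'])
print(list(df.columns))
```

Reassignment is often the easier style to follow, since every step produces a value that can be inspected or chained.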

Dropping rows from a Multi index dataframe

Multi-index DataFrames have more than one level of indexing and are used to handle hierarchical data. For a MultiIndex DataFrame, the level parameter lets you remove labels at a specific hierarchical level; it takes the level's name or position. To delete one particular row, as in the example here, pass the full tuple of index labels that identifies it.

Python

import pandas as pd
# Create a MultiIndex DataFrame
arrays = [['A', 'A', 'B', 'B'], ['one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=['letter', 'number'])
df = pd.DataFrame({'X': [1, 2, 3, 4], 'Y': [5, 6, 7, 8]}, index=index)

df_dropped = df.drop(('B', 'one'))
print("\nDataFrame after dropping ('B', 'one'):")
print(df_dropped)

Output

DataFrame after dropping ('B', 'one'):
               X  Y
letter number      
A      one     1  5
       two     2  6
B      two     4  8
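
The example above removes a single row by its full tuple and does not use level. As a minimal sketch of the level parameter, assuming the same 'letter' and 'number' index names (the result names no_ones and only_b are illustrative), a label can be dropped from one level across the whole DataFrame:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 'one'), ('A', 'two'), ('B', 'one'), ('B', 'two')],
    names=['letter', 'number'])
df = pd.DataFrame({'X': [1, 2, 3, 4], 'Y': [5, 6, 7, 8]}, index=index)

# Drop every row whose 'number' level equals 'one'
no_ones = df.drop(index='one', level='number')
print(no_ones)

# Drop every row whose 'letter' level equals 'A'
only_b = df.drop(index='A', level='letter')
print(only_b)
```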

baidehi1874
