Get column index from column name of a given Pandas DataFrame
Let's learn how to get the integer position (index) of a column from its name, using the DataFrame.columns attribute and the Index.get_loc() method from the pandas library.
Index.get_loc() Method
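Index.get_loc() looks up a label and returns its integer position. Called on df.columns, it returns an integer if the column name is unique; if the name is repeated, it returns a slice or a boolean mask instead. A name that does not exist raises a KeyError.
Syntax: Index.get_loc(key)
Note: the method and tolerance parameters shown in older documentation were deprecated in pandas 1.4 and removed in pandas 2.0.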
Let’s create a DataFrame with some sample data and find the index of specific columns by name.
import pandas as pd
# Sample data
data = {
"Math": [90, 85, 78],
"Science": [88, 92, 95],
"English": [85, 80, 89]
}
# Creating the DataFrame
df = pd.DataFrame(data)
print("DataFrame:")
print(df)
Output:
DataFrame:
   Math  Science  English
0    90       88       85
1    85       92       80
2    78       95       89
Example 1: Get the Index of the "Science" Column
# Get index of the "Science" column
science_index = df.columns.get_loc("Science")
print("Index of 'Science' column:", science_index)
Output:
Index of 'Science' column: 1
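The return value of get_loc() changes when a column name is repeated, and a missing name raises a KeyError. The sketch below illustrates both cases. The frame dup_df and the helper column_position are hypothetical names introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Frame with unique column names, as in the example above
df = pd.DataFrame({"Math": [90], "Science": [88], "English": [85]})

# Hypothetical frame with a repeated column name (illustration only)
dup_df = pd.DataFrame([[90, 88, 75]], columns=["Math", "Science", "Math"])

# For a repeated, non-adjacent name, get_loc returns a boolean mask
mask = dup_df.columns.get_loc("Math")
positions = np.flatnonzero(mask)  # integer positions of the True entries
print("Positions of 'Math':", positions.tolist())


def column_position(frame, name):
    """Return the position of a column name, or None if the name is absent."""
    try:
        return frame.columns.get_loc(name)
    except KeyError:
        return None


print(column_position(df, "Science"))  # 1
print(column_position(df, "History"))  # None
```

Catching KeyError keeps the lookup from stopping the program when a name might be missing; whether returning None is appropriate depends on the calling code.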
The get_loc() method is the recommended approach. Several other methods can also be used to get the column index from the column name of a given Pandas DataFrame.
Using list().index()
Convert the DataFrame columns to a list and use the list's .index() method to get the position of the column. If the name is not present, .index() raises a ValueError.
column_index = list(df.columns).index('Science')
print("Using list().index():", column_index)
Output:
Using list().index(): 1
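Instead of list(), we can also use .tolist(), for example: column_index = df.columns.tolist().index('Math'). Both approaches build a full Python list and then scan it from the start, so any speed difference between them is small. When performance is critical, get_loc() is the better choice.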
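When several column names need to be resolved, repeating the list conversion and scan for each name adds up. One possible alternative, sketched here under the assumption that column names are unique, is to build a name-to-position dictionary once with enumerate(). The variable name positions is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Math": [90, 85, 78],
    "Science": [88, 92, 95],
    "English": [85, 80, 89],
})

# Map each column name to its position in a single pass
# (assumes names are unique; a repeated name would keep only its last position)
positions = {name: i for i, name in enumerate(df.columns)}

print(positions["Science"])      # 1
print(positions.get("History"))  # None, since the name is absent
```

Using dict.get() avoids an exception for missing names, at the cost of having to check for None afterwards.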
Using np.where() with columns
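numpy.where() compares every column label in one vectorized operation and returns the positions of all matches, so it also works when a name appears more than once. Indexing the result with [0][0] takes the first match and raises an IndexError if the name is not found. For a single unique name, get_loc() is usually simpler.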
import numpy as np
column_index = np.where(df.columns == 'Science')[0][0]
print("index of science column using np.where():", column_index)
Output:
index of science column using np.where(): 1
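Because np.where() reports every match, it can also locate repeated column names, and an absent name yields an empty array rather than an exception. A small sketch, with dup_df as a hypothetical frame built only for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a repeated column name (illustration only)
dup_df = pd.DataFrame([[90, 88, 75]], columns=["Math", "Science", "Math"])

# All positions where the label equals "Math"
math_positions = np.where(dup_df.columns == "Math")[0]
print(math_positions.tolist())  # [0, 2]

# An absent name gives an empty array, so check the size before indexing [0]
missing = np.where(dup_df.columns == "History")[0]
print(missing.size)  # 0
```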