The index of a pandas DataFrame acts as a label for each row in the dataset. It can be numeric or based on the values of a specific column. The default index is usually a RangeIndex starting from 0, but you can customize it to make the data easier to read and look up. You can access the current index of a DataFrame using the index attribute. Let's understand this with the help of an example:
1. Accessing and Modifying the Index
Python
import pandas as pd
data = {'Name': ['John', 'Alice', 'Bob', 'Eve', 'Charlie'],
'Age': [25, 30, 22, 35, 28],
'Gender': ['Male', 'Female', 'Male', 'Female', 'Male'],
'Salary': [50000, 55000, 40000, 70000, 48000]}
df = pd.DataFrame(data)
print(df.index) # Accessing the index
Output
RangeIndex(start=0, stop=5, step=1)
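The heading also mentions modifying the index. One way to do that, shown here as a minimal sketch, is to assign a new list of labels to the index attribute. The labels 'a', 'b', 'c' and the name df_labeled are illustrative assumptions, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22]})

# Work on a copy so the original DataFrame keeps its default RangeIndex
df_labeled = df.copy()

# Assign new labels directly; the list length must match the number of rows
df_labeled.index = ['a', 'b', 'c']

index_labels = list(df_labeled.index)   # labels of the modified copy
original_labels = list(df.index)        # labels of the untouched original

print(index_labels)
print(original_labels)
```

Assigning a list with the wrong number of labels raises a ValueError, so this approach suits cases where you already have one label per row.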
2. Setting a Custom Index
To set a custom index, use the set_index() method with the name of an existing column, such as Name or Age. It returns a new DataFrame and leaves the original unchanged unless you assign the result back.
Python
# Set 'Name' column as the index
df_with_index = df.set_index('Name')
print(df_with_index)
Output
Age Gender Salary
Name
John 25 Male 50000
Alice 30 Female 55000
Bob 22 Male 40000
Eve 35 Female 70000
Charlie 28 Male 480...
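As a small sketch of how set_index() behaves, the example below checks that the original DataFrame is left untouched and shows the drop=False option, which keeps the column in the data as well as in the index. The shortened dataset and the variable names indexed and indexed_keep are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22]})

# set_index() returns a new DataFrame; df itself is not modified
indexed = df.set_index('Name')

# drop=False keeps 'Name' as a regular column in addition to the index
indexed_keep = df.set_index('Name', drop=False)

print(list(df.columns))            # ['Name', 'Age']
print(list(indexed.columns))       # ['Age']
print(list(indexed_keep.columns))  # ['Name', 'Age']
```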
There are various operations you can perform with the DataFrame index, such as resetting it, changing it, or indexing with loc[]. Let's understand these as well:
3. Resetting the Index
If you need to go back to the default integer index, use the reset_index() method. It converts the current index into a regular column and creates a new default index.
Python
# Reset the 'Name' index of df_with_index back to the default integer index
df_reset = df_with_index.reset_index()
print(df_reset)
Output
Name Age Gender Salary
0 John 25 Male 50000
1 Alice 30 Female 55000
2 Bob 22 Male 40000
3 Eve 35 Female 70000
4 Charlie 28 Male 48000
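reset_index() also accepts drop=True, which discards the old index instead of turning it into a column. The sketch below contrasts the two behaviors on a shortened version of the dataset; the names restored and discarded are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22]})
indexed = df.set_index('Name')

# Default: the old index ('Name') comes back as a regular column
restored = indexed.reset_index()

# drop=True: the old index labels are thrown away
discarded = indexed.reset_index(drop=True)

print(list(restored.columns))   # ['Name', 'Age']
print(list(discarded.columns))  # ['Age']
```

Use drop=True only when the old labels carry no information you need, since they cannot be recovered from the result.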
4. Indexing with loc
The loc[] indexer in pandas lets you access rows and columns of a DataFrame by their labels, making it easy to retrieve specific data points.
Python
# Look up the row labeled 'Alice'; this requires 'Name' to be the index
row = df_with_index.loc['Alice']
print(row)
Output
Age 30
Gender Female
Salary 55000
Name: Alice, dtype: object
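loc[] is not limited to a single row label. The following sketch, on a shortened version of the dataset, selects one cell, then several rows and columns, and finally shows that a missing label raises KeyError. The variable names and the label 'Zed' are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22],
                   'Salary': [50000, 55000, 40000]})
by_name = df.set_index('Name')

# Single value: row label first, then column label
alice_salary = by_name.loc['Alice', 'Salary']

# Several rows and columns at once: pass lists of labels
subset = by_name.loc[['John', 'Bob'], ['Age', 'Salary']]

# A label that is not in the index raises KeyError
missing = False
try:
    by_name.loc['Zed']
except KeyError:
    missing = True

print(alice_salary)
print(subset)
print(missing)
```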
5. Changing the Index
To change the index of a DataFrame, call set_index() again with a different column. The method accepts one or more columns as the new index.
Python
# Set 'Age' as the new index
df_with_new_index = df.set_index('Age')
print(df_with_new_index)
Output
Name Gender Salary
Age
25 John Male 50000
30 Alice Female 55000
22 Bob Male 40000
35 Eve Female 70000
28 Charlie Male 480...
Here are some Key Takeaways:
- Use .loc[] for label-based row selection and set_index() to set custom indices.
- Access the index with .index. reset_index() restores the default integer index, and passing drop=True discards the old index instead of keeping it as a column.