Pandas Access DataFrame

Last Updated : 17 Jan, 2025

Accessing a DataFrame in pandas means retrieving, exploring, and manipulating the data stored in it. The most basic form of access is referring to the DataFrame by its variable name. In an interactive session such as a notebook, this displays the DataFrame; in a script, pass it to print(). Either way you see all rows and columns, although pandas truncates the display of large DataFrames.

Python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Eve', 'Charlie'], 
        'Age': [25, 30, 22, 35, 28], 
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male'], 
        'Salary': [50000, 55000, 40000, 70000, 48000]}

df = pd.DataFrame(data)
# Display the entire DataFrame
print(df)

Output
      Name  Age  Gender  Salary
0     John   25    Male   50000
1    Alice   30  Female   55000
2      Bob   22    Male   40000
3      Eve   35  Female   70000
4  Charlie   28    Male   48000

Beyond viewing the entire DataFrame, pandas offers several ways to retrieve and manipulate specific parts of it. Let's look at each of them:

1. Accessing Columns From DataFrame

Columns in a DataFrame can be accessed individually using bracket notation. Accessing a single column returns it as a Series, which can then be manipulated further.

Python
# Access the 'Age' column
age_column = df['Age']
print(age_column)

Output
0    25
1    30
2    22
3    35
4    28
Name: Age, dtype: int64
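
As a small illustration of the Series behaviour described above, the sketch below rebuilds a reduced version of the example DataFrame and compares single brackets with a one-element list of column names. The variable names age_series and age_frame are only illustrative.

```python
import pandas as pd

# Reduced copy of the example data, so the snippet runs on its own
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Eve', 'Charlie'],
                   'Age': [25, 30, 22, 35, 28]})

age_series = df['Age']    # single brackets: returns a Series
age_frame = df[['Age']]   # list of names: returns a one-column DataFrame

print(type(age_series).__name__)  # Series
print(type(age_frame).__name__)   # DataFrame

# A Series supports further operations, such as aggregation
print(age_series.mean())          # 28.0
```

Choosing between the two forms mostly depends on whether the next step expects a Series or a DataFrame.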

2. Accessing Rows by Index

To access specific rows in a DataFrame, you can use iloc (positional indexing) or loc (label-based indexing). iloc retrieves rows by their integer position, while loc retrieves them by their index label. With the default index (0, 1, 2, ...), labels and positions coincide.

Python
# Access the row at index 1 (second row)
second_row = df.iloc[1]
print(second_row)

Output
Name       Alice
Age           30
Gender    Female
Salary     55000
Name: 1, dtype: object
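
The difference between loc and iloc becomes visible once the index is not the default integer range. The sketch below assumes, purely for illustration, that the 'Name' column is used as the index; by_name is a hypothetical variable name.

```python
import pandas as pd

# Reduced copy of the example data, so the snippet runs on its own
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Eve', 'Charlie'],
                   'Age': [25, 30, 22, 35, 28],
                   'Salary': [50000, 55000, 40000, 70000, 48000]})

# Assumption for this example: use the names as row labels
by_name = df.set_index('Name')

alice_by_label = by_name.loc['Alice']   # label-based lookup
alice_by_position = by_name.iloc[1]     # position-based lookup (second row)

print(alice_by_label)
print(alice_by_label.equals(alice_by_position))  # True
```

Both lookups return the same row here; they would differ if the rows were reordered, because loc follows the label and iloc follows the position.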

3. Accessing Multiple Rows or Columns

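You can access multiple rows or columns at once by passing a list or a slice of labels to loc, or of integer positions to iloc. This is useful when you need to select several columns or rows for further analysis. Note that a loc slice includes its end label, so 0:2 selects the rows labeled 0, 1, and 2.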

Python
# Access rows labeled 0 through 2 (the first three rows) and the 'Name' and 'Age' columns
subset = df.loc[0:2, ['Name', 'Age']]
print(subset)

Output
    Name  Age
0   John   25
1  Alice   30
2    Bob   22
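
For comparison, here is a sketch of the same selection written with iloc. It assumes the default integer index, so labels and positions line up; subset_by_position and subset_by_label are illustrative names.

```python
import pandas as pd

# Same example data as in the article
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Eve', 'Charlie'],
                   'Age': [25, 30, 22, 35, 28],
                   'Gender': ['Male', 'Female', 'Male', 'Female', 'Male'],
                   'Salary': [50000, 55000, 40000, 70000, 48000]})

# iloc slices exclude the end position: 0:3 selects positions 0, 1, 2
subset_by_position = df.iloc[0:3, [0, 1]]

# loc slices include the end label: 0:2 selects labels 0, 1, 2
subset_by_label = df.loc[0:2, ['Name', 'Age']]

print(subset_by_position.equals(subset_by_label))  # True
print(len(df.iloc[0:2]))  # 2
print(len(df.loc[0:2]))   # 3
```

The differing slice semantics are a common source of off-by-one mistakes when switching between the two indexers.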

4. Accessing Rows Based on Conditions

Pandas allows you to filter rows based on conditions, which can be very powerful for exploring subsets of data that meet specific criteria.

Python
# Access rows where 'Age' is greater than 25
filtered_data = df[df['Age'] > 25]
print(filtered_data)

Output
      Name  Age  Gender  Salary
1    Alice   30  Female   55000
3      Eve   35  Female   70000
4  Charlie   28    Male   48000
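
Conditions can also be combined. The following sketch, which reuses the example data, uses the element-wise operators & (and) and | (or); each condition needs its own parentheses because these operators bind more tightly than comparisons. The variable names are illustrative.

```python
import pandas as pd

# Same example data as in the article
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Eve', 'Charlie'],
                   'Age': [25, 30, 22, 35, 28],
                   'Gender': ['Male', 'Female', 'Male', 'Female', 'Male'],
                   'Salary': [50000, 55000, 40000, 70000, 48000]})

# Rows where both conditions hold
older_women = df[(df['Age'] > 25) & (df['Gender'] == 'Female')]
print(older_women['Name'].tolist())  # ['Alice', 'Eve']

# Rows where at least one condition holds
young_or_high_paid = df[(df['Age'] < 25) | (df['Salary'] > 60000)]
print(young_or_high_paid['Name'].tolist())  # ['Bob', 'Eve']
```

Python's own `and` and `or` keywords do not work here, because they expect a single truth value rather than a Series of booleans.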

5. Accessing Specific Cells with at and iat

If you need to access a specific cell, you can use the .at[] accessor for label-based indexing and the .iat[] accessor for integer position-based indexing. Both are optimized for fast access to a single value.

Python
# Access the 'Salary' of the row with label 2
salary_at_index_2 = df.at[2, 'Salary']
print(salary_at_index_2)

Output
40000
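
The example above covers .at[]. A matching sketch for .iat[] follows; it relies on the column order of the example data, where 'Salary' is the fourth column (position 3). The name salary_position is illustrative.

```python
import pandas as pd

# Same example data as in the article
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Eve', 'Charlie'],
                   'Age': [25, 30, 22, 35, 28],
                   'Gender': ['Male', 'Female', 'Male', 'Female', 'Male'],
                   'Salary': [50000, 55000, 40000, 70000, 48000]})

# Row position 2, column position 3 ('Salary')
print(df.iat[2, 3])  # 40000

# Looking up the column position avoids hard-coding it
salary_position = df.columns.get_loc('Salary')
print(df.iat[2, salary_position] == df.at[2, 'Salary'])  # True
```

Hard-coded positions break silently if columns are reordered, so .at[] with a column name is often the safer choice when the label is known.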

Key takeaways:

  1. Access a DataFrame by its variable name to view all data, and use bracket notation for columns and loc/iloc for rows.
  2. Retrieve multiple rows or columns simultaneously by passing lists of names or indices.
  3. Filter rows based on conditions to explore specific subsets of data effectively.
