Pandas Practice
March 6, 2025
1.1 Objectives
After completing this lab you will be able to:
• Use the pandas library to create DataFrame and Series objects
• Locate data in the DataFrame using loc() and iloc() functions
• Use slicing
Collecting pandas
Downloading
pandas-2.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
(89 kB)
Collecting numpy>=1.26.0 (from pandas)
Downloading
numpy-2.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
(62 kB)
Requirement already satisfied: python-dateutil>=2.8.2 in
/opt/conda/lib/python3.12/site-packages (from pandas) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in /opt/conda/lib/python3.12/site-
packages (from pandas) (2024.2)
Collecting tzdata>=2022.7 (from pandas)
Downloading tzdata-2025.1-py2.py3-none-any.whl.metadata (1.4 kB)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.12/site-
packages (from python-dateutil>=2.8.2->pandas) (1.17.0)
Downloading
pandas-2.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.7 MB)
12.7/12.7 MB 118.7 MB/s eta 0:00:00
Downloading
numpy-2.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.1 MB)
16.1/16.1 MB 150.7 MB/s eta 0:00:00
Downloading tzdata-2025.1-py2.py3-none-any.whl (346 kB)
Installing collected packages: tzdata, numpy, pandas
Successfully installed numpy-2.2.3 pandas-2.2.3 tzdata-2025.1
Once you have imported pandas, you can use its built-in functions to create and analyze data.
In this practice lab, we will learn how to create a DataFrame from a dictionary.
Let us consider a dictionary ‘x’ whose keys are column names and whose values are lists of column entries.
We then create a dataframe from the dictionary using the function pd.DataFrame(x).
  Name ID Department Salary
...
3 Mary 4 Infrastructure 60000
We can see the direct correspondence between the dictionary and the table. The keys correspond to
the column labels, and each list of values fills that column, one entry per row.
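The dictionary-to-DataFrame step described above can be sketched as follows. Only Mary's row (ID 4, Infrastructure, 60000) and the names Rose, John and Jane are visible in this lab's text, so the remaining departments and salaries are illustrative placeholders, not the lab's confirmed data.

```python
import pandas as pd

# Assumption: only Mary's row is visible in the lab output; the other
# departments and salaries below are illustrative placeholders.
x = {
    "Name": ["Rose", "John", "Jane", "Mary"],
    "ID": [1, 2, 3, 4],
    "Department": ["Architect Group", "Software Group", "Design Team", "Infrastructure"],
    "Salary": [100000, 80000, 50000, 60000],
}

# Each dictionary key becomes a column label; each list fills that column.
df = pd.DataFrame(x)

print(df)
print(list(df.columns))  # column labels come from the dictionary keys
print(df.shape)          # (number of rows, number of columns)
```

Because no index is supplied, pandas assigns the default integer row labels 0, 1, 2, 3.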
Column Selection: To select a column in a Pandas DataFrame, we can access it by its column
name.
Let’s retrieve the data present in the ID column.
[4]: ID
0 1
1 2
2 3
3 4
Let’s use the type() function and check the type of the variable.
[5]: pandas.core.frame.DataFrame
The output shows us that the type of the variable is a DataFrame object.
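The cell that produced this output is not shown here. A minimal sketch, assuming the column was selected with double square brackets (a list containing one column name), which is the form that returns a DataFrame:

```python
import pandas as pd

# Assumption: a reduced version of the lab's dataframe, enough to show the idea.
df = pd.DataFrame({
    "Name": ["Rose", "John", "Jane", "Mary"],
    "ID": [1, 2, 3, 4],
})

# Passing a list of column names (note the double brackets) keeps the
# result two-dimensional, so it is still a DataFrame.
x = df[["ID"]]

print(x)
print(type(x))
print(x.shape)  # four rows, one column
```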
Access to multiple columns: Let us retrieve the data for the Department, Salary and ID columns.
z = df[['Department','Salary','ID']]
z
a = {'Student':['David', 'Samuel', 'Terry', 'Evan'],
'Age':['27', '24', '22', '32'],
'Country':['UK', 'Canada', 'China', 'USA'],
'Course':['Python','Data Structures','Machine Learning','Web Development'],
'Marks':['85','72','89','76']}
df1 = pd.DataFrame(a)
df1
Problem 3: Retrieve the Country and Course columns and assign them to a variable c
[9]: #write your code here
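One possible solution is sketched below. It rebuilds df1 from the dictionary given in this lab so that it runs on its own; the variable name c comes from the problem statement.

```python
import pandas as pd

# Same data as the lab's dictionary 'a'.
a = {
    "Student": ["David", "Samuel", "Terry", "Evan"],
    "Age": ["27", "24", "22", "32"],
    "Country": ["UK", "Canada", "China", "USA"],
    "Course": ["Python", "Data Structures", "Machine Learning", "Web Development"],
    "Marks": ["85", "72", "89", "76"],
}
df1 = pd.DataFrame(a)

# A list of column names selects several columns at once, in the order listed.
c = df1[["Country", "Course"]]
print(c)
```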
[10]: # Get the Student column as a Series object
x = df1['Student']
x
The output shows us that the type of the variable is a Series object.
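The type check itself is not shown in the text. A minimal sketch, assuming type() is applied to a column selected with single square brackets:

```python
import pandas as pd

# A reduced version of the lab's df1, using two of its columns.
df1 = pd.DataFrame({
    "Student": ["David", "Samuel", "Terry", "Evan"],
    "Country": ["UK", "Canada", "China", "USA"],
})

# Single brackets with one column name return a one-dimensional Series.
x = df1["Student"]

print(type(x))
print(x.name)   # the Series keeps the column label as its name
print(x.shape)  # one dimension only
```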
1.1.3 Exercise 2: loc() and iloc() functions
loc() is a label-based selection method, which means that we pass the label (name) of the row
or column that we want to select. When a range of labels is passed, loc() includes the last
element of the range.
Simple syntax for your understanding:
• loc[row_label, column_label]
iloc() is an integer-position-based selection method, which means that we pass an integer
position (counting from 0) to select a specific row or column. When a range is passed, iloc()
does not include the last element of the range.
Simple syntax for your understanding:
• iloc[row_index, column_index]
Let us look at some examples.
[ ]: # Access the value on the first row and the first column
df.iloc[0, 0]
[ ]: # Access the value on the first row and the third column
df.iloc[0,2]
[ ]: # Access the value in the row labelled 0 and the column labelled 'Salary'
df.loc[0, 'Salary']
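The three lookups above can be run together as in this sketch. The dataframe is rebuilt with placeholder values for every row except Mary's, so the printed department and salary for row 0 are assumptions.

```python
import pandas as pd

# Assumption: departments and salaries other than Mary's are placeholders.
df = pd.DataFrame({
    "Name": ["Rose", "John", "Jane", "Mary"],
    "ID": [1, 2, 3, 4],
    "Department": ["Architect Group", "Software Group", "Design Team", "Infrastructure"],
    "Salary": [100000, 80000, 50000, 60000],
})

first_cell = df.iloc[0, 0]    # row position 0, column position 0 (Name)
third_col = df.iloc[0, 2]     # row position 0, column position 2 (Department)
salary = df.loc[0, "Salary"]  # row label 0, column label "Salary"

print(first_cell, third_col, salary)
```

With the default integer index, row label 0 and row position 0 refer to the same row, which is why loc and iloc agree on the row here.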
Let us create a new dataframe called ‘df2’ and assign ‘df’ to it. Now, let us set the “Name” column
as an index column using the method set_index().
[ ]: df2=df
df2=df2.set_index("Name")
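A self-contained sketch of the same step follows, again with placeholder data for rows other than Mary's. It also checks that set_index() returns a new dataframe and leaves df unchanged.

```python
import pandas as pd

# Assumption: departments and salaries other than Mary's are placeholders.
df = pd.DataFrame({
    "Name": ["Rose", "John", "Jane", "Mary"],
    "ID": [1, 2, 3, 4],
    "Department": ["Architect Group", "Software Group", "Design Team", "Infrastructure"],
    "Salary": [100000, 80000, 50000, 60000],
})

# set_index() returns a new dataframe whose row labels are the Name values.
df2 = df.set_index("Name")

print(list(df2.index))    # row labels are now names
print(list(df2.columns))  # Name is no longer a regular column in df2
print(list(df.columns))   # the original df still has its Name column

# Rows can now be selected by name with loc.
print(df2.loc["Mary", "Salary"])
```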
Use the iloc() function to get the Salary of Mary in the newly created dataframe df2.
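One possible approach is sketched below, assuming Mary is the fourth row and Salary is the third column of df2 once Name has become the index. The positions are looked up with get_loc() rather than hard-coded, so the sketch still works if the row or column order differs.

```python
import pandas as pd

# Assumption: departments and salaries other than Mary's are placeholders.
df = pd.DataFrame({
    "Name": ["Rose", "John", "Jane", "Mary"],
    "ID": [1, 2, 3, 4],
    "Department": ["Architect Group", "Software Group", "Design Team", "Infrastructure"],
    "Salary": [100000, 80000, 50000, 60000],
})
df2 = df.set_index("Name")

# iloc needs integer positions, so translate the labels into positions first.
row_pos = df2.index.get_loc("Mary")
col_pos = df2.columns.get_loc("Salary")

mary_salary = df2.iloc[row_pos, col_pos]
print(row_pos, col_pos, mary_salary)
```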
[ ]: # let us do the slicing using iloc() function on old dataframe df
df.iloc[0:2, 0:3]
[ ]: # slice the old dataframe df with loc(); its index labels are integers, and 0:2 selects labels 0, 1 and 2
df.loc[0:2,'ID':'Department']
[ ]: # slice the new dataframe df2 with loc(); its index column is Name, and 'Rose':'Jane' selects Rose, John and Jane
df2.loc['Rose':'Jane', 'ID':'Department']
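The three slices above differ in whether the end of the range is included. The sketch below runs them on a rebuilt dataframe (placeholder values except for Mary's row) and prints the shape of each result to make the difference visible.

```python
import pandas as pd

# Assumption: departments and salaries other than Mary's are placeholders.
df = pd.DataFrame({
    "Name": ["Rose", "John", "Jane", "Mary"],
    "ID": [1, 2, 3, 4],
    "Department": ["Architect Group", "Software Group", "Design Team", "Infrastructure"],
    "Salary": [100000, 80000, 50000, 60000],
})
df2 = df.set_index("Name")

# iloc: positions, end excluded -> rows 0-1, columns 0-2
by_position = df.iloc[0:2, 0:3]

# loc: labels, end included -> rows 0-2, columns ID through Department
by_label = df.loc[0:2, "ID":"Department"]

# loc on df2: string labels, end included -> Rose, John, Jane
by_name = df2.loc["Rose":"Jane", "ID":"Department"]

print(by_position.shape, by_label.shape, by_name.shape)
```

The same-looking range 0:2 yields two rows with iloc but three rows with loc, because only loc includes the end label.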
Try it yourself
Using the loc() function, slice the old dataframe df to retrieve the Name, ID and Department
columns for the rows with index labels 2 and 3.
[ ]: # Write your code below and press Shift+Enter to execute
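One possible solution is sketched below, assuming the column order Name, ID, Department, Salary. The values for Jane's row are placeholders; only Mary's row is visible in this lab's output.

```python
import pandas as pd

# Assumption: departments and salaries other than Mary's are placeholders.
df = pd.DataFrame({
    "Name": ["Rose", "John", "Jane", "Mary"],
    "ID": [1, 2, 3, 4],
    "Department": ["Architect Group", "Software Group", "Design Team", "Infrastructure"],
    "Salary": [100000, 80000, 50000, 60000],
})

# loc slices include both ends: row labels 2 and 3, columns Name through Department.
result = df.loc[2:3, "Name":"Department"]
print(result)
```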
Congratulations, you have completed this lesson and the practice lab on Pandas.
1.2 Author(s):
Appalabhaktula Hema
© IBM Corporation 2022. All rights reserved.
Change Log
Date (YYYY-MM-DD)   Version   Changed By            Change Description
2022-03-31          0.1       Appalabhaktula Hema   Created initial version