DATA100 - Pandas Introduction - Basics

Uploaded by Robert Nelson

Data Preparation with Python
11 Pandas
Intro to Pandas

Pandas 2.0 and Open Private API


APIs (Yfinance)
(Lyrics Genius)
MIDTERM EXAM!!
Review Class 6pm to 7:30pm (If no LEAP)
Review Class 12 noon to 1:30pm (If with LEAP)
Pandas??

Animal
Pandas is an easy-to-use, fast, flexible, and powerful open-source Python library for working with “relational” or “labeled” data. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python by offering data structures and operations for manipulating tables.
The name comes from PANel DAta.
How to start using it?

import pandas as pd

We import the pandas library with the code above.

pd is simply an alias for the pandas library. We use it because we will be calling pandas a lot.
Data Structures

Series
● A one-dimensional array-like object with an index and values, similar to a NumPy array, and capable of holding any data type.

index  colleges
1      COB
2      SOE
3      CCS
Creating Series

Transfer to the Jupyter Notebook for more samples and exercises.
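As a minimal sketch of creating a Series, the table shown earlier (index 1 to 3, colleges COB, SOE, CCS) could be built as follows. The variable name colleges is only chosen for illustration.

```python
import pandas as pd

# Build a Series from a list of values.
# The index labels 1, 2, 3 mirror the table on the Series slide.
colleges = pd.Series(["COB", "SOE", "CCS"], index=[1, 2, 3], name="colleges")

print(colleges)

# .loc looks a value up by its index label.
print(colleges.loc[2])
```

If no index is passed, pandas assigns the default labels 0, 1, 2, and so on.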
Exercise - Creating Series
Data Typecasting

Typecasting: changing a variable's type to another type

1 -> ‘1’ -> 1.0
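The chain above (integer to string to float) can be sketched on a Series with .astype(). The variable names here are illustrative only.

```python
import pandas as pd

numbers = pd.Series([1, 2, 3])      # integer values

as_text = numbers.astype(str)       # 1 -> '1'
as_float = as_text.astype(float)    # '1' -> 1.0

print(as_text.iloc[0], as_float.iloc[0])
```

Each .astype() call returns a new Series; numbers itself keeps its integer type.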


Data Structures

DataFrame
● A two-dimensional labelled data structure with columns of potentially different data types.
● It is essentially a set of Series objects, one per column, that share the same index.

index  college  course
1      COB      MGT
2      SOE      AE-MGT
3      CCS      ST
Creating DataFrame
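One common way to create a DataFrame is from a dictionary that maps column names to lists of values. The sketch below rebuilds the college/course table from the DataFrame slide; the name df is just a convention.

```python
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values.
df = pd.DataFrame(
    {
        "college": ["COB", "SOE", "CCS"],
        "course": ["MGT", "AE-MGT", "ST"],
    },
    index=[1, 2, 3],  # row labels, matching the slide's table
)

print(df)
```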
Exercise – Creating DataFrame

Expected Output
Important DataFrame Functions to Know

● Selecting Columns
● Typecasting
● Creating New Column
● Deleting a Column
● Editing Specific Values
● Reindexing
● Filtering
● Sorting
Selecting a Column

To select a column, put the column name inside square brackets after the DataFrame. Notice that pandas returns a Series.

You can treat the selected column as a Series: every calculation that applies to a Series also applies to a selected column.
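A short sketch of column selection follows. The "units" column and its values are made up for illustration and do not come from the slides.

```python
import pandas as pd

df = pd.DataFrame(
    {"college": ["COB", "SOE", "CCS"], "units": [3, 6, 9]},  # "units" is hypothetical
    index=[1, 2, 3],
)

# A single column name in brackets returns a Series.
college = df["college"]
print(type(college))

# Series calculations work directly on the selected column.
total_units = df["units"].sum()

# A list of column names in brackets returns a DataFrame instead.
both = df[["college", "units"]]
```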
Typecasting a Column

As with a Series, we can typecast a column using the .astype() method. .astype() returns a new Series rather than changing the column in place, so to keep the converted data we need to assign the result back to the column.
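The sketch below shows why the assignment matters. The "units" column, stored as text here, is a hypothetical example.

```python
import pandas as pd

df = pd.DataFrame(
    {"college": ["COB", "SOE", "CCS"], "units": ["3", "6", "9"]}  # units stored as strings
)

# Calling .astype() alone returns a converted copy; df is not changed.
df["units"].astype(int)
before = df["units"].iloc[0]  # still the string '3'

# Assigning the result back to the column makes the change stick.
df["units"] = df["units"].astype(int)

print(df["units"].sum())
```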
Creating and Deleting Columns
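A minimal sketch of both operations, assuming the college/course table from the DataFrame slide. The "students" column and its numbers are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"college": ["COB", "SOE", "CCS"], "course": ["MGT", "AE-MGT", "ST"]}
)

# Create a column by assigning a list (one value per row) to a new name.
df["students"] = [120, 80, 95]  # hypothetical values

# Create a column from existing columns.
df["label"] = df["college"] + "-" + df["course"]

# Delete a column; drop() returns a new DataFrame, so store the result.
df = df.drop(columns=["students"])

print(df)
```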
Editing A Specific Value
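One way to edit a single cell is .loc with a row label and a column name. This is a sketch; the replacement value "AE-FIN" is made up.

```python
import pandas as pd

df = pd.DataFrame(
    {"college": ["COB", "SOE", "CCS"], "course": ["MGT", "AE-MGT", "ST"]},
    index=[1, 2, 3],
)

# .loc[row label, column name] addresses exactly one cell.
df.loc[2, "course"] = "AE-FIN"  # hypothetical new value

print(df)
```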
Dropping Rows
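Rows can be removed by index label with drop(). The sketch below assumes the college/course table used earlier.

```python
import pandas as pd

df = pd.DataFrame(
    {"college": ["COB", "SOE", "CCS"], "course": ["MGT", "AE-MGT", "ST"]},
    index=[1, 2, 3],
)

# Drop the row whose index label is 2.
# drop() returns a new DataFrame and leaves df untouched.
without_soe = df.drop(index=[2])

print(without_soe)
```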
Reindexing
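The slide's own example is not available, so the sketch below shows two common index operations as an assumption about what is meant: renumbering rows with reset_index() and using a column as the index with set_index().

```python
import pandas as pd

df = pd.DataFrame(
    {"college": ["COB", "SOE", "CCS"], "course": ["MGT", "AE-MGT", "ST"]},
    index=[1, 2, 3],
)

# After dropping a row, the index has a gap: [1, 3].
kept = df.drop(index=[2])

# reset_index(drop=True) renumbers the rows from 0 and discards the old labels.
renumbered = kept.reset_index(drop=True)

# set_index() turns an existing column into the row labels.
by_college = df.set_index("college")

print(renumbered)
print(by_college)
```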
Filtering
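Filtering is usually done with a boolean mask. In this sketch the "units" column is a made-up example.

```python
import pandas as pd

df = pd.DataFrame(
    {"college": ["COB", "SOE", "CCS"], "units": [3, 6, 9]},  # "units" is hypothetical
    index=[1, 2, 3],
)

# A comparison on a column yields a boolean Series (a mask).
mask = df["units"] > 3

# Indexing with the mask keeps only the rows where it is True.
more_than_three = df[mask]

# Combine conditions with & (and) or | (or).
# Each condition needs its own parentheses.
middle = df[(df["units"] > 3) & (df["units"] < 9)]

print(more_than_three)
print(middle)
```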
Sorting
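Sorting by a column can be sketched with sort_values(). The "units" column is again a made-up example.

```python
import pandas as pd

df = pd.DataFrame(
    {"college": ["COB", "SOE", "CCS"], "units": [3, 6, 9]},  # "units" is hypothetical
    index=[1, 2, 3],
)

# ascending=False sorts from largest to smallest.
by_units_desc = df.sort_values("units", ascending=False)

# The default order is ascending (alphabetical for text).
by_college = df.sort_values("college")

print(by_units_desc)
print(by_college)
```

sort_values() returns a new DataFrame, so store the result if you want to keep the sorted order.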
Exercise

● Create two Series
○ series_a : numbers from 0 to 99 (col_1)
○ series_b : numbers from 0.00 to 0.99 (col_2)
■ Clue: you can divide series_a by a certain number
● Combine the two series into a DataFrame (df_exercise)
○ Column One : series_a
○ Column Two : series_b
○ Create Column Three :
■ Multiply Column One and Column Two
■ To do this, select column 1 and select column 2
■ Use the * sign to multiply the two columns
○ Create Column Four : convert the data type of Column Three from float to string
○ Sort the values from LARGEST to SMALLEST based on Column One
○ Filter with the following conditions
■ Column 1 values should be greater than 30
■ AND
■ Column 2 values should be less than 0.85
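One possible approach to the exercise is sketched below. It is not the official solution. The names series_a, series_b, df_exercise, col_1 and col_2 come from the exercise text; col_3, col_4, df_sorted and df_filtered are assumed names.

```python
import pandas as pd

# Two Series: 0 to 99, and the same numbers divided by 100.
series_a = pd.Series(range(100))
series_b = series_a / 100

# Combine the two Series into a DataFrame.
df_exercise = pd.DataFrame({"col_1": series_a, "col_2": series_b})

# Column Three: multiply Column One by Column Two.
df_exercise["col_3"] = df_exercise["col_1"] * df_exercise["col_2"]

# Column Four: Column Three converted from float to string.
df_exercise["col_4"] = df_exercise["col_3"].astype(str)

# Sort from largest to smallest based on Column One.
df_sorted = df_exercise.sort_values("col_1", ascending=False)

# Keep rows where col_1 > 30 AND col_2 < 0.85.
df_filtered = df_sorted[(df_sorted["col_1"] > 30) & (df_sorted["col_2"] < 0.85)]

print(df_filtered.head())
```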
