Intro to Pandas for Data Science

Andres Vourakis
Data Scientist & Mentor
@andresvourakis

Understanding Pandas will allow you to manipulate and analyze data efficiently, enabling you to derive insights quickly and prepare data for advanced analysis.
Core Pandas Concepts

Here is what we'll cover:

1. DataFrames and Series
2. Data Selection and Filtering
3. Data Cleaning and Transformation
4. Merging and Joining DataFrames
5. Aggregation and Grouping
6. Working with Dates and Times

DataFrames and Series
Pandas' core structures allow for powerful data manipulation and analysis. A Series is a single labeled column; a DataFrame is a table made of Series that share one index.

Series: Visits
0    20
1    40
2    10
3    90

Series: Page
0    /home
1    /about
2    /contact
3    /projects

DataFrame
   Page       Visits
0  /home          20
1  /about         40
2  /contact       10
3  /projects      90

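As a minimal sketch, the two Series and the DataFrame shown above could be built like this. The variable names (visits, pages, df) are illustrative and not from the slides.

```python
import pandas as pd

# A Series is a single labeled column of values.
visits = pd.Series([20, 40, 10, 90], name="Visits")
pages = pd.Series(["/home", "/about", "/contact", "/projects"], name="Page")

# A DataFrame is a table whose columns are Series sharing one index.
df = pd.DataFrame({"Page": pages, "Visits": visits})

print(df)
```

Selecting one column, as in df["Visits"], gives back a Series again.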
Data Selection and Filtering

Use .loc[] to filter data by row and column labels or by conditions, and .iloc[] to filter data by row and column positions.

Before Filtering
   Page       Visits
0  /home          20
1  /about         40
2  /contact       10
3  /projects      90

After Filtering
   Page
1  /about
3  /projects

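The slide does not say which filter produced the "After Filtering" table, so the sketch below assumes a condition (Visits of at least 40) that happens to give the same rows, and then shows the equivalent selection by position with .iloc[].

```python
import pandas as pd

df = pd.DataFrame({
    "Page": ["/home", "/about", "/contact", "/projects"],
    "Visits": [20, 40, 10, 90],
})

# .loc[] selects by label or boolean condition.
# The threshold of 40 is an assumption chosen to reproduce the slide's result.
by_condition = df.loc[df["Visits"] >= 40, ["Page"]]

# .iloc[] selects by integer position: rows 1 and 3, column 0.
by_position = df.iloc[[1, 3], [0]]

print(by_condition)
```

Both selections keep the original row labels (1 and 3), which is why the filtered table is not renumbered from 0.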
Data Cleaning and Transformation

Handle missing data and reshape values with functions like:

fillna() to replace NaNs.
dropna() to remove NaNs.
apply() and map() for transformations.

Pandas simplifies the process of preparing your data for analysis.
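A short sketch of these four functions on the Page/Visits data, with one visit count deliberately set to NaN for illustration. The Section and Label columns are hypothetical and only there to show map() and apply().

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Page": ["/home", "/about", "/contact", "/projects"],
    "Visits": [20, np.nan, 10, 90],  # one missing value, added for illustration
})

filled = df.fillna({"Visits": 0})       # replace NaN in Visits with 0
dropped = df.dropna(subset=["Visits"])  # drop rows where Visits is NaN

# map() transforms a Series element by element.
filled["Section"] = filled["Page"].map(lambda p: p.strip("/"))

# apply() with axis=1 calls the function once per row.
filled["Label"] = filled.apply(
    lambda row: f"{row['Section']}: {int(row['Visits'])}", axis=1
)

print(filled)
```

Whether to fill or drop depends on the analysis: filling with 0 keeps the row but changes averages, while dropping removes the page entirely.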
Merging and Joining DataFrames

Combine datasets using:

merge() for relational joins.
concat() to concatenate DataFrames.
join() for combining on index.

Pandas provides powerful functions to work with multiple datasets.
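The three functions can be compared side by side in a small sketch. The owners table and its names are made up for this example and do not appear in the slides.

```python
import pandas as pd

visits = pd.DataFrame({
    "Page": ["/home", "/about", "/contact"],
    "Visits": [20, 40, 10],
})
# Hypothetical second dataset, invented for illustration.
owners = pd.DataFrame({
    "Page": ["/home", "/about", "/projects"],
    "Owner": ["Ana", "Ben", "Cy"],
})

# merge(): relational join on a shared column (inner join by default).
merged = visits.merge(owners, on="Page")

# concat(): stack DataFrames with the same columns on top of each other.
more_visits = pd.DataFrame({"Page": ["/projects"], "Visits": [90]})
stacked = pd.concat([visits, more_visits], ignore_index=True)

# join(): combine on the index; keeps every row of the left frame by default.
joined = visits.set_index("Page").join(owners.set_index("Page"))

print(merged)
```

Here merged keeps only the pages present in both tables, while joined keeps all three pages from visits and leaves Owner as NaN for /contact.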
Aggregation and Grouping

Summarize data with:

groupby() to group data based on one or more columns.
Aggregation functions like mean(), sum(), count(), etc.

Before Grouping
   Category  Page       Visits
0  Main      /home          20
1  Main      /about         40
2  Support   /contact       10
3  Main      /projects      90

After Grouping
   Category  Visits
0  Main         150
1  Support       10
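The totals in the "After Grouping" table (150 and 10) match a sum of Visits per Category, so a sketch of that grouping could look like the following. The summary variable is an extra illustration of applying several aggregations at once.

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["Main", "Main", "Support", "Main"],
    "Page": ["/home", "/about", "/contact", "/projects"],
    "Visits": [20, 40, 10, 90],
})

# Group rows by Category and add up Visits within each group.
# as_index=False keeps Category as a regular column, as in the slide.
totals = df.groupby("Category", as_index=False)["Visits"].sum()
print(totals)

# Several aggregations can be computed in one pass with agg().
summary = df.groupby("Category")["Visits"].agg(["sum", "mean", "count"])
```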
Working with Dates and Times

Pandas supports powerful time series functionality:

to_datetime() for converting strings to datetime objects.
resample() for time-based groupings.

Before
   Event    Date
0  Event 1  2024-08-22
1  Event 2  2024-08-23
2  Event 3  2024-08-24

After
   Event    Date        Day of Week
0  Event 1  2024-08-22  Thursday
1  Event 2  2024-08-23  Friday
2  Event 3  2024-08-24  Saturday
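One way to get from the first events table to the second is sketched below. The slide does not name the method used for the Day of Week column; the .dt.day_name() accessor is an assumption that produces the same values. The weekly resample at the end is an added illustration of resample().

```python
import pandas as pd

df = pd.DataFrame({
    "Event": ["Event 1", "Event 2", "Event 3"],
    "Date": ["2024-08-22", "2024-08-23", "2024-08-24"],
})

# Convert the date strings to datetime values.
df["Date"] = pd.to_datetime(df["Date"])

# Derive the weekday name from each date.
df["Day of Week"] = df["Date"].dt.day_name()
print(df)

# resample() needs a datetime index; "W" groups rows into weekly bins
# that end on Sunday.
events_per_week = df.set_index("Date")["Event"].resample("W").count()
```

All three events fall in the week ending Sunday 2024-08-25, so the resampled result has a single bin.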


Bonus Tip: PandasAI

PandasAI is a Python library that integrates generative artificial intelligence capabilities into pandas, making dataframes conversational.

Andres Vourakis
Data Scientist & Mentor

Follow for more Data Science content
