
Data Analysis

The document discusses data analysis and the Python library Pandas. It provides an agenda that covers what data analysis is, an introduction to Pandas, and how to perform data analysis with Pandas. Examples are given on reading data, exploring and cleaning data, and conducting basic analyses using Pandas functions and operations on DataFrames and Series.

Lesson 12:

Data Analysis
•Attendance:
• Link: Gitter.im | Code: ????
•Class Chat:
• https://gitter.im/IST256/Fudge
•Participation
• http://ist256.participoll.com/
Agenda
• What is Data Analysis?
• What is Pandas?
• How to perform data analysis with Pandas
You've Read:
• Readings online.
Questions? Ask in Our Course Chat!

https://gitter.im/IST256/Fudge
Connect Activity
Question:
The process of systematically applying techniques to evaluate data is known as?
A. Data Munging
B. Data Analysis
C. Data Science
D. Data Bases
Data Analysis:
What is it?
• Apply logical techniques to
• Describe, condense, recap and evaluate data and
• Illustrate information
Goals of Data Analysis:
1. Discover useful information
2. Provide insights
3. Suggest conclusions
4. Support decision making
What is pandas?
• Pandas is a Python package for data analysis.
• It provides built-in data structures which simplify the manipulation and analysis of data sets.
• Pandas is easy to use and powerful, but "with great power comes great responsibility."
• We cannot teach you all things Pandas, so we must focus on how it works; that way you can figure out the rest on your own.
• http://pandas.pydata.org/pandas-docs/stable/
Pandas: Essential Concepts
• A Series is a named Python list (think of a dict with a list as its value):
{ 'grades' : [50, 90, 100, 45] }

• A DataFrame is a dictionary of Series (a dict with one list per column):

{ 'names' : ['bob', 'ken', 'art', 'joe'],
  'grades' : [50, 90, 100, 45]
}
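The two concepts above can be sketched in actual pandas code. This is a minimal illustration that reuses the slide's sample values; the variable names `grades` and `df` are only illustrative choices.

```python
import pandas as pd

# A Series: a one-dimensional labeled array, here given the name "grades"
grades = pd.Series([50, 90, 100, 45], name="grades")

# A DataFrame built from a dict: each key becomes a column name,
# and each list becomes that column's values
df = pd.DataFrame({
    "names": ["bob", "ken", "art", "joe"],
    "grades": [50, 90, 100, 45],
})

print(grades)
print(df)

# Each column pulled out of a DataFrame is itself a Series
print(type(df["grades"]).__name__)
```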
Watch Me Code 1
Pandas Basics
• Series
• DataFrame
• Creating a DataFrame from a dict
• Select columns, Select rows with Boolean indexing
Check Yourself: Series or DataFrame?
Match the code to the result. One result is a Series, the other a DataFrame.
1. df['Quarter']
2. df[ ['Quarter'] ]

A. Series B. DataFrame
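One way to check this yourself is to ask Python for the type of each expression. The DataFrame below is hypothetical (the slide's actual data is not shown); only the column names `Quarter` and `Sold` come from the slides.

```python
import pandas as pd

# Hypothetical data: these values are made up for illustration
df = pd.DataFrame({
    "Quarter": ["Q1", "Q2", "Q3", "Q4"],
    "Sold": [100, 120, 90, 150],
})

single = df["Quarter"]      # a column label inside one pair of brackets
double = df[["Quarter"]]    # a list of column labels inside the brackets

print(type(single).__name__)
print(type(double).__name__)
print(double.shape)
```

Selecting with a single label returns one column, while selecting with a list of labels returns a table, even when the list holds only one label.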


Check Yourself: Boolean Index
Which rows are included in this
Boolean index?
df[ df['Sold'] < 110 ]
A. 0, 1, 2
B. 1, 2, 3
C. 0, 1
D. 2, 3
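A Boolean index works in two steps: the comparison produces a Series of True/False values, and the outer brackets keep only the rows marked True. The sketch below uses made-up `Sold` values, so it does not reproduce the quiz's data or settle its answer.

```python
import pandas as pd

# Hypothetical data: not the DataFrame pictured on the slide
df = pd.DataFrame({
    "Quarter": ["Q1", "Q2", "Q3", "Q4"],
    "Sold": [100, 120, 90, 150],
})

# Step 1: the comparison yields one True/False value per row
mask = df["Sold"] < 110
print(mask.tolist())

# Step 2: indexing with the mask keeps only the True rows
selected = df[mask]
print(selected.index.tolist())
```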
Watch Me Code 2
Data Analysis of Superhero Movies:
• read_csv() a file from the web
• No column names
• head(), sample()
• value_counts()
• Dealing with nulls
• Feature engineering
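The steps in this list might look roughly like the sketch below. It is not the class demo: the CSV text, the column names, and the `decade` feature are all assumptions made for illustration, and an in-memory string stands in for the file on the web so the example runs offline.

```python
import io
import pandas as pd

# Made-up stand-in for the CSV file; note there is no header row
raw = io.StringIO(
    "Movie A,Studio X,2008,585.2\n"
    "Movie B,Studio Y,2008,\n"
    "Movie C,Studio X,2011,449.3\n"
    "Movie D,Studio Y,2013,668.0\n"
)

# header=None says the first line is data; names= supplies the column names
# (read_csv also accepts a URL string in place of the file object)
df = pd.read_csv(raw, header=None, names=["title", "studio", "year", "gross"])

print(df.head(2))                     # first two rows
print(df.sample(2, random_state=0))   # two random rows, seeded for repeatability

# value_counts: how many rows each studio has
counts = df["studio"].value_counts()
print(counts)

# Dealing with nulls: count them, then drop rows that lack a gross value
missing = df["gross"].isna().sum()
clean = df.dropna(subset=["gross"])

# Feature engineering: derive a new column from an existing one
df["decade"] = (df["year"] // 10) * 10
```

Dropping rows is only one way to handle nulls; `fillna()` is the usual alternative when a sensible default value exists.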
End-To-End Example
Data Analysis of iSchool Classes
• What percentage of the schedule is undergrad?
• How many undergrad classes are on Friday? Or at 8 AM?

https://ischool.syr.edu/classes
• read_html()
• append() (removed in pandas 2.0; use pd.concat() instead)
• Engineer Grad / Undergrad
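A rough sketch of this workflow follows, under several assumptions. The two small DataFrames are hypothetical stand-ins for tables that `pd.read_html()` would return from the schedule page, the column names `Course` and `Day` are invented, and the rule that course numbers of 500 and above are graduate is an assumption, not something stated in the slides. `pd.concat()` is used for combining because `DataFrame.append()` was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical stand-ins for tables scraped with pd.read_html(url)
page1 = pd.DataFrame({"Course": ["IST256", "IST659"], "Day": ["F", "M"]})
page2 = pd.DataFrame({"Course": ["IST195", "IST722"], "Day": ["F", "W"]})

# Stack the tables into one schedule with a fresh 0..n-1 index
classes = pd.concat([page1, page2], ignore_index=True)

# Engineer a Grad / Undergrad column from the course number
# (assumed rule: 500 and above is Grad)
number = classes["Course"].str.extract(r"(\d+)", expand=False).astype(int)
classes["Level"] = number.apply(lambda n: "Grad" if n >= 500 else "Undergrad")

# The mean of a Boolean Series is the fraction of True values
is_undergrad = classes["Level"] == "Undergrad"
pct_undergrad = is_undergrad.mean() * 100

# Combine two conditions with & to count undergrad classes on Friday
friday_undergrad = (is_undergrad & (classes["Day"] == "F")).sum()

print(classes)
print(pct_undergrad, friday_undergrad)
```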
Conclusion Activity
"1 Important thing"

Explain one important thing you learned today!
