The Series Data Structure: Import Pandas As PD

1) The document discusses pandas Series and DataFrame data structures. It shows how to create Series from lists, dictionaries, and arrays. It also demonstrates indexing, querying, and modifying Series.
2) DataFrames are introduced as 2D data structures that allow storing and manipulating tabular data. The document shows how to create DataFrames from Series, load data from CSV files, and perform indexing and querying of DataFrames.
3) Examples demonstrate common operations on Series and DataFrames like filtering, aggregating, renaming columns, and changing the index.

Uploaded by

Bhanu Jha

7/14/2020 Week 2

You are currently looking at version 1.0 of this notebook. To download notebooks and datafiles, as
well as get help on Jupyter notebooks in the Coursera platform, visit the Jupyter Notebook FAQ
(https://www.coursera.org/learn/python-data-analysis/resources/0dhYG) course resource.

The Series Data Structure

In [ ]: import pandas as pd
pd.Series?

In [ ]: animals = ['Tiger', 'Bear', 'Moose']

pd.Series(animals)

In [ ]: numbers = [1, 2, 3]
pd.Series(numbers)

In [ ]: animals = ['Tiger', 'Bear', None]

pd.Series(animals)

In [ ]: numbers = [1, 2, None]

pd.Series(numbers)

In [ ]: import numpy as np
np.nan == None

In [ ]: np.nan == np.nan

In [ ]: np.isnan(np.nan)
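The three comparisons above suggest why equality tests cannot be used to find missing values: NaN is not equal to None, and not even equal to itself. A minimal sketch of the same idea on a Series follows; the variable names `equal_to_nan` and `missing_mask` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

# None in a numeric list is stored as NaN, so the Series becomes float64.
numbers = pd.Series([1, 2, None])

# Element-wise equality never matches NaN, not even against np.nan itself.
equal_to_nan = (numbers == np.nan)

# isna() is the reliable test for missing values.
missing_mask = numbers.isna()

print(numbers.dtype)
print(equal_to_nan.tolist())
print(missing_mask.tolist())
```

Only the `isna()` mask flags the last element as missing.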

In [ ]: sports = {'Archery': 'Bhutan',

'Golf': 'Scotland',
'Sumo': 'Japan',
'Taekwondo': 'South Korea'}
s = pd.Series(sports)
s

In [ ]: s.index

In [ ]: s = pd.Series(['Tiger', 'Bear', 'Moose'], index=['India', 'America', 'Canada'])

https://mgogycictijcrtwfnflsmu.coursera-apps.org/notebooks/Week 2.ipynb 1/8

In [ ]: sports = {'Archery': 'Bhutan',

'Golf': 'Scotland',
'Sumo': 'Japan',
'Taekwondo': 'South Korea'}
s = pd.Series(sports, index=['Golf', 'Sumo', 'Hockey'])
s
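When an explicit index is passed together with a dictionary, pandas keeps only the requested labels and fills labels that are absent from the dictionary with NaN. A small sketch that checks this for the cell above (the `is_missing` name is illustrative):

```python
import pandas as pd

sports = {'Archery': 'Bhutan',
          'Golf': 'Scotland',
          'Sumo': 'Japan',
          'Taekwondo': 'South Korea'}

# Only the labels listed in index are kept; 'Hockey' has no entry in the dict.
s = pd.Series(sports, index=['Golf', 'Sumo', 'Hockey'])

# True marks labels that were requested but not found in the dictionary.
is_missing = s.isna()

print(list(s.index))
print(is_missing.tolist())
```

'Archery' and 'Taekwondo' are dropped because they were not requested, and 'Hockey' is present but missing.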

Querying a Series
In [ ]: sports = {'Archery': 'Bhutan',
'Golf': 'Scotland',
'Sumo': 'Japan',
'Taekwondo': 'South Korea'}
s = pd.Series(sports)
s

In [ ]: s.iloc[3]

In [ ]: s.loc['Golf']

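In [ ]: s[3] # integer key on a label index falls back to position; recent pandas versions deprecate this, so s.iloc[3] is the safer spelling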

In [ ]: s['Golf']

In [ ]: sports = {99: 'Bhutan',

100: 'Scotland',
101: 'Japan',
102: 'South Korea'}
s = pd.Series(sports)

In [ ]: s[0] #This won't call s.iloc[0] as one might expect, it generates an error instead
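With an integer index, a plain `s[0]` is treated as a label lookup, so it fails when no label 0 exists. The sketch below shows the failure and the two explicit alternatives; the names `lookup_failed`, `first_by_position`, and `first_by_label` are illustrative only.

```python
import pandas as pd

s = pd.Series({99: 'Bhutan',
               100: 'Scotland',
               101: 'Japan',
               102: 'South Korea'})

# s[0] looks for the label 0, which does not exist in this index.
try:
    s[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# iloc is always positional, loc is always label-based.
first_by_position = s.iloc[0]
first_by_label = s.loc[99]

print(lookup_failed, first_by_position, first_by_label)
```

Being explicit with `iloc` or `loc` avoids having to remember which rule the bare indexing operator applies.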

In [ ]: s = pd.Series([100.00, 120.00, 101.00, 3.00])

In [ ]: total = 0
for item in s:
    total+=item
print(total)

In [ ]: import numpy as np

total = np.sum(s)
print(total)

In [ ]: #this creates a big series of random numbers

s = pd.Series(np.random.randint(0,1000,10000))
s.head()

In [ ]: len(s)

In [ ]: %%timeit -n 100
summary = 0
for item in s:
    summary+=item

In [ ]: %%timeit -n 100
summary = np.sum(s)
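The `%%timeit` cell magic only works inside IPython or Jupyter. As a rough sketch, the same comparison can be made in a plain Python script with the standard `timeit` module; the helper names `loop_sum` and `vectorized_sum` are made up for this example, and the timings will vary by machine.

```python
import timeit

import numpy as np
import pandas as pd

s = pd.Series(np.random.randint(0, 1000, 10000))

def loop_sum(series):
    """Add the items one at a time in a Python loop."""
    total = 0
    for item in series:
        total += item
    return total

def vectorized_sum(series):
    """Let pandas/NumPy do the summation in compiled code."""
    return series.sum()

# Time 10 runs of each approach.
loop_seconds = timeit.timeit(lambda: loop_sum(s), number=10)
vector_seconds = timeit.timeit(lambda: vectorized_sum(s), number=10)

print(f"loop: {loop_seconds:.4f}s  vectorized: {vector_seconds:.4f}s")
```

Both functions return the same total; only the time taken differs.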

In [ ]: s+=2 #adds two to each item in s using broadcasting

s.head()

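In [ ]: for label, value in s.items(): # iteritems() was removed in pandas 2.0
    s.at[label] = value+2 # set_value() was removed in pandas 1.0; .at sets a single value by label
s.head()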

In [ ]: %%timeit -n 10
s = pd.Series(np.random.randint(0,1000,10000))
for label, value in s.items(): # iteritems() was removed in pandas 2.0
    s.loc[label]= value+2

In [ ]: %%timeit -n 10
s = pd.Series(np.random.randint(0,1000,10000))
s+=2

In [ ]: s = pd.Series([1, 2, 3])
s.loc['Animal'] = 'Bears'
s

In [ ]: original_sports = pd.Series({'Archery': 'Bhutan',

'Golf': 'Scotland',
'Sumo': 'Japan',
'Taekwondo': 'South Korea'})
cricket_loving_countries = pd.Series(['Australia',
'Barbados',
'Pakistan',
'England'],
index=['Cricket',
'Cricket',
'Cricket',
'Cricket'])
all_countries = pd.concat([original_sports, cricket_loving_countries]) # Series.append() was removed in pandas 2.0

In [ ]: original_sports

In [ ]: cricket_loving_countries

In [ ]: all_countries

In [ ]: all_countries.loc['Cricket']
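Two points from the cells above are worth checking explicitly: combining two Series returns a new object and leaves the inputs untouched, and a label lookup returns a Series when the label is repeated. A short sketch, using `pd.concat` (the `cricket` and `archery` names are illustrative):

```python
import pandas as pd

original_sports = pd.Series({'Archery': 'Bhutan',
                             'Golf': 'Scotland',
                             'Sumo': 'Japan',
                             'Taekwondo': 'South Korea'})
cricket_loving_countries = pd.Series(['Australia', 'Barbados', 'Pakistan', 'England'],
                                     index=['Cricket'] * 4)

# concat builds a new Series; neither input is modified.
all_countries = pd.concat([original_sports, cricket_loving_countries])

# A repeated label returns a Series; a unique label returns a scalar.
cricket = all_countries.loc['Cricket']
archery = all_countries.loc['Archery']

print(len(original_sports), len(all_countries))
print(cricket.tolist())
print(archery)
```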

The DataFrame Data Structure

In [ ]: import pandas as pd
purchase_1 = pd.Series({'Name': 'Chris',
'Item Purchased': 'Dog Food',
'Cost': 22.50})
purchase_2 = pd.Series({'Name': 'Kevyn',
'Item Purchased': 'Kitty Litter',
'Cost': 2.50})
purchase_3 = pd.Series({'Name': 'Vinod',
'Item Purchased': 'Bird Seed',
'Cost': 5.00})
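df = pd.DataFrame([purchase_1, purchase_2, purchase_3], index=['Store 1', 'Store 1', 'Store 2']) # this line was cut off in the source; the third label is inferred from the df.loc['Store 2'] cells that follow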
df.head()

In [ ]: df.loc['Store 2']

In [ ]: type(df.loc['Store 2'])

In [ ]: df.loc['Store 1']

In [ ]: df.loc['Store 1', 'Cost']

In [ ]: df.T

In [ ]: df.T.loc['Cost']

In [ ]: df['Cost']

In [ ]: df.loc['Store 1']['Cost']

In [ ]: df.loc[:,['Name', 'Cost']]
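The return type of `df.loc` depends on how many rows match the label. The sketch below rebuilds the purchases frame from plain dictionaries to show this; the index labels are an assumption (two rows for 'Store 1' and one for 'Store 2'), chosen to match the lookups in the cells above, and the variable names are illustrative.

```python
import pandas as pd

# Index labels are assumed: 'Store 1' appears twice, 'Store 2' once.
df = pd.DataFrame(
    [{'Name': 'Chris', 'Item Purchased': 'Dog Food', 'Cost': 22.50},
     {'Name': 'Kevyn', 'Item Purchased': 'Kitty Litter', 'Cost': 2.50},
     {'Name': 'Vinod', 'Item Purchased': 'Bird Seed', 'Cost': 5.00}],
    index=['Store 1', 'Store 1', 'Store 2'])

unique_row = df.loc['Store 2']           # one matching row -> Series
repeated_rows = df.loc['Store 1']        # two matching rows -> DataFrame
store_1_costs = df.loc['Store 1', 'Cost']  # row label and column in one call

print(type(unique_row).__name__)
print(type(repeated_rows).__name__)
print(store_1_costs.tolist())
```

Selecting the row and column in a single `loc` call avoids the intermediate object that chained indexing such as `df.loc['Store 1']['Cost']` creates.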

In [ ]: df.drop('Store 1')

In [ ]: df

In [ ]: copy_df = df.copy()
copy_df = copy_df.drop('Store 1')
copy_df
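`drop` returns a new DataFrame by default and leaves the original unchanged, which is why the result has to be assigned back to a name. A minimal sketch with a made-up two-column frame (the data and the names `stores` and `remaining` are hypothetical):

```python
import pandas as pd

# Hypothetical data, used only to illustrate drop().
stores = pd.DataFrame({'Name': ['Chris', 'Kevyn', 'Vinod'],
                       'Cost': [22.50, 2.50, 5.00]},
                      index=['Store 1', 'Store 1', 'Store 2'])

# drop() removes every row with the given label and returns the result.
remaining = stores.drop('Store 1')

print(len(stores))     # the original still has all rows
print(len(remaining))  # only the 'Store 2' row is left
```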

In [ ]: copy_df.drop?

In [ ]: del copy_df['Name']
copy_df

In [ ]: df['Location'] = None
df

DataFrame Indexing and Loading

In [ ]: costs = df['Cost']
costs

In [ ]: costs+=2
costs

In [ ]: df

In [ ]: !cat olympics.csv

In [ ]: df = pd.read_csv('olympics.csv')
df.head()

In [ ]: df = pd.read_csv('olympics.csv', index_col = 0, skiprows=1)

df.head()

In [ ]: df.columns

In [ ]: for col in df.columns:
    if col[:2]=='01':
        df.rename(columns={col:'Gold' + col[4:]}, inplace=True)
    if col[:2]=='02':
        df.rename(columns={col:'Silver' + col[4:]}, inplace=True)
    if col[:2]=='03':
        df.rename(columns={col:'Bronze' + col[4:]}, inplace=True)
    if col[:1]=='№':
        df.rename(columns={col:'#' + col[1:]}, inplace=True)

df.head()

Querying a DataFrame
In [ ]: df['Gold'] > 0

In [ ]: only_gold = df.where(df['Gold'] > 0)

only_gold.head()

In [ ]: only_gold['Gold'].count()

In [ ]: df['Gold'].count()

In [ ]: only_gold = only_gold.dropna()
only_gold.head()

In [ ]: only_gold = df[df['Gold'] > 0]

only_gold.head()

In [ ]: len(df[(df['Gold'] > 0) | (df['Gold.1'] > 0)])

In [ ]: df[(df['Gold.1'] > 0) & (df['Gold'] == 0)]
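The difference between `where` and boolean indexing is easier to see on a tiny frame than on the full Olympics file. The sketch below uses made-up medal counts (the `medals` data and country labels are hypothetical) with the same column names as the cells above.

```python
import pandas as pd

# Hypothetical medal counts; 'Gold' and 'Gold.1' mirror the column names above.
medals = pd.DataFrame({'Gold': [0, 3, 1],
                       'Gold.1': [2, 0, 0]},
                      index=['A', 'B', 'C'])

# where() keeps the shape and fills non-matching rows with NaN.
masked = medals.where(medals['Gold'] > 0)

# Boolean indexing drops the non-matching rows outright.
filtered = medals[medals['Gold'] > 0]

# Masks combine with & and |; each comparison needs its own parentheses.
second_only = medals[(medals['Gold.1'] > 0) & (medals['Gold'] == 0)]

print(len(masked), masked['Gold'].count())
print(list(filtered.index))
print(list(second_only.index))
```

`count()` ignores NaN, which is why the masked frame still has three rows but only two counted values.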

Indexing DataFrames
In [ ]: df.head()

In [ ]: df['country'] = df.index
df = df.set_index('Gold')
df.head()

In [ ]: df = df.reset_index()
df.head()

In [ ]: df = pd.read_csv('census.csv')
df.head()

In [ ]: df['SUMLEV'].unique()

In [ ]: df=df[df['SUMLEV'] == 50]
df.head()

In [ ]: columns_to_keep = ['STNAME',
'CTYNAME',
'BIRTHS2010',
'BIRTHS2011',
'BIRTHS2012',
'BIRTHS2013',
'BIRTHS2014',
'BIRTHS2015',
'POPESTIMATE2010',
'POPESTIMATE2011',
'POPESTIMATE2012',
'POPESTIMATE2013',
'POPESTIMATE2014',
'POPESTIMATE2015']
df = df[columns_to_keep]
df.head()

In [ ]: df = df.set_index(['STNAME', 'CTYNAME'])
df.head()

In [ ]: df.loc['Michigan', 'Washtenaw County']

In [ ]: df.loc[ [('Michigan', 'Washtenaw County'),

('Michigan', 'Wayne County')] ]

Missing values
In [ ]: df = pd.read_csv('log.csv')
df

In [ ]: df.fillna?

In [ ]: df = df.set_index('time')
df = df.sort_index()
df

In [ ]: df = df.reset_index()
df = df.set_index(['time', 'user'])
df

In [ ]: df = df.ffill() # fillna(method='ffill') is deprecated in recent pandas; ffill() is the direct equivalent
df.head()
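Forward filling copies the last known value down into the gaps that follow it, so the row order matters: sort first, then fill. A minimal sketch with a made-up log (the `log` data and the `volume` column are hypothetical stand-ins for log.csv):

```python
import numpy as np
import pandas as pd

# Hypothetical log rows, deliberately out of time order.
log = pd.DataFrame({'time': [3, 1, 2, 4],
                    'user': ['cheryl', 'cheryl', 'cheryl', 'cheryl'],
                    'volume': [np.nan, 10.0, np.nan, 5.0]})

# Sort by time so that "the previous row" means "the previous moment".
log = log.set_index('time').sort_index()

# Each NaN takes the most recent non-missing value above it.
filled = log.ffill()

print(filled['volume'].tolist())
```

Filling before sorting would copy values from whatever row happened to sit above in the file, which is rarely what is wanted for time-ordered data.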
