Week 13 1-Pandas

The document provides an overview of the Pandas library, focusing on the creation and manipulation of Series and DataFrames. It includes examples of creating Series from lists and dictionaries, as well as constructing DataFrames from multiple Series. Additionally, it covers loading data from CSV files into DataFrames for analysis.

Uploaded by

shost661

Pandas Library

Pandas Series
A Pandas Series is like a column in a table. It is a one-dimensional array that can hold data of any type.

Here we will create a simple pandas series.

import pandas as pd
x = [1,7,2]
y = pd.Series(x)
print(y)

0 1
1 7
2 2
dtype: int64

# Labels: by default the values are labeled with their position (0, 1, 2, ...).
# A label can be used to access a specific value.

import pandas as pd
x = [1,7,2]
y = pd.Series(x)
print(y[0])
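
The difference between label-based and position-based access can be made explicit with the loc and iloc accessors. This is a minimal sketch reusing the list from the example above; the variable names are illustrative only.

```python
import pandas as pd

y = pd.Series([1, 7, 2])

# With the default index, labels and positions coincide.
first_by_label = y.loc[0]       # value stored under label 0
last_by_position = y.iloc[-1]   # value at the last position

print(first_by_label, last_by_position)
```

Using loc for labels and iloc for positions keeps the intent clear once the labels are no longer plain integers.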

# With the index argument, you can create your own labels:
import pandas as pd
x = [1,7,2]
y = pd.Series(x, index=["x", "y", "z"])
print(y)

x 1
y 7
z 2
dtype: int64

# Labels: after creating your own labels,
# a label can be used to access a specific value.
import pandas as pd
x = [1,7,2]
y = pd.Series(x, index=["x", "y", "z"])
print(y["x"])

1
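
As a sketch of what custom labels allow, the example below selects one label, several labels at once, and a value by position. It reuses the Series from the example above; the variable names are illustrative only.

```python
import pandas as pd

y = pd.Series([1, 7, 2], index=["x", "y", "z"])

single = y.loc["y"]          # one label -> a single value
subset = y.loc[["x", "z"]]   # list of labels -> a smaller Series
by_position = y.iloc[0]      # position-based access still works

print(single)
print(subset)
print(by_position)
```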
""" You can also use a key/value object, like a dictionary,
when creating a Series. The dictionary keys become the labels.
Here we create a simple pandas Series from a dictionary.
"""
import pandas as pd
cal = {"day1": 420, "day2":380, "day3":390}
x = pd.Series(cal)
print(x)

day1 420
day2 380
day3 390
dtype: int64

# now we will create a series using only data from day1 and day2
import pandas as pd
cal = {"day1": 420, "day2":380, "day3":390}
result = pd.Series(cal, index=["day1", "day2"])
print(result)

day1 420
day2 380
dtype: int64
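
One detail worth knowing when selecting dictionary entries through the index argument: a label that is not a key of the dictionary does not raise an error, it produces a missing value (NaN). A small sketch, where "day4" is a hypothetical label chosen only for illustration:

```python
import pandas as pd

cal = {"day1": 420, "day2": 380, "day3": 390}

# "day4" is a hypothetical label that is not a key of cal.
result = pd.Series(cal, index=["day1", "day4"])
print(result)

# Missing entries can be detected with isna().
missing = result.isna()
print(missing)
```

Because NaN is a floating-point value, the Series is stored as floats rather than integers in this case.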

Data Frame
"""DataFrame: data sets in pandas are usually multidimensional tables,
and they are called DataFrames.
A Series is like a column, and a DataFrame is the whole table.
"""
# We will now create a DataFrame from two columns of data (a dictionary of lists).
import pandas as pd
x = {"cal": [420, 380, 390], "duration": [50, 40, 45]}
y = pd.DataFrame(x)
print(y)

cal duration
0 420 50
1 380 40
2 390 45
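
To make the "Series are columns" idea concrete, the same table can be built from two actual Series objects instead of plain lists. A minimal sketch with the same values as above; the variable names are illustrative only.

```python
import pandas as pd

cal = pd.Series([420, 380, 390])
duration = pd.Series([50, 40, 45])

# Each Series becomes one column of the DataFrame.
df = pd.DataFrame({"cal": cal, "duration": duration})
print(df)

# Selecting a column gives back a Series.
col = df["cal"]
print(type(col).__name__)
```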

# DataFrame: a 2D data structure, like a 2D array or a table
# with rows and columns.
import pandas as pd
data = {"cal": [420, 380, 390], "dur":[50, 40, 45]}
x = pd.DataFrame(data)
print(x)
cal dur
0 420 50
1 380 40
2 390 45

# Locate row: pandas uses the loc attribute to return one or more
# specified rows.

import pandas as pd
data = {"cal": [420, 380, 390], "dur":[50, 40, 45]}
x = pd.DataFrame(data)
print(x.loc[0])

cal 420
dur 50
Name: 0, dtype: int64

# example of returning row 0 and 1

import pandas as pd
data = {"cal": [420, 380, 390], "dur":[50, 40, 45]}
x = pd.DataFrame(data)
print(x.loc[[0,1]])

cal dur
0 420 50
1 380 40
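
The two loc calls above return different types: a single label gives a Series, while a list of labels gives a DataFrame. A short sketch that checks this, using the same data as above:

```python
import pandas as pd

data = {"cal": [420, 380, 390], "dur": [50, 40, 45]}
x = pd.DataFrame(data)

row = x.loc[0]          # single label -> Series
table = x.loc[[0, 1]]   # list of labels -> DataFrame

print(type(row).__name__)
print(type(table).__name__)
```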

# Named index: with the index argument, you can give the rows your own labels.
import pandas as pd
data = {"cal": [420, 380, 390], "dur":[50, 40, 45]}
x = pd.DataFrame(data, index=["day1", "day2", "day3"])
print(x)

cal dur
day1 420 50
day2 380 40
day3 390 45

# locate the named index:

import pandas as pd
data = {"cal": [420, 380, 390], "dur":[50, 40, 45]}
x = pd.DataFrame(data, index=["day1", "day2", "day3"])
print(x.loc["day2"])

cal 380
dur 40
Name: day2, dtype: int64

# Passing a list of labels returns the result as a DataFrame:
import pandas as pd
data = {"cal": [420, 380, 390], "dur":[50, 40, 45]}
x = pd.DataFrame(data, index=["day1", "day2", "day3"])
print(x.loc[["day1", "day2"]])

cal dur
day1 420 50
day2 380 40
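
loc also accepts a column label after the row label, which makes it possible to pick a single cell or a sub-table. A minimal sketch with the same named index as above; the variable names are illustrative only.

```python
import pandas as pd

data = {"cal": [420, 380, 390], "dur": [50, 40, 45]}
x = pd.DataFrame(data, index=["day1", "day2", "day3"])

# Row label and column label -> a single cell.
value = x.loc["day2", "cal"]

# Lists of row and column labels -> a smaller DataFrame.
part = x.loc[["day1", "day3"], ["dur"]]

print(value)
print(part)
```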

Pandas CSV
# Load the data from the CSV file (data.csv) into a DataFrame.
import pandas as pd
x = pd.read_csv('data.csv')
print(x)

Duration Pulse Maxpulse Calories

0 60 110 130 409.1
1 60 117 145 479.0
2 60 103 135 340.0
3 45 109 175 282.4
4 45 117 148 406.0
.. ... ... ... ...
164 60 105 140 290.8
165 60 110 145 300.0
166 60 115 145 310.2
167 75 120 150 320.4
168 75 125 150 330.4

[169 rows x 4 columns]
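
To try read_csv without having the data file at hand, the same function also accepts a file-like object. The sketch below assumes a comma-separated layout with the four columns shown above; the text copies the first three rows of the printed output and is not the actual file.

```python
import io
import pandas as pd

# Assumed layout of the file: a header line, then one row per line.
csv_text = """Duration,Pulse,Maxpulse,Calories
60,110,130,409.1
60,117,145,479.0
60,103,135,340.0
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df)
print(df.dtypes)
```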

# Read CSV files: a CSV (comma-separated values) file is a simple way
# to store large data sets. CSV files contain plain text.

# Load the CSV into a DataFrame and print it in full with to_string().

import pandas as pd
x = pd.read_csv('data.csv')
print(x.to_string())

Duration Pulse Maxpulse Calories

0 60 110 130 409.1
1 60 117 145 479.0
2 60 103 135 340.0
3 45 109 175 282.4
4 45 117 148 406.0
5 60 102 127 300.0
6 60 110 136 374.0
7 45 104 134 253.3
8 30 109 133 195.1
9 60 98 124 269.0
10 60 103 147 329.3
11 60 100 120 250.7
12 60 106 128 345.3
13 60 104 132 379.3
14 60 98 123 275.0
15 60 98 120 215.2
16 60 100 120 300.0
17 45 90 112 NaN
18 60 103 123 323.0
19 45 97 125 243.0
20 60 108 131 364.2
21 45 100 119 282.0
22 60 130 101 300.0
23 45 105 132 246.0
24 60 102 126 334.5
25 60 100 120 250.0
26 60 92 118 241.0
27 60 103 132 NaN
28 60 100 132 280.0
29 60 102 129 380.3
30 60 92 115 243.0
31 45 90 112 180.1
32 60 101 124 299.0
33 60 93 113 223.0
34 60 107 136 361.0
35 60 114 140 415.0
36 60 102 127 300.0
37 60 100 120 300.0
38 60 100 120 300.0
39 45 104 129 266.0
40 45 90 112 180.1
41 60 98 126 286.0
42 60 100 122 329.4
43 60 111 138 400.0
44 60 111 131 397.0
45 60 99 119 273.0
46 60 109 153 387.6
47 45 111 136 300.0
48 45 108 129 298.0
49 60 111 139 397.6
50 60 107 136 380.2
51 80 123 146 643.1
52 60 106 130 263.0
53 60 118 151 486.0
54 30 136 175 238.0
55 60 121 146 450.7
56 60 118 121 413.0
57 45 115 144 305.0
58 20 153 172 226.4
59 45 123 152 321.0
60 210 108 160 1376.0
61 160 110 137 1034.4
62 160 109 135 853.0
63 45 118 141 341.0
64 20 110 130 131.4
65 180 90 130 800.4
66 150 105 135 873.4
67 150 107 130 816.0
68 20 106 136 110.4
69 300 108 143 1500.2
70 150 97 129 1115.0
71 60 109 153 387.6
72 90 100 127 700.0
73 150 97 127 953.2
74 45 114 146 304.0
75 90 98 125 563.2
76 45 105 134 251.0
77 45 110 141 300.0
78 120 100 130 500.4
79 270 100 131 1729.0
80 30 159 182 319.2
81 45 149 169 344.0
82 30 103 139 151.1
83 120 100 130 500.0
84 45 100 120 225.3
85 30 151 170 300.0
86 45 102 136 234.0
87 120 100 157 1000.1
88 45 129 103 242.0
89 20 83 107 50.3
90 180 101 127 600.1
91 45 107 137 NaN
92 30 90 107 105.3
93 15 80 100 50.5
94 20 150 171 127.4
95 20 151 168 229.4
96 30 95 128 128.2
97 25 152 168 244.2
98 30 109 131 188.2
99 90 93 124 604.1
100 20 95 112 77.7
101 90 90 110 500.0
102 90 90 100 500.0
103 90 90 100 500.4
104 30 92 108 92.7
105 30 93 128 124.0
106 180 90 120 800.3
107 30 90 120 86.2
108 90 90 120 500.3
109 210 137 184 1860.4
110 60 102 124 325.2
111 45 107 124 275.0
112 15 124 139 124.2
113 45 100 120 225.3
114 60 108 131 367.6
115 60 108 151 351.7
116 60 116 141 443.0
117 60 97 122 277.4
118 60 105 125 NaN
119 60 103 124 332.7
120 30 112 137 193.9
121 45 100 120 100.7
122 60 119 169 336.7
123 60 107 127 344.9
124 60 111 151 368.5
125 60 98 122 271.0
126 60 97 124 275.3
127 60 109 127 382.0
128 90 99 125 466.4
129 60 114 151 384.0
130 60 104 134 342.5
131 60 107 138 357.5
132 60 103 133 335.0
133 60 106 132 327.5
134 60 103 136 339.0
135 20 136 156 189.0
136 45 117 143 317.7
137 45 115 137 318.0
138 45 113 138 308.0
139 20 141 162 222.4
140 60 108 135 390.0
141 60 97 127 NaN
142 45 100 120 250.4
143 45 122 149 335.4
144 60 136 170 470.2
145 45 106 126 270.8
146 60 107 136 400.0
147 60 112 146 361.9
148 30 103 127 185.0
149 60 110 150 409.4
150 60 106 134 343.0
151 60 109 129 353.2
152 60 109 138 374.0
153 30 150 167 275.8
154 60 105 128 328.0
155 60 111 151 368.5
156 60 97 131 270.4
157 60 100 120 270.4
158 60 114 150 382.8
159 30 80 120 240.9
160 30 85 120 250.4
161 45 90 130 260.4
162 45 95 130 270.0
163 45 100 140 280.9
164 60 105 140 290.8
165 60 110 145 300.0
166 60 115 145 310.2
167 75 120 150 320.4
168 75 125 150 330.4

# Load the CSV into a DataFrame and print it without to_string():
# for a large DataFrame, pandas prints only the first and last five rows.

import pandas as pd
df = pd.read_csv('data.csv')
print(df)

Duration Pulse Maxpulse Calories

[169 rows x 4 columns]
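
Whether print() abbreviates a DataFrame is governed by the display.max_rows option, which is 60 by default. The sketch below uses a hypothetical 100-row DataFrame (not the CSV data) to compare the abbreviated text with the full text from to_string().

```python
import pandas as pd

# Hypothetical DataFrame with more rows than display.max_rows.
df = pd.DataFrame({"n": range(100)})

with pd.option_context("display.max_rows", 60):
    short = str(df)       # abbreviated: head, "..", tail, and a size summary

full = df.to_string()     # always the complete table

print(len(short.splitlines()), len(full.splitlines()))
```

Raising display.max_rows is an alternative to to_string() when the full table should be printed by default.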

import pandas as pd
x = pd.read_csv('data.csv')
print(x.head())

Duration Pulse Maxpulse Calories

0 60 110 130 409.1
1 60 117 145 479.0
2 60 103 135 340.0
3 45 109 175 282.4
4 45 117 148 406.0

# Viewing the data: one of the most used methods for a quick overview
# of the DataFrame is head(). It returns the headers
# and a specified number of rows, starting from the top (5 by default).
# Here we print the first 10 rows of the DataFrame.
import pandas as pd
x = pd.read_csv('data.csv')
print(x.head(10))
Duration Pulse Maxpulse Calories
0 60 110 130 409.1
1 60 117 145 479.0
2 60 103 135 340.0
3 45 109 175 282.4
4 45 117 148 406.0
5 60 102 127 300.0
6 60 110 136 374.0
7 45 104 134 253.3
8 30 109 133 195.1
9 60 98 124 269.0

import pandas as pd
x = pd.read_csv('data.csv')
print(x.tail())

Duration Pulse Maxpulse Calories

164 60 105 140 290.8
165 60 110 145 300.0
166 60 115 145 310.2
167 75 120 150 320.4
168 75 125 150 330.4

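# The tail() method returns the headers and a specified number of rows,
# starting from the bottom (5 by default).
# Here we print the last 10 rows of the DataFrame.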

import pandas as pd
x = pd.read_csv('data.csv')
print(x.tail(10))

Duration Pulse Maxpulse Calories

159 30 80 120 240.9
160 30 85 120 250.4
161 45 90 130 260.4
162 45 95 130 270.0
163 45 100 140 280.9
164 60 105 140 290.8
165 60 110 145 300.0
166 60 115 145 310.2
167 75 120 150 320.4
168 75 125 150 330.4
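
head() and tail() can be tried on any DataFrame, not only one loaded from a file. A minimal sketch with a hypothetical 20-row DataFrame, showing the default of 5 rows and an explicit row count:

```python
import pandas as pd

# Hypothetical DataFrame used only to illustrate the row counts.
df = pd.DataFrame({"n": range(20)})

top_default = df.head()     # first 5 rows
top_three = df.head(3)      # first 3 rows
bottom_three = df.tail(3)   # last 3 rows

print(top_three)
print(bottom_three)
```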

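# Information about the data: the info() method prints a summary of the
# DataFrame (columns, non-null counts, dtypes, memory usage).
# info() prints directly and returns None, which is why print() shows None at the end.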
import pandas as pd
df = pd.read_csv('data.csv')
print(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 169 entries, 0 to 168
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Duration 169 non-null int64
1 Pulse 169 non-null int64
2 Maxpulse 169 non-null int64
3 Calories 164 non-null float64
dtypes: float64(1), int64(3)
memory usage: 5.4 KB
None
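
The Non-Null Count column above shows 164 of 169 values for Calories, so five values are missing. A sketch of how missing values can be counted directly, using a small illustrative DataFrame (three rows, one of them with a missing Calories value) rather than the full file:

```python
import pandas as pd

# Illustrative data only; the middle row has a missing Calories value.
df = pd.DataFrame({
    "Duration": [60, 45, 60],
    "Calories": [409.1, float("nan"), 340.0],
})

# isna() marks missing cells; sum() counts them per column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# info() prints its summary itself and returns None.
returned = df.info()
```

Calling df.info() on its own, without print(), avoids the trailing None in the output.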
