Variables and Types
INTRODUCTION TO PYTHON
Hugo Bowne-Anderson
Data Scientist at DataCamp
Variable
Specific, case-sensitive name
Call up value through variable name
1.79 m - 68.7 kg
height = 1.79
weight = 68.7
height
1.79
Calculate BMI
BMI = weight / height^2
height = 1.79
weight = 68.7
height
1.79
68.7 / 1.79 ** 2
21.4413
weight / height ** 2
21.4413
bmi = weight / height ** 2
bmi
21.4413
Reproducibility
height = 1.79
weight = 68.7
bmi = weight / height ** 2
print(bmi)
21.4413
Reproducibility
height = 1.79
weight = 74.2 # <-
bmi = weight / height ** 2
print(bmi)
23.1578
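The two Reproducibility slides rerun the same script with one value changed. A minimal sketch of that idea wraps the calculation in a small helper; the function name `compute_bmi` is a hypothetical addition for illustration and does not appear in the slides.

```python
# Hypothetical helper (not from the slides): BMI = weight / height ** 2
def compute_bmi(weight, height):
    """Return the body mass index for a weight in kg and a height in m."""
    return weight / height ** 2

height = 1.79

# Rerun the same calculation with the two weights used on the slides
bmi_before = compute_bmi(68.7, height)
bmi_after = compute_bmi(74.2, height)

print(round(bmi_before, 4))  # 21.4413
print(round(bmi_after, 4))   # 23.1578
```

Changing an input then means changing one argument rather than editing and retyping the formula.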
Python Types
type(bmi)
float
day_of_week = 5
type(day_of_week)
int
Python Types (2)
x = "body mass index"
y = 'this works too'
type(y)
str
z = True
type(z)
bool
Python Types (3)
2 + 3
5
'ab' + 'cd'
'abcd'
Different type = different behavior!
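A short sketch of how `+` depends on the operand types. The variable names and the `"height: "` label are illustrative assumptions; the `str()` conversion goes slightly beyond what the slides show.

```python
# The + operator behaves differently depending on the operand types
int_sum = 2 + 3          # numeric addition
str_sum = "ab" + "cd"    # string concatenation
bool_sum = True + True   # bool is a subclass of int, so True counts as 1

print(type(int_sum), int_sum)
print(type(str_sum), str_sum)
print(type(bool_sum), bool_sum)

# Mixing str and int with + raises a TypeError
try:
    "height: " + 179
except TypeError as err:
    print("TypeError:", err)

# Converting explicitly with str() makes the concatenation valid
label = "height: " + str(179)
print(label)
```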
Python Lists
Python Data Types
float - real numbers
int - integer numbers
str - string, text
bool - True, False
height = 1.73
tall = True
Each variable represents a single value
Problem
Data Science: many data points
Height of entire family
height1 = 1.73
height2 = 1.68
height3 = 1.71
height4 = 1.89
Inconvenient
Python List
[a, b, c]
[1.73, 1.68, 1.71, 1.89]
[1.73, 1.68, 1.71, 1.89]
fam = [1.73, 1.68, 1.71, 1.89]
fam
[1.73, 1.68, 1.71, 1.89]
Name a collection of values
Contain any type
Contain different types
Python List
[a, b, c]
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
fam2 = [["liz", 1.73],
        ["emma", 1.68],
        ["mom", 1.71],
        ["dad", 1.89]]
fam2
[['liz', 1.73], ['emma', 1.68], ['mom', 1.71], ['dad', 1.89]]
List type
type(fam)
list
type(fam2)
list
Specific functionality
Specific behavior
Subsetting Lists
Subsetting lists
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
fam[3]
1.68
Subsetting lists
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
fam[6]
'dad'
fam[-1]
1.89
fam[7]
1.89
List slicing
fam
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
fam[3:5]
[1.68, 'mom']
fam[1:4]
[1.73, 'emma', 1.68]
List slicing
fam
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
fam[:4]
['liz', 1.73, 'emma', 1.68]
fam[5:]
[1.71, 'dad', 1.89]
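The indexing and slicing rules from these slides can be collected in one sketch. The step slices at the end (`fam[0::2]`, `fam[1::2]`) are standard Python but are not shown in the slides; the variable names are illustrative.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# Indexing is zero-based; negative indices count from the end
first = fam[0]     # 'liz'
last = fam[-1]     # 1.89, the same element as fam[7]

# A slice fam[start:end] includes start and excludes end
middle = fam[3:5]  # [1.68, 'mom']

# Omitted bounds default to the start or the end of the list
head = fam[:4]
tail = fam[5:]

# An optional third value is a step: every second element
names = fam[0::2]
heights = fam[1::2]

print(names)    # ['liz', 'emma', 'mom', 'dad']
print(heights)  # [1.73, 1.68, 1.71, 1.89]
```

Because the end index is excluded, `fam[:4] + fam[4:]` always reproduces the full list.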
Manipulating Lists
List Manipulation
Change list elements
Add list elements
Remove list elements
Changing list elements
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
fam[7] = 1.86
fam
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.86]
fam[0:2] = ["lisa", 1.74]
fam
['lisa', 1.74, 'emma', 1.68, 'mom', 1.71, 'dad', 1.86]
Adding and removing elements
fam + ["me", 1.79]
['lisa', 1.74, 'emma', 1.68, 'mom', 1.71, 'dad', 1.86, 'me', 1.79]
fam_ext = fam + ["me", 1.79]
del fam[2]
fam
['lisa', 1.74, 1.68, 'mom', 1.71, 'dad', 1.86]
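A small sketch of the difference between the two operations: `+` builds a new list, while `del` changes the existing one. The second `del` is an illustrative addition showing that elements shift after a deletion.

```python
fam = ["lisa", 1.74, "emma", 1.68, "mom", 1.71, "dad", 1.86]

# + builds a new list; fam itself is unchanged
fam_ext = fam + ["me", 1.79]
print(len(fam), len(fam_ext))  # 8 10

# del modifies fam in place; later elements shift one position left
del fam[2]  # removes 'emma'
del fam[2]  # emma's height 1.68 has moved to index 2, so this removes it
print(fam)  # ['lisa', 1.74, 'mom', 1.71, 'dad', 1.86]
```

Deleting by the same index twice is a common way to remove a name and its height together, since the height slides into the freed position.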
Behind the scenes (1)
x = ["a", "b", "c"]
y = x
y[1] = "z"
y
['a', 'z', 'c']
x
['a', 'z', 'c']
Behind the scenes (2)
x = ["a", "b", "c"]
y = list(x)
y = x[:]
y[1] = "z"
x
['a', 'b', 'c']
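The two "Behind the scenes" cases can be compared side by side. This is a minimal sketch; the names `alias` and `x_copy` are illustrative, and the `is` operator (which checks whether two names refer to the same object) is not used in the slides.

```python
x = ["a", "b", "c"]

alias = x         # a second name for the same list object
x_copy = list(x)  # a new list with the same elements (x[:] works too)

alias[1] = "z"    # visible through x, because alias and x are one object
x_copy[2] = "q"   # affects only the copy

print(x)          # ['a', 'z', 'c']
print(x_copy)     # ['a', 'b', 'q']
print(alias is x)   # True
print(x_copy is x)  # False
```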
Functions
Functions
Nothing new!
type()
Piece of reusable code
Solves particular task
Call function instead of writing code yourself
Example
fam = [1.73, 1.68, 1.71, 1.89]
fam
[1.73, 1.68, 1.71, 1.89]
max(fam)
1.89
tallest = max(fam)
tallest
1.89
round()
round(1.68, 1)
1.7
round(1.68)
2
help(round) # Open up documentation
Help on built-in function round in module builtins:
round(number, ndigits=None)
Round a number to a given precision in decimal digits.
The return value is an integer if ndigits is omitted or None.
Otherwise the return value has the same type as the number. ndigits may be negative.
round(number)
round(number, ndigits)
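The two call forms of `round()` and the behaviour described in its help text can be checked directly. A brief sketch; the value `1234.5678` is an arbitrary example chosen to illustrate a negative `ndigits`.

```python
# With ndigits: the result keeps the type of the input (float here)
one_decimal = round(1.68, 1)
print(one_decimal, type(one_decimal))  # 1.7 <class 'float'>

# Without ndigits: the result is an integer
whole = round(1.68)
print(whole, type(whole))              # 2 <class 'int'>

# Negative ndigits rounds to the left of the decimal point
hundreds = round(1234.5678, -2)
print(hundreds)                        # 1200.0
```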
Find functions
How to know?
Standard task -> probably function exists!
The internet is your friend
Methods
Built-in Functions
Maximum of list: max()
Length of list or string: len()
Get index in list: ?
Reversing a list: ?
Back 2 Basics
sister = "liz"
height = 1.73
fam = ["liz", 1.73, "emma", 1.68,
       "mom", 1.71, "dad", 1.89]
Methods: Functions that belong to objects
list methods
fam
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
fam.index("mom") # "Call method index() on fam"
4
fam.count(1.73)
1
str methods
sister
'liz'
sister.capitalize()
'Liz'
sister.replace("z", "sa")
'lisa'
Methods
Everything = object
Objects have methods associated, depending on type
sister.replace("z", "sa")
'lisa'
fam.replace("mom", "mommy")
AttributeError: 'list' object has no attribute 'replace'
Methods
sister.index("z")
2
fam.index("mom")
4
Methods (2)
fam
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
fam.append("me")
fam
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'me']
fam.append(1.79)
fam
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'me', 1.79]
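One detail worth seeing in code: some methods return a new value and leave the object alone, while others change the object itself. A minimal sketch with a shortened `fam` list; the names `renamed` and `result` are illustrative.

```python
sister = "liz"
fam = ["liz", 1.73, "emma", 1.68]

# str methods return a new string; the original is unchanged
renamed = sister.replace("z", "sa")
print(sister, renamed)    # liz lisa

# list.append() changes the list itself and returns None
result = fam.append("mom")
print(result)             # None
print(fam)                # ['liz', 1.73, 'emma', 1.68, 'mom']

# index() exists on both types, but each type has its own implementation
print(sister.index("z"))  # 2
print(fam.index("emma"))  # 2
```

Writing `fam = fam.append("mom")` would therefore replace the list with `None`, which is a frequent beginner mistake.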
Summary
Functions
type(fam)
list
Methods: call functions on objects
fam.index("dad")
6
Packages
Motivation
Functions and methods are powerful
All code in Python distribution?
Huge code base: messy
Lots of code you won’t use
Maintenance problem
Packages
Directory of Python Scripts
Each script = module
Specify functions, methods, types
Thousands of packages available
NumPy
Matplotlib
scikit-learn
Install package
http://pip.readthedocs.org/en/stable/installing/
Download get-pip.py
Terminal:
python3 get-pip.py
pip3 install numpy
Import package
import numpy
array([1, 2, 3])
NameError: name 'array' is not defined
numpy.array([1, 2, 3])
array([1, 2, 3])
import numpy as np
np.array([1, 2, 3])
array([1, 2, 3])
from numpy import array
array([1, 2, 3])
array([1, 2, 3])
from numpy import array
my_script.py
from numpy import array
fam = ["liz", 1.73, "emma", 1.68,
"mom", 1.71, "dad", 1.89]
...
fam_ext = fam + ["me", 1.79]
...
print(str(len(fam_ext)) + " elements in fam_ext")
...
np_fam = array(fam_ext) # Using NumPy, but not very clear
import numpy
import numpy as np
fam = ["liz", 1.73, "emma", 1.68,
"mom", 1.71, "dad", 1.89]
...
fam_ext = fam + ["me", 1.79]
...
print(str(len(fam_ext)) + " elements in fam_ext")
...
np_fam = np.array(fam_ext) # Clearly using NumPy
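The three import styles can be tried without installing anything. This sketch deliberately swaps in the standard-library `math` module for NumPy, which is an assumption made only so the example runs on a bare Python installation; the import syntax is the same for any package.

```python
import math            # whole module, full name as prefix
import math as m       # whole module, short alias as prefix
from math import sqrt  # single function, no prefix

a = math.sqrt(16)  # clear where sqrt comes from
b = m.sqrt(16)     # still clear, with less typing
c = sqrt(16)       # shortest, but the origin is no longer visible here

print(a, b, c)     # 4.0 4.0 4.0
```

All three calls run the same function; the styles differ only in how much the call site tells the reader about where the function is defined.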
NumPy
Lists Recap
Powerful
Collection of values
Hold different types
Change, add, remove
Need for Data Science
Mathematical operations over collections
Speed
Illustration
height = [1.73, 1.68, 1.71, 1.89, 1.79]
height
[1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]
weight
[65.4, 59.2, 63.6, 88.4, 68.7]
weight / height ** 2
TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'
Solution: NumPy
Numeric Python
Alternative to Python List: NumPy Array
Calculations over entire arrays
Easy and Fast
Installation
In the terminal: pip3 install numpy
NumPy
import numpy as np
np_height = np.array(height)
np_height
array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array(weight)
np_weight
array([65.4, 59.2, 63.6, 88.4, 68.7])
bmi = np_weight / np_height ** 2
bmi
array([21.85171573, 20.97505669, 21.75028214, 24.7473475 , 21.44127836])
Comparison
height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]
weight / height ** 2
TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'
np_height = np.array(height)
np_weight = np.array(weight)
np_weight / np_height ** 2
array([21.85171573, 20.97505669, 21.75028214, 24.7473475 , 21.44127836])
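For comparison, here is a sketch of what the same calculation looks like with plain lists only. It uses `zip()` and a list comprehension, neither of which appears in the slides, to pair up the values and compute each BMI one at a time.

```python
height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

# Lists have no element-wise arithmetic, so pair the values up with zip()
# and apply the formula to each (weight, height) pair
bmi = [w / h ** 2 for w, h in zip(weight, height)]

print([round(b, 2) for b in bmi])  # [21.85, 20.98, 21.75, 24.75, 21.44]
```

The result matches the NumPy output, but the loop is written out by hand; NumPy expresses the same thing as a single array expression.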
NumPy: remarks
np.array([1.0, "is", True])
array(['1.0', 'is', 'True'], dtype='<U32')
NumPy arrays: contain only one type
NumPy: remarks
python_list = [1, 2, 3]
numpy_array = np.array([1, 2, 3])
python_list + python_list
[1, 2, 3, 1, 2, 3]
numpy_array + numpy_array
array([2, 4, 6])
Different types: different behavior!
NumPy Subsetting
bmi
array([21.85171573, 20.97505669, 21.75028214, 24.7473475 , 21.44127836])
bmi[1]
20.975
bmi > 23
array([False, False, False, True, False])
bmi[bmi > 23]
array([24.7473475])
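The boolean subsetting above works in two steps, which a short sketch can make explicit. It assumes NumPy is installed; the name `mask` is illustrative, and reusing the mask on `np_height` is an addition not shown in the slides.

```python
import numpy as np

np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
bmi = np_weight / np_height ** 2

# Step 1: the comparison runs element-wise and yields a boolean array
mask = bmi > 23
print(mask)

# Step 2: indexing with the mask keeps only elements where it is True
print(bmi[mask])

# The same mask can subset another array of equal length
print(np_height[mask])
```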
2D NumPy Arrays
Type of NumPy Arrays
import numpy as np
np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
type(np_height)
numpy.ndarray
type(np_weight)
numpy.ndarray
2D NumPy Arrays
np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
[65.4, 59.2, 63.6, 88.4, 68.7]])
np_2d
array([[ 1.73, 1.68, 1.71, 1.89, 1.79],
[65.4 , 59.2 , 63.6 , 88.4 , 68.7 ]])
np_2d.shape
(2, 5) # 2 rows, 5 columns
np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
[65.4, 59.2, 63.6, 88.4, "68.7"]])
array([['1.73', '1.68', '1.71', '1.89', '1.79'],
['65.4', '59.2', '63.6', '88.4', '68.7']], dtype='<U32')
Subsetting
#         0      1      2      3      4     <- column index
array([[ 1.73,  1.68,  1.71,  1.89,  1.79],   # row 0
       [65.4 , 59.2 , 63.6 , 88.4 , 68.7 ]])  # row 1
np_2d[0]
array([1.73, 1.68, 1.71, 1.89, 1.79])
np_2d[0][2]
1.71
np_2d[0, 2]
1.71
np_2d[:, 1:3]
array([[ 1.68, 1.71],
[59.2 , 63.6 ]])
np_2d[1, :]
array([65.4, 59.2, 63.6, 88.4, 68.7])
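Putting the 2D subsetting forms together, a sketch that pulls the rows back out of `np_2d` and reuses them. It assumes NumPy is installed; the variable names `heights`, `weights`, and `third_person` are illustrative.

```python
import numpy as np

np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
                  [65.4, 59.2, 63.6, 88.4, 68.7]])

# [row, column]: a colon selects everything along that axis
heights = np_2d[0, :]        # first row
weights = np_2d[1, :]        # second row
third_person = np_2d[:, 2]   # third column: height and weight of one person

print(np_2d.shape)           # (2, 5)
print(third_person)

# Rows are ordinary 1D arrays, so element-wise arithmetic still applies
bmi = weights / heights ** 2
print(bmi)
```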
NumPy: Basic Statistics
Data analysis
Get to know your data
Little data -> simply look at it
Big data -> ?
City-wide survey
import numpy as np
np_city = ... # Implementation left out
np_city
array([[1.64, 71.78],
[1.37, 63.35],
[1.6 , 55.09],
...,
[2.04, 74.85],
[2.04, 68.72],
[2.01, 73.57]])
NumPy
np.mean(np_city[:, 0])
1.7472
np.median(np_city[:, 0])
1.75
NumPy
np.corrcoef(np_city[:, 0], np_city[:, 1])
array([[ 1. , -0.01802],
[-0.01803, 1. ]])
np.std(np_city[:, 0])
0.1992
sum(), sort(), ...
Enforce single data type: speed!
Generate data
Arguments for np.random.normal()
distribution mean
distribution standard deviation
number of samples
height = np.round(np.random.normal(1.75, 0.20, 5000), 2)
weight = np.round(np.random.normal(60.32, 15, 5000), 2)
np_city = np.column_stack((height, weight))
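The generated data can be fed straight into the summary functions from the earlier slides. This sketch assumes NumPy is installed and adds a call to `np.random.seed()`, with an arbitrarily chosen seed of 42, so that repeated runs give the same numbers; the seed is not part of the slides, and the exact statistics will differ from the ones shown there.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed, added only for repeatable output

# Arguments: distribution mean, standard deviation, number of samples
height = np.round(np.random.normal(1.75, 0.20, 5000), 2)
weight = np.round(np.random.normal(60.32, 15, 5000), 2)

# Stack the two 1D arrays as columns of a 5000 x 2 array
np_city = np.column_stack((height, weight))
print(np_city.shape)  # (5000, 2)

# Summary statistics of the height column
print(np.mean(np_city[:, 0]))
print(np.median(np_city[:, 0]))
print(np.std(np_city[:, 0]))
```

With 5000 samples, the mean and standard deviation should land close to the 1.75 and 0.20 passed to `np.random.normal()`.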