CS480 Lecture November 14th
2
Plan for Today
Casual introduction to Machine Learning
3
Main Machine Learning Categories
Supervised learning Unsupervised learning Reinforcement learning
4
ANN for Classification
features weights weights weights label
5
ANN for Regression
features weights weights weights prediction
6
ANN for Regression: Used Car Price
Used car price predictor: train it first on (used car data, price) pairs.
features weights weights weights prediction
model
age
mileage
7
k = 25 Nearest Neighbors
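A minimal sketch of a k-nearest-neighbors classifier, assuming Euclidean distance and a plurality vote among the k closest training examples. The name `knn_classify` and the toy points are made up for illustration. The slide uses k = 25; the toy calls use a smaller k because there are only five points.

```python
from collections import Counter


def knn_classify(query, examples, k):
    """Classify `query` by a plurality vote among its k nearest examples.

    `examples` is a list of (feature_vector, label) pairs.
    Squared Euclidean distance is enough for ranking neighbors.
    """
    by_distance = sorted(
        examples,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], query)),
    )
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]


# Toy data, invented for this sketch.
toy_examples = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"),
]
near_origin = knn_classify((0.2, 0.2), toy_examples, k=3)
near_far_group = knn_classify((5, 5.5), toy_examples, k=1)
```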
8
Classifier Evaluation: Confusion Matrix
Predicted class
Positive Negative
𝑻𝑷 𝑭𝑵
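As a rough illustration, the confusion matrix of a binary classifier can be tallied as below. Only TP and FN survive in the slide text; FP (false positives) and TN (true negatives) are the standard remaining two cells. The labels in the example are made up.

```python
def confusion_matrix(actual, predicted, positive=1):
    """Count TP, FN, FP, TN for a binary classifier."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1          # actual positive, predicted positive
        elif a == positive:
            fn += 1          # actual positive, predicted negative
        elif p == positive:
            fp += 1          # actual negative, predicted positive
        else:
            tn += 1          # actual negative, predicted negative
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Invented labels (1 = positive, 0 = negative), for illustration only.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_matrix(actual, predicted)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
```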
9
Training / Validation / Test Sets
In order to create the best model possible, given
some (relatively large) data set, we should divide it
into:
training set: to train candidate models
validation set: to evaluate candidate models and
pick the best one
test set: to do the final evaluation of the model
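One possible way to make the three-way split, sketched with the standard library only. The 60/20/20 proportions and the name `split_dataset` are assumptions for this example; the slide does not prescribe any ratio.

```python
import random


def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle rows, then cut them into train / validation / test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]                  # fit candidate models
    val = shuffled[n_train:n_train + n_val]     # compare candidates, pick the best
    test = shuffled[n_train + n_val:]           # final evaluation only
    return train, val, test


train_rows, val_rows, test_rows = split_dataset(range(100))
```

The test set is touched once, after the model has been chosen, so that the final score is not biased by the selection step.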
10
K-Fold Cross-Validation
Validation
Train Validate Score
4-fold cross-validation
Train Train Train Validate ScoreA
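A minimal sketch of how the folds could be generated: the data is cut into k blocks, and each block serves as the validation set exactly once while the rest is used for training. Averaging the per-fold scores is the usual way to combine them and is assumed here; `train_and_score` is a hypothetical callback supplied by the caller.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


def cross_validate(n_samples, k, train_and_score):
    """Average the validation score returned by train_and_score over k folds."""
    scores = [train_and_score(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / k


folds = list(k_fold_indices(8, 4))  # 4-fold split of 8 samples
```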
11
Ensemble Learning
In ensemble learning we create a collection
(an ensemble) of hypotheses (models) h1, h2, ..., hN
and combine their predictions by averaging, voting,
or another level of machine learning. The individual
hypotheses (models) are the base models, and their
combination is the ensemble model.
Bagging
Boosting
Random Trees
etc.
12
Bagging: Regression
In bagging we generate K training sets by sampling
with replacement from the original training set.
Train (M data points) -> Model 1 | h1
Train (M data points) -> Model 2 | h2
Train (M data points) -> Model 3 | h3
....
Train (M data points) -> Model K | hK
Output: $h(\mathbf{x}) = \frac{1}{K} \sum_{i=1}^{K} h_i(\mathbf{x})$
Bagging tends to reduce variance and helps with smaller data sets.
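A small sketch of bagging for regression: K bootstrap samples of size M are drawn with replacement, one base model is trained per sample, and the ensemble prediction is the average of the K base predictions. The base model here (a least-squares line on 1-D data) and the noiseless toy data are assumptions chosen to keep the example short.

```python
import random


def fit_line(points):
    """Base model: least-squares line y = a*x + b through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0:  # degenerate sample (all x equal): fall back to a constant
        return lambda x: mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def bagging_regressor(points, K, seed=0):
    """Train K base models on bootstrap samples and average their outputs."""
    rng = random.Random(seed)
    M = len(points)
    # rng.choices samples WITH replacement, so each sample has M points.
    models = [fit_line(rng.choices(points, k=M)) for _ in range(K)]
    return lambda x: sum(h(x) for h in models) / K


# Toy data lying exactly on y = 2x + 1 (invented for this sketch).
toy_points = [(x, 2 * x + 1) for x in range(10)]
ensemble = bagging_regressor(toy_points, K=5)
```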
13
Bagging: Classification
In bagging we generate K training sets by sampling
with replacement from the original training set.
Train (M data points) -> Model 1 | h1
Train (M data points) -> Model 2 | h2
Train (M data points) -> Model 3 | h3
....
Train (M data points) -> Model K | hK
Output: plurality vote over h1, ..., hK
Bagging tends to reduce variance and helps with smaller data sets.
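For classification, the combining step is a plurality vote over the base models' predicted labels. The sketch below shows only that step; the three threshold "models" are hypothetical stand-ins for classifiers trained on bootstrap samples.

```python
from collections import Counter


def plurality_vote(predictions):
    """Return the label predicted by the largest number of base models.

    Ties go to the label that was seen first (Counter keeps insertion order).
    """
    return Counter(predictions).most_common(1)[0][0]


def bagged_classify(models, x):
    """Ask every base model for a label, then take the plurality vote."""
    return plurality_vote([h(x) for h in models])


# Hypothetical base models: each one thresholds the input differently.
toy_models = [lambda x: x > 0, lambda x: x > 1, lambda x: x > 2]
vote = bagged_classify(toy_models, 1.5)  # votes: True, True, False
```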
14
scikit-learn Algorithm Cheat Sheet
Source: https://scikit-learn.org/stable/tutorial/machine_learning_map/index.html
15
Distance Measures
Source: https://towardsdatascience.com/9-distance-measures-in-data-science-918109d069fa
16
Practical ML: Feature Engineering
One-hot encoding
red = [1, 0, 0]
yellow = [0, 1, 0]
green = [0, 0, 1]
Binning / Bucketing
Normalization
Dealing with missing data / features
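The first three techniques in the list above can be sketched in a few lines. The one-hot example reuses the red / yellow / green encoding from the slide; min-max scaling is only one common form of normalization, and the bin edges are invented for illustration.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector, one slot per category."""
    return [1 if value == c else 0 for c in categories]


def bin_index(value, edges):
    """Binning: return the bucket number, i.e. how many edges are <= value."""
    return sum(value >= e for e in edges)


def min_max_normalize(values):
    """Normalization (min-max variant): rescale values into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


colors = ["red", "yellow", "green"]
yellow = one_hot("yellow", colors)          # [0, 1, 0], as on the slide
bucket = bin_index(35, [18, 40, 65])        # hypothetical age buckets
scaled = min_max_normalize([10, 20, 30])
```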
17
Unsupervised Learning
18
What is Unsupervised Learning?
Idea:
Unsupervised learning involves finding underlying
patterns within data. Typically used in clustering
data points (similar customers, etc.).
In other words:
there is some structure (groups / clusters) in
data (for example: customer information)
we don’t know what it is (= no labels!)
unsupervised learning tries to discover it
19
Main Machine Learning Categories
Supervised learning Unsupervised learning Reinforcement learning
20
Unsupervised Learning:
K-Means Clustering
21
K-Means Clustering: The Idea
Source: https://stanford.edu/~cpiech/cs221/handouts/kmeans.html
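A minimal sketch of the standard K-means loop, assuming squared Euclidean distance and initial centroids picked at random from the data. It alternates two steps until the centroids stop moving: assign every point to its nearest centroid, then move each centroid to the mean of its assigned points. The toy points are invented.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, iterations=100, seed=0):
    """Return (centroids, clusters) after running K-means on `points`."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct data points as a start
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            mean_point(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters


# Two well-separated toy groups (made up for this sketch).
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = k_means(toy_points, k=2)
```

No labels are used anywhere: the grouping comes only from the distances between points.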
22
Exercise: K-Means Clustering
https://lalejini.com/my_empirical_examples/KMeansClusteringExample/web/kmeans_clustering.html
23
3D K-Means Clustering Visualized
Source: https://github.com/Gautam-J/Machine-Learning
24
Where Would You Use Clustering?
25
Neural Networks Revisited
26
Basic Neural Unit
feature vector (input layer)
weights
bias
weighted sum
nonlinear activation function (can differ for each layer!)
output
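The unit described above can be sketched as a single function: a weighted sum of the features plus a bias, passed through a nonlinear activation. Sigmoid and ReLU are used as example activations; the weights in the usage lines are arbitrary.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    return max(0.0, z)


def neuron(features, weights, bias, activation=sigmoid):
    """Output of one neural unit: activation(weighted sum + bias)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return activation(z)


# Arbitrary example values: 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0.0
out_sigmoid = neuron([1.0, 2.0], [0.5, -0.25], 0.0)
# 1.0*1.0 + 1.0*2.0 - 5.0 = -2.0, which ReLU clips to 0.0
out_relu = neuron([1.0, 2.0], [1.0, 1.0], -5.0, activation=relu)
```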
27
ANN as a Complex Function
In ANNs, hypotheses take the form of complex algebraic circuits with
tunable connection strengths (weights).
weights weights weights
28
2 Layer Network
features weights weights output
j
i
29
2 Layer Network
features weights weights output
hidden layer units: activation function f1 | output layer: activation function f2
Activation function f1: sigmoid, tanh, ReLU, etc. | Activation function f2: sigmoid
30
2 Layer Network
features weights weights output
hidden layer units: activation function f1 | output layer: activation function f2
Activation function f1: sigmoid, tanh, ReLU, etc. | Activation function f2: softmax
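A rough sketch of the forward pass through such a 2-layer network, with f1 applied to the hidden layer and f2 to the output layer. Sigmoid as f2 is typically paired with a single output for binary classification, while softmax turns several outputs into class probabilities. ReLU as f1 and all weight values below are arbitrary choices for the example.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid_layer(zs):
    """f2 option 1: element-wise sigmoid (e.g. one output for a binary label)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in zs]


def softmax(zs):
    """f2 option 2: turn raw outputs into probabilities that sum to 1."""
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """weights[j][i] connects input i to unit j of the next layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def two_layer_forward(x, W1, b1, W2, b2, f2=softmax):
    hidden = [relu(z) for z in dense(x, W1, b1)]  # f1 on the hidden layer
    return f2(dense(hidden, W2, b2))              # f2 on the output layer


# Arbitrary weights: 2 features -> 2 hidden units -> 3 classes.
W1, b1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]
W2, b2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0]
probs = two_layer_forward([2.0, 1.0], W1, b1, W2, b2)
```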
31
Multilayer Neural Net: Notation
32
Training Neural Networks
33
Back-propagation
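Three steps: Feed forward | Evaluate Loss | Back-propagation
Figure labels: inputs x, y; weights w1, w2; node f(x,y); output z
Loss = z - z_expected
Back-propagated gradients: $\frac{\partial Loss}{\partial z}$, $\frac{\partial Loss}{\partial x}$, $\frac{\partial Loss}{\partial y}$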
Source: https://nikolanews.com/not-just-introduction-to-convolutional-neural-networks-part-1/
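A tiny worked example of the three steps for a single node, under two loudly stated assumptions: the node computes f(x, y) = x * y, and the loss is the squared error 0.5 * (z - z_expected)^2, whose derivative with respect to z is the difference z - z_expected. Neither choice comes from the slide. The gradient with respect to each input then follows from the chain rule.

```python
def forward(x, y):
    """Feed forward. ASSUMPTION: the node computes f(x, y) = x * y."""
    return x * y


def backward(x, y, z_expected):
    """Evaluate the loss, then back-propagate it to the node's inputs."""
    z = forward(x, y)
    # ASSUMPTION: squared-error loss, so dLoss/dz = z - z_expected.
    loss = 0.5 * (z - z_expected) ** 2
    dloss_dz = z - z_expected
    # Chain rule: dLoss/dx = dLoss/dz * dz/dx, and dz/dx = y for f = x * y.
    dloss_dx = dloss_dz * y
    dloss_dy = dloss_dz * x
    return loss, dloss_dx, dloss_dy


loss, grad_x, grad_y = backward(2.0, 3.0, 5.0)  # z = 6.0, one unit too high
```

In a full network the same local rule is applied node by node from the output back to the inputs, and the gradients with respect to the weights are what the training step uses to adjust them.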
35
Deep Learning
36
Deep Learning
Deep learning is a broad family of techniques for
machine learning (also a sub-field of ML) in which
hypotheses take the form of complex algebraic
circuits with tunable connections. The word “deep”
refers to the fact that the circuits are typically
organized into many layers, which means that
computation paths from inputs to outputs have
many steps.
37
Shallow vs. Deep Models
Source: https://www.quora.com/What-is-the-difference-between-deep-learning-and-usual-machine-learning
39
Machine Learning vs. Deep Learning
Source: https://www.intel.com/content/www/us/en/artificial-intelligence/posts/difference-between-ai-machine-learning-deep-learning.html
40
Deep Learning: Feature Extraction
Source: https://en.wikipedia.org/wiki/Deep_learning
41
Convolutional Neural Networks
The name Convolutional Neural Network (CNN) indicates that the
network employs a mathematical operation called convolution.
CNNs can reduce images (data grids) into a form which is easier to
process without losing features that are critical for getting a good
prediction.
42
Convolutional Neural Networks
Flattening
Pooling
43
Convolution: The Idea
3 x 3 Kernel / Filter
Source: https://commons.wikimedia.org/wiki/File:Convolutional_Neural_Network_NeuralNetworkFilter.gif
44
Kernel / Filter: The Idea
3 x 3 Kernel / Filter
Source: https://commons.wikimedia.org/wiki/File:Convolution_arithmetic_-_Padding_strides.gif
45
Convolving Matrices
Convolution (and Convolutional Neural Networks) can be applied
to any grid-like data (tensors: matrices, vectors, etc.).
kernel         data         "overlay"
0 1 0          0 2 3        0*0  1*2  0*3
1 1 1   conv   2 4 1   ->   1*2  1*4  1*1   sum = 12
0 1 0          0 3 0        0*0  1*3  0*0
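The overlay-and-sum step can be checked in code with the same kernel and data as on the slide. The sketch also slides the kernel over a larger grid, assuming no padding and a stride of 1. As is usual in CNNs, the kernel is not flipped before it is overlaid.

```python
def overlay_sum(kernel, patch):
    """Multiply kernel and patch element by element and add everything up."""
    return sum(
        k * d
        for k_row, d_row in zip(kernel, patch)
        for k, d in zip(k_row, d_row)
    )


def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            overlay_sum(kernel, [row[c:c + kw] for row in image[r:r + kh]])
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


kernel = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
data = [[0, 2, 3], [2, 4, 1], [0, 3, 0]]
result = overlay_sum(kernel, data)  # 2 + 2 + 4 + 1 + 3 = 12
```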
46
Selected Image Processing Kernels
47
Image Processing: Kernels / Filters
48
Applying Kernels / Filters
3 x 3 Kernel / Filter
49
Convolutional NN Kernels
In practice, Convolutional Neural Network kernels can be larger than
3x3 and are learned using back-propagation.
50
Convolution Layer 1
Kernel 1
51
Convolution Layer 1
Kernel 2
Kernel 1
52
Convolution Layer 1
Original image
Kernel 3
Kernel 2
Kernel 1
Convolution 1
53
Convolutional Neural Networks
54
Max Pooling Layer
Convolution 1
Max Pooling
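A minimal sketch of max pooling, assuming non-overlapping 2 x 2 windows (the window size is an assumption; the slide does not state it). Each window of the convolution output is replaced by its largest value, which shrinks the grid while keeping the strongest responses. The feature map values are invented.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping size x size max pooling (stride equal to size)."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Invented 4 x 4 feature map, pooled down to 2 x 2.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool2d(feature_map)
```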
55
Convolutional Neural Networks
56
Convolution Layer 2
Original convolution after pooling
Kernel C
Kernel B
Kernel A
Convolution A
57
Convolutional Neural Networks
58
Flattening
Final output of convolution layers is “flattened” to become a vector of features.
Convert to
vector
Source: https://nikolanews.com/not-just-introduction-to-convolutional-neural-networks-part-1/
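Flattening itself is a simple reshaping step. A sketch, assuming the convolution layers produce a list of 2-D feature maps (the nested-list layout is an assumption of this example):

```python
def flatten(feature_maps):
    """Concatenate all values of all 2-D feature maps into one feature vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]


# Two invented 2 x 2 feature maps become one vector of 8 features.
feature_vector = flatten([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
```

The resulting vector is what the fully connected part of the network receives as its features.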
59
Recurrent Neural Networks
Recurrent Neural Networks (RNNs) allow cycles in the computational graph
(network). A network node (unit) can take its own output from an earlier step as
input (with a delay introduced).
This gives the network an internal state / memory: inputs received earlier affect
the RNN's response to the current input.
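A minimal sketch of a single recurrent unit with scalar input and scalar state. The tanh activation and the weight values are assumptions for illustration. The unit's previous output is fed back as an extra input at the next step, so an earlier input still influences later outputs.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """New state from the current input x and the unit's own previous output."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs, w_x=1.0, w_h=0.5, b=0.0, h0=0.0):
    """Feed a sequence through the unit, carrying the state from step to step."""
    h = h0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# The second input is 0.0 in both runs, yet the responses differ,
# because the first run still remembers the earlier input 1.0.
with_history = run_rnn([1.0, 0.0])
without_history = run_rnn([0.0, 0.0])
```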
60
Transfer Learning
In transfer learning, experience with one
learning task helps an agent learn better on
another task.
61
Reinforcement Learning (RL)
62
What is Reinforcement Learning?
Idea:
Reinforcement learning is inspired by behavioral
psychology. It is based on rewarding / punishing
an algorithm for the actions it takes.
63
Reinforcement Learning in Action
64
Reinforcement Learning in Action
Source: https://www.youtube.com/watch?v=x4O8pojMF0w
65
Reinforcement Learning in Action
Source: https://www.youtube.com/watch?v=kopoLzvh5jY
66
Reinforcement Learning in Action
Source: https://www.youtube.com/watch?v=Tnu4O_xEmVk
67
ANN for Simple Game Playing
Game state -> ANN -> UP / DOWN / JUMP
68
ANN for Simple Game Playing
The current game state is the input. Decisions (UP/DOWN/JUMP) are rewarded/punished.
Game state -> ANN -> UP / DOWN / JUMP
69
RL: Agents and Environments
State
What’s inside?
Reward
Action
Environment
70
RL: Agents and Environments
State
Reward
Action
Environment
71
RL: Agents and Environments
State
Agent
Reward
Action
Environment
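The loop in the diagram (the agent sends an action, the environment answers with a new state and a reward) can be sketched as follows. Everything specific here is an assumption made for illustration: `CorridorEnv` is a hypothetical toy environment, and tabular Q-learning with epsilon-greedy exploration is just one common choice of agent, not something the slides specify.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: positions 0..4, reward 1.0 on reaching 4."""
    GOAL = 4

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        """action is -1 (left) or +1 (right); returns (state, reward, done)."""
        self.state = min(self.GOAL, max(0, self.state + action))
        done = self.state == self.GOAL
        return self.state, (1.0 if done else 0.0), done


class QAgent:
    """Tabular Q-learning agent with epsilon-greedy action selection."""
    ACTIONS = (-1, 1)

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
        self.q = {}  # (state, action) -> estimated long-term reward
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def value(self, state, action):
        return self.q.get((state, action), 0.0)

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.ACTIONS)  # explore
        best = max(self.value(state, a) for a in self.ACTIONS)
        # Break ties at random so an untrained agent does not get stuck.
        return self.rng.choice([a for a in self.ACTIONS if self.value(state, a) == best])

    def learn(self, state, action, reward, next_state, done):
        future = 0.0 if done else max(self.value(next_state, a) for a in self.ACTIONS)
        target = reward + self.gamma * future
        old = self.value(state, action)
        self.q[(state, action)] = old + self.alpha * (target - old)


def train(episodes=200, max_steps=100):
    env, agent = CorridorEnv(), QAgent()
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            action = agent.act(state)                       # Agent -> Action
            next_state, reward, done = env.step(action)     # Environment -> State, Reward
            agent.learn(state, action, reward, next_state, done)
            state = next_state
            if done:
                break
    return agent


agent = train()
```

After training, the agent should value moving right (toward the reward) above moving left in every non-goal position.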
72