
CS480 Lecture November 14th

This document provides an overview of machine learning topics covered in a CS 480 Introduction to Artificial Intelligence course. It includes announcements about assignments due, a plan to introduce machine learning categories including supervised learning, unsupervised learning and reinforcement learning. Main topics covered are artificial neural networks for classification and regression, k-nearest neighbors, evaluation metrics like confusion matrices, training/validation/test sets, and cross-validation. Ensemble methods like bagging and boosting are introduced. The document also provides an introduction to unsupervised learning techniques like k-means clustering along with examples and visualizations.

Uploaded by

Rajeswari

CS 480

Introduction to Artificial Intelligence

November 14, 2023


Announcements / Reminders
• Please follow the Week 12 To Do List instructions (if you haven't already)
• Written Assignment #04 due on Tuesday (11/28/23) 11:59 PM CST
• Programming Assignment #02 due on Monday (11/27/23) 11:59 PM CST
• Final Exam date:
  – Thursday 11/30/2023 (last week of classes!)
• Ignore the date provided by the Registrar

2
Plan for Today
• Casual introduction to Machine Learning

3
Main Machine Learning Categories
Supervised learning
Supervised learning is one of the most common techniques in machine learning. It is based on known relationship(s) and patterns within data (for example: relationship between inputs and outputs). Frequently used types: regression and classification.

Unsupervised learning
Unsupervised learning involves finding underlying patterns within data. Typically used in clustering data points (similar customers, etc.).

Reinforcement learning
Reinforcement learning is inspired by behavioral psychology. It is based on rewarding / punishing an algorithm. Rewards and punishments are based on the algorithm’s actions within its environment.

4
ANN for Classification
[Figure: feed-forward network. Features enter the input layer and pass through weighted connections to two hidden layers and an output layer, which produces the label.]

5
ANN for Regression
[Figure: feed-forward network. Features enter the input layer and pass through weighted connections to two hidden layers and an output layer, which produces the prediction.]

6
ANN for Regression: Used Car Price
Used car price predictor: train it first with used car data - price pairs.

[Figure: the features model, age, and mileage enter the input layer and pass through weighted connections to two hidden layers and an output layer, which produces the price prediction.]

7
k = 25 Nearest Neighbors

8
Classifier Evaluation: Confusion Matrix
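Confusion matrix (rows: actual class, columns: predicted class):

                 | Predicted Positive                | Predicted Negative                 |
Actual Positive  | True Positive (TP)                | False Negative (FN), Type II Error | Sensitivity = $\frac{TP}{TP + FN}$
Actual Negative  | False Positive (FP), Type I Error | True Negative (TN)                 | Specificity = $\frac{TN}{TN + FP}$

Precision = $\frac{TP}{TP + FP}$
Negative Predictive Value = $\frac{TN}{TN + FN}$
Accuracy = $\frac{TP + TN}{TP + TN + FP + FN}$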
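The metrics on this slide can be computed directly from the four cells of the confusion matrix. The sketch below is only an illustration: the function name and the counts are made up, not taken from the lecture.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Compute the slide's metrics from the four confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                # TP / (TP + FN)
        "specificity": tn / (tn + fp),                # TN / (TN + FP)
        "precision": tp / (tp + fp),                  # TP / (TP + FP)
        "npv": tn / (tn + fn),                        # TN / (TN + FN)
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # all correct / all samples
    }

# Hypothetical counts, chosen only for illustration.
metrics = confusion_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the sketch does not guard against empty rows or columns (for example, no predicted positives), which would cause a division by zero.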

9
Training / Validation / Test Sets
In order to create the best model possible, given some (relatively large) data set, we should divide it into:
• training set: to train candidate models
• validation set: to evaluate candidate models and pick the best one
• test set: to do the final evaluation of the model
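One possible way to produce such a three-way split is sketched below. The 60/20/20 proportions, the function name, and the fixed seed are assumptions made for the example; the slide does not prescribe them.

```python
import random

def train_val_test_split(data, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the data, then cut it into training, validation, and test sets."""
    items = list(data)
    random.Random(seed).shuffle(items)  # shuffle so each set is a random sample
    n_test = int(len(items) * test_fraction)
    n_val = int(len(items) * val_fraction)
    test_set = items[:n_test]
    val_set = items[n_test:n_test + n_val]
    train_set = items[n_test + n_val:]
    return train_set, val_set, test_set

# Hypothetical data set of 100 sample ids.
train_set, val_set, test_set = train_val_test_split(range(100))
print(len(train_set), len(val_set), len(test_set))
```

The test set is set aside once and used only for the final evaluation, so it never influences which candidate model is picked.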

10
K-Fold Cross-Validation
Validation (single split):
Train | Validate → Score

4-fold cross-validation:
Train | Train | Train | Validate → ScoreA
Train | Train | Validate | Train → ScoreB
Train | Validate | Train | Train → ScoreC
Validate | Train | Train | Train → ScoreD

$$\text{Score} = \frac{\text{ScoreA} + \text{ScoreB} + \text{ScoreC} + \text{ScoreD}}{4}$$
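The fold rotation and the final averaging can be sketched as follows. This is a minimal illustration; `train_and_score` is a hypothetical callback standing in for "train a model on the training indices and score it on the validation indices".

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds get one extra sample.
        stop = start + fold_size + (1 if fold < remainder else 0)
        validation = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, validation
        start = stop

def cross_validation_score(n_samples, k, train_and_score):
    """Average the k per-fold scores, as in the Score formula."""
    scores = [train_and_score(train, val) for train, val in k_fold_splits(n_samples, k)]
    return sum(scores) / k

def dummy_score(train, val):
    """Placeholder scorer: the fraction of samples held out for validation."""
    return len(val) / (len(train) + len(val))

splits = list(k_fold_splits(8, 4))
score = cross_validation_score(8, 4, dummy_score)
print(splits[0], score)
```

Every sample is used for validation exactly once and for training in the other k - 1 folds.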

11
Ensemble Learning
In ensemble learning we create a collection (an ensemble) of hypotheses (models) h1, h2, ..., hN and combine their predictions by averaging, voting, or another level of machine learning. Individual hypotheses (models) are base models and their combination is the ensemble model.
• Bagging
• Boosting
• Random Trees
• etc.

12
Bagging: Regression
In bagging we generate K training sets by sampling with replacement from the original training set.

[Figure: K training sets (M data points each) are each used to train one model: Model 1 | h1, Model 2 | h2, Model 3 | h3, ..., Model K | hK. The output is the average of the K model predictions.]

$$h(\mathbf{x}) = \frac{1}{K} \sum_{i=1}^{K} h_i(\mathbf{x})$$

Bagging tends to reduce variance and helps with smaller data sets.
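A minimal sketch of bagging for regression is shown below. The base model (one that always predicts the mean target of its bootstrap sample) is a deliberately trivial stand-in chosen for this example; the data and all names are hypothetical.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) points from data, sampling with replacement."""
    return [rng.choice(data) for _ in data]

def fit_mean_model(sample):
    """Toy base model: always predicts the mean target of its training sample."""
    mean_y = sum(y for _, y in sample) / len(sample)
    return lambda x: mean_y

def bagging_regressor(data, k, seed=0):
    """Train k base models on bootstrap samples and average their predictions."""
    rng = random.Random(seed)
    models = [fit_mean_model(bootstrap_sample(data, rng)) for _ in range(k)]
    return lambda x: sum(h_i(x) for h_i in models) / k

# Hypothetical (x, y) training pairs.
data = [(1, 2.0), (2, 4.0), (3, 6.0), (4, 8.0)]
h = bagging_regressor(data, k=25)
print(h(0))
```

In practice the base models would be real learners (for example, decision trees); only the sampling and averaging steps are the point here.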

13
Bagging: Classification
In bagging we generate K training sets by sampling with replacement from the original training set.

[Figure: K training sets (M data points each) are each used to train one model: Model 1 | h1, Model 2 | h2, Model 3 | h3, ..., Model K | hK. The output is the plurality vote of the K model predictions.]

Bagging tends to reduce variance and helps with smaller data sets.
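For classification, the combining step is a plurality vote rather than an average. A small sketch of that step alone, with three hypothetical threshold classifiers standing in for trained models:

```python
from collections import Counter

def plurality_vote(models, x):
    """Return the class predicted by the largest number of models."""
    votes = Counter(h_i(x) for h_i in models)
    return votes.most_common(1)[0][0]

# Hypothetical base models: each one is a simple threshold rule.
models = [lambda x: x > 1, lambda x: x > 2, lambda x: x > 3]
prediction = plurality_vote(models, 2.5)
print(prediction)
```

When two classes receive the same number of votes, this sketch returns the one that was counted first; a real implementation would need an explicit tie-breaking rule.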

14
scikit-learn Algorithm Cheat Sheet

Source: https://scikit-learn.org/stable/tutorial/machine_learning_map/index.html

15
Distance Measures

Source: https://towardsdatascience.com/9-distance-measures-in-data-science-918109d069fa

16
Practical ML: Feature Engineering
• One-hot encoding
    red = [1, 0, 0]
    yellow = [0, 1, 0]
    green = [0, 0, 1]
• Binning / Bucketing
• Normalization
• Dealing with missing data / features
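Each of the four techniques listed above can be sketched in a few lines. The helper names, the bin edges, and the choice of min-max scaling and mean imputation are assumptions made for this example; the slide does not say which variants were intended.

```python
def one_hot(value, categories):
    """One-hot encoding: 1 in the position of the matching category, 0 elsewhere."""
    return [1 if value == c else 0 for c in categories]

def bin_index(value, edges):
    """Binning: index of the bucket that value falls into, given sorted upper edges."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)

def min_max_normalize(values):
    """Normalization: rescale values linearly into [0, 1] (assumes max > min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def fill_missing_with_mean(values):
    """Missing data: replace None with the mean of the values that are present."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

print(one_hot("yellow", ["red", "yellow", "green"]))
print(bin_index(35, [18, 40, 65]))
print(min_max_normalize([10, 20, 30]))
print(fill_missing_with_mean([1.0, None, 3.0]))
```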

17
Unsupervised Learning

18
What is Unsupervised Learning?
Idea:
Unsupervised learning involves finding underlying patterns within data. Typically used in clustering data points (similar customers, etc.).

In other words:
• there is some structure (groups / clusters) in data (for example: customer information)
• we don’t know what it is (= no labels!)
• unsupervised learning tries to discover it
19
Main Machine Learning Categories
Supervised learning
Supervised learning is one of the most common techniques in machine learning. It is based on known relationship(s) and patterns within data (for example: relationship between inputs and outputs). Frequently used types: regression and classification.

Unsupervised learning
Unsupervised learning involves finding underlying patterns within data. Typically used in clustering data points (similar customers, etc.).

Reinforcement learning
Reinforcement learning is inspired by behavioral psychology. It is based on rewarding / punishing an algorithm. Rewards and punishments are based on the algorithm’s actions within its environment.

20
Unsupervised Learning:
K-Means Clustering

21
K-Means Clustering: The Idea

Source: https://stanford.edu/~cpiech/cs221/handouts/kmeans.html
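The slide's figure is not reproduced here. As a rough sketch of the standard k-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its assigned points, and repeat), assuming hand-picked initial centroids and made-up 2D points:

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and updating centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical data: two visibly separated groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
print(centroids)
```

A fixed iteration count is used for simplicity; implementations usually stop once the assignments no longer change, and the result depends on how the initial centroids are chosen.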

22
Exercise: K-Means Clustering
https://lalejini.com/my_empirical_examples/KMeansClusteringExample/web/kmeans_clustering.html

23
3D K-Means Clustering Visualized

Source: https://github.com/Gautam-J/Machine-Learning

24
Where Would You Use Clustering?

25
Neural Networks Revisited

26
Basic Neural Unit

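[Figure: basic neural unit. The feature vector (input layer) is multiplied by the weights, the bias is added to form the weighted sum, and a nonlinear activation function (can differ for each layer!) produces the output.]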
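The unit described on this slide can be sketched as a weighted sum plus bias passed through an activation function. Sigmoid is used here only as one example of a nonlinear activation; the input values and weights are made up.

```python
import math

def sigmoid(z):
    """One common nonlinear activation function."""
    return 1.0 / (1.0 + math.exp(-z))

def neural_unit(features, weights, bias, activation=sigmoid):
    """Weighted sum of the features plus bias, passed through the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, features)) + bias
    return activation(weighted_sum)

# Hypothetical inputs: 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0, and sigmoid(0) = 0.5.
output = neural_unit([1.0, 2.0], [0.5, -0.25], bias=0.0)
print(output)
```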

27
ANN as a Complex Function
In ANNs, hypotheses take the form of complex algebraic circuits with tunable connection strengths (weights).

[Figure: input layer, two hidden layers, and output layer, with weights on the connections between consecutive layers.]

28
2 Layer Network
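[Figure: two-layer network. Features enter the input layer and pass through weights to the hidden layer and then through weights to the output layer, which produces the output. Units are labeled with indices i and j.]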

29
2 Layer Network
[Figure: two-layer network. Features enter the input layer; each of the four hidden-layer units applies activation function f1, and the output layer applies activation function f2.]

Activation function f1: sigmoid, tanh, ReLU, etc. | Activation function f2: sigmoid

30
2 Layer Network
[Figure: two-layer network. Features enter the input layer; each of the four hidden-layer units applies activation function f1, and the output layer applies activation function f2.]

Activation function f1: sigmoid, tanh, ReLU, etc. | Activation function f2: softmax
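A forward pass through such a two-layer network might look like the sketch below, taking ReLU as f1 and softmax as f2. The layer sizes (2 features, 4 hidden units, 3 outputs) and all weight values are arbitrary choices for the example.

```python
import math

def relu(z):
    """Example choice for the hidden activation f1."""
    return max(0.0, z)

def softmax(zs):
    """Output activation f2: turns raw scores into probabilities that sum to 1."""
    m = max(zs)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def layer(inputs, weights, biases):
    """Weighted sums for one fully connected layer (one weight row per unit)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]

def two_layer_forward(features, w_hidden, b_hidden, w_out, b_out):
    hidden = [relu(z) for z in layer(features, w_hidden, b_hidden)]
    return softmax(layer(hidden, w_out, b_out))

# Hypothetical weights, chosen only to make the example run.
w_hidden = [[0.1, 0.2], [-0.3, 0.4], [0.5, -0.6], [0.7, 0.8]]
b_hidden = [0.0, 0.0, 0.0, 0.0]
w_out = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1], [-0.1, 0.0, 0.1, 0.2]]
b_out = [0.0, 0.0, 0.0]
probabilities = two_layer_forward([1.0, 2.0], w_hidden, b_hidden, w_out, b_out)
print(probabilities)
```

With a single output unit and sigmoid as f2 (the previous slide's variant), the last step would apply sigmoid to one weighted sum instead of softmax to several.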

31
Multilayer Neural Net: Notation

32
Training Neural Networks

33
Back-propagation
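Feed forward: feed a labeled sample through the network.

Evaluate Loss: how “incorrect” is the result compared to the label? $\text{Loss} = z - z_{\text{expected}}$

Back-propagation: propagate $\frac{\partial Loss}{\partial z}$, $\frac{\partial Loss}{\partial x}$, and $\frac{\partial Loss}{\partial y}$ back through the node and update the weights (use Gradient Descent).

[Figure: a node f(x, y) with inputs x and y, weights w1 and w2, and output z, shown once for each of the three steps.]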
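The three steps can be sketched for a single linear node z = w1·x + w2·y. Two assumptions are made for this illustration: the loss is the squared error (z - z_expected)², which is a common choice but not the expression written on the slide, and the sample values and learning rate are made up.

```python
def train_step(w1, w2, x, y, z_expected, learning_rate=0.05):
    """One feed-forward / loss / back-propagation / gradient-descent cycle."""
    z = w1 * x + w2 * y                   # feed forward
    loss = (z - z_expected) ** 2          # evaluate loss (assumed squared error)
    dloss_dz = 2 * (z - z_expected)       # back-propagation: dLoss/dz
    w1 -= learning_rate * dloss_dz * x    # chain rule: dLoss/dw1 = dLoss/dz * x
    w2 -= learning_rate * dloss_dz * y    # chain rule: dLoss/dw2 = dLoss/dz * y
    return w1, w2, loss

# Hypothetical labeled sample: inputs (1.0, 2.0) with expected output 3.0.
w1, w2 = 0.0, 0.0
losses = []
for _ in range(50):
    w1, w2, loss = train_step(w1, w2, x=1.0, y=2.0, z_expected=3.0)
    losses.append(loss)
print(losses[0], losses[-1])
```

Repeating the cycle moves the weights in the direction that reduces the loss, so the recorded loss shrinks from one step to the next.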
34
Digit Image as ANN Feature Set
Individual features need to be “extracted” from an image. An image is numbers.

Source: https://nikolanews.com/not-just-introduction-to-convolutional-neural-networks-part-1/

35
Deep Learning

36
Deep Learning
Deep learning is a broad family of techniques for
machine learning (also a sub-field of ML) in which
hypotheses take the form of complex algebraic
circuits with tunable connections. The word “deep”
refers to the fact that the circuits are typically
organized into many layers, which means that
computation paths from inputs to outputs have
many steps.

37
Shallow vs. Deep Models

[Figure: two shallow models and one deep model; the deep model has a longer computation path.]
38
Machine Learning vs. Deep Learning

Source: https://www.quora.com/What-is-the-difference-between-deep-learning-and-usual-machine-learning

39
Machine Learning vs. Deep Learning

Source: https://www.intel.com/content/www/us/en/artificial-intelligence/posts/difference-between-ai-machine-learning-deep-learning.html

40
Deep Learning: Feature Extraction

Source: https://en.wikipedia.org/wiki/Deep_learning

41
Convolutional Neural Networks
The name Convolutional Neural Network (CNN) indicates that the network employs a mathematical operation called convolution.

Convolutional networks are a specialized type of neural network that use convolution in place of general matrix multiplication in at least one of their layers.

A CNN is able to successfully capture the spatial dependencies in an image (data grid) through the application of relevant filters.

CNNs can reduce images (data grids) into a form which is easier to process without losing features that are critical for getting a good prediction.

42
Convolutional Neural Networks
[Figure: CNN architecture diagram with the Pooling and Flattening stages labeled.]

By Aphex34 - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=45679374

43
Convolution: The Idea

3 x 3 Kernel / Filter

Source: https://commons.wikimedia.org/wiki/File:Convolutional_Neural_Network_NeuralNetworkFilter.gif

44
Kernel / Filter: The Idea

3 x 3 Kernel / Filter

Source: https://commons.wikimedia.org/wiki/File:Convolution_arithmetic_-_Padding_strides.gif

45
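Convolving Matrices
Convolution (and Convolutional Neural Networks) can be applied to any grid-like data (tensors: matrices, vectors, etc.).

$$
\text{kernel} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\quad \text{conv} \quad
\text{data} = \begin{bmatrix} 0 & 2 & 3 \\ 2 & 4 & 1 \\ 0 & 3 & 0 \end{bmatrix}
$$

“Overlay” the kernel on the data, multiply element-wise, and sum:

$$
\begin{bmatrix} 0 \cdot 0 & 1 \cdot 2 & 0 \cdot 3 \\ 1 \cdot 2 & 1 \cdot 4 & 1 \cdot 1 \\ 0 \cdot 0 & 1 \cdot 3 & 0 \cdot 0 \end{bmatrix}
\;\xrightarrow{\text{sum}}\; 12
$$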
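The overlay-multiply-sum step from this slide can be reproduced in a few lines, and sliding it across a larger grid gives a full convolution layer output. The function names are made up for this sketch, which uses no padding and a stride of 1.

```python
def overlay_sum(kernel, patch):
    """Multiply the kernel and an equally sized patch element-wise, then sum."""
    return sum(k * d for k_row, d_row in zip(kernel, patch) for k, d in zip(k_row, d_row))

def convolve2d(image, kernel):
    """Slide the kernel over every position where it fits entirely inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            patch = [image_row[c:c + kw] for image_row in image[r:r + kh]]
            row.append(overlay_sum(kernel, patch))
        out.append(row)
    return out

# The kernel and data from the slide.
kernel = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
data = [[0, 2, 3], [2, 4, 1], [0, 3, 0]]
print(overlay_sum(kernel, data))  # 12, matching the slide
print(convolve2d(data, kernel))   # [[12]]: a 3x3 kernel fits a 3x3 grid in one position
```

As is usual in CNN code, the kernel is applied without flipping it; for the symmetric kernel on the slide this makes no difference.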

46
Selected Image Processing Kernels

[Figure: example image processing kernels: Sharpen, Mean Blur, Gaussian Blur, Laplacian, Prewitt (Edge), Prewitt (Edge).]

47
Image Processing: Kernels / Filters

48
Applying Kernels / Filters

3 x 3 Kernel / Filter

49
Convolutional NN Kernels
In practice, Convolutional Neural Network kernels can be larger than 3x3 and are learned using back-propagation.

[Figure: learned kernels for Convolution Layer 1, Convolution Layer 2, and Convolution Layer 3.]

50
Convolution Layer 1

Kernel 1

51
Convolution Layer 1

Kernel 2
Kernel 1

52
Convolution Layer 1

Original image
Kernel 3

Kernel 2
Kernel 1

Convolution 1

53
Convolutional Neural Networks

By Aphex34 - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=45679374

54
Max Pooling Layer
Convolution 1

Max Pooling
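The slide's figure is not reproduced here. As a rough sketch of what a max pooling layer typically does (keep only the largest value in each window of the convolution output), assuming non-overlapping 2 x 2 windows and a made-up feature map:

```python
def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# Hypothetical 4 x 4 convolution output.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

The output is a quarter of the size of the input, which is how pooling reduces the data grid while keeping the strongest responses.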

55
Convolutional Neural Networks

By Aphex34 - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=45679374

56
Convolution Layer 2

Original convolution
after pooling Kernel C

Kernel B
Kernel A

Convolution A

57
Convolutional Neural Networks

By Aphex34 - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=45679374

58
Flattening
Final output of convolution layers is “flattened” to become a vector of features.

Convert to
vector

Final convolution layer output

Source: https://nikolanews.com/not-just-introduction-to-convolutional-neural-networks-part-1/
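Flattening amounts to reading the final feature maps out value by value into one list. A minimal sketch, with two made-up 2 x 2 feature maps standing in for the final convolution layer output:

```python
def flatten(feature_maps):
    """Concatenate all values of all feature maps into a single feature vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]

# Hypothetical final convolution layer output: two 2 x 2 feature maps.
feature_maps = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
feature_vector = flatten(feature_maps)
print(feature_vector)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

The resulting vector can then be fed to ordinary fully connected layers.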

59
Recurrent Neural Networks
Recurrent Neural Networks (RNNs) allow cycles in the computational graph (network). A network node (unit) can take its own output from an earlier step as input (with delay introduced).
This enables internal state / memory: inputs received earlier affect the RNN’s response to the current input.
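The effect of feeding a unit's earlier output back in can be seen in a one-unit sketch. The scalar weights, the tanh activation, and the input sequences are all assumptions made for this illustration.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """New state from the current input and the unit's own previous output."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(inputs, w_x=1.0, w_h=0.5, b=0.0):
    """Process a sequence one input at a time, carrying the state forward."""
    h = 0.0  # initial internal state
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

# Both sequences end with the same input (0.0) but differ in the earlier input.
states_a = run_rnn([1.0, 0.0])
states_b = run_rnn([0.0, 0.0])
print(states_a[1], states_b[1])
```

The two final states differ even though the current input is identical, because the earlier input is remembered through the state.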

60
Transfer Learning
In transfer learning, experience with one learning task helps an agent learn better on another task.

Pre-trained models can be used as a starting point for developing new models.

61
Reinforcement Learning (RL)

62
What is Reinforcement Learning?
Idea:
Reinforcement learning is inspired by behavioral psychology. It is based on rewarding / punishing an algorithm.

Rewards and punishments are based on the algorithm’s actions within its environment.
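The reward / punishment idea can be sketched with a very small agent that keeps a running average of the reward each action has earned and prefers the action with the best average. Everything here is hypothetical: the three actions, the environment that rewards only action "c", and the simple scheduled exploration are chosen to keep the example deterministic, not to represent a particular RL algorithm from the lecture.

```python
ACTIONS = ["a", "b", "c"]

def environment_reward(action):
    """Hypothetical environment: reward (+1) action 'c', punish (-1) the rest."""
    return 1.0 if action == "c" else -1.0

def train_agent(steps=30, explore_every=5):
    value = {a: 0.0 for a in ACTIONS}  # running average reward per action
    count = {a: 0 for a in ACTIONS}
    for step in range(steps):
        if step % explore_every == 0:
            # Explore: occasionally try actions in turn, whatever their value.
            action = ACTIONS[(step // explore_every) % len(ACTIONS)]
        else:
            # Exploit: pick the action with the highest estimated value.
            action = max(ACTIONS, key=value.get)
        reward = environment_reward(action)
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
    return value

values = train_agent()
print(values)
print(max(values, key=values.get))
```

After a few punished attempts the agent settles on the rewarded action, which is the behavior the slide describes.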

63
Reinforcement Learning in Action

64
Reinforcement Learning in Action

Source: https://www.youtube.com/watch?v=x4O8pojMF0w

65
Reinforcement Learning in Action

Source: https://www.youtube.com/watch?v=kopoLzvh5jY

66
Reinforcement Learning in Action

Source: https://www.youtube.com/watch?v=Tnu4O_xEmVk

67
ANN for Simple Game Playing

[Figure: the game state enters the input layer and passes through two hidden layers to an output layer with three outputs: UP, DOWN, and JUMP.]

68
ANN for Simple Game Playing
The current game state is the input. Decisions (UP/DOWN/JUMP) are rewarded/punished.

[Figure: the game state enters the input layer and passes through two hidden layers to an output layer with three outputs: UP, DOWN, and JUMP.]

Correct all the weights using Reinforcement Learning.

69
RL: Agents and Environments
[Figure: the Environment sends State and Reward to an unlabeled box (“What’s inside?”), which sends an Action back to the Environment.]

70
RL: Agents and Environments
[Figure: the State / Reward / Action loop around the Environment.]

71
RL: Agents and Environments
[Figure: the Agent receives State and Reward from the Environment and sends an Action back to it.]

72
