A06 Intro To ML

The document provides an introduction to machine learning, outlining three main approaches: supervised learning, unsupervised learning, and reinforcement learning. It explains key concepts such as classification, regression, and various algorithms used in these approaches, including examples like email spam filtering and K-means clustering. The document also discusses the importance of data preparation, model evaluation, and performance metrics in machine learning.

Introduction to Machine Learning

Mehul Motani
National University of Singapore
[email protected]

Three approaches to machine learning

The three main approaches are supervised learning, unsupervised learning, and reinforcement learning.

"Machine learning is the science of getting computers to act without being explicitly programmed." - Andrew Ng (Stanford)

Supervised Learning

The main goal in supervised learning is to learn a model from labeled training data that allows us to make predictions about new input data.


Compare this to problem solving in engineering systems

Input (Data) → Mathematical model → Output (Prediction)

A classic example of such a mathematical model is a Linear Time Invariant (LTI) system.


Classification versus Regression

• Classification task
  – A supervised learning task with discrete class labels.
  – An example of a binary classification task is email spam filtering, where the ML algorithm learns a set of rules in order to distinguish between two possible classes: spam and non-spam e-mail (see the sketch after this list).
  – An example of a multi-class classification task is handwritten character recognition. We first collect a training dataset that consists of multiple handwritten examples of each letter in the alphabet. Given a new handwritten character, the ML algorithm predicts the correct letter in the alphabet with a certain accuracy.
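To make the spam example concrete, here is a minimal sketch of a binary spam classifier. It assumes scikit-learn; the four-message corpus and the choice of a naive Bayes model on word counts are illustrative assumptions, not part of the slides.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny invented corpus: 1 = spam, 0 = non-spam.
emails = ["win a free prize now", "meeting at noon tomorrow",
          "free money click here", "lunch with the project team"]
labels = [1, 0, 1, 0]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)   # turn raw text into word-count features

clf = MultinomialNB().fit(X, labels)   # learn spam vs. non-spam from labeled data

new_email = vectorizer.transform(["free prize meeting"])
print(clf.predict(new_email))          # predicted class for a new, unseen input
```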


Example: Binary Classification

Supervised ML can learn the decision boundary that separates two classes of points, Class 0 and Class 1. The inputs x1 and x2 are called features.


Classification versus Regression

• Regression task
  – A supervised learning task in which the outcome signal takes a continuous value.
  – In regression analysis, we are given a number of predictor (explanatory) variables and a continuous response variable (outcome), and we try to find a relationship between those variables that allows us to predict an outcome.
  – Suppose we are interested in predicting the test scores of students. If there is a relationship between the time spent studying for the test and the final scores, we could use it as training data to learn a model that uses the study time to predict the test scores of future students.

Example: Linear Regression

We are given data consisting of (x, y) pairs, where x is the independent or predictor variable and y is the dependent or response variable. We fit a straight line to this data that minimizes the average squared distance between the sample points and the fitted line. We can now use the learned best-fit line to predict the output for new data: for example, a new input z is mapped to the prediction w.
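A minimal sketch of this procedure, assuming numpy and synthetic (x, y) data:

```python
import numpy as np

# Synthetic (x, y) pairs: y depends roughly linearly on x, plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.size)

# Fit slope m and intercept b that minimize the squared error.
m, b = np.polyfit(x, y, deg=1)

# Use the learned best-fit line to predict the output w for a new input z.
z = 7.5
w = m * z + b
print(f"predicted y at x={z}: {w:.2f}")
```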

Examples of supervised learning


• Email spam filtering ✔
• Facial recognition in Google Photos ✔
• Fingerprint identification ✔
• Siri / Google Assistant / Alexa / Cortana ✔
• Facebook sponsored content ✔
• Stock market prediction ✔
• Recommendation systems ?


Amazon Recommendation System

Recommendation systems are also known as collaborative filtering. Is this supervised or unsupervised learning?


Unsupervised learning

• In unsupervised learning, we are dealing with unlabeled data or data of unknown structure.
• Using unsupervised learning techniques, we are able to explore the structure of our data to extract meaningful information without the knowledge of an outcome variable (or label).
• Compare that with supervised learning, in which we are dealing with labeled data.
• Examples of unsupervised learning are clustering and dimensionality reduction.

Finding subgroups with clustering

The figure illustrates how clustering can be applied to organizing unlabeled data into three distinct groups based on the similarity of their features x1 and x2. For example, clustering allows marketers to discover customer groups based on their interests in order to develop targeted campaigns.


Dimensionality reduction for data compression

Data can be high-dimensional (e.g., 3D data projected down to 2D). Often we can reduce the data dimension by compressing the data without losing relevant information needed for the ML task. A typical algorithm for this case is principal component analysis (PCA).
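A hedged sketch of the 3D-to-2D case, assuming scikit-learn and synthetic data that mostly lies in a plane:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic 3D data that actually varies within a 2D subspace.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 3))  # 200 samples, 3 features

# Project onto the two directions of largest variance.
pca = PCA(n_components=2)
X2 = pca.fit_transform(X)

print(X.shape, "->", X2.shape)        # (200, 3) -> (200, 2)
print(pca.explained_variance_ratio_)  # fraction of variance kept per component
```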


Reinforcement learning

• In reinforcement learning (RL), the goal is to develop a system (agent) that improves its performance based on interactions with the environment.
• RL is related to supervised learning. However, in RL the feedback is not the correct ground-truth label or value, but a measure of how good the action was, as scored by a reward function.
• RL learns a series of actions that maximizes the reward (see the sketch below).
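The slides do not give an RL algorithm, so the following is a tabular Q-learning sketch on an invented toy problem: a corridor of five states where the agent earns a reward of +1 for reaching the rightmost state. The environment, hyperparameters, and algorithm choice are all illustrative assumptions.

```python
import numpy as np

# Toy corridor: states 0..4; actions: 0 = left, 1 = right; reaching state 4
# yields reward +1 and ends the episode.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))      # action-value estimates
alpha, gamma, eps = 0.5, 0.9, 0.2        # learning rate, discount, exploration
rng = np.random.default_rng(0)

for episode in range(200):
    s = 0
    for _ in range(100):                 # cap episode length
        # Epsilon-greedy: explore sometimes, otherwise exploit (ties broken randomly).
        if rng.random() < eps or Q[s, 0] == Q[s, 1]:
            a = int(rng.integers(n_actions))
        else:
            a = int(Q[s].argmax())
        s_next = max(s - 1, 0) if a == 0 else s + 1
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Move the estimate toward the reward plus discounted future value.
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == n_states - 1:
            break

print(Q.argmax(axis=1)[:-1])  # learned policy for states 0..3: all 1 (move right)
```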


Reinforcement learning

An example of reinforcement learning is DeepMind's AlphaGo engine. Here, the agent decides upon a series of moves depending on the state of the board (the environment), and the reward is defined as win or lose at the end of the game.

AI Robot that learns to walk


https://youtu.be/gn4nRCC9TwQ



Typical machine learning system

• Data cleaning and wrangling
• Feature extraction & selection
• Dimensionality reduction
• Feature scaling
• Data interpolation & imputation (a pipeline sketch follows this list)


Popular machine learning algorithms

Supervised learning:
• Bayesian classifiers
• Linear models (regression)
• Logistic regression (classification)
• Decision trees
• Random forest (ensemble of decision trees)
• Support vector machine (SVM)
• Gradient boosting
• K-nearest neighbors (KNN)
• Artificial neural networks
• Deep learning

Unsupervised learning:
• K-means clustering


Example: Regression or Classification?

[Figure: a dataset with two groups of points, Class 1 and Class 2.]

Example – Regression for Population Prediction

• Given historical population figures, predict the population in the future.
• Linear regression – find the best-fit line through a set of training data.
• Linear regression is supervised machine learning (function approximation).

Data: (x, y) pairs, where x is the year and y is the population.
Data model: y = f(x) + noise, where in general x may be a feature vector x = (x1, x2, ..., xn).
Training: the machine learns (fits) f(x) from the labeled training set.
Test: the machine predicts y from the unlabeled test set.

More on linear regression

Equation of a line: y = mx + b, where m is the slope and b is the y-intercept.

Regression is just curve fitting:
• Linear regression finds the best-fit line through the training data.
• Best-fit line → the line with the smallest error.
• Precisely → the algorithm learns the parameters m and b for the line with the least squared error.

• Linear regression can be extended to polynomial regression.
• The algorithm finds the best-fit curve with the least squared error.
• Example: fitting a quadratic curve (see the sketch below).
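A minimal sketch contrasting the two fits, assuming numpy and synthetic data with a quadratic trend:

```python
import numpy as np

# Synthetic data following a quadratic trend plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 40)
y = 0.5 * x**2 - x + 2 + rng.normal(scale=0.5, size=x.size)

# Straight line: learn m and b with least squares.
m, b = np.polyfit(x, y, deg=1)

# Polynomial regression: same idea, but fit a higher-degree curve.
a2, a1, a0 = np.polyfit(x, y, deg=2)

print(f"line:      y = {m:.2f}x + {b:.2f}")
print(f"quadratic: y = {a2:.2f}x^2 + {a1:.2f}x + {a0:.2f}")
```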


Example – Classification on Iris Dataset

Iris Dataset: introduced by British statistician and biologist Ronald Fisher in his 1936 paper 'The use of multiple measurements in taxonomic problems'.

Classification problem: predict the Iris variety from its features.

Source: Principal Component Analysis by Sebastian Raschka


Decision Tree Algorithm: Asking Yes/No Questions

A decision tree is a nonlinear classifier. For the Iris flower dataset, the algorithm learns the correct sequence of yes/no questions to ask in order to classify a sample.
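A sketch of this on the Iris dataset, assuming scikit-learn; the depth limit is an arbitrary choice to keep the printed tree readable.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Fit a decision tree to the Iris dataset and print the learned
# sequence of yes/no (threshold) questions.
iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

print(export_text(tree, feature_names=list(iris.feature_names)))
```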

Example – Binary Classification

• Dataset with 2 classes (Class 1 and Class 2) and 2 features (Feature 1 and Feature 2).
• The dataset is linearly separable – this means we can draw a line to separate the two classes.
• But there are many lines that separate the two classes.
• Question: What is the best linear separator between the two classes?


Support Vector Machine (SVM)

• SVM finds the linear separator that maximizes the gap (margin, denoted ρ in the figure) between the two classes.
• The data points at the boundaries are called support vectors.
• The linear separator allows us to classify new points (a sketch follows this list).
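A minimal sketch, assuming scikit-learn and two synthetic, linearly separable clusters:

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated synthetic clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=-2, size=(20, 2)),
               rng.normal(loc=2, size=(20, 2))])
y = np.array([0] * 20 + [1] * 20)

# A linear SVM finds the maximum-margin separator.
svm = SVC(kernel="linear").fit(X, y)

print(len(svm.support_vectors_))  # the boundary points that define the margin
print(svm.predict([[0.5, 0.5]]))  # classify a new point
```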


Datasets that are not linearly separable

• SVM can be adapted to solve these problems too!
• There are other supervised learning approaches too:
  – Decision Tree – nonlinear decision boundary
  – Perceptron – linear decision boundary
  – Multilayer Perceptron – nonlinear decision boundary


Example – Image Classification

Deep Learning Image Classification using Convolutional Neural Networks (CNNs). The network is fit on labeled training data and evaluated on held-out test data.

Source: learnopencv.com
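As a sketch of the idea only, a small convolutional classifier in PyTorch might look like the following; the architecture is invented for illustration and is far smaller than the models used in practice.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Minimal convolutional classifier sketch (illustrative architecture)."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, n_classes)  # for 32x32 inputs

    def forward(self, x):
        x = self.features(x)                  # convolutional feature extraction
        return self.classifier(x.flatten(1))  # linear layer over flattened features

logits = SmallCNN()(torch.randn(1, 3, 32, 32))  # one fake 32x32 RGB image
print(logits.shape)                             # torch.Size([1, 10])
```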

Example – Natural Language Processing

Deep Learning Natural Language Processing using BERT (Bidirectional Encoder Representations from Transformers). The training data is a large corpus of text containing pairs of sentences in a certain language. BERT is useful in many NLP tasks such as Question-Answering, Natural Language Understanding, and Machine Translation.

Source: towarddatascience.com
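A hedged usage sketch with the Hugging Face transformers library (assumed installed; it downloads a pretrained BERT-family model on first use):

```python
from transformers import pipeline

# Question-answering with a pretrained BERT-family model.
qa = pipeline("question-answering")
result = qa(question="What does BERT stand for?",
            context="BERT stands for Bidirectional Encoder Representations "
                    "from Transformers.")
print(result["answer"])
```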

K-means Clustering – Unsupervised Learning

• K-means clustering partitions n observations into k clusters such that each observation belongs to the cluster with the nearest cluster center.
• We are given a dataset with n observations and an initial set of k means m1(1), ..., mk(1).
• The algorithm proceeds by alternating between two steps (see the sketch after these steps):
• Step 1: Assignment step
  – Assign each observation to the cluster with the nearest mean, i.e., the cluster with the least squared Euclidean distance.
• Step 2: Update step
  – Recalculate the means (centroids) of the observations assigned to each cluster.
• The algorithm has converged when the assignments no longer change.
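A minimal numpy sketch of the two alternating steps (synthetic data; for simplicity, empty clusters are not handled):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal K-means sketch following the two steps above."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), size=k, replace=False)]  # initial means
    for _ in range(n_iters):
        # Step 1 (assignment): nearest mean by squared Euclidean distance.
        d = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        # Step 2 (update): recompute each cluster's centroid.
        new_means = np.array([X[assign == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_means, means):  # converged: means (and assignments) stable
            break
        means = new_means
    return means, assign

# Example: three synthetic blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, size=(30, 2)) for c in (-4, 0, 4)])
means, assign = kmeans(X, k=3)
print(means)
```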


Visualizing K-means Clustering

[Figure: visualization of the K-means assignment and update steps iterating until convergence.]

Back to supervised learning

Note the three different datasets:
• Training Data – optimize model parameters and train the model
• Validation Data – hyperparameter tuning and model selection
• Test Data – evaluate the final trained model


Training and evaluating the learning algorithm

• Divide the available labeled data into three sets (a splitting sketch follows this list):
• Training set:
  – Used for model parameter optimization
• Validation set:
  – Used for hyperparameter tuning and model selection
• Test set:
  – Used only for final evaluation of the trained model
  – Done after training and validation are completely finished
• Avoid data leakage
  – The test data should not influence the choice of model structure or the optimization of parameters.
  – If, after evaluating on the test set, you don't like the results, you must set aside a new test set before training a new model.
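A sketch of a 60/20/20 split, assuming scikit-learn and a synthetic stand-in dataset:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labeled dataset standing in for real data.
X = np.random.default_rng(0).normal(size=(100, 4))
y = (X[:, 0] > 0).astype(int)

# First split off the test set, then split the rest into train / validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 60 / 20 / 20
```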

Evaluating Model Performance


Confusion Matrix for binary classification

https://en.wikipedia.org/wiki/Confusion_matrix

Performance Metrics

How good is the classifier at detecting positives and negatives?
• Accuracy – fraction of correct predictions
• Recall (true positive rate) – fraction of positives that are correctly classified
• Specificity (true negative rate) – fraction of negatives that are correctly classified
• Precision – fraction of predicted positives that are truly positive
• F1 score – balance of precision and recall

A worked example follows this list.
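The metrics computed from assumed confusion-matrix counts (the numbers below are invented for illustration):

```python
# Binary confusion-matrix counts (illustrative values).
TP, FP, TN, FN = 40, 10, 45, 5

accuracy    = (TP + TN) / (TP + TN + FP + FN)  # fraction of correct predictions
recall      = TP / (TP + FN)                   # positives correctly classified
specificity = TN / (TN + FP)                   # negatives correctly classified
precision   = TP / (TP + FP)                   # predicted positives truly positive
f1 = 2 * precision * recall / (precision + recall)  # balance of precision and recall

print(f"acc={accuracy:.2f} recall={recall:.2f} spec={specificity:.2f} "
      f"prec={precision:.2f} f1={f1:.2f}")
```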
