
APL 405: Machine Learning for Mechanics

Lecture 2: Supervised learning

by

Rajdip Nayek
Assistant Professor,
Applied Mechanics Department,
IIT Delhi

Instructor email: [email protected]


Different types of machine learning

Supervised ("teacher provides answer")
▪ Labelled data
▪ Direct feedback
▪ Predict outcome
• Examples: Classification, Regression

Unsupervised ("no teacher, find patterns!")
▪ No labels
▪ No feedback
▪ Find hidden structure
• Examples: Clustering, Dimensionality reduction, Outlier detection

Semi-supervised
▪ Some labelled data
▪ A lot of unlabelled data

Reinforcement ("teacher provides rewards")
▪ Decision process
▪ Rewards
▪ Learn series of actions
• Examples: Gaming, Control
Example of Supervised learning
Supervised learning: have labelled examples of what is “correct”

e.g. Handwritten digit classification with the MNIST dataset

• Task: Given an image of a handwritten digit, predict the digit class


• Input: a handwritten image of a digit
• Output (or target): the digit class

• Data: 70,000 images of handwritten digits labelled by humans


• Training set: first 60,000 images used to train the network
• Test set: last 10,000 images, not used during training, used to assess
performance
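A minimal sketch of this setup, assuming scikit-learn is available (its fetch_openml helper can download MNIST from OpenML):

```python
# Minimal sketch: load MNIST and make the 60k/10k train/test split.
from sklearn.datasets import fetch_openml

# 70,000 handwritten digit images, each flattened to 784 pixel values
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)

# Training set: first 60,000 images; test set: last 10,000 images
X_train, y_train = X[:60000], y[:60000]
X_test, y_test = X[60000:], y[60000:]

print(X_train.shape, X_test.shape)  # (60000, 784) (10000, 784)
```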

Example of Supervised learning

What type of images look like a “2”?

[Figure: sample handwritten images predicted as “2”, including misclassified examples]
Example of Supervised learning
Object Recognition: Detect the class of the object in an image

• ImageNet

• 1.2 million labelled images

• 1000 classes

• Lots of variability in lighting, viewpoint, etc.

• Deep neural networks reduced error rates from 26% to under 4%
Example of Unsupervised learning
Unsupervised learning: no labelled examples, you only have input data. You are looking for interesting
patterns in the data

• To find clusters in data

• To find a compressed representation

• To find a generative model that could be used to generate more data

E.g. Clustering – Group the input data into separate classes

Elastic modulus | Poisson’s ratio
210 GPa         | 0.279
70 GPa          | 0.325
⋮               | ⋮
190 GPa         | 0.267

[Figure: scatter plot of Poisson’s ratio vs. elastic modulus, with the data points grouped into clusters]
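A minimal clustering sketch on data of this form, assuming scikit-learn and using k-means with two clusters (an illustrative choice; the slide does not prescribe an algorithm):

```python
# Minimal sketch: k-means clustering of material-property data.
# The sample values are illustrative, mirroring the table above.
import numpy as np
from sklearn.cluster import KMeans

# Each row: [elastic modulus in GPa, Poisson's ratio]
X = np.array([
    [210.0, 0.279],
    [ 70.0, 0.325],
    [190.0, 0.267],
    [ 72.0, 0.330],
])

# In practice the features should be normalised first, since the GPa
# values are several orders of magnitude larger than Poisson's ratio.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # cluster assignment for each material sample
```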
Example of Unsupervised learning
Unsupervised learning: no labelled examples, you only have input data. You are looking for interesting
patterns in the data

• To find clusters in data

• To find a compressed representation

• To find a generative model that could be used to generate more data

E.g. Compressed representation – Find a reduced dimension of the input

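A minimal sketch of one way to find such a representation, assuming scikit-learn and using PCA (an illustrative choice of method):

```python
# Minimal sketch: compress 10-dimensional inputs to 2 dimensions with PCA.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))   # 100 samples, 10 input features

pca = PCA(n_components=2)        # keep the 2 directions of largest variance
Z = pca.fit_transform(X)         # the compressed representation
print(Z.shape)                   # (100, 2)
```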
Example of Unsupervised learning
Unsupervised learning: no labelled examples, you only have input data. You are looking for interesting
patterns in the data

• In generative modeling, we want to learn a distribution over some dataset, such as natural images. We
can then sample from the generative model and see whether the samples look like the real data.

[Figure: generated faces (not real faces), sampled from a generative model]
Example of reinforcement learning
Computer playing a game

[Diagram: agent-environment loop. The computer is the agent that performs actions; the game is the environment. After each action, the environment returns a new state and a reward to the agent.]

Goal: Learn to choose actions that maximize rewards

• An agent (e.g. player) interacts with an environment (e.g. enemy-killing game)


• At each time step:
• the agent receives observations of the state (e.g. how many enemies remain)
• the agent picks an action (e.g. moving to a safe location, or killing an enemy)
• the agent periodically receives some rewards (e.g. health, ammunition, scores)
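A minimal sketch of this loop, with a toy environment and a random action-picking agent (both are illustrative stand-ins, not part of the slides):

```python
# Minimal sketch of the agent-environment loop: state -> action -> reward.
import random

def environment_step(state, action):
    """Toy environment: reward the agent when its action matches a hidden rule."""
    reward = 1.0 if action == state % 3 else 0.0
    next_state = state + 1
    return next_state, reward

state, total_reward = 0, 0.0
for t in range(10):                                    # each time step
    action = random.choice([0, 1, 2])                  # agent picks an action
    state, reward = environment_step(state, action)    # environment responds
    total_reward += reward                             # rewards accumulate
print("total reward:", total_reward)
```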
The idea of RL is based on animal psychology
• Reinforcements are used to train animals

• Negative reinforcements
• Hunger
• Pain

• Positive reinforcements
• Food
• Pleasure

• The same psychology is applied to computers


• Rewards: numbers or numerical signals indicating how good the
agent performed
• Example of rewards: Win/loss in games, points earned, etc.

Supervised Learning
Supervised Learning: Background
▪ We start with Supervised Learning; it is the most common type of machine learning (and will span most of this course)

▪ The task is to learn the function 𝑓 that best maps certain input (𝐱) to output (𝑦)

[Figure: training examples 1-4, each showing an input 𝐱 mapped to an output 𝑦]

𝑦 = 𝑓(𝐱)

▪ In statistics, one uses the terminology: 𝐱 → independent variable, regressor, or covariate; 𝑦 → dependent variable or response
Dependent variable = 𝑓(Independent variable)

▪ In computer science, one uses the terminology: 𝐱 → input attribute or feature; 𝑦 → output attribute or output
Output = Program(Input features)
Supervised Learning: What do we do in supervised ML?
▪ The task in supervised ML is to learn the function 𝑓 that best maps certain input (𝐱) to output (𝑦)
𝑦 = 𝑓(𝐱)

▪ We don’t know what the function (𝑓) looks like or its form
▪ If we knew the form, we would use it directly and we would not need to learn it from data

▪ Moreover, the output 𝑦 is often observed with some error 𝑒 that is independent of the input 𝐱
▪ The error could be due to measurement instrument errors
▪ The error could be due to not including enough input features to sufficiently characterize the mapping from 𝐱 to 𝑦

𝑦 = 𝑓(𝐱) + 𝑒

▪ In supervised ML, we use labelled training data (input-output pairs) containing examples of how some
input 𝐱 relates to the output 𝑦, in order to learn the input-output mapping
▪ Say, we have 𝑁 examples of labelled training data: (𝐱⁽¹⁾, 𝑦⁽¹⁾), (𝐱⁽²⁾, 𝑦⁽²⁾), ⋯ , (𝐱⁽ᴺ⁾, 𝑦⁽ᴺ⁾)
▪ Labelled data: each input 𝐱⁽ⁱ⁾ is accompanied by an associated label 𝑦⁽ⁱ⁾, which may be recorded jointly with
the input or labelled later by a domain expert
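A minimal sketch of such labelled data, assuming numpy; the true function and noise level are illustrative:

```python
# Minimal sketch: labelled training data generated as y = f(x) + e.
import numpy as np

rng = np.random.default_rng(42)
N = 100                                  # number of labelled examples
x = rng.uniform(0.0, 1.0, size=N)        # inputs x^(i)
f = lambda z: 2.0 * z + 1.0              # the (normally unknown) true function
e = rng.normal(0.0, 0.1, size=N)         # error, independent of x
y = f(x) + e                             # observed labels y^(i)
```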
Supervised Learning: What is the reason for learning a function?
𝑦 = 𝑓(𝐱) + 𝑒

▪ The most common type of ML is to learn the mapping 𝑦 = 𝑓(𝐱) to make good predictions of the output for new
examples of input (say, 𝐱*) → to generalize well beyond the training data
▪ This is called predictive modeling or predictive analytics and our goal is to make the most accurate
predictions possible
▪ Hence, we are not really interested in the form of the function (𝑓) that we are learning, only that it makes
accurate predictions

▪ We could also learn the mapping 𝑦 = 𝑓(𝐱) to understand more about the relationships in the data → statistical
inference
▪ If this were the goal, we would use simpler methods and give more importance to understanding the
learned form of 𝑓 than making accurate predictions
▪ E.g. “Does eating seafood increase life expectancy?” requires careful reasoning about the function that was
learned
Supervised Learning: Learning a function
𝑦 = 𝑓(𝐱) + 𝑒

▪ Learning a function 𝑓 means estimating its form from the noisy data available to us
▪ The estimate will have errors and will not be exactly the same as the underlying true mapping from 𝐱 → 𝑦
▪ Much time in applied machine learning is spent attempting to improve the estimate of the underlying function and in
turn improve the performance of the predictions made by the model

▪ Supervised ML algorithms are techniques for estimating the target function 𝑓 to predict the output 𝑦 given input 𝐱

▪ Different ML algorithms make different assumptions about the form of the function being learned
▪ Linear vs nonlinear models
▪ Parametric vs non-parametric models
▪ How to optimize to approximate the mapping
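As a sketch of one such choice, assuming scikit-learn: a parametric, linear assumption about 𝑓, fitted to noisy data:

```python
# Minimal sketch: estimate f from noisy data under a linear assumption.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(100, 1))
y = 2.0 * x[:, 0] + 1.0 + rng.normal(0.0, 0.1, size=100)  # y = f(x) + e

f_hat = LinearRegression().fit(x, y)   # the estimate of the target function
print(f_hat.coef_, f_hat.intercept_)   # close to the true slope 2, intercept 1
print(f_hat.predict([[0.5]]))          # prediction for a new input x* = 0.5
```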

Parametric vs Non-parametric algorithms
▪ Assumptions about the unknown 𝑓 can greatly simplify the learning process, but can also limit what can be learned
Parametric models
1. Simplify the unknown function 𝑓 to a known explicit form
2. Summarise the model using a fixed number of model parameters (independent of the number of training examples)

Pros
1. Often simpler and faster
2. May require less training data

Cons
1. Constrained: the functional form is fixed
2. Poor fit: unlikely to match the underlying true function

Non-parametric models
1. Don't make strong assumptions about the form of 𝑓
2. Summarise the model using a number of model parameters that depends on the number of training examples

Pros
1. Flexible: can fit many types of functions
2. Powerful: can result in better predictions

Cons
1. More data: require a lot more training data
2. Overfitting: harder to explain certain predictions made
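A minimal sketch of the contrast, assuming scikit-learn: linear regression keeps two parameters regardless of 𝑁, while k-nearest neighbours stores all 𝑁 training points:

```python
# Minimal sketch: parametric (linear regression) vs non-parametric (k-NN).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0]) + rng.normal(0.0, 0.1, size=200)

parametric = LinearRegression().fit(X, y)          # fixed: slope + intercept
nonparametric = KNeighborsRegressor(5).fit(X, y)   # stores all 200 examples

# The non-parametric model can follow the sine curve; the linear one cannot
print(parametric.predict([[0.5]]), nonparametric.predict([[0.5]]))
```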
Supervised learning: Two types of data
The variables contained in the data (input 𝐱 as well as output 𝑦) can be of two different types:
• Numerical (quantitative)
• Has a natural ordering, i.e., a numerical variable may be larger or smaller than another one
• Can be continuous or discrete

• Categorical (qualitative)
• Lacks a natural ordering
• Is always discrete

• The notion of categorical vs. numerical applies both to the output variable 𝑦 and to the 𝑝 elements 𝑥ⱼ of the
input vector variable 𝐱 = [𝑥₁ 𝑥₂ ⋯ 𝑥ₚ]ᵀ

• Also, the 𝑝 components of the input vector do not have to be of the same type and can be a mix of categorical
and numerical input

• However, the output 𝑦 is either categorical or numerical


Numerical vs Categorical data: Examples
Data Type                                  | Example                              | Handled as
Number (continuous)                        | 15.58 km/h, 11.50 km/h               | Numerical
Number (discrete) with natural ordering    | 0 bikes, 1 bike, 2 bikes             | Numerical
Number (discrete) without natural ordering | 1 = Argentina, 2 = Brazil, 3 = India | Categorical
Text string                                | Hello, Bye, Welcome                  | Categorical
?                                          | 3.4 + 5.6𝑖, −6.2 + 0.1𝑖              | ?

▪ The distinction between numerical and categorical is sometimes arbitrary

▪ For example, having no bike is qualitatively different from having bikes, and we can use the categorical
variable ‘bikes: yes/no’ instead of the numerical ‘0, 1 or 2 bikes’

▪ Therefore, the decision of whether a certain variable should be treated as numerical or categorical lies
with the ML engineer
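A minimal sketch of this choice, assuming numpy: the same bike counts treated as a numerical variable, or recoded as the categorical yes/no variable mentioned above:

```python
# Minimal sketch: one variable, two reasonable encodings.
import numpy as np

bikes = np.array([0, 1, 2, 0, 3])          # raw discrete counts

as_numerical = bikes.astype(float)         # keeps the natural ordering
as_categorical = (bikes > 0).astype(int)   # 'bikes: yes/no' encoded as 1/0

print(as_numerical, as_categorical)
```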
Supervised ML: Regression vs Classification
▪ Output variable 𝑦 → categorical → Classification
▪ Output variable 𝑦 → numerical → Regression

▪ Note that the 𝑝-dimensional input vector variable 𝐱 = [𝑥₁ 𝑥₂ ⋯ 𝑥ₚ]ᵀ can be either numerical or
categorical for both regression and classification problems

▪ It is only the type of the output that determines whether a problem is a regression or a classification
problem

▪ Classification: Binary vs Multi-class

▪ Output is categorical

▪ Output can take values in a finite set

▪ Binary classification: the output can take only two values, e.g. True or False

▪ Multi-class classification: the output can take more than two values, e.g. “Sweden”, “Norway”, “Finland”,
“Denmark”
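A minimal sketch of binary classification, assuming scikit-learn and using logistic regression (an illustrative choice of algorithm):

```python
# Minimal sketch: binary classification with a categorical output (0 or 1).
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.10], [0.35], [0.40], [0.75], [0.80], [0.90]])
y = np.array([0, 0, 0, 1, 1, 1])      # two classes, e.g. False / True

clf = LogisticRegression().fit(X, y)
print(clf.predict([[0.30], [0.85]]))  # expected: [0 1]
```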
Examples of classification and regression

Problem                | Input                  | Output | Classification or Regression?
Spam detection         | Text (set of words)    |        |
Stock price prediction | Time-series of prices  |        |
Speech recognition     | Audio signal           |        |
Examples of classification and regression

Problem            | Input                            | Output | Classification or Regression?
Digit recognition  | Images of digits                 |        |
Housing valuation  | House features                   |        |
Weather prediction | Sensor data (images, wind speed) |        |
Bias-Variance Trade-Off
▪ The goal of any supervised machine learning algorithm is to best estimate the mapping function 𝑓 for the output
variable 𝑦 given the input data 𝐱 = [𝑥₁ 𝑥₂ ⋯ 𝑥ₚ]ᵀ

▪ The prediction error for any machine learning algorithm results from three things:
▪ Bias
▪ Variance
▪ Irreducible error, which cannot be reduced regardless of the algorithm used; caused by factors like partially
known inputs

▪ Bias - the simplifying assumptions made by a model to make the target function easier to learn
▪ These assumptions make algorithms easier to understand but generally less flexible
▪ Low bias: suggests fewer assumptions about the function 𝑓
▪ High bias: suggests more assumptions about the function 𝑓

▪ Variance - the amount by which the estimated function (say 𝑓̂) would change if a different training
dataset was used to obtain the estimate
▪ Machine learning algorithms that have a high variance are strongly influenced by the specifics
of the training data
▪ Low variance: suggests small changes to the estimated function with changes to the training dataset
▪ High variance: suggests large changes to the estimated function with changes to the training dataset
The goal of any supervised machine learning algorithm is to achieve low bias and low variance
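A minimal sketch of the trade-off, assuming numpy: refit polynomials of low and high degree to freshly resampled noisy data and compare how much the fitted functions move:

```python
# Minimal sketch: bias-variance trade-off via polynomial degree.
import numpy as np

rng = np.random.default_rng(0)
x_grid = np.linspace(0.0, 1.0, 50)

def fit_and_predict(degree):
    """Fit a polynomial to a fresh noisy training set; predict on x_grid."""
    x = rng.uniform(0.0, 1.0, 30)
    y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, 30)  # y = f(x) + e
    return np.polyval(np.polyfit(x, y, degree), x_grid)

for degree in (1, 15):
    preds = np.stack([fit_and_predict(degree) for _ in range(100)])
    # degree 1: high bias, low variance; degree 15: low bias, high variance
    print(degree, preds.var(axis=0).mean())
```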
