
Chapter 1 - Introduction

This document is the introduction to a lecture on machine learning. It defines machine learning as programs that improve automatically through experience, and as the study of algorithms and models that learn patterns from data to perform tasks without explicit instructions. It discusses supervised, unsupervised, reinforcement, and evolutionary learning. Key phases of machine learning projects are presented as training, validation, and testing data sets to evaluate performance and avoid overfitting. Common performance measures like precision, recall, and F1 score are also introduced.

Uploaded by

Gia Khang Tạ

Machine Learning

Chapter 1 - Introduction

Lecturer: Duc Dung Nguyen, PhD.


Contact: [email protected]

Faculty of Computer Science and Engineering


Hochiminh city University of Technology
Machine Learning

What is Machine learning?

Lecturer: Duc Dung Nguyen, PhD. Contact: [email protected] Machine Learning 1 / 23


• Arthur Samuel (1959): "Field of study that gives computers the ability to learn without
being explicitly programmed"
• Tom Mitchell (1997): "A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P, if its performance at tasks in
T, as measured by P, improves with experience E".

Machine Learning

• How to construct programs that automatically improve with experience.


• The scientific study of algorithms and statistical models that computer systems use to
perform a specific task without using explicit instructions, relying on patterns and
inference instead.
• A subset of artificial intelligence.



Example

Experience

Example  Gray?  Mammal?  Large?  Vegetarian?  Wild?  Elephant
1        +      +        +       +            +      +
2        +      +        +       -            +      +
3        +      +        -       +            +      -
4        -      +        +       +            +      -
5        +      -        +       -            +      -
6        +      +        +       +            -      +

Prediction

7        +      +        +       -            +      ?
8        +      -        +       -            +      ?
9        +      +        +       -            -      ?
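
As a rough illustration of learning from the experience table above, the sketch below applies a Find-S style generalization. This technique is swapped in here for illustration and is not named on the slide. It keeps only the attribute values shared by all positive examples and uses them as a conjunctive rule. The function names and the string encoding of the rows are assumptions made for the example.

```python
# Attribute order: Gray?, Mammal?, Large?, Vegetarian?, Wild?
# Each row is (attribute values, Elephant label), copied from the table.
EXPERIENCE = [
    ("+++++", "+"),
    ("+++-+", "+"),
    ("++-++", "-"),
    ("-++++", "-"),
    ("+-+-+", "-"),
    ("++++-", "+"),
]


def find_s(examples):
    """Return the most specific conjunction covering all positive examples.

    A position holds '+' or '-' if every positive example agrees on it,
    and '?' (don't care) otherwise. Returns None if there are no positives.
    """
    hypothesis = None
    for attributes, label in examples:
        if label != "+":
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)
        else:
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return "".join(hypothesis) if hypothesis is not None else None


def predict(hypothesis, attributes):
    """Label an instance '+' if it satisfies every fixed position of the rule."""
    matches = all(h in ("?", a) for h, a in zip(hypothesis, attributes))
    return "+" if matches else "-"


hypothesis = find_s(EXPERIENCE)
predictions = {n: predict(hypothesis, attrs)
               for n, attrs in [(7, "+++-+"), (8, "+-+-+"), (9, "+++--")]}
```

Under the resulting rule (gray, mammal and large), examples 7 and 9 are predicted to be elephants and example 8 is not. The rule also happens to reject the three negative rows of the table, although Find-S does not guarantee this in general.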

What is learning?


Learning is an (endless) generalization or induction process.



Types of Machine Learning

• Supervised learning (data + label): the learner (learning algorithm) is trained on labeled examples, i.e., inputs where the desired output is known.
• Unsupervised learning (grouping/clustering): the learner operates on unlabeled examples, i.e., inputs where the desired output is unknown.


• Reinforcement learning: lies between supervised and unsupervised learning. The learner is told when an answer is wrong, but not how to correct it.
• Evolutionary learning: biological evolution can be seen as a learning process that improves survival rates and the chance of having offspring.


• The most common type: supervised learning.
– Classification: to find the class of an instance given its selected features.
– Regression: to find a function whose curve passes as close as possible to all of the given data points.
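
To make the regression bullet concrete, here is a minimal sketch that fits a straight line by ordinary least squares, one common way to make a curve pass "as close as possible" to the data. The linear model, the squared-error criterion, the function name `fit_line`, and the sample points are all assumptions for illustration; the slide does not prescribe them.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for a single input variable.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

With noisy data the fitted line would no longer pass through every point; it would only minimize the total squared distance to them.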



Phases of Machine Learning

How many phases are there in machine learning?


[Figure: Phases of Machine Learning. Three phases, each with its own data: Training (training data), Testing (testing data), Applying (real data). Annotation: validation set.]
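
The phases in the figure above each need their own data. The sketch below shows one possible way to carve a labeled data set into training, validation, and test portions. The 60/20/20 proportions, the fixed seed, and the name `split_dataset` are assumptions for illustration, as is the choice to take the validation set out of the available labeled data.

```python
import random


def split_dataset(examples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the examples and split them into training, validation and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, validation, test


# Stand-in data: the integers 0..99 play the role of labeled examples.
train, validation, test = split_dataset(range(100))
```

The three parts are disjoint, so performance measured on the validation or test part is not inflated by examples the model has already seen during training.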


• K-fold cross validation (for small models and small data sets):
– Randomly partition the data into k equal-sized sub-samples.
– Use k - 1 sub-samples for training and 1 for testing.
– Repeat the validation k times (folds) and take the average performance.
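
The steps above can be sketched as follows. This is a minimal version under stated assumptions: `evaluate` is a hypothetical callback that trains on the training indices and returns a score on the test indices, and the fixed seed is only there to make the partition reproducible.

```python
import random


def k_fold_indices(n_examples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)       # random partition
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized sub-samples
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(n_examples, k, evaluate):
    """Average the score returned by evaluate(train, test) over the k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_examples, k)]
    return sum(scores) / len(scores)
```

Each example is used for testing exactly once and for training k - 1 times, which is why the procedure suits small data sets where a single held-out split would be too noisy.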


Statistical significance test: used to decide whether to reject the null hypothesis that the two compared systems are equally effective, even though their measured performance differs.
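
The slide does not name a specific test. As one possible illustration, the sketch below runs an exact paired sign-flip (permutation) test on per-fold scores of two systems. The choice of test, the function name, and the idea of comparing per-fold scores are assumptions made for this example.

```python
from itertools import product


def sign_flip_p_value(scores_a, scores_b):
    """Exact two-sided paired sign-flip (permutation) test on paired scores.

    Under the null hypothesis the two systems are equivalent, so each paired
    difference is equally likely to be positive or negative.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    total = 0
    at_least_as_extreme = 0
    # Enumerate all 2^n sign assignments (only feasible for small n).
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
    return at_least_as_extreme / total


# Made-up per-fold accuracies for two systems evaluated on the same 5 folds.
p_value = sign_flip_p_value([0.81, 0.79, 0.84, 0.80, 0.83],
                            [0.78, 0.77, 0.80, 0.79, 0.80])
```

A small p-value argues for rejecting the null hypothesis. Note that with only five paired scores the smallest two-sided p-value this test can produce is 2/32 = 0.0625, so more folds or repetitions are needed before a conventional 0.05 threshold can be reached.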


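[Figure: loss curves. Annotations: when the loss on the test data starts to increase, the model moves from underfitting to overfitting; that turning point is a good checkpoint.]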
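
The "good checkpoint" idea can be sketched as picking the epoch where the held-out loss is lowest. The loss values below are made up, and using a validation set (rather than the final test set) to choose the checkpoint is an assumption of this example.

```python
def best_checkpoint(validation_losses):
    """Return the index of the epoch with the lowest held-out loss."""
    return min(range(len(validation_losses)), key=validation_losses.__getitem__)


# Made-up losses: the training loss keeps falling, while the held-out loss
# falls at first and then rises as overfitting sets in.
train_losses = [0.90, 0.60, 0.40, 0.30, 0.20, 0.10]
validation_losses = [0.95, 0.70, 0.55, 0.50, 0.58, 0.70]
best_epoch = best_checkpoint(validation_losses)
```

Here the selected epoch is index 3: before it the model is still underfitting, and after it the gap between training and held-out loss widens.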


Overfitting

• There is noise in the data.
• The number of training examples is too small to produce a representative sample of the target concept.



Performance Measures


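• Precision:
$$P = \frac{\text{number of correct system answers}}{\text{number of system answers}}$$
(Note: the number of people correctly predicted to have the condition / the number of people the system predicts to have the condition.)
• Recall:
$$R = \frac{\text{number of correct system answers}}{\text{number of correct problem answers}}$$
(Note: the number of people correctly predicted to have the condition / the number of people in the dataset who have the condition.)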


Trade-off: Precision vs. Recall

$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall} = \frac{TP}{TP + FN}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
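
A minimal sketch of the three formulas, reading TP, FP, FN and TN in the usual way as true positives, false positives, false negatives and true negatives. The function names, the 0/1 label encoding, the sample labels, and the convention of reporting 0.0 for a zero denominator are assumptions of this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn


def precision_recall_accuracy(tp, fp, fn, tn):
    """Apply the three formulas; a ratio with a zero denominator is reported as 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Made-up labels: 1 = positive, 0 = negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
precision, recall, accuracy = precision_recall_accuracy(*counts)
```

For these labels the counts are TP = 2, FP = 1, FN = 1, TN = 4, giving precision 2/3, recall 2/3 and accuracy 0.75.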


F1 score: seeks a balance between Precision and Recall.

It is useful when the class distribution is uneven.
$$F_1 = 2\,\frac{P \cdot R}{P + R}$$
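
To illustrate why F1 helps with an uneven class distribution, the sketch below computes it for a made-up imbalanced case in which accuracy looks good while the positive class is missed entirely. The numbers, the function name, and the convention of returning 0.0 when precision and recall are both zero are assumptions of this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up imbalanced case: 5 positives among 100 examples, and a system that
# always answers "negative" (TP = 0, FP = 0, FN = 5, TN = 95).
accuracy = (0 + 95) / 100          # looks good
recall = 0 / (0 + 5)               # no positive is found
precision = 0.0                    # no positive answers; taken as 0 by convention
f1 = f1_score(precision, recall)   # exposes the failure
```

The accuracy is 0.95 even though the system never finds a positive example, while the F1 score is 0.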



Inductive Bias

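Example  Quality  Price  Buy
1        Good     Low    Yes
2        Bad      High   No
3        Good     High   ?
4        Bad      Low    ?

[Figure: the four examples plotted with Quality and Price as axes; the two known examples are labeled Y and N, and the two unknown ones are marked "??".]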

• A learner that makes no prior assumptions regarding the identity of the target concept cannot classify any unseen instances.
• A learner that makes no a priori assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instances.
• The inductive bias (learning bias): the set of assumptions that the learner uses to predict outputs given inputs that it has not encountered.
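
The Quality/Price/Buy table at the start of this section can be used to see why some bias is needed. In the sketch below, two hypothetical rules (the names and the rules themselves are assumptions for illustration) both fit the two labeled examples, yet they disagree on both unseen ones, so the data alone cannot decide between them.

```python
# Training examples from the table: (quality, price) -> buy
TRAINING = {("Good", "Low"): "Yes", ("Bad", "High"): "No"}
UNSEEN = [("Good", "High"), ("Bad", "Low")]


def quality_rule(quality, price):
    """Bias 1 (assumed): only quality matters."""
    return "Yes" if quality == "Good" else "No"


def price_rule(quality, price):
    """Bias 2 (assumed): only price matters."""
    return "Yes" if price == "Low" else "No"


def is_consistent(rule, examples):
    """Check that a rule reproduces every training label."""
    return all(rule(*x) == label for x, label in examples.items())


both_fit = is_consistent(quality_rule, TRAINING) and is_consistent(price_rule, TRAINING)
disagreements = [x for x in UNSEEN if quality_rule(*x) != price_rule(*x)]
```

Both rules are consistent with the training data, and `disagreements` contains examples 3 and 4. Choosing one rule over the other is exactly the kind of prior assumption the bullets above call inductive bias.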

Common inductive biases in ML:

• Maximum conditional independence: if the hypothesis can be cast in a Bayesian framework, try to maximize conditional independence (Naive Bayes classifier).
• Minimum cross-validation error: when trying to choose among hypotheses, select the hypothesis with the lowest cross-validation error.
• Maximum margin: when drawing a boundary between two classes, attempt to maximize the width of the boundary (SVM). The assumption is that distinct classes tend to be separated by wide boundaries.

• Minimum description length: when forming a hypothesis, attempt to minimize the length of the description of the hypothesis. The assumption is that simpler hypotheses are more likely to be true.
• Minimum features: unless there is good evidence that a feature is useful, it should be deleted.
• Nearest neighbors: assume that most of the cases in a small neighborhood in feature space belong to the same class.
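
As an illustration of the nearest-neighbors bias in the last bullet, here is a minimal k-nearest-neighbors classifier. The function name, the choice of k = 3, the squared Euclidean distance, and the sample points are assumptions made for this sketch.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is squared
    Euclidean distance, which ranks neighbors the same way as Euclidean.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    neighbors = sorted(train, key=lambda item: sq_dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Made-up 2-D points forming two well-separated groups.
train = [((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((1.0, 0.0), "A"),
         ((5.0, 5.0), "B"), ((5.0, 6.0), "B"), ((6.0, 5.0), "B")]
label_near_a = knn_predict(train, (0.5, 0.5))
label_near_b = knn_predict(train, (5.5, 5.5))
```

The classifier never builds an explicit model; its only assumption is that a query point shares the class of the training points closest to it.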
