ML1 Introduction
Semester 2, 2024/2025
What is Machine Learning?
ML – NLU 3
How can we make a robot cook?
How to recognize a math expression?
Photomath: https://photomath.net/
How does Netflix recommend films to a user?
How to predict the price of a house?
Information Retrieval
How can AI play games?
Am I going to pass the ML course?
Dictionary defines “to learn” as:
◦ To commit to memory
◦ To be informed of or to ascertain
◦ To receive instruction
A learning program: a program is said to learn from experience E on
task T with respect to performance measure P if its performance on T,
as measured by P, improves with experience E.
A learning program:
◦ produces a representation R (often called a hypothesis h) of what it has
learned (or a model).
◦ another program can use R to perform T.
◦ uses a learning algorithm A to produce R from E.
◦ many different algorithms can be used to produce the same type of
representation.
Task T:
◦ playing checkers
Performance measure P:
◦ percent of games won against a set of players.
Training experience E:
◦ games played against the other players.
Representation R:
◦ an evaluation function that measures the goodness of the board.
Learning algorithm A:
◦ a variation of temporal difference (predicting the total reward expected over
the future)
Task T:
◦ recognize handwritten letters.
Performance measure P:
◦ error rate on sample handwriting.
Training experience E:
◦ gray scale images of sample handwriting, all identified in advance.
Representation R:
◦ support vector machine, where each pixel is a separate attribute.
Learning algorithm A:
◦ SVMlight.
Wikipedia (ML introduced in the 1980s):
◦ Machine learning is the subfield of computer science that “gives
computers the ability to learn without being explicitly programmed”
Machine learning is programming computers to optimize a
performance criterion using example data or past experience.
There is no need to “learn” to calculate payroll
Learning is used when:
◦ Human expertise does not exist (navigating on Mars),
◦ Humans are unable to explain their expertise (speech recognition)
◦ Solution changes in time (routing on a computer network)
◦ Solution needs to be adapted to particular cases (user biometrics)
Learning general models from data of particular examples
Data is cheap and abundant (data warehouses, data marts);
knowledge is expensive and scarce.
Example in retail: Customer transactions to consumer
behavior:
People who bought “Blink” also bought “Outliers” (www.amazon.com)
Build a model that is a good and useful approximation to the
data.
Traditional Programming:
◦ Data + Program → Computer → Output
Machine Learning:
◦ Data + Output → Computer → Model
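The contrast between traditional programming and machine learning can be sketched with one toy task solved twice: first with a rule written by hand, then with a rule derived from labeled examples. The task (flagging spam by counting exclamation marks), the data, and all function names are illustrative assumptions, not part of the slides.

```python
# Traditional programming: a human writes the rule (the "program").
def rule_based_is_spam(num_exclamations):
    return num_exclamations > 3  # threshold chosen by hand


# Machine learning: the computer derives the rule (the "model") from
# data paired with the desired output.
def learn_threshold(examples):
    """Pick the threshold that classifies the most examples correctly.

    examples: list of (num_exclamations, is_spam) pairs (toy data).
    """
    candidates = sorted({x for x, _ in examples})
    best_t, best_correct = None, -1
    for t in candidates:
        correct = sum((x > t) == y for x, y in examples)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


# Hypothetical labeled data: (feature value, desired output).
data = [(0, False), (1, False), (2, False), (5, True), (7, True), (9, True)]
threshold = learn_threshold(data)


def learned_is_spam(num_exclamations):
    return num_exclamations > threshold
```

In the first function the threshold is fixed by the programmer; in the second it changes whenever the training data changes, which is the point of the comparison.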
AI: Human intelligence exhibited by machines
◦ empowers computers to mimic human intelligence such as decision-making,
text processing, and visual perception.
ML: An approach to achieve Artificial Intelligence.
◦ a subfield of Artificial Intelligence that enables machines to improve at a
given task with experience
DL: A technique for implementing machine learning
◦ a specialized field of Machine Learning that relies on training of Deep
Artificial Neural Networks (ANNs) using a large dataset such as images or
texts
ML is a subfield of AI “concerned with the question of how to
construct computer programs that automatically improve with
experience.” [Tom Mitchell, 1997]
Improve = learn
Experience = data
Computer = unnecessary
Machine Learning vs. Deep Learning, by feature:
Data dependencies:
◦ Machine Learning: excellent performance on a small/medium dataset
◦ Deep Learning: excellent performance on a big dataset
Hardware dependencies:
◦ Machine Learning: works on a low-end machine
◦ Deep Learning: requires a powerful machine, preferably with a GPU; DL
performs a significant amount of matrix multiplication
Feature engineering:
◦ Machine Learning: need to understand the features that represent the data
◦ Deep Learning: no need to understand the best feature that represents
the data
Execution time:
◦ Machine Learning: from a few minutes to hours
◦ Deep Learning: up to weeks; a neural network needs to compute a
significant number of weights
Interpretability:
◦ Machine Learning: some algorithms are easy to interpret (logistic
regression, decision tree), some are almost impossible (SVM, XGBoost)
◦ Deep Learning: difficult to impossible
Ex. Classification task
Traditional approach: feature engineering (e.g., “has a nose below eyes?”)
Deep learning approach: no feature engineering
When to use ML or DL?
Both machine learning and statistics have the same objective
The same concepts have different names in the two fields
Statistics        Machine Learning
Estimation        Learning
Classifier        Hypothesis
Data point        Example / Instance
Regression        Supervised learning
Classification    Supervised learning
Covariate         Feature
Response          Label
Machine Learning: a subfield of Computer Science and
Artificial Intelligence.
◦ Deals with building systems that can learn from data, instead of
following explicitly programmed instructions.
◦ A new field.
The process of extracting useful
information from a huge amount of data.
Retail: Market basket analysis, Customer relationship
management (CRM)
Finance: Credit scoring, fraud detection
Manufacturing: Control, robotics, troubleshooting
Medicine: Medical diagnosis
Telecommunications: Spam filters, intrusion detection
Bioinformatics: Motifs, alignment
Web mining: Search engines
...
A Venn diagram that shows how machine learning and
statistics are related
Kdnuggets.com
Types of Machine Learning
Supervision: The training data (i.e., observations, measurements,
etc.) are accompanied by labels indicating the class of the
observations
◦ New data is classified based on the training set
Example:
◦ Spam Detection
Map emails to {Spam, Not Spam}
◦ Digit recognition
Map pixels to {0,1,2,3,4,5,6,7,8,9}
◦ Stock Prediction
Map news, historic prices, etc. to ℝ (the real numbers)
The class labels of the training data are unknown
Given a set of measurements, observations, etc., the aim is to
establish the existence of classes or clusters in the data
Example:
◦ Customer segmentation in CRM
◦ Image compression: Color quantization
◦ Bioinformatics: Learning motifs
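As an illustration of finding clusters without labels, here is a minimal sketch of k-means, one common clustering algorithm (chosen here as an example; the slides do not prescribe it at this point). The customer data, the choice of k = 2, and the use of the first k points as initial centroids are assumptions made to keep the example small and deterministic.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on 2-D points; returns (centroids, assignments).

    Initial centroids are simply the first k points, an assumption made
    here to keep the sketch deterministic.
    """
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, assignments


# Hypothetical customers: (visits per month, average basket value).
customers = [(1, 10), (8, 95), (2, 12), (1, 14), (9, 90), (10, 99)]
centroids, labels = kmeans(customers, k=2)
```

No labels are given to the algorithm; the two customer segments emerge from the distances between the points alone.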
Is there a target variable to be predicted?
◦ No: Unsupervised learning (Clustering)
◦ Yes: Supervised learning. Is the target variable categorical or
continuous?
   Categorical: Classification
   Continuous: Regression
Learning a policy: A sequence of outputs
No supervised output but delayed reward
◦ Credit assignment problem
the problem of measuring the influence and impact of an action taken by an
agent on future rewards
◦ Game playing
◦ Robot in a maze
◦ Multiple agents, partial observability, ...
Imagine that you were dropped off on an isolated island!
What would you do?
Panic? Yes, of course, initially we all would.
Later, you will learn how to live on the island by:
◦ exploring the environment,
◦ understanding the climate condition, the type of food that grows there,
the dangers of the island, etc.
This is exactly how Reinforcement Learning works.
◦ An agent is put in an unknown environment.
◦ The agent must learn by observing and performing actions so as to
maximize the rewards it receives.
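A minimal sketch of this loop, using tabular Q-learning (one standard reinforcement learning algorithm, picked here for illustration). The environment is a hypothetical five-cell corridor with a reward in the last cell; the learning rate, discount factor, episode count, and purely random exploration are all assumed values.

```python
import random

N_STATES = 5              # corridor cells 0..4; the reward sits in cell 4
ACTIONS = (-1, +1)        # move left or move right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount (assumed values)


def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(2)  # explore: pick a random action
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward the reward
            # plus the discounted best value of the next state.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
    return q


q_table = train()
# Greedy policy: in each non-terminal cell, pick the higher-valued action.
policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
```

The agent is never told which action is correct; it only sees a delayed reward at the end of the corridor, and the learned values propagate that reward back to earlier cells.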
Regression Algorithms
◦ Linear Regression
◦ Logistic Regression
◦ Stepwise Regression
Classification Algorithms
◦ Linear Classifier
◦ Support Vector Machine (SVM)
◦ Kernel SVM
◦ Sparse Representation-based classification (SRC)
Instance-based Algorithms
◦ k-Nearest Neighbor (kNN)
◦ Learning Vector Quantization (LVQ)
Regularization Algorithms
◦ Ridge Regression
◦ Least Absolute Shrinkage and Selection Operator (LASSO)
◦ Least-Angle Regression (LARS)
Bayesian Algorithms
◦ Naive Bayes
◦ Gaussian Naive Bayes
Clustering Algorithms
◦ k-Means clustering
◦ k-Medians
◦ Expectation Maximization (EM)
Dimensionality Reduction Algorithms
◦ Principal Component Analysis (PCA)
◦ Linear Discriminant Analysis (LDA)
Ensemble Algorithms
◦ Boosting
◦ AdaBoost
◦ Random Forest
Association
Supervised Learning
◦ Classification
◦ Regression
Unsupervised Learning
Reinforcement Learning
Given a set of records, each of which contains some number of
items from a given collection:
◦ P(Y | X): the probability that somebody who buys X also buys Y, where X
and Y are products/services.
◦ Example:
P(chips | beer) = 0.8
P(diaper | beer) = 0.7
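The conditional probability P(Y | X) can be estimated by counting baskets. The sketch below does this on a handful of made-up transactions, so the resulting numbers differ from the 0.8 and 0.7 quoted above; the function name and data are assumptions for illustration.

```python
def confidence(transactions, x, y):
    """Estimate P(y | x): among baskets containing x, the share that
    also contain y. Returns 0.0 if no basket contains x."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return 0.0
    return sum(1 for t in with_x if y in t) / len(with_x)


# Hypothetical market baskets.
baskets = [
    {"beer", "chips"},
    {"beer", "chips", "diaper"},
    {"beer", "diaper"},
    {"beer", "chips"},
    {"milk", "bread"},
]

p_chips_given_beer = confidence(baskets, "beer", "chips")
p_diaper_given_beer = confidence(baskets, "beer", "diaper")
```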
Market-basket analysis
◦ Rules are used for sales promotion, shelf management, and inventory
management
Medical Informatics
◦ Rules are used to find combinations of patient symptoms and test results
associated with certain diseases
tiki.vn
vinabook.com
How does classification work?
Example: Credit scoring
Differentiating between
low-risk and high-risk
customers from their
income and savings
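One simple way to picture the credit-scoring classifier is a rule of the form "income > theta1 and savings > theta2 means low-risk", with the two thresholds learned from labeled customers. That rule form, the exhaustive threshold search, and the data below are assumptions for this sketch, not the method the slide prescribes.

```python
from itertools import product


def fit_thresholds(customers):
    """Search for (theta1, theta2) so that the rule
    'income > theta1 and savings > theta2 => low-risk'
    misclassifies as few training customers as possible."""
    incomes = sorted({c[0] for c in customers})
    savings = sorted({c[1] for c in customers})
    best = None
    for t1, t2 in product(incomes, savings):
        errors = sum(
            ((inc > t1 and sav > t2) != low_risk)
            for inc, sav, low_risk in customers
        )
        if best is None or errors < best[0]:
            best = (errors, t1, t2)
    return best[1], best[2]


# Hypothetical training data: (income, savings, is_low_risk).
training = [
    (20, 5, False), (25, 40, False), (60, 3, False),
    (55, 30, True), (70, 45, True), (90, 25, True),
]
theta1, theta2 = fit_thresholds(training)


def is_low_risk(income, savings):
    """Classify a new customer with the learned thresholds."""
    return income > theta1 and savings > theta2
```

Once the thresholds are learned, `is_low_risk` can be applied to customers that were not in the training data, which is what classifying new data based on the training set means in practice.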
Training examples of a person
Test images
ORL dataset,
AT&T Laboratories, Cambridge UK
Training set Testing set
Example: Price of a used car
◦ x : car attributes
◦ y : price
$y = g(x \mid \theta)$, where $g(\cdot)$ is the model and $\theta$ its parameters
Linear model: $y = w x + w_0$
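For a single attribute, the linear model y = wx + w0 can be fitted by ordinary least squares in closed form. The sketch below assumes the attribute is the car's age and uses invented prices; only the model form comes from the slide.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = w*x + w0 with a single attribute."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx               # slope
    w0 = mean_y - w * mean_x    # intercept
    return w, w0


# Hypothetical used cars: age in years -> price in thousands.
ages = [1, 2, 3, 4, 5]
prices = [18.0, 16.0, 14.0, 12.0, 10.0]
w, w0 = fit_line(ages, prices)

# Predict the price of a 6-year-old car with the fitted parameters.
predicted = w * 6 + w0
```

Here w and w0 play the role of the parameters in g(x | parameters): they are computed from the data rather than set by hand.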
How does clustering work?
Document Clustering:
Detect significant deviations from normal
behavior
Applications:
◦ Credit Card Fraud Detection
◦ Network Intrusion Detection
◦ Identify anomalous behavior from sensor networks
for monitoring and surveillance.
◦ Detecting changes in the global forest cover.
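A very simple way to flag deviations from normal behavior is to measure how many standard deviations each value lies from the mean (a z-score). This is only one basic technique, chosen here for illustration; the transaction amounts and the threshold of 2 are assumptions.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard
    deviations away from the mean of `values`."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical: nothing deviates
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical card transactions; one amount is far from the usual range.
amounts = [12, 15, 11, 14, 13, 12, 16, 500, 14, 13]
suspicious = find_anomalies(amounts)
```

Real fraud or intrusion detection systems use richer models, but the idea is the same: learn what is typical, then report what is not.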
GAN: learns to model the input distribution by training two competing
(and cooperating) networks called the generator and the discriminator
2018
Towards the Automatic Anime Characters Creation with Generative Adversarial Networks
Text to image
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
Music generation
Nym: an AI built by Nguyễn Phi Vân over more than three years, fed
with data on language and everyday knowledge.
◦ The AI talked directly, one-on-one, with 11 million young Vietnamese
people on Facebook to collect information.
◦ “Nym - Tôi của tương lai” (“Nym: My Future Self”): a book co-written
by humans and AI.
3 AI news anchors appeared on Indonesian television
Ann:
◦ developed by Bobo Đặng, 3/2023
◦ MV “Làm sao nói thương anh”
◦ MV “Cry”, 08/2024
Large Language Model (LLM):
Applications of LLMs:
◦ text generation, machine translation, summarization, image
generation from text, code generation, chatbots, conversational AI
ChatGPT:
◦ a deep learning language model
◦ developed by OpenAI,
◦ which is capable of generating human-like text based on the input
provided
Prompt engineering
Sora (openai.com)
Sora is an AI model that can create realistic and imaginative scenes from
text instructions.
◦ text-to-video model
◦ can generate videos up to a minute long while maintaining visual quality and
adherence to the user’s prompt.
Prompt: A stylish woman walks down a Tokyo street filled with warm
glowing neon and animated city signage. She wears a black leather jacket, a
long red dress, and black boots, and carries a black purse. She wears
sunglasses and red lipstick. She walks confidently and casually. The street is
damp and reflective, creating a mirror effect of the colorful lights. Many
pedestrians walk about.
1997: Deep Blue (IBM) defeated Garry Kasparov
won 2 games,
lost 1,
drew 3
2016: AlphaGo (DeepMind, Google) defeated Lee Sedol.
won 4 games,
lost 1
AlphaZero
2005: Stanley, a driverless robotic car, won the DARPA
Grand Challenge (desert course)
QT-Opt achieves a 96% grasp success rate across 700 trials
with previously unseen objects, significantly outperforming
Google AI’s previous method, which had a 78% success rate.
A typical Machine Learning process:
Feature engineering:
◦ the process of selecting, manipulating, and transforming raw data into
features that can be used for building models.
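A small sketch of what such a transformation can look like: a raw house record is turned into a numeric feature vector by rescaling one field, deriving a new quantity from another, and one-hot encoding a categorical field. The record layout, field names, scaling constant, and reference year are all hypothetical.

```python
def make_features(record, districts, current_year=2025):
    """Turn one raw house record (a dict) into a numeric feature vector."""
    features = [
        record["area_m2"] / 100.0,             # rescale the area
        current_year - record["built_year"],   # derive age from build year
    ]
    # One-hot encode the categorical district.
    features += [1.0 if record["district"] == d else 0.0 for d in districts]
    return features


districts = ["center", "suburb"]
raw = {"area_m2": 80, "built_year": 2015, "district": "suburb"}
vector = make_features(raw, districts)
```

The model never sees the raw record, only the resulting vector, so the quality of these hand-made choices directly affects what it can learn.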
Training set: The sample of data used to fit the model.
Testing set: The sample of data used to provide an unbiased
evaluation of a final model fit on the training dataset.
Validation set:
◦ The sample of data used to provide an unbiased evaluation of a model
fit on the training dataset while tuning model hyperparameters.
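The three sets can be produced by shuffling the data once and cutting it into consecutive slices. The sketch below assumes a 60/20/20 ratio and a fixed random seed purely for illustration; as the slides note elsewhere, the appropriate ratio depends on the data and the model.

```python
import random


def split_dataset(samples, train=0.6, val=0.2, seed=0):
    """Shuffle and split into training, validation, and test sets.

    The 60/20/20 ratio is only an assumed default; whatever is left
    after the training and validation slices becomes the test set.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


train_set, val_set, test_set = split_dataset(range(10))
```

The model is fitted on `train_set`, hyperparameters are tuned against `val_set`, and `test_set` is touched only once, for the final evaluation.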
Hyperparameters: Parameters whose values control the learning
process and determine the values of model parameters that a
learning algorithm ends up learning.
◦ The prefix ‘hyper-’ indicates ‘top-level’ parameters that control the
learning process and the model parameters that result from it.
Split ratio depends on two things:
◦ First, the total number of samples in your data
◦ Second, the actual model we are training.
Example:
◦ Models with very few hyperparameters will be easy to
validate and tune ➔ a small validation set
◦ Models with many hyperparameters➔ a large validation
set
[Figure: dataset split into a K% portion and a (100-K)% portion; source: bigdatauni.com]
https://en.wikipedia.org/wiki/Cross-validation_(statistics)
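In k-fold cross-validation, the data is divided into k folds and each fold serves once as the validation set while the remaining folds are used for training. The sketch below only generates the index sets for each round, without shuffling; the function name and the 6-sample, 3-fold example are assumptions.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(n_samples=6, k=3))
```

Averaging the validation score over all folds gives a more stable estimate than a single split, at the cost of training the model k times.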
Examples of hyperparameters:
◦ Train-test split ratio
◦ Learning rate in optimization algorithms (e.g. gradient descent)
◦ Number of hidden layers in a neural network
◦ Number of iterations (epochs) in training a neural network
◦ Number of clusters in a clustering task
◦ Kernel or filter size in convolutional layers
◦ Pooling size
◦ …
Parameters:
◦ Are internal to the model
◦ Are learned or estimated purely from the data during training
Examples of parameters
◦ The coefficients (or weights) of linear and logistic regression models.
◦ Weights and biases of a neural network
◦ The cluster centroids in clustering
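The distinction between parameters and hyperparameters shows up directly in code: hyperparameters are arguments chosen before training, while parameters are the values the training loop computes. The sketch below fits a line by batch gradient descent; the data and the particular learning rate and epoch count are assumed values.

```python
def train_linear(xs, ys, learning_rate=0.05, epochs=1000):
    """Fit y = w*x + b by batch gradient descent on mean squared error.

    learning_rate and epochs are hyperparameters (set before training);
    w and b are parameters (learned from the data).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
w, b = train_linear(xs, ys)
```

Changing `learning_rate` or `epochs` changes how training proceeds, but the values of `w` and `b` are never set by hand.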
FACULTY OF INFORMATION TECHNOLOGY