ML1 Introduction

The document provides an overview of Machine Learning (ML), including its definition, types, algorithms, and applications. It discusses the differences between traditional programming and ML, as well as the relationship between ML and statistics. Additionally, it covers various ML processes, such as supervised, unsupervised, and reinforcement learning, along with examples of algorithms used in these categories.

Uploaded by

Lâm Bảo Duy
Copyright: © All Rights Reserved
FACULTY OF INFORMATION TECHNOLOGY

Semester 2, 2024/2025
 What is Machine Learning?

 Machine Learning Types

 Machine Learning Algorithms

 Applications of Machine Learning

 Machine Learning process

ML – NLU 3
 How can we make a robot cook?

 How to recognize a math expression?

Photomath: https://photomath.net/
 How does Netflix recommend films to a user?

 How to predict the price of a house?

Information Retrieval

 How does Google search images based on a given text?

 How can AI play games?

 Am I going to pass the ML course?

 Dictionary defines “to learn” as:

◦ To get knowledge of something by study, experience, or being taught.

◦ To become aware by information or from observation

◦ To commit to memory

◦ To be informed of or to ascertain

◦ To receive instruction

 A program learns from experience E on task T with respect to
performance measure P if its performance on T, as measured by P,
improves with experience E.

 A learning program:
◦ produces a representation R (often called a hypothesis h) of what it has
learned (or a model).
◦ another program can use R to perform T.
◦ uses a learning algorithm A to produce R from E.
◦ many different algorithms can be used to produce the same type of
representation.

 Task T:
◦ playing checkers
 Performance measure P:
◦ percent of games won against a set of players.
 Training experience E:
◦ games played against the other players.
 Representation R:
◦ an evaluation function that measures the goodness of the board.
 Learning algorithm A:
◦ a variation of temporal difference (predicting the total reward expected over
the future)

 Task T:
◦ recognize handwritten letters.

 Performance measure P:
◦ error rate on sample handwriting.

 Training experience E:
◦ gray scale images of sample handwriting, all identified in advance.

 Representation R:
◦ support vector machine, where each pixel is a separate attribute.

 Learning algorithm A:
◦ SVMlight.
 Wikipedia: (ML introduced in 1980’s)
◦ Machine learning is the subfield of computer science that “gives
computers the ability to learn without being explicitly programmed”

 Ability of computers to “learn” from “data” or “past experience”
◦ learn: Make intelligent predictions or decisions based on data by
optimizing a model
◦ data: Comes from various sources such as sensors, domain knowledge,
experimental runs, etc.

 Machine learning is programming computers to optimize a
performance criterion using example data or past experience.
 There is no need to “learn” to calculate payroll
 Learning is used when:
◦ Human expertise does not exist (navigating on Mars),
◦ Humans are unable to explain their expertise (speech recognition)
◦ Solution changes in time (routing on a computer network)
◦ Solution needs to be adapted to particular cases (user biometrics)

 Learning general models from data of particular examples
 Data is cheap and abundant (data warehouses, data marts);
knowledge is expensive and scarce.
 Example in retail: Customer transactions to consumer
behavior:
People who bought “Blink” also bought “Outliers” (www.amazon.com)
 Build a model that is a good and useful approximation to the
data.

Traditional Programming: Data + Program ➔ Computer ➔ Output

Machine Learning: Data + Output ➔ Computer ➔ Model
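The contrast between the two diagrams can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spam-by-link-count task, the function names, and the toy data are all hypothetical and not from the slides. In the first function a human writes the rule; in the second, the rule (a threshold) is derived from data paired with the expected outputs.

```python
# Traditional programming: a human writes the rule (the program).
def is_spam_rule(num_links: int) -> bool:
    return num_links > 3  # hand-chosen threshold (hypothetical)


# Machine learning: the rule is derived from data + expected outputs.
def learn_threshold(examples: list[tuple[int, bool]]) -> int:
    """Return the threshold t for the rule 'x > t' with the fewest errors."""
    candidates = sorted({x for x, _ in examples})
    best_t, best_errors = candidates[0], len(examples) + 1
    for t in candidates:
        errors = sum((x > t) != label for x, label in examples)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


# Toy labelled data: (number of links in an email, is it spam?)
data = [(0, False), (1, False), (2, False), (5, True), (7, True), (9, True)]
threshold = learn_threshold(data)  # the learned "model"


def is_spam_learned(num_links: int) -> bool:
    return num_links > threshold
```

The learned threshold plays the role of the "Model" box in the second diagram: it is an output of the computer rather than an input to it.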

 AI: Human intelligence exhibited by machines
◦ empowers computers to mimic human intelligence such as decision-making,
text processing, and visual perception.
 ML: An approach to achieve Artificial Intelligence
◦ a subfield of Artificial Intelligence that enables machines to improve at a
given task with experience
 DL: A technique for implementing Machine Learning
◦ a specialized field of Machine Learning that relies on training of Deep
Artificial Neural Networks (ANNs) using a large dataset such as images or
texts

➔ Thanks to Deep Learning, AI has a bright future!

 ML is a subfield of AI “concerned with the question of how to
construct computer programs that automatically improve with
experience.” [Tom Mitchell, 1997]
 Improve = learn
 Experience = data
 Computer = unnecessary

Feature comparison: Machine Learning vs. Deep Learning

 Data dependencies
◦ Machine Learning: excellent performance on a small/medium dataset
◦ Deep Learning: excellent performance on a big dataset
 Hardware dependencies
◦ Machine Learning: works on a low-end machine
◦ Deep Learning: requires a powerful machine, preferably with a GPU; DL performs a
significant amount of matrix multiplication
 Feature engineering
◦ Machine Learning: need to understand the features that represent the data
◦ Deep Learning: no need to understand the best feature that represents the data
 Execution time
◦ Machine Learning: from a few minutes to hours
◦ Deep Learning: up to weeks; a neural network needs to compute a significant
number of weights
 Interpretability
◦ Machine Learning: some algorithms are easy to interpret (logistic regression,
decision tree), some are almost impossible (SVM, XGBoost)
◦ Deep Learning: difficult to impossible
Ex. Classification task

 Traditional approach: feature engineering (has two eyes? has a nose below the eyes?)
 Deep learning approach: no feature engineering
➔ Ok, it’s a face!
 When to use ML or DL?

Feature               Machine Learning   Deep Learning
Training dataset      Small              Large
Choose features       Yes                No
Number of algorithms  Many               Few
Training time         Short              Long

 Both machine learning and statistics have the same objective
 The same concepts have different names in the two fields
Statistics       Machine Learning
Estimation       Learning
Classifier       Hypothesis
Data point       Example / Instance
Regression       Supervised Learning
Classification   Supervised Learning
Covariate        Feature
Response         Label

 Machine Learning: a subfield of Computer Science and
Artificial Intelligence.
◦ Deals with building systems that can learn from data, instead of
following explicitly programmed instructions.
◦ A new field.
◦ Cheap computing power and the availability of large amounts of data
allowed data scientists to train computers to learn by analyzing data.

 Statistics: a subfield of Mathematics.
◦ Statistical modeling existed long before computers were invented.

 Data mining: the process of extracting useful
information from a huge amount of data.

◦ Focused on the discovery of previously unknown and important properties in data.

◦ Used for extracting patterns from data.

 Retail: Market basket analysis, Customer relationship
management (CRM)
 Finance: Credit scoring, fraud detection
 Manufacturing: Control, robotics, troubleshooting
 Medicine: Medical diagnosis
 Telecommunications: Spam filters, intrusion detection
 Bioinformatics: Motifs, alignment
 Web mining: Search engines
 ...
 A Venn diagram that shows how machine learning and
statistics are related

Kdnuggets.com

Types of Machine Learning

 Supervision: The training data (i.e., observations, measurements,
etc.) are accompanied by labels indicating the class of the
observations
◦ New data is classified based on the training set
 Example:
◦ Spam Detection
 Map emails to {Spam, Not Spam}
◦ Digit recognition
 Map pixels to {0,1,2,3,4,5,6,7,8,9}
◦ Stock Prediction
  Map new, historic prices, etc. to ℝ (the real numbers)
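"New data is classified based on the training set" can be illustrated with a one-nearest-neighbour classifier, one of the simplest supervised methods. The sketch below is an assumption-laden toy: the two-dimensional feature vectors and the function name are hypothetical, chosen only to show labelled training data being used to label a new observation.

```python
def nearest_neighbor_predict(train, x):
    """Return the label of the training example closest to x.

    train: list of (feature_vector, label) pairs, i.e. labelled observations.
    """
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, label = min(train, key=lambda pair: squared_distance(pair[0], x))
    return label


# Toy labelled training set (hypothetical features of emails).
training_set = [
    ((0.0, 0.1), "Not Spam"),
    ((0.2, 0.0), "Not Spam"),
    ((0.9, 1.0), "Spam"),
    ((1.0, 0.8), "Spam"),
]

prediction = nearest_neighbor_predict(training_set, (0.8, 0.9))
```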

 The class labels of the training data are unknown
 Given a set of measurements, observations, etc., the aim is
to establish the existence of classes or clusters in the data

 Example:
◦ Customer segmentation in CRM
◦ Image compression: Color quantization
◦ Bioinformatics: Learning motifs
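As a hedged illustration of finding clusters without labels, here is a minimal one-dimensional k-means sketch. The data (say, yearly spending per customer), the starting centroids, and the function name are assumptions made for the example; k-means is one clustering method among several and is not singled out by the slide.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


spending = [1, 2, 3, 10, 11, 12]          # unlabelled observations
centers, groups = kmeans_1d(spending, centroids=[1.0, 12.0])
```

No labels are supplied anywhere: the two groups emerge only from the distances between observations.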

Is there a target variable to be predicted?
◦ No ➔ Clustering (unsupervised)
◦ Yes (supervised) ➔ Is the target variable categorical or continuous?
   Categorical ➔ Classification
   Continuous ➔ Is the independent variable a time period?
     No ➔ Regression
     Yes ➔ Forecasting

 Learning a policy: A sequence of outputs
 No supervised output but delayed reward
◦ Credit assignment problem
 the problem of measuring the influence and impact of an action taken by an
agent on future rewards

◦ Game playing
◦ Robot in a maze
◦ Multiple agents, partial observability, ...

 Imagine that you were dropped off on an isolated island!
 What would you do?
 Panic? Yes, of course, initially we all would.

 Later, you will learn how to live on the island by:
◦ exploring the environment,
◦ understanding the climate conditions, the types of food that grow there,
the dangers of the island, etc.

 This is exactly how Reinforcement Learning works.
◦ An agent is put in an unknown environment.
◦ The agent must learn by observing and performing actions that result in
rewards, with the goal of maximizing the reward.
◦ Learning proceeds by trial and error.
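The trial-and-error idea can be sketched with tabular Q-learning on a tiny corridor world. Everything here is an assumption for illustration: the five-cell environment, the reward of 1 at the goal, the hyperparameter values, and the purely random exploration are not taken from the slides, and Q-learning is just one reinforcement learning algorithm.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # index 0 = move left, index 1 = move right


def step(state, action):
    """Environment: move, clip to the corridor, reward 1 only at the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(2)  # explore by acting at random (trial and error)
            nxt, reward, done = step(state, ACTIONS[a])
            # Delayed reward is propagated backwards through the estimates.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


q_table = q_learning()
# Greedy policy for the non-goal cells.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
```

The agent is never told that moving right is correct; it only observes rewards, and the value estimates gradually assign credit to the earlier actions that led to them.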


 An example of Reinforcement Learning:

 Regression Algorithms
◦ Linear Regression
◦ Logistic Regression
◦ Stepwise Regression

 Classification Algorithms
◦ Linear Classifier
◦ Support Vector Machine (SVM)
◦ Kernel SVM
◦ Sparse Representation-based classification (SRC)

 Instance-based Algorithms
◦ k-Nearest Neighbor (kNN)
◦ Learning Vector Quantization (LVQ)

 Regularization Algorithms
◦ Ridge Regression
◦ Least Absolute Shrinkage and Selection Operator (LASSO)
◦ Least-Angle Regression (LARS)

 Bayesian Algorithms
◦ Naive Bayes
◦ Gaussian Naive Bayes

 Clustering Algorithms
◦ k-Means clustering
◦ k-Medians
◦ Expectation Maximization (EM)

 Artificial Neural Network Algorithms


◦ Perceptron
◦ Softmax Regression
◦ Multilayer Perceptron
◦ Backpropagation

 Dimensionality Reduction Algorithms
◦ Principal Component Analysis (PCA)
◦ Linear Discriminant Analysis (LDA)

 Ensemble Algorithms
◦ Boosting
◦ AdaBoost
◦ Random Forest

 Association

 Supervised Learning

◦ Classification

◦ Regression

 Unsupervised Learning

 Reinforcement Learning

 Given a set of records, each of which contains some number of
items from a given collection:
◦ P(Y | X): probability that somebody who buys X also buys Y, where X
and Y are products/services.
◦ Example:
  P(chips | beer) = 0.8
  P(diaper | beer) = 0.7
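A conditional probability such as P(chips | beer) can be estimated by counting. The sketch below assumes a hypothetical list of shopping baskets (the numbers differ from the slide's 0.8 and 0.7, which have no underlying data given) and a helper name invented for the example.

```python
def conditional_probability(transactions, x, y):
    """Estimate P(y | x): among baskets containing x, the fraction that also contain y."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return 0.0
    return sum(1 for t in with_x if y in t) / len(with_x)


# Hypothetical purchase records (one set of items per customer visit).
baskets = [
    {"beer", "chips"},
    {"beer", "chips", "diaper"},
    {"beer", "diaper"},
    {"beer", "chips"},
    {"milk", "chips"},
]

p_chips_given_beer = conditional_probability(baskets, "beer", "chips")    # 3 of 4 beer baskets
p_diaper_given_beer = conditional_probability(baskets, "beer", "diaper")  # 2 of 4 beer baskets
```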

 Market-basket analysis

◦ Rules are used for sales promotion, shelf management, and inventory
management

 Telecommunication alarm diagnosis

◦ Rules are used to find combinations of alarms that occur together
frequently in the same time period

 Medical Informatics

◦ Rules are used to find combinations of patient symptoms and test results
associated with certain diseases

tiki.vn

vinabook.com

 How does classification work?

EDA: Exploratory Data Analysis

 Example: Credit scoring
 Differentiating between
low-risk and high-risk
customers from their
income and savings

Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
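The discriminant translates directly into code. In this sketch the threshold values passed for θ1 and θ2 are hypothetical placeholders; in a learning setting they would be fitted from labelled customer data rather than typed in.

```python
def credit_risk(income: float, savings: float, theta1: float, theta2: float) -> str:
    """IF income > theta1 AND savings > theta2 THEN low-risk ELSE high-risk."""
    return "low-risk" if income > theta1 and savings > theta2 else "high-risk"


# Hypothetical thresholds, for illustration only.
THETA1, THETA2 = 30_000, 5_000

customer_a = credit_risk(income=45_000, savings=8_000, theta1=THETA1, theta2=THETA2)
customer_b = credit_risk(income=45_000, savings=1_000, theta1=THETA1, theta2=THETA2)
```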
 Aka Pattern recognition
 Face recognition: Pose, lighting, occlusion (glasses, beard),
make-up, hair style
 Character recognition: Different handwriting styles.
 Speech recognition: Temporal dependency.
 Medical diagnosis: From symptoms to illnesses
 Biometrics: Recognition/authentication using physical and/or
behavioral characteristics: Face, iris, signature, etc
 ...

Training examples of a person

Test images

ORL dataset,
AT&T Laboratories, Cambridge UK

Training set Testing set

 Example: Price of a used car
◦ x : car attributes
◦ y : price

Model: y = g(x | θ), where g(·) is the model and θ its parameters
Linear case: y = wx + w0
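For the linear case y = wx + w0, the parameters can be estimated by ordinary least squares. The sketch below assumes a single attribute (car age in years) and made-up prices; the closed-form formulas are the standard ones for simple linear regression, and the function name is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates of w and w0 in y = w*x + w0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    w0 = mean_y - w * mean_x
    return w, w0


# Hypothetical used-car data: age in years, price in thousands.
ages = [1, 2, 3, 4]
prices = [9, 8, 7, 6]

w, w0 = fit_line(ages, prices)
predicted_price = w * 5 + w0  # prediction for a 5-year-old car
```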

 How does clustering work?

EDA: Exploratory Data Analysis

 Document Clustering:

◦ Goal: To find groups of documents that are similar to each other
based on the important terms appearing in them.

◦ Approach: Identify frequently occurring terms in each document.
Form a similarity measure based on the frequencies of different terms.
Use it to cluster.
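One common way to form a similarity measure from term frequencies is cosine similarity between term-count vectors. The slide does not name a specific measure, so treat that choice, the whitespace tokenization, and the three toy documents as assumptions of this sketch.

```python
import math
from collections import Counter


def term_frequencies(text: str) -> Counter:
    """Count how often each (lower-cased, whitespace-separated) term occurs."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


docs = [
    "neural network training",
    "training a neural network",
    "stock market prices",
]
vectors = [term_frequencies(d) for d in docs]

sim_0_1 = cosine_similarity(vectors[0], vectors[1])  # share three terms
sim_0_2 = cosine_similarity(vectors[0], vectors[2])  # share no terms
```

A clustering algorithm would then group documents 0 and 1 together because their similarity is much higher than either one's similarity to document 2.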

 Detect significant deviations from normal
behavior
 Applications:
◦ Credit Card Fraud Detection
◦ Network Intrusion Detection
◦ Identify anomalous behavior from sensor networks
for monitoring and surveillance.
◦ Detecting changes in the global forest cover.
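A very simple way to detect "significant deviations from normal behavior" is a z-score test: flag values that lie more than a chosen number of standard deviations from the mean. The threshold of 2.0, the transaction amounts, and the function name are assumptions for illustration; production fraud or intrusion detectors are considerably more elaborate.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical card transaction amounts; one is far from the usual range.
amounts = [10, 11, 9, 10, 12, 10, 11, 95]
suspicious = find_anomalies(amounts)
```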

 GAN: learn to model the input distribution by training two competing
(and cooperating) networks called generator and discriminator

Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation, 2018

 Generate Anime characters

Towards the Automatic Anime Characters Creation with Generative Adversarial Networks

 Text to image

StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

 Music generation

MidiNet: A Convolutional Generative Adversarial Network for Symbolic-domain Music Generation

 Nym: an AI built by Nguyễn Phi Vân over
more than three years, loaded with data on
language and everyday knowledge.
◦ The AI talked directly, one-on-one, with 11
million young Vietnamese people on Facebook to collect
information.
◦ “Nym - Tôi của tương lai” (“Nym - My Future Self”): a book
co-written by a human and the AI.

 3 AI news anchors appeared on Indonesian television

 Ann:
◦ developed by Bobo Đặng, 3/2023
◦ MV “Làm sao nói thương anh”
◦ MV “Cry”, 08/2024

 Large Language Model (LLM):

◦ a type of artificial intelligence algorithm

◦ applies neural network techniques with lots of parameters to process
and understand human languages or text using self-supervised
learning techniques

◦ uses deep learning-based models such as transformers, with millions to
billions of parameters in their architecture, which helps produce better
results on NLP tasks

 Applications of LLM:
◦ text generation, machine translation, summary writing, image
generation from texts, machine coding, chat-bots, Conversational AI

◦ Examples: ChatGPT by OpenAI, BERT (Bidirectional Encoder
Representations from Transformers) by Google, PhoBERT (pre-trained
language models for Vietnamese)

 ChatGPT:
◦ a deep learning language model
◦ developed by OpenAI,
◦ which is capable of generating human-like text based on the input
provided

Prompt engineering

Sora (openai.com)

 Sora is an AI model that can create realistic and imaginative scenes from
text instructions.
◦ text-to-video model
◦ can generate videos up to a minute long while maintaining visual quality and
adherence to the user’s prompt.

 Prompt: A stylish woman walks down a Tokyo street filled with warm
glowing neon and animated city signage. She wears a black leather jacket, a
long red dress, and black boots, and carries a black purse. She wears
sunglasses and red lipstick. She walks confidently and casually. The street is
damp and reflective, creating a mirror effect of the colorful lights. Many
pedestrians walk about.

 1997: Deep Blue (IBM) defeated Garry Kasparov

won 2 games,
lost 1,
drew 3

 2016: AlphaGo (DeepMind, Google) defeated Lee Sedol.

won 4 games,
lost 1

AlphaZero

 2005: STANLEY, a driverless robotic car, won the DARPA
Grand Challenge (desert)

 2007: BOSS, CMU’s driverless robotic car, won the DARPA Urban
Challenge

 QT-Opt achieves a 96% grasp success rate across 700 trials
with previously unseen objects, significantly outperforming
Google AI’s previous method, which had a 78% success rate.

 A typical Machine Learning process:

 Feature engineering:
◦ the process of selecting, manipulating, and transforming raw data into
features that can be used for building models.

 Training set: The sample of data used to fit the model.
 Testing set: The sample of data used to provide an unbiased
evaluation of a final model fit on the training dataset.

 Validation set:
◦ The sample of data used to provide an unbiased evaluation of a model
fit on the training dataset while tuning model hyperparameters.

 Hyperparameters: Parameters whose values control the learning
process and determine the values of model parameters that a
learning algorithm ends up learning.
◦ The prefix ‘hyper_’: ‘top-level’ parameters that control the learning process
and the model parameters that result from it.
 Split ratio depends on two things:
◦ First, the total number of samples in your data
◦ Second, the actual model we are training.
 Example:
◦ Models with very few hyperparameters will be easy to
validate and tune ➔ a small validation set
◦ Models with many hyperparameters➔ a large validation
set
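A three-way split into training, validation, and testing sets might be sketched as follows. The 60/20/20 ratio, the fixed seed, and the function name are assumptions for illustration; as noted above, the appropriate ratio depends on the amount of data and on the model being trained.

```python
import random


def split_dataset(samples, train=0.6, validation=0.2, seed=0):
    """Shuffle a copy of the samples and cut it into train/validation/test parts.

    Whatever remains after the train and validation parts becomes the test set.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


data = list(range(100))  # stand-in for 100 samples
train_set, validation_set, test_set = split_dataset(data)
```

A model with many hyperparameters would call for a larger `validation` fraction; one with few could use a smaller one.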

[Figure: hold-out split of the data into K% and (100-K)% portions. Source: bigdatauni.com]

https://en.wikipedia.org/wiki/Cross-validation_(statistics)
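The linked article covers k-fold cross-validation, in which each sample serves as test data exactly once. A minimal sketch of how the index sets could be generated is shown below; the function name is hypothetical, and shuffling and model fitting are left out.

```python
def k_fold_indices(n_samples: int, k: int):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


folds = k_fold_indices(n_samples=10, k=3)
fold_test_sizes = [len(test) for _, test in folds]
```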

 Examples of hyperparameters:
◦ Train-test split ratio
◦ Learning rate in optimization algorithms (e.g. gradient descent)
◦ Number of hidden layers in a neural network
◦ Number of iterations (epochs) in training a neural network
◦ Number of clusters in a clustering task
◦ Kernel or filter size in convolutional layers
◦ Pooling size
◦ …

 Parameters:
◦ Are internal to the model
◦ Are learned or estimated purely from the data during training

 Examples of parameters
◦ The coefficients (or weights) of linear and logistic regression models.
◦ Weights and biases of a neural network
◦ The cluster centroids in clustering
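The difference between parameters and hyperparameters shows up clearly in a small gradient-descent sketch. Here `learning_rate` and `epochs` are hyperparameters chosen before training, while `w` and `b` are parameters learned from the data. The toy data and the specific values are assumptions made for the example.

```python
def train_linear_model(xs, ys, learning_rate=0.05, epochs=500):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # parameters: start arbitrary, get learned from data
    n = len(xs)
    for _ in range(epochs):  # hyperparameter: number of passes over the data
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w  # hyperparameter: step size
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
w, b = train_linear_model(xs, ys)
```

Changing `learning_rate` or `epochs` alters how training proceeds, whereas `w` and `b` are whatever values the training process ends up with.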
