Topic 08 - Data Modelling - Part II
Le Ngoc Thanh
[email protected]
Department of Computer Science
Process
Data Science’s tasks
ML Tasks
The course’s focus
◎ In this course, we focus on three main groups of ML:
○ Regression
○ Classification
○ Clustering
General model learning architecture
(Hypothesis)
Classification
◎ Classification is the problem of identifying which of a set of
categories an observation belongs to.
Classification
◎ The inputs and outputs for the binary classification learning task can be stated as follows:
○ Input
  data $\mathbf{x}_j \in \mathbb{R}^n$, $j \in Z := \{1, 2, \dots, m\}$
  labels $\mathbf{y}_j \in \{\pm 1\}$, $j \in Z' \subset Z$
○ Output
  labels $\mathbf{y}_j \in \{\pm 1\}$, $j \in Z$
Challenges
◎ Some challenges in the classification task:
○ The boundary between the classes forms a nonlinear manifold that is difficult to characterize.
○ If the sampled data only captures a portion of this manifold, the classifier will almost surely fail to characterize the full population.
○ Data may live in a high-dimensional space where visualization is essentially impossible.
Well-known classification algorithms
◎ Some well-known classifiers:
○ Support Vector Machine (SVM)
○ Classification and Regression Tree (CART)
○ k-nearest Neighbors (kNN)
○ Naïve Bayes
○ Ensemble Learning and Boosting (AdaBoost)
○ Ensemble Learning of Decision Tree (C4.5)
○ Deep neural nets
Contents
◎ Data science and machine learning review
◎ Classification model
○ Support Vector Machine (SVM)
○ Neural network
◎ Clustering model
Support Vector Machines
◎ The original SVM algorithm by Vapnik and Chervonenkis evolved out of the
statistical learning literature in 1963, where hyperplanes are optimized to split
the data into distinct classes.
Linear SVM
◎ The key idea of the linear SVM method
is to construct a hyperplane:
$$\mathbf{w}^T \mathbf{x} + b = 0$$
where the vector $\mathbf{w}$ and the constant $b$ parametrize the hyperplane.
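As a practical illustration of fitting such a hyperplane, here is a minimal sketch that assumes scikit-learn and synthetic 2-D data (neither is prescribed by the slides); it fits a linear SVM and reads off $\mathbf{w}$ and $b$:

# Minimal sketch (assumes scikit-learn): fit a linear SVM on synthetic data
# and read off the hyperplane parameters w and b of w^T x + b = 0.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=(2, 2), size=(50, 2)),     # class +1 samples
               rng.normal(loc=(-2, -2), size=(50, 2))])  # class -1 samples
y = np.hstack([np.ones(50), -np.ones(50)])

clf = SVC(kernel="linear", C=1.0)   # C is the soft-margin penalty (later slides)
clf.fit(X, y)

w = clf.coef_[0]       # the vector w
b = clf.intercept_[0]  # the constant b
print("w =", w, "b =", b)
print("first predictions:", clf.predict(X[:5]))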
Optimization problem in SVM
◎ The optimization problem in SVM includes:
○ Optimize a decision line that makes the fewest labeling errors.
○ Optimize the largest possible margin between the data of the two classes.
Objective function
◎ The loss function is defined as follows:
$$L(\mathbf{y}_j, \hat{\mathbf{y}}_j) = L\!\left(\mathbf{y}_j, \operatorname{sign}(\mathbf{w}^T \mathbf{x}_j + b)\right)$$
Noisy/Nonlinear Classification
◎ Two basic approaches:
○ Use a linear classifier, but allow some (penalized) errors
◉ Soft margin, slack variables
○ Project data into higher dimensional space
◉ Do linear classification there
◉ Kernel functions
Soft margin
◎ Margin violation means choosing a hyperplane that allows some data points to lie either inside the margin area or on the incorrect side of the hyperplane.
Map to higher-dimensional space
◎ Embedding the data in a higher dimensional space:
𝐱 → Φ(𝐱)
◎ For example:
$$(x_1, x_2) \rightarrow (z_1, z_2, z_3) := (x_1, x_2, x_1^2 + x_2^2)$$
Kernel trick
◎ A kernel function (similarity function) takes input vectors in the original space and returns the dot product of these vectors in the feature space (a real number).
$$K(\mathbf{x}, \mathbf{z}) = \Phi(\mathbf{x}) \cdot \Phi(\mathbf{z})$$
Kernel trick
◎ The objective function only includes the dot product of the transformed feature vectors.
$$\mathbf{w} = \sum_{j=1}^{m} \alpha_j \Phi(\mathbf{x}_j)$$
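As a small numerical check of the kernel trick, the sketch below assumes the quadratic kernel $K(\mathbf{x}, \mathbf{z}) = (\mathbf{x} \cdot \mathbf{z})^2$ (not the specific mapping shown two slides earlier) and verifies that it equals the dot product under the explicit feature map $\Phi(\mathbf{x}) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$:

# Sketch: a kernel computed in the original space equals a dot product
# in the feature space, without ever forming Phi explicitly.
import numpy as np

def phi(v):
    # Explicit feature map for the quadratic kernel (2-D input -> 3-D feature)
    x1, x2 = v
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def K(x, z):
    # The same similarity computed directly in the original space
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
print(np.dot(phi(x), phi(z)))  # dot product in the feature space
print(K(x, z))                 # identical value from the kernel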
Generic architecture of a multi-layer NN
◎ For classification tasks, the goal of the NN is to map a set of
input data to a classification.
One-layer network
◎ First, consider a single layer network for binary classification:
One-layer network
◎ Hypothesis with linear mapping:
$$\mathbf{A}\mathbf{X} = \mathbf{Y}$$
$$\begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}
\begin{bmatrix} | & | & & | \\ \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_m \\ | & | & & | \end{bmatrix}
= \begin{bmatrix} +1 & +1 & \cdots & -1 & -1 \end{bmatrix}$$
where each column of the matrix $\mathbf{X}$ is a dog or cat image ($\mathbf{x}_j \in \mathbb{R}^n$) and the columns of $\mathbf{Y}$ are its corresponding labels.
One-layer network
◎ The linear mapping:
$$\mathbf{A}\mathbf{X} = \mathbf{Y}$$
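In practice the weights $\mathbf{A}$ of this linear model can be obtained as a least-squares solution of $\mathbf{A}\mathbf{X} = \mathbf{Y}$, for example with the pseudo-inverse. The sketch below uses random vectors in place of real dog/cat images (the sizes and data are assumptions made only for illustration):

# Sketch: solve A X = Y in the least-squares sense via the pseudo-inverse.
import numpy as np

n, m = 1024, 80                      # n "pixels" per image, m training images
X = np.random.randn(n, m)            # each column stands in for one image
Y = np.hstack([np.ones(m // 2), -np.ones(m // 2)]).reshape(1, m)  # labels +/-1

A = Y @ np.linalg.pinv(X)            # least-squares weights, shape (1, n)

Y_hat = np.sign(A @ X)               # predicted labels on the training data
print("training accuracy:", np.mean(Y_hat == Y))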
Nonlinear transformations
◎ Hypothesis with nonlinear mapping:
$$\mathbf{y} = f(\mathbf{A}, \mathbf{x})$$
where $f(\cdot)$ is called an activation function (transfer function) in neural networks.
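For instance, the linear output $\mathbf{A}\mathbf{x}$ can be passed through a nonlinear activation such as tanh or the sigmoid (a small sketch; the sizes and the particular activations are illustrative assumptions):

# Sketch: a nonlinear one-layer mapping y = f(A, x) with common activations.
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

A = np.random.randn(1, 4)   # weights of a single output unit
x = np.random.randn(4)      # one input vector

print("linear:  ", A @ x)            # y = A x
print("tanh:    ", np.tanh(A @ x))   # y = f(A, x) with f = tanh
print("sigmoid: ", sigmoid(A @ x))   # y = f(A, x) with f = sigmoid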
Neural network optimization
◎ The optimization of a NN (determining the weights of the network) is done through the backpropagation process.
○ The process backpropagates the error through the network and relies on the chain rule of differentiation.
◎ For example, consider a simple neural network:
Neural network optimization
◎ The compositional structure is:
$$y = g(z, b) = g(f(x, a), b)$$
◎ The error function is:
$$E = \frac{1}{2}\left(y_0 - y\right)^2$$
Neural network optimization
◎ Partial derivatives:
$$\frac{\partial E}{\partial a} = -(y_0 - y)\frac{\partial y}{\partial a} = -(y_0 - y)\frac{\partial y}{\partial z}\frac{\partial z}{\partial a} \quad \text{(chain rule)}$$
$$\frac{\partial E}{\partial b} = -(y_0 - y)\frac{\partial y}{\partial b}$$
◎ Gradient descent:
$$a_{k+1} = a_k - \eta\,\frac{\partial E}{\partial a_k}$$
$$b_{k+1} = b_k - \eta\,\frac{\partial E}{\partial b_k}$$
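As a concrete check of these formulas, the sketch below assumes the simple choices $f(x, a) = \tanh(ax)$ and $g(z, b) = bz$ (the slides do not fix $f$ and $g$) and compares the chain-rule gradients against finite differences:

# Sketch: chain-rule (backpropagation) gradients for y = g(f(x, a), b),
# assuming f(x, a) = tanh(a*x) and g(z, b) = b*z.
import numpy as np

x, y0 = 1.5, 0.7          # one training sample and its target output
a, b = 0.3, -0.2          # current weights

z = np.tanh(a * x)        # forward pass: z = f(x, a)
y = b * z                 #               y = g(z, b)

dE_da = -(y0 - y) * b * (1 - z**2) * x   # -(y0 - y) dy/dz dz/da
dE_db = -(y0 - y) * z                    # -(y0 - y) dy/db

# Finite-difference check of both derivatives
eps = 1e-6
E = lambda a_, b_: 0.5 * (y0 - b_ * np.tanh(a_ * x)) ** 2
print(dE_da, (E(a + eps, b) - E(a - eps, b)) / (2 * eps))
print(dE_db, (E(a, b + eps) - E(a, b - eps)) / (2 * eps))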
Overall training process of a neural network
1. A NN is specified along with a labeled training set.
2. The initial weights of the network are set to random values.
3. The training data is run through the network to produce an output $y$, whose ideal ground-truth output is $y_0$.
4. The derivatives with respect to each network weight are then computed using the backpropagation formulas.
5. For a given learning rate $\eta$, the network weights are updated as in the gradient descent equations.
6. Return to step (3) and continue iterating until a maximum number of
iterations is reached or convergence is achieved.
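A minimal end-to-end sketch of these six steps on a toy network (the model $y = b\,\tanh(ax)$, the data, and the learning rate are all assumptions for illustration):

# Sketch of the training loop: random init, forward pass, backprop, update.
import numpy as np

rng = np.random.default_rng(1)
xs = np.array([-2.0, -1.0, 0.5, 1.5, 2.5])   # training inputs
ys = 1.3 * np.tanh(0.8 * xs)                 # ground-truth outputs y0

a, b = rng.standard_normal(2)                # step 2: random initial weights
eta = 0.1                                    # learning rate

for it in range(200):                        # step 6: iterate until done
    for x, y0 in zip(xs, ys):
        z = np.tanh(a * x)                   # step 3: run data through network
        y = b * z
        dE_da = -(y0 - y) * b * (1 - z**2) * x   # step 4: backprop derivatives
        dE_db = -(y0 - y) * z
        a -= eta * dE_da                     # step 5: gradient descent update
        b -= eta * dE_db

# The weights should approach (0.8, 1.3), up to a simultaneous sign flip.
print("learned a, b:", a, b)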
Overall training process of a neural network
1. A NN is specified along with a labeled training set.
2. The initial weights of the network are set to random values.
3. The training data is run through the network.
◎ Backpropagate the error.
4. The derivatives with respect to each network weight are computed.
5. The network weights are updated as in the gradient descent equations.
Stochastic Gradient Descent
◎ Stochastic Gradient Descent (SGD): a single, randomly chosen data point $k$ is used to approximate the gradient at each step of the iteration.
$$\mathbf{w}_{j+1} = \mathbf{w}_j - \eta\,\nabla E_k(\mathbf{w}_j)$$
Batch gradient descent
◎ If, instead of a single point, a subset of points $K$ is used, then we have the following batch gradient descent algorithm.
$$\mathbf{w}_{j+1} = \mathbf{w}_j - \eta\,\nabla E_K(\mathbf{w}_j)$$
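Both update rules can be sketched on a simple least-squares problem, where $E_k(\mathbf{w}) = \tfrac{1}{2}(\mathbf{x}_k \cdot \mathbf{w} - y_k)^2$ (the model, data, and step size are assumptions for illustration):

# Sketch: stochastic vs. (mini-)batch gradient descent updates.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.standard_normal(200)

def grad_E(w, idx):
    # Gradient of the summed squared error over the samples in idx
    return X[idx].T @ (X[idx] @ w - y[idx])

eta = 0.01
w_sgd, w_batch = np.zeros(3), np.zeros(3)
for step in range(1000):
    k = rng.integers(len(y))                        # SGD: one random point k
    w_sgd -= eta * grad_E(w_sgd, [k])
    K = rng.choice(len(y), size=20, replace=False)  # batch: a subset K of points
    w_batch -= eta * grad_E(w_batch, K) / len(K)

print("SGD estimate:  ", w_sgd)
print("batch estimate:", w_batch)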
Deep Learning
Contents
◎ Data science and machine learning review
◎ Classification model
◎ Clustering model
Clustering
◎ Clustering is the process of grouping objects into clusters:
○ Objects within the same group are highly similar to one another.
○ Objects in different groups are very dissimilar from each other.
(Figure: the gap within a group is small; the gap between groups is large.)
Clustering
◎ Clustering is unsupervised learning because the labels/classes are not pre-defined.
◎ Therefore, clustering is a form of learning by observation rather than learning by examples.
(Figure: CHAMELEON clustering example)
Some applications of clustering
◎ Grouping related documents for web browsing
◎ Grouping genes and proteins that have the same function
◎ Grouping stocks with similar volatility
◎ Grouping areas of the same land type in geography
◎ Identifying groups of houses by house type, value, and geographic location
◎ Identifying groups of objects in games
…
What is a group?
(Figure: 2 groups vs. 4 groups)
A good clustering?
◎ A good clustering method will produce groups of high quality:
○ High similarity within each group.
○ Low similarity between different groups.
◎ The quality of the clustering depends on:
○ The similarity measure used
○ Its implementation
○ Its ability to discover some or all of the hidden patterns
Measuring similarity
◎ A distance measure is mainly used to quantify how similar or dissimilar two objects are.
○ Examples: Euclidean distance, cosine distance, Minkowski, Manhattan, ...
◎ Distance functions differ depending on the value ranges, types, and scales of the variables involved.
◎ The weighting of the variables depends on the application and the semantics of the data.
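For example, several of these distances between two objects can be computed directly with SciPy's distance functions (a sketch; the use of SciPy and the sample vectors are assumptions):

# Sketch: common distance measures between two objects.
import numpy as np
from scipy.spatial import distance

u = np.array([2.0, 10.0])
v = np.array([5.0, 8.0])

print("Euclidean:      ", distance.euclidean(u, v))
print("Manhattan:      ", distance.cityblock(u, v))
print("Minkowski (p=3):", distance.minkowski(u, v, p=3))
print("Cosine distance:", distance.cosine(u, v))   # 1 - cosine similarity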
Some clustering methods (1/2)
◎ Partitional clustering
○ Form partitions and evaluate them based on some criterion
○ Algorithms: K-Means, K-Medoids, CLARANS
◎ Hierarchical clustering
○ Create a hierarchical (layered) decomposition of the data
○ Algorithms: DIANA, AGNES, BIRCH, CHAMELEON
◎ Density-based clustering
○ Based on connectivity and density functions
○ Algorithms: DBSCAN, OPTICS, DenClue
Examples
Some clustering methods (2/2)
◎ Grid-based methods
○ Based on a multi-level granularity (grid) structure
○ Algorithms: STING, WaveCluster, CLIQUE
◎ Model-based methods
◎ Frequent pattern-based methods
◎ Constraint-based or user-guided methods
◎ Link-based methods
…
Partitioning Clustering
◎ Partitioning clustering is the simplest and most fundamental of the clustering methods.
◎ Idea: partition a database D of n objects into k groups so as to optimize the chosen partitioning criterion.
◎ Global optimum: exhaustively enumerate all possible partitions.
◎ Heuristic algorithms:
○ K-means: each group is represented by the centroid (mean value) of the group.
○ K-medoids (PAM): each group is represented by one of the group's objects.
K-Means algorithm (1/2)
◎ Given the number of groups k in advance, each group is represented by the centroid (mean value) of the group (see the sketch below).
○ S1: Randomly select k objects as the initial group centers.
○ S2: Assign each remaining object to the closest group, based on a distance measure such as Euclidean distance, cosine similarity, correlation, ...
○ S3: Recompute the center (centroid) of each group based on the objects newly assigned to it.
○ S4: If the group centers no longer change (or only a few points change groups), stop; otherwise, go back to S2.
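A minimal NumPy sketch of steps S1-S4 follows (the data, k, and the stopping test are illustrative assumptions; in practice a library implementation such as sklearn.cluster.KMeans would normally be used):

# Sketch of K-means steps S1-S4 with Euclidean distance.
# (Empty groups are not handled in this sketch.)
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # S1: random centers
    for _ in range(n_iter):
        # S2: assign each object to the closest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # S3: recompute each center as the mean of its group
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # S4: stop if the centers no longer change; otherwise repeat from S2
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

X = np.random.default_rng(1).random((30, 2)) * 10   # 30 random 2-D points
labels, centers = kmeans(X, k=3)
print(labels)
print(centers)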
K-means example (1/4)
k=3
S1: Select any of the three centers: k1, k2, k3
(Figure: data points in the X-Y plane with the three initial centers k1, k2, k3)
K-means example (2/4)
(Figure: each point is assigned to its closest center k1, k2, or k3)
K-means example (3/4)
(Figure: the centers k1, k2, k3 are recomputed and move to new positions)
K-means example (4/4)
(Figure: with the new centers, 3 points are reassigned to a different group)
k-means
◎ Pros:
○ Simple and efficient. Complexity O(tkn), where t = number of iterations, k = number of groups, n = number of objects, and t, k ≪ n.
○ Often reaches a local optimum.
◎ Cons:
○ Applicable only to objects in a continuous n-dimensional space.
○ The number of groups k must be specified in advance.
○ Sensitive to noise and outliers.
○ Not suitable for discovering groups with non-convex (e.g., ring-shaped) shapes.
Exercise 1
◎ Use the K-means algorithm with Euclidean distance to cluster the 8 samples into 3 groups:
A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4), A7=(1,2), A8=(4,9).
The Euclidean distance matrix is given on the following slide.
Assume the initial seeds (centers) are k1 = A1, k2 = A4, and k3 = A7. Run one iteration of K-means. Which groups are formed? What is the new center of each group?
◎ In a 10 x 10 space, draw the samples grouped after this one iteration (draw the group boundaries) and the center of each group. After how many iterations does K-means converge (stop)? Illustrate the result at each iteration.
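To check your hand computation, the following sketch runs exactly one assignment-and-update iteration with the given seeds (NumPy and the variable names are assumptions; it only automates the arithmetic of the exercise):

# Sketch: one K-means iteration for Exercise 1 (Euclidean distance).
import numpy as np

A = np.array([[2, 10], [2, 5], [8, 4], [5, 8],
              [7, 5], [6, 4], [1, 2], [4, 9]], dtype=float)   # A1..A8
centers = A[[0, 3, 6]]                    # seeds k1 = A1, k2 = A4, k3 = A7

d = np.linalg.norm(A[:, None, :] - centers[None, :, :], axis=2)
groups = d.argmin(axis=1)                 # index (0, 1, 2) of the closest seed
new_centers = np.array([A[groups == j].mean(axis=0) for j in range(3)])

print("group of each sample (1 = k1, 2 = k2, 3 = k3):", groups + 1)
print("new centers:", new_centers)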