Lecture 01 & 02

BIO3603: Medical Pattern Recognition

Dr. Lamees Nasser
E-mail: [email protected]
Third Year – Biomedical Engineering Department
Academic Year 2024–2025
Relationship between AI, ML, ANN, and DL

• Artificial Intelligence (AI): programs that mimic human behavior.
• Machine Learning (ML): comprises algorithms and statistical methods used by computers to perform a specific task.
• Artificial Neural Networks (ANN): models inspired by how neurons in the human brain work.
• Deep Learning (DL): a kind of artificial neural network characterized by a deep structure (several layers), a huge number of artificial neurons, and the capability to automatically extract features from data.
Artificial Intelligence Applications

What is Machine Learning?
• In 1959, Arthur Samuel, a pioneer in the field of machine learning (ML), defined it as "the field of study that gives computers the ability to learn without being explicitly programmed".

Traditional Programming: Data (input) + Program (equation) → Output
Machine Learning: Data (input) + Output → Program (model)
Machine Learning Workflow
Training Phase:
Training data → Preprocessing (filters, normalization) → Feature extraction and selection → Machine learning algorithms (supervised / unsupervised) → Model

Testing/Prediction Phase:
Testing data → Preprocessing (filters, normalization) → Feature extraction and selection → Model → Prediction
What is Pattern Recognition?
• Pattern recognition (PR) is a field in machine learning that uses data analysis to recognize patterns and regularities, and then uses these regularities to take actions such as classifying the data into different categories.
• PR is a complex cognitive process in the brain. It involves analyzing various forms of data, including images, video, and audio, with the intent of identifying and detecting specific visual patterns (objects).
What is Pattern Classification?
• Pattern classification is a subfield of pattern recognition that involves categorizing (classifying)
patterns into pre-defined classes or categories. In other words, it is the process of assigning labels
to data based on their content.
Pattern Recognition and Classification Applications

• Computer-aided diagnosis (CAD): helping doctors make diagnostic decisions based on interpreting medical data such as mammographic images, ultrasound images, electrocardiograms (ECGs), and electroencephalograms (EEGs).
• Medical imaging: classifying cells as malignant or benign based on the results of magnetic
resonance imaging (MRI) scans or classifying different emotional and cognitive states from the
images of brain activity in functional MRI.
• Speech recognition: helping handicapped patients to control machines.
• Bioinformatics: DNA sequence analysis to detect genes related to particular diseases.
Pattern Recognition Workflow
Pattern Recognition Process
• The sensing/acquisition uses a transducer such as a camera or a microphone. The acquired signal
(e.g., an image) must be of sufficient quality that distinguishing “features” can be adequately
measured.
• Preprocessing: required prior to segmentation, including normalization, and image enhancement
(e.g., brightness adjustment, histogram equalization, contrast enhancement, image averaging,
frequency domain filtering, edge enhancement)
• Segmentation and labeling: isolate different objects from each other and from the background, and label the different objects. The foreground comprises the objects of interest; the background is everything else.
Pattern Recognition Process Cont.
• Postprocessing: used to prepare segmented images for feature extraction. For example, partial objects can
be removed from around the periphery of the image, disconnected objects can be merged, objects smaller
or larger than certain limits can be removed, or holes in the objects or background can be filled by
morphological opening or closing.
• Feature Extraction: reduce the data by measuring certain features (such as size, shape, and
texture) of the labeled objects.
• Classification: divide the feature space into decision regions.

Figure: Classes mapped as decision regions, with


decision boundaries
Figure: Example of segmentation,
postprocessing, and labeling
(a) Original image,
(b) variable background [from blurring (a)],
(c) improved image [¼(a) (b)],
(d) segmented image [Otsu thresholding of (c)],
(e) partial objects removed from (d),
(f) labeled components image,
(g) color-coded labeled components image
Data Splitting
• Data splitting is the process of splitting data into 3 sets:
▪ Training set: used to design our models
▪ Validation set: used to evaluate how well these models perform on new data
(refine our models )
▪ Testing set: used to test our models

• Common split percentages include:
  ▪ Train: 80%, Validation: 10%, Test: 10%
  ▪ Train: 70%, Validation: 15%, Test: 15%
  ▪ Train: 60%, Validation: 20%, Test: 20%
Training/validation/test data split
(Source: https://www.v7labs.com/blog/train-validation-test-set)
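A minimal sketch of such a three-way split in Python (assuming scikit-learn is available; the array names and the 70/15/15 ratio are illustrative, not part of the original slides):

    from sklearn.model_selection import train_test_split
    import numpy as np

    X = np.random.rand(100, 4)          # toy feature matrix (100 samples, 4 features)
    y = np.random.randint(0, 2, 100)    # toy binary labels

    # First split off the training set (70%), then split the remainder
    # evenly into validation (15%) and test (15%) sets.
    X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.30, random_state=0)
    X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=0)

    print(len(X_train), len(X_val), len(X_test))   # 70 15 15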
Preprocessing- Outlier Removal
• Outliers are data points that lie far from the mean of the corresponding random variables. They can produce large errors during training, especially when they are a result of noise.
• For a normal distribution, we could remove data points that are more than three standard deviations from the mean (since they have less than a 1% chance of belonging to the distribution).
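A minimal NumPy sketch of the three-standard-deviation rule (the data below are synthetic and purely illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(10, 1, 200), [50.0]])   # 200 inliers + one extreme outlier

    mu, sigma = x.mean(), x.std()
    # Keep only the points within three standard deviations of the mean.
    x_clean = x[np.abs(x - mu) <= 3 * sigma]
    print(x.size, "->", x_clean.size)   # the value 50.0 is removed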
Preprocessing- Outlier Removal Cont.
Preprocessing- Normalization
• Normalization is a technique often applied as part of data preprocessing for
machine learning.
• The goal of normalization is to adjust the values of numeric data to a common scale
without losing information.
❖For example:
Assume your input dataset contains one column with values ranging from 0 to 1,
and another column with values ranging from 10,000 to 100,000. The great
difference in the scale of the numbers could cause problems when you attempt to
combine the values as features during modeling.

Preprocessing- Normalization Cont.
• Min-max normalization: one of the most common ways to normalize data. For every feature, the minimum value of that feature gets transformed into a 0, the maximum value gets transformed into a 1, and every other value gets transformed into a decimal between 0 and 1. The formula to achieve this is the following:

  x_scaled = (x − x_min) / (x_max − x_min)

• Z-score normalization: this technique scales the values of a feature to have a mean of 0 and a standard deviation of 1:

  x_scaled = (x − μ) / σ

• Here, μ is the mean value of the feature and σ is the standard deviation of the feature.
• If x is exactly equal to the mean of all the values of the feature, it will be transformed into a 0. If it is below the mean, it will be a negative number, and if it is above the mean, it will be a positive number.
Normalization Techniques - Numerical Example
• Use the method below to normalize the following group of data: 1000, 2000, 3000, 5000, 9000
▪ Min-max normalization:

  Data     Normalized data
  1000     0
  2000     0.125
  3000     0.25
  5000     0.5
  9000     1
Normalization Techniques - Numerical Example
• Use the method below to normalize the following group of data: 1000, 2000, 3000, 5000, 9000
▪ Z-score normalization:

  Standard deviation: σ = sqrt( Σ(x_i − μ)² / (n − 1) )

  Mean: μ = (1000 + 2000 + 3000 + 5000 + 9000) / 5 = 4000

  σ = sqrt( ((1000 − 4000)² + (2000 − 4000)² + (3000 − 4000)² + (5000 − 4000)² + (9000 − 4000)²) / (5 − 1) ) = 3162.28

  Data     Normalized data
  1000     −0.95
  2000     −0.63
  3000     −0.32
  5000     0.32
  9000     1.58
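A small NumPy sketch that reproduces both normalizations for this dataset (illustrative only; it uses the sample standard deviation, i.e. the n − 1 denominator from the formula above):

    import numpy as np

    x = np.array([1000, 2000, 3000, 5000, 9000], dtype=float)

    # Min-max normalization: map the minimum to 0 and the maximum to 1.
    x_minmax = (x - x.min()) / (x.max() - x.min())
    print(x_minmax)                      # [0.    0.125 0.25  0.5   1.   ]

    # Z-score normalization with the sample standard deviation (ddof=1).
    x_z = (x - x.mean()) / x.std(ddof=1)
    print(x.mean(), x.std(ddof=1))       # 4000.0  3162.2776...
    print(np.round(x_z, 2))              # [-0.95 -0.63 -0.32  0.32  1.58]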
Histogram: Image Gray-Level Occurrence

  Initialize H(i) = 0 for all i
  For each pixel (i, j): H(pixel(i, j))++

Considerations:
• How many times each intensity value occurred in the image.
• Information about image characteristics and quality.
• Two completely different images may have very similar histograms (no spatial information).
• Can we reconstruct the image from its histogram?
Histogram Equalization
• Histogram equalization is a technique in image processing used to enhance the
contrast of an image by effectively redistributing its intensity values.

Histogram Equalization: Manual Calculation

8x8 image:
  52  55  61  66  70  61  64  73
  63  59  55  90 109  85  69  72
  62  59  68 113 144 104  66  73
  63  58  71 122 154 106  70  69
  67  61  68 104 126  88  68  70
  79  65  60  70  77  68  58  75
  85  71  64  59  55  61  65  83
  87  79  69  68  65  76  78  94

Histogram (Value: Count):
  52:1   55:3   58:2   59:3   60:1   61:4   62:1   63:2   64:2   65:3
  66:2   67:1   68:5   69:3   70:4   71:2   72:1   73:2   75:1   76:1
  77:1   78:1   79:2   83:1   85:2   87:1   88:1   90:1   94:1   104:2
  106:1  109:1  113:1  122:1  126:1  144:1  154:1
Histogram Equalization: Manual Calculation Cont.

CDF (Value: CDF):
  52:1   55:4   58:6   59:9   60:10  61:14  62:15  63:17
  64:19  65:22  66:24  67:25  68:30  69:33  70:37  71:39
  72:40  73:42  75:43  76:44  77:45  78:46  79:48  83:49
  85:51  87:52  88:53  90:54  94:55  104:57 106:58 109:59
  113:60 122:61 126:62 144:63 154:64
Histogram Equalization: Manual Calculation
Cont.
○ The general histogram equalization formula is:

  h(v) = round( (cdf(v) − cdf_min) / (M × N − cdf_min) × (L − 1) )

○ cdf_min: the minimum value of the cdf.
○ M × N: the number of pixels (M width, N height).
○ L: the number of gray levels (in most cases, 256).
Histogram Equalization: Manual Calculation Cont.

For example, cdf(83) = 49, so the equalized value becomes:

  h(83) = round( (49 − 1) / (64 − 1) × 255 ) = 194

Equalized image:
    0  12  53  93 146  53  73 166
   65  32   1 215 235 202 130 158
   57  32 117 239 251 227  93 166
   65  20 154 243 255 231 146 130
   97  53 117 227 247 210 117 146
  190  85  36 146 178 117  20 170
  202 154  73  32  12  53  85 194
  206 190 130 117  85 174 182 219
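A minimal from-scratch sketch of histogram equalization for an 8-bit image, applied to the 8x8 example above (illustrative only; library routines such as OpenCV's cv2.equalizeHist perform the same operation):

    import numpy as np

    def equalize(img, L=256):
        """Histogram-equalize an 8-bit grayscale image (2-D NumPy array)."""
        hist = np.bincount(img.ravel(), minlength=L)   # gray-level occurrence counts
        cdf = hist.cumsum()                            # cumulative distribution function
        cdf_min = cdf[cdf > 0].min()                   # smallest non-zero CDF value
        n_pixels = img.size                            # M * N
        # h(v) = round((cdf(v) - cdf_min) / (M*N - cdf_min) * (L - 1))
        lut = np.clip(np.round((cdf - cdf_min) / (n_pixels - cdf_min) * (L - 1)), 0, L - 1)
        return lut.astype(np.uint8)[img]

    # The 8x8 example image from the slides:
    img = np.array([[52,55,61,66,70,61,64,73],
                    [63,59,55,90,109,85,69,72],
                    [62,59,68,113,144,104,66,73],
                    [63,58,71,122,154,106,70,69],
                    [67,61,68,104,126,88,68,70],
                    [79,65,60,70,77,68,58,75],
                    [85,71,64,59,55,61,65,83],
                    [87,79,69,68,65,76,78,94]], dtype=np.uint8)
    print(equalize(img))   # e.g. the pixel with value 83 maps to 194, as computed above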
Histogram: Threshold

Figure: Intensity histograms that can be partitioned (a) by a single threshold, and (b) by dual thresholds.
Histogram: Threshold

  if f(x,y) > T then g(x,y) = 255
  else g(x,y) = 0
  where T is the threshold

Figure: image f, its histogram, and the thresholded results g with T = 100, T = 150, and T = 180.
Histogram: Global vs. Local Thresholding

• Global thresholding: a single threshold T is applied to the entire image.
• Local thresholding: slide a window (e.g., 7x7) over the image and compute the mean of each window; each pixel is then compared with the mean of its window (possibly minus an offset). The pixel's value becomes 255 if it is greater than the threshold and zero otherwise.

Figure: local thresholding results with a 7x7 window and T = mean, T = mean − 7, and T = mean − 10.
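A minimal sketch of global and mean-window local thresholding (illustrative only; the window size and offset are tuning parameters, and the toy image is synthetic):

    import numpy as np
    from scipy.ndimage import uniform_filter

    def global_threshold(img, T):
        """g(x,y) = 255 if f(x,y) > T else 0."""
        return np.where(img > T, 255, 0).astype(np.uint8)

    def local_threshold(img, size=7, offset=0):
        """Compare each pixel with the mean of its size x size neighborhood (minus an offset)."""
        local_mean = uniform_filter(img.astype(float), size=size)
        return np.where(img > local_mean - offset, 255, 0).astype(np.uint8)

    # Example usage on a toy image:
    img = (np.random.default_rng(0).random((64, 64)) * 255).astype(np.uint8)
    g_global = global_threshold(img, T=100)
    g_local = local_threshold(img, size=7, offset=10)   # corresponds to "T = mean - 10"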
Segmentation Methods
• Region-based methods: include thresholding (local threshold; global thresholding)
• Boundary-based methods: use an edge detector (e.g., the Canny detector)

Figure: original image, segmented image, and edge detection result.

If this segmentation method results in overlapping objects, how do we solve this problem?
Feature Extraction
The choice of appropriate (well-designed) features depends on the particular image and the application at
hand. However, they should be:

• Robust: invariant to translation, orientation (rotation), scale, and illumination and invariant to the presence
of noise and artifacts; this may require some preprocessing of the image.

• Discriminating: the range of values for objects in different classes should be different and preferably be
well separated and non-overlapping.

• Reliable: all objects of the same class should have similar values.

• Independent: uncorrelated; as a counter-example, length and area are correlated and it would be wasteful to consider both as separate features.
Feature Extraction Cont.
• Measurements obtainable from the gray-level histogram of an object, such as its mean pixel value (grayness or color), its standard deviation, its contrast, and its entropy.
• The size or area, and the perimeter.
• Circularity: a ratio of perimeter² to area, or area to perimeter² (or a scaled version, such as 4πA/P²).
• Aspect ratio: the ratio of the Feret diameters, given by placing a bounding box around the object.
Feature Extraction Cont.
• Skeleton or medial axis transform, or points within it such as branch points and endpoints, which
can be obtained by counting the number of neighboring pixels on the skeleton (viz., 3 and 1,
respectively)
Feature Extraction Cont.
• The Euler number: the number of connected components (i.e., objects) minus the number of holes
in the image.

Formally, the Euler number is given by

  E = n_comp − Σ_{i=1}^{n_comp} n_hole^(i)

where
  n_comp: the number of foreground connected components
  n_hole^(i): the number of holes in the i-th connected component.
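A minimal sketch of computing the Euler number of a binary image with SciPy's connected-component labeling (illustrative only; it treats background regions that do not touch the image border as holes, and the connectivity convention can be adjusted):

    import numpy as np
    from scipy.ndimage import label

    def euler_number(binary):
        """E = (number of foreground components) - (total number of holes)."""
        binary = binary.astype(bool)
        _, n_comp = label(binary)                # foreground connected components
        bg_labels, n_bg = label(~binary)         # background regions (default 4-connectivity)
        # Background regions that touch the image border are not holes.
        border_labels = np.unique(np.concatenate([bg_labels[0, :], bg_labels[-1, :],
                                                  bg_labels[:, 0], bg_labels[:, -1]]))
        n_holes = n_bg - np.count_nonzero(border_labels)
        return n_comp - n_holes

    # A single square object with one hole: E = 1 - 1 = 0
    img = np.zeros((7, 7), dtype=int)
    img[1:6, 1:6] = 1
    img[3, 3] = 0
    print(euler_number(img))   # 0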
Dimensionality Reduction
• Curse of dimensionality
⁃ Increasing computational complexity
⁃ Overfitting
• What PR algorithms want
  ✓ Uncorrelated data or independent variables
  ✓ Fewer features, but still enough information to predict
• Fewer features: the data can be analyzed visually more easily, and we get a better idea about the
underlying process.
• Humans have an extraordinary capacity to discern patterns and clusters in one, two, or three
dimensions, but these abilities degrade drastically for four or higher dimensions.
Dimensionality Reduction Cont.
• Overfitting
⁃ For a finite sample size, N, increasing the number of features will initially improve the performance of a classifier, but after a critical value, a further increase in the number of features (d) will reduce the performance, resulting in overfitting the data (Figure). This is known as the peaking phenomenon.

Simple linear model:
• Poor performance on the training data and poor generalization to other data.
• High bias - the model could not fit the training data well.
• Low variance - any data will produce high error in this model, so all errors will be high and there will not be much difference between errors.

Second-degree polynomial model:
• Good performance on the training data and good generalization to other data.
• Low bias - the model fits the training data very well.
• Low variance - the training and test errors are close, so there is not much difference between them.

Fourth-degree polynomial model:
• Good performance on the training data and poor generalization to other data.
• Low bias - the model fits the training data very well and thus produces low error.
• High variance - for the test data, the model produces very high error, so the difference between training and test error is high.
Dimensionality Reduction Cont.
• There are two main methods for reducing dimensionality: feature selection and feature extraction.
• Feature selection
• Select the 𝑘 features (out of 𝑑) that provide the most information, discarding the other (𝑑 − 𝑘)
features.
• Methods to implement feature selection include using the inter/intraclass distance and subset
selection such as Fisher score
• Feature extraction
• Find a new set of 𝑘 (< 𝑑) features which are combinations of the original 𝑑 features. These
methods may be supervised or unsupervised. The most widely used feature extraction methods
are Principal Components Analysis (PCA) and Linear Discriminant Analysis (LDA), which
are both linear projection methods, unsupervised and supervised respectively
Feature Selection
• What is feature selection?
⁃ If the data has more than 20,000 features and you need
to cut down, it to 1000 features before trying machine
learning. Which 1000 features you should choose it?
⁃ The process of choosing the 1000 features to use it is
called feature selection

• Why feature selection?


⁃ Avoid overfitting and achieve better generalizing ability.
⁃ Reduce the storage requirement and training time.

Overfitting means the model performs well on the training data but does not perform well on the test data.
This is because the model is memorizing the data it has seen and is unable to generalize to unseen examples

Feature Selection Cont.
• Inter/Intraclass Distance:
⁃ Good features are discriminative. Intuitively, there should be a large interclass distance and a small intraclass distance. The figure shows the case for a single feature, two equiprobable class situation. The separability of the classes can be measured by the ratio of the interclass distance to the intraclass distance.
Feature Selection Cont.
• Fisher's score algorithm:
  Is one of the supervised feature selection methods that selects each feature independently according to its score.
• Here is the formula for calculating a score:

  F = Σ_{j=1}^{k} n_j (μ_j − μ)² / Σ_{j=1}^{k} n_j σ_j²

  - n_j: the number of data points belonging to class j for a particular feature
  - μ_j: the mean of the data points belonging to class j for a particular feature
  - μ: the overall mean of the data points for a particular feature
  - σ_j: the standard deviation of the data points belonging to class j for a particular feature

• The larger the Fisher's score is, the better the selected feature.
Example

  F = Σ_{j=1}^{k} n_j (μ_j − μ)² / Σ_{j=1}^{k} n_j σ_j²

  Feature 1   Feature 2   Target (class label)
  1           5           1
  2           6           1
  3           7           2
  4           1           2
  5           2           2

  μ_f1 = (1 + 2 + 3 + 4 + 5) / 5 = 3
  μ_f2 = (5 + 6 + 7 + 1 + 2) / 5 = 4.2
Class means:
  μ_f1,c1 = (1 + 2) / 2 = 1.5          μ_f2,c1 = (5 + 6) / 2 = 5.5
  μ_f1,c2 = (3 + 4 + 5) / 3 = 4        μ_f2,c2 = (7 + 1 + 2) / 3 = 3.33

Class variances:
  σ²_f1,c1 = ((1 − 1.5)² + (2 − 1.5)²) / (2 − 1) = 0.5
  σ²_f2,c1 = ((5 − 5.5)² + (6 − 5.5)²) / (2 − 1) = 0.5
  σ²_f1,c2 = ((3 − 4)² + (4 − 4)² + (5 − 4)²) / (3 − 1) = 1
  σ²_f2,c2 = ((7 − 3.33)² + (1 − 3.33)² + (2 − 3.33)²) / (3 − 1) = 10.33

Class sizes:
  n_f1,c1 = 2, n_f1,c2 = 3, n_f2,c1 = 2, n_f2,c2 = 3
Fisher_score_F1 = [ n_f1,c1 (μ_f1,c1 − μ_f1)² + n_f1,c2 (μ_f1,c2 − μ_f1)² ] / [ n_f1,c1 σ²_f1,c1 + n_f1,c2 σ²_f1,c2 ]
                = [ 2 × (1.5 − 3)² + 3 × (4 − 3)² ] / [ 2 × 0.5 + 3 × 1 ] = 1.8750   (rank 1)

Fisher_score_F2 = 0.1760   (rank 2)
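A minimal from-scratch sketch of this computation (illustrative only; it uses the sample variance, i.e. the n − 1 denominator as in the worked example, and reproduces the two scores above):

    import numpy as np

    def fisher_score(x, y):
        """Fisher score of a single feature x given class labels y."""
        mu = x.mean()
        num, den = 0.0, 0.0
        for c in np.unique(y):
            xc = x[y == c]
            num += xc.size * (xc.mean() - mu) ** 2      # between-class term
            den += xc.size * xc.var(ddof=1)             # within-class term (sample variance)
        return num / den

    X = np.array([[1, 5], [2, 6], [3, 7], [4, 1], [5, 2]], dtype=float)
    y = np.array([1, 1, 2, 2, 2])
    print([round(fisher_score(X[:, j], y), 4) for j in range(X.shape[1])])   # [1.875, 0.176]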
Assignment
In a pattern recognition problem, there were two classes (C1, and C2), and three features F1, F2, F3, the
values for the features for each class are:
F1 F2 F3
C1 C2 C1 C2 C1 C2
2.1424 1.0575 3.7830 1.9366 5.0385 2.0247
2.2580 0.7707 3.3390 1.4727 4.8183 1.5616
2.1337 1.2382 3.6057 1.5228 4.5713 1.8716
2.2382 1.2378 3.5439 1.7134 4.7761 1.9508
1.7595 0.9925 3.3156 1.5119 4.4982 1.3813
1.9960 1.0655 3.0659 1.4809 4.6961 1.4118
1.9687 1.0349 3.4882 1.3335 4.6904 1.8142
Based on the above table,
1. Sort the three features according to Fisher’s score.
2. Write Python code to compute the Fisher score of the three features (write the code from scratch; you can use an existing function to validate your code).
Distance Measures for Machine Learning
• Distance measures are a key part of several machine learning algorithms. These
measures are used in both supervised and unsupervised learning, generally to
calculate the similarity between data points.

• Types of distance measures:


➢Euclidean Distance
➢Manhattan Distance
➢Hamming Distance

Distance Measures for Machine Learning Cont.

• Euclidean distance (L2 norm): represents the shortest distance between two points A(x1, y1) and B(x2, y2):

  D_e = [ (x1 − x2)² + (y1 − y2)² ]^(1/2)

• Manhattan distance (L1 norm): the simple sum of the horizontal and vertical components:

  D_m = |x1 − x2| + |y1 − y2|

• Hamming distance: measures the similarity between two binary data strings of the same length; that is, the number of bits that need to be changed to turn one string into the other.
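Minimal sketches of the three distances in Python/NumPy (the points and strings below are illustrative):

    import numpy as np

    a, b = np.array([1.0, 2.0]), np.array([4.0, 6.0])

    euclidean = np.sqrt(np.sum((a - b) ** 2))      # L2 norm: 5.0
    manhattan = np.sum(np.abs(a - b))              # L1 norm: 7.0

    # Hamming distance between two equal-length binary strings:
    s1, s2 = "1011101", "1001001"
    hamming = sum(c1 != c2 for c1, c2 in zip(s1, s2))   # 2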

Types of Machine Learning Algorithms

1. Supervised Learning
➢In this type of machine learning algorithm,
• The training dataset is a labeled dataset.
• In other words, the training dataset contains
the input value (X) and target value (Y).
• The learning algorithm generates a model.
• Then, a new dataset consisting of only the
input value is fed.
• The model then generates the prediction based
on its learning.
➢Supervised learning problems are categorized into
"classification" and "regression" problems.

Types of Supervised Learning Algorithm
• There are two types of supervised learning algorithms:

Classification:
• The output variable (Y) is a category or discrete value such as "red" or "blue", or "disease" and "no disease".
• Examples: Email: spam / not spam; Tumor: malignant / benign.

Regression:
• The output variable (Y) is a real or continuous value such as "salary" or "weight".
• Example: house price prediction.

Figure: shape classification example (square, circle, triangle, with an unknown "??" sample).
K-nearest neighbor (K-NN)
• The k-nearest neighbors (K-NN) algorithm is a supervised
learning and non-parametric algorithm that can be used to solve
both classification and regression problem statements.
• It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead, it stores the dataset and, at the time of classification, performs the computation on the stored data.
How does K-NN work?
• The K-NN working can be explained on the basis of the below algorithm:
  ▪ Step 1: Select the number K of neighbors.
  ▪ Step 2: Calculate the Euclidean distance of the new data point to all training data points.
  ▪ Step 3: Take the K nearest neighbors as per the calculated Euclidean distance.
  ▪ Step 4: Among these K neighbors, count the number of data points in each category (class).
  ▪ Step 5: Assign the new data point to the category (class) for which the number of neighbors is maximum.
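A minimal from-scratch sketch of these steps for classification (illustrative only; the (Age, Loan) values are made up, and in practice features with very different ranges should be normalized first):

    import numpy as np
    from collections import Counter

    def knn_predict(X_train, y_train, x_new, k=3):
        """Classify x_new by majority vote among its k nearest training points."""
        # Step 2: Euclidean distances from the new point to every training point.
        dists = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
        # Step 3: indices of the k nearest neighbors.
        nearest = np.argsort(dists)[:k]
        # Steps 4-5: count the classes among the neighbors and return the most common one.
        return Counter(y_train[nearest]).most_common(1)[0][0]

    # Toy example with hypothetical (Age, Loan) values:
    X_train = np.array([[25, 40000], [35, 60000], [45, 80000], [20, 20000], [60, 100000]], dtype=float)
    y_train = np.array(["N", "Y", "Y", "N", "Y"])
    print(knn_predict(X_train, y_train, np.array([40, 70000.0]), k=3))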
Classification using K-NN
• Given the Ages and Loans of N customers. Determine the eligibility
(Yes/No) of an unknown customer for obtaining a loan

Figure: scatter plot of customers by Age and Loan, labeled Yes / No for loan eligibility.
Classification using K-NN Cont.

If K=1, then the nearest neighbor is the last


case in the training set with Y.

If K=3, there are two Y and one N out of


three closest neighbors. The prediction for
the unknown case is again Y.

Regression using K-NN
• Given the Ages and Loans of N customers. Determine the House
Price Index (HPI) of an unknown customer.

Regression using K-NN Cont.

Use the training set to get HPI of an unknown


case (Age=48 and Loan=$142,000) using
Euclidean distance.

• If K=1
then the nearest neighbor is the last case in the
training set with HPI=264.

• If K=3
the prediction for HPI is equal to the average of the HPI of the top three neighbors:

• HPI = (264 + 139 + 139) / 3 = 180.7

• Try K = 5 to determine the HPI of the new customer.
K-NN Algorithm: Limitations
• Always needs to determine the value of K which may be complex
sometimes.
• The computation cost is high because of calculating the distance
between the data points for all the training samples.
• Requires high memory storage

2. Unsupervised Learning
In this type of machine learning algorithm,

• The training dataset is an unlabeled


dataset.
• In other words, the training dataset
contains only the input value (X) and
not the target value (Y).
• The learning algorithm generates a
model.
• Based on the similarity between data, it
tries to draw inferences from the data
such as finding patterns or clusters.

K-means Clustering
• The basic idea of the k-means clustering
algorithm is partitioning n data points into
k clusters by defining k centroids.

• The data clustering is done by minimizing


a chosen Euclidean distance measure
between a data point and cluster center.

K-means Clustering Algorithm: Step 1
• Specify the number 𝑘 of clusters to assign and randomly select K
centroids (K=2).

K-means Clustering Algorithm: Step 2

• Assign every point to a cluster whose centroid is the closest to the


point

K-means Clustering Algorithm: Step 3

• Re-compute the centroid for each cluster based on the newly


assigned points in the cluster

K-means Clustering Algorithm: Step 4
• Repeat step 2: reassign each data point to the new closest centroid of each cluster.
K-means Clustering Algorithm: Step 5
• Iterate until some stopping criterion is met

• There are essentially three stopping criteria that can be adopted to


stop the K-means algorithm:

➢Centroids of newly formed clusters do not change.


➢Points remain in the same cluster.
➢Maximum number of iterations is reached.

K-means Clustering Algorithm: Limitations

• Difficult to predict the number of clusters (K-Value) .


• Initial seeds have a strong impact on the final results.
• It is sensitive to noise and outlier data points.
• k-means assumes that we deal with spherical clusters and that each
cluster has roughly equal numbers of observations

K-means Clustering Algorithm: Summary

K-means Clustering Example
Suppose we have 7 types of medicines and each medicine has two attributes or features as shown in the table below.
Our goal is to group these into two clusters of medicines based on the two features
Suppose that the initial seeds (centers of each cluster) are C1=(1,1) and C2=(5,7).

  Medicine   feature1   feature2
  1          1          1
  2          1.5        2
  3          3          4
  4          5          7
  5          3.5        5
  6          4.5        5
  7          3.5        4.5

Figure: scatter plot of the seven medicines (feature1 vs. feature2).
Step 1: Initialization. K = 2; randomly select 2 centroids C1 = (1, 1) and C2 = (5, 7).

Figure: the two initial centroids C1 and C2 plotted on the feature scatter plot.
Step 2: Assign every point to the cluster whose centroid is closest to the point

  Medicine   Distance to C1 = (1, 1)                Distance to C2 = (5, 7)                Cluster
  1          √((1 − 1)² + (1 − 1)²) = 0             √((5 − 1)² + (7 − 1)²) = 7.21          1
  2          √((1 − 1.5)² + (1 − 2)²) = 1.12        √((5 − 1.5)² + (7 − 2)²) = 6.10        1
  3          √((1 − 3)² + (1 − 4)²) = 3.61          √((5 − 3)² + (7 − 4)²) = 3.61          1
  4          √((1 − 5)² + (1 − 7)²) = 7.21          √((5 − 5)² + (7 − 7)²) = 0             2
  5          √((1 − 3.5)² + (1 − 5)²) = 4.72        √((5 − 3.5)² + (7 − 5)²) = 2.5         2
  6          √((1 − 4.5)² + (1 − 5)²) = 5.31        √((5 − 4.5)² + (7 − 5)²) = 2.06        2
  7          √((1 − 3.5)² + (1 − 4.5)²) = 4.30      √((5 − 3.5)² + (7 − 4.5)²) = 2.92      2
Step3: Re-compute the centroid for each cluster based on the newly
assigned points in the cluster

• Thus, we obtain two clusters containing: {1, 2, 3} and {4, 5, 6, 7}.
• Their new centroids are:

  Group 1 = ( (1 + 1.5 + 3)/3 , (1 + 2 + 4)/3 ) = (1.83, 2.33)
  Group 2 = ( (5 + 3.5 + 4.5 + 3.5)/4 , (7 + 5 + 5 + 4.5)/4 ) = (4.12, 5.38)
Step 4: Repeat step 2, i.e., reassign each data point to the new closest centroid of each cluster

  Medicine   Distance to C1 = (1.83, 2.33)                Distance to C2 = (4.12, 5.38)               Cluster
  1          √((1.83 − 1)² + (2.33 − 1)²) = 1.57          √((4.12 − 1)² + (5.38 − 1)²) = 5.38         1
  2          √((1.83 − 1.5)² + (2.33 − 2)²) = 0.47        √((4.12 − 1.5)² + (5.38 − 2)²) = 4.29       1
  3          √((1.83 − 3)² + (2.33 − 4)²) = 2.04          √((4.12 − 3)² + (5.38 − 4)²) = 1.78         2
  4          √((1.83 − 5)² + (2.33 − 7)²) = 5.64          √((4.12 − 5)² + (5.38 − 7)²) = 1.84         2
  5          √((1.83 − 3.5)² + (2.33 − 5)²) = 3.15        √((4.12 − 3.5)² + (5.38 − 5)²) = 0.73       2
  6          √((1.83 − 4.5)² + (2.33 − 5)²) = 3.78        √((4.12 − 4.5)² + (5.38 − 5)²) = 0.54       2
  7          √((1.83 − 3.5)² + (2.33 − 4.5)²) = 2.74      √((4.12 − 3.5)² + (5.38 − 4.5)²) = 1.08     2
Step3: Re-compute the centroid for each cluster based on the newly
assigned points in the cluster

Therefore, the new clusters are: {1, 2} and {3, 4, 5, 6, 7}.

  Group 1 = ( (1 + 1.5)/2 , (1 + 2)/2 ) = (1.25, 1.5)
  Group 2 = ( (3 + 5 + 3.5 + 4.5 + 3.5)/5 , (4 + 7 + 5 + 5 + 4.5)/5 ) = (3.9, 5.1)
Step5: Iterate until some stopping criterion is met
  Medicine   Distance to C1 = (1.25, 1.5)                 Distance to C2 = (3.9, 5.1)                 Cluster
  1          √((1.25 − 1)² + (1.5 − 1)²) = 0.56           √((3.9 − 1)² + (5.1 − 1)²) = 5.02           1
  2          √((1.25 − 1.5)² + (1.5 − 2)²) = 0.56         √((3.9 − 1.5)² + (5.1 − 2)²) = 3.92         1
  3          √((1.25 − 3)² + (1.5 − 4)²) = 3.05           √((3.9 − 3)² + (5.1 − 4)²) = 1.42           2
  4          √((1.25 − 5)² + (1.5 − 7)²) = 6.66           √((3.9 − 5)² + (5.1 − 7)²) = 2.20           2
  5          √((1.25 − 3.5)² + (1.5 − 5)²) = 4.16         √((3.9 − 3.5)² + (5.1 − 5)²) = 0.41         2
  6          √((1.25 − 4.5)² + (1.5 − 5)²) = 4.78         √((3.9 − 4.5)² + (5.1 − 5)²) = 0.61         2
  7          √((1.25 − 3.5)² + (1.5 − 4.5)²) = 3.75       √((3.9 − 3.5)² + (5.1 − 4.5)²) = 0.72       2
• Therefore, there is no change in the clusters.
• Thus, the algorithm stops here, and the result consists of 2 clusters: {1, 2} and {3, 4, 5, 6, 7}.
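A minimal from-scratch sketch that reproduces this example (illustrative only; in practice a library implementation such as sklearn.cluster.KMeans would normally be used):

    import numpy as np

    X = np.array([[1, 1], [1.5, 2], [3, 4], [5, 7], [3.5, 5], [4.5, 5], [3.5, 4.5]])
    centroids = np.array([[1.0, 1.0], [5.0, 7.0]])     # initial seeds C1 and C2

    for _ in range(100):                               # cap on the number of iterations
        # Steps 2/4: assign every point to the nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recompute each centroid as the mean of its assigned points.
        new_centroids = np.array([X[labels == k].mean(axis=0) for k in range(len(centroids))])
        if np.allclose(new_centroids, centroids):      # Step 5: stop when centroids do not change
            break
        centroids = new_centroids

    print(labels + 1)       # cluster of each medicine: [1 1 2 2 2 2 2]
    print(centroids)        # final centroids, approx. (1.25, 1.5) and (3.9, 5.1)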
3. Reinforcement Learning
• In this type of machine learning algorithm,
  ➢ The model learns from a series of actions by maximizing a reward function.
  ➢ The reward function can be maximized by rewarding good actions (and/or penalizing bad ones).
  ➢ Example: training a self-driving car using feedback from the environment.
  ➢ Unlike supervised learning, no labeled data is provided to the agent.
Thank you!
