
Clustering and

Reinforcement Methods
Content
• Introduction to Clusters
• K-Means Clustering
• Fixing the value of K in K-Means
• Hierarchical Model
• DBSCAN Model
• Spiral Model
Introduction to Clusters
• In machine learning, clustering is a type of unsupervised
learning that involves grouping a set of objects into subsets, or
clusters, so that objects within the same cluster are more similar
to each other than to those in other clusters.
• Unlike supervised learning, clustering doesn't rely on labeled
data. Instead, it tries to uncover patterns and structure from raw,
unlabeled data.
Introduction to Clusters
Why Clustering?
• Clustering is useful in various real-world applications where the
goal is to identify natural groupings in data, such as:
• Customer Segmentation: Grouping customers based on purchasing
behavior for targeted marketing.
• Anomaly Detection: Identifying unusual patterns in network traffic for
cybersecurity.
• Document Classification: Organizing large collections of documents
into categories for easy retrieval.
• Genomics: Identifying gene groups with similar expression patterns.
Introduction to Clusters
Characteristics of Clusters
• Clusters can differ based on:
• Shape: Spherical, elongated, or arbitrary.
• Size: Uniform or varying sizes.
• Density: Tight or loose groupings.
• Distance: Based on different distance metrics like Euclidean,
Manhattan, or cosine similarity.
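To make the distance metrics above concrete, here is a quick sketch comparing the three measures on two sample points (the points themselves are made up for illustration):

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([4.0, 6.0])

# Euclidean: straight-line distance, sqrt((1-4)^2 + (2-6)^2)
euclidean = np.linalg.norm(a - b)
# Manhattan: sum of absolute coordinate differences, |1-4| + |2-6|
manhattan = np.abs(a - b).sum()
# Cosine similarity: angle-based, ignores vector magnitude
cosine_sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(euclidean, manhattan, round(cosine_sim, 3))
```

Which metric is appropriate depends on the data: Euclidean suits compact spherical clusters, Manhattan is less sensitive to single large coordinate differences, and cosine similarity compares direction rather than magnitude (common for text vectors).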
Introduction to Clusters
Types of Clustering Algorithms
1. Partitioning Methods: Divide the data into distinct clusters.
1. Example: K-Means, K-Medoids.
2. Hierarchical Methods: Build clusters step-by-step in a hierarchy.
1. Example: Agglomerative, Divisive Clustering.
3. Density-Based Methods: Form clusters based on high-density
regions.
1. Example: DBSCAN, OPTICS.
4. Model-Based Methods: Assume a specific model for each cluster and
fit the data.
1. Example: Gaussian Mixture Models (GMM).
Introduction to Clusters
• Challenges in Clustering
• Determining the Number of Clusters: Often requires domain
knowledge or validation methods like the elbow method or silhouette
score.
• Scalability: Some algorithms struggle with large datasets.
• Interpretability: Understanding and interpreting clusters can be
subjective, depending on the application.
• Clustering is a foundational concept in machine learning,
enabling insight into complex data by identifying meaningful
patterns, relationships, and structures.
K-Means Clustering
Overview of K-Means Clustering
• K-Means is a popular unsupervised learning algorithm used
for partitioning a dataset into K distinct clusters.
• The goal is to minimize the distance between data points and
their respective cluster centroids.
K-Means Clustering
Steps of K-Means Algorithm
1. Initialization: Choose K random initial centroids.
2. Assignment: Assign each data point to the nearest centroid
based on a distance metric (usually Euclidean distance).
3. Update: Recalculate the centroids as the mean of all points
assigned to each cluster.
4. Repeat: Repeat steps 2 and 3 until the centroids no longer
change significantly or a maximum number of iterations is
reached.
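The four steps above can be sketched from scratch with NumPy. This is a minimal illustration on synthetic data, not a production implementation:

```python
import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    """A minimal K-Means sketch: returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Step 1 (Initialization): pick K distinct data points as initial centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # Step 2 (Assignment): assign each point to its nearest centroid (Euclidean)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3 (Update): recompute each centroid as the mean of its points
        # (keep the old centroid if a cluster ends up empty)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 4 (Repeat): stop when centroids no longer change significantly
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Two well-separated synthetic blobs
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(10, 1, (50, 2))])
centroids, labels = kmeans(X, k=2)
```

In practice, a library implementation such as scikit-learn's `KMeans` (which adds smarter initialization and multiple restarts) would be used instead.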
K-Means Clustering
Example: Clustering Customers Based on Annual Income
and Spending Score
• Let’s say we have a dataset of customers with two features:
1. Annual Income ($)
2. Spending Score (a measure of spending habits on a scale of 1 to 100)
K-Means Clustering
Clustering Process (with K=2):
1. Initialization: Randomly select two centroids.
2. Assignment: Compute the distance from each data point to
both centroids and assign points to the nearest centroid.
3. Update: Recalculate centroids based on assigned points.
4. Convergence: Repeat until centroids stabilize.
K-Means Clustering
• In the plot above, the data points represent customers, and the two
clusters are differentiated by colors. The red X marks indicate the
centroids of the two clusters. The K-Means algorithm successfully
divides the customers into two groups based on their annual income
and spending score.
• This type of clustering can be useful for:
• Targeted Marketing: Identifying high-spending customers.
• Customer Segmentation: Creating personalized promotions for different
groups.
• You can adjust the number of clusters (K) based on specific
business needs or by using evaluation metrics like the elbow
method.
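The customer-segmentation example above can be sketched with scikit-learn. The income and spending values below are made-up illustrative data, not taken from the original slides:

```python
import numpy as np
from sklearn.cluster import KMeans

# Columns: annual income ($k), spending score (1-100) -- hypothetical values
customers = np.array([
    [15, 20], [16, 25], [18, 22], [20, 30], [22, 28],   # lower income, lower spending
    [80, 75], [85, 80], [88, 82], [90, 78], [95, 85],   # higher income, higher spending
])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(km.labels_)           # cluster assignment per customer
print(km.cluster_centers_)  # the two centroids (the red X marks in the plot)
```

Note that because K-Means uses Euclidean distance, features on very different scales should normally be standardized first so that one feature (here, income) does not dominate the distance calculation.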
Fixing the value of K in K-Means
• Choosing the optimal number of clusters K is critical for the
effectiveness of the K-Means algorithm.
• There are several methods to determine the best K:
Fixing the value of K in K-Means
1. Elbow Method
• Plot the within-cluster sum of squares (WCSS) for a range of K values.
• The point where the curve bends and starts to flatten (the "elbow")
suggests the optimal K.
2. Silhouette Method
• Compute the average silhouette score for each K; it measures how
similar a point is to its own cluster compared to other clusters.
• The K with the highest average silhouette score is preferred.
3. Gap Statistic Method
• The Gap Statistic compares the clustering result on the original
dataset with results on random reference datasets. A larger gap
indicates a more appropriate K.
Fixing the value of K in K-Means
• The Elbow Plot shows the WCSS for different values of K.
The point where the WCSS curve starts to flatten is known as
the "elbow point." This point indicates the optimal number of
clusters.
• In this plot, the elbow appears around K=2 or K=3. Depending
on the specific application and domain knowledge, you could
select one of these values for K.
• If interpretability is important, fewer clusters (like K=2) might be
preferred.
• If you want more granularity, a slightly higher K (like K=3) could
be chosen.
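As a sketch, the elbow curve can be computed with scikit-learn, whose `inertia_` attribute is exactly the WCSS. The synthetic data below has three clear blobs, so the drop in WCSS should flatten after K=3:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Three well-separated synthetic blobs centred at (0,0), (8,8), (16,16)
X = np.vstack([rng.normal(c, 1, (40, 2)) for c in (0, 8, 16)])

wcss = []
for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    wcss.append(km.inertia_)  # within-cluster sum of squares for this K

# WCSS always decreases as K grows; the "elbow" is where the drop flattens
for k, w in zip(range(1, 8), wcss):
    print(k, round(w, 1))
```

Plotting `wcss` against K (e.g. with matplotlib) produces the elbow plot described above; the elbow is found visually or by comparing successive drops in WCSS.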
Hierarchical Model
• Hierarchical clustering is a type of unsupervised learning that
builds a hierarchy of clusters by either merging or splitting them
iteratively.
• Unlike K-Means, hierarchical clustering does not require
specifying the number of clusters K in advance.
Hierarchical Model
Types of Hierarchical Clustering
1. Agglomerative (Bottom-Up):
1. Starts with each data point as its own cluster.
2. Iteratively merges the closest clusters until all points are in one cluster.
2. Divisive (Top-Down):
1. Starts with all data points in a single cluster.
2. Iteratively splits clusters until each point is its own cluster.
Hierarchical Model
Steps in Agglomerative Hierarchical Clustering
1. Calculate Distance Matrix: Compute pairwise distances
between all data points.
2. Merge Closest Clusters: Use a linkage criterion to determine
which clusters to merge.
3. Repeat: Continue merging until a single cluster remains.
4. Visualize: Use a dendrogram to visualize the cluster hierarchy.
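The four steps above can be sketched with SciPy's hierarchy module. The data is synthetic, and Ward's linkage is one illustrative choice of linkage criterion (single, complete, and average linkage are common alternatives):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
# Two well-separated synthetic groups of points
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(10, 1, (20, 2))])

# Steps 1-3: linkage() computes pairwise distances and repeatedly merges
# the closest clusters (here using Ward's linkage criterion)
Z = linkage(X, method="ward")

# Step 4: scipy.cluster.hierarchy.dendrogram(Z) would draw the hierarchy;
# fcluster "cuts" the tree to recover a flat clustering with 2 clusters
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

Cutting the tree with `fcluster` corresponds to drawing a horizontal line across the dendrogram, as described in the slides that follow.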
Hierarchical Model
• The dendrogram above illustrates the hierarchical clustering
process:
• X-axis: Represents individual data points (customers).
• Y-axis: Represents the Euclidean distance between clusters at the
point of merging.
• Horizontal lines: Show the merging process. The height at which two
clusters merge indicates their similarity—lower merges imply more
similar clusters.
Hierarchical Model
How to Determine the Number of Clusters:
• Look for the largest vertical distance between two horizontal
lines without intersecting another horizontal line.
• For instance, if you cut the dendrogram at a certain height, like
just before the two largest clusters merge, the number of
clusters below that cut line is your cluster count.
Hierarchical Model
• In this example, cutting around the middle might suggest two to
three clusters.
• Hierarchical clustering is particularly useful when:
• You need interpretable results with a dendrogram.
• You have small to medium-sized datasets (it can be computationally
expensive for large datasets).
DBSCAN Model
• DBSCAN is a powerful density-based clustering algorithm
that groups together points that are closely packed, marking
outliers as noise if they are in low-density regions.
• Unlike K-Means or Hierarchical clustering, DBSCAN doesn't
require the number of clusters to be specified in advance and
can handle clusters of varying shapes and sizes.
DBSCAN Model
Key Concepts of DBSCAN
1. Core Point:
A point is a core point if it has at least MinPts points (including
itself) within a given Epsilon (ε) radius.
2. Border Point:
A point that is within the ε radius of a core point but has fewer
than MinPts neighbors itself. It is part of a cluster but not a core
point.
3. Noise Point:
A point that is neither a core point nor a border point and falls
outside any cluster.
DBSCAN Model
Parameters of DBSCAN
1. Epsilon (ε):
The radius within which to search for neighboring points.
2. MinPts:
The minimum number of points required to form a dense region.
DBSCAN Model
How DBSCAN Works
1. Select an unvisited point.
2. Determine its neighbors using ε.
3. If the point is a core point:
a) Form a new cluster with the point and its neighbors.
b) Recursively add points that are directly density-reachable.
4. If the point is a border point, mark it as part of a cluster.
5. If it’s neither, mark it as noise.
6. Repeat until all points are visited.
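A minimal DBSCAN run with scikit-learn illustrates the behavior described above, using synthetic data with two dense blobs and one isolated outlier:

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0, 0.3, (30, 2)),   # dense cluster A around (0, 0)
    rng.normal(5, 0.3, (30, 2)),   # dense cluster B around (5, 5)
    [[20.0, 20.0]],                # isolated point, should become noise
])

# eps is the Epsilon radius; min_samples is MinPts
db = DBSCAN(eps=1.0, min_samples=5).fit(X)
print(set(db.labels_))  # noise points get the label -1
```

No number of clusters was specified: DBSCAN discovers the two dense regions itself and flags the isolated point as noise, which is exactly the advantage over K-Means and hierarchical clustering noted above.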
DBSCAN Model
Advantages of DBSCAN
• Can find clusters of arbitrary shapes.
• Automatically detects outliers.
• No need to specify the number of clusters.
DBSCAN Model
Limitations of DBSCAN
• Struggles with datasets that have varying densities.
• Performance depends heavily on the choice of ε and MinPts.
Spiral Model
• The Spiral Model is a risk-driven process model that
combines elements of both iterative and waterfall models.
• It is ideal for large, complex, and high-risk projects.
• The model allows for continuous refinement through multiple
iterations, focusing heavily on risk management at every phase.
Spiral Model
Key Features of the Spiral Model
1. Iterative Cycles (Spirals): The project goes through several
iterations, with each spiral involving a set of activities.
2. Risk Management: Each cycle starts with identifying and
addressing potential risks.
3. Prototyping: Prototypes are often created to clarify
requirements and reduce risks.
4. Customer Feedback: Frequent involvement of stakeholders
ensures that the product meets expectations.
Spiral Model
Phases in Each Spiral
1. Planning Phase:
1. Define objectives, requirements, and constraints.
2. Identify potential risks and develop risk mitigation strategies.
2. Risk Analysis Phase:
1. Assess identified risks and prioritize them.
2. Develop prototypes or simulations if needed to reduce uncertainty.
3. Engineering Phase:
1. Design and develop the system.
2. Build, test, and refine the system or prototype.
4. Evaluation Phase:
1. Receive feedback from stakeholders.
2. Decide whether to proceed, adjust, or terminate the project.
Spiral Model
Diagram of the Spiral Model
• The spiral diagram can be read as follows:
• The spiral starts from the center and progresses outward.
• Each loop represents a phase, including planning, risk analysis,
development, and customer feedback.
• Risk evaluation is continuous at each loop.
Summary
• Introduction to Clusters
• K-Means Clustering
• Fixing the value of K in K-Means
• Hierarchical Model
• DBSCAN Model
• Spiral Model
