The document discusses unsupervised learning, particularly focusing on clustering as a key technique for identifying intrinsic structures in data without predefined labels. It covers various clustering methods, including K-means and DBSCAN, and highlights their applications in real-world scenarios like customer segmentation and anomaly detection. The conclusion emphasizes the ongoing development of clustering algorithms and their practical significance across multiple fields.


ASSIGNMENT 4:

Unsupervised
Learning

Made by: Preyanshi

Enrollment No: 226140307031
Supervised learning vs.
unsupervised learning
• Supervised learning: discover patterns in the data that relate data
attributes with a target (class) attribute.
 These patterns are then utilized to predict the values of the target
attribute in future data instances.

• Unsupervised learning: The data have no target attribute.


 We want to explore the data to find some intrinsic structures in them.

2
Clustering
• Clustering is a technique for finding similarity groups
in data, called clusters. I.e.,
 it groups data instances that are similar to (near) each
other in one cluster and data instances that are very
different (far away) from each other into different clusters.
• Clustering is often called an unsupervised learning task because no class values denoting an a priori grouping of the data instances are given, as is the case in supervised learning.
• For historical reasons, clustering is often considered synonymous with unsupervised learning.
 In fact, association rule mining is also unsupervised.

• This chapter focuses on clustering.


3
An illustration
• The data set has three natural groups of data
points, i.e., 3 natural clusters.

CS583, Bing Liu, UIC 4


What is clustering for?
• Let us see some real-life examples
• Example 1: group people of similar sizes together to make “small”, “medium” and “large” T-shirts.
 Tailor-made for each person: too expensive.
 One-size-fits-all: does not fit all.

• Example 2: In marketing, segment customers according to their similarities
 To do targeted marketing.

5
What is clustering for?
(cont…)
• Example 3: Given a collection of text documents, we want to
organize them according to their content similarities,
 To produce a topic hierarchy

• In fact, clustering is one of the most utilized data mining techniques.
 It has a long history and has been used in almost every field, e.g., medicine, psychology, botany, sociology, biology, archeology, marketing, insurance, libraries, etc.
 In recent years, due to the rapid increase of online documents, text clustering has become important.

6
K-means clustering
• K-means is a partitional clustering algorithm
• Let the set of data points (or instances) D be

{x1, x2, …, xn},


where xi = (xi1, xi2, …, xir) is a vector in a real-valued space X ⊆ Rr, and r is the number of attributes (dimensions) in the data.

• The k-means algorithm partitions the given data into k clusters.
 Each cluster has a cluster center, called the centroid.
 k is specified by the user.

7
K-means algorithm
• Given k, the k-means algorithm works as follows:
1) Randomly choose k data points (seeds) to be the initial centroids (cluster centers).
2) Assign each data point to the closest centroid.
3) Re-compute the centroids using the current cluster memberships.
4) If a convergence criterion is not met, go to 2).

8
K-means algorithm – (cont
…)

9
K-means summary
• Despite its weaknesses, k-means is still the most popular algorithm due to its simplicity and efficiency.
 Other clustering algorithms have their own lists of weaknesses.

• No clear evidence that any other clustering algorithm performs better in general,
 although they may be more suitable for some specific types of data or applications.

• Comparing different clustering algorithms is a difficult task. No one knows the correct clusters!

10
Common ways to represent
clusters
• Use the centroid of each cluster to represent the cluster.
 Compute the radius and standard deviation of the cluster to determine its spread in each dimension.
 The centroid representation alone works well if the clusters are of hyper-spherical shape.
 If clusters are elongated or of other shapes, centroids are not sufficient.

1
Hierarchical Clustering
• Produce a nested sequence of clusters, a tree, also called a dendrogram.

CS583, Bing Liu, UIC 12


Using a classification model
• All the data points in a cluster are regarded as having the same class label, e.g., the cluster ID.
 Run a supervised learning algorithm on the data to find a classification model.

CS583, Bing Liu, UIC 13


DBSCAN Application
• Real-Time Problem: Anomaly Detection in
Credit Card Transactions
• Objective: Detect fraudulent credit card
transactions.
• Dataset: Transaction records including amount,
location, and time.
• Process:
• Apply DBSCAN to cluster normal transactions while
identifying outliers.
• DBSCAN is effective because it does not assume
spherical clusters and can detect outliers.

• Result: Detect anomalies that may indicate fraudulent activity.

14
Apriori Algorithm
Application
• Real-Time Problem: Optimizing Product
Placement in Retail
• Objective: Identify frequently purchased items
together to improve store layout and product
recommendations.
• Dataset: Transaction data from a large retail store.
• Process:
• Apply the Apriori algorithm to find association rules
between products (e.g., milk and bread are often bought
together).
• Set a minimum support and confidence to filter the rules.

• Result: Store layouts are redesigned to place frequently bought-together items closer, boosting sales by cross-promoting products.

15
Conclusion and Key
Takeaways
• Unsupervised Learning is powerful for uncovering
hidden patterns in unlabeled data.
• Real-Time Applications:
• Customer segmentation (K-Means)
• Anomaly detection (DBSCAN)
• Market basket analysis (Apriori)
• Case Study: Retail industry benefits from association
rule mining to improve sales and customer
experience.

16
Summary
• Clustering has a long history and is still an active area of research.
 There are a huge number of clustering algorithms.
 More are still coming every year.
• We only introduced several main algorithms. There
are many others, e.g.,
 density-based algorithms, sub-space clustering, scale-up methods, neural-network-based methods, fuzzy clustering, co-clustering, etc.
• Clustering is hard to evaluate, but very useful in
practice. This partially explains why there are still a
large number of clustering algorithms being devised
every year.
• Clustering is highly application dependent and to
some extent subjective.
17
•Thank You!

18
