
UNIT-4

Unsupervised Learning Techniques: Clustering, K-Means, Limits of
K-Means, Using Clustering for Image Segmentation, Using Clustering
for Preprocessing, Using Clustering for Semi-Supervised Learning,
DBSCAN, Gaussian Mixtures.
Dimensionality Reduction: The Curse of Dimensionality, Main
Approaches for Dimensionality Reduction, PCA, Using Scikit-Learn,
Randomized PCA, Kernel PCA.
Clustering
1. What is Clustering?
Clustering is a way to organize data into groups based on similarities. It’s a method used
in machine learning where you don’t have any labels or categories to guide you.
2. No Labels Needed:
In clustering, you work with data that doesn’t have any predefined tags. For instance, if
you have a list of animals without species names, clustering can help you group them
based on similar traits like size, habitat, or diet.
3. Finding Patterns:
The goal of clustering is to discover patterns or structures in the data. It helps you see how
data points are related to one another, even if you didn’t know those relationships existed
before.
4. Real-World Uses:
You can use clustering in many areas, like:
1. Customer Segmentation: Grouping customers who buy similar products.
2. Social Networks: Identifying communities of users with similar interests.
3. Image Recognition: Grouping similar images together for easier processing.

• In essence, clustering helps you make sense of large datasets by showing you how data
points can be grouped based on their similarities, without needing prior labels or
classifications.
Clustering
Clustering is the process of organizing a set of data points into groups, or clusters, based on their
similarities. In clustering:
• Similar Data Points: Data points within the same group are more similar to each other.
• Dissimilar Data Points: Data points in different groups are more different from each other.
Essentially, clustering helps to categorize objects based on how closely related they are, making it
easier to analyze and understand complex datasets.
For example, in a scatter plot of the data, you might notice that certain data points are closely
grouped together. These closely clustered points can be classified into a single group. By observing
the plot, we can identify that there are three distinct clusters present. Each cluster contains data
points that are similar to each other, while the points in different clusters are more dissimilar.
Clustering
Clustering is used in many ways, including:

• a) Customer Segmentation: Businesses group customers based on what they buy or how
they behave online. This helps tailor products and marketing to different customer types.
• b) Data Analysis: When looking at new data, finding clusters of similar items makes it
easier to understand and analyze each group.
• c) Dimensionality Reduction: Clustering can simplify data by reducing the number of
features. Each data point can be represented by how much it belongs to each cluster,
making it easier to work with.
• d) Anomaly Detection: Clusters can help identify unusual behavior. For example, if a
user acts very differently from others, they may be flagged as an anomaly, which can help
catch fraud or defects.
• e) Semi-Supervised Learning: If you have only a few labeled examples, clustering helps
spread those labels to similar instances, increasing the amount of labeled data for training.
• f) Search Engines: Search engines can find similar images by clustering all images.
When you upload a reference image, it quickly finds and returns images from the same
cluster.
• g) Image Segmentation: By grouping pixels based on color, you can simplify an image,
making it easier to detect and track objects.
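As a quick illustration of use (g), pixels can be clustered by color and each pixel replaced by its cluster's center color. This is a hedged sketch, not from the original notes: the tiny synthetic "image" and the choice of 2 clusters are assumptions for demonstration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Tiny synthetic 4x4 RGB "image": left half reddish, right half bluish
image = np.zeros((4, 4, 3))
image[:, :2] = [0.9, 0.1, 0.1]   # reddish pixels
image[:, 2:] = [0.1, 0.1, 0.9]   # bluish pixels

# Flatten to one row per pixel, then cluster pixels by color
pixels = image.reshape(-1, 3)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)

# Replace every pixel with its cluster's center color -> segmented image
segmented = kmeans.cluster_centers_[kmeans.labels_].reshape(image.shape)
```

On a real photograph the same three lines (reshape, fit, map back through the centers) reduce the image to k flat color regions, which is the simplification described above.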
k-means clustering
K-Means Clustering is a popular and easy-to-understand method for
grouping data into clusters. Here’s how it works:

1. Choose Clusters: Decide how many clusters you want, let’s say k clusters.
Then, randomly select k points from the data as the starting "centers" of
these clusters.
2. Assign Points: For each data point, find the closest center and assign the
point to that cluster.
3. Update Centers: After assigning all points, calculate the average position
of the points in each cluster. This average becomes the new center for that
cluster.
4. Repeat: Repeat the assignment and update steps until the centers no longer
change much (they converge).
5. Final Clusters: Once the centers are stable, the data points closest to each
center form the final clusters, with each cluster represented by its center.
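The five steps above can be sketched directly with NumPy. This is a minimal illustrative implementation (the function name and the toy two-blob dataset are my own); it assumes no cluster ever ends up empty, which holds for this toy data:

```python
import numpy as np

def k_means(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: randomly pick k data points as the initial centers
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 2: assign each point to its closest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: each center moves to the mean of its assigned points
        # (assumes no cluster becomes empty -- fine for this toy data)
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Step 4: stop once the centers no longer change
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Step 5: the stable centers define the final clusters
    return labels, centers

# Two well-separated blobs of 10 points each
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(5, 0.1, (10, 2))])
labels, centers = k_means(X, k=2)
```

A production implementation (e.g. Scikit-Learn's KMeans, used later in this unit) adds smarter initialization and multiple restarts, which the next section's disadvantages motivate.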
Disadvantages of K-Means Clustering
1. Choosing k Manually:
You have to decide how many clusters k to use. A “Loss vs. Clusters” plot can help
find the best k.
2. Dependence on Initial Values:
The result can change based on where you start. To reduce this issue, run k-means
multiple times with different starting points and choose the best outcome. For
larger datasets, more advanced methods for picking initial centers (such as
k-means++ seeding) are needed.
3. Varying Sizes and Densities:
K-means struggles when clusters have different sizes or densities. It may not group
them effectively without adjustments to the algorithm.
4. Clustering Outliers:
Outliers can distort the results, dragging the center (centroid) away or forming their
own cluster. It may help to remove or adjust outliers before clustering.
5. High Dimensions:
As the number of features increases, the distances between points become less
meaningful, making clustering harder. You might need to reduce dimensions using
techniques like PCA or consider different clustering methods.
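The “Loss vs. Clusters” plot in point 1 (often called the elbow method) can be sketched with Scikit-Learn, which this unit uses later. The three-blob toy dataset below is an assumption for illustration; the loss is KMeans' inertia_, the sum of squared distances to the closest center:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data: three well-separated blobs, so the "elbow" should sit at k = 3
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(30, 2)) for c in (0, 5, 10)])

inertias = []
for k in range(1, 7):
    # n_init=10 reruns k-means from different starts (point 2) and keeps the best
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)  # loss: sum of squared distances to closest center
```

Plotting inertias against k shows a sharp drop up to k = 3 and only small gains after; that bend is the “elbow” used to choose k.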
