
Module 4

Unsupervised Learning
Bayesian learning
Unsupervised Learning
 Unsupervised learning is a type of machine learning in which
models are trained on an unlabeled dataset and are allowed to
act on that data without any supervision.

 The goal of unsupervised learning is to find the underlying
structure of the dataset, group the data according to similarities,
and represent the dataset in a compressed format.
Why use Unsupervised Learning?
 Unsupervised learning is helpful for finding useful insights from the
data.

 Unsupervised learning is similar to how a human learns to think from
their own experiences, which makes it closer to true AI.

 Unsupervised learning works on unlabeled and uncategorized data,
which makes it all the more important.

 In the real world, we do not always have input data with corresponding
output labels; to handle such cases, we need unsupervised learning.
Working of Unsupervised Learning
Types of Unsupervised Learning Algorithm:

 Clustering: Clustering is a method of grouping objects into
clusters such that objects with the most similarities remain
in one group and have few or no similarities with the objects
of another group.

 Association: An association rule is an unsupervised learning
method which is used for finding relationships between variables
in a large database, as in the toy sketch below.
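The slide does not say how association rules are scored; a common approach uses support and confidence. A minimal, self-contained Python sketch on invented toy transaction data (the items and the rule {bread} -> {butter} are purely illustrative):

```python
# Toy transactions (hypothetical data, for illustration only)
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    return support(antecedent | consequent) / support(antecedent)

# Score the rule {bread} -> {butter}
print(support({"bread", "butter"}))       # 0.5
print(confidence({"bread"}, {"butter"}))  # about 0.67
```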
Different types of clustering techniques
 Partitioning methods

 Hierarchical methods

 Density-based methods
Unsupervised Learning algorithms:
 K-means clustering
 KNN (k-nearest neighbors)
 Hierarchical clustering
 Anomaly detection
 Neural Networks
 Principal Component Analysis
 Independent Component Analysis
 Apriori algorithm
 Singular value decomposition
 Advantages of Unsupervised Learning
 Unsupervised learning can be used for more complex tasks than supervised
learning because it does not require labeled input data.
 Unsupervised learning is often preferable because unlabeled data is easier to
obtain than labeled data.

 Disadvantages of Unsupervised Learning


 Unsupervised learning is intrinsically more difficult than supervised learning
because there is no corresponding output to learn from.
 The result of an unsupervised learning algorithm might be less accurate, as the
input data is not labeled and the algorithm does not know the exact output in
advance.
Hierarchical Clustering
 Hierarchical clustering is another unsupervised machine learning
algorithm, which is used to group unlabeled data points into clusters;
it is also known as hierarchical cluster analysis (HCA).

 In this algorithm, we develop the hierarchy of clusters in the form of a
tree, and this tree-shaped structure is known as the dendrogram.
Hierarchical Clustering

 Agglomerative: Agglomerative clustering is a bottom-up approach, in which
the algorithm starts by treating each data point as its own cluster and keeps
merging the closest clusters until only one cluster is left.

 Divisive: The divisive algorithm is the reverse of the agglomerative
algorithm, as it is a top-down approach.
Agglomerative Hierarchical clustering
 The agglomerative hierarchical clustering algorithm is a popular example of
HCA.

 To group the data points into clusters, it follows a bottom-up approach.

 This means the algorithm treats each data point as a single cluster at the
beginning, and then starts combining the closest pairs of clusters.

 It does this until all the clusters are merged into a single cluster that contains
all the data points (a short SciPy sketch follows).
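A minimal sketch of agglomerative clustering using SciPy's hierarchical clustering routines. The data X is a placeholder invented for illustration, and "average" linkage is just one of the options discussed later:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

# Placeholder data: 10 two-dimensional points in two loose groups
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (5, 2)), rng.normal(3, 0.5, (5, 2))])

# Build the cluster hierarchy bottom-up; Z records every merge step
Z = linkage(X, method="average")   # also: "single", "complete", "ward"

# Cut the tree into 2 flat clusters
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)

# dendrogram(Z) draws the tree-shaped structure described above
```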
Dendrogram
Dendrogram
Agglomerative Clustering Algorithm
Starting Situation
Intermediate Situation
Intermediate Situation
After Merging
How to Define Inter-Cluster Distance
MIN or Single Link
MAX or Complete Link
Group Average or Average Link
Cluster Distance Measures
Example

dist((x, y), (a, b)) = √[(x - a)² + (y - b)²]
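Building on the point distance above, a small pure-Python sketch of the three inter-cluster distance measures (MIN / single link, MAX / complete link, group average); the clusters a and b are toy 2-D point lists invented for illustration:

```python
from math import dist  # math.dist computes the Euclidean distance above

def single_link(c1, c2):
    """MIN: distance between the two closest points of the clusters."""
    return min(dist(p, q) for p in c1 for q in c2)

def complete_link(c1, c2):
    """MAX: distance between the two farthest points of the clusters."""
    return max(dist(p, q) for p in c1 for q in c2)

def average_link(c1, c2):
    """Group average: mean pairwise distance between the clusters."""
    return sum(dist(p, q) for p in c1 for q in c2) / (len(c1) * len(c2))

a = [(0, 0), (0, 1)]
b = [(3, 0), (4, 0)]
print(single_link(a, b), complete_link(a, b), average_link(a, b))
```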


Hierarchical Clustering
 Hierarchical clustering can sometimes show patterns that are meaningless
or spurious.
 For example, in this clustering, the tight grouping of Australia, Anguilla,
St. Helena, etc. is meaningful, since all these countries are former UK
colonies.
 However, the tight grouping of Niger and India is completely spurious;
there is no connection between the two.

[Dendrogram of countries: Australia, Anguilla, St. Helena & Dependencies, South Georgia & South Sandwich Islands, U.K., Serbia & Montenegro (Yugoslavia), France, Niger, India, Ireland, Brazil]
Partitioning methods
 K-Means
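The K-means algorithm itself appears on a later slide only as a figure; below is a minimal NumPy sketch of the standard loop (assign each point to its nearest centroid, then recompute centroids). The data X and k = 2 are placeholders for illustration:

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain K-means: alternate assignment and centroid-update steps."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # random init
    for _ in range(n_iters):
        # Assignment step: index of the nearest centroid for every point
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: each centroid becomes the mean of its assigned points
        # (for simplicity, empty clusters are not handled in this sketch)
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Placeholder data: two blobs in 2-D
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 4])
labels, centroids = kmeans(X, k=2)
print(centroids)
```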
Factors Affecting K-Means Results
 Choosing appropriate number of clusters

 Elbow method
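A common way to apply the elbow method is to compute the within-cluster sum of squares (inertia) for a range of k and look for the point where the decrease flattens out. A short scikit-learn sketch, assuming X is the placeholder data from the sketch above:

```python
from sklearn.cluster import KMeans

inertias = []
for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)  # within-cluster sum of squared distances

# Inertia always decreases as k grows; the "elbow" where the decrease
# levels off suggests an appropriate number of clusters.
for k, inertia in zip(range(1, 9), inertias):
    print(k, round(inertia, 1))
```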
Factors Affecting K-Means Results

 Choosing the initial centroids


K-means
 Disadvantages
 Dependent on initialization
 Sensitive to outliers
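Because the result depends on the random initialization, one common mitigation (an option, not the slide's prescription) is to run K-means several times and keep the best solution, and/or to use k-means++ seeding. In scikit-learn this is controlled by n_init and init (X as before):

```python
from sklearn.cluster import KMeans

# Several restarts with k-means++ seeding; the fit with the lowest
# inertia (within-cluster sum of squares) is kept automatically.
km = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=0).fit(X)
print(km.inertia_, km.cluster_centers_)
```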
K-means Clustering

[Figure: left panel — input data forming two concentric circles; right panel — the 2-cluster K-means result on the same data]
K-means Clustering

[Figure: an example where K-means works well vs. an example where K-means fails]
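The concentric-circles figure above is a classic failure case: K-means assumes roughly spherical, compact clusters, so it cannot separate the rings. A short sketch reproducing the setup with scikit-learn (make_circles generates the two-ring data):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_circles

# Two concentric circles: the "true" clusters are the inner and outer ring
X_circles, y_true = make_circles(n_samples=400, factor=0.3, noise=0.05,
                                 random_state=0)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_circles)

# K-means splits the plane with a straight boundary instead of by ring,
# so agreement with the true ring labels stays near chance level.
print(sum(labels == y_true) / len(y_true))
```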


K-Means
BAYESIAN LEARNING
BAYES THEOREM
 Bayes theorem provides a way to calculate the probability of a hypothesis
based on its prior probability, the probabilities of observing various data given
the hypothesis, and the observed data itself.
Notations
 P(h): prior probability of h; reflects any background knowledge about the
chance that h is correct
 P(D): prior probability of D; the probability that D will be observed
 P(D|h): probability of observing D given a world in which h holds
 P(h|D): posterior probability of h; reflects confidence that h holds after D has
been observed
 Bayes theorem is the cornerstone of Bayesian learning methods because it
provides a way to calculate the posterior probability P(h|D), from the prior
probability P(h), together with P(D) and P(D|h).
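For reference, the theorem itself in the notation above (the standard statement, not reproduced on the slide):

P(h|D) = P(D|h) P(h) / P(D)

The maximum a posteriori (MAP) hypothesis referred to later is the h that maximizes P(h|D); since P(D) is the same for every hypothesis, h_MAP = argmax over h in H of P(D|h) P(h).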

 P(h|D) increases with P(h) and with P(D|h) according to Bayes theorem.
 P(h|D) decreases as P(D) increases, because the more probable it is that D
will be observed independent of h, the less evidence D provides in support
of h.
Example
Consider a medical diagnosis problem in which there are two alternative
hypotheses
 The patient has a particular form of cancer (denoted by cancer)
 The patient does not (denoted by ¬ cancer)
The available data is from a particular laboratory with two possible outcomes: +
(positive) and - (negative)
 A patient takes a lab test and the result comes back positive. The test returns a
correct positive result in only 98% of the cases in which the disease is actually
present, and a correct negative result in only 97% of the cases in which the disease is
not present. Furthermore, 0.008 of the entire population have this cancer.
 Suppose a new patient is observed for whom the lab test returns a
positive (+) result.
 Should we diagnose the patient as having cancer or not?

 Applying Bayes theorem: P(+|cancer)P(cancer) = 0.98 × 0.008 ≈ 0.0078, and
P(+|¬cancer)P(¬cancer) = 0.03 × 0.992 ≈ 0.0298, so the MAP hypothesis is ¬cancer.

 The exact posterior probabilities can be determined by normalizing these
quantities so that they sum to 1:

P(cancer | +) = 0.0078 / (0.0078 + 0.0298) ≈ 0.21
P(¬cancer | +) = 0.0298 / (0.0078 + 0.0298) ≈ 0.79
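A quick Python check of the arithmetic above (the probabilities come straight from the example):

```python
p_cancer = 0.008                  # prior: 0.8% of the population has the cancer
p_pos_given_cancer = 0.98         # test sensitivity
p_pos_given_no_cancer = 1 - 0.97  # false positive rate

num_cancer = p_pos_given_cancer * p_cancer               # about 0.0078
num_no_cancer = p_pos_given_no_cancer * (1 - p_cancer)   # about 0.0298

evidence = num_cancer + num_no_cancer   # P(+)
print(num_cancer / evidence)     # P(cancer | +)  ≈ 0.21
print(num_no_cancer / evidence)  # P(¬cancer | +) ≈ 0.79
```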
BAYES THEOREM AND CONCEPT LEARNING
MAP Hypotheses and Consistent Learners
 A learning algorithm is a consistent learner if it outputs a hypothesis that
commits zero errors over the training examples.
 Every consistent learner outputs a MAP hypothesis, if we assume a
uniform prior probability distribution over H (P(hi) = P(hj) for all i, j), and
deterministic, noise free training data (P(D|h) =1 if D and h are consistent,
and 0 otherwise).
Example:
 Because FIND-S outputs a consistent hypothesis, it will output a MAP hypothesis
under the probability distributions P(h) and P(D|h) defined above.
 Are there other probability distributions for P(h) and P(D|h) under which
FIND-S outputs MAP hypotheses? Yes.
 Because FIND-S outputs a maximally specific hypothesis from the version
space, its output hypothesis will be a MAP hypothesis relative to any prior
probability distribution that favours more specific hypotheses.
Naive Bayes Classifier
 Along with decision trees and neural networks, it is one of the most practical
learning methods.
 When to use
–Moderate or large training set available
–Attributes that describe instances are conditionally independent given
classification
 Successful applications:
–Diagnosis
–Classifying text documents
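A minimal sketch of the naive Bayes decision rule — pick the class v that maximizes P(v) · ∏ P(a_i | v), estimating all probabilities by counting — on a tiny categorical dataset invented for illustration (add-one smoothing avoids zero counts):

```python
from collections import Counter, defaultdict

# Tiny invented training set: (attribute values, class label)
data = [
    (("sunny", "hot"), "no"), (("sunny", "mild"), "no"),
    (("rain", "mild"), "yes"), (("rain", "cool"), "yes"),
    (("overcast", "hot"), "yes"), (("overcast", "cool"), "yes"),
]

n_attrs = 2
classes = Counter(label for _, label in data)
n_values = [len({attrs[i] for attrs, _ in data}) for i in range(n_attrs)]

# counts[(attribute index, value, class)] = number of training occurrences
counts = defaultdict(int)
for attrs, label in data:
    for i, value in enumerate(attrs):
        counts[(i, value, label)] += 1

def predict(attrs):
    """Naive Bayes rule: argmax over v of P(v) * prod_i P(a_i | v)."""
    best, best_score = None, -1.0
    for label, n in classes.items():
        score = n / len(data)  # prior P(v) from class frequencies
        for i, value in enumerate(attrs):
            # conditional P(a_i | v) with add-one (Laplace) smoothing
            score *= (counts[(i, value, label)] + 1) / (n + n_values[i])
        if score > best_score:
            best, best_score = label, score
    return best

print(predict(("rain", "hot")))  # classify an unseen instance
```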
Bayesian Belief Network
EM for Estimating k Means
