ML Unit5 Notes
Unsupervised Learning:
o Unsupervised learning is helpful for finding useful insights from data.
o Unsupervised learning is much like the way a human learns to think through
their own experiences, which makes it closer to real AI.
o Unsupervised learning works on unlabeled and uncategorized data,
which makes it all the more important.
o In the real world, we do not always have input data with corresponding
output labels, so to solve such cases we need unsupervised learning.
Once a suitable algorithm is applied, it divides the data objects into
groups according to the similarities and differences between the objects.
The unsupervised learning algorithm can be further categorized into two types of
problems:
o Clustering: Clustering is a method of grouping objects into clusters
such that objects with the most similarities remain in the same group and have little or
no similarity with the objects of other groups. Cluster analysis finds the
commonalities between the data objects and categorizes them according to
the presence or absence of those commonalities.
o Association: An association rule is an unsupervised learning method which is
used for finding relationships between variables in a large database. It
determines the sets of items that occur together in the dataset.
Association rules make marketing strategies more effective. For example, people
who buy item X (say, bread) also tend to purchase item Y (butter/jam).
A typical example of association rules is Market Basket Analysis.
K-Means Clustering
Example: Cluster the following eight points (with (x, y) representing locations) into
three clusters: A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A7(1, 2), A8(4,
9)
Initial cluster centers are: A1(2, 10), A4(5, 8) and A7(1, 2).
Euclidean distance between two points P(x1, y1) and Q(x2, y2):
d(P, Q) = √((x2 − x1)² + (y2 − y1)²)
Iteration-01:
We calculate the distance of each point from each of the three initial centers and
assign it to the cluster with the nearest center. From here, the new clusters are-
Cluster-01:
First cluster contains points-
A1(2, 10)
Cluster-02:
Second cluster contains points-
A3(8, 4)
A4(5, 8)
A5(7, 5)
A6(6, 4)
A8(4, 9)
Cluster-03:
Third cluster contains points-
A2(2, 5)
A7(1, 2)
Now,
We re-compute the new cluster centers.
The new cluster center is computed by taking mean of all the points contained
in that cluster.
For Cluster-01:
We have only one point A1(2, 10) in Cluster-01.
So, cluster center remains the same.
For Cluster-02:
Center of Cluster-02
= ((8 + 5 + 7 + 6 + 4)/5, (4 + 8 + 5 + 4 + 9)/5)
= (6, 6)
For Cluster-03:
Center of Cluster-03
= ((2 + 1)/2, (5 + 2)/2)
= (1.5, 3.5)
This completes Iteration-01.
Iteration-02:
We calculate the distance of each point from each of the center of the three
clusters.
The distance is calculated by using the given distance function.
From here, the new clusters are-
Cluster-01:
First cluster contains points-
A1(2, 10)
A8(4, 9)
Cluster-02:
Second cluster contains points-
A3(8, 4)
A4(5, 8)
A5(7, 5)
A6(6, 4)
Cluster-03:
Third cluster contains points-
A2(2, 5)
A7(1, 2)
Now,
We re-compute the new cluster centers.
The new cluster center is computed by taking mean of all the points contained
in that cluster.
For Cluster-01:
Center of Cluster-01
= ((2 + 4)/2, (10 + 9)/2)
= (3, 9.5)
For Cluster-02:
Center of Cluster-02
= ((8 + 5 + 7 + 6)/4, (4 + 8 + 5 + 4)/4)
= (6.5, 5.25)
For Cluster-03:
Center of Cluster-03
= ((2 + 1)/2, (5 + 2)/2)
= (1.5, 3.5)
This completes Iteration-02.
After the second iteration, the centers of the three clusters are-
C1(3, 9.5)
C2(6.5, 5.25)
C3(1.5, 3.5)
Continue the iterations until the new cluster centers are the same as those of the previous iteration.
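As a rough sketch of the same procedure in Python (using NumPy; the point coordinates and
initial centers are those of the example above, while all variable names are illustrative):

```python
import numpy as np

# The eight points and the three initial centers (A1, A4, A7) from the example above
points = np.array([[2, 10], [2, 5], [8, 4], [5, 8],
                   [7, 5], [6, 4], [1, 2], [4, 9]], dtype=float)
centers = np.array([[2, 10], [5, 8], [1, 2]], dtype=float)

for iteration in range(100):
    # Assign each point to the nearest center (Euclidean distance)
    distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    labels = distances.argmin(axis=1)

    # Re-compute each center as the mean of the points assigned to it
    new_centers = np.array([points[labels == k].mean(axis=0) for k in range(3)])

    # Stop when the centers no longer change
    if np.allclose(new_centers, centers):
        break
    centers = new_centers

print(labels)   # cluster index of each point
print(centers)  # final cluster centers
```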
Stopping Criteria for K-Means Clustering
There are essentially three stopping criteria that can be adopted to stop the K-means
algorithm:
We can stop the algorithm if the centroids of newly formed clusters are not changing.
Even after multiple iterations, if we are getting the same centroids for all the clusters,
we can say that the algorithm is not learning any new pattern, and it is a sign to stop
the training.
Another clear sign that we should stop the training process is if the points remain in
the same cluster even after training the algorithm for multiple iterations.
Finally, we can also stop the training once a fixed maximum number of iterations has been reached.
We have seen in the previous sections that the value of k needs to be chosen beforehand. The
performance of the K-means clustering algorithm depends on forming compact, well-separated
clusters. So, let’s see how to choose the optimal number of clusters using the technique
given below:
Elbow Method
The Elbow method is the go-to method for finding the optimal number of clusters. It uses the
concept of the WCSS value. WCSS stands for Within-Cluster Sum of Squares and measures the
total variation within the clusters: the sum of the squared distances between each data point
and the centroid of its cluster. To measure the distance between data points and the centroid,
we can use any method such as Euclidean distance, Manhattan distance or cosine distance, etc.
1. Perform K-means clustering multiple times using various k values (e.g., from 1 to 10).
2. For each value of k, calculate the WCSS value.
3. Plot a curve between the calculated WCSS values (sum of squared distances) and the
number of clusters k.
4. Look for a sharp bend in the curve (it looks like an arm or elbow); that point is
considered the optimal value of k (see the sketch after this list).
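A minimal sketch of these steps, assuming scikit-learn is available and `X` is an array holding
the data to be clustered:

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

wcss = []
k_values = range(1, 11)
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    wcss.append(model.inertia_)  # within-cluster sum of squares for this k

# Plot WCSS against k and look for the "elbow"
plt.plot(k_values, wcss, marker="o")
plt.xlabel("Number of clusters k")
plt.ylabel("WCSS")
plt.show()
```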
Advantages of k-means
1. Simple and easy to implement: The k-means algorithm is easy to understand and
implement, making it a popular choice for clustering tasks.
2. Fast and efficient: K-means is computationally efficient and can handle large datasets
with high dimensionality.
3. Scalability: K-means can handle large datasets with a large number of data points and
can be easily scaled to handle even larger datasets.
4. Flexibility: K-means can be easily adapted to different applications and can be used with
different distance metrics and initialization methods.
Disadvantages of K-Means:
1. The value of k must be chosen beforehand: choosing an unsuitable number of clusters can
lead to poor results.
2. Sensitive to initialization: different initial centroids can lead to different final clusters.
3. Sensitive to outliers: K-means is sensitive to outliers, which can have a significant impact
on the resulting clusters.
K-Modes Clustering
Similarity and dissimilarity measurements are used to determine the distance between
the data objects in the dataset. In the case of K-modes, these distances are calculated
using a dissimilarity measure called the Hamming distance. The Hamming distance
between two data objects is the number of categorical attributes that differ between
the two objects.
Let x and y be two categorical data objects defined by m features or attributes. Their
dissimilarity is
d(x, y) = Σ (j = 1 to m) δ(xj, yj)
where δ(xj, yj) = 0 if xj = yj, and 1 otherwise.
For example, consider the following dataset with three categorical attributes:
Object | Attribute 1 | Attribute 2 | Attribute 3
1      | A           | B           | C
2      | A           | B           | D
3      | A           | C           | E
4      | B           | C           | E
To calculate the Hamming distance between data objects 1 and 2, we compare their
values for each attribute and count the number of differences. In this case, there is one
difference (Attribute 3 is C for object 1 and D for object 2), so the Hamming distance
between objects
1 and 2 is 1.
To calculate the Hamming distance between objects 1 and 3, we again compare their
values for each attribute and count the number of differences. In this case, there
are two differences (Attribute 2 is B for object 1 and C for object 3, and Attribute 3 is C
for object 1 and E for object 3), so the Hamming distance between objects 1 and 3 is 2.
To calculate the Hamming distance between objects 1 and 4, we again compare their
values for each attribute and count the number of differences. In this case, there
are three differences (Attribute 1 is A for object 1 and B for object 4, Attribute 2 is B
for object 1 and C for object 4, and Attribute 3 is C for object 1 and E for object 4), so
the Hamming distance between objects 1 and 4 is 3.
Data objects with a smaller Hamming distance are considered more similar, while objects
with a larger Hamming distance are considered more dissimilar.
Overall, the goal of K-modes clustering is to minimize the dissimilarities between the
data objects and the centroids (modes) of the clusters, using a measure of
categorical similarity such as the Hamming distance.
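A tiny Python sketch of this comparison, using the four example objects from the table above
(the function name is my own):

```python
def hamming(x, y):
    """Number of categorical attributes on which x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

obj1 = ['A', 'B', 'C']
obj2 = ['A', 'B', 'D']
obj3 = ['A', 'C', 'E']
obj4 = ['B', 'C', 'E']

print(hamming(obj1, obj2))  # 1
print(hamming(obj1, obj3))  # 2
print(hamming(obj1, obj4))  # 3
```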
K-Prototypes clustering
K-Prototypes clustering is a partitioning clustering algorithm. We use k-prototypes
clustering to cluster datasets that have both categorical and numerical attributes. The
K-Prototypes algorithm is a combination of the k-means and k-modes clustering
algorithms; hence, it can handle both numerical and categorical data.
After assigning data points to the clusters, we calculate the new prototype for the current
cluster using the method discussed in the next sections. After that, we recalculate the
distance of prototypes from the data points and reassign the clusters. This process is
continued until the clusters converge.
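A minimal sketch of the mixed dissimilarity that k-prototypes uses when assigning a point to a
prototype: the squared Euclidean distance on the numerical attributes plus a weight gamma times
the number of mismatching categorical attributes (the weight value and all names here are
illustrative assumptions):

```python
import numpy as np

def mixed_distance(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Numerical part (squared Euclidean) plus gamma * categorical mismatches."""
    numeric_part = np.sum((np.asarray(x_num) - np.asarray(proto_num)) ** 2)
    categorical_part = sum(1 for a, b in zip(x_cat, proto_cat) if a != b)
    return numeric_part + gamma * categorical_part

# Example: a record with two numerical and two categorical attributes
print(mixed_distance([1.5, 3.0], ['red', 'small'],
                     [2.0, 2.5], ['red', 'large'], gamma=0.5))
```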
Reinforcement Learning
o Reinforcement learning does not require any labeled data for the learning
process. It learns through the feedback on the actions performed by the agent.
Moreover, in reinforcement learning, agents also learn from past experiences.
Reinforcement learning methods are used to solve tasks where decision-making
is sequential and the goal is long-term, e.g., robotics, online chess, etc.
Exploitation is defined as a greedy approach in which the agent tries to get more rewards by
using the estimated value rather than the actual value. So, in this technique, the agent makes
the best decision based on current information.
State:
A State is a set of tokens that represent every state that the agent can be in.
Model:
A Model (sometimes called Transition Model) gives an action’s effect in a state.
In particular, T(S, a, S’) defines a transition T where being in state S and taking an
action ‘a’ takes us to state S’ (S and S’ may be the same). For stochastic actions (noisy,
non-deterministic) we also define a probability P(S’|S, a), which represents the probability
of reaching state S’ if action ‘a’ is taken in state S. Note that the Markov property states
that the effect of an action taken in a state depends only on that state and
not on the prior history.
Actions
An Action A is a set of all possible actions. A(s) defines the set of actions that can be
taken being in state S.
Reward
A Reward is a real-valued reward function. R(s) indicates the reward for simply being
in the state S. R(S,a) indicates the reward for being in a state S and taking an
action ‘a’. R(S,a,S’) indicates the reward for being in a state S, taking an action ‘a’ and
ending up in a state S’.
Policy
A Policy is a solution to the Markov Decision Process. A policy is a mapping from states to
actions. It indicates the action ‘a’ to be taken while in state S.
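As an illustrative sketch only (a made-up two-state MDP, not part of the grid example below),
these components can be written down as plain Python dictionaries:

```python
# States and the actions available in each state
states = ['s0', 's1']
actions = {'s0': ['stay', 'go'], 's1': ['stay']}

# Transition model P(S'|S, a)
T = {
    ('s0', 'stay'): {'s0': 1.0},
    ('s0', 'go'):   {'s0': 0.2, 's1': 0.8},  # a stochastic (noisy) action
    ('s1', 'stay'): {'s1': 1.0},
}

# Reward R(S, a, S'): only the transition into s1 is rewarded
R = {('s0', 'go', 's1'): 1.0}

# A policy: a mapping from states to actions
policy = {'s0': 'go', 's1': 'stay'}
```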
An agent lives in a grid. The example here is a 3*4 grid. The grid has a
START state (grid no 1,1). The purpose of the agent is to wander around the grid and
finally reach the Blue Diamond (grid no 4,3). Under all circumstances, the agent should
avoid the Fire grid (orange color, grid no 4,2). Also, grid no 2,2 is a blocked grid; it acts
as a wall, hence the agent cannot enter it.
The agent can take any one of these actions: UP, DOWN, LEFT, RIGHT
Walls block the agent’s path, i.e., if there is a wall in the direction the agent would
have taken, the agent stays in the same place. So, for example, if the agent chooses
LEFT in the START grid, it stays put in the START grid.
First Aim: To find the shortest sequence getting from START to the Diamond. Two
such sequences can be found:
RIGHT RIGHT UP UP RIGHT
UP UP RIGHT RIGHT RIGHT
There is a small reward at each step (it can be negative, in which case it can also be termed
a punishment; in the above example, entering the Fire grid can have a reward of -1).
Big rewards come at the end (good or bad).
The goal is to maximize the sum of rewards.
Q-learning
Q-learning is a model-free, value-based, off-policy algorithm that will find the best
series of actions based on the agent's current state. The “Q” stands for quality.
Quality represents how valuable the action is in maximizing future rewards.
Model-based algorithms use transition and reward functions to estimate the
optimal policy and create the model. In contrast, model-free algorithms learn
the consequences of their actions through experience, without a transition or
reward function.
The value-based method trains the value function to learn which state is more
valuable and take action. On the other hand, policy-based methods train the policy
directly to learn which action to take in a given state.
In an off-policy method, the algorithm evaluates and updates a policy that differs from
the policy used to take an action. Conversely, an on-policy algorithm evaluates and
improves the same policy that is used to take an action.
Before we jump into how Q-learning works, we need to learn a few useful
terminologies to understand Q-learning's fundamentals.
States(s): the current position of the agent in the environment.
Rewards: for every action, the agent receives a reward and penalty.
Episodes: the end of the stage, where the agent can’t take a new action. It happens
when the agent has achieved the goal or failed.
Q(St+1, a): expected optimal Q-value of doing the action in a particular state.
Q-Table: the agent maintains the Q-table of sets of states and actions.
Q-Table
The agent will use a Q-table to take the best possible action based on the expected
reward for each state in the environment. In simple words, a Q-table is a data structure
of sets of actions and states, and we use the Q-learning algorithm to update the values
in the table.
Q-Function
The Q-function uses the Bellman equation and takes state(s) and action(a) as input.
The equation simplifies the state values and state-action value calculation.
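In its standard tabular form, with learning rate α and discount factor γ as hyperparameters,
the resulting Q-value update is:

$$Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma \max_{a} Q(S_{t+1}, a) - Q(S_t, A_t) \right]$$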
Q-learning algorithm
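A minimal tabular Q-learning sketch (assuming a toy environment object `env` whose `reset()`
returns a state index and whose `step(action)` returns `(next_state, reward, done)`; all
hyperparameter values are illustrative, not prescribed):

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    Q = np.zeros((n_states, n_actions))  # the Q-table

    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Epsilon-greedy choice: explore with probability epsilon, otherwise exploit
            if np.random.rand() < epsilon:
                action = np.random.randint(n_actions)
            else:
                action = int(np.argmax(Q[state]))

            next_state, reward, done = env.step(action)

            # Off-policy update towards the greedy target
            target = reward + gamma * np.max(Q[next_state]) * (not done)
            Q[state, action] += alpha * (target - Q[state, action])
            state = next_state

    return Q
```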