Unit II
Prepared By
S Naresh Kumar
Department of Computer Science and Engineering
Strong AI uses a theory of mind AI framework, which refers to the ability to discern the needs, emotions, beliefs and thought processes of other intelligent beings. Whatever such a machine's goals turn out to be, we need to ensure these goals align with ours if we expect to maintain some level of control.
Type 2 (Based on Functionality)
Reactive Machines: Reactive AI is incredibly basic; it has no memory or data-storage capabilities, emulating the human mind's ability to respond to different kinds of stimuli without prior experience. Example: Deep Blue, the IBM chess program that beat Garry Kasparov in the 1990s.
Limited Memory: These AI systems can use past experiences to inform future decisions. Some of the decision-making functions in self-driving cars have been designed this way: observations are used to inform actions happening in the near future, such as a car that has changed lanes. These observations are not stored permanently. Apple's chatbot Siri is another example.
Theory of Mind: This type of AI should be able to understand people's emotions, beliefs, thoughts and expectations, and be able to interact socially. Even though a lot of improvement has been made in this field, this kind of AI is not complete yet.
Self-awareness: An AI that has its own consciousness, superintelligence, self-awareness and sentience (in simple words, a complete human being). Of course, this kind of AI does not exist yet, and if achieved it will be one of the major milestones in the field of AI.
AI Fields (Discipline)
Machine Learning
Machine Learning is a concept which allows the machine to learn from examples and experience, without being explicitly programmed.
The main types of machine learning are:
Supervised
Unsupervised
Semi-supervised
Reinforcement
(Deep Learning (DL) is a subset of Machine Learning (ML), which is itself a subset of AI.)
Commonly used algorithms are:
Linear Regression
Logistic Regression
Decision Tree
SVM
Naive Bayes
kNN
K-Means
Random Forest
Dimensionality Reduction Algorithms
Gradient Boosting algorithms
GBM
XGBoost
LightGBM
CatBoost
The machine learning process is mainly divided into:
Training Phase
Testing phase
Training Phase
You take a randomly selected specimen of apples from the market (training data) and make a table of all the physical characteristics of each apple, like colour, size, shape, which part of the country it was grown in, which vendor sold it, etc. (features), along with the sweetness, juiciness and ripeness of that apple (output variables).
You feed this data to the machine learning algorithm (classification/regression), and it learns a model of the correlation between an average apple's physical characteristics and its quality.
Testing Phase:
The next time you go shopping, you measure the characteristics of the apples you are purchasing (test data) and feed them to the Machine Learning algorithm.
It will use the model computed earlier to predict whether the apples are sweet, ripe and/or juicy.
The algorithm may internally use rules similar to the ones you would write manually (e.g., a decision tree).
Finally, you can now shop for apples with great confidence, without worrying about the details of how to choose the best apples.
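The two phases above can be sketched in plain Python. This is a toy illustration, not an actual market dataset: the features (size in cm, redness on a 0-10 scale) and the 1-nearest-neighbour "model" are assumptions made for the example.

```python
# Toy training data (hypothetical apples): features are (size_cm, redness_0_to_10)
# and the output variable is whether the apple turned out to be sweet.
training_data = [
    ((8.0, 9), True),   # large, red apple -> sweet
    ((7.5, 8), True),
    ((5.0, 3), False),  # small, green apple -> not sweet
    ((5.5, 2), False),
]

def predict_sweet(apple, data):
    """Testing phase: copy the label of the most similar training apple (1-NN)."""
    def dist(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    nearest_features, nearest_label = min(data, key=lambda row: dist(apple, row[0]))
    return nearest_label

# A big red apple measured in the shop (test data) is predicted to be sweet.
print(predict_sweet((7.8, 9), training_data))  # True
```

A real system would use one of the algorithms listed earlier (e.g., a decision tree) instead of this hand-rolled nearest-neighbour lookup, but the training/testing split is the same.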
Supervised Learning
Input data is called training data and has a known label or result, such as spam/not-spam or a stock price at a time.
A model is prepared through a training process in which it is required to make predictions and is corrected when those predictions are wrong. The training process continues until the model achieves a desired level of accuracy on the training data.
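As a minimal sketch of this correction loop (the spam examples and the single "count of the word free" feature are invented for illustration), a model can be as simple as a threshold that is adjusted until training accuracy is as high as possible:

```python
# Labelled training data: (features, known label). The only feature here is
# how often the word "free" appears in a message (hypothetical numbers).
training_data = [
    ({"free": 3}, "spam"),
    ({"free": 2}, "spam"),
    ({"free": 0}, "not-spam"),
    ({"free": 1}, "not-spam"),
]

def train_threshold(data):
    """Try candidate thresholds and keep the one with the best training accuracy."""
    best_t, best_acc = 0, 0.0
    for t in range(5):
        preds = ["spam" if x["free"] >= t else "not-spam" for x, _ in data]
        acc = sum(p == label for p, (_, label) in zip(preds, data)) / len(data)
        if acc > best_acc:          # "corrected when predictions are wrong"
            best_t, best_acc = t, acc
    return best_t, best_acc

t, acc = train_threshold(training_data)  # t = 2 classifies all examples correctly
```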
Clustering is a method of unsupervised learning and is a common technique for statistical data analysis.
Hierarchical Methods: The clusters formed in this method make a tree-type structure based on the hierarchy. New clusters are formed using the previously formed ones. It is divided into two categories:
Agglomerative (bottom-up approach)
Divisive (top-down approach)
Examples: CURE (Clustering Using Representatives), BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), etc.
Clustering Methods:
Partitioning Methods: These methods partition the objects into k clusters, and each partition forms one cluster. This method is used to optimize an objective criterion (similarity function), such as when distance is a major parameter. Examples: K-Means, CLARANS (Clustering Large Applications based upon Randomized Search), etc.
Grid-based Methods: In this method the data space is divided into a finite number of cells that form a grid-like structure. All the clustering operations done on these grids are fast and independent of the number of data objects. Examples: STING (Statistical Information Grid), WaveCluster, CLIQUE (CLustering In QUEst), etc.
K-Means Clustering
The main objective of the K-Means algorithm is to minimize the sum of distances between the points and their respective cluster centroid.
Hard clustering
Partitioning algorithm
Unsupervised algorithm (makes inferences from datasets using only input vectors, without referring to known or labelled outcomes)
Steps:
Step 1: Choose the number of clusters k
The first step in k-means is to pick the number of
clusters, k
Step 2: Select k random points from the data as centroids
Next, we randomly select a centroid for each cluster. Let's say we want to have 2 clusters, so k is equal to 2 here. We then randomly select the centroids. Here, the red and green circles represent the centroids for these clusters.
Step 3: Assign all the points to the closest cluster centroid
Once we have initialized the centroids, we assign each point to the closest cluster centroid.
Here you can see that the points which are closer to the red point are assigned to the red cluster, whereas the points which are closer to the green point are assigned to the green cluster.
Step 4: Recompute the centroids of newly formed
clusters
Now, once we have assigned all of the points to either cluster, the
next step is to compute the centroids of newly formed clusters:
Here, the red and green crosses are the new centroids.
Step 5: Repeat steps 3 and 4 until the centroids stop changing, i.e. the assignment of data points to clusters isn't changing.
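The steps above can be sketched in plain Python for 2-D points. This is a minimal illustration; production implementations (e.g., scikit-learn's KMeans) add multiple random restarts and smarter initialization:

```python
import random

def kmeans(points, k, n_iter=100):
    # Steps 1-2: choose k and select k random data points as initial centroids
    centroids = random.sample(points, k)
    for _ in range(n_iter):
        # Step 3: assign every point to its closest centroid (squared Euclidean)
        clusters = [[] for _ in range(k)]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Step 4: recompute each centroid as the mean of its cluster
        new_centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else c
            for c, cl in zip(centroids, clusters)
        ]
        # Step 5: repeat until the centroids stop changing
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

random.seed(42)  # fixed seed so the run is reproducible
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, 2)
```

On these two well-separated blobs the algorithm converges to centroids near (1/3, 1/3) and (31/3, 31/3), whichever data points are drawn as the initial centroids.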
The step-by-step process:
Example 1: Given data points S = {2, 3, 4, 10, 11, 12, 20, 25, 30} and K = 2, solve using K-Means.
Solution:
Step 1: K = 2.
Step 2: Randomly pick centroids (means) m1 = 4 for K1 and m2 = 12 for K2.
Step 3: Calculate the distance from each data point to both centroids and assign each point to the cluster whose centroid is nearest.
m1 = 4, m2 = 12 → K1 = {2, 3, 4}, K2 = {10, 11, 12, 20, 25, 30}
Step 4: Recompute the means and repeat:
m1 = 3, m2 = 18 → K1 = {2, 3, 4, 10}, K2 = {11, 12, 20, 25, 30}
m1 = 4.75, m2 = 19.6 → K1 = {2, 3, 4, 10, 11, 12}, K2 = {20, 25, 30}
m1 = 7, m2 = 25 → the assignments no longer change, so we stop.
Final clusters: K1 = {2, 3, 4, 10, 11, 12} with m1 = 7 and K2 = {20, 25, 30} with m2 = 25.
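Example 1 can be checked with a short 1-D version of the algorithm (a sketch; ties are broken in favour of K1):

```python
def kmeans_1d(points, m1, m2):
    """Run 1-D k-means with k = 2 from the given initial means until stable."""
    while True:
        k1 = [p for p in points if abs(p - m1) <= abs(p - m2)]
        k2 = [p for p in points if abs(p - m1) > abs(p - m2)]
        new_m1, new_m2 = sum(k1) / len(k1), sum(k2) / len(k2)
        if (new_m1, new_m2) == (m1, m2):   # centroids stopped changing
            return k1, k2, m1, m2
        m1, m2 = new_m1, new_m2

S = [2, 3, 4, 10, 11, 12, 20, 25, 30]
k1, k2, m1, m2 = kmeans_1d(S, 4, 12)
# k1 = [2, 3, 4, 10, 11, 12] with m1 = 7.0; k2 = [20, 25, 30] with m2 = 25.0
```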
Example 2: Cluster the following eight points (with (x, y) representing locations) into three clusters: A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A7(1, 2), A8(4, 9).
Initial cluster centers are: A1(2, 10), A4(5, 8) and A7(1, 2).
The distance function between two points a = (x1, y1) and b = (x2, y2) is defined as ρ(a, b) = |x2 – x1| + |y2 – y1| (the Manhattan distance).
Use K-Means Algorithm to find the three cluster centers after the second
iteration.
•We draw a table showing all the results.
•Using the table, we decide which point belongs to which cluster.
•The given point belongs to that cluster whose center is nearest to it.
After the first iteration, the clusters are:
C1 (center (2, 10)): A1(2, 10)
C2 (center (5, 8)): A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A8(4, 9)
C3 (center (1, 2)): A2(2, 5), A7(1, 2)
Recompute each center as the mean of its cluster:
X1 = 2, Y1 = 10 (C1 contains only A1, so its center stays (2, 10))
X2 = (8 + 5 + 7 + 6 + 4)/5 = 6, Y2 = (4 + 8 + 5 + 4 + 9)/5 = 6, so (X2, Y2) = (6, 6)
X3 = (2 + 1)/2 = 1.5, Y3 = (5 + 2)/2 = 3.5, so (X3, Y3) = (1.5, 3.5)
New cluster centers are (2, 10), (6, 6), (1.5, 3.5).
•We draw a table showing all the results.
•Using the table, we decide which point belongs to which cluster.
•The given point belongs to that cluster whose center is nearest to it.
After second iteration, the center of the three clusters are- C1(3, 9.5) ,C2(6.5, 5.25) , C3(1.5, 3.5)
•We draw a table showing all the results.
•Using the table, we decide which point belongs to which cluster.
•The given point belongs to that cluster whose center is nearest to it.
After third iteration, the center of the three clusters are- C1(3.67, 9) ,C2(7, 4.3) , C3(1.5, 3.5)
•We draw a table showing all the results.
•Using the table, we decide which point belongs to which cluster.
•The given point belongs to that cluster whose center is nearest to it.
•Stop iterating because the center values are not changing for the clusters.
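The whole worked example can be reproduced with a few lines of Python using the same Manhattan distance and the same initial centers (a sketch of the computation above):

```python
# The eight points of Example 2 and the initial centers A1, A4, A7.
points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]
centers = [(2, 10), (5, 8), (1, 2)]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

for _ in range(10):  # a few iterations are enough for this data
    # Assign each point to the cluster whose center is nearest
    clusters = [[] for _ in centers]
    for p in points:
        d = [manhattan(p, c) for c in centers]
        clusters[d.index(min(d))].append(p)
    # Recompute each center as the mean of its cluster
    new_centers = [
        (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
        for cl in clusters
    ]
    if new_centers == centers:  # stop: center values are not changing
        break
    centers = new_centers
# centers ends up at [(11/3, 9.0), (7.0, 13/3), (1.5, 3.5)],
# matching the third-iteration result C1(3.67, 9), C2(7, 4.3), C3(1.5, 3.5)
```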
The k-means algorithm seems to work pretty well when the distribution of points is in a circular form. If the distribution of points is not circular, it will still attempt to group the data points in a circular fashion. That's not great! k-means fails to identify the right clusters:
Gaussian Mixture Models are probabilistic models and use the soft clustering approach for distributing the points in different clusters.
Let's say we have three Gaussian distributions – GD1, GD2 and GD3 – with means (μ1, μ2, μ3) and variances (σ1, σ2, σ3) respectively. For a given set of data points, our GMM would identify the probability of each data point belonging to each of these distributions.
Gaussian Mixture Models are distribution-based models.
Expectation–Maximization (EM) algorithm:
• A statistical algorithm
• Used when the data has missing values or is incomplete
• Expectation–Maximization tries to use the existing data to determine the optimum values for these (latent) variables and then finds the model parameters
• Expectation–Maximization is the base of many algorithms, including Gaussian Mixture Models
The EM algorithm is an iterative approach that cycles between
two modes.
E-Step : The first mode attempts to estimate the missing or
latent variables, called the estimation-step or E-step.
M-Step: The second mode attempts to optimize the parameters of the model to best explain the data, called the maximization-step or M-step (i.e., based on the estimated values generated in the E-step, the completed data is used to update the parameters).
EM in Gaussian Mixture model:
In the EM algorithm,
Estimation (E)-Step: Estimate the expected value of the latent variable for each data point.
Maximization(M)-Step : Optimize the parameters of the probability
distributions in an attempt to best capture the density of the data.
The process is repeated until a good set of latent values and a maximum
likelihood is achieved that fits the data.
E-step:
For each point x_i, calculate the probability (responsibility) r_ic that it belongs to cluster/distribution c = 1, 2, …, k:
r_ic = Π_c · N(x_i | μ_c, Σ_c) / Σ_c' Π_c' · N(x_i | μ_c', Σ_c')
where N(x | μ, Σ) is the Gaussian density. This value will be high when the point is assigned to the right cluster and lower otherwise.
M-step:
After the E-step, we go back and update the Π, μ and Σ values in the following manner:
1. The new density Π_c is defined by the ratio of the (effective) number of points in the cluster to the total number of points: Π_c = (Σ_i r_ic) / N
2. The new mean μ_c is the responsibility-weighted average of the data points: μ_c = (Σ_i r_ic · x_i) / (Σ_i r_ic)
3. The new covariance Σ_c is the responsibility-weighted covariance of the data points around μ_c.
• k-means only considers the mean to update the centroid, while GMM takes into account the mean as well as the variance of the data!
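A minimal 1-D, two-component version of this EM loop can be written in plain Python. The data below is made up for illustration: two clumps of points around 1 and 5, with deliberately separated initial means.

```python
import math

def gaussian_pdf(x, mu, var):
    """Density N(x | mu, var) of a 1-D Gaussian."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, mu, var, pi, n_iter=50):
    k = len(mu)
    for _ in range(n_iter):
        # E-step: responsibility r_ic of each component c for each point x_i
        resp = []
        for x in data:
            w = [pi[c] * gaussian_pdf(x, mu[c], var[c]) for c in range(k)]
            total = sum(w)
            resp.append([wc / total for wc in w])
        # M-step: update weights (densities), means and variances
        for c in range(k):
            nc = sum(r[c] for r in resp)                     # effective cluster size
            pi[c] = nc / len(data)
            mu[c] = sum(r[c] * x for r, x in zip(resp, data)) / nc
            var[c] = sum(r[c] * (x - mu[c]) ** 2 for r, x in zip(resp, data)) / nc
            var[c] = max(var[c], 1e-6)                       # guard against collapse
    return pi, mu, var

data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
pi, mu, var = em_gmm_1d(data, mu=[0.0, 6.0], var=[1.0, 1.0], pi=[0.5, 0.5])
# mu converges to roughly [1.0, 5.0] with equal weights
```

Real GMM implementations work on multivariate data with full covariance matrices; this sketch only shows the E/M alternation described above.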
Reference :https://www.analyticsvidhya.com/blog/2019/10/gaussian-mixture-models-clustering/