
Unit Ii

The document discusses various machine learning techniques including supervised learning, unsupervised learning, and reinforcement learning. Supervised learning uses labeled training data to build a model that can predict the correct output for new input data. Unsupervised learning finds hidden patterns or grouping in the data without labeled responses. Reinforcement learning allows an agent to learn through trial-and-error interactions with an environment using reward and punishment to determine optimal actions.

Uploaded by

mahih16237
Copyright
© © All Rights Reserved

Data Analytics and Visualization

IMBA III SEM

Prepared By
S Naresh Kumar
Department of Computer Science and Engineering
UNIT-II

Introduction to machine learning: Supervised and Unsupervised learning,
Gradient descent, Clustering techniques: K-means, Gaussian mixture models
What is Artificial Intelligence?
Artificial Intelligence
• Artificial intelligence (AI) refers to the simulation of
human intelligence in machines that are programmed to think like
humans and mimic their actions, such as visual perception, speech
recognition, decision-making, and translation between languages.
• The goal of AI is to give a machine the ability to come up with the best
possible decision after considering the possible scenarios and their
outcomes in a given environment.
Applications of AI
Type 1 (based on Ability)
Weak AI or Narrow AI or Artificial narrow intelligence (ANI):
which has a narrow range of abilities.
• Narrow AI is goal-oriented, designed to perform singular tasks, e.g. facial
recognition, speech recognition/voice assistants, driving a car, or searching
the internet.
• It is focused on one narrow task: machines that are not intelligent in a
general sense can be built in such a way that they seem smart within that task.
• An example would be a poker game in which a machine beats a human because
all the rules and moves are fed into the machine.
• Narrow AI can either be reactive or have a limited memory.
Strong AI/Artificial general intelligence (AGI)/Deep AI: which is on par with
human capabilities
• The concept of a machine with general intelligence that mimics human intelligence and/or
behaviors, with the ability to learn and apply its intelligence to solve any problem.
• Machines that can actually think and perform tasks on their own, just like a human being.
• Strong AI uses a theory of mind AI framework, which refers to the ability to discern
needs, emotions and beliefs.
• There are no proper existing examples of this, but some industry leaders are very keen
on getting close to building a strong AI, which has resulted in rapid progress.
Super AI/Artificial super intelligence (ASI)
• Machines that could surpass human intelligence.
• Super intelligent AI would be extremely efficient at attaining goals, whatever they may
be, but we need to ensure these goals align with ours if we expect to maintain some
level of control.
Type 2 (based on Functionality)
Reactive Machines: Reactive AI is incredibly basic; it has no memory or
data storage capabilities, emulating the human mind's ability to respond
to different kinds of stimuli without prior experience. Example: IBM's Deep Blue
chess program, which beat Garry Kasparov in the 1990s.
Limited Memory: AI systems can use past experiences to inform future
decisions. Some of the decision-making functions in self-driving
cars have been designed this way: observations, such as a car that has changed
lanes, are used to inform actions in the not so distant future. These
observations are not stored permanently. Apple's Siri is another example.
Theory of Mind: This type of AI should be able to understand
people's emotions, beliefs, thoughts and expectations, and be able
to interact socially. Even though there have been a lot of improvements in this
field, this kind of AI is not complete yet.
Self-awareness: An AI that has its own consciousness and is super intelligent,
self-aware and sentient (in simple words, a complete human being). Of
course, this kind of AI also doesn't exist, and if achieved it would be one of
the milestones in the field of AI.
AI Fields (Discipline)
Machine Learning
Machine Learning is a concept which allows the machine to learn from
examples and experience, without being explicitly programmed.
• Supervised
• Unsupervised
• Semi-supervised
• Reinforcement
[Figure labels: AI, ML, DL]
Algorithms are:
• Linear Regression
• Logistic Regression
• Decision Tree
• SVM
• Naive Bayes
• kNN
• K-Means
• Random Forest
• Dimensionality Reduction Algorithms
• Gradient Boosting algorithms: GBM, XGBoost, LightGBM, CatBoost
A machine learning algorithm is mainly divided into two phases:
• Training phase
• Testing phase
Training Phase
You take a randomly selected sample of apples from the market (training
data) and make a table of the physical characteristics of each apple, like color,
size, shape, which part of the country it was grown in, which vendor sold it, etc.
(features), along with the sweetness, juiciness and ripeness of that apple
(output variables).
You feed this data to the machine learning algorithm
(classification/regression), and it learns a model of the correlation
between an average apple's physical characteristics and its quality.
Testing Phase
Next time you go shopping, you measure the characteristics of the
apples you are purchasing (test data) and feed them to the machine
learning algorithm.
It will use the model computed earlier to predict whether the apples are
sweet, ripe and/or juicy.
The algorithm may internally use rules similar to the ones you might write
by hand (e.g. a decision tree).
• Finally, you can now shop for apples with great confidence, without worrying
about the details of how to choose the best apples.
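The two phases can be sketched in a few lines. This is only an illustration: the apple measurements, the two features (size and redness) and the labels are made-up values, and a 1-nearest-neighbour rule stands in for whatever algorithm is actually chosen.

```python
import math

# Hypothetical training data: (size in cm, redness from 0 to 1) -> quality label.
TRAINING_APPLES = [
    ((7.5, 0.90), "sweet"),
    ((8.0, 0.80), "sweet"),
    ((7.8, 0.85), "sweet"),
    ((6.0, 0.30), "sour"),
    ((5.5, 0.20), "sour"),
    ((6.2, 0.35), "sour"),
]


def train(examples):
    """Training phase: a 1-nearest-neighbour 'model' simply stores the examples."""
    return list(examples)


def predict(model, features):
    """Testing phase: return the label of the closest stored apple."""
    _, label = min(model, key=lambda example: math.dist(example[0], features))
    return label


model = train(TRAINING_APPLES)
new_apple = (7.6, 0.8)  # measured while shopping (test data)
print(predict(model, new_apple))
```

A real model would use more features and be checked on apples it has not seen before; the point here is only the split between learning from labeled examples and predicting for new ones.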
Supervised Learning
A model is prepared through a training process in which it is required to
make predictions and is corrected when those predictions are wrong. The
training process continues until the model achieves a desired level of
accuracy on the training data.

Input data is called training data and has a known label or result, such as
spam/not-spam or a stock price at a given time.

Some practical examples of classification problems are: speech
recognition, handwriting recognition, biometric identification,
document classification, etc.
Example problems are classification and regression (e.g. linear regression).

Example classification algorithms include:
• Logistic Regression, Back Propagation Neural Network, Naive Bayes Classifier
• Nearest Neighbor, Support Vector Machines
• Decision Trees, Boosted Trees
• Random Forest (classification and regression)

Regression algorithm: predicts continuous data
Classification algorithm: predicts a discrete value
https://www.edureka.co/blog/supervised-learning/
https://towardsdatascience.com/a-brief-introduction-to-supervised-learning-54a3e3932590
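As an illustration of a regression model being "corrected when its predictions are wrong", the sketch below fits a straight line with gradient descent, which the unit outline lists. The data, learning rate and number of steps are assumptions chosen for the example, not values from the slides.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors of the current model on the labeled training data.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradient of the mean squared error with respect to w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Correct the model a little in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical labeled data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))

# Regression gives a continuous prediction; thresholding it gives a discrete class.
prediction = w * 2.5 + b
label = "high" if prediction >= 5.0 else "low"
print(round(prediction, 3), label)
```

The last two lines show the distinction drawn above: the fitted line returns a continuous value, while a classifier returns one of a fixed set of labels.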
Unsupervised Learning
Input data is not labeled and does not have a known result.
A model is prepared by deducing structures present in the input data.
This may be to extract general rules. It may be through a mathematical
process to systematically reduce redundancy, or it may be to organize data
by similarity.
Example problems are clustering, autoencoders, dimensionality
reduction and association rule learning.
Example algorithms include the Apriori algorithm and K-Means.
Semi-Supervised Learning
Input data is a mixture of labeled and unlabeled examples.
There is a desired prediction problem, but the model must learn the
structures to organize the data as well as make predictions.
Example problems are classification and regression.
Example algorithms are extensions to other flexible methods that make
assumptions about how to model the unlabeled data.
Example: GAN (Generative Adversarial Networks)
Reinforcement Learning:

Reinforcement Learning (RL) is a type of machine learning technique that
enables an agent to learn in an interactive environment by trial and error,
using feedback from its own actions and experiences.
There is no labeled training data in this kind of learning, nor do you teach
the algorithm the correct answers. You model the algorithm such that it
interacts with the environment: if the algorithm does a good job, you reward it;
otherwise you punish it. With continuous interactions and
learning, it goes from being bad to being the best that it can be for the
problem assigned to it.
Reinforcement Learning:

• The machine is trained to make specific decisions.
• It trains itself continually using trial and error.
• The machine learns from past experience and tries to capture the best possible
knowledge to make accurate business decisions (example: video games).
Reinforcement Learning: Markov Decision Process
Some key terms that describe the elements of an RL problem are:
• Environment: Physical world in which the agent operates
• State: Current situation of the agent
• Reward: Feedback from the environment
• Policy: Method to map the agent's state to actions
• Value: Future reward that an agent would receive by taking an action in a
particular state
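The key terms above can be made concrete with a small sketch. It uses tabular Q-learning, one common RL method that the slides do not name, on a made-up corridor of five states in which the agent is rewarded only for reaching the right-hand end. The environment, the parameter values and all names are assumptions for illustration.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the goal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right


def step(state, action):
    """Environment: return (next_state, reward) for an action taken in a state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # value table: q[state][action]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Policy: mostly greedy, with random trials for exploration and ties.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward = step(state, action)
            # Move the estimate towards reward + discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Learned policy for the non-goal states: the action with the higher value.
policy = [0 if row[0] > row[1] else 1 for row in q_table[:-1]]
print(policy)  # [1, 1, 1, 1]: always move right, towards the reward
```

No state is ever labeled with the "correct" action; the agent finds the policy only from the rewards it receives while trying actions.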
Supervised Vs Reinforcement
Both supervised and reinforcement learning use a mapping between input and output. In
supervised learning, however, the feedback provided to the agent is the correct set of actions for
performing a task, whereas reinforcement learning uses rewards and punishments as signals for
positive and negative behavior.
Unsupervised Vs Reinforcement
As compared to unsupervised learning, reinforcement learning is different in terms of goals. While the
goal in unsupervised learning is to find similarities and differences between data points, in
reinforcement learning the goal is to find a suitable action model that would maximize the total
cumulative reward of the agent. The figure below represents the basic idea and elements involved in a
reinforcement learning model.
UNSUPERVISED LEARNING
Clustering
Clustering is the task of dividing the population or data points into a
number of groups such that data points in the same group are more
similar to each other than to the data points in other groups.
• Data points that are in the same group should have similar properties and/or features,
while data points in different groups should have highly dissimilar properties and/or
features.
• Clustering is a method of unsupervised learning and is a common technique for
statistical data analysis used in many fields.
• Examples: items arranged in a mall, groups of people around the dining tables of a
restaurant, the Amazon recommendation system, Netflix.
Clustering can be divided into the following subgroups:
• Hard Clustering: In hard clustering, each data point either belongs to a cluster completely or
not. For example, each customer is put into one cluster out of the 10 clusters (K-means).
• Soft Clustering: In soft clustering, instead of putting each data point into a single cluster, a
probability or likelihood of that data point being in each cluster is assigned. For example, each
customer is assigned a probability of being in each of the 10 clusters of the retail store
(Fuzzy C-means).
• Hierarchical clustering: clusters are represented as a tree (or dendrogram). The root of the tree is
the unique cluster that gathers all the samples, the leaves being the clusters with only one
sample.
The more popular algorithms are as follows:
• Affinity Propagation
• Agglomerative Clustering
• BIRCH
• DBSCAN
• K-Means
• Mini-Batch K-Means
• Mean Shift
• OPTICS
• Spectral Clustering
• Mixture of Gaussians
Refer : https://machinelearningmastery.com/clustering-algorithms-with-python/
Clustering Methods:
• Density-Based Methods: These methods consider clusters as dense regions of the
space, having some similarity, that differ from the lower-density regions. These
methods have good accuracy and the ability to merge two clusters. Examples: DBSCAN
(Density-Based Spatial Clustering of Applications with Noise), OPTICS (Ordering
Points To Identify the Clustering Structure), etc.
• Hierarchical Based Methods: The clusters formed in this method form a tree-type
structure based on the hierarchy. New clusters are formed using the previously formed
ones. It is divided into two categories:
  • Agglomerative (bottom-up approach)
  • Divisive (top-down approach)
  • Examples: CURE (Clustering Using Representatives), BIRCH (Balanced Iterative
  Reducing and Clustering using Hierarchies), etc.
• Partitioning Methods: These methods partition the objects into k clusters, and each
partition forms one cluster. They optimize an objective criterion (a similarity
function), for example one in which distance is the major parameter. Examples: K-means,
CLARANS (Clustering Large Applications based upon Randomized Search), etc.
• Grid-based Methods: In this method the data space is divided into a finite
number of cells that form a grid-like structure. All the clustering operations done on
these grids are fast and independent of the number of data objects. Examples: STING
(Statistical Information Grid), WaveCluster, CLIQUE (CLustering In QUEst), etc.
K-Means clustering
The main objective of the K-Means algorithm is to
minimize the sum of distances between the points
and their respective cluster centroid.
• Hard clustering
• Partition algorithm
• Unsupervised algorithm (makes inferences from
datasets using only input vectors, without referring to
known, or labelled, outcomes)
Steps:
Step 1: Choose the number of clusters k
• The first step in k-means is to pick the number of clusters, k.
Step 2: Select k random points from the data as centroids
• Next, we randomly select the centroid for each cluster. Let's say we want to
have 2 clusters, so k is equal to 2 here. We then randomly select the
centroids. Here, the red and green circles represent the centroids for these clusters.
Step 3: Assign all the points to the closest cluster centroid
• Once we have initialized the centroids, we assign each point to
the closest cluster centroid.
• Here you can see that the points which are closer to the red point
are assigned to the red cluster, whereas the points which are
closer to the green point are assigned to the green cluster.
Step 4: Recompute the centroids of newly formed clusters
• Once we have assigned all of the points to either cluster, the
next step is to compute the centroids of the newly formed clusters.
• Here, the red and green crosses are the new centroids.
Step 5: Repeat steps 3 and 4 until there is no change to the
centroids, i.e. the assignment of data points to clusters isn't
changing.
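The steps above can be sketched as follows. This is a minimal illustration using Euclidean distance; the function name, the optional `initial_centroids` argument and the sample points are assumptions, not part of the slides.

```python
import math
import random


def k_means(points, k, initial_centroids=None, seed=0, max_iterations=100):
    """Cluster a list of coordinate tuples into k groups; returns (centroids, clusters)."""
    rng = random.Random(seed)
    # Step 2: pick k random data points as starting centroids unless some are given.
    centroids = list(initial_centroids) if initial_centroids else rng.sample(points, k)
    clusters = []
    for _ in range(max_iterations):
        # Step 3: assign each point to its closest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Step 4: recompute each centroid as the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        # Step 5: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D data with two visible groups.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (9, 9), (9, 10), (10, 9), (10, 10)]
centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```

Because the starting centroids are random, different seeds can give different results, and an unlucky start can settle on a poor grouping. Passing `initial_centroids` makes a run reproducible.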
The step by step process:
Example 1: Given data points S = {2, 3, 4, 10, 11, 12, 20, 25, 30} and K = 2, solve it.
Solution:
Step 1: K = 2
Step 2: Randomly pick centroids (means) M1 = 4 for K1 and M2 = 12 for K2
Step 3: Calculate the distance from each data point to all centroids and assign the data
point to the cluster whose centroid is nearest.
M1 = 4, M2 = 12
K1 = {2, 3, 4}, K2 = {10, 11, 12, 20, 25, 30}

Step 4: After reassigning, calculate the new centroids
2 + 3 + 4 = 9, 10 + 11 + 12 + 20 + 25 + 30 = 108
M1 = 9/3 = 3, M2 = 108/6 = 18
Step 5: Repeat Steps 3 and 4 until there is no change in the centroids
M1 = 3, M2 = 18
K1 = {2, 3, 4, 10}, K2 = {11, 12, 20, 25, 30}
M1 = 4.75 (approx. 5), M2 = 19.6 (approx. 20)
K1 = {2, 3, 4, 10, 11, 12}, K2 = {20, 25, 30}
M1 = 7, M2 = 25
K1 = {2, 3, 4, 10, 11, 12}, K2 = {20, 25, 30} (no change, so stop)

Data points {2, 3, 4, 10, 11, 12} belong to Cluster K1

Data points {20, 25, 30} belong to Cluster K2
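One way to check that each pass of Example 1 improves the clustering is to compute the K-means objective for the three successive assignments. The sketch below assumes the usual squared-distance form of the objective (sum of squared distances to the cluster mean), which is a slightly more specific choice than the "sum of distances" wording used earlier.

```python
def within_cluster_ss(clusters):
    """Sum of squared distances of every point to the mean of its own cluster."""
    total = 0.0
    for cluster in clusters:
        mean = sum(cluster) / len(cluster)
        total += sum((x - mean) ** 2 for x in cluster)
    return total


# The three successive assignments (K1, K2) from Example 1.
assignments = [
    [[2, 3, 4], [10, 11, 12, 20, 25, 30]],
    [[2, 3, 4, 10], [11, 12, 20, 25, 30]],
    [[2, 3, 4, 10, 11, 12], [20, 25, 30]],
]
costs = [within_cluster_ss(assignment) for assignment in assignments]
print([round(cost, 2) for cost in costs])
```

The cost drops from 348 to about 307.95 and then to 150, which is why the final split is preferred over the earlier ones.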

Refer website https://www.gatevidyalay.com/tag/k-means-clustering-numerical-example-pdf/


Example 2 : Cluster the following eight points (with (x, y) representing
locations) into three clusters:
A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A7(1, 2), A8(4, 9)

Initial cluster centers are: A1(2, 10), A4(5, 8) and A7(1, 2).

The distance function between two points a = (x1, y1) and b = (x2, y2) is
defined as ρ(a, b) = |x2 - x1| + |y2 - y1|

Use the K-Means algorithm to find the three cluster centers after the second
iteration.
• We draw a table showing all the results.
• Using the table, we decide which point belongs to which cluster.
• Each point belongs to the cluster whose center is nearest to it.
Iteration 1:
First cluster C1 contains points: A1(2, 10)
Second cluster C2 contains points: A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A8(4, 9)
Third cluster C3 contains points: A2(2, 5), A7(1, 2)

Recalculate centre for Cluster C1: (X1, Y1) = (2, 10)
Recalculate centre for Cluster C2: X2 = (8+5+7+6+4)/5 = 6, Y2 = (4+8+5+4+9)/5 = 6, so (X2, Y2) = (6, 6)
Recalculate centre for Cluster C3: X3 = (2+1)/2 = 1.5, Y3 = (5+2)/2 = 3.5, so (X3, Y3) = (1.5, 3.5)
New cluster centers are: (2, 10), (6, 6), (1.5, 3.5)
Iteration 2: the table is redrawn with the new centers and each point is again assigned to its nearest center.
After the second iteration, the centers of the three clusters are C1(3, 9.5), C2(6.5, 5.25), C3(1.5, 3.5).
This is the answer asked for; the question requires no further iterations.
Iteration 3: repeating the same procedure, the centers of the three clusters are C1(3.67, 9), C2(7, 4.33), C3(1.5, 3.5).

• One more pass leaves the centre values unchanged, so the iteration stops.
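The distance tables of Example 2 can be reproduced with a short script. It uses the distance function defined in the example, |x2 - x1| + |y2 - y1|, and the given initial centers; the function and variable names are illustrative.

```python
def manhattan(a, b):
    """Distance used in Example 2: |x2 - x1| + |y2 - y1|."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


POINTS = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}


def iterate(centers):
    """One pass: assign every point to its nearest centre, then average each cluster."""
    clusters = [[] for _ in centers]
    for name, point in POINTS.items():
        nearest = min(range(len(centers)), key=lambda i: manhattan(point, centers[i]))
        clusters[nearest].append(name)
    # Assumes no cluster ends up empty, which holds for this data set.
    new_centers = [
        (sum(POINTS[n][0] for n in names) / len(names),
         sum(POINTS[n][1] for n in names) / len(names))
        for names in clusters
    ]
    return new_centers, clusters


centers = [POINTS["A1"], POINTS["A4"], POINTS["A7"]]
for iteration in range(1, 5):
    centers, clusters = iterate(centers)
    print(iteration, [(round(x, 2), round(y, 2)) for x, y in centers], clusters)
```

The second line of output gives the centers asked for in the question, and the third and fourth lines show identical clusters, which is the stopping condition.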
The k-means algorithm seems to work pretty well when the distribution of points is circular.
If the distribution of points is not circular, k-means
would still attempt to group the data points in a
circular fashion, so it fails to identify the right clusters.

K-means uses a distance-based model. We will now use a distribution-based
model known as Gaussian Mixture Models.
Gaussian Mixture Models
Gaussian Mixture Models (GMMs) assume that there are a certain
number of Gaussian distributions, and each of these distributions
represents a cluster. Hence, a Gaussian Mixture Model tends to group the
data points belonging to a single distribution together.

Gaussian Mixture Models are probabilistic models and use the soft
clustering approach for distributing the points in different clusters.

Let's say we have three Gaussian distributions: GD1, GD2, and GD3.
These have a certain mean (μ1, μ2, μ3)
and variance (σ1², σ2², σ3²) respectively. For a given set of data points,
our GMM would identify the probability of each data point belonging to
each of these distributions.
Gaussian Mixture Models are distribution-based models.
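To illustrate what "the probability of each data point belonging to each distribution" means, the sketch below evaluates one point against three one-dimensional Gaussians. The means and variances are made-up values, and the three distributions are assumed to be equally likely beforehand.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a 1-D Gaussian with the given mean and variance at x."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


# Hypothetical parameters for GD1, GD2 and GD3.
MEANS = [0.0, 5.0, 10.0]
VARIANCES = [1.0, 1.0, 4.0]


def membership_probabilities(x):
    """Soft assignment: normalise the three densities so that they sum to 1."""
    densities = [gaussian_pdf(x, m, v) for m, v in zip(MEANS, VARIANCES)]
    total = sum(densities)
    return [d / total for d in densities]


probs = membership_probabilities(4.0)
print([round(p, 4) for p in probs])
```

A hard clustering method such as k-means would only report the single most likely cluster for this point, whereas the soft assignment keeps all three probabilities.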
Expectation-Maximization (EM) algorithm:
• Statistical algorithm
• Used when the data has missing values or is incomplete
• Expectation-Maximization uses the existing data to estimate
the optimum values for the missing (latent) variables and then finds the model
parameters
• Expectation-Maximization is the base of many algorithms, including
Gaussian Mixture Models
The EM algorithm is an iterative approach that cycles between
two modes.
E-Step: The first mode attempts to estimate the missing or
latent variables, called the expectation step or E-step.
M-Step: The second mode attempts to optimize the parameters
of the model to best explain the data, called the maximization step or M-step
(i.e. based on the estimated values generated in
the E-step, the complete data is used to update the parameters).
EM in Gaussian Mixture model:
In the EM algorithm,
Expectation (E) step: Estimate the expected value of the
latent variable for each data point.
Maximization (M) step: Optimize the parameters of the probability
distributions in an attempt to best capture the density of the data.
The process is repeated until a good set of latent values and a maximum
likelihood that fits the data are achieved.
E-step:
For each point xi, calculate the probability that it belongs to
cluster/distribution c1, c2, … ck. This is done using the below formula:

This value will be high when the point is assigned to the right cluster and lower otherwise
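The formula itself did not survive in this copy. The standard E-step expression for a Gaussian mixture is given below as a reconstruction; the symbol r_ic (the probability that point x_i belongs to cluster c) is notation chosen here, π_c corresponds to the Π value of cluster c, and k is the number of clusters.

```latex
r_{ic} = \frac{\pi_c \, \mathcal{N}(x_i \mid \mu_c, \Sigma_c)}
              {\sum_{j=1}^{k} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}
```

The numerator is large when x_i lies close to the mean of cluster c relative to its spread, and the denominator normalises the values so that they sum to 1 over all clusters.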
M-step:
After the E-step, we go back and update the Π, μ and Σ values.
These are updated in the following manner:
1. The new density is defined by the ratio of the number of
points in the cluster (counted softly, as the sum of the E-step probabilities
for that cluster) to the total number of points:

2. The mean and the covariance matrix are updated
based on the values assigned to the distribution, in
proportion to the probability values for the data
point. Hence, a data point that has a higher
probability of being a part of that distribution will
contribute a larger portion:
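The update formulas are also missing from this copy. In their standard form, writing r_ic for the E-step probability that point x_i belongs to cluster c and N for the total number of points (both notations are assumptions made here), they read:

```latex
\begin{aligned}
N_c &= \sum_{i=1}^{N} r_{ic}, \qquad \pi_c = \frac{N_c}{N} \\
\mu_c &= \frac{1}{N_c} \sum_{i=1}^{N} r_{ic} \, x_i \\
\Sigma_c &= \frac{1}{N_c} \sum_{i=1}^{N} r_{ic} \, (x_i - \mu_c)(x_i - \mu_c)^{\top}
\end{aligned}
```

Here N_c plays the role of the "number of points in the cluster", and each point contributes to μ_c and Σ_c in proportion to r_ic.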
• Based on the updated values generated from the M-step, we calculate the new
probabilities for each data point and update the values iteratively.

• This process is repeated in order to maximize the log-likelihood function.

• k-means only considers the mean to update the centroid, while GMM takes into
account the mean as well as the variance of the data.
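The whole E-step/M-step loop can be sketched for one-dimensional data as follows. The data set, the starting parameters, the fixed number of iterations and the small variance floor are assumptions made for the example; a practical implementation would also monitor the log-likelihood to decide when to stop.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a 1-D Gaussian with the given mean and variance at x."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


def fit_gmm_1d(data, means, variances, weights, iterations=50):
    """Run EM for a 1-D Gaussian mixture; returns (means, variances, weights)."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E-step: probability that each point belongs to each component.
        resp = []
        for x in data:
            weighted = [weights[c] * gaussian_pdf(x, means[c], variances[c]) for c in range(k)]
            total = sum(weighted)
            resp.append([w / total for w in weighted])
        # M-step: re-estimate weight, mean and variance from the soft assignments.
        for c in range(k):
            n_c = sum(r[c] for r in resp)
            weights[c] = n_c / len(data)
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / n_c
            variances[c] = max(
                sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / n_c,
                1e-6,  # floor keeps a component from collapsing to zero width
            )
    return means, variances, weights


# Hypothetical data drawn around 1 and around 5.
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
means, variances, weights = fit_gmm_1d(
    data, means=[0.0, 6.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
print([round(m, 3) for m in means], [round(v, 3) for v in variances], [round(w, 3) for w in weights])
```

Unlike the k-means update, each component here keeps its own variance, so a tight group and a wide group can be modelled differently.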

Reference :https://www.analyticsvidhya.com/blog/2019/10/gaussian-mixture-models-clustering/
