ML Notes

What is Machine Learning?

Machine Learning is defined as the study of computer algorithms that can
automatically construct computer software from past experience and training data.
It is a branch of Artificial Intelligence and computer science that builds models
based on training data and makes predictions and decisions without being explicitly
programmed. Machine Learning is used in various applications such as email
filtering, speech recognition, computer vision, self-driving cars, Amazon product
recommendations, etc.
Commonly used Algorithms in Machine Learning
Machine Learning is the study of algorithms that learn from past experience to
make future decisions. Although Machine Learning offers a wide variety of models,
here is a list of the algorithms most commonly used by data scientists and
professionals today.
o Linear Regression
o Logistic Regression
o Decision Tree
o Bayes Theorem and Naïve Bayes Classification
o Support Vector Machine (SVM) Algorithm
o K-Nearest Neighbor (KNN) Algorithm
o K-Means
o Gradient Boosting algorithms
o Dimensionality Reduction Algorithms
o Random Forest

Well Posed Learning Problem – A computer program is said to learn from
experience E with respect to some task T and some performance measure P, if its
performance on T, as measured by P, improves with experience E.
A problem qualifies as a well-posed learning problem if it has three traits:

o Task
o Performance Measure
o Experience
Some examples that clearly illustrate well-posed learning problems are:
1. Learning to filter emails as spam or not
o Task – classifying emails as spam or not spam
o Performance Measure – the fraction of emails accurately classified as spam
or not spam
o Experience – observing you label emails as spam or not spam
2. A checkers learning problem
o Task – playing checkers
o Performance Measure – percent of games won against opponents
o Experience – playing practice games against itself
3. A handwriting recognition problem
o Task – recognizing and classifying handwritten words within images
o Performance Measure – percent of words accurately classified
o Experience – a database of handwritten words with given classifications
4. A robot driving problem
o Task – driving on public four-lane highways using vision sensors
o Performance Measure – average distance traveled before an error
o Experience – a sequence of images and steering commands recorded while
observing a human driver
5. A fruit prediction problem
o Task – recognizing different fruits
o Performance Measure – the fraction of fruit varieties correctly predicted
o Experience – training the machine on a large dataset of fruit images
6. A face recognition problem
o Task – predicting different types of faces
o Performance Measure – the fraction of faces correctly predicted
o Experience – training the machine on a large dataset of different face
images

Design a Learning System in Machine Learning
In simple words, when we feed training data to a machine learning
algorithm, the algorithm produces a mathematical model, and with the help of
that model the machine makes predictions and takes decisions without being
explicitly programmed. Also, the more the machine works with the training data,
the more experience it gains and the more efficient the results it produces.

Example: In a driverless car, the training data fed to the algorithm covers how
to drive the car on highways and on busy and narrow streets, with factors like
speed limits, parking, stopping at signals, etc. A logical and mathematical model
is then created on that basis, and afterwards the car operates according to that
model. Also, the more data that is fed, the more efficient the output produced.
Designing a Learning System in Machine Learning :
According to Tom Mitchell, “A computer program is said to learn from
experience E with respect to some task T and some performance measure P, if its
performance at task T, as measured by P, improves with experience E.”
Example: In spam e-mail detection,
o Task, T: to classify mails as Spam or Not Spam.
o Performance measure, P: the percentage of mails correctly classified
as “Spam” or “Not Spam”.
o Experience, E: a set of mails labeled “Spam” or “Not Spam”.
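As a toy illustration of the performance measure P, accuracy is simply the fraction of mails classified correctly. The labels below are invented for the example (Python):

# Toy illustration of P for the spam task: fraction of mails classified correctly.
y_true = ["Spam", "Not Spam", "Spam", "Spam", "Not Spam"]
y_pred = ["Spam", "Not Spam", "Not Spam", "Spam", "Not Spam"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"P = {accuracy:.0%}")  # P = 80%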
The steps for designing a learning system are:

Choosing the Training Experience: It is important to note that the data or
experience we feed to the algorithm has a significant impact on the success or
failure of the model, so the training data or experience should be chosen wisely.
Choosing the target function: Based on the knowledge fed to the algorithm, the
learner chooses a NextMove function that describes which legal move should be
taken. For example, while playing chess against an opponent, when the opponent
moves, the machine learning algorithm decides which of the possible legal moves
to take in order to succeed.
Choosing a representation for the target function: Once the machine knows all
the possible legal moves, the next step is to choose a representation for selecting
the optimized move, i.e., linear equations, a hierarchical graph representation,
tabular form, etc. A sketch of the common linear representation is shown below.
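As a hedged sketch of the linear-equation representation, the target function for a board b is often written as V(b) = w0 + w1·x1 + … + wn·xn, where the xi are board features and the wi are weights the learner must estimate. The features and weight values below are placeholders, not actual checkers features:

# Sketch of a linear representation of the target function
# V(b) = w0 + w1*x1 + ... + wn*xn; features and weights are placeholders.
def v_hat(features, weights, w0=0.0):
    # Linear approximation of the target function for a board state b.
    return w0 + sum(w * x for w, x in zip(weights, features))

board_features = [12, 11, 2, 1]        # e.g., piece and king counts (illustrative)
weights = [0.5, -0.5, 1.0, -1.0]       # values the learner must estimate
print(v_hat(board_features, weights))  # 1.5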
Choosing a function approximation algorithm: An optimized move cannot be
chosen from the training data alone. The system must work through a set of
training examples; from these it approximates which moves should be chosen,
and the outcome of each game provides feedback. For example, when training
data for playing chess is fed to the algorithm, the algorithm will sometimes fail
and sometimes succeed, and from each failure or success it adjusts its estimate of
which move to choose next and of that move's success rate.
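One standard choice for this step in Mitchell's checkers treatment is the LMS (least mean squares) weight-update rule: after each training example, every weight wi is nudged by η(Vtrain(b) − V̂(b))·xi. The sketch below reuses the placeholder board from the previous snippet; η and the training value are illustrative:

# Minimal sketch of the LMS rule: w_i <- w_i + eta * (V_train - V_hat) * x_i
def lms_update(weights, features, v_train, v_estimate, eta=0.01):
    # One LMS step: nudge each weight toward reducing the training error.
    error = v_train - v_estimate
    return [w + eta * error * x for w, x in zip(weights, features)]

weights = [0.5, -0.5, 1.0, -1.0]
board_features = [12, 11, 2, 1]
weights = lms_update(weights, board_features, v_train=100.0, v_estimate=1.5)
print(weights)  # each weight moved in proportion to its feature value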
Final Design: The final design emerges once the system has worked through a
number of examples, failures and successes, and correct and incorrect decisions.
Example: Deep Blue, an intelligent ML-based computer, won a chess match
against the chess expert Garry Kasparov and became the first computer to beat a
human world chess champion.
Following are the qualities that you need to keep in mind while designing
a learning system :
Reliability
Scalability
Maintainability
Adaptability

Common issues in Machine Learning


Although machine learning is used in every industry and helps organizations
make more informed, data-driven choices that are more effective than classical
methodologies, it still has problems that cannot be ignored. Here are some
common issues that professionals face while building ML skills and creating
applications from scratch.
1. Inadequate Training Data
A major issue in using machine learning algorithms is a lack of quality as well as
quantity of data. Although data plays a vital role in machine learning, many data
scientists report that inadequate, noisy, and unclean data severely hamper
machine learning algorithms. For example, a simple task may require thousands
of samples, while an advanced task such as speech or image recognition needs
millions of sample data points. Further, data quality is essential for the
algorithms to work ideally, yet poor data quality is common in machine learning
applications. Data quality can be affected by the following factors:
o Noisy data – responsible for inaccurate predictions that affect decisions as
well as accuracy in classification tasks.
o Incorrect data – also responsible for faulty behavior and results in machine
learning models; hence, incorrect data may affect the accuracy of the results.
o Generalization of output data – sometimes generalizing output data becomes
complex, which results in comparatively poor future actions.
2. Poor quality of data
As discussed above, data plays a significant role in machine learning, and it must
be of good quality. Noisy, incomplete, inaccurate, and unclean data lead to lower
accuracy in classification and low-quality results. Hence, poor data quality can be
considered another major problem when working with machine learning
algorithms.
3. Non-representative training data
To make sure our model generalizes well, we must ensure that the training data
is representative of the new cases we need to generalize to. The training data
must cover all cases that have occurred as well as those occurring now. If we use
non-representative training data, the model makes less accurate predictions. A
machine learning model is considered ideal if it predicts well for general cases
and provides accurate decisions. If there is too little training data, the model will
suffer from sampling noise; if the sampling method itself is flawed, the result is a
non-representative training set, and the model becomes biased toward one class
or group and inaccurate in its predictions.
Hence, we should use representative training data to protect against bias and
make accurate predictions without any drift.
4. Overfitting and Underfitting
Overfitting:
Overfitting is one of the most common issues faced by machine learning
engineers and data scientists. When a machine learning model is trained on a
huge amount of data, it starts capturing the noise and inaccuracies present in the
training data set, which negatively affects the model's performance. Consider a
simple example with a training set of 1000 mangoes, 1000 apples, 1000 bananas,
and 5000 papayas. There is then a considerable probability of identifying an
apple as a papaya, because the training data set is heavily biased, and the
predictions suffer accordingly. The main reason behind overfitting is the use of
non-linear methods in machine learning algorithms, as they can build unrealistic
data models. We can reduce overfitting by using linear and parametric algorithms
in machine learning models.
Methods to reduce overfitting:
o Increase training data in a dataset.
o Reduce model complexity by simplifying the model by selecting one with
fewer parameters
o Ridge Regularization and Lasso Regularization
o Early stopping during the training phase
o Reduce the noise
o Reduce the number of attributes in training data.
o Constraining the model.
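As a minimal sketch of two of the remedies above, here are ridge (L2) and lasso (L1) regularization with scikit-learn; the synthetic data and alpha values are illustrative, not tuned:

# Ridge shrinks all coefficients toward zero; Lasso drives irrelevant ones to exactly zero.
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))          # 20 features, most of them irrelevant
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)
print("nonzero Lasso coefficients:", int(np.sum(lasso.coef_ != 0)))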
Underfitting:
Underfitting is just the opposite of overfitting. When a machine learning model
is trained on too little data, it produces incomplete and inaccurate results, which
destroys the accuracy of the model. Underfitting occurs when the model is too
simple to capture the underlying structure of the data, just like an undersized
pair of pants. This generally happens when we have limited data in the data set
and we try to fit a linear model to non-linear data. In such scenarios, the model
fails to capture the complexity of the data, its rules become too simple for the
data set, and it starts making wrong predictions.
Methods to reduce underfitting:
o Increase model complexity
o Remove noise from the data
o Train on more and better features
o Reduce the constraints
o Increase the number of epochs to get better results.
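"Increase model complexity" can be as simple as adding polynomial features, so a linear learner can fit non-linear data. A minimal sketch with scikit-learn, on illustrative synthetic data:

# A straight line underfits a quadratic target; polynomial features fix it.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 2 + rng.normal(scale=0.2, size=200)   # quadratic target

linear = LinearRegression().fit(X, y)                       # underfits
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)
print("linear R^2:", round(linear.score(X, y), 2))          # low: the line cannot bend
print("poly R^2:  ", round(poly.score(X, y), 2))            # close to 1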
5. Monitoring and maintenance
As we know, generalized output data is mandatory for any machine learning
model; hence regular monitoring and maintenance are compulsory. Different
results for different actions require changes to the data; hence editing the code,
and providing resources to monitor it, also become necessary.
6. Getting bad recommendations
A machine learning model operates within a specific context; when that context
changes, the model produces bad recommendations because of concept drift. For
example, at one point a customer is looking for some gadgets, but the customer's
requirements change over time while the model keeps showing the same
recommendations even though the customer's expectations have changed. This
phenomenon is called data drift. It generally occurs when new data is introduced
or the interpretation of data changes. We can overcome it by regularly
monitoring and updating the data in line with expectations.
7. Lack of skilled resources
Although Machine Learning and Artificial Intelligence are continuously growing
in the market, these fields are still younger than others. The absence of skilled
people is also an issue: we need manpower with in-depth knowledge of
mathematics, science, and technology to develop and manage machine learning
systems.
8. Customer Segmentation
Customer segmentation is also an important issue while developing a machine
learning algorithm: identifying which customers act on the recommendations
shown by the model and which do not even check them. Hence, an algorithm is
needed to recognize customer behavior and trigger relevant recommendations for
the user based on past experience.
9. Process Complexity of Machine Learning
The machine learning process is very complex, which is another major issue
faced by machine learning engineers and data scientists. Machine Learning and
Artificial Intelligence are very new technologies, still in an experimental phase
and continuously changing over time. Much of the work proceeds by trial and
error, so the probability of error is higher than expected. Further, the process
includes analyzing the data, removing data bias, training the data, applying
complex mathematical calculations, etc., which makes it more complicated and
quite tedious.
10. Data Bias
Data bias is also a big challenge in Machine Learning. These errors arise when
certain elements of the dataset are weighted more heavily or given more
importance than others. Biased data leads to inaccurate results, skewed outcomes,
and other analytical errors. We can address this by determining where the dataset
is actually biased and then taking the necessary steps to reduce the bias.
Methods to remove Data Bias:
o Research more for customer segmentation.
o Be aware of your general use cases and potential outliers.
o Combine inputs from multiple sources to ensure data diversity.
o Include bias testing in the development process.
o Analyze data regularly and keep tracking errors to resolve them easily.
o Review the collected and annotated data.
o Use multi-pass annotation such as sentiment analysis, content moderation, and
intent recognition.
11. Lack of Explainability
This basically means the outputs cannot be easily understood, because the model
is programmed in specific ways to deliver results for certain conditions. The lack
of explainability found in many machine learning algorithms reduces their
credibility.
12. Slow implementations and results
This issue is also very commonly seen in machine learning models. Machine
learning models can be highly efficient at producing accurate results, but they
are time-consuming. Slow programs, excessive requirements, and overloaded data
take more time than expected to provide accurate results. This demands
continuous maintenance and monitoring of the model to keep it delivering
accurate results.
13. Irrelevant features
Although machine learning models are intended to give the best possible
outcome, if we feed garbage data as input, the result will also be garbage. Hence,
we should use relevant features in our training sample. A machine learning
model is considered good when its training data has a good set of features with
few to no irrelevant ones.
What is Concept Learning?
Concept learning is a type of machine learning task that involves training a model to
learn a concept or a target function that maps inputs to outputs. The goal of concept
learning is to train a model to learn a concept that can be used to make predictions
on new, unseen data. In concept learning, the model is trained on a set of labeled
examples, where each example consists of an input and a corresponding output. The
model learns to identify the underlying concept or pattern in the data and makes
predictions on new data based on that concept.
Types of Concept Learning Tasks
There are two types of concept learning tasks: supervised and unsupervised.
Supervised Concept Learning
In supervised concept learning, the model is trained on labeled data, where each
example consists of an input and a corresponding output. The goal of the model is
to learn a concept that can be used to make predictions on new data. Supervised
concept learning is used in applications such as image classification, speech
recognition, and natural language processing.
Unsupervised Concept Learning
In unsupervised concept learning, the model is trained on unlabeled data, and the
goal is to discover hidden patterns or concepts in the data. Unsupervised concept
learning is used in applications such as clustering, dimensionality reduction, and
anomaly detection.
Importance of Concept Learning
Concept learning is an important task in machine learning because it enables
machines to learn from data and make predictions or decisions without being
explicitly programmed. The importance of concept learning can be seen in various
applications, including:
Image Classification
Concept learning is used in image classification to train models to recognize objects
in images. For example, a model can be trained to recognize objects such as cars,
dogs, and buildings in images.
Natural Language Processing
Concept learning is used in natural language processing to train models to
understand the meaning of text. For example, a model can be trained to classify text
as positive or negative based on sentiment analysis.
Recommendation Systems
Concept learning is used in recommendation systems to train models to recommend
products or services based on user behavior. For example, a model can be trained to
recommend products to users based on their browsing history.
How Concept Learning Works
Concept learning involves training a model on a set of labeled examples, where each
example consists of an input and a corresponding output. The model learns to
identify the underlying concept or pattern in the data and makes predictions on new
data based on that concept.

What is the Aim of Concept Learning?


So, with all this being said, concept learning aims to find a function or rule that
truly represents the particular concept being learned. The function must be a true
representation of the concept so that it can make accurate classifications of
unseen data. By “true representation”, we mean that the function must be able to
approximate the true value of a target concept. The target concept refers to what
we're trying to classify. In the simplest case the target is a Boolean-valued
function, denoted c(x), that takes one of two values; more generally the target
may take on two or more possible categories. The aim is to determine which
category of the target concept a certain object belongs to.
According to the Inductive Learning Hypothesis, if a function can approximate the
target concept well enough over training examples, then it will be able to
approximate the target concept well for unseen examples.
For example, suppose an algebra learner has gained an understanding of the general
rules of algebra based on the examples they’ve practiced. In that case, they’ll be able
to apply those rules to solve any new problems that they encounter. Similarly, in
concept learning, an inferred function will be able to approximate and classify new
data based on how well it has learned in the past.
How Does Concept Learning Work?
Concept learning works in two ways:
o Inferring a function from a set of training examples.
o Searching for the function that best fits the training examples.
How Does Concept Learning Infer a Function?
Consider the following bank loan example.
Suppose that the bank wants to classify customers according to five features:
o Gender
o Age
o Income
o Dependents
o Loan Amount
And depending on these features, each applicant will be classified into one of two
categories: Loan Approved or Not Approved.
To do this, the bank needs to use a training set consisting of example loan applicants.
This training set will be used to infer a function that will act as a decision rule for
classifying the applicants.
Let’s consider the training sample shown below:
Each customer is either a negative or positive example. The applicants whose
loan application is accepted are the positive examples, while the applicants whose
loan application is rejected are the negative examples.

These examples are determined according to the preferences and needs of the
bank. The bank may have determined that the best customers have a combination of
certain features.
The positive examples are the examples of applicants that the bank has
deemed acceptable for awarding a loan. These applicants have a combination of
features that have been determined as desirable to the bank in terms of the applicant
being able to repay the loan without much trouble.
The negative examples are the examples of applicants that the bank deems
unacceptable for awarding a loan. These applicants have a combination of features
that the bank sees as undesirable. These are applicants that the bank deems will have
difficulty repaying the loan.
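A hedged sketch of how such a training set might be represented in code; the applicants, feature values, and the bank's labels below are invented for illustration:

# Hypothetical loan-applicant training set: five features plus a label.
FEATURES = ("gender", "age", "income", "dependents", "loan_amount")

training_set = [
    (("F", 34, 72000, 1, 15000), "Approved"),      # positive example
    (("M", 45, 98000, 2, 30000), "Approved"),      # positive example
    (("M", 22, 18000, 0, 25000), "Not Approved"),  # negative example
    (("F", 51, 26000, 4, 40000), "Not Approved"),  # negative example
]

positives = [x for x, label in training_set if label == "Approved"]
negatives = [x for x, label in training_set if label == "Not Approved"]
print(len(positives), "positive and", len(negatives), "negative examples")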

Another example of concept learning


Concept Learning in Machine Learning –
The problem of inducing general functions from specific training examples is
central to learning.
Concept learning can be formulated as a problem of searching through a
predefined space of potential hypotheses for the hypothesis that best fits the training
examples.
What is Concept Learning…?
“A task of acquiring potential hypothesis (solution) that best fits the given
training examples.”
Consider the example task of learning the target concept “days on which my
friend Prabhas enjoys his favorite water sport.”
A table of example days describes this task, each day represented by a set
of attributes. The attribute EnjoySport indicates whether or not Prabhas enjoys his
favorite water sport on that day. The task is to learn to predict the value
of EnjoySport for an arbitrary day, based on the values of its other attributes.
Concept Learning Task Example

Inducing general functions from specific training examples is a main issue of


machine learning.
• Concept Learning: acquiring the definition of a general category from
given positive and negative training examples of the category.
• Concept Learning can be seen as a problem of searching through a predefined
space of potential hypotheses for the hypothesis that best fits the training examples.
• The hypothesis space has a general-to-specific ordering of hypotheses, and the
search can be efficiently organized by taking advantage of this naturally occurring
structure over the hypothesis space.

General-to-Specific Ordering of Hypotheses


As noted earlier, all concept learning problems share one thing in common
regarding their structure: we can order the hypotheses from the most specific to
the most general. This enables a machine learning algorithm to explore the
hypothesis space exhaustively without having to enumerate each and every
hypothesis, which is impossible when the hypothesis space is infinitely large. We
will now discuss the general-to-specific ordering and show how it defines a sense
of order among the hypotheses of any concept learning problem.
As a reminder, recall our earlier example about whether Joe will play golf on a
given day. We would like to learn the concept Play for Joe.
Now, consider the following 2 hypotheses:
h1 = <Rainy, Warm, Strong>
h2 = <Rainy, ?, Strong>
The question is which instances, and how many of them, are classified as positive
by each of these hypotheses (i.e., satisfy them). For h1, only example number 4
is satisfied, while for h2, both examples 3 and 4 are satisfied and classified as
positive. Why is that? What is the difference between these 2 hypotheses? The
answer lies in the strictness of the constraints imposed by each hypothesis. As
you can see, h1 imposes more constraints than h2; naturally, h2 is able to classify
more examples as positive than h1. In fact, in this example we can claim the
following:
“If an example satisfies h1, it will FOR SURE satisfy h2, but NOT the
other way around.”
This is because h2 is more general than h1: h2 encompasses more possibilities
than h1. If we have an instance with the values <Rainy, Freezing, Strong>, h2
will classify it as positive, but h1 will not be satisfied by it. But when h1 classifies
an instance as positive, say <Rainy, Warm, Strong>, then it is certain that h2 will
also classify it as positive.
More-General-than-or-Equal-to Relationship
Let’s remind ourselves of 2 crucial definitions:
1. For a given example x in the instance space X, and a hypothesis h in the
hypothesis space H, we say that x satisfies h if and only if h(x) = 1.
2. We define the “more general than or equal to” relationship between 2
hypotheses, and we base it on the sets of examples that satisfy each one of the
2 hypotheses.
Now, here is the long-awaited definition:
Hypothesis hj is more general than or equal to hypothesis hk if and
only if any instance x in the instance space X that satisfies hk (i.e., hk(x) = 1)
also satisfies hj (i.e., hj(x) = 1).
We can show this relationship with the following notation:
hj ≥g hk
where g is short for general. There are also cases where one hypothesis is strictly
more general than another, but not equal to it: hj >g hk if and only if hj ≥g hk
and hk is not ≥g hj.
Finally, it sometimes makes sense to look at the same definition from a different
angle. After understanding what "more general" means, we can also consider the
other side of the spectrum: if hypothesis hk is more general than hj, then we can
say, equivalently, that hj is more specific than hk. This is just another way of
saying the same thing.
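A small sketch of the ≥g relation for conjunctive hypotheses such as <Rainy, ?, Strong>, where '?' accepts any attribute value; the tuple encoding is an implementation choice for illustration:

# The more-general-than-or-equal-to relation (hj >=g hk).
def satisfies(h, x):
    # h(x) = 1 iff every constraint in h is '?' or matches the instance value.
    return all(c == '?' or c == v for c, v in zip(h, x))

def more_general_or_equal(hj, hk):
    # hj >=g hk iff every constraint of hj is '?' or equals hk's constraint.
    return all(cj == '?' or cj == ck for cj, ck in zip(hj, hk))

h1 = ('Rainy', 'Warm', 'Strong')
h2 = ('Rainy', '?', 'Strong')
print(more_general_or_equal(h2, h1))                    # True: h2 >=g h1
print(more_general_or_equal(h1, h2))                    # False: h1 is strictly more specific
print(satisfies(h2, ('Rainy', 'Freezing', 'Strong')))   # True
print(satisfies(h1, ('Rainy', 'Freezing', 'Strong')))   # False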

Find-S Algorithm in Machine Learning


The Find-S algorithm stands out as a fundamental tool in the field. Developed by Tom
Mitchell, this pioneering algorithm holds great significance in hypothesis space representation
and concept learning.
With its simplicity and efficiency, the Find-S algorithm has garnered attention for its ability to
discover and generalize patterns from labeled training data. In this article, we delve into the inner
workings of the Find-S algorithm, exploring its capabilities and potential applications in modern
machine learning paradigms.
What is the Find-S algorithm in Machine Learning?

The Find-S algorithm is a machine learning algorithm that seeks to find a
maximally specific hypothesis that fits the labeled training data. It starts with
the most specific hypothesis and generalizes it by incorporating positive
examples; it ignores negative examples during the learning process.

The algorithm's objective is to discover a hypothesis that accurately
represents the target concept by progressively generalizing the hypothesis
until it covers all positive instances.

Symbols used in Find-S algorithm

In the Find-S algorithm, the following symbols are commonly used to
represent different concepts and operations −

o ∅ (Empty Set) − represents the absence of any specific value or attribute. It is
often used to initialize the hypothesis as the most specific concept.
o ? (Don't Care) − represents a "don't care" or "unknown" value for an attribute.
It is used when the hypothesis needs to generalize over the different attribute
values present in positive examples.
o Positive Examples (+) − instances labeled with the target class or concept
being learned.
o Negative Examples (−) − instances labeled as non-target classes or concepts
that should not be covered by the hypothesis.
o Hypothesis (h) − the learned concept or generalization based on the training
data. It is refined iteratively throughout the algorithm.

Introduction :
The Find-S algorithm is a basic concept learning algorithm in machine learning. It
finds the most specific hypothesis that fits all the positive examples. Note that the
algorithm considers only the positive training examples. Find-S starts with the most
specific hypothesis and generalizes it each time it fails to cover an observed positive
training example. Hence, the Find-S algorithm moves from the most specific
hypothesis toward the most general hypothesis.
Important Representation :

1. ? indicates that any value is acceptable for the attribute.
2. A single required value (e.g., Cold) may be specified for the attribute.
3. ϕ indicates that no value is acceptable.
4. The most general hypothesis is represented by: {?, ?, ?, ?, ?, ?}
5. The most specific hypothesis is represented by: {ϕ, ϕ, ϕ, ϕ, ϕ, ϕ}
Steps Involved In Find-S :

1. Start with the most specific hypothesis.
h = {ϕ, ϕ, ϕ, ϕ, ϕ, ϕ}
2. Take the next example; if it is negative, make no changes to the hypothesis.
3. If the example is positive and the current hypothesis is too specific, update
the hypothesis to a more general condition.
4. Keep repeating the above steps until all the training examples have been
processed.
5. After processing all the training examples, we have a final hypothesis which
we can use to classify new examples.

Example :
Consider the following data set containing data about which particular seeds are
poisonous.
First, we take the most specific hypothesis. Since the examples have four
attributes, our hypothesis is:
h = {ϕ, ϕ, ϕ, ϕ}

Consider example 1 :
The data in example 1 is { GREEN, HARD, NO, WRINKLED } with a positive
outcome. Our initial hypothesis is too specific, so we generalize it for this example.
The hypothesis becomes :
h = { GREEN, HARD, NO, WRINKLED }

Consider example 2 :
Here we see that this example has a negative outcome. Hence we neglect this
example and our hypothesis remains the same.
h = { GREEN, HARD, NO, WRINKLED }

Consider example 3 :
Here we see that this example has a negative outcome. Hence we neglect this
example and our hypothesis remains the same.
h = { GREEN, HARD, NO, WRINKLED }
Consider example 4 :
The data in example 4 is { ORANGE, HARD, NO, WRINKLED } with a positive
outcome. We compare every attribute with the current hypothesis, and wherever a
mismatch is found we replace that attribute with the general case ( ” ? ” ). The
hypothesis becomes :
h = { ?, HARD, NO, WRINKLED }

Consider example 5 :
The data in example 5 is { GREEN, SOFT, YES, SMOOTH } with a positive
outcome. We again compare every attribute with the current hypothesis and replace
any mismatched attribute with the general case ( ” ? ” ). The hypothesis becomes :
h = { ?, ?, ?, ? }
Since we have reached a point where all the attributes in our hypothesis take the
general condition, examples 6 and 7 would result in the same hypothesis with all
general attributes.
h = { ?, ?, ?, ? }

Hence, for the given data the final hypothesis is :


Final Hypothesis: h = { ?, ?, ?, ? }

Algorithm :

1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
    For each attribute constraint a_i in h
        If the constraint a_i is satisfied by x
            Then do nothing
        Else replace a_i in h by the next more general constraint that
        is satisfied by x
3. Output hypothesis h
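A runnable sketch of this algorithm in Python, applied to the seed example above. Since the original table is not reproduced here, the attribute values of the negative rows are invented; this does not affect the trace, because Find-S ignores negatives:

# Runnable sketch of Find-S; None stands for the all-phi hypothesis.
def find_s(examples):
    h = None
    for x, label in examples:
        if label != 'positive':
            continue                      # negative examples are ignored
        if h is None:
            h = list(x)                   # first positive example: copy it verbatim
        else:                             # generalize every mismatched attribute to '?'
            h = [hi if hi == xi else '?' for hi, xi in zip(h, x)]
    return h

# Seed data modeled on the worked example; negative rows are invented.
seeds = [
    (('GREEN',  'HARD', 'NO',  'WRINKLED'), 'positive'),   # example 1
    (('GREEN',  'HARD', 'YES', 'SMOOTH'),   'negative'),   # example 2 (invented values)
    (('BROWN',  'SOFT', 'YES', 'SMOOTH'),   'negative'),   # example 3 (invented values)
    (('ORANGE', 'HARD', 'NO',  'WRINKLED'), 'positive'),   # example 4
    (('GREEN',  'SOFT', 'YES', 'SMOOTH'),   'positive'),   # example 5
]
print(find_s(seeds))   # ['?', '?', '?', '?']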

ML – Candidate Elimination Algorithm




The candidate elimination algorithm incrementally builds the version space given a
hypothesis space H and a set E of examples. The examples are added one by one;
each example possibly shrinks the version space by removing the hypotheses that
are inconsistent with the example. The candidate elimination algorithm does this by
updating the general and specific boundary for each new example.
o You can consider this an extended form of the Find-S algorithm.
o It considers both positive and negative examples.
o Positive examples are used as in the Find-S algorithm, generalizing the
specific boundary.
o Negative examples are used to specialize the general boundary.

Terms Used:
o Concept learning: the learning task of the machine (learning from training
data).
o General Hypothesis: a hypothesis that specifies no features;
G = {‘?’, ‘?’, ‘?’, ‘?’, …}, with one ‘?’ per attribute.
o Specific Hypothesis: a hypothesis that specifies particular features;
S = {‘ϕ’, ‘ϕ’, ‘ϕ’, …}, with one ‘ϕ’ per attribute.
o Version Space: the set of hypotheses between the general and specific
boundaries; not just one hypothesis, but the set of all possible hypotheses
consistent with the training data set.
Algorithm:
Step 1: Load the data set.
Step 2: Initialize the General Hypothesis and the Specific Hypothesis.
Step 3: For each training example:
Step 4: If the example is positive:
if attribute_value == hypothesis_value:
Do nothing
else:
replace the attribute value in S with '?' (generalizing it)
Step 5: If the example is negative:
Make the general hypotheses more specific (specialize G against
the example). A runnable sketch follows the worked trace below.
Example:
Consider the dataset given below:
Algorithmic steps:
Initially : G = [[?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?],
[?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?]]
S = [Null, Null, Null, Null, Null, Null]

For instance 1 : <'sunny','warm','normal','strong','warm ','same'> and positive output.


G1 = G
S1 = ['sunny','warm','normal','strong','warm ','same']

For instance 2 : <'sunny','warm','high','strong','warm ','same'> and positive output.


G2 = G
S2 = ['sunny','warm',?,'strong','warm ','same']

For instance 3 : <'rainy','cold','high','strong','warm ','change'> and negative output.


G3 = [['sunny', ?, ?, ?, ?, ?], [?, 'warm', ?, ?, ?, ?], [?, ?, ?, ?, ?, ?],
[?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, 'same']]
S3 = S2

For instance 4 : <'sunny','warm','high','strong','cool','change'> and positive output.


G4 = G3
S4 = ['sunny','warm',?,'strong', ?, ?]

Finally, by combining G4 and S4, the algorithm produces the output.


Output :
G = [['sunny', ?, ?, ?, ?, ?], [?, 'warm', ?, ?, ?, ?]]
S = ['sunny','warm',?,'strong', ?, ?]
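A runnable sketch of Candidate Elimination on the four instances traced above. The tuple-with-'?' representation is an implementation choice, and the boundary-pruning checks that this particular trace never needs are omitted for brevity:

# Runnable sketch of Candidate Elimination.
def consistent(h, x):
    # h classifies x as positive iff every constraint is '?' or matches.
    return all(c == '?' or c == v for c, v in zip(h, x))

def candidate_elimination(examples):
    n = len(examples[0][0])
    S = None                                 # None stands for the all-phi hypothesis
    G = [tuple('?' for _ in range(n))]
    for x, positive in examples:
        if positive:
            G = [g for g in G if consistent(g, x)]   # drop inconsistent general hypotheses
            if S is None:
                S = list(x)                          # first positive: copy verbatim
            else:                                    # minimally generalize S to cover x
                S = [s if s == v else '?' for s, v in zip(S, x)]
        else:
            new_G = []
            for g in G:
                if not consistent(g, x):             # g already rejects the negative
                    new_G.append(g)
                    continue
                for i in range(n):                   # minimal specializations along S
                    if g[i] == '?' and S[i] != '?' and S[i] != x[i]:
                        spec = list(g)
                        spec[i] = S[i]
                        new_G.append(tuple(spec))
            G = new_G
    return S, G

data = [
    (('sunny', 'warm', 'normal', 'strong', 'warm', 'same'),   True),
    (('sunny', 'warm', 'high',   'strong', 'warm', 'same'),   True),
    (('rainy', 'cold', 'high',   'strong', 'warm', 'change'), False),
    (('sunny', 'warm', 'high',   'strong', 'cool', 'change'), True),
]
S, G = candidate_elimination(data)
print(S)   # ['sunny', 'warm', '?', 'strong', '?', '?']
print(G)   # [('sunny', '?', '?', '?', '?', '?'), ('?', 'warm', '?', '?', '?', '?')]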
The Candidate Elimination Algorithm (CEA) is an improvement over the Find-S
algorithm for classification tasks. While CEA shares some similarities with Find-
S, it also has some essential differences that offer advantages and disadvantages.
Here are some advantages and disadvantages of CEA in comparison with Find-S:
Advantages of CEA over Find-S:
1. Improved accuracy: CEA considers both positive and negative examples to
generate the hypothesis, which can result in higher accuracy when dealing with
noisy or incomplete data.
2. Flexibility: CEA can handle more complex classification tasks, such as those
with multiple classes or non-linear decision boundaries.
3. More efficient: CEA reduces the number of hypotheses by generating a set of
general hypotheses and then eliminating them one by one. This can result in
faster processing and improved efficiency.
4. Better handling of continuous attributes: CEA can handle continuous attributes
by creating boundaries for each attribute, which makes it more suitable for a
wider range of datasets.
Disadvantages of CEA in comparison with Find-S:
1. More complex: CEA is a more complex algorithm than Find-S, which may
make it more difficult for beginners or those without a strong background in
machine learning to use and understand.
2. Higher memory requirements: CEA requires more memory to store the set of
hypotheses and boundaries, which may make it less suitable for memory-
constrained environments.
3. Slower processing for large datasets: CEA may become slower for larger
datasets due to the increased number of hypotheses generated.
4. Higher potential for overfitting: The increased complexity of CEA may make
it more prone to overfitting on the training data, especially if the dataset is
small or has a high degree of noise.
