ML Unit-1 Notes
LEARNING
The key concept that we will need to think about for our machines is learning from data, learning
from experience. Hopefully, we all agree that humans and other animals can display behaviours
that we label as intelligent by learning from experience. Learning is what gives us flexibility in
our life; the fact that we can adjust and adapt to new circumstances, and learn new tricks, no
matter how old a dog we are! The important parts of animal learning for this book are
remembering, adapting, and generalizing: recognizing that last time we were in this situation
(saw this data) we tried out some particular action (gave this output) and it worked (was correct).
The last word, generalizing, is about recognizing similarity between different situations, so that
things that applied in one place can be used in another. This is what makes learning useful,
because we can use our knowledge in lots of different places.
Of course, there are plenty of other bits to intelligence, such as reasoning, and logical deduction,
but we won’t worry too much about those. We are interested in the most fundamental parts of
intelligence—learning and adapting—and how we can model them in a computer. There has also
been a lot of interest in making computers reason and deduce facts. This was the basis of most
early Artificial Intelligence, and is sometimes known as symbolic processing because the
computer manipulates symbols that reflect the environment. In contrast, machine learning
methods are sometimes called sub-symbolic because no symbols or symbolic manipulation are
involved.
Machine Learning
Machine learning, then, is about making computers modify or adapt their actions (whether these
actions are making predictions, or controlling a robot) so that these actions get more accurate,
where accuracy is measured by how well the chosen actions reflect the correct ones. Imagine that
you are playing Scrabble (or some other game) against a computer. You might beat it every time
in the beginning, but after lots of games it starts beating you, until finally you never win. Either
you are getting worse, or the computer is learning how to win at Scrabble. Having learnt to beat
you, it can go on and use the same strategies against other players, so that it doesn’t start from
scratch with each new player; this is a form of generalization.
The inherent multi-disciplinarity of machine learning has long been recognized: it merges ideas
from neuroscience and biology, statistics, mathematics, and physics to make computers learn.
The computational complexity of the machine learning methods will also be of interest to us
since what we are producing is algorithms. It is particularly important because we might want to
use some of the methods on very large datasets, so algorithms that have high degree polynomial
complexity in the size of the dataset (or worse) will be a problem. The complexity is often
broken into two parts: the complexity of training, and the complexity of applying the trained
algorithm. Training does not happen very often, and is not usually time critical, so it can take
longer. However, we often want a decision about a test point quickly, and there are potentially
lots of test points when an algorithm is in use, so this needs to have low computational cost.
Definition of Machine Learning: “A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P, if its performance at tasks in T, as
measured by P, improves with experience E.” (Tom M. Mitchell)
Supervised learning A training set of examples with the correct responses (targets) is provided
and, based on this training set, the algorithm generalises to respond correctly to all possible
inputs. This is also called learning from exemplars.
Unsupervised learning Correct responses are not provided, but instead the algorithm tries to
identify similarities between the inputs so that inputs that have something in common are
categorized together. The statistical approach to unsupervised learning is known as density
estimation.
Reinforcement learning This is somewhere between supervised and unsupervised learning. The
algorithm gets told when the answer is wrong, but does not get told how to correct it. It has to
explore and try out different possibilities until it works out how to get the answer right.
Reinforcement learning is sometimes called learning with a critic because of this monitor that
scores the answer, but does not suggest improvements.
SUPERVISED LEARNING
Supervised learning, as the name indicates, has the presence of a supervisor as a teacher.
Basically, supervised learning is when we teach or train the machine using data that is well
labeled, which means some of the data is already tagged with the correct answer. After that, the
machine is provided with a new set of examples (data) so that the supervised learning algorithm
analyses the training data (the set of training examples) and produces a correct outcome from
labeled data.
For instance, suppose you are given a basket filled with different kinds of fruits. Now the first
step is to train the machine with all different fruits one by one like this:
If the shape of the object is rounded with a depression at the top, and it is red in color, then it
will be labeled as Apple.
If the shape of the object is a long curving cylinder with a green-yellow color, then it will
be labeled as Banana.
Now suppose that, after training, you are given a new fruit from the basket, say a banana, and
asked to identify it.
Since the machine has already learned from the previous data, it now has to use that knowledge:
it first classifies the fruit by its shape and color, confirms the fruit name as BANANA, and puts
it in the banana category. Thus the machine learns from the training data (the basket of fruits)
and then applies that knowledge to the test data (the new fruit).
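To make this concrete, here is a minimal sketch of the fruit example as a supervised classifier. The numeric encoding of shape and colour, the tiny training set, and the use of scikit-learn's DecisionTreeClassifier are illustrative assumptions, not part of the original example.

```python
# A minimal sketch of the fruit example as a supervised classifier.
# Hypothetical feature encoding: shape (0 = rounded with depression,
# 1 = long cylinder) and colour (0 = red, 1 = green-yellow).
from sklearn.tree import DecisionTreeClassifier

# Labeled training data: each row is (shape, colour), each target a fruit name.
X_train = [[0, 0], [0, 0], [1, 1], [1, 1]]
y_train = ["Apple", "Apple", "Banana", "Banana"]

clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)        # "training the machine" on labeled fruit

# A new, unlabeled fruit from the basket: long cylinder, green-yellow.
print(clf.predict([[1, 1]]))     # -> ['Banana']
```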
Supervised learning is classified into two categories of algorithms:
Classification: a classification problem is when the output variable is a category, such as
“apple” or “banana”.
Regression: a regression problem is when the output variable is a real value, such as “price”
or “weight”.
Unsupervised learning
Unsupervised learning is the training of a machine using information that is neither classified
nor labeled and allowing the algorithm to act on that information without guidance. Here the
task of the machine is to group unsorted information according to similarities, patterns, and
differences without any prior training of data.
Unlike supervised learning, no teacher is provided, which means no training will be given to the
machine. Therefore the machine is restricted to finding the hidden structure in unlabeled data by
itself.
For instance, suppose the machine is given an image containing both dogs and cats that it has
never seen before. The machine has no idea of the features of dogs and cats, so it cannot
categorize the image as ‘dogs and cats’. But it can categorize the animals according to their
similarities, patterns, and differences: it can easily split the picture into two parts, the first
containing all the images with dogs and the second containing all the images with cats. Here it
did not learn anything beforehand, which means there was no training data or examples.
It allows the model to work on its own to discover patterns and information that were
previously undetected. It mainly deals with unlabelled data.
Unsupervised learning is classified into two categories of algorithms:
Clustering: A clustering problem is where you want to discover the inherent groupings in
the data, such as grouping customers by purchasing behavior.
Association: An association rule learning problem is where you want to discover rules that
describe large portions of your data, such as people that buy X also tend to buy Y.
Types of Unsupervised Learning:
Clustering approaches:
1. Exclusive (partitioning)
2. Agglomerative
3. Overlapping
4. Probabilistic
Commonly used unsupervised algorithms:
1. Hierarchical clustering
2. K-means clustering (see the sketch below)
3. Principal Component Analysis
4. Singular Value Decomposition
5. Independent Component Analysis
(Strictly, the last three are dimensionality-reduction rather than clustering techniques, but they
are standard unsupervised methods.)
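As an illustration of clustering, the sketch below runs K-means on some made-up two-dimensional data with scikit-learn; the synthetic blobs and the choice of two clusters are assumptions for demonstration only.

```python
# A minimal clustering sketch: K-means groups unlabeled points by similarity.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabeled data: two loose blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (20, 2)),    # blob around (0, 0)
               rng.normal(5, 0.5, (20, 2))])   # blob around (5, 5)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster assignment for each point
print(kmeans.cluster_centers_)  # learned group centres
```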
Supervised vs. Unsupervised Machine Learning
Computational complexity: supervised learning generally uses simpler methods, whereas
unsupervised learning is computationally more complex.
The Brain and the Neuron
In animals, learning occurs within the brain. If we can understand how the brain works, then
there might be things in there for us to copy and use for our machine learning systems. While the
brain is an impressively powerful and complicated system, the basic building blocks that it is
made up of are fairly simple and easy to understand.
In computational terms the brain does exactly what we want. It deals with noisy and even
inconsistent data, and produces answers that are usually correct from very high dimensional data
(such as images) very quickly. All amazing for something that weighs about 1.5 kg and is losing
parts of itself all the time (neurons die as you age at impressive/depressing rates), but its
performance does not degrade appreciably (in the jargon, this means it is robust).
At the most basic level are the processing units of the brain: nerve cells called neurons. There
are lots of them (100 billion = 10^11 is the figure that is often given) and they
come in lots of different types, depending upon their particular task. However, their general
operation is similar in all cases: transmitter chemicals within the fluid of the brain raise or lower
the electrical potential inside the body of the neuron. If this membrane potential reaches some
threshold, the neuron spikes or fires, and a pulse of fixed strength and duration is sent down the
axon. The axons divide (arborise) into connections to many other neurons, connecting to each of
these neurons in a synapse. Each neuron is typically connected to thousands of other neurons, so
that it is estimated that there are about 100 trillion (= 10^14) synapses within the brain. After
firing, the neuron must wait for some time to recover its energy (the refractory period) before it
can fire again.
Each neuron can be viewed as a separate processor, performing a very simple computation:
deciding whether or not to fire. This makes the brain a massively parallel computer made up of
10^11 processing elements. If that is all there is to the brain, then we should be able to model it
inside a computer and end up with animal or human intelligence inside a computer. This is the
view of strong AI.
We do want to make programs that learn. So how does learning occur in the brain? The principal
concept is plasticity: modifying the strength of synaptic connections between neurons, and
creating new connections. We don’t know all of the mechanisms by which the strength of these
synapses gets adapted, but one method that does seem to be used was first postulated by Donald
Hebb in 1949.
Hebb’s Rule
Hebb’s rule says that the changes in the strength of synaptic connections are proportional to the
correlation in the firing of the two connecting neurons. So if two neurons consistently fire
simultaneously, then any connection between them will change in strength, becoming stronger.
However, if the two neurons never fire simultaneously, the connection between them will die
away. The idea is that if two neurons both respond to something, then they should be connected.
Let’s see a trivial example: suppose that you have a neuron somewhere that recognizes your
grandmother (this will probably get input from lots of visual processing neurons, but don’t worry
about that). Now if your grandmother always gives you a chocolate bar when she comes to visit,
then some neurons, which are happy because you like the taste of chocolate, will also be
stimulated. Since these neurons fire at the same time, they will be connected together, and the
connection will get stronger over time. So eventually, the sight of your grandmother, even in a
photo, will be enough to make you think of chocolate. Sound familiar? Pavlov used this idea,
called classical conditioning, to train his dogs so that when food was shown to the dogs and the
bell was rung at the same time, the neurons for salivating over the food and hearing the bell fired
simultaneously, and so became strongly connected. Over time, the strength of the synapse
between the neurons that responded to hearing the bell and those that caused the salivation reflex
was enough that just hearing the bell caused the salivation neurons to fire in sympathy. There are
other names for this idea that synaptic connections between neurons and assemblies of neurons
can be formed when they fire together and can become stronger. It is also known as long-term
potentiation and neural plasticity, and it does appear to have correlates in real brains.
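A toy sketch of Hebb's rule as a weight update, delta_w = eta * x * y: the learning rate eta and the activity patterns below are hypothetical, chosen only to show that a connection between co-active units strengthens while the others do not.

```python
# A minimal sketch of Hebb's rule: the weight change is proportional to the
# correlation between pre- and post-synaptic activity, delta_w = eta * x * y.
import numpy as np

eta = 0.1                         # learning rate (hypothetical value)
w = np.zeros(2)                   # weights from two input neurons

# Pre-synaptic activities x and post-synaptic activity y observed together.
for x, y in [(np.array([1.0, 0.0]), 1.0),   # input 0 fires with the output...
             (np.array([1.0, 0.0]), 1.0),
             (np.array([0.0, 1.0]), 0.0)]:  # ...input 1 never does
    w += eta * x * y              # neurons that fire together wire together

print(w)  # -> [0.2, 0.0]: the co-active connection strengthened
```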
FIGURE 3.1 A picture of McCulloch and Pitts’ mathematical model of a neuron. The inputs xi
are multiplied by the weights wi, and the neuron sums these values. If this sum is greater than
the threshold θ then the neuron fires; otherwise it does not.
A picture of their model is given in Figure 3.1, and we’ll use the picture to write down a
mathematical description. On the left of the picture are a set of input nodes (labeled x1, x2, . . .
xm). These are given some values, and as an example we’ll assume that there are three inputs,
with x1 = 1, x2 = 0, x3 = 0.5. In real neurons those inputs come from the outputs of other
neurons. So the 0 means that a neuron didn’t fire, the 1 means it did, and the 0.5 has no
biological meaning, but never mind.
Each of these other neuronal firings flowed along a synapse to arrive at our neuron, and those
synapses have strengths, called weights. The strength of the synapse affects the strength of the
signal, so we multiply the input by the weight of the synapse (so we get x1 × w1 and x2 × w2,
etc.). Now when all of these signals arrive into our neuron, it adds them up to see if there is
enough strength to make it fire.
We’ll write that as
h = w1x1 + w2x2 + … + wmxm = Σ wixi,
which just means sum (add up) all the inputs multiplied by their synaptic weights. I’ve assumed
that there are m of them, where m = 3 in the example. If the synaptic weights are w1 = 1, w2 =
−0.5, w3 = −1, then the input to our model neuron is h = 1 × 1 + 0 × (−0.5) + 0.5 × (−1) = 1 + 0
− 0.5 = 0.5. Now the neuron needs to decide if it is going to fire. For a real neuron, this is a
question of whether the membrane potential is above some threshold. We’ll pick a threshold
value (labeled θ), say θ = 0 as an example. Now, does our neuron fire? Well, h = 0.5 in the
example, and 0.5 > 0, so the neuron does fire, and produces output 1. If the neuron did not fire, it
would produce output 0.
The McCulloch and Pitts neuron is a binary threshold device. It sums up the inputs (multiplied
by the synaptic strengths or weights) and either fires (produces output 1) or does not fire
(produces output 0) depending on whether the input is above some threshold. We can write the
second half of the work of the neuron, the decision about whether or not to fire (which is known
as an activation function), as:
o = g(h) = 1 if h > θ, and 0 if h ≤ θ.
Note that the weights wi can be positive or negative. This corresponds to excitatory and
inhibitory connections that make neurons more likely to fire and less likely to fire, respectively.
Both of these types of synapses do exist within the brain, but with the McCulloch and Pitts
neurons, the weights can change from positive to negative or vice versa, which has not been seen
biologically—synaptic connections are either excitatory or inhibitory, and never change from
one to the other. Additionally, real neurons can have synapses that link back to themselves in a
feedback loop, but we do not usually allow that possibility when we make networks of neurons.
Again, there are exceptions, but we won’t get into them. It is possible to improve the model to
include many of these features, but the picture is complicated enough already, and McCulloch
and Pitts neurons already provide a great deal of interesting behaviour that resembles the action
of the brain, such as the fact that networks of McCulloch and Pitts neurons can memorise
pictures and learn to represent functions and classify data.
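Here is a minimal sketch of the McCulloch and Pitts neuron described above, wired up with the worked example values from the text (x = (1, 0, 0.5), w = (1, −0.5, −1), θ = 0):

```python
# A sketch of the McCulloch and Pitts binary threshold neuron.
def mcculloch_pitts(x, w, theta):
    """Sum the weighted inputs and compare the total to the threshold."""
    h = sum(xi * wi for xi, wi in zip(x, w))   # h = sum_i w_i * x_i
    return 1 if h > theta else 0               # fire (1) or not (0)

# Worked example from the text: h = 1*1 + 0*(-0.5) + 0.5*(-1) = 0.5 > 0.
print(mcculloch_pitts([1, 0, 0.5], [1, -0.5, -1], 0))  # -> 1 (the neuron fires)
```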
Linear Discriminant Analysis is a statistical method of dimensionality reduction that provides
the highest possible discrimination among various classes. It is used in machine learning to find
the linear combination of features that can separate two or more classes of objects with the best
performance. It has been widely used in many applications, such as pattern recognition, image
retrieval, and speech recognition. The method is based on discriminant functions that are
estimated from a set of data called the training set. These discriminant functions are linear with
respect to the feature vector, and usually have the form
f(x) = wᵀx + b0,
where w represents the weight vector, x the feature vector, and b0 a threshold (bias).
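To make the discriminant function concrete, here is a small sketch that classifies points by the sign of f(x) = wᵀx + b0. The weight vector and bias below are hypothetical; in practice they would be estimated from the training set (for example with scikit-learn's LinearDiscriminantAnalysis).

```python
# A minimal sketch of a linear discriminant: the class is decided by the sign
# of f(x) = w^T x + b0. The weights here are hypothetical, not fitted values.
import numpy as np

w = np.array([2.0, -1.0])   # weight vector (hypothetical)
b0 = -0.5                   # threshold / bias (hypothetical)

def discriminant(x):
    return w @ x + b0       # f(x) = w^T x + b0

for x in [np.array([1.0, 0.5]), np.array([0.0, 2.0])]:
    label = 1 if discriminant(x) > 0 else 0
    print(x, "->", label)   # (1, 0.5) -> 1, (0, 2) -> 0
```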
The FIND-S Algorithm is used to find the maximally specific hypothesis: running Find-S gives a
single maximally specific hypothesis consistent with the given set of training examples.
FIND-S Algorithm:
Step 1: Initialize h to the most specific hypothesis in H, e.g. h0 = (ø, ø, ø, ø, ø, ø, ø).
Step 2: For each positive training instance x, and for each attribute constraint ai in h: if the
constraint is satisfied by x, do nothing; otherwise replace ai by the next more general constraint
that is satisfied by x.
Step 3: Output hypothesis h.
1. How many possible examples are there in the instance space?
Solution: 2 * 3 * 2 * 2 * 3 = 72
2. How many hypotheses can be expressed by the hypothesis language?
Solution: 4 * 5 * 4 * 4 * 5 = 1600
Semantically Distinct Hypothesis = ( 3 * 4 * 3 * 3 * 4 ) + 1 = 433
3. Apply the FIND-S algorithm by hand on the given training set. Consider the examples in the
specified order and write down your hypothesis each time after observing an example.
Step 1:
h0 = (ø, ø, ø, ø, ø)
Step 2:
X1 = (some, small, no, expensive, many) – No
h1 = (ø, ø, ø, ø, ø)
(The intermediate examples X2–X5 are not reproduced in these notes; processing the positive
ones generalizes the hypothesis step by step to)
h5 = (many, ?, no, ?, ?)
Step 3:
Final Maximally Specific Hypothesis is:
h5 = (many, ?, no, ?, ?)
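A sketch of FIND-S in Python follows. Since the full training table is not reproduced in these notes, the positive examples below are hypothetical, chosen only so that the run finishes at the same final hypothesis (many, ?, no, ?, ?).

```python
# A sketch of FIND-S for conjunctive hypotheses over attribute tuples.
def find_s(examples):
    h = None                                   # h0 = (0, 0, ..., 0): matches nothing
    for x, label in examples:
        if label != "Yes":
            continue                           # FIND-S ignores negative examples
        if h is None:
            h = list(x)                        # first positive: copy it exactly
        else:
            # Generalize each attribute that disagrees with the example to '?'.
            h = [hi if hi == xi else "?" for hi, xi in zip(h, x)]
    return h

train = [
    (("some", "small", "no", "expensive", "many"), "No"),   # X1 (from the notes)
    (("many", "big",   "no", "expensive", "one"),  "Yes"),  # hypothetical
    (("many", "small", "no", "cheap",     "many"), "Yes"),  # hypothetical
]
print(find_s(train))  # -> ['many', '?', 'no', '?', '?']
```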
Version Space and List-Then-Eliminate Algorithm
A hypothesis h is said to be consistent with a set of training examples D iff h(x) = c(x) for
each example ⟨x, c(x)⟩ in D.
For Example:
h1 = (?, ?, No, ?, Many) – Consistent Hypothesis as it is consistent with all the training
examples
h2 = (?, ?, No, ?, ?) – Inconsistent Hypothesis as it is inconsistent with first training example
Version Space
The version space VS(H,D) is the subset of the hypotheses from H consistent with all the
training examples in D.
List-Then-Eliminate algorithm:
1. VersionSpace ← a list containing every hypothesis in H.
2. For each training example ⟨x, c(x)⟩: remove from VersionSpace any hypothesis h for which
h(x) ≠ c(x).
3. Output the list of hypotheses in VersionSpace.
Example:
F1 → A, B
F2 → X, Y
Here F1 and F2 are two features (attributes) with two possible values for each feature or
attribute.
Instance Space: (A, X), (A, Y), (B, X), (B, Y) – 4 Examples
Hypothesis Space: (A, X), (A, Y), (A, ø), (A, ?), (B, X), (B, Y), (B, ø), (B, ?), (ø, X), (ø, Y),
(ø, ø), (ø, ?), (?, X), (?, Y), (?, ø), (?, ?) – 16 hypotheses
Semantically Distinct Hypotheses: (A, X), (A, Y), (A, ?), (B, X), (B, Y), (B, ?), (?, X), (?, Y),
(?, ?), (ø, ø) – 10
Initial Version Space (before seeing any training examples): all 10 semantically distinct
hypotheses listed above.
Training Instances:
F1 F2 Target
A X Yes
A Y Yes
Applying List-Then-Eliminate, every hypothesis that misclassifies one of these two positive
examples is removed, leaving the version space {(A, ?), (?, ?)}.
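The following sketch enumerates the ten semantically distinct hypotheses for this toy problem and applies List-Then-Eliminate to the two training instances above:

```python
# A sketch of List-Then-Eliminate on the two-feature example above.
from itertools import product

F1, F2 = ["A", "B"], ["X", "Y"]

def covers(h, x):
    """A conjunctive hypothesis covers x if each constraint is '?' or matches."""
    return all(hc == "?" or hc == xc for hc, xc in zip(h, x))

# Semantically distinct hypotheses: each attribute is a value or '?', plus (0, 0).
hypotheses = list(product(F1 + ["?"], F2 + ["?"])) + [("0", "0")]

train = [(("A", "X"), True), (("A", "Y"), True)]

version_space = [h for h in hypotheses
                 if all(covers(h, x) == label for x, label in train)]
print(version_space)  # -> [('A', '?'), ('?', '?')]
```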
The Candidate Elimination Algorithm is used to find the set of all consistent hypotheses, that is,
the version space.
Algorithm:
Initialize S to the set of maximally specific hypotheses and G to the set of maximally general
hypotheses in H. For each training example d:
If d is a positive example:
Remove from G any hypothesis inconsistent with d. For each hypothesis s in S that is not
consistent with d, remove s from S and add to S all minimal generalizations of s that are
consistent with d and have some member of G more general than them.
If d is a negative example:
Remove from S any hypothesis inconsistent with d. For each hypothesis g in G that is not
consistent with d, remove g from G and add to G all minimal specializations of g that are
consistent with d and have some member of S more specific than them.
Solution:
The first example is positive, the hypothesis at the specific boundary is inconsistent, hence we
extend the specific boundary, and the hypothesis at the generic boundary is consistent hence we
retain it.
The second example is positive; again the hypothesis at the specific boundary is inconsistent,
hence we extend the specific boundary, and the hypothesis at the generic boundary is consistent
hence we retain it.
The third example is negative, the hypothesis at the specific boundary is consistent, hence we
retain it, and hypothesis at the generic boundary is inconsistent hence we write all consistent
hypotheses by removing one “?” (question mark) at a time.
Second example of the Candidate Elimination Algorithm:
Solution:
The first example is negative, the hypothesis at the specific boundary is consistent, hence we
retain it, and the hypothesis at the generic boundary is inconsistent hence we write all consistent
hypotheses by removing one “?” at a time.
S1: (0, 0, 0)
G1: (Small, ?, ?), (?, Blue, ?), (?, ?, Triangle)
The second example is negative, the hypothesis at the specific boundary is consistent, hence we
retain it, and the hypothesis at the generic boundary is inconsistent hence we write all consistent
hypotheses by removing one “?” at a time.
S2: (0, 0, 0)
G2: (Small, Blue, ?), (Small, ?, Circle), (?, Blue, ?), (Big, ?, Triangle), (?, Blue, Triangle)
The third example is positive, the hypothesis at the specific boundary is inconsistent, hence we
extend the specific boundary, and the consistent hypothesis at the generic boundary is retained
and inconsistent hypotheses are removed from the generic boundary.
The fourth example is negative, the hypothesis at the specific boundary is consistent, hence we
retain it, and the hypothesis at the generic boundary is inconsistent hence we write all consistent
hypotheses by removing one “?” at a time.
The fifth example is positive, the hypothesis at the specific boundary is inconsistent, hence we
extend the specific boundary, and the consistent hypothesis at the generic boundary is retained
and inconsistent hypotheses are removed from the generic boundary.
Learned Version Space by Candidate Elimination Algorithm for given data set is:
S5 = G5 = (Small, ?, Circle); the specific and general boundaries have converged to a single
hypothesis.
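Below is a simplified sketch of the Candidate Elimination algorithm for conjunctive hypotheses. The training table for this example is not reproduced in the notes, so the dataset in the code is a hypothetical reconstruction chosen to be consistent with the boundary sets S2 and G2 and the final converged hypothesis shown above; treat it as illustrative, not as the notes' original data.

```python
# A simplified Candidate Elimination sketch for conjunctive hypotheses.
def covers(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(hc == "?" or hc == xc for hc, xc in zip(h, x))

def more_general(h1, h2):
    """True if h1 is more general than or equal to h2."""
    return all(a == "?" or a == b for a, b in zip(h1, h2))

domains = [("Big", "Small"), ("Red", "Blue"), ("Circle", "Triangle")]
S = [None]                    # None stands for the all-0, most specific hypothesis
G = [("?", "?", "?")]         # the most general hypothesis

data = [                      # hypothetical reconstruction of the training set
    (("Big", "Red", "Circle"), False),
    (("Small", "Red", "Triangle"), False),
    (("Small", "Blue", "Circle"), True),
    (("Small", "Blue", "Triangle"), False),
    (("Small", "Red", "Circle"), True),
]

for x, positive in data:
    if positive:
        G = [g for g in G if covers(g, x)]          # drop inconsistent general hyps
        new_S = []
        for s in S:
            if s is None:
                new_S.append(tuple(x))              # minimal generalization of all-0
            elif covers(s, x):
                new_S.append(s)
            else:                                   # generalize mismatches to '?'
                new_S.append(tuple(si if si == xi else "?"
                                   for si, xi in zip(s, x)))
        S = [s for s in new_S if any(more_general(g, s) for g in G)]
    else:
        S = [s for s in S if s is None or not covers(s, x)]
        new_G = []
        for g in G:
            if not covers(g, x):
                new_G.append(g)                     # already consistent: retain
                continue
            for i, dom in enumerate(domains):       # minimal specializations of g
                if g[i] != "?":
                    continue
                for v in dom:
                    if v != x[i]:
                        h = g[:i] + (v,) + g[i + 1:]
                        if any(s is None or more_general(h, s) for s in S):
                            new_G.append(h)
        G = new_G

print("S:", S)   # -> S: [('Small', '?', 'Circle')]
print("G:", G)   # -> G: [('Small', '?', 'Circle')]
```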
Linear Regression
In machine learning, linear regression is a supervised learning technique that models a linear
relationship between a dependent variable and one or more independent variables.
The sloped straight line representing the linear relationship that best fits the given data is
called the regression line.
It is also called the best fit line.
In simple linear regression, the dependent variable depends only on a single independent variable.
Y = β0 + β1X
Here,
Y is a dependent variable.
X is an independent variable.
β0 and β1 are the regression coefficients.
β0 is the intercept or the bias that fixes the offset to a line.
β1 is the slope or weight that specifies the factor by which X has an impact on Y.
Case-01: β1 < 0
It indicates that variable X has a negative impact on Y.
If X increases, Y will decrease, and vice-versa.
Case-02: β1 = 0
It indicates that variable X has no impact on Y.
If X changes, there will be no change in Y.
Case-03: β1 > 0
It indicates that variable X has a positive impact on Y.
If X increases, Y will increase, and vice-versa.
In multiple linear regression, the dependent variable depends on more than one independent
variable.
For multiple linear regression, the form of the model is
Y = β0 + β1X1 + β2X2 + …. + βnXn
Here,
Y is a dependent variable.
X1, X2, …., Xn are independent variables.
β0, β1,…, βn are the regression coefficients.
βj (1 ≤ j ≤ n) is the slope or weight that specifies the factor by which Xj has an impact on Y.
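As a quick illustration, the sketch below fits a simple linear regression with the closed-form least-squares estimates (β1 = cov(X, Y)/var(X), β0 = mean(Y) − β1 · mean(X)) and then fits the multiple-regression form with numpy's least-squares solver. The data is synthetic, generated from hypothetical true coefficients β0 = 1 and β1 = 2.

```python
# A minimal sketch of fitting a linear regression by ordinary least squares.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 50)
Y = 1.0 + 2.0 * X + rng.normal(0, 0.5, 50)   # true beta0 = 1, beta1 = 2

# Simple linear regression: beta1 = cov(X, Y) / var(X), beta0 = mean(Y) - beta1 * mean(X).
beta1 = np.cov(X, Y, bias=True)[0, 1] / np.var(X)
beta0 = Y.mean() - beta1 * X.mean()
print(beta0, beta1)   # close to 1 and 2

# Multiple linear regression Y = beta0 + beta1*X1 + ... + betan*Xn via least squares;
# here there is one regressor, so the design matrix is just [1, X].
A = np.column_stack([np.ones_like(X), X])    # column of ones carries the intercept
coeffs, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(coeffs)         # [beta0, beta1], the same fit as above
```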