Machine Learning Techniques (KCS 055)
Unit 2
Regression Algorithm
House Price
Challenges in guessing the House price
Predicting the price with the help of ML model
Regression Model
Simple Linear Regression
Y = a + bX
• Y – Dependent variable
• X – Independent variable
• a – Y-intercept (the value of Y when X is 0)
• b – Slope (how much Y changes for a unit change in X)
Linear Regression
Area (sq. feet)   Price (in Lakhs)
100               10
200               20
300               30
[Scatter plot of Price (in Lakhs) vs. Area (sq. feet) for these data points]
Linear Regression
Y = a + bX
Y -> Price
X -> Area
[Scatter plot of Price vs. Area with the fitted regression line]
Linear Regression
Slope (b) = Sum of product of deviations / Sum of squares of deviation for X
Y-intercept (a) = Mean of Y – (b * Mean of X)

Area (X)    Price (Y)   Mean    Mean    Deviation (X)      Deviation (Y)    Product of    Square of
(sq. feet)  (Lakhs)     of X    of Y    X – mean(X)        Y – mean(Y)      Deviations    Deviation for X
100         10          200     20      100 – 200 = -100   10 – 20 = -10    1000          10,000
200         20                          200 – 200 = 0      20 – 20 = 0      0             0
300         30                          300 – 200 = 100    30 – 20 = 10     1000          10,000

Slope (b) = (1000 + 0 + 1000) / (10,000 + 0 + 10,000) = 2000 / 20,000 = 0.1
Y-intercept (a) = 20 – (0.1 * 200) = 0
So the fitted line is Price = 0.1 * Area.
[Scatter plot of Price vs. Area with outliers highlighted]
Outliers
An observation that lies an abnormal distance from other
values in a random sample from a population
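A small Python sketch of one common convention for flagging such points, the 1.5 * IQR rule (the choice of rule and the sample numbers below are illustrative, not from the slides):

```python
import numpy as np

def iqr_outliers(values):
    """Flag points more than 1.5 * IQR outside the inter-quartile range."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

prices = [10, 12, 11, 13, 12, 45]   # 45 lies far from the rest
print(iqr_outliers(prices))          # [45]
```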
Predict the price of a pizza whose diameter is 20 inches.

Diameter (inches)   Price (Dollar)
8                   10
10                  13
12                  16
Diameter (X)   Price (Y)   Mean    Mean    Deviation (X)   Deviation (Y)    Product of    Square of
(inches)       (Dollar)    of X    of Y    X – mean(X)     Y – mean(Y)      Deviations    Deviation for X
8              10          10      13      8 – 10 = -2     10 – 13 = -3     6             4
10             13                          10 – 10 = 0     13 – 13 = 0      0             0
12             16                          12 – 10 = 2     16 – 13 = 3      6             4

Slope (b) = Sum of product of deviations / Sum of squares of deviation for X = 12 / 8 = 1.5
Y-intercept (a) = Mean of Y – (b * Mean of X) = 13 – (1.5 * 10) = -2

Price when X is 20:
Price = a + bX = -2 + 1.5 * 20 = 28
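A minimal Python sketch of the same least-squares calculation using the deviation formulas above; the data and the 20-inch query come from the slide, while the function name is only illustrative:

```python
def fit_simple_linear(xs, ys):
    """Least-squares slope and intercept from the deviation formulas."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of product of deviations and sum of squared deviations for X
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx                     # slope
    a = mean_y - b * mean_x           # y-intercept
    return a, b

diameters = [8, 10, 12]               # inches
prices = [10, 13, 16]                 # dollars
a, b = fit_simple_linear(diameters, prices)
print(a, b)                           # -2.0, 1.5
print(a + b * 20)                     # predicted price of a 20-inch pizza: 28.0
```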
[Chart "Pizza Price": the data points and the fitted line Price = -2 + 1.5 * Diameter, plotted for diameters 0 to 25 inches]
The world is not so linear
Multiple Linear Regression
• When the data has more than one independent variable.

Y = a + b1X1 + b2X2 + b3X3 + ………… + bnXn
Dataset
Use the following steps to fit a multiple linear
regression model to this dataset.
In our example, it is
Y = -6.867 + 3.148x1 – 1.656x2
Matrix Approach
Coefficients = ((XᵀX)⁻¹Xᵀ)Y

Dataset:
Y     X1    X2
1     1     4
6     2     5
8     3     8
12    4     2

       1  1  4                       1  1  1  1
X  =   1  2  5     (4x3)      Xᵀ  =  1  2  3  4     (3x4)
       1  3  8                       4  5  8  2
       1  4  2

       1
Y  =   6           (4x1)
       8
       12

Since XᵀX is 3x3, Xᵀ is 3x4 and Y is 4x1, the result ((XᵀX)⁻¹Xᵀ)Y is a 3x1 vector of coefficients.
Matrix Approach
Coefficients = ((XᵀX)⁻¹Xᵀ)Y

         1  1  1  1     1  1  4         4   10    19
XᵀX  =   1  2  3  4  *  1  2  5   =    10   30    46
         4  5  8  2     1  3  8        19   46   109
                        1  4  2

             3.15    −0.59    −0.30
(XᵀX)⁻¹  =  −0.59     0.20     0.016
            −0.30     0.016    0.054
Matrix Approach
Coefficients = ((XᵀX)⁻¹Xᵀ)Y

                 3.15    −0.59    −0.30       1  1  1  1
(XᵀX)⁻¹Xᵀ  =    −0.59     0.20     0.016   *  1  2  3  4
                −0.30     0.016    0.054      4  5  8  2

                 1.36      0.47    −1.02     0.19
(XᵀX)⁻¹Xᵀ  =    −0.32     −0.098    0.155    0.26
                −0.065     0.005    0.185   −0.125
Matrix Approach
Coefficients = ((XᵀX)⁻¹Xᵀ)Y

                      1.36      0.47    −1.02     0.19        1
((XᵀX)⁻¹Xᵀ)Y  =      −0.32     −0.098    0.155    0.26    *   6
                     −0.065     0.005    0.185   −0.125       8
                                                              12

                     −1.69       b0
((XᵀX)⁻¹Xᵀ)Y  =       3.48   =   b1
                     −0.05       b2

b0 = -1.69, b1 = 3.48, b2 = -0.05
Matrix Approach
Coefficients = ((XᵀX)⁻¹Xᵀ)Y

So, the coefficients are:
b0 = -1.69, b1 = 3.48, b2 = -0.05

Y = b0 + b1X1 + b2X2
Y = -1.69 + 3.48X1 – 0.05X2
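A short NumPy sketch of the same normal-equation computation on the dataset above (NumPy is an assumption; the slides do not name a library). The exact values differ slightly from the rounded slide figures:

```python
import numpy as np

# Design matrix with a leading column of ones (intercept term) and the target vector
X = np.array([[1, 1, 4],
              [1, 2, 5],
              [1, 3, 8],
              [1, 4, 2]], dtype=float)
Y = np.array([1, 6, 8, 12], dtype=float)

# Normal equation: coefficients = (X^T X)^-1 X^T Y
coeffs = np.linalg.inv(X.T @ X) @ X.T @ Y
print(coeffs)   # approximately [-1.70, 3.48, -0.05]  ->  b0, b1, b2
```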
Polynomial Regression Model
It is an extended version of the simple linear model.
Polynomial
• Zero degree polynomial
  Y = ax^0 = a = Constant
• One degree polynomial
  Y = a + b1x = Simple linear equation
• Two degree polynomial
  Y = a + b1x + b2x^2
• n degree polynomial
  Y = a + b1x + b2x^2 + b3x^3 + ………… + bnx^n
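A minimal sketch of fitting a two-degree polynomial with NumPy's polyfit; the x/y values are made-up illustration data, not from the slides:

```python
import numpy as np

# Hypothetical illustration data following a roughly quadratic trend
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 5.9, 12.2, 20.1, 29.8])

# Fit Y = a + b1*x + b2*x^2; np.polyfit returns the highest-degree coefficient first
b2, b1, a = np.polyfit(x, y, deg=2)
print(a, b1, b2)

# Predict at a new point
x_new = 6.0
print(a + b1 * x_new + b2 * x_new ** 2)
```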
Regression Model
Simple Linear Regression      • Y = a + bX
Multiple Linear Regression    • Y = a + b1X1 + b2X2 + b3X3 + ………… + bnXn
Polynomial Regression         • Y = a + b1X + b2X^2 + b3X^3 + ………… + bnX^n
Assumptions of Linear Regression
1 - Linear relationship
    Between the dependent and independent variables.
2 - Normal distribution of residuals
    The mean of the residuals should be zero.
3 - Very low / no multicollinearity
    The independent variables should not be related to each other.
4 - No auto-correlation
Logistic Regression
Y = σ(a + bX)   where σ is the sigmoid function
Y = 1 / (1 + e^-(a + bX))
Study Hours (X)   Exam Result (Y)
2                 0
3                 0
4                 0
5                 1
6                 1
7                 1
8                 1

• Supervised classification model
• Dependent variable (Y) is categorical or binary (0 or 1)
• Independent variable (X) is continuous
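A small scikit-learn sketch on the study-hours data above (scikit-learn is an assumption here; the slides do not name a library):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Study hours (X) and exam result (Y) from the table above
X = np.array([[2], [3], [4], [5], [6], [7], [8]])
y = np.array([0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

# Probability of passing (class 1) for a student who studies 4.5 hours
print(model.predict_proba([[4.5]])[0][1])
print(model.predict([[4.5]]))          # predicted class (0 or 1)
```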
Linear Regression Vs Logistic Regression
What is error?

Support Vector Machine (SVM)
• Supervised machine learning algorithm.
• Used for binary classification.
• Vectors are the data points.
Basic Concepts in SVM
• Mathematical functions.
• Take data as input and transform it into the required output.
• Different kernel functions are (a small sketch follows below):
  – Linear kernel
  – Polynomial kernel
  – Gaussian kernel
  – Radial Basis Function (RBF) kernel
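A minimal sketch of what these kernels compute for two feature vectors; the gamma, degree and coef0 values are illustrative defaults, not from the slides:

```python
import numpy as np

def linear_kernel(x, z):
    """K(x, z) = x . z"""
    return np.dot(x, z)

def polynomial_kernel(x, z, degree=3, coef0=1.0):
    """K(x, z) = (x . z + coef0)^degree"""
    return (np.dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian / RBF kernel: K(x, z) = exp(-gamma * ||x - z||^2)"""
    return np.exp(-gamma * np.sum((x - z) ** 2))

x = np.array([1.0, 2.0])
z = np.array([2.0, 3.0])
print(linear_kernel(x, z), polynomial_kernel(x, z), rbf_kernel(x, z))
```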
Linear Kernel
Naïve Bayes Classifier
Find the probability of playing tennis on the 15th day using a Naïve Bayes Classifier, where the Outlook is Sunny.
Step-1: Make a frequency table
Outlook     Yes   No
Overcast    5     0
Rainy       2     2
Sunny       3     2
Total       10    4
Step-2: Make Likelihood Table
Outlook     P(Outlook|Yes)   P(Outlook|No)
Overcast    5/10             0
Rainy       2/10             2/4
Sunny       3/10             2/4
Step-3: Apply Bayes' Theorem:

P(A|B) = P(B|A) * P(A) / P(B)

• First, we find the probability of Yes when it is Sunny:
P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)
             = (3/10 * 10/14) / (5/14)
             = 3/5 = 0.60
• Second, we find the probability of No when it is Sunny:

P(No|Sunny) = P(Sunny|No) * P(No) / P(Sunny)
            = (2/4 * 4/14) / (5/14)
            = 2/5 = 0.40
• So, P(Yes|Sunny) > P(No|Sunny), i.e. 0.60 > 0.40.
Therefore, we can say that the player can play tennis on a sunny day.
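A tiny Python sketch of the same Bayes' theorem arithmetic, using the counts from the frequency table above (Fraction is used only to keep the exact ratios):

```python
from fractions import Fraction

# Counts from the frequency table: Sunny days split as 3 Yes / 2 No, totals 10 Yes / 4 No
p_sunny_given_yes = Fraction(3, 10)
p_sunny_given_no = Fraction(2, 4)
p_yes = Fraction(10, 14)
p_no = Fraction(4, 14)
p_sunny = Fraction(5, 14)

# Bayes' theorem: P(class | Sunny) = P(Sunny | class) * P(class) / P(Sunny)
p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
p_no_given_sunny = p_sunny_given_no * p_no / p_sunny

print(float(p_yes_given_sunny))   # 0.6
print(float(p_no_given_sunny))    # 0.4
print("Play" if p_yes_given_sunny > p_no_given_sunny else "Don't play")
```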
A second example: the 14-day Play Tennis dataset
P(Play Tennis = yes) = 9/14 = 0.64
P(Play Tennis = no) = 5/14 = 0.36
[Probability tables for Outlook (Sunny, Overcast, ...), Temperature (hot, mild, ...), Humidity (High, Normal) and Windy (true: 6/14, false: 8/14)]
Naïve Bayes: Advantages & Disadvantages
• Advantages:
  – Fast and easy algorithm.
  – Can be used for binary and multi-class classification.
  – Mostly used for text classification.
• Disadvantages:
  – Cannot learn relationships between features, as it assumes they are independent.
Bayesian Belief Network
• Probabilistic Graphical Model.
• Represents a set of variables and
their conditional dependencies
using a directed acyclic graph.
• Two major components:
• Directed Acyclic Graph (DAG)
• Table of Conditional
Probabilities
Bayesian Belief Network
[Bayesian network: Burglary (B) and Earthquake (E) are parents of Alarm (A); David Calls (D) and Sophia Calls (S) depend on Alarm. Each node has a conditional probability table.]

Calculate the probability that the alarm has sounded, but neither a burglary nor an earthquake has occurred, and David and Sophia have both called Harry.

P(A) = P(A|B,E)P(B)P(E) +
       P(A|B,¬E)P(B)P(¬E) +
       P(A|¬B,E)P(¬B)P(E) +
       P(A|¬B,¬E)P(¬B)P(¬E)
What is the probability that David called?

P(D) = P(D|A)P(A) + P(D|¬A)P(¬A)

P(¬A) = P(¬A|B,E)P(B)P(E) +
        P(¬A|B,¬E)P(B)P(¬E) +
        P(¬A|¬B,E)P(¬B)P(E) +
        P(¬A|¬B,¬E)P(¬B)P(¬E)
• P(A) = 0.00252
• P(¬A) = 0.99748
• P(D) = P(D|A)P(A) + P(D|¬A)P(¬A)
• P(D) = 0.91 * 0.00252 + 0.05 * 0.99748 ≈ 0.0522
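A small Python sketch of this "probability that David called" calculation; P(D|A) = 0.91, P(D|¬A) = 0.05 and P(A) = 0.00252 are the values stated on the slide:

```python
# Values taken from the slide
p_a = 0.00252           # P(Alarm), obtained by marginalising over Burglary and Earthquake
p_not_a = 1 - p_a       # P(not Alarm) = 0.99748
p_d_given_a = 0.91      # P(David calls | Alarm)
p_d_given_not_a = 0.05  # P(David calls | no Alarm)

# Total probability: P(D) = P(D|A)P(A) + P(D|not A)P(not A)
p_d = p_d_given_a * p_a + p_d_given_not_a * p_not_a
print(round(p_d, 4))    # ~0.0522
```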
EM Algorithm
• E -> Expectation
• M -> Maximization
• Used to find latent variables.
• Latent variable – a variable that is not directly observed.
• Basically used in many unsupervised clustering algorithms.
Steps involved in EM Algorithm
• Step 1 – A set of initial values is considered
  – A set of incomplete data is given to the system.
• Step 2 – Expectation Step or E-step
  – Use the observed data to estimate (guess) the values of the missing or latent variables.
• Step 3 – Maximization Step or M-step
  – Update the parameters using the values generated in the E-step.
• Step 4 – Check whether the values are converging or not
  – If converging – stop.
  – Otherwise, repeat steps 2 and 3 until convergence occurs.
(A minimal sketch of these steps follows below.)
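A minimal sketch of these E/M steps for a two-component 1-D Gaussian mixture (an assumed setting; the data is made up and a fixed iteration count stands in for the convergence check):

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture (means, variances, weights)."""
    # Step 1: initial values (crude guesses taken from the data)
    mu = np.array([x.min(), x.max()], dtype=float)
    var = np.array([x.var(), x.var()], dtype=float)
    w = np.array([0.5, 0.5])

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point
        pdf = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        resp = w * pdf
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: update the parameters from the responsibilities
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        w = nk / len(x)
    return mu, var, w

# Hypothetical data drawn from two overlapping clusters
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(5, 1, 200)])
print(em_gmm_1d(x))   # means near 0 and 5, variances near 1, weights near 0.5
```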
Usage of EM Algorithm
• Used to fill in missing data.
• Used for unsupervised clustering.
• Used to discover the values of latent variables.
• Used to calculate the Gaussian density of a function.
• Used to estimate the parameters of the Hidden Markov Model.
Advantages & Disadvantages
Advantages:
• Easy to implement, as it has only two steps: the E-step and the M-step.
• Likelihood increases after each iteration.
• The solution of the M-step exists in closed form.
Disadvantages:
• Slow convergence.
• Converges only to a local optimum.
• Requires both forward and backward probabilities.
Concept Learning
• “A task of acquiring a potential hypothesis (solution) that best fits the given training examples.”
Candidate Elimination: boundary trace
1) +ve
S1 = <Sunny, Warm, Normal, Strong, Warm, Same>
G1 = <?,?,?,?,?,?>
2) +ve
S2 = < Sunny, Warm, ?, Strong, Warm, Same>
G2 = <?,?,?,?,?,?>
3) –ve
S3 = < Sunny, Warm, ?, Strong, Warm, Same>
G3 = <<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>, <?,?,?,?,?,Same>>
4) +ve
S4 = < Sunny, Warm, ?, Strong, ?, ?>
G4 = <<Sunny,?,?,?,?,?>,<?,Warm,?,?,?,?>>
S0 = <Փ, Փ, Փ, Փ, Փ, Փ>
Find-S Algorithm
• F1 -> A, B
• F2 -> X, Y
• Instance space: (A,X), (A,Y), (B,X), (B,Y) – 4 instances
• Hypothesis space: (A,X), (A,Y), (A,Փ), (A,?), (B,X), (B,Y), (B,Փ), (B,?), (Փ,X), (?,X), (Փ,Y), (?,Y), (Փ,Փ), (Փ,?), (?,Փ), (?,?) – 16 hypotheses
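A minimal Python sketch of the Find-S idea on EnjoySport-style data; the training rows are illustrative (the slides only show the S/G boundaries), and None stands in for the Փ value:

```python
def find_s(examples):
    """Find-S: start from the most specific hypothesis and generalise on positives."""
    n = len(examples[0][0])
    h = [None] * n                     # None plays the role of the 'Փ' (empty) value
    for attrs, label in examples:
        if label != "+":               # Find-S ignores negative examples
            continue
        for i, value in enumerate(attrs):
            if h[i] is None:           # first positive example: copy its values
                h[i] = value
            elif h[i] != value:        # conflicting value: generalise to '?'
                h[i] = "?"
    return h

# Illustrative EnjoySport-style training data (not quoted from the slides)
data = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "+"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), "+"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "-"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "+"),
]
print(find_s(data))   # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```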
List-Then-Eliminate Algorithm