ML Unit 1
Course Name: Machine Learning

Course Outcomes / Program Outcomes (POs 1-12):

CO1: Students should be able to understand basic concepts such as decision trees and neural networks. (PO mapping: H, H, H)
CO2: Ability to formulate machine learning techniques for the respective problems. (PO mapping: H, H, M, H, M, M, M, L)
CO3: Apply machine learning algorithms to solve problems of moderate complexity. (PO mapping: L, H, L)
Machine Learning vs. Traditional Programming:
• Traditional programming: data as input + a hand-written program -> COMPUTER -> output.
• Machine learning: feeding examples of input/output data -> COMPUTER -> a program/model with which you can solve the problem/task.
• Confidence(A -> B) = Support(A U B) / Support(A)

Item pair          Frequency   Support
(Bread, Cheese)    2           40%
(Bread, Juice)     3           60%
(Bread, Milk)      2           40%
(Cheese, Juice)    3           60%
(Cheese, Milk)     1           20%
(Juice, Milk)      2           40%

Sr. no   Rule                Confidence
1        Bread -> Juice      75%
2        Juice -> Bread      75%
3        Cheese -> Juice     100%
4        Juice -> Cheese     75%

• Remove item pairs with support below 50%. Only two pairs are left, i.e., (Bread, Juice) and (Cheese, Juice).
• Form rules only for these two pairs, i.e., (Bread, Juice) and (Cheese, Juice).
• Now consider the item pair (Bread, Juice). The rules that can be generated are Bread -> Juice and Juice -> Bread; to decide which rule is to be kept, we need to calculate the confidence of each rule.
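The support and confidence numbers above can be reproduced with a short script. The five transactions below are a hypothetical dataset chosen to be consistent with the support table (the slide does not show the original transactions):

```python
from itertools import combinations

# Hypothetical 5-transaction dataset, constructed to match the
# support table above (the original transactions are not shown).
transactions = [
    {"Bread", "Cheese", "Juice"},
    {"Bread", "Cheese", "Juice", "Milk"},
    {"Cheese", "Juice"},
    {"Bread", "Juice", "Milk"},
    {"Bread"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Confidence(A -> B) = Support(A U B) / Support(A)."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)

# Pair supports, as in the table above
items = sorted({i for t in transactions for i in t})
for pair in combinations(items, 2):
    print(pair, f"{support(pair):.0%}")

print(confidence({"Bread"}, {"Juice"}))   # ~0.75 (up to float rounding)
print(confidence({"Cheese"}, {"Juice"}))  # 1.0
```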
• Financial Services
• Marketing and Sales
• Healthcare
• Transportation
• Oil and Gas
ChooseMove : B -> M
  B – the set of legal board states
  M – the best move
Alternative target function: V : B -> R
  R – a real value (a score for a board state)
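As a sketch (not from the slides), V is often approximated by a linear combination of hand-chosen board features, as in Mitchell's checkers example; the feature/weight representation below is an illustrative assumption:

```python
# Illustrative sketch: approximate V : B -> R as a linear combination of
# hand-chosen board features. The features and weights are assumptions,
# not from the slides.
def v_hat(board_features, weights):
    """Estimated value of a board state: w0 + w1*x1 + ... + wn*xn."""
    w0, *ws = weights
    return w0 + sum(w * x for w, x in zip(ws, board_features))

def choose_move(legal_moves, result, weights):
    """ChooseMove: pick the move whose resulting board scores highest under V."""
    return max(legal_moves, key=lambda move: v_hat(result(move), weights))
```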
• Final Design
• PERSPECTIVES:
Machine learning involves searching a very large space of possible hypotheses to determine the one that best fits the observed data and the prior knowledge held by the learner.
CSE-MLRITM-AUTONOMOUS
• INDUCTIVE BIAS:
• Remarks on the Candidate-Elimination (CE) and Version Space (VS) algorithms:
1) Will the CE algorithm give us the correct hypothesis?
2) What training example should the learner request next?
Inductive Learning– From examples we derive
rules.
Deductive Learning– Already existing rules are
applied to our examples.
01/22/2025 Dr S Pratap Singh 47
Biased hypothesis space:
Does not consider all types of training examples.
Solution -> include all hypotheses.
Unbiased hypothesis space:
Provide a hypothesis space capable of representing the set of all possible examples.
Possible instances: 3 × 2 × 2 × 2 × 2 × 2 = 96
Target concepts: 2^96 (huge; practically not possible to enumerate)
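The counting above can be checked directly (the attribute counts are taken from the slide; one three-valued attribute and five binary attributes, as in the EnjoySport-style instance space):

```python
# One attribute with 3 values and five binary attributes (from the slide).
possible_instances = 3 * 2 * 2 * 2 * 2 * 2
print(possible_instances)  # 96

# Every target concept labels each of the 96 instances yes/no,
# so there are 2**96 distinct concepts -- far too many to enumerate.
target_concepts = 2 ** possible_instances
print(len(str(target_concepts)))  # 29 (2**96 has 29 decimal digits)
```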
• Idea of Inductive Bias:
• The learner generalizes beyond the observed training examples to classify new examples.
• Notation: X > Y means that Y is inductively inferred from X.
1) It selects shorter trees over longer trees.
2) It selects trees that place the attributes with the highest information gain closest to the root.
-> A preference bias is more desirable than a restriction bias, because it allows the learner to work within a complete hypothesis space that is assured to contain the unknown target function.
• Decision tree to purchase a new house

Cost?
  > 50L  -> Reject
  < 50L  -> No. of bedrooms?
              < 2   -> Reject
              >= 2  -> Distance from office?
                         > 10 km  -> Reject
                         < 10 km  -> Good construction?
                                       No  -> Reject
                                       Yes -> Purchase
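The tree above translates directly into nested if/else rules; this sketch assumes the threshold semantics shown on the slide (cost in lakhs, "L"):

```python
# Direct translation of the house-purchase decision tree into rules.
def purchase_house(cost_lakhs, bedrooms, distance_km, good_construction):
    if cost_lakhs > 50:          # Cost branch
        return "Reject"
    if bedrooms < 2:             # No. of bedrooms branch
        return "Reject"
    if distance_km > 10:         # Distance-from-office branch
        return "Reject"
    # Good-construction branch
    return "Purchase" if good_construction else "Reject"

print(purchase_house(45, 3, 8, True))   # Purchase
print(purchase_house(60, 3, 8, True))   # Reject (cost > 50L)
```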
Decision tree to Play Tennis

Outlook?
  Sunny    -> Humidity? (High -> NO, Normal -> YES)
  Overcast -> YES
  Rain     -> Wind? (Strong -> NO, Weak -> YES)

Disjunctive description:
(Outlook = Sunny ∧ Humidity = Normal) ∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Weak)
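The disjunctive description can be written as a boolean function, with one disjunct per root-to-YES path in the tree:

```python
# The disjunctive description of the PlayTennis tree:
# each disjunct is one root-to-YES path.
def play_tennis(outlook, humidity, wind):
    return ((outlook == "Sunny" and humidity == "Normal")
            or outlook == "Overcast"
            or (outlook == "Rain" and wind == "Weak"))

print(play_tennis("Overcast", "High", "Strong"))  # True
print(play_tennis("Rain", "High", "Strong"))      # False
```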
• Decision tree learning is a method for approximating discrete-valued target functions, in which the learned function is represented by a decision tree.
• Learned trees can also be re-represented as
sets of if-then rules.
Problem Characteristics
• Instances are represented by attribute-value
pairs.
• The target function has discrete output values.
• Disjunctive descriptions may be required.
• The training data may contain errors.
• The training data may contain missing
attribute values.
Decision Tree Algorithms
• ID3
• C4.5
• C5.0
• CART
ID3 Algorithm
• Which attribute should be tested at the root
of the tree?
• Selection of descendant of root node.
• To select root node and descendant of root
node we need to concentrate on entropy and
information gain.
• Entropy is a measure of the impurity, disorder, or uncertainty in a set of examples.
• Information gain is the decrease in entropy after splitting the dataset on an attribute.
• Information gain is the main criterion used by decision tree algorithms to construct a decision tree.
• Decision tree algorithms always try to maximize information gain.
• The attribute with the highest information gain is tested/split first.
Entropy
Entropy(S) = -p+ log2(p+) - p- log2(p-)

Gain(S, Outlook) = Entropy(S) - I(Outlook)
                 = 0.940 - 0.693 = 0.247
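These numbers can be verified with a short script, assuming the standard 14-example PlayTennis dataset (9 yes / 5 no overall; Sunny 2+/3-, Overcast 4+/0-, Rain 3+/2-), which matches the slide's figures:

```python
import math

def entropy(pos, neg):
    """Entropy of a sample with `pos` positive and `neg` negative examples."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

# PlayTennis: per-Outlook (yes, no) counts; 9 yes / 5 no overall.
outlook = {"Sunny": (2, 3), "Overcast": (4, 0), "Rain": (3, 2)}

e_s = entropy(9, 5)                    # ~0.940
i_outlook = sum((p + n) / 14 * entropy(p, n)
                for p, n in outlook.values())  # ~0.6935 (0.693 on the slide)
gain = e_s - i_outlook                 # ~0.247
print(round(e_s, 3), round(i_outlook, 3), round(gain, 3))  # 0.94 0.694 0.247
```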
• E(Temperature=Hot) = -(2/4) log2(2/4) - (2/4) log2(2/4) = 1
• E(Temperature=Mild) = -(4/6) log2(4/6) - (2/6) log2(2/6) = 0.918
• E(Temperature=Cool) = -(3/4) log2(3/4) - (1/4) log2(1/4) = 0.811
• Average information entropy of Temperature:
I(Temperature) = (4/14)·1 + (6/14)·0.918 + (4/14)·0.811 = 0.911
Gain(S, Temperature) = Entropy(S) - I(Temperature)
                     = 0.940 - 0.911 = 0.029
Gain(S , Outlook)= 0.247
Gain(S , Humidity)= 0.151
Gain(S , Wind)= 0.048
Gain(S , Temperature)= 0.029
Outlook is selected as the decision attribute for
the root node because of its higher gain value,
and branches are created below the root for
each of its possible values (i.e., Sunny, Overcast,
and Rain).
• The Overcast branch has zero entropy, so it becomes a leaf node with the classification PlayTennis = Yes.
• The Sunny and Rain branches still have non-zero entropy, so the decision tree is elaborated below these nodes.
• The same process is repeated to select the descendant nodes.
• Repeat the same process for each descendant of the root node.
Attribute gains for the Sunny branch:

Attribute     Gain
Temperature   0.571
Humidity      0.971
Wind          0.02
Gain(S_rain, Humidity) = 0.02
Gain(S_rain, Wind) = 0.971
Gain(S_rain, Temperature) = 0.02
Wind is selected as the next node under the Rain branch.

Outlook   Wind     PlayTennis
Rain      Weak     Yes
Rain      Weak     Yes
Rain      Strong   No
Rain      Weak     Yes
Rain      Strong   No
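Using the Rain-branch table above (Weak -> 3 yes, Strong -> 2 no), the gain for Wind can be checked:

```python
import math

def entropy(pos, neg):
    """Entropy of a sample with `pos` positive and `neg` negative examples."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

# Rain branch: 3 yes / 2 no overall; Weak -> (3 yes, 0 no), Strong -> (0 yes, 2 no)
e_rain = entropy(3, 2)                                   # ~0.971
i_wind = (3/5) * entropy(3, 0) + (2/5) * entropy(0, 2)   # 0.0 (pure subsets)
print(round(e_rain - i_wind, 3))                         # 0.971
```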
Final Decision Tree
Hypothesis space and inductive bias