
1. Explain the concept of Decision Tree Learning -6M


• Decision tree learning is a method for approximating discrete-valued target functions, in
which the learned function is represented by a decision tree.

DECISION TREE REPRESENTATION

• Decision trees classify instances by sorting them down the tree from the root to some leaf
node, which provides the classification of the instance.

• Each node in the tree specifies a test of some attribute of the instance, and each branch
descending from that node corresponds to one of the possible values for this attribute.

• An instance is classified by starting at the root node of the tree, testing the attribute specified
by this node, then moving down the tree branch corresponding to the value of the attribute in
the given example. This process is then repeated for the subtree rooted at the new node.

• Decision trees represent a disjunction of conjunctions of constraints on the attribute values of
instances.

• Each path from the tree root to a leaf corresponds to a conjunction of attribute tests, and the
tree itself to a disjunction of these conjunctions.

For example,

The decision tree shown in the figure above corresponds to the expression

(Outlook = Sunny ∧ Humidity = Normal)

∨ (Outlook = Overcast)

∨ (Outlook = Rain ∧ Wind = Weak)
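
For illustration, here is a minimal Python sketch (not part of the original notes) of how an instance is sorted down this PlayTennis-style tree. The dictionary encoding, and the assumption that branches not listed in the expression above (e.g. Humidity = High) lead to "No", are illustrative choices rather than material from the notes.

# The tree above, encoded as nested dictionaries: internal nodes test an
# attribute, leaves hold the classification.
tree = {
    "attribute": "Outlook",
    "branches": {
        "Sunny": {"attribute": "Humidity",
                  "branches": {"Normal": "Yes", "High": "No"}},
        "Overcast": "Yes",
        "Rain": {"attribute": "Wind",
                 "branches": {"Weak": "Yes", "Strong": "No"}},
    },
}

def classify(node, instance):
    # Start at the root, test the attribute at the current node, follow the
    # branch matching the instance's value, and repeat until a leaf is reached.
    while isinstance(node, dict):
        node = node["branches"][instance[node["attribute"]]]
    return node

example = {"Outlook": "Rain", "Humidity": "High", "Wind": "Weak"}
print(classify(tree, example))  # -> "Yes" (Outlook = Rain ∧ Wind = Weak)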

2. Explain the concept of Entropy and Information Gain -8M


ENTROPY MEASURES HOMOGENEITY OF EXAMPLES

• To define information gain, we begin by defining a measure called entropy. Entropy
measures the impurity of a collection of examples.
• Given a collection S, containing positive and negative examples of some target concept,
the entropy of S relative to this Boolean classification is

Entropy(S) = −p+ log2 p+ − p- log2 p-

Where,
p+ is the proportion of positive examples in S
p- is the proportion of negative examples in S.
Example: Entropy
• Suppose S is a collection of 14 examples of some Boolean concept, including 9 positive and
5 negative examples ([9+, 5-]). Then the entropy of S relative to this Boolean classification is

Entropy([9+, 5-]) = −(9/14) log2(9/14) − (5/14) log2(5/14) = 0.940

• The entropy is 0 if all members of S belong to the same class

• The entropy is 1 when the collection contains an equal number of positive and negative
examples
• If the collection contains unequal numbers of positive and negative examples, the entropy
is between 0 and 1
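
As a small illustration (not from the original notes; the function name is an arbitrary choice), the entropy of a Boolean collection can be computed directly from the counts of positive and negative examples:

import math

def entropy(num_pos, num_neg):
    # Entropy(S) = -p+ log2 p+ - p- log2 p-; a term contributes 0 when its
    # proportion is 0, which keeps the all-one-class case well defined.
    total = num_pos + num_neg
    result = 0.0
    for count in (num_pos, num_neg):
        if count > 0:
            p = count / total
            result -= p * math.log2(p)
    return result

print(entropy(9, 5))   # ~0.940: the [9+, 5-] example above
print(entropy(7, 7))   # 1.0: equal numbers of positive and negative examples
print(entropy(14, 0))  # 0.0: all members belong to the same class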

INFORMATION GAIN MEASURES THE EXPECTED REDUCTION IN ENTROPY

• Information gain is the expected reduction in entropy caused by partitioning the examples
according to a given attribute.

• The information gain, Gain(S, A), of an attribute A relative to a collection of examples S, is defined
as

Gain(S, A) = Entropy(S) − Σ (|Sv| / |S|) Entropy(Sv), where the sum runs over v ∈ Values(A)

Here Values(A) is the set of all possible values of attribute A, and Sv is the subset of S for which
attribute A has value v.
Example: Information gain

Let Values(Wind) = {Weak, Strong}

S = [9+, 5-]
S_Weak = [6+, 2-]
S_Strong = [3+, 3-]

Information gain of attribute Wind:

Gain(S, Wind) = Entropy(S) − (8/14) Entropy(S_Weak) − (6/14) Entropy(S_Strong)

= 0.940 − (8/14) × 0.811 − (6/14) × 1.00

= 0.048
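
The same calculation can be written as a short, self-contained Python sketch (illustrative only; the helper names are arbitrary):

import math

def entropy(pos, neg):
    # Same entropy helper as in the sketch above.
    total = pos + neg
    return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c > 0)

def information_gain(S, partitions):
    # Gain(S, A) = Entropy(S) - sum over v in Values(A) of |Sv|/|S| * Entropy(Sv).
    # S is the (pos, neg) counts of the whole collection; partitions maps each
    # value v of the attribute to the (pos, neg) counts of the subset Sv.
    total = sum(S)
    remainder = sum((p + n) / total * entropy(p, n) for p, n in partitions.values())
    return entropy(*S) - remainder

S = (9, 5)                                  # [9+, 5-]
wind = {"Weak": (6, 2), "Strong": (3, 3)}   # S_Weak and S_Strong from above
print(round(information_gain(S, wind), 3))  # -> 0.048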

4. Construct decision trees to represent the following Boolean functions -4M

i) A and B

ii) A or [B and C]
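
Purely as an illustration (not part of the notes), the two functions can be written as nested attribute tests in Python, mirroring the structure of the corresponding decision trees: the root tests one attribute, and each branch either returns a leaf label or tests the next attribute.

def a_and_b(A, B):
    # Root tests A; the A = True branch tests B; every other path is a False leaf.
    if A:
        return True if B else False
    return False

def a_or_b_and_c(A, B, C):
    # Root tests A; the A = True branch is a True leaf; the A = False branch
    # tests B, whose B = True branch tests C.
    if A:
        return True
    if B:
        return True if C else False
    return False

assert a_and_b(True, True) and not a_and_b(True, False)
assert a_or_b_and_c(False, True, True) and not a_or_b_and_c(False, True, False)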

7. Explain inductive bias for decision tree learning -8M


Inductive bias is the set of assumptions that, together with the training data, deductively justify
the classifications assigned by the learner to future instances.
Given a collection of training examples, there are typically many decision trees consistent with
these examples. Which of these decision trees does ID3 choose?

ID3 search strategy

(a) selects in favour of shorter trees over longer ones

(b) selects trees that place the attributes with highest information gain closest to the root

Approximate inductive bias of ID3: Shorter trees are preferred over longer trees.

• Consider an algorithm that begins with the empty tree and searches breadth first through
progressively more complex trees.

• First considering all trees of depth 1, then all trees of depth 2, etc.

• Once it finds a decision tree consistent with the training data, it returns the smallest
consistent tree at that search depth (e.g., the tree with the fewest nodes).

• Let us call this breadth-first search algorithm BFS-ID3.

• BFS-ID3 finds a shortest decision tree and thus exhibits the bias "shorter trees are preferred
over longer trees."

A closer approximation to the inductive bias of ID3: Shorter trees are preferred over longer
trees. Trees that place high information gain attributes close to the root are preferred over those that
do not.

• ID3 can be viewed as an efficient approximation to BFS-ID3, using a greedy heuristic search to
attempt to find the shortest tree without conducting the entire breadth-first search through the
hypothesis space.

• Because ID3 uses the information gain heuristic and a hill climbing strategy, it exhibits a more
complex bias than BFS-ID3.

• In particular, it does not always find the shortest consistent tree, and it is biased to favour trees
that place attributes with high information gain closest to the root.
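
To make the greedy step concrete, here is a small sketch (illustrative; the attribute/value counts below are assumed for the example and are not given in these notes) of how ID3 chooses the attribute to place at the root: it computes the information gain of every candidate attribute and keeps the best one, rather than searching breadth-first over whole trees.

import math

def entropy(pos, neg):
    total = pos + neg
    return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c > 0)

def gain(S, partitions):
    # Gain(S, A) = Entropy(S) - sum of |Sv|/|S| * Entropy(Sv) over the values of A.
    total = sum(S)
    return entropy(*S) - sum((p + n) / total * entropy(p, n)
                             for p, n in partitions.values())

S = (9, 5)  # [9+, 5-]
candidates = {
    "Outlook":  {"Sunny": (2, 3), "Overcast": (4, 0), "Rain": (3, 2)},
    "Humidity": {"High": (3, 4), "Normal": (6, 1)},
    "Wind":     {"Weak": (6, 2), "Strong": (3, 3)},
}

# Greedy choice: the attribute with the highest information gain becomes the root.
root = max(candidates, key=lambda a: gain(S, candidates[a]))
print(root)  # -> "Outlook" for these counts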

8. Write decision tree to represent the following functions


i)
ii)

iii)

iv)
