
UNIT 3

MACHINE LEARNING
TREE MODELS
• Feature Tree:
a compact way of representing a number of conjunctive
concepts in the hypothesis space.

• Tree structure:
1. Internal nodes are labelled with features.
2. Edges are labelled with literals.
3. Split: the set of literals at a node.
4. Leaf: a conjunctive concept, the conjunction of the literals
on the path from the root to that leaf.
TREE MODELS
• Generic Algorithm
Three functions are assumed:
1. Homogeneous(D) → true if all instances in D belong to a
single class.

2. Label(D) → returns the label to assign to D (e.g. its
majority class).

3. BestSplit(D, F) → returns the feature on which the data set D
is best divided into two or more subsets.
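As a sketch, the generic divide-and-conquer learner can be written around these three functions; the function and parameter names below (grow_tree and its callbacks) are illustrative, not part of the slides:

def grow_tree(D, F, homogeneous, label, best_split):
    """Generic divide-and-conquer tree learner (sketch).
    D: list of training instances, F: list of candidate features.
    homogeneous, label, best_split are the three functions above."""
    if homogeneous(D):
        return {"leaf": label(D)}            # single-class leaf
    feature, partition = best_split(D, F)    # partition: value -> subset of D
    children = {}
    for value, Di in partition.items():
        if not Di:                           # empty child: use the parent's label
            children[value] = {"leaf": label(D)}
        else:
            children[value] = grow_tree(Di, F, homogeneous, label, best_split)
    return {"split_on": feature, "children": children}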
TREE MODELS
• Divide-and-conquer algorithm:
divides the data into subsets, builds a tree for each of those,
and then combines the subtrees into a single tree.
• Greedy:
whenever there is a choice (such as choosing the best split), the
best alternative is selected on the basis of the information then
available, and this choice is never reconsidered.
• Backtracking search:
can return an optimal solution, at the expense of increased
computation time and memory requirements.
DECISION TREES
• Classification task on a data set D:
-- if Homogeneous(D), label the leaf with the single class Label(D);
-- if D is not homogeneous, split it into subsets D1, D2, ...;
-- if a subset Di = Ø (empty), label it with the majority class of
its parent D.
DECISION TREES
• A split is pure if, for example, D1+ = D+ and D1- = Ø, or
D1- = D- and D1+ = Ø.

• Impurity depends only on the relative magnitude of n+ and n-,
i.e. on the proportion p˙ = n+ / (n+ + n-), the empirical
probability of the positive class.

• Aim: we need an impurity function of p˙ that
-- returns 0 if p˙ = 0 or p˙ = 1, and
-- reaches its maximum at p˙ = 1/2.
FUNCTIONS
1. MINORITY CLASS (error rate)

2. GINI INDEX (expected error rate)

3. ENTROPY (expected information)
MINORITY CLASS
• min(p˙, 1 - p˙): the error rate when every instance in the leaf is
predicted as the majority class.

• The minority class is proportional to the number of misclassified
examples.

• Example: spam = 40 (majority class), ham = 10 (minority class,
misclassified).

• If the set of instances is pure, the minority class is empty and
there is no error.

• Written as an impurity measure: min(p˙, 1 - p˙) = 1/2 - |p˙ - 1/2|.


GINI INDEX
• The expected error rate.

• Suppose we assign labels to instances randomly, predicting positive
with probability p˙ and negative with probability 1 - p˙.

• Probability of a false positive: (1 - p˙) p˙.

• Probability of a false negative: p˙ (1 - p˙).

• Expected error rate (Gini index): 2 p˙ (1 - p˙).


ENTROPY
• The expected information.

• Formula: -p˙ log2 p˙ - (1 - p˙) log2 (1 - p˙)
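As a small illustration, the three impurity measures can be written as functions of the empirical probability p˙ (a minimal sketch; the function names are not from the slides):

import math

def minority_class(p):
    """Error rate: min(p, 1-p) = 1/2 - |p - 1/2|."""
    return min(p, 1 - p)

def gini_index(p):
    """Expected error rate of random labelling: 2p(1-p)."""
    return 2 * p * (1 - p)

def entropy(p):
    """Expected information: -p log2 p - (1-p) log2 (1-p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# All three return 0 at p = 0 or p = 1 and reach their maximum at p = 1/2:
for p in (0.0, 0.25, 0.5, 1.0):
    print(p, minority_class(p), gini_index(p), entropy(p))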


[Figure slides: Decision Trees; Entropy; Gini Index.]
Decision Tree
• For more than two classes (K > 2):

• Either use a one-vs-rest decomposition, or use K-class impurity
measures directly.

• K-class entropy = -sum over i of p˙i log2 p˙i

• K-class Gini index = sum over i of p˙i (1 - p˙i) = 1 - sum over i of p˙i^2
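A matching sketch for the K-class versions, taking a list of per-class counts (the helper names and signatures are assumptions):

import math

def class_proportions(counts):
    """Turn per-class counts [n1, ..., nK] into proportions p_i."""
    n = sum(counts)
    return [c / n for c in counts]

def k_class_entropy(counts):
    """-sum_i p_i log2 p_i over the K classes."""
    return -sum(p * math.log2(p) for p in class_proportions(counts) if p > 0)

def k_class_gini(counts):
    """sum_i p_i (1 - p_i) = 1 - sum_i p_i^2."""
    return 1 - sum(p * p for p in class_proportions(counts))

print(k_class_entropy([10, 10, 10]))   # uniform 3-class leaf: log2(3), about 1.585
print(k_class_gini([30, 0, 0]))        # pure leaf: 0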


RANKING AND PROBABILITY ESTIMATION
• Grouping classifiers divide the instance space into segments; in a
decision tree the segments are the leaves.

• Such a model becomes a ranker by learning an ordering over those
segments.

• Decision trees have access to the local class distribution in each
leaf, so the leaf ordering can be constructed directly in an optimal
way.

• Using the empirical probability p˙ of each leaf it is easy to compute
this ordering: give the highest rank to the leaves with the largest
proportion of positives.

• On the training data this ordering produces a convex ROC (coverage)
curve.

• The empirical probability of the parent is a weighted average of the
empirical probabilities of its children; but this only tells us that
p˙1 ≤ p˙ ≤ p˙2 or p˙2 ≤ p˙ ≤ p˙1.
• Consider a feature tree with as yet unlabelled leaves.

• In how many ways can we label the leaves, and how well does each
labelling perform?

• Assume we know the number of positives and negatives covered by
each leaf.

• With L leaves and C classes there are C^L possible labellings.
Example: 2^4 = 16 labellings for a two-class tree with four leaves.

• The coverage plot of all labellings is symmetric: complementary
labellings such as +-+- and -+-+ occupy symmetric positions.

• The optimal labellings lie on the upper-left path through the
corners of the coverage plot, obtained by switching leaves to + in
order of decreasing empirical probability:
----, --+-, +-+-, +-++, ++++.

• With L leaves there are L! possible orderings (permutations) of the
leaves.


• A feature tree can be turned into:

-- a Ranker: order the leaves in descending order of their empirical
probability;

-- a Probability Estimator: predict the empirical probability in each
leaf, possibly smoothed with the Laplace or m-estimate correction;

-- a Classifier: choose the expected operating conditions and find
the operating point on the ROC curve that best fits those
conditions.
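A small sketch of how leaf counts could be turned into a leaf ordering and (Laplace-smoothed) probability estimates; the data layout and names are assumptions for illustration:

def leaf_probability(pos, neg, laplace=True):
    """Empirical probability of the positive class in a leaf,
    optionally with the Laplace correction (pos+1)/(pos+neg+2)."""
    if laplace:
        return (pos + 1) / (pos + neg + 2)
    return pos / (pos + neg)

# Hypothetical leaves given as (name, positives covered, negatives covered).
leaves = [("L1", 30, 5), ("L2", 10, 10), ("L3", 1, 20)]

# Ranker: order leaves by decreasing empirical probability of the positive class.
ranking = sorted(leaves, key=lambda l: leaf_probability(l[1], l[2]), reverse=True)
print([name for name, _, _ in ranking])

# Probability estimator: report the smoothed probability in each leaf.
for name, pos, neg in leaves:
    print(name, round(leaf_probability(pos, neg), 3))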
• The optimal labelling under these operating conditions is +−++.
• We use the second leaf to filter out negatives.
• In other words, the right two leaves can be merged into one:
their parent.
• The operation of merging all leaves in a subtree is called pruning
the subtree.
• The advantage of pruning is that we can simplify the tree without
affecting the chosen operating point, which is sometimes useful if
we want to communicate the tree model to somebody else.
• The disadvantage is that we lose ranking performance.
Sensitivity to Skewed Class Distribution
• Gini index of the parent: 2 (n+ / n)(n- / n).

• Weighted average Gini index of the children: a child with
n1 = n1+ + n1- instances contributes (n1 / n) * 2 (n1+ / n1)(n1- / n1).

• Using the square root of the Gini index instead, the impurity of a
child relative to its parent becomes
sqrt(n1+ * n1-) / sqrt(n+ * n-),
which is unchanged when the class distribution is skewed, e.g. when
all negative counts are multiplied by the same factor.
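A small numerical check of this claim, assuming that skewing the class distribution means multiplying all negative counts by the same factor:

import math

def relative_impurity_sqrt_gini(n1_pos, n1_neg, n_pos, n_neg):
    """Child impurity relative to the parent under sqrt(Gini):
    sqrt(n1+ * n1-) / sqrt(n+ * n-)."""
    return math.sqrt(n1_pos * n1_neg) / math.sqrt(n_pos * n_neg)

def relative_impurity_gini(n1_pos, n1_neg, n_pos, n_neg):
    """Weighted child Gini divided by parent Gini, for comparison."""
    n1, n = n1_pos + n1_neg, n_pos + n_neg
    child = (n1 / n) * 2 * (n1_pos / n1) * (n1_neg / n1)
    parent = 2 * (n_pos / n) * (n_neg / n)
    return child / parent

# Original counts vs. the same split with 10x as many negatives everywhere.
print(relative_impurity_sqrt_gini(20, 5, 50, 50),
      relative_impurity_sqrt_gini(20, 50, 50, 500))   # identical: 0.2 and 0.2
print(relative_impurity_gini(20, 5, 50, 50),
      relative_impurity_gini(20, 50, 50, 500))        # changes with the skew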


How you would train decision trees for a dataset
• Train them as good ranking estimators.
• Use a distribution-insensitive impurity measure such as the square
root of the Gini index.
• Disable pruning while growing the tree.
• Choose the operating point on the ROC curve from the operating
conditions.
• Finally, prune away subtrees whose leaves are all assigned the same
label at that operating point.
Tree Learning as Variance Reduction
• The Gini index 2p(1-p) is an expected error rate: label the
instances randomly, predicting positive with probability p.

• Compare a coin toss: if heads occurs with probability p (and tails
with probability 1-p), the outcome has variance p(1-p).

• So reducing impurity is closely related to reducing variance, which
is exactly what regression trees aim for.
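A tiny simulation sketch checking that a random 0/1 label drawn with probability p has variance p(1 - p) (names are illustrative):

import random

def bernoulli_variance(p, trials=100_000, seed=0):
    """Estimate the variance of a 0/1 label that is 1 with probability p."""
    rng = random.Random(seed)
    xs = [1 if rng.random() < p else 0 for _ in range(trials)]
    mean = sum(xs) / trials
    return sum((x - mean) ** 2 for x in xs) / trials

for p in (0.1, 0.5, 0.9):
    print(p, round(bernoulli_variance(p), 3), "vs p(1-p) =", round(p * (1 - p), 3))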
REGRESSION TREE
Regression Tree
Split on Model → groups {A100, B3, E112, M102, T202}:
• A100 [1051, 1770, 1900], mean = 1574
• B3 [4513], mean = 4513
• E112 [77], mean = 77
• M102 [870], mean = 870
• T202 [99, 270, 625], mean = 331
Calculate each group's weighted variance contribution,
(nj / n) * Var(group) = (1/n) * sum of squared deviations, with n = 9:
• A100
1/9 [sq(1574-1051) + sq(1574-1770) + sq(1574-1900)]
= 1/9 [sq(523) + sq(-196) + sq(-326)]
= 1/9 [273529 + 38416 + 106276] = 46469
• B3
1/9 sq(4513-4513) = 0
• E112
1/9 sq(77-77) = 0
• M102
1/9 sq(870-870) = 0
• T202
1/9 [sq(331-99) + sq(331-270) + sq(331-625)]
= 1/9 [sq(232) + sq(61) + sq(-294)]
= 1/9 [53824 + 3721 + 86436] = 15998
• Weighted average variance for Model:
46469 + 0 + 0 + 0 + 15998 = 62467
• Similarly for Condition (excellent, good, fair):
excellent [1770, 4513], mean = 3142
good [270, 870, 1051, 1900], mean = 1023
fair [77, 99, 625], mean = 267
Weighted variance contributions:
• excellent
1/9 [sq(3142-1770) + sq(3142-4513)]
= 1/9 [sq(1372) + sq(-1371)] = 1/9 [1882384 + 1879641] = 418003
• good
1/9 [sq(1023-270) + sq(1023-870) + sq(1023-1051) + sq(1023-1900)]
= 1/9 [sq(753) + sq(153) + sq(-28) + sq(-877)]
= 1/9 [567009 + 23409 + 784 + 769129] = 151148
• fair
1/9 [sq(267-77) + sq(267-99) + sq(267-625)]
= 1/9 [sq(190) + sq(168) + sq(-358)]
= 1/9 [36100 + 28224 + 128164] = 21388
• Weighted average variance for Condition:
418003 + 151148 + 21388 = 590539
• Similarly for Leslie (yes, no):
yes [625, 870, 1900], mean = 1132
no [77, 99, 270, 1051, 1770, 4513], mean = 1297
Weighted variance contributions:
• yes
1/9 [sq(1132-625) + sq(1132-870) + sq(1132-1900)]
= 1/9 [sq(507) + sq(262) + sq(-768)]
= 1/9 [257049 + 68644 + 589824] = 101724
• no
1/9 [sq(1297-77) + sq(1297-99) + sq(1297-270) + sq(1297-1051)
+ sq(1297-1770) + sq(1297-4513)]
= 1/9 [sq(1220) + sq(1198) + sq(1027) + sq(246) + sq(-473) + sq(-3216)]
= 1/9 [1488400 + 1435204 + 1054729 + 60516 + 223729 + 10342656]
= 1622804
• Weighted average variance for Leslie:
101724 + 1622804 = 1724528

Weighted average variances:
1. Model = 62467
2. Condition = 590539
3. Leslie = 1724528
Model gives the lowest weighted average variance, so the root is split
on Model.
• For the A100 subset the candidate splits are:
Condition [excellent, good, fair] → [1770] [1051, 1900] [] → ignored
(one group is empty)
Leslie [yes, no] → [1900] [1051, 1770] → calculate the variances
• For the T202 subset the candidate splits are:
Condition [excellent, good, fair] → [] [270] [99, 625] → ignored
(one group is empty)
Leslie [yes, no] → [625] [99, 270] → calculate the variances
Regression Tree
Clustering Trees
• A regression tree finds instance-space segments in which the target
values are tightly clustered around the mean.
• The variance of a set of target values is the average squared
Euclidean distance to their mean.

• A clustering tree can be learned using
1. a dissimilarity matrix, or
2. Euclidean distance between feature vectors.
• For A100 the three instances have numerical features
(price, reserve, bids):
(11, 8, 13)
(18, 15, 15)
(19, 19, 1)
• The mean vector is (16, 14, 9.7).
• The per-feature variances are:
1/3 [sq(16-11) + sq(16-18) + sq(16-19)] = 1/3 [25 + 4 + 9] = 12.7
1/3 [sq(14-8) + sq(14-15) + sq(14-19)] = 1/3 [36 + 1 + 25] = 20.7
1/3 [sq(9.7-13) + sq(9.7-15) + sq(9.7-1)] = 1/3 [10.9 + 28.1 + 75.7] = 38.2
• Their sum, 71.6, is the average squared Euclidean distance of the
three instances to the mean vector.
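A small sketch of this computation: the average squared Euclidean distance of a set of feature vectors to their mean, which equals the sum of the per-feature variances (the function name is illustrative):

def cluster_variance(vectors):
    """Average squared Euclidean distance of the vectors to their mean vector."""
    n, dims = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dims)]
    return sum(sum((v[d] - mean[d]) ** 2 for d in range(dims)) for v in vectors) / n

a100 = [(11, 8, 13), (18, 15, 15), (19, 19, 1)]
print(cluster_variance(a100))   # about 71.6 = 12.7 + 20.7 + 38.2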
RULE MODELS
• Logical models:
1. Tree models.
2. Rule models.

• Rule models consist of a collection of implications, or if-then
rules.

• The if-part defines a segment of the instance space, and the
then-part defines the behaviour of the model in this segment.
• Two approaches:
1. Find a combination of literals – the body of the rule, which is
called a concept – that covers a sufficiently homogeneous set of
examples, and find a label (class) to put in the head of the rule.
→ Ordered sequence of rules: a Rule List.

2. First select a class you want to learn, and then find rule bodies
that cover (large subsets of) the examples of that class.
→ Unordered collection of rules: a Rule Set.
Learning Ordered Rule Lists
• Grow the rule body by adding literals that improve the homogeneity
of the set of examples it covers.
• Contrast with decision trees: a tree split creates two children C1
and C2 and evaluates the impurity of both, whereas a rule learner
only evaluates the impurity of the examples covered by the current
rule body (a single child).
• Separate-and-conquer: learn one rule, remove the examples it covers
from the training set, and repeat on the remaining examples until
none are left.
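A minimal sketch of the separate-and-conquer loop, assuming a learn_one_rule helper that greedily grows a single rule body; the rule representation and function names are illustrative:

def covers(rule, instance):
    """A rule body is a set of (feature, value) literals; it covers an
    instance (a dict) if all literals hold."""
    return all(instance.get(f) == v for f, v in rule)

def learn_rule_list(examples, learn_one_rule):
    """Separate-and-conquer: learn a rule, remove the covered examples, repeat."""
    rule_list = []
    remaining = list(examples)
    while remaining:
        rule, label = learn_one_rule(remaining)    # e.g. grow literals greedily
        covered = [x for x in remaining if covers(rule, x)]
        if not covered:                            # safeguard: stop if no progress
            break
        rule_list.append((rule, label))
        remaining = [x for x in remaining if not covers(rule, x)]
    return rule_list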
Learning Unordered Rule Sets
• An alternative approach to rule learning.

• Rules are learned for one class at a time.

• Instead of minimizing the impurity min(p˙, 1 - p˙), we maximize p˙,
the empirical probability of the class being learned.
Descriptive Rule Learning
• Descriptive models can be learned in either a supervised or an
unsupervised way.

• Supervised: adapt the given rule learning algorithms to perform
subgroup discovery.

• Unsupervised: frequent item set and association rule discovery.
Learning from Subgroup Discovery

• The baseline is a subgroup whose proportion of positives equals that
of the overall population; quality measures score how far a subgroup
deviates from this baseline:

1. Precision: |prec - pos|

2. Average recall: |avgrec - 0.5|

3. Weighted relative accuracy:
WRA = pos * neg * (tpr - fpr)
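A small sketch of the weighted relative accuracy computation from a subgroup's counts (the function and parameter names are assumptions):

def weighted_relative_accuracy(tp, fp, total_pos, total_neg):
    """WRA = pos * neg * (tpr - fpr), where pos and neg are the overall class
    proportions and tpr, fpr are the subgroup's true/false positive rates."""
    n = total_pos + total_neg
    pos, neg = total_pos / n, total_neg / n
    tpr, fpr = tp / total_pos, fp / total_neg
    return pos * neg * (tpr - fpr)

# A subgroup covering 30 of 50 positives and 5 of 50 negatives:
print(weighted_relative_accuracy(tp=30, fp=5, total_pos=50, total_neg=50))  # 0.125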
Association Rule Mining
