Recitation - Decision Trees
© Ben Galili IDC

The document discusses concepts related to gradients, gradient descent, hyperplanes, and decision trees in machine learning. It explains how to define hyperplanes, the process of finding linear separators, and the challenges with non-linearly separable data, such as the XOR problem. Additionally, it covers decision tree construction, impurity measures, and techniques for pruning trees to avoid overfitting.


¡ Gradient – the vector of the partial derivatives. For some $f(x_1, x_2, \ldots, x_n)$, the gradient will be:
$$\nabla f = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right)$$
¡ The gradient points in the direction of the greatest rate of increase of the function, and its magnitude is the slope of the graph in that direction
¡ Gradient descent – going in the opposite
direction to the gradient (toward the
minimum)
¡ We need the learning rate, alpha, to
determine how fast or slow we will move
towards the minimum (optimal weights)
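
To make this concrete, here is a hedged Java sketch (not from the slides) of the update rule w ← w − α·∇f(w) on a toy quadratic; the objective, its gradient, and the constants are choices made only for the example:

public class GradientDescentDemo {
    public static void main(String[] args) {
        // Toy objective f(w) = (w1 - 3)^2 + (w2 + 1)^2, whose gradient is (2(w1 - 3), 2(w2 + 1)).
        double[] w = {0.0, 0.0};      // initial weights
        double alpha = 0.1;           // learning rate
        for (int step = 0; step < 100; step++) {
            double[] grad = {2 * (w[0] - 3), 2 * (w[1] + 1)};  // partial derivatives at w
            w[0] -= alpha * grad[0];  // move against the gradient direction
            w[1] -= alpha * grad[1];
        }
        System.out.printf("w = (%.4f, %.4f)%n", w[0], w[1]);   // converges to (3, -1), the minimum
    }
}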



¡ How do we define a hyperplane in the space?
§ The dimension of the hyperplane is n-1 (if n is the dimension of the space we work in)
§ All the points on the hyperplane satisfy the equation $w_1 x_1 + \cdots + w_n x_n = b \ (= w_0)$
▪ Where the $x_i$ are the point's coordinates
§ The hyperplane separates the space into two half-spaces:
▪ All the points for which the left-hand side of the equation is > b
▪ All the points for which the left-hand side of the equation is < b


In order to find the distance
from the hyperplane we −!" + !# = 1
need to normalized w, w0 by −!" + !# − 1 = 0
||w||
(for points on the plane, the −!" + !# = 0
normalization doesn’t
matter, why?)
!# −!" + 2!# = 1
−!" + 2!# − 1 = 0
−!" + 2!# = 0

!# = 1
!# − 1 = 0
!" !# = 0

© Ben Galili IDC


¡ We want to find a linear separator:
§ All points above the hyperplane, with result greater than 0, will belong to the +1 class (or -1)
§ All points below the hyperplane, with result lower than 0, will belong to the -1 class (or +1)
¡ So, what do we need to find?
§ The hyperplane weights $w \in \mathbb{R}^{n+1}$ (n hyperplane weights and the bias $w_0$)
§ We will predict 1 if $\sum_{i=1}^{n} w_i x_i + w_0 > 0$ and -1 otherwise
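
A hedged Java sketch of this prediction rule; storing the bias in weights[0] is a layout chosen only for the example:

public class LinearSeparator {
    // weights[0] is the bias w_0; weights[1..n] multiply the features x_1..x_n.
    static int predict(double[] weights, double[] x) {
        double sum = weights[0];                  // start from the bias w_0
        for (int i = 0; i < x.length; i++) {
            sum += weights[i + 1] * x[i];         // + w_i * x_i
        }
        return sum > 0 ? 1 : -1;                  // predict 1 if the sum is positive, else -1
    }

    public static void main(String[] args) {
        double[] andWeights = {-1.5, 1, 1};       // the AND example on the next slide
        System.out.println(predict(andWeights, new double[]{1, 1}));  // 1
        System.out.println(predict(andWeights, new double[]{1, 0}));  // -1
    }
}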
¡ $x_1$ AND $x_2$
[Figure: the four points of $x_1$ AND $x_2$ in the $(x_1, x_2)$ plane]
¡ Solution?
¡ If $1 \cdot x_1 + 1 \cdot x_2 - 1.5 > 0$ predict 1
¡ Otherwise predict -1
¡ i.e. $w_0 = -1.5$, $w_1 = 1$, $w_2 = 1$


¡ $x_1$ OR $x_2$
[Figure: the four points of $x_1$ OR $x_2$ in the $(x_1, x_2)$ plane]
¡ Solution?
¡ If $x_1 + x_2 - 0.5 > 0$ predict 1
¡ Otherwise predict -1


¡ $x_1$ XOR $x_2$
[Figure: the four points of $x_1$ XOR $x_2$ in the $(x_1, x_2)$ plane]
¡ Solution?
¡ There is no solution
¡ Many functions cannot be represented using a linear separator, i.e., they are not linearly separable
¡ We will talk about linear classifiers in a future recitation
¡ The problem with linear classifiers is that not all data is linearly separable
¡ We need more 'tools' to deal with more complex data


¡ Let's look at the classic XOR problem
[Figure: the four XOR points in the $(x_1, x_2)$ plane]
¡ There is no linear classifier in this dimension that can separate the classes
¡ How can we separate them?
¡ By intuition, where would you put the first line (horizontal or vertical) and why?
¡ Now we have 2 different parts, and we ask the same question for each one of them
¡ This procedure produces a decision tree
[Figure: the XOR points in the $(x_1, x_2)$ plane, split by axis-parallel lines]


¡ Decision Tree definitions:
§ Each internal node (all nodes except the leaves) tests an attribute
§ Each branch corresponds to an attribute value
§ Each leaf node assigns a classification


¡ Some facts on trees:
§ Any tree with no child limit (more than 2 children) can be converted to a binary tree (only 2 children)
§ Any polythetic tree (a node query can involve more than one property) can be converted to a monothetic tree (each node query involves only one property)


¡ For continuous variables we choose a threshold value to split the attribute
¡ For example:
§ we have a continuous attribute $x \in [0, 100]$. If we are testing this variable, we create a threshold value $t$ and ask: $x < t$ or $x \geq t$? (one common way to pick candidate values for $t$ is sketched below)
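
The slides don't specify how $t$ is chosen; one common approach, sketched below as an assumption, is to take the midpoints between consecutive distinct sorted values as candidate thresholds and score each with the impurity reduction defined later:

import java.util.Arrays;
import java.util.stream.IntStream;

public class Thresholds {
    // Candidate thresholds: midpoints between consecutive distinct sorted attribute values.
    static double[] candidateThresholds(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        return IntStream.range(0, sorted.length - 1)
                .filter(i -> sorted[i] != sorted[i + 1])            // skip duplicate values
                .mapToDouble(i -> (sorted[i] + sorted[i + 1]) / 2)  // midpoint t
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(candidateThresholds(new double[]{40, 10, 10, 70})));
        // prints [25.0, 55.0]; each t defines the test x < t vs. x >= t
    }
}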

¡ While there are nodes in the queue do
§ Get next node n
§ If training examples in n are perfectly classified
▪ Then continue to the next node
§ Else
▪ A <- the "best" decision attribute for the set in n
▪ Assign A as the decision attribute for n
▪ For each value of A
▪ Create a new descendant of n
▪ Distribute training examples to the descendant nodes
▪ Insert the descendant nodes into the queue
¡ End While

But, how can we know which attribute is the best?
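
A hedged Java-style sketch of this loop; Dataset and its methods are hypothetical placeholders, this Node variant carries its data subset (unlike the minimal Node class shown at the end of the recitation), and only the control flow mirrors the pseudocode:

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class TreeBuilder {
    interface Dataset {
        boolean isPure();                 // do all examples share one class?
        List<Dataset> splitBy(int attr);  // one subset per value of the attribute
    }

    static class Node {
        Node parent;
        List<Node> children = new ArrayList<>();
        int attributeIndex = -1;
        Dataset data;
        Node(Dataset data) { this.data = data; }
    }

    // Stub: would maximize the impurity reduction defined on the next slides.
    static int bestAttribute(Dataset d) { return 0; }

    static Node buildTree(Dataset trainingSet) {
        Node root = new Node(trainingSet);
        Queue<Node> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {                    // While there are nodes in the queue
            Node n = queue.poll();                    // Get next node n
            if (n.data.isPure()) continue;            // perfectly classified -> leave as a leaf
            int a = bestAttribute(n.data);            // the "best" decision attribute for n
            n.attributeIndex = a;
            for (Dataset subset : n.data.splitBy(a)) {  // one child per attribute value
                Node child = new Node(subset);
                child.parent = n;
                n.children.add(child);
                queue.add(child);                     // insert descendant nodes into the queue
            }
        }
        return root;
    }
}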



¡ We want to choose the attribute that brings us closer to perfect classification
¡ In order to do that we need to measure how far we are from perfect classification
¡ This measure is called impurity:
§ High impurity means we are far from perfect classification
§ Low impurity means we are close to perfect classification
* Look in the lecture for the formal definition
¡ How can we use impurity to choose the best attribute?
§ Calculate the impurity in the current node
§ Calculate a weighted average of the impurity over the children nodes after a split according to the test attribute
§ Subtract the second from the first and you get the impurity reduction
§ Choose the attribute that causes the largest impurity reduction
¡ The formula for the impurity reduction:
$$\Delta\varphi(S, A) = \varphi(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|}\,\varphi(S_v)$$
* Where $\varphi$ is the impurity measure
¡ Important fact:
§ $\varphi$ measures the impurity according to the class distribution in a node
§ The instances are split according to the values of the test attribute A
§ This means that you split the instances according to an attribute's values and then calculate the impurity according to the class values
¡ There are 2 main implementations of the impurity criterion:

Gini:
§ Impurity: $GiniIndex(S) = 1 - \sum_{i=1}^{c} \left(\frac{|S_i|}{|S|}\right)^2$
§ Goodness of split: $Gini\_Gain(S, A) = GiniIndex(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|}\,GiniIndex(S_v)$

Entropy:
§ Impurity: $Entropy(S) = -\sum_{i=1}^{c} \frac{|S_i|}{|S|}\log\frac{|S_i|}{|S|}$
§ Goodness of split: $Information\_Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|}\,Entropy(S_v)$
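
A hedged Java sketch of these formulas; the map-based representation of a node's class counts and the helper names are choices made for the illustration, and the counts in main come from the exam example later in the recitation:

import java.util.List;
import java.util.Map;

public class Impurity {
    // Entropy of a node; "counts" maps each class label to its number of instances.
    static double entropy(Map<String, Integer> counts) {
        double total = counts.values().stream().mapToInt(Integer::intValue).sum();
        double h = 0.0;
        for (int c : counts.values()) {
            if (c == 0) continue;
            double p = c / total;
            h -= p * (Math.log(p) / Math.log(2));   // log base 2
        }
        return h;
    }

    // Gini index of a node.
    static double gini(Map<String, Integer> counts) {
        double total = counts.values().stream().mapToInt(Integer::intValue).sum();
        double sumOfSquares = 0.0;
        for (int c : counts.values()) {
            double p = c / total;
            sumOfSquares += p * p;
        }
        return 1.0 - sumOfSquares;
    }

    // Goodness of split: Entropy(S) minus the size-weighted entropy of the children S_v
    // (the Gini version is analogous, with gini() in place of entropy()).
    static double informationGain(Map<String, Integer> parent, List<Map<String, Integer>> children) {
        double total = parent.values().stream().mapToInt(Integer::intValue).sum();
        double weighted = 0.0;
        for (Map<String, Integer> child : children) {
            double size = child.values().stream().mapToInt(Integer::intValue).sum();
            weighted += (size / total) * entropy(child);
        }
        return entropy(parent) - weighted;
    }

    public static void main(String[] args) {
        // Root of the exam example later in the recitation: 7 negative and 5 positive instances.
        Map<String, Integer> root = Map.of("-", 7, "+", 5);
        System.out.println(entropy(root));   // ~0.98
        System.out.println(gini(root));      // ~0.49
    }
}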


¡ Uniform distribution (k classes, each with $\frac{1}{k}$ of the instances):
§ GiniIndex = $1 - \sum_{i=1}^{k}\left(\frac{1}{k}\right)^2 = 1 - k\cdot\frac{1}{k^2} = 1 - \frac{1}{k}$, the maximal value (close to 1 for large k)
§ Entropy = $-\sum_{i=1}^{k}\frac{1}{k}\log\frac{1}{k} = \log k$, the maximal value (equal to 1 for two classes)

¡ Perfect distribution (all instances have the same class):
§ GiniIndex = $1 - \left(\frac{|S|}{|S|}\right)^2 = 1 - 1 = 0$
§ Entropy = $-\frac{|S|}{|S|}\log\frac{|S|}{|S|} = -1\cdot\log 1 = 0$


¡ We want to predict if you will pass the test
¡ We have a training set of last year's students (100 students) with 5 attributes – ID, Gender, Bagrut average, hours spent studying, and luck (1-10)
¡ Which attribute will InformationGain choose, and why?
¡ If an attribute has many values, InformationGain will tend to select it
¡ Imagine using the attribute DAY = [D1, ..., D14]


¡ To solve that we can use GainRatio:
$$GainRatio(S, A) = \frac{Information\_Gain(S, A)}{SplitInformation(S, A)}$$
¡ Where $SplitInformation(S, A)$ is the Entropy with respect to the attribute A:
$$SplitInformation(S, A) = -\sum_{v \in A} \frac{|S_v|}{|S|}\log\frac{|S_v|}{|S|}$$
* In contrast to what we used as the Entropy of S, which was with respect to the target class
¡ Example:
$$SplitInformation(S, Day) = -\sum_{i=1}^{14} \frac{1}{14}\log\frac{1}{14} = -\log\frac{1}{14} = 3.8074$$
$$GainRatio(S, A) = \frac{0.94}{3.8074} = 0.2469$$
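
A hedged Java sketch of SplitInformation and GainRatio; the method names are assumptions, and the numbers in main reuse the DAY example above:

import java.util.Arrays;

public class GainRatio {
    // SplitInformation: entropy of the split itself; subsetSizes holds |S_v| for each value v of A.
    static double splitInformation(int[] subsetSizes) {
        double total = Arrays.stream(subsetSizes).sum();
        double si = 0.0;
        for (int s : subsetSizes) {
            if (s == 0) continue;
            double p = s / total;
            si -= p * (Math.log(p) / Math.log(2));   // log base 2
        }
        return si;
    }

    static double gainRatio(double informationGain, int[] subsetSizes) {
        return informationGain / splitInformation(subsetSizes);
    }

    public static void main(String[] args) {
        int[] daySizes = new int[14];
        Arrays.fill(daySizes, 1);                        // DAY splits 14 instances into 14 singletons
        System.out.println(splitInformation(daySizes));  // ~3.8074
        System.out.println(gainRatio(0.94, daySizes));   // ~0.2469, as in the slide
    }
}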


¡ Decision trees tend to overfit
¡ This means that our tree gets too specific in
handling the training data



¡ This suggests that we should prune or cut off
some branches of the tree to make it smaller,
and therefore get better test error
¡ How do we do this?



¡ Post pruning:
§ We traverse the tree (starting from the leaf nodes) and go all the way up (finishing at the root), and at each node we decide whether or not to completely cut off that branch of the tree
§ We decide whether to cut the branch or not based on whether the splitting attribute helps or not
¡ We can also prune during tree building - in that case we simply don't create the children nodes
¡ The Chi Square test is supposed to tell us:
does splitting according to some attribute
give us a distribution which is completely
random or does it have some predicting
power?
¡ So we check if splitting according to the
chosen attribute gives a distribution which is
similar to exactly random



¡ What is exactly random?
¡ If we have in our data Y=0 10 times and Y=1 90 times, then the probability of Y=0 is 10% and the probability of Y=1 is 90%
¡ If we split according to $X_j$ and there are 50 instances where $X_j = 1$, then if splitting according to $X_j$ were completely random we would expect to see 50*0.1 = 5 instances where Y=0 and 50*0.9 = 45 instances where Y=1. If we don't see this, then splitting according to $X_j$ does not give a random distribution and has predictive power.


¡ The test itself (assume Y can only take values 0 / 1):
¡ $P(Y=0) \approx \frac{\#\{Y=0 \text{ instances}\}}{\#\{\text{instances}\}}$
¡ Call $D_f$ = number of instances where $X_j = f$
¡ $p_f$ = number of instances where $X_j = f$ and $Y = 0$
¡ $n_f$ = number of instances where $X_j = f$ and $Y = 1$
¡ $E_0 = D_f \cdot P(Y=0)$, $E_1 = D_f \cdot P(Y=1)$
¡ So the Chi Square statistic is:
$$\chi^2 = \sum_{f \in Values(X_j)} \frac{(p_f - E_0)^2}{E_0} + \frac{(n_f - E_1)^2}{E_1}$$
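
A hedged Java sketch of this statistic for a binary class; the parallel-array layout of the per-value counts is a choice made for the example, and the numbers in main match the worked example on the next slides:

public class ChiSquare {
    // pf[f] and nf[f] count the instances with attribute value f and Y = 0 / Y = 1 respectively.
    static double chiSquare(int[] pf, int[] nf) {
        double p = 0, n = 0;
        for (int f = 0; f < pf.length; f++) { p += pf[f]; n += nf[f]; }
        double pY0 = p / (p + n);                 // P(Y = 0) in the node
        double pY1 = n / (p + n);                 // P(Y = 1) in the node
        double chi2 = 0.0;
        for (int f = 0; f < pf.length; f++) {
            double df = pf[f] + nf[f];            // D_f: instances with X_j = f
            double e0 = df * pY0;                 // expected Y = 0 count if the split were random
            double e1 = df * pY1;                 // expected Y = 1 count
            chi2 += Math.pow(pf[f] - e0, 2) / e0 + Math.pow(nf[f] - e1, 2) / e1;
        }
        return chi2;
    }

    public static void main(String[] args) {
        // The worked example below: splitting on X2 in a node with p = 1, n = 5.
        System.out.println(chiSquare(new int[]{1, 0}, new int[]{0, 5}));  // ~6.0
    }
}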


¡ Once we have the Chi Square statistic we use
a chart to check if it is significant or not



¡ Example:

X1  X2  Y  Count
1   1   +  2
1   0   +  2
0   1   -  5
0   0   +  1

[Figure: a tree that splits on X1 at the root (X1=0 / X1=1) and then splits the X1=0 node, which holds D=6 instances with p=1 positive and n=5 negative, on X2 into X2=0 (D_0=1, p_0=1, n_0=0) and X2=1 (D_1=5, p_1=0, n_1=5)]

We test the split on X2:
$$\chi^2 = \sum_{f \in Values(X_2)} \frac{(p_f - E_0)^2}{E_0} + \frac{(n_f - E_1)^2}{E_1} = \frac{(p_0 - E_0)^2}{E_0} + \frac{(n_0 - E_1)^2}{E_1} + \frac{(p_1 - E_0)^2}{E_0} + \frac{(n_1 - E_1)^2}{E_1}$$
$$= \frac{(1 - \tfrac{1}{6})^2}{\tfrac{1}{6}} + \frac{(0 - \tfrac{5}{6})^2}{\tfrac{5}{6}} + \frac{(0 - \tfrac{5}{6})^2}{\tfrac{5}{6}} + \frac{(5 - \tfrac{25}{6})^2}{\tfrac{25}{6}} = \frac{25}{6} + \frac{5}{6} + \frac{5}{6} + \frac{1}{6} = 6$$

Now, we need to look at the chi square chart for the appropriate degrees of freedom (number of attribute values - 1, in the 2-classes case) with 95% confidence, i.e. a 0.05 p-value. Here X2 has 2 values, so there is 1 degree of freedom and the critical value is about 3.84; since 6 > 3.84, the split is significant.


¡ What calculations are needed to find the feature to split the root of the decision tree using Information Gain?
¡ Reminder:
$$Information\_Gain = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|}\,Entropy(S_v)$$
$$Entropy(S) = -\sum_{i=1}^{c} \frac{|S_i|}{|S|}\log\frac{|S_i|}{|S|}$$
§ c – number of classes
§ Values(A) – all the values of the feature A
¡ We need to calculate:
§ Entropy(root)
§ Weighted average of the Entropy according to "Attraction"
§ Weighted average of the Entropy according to "Weather"

Instance  Attraction  Weather  Classification
1         Swim        Hot      -
2         Dance       Hot      +
3         Casino      Hot      +
4         Golf        Hot      -
5         Swim        Mild     -
6         Casino      Mild     -
7         Dance       Mild     +
8         Golf        Mild     -
9         Ski         Mild     +
10        Ski         Cold     +
11        Casino      Cold     -
12        Dance       Cold     -


¡ Entropy(root):
$$Entropy(root) = -\left(\tfrac{7}{12}\log\tfrac{7}{12} + \tfrac{5}{12}\log\tfrac{5}{12}\right)$$

¡ Weighted average of the Entropy according to "Attraction":
$$\sum_{v \in Values(Attraction)} \tfrac{|S_v|}{|S|}\,Entropy(S_v) = -\left(\tfrac{2}{12}\cdot\tfrac{2}{2}\log\tfrac{2}{2} + \tfrac{3}{12}\left(\tfrac{2}{3}\log\tfrac{2}{3} + \tfrac{1}{3}\log\tfrac{1}{3}\right) + \tfrac{3}{12}\left(\tfrac{1}{3}\log\tfrac{1}{3} + \tfrac{2}{3}\log\tfrac{2}{3}\right) + \tfrac{2}{12}\cdot\tfrac{2}{2}\log\tfrac{2}{2} + \tfrac{2}{12}\cdot\tfrac{2}{2}\log\tfrac{2}{2}\right)$$

¡ Weighted average of the Entropy according to "Weather":
$$\sum_{v \in Values(Weather)} \tfrac{|S_v|}{|S|}\,Entropy(S_v) = -\left(\tfrac{4}{12}\left(\tfrac{2}{4}\log\tfrac{2}{4} + \tfrac{2}{4}\log\tfrac{2}{4}\right) + \tfrac{5}{12}\left(\tfrac{2}{5}\log\tfrac{2}{5} + \tfrac{3}{5}\log\tfrac{3}{5}\right) + \tfrac{3}{12}\left(\tfrac{1}{3}\log\tfrac{1}{3} + \tfrac{2}{3}\log\tfrac{2}{3}\right)\right)$$


¡ Put it all together in the Information Gain formula:

$$Information\_Gain(root, Attraction) = -\left(\tfrac{7}{12}\log\tfrac{7}{12} + \tfrac{5}{12}\log\tfrac{5}{12}\right) + \left(\tfrac{2}{12}\cdot\tfrac{2}{2}\log\tfrac{2}{2} + \tfrac{3}{12}\left(\tfrac{2}{3}\log\tfrac{2}{3} + \tfrac{1}{3}\log\tfrac{1}{3}\right) + \tfrac{3}{12}\left(\tfrac{1}{3}\log\tfrac{1}{3} + \tfrac{2}{3}\log\tfrac{2}{3}\right) + \tfrac{2}{12}\cdot\tfrac{2}{2}\log\tfrac{2}{2} + \tfrac{2}{12}\cdot\tfrac{2}{2}\log\tfrac{2}{2}\right)$$

$$Information\_Gain(root, Weather) = -\left(\tfrac{7}{12}\log\tfrac{7}{12} + \tfrac{5}{12}\log\tfrac{5}{12}\right) + \left(\tfrac{4}{12}\left(\tfrac{2}{4}\log\tfrac{2}{4} + \tfrac{2}{4}\log\tfrac{2}{4}\right) + \tfrac{5}{12}\left(\tfrac{2}{5}\log\tfrac{2}{5} + \tfrac{3}{5}\log\tfrac{3}{5}\right) + \tfrac{3}{12}\left(\tfrac{1}{3}\log\tfrac{1}{3} + \tfrac{2}{3}\log\tfrac{2}{3}\right)\right)$$

(Numerically, $Information\_Gain(root, Attraction) \approx 0.52$ while $Information\_Gain(root, Weather) \approx 0.01$, so Attraction would be chosen for the root.)


¡ We need to create an object which represents a tree
¡ Let's try to state all of the properties that we need each node of the tree to have
¡ We need nodes, and we need each node to have children (if a node doesn't have children then it's called a leaf), and we need each node to have a parent, unless it is the root.


public class Node {
    Node[] children;
    Node parent;
}

¡ Now we have a tree: Node node = new Node();
¡ Set children to be nodes and that's it.
¡ For our purposes we need a bit more
¡ We need that if a node is a leaf node then we
know which value to return (i.e. if it’s a leaf
node and we are supposed to output 1 we
should know it).
¡ Also we need to know which attribute is the
splitting attribute



public class Node {
    Node[] children;
    Node parent;
    int attributeIndex;
    double returnValue;
}
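
As a hypothetical usage sketch (not part of the slides), such nodes could be wired by hand into the small tree that computes x_1 AND x_2 from earlier; the indices and return values are chosen only for the example, and the Node class above is assumed to be in the same package:

public class AndTreeDemo {
    public static void main(String[] args) {
        Node root = new Node();
        root.attributeIndex = 0;             // test x_1 at the root

        Node x1Zero = new Node();            // x_1 = 0: always -1
        x1Zero.parent = root;
        x1Zero.returnValue = -1;

        Node x1One = new Node();             // x_1 = 1: still depends on x_2
        x1One.parent = root;
        x1One.attributeIndex = 1;

        Node x2Zero = new Node();            // x_1 = 1, x_2 = 0 -> -1
        x2Zero.parent = x1One;
        x2Zero.returnValue = -1;

        Node x2One = new Node();             // x_1 = 1, x_2 = 1 -> +1
        x2One.parent = x1One;
        x2One.returnValue = 1;

        x1One.children = new Node[]{x2Zero, x2One};
        root.children = new Node[]{x1Zero, x1One};

        System.out.println(root.children.length);   // 2
    }
}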



¡ Implement a Decision Tree
¡ Use Chi square pruning with different p-values:
§ Zero – no pruning
§ 0.95
§ 0.75
§ 0.5
§ 0.25
§ 0.05
¡ You should find the relevant value according to the degrees of freedom in the current node
¡ Report for each p-value the following:
§ Training error
§ Test error
§ Max tree height (according to the test data)
§ Average tree height (according to the test data)