Session 04 - Tree-Based Methods
In this session
• We will learn about tree-based methods and ensemble learning
[Figure: an example decision tree with input variables x1 (Years) and x2 (Hits). The tree splits first on x1 < 4.5 and then on x2 < 117.5, partitioning the predictor space into regions R1, R2 and R3.]
• Examples of rules read off a tree (figure: splits on x2 < −2, x3 < 0 and a cut-point at 100, with leaves labelled "A" and "B"):
2) IF x1 < 4.5 AND x2 < −2 AND x4 ≥ 100 THEN "B"
3) …
Regression trees
• Let's assume we have the following regression problem: Y = f(X1, X2)
• Estimate f using decision trees.
Regression trees
• A tree training algorithm will partition the space following a variety of criteria.
• Different decision tree variants will follow different partitioning criteria.
• The figure is one possible partitioning of the space.
[Figure: a piecewise-constant estimate of Y over the (X1, X2) plane, with cut-points x1A, x1B on X1 and x2A, x2B on X2, and region values y0–y4.]
• In regression trees, the predicted response takes a discrete set of values (one per region).
• In our example: Y ∈ {y0, y1, y2, y3, y4}
• The response for a given test observation will be the mean of the training observations in the region to which that test observation belongs.
Regression trees
[Figure: the partition of the (X1, X2) plane into five regions with values y0–y4, and the equivalent binary tree with splits x1 < x1A, x2 < x2A, x2 < x2B and x1 < x1B.]
• The regions are chosen so as to minimise the RSS:

RSS = \sum_{j=1}^{J} \sum_{i \in R_j} \left( y_i - \hat{y}_{R_j} \right)^2

• where \hat{y}_{R_j} is the mean response for the training observations within the jth box.
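As a concrete illustration of this quantity, here is a minimal sketch (the function name `partition_rss` and the toy data are ours, not from the slides) that computes the RSS of a given partition from the response vector and the indices of each region:

```python
import numpy as np

def partition_rss(y, regions):
    """RSS of a partition: within each region, squared deviations from that
    region's mean response, summed over all regions."""
    return sum(np.sum((y[idx] - y[idx].mean()) ** 2) for idx in regions)

# Toy example with two regions, R1 = observations 0..3 and R2 = observations 4..5.
y = np.array([2.0, 3.0, 2.5, 3.5, 10.0, 12.0])
print(partition_rss(y, [np.arange(4), np.arange(4, 6)]))  # 1.25 + 2.0 = 3.25
```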
• Next, we repeat the process, looking for the best input variable and best cut-point in order to split the data further so as to minimise the RSS within each of the resulting regions.
• However, this time, instead of splitting the entire predictor space, we split one of the two previously identified regions. We now have three regions.
• Again, we look to split one of these three regions further, so as to minimise the RSS. The process continues until a stopping criterion is reached; for instance, we may continue until no region contains more than five observations.
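The following sketch illustrates one greedy step of this process (a simplified illustration assuming NumPy arrays `X` and `y`; the name `best_split` is ours): it scans every input variable and candidate cut-point and keeps the split with the lowest resulting RSS. Repeating the search inside each newly created region grows the tree.

```python
import numpy as np

def best_split(X, y):
    """Greedy search for the (variable, cut-point) pair minimising the RSS
    of the two regions it creates."""
    best = None
    for j in range(X.shape[1]):                 # every input variable
        for s in np.unique(X[:, j]):            # every candidate cut-point
            left, right = y[X[:, j] < s], y[X[:, j] >= s]
            if len(left) == 0 or len(right) == 0:
                continue
            rss = (np.sum((left - left.mean()) ** 2)
                   + np.sum((right - right.mean()) ** 2))
            if best is None or rss < best[0]:
                best = (rss, j, s)
    return best                                  # (RSS, variable index, cut-point)
```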
First split – searching for the best cut-point for Years (x1):
• Years < 11: Salary1 = 464 (R1), Salary2 = 755 (R2)
• Years < 16.5: Salary1 = 532, Salary2 = 613
• Years < 6.5: Salary1 = 365, Salary2 = 742
• Years < 4.5: Salary1 = 226, Salary2 = 697, RSS = 40162637 (*optimum cut-point for Years)
Second split – the Hits (x2) variable is added and the process is repeated (keeping Years < 4.5, so Salary1 = 226 in R1):
• Hits < 125: Salary2 = 494, Salary3 = 963, RSS = 30820324
• Hits < 77: Salary2 = 402, Salary3 = 795, RSS = 35161358
• Hits < 117.5: Salary2 = 465, Salary3 = 949, RSS = 30037022 (*optimum cut-point for Hits)
• For less experienced players, the number of Hits made in the previous year seems to play little role in Salary.
• But among players who have been in the major leagues for five or more years, the number of Hits made in the previous year does affect Salary: players who made more Hits last year tend to have higher salaries.
Overfitting problem
• The process described above may produce good predictions on the training set, but is likely to overfit the data, leading to poor test set performance.
[Figure: RSS versus number of splits, for the training data and the test data.]
Pruning
• A better strategy is to grow a very large tree T0, and then prune it back in order to obtain a subtree.
• Cost complexity pruning: for a given tuning parameter 𝛼, we seek the subtree T that minimises

\sum_{m=1}^{|T|} \sum_{i:\, x_i \in R_m} \left( y_i - \hat{y}_{R_m} \right)^2 + \alpha |T|

• where |T| is the number of terminal nodes of T, so 𝛼 penalises tree size.
• Algorithm:
1. Grow a large tree T0 on the training data, stopping only when each terminal node has fewer than some minimum number of observations.
2. Apply cost complexity pruning to the large tree in order to obtain a sequence of best subtrees, as a function of 𝛼.
3. Choose 𝛼, e.g. using cross-validation.
4. Return the subtree from Step 2 that corresponds to the chosen value of 𝛼.
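A minimal sketch of these four steps with scikit-learn (assuming NumPy arrays `X` and `y` for a regression problem; the helper name `prune_by_cv` is ours):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

def prune_by_cv(X, y, cv=5):
    # Step 1: grow a large tree T0.
    big_tree = DecisionTreeRegressor(min_samples_leaf=5, random_state=0).fit(X, y)
    # Step 2: the sequence of subtrees is indexed by the alphas on the pruning path.
    alphas = big_tree.cost_complexity_pruning_path(X, y).ccp_alphas
    # Step 3: choose alpha by cross-validation (negated MSE, so larger is better).
    scores = [cross_val_score(DecisionTreeRegressor(min_samples_leaf=5, ccp_alpha=a,
                                                    random_state=0),
                              X, y, cv=cv, scoring="neg_mean_squared_error").mean()
              for a in alphas]
    # Step 4: return the subtree corresponding to the chosen alpha.
    best_alpha = alphas[int(np.argmax(scores))]
    return DecisionTreeRegressor(min_samples_leaf=5, ccp_alpha=best_alpha,
                                 random_state=0).fit(X, y)
```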
Classification trees
• Very similar to regression trees.
• But we predict a qualitative response (class) instead.
• Instead of using the mean, the predicted class of an observation will be the most commonly occurring class (mode).
• In the classification setting, RSS cannot be used as a criterion for making the binary splits.
• One option could be the equivalent misclassification rate:

E = \sum_{m=1}^{M} \left( 1 - \max_k \hat{p}_{mk} \right)

• Here \hat{p}_{mk} represents the proportion of training observations in the mth region that are from the kth class.
Gini index
• An alternative to misclassification rate is the Gini index.
• It is defined by

G = \sum_{m=1}^{M} \sum_{k=1}^{K} \hat{p}_{mk} \left( 1 - \hat{p}_{mk} \right)

• It is a measure of total variance across the K classes. The Gini index takes on a small value if all of the \hat{p}_{mk}'s are close to zero or one.
• For this reason the Gini index is referred to as a measure of node purity – a small value indicates that a node contains predominantly observations from a single class.
Information gain
• An alternative to the Gini index is cross-entropy, given by

D = - \sum_{m=1}^{M} \sum_{k=1}^{K} \hat{p}_{mk} \log \hat{p}_{mk}
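The three criteria are easy to compare numerically. Below is a small sketch (our own helper, `node_impurities`) that computes them for a single node from its vector of class counts; summing the per-node values over the M terminal nodes gives E, G and D as defined above.

```python
import numpy as np

def node_impurities(counts):
    """Misclassification rate, Gini index and cross-entropy of one node,
    given the class counts observed in that node."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    error = 1.0 - p.max()
    gini = np.sum(p * (1.0 - p))
    entropy = -np.sum(p[p > 0] * np.log(p[p > 0]))  # 0*log(0) terms contribute 0
    return error, gini, entropy

# Example: a node containing 3 observations of class 1 and 1 of class 2.
print(node_impurities([3, 1]))   # (0.25, 0.375, 0.562...)
```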
• Some people believe that decision trees more closely mirror human decision-making than do other regression and classification approaches.
• Trees can be displayed graphically, and are easily interpreted even by a non-expert (especially if they are small).
• Trees can easily handle qualitative predictors without the need to create dummy variables.
• Unfortunately, trees generally do not have the same level of predictive accuracy as some of the other regression and classification approaches.
• That depends on the data available, data complexity, the number of free model parameters, etc.
• Given a set of n independent observations Z1, …, Zn, each with variance 𝜎², the variance of their mean is 𝜎²/n.
• In other words, averaging a set of observations reduces variance.
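A quick numerical check of this fact (a toy simulation, not from the slides): with 𝜎 = 2 and n = 25, the variance of a single observation is about 4, while the variance of the mean is about 4/25 = 0.16.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(loc=0.0, scale=2.0, size=(10_000, 25))  # 10,000 samples of n = 25 obs
print(z[:, 0].var())         # ~4.0  -> variance of a single observation (sigma^2)
print(z.mean(axis=1).var())  # ~0.16 -> variance of the sample mean (sigma^2 / n)
```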
Bagging
[Diagram: the dataset is resampled into Sample 1, Sample 2, …, Sample N – bootstrap samples of the training data – and a separate model (Model 1, Model 2, …, Model N) is trained on each sample.]
• That is, it takes repeated samples from the training set.
Bagging
• Algorithm for regression:
1. Generate B different bootstrapped training data sets.
2. Train the method on the bth bootstrapped training set in order to get \hat{f}^{*b}(x), the prediction at a point x.
3. The final prediction is the average over all predictions:

\hat{f}_{\text{bag}}(x) = \frac{1}{B} \sum_{b=1}^{B} \hat{f}^{*b}(x)
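A minimal sketch of this algorithm for regression trees (assuming NumPy arrays `X` and `y`; the names `bagged_trees` and `predict_bagged` are ours):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagged_trees(X, y, B=100, random_state=0):
    """Train B regression trees, each on a bootstrap sample of (X, y)."""
    rng = np.random.default_rng(random_state)
    n = len(y)
    trees = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)              # sample n rows with replacement
        trees.append(DecisionTreeRegressor().fit(X[idx], y[idx]))
    return trees

def predict_bagged(trees, X_new):
    """Final prediction = average of the B individual tree predictions."""
    return np.mean([tree.predict(X_new) for tree in trees], axis=0)
```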
Bagging
• For classification – instead of taking the average, the overall predicted value (class) is the majority vote, i.e. the mode of all predictions:

\hat{f}_{\text{bag}}(x) = \operatorname{mode}_{b=1}^{B} \hat{f}^{*b}(x)

• In principle, bagging can be applied to any machine learning method, but it is usually used with decision trees.
Random forests
• Random forests are similar to bagged trees, but with a tweak to decrease the variance even further:
• When building these decision trees, each time a split in a tree is considered, a random selection of m input variables is chosen as split candidates from the full set of p.
• Usually we choose m ≈ √p.
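A sketch using scikit-learn's implementation, where `max_features` plays the role of m (the data arrays `X`, `y` are assumed to exist):

```python
from sklearn.ensemble import RandomForestClassifier

# m ~ sqrt(p) input variables are considered at each split;
# max_features=None would consider all p variables, which is plain bagging.
rf = RandomForestClassifier(n_estimators=500, max_features="sqrt", random_state=0)
# rf.fit(X, y)
```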
• For regression: we record the total amount that the RSS is decreased due to splits over a given input variable, averaged over all trees. A large value indicates an important variable.
• For classification: we add up the total amount that the Gini index (or cross-entropy measure) is decreased by splits over a given input variable, averaged over all trees.
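With a fitted scikit-learn forest these impurity-decrease importances are available directly; a small sketch (assuming a fitted forest `rf`, such as the one above, and a hypothetical list `feature_names`):

```python
# feature_importances_ holds the normalised impurity decrease (RSS or Gini)
# attributable to each input variable, averaged over all trees in the forest.
for name, importance in sorted(zip(feature_names, rf.feature_importances_),
                               key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {importance:.3f}")
```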
Ensemble learning
[Diagram: the dataset gives rise to Dataset 1, Dataset 2, …, Dataset N; a model is trained on each (Model 1, Model 2, …, Model N), producing Prediction 1, Prediction 2, …, Prediction N.]
• Ensemble methods train multiple models to solve the same problem.
• In contrast to ordinary machine learning approaches, which try to learn one hypothesis from the training data, ensemble methods try to construct a set of hypotheses and combine them for use.
[Diagram: stacking – the dataset is fed to Algorithm 1, 2, 3 and 4, producing Predictions 1–4; a meta-learner (combiner) forms the final prediction as a (possibly non-linear) combination of these predictions.]
Stacking – algorithm
Level 0 – Base learners:
[Table: the original dataset is split into Fold 1, Fold 2 and Fold 3 together with the true values (Truth). For each split, two folds are used for training (Tr_split_1) and one for testing (Tst_split_1); the responses of Algorithm 1, Algorithm 2 and Algorithm 3 on the test folds (Resp_1_1 … Resp_3_3) are collected alongside the corresponding true values (Truth_1, Truth_2, Truth_3).]
• New ML problem: the inputs are the algorithm responses on the test sets from level 0; the outputs are the original true values.
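A sketch of this level-0/level-1 scheme (our own function names; the out-of-fold level-0 responses are produced with `cross_val_predict`, and a plain linear model stands in for the meta-learner, although the combination may also be non-linear):

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.linear_model import LinearRegression

def fit_stack(X, y, base_learners, cv=3):
    # Level 0: out-of-fold responses of each base algorithm become the new inputs.
    Z = np.column_stack([cross_val_predict(m, X, y, cv=cv) for m in base_learners])
    # Level 1: the meta-learner maps algorithm responses to the original true values.
    meta = LinearRegression().fit(Z, y)
    # Refit the base learners on all the data for use at prediction time.
    return [m.fit(X, y) for m in base_learners], meta

def predict_stack(base_learners, meta, X_new):
    Z_new = np.column_stack([m.predict(X_new) for m in base_learners])
    return meta.predict(Z_new)
```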
Summary
We learnt about tree-based methods and ensemble learning (classification and regression trees – CART, bagging, random forests).
Exercise
1. Estimate the RSS of a regression tree that was generated from the dataset in the table.
2. Predict Salary for Years = 2, 5, 14.
[Figure: the tree has a single split, x1 (Years) < 11, giving regions R1 (Salary1 = ?) and R2 (Salary2 = ?).]
Solution
• RSS estimation:
Salary1 = (75 + 145 + 155 + 1600 + 600 + 1008.333) / 6 = 597.2
Salary2 = (825 + 733.333) / 2 = 779.2
RSS = (75 − 597.2)² + (145 − 597.2)² + (155 − 597.2)² + (1600 − 597.2)² + (600 − 597.2)² + (1008.333 − 597.2)² + (825 − 779.2)² + (733.333 − 779.2)²
RSS = 1851566
• Predictions:
Salary(Years = 2) = 597.2, Salary(Years = 5) = 597.2, Salary(Years = 14) = 779.2
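The same numbers can be checked with a few lines of NumPy (a verification sketch using the salaries listed above; the small difference from 1851566 comes from the rounded means 597.2 and 779.2 used in the hand calculation):

```python
import numpy as np

r1 = np.array([75, 145, 155, 1600, 600, 1008.333])  # observations with Years < 11
r2 = np.array([825, 733.333])                       # observations with Years >= 11
print(r1.mean(), r2.mean())                         # ~597.2 and ~779.2
rss = np.sum((r1 - r1.mean()) ** 2) + np.sum((r2 - r2.mean()) ** 2)
print(rss)                                          # ~1.85e6
```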
Exercise
• Draw the resulting partitioning of a 2-D feature space according to this binary tree, for x1, x2 > 0.
[Tree: root split x1 < 4. If yes: split x2 < 2; when that is also yes, split x1 < 1 (yes → split x2 < 1 into R1 and R2; no → R3); when x2 ≥ 2 → R4. If x1 ≥ 4: split x2 < 3 (yes → R5, no → R6).]
Solution
[Figure: the tree and the resulting partition of the (x1, x2) plane: R1 = {x1 < 1, x2 < 1}, R2 = {x1 < 1, 1 ≤ x2 < 2}, R3 = {1 ≤ x1 < 4, x2 < 2}, R4 = {x1 < 4, x2 ≥ 2}, R5 = {x1 ≥ 4, x2 < 3}, R6 = {x1 ≥ 4, x2 ≥ 3}.]
Exercise
• We have a 1-D dataset. Two classification trees (A and B) were fitted. The cut-points for x1 were 4.6 and 3.0, respectively.
• The table shows the dataset along with the predicted classes (ŷ_A and ŷ_B) by the two trees.
• Estimate the misclassification rate, Gini index and cross-entropy.

x1     y    ŷ_A   ŷ_B
1.3    c1   c1    c1
2.9    c1   c1    c1
3.1    c1   c1    c2
4.5    c2   c1    c2
4.8    c1   c2    c2
6.1    c2   c2    c2
7.2    c2   c2    c2
8.9    c2   c2    c2
Solution
Tree A (cut-point x1 < 4.6): the node predicting C1 contains (3 c1, 1 c2); the node predicting C2 contains (1 c1, 3 c2).
E_A = (1 − 3/4) + (1 − 3/4) = 0.50
G_A = [3/4 (1 − 3/4) + 1/4 (1 − 1/4)] + [1/4 (1 − 1/4) + 3/4 (1 − 3/4)] = 0.75
D_A = [−3/4 ln(3/4) − 1/4 ln(1/4)] + [−1/4 ln(1/4) − 3/4 ln(3/4)] = 1.12
Tree B (cut-point x1 < 3.0): the node predicting C1 contains (2 c1, 0 c2); the node predicting C2 contains (2 c1, 4 c2).
E_B = (1 − 2/2) + (1 − 4/6) = 0.33
G_B = 0 + [1/3 (1 − 1/3) + 2/3 (1 − 2/3)] = 0.44
D_B = 0 + [−1/3 ln(1/3) − 2/3 ln(2/3)] = 0.64
Notice that the misclassification rate is less sensitive to node purity than the Gini index and cross-entropy: both trees misclassify two of the eight observations, yet the Gini index and cross-entropy clearly reward tree B for its pure C1 node.
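These values can be reproduced with the node-impurity helper sketched in the Gini/cross-entropy section (repeated here so the snippet is self-contained):

```python
import numpy as np

def node_impurities(counts):
    """Misclassification rate, Gini index and cross-entropy of one node."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    return 1 - p.max(), np.sum(p * (1 - p)), -np.sum(p[p > 0] * np.log(p[p > 0]))

# Tree A nodes: (3 c1, 1 c2) and (1 c1, 3 c2); tree B nodes: (2 c1, 0 c2) and (2 c1, 4 c2).
for name, nodes in [("A", [(3, 1), (1, 3)]), ("B", [(2, 0), (2, 4)])]:
    E, G, D = (sum(values) for values in zip(*(node_impurities(c) for c in nodes)))
    print(f"Tree {name}: E = {E:.2f}, G = {G:.2f}, D = {D:.2f}")
# Tree A: E = 0.50, G = 0.75, D = 1.12
# Tree B: E = 0.33, G = 0.44, D = 0.64
```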