
Tree-Based Methods

Reference
• James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning. New York: Springer.
• Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.). New York: Springer.
Example: Construction of Regions

[Figure: Regression tree for the Carseats data (response = Sales) and the corresponding partition of the predictor space into regions R1–R5.]
[Figure: Classification tree for the Carseats data (class = Yes if Sales > 8, otherwise No).]
Introduction
➢ Tree-based methods for regression and classification involve stratifying or segmenting the predictor space into a number of simple regions.
➢ In this formalism, a classification or regression decision tree is used as a predictive model to draw conclusions about a set of observations.
➢ Because the set of splitting rules used to segment the predictor space can be summarized in a tree, these approaches are known as decision tree methods.
➢ The goal of decision tree methods is to create a model that can predict the value of a target variable based on several input variables.
Regression Trees
➢Decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.
➢A regression is a statistical technique that relates a dependent variable to one or more independent (explanatory) variables.
➢A regression tree is built through a process known as binary recursive partitioning.
➢Binary recursive partitioning is an iterative process that splits the data into partitions or branches, and then continues splitting each partition into smaller groups as the method moves down each branch.
➢A regression tree model divides the data into subsets using nodes, branches, and leaves.
Terminology
➢The regions formed after dividing the data points are known as “terminal nodes” or “leaves” of the tree.
➢The points along the tree where the predictor space is split are referred to as “internal nodes”.
➢The segments of the tree that connect the nodes are termed “branches”.
➢Building a regression tree involves the following two steps:
1) Divide the predictor space (the set of possible values for X_1, X_2, ..., X_p) into J distinct and non-overlapping regions R_1, R_2, ..., R_J.
2) For every observation that falls into region R_j, the prediction is simply the mean of the response values for the training observations in R_j.
Construction of regions
➢The regions could in principle have any shape, but high-dimensional rectangles, or boxes, are preferred for simplicity and for ease of interpretation of the resulting predictive model.
➢The regions are created using a recursive binary splitting approach, which is top-down and greedy.
➢The goal is to find boxes R_1, R_2, ..., R_J that minimize the RSS (residual sum of squares):

RSS = \sum_{j=1}^{J} \sum_{i \in R_j} ( y_i - \hat{y}_{R_j} )^2

➢Here ŷ_{R_j} is the mean response of the training observations within the jth box.
➢Observations in the same box are intended to be homogeneous, while observations in different boxes should be as different as possible.
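As a concrete illustration of this criterion, the following sketch computes the RSS for a given assignment of training observations to boxes, using the box mean as the prediction. The function name and toy data are illustrative, not from the slides.

```python
import numpy as np

def partition_rss(y, region_ids):
    """RSS of a candidate partition: each box predicts the mean response of its
    own training observations.

    y          : 1-D array of responses.
    region_ids : 1-D array giving, for each observation, the index of the box R_j
                 it falls into.
    """
    rss = 0.0
    for j in np.unique(region_ids):
        y_j = y[region_ids == j]                 # responses of the observations in box R_j
        rss += np.sum((y_j - y_j.mean()) ** 2)   # contribution of box R_j to the RSS
    return rss

# Toy usage: two boxes, each predicted by its own mean.
y = np.array([1.0, 1.2, 0.9, 5.0, 5.3])
regions = np.array([0, 0, 0, 1, 1])
print(partition_rss(y, regions))
```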
Recursive Binary Splitting
➢We first select the predictor X_j and the cutpoint s such that splitting the predictor space into the regions {X | X_j < s} and {X | X_j ≥ s} leads to the greatest possible reduction in RSS.
➢For any j and s, we define the pair of half-planes

R_1(j, s) = {X | X_j < s} and R_2(j, s) = {X | X_j ≥ s}

➢and we seek the values of j and s that minimize

\sum_{i: x_i \in R_1(j,s)} ( y_i - \hat{y}_{R_1} )^2 + \sum_{i: x_i \in R_2(j,s)} ( y_i - \hat{y}_{R_2} )^2
➢Next, repeat the process, looking for the best predictor and best cutpoint in order
to split the data further so as to minimize the RSS within each of the resulting
regions.
Recursive Binary Splitting
➢However, this time, instead of splitting the entire predictor space, we split one of the two previously identified regions, so that the predictor space is now divided into three regions.
➢Again, look to split one of these three regions further, so as to minimize the RSS.
The process continues until a stopping criterion is reached; for instance, we may
continue until no region contains more than five observations.
➢Predict the response for a given test observation using the mean of the
training observations in the region to which that test observation belongs.
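The greedy search for the best split can be sketched as follows. This is a didactic brute-force implementation with illustrative names and synthetic data, not an optimized one.

```python
import numpy as np

def best_split(X, y):
    """Exhaustively search for the predictor j and cutpoint s that minimize the
    two-region RSS above (didactic brute force, not an optimized implementation)."""
    n, p = X.shape
    best_j, best_s, best_rss = None, None, np.inf
    for j in range(p):
        for s in np.unique(X[:, j]):
            left, right = y[X[:, j] < s], y[X[:, j] >= s]
            if len(left) == 0 or len(right) == 0:
                continue                    # a split must create two non-empty regions
            rss = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
            if rss < best_rss:
                best_j, best_s, best_rss = j, s, rss
    return best_j, best_s, best_rss

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = np.where(X[:, 1] < 0.2, 1.0, 4.0) + rng.normal(scale=0.1, size=50)
print(best_split(X, y))   # should recover predictor 1 and a cutpoint near 0.2
```

Applied recursively to the resulting regions, this search grows the full tree.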
Tree Pruning
➢Recursive binary splitting produces good predictions on the training set, but it is likely to overfit the data, leading to poor test-set performance.
➢A smaller tree with fewer splits (regions) might lead to lower variance and better interpretation at the cost of a little bias.
➢One option is to build the tree only so long as the decrease in RSS due to each split exceeds some threshold. This strategy yields smaller trees but is too short-sighted, since a seemingly worthless split early on may be followed by a very good split later.
➢To overcome these issues, a better strategy is to grow a very large tree T_0 and then prune it back in order to obtain a subtree.
[Figure: Shallow tree obtained after pruning]
Cost complexity pruning
➢It is also known as weakest link pruning.
➢We consider a sequence of trees indexed by a nonnegative tuning parameter α. For each value of α there corresponds a subtree T ⊂ T_0 such that

\sum_{m=1}^{|T|} \sum_{i: x_i \in R_m} ( y_i - \hat{y}_{R_m} )^2 + \alpha |T|

is as small as possible. Here |T| indicates the number of terminal nodes of the tree T, R_m is the region corresponding to the mth terminal node, and ŷ_{R_m} is the mean of the training observations in R_m.
➢The tuning parameter α controls a trade-off between the subtree’s complexity and its fit to the training data.
➢Select an optimal value α̂ using cross-validation. Then return to the full data set and obtain the subtree corresponding to α̂.

Algorithm: Building a Regression Tree
1. Use recursive binary splitting to grow a large tree on the training data, stopping
only when each terminal node has fewer than some minimum number of
observations.
2. Apply cost complexity pruning to the large tree in order to obtain a sequence of
best subtrees, as a function of 𝛼.
3. Use K-fold cross-validation to choose α. For each k = 1, ..., K:
   3.1. Repeat Steps 1 and 2 on all but the kth fold of the training data (i.e., on the (K−1)/K fraction of the observations).
   3.2. Evaluate the mean squared prediction error on the data in the left-out kth fold, as a function of α.
   3.3. Average the results over the folds, and pick α to minimize the average error.
4. Return the subtree from Step 2 that corresponds to the chosen value of α.
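One way to carry out these steps in practice is with scikit-learn's cost-complexity pruning support. This is only an illustrative sketch: the Hitters data used in the slides is not bundled with scikit-learn, so the bundled diabetes data stands in, and the coarse alpha grid is chosen purely for speed.

```python
import numpy as np
from sklearn.datasets import load_diabetes        # stand-in for the Hitters data
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

# Step 1: grow a large tree; Step 2: obtain its cost-complexity pruning path,
# i.e. the sequence of best subtrees indexed by alpha.
path = DecisionTreeRegressor(random_state=0).cost_complexity_pruning_path(X, y)
alphas = path.ccp_alphas[::10]                     # coarse grid, purely for speed

# Step 3: K-fold cross-validation (K = 5) to choose alpha.
cv_mse = [
    -cross_val_score(DecisionTreeRegressor(ccp_alpha=a, random_state=0),
                     X, y, cv=5, scoring="neg_mean_squared_error").mean()
    for a in alphas
]
best_alpha = alphas[int(np.argmin(cv_mse))]

# Step 4: refit on the full data set with the chosen alpha (the pruned subtree).
final_tree = DecisionTreeRegressor(ccp_alpha=best_alpha, random_state=0).fit(X, y)
print(best_alpha, final_tree.get_n_leaves())
```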
[Figure: Regression tree fit to the Hitters data]
Classification Trees
➢A classification tree is used to predict a qualitative response.
➢For a classification tree, we predict that each observation belongs to the most commonly occurring class of training observations in the region to which it belongs.
➢In interpreting a classification tree, we are interested both in the class prediction corresponding to a particular terminal node region and in the class proportions among the training observations that fall into that region.
➢In classification trees, the classification error rate (CER) is used in place of the RSS.
➢The CER is the fraction of the training observations in a region that do not belong to the most common class:

E = 1 - \max_k \hat{p}_{mk}

Here p̂_{mk} represents the proportion of training observations in the mth region that are from the kth class.
[Figure: Example classification tree]
Gini index
➢Classification error is not sufficiently sensitive for tree-growing, so two other measures are preferable:
1) Gini index
2) Cross-entropy
➢The Gini index is a measure of total variance across the K classes, defined by

G = \sum_{k=1}^{K} \hat{p}_{mk} ( 1 - \hat{p}_{mk} )

➢The Gini index takes on a small value if all of the p̂_{mk} are close to zero or one.
➢For this reason the Gini index is referred to as a measure of node purity: a small value indicates that a node contains predominantly observations from a single class.
Cross-entropy
➢The cross-entropy is given by

D = - \sum_{k=1}^{K} \hat{p}_{mk} \log \hat{p}_{mk}

➢Since 0 ≤ p̂_{mk} ≤ 1, the cross-entropy takes a value near zero if the p̂_{mk} are all near zero or one.
➢Therefore the Gini index and the cross-entropy are numerically quite similar; both take a small value if the node is pure.
➢When building and pruning the tree, either the Gini index or the cross-entropy is used to evaluate the quality of each split.
➢The lower the value of the Gini index or cross-entropy, the purer the resulting nodes.
➢The classification error rate is preferable when prediction accuracy on the test set is the goal.
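A small sketch of the three node measures just defined (classification error E, Gini index G, and cross-entropy D), computed from the class counts in a single node; the function name and example counts are illustrative.

```python
import numpy as np

def node_impurities(class_counts):
    """Classification error E, Gini index G, and cross-entropy D for one node,
    given the counts of training observations of each class in that node."""
    p = np.asarray(class_counts, dtype=float)
    p = p / p.sum()                                   # the class proportions p_mk
    error = 1.0 - p.max()                             # E = 1 - max_k p_mk
    gini = np.sum(p * (1.0 - p))                      # G = sum_k p_mk (1 - p_mk)
    entropy = -np.sum(p[p > 0] * np.log(p[p > 0]))    # D = -sum_k p_mk log p_mk
    return error, gini, entropy

print(node_impurities([45, 5]))   # nearly pure node: all three measures are small
print(node_impurities([25, 25]))  # maximally impure two-class node
```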
[Figure: Misclassification error versus tree size, for cost-complexity pruning of a classification tree on the Carseats data]
Advantages and Disadvantages of Trees
➢Advantages:
1. Trees are very easy to explain to people.
2. Some people believe that decision trees more closely mirror human decision-making than do other regression and classification approaches.
3. Trees can be displayed graphically, and are easily interpreted even by a non-expert
(especially if they are small).
4. Trees can easily handle qualitative predictors without the need to create dummy
variables.
➢Disadvantages:
1. Unfortunately, trees generally do not have the same level of predictive accuracy as
some of the other regression and classification approaches.
2. Trees can have high variance: a small change in the data can result in a very different tree.
➢To improve the predictive performance of trees, bagging, random forests, and boosting methods can be used.
Bagging (Breiman 1996)
➢Bootstrap aggregation, or bagging, is a general-purpose procedure for reducing the variance of a statistical learning method. It is frequently used in the context of decision trees.
➢For a given set of n independent observations Z_1, ..., Z_n, each with variance σ², the variance of the mean Z̄ of the observations is σ²/n. So averaging a set of observations reduces variance. This is not directly practical, because we generally do not have access to multiple training sets.
➢Instead, we can bootstrap, by taking repeated samples from the (single) training data set.
➢In this approach we generate B different bootstrapped training data sets, then train our method on the bth bootstrapped training set in order to get f̂^{*b}(x), the prediction at a point x. We then average all the predictions to obtain

\hat{f}_{\mathrm{bag}}(x) = \frac{1}{B} \sum_{b=1}^{B} \hat{f}^{*b}(x)

This is called bagging.
➢Bagging provides improvements in accuracy by combining a large number of trees into a single procedure.
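A minimal sketch of bagging for regression trees, assuming scikit-learn's DecisionTreeRegressor as the base learner and synthetic data; in practice sklearn.ensemble.BaggingRegressor or RandomForestRegressor would normally be used instead of a hand-rolled loop.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def bagged_predict(X_train, y_train, X_test, B=100):
    """Average the predictions of B regression trees, each grown on a bootstrap sample."""
    n = len(y_train)
    preds = np.zeros((B, len(X_test)))
    for b in range(B):
        idx = rng.integers(0, n, size=n)       # bootstrap sample of the training data
        tree = DecisionTreeRegressor().fit(X_train[idx], y_train[idx])
        preds[b] = tree.predict(X_test)        # f_hat^{*b}(x) for every test point
    return preds.mean(axis=0)                  # f_hat_bag(x) = (1/B) * sum_b f_hat^{*b}(x)

X = rng.normal(size=(200, 5))
y = X[:, 0] ** 2 + rng.normal(scale=0.3, size=200)
print(bagged_predict(X[:150], y[:150], X[150:])[:5])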
Bagging for classification tree
➢For classification trees, the prediction for each test observation is made as follows (see the sketch below):
1) record the class predicted by each of the B trees, and take a majority vote;
2) the overall prediction is the most commonly occurring class among the B predictions.
➢The value of B should be sufficiently large that the error has settled down.
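The majority vote in steps 1)–2) can be sketched as follows; majority_vote is a hypothetical helper, not a library function.

```python
import numpy as np

def majority_vote(class_predictions):
    """class_predictions has shape (B, n_test): one row of predicted labels per tree.
    The bagged prediction for each test observation is the most common class."""
    preds = np.asarray(class_predictions)
    winners = []
    for column in preds.T:                         # one test observation at a time
        labels, counts = np.unique(column, return_counts=True)
        winners.append(labels[np.argmax(counts)])  # most frequently predicted class
    return np.array(winners)

# Three trees, two test observations: the vote gives ["No", "Yes"].
print(majority_vote([["No", "Yes"], ["No", "No"], ["Yes", "Yes"]]))
```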
Out-of-Bag Error Estimation
➢On average, each bagged tree uses about two-thirds of the observations; the remaining one-third are referred to as the out-of-bag (OOB) observations.
➢The prediction for an OOB observation is obtained by averaging the responses (for regression) or taking a majority vote (for classification) among the roughly B/3 trees for which that observation was out of bag.
➢The OOB approach for estimating the test error is particularly convenient when performing bagging on large data sets, for which cross-validation would be computationally costly.
➢Bagging improves prediction accuracy at the expense of interpretability.
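A sketch of OOB error estimation under the same hand-rolled bagging setup as above (synthetic data, scikit-learn trees): for each observation, predictions are averaged only over the trees whose bootstrap sample did not include it.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 2 * X[:, 0] - X[:, 2] + rng.normal(scale=0.5, size=300)

B, n = 200, len(y)
oob_sum = np.zeros(n)      # running sum of OOB predictions for each observation
oob_count = np.zeros(n)    # number of trees for which the observation was out of bag

for b in range(B):
    idx = rng.integers(0, n, size=n)                  # bootstrap sample (with replacement)
    tree = DecisionTreeRegressor().fit(X[idx], y[idx])
    oob = np.setdiff1d(np.arange(n), idx)             # the ~n/3 observations left out
    oob_sum[oob] += tree.predict(X[oob])
    oob_count[oob] += 1

seen = oob_count > 0                                  # observations OOB at least once
oob_pred = oob_sum[seen] / oob_count[seen]            # average over the ~B/3 OOB trees
print("OOB estimate of test MSE:", np.mean((y[seen] - oob_pred) ** 2))
```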
[Figure: Bagging and random forests applied to the Heart data]
Random Forest (Breiman 1999)
➢Random forests (RF) are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest.
➢Random forests provide an improvement over bagged trees by way of a small tweak that decorrelates the trees. This reduces the variance when we average the trees.
➢In a random forest, the number of predictors m considered at each split is approximately equal to the square root of the total number of predictors p (m ≈ √p).
➢The algorithm is therefore not allowed to consider a majority of the available predictors at each split in the tree.
➢On average, (p − m)/p of the splits will not even consider the strong predictor, which leads to a substantial reduction in variance over a single tree. This process is known as decorrelating the trees.
➢In RF, if m = p at each split, then this is simply bagging.
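With scikit-learn, the m ≈ √p rule corresponds to max_features="sqrt", and setting max_features=None (i.e., m = p) recovers bagging. The bundled breast-cancer data below merely stands in for the Heart and Boston examples referenced in the slides.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)   # stand-in data set, p = 30 predictors

# Random forest: consider only m ~ sqrt(p) randomly chosen predictors at each split.
rf = RandomForestClassifier(n_estimators=200, max_features="sqrt", random_state=0)

# Setting m = p (all predictors available at every split) reduces the forest to bagging.
bag = RandomForestClassifier(n_estimators=200, max_features=None, random_state=0)

print("random forest accuracy:", cross_val_score(rf, X, y, cv=5).mean())
print("bagging accuracy:      ", cross_val_score(bag, X, y, cv=5).mean())
```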
[Figure: Test error of a random forest on the Boston data]
[Figure: Boosting test error as a function of the number of trees]
Boosting
➢Boosting is similar to bagging, except that in bagging the trees are built independently of one another, whereas in boosting the trees are built sequentially, using information from previously grown trees.
➢Unlike fitting a single large tree to the data (fitting the data hard, and potentially overfitting), the boosting approach instead learns slowly.
➢In boosting, we fit a decision tree to the residuals from the current model, and then add this new decision tree into the fitted function in order to update the residuals.
➢Each of these trees can be rather small, with just a few terminal nodes, determined by the parameter d in the algorithm.
➢By fitting small trees to the residuals, we slowly improve f̂ in areas where it does not perform well. The shrinkage parameter λ slows the process down even further, allowing more and differently shaped trees to attack the residuals.
Algorithm: Boosting for Regression Trees
1. Set f̂(x) = 0 and r_i = y_i for all i in the training set.
2. For b = 1, 2, ..., B, repeat:
   a) Fit a tree f̂^b with d splits (d + 1 terminal nodes) to the training data (X, r).
   b) Update f̂ by adding in a shrunken version of the new tree:
      f̂(x) ← f̂(x) + λ f̂^b(x)
   c) Update the residuals:
      r_i ← r_i − λ f̂^b(x_i)
3. Output the boosted model:

   \hat{f}(x) = \sum_{b=1}^{B} \lambda \hat{f}^{b}(x)
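A direct, didactic translation of the algorithm above into Python; the function names, the synthetic data, and the default values of B, d, and λ are illustrative choices, not prescribed by the slides.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost_regression_trees(X, y, B=500, d=1, lam=0.01):
    """Fit B small trees sequentially to the residuals, shrinking each by lam."""
    trees = []
    r = y.astype(float).copy()                      # 1. f_hat(x) = 0, r_i = y_i
    for b in range(B):                              # 2. for b = 1, ..., B
        tree = DecisionTreeRegressor(max_leaf_nodes=d + 1)   # d splits -> d + 1 leaves
        tree.fit(X, r)                              # (a) fit f_hat^b to (X, r)
        r -= lam * tree.predict(X)                  # (c) update the residuals
        trees.append(tree)                          # (b) keep lam * f_hat^b in the model
    return trees

def boosted_predict(trees, X, lam=0.01):
    return lam * sum(t.predict(X) for t in trees)   # 3. f_hat(x) = sum_b lam * f_hat^b(x)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=300)
trees = boost_regression_trees(X, y)
print("training MSE:", np.mean((y - boosted_predict(trees, X)) ** 2))
```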
[Figure: Boosting applied to the Boston data]
Boosting Parameters
➢Boosting has three tuning parameters (see the sketch below):
1) The number of trees B: boosting can overfit if B is too large; cross-validation is used to select B.
2) The shrinkage parameter λ: a small positive number (typically 0.01 or 0.001), also known as the boosting learning rate.
3) The number of splits d in each tree, which controls the complexity of the boosted ensemble. Often d = 1 works well, in which case each tree is a stump, consisting of a single split.
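In scikit-learn, these three parameters map onto n_estimators (B), learning_rate (λ), and max_depth (d) of GradientBoostingRegressor, which implements a closely related gradient-boosting procedure. The sketch below uses the bundled diabetes data as a stand-in for the Boston data, and a small grid purely for illustration.

```python
from sklearn.datasets import load_diabetes            # stand-in for the Boston data
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

X, y = load_diabetes(return_X_y=True)

# B -> n_estimators, lambda -> learning_rate, d -> max_depth (d = 1 gives stumps).
grid = GridSearchCV(
    GradientBoostingRegressor(max_depth=1, random_state=0),
    param_grid={"n_estimators": [100, 500, 1000],
                "learning_rate": [0.001, 0.01, 0.1]},
    cv=5,
    scoring="neg_mean_squared_error",
)
grid.fit(X, y)
print(grid.best_params_)    # cross-validation selects B and the learning rate
```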
AdaBoost.M1 (Freund and Schapire, 1997)
• AdaBoost.M1 sequentially applies a weak classification algorithm to repeatedly modified versions of the data, thereby producing a sequence of weak classifiers G_m(x), m = 1, 2, ..., M.
• The predictions from all of them are then combined through a weighted majority vote to produce the final prediction:

G(x) = \mathrm{sign}\left[ \sum_{m=1}^{M} \alpha_m G_m(x) \right]

• Here α_1, α_2, ..., α_M are computed by the boosting algorithm and weight the contribution of each respective G_m(x).
• At step m, those observations that were misclassified by the classifier G_{m−1}(x) induced at the previous step have their weights increased.
• The weights are decreased for those that were classified correctly.
• Thus, as the iterations proceed, observations that are difficult to classify correctly receive ever-increasing influence.
AdaBoost.M1 (Discrete AdaBoost for two classes)

1. Initialize the observation weights w_i = 1/N, i = 1, 2, ..., N.
2. For m = 1 to M:
   (a) Fit a classifier G_m(x) to the training data using the weights w_i.
   (b) Compute the weighted error

       err_m = \frac{\sum_{i=1}^{N} w_i \, I(y_i \neq G_m(x_i))}{\sum_{i=1}^{N} w_i}

   (c) Compute α_m = log((1 − err_m)/err_m).
   (d) Set w_i ← w_i · exp[α_m · I(y_i ≠ G_m(x_i))] for i = 1, 2, ..., N.
3. Output G(x) = \mathrm{sign}\left[ \sum_{m=1}^{M} \alpha_m G_m(x) \right].

*Observations misclassified by G_m(x) have their weights scaled by a factor exp(α_m), increasing their relative influence in the next iteration.
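A compact sketch of discrete AdaBoost.M1 for labels in {−1, +1}, using a decision stump as the weak classifier G_m; the error-clipping guard and the synthetic data are my own additions for robustness and illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_m1(X, y, M=100):
    """Discrete AdaBoost.M1 for labels y in {-1, +1}, with stumps as G_m."""
    n = len(y)
    w = np.full(n, 1.0 / n)                       # 1. initialize weights w_i = 1/N
    stumps, alphas = [], []
    for m in range(M):                            # 2. for m = 1, ..., M
        g = DecisionTreeClassifier(max_depth=1)
        g.fit(X, y, sample_weight=w)              # (a) fit G_m using the weights w_i
        miss = g.predict(X) != y                  # indicator I(y_i != G_m(x_i))
        err = np.clip(np.sum(w * miss) / np.sum(w), 1e-10, 1 - 1e-10)   # (b)
        alpha = np.log((1 - err) / err)           # (c) alpha_m
        w = w * np.exp(alpha * miss)              # (d) up-weight the misclassified points
        stumps.append(g)
        alphas.append(alpha)
    return stumps, np.array(alphas)

def adaboost_predict(stumps, alphas, X):
    votes = sum(a * g.predict(X) for a, g in zip(alphas, stumps))
    return np.sign(votes)                         # 3. G(x) = sign[sum_m alpha_m G_m(x)]

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = np.where(X[:, 0] + X[:, 1] ** 2 > 1.0, 1, -1)     # nonlinear, additive boundary
stumps, alphas = adaboost_m1(X, y)
print("training accuracy:", np.mean(adaboost_predict(stumps, alphas, X) == y))
```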
