Module 3 - Ensemble Learning

Random Forest

(Basic, Advanced Concepts and Its Applications)

Vinh Dinh Nguyen


PhD in Computer Science
2
Outline
➢ Decision Tree: Review

➢ Random Forest

➢ Fill in missing data with Random Forest

➢ Time Series vs. Supervised Learning

➢ Example

➢ Summary
3
Decision Tree Review

Example: when a new vaccine is released, we want to predict how effective it is (%) for each dosage given to a patient.

Unit (dosage) | Effect (%)
10            | 98
20            | 0
35            | 100
5             | 44
…             | …

[Figure: a patient receives 5 units of vaccine; vaccine effectiveness: 44%.]
4
Decision Tree Review
[Figure: scatter plot of vaccine effectiveness (%) versus dosage (units), with split thresholds at 14.5, 23.5, and 29 units, and the fitted regression tree: Unit < 14.5 → predict 4.2; 14.5 ≤ Unit < 23.5 → predict 100; 23.5 ≤ Unit < 29 → predict 52.8; Unit ≥ 29 → predict 2.5.]
5
Decision Tree Review
[Figure: the same fitted tree (thresholds 14.5, 23.5, 29) shown against both the training data and the test data; the fully grown tree follows the training observations closely but produces noticeable errors on the test data.]
6
Decision Tree Review
[Figure: the two leaves under the "Unit < 23.5" split are deleted (pruned), and the pruned tree is evaluated on the training and test data.]

Note: if we want to prune the tree further, we could remove the last two leaves and replace their split with a single leaf whose value is the average of all of its observations.
7
Decision Tree Review
[Figure: the pruned tree with thresholds at 14.5 and 29 units: Unit < 14.5 → predict 4.2; 14.5 ≤ Unit < 29 → predict 73.8 (the average of the merged leaves); Unit ≥ 29 → predict 2.5. The pruned tree is shown against the training and test data and its error.]
8
Tree Complexity Penalty

The tree complexity penalty compensates for the difference in the number of leaves between candidate subtrees:

Tree Score = sum of squared residuals (SSR) + αT

α (alpha) is a tuning parameter that we find using cross-validation.
T is the total number of terminal nodes (leaves) in the tree.

For now, let α = 10,000 and calculate the tree score for each candidate tree.
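As an illustrative sketch (not from the slides), the tree score can be computed for every subtree along scikit-learn's cost-complexity pruning path; the dosage/effect values below are assumed toy data.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[5], [10], [20], [35]])   # dosage in units (assumed toy values)
y = np.array([44, 98, 0, 100])          # effectiveness in %

full_tree = DecisionTreeRegressor(random_state=0).fit(X, y)
path = full_tree.cost_complexity_pruning_path(X, y)   # candidate pruning alphas

alpha = 10_000
for ccp_alpha in path.ccp_alphas:
    tree = DecisionTreeRegressor(random_state=0, ccp_alpha=ccp_alpha).fit(X, y)
    ssr = np.sum((y - tree.predict(X)) ** 2)           # sum of squared residuals
    T = tree.get_n_leaves()                            # number of terminal nodes
    print(f"leaves={T}  tree score = SSR + alpha*T = {ssr + alpha * T:.1f}")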
9
How to Select α

            α = 0     α = 10,000   α = 15,000   α = 20,000
Split 1     …         …            …            …
Split 2     …         …            …            …
Split 3     …         …            …            …
Split 4     …         …            …            …
Split 5     …         …            …            …
Average     50,000    5,000        11,000       30,000

In this case, the trees built with α = 10,000 had, on average, the lowest sum of squared residuals on the held-out splits, so α = 10,000 is our
final value.
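A minimal sketch of selecting α by cross-validation with scikit-learn (the grid of α values and the synthetic data are assumptions, not taken from the slides):

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Assumed synthetic data standing in for the dosage/effectiveness table.
rng = np.random.default_rng(0)
X = rng.uniform(0, 40, size=(200, 1))                # dosage in units
y = 100 * np.exp(-0.5 * ((X[:, 0] - 20) / 6) ** 2)   # bell-shaped effectiveness
y = y + rng.normal(0, 5, size=200)                   # plus noise

# Try several complexity penalties (scikit-learn calls the penalty ccp_alpha)
# and keep the one with the lowest cross-validated error, mirroring the table above.
search = GridSearchCV(DecisionTreeRegressor(random_state=0),
                      {"ccp_alpha": [0.0, 1.0, 5.0, 10.0, 50.0]},
                      scoring="neg_mean_squared_error", cv=5)
search.fit(X, y)
print("selected alpha:", search.best_params_["ccp_alpha"])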
10
Decision Tree Review

Advantages:
✓ Very easy to explain (do you think it is easier to understand than linear regression?)
✓ More closely mirror human decision-making (what do you think about this?)
✓ Can easily handle qualitative predictors without the need to create dummy variables (what is a dummy variable?)

Disadvantages:
✓ Do not have the same level of predictive accuracy as some other regression and classification methods
✓ Small changes in the data can cause a large change in the estimated tree
✓ Are less effective when the main goal is to predict the outcome of a continuous variable

https://twitter.com/gsutters/status/1281001812577976329
11
Dummy Variable

https://twitter.com/gsutters/status/1281001812577976329
12
Bias-Variance Trade-off
[Figure: three fits of the same data, in the context of weak learners: high bias, low variance (underfitting); low bias, low variance (just right); high variance, low bias (overfitting).]
13
Bias-Variance Trade-off
[Figure: height vs. weight. The real dataset follows a curved true relationship; we need to develop an ML algorithm to capture it. The data are split into a training set and a testing set.]
14
Bias-Variance Trade-off
Linear Regression will never capture the true relationship between weight and height. The inability of a machine learning method to capture the true relationship is called bias; Linear Regression's SSR on the training data is >> 0, so it has a high bias.

Polynomial Regression can capture the true relationship between weight and height; its SSR on the training data is ≈ 0, so Polynomial Regression has a low bias.
15
Bias-Variance Trade-off
Linear Regression has a high bias because it cannot capture the true relationship, but it has low variance because its SSR is very similar across different datasets. The difference in fits between datasets is called variance.

Polynomial Regression has a low bias because it fits the training data almost perfectly, but it has high variance because its SSR differs hugely between the training dataset and the test dataset (SSR >> 0 on the test data).
16
Bias-Variance Trade-off
[Diagram: the DATASET is split into a training set and a test set and fed to our model (Linear Regression). A low SSR on the training set indicates low bias and a high SSR indicates high bias; a low SSR on the test set indicates low variance and a high SSR indicates high variance.]
17 Prediction errors (Bias and Variance)

Bias can be viewed as the error rate on the training data; the difference in fits between datasets is called variance.
18 Random Forest: Motivation

You want to buy a perfume for your girlfriend. What would you do? Ask for ideas!

[Figure: you consult five sources. Friends 1 and 2 say "Go with Chanel!"; friends 3 and 4 say "Get Gucci!"; source 5 is an online search for Gucci.]
19 Random Forest: Motivation

You want to buy a perfume for your girlfriend. What would you do?

[Figure: the same five sources, two suggesting Chanel and three suggesting Gucci, so you decide: "Just buy Gucci!" The majority suggestion wins.]
21 Random Forest: Motivation

ENSEMBLE LEARNING
22 What is Ensemble Learning?
23 Homogeneous Approach
24 Heterogeneous Approach
25 Ensemble Learning Techniques

Ensemble Learning (commonly used in AI competitions)

• Bagging: homogeneous weak learners (e.g., Random Forest)
• Boosting: homogeneous weak learners
• Stacking: heterogeneous weak learners
26 Bagging-based Method

Random Forest

ENSEMBLE LEARNING
27 Boosting-Based Method
28 Stacking-Based Method
29 Decision Tree vs Random Forest

https://commons.wikimedia.org/wiki/File:Decision_Tree_vs._Random_Forest.png
30
Outline
➢ Decision Tree: Review

➢ Random Forest

➢ Fill in missing data with Random Forest

➢ Time series Data

➢ Example

➢ Summary
31 Random Forest is a Solution
32 Steps to Build a Random Forest

CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
NO         | NO                     | NO               | 125    | NO
YES        | YES                    | YES              | 180    | YES
YES        | YES                    | NO               | 210    | NO
YES        | NO                     | YES              | 167    | YES


33 1st Step: Create a New Dataset
Original DATA:
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
NO         | NO                     | NO               | 125    | NO
YES        | YES                    | YES              | 180    | YES
YES        | YES                    | NO               | 210    | NO
YES        | NO                     | YES              | 167    | YES

New DATA (rows selected at random, with replacement, from the original dataset):
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
YES        | YES                    | YES              | 180    | YES
NO         | NO                     | NO               | 125    | NO
YES        | NO                     | YES              | 167    | YES
YES        | NO                     | YES              | 167    | YES
34 1st Step: Create a New Dataset
Original DATA:
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
NO         | NO                     | NO               | 125    | NO
YES        | YES                    | YES              | 180    | YES
YES        | YES                    | NO               | 210    | NO
YES        | NO                     | YES              | 167    | YES

Bootstrapped Dataset (rows selected at random, with replacement, from the original dataset):
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
YES        | YES                    | YES              | 180    | YES
NO         | NO                     | NO               | 125    | NO
YES        | NO                     | YES              | 167    | YES
YES        | NO                     | YES              | 167    | YES
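A minimal sketch of this bootstrapping step with pandas (the column names follow the table above; the random seed is an assumption):

import pandas as pd

original = pd.DataFrame({
    "chest_pain":       ["NO", "YES", "YES", "YES"],
    "good_circulation": ["NO", "YES", "YES", "NO"],
    "blocked_arteries": ["NO", "YES", "NO", "YES"],
    "weight":           [125, 180, 210, 167],
    "heart_disease":    ["NO", "YES", "NO", "YES"],
})

# Sample the same number of rows WITH replacement: some rows can appear more
# than once, and some may not appear at all (those become out-of-bag samples).
bootstrapped = original.sample(n=len(original), replace=True, random_state=1)
print(bootstrapped)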
35 2nd Step: Generate a Decision Tree from the Bootstrapped Dataset
GENERATE A DECISION TREE FROM THE BOOTSTRAPPED DATASET, BUT AT EACH SPLIT CONSIDER ONLY A RANDOM SUBSET OF 2 ATTRIBUTES (2 COLUMNS).

Bootstrapped dataset:
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
YES        | YES                    | YES              | 180    | YES
NO         | NO                     | NO               | 125    | NO
YES        | NO                     | YES              | 167    | YES
YES        | NO                     | YES              | 167    | YES

A traditional tree considers every column at every split; a tree built with this predefined condition considers only the randomly chosen subset.
36

Randomly select 2 features (columns). Suppose GOOD BLOOD CIRCULATION gives the best split, so it becomes the root node; its two child nodes (???) are still to be determined.

Bootstrapped dataset:
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
YES        | YES                    | YES              | 180    | YES
NO         | NO                     | NO               | 125    | NO
YES        | NO                     | YES              | 167    | YES
YES        | NO                     | YES              | 167    | YES

For the next split, randomly select 2 features (columns) again, after removing GOOD BLOOD CIRCULATION from the candidate features.
37

Randomly select 2 features (columns). Suppose CHEST PAIN gives the best split, so it becomes the next node under GOOD BLOOD:

GOOD BLOOD → Chest Pain → ??? / ???

Then randomly select 2 features again, after removing CHEST PAIN from the candidate features.

Bootstrapped dataset:
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
YES        | YES                    | YES              | 180    | YES
NO         | NO                     | NO               | 125    | NO
YES        | NO                     | YES              | 167    | YES
YES        | NO                     | YES              | 167    | YES
38

Randomly select 2 features (columns). Suppose WEIGHT gives the best split and is added to the tree; then remove WEIGHT from the candidate features for the remaining splits.

GOOD BLOOD → Chest Pain → Weight → ???  (remaining candidate: Blocked Arteries)

Bootstrapped dataset:
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
YES        | YES                    | YES              | 180    | YES
NO         | NO                     | NO               | 125    | NO
YES        | NO                     | YES              | 167    | YES
YES        | NO                     | YES              | 167    | YES
39 1st Decision Tree
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
YES        | YES                    | YES              | 180    | YES
NO         | NO                     | NO               | 125    | NO
YES        | NO                     | YES              | 167    | YES
YES        | NO                     | YES              | 167    | YES
40 Create N Trees

Generate: 1st bootstrapped dataset → 1st tree; 2nd bootstrapped dataset → 2nd tree; …; nth bootstrapped dataset → nth tree.
41 Create N Trees

Generate: 1st bootstrapped dataset → 1st tree; 2nd bootstrapped dataset → 2nd tree; …; nth bootstrapped dataset → nth tree. Together, these n trees form the Random Forest.
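A minimal sketch of the whole procedure (bootstrap each tree, try only a random feature subset at each split, and take a majority vote) using scikit-learn; the 0/1 encoding and hyperparameter values are assumptions:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

data = pd.DataFrame({
    "chest_pain":       [0, 1, 1, 1],
    "good_circulation": [0, 1, 1, 0],
    "blocked_arteries": [0, 1, 0, 1],
    "weight":           [125, 180, 210, 167],
    "heart_disease":    [0, 1, 0, 1],     # 1 = YES, 0 = NO
})
X, y = data.drop(columns="heart_disease"), data["heart_disease"]

# n_estimators = number of bootstrapped datasets/trees (bootstrap=True by default),
# max_features = size of the random feature subset considered at each split.
forest = RandomForestClassifier(n_estimators=100, max_features=2, random_state=0)
forest.fit(X, y)

# New patient: chest pain = No, good circulation = No, blocked arteries = No, weight = 125.
new_patient = pd.DataFrame([[0, 0, 0, 125]], columns=X.columns)
print(forest.predict(new_patient))     # each tree votes; the majority wins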
42 How to Predict New Sample

NEW PATIENT:
Chest Pain: No | Good Blood Circulation: No | Blocked Arteries: No | Weight: 125

"Could I have the disease?" (we want to predict Heart Disease for this patient)
43 Bagging Technique

Dataset

Bootstrapped
Dataset
44 How to Predict New Sample

Run the new patient (Chest Pain: No, Good Blood Circulation: No, Blocked Arteries: No, Weight: 125) through every tree, from the 1st to the nth, and record each tree's prediction for Heart Disease.

Votes: Yes = 7, No = 2, so the forest predicts Heart Disease = Yes ("Unfortunately, you have the disease!").
45 Review
ORIGINAL DATA

BOOTSTRAPPED DATASET
RANDOMLY SELECT DATA

ALLOW DUPLICATED VALUES


46 Review

Original Dataset:
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
NO         | NO                     | NO               | 125    | NO
YES        | YES                    | YES              | 180    | YES
YES        | YES                    | NO               | 210    | NO
YES        | NO                     | YES              | 167    | YES

Bootstrapped Dataset:
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
YES        | YES                    | YES              | 180    | YES
NO         | NO                     | NO               | 125    | NO
YES        | NO                     | YES              | 167    | YES
YES        | NO                     | YES              | 167    | YES

Part of the original dataset may not appear in the bootstrapped dataset at all.
47 Out-of-bag Dataset

OUT-OF-BAG ERROR

We can use the out-of-bag samples (the rows that were never drawn into a tree's bootstrapped dataset) to measure the accuracy of the Random Forest.
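A minimal sketch of measuring out-of-bag accuracy with scikit-learn (reusing X and y from the forest sketch above; in practice you would use a much larger dataset):

from sklearn.ensemble import RandomForestClassifier

# oob_score=True evaluates every sample only on the trees that never saw it
# (its out-of-bag trees) and aggregates the results into one accuracy estimate.
forest = RandomForestClassifier(n_estimators=100, max_features=2,
                                bootstrap=True, oob_score=True, random_state=0)
forest.fit(X, y)
print("out-of-bag accuracy:", forest.oob_score_)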
48
Outline
➢ Decision Tree: Review

➢ Random Forest

➢ Fill in missing data with Random Forest

➢ Time series Data

➢ Example

➢ Summary
49 Random Forest with Missing Data
50 Types of Missing Data

CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
NO         | NO                     | NO               | 125    | NO
YES        | YES                    | YES              | 180    | YES
YES        | YES                    | NO               | 210    | NO
YES        | NO                     | N/A              | N/A    | NO

Missing values can be text (categorical) or numeric.
51 How to Fill in Missing Data

DATA WITH MISSING VALUES → GUESS THE MISSING VALUES → REFINE THE GUESSES
52 Guessing the Data
CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES    | WEIGHT                | HEART DISEASE
NO         | NO                     | NO                  | 125                   | NO
YES        | YES                    | YES                 | 180                   | YES
YES        | YES                    | NO                  | 210                   | NO
YES        | NO                     | No (initial guess)  | 167.5 (initial guess) | NO

Idea: fill in an initial guess, then gradually refine it so that it gets better and better.
53 Refine the Guesses

BUILD A RANDOM FOREST → RUN ALL OF THE DATA THROUGH ALL OF THE TREES
54 Proximity Matrix

In the 1st tree, sample 3 and sample 4 reach the same decision (both rows return "No"), so we add 1 to their shared entry in the proximity matrix.

Proximity matrix (each row and each column represents one sample):
     1  2  3  4
1
2
3             1
4          1


55 Proximity Matrix

In the 2nd tree, sample 3 and sample 4 again reach the same decision, so their shared entry grows to 2; any other pair of samples that lands in the same leaf gets 1 added to its entry.
56 Proximity Matrix Of N Trees

Proximity matrix accumulated over all N trees:

     1  2  3  4
1       2  1  1
2    2     1  1
3    1  1     8
4    1  1  8
57 Proximity Matrix Of N Trees

Normalization: assume we have 10 trees, so divide every entry by 10:

     1    2    3    4
1         0.2  0.1  0.1
2    0.2       0.1  0.1
3    0.1  0.1       0.8
4    0.1  0.1  0.8
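A minimal sketch (not from the slides) of computing such a proximity matrix from a fitted scikit-learn forest: forest.apply(X) returns the leaf index each sample lands in for every tree, and the proximity of two samples is the fraction of trees in which they share a leaf.

import numpy as np

def proximity_matrix(forest, X):
    leaves = forest.apply(X)                 # shape: (n_samples, n_trees)
    n = leaves.shape[0]
    prox = np.zeros((n, n))
    for tree_leaves in leaves.T:             # one column per tree
        prox += np.equal.outer(tree_leaves, tree_leaves)
    prox /= leaves.shape[1]                  # normalize by the number of trees
    np.fill_diagonal(prox, 0.0)
    return prox

# Example: prox = proximity_matrix(forest, X)   # forest, X from the earlier sketches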
58 Fill in the Missing Values

For the missing BLOCKED ARTERIES value of sample 4:

Frequency of "Yes" among the other samples: 1/3
Weight for "Yes" = (proximity of the samples that answered "Yes") / (all of sample 4's proximities)
                 = 0.1 / (0.1 + 0.1 + 0.8) = 0.1 / 1.0 = 0.1
Weighted frequency of "Yes" = Frequency of "Yes" * Weight for "Yes" = 1/3 * 0.1 ≈ 0.03

Proximity matrix:
     1    2    3    4
1         0.2  0.1  0.1
2    0.2       0.1  0.1
3    0.1  0.1       0.8
4    0.1  0.1  0.8
59 Fill in the Missing Values

Frequency of "No" among the other samples: 2/3
Weight for "No" = (proximity of the samples that answered "No") / (all of sample 4's proximities)
                = (0.1 + 0.8) / (0.1 + 0.1 + 0.8) = 0.9 / 1.0 = 0.9
Weighted frequency of "No" = Frequency of "No" * Weight for "No" = 2/3 * 0.9 = 0.6

Proximity matrix:
     1    2    3    4
1         0.2  0.1  0.1
2    0.2       0.1  0.1
3    0.1  0.1       0.8
4    0.1  0.1  0.8
60 Fill in the Missing Values

Since the weighted frequency of "No" (2/3 * 0.9 = 0.6) is larger than the weighted frequency of "Yes" (1/3 * 0.1 ≈ 0.03), we fill in the missing BLOCKED ARTERIES value with "No".
61 Fill in the Missing Values

For the missing WEIGHT value of sample 4, compute a proximity-weighted average of the other samples' weights.

Sample 1's weight = 125
Sample 1's proximity weight = 0.1 / (0.1 + 0.1 + 0.8) = 0.1
Contribution of sample 1 = 125 * 0.1 = 12.5
62 Fill in the Missing Values

Sample 2's weight = 180
Sample 2's proximity weight = 0.1 / (0.1 + 0.1 + 0.8) = 0.1
Contribution of sample 2 = 180 * 0.1 = 18.0
63 Fill in the Missing Values

Sample 3's weight = 210
Sample 3's proximity weight = 0.8 / (0.1 + 0.1 + 0.8) = 0.8
Contribution of sample 3 = 210 * 0.8 = 168.0
64 Fill in the Missing Values
Summation:
Contribution of sample 1 = 125 * 0.1 = 12.5
Contribution of sample 2 = 180 * 0.1 = 18.0
Contribution of sample 3 = 210 * 0.8 = 168.0
Filled-in WEIGHT = 12.5 + 18.0 + 168.0 = 198.5
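A minimal sketch of the two weighted-average calculations above (the values are taken from the proximity matrix on the previous slides):

import numpy as np

prox_to_s4 = np.array([0.1, 0.1, 0.8])   # proximities of samples 1, 2, 3 to sample 4
weights    = np.array([125, 180, 210])   # known WEIGHT values of samples 1, 2, 3
blocked    = np.array([0, 1, 0])         # BLOCKED ARTERIES: 1 = Yes, 0 = No

w = prox_to_s4 / prox_to_s4.sum()        # proximity weights: [0.1, 0.1, 0.8]

# Numeric column: proximity-weighted average of the observed values.
print("imputed weight:", np.dot(w, weights))           # 12.5 + 18.0 + 168.0 = 198.5

# Categorical column: weighted frequency of each category; pick the larger one.
freq_yes, freq_no = blocked.mean(), 1 - blocked.mean()         # 1/3 and 2/3
score_yes = freq_yes * w[blocked == 1].sum()                   # 1/3 * 0.1 ≈ 0.03
score_no  = freq_no  * w[blocked == 0].sum()                   # 2/3 * 0.9 = 0.6
print("imputed blocked arteries:", "Yes" if score_yes > score_no else "No")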
65
Outline
➢ Decision Tree: Review

➢ Random Forest

➢ Fill in missing data with Random Forest

➢ Time Series vs. Supervised Learning

➢ Example

➢ Summary
66 Time Series vs Supervised Learning
Time series data is a collection of data points over time.
67 Time Series vs Supervised Learning
A time series is a sequence of numbers A supervised learning problem is comprised of input
that are ordered by a time index. This can patterns (X) and output patterns (y), such that an algorithm
be thought of as a list or column of can learn how to predict the output patterns from the input
ordered values. patterns.
68 Time Series vs Supervised Learning
Time series data can be phrased as supervised learning

Time Series data Supervised learning

Sliding Window For Time Series Data

Sliding Window With Multivariate


Time Series Data
69 Time Series vs Supervised Learning
A key function to help transform time series data into a supervised learning problem is the Pandas shift() function.

Time Series data Supervised learning

Sliding Window For Time Series Data

Sliding Window With Multivariate


Time Series Data
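A minimal sketch of this sliding-window transformation with the Pandas shift() function (the series values are assumed toy data):

import pandas as pd

series = pd.DataFrame({"t": [10, 20, 30, 40, 50]})

# shift(1) pushes the values down one step, so each row pairs the previous
# observation (input X = t-1) with the current observation (output y = t).
supervised = pd.DataFrame({
    "X (t-1)": series["t"].shift(1),
    "y (t)":   series["t"],
}).dropna()          # the first row has no previous value, so drop it
print(supervised)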
70 Time Series vs Supervised Learning
A key function to help transform time series data into a supervised learning problem is the Pandas shift() function.
71 Time Series vs Supervised Learning
A key function to help transform time series data into a supervised learning problem is the Pandas shift() function.
72 Time Series vs Supervised Learning
A key function to help transform time series data into a supervised learning problem is the Pandas shift() function.
73 Time Series vs Supervised Learning
One-Step Univariate Forecasting
74 Time Series vs Supervised Learning
Multi-Step or Sequence Forecasting
75 Time Series vs Supervised Learning
Multivariate Forecasting
76
Outline
➢ Decision Tree: Review

➢ Random Forest

➢ Fill in missing data with Random Forest

➢ Time Series vs. Supervised Learning

➢ Example

➢ Summary
77 Example
The daily female births dataset, i.e., the number of female births recorded each day across three years.

We will use only the previous six time steps as input to the model.
78 Example
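A minimal sketch of framing this as supervised learning and fitting a Random Forest (the CSV URL and column name follow the commonly used public copy of the dataset and are assumptions):

import pandas as pd
from sklearn.ensemble import RandomForestRegressor

url = ("https://raw.githubusercontent.com/jbrownlee/Datasets/master/"
       "daily-total-female-births.csv")
births = pd.read_csv(url)["Births"]

# Build the supervised table: six lag columns as inputs, the current value as target.
cols = [births.shift(i).rename(f"t-{i}") for i in range(6, 0, -1)] + [births.rename("t")]
frame = pd.concat(cols, axis=1).dropna()
X, y = frame.iloc[:, :-1], frame.iloc[:, -1]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:-30], y[:-30])                      # hold out the last 30 days
print("one-step forecasts:", model.predict(X[-30:])[:5])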
79
k-Fold cross-validation
1. Split the dataset into k equal (if possible) parts, called folds.

2. Choose k − 1 folds as the training set; the remaining fold is the test set.

3. Train the model on the training set. On each iteration of cross-validation, train a new model, independently of the model trained on the previous iteration.

4. Validate the model on the test set.

5. Save the result of the validation.

6. Repeat steps 2–5 k times, each time using a different fold as the test set. In the end, the model has been validated on every fold.

7. To get the final score, average the results obtained in step 5 (see the sketch below).
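A minimal sketch of k-fold cross-validation with scikit-learn (the model and k = 5 are assumptions):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# cross_val_score performs steps 2-6: for each of the k folds it trains a fresh
# model on the other folds and evaluates it on the held-out fold.
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=0))
print("per-fold accuracy:", scores)
print("final score (average):", scores.mean())    # step 7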
80 Time Series Cross Validation
With time series data, we cannot shuffle the data!
Rolling Window

Expanding Window
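A minimal sketch of expanding-window cross-validation with scikit-learn's TimeSeriesSplit (a rolling window can be obtained by also passing max_train_size); the series length is an assumed toy value:

import numpy as np
from sklearn.model_selection import TimeSeriesSplit

t = np.arange(12)                       # 12 time steps of toy data

# Each split trains only on the past and tests on the block that follows,
# so the data are never shuffled across time.
for train_idx, test_idx in TimeSeriesSplit(n_splits=3).split(t):
    print("train:", train_idx, "-> test:", test_idx)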
81 Time Series Cross Validation
With time series data, we cannot shuffle the data!
82 Time Series Cross Validation
83
Outline
➢ Decision Tree: Review

➢ Random Forest

➢ Fill in missing data with Random Forest

➢ Time Series vs. Supervised Learning

➢ Example

➢ Summary
84

AdaBoost & Gradient Boost
(Basic, Advanced Concepts and Its Applications)

Vinh Dinh Nguyen


PhD in Computer Science
2
Outline
➢ Boosting Techniques

➢ AdaBoost Clearly Explain

➢ Gradient Boost Clearly Explain

➢ Time Series Data: Predicting Energy Consumption

➢ Summary
3
Decision Tree and Its Variance

Example: when a new vaccine is released, we want to predict how effective it is (%) for each dosage given to a patient.

Unit (dosage) | Effect (%)
10            | 98
20            | 0
35            | 100
5             | 44
…             | …

[Figure: a patient receives 5 units of vaccine; vaccine effectiveness: 44%.]
4
Decision Tree and Its Variance
5 What is an Ensemble Learning?
6 Homogeneous Approach
7 Heterogeneous Approach
8 Ensemble Learning Techniques

Ensemble Learning (commonly used in AI competitions)

• Bagging: homogeneous weak learners
• Boosting: homogeneous weak learners
• Stacking: heterogeneous weak learners
9 Bagging-based Method

Random
Forest

Last Week
10 Boosting-Based Method
11 Stacking-Based Method
12 Stacking-Based Method
13 Boosting Technique

Boosting is an ensemble modelling technique that attempts to build a strong classifier from a number of weak classifiers.

https://en.wikipedia.org/wiki/Boosting_%28machine_learning%29#/media/File:Ensemble_Boosting.svg
14 Boosting Technique

Boosting is an ensemble modelling technique that attempts to build a strong classifier from a number of weak classifiers.

https://en.wikipedia.org/wiki/Boosting_%28machine_learning%29#/media/File:Ensemble_Boosting.svg
15 Stump Definition

A tree consisting of just one node with two leaves is known as a stump.
16
Outline
➢ Boosting Techniques

➢ AdaBoost Clearly Explain

➢ Gradient Boost Clearly Explain

➢ Time Series Data: Predicting Energy Consumption

➢ Summary
17 AdaBoost: Forest of Stump
18 AdaBoost: Forest of Stump
19 Sample Dataset

CHEST PAIN | GOOD BLOOD CIRCULATION | BLOCKED ARTERIES | WEIGHT | HEART DISEASE
NO         | NO                     | NO               | 125    | NO
YES        | YES                    | YES              | 180    | YES
YES        | YES                    | NO               | 210    | NO
YES        | NO                     | YES              | 167    | YES


20 Sample Dataset

STRONG CLASSIFIER

WEAK CLASSIFIER
21 Random Forest
• Each tree in the random forest has an equal vote (weight) on the final decision.
22 AdaBoost: FOREST OF STUMP

• Stumps are not equally weighted in the final decision.

• Stumps that make more errors contribute less to the final decision.
23 Random Forest

Trees are created independently of one another.
24 AdaBoost: Forest of Stump
[Figure: stumps are built in order (1 → 2 → 3 → 4); the errors of each stump influence how the next stump is built.]
25 Differences Between RF and AdaBoost

A weak learner is a ___                         AdaBoost combines a lot of ___

Stumps have different ___ on the final result   Each stump is created by considering ___
26 Heart Disease Dataset

Chest Pain Blocked Arteries Patient Weight Heart Disease


Yes Yes 205 Yes
No Yes 180 Yes
Yes No 210 Yes
Yes Yes 167 Yes
No Yes 156 No
No Yes 125 No
Yes No 168 No
Yes Yes 172 No

Importance of each sample = sample weight = 1 / (number of samples) = 1/8


27 1st Stump in the Tree
28 Compute Gini Index For Chest Pain

Gini index = 5/8 * (1 – (3/5)^2 – (2/5)^2) + 3/8 * (1 – (1/3)^2 – (2/3)^2) ≈ 0.47

CHEST PAIN = Yes → Heart Disease: 3 Yes, 2 No
CHEST PAIN = No  → Heart Disease: 1 Yes, 2 No
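A minimal sketch (not from the slides) of the weighted Gini computation used on these slides:

def gini(yes, no):
    total = yes + no
    return 1 - (yes / total) ** 2 - (no / total) ** 2

def weighted_gini(left, right):
    n = sum(left) + sum(right)
    return sum(left) / n * gini(*left) + sum(right) / n * gini(*right)

print(weighted_gini((3, 2), (1, 2)))   # Chest Pain        -> ~0.47
print(weighted_gini((3, 3), (1, 1)))   # Blocked Arteries  -> 0.5
print(weighted_gini((3, 1), (1, 3)))   # Weight > 170      -> 0.375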
29 Gini Index for Blocked Arteries

Gini index = 6/8 * (1 – (3/6)^2 – (3/6)^2) + 2/8 * (1 – (1/2)^2 – (1/2)^2) = 0.5

BLOCKED ARTERIES = Yes → Heart Disease: 3 Yes, 3 No
BLOCKED ARTERIES = No  → Heart Disease: 1 Yes, 1 No
30 Gini Index for Patient Weight

Gini index = 4/8 * (1 – (1/4)^2 – (3/4)^2) + 4/8 * (1 – (1/4)^2 – (3/4)^2) = 0.375

(Candidate thresholds from adjacent sorted weights: 140.5, 146.5, 161.5, 170, 188.5, 192.5, 195; the best split is PATIENT WEIGHT > 170.)

PATIENT WEIGHT > 170 = Yes → Heart Disease: 3 Yes, 1 No
PATIENT WEIGHT > 170 = No  → Heart Disease: 1 Yes, 3 No
31 Amount of Say

How much does this stump contribute to the final decision (classification)?

PATIENT WEIGHT > 170 = Yes → Heart Disease: 3 Yes, 1 No
PATIENT WEIGHT > 170 = No  → Heart Disease: 1 Yes, 3 No

We quantify this with the stump's Amount of Say.
32 Amount of Say: Patient Weight

• Total Error = the sum of the sample weights of the incorrectly classified samples = 1/8 + 1/8 = 2/8
• Amount of Say = 1/2 * log((1 – Total Error) / Total Error) = 1/2 * log((1 – 2/8) / (2/8)) ≈ 0.55  (natural log)
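A minimal sketch of the amount-of-say calculation for the three candidate stumps (natural log assumed, matching the numbers on these slides):

import numpy as np

def amount_of_say(total_error, eps=1e-10):
    total_error = np.clip(total_error, eps, 1 - eps)   # avoid division by zero
    return 0.5 * np.log((1 - total_error) / total_error)

print(amount_of_say(2 / 8))   # Patient Weight > 170 -> ~0.55
print(amount_of_say(3 / 8))   # Chest Pain           -> ~0.255
print(amount_of_say(4 / 8))   # Blocked Arteries     -> 0.0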
33 Amount of Say: Weight of The Tree

Probability vs. Odds

Your friend went fishing 10 times in a month:
• Caught a fish 4 times
• Failed to catch a fish 6 times
What are the probability and the odds of getting a fish for lunch?

Probability = (times a fish was caught) / (total attempts) = 4/10 = 0.4

Odds = (times a fish was caught) / (times no fish was caught) = 4/6 ≈ 0.67

Odds = (probability of catching a fish) / (probability of not catching a fish) = (4/10) / (6/10) ≈ 0.67

Odds = (1 – probability of not catching a fish) / (probability of not catching a fish) = (4/10) / (6/10) ≈ 0.67
34 Amount of Say: Weight of The Tree

PATIENT WEIGHT > 170 = Yes → Heart Disease: 3 Yes, 1 No
PATIENT WEIGHT > 170 = No  → Heart Disease: 1 Yes, 3 No

Odds = (1 – probability of an incorrect prediction) / (probability of an incorrect prediction)

Should the Amount of Say be the Odds themselves, or Amount of Say = 1/2 * log(Odds)?

Why?
35 Amount of Say: Chest Pain

• Total Error = the sum of the sample weights of the incorrectly classified samples = 3/8
• Amount of Say = 1/2 * log((1 – 3/8) / (3/8)) ≈ 0.25

log(Odds) = log((1 – probability of an incorrect prediction) / (probability of an incorrect prediction))
36 Amount of Say: Blocked Arteries

• Total Error = the sum of the sample weights of the incorrectly classified samples = 4/8
• Amount of Say = 1/2 * log((1 – 4/8) / (4/8)) = 0
37 Assumptions

Known: the weights of the misclassified samples are used to compute the Amount of Say of the current stump.

Unknown: how do we use the weights of these misclassified samples to build the next stump so that it corrects these wrong predictions?
38 Idea: Improved Bootstrapped Dataset

[Figure: the samples the current stump classified incorrectly are emphasized, and a new dataset is created in which they appear more often, so that the next stump can handle the incorrect classifications.]
39 How to Build Next Stump

Increase the sample weights of the samples that were incorrectly classified and decrease the sample weights of the samples that were correctly classified. (Labels in {-1, 1}.)

New sample weight (incorrect) = old weight * e^(Amount of Say) = 1/8 * e^0.55 ≈ 0.22
40 How to Build Next Stump

Increase the sample weights of the samples that were incorrectly classified and decrease the sample weights of the samples that were correctly classified. (Labels in {-1, 1}.)

New sample weight (correct) = old weight * e^(-Amount of Say) = 1/8 * e^(-0.55) ≈ 0.07
41 How to Build Next Stump

Alternative with labels in {0, 1}: increase the sample weights of the incorrectly classified samples and keep the sample weights of the correctly classified samples unchanged.

New sample weight (incorrect) = 1/8 * e^(0.55 * 1) ≈ 0.22
42 How to Build Next Stump

With labels in {0, 1}, the correctly classified samples keep their weights unchanged.

New sample weight (correct) = 1/8 * e^(0.55 * 0) = 0.125
43 New Sample Weight

Chest Pain | Blocked Arteries | Patient Weight | Heart Disease | Sample Weight | New Weight | Normalized Weight
Yes        | Yes              | 205            | Yes           | 1/8           | 0.07       | 0.08
No         | Yes              | 180            | Yes           | 1/8           | 0.07       | 0.08
Yes        | No               | 210            | Yes           | 1/8           | 0.07       | 0.08
Yes        | Yes              | 167            | Yes           | 1/8           | 0.22       | 0.25
No         | Yes              | 156            | No            | 1/8           | 0.07       | 0.08
No         | Yes              | 125            | No            | 1/8           | 0.07       | 0.08
Yes        | No               | 168            | No            | 1/8           | 0.07       | 0.08
Yes        | Yes              | 172            | No            | 1/8           | 0.22       | 0.25
Sum        |                  |                |               | ~1.0          | 0.86       | ~1.0

Update: divide each new weight by the sum (0.86) so that the normalized weights again add up to 1.
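A minimal sketch of this weight update and normalization (0.55 is the amount of say of the Patient Weight stump; the two samples it misclassified are flagged below):

import numpy as np

n = 8
amount_of_say = 0.55
misclassified = np.array([0, 0, 0, 1, 0, 0, 0, 1])       # 1 where the stump was wrong

weights = np.full(n, 1 / n)
weights *= np.exp(np.where(misclassified == 1, amount_of_say, -amount_of_say))
print(np.round(weights, 2))        # [0.07 0.07 0.07 0.22 0.07 0.07 0.07 0.22]

weights /= weights.sum()           # normalize so the weights sum to 1
print(np.round(weights, 2))        # [0.08 0.08 0.08 0.25 0.08 0.08 0.08 0.25]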
44 New Sample Weight

Chest Pain | Blocked Arteries | Patient Weight | Heart Disease | New Weight
Yes        | Yes              | 205            | Yes           | 0.08
No         | Yes              | 180            | Yes           | 0.08
Yes        | No               | 210            | Yes           | 0.08
Yes        | Yes              | 167            | Yes           | 0.25
No         | Yes              | 156            | No            | 0.08
No         | Yes              | 125            | No            | 0.08
Yes        | No               | 168            | No            | 0.08
Yes        | Yes              | 172            | No            | 0.25
Sum        |                  |                |               | ~1.0
45 AdaBoost: FOREST OF STUMPS

[Figure: stumps are built sequentially; each new stump tries to improve on the errors of the previous ones.]
46 New Dataset
47 New Dataset

Random [0,1] to select samples


48 New Dataset

Chest Pain | Blocked Arteries | Patient Weight | Heart Disease | Normalized Weight | Range
Yes        | Yes              | 205            | Yes           | 0.08              | [0, 0.08]
No         | Yes              | 180            | Yes           | 0.08              | (0.08, 0.16]
Yes        | No               | 210            | Yes           | 0.08              | (0.16, 0.24]
Yes        | Yes              | 167            | Yes           | 0.25              | (0.24, 0.495]
No         | Yes              | 156            | No            | 0.08              | (0.495, 0.575]
No         | Yes              | 125            | No            | 0.08              | (0.575, 0.655]
Yes        | No               | 168            | No            | 0.08              | (0.655, 0.735]
Yes        | Yes              | 172            | No            | 0.25              | (0.735, 1.0]
Sum        |                  |                |               | ~1.0              |
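A minimal sketch of drawing the new dataset: repeatedly pick a random number in [0, 1] and take the row whose cumulative-weight range contains it, which is equivalent to sampling with replacement using the normalized weights as probabilities.

import numpy as np

rng = np.random.default_rng(0)
norm_weights = np.array([0.08, 0.08, 0.08, 0.25, 0.08, 0.08, 0.08, 0.25])
norm_weights = norm_weights / norm_weights.sum()     # make them sum exactly to 1

# Draw 8 row indices; heavily weighted (misclassified) rows are picked more often.
indices = rng.choice(len(norm_weights), size=8, replace=True, p=norm_weights)
print(indices)          # rows 3 and 7 tend to appear several times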
49 New Dataset

Pick random numbers in [0, 1] to select samples from the old dataset into the new dataset, then CONTINUE TO BUILD THE NEXT STUMP on the new dataset.

Idea: samples that were classified incorrectly are included in the new dataset more often.
50 How to Classify The Final Result

Group the stumps by their prediction: these stumps predict "heart disease", those stumps predict "no heart disease". The final classification is the group whose total Amount of Say is larger.
51 How to Classify The Final Result

https://hastie.su.domains/Papers/samm
52 How to Classify The Final Result

SAMME — Stagewise
Additive Modeling using a
Multi-class Exponential
loss function

https://hastie.su.domains/Papers/samme.pdf
53 Behind The Scene

Setup: M trees (stumps), N samples, labels in {-1, 1}.

The hypothesis function is an additive model: f(x) = Σₘ αₘ Gₘ(x).

Exponential loss function: L(y, f(x)) = exp(−y · f(x)).

Supposing that f(x) = f_{m−1}(x) + αₘ Gₘ(x), expand the error function:
E = Σᵢ exp(−yᵢ f_{m−1}(xᵢ)) · exp(−yᵢ αₘ Gₘ(xᵢ)) = Σᵢ wᵢ⁽ᵐ⁾ · exp(−yᵢ αₘ Gₘ(xᵢ)),
where wᵢ⁽ᵐ⁾ = exp(−yᵢ f_{m−1}(xᵢ)) is not dependent on αₘ and Gₘ.
54 Behind The Scene

Setup: M trees, N samples, labels in {-1, 1}. Expand the error function:
E = Σᵢ wᵢ⁽ᵐ⁾ · exp(−yᵢ αₘ Gₘ(xᵢ)).

It is easy to show that the exponent −yᵢ αₘ Gₘ(xᵢ) equals −αₘ if yᵢ = Gₘ(xᵢ) and +αₘ if yᵢ ≠ Gₘ(xᵢ). Splitting the sum accordingly gives
E = e^(−αₘ) · Σ_{yᵢ = Gₘ(xᵢ)} wᵢ⁽ᵐ⁾ + e^(αₘ) · Σ_{yᵢ ≠ Gₘ(xᵢ)} wᵢ⁽ᵐ⁾.

Define the total error weight Ew = Σ_{yᵢ ≠ Gₘ(xᵢ)} wᵢ⁽ᵐ⁾ and the total weight Tw = Σᵢ wᵢ⁽ᵐ⁾; then
E = e^(−αₘ) (Tw − Ew) + e^(αₘ) Ew.
55 Behind The Scene

Minimizing E with respect to αₘ (setting dE/dαₘ = 0) gives e^(2αₘ) = (Tw − Ew) / Ew. Taking the log on both sides, we have

αₘ = 1/2 * log((Tw − Ew) / Ew) = 1/2 * log((1 − errₘ) / errₘ),

which is exactly the Amount of Say formula used above (errₘ = Ew / Tw is the weighted total error).
56 Summary

AdaBoost builds each new stump based on the errors made by the previous stumps, improving the errors step by step.
57 Example: Spam Classification

• Classifying Email as Spam or Non-Spam

http://archive.ics.uci.edu/dataset/94/spambase
58 Example: Spam Classification
59 Example: Spam Classification
Our implementation

Using Sklearn library
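The slide's code is not reproduced here; a minimal sketch with scikit-learn's AdaBoostClassifier on the Spambase data might look like the following (the OpenML dataset name and the hyperparameters are assumptions):

from sklearn.datasets import fetch_openml
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = fetch_openml("spambase", version=1, return_X_y=True, as_frame=False)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# By default each weak learner is a depth-1 decision tree, i.e. a stump.
model = AdaBoostClassifier(n_estimators=200, learning_rate=0.5, random_state=0)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))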


60
Outline
➢ Boosting Techniques

➢ AdaBoost Clearly Explain

➢ Gradient Boost Clearly Explain

➢ Time Series Data: Predicting Energy Consumption

➢ Summary
61 Gradient Boost For Regression

Height Favorite Color Gender Weight


1.6 Blue Male 88
1.6 Green Female 76
1.5 Blue Female 56
1.8 Red Male 73
1.5 Green Male 77
1.4 Blue Female 57

Input Output
62 Tree-based Gradient Boost
• Step 1: Build 1st tree
• Calculate the average of weights
Height Favorite Color Gender Weight
1.6 Blue Male 88
1.6 Green Female 76
1.5 Blue Female 56
1.8 Red Male 73
1.5 Green Male 77
1.4 Blue Female 57

The 1st tree is a single node (leaf) that predicts the average of the weights: 71.17


63 Tree-based Gradient Boost
• Step 2:
➢Build 2nd tree

Average of weights: 71.17


64 Tree-based Gradient Boost
• Step 2:
➢Build 2nd tree

Average of weights: 71.2

Height | Favorite Color | Gender | Weight | Residual Error
1.6    | Blue           | Male   | 88     | 16.8
1.6    | Green          | Female | 76     | 4.8
1.5    | Blue           | Female | 56     | -15.2
1.8    | Red            | Male   | 73     | 1.8
1.5    | Green          | Male   | 77     | 5.8
1.4    | Blue           | Female | 57     | -14.2

(Residual Error = observed Weight − predicted 71.2)
65 Tree-based Gradient Boost

Question: why do we build a tree that predicts the Residual Error?

Height | Favorite Color | Gender | Residual Error
1.6    | Blue           | Male   | 16.8
1.6    | Green          | Female | 4.8
1.5    | Blue           | Female | -15.2
1.8    | Red            | Male   | 1.8
1.5    | Green          | Male   | 5.8
1.4    | Blue           | Female | -14.2

2nd tree (fit to the residuals):
Gender is Female? → yes: Height < 1.6? → yes: {-14.2, -15.2}; no: {4.8}
                  → no:  Color is not Blue? → yes: {1.8, 5.8}; no: {16.8}


66 Tree-based Gradient Boost

Each leaf of the 2nd tree is replaced by the average of its residuals:

Gender is Female? → yes: Height < 1.6? → yes: -14.7 (average of -14.2 and -15.2); no: 4.8
                  → no:  Color is not Blue? → yes: 3.8 (average of 1.8 and 5.8); no: 16.8


67 Tree-based Gradient Boost

1st tree: a single leaf predicting the average of the weights, 71.2.

2nd tree (fit to the residuals):
Gender is Female? → yes: Height < 1.6? → yes: -14.7; no: 4.8
                  → no:  Color is not Blue? → yes: 3.8; no: 16.8


68 Prediction
Height | Favorite Color | Gender | Weight | Prediction (1st tree + 2nd tree)
1.6    | Blue           | Male   | 88     | 88
1.6    | Green          | Female | 76     | 76
1.5    | Blue           | Female | 56     | 56
1.8    | Red            | Male   | 73     | 73
1.5    | Green          | Male   | 77     | 77
1.4    | Blue           | Female | 57     | 57

If we simply add the residual tree at full strength, the model reproduces the training data almost exactly: OVERFITTING.
69 Prediction

To avoid overfitting, the contribution of each new tree is scaled by a Learning Rate:
Prediction = 1st tree (average of weights, 71.2) + Learning Rate × 2nd tree + …
70 Prediction

Height | Favorite Color | Gender | Weight | Prediction
1.6    | Blue           | Male   | 88     | 74.56

With a learning rate of 0.2: Prediction = 71.2 + 0.2 * 16.8 = 74.56
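A minimal from-scratch sketch of this procedure (initial prediction = mean, then repeatedly fit a small tree to the residuals and add it scaled by the learning rate); the one-hot encoding of the categorical columns and the number of rounds are assumptions:

import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

data = pd.DataFrame({
    "height": [1.6, 1.6, 1.5, 1.8, 1.5, 1.4],
    "color":  ["Blue", "Green", "Blue", "Red", "Green", "Blue"],
    "gender": ["Male", "Female", "Female", "Male", "Male", "Female"],
    "weight": [88, 76, 56, 73, 77, 57],
})
X = pd.get_dummies(data.drop(columns="weight"))
y = data["weight"].to_numpy()

learning_rate = 0.2
prediction = np.full(len(y), y.mean())          # 1st "tree": the average, 71.17

for m in range(50):                             # build 50 small residual trees
    residuals = y - prediction
    tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, residuals)
    prediction += learning_rate * tree.predict(X)

print(np.round(prediction, 1))                  # gradually approaches the targets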
71 Tree-based Gradient Boost
• Step 3: build the 3rd tree (the ensemble so far: 1st tree = average of weights 71.2, plus the scaled 2nd tree).

Height | Favorite Color | Gender | Weight | Predicted Weight | Residual Error
1.6    | Blue           | Male   | 88     | 74.56            | 13.44
1.6    | Green          | Female | 76     | …                | …
1.5    | Blue           | Female | 56     | …                | …
1.8    | Red            | Male   | 73     | …                | …
1.5    | Green          | Male   | 77     | …                | …
1.4    | Blue           | Female | 57     | …                | …
72 Tree-based Gradient Boost
• Step 3: build the 3rd tree from the residuals of the current ensemble (1st tree = average of weights 71.2, plus the scaled 2nd tree).

Height | Favorite Color | Gender | 1st Tree Residual | 2nd Tree Residual | 3rd Tree Residual
1.6    | Blue           | Male   | 16.8              | 13.44             | ???
1.6    | Green          | Female | 4.8               | …                 | ???
1.5    | Blue           | Female | -15.2             | …                 | ???
1.8    | Red            | Male   | 1.8               | …                 | ???
1.5    | Green          | Male   | 5.8               | …                 | ???
1.4    | Blue           | Female | -14.2             | …                 | ???
73 Prediction

AVG of weights: 71.2


74 Gradient Boost: Behind The Scenes

Training data: {(xᵢ, yᵢ)}, i = 1…n

Height | Favorite Color | Gender | Weight
1.6    | Blue           | Male   | 88
1.6    | Green          | Female | 76
1.5    | Blue           | Female | 56
1.8    | Red            | Male   | 73
1.5    | Green          | Male   | 77
1.4    | Blue           | Female | 57

Loss function: L(yᵢ, F(x)) = 1/2 * (Observed − Predicted)^2

dL/dPredicted = 2/2 * (Observed − Predicted) * (−1) = −(Observed − Predicted)

(The 1/2 factor is the "tricky implementation" detail: it makes the negative gradient equal exactly the residual, Observed − Predicted.)
75 Gradient Boost: Behind The Scenes

• Step 1: Initialize the model with a constant value:
  F₀(x) = argmin_δ Σᵢ L(yᵢ, δ)

For the first three samples:
SSR(δ) = 1/2 * {(88 − δ)^2 + (76 − δ)^2 + (56 − δ)^2}
dSSR/dδ = −(88 − δ) − (76 − δ) − (56 − δ) = 0
δ = (88 + 76 + 56) / 3 = 73.3 = the average of the samples' weights
76 Gradient Boost: Behind The Scenes

• Step 1: Initialize the model with a constant value:
  F₀(x) = argmin_δ Σᵢ L(yᵢ, δ) = 73.3

[Figure: SSR plotted as a function of δ; the minimum is at δ = 73.3.]
77 Gradient Boost: Behind The Scenes

• M is the number of trees


• n is the number of
samples
78 Gradient Boost: Behind The Scenes

r_im is the residual error of sample i under the m-th tree; R_jm is the j-th leaf (region) of the m-th tree.

Height | Favorite Color | Gender | Weight | r_i1
1.6    | Blue           | Male   | 88     | 14.7
1.6    | Green          | Female | 76     | 2.7
1.5    | Blue           | Female | 56     | -17.3

The first residual tree splits these samples into regions R₁₁ and R₂₁.
79 Step 2 (using Step 1's result)

F_{m−1}(x) = F₀(x) = 73.3

Height | Favorite Color | Gender | Weight | r_i1
1.6    | Blue           | Male   | 88     | 14.7
1.6    | Green          | Female | 76     | 2.7
1.5    | Blue           | Female | 56     | -17.3

For each leaf R_jm, find the output value γ_jm that minimizes the loss over the samples falling into that leaf.

Leaf R₁₁ contains only sample 3 (x₃ ∈ R₁₁):
γ₁₁ = argmin_γ 1/2 * (y₃ − (F₀(x₃) + γ))^2
    = argmin_γ 1/2 * (56 − (73.3 + γ))^2
    = argmin_γ 1/2 * (−17.3 − γ)^2
γ₁₁ = −17.3      (similarly, for leaf R₂₁: γ₂₁ = 8.7)
80 Step 2 (continued)

F_{m−1}(x) = F₀(x) = 73.3

Leaf R₂₁ contains samples 1 and 2 (x₁, x₂ ∈ R₂₁):
γ₂₁ = argmin_γ [ 1/2 * (y₁ − (F₀(x₁) + γ))^2 + 1/2 * (y₂ − (F₀(x₂) + γ))^2 ]
    = argmin_γ [ 1/2 * (88 − (73.3 + γ))^2 + 1/2 * (76 − (73.3 + γ))^2 ]
    = argmin_γ [ 1/2 * (14.7 − γ)^2 + 1/2 * (2.7 − γ)^2 ]
γ₂₁ = (14.7 + 2.7) / 2 = 8.7, the average of the two samples' residuals.

So γ₁₁ = −17.3 for region R₁₁ and γ₂₁ = 8.7 for region R₂₁.
81 Step 2 (continued)

F_{m−1}(x) = F₀(x) = 73.3

Update the model: F₁(x) = F₀(x) + (learning rate) × γ_{j1} for the region R_{j1} that x falls into (learning rate = 0.1 here):

Height | Favorite Color | Gender | Weight | r_i1  | F₁(x)
1.6    | Blue           | Male   | 88     | 14.7  | 74.2
1.6    | Green          | Female | 76     | 2.7   | 74.2
1.5    | Blue           | Female | 56     | -17.3 | 71.6

(x₁, x₂ ∈ R₂₁ with γ₂₁ = 8.7; x₃ ∈ R₁₁ with γ₁₁ = −17.3)
82 Repeat for the next m + 1 iteration
83 Summary
84
Outline
➢ Boosting Techniques

➢ AdaBoost Clearly Explain

➢ Gradient Boost Clearly Explain

➢ Time Series Data: Predicting Energy Consumption

➢ Summary
85 Time Series Forecasting
We will focus on the energy consumption problem, where given a sufficiently large dataset of the daily energy
consumption of different households in a city, we are tasked to predict as accurately as possible the future energy
demands.

Preprocessing
86 Time Series Forecasting
We will focus on the energy consumption problem, where given a sufficiently large dataset of the daily energy
consumption of different households in a city, we are tasked to predict as accurately as possible the future energy
demands.

Consumption changes through the years

During the winter months we observe high demands in


energy, while throughout the summer the consumption is at
the lowest levels.
87 Time Series Forecasting
To visualize the fluctuation within the span of a single year, we can plot one year of the data.
88 Time Series Forecasting
We have only one feature: the full date. We can extract different features based on the full date
such as the day of the week, the day of the year, the month and others
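A minimal sketch of extracting such calendar features with pandas (the column name "date" is an assumption):

import pandas as pd

def add_date_features(df):
    # Derive several calendar features from the single datetime column.
    ts = pd.to_datetime(df["date"])
    df["day_of_week"] = ts.dt.dayofweek
    df["day_of_year"] = ts.dt.dayofyear
    df["month"]       = ts.dt.month
    df["quarter"]     = ts.dt.quarter
    df["year"]        = ts.dt.year
    return df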
89 Time Series Forecasting
The dataset contains almost 2.5 years of data, so for the testing set we will use only the last 6 months
90 Time Series Forecasting
Prepare dataset for Training and Testing

Performance Evaluation and


Visualization
91 Time Series Forecasting
Gradient Boosting For Training
92 Time Series Forecasting
Gradient Boosting For Predicting

MAE: 0.7535964848932999
MSE: 1.3449409757830804
MAPE: 0.1951895064391348
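The training and prediction code shown on the slides is not reproduced here; a minimal sketch with scikit-learn might look like the following (the synthetic series, model choice, and hyperparameters are assumptions, and the metric values above come from the original slides, not from this sketch; add_date_features is the helper defined earlier):

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_absolute_percentage_error)

# Synthetic stand-in for the energy-consumption series (~2.5 years of daily data).
dates = pd.date_range("2020-01-01", periods=900, freq="D")
df = pd.DataFrame({"date": dates})
df["consumption"] = (10 + 3 * np.cos(2 * np.pi * df["date"].dt.dayofyear / 365)
                     + np.random.default_rng(0).normal(0, 0.5, len(df)))
df = add_date_features(df)

features = ["day_of_week", "day_of_year", "month", "quarter", "year"]
train, test = df.iloc[:-180], df.iloc[-180:]     # last ~6 months held out for testing

model = GradientBoostingRegressor(n_estimators=500, learning_rate=0.05, max_depth=3)
model.fit(train[features], train["consumption"])
pred = model.predict(test[features])

print("MAE: ", mean_absolute_error(test["consumption"], pred))
print("MSE: ", mean_squared_error(test["consumption"], pred))
print("MAPE:", mean_absolute_percentage_error(test["consumption"], pred))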
94
