
Comparative study of Bayesian optimization process for the best machine learning hyperparameters

Fatima FATIH1, Zakariae EN-NAIMANI2, and Khalid HADDOUCH3

1 Laboratory LISA, ENSA, University of Sidi Mohamed Ben Abdellah, Fez, Morocco
2 Laboratory SSDIA, ENSET, University of Hassan II Casablanca, Mohammedia, Morocco

Abstract. Bayesian optimization is an important algorithm built on two essential
components, namely the surrogate model and the acquisition function, which are
used to approximate an unknown objective function. Here it serves as a hyper-
parameter tuning technique for four machine learning algorithms in order to in-
crease their performance. In this work, we applied Bayesian optimization to choose
the best hyperparameters for a set of ML algorithms, namely RF, SVM, KNN and
LR, using a heart disease dataset. In this context, we obtained the best hyper-
parameters and the corresponding accuracy for each machine learning algorithm
optimized by BO-GP and BO-TPE. The results demonstrate that the highest ac-
curacies in BO-GP and BO-TPE are 89.01% for LR and 89.01% for SVM, re-
spectively. Hyperparameter tuning thus finds the best hyperparameters, which
improves the accuracy of each algorithm.

Keywords: Bayesian optimization · Machine learning · Gaussian process · Tree-structured Parzen estimator · Hyperparameter optimization.

1 Introduction

Hyperparameter tuning [8] is used to test combinations of hyperparameters, often
chosen at random, to improve machine learning models. It is difficult to choose the
best hyperparameter values manually, yet this choice strongly affects model perfor-
mance: the performance of a learning model depends on a good choice of hyper-
parameters. Several important techniques exist for tuning hyperparameters, including
random search, grid search, particle swarm optimization, genetic algorithms and
Bayesian optimization [8] [9] [16] [17]. Bayesian optimization is one of the best
techniques for tuning hyperparameters in machine learning models.

Bayesian optimization [4] [11] [13] is a method used to optimize objective functions
that are costly to evaluate and to find the global maximum of such a function. It is based
on a surrogate such as the Gaussian process, the random forest, or the tree-structured
Parzen estimator (TPE); the latter builds two density functions (a good density and a
bad density) by splitting the observations into two groups [1]. In TPE, the ratio of these
densities must be maximized in order to maximize the expected improvement of the
acquisition function, which yields the new hyperparameter configuration [1]. All three
surrogate models are used to approximate the objective function. However, most of the
time the Gaussian process is the surrogate used in Bayesian optimization.

To illustrate this, we use the heart disease dataset. Heart disease is one of the
serious diseases that threaten human life. Machine learning algorithms play a key
role in predicting heart disease from different attributes such as age, gender, etc. The
main objective is to detect the disease in its early stages, when it can still be treated
and lives saved, in order to reduce mortality from heart disease. In this work, we
applied Bayesian optimization to find the best hyperparameters, improving the
accuracy of each algorithm, namely RF, SVM, KNN and LR, since the performance
of the learning models depends on good hyperparameters. Our results show that the
highest accuracies in BO-GP and BO-TPE are obtained by LR and SVM, respectively.

This paper is structured as follows. We present related Bayesian optimization
work in Section 2. Section 3 illustrates the two components of Bayesian optimization.
The steps by which Bayesian optimization finds the global maximum of an objective
function that is costly to evaluate are presented in Section 4. The optimization process,
which covers three surrogate functions, is explained in Section 5. In Section 6, we
present experimental results. Finally, we end with a conclusion.

2 Related Work
Bayesian optimization (BO) [4] [11] [13] [17] is an efficient method consisting of two
essential components, namely the surrogate model and the acquisition function, which
determine the next hyperparameter configurations and thereby build an approximation
of a costly-to-evaluate objective function. The surrogate models are: the tree-structured
Parzen estimator (TPE) [1], the random forest [15] [17], and the Gaussian process [4]
[13] [10] [7] [8]. The acquisition functions are the expected improvement (EI) [2], the
probability of improvement [11] and the upper confidence bound (UCB) [11]. The most
used acquisition function in Bayesian optimization is the expected improvement [11].
However, [2] shows that there is a better acquisition function than EI, namely E3I,
which balances exploitation and exploration in BO.

The concept of Bayesian optimization was introduced in [11] with two experiments.
The first experiment determines the global maximum of an objective function f(x, y).
The second experiment compares Bayesian optimization and random search on the
SVM machine learning algorithm, and shows no difference in the performance of these
two methods. Further works such as [6] [8] [9] [16] [17] show that Bayesian optimization
is one of the most effective hyperparameter optimization techniques for tuning
hyperparameters in machine learning models.

The article [9] presents a comparative study between three HPO methods: grid
search, random search and Bayesian optimization. The comparison seeks the method
that obtains the highest accuracy in the shortest simulation time. The results of [9]
show that Bayesian optimization is more efficient than the other methods.

The work [6] presents a comparative analysis of various hyperparameter tuning tech-
niques, namely Grid Search, Random Search, Bayesian Optimization, Particle Swarm
Optimization (PSO), and Genetic Algorithm (GA). They are used to optimize the accu-
racy of six machine learning algorithms, namely Logistic Regression (LR), Ridge Clas-
sifier (RC), Support Vector Machine Classifier (SVC), Decision Tree (DT), Random
Forest (RF), and Naive Bayes (NB) classifiers, applied to a sentiment classification
problem. The results of [6] show that, comparing each machine learning algorithm
before and after hyperparameter tuning, the highest accuracy was given by SVC both
before and after tuning, with the highest scores obtained when using Bayesian
optimization.

We have seen in [17] a comparative study of eight different hyperparameter
optimization methods implemented on three machine learning models (KNN, RF,
SVM) for classification and regression. First, it compares accuracy and computation
time (CT) for classification problems evaluated on the MNIST dataset. Second, it
compares MSE and computation time for a regression problem evaluated on the
Boston housing dataset. This paper shows that using the default hyperparameter
settings does not give the best model performance, so it is important to use HPO
methods to determine the best hyperparameters.

3 The components of Bayesian optimization


Bayesian optimization uses the following two important components:

3.1 Surrogate functions

The surrogate model [11] is a probabilistic model that gives a representation of the
objective function that is expensive to evaluate. We will see in Section 5 three surrogate
models, namely the Gaussian process (GP), the random forest (RF), and the tree-
structured Parzen estimator (TPE). Most of the time, however, the GP [3] is the tool
used in Bayesian optimization. These surrogate models are used to approximate the
unknown objective function and to search for its global optimum.

3.2 Acquisition functions

The acquisition function is an essential component of Bayesian optimization. Mathe-
matically, the point that maximizes the acquisition function is proposed as the sampling
point for the next iteration. The most commonly used acquisition functions in Bayesian
optimization are the following (a numerical sketch of all three is given after the list):
1. Probability of Improvement (PI)

We can define the improvement I(x) as follows:

I(x) = \max(f(x) - f(x^*),\, 0) =
\begin{cases}
f(x) - f(x^*) & \text{if } f(x) > f(x^*) \\
0 & \text{otherwise}
\end{cases}

The probability of improvement is then defined as:

PI(x) = \mathbb{P}[I(x) > 0] = \mathbb{P}[f(x) > f(x^*)]
      = \Phi\left(\frac{\mu(x) - f(x^*)}{\sigma(x)}\right)
4 F. Fatih et al.

where
• µ(x) and σ(x) are the posterior mean and standard deviation,
• Φ is the cumulative distribution function (CDF):

\Phi(z) = \int_{-\infty}^{z} \phi(t)\, dt

with \phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) the probability
density function (PDF) of the normal distribution N(0, 1).

2. Expected Improvement (EI)

The expected improvement is defined as follows:

EI(x) =
\begin{cases}
(\mu(x) - f(x^*))\, \Phi\left(\frac{\mu(x) - f(x^*)}{\sigma(x)}\right)
  + \sigma(x)\, \phi\left(\frac{\mu(x) - f(x^*)}{\sigma(x)}\right) & \text{if } \sigma(x) > 0, \\
0 & \text{if } \sigma(x) = 0,
\end{cases}

where Φ and φ are the cumulative distribution function (CDF) and the probability
density function (PDF), respectively.

3. The Upper Confidence Bound (UCB)

The upper confidence bound is defined as [11]:

UCB(x) = \mu(x) + \beta\, \sigma(x)

where β > 0 is a user-selected parameter that balances exploration and
exploitation [11].
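The three acquisition functions above reduce to a few lines of code once a surrogate supplies a posterior mean and standard deviation. Below is a minimal numerical sketch (not from the paper), assuming mu and sigma are numpy arrays over candidate points and f_best is the best value observed so far:

import numpy as np
from scipy.stats import norm

def probability_of_improvement(mu, sigma, f_best):
    # PI(x) = Phi((mu(x) - f(x*)) / sigma(x))
    return norm.cdf((mu - f_best) / sigma)

def expected_improvement(mu, sigma, f_best):
    # EI(x) = (mu - f*) Phi(z) + sigma phi(z), with z = (mu - f*) / sigma;
    # EI is defined as 0 where sigma == 0.
    ei = np.zeros_like(mu)
    mask = sigma > 0
    z = (mu[mask] - f_best) / sigma[mask]
    ei[mask] = (mu[mask] - f_best) * norm.cdf(z) + sigma[mask] * norm.pdf(z)
    return ei

def upper_confidence_bound(mu, sigma, beta=2.0):
    # UCB(x) = mu(x) + beta * sigma(x); beta trades exploration vs. exploitation.
    return mu + beta * sigma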

4 Bayesian optimization steps

To find the new hyperparameters that approximate an unknown objective function f,
Bayesian optimization proceeds through the following steps [17] (a minimal sketch of
the loop is given after the list):
1. Build a surrogate model of the objective function; the Gaussian process is used
almost all the time to approximate the true objective function.
2. Find the optimal hyperparameter values on the surrogate model. In this step, the
acquisition function is used to choose the next hyperparameters: the configuration
that maximizes the acquisition function is chosen as the next sample point.
3. Evaluate the true objective function at the new hyperparameter configuration
obtained in step 2 and record the score.
4. Update the surrogate probability model with the new result. In this step, the
surrogate is refitted to determine the mean and variance at each configuration for
the next iteration.
5. Repeat steps 2 through 4 until the maximum number of iterations is reached.
Finally, we obtain an approximation of the real objective function that allows us to find
the global maximum from the previously evaluated samples.
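Here is a minimal sketch of these five steps for a toy 1-D maximization problem, assuming a scikit-learn Gaussian process surrogate and the expected_improvement helper sketched in Section 3.2; the toy objective f stands in for an expensive black-box function:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def f(x):                        # true objective (unknown in practice)
    return -(x - 2.0) ** 2

X = np.array([[0.0], [4.0]])     # initial samples
y = f(X).ravel()
candidates = np.linspace(-5, 5, 1000).reshape(-1, 1)

for _ in range(20):                                          # step 5: repeat
    gp = GaussianProcessRegressor(kernel=Matern()).fit(X, y)  # steps 1 and 4
    mu, sigma = gp.predict(candidates, return_std=True)
    acq = expected_improvement(mu, sigma, y.max())           # step 2
    x_next = candidates[np.argmax(acq)].reshape(1, -1)
    y_next = f(x_next).ravel()                               # step 3
    X, y = np.vstack([X, x_next]), np.append(y, y_next)

print("approximate maximizer:", X[np.argmax(y)])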

5 Optimization process

Bayesian optimization relies on one of the three following surrogate models:

5.1 Bayesian optimization - Gaussian process (BO-GP)

The Gaussian process [17] [11] is the surrogate model most commonly used in Bayesian
optimization to approximate the objective function f : X → R, where X is a finite set
of N points and the values of the objective function f = [f(x_1), ..., f(x_N)] [11] are
distributed according to a multivariate Gaussian distribution. Thus the Gaussian process
is given by [9] [12] [14]:

f \sim \mathcal{GP}(\mu(x), K(x, x'))

where µ is a mean vector and K is a covariance matrix. Predictions follow a normal
distribution [17]:

P(y \mid x, D) = \mathcal{N}(y \mid \tilde{\mu}, \tilde{\sigma}^2)

where D is the configuration space of the hyperparameters and y = f(x) is the result
of evaluating each hyperparameter value x [17]. Assuming µ(x) = 0, the new mean
and variance are [14]:

\tilde{\mu} = K(x)^T K^{-1} y,
\tilde{\sigma}^2 = K(x, x) - K(x)^T K^{-1} K(x).

This new mean and variance are used in the acquisition function to find the
next evaluation point of the true objective function f.
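As an illustration, the following sketch transcribes the posterior formulas above directly into numpy; the kernel function, the data names and the query interface are assumptions for the example, not an implementation from the paper:

import numpy as np

def gp_posterior(x, X_obs, y_obs, kernel):
    # kernel(a, b) is assumed to return the covariance matrix between a and b
    K = kernel(X_obs, X_obs)                   # covariance of the observations
    k_x = kernel(X_obs, x)                     # covariance between x and data
    K_inv = np.linalg.inv(K)                   # in practice, use a Cholesky solve
    mu = k_x.T @ K_inv @ y_obs                 # mu~ = K(x)^T K^{-1} y
    var = kernel(x, x) - k_x.T @ K_inv @ k_x   # sigma~^2 = K(x,x) - K(x)^T K^{-1} K(x)
    return mu, var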

5.2 Sequential model-based algorithm configuration (SMAC)

Bayesian optimization using RF as a surrogate model is also called sequential model-
based algorithm configuration (SMAC) [17]. It assumes a Gaussian model
N(y | µ̃, σ̃²), where µ̃ and σ̃² are the mean and variance of the regression functions
r(x), respectively [5] [15] [17]:

\tilde{\mu} = \frac{1}{|B|} \sum_{r \in B} r(x)

\tilde{\sigma}^2 = \frac{1}{|B| - 1} \sum_{r \in B} (r(x) - \tilde{\mu})^2

where B is the set of regression trees in the forest.
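These two statistics can be computed directly from the per-tree predictions of a fitted scikit-learn random forest; the following sketch is illustrative, with names and settings assumed for the example:

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def rf_surrogate(X_obs, y_obs, x_query):
    forest = RandomForestRegressor(n_estimators=100).fit(X_obs, y_obs)
    # r(x) for each regression tree in the set B (x_query must be 2-D)
    preds = np.array([tree.predict(x_query) for tree in forest.estimators_])
    mu = preds.mean(axis=0)            # mu~  = (1/|B|) sum_{r in B} r(x)
    var = preds.var(axis=0, ddof=1)    # sigma~^2 with the 1/(|B|-1) factor
    return mu, var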

5.3 Tree-structured Parzen estimator (TPE)

The tree-structured Parzen estimator (TPE) is another common surrogate model for
Bayesian optimization [17]. It builds a model by applying Bayes' rule to calculate
p(y|x):

P(y \mid x) = \frac{P(x \mid y)\, P(y)}{P(x)}

This method takes a different approach: whereas GP-based Bayesian optimization tries
to model p(y|x) directly [1], the tree-structured Parzen estimator models p(x|y) and
p(y) instead, and the likelihood is defined as follows [1]:

P(x \mid y) =
\begin{cases}
l(x) & \text{if } y < y^* \\
g(x) & \text{if } y \geq y^*
\end{cases}

where l(x) is the probability density function formed from the observed variables x
whose objective function value is less than the threshold y*; l(x) thus models the
density of the best observations. g(x) is the density function built from the remaining
observations, whose objective function value is greater than the threshold y*; g(x)
thus models the density of the bad observations [1]. TPE uses the following expected
improvement [1]:

EI_{y^*}(x) = \frac{\gamma\, y^*\, l(x) - l(x) \int_{-\infty}^{y^*} P(y)\, dy}
                   {\gamma\, l(x) + (1 - \gamma)\, g(x)}
            \propto \left( \gamma + \frac{g(x)}{l(x)} (1 - \gamma) \right)^{-1}

The expected improvement is therefore proportional to the ratio l(x)/g(x). The tree-
structured Parzen estimator works by drawing x values from l(x), i.e. only from x
values that give scores below the threshold, and not from g(x); to maximize the expected
improvement, we must maximize this ratio [1].
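In practice, TPE is rarely implemented by hand; the hyperopt library implements the estimator described above. A minimal usage sketch follows, where the search space is illustrative and the analytic objective is a stand-in for a real (negated) validation accuracy:

from hyperopt import fmin, tpe, hp, Trials

space = {
    "C": hp.loguniform("C", -3, 3),               # e.g. an SVM-style range
    "kernel": hp.choice("kernel", ["linear", "rbf"]),
}

def objective(params):
    # Illustrative analytic stand-in: in practice this would train a model
    # with `params` and return -accuracy (hyperopt minimizes its objective).
    penalty = 0.0 if params["kernel"] == "linear" else 0.1
    return (params["C"] - 1.0) ** 2 + penalty

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=50, trials=trials)
print(best)   # note: hp.choice reports the chosen index, not the label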

6 Experiments and results

The heart disease dataset, downloaded from Kaggle, contains 76 attributes, but all
published experiments use a subset of 14 of them. The main features are detailed as
follows:
1. age: age of the patient,
2. sex: sex of the patient,
3. exang: exercise-induced angina (1 = yes, 0 = no),
4. ca: number of major vessels (0-3),
5. cp: chest pain type (1: typical angina, 2: atypical angina, 3: non-anginal pain, 4:
asymptomatic),
6. trtbps: resting blood pressure (in mm Hg),
7. chol: cholesterol in mg/dl fetched via BMI sensor,
8. fbs: fasting blood sugar > 120 mg/dl (1 = true, 0 = false),
9. rest-ecg: resting electrocardiographic results (0: normal, 1: having ST-T wave ab-
normality (T wave inversions and/or ST elevation or depression of > 0.05 mV), 2:
showing probable or definite left ventricular hypertrophy by Estes' criteria),
10. thalach: maximum heart rate achieved,
11. target: 0 = less chance of heart attack, 1 = more chance of heart attack.
Based on this dataset, we categorized patients with 1 indicating the presence of heart
disease and 0 indicating its absence. We then used 70% of the data for training
and 30% for testing the obtained model.
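A minimal sketch of this setup follows, assuming the Kaggle CSV is stored locally as heart.csv with a target column (the file name is an assumption):

import pandas as pd
from sklearn.model_selection import train_test_split

data = pd.read_csv("heart.csv")
X = data.drop(columns=["target"])
y = data["target"]                          # 1 = heart disease, 0 = no disease
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)   # 70% train / 30% test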

6.1 Results

A comparative analysis between BO-GP and BO-TPE is used to determine which
process gives the highest accuracy for the different ML algorithms (RF, KNN, SVM
and LR) applied to heart disease prediction:

Table 1: The accuracy of ML algorithms without using BO.

ML   | Accuracy | Time (s)
RF   | 0.8461   | 0.2447
KNN  | 0.6593   | 0.0210
SVM  | 0.5714   | 0.0230
LR   | 0.8791   | 0.0601

Fig. 1: Accuracy.
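A baseline of this kind can be sketched by evaluating each classifier with its default hyperparameters on the held-out split from the previous snippet; exact scores will vary with the split and library versions:

from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

models = {
    "RF": RandomForestClassifier(),
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "LR": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {acc:.4f}")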

Table 2: Evaluation of BO-GP performance.

ML  | Best hyperparameters                                                                                                 | Precision | Recall | F1-score | Accuracy | Time (s)
RF  | criterion='entropy', max_depth=61, max_features='sqrt', min_samples_leaf=11, min_samples_split=14, n_estimators=205  | 0.8       | 0.9756 | 0.8791   | 0.8791   | 80.36
KNN | n_neighbors=17                                                                                                       | 0.8       | 0.9756 | 0.8791   | 0.8791   | 15.34
SVM | C=13.273723936150628, kernel='linear'                                                                                | 0.8       | 0.9756 | 0.8791   | 0.8791   | 169.41
LR  | C=3.131333053846714, penalty='l2', solver='liblinear'                                                                | 0.8163    | 0.9756 | 0.8888   | 0.8901   | 25.87
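A BO-GP search in the spirit of Table 2 can be sketched with scikit-optimize's BayesSearchCV, whose default surrogate is a Gaussian process; the search space below mirrors the table's SVM row, while the ranges and iteration budget are illustrative, and X_train, y_train come from the earlier snippets:

from sklearn.svm import SVC
from skopt import BayesSearchCV
from skopt.space import Real, Categorical

opt = BayesSearchCV(
    SVC(),
    {"C": Real(1e-3, 1e3, prior="log-uniform"),
     "kernel": Categorical(["linear", "rbf"])},
    n_iter=30, cv=5, random_state=42)
opt.fit(X_train, y_train)
print(opt.best_params_, opt.score(X_test, y_test))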

[Fig. 2: Accuracy. Fig. 3: Precision. Fig. 4: Recall. Fig. 5: F1-score.]



Table 3: Evaluation of BO-TPE performance.

ML  | Best hyperparameters                                                                                                         | Precision | Recall | F1-score | Accuracy | Time (s)
RF  | criterion='gini', max_depth=80.0, max_features='auto', min_samples_leaf=0.1232, min_samples_split=0.2559, n_estimators=210   | 0.7692    | 0.9756 | 0.8602   | 0.85714  | 53.283
KNN | n_neighbors=24.0                                                                                                             | 0.6078    | 0.7560 | 0.6739   | 0.6703   | 0.5642
SVM | C=6.609156, kernel='linear'                                                                                                  | 0.8297    | 0.9512 | 0.8863   | 0.8901   | 42.917
LR  | C=1.1082953781569425, penalty='l1', solver='liblinear'                                                                       | 0.8       | 0.9756 | 0.8791   | 0.8791   | 0.5868

[Fig. 6: Accuracy. Fig. 7: Precision. Fig. 8: Recall. Fig. 9: F1-score.]



6.2 Discussion of results and comparisons

The machine learning algorithms RF, SVM, KNN and LR are first evaluated without
tuning. According to Table 1, the highest accuracy is that of LR with a score of 87.91%,
while the lowest accuracy, obtained for SVM, is 57.14%.

BO-GP gives the highest accuracy for LR, 89.01%, compared to the other learning
algorithms. The BO-TPE results show that the highest accuracy is from SVM with a
score of 89.01%, and the lowest accuracy is from KNN with a score of 67.03%. The
accuracy of LR under BO-GP and the accuracy of SVM under BO-TPE are both higher
than the corresponding accuracies of Table 1. From these results we deduce that hyper-
parameter tuning finds the best parameters and thereby improves the accuracy of each
learning model. The following tables show the performance ranking of BO-GP and
BO-TPE:

       | Precision | Recall | F1-score | Accuracy
BO-GP  | 1         | 1      | 1        | 1
BO-TPE | 2         | 1      | 2        | 2
(a) RF.

       | Precision | Recall | F1-score | Accuracy
BO-GP  | 1         | 1      | 1        | 1
BO-TPE | 2         | 2      | 2        | 2
(b) KNN.

       | Precision | Recall | F1-score | Accuracy
BO-GP  | 2         | 1      | 2        | 2
BO-TPE | 1         | 2      | 1        | 1
(c) SVM.

       | Precision | Recall | F1-score | Accuracy
BO-GP  | 1         | 1      | 1        | 1
BO-TPE | 2         | 1      | 2        | 2
(d) LR.

Table 4: Performance ranking of BO-GP and BO-TPE for each machine learning algorithm.

7 Conclusion

Bayesian optimization is an efficient hyperparameter tuning technique for improving
machine learning models. Our experiments show that BO-GP and BO-TPE can find
the best hyperparameters, with the corresponding accuracy, for the machine learning
models RF, KNN, SVM and LR.

The highest accuracies we obtained with BO-GP and BO-TPE are 89.01% for LR
and 89.01% for SVM, respectively. Hyperparameter tuning thus finds the best hyper-
parameters, which improves the accuracy of each algorithm. Our study shows that the
performance of machine learning models depends on the right choice of hyper-
parameters.

References
1. Bergstra, J., Bardenet, R., Bengio, Y., and Kégl, B. Algorithms for hyper-parameter optimization. Advances in Neural Information Processing Systems 24 (2011).
2. Berk, J., Nguyen, V., Gupta, S., Rana, S., and Venkatesh, S. Exploration enhanced expected improvement for Bayesian optimization. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (2018), Springer, pp. 621–637.
3. Bodin, E., Kaiser, M., Kazlauskaite, I., Dai, Z., Campbell, N., and Ek, C. H. Modulating surrogates for Bayesian optimization. In International Conference on Machine Learning (2020), PMLR, pp. 970–979.
4. Brochu, E., Cora, V. M., and De Freitas, N. A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599 (2010).
5. Dewancker, I., McCourt, M., and Clark, S. Bayesian optimization for machine learning: A practical guidebook. arXiv preprint arXiv:1612.04858 (2016).
6. Elgeldawi, E., Sayed, A., Galal, A. R., and Zaki, A. M. Hyperparameter tuning for machine learning algorithms used for Arabic sentiment analysis. In Informatics (2021), vol. 8, Multidisciplinary Digital Publishing Institute, p. 79.
7. Hoffman, M., Brochu, E., De Freitas, N., et al. Portfolio allocation for Bayesian optimization. In UAI (2011), Citeseer, pp. 327–336.
8. Joy, T. T., Rana, S., Gupta, S., and Venkatesh, S. Hyperparameter tuning for big data using Bayesian optimisation. In 2016 23rd International Conference on Pattern Recognition (ICPR) (2016), IEEE, pp. 2574–2579.
9. Kim, H.-C., and Kang, M.-J. Comparison of hyper-parameter optimization methods for deep neural networks. Journal of IKEEE 24, 4 (2020), 969–974.
10. Li, D., and Kanoulas, E. Bayesian optimization for optimizing retrieval systems. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (2018), pp. 360–368.
11. Matosevic, A. On Bayesian optimization and its application to hyperparameter tuning, 2018.
12. Nguyen, V., Gupta, S., Rana, S., Li, C., and Venkatesh, S. Regret for expected improvement over the best-observed value and stopping condition. In Asian Conference on Machine Learning (2017), PMLR, pp. 279–294.
13. Nomura, M., and Abe, K. A simple heuristic for Bayesian optimization with a low budget. arXiv preprint arXiv:1911.07790 (2019).
14. Rasmussen, C. E., and Nickisch, H. Gaussian processes for machine learning (GPML) toolbox. The Journal of Machine Learning Research 11 (2010), 3011–3015.
15. van Hoof, J., and Vanschoren, J. Hyperboost: Hyperparameter optimization by gradient boosting surrogate models. arXiv preprint arXiv:2101.02289 (2021).
16. Wu, J., Toscano-Palmerin, S., Frazier, P. I., and Wilson, A. G. Practical multi-fidelity Bayesian optimization for hyperparameter tuning. In Uncertainty in Artificial Intelligence (2020), PMLR, pp. 788–798.
17. Yang, L., and Shami, A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing 415 (2020), 295–316.
