Optimization of Hyper-Parameter For CNN Model Using Genetic Algorithm
Abstract—Recently, CNNs have been widely used not only in the field of image recognition but also in various other fields, such as classifying vibration data. Therefore, increasing the performance of CNN models is becoming more important. One of the various ways to improve the performance of CNN models is to optimize their hyper-parameters.

This paper presents a method for optimizing the hyper-parameters of a CNN model that classifies MNIST data using a genetic algorithm. Unlike the methods used in previous studies, population-based algorithms can optimize several parameters at once. In addition, different types and ranges of parameters from those in existing genetic algorithms are used. Using this method, the hyper-parameter values that best classify MNIST have been obtained and are presented.

Keywords—Genetic Algorithm, optimization, hyper-parameter

I. INTRODUCTION

Deep learning began with the Perceptron concept in 1958 [1]. The Perceptron can be applied to linearly separable problems, but not to problems such as XOR that are not linearly separable [2]. This limit was overcome in 1986 by the Multi-Layer Perceptron (MLP), which added a hidden layer. With multiple layers, the number of required parameters increased, making it difficult to find the optimal weight and bias values [3], but the optimal weights and biases can be found using the backpropagation algorithm.

In 1989, LeCun et al. combined the backpropagation algorithm with the convolution layer and applied it to MNIST data [4], and later published a structure called LeNet-5, which is the neural network underlying Convolutional Neural Networks (CNNs) [5]. Using a CNN, locally invariant features can be extracted easily, overcoming the problems of existing neural networks and yielding superior performance in the fields of text and voice recognition. The deeper the network, the more the gradient vanishes during backpropagation; however, deep learning is now applicable to various fields because the ReLU activation function solves the vanishing-gradient problem.

However, there is still no established method for optimizing hyper-parameters, and various studies are underway. Traditional optimization methods include "Manual Search". Its outcome depends on the experimenter's experience, so it cannot be scientifically reproduced and requires intuition and a great deal of experience. Recent research suggests that more complex and automated methods are needed to find better parameters [7].

Two other simple methods are "Grid Search" and "Random Search", of which "Random Search" is theoretically and experimentally more efficient than "Grid Search" [6], [7].

"Bayesian Optimization" is also used to optimize hyper-parameters. However, when a large number of hyper-parameters must be optimized at the same time, as in a CNN, population-based algorithms are more appropriate [7].

Recently, reinforcement learning methods such as Q-Learning have also been used for hyper-parameter optimization [7]-[9]. Most reinforcement learning approaches, however, are advantageous only for optimizing structural parameters; many other parameters, such as the learning rate and regularization, still have to be selected by the user [7].

In this paper, a population-based genetic algorithm is used to optimize the hyper-parameters of a CNN that classifies MNIST data [14]. In addition, the accuracy of the trained CNN with the optimized hyper-parameters was calculated on test data that was not used during training, to verify the performance of the optimized parameters.

II. BACKGROUND

Hyper-parameters are not parameters obtained through training, but parameters that the user must set before applying a deep learning model. Even if the deep learning model itself is built correctly, setting the wrong hyper-parameters will prevent training from working properly.

For example, as shown in Table I, two CNN models with different parameters show quite different performance in terms of accuracy. Even though only the learning rate differs between Case 1 and Case 2, the test accuracy remains at around 0.1 (10%) in Case 1 and above 90% in Case 2 (Fig. 1). These are the results of 20 repeated experiments for each case.

This shows that it is difficult to determine optimal hyper-parameter values systematically, because hyper-parameters come in various types and ranges.
TABLE I. HYPER-PARAMETER VALUE
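As a concrete illustration of where such hyper-parameters enter a model, the following is a minimal TensorFlow/Keras sketch of an MNIST classifier whose learning rate, two dropout ratios, and number of convolution layers are supplied by the caller rather than fixed in the code. The specific architecture choices (filter counts, the 128-unit dense layer, the Adam optimizer, the 28x28 input shape) are illustrative assumptions and not the exact model used in this paper.

    import tensorflow as tf
    from tensorflow.keras import layers

    def build_cnn(learning_rate, dropout_conv, dropout_fc, n_conv_layers):
        # Small MNIST classifier; every argument is a hyper-parameter the user must choose.
        model = tf.keras.Sequential([tf.keras.Input(shape=(28, 28, 1))])
        for i in range(n_conv_layers):
            model.add(layers.Conv2D(32 * (i + 1), 3, padding="same", activation="relu"))
            model.add(layers.MaxPooling2D())
            model.add(layers.Dropout(dropout_conv))    # dropout 1: convolution layers
        model.add(layers.Flatten())
        model.add(layers.Dense(128, activation="relu"))
        model.add(layers.Dropout(dropout_fc))          # dropout 2: fully-connected layer
        model.add(layers.Dense(10, activation="softmax"))
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        return model

Changing only the learning rate passed to build_cnn is enough to reproduce the kind of gap between Case 1 and Case 2 described above.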
extraction of small features. Therefore, it is important to determine the appropriate number of layers for the data. Fig. 4 shows a convolution layer structure with four layers [4].

Fig. 4. Convolution layer structure with four layers

C. Create initial population

One chromosome consists of five genes: the learning rate, the dropout ratio in the convolution layers, the dropout ratio in the fully-connected layer, the batch size, and the number of convolution layers.

To perform the GA, each gene was encoded into binary code with a certain number of bits. The learning rate was expressed in 8 bits, ranging from 0.0001 to 0.1; this means there are 2^8 (= 256) representable values between 0.0001 and 0.1. Dropout 1 is the dropout ratio in the convolution layers and Dropout 2 is the dropout ratio in the fully-connected layer; each was represented by 4 bits over the range 0 to 0.5, giving 2^4 (= 16) values between 0 and 0.5. For the batch size, the 8 values {50, 100, 200, 250, 400, 500, 1000, 1250} were defined and expressed with 3 bits. Similarly, the number of convolution layers was chosen from {1, 2, 3, 4} and expressed with 2 bits. These five genes were combined to form a 21-bit chromosome. The structure of the chromosome is shown in Fig. 5 and Table II.
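To make this encoding concrete, the sketch below creates random 21-bit chromosomes and decodes them back into the five hyper-parameter values. The bit widths, value ranges, batch-size set, and layer-count set are those stated above; the linear spacing of values within each range and the population size of 10 are assumptions, since the paper only states the ranges and the number of representable values.

    import random

    # Gene widths in bits: learning rate, dropout 1, dropout 2, batch size, conv layers
    GENE_BITS = [8, 4, 4, 3, 2]            # 8 + 4 + 4 + 3 + 2 = 21-bit chromosome
    BATCH_SIZES = [50, 100, 200, 250, 400, 500, 1000, 1250]
    NUM_LAYERS = [1, 2, 3, 4]

    def random_chromosome():
        # One chromosome is a flat list of 21 bits.
        return [random.randint(0, 1) for _ in range(sum(GENE_BITS))]

    def split_genes(chrom):
        # Cut the flat bit string into the five genes.
        genes, start = [], 0
        for width in GENE_BITS:
            genes.append(chrom[start:start + width])
            start += width
        return genes

    def bits_to_int(bits):
        return int("".join(map(str, bits)), 2)

    def decode(chrom):
        # Map the 21-bit chromosome to concrete hyper-parameter values.
        lr_bits, d1_bits, d2_bits, bs_bits, nl_bits = split_genes(chrom)
        # Linear spacing within each range is an assumption, not stated in the paper.
        lr = 0.0001 + bits_to_int(lr_bits) * (0.1 - 0.0001) / (2**8 - 1)
        dropout1 = bits_to_int(d1_bits) * 0.5 / (2**4 - 1)
        dropout2 = bits_to_int(d2_bits) * 0.5 / (2**4 - 1)
        batch_size = BATCH_SIZES[bits_to_int(bs_bits)]
        n_conv_layers = NUM_LAYERS[bits_to_int(nl_bits)]
        return lr, dropout1, dropout2, batch_size, n_conv_layers

    # Initial population; the population size of 10 is illustrative only.
    population = [random_chromosome() for _ in range(10)]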
E. Parent chromosome generation

Based on the calculated fitness values, parent chromosomes are selected to create the next generation. The roulette wheel selection method is used: the higher a chromosome's fitness, the higher its probability of being selected, but even chromosomes with low fitness still have some probability of being selected.
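A minimal roulette-wheel selection sketch, assuming fitness values are non-negative (here they are classification accuracies): each chromosome is drawn with probability proportional to its fitness, so low-fitness chromosomes can still be chosen occasionally.

    import random

    def roulette_select(population, fitnesses):
        # Selection probability of each chromosome is proportional to its fitness.
        pick = random.uniform(0, sum(fitnesses))
        running = 0.0
        for chrom, fit in zip(population, fitnesses):
            running += fit
            if running >= pick:
                return chrom
        return population[-1]   # guard against floating-point round-off

The standard-library call random.choices(population, weights=fitnesses, k=2) performs the same fitness-proportional draw (with replacement) and can be used to pick both parents at once.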
F. Offspring chromosome generation

New offspring chromosomes are generated by crossing over two parent chromosomes selected with the roulette wheel method.

Fig. 6. Chromosome crossover

The center of each gene is selected as a crossover point, as shown in Fig. 6. The probability of crossover was set to 0.6. The mutation process is then applied to the newly generated offspring with a probability of 0.05; it reverses the value of a randomly selected bit.
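A sketch of these crossover and mutation operators, reusing GENE_BITS from the decoding sketch above. Treating "the center of each gene" as a multi-point crossover with one point per gene is our reading of Fig. 6; the 0.6 crossover rate and 0.05 mutation rate come from the text.

    import random

    P_CROSSOVER = 0.6    # crossover probability (from the paper)
    P_MUTATION = 0.05    # mutation probability per offspring (from the paper)

    def gene_centers():
        # Crossover points roughly at the middle of each gene.
        points, start = [], 0
        for width in GENE_BITS:
            points.append(start + width // 2)
            start += width
        return points                    # [4, 10, 14, 17, 20] for widths 8, 4, 4, 3, 2

    def crossover(parent_a, parent_b):
        # Multi-point crossover: alternate the source parent at every crossover point.
        child_a, child_b = parent_a[:], parent_b[:]
        if random.random() < P_CROSSOVER:
            points, swap = set(gene_centers()), False
            for i in range(len(parent_a)):
                if i in points:
                    swap = not swap
                if swap:
                    child_a[i], child_b[i] = parent_b[i], parent_a[i]
        return child_a, child_b

    def mutate(chrom):
        # With probability 0.05, flip the value of one randomly selected bit.
        if random.random() < P_MUTATION:
            i = random.randrange(len(chrom))
            chrom[i] = 1 - chrom[i]
        return chrom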
IV. RESULT

Over 30 generations, the parameters with the highest fitness were obtained in each generation. In addition, a total of four experiments were conducted to verify the performance of this parameter optimization method.
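For reference, the sketch below ties the earlier sketches (decode, build_cnn, roulette_select, crossover, mutate) into the generational loop used in these experiments, assuming the fitness of a chromosome is the test accuracy of the CNN trained with its decoded hyper-parameters. The population size of 10 and the 3 training epochs are illustrative assumptions; only the 30 generations, the 0.6 crossover rate, and the 0.05 mutation rate come from the text.

    def evaluate_fitness(chrom, x_train, y_train, x_test, y_test):
        # Fitness = test accuracy of the CNN trained with the decoded hyper-parameters.
        lr, dropout1, dropout2, batch_size, n_layers = decode(chrom)
        model = build_cnn(lr, dropout1, dropout2, n_layers)
        model.fit(x_train, y_train, batch_size=batch_size, epochs=3, verbose=0)
        return model.evaluate(x_test, y_test, verbose=0)[1]

    def run_ga(x_train, y_train, x_test, y_test, pop_size=10, generations=30):
        population = [random_chromosome() for _ in range(pop_size)]
        best_fit, best_chrom = 0.0, None
        for _ in range(generations):
            fits = [evaluate_fitness(c, x_train, y_train, x_test, y_test)
                    for c in population]
            gen_best = max(range(pop_size), key=lambda i: fits[i])
            if fits[gen_best] > best_fit:
                best_fit, best_chrom = fits[gen_best], population[gen_best]
            children = []
            while len(children) < pop_size:
                parent_a = roulette_select(population, fits)
                parent_b = roulette_select(population, fits)
                child_a, child_b = crossover(parent_a, parent_b)
                children.extend([mutate(child_a), mutate(child_b)])
            population = children[:pop_size]
        return best_fit, best_chrom

    # Usage (inputs must be shaped (N, 28, 28, 1) and scaled to [0, 1]):
    # (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    # x_train = x_train[..., None] / 255.0; x_test = x_test[..., None] / 255.0
    # best_fit, best_chrom = run_ga(x_train, y_train, x_test, y_test)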
TABLE IV.

Experiment   Fitness
1            0.9946
2            0.9947
3            0.9953
4            0.9947
The values of each hyper-parameter used to obtain the fitness in Table IV are shown in Figs. 9-13. As the generations progress, each hyper-parameter value changes and the fitness increases.

As shown in Fig. 9, the range of the learning rate was set between 0.0001 and 0.1, but the optimal learning rate remains very low. After about 15 generations, the learning rate no longer changes and maintains a constant value.

Fig. 11. Dropout 2 by generation

Fig. 11 shows the dropout ratio applied to the fully-connected layer. The range of the optimal dropout differs for each trial.

Fig. 12 shows the change in the batch size. The optimal batch sizes differ between experiments but fall within a relatively small range of values. Training more data at one time did not increase the accuracy on the test data.
Fig. 13. Layer by generation

Fig. 14. Case 1 dropout
The optimal dropout value is not large because simple black-and-white data is used. In addition, training with fewer than 250 samples at a time gives high accuracy; training on a large amount of data at once did not. When the number of layers is one or two, the structure is too simple to train well.

This study was conducted using an Intel Core i5-8500 CPU and an NVIDIA GeForce GTX 1060 6GB GPU. It took about 4110 seconds to evaluate a generation's fitness and find the best chromosome. An approximation of the global optimal solution was obtained, but it took a long time because a large amount of data was used in the experiment. To reduce the time needed for optimization and achieve better results, research is needed on hybrid algorithms that use a GA to find the global optimal region and then a local search algorithm to find the precise optimal point.

ACKNOWLEDGEMENT

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2017R1D1A1B03029991).

REFERENCES

[1] F. Rosenblatt, "The perceptron: A probabilistic model for information storage and organization in the brain," Psychol. Rev., vol. 65, no. 6, pp. 386-408, 1958.
[2] M. L. Minsky and S. Papert, Perceptrons (An Introduction to Computational Geometry): Epilogue, 1988.
[3] J. L. McClelland, D. E. Rumelhart, and the PDP Research Group, "Parallel distributed processing," Explorations in the Microstructure of Cognition, pp. 216-271, 1986.
[4] Y. LeCun, B. Boser, et al., "Handwritten digit recognition with a back-propagation network," Advances in Neural Information Processing Systems, 1990.
[5] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278-2323, 1998.
[6] J. Bergstra and Y. Bengio, "Random search for hyper-parameter optimization," J. Mach. Learn. Res., vol. 13, pp. 281-305, 2012.
[7] T. Hinz, N. Navarro-Guerrero, S. Magg, and S. Wermter, "Speeding up the hyperparameter optimization of deep convolutional neural networks," Int. J. Comput. Intell. Appl., vol. 17, no. 2, p. 1850008, 2018.
[8] B. Zoph and Q. V. Le, "Neural architecture search with reinforcement learning," arXiv preprint arXiv:1611.01578, 2016.
[9] B. Baker, O. Gupta, N. Naik, and R. Raskar, "Designing neural network architectures using reinforcement learning," arXiv preprint arXiv:1611.02167, 2016.
[10] J. Gu et al., "Recent advances in convolutional neural networks," Pattern Recognit., vol. 77, pp. 354-377, 2018.
[11] H. Wang and B. Raj, "On the origin of deep learning," arXiv preprint arXiv:1702.07800, 2017.
[12] J. McCall, "Genetic algorithms for modelling and optimisation," J. Comput. Appl. Math., vol. 184, no. 1, pp. 205-222, 2005.
[13] J. H. Holland, "Genetic algorithms," Sci. Am., vol. 267, no. 1, pp. 66-73, 1992.
[14] E. Kussul and T. Baidyk, "Improved method of handwritten digit recognition tested on MNIST database," Image and Vision Computing, vol. 22, no. 12, pp. 971-981, 2004.
[15] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," J. Mach. Learn. Res., vol. 15, pp. 1929-1958, 2014.
[16] Y. Gal and Z. Ghahramani, "Bayesian convolutional neural networks with Bernoulli approximate variational inference," arXiv preprint arXiv:1506.02158, 2015.