Parameters Optimization of Deep Learning Models Using Particle Swarm Optimization
Abstract— Deep learning has been successfully applied in several fields such as machine translation, manufacturing, and pattern recognition. However, successful application of deep learning depends upon appropriately setting its parameters to achieve high-quality results. The number of hidden layers and the number of neurons in each layer of a deep machine learning network are two key parameters that have a major influence on the performance of the algorithm. Manual parameter setting and grid search approaches somewhat ease the user's task of setting these important parameters. Nonetheless, both techniques can be very time-consuming. In this paper, we show that the particle swarm optimization (PSO) technique holds great potential for optimizing parameter settings, and thus saves valuable computational resources during the tuning process of deep learning models.

Specifically, we use a dataset collected from a Wi-Fi campus network to train deep learning models to predict the number of occupants and their locations. Our preliminary experiments indicate that, compared to the grid search method, PSO provides an efficient approach for tuning the optimal number of hidden layers and the number of neurons in each layer of the deep learning algorithm. Our experiments illustrate that the exploration of the landscape of configurations required to find the optimal parameters is decreased by 77%–85%. In fact, PSO yields even better accuracy results.

Keywords - smart building services, deep machine learning, parameter optimization, particle swarm optimization.

I. INTRODUCTION

Deep learning is an aspect of artificial neural networks that aims to imitate the complex learning methods human beings use to gain certain types of knowledge. We can think of deep learning as a technique that employs neural networks with multiple hidden layers of abstraction, in contrast to traditional shallow neural networks that employ one hidden layer [1].

Deep learning models are utilized in a wide variety of applications, including the popular iOS Siri and Google voice systems. Recently, deep neural networks have been used to win numerous contests in pattern recognition and machine learning. Leading examples include Microsoft research on a deep learning system that demonstrated the ability to classify 22,000 categories of pictures with 29.8 percent accuracy, as well as real-time speech translation between Mandarin Chinese and English [2]. Deep learning is made available by open source projects as well; commonly used deep learning platforms currently include the H2O platform, Deeplearning4j (DL4j), Theano, Torch, TensorFlow, and Caffe.

One of the challenges in a successful implementation of deep machine learning is setting the values of its many parameters, particularly the topology of its network. Let L be the number of hidden layers, Ni be the number of neurons in layer i, and N = {N1, N2, ..., NL}. The parameters L and N are very important and have a major influence on the performance of deep machine learning. Manually tuning these parameters (essentially through trial and error) to find high-quality settings is a time-consuming process [3]. Besides, the solutions obtained by the manual process are usually not equally distributed in the objective space.

To address this challenge, grid search is a common approach for setting the parameter values of deep learning models. Grid search is more efficient than manual tuning and saves time in setting L and N: a list of discrete values of L and N is prepared in advance, where each entry specifies a number of hidden layers and its corresponding number of neurons. The deep learning algorithm trains multiple different models using all of the list's entries, and the final parameter selection is based on the models' accuracy. However, grid search is still a computationally demanding process, as the number of possible combinations grows exponentially, especially when the number of parameters increases and the interval between discrete values is reduced. In addition, if the list of parameters is poorly chosen, the network may learn slowly, or perhaps not at all [4].

This paper proposes another parameter selection method for deep learning models using PSO. PSO is a popular population-based heuristic algorithm that simulates the social behavior of individuals, such as birds flocking, a school of fish swimming, or a colony of ants moving toward a potential position, to achieve particular objectives in a multidimensional space [5]. PSO is found to have an extensive global optimization capability owing to its simple concept, easy implementation, scalability, robustness, and fast convergence. It employs only simple mathematical operators and is computationally inexpensive in terms of both memory requirements and speed [6].
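To make the combinatorial cost of grid search discussed above concrete, the following minimal sketch enumerates a grid over the number of hidden layers and a per-layer neuron count. It is written in Python purely for illustration (our experiments, described in Section IV, were run in R), and the candidate ranges are hypothetical:

    from itertools import product

    # Hypothetical discrete candidate values for the two topology parameters.
    layer_candidates = range(1, 11)          # L: number of hidden layers
    neuron_candidates = range(10, 201, 10)   # neurons per hidden layer

    # Grid search must train and evaluate one full model per combination.
    grid = list(product(layer_candidates, neuron_candidates))
    print(len(grid))  # 10 * 20 = 200 training runs

    # If each of the L layers may take its own neuron count, the grid
    # grows exponentially in L: 20**L configurations for a fixed L.
    full_grid_size = sum(len(neuron_candidates) ** L for L in layer_candidates)
    print(full_grid_size)  # far too many models to train exhaustively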
III. PSO-BASED PARAMETER OPTIMIZATION MODEL

The PSO algorithm is an iterative optimization method originally proposed in 1995 by Kennedy and Eberhart [5]. PSO was developed to mimic bird and fish swarms: animals that move as a swarm can reach their aims more easily. The basic form of the PSO algorithm is composed of a group of particles that repeatedly communicate with each other; the population is called a swarm. Each particle represents a possible solution to the problem (i.e., the position of one particle represents the values of the attributes of a solution) [20]. Each particle has a position, a velocity, and a fitness value that is determined by an optimization function. The velocity determines the next direction and distance to move, while the fitness value is an assessment of the quality of the particle. The position of each particle in the swarm is tweaked to move closer to the particle that has the best position. Each particle updates its velocity and position by tracking two extremes in each iteration. One is called the personal best (pbest), which is the best solution that the particle has been able to obtain individually so far. The other is called the global best (gbest), which is the best solution that all particles have been able to find collectively so far.

PSO is mathematically modeled as follows [5]:

$v_i^{t+1} = w \cdot v_i^t + c_1 \cdot rand \cdot (pbest_i - x_i^t) + c_2 \cdot rand \cdot (gbest - x_i^t)$   (2)

At each step t, the position $x_i^t$ of particle i is updated based on the particle's velocity $v_i^t$:

$x_i^{t+1} = x_i^t + v_i^{t+1}$   (3)

In Equations (2) and (3) above, $v_i^t$ and $x_i^t$ are the velocity and position components of the i-th particle at step t. $c_1$ and $c_2$ are the acceleration coefficients and represent the weights of approaching the pbest and gbest of a particle. w is the inertia coefficient, as it helps the particles move by inertia toward better positions. rand is a uniform random value between 0 and 1.
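As a concrete illustration of Equations (2) and (3), the following sketch updates a single particle over a continuous search space. Python is used here for readability (our implementation is in R), and the inertia and acceleration values are placeholders, not prescribed settings:

    import random

    def pso_step(x, v, pbest, gbest, w=0.7, c1=2.0, c2=2.0):
        # One velocity/position update for a single particle, per
        # Equations (2) and (3). All arguments are equal-length lists,
        # one entry per dimension of the search space.
        new_x, new_v = [], []
        for d in range(len(x)):
            vel = (w * v[d]
                   + c1 * random.random() * (pbest[d] - x[d])   # pull toward personal best
                   + c2 * random.random() * (gbest[d] - x[d]))  # pull toward global best
            new_v.append(vel)                                   # Equation (2)
            new_x.append(x[d] + vel)                            # Equation (3)
        return new_x, new_v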
The parameters utilized in our experiments are listed in Table I.

TABLE I: THE PARAMETERS UTILIZED IN OUR EXPERIMENTS

    Parameter                       Value
    Population size                 10, 25, or 50
    Learning coefficients c1, c2    uniformly distributed in [0, 4]
    Maximum number of iterations    10
    Particle dimensions             the number of hidden layers, within the range [1, 200], and the number of neurons in each layer, within the range [1, 10]
    Hidden layers velocity          MinLayerVelocity = -0.1 (MaxLayers - MinLayers); MaxLayerVelocity = +0.1 (MaxLayers - MinLayers)
    Neuron velocity                 MinNeuronVelocity = -0.1 (MaxNeurons - MinNeurons); MaxNeuronVelocity = +0.1 (MaxNeurons - MinNeurons)

Algorithm 1: PSO for Parameter Optimization of Deep Learning Models

Input: Wi-Fi dataset: location, time, and MAC addresses
Output: Optimal configuration, in terms of the number of hidden layers and the number of neurons in each layer, for the deep learning model.
Begin:
1) Initialization
   a. Set the values of the acceleration constants (c1 and c2), the inertia weight w, PopSize, and MaxIt, and specify the range bounds: MinLayer, MaxLayer, MinNeurons, MaxNeurons, MaxLayerVelocity, and MaxNeuronVelocity.
   b. Define the fitness function (i.e., the deep learning model's accuracy).
   c. Establish an initial random population for the number of hidden layers and the number of neurons in each layer.
   d. Calculate the fitness value for each particle and set the personal best (pbest) for each particle and the global best (gbest) for the population.
2) Repeat the following steps until the gbest solution no longer changes or the maximum number of iterations is reached:
   a. Update the number of hidden layers, the number of neurons in each layer, and their velocities for each particle according to Equations (4) through (7).
   b. Calculate the fitness value for each particle. If the fitness value of the new location is better than the fitness value of the personal best, the new location becomes the personal best location.
   c. If the currently best particle in the population is better than the global best, that particle replaces the recorded global best.
3) Return the optimal number of hidden layers and the number of neurons in each layer for the deep learning model.
End

Algorithm 1 above provides the details of our proposed PSO-based parameter selection technique for deep learning models. The algorithm is presented for the campus occupant prediction scenario using collected Wi-Fi data; this scenario is fully explored in the next section (i.e., Section IV).

In our implementation of PSO, the i-th particle's velocity and position are updated as follows:

- Velocity of the number of layers:

$V_{L,i}^{t+1} = w \cdot V_{L,i}^t + c_1 \cdot rand \cdot (L_i^{best} - L_i^t) + c_2 \cdot rand \cdot (G_L^{best} - L_i^t)$   (4)

where $V_L$ is the velocity of the number of hidden layers, $L^{best}$ is the particle's best local value of the number of hidden layers, and $G_L^{best}$ is the best global value of the number of hidden layers.

- Velocity of the number of neurons:

$V_{N,i}^{t+1} = w \cdot V_{N,i}^t + c_1 \cdot rand \cdot (N_i^{best} - N_i^t) + c_2 \cdot rand \cdot (G_N^{best} - N_i^t)$   (5)

where $V_N$ is the velocity of the number of neurons in each hidden layer, $N^{best}$ is the particle's best local value of the number of neurons in each hidden layer, and $G_N^{best}$ is the best global value of the number of neurons in each hidden layer.

- Position for the number of layers:

$L_i^{t+1} = L_i^t + V_{L,i}^{t+1}$   (6)

- Position for the number of neurons:

$N_i^{t+1} = N_i^t + V_{N,i}^{t+1}$   (7)
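The sketch below puts Algorithm 1 and Equations (4) through (7) together. It is written in Python for illustration (our experiments use R); the velocity bounds follow Table I, while the rounding and clamping of the integer-valued positions, the zero initial velocities, the default constants, and stopping only on the iteration budget (rather than also on the no-change test of step 2) are implementation assumptions of this sketch:

    import random

    def clamp(value, lo, hi):
        return max(lo, min(hi, value))

    def pso_tune(fitness, min_layers, max_layers, min_neurons, max_neurons,
                 pop_size=10, max_it=10, w=0.7, c1=2.0, c2=2.0):
        # fitness(layers, neurons) trains a deep learning model with that
        # topology and returns its accuracy (higher is better).
        vmax_l = 0.1 * (max_layers - min_layers)     # layer velocity bound (Table I)
        vmax_n = 0.1 * (max_neurons - min_neurons)   # neuron velocity bound (Table I)

        # Step 1: random initial population; velocities start at zero.
        pos = [(random.randint(min_layers, max_layers),
                random.randint(min_neurons, max_neurons)) for _ in range(pop_size)]
        vel = [(0.0, 0.0) for _ in range(pop_size)]
        pbest = list(pos)
        pbest_fit = [fitness(l, n) for (l, n) in pos]
        g = max(range(pop_size), key=lambda i: pbest_fit[i])
        gbest, gbest_fit = pbest[g], pbest_fit[g]

        # Step 2: iterate until the iteration budget is exhausted.
        for _ in range(max_it):
            for i in range(pop_size):
                (l, n), (vl, vn) = pos[i], vel[i]
                r1, r2 = random.random(), random.random()
                # Equations (4) and (5): independent velocities for the
                # number of layers and the number of neurons.
                vl = clamp(w * vl + c1 * r1 * (pbest[i][0] - l)
                           + c2 * r2 * (gbest[0] - l), -vmax_l, vmax_l)
                vn = clamp(w * vn + c1 * r1 * (pbest[i][1] - n)
                           + c2 * r2 * (gbest[1] - n), -vmax_n, vmax_n)
                # Equations (6) and (7), rounded and clamped to the valid ranges.
                l = clamp(int(round(l + vl)), min_layers, max_layers)
                n = clamp(int(round(n + vn)), min_neurons, max_neurons)
                pos[i], vel[i] = (l, n), (vl, vn)
                f = fitness(l, n)
                if f > pbest_fit[i]:                 # step 2b: update personal best
                    pbest[i], pbest_fit[i] = (l, n), f
                    if f > gbest_fit:                # step 2c: update global best
                        gbest, gbest_fit = (l, n), f

        # Step 3: best topology found and its accuracy.
        return gbest, gbest_fit

A caller supplies the range bounds of Table I together with a fitness callback, e.g., a hypothetical train_and_score(layers, neurons) routine that trains a model with the given topology and returns its test accuracy, mirroring Algorithm 1's use of model accuracy as the fitness function.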
IV. EXPERIMENTAL RESULTS AND LESSONS LEARNED

In our experiments, we select a smart building application to assess the performance of our proposed PSO-based parameter selection technique. We built deep learning models based on six weeks (January 15, 2016 – February 29, 2016) of Wi-Fi access data collected from 14 buildings on the campus of the University of Houston. Our goal is to build a deep learning model that predicts the number of occupants at a given location 15, 30, and 60 minutes from the current time. Awareness of the number of occupants in a building at a given time is crucial for many smart building applications, including energy efficiency and emergency response services [21].

Our experiments were conducted using the R language. We executed our experiments on a 24-core machine with a 2.40 GHz Intel Xeon CPU and 32 GB of RAM. In our scenarios, we split the six-week dataset into seven parts, each corresponding to a day of the week. Each dataset has the following features: Access Point ID (APID), Date, Time, User MAC address, and Building number. The three targets that our deep learning model needs to predict are the counts of MAC addresses 15, 30, and 60 minutes from the current time at a given date, time, and location (i.e., APID and Building number). In the process, we built a deep learning model for each day of the week. Table II summarizes the different parts of the dataset. Further, each dataset for a specific day of the week has been split into training and testing sets: the first five weeks of the dataset were used as the training set, while the data pertaining to the sixth week was used as the testing set. We then set out to address the main goal of this paper, which is to compare our proposed PSO-based parameter selection technique with the grid search technique in terms of finding the best parameters for the seven models that correspond to the days of the week.

To evaluate and compare the grid search and PSO approaches, we measure both the accuracy and the number of configurations that need to be explored to reach the best accuracy. In the case of PSO, the algorithm terminates when the maximum number of iterations is reached or when there is no difference between the accuracies of two consecutive iterations. Since the count of occupants at a given time and location is a continually changing number (i.e., this is a regression problem), it does not make much sense to predict the exact number of occupants N at a given date, time, and location. Rather, it is more practical to allow a small tolerance in the count, for example N ± n. Therefore, we consider clusters with window size ±n (e.g., n = 20) when we evaluate the accuracy of the predicted occupancy for each dataset.
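A prediction is thus counted as correct when it falls within ±n of the actual count. A minimal Python expression of this tolerance-based accuracy follows (the function and variable names are ours, for illustration only; the experimental code itself was written in R):

    def tolerance_accuracy(predicted, actual, n=20):
        # Fraction of predictions within +/- n occupants of the true count.
        hits = sum(abs(p - a) <= n for p, a in zip(predicted, actual))
        return hits / len(actual)

    # Example: tolerance_accuracy([150, 80], [167, 110], n=20) -> 0.5
    # (the first prediction is within 20 of 167; the second misses 110 by 30).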
Three different swarm sizes of 10, 25, and 50 particles are used in our experiments. Figure 2 shows the accuracy (cf. Figure 2a) and the number of configurations that need to be evaluated to achieve that accuracy (cf. Figure 2b) when predicting the occupancy within a 60-minute time window. These two figures jointly illustrate that, by using a small population size (e.g., 10 particles), the PSO-based parameter value selection technique was able to achieve an accuracy almost the same as that achieved with a larger population size (e.g., 25 or 50 particles) for almost all the datasets that we experimented with. Therefore, the PSO-based parameter value selection approach does not require a large number of particles to produce competitive results. Another observation drawn from Figure 2(b) is that the number of configurations the PSO-based technique needs to evaluate to reach the globally best solution is almost one-third and one-fifth of the number of configurations that need to be evaluated by the grid search method when the PSO-based technique employs 25 and 50 particles, respectively. This demonstrates that the PSO-based technique can be computationally efficient for determining the deep learning parameters. Therefore, in the following experiments we simply consider the PSO-based technique with 10 particles and compare our results with the grid search technique.

Figures 3-5 show the number of different configurations that need to be evaluated to reach the globally best solution when predicting the occupancy in the next 60, 30, and 15 minutes, respectively. These figures illustrate that better accuracy can be achieved when using our proposed PSO-based parameter value selection technique while evaluating a significantly lower number of configurations compared to the grid search approach. This clearly exhibits the superiority of the PSO-based technique over grid search; thus, it can serve as a strong candidate for parameter tuning of deep machine learning models. Of course, one needs to carefully analyze the dataset biases or domain-specific properties that give rise to these results, but that is beyond the scope of this paper and is left for future extensions.

TABLE II: TRAINING AND TESTING DATASETS FOR THE DAYS OF THE WEEK

    Dataset   Records in     Records in    Actual occupancy   Actual occupancy   Actual occupancy
              training set   testing set   in next 60 min     in next 30 min     in next 15 min
    Sat.      335137         71551         167                110                93
    Sun.      213434         108597        184                100                80
    Mon.      1686200        795439        715                648                488
    Tue.      2129033        411025        732                628                474
    Wed.      2141754        404023        792                618                481
    Thur.     1986703        269976        794                689                493
    Fri.      1200046        253995        323                262                234
Fig. 2: Comparison between three different swarm sizes (10, 25, and 50). (a) Accuracy vs. model; (b) number of different configurations vs. model.

Fig. 3: Comparison between PSO and grid search for prediction within a 60-minute interval. (a) Accuracy vs. model; (b) number of different configurations vs. model.

Fig. 4: Comparison between PSO and grid search for prediction within a 30-minute interval. (a) Accuracy vs. model; (b) number of different configurations vs. model.

Fig. 5: Comparison between PSO and grid search for prediction within a 15-minute interval. (a) Accuracy vs. model; (b) number of different configurations vs. model.
V. CONCLUSIONS AND FUTURE WORK

Multiple parameters have to be set and tuned for deep learning models, and these parameters can have a significant influence on the results and the computational needs of such models. Optimization methods therefore need to be used to help find optimal parameter settings, so that the user can focus on the results of deep learning rather than spend time and effort deciding on the optimal parameter values. This paper presents a PSO-based parameter value selection technique that optimizes the performance of deep learning models by selecting the number of hidden layers and the number of neurons in each layer. Our results show that the proposed PSO algorithm is useful in the process of training deep learning models. We demonstrated the performance of the proposed technique in a smart building scenario in which the number of occupants needs to be predicted for the next 60, 30, and 15 minutes based on collected Wi-Fi data. The results obtained show that training times decreased by 77%–85% when using the PSO-based approach compared to the grid search method. Our proposed PSO-based technique also gives better classification accuracy than the grid search approach. As a future extension, we intend to explore the use of PSO to tune other deep learning parameters, such as the activation functions and the number of epochs. Note that it is easy to implement parallel versions of PSO on GPUs, which would further reduce training times while letting researchers focus on extracting subject-matter knowledge using deep learning models rather than on the parameter value selection process itself.

ACKNOWLEDGMENT

This article was made possible by NPRP grant #[7-1113-1-199] from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.

REFERENCES

[1] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA: The MIT Press, 2016.
[2] "Machine Learning and Understanding for Intelligent Extreme Scale Scientific Computing and Discovery," Advanced Scientific Computing Research (ASCR) Division of the Office of Science, U.S. Department of Energy, Workshop Report, Jan. 2015. [Online]. Available: https://www.orau.gov/machinelearning2015/Machine_Learning_DOE_Workshop_Report_6.pdf. [Accessed: Jan. 28, 2017].
[3] Y. Malitsky, D. Mehta, B. O'Sullivan, and H. Simonis, "Tuning parameters of large neighborhood search for the machine reassignment problem," in International Conference on AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, 2013, pp. 176–192.
[4] Y. Ganjisaffar, T. Debeauvais, S. Javanmardi, R. Caruana, and C. V. Lopes, "Distributed tuning of machine learning algorithms using MapReduce clusters," in Proceedings of the Third Workshop on Large Scale Data Mining: Theory and Applications, 2011, p. 2.
[5] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the IEEE International Conference on Neural Networks IV, 1995, pp. 1942–1948.
[6] C. W. de Silva, Mechatronic Systems: Devices, Design, Control, Operation and Monitoring. Boca Raton: CRC Press, 2007.
[7] J. Karwowski, M. Okulewicz, and J. Legierski, "Application of Particle Swarm Optimization Algorithm to Neural Network Training Process in the Localization of the Mobile Terminal," in Engineering Applications of Neural Networks, 2013, pp. 122–131.
[8] Y. M. M. Hassim and R. Ghazali, "Solving a classification task using Functional Link Neural Networks with modified Artificial Bee Colony," in 2013 Ninth International Conference on Natural Computation (ICNC), 2013, pp. 189–193.
[9] H. Shah and R. Ghazali, "Prediction of Earthquake Magnitude by an Improved ABC-MLP," in 2011 Developments in E-systems Engineering, 2011, pp. 312–317.
[10] L. Qiongshuai and W. Shiqing, "A hybrid model of neural network and classification in wine," in 2011 3rd International Conference on Computer Research and Development, 2011, vol. 3, pp. 58–61.
[11] B. A. Garro, H. Sossa, and R. A. Vázquez, "Artificial neural network synthesis by means of artificial bee colony (ABC) algorithm," in 2011 IEEE Congress of Evolutionary Computation (CEC), New Orleans, LA, 2011, pp. 331–338. doi: 10.1109/CEC.2011.5949637.
[12] K. Bovis, S. Singh, J. Fieldsend, and C. Pinder, "Identification of masses in digital mammograms with MLP and RBF nets," in Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000), Como, 2000, pp. 342–347.
[13] S. Mirjalili, S. Z. Mohd Hashim, and H. Moradian Sardroudi, "Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm," Applied Mathematics and Computation, vol. 218, no. 22, Jul. 2012, pp. 11125–11137.
[14] S. M. H. Bamakan, H. Wang, and A. Z. Ravasan, "Parameters Optimization for Nonparallel Support Vector Machine by Particle Swarm Optimization," Procedia Computer Science, vol. 91, 2016, pp. 482–491.
[15] A. Blum, Neural Networks in C++: An Object-Oriented Framework for Building Connectionist Systems, 1st ed. New York: Wiley, 1992.
[16] Z. Boger and H. Guterman, "Knowledge extraction from artificial neural network models," in 1997 IEEE International Conference on Systems, Man, and Cybernetics. Computational Cybernetics and Simulation, 1997, vol. 4, pp. 3030–3035.
[17] K. Swingler, Applying Neural Networks: A Practical Guide. San Francisco: Morgan Kaufmann, 1996.
[18] G. S. Linoff and M. J. A. Berry, Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, 3rd ed. Indianapolis, IN: Wiley, 2011.
[19] A. Arora, A. Candel, J. Lanford, E. LeDell, and V. Parmar, Deep Learning with H2O. 2015.
[20] C. J. A. Bastos-Filho, D. F., M. P., P. B. C. Miranda, and E. M. N. Figueiredo, "Multi-Ring Particle Swarm Optimization," in Evolutionary Computation, W. P. dos Santos, Ed. InTech, 2009.
[21] V. L. Erickson, M. Á. Carreira-Perpiñán, and A. E. Cerpa, "Occupancy Modeling and Prediction for Building Energy Management," ACM Trans. Sen. Netw., vol. 10, no. 3, May 2014, pp. 42:1–42:28.