Nature Inspired Meta-Heuristic Algorithms For Deep Learning: Recent Progress and Novel Perspective
7 authors, including:
All content following this page was uploaded by Shafi’i Muhammad Abdulhamid on 19 May 2019.
Abstract. Deep learning is presently attracting extraordinary attention from both
industry and academia. The optimization of deep learning models through
nature inspired algorithms is a subject of debate in computer science. In this
paper, we present recent progress on the application of nature inspired algorithms
in deep learning. The survey points out recent developments, issues, strengths,
weaknesses and prospects for future research. A new taxonomy is created based
on nature inspired algorithms for deep learning. The trend of publications in
this domain is depicted; it shows that the research area is growing, but slowly. The
deep learning architectures not yet exploited by nature inspired algorithms for
optimization are unveiled. We believe that the survey can facilitate synergy
between the nature inspired algorithms and deep learning research communities. As
such, massive attention can be expected in the near future.
1 Introduction
Nature inspired algorithms are metaheuristic algorithms inspired by nature.
The inspiration can come from natural biological systems, evolution,
human activities, the group behaviour of animals, etc. For example, the biological
human brain inspired the artificial neural network (ANN) [1], the genetic algorithm
(GA) was inspired by the theory of evolution [2], the cuckoo search algorithm (CSA)
by the behaviour of cuckoo birds [3], and the artificial bee colony (ABC) by the
behaviour of bees [4], among many other algorithms. These nature inspired algorithms
have been found to be effective and efficient in solving real world optimization
problems, often better than conventional algorithms, because of their ability to handle
highly nonlinear and complex problems, especially in science and engineering [5].
The ANN was an early and major breakthrough in the field of artificial intelligence.
The ANN model has been widely applied to complex real world problems in
different domains of machine learning application such as health [6, 7], agriculture
[8], the automobile industry [9], finance [10], etc. Currently, the ANN in its single,
hybrid or ensemble form remains an active research area [11] and is expected to attract
more attention in the future, for example through its role in self-driving vehicles [9].
However, the ANN is typically trained with the back propagation algorithm, which has
limitations such as becoming trapped in local minima and over fitting the training data.
As a result, many researchers have proposed the use of nature inspired algorithms for
training the ANN to avoid these challenges. For example, GA [12, 13], ABC [14], CSA [15]
and particle swarm optimization (PSO) [16] were applied to train the ANN and were
reported to perform better than the back propagation algorithm and to avoid the local
minima problem.
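To make the idea of training an ANN with a nature inspired algorithm concrete, the following is a minimal sketch, not the method of any cited study. It assumes a hypothetical 2-2-1 network, the XOR data set as a toy problem, and illustrative PSO settings (inertia 0.7, acceleration coefficients 1.5); the function names `forward`, `mse` and `pso_train` are introduced here for illustration only. No gradient is computed: the swarm searches the weight space directly.

```python
import math
import random

# Toy XOR data set (an assumption made for illustration).
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
N_WEIGHTS = 9  # 4 hidden weights + 2 hidden biases + 2 output weights + 1 output bias


def forward(w, x):
    """Hypothetical 2-2-1 feed forward network: tanh hidden layer, sigmoid output."""
    h0 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[4])
    h1 = math.tanh(w[2] * x[0] + w[3] * x[1] + w[5])
    z = w[6] * h0 + w[7] * h1 + w[8]
    return 1.0 / (1.0 + math.exp(-z))


def mse(w):
    """Mean squared error of the network with weight vector w on DATA."""
    return sum((forward(w, x) - y) ** 2 for x, y in DATA) / len(DATA)


def pso_train(n_particles=20, n_iters=200, inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Search the weight space with a basic global-best PSO (no gradients used)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(N_WEIGHTS)] for _ in range(n_particles)]
    vel = [[0.0] * N_WEIGHTS for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_loss = [mse(p) for p in pos]
    g = min(range(n_particles), key=pbest_loss.__getitem__)
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]
    history = [gbest_loss]  # best loss after each iteration
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(N_WEIGHTS):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep weights in [-10, 10] so the sigmoid input stays bounded.
                pos[i][d] = max(-10.0, min(10.0, pos[i][d] + vel[i][d]))
            loss = mse(pos[i])
            if loss < pbest_loss[i]:
                pbest[i], pbest_loss[i] = pos[i][:], loss
                if loss < gbest_loss:
                    gbest, gbest_loss = pos[i][:], loss
        history.append(gbest_loss)
    return gbest, history
```

Calling `pso_train()` returns the best weight vector found and the best loss per iteration. Because the global best is only ever replaced by a better candidate, the recorded loss never increases, although nothing in this sketch guarantees that the global optimum is reached.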
Presently, deep learning [17] is a hot research topic in machine learning. Deep
learning is a deep ANN architecture with its own logic for updating node weights and
its activation functions. When supplied with large scale data, deep learning models can
extract high level abstractions from the data set [18]. However, deep
learning faces many limitations, including but not limited to: the lack of a systematic
procedure to find optimum parameter values, manual configuration of the deep learning
architecture and the lack of a standard training algorithm. As such, many approaches,
including nature inspired algorithms, have been proposed by researchers to mitigate
these challenges.
The application of nature inspired algorithms in deep learning is limited because of
the lack of synergy between the deep learning and nature inspired metaheuristic
algorithm communities [19]. [18] presents the role of nature inspired algorithms in
deep learning in the context of big data analytics. However, that study argued that few
studies apply nature inspired algorithms in deep learning, and it reviewed only one
study that incorporated a nature inspired algorithm in deep learning.
In this paper, we extend the work of [18] by surveying a larger body of
literature that hybridizes nature inspired algorithms and deep learning architectures.
The aim is to show the strength of applying nature inspired algorithms in deep learning
and to offer new perspectives for future research that encourage synergy between the
nature inspired algorithms and deep learning research communities.
3 Deep Learning
In machine learning, deep learning is considered one of the most vibrant research
areas. Deep learning started gaining prominence in 2006, when it was presented
in the literature [22, 23]. Strictly speaking, deep learning has been in
existence since the 1940s. However, it came to the limelight from
2006 onwards because of technological advances in computing, such as
high performance computing systems and GPUs, and the advent of large scale data [24].
The success of a machine learning algorithm depends heavily on data representation; as
such, deep learning plays a vital role in processing large scale data because it can
uncover valuable hidden knowledge [23]. The design of the deep learning architecture
resulted from extending the feed forward ANN with multiple hidden layers
[18]. The ANN forms the core of deep learning. The major deep
learning architectures include the convolutional neural network (ConvNet), deep belief
network (DBN), deep recurrent neural network (DRNN), stacked auto-encoder (SAE) and
generative adversarial network (GAN). Deep learning has performed excellently in
different application domains including image and video analysis [25-27], natural
language processing [28], text analysis [29], object detection [30], speech processing
[31] and dimension reduction [32].
Fig. 1. Taxonomy of the nature inspired algorithms for deep learning architecture
Researchers have made efforts to optimise the parameters of deep learning
architectures through nature inspired algorithms. Fig. 1 is the taxonomy created
based on the combinations of nature inspired algorithms and deep learning
architectures found in the literature. The efforts made by researchers are discussed as
follows:
The HSA and its variants have been applied by researchers to optimise the
parameters of the DBN. For example, [19] propose the quaternion HSA (QHSA) and the
improved quaternion HSA (IQHSA) to optimise the parameters of the DBN. The QHSA and the
IQHSA are used to optimise the learning rate, number of hidden layer neurons, momentum
and weight decay of the DBN (QHSA-DBN and IHSA-DBN). Both the QHSA-DBN and
IHSA-DBN are evaluated on an image reconstruction problem. It was found that the
QHSA-DBN and IHSA-DBN perform better than the standard algorithms based on the
HSA. [33] optimise the Bernoulli restricted Boltzmann machine (RBM) through the HSA to
select suitable parameters that minimize the reconstruction error. The HSA selected
suitable values of the learning rate, weight decay, penalty parameter and number of
hidden units of the Bernoulli RBM (HSA-BRBM). The HSA-BRBM is evaluated on
a benchmark dataset for image reconstruction. The HSA-BRBM is found to improve on
the performance of the state-of-the-art algorithms. [34] optimise the
discriminative RBM (DRBM) with meta-heuristic algorithms to move away from the
commonly used random search technique of parameter selection. Variants of the HSA
(VHSA) and the PSO were used to select the optimum parameters (learning rate, weight
decay, penalty parameter and hidden units) of the DRBM (VHSA-DRBM and PSO-DRBM).
The effectiveness of the VHSA-DRBM and PSO-DRBM was tested on a benchmark
dataset, and both were found to perform better than the commonly used random search
technique of selecting parameters. The HSA provides a trade-off between accuracy and
computational load. [33] propose the vanilla HSA (VaHSA) for optimizing the parameters
of the DBN (VaHSA-DBN).
The VaHSA-DBN is tested on multiple datasets and was found to perform better than
the classical algorithms. However, its convergence time is expensive.
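The parameter selection loop described above can be sketched as follows, assuming that HSA denotes the basic harmony search algorithm. This is an illustration only and not the implementation of [19], [33] or [34]: the search ranges in `BOUNDS`, the function `surrogate_error` (a made-up smooth function standing in for the reconstruction error of a trained DBN) and the settings `hms`, `hmcr`, `par` and `bandwidth` are all assumptions. In a real study the objective would train the network and return its measured error.

```python
import random

# Hypothetical search ranges; the surveyed studies define their own.
BOUNDS = {
    "learning_rate": (0.001, 1.0),
    "hidden_units": (5.0, 100.0),
    "momentum": (0.0, 1.0),
    "weight_decay": (0.0, 0.1),
}


def surrogate_error(h):
    """Stand-in for DBN reconstruction error (NOT a real DBN).

    A smooth bowl whose minimum sits at an arbitrary, made-up setting.
    """
    target = {"learning_rate": 0.1, "hidden_units": 50.0,
              "momentum": 0.9, "weight_decay": 0.01}
    return sum(((h[k] - target[k]) / (BOUNDS[k][1] - BOUNDS[k][0])) ** 2
               for k in BOUNDS)


def harmony_search(objective, hms=10, hmcr=0.9, par=0.3, bandwidth=0.05,
                   n_iters=500, seed=0):
    """Basic harmony search minimising `objective` over BOUNDS."""
    rng = random.Random(seed)
    # Harmony memory: hms random candidate parameter settings.
    memory = [{k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}
              for _ in range(hms)]
    scores = [objective(h) for h in memory]
    for _ in range(n_iters):
        new = {}
        for k, (lo, hi) in BOUNDS.items():
            if rng.random() < hmcr:
                value = rng.choice(memory)[k]        # memory consideration
                if rng.random() < par:               # pitch adjustment
                    value += rng.uniform(-1, 1) * bandwidth * (hi - lo)
            else:
                value = rng.uniform(lo, hi)          # random selection
            new[k] = min(hi, max(lo, value))
        score = objective(new)
        worst = max(range(hms), key=scores.__getitem__)
        if score < scores[worst]:                    # replace the worst harmony
            memory[worst], scores[worst] = new, score
    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]
```

A call such as `harmony_search(surrogate_error)` returns the best parameter setting in memory and its score. The number of hidden units is kept continuous here for brevity and would need rounding before being used to build a network.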
The FFA is one of the nature inspired algorithms used to optimise the
parameters of deep learning architectures, e.g. DNN. For example, [35] introduce the
FFA into the DBN (FFA-DBN) to calibrate its parameters for image reconstruction. The DBN
calibration is done automatically by the FFA, eliminating the manual method of
calibrating the DBN. The FFA-DBN is applied to binary image reconstruction. The
results show that the FFA-DBN outperforms the classical algorithms.
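As a rough illustration of automatic calibration, the sketch below assumes that FFA denotes the firefly algorithm in its basic form. It is not the procedure of [35]: the parameters are normalised to [0, 1], `surrogate_error` is a made-up function standing in for the reconstruction error of a trained DBN, and the settings `beta0`, `gamma` and `alpha` are illustrative assumptions.

```python
import math
import random


def surrogate_error(x):
    """Stand-in for DBN reconstruction error on normalised parameters (NOT a real DBN)."""
    target = (0.2, 0.7, 0.5)  # made-up optimum
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def firefly_search(objective, dim=3, n_fireflies=15, n_iters=100,
                   beta0=1.0, gamma=1.0, alpha=0.1, seed=0):
    """Basic firefly algorithm minimising `objective` over the unit hypercube."""
    rng = random.Random(seed)
    swarm = [[rng.random() for _ in range(dim)] for _ in range(n_fireflies)]
    cost = [objective(x) for x in swarm]
    b = min(range(n_fireflies), key=cost.__getitem__)
    best, best_cost = swarm[b][:], cost[b]
    for _ in range(n_iters):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter, so i moves toward it
                    r2 = sum((a - c) ** 2 for a, c in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attractiveness decays with distance
                    swarm[i] = [
                        min(1.0, max(0.0, a + beta * (c - a) + alpha * (rng.random() - 0.5)))
                        for a, c in zip(swarm[i], swarm[j])
                    ]
                    cost[i] = objective(swarm[i])
                    if cost[i] < best_cost:
                        best, best_cost = swarm[i][:], cost[i]
    return best, best_cost
```

The returned vector would then be mapped back to concrete parameter ranges before the network is configured with it.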
The EA is used to optimise the parameters of the DBN. [24]
applied an adaptive EA in the DBN to automatically tune the parameters without the need
for prior knowledge of the DBN domain. To the best of the authors' knowledge, this is
the only study that applied the EA to DBN parameter optimization. The EA DBN (EADBN) is
evaluated on both benchmark and real world data sets. The results of the evaluation show
that the EADBN enhances the performance of the standard variants of the DBN.
performance shows that the ACO-DDBN performs better than the classical grid based
DDBN and support vector machine in both accuracy and computational time [37].
The GA is one of the earliest nature inspired algorithms and motivated researchers to
propose different variants of nature inspired algorithms. It has been used extensively in
solving optimization problems. The GA is also used to optimise the parameters of deep
learning models. For example, [39] applied the GA and grammatical evolution to reduce the
manual trial and error procedure of determining the parameters of the ConvNet
(GAConvNet and GEConvNet). The evolutionary algorithms are used to determine the
ConvNet architecture and hyperparameters. The GAConvNet and GEConvNet are
evaluated on a benchmark dataset. The results suggested that the GAConvNet and
GEConvNet enhance the performance of the classical ConvNet.
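A GA-driven replacement for manual trial and error can be sketched as below. This is a plain GA with tournament selection, uniform crossover, mutation and elitism; it is not the GAConvNet or GEConvNet of [39], and grammatical evolution is not covered. The search space `SPACE` and the function `surrogate_fitness` (a made-up score standing in for the validation accuracy of a trained ConvNet) are assumptions for illustration.

```python
import random

# Hypothetical discrete search space for a ConvNet; values are illustrative only.
SPACE = {
    "conv_layers": [1, 2, 3, 4],
    "filters": [16, 32, 64, 128],
    "kernel_size": [3, 5, 7],
    "learning_rate": [0.1, 0.01, 0.001],
}
GENES = list(SPACE)


def surrogate_fitness(ind):
    """Stand-in for validation accuracy (NOT a trained ConvNet); higher is better."""
    ideal = {"conv_layers": 3, "filters": 64, "kernel_size": 3, "learning_rate": 0.01}
    return sum(ind[g] == ideal[g] for g in GENES) / len(GENES)


def genetic_search(fitness, pop_size=12, generations=20, mutation_rate=0.1, seed=0):
    """Plain GA over SPACE: tournament selection, uniform crossover, mutation, elitism."""
    rng = random.Random(seed)
    pop = [{g: rng.choice(SPACE[g]) for g in GENES} for _ in range(pop_size)]

    def tournament():
        # Pick two distinct individuals and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    best_per_gen = []
    for _ in range(generations):
        elite = max(pop, key=fitness)
        best_per_gen.append(fitness(elite))
        children = [dict(elite)]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            child = {g: (p1 if rng.random() < 0.5 else p2)[g] for g in GENES}
            for g in GENES:
                if rng.random() < mutation_rate:
                    child[g] = rng.choice(SPACE[g])
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    best_per_gen.append(fitness(best))
    return best, best_per_gen
```

In practice each fitness evaluation would require training a ConvNet, so the population size and number of generations determine the computational cost directly.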
The GA is also applied to the optimization of RBM parameters; two studies were
found to use the GA for this purpose. First, [40] propose the use of the GA for the
automatic design of the RBM (GA-RBM). The GA initializes the RBM weights in order to
determine the number of both visible and hidden neurons. The GA is able to find the
optimum structure of the deep RBM. The GA-RBM was tested on handwritten
classification problems. The results show that the GA-RBM performs better than the
conventional RBM and the shallow structure of the RBM. Second, [41] incorporated the GA
into the DeRBM. The weights of the weighted nearest neighbour (WNN) are evolved using
the GA. The effectiveness of the proposed GA based WNN (GAWNN) and DeRBM (GADeRBM) is
evaluated on classification problems. They are found to perform better than the SVM and
the statistical nearest neighbour.
An overview of the research area is presented in this section to show the strength of
incorporating nature inspired algorithms in deep neural network architectures. It is
clear that the penetration of nature inspired algorithms in deep learning has received
little attention from the research community. This is surprising given
that both deep learning and nature inspired algorithms research have received
unprecedented attention from their research communities (e.g. see [44] for deep learning
and [21] for nature inspired algorithms). Moreover, the two research areas are well
established in solving challenging and complex real world problems. As stated
earlier, there is a lack of synergy between the two research communities. However,
evidence from the literature indicates that combining nature inspired algorithms and
deep learning can improve the performance of the deep learning architecture.
In the studies surveyed, the combination of nature
inspired algorithms and deep learning architectures consistently improved on the
accuracy of the conventional deep learning architecture. In addition, the laborious
trial and error technique of determining the large number of parameters of the deep
learning architecture is eliminated because the optimum parameters are found
automatically by the nature inspired algorithms. As such, the human effort in
determining the optimum parameters is eliminated.
Fig. 2. The trend of the integration of nature inspired algorithms and deep learning
architecture
Fig. 2 depicts the trend of the synergy between nature inspired algorithms and deep
learning. As shown in Fig. 2, although both research areas pre-date 2012,
evidence from the literature indicates that the combination of nature inspired
algorithms and deep learning started appearing in 2012 with two publications. In 2013,
a break occurred, with no publication in this direction. The years 2015 and 2017
witnessed the highest number of works. Although 2018 is still in progress, one
publication has already appeared at the time of writing this manuscript. We observed
that Papa et al. are at the forefront of
promoting the synergy between the nature inspired algorithms and deep learning
research communities. The research area has not received the level of attention it
deserves. In general, it can be deduced that the research area is slowly gaining
acceptance within the research community because the number of studies in the last
four years has increased. The trend is expected to grow at a faster rate in the near
future.
Nature inspired algorithms themselves require parameter settings, and the best
systematic way to find the optimum parameter settings of nature inspired
algorithms remains an open research problem. Therefore, adding a nature inspired
algorithm to deep learning introduces additional parameter settings, although the
number of parameter settings of the deep learning architecture can be reduced because
some of the parameters can be determined automatically by the nature inspired
algorithm. When the parameter settings of the nature inspired algorithm
are not good enough to provide good performance, this has a multiplier effect on
the deep learning architecture: it reduces its performance and can cause the
model to become stuck in local minima. This is because the performance of nature
inspired algorithms depends heavily on their parameter settings. Future work should
focus on parameterless nature inspired algorithms to eliminate the need for human
intervention in setting parameters. We expect future deep learning models to be
autonomous.
The weights of the deep learning architecture play a critical role because the
performance of the architecture depends heavily on its optimal initial
weights. [32] argued that fine-tuning of the weights in an auto-encoder can be
accomplished by gradient descent and that it works well, especially if the initial
weights are near the optimum. The critical influence of the weights on the performance
of deep learning has prompted many researchers to propose various ways of obtaining
optimum weights (e.g. [45-47]). Despite the critical role played by the initial weights
of the deep learning architecture, very little attention has been paid to optimising
these initial weights through nature inspired algorithms.
Intensive study of the application of nature inspired algorithms to optimising deep
learning initial weights should be a major concern of future research.
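One way to picture this research direction is a two stage procedure: a population based search first picks promising initial weights, and gradient descent then fine-tunes them. The sketch below is a hypothetical illustration and does not reproduce [32] or [45-47]. It assumes a toy 2-1-2 auto-encoder, a simple elitist evolutionary search, and gradients estimated by central finite differences; the names `reconstruction_error`, `evolve_initial_weights` and `fine_tune` are introduced here.

```python
import math
import random

# Toy 2-D points lying on a line; a 2-1-2 auto-encoder should compress them.
POINTS = [(0.1, 0.2), (0.4, 0.8), (-0.3, -0.6), (0.5, 1.0), (-0.2, -0.4)]


def reconstruction_error(w):
    """Mean squared reconstruction error of a 2-1-2 auto-encoder with a tanh code."""
    total = 0.0
    for x0, x1 in POINTS:
        code = math.tanh(w[0] * x0 + w[1] * x1)
        total += (w[2] * code - x0) ** 2 + (w[3] * code - x1) ** 2
    return total / len(POINTS)


def evolve_initial_weights(loss, dim=4, pop_size=20, generations=15, sigma=0.3, seed=0):
    """Simple elitist evolutionary search used only to pick a starting point."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=loss)
        parents = pop[: pop_size // 2]  # keep the better half unchanged
        children = [[g + rng.gauss(0, sigma) for g in rng.choice(parents)]
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return min(pop, key=loss)


def fine_tune(loss, w, lr=0.1, steps=200, eps=1e-6):
    """Gradient descent with central-difference gradients.

    A step is kept only if it lowers the loss; otherwise the step size is halved.
    """
    w = list(w)
    current = loss(w)
    for _ in range(steps):
        grad = []
        for i in range(len(w)):
            up, down = w[:], w[:]
            up[i] += eps
            down[i] -= eps
            grad.append((loss(up) - loss(down)) / (2 * eps))
        candidate = [wi - lr * gi for wi, gi in zip(w, grad)]
        cand_loss = loss(candidate)
        if cand_loss < current:
            w, current = candidate, cand_loss
        else:
            lr *= 0.5  # step too large: shrink it and try again
    return w, current
```

Usage would be `w0 = evolve_initial_weights(reconstruction_error)` followed by `fine_tune(reconstruction_error, w0)`. A real deep architecture would use back propagation rather than finite differences; the numerical gradient is used here only to keep the sketch short.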
Although nature inspired algorithms improve accuracy, they sometimes
increase the convergence time of the deep learning architecture. As such, real life
applications in which time is critical are not suitable for deep learning
models incorporating nature inspired algorithms. A typical example is medical
facilities, where a delay of one second can lead to serious harm or death. There is,
however, evidence that nature inspired algorithms can improve the convergence speed of
deep learning models. It can be concluded that the performance of nature inspired
algorithms with regard to convergence speed is not consistent. In the future,
researchers should work on the convergence speed of nature inspired algorithms for deep
learning to ensure its consistency.
One of the major challenges of meta-heuristic algorithms is that they require
meta-optimization in some cases to enhance their performance. The meta-optimization
procedure is excessive in deep learning applications, because a deep learning
application already requires significant computational effort [34]. As such, it can add
to the complexity and challenges of deep learning. In the future, researchers should
work towards reducing the meta-optimization effort in nature inspired algorithms.
[18] pointed out that excessive optimization of the ANN through nature inspired
algorithms reduces the flexibility of the ANN, which can cause over fitting of the
training data. Excessive training of deep learning models with nature inspired
algorithms should be discouraged and kept within a tolerable level.
The survey revealed that some major deep learning architectures such as the GAN, SAE,
DRNN and deep echo state network have not been exploited by nature inspired algorithms.
The GAN [48] is a newly proposed deep learning architecture.
6 Conclusions
This paper presents the recent developments regarding the incorporation of
nature inspired algorithms into deep learning architectures. A concise view of the
recent developments, strengths, challenges and opportunities for future research
regarding the synergy between nature inspired algorithms and deep learning is
presented. It was found that the synergy between the nature inspired algorithms and the
deep learning research communities is limited, considering the little attention it has
attracted in the literature. We believe this paper has the potential to bridge the
communication gap between the nature inspired algorithms and deep learning research
communities. Expert researchers can use this paper as a benchmark for developing the
research area, while novice researchers can use it as initial reading material for
starting research in this domain.
References
[1] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in
nervous activity," The bulletin of mathematical biophysics, vol. 5, pp. 115-133,
1943.
[2] J. Holland, "Adaptation in natural and artificial systems: an introductory analysis
with application to biology," Control and artificial intelligence, 1975.
[3] X.-S. Yang and S. Deb, "Cuckoo search via Lévy flights," in Nature &
Biologically Inspired Computing, 2009. NaBIC 2009. World Congress on, 2009,
pp. 210-214.
[4] D. Karaboga, "An idea based on honey bee swarm for numerical optimization,"
Technical report-tr06, Erciyes University, Engineering Faculty, Computer
Engineering Department, 2005.
[5] X.-S. Yang, S. Deb, S. Fong, X. He, and Y. Zhao, "Swarm intelligence: today and
tomorrow," in 2016 3rd International Conference on Soft Computing & Machine
Intelligence (ISCMI), 2016, pp. 219-223.
[6] H. Chiroma, S. Abdul-kareem, U. Ibrahim, I. G. Ahmad, A. Garba, A. Abubakar,
et al., "Malaria severity classification through Jordan-Elman neural network
based on features extracted from thick blood smear," Neural Network World, vol.
25, p. 565, 2015.
[7] H. Chaoui and C. C. Ibe-Ekeocha, "State of charge and state of health estimation
for lithium batteries using recurrent neural networks," IEEE Transactions on
Vehicular Technology, vol. 66, pp. 8773-8783, 2017.
[8] P. Dolezel, P. Skrabanek, and L. Gago, "Pattern recognition neural network as a
tool for pest birds detection," in Computational Intelligence (SSCI), 2016 IEEE
Symposium Series on, 2016, pp. 1-6.
[9] L. Nie, J. Guan, C. Lu, H. Zheng, and Z. Yin, "Longitudinal speed control of
autonomous vehicle based on a self-adaptive PID of radial basis function neural
network," IET Intelligent Transport Systems, 2018.
[10] A. Bahrammirzaee, "A comparative survey of artificial intelligence applications
in finance: artificial neural networks, expert system and hybrid intelligent
systems," Neural Computing and Applications, vol. 19, pp. 1165-1195, 2010.
[11] Y. Xu, J. Cheng, L. Wang, H.-y. Xia, F. Liu, and D. Tao, "Ensemble One-
Dimensional Convolution Neural Networks for Skeleton-based Action
Recognition," IEEE Signal Processing Letters, 2018.
[12] H. Lam, S. Ling, F. H. Leung, and P. K.-S. Tam, "Tuning of the structure and
parameters of neural network using an improved genetic algorithm," in Industrial
Electronics Society, 2001. IECON'01. The 27th Annual Conference of the IEEE,
2001, pp. 25-30.
[13] H. Chiroma, S. Abdulkareem, A. Abubakar, and T. Herawan, "Neural networks
optimization through genetic algorithm searches: a review," APPLIED
MATHEMATICS, vol. 11, pp. 1543-1564, 2017.
[14] D. Karaboga, B. Akay, and C. Ozturk, "Artificial bee colony (ABC) optimization
algorithm for training feed-forward neural networks," in International conference
on modeling decisions for artificial intelligence, 2007, pp. 318-329.
[15] N. M. Nawi, A. Khan, and M. Z. Rehman, "A new back-propagation neural
network optimized with cuckoo search algorithm," in International Conference on
Computational Science and Its Applications, 2013, pp. 413-426.
[16] C.-F. Juang, "A hybrid of genetic algorithm and particle swarm optimization for
recurrent network design," IEEE Transactions on Systems, Man, and Cybernetics,
Part B (Cybernetics), vol. 34, pp. 997-1006, 2004.
[17] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep
belief nets," Neural computation, vol. 18, pp. 1527-1554, 2006.
[18] S. Fong, S. Deb, and X.-s. Yang, "How meta-heuristic algorithms contribute to
deep learning in the hype of big data analytics," in Progress in Intelligent
Computing Techniques: Theory, Practice, and Applications, ed: Springer, 2018,
pp. 3-25.
[19] J. P. Papa, G. H. Rosa, D. R. Pereira, and X.-S. Yang, "Quaternion-based deep
belief networks fine-tuning," Applied Soft Computing, vol. 60, pp. 328-335,
2017.
[20] I. Fister Jr, X.-S. Yang, I. Fister, J. Brest, and D. Fister, "A brief review of nature-
inspired algorithms for optimization," arXiv preprint arXiv:1307.4186, 2013.
[21] B. Xing and W.-J. Gao, Innovative computational intelligence: a rough guide to
134 clever algorithms: Springer, 2014.
[22] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, p. 436,
2015.
[23] Y. Bengio, A. Courville, and P. Vincent, "Representation learning: A review and
new perspectives," IEEE transactions on pattern analysis and machine
intelligence, vol. 35, pp. 1798-1828, 2013.
[24] C. Zhang, K. C. Tan, H. Li, and G. S. Hong, "A cost-sensitive deep belief network
for imbalanced classification," IEEE Transactions on Neural Networks and
Learning Systems, 2018.
[40] K. Liu, L. M. Zhang, and Y. W. Sun, "Deep Boltzmann machines aided design
based on genetic algorithms," in Applied Mechanics and Materials, 2014, pp. 848-
851.
[41] E. Levy, O. E. David, and N. S. Netanyahu, "Genetic algorithms and deep learning
for automatic painter classification," in proceedings of the 2014 Annual
Conference on Genetic and Evolutionary Computation, 2014, pp. 1143-1150.
[42] L. R. Rere, M. I. Fanany, and A. M. Arymurthy, "Simulated annealing algorithm
for deep learning," Procedia Computer Science, vol. 72, pp. 137-144, 2015.
[43] L.-O. Fedorovici, R.-E. Precup, F. Dragan, R.-C. David, and C. Purcaru,
"Embedding gravitational search algorithms in convolutional neural networks for
OCR applications," in Applied Computational Intelligence and Informatics
(SACI), 2012 7th IEEE International Symposium on, 2012, pp. 125-130.
[44] J. Schmidhuber, "Deep learning in neural networks: An overview," Neural
networks, vol. 61, pp. 85-117, 2015.
[45] I. Chaturvedi, Y.-S. Ong, I. W. Tsang, R. E. Welsch, and E. Cambria, "Learning
word dependencies in text by means of a deep recurrent belief network,"
Knowledge-Based Systems, vol. 108, pp. 144-154, 2016.
[46] K. Mannepalli, P. N. Sastry, and M. Suman, "A novel adaptive fractional deep
belief networks for speaker emotion recognition," Alexandria Engineering
Journal, 2016.
[47] J. Qiao, G. Wang, X. Li, and W. Li, "A self-organizing deep belief network for
nonlinear system modeling," Applied Soft Computing, vol. 65, pp. 170-183,
2018.
[48] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, et
al., "Generative adversarial nets," in Advances in neural information processing
systems, 2014, pp. 2672-2680.