JDSE18 M.Outahar
1 Introduction
In the last few years, neuroevolution [1] has gained interest in the research com-
munity. It has been shown to outperform reinforcement learning algorithms in
situations where the search space is non-convex and noisy or where the gradient
is not available [2]. Neuroevolution refers to the optimization of neural networks
with evolutionary algorithms, an approach that is highly parallelizable and
scalable [3]. This document presents a method to automatically tune the PID
controller of a car-like mobile robot using neuroevolution. The CMA-ES algorithm
(Covariance Matrix Adaptation Evolution Strategy, sometimes abbreviated CMA)
was chosen to optimize the neural network. Evolution strategies are a family of
algorithms loosely based on biological evolution, hence the name. Many
evolutionary algorithms exist, and they all share the same basic steps of
population generation, evaluation, selection and reproduction. CMA-ES has
outperformed many algorithms on black-box optimization problems [4], which is
why it has been used to tune PID controllers in several works with promising
results [5, 6].
As seen in figure 1, the system is composed of a robot controlled by a PID
controller. The state of the robot is estimated by an extended Kalman filter
(EKF), which provides the state x̂ and the corresponding covariance matrix P.
The core idea of this document is to use the covariance matrix, together with
the corresponding error, as inputs to a neural network that outputs the
parameters of the controller in real time. Both CMA-ES blocks are used to define
and optimize the neural network in order to adapt the behavior of the robot to
the level of uncertainty in the measurements.
The goal is to find the optimal parameters KP , KI and KD to control the robot
while taking into account the error and the covariance matrix of the EKF. A
neural network is used because it offers both adaptability and efficiency.
2.1 PID controller
PID controllers are widely used in industry. This is due to their reliability
and simplicity. PID control has shown good performance in multiple cases [7].
The general formula for the PID controller is as follows:
$C(t) = K_P\, e(t) + K_I \int_0^t e(\tau)\, d\tau + K_D \frac{d}{dt} e(t)$  (1)
with $e(t) = \text{actual}(t) - \text{target}(t)$.
$K_P$, $K_I$ and $K_D$ are the proportional, integral and derivative gains respectively.
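In discrete time, equation (1) is commonly implemented by accumulating the error for the integral term and differencing consecutive errors for the derivative term. The following is a minimal sketch (the class name, gains and time step are illustrative, not from the original work):

```python
class PID:
    """Minimal discrete-time PID controller (illustrative sketch)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, actual, target):
        error = actual - target            # e(t) = actual(t) - target(t)
        self.integral += error * self.dt   # rectangle approximation of the integral
        # finite-difference approximation of the derivative (zero on first call)
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

A call such as `PID(1.0, 0.0, 0.0, 0.1).update(2.0, 1.0)` returns the pure proportional response to an error of 1.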
Even though the controller is easy to implement, tuning its parameters is not
a simple task and is a large area of research [8]. The proportional, integral
and derivative actions have different and sometimes conflicting effects. For
example, the proportional term decreases the rise time while the derivative
term increases it, yet both are essential for the stability of complex systems.
This is why finding optimal gains is difficult.
2.2 Neural network
Neural networks are highly connected systems that are used to model complex,
non-linear functions. In a simple representation of a neural network, the
outputs of each layer are multiplied by the weights, summed together with the
biases, and passed through the activation functions. The activation functions
are what makes the system capable of modeling non-linear behavior. Graphically,
the network can be represented as layers of interconnected neurons.
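The layer-by-layer computation described above can be sketched as a plain forward pass. The layer sizes, random weights and tanh activation below are illustrative choices, not the architecture used in the original work:

```python
import numpy as np

def forward(x, weights, biases):
    """Forward pass: each layer multiplies by the weights, adds the
    biases, and applies a non-linear activation (tanh here)."""
    for w, b in zip(weights, biases):
        x = np.tanh(w @ x + b)
    return x

# Illustrative 2-layer network: 3 inputs -> 4 hidden units -> 3 outputs
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(3, 4))]
biases = [np.zeros(4), np.zeros(3)]
gains = forward(np.array([0.1, 0.2, 0.3]), weights, biases)
```

With a tanh output layer, each output is bounded in (-1, 1), so in practice the raw outputs would be rescaled to a useful gain range.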
A large part of the progress in this area is due to the backpropagation
algorithm, which allows a neural network to learn patterns and desired
behaviors. However, backpropagation uses the gradient to optimize the neural
network. In our case, the gradient is not available, therefore the
backpropagation algorithm cannot be used.
3 Neuroevolution
A neural network is used to find the optimal parameters to control the robot
efficiently, even in the presence of uncertainty. In traditional neural
networks, the backpropagation algorithm is used to update the weights and
biases; here, an evolutionary algorithm is used instead. This choice was made
because our problem requires exploration and because neuroevolution is a
gradient-free method whose parallelism can reduce wall-clock training time by
orders of magnitude [2].
3.1 CMA-ES
CMA-ES is an evolutionary algorithm [9]. It was chosen because it has
outperformed most black-box optimization algorithms. The algorithm starts by
generating a population of candidates, which are evaluated and ranked by
fitness. A percentage of the top-performing candidates is selected to
regenerate the new population. The new population is evaluated in turn, and
the cycle continues until a termination condition is met, typically based on
the number of generations or on the resemblance between parents and offspring.
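The generate, evaluate, select, regenerate cycle described above can be sketched with a simplified (mu, lambda) evolution strategy. This sketch deliberately omits the covariance matrix adaptation that gives CMA-ES its name; Hansen's reference implementation [10] provides the full algorithm. All names and hyperparameters here are illustrative:

```python
import numpy as np

def simple_es(objective, x0, sigma=0.5, pop_size=20, n_parents=5,
              generations=50, seed=0):
    """Simplified (mu, lambda) evolution strategy illustrating the
    generate -> evaluate -> select -> regenerate cycle (no covariance
    adaptation, unlike full CMA-ES)."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(x0, dtype=float)
    for _ in range(generations):
        # generate a population of candidates around the current mean
        pop = mean + sigma * rng.normal(size=(pop_size, mean.size))
        # evaluate the candidates and rank them (lower objective = fitter)
        fitness = np.array([objective(c) for c in pop])
        parents = pop[np.argsort(fitness)[:n_parents]]
        # regenerate: new mean from the top-performing candidates
        mean = parents.mean(axis=0)
        sigma *= 0.95  # shrink the search step over the generations
    return mean

# Toy objective with its minimum at (1, 2)
best = simple_es(lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2,
                 x0=[0.0, 0.0])
```

On this toy quadratic the returned mean lands close to the optimum; full CMA-ES additionally learns the shape of the search distribution, which matters on ill-conditioned problems.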
The CMA-ES algorithm takes an objective function as an input and produces the
neural network parameters as outputs. The objective function is critical to the
performance of the optimization. In our case it is set to take into
consideration the absolute error between the noise-free signals and the
reference. In other words, CMA-ES tweaks the neural network parameters in
order to minimize the influence of the noise on the system. This forces the
neural network to learn to control the system based on the level of noise
(the EKF's covariance matrix).
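The objective seen by CMA-ES could then look like the following sketch: a candidate parameter vector is loaded, the closed loop is simulated, and the accumulated absolute tracking error is returned for minimization. The first-order plant, the step reference, and the direct encoding of the gains in the parameter vector are toy stand-ins for the robot simulation and the neural network of the original work:

```python
def fitness(params, n_steps=200, dt=0.05):
    """Candidate fitness: run the closed loop and return the accumulated
    absolute tracking error (to be minimized by CMA-ES).  Toy stand-in:
    params directly encode (KP, KI, KD) instead of network weights."""
    kp, ki, kd = params
    state, integral, prev_err = 0.0, 0.0, 0.0
    reference = 1.0                       # step reference to track
    total_abs_error = 0.0
    for _ in range(n_steps):
        err = state - reference           # e(t) = actual(t) - target(t)
        integral += err * dt
        # negative feedback, since the error is defined as actual - target
        control = -(kp * err + ki * integral + kd * (err - prev_err) / dt)
        prev_err = err
        state += dt * (-state + control)  # first-order toy plant
        total_abs_error += abs(err) * dt
    return total_abs_error
```

With zero gains the plant never moves and the error integrates to its worst-case value, while any reasonable gains reduce the fitness, which is exactly the signal the evolutionary search exploits.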
4 Results and perspectives
Multiple implementations, varying in complexity, were realized for this work.
At first, a fixed PID controller was optimized with CMA-ES to adapt to
fluctuations in the precision of the perception. After this initial phase, a
neural network, itself optimized by CMA-ES, was used to tune a PID controller
online. The architecture of the neural network was chosen by the user. One of
the latest implementations describes the complete system, where both CMA-ES
blocks work together to obtain an optimal system.
Fig. 2. Evolution of the objective function across generations. The size of the step
between generations is displayed in green, the change in the objective function in cyan
and the minimum objective function of each generation in blue. The red asterisk is the
overall minimum objective function found by CMA-ES [10].
References
1. K. O. Stanley and R. Miikkulainen, “Evolving neural networks through augmenting
topologies,” Evolutionary Computation, vol. 10, no. 2, pp. 99–127, 2002.
2. T. Salimans, J. Ho, X. Chen, S. Sidor, and I. Sutskever, “Evolution Strategies as
a Scalable Alternative to Reinforcement Learning,” ArXiv e-prints, 2017.
3. X. Zhang, J. Clune, and K. O. Stanley, “On the relationship between the openai
evolution strategy and stochastic gradient descent,” CoRR, vol. abs/1712.06564,
2017.
4. I. Loshchilov, “CMA-ES with restarts for solving CEC 2013 benchmark problems,”
June 2013.
5. M. S. Saad, H. Jamaluddin, and I. Mat Darus, “PID controller tuning using
evolutionary algorithms,” vol. 7, pp. 139–149, January 2012.
6. K. Marova, “Using CMA-ES for tuning coupled PID controllers within models of
combustion engines,” CoRR, vol. abs/1609.06741, 2016.
7. Y. Wakasa, S. Kanagawa, K. Tanaka, and Y. Nishimura, “PID Controller Tuning
Based on the Covariance Matrix Adaptation Evolution Strategy,” IEEJ Transac-
tions on Electronics, Information and Systems, vol. 130, pp. 737–742, 2010.
8. B. Doicin, M. Popescu, and C. Patrascioiu, “PID controller optimal tuning,” 2016
8th International Conference on Electronics, Computers and Artificial Intelligence
(ECAI), June 2016.
9. N. Hansen, “The CMA evolution strategy: A tutorial,” 2010.
10. N. Hansen, “CMA-ES source code.”