Chrisina Jayne
Lazaros Iliadis (Eds.)
Engineering Applications
of Neural Networks
17th International Conference, EANN 2016
Aberdeen, UK, September 2–5, 2016
Proceedings
Communications in Computer and Information Science 629
Commenced Publication in 2007
Founding and Former Series Editors:
Alfredo Cuzzocrea, Dominik Ślęzak, and Xiaokang Yang
Editorial Board
Simone Diniz Junqueira Barbosa, Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Rio de Janeiro, Brazil
Phoebe Chen, La Trobe University, Melbourne, Australia
Xiaoyong Du, Renmin University of China, Beijing, China
Joaquim Filipe, Polytechnic Institute of Setúbal, Setúbal, Portugal
Orhun Kara, TÜBİTAK BİLGEM and Middle East Technical University, Ankara, Turkey
Igor Kotenko, St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, St. Petersburg, Russia
Ting Liu, Harbin Institute of Technology (HIT), Harbin, China
Krishna M. Sivalingam, Indian Institute of Technology Madras, Chennai, India
Takashi Washio, Osaka University, Osaka, Japan
More information about this series at http://www.springer.com/series/7899
Editors
Chrisina Jayne, Robert Gordon University, Aberdeen, UK
Lazaros Iliadis, Lab of Forest Informatics (FiLAB), Democritus University of Thrace, Orestiada, Greece
General Chair
Chrisina Jayne, Robert Gordon University, UK
Advisory Chair
Nikola Kasabov, Auckland University of Technology, New Zealand
Program Chairs
Chrisina Jayne, Robert Gordon University, UK
Lazaros Iliadis, Democritus University of Thrace, Greece
Program Committee
A. Canuto, Federal University of Rio Grande do Norte, Brazil
A. Petrovski, Robert Gordon University, UK
B. Beliczynski, Institute of Control and Industrial Electronics, Poland
D. Coufal, Czech Academy of Sciences
D. Pérez, University of Oviedo, Spain
E. Kyriacou, Frederick University, Cyprus
H. Leopold, Austrian Institute of Technology GmbH, Austria
I. Bukovsky, Czech Technical University in Prague, Czech Republic
J.F. De Canete Rodriguez, University of Malaga, Spain
K.L. Kermanidis, Ionian University, Greece
K. Margaritis, University of Macedonia, Greece
M. Holena, Academy of Sciences of the Czech Republic
M. Fiasche, Politecnico di Milano, Italy
M. Trovati, Derby University, UK
N. Wiratunga, Robert Gordon University, UK
P. Hajek, University of Pardubice, Czech Republic
P. Kumpulainen, Tampere University of Technology, Finland
S. Massie, Robert Gordon University, UK
V. Kurkova, Czech Academy of Sciences
Z. Ding, Hewlett Packard Enterprise, USA
Supporting Organizations
Semi-supervised Modeling
Classification Applications
Clustering Applications
Elastic Net Application: Case Study to Find Solutions for the TSP in a Beowulf Cluster Architecture (Marcos Lévano and Andrea Albornoz), p. 123
Predictive Model for Detecting MQ2 Gases Using Fuzzy Logic on IoT Devices (Catalina Hernández, Sergio Villagrán, and Paulo Gaona), p. 176
Time-Series Prediction
Learning-Algorithms
Short Papers
Tutorials
1 Introduction
One of the important aspects of artificial intelligence is the ability of autonomous
agents to behave effectively and realistically in a given task. There is a rising
demand for applications in which agents can act and make decisions similar to
human behavior in order to achieve a goal. Imitation learning is a paradigm in
which an agent learns how to behave by observing demonstrations of correct behavior.
Active learning is then used to refine the policy with a limited number of queried instances. Once trained, the agent is
able to extract features from the scene and predict actions in real time. We con-
duct our experiments on a benchmark testbed that makes it straightforward to replicate
our results and compare with other approaches.
Benchmark environments are useful tools for evaluating intelligent agents.
A few benchmarks are available for 2D tasks such as [3,15,25] and are being
increasingly employed in the literature. 3D environments however have not been
as widely explored, although they provide a closer simulation to real robotic
applications. We use mash-simulator [19] as our testbed to facilitate the evalua-
tion and comparison of learning methods. It is also convenient for extending the
experiments to different navigation tasks within the same framework.
In the next section we review related work. Section 3 describes the proposed
methods. Section 4 details our experiments and results. Finally we present our
conclusions and discuss future steps in Sect. 5.
2 Related Work
2.1 Navigation
Navigation tasks have been of interest in AI in general, and in imitation learning
specifically, from an early stage. Sammut et al. [29] provide an early example
of an aircraft learning autonomous flight from demonstrations provided via
remote control. Later research tackles more elaborate navigation problems, includ-
ing obstacles and objects of interest. Chernova et al. [7] use Gaussian mixture
models to teach a robot to navigate through a maze. The robot is fitted with
an IR sensor to provide information about the proximity of obstacles. This data,
coupled with input from a teacher controlling the robot, is used to learn a policy.
The robot is then able to make a decision to execute one of 4 motion primi-
tives (unit actions) based on its sensory readings. In [10] the robot uses a laser
sensor to detect and recognize objects of interest. A policy is learned to predict
subgoals associated with the detected objects rather than directly predicting
the motion primitives. Such sensing methods provide an abstract view of the
environment, but can’t convey visual details that might be needed for intelligent
agents to mimic human behavior. [22] use neural networks to learn a policy for
driving a car in a racing game using features extracted from the game engine (such
as the position of the car relative to the track). Driving is a complex task compared
to other navigation problems due to the complexity of the possible actions. The
outputs of the neural network in [22] are high-DOF, low-level actions. However,
the features extracted from the game engine to train the policy would be dif-
ficult to extract in the real world. Advances in computational resources have
prompted the use of visual data over simpler sensory data. Visual sensors pro-
vide detailed information about the agent’s surroundings and are suitable for use in
real-world applications. In [28] a policy for a racing game is learned from visual
data. Demonstrations are provided by capturing the game’s video stream and the
controller input. The raw (downsampled) frames, without extracted engineered
features, are used as input to train a neural network.
Deep learning methods are highly effective in problems that don’t have estab-
lished sets of engineered features. CNNs have been used with great success to
extract features from images. In recent studies [20,21] CNNs are coupled with
reinforcement learning to learn several Atari games. A sequence of raw frames is
used as input to the network and trial and error is used to learn a policy. Trial
and error methods such as reinforcement learning have been extensively used
to learn policies for intelligent agents [16]. However, providing demonstrations
of correct behavior can greatly expedite the learning rate. Moreover, learning
through trial and error can lead the agent to learn a way of performing the
task that doesn’t seem natural or intuitive to a human observer. In [12] learn-
ing from demonstrations is applied to the same Atari benchmark. A supervised
network is used to train a policy using samples from a high-performing but non-
real-time agent. This approach is reported to outperform agents that learn from
scratch through reinforcement learning. Other examples of using deep learning
to play games include learning the game of ‘GO’ using supervised convolution
networks [9] and a combination of supervised and reinforcement learning [33].
These examples all focus on learning 2D games that have a fixed view. However,
in real applications, visual sensors would capture 3D scenes, and the sensors
would most likely be mounted on the agent which means it is unrealistic to have
a fixed view of the entire scene at all times.
In [18] a robot is trained to perform a number of object manipulation tasks.
First a trajectory is learned using reinforcement learning with the position of
the objects and targets known to the robot. These trajectories then serve as
demonstrations to train a supervised convolutional neural network. In this case no
demonstrations need to be provided by a teacher. However, this approach
requires expert knowledge for the initial setup of the reinforcement learning
phase. Compared to related work that employs deep learning to teach an intel-
ligent agent, this is a realistic application implemented with a physical robot.
However, the features are extracted from a set scene with small variations. This
is different from applications where the agent moves and turns around, thereby
completely altering its view.
In other work, active learning is used to teach a robot to perform navigation tasks. The agent estimates a
confidence measure for its prediction and queries a teacher for the correct action
when the confidence is low. Erroneous behavior may also be identified by the
teacher. In [5] the robot is allowed to perform the task while a human teacher
physically adjusts its actions, which in turn provides corrected demonstrations.
Some imitation learning tasks involve actions that are performed continuously
over a period of time (i.e., an action comprises a series of motions performed
in sequence). In such cases a correction can be provided by the teacher at any
point in the action trajectory [14,28]. This way the agent is able to adapt to
errors in the trajectory.
3 Proposed Method
In this section we detail our proposed method for learning navigation tasks from
demonstrations. The source code for this work can be accessed at:
https://github.com/ahmedsalaheldin/ImitationMASH.git
A pooling layer down-samples the output of the convolution layer. The convolution layers take
advantage of spatial connections between visual features to reduce connections
in the network. The pooling layers reduce the dimensionality to further alleviate
the computations needed. Our network follows the pattern in [21]. It consists of
3 convolution layers each followed by a pooling layer. The input to the first layer
is a frame of 120 × 90 pixels. We apply a luminance map to the colored images
to obtain one value for each pixel instead of 3 channels, resulting in a feature
vector of size 10,800. Figure 1 shows the architecture of the network. The filter
sizes for the three layers are 7×9, 5×5 and 4×5 respectively, and the numbers of
filters are 20, 50 and 70 respectively. The pooling layers all use max pooling of shape
(2, 2). Following the last convolution layer is a fully connected hidden layer with a
rectifier activation function and a fully connected output layer with three output
nodes representing the 3 possible actions. Table 1 summarizes the architecture
of the network.
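To make the layer dimensions concrete, the following is a minimal sketch of this architecture written with PyTorch for illustration (the implementation described later in the paper uses Theano); the hidden-layer width of 512 and the kernel orientation are assumptions, since they are not stated in the text above.

import torch
import torch.nn as nn

class NavigationCNN(nn.Module):
    """Sketch of the three-convolution-layer network described above."""
    def __init__(self, n_actions: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            # Input: a single 90 x 120 luminance frame (one value per pixel).
            nn.Conv2d(1, 20, kernel_size=(7, 9)),    # 20 filters of size 7 x 9
            nn.ReLU(),
            nn.MaxPool2d(2),                          # 2 x 2 max pooling
            nn.Conv2d(20, 50, kernel_size=(5, 5)),    # 50 filters of size 5 x 5
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(50, 70, kernel_size=(4, 5)),    # 70 filters of size 4 x 5
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(512),          # fully connected hidden layer; width 512 is assumed
            nn.ReLU(),                   # rectifier activation
            nn.Linear(512, n_actions),   # three output nodes, one per action
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Example: one 120 x 90 grayscale frame produces scores for the 3 actions.
scores = NavigationCNN()(torch.rand(1, 1, 90, 120))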
Active learning is employed to improve the initial policy learned from demon-
strations. This is achieved by acquiring a new data set to train the agent that
emphasizes the weaknesses of the initial policy. The agent is allowed to perform
the task for a number of rounds. For each prediction the network’s confidence
is calculated, and if the confidence is low the optimal policy is queried for the
correct action. The action provided by the teacher is performed by the agent
and is recorded along with the frame image. The confidence is measured as the
entropy of the output of the final layer in the network. The entropy H(X) is
calculated as:
H(X) = -\sum_{i} P(x_i) \log_2 P(x_i)    (1)
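As a minimal sketch of this confidence test, the snippet below computes the entropy of the output distribution and falls back to the teacher when it is too high; the threshold value and the query_teacher callback are hypothetical parameters introduced here for illustration, not values taken from the paper.

import numpy as np

def entropy(probs: np.ndarray) -> float:
    # H(X) = -sum_i P(x_i) * log2 P(x_i) over the action probabilities.
    probs = probs[probs > 0]          # skip zero entries to avoid log2(0)
    return float(-(probs * np.log2(probs)).sum())

def choose_action(probs: np.ndarray, query_teacher, threshold: float = 0.9):
    """Pick an action, querying the optimal policy when confidence is low."""
    if entropy(probs) > threshold:    # high entropy means low confidence
        action = query_teacher()      # teacher supplies the correct action
        queried = True                # frame and action are stored for retraining
    else:
        action = int(np.argmax(probs))
        queried = False
    return action, queried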
4 Experiments
We conduct our experiments in the framework of mash-simulator [19]. Mash-
simulator is a tool for benchmarking computer vision techniques for naviga-
tion tasks. The simulator includes a number of different tasks and environments,
as well as optimal policies for a number of tasks. All navigation is viewed from a
first-person perspective. The player has 4 possible actions: ‘Go forward’,
‘Turn left’, ‘Turn right’ and ‘Go back’. Although there are 4 possible actions, the
action ‘Go back’ was never used in the demonstrations by the optimal policy.
Therefore the network is only presented with 3 classes in the training set and
thus has 3 output nodes.
4.1 Tasks
The experiments are conducted on the following 4 navigation tasks:
Reach the Flag. This task is set in a single rectangular room with a flag placed
randomly in the room. The goal is to reach the flag. The task fails if the flag is
not reached within a time limit.
Follow the Line. This task is set in a room with directed lines drawn on the
floor. The lines show the direction to follow in order to reach the flag. The target
is to follow the line to the flag, and the agent fails if it deviates from the line on
the floor.
Reach the Correct Object. In this task two objects are placed on pedestals
in random positions in the room. The objective is to reach the pedestal with the
trophy on it. The task fails if a time limit is reached or if the player reaches the
wrong object. The wrong object has the same material as the trophy and can
take different shapes.
Eat All Disks. This task is set in a large room containing several black disks
on the floor. The target is to keep reaching the disks. A disk is ‘eaten’ and disappears
once the agent reaches it. New disks appear when one is eaten. The goal
of this task is to eat as many disks as possible within a time limit.
Figures 2, 3, 4 and 5 show sample images of the 4 tasks in the 120 × 90 size
used in the experiments.
4.2 Setup
To evaluate the proposed methods, the performance of the agent is measured
over 1,000 rounds. A round starts when the task is initialized and ends when
the agent reaches the target or a time limit is reached. The number of frames
in a round might vary depending on how fast the agent can reach the target.
For all tasks, in each round the environment is randomized, including room size
and shape, lighting, and the locations of the target and the agent. A time limit
is set for each round and the round fails if the limit is reached before the agent
reaches the target. The time limit is measured in frames to avoid any issues with
different frame rates. The time limit is set as the maximum time needed for the
optimal policy to finish the task, which is 500 frames for “Reach the flag” and
“Reach the correct object” and 5000 frames for “Follow the line”. In “Eat all
disks” the task is continuous, so a time limit was set to match the total number
of frames in the other tasks.
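The evaluation loop can be sketched as follows, assuming a hypothetical environment interface (reset_round, get_frame, send_action, target_reached) that stands in for the simulator client described in the next paragraphs; the helper names are illustrative only.

def evaluate(policy, env, n_rounds: int = 1000, time_limit: int = 500) -> float:
    """Fraction of rounds in which the agent reaches the target within the limit."""
    successes = 0
    for _ in range(n_rounds):
        env.reset_round()                    # randomize room size, lighting, targets
        for _ in range(time_limit):          # time limit is counted in frames
            frame = env.get_frame()
            env.send_action(policy(frame))   # predict and execute one action
            if env.target_reached():
                successes += 1
                break
    return successes / n_rounds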
The agent communicates with the simulator via a TCP connection as follows: the agent requests a task from
the server, the server initiates a round and sends an image to the client. The
client sends an action to the server. The server advances the simulation and
responds with a new image. Figure 6 shows a flowchart of the data collection
process.
The network used for prediction is also decoupled from the agent. The net-
work acts as a predicting server where an agent sends frames that it receives
from the simulator and in return receives a decision from the network. The
entire process of communication with both servers occurs in real time. This
implementation facilitates experimentation, as making changes to the network
doesn’t affect the client or the simulator server. Moreover, it is easier to extend
this system to physical robots. A predicting server can be located on the robot or
on another machine if the robot’s computational capabilities are not sufficient.
A predicting server can also serve multiple agents simultaneously. The agent
client is implemented in C++ to facilitate interfacing with the mash-simulator.
The predicting server and the training process are implemented in Python using
the Theano deep learning library [34]. Figure 7 shows a flowchart of the agent
performing a task.
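A rough sketch of the decoupled client loop is shown below, assuming a simple line-based protocol; the ports, message format, and framing are hypothetical and only illustrate how frames are relayed from the simulator server to the predicting server.

import socket

def run_round(sim_addr=("localhost", 11200), predictor_addr=("localhost", 11300)):
    sim = socket.create_connection(sim_addr)               # simulator server
    predictor = socket.create_connection(predictor_addr)   # network prediction server
    sim.sendall(b"REQUEST_TASK\n")                         # ask for a new round
    while True:
        frame = sim.recv(65536)            # image for the current step
        if not frame:                      # connection closed: round is over
            break
        predictor.sendall(frame)           # forward the frame to the network
        action = predictor.recv(64)        # e.g. b"GO_FORWARD\n"
        sim.sendall(action)                # execute the predicted action
    sim.close()
    predictor.close()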
4.4 Results
In this section we present the results of the proposed method. The same network
and parameters are used to learn all tasks. For each task 20,000 images are used
for training. Testing is conducted by allowing an agent to attempt the tasks
in the mash-simulator and recording the number of successful attempts. An
agent’s performance for the first 3 tasks is evaluated as the percentage of times
it reaches the target in 1,000 rounds. For “Eat all disks”, the performance is
measured as the number of disks eaten in 1,000 rounds. We also report the
classification error on an unseen test set of 20,000 images collected from the
teacher’s demonstrations.
Table 2 shows the results for the first 3 tasks. The success measure is the
percentage of rounds (out of 1,000) in which the agent reached the target, while the
error is the classification error on the test set collected from the teacher’s demon-
strations. The agent performs well on “Reach the flag” and is significantly less
successful in the other two tasks. “Follow the line” is considerably less fault tol-
erant than “Reach the flag”, as a small error can result in the agent deviating
from the line and subsequently failing the round, whereas in “Reach the flag” the
agent can continue to search for the target after a wrong prediction. In “Reach
the correct object” the agent is not able to effectively distinguish between the
two objects. This could be attributed to insufficient visual details in the training
set, as the teacher avoids the wrong object from a distance. Qualitative analysis
of “Reach the flag” shows that the agent aims towards corners as they resemble
the erect flag from a distance. Upon approaching the corner, as the details of the
image become clearer, the agent stops recognizing it as the target and continues
its search. While this did not pose a big problem for the agent’s ability to exe-
cute the task, it is interesting to examine the ability of CNNs to distinguish small
details in such environments. It is also worth noting that the teacher’s policy for
“Reach correct object” does not avoid the wrong object if it is in the way of the
target, and achieves an 80.2 % success rate.
Table 3 shows results for the 4th task “Eat all disks”. The table shows the
score of the agent compared to the score achieved using the optimal policy. The
agent achieves 97.9 % of the score obtained by the optimal policy.
To improve the agent’s ability to adapt to wrong predictions and unseen sit-
uations, active learning is used to train the agent on “Follow the line”. In the
other tasks where the agent searches for the target, the optimal policy remem-
bers the location of the target even if it goes out of view due to agent error.
The task in which the time limit affected the performance was “Reach the
flag”, as the agent continues to follow its policy in search of the flag even after
making wrong predictions. The effect of the time limit is evaluated in Fig. 9,
which presents the success rate of the “Reach the flag” task with different time limits.
The horizontal axis represents the time limit as a percentage of the maximum
time needed by the teacher. The graph shows that the longer the agent is allowed
to look for the target, the higher the success rate.
Overall the results show good performance on 3 out of the 4 tasks. They
demonstrate the effectiveness of active learning to significantly improve a weak
policy with a limited number of samples. Even without active learning the agent
can learn a robust policy for simple navigation tasks.
Fig. 9. Results for “reach the flag” task with increasing time limits
References
1. Abbeel, P., Coates, A., Quigley, M., Ng, A.Y.: An application of reinforcement
learning to aerobatic helicopter flight. Adv. Neural Inf. Process. Syst. 19, 1 (2007)
2. Argall, B.D., Chernova, S., Veloso, M., Browning, B.: A survey of robot learning
from demonstration. Robot. Auton. Syst. 57(5), 469–483 (2009)
3. Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning
environment: an evaluation platform for general agents (2012). arXiv preprint
arXiv:1207.4708
4. Bemelmans, R., Gelderblom, G.J., Jonker, P., De Witte, L.: Socially assistive
robots in elderly care: a systematic review into effects and effectiveness. J. Am.
Med. Direct. Assoc. 13(2), 114–120 (2012)