Chrisina Jayne
Lazaros Iliadis (Eds.)

Communications in Computer and Information Science 629

Engineering Applications
of Neural Networks
17th International Conference, EANN 2016
Aberdeen, UK, September 2–5, 2016
Proceedings

Communications in Computer and Information Science 629
Commenced Publication in 2007
Founding and Former Series Editors:
Alfredo Cuzzocrea, Dominik Ślęzak, and Xiaokang Yang

Editorial Board
Simone Diniz Junqueira Barbosa
Pontifical Catholic University of Rio de Janeiro (PUC-Rio),
Rio de Janeiro, Brazil
Phoebe Chen
La Trobe University, Melbourne, Australia
Xiaoyong Du
Renmin University of China, Beijing, China
Joaquim Filipe
Polytechnic Institute of Setúbal, Setúbal, Portugal
Orhun Kara
TÜBİTAK BİLGEM and Middle East Technical University, Ankara, Turkey
Igor Kotenko
St. Petersburg Institute for Informatics and Automation of the Russian
Academy of Sciences, St. Petersburg, Russia
Ting Liu
Harbin Institute of Technology (HIT), Harbin, China
Krishna M. Sivalingam
Indian Institute of Technology Madras, Chennai, India
Takashi Washio
Osaka University, Osaka, Japan
More information about this series at http://www.springer.com/series/7899
Chrisina Jayne Lazaros Iliadis (Eds.)

Engineering Applications
of Neural Networks
17th International Conference, EANN 2016
Aberdeen, UK, September 2–5, 2016
Proceedings

Editors
Chrisina Jayne
Robert Gordon University
Aberdeen, UK

Lazaros Iliadis
Lab of Forest Informatics (FiLAB)
Democritus University of Thrace
Orestiada, Greece

ISSN 1865-0929            ISSN 1865-0937 (electronic)
Communications in Computer and Information Science
ISBN 978-3-319-44187-0    ISBN 978-3-319-44188-7 (eBook)
DOI 10.1007/978-3-319-44188-7

Library of Congress Control Number: 2016947184

© Springer International Publishing Switzerland 2016


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, express or implied, with respect to the material contained herein or for any errors or
omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer International Publishing AG Switzerland
Preface

The 17th International Conference on Engineering Applications of Neural Networks
(EANN) was held at Robert Gordon University in Aberdeen, UK, during September
2–5, 2016. The supporters of the conference were the International Neural Network
Society (INNS), The Scottish Informatics and Computer Science Alliance (SICSA),
Visit Aberdeen, Visit Scotland, and Robert Gordon University in Aberdeen, UK.
EANN 2016 attracted delegates from 12 countries across the world: the Czech
Republic, China, Chile, Colombia, Greece, Italy, Japan, Poland, Portugal, Russia,
the UK, and the USA.
The volume includes 22 full papers, three short papers, and two tutorial papers. All
papers were subject to a rigorous peer-review process by at least two independent
academic referees. EANN 2016 accepted approximately 53 % of the submitted papers
as full papers. The authors of the best 10 papers were invited to submit extended
contributions for inclusion in a special issue of Neural Computing and Applications
(Springer). The papers demonstrate a variety of novel neural network and other
computational intelligence approaches applied to challenging real-world problems. The
papers cover topics such as: convolutional neural networks and deep learning appli-
cations, real-time systems, ensemble classification, chaotic neural networks, self-
organizing maps applications, intelligent cyber physical systems, text analysis, emotion
recognition, and optimization problems.
The following keynote speakers were invited and gave lectures on exciting neural
network application topics:
– Professor Nikola Kasabov, Director and Founder, Knowledge Engineering and
Discovery Research Institute (KEDRI), Chair of Knowledge Engineering, Auckland
University of Technology, New Zealand
– Professor Marley Vellasco, Head of the Electrical Engineering Department and the
Applied Computational Intelligence Laboratory (ICA) at PUC-Rio, Brazil
– Professor John MacIntyre, Dean of the Faculty of Applied Sciences, Pro Vice
Chancellor Director of Research, Innovation and Employer Engagement, University
of Sunderland, UK
On behalf of the conference Organizing Committee we would like to thank all those
who contributed to the organization of this year’s program, and in particular the
Program Committee members.

September 2016                                    Chrisina Jayne
                                                  Lazaros Iliadis
Organization

General Chair
Chrisina Jayne Robert Gordon University, UK

Advisory Chair
Nikola Kasabov Auckland University of Technology, New Zealand

Program Chairs
Chrisina Jayne Robert Gordon University, UK
Lazaros Iliadis Democritus University of Thrace, Greece

Local Organizing Committee Chair
Michael Heron Robert Gordon University, UK

Program Committee
A. Canuto Federal University of Rio Grande do Norte, Brazil
A. Petrovski Robert Gordon University, UK
B. Beliczynski Institute of Control and Industrial Electronics, Poland
D. Coufal Czech Academy of Sciences
D. Pérez University of Oviedo, Spain
E. Kyriacou Frederick University, Cyprus
H. Leopold Austrian Institute of Technology GmbH, Austria
I. Bukovsky Czech Technical University in Prague, Czech Republic
J.F. De Canete Rodriguez University of Malaga, Spain
K.L. Kermanidis Ionian University, Greece
K. Margaritis University of Macedonia, Greece
M. Holena Academy of Sciences of the Czech Republic
M. Fiasche Politecnico di Milano, Italy
M. Trovati Derby University, UK
N. Wiratunga Robert Gordon University, UK
P. Hajek University of Pardubice, Czech Republic
P. Kumpulainen Tampere University of Technology, Finland
S. Massie Robert Gordon University, UK
V. Kurkova Czech Academy of Sciences
Z. Ding Hewlett Packard Enterprise, USA
A. Papaleonidas Democritus University of Thrace, Greece
A. Kalampakas Democritus University of Thrace, Greece
B. Ribeiro University of Coimbra, Portugal
D. Gorse University College London, UK
E. Elyan Robert Gordon University, UK
F. Marcelloni University of Pisa, Italy
I. Bougoudis Democritus University of Thrace, Greece
I. Stephanakis Hellenic Telecommunication Organization SA, Greece
K. Demertzis Democritus University of Thrace, Greece
K. Koutroumbas National Observatory of Athens, Greece
M. Gaber Robert Gordon University, UK
M. Kolehmainen Environmental Science, University of Eastern Finland
M. Tauber Austrian Institute of Technology GmbH, Austria
N. Nicolaou Imperial College London, UK
P. Gastaldo Università degli Studi di Genova, Italy
P. Vidnerov Czech Academy of Sciences
R. Tanscheit PUC-Rio, Brazil
S. Sani Robert Gordon University, UK
Y. Manolopoulos Aristotle University of Thessaloniki, Greece

Supporting Organizations

International Neural Network Society (INNS)
The Scottish Informatics and Computer Science Alliance (SICSA)
Visit Aberdeen
Visit Scotland
Robert Gordon University, Aberdeen, UK
Contents

Active Learning and Dynamic Environments

Deep Active Learning for Autonomous Navigation . . . . . . . . . . . . . . . 3
   Ahmed Hussein, Mohamed Medhat Gaber, and Eyad Elyan

2D Recurrent Neural Networks for Robust Visual Tracking
of Non-Rigid Bodies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
   G.L. Masala, B. Golosio, M. Tistarelli, and E. Grosso

Choice of Best Samples for Building Ensembles in Dynamic Environments . . . 35
   Joana Costa, Catarina Silva, Mário Antunes, and Bernardete Ribeiro

Semi-supervised Modeling

Semi-supervised Hybrid Modeling of Atmospheric Pollution
in Urban Centers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
   Ilias Bougoudis, Konstantinos Demertzis, Lazaros Iliadis,
   Vardis-Dimitris Anezakis, and Antonios Papaleonidas

Classification Applications

Predicting Abnormal Bank Stock Returns Using Textual Analysis
of Annual Reports – a Neural Network Approach . . . . . . . . . . . . . . . 67
   Petr Hájek and Jana Boháčová

Emotion Recognition Using Facial Expression Images
for a Robotic Companion . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
   Ariel Ruiz-Garcia, Mark Elshaw, Abdulrahman Altahhan, and Vasile Palade

Application of Artificial Neural Networks for Analyses of EEG Record
with Semi-Automated Etalons Extraction: A Pilot Study . . . . . . . . . . . 94
   Hana Schaabova, Vladimir Krajca, Vaclava Sedlmajerova,
   Olena Bukhtaieva, Lenka Lhotska, Jitka Mohylova, and Svojmil Petranek

Clustering Applications

Economies Clustering Using SOM-Based Dissimilarity . . . . . . . . . . . . . 111
   Adam Chudziak

Elastic Net Application: Case Study to Find Solutions for the TSP
in a Beowulf Cluster Architecture . . . . . . . . . . . . . . . . . . . . . . 123
   Marcos Lévano and Andrea Albornoz

Comparison of Methods for Automated Feature Selection Using
a Self-organising Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
   Aliyu Usman Ahmad and Andrew Starkey

EEG-Based Condition Clustering Using Self-Organising Neural
Network Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
   Hassan Hamdoun and Aliyu Ahmad Usman

Cyber-Physical Systems and Cloud Applications

Intelligent Measurement in Unmanned Aerial Cyber Physical Systems
for Traffic Surveillance . . . . . . . . . . . . . . . . . . . . . . . . . . 161
   Andrei Petrovski, Prapa Rattadilok, and Sergey Petrovskii

Predictive Model for Detecting MQ2 Gases Using Fuzzy Logic
on IoT Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
   Catalina Hernández, Sergio Villagrán, and Paulo Gaona

A Multi-commodity Network Flow Model for Cloud Service Environments . . . . 186
   Ioannis M. Stephanakis, Syed Noor-Ul-Hassan Shirazi,
   Antonios Gouglidis, and David Hutchison

Designing a Context-Aware Cyber Physical System for Smart Conditional
Monitoring of Platform Equipment . . . . . . . . . . . . . . . . . . . . . . 198
   Farzan Majdani, Andrei Petrovski, and Daniel Doolan

Time-Series Prediction

Convolutional Radio Modulation Recognition Networks . . . . . . . . . . . . 213
   Timothy J. O’Shea, Johnathan Corgan, and T. Charles Clancy

Mutual Information with Parameter Determination Approach for Feature
Selection in Multivariate Time Series Prediction . . . . . . . . . . . . . . 227
   Tianhong Liu, Haikun Wei, Chi Zhang, and Kanjian Zhang

Learning-Algorithms

On Learning Parameters of Incremental Learning in Chaotic Neural
Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
   Toshinori Deguchi and Naohiro Ishii

Accelerated Optimal Topology Search for Two-Hidden-Layer Feedforward
Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
   Alan J. Thomas, Simon D. Walters, Miltos Petridis,
   Saeed Malekshahi Gheytassi, and Robert E. Morgan

An Outlier Ranking Tree Selection Approach to Extreme Pruning
of Random Forests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
   Khaled Fawagreh, Mohamed Medhat Gaber, and Eyad Elyan

Lower Bounds on Complexity of Shallow Perceptron Networks . . . . . . . . . 283
   Věra Kůrková

Kernel Networks for Function Approximation . . . . . . . . . . . . . . . . . 295
   David Coufal

Short Papers

Simple and Stable Internal Representation by Potential Mutual Information
Maximization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
   Ryotaro Kamimura

Urdu Speech Corpus and Preliminary Results on Speech Recognition . . . . . . 317
   Hazrat Ali, Nasir Ahmad, and Abdul Hafeez

Bio-inspired Audio-Visual Speech Recognition Towards the Zero
Instruction Set Computing . . . . . . . . . . . . . . . . . . . . . . . . . . 326
   Mario Malcangi and Hao Quan

Tutorials

Classification of Unbalanced Datasets and Detection of Rare Events
in Industry: Issues and Solutions . . . . . . . . . . . . . . . . . . . . . . 337
   Marco Vannucci and Valentina Colla

Variable Selection for Efficient Design of Machine Learning-Based
Models: Efficient Approaches for Industrial Applications . . . . . . . . . . 352
   Silvia Cateni and Valentina Colla

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367


Active Learning and Dynamic Environments

Deep Active Learning for Autonomous Navigation

Ahmed Hussein, Mohamed Medhat Gaber, and Eyad Elyan

School of Computing, Robert Gordon University,
Garthdee Road, Aberdeen AB10 7QB, UK
[email protected]

Abstract. Imitation learning refers to an agent’s ability to mimic a
desired behavior by learning from observations. A major challenge facing
learning from demonstrations is to represent the demonstrations in a
manner that is adequate for learning and efficient for real-time decisions.
Creating feature representations is especially challenging when extracted
from high-dimensional visual data. In this paper, we present a method
for imitation learning from raw visual data. The proposed method is
applied to a popular imitation learning domain that is relevant to a
variety of real-life applications, namely navigation. To create a training
set, a teacher uses an optimal policy to perform a navigation task, and the
actions taken are recorded along with visual footage from the first-person
perspective. Features are automatically extracted and used to learn
a policy that mimics the teacher via a deep convolutional neural
network. A trained agent can then predict an action to perform based on
the scene it finds itself in. This method is generic, and the network is
trained without knowledge of the task, targets or environment in which it
is acting. Another common challenge in imitation learning is generalizing
a policy to situations unseen in the training data. To address this challenge,
the learned policy is subsequently improved by employing active learning.
While the agent is executing a task, it can query the teacher for
the correct action to take in situations where it has low confidence. The
active samples are added to the training set and used to update the initial
policy. The proposed approach is demonstrated on 4 different tasks
in a 3D simulated environment. The experiments show that an agent can
effectively perform imitation learning from raw visual data for navigation
tasks, and that active learning can significantly improve the initial
policy using a small number of samples. The simulated testbed facilitates
reproduction of these results and comparison with other approaches.

1 Introduction
One of the important aspects of artificial intelligence is the ability of autonomous
agents to behave effectively and realistically in a given task. There is a rising
demand for applications in which agents can act and make decisions similar to
human behavior in order to achieve a goal. Imitation learning is a paradigm in
which an agent learns how to behave by observing demonstrations of correct

behavior provided by a teacher. In contrast to explicit programming, learning
from demonstrations does not require knowledge of the task to be integrated in
the learning process. It favors a generic learning process where the task is learned
completely from observing the demonstrations. Thus, an intelligent agent can be
trained to perform a new task simply by providing examples. Since an agent is
able to learn complex tasks by mimicking a teacher’s behavior, imitation learning
is relevant to many robotic applications [2,4,6,11,13,24,29,36] and is considered
an integral part in the future of intelligent robots [31].
One of the biggest challenges in imitation learning is finding adequate repre-
sentations for the state of the agent in its environment. The agent should be able
to extract meaningful information from sensing of its surroundings, and utilize
this information to perform actions in real time. Deep learning methods have
recently been applied in a wide array of applications and are especially successful
in handling raw data. One of the most popular deep learning techniques is Con-
volutional Neural Networks (CNNs). CNNs are particularly popular in vision
applications due to their ability to extract features from high dimensional visual
data. The ability of deep networks to automatically discover patterns provides
a generic alternative to engineered features which have to be designed for each
specific task. For instance traditional planning approaches that use computer-
vision methods of object recognition and localization need to tailor the methods
for every individual target and task. CNNs achieve results competitive with the
state of the art in many image classification tasks [8,17] and have been recently
used to learn Atari 2600 games from raw visual input [20,21]. These and other
recent attempts have shown that deep learning can be successful in teaching an
agent to perform a task from visual data. However, most studies focus on 2D
environments with stationary views, which does not reflect real-world applica-
tions. Moreover, direct imitation is performed without considering refining the
policy based on the agent’s performance. To the best of our knowledge, training
an agent from raw visual input using deep networks and active learning in a 3D
environment has not been done.
In this paper we present a novel method that utilizes deep learning and active
learning to train agents in a 3D setting. The method is demonstrated on sev-
eral navigation tasks in a 3D simulated environment. Navigation is one of the
most explored domains in imitation learning due to its relevance to many robotic
applications, such as flying [1,23,29] and ground vehicles [7,26,27,32]. Naviga-
tion is also an essential base task in high degree of freedom robots (e.g. humanoid
robots) [7,30]. We propose a generic method for learning navigation tasks from
demonstrations that does not require any prior knowledge of the task’s goals,
environment or possible actions. A training set is gathered by having a teacher
control the agent to successfully perform the task. The controlled agent’s view of
the 3D environment is captured along with the actions performed in each frame.
A deep convolution network is used to learn visual representation from the cap-
tured video footage and learn a policy to mimic the teacher’s behavior. We also
employ active learning to improve the agent’s policy by emphasizing situations in
which it is not confident. We show that active learning can significantly improve
the policy with a limited number of queried instances. Once trained, the agent is
able to extract features from the scene and predict actions in real time. We con-
duct our experiments on a benchmark testbed that makes it seamless to replicate
our results and compare with other approaches.
Benchmark environments are useful tools for evaluating intelligent agents.
A few benchmarks are available for 2D tasks such as [3,15,25] and are being
increasingly employed in the literature. 3D environments however have not been
as widely explored, although they provide a closer simulation to real robotic
applications. We use mash-simulator [19] as our testbed to facilitate the evalua-
tion and comparison of learning methods. It is also convenient for extending the
experiments to different navigation tasks within the same framework.
In the next section we review related work. Section 3 describes the proposed
methods. Section 4 details our experiments and results. Finally we present our
conclusions and discuss future steps in Sect. 5.

2 Related Work
2.1 Navigation
Navigation tasks have been of interest in AI in general and imitation learning
specifically from an early stage. Sammut et al. [29] provides an early exam-
ple of an aircraft learning autonomous flight from demonstrations provided via
remote control. Later research tackles more elaborate navigation problems includ-
ing obstacles and objects of interest. Chernova et al. [7] use Gaussian mixture
models to teach a robot to navigate through a maze. The robot is fitted with
an IR sensor to provide information about the proximity of obstacles. This data
coupled with input from a teacher controlling the robot is used to learn a policy.
The robot is then able to make a decision to execute one of 4 motion primitives
(unit actions) based on its sensory readings. In [10] the robot uses a laser
sensor to detect and recognize objects of interest. A policy is learned to predict
subgoals associated with the detected objects rather than directly predicting
the motion primitives. Such sensing methods provide an abstract view of the
environment, but cannot convey visual details that might be needed for intelligent
agents to mimic human behavior. In [22] neural networks are used to learn a policy
for driving a car in a racing game using features extracted from the game engine
(such as the position of the car relative to the track). Driving is a complex task
compared to other navigation problems due to the complexity of the possible
actions. The outputs of the neural network in [22] are high-DOF low-level actions.
However, the features extracted from the game engine to train the policy would
be difficult to extract in the real world. Advances in computational resources have
prompted the use of visual data over simpler sensory data. Visual sensors provide
detailed information about the agent’s surroundings and are suitable for use in
real-world applications. In [28] a policy for a racing game is learned from visual
data. Demonstrations are provided by capturing the game’s video stream and the
controller input. The raw (downsampled) frames, without extracting engineered
features, are used as input to train a neural network.

2.2 Deep Learning

Deep learning methods are highly effective in problems that don’t have estab-
lished sets of engineered features. CNNs have been used with great success to
extract features from images. In recent studies [20,21] CNNs are coupled with
reinforcement learning to learn several Atari games. A sequence of raw frames is
used as input to the network and trial and error is used to learn a policy. Trial
and error methods such as reinforcement learning have been extensively used
to learn policies for intelligent agents [16]. However, providing demonstrations
of correct behavior can greatly expedite the learning rate. Moreover, learning
through trial and error can lead the agent to learn a way of performing the
task that doesn’t seem natural or intuitive to a human observer. In [12] learn-
ing from demonstrations is applied on the same Atari benchmark. A supervised
network is used to train a policy using samples from a high performing but non
real time agent. This approach is reported to outperform agents that learn from
scratch through reinforcement learning. Other examples of using deep learning
to play games include learning the game of ‘GO’ using supervised convolution
networks [9] and a combination of supervised and reinforcement learning [33].
These examples all focus on learning 2D games that have a fixed view. However
in real applications, visual sensors would capture 3D scenes, and the sensors
would most likely be mounted on the agent which means it is unrealistic to have
a fixed view of the entire scene at all times.
In [18] a robot is trained to perform a number of object manipulation tasks.
First a trajectory is learned using reinforcement learning with the positions of
the objects and targets known to the robot. These trajectories then serve as
demonstrations to train a supervised convolutional neural network. In this case no
demonstrations need to be provided by a teacher. However, this approach
requires expert knowledge for the initial setup of the reinforcement learning
phase. Compared to related work that employs deep learning to teach an
intelligent agent, this is a realistic application implemented on a physical robot.
However, the features are extracted from a set scene with small variations. This
is different from applications where the agent moves and turns around, completely
altering its view in the process.

2.3 Active Learning

In many imitation learning applications direct imitation is not sufficient for
robust behavior. One of the common challenges facing direct imitation is that the
training set doesn’t fully represent the desired task. The collected demonstrations
only include optimal actions performed by the teacher. If the agent makes an
error it arrives at a state that was not represented in its learned policy [35].
It is therefore necessary in many cases to provide further training to an agent
based on its own performance of the task. One of the methods to enhance a
trained agent is active learning. Active learning relies on querying a teacher
for the correct decision in cases where the trained model performs poorly. The
teacher’s answers are used to improve the model in its weakest areas. In [7]
active learning is used to teach a robot navigation tasks. The agent estimates a
confidence measure for its prediction and queries a teacher for the correct action
when the confidence is low. Erroneous behavior may also be identified by the
teacher. In [5] the robot is allowed to perform the task while a human teacher
physically adjusts its actions, which in turn provides corrected demonstrations.
Some imitation learning tasks involve actions that are performed continuously
over a period of time (i.e. an action is comprised of a series of motions performed
in sequence). In such cases a correction can be provided by the teacher at any
point in the action trajectory [14,28]. This way the agent is able to adapt to
errors in the trajectory.

3 Proposed Method
In this section we detail our proposed method for learning navigation tasks from
demonstrations. The source code for this work can be accessed at:
https://github.com/ahmedsalaheldin/ImitationMASH.git

3.1 Collecting Demonstrations


In imitation learning it is assumed that a human teacher is following an unknown
optimal policy. It is therefore possible to use an optimal policy, if one exists, to
collect demonstrations. To collect a training set we use a deterministic automated
teacher that has access to information hidden from a human or intelligent playing
agent, such as the positions of targets and obstacles in the 3D space. Each training
instance consists of a raw 120 × 90 image of the rendered 3D scene and the
action performed by the teacher. We use only the current frame (not a sequence
of previous frames) in an instance because the navigation tasks investigated
here adhere to the Markov property: the current state is sufficient to make a
decision, and previous actions and states need not be included in the
representation of the current state. Training an imitation learning policy is thus
reduced to a supervised image classification problem, where the current view of
the agent is the image and the action chosen by the teacher is the label.
Subsequently the trained agent will be able to predict a decision (as it would be
taken by the teacher) given its current view. More formally, the agent learns a
policy π from a set of demonstrations D = {(x_i, y_i)} such that u = π(x, α),
where x_i is a 120 × 90 image, y_i is the action performed by the teacher at
frame i, u is the action predicted by policy π for input x, and α is the set of
policy parameters that are adjusted through learning.
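To make the supervised framing concrete, here is a minimal sketch of the data collection loop. The helper names (round_running, capture_frame, teacher_action, perform) are hypothetical placeholders, not the paper’s API; the action set is the one used in the experiments of Sect. 4, where ‘Go back’ never occurs in the demonstrations.

    # Hypothetical sketch of demonstration collection: each instance pairs
    # the current 120 x 90 frame with the automated teacher's action.
    ACTIONS = ["Go forward", "Turn left", "Turn right"]  # 'Go back' never demonstrated

    demonstrations = []                 # the set D = {(x_i, y_i)}
    while round_running():              # hypothetical: round not yet finished
        x = capture_frame()             # hypothetical: grab the rendered 120 x 90 view
        y = teacher_action()            # hypothetical: the teacher's (optimal) choice
        perform(y)                      # execute the teacher's action in the simulator
        demonstrations.append((x, ACTIONS.index(y)))  # store image + class label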

3.2 Deep Learning


To learn the policy we employ a deep convolutional neural network. The proposed
network uses several convolution layers to automatically extract features from
the raw visual footage. Then a fully connected layer is used to map the learned
features to actions. Each convolution layer is followed by a pooling layer that
down-samples the output of the convolution layer. The convolution layers take
advantage of spatial connections between visual features to reduce connections
in the network. The pooling layers reduce the dimensionality to further alleviate
the computations needed. Our network follows the pattern in [21]. It consists of
3 convolution layers each followed by a pooling layer. The input to the first layer
is a frame of 120 × 90 pixels. We apply a luminance map to the colored images
to obtain one value for each pixel instead of 3 channels, resulting in a feature
vector of size 10,800. Figure 1 shows the architecture of the network. The filter
sizes for the three layers are 7×9, 5×5 and 4×5 respectively; and the number of
filters are 20, 50 and 70 respectively. The pooling layers all use maxpool of shape
(2,2). Following the last convolution layer is a fully connected hidden layer with
rectifier activation function and fully connected output layer with three output
nodes representing the 3 possible actions. Table 1 summarizes the architecture
of the network.

Fig. 1. Architecture of the neural network used to train the agent

Table 1. Neural network architecture

Layer        Configuration
Input        120 × 90 luminance image
Conv1        7 × 9 filters, 20 feature maps
Conv2        5 × 5 filters, 50 feature maps
Conv3        4 × 5 filters, 70 feature maps
FC           500 units
Output (FC)  3 units
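As a sanity check on the architecture, the following short sketch (not the authors’ code) computes the activation-volume shapes implied by Table 1, assuming ‘valid’ convolutions, non-overlapping 2 × 2 max pooling, and that the quoted filter sizes are width × height relative to the 120 × 90 frame:

    def conv_out(w, h, fw, fh):
        # 'valid' convolution shrinks each spatial dimension by (filter - 1)
        return w - fw + 1, h - fh + 1

    def pool_out(w, h):
        # non-overlapping 2 x 2 max pooling halves each dimension (floor)
        return w // 2, h // 2

    w, h = 120, 90  # single-channel luminance input (10,800 values)
    layers = [(7, 9, 20), (5, 5, 50), (4, 5, 70)]  # (filter w, filter h, maps)
    for i, (fw, fh, n) in enumerate(layers, start=1):
        w, h = conv_out(w, h, fw, fh)
        print(f"Conv{i}: {w} x {h} x {n}")
        w, h = pool_out(w, h)
        print(f"Pool{i}: {w} x {h} x {n}")

Under these assumptions the final pooled volume is 11 × 7 × 70 (5,390 values), which is flattened to feed the 500-unit fully connected layer.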

3.3 Active Learning

Active learning is employed to improve the initial policy learned from demon-
strations. This is achieved by acquiring a new data set to train the agent that
emphasizes the weaknesses of the initial policy. The agent is allowed to perform
the task for a number of rounds. For each prediction the network’s confidence
is calculated, and if the confidence is low the optimal policy is queried for the
correct action. The action provided by the teacher is performed by the agent
and is recorded along with the frame image. The confidence is measured via the
entropy of the output of the final layer in the network; high entropy indicates
low confidence. The entropy H(X) is calculated as:

    H(X) = − Σ_i P(x_i) log2 P(x_i)    (1)

where X is the prediction of the network and P(x_i) is the probability
produced by the network for action i.
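For illustration, a direct transcription of Eq. (1) in Python (a sketch, not the authors’ code):

    import math

    def prediction_entropy(probs):
        # Entropy (Eq. 1) of the softmax output over the 3 possible actions;
        # higher entropy means a less confident prediction.
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(prediction_entropy([0.34, 0.33, 0.33]))  # near-uniform: ~1.58 bits (low confidence)
    print(prediction_entropy([0.96, 0.02, 0.02]))  # peaked: ~0.28 bits (high confidence)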
The active samples are added to the training set and used to update the
initial policy. We find that updating a trained network using only the active
samples results in forgetting the initial policy in favor of an inadequate one
rather than complementing it. Therefore the training set is augmented with the
active samples collected from the playing agent. The augmented dataset is used
to update the network that was previously trained. We find that it is easier and
faster for the network to converge if it is pre-trained with the initial dataset
than training from scratch. Algorithm 1 shows the steps followed to perform
active learning.
Low confidence predictions are mainly caused by situations that were not
covered by the training data. Therefore, for active learning to be effective, it
is important that it is performed in the simulation rather than on a collected
dataset: by performing its current policy in the simulation, the agent arrives at
unfamiliar situations where it is not confident in its behavior, and these are
precisely the situations active learning exploits.

Algorithm 1. Active Learning Algorithm

1: Given: a policy π trained on a data set D = {(x_i, y_i)},
   and a confidence threshold β
2: while active learning do
3:    x = current frame
4:    u = π(x, α)
5:    H(X) = − Σ_i P(u_i) log2 P(u_i)
6:    if H(X) > β then
7:        y = Query(x)
8:        perform action y
9:        add (x, y) to D
10:   else
11:       perform arg max(u)
12: Update π using D
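A sketch of the loop in Algorithm 1 is given below. The helpers (round_running, current_frame, query_teacher, perform, retrain) and the threshold value are hypothetical, and prediction_entropy is the function sketched above:

    BETA = 0.5  # hypothetical entropy threshold; its actual value is not stated here

    def active_learning(policy, dataset, beta=BETA):
        while round_running():                   # hypothetical: task still in progress
            x = current_frame()                  # hypothetical: agent's current view
            u = policy.predict_proba(x)          # hypothetical: softmax over actions
            if prediction_entropy(u) > beta:     # low confidence: ask the teacher
                y = query_teacher(x)             # hypothetical: optimal policy's action
                perform(y)
                dataset.append((x, y))           # keep the active sample
            else:
                perform(max(range(len(u)), key=u.__getitem__))  # arg max action
        # update the pre-trained network on the ORIGINAL demonstrations augmented
        # with the active samples; training on the active samples alone makes the
        # network forget the initial policy
        retrain(policy, dataset)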

4 Experiments
We conduct our experiments in the framework of mash-simulator [19]. Mash-
simulator is a tool for benchmarking computer vision techniques for navigation
tasks. The simulator includes a number of different tasks and environments,
as well as optimal policies for a number of tasks. All the navigation is viewed
from the first-person perspective. The player has 4 possible actions: ‘Go forward’,
‘Turn left’, ‘Turn right’ and ‘Go back’. Although there are 4 possible actions, the
action ‘Go back’ was never used in the demonstrations by the optimal policy.
Therefore the network is only presented with 3 classes in the training set and
thus has 3 output nodes.

4.1 Tasks
The experiments are conducted on the following 4 navigation tasks:

Reach the Flag. This task is set in a single rectangular room with a flag placed
randomly in the room. The goal is to reach the flag. The task fails if the flag is
not reached within a time limit.

Fig. 2. Sample images from “Reach the flag”

Follow the Line. This task is set in a room with directed lines drawn on the
floor. The lines show the direction to follow in order to reach the flag. The target
is to follow the line to the flag, and the agent fails if it deviates from the line on
the floor.

Fig. 3. Sample images from “Follow the line”

Reach the Correct Object. In this task two objects are placed on pedestals
in random positions in the room. The objective is to reach the pedestal with the
trophy on it. The task fails if a time limit is reached or if the player reaches the
wrong object. The wrong object has the same material as the trophy and can
take different shapes.
Fig. 4. Sample images from “Reach the correct object”

Eat All Disks. This task is set in a large room containing several black disks
on the floor. The target is to keep reaching the disks. A disk is ‘eaten’ and
disappears once the agent reaches it, and a new disk appears when one is eaten.
The goal of this task is to eat as many disks as possible within a time limit.

Fig. 5. Sample images from “Eat all disks”

Figures 2, 3, 4 and 5 show sample images of the 4 tasks in the 120 × 90 size
used in the experiments.

4.2 Setup
To evaluate the proposed methods, the performance of the agent is measured
over 1,000 rounds. A round starts when the task is initialized and ends when
the agent reaches the target or a time limit is reached. The number of frames
in a round might vary depending on how fast the agent can reach the target.
For all tasks, in each round the environment is randomized including room size
and shape, lighting and the location of the target and the agent. A time limit
is set for each round and the round fails if the limit is reached before the agent
reaches the target. The time limit is measured in frames to avoid any issues with
different frame rates. The time limit is set as the maximum time needed for the
optimal policy to finish the task; which is 500 frames for “Reach the flag” and
“Reach the correct object” and 5000 frames for “Follow the line”. In “Eat all
disks” the task is continuous, so a time limit was set to match the total number
of frames in the other tasks.

4.3 Implementation Details


Inter-process communication is used to communicate data across the different
components of the testbed. The agent acts as a client and communicates with
the simulator via a TCP connection as follows: the agent requests a task from
the server, the server initiates a round and sends an image to the client, the
client sends an action to the server, and the server advances the simulation and
responds with a new image. Figure 6 shows a flowchart of the data collection
process.
The network used for prediction is also decoupled from the agent. The network
acts as a predicting server to which an agent sends the frames it receives
from the simulator and in return receives a decision from the network. The
entire process of communication with both servers occurs in real time. This
implementation facilitates experimentation, as making changes to the network
doesn’t affect the client or the simulator server. Moreover, it is easier to extend
this system to physical robots: a predicting server can be located on the robot or
on another machine if the robot’s computational capabilities are not sufficient,
and a predicting server can also serve multiple agents simultaneously. The agent
client is implemented in C++ to facilitate interfacing with the mash-simulator.
The predicting server and the training process are implemented in Python using
the Theano deep learning library [34]. Figure 7 shows a flowchart of the agent
performing a task.

Fig. 6. Dataset collection flowchart

Fig. 7. Imitation agent playing flowchart
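To illustrate the decoupled client/server design, the following sketch shows an agent client loop over TCP. The wire format (a 4-byte length prefix), the command string and the port numbers are assumptions for illustration; the actual mash-simulator protocol is not reproduced here.

    import socket
    import struct

    def send_msg(sock, payload):
        # hypothetical framing: 4-byte big-endian length prefix, then the payload
        sock.sendall(struct.pack("!I", len(payload)) + payload)

    def recv_msg(sock):
        # read the 4-byte length (assumed to arrive together), then the full body
        (n,) = struct.unpack("!I", sock.recv(4))
        data = b""
        while len(data) < n:
            data += sock.recv(n - len(data))
        return data

    simulator = socket.create_connection(("localhost", 11200))  # port assumed
    predictor = socket.create_connection(("localhost", 11300))  # port assumed

    send_msg(simulator, b"REQUEST_TASK")    # hypothetical command string
    while True:
        frame = recv_msg(simulator)         # rendered 120 x 90 frame from the server
        send_msg(predictor, frame)          # forward the frame to the predicting server
        action = recv_msg(predictor)        # network's decision for this frame
        send_msg(simulator, action)         # act; the server simulates the next step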

4.4 Results

In this section we present the results of the proposed method. The same network
and parameters are used to learn all tasks. For each task 20,000 images are used
for training. Testing is conducted by allowing an agent to attempt the tasks
in the mash-simulator and recording the number of successful attempts. An
agent’s performance for the first 3 tasks is evaluated as the percentage of times
it reaches the target in 1,000 rounds. For “Eat all disks”, the performance is
measured as the number of disks eaten in 1,000 rounds. We also report the
classification error on an unseen test set of 20,000 images collected from the
teacher’s demonstrations.
Table 2 shows the results for the first 3 tasks. The success measure is the
percentage of rounds (out of 1,000) in which the agent reached the target, while
error is the classification error on the test set collected from the teacher’s
demonstrations. The agent performs well on “Reach the flag” and is significantly
less successful in the other two tasks. “Follow the line” is considerably less fault
tolerant than “Reach the flag”, as a small error can result in the agent deviating
from the line and subsequently failing the round, whereas in “Reach the flag” the
agent can continue to search for the target after a wrong prediction. In “Reach
the correct object” the agent is not able to effectively distinguish between the
two objects. This could be attributed to insufficient visual details in the training
set, as the teacher avoids the wrong object from a distance. Qualitative analysis
of “Reach the flag” shows that the agent aims towards corners, as they resemble
the erect flag from a distance. Upon approaching the corner, as the details of the
image become clearer, the agent stops recognizing it as the target and continues
its search. While this did not pose a big problem for the agent’s ability to execute
the task, it is interesting to examine the ability of CNNs to distinguish small
details in such environments. It is also worth noting that the teacher’s policy for
“Reach the correct object” does not avoid the wrong object if it is in the way of
the target, and achieves an 80.2 % success rate.

Table 2. Direct imitation results

Task      Reach the flag   Reach object   Follow the line
Success   96.20 %          53.10 %        40.70 %
Error     2.48 %           4.06 %         0.86 %

Table 3 shows results for the 4th task “Eat all disks”. The table shows the
score of the agent compared to the score achieved using the optimal policy. The
agent is shown to achieve 97.9 % of the score performed by the optimal policy.
To improve the agent’s ability to adapt to wrong predictions and unseen sit-
uations, active learning is used to train the agent on “Follow the line”. In the
other tasks where the agent searches for the target, the optimal policy remem-
bers the location of the target even if it goes out of view due to agent error.

Table 3. “Eat all disks” results

        Agent    Optimal policy
Score   1051     1073
Error   1.70 %   –

Therefore, in those tasks, the active-learning samples would include information
that is not represented in the visual data available to the agent and would thus
degrade the performance. This can be rectified by devising a teaching policy
that does not use historical information, or by incorporating past experience
into the learned model.
Figure 8 shows the results of active learning on the “Follow the line” task.
Active learning is demonstrated to significantly improve the performance of the
agent using a relatively small number of samples. Comparing the classification
error with success rate emphasizes the point that the errors come from situations
that are not represented in the teacher’s demonstrations.

Fig. 8. Results for active learning on “follow the line” task

The task in which the time limit affected the performance was “Reach the
flag”, as the agent continues to follow its policy in search of the flag even after
making wrong predictions. The effect of the time limit is evaluated in Fig. 9,
which presents the success rate of the “Reach the flag” task with different time
limits. The horizontal axis represents the time limit as a percentage of the
maximum time needed by the teacher. The graph shows that the longer the agent
is allowed to look for the target, the higher the success rate.
Overall the results show good performance on 3 out of the 4 tasks. They
demonstrate the effectiveness of active learning to significantly improve a weak
policy with a limited number of samples. Even without active learning the agent
can learn a robust policy for simple navigation tasks.

Fig. 9. Results for “reach the flag” task with increasing time limits

5 Conclusion and Future Directions


In this paper, we propose a framework for learning autonomous policies for nav-
igation tasks from demonstrations. A generic learning process is employed to
learn from raw visual data without integrating any knowledge of the task. The
experiments are conducted on a testbed that facilitates reproduction, compari-
son and extension of this work. The results show that CNNs can learn meaningful
features from raw images of 3D environments and learn a policy from demon-
strations. They also show that active learning can significantly improve a learned
policy with a limited number of samples.
Our next step is to investigate the proposed approach in more visually cluttered
environments to further evaluate the ability of convolution networks to create
adequate representations from (relatively) low-resolution 3D scenes, as well as
to extend the active learning experiments to more tasks. We also aim to
integrate reinforcement learning with learning from demonstrations to improve
the learned policies through trial and error. This allows the agent to generalize
its policy to unseen situations and adapt to changes in the task without needing
to query the teacher.

References
1. Abbeel, P., Coates, A., Quigley, M., Ng, A.Y.: An application of reinforcement
learning to aerobatic helicopter flight. Adv. Neural Inf. Process. Syst. 19, 1 (2007)
2. Argall, B.D., Chernova, S., Veloso, M., Browning, B.: A survey of robot learning
from demonstration. Robot. Auton. Syst. 57(5), 469–483 (2009)
3. Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning
environment: an evaluation platform for general agents. arXiv preprint
arXiv:1207.4708 (2012)
4. Bemelmans, R., Gelderblom, G.J., Jonker, P., De Witte, L.: Socially assistive
robots in elderly care: a systematic review into effects and effectiveness. J. Am.
Med. Direct. Assoc. 13(2), 114–120 (2012)