Neural Networks and Artificial Intelligence
Communications in Computer and Information Science 440
Editorial Board
Simone Diniz Junqueira Barbosa
Pontifical Catholic University of Rio de Janeiro (PUC-Rio),
Rio de Janeiro, Brazil
Phoebe Chen
La Trobe University, Melbourne, Australia
Alfredo Cuzzocrea
ICAR-CNR and University of Calabria, Italy
Xiaoyong Du
Renmin University of China, Beijing, China
Joaquim Filipe
Polytechnic Institute of Setúbal, Portugal
Orhun Kara
TÜBİTAK BİLGEM and Middle East Technical University, Turkey
Igor Kotenko
St. Petersburg Institute for Informatics and Automation
of the Russian Academy of Sciences, Russia
Krishna M. Sivalingam
Indian Institute of Technology Madras, India
Dominik Ślęzak
University of Warsaw and Infobright, Poland
Takashi Washio
Osaka University, Japan
Xiaokang Yang
Shanghai Jiao Tong University, China
Vladimir Golovko, Akira Imada (Eds.)
Volume Editors
Vladimir Golovko
Brest State Technical University
Moskowskaja 267
224017 Brest, Belarus
E-mail: [email protected]
Akira Imada
Brest State Technical University
Moskowskaja 267
224017 Brest, Belarus
E-mail: [email protected]
Executive Committee
Honorary Chair
Petr Poyta Brest State Technical University
Conference Chair
Vladimir Golovko Brest State Technical University
Akira Imada Brest State Technical University
Conference Co-chairs
Vladimir Rubanov Brest State Technical University
Rauf Sadykhov Belarusian State University of Informatics and Radioelectronics
Advisory Board
Vladimir Golenkov Belarusian State University of Informatics and Radioelectronics
Valery Raketsky Brest State Technical University
Vitaly Sevelenkov Brest State Technical University
Program Committee
Dmitry Bagayev Kovrov State Technological Academy, Russia
Irina Bausova University of Latvia, Latvia
Alexander Doudkin National Academy of Sciences of Belarus,
Belarus
Nistor Grozavu Paris 13 University, France
Marifi Guler Eastern Mediterranean University, Turkey
Stanislaw Jankowski Warsaw University of Technology, Poland
Viktor Krasnoproshin Belarusian State University, Belarus
Bora I. Kumova Izmir Institute of Technology, Turkey
Kurosh Madani University Paris-Est Creteil, France
Sponsoring Institutions
ERICPOL Brest (Dzerzhynskogo 52, 224030 Brest Belarus)
Table of Contents
Optimization
Multi Objective Optimization of Trajectory Planning of Non-holonomic Mobile Robot in Dynamic Environment Using Enhanced GA by Fuzzy Motion Control and A* . . . 34
Bashra Kadhim Oleiwi, Rami Al-Jarrah, Hubert Roth, and Bahaa I. Kazem
Classification
New Procedures of Pattern Classification for Vibration-Based Diagnostics via Neural Network . . . 63
Nicholas Nechval, Konstantin Nechval, and Irina Bausova
Fuzzy Approach
Quality Evaluation of E-commerce Sites Based on Adaptive Neural Fuzzy Inference System . . . 87
Huan Liu and Viktor V. Krasnoproshin
Machine Intelligence
At Odds with Curious Cats, Curious Robots Acquire Human-Like Intelligence . . . 112
Dominik M. Ramík, Kurosh Madani, and Christophe Sabourin
Analytical Approach
A Learning Technique for Deep Belief Neural Networks . . . 136
Vladimir Golovko, Aliaksandr Kroshchanka, Uladzimir Rubanau, and Stanislaw Jankowski
Mobile Robot
A Multi-agent Efficient Control System for a Production Mobile Robot . . . 171
Uladzimir Dziomin, Anton Kabysh, Vladimir Golovko, and Ralf Stetter
Koji Nakamura
Kanazawa University
Kakuma Kanazawa Japan
[email protected]
On average, about 30% of the earth's surface is covered with forests. The types of forests vary depending on the ecological conditions of their habitats, such as temperature, amount of rainfall, topography, and soil. Forests have multiple functions beneficial to human society, which are called ecosystem services (Millennium Ecosystem Assessment, carried out by the UN in 2002-2005). Ecosystem services include provisioning (wood, charcoal, game animals, medicinal plants, mushrooms and other foods), regulating (climate regulation, water purification, CO2 absorption, prevention of soil erosion, and flood regulation), cultural (traditional festivals and recreation), and supporting (nutrient cycling, soil formation, and primary production) services. Biodiversity, fostered by forests, plays crucially important roles in these functions.
As the results of the Millennium Ecosystem Assessment indicated, forest degradation and fragmentation have increased very rapidly during the last 50 years. This may have contributed to substantial net gains in human well-being and economic development, but these gains have been achieved at growing cost in the form of the degradation of many ecosystem services and increased risks of nonlinear changes.
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 1–4, 2014.
© Springer International Publishing Switzerland 2014
to the effects of the monsoon climate. All of these factors combined have created diverse habitats and environments for the growth of plants and animals. At present, more than 90,000 species have been confirmed as existing in Japan. Compared to other developed countries, Japan has an extremely high proportion of endemic species. In addition, there are extensive areas where unique habitats have been created by humans.
Satoyama is a Japanese term applied to the border zone or area between mountain foothills and arable flat land. Literally, sato means arable and livable land or homeland, and yama means hill or mountain. Satoyama areas have been developed through centuries of small-scale agricultural and forestry use.1 In short, Satoyama refers to rural landscapes formed by sustainable agriculture and forestry. Satoyama areas have also contributed to the country's rich biodiversity (see 'The National Biodiversity Strategy of Japan 2012-2020,' Ministry of the Environment of Japan, 2012).
Diversified forest types are found in Japan, growing in the diversified habitats and environments mentioned above. Forests cover about 66% of Japan's land, the third largest forest coverage rate in the world, following Finland (72%, top) and Sweden (68%, second). In recent years, however, Japan has imported about 80% of the wood consumed in the country from abroad, due to cheaper prices, and forestry in Japan has been in an economic slump for a long time. (See also the next section.)
It has become clear that these SEPLS and the sustainable practices and infor-
mation they represent are increasingly threatened in many parts of the world.
Commonly recognized causes include urbanization, industrialization, and rapidly
shrinking rural populations. The SI has taken a global perspective and sought
to consolidate expertise from around the world regarding the sustainable use of
resources in SEPLS.
A global initiative relevant to the SI is the "Globally Important Agricultural Heritage Systems (GIAHS)," launched by the Food and Agriculture Organization (FAO) of the United Nations in 2002. The overall goal of the initiative is to identify and safeguard GIAHS and their associated landscapes, agricultural biodiversity, and knowledge systems by catalyzing and establishing a long-term support program, and to enhance the global, national, and local benefits derived through sustainable management and enhanced viability.
Traditional agricultural systems still provide food for some two billion people in the world today, and also sustain biodiversity, livelihoods, practical knowledge, and culture. So far, a total of 25 GIAHS areas have been designated in 11 countries, including Algeria, Chile, China, India, Japan, Mexico, Morocco, Peru, the Philippines, and so on (Berglund et al. 2014). See Fig. 1.
References
Duraiappah, A.K., Nakamura, K., Takeuchi, K., Watanabe, M., Nishi, M. (eds.):
Satoyama-Satoumi Ecosystems and Human Well-being: Socio-ecological Production
Landscapes of Japan. United Nations University Press (2012)
Berglund, B.E., Kitagawa, J., Lagerås, P., Nakamura, K., Sasaki, N., Yasuda, Y.: Tradi-
tional Farming Landscapes for Sustainable Living in Scandinavia and Japan: Global
Revival Through the Satoyama Initiative. AMBIO (2014), doi:10.1007/s13280-014-
0499-6
Ecological and Economical Aspects of Felling
with Use of Multi-operational Felling Machines
and Machinery
1 Introduction
M.V. Levkovskaya and V.V. Sarnatsky
2 Analysis
Analysis of this year's mechanical felling showed that the lowest rate of damage to pine trees (4.5%) was observed after felling in winter. In the spring-summer period (April-May), damage to roots and trunks was 1.2-2 times more intense than in winter. In spring and summer, when the strength of the bark is minimal, there is a risk of bark stripping. A low percentage of damaged trees was noted when cleaning cutting was made by the linear block technology. On trees adjacent to the skidder track at the time of lumbering, about half of the damaged areas were smaller than 100 cm2. Most of the damage on the sample areas of assortment logging was bark stripping at heights below 2.5 meters. In all cases, most of the damaged trees were concentrated at the border of the technological corridors. Damage to tree trunks occurs during felling by harvester and loading of assortments. If felling is done in winter, frozen ground and snow protect the roots and butts of trees from damage.
It was determined that the rate of damage to Scots pine (Pinus sylvestris L.) after felling averages 5.9%, to Norway spruce (Picea abies Karst.) 2.3%, and to silver birch (Betula pendula Roth) 0.7%, which meets the technical requirements for preservation of the trunk; moreover, such damage does not lead to cessation of growth or desiccation of the tree.
The acidity (pH) of the upper soil layers in the blocks of the researched sample areas varies between 4.64 and 5.13, and in the skidder tracks from 4.74 to 5.33. In the felling area, soil acidity is reduced by 0.1-0.5 points and depends on the kinds of plants growing there.
For a comparative analysis of the influence of machinery on the density, hardness, and moisture of the soil, samples of undisturbed soil were taken from the upper horizons (50 cm) of the sample areas in the zones of the technological corridors and blocks. Soil density and moisture were determined under laboratory conditions.
It was found that tree transport additionally compacts the soil in the skidder tracks by 6% on average, with maximal soil compaction reaching 20%. In some cases, tree transport compacts the forest soil 1.1-1.2 times. If lumbering is done in the spring-summer period in pine forests, the maximal soil density in the run reaches 1.47 g/cm3, exceeding the limit 1.2-1.4 times. In summer, soil density in the skidder tracks is 2-33% higher. Over time, the difference in soil density between the skidder tracks and the blocks decreases: while immediately after felling it reaches 19%, in places where felling was done in winter in 2004 and 2005 it varies within 2-7%.
Relatively more favourable conditions are found in lumbering areas aged 8-9 years. Soil density is reduced there, which shows the reversibility of the soil compaction process. Changes in the hardness of the humus horizon were determined: under the influence of tree transport, soil hardness increases by up to 10-17 kg/cm2 (2-4 times). Soil hardness in the skidder tracks exceeds the values for the blocks by 37-73%, that is, it is 1.6-3.3 times higher. Soil moisture in the skidder tracks is lower than in the blocks but higher than in the test woodland. The increase in rainfall reaching the soil level in the skidder tracks is connected with the removal of the leaf canopy. Soil compaction in the skidder tracks leads to a reduction of its porosity and water infiltration, changes in the moisture regime, and difficulties in water penetration. In some
3 Estimation
It was identified that placing felling tailings on the skidder tracks in the process of lumbering significantly reduces the negative influence of tree transport on the soil and decreases the frost line. The severity of the impact of logging machinery on the ground vegetation cover depends on the season of felling, the design features of the machines, and the technological processes of the cutting-area operations. Regeneration and growth of the forest range depend greatly on soil density, which also determines the temperature, moisture, and atmospheric conditions of the soil and the intensity of the physico-chemical and biological processes taking place in it. Changes in the hydrophysical conditions of the soil were observed in 10 of the 20 sample areas.

After mechanical cleaning cutting, the mass of bioactive roots in the soil of the skidder tracks decreases, which is connected with the degradation of the hydrophysical conditions of that soil. A dependence between the levels of root occupation in the technological corridor and in the block can be traced.

The difference in the level of root occupation between the technological corridor and the block is estimated by a formula:
4 Summary
During thinning and accretion cutting, the rate of damage to trees varies from 2% to 12.3%. In all cases, most of the damaged trees are concentrated at the border of the technological corridors. The hydrophysical conditions of the soil in the skidder tracks and blocks are subject to considerable changes depending on the technology and age of felling, the initial differences in the physical characteristics of the soil, and the season in which the clearing cutting was made.
Felling in winter, when the soil is frozen and there is snow coverage, has a positive impact on the state of the soil. In connection with soil compaction after mechanical clearing cutting, the mass of small pine roots in the soil of the skidder tracks is reduced.

Degradation of the hydrophysical conditions of the soil increases the time needed for regeneration of the root mass. It makes sense to continue research on these objects to determine the duration of the influence of lumbering machinery on different components of forests.
To increase the economic efficiency of wood harvesting and the use of lumbering machinery, it is necessary to organize so-called concentrated felling, held within the limits of a small forest area or 1-3 blocks and including clearing cutting, sanitary felling, final felling, and other kinds of felling. An important technological aspect is the reduction of non-productive losses in the work of machines and machinery by reducing unloaded runs and the distances of dragging, skidding of converted wood, and its transportation to the storage area.
(The original draft was written in Russian. Aleksandr Brich (Brest State Technical University) translated the draft into English.)
A Literature Review: Forest Management
with Neural Network and Artificial Intelligence
Akira Imada
1 Introduction
Peng (1999) wrote, ”Data concerning forest environment are sometimes obscure
and unpredictable, artificial neural network, which is good at processing such
Every year we hear a great deal of news about wildfires somewhere around the globe, such as in the US, Turkey, Greece, Spain, Lebanon, and so on. Sometimes a wildfire kills people, or even firefighters. An article in The New York Times of 30 June 2013 reads:

Let's see how frequently wildfires occurred in the U.S. in 2013, as an example.
Tres Lagunas fire, Thompson Ridge fire, Silver fire, Jaroso fire in New Mex-
ico; Black Forest fire, Royal Gorge fire in Colorado; Yarnell Hill fire in Arizona
(where 19 firefighters were killed as cited above); Quebec fire in Quebec; Mount
Charleston fire, Bison fire in Nevada; Idaho Little Queens fire in Idaho; Silver
fire, Beaver Creek fire, Rim fire, Morgan fire, Clover fire in California.1
Vasilakos et al. (2009) estimated the percentage influence of many factors on fire ignition risk on Lesvos Island in Greece. Let us first look at their well-organized survey on wildfire prediction in detail, which will help us make a further survey of this topic.
Vasilakos wrote, ”Wildland fire danger evaluation is an integration of weather,
topography, vegetative fuel, and socioeconomic input variables to produce numeric
indices of fire potential outputs (Andrews et al. 2003; Pyne et al. 1996).” He went
on, ”Various quantitative methods have been explored for the correlation of the
input variables in fire danger assessment; most of these methods include the input
variables’ importance as a direct or indirect output.” Traditionally, statistical
methods were widely used for fire danger calculation. Vasilakos further wrote,
”More specifically, linear and logistic regression techniques were proposed, so the
coefficients of the models reflect the influence of inputs on fire danger (Chou
1992; Chou et al. 1993; Kalabokidis et al. 2007; Vasconcelos et al. 2001).
”Artificial neural networks have been also used in fire ignition risk estimation
(Chuvieco et al. 1999; Vasconcelos et al. 2001; Vasilakos et al. 2007; Vega-Garcia
et al. 1996). ... In our previously published research (Vasilakos et al., 2007), three
different neural networks were developed and trained to calculate three interme-
diate outcomes of Fire Ignition Index, i.e., the Fire Weather Index, the Fire
Hazard Index, and the Fire Risk Index.”
1 Extracted from http://en.wikipedia.org/wiki/List_of_wildfires.
Fig. 1. One example of a multilayer perceptron from those used by Vasilakos (2009) to determine which input (air temperature, humidity, wind speed, rain; weights W_ij, W_jk) is the most influential for the output, the risk of fire occurrence
Fig. 2. A fictitious example of logistic regression (y: probability) from nine different values of x, after adjusting the two parameters a_0 and a_1 by maximum likelihood estimation
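The fit described in the caption of Fig. 2 can be sketched in a few lines. The following is a minimal illustration, assuming fictitious data; the parameter names a0 and a1 follow the caption, and the gradient-ascent details (learning rate, iteration count) are choices of this sketch, not of the cited work.

```python
import numpy as np

# Fictitious data in the spirit of Fig. 2: nine x values with binary
# outcomes (e.g. fire ignition observed or not). All numbers are made up.
x = np.array([-10.0, -7.5, -5.0, -2.5, 0.0, 2.5, 5.0, 7.5, 10.0])
y = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Maximum likelihood estimation of a0 and a1 for the model
# p(y=1|x) = sigmoid(a0 + a1*x), by gradient ascent on the log-likelihood.
a0, a1 = 0.0, 0.0
lr = 0.01
for _ in range(20000):
    p = sigmoid(a0 + a1 * x)
    a0 += lr * np.sum(y - p)        # d(log-likelihood)/d a0
    a1 += lr * np.sum((y - p) * x)  # d(log-likelihood)/d a1

print(a0, a1)  # fitted parameters of the sigmoid curve
```

Because the likelihood of the logistic model is concave in (a0, a1), plain gradient ascent is enough for a toy example of this size.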
The relative importance Q_i of input i according to Garson's equation is

Q_i = Σ_{j=1}^{H} ( |w_ij| |w_jk| / W_j ) / Σ_{i=1}^{N} Σ_{j=1}^{H} ( |w_ij| |w_jk| / W_j ),

where

W_j = Σ_{r=1}^{N} |w_rj|   (4)

for normalization.
Using logistic regression, Garson's equation, and some other methods, Vasilakos et al. (2008) estimated the percentage influence of many factors on fire ignition risk on Lesvos Island. Let us look at the result obtained with Garson's equation, among others. The degrees of importance of air temperature, wind speed, humidity, and amount of rainfall for the risk of fire occurrence were found to be 28.7%, 20.9%, 14.5%, and 35.9%, respectively. In this example the neural network was a feedforward one with four input neurons, four hidden neurons, and one output neuron, trained by backpropagation. Other factors chosen by the authors were altitude, distance to urban areas, day of the week, month of the year, etc. In this way the authors determined the influential ones out of 17 factors, with the dependent variable being binary, expressing the presence or absence of fire ignition possibility. Training and validation samples were created from the total fire history database.
A Support Vector Machine is, to put it simply, a method for classifying objects in a multi-dimensional space, hopefully by a hyperplane, or otherwise by a hypersurface. Sakr et al. (2010) proposed a forest fire risk prediction algorithm based on Support Vector Machines. The algorithm predicts the fire hazard level of the day from previous weather conditions, and was trained on data from a forest in Lebanon.
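As a concrete illustration of the separating-hyperplane idea, here is a minimal linear SVM trained by stochastic sub-gradient descent (Pegasos-style) on synthetic weather-like data. The data, the labeling rule, and all parameters are invented for this sketch; it does not reproduce the actual algorithm or dataset of Sakr et al.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
# Synthetic stand-in for previous-day weather: temperature (C),
# relative humidity (%), wind speed (km/h).
X = rng.uniform([10.0, 20.0, 0.0], [40.0, 95.0, 60.0], size=(n, 3))
# Toy labeling rule: hot and dry days are hazardous (+1), else -1.
y = np.where((X[:, 0] > 25.0) & (X[:, 1] < 60.0), 1.0, -1.0)

# Standardize features and append a bias column.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
Xb = np.hstack([Xs, np.ones((n, 1))])

# Pegasos: stochastic sub-gradient descent on the hinge loss.
w = np.zeros(4)
lam = 0.01
for t in range(1, 4001):
    i = rng.integers(n)
    eta = 1.0 / (lam * t)
    if y[i] * (Xb[i] @ w) < 1.0:  # margin violated: hinge gradient step
        w = (1.0 - eta * lam) * w + eta * y[i] * Xb[i]
    else:                         # only the regularizer contributes
        w = (1.0 - eta * lam) * w

accuracy = np.mean(np.sign(Xb @ w) == y)
print(accuracy)  # training accuracy of the linear separator
```

A single hyperplane cannot represent the AND-of-thresholds rule exactly, which is precisely the situation where the "otherwise a hypersurface" (kernel) variant of the SVM helps.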
Safi et al. (2013) used a similar approach with forest fire data from a wild area of 700 square kilometers of ancient oak forests in the Portuguese Montesinho Natural Park, with the output signal representing the total surface in hectares of the corresponding burned area.
Gil-Tena et al. (2010) modeled bird species richness in Catalonia, Spain. The
authors wrote, ”Forest characteristics that determine forest bird distribution may
also be influencing other forest living organisms since birds play a key functional
role in forest ecosystems and are often considered good biodiversity indicators
(Sekercioglu 2006).”
The authors then used a three-layer feedforward neural network, with training based on adaptive gradient learning, a variant of backpropagation. After optimizing the structure of the neural network, the bird richness values estimated by each neural network model were compared with the values observed in the real forest and evaluated using linear correlation. Forest bird species richness was obtained using presence/absence data of 53 forest bird species, as well as 11 variables concerning the forest (such as tree species diversity and past burnt area), five variables on climate (such as temperature and precipitation), and five on human pressure (such as road density and human population). The bird data were collected by volunteers for the Catalan Breeding Bird Atlas (Estrada et al. 2004). The other data were obtained from various sources, such as the Spanish Forest Map, Forest Canopy Cover, the Catalan Department of Environment and Housing, the Spanish Digital Elevation Model, the National Center of Geographical Information, etc. (see references cited therein). The data were divided into two groups: one was used for training and the other for validation.
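The evaluation step described above, comparing estimated and observed richness by linear correlation, amounts to computing a Pearson coefficient on the validation plots. The numbers below are made up purely for illustration.

```python
import numpy as np

# Made-up observed vs. model-estimated bird species richness
# for eight hypothetical validation plots.
observed = np.array([12.0, 5.0, 20.0, 8.0, 15.0, 11.0, 3.0, 17.0])
estimated = np.array([10.5, 6.2, 18.9, 9.1, 14.0, 12.3, 4.4, 16.0])

# Linear (Pearson) correlation between estimates and observations.
r = np.corrcoef(observed, estimated)[0, 1]
print(r)  # values near 1 indicate good agreement
```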
Also from this aspect, using neural networks, Peng et al. (1999) studied which forest features correlate with biodiversity, and then modeled forest bird species richness as a function of environment and forest structure. The authors wrote, "Much progress has been made in this area since the initial use of artificial neural network to model individual tree mortality in 1991 (Guan and Gertner 1991a). In the same year, Guan and Gertner (1991b) successfully developed a model, based on an artificial neural network, that predicts red pine tree survival."
To explore this topic in more detail, the Ph.D. dissertation by Fernandez (2008) is worth reading.
Fig. 3. An example of forest cover maps (taken from the web page by NationalAtlas.gov)
types in the four wilderness areas of the Roosevelt National Forest in northern Colorado: Rawah, Comanche Peak, Neota, and Cache la Poudre. A feedforward neural network model was used. After searching for an optimal architecture by trial-and-error experiments, the network was made up of 54 input neurons, 120 hidden neurons, and 7 output neurons, trained by backpropagation. The authors then compared the results with those of a traditional statistical model based on Gaussian discriminant analysis, and found the prediction by the neural network to be more accurate.
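The 54-120-7 architecture trained by backpropagation can be sketched in plain numpy. The data below are a synthetic stand-in (one Gaussian cluster per cover type), not the Roosevelt National Forest dataset, and the learning rate and iteration count are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_hid, d_out = 300, 54, 120, 7  # 54 inputs, 120 hidden, 7 cover types

# Synthetic stand-in data: one well-separated Gaussian cluster per class.
labels = rng.integers(d_out, size=n)
centers = rng.normal(scale=2.0, size=(d_out, d_in))
X = centers[labels] + rng.normal(size=(n, d_in))
Y = np.eye(d_out)[labels]                # one-hot targets

W1 = rng.normal(scale=0.1, size=(d_in, d_hid))
W2 = rng.normal(scale=0.1, size=(d_hid, d_out))
lr = 0.5

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerically stable
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Full-batch backpropagation of the cross-entropy error.
for _ in range(500):
    H = np.tanh(X @ W1)                   # hidden activations
    P = softmax(H @ W2)                   # class probabilities
    dZ2 = (P - Y) / n                     # output-layer error signal
    dW2 = H.T @ dZ2
    dZ1 = (dZ2 @ W2.T) * (1.0 - H**2)     # backprop through tanh
    dW1 = X.T @ dZ1
    W2 -= lr * dW2
    W1 -= lr * dW1

accuracy = np.mean(P.argmax(axis=1) == labels)
print(accuracy)  # training accuracy on the synthetic sample
```

On easily separable synthetic clusters the network fits quickly; the real cartographic data are of course much harder, which is where the architecture search described above matters.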
In addition to the accuracy, Blackard et al. (1999) wrote, ”Recording the data
by human is prohibitively time consuming and/or costly. ... Furthermore, an
agency may find it useful to have inventory information for adjoining lands
that are not directly under its control, where it is often economically or legally
impossible to collect inventory data. Predictive models provide an alternative
method for obtaining such data.”
3 http://forestry.about.com/cs/glossary/g/for_cov_type.htm
4 http://nationalatlas.gov/mld/foresti.html
(Paruelo et al. 1997), (Gevrey 2003) and (Leite et al. 2011) as reports of estimation of forest resources by regression.
It is not difficult to find papers that claim artificial intelligence. There exist many such papers proposing a new method for predicting the future. Let's take a look at some already published papers, for example on wildfire prediction, whose titles include the term 'artificial intelligence.'
In their paper, Peng et al. (1999) wrote, ”The application of artificial intelli-
gence in forest and natural resources management started with the development
of expert systems (Coulson et al. 1987).”
7 NIIS research team (1993) "Nepal irrigation institutions and systems' database coding sheets and forms." Indiana University, Workshop in political theory and policy analysis.
Since then, indeed, more than a few approaches have claimed to use artificial intelligence methodology. Let's name a few. Kourtz (1990) studied forest management in all aspects of Canadian forestry with an expert system, in his paper entitled 'Artificial intelligence: a new tool for forest management.' Arrue et al. (2000) proposed a system to detect forest fires using computer vision, neural networks, and expert fuzzy rules, in their paper entitled 'An intelligent system for false alarm reduction in infrared forest-fire detection.' The late 1990s were, in fact, a dawn of artificial intelligence, and researchers dreamed of a bright future of establishing a human-like artificial intelligence. Nowadays, however, we do not think the state of the art of that time had such a bright future.
Now let's look at more recent ones. Angayarkkani et al. (2010) proposed a system for detecting forest fires in their paper 'An intelligent system for effective forest fire detection using spatial data.' The digital images of the forest area were converted from RGB to XYZ color space and then segmented by anisotropic diffusion to identify fire regions. Radial basis function neural networks were employed.
The title of the already-mentioned paper by Sakr et al. (2010) was 'Artificial intelligence for forest fire prediction.'
But Are Those Intelligent Systems Really Intelligent? In his paper, Castro (2013) wrote, "Artificial neural networks are a type of artificial intelligence system similar to human brain, having a computational capability which is acquired through learning."
In fact, many claim their proposed machines to be intelligent. However, are these machines really as intelligent as human intelligence, as claimed? We cannot be so sure. For example, the paper by Sakr et al. (2010) reads, "The methods are based on artificial intelligence" in the abstract, while the term 'artificial intelligence' never appears afterwards in the whole text. Instead, the authors concluded just that "advanced information communication technologies could be used to improve wildfire prevention and protection." That's all there is to it.
Yet another such example is the paper by Wendt et al. (2011) entitled 'Input parameter calibration in forest fire spread prediction: Taking the intelligent way.' The term 'intelligent' appears only once, i.e., in 'Evolutionary Intelligent System,' without any mention of what that is.
Tom Pittman has made a career as a Border Patrol agent here guarding
this city’s underground drainage system, where the tunnels that carry
sewage and storm runoff between the United States and Mexico are also
busy drug-smuggling routes. Over the years, he has crawled and slithered
past putrid puddles, makeshift latrines and discarded needles left behind
by drug users, relying on instincts, mostly, to gauge the risks ahead. It is
a dirty and dangerous business, but these days, there is a robot for that.
As the article goes on to say, "The robots can serve as the first eyes on places considered too risky for humans to explore," the aim is not the creation of an intelligent robot agent, at least for the moment. However, there may well be an evolutionary race between the smugglers and the robots. Which will become more intelligent next time?
4 Concluding Remarks
To cover the plenary talk on 'forest resource maintenance' in general, a literature survey on the topic has been made specifically from an IT point of view, with the focus on wildfire prediction, preservation of biodiversity in ecosystems, forest cover type prediction, and forest resource maintenance such as common pool resources.

In Belarus, we have a huge forest called the 'Belovezhskaya Pushcha National Park,' which is said to be "the home to 900 plants and 250 animals and birds, including several rare species." Hence, contributing to the maintenance of this ecological environment is a duty for us IT scientists in Belarus. Furthermore, this issue is becoming worldwide in this era of global warming. The author wishes this small survey to serve as a set of useful pointers to this field.
References
Agrawal, A., et al.: Explaining success on the commons: community forest governance
in the Indian Himalaya. World Development 34(1), 149–166 (2006)
Andrews, P.L., et al.: BehavePlus fire modeling system user’s guide, v. 2.0. General
technical report RMRS-GTR-106WWW, USDA, Forest Service, Rocky Mountain
Research Station (2003)
Angayarkkani, K., et al.: An intelligent system for effective forest fire detection us-
ing spatial data. International Journal of Computer Science and Information Secu-
rity 7(1) (2010)
Arrue, B.C.: An intelligent system for false alarm reduction in infrared forest-fire de-
tection. IEEE Intelligent Systems and their Applications 15(3), 64–73 (2000)
Atkinson, P.M., et al.: Introduction: Neural networks in remote sensing. International
Journal of Remote Sensing 18, 699–709 (1997)
Benediktsson, J.A., et al.: Neural network approaches versus statistical methods in
classification of multisource remote sensing data. IEEE Transaction on Geoscience
and Remote Sensing 28, 540–552 (1990)
Blackard, J.A.: Comparison of neural networks and discriminant analysis in predicting
forest cover types. Ph.D. dissertation, Department of Forest Sciences, Colorado State
University (1998)
Blackard, J.A.: Comparative accuracies of artificial neural networks and discriminant
analysis in predicting forest cover types from cartographic variables. Computers and
Electronics in Agriculture 24, 131–151 (1999)
Braitenberg, V., et al.: Cortex: statistics and geometry of neuronal connectivity.
Springer (1997)
Campbell, W.J., et al.: Automatic labeling and characterization of objects using arti-
ficial neural networks. Telematic and Informatics 6, 259–271 (1989)
Castro, R.V.O.: Individual growth model for eucalyptus stands in Brazil using artifi-
cial neural network. In: International Scholarly Research Network, ISRN Forestry
Volume. Hindawi Publishing Corporation (2013)
Chou, Y.H.: Spatial autocorrelation and weighting functions in the distribution of
wildland fires. International Journal Wildland Fire 2(4), 169–176 (1992)
Chou, Y.H., et al.: Mapping probability of fire occurrence in San Jacinto Mountains,
California, USA. Environment Management 17(1), 129–140 (1993)
Coulson, R.N., et al.: Artificial intelligence and natural resource management. Sci-
ence 237, 26–67 (1987)
Chuvieco, E., et al.: Integrated fire risk mapping. In: Remote Sensing of Large Wildfires
in the European Mediterranean Basin. Springer (1999)
Diamantopoulou, M.J.: Artificial neural networks as an alternative tool in pine bark
volume estimation. Computers and Electronics in Agriculture 48(3), 235–244 (2005)
Downey, I.D., et al.: A performance comparison of Landsat thematic mapper land cover
classification based on neural network techniques and traditional maximum likeli-
hood algorithms and minimum distance algorithms. In: Proceedings of the Annual
Conference of the Remote Sensing Society, pp. 518–528 (1992)
Estrada, J., et al.: Atles dels ocells nidificants de Catalunya 1999–2002. In: Institut
Catal d’Ornitologia (ICO)/Lynx Edicions, Barcelona, España (2004)
Fernandez, C.A.: Towards greater accuracy in individual-tree mortality regression. Ph.D. dissertation, Michigan Technological University (2008)
Frey, U.J., et al.: Using artificial neural networks for the analysis of social-ecological
systems. Ecology and Society 18(2) (2013)
20 A. Imada
Pattie, D.C., et al.: Forecasting wilderness recreation use: Neural network versus re-
gression. AI Application 10(1), 67–74 (1996)
Peddle, D.R., et al.: Multisource image classification II: An empirical comparison of ev-
idential reasoning, linear discriminant analysis, and maximum likelihood algorithms
for alpine land cover classification. Canadian Journal Remote Sensing 20, 397–408
(1994)
Peng, C., et al.: Recent applications of artificial neural networks in forest resource man-
agement: An overview. In: Environmental Decision Support Systems and Artificial
Intelligence, pp. 15–22 (1999)
Pyne, S.J., et al.: Introduction to wildland fire, 2nd edn. Wiley (1996)
Safi, Y., et al.: Prediction of forest fires using artificial neural networks. Applied Math-
ematical Sciences 7(6), 271–286 (2013)
Sakr, G.E., et al.: Artificial intelligence for forest fire prediction. In: Proceeding of In-
ternational Conference on Advanced Intelligent Mechatronics, pp. 1311–1316 (2010)
Sekercioglu, C.H.: Increasing awareness of avian ecological function. Trends of Ecolog-
ical Evolution 21, 464–471 (2006)
Seric, L., et al.: Observer network and forest fire detection. Information Fusion 12,
160–175 (2011)
Stipanicev, D.: Intelligent forest fire monitoring system - from idea to realization. In:
Annual 2010/2011 of the Croatian Academy of Engineering (2011)
Tang, S.Y.: Institutions and collective action in irrigation systems. Dissertation. Indi-
ana University (1989)
Ulrich, J.F., et al.: Using artificial neural networks for the analysis of social-ecological
systems. Ecology and Society 18(2), 42–52 (2013)
Vasconcelos, M.J.P., et al.: Spatial prediction of fire ignition probabilities: Compar-
ing logistic regression and neural networks. Photogramm Engineering Remote Sen-
sors 67(1), 73–81 (2001)
Vasilakosi, C., et al.: Integrating new methods and tools in fire danger rating. Interna-
tional Journal of Wildland Fire 16(3), 306–316 (2007)
Vasilakosi, C., et al.: Identifying wildland fire ignition factors through sensitivity anal-
ysis of a neural network. Natural Hazards 50(1), 12–43 (2009)
Vega-Garcia, C., et al.: Applying neural network technology to human caused wildfire
occurrence prediction. Artificial Intelligence Application 10(3), 9–18 (1996)
Weingartner, M., et al.: Improving tree mortality predictions of Norway Spruce Stands
with neural networks. In: Proceedings of Symposium on Integration in Environmen-
tal Information Systems (2000)
Wendt, K., et al.: Input parameter calibration in forest fire spread prediction: Taking
the intelligent way. In: Proceedings of the International Joint Conference on Artificial
Intelligence, pp. 2862–2863 (2011)
Yoon, S.-H.: An intelligent automatic early detection system of forest fire smoke sig-
natures using Gaussian mixture model. Journal Information Process System 9(4),
621–632 (2013)
Can Artificial Neural Networks Evolve
to be Intelligent Like Human?
A Survey on “Formal Definitions
of Machine Intelligence”
Akira Imada
1 Introduction
Since John McCarthy coined the term “Artificial Intelligence” in 1956, aiming
to make a machine acquire human-like intelligence in the foreseeable future, there
has been much discussion about whether this is possible in a real sense, and many
systems called intelligent machines have in fact been reported. The term is ubiquitous
in our community these days. To name a few: ‘intelligent route finding,’
‘intelligent city transportation systems,’ ‘intelligent stock market forecasting,’ etc.
Then the question arises: to what degree are they intelligent, indeed?
Or are they not intelligent at all? Malkiel (2007) once wrote, “A monkey
throwing darts at the Wall Street Journal to select a portfolio might be better
than the one carefully selected by experts.”
Much research is now ongoing to understand the mechanism of the
brain using state-of-the-art technology. For example, we can map the human (or
non-human) brain by fMRI, or we can measure micro-voltage fluctuations in any
part of the brain by EEG. Thus, by capturing a snapshot of the brain in action,
we can observe discrete activities that occur at specific locations in the brain.
On the other hand, Weng (2013) wrote, “A person who does not understand
how a computer works got the wiring diagram of a computer and all snapshots
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 22–33, 2014.
c Springer International Publishing Switzerland 2014
of voltages at all the spots of the wires as a high-definition fMRI movie. Then
he said, ‘Now we have data, and we need to analyze it!’” Are our efforts like
“tree climbing with our eyes on the moon,” as Dreyfus (1979) once sarcastically
wrote?
2 Turing Test
In mid-February 2011, IBM’s room-sized supercomputer called Watson challenged
‘Jeopardy!’, America’s favorite quiz show. In Jeopardy!, normally three
human contestants compete to answer questions on various topics, with a penalty
for a wrong answer. The questions are like “Who is the 19th-century painter
whose name means police officer?” or “What is the city in the US whose largest
airport is named for a World War II hero, and its second largest for a World
War II battle?”
The contest was held over three days, with Watson being one of the three
contestants and the other two being ex-champions of Jeopardy!, Ken Jennings
and Brad Rutter. As Watson cannot see or hear, the questions were shown to it as a
text file at the same moment they were revealed to the two human
contestants. By the end of the third day, Watson had won $77,147, while Jennings
got $24,000 and Rutter $21,600. Watson beat the two human ex-champions.1 If
we set up an appropriate scenario, Watson might well pass the Turing Test.
Then, in mid-March 2012, a computer program called Dr. Fill challenged 600
of the world’s best human crossword players at the American Crossword Puzzle
Tournament in Brooklyn. Dr. Fill was created by Matthew Ginsberg specifically
to solve crossword puzzles. At the tournament, players got six puzzles
to solve on Saturday and one on Sunday, progressively more difficult.
Rankings are determined by accuracy and speed. The top three finishers enter
a playoff with an eighth puzzle on Sunday afternoon, competing for the $5,000
prize. The trophy went to a human, and the computer program finished 141st.2
Nevertheless, it was an impressive result, was it not?
In such an era, when computer technology is so magnificent, it would not be
difficult to imitate human intelligence. One of the easiest ways to make a
human believe that a machine is a human might be to make deliberate mistakes from
time to time, pretending not to be very precise, in imitation of a human. Even at that
time, Turing (1950) wrote,
Suppose you are playing an interactive video game with some entity.
Could you tell, solely from the conduct of the game, whether the other
1 This is from the article in the New York Times by John Markoff entitled “Creating
artificial intelligence based on the real thing,” 17 February 2011.
2 This is from two articles in the New York Times by Steve Lohr: “The computer’s
next conquest: crosswords,” 17 March 2012, and “In crosswords, it’s man over
machine, for now,” 19 March 2012.
entity was a human player or a bot? If not, then the bot is deemed to
have passed the test.
4 Note here that r_i must be chosen such that V_μ^π ranges from 0 to 1. In the closely
related setting of reinforcement learning, this quantity is written
E( Σ_{i=1}^{∞} (1 − γ) γ^{i−1} r_i ),
where 0 ≤ γ < 1 is called a discount factor, meaning “future rewards should be
discounted,” and Σ_{i=1}^{∞} (1 − γ) γ^{i−1} = 1. Later, Goertzel (2011) pointed out
that “discounting is so that near-term rewards are weighted much higher than
long-term rewards,” which is not necessarily a favorable property.
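The footnote's normalization can be checked numerically; a quick sketch, in which the value γ = 0.9 and the 10,000-step horizon are arbitrary illustrative choices:

```python
# Numerical check of the footnote: with 0 <= gamma < 1, the discount
# weights (1 - gamma) * gamma**(i - 1) for i = 1, 2, ... sum to 1, so the
# discounted value of rewards in [0, 1] also stays in [0, 1].

def discounted_value(rewards, gamma):
    # index k = i - 1, so the weight of reward r_i is (1 - gamma) * gamma**k
    return sum((1 - gamma) * gamma**k * r for k, r in enumerate(rewards))

gamma = 0.9
weights = [(1 - gamma) * gamma**k for k in range(10_000)]
print(round(sum(weights), 6))                          # -> 1.0 (geometric series)
print(discounted_value([1.0] * 10_000, gamma) <= 1.0)  # -> True
```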
Then the intelligence γ(π) is defined as a weighted sum of this expected value of the
sum of rewards over an infinitely large number of various environments:

γ(π) = Σ_{μ∈E} w_μ · V_μ^π , (5)
The weight w_μ is derived from the Kolmogorov complexity of the environment,

K(μ) = min_{p : U(p)=μ} l(p), (6)

where p is a binary string which we call a program, l(p) is the length of this
string in bits, and U is a prefix universal Turing machine.
Thus the complexity of μ_i is expressed by K(μ_i). We use this in the form of the
probability distribution 2^{−K(μ)}. Then, finally,

w_μ = 2^{−K(μ)} . (7)
In cases where we have multiple paths to the goal, the simplest one should
be preferred, which is sometimes called the principle of Occam’s razor. In other
words, given multiple hypotheses that represent the data, the simplest should
be preferred, as Legg and Hutter wrote. The above definition of the weight value
follows this principle: the smaller the complexity, the larger the weight.
In summary, the intelligence of the agent π is the expected performance of the
agent with respect to the universal distribution 2^{−K(μ)} over the space of all
environments E, that is,

γ(π) = Σ_{μ∈E} 2^{−K(μ)} · V_μ^π . (8)
In other words, it is the weighted sum of the formal measure of success in all
environments, where the weight is determined by the Kolmogorov complexity of each
environment.
We now recall the starting informal definition: an ability to achieve goals in a
wide range of environments. In the above equation, the agent’s ability to achieve
goals is represented by V_μ^π, and a wide range of environments is represented
by the summation over E, that is, all the well-defined environments in which
rewards can be summed. The Occam’s razor principle is given by the factor 2^{−K(μ)}.
Thus Legg and Hutter called this the universal intelligence of agent π.
5 Legg and Hutter described this as “the summation should be over the space of all
computable reward-summable environments,” meaning that the total amount of reward
the environment returns to any agent is bounded by 1.
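To make the structure of equation (8) concrete, here is a finite toy sketch; the three environments, their assumed complexities K, and the agent encoding are all hypothetical stand-ins, since the true sum ranges over all computable environments and K(μ) is uncomputable:

```python
# A finite toy sketch of equation (8): a weighted sum of the agent's formal
# measure of success V, weighted by 2**(-K) for an assumed complexity K.

def universal_intelligence(agent, environments):
    """Sum of 2**(-K) * V over (K, value_fn) pairs, each V in [0, 1]."""
    return sum(2.0 ** -K * value_fn(agent) for K, value_fn in environments)

# Three hypothetical environments; an agent is modeled simply as a map from
# task name to the value V it achieves there.
environments = [
    (1, lambda agent: agent("easy")),    # simple environment, weight 1/2
    (3, lambda agent: agent("medium")),  # weight 1/8
    (7, lambda agent: agent("hard")),    # weight 1/128
]

agent = {"easy": 1.0, "medium": 0.8, "hard": 0.5}.get
print(universal_intelligence(agent, environments))  # -> 0.60390625
```

Note how the 2^{−K} weighting makes performance on simple environments dominate the score, exactly as the Occam's razor argument in the text prescribes.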
To be a little more realistic than Legg and Hutter’s definition, Goertzel (2011)
extends it by adding (i) a distribution function ν that assigns
each environment a probability; (ii) a goal function that maps a finite sequence
a_1 o_1 r_1 a_2 o_2 r_2 · · · a_i o_i into r_i; (iii) a conditional distribution γ, so that γ(g, μ)
gives the weight of a goal g in the context of a particular environment μ, where
g is given by a symbol from G; and (iv) a Boolean value τ_{g,μ}(n) that tells whether
it makes sense to evaluate performance on goal g in environment μ over a
period of n time steps, with 1 meaning yes and 0 meaning no. If the agent
is provided with goal g in environment μ during a time interval T = {t_1, · · · , t_2},
then the expected goal-achievement of the agent during that interval is

V_{μ,g,T}^π = Σ_{i=t_1}^{t_2} r_i , (9)

which Goertzel calls a pragmatic general intelligence. Let me skip the details here.
Thus, “the environments can be generated and their complexity can be computed
automatically,” as Hernández-Orallo wrote.
Hernández-Orallo also describes the possibility of using a finite state machine,
noting that “finite-state machines are not Turing-complete, though.”
The finite state machine M_p for predictor p has state set S_p, initial state I_p,
and mapping

M_p : B × S_p → S_p × B, (11)

meaning that the predictor receives an input from the evader, transitions to a
new state, and makes an action.
Similarly, the finite state machine M_e for evader e has state set S_e, initial
state I_e, and mapping

M_e : B × S_e → S_e × B, (12)

meaning that the evader sees the action of the predictor as an input, transitions
to a new state, and gives a reward to the predictor.
Evader e creates a finite binary sequence x_1 x_2 x_3 · · · , and predictor p likewise
creates a finite binary sequence y_1 y_2 y_3 · · · . A pair of evader e and predictor p
interact, where e produces its next bit from the predictor’s past outputs and p
produces its next bit from the evader’s past outputs.
Then predictor p wins round (n + 1) if y_{n+1} = x_{n+1} (the predictor successfully
predicts the evader) and evader e wins if y_{n+1} ≠ x_{n+1} (the predictor fails to
predict the evader).
6 Hibbard also gave a Turing machine version of this argument in the same paper,
but we show only its finite state machine version here.
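The predictor-evader game can be sketched in a few lines; the players here are plain functions from the opponent's history to a bit rather than the finite state machines of (11)-(12), and both example players are hypothetical:

```python
# A minimal sketch of the predictor-evader game: the predictor wins round
# n+1 iff its guess y_{n+1} equals the evader's bit x_{n+1}.

def play(predictor, evader, rounds=100):
    xs, ys = [], []                      # evader bits, predictor guesses
    predictor_wins = 0
    for _ in range(rounds):
        y = predictor(xs)                # guess from the evader's past bits
        x = evader(ys)                   # move from the predictor's past guesses
        ys.append(y)
        xs.append(x)
        predictor_wins += (y == x)       # score this round
    return predictor_wins

# Hypothetical players: an evader that alternates 0, 1, 0, 1, ... and a
# predictor that has learned exactly that pattern.
alternating_evader = lambda ys: len(ys) % 2
pattern_predictor = lambda xs: len(xs) % 2
print(play(pattern_predictor, alternating_evader))  # -> 100: predicted every round
```

In Hibbard's setting the predictor's score over increasingly complex evaders serves as the intelligence measure; this sketch only shows the round-by-round scoring rule.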
First of all, let’s restrict our definition to just one specific task. And let’s
forget about measuring complexity by a Turing machine or the like, which
is far from pragmatic. Instead, let’s look for some simpler complexity
measure. Then, let’s measure how unpredictable, or spontaneous, an action is.
The similarity of one action to the previous actions should also be
incorporated into the definition. Finally, as a measure of learning capability, let’s
repeat the algorithm a number of times to observe how an action in a run
has improved over the one made in the previous runs.
Hence, the formula for the intelligence of an agent π on a task μ has a form
like

V_μ^π = Σ_{j=1}^{N} Σ_{i=1}^{M} F(a_ij) · G(a_ij) · H(a_ij) · U(a_ij), (15)

where a_ij is the i-th action in the j-th run, M is the total number of actions in a
run, and N is the number of runs repeated in the same environment.
The function F represents complexity or simplicity, G is a measure of
unpredictability, H is a similarity measure of an action compared to the previous
actions, and U measures how much better or worse an action in a run is than the
action in the same situation in previous runs.
The functional form of F is up to the philosophy of the designer of this formula.
The functions G and H might be slightly monotonically increasing functions, the
larger the better, more or less. The function U is a monotonically decreasing function,
assuming we measure the efficiency or time for an agent to process one task.
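As an illustration only, formula (15) can be assembled as follows; the concrete choices of F, G, H, and U below are hypothetical placeholders, since the text deliberately leaves their functional forms to the designer:

```python
# A toy instantiation of formula (15): a double sum over runs j and actions i
# of the product F * G * H * U evaluated at each action a_ij.

def intelligence(runs, F, G, H, U):
    """runs[j][i] is a numeric encoding of action a_ij; returns V."""
    return sum(
        F(a, i, j, runs) * G(a, i, j, runs) * H(a, i, j, runs) * U(a, i, j, runs)
        for j, run in enumerate(runs)
        for i, a in enumerate(run)
    )

# Actions encoded as completion times. U is decreasing in the time an action
# takes compared with the same step of the previous run, rewarding learning.
F = lambda a, i, j, runs: 1.0 / (1.0 + a)       # simplicity of the action
G = lambda a, i, j, runs: 1.0                   # unpredictability (placeholder)
H = lambda a, i, j, runs: 1.0                   # similarity (placeholder)
U = lambda a, i, j, runs: 1.0 if j == 0 else 1.0 / (1.0 + max(0.0, a - runs[j - 1][i]))

runs = [[4.0, 3.0], [2.0, 1.0]]                 # two runs of two actions each
print(intelligence(runs, F, G, H, U))
```

Because the second run is faster at every step than the first, the U term never penalizes it, so the score reflects the improvement across runs that the text asks the measure to capture.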
5 Conclusion
References
Chaitin, G.J.: Gödel’s theorem and information. International Journal of Theoretical
Physics 21(12), 941–954 (1982)
Dreyfus, H.: What computers can’t do. MIT Press (1979)
Frosini, P.: Does intelligence imply contradiction? Cognitive Systems Research 10(4),
297–315 (2009)
Gnilomedov, I., Nikolenko, S.: Agent-based economic modeling with finite state ma-
chines. In: Combined Proceedings of the International Symposium on Social Network
Analysis and Norms for MAS, pp. 28–33 (2010)
Multi Objective Optimization of Trajectory Planning
of Non-holonomic Mobile Robot
Bashra Kadhim Oleiwi1,*, Rami Al-Jarrah1, Hubert Roth1, and Bahaa I. Kazem2
1 Siegen University, Automatic Control Engineering,
Hoelderlinstr. 3, 57068 Siegen, Germany
{bashra.kadhim,rami.al-jarrah,hubert.roth}@uni-siegen.de
2 Mechatronics Eng. Dept., University of Baghdad, Iraq
[email protected]
Keywords: Global and local path planning, Trajectory generation, Mobile robot,
Multi-objective optimization, Dynamic environment, Genetic algorithm,
Fuzzy control, A* search algorithm.
1 Introduction
Soft computing or intelligent systems, including fuzzy logic, genetic algorithms,
and neural networks, can solve such complex problems with reasonable accuracy [1, 2].
Motion planning [3] is one of the important tasks in intelligent control of an auto-
nomous mobile robot. It is often decomposed into path planning and trajectory plan-
ning, although they are not independent of each other. Path planning is to generate a
collision-free path in an environment with obstacles and to optimize it with respect to
some criterion. Trajectory planning is to schedule the movement of a mobile robot
along the planned path. There have been many methods proposed for the motion
planning of mobile robots [3]. Usually, motion planning of a mobile robot in an
unknown environment is divided into two categories [4]. First, obstacles are unknown and
* Affiliated with the University of Technology, Control and Systems Eng. Dept., Iraq.
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 34–49, 2014.
© Springer International Publishing Switzerland 2014
Multi Objective Optimization of Trajectory Planning of Non-holonomic Mobile Robot 35
The article is organized as follows: Section 2 describes the problem and the case study.
Section 3 is devoted to the kinematic model of the mobile robot, and the proposed
approach, flow charts, and evaluation criteria are introduced in Section 4. The
multi-objective optimization function is addressed in Section 5. The fuzzy motion planning
is presented in Section 6. Based on this formulation, simulation results are presented
in Section 7. Finally, conclusions and future work are discussed in Section 8.
2 Problem Description
As it is known, the path planning is used to generate a path off-line from initial to
final position. On other hand, the trajectory generation is to impose a velocity profile
to convert the path to a trajectory. A trajectory is a path which is an explicit function
of time. Fig. 1 shows the motion planning of mobile robot. Finding a feasible trajecto-
ry is called trajectory planning or motion planning [18].
Note that a trajectory is a time-based profile of position and velocity from
start to destination, while a path is based on non-time parameters. A parametric
cubic spline function is used to obtain smooth movement along the trajectory, and it
must give continuous velocity and acceleration. Basically, the path planning problem is a
form of geometrical problem which can be solved given a geometrical description of the
mobile robot and its workspace, the starting and target configurations of the mobile robot,
and an evaluation of the degrees of freedom (DOF) of the mobile robot [19]. Trajectory planning in
general requires a path planner to compute a geometric collision-free path. This free
path is then converted into a time-based trajectory [19]. The kinematic model of the
mobile robot used is presented in the following section. A point mobile
robot moves in 2D static and dynamic environments. Obstacles can be
placed at any grid point in the map. The mobile robot’s mission is to search, offline
and online, for an optimal path that travels from a start point to a goal point
in both environments and complies with some restrictions. First, the path must be
collision-free, which means there should be no collision with the static and dynamic
obstacles that appear on its way. Second, the path should be short, to minimize
traveling distance. Third, it should be smooth, to minimize the total angles of all
vectorial path segments with minimum curvature. Fourth, it should provide security
(safe, clear), maintaining the clearance requirements: the path is safer the farther
it is from obstacles (it should not approach the obstacles very closely), or a maximum
clearance distance from obstacles should be kept. Fifth, the traveling time should be
minimized. Sixth, the energy consumption of the robot should be minimized. Finally,
the path must stay inside the grid boundaries.
As shown in Fig. 2, when the robot navigates from the start point to the target
point, the multi-objective optimal trajectory planning presented in
this work trades off among all of the aforementioned objectives, with the advantage
that multiple tradeoff solutions can be obtained in a single run.
The boundary conditions include the position, velocity, and acceleration
constraints of the robot. The position constraints of the mobile robot are [8]:

x(0) = x_initial and x(t_f) = x_target (1)

The velocity and acceleration constraints are that the mobile robot should start from
rest at the initial position, accelerate to reach its maximum velocity, and,
near the target location, decelerate to stop at the goal position.
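The rest-to-rest behavior described by these constraints can be sketched as a trapezoidal velocity profile; the 0.75 m/s ceiling matches the maximum velocity used later in the simulations, while the acceleration of 0.5 m/s² is a hypothetical choice:

```python
# A sketch of the boundary conditions in (1) as a trapezoidal velocity
# profile: accelerate from rest, cruise at v_max, decelerate to stop.

def velocity(t, distance, v_max=0.75, a=0.5):
    """Speed (m/s) at time t for a rest-to-rest move over `distance` meters."""
    t_ramp = v_max / a                      # time to reach v_max
    d_ramp = 0.5 * a * t_ramp**2            # distance covered while ramping
    t_cruise = (distance - 2 * d_ramp) / v_max
    t_total = 2 * t_ramp + t_cruise
    if t < 0 or t > t_total:
        return 0.0                          # at rest outside the move
    if t < t_ramp:
        return a * t                        # accelerating
    if t < t_ramp + t_cruise:
        return v_max                        # cruising
    return a * (t_total - t)                # decelerating

print(velocity(0.0, 10.0), velocity(3.0, 10.0))  # -> 0.0 0.75
```

This sketch assumes the path is long enough for a cruise phase; shorter paths would need a triangular profile instead.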
4 Proposed Approach
In this section, a description of proposed approach is presented and the general sche-
ma of the proposed approach is shown in Fig 4 as well as the flow chart in Fig. 5. In
order to solve the MOOP problem can five main steps have been introduced. Note
that the fifth step is the fuzzy motion logic which will be described later.
The first step is the initialization. There are some of definitions corresponding to
the initialization stage are presented in Table 1. The next step describes the environ-
ment model and some corresponding definitions are presented. We construct a closed
workspace (indoor area) with different numbers of static obstacles. This area is de-
scribed by a 2D static map (20 × 20); the starting point is S=(1, 1), and the target
point is T=(19, 19) for a path. The positions of the obstacles are randomly chosen; in
other words, the obstacles can be placed at any grid point in the map. In order to
eliminate obstacle nodes from the map at the beginning of the algorithm, a shortcut or
decrease operator is used.
The third stage is the initial population, which generates and moves over
sub-optimal feasible paths. The classical method and a modified A* are used for
generating a set of sub-optimal feasible paths in both simple and complex maps.
Then, the obtained paths are used to establish the initial population for the GA
optimization. Here, the mobile robot moves in an indoor area and can move in any
of the eight directions (forward, backward, right, left, right-up, right-down, left-up,
and left-down). In the classical method, the movement of the mobile robot is
controlled by a transition rule function, which in turn depends on the Euclidean distance
between two points (the next position j and the target position T), with the roulette
wheel method used to select the next point and to avoid falling into local minima in
complex maps.
Hence, the distance value (D) between the two points is the Euclidean distance:

D(j, T) = sqrt( (x_j − x_T)^2 + (y_j − y_T)^2 ) (2)
The robot moves through every feasible solution to find the optimal solution,
favoring tracks that have a relatively smaller distance between two points, where the
location of the mobile robot and the quality of the solution are maintained such that a
sub-optimal solution can be obtained. However, when the number of obstacles
increases, the classical method may face difficulties in finding a solution, or it
may not find one at all. Also, the more via points are used, the more time-consuming the
algorithm becomes, since it depends mostly on the number of via points used in the
path in a maze map. When the modified A* search algorithm is added in the
initialization stage of the GA, the proposed approach will find a solution in any case, even if
there are many obstacles. In fact, the traditional A* algorithm is the standard search
algorithm for the shortest path problem in a graph. The A* algorithm, as shown in
equation (3) below, can be considered a best-first search algorithm that combines
the advantages of uniform-cost and greedy searches using a fitness function [23].
F(n) = G(n) + H(n), (3)

where G(n) is the cost of the path from the start node to node n and H(n) is a
heuristic estimate of the cost between the two nodes. The robot selects the next node
depending on the minimum value of F(n).
The modified A* algorithm is among the most effective free-space searching
algorithms in terms of path length optimization (for a single objective). We proposed the
modified A* for searching for sub-optimal feasible paths regardless of length to establish the
initial solution of the GA in a maze map, by adding a probability function to the A* method.
We have modified A* in order to avoid using only the shortest path, which could affect
the path performance in terms of the multiple objectives (length, security, and smoothness) in
the initial stage.
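For reference, the standard A* of equation (3) on an 8-connected grid like the paper's 20 × 20 map can be sketched as follows (the probability function of the modified variant is deliberately omitted, and the obstacle layout is a hypothetical example):

```python
import heapq
import math

# A sketch of standard A*, F(n) = G(n) + H(n), on an 8-connected grid with
# coordinates 1..size. The Euclidean distance serves as the heuristic H;
# diagonal moves cost sqrt(2), axis moves cost 1.

def a_star(start, goal, obstacles, size=20):
    moves = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    open_heap = [(math.dist(start, goal), 0.0, start, [start])]  # (F, G, node, path)
    best_g = {start: 0.0}
    while open_heap:
        _, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path
        for dx, dy in moves:
            nxt = (node[0] + dx, node[1] + dy)
            if not (1 <= nxt[0] <= size and 1 <= nxt[1] <= size) or nxt in obstacles:
                continue
            ng = g + math.hypot(dx, dy)          # G: cost of the path so far
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                f = ng + math.dist(nxt, goal)    # F = G + H
                heapq.heappush(open_heap, (f, ng, nxt, path + [nxt]))
    return None                                  # no collision-free path exists

path = a_star((1, 1), (19, 19), obstacles={(5, 12)})
print(path[0], path[-1], len(path))  # -> (1, 1) (19, 19) 19
```

The modified variant described above would perturb this deterministic node selection with a probability function so that the GA's initial population is not seeded with shortest paths only.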
Fig. 7. GA Operators
combination of the objectives, in this way creating a single objective function to
optimize, or by converting the objectives into restrictions imposed on the optimization
problem. With regard to evolutionary computation, [24] proposed the first implementation
of a multi-objective evolutionary search. Most of the proposed methods center
on the concept of Pareto optimality and the Pareto-optimal set. Using these concepts
of optimality for individuals evaluated under a multi-objective problem, they
propose a fitness assignment for each individual in the current population during an
evolutionary search, based upon the concepts of dominance and non-dominance in Pareto
optimality [24, 25]. In recent years, the idea of Pareto optimality has been introduced to
solve multi-objective optimization problems, with the advantage that multiple tradeoff
solutions can be obtained in a single run [26]. The total cost of the fitness (or objective)
function of a feasible path P with n points is obtained by a linear combination, the
weighted sum of the multiple objectives, as follows [27, 28]:

min F(P) = min{ ω_1 F_1(P) + ω_2 F_2(P) + ω_3 F_3(P) + ω_4 F_4(P) } (5)
where t_T is the total time from the start to the target point. By minimizing the overall
fitness function with respect to the assigned weights of each criterion, a suitable path is
obtained. The weights of the shortest, smoothest, security, and time fitness functions,
w_1, w_2, w_3, and w_4 respectively, are tuned through simulation and trial and error,
with the best values found.
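The weighted sum in equation (5) can be evaluated for a candidate path as sketched below; the weight values and the concrete objective terms are hypothetical stand-ins, since the weights are tuned by simulation and trial and error and the security term depends on an obstacle-clearance computation not reproduced here:

```python
import math

# A sketch of the weighted-sum fitness of equation (5): length, smoothness,
# security, and travel time combined into one scalar cost F(P).

def fitness(path, w=(0.4, 0.2, 0.2, 0.2)):
    """path: list of (x, y) waypoints; returns the scalar cost F(P)."""
    length = sum(math.dist(a, b) for a, b in zip(path, path[1:]))               # F1
    headings = [math.atan2(b[1] - a[1], b[0] - a[0]) for a, b in zip(path, path[1:])]
    smoothness = sum(abs(h2 - h1) for h1, h2 in zip(headings, headings[1:]))    # F2
    security = 0.0          # F3 placeholder: e.g. penalty for small clearance
    travel_time = length / 0.75   # F4: time at the 0.75 m/s maximum velocity
    return w[0] * length + w[1] * smoothness + w[2] * security + w[3] * travel_time

straight = [(0, 0), (1, 1), (2, 2)]
bent = [(0, 0), (1, 0), (2, 2)]
print(fitness(straight) < fitness(bent))  # -> True: shorter, smoother path wins
```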
The membership functions for the proposed algorithms are shown in Fig. 9, Fig. 10,
and Fig. 11. According to Fig. 9 and Fig. 10, the fuzzy membership function for
the input velocity has 9 linguistic variables: Z (zero), VVL (very very low),
VL (very low), L (low), Medium, H (high), VH (very high), VVH (very very high),
and Maximum. The second input has 5 linguistic variables: No, Far, Medium,
Close, and Very Close. It should be noted that input 2 is normalized. For the output,
9 membership functions have been used: Z (zero), VVL (very very low), VL (very low),
L (low), Medium, H (high), VH (very high), VVH (very very high), and Maximum.
Fig. 9. Membership functions for the input velocity
Fig. 10. Membership functions for the input detect obstacle
The most common defuzzification methods include the centroid and the
weighted average methods. This step is an operation to produce a non-fuzzy control
action; it transforms fuzzy sets into a crisp value. Therefore, in this work, for the
ultimate defuzzification, the center-of-gravity method has been used, as given by:

Z_a = ∫ μ_c(z) · z dz / ∫ μ_c(z) dz (7)
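Equation (7) can be approximated numerically by replacing the integrals with sums over sampled points; the triangular output membership function below is a hypothetical example:

```python
# A minimal numerical sketch of the center-of-gravity defuzzification in
# equation (7), using a discrete approximation of the two integrals.

def centroid(zs, mus):
    """z_a = sum(mu(z) * z) / sum(mu(z)) over sampled points zs."""
    num = sum(m * z for z, m in zip(zs, mus))
    den = sum(mus)
    return num / den

# Symmetric triangular membership function centered at 0.5 on [0, 1].
zs = [i / 100 for i in range(101)]
mus = [max(0.0, 1 - abs(z - 0.5) / 0.25) for z in zs]
print(round(centroid(zs, mus), 3))  # -> 0.5 for a symmetric fuzzy set
```

For a symmetric output set the centroid lands at the center of symmetry; asymmetric aggregated sets, as produced by the rule base during obstacle avoidance, shift the crisp velocity accordingly.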
7 Simulation Results
This section presents the results of robot case study. In order to verify the effective-
ness of the proposed hybrid approach, we applied it in simple and complicated 2D
static environments with different numbers of obstacles. The MATLAB software
(CPU: 2.61 GHz) is used for the simulation. Fig. 13, Fig. 17, and Fig. 21 show the
velocity profiles of the optimal trajectory generation for different maps. The robot
paths have been successfully designed by the GA; in these cases, no unknown
obstacles appear in the paths. The optimal trajectory generation for these paths is
shown in Fig. 12, Fig. 16, and Fig. 20. The fuzzy velocity profiles for the optimal
trajectory generation are shown in Fig. 14, Fig. 18, and Fig. 22. The fuzzy motion
controller has given approximately the same velocity values as the GA for the robot.
In these cases, still no unknown obstacles come into the robot’s path. Therefore,
the fuzzy control and the GA work simultaneously in order to make the robot navigate
to its target without colliding with obstacles. Fig. 15, Fig. 19, and Fig. 23 show how
this algorithm can reduce the velocity of the robot not only if there are static
obstacles, but also if moving objects come into the path. Fig. 15 shows the optimum
velocity of the robot with respect to time: it increases from 0 to a maximum of
0.75 m/s, and the robot may reduce its speed if a dynamic obstacle is close
to its path, in order to turn left or right. Fig. 19 shows the optimum robot velocity with
respect to time. It accelerates from 0 to a maximum of 0.75 m/s, and the robot has the
ability to decrease its speed if a dynamic obstacle comes close to its path, as well as
to accelerate back to its maximum velocity. Fig. 23 shows the optimum velocity
of the robot compared to the output velocity obtained from the fuzzy control. The fuzzy
logic not only controls this velocity but also acts if the robot detects a
moving obstacle in its path. It is clear that when the robot detects the obstacle
at time 30, the fuzzy control decreases the velocity to 0.55 m/s and increases it
again to 0.75 m/s when the path becomes free of the obstacle.
Fig. 12. Optimal trajectory generation
Fig. 13. Final velocity profile for the optimal trajectory generation
Fig. 14. Final fuzzy velocity profile for the optimal trajectory generation
Fig. 15. The fuzzy velocity profile when an obstacle appears
Fig. 16. Optimal trajectory generation
Fig. 17. The final velocity profile for the optimal trajectory generation
Fig. 18. Final fuzzy velocity profile for the optimal trajectory generation
Fig. 19. The fuzzy velocity profile when an obstacle appears
Fig. 20. Optimal trajectory generation
Fig. 21. The final velocity profile for the optimal trajectory generation
Fig. 22. Final fuzzy velocity profile for the optimal trajectory generation
Fig. 23. The fuzzy velocity profile when an obstacle appears
As shown in Fig. 24, in order to avoid an unknown moving object (red) in the
environment, the fuzzy motion controller reduces the final velocity of the robot
(green) when it detects the new unknown obstacle. Hence the robot can avoid collision
with the moving object, and when the path becomes free again, the controller
increases the velocity and the robot can finish its mission and reach the target.
8 Conclusion
The main contribution of this study is a proposed approach for the multi-objective
optimization of the path and trajectory of a mobile robot with collision avoidance.
This approach considers cubic spline data interpolation and the
non-holonomic constraints in the kinematic equations of the mobile robot, which are used to
obtain a smooth trajectory with an associated minimum energy cost. In this work, a
global optimal path avoiding obstacles is generated initially. Then, the global optimal
trajectory is fed to the fuzzy motion controller to be regenerated into a time-based
trajectory. The fuzzy control shows good performance in dealing with dynamic obstacles in
the environment. The objective function of the proposed approach minimizes
traveling distance, traveling time, smoothness, and security costs while avoiding the static and
dynamic obstacles in the workspace. The simulation results show that the proposed
approach is able to achieve multi-objective optimization in a dynamic environment
efficiently.
References
[1] Hui, N.B., Mahendar, V., Pratihar, D.K.: Time Optimal, Collision-Free Navigation of a
Car-Like Mobile Robot Using Neuro-Fuzzy Approaches. Fuzzy Sets and Systems 157,
2171–2204 (2006)
[2] Jelena, G., Nigel, S.: Neuro-Fuzzy Control of a Mobile Robot. Neurocomputing 28,
127–143 (2009)
[3] Sugihara, K., Smith, J.: Genetic Algorithms for Adaptive Motion Planning of an Auto-
nomous Mobile Robot. In: IEEE International Symposium on Computational Intelligence
in Robotics and Automation, pp. 138–143 (1997)
[4] Lei, L., Wang, H., Wu, Q.: Improved Genetic Algorithms Based Path Planning of Mobile
Robot Under Dynamic Unknown Environment. In: IEEE International Conference on
Mechatronics and Automation, pp. 1728–1732 (2006)
[5] Li, X., Choi, B.-J.: Design of Obstacle Avoidance System for Mobile Robot Using Fuzzy
Logic Systems. International Journal of Smart Home 7(3), 321–328 (2013)
[6] Li, X., Choi, B.-J.: Obstacle Avoidance of Mobile Robot by Fuzzy Logic System. In:
ISA, ASTL, vol. 21, pp. 244–246. SERSC (2013)
[7] Purian, F.K., Sadeghian, E.: Path Planning of Mobile robots Via Fuzzy Logic in Un-
known Dynamic Environments with Different Complexities. Journal of Basic and Ap-
plied Scientific Research 3(2s), 528–535 (2013)
[8] Arshad, M., Choudhry, M.A.: Trajectory Planning of Mobile robot in Unstructured Envi-
ronment for Multiple Objects. Mehran University Research Journal of Engineering &
Technology 31(1), 39–50 (2012)
[9] Rusu, C.G., Birou, I.T., Szöke, E.: Fuzzy Based Obstacle Avoidance System for Auto-
nomous Mobile Robot. In: IEEE International Conference on Automation Quality and
Testing Robotics, vol. (1), pp. 1–6 (2010)
[10] Rusu, C.G., Birou, I.T.: Obstacle Avoidance Fuzzy System for Mobile Robot with IR
Sensors. In: 10th International Conference on Development and Application, pp. 25–29
(2010)
[11] Shi, P., Cui, Y.: Dynamic Path Planning for Mobile Robot Based on Genetic Algorithm in Unknown Environment. In: IEEE Conference, pp. 4325–4329 (2010)
[12] Benbouabdallah, K., Qi-dan, Z.: Genetic Fuzzy Logic Control Technique for a Mobile
Robot Tracking a Moving Target. International Journal of Computer Science Is-
sues 10(1), 607–613 (2013)
[13] Senthilkumar, K.S., Bharadwaj, K.K.: Hybrid Genetic-Fuzzy Approach to Autonomous
Mobile Robot. In: IEEE International Conference on Technologies for Practical Robot
Applications, pp. 29–34 (2009)
[14] Farshchi, S.M.R., NezhadHoseini, S.A., Mohammadi, F.: A Novel Implementation of G-
Fuzzy Logic Controller Algorithm on Mobile Robot Motion Planning Problem. Canadian
Center of Science and Education, Computer and Information Science 4(2), 102–114
(2011)
[15] Phinni, M.J., Sudheer, A.P., RamaKrishna, M., Jemshid, K.K.: Obstacle Avoidance of a Wheeled Mobile Robot: A Genetic-Neurofuzzy Approach. In: IISc Centenary – International Conference on Advances in Mechanical Engineering (2008)
[16] Oleiwi, B.K., Roth, H., Kazem, B.: Modified Genetic Algorithm based on A* algorithm of Multi objective optimization for Path Planning. In: 6th International Conference
on Computer and Automation Engineering, vol. 2(4), pp. 357–362 (2014)
[17] Oleiwi, B.K., Roth, H., Kazem, B.: A Hybrid Approach based on ACO and GA for Multi
Objective Mobile Robot Path Planning. Applied Mechanics and Materials 527, 203–212
(2014)
[18] Kim, C.H., Kim, B.K.: Minimum-Energy Motion Planning for Differential-Driven Wheeled Mobile Robots. In: Motion Planning. InTech (2008)
[19] Breyak, M., Petrovic, I.: Time Optimal Trajectory Planning Along Predefined Path for
Mobile Robots with Velocity and Acceleration Constraints. In: IEEE/ASME Internation-
al Conference on Advanced Intelligent Mechatronics, pp. 942–947 (2011)
[20] Vivekananthan, R., Karunamoorthy, L.: A Time Optimal Path Planning for Trajectory
Tracking of Wheeled Mobile Robots. Journal of Automation, Mobile Robotics & Intelli-
gent Systems 5(2), 35–41 (2011)
[21] Xianhua, J., Motai, Y., Zhu, X.: Predictive fuzzy control for a mobile robot with nonho-
lonomic constraints. In: IEEE Mid-Summer Workshop on Soft Computing in Industrial
Applications, Helsinki University of Technology, Espoo, Finland (2005)
[22] Buniyamin, N., Sariff, N.B.: Comparative Study of Genetic Algorithm and Ant Colony
Optimization Algorithm Performances for Robot Path Planning in Global Static Envi-
ronments of Different Complexities. In: IEEE International Symposium on Computation-
al Intelligence in Robotics and Automation, pp. 132–137 (2009)
[23] Buniyamin, N., Sariff, N.B.: An Overview of Autonomous Mobile Robot Path Planning
Algorithms. In: 4th IEEE Student Conference on Research and Development, pp. 183–
188 (2006)
Multi Objective Optimization of Trajectory Planning of Non-holonomic Mobile Robot 49
[24] Krishnan, P.S., Paw, J.K.S., Tiong, S.K.: Cognitive Map Approach for Mobility Path Op-
timization using Multiple Objectives Genetic Algorithm. In: 4th IEEE International Con-
ference on Autonomous Robots and Agents, pp. 267–272 (2009)
[25] Castillo, O., Trujillo, L.: Multiple Objective Optimization Genetic Algorithms for Path
Planning in Autonomous Mobile Robots. International Journal of Computers, Systems
and Signals 6(1), 48–63 (2005)
[26] Fonseca, C.M., Fleming, P.J.: An Overview of Evolutionary Algorithms in Multi-
objective Optimization. Evolutionary Computing 3(1), 1–16 (1995)
[27] Jun, H., Qingbao, Q.: Multi-Objective Mobile Robot Path Planning based on Improved
Genetic Algorithm. In: Proc. IEEE International Conference on Intelligent Computation
Technology and Automation, vol. 2, pp. 752–756 (2010)
[28] Geetha, S., Chitra, G.M., Jayalakshmi, V.: Multi Objective Mobile Robot Path Planning
based on Hybrid Algorithm. In: 3rd IEEE International Conference on Electronics Com-
puter Technology, vol. 6, pp. 251–255 (2011)
[29] Saffiotti, A.: The use of fuzzy logic for autonomous robot navigation. Soft Compu-
ting 1(4), 180–197 (1997)
Multi Objective Optimization of Path and Trajectory
Planning for Non-holonomic Mobile Robot
Using Enhanced Genetic Algorithm
1 Introduction
Real-world problem solving commonly involves the optimization of two or more objectives at once. A consequence of this is that it is not always possible to reach a solution that is optimal with respect to every objective evaluated individually. Historically, a common method for solving multi objective problems has been a linear combination of the objectives, creating a single objective function to optimize, or a conversion of some objectives into constraints imposed on the optimization problem. Within evolutionary computation, the first implementation of a multi objective evolutionary search was proposed in [1]. Most of the proposed methods focus on the concept of Pareto optimality and the Pareto optimal set. Using these concepts of optimality of
* Affiliated with the University of Technology, Control and Systems Engineering Dept., Iraq.
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 50–62, 2014.
© Springer International Publishing Switzerland 2014
individuals evaluated under a multi objective problem, these methods each propose a fitness assignment for each individual in the current population during an evolutionary search, based upon the concepts of dominance and non-dominance of Pareto optimality [1].
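The dominance relations underlying Pareto optimality can be made concrete with a small sketch (illustrative Python, not from the paper; the objective values below are arbitrary), for objectives to be minimized:

```python
def dominates(a, b):
    """True if solution a Pareto-dominates b (all objectives minimized):
    a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Example: (path length, travel time) pairs for candidate paths.
candidates = [(27.8, 39.6), (30.0, 35.0), (29.0, 41.0), (28.5, 36.0)]
front = pareto_front(candidates)
# (29.0, 41.0) is dominated by (27.8, 39.6) and drops out of the front.
```

A fitness assignment based on dominance then ranks individuals by how many solutions dominate them, rather than by a single scalar cost.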
At present, mobile robot path planning uses two types of methods: traditional methods and intelligent methods. The GA is a mature and widely used intelligent method. The initial population in the traditional GA is generated randomly. However, a huge population size leads to a large search space, poor removal of redundant individuals, and unsatisfactory speed and accuracy of path planning, especially in complex environments, multi-robot path planning [2] and multi objective settings. When the environment is complex and the number of obstacles increases, the basic GA may have difficulty finding a solution or may not find one at all. In addition, it is difficult to achieve convergence and to generate optimized, feasible multi objective path plans using only the basic operators of the standard GA [3]. Multi-objective mobile robot path planning is a wide and active research area, and many methods have been applied to tackle this problem [4-9]. A multi-objective GA robot trajectory planner is proposed in [4]. The hybrid gravitational search algorithm and particle swarm optimization algorithm in [5] solved the optimization problem by minimizing the objective functions, producing optimal collision-free trajectories that minimize the length of the path followed by the robot while assuring that the generated trajectories stay at a safe distance from danger zones. In [6] the authors described the use of a GA for offline point-to-point mobile robot path planning. The problem consists of generating "valid" paths or trajectories, represented on a two-dimensional grid, with obstacles and dangerous ground that the robot must evade. This means that the GA optimizes possible paths based on two criteria: length and difficulty. However, some of these studies solved the multi objective optimization problem for mobile robot path planning without taking into account important issues such as minimum travelling time for trajectory generation [7, 8]. Even the few studies that included minimum travelling time in the multi objective problem ignored aspects such as the safety factor and security as objectives for trajectory generation.
In this work, we extend our approach in [10] by taking into account the travelling time as a fourth objective as well as reducing the energy consumption of the mobile robot. The approach is based on modifying A* and the GA to enhance the search for robot movement towards an optimal solution, in terms of a multi objective optimal path and trajectory for mobile robot navigation in a complex static environment. The traditional GA has drawbacks such as a huge population size, a large search space, and a poor ability to remove redundant individuals. Also, when the environment is complex and the number of obstacles increases, the GA might not find a solution or can face difficulties finding one. To address trajectory generation, we propose an approach to convert the optimal path into a trajectory; more precisely, a time-based profile of position and velocity is introduced.
The article is organized as follows. Section 2 describes the problem and case study; Section 3 describes the kinematics model of the mobile robot; the proposed approach, flowcharts and evaluation criteria are introduced in Section 4.
52 B.K. Oleiwi, H. Roth, and B.I. Kazem
2 Problem Definition
The mobile robot's mission is to travel from a start point to a goal point while minimizing travelling distance and travelling time and maintaining smooth (minimum-curvature), safe, collision-free movement in a 2D complex environment with stationary obstacles.
As shown in Fig. 1, even though the shortest path is the red line, it has the lowest security performance. The path with the best smoothness performance is the orange line, but unfortunately it is the longest. Although the green path has the best security performance, its length and smoothness are not the best [7, 8]. The trajectory marked in blue is a minimum-time smooth trajectory with minimum energy consumption but the lowest security performance. The multi objective optimal trajectory of the mobile robot is the dotted black line, a trade-off with the advantage that multiple solutions can be obtained in a single run.
Trajectories are time-based, whereas paths are based on non-time parameters. To obtain smooth movement, a parametric cubic spline function of the trajectory is used, and the trajectory must provide continuous velocity and acceleration. Accordingly, the kinematics model of the mobile robot is described in the next section.
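The cubic-spline interpolation just mentioned can be sketched as follows; this is a standard natural cubic spline in illustrative Python (the authors' exact spline formulation is not reproduced here). It gives continuous velocity and acceleration along each coordinate of the path, sampled against time:

```python
from bisect import bisect_right

def natural_cubic_spline(t, y):
    """Natural cubic spline through the knots (t[i], y[i]).
    Returns a callable S(x) with continuous first and second derivatives
    and S'' = 0 at both end knots."""
    n = len(t) - 1
    h = [t[i + 1] - t[i] for i in range(n)]
    # Tridiagonal system for the interior second derivatives M[1..n-1].
    sub = [h[i - 1] for i in range(1, n)]
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    sup = [h[i] for i in range(1, n)]
    rhs = [6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
           for i in range(1, n)]
    # Thomas algorithm: forward elimination, then back substitution.
    for k in range(1, len(diag)):
        m = sub[k] / diag[k - 1]
        diag[k] -= m * sup[k - 1]
        rhs[k] -= m * rhs[k - 1]
    M = [0.0] * (n + 1)                 # natural end conditions: M[0] = M[n] = 0
    for k in range(len(diag) - 1, -1, -1):
        M[k + 1] = (rhs[k] - sup[k] * M[k + 2]) / diag[k]

    def S(x):
        i = min(max(bisect_right(t, x) - 1, 0), n - 1)
        d1, d0 = t[i + 1] - x, x - t[i]
        return (M[i] * d1 ** 3 + M[i + 1] * d0 ** 3) / (6.0 * h[i]) \
            + (y[i] / h[i] - M[i] * h[i] / 6.0) * d1 \
            + (y[i + 1] / h[i] - M[i + 1] * h[i] / 6.0) * d0
    return S

# One path coordinate (here, arbitrary y-values) sampled against time.
S = natural_cubic_spline([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 1.0, 3.0])
```

Running one such spline per coordinate (x(t) and y(t)) turns a list of via-points into a smooth, time-parameterized trajectory.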
$\begin{bmatrix} v \\ \omega \end{bmatrix} = \begin{bmatrix} r/2 & r/2 \\ r/l & -r/l \end{bmatrix} \begin{bmatrix} \omega_R \\ \omega_L \end{bmatrix}$, (2)

where $\omega_L$ and $\omega_R$ are the angular velocities of the left and right wheels of the mobile robot, respectively. Both wheels have the same radius $r$, and the distance between the two wheels is $l$. For the robot's movement, the linear velocity $v$ and the angular velocity $\omega$ are chosen and can be obtained from (3) and (4):

$v = \dfrac{r(\omega_R + \omega_L)}{2}$, (3)

$\omega = \dfrac{r(\omega_R - \omega_L)}{l}$. (4)
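Assuming the standard differential-drive relations (3) and (4), a minimal sketch of the kinematics (illustrative Python, not the authors' code):

```python
import math

def wheel_to_body(omega_l, omega_r, r, l):
    """Map wheel angular velocities to body velocities for a differential
    drive with wheel radius r and wheel separation l (Eqs. (3) and (4))."""
    v = r * (omega_r + omega_l) / 2.0       # linear velocity
    omega = r * (omega_r - omega_l) / l     # angular velocity
    return v, omega

def step_pose(x, y, theta, v, omega, dt):
    """One Euler integration step of the unicycle kinematics."""
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

# Equal wheel speeds give pure translation; opposite speeds give pure rotation.
v, w = wheel_to_body(2.0, 2.0, r=0.05, l=0.3)
```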
The boundary conditions include the position, velocity and acceleration constraints of the robot [16, 17]:

$v(0) = v(t_T) = 0, \qquad a(0) = a(t_T) = 0$. (9)

The velocity and acceleration constraints mean that the mobile robot starts from rest at its initial position, accelerates to reach its maximum velocity, and, near the target location, decelerates to stop at the goal position.
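One common way to satisfy these rest-to-rest constraints is a trapezoidal velocity profile (accelerate, cruise, decelerate); the following sketch is illustrative Python, not the paper's own profile generation, and the numeric arguments are placeholders:

```python
def trapezoidal_profile(s_total, v_max, a_max):
    """Rest-to-rest velocity profile over distance s_total, honouring
    v(0) = v(t_T) = 0: accelerate at a_max, cruise at v_max, decelerate
    to stop.  Returns (t_total, v), where v is a sampler v(t)."""
    t_acc = v_max / a_max
    s_acc = 0.5 * a_max * t_acc ** 2
    if 2.0 * s_acc > s_total:           # short move: triangular profile
        t_acc = (s_total / a_max) ** 0.5
        v_peak, t_cruise = a_max * t_acc, 0.0
    else:
        v_peak = v_max
        t_cruise = (s_total - 2.0 * s_acc) / v_max
    t_total = 2.0 * t_acc + t_cruise

    def v(t):
        if t < 0.0 or t > t_total:
            return 0.0
        if t < t_acc:                   # acceleration phase
            return a_max * t
        if t < t_acc + t_cruise:        # cruise phase at peak velocity
            return v_peak
        return a_max * (t_total - t)    # deceleration phase, stops at the goal
    return t_total, v

t_total, v = trapezoidal_profile(s_total=27.799, v_max=0.8, a_max=0.4)
# v(0) and v(t_total) are both zero, as the rest-to-rest constraints require.
```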
4 Proposed Approach
The proposed approach, based on modifications of A* and the GA, is presented to enhance the search for robot movement towards the optimal solution state. The approach can find a multi objective optimal path and trajectory for mobile robot navigation and can be used in a complex static environment. The classical method and a modified A* search method are proposed in the initialization stage, for single and multiple objectives, to overcome the drawbacks of the GA. Also, to avoid falling into a local minimum in a complex static environment and to partly improve the best path, several genetic operators are proposed based on domain-specific knowledge and the characteristics of path planning, such as a deletion operator and a mutation operator enhanced with basic A*. The aim of this combination is to improve GA efficiency and path planning performance. In addition, the proposed approach receives its initial population from the classical method or the modified A*. For more details, see [10].
The general scheme of the proposed approach to the multi objective optimization of the path planning and trajectory generation problem can be defined in five main steps, as shown in Fig. 4 and the flowchart in Fig. 5. First, some definitions corresponding to the initialization stage are presented. We construct a 2D static map (an indoor area) with and without different numbers of static obstacles. A shortcut (decrease) operator is used to eliminate obstacle nodes from the map at the beginning of the algorithm. The next step is generating the initial population, i.e., generating and moving along sub-optimal feasible paths. In this stage, the classical method and the modified A* are used to generate a set of sub-optimal feasible paths in a simple map and a complex map, respectively. The paths obtained are then used to establish the initial population for the GA optimization. Here, the mobile robot moves in an indoor area and can move in any of eight directions (forward, backward, right, left, right-up, right-down, left-up, and left-down). There are two methods in this step. One can use the classical method, where the movement of the mobile robot is controlled by a transition rule function, which in turn depends on the Euclidean distance between two points; the roulette wheel method is used to select the next point and to avoid falling into a local minimum on the complex map. The robot moves through every feasible solution to find the optimal solution along favoured tracks that have a relatively short distance between two points, where the location of the mobile robot and the quality of the solution are maintained such that a sub-optimal solution can be obtained. When the number of obstacles increases, the classical method may face difficulties finding a solution or may not find one at all. Also, the more via-points are used, the more time-consuming the algorithm becomes, as it depends mostly on the number of via-points used on the path in the complex map. If the modified A* search algorithm is added in the initialization stage, the proposed GA approach will find a solution in any case, even if there are many obstacles. The A* algorithm is one of the most effective free-space search algorithms in terms of path length optimization (for a single objective). We propose a modified A* that searches for a sub-optimal feasible path regardless of length to establish the initial solution of the GA in a complex map, by adding a probability function to the A* algorithm. We have modified A* in order to avoid using only the shortest path, which could affect the path performance in terms of the multiple objectives (length, security and smoothness) in the initial stage.
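The exact probability function added to A* is not specified here; the sketch below shows a standard 8-connected grid A* with a hypothetical roulette-wheel choice among the few lowest-f open nodes standing in for that modification (illustrative Python, not the authors' code). The randomized expansion yields varied feasible paths rather than only the shortest one:

```python
import heapq
import math
import random
from itertools import count

def modified_astar(grid, start, goal, k_best=3, seed=0):
    """A* over an 8-connected grid (0 = free cell, 1 = obstacle).
    Instead of always expanding the single lowest-f node, a roulette
    wheel over the k_best lowest-f open nodes chooses the next
    expansion (an assumed stand-in for the paper's probability function)."""
    rng = random.Random(seed)
    tie = count()   # tiebreaker so heap tuples never compare nodes/parents
    h = lambda p: math.hypot(p[0] - goal[0], p[1] - goal[1])
    open_heap = [(h(start), next(tie), 0.0, start, None)]
    parent, g_cost = {}, {start: 0.0}
    while open_heap:
        cand = [heapq.heappop(open_heap)
                for _ in range(min(k_best, len(open_heap)))]
        weights = [1.0 / (1.0 + f) for f, *_ in cand]   # lower f, higher weight
        picked = cand.pop(rng.choices(range(len(cand)), weights=weights)[0])
        for item in cand:               # return unpicked candidates to the heap
            heapq.heappush(open_heap, item)
        _, _, g, node, prev = picked
        if node in parent:              # already expanded
            continue
        parent[node] = prev
        if node == goal:                # reconstruct path from start to goal
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nxt = (node[0] + dx, node[1] + dy)
                if (dx, dy) == (0, 0) or not (0 <= nxt[0] < len(grid)
                                              and 0 <= nxt[1] < len(grid[0])):
                    continue
                if grid[nxt[0]][nxt[1]]:
                    continue            # obstacle cell
                ng = g + math.hypot(dx, dy)
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = ng
                    heapq.heappush(open_heap,
                                   (ng + h(nxt), next(tie), ng, nxt, node))
    return None                         # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = modified_astar(grid, (0, 0), (2, 0))
```

Nothing is discarded when a non-minimal node is expanded, so the search remains complete; it merely diversifies which feasible path is found first.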
Each chromosome represents a set of genes (via-points) from the start position to the target position of the path. Since p(x0, y0) is always the starting point and p(xn, yn) is always the target point, the via-points of the path are p(x1, y1) through p(xn−1, yn−1), and all these points represent the genes of the chromosome, as shown in Fig. 3.
$F_1(P) = \sum_{i=0}^{n-1} \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}$, (12)

$F_2(P) = \sum_{i=1}^{n-2} \theta(\overline{p_i p_{i+1}}, \overline{p_{i+1} p_{i+2}}) + C_1 \times S$, (13)

$F_3(P) = \sum_{i=1}^{n-1} \dfrac{C_2}{\text{min\_dist}(\overline{p_i p_{i+1}}, OB^{*})}$, (14)

$F_4(P) = t_T$. (15)
where w1, ws, wc and wt represent the weight of each objective in the total cost F(P). F1(P) is the total length of the path; the criterion of path shortness is defined via the Euclidean distance between two points. F2(P) is the path smoothness, where θ(p_i p_{i+1}, p_{i+1} p_{i+2}) is the angle between the vectorial path segments p_i p_{i+1} and p_{i+1} p_{i+2} (0 ≤ θ ≤ π), C1 is a positive constant, and S is the number of line segments in the path. F3(P) is the path clearance or path security, where min_dist(p_i p_{i+1}, OB*) is the shortest distance between the path segment p_i p_{i+1} and its closest obstacle OB*; C2 is a positive constant whose purpose is to bring the numerical scope of F3(P) to the same order of magnitude as the previous two objective values. F4(P) represents the total time consumed by the robot motion, where t_T is the total time from the start to the target point. The weights of the shortness, smoothness, security and time fitness functions (w1, ws, wc and wt, respectively) are tuned through simulation and trial and error. By minimizing the overall fitness function with the assigned weights of each criterion, a suitable optimal path and trajectory can be obtained.
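Equations (12)–(15) and the weighted total cost can be sketched as follows (illustrative Python; C1, C2, the default weights, and the midpoint-based clearance are placeholder simplifications, not the authors' tuned values):

```python
import math

def path_objectives(path, obstacles, C1=1.0, C2=1.0, t_T=None):
    """Compute F1..F4 in the spirit of Eqs. (12)-(15) for a path given
    as [(x, y), ...].  obstacles: obstacle points; t_T: total travel time."""
    seg = lambda i: (path[i + 1][0] - path[i][0], path[i + 1][1] - path[i][1])
    n = len(path) - 1                   # number of line segments S
    # F1: total Euclidean length of the path.
    F1 = sum(math.hypot(*seg(i)) for i in range(n))
    # F2: sum of turning angles between consecutive segments, plus C1 * S.
    def angle(i):
        (ax, ay), (bx, by) = seg(i - 1), seg(i)
        dot = ax * bx + ay * by
        na, nb = math.hypot(ax, ay), math.hypot(bx, by)  # assumed non-zero
        return math.acos(max(-1.0, min(1.0, dot / (na * nb))))
    F2 = sum(angle(i) for i in range(1, n)) + C1 * n
    # F3: C2 over clearance, summed per segment.  Clearance is approximated
    # by the distance from each segment midpoint to its nearest obstacle
    # (a simplification of min_dist over the whole segment).
    def clearance(i):
        mx = (path[i][0] + path[i + 1][0]) / 2.0
        my = (path[i][1] + path[i + 1][1]) / 2.0
        return min(math.hypot(mx - ox, my - oy) for ox, oy in obstacles)
    F3 = sum(C2 / clearance(i) for i in range(n)) if obstacles else 0.0
    F4 = t_T if t_T is not None else 0.0
    return F1, F2, F3, F4

def total_cost(F, w=(1.0, 1.0, 1.0, 1.0)):
    """Weighted multi-objective cost F(P) = w1*F1 + ws*F2 + wc*F3 + wt*F4."""
    return sum(wi * fi for wi, fi in zip(w, F))

# A right-angle path of length 3 + 4 = 7 with one far-away obstacle.
F = path_objectives([(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)],
                    obstacles=[(10.0, 10.0)], t_T=5.0)
cost = total_cost(F)
```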
6 Simulation Results
The proposed approach has been tested in simple and complicated 2D static environments with different numbers of obstacles. MATLAB (on a 2.61 GHz CPU) was used for the simulation. Figs. 6(a-g), Figs. 7(a-g) and Tables 1, 2 and 3 show the execution of the program for the final Pareto-optimized path and trajectory in various maps.
Fig. 6(a): Optimal path planning in Map 1 (GA+A*: path length 27.799, 6 line segments, sum of angles 225, time 39.57 s, multi-objective function value 176.5372). Fig. 6(b): Relationship between optimal path length and iteration in Map 1.
Fig. 6(c): Objective function F versus iteration in Map 1 (converging to GA+A* = 176.5372). Fig. 6(d): Optimal path and trajectory, marked in blue and red respectively, in Map 1.
Fig. 6(e): X and Y coordinates versus time for the optimal trajectory (blue and red, respectively). Fig. 6(f): X- and Y-direction velocities versus time for the optimal trajectory.
Fig. 6(g): The final velocity profile for the optimal trajectory generation.
Fig. 6. (continued)

Table 1. Statistics of the performance index for the proposed approach in Map 1

  Performance Index       Proposed Approach
  Optimal path length     27.799
  No. of segments         6
  Sum of angles           225
  Travel time (sec.)      39.57
  Multi-objective value   179.5372
  Max. iteration (i)      50
Fig. 7(a): Optimal path planning in Map 2 (GA+A*: path length 45.4558, 11 line segments, sum of angles 540, time 66.02 s, multi-objective function value 241.0416). Fig. 7(b): Relationship between optimal path length and iteration in Map 2.
Fig. 7(c): Objective function F versus iteration in Map 2 (converging to GA+A* = 241.0416). Fig. 7(d): Optimal path and trajectory, marked in blue and red respectively, in Map 2.
Fig. 7(e): X and Y coordinates versus time for the optimal trajectory. Fig. 7(f): X- and Y-direction velocities (blue and red, respectively) for the optimal trajectory.
Fig. 7(g): The final velocity profile for the optimal trajectory generation.
Fig. 7. (continued)

Table 2. Statistics of the performance index for the proposed approach in Map 2
The proposed approach was thus tested to generate optimal collision-free path planning and trajectory generation in terms of length, smoothness, security and time in a complex static environment. The results for the multi objective optimal path and trajectory of the robot are shown in Figs. 6(a-d) and 7(a-d), respectively, and the statistics of the performance index for the proposed approach are given in Tables 1-3. Figs. 6(e) and 7(e) show the X and Y coordinates over time for the optimal trajectory in Maps 1 and 2, respectively. Figs. 6(f) and 7(f) show the velocity profiles in the X and Y directions for the mobile robot in Maps 1 and 2, respectively. The final velocity profile for the optimal trajectory generation is shown in Figs. 6(g) and 7(g). The simulation results show that the mobile robot travels successfully from one location to another and reaches its goal after avoiding all obstacles located in its way in all tested environments, which indicates that the proposed approach is accurate and can find a set of Pareto-optimal solutions efficiently in a single run, as shown in Fig. 8.
7 Conclusion
References
[1] Krishnan, P.S., Paw, J.K.S., Tiong, S.K.: Cognitive Map Approach for Mobility Path
Optimization using Multiple Objectives Genetic Algorithm. In: 4th IEEE International
Conference on Autonomous Robots and Agents, pp. 267–272 (2009)
[2] Yongnian, Z., Yongping, L., Lifang, Z.: An Improved Genetic Algorithm for Mobile
Robotic Path Planning. In: 24th IEEE Conference on Control and Decision Conference,
pp. 3255–3260 (2012)
[3] Panteleimon, Z., Ijspeert, A.J., Degallier, S.: Path Planning with the humanoid robot
iCub. Semester project, Biologically inspired Robotics Group, Birg (2009)
[4] Castillo, O., Trujillo, L.: Multiple Objective Optimization Genetic Algorithms for Path
Planning in Autonomous Mobile Robots. International Journal of Computers, Systems
and Signals 6(1), 48–63 (2005)
[5] Purcaru, C., Precup, R.E., Iercan, D., Fedorovici, L.O., David, R.C.: Hybrid PSO-GSA
Robot Path Planning Algorithm in Static Environments with Danger Zones. In: 17th IEEE
International Conference on System Theory, Control and Computing, pp. 434–439 (2013)
[6] Solteiro Pires, E.J., Tenreiro Machado, J.A., De Moura Oliveira, P.B.: Robot Trajectory
Planning Using Multi-objective Genetic Algorithm Optimization. In: Deb, K., Tari, Z.
(eds.) GECCO 2004. LNCS, vol. 3102, pp. 615–626. Springer, Heidelberg (2004)
[7] Jun, H., Qingbao, Q.: Multi-Objective Mobile Robot Path Planning based on Improved
Genetic Algorithm. In: IEEE International Conference on Intelligent Computation
Technology and Automation, vol. 2, pp. 752–756 (2010)
[8] Geetha, S., Chitra, G.M., Jayalakshmi, V.: Multi Objective Mobile Robot Path Planning
based on Hybrid Algorithm. In: 3rd IEEE International Conference on Electronics
Computer Technology, vol. 6, pp. 251–255 (2011)
[9] Gong, D.W., Zhang, J.H., Zhang, Y.: Multi-Objective Particle Swarm Optimization for
Robot Path Planning in Environment with Danger Sources. Journal of Computers 6(8),
1554–1561 (2011)
[10] Oleiwi, B.K., Roth, H., Kazem, B.: Modified Genetic Algorithm based on A*
algorithm of Multi objective optimization for Path Planning. In: 6th International
Conference on Computer and Automation Engineering, vol. 2(4), pp. 357–362 (2014)
[11] Trajano, T.A.A., Armando, A.M.F., Max, M.S.D.: Parametric Trajectory Generation for
Mobile Robots. ABCM Symposium Series in Mechatronics, vol. 3, pp. 300–307 (2008)
[12] Sedaghat, N.: Mobile Robot Path Planning by New Structured Multi-Objective Genetic
Algorithm. In: IEEE International Conference on Soft Computing and Pattern
Recognition, pp. 79–83 (2011)
[13] Vivekananthan, R., Karunamoorthy, L.: A Time Optimal Path Planning for Trajectory
Tracking of Wheeled Mobile Robots. Journal of Automation, Mobile Robotics &
Intelligent Systems 5(2), 35–41 (2011)
[14] Alves, S.F.R., Rosario, J.M., Filho, H.F., Rincon, L.K.A., Yamasaki, R.A.T.: Conceptual
Bases of Robot Navigation Modeling Control and Applications. Alejandra Barrera (2011)
[15] Xianhua, J., Motai, Y., Zhu, X.: Predictive fuzzy control for a mobile robot with non-
holonomic constraints. In: IEEE Mid-Summer Workshop on Soft Computing in
Industrial Applications, Helsinki University of Technology, Espoo, Finland (2005)
[16] Breyak, M., Petrovic, I.: Time Optimal Trajectory Planning Along Predefined Path for
Mobile Robots with Velocity and Acceleration Constraints. In: IEEE/ASME
International Conference on Advanced Intelligent Mechatronics, pp. 942–947 (2011)
[17] Arshad, M., Choudhry, M.A.: Trajectory Planning of Mobile robot in Unstructured
Environment for Multiple Objects. Mehran University Research Journal of Engineering
& Technology 31(1), 39–50 (2012)
[18] Fonseca, C.M., Fleming, P.J.: An Overview of Evolutionary Algorithms in Multi-
objective Optimization. Evolutionary Computing 3(1), 1–16 (1995)
New Procedures of Pattern Classification
for Vibration-Based Diagnostics via Neural Network
1 Introduction
Machines and structural components require continuous monitoring for the detection of fatigue cracks and crack growth to ensure uninterrupted service.
Non-destructive testing methods like ultrasonic testing, X-ray, etc., are generally
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 63–75, 2014.
© Springer International Publishing Switzerland 2014
64 N. Nechval, K. Nechval, and I. Bausova
useful for the purpose. These methods are costly and time consuming for long
components, e.g., railway tracks, long pipelines, etc. Vibration-based methods can
offer advantages in such cases [1]. This is because measurement of vibration
parameters like natural frequencies is easy. Further, this type of data can be easily
collected from a single point of the component. This factor lends some advantages for
components, which are not fully accessible. This also helps to do away with the
collection of experimental data from a number of data points on a component, which
is involved in a prediction based on, for example, mode shapes. Nondestructive
evaluation (NDE) of structures using vibration for early detection of cracks has
gained popularity over the years and, in the last decade in particular, substantial
progress has been made in that direction. Almost all crack diagnosis algorithms based
on dynamic behaviour call for a reference signature. The latter is measured on an
identical uncracked structure or on the same structure at an earlier stage. The dynamics of cracked rotors has been a subject of great interest for the last three decades, and crack detection and monitoring have recently gained increasing importance. Failures of any high-speed rotating components (jet engine rotors, centrifuges, high-speed fans, etc.) can be very dangerous to surrounding equipment and personnel (see Fig. 1) and must always be avoided.
Jet engine disks operate under high centrifugal and thermal stresses. These stresses
cause microscopic damage as a result of each flight cycle as the engine starts from the
cold state, accelerates to maximum speed for take-off, remains at speed for cruise,
then spools down after landing and taxi. The cumulative effect of this damage over
time creates a crack at a location where high stress and a minor defect combine to
create a failure initiation point. As each flight operation occurs, the crack is enlarged
by an incremental distance. If allowed to continue to a critical dimension, the crack
would eventually cause the burst of the disk and lead to catastrophic failure (burst) of
the engine. Engine burst in flight is rarely survivable.
In this paper, we will focus on aircraft or jet engines, which are a special class of
gas turbine engines. Typically, physical faults in a gas turbine engine include problems such as erosion, corrosion, fouling, built-up dirt, foreign object damage (FOD),
worn seals, burned or bowed blades, etc. These physical faults can occur individually
or in combination and cause changes in performance characteristics of the
compressors, and in their expansion and compression efficiencies. In addition, the
faults cause changes in the turbine and exhaust system nozzle areas. These changes in
the performance of the gas turbine components result in changes in the measurement
parameters, which are therefore dependent variables.
In this paper, we look at a problem where vibration characteristics are used for gas turbine diagnostics. The present work focuses on turbine blade damage. Turbine
blades undergo cyclic loading causing structural deterioration, which can lead to
failure. It is important to know how much damage has taken place at any particular
time to monitor the condition or health of the blade and to avoid any catastrophic
failure of the blades. Several studies look at damage at a given time during the
operational history of the structure. This is typically called diagnostics and involves
detection, location, and isolation of damage from a set of measured variables. The
detection function is most fundamental, as it points out if the damage is present or not.
However, some level of damage due to microcracks and other defects is always
present in a structure. The important issue of indicating when to detect damage
depends on how much life is left in the structure. It is not advantageous to detect
small levels of damage in a structure. It would be useful if damage detection were
triggered some time before final failure. The subject of prognostics involves
predicting the evolution of structural or vibrational characteristics of the system with
time and is important for prediction of failure due to operational deterioration. Some
recent studies have considered dynamical systems approaches to model damage
growth based on differential equations [2], while others have used physics-based
models [3]. The stiffness of the structure is gradually reduced with crack growth, and
stiffness is related to the vibrational characteristics of the structure. The decreased
frequency shows that stiffness of the structure is decreasing, and thus serves as a
damage indicator for monitoring crack growth in the structure. Selected studies have
looked at modeling turbine blades as rotating Timoshenko beams with twist and taper
[4–6]. Some studies have addressed damage in such beams using vibrational
characteristics [7, 8]. However, these studies typically address damage at a given time
point in the operational history and do not look at the effect of damage growth on the
vibrational characteristics. In addition, turbine blades are designed to sustain a
considerable amount of accumulated damage prior to failure. Therefore, it is desirable
to indicate that a blade is damaged at the point when its operational life is almost
over.
$z = \mathbf{w}'\mathbf{y} = (\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2)' S_{12}^{-1} \mathbf{y}$, (1)

$J_F = \dfrac{[\mathbf{w}'(\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2)]^2}{\mathbf{w}' S_{12} \mathbf{w}}$, (2)

where $S_{12}$ is the pooled within-class covariance matrix in its bias-corrected form; $S_1$ and $S_2$ are the unbiased estimates of the covariance matrices of classes $C_1$ and $C_2$, respectively, and there are $n_i$ observations in class $C_i$ ($n_1 + n_2 = n$). The solution for $\mathbf{w}$ that maximizes $J_F$ can be obtained by differentiating $J_F$ with respect to $\mathbf{w}$ and equating to zero. This yields
$\dfrac{2\mathbf{w}'(\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2)}{\mathbf{w}' S_{12} \mathbf{w}} \left[ (\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2) - \dfrac{\mathbf{w}'(\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2)}{\mathbf{w}' S_{12} \mathbf{w}} S_{12} \mathbf{w} \right] = \mathbf{0}.$ (4)

$\mathbf{w} \propto S_{12}^{-1} (\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2).$ (5)
We may take equality without loss of generality. For convenience we speak of
classifying y rather than classifying the subject or object associated with y.
−1
z1 = w′y1 = ( y1 − y 2 )′S12 y1, (6)
−1
z 2 = w′y 2 = ( y1 − y 2 )′S12 y2. (7)
This is true in general because $\bar{z}_1$ is always greater than $\bar{z}_2$, which can easily be shown as follows:
$$\bar{z}_1 - \bar{z}_2 = \mathbf{w}'(\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2) = (\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2)'\mathbf{S}_{12}^{-1}(\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2) > 0, \quad (9)$$

because $\mathbf{S}_{12}^{-1}$ is positive definite. Thus $\bar{z}_1 > \bar{z}_2$. [If $\mathbf{w}$ were of the form $\mathbf{w}' = (\bar{\mathbf{y}}_2 - \bar{\mathbf{y}}_1)'\mathbf{S}_{12}^{-1}$, then $\bar{z}_2 - \bar{z}_1$ would be positive.] Since $(\bar{z}_1 + \bar{z}_2)/2$ is the midpoint, $z > (\bar{z}_1 + \bar{z}_2)/2$ implies that $z$ is closer to $\bar{z}_1$. By (9) the distance from $\bar{z}_1$ to $\bar{z}_2$ is the same as that from $\bar{\mathbf{y}}_1$ to $\bar{\mathbf{y}}_2$.
$$\frac{\bar{z}_1 + \bar{z}_2}{2} = \frac{\mathbf{w}'(\bar{\mathbf{y}}_1 + \bar{\mathbf{y}}_2)}{2} = \frac{(\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2)'\mathbf{S}_{12}^{-1}(\bar{\mathbf{y}}_1 + \bar{\mathbf{y}}_2)}{2}. \quad (10)$$
New Procedures of Pattern Classification for Vibration-Based Diagnostics 69
Assign $\mathbf{y}$ to $C_1$ if

$$\mathbf{w}'\mathbf{y} = (\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2)'\mathbf{S}_{12}^{-1}\mathbf{y} > \frac{(\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2)'\mathbf{S}_{12}^{-1}(\bar{\mathbf{y}}_1 + \bar{\mathbf{y}}_2)}{2}, \quad (11)$$

and assign $\mathbf{y}$ to $C_2$ if

$$\mathbf{w}'\mathbf{y} = (\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2)'\mathbf{S}_{12}^{-1}\mathbf{y} < \frac{(\bar{\mathbf{y}}_1 - \bar{\mathbf{y}}_2)'\mathbf{S}_{12}^{-1}(\bar{\mathbf{y}}_1 + \bar{\mathbf{y}}_2)}{2}. \quad (12)$$
Fisher’s approach [9] using (11) and (12) is essentially nonparametric because no
distributional assumptions were made. However, if the two populations are normal
with equal covariance matrices, then this method is (asymptotically) optimal; that is,
the probability of misclassification is minimized.
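The two-class rule of eqs. (5) and (10)-(12) is straightforward to implement. The following NumPy sketch (the function name is illustrative, not from the paper) computes $\mathbf{w}$ and the midpoint cutoff:

```python
import numpy as np

def fisher_rule(Y1, Y2, y):
    """Two-class Fisher rule, eqs. (5), (10)-(12): a sketch.
    Y1, Y2 are (n_i x p) sample matrices; y is a new p-vector."""
    m1, m2 = Y1.mean(axis=0), Y2.mean(axis=0)
    n1, n2 = len(Y1), len(Y2)
    # pooled within-class covariance matrix, bias-corrected form, eq. (3)
    S12 = ((n1 - 1) * np.cov(Y1, rowvar=False)
           + (n2 - 1) * np.cov(Y2, rowvar=False)) / (n1 + n2 - 2)
    w = np.linalg.solve(S12, m1 - m2)        # w = S12^{-1}(m1 - m2), eq. (5)
    midpoint = w @ (m1 + m2) / 2             # (z1 + z2)/2, eq. (10)
    return 1 if w @ y > midpoint else 2      # eqs. (11)-(12)
```

Taking `np.linalg.solve` instead of an explicit matrix inverse is a standard numerical choice; the decision itself only depends on which side of the midpoint the projection $\mathbf{w}'\mathbf{y}$ falls.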
If $\mathbf{y}$ has been embedded in the sample from $C_i$, the Mahalanobis distance between the two vectors $\bar{\mathbf{y}}_{\bullet i}$ and $\bar{\mathbf{y}}_j$ is given by
If $\mathbf{y}$ has been embedded in the sample from $C_j$, the Mahalanobis distance between the two vectors $\bar{\mathbf{y}}_i$ and $\bar{\mathbf{y}}_{j\bullet}$ is given by
Let $d_r(\mathbf{y})$, $r \in \{1, 2, \dots, k\}$, be the total Mahalanobis distance in the case of pattern classification into $k$ classes, where
$$d_{ij} = d_{\bullet ij}, \ \text{if } i = r, \quad (17)$$

and

$$d_{ij} = d_{ij\bullet}, \ \text{if } j = r. \quad (18)$$
Then the classification rule becomes: Assign $\mathbf{y}$ to the class $C_r$, $r \in \{1, 2, \dots, k\}$, for which $d_r(\mathbf{y})$ is largest.
If (Σ1 = Σ2 = · · · = Σk) does not hold, then instead of the pooled sample covariance
matrix
$$\mathbf{S}_{ij} = \frac{(n_i - 1)\mathbf{S}_i + (n_j - 1)\mathbf{S}_j}{n_i + n_j - 2} \quad (19)$$
we use
$$\mathbf{S}_{ij} = \frac{\mathbf{S}_i}{n_i} + \frac{\mathbf{S}_j}{n_j}. \quad (20)$$
Classification via Total Generalized Euclidean Distance. Let us assume that each of
the k populations has the same covariance matrix (Σ1 = Σ2 = · · · = Σk). We can
estimate the common population covariance matrix by a pooled sample covariance
matrix
$$\mathbf{S}_{pl} = \left(\sum_{i=1}^{k}(n_i - 1)\mathbf{S}_i\right)\left(\sum_{i=1}^{k} n_i - k\right)^{-1}, \quad (21)$$
where $n_i$ and $\mathbf{S}_i$ are the sample size and covariance matrix of the $i$th class. The generalized Euclidean distance between the two vectors $\bar{\mathbf{y}}_i$ and $\bar{\mathbf{y}}_j$, where $i, j \in \{1, 2, \dots, k\}$, $i \neq j$, is given by
$$\tilde{d}_{ij} = \frac{(\bar{\mathbf{y}}_i - \bar{\mathbf{y}}_j)'(\bar{\mathbf{y}}_i - \bar{\mathbf{y}}_j)}{|\mathbf{S}_{pl}|}, \quad (22)$$
If $\mathbf{y}$ has been embedded in the sample from $C_i$, then the generalized Euclidean distance between the two vectors $\bar{\mathbf{y}}_{\bullet i}$ and $\bar{\mathbf{y}}_j$ is given by
$$\tilde{d}_{\bullet ij} = \frac{(\bar{\mathbf{y}}_{\bullet i} - \bar{\mathbf{y}}_j)'(\bar{\mathbf{y}}_{\bullet i} - \bar{\mathbf{y}}_j)}{|\mathbf{S}_{pl(\bullet i)}|}. \quad (23)$$
If $\mathbf{y}$ has been embedded in the sample from $C_j$, then the generalized Euclidean distance between the two vectors $\bar{\mathbf{y}}_i$ and $\bar{\mathbf{y}}_{j\bullet}$ is given by
$$\tilde{d}_{ij\bullet} = \frac{(\bar{\mathbf{y}}_i - \bar{\mathbf{y}}_{j\bullet})'(\bar{\mathbf{y}}_i - \bar{\mathbf{y}}_{j\bullet})}{|\mathbf{S}_{pl(j\bullet)}|}. \quad (24)$$
Let
$$\tilde{d}_r(\mathbf{y}) = \sum_{i=1}^{k-1}\sum_{j=i+1}^{k} \tilde{d}_{ij}, \quad r \in \{1, 2, \dots, k\}, \quad (25)$$
be the total generalized Euclidean distance in the case of pattern classification into k
classes, where
$$\tilde{d}_{ij} = \tilde{d}_{\bullet ij}, \ \text{if } i = r, \quad (26)$$

and

$$\tilde{d}_{ij} = \tilde{d}_{ij\bullet}, \ \text{if } j = r. \quad (27)$$
Then the classification rule becomes: Assign $\mathbf{y}$ to the class $C_r$, $r \in \{1, 2, \dots, k\}$, for which $\tilde{d}_r(\mathbf{y})$ is largest.
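The embedding procedure of eqs. (21)-(27) can be sketched as follows. As an implementation assumption, the pooled covariance matrix of eq. (21) is recomputed with $\mathbf{y}$ embedded in the candidate class; on the worked example of eq. (34) below, this sketch reproduces the values reported in eq. (39):

```python
import numpy as np
from itertools import combinations

def total_gen_euclidean(samples, y):
    """Total generalized Euclidean distances d~_r(y), eqs. (21)-(27): a sketch.
    `samples` is a list of (n_i x p) class sample matrices, `y` a new p-vector.
    Returns one score per class; y is assigned to the class with the largest."""
    k = len(samples)
    scores = []
    for r in range(k):
        # embed y into the sample of class r
        emb = [np.vstack([S, y]) if i == r else S for i, S in enumerate(samples)]
        means = [S.mean(axis=0) for S in emb]
        # pooled sample covariance matrix with y embedded, eq. (21)
        pooled = sum((len(S) - 1) * np.cov(S, rowvar=False) for S in emb)
        Spl = pooled / (sum(len(S) for S in emb) - k)
        det = np.linalg.det(Spl)
        # total distance over all class pairs, eqs. (22)-(25)
        scores.append(sum(float((means[i] - means[j]) @ (means[i] - means[j])) / det
                          for i, j in combinations(range(k), 2)))
    return scores
```

The score of each class reflects how much the total between-class separation grows (relative to the pooled generalized variance) when the new observation is absorbed into that class.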
If ($\Sigma_1 = \Sigma_2 = \cdots = \Sigma_k$) does not hold, then instead of $\mathbf{S}_{pl}$ we use

$$\mathbf{S}^{\circ} = \sum_{i=1}^{k} \frac{\mathbf{S}_i}{n_i}. \quad (28)$$
Classification via Total Modified Euclidean Distance. Let us assume that each of the
k populations has the same covariance matrix (Σ1 = Σ2 = · · · = Σk). The modified
Euclidean distance between the two vectors $\bar{\mathbf{y}}_i$ and $\bar{\mathbf{y}}$, $i \in \{1, 2, \dots, k\}$, is given by
$$d_i = \frac{(\bar{\mathbf{y}}_i - \bar{\mathbf{y}})'(\bar{\mathbf{y}}_i - \bar{\mathbf{y}})}{|\mathbf{S}_{pl}|}, \quad (29)$$
where
$$\bar{\mathbf{y}} = \sum_{i=1}^{k} n_i \bar{\mathbf{y}}_i \Big/ \sum_{i=1}^{k} n_i \quad (30)$$
represents the 'overall average'. If $\mathbf{y}$ has been embedded in the sample from $C_i$, then the modified Euclidean distance between the two vectors $\bar{\mathbf{y}}_{\bullet i}$ and $\bar{\mathbf{y}}$ is given by
$$d_{\bullet i} = \frac{(\bar{\mathbf{y}}_{\bullet i} - \bar{\mathbf{y}})'(\bar{\mathbf{y}}_{\bullet i} - \bar{\mathbf{y}})}{|\mathbf{S}_{pl(\bullet i)}|}, \quad i \in \{1, 2, \dots, k\}. \quad (31)$$
Let
$$d_r(\mathbf{y}) = \sum_{i=1}^{k} d_i, \quad r \in \{1, 2, \dots, k\}, \quad (32)$$
be the total modified Euclidean distance in the case of pattern classification into k
classes, where
$$d_i = d_{\bullet i}, \ \text{if } i = r. \quad (33)$$

Then the classification rule becomes: Assign $\mathbf{y}$ to the class $C_r$, $r \in \{1, 2, \dots, k\}$, for which $d_r(\mathbf{y})$ is largest.
If (Σ1 = Σ2 = · · · = Σk) does not hold, then instead of Spl we use S° (28).
Consider the observations on p=2 variables from k=3 populations (classes) [10]. The
input data samples are given below.
$$C_1 = \begin{bmatrix} -2 & 5 \\ 0 & 3 \\ -1 & 1 \end{bmatrix}; \quad C_2 = \begin{bmatrix} 0 & 6 \\ 2 & 4 \\ 1 & 2 \end{bmatrix}; \quad C_3 = \begin{bmatrix} 1 & -2 \\ 0 & 0 \\ -1 & -4 \end{bmatrix}. \quad (34)$$
We found that

$$\bar{\mathbf{y}}_1 = \begin{bmatrix} -1 \\ 3 \end{bmatrix}, \quad \bar{\mathbf{y}}_2 = \begin{bmatrix} 1 \\ 4 \end{bmatrix}, \quad \bar{\mathbf{y}}_3 = \begin{bmatrix} 0 \\ -2 \end{bmatrix}, \quad \bar{\mathbf{y}} = \begin{bmatrix} 0 \\ 5/3 \end{bmatrix}, \quad (35)$$

$$\mathbf{S}_{pl} = \begin{bmatrix} 1 & -0.33333 \\ -0.33333 & 4 \end{bmatrix}. \quad (36)$$
Suppose that we have to classify the new observation y′=[1, 3] into the above classes.
Let us assume that each of the k=3 populations has the same covariance matrix (Σ1 =
Σ2 = Σ3).
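The quantities in eqs. (35) and (36) can be checked numerically with a few lines of NumPy:

```python
import numpy as np

# Class samples from eq. (34): k = 3 classes, three observations on p = 2 variables
C1 = np.array([[-2, 5], [0, 3], [-1, 1]], float)
C2 = np.array([[0, 6], [2, 4], [1, 2]], float)
C3 = np.array([[1, -2], [0, 0], [-1, -4]], float)
samples = [C1, C2, C3]

means = [C.mean(axis=0) for C in samples]                       # eq. (35)
overall = sum(len(C) * m for C, m in zip(samples, means)) / 9   # eq. (30)
# pooled sample covariance matrix, eq. (21)
Spl = sum((len(C) - 1) * np.cov(C, rowvar=False) for C in samples) / (9 - 3)
```

Running this reproduces the class means, the overall average (0, 5/3)', and the pooled covariance matrix of eq. (36).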
Thus, since
$$\tilde{d}_2(\mathbf{y}) = \max_r \tilde{d}_r(\mathbf{y}), \quad (38)$$
we assign y to class C2.
Classification via Total Generalized Euclidean Distance. It follows from (25) that
$$\tilde{d}_1(\mathbf{y}) = 15.14, \quad \tilde{d}_2(\mathbf{y}) = 21.91, \quad \tilde{d}_3(\mathbf{y}) = 7.51. \quad (39)$$
Thus, since
$$\tilde{d}_2(\mathbf{y}) = \max_r \tilde{d}_r(\mathbf{y}), \quad (40)$$

we assign $\mathbf{y}$ to class $C_2$.
Classification via Total Modified Euclidean Distance. It follows from (32) that
$$d_1(\mathbf{y}) = 5.066, \quad d_2(\mathbf{y}) = 7.312, \quad d_3(\mathbf{y}) = 2.596. \quad (41)$$
Thus, since
$$d_2(\mathbf{y}) = \max_r d_r(\mathbf{y}), \quad (42)$$

we assign $\mathbf{y}$ to class $C_2$.
normality is present as well, since only good discrimination can ensure good allocation. In practice, we often need to analyze input data samples which are not adequate for Fisher's classification rule: the distributions of the groups are not multivariate normal, their covariance matrices differ, or strong nonlinearities are present. An example of a pattern classification situation which produces non-linear separation of classes, and is thus not adequate for Fisher's classification rule, is clear from the illustration shown in Fig. 3.
One solution to this problem is the application of kernels to the input data, which essentially transforms the input to a higher-dimensional space, in which the probability of linearly separating the classes is higher.
This paper proposes improved approaches to pattern classification for vibration-based diagnostics via neural networks, representing new distance-based embedding procedures that take into account the cases which are not adequate for Fisher's classification rule. Moreover, these approaches allow one to classify sets of multivariate observations, where each of the sets contains more than one observation. For the cases which are adequate for Fisher's classification rule, the proposed approaches give results similar to those of FLDA.
The methodology described here can be extended in several different directions to
handle various problems of pattern classification (recognition) that arise in practice
(in particular, the problem of changepoint detection in a sequence of multivariate
observations).
Acknowledgments. This research was supported in part by Grant No. 06.1936, Grant
No. 07.2036, and Grant No. 09.1014 from the Latvian Council of Science and the
National Institute of Mathematics and Informatics of Latvia.
References
1. Dimarogonas, A.D.: Vibration of Cracked Structures: a State of the Art Review.
Engineering Fracture Mechanics 55, 831–857 (1996)
2. Adams, D.E., Nataraju, M.: A Nonlinear Dynamics System for Structural Diagnosis and
Prognosis. International Journal of Engineering Science 40, 1919–1941 (2002)
3. Roy, N., Ganguli, R.: Helicopter Rotor Blade Frequency Evolution with Damage Growth
and Signal Processing. Journal of Sound and Vibration 283, 821–851 (2005)
4. Krupka, R.M., Baumanis, A.M.: Bending-Bending Mode of Rotating Tapered Twisted
Turbo Machine Blades Including Rotary Inertia and Shear Deflection. Journal of
Engineering for Industry 91, 10–17 (1965)
5. Thomas, J., Abbas, B.A.H.: Finite Element Model for Dynamic Analysis of Timoshenko
Beam. Journal of Sound and Vibration 41, 291–299 (1975)
6. Rao, S.S., Gupta, R.S.: Finite Element Vibration Analysis of Rotating Timoshenko Beams.
Journal of Sound and Vibration 242, 103–124 (2001)
7. Takahashi, I.: Vibration and Stability of Non-Uniform Cracked Timoshenko Beam
Subjected to Follower Force. Computers and Structures 71, 585–591 (1999)
8. Hou, J., Wicks, B.J., Antoniou, R.A.: An Investigation of Fatigue Failures of Turbine
Blades in a Gas Turbine Engine by Mechanical Analysis. Engineering Failure Analysis 9,
201–211 (2000)
9. Fisher, R.: The Use of Multiple Measurements in Taxonomic Problems. Ann. Eugenics 7,
178–188 (1936)
10. Nechval, N.A., Purgailis, M., Skiltere, D., Nechval, K.N.: Pattern Recognition Based on
Comparison of Fisher’s Maximum Separations. In: Proceedings of the 7th International
Conference on Neural Networks and Artificial Intelligence (ICNNAI 2012), Minsk,
Belarus, pp. 65–69 (2012)
11. Choi, S.W., Park, J.H., Lee, I.B.: Process Monitoring Using a Gaussian Mixture Model via
Principal Component Analysis and Discriminant Analysis. Computers and Chemical
Engineering 28, 1377–1387 (2004)
12. Chiang, L.H., Russell, E.L., Braatz, R.D.: Fault Diagnosis in Chemical Processes Using
Fisher Discriminant Analysis, Discriminant Partial Least Squares, and Principal
Component Analysis. Chemometrics and Intelligent Laboratory Systems 50, 243–252
(2000)
Graph Neural Networks for 3D Bravais Lattices
Classification
1 Introduction
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 76–86, 2014.
© Springer International Publishing Switzerland 2014
introducing the idea of the unfolded network. Finally, two models were created to
process graph data without creating a lossless encoding: the Graph Machines [5]
and the Graph Neural Networks [6]. Both models are designed to learn an en-
coding of graphs which is sufficient for a graph classification/regression task.
Such an approach proved to work well in various domains. Graph Machines were
successfully used in many QSAR and QSPR tasks [7,8], including most recently
prediction of fuel compounds properties [9]. Graph Neural Networks were used in
XML document mining [10], web page ranking [11], spam detection [12], object
localisation in images [13] and in image classification [14].
In the field of material technology, the properties of a material often depend
on defects in its crystal structure. Such defects may consist of a vacant site
in the crystal lattice or of a basis substitution. This article describes how three
dimensional crystal structures, such as Bravais lattices, can be processed directly
by using Graph Neural Networks. By using a graph-oriented model, the spatial
location of lattice points can be described. By using node labels corresponding
to lattice points, information about the basis can be taken into account.
2.1 Data
A single GNN model is built for a set of graphs (the dataset). Each graph be-
longing to the dataset can have a different structure. The whole dataset can be
78 A. Barcz and S. Jankowski
thus represented as a single disconnected graph. Each nth graph node is repre-
sented by its label ln of constant size |ln | ≥ 1. Each directed edge u ⇒ n (from
uth to nth node) is represented by edge label l(n,u) of constant size |l(n,u) | ≥ 0.
The edge label size may differ from the node label size. In the implemented
model undirected edges were represented as pairs of directed edges to account
for the mutual impact of connected nodes. For each nth graph node an output
on of constant size can be sought - we say it’s a node-oriented task. Alterna-
tively, a single output og can be sought for the whole graph - in such case we say
it’s a graph-oriented task. In this paper a node-oriented approach was chosen,
as a potentially more flexible one.
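The data representation described above can be sketched as follows; the class and field names are illustrative assumptions, not taken from the original implementation:

```python
# A minimal sketch of the dataset representation described above; class and
# field names are illustrative, not from the original implementation.
from dataclasses import dataclass, field

@dataclass
class Graph:
    labels: dict                                 # node n -> label l_n, |l_n| >= 1
    edges: dict = field(default_factory=dict)    # (u, n) -> edge label l_(n,u)
    outputs: dict = field(default_factory=dict)  # node n -> target o_n (node-oriented task)

    def add_undirected_edge(self, u, n, label_un, label_nu):
        """Store an undirected edge as a pair of directed edges, so the
        mutual impact of the connected nodes is accounted for."""
        self.edges[(u, n)] = label_un
        self.edges[(n, u)] = label_nu

# two nodes with scalar labels, one undirected edge with 3D edge labels
g = Graph(labels={0: [1.0], 1: [1.0]})
g.add_undirected_edge(0, 1, [1.0, 0.0, 0.0], [-1.0, 0.0, 0.0])
```

Because every graph in the dataset may have a different structure, a dataset is simply a collection of such objects, equivalently viewable as one large disconnected graph.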
The GNN model consists of two computational units: the transition unit fw and
the output unit gw . The vector of parameters w is distinct for each unit, however,
for consistency with previous publications it will be denoted by w for both units.
The transition unit is used to build the encoded representation of each node, xn
(the state of nth node). The output unit is used to calculate the output on for
each node. Let’s denote by ne[n] the set of neighbors of the nth node, that is such
nodes u that are connected to the nth node with a directed edge u ⇒ n. Let’s
further denote by co[n] the set of directed edges from ne[n] to the nth node. For
this implementation the following transition and output functions were used:
$$x_n = f_w\big(l_n,\, l_{co[n]},\, x_{ne[n]},\, l_{ne[n]}\big), \quad (1)$$

$$o_n = g_w(x_n). \quad (2)$$
The GNN model offers two forms of the fw function, one better suited for po-
sitional graphs, the other one for non-positional graphs. To make processing
non-positional graphs possible, the non-positional form of fw was chosen [6],
which can be described as a simple sum of hw unit outputs. This yields the final
set of equations describing the model:
$$x_n = \sum_{u \in ne[n]} h_w\big(l_n,\, l_{(n,u)},\, x_u\big), \quad (3)$$
$$o_n = g_w(x_n). \quad (4)$$
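The fixed-point state computation implied by eq. (3) can be sketched as follows. Here a graph is a plain dict of node labels and directed-edge labels, and `h_w` is any callable standing in for the trained transition network; both are illustrative assumptions:

```python
import numpy as np

def transition_step(graph, x, h_w):
    """One synchronous application of eq. (3):
    x_n = sum over u in ne[n] of h_w(l_n, l_(n,u), x_u)."""
    new_x = {}
    for n, l_n in graph["labels"].items():
        contribs = [h_w(l_n, lab, x[u])
                    for (u, m), lab in graph["edges"].items() if m == n]
        new_x[n] = sum(contribs) if contribs else np.zeros_like(x[n])
    return new_x

def compute_states(graph, h_w, s=2, max_steps=200, tol=1e-6):
    """Iterate the global transition function until the global state
    converges to a fixed point (Banach iteration)."""
    x = {n: np.zeros(s) for n in graph["labels"]}
    for _ in range(max_steps):
        new_x = transition_step(graph, x, h_w)
        if max(np.abs(new_x[n] - x[n]).max() for n in x) < tol:
            return new_x
        x = new_x
    return x
```

The iteration converges only when the global transition is a contraction map, which is exactly the requirement discussed later in the text.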
The $h_w$ and $g_w$ units were implemented as fully-connected three-layer neural networks. For both networks the hidden layers consisted of tanh neurons. The output layer of the $h_w$ network consisted of tanh neurons, as the state $x_n$ must consist of bounded values only. The output layer of the $g_w$ network consisted of neurons with a linear activation function. During initialization, the weights of both networks were set so that the input to every $j$th neuron, $net_j$, was bounded assuming normalised input: $net_j \in (-1, 1)$. The initial weights corresponding to the state inputs were divided by an additional factor, i.e. the maximum node indegree, to take into account the fact that the state consists of a sum of $h_w$ outputs. All the input data (node and edge labels) was normalised before feeding it to the model. The $f_w$ and $g_w$ functions are presented in Fig. 1, together with one of the corresponding edges, where a comma-separated list of inputs stands for a vector obtained by stacking all the listed values one after another.
Fig. 1. The $f_w$ and $g_w$ functions for a single node and one of the corresponding edges
additional effort. Let’s define the global state x as the set of all node states.
Let’s define l as the set of all node and edge labels. Let’s further define o as
the set of all node outputs. The global transition function Fw and global output
function Gw , being the unfolded network counterparts of fw and gw , are defined
as follows:
x = Fw (l, x) , (5)
o = Gw (x) . (6)
The global state $x$ is computed at each time step and is expected to converge to $\hat{x}$ after a finite number of steps. Then, the output $o_n$ is calculated by the $g_w$ units. The output error $e_n = (d_n - o_n)^2$ (where $d_n$ stands for the expected output) is calculated and backpropagated through the output units, yielding $\frac{\partial e_w}{\partial o} \cdot \frac{\partial G_w}{\partial x}(\hat{x})$. That value is backpropagated through the unfolded network; at each time layer the error is injected into the $f_w$ layer. In such a way the error backpropagated through the $f_w$ layer at time $t_i$ comes from two sources. Firstly, it is the original output error of the network, $\frac{\partial e_w}{\partial o} \cdot \frac{\partial G_w}{\partial x}(\hat{x})$. Secondly, it is the error backpropagated from the subsequent time layers of the $f_w$ unit from all nodes connected with the given node $u$ by an edge $u \Rightarrow n$. The backpropagation continues until the error value converges, which usually takes fewer time steps than the state convergence. The unfolding and error backpropagation phases are presented in Fig. 3.
The described algorithm makes one important assumption: we want the global
state x to converge. To assure convergence, Fw must be a contraction map.
According to Banach Theorem, this guarantees that the state calculation will
converge to a unique point x̂ and the convergence will be exponentially fast.
To assure the contraction property, a penalty term $\partial p_w / \partial w$ is added to the total error term $\partial e_w / \partial w$ after performing BPTS. The penalty value $p_w$ is calculated as follows. Let $A = \frac{\partial F_w}{\partial x}(x, l)$ be a block matrix of size $N \times N$ with blocks of size
Fig. 3. Unfolded encoding network for the sample graph and backpropagation
$s \times s$, where $N$ is the number of nodes in the processed graph and $|x_n| = s$ is the state size for a single node. A single block $A_{n,u}$ measures the influence of the $u$th node on the $n$th node if an edge $u \Rightarrow n$ exists, and is zeroed otherwise. Let us denote by $I_u^j$ the influence of the $u$th node on the $j$th element of the state $x_n$ (Eq. 9). The penalty $p_w$ added to the network error $e_w$ is defined by Eq. 7.
$$p_w = \sum_{u \in N} \sum_{j=1}^{s} L(I_u^j, \mu), \quad (7)$$

$$L(y, \mu) = \begin{cases} y - \mu & \text{if } y > \mu \\ 0 & \text{otherwise} \end{cases}, \quad (8)$$

$$I_u^j = \sum_{(n,u)} \sum_{i=1}^{s} |A_{n,u}^{i,j}|. \quad (9)$$
This does not guarantee that Fw will remain a contraction map, as the penalty is
added post factum and it must be tuned using the contraction constant μ. How-
ever, even if the convergence isn’t always reached, the model can still be trained
and yield good results. The necessary constraint in such cases is a maximum
number of unfolding and error backpropagation steps.
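The penalty of eqs. (7)-(9) can be sketched as follows, assuming the Jacobian blocks $A_{n,u}$ have already been computed; the names are illustrative:

```python
import numpy as np

def contraction_penalty(A_blocks, mu, s):
    """Penalty term of eqs. (7)-(9): a sketch. `A_blocks` maps an edge
    (n, u) to the s x s Jacobian block A_{n,u} = d x_n / d x_u."""
    # I_u^j: summed absolute influence of node u on state element j, eq. (9)
    influence = {}
    for (n, u), A in A_blocks.items():
        for j in range(s):
            influence[(u, j)] = influence.get((u, j), 0.0) + np.abs(A[:, j]).sum()
    # hinge-like penalty L(y, mu) = max(y - mu, 0), eqs. (7)-(8)
    return sum(max(v - mu, 0.0) for v in influence.values())
```

Only influences exceeding the contraction constant $\mu$ contribute, so the penalty pushes the transition function back towards a contraction without constraining well-behaved weights.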
3 Bravais Lattices
T = n1 a + n2 b + n3 c . (10)
Each translation is described by three constant vectors $a$, $b$ and $c$, called the primitive vectors, and three integers $n_1$, $n_2$ and $n_3$. If we translate
the lattice by any such vector T , we will obtain the same lattice. A parallelepiped
formed from the primitive vectors is called a primitive cell. A primitive cell is
a basic component of a crystal structure. We can distinguish 14 Bravais lattices
in three dimensions, differing in the primitive vectors length relations, the angles
between them and the presence of additional lattice points in the cell.
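Eq. (10) can be illustrated by generating the lattice points of one cell; the vectors below are an assumed tetragonal example, not data from the paper:

```python
import numpy as np

# Eq. (10): every lattice point is reached by T = n1*a + n2*b + n3*c.
def lattice_points(a_vec, b_vec, c_vec, n_range):
    """All translations T for integer coefficients in n_range."""
    return [n1 * a_vec + n2 * b_vec + n3 * c_vec
            for n1 in n_range for n2 in n_range for n3 in n_range]

a_vec = np.array([1.0, 0.0, 0.0])
b_vec = np.array([0.0, 1.0, 0.0])
c_vec = np.array([0.0, 0.0, 2.0])   # tetragonal choice: |a| = |b|, distinct |c|
pts = lattice_points(a_vec, b_vec, c_vec, range(2))  # corners of one cell
```

With `range(2)` the coefficients run over {0, 1}, giving the eight corner points of a single primitive cell.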
Many material properties depend on crystal structure defects, which are in-
troduced on purpose. A defect may e.g. consist of an atom missing at one of
the lattice points (vacancy) or of a different atom introduced at one of the lat-
tice points (substitution). Properties of materials containing a particular defect
are determined experimentally by producing a specimen and testing it in a lab-
oratory. Modelling of such phenomena with neural network models could prove
useful for approximating these properties before the actual experiments take
place.
4 Experimental Results
For the scope of this work, two simple Bravais lattices were chosen for experi-
ments: the primitive (P) tetragonal lattice and the body-centered (I) tetragonal
lattice, containing an additional lattice point in the center of each cell. For both
tetragonal lattices a = b ≠ c, as presented in Fig. 5. For all the experiments
each crystal structure was represented by a single cell. Each lattice point was
described as a graph node, with node label containing information about the ba-
sis used at this node. The mutual location of two adjacent points u and n was
described as a pair of directed edges, each containing in its label the 3D Cartesian coordinates of the vector u ⇒ n or n ⇒ u, respectively. In this way, the description of a cell was independent of the actual location of the cell in space, which was the goal. A spherical coordinate system was also tried out, yielding similar results to the Cartesian one.
Fig. 5. Simple tetragonal cell (P) and body-centered tetragonal cell (I)
For every experiment the dataset was generated as follows. First a single graph
(cell) was created as a cubic (P) cell a = b = c = 1. Node labels were set to
1 (|ln | = 1, ln = 1), unless stated otherwise. A second graph, cubic (I) cell,
was created by introducing an additional node in the center of the graph. Then,
datasets were generated by random scaling of all the input graph edges using factors drawn from the uniform distribution U(−5, 5), but maintaining the tetragonal constraint a = b ≠ c. Then, a small random error ε with zero mean and standard deviation equal to 0.01 was added to all node and edge labels. For every experiment the dataset consisted of two classes of 200 graphs each. Among
each class every graph node had its expected output set to the same value: 1
or −1, depending on the class. During evaluation, a node was assigned to class
depending on the sign of its output on . For every experiment, the training set
consisted of 50 graphs, while the test set consisted of 150 graphs.
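A sketch of the dataset construction described above. The text specifies scaling factors from U(−5, 5), which would include negative scales, so the sketch below assumes a positive range instead; all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def tetragonal_cell(body_centered=False):
    """Unit-cell lattice points: a P cell (corners only) or an I cell
    (corners plus a central point)."""
    pts = [np.array([i, j, k], float)
           for i in (0, 1) for j in (0, 1) for k in (0, 1)]
    if body_centered:
        pts.append(np.array([0.5, 0.5, 0.5]))
    return pts

def make_sample(body_centered, noise_sd=0.01):
    """Randomly scaled, noise-perturbed cell; a = b is kept equal while c
    scales independently (assumed positive scaling range)."""
    ab, c = rng.uniform(0.2, 5.0, size=2)
    pts = [p * np.array([ab, ab, c]) for p in tetragonal_cell(body_centered)]
    return [p + rng.normal(0.0, noise_sd, 3) for p in pts]

sample = make_sample(True)
```

Edge labels for the graphs would then be derived from the pairwise point differences, as described in the text.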
For every experiment, the state size |xn | was set to 5. The number of hidden
neurons was set for both networks to 5. For the output network linear neurons
were used as output neurons. The contraction constant μ was set to 30 as smaller
values tended to disturb the learning process significantly. The maximum number
of state calculation and error backpropagation steps was set to 200. The number
of full training iterations (forward-backward) was set to 200, as it could be seen
that for some GNNs the RMSE on training set began to drop significantly only
after more than 100 iterations. In each experiment the best GNN was selected as
the one that achieved the smallest RMSE on the training set. Then, the selected
GNN was evaluated on the test set. The training set for each experiment con-
sisted of 50 samples from each of the two classes (100 samples in total). The test
set consisted of 150 samples from each class (300 samples in total).
For this experiment, a tetragonal (P) dataset was compared to a tetragonal (I)
dataset. All node labels were set to the same value. Thus, the task to solve was
to distinguish cells with the central atom missing from full tetragonal (I) cells.
The results achieved by the selected GNN are presented in Table 1. It can be seen that a GNN can be trained to deal very well with such a task, in which the number of nodes in the two graphs differs.
For this experiment, two tetragonal (I) datasets were used. In one dataset all
node labels were set to 1, while in the other one the labels of central nodes were
set to 2, to simulate a single atom substitution. The unusually good results for the
small random error, presented in Table 2, can be explained by the simplicity of
the task. As node labels are explicitly given to the model (unlike the structure of the graph, which was the case described in the previous section), a simple linear classifier would be sufficient for this task, even taking into consideration the applied noise.
To check the model performance with a more demanding task, a larger random
error ε with zero mean and standard deviation equal to 0.1 was added to the
original node and edge labels and a new classifier was trained. The results are
significantly worse; however, it must be stated that such a random error greatly disturbs the graph structure in the case of small edge lengths.
5 Conclusion
The Graph Neural Network is a model successfully used in many two dimensional
graph processing tasks. This article presents the capabilities of the GNN model
to process three-dimensional data, such as crystal structures. The main difference of this data lies in the fact that the spatial location of all the nodes must be taken into consideration, and not only the edge and node properties. The tasks
used for testing the GNN model were selected so as to present how the GNN
can be used to deal with various structural and property differences. The model
proved to work well in all the tasks, including a simulated crystal structure
vacancy defect and two different atom substitution defects.
References
1. Goulon-Sigwalt-Abram, A., Duprat, A., Dreyfus, G.: From hopfield nets to recur-
sive networks to graph machines: numerical machine learning for structured data.
Theoretical Computer Science 344(2), 298–334 (2005)
2. Pollack, J.B.: Recursive distributed representations. Artificial Intelligence 46(1),
77–105 (1990)
3. Sperduti, A.: Labelling recursive auto-associative memory. Connection Sci-
ence 6(4), 429–459 (1994)
4. Goller, C., Kuchler, A.: Learning task-dependent distributed representations by
backpropagation through structure. In: IEEE International Conference on Neural
Networks, vol. 1, pp. 347–352. IEEE (1996)
5. Goulon, A., Duprat, A., Dreyfus, G.: Learning numbers from graphs. In: Applied
Statistical Modelling and Data Analysis, Brest, France, pp. 17–20 (2005)
6. Scarselli, F., Gori, M., Tsoi, A.C., Hagenbuchner, M., Monfardini, G.: The graph
neural network model. IEEE Transactions on Neural Networks 20(1), 61–80 (2009)
7. Goulon, A., Picot, T., Duprat, A., Dreyfus, G.: Predicting activities without com-
puting descriptors: graph machines for QSAR. SAR and QSAR in Environmental
Research 18(1-2), 141–153 (2007)
8. Goulon, A., Faraj, A., Pirngruber, G., Jacquin, M., Porcheron, F., Leflaive, P.,
Martin, P., Baron, G., Denayer, J.: Novel graph machine based QSAR approach
for the prediction of the adsorption enthalpies of alkanes on zeolites. Catalysis
Today 159(1), 74–83 (2011)
9. Saldana, D., Starck, L., Mougin, P., Rousseau, B., Creton, B.: On the rational
formulation of alternative fuels: melting point and net heat of combustion pre-
dictions for fuel compounds using machine learning methods. SAR and QSAR in
Environmental Research 24(4), 259–277 (2013)
10. Yong, S., Hagenbuchner, M., Tsoi, A., Scarselli, F., Gori, M.: XML document
mining using graph neural network. Center for Computer Science, 354 (2006),
http://inex.is.informatik.uni-duisburg.de/2006
11. Scarselli, F., Yong, S.L., Gori, M., Hagenbuchner, M., Tsoi, A.C., Maggini, M.:
Graph neural networks for ranking web pages. In: Proceedings of the 2005
IEEE/WIC/ACM International Conference on Web Intelligence, pp. 666–672.
IEEE (2005)
12. Scarselli, F., Tsoi, A.C., Hagenbuchner, M., Noi, L.D.: Solving graph data issues
using a layered architecture approach with applications to web spam detection.
Neural Networks 48, 78–90 (2013)
13. Monfardini, G., Di Massa, V., Scarselli, F., Gori, M.: Graph neural networks for
object localization. Frontiers in Artificial Intelligence and Applications 141, 665
(2006)
14. Quek, A., Wang, Z., Zhang, J., Feng, D.: Structural image classification with graph
neural networks. In: International Conference on Digital Image Computing Tech-
niques and Applications (DICTA), pp. 416–421. IEEE (2011)
15. Zhang, Y., Yang, S., Evans, J.R.G.: Revisiting Hume-Rothery's Rules with artificial
neural networks. Acta Materialia 56(5), 1094–1105 (2008)
16. Willighagen, E., Wehrens, R., Melssen, W., De Gelder, R., Buydens, L.: Supervised
self-organizing maps in crystal property and structure prediction. Crystal Growth
& Design 7(9), 1738–1745 (2007)
17. Bianchini, M., Maggini, M., Sarti, L., Scarselli, F.: Recursive neural networks
for processing graphs with labelled edges: Theory and applications. Neural Net-
works 18(8), 1040–1050 (2005)
18. Kittel, C., McEuen, P.: Introduction to solid state physics, vol. 8. Wiley, New York
(1986)
Quality Evaluation of E-commerce Sites
Based on Adaptive Neural Fuzzy Inference System
1 Introduction
With the rapid development of global networks, electronic commerce (also called "e-commerce" or "eCommerce") has gained wide popularity among businesses. This term covers a wide range of activities of modern enterprises. It includes the entire Internet-based process of development, marketing, sale, delivery, maintenance, and payment for goods and services.
The key problems for e-commerce companies are understanding customer inquiries and developing tools for implementing feedback. Companies are usually poorly represented online, with sites that make such interaction difficult. This significantly weakens the position of the company as a whole. Consequently, it is very important that businesses have the opportunity to assess the quality of their business proposals and understand how customers perceive them in the context of the industry [1,2].
Therefore, for successful operation it is important that e-commerce companies assess their sites. Ratings are a kind of feedback mechanism that allows refining the strategy and methods of control. A website is a software product that can be
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 87–97, 2014.
© Springer International Publishing Switzerland 2014
88 H. Liu and V.V. Krasnoproshin
For the above purpose, at first, we constructed the knowledge base from these
projects; then we designed and realized the evaluation ANFIS; at last, we trained and
tested the ANFIS. Each of previous E-commerce website assessment projects can be
expressed as follows:
$$P = \langle S, FEM \rangle, \quad S = \{EM_{C_i}\}, \ i = 1, \dots, n, \quad (1)$$

where $S$ is the set of evaluation marks for all criteria, $FEM$ is the final evaluation mark, $EM_{C_i}$ is the evaluation mark of the $i$th criterion $C_i$, and $n$ is the total number of criteria.
Moreover, the EWAS can be described as:

$$O = \langle PS_{Tr}, PS_{Ts}, S, R \rangle, \quad PS_{Tr} = \{P_i\}, \quad PS_{Ts} = \{P_{m-i}\}, \ i = 1, \dots, m, \quad (2)$$
where $PS_{Tr}$ is the training set, drawn from part of all projects, $PS_{Ts}$ is the testing set, drawn from the remaining projects, $S$ is the set of evaluation marks of all criteria, and $R$ is the final evaluation mark produced by our ANFIS.
Analysis of the existing literature [6,7,8,9] shows that ANFIS has good capabilities for learning, prediction and classification. The architecture of these networks makes it possible to adaptively create a knowledge base (in the form of a set of fuzzy rules) for computing the system output, based on numerical or expert data.
ANFIS is a multilayer feedforward neural learning network which uses fuzzy reasoning. Fig. 1 shows a typical ANFIS architecture with two inputs, four rules and one output. Each network input is mapped to two membership functions (MF). In this model, first-order Sugeno-type if-then rules are used.
Layer 1: The input nodes. All nodes in this layer are adaptive. They generate the membership grades with which the inputs belong to each of the appropriate fuzzy sets, by the following formulas:
$$O^1_{A_i} = \mu_{A_i}(x), \quad i = 1, 2, \qquad O^1_{B_j} = \mu_{B_j}(y), \quad j = 1, 2, \quad (3)$$
where $x$ and $y$ are crisp inputs, and $A_i$ and $B_j$ are fuzzy sets such as low, medium, and high, characterized by appropriate MFs, which could be triangular, trapezoidal, Gaussian functions or other shapes. In this study, the generalized bell-shaped MFs defined below are utilized:
$$\mu_{A_i}(x) = \frac{1}{1 + \left|\dfrac{x - c_i}{a_i}\right|^{2b_i}}, \quad i = 1, 2, \qquad
\mu_{B_j}(y) = \frac{1}{1 + \left|\dfrac{y - c_j}{a_j}\right|^{2b_j}}, \quad j = 1, 2, \quad (4)$$
where {ai, bi, ci} and {aj, bj, cj} are the parameters of the MFs, governing the bell-
shaped functions. Parameters in this layer are referred to as premise parameters.
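The generalized bell-shaped MF of eq. (4) can be written directly:

```python
import numpy as np

def gbell(x, a, b, c):
    """Generalized bell-shaped membership function of eq. (4):
    1 / (1 + |(x - c)/a|^(2b)). `a` controls the width, `b` the slope,
    and `c` the center of the bell."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))
```

By construction the function equals 1 at the center $x = c$ and 0.5 at $x = c \pm a$, for any slope parameter $b > 0$.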
Layer 2: The nodes in this layer are fixed nodes labelled Π, indicating that they perform as simple multipliers. The outputs of this layer are represented as

$$O^2_{ij} = W_{ij} = \mu_{A_i}(x)\,\mu_{B_j}(y), \quad i, j = 1, 2, \quad (5)$$

which represent the firing strength of each rule. The firing strength means the degree to which the antecedent part of the rule is satisfied.
Layer 3: The nodes in this layer are also fixed nodes, labelled N, indicating that they play a normalization role in the network. The outputs of this layer can be represented as

$$O^3_{ij} = \overline{W}_{ij} = \frac{W_{ij}}{W_{11} + W_{12} + W_{21} + W_{22}}, \quad i, j = 1, 2. \quad (6)$$
Layer 4: Each node in this layer is an adaptive node, whose output is simply the product of the normalized firing strength and a first-order polynomial (for a first-order Sugeno model). Thus, the outputs of this layer are given by

$$O^4_{ij} = \overline{W}_{ij} f_{ij} = \overline{W}_{ij}\,(p_{ij}\,x + q_{ij}\,y + r_{ij}), \quad i, j = 1, 2. \quad (7)$$

Parameters in this layer are referred to as consequent parameters.

Layer 5: The single node in this layer is a fixed node labelled Σ, which computes the overall output as the summation of all incoming signals, i.e.

$$o = \sum_{i=1}^{2}\sum_{j=1}^{2} \overline{W}_{ij} f_{ij} = \sum_{i=1}^{2}\sum_{j=1}^{2}\left[(\overline{W}_{ij}\,x)\,p_{ij} + (\overline{W}_{ij}\,y)\,q_{ij} + \overline{W}_{ij}\,r_{ij}\right], \quad (8)$$
which is a linear combination of the consequent parameters when the values of the
premise parameters are fixed.
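The full forward pass of eqs. (3)-(8), for the two-input four-rule network of Fig. 1, can be sketched as follows; the parameter containers are illustrative assumptions:

```python
import numpy as np

def anfis_forward(x, y, prem, conseq):
    """Forward pass of the two-input, four-rule ANFIS of eqs. (3)-(8).
    `prem` holds the bell-MF parameters (a, b, c) for A1, A2 and B1, B2;
    `conseq` holds (p, q, r) for each of the four rules, indexed (i, j)."""
    gbell = lambda v, a, b, c: 1.0 / (1.0 + abs((v - c) / a) ** (2 * b))
    mu_A = [gbell(x, *prem["A"][i]) for i in range(2)]     # layer 1, eq. (3)
    mu_B = [gbell(y, *prem["B"][j]) for j in range(2)]
    W = np.array([[mu_A[i] * mu_B[j] for j in range(2)]
                  for i in range(2)])                      # layer 2, eq. (5)
    Wbar = W / W.sum()                                     # layer 3, eq. (6)
    f = np.array([[conseq[(i, j)][0] * x + conseq[(i, j)][1] * y + conseq[(i, j)][2]
                   for j in range(2)] for i in range(2)])  # rule polynomials
    return float((Wbar * f).sum())                         # layers 4-5, eqs. (7)-(8)
```

Because the normalized firing strengths sum to one, the output is a convex combination of the rule polynomials, which is what makes eq. (8) linear in the consequent parameters for fixed premise parameters.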
It can be observed that the ANFIS architecture has two adaptive layers: Layers 1
and 4. Layer 1 has modifiable parameters {ai, bi, ci} and {aj, bj, cj} related to the input
MFs. Layer 4 has modifiable parameters {pij, qij, rij} pertaining to the first-order
polynomial. The task of the learning algorithm for this ANFIS architecture is to tune
all the modifiable parameters to make the ANFIS output match the training data.
Learning or adjusting these modifiable parameters is a two-step process, known as the hybrid learning algorithm. In the forward pass of the hybrid learning algorithm, the premise parameters are held fixed, node outputs go forward up to Layer 4, and the consequent parameters are identified by the least-squares method. In the backward pass, the consequent parameters are held fixed, the error signals propagate backward, and the premise parameters are updated by the gradient descent method.
The detailed algorithm and mathematical background of the hybrid-learning algorithm
can be found in [9].
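The five layers above can be sketched end-to-end for the two-input case of Eqs. 4-8. This is a minimal forward pass only (no hybrid training); the premise and consequent parameters are arbitrary placeholders, not trained values:

```python
# Minimal ANFIS forward pass for the 2-input, 2-MF architecture of Eqs. 4-8.
# All parameter values here are arbitrary placeholders, not trained values.
def gbell(v, a, b, c):
    return 1.0 / (1.0 + abs((v - c) / a) ** (2 * b))

def anfis_forward(x, y, premise_x, premise_y, consequent):
    # Layer 1: membership grades mu_Ai(x), mu_Bj(y)  (Eq. 4)
    mu_a = [gbell(x, *p) for p in premise_x]
    mu_b = [gbell(y, *p) for p in premise_y]
    # Layer 2: firing strengths W_ij = mu_Ai(x) * mu_Bj(y)  (Eq. 5)
    w = {(i, j): mu_a[i] * mu_b[j] for i in range(2) for j in range(2)}
    total = sum(w.values())
    # Layer 3: normalization (Eq. 6); Layers 4-5: weighted Sugeno sum (Eqs. 7-8)
    out = 0.0
    for (i, j), wij in w.items():
        p, q, r = consequent[(i, j)]
        out += (wij / total) * (p * x + q * y + r)
    return out

premise_x = [(1.0, 2.0, 0.0), (1.0, 2.0, 1.0)]   # (a, b, c) per input MF
premise_y = [(1.0, 2.0, 0.0), (1.0, 2.0, 1.0)]
consequent = {(i, j): (1.0, 1.0, 0.0) for i in range(2) for j in range(2)}
# With every rule consequent f_ij = x + y, the normalized mixture is x + y.
assert abs(anfis_forward(0.3, 0.7, premise_x, premise_y, consequent) - 1.0) < 1e-9
```

In the hybrid algorithm the consequent tuples (p, q, r) would be fitted by least squares in the forward pass, and the premise tuples (a, b, c) by gradient descent in the backward pass.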
With the training dataset, we choose two generalized bell-shaped MFs for each of the three inputs to build the ANFIS, which leads to 27 if–then rules containing 104 parameters to be learned. Note that it is inappropriate to choose four or more MFs for each input, because the number of parameters to be learned in that case would be greater than the number of training samples. Fig. 3 shows the structure of the ANFIS that is
to be built for E-commerce website assessment in this study. The model structure is
implemented using the fuzzy logic toolbox of MATLAB software package [10].
The trained if–then rules are presented in Fig. 4 and can be used for prediction. For example, if we change the values of the inputs from 1.5 to 3, we immediately obtain the new ANFIS output value of 75.7, as illustrated in Fig. 5.
Quality Evaluation of E-commerce Sites Based on ANFIS 93
94 H. Liu and V.V. Krasnoproshin
The trained ANFIS is validated on the testing dataset. Fig. 6 shows the testing errors for the testing dataset. For convenience, the fitting errors for the training dataset are also shown in Fig. 8, from which it can be observed that, except for three E-commerce website assessments (EWAs), the fitting and testing errors for all the other 504 EWAs are nearly zero. The three exceptional EWAs are EWA178, EWA407 and EWA446, whose assessment scores are 56, 77 and 77, respectively, but the values fitted or predicted for them by the ANFIS are 44, 83 and 83. The relative errors for these three EWAs are 21.43%, 7.79% and 7.79%, respectively.
It should not be expected that the ANFIS produces very good results for all the training and testing samples, particularly when there may be conflicting data in the training or testing dataset.
Looking into the training dataset, we find that both EWA170 and EWA178 have
the same assessment ratings: 2, 2, 3 for U, R, D, respectively, but different assessment
scores: 44 and 56.
These two samples are obviously in conflict with each other. Moreover, it is also
found that EWA177 and EWA178 have different assessment ratings: (2, 3, 1) and (2,
1, 1), but the same assessment score of 56. These two samples also conflict with each
other. It can be concluded that the assessment score of EWA178 is very likely to be
an outlier. The performance of the developed ANFIS is in fact very good if the fitting
error of EWA178 is not included. This can be seen clearly from Fig. 8.
Root mean square error (RMSE):

RMSE = sqrt( (1/N) Σ_{t=1}^{N} (A_t − F_t)² ),          (9)

where A_t and F_t are the actual (desired) and fitted (or predicted) values, respectively, and N is the number of training or testing samples.
Mean absolute percentage error (MAPE):

MAPE = (1/N) Σ_{t=1}^{N} ( |A_t − F_t| / A_t ) × 100.          (10)
Correlation coefficient (R):

R = Σ_{t=1}^{N} (A_t − Ā)(F_t − F̄) / sqrt( Σ_{t=1}^{N} (A_t − Ā)² · Σ_{t=1}^{N} (F_t − F̄)² ),          (11)

where Ā and F̄ are the means of the actual and fitted values, respectively.
Fig. 8 shows the fitting and testing errors for the 507 E-commerce website assessment projects obtained by the ANN. It is very clear from Table 1 and Figs. 6 and 8 that the ANFIS has smaller RMSE and MAPE, as well as a larger R, for both the training and testing datasets than the ANN model. In other words, the ANFIS achieves better performance than the ANN model. Therefore, the ANFIS is a good choice for modelling E-commerce website assessment.
Moreover, the ANN is a black box by nature, and the relationships between its inputs and outputs are difficult to interpret, whereas the ANFIS is transparent and its if–then rules are very easy to understand and interpret. However, a drawback of the ANFIS is its limitation on the number of outputs: it can model only a single output.
4 Conclusion
References
1. Lee, S.: The effects of usability and web design attributes on user preference for e-
commerce web sites. Computers in Industry 61(4), 329–341 (2010)
2. Liu, H., Krasnoproshin, V., Zhang, S.: Fuzzy analytic hierarchy process approach for E-
Commerce websites evaluation. World Scientific Proceedings Series on Computer
Engineering and Information Science 6, 276–285 (2012)
3. Liu, H., Krasnoproshin, V., Zhang, S.: Algorithms for Evaluation and Selection E-
Commerce Web-sites. Journal of Computational Optimization in Economics and
Finance 4(2-3), 135–148 (2012)
4. Liu, H., Krasnoproshin, V., Zhang, S.: Combined Method for E-Commerce Website
Evaluation Based on Fuzzy Neural Network. Applied Mechanics and Materials 380-384,
2135–2138 (2013)
5. Law, R., Qi, S., Buhalis, D.: Progress in tourism management: A review of website
evaluation in tourism research. Tourism Management 31(3), 297–313 (2010)
6. Hung, W., McQueen, R.J.: Developing an evaluation instrument for e-commerce web sites
from the first-time buyer’s viewpoint. Electron. J. Inform. Syst. Eval. 7(1), 31–42 (2004)
7. Azamathulla, H.M., Ghani, A.A., Fei, S.Y., Azamathulla, H.M.: ANFIS-based approach
for predicting sediment transport in clean sewer. Applied Soft Computing 12(3), 1227–
1230 (2012)
8. Dwivedi, A.A., Niranjan, M., Sahu, K.: Business Intelligence Technique for Forecasting
the Automobile Sales using Adaptive Intelligent Systems (ANFIS and ANN). International
Journal of Computer Applications 74(9), 7–13 (2013)
9. Jang, J.S.R.: ANFIS: Adaptive-network-based fuzzy inference systems. IEEE Transactions
on Systems Man and Cybernetics 23, 665–685 (1993)
10. Petković, D., Issa, M., Pavlović, N.D.: Adaptive neuro-fuzzy estimation of conductive
silicone rubber mechanical properties. Expert Systems with Applications 39(10), 9477–
9482 (2012)
11. Tsai, C.F., Wu, J.W.: Using neural network ensembles for bankruptcy prediction and credit
scoring. Expert Systems with Applications 34(4), 2639–2649 (2008)
12. Singh, R., Kainthola, A., Singh, T.N.: Estimation of elastic constant of rocks using an
ANFIS approach. Applied Soft Computing 12(1), 40–45 (2012)
Visual Fuzzy Control for Blimp Robot
to Follow 3D Aerial Object
Abstract. This work presents a novel visual servoing system for following a 3D aerial moving object with a blimp robot and estimating the metric distances between the two. To realize autonomous aerial target following, an efficient vision-based object detection and localization algorithm is proposed using the Speeded Up Robust Features technique and Inverse Perspective Mapping, which allows the blimp robot to obtain a bird's eye view. The fuzzy control system relies on the visual information given by the computer vision algorithm. The fuzzy set models were introduced empirically based on possibility distributions and frequency analysis of the empirical data. The system focuses on continuously following the aerial target and maintaining it within a fixed safe distance. The algorithm shows robustness against illumination changes, as well as rotation and size invariance. The results indicate that the proposed algorithm is suitable for complex control missions.
1 Introduction
Recent developments in computer vision have made it a very important part of robotics research. To provide UAVs with additional information for performing visually guided missions, the development of computer vision techniques is necessary. Robot vision tracking technology is widely used due to its advantages, such as reliability and low cost. It is applied widely in the fields of security, traffic monitoring and object recognition. Several studies have been presented and devoted to vision-based flight control, navigation, tracking and object identification [1,2,3]. In [4], a visual information algorithm for cooperative robotics was proposed. In addition, visual information has been used on aerial robots for flying information [5]. Another approach has been proposed for a fixed-wing UAV flying at constant altitude following circular paths in order to pursue a moving object on a ground planar surface [6]. In addition, an autonomous surveillance blimp system has been equipped with one camera as a sensor; motion segmentation to improve the detection and tracking of point features on a target object was presented in [7]. However, it has some disadvantages, such as sensitivity to light and the amount of image data. Some studies have proposed many vision tracking methods
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 98–111, 2014.
© Springer International Publishing Switzerland 2014
such as simple color tracking [8], template matching [9], background subtraction [10], feature-based approaches [11] and feature tracking [12].
However, many algorithms cannot meet real-time and robustness requirements due to the large amount of video data. Therefore, it becomes important to improve the accuracy and real-time performance of tracking algorithms. To improve target detection in real time and its robustness, Lowe proposed the Scale-Invariant Feature Transform (SIFT) [13], and the Speeded Up Robust Features (SURF) technique was proposed subsequently [14]. Because the processing time of the SURF algorithm is shorter than that of SIFT, the interest point detection of the SURF algorithm is used for real-time processing. Visual servoing has been implemented successfully on UAVs. Pose-based methods have been employed for applications such as autonomous landing on moving objects; these methods are important for estimating the 3D object position [15]. Image-based methods have also been used widely for positioning [16]. In [17], this problem is addressed by proposing a measurement model of the vision sensor based on a specific image processing technique related to the size of the target; a nonlinear adaptive observer is implemented, and the performance of the proposed method was verified through numerical simulations. For the bearing measurement sensor, maximization of the determinant of the Fisher information matrix has been studied to generate the optimal trajectory [18,19].
Moreover, these approaches are hard to implement because of their high computational load. In addition, the tracking problem has been studied using a robust adaptive observer and the intelligent excitation concept [20-22]. Note that the observer applies only to systems whose relative degree is 1, and therefore it cannot be used for systems of higher relative degree. A 3D object following method has been developed based on visual information to generate a dynamic look-and-move control architecture for UAVs [23]. Aerial object following using visual fuzzy servoing has also been presented [24]. However, these works approach the tracking problem by exploiting the color characteristics of the target, i.e., defining a basic color for the target and assuming a simple colored mark to track. This process is not always reliable and might face problems due to changes in color over time. Also, the design of the fuzzy controllers was based on trial and error. Intelligent controllers are gaining importance in robotics research, and one of the most widely used techniques in intelligent computing is fuzzy logic.
100 R. Al-Jarrah and H. Roth
The most important problem in fuzzy logic is how to design the fuzzy knowledge base without relying on a human control expert or simulation studies. Recently, possibility theory, an alternative to probability theory, has become very important in robotics research. It is a mathematical notion that deals with certain types of uncertainty in a system. The theory is strongly linked to fuzzy systems not only in its mathematics but also in its semantics [25]. Possibility theory was introduced by Lotfi Zadeh [26] as an extension of his theory of fuzzy sets and fuzzy logic, and was then developed by Dubois et al. [27]. Therefore, combining possibility theory and fuzzy sets makes it possible to model complex systems empirically without the presence of an expert. As possibility theory deals only with evidence in which the focal elements overlap, it is always better to collect possibility data empirically in the lab. Cliff Joslyn [28] used interval statistics sets with their empirical random sets to develop an approach for constructing the possibility distribution histogram.
In this paper, we present the design and implementation of a vision system for a blimp robot. The proposed vision system uses SURF to localize the 3D aerial target. In addition, Inverse Perspective Mapping (IPM) is used to remove the perspective effect from the image and to remap the image into a new 2-D domain where the information content is homogeneously distributed among all pixels. This approach does not depend on the colour, shape or even the size of the tracked object; it depends on the target itself, finding the interest points of the object via the SURF and IPM algorithms. Hence, the proposed approach can estimate the metric distances between the blimp robot and the 3D flying object. A fuzzy set model is then designed empirically to obtain advanced and efficient fuzzy controllers that command the blimp to track and follow the aerial target and maintain a fixed distance from it. The blimp system achieves robust object recognition even when the object in a captured image differs in scale and/or rotation from the reference images.
The remainder of this paper is organized as follows: after the introduction, we discuss the blimp hardware of the vision system. Section 3 presents vision-based robot localization. We then introduce the fuzzy set model based on possibility histograms in Section 4. The experimental results of the vision system obtained through actual flight tests are presented in Section 5. Finally, Section 6 presents the conclusions of the work.
2 Blimp Hardware
In order to develop a blimp with small size, light weight, and high-level functionality and communications, the navigation system and autonomous embedded blimp system were presented in our previous works [29, 30]. The core of the blimp system is distributed between the Gumstix Overo-Air COM (weight 5.6 g) and an Atmega328 microcontroller. The Gumstix processes the sensor data and then sends the commands to the Atmega via the I2C protocol. Since weight is especially important, as the mass budget is one of the most restrictive criteria during design, a Gumstix Caspa camera was chosen due to its low weight of 22.9 g.
A visual system is used to localize the flying target. We process the captured images from the camera to localize the flying target inside the image, and then establish a correspondence between the location in the image and the location in the work field. We sought a detection method that provides good detection with scale and rotation invariance and is robust against noise; several approaches were studied, and the examination showed that SURF is the most convenient for our demands. We proposed the SURF algorithm in our previous work [31] in order to detect and track a ground robot. In this work, however, we extend and update the algorithm by using the IPM algorithm. This combination gives the blimp the ability not only to obtain the bird's eye view, but also to estimate the distances between the blimp and the 3D object. In addition, this work deals with detecting and tracking a flying object in space. As SURF needs a robust estimator to filter the total set of matched points and eliminate erroneous matches, the Random Sample Consensus (RANSAC) algorithm is used to discard the outliers from the set of matched points [32]. Consider a flying object at a certain altitude H_o in the world space. The blimp robot has a certain known altitude H_b with position P_b, and the flying object location is P_o. The distance between the two is D in the ground plane, as shown in Fig. 1.
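The RANSAC idea used above to filter SURF matches can be illustrated on a simpler model, fitting a 2D line to points contaminated by gross outliers (the data, threshold and iteration count below are made up):

```python
import random

# Minimal RANSAC sketch: repeatedly fit a model to a random minimal sample
# and keep the model with the most inliers. Here the model is a 2D line,
# standing in for the homography fitted to SURF matches in the paper.
def ransac_line(points, iters=200, thresh=0.5, seed=0):
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:                       # degenerate sample, skip
            continue
        m = (y2 - y1) / (x2 - x1)          # candidate line y = m*x + b
        b = y1 - m * x1
        inliers = [p for p in points if abs(p[1] - (m * p[0] + b)) < thresh]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

# Ten points on y = 2x + 1 plus two gross outliers.
pts = [(x, 2.0 * x + 1.0) for x in range(10)] + [(3.0, 40.0), (7.0, -5.0)]
inliers = ransac_line(pts)
assert len(inliers) == 10      # the two outliers are discarded
```

In the paper's setting the minimal sample would be a set of SURF correspondences, the model a homography, and the residual a reprojection distance; the consensus principle is the same.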
It was assumed that the position of the blimp robot is known when the mission starts. Therefore, these data can be used to localize the 3D object with respect to the blimp position, as given by:

P_o = ( X_b − H_o, Y_b, Z_b + D ).          (1)
By estimating the distance between the two and keeping the 3D object at the center of the image, Eq. 1 can be used to find the object position. The 3D object projection on the image plane is then defined by its center point in the (x_i, y_i) plane with polar coordinates (r, Φ), where

r = sqrt( x_i² + y_i² ),  Φ = tan⁻¹( x_i / y_i ),          (2)
where r is the distance between the projection point of the object and the projection point of the camera onto the image plane, and Φ is the angle between the object and the center line of the image plane. In fact, the perspective view of the captured image somewhat distorts the actual shape of the space in the (X_w, Y_w, Z_w) world coordinates. The angle of view and the distance of the object from the camera associate different information with each pixel of the image. Hence, the image needs to go through a pre-processing step to remedy this distortion, using the transformation technique known as Inverse Perspective Mapping (IPM) [33]. IPM removes the perspective effect from the image and remaps it into a new 2-D domain where the information content is homogeneously distributed among all pixels. We assume that the plane of the 3D flying object is planar, as shown in Fig. 2. The rotation angle, by which the image is rotated before the translation along the camera optical axis, is θ. However, the use of a single camera, mounted on the gondola, does not provide depth information about the object, because of the nonlinearity between the object position in the image plane and its position in the 3D object plane as well as in the real-world plane.
Transforming the image removes the nonlinearity of these distances. The mapping of the object in its plane (X_o, Y_o, Z_o) to its projection on the image plane (u, v) is shown in Fig. 2. To create the top-down view, we rotate the image by angle θ, translate along the camera optical axis, and then scale by the camera parameter matrix, as given by:
[u, v, 1]^T = K · T · R · (x, y, z, 1)^T,          (3)

where

R = | 1    0       0     0 |      T = | 1  0  0      0     |      K = | α_x  s   u_o  0 |
    | 0  cos θ  −sin θ   0 |          | 0  1  0      0     |          |  0  α_y  v_o  0 |          (4)
    | 0  sin θ   cos θ   0 |          | 0  0  1  −h/sin θ  |          |  0   0    1   0 |
    | 0    0       0     1 |          | 0  0  0      1     |
Thus Eq. 3, which maps each pixel on the image plane to the top-down view, can be rewritten as:

| u_i |   | q_11  q_12  q_13  q_14 |   | X_w |
| v_i | = | q_21  q_22  q_23  q_24 | · | Y_w |          (6)
|  1  |   | q_31  q_32  q_33  q_34 |   | Z_w |
                                       |  1  |
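A numerical sketch of Eqs. 3-6, projecting one world point through the composite matrix Q = K·T·R; the pitch angle, camera height and intrinsic parameters below are made-up values, not the blimp camera's calibration:

```python
import numpy as np

# Projecting a homogeneous world point through Q = K*T*R (Eqs. 3-6).
# theta, h and the intrinsics are illustrative values only.
theta = np.deg2rad(30.0)                 # camera pitch angle
h = 1.5                                  # camera height above the plane
R = np.array([[1, 0, 0, 0],
              [0, np.cos(theta), -np.sin(theta), 0],
              [0, np.sin(theta),  np.cos(theta), 0],
              [0, 0, 0, 1]], dtype=float)
T = np.eye(4)
T[2, 3] = -h / np.sin(theta)             # shift along the optical axis
K = np.array([[800.0, 0.0, 320.0, 0.0],  # alpha_x, skew, u_o
              [0.0, 800.0, 240.0, 0.0],  # alpha_y, v_o
              [0.0, 0.0, 1.0, 0.0]])
Q = K @ T @ R                            # the 3x4 matrix [q_kl] of Eq. 6

Pw = np.array([0.2, 0.0, 4.0, 1.0])      # homogeneous world point
u, v, w = Q @ Pw
print(u / w, v / w)                      # pixel coordinates (u_i, v_i)
```

Dividing by the third homogeneous coordinate w yields the pixel position; inverting Q (restricted to the ground plane) gives the metric top-down remapping used for the distance estimate.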
We now have the position of the object in the transformed image in pixels. To convert it into meters, the automatic calibration of the bird's eye view projection system can be used, as explained in detail in [35]. The use of IPM allows the blimp robot to obtain the bird's eye view, as shown in Fig. 3; it removes the perspective effect from the acquired image and weighs each pixel according to its information content. The SURF algorithm can now detect the object with scale or rotation invariance, as shown in Fig. 4. Then, we can obtain the approximated distances between the camera (blimp robot) and the object. The whole working process of the blimp system is shown in Fig. 5.
Data Ai Si
[-90,-55] <[-1000, -820], [-814, -451]> {[-1000, -820] = 0.5, [-814, -451]= 0.5}
[-55,-5] <[-495, -257], [-281, -15]> {[-495, -257] = 0.5, [-281, -15]= 0.5}
[-5, 5] <[-10, 6]> {[-10, 6]=1}
[5, 55] <[10, 246], [257, 478] > {[10, 246] = 0.5, [257, 478]= 0.5 }
[55, 90] < [575, 772], [880, 1000]> { [575, 772] = 0.5, [880, 1000] = 0.5}
EL             ER            Core          Support
[-1000, -814]  [-820, -451]  [-814, -820]  {(-1000, -814), (-814, -820), (-820, -451)}
[-495, -281]   [-257, -15]   [-257, -281]  {(-495, -281), (-281, -257), (-257, -15)}
[-10, 6]       [-10, 6]      [-10, 6]      {(-10, 6)}
[10, 257]      [246, 478]    [246, 257]    {(10, 257), (246, 257), (246, 478)}
[575, 880]     [772, 1000]   [772, 880]    {(575, 880), (772, 880), (772, 1000)}
The form of the possibility histogram and distributions (π) depends on the core and support of the measurement record sets, as shown in Table 3. The core and support of the possibility distribution for the histogram are given by:
C_i(π) = [ sup(E_i^L), inf(E_i^R) ],          (8)
supp_i(π) = [ inf(E_i^L), sup(E_i^R) ],          (9)

where inf(·) and sup(·) denote the lower and upper endpoints of the corresponding interval.
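Eqs. 8-9 can be checked against the first row of Table 3; reading E^L and E^R as interval endpoint pairs is our interpretation, not stated explicitly in the text:

```python
# Checking Eqs. 8-9 against the first row of Table 3 (reading E^L and E^R
# as interval endpoint pairs is our assumption).
EL = (-1000, -814)   # left-endpoint interval E^L
ER = (-820, -451)    # right-endpoint interval E^R

core = (EL[1], ER[0])        # Eq. 8: [sup(E^L), inf(E^R)]
support = (EL[0], ER[1])     # Eq. 9: [inf(E^L), sup(E^R)]

assert core == (-814, -820)      # matches the "Core" column of Table 3
assert support == (-1000, -451)  # spans the full "Support" column
```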
where E^L and E^R are the left and right endpoint vectors, respectively. The possibility histograms for the data shown in Fig. 6 can be transferred to fuzzy membership functions, as shown in Fig. 7, without any changes, because both have the same mathematical description and every possibilistic histogram is a fuzzy interval. Fig. 8 also shows the histograms for the first input. The general formula of the above trapezoidal membership functions is given by:
μ_A(x) =  (x − a)/(b − a),   a < x < b
          1,                 b ≤ x < c          (10)
          (d − x)/(d − c),   c < x < d
          0,                 otherwise
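Eq. 10 in code; a minimal sketch in which the breakpoints used in the checks are arbitrary:

```python
# Trapezoidal membership function of Eq. 10, with breakpoints a < b <= c < d.
def trapmf(x, a, b, c, d):
    if x <= a or x >= d:
        return 0.0               # outside the support
    if x < b:
        return (x - a) / (b - a) # rising edge
    if x <= c:
        return 1.0               # plateau (the core)
    return (d - x) / (d - c)     # falling edge

assert trapmf(0.0, -1.0, 0.0, 1.0, 2.0) == 1.0    # plateau
assert trapmf(-0.5, -1.0, 0.0, 1.0, 2.0) == 0.5   # rising edge
assert trapmf(1.5, -1.0, 0.0, 1.0, 2.0) == 0.5    # falling edge
assert trapmf(3.0, -1.0, 0.0, 1.0, 2.0) == 0.0    # outside the support
```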
The new outputs of the membership functions can be calculated as follows [37]:
m_i = 2d_i − 2a_i + c_i − b_i
n_i = c_i² + 2c_i d_i + 2d_i² − 2a_i² − 2a_i b_i − b_i²          (11)
F = ( n_i − m_i² − c_i m_i ) / ( 2m_i + c_i − b_i )
H = F + m_i
where a_i, b_i, c_i and d_i are the breakpoints of the trapezoidal membership functions, and F and H are the new breakpoints for the left and right sides. The optimized final membership functions for the output and inputs are shown in Fig. 9, Fig. 10 and Fig. 11, respectively. Each input and the output has 5 linguistic variables. The rule base of this controller consists of 25 rules.
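Eq. 11 in code, reading the final expression for F as a fraction (our assumption, since the division bar is not visible in the typeset formula); the example trapezoid is arbitrary:

```python
# New breakpoints F and H of Eq. 11, computed from a trapezoid's breakpoints
# (a_i, b_i, c_i, d_i). The example trapezoid below is arbitrary.
def new_breakpoints(a, b, c, d):
    m = 2 * d - 2 * a + c - b
    n = c**2 + 2 * c * d + 2 * d**2 - 2 * a**2 - 2 * a * b - b**2
    F = (n - m**2 - c * m) / (2 * m + c - b)   # division is our reading
    H = F + m
    return F, H

F, H = new_breakpoints(-1.0, 0.0, 1.0, 2.0)
print(F, H)   # → -3.0 4.0; (F, H) replace the left and right breakpoints
```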
For this controller, the first input is the estimated distance D between the blimp and the 3D target object. The second input is the difference between the last two measurements. The output is the vectorization angle for the main propellers. The same procedure was followed here to design the fuzzy membership functions for the vectorization angle. The empirical data, the analysis and the histogram parameters are given in Tables 4, 5 and 6.
Data Ai Si
[120,200] <[-232,-98], [-114,5]> {[-232,-98] = 0.5, [-114,5]=0.5}
[200,240] <[-17, 30]> {[-17, 30] = 1}
[240,320] <[0, 123], [100, 204] > {[0, 123] = 0.5, [100, 204]= 0.5}
[320, 400] <[228, 317], [325, 435]> {[228, 317]= 0.5, [325, 435] = 0.5}
[400, 480] <[474, 552], [664, 720]> {[474, 552] = 0.5, [664, 720] = 0.5}
EL ER Core Support
[-232, -114] [-98, 5] [-98,-114] {(-232, -114), (-98,-114), (-98, 5)}
[-17, 30] [-17, 30] [-17, 30] {(-17, 30), (-17, 30), (-17, 30)}
[0, 100] [123, 204] [123, 100] {(0, 100), (123, 100), (123, 204)}
Fig. 14. The histograms for input 1
Fig. 15. The final output μ
Fig. 16. The final μ for input 1
Fig. 17. μ for input 2
The same procedure was followed for this controller to find the histograms for the output and input 1, as shown in Fig. 12, Fig. 13 and Fig. 14. Then, by using the bacterial algorithm, the membership functions were obtained, as shown in Fig. 15, Fig. 16 and Fig. 17. Five linguistic variables were found for each input, as well as five for the output. The total number of fuzzy rules was 21.
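A minimal single-input sketch of such a controller: trapezoidal fuzzification of the distance D and weighted-average defuzzification over singleton rule outputs. The breakpoints, linguistic terms and rule outputs below are illustrative only, not the values obtained from Tables 4-6:

```python
# Minimal sketch of a distance-to-vectorization-angle fuzzy controller.
# Breakpoints and rule outputs are illustrative, NOT the trained values.
def trapmf(x, a, b, c, d):
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Linguistic terms for the distance D (cm): hypothetical breakpoints.
terms = {"near": (0.0, 0.001, 40.0, 75.0),
         "ok":   (40.0, 75.0, 75.001, 110.0),
         "far":  (75.0, 110.0, 400.0, 400.001)}
# Singleton rule outputs: vectorization angle command (degrees).
rules = {"near": 60.0, "ok": 90.0, "far": 120.0}

def control(D):
    num = den = 0.0
    for term, bps in terms.items():
        mu = trapmf(D, *bps)           # fuzzify the crisp distance
        num += mu * rules[term]        # weight each rule's output
        den += mu
    return num / den if den else 90.0  # weighted-average defuzzification

print(control(75.0))   # → 90.0 (at the set-point the angle command is neutral)
```

The real controller has two inputs (D and its difference) and 21 rules; this sketch only shows the fuzzify-infer-defuzzify pipeline on one input.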
5 Experimental Results
In order to verify the proposed fuzzy vision system, several experiments on the complete system were conducted. During these experiments, the blimp robot flew at a certain altitude while the 3D object moved in the plane at constant altitude. We assume that the 3D target object is already in the view of the on-board camera. The target is identified and tracked in the video sequence by the vision system. Based on the vision data, the vectorization angle of the main propellers and the yaw angle were controlled to follow the target and keep it at a certain position in the image as well as at a certain distance from the blimp. Fig. 18 shows the distance between the blimp and the object. We assume here that the target is static at a certain position relative to the blimp. The blimp vision system detects the target and flies toward it; the engine then stops, and the blimp stops moving at a certain distance, here 75 cm. Figs. 19, 20 and 21 show the blimp behavior when controlling the yaw angle to keep the object at the center of the image plane. The distance in pixels between the blimp and the target is shown in Fig. 22. The vectorization angle between the blimp robot and the object is shown in Fig. 23; note that the blimp controller tries to follow the object while keeping the blimp at a certain altitude, which means the vectorization angle should return to 90 degrees during the mission, otherwise the blimp will descend while flying. Fig. 24 shows the trajectory made by the blimp robot during the mission in our lab. These data were collected using the IMU mounted on the blimp robot and reconstructed after the missions were completed. The image sequences from the experiments are shown in Fig. 25.
Fig. 18. Estimated distances when the 3D object is static in the environment
Fig. 19. The angle between the center and the object in the image plane
Fig. 20. The angle between the center and the object in the image plane
Fig. 21. The angle between the center and the object in the image plane
Fig. 22. Distance in polar coordinates (in pixels)
Fig. 23. Vectorization angle
Fig. 24. Trajectory of the blimp during tests
Fig. 25. Sequences of images from the experiments
6 Conclusion
In this paper, an efficient vision-based object detection and localization algorithm to realize autonomous 3D object following has been presented. The visual information is provided by the SURF algorithm together with Inverse Perspective Mapping. The fuzzy logic controllers have been designed experimentally to keep the blimp at a certain distance from the target and maintain it in the center of the image. The experimental results validate that the algorithm is not only able to track the target effectively but also improves robustness and accuracy. We also presented a method to estimate the distance between two 3D objects in space based on visual information from a single camera. This estimation helps to localize the 3D object with respect to the known blimp position, assuming the altitudes of both are known. However, the algorithm could face some limitations due to the vision sensor characteristics as well as the size of the target, which could affect the performance. In the future, multi-target 3D tracking using a binary sensor network will be presented to estimate the locations of two flying objects in indoor environments, taking into account that the targets fly at unknown altitudes. Also, the Extended Kalman filter (EKF) or a modified EKF might be used to extract the necessary information about the dynamics of the moving objects.
References
[1] Browning, B., Veloso, M.: Real Time, Adaptive Color Based Robot Vision. In:
IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3871–3876
(2005)
[2] Guenard, N., Hamel, T., Mahony, R.: A Practical Visual Servo Control for an Unmanned
Aerial Vehicle. IEEE Trans. Robot. 24, 331–340 (2008)
[3] Lin, F., Lum, K., Chen, B., Lee, T.: Development of a Vision-Based Ground Target
Detection and Tracking System for a Small Unmanned Helicopter. Sci. China—Series F:
Inf. Sci. 52, 2201–2215 (2009)
[4] Korodi, A., Codrean, A., Banita, L., Volosencu, C.: Aspects Regarding The Object
Following Control Procedure for Wheeled Mobile Robots. WSEAS Trans. System
Control 3, 537–546 (2008)
[5] Betser, A., Vela, P., Pryor, G., Tannenbaum, A.: Flying Information Using a Pursuit
Guidance Algorithm. In: American Control Conference, vol. 7, pp. 5085–5090 (2005)
[6] Savkin, A., Teimoori, H.: Bearings-Only Guidance of an Autonomous Vehicle Following
a Moving Target With a Smaller Minimum Turning Radius. In: 47th IEEE Conference
on Decision and Control, pp. 4239–4243 (2008)
[7] Fukao, T., Kanzawa, T., Osuka, K.: Tracking Control of an Aerial Blimp Robot Based on
Image Information. In: 16th IEEE International Conference on Control Applications part
of IEEE Multi-Conference on Systems and Control, Singapore, pp. 874–879 (2007)
[8] Canals, R., Roussel, A., Famechon, J., Treuillet, S.: A Bi-processor Oriented Vision-
Based Target Tracking System. IEEE Trans. Ind. Electron. 49(2), 500–506 (2002)
[9] Mejias, L., Saripalli, S., Cervera, P., Sukhatme, G.: Visual Servoing of an Autonomous
Helicopter in Urban Areas Using Feature Tracking. Journal of Field Robot 23, 185–199
(2006)
[10] Hu, W.M., Tan, T.N., Wang, L., Maybank, S.: A Survey on Visual Surveillance of
Object Motion and Behaviours. IEEE Trans. Syst. 34, 334–352 (2004)
[11] Hu, Y., Zhao, W., Wang, L.: Vision-based Target Tracking and Collision Avoidance for
Two Autonomous Robotic Fish. IEEE Trans. Ind. Electron. 56, 1401–1410 (2009)
[12] Xu, D., Han, L., Tan, M., Li, Y.: Ceiling-Based Visual Positioning for an Indoor Mobile
Robot With Monocular Vision. IEEE Trans. Ind. Electron. 56, 1617–1628 (2009)
[13] Lowe, D.: Distinctive Image Features From Scale Invariant Keypoints. International
Journal of Computer Vision 60(2), 91–110 (2004)
[14] Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: SURF: Speeded Up Robust Features. Computer Vision and Image Understanding (CVIU) 110(3), 346–359 (2008)
[15] Saripalli, S., Sukhatme, G.S.: Landing a helicopter on a moving target. In: Proceedings of
IEEE International Conference on Robotics and Automation, Rome, Italy, pp. 2030–2035
(2007)
[16] Bourquardez, O., Mahony, R., Guenard, N., Chaumette, F., Hamel, T., Eck, L.: Image
based visual servo control of the translation kinematics of a quadrotor aerial vehicle.
IEEE Transactions on Robotics 25(3), 743–749 (2009)
[17] Choi, H., Kim, Y.: UAV guidance using a monocular-vision sensor for aerial target
tracking. Control Engineering Practice Journal 22, 10–19 (2014)
[18] Passerieus, J.M., Cappel, D.V.: Optimal Observer Maneuver for Bearings-Only Tracking.
IEEE Transactions on Aerospace and Electronic Systems 34(3), 777–788 (1998)
[19] Watanabe, Y., Johnson, E.N., Calise, A.J.: Optimal 3D Guidance from a 2D Vision
Sensor. In: AIAA Guidance, Navigation, and Control Conference, Providence, RI (2004)
[20] Cao, C., Hovakimyan, N.: Vision-Based Aerial Tracking using Intelligent Excitation. In:
American Control Conference, Portland, OR, pp. 5091–5096 (2005)
[21] Stepanyan, V., Hovakimyan, N.: A Guidance Law for Visual Tracking of a Maneuvering
Target. In: IEEE Proceeding of American Control Conference, Minneapolis, MN, pp.
2850–2855 (2006)
[22] Stepanyan, V., Hovakimyan, N.: Adaptive Disturbance Rejection Controller for Visual
Tracking of a Maneuvering Target. Journal of Guidance, Control and Dynamics 30(4),
1090–1106 (2007)
[23] Mondragon, I.F., Campoy, P., Olivares-Mendez, M.A., Martinez, C.: 3D Object
following based on visual information for unmanned aerial vehicles. In: Robotics
Symposium IEEE IX Latin American and IEEE Colombian Conference on Automatic
Control and Industry Applications, Bogota (2011)
[24] Olivares-Mendez, M.A., Mondragon, I.F., Campoy, P., Mejias, L., Martinez, C.: Aerial
Object Following Using Fuzzy Servoing. In: Proceeding of Workshop on Research,
Development and Education on Unmanned Aerial Systems RED-UAS, Seville, Spain
(2011)
[25] Joslyn, C.: In Support of an Independent Possibility Theory. In: de Cooman, G., Raun,
D., Kerre, E.E. (eds.) Foundations and Applications of Possibility Theory, pp. 152–164.
World Scientific, Singapore (1995a)
[26] Zadeh, L.: Fuzzy Sets as the Basis for a Theory of Possibility. Fuzzy Sets and Systems 1,
3–28 (1978); Reprinted in Fuzzy Sets and Systems 100(suppl.), 9–34 (1999)
[27] Dubois, D., Prade, H.: Probability Theory and Multiple-valued Logics. A Clarification.
Annals of Mathematics and Artificial Intelligence 32, 35–66 (2001)
[28] Joslyn, C.: Measurement of Possibilistic Histograms from Interval Data. NCR Research
Associate, Mail Code 522, NASA Goddard Space Flight Center, Greenbelt, MD 20771,
USA (1996)
[29] Al-Jarrah, R., Roth, H.: Design Blimp Robot based on Embedded System & Software
Architecture with high level communication & Fuzzy Logic. In: Proceeding in IEEE 9th
International Symposium on Mechatronics & its Applications (ISMA 2013), Amman,
Jordan (2013)
[30] Al-Jarrah, R., Roth, H.: Developed Blimp Robot Based on Ultrasonic Sensors Using
Possibilities Distributions and Fuzzy Logic. In: 5th International Conference on
Computer and Automation Engineering, ICCAE, Belgium, vol. 1(2), pp. 119–125 (2013)
[31] Al-Jarrah, R., AitJellal, R., Roth, H.: Blimp based on Embedded Computer Vision and
Fuzzy Control for Following Ground Vehicles. In: 3rd IFAC Symposium on Telematics
Applications (TA 2013), Seoul, Korea (2013)
[32] Fischer, M.A., Bolles, R.C.: Random Sample Consensus: a paradigm for Model Fitting
With Applications to Image Analysis and Automated Cartography. Communication of
the ACM 24(6), 381–395 (1981)
[33] Mallot, H.A., Biilthoff, H.H., Little, J.J., Bohrer, S.: Inverse Perspective Mapping
Simplifies Optical Flow Computation and Obstacle Detection. Biological Cybernetics,
177–185 (1991)
[34] Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge
University Press (2000)
[35] Bradsky, G., Kaebler, A.: Learning OpenCV: Computer Vision with the OpenCV
Library, pp. 408–412. Oreilly Media (2008)
[36] Jang, J.-S.R., Sun, C.-T., Mizutani, E.: Nuero Fuzzy and Soft Computing, pp. 13–91.
Prentice-Hall, Upper Saddle River (1997)
[37] Botzheim, J., Hámori, B., Kóczy, L.T.: Applying Bacterial Algorithm to Optimize
Trapezoidal Membership Functions in a Fuzzy Rule Base. In: Proceeding of the
International Conference on Computational Intelligence, Theory and Applications, 7th
Fuzzy Days, Dortmund, Germany (2001)
At Odds with Curious Cats, Curious Robots Acquire
Human-Like Intelligence
D.M. Ramík, K. Madani, and C. Sabourin
Signals, Images, and Intelligent Systems Laboratory (LISSI / EA 3956), University Paris-Est
Creteil, Senart-FB Institute of Technology, 36-37 rue Charpak, 77127 Lieusaint, France
{dominik.ramik,madani,sabourin}@u-pec.fr
1 Introduction
Even if nowadays machines and robotic bodies are fully automated, outperforming human
capacities, none of them can be called truly intelligent or can rival human cognitive
skills. The fact that human-like machine-cognition is still beyond the reach of
contemporary science only proves how difficult the problem is. Partly, this is due to
the fact that we are still far from fully understanding the human cognitive system.
Partly, it is because contemporary machines, though often fully automatic, are rarely
fully autonomous in their knowledge acquisition. Nevertheless, the concepts of
bio-inspired or human-like machine-cognition remain foremost sources of inspiration
for achieving intelligent systems (intelligent machines, intelligent robots, etc.).
The emergence of cognitive phenomena in machines has been, and remains, an active part of
research efforts since the rise of Artificial Intelligence (AI) in the middle of the last
century. Among others, [1] provides a survey on cognitive systems. It accounts for
different paradigms of cognition in artificial agents, markedly contrasting the
emergent and cognitivist paradigms and their hybrid combinations. The work of
[2] brings an in-depth review of a number of existing cognitive architectures, such as those
which adhere to the symbolic theory and rest on the assumption that human
knowledge can be divided into two kinds: declarative and procedural. Another discussed
architecture belongs to the class of those using “If-Then” deductive rules, again dividing
knowledge into two kinds: concepts and skills. In contrast to the above-mentioned
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 112–123, 2014.
© Springer International Publishing Switzerland 2014
works, the work of [3] focuses on the area of research on cognition and cognitive robots,
discussing the purposes linking knowledge representation, sensing and reasoning in
cognitive robots. However, there is no cognition without perception (a cognitive system
without the capacity to perceive would miss the link to the real world and so would be
impaired), and thus autonomous acquisition of knowledge from perception is a problem
that should not be skipped when dealing with cognitive systems.
Prominent in the machine-cognition issue is the question: “what is the compulsion
or the motivation for a cognitive system to acquire new knowledge?” For the human
cognitive system, Berlyne states that curiosity is the motor of the search for
new knowledge [4]. Consequently, a number of works have since been dedicated to the
incorporation of curiosity into artificial systems, including embodied
agents or robots. However, the number of works using some kind of curiosity-motivated
knowledge acquisition implemented on real agents (robots) is still
relatively small. Often, authors view curiosity only as an auxiliary mechanism in a
robot’s exploration behavior. One of the early implementations of artificial curiosity may
be found in [5]. According to the author, the introduction of curiosity helps
the system to actively seek similar situations in order to learn more. In the field of
cognitive robotics a similar approach may be found in [6], where the authors present an
approach including a mechanism called “Intelligent Adaptive Curiosity”. Two
experiments with the AIBO robot are presented, showing that the curiosity mechanism
successfully stimulates the learning progress. In a recent publication, the authors of [7]
implement the psychological notion of surprise-curiosity in the decision-making
process of an agent exploring an unknown environment. The authors conclude that the
surprise-curiosity-driven strategy outperformed the classical exploration strategy
regarding the time and energy consumed in exploring the environment. On the
other hand, the concept of surprise, closely related to the notion of curiosity, has been
exploited in [8] by a robot using surprise in order to discover new objects and
acquire their visual representations. Finally, the concept of curiosity has been
successfully used in [9] for learning affordances of a mobile robot in a navigation task.
The mentioned works attempt to answer the question: “how should an autonomous
cognitive system be designed in order to exhibit behavior and functionality
close to its human users?”
That is why, even though in English literature “curiosity killed a cat” (BENNY
(with a wink): “Curiosity killed a cat! Ask me no questions and I'll tell you no lies.”,
Different, Eugene O'Neill, 1920), taking into consideration the aforementioned
enticing benefits of curiosity, we have made it the principal foundation of the investigated
concept. The present paper is devoted to the description of a cognitive system based
on artificial curiosity for high-level, human-like knowledge acquisition from visual
information. The goal of the investigated system is to allow the machine (such as a
humanoid robot) to observe, to learn and to interpret the world in which it evolves,
using appropriate terms from human language, while not making use of a priori
knowledge. This is done by word-meaning anchoring based on learning by
observation, stimulated (steered) by artificial curiosity, and by interaction with the
human. Our model is closely inspired by the juvenile learning behavior of human infants
([10] and [11]).
114 D.M. Ramík, K. Madani, and C. Sabourin
According to Berlyne’s theory of human curiosity [4], two cognitive levels contribute to
the human desire to acquire new knowledge. The first is the so-called “perceptual curiosity”,
which leads to increased perception of stimuli. It is a lower-level cognitive function,
more related to the perception of new, surprising or unusual sensory input. It contrasts with
repetitive or monotonous perceptual experience. The other one is called “epistemic
curiosity”, which is more related to the “desire for knowledge that motivates
individuals to learn new ideas, eliminate information-gaps, and solve intellectual
problems” [12]. It also seems to stimulate long-term memory in
remembering new or surprising (i.e. what may contrast with the already learned)
information [13]. By observing the state of the art (including the referenced works), it
may be concluded that curiosity is usually used as an auxiliary mechanism instead
of being the fundamental basis of the knowledge acquisition. To the best of our knowledge,
there is no work to date which considers curiosity, in the context of machine cognition, as
a drive for knowledge acquisition on both the low (perceptual) level and the high
(“semantic”) level of the system. Without striving for biological plausibility, but by
analogy with natural curiosity, we founded our system on two cognitive levels ([14],
[15]). The first, handling reflexive visual attention, plays the role of perceptual
curiosity, and the second, coping with intentional learning-by-interaction, undertakes
the role of epistemic curiosity.
robot would use to describe it based on its current belief. The closer the robot’s
description is to that given by the human, the higher the fitness is. Once the evolution
has been finished, the belief with the highest fitness is adopted by the robot and is
used to interpret occurrences of new (unseen) objects. Fig. 1 depicts, through an
example, the important parts and operations of the proposed system.
Let us suppose a robot equipped with a sensor, observing the surrounding world and
interacting with the human. The world is represented as a set of features
I = {i_1, i_2, …, i_k}, which can be acquired by the robot’s sensor. Each time the robot
makes an observation o, its epistemic curiosity stimulates it to interact with the human,
asking him to give a set of utterances U_H describing the found salient objects.
Let us denote the set of all utterances ever given about the world as U. The observation
o is defined as an ordered pair o = (I_l, U_H), where I_l ⊆ I, expressed by (1),
stands for the set of features obtained from the observation and U_H ⊆ U is the set of
utterances (describing o) given by the human in the context of that observation. In (1),
i_p denotes the pertinent information for a given u (i.e. features that can be described
semantically as u in the language used for communication between the human and the
robot), i_i the impertinent information (i.e. features that are not described by the given
u, but might be described by another u_i ∈ U), and ε the sensor noise. The goal is to
distinguish the pertinent information from the impertinent one and to correctly map
the utterances to the appropriate perceived stimuli (features). Let us define an
interpretation X_u = (u, I_j) of an utterance u as an ordered pair, where I_j ⊆ I is a set of
features from I. So, the belief B is defined accordingly to (2) as an ordered set of
interpretations X_u of the utterances u from U.

I_l = \bigcup_{u \in U_H} i_p(u) \cup \bigcup_{u \in U_H} i_i(u) \cup \varepsilon   (1)
Accordingly to the criterion expressed by (3), one can calculate the belief B
which coherently interprets the observations made so far: in other words, by looking
for such a belief which minimizes, across all the observations o_q ∈ O, the difference
between the utterances U_Hq made by the human and those utterances U_Bq made by the
system using the belief B. Thus, B is a mapping from the set U to I: all members
of U map to one or more members of I and no two members of U map to the same
member of I.

B = \arg\min_B \sum_{q=1}^{|O|} \left| U_{Hq} \,\triangle\, U_{Bq} \right|   (3)
For each observation, the features i_q ∈ I_q (with I_q ⊆ I) are extracted. As described
in (1), the extracted set of features contains pertinent as well as impertinent features.
The coherent belief generation is done by deciding which features i_q ∈ I_q may
possibly be the pertinent ones. The decision is driven by two principles. The first one
is the principle of “proximity”, stating that any feature i is more likely to be selected
as pertinent in the context of u if its distance to other already selected features is
comparatively small. The second principle is the “coherence” with all the observations
in O. This means that any observation o_q ∈ O, corresponding to u ∈ U_Hq, has to have
at least one feature assigned to it. The fitness of a candidate belief is the distance D,
computed accordingly to equation (4), where ν is the number of utterances that are
not present in both sets U_Bq and U_Hq (i.e. either missed or superfluous utterances
interpreting the given features). The globally best fitting organism is chosen as the
belief that best explains the observations O made by the robot.

D = \frac{1}{1+\nu}, \quad \text{with } \nu = |U_{Hq} \cup U_{Bq}| - |U_{Hq} \cap U_{Bq}|   (4)
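As a minimal sketch of how the coherence criterion (3) and the fitness (4) operate on sets of utterances (the function names and the set-based reading below are our own illustration, not the authors' implementation):

```python
def fitness(u_human: set, u_belief: set) -> float:
    """Fitness of a candidate belief on one observation, after eq. (4):
    nu counts the utterances missed or superfluous between the two sets."""
    nu = len(u_human | u_belief) - len(u_human & u_belief)
    return 1.0 / (1.0 + nu)

def total_mismatch(human_sets, belief_sets) -> int:
    """Coherence criterion after eq. (3): total symmetric difference
    between the human's and the belief's utterances over all observations."""
    return sum(len(h ^ b) for h, b in zip(human_sets, belief_sets))
```

A belief that reproduces the human's utterances exactly reaches fitness 1; each missed or superfluous utterance pushes it toward 0.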
Human beings learn both by observation and by interaction with the world and with
other human beings. The former is captured in our system in the “best interpretation
search” outlined in the previous subsections. The latter type of learning requires that the
robot be able to communicate with its environment, and is facilitated by learning by
observation, which may serve as its bootstrap. In our approach, learning by
interaction is carried out in two kinds of interactions: human-to-robot and robot-to-
human. The human-to-robot interaction is activated anytime the robot wrongly
interprets the world. When the human receives a wrong response (from the robot), he
provides the robot with a new observation by uttering the desired interpretation. The robot
takes this new corrective knowledge about the world into account and searches for a
new interpretation of the world in conformity with this new observation. The robot-to-
human interaction may be activated when the robot attempts to interpret a particular
feature classified with a very low confidence: a sign that this feature is a borderline
example. In this case, it may be beneficial to clarify its true nature. Thus, led by its
epistemic curiosity, the robot asks its human counterpart to make an utterance about
the uncertain observation. If the robot's interpretation does not conform to the
utterance given by the human (the robot's interpretation was wrong), this observation is
recorded as new knowledge and a search for a new interpretation is started.
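The robot-to-human branch described above can be sketched as a small decision routine; the confidence threshold and the callback names are illustrative assumptions, not the authors' code:

```python
def handle_low_confidence(robot_label, confidence, ask_human, record_observation,
                          threshold=0.3):
    """Robot-to-human mode: when the classification confidence is low
    (a borderline example), epistemic curiosity makes the robot ask the
    human; a wrong guess is recorded as corrective knowledge and triggers
    a new interpretation search. Returns (label, reinterpret_needed)."""
    if confidence >= threshold:
        return robot_label, False
    truth = ask_human("What is this?")
    if truth != robot_label:
        record_observation(truth)   # new corrective observation
        return truth, True          # restart the interpretation search
    return robot_label, False
```

The human-to-robot mode works the same way, except that the correction is volunteered by the human rather than requested by the robot.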
The designed system has been implemented on the NAO robot (from Aldebaran
Robotics). It is a small humanoid robot which provides a number of facilities, such as an
onboard camera (vision), communication devices and an onboard speech generator. The
fact that the above-mentioned facilities are already available offers a huge saving of
time, even if those faculties remain quite basic in that kind of robot. While the NAO robot
integrates an onboard speech-recognition algorithm (a kind of speech-to-text
converter) which is sufficient for “hearing” the tutor, its onboard speech
generator is a basic text-to-speech converter: it is not sufficient to allow the tutor and
the robot to converse in natural speech. To overcome NAO’s limitations in this
respect, the TreeTagger tool 1 was used in combination with the robot's speech-
recognition system to obtain part-of-speech information from situated dialogs.
Standard English grammar rules were used to determine whether a sentence is
demonstrative (e.g. “This is an apple.”), descriptive (e.g. “The apple is red.”) or an
order (e.g. “Describe this thing!”). To communicate with the tutor, the robot used its
text-to-speech engine.
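A rough sketch of such sentence-type detection on (word, tag) pairs of the kind TreeTagger emits; the Penn-style tag names and the rules themselves are simplified assumptions, not the authors' grammar:

```python
def classify_sentence(pos_tags):
    """Classify a tagged sentence as 'order', 'demonstrative' or
    'descriptive'. pos_tags is a list of (word, tag) pairs; the rules
    are deliberately simplistic stand-ins for a real grammar."""
    words = [w.lower() for w, _ in pos_tags]
    tags = [t for _, t in pos_tags]
    if tags[0].startswith("VB"):                       # leading verb: imperative
        return "order"                                 # "Describe this thing!"
    if words[0] in ("this", "that", "these", "those") and "VBZ" in tags:
        return "demonstrative"                         # "This is an apple."
    return "descriptive"                               # "The apple is red."
```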
3.1 Implementation
The core of the implementation’s architecture is split into five main units:
Communication Unit (CU), Navigation Unit (NU), Low-level Knowledge Acquisition
Unit (LKAU), High-level Knowledge Acquisition Unit (HKAU) and Behavior
Control Unit (BCU). Fig. 2 illustrates the block diagram of the implementation’s
architecture. The aforementioned units control the NAO robot (symbolized by its sensors,
its actuators and its interfaces in Fig. 2) through its already available hardware and
software facilities. In other words, the above-mentioned architecture controls the
whole robot’s behavior.
The purpose of NU is to allow the robot to position itself in space with respect to
objects around it and to use this knowledge to navigate within the surrounding
environment. Capacities needed in this context are obstacle avoidance and
1 Developed by the ICL at the University of Stuttgart, available online at:
http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger
determination of the distance to objects. Its sub-unit handling spatial orientation receives its
inputs from the camera and from the LKAU. To get to the bottom of the obstacle
avoidance problem, we have adopted a technique based on ground color modeling.
Inspired by the work presented in [19], a color model of the ground helps the robot to
distinguish free space from obstacles.
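The ground-color idea can be sketched with a simple per-channel statistical model; the mean/std model and the threshold k below are our own illustrative stand-in for the modeling adopted from [19]:

```python
import numpy as np

def learn_ground_model(ground_pixels):
    """Fit a per-channel mean/std color model from sample ground pixels
    (an Nx3 array); a stand-in for the ground color modeling of [19]."""
    mu = ground_pixels.mean(axis=0)
    sigma = ground_pixels.std(axis=0) + 1e-6  # avoid division by zero
    return mu, sigma

def free_space_mask(image, model, k=2.5):
    """Pixels within k standard deviations of the ground color on every
    channel are free space; everything else is a potential obstacle."""
    mu, sigma = model
    z = np.abs(image - mu) / sigma
    return (z < k).all(axis=-1)
```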
The LKAU ensures the gathering of visual knowledge, such as the detection of salient
objects (by the sub-unit in charge of salient object detection) and their learning and
recognition (see [18] and [20]). Those activities are carried out mostly in an
“unconscious” manner, i.e. they are run as an automatism in the “background” while
collecting salient objects and learning them. The learned knowledge is stored in Long-
term Memory for further use.
The HKAU is the center where the intellectual behavior of the robot is constructed.
Receiving its features from the LKAU (visual features) and from the CU (linguistic
features), this unit processes the beliefs’ generation and the most coherent belief’s
emergence, and constructs the high-level semantic representation of the acquired visual
knowledge. Unlike the LKAU, this unit represents conscious and intentional cognitive
activity. In some way, it operates as a baby who learns from observation and from
verbal interaction with adults about what he observes, developing in this way his own
representation and his own opinion about the observed world [21].
The BCU receives its inputs from all other units and returns its outputs to each concerned
unit, including the robot’s devices (e.g. sensors, actuators and interfaces) [21]. The human-
robot interaction is performed by this unit in cooperation with the HKAU. In other words,
driven by the HKAU, a part of the robot’s epistemic-curiosity-based behavior is handled
by the BCU.
3.2 Experimental Validation

Fig. 3. Experimental setup showing the collected set of objects (a) and the learning rate versus the
number of presented examples (of the same object) to the robot (b)
Fig. 5. Images from a video sequence showing the robot exploring the room (left-side picture)
and finding salient objects in that room (middle and right-side pictures, respectively)

Fig. 6. Images from a video sequence showing the robot interacting with the human and learn-
ing-by-interaction from him
Within the above-mentioned experimental protocol, the robot has been asked to
explore an unknown environment (an unseen room) where a subset of the 25
collected objects has been placed. After exploration, the human showed, one-by-one,
the objects detected by the robot in that room accordingly to its perceptual curiosity, and
asked the robot whether it had seen each indicated object. However, the human also showed
some unseen objects (such as a blue teddy-bear). The epistemic curiosity of the robot
stimulates it to ask the human for additional information and to learn the discovered
objects by interaction with the human. Extracted from the video, Fig. 5 and Fig. 6 show
photographs of the robot during the above-described experiment. Here below, an
example of the Human-Robot interactive learning is reported:
• Human [giving the order to the Robot]: “Go explore the room.”
• Robot: “I am exploring the room.”
• Robot [after finishing the exploration of the room]: “I finished the exploring and
I have seen several interesting objects.”
• Human [showing the book to the Robot]: “Have you seen this?”
• Robot: “Yes! I have seen this, what is its name?”
• Human: “This is a book.”
• Robot: “OK! I will remember that it is a book.”
• Human [taking out from his pocket an unseen blue teddy-bear]: “Have you seen
this?”
• Robot: “No! I haven’t seen this before, what is this?”
• Human: “This is a teddy-bear.”
• Robot: “OK! I will remember that this is a teddy-bear.”
After having learned the objects (among which a black book), the robot was asked
to search for the “book”, placed in different positions in that room. Fig. 7 shows
photographs of the robot during the above-described experiment. The robot searches for and
successfully finds the book in different positions and stances. Additional experimental
results are available at: http://youtu.be/W5FD6zXihOo.
References
1. Vernon, D., Metta, G., Sandini, G.: A Survey of Artificial Cognitive Systems: Implications for the Autonomous Development of Mental Capabilities in Computational Agents. IEEE Transactions on Evolutionary Computation 11(2), 151–180 (2007)
2. Langley, P., Laird, J.E., Rogers, S.: Cognitive architectures: Research issues and challenges. Cognitive Systems Research 10(2), 141–160 (2009)
3. Levesque, H.J., Lakemeyer, G.: Cognitive robotics. In: Handbook of Knowledge Representation. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Dagstuhl (2010)
4. Berlyne, D.E.: A theory of human curiosity. British Journal of Psychology 45(3), 180–191 (1954)
5. Schmidhuber, J.: Curious model-building control systems. In: Proceedings of International Joint Conference on Neural Networks (IEEE-IJCNN 1991), vol. 2, pp. 1458–1463 (1991)
6. Oudeyer, P.-Y., Kaplan, F., Hafner, V.V.: Intrinsic Motivation Systems for Autonomous Mental Development. IEEE Transactions on Evolutionary Computation 11(2), 265–286 (2007)
7. Macedo, L., Cardoso, A.: The exploration of unknown environments populated with entities by a surprise-curiosity-based agent. Cognitive Systems Research 19-20, 62–87 (2012)
8. Maier, W., Steinbach, E.G.: Surprise-driven acquisition of visual object representations for cognitive mobile robots. In: Proc. of IEEE Int. Conf. on Robotics and Automation, Shanghai, pp. 1621–1626 (2011)
9. Ugur, E., Dogar, M.R., Cakmak, M., Sahin, E.: Curiosity-driven learning of traversability affordance on a mobile robot. In: Proc. of IEEE 6th Int. Conf. on Development and Learning, pp. 13–18 (2007)
10. Yu, C.: The emergence of links between lexical acquisition and object categorization: a computational study. Connection Science 17(3-4), 381–397 (2005)
11. Waxman, S.R., Gelman, S.A.: Early word-learning entails reference, not merely associations. Trends in Cognitive Science (2009)
12. Litman, J.A.: Interest and deprivation factors of epistemic curiosity. Personality and Individual Differences 44(7), 1585–1595 (2008)
13. Kang, M.J.J., Hsu, M., Krajbich, I.M., Loewenstein, G., McClure, S.M., Wang, J.T.T., Camerer, C.F.: The wick in the candle of learning: epistemic curiosity activates reward circuitry and enhances memory. Psychological Science 20(8), 963–973 (2009)
14. Madani, K., Sabourin, C.: Multi-level cognitive machine-learning based concept for human-like artificial walking: Application to autonomous stroll of humanoid robots. Neurocomputing, 1213–1228 (2011)
15. Ramik, D.-M., Sabourin, C., Madani, K.: From Visual Patterns to Semantic Description: a Cognitive Approach Using Artificial Curiosity as the Foundation. Pattern Recognition Letters 34(14), 1577–1588 (2013)
16. Ramik, D.-M., Sabourin, C., Madani, K.: A Real-time Robot Vision Approach Combining Visual Saliency and Unsupervised Learning. In: Proc. of 14th Int. Conf. CLAWAR, Paris, France, pp. 241–248 (2011)
17. Ramik, D.-M., Sabourin, C., Madani, K.: Hybrid Salient Object Extraction Approach with Automatic Estimation of Visual Attention Scale. In: Proc. of Seventh Int. Conf. on Signal Image Technology & Internet-Based Systems, Dijon, France, pp. 438–445 (2011)
18. Ramik, D.M., Sabourin, C., Moreno, R., Madani, K.: A Machine Learning based Intelligent Vision System for Autonomous Object Detection and Recognition. J. of Applied Intelligence (2013), doi:10.1007/s10489-013-0461-5
19. Hofmann, J., Jüngel, M., Lötzsch, M.: A vision based system for goal-directed obstacle avoidance used in the rc’03 obstacle avoidance challenge. In: Proc. of 8th Int. Workshop on RoboCup, pp. 418–425 (2004)
20. Moreno, R., Ramik, D.M., Graña, M., Madani, K.: Image Segmentation on the Spherical Coordinate Representation of the RGB Color Space. IET Image Processing 6(9), 1275–1283 (2012)
21. Ramik, D.M., Sabourin, C., Madani, K.: Autonomous Knowledge Acquisition based on Artificial Curiosity: Application to Mobile Robots in Indoor Environment. J. of Robotics and Autonomous Systems 61(12), 1680–1695 (2013)
A Statistical Approach to Human-Like Visual Attention
and Saliency Detection for Robot Vision:
Application to Wildland Fires’ Detection
1 Introduction
Wildland fires represent a major risk for many Mediterranean countries like France,
Spain, Portugal and Greece, but also for other regions around the world such as Australia,
California and, recently, Russia. They often result in significant human and economic
losses. For example, in 2009 in Australia alone, 300 000 hectares were devastated and
200 people killed. The situation is so dramatic that, in the recent past (May 2007), the
FAO (Food and Agriculture Organization of the United Nations) urged the world's
governments to take action for better prevention, understanding and fighting of
wildfires [1].
While the modeling of combustion and the prediction of fire-front propagation remain
foremost subjects in the better prevention of fire disasters (be it wildland fire or
compartment fire), efficient firefighting remains a crucial need in precluding the
dramatic consequences of such disasters. Such efficiency may be reached, or at least
substantially enhanced, with information relative to the current state and the dynamic
evolution of fires. For example, GPS systems make it possible to know the current
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 124–135, 2014.
© Springer International Publishing Switzerland 2014
position of the resources, and satellite images can be used to sense and track fire.
However, the time scale and spatial resolution of these systems are still insufficient
for the needs of operational forest-fire fighting [2]. Fig. 1 shows views of wildland
fires (forest fires) and the firefighting difficulty inherent in the intervention
conditions. From this sample of pictures it can be seen that, while GPS and satellite
images may help in locating and appreciating the overall fire, they remain inefficient in
appreciating local characteristics or persons to be rescued. On the other hand, over
the two past decades, visual and infrared cameras have been used as
complementary metrological instruments in flame experiments [3]. Vision systems
are now capable of reconstructing a 3D turbulent flame and its front structure when
the flame is the only density field in the images [4]. These 3D imaging systems are very
interesting tools for flame study; however, they are not suitable for large and wildland
fires. In fact, the color of the fire depends on the experimental conditions. Outdoor
fires are characterized by dominant colors in the yellow, orange and red intervals. In
outdoor experiments, the scenes are unstructured, with inhomogeneous fuel of various
colors (green, yellow, brown). Moreover, overall light conditions may vary,
influencing the fires' aspect and their natural colors. Recent research in fire region
segmentation has shown promising results ([5], [6], [7] and [8]). However,
comparison tests showed that none of these techniques is well suited for the
segmentation of all fire scenarios.
The authors of [9] have recently developed a number of works based on the use of a
stereovision system for the measurement of the geometrical characteristics of a fire front
during its propagation. Experiments were carried out in the laboratory and in the semi-open
field. The first part of this work corresponds to a segmentation procedure in order to
extract the fire areas from the background of the stereoscopic images. However,
although showing a number of promising results, especially relating to the 3D aspects,
like other referenced recent research works in fire region segmentation, the same work
concluded that none of the proposed techniques is well suited for the segmentation of all
fire scenarios. So, the critical need (and still open problem) of efficiently
extracting only the region containing fire data remains. Moreover, in order to
be useful for online fire-fighting strategy updating, the fire's segmentation and
126 V. Kachurka et al.
Visual saliency is computed using saliency features, called Global Saliency Map and
Local Saliency Map, respectively. These features play the role of some kind of
saliency indicators relating to luminance and chromaticity, characterizing items of the
image. The Global Saliency Map relates to the global variance of luminance and
chromaticity of the image, while the Local Saliency Map deals with locally centered (e.g.
focusing on a local region) characteristics within the image. The final saliency map is a
nonlinear fusion of these two kinds of saliency features.
M_Y(x) = \left| \Omega_Y^{\mu} - \Omega_Y(x) \right|

M_{CrCb}(x) = \sqrt{\left[\Omega_{Cr}^{\mu} - \Omega_{Cr}(x)\right]^2 + \left[\Omega_{Cb}^{\mu} - \Omega_{Cb}(x)\right]^2}

M(x) = \frac{1}{1+e^{-C(x)}}\, M_{CrCb}(x) + \left(1 - \frac{1}{1+e^{-C(x)}}\right) M_Y(x)   (1)
Coefficient C(x), determined using equation (2), is defined as a value linked to the
saturation C_c of each pixel (note that C_c is computed in the RGB color space).

C(x) = 10\,(C_c(x) - 0.5), \quad C_c(x) = \max(\Omega_R(x), \Omega_G(x), \Omega_B(x)) - \min(\Omega_R(x), \Omega_G(x), \Omega_B(x))   (2)
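A NumPy sketch of the global saliency features, following the reading of equations (1)-(2) given here (the sigmoid blending weight and channels normalized to [0, 1] are assumptions, not guaranteed to match the authors' exact formulation):

```python
import numpy as np

def global_saliency(Y, Cr, Cb):
    """Per-pixel distance from the image-mean luminance / chromaticity,
    after eq. (1): returns the two maps M_Y and M_CrCb."""
    m_y = np.abs(Y.mean() - Y)
    m_crcb = np.sqrt((Cr.mean() - Cr) ** 2 + (Cb.mean() - Cb) ** 2)
    return m_y, m_crcb

def saturation_coefficient(rgb):
    """Eq. (2): C(x) = 10 * (Cc(x) - 0.5), with Cc = max(R,G,B) - min(R,G,B);
    channels are assumed normalized to [0, 1]."""
    cc = rgb.max(axis=-1) - rgb.min(axis=-1)
    return 10.0 * (cc - 0.5)

def fuse(m_y, m_crcb, c):
    """Sigmoid weighting (our assumption): vivid colors (high C) emphasize
    the chromatic map, dull colors the luminance map."""
    w = 1.0 / (1.0 + np.exp(-c))
    return w * m_crcb + (1.0 - w) * m_y
```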
Then, let us define a (center) histogram H_C of pixel intensities inside a window P, and a
(surround) histogram H_S of intensities in a window Q surrounding P (see Fig. 2-a). The
center-surround feature d_i(x) is then given as expressed in equation (3), over all
histogram bins (with i ∈ {Y, Cr, Cb}), where |H_C| and |H_S| are the pixel counts of each
histogram, allowing the feature to be normalized even when a part of the windows is out
of the image frame. In this case, only the pixels inside the image are counted.
Fig. 2. The idea of the center-surround difference of histograms, computed between a center
window P and a surround window Q (left-side), and an example of a local saliency map (right-side)
d_i(x) = \sum_{k=1}^{255} \left| \frac{H_C(k)}{|H_C|} - \frac{H_S(k)}{|H_S|} \right|, \quad i \in \{Y, Cr, Cb\}   (3)
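Equation (3) amounts to an L1 distance between normalized histograms, which can be sketched per channel as follows (the window extraction itself is left out; the bin count and intensity range are assumptions):

```python
import numpy as np

def center_surround_feature(center_pixels, surround_pixels, bins=255):
    """Eq. (3) for one channel: L1 distance between the normalized center
    and surround intensity histograms."""
    h_c, _ = np.histogram(center_pixels, bins=bins, range=(0, 255))
    h_s, _ = np.histogram(surround_pixels, bins=bins, range=(0, 255))
    # Normalizing by the pixel counts keeps the feature valid even when
    # part of a window falls outside the image frame (fewer pixels counted).
    return np.abs(h_c / h_c.sum() - h_s / h_s.sum()).sum()
```

Identical distributions give 0; completely disjoint ones give the maximum value 2.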
Calculating d_i(x) throughout all the Y, Cr and Cb channels, one can compute
the resulting center-surround saliency at a given position as shown in
equation (4), where C_μ is the average color saturation over the sliding window P. It
is computed from the RGB color model for each pixel as a normalized
pseudo-norm, accordingly to equation (5), where C is obtained from equation
(2). When C is low (too dull, unsaturated colors), importance is given to intensity
saliency. When C is high (vivid colors), chromatic saliency is emphasized. Fig. 2-b
gives an example of the resulting so-called local saliency map.
$$D(x) = \frac{1}{1+e^{-C_{\mu}(x)}}\, \max\left(d_{Cr}(x), d_{Cb}(x)\right) + \left(1 - \frac{1}{1+e^{-C_{\mu}(x)}}\right) d_Y(x), \qquad (4)$$
$$C_{\mu}(x) = \frac{1}{|P|} \sum_{k \in P(x)} C(k). \qquad (5)$$
Depending on the size of the previously defined sliding window P, local features D(x) are scale-dependent. Thus this parameter may play the role of a “visual attention”
A Statistical Approach to Human-Like Visual Attention and Saliency Detection 129
control parameter driving the saliency extraction either in the direction of roomy items’ relevance (impact of a high value) or toward details’ highlighting (impact of a low value). Such a visual attention parameter allows top-down control both of the attention and of the sensitivity of the feature in scale space. A high value (resulting in a large sliding window size) with respect to the image size will make the local saliency feature more sensitive to large coherent parts of the image. On the other hand, low values will allow focusing on smaller details. For example, considering a human face image, a larger window will lead to extraction of the entire face, while a smaller one will focus on smaller items of the face such as eyes, lips, etc.
2.2 Segmentation
$$d_Y(x, x') = \left|\Omega_Y(x) - \Omega_Y(x')\right|,$$
$$d_{CrCb}(x, x') = \sqrt{\left[\Omega_{Cr}(x) - \Omega_{Cr}(x')\right]^2 + \left[\Omega_{Cb}(x) - \Omega_{Cb}(x')\right]^2}. \qquad (7)$$
$$\alpha(d_{Hyb}) = \begin{cases} 0 & \text{if } d_{Hyb} \le a \\ \dfrac{c}{2} + \dfrac{c}{2}\sin\left(\dfrac{\pi (d_{Hyb} - a)}{b - a} - \dfrac{\pi}{2}\right) & \text{if } a < d_{Hyb} < b \\ c & \text{if } d_{Hyb} \ge b \end{cases} \qquad (8)$$
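Equation (8) can be sketched as a small Python function; note the phase of the sine term is chosen here so that α rises smoothly from 0 at d_Hyb = a to c at d_Hyb = b (an assumption, since the printed equation is only partially legible):

```python
import math

def alpha(d_hyb, a, b, c):
    # Smooth threshold (eq. 8): 0 below a, c above b, sinusoidal ramp between.
    if d_hyb <= a:
        return 0.0
    if d_hyb >= b:
        return c
    # Phase -pi/2 makes the ramp continuous at both ends.
    return c / 2.0 + (c / 2.0) * math.sin(math.pi * (d_hyb - a) / (b - a) - math.pi / 2.0)

# With a = 1, b = 3, c = 2 the ramp passes through c/2 = 1 at the midpoint.
```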
threshold-based fusion of global and local saliency maps (detailed in subsection 2.1).
The parameter σ represents the radius of the Gaussian smoothing function, taking the value σ = 5 in our case. Equation (9) details the so-called FSM, where $M_F(x)$ (equation (10)) represents the result of the fusion of the global and local saliency maps.
$$M_F(x) = \begin{cases} D(x) & \text{if } D(x) < M(x) \\ M(x)\, D(x) & \text{otherwise} \end{cases} \qquad (10)$$
Once $SFM_{GB}(x)$ is available, for each homogeneous segment $S_i$ obtained from $d_{Hyb}(x, x')$ and belonging to the ensemble of segments (e.g. potential objects in the image) found using $d_{Hyb}(x, x')$ (e.g. $\forall S_i \in \{S_1, \ldots, S_i, \ldots, S_n\}$), the statistical central moment $\mu(S_i)$ and variance $Var(S_i)$ of $S_i$ are calculated from the corresponding pixels of that segment in the FSM (e.g. from $SFM_{GB}(x)$) according to the operations expressed in equation (11). These two statistical moments then serve to generate a binary map (e.g. a binary mask) filtering salient objects from the input image. The resulting map $M_{Mask}(x)$ is an image containing only salient objects, where trifling objects have been replaced by black pixels. Equation (12) details this last operational step.
$$\mu(S_i) = \frac{1}{|S_i|} \sum_{x \in S_i} SFM_{GB}(x), \qquad Var(S_i) = \frac{1}{|S_i|} \sum_{x \in S_i} \left(SFM_{GB}(x) - \mu(S_i)\right)^2. \qquad (11)$$
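The per-segment statistics of equation (11) and a simplified version of the masking step can be sketched as follows (the exact thresholding rule of equation (12) is not reproduced in this excerpt, so the mask below simply keeps segments whose mean saliency exceeds a hypothetical threshold):

```python
def segment_stats(sfm, segments):
    # Eq. (11): mean and variance of the fused saliency map over each segment.
    # `sfm` maps a pixel index to its saliency; `segments` maps id -> pixel list.
    stats = {}
    for sid, pixels in segments.items():
        vals = [sfm[p] for p in pixels]
        mu = sum(vals) / len(vals)
        var = sum((v - mu) ** 2 for v in vals) / len(vals)
        stats[sid] = (mu, var)
    return stats

def binary_mask(sfm, segments, threshold):
    # Simplified stand-in for eq. (12): salient segments -> 1, trifling -> 0.
    stats = segment_stats(sfm, segments)
    return {sid: int(stats[sid][0] >= threshold) for sid in segments}

saliency = {0: 1.0, 1: 1.0, 2: 0.0, 3: 0.2}
segs = {"object": [0, 1], "background": [2, 3]}
mask = binary_mask(saliency, segs, 0.5)  # {"object": 1, "background": 0}
```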
The designed system has been implemented on a Wifibot-M, a six-wheel mobile robot (from NEXTER Robotics). Initially constructed for local mobile surveillance applications, Wifibot-M is composed of a six-wheel-driven waterproof (IP64) polycarbonate chassis controllable over WIFI. The chassis is composed of three parts linked with a two-dimensional link. The Wifibot-M robot can handle devices such as an IP camera (MJPEG or MPEG) or any Ethernet sensors. A LiteStation2 router from UBNT is the main CPU allowing data transfer; however, a 5 GHz router can be added. The robot used in the frame of the present research is equipped with an analogue PTZ (Pan-Tilt-Zoom, three degrees of freedom) camera (WONWOO WCM-101), attached to the chassis through an AXIS M7001 video encoder. The robot can be controlled through a WIFI or Ethernet network connection.
A software controller is shipped with the Wifibot-M. Designed by the robot’s producers, this controller is programmed in C++ using WinAPI and DirectConnect technologies, making it “Windows-dedicated” software. This controller can be run on any Windows PC connected to the same network as the robot. It allows the usage of any Plug-and-Play device compatible with Windows. Thus, it makes the robot controllable by any Plug-and-Play joystick, or its virtual simulation, either to steer the robot’s movement or to collect data from the robot’s internal sensors, such as those relating to the robot’s speed or odometer data from its wheels: this is an interesting point regarding the aimed application involving a human operator. It can also switch the camera on/off, while it cannot deliver the image from this camera to the user. The image can be collected using either the Web interface of the AXIS Encoder or the AXIS Camera Client application, which is also Windows-only software. Despite the already available facilities mentioned above, the last point remains a drawback for the focused application, because operator-based guidance results in the necessity of combining both the movement controller and the video stream controller in the same unit. In fact the exploitation has to be lightweight for the user. Due to these conditions, a new controller, the “Wifibot-M iPy Controller”, has been designed, allowing:
- To connect via network, switch its camera on/off and/or control its wheels;
- To control robot’s movement by a Plug-and-Play joystick;
- To collect video stream from its camera and store frames as JPEG images;
- To control camera’s PTZ-routines;
- To detect salient objects from video stream in human-compatible real-time;
- To react consequently (thus in human-compatible real-time) on collected results.
From an architectural point of view, the designed controller can be seen as a four-module unit: movement controller, PTZ & video stream handler, salient regions detector and reaction strategies inspector. Fig. 5 depicts the general block diagram of this architecture. The main task-managing module includes a Graphical User Interface (GUI) and a number of standard task-handling routines. That is why this part is not shown in Fig. 5.
The validation scenario has been based on a real outdoor fire detection situation. In order to make the validation scenario compatible with plausible fire-fighting
circumstances, the scenario has been realized at the robot’s scale. Relative to the Wifibot-M robot’s size, this means an area of some 300 m² (typically a 25 × 12 m² action area) and an 80-to-100-centimeter-high and 100-centimeter-wide fire with smoke somewhere in that area. This also ensures a correct WIFI connection (both regarding the network connection’s quality and the relative simplicity of the required supply deployment in outdoor conditions) and a correct energetic autonomy of the robot, allowing it to perform several experimental tests over several hours if required.
Fig. 6. Experimental area showing the robot with its camera and the combustion zone
Several tests, supposing the robot moving toward and around the combustion (fire) zone with the aim of detecting the fire’s shape as a salient target, have been realized. Fig. 6 depicts the experimental setup and the fire’s perimeter. Fig. 7 gives the obtained results, showing in the left-side pictures the robot’s camera view of the scenery and in the right-side pictures the detected salient combustion area. As is visible from the right-side pictures of Fig. 7, the salient combustion zone is correctly detected. Moreover, it is pertinent to note that both the fire’s outline and the smoke’s perimeter have been correctly detected and recognized as salient objects and events. The detection of the smoky perimeter as a salient item of the scenery is an interesting point, because woodland fires often generate a smoky atmosphere which may be used as an early-stage salient indicator in early detection of an upcoming woodland fire disaster.
134 V. Kachurka et al.
Fig. 7. Pictures extracted from the robot’s video stream showing the robot’s view of scenery (left-side pictures) and the corresponding detected fire’s outline (right-side pictures)
A Learning Technique for Deep Belief Neural
Networks
1 Introduction
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 136–146, 2014.
© Springer International Publishing Switzerland 2014
this, the next layer is trained, etc. As a result the initialization of the neural network is performed, and we can use supervised learning for fine-tuning the parameters of the whole neural network.
The second technique for DBNN training uses the auto-encoder approach for pre-training of each layer, after which a fine-tuning method is applied in a supervised manner. In this case we first train the first layer as an autoassociative neural network in order to minimize the reconstruction error. Then the hidden units are used as the input for the next layer, and the next layer is trained as an auto-encoder. Finally, fine-tuning of all parameters of the neural network in a supervised way is performed.
In this work we propose a new interpretation of the learning rules for the RBM. The conventional approach to training the RBM uses an energy-based model. The proposed approach is based on minimization of the reconstruction mean square error, which we can obtain using simple iterations of Gibbs sampling. We have shown that the classical equations for DBNN training are a particular case of the proposed technique.
The rest of the paper is organized as follows. Section 2 describes the conventional approach for restricted Boltzmann machine training based on the energy model. In Sect. 3 we propose a novel approach for the inference of RBM training rules. Section 4 demonstrates the results of experiments, and finally Sect. 5 gives the conclusion.
Fig. 1. DBNN training approaches: RBM-based and autoencoder-based
$$S_j^k = \sum_{i} w_{ij}^k y_i^{k-1} + T_j^k, \qquad (2)$$
where F is the activation function, $S_j^k$ is the weighted sum of the j-th unit, $w_{ij}^k$ is the weight from the i-th unit of the (k − 1)-th layer to the j-th unit of the k-th layer, and $T_j^k$ is the threshold of the j-th unit.
For the first layer
$$y_i^0 = x_i. \qquad (3)$$
In the general case we can write
$$Y^k = F(S^k) = F(W^k Y^{k-1} + T^k), \qquad (4)$$
where $W^k$ is a weight matrix, $Y^{k-1}$ is the output vector of the (k − 1)-th layer, and $T^k$ is the threshold vector.
It should also be noted that the output of the DBNN is often defined using the softmax function:
$$y_j^F = \mathrm{softmax}(S_j) = \frac{e^{S_j}}{\sum_l e^{S_l}}. \qquad (5)$$
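A minimal sketch of the forward pass of equations (4)–(5) in Python (an assumed convention here: weights are stored as w[i][j], from unit i of the previous layer to unit j of the current one):

```python
import math

def layer_forward(y_prev, w, t, activation):
    # Eq. (4): Y^k = F(W^k Y^{k-1} + T^k) for a single layer.
    s = [sum(w[i][j] * y_prev[i] for i in range(len(y_prev))) + t[j]
         for j in range(len(t))]
    return [activation(v) for v in s]

def softmax(s):
    # Eq. (5): normalized exponentials of the weighted sums.
    e = [math.exp(v) for v in s]
    z = sum(e)
    return [v / z for v in e]

# Identity weights and a linear activation pass the input through unchanged;
# softmax of two equal sums splits the probability mass evenly.
out = softmax(layer_forward([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],
                            lambda v: v))
```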
Let’s examine the restricted Boltzmann machine, which consists of two layers: visible and hidden (Fig. 3). The restricted Boltzmann machine is part of the deep belief neural network and can represent any discrete distribution if enough hidden units are used [5].
The layers of neural units are connected by bidirectional weights W. In most cases binary units are used [1–3]. The RBM is a stochastic neural network, and the states of the visible and hidden units are defined using a probabilistic version of the sigmoid activation function:
$$p(y_j \mid x) = \frac{1}{1+e^{-S_j}}, \qquad (6)$$
$$p(x_i \mid y) = \frac{1}{1+e^{-S_i}}. \qquad (7)$$
The energy function of the binary state (x, y) is defined as
$$E(x, y) = -\sum_i x_i T_i - \sum_j y_j T_j - \sum_{i,j} x_i y_j w_{ij}. \qquad (8)$$
$$Z = \sum_{x,y} e^{-E(x,y)}, \qquad (10)$$
$$\frac{\partial \log P(x)}{\partial w_{ij}} = \langle x_i y_j \rangle_d - \langle x_i y_j \rangle_r, \qquad (11)$$
$$\frac{\partial \log P(x)}{\partial T_i} = \langle x_i \rangle_d - \langle x_i \rangle_r, \qquad (12)$$
$$\frac{\partial \log P(x)}{\partial T_j} = \langle y_j \rangle_d - \langle y_j \rangle_r. \qquad (13)$$
As a result we can obtain the RBM training rules [2] as follows:
$$w_{ij}(t+1) = w_{ij}(t) + \alpha\left(\langle x_i y_j \rangle_d - \langle x_i y_j \rangle_r\right), \qquad (14)$$
$$T_i(t+1) = T_i(t) + \alpha\left(\langle x_i \rangle_d - \langle x_i \rangle_r\right), \qquad (15)$$
$$T_j(t+1) = T_j(t) + \alpha\left(\langle y_j \rangle_d - \langle y_j \rangle_r\right). \qquad (16)$$
Here α is the learning rate, $\langle \cdot \rangle_d$ denotes the expectation under the data distribution and $\langle \cdot \rangle_r$ denotes the expectation under the reconstructed (model) data distribution. Since computing the expectation using the RBM model is very difficult, Hinton proposed to use an approximation of that term called contrastive divergence (CD) [2]. It is based on Gibbs sampling. In this case the first term in equation (11) denotes the data distribution at time t = 0 and the second term is the distribution of reconstructed states at step t = n. Therefore the CD-n procedure can be represented as follows:
x0 → y0 → x1 → y1 → . . . → xn → yn . (17)
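The CD-1 special case of this procedure, together with updates (14)–(16), can be sketched in Python for a small binary RBM (activation probabilities are used in place of the expectations, a common practical simplification; variable names are illustrative):

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def cd1_update(x0, w, t_vis, t_hid, alpha=0.1, rng=random):
    # One CD-1 step: x0 -> y0 -> x1 -> y1 (eq. 17 with n = 1).
    n, m = len(x0), len(t_hid)
    # Hidden probabilities given the data; binary states sampled from them.
    py0 = [sigmoid(sum(w[i][j] * x0[i] for i in range(n)) + t_hid[j])
           for j in range(m)]
    y0 = [1 if rng.random() < p else 0 for p in py0]
    # Reconstruction of the visible layer, then hidden probabilities from it.
    px1 = [sigmoid(sum(w[i][j] * y0[j] for j in range(m)) + t_vis[i])
           for i in range(n)]
    py1 = [sigmoid(sum(w[i][j] * px1[i] for i in range(n)) + t_hid[j])
           for j in range(m)]
    # Updates (14)-(16): data statistics minus reconstruction statistics.
    for i in range(n):
        for j in range(m):
            w[i][j] += alpha * (x0[i] * py0[j] - px1[i] * py1[j])
    for i in range(n):
        t_vis[i] += alpha * (x0[i] - px1[i])
    for j in range(m):
        t_hid[j] += alpha * (py0[j] - py1[j])
```

Starting from zero weights and thresholds, a single step pushes the visible threshold of an active unit upward, since x0[i] = 1 is compared against a reconstruction probability of 0.5.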
Fig. 2. Layer-wise structure of the deep belief neural network with weights W1, W2, W3, W4
Fig. 3. Restricted Boltzmann machine with visible units x1, . . . , xn and hidden units y1, . . . , ym
$$S_j(0) = \sum_i w_{ij} x_i(0) + T_j. \qquad (19)$$
The inverse layer reconstructs the data from the hidden layer. As a result we can obtain x(1) at time 1:
$$x_i(1) = F(S_i(1)), \qquad (20)$$
$$S_i(1) = \sum_j w_{ij} y_j(0) + T_i. \qquad (21)$$
After this, x(1) is fed to the visible layer and we can obtain the output of the hidden layer in the following way:
$$y_j(1) = F(S_j(1)), \qquad (22)$$
$$S_j(1) = \sum_i w_{ij} x_i(1) + T_j. \qquad (23)$$
The purpose of training this neural network is to minimize the reconstruction mean squared error (MSE):
$$E_s = \frac{1}{2}\sum_{k=1}^{L}\sum_{i=1}^{n}\left(x_i^k(1) - x_i^k(0)\right)^2 + \frac{1}{2}\sum_{k=1}^{L}\sum_{j=1}^{m}\left(y_j^k(1) - y_j^k(0)\right)^2, \qquad (24)$$
where L is the number of training patterns. If we use online training of the RBM, the weights and thresholds are updated iteratively in accordance with the following rules:
$$w_{ij}(t+1) = w_{ij}(t) - \alpha \frac{\partial E}{\partial w_{ij}(t)}, \qquad (25)$$
$$T_i(t+1) = T_i(t) - \alpha \frac{\partial E}{\partial T_i(t)}, \qquad (26)$$
$$T_j(t+1) = T_j(t) - \alpha \frac{\partial E}{\partial T_j(t)}. \qquad (27)$$
The cost function E for one sample is defined by the expression:
$$E = \frac{1}{2}\sum_i \left(x_i(1) - x_i(0)\right)^2 + \frac{1}{2}\sum_j \left(y_j(1) - y_j(0)\right)^2. \qquad (28)$$
Differentiating (28) with respect to $w_{ij}$, $T_i$ and $T_j$, we can get the following training rules for the RBM network:
$$w_{ij}(t+1) = w_{ij}(t) - \alpha\left[\left(x_i(1) - x_i(0)\right) F'(S_i(1))\, y_j(0) + \left(y_j(1) - y_j(0)\right) F'(S_j(1))\, x_i(1)\right], \qquad (29)$$
$$T_i(t+1) = T_i(t) - \alpha\left(x_i(1) - x_i(0)\right) F'(S_i(1)), \qquad (30)$$
$$T_j(t+1) = T_j(t) - \alpha\left(y_j(1) - y_j(0)\right) F'(S_j(1)). \qquad (31)$$
We can use these rules for training the RBM network on any data (binary and real). Let’s examine the interrelation between the conventional and the proposed RBM training rules. Let us suppose that a linear activation function is used. This is equivalent to
$$F'(S_i(1)) = \frac{\partial x_i(1)}{\partial S_i(1)} = 1 \quad \text{and} \quad F'(S_j(1)) = \frac{\partial y_j(1)}{\partial S_j(1)} = 1. \qquad (32)$$
Then the training rules can be transformed in the following way:
$$T_i(t+1) = T_i(t) - \alpha \frac{\partial E_s}{\partial T_i(t)}, \qquad (43)$$
$$T_j(t+1) = T_j(t) - \alpha \frac{\partial E_s}{\partial T_j(t)}. \qquad (44)$$
Then we can obtain the following equations for RBM training using the CD-n procedure:
$$w_{ij}(t+1) = w_{ij}(t) - \alpha \sum_{k=1}^{L}\left[\left(x_i^k(n) - x_i^k(0)\right) F'(S_i^k(n))\, y_j^k(n-1) + \left(y_j^k(n) - y_j^k(0)\right) F'(S_j^k(n))\, x_i^k(n)\right], \qquad (45)$$
$$T_i(t+1) = T_i(t) - \alpha \sum_{k=1}^{L}\left(x_i^k(n) - x_i^k(0)\right) F'(S_i^k(n)), \qquad (46)$$
$$T_j(t+1) = T_j(t) - \alpha \sum_{k=1}^{L}\left(y_j^k(n) - y_j^k(0)\right) F'(S_j^k(n)). \qquad (47)$$
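A sketch of the REBA rules (45)–(47) for the CD-1 case (n = 1) with a single pattern (L = 1) and sigmoid units, so that F'(S) = F(S)(1 − F(S)); this is an illustrative deterministic version using activation probabilities rather than sampled states:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def reba_update(x0, w, t_vis, t_hid, alpha=0.1):
    # One REBA step: gradient of the reconstruction error for CD-1.
    n, m = len(x0), len(t_hid)
    s_h0 = [sum(w[i][j] * x0[i] for i in range(n)) + t_hid[j] for j in range(m)]
    y0 = [sigmoid(s) for s in s_h0]                      # hidden activations y(0)
    s_v1 = [sum(w[i][j] * y0[j] for j in range(m)) + t_vis[i] for i in range(n)]
    x1 = [sigmoid(s) for s in s_v1]                      # reconstruction x(1)
    s_h1 = [sum(w[i][j] * x1[i] for i in range(n)) + t_hid[j] for j in range(m)]
    y1 = [sigmoid(s) for s in s_h1]                      # hidden activations y(1)
    for i in range(n):
        # (x_i(1) - x_i(0)) F'(S_i(1)) with the sigmoid derivative x1(1 - x1).
        dx = (x1[i] - x0[i]) * x1[i] * (1.0 - x1[i])
        t_vis[i] -= alpha * dx
        for j in range(m):
            # (y_j(1) - y_j(0)) F'(S_j(1)) with the derivative y1(1 - y1).
            dy = (y1[j] - y0[j]) * y1[j] * (1.0 - y1[j])
            w[i][j] -= alpha * (dx * y0[j] + dy * x1[i])
    for j in range(m):
        dy = (y1[j] - y0[j]) * y1[j] * (1.0 - y1[j])
        t_hid[j] -= alpha * dy
```

Unlike the contrastive divergence update, the derivative factors do not disappear here; setting them to 1 recovers the linear-unit special case discussed above.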
In this section we have obtained the training rules for the restricted Boltzmann machine. The approach is based on minimization of the reconstruction mean square error, which we can obtain using simple iterations of Gibbs sampling. The proposed approach permits taking into account the derivatives of the nonlinear activation function of the neural network units. We will call the proposed approach the reconstruction-error-based approach (REBA). It was shown that the classical equations for RBM training are a particular case of the proposed technique.
4 Experimental Results
To assess the performance of the proposed learning technique, experiments were conducted on an artificial data set. The artificial data x lie on a one-dimensional manifold (a helical loop) embedded in three dimensions [9] and were generated from a uniformly distributed factor t in the range [−1, 1]:
$$x_1 = \sin(\pi t), \quad x_2 = \cos(\pi t), \quad x_3 = t. \qquad (48)$$
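The sampling of equation (48) can be sketched as follows (seeded for reproducibility; every generated point lies exactly on the unit-radius helical loop):

```python
import math
import random

def helix_points(n, rng):
    # Eq. (48): t ~ U[-1, 1] mapped onto a helical loop in three dimensions.
    pts = []
    for _ in range(n):
        t = rng.uniform(-1.0, 1.0)
        pts.append((math.sin(math.pi * t), math.cos(math.pi * t), t))
    return pts

data = helix_points(100, random.Random(42))
# Every point satisfies x1^2 + x2^2 = 1 and x1 = sin(pi * x3).
```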
144 V. Golovko et al.
Figure 6 depicts the evolution of the mean square error over the training epochs for the first layer of the deep auto-encoder. The number of epochs for training each layer is 50.
It is evident from the simulation results that the use of the REBA technique can improve the generalization capability of the deep auto-encoder in the case of CD-1 and CD-5.
Figure 7 shows the original training data and the data reconstructed from one component, using test data. As can be seen, the auto-encoder reconstructs the data from one nonlinear component with good accuracy.
Fig. 6. Mean square error versus training epochs (0–50) for the first layer of the deep auto-encoder
Fig. 7. Original training data and the data reconstructed from one nonlinear component
5 Conclusion
In this paper we have addressed some key aspects of deep belief neural network training. We described both the traditional energy-based method, which relies on a linear representation of neural units, and the proposed approach, which is based on nonlinear neurons. The proposed approach is based on minimization of the reconstruction mean square error, which we can obtain using simple iterations of Gibbs sampling. As can be seen, the classical equations for RBM training are a particular case of the proposed technique. The simulation results demonstrate the efficiency of the proposed technique.
References
1. Hinton, G.E., Osindero, S., Teh, Y.: A fast learning algorithm for deep belief nets.
Neural Computation 18, 1527–1554 (2006)
2. Hinton, G.: Training products of experts by minimizing contrastive divergence. Neu-
ral Computation 14, 1771–1800 (2002)
3. Hinton, G., Salakhutdinov, R.: Reducing the dimensionality of data with neural
networks. Science 313(5786), 504–507 (2006)
4. Hinton, G.E.: A practical guide to training restricted Boltzmann machines (Tech.
Rep. 2010–000). Machine Learning Group, University of Toronto, Toronto (2010)
5. Bengio, Y.: Learning deep architectures for AI. Foundations and Trends in Machine
Learning 2(1), 1–127 (2009)
6. Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep networks. In: Schölkopf, B., Platt, J.C., Hoffman, T. (eds.) Advances in Neural Information Processing Systems, vol. 19, pp. 153–160. MIT Press, Cambridge (2007)
7. Erhan, D., Bengio, Y., Courville, A., Manzagol, P.-A., Vincent, P., Bengio, S.: Why
does unsupervised pre-training help deep learning? Journal of Machine Learning
Research 11, 625–660 (2010)
8. Golovko, V., Vaitsekhovich, H., Apanel, E., Mastykin, A.: Neural network model
for transient ischemic attacks diagnostics. Optical Memory and Neural Networks
(Information Optics) 21(3), 166–176 (2012)
9. Scholz, M., Fraunholz, M., Selbig, J.: Nonlinear principal component analysis: neural
network models and applications. In: Principal Manifolds for Data Visualization and
Dimension Reduction, pp. 44–67. Springer, Heidelberg (2008)
Modeling Engineering-Geological Layers
by k-nn and Neural Networks
1 Introduction
The online presentation of geological data and information, fostered by the development of international spatial data infrastructures, demands effective numerical algorithms for visualization and analysis of geological structures. The public institutions in different countries, the National Geological Surveys, are collecting and managing very large datasets of geological boreholes (exceeding tens of thousands of boreholes), which are then the basis for preparation of geological maps, cross-sections and geological models (2D, 3D, and 4D). One of the goals of this paper was to assess the practical potential of statistical learning systems as effective numerical tools for large geological datasets. The two methods presented in the paper are: k-nn (k nearest neighbours) and neural network approximation [1-7]. Both methods are characterized by elastic adaptation to the natural complexity of the problem, and they have vast areas of practical application. The paper is an attempt to answer whether the k-nn and neural network approximation methods can be useful for interpolation of the roof surfaces of engineering-geological layers. The potential areas of application are discussed (e.g. lightweight mobile applications or online geo-reporting GIS systems).
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 147–158, 2014.
© Springer International Publishing Switzerland 2014
148 S. Jankowski et al.
The use of neural network algorithms for the construction of geological models has already been discussed by many authors [8, 11, 12]. The presented paper is an attempt to continue research on this topic by applying statistical learning systems to a geological dataset of 1083 boreholes from the Engineering-Geological Atlas of Warsaw database managed by the Polish Geological Survey.
There are two main methods of engineering-geological layer modeling:
• Function approximation [8] f: R2→R – the depth of the layer roof is calculated from 2 arguments (point coordinates). One fundamental drawback is that a complex geometry of the layer (e.g. containing folds) causes the method to fail. Its advantages are simplicity and short computation time.
• A volumetric approach based on classification of the voxels. There are two drawbacks of the volumetric method: it requires a lot of resources (e.g. memory, computation time), and some post-processing is needed for visualization of such data. However, this method enables modeling of layers of arbitrary shapes.
2 Analyzed Dataset
Fig. 1. The analyzed dataset consists of 1083 geological boreholes from an area of 3,5 x
4,0 km, located in the center of Warsaw
Fig. 2. Example of a classically generated map of the Pliocene clay roof, using the ordinary kriging method (from Polish Geological Institute archival materials)
For the purpose of this paper the analysis of the geological data described above was focused mostly on the engineering-geological layers of fluvioglacial sands (engineering-geological layer no. 15) and Pliocene clays (layer no. 27).
The Pliocene clays in the area of Warsaw were highly glacitectonically deformed during the latest glacial period; their roof surface morphology is very undulated. In the depressed areas of the Pliocene clays' roof surface there are often accumulated layers of glacial sands, which are saturated with pressurized groundwater that cannot infiltrate through the impermeable layers of underlying Pliocene clays. This kind of geological conditions in Warsaw is very unfavorable and can cause severe dangers during earthworks or underground infrastructure construction (e.g. construction of new metro lines).
From the geological point of view it is very important to know the morphology of the Pliocene clays and the spatial distribution of the discontinuous lenses of fluvioglacial sands to properly and accurately identify the geological risk.
Table 1. Example data used for approximation. The dataset contained 10466 records from 1083
boreholes.
Borehole name | Roof of layer [m, depth] | X coordinate, local system [m] | Y coordinate, local system [m] | Z coordinate (ground level), local system [m] | Symbol | Engineering-geological layer | Description
O-2/226 | 0,00 | -1210,0 | -1518,0 | 114,60 | NN | 1 | made ground (variable composition)
O-2/226 | 0,30 | -1210,0 | -1518,0 | 114,60 | ML | 12 | silt
O-2/226 | 4,00 | -1210,0 | -1518,0 | 114,60 | SW-SC | 14 | clayey sand
O-2/226 | 4,60 | -1210,0 | -1518,0 | 114,60 | SM | 16 | silty sand
O-2/226 | 5,40 | -1210,0 | -1518,0 | 114,60 | SP | 16 | fine grained sand with clay interbeddings
O-2/226 | 6,20 | -1210,0 | -1518,0 | 114,60 | ML | 16 | silt
The structure of the geological data used for computations is presented in Table 1. Each record in the table represents the roof depth of a unique lithological layer documented in each borehole. The coordinates of each layer roof depth point in the borehole profile are spatially localized by x, y, z coordinates, given in the local coordinate system (Warszawa 75). Each lithological layer has its symbol (a standardized abbreviation) and a short description. The numbers of engineering-geological layers are ascribed to each lithological layer. It can be seen as an example that, in Table 1, all lithological layers from the depth of 4,60 m b.g.l. down to the end of the borehole are classified as engineering-geological layer no. 16.
Fig. 4. Volumetric representation of space: each voxel is described by a tuple (x, y, z, id)
The models are based on the volumetric representation of space. Each voxel is described by its coordinates and the identifier of the engineering-geological layer to which it belongs (fig. 4).
Preparation of the layer model consists of the following stages:
• Creation of the learning data set – the available data (described in section 2), after removal of artifacts, is transformed (e.g. normalized) for use with the classifier.
• Learning of the classifier – the neural network; in the case of k-nn this stage is skipped.
• Classification of the voxels in the considered area.
• A surface model can be prepared based on the obtained volumetric data of the considered layer.
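The voxel classification stage of the pipeline above can be sketched as follows (the `classifier` argument stands for any trained predictor — k-nn or neural — mapping a coordinate to a layer id; names are illustrative):

```python
def classify_voxels(grid_points, classifier):
    # Label every voxel with the engineering-geological layer id predicted
    # for its (x, y, z) coordinate, yielding the volumetric (x, y, z, id) form.
    return [(x, y, z, classifier((x, y, z))) for (x, y, z) in grid_points]

# Toy rule: everything below z = 1 belongs to layer 15, the rest to layer 27.
voxels = classify_voxels([(0, 0, 0), (0, 0, 1)],
                         lambda p: 15 if p[2] < 1 else 27)
```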
Validation
In order to validate the classifiers, the false positive (FP), false negative (FN), true negative (TN) and true positive (TP) counts were collected, and the precision and recall [9, 10] parameters were calculated. Precision is defined as
$$\mathrm{precision} = \frac{TP}{TP + FP} \cdot 100\% \qquad (1)$$
$$\mathrm{recall} = \frac{TP}{TP + FN} \cdot 100\% \qquad (2)$$
$$\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \cdot 100\% \qquad (3)$$
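These validation measures can be sketched directly from the confusion counts (an assumption here: the third, unnamed formula is taken to be overall accuracy, since only precision and recall are named in the text):

```python
def precision(tp, fp):
    # Share of predicted positives that are correct, in percent.
    return tp / (tp + fp) * 100.0

def recall(tp, fn):
    # Share of actual positives that are recovered, in percent.
    return tp / (tp + fn) * 100.0

def accuracy(tp, tn, fp, fn):
    # Assumed third measure: share of all samples classified correctly.
    return (tp + tn) / (tp + tn + fp + fn) * 100.0

# Example: 8 TP, 2 FP, 8 FN, 2 TN.
p = precision(8, 2)       # 80.0
r = recall(8, 8)          # 50.0
a = accuracy(8, 2, 2, 8)  # 50.0
```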
Fig. 5. The terrain surface obtained by Delaunay triangulation of borehole data according to the engineering-geological profile. Vertices of triangles represent borehole locations. The model was the base for the cross-sections shown in figures 6, 7, 8.
algorithm is used in a wide range of classification problems, such as medical diagnosis, image processing, and prediction of properties of amino acid sequences. The popularity of KNN is based on its simplicity and effectiveness.
The data set D is defined as:
An arbitrary data point $x_i$ can be described by a vector $[x_{i1}, x_{i2}, \ldots, x_{id}]$, where d is the dimension of the sample space. The distance between two data points $x_i$ and $x_j$ is defined to be $\mathrm{dist}(x_i, x_j)$, where
$$\mathrm{dist}(x_i, x_j) \equiv \sqrt{\sum_{r=1}^{d} \left(x_{ir} - x_{jr}\right)^2} \qquad (5)$$
$$f : \mathbb{R}^n \to L \qquad (6)$$
where L is the finite set of labels of data points {−1, 1}. During the training phase all labeled points are added to the list of training examples. The classification of a data point $x_q$ can be described as
$$f(x_q) \leftarrow \arg\max_{t \in L} \sum_{i=1}^{k} \delta\left(t, f(x_i)\right) \qquad (7)$$
where
$$\delta(a, b) = \begin{cases} 1 & a = b \\ 0 & a \neq b \end{cases}$$
and $x_1, \ldots, x_k$ denote the k instances from the training examples that are nearest to $x_q$.
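Equations (5)–(7) can be sketched as a compact Python classifier (labels in {−1, 1}, the training examples given as (point, label) pairs; names are illustrative):

```python
import math
from collections import Counter

def dist(xi, xj):
    # Eq. (5): Euclidean distance between two d-dimensional points.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def knn_classify(xq, examples, k=5):
    # Eq. (7): majority vote among the k nearest training examples.
    nearest = sorted(examples, key=lambda e: dist(xq, e[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((5, 5), 1), ((5, 6), 1)]
label = knn_classify((0.2, 0.2), train, k=3)  # -1: all three neighbors agree
```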
The k parameter (nearest neighbors count) was set to 5. Several experiments were made in order to determine the best value. It turned out that when the k value is increased the recall increases, but precision decreases. The original coordinates were scaled by a factor of 3.23⋅10^4 along the X axis, 2.64⋅10^4 along the Y axis and 1.49⋅10^2 along the Z axis.
The results of the KNN classification on the learning set and the test set are shown in table 2. The data set contained 5846 points. The learning set consisted of 5310 randomly selected points; 536 points were left out for testing purposes and were not included in the learning set. The cross-section of the obtained model, shown in fig. 6, was used for validation by the geologists.
The results of the neural network classification on the learning set and the test set are shown in table 3. The data set contained 5846 points. The learning set consisted of 5310 randomly selected points; 536 points were left out for testing purposes and were not included in the learning set. The accuracy of classification enables creation of volumetric models. Selected cross-sections of the models are shown in fig. 7, 8, which were used for validation by geologists.
Table 3. Results of neural network classification of voxels (layer 15) on learning and test set
The results obtained by the KNN and neural classifiers on the test set are comparable. The KNN performs much better on the learning set due to the local nature of this model. The shapes of the engineering-geological layers obtained by the neural classifiers are smoother than the shapes generated by the KNN method (fig. 6, 8). A significant difference between the models lies in the time needed for creation of the model and classification of an unknown sample. The creation of a KNN model requires only collecting the data set containing example points and their labels. The neural classifier needs an additional (computationally intensive) step consisting of calculation of the neuron weights. However, once the neural model is ready, prediction is very fast. The KNN classifier requires a data search for each predicted point.
4 Conclusions
The use of volumetric methods for modeling of engineering-geological layers is not a common approach. According to the results obtained from the analyses performed in this paper, the voxel classification method offers attractive possibilities to model complex, discontinuous shapes characteristic of geological layers. This method enables predicting engineering-geological layers in unmeasured areas between boreholes. The results can be easily visualized and presented to the experts for validation.
The method of neural network classification was applied to model the surfaces of engineering-geological layers no. 15 (Pleistocene fluvioglacial sands) as a discontinuous layer and no. 27 (Pliocene clays) as a continuous discrete layer. The shapes generated with the use of the neural classification are gently and irregularly undulated. This is a good form of representation, especially for the glacitectonically deformed surfaces of Pliocene clays from the area of Warsaw. Also the effect of modeling the discontinuous layer of fluvioglacial sands is interesting. The shapes of the sand lenses generated by the model seem to represent the geological conditions in an appropriate way.
The effects of volumetric modeling with supervised learning methods can be presented in the form of a cloud of points generated by the numerical algorithms presented in the paper. Such data can then be easily processed in different visualization environments, like GIS software platforms or geological 3D modeling software. The geological model based on data processed by methods like neural approximation and k-nn can be used to generate cross-sections and engineering-geological maps at certain, user-defined depths.
Concluding, the presented methods can be assessed as a useful tool for engineering-geological layer modeling, with a broad scope of practical applications.
158 S. Jankowski et al.
References
1. Dreyfus, G.: Neural Networks - Methodology and Applications. Springer, Heidelberg
(2005)
2. Cover, T.M., Hart, P.E.: Nearest Neighbor Pattern Classification. IEEE Transactions on
Information Theory 13(1), 21–27 (1967)
3. Kulkarni, S.R., Lugosi, G., Venkatesh, S.S.: Learning Pattern Classification—A Survey.
IEEE Transactions on Information Theory 44(6), 2178–2206 (1998)
4. Shakhnarovich, G., Darrell, T., Indyk, P. (eds.): Nearest-Neighbor Methods in Learning
and Vision. MIT Press (2006)
5. Alippi, C., Fuhrman, N., Roveri, M.: k-NN classifiers: investigating the k=k(n)
relationship. In: Proc. 2008 International Joint Conference on Neural Networks (IJCNN
2008), pp. 3676–3680 (2008)
6. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning - Data
Mining, Inference, and Prediction. Springer (2008)
7. Mitchell, T.M.: Machine Learning. McGraw-Hill Science/Engineering/Math (1997)
8. Kraiński, A., Mrówczyńska, M.: The use of neural networks in the construction of
geological model of the Głogów-Baruth Ice-Marginal Valley in the Nowa Sól area,
Poland. Prz. Geol. 60, 650–656 (2012)
9. Makhoul, J., Kubala, F., Schwartz, R., et al.: Performance measures for information
extraction. In: Proc. DARPA Broadcast News Workshop, Herndon, VA (1999)
10. van Rijsbergen, C.V.: Information Retrieval. Butterworth, London (1975)
11. Kumar, J., Konno, M., Yasuda, N.: Subsurface Soil-Geology Interpolation Using Fuzzy
Neural Network. J. Geotech. Geoenviron. Eng. 126(7), 632–639
12. Mohseni-Astani, R., Haghparast, P., Bidgoli-Kashani, S.: Assessing and Predicting the
Soil Layers Thickness and Type Using Artificial Neural Networks - Case Study in Sari
City of Iran. Middle-East Journal of Scientific Research 6(1), 62–68 (2010)
A Hybrid Genetic Algorithm and Radial Basis
Function NEAT
Keywords: genetic algorithm, radial basis function, car racing strategy, double
pole balancing.
1 Introduction
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 159–170, 2014.
© Springer International Publishing Switzerland 2014
160 H. Mohabeer and K.M. Sunjiv Soyjaudah
One way in which agents based on ANNs can evolve their behavior is to allow
them to change their internal structure, thereby exhibiting plasticity. In this way, they
follow the same principle as real organisms adapting to changing and unpredictable
environments [7] [8] [9]. In 2002, Stanley et al. developed NeuroEvolution of
Augmenting Topologies (NEAT) [10], which makes use of genetic algorithms (GA) to
evolve. Although not new [11] [12], the idea of speciation supported by historical markings
was introduced in a more efficient manner [10] in order to optimize the functionality
of NEAT. Benchmark problems such as pole balancing and double pole
balancing have been solved with greater efficiency and lower complexity compared
to conventional NE and evolutionary programming [13] [14].
Even though NEAT has performed well on many reinforcement learning problems
[15] [16] [18], on a high-level control domain such as the racing strategy [19]
it has performed rather poorly. One approach to such problems
was to adopt the concept of the fractured domain hypothesis. Kohl et al. [30]
successfully improved the performance of NEAT by using radial basis function (RBF)
networks instead of the classical multilayer perceptron (MLP). It was reported in [20] that
value-function reinforcement learning methods normally benefit from approximating the
value function with a local function approximator such as an RBF network. Stone et al.
used the benchmark keepaway soccer domain to demonstrate that an RBF-based value
function approximator allowed local behavioral adjustments [22] [23], thus
outperforming a normal neural network value function approximator.
Such results were promising for neuroevolution learning. Although this method
was successful, it did not provide a plausible explanation as to why NEAT performed
poorly on fractured domains. This paper introduces a new method that makes use of
GA to help RBF-NEAT solve reinforcement learning problems. To demonstrate the
potential of this approach, comparisons are performed in the racing strategy
domain [19] and on the double pole balancing problem.
By allowing the GA to act as a smoothing operator on a given problem prior to its
exposure to the RBF-NEAT algorithm, the performance of the algorithm increases
significantly. In fact, it outperforms the traditional RBF-NEAT in fractured domains,
suggesting a powerful new means of solving high-level behavior tasks as well as
conventional reinforcement learning tasks [15].
The remainder of the paper is organized as follows: Section 2 provides a brief
overview of the theories involved. Section 3 describes the methodology adopted
to implement the proposed system. Section 4 discusses the results obtained and
compares them with various other algorithms used to solve the same problems.
Finally, Section 5 concludes the proposed work and discusses future work that may
arise from it.
2 Reinforcement Learning
Reinforcement learning (RL) is concerned with how an agent (actor) ought to take
actions in an environment so as to maximize some notion of cumulative reward. RL
learns policies for agents acting in an unknown stochastic world, observing the states
that occur and the rewards that are given at each step [24]. In this way the agent
progresses towards the desired solution iteratively. Reinforcement learning problems
can be divided into two categories: low-level and high-level
reinforcement learning problems. One example of a low-level RL problem is the
non-Markovian double pole balancing task discussed below, while the racing
strategy is presented as a high-level RL problem. Both of these problems are used
later as benchmarks to test the proposed algorithm.
Non-Markovian double pole balancing (DPNV) [25] can be considered a difficult
benchmark task for control optimization. Stanley et al. [10] compared the results of
the neuroevolution methods that have reportedly solved the DPNV problem:
Cellular Encoding (CE) [26] and Enforced Sub-Populations (ESP) [27]; NEAT
outperformed all the other algorithms.
The double pole balancing setup (see Figure 1) consists of a cart with mass
mcar = 1 kg and one degree of freedom x, on which two poles of different lengths
l1 = 1 m and l2 = 0.1 m are mounted. The poles have the masses m1 = 1 kg and
m2 = 0.1 kg. Based on the measured values of the joint angles θ1 and θ2 and the
position of the cart x, the controller is required to balance both of the poles by
applying a force Fx (with a maximal magnitude Fmax = 10 N). Assuming rigid
body dynamics and neglecting friction, the system can be described by the
equations of motion shown below.
$$\ddot{x} = \frac{F_x + \sum_{i=1}^{2}\tilde{F}_i}{m_{car} + \sum_{i=1}^{2}\tilde{m}_i}, \qquad \ddot{\theta}_i = -\frac{3}{4\,l_i}\left(\ddot{x}\cos\theta_i + g\sin\theta_i\right),$$

$$\tilde{F}_i = m_i\,l_i\,\dot{\theta}_i^2\sin\theta_i + \tfrac{3}{4}\,m_i\,g\,\cos\theta_i\sin\theta_i, \qquad \tilde{m}_i = m_i\left(1 - \tfrac{3}{4}\cos^2\theta_i\right),$$

where $\tilde{F}_i$ is the effective force of the $i$-th pole on the cart, $\tilde{m}_i$ is its effective mass, and $g = -9.8\ \mathrm{m/s^2}$.
The numerical simulation of the system is based on a 4th-order Runge-Kutta
integration of these equations with a time step of Δt = 0.01 s. The above parameters
and setup were taken from [29], as the same configuration was adopted for
comparison with the other neuroevolution methods cited above.
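A minimal sketch of such a simulation, assuming the standard frictionless double-pole equations used for this benchmark (with g negative by the usual convention, so upright poles are unstable) and the 0.01 s Runge-Kutta step mentioned above; all names and constants below are ours, not the authors':

```python
import math

# State: (x, dx, theta1, dtheta1, theta2, dtheta2)
G = -9.8                           # gravity [m/s^2], negative by convention
M_CART = 1.0                       # cart mass [kg]
POLES = [(1.0, 1.0), (0.1, 0.1)]   # (mass [kg], length [m]) for each pole

def derivatives(state, force):
    x, dx, a1, da1, a2, da2 = state
    angles = [(a1, da1), (a2, da2)]
    # effective force and effective mass of each pole acting on the cart
    f_eff = sum(m * l * da ** 2 * math.sin(a)
                + 0.75 * m * G * math.cos(a) * math.sin(a)
                for (m, l), (a, da) in zip(POLES, angles))
    m_eff = sum(m * (1.0 - 0.75 * math.cos(a) ** 2)
                for (m, _), (a, _) in zip(POLES, angles))
    ddx = (force + f_eff) / (M_CART + m_eff)
    ddas = [-(3.0 / (4.0 * l)) * (ddx * math.cos(a) + G * math.sin(a))
            for (_, l), (a, _) in zip(POLES, angles)]
    return (dx, ddx, da1, ddas[0], da2, ddas[1])

def rk4_step(state, force, dt=0.01):
    """One 4th-order Runge-Kutta step with the 0.01 s step from the text."""
    shift = lambda s, k, h: tuple(si + h * ki for si, ki in zip(s, k))
    k1 = derivatives(state, force)
    k2 = derivatives(shift(state, k1, dt / 2), force)
    k3 = derivatives(shift(state, k2, dt / 2), force)
    k4 = derivatives(shift(state, k3, dt), force)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# both poles upright and no applied force: the equilibrium state does not move
print(rk4_step((0.0,) * 6, force=0.0))
```

A controller would call `rk4_step` once per 0.01 s time step, supplying the network's output force clipped to ±10 N.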
The car racing strategy [19] serves as the high-level RL benchmark. Its simulation
model was designed to meet the following requirements:
• The capability of generating a large number of random tracks with very little
effort.
• The model must be fast to compute. Evolutionary algorithms may require millions
of simulated time steps in order to converge.
• The sensor inputs for a controller should be reasonably simple; this encourages
more members of the research community to participate.
• The waypoints are randomly distributed around a square area at the beginning of
each race, and the car knows the position of the current waypoint and the next
waypoint.
The simulation results for both the double pole balancing and the car racing strategy
are discussed later in this paper. As part of the buildup of the proposed algorithm, the
next section gives a brief description of GA and its contributions to neuroevolution.
Genetic algorithms (GA) have been inspired by the Darwinian theory of evolution.
Basically, a GA is any population-based model that uses selection and
recombination operators to generate new sample points in a search space. Crossover
and mutation are the two major reproductive operators responsible for
evolving the GA toward a fitter generation.
Crossover, in the context of GA, is the process of combining two chromosomes to
produce new offspring. The idea is to transmit the best characteristics from each
parent chromosome to the offspring, so that the new generation of individuals
is more efficient than the previous one. A crossover operator dictates the way
selection and recombination of the chromosomes occur. Moreover, these operators are
usually applied probabilistically according to a crossover rate.
Mutation refers to the random process of changing an allele of a gene in order to
produce a new genetic structure. The probability of mutation is usually in the
range of 0.001 to 0.01, and it modifies elements of the chromosomes. It has been
reported in [34] that mutation is responsible for preventing all the solutions in a
population from falling into a local optimum of the solved problem. It is important to note
that mutation depends on both crossover and the encoding scheme.
To summarize, the role of mutation consists of restoring lost or
unexplored genetic material to the population. A fitness value is assigned to each
chromosome such that the fit ones are allowed to cross over. The fitness
function is responsible for assigning each chromosome a fitness value based on its
performance in the problem domain. It is imperative to design a good fitness function
in order to solve a problem effectively: a good fitness function helps to probe
the search space more efficiently and to escape local optima. Across
generations, the population converges toward an overall higher fitness value.
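The operators just described can be sketched in a compact loop (the bitstring encoding, OneMax-style fitness, and truncation selection are illustrative assumptions, not taken from the paper; the mutation rate sits in the 0.001–0.01 range quoted above):

```python
import random

random.seed(1)
GENOME_LEN, POP_SIZE, MUT_RATE = 20, 30, 0.01

def fitness(genome):
    # toy fitness: count of 1-bits (OneMax); a real task would score behavior
    return sum(genome)

def crossover(a, b):
    cut = random.randrange(1, GENOME_LEN)   # one-point crossover
    return a[:cut] + b[cut:]

def mutate(genome):
    # flip each gene independently with probability MUT_RATE
    return [1 - g if random.random() < MUT_RATE else g for g in genome]

pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
start_best = max(map(fitness, pop))
for _ in range(60):                          # generations
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP_SIZE // 2]            # truncation selection (elitist)
    pop = parents + [mutate(crossover(random.choice(parents),
                                      random.choice(parents)))
                     for _ in range(POP_SIZE - len(parents))]
print(start_best, max(map(fitness, pop)))
```

Because the surviving parents are carried over unchanged, the best fitness never decreases across generations, which illustrates the convergence toward higher overall fitness described above.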
In this way, the GA probes the search space for the solution. Speciation is another
property that has been given a significant amount of importance in NEAT. The
main function of speciation is to preserve diversity. In NEAT, it is used to protect
innovation: speciation allows organisms to compete primarily within their own
niches instead of with the population at large. Topological innovations are protected
in a new niche where they have time to optimize their structure through competition
within the niche [17]. One of the most attractive features of GA is its robustness and
efficacy. A more in-depth analysis of GA is provided in [12].
3 Methodology
A maximum fitness value is assigned as a stopping criterion such that, when this value
is reached, the GA instantly stops probing that state space. It is assumed that during the
search the GA either solves the problem or develops a pattern that points toward the
solution. Both possibilities should be considered since, depending on the
complexity of the problem, it is postulated that the GA may or may not find the optimal
solution on its own.
Hence, after an initial pass through the GA, the pattern developed within the population
enables NEAT to converge toward the solution in a more efficient manner. The
algorithm has been implemented in MATLAB. The initial NEAT software
was obtained from [33].
In the proposed algorithm, a maximum overall fitness value of 16 has been
assigned to both sets of populations. As stated previously, this value acts as a stopping
criterion for the algorithm to finish searching within the search space. All the
parameter settings have been standardized so that the algorithm can be compared and
contrasted with known reported results.
START
Generate a population of random data          // create a population with random data
Disable NEAT in the NN                        // enable only the GA to work
Initiate GA to search for patterns in the population of data related to the RL task
                                              // a possible method is genetic clustering
Generate a population of random networks (NN) // create a population of random networks
Re-initialize GA                              // enable the NN in NEAT
Perform a fine-tuning search on the problem   // initiate memory in the NN by activating RBF nodes
Disable GA                                    // allow only the complexified NN to be enabled
Apply the NN to the random data from step (1) // verify that the NN has effectively captured
                                              // the essence of the pattern solving the problem
END
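The control flow above can be sketched as a two-phase driver (all function names and the toy stand-ins below are placeholders, not the authors' MATLAB code): phase 1 runs the GA alone as a smoothing pass until the fitness cap is hit, and phase 2 hands the GA-shaped population to RBF-NEAT.

```python
MAX_FITNESS = 16.0          # stopping criterion used in the paper

def run_hybrid(population, ga_step, rbf_neat_step, max_iters=1000):
    """Phase 1: GA only; phase 2: RBF-NEAT on the GA-shaped population."""
    fitness = 0.0
    for _ in range(max_iters):              # GA smoothing pass
        population, fitness = ga_step(population)
        if fitness >= MAX_FITNESS:          # fitness cap reached: stop the GA
            break
    return rbf_neat_step(population)        # complexified network takes over

# toy stand-ins so the skeleton runs end to end
def toy_ga_step(pop):
    pop = [p + 1 for p in pop]              # pretend each step improves genomes
    return pop, float(max(pop))

def toy_rbf_step(pop):
    return max(pop)                         # pretend the network reads off the answer

print(run_hybrid([0, 3, 7], toy_ga_step, toy_rbf_step))
```

Only the control flow is meaningful here; in the actual system the GA step evolves the data population and the second phase is the full RBF-NEAT search.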
Algorithm            Evaluations   Neural nets
CE                   840,000       16,000
ESP                  169,466       1,000
NEAT                 33,184        1,000
RBF                  33,275        —
Proposed algorithm   14,237        350
The evaluations for RBF and the proposed algorithm were obtained via
simulations, while the rest of the evaluations were reported and well documented. The
results were averaged over 100 simulations. Figure 4 compares the
simulation results for RBF and the proposed algorithm; the performance of the
proposed algorithm is significantly better. The performance was measured by
the successful solving of the DPNV problem with a high level of accuracy.
A performance greater than 80% was considered good enough
to be regarded as successful. Both curves roughly undergo the same progression.
Fig. 5. Time taken to complete each lap for the 1400 laps
Figure 5 compares the time taken to complete a lap over the 1400 laps. ESP and CE
have no evaluation record for the car racing strategy domain; therefore, comparisons
were made between RBF and the proposed algorithm. It was observed that the
proposed algorithm was immediately more successful than RBF. One plausible
explanation is that, as in the DPNV problem, the GA contributes a lot to
familiarizing the data to the subsequent algorithm. In this way, it gains a significant
advantage compared to other traditional neuroevolution techniques such as
NEAT and even the RBF network.
The proposed algorithm has been constructed using two powerful, well-known
algorithms: GA and RBF. Unlike NEAT, it makes use of the GA not only to evolve the
RBF network, but also in an attempt to reorganize the problem such that, upon
complexification, the RBF network is exposed to data which is partially solved. In
this way, it has a definite advantage over other neuroevolution techniques: it
requires a less complex network, and it tends to solve a problem in a
faster and more efficient way. Higher-level domain problems such as the car racing
strategy are considered difficult problems that require much complexity.
References
1. Karpov, I., Sheblak, J., Miikkulainen, R.: OpenNERO: A game platform for AI research
and education. In: Proceedings of the Fourth Artificial Intelligence and Interactive Digital
Entertainment Conference (2008)
2. Shettleworth, S.J.: Evolution and Learning-The Baldwin effect reconsidered. MIT Press,
Cambridge (2003)
3. Kull, K.: Adaptive Evolution Without Natural Selection-Baldwin effect. Cybernetics and
Human Knowing 7(1), 45–55 (2000)
4. Baeck, T.: Evolutionary Algorithms in Theory and Practice. Oxford University Press,
New-York (1996)
5. Beyer, H.G.: The Theory of Evolution Strategies. Springer, Heidelberg (2001)
6. Yao, X., Liu, Y.: A New Evolutionary System for Evolving Artificial Neural Networks.
IEEE Transactions on Neural Networks 8(3), 694–713 (1997)
7. Floreano, D., Urzelai, J.: Evolutionary Robots with Self-Organization and Behavioral
Fitness. Neural Networks 13, 431–443 (2000)
8. Niv, Y., Noel, D., Ruppin, E.: Evolution of reinforcement learning in Uncertain
Environments; A Simple Explanation for Complex Foraging Behaviors. Adaptive
Behavior 10(1), 5–24 (2002)
9. Soltoggio, A., Bulliaria, J.A., Mattiussi, C., Durr, P., Floreano, D.: Evolutionary
Advantages of Neuromodulated Plasticity in Dynamics, Reward based Scenario. In:
Artificial Life XI, pp. 569–576. MIT Press, Cambridge (2008)
10. Stanley, K.O., Miikkulainen, R.: Evolving Neural Networks through Augmenting
Topologies. Evolutionary Computation 10(2) (2002)
11. Potter, M.A., De Jong, K.A.: Evolving Neural Networks with Collaborative Species. In:
Proceedings of the 1995 Summer Computing Simulation Conference (1995)
12. Radcliff, N.J.: Genetic Set Recombination and its application to Neural Network Topology
Optimization. Neural Computing and Applications 1(1), 67–90 (1992)
13. Wieland, A.: Evolving Neural Networks Controllers for unstable systems. In: Proceedings
of the International Joint Conference on Neural Networks, Seattle, WA. IEEE (1991)
14. Saravanan, N., Fogel, D.B.: Evolving Neural Control Systems. IEEE Expert, 23–27 (1995)
15. Reisinger, J., Bahceci, E., Karpov, I., Miikkulainen, R.: Coevolving Strategies for general
game playing. In: Proceedings of the IEEE Symposium on Computational Intelligence and
Games (2007)
16. Stanley, K.O., Bryant, B.D., Miikkulainen, R.: Real-Time Neuroevolution in the NERO
video game. IEEE Transactions on Evolutionary Computation 9(6), 653–668 (2007)
17. Stanley, K.O., Miikkulainen, R.: Competitive Coevolution through Evolutionary
Complexification. Journal of Artificial Intelligence Research, 63–100 (2004)
18. Stanley, K.O., Miikkulainen, R.: Evolving a roving eye for go. In: Proceeding of the
Genetic and Evolutionary Computation Conference (2004)
19. Lucas, S.M., Togelius, J.: Point-to-point car racing: An Initial Study of Evolution Versus
Temporal Difference Learning. In: IEEE Symposium of Computational Intelligence and
Games, pp. 260–267 (2007)
20. Li, J., Martinez-Maron, T., Lilienthal, A., Duckett, T.: Q-ran: A Constructive
Reinforcement Learning approach for Robot behavior Learning. In: Proceedings of the
IEEE/RSJ International Conference on Intelligent Robots and Systems (2006)
21. Stone, P., Kuhlmann, G., Taylor, M.E., Liu, Y.: Keepaway Soccer: From Machine
Learning Testbed to Benchmark. In: Bredenfeld, A., Jacoff, A., Noda, I., Takahashi, Y.
(eds.) RoboCup 2005. LNCS (LNAI), vol. 4020, pp. 93–105. Springer, Heidelberg (2006)
22. Moody, J., Darken, C.J.: Fast Learning in Networks of Locally tuned Processing units.
Neural Computation, 281–294 (1989)
23. Sutton, R.S., Barto, A.G.: Reinforcement Learning; An Introduction. MIT Press (1998)
24. Wieland, A.: Evolving Neural Network Controllers for Unstable Systems. In: Proceedings
of the IJCNN, Seattle, WA, pp. 667–673. IEEE (1991)
25. Gruau, F., Whitley, D., Pyeatt, L.: A comparison between Cellular Encoding and Direct
Encoding for Genetic Programming. In: Genetic Programming 1996: Proceedings of the
First Annual Conference, pp. 81–89 (1996)
26. Gomez, F.J., Miikkulainen, R.: Solving Non-Markovian Control tasks with Neuroevolution.
In: Proceedings of the IJCAI, pp. 1356–1361 (1999)
27. Dürr, P., Mattiussi, C., Floreano, D.: Neuroevolution with Analog Genetic Encoding. In:
Runarsson, T.P., Beyer, H.-G., Burke, E.K., Merelo-Guervós, J.J., Whitley, L.D., Yao, X.
(eds.) PPSN IX. LNCS, vol. 4193, pp. 671–680. Springer, Heidelberg (2006)
28. Jakobsen, M.: Learning to Race in a Simulated Environment,
http://www.hiof.no.neted/upload/attachment/site/group12/Morgan_Jakobsen_Learning_to_race_in_a_simulated_environment.pdf
(last accessed March 18, 2012)
29. Kietzmann, T.C., Riedmiller, M.: The Neuro Slot Car Racer: Reinforcement Learning in a
Real World Setting, http://ml.informatik.uni-freiburg.de/_media/publications/kr09.pdf
(last accessed March 12, 2013)
30. Kohl, N., Miikkulainen, R.: Evolving Neural Networks for Fractured Domains. Neural
Networks 22(3), 326–337 (2009)
31. Togelius, J., Lucas, S.M.: IEEE CEC Car Racing Competition,
http://julian.togelius.com/cec2007competition/ (last accessed February 10, 2012)
32. NEAT Matlab, http://www.cs.utexas.edu/users/ai-lab/?neatmatlab
(last accessed March 02, 2012)
33. The Radial Basis Function Network,
http://www.csc.kth.se/utildning/kth/kurser/DD2432/ann12/forelasningsanteckningar/RBF.pdf
(last accessed: June 21, 2012)
34. Bajpai, P., Kumar, M.: Genetic Algorithm-an Approach to Solve Global Optimization
Problems. Indian Journal of Computer Science and Engineering 1(3), 199–206 (2010)
A Multi-agent Efficient Control System
for a Production Mobile Robot
1 Introduction
Efficient robot control is an important task for the application of mobile robots in
production. The most important control tasks are power consumption optimization and
optimal trajectory planning. Control subsystems should provide energy consumption
optimization in a robot control system. Four levels of robot power consumption
optimization can be distinguished:
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 171–181, 2014.
© Springer International Publishing Switzerland 2014
172 U. Dziomin et al.
Fig. 1. a) Production mobile platform; b) Driving module
Let us decompose the robot's platform into independent driving module agents.
The agent operates in a physical 2-D environment with a reference beacon, as shown in
Fig. 2. The beacon position is defined by the coordinates (xb, yb). The rotation radius ρ is
the distance from the center of the module to the beacon.
In the simulated model environment, all necessary information about an agent and
a beacon is provided. In a real robotic environment, this information is taken from
wheel odometers and a module angle sensor. The environment information states are
illustrated in Table 1. In the presented platform, the navigation subsystem of real
steering uses odometer sensors for navigation purposes. The full set of actions
available to the agent is presented in Table 2. The agent can change the angle error
φerr around the beacon using control of the linear speed ν and angular speed ω.
The modules are not directly controlled by the virtual leader; they remain
independent entities and adapt their behavior to conform to the desired position in the platform.
In Fig. 4, (xi, yi) and (xiopt, yiopt) represent, respectively, the i-th module's actual and
desired position, and the desired deviation vector of the i-th module is taken relative
to the desired position.
Here d_i^t is the distance from the virtual center to the current module position, and d_i^opt is the
required distance between the virtual center and the i-th module position, derived from the
platform topology.
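A minimal sketch of the deviation described above (the coordinate-based formula is our assumption; the paper's own equation is not reproduced in this excerpt):

```python
import math

def distance_error(module_xy, center_xy, d_opt):
    """Error between the module's actual distance to the virtual center (d_t)
    and the required distance d_opt derived from the platform topology."""
    d_t = math.dist(module_xy, center_xy)
    return d_t - d_opt

# a module sitting exactly on its required circle has zero error
print(distance_error((3.0, 4.0), (0.0, 0.0), 5.0))
```

A positive result means the module is outside its required circle, a negative one that it is inside; this signed error is what the positioning policy later drives toward zero.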
4 Module Positioning
Here rt is the reward value obtained for the action at selected in state st, γ is the discount rate,
and A(st+1) is the set of actions available in state st+1.
φstop is the value of the angle error below which the robot reduces speed to stop at the correct
orientation, and ωopt ∈ [0.6 .. 0.8] rad/s is the optimal speed minimizing module power
consumption. The parameter φstop is used to decrease the search space for the agent.
When the agent's angle error becomes smaller than φstop, an action that reduces the
speed gets the highest reward. The parameter ωopt provides a possibility of power
optimization through the value function. If the agent's angle error is greater than φstop and
ωoptmin < ω < ωoptmax, then the agent's reward is scaled by an increasing coefficient,
which ranges within [0 .. 1]. The optimization makes it possible to use the
preferred speed with the lowest power consumption.
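The reward shaping described above can be sketched as follows (the exact constants and functional form are assumptions; the text only specifies the role of φstop and the ωopt range):

```python
PHI_STOP = 0.05          # [rad] angle error below which stopping is rewarded (assumed)
OMEGA_OPT = (0.6, 0.8)   # [rad/s] power-optimal angular speed band from the text

def positioning_reward(phi_err, omega, stopping_action):
    """Highest reward for stopping near the goal orientation; elsewhere,
    prefer omega inside the power-optimal band via a coefficient in [0, 1]."""
    if abs(phi_err) < PHI_STOP:
        return 1.0 if stopping_action else -0.1
    base = 0.1
    if OMEGA_OPT[0] < omega < OMEGA_OPT[1]:
        coeff = 1.0                          # inside the optimal band
    else:
        # fall off with distance from the band, clipped to [0, 1]
        dist = min(abs(omega - OMEGA_OPT[0]), abs(omega - OMEGA_OPT[1]))
        coeff = max(0.0, 1.0 - dist)
    return base * coeff

print(positioning_reward(0.01, 0.0, stopping_action=True))   # near goal: stop
print(positioning_reward(0.5, 0.7, stopping_action=False))   # in the optimal band
```

The point of the shaping is visible in the two branches: near the goal the stop action dominates, and far from it the agent is steered toward the low-power speed band.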
5 Cooperative Moving
In this section, we consider a multi-agent reinforcement learning model for the
cooperative moving problem. The problem is to control each module's individual speed in
order to achieve stable circular motion of the whole platform. Modules with different
distances to the beacon should have different speeds: for two modules i and j, with
distances to the beacon ρi and ρj respectively, the speed vj must be greater than vi if
ρj is greater than ρi. Every module should have an additional policy to control its
forward speed with respect to the speeds of the other modules.
The main principles of the technique are described in [13]–[14]. The basic idea of the
selected approach is to use influences between a module and the platform's virtual leader to
determine sequences of correct actions in order to coordinate behavior among them.
Good influences should be rewarded and negative ones punished. The core design
question is how to express such influences in terms of the received individual reward.
The RL framework used for this control problem is illustrated in Fig. 6:
The i-th module in state sit selects action ait using the current policy Qi and goes to the
next state sit+1 by applying the action to the environment. The platform observes the changes
made by the executed action, then calculates and assigns the reward rit+1 to the module as
feedback reflecting the success of the selected action.
The same Q-learning rule (2) can be used to update the module control policy. The
main difference between the two rules is that in the second case the reward is assigned by a
virtual leader instead of the environment.
Instead of trying to build a global Q-function Q({s1, s2,…, sn}, {a1, a2,…, an}) for n
modules, we decompose the problem and build a set of local Q-functions Q1(s, a),
Q2(s, a),…, QN(s, a), where every policy contains a specific control rule for its
module.
The combination of such individual policies produces a cooperative control law.
Let the state of the module be the pair st = {vt, dierr}, where vt is the current value of the linear
speed and dierr is the distance error calculated by (4). The action set Aν = {∅, ν+, ν−} is
represented by increasing/decreasing the linear speed from Table 2, and an action
at ∈ Aν is a change of the forward speed Δνt at a given moment of time t.
The virtual agent receives error information from each module and calculates the
displacement error. This error can be positive (the module is ahead of the platform) or
negative (the module is behind the platform). The learning process drives dierr toward
zero for every module: the maximum reward is given when
dierr → 0, and a penalty is given when the position of the module deviates from the
predefined one.
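The decomposition into local Q-functions with leader-assigned rewards can be sketched as follows (the state labels, learning rate, and discount value are illustrative assumptions; the action set follows the text):

```python
from collections import defaultdict

ACTIONS = ("hold", "v+", "v-")      # the action set A_v from the text
ALPHA, GAMMA = 0.5, 0.9             # learning rate and discount (assumed values)

class ModuleAgent:
    """One module keeps its own local Q_i(s, a) instead of sharing a joint table."""
    def __init__(self):
        self.q = defaultdict(float)

    def update(self, s, a, reward, s_next):
        # standard Q-learning step; the reward comes from the virtual leader
        best_next = max(self.q[(s_next, b)] for b in ACTIONS)
        self.q[(s, a)] += ALPHA * (reward + GAMMA * best_next - self.q[(s, a)])

    def act(self, s):
        return max(ACTIONS, key=lambda a: self.q[(s, a)])

# two modules learn independently from leader-assigned rewards;
# module 0 is behind the platform (d_err < 0), so speeding up should win
agents = [ModuleAgent() for _ in range(2)]
for _ in range(20):
    agents[0].update(s="behind", a="v+", reward=1.0, s_next="on_track")
    agents[0].update(s="behind", a="v-", reward=-1.0, s_next="behind")
print(agents[0].act("behind"))
```

The cooperative control law is then simply each module acting greedily on its own local table, with coordination arising from how the leader assigns the rewards.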
The simulation results were shown in a previous publication [4]. In this paper we focus
on an experiment with the real robot to verify the control system.
The learning of the agent was executed on the real robot after a simulation with the
same external parameters. The learning process took 1440 iterations. The topology of the
Q-function is shown in Fig. 7. The real learning process took more iterations on average
because the real system has sensor noise and errors. Figure 8 illustrates the result
of executing the learned control system to turn the modules toward a center placed
behind and to the right.
Fig. 8. Execution of the learned control system turning the modules toward a center placed
behind and to the right of the platform
Figure 9 shows the experimental result of the cooperative movement after learning
positioning. The knowledge base of the learned agents was transferred to the agents of
the control system on the real robot. Fig. 9 demonstrates the process of the platform
being moved by the learned system. At first, the modules turn in the driving direction
relative to the center of rotation (the circle drawn on white paper), as shown in
screenshots 1-6 in Fig. 9. Then the platform starts driving around the center of rotation,
as shown in screenshots 7-9. The stabilization of the real module orientation is based on a
low-level feedback controller provided by the robot's software control
system; this allows the intellectual control system to be restricted to manipulating the
linear speed of the modules. As shown, the distance to the center of rotation remains the
same along the entire trajectory of the platform.
Fig. 9. The experiment: modules turning to the car kinematics scheme (screenshots 1-6) and
movement around the white beacon (screenshots 7-9)
7 Conclusions
• Decomposition means that instead of trying to build a global Q-function we build
a set of local Q-functions.
• Adaptability – the platform adapts its behavior to a dynamically assigned beacon
and automatically reconfigures its moving trajectory.
• Scalability and generalization – the same learning technique is used for every
agent, every beacon position, and every platform configuration.
The paper presents successful experiments with the real robot. The developed system
provides robust steering of the platform for circular motion. The experimental results
indicate that the application of an intellectual adaptive control system to a real mobile
robot has great prospects in production.
In future work we will compare the developed approach with
existing approaches to mobile robot steering and will provide further information
about the efficiency of the developed control system.
References
1. Andreas, J.C.: Energy-Efficient Electric Motors, 2nd edn. Marcel Dekker, New York
(1992)
2. de Almeida, A.T., Bertoldi, P., Leonhard, W.: Energy efficiency improvements in electric
motors and drives. Springer, Berlin (1997)
3. Stetter, R., Ziemniak, P., Paczynski, A.: Development, Realization and Control of a
Mobile Robot. In: Obdržálek, D., Gottscheber, A. (eds.) EUROBOT 2010. CCIS, vol. 156,
pp. 130–140. Springer, Heidelberg (2011)
4. Dziomin, U., Kabysh, A., Golovko, V., Stetter, R.: A multi-agent reinforcement learning
approach for the efficient control of mobile robot. In: IEEE 7th International Conference
on Intelligent Data Acquisition and Advanced Computing Systems, Berlin, vol. 2, pp. 867–
873 (2013)
5. Mei, Y., Lu, Y.-H., Hu, Y.C., Lee, C.G.: Energy-efficient motion planning for mobile
robots. In: 2004 IEEE International Conference on Robotics and Automation: Proceedings,
ICRA 2004, vol. 5, pp. 4344–4349 (2004)
6. Ogunniyi, S., Tsoeu, M.S.: Q-learning based energy efficient path planning using weights.
In: 24th Symposium of the Pattern Recognition Association of South Africa, pp. 76–82
(2013)
7. Mei, Y., Lu, Y.-H., Lee, C.G., Hu, Y.C.: Energy-efficient mobile robot exploration. In:
IEEE International Conference, Robotics and Automation, pp. 505–511. IEEE Press
(2006)
8. Ceccarelli, N., Di Marco, M., Garulli, A., Giannitrapani, A.: Collective circular motion of
multi-vehicle systems with sensory limitations. In: 44th IEEE Conference, Decision and
Control, 2005 and 2005 European Control Conference, pp. 740–745 (2005)
9. Ceccarelli, N., Di Marco, M., Garulli, A., Giannitrapani, A.: Collective circular motion of
multi-vehicle systems. Automatica 44(12), 3025–3035 (2008)
10. Benedettelli, D., Ceccarelli, N., Garulli, A., Giannitrapani, A.: Experimental validation of
collective circular motion for nonholonomic multi-vehicle systems. Robotics and
Autonomous Systems 58(8), 1028–1036 (2010)
11. Ren, W., Sorensen, N.: Distributed coordination architecture for multi-robot formation
control. Robotics and Autonomous Systems 56(4), 324–333 (2008)
12. Sutton, R.S., Barto, A.G.: Reinforcement learning: An introduction. MIT Press (1998)
13. Kabysh, A., Golovko, V.: General model for organizing interactions in multi-agent
systems. International Journal of Computing 11(3), 224–233 (2012)
14. Kabysh, A., Golovko, V., Lipnickas, A.: Influence Learning for Multi-Agent Systems
Based on Reinforcement Learning. International Journal of Computing 11(1), 39–44
(2012)
A Low-Cost Mobile Robot for Education
1 Introduction
Teaching robotics is not easy. Students need knowledge from various scientific
areas to carry out high-quality research. Robotics is one of the most important
subjects and most active areas in IT education today. In fact, many robots for
educational purposes have been proposed [1,2,3,4,5]. If such robots are available,
students may start a course simply by designing a tool for training them. On
the other hand, if students start by building an actual robot, they learn the
methodology of robotics more deeply. In this paper, we consider both possibilities.
Unlike textbook problems, real-world problems usually have multiple solutions.
Moreover, it is the real world, not us, that decides whether our design or our hypothesis
is correct. We cannot ignore resource limitations such as time, money and
materials. More importantly, the real world is unlikely to follow our models. For
example, the assumptions that sensors always deliver exact and valid values, or that
motors always deliver the commanded speed and torque, do not hold in practice due to
noise, spurious inputs, unreliable outputs, etc. Students who lack hands-on experience
significantly underestimate the importance of these real-world issues.
Hence our purpose is to develop a universal system for robotics education that
consists of the hardware of a mobile robot and the software to control the robot
in any given environment, so that the system works not only in a virtual world
but also in the real one.
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 182–190, 2014.
© Springer International Publishing Switzerland 2014
We started this project along the lines of ideas already reported [6,7]. The key
features are low cost, modularity, simplicity of programming and multi-functionality.
Each of the above-mentioned authors tried to find a balance between these qualities.
Many interesting ideas can also be found in Raymond's robot [8], although it was
designed not for universal tasks but for the particular task of rescue operations.
A suggestion by Beer et al. [9] also attracts our interest. They used a platform
based on LEGO [10]. However, LEGO is a somewhat difficult platform for us to use
together with the packages publicly available for controlling a robot.
The platform and prototype called Pioneer 3-DX, available from Adept
MobileRobots [12], is one of the most attractive tools for us. It provides a platform
for a two-wheeled mobile robot as well as a set of algorithms to control the robot
from a PC. The problem for us is its price. The idea of combining a virtual robot
with a real robot was implemented by Gerkey et al. [13]. Their Player/Stage Project,
a set of tools for multi-robot and distributed sensor systems, is popular in the
mobile robot community nowadays. It is often used together with the commercial
mobile robot Pioneer 3-DX available from Adept MobileRobots Inc. [14].
We use a similar but more easily accessible mobile robot, MARVIN (Mobile
Autonomous Robot & Virtual Intelligent ageNt). The differences of our work from the
projects described above, and hopefully our novelties, are that our mobile robots are
homemade, partly because of the high price of the commercial products, and, more
importantly, because in this way we can obtain a more effective mobile robot and
software than a commercial one. Furthermore, this allows students to study methods
of artificial intelligence not only on models or in theory but also through
experiments in the physical world.
2 Our Robot
2.1 Requirements
Software Choice: Our decision is that programs run on the microcomputer on board
the robot, which then executes commands sent from a remote computer.
Sensor System Choice: The sensor system should be fixed for a specific task in advance.
We have two options. One is to use a small number of expensive sensors that produce
a continuous flow of data which grows huge over time. The other is a large number of
cheap sensors, each of which delivers a small piece of information at a time.
A robot such as MARVIN favors the second option, and so does a low-cost mobile robot
for education; our robot follows it as well.
2.2 Platform
As mentioned above, we use the MARVIN Robot, and we chose DFRobot’s two-
wheeled platform with Arduino Board. See Fig. 1.
2.3 Software
The robot consists of three modules: the onboard computer, the motor drivers and the
power subsystem. The onboard computer is also based on the Arduino project and
operates all peripheral devices of the robot. See Fig. 5. A number of resources for the
onboard computer, including sample programs, are freely available on the Internet.
This is why we were able to build both the onboard computer and the motor driver
ourselves in our laboratory.
186 V. Kasyanik and S. Potapchuk
For communication between the onboard computer and external devices, we can
use wireless modules such as Bluetooth or Wi-Fi, or a wired RS-232 connection, which
simplifies the algorithms on the remote computer. We can also use a smartphone as the
remote computer. Alternatively, we could even replace the onboard computer with a
smartphone.
2.4 Sensors
See Fig. 3. The sensor subsystem is also modular. It consists of two types of
sensors: sensors that perceive the outside world and sensors that monitor the internal
state of the robot. The locations of these sensors are shown in Fig. 6.
2.5 Environment
3 Programming
Under the Arduino platform, a program is usually loaded into the robot through the USB
port while the robot is switched off. However, we sometimes want wireless access to the
robot so that the remote PC can send operating commands online; then algorithms run
on the remote PC, and the robot acts simply by receiving a series of commands. For
this purpose we exploit the Firmata project. As its homepage [18] reads, Firmata is
a generic protocol for communicating with microcontrollers from software on a host
computer. It is intended to work with any host computer software package, and it is
easy for other software to use this protocol. Basically, this is a protocol for talking
to the Arduino from the host software, and the aim is to control the Arduino completely
from software on the host computer.
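As a toy illustration of this division of labor (algorithms on the remote PC, simple command execution on board), consider the following Python sketch. The command names and the (left, right) motor-speed encoding are invented for the example and are not part of Firmata; they only show how a remote host can serialize high-level motion commands into a stream that the onboard program merely decodes and replays.

```python
# Hypothetical command protocol for illustration only: each command is a
# "name:arg" string and maps to a (left_speed, right_speed) motor pair.
COMMANDS = {
    "forward": lambda v: (v, v),
    "back":    lambda v: (-v, -v),
    "left":    lambda v: (-v, v),
    "right":   lambda v: (v, -v),
}

def encode(name, value):
    """Remote PC side: turn a command into a wire string."""
    return f"{name}:{value}"

def execute(message):
    """Onboard side: decode a message and return the motor speeds to apply."""
    name, raw = message.split(":")
    return COMMANDS[name](int(raw))

print(execute(encode("forward", 100)))  # (100, 100)
print(execute(encode("left", 50)))      # (-50, 50)
```

In a real setup the `execute` step would be replaced by Firmata pin writes on the Arduino, while `encode` would run on the remote PC or smartphone.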
ROS provides hardware abstraction, device drivers, libraries and message passing
between processes, etc. ROS was originally developed in 2007 by the Stanford Artificial
Intelligence Laboratory. As of 2008, development continued primarily at Willow
Garage. ROS is a framework based on the ideas of Player/Stage, and models from ROS
can be used in the Player/Stage simulator.
Development of the driver for our robot in the Webots simulator is now ongoing in our
laboratory. See Fig. 6.
Acknowledgments. We are very grateful to Professor Akira Imada for his important
notes and corrections which made our paper much better.
References
1. K-Team Corporation - Mobile Robotics, http://www.k-team.com
2. Festo Didactic GmbH & Co.KG: Technical Documentation of Robotino. Martin Williams,
Denkendorf (2007)
3. Mondada, F., Bonani, M., Raemy, X., Pugh, J., Cianci, C., Klaptocz, A., Magnenat, S.,
Zufferey, J.C., Floreano, D., Martinoli, A.: The e-puck, a Robot Designed for Education in
Engineering. In: 9th Conference on Autonomous Robot Systems and Competitions, pp.
59–65. IPCB: Instituto Politécnico de Castelo Branco, Portugal (2009)
4. Surveyor SRV-1 Blackfin Robot, http://www.surveyor.com/
5. Nourbakhsh, I.: Robotics and education in the classroom and in the museum: On the study
of robots, and robots for study. In: Workshop for Personal Robotics for Education. IEEE
ICRA (2000)
6. Kumar, D., Meeden, L.: A robot laboratory for teaching artificial intelligence. In: 29th
SIGCSE Symposium on Computer Science Education, pp. 341–344. ACM, New York
(1998)
7. Schilling, K., Roth, H., Rusch, O.: Mobile Mini-Robots for Engineering Education. Global
Journal of Engineering Education 6, 79–84 (2002)
8. Raymond, S.: The Building of Redback. In: 2005 Rescue Robotics Camp, Istituto
Superiore Antincendi, Rome (2005)
9. Beer, R., Chiel, H., Drushel, R.: Using autonomous robots to teach science and
engineering. Communications of the ACM (1999)
10. LEGO Education 2012 (2012), http://www.legoeducation.com
11. Lego Education WeDo, http://www.legoeducation.us/eng/product/lego
_education_wedo_robotics_construction_set/2096
12. Intelligent Mobile Robotic Platforms for Service Robots, Research and Rapid Prototyping,
http://www.mobilerobots.com/Mobile_Robots.aspx
13. Gerkey, B., Vaughan, R., Howard, A.: The Player/Stage Project: Tools for Multi-Robot
and Distributed Sensor Systems. In: 11th International Conference on Advanced Robotics,
Coimbra, Portugal, pp. 317–323 (2003)
14. Software for Pioneer DX, http://www.mobilerobots.com/Software.aspx
15. DFrobot 2WD mobile platform, http://www.dfrobot.com
16. Webpage of Arduino Project, http://www.arduino.cc
17. Webots Simulator, http://www.cyberbotics.com/overview
18. Firmata Project, http://firmata.org
19. Robot Operating System (ROS), http://www.ros.org
20. Arduino Support from MATLAB, http://www.mathworks.com/academia/
arduino-software/arduino-matlab.html
21. Interfacing android and arduino through an audio connection, http://androino.
blogspot.com/p/project-description.html
22. Lay, K., Rassler, E., Dillmann, R., Grunwald, G., Hagele, M., Lawitzky, G., Stopp, A., von
Seelen, W.: MORPHA: Communication and interaction with intelligent, anthropomorphic
robot assistants. In: The International Status Conference - Lead Projects Human-
Computer-Interactions (2001)
23. Kanda, T., Ishiguro, H., Ono, T., Imai, M., Mase, K.: Multi-robot cooperation for human-
robot communication. In: IEEE Int. Workshop on Robot and Human Communication
(ROMAN 2002), pp. 271–276. IEEE Press, Berlin (2002)
24. Carnegie, D.A., Prakash, A., Chitty, C., Guy, B.: A human-like semi autonomous mobile
security robot. In: 2nd International Conference on Autonomous Robots and Agents,
Palmerston North (2004)
25. Rogers, T.E., Peng, J., Zein-Sabatto, S.: Modeling human-robot interaction for intelligent
mobile robotics. In: IEEE International Workshop on Robot and Human Interactive
Communication, pp. 36–41. IEEE Press (2005)
26. Dautenhahn, K.: Socially intelligent robots: dimensions of human-robot interaction.
Philosophical Transactions of the Royal Society B: Biological Sciences 362(1480), 679–
704 (2007)
27. Nieuwenhuisen, M., Behnke, S.: Human-like interaction skills for the mobile
communication robot robotinho. International Journal of Social Robotics 5(4), 549–561
(2013)
28. Loo, C.-K., Rajeswari, M., Wong, E.K., Rao, M.B.C.: Mobile Robot Path Planning Using
Hybrid Genetic Algorithm and Traversability Vectors Method. Journal of Intelligent
Automation & Soft Computing 10(1), 51–63 (2004)
29. Hachour, O.: Path planning of autonomous mobile robot. International Journal of Systems
Application, Engineering & Development 2(4), 178–190 (2008)
30. Sarkar, S., Shome, S.N., Nandy, S.: An intelligent algorithm for the path planning of
autonomous mobile robot for dynamic environment. In: Vadakkepat, P., et al. (eds.) FIRA
2010. CCIS, vol. 103, pp. 202–209. Springer, Heidelberg (2010)
Data-Driven Method for High Level Rendering
Pipeline Construction
Abstract. The paper describes a software methodology for graphics pipeline
extension. It is argued that common modern visualization techniques do not
adequately satisfy current visualization software development requirements.
The proposed approach is based on a specialized formal language called
visualization algebra. By invoking data-driven design principles inherited from
the existing programmable pipeline technology, the technique has the potential to
reduce visualization software development costs and pave the way for further
computer graphics pipeline automation.
1 Introduction
Computer graphics (CG) remains one of the richest and most actively evolving fields
of study in computer science. CG consists of two large parts, each studying its own
problem: image recognition and image generation. Both of these problems are very
broad and complex in nature. In this paper, we concentrate on image generation, i.e.
visualization.
Literally every area of human life that involves computer technology requires
some sort of visual representation of information. Accurate and adequate visualization
becomes vitally necessary as the complexity of the problems being solved
with computers grows. CG provides tools and theories that target the growing
requirements for visualization, but as the requirements become more complex and
demanding, so does the need for improvement in this field.
In this paper we analyze the most widely used visualization methodology and
provide a technical solution for its improvement.
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 191–200, 2014.
© Springer International Publishing Switzerland 2014
192 V. Krasnoproshin and D. Mazouka
2 Basic Definitions
3 Analysis
The following figure (Fig. 1) summarizes the structure of the visualization process and
the differences between the two approaches mentioned above.
The model in the picture is not compatible with either of the two engines. This
happens, for example, when the engines specialize in rendering solid 3D geometric
objects while the model represents a flat user interface.
In this situation software engineers can take one of three ways:
1. change the model so that it fits an engine;
2. provide an engine-model adapter layer;
3. fall back to pure pipeline rendering.
The first option probably needs the least effort, but the outcome of visualization may
differ seriously from the initial expectations, as the visualized model gets distorted.
The second option tries to preserve the model's consistency with an additional
translation stage (Fig. 3).
This may work in some situations, depending on how different the model and the
engine are. If the difference is too big, the adapter itself becomes cumbersome and
unmaintainable. In that case the third option becomes preferable: the visualization
problem gets solved from scratch.
The conclusion is that we cannot rely on graphics engines in general. Tasks and
models change all the time, and engines become obsolete. Their variety and quantity
grow, which makes it troublesome to find a proper match. And so we have a question:
can anything be done here to improve the construction of the visualization process?
In our previous works [4, 5] we suggested that it is possible to develop a general
methodology for high-level visualization abstractions, that is, to create a model-
independent language for visualization process construction.
Together with the corresponding support-layer libraries, visualization process
construction would change in the following way (Fig. 4):
In this scheme, the model is rendered using an engine with either the first or the
second method. SL stands for the standard layer, and VS for the visualization system.
The standard layer is a set of support rendering libraries based on the visualization
algebra methodology [5]. The library provides tools for process construction in a
model-agnostic, data-driven way. Less effort (in comparison to pure pipeline
development) is necessary to adapt a model to the visualization system. And, though
the implementation complexity remains slightly greater than that of graphics engines,
this technique has the advantage of preserving the pipeline's flexibility together with
a higher level of language abstraction.
4 Object Shaders
Before we start describing the details of the proposed technology, a few words need
to be said about how the graphics pipeline is programmed in general.
The graphics pipeline consists of several stages, including (in the DirectX 11
model [7]): Input Assembler, Vertex Shader, Hull Shader, Tessellator, Domain Shader,
Geometry Shader, Rasterizer, Pixel Shader and Output Merger.
The stages implement different parts of the rasterization algorithm and provide
some additional functionality. Shader stages are programmable; all the others are
configurable. Configurable stages implement fixed algorithms with some adjustable
parameters. Programmable stages, in their turn, can be loaded with custom programs
(which are called shaders) implementing specific custom algorithms. This is what
essentially gives the pipeline its flexibility.
Shader programs of different types operate on different kinds of objects. Vertex
shaders operate on geometry vertices and control the geometrical transformation of the
3D model and its projection onto the 2D screen. Pixel shaders work with fragments,
pieces of the resulting image that are later translated into pixels.
The most common way of writing shader programs is to use one of the high-level
shading languages: HLSL for DirectX or GLSL for OpenGL. These languages
have common notation and similar instruction sets determined by the underlying
hardware. By their nature, the languages are vector-processing and SIMD-based, and
the corresponding shader programs describe how every vertex or fragment is treated
independently, enabling massively parallel data processing.
In our work, we pursue technological integration into the graphics pipeline, rather
than its replacement with another set of tools. That is why the technical realization of
the proposed methodology is based on emulation of an additional programmable
pipeline stage which we call Object Shader.
The Object Shader stage is a broad name for three types of programmable components:
pipeline, sampling, and rendering. These components are based on the corresponding
notions from visualization algebra [5]: visual expressions and the sample and render
operations.
A visual expression in visualization algebra (VA) is a formalized algorithm operating
in a generalized object space. Objects in VA are represented by tuples of attributes
projected onto a model-specific semantic grid. The methodology makes no
assumptions about the nature of objects or their content, treating all data equally.
Visual expressions in VA are constructed using four basic operations:
1. sample – selection of an object subset;
2. transform – derivation of new objects on the basis of existing ones;
3. render – translation of objects into an image;
4. blend – operations on the resulting images.
The final expression for target model visualization must have one input (all the
model's data) and one output (the resulting image).
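To make these operations concrete, the following Python sketch (our illustration, not the paper's implementation) models a VA-style expression over plain dictionary objects. The object layout, the type codes, and the "image" representation (a list of draw records) are invented for the example, and the transform operation is omitted for brevity.

```python
# Objects are attribute tuples; here, plain dicts with a 'type' field.
scene = [
    {"type": 1, "name": "house"},    # a 3D geometric object
    {"type": 2, "name": "overlay"},  # a flat 2D overlay
]

def sample(objects, predicate):
    """sample: select a subset of the object space."""
    return [o for o in objects if predicate(o)]

def render(objects, tag):
    """render: translate objects into an 'image' (a list of draw records)."""
    return [f"{tag}:{o['name']}" for o in objects]

def blend(*frames):
    """blend: combine the resulting images into one final frame."""
    return [record for frame in frames for record in frame]

# One input (the whole model's data) and one output (the resulting image):
frame = blend(
    render(sample(scene, lambda o: o["type"] == 1), "3d"),
    render(sample(scene, lambda o: o["type"] == 2), "2d"),
)
print(frame)  # ['3d:house', '2d:overlay']
```

The composed expression mirrors the requirement above: the whole model enters at one end, and a single frame leaves at the other.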
On the technical side, the expression is represented by a program in a language
similar to HLSL. The additions are:
1. the data type ptr, used for declaring resources;
2. the object types Scene, Objects and Frame;
3. various rendering-specific functions.
Object layout declaration in object shaders is done in the usual way, with structures:
struct Object
{
    field-type field-name : field-semantic;
    ...
};
Render procedures generate operation sequences for objects of the supported type and
return a Frame as a result:
Frame Render(Object object)
{
    instruction;
    ...
    instruction;
    return draw-instruction;
}
Pipeline procedures combine sampling and rendering operations into the final
visualization algorithm. The input of a pipeline procedure is a Scene object, and a
Frame object is its output.
The visualization system routes data streams according to the pipeline procedure
logic, splitting them with samplers and processing them with renderers. The unit
routines are designed to be atomic in the same way as the other shader types:
they process one object at a time, allowing massive parallelization.
This technique is fairly similar to effects in DirectX [8], but at the same time the
differences are apparent: effects cannot be used for building the complete
visualization pipeline.
In the last part of the paper we go through a real usage example.
5 Usage Example
The sample model consists of one 3D object (a building) and one 2D object (an
overlay image). The resulting visualization should show the building at the center of
the screen and make it possible to change its orientation. The overlay image should be
drawn in the top right corner.
From the model description we know that there are two types of objects, and
therefore we need two sampling and two rendering procedures.
The sampling procedures get the type field from the object stream and compare it
against predefined constant values, so the first procedure samples 3D objects (type 1)
and the second samples 2D objects (type 2).
The supported objects must have a transform matrix, a vertex buffer (VB), a vertex
declaration (VD), a texture (tx0) and common geometry information: the primitive
count and vertex size.
Then we declare the external variables that must be provided by the user:
extern ptr VS = VertexShader("/Test#VS_World");
extern ptr PS = PixelShader("/Test#PS_Tex");
extern float4x4 f4x4ViewProjection;
These variables are: common vertex shader (VS), common pixel shader (PS) and
view-projection transformation matrix.
The rendering procedure itself:
Frame OS_Basic(Object obj)
{
    SetStreamSource(0, obj.VB, 0, obj.vertex_size);
    SetVertexDeclaration(obj.VD);
    SetVertexShader(VS);
    SetPixelShader(PS);
    SetVertexShaderConstantF(0, &obj.transform, 4);
    SetVertexShaderConstantF(4, &f4x4ViewProjection, 4);
    SetTexture(0, obj.tx0);
    return DrawPrimitive(4, 0, obj.primitive_count);
}
The procedure makes a number of state change calls and invokes a drawing routine.
The second rendering procedure is implemented in a similar way:
struct Object
{
    float4 rect : f4Rectangle;
    ptr tx0 : pTX0;
};
The resulting pipeline procedure is very simple: it needs to use the declared samplers
and renderers and combine their output. The model data for the example are declared
as follows:
"Object1" :
{
    "iType" : "int(1)",
    "f4x4Transform" : "float4x4(0.1,0,0,0, 0,0.1,0,0, 0,0,0.1,0, 0,-1,0,1)",
    "pVB" : "ptr(VertexBuffer(/Model#VB_House))",
    "pVD" : "ptr(VertexDeclaration(/Model#VD_P3N3T2T2))",
    "pTX0" : "ptr(Texture2D(/Model#TX_House))",
    "iPrimitiveCount" : "int(674)",
    "iVertexSize" : "int(40)"
},
"Rect1" :
{
    "iType" : "int(2)",
    "pTX0" : "ptr(Texture2D(/Test#TX_Tex))",
    "f4Rectangle" : "float4(0.75, 0.75, 0.25, 0.25)"
}
6 Conclusion
Computer visualization still holds the status of a rapidly evolving scientific and
engineering area. Dozens of new techniques and hardware products emerge every year,
and with further advancement this environment may require certain intensive changes
in order to stay comprehensible and maintainable.
This paper provides a justification and a short overview of the technical
implementation of a new visualization methodology based on so-called
visualization algebra. This methodology has the potential to improve the most popular
existing visualization methods in terms of flexibility and accessibility.
References
1. Tavenrath, M., Kubisch, C.: Advanced Scenegraph Rendering Pipeline. In: GPU
Technology Conference, San Jose (2013)
2. Andersson, J., Tatarchuk, N.: Frostbite Rendering Architecture and Real-time Procedural
Shading Texturing Techniques. In: Game Developers Conference, San Francisco (2007)
3. MSDN, Programming Guide for HLSL, http://msdn.microsoft.com/en-
us/library/windows/desktop/bb509635%28v=vs.85%29.aspx
4. Krasnoproshin, V., Mazouka, D.: Graphics pipeline automation based on visualization
algebra. In: 11th International Conference on Pattern Recognition and Information
Processing, Minsk (2011)
5. Krasnoproshin, V., Mazouka, D.: Novel Approach to Dynamic Models Visualization.
Journal of Computational Optimization in Economics and Finance 4(2-3), 113–124 (2013)
6. Gomes, J., Velho, L., Sousa, M.C.: Computer Graphics: Theory and Practice. A K
Peters/CRC Press (2012)
7. MSDN, Graphics Pipeline, http://msdn.microsoft.com/en-us/library/
windows/desktop/ff476882%28v=vs.85%29.aspx
8. MSDN, Effects (Direct3D 11), http://msdn.microsoft.com/en-us/library
/windows/desktop/ff476136%28v=vs.85%29.aspx
Efficiency of Parallel Large-Scale Two-Layered MLP
Training on Many-Core System
1 Introduction
V. Golovko and A. Imada (Eds.): ICNNAI 2014, CCIS 440, pp. 201–210, 2014.
© Springer International Publishing Switzerland 2014
202 V. Turchenko and A. Sachenko
Processing Unit is presented in [3]. The authors of [4] presented the
development of a parallel training algorithm for a fully connected RNN based on a linear
reward-penalty correction scheme. In our previous works, within the development of the
parallel grid-aware library for neural network training [5], we developed the
batch pattern back propagation (BP) training algorithm for a multi-layer perceptron
with one hidden layer [6], a recurrent neural network [7], a recirculation neural network
[8] and a neural network with radial-basis activation function [9], and showed their good
parallelization efficiency on different high performance computing systems.
However, the analysis of the state of the art has shown that the parallelization of
an MLP with two hidden layers of neurons has not been properly investigated yet. For
example, the authors of [2] parallelized the MLP architecture 16-10-10-1 (16
neurons in the input layer, two hidden layers with 10 neurons each and one
output neuron) on a huge number of training patterns (around 20,000) coming
from the Large Hadron Collider. Their implementation of this relatively small NN with
270 internal connections (number of weights of neurons and their thresholds) does not
provide a positive parallelization speedup due to the large communication overhead, i.e. the
speedup is less than 1. In our opinion this overhead is caused by the fact that the
"communication part" of their algorithm is not optimized and contains at least three
separate communication messages.
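The connection count quoted above is easy to verify. The small helper below (ours, not from [2]) counts the inter-neuron weights of a two-hidden-layer MLP and reproduces the figure of 270 for the 16-10-10-1 architecture; note that 270 corresponds to the weights alone, while the thresholds mentioned in the text would add 10 + 10 + 1 further parameters.

```python
def weight_count(n_in, n_h1, n_h2, n_out):
    """Number of inter-neuron weights in an MLP with two hidden layers
    (thresholds/biases not included)."""
    return n_in * n_h1 + n_h1 * n_h2 + n_h2 * n_out

# The 16-10-10-1 architecture parallelized in [2]:
print(weight_count(16, 10, 10, 1))      # 270

# For comparison, the largest scenario studied later in this paper:
print(weight_count(784, 400, 250, 1))   # 413850
```

The three-orders-of-magnitude gap between the two counts is what makes the communication-to-computation ratio, and hence the achievable speedup, so different in the two studies.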
According to the universal approximation theorem [10, 11], an MLP with two
hidden layers provides better control over the approximation process than an MLP with
one hidden layer [1]. Also, an MLP with two hidden layers allows processing
"global features": the neurons of the first hidden layer gather "local features"
from the input data, and the neurons of the second hidden layer generalize their outputs,
providing a higher-level representation. This is especially important for object
classification, recognition and computer vision tasks. However, the computational
time for such tasks is extremely long. For example, the training of a network
with 500 neurons in the first hidden layer and 300 neurons in the second hidden layer
on the 60,000 training vectors of the MNIST database [12] was extremely slow; 59
epochs of pre-training of the hidden layers as a Restricted Boltzmann Machine alone took
about a week [13]. Therefore the efficiency of a parallel training algorithm
for an MLP with two hidden layers of neurons is a relevant research problem.
Taking into account that the batch pattern BP training scheme has shown good
parallelization efficiency on a number of NN architectures, the goal of this paper is
to apply this scheme to the MLP with two hidden layers and investigate its
parallelization efficiency on a large-scale data classification task on a many-core
high performance computing system. The rest of the paper is organized as follows:
Section 2 details the mathematical description of the batch pattern BP algorithm, Section
3 describes its parallel implementation, Section 4 presents the obtained experimental
results and Section 5 concludes the paper.
The batch pattern training algorithm updates the neurons' weights and thresholds at the
end of each training epoch, i.e. after all training patterns have been processed, instead
of updating the weights and thresholds after each pattern as in the sequential
training mode [6]. The output value of the MLP with two hidden layers (Fig. 1) is
described by [1]:
$$y_3 = F_3(S_3), \quad S_3 = \sum_{k=1}^{K} y_{2k} \cdot w_{3k} - T_3, \qquad (1)$$

$$y_{2k} = F_2(S_{2k}), \quad S_{2k} = \sum_{j=1}^{N} y_{1j} \cdot w_{2jk} - T_{2k}, \qquad (2)$$

$$y_{1j} = F_1(S_{1j}), \quad S_{1j} = \sum_{i=1}^{M} x_i \cdot w_{1ij} - T_{1j}, \qquad (3)$$
(Fig. 1. Structure of the MLP with two hidden layers: inputs $x_1, \ldots, x_M$; first hidden layer outputs $y_{1j}$ with weights $w_{1ij}$ and thresholds $T_{1j}$; second hidden layer outputs $y_{2k}$ with weights $w_{2jk}$ and thresholds $T_{2k}$; output $y_3$ with weights $w_{3k}$ and threshold $T_3$.)
$$\gamma_{1j}^{pt}(t) = \sum_{k=1}^{K} \gamma_{2k}^{pt}(t) \cdot w_{2jk}(t) \cdot F_2'(S_{2k}^{pt}(t)),$$
where $S_{2k}^{pt}(t)$ are the weighted sums of the neurons of the second hidden layer;

$$s\Delta w_{2jk} = s\Delta w_{2jk} + \gamma_{2k}^{pt}(t) \cdot F_2'(S_{2k}^{pt}(t)) \cdot y_{1j}^{pt}(t), \quad s\Delta T_{2k} = s\Delta T_{2k} + \gamma_{2k}^{pt}(t) \cdot F_2'(S_{2k}^{pt}(t)), \qquad (5)$$

$$s\Delta w_{1ij} = s\Delta w_{1ij} + \gamma_{1j}^{pt}(t) \cdot F_1'(S_{1j}^{pt}(t)) \cdot x_i^{pt}(t), \quad s\Delta T_{1j} = s\Delta T_{1j} + \gamma_{1j}^{pt}(t) \cdot F_1'(S_{1j}^{pt}(t)); \qquad (6)$$
7. If $E(t)$ is greater than the desired error $E_{\min}$, then increase the number of training
epochs to $t+1$ and go to step 3; otherwise stop the training process.
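A sequential NumPy sketch of the batch pattern scheme described above may help: the deltas are accumulated over all patterns and applied once at the end of the epoch. The layer sizes, toy data and learning rate are invented for the example, and the error terms fold the layer derivative in directly, which is algebraically equivalent to the recursion in eqs. (5) and (6).

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 4, 6, 3                        # inputs, first and second hidden layer sizes
X = rng.random((20, M))                  # 20 toy training patterns
d = (X.sum(axis=1) > 2.0).astype(float)  # toy target outputs

F = lambda s: 1.0 / (1.0 + np.exp(-s))   # logistic activation, F1 = F2 = F3
dF = lambda y: y * (1.0 - y)             # its derivative, via the output value

w1 = rng.normal(0.0, 0.5, (M, N)); T1 = np.zeros(N)
w2 = rng.normal(0.0, 0.5, (N, K)); T2 = np.zeros(K)
w3 = rng.normal(0.0, 0.5, K);      T3 = 0.0
alpha = 0.01                             # common learning rate

def forward(x):
    y1 = F(x @ w1 - T1)                  # eq. (3)
    y2 = F(y1 @ w2 - T2)                 # eq. (2)
    y3 = F(y2 @ w3 - T3)                 # eq. (1)
    return y1, y2, y3

def epoch_error():
    return 0.5 * sum((forward(x)[2] - t) ** 2 for x, t in zip(X, d))

e0 = epoch_error()
for _ in range(500):                     # training epochs
    sdw1 = np.zeros_like(w1); sdT1 = np.zeros_like(T1)
    sdw2 = np.zeros_like(w2); sdT2 = np.zeros_like(T2)
    sdw3 = np.zeros_like(w3); sdT3 = 0.0
    for x, t in zip(X, d):               # accumulate deltas over all patterns
        y1, y2, y3 = forward(x)
        g3 = (y3 - t) * dF(y3)           # output layer error term
        g2 = g3 * w3 * dF(y2)            # second hidden layer error terms
        g1 = (w2 @ g2) * dF(y1)          # first hidden layer error terms
        sdw3 += g3 * y2;          sdT3 += -g3
        sdw2 += np.outer(y1, g2); sdT2 += -g2
        sdw1 += np.outer(x, g1);  sdT1 += -g1
    w3 -= alpha * sdw3; T3 -= alpha * sdT3   # apply once per epoch
    w2 -= alpha * sdw2; T2 -= alpha * sdT2
    w1 -= alpha * sdw1; T1 -= alpha * sdT1

print("epoch error reduced:", bool(epoch_error() < e0))
```

Because the updates touch the weights only at epoch boundaries, the inner pattern loop is exactly the part that the parallel algorithm distributes across workers.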
(Fig. 2. Flowcharts of the parallel batch pattern training algorithm: a) the Master defines the number of training patterns PT and processors p, sends PT/p patterns to each Worker, calculates steps 3 and 4 for its own training patterns, and finally sends the message to finish work; b) each Worker receives its PT/p patterns from the Master and processes them in the same loop until the finish message arrives.)
4 Experimental Results
The many-core high performance computing system Remus, located in the Innovative
Computing Laboratory of the University of Tennessee, USA, is used for the research. Remus
consists of two socket-G34 RD890 motherboards (AMD 890FX chipset) connected
to each other by AMD HyperTransport. Each motherboard contains two twelve-core
AMD Opteron 6180 SE processors with a clock rate of 2500 MHz and 132 GB of
local RAM; thus the total number of computational cores is 48. Each processor has
12x512 KB of L2 cache and 2x6 MB of L3 cache. We ran the experiments
using the MPI library Open MPI 1.6.3 [16].
Analogously to the work [13], the MNIST database of handwritten digits, which
contains 60,000 training images and 10,000 test images, is used to research the
parallelization efficiency on a large-scale data classification task. The size of an
input image in the MNIST database is 784 elements; therefore the number of input
neurons of our MLP is 784. In order to assess different sizes of the computational
problem, the number of neurons in the first hidden layer was varied as 80, 160, 240,
320 and 400, and the number of neurons in the second hidden layer as 50, 100, 150,
200 and 250. There are ten classes in the database, the digits 0…9; our large-scale
MLP model has only one output neuron. We did the training using a continuously
growing number of training images, from 10,000 patterns in the smallest scenario to
50,000 patterns in the biggest one. Thus, the following parallelization scenarios were
researched: 784-80-50-1/10,000 patterns, 784-160-100-1/20,000 patterns, 784-240-
150-1/30,000 patterns, 784-320-200-1/40,000 patterns, and 784-400-250-1/50,000
patterns. The last scenario is slightly smaller than the scenario 784-500-300-1/60,000
mentioned in [13].
The results of our previous research have shown that the parallelization
efficiency of the parallel batch pattern training algorithm does not depend on
the number of training epochs [17], since the neurons' weights and thresholds
are combined at the end of each epoch. Therefore, taking into account the
potentially huge computational time of the whole experiment, we researched the
parallelization efficiency of the parallel training algorithm for one hundred
training epochs only. The learning rates are constant and equal:
α1(t) = α2(t) = α3(t) = 0.6. The expressions S = Ts/Tp and E = S/p × 100% are
used to calculate the speedup and efficiency of parallelization, where Ts is
the execution time of the sequential routine and Tp is the execution time of
the parallel version of the same routine on p processors of a parallel system.
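The two metrics can be written out directly; the values below are illustrative, not measurements from the paper:

```python
def speedup(t_seq, t_par):
    """S = Ts / Tp."""
    return t_seq / t_par

def efficiency(t_seq, t_par, p):
    """E = S / p * 100%."""
    return speedup(t_seq, t_par) / p * 100.0

# Hypothetical example: a routine taking 480 s sequentially
# and 10.5 s on 48 cores.
s = speedup(480.0, 10.5)          # ≈ 45.7
e = efficiency(480.0, 10.5, 48)   # ≈ 95.2 (%)
print(s, e)
```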
The parallelization efficiency of the parallel batch pattern training algorithm
for the large-scale MLP with two hidden layers of neurons for one hundred
training epochs, and the average communication (messaging) overhead for the
five researched scenarios, are presented in Fig. 3 and Fig. 4 respectively. The
experimental results (Fig. 3) show high parallelization efficiency for the MLP
with two hidden layers of neurons: 96-98% on 48 cores for the three bigger
parallelization scenarios and 72-78% on 48 cores for the two smaller
parallelization scenarios. As expected, the communication overhead increases
with the number of cores used for the parallelization (Fig. 4).
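The batch pattern scheme whose efficiency is measured here relies on each worker summing per-pattern weight deltas over its own subset of patterns, with the partial sums then combined (e.g., via MPI_Allreduce) into a single update. A minimal pure-Python illustration of why this preserves the sequential result (our own sketch with the MPI reduction replaced by an in-process sum, not the paper's implementation):

```python
def batch_delta(patterns, grad):
    """Sequential batch rule: sum the per-pattern weight deltas."""
    return sum(grad(p) for p in patterns)

def parallel_batch_delta(patterns, grad, n_workers):
    """Each worker sums deltas over its own pattern subset; the
    partial sums are then combined (stand-in for MPI_Allreduce)."""
    partials = [
        sum(grad(p) for p in patterns[w::n_workers])
        for w in range(n_workers)
    ]
    return sum(partials)

# Addition is associative, so the combined update matches the
# sequential one and the number of epochs to converge is unchanged.
grad = lambda p: 0.1 * p          # toy per-pattern "gradient"
data = list(range(1, 101))
assert abs(batch_delta(data, grad)
           - parallel_batch_delta(data, grad, 8)) < 1e-9
```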
Thus, taking the obtained results into account, a large-scale data
classification task which, for successful results, may require, for example,
10^4 training epochs for the MLP 784-400-250-1 with two hidden layers of
neurons on the 50,000 training patterns of the MNIST database would be computed
in approximately 21 days by the sequential routine and in only ten and a half
hours on 48 cores of a many-core high performance computing system.
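As a rough consistency check of this extrapolation, using the rounded figures quoted above:

```python
seq_hours = 21 * 24   # ~21 days with the sequential routine
par_hours = 10.5      # ~10.5 hours on 48 cores
implied_speedup = seq_hours / par_hours
print(implied_speedup)  # 48.0, i.e. near-linear speedup on 48 cores
```

The rounded figures imply a speedup of about 48 on 48 cores, consistent with the 96-98% efficiency reported for the biggest scenarios.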
208 V. Turchenko and A. Sachenko
[Fig. 3. Parallelization efficiency (%) on 2, 4, 8, 16, 32 and 48 processors of Remus for the scenarios 784-80-50-1/10,000 patterns, 784-160-100-1/20,000 patterns, 784-240-150-1/30,000 patterns, 784-320-200-1/40,000 patterns and 784-400-250-1/50,000 patterns]
[Fig. 4. Communication overhead (seconds) on 2, 4, 8, 16, 32 and 48 processors of Remus]
5 Conclusions
The development of the parallel batch pattern back propagation training
algorithm for a multilayer perceptron with two hidden layers of neurons, and
the research of its parallelization efficiency for the large-scale data
classification task on a many-core high performance computing system, are
presented in this paper. The model of the multilayer perceptron and the batch
pattern back propagation training algorithm are theoretically described, and
the algorithmic description of the parallel batch pattern training method is
presented. Our results show high parallelization efficiency of the developed
algorithm on a many-core system with 48 cores: the parallelization efficiency
is (i) 96-98% for the three bigger parallelization scenarios, with 240, 320 and
400 neurons in the first hidden layer, 150, 200 and 250 neurons in the second
hidden layer, and 30,000, 40,000 and 50,000 training patterns respectively, and
(ii) 72-78% for the two smaller parallelization scenarios, with 80 and 160
neurons in the first hidden layer, 50 and 100 neurons in the second hidden
layer, and 10,000 and 20,000 training patterns respectively.
A future direction of our research is the investigation of parallel training
algorithms for other multilayer architectures, in particular convolutional
neural networks.
References
1. Haykin, S.: Neural Networks and Learning Machines, 3rd edn. Prentice Hall, New Jersey
(2008)
2. De Llano, R.M., Bosque, J.L.: Study of Neural Net Training Methods in Parallel and
Distributed Architectures. Future Generation Computer Systems 26(2), 183–190 (2010)
3. Čerňanský, M.: Training Recurrent Neural Network Using Multistream Extended Kalman
Filter on Multicore Processor and Cuda Enabled Graphic Processor Unit. In: Alippi, C.,
Polycarpou, M., Panayiotou, C., Ellinas, G. (eds.) ICANN 2009, Part I. LNCS, vol. 5768,
pp. 381–390. Springer, Heidelberg (2009)
4. Lotrič, U., Dobnikar, A.: Parallel Implementations of Recurrent Neural Network Learning.
In: Kolehmainen, M., Toivanen, P., Beliczynski, B. (eds.) ICANNGA 2009. LNCS,
vol. 5495, pp. 99–108. Springer, Heidelberg (2009)
5. Parallel Grid-aware Library for Neural Network Training, http://uweb.deis.unical.it/turchenko/research-projects/pagalinnet/
6. Turchenko, V., Grandinetti, L.: Scalability of Enhanced Parallel Batch Pattern BP Training
Algorithm on General-Purpose Supercomputers. In: de Leon F. de Carvalho, A.P.,
Rodríguez-González, S., De Paz Santana, J.F., Corchado Rodríguez, J.M. (eds.)
Distributed Computing and Artificial Intelligence. AISC, vol. 79, pp. 525–532. Springer,
Heidelberg (2010)
7. Turchenko, V., Grandinetti, L.: Parallel Batch Pattern BP Training Algorithm of Recurrent
Neural Network. In: 14th IEEE International Conference on Intelligent Engineering
Systems, Las Palmas de Gran Canaria, Spain, pp. 25–30 (2010)
8. Turchenko, V., Bosilca, G., Bouteiller, A., Dongarra, J.: Efficient Parallelization of Batch
Pattern Training Algorithm on Many-core and Cluster Architectures. In: 7th IEEE
International Conference on Intelligent Data Acquisition and Advanced Computing
Systems, Berlin, Germany, pp. 692–698 (2013)
9. Turchenko, V., Golovko, V., Sachenko, A.: Parallel Training Algorithm for Radial Basis
Function Neural Network. In: 7th International Conference on Neural Networks and
Artificial Intelligence, Minsk, Belarus, pp. 47–51 (2012)
10. Funahashi, K.: On the Approximate Realization of Continuous Mappings by Neural
Networks. Neural Networks 2, 183–192 (1989)
11. Hornik, K., Stinchcombe, M., White, H.: Multilayer Feedforward Networks are Universal
Approximators. Neural Networks 2, 359–366 (1989)
12. The MNIST Database of Handwritten Digits, http://yann.lecun.com/exdb/mnist/
13. Hinton, G.E., Osindero, S., Teh, Y.: A Fast Learning Algorithm for Deep Belief Nets.
Neural Computation 18, 1527–1554 (2006)
14. Golovko, V., Galushkin, A.: Neural Networks: Training, Models and Applications.
Radiotechnika, Moscow (2001) (in Russian)
15. Turchenko, V., Grandinetti, L., Bosilca, G., Dongarra, J.: Improvement of Parallelization
Efficiency of Batch Pattern BP Training Algorithm Using Open MPI. Procedia Computer
Science 1(1), 525–533 (2010)
16. Open MPI: Open Source High Performance Computing, http://www.open-mpi.org/
17. Turchenko, V.: Scalability of Parallel Batch Pattern Neural Network Training Algorithm.
Artificial Intelligence. Journal of Institute of Artificial Intelligence of National Academy
of Sciences of Ukraine 2, 144–150 (2009)