Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max Planck Institute for Informatics, Saarbruecken, Germany
Ying Tan, Yuhui Shi, Yi Chai, Guoyin Wang (Eds.)

Advances in Swarm Intelligence
Volume Editors
Ying Tan
Peking University
Key Laboratory of Machine Perception (MOE)
Department of Machine Intelligence
Beijing, 100871, China
E-mail: [email protected]
Yuhui Shi
Xi’an Jiaotong-Liverpool University
Department of Electrical and Electronic Engineering
Suzhou, 215123, China
E-mail: [email protected]
Yi Chai
Chongqing University
Automation College
Chongqing 400030, China
E-mail: [email protected]
Guoyin Wang
Chongqing University of Posts and Telecommunications
College of Computer Science and Technology
Chongqing, 400065, China
E-mail: [email protected]
This book and its companion volume, LNCS vols. 6728 and 6729, constitute
the proceedings of the Second International Conference on Swarm Intelligence
(ICSI 2011) held during June 12–15, 2011 in Chongqing, well known as the
Mountain City, the southwestern commercial capital of China. ICSI 2011 was
the second gathering in the world for researchers working on all aspects of swarm
intelligence, following the successful and fruitful Beijing ICSI event in 2010,
which provided a high-level international academic forum for the participants to
disseminate their new research findings and discuss emerging areas of research.
It also created a stimulating environment for the participants to interact and
exchange information on future challenges and opportunities in the field of swarm
intelligence research.
ICSI 2011 received 298 submissions from about 602 authors in 38 countries and regions (Algeria, American Samoa, Argentina, Australia, Austria, Belize, Bhutan, Brazil, Canada, Chile, China, Germany, Hong Kong, Hungary, India, Islamic Republic of Iran, Japan, Republic of Korea, Kuwait, Macau, Madagascar, Malaysia, Mexico, New Zealand, Pakistan, Romania, Saudi Arabia, Singapore, South Africa, Spain, Sweden, Chinese Taiwan, Thailand, Tunisia, Ukraine, UK, USA, Vietnam) across six continents (Asia, Europe, North America, South America, Africa, and Oceania). Each submission was reviewed by at least two reviewers, and by 2.8 reviewers on average. Based on rigorous reviews by the Program Committee members and reviewers, 143 high-quality papers were selected for publication in the proceedings, for an acceptance rate of 47.9%. The papers are organized in 23 cohesive sections covering all major topics of swarm intelligence research and development.
In addition to the contributed papers, the ICSI 2011 technical program included four plenary speeches by Russell C. Eberhart (Indiana University Purdue University Indianapolis (IUPUI), USA), K. C. Tan (National University of Singapore, Singapore, the Editor-in-Chief of the IEEE Computational Intelligence Magazine (CIM)), Juan Luis Fernández Martínez (University of Oviedo, Spain), and Fernando Buarque (University of Pernambuco, Brazil). Besides the regular oral sessions, ICSI 2011 had two special sessions on 'Data Fusion and Swarm Intelligence' and 'Fish School Search Foundations and Application' as well as several poster sessions covering a wide range of topics.
As organizers of ICSI 2011, we would like to express sincere thanks to
Chongqing University, Peking University, Chongqing University of Posts and
Telecommunications, and Xi’an Jiaotong-Liverpool University for their spon-
sorship, to the IEEE Computational Intelligence Society, World Federation on
Soft Computing, International Neural Network Society, and Chinese Association
for Artificial Intelligence for their technical co-sponsorship. We also thank the Natural Science Foundation of China for its financial and logistic support.
We would also like to thank the members of the Advisory Committee for their
guidance, the members of the International Program Committee and additional
reviewers for reviewing the papers, and members of the Publications Committee
for checking the accepted papers in a short period of time. Particularly, we are
grateful to the proceedings publisher Springer for publishing the proceedings in
the prestigious series of Lecture Notes in Computer Science. Moreover, we wish
to express our heartfelt appreciation to the plenary speakers, session chairs,
and student helpers. There are still many more colleagues, associates, friends,
and supporters who helped us in immeasurable ways; we express our sincere
gratitude to them all. Last but not least, we would like to thank all the speakers, authors, and participants for their great contributions, which made ICSI 2011 successful and all the hard work worthwhile.
General Chairs
Russell C. Eberhart Indiana University - Purdue University, USA
Dan Yang Chongqing University, China
Ying Tan Peking University, China
Publications Chairs
Rajkumar Roy Cranfield University, UK
Radu-Emil Precup Politehnica University of Timisoara, Romania
Yue Sun Chongqing University, China
Publicity Chairs
Xiaodong Li RMIT University, Australia
Haibo He University of Rhode Island Kingston, USA
Lei Wang Tongji University, China
Weiren Shi Chongqing University, China
Jin Wang Chongqing University of Posts and
Telecommunications, China
Finance Chairs
Chao Deng Peking University, China
Andreas Janecek University of Vienna, Austria
Additional Reviewers
Bi, Chongke
Cheng, Chi Tai
Damas, Sergio
Ding, Ke
Dong, Yongsheng
Duong, Tung
Fang, Chonglun
Guo, Jun
Henmi, Tomohiro
Hu, Zhaohui
Huang, Sheng-Jun
Kalra, Gaurav
Lam, Franklin
Lau, Meng Cheng
Leung, Carson K.
Lu, Qiang
Nakamura, Yukinori
Osunleke, Ajiboye
Qing, Li
Quirin, Arnaud
Saleem, Muhammad
Samad, Rosdiyana
Sambo, Francesco
Singh, Satvir
Sun, Fuming
Sun, Yang
Tang, Yong
Tong, Can
Vázquez, Roberto A.
Wang, Hongyan
Wang, Lin
Yanou, Akira
Zhang, Dawei
Zhang, X.M.
Zhang, Yong
Zhu, Yanqiao
Table of Contents – Part II
Intelligent Control
Using Genetic Algorithm for Parameter Tuning on ILC Controller
Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
Alireza Rezaee and Mohammad Jafarpour Jalali
PSO Algorithm with Chaos and Gene Density Mutation for Solving
Nonlinear Zero-One Integer Programming Problems . . . . . . . . . . . . . . . . . . 101
Yuelin Gao, Fanfan Lei, Huirong Li, and Jimin Li
Ant Colony Optimization for Global White Matter Fiber Tracking . . . . . 267
Yuanjing Feng and Zhejin Wang
Differential Evolution
Novel Binary Encoding Differential Evolution Algorithm . . . . . . . . . . . . . . 416
Changshou Deng, Bingyan Zhao, Yanling Yang, Hu Peng, and
Qiming Wei
Adaptive Learning Differential Evolution for Numeric Optimization . . . . 424
Yi Liu, Shengwu Xiong, Hui Li, and Shuzhen Wan
Differential Evolution with Improved Mutation Strategy . . . . . . . . . . . . . . 431
Shuzhen Wan, Shengwu Xiong, Jialiang Kou, and Yi Liu
Gaussian Particle Swarm Optimization with Differential Evolution
Mutation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
Chunqiu Wan, Jun Wang, Geng Yang, and Xing Zhang
Neural Networks
Evolving Neural Networks: A Comparison between Differential
Evolution and Particle Swarm Optimization . . . . . . . . . . . . . . . . . . . . . . . . . 447
Beatriz A. Garro, Humberto Sossa, and Roberto A. Vázquez
Identification of Hindmarsh-Rose Neuron Networks Using GEO
metaheuristic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Lihe Wang, Genke Yang, and Lam Fat Yeung
Delay-Dependent Stability Criterion for Neural Networks of
Neutral-Type with Interval Time-Varying Delays and Nonlinear
Perturbations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464
Guoquan Liu, Simon X. Yang, and Wei Fu
Application of Generalized Chebyshev Neural Network in Air Quality
Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
Fengjun Li
Financial Time Series Forecast Using Neural Network Ensembles . . . . . . . 480
Anupam Tarsauliya, Rahul Kala, Ritu Tiwari, and Anupam Shukla
Selection of Software Reliability Model Based on BP Neural Network . . . 489
Yingbo Wu and Xu Wang
Genetic Algorithms
Atavistic Strategy for Genetic Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
Dongmei Lin, Xiaodong Li, and Dong Wang
Evolutionary Computation
Evaluation of Two-Stage Ensemble Evolutionary Algorithm for
Numerical Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
Yu Wang, Bin Li, Kaibo Zhang, and Zhen He
A Novel Genetic Programming Algorithm for Designing Morphological
Image Analysis Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549
Jun Wang and Ying Tan
Fuzzy Methods
Optimizing Single-Source Capacitated FLP in Fuzzy Decision
Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
Liwei Zhang, Yankui Liu, and Xiaoqing Wang
New Results on a Fuzzy Granular Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568
Xu-Qing Tang and Kun Zhang
Fuzzy Integral Based Data Fusion for Protein Function Prediction . . . . . 578
Yinan Lu, Yan Zhao, Xiaoni Liu, and Yong Quan
Hybrid Algorithms
Gene Clustering Using Particle Swarm Optimizer Based Memetic
Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587
Zhen Ji, Wenmin Liu, and Zexuan Zhu
Hybrid Particle Swarm Optimization with Biased Mutation Applied to
Load Flow Computation in Electrical Power Systems . . . . . . . . . . . . . . . . . 595
Camila Paes Salomon, Maurilio Pereira Coutinho,
Germano Lambert-Torres, and Cláudio Ferreira
1 Introduction
The production scheduling problem is one of the most important tasks carried out in manufacturing systems and has received considerable attention in the operations research literature. In this area, it is usually assumed that all the jobs to be processed are available at the beginning of the whole planning horizon. However, in many real situations, jobs may arrive over time due to transportation and other factors.
Many approaches have been proposed to solve production scheduling problems, such as branch and bound [1], genetic algorithms [2], and tabu search [3]. However, these methods usually offer good-quality solutions at the cost of a huge amount of computational time. Furthermore, these techniques are not applicable in dynamic or uncertain conditions, because the original schedules would have to be modified frequently in response to changes in system status.
Scheduling with scheduling rules (SRs), which define only the next state of the system, is highly effective in such dynamic environments [4]. Due to the inherent complexity and variability of scheduling problems, a considerable effort is needed to develop suitable SRs for the problem at hand. Many researchers have investigated the use of genetic programming (GP) to create problem-specific SRs [4][5][6]. In our previous work, we applied gene expression programming (GEP), a new evolutionary algorithm, to the dynamic single-machine scheduling problem (DSMSP) with job release dates, and demonstrated that GEP is more promising than GP for creating efficient SRs [7].
* Corresponding author.
2 Problem Description
The DSMSP with job release dates is described as follows. The shop floor consists of one machine and n jobs, which are released over time and are each processed once on the machine without preemption. The attributes of a job, such as its processing time, release date, and due date, are unknown in advance until the job arrives at the machine or is about to arrive. It is assumed that the machine is available all the time and cannot process more than one job simultaneously. The task of scheduling is to determine a sequence of jobs on the machine that minimizes several optimization criteria simultaneously; in our case, makespan, mean flow time, maximum lateness, and mean tardiness. The four performance criteria are defined below.
F1 = Cmax = max(ci, i = 1, ..., n) . (1)

F2 = F̄ = (1/n) ∑_{i=1}^{n} (ci − ri) . (2)

F3 = Lmax = max(ci − di, i = 1, ..., n) . (3)

F4 = T̄ = (1/n) ∑_{i=1}^{n} max(ci − di, 0) . (4)
where ci, ri, and di denote the completion time, release date, and due date of job i, respectively, and n denotes the number of jobs. Cmax, F̄, Lmax, and T̄ denote the makespan, mean flow time, maximum lateness, and mean tardiness, respectively.
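For concreteness, the four criteria can be evaluated directly from a schedule's completion times; the following minimal Python sketch (the function name and list-based representation are ours, not the authors') computes Eqs. (1)-(4).

    # Minimal sketch: evaluating criteria (1)-(4) for one schedule.
    # c[i], r[i], d[i]: completion time, release date, due date of job i.
    def criteria(c, r, d):
        n = len(c)
        f1 = max(c)                                            # makespan, Eq. (1)
        f2 = sum(ci - ri for ci, ri in zip(c, r)) / n          # mean flow time, Eq. (2)
        f3 = max(ci - di for ci, di in zip(c, d))              # maximum lateness, Eq. (3)
        f4 = sum(max(ci - di, 0) for ci, di in zip(c, d)) / n  # mean tardiness, Eq. (4)
        return f1, f2, f3, f4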
where X is a feasible solution, Ω is the feasible solution space, F(·) is the objective vector, and Fr(·) is the r-th objective function (1 ≤ r ≤ L).
A solution a dominates a solution b (or b is dominated by a) if the following conditions are satisfied:

Fi(a) ≤ Fi(b), ∀i ∈ {1, 2, ..., L}, (6)

∃j ∈ {1, 2, ..., L} : Fj(a) < Fj(b). (7)
Step 3: If the number of non-dominated solutions in population Pt does not exceed the archive size M, the external archive NDSet is filled with the non-dominated solutions in Pt; otherwise, if the number of non-dominated solutions exceeds the archive size M, some members of the archive NDSet are removed according to the diversity-maintaining scheme (Section 4.3).
Step 4: If iter exceeds the maximal number of iterations, the algorithm terminates and NDSet is output; otherwise, go to Step 5.
Step 5: According to the elitist strategy (Section 4.4), the individuals in the external archive NDSet are copied directly to the next population Pt+1.
Step 6: Genetic operators (Section 4.6) are applied to population Pt and the offspring individuals are saved into the population Pt+1, whose size is maintained at N. Then increment the iteration counter, iter = iter + 1, and go to Step 2.
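The loop structure of these steps can be summarized as follows. This is a schematic Python sketch only; nondominated, trim_by_density, and genetic_operators are hypothetical helpers standing in for the non-dominated sorting, the diversity-maintaining scheme (Section 4.3), and the genetic operators (Section 4.6).

    # Schematic sketch of the MOGEP main loop (Steps 3-6).
    def mogep(pop, M, N, max_iter, nondominated, trim_by_density, genetic_operators):
        nd_set = []
        for it in range(max_iter):                      # Step 4: termination test
            nd = nondominated(pop)                      # Step 3: fill the archive
            nd_set = nd if len(nd) <= M else trim_by_density(nd, M)
            next_pop = list(nd_set)                     # Step 5: elitism
            while len(next_pop) < N:                    # Step 6: offspring fill Pt+1
                next_pop.append(genetic_operators(pop))
            pop = next_pop
        return nd_set                                   # output NDSet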
For MOPs, the fitness assignment scheme is very important: an effective scheme ensures that the search is directed toward the Pareto-optimal solutions. In this paper, a fitness assignment scheme which combines the Pareto dominance relation and density information is proposed. Each individual at each generation is evaluated according to the following steps: (1) the rank of the individual is determined; (2) the density of the individual is estimated; (3) the fitness of the individual is determined by incorporating its density information into its rank.
The non-dominated sorting algorithm [9] is used to define a rank for each individual. According to the Pareto dominance relation, the population is split into different non-dominated fronts PF1, PF2, ..., PFG, where G is the number of non-dominated fronts. Each individual in PFj+1 is dominated by at least one individual in PFj (j = 1, ..., G−1), and the individuals within each non-dominated front PFj (j = 1, ..., G) are mutually non-dominated. The rank of each individual i in PFj is assigned as below:

R(i) = j − 1 (i ∈ PFj) . (8)
Since the individuals in each non-dominated front do not dominate each other and have identical rank, additional density information must be incorporated to discriminate between them. The density estimation technique of [8] is used to define an order among the individuals in PFj (j = 1, ..., G). Specifically, for each individual i in PFj, the distances to all individuals in PFj are calculated and stored in increasing order. The k-th element is denoted d_i^k, where k is set to the square root of the front size. The density D(i) corresponding to individual i is defined as a function of d_i^k and

d_max = max{d_i^k : i ∈ PFj} . (10)
Under this fitness assignment scheme, the fitness of the individuals in PF1 lies in the interval [G−1, G), the fitness of those in PF2 lies in [G−2, G−1), and the fitness of those in PFG lies in [0, 1). Note that fitness is to be maximized here; in other words, a better individual is assigned a higher fitness so that it may pass its good genes to offspring with a higher probability.
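A sketch of the rank-plus-density fitness computation is given below. Since Eq. (9) is not reproduced in this excerpt, the density formula used here, D(i) = d_i^k / (d_max + 1), is only an assumption chosen to lie in [0, 1), consistent with the stated fitness intervals.

    import math

    def assign_fitness(fronts, dist):
        """fronts: list of non-dominated fronts PF1..PFG (lists of individual ids);
        dist[a][b]: objective-space distance between individuals a and b."""
        G = len(fronts)
        fitness = {}
        for j, front in enumerate(fronts, start=1):
            if len(front) == 1:
                d_k = {front[0]: 0.0}
            else:
                k = int(math.sqrt(len(front)))          # k = sqrt of the front size
                d_k = {i: sorted(dist[i][o] for o in front if o != i)[k - 1]
                       for i in front}
            d_max = max(d_k.values())
            for i in front:
                density = d_k[i] / (d_max + 1.0)        # assumed form of Eq. (9)
                fitness[i] = (G - j) + density          # PFj fitness lies in [G-j, G-j+1)
        return fitness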
Apart from the population, an external archive of fixed size is used to save the non-dominated individuals of the population. If the number of non-dominated individuals exceeds the predefined archive size, some individuals must be deleted from the archive. In order to maintain the diversity of the population, the individuals with higher density, i.e., those with lower fitness, should be deleted from the archive.
The function set (FS) and terminal set (TS) used to construct SRs are defined as follows. FS = {+, −, *, /}. TS = {p, r, d, sl, st, wt}, where p denotes the job's processing time; r denotes the job's release date; d denotes the job's due date; sl denotes the job's positive slack, sl = max{d − p − max{t, r}, 0}, where t denotes the idle time of the machine; st denotes the job's stay time, st = max{t − r, 0}, with t defined as above; and wt denotes the job's wait time, wt = max{r − t, 0}, with t defined as above.
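As an illustration, the terminals for one job at the moment the machine becomes idle (time t) could be computed as below; the function itself is our sketch, not the authors' code.

    # Sketch: terminal-set attributes of a job at machine idle time t.
    # p, r, d: the job's processing time, release date and due date.
    def terminals(p, r, d, t):
        sl = max(d - p - max(t, r), 0)   # positive slack
        st = max(t - r, 0)               # stay time (job waited for the machine)
        wt = max(r - t, 0)               # wait time (machine waits for the job)
        return {'p': p, 'r': r, 'd': d, 'sl': sl, 'st': st, 'wt': wt}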
Chromosomes are encoded according to the following stipulations: (1) the head may contain symbols from both FS and TS, whereas the tail consists only of symbols from TS; (2) the lengths of head and tail must satisfy the equation tl = hl · (arg − 1) + 1, where hl and tl are the lengths of the head and tail, respectively, and arg is the maximum number of arguments over all operations in FS. An example chromosome expressed with the elements of FS and TS defined above is illustrated in Fig. 1(a), where underlines are used to indicate the tail.
Decoding is the process of transforming chromosomes into SRs. The example chromosome shown in Fig. 1(a) is mapped into an expression tree (ET) in a depth-first fashion (Fig. 1(b)). The ET is then interpreted as an SR in mathematical form, as shown in Fig. 1(c).
Fig. 1. An example of decoding: (a) the chromosome −.*.+.r.d./.sl.p.wt.st.wt.p.sl (tail underlined); (b) the corresponding expression tree (ET); (c) the decoded SR, (r+d)*sl/p − wt.
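A compact way to reproduce this depth-first decoding is to read the chromosome as a prefix expression. The recursive sketch below (our illustration, not the authors' code) decodes the Fig. 1 chromosome, leaving the unused tail symbols unread, as GEP does.

    # Sketch: depth-first (prefix) decoding of a GEP chromosome into an SR string.
    FS = {'+', '-', '*', '/'}

    def decode(genes):
        """Consume a prefix gene list recursively and build an infix expression."""
        sym = genes.pop(0)
        if sym in FS:
            left = decode(genes)
            right = decode(genes)
            return '(' + left + sym + right + ')'
        return sym

    chrom = ['-', '*', '+', 'r', 'd', '/', 'sl', 'p', 'wt', 'st', 'wt', 'p', 'sl']
    print(decode(list(chrom)))   # (((r+d)*(sl/p))-wt), i.e. (r+d)*sl/p - wt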
The genetic operators carried out on the population are listed below [15]:
The selection operator creates a mating pool of individuals selected from the current population according to fitness by roulette-wheel sampling. The roulette is spun N−M times in order to keep the population size unchanged (note that M individuals are copied directly from NDSet).
The mutation operator randomly changes symbols in a chromosome. In order to maintain the structural organization of chromosomes, any symbol in the head can change into any other function or terminal, while symbols in the tail can only change into terminals.
The transposition operator: (1) IS transposition, i.e., randomly choose a fragment beginning with a function or terminal (called an IS element) and transpose it into the head of a gene, anywhere except the root; (2) RIS transposition, i.e., randomly choose a fragment beginning with a function (called an RIS element) and transpose it to the root of a gene. In order not to affect the tail of the gene, symbols are removed from the end of the head to make room for the inserted string.
The recombination operator: (1) one-point recombination, i.e., split the two randomly chosen parent chromosomes into two parts at the same point and swap the corresponding sections; (2) two-point recombination, i.e., split the chromosomes into three portions and swap the middle one.
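A point-mutation sketch that respects the head/tail stipulations might look as follows; the symbol sets are those defined above, while the function itself is illustrative rather than the authors' implementation.

    import random

    FS = ['+', '-', '*', '/']
    TS = ['p', 'r', 'd', 'sl', 'st', 'wt']

    def mutate(chrom, hl, pm=0.3):
        """Point mutation preserving GEP structure: head positions (index < hl)
        may take any function or terminal, tail positions only terminals."""
        out = list(chrom)
        for i in range(len(out)):
            if random.random() < pm:
                out[i] = random.choice(FS + TS) if i < hl else random.choice(TS)
        return out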
In the experiments, the MOGEP parameter settings are as follows. The population size is 200. The length of the head is 10, and thereby the total length of a chromosome is 21. The mutation probability is 0.3. The IS and RIS transposition probabilities are 0.3 and 0.1, respectively. The one-point and two-point recombination probabilities are 0.2 and 0.5, respectively. The algorithm stops after 100 iterations.
The performance of the SRs created by GEP and MOGEP on the training instances is summarized below:

Ins.  Obj.  |        GEP                    |        MOGEP
            | R-F1   R-F2   R-F3   R-F4    |  R1     R2     R3     R4
1-1   F1      5167   5168   5167   5215      5167   5167   5167   5167
      F2      2447   1569   2434   2270      1557   2397   2428   2292
      F3      3422   3219   1330   2649      3218   1338   1330   2601
      F4       675    380    385    235       385    368    385    234
1-2   F1      4691   4696   4691   4691      4691   4691   4697   4697
      F2      2106   1385   2091   1982      1386   2055   2083   2032
      F3      2381   2691   1083   1620      2686   1083   1089   1549
      F4       597    339    225    146       338    215    218    149
1-3   F1      4896   4907   4896   4966      4896   4896   4905   4905
      F2      2269   1503   2289   2139      1515   2257   2264   2149
      F3      3162   2768   1217   2290      2757   1217   1226   2202
      F4       665    398    343    209       407    331    330    200
1-4   F1      5230   5230   5230   5274      5230   5230   5230   5230
      F2      2337   1544   2313   2205      1544   2267   2303   2218
      F3      2947   3313   1280   2183      3313   1280   1280   1815
      F4       639    340    285    185       340    266    285    178
1-5   F1      4604   4606   4604   4604      4604   4604   4606   4606
      F2      2017   1294   2081   1912      1292   2043   2075   1967
      F3      2578   2928   1141   1901      2926   1141   1143   1903
      F4       568    325    305    168       325    290    303    178
6 Conclusions
Considering the fact that, in many real scheduling situations, jobs usually arrive over time and several optimization objectives must be considered simultaneously, we proposed MOGEP and applied it to the construction of SRs for the DSMSP. MOGEP was equipped with a fitness assignment scheme, a diversity-maintaining strategy, and an elitist strategy on the basis of the original GEP. Simulation experiment results demonstrate that MOGEP creates effective SRs which generate good Pareto-optimal solutions for the DSMSP. These findings encourage the further improvement of MOGEP and its application to more complex scheduling problems.
References
1. Balas, E.: Machine scheduling via disjunctive graphs: an implicit enumeration algorithm.
Oper. Res. 17, 941–957 (1969)
2. Goldberg, D.: Genetic Algorithms in Search, Optimization and Machine Learning. Addi-
son-Wesley, Reading (1989)
3. Laguna, M., Barnes, J., Glover, F.: Tabu search methods for a single machine scheduling
problem. J. Intell. Manuf. 2, 63–74 (1991)
4. Jakobović, D., Budin, L.: Dynamic Scheduling with Genetic Programming. In: Collet, P.,
Tomassini, M., Ebner, M., Gustafson, S., Ekárt, A. (eds.) EuroGP 2006. LNCS, vol. 3905,
pp. 73–84. Springer, Heidelberg (2006)
5. Atlan, L., Bonnet, J., Naillon, M.: Learning Distributed Reactive Strategies by Genetic
Programming for the General Job Shop Problem. In: 7th Annual Florida Artificial Intelli-
gence Research Symposium. IEEE Press, Florida (1994)
6. Miyashita, K.: Job-shop Scheduling with Genetic Programming. In: Genetic and Evolu-
tionary Computation Conference, pp. 505–512. Morgan Kaufmann, San Francisco (2000)
7. Nie, L., Shao, X.Y., Gao, L., Li, W.D.: Evolving Scheduling Rules with Gene Expression
Programming for Dynamic Single-machine Scheduling Problems. Int. J. Adv. Manuf.
Tech. 50, 729–747 (2010)
8. Zitzler, E., Thiele, L.: Multiobjective Evolutionary Algorithms: A Comparative Case
Study and the Strength Pareto Approach. IEEE T. Evolut. Comput. 3(4), 257–271 (1999)
9. Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A Fast Elitist Nondominated Sorting Ge-
netic Algorithm for Multi-objective Optimization: NSGA-II. In: Schoenauer, M., Deb,
K., Rudolph, G., Yao, X., Lutton, E., Merelo, J.J., Schwefel, H.-P. (eds.) Parallel Problem
Solving from Nature – PPSN VI, pp. 849–858. Springer, Berlin (2000)
10. Fonseca, C.M., Fleming, P.J.: Genetic Algorithms for Multiobjective Optimization: For-
mulation, Discussion and Generalization. In: 5th International Conference on Genetic Al-
gorithms, pp. 416–423. Morgan Kaufmann, California (1993)
11. Horn, J., Nafpliotis, N., Goldberg, D.E.: A Niched Pareto Genetic Algorithm for Multiob-
jective Optimization. In: 1st IEEE Conference on Evolutionary Computation, IEEE World
Congress on Computational Intelligence, pp. 82–87. IEEE Press, New Jersey (1994)
12. Srinivas, N., Deb, K.: Multiobjective Optimization Using Nondominated Sorting in Ge-
netic Algorithms. Evol. Comput. 2(3), 221–248 (1994)
13. Zitzler, E., Deb, K., Thiele, L.: Comparison of Multiobjective Evolutionary Algorithms:
Empirical Results. Evol. Comput. 8(2), 173–195 (2000)
14. Kacem, I., Hammadi, S., Borne, P.: Pareto-optimality Approach for Flexible Job-shop
Scheduling Problems: Hybridization of Evolutionary Algorithms and Fuzzy Logic. Math.
Comput. Simulat. 60, 245–276 (2002)
15. Ferreira, C.: Gene Expression Programming: A New Adaptive Algorithm for Solving
Problems. Complex System 13(2), 87–129 (2001)
16. Ferreira, C.: Discovery of the Boolean Functions to the Best Density-Classification Rules
Using Gene Expression Programming. In: Foster, J.A., Lutton, E., Miller, J., Ryan, C.,
Tettamanzi, A.G.B. (eds.) EuroGP 2002. LNCS, vol. 2278, pp. 50–60. Springer, Heidel-
berg (2002)
17. Zou, C., Nelson, P.C., Xiao, W., Tirpak, T.M.: Discovery of Classification Rules by Using
Gene Expression Programming. In: International Conference on Artificial Intelligence, Las
Vegas, pp. 1355–1361 (2002)
18. Zuo, J., Tang, C., Li, C., Yuan, C., Chen, A.: Time Series Prediction Based on Gene Ex-
pression Programming. In: Li, Q., Wang, G., Feng, L. (eds.) WAIM 2004. LNCS,
vol. 3129, pp. 55–64. Springer, Heidelberg (2004)
19. Chen, Y., Tang, C., Zhu, J.: Clustering without Prior Knowledge Based on Gene Expres-
sion Programming. In: 3rd International Conference on Natural Computation, pp. 451–455
(2007)
Research of Pareto-Based Multi-Objective Optimization
for Multi-Vehicle Assignment Problem Based on MOPSO
1 Introduction
Nowadays, the importance and complexity of vehicle assignment systems for on-demand transportation are growing, and research approaches are evolving. Vehicle assignment and job allocation problems focus on how to allocate and schedule vehicles to perform missions at each destination so as to maximize the effectiveness of the overall mission, involving goal assignment, trajectory optimization, time or job requirements, etc. [1].
Some models of typical assignment and scheduling problems have been adopted and adapted, including Mixed-Integer Linear Programming (MILP) [2], Binary Linear Programming (BLP) [3], the Linear Ordering Problem (LOP) [4], the Traveling Salesman Problem (TSP) [5], models based on computational intelligence algorithms [6,7], and so on.
Generally, only the time consumption is considered in the abovementioned models. Yet this is a multi-objective optimization problem with complex constraints involving cost, time, distance, and so on, and its solution is a set of optimal solutions. In this paper, a Pareto-based multi-objective optimization strategy is utilized; moreover, all constraints are treated as an additional objective in the following section. In Section 3, a multi-objective particle swarm optimizer (MOPSO) is combined with the proposed model to handle the resulting three-objective optimization problem.
Table 1. Nomenclature

Item                  Explanation
Tij = {tij}(N+M)×N    Time cost matrix; tij is the flight time from node i to node j.
CNM = {cn,m}N×M       Cumulative time matrix; cn,m stands for the cumulative time of both the n-th target and the m-th vehicle.
ONK = {oik}1×N·K      Target sequence array; oik means the execution of the k-th task at the i-th target.
ΠKN = {πk,n}K×N       UAV assignment matrix; πk,n is the number of the vehicle assigned to perform the k-th task at the n-th target.
Since the tasks at the various destinations differ in importance and the vehicles differ in implementation capacity, the total profit (see Equation (4)) reflects the effectiveness of completing each task under a given assignment scheme. V1, V2, ..., Vm are defined as the preferred values of the targets, and Pij represents the ability of the vehicles to perform the tasks, where i indexes the vehicle sequence array and j the task sequence array. The total profit over all tasks can thus be calculated as follows:

max J2 = ∑_{i∈M, j∈N} Vi · Pij (4)
3 MOPSO Application
Based on the classic Particle Swarm Optimizer (PSO), whose parameters are set as in Table 2, and with the multi-objective optimization problem in mind, some modifications are introduced; the improved PSO is called MOPSO.
The purpose of multi-objective optimization is to find the Pareto-optimal set and the Pareto front, so as to provide a series of potential solutions to decision makers, who can reach the final plan by synthesizing other information and requirements. The MOPSO includes external archive maintenance, pbest and gbest information updating, and so on.
3.2 Encoding
The position vector X of a particle in MOPSO is a K·(N+M) vector with real values (see Equation (7)), which consists of two parts: the first K·N elements correspond to the target sequence array, and the last K·M variables stand for the assignment matrix. X_KN^O is sorted first, and then the moduli of the sorted serial numbers are calculated to obtain the target sequence array, as explained in Table 3; X_KM^Π is treated in the same way as X_KN^O.

X = [x1, ..., xn, ..., xK·N, xK·N+1, ..., xv, ..., xK·N+K·M] = [X_KN^O, X_KM^Π] (7)
Table 3. Example of decoding X_KN^O into the target sequence array ONK

X_KN^O       X_KN^O(1)   X_KN^O(2)   X_KN^O(3)   X_KN^O(4)
Real value   12.315      10.343      55.819      19.556
ONK          3           4           1           2
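One reading of this decoding that is consistent with Table 3 is to replace each real value by its serial number when the vector is sorted in descending order; the sketch below reproduces the Table 3 example (the subsequent modulo step that maps serial numbers to targets is omitted here).

    # Sketch: random-key style decoding matching Table 3.
    def ranks_desc(x):
        """Serial number of each position when x is sorted in descending order."""
        order = sorted(range(len(x)), key=lambda i: x[i], reverse=True)
        rank = [0] * len(x)
        for serial, i in enumerate(order, start=1):
            rank[i] = serial
        return rank

    print(ranks_desc([12.315, 10.343, 55.819, 19.556]))   # [3, 4, 1, 2]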
The maintenance strategy of the Pareto pool, for both pBest and gBest, is as follows:
a. If the current solution dominates some of the solutions in the Pareto pool, delete the dominated particles and add the current solution to the Pareto pool.
b. A particle that is dominated by a Pareto pool particle is directly ignored.
c. If the current particle and the Pareto pool particles have no dominance relation, and the population of the Pareto pool has not reached the scale of the pool, then the particle joins the Pareto pool directly. Otherwise, calculate the particle's distances to the other particles and remove the particle with the minimum density distance.
The pBest for the next iteration is selected from the Pareto pool by roulette-wheel selection. All existing gBest candidates in the set are ranked by density; a gBest with lower density has a higher probability of being selected as the gBest for the next generation.
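Rules a-c amount to a standard bounded non-dominated archive. A Python sketch follows, assuming minimization on all fitness components and a user-supplied distance function dist; both are assumptions, since the excerpt does not fix them.

    def dominates(a, b):
        """Pareto dominance on fitness vectors (minimization assumed)."""
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def update_pool(pool, p, max_size, dist):
        """Apply maintenance rules a-c for a candidate fitness vector p."""
        if any(dominates(q, p) for q in pool):              # rule b: p is ignored
            return pool
        pool = [q for q in pool if not dominates(p, q)]     # rule a: drop dominated
        pool.append(p)
        if len(pool) > max_size:                            # rule c: density removal
            crowded = min(pool, key=lambda q: min(dist(q, o) for o in pool if o is not q))
            pool.remove(crowded)
        return pool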
Two scenarios with different numbers of destinations and vehicles and two task requirements are studied. The format of a 'Scenario' is [Destination, Vehicle], which gives the numbers of destinations and vehicles, respectively.
The total mission time cost to be minimized includes the flight time between nodes and the task execution time: Tij, an N×(N+M) matrix, represents the flight time between nodes, while Tsk, a K×N×M matrix, represents the time cost of task execution.
For Scenario [2, 3], with two destinations, three vehicles, and two tasks, the preferred values of the targets are shown in Table 4 and the profits of the vehicles performing the different tasks are listed in Table 5. A set of Pareto solutions can be obtained by MOPSO; the Pareto front fitness values are shown in Table 6.
Table 6. Pareto front fitness values for Scenario [2, 3]

Objective  Fitness
J1         11.5131   16.88552   126.6441   213.18341
J2         185       195        209        210
J3         0         0          0          0
For Scenario [3, 4], with three destinations, four vehicles, and two tasks, the preferred values and profits are shown in Table 7 and Table 8. The Pareto front fitness values are shown in Table 9.
Table 9. Pareto front fitness values for Scenario [3, 4]

Objective  Fitness
J1         10    11.51671   12.98741   25.84051   219.42261
J2         475   525        645        670        710
J3         0     0          0          0          0
5 Conclusion
Many experiments with different scenarios have been carried out; for each scenario, a set of Pareto-optimal solutions can be obtained. On the other hand, the Pareto fronts are not continuous, owing to the constraint requirements captured by the third fitness J3, which limit the feasible solutions.
In summary, the proposed multi-objective vehicle assignment model can reduce the dimension of the solution space and is easily handled by MOPSO algorithms. Furthermore, the constraint treatment strategy, which regards violations as an additional objective, is an effective method. Future work will refine the model for more complicated scenarios and improve the algorithm's flexibility, stability, and distribution uniformity for more tasks.
Acknowledgments
References
1. Chandler, P., Pachter, M., Swaroop, D., Fowler, J.: Complexity in UAV cooperative con-
trol. In: American Control Conference, ACC, pp. 1831–1836 (2002)
2. Schumacher, C., Chandler, P.R., Pachter, M., Pachter, L.S.: Optimization of Air Vehicle
Operations Using Mixed-Integer Linear Programming. Air Force Research Lab
(AFRL/VACA) Wright-Patterson AFB, OH Control Theory Optimization Branch (2006)
3. Guo, W., Nygard, K.E., Kamel, A.: Combinatorial Trading Mechanism for Task Alloca-
tion. In: Proceedings of the 14th International Conference on Computer Applications in In-
dustry and Engineering, Las Vegas, Nevada, USA (2001)
4. Arulselvan, A., Commander, C.W., Pardalos, P.M.: A hybrid genetic algorithm for the tar-
get visitation problem. Naval Research Logistics (2007)
5. Vijay, K.S., Moises, S., Rakesh, N.: Priority-based assignment and routing of a fleet of
unmanned combat aerial vehicles. Elsevier Science Ltd. 35, 1813–1828 (2008)
6. Pan, F., Hu, X., Eberhart, R., Chen, Y.: A New UAV Assignment Model Based on PSO.
In: IEEE Swarm Intelligence Symposium (SIS 2008), St. Louis, USA (2008)
7. Pan, F., Chen, J., Tu, X.-Y., Cai, T.: A multiobjective-based vehicle assignment model for
constraints handling in computational intelligence algorithms. In: International Conference
Humanized Systems 2008, Beijing, P.R. China (2008)
Correlative Particle Swarm Optimization for
Multi-objective Problems
1 Introduction
min_{y∈Y} F(y), y ∈ R^D (1)

where y = [y1, y2, ..., yD] is a vector of D decision variables and F = [f1, f2, ..., fM] comprises the M objectives to be minimized.
In the absence of any preference information, a set of solutions is obtained in which each solution is equally significant. Pareto dominance and Pareto optimality are defined as follows:
Definition 3 (Pareto front). The front obtained by mapping the Pareto-optimal set (OS) into the objective space is called the POF:

POF = {f = (f1(x), ..., fM(x)) | x ∈ OS} (2)
Vi,j(t + 1) = w·Vi,j(t) + c1·r1·(pbesti,j(t) − Xi,j(t)) + c2·r2·(gbestj(t) − Xi,j(t)), (3)

Xi,j(t + 1) = Xi,j(t) + Vi,j(t + 1), (4)

where w is the inertia weight; c1 and c2 are positive constants known as acceleration coefficients; the random factors r1 and r2 are independent uniform random numbers in the range [0, 1]. The velocity can be restricted to the range [−vmax, vmax] to prevent particles from moving out of the search range, where vmax represents the maximal magnitude of the elements vi,j of the velocity vector vi.
In the SPSO model, a strategy with independent random coefficients is used to process gbest and pbest. This strategy does not differentiate between exploiting gbest and pbest, and lets the cognitive and social components of the whole swarm contribute randomly to the position of each particle in the next iteration. In CPSO, correlative factors, i.e., correlated random factors, are used to process gbest and pbest and to create a relationship between them.
In [8], Shen and Wang pointed out that positive correlation between the random factors can maintain population diversity. In order to improve the diversity of solutions, the random factors are positively correlated in this paper. The Gaussian copula is used to describe the correlated random factors; it belongs to the elliptical copula family and is by far the most popular copula used in the framework of intensity or structural models because it is easy to simulate. The updating velocity of the particle is calculated as in Eq. (3), with the correlative factors (r1, r2) drawn jointly from

H(r1, r2) = Φρ(Φ⁻¹(r1), Φ⁻¹(r2)), (5)

where H is the joint distribution function of the correlative factors, Φρ denotes the joint distribution function of a standard 2-dimensional normal random vector with correlation coefficient ρ, Φ is the univariate standard normal distribution function, and Φ⁻¹ is its inverse. Here ρ denotes the correlation coefficient between the correlated random factors r1 and r2, with 0 < ρ < 1.
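Sampling a correlated pair (r1, r2) from the Gaussian copula is straightforward: draw a bivariate standard normal with correlation ρ and push each coordinate through Φ. The sketch below is our illustration of this standard construction, not the authors' code.

    import math
    import random

    def correlated_factors(rho):
        """Draw (r1, r2), each uniform on [0, 1], coupled by the Gaussian copula:
        sample a bivariate standard normal with correlation rho, then map each
        coordinate through the standard normal CDF Phi."""
        z1 = random.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * random.gauss(0.0, 1.0)
        phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        return phi(z1), phi(z2)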
4 MO-CPSO
In single-objective problems, the term gbest represents the best solution obtained by the whole swarm. In MO problems, several conflicting objectives must be optimized simultaneously, and the number of non-dominated solutions located on or near the Pareto front is more than one. To resolve this, the concept of non-dominance is used and an archive of non-dominated solutions is maintained, from which a solution is picked as the gbest in MO-CPSO. This historical archive stores non-dominated solutions to prevent the loss of good particles. The archive is updated at each cycle: if a candidate solution is not dominated by any member of the archive, it is added to the archive; likewise, any archive members dominated by this solution are removed from the archive.
In MO problems, there are many non-dominated solutions located on the Pareto front. This paper introduces a disturbance operation on the non-dominated solutions in the archive, in an attempt to find better solutions or other non-dominated solutions. The disturbance operation randomly selects m non-dominated solutions from the archive and adds noise to their positions, as shown in (6):

Xi,j(t) = Xi,j(t) + b·η·Xi,j(t) (6)
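A sketch of the disturbance operation (6) is shown below; since the excerpt does not define b and η, we assume b is a small scale factor and η is standard Gaussian noise.

    import random

    def disturb(archive_positions, m, b=0.1):
        """Disturbance (6) on m randomly chosen archive members (in place).
        b (scale) and eta (standard Gaussian noise here) are assumptions."""
        for x in random.sample(archive_positions, min(m, len(archive_positions))):
            for j in range(len(x)):
                x[j] += b * random.gauss(0.0, 1.0) * x[j]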
5 Experimental Results
The test problems include SCH, FON, POL, and KUR (cf. Fig. 2); the definitions of POL and KUR are:

Poloni's study (POL): Minimize F = (f1(x), f2(x)), where
f1(x) = 1 + (A1 − B1)² + (A2 − B2)², f2(x) = (x1 + 3)² + (x2 + 1)²,
A1 = 0.5 sin 1 − 2 cos 1 + sin 2 − 1.5 cos 2, A2 = 1.5 sin 1 − cos 1 + 2 sin 2 − 0.5 cos 2,
B1 = 0.5 sin x1 − 2 cos x1 + sin x2 − 1.5 cos x2, B2 = 1.5 sin x1 − cos x1 + 2 sin x2 − 0.5 cos x2,
xi ∈ [−π, π], i = 1, 2.

Kursawe's study (KUR): Minimize F = (f1(x), f2(x)), where
f1(x) = ∑_{i=1}^{2} [−10 exp(−0.2 √(xi² + x_{i+1}²))], f2(x) = ∑_{i=1}^{3} [|xi|^0.8 + 5 sin(xi³)],
xi ∈ [−5, 5], i = 1, 2, 3.
Knowledge of the Pareto front of a problem provides alternatives for selection from a list of efficient solutions. It thus helps in making decisions, and the knowledge gained can also be used in situations where the requirements are continually changing. In order to provide a quantitative assessment of the performance of an MO optimizer, two issues are taken into consideration: convergence to the Pareto-optimal set and maintenance of diversity in the solutions of the Pareto-optimal set. In this paper, the convergence metric γ [7] and the diversity metric δ [7] serve as quantitative measures. The convergence metric measures the extent of convergence of the obtained set of solutions: the smaller the value of γ, the better the convergence toward the POF. The diversity metric measures the spread of solutions lying on the POF: for the most widely and uniformly spread-out set of non-dominated solutions, the diversity metric δ is very small.
Results for the convergence metric and the diversity metric obtained using MO-CPSO are given in Tables 2 and 3, where the results of NSGA-II, MOPSO, and MOIPSO come from Ref. [7]. From the results, it is evident that MO-CPSO converges better than the other three algorithms. In order to clearly visualize the quality of the solutions obtained, the obtained Pareto fronts have been plotted together with the POF. As can be seen from Fig. 2, the front obtained by MO-CPSO has a high extent of coverage and uniform diversity for all test problems. In a word, the performance of MO-CPSO is better in both the convergence metric and the diversity metric. It must be noted that MOPSO adopts an adaptive mutation operator and an adaptive-grid division strategy to improve its search potential, while MOIPSO adopts search methods including an adaptive-grid mechanism, a self-adaptive mutation operator, and a novel decision-making strategy to enhance the balance between the exploration and exploitation capabilities. MO-CPSO adopts only the disturbance operation to solve MOO problems, and no other parameters are introduced.
Fig. 2. Pareto solutions of MOPSO and MO-CPSO: (a) SCH, (b) FON, (c) POL, (d) KUR.
6 Conclusion
References
1. Schaffer, J.D.: Multiple objective optimization with vector evaluated genetic algorithms.
PhD thesis, Vanderbilt University (1984)
2. Zitzler, E., Thiele, L.: Multiobjective evolutionary algorithms: A comparative case study
and the strength Pareto approach. IEEE Transactions on Evolutionary Computation 3(4), 257–271 (1999)
3. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic
algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6(2), 182–197
(2002)
4. Kennedy, J., Eberhart, R.C.: Particle Swarm Optimization. In: Proceeding of International
Conference on Neural Networks, pp. 1942–1948. IEEE Press, Perth (1995)
5. Coello, C.A.C., Pulido, G.T., Lechuga, M.S.: Handling multiple objectives with particle
swarm optimization. IEEE Transactions on Evolutionary Computation 8(3), 256–279
(2004)
6. Liu, D.S., Tan, K.C., Goh, C.K., Ho, W.K.: A multi-objective memetic algorithm based on
particle swarm optimization. IEEE Transaction on Systems, Man and Cybernetics, Part b:
Cybernetics 37(1), 42–61 (2007)
7. Agrawal, S., Dashora, Y., Tiwari, M.K., Son, Y.J.: Interactive particle swarm: a pareto-
adaptive metaheuristic to multiobjective optimization. IEEE Transaction on Systems, Man
and Cybernetics, Part a: Systems and Humans 38(2), 258–278 (2008)
8. Shen, Y.X., Wang, G.Y., Tao, C.M.: Particle swarm optimization with novel processing
strategy and its application. International Journal of Computational Intelligence
Systems 4(1), 100–111 (2011)
A PSO-Based Hybrid Multi-Objective Algorithm for
Multi-Objective Optimization Problems
Liaoning Key Laboratory of Manufacturing System and Logistics, The Logistics Institute,
Northeastern University, Shenyang, 110004, China
[email protected], [email protected]
1 Introduction
Begin:
Initialization:
1. Set the termination criterion, and initialize the values of parameters such as the size of
the population, the size of EXA, the size of PBA[i] (note that all particles have the same
size of PBA[i]), the mutation probability.
2. Set EXA and PBA[i] to be empty.
3. Randomly initialize all the particles in the swarm.
4. Evaluate each particle in the swarm, and store each particle i in PBA[i].
5. Store the non-dominated particles in the swarm in EXA.
while (the termination criterion is not reached) do
1. EXA-propagating-mechanism () % extend EXA when necessary %
2. Particle-flight-mechanism () % particle flight using crossover %
3. Particle-mutation ()
4. Evaluate each particle in the swarm.
5. for each particle i in the swarm
PBA[i]-update-strategy ()
End for
6. for each non-dominated particle i in the swarm
EXA-update-strategy () using the non-dominated particle i
End for
7. EXA-improvement () % local search on EXA %
End while
Report the obtained non-dominated solutions in EXA.
End
The canonical MOPSO algorithm updates particles using the flight equations, which involve three parameters, i.e., w, c1, and c2. When combined with other algorithms, a hybrid MOPSO has even more parameters, which causes great difficulty in parameter tuning. Therefore, in this paper we do not follow the canonical flight equations but instead adopt a new strategy based on the SBX operator of GA. Since there have been many research reports on the parameters of the SBX operator, it is reasonable to follow the suggested settings, and thus there are no parameters to be tuned in the adopted update mechanism. For each particle i, this strategy has two simple steps: (1) randomly select a pbest from PBA[i], use the crossover operator to generate two offspring solutions from particle i and its selected pbest, and then randomly select a gbest from the EXA; and (2) use the SBX crossover operator to generate two offspring solutions from the selected pbest and gbest, and then select the best one based on Pareto dominance as the new particle.
To improve the search diversity, our HMOPSO algorithm also uses a mutation operation. For each dimension of each particle in the swarm, we first generate a random number rnd in [0, 1]; if rnd < pm (the mutation probability), the polynomial mutation operator of [2] is used to mutate this dimension.
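For reference, a textbook form of the SBX operator on real-coded vectors is sketched below; the distribution index eta and the variable bounds are illustrative, and this is not claimed to be the authors' exact implementation.

    import random

    def sbx(p1, p2, eta=15.0, lo=0.0, hi=1.0):
        """Simulated binary crossover (SBX) on two real vectors p1, p2."""
        c1, c2 = [], []
        for x1, x2 in zip(p1, p2):
            u = random.random()
            # Spread factor beta from the SBX polynomial distribution.
            beta = (2 * u) ** (1 / (eta + 1)) if u <= 0.5 \
                else (1 / (2 * (1 - u))) ** (1 / (eta + 1))
            y1 = 0.5 * ((1 + beta) * x1 + (1 - beta) * x2)
            y2 = 0.5 * ((1 - beta) * x1 + (1 + beta) * x2)
            c1.append(min(max(y1, lo), hi))   # clip offspring to the bounds
            c2.append(min(max(y2, lo), hi))
        return c1, c2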
The HMOPSO algorithm maintains a personal best archive PBA[i] for each particle i so as to keep a good memory of the search history of particle i and, at the same time, improve the search diversity. For simplicity, the maximum size of each PBA[i] is set to the same value nPBA. A particle i with a new position is stored directly in PBA[i] if the current size of PBA[i] is smaller than nPBA; otherwise, it is first stored in PBA[i] and then one member of PBA[i] is removed at random.
A major requirement on an MO algorithm is that the solutions in the obtained EXA should be uniformly distributed along the Pareto front in the objective space. Therefore, the crowding-distance mechanism of NSGA-II [2] is adopted to maintain the diversity of the EXA.
Since the EXA has been initialized with the non-dominated particles of the swarm at the first iteration, for a given non-dominated particle (e.g., particle i) in the current swarm at iteration t, the update procedure of the EXA can be described as follows.
Step 1. If particle i is dominated by one solution in the EXA, then discard particle i.
Step 2. If particle i is not dominated by any solution in the EXA, store it in the EXA
and then remove all solutions that are dominated by it from the EXA.
Step 3. If |EXA| > nEXA (the maximum size of the EXA), calculate the crowding
distance of all solutions in the EXA, and then remove the most crowded
solution (i.e., the solution with the least crowding distance) from the EXA.
Repeat this step until |EXA| = nEXA.
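Step 3 relies on the NSGA-II crowding distance; a self-contained sketch of its computation on a list of fitness vectors follows (our illustration). Step 3 then repeatedly removes the solution with the smallest value until the archive fits.

    def crowding_distance(front):
        """NSGA-II crowding distance for a list of fitness vectors."""
        n, m = len(front), len(front[0])
        dist = [0.0] * n
        for k in range(m):
            idx = sorted(range(n), key=lambda i: front[i][k])
            dist[idx[0]] = dist[idx[-1]] = float('inf')   # keep boundary points
            span = front[idx[-1]][k] - front[idx[0]][k] or 1.0
            for j in range(1, n - 1):
                dist[idx[j]] += (front[idx[j + 1]][k] - front[idx[j - 1]][k]) / span
        return dist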
Since the selection of gbest from the EXA has a significant influence on the performance of MOPSO, improving the EXA can improve MOPSO, because this helps to provide better candidate solutions to be selected as gbest. Motivated by this idea, we develop a local search heuristic named EXA-improvement to further improve the quality of the EXA, i.e., the distance of the EXA to the true Pareto front and the diversity of solutions in the EXA. This local heuristic can be viewed as a simplified version of scatter search (SS) because it adopts the concept of the reference set (denoted REF) of SS in [14].
Here f_i^max and f_i^min are the maximum and minimum values of the i-th objective function in the EXA, respectively. Let nREF denote the maximum size of the REF; then the EXA-improvement method can be described as follows.
Step 1. Construct REF
Step 1.1 If |EXA| = nREF, store all solutions in EXA in the REF.
Step 1.2 If |EXA| < nREF, store all solutions in EXA to REF, and then use the
non-dominated sorting method to classify the particles in the swarm
into different levels. Starting from front 1, randomly select a
particle and store it in REF, until |REF| = nREF.
Step 1.3 If |EXA| > nREF, then perform the following procedures.
Step 1.3.1 Calculate the crowding distance of each solution in the EXA and
then store them in the non-ascending order of their crowding
distances in a list L.
Step 1.3.2 Select the first nREF / 2 solutions in L and add them to REF, and
then delete them from L.
Step 1.3.3 Select the solution p with the maximum value of the minimum
distance to REF from L, add it to REF and then delete it from L.
Repeat this step until another nREF/2 solutions are added to REF.
Step 2. Generate new solutions from REF. Select two solutions from REF, use
the SBX operator to generate two offspring solutions, and then select the
best one as the new solution. Note that there are a total of nREF(nREF-1)/2
new solutions generated.
Step 3. Update the EXA. Use the obtained nREF(nREF-1)/2 new solutions to update
the EXA based on the EXA-update-strategy.
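Step 1.3.3 is a max-min diversity rule; a sketch is given below, where dist is a user-supplied objective-space distance (e.g., normalized by f_i^max and f_i^min) — an assumption, since the exact normalization sentence is cut off in this excerpt.

    def fill_ref_diverse(ref, pool, n_add, dist):
        """Repeatedly move from pool the solution whose minimum distance to the
        current REF is maximal (Step 1.3.3 of the EXA-improvement)."""
        for _ in range(n_add):
            p = max(pool, key=lambda s: min(dist(s, q) for q in ref))
            ref.append(p)
            pool.remove(p)
        return ref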
3 Computational Experiments
We adopt the Generational Distance (GD), the Spacing (SP), and the Maximum Spread (MS) to evaluate the algorithms' performance. Based on the experimental results, the following parameter setting is adopted: npop = 100, nEXA = 100, nREF = 10, nPBA = 5, and nprop = 0.3.
The HMOPSO is compared with other powerful or state-of-the-art algorithms such
as the NSGA-II [2], the MOPSO [7] (denoted as cMOPSO), and the MOPSO with
crowding distance [15] (denoted as MOPSO-CD). These three algorithms are selected
because they are proven to be very effective and often used by many researchers. In
this experiment, the maximum runtime is used as the stopping criterion because all
the algorithms are written in C++ and run on the same computer. In addition, for each problem 30 independent replications were carried out, and the best run is reported.
The computational results of each test problem are given in Figures 2-5. Based on
these results, it is clear that the proposed HMOPSO outperforms the other algorithms.
In addition, the proposed HMOPSO can reach the true Pareto fronts for all test MOPs,
and it shows a very robust performance. Among the rival algorithms, NSGA-II can
also reach the true Pareto fronts of all test MOPs, but its performance on the
Fig. 2. Pareto fronts produced by different algorithms for ZDT1 and ZDT2
Fig. 3. Pareto fronts produced by different algorithms for ZDT3 and ZDT4
Fig. 4. Pareto fronts produced by different algorithms for ZDT6 and KUR
Fig. 5. Pareto fronts produced by different algorithms for Deb2 and KITA
4 Conclusion
In this paper, we investigated improvements to the canonical MOPSO algorithm and proposed three main strategies. First, the traditional update equations for particles' positions are replaced by a new particle flight mechanism based on the crossover operator of GA. Second, motivated by the observation that there are few non-dominated solutions for some problems at the start of MOPSO, we proposed a propagating mechanism to improve the quality and diversity of the external archive. Third, a modified version of scatter search was adopted as the local search to improve the external archive. In addition, we adopted the DOE method to analyze the influence of each parameter, and of their interactions, on the performance of our HMOPSO algorithm. In the comparative study, HMOPSO was compared against existing state-of-the-art multi-objective algorithms on benchmark test problems. The results indicate that our HMOPSO algorithm is competitive with or superior to NSGA-II, and much better than the two MOPSO algorithms from the literature, on all benchmark problems.
Acknowledgements
This research is supported by Key Program of National Natural Science Foundation
of China (71032004), National Natural Science Foundation of China (70902065),
National Science Foundation for Post-doctoral Scientists of China (20100481197),
and the Fundamental Research Funds for the Central Universities (N090404018).
References
1. Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast and elitist multi-objective genetic
algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6(2), 182–197
(2002)
2. Knowles, J.D., Corne, D.W.: Approximating the nondominated front using the Pareto
archived evolution strategy. Evolutionary Computation 8, 149–172 (2000)
A PSO-Based Hybrid Multi-Objective Algorithm 33
3. Zitzler, E., Laumanns, M., Thiele, L.: SPEA2: Improving the strength Pareto evolutionary
algorithm. Computer Engineering Networks Lab (TIK), Swiss Federal Institute of
Technology (ETH), Zurich, Switzerland, Technical Report, 103 (2001)
4. Nebro, A.J., Luna, F., Alba, E., Dorronsoro, B., Durillo, J.J., Beham, A.: AbYSS -
Adapting scatter search to multiobjective optimization. IEEE Transactions on Evolutionary
Computation 12(4), 439–457 (2008)
5. Hu, X., Eberhart, R.C.: Multiobjective optimization dynamic neighborhood particle swarm
optimization. In: Proceedings of Congress on Evolutionary Computation, pp. 1677–1681
(2002)
6. Mostaghim, S., Teich, J.: Strategies for finding local guides in multi-objective particle
swarm optimization (MOPSO). In: Proceedings of IEEE Swarm Intelligence Symposium,
pp. 26–33 (2003)
7. Coello, C.A.C., Pulido, G.T., Lechuga, M.S.: Handling multiple objectives with particle
swarm optimization. IEEE Transactions on Evolutionary Computation 8(3), 256–279
(2004)
8. Chow, C.K., Tsui, H.T.: Autonomous agent response learning by a multi-species particle
swarm optimization. In: Proceedings of Congress on Evolutionary Computation, pp. 778–
785 (2004)
9. Yen, G.G., Leong, W.F.: Dynamic multiple swarms in multiobjective particle swarm
optimization. IEEE Transactions on Systems, Man, and Cybernetics – Part A 39(4), 890–
911 (2009)
10. Goh, C.K., Tan, K.C., Liu, D.S., Chiam, S.C.: A competitive and cooperative co-
evolutionary approach to multi-objective particle swarm optimization algorithm design.
European Journal of Operational Research 202(1), 42–54 (2010)
11. Li, X.D.: A non-dominated sorting particle swarm optimizer for multiobjective
optimization. In: Cantú-Paz, E., Foster, J.A., Deb, K., Davis, L., Roy, R., O’Reilly, U.-M.,
Beyer, H.-G., Kendall, G., Wilson, S.W., Harman, M., Wegener, J., Dasgupta, D., Potter,
M.A., Schultz, A., Dowsland, K.A., Jonoska, N., Miller, J., Standish, R.K. (eds.) GECCO
2003. LNCS, vol. 2723, pp. 37–48. Springer, Heidelberg (2003)
12. Srinivasan, D., Seow, T.H.: Particle swarm inspired evolutionary algorithm (PS-EA) for
multiobjective optimization problem. In: Proceedings of Congress on Evolutionary
Computation, pp. 2292–2297 (2003)
13. Tripathi, P.K., Bandyopadhyay, S., Pal, S.K.: Multi-Objective Particle Swarm
Optimization with time variant inertia and acceleration coefficients. Information
Science 177(22), 5033–5049 (2007)
14. Martí, R., Laguna, M., Glover, F.: Principles of scatter search. European Journal of
Operational Research 169(2), 359–372 (2006)
15. Raquel, C.R., Naval Jr., P.C.: An effective use of crowding distance in multiobjective
particle swarm optimization. In: Proceedings of Conference on Genetic Evolutionary
Computation, pp. 257–264 (2005)
The Properties of Birandom Multiobjective
Programming Problems
1 Introduction
Multiobjective programming problems have been studied by many researchers,
e.g., [2], [7], [8]. For a given multiobjective problem, an absolute optimal
solution that optimizes all objective functions simultaneously usually does not
exist, so we consider its non-inferior solutions, commonly taken to be the
Pareto optimal solutions.
There are various types of uncertainty in real-world problems. As is well
known, random phenomena form one class of uncertain phenomena that has been
well studied. Based on probability theory, stochastic multiobjective programming
problems have been presented, e.g., [1], [10].
In a practical decision-making process, we often face a hybrid uncertain
environment where uncertainties of a twofold nature coexist. For examples of
twofold uncertainty, we may refer to [9], Liu [3], [4], [5], Liu and Liu [6], and
Yazenin [11]. To deal with this twofold uncertainty, it is necessary to employ
birandom theory [9]. Multiobjective programming in a birandom environment has
not been well developed; therefore, following the idea of stochastic multiobjective
programming, this paper is devoted to birandom multiobjective programming
(BRMOP) problems based on birandom theory. For the birandom parameters,
we consider their expectations, which converts the BRMOP problem into an
expected-value birandom multiobjective programming (EVBRMOP) model, whose
properties are then studied.
2 Preliminaries
Let ξ be a random variable defined on the probability space (Ω, Σ, Pr), where
Ω is a universe, Σ is a σ-algebra of subsets of Ω, and Pr is a probability
measure defined on (Ω, Σ).
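For intuition, a birandom variable (in the sense of Peng and Liu [9]) is a mapping from a probability space to a collection of random variables, so each outcome ω is itself a random variable ξ(ω), and the expected value is obtained by iterating the expectation operator. The following small numerical example is our own illustration, not taken from the paper:

```latex
% A two-point birandom variable and its expected value
% (illustrative example; the distributions are arbitrary choices).
\[
\xi(\omega) =
\begin{cases}
\eta_1 \sim \mathcal{N}(1,\,1), & \Pr\{\omega = \omega_1\} = 0.5,\\
\eta_2 \sim \mathcal{N}(3,\,1), & \Pr\{\omega = \omega_2\} = 0.5,
\end{cases}
\qquad
E[\xi] = E_{\omega}\bigl[\,E[\xi(\omega)]\,\bigr] = 0.5 \cdot 1 + 0.5 \cdot 3 = 2 .
\]
```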
where D = {x ∈ R^n | E[g(x, ξ)] = (E[g_1(x, ξ)], E[g_2(x, ξ)], ..., E[g_m(x, ξ)])^T ≤ 0,
E[h(x, ξ)] = (E[h_1(x, ξ)], E[h_2(x, ξ)], ..., E[h_l(x, ξ)])^T = 0}.
Theorem 3.1.1. Let ξ be a birandom variable, and let f(x, t) and g(x, t) be convex
vector functions in x for any given t. Then the EVBRMOP problem is a convex
programming problem.
Proof. To prove the theorem, it suffices to show that E[f(x, ξ)] is a convex
vector function and that the feasible region D is a convex set. By the assumed
conditions, for any x_1, x_2 and any λ ∈ [0, 1] we can obtain
E_ω[f(λx_1 + (1−λ)x_2, ξ(ω))] ≤ λ E_ω[f(x_1, ξ(ω))] + (1−λ) E_ω[f(x_2, ξ(ω))].
By a similar method, since the random variable E_ω[f(x, ξ(ω))] inherits the
linearity and monotonicity of the expectation operator, we have
E_ω[ E_ω[f(λx_1 + (1−λ)x_2, ξ(ω))] ] ≤ λ E_ω[ E_ω[f(x_1, ξ(ω))] ] + (1−λ) E_ω[ E_ω[f(x_2, ξ(ω))] ],
namely,
E[f(λx_1 + (1−λ)x_2, ξ)] ≤ λ E[f(x_1, ξ)] + (1−λ) E[f(x_2, ξ)],
which shows that E[f(x, ξ)] is a convex vector function.
Next, for any x_1, x_2 ∈ D and λ ∈ [0, 1], the convexity of g gives
E_ω[g(λx_1 + (1−λ)x_2, ξ(ω))] ≤ λ E_ω[g(x_1, ξ(ω))] + (1−λ) E_ω[g(x_2, ξ(ω))] ≤ 0.   (6)
It follows from the linearity properties that
E_ω[ E_ω[g(λx_1 + (1−λ)x_2, ξ(ω))] ] ≤ λ E_ω[ E_ω[g(x_1, ξ(ω))] ] + (1−λ) E_ω[ E_ω[g(x_2, ξ(ω))] ] ≤ 0,
namely,
E[g(λx_1 + (1−λ)x_2, ξ)] ≤ 0.
On the other hand, because h(x, t) is a linear vector function, we can obtain
E_ω[ E_ω[h(λx_1 + (1−λ)x_2, ξ(ω))] ] = λ E_ω[ E_ω[h(x_1, ξ(ω))] ] + (1−λ) E_ω[ E_ω[h(x_2, ξ(ω))] ],
namely,
E[h(λx_1 + (1−λ)x_2, ξ)] = λ E[h(x_1, ξ)] + (1−λ) E[h(x_2, ξ)] = 0.
Hence λx_1 + (1−λ)x_2 ∈ D, so D is a convex set, and the proof is complete.
Definition 3.2.1. For the EVBRMOP problem, if x* ∈ D, we say that x* is an
expected-value absolute optimal solution to the BRMOP problem, the set of which is
denoted D_ab, if for every x ∈ D,
E[f_j(x*, ξ)] ≤ E[f_j(x, ξ)], for all j = 1, 2, ..., p.
Definition 3.2.2. For the EVBRMOP problem, if x* ∈ D, we say that x* is an
expected-value efficient solution to the BRMOP problem, the set of which is
denoted D_pa, if there does not exist x ∈ D such that
E[f(x, ξ)] ≤ E[f(x*, ξ)],
namely, such that
E[f_j(x, ξ)] ≤ E[f_j(x*, ξ)] for all j = 1, 2, ..., p, and there exists at least
one j_0 such that E[f_{j_0}(x, ξ)] < E[f_{j_0}(x*, ξ)]. If no x ∈ D satisfies
E[f_j(x, ξ)] < E[f_j(x*, ξ)] for all j = 1, 2, ..., p, then x* is called an
expected-value weakly efficient solution, the set of which is denoted D_wpa.
Theorem 3.2.2. For the EVBRMOP problem: (1) if D_ab ≠ ∅, then D_ab = D_pa;
(2) if f(x, ξ) is a strictly convex vector function on D, then D_pa = D_wpa.
Proof. (1) It follows from Theorem 3.2.1 that we need only prove D_ab ⊃ D_pa.
Suppose x* ∈ D_pa but x* ∉ D_ab. Since D_ab ≠ ∅, there must exist x ∈ D_ab, and
by the definition of an expected-value absolute optimal solution we can obtain
E[f_j(x, ξ)] ≤ E[f_j(x*, ξ)] for all j. Since x* ∉ D_ab, we have
E[f(x, ξ)] ≠ E[f(x*, ξ)]. It follows from the inequalities above that
E[f(x, ξ)] ≤ E[f(x*, ξ)] in the sense of Definition 3.2.2, which contradicts
x* ∈ D_pa. Hence D_ab ⊃ D_pa, which implies the required conclusion.
(2) It follows from Theorem 3.2.1 that we need only prove D_wpa ⊂ D_pa. Suppose
x* ∈ D_wpa but x* ∉ D_pa. Then there must exist x ∈ D with x ≠ x* such that
E[f(x, ξ)] ≤ E[f(x*, ξ)]. By the assumed conditions and Theorem 3.1.1, D is a
convex set; hence αx + (1−α)x* ∈ D for any given α ∈ (0, 1). Since f(x, ξ) is a
strictly convex vector function on D, by the inequality just given it is easy
to see that
E[f(αx + (1−α)x*, ξ)] < α E[f(x, ξ)] + (1−α) E[f(x*, ξ)] ≤ E[f(x*, ξ)],
which contradicts x* ∈ D_wpa. Thus D_wpa ⊂ D_pa, which proves the required
theorem.
4 Conclusions
Based on birandom theory, the BRMOP problem and its expected-value model
have been introduced in this paper. Since non-inferior solutions play an
important role in multiobjective problems, the expected-value efficient
solutions and expected-value weakly efficient solutions of the BRMOP problem
are presented, and their relations are also studied. The results in this paper
can serve as a theoretical tool for designing algorithms to solve BRMOP problems.
Acknowledgments
The authors Mingfa Zheng and Yayi Xu were supported by National Natural
Science Foundation of China under Grant 70571021, and the Shanxi Province
Science Foundation under Grant SJ08A02.
References
1. Benabdelaziz, F., Lang, P., Nadeau, R.: Pointwise efficiency in multiobjective
stochastic linear programming. Journal of Operational Research Society 45, 11–18
(2000)
2. Hu, Y.D.: The efficient theory of multiobjective programming. Shanghai Science and
Technology Press, China (1994)
3. Liu, B.: Fuzzy random dependent-chance programming. IEEE Trans. Fuzzy Syst. 9,
721–726 (2001)
4. Liu, B.: Uncertain programming. Wiley, New York (1999)
5. Liu, B.: Random fuzzy dependent-chance programming and its hybrid intelligent
algorithm. Information Sciences 141, 259–271 (2002)
6. Liu, Y.K., Liu, B.: Expected value operator of random fuzzy variable.
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 11,
195–215 (2003)
7. Lin, C.Y., Dong, J.L.: The efficient theory and method of multiobjective program-
ming. Jilin Educational Press, China (2006)
8. Ma, B.J.: The efficient rate of efficient solution to linear multiobjective program-
ming. Journal of Systems Engineering and Electronic Technology 2, 98–106 (2000)
9. Peng, J., Liu, B.: Birandom variables and birandom programming. Technical Report (2003)
10. Stancu-Minasian, I.M.: Stochastic programming with multiple objective functions.
Bucharest (1984)
11. Yager, R.R.: A foundation for a theory of possibility. Journal of Cybernetics 10,
177–204 (1980)
A Modified Multi-objective Binary Particle Swarm
Optimization Algorithm
Ling Wang, Wei Ye, Xiping Fu, and Muhammad Ilyas Menhas
Abstract. In recent years a number of works have been done to extend Particle
Swarm Optimization (PSO) to solve multi-objective optimization problems, but
few of them can be used to tackle binary-coded problems. In this paper, a
novel modified multi-objective binary PSO (MMBPSO) algorithm is proposed
to achieve better multi-objective optimization performance. A modified updating
strategy is developed which is simpler and easier to implement than that of the
standard discrete binary PSO. A mutation operator and a dissipation operator
are introduced to improve the search ability and maintain the diversity of the
algorithm. The experimental results on a set of multi-objective benchmark
functions demonstrate that the proposed MMBPSO is a competitive multi-objective
optimizer and outperforms the standard binary PSO algorithm in terms of
convergence and diversity.
1 Introduction
Multi-objective optimization problems (MOPs), which have more than one objective
function, are ubiquitous in science and engineering fields such as astronomy,
electronic engineering, automation, and artificial intelligence. In MOPs, a unique
optimal solution is hard to find due to the contradictory objectives; instead, the
'trade-off' solutions, in other words the non-dominated solutions, are preferred.
Several approaches have been proposed to deal with multi-objective optimization
problems, such as reducing the problem dimension by combining all objectives into
a single objective [1] or optimizing one objective while the rest are treated as
constraints [2]. However, these methods rely on a priori knowledge of the
appropriate weights or constraint values. Furthermore, they are only capable of
finding a single point on the tradeoff curve per run. As a result, Pareto-based
multi-objective methods, which optimize all objectives simultaneously and
eliminate the need for determining appropriate weights or formulating constraints,
have become a current research hotspot. Pareto-based multi-objective methods
operate on the concept of 'Pareto domination' [3], and the solutions on the curve
of the Pareto front represent the best possible compromises among the objectives
[4]. So, one of the crucial goals in multi-objective optimization is to find a set
of optimal solutions that distribute well along the Pareto front.
Particle Swarm Optimization (PSO) was first developed by Kennedy and Eberhart
in 1995. It originated from imitating the behavior of a swarm of birds trying to
search for food in an unknown area [5]. Owing to its simple arithmetic structure,
high convergence speed and excellent global optimization ability, PSO has been
researched and improved to solve various multi-objective optimization problems.
However, standard PSO and most of its improved versions work in continuous space,
which means they cannot tackle binary-coded problems directly. To make up for
this, Kennedy extended PSO and proposed a discrete binary PSO (DBPSO) [6]. Based
on DBPSO, researchers have introduced binary PSO to solve multi-objective
problems. Abdul Latiff et al. [8] proposed a multi-objective DBPSO, called BMPSO,
to select cluster heads for lengthening the network lifetime and preventing
network connectivity degradation. Peng and Xu [7] proposed a modified
multi-objective binary PSO combining DBPSO with an immune system to optimize the
placement of phasor measurement units. These works prove that DBPSO-based
multi-objective optimizers are efficient in solving MOPs. Nevertheless, previous
works on single-objective optimization problems show that the optimization
ability of DBPSO is not ideal [9], [10]. So we propose a novel modified
multi-objective binary PSO (MMBPSO) in this paper to achieve better
multi-objective search ability and simplify the implementation of the algorithm.
The rest of the paper is organized as follows. In Section 2, a brief introduction
to DBPSO and a modified binary PSO algorithm is given first, and then the proposed
MMBPSO algorithm is described in detail. Section 3 validates MMBPSO on several
benchmark problems, where the optimization performance and comparisons are also
illustrated. Finally, some concluding remarks are given in Section 4.
x_id(t+1) = x_id(t),   if 0 ≤ v_id(t+1) < α   (1)
x_id(t+1) = p_id(t),   if α ≤ v_id(t+1) ≤ (1+α)/2   (2)
x_id(t+1) = p_gd(t),   if (1+α)/2 < v_id(t+1) ≤ 1   (3)
The parameter α, called the static probability, should be set properly. A small
value of α can improve the convergence speed of the algorithm but makes MBPSO
easily trapped in local optima, while an MBPSO with a big α may be ineffective as
it cannot make good use of the knowledge gained before [9].
Although the update formulas of MBPSO and DBPSO are different, the updating
strategy is still the same. In MBPSO, each particle still flies through the search
space according to its own past optimal experience and the global optimal
information of the group. Eq. (1) is an exhibition of inertia, which represents
the information that a particle inherits from its previous generation. Eq. (2)
represents the particle's cognitive capability, which draws the particle to its
own best position. Eq. (3) is the particle's social capacity, which leads the
particle to move to the best position found by the swarm [12].
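To make the rule of Eqs. (1)-(3) concrete, the following Python sketch updates a single bit; it is our own illustration, and the reading of the 'velocity' v as a uniform random draw in [0, 1] is an assumption suggested by the parameter settings reported later (MMBPSO lists no velocity-related parameters), not something stated explicitly here.

```python
import random

def mbpso_update_bit(x, pbest, gbest, alpha):
    """Probability-based bit update sketched from Eqs. (1)-(3):
    keep the old bit with probability alpha (inertia), otherwise split
    the remaining probability equally between the personal best
    (cognitive part) and the global best (social part)."""
    v = random.random()                 # plays the role of v in [0, 1]
    if v < alpha:                       # Eq. (1): inherit previous bit
        return x
    elif v <= (1.0 + alpha) / 2.0:      # Eq. (2): follow personal best
        return pbest
    else:                               # Eq. (3): follow global best
        return gbest
```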
Although MBPSO has been successfully adopted to solve various problems such as
numerical optimization, feature selection and the multidimensional knapsack
problem, it is obvious that standard MBPSO cannot tackle Pareto-based
multi-objective optimization problems. So we extend MBPSO and propose a novel
modified multi-objective binary PSO.
x_id(t+1) = x_id(t),   if 0 ≤ v_id(t+1) < α   (4)
x_id(t+1) = p_id(t),   if α ≤ v_id(t+1) ≤ β   (5)
x_id(t+1) = p_gd(t),   if β < v_id(t+1) ≤ 1   (6)
Here the parameter β adjusts the probability of tracking the two different best
solutions.
According to Eqs. (4)-(6), MMBPSO can easily get stuck in local optima. For
instance, if x_id, p_id and p_gd are all equal to '1', x_id will remain '1'
forever, and vice versa. So the dissipation operator and the mutation operator
are introduced to keep the diversity and enhance the local search ability.
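A minimal sketch of the resulting per-particle update is given below, assuming (since the operator details are not spelled out at this point) that dissipation re-randomizes a bit with probability p_d and mutation flips a bit with probability p_m; the parameter values follow the experimental settings reported in Section 3.

```python
import random

def mmbpso_update(x, pbest, gbest, alpha=0.55, beta=0.775, pd=0.1, pm=0.001):
    """One MMBPSO particle update (Eqs. (4)-(6)) followed by the two
    diversity-keeping operators (their exact forms are assumptions)."""
    new_x = []
    for xi, pi, gi in zip(x, pbest, gbest):
        v = random.random()
        if v < alpha:              # Eq. (4): inertia
            bit = xi
        elif v <= beta:            # Eq. (5): cognitive part (pbest)
            bit = pi
        else:                      # Eq. (6): social part (gbest)
            bit = gi
        if random.random() < pd:   # dissipation: re-randomize frozen bits
            bit = random.randint(0, 1)
        if random.random() < pm:   # mutation: flip the bit
            bit = 1 - bit
        new_x.append(bit)
    return new_x
```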
During each iteration, the non-dominated solution set is sorted according to the
niche count. The global best p_gd for each generation is randomly chosen from the
top 10% 'less crowded' non-dominated particles in the set. To encourage MMBPSO to
search the whole space and find more non-dominated solutions, the personal best
p_id is replaced by the current particle whenever the latter is non-dominated.
3 Experiments
3.1 Benchmark Functions and Performance Metrics
To test the performance of the proposed MMBPSO, five well-known benchmark
functions, i.e., ZDT1, ZDT2, ZDT3, ZDT4 and ZDT6 [15], are adopted in this paper.
All problems have two objective functions and no constraints. A multi-objective
optimizer is designed to achieve two goals: (1) convergence to the Pareto-optimal
set and (2) maintenance of diversity in the solutions of the Pareto-optimal set.
These two tasks cannot be measured adequately with one performance metric, so the
convergence metric Υ proposed in [15] and the diversity metric S proposed in [14]
are adopted to evaluate the performance of MMBPSO.
Algorithm      Parameters
MMBPSO         α = 0.55, β = 0.775, p_d = 0.1, p_m = 0.001
MDBPSO         c_1 = 2.0, c_2 = 2.0, ω = 0.8, v_max ∈ [−5, 5]
BMPSO [10]     c_1 = 2.0, c_2 = 2.0, ω = 0.8, v_max ∈ [−5, 5]
Fig. 2. Box plots of the convergence metric obtained by MMBPSO, MDBPSO and BMPSO
Fig. 3. Box plots of the distance metric obtained by MMBPSO, MDBPSO and BMPSO
Fig. 4. The Pareto fronts found by MMBPSO, MDBPSO and BMPSO on the ZDT series functions
4 Conclusion
In this paper, a novel modified multi-objective binary particle swarm optimization
algorithm is proposed. Compared with DBPSO, the proposed MMBPSO adopts an improved
updating strategy which is simpler and easier to implement. The mutation operator
and dissipation operator are introduced to improve its search ability and keep the
diversity of the algorithm. The modified global best and personal best updating
strategies help MMBPSO converge to the Pareto front better. Five well-known
benchmark functions were adopted to test the proposed algorithm. The experimental
results prove that the proposed MMBPSO can find better solutions than MDBPSO and
BMPSO. In particular, the superiority of MMBPSO over MDBPSO demonstrates the
advantages of the developed updating strategy in terms of convergence and diversity.
Acknowledgments. This work is supported by Research Fund for the Doctoral Pro-
gram of Higher Education of China (20103108120008), the Projects of Shanghai
Science and Technology Community (10ZR1411800 & 08160512100), Mechatronics
Engineering Innovation Group project from Shanghai Education Commission, Shang-
hai University “11th Five-Year Plan” 211 Construction Project and the Graduate
Innovation Fund of Shanghai University (SHUCX102218).
References
1. Xiang, Y., Sykes, J.F., Thomson, N.R.: Alternative formulations for optimal groundwater
remediation design. J. Water Resource Plan Manage 121(2), 171–181 (1995)
2. Das, D., Datta, B.: Development of multi-objective management models for coastal aqui-
fers. J. Water Resource Plan Manage 125(2), 76–87 (1999)
3. Erickson, M., Mayer, A., Horn, J.: Multi-objective optimal design of groundwater remed-
iation systems: application of the niched Pareto genetic algorithm (NPGA). Advances in
Water Resources 25(1), 51–65 (2002)
4. Sharaf, A.M., El-Gammal, A.: A novel discrete multi-objective Particle Swarm Optimiza-
tion (MOPSO) of optimal shunt power filter. In: Power Systems Conference and Exposi-
tion, pp. 1–7 (2009)
5. Clerc, M., Kennedy, J.: The particle swarm — explosion, stability, and convergence in a
multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002)
6. Kennedy, J., Eberhart, R.C.: A discrete binary version of the particle swarm algorithm,
Systems, Man, and Cybernetics. In: IEEE International Conference on Computational Cy-
bernetics and Simulation, vol. 5, pp. 4104–4108 (1997)
7. Peng, C., Xu, X.: A hybrid algorithm based on immune BPSO and N-1 principle for PMU
multi-objective optimization placement. In: Third International Conference on Electric
Utility Deregulation and Restructuring and Power Technologies, pp. 610–614 (2008)
8. Abdul Latiff, N.M., Tsimenidis, C.C., Sharif, B.S., Ladha, C.: Dynamic clustering using
binary multi-objective Particle Swarm Optimization for wireless sensor networks. In: IEEE
19th International Symposium on Personal, Indoor and Mobile Radio Communications,
pp. 1–5 (2008)
9. Wang, L., Wang, X.T., Fei, M.R.: An adaptive mutation-dissipation binary particle swarm
optimisation for multidimensional knapsack problem. International Journal of Modelling,
Identification and Control 8(4), 259–269 (2009)
10. Wang, L., Wang, X.T., Fu, J.Q., Zhen, L.L.: A Novel Probability Binary Particle Swarm
Optimization Algorithm and Its Application. Journal of Software 9(3), 28–35 (2008)
11. Qi, S., Jian, H.J., Chen, X.J., Guo, L.S., Ru, Q.Y.: Modified particle swarm optimization
algorithm for variable selection in MLR and PLS modeling: QSAR studies of antagonism
of angiotensin II antagonists. European Journal of Pharmaceutical Sciences 22(2-3), 145–
152 (2004)
12. Jahanbani Ardakani, A., Fattahi Ardakani, F., Hosseinian, S.H.: A novel approach for op-
timal chiller loading using particle swarm optimization. Energy and Buildings 40, 2177–
2187 (2008)
13. Li, X.: A non-dominated sorting particle swarm optimizer for multiobjective optimization.
In: The Genetic and Evolutionary Computation Conference, pp. 37–48 (2003)
14. Gong, M., Liu, C., Cheng, G.: Hybrid immune algorithm with Lamarckian local search for
multi-objective optimization. Memetic Computing 2(1), 47–67 (2010)
15. Deb, K., Jain, S.: Running performance metrics for evolutionary multi-objective optimiza-
tion. Technical Report, no. 2002004 (2002)
Improved Multiobjective Particle Swarm Optimization
for Environmental/Economic Dispatch Problem in
Power System*
1 Introduction
With the increasing concern of environmental pollution, operating at absolute mini-
mum cost can no longer be the only criterion for economic dispatch of electric power
generation. Environmental/economic dispatch (EED) is becoming more and more
desirable for not only resulting in great economical benefit, but also reducing the
pollutants emission [1]. However, minimizing the total fuel cost and total emission
are conflicting in nature and they cannot be minimized simultaneously. Hence, the
EED problem is a large-scale highly constrained nonlinear multi-objective optimiza-
tion problem.
Over the past decade, meta-heuristic optimization methods have been used
extensively for EED, primarily due to their population-based search capability [2].
Many multi-objective evolutionary algorithms such as niched Pareto genetic algo-
rithm (NPGA) [3], non-dominated sorting genetic algorithm (NSGA) [4], strength
Pareto evolutionary algorithm (SPEA) [5] and NSGA-II [6, 7] have been introduced
to solve the EED problem with impressive success.
*
Manuscript received January 2, 2011. This work was supported by Natural Science Founda-
tion of Shaanxi Province (Grant No.2010JQ8006) and Science Research Programs of Educa-
tion Department of Shaanxi Province (Grant No.2010JK711).
2 Problem Statement
The typical EED problem can be formulated as a bi-criteria optimization model. The
two conflicting objectives, i.e., fuel cost and pollutants emission, should be mini-
mized simultaneously while fulfilling certain system constraints. This problem is
formulated as follows.
Objective 1: Minimization of fuel cost. The total fuel cost F(P_G) can be
represented as follows:
F(P_G) = Σ_{i=1}^{NG} (a_i + b_i P_Gi + c_i P_Gi^2)   (1)
where a_i, b_i and c_i are the fuel cost coefficients of the i-th generator.
Objective 2: Minimization of pollutants emission. The total emission E(P_G) of
atmospheric pollutants is expressed as a function of the generator outputs and is
minimized simultaneously.
Constraint 1: Generation capacity constraint. The power output of each generator
should lie within its limits:
P_Gi^min ≤ P_Gi ≤ P_Gi^max   (3)
where P_Gi^min and P_Gi^max are the minimum and maximum power generated by generator i,
respectively.
Constraint 2: Power balance constraint. The total power generation must cover the
total demand P_D and the real power loss in transmission lines P_LOSS:
P_D + P_LOSS − Σ_{i=1}^{NG} P_Gi = 0   (4)
P_LOSS = Σ_{i=1}^{NG} Σ_{j=1}^{NG} P_Gi B_ij P_Gj + Σ_{i=1}^{NG} B_0i P_Gi + B_00   (5)
where B_ij is the transmission loss coefficient, B_0i is the i-th element of the
loss coefficient vector, and B_00 is the loss coefficient constant.
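As a small illustration of Eqs. (4)-(5), the following Python sketch (our own; the toy numbers in the comment are not from the paper's test systems) computes the transmission loss by the B-coefficient formula and the resulting power-balance mismatch, which a feasible dispatch must drive to zero.

```python
import numpy as np

def power_balance_mismatch(P_G, B, B0, B00, P_D):
    """Return PD + PLOSS - sum(PGi) of Eq. (4), with PLOSS from Eq. (5)."""
    P_G = np.asarray(P_G, dtype=float)
    P_loss = P_G @ B @ P_G + B0 @ P_G + B00     # Eq. (5): B-coefficient loss
    return P_D + P_loss - P_G.sum()             # Eq. (4): zero when balanced

# Toy usage (hypothetical coefficients):
# m = power_balance_mismatch([100.0, 80.0], np.eye(2) * 1e-4,
#                            np.zeros(2), 0.0, P_D=178.0)
```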
Fig. 1. Framework of the cultural algorithm: the belief space is adjusted through the
acceptance function and guides the population space through the influence function,
while the population space evolves by selection and variation, evaluated by a
performance function
A particle's status in the population space is characterized by two factors, its
position and velocity, which are updated by the following equations [14]:
v_id(t+1) = w v_id(t) + c_1 r_1d (φ_id(t) − x_id(t)) + c_2 r_2d (φ_gd(t) − x_id(t))   (6)
x_id(t+1) = x_id(t) + v_id(t+1)   (7)
The situational knowledge of the belief space is updated by a
differential-evolution-style operator:
x_i' = x_i + F (p_{i,r1} − p_{i,r2})   (8)
where x_i is the i-th individual in the situational knowledge, p_{i,r1} and
p_{i,r2} are different particles in the non-dominated set, and F ∈ [0, 1] is the
ratio factor of differential evolution.
If x_i' dominates x_i, then x_i' replaces x_i. If neither of them dominates the
other, the new individual is selected at random.
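The following Python sketch is our own illustration of the differential-evolution-style perturbation and the dominance-based acceptance just described (the exact DE formula being the reconstruction of Eq. (8) above); `evaluate` is an assumed helper returning a solution's objective vector.

```python
import random

def dominates(fa, fb):
    # Pareto dominance for minimization: no worse everywhere, better somewhere
    return all(a <= b for a, b in zip(fa, fb)) and \
           any(a < b for a, b in zip(fa, fb))

def update_situational(x_i, nondominated_set, F, evaluate):
    """DE-style update of a situational-knowledge individual (cf. Eq. (8))."""
    p_r1, p_r2 = random.sample(nondominated_set, 2)   # two distinct particles
    x_new = [xi + F * (a - b) for xi, a, b in zip(x_i, p_r1, p_r2)]
    if dominates(evaluate(x_new), evaluate(x_i)):
        return x_new                                  # x_new replaces x_i
    if dominates(evaluate(x_i), evaluate(x_new)):
        return x_i
    return random.choice([x_i, x_new])                # mutually non-dominated
```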
History Knowledge. History knowledge keeps track of the history of the search
process and records key events in the search. It might be either a considerable move
in the search space or a discovery of a landscape change. Individuals use the history
knowledge for guidance in selecting a moving direction [11].
The history knowledge will be used later to adapt the distribution of the individuals
after finding the Pareto-front.
Acceptance Function. The global worst individual of the belief space is replaced
by the global best of the population space every Acc generations:
Acc = B_num + t / T_max × D_num   (9)
where B_num and D_num are two constants. The global best of the population space
is the individual with the smallest number, and the global worst of the belief
space is the individual with the shortest crowding distance in the Pareto front.
Influence Function. After every Inf generations, the global worst individual of
the population space is replaced by the global best of the belief space:
Inf = B_num + (T_max − t) / T_max × D_num   (10)
The global best individual of the belief space is the one with the longest
crowding distance in the Pareto front, and the global worst individual of the
population space is the one with the largest number.
The non-dominated solutions of the archive are composed of two parts: some are new
non-dominated solutions from the population space, while the others are new
non-dominated solutions from the belief space. A circular crowding sorting
algorithm is adopted in this paper to improve the uniformity of the Pareto-optimal
front.
The first step is the encoding of the decision variables. The power output of each
generator is selected as a gene, and the genes together comprise a particle which
represents a candidate solution of the EED problem. That is, every particle j
consists of a real-coded string x_j = {P_G1,j, P_G2,j, ..., P_GM,j}, where P_Gi,j,
i = 1, 2, ..., M, is the power output of the i-th generator with respect to the
j-th particle.
The Pareto-optimal sets are shown in Fig. 2(a) for case 1 and in Fig. 2(b) for
case 2. It can be seen that the CA-IMOPSO technique preserves the diversity and
uniformity of the Pareto-optimal front and effectively solves the problem in both
cases considered.
The non-dominated solutions obtained with CA-IMOPSO for case 1 and case 2 are
compared to those reported in the literature [10], [4], [3], [5]. The best two
non-dominated solutions of the proposed approach and those reported for case 1 and
case 2 are given in Tables 1 and 2, respectively.
Fig. 2. Pareto-optimal fronts (emission in ton/hr vs. fuel cost in $/hr):
(a) case 1; (b) case 2
Table 1. Best non-dominated solutions for case 1 (FCPSO [10], NSGA [4], NPGA [3], SPEA [5], MOPSO, CA-IMOPSO)
Table 2. Best non-dominated solutions for case 2 (FCPSO [10], NSGA [4], NPGA [3], SPEA [5], MOPSO, CA-IMOPSO)
From the tables we can conclude that the proposed CA-IMOPSO technique is superior
to all reported techniques, which demonstrates the potential and effectiveness of
the proposed technique for solving the EED problem.
6 Conclusion
In this paper, a novel multiobjective particle swarm optimization technique based
on the cultural algorithm has been proposed and applied to the
environmental/economic dispatch optimization problem. The results on the EED
problem show the potential and efficiency of the proposed algorithm. In addition,
the simulation results also reveal the superiority of the proposed algorithm in
terms of the diversity and quality of the obtained Pareto-optimal solutions.
References
1. Talaq, J.H., El-Hawary, F., El-Hawary, M.E.: A summary of environmental/economic dis-
patch algorithms. J. IEEE Trans. Power Syst. 9(3), 1508–1516 (1994)
2. Lingfeng, W., Chanan, S.: Environmental/economic power dispatch using a fuzzified
multi-objective particle swarm optimization algorithm. J. Electr. Power Syst. Research 77,
1654–1664 (2007)
3. Abido, M.A.: A niched Pareto genetic algorithm for multiobjective environ-
mental/economic dispatch. J. Electr. Power Energy Syst. 25(2), 97–105 (2003)
4. Abido, M.A.: A novel multiobjective evolutionary algorithm for environmental/ economic
power dispatch. J. Electr. Power Syst. Research 65, 71–91 (2003)
5. Abido, M.A.: Multiobjective evolutionary algorithms for electric power dispatch problem.
J. IEEE Trans. Evolut. Comput. 10(3), 315–329 (2006)
6. King, R.T.F., Rughooputh, H.C.S., Deb, K.: Evolutionary multi-objective environ-
mental/Economic dispatch: Stochastic versus deterministic approaches. In: Coello Coello,
C.A., Hernández Aguirre, A., Zitzler, E. (eds.) EMO 2005. LNCS, vol. 3410, pp. 677–691.
Springer, Heidelberg (2005)
7. Basu, M.: Dynamic economic emission dispatch using nondominated sorting genetic algo-
rithm-II. J. Electr. Power. Energy Syst. 30(2), 140–210 (2008)
8. Wang, L.F., Singh, C.: Environmental/economic power dispatch using a fuzzified multi-
objective particle swarm optimization algorithm. J. Electr. Power Syst. Res. 77(12), 1654–
1664 (2007)
9. Cai, J.J., Ma, X.Q., Li, Q., Li, L.X., Peng, H.P.: A multi-objective chaotic particle swarm
optimization for environmental/economic dispatch. J. Energy Convers Manage. 50(5),
1318–1325 (2009)
10. Agrawal, S., Panigrahi, B.K., Tiwari, M.K.: Multiobjective particle swarm algorithm with
fuzzy clustering for electrical power dispatch. J. IEEE Trans. Evolut. Comput. 12(5),
529–541 (2008)
11. Daneshyari, W., Yen, G.G.: Cultural MOPSO: A cultural framework to adapt parameters
of multiobjective particle swarm optimization. In: C. IEEE Congress. on Evolut. Comput.,
pp. 1325–1332 (2009)
12. Farag, A., Al-Baiyat, S., Cheng, T.C.: Economic load dispatch multiobjective optimization
procedures using linear programming techniques. J. IEEE Trans. Power Syst. 10(2),
731–738 (1995)
13. Landa Becerra, R., Coello Coello, C.A.: Cultured differential evolution for constrained
optimization. J. Computer Methods in Applied Mechanics and Engineering 195, 4303–4322 (2006)
14. Yunhe, H., Lijuan, L., Yaowu, W.: Enhanced particle swarm optimization algorithm and
its application on economic dispatch of power systems. J. Proc. of CSEE 24(7), 95–100
(2004)
15. Hemamalini, S., Simon, S.P.: Emission Constrained Economic Dispatch with Valve-Point
Effect using Particle Swarm Optimization. In: C. IEEE Region. 10 Confer., pp. 1–6 (2008)
A New Multi-Objective Particle Swarm
Optimization Algorithm for Strategic Planning
of Equipment Maintenance
1 Introduction
in the domain of equipment maintenance and have shown their advantages both
in problem-solving effectiveness and solution quality. For example, Kleeman and
Lemont [9] designed a multi-objective genetic algorithm to solve the aircraft en-
gine maintenance scheduling problem, which is a combination of a modified job
shop problem and a flow shop problem. Verma and Ramesh [14] viewed the initial
scheduling of preventive maintenance as a constrained nonlinear multi-objective
decision-making problem, and proposed a genetic algorithm that simultaneously
optimizes the objectives of reliability and cost together with newly introduced
criteria, the non-concurrence of maintenance periods and the maintenance start
time factor. Yang and Huang [16] also proposed a genetic algorithm for
multi-objective equipment maintenance planning, but the model used a simplified
function that evaluates equipment capability based only on equipment cost, which
limited its practicality. Ai and Wu [1] used a hybrid approach based on simulated
annealing and a genetic algorithm for communication equipment maintenance
planning, but they did not consider multiple objectives. Recently the authors
presented in [19] an efficient multi-objective tabu search algorithm, which was
capable of solving large problems with more than 45,000 pieces of equipment of
500 kinds.
Particle swarm optimization (PSO) [8] is a population-based global optimiza-
tion technique that enables a number of individual solutions, called particles, to
move through a hyper dimensional search space in a methodical way to search for
optimal solution(s). Each particle represents a feasible solution and has a
position vector x and a velocity vector v, which are adjusted at each iteration by
learning from a personal best pbest found by the particle itself and the current
global best gbest found by the whole swarm. PSO is conceptually simple and easy to
implement,
and has demonstrated its efficiency in a wide range of continuous and combina-
torial optimization problems [2]. Since 2002, multi-objective PSO (MOPSO) has
attracted much attention among researchers and has shown promising results
for solving multi-objective optimization problems (e.g., [3,13,11,6]).
In this paper we define a multi-objective integer programming model for SEMP
which considers the objectives of minimizing maintenance costs (including the
costs of maintenance materiel and workers) and maximizing the expected mission
capability of the equipment system (via layered quadratic functions). We then
propose a MOPSO algorithm for the problem model, which uses an objective leverage
function for global best selection and preserves the diversity of non-dominated
solutions based on the measurement of minimum pairwise distance. Experimental
results show that our approach can achieve good solution quality with low
computational costs to support effective decision-making.
2 Problem Model
2.1 Problem Description and Hypothesis
SEMP needs to determine the numbers of different kinds of equipment to be
maintained at different levels according to the overall mission requirements and
the current conditions of all equipment [18]. There are two key aspects to assess
an SEMP solution: the overall maintenance cost and the overall mission capa-
bility after maintenance. Thus SEMP is typically a multi-objective optimization
problem, for which the improvement of one objective may cause the degradation
of another.
One of the basic principles of equipment maintenance is to assign each piece of
equipment to an appropriate maintenance level according to its quality. In this
paper, we roughly suppose there are three quality levels of equipment, namely A,
B, and C, and two maintenance levels, namely I and II. Typically, equipment of
quality level A does not need to be maintained, and equipment of quality levels C
and B should be maintained at levels I and II, respectively.
C_T = Σ_{i=1}^{m} (s_i^I t_i^I x_i^I + s_i^II t_i^II x_i^II)   (3)
and thus the overall maintenance cost is C = C_E + C_T.
After maintenance, in a given period the expected numbers of equipment i at the
quality levels A, B, and C are respectively calculated as follows:
x̃_i^A = (1 − α_i − γ_i)(x_i^A + x_i^I + x_i^II)   (4)
x̃_i^B = α_i (x_i^A + x_i^I + x_i^II) + (1 − β_i)(x_i^B − x_i^II)   (5)
x̃_i^C = (x_i^C − x_i^I) + β_i (x_i^B − x_i^II) + γ_i (x_i^A + x_i^I + x_i^II)   (6)
where x_i^I and x_i^II denote the numbers of equipment i maintained at levels I
and II, x_i^A, x_i^B and x_i^C are the current numbers at the three quality
levels, and α_i, β_i and γ_i are the per-period degradation rates from A to B,
from B to C, and from A to C, respectively.
Now for equipment i, its mission capability can be evaluated based on the weighted
sum x̃_i of the numbers of equipment at the different quality levels as follows
(for most equipment the weight w_i^A can be set to 1 and the weight w_i^C is very
small):
x̃_i = w_i^A x̃_i^A + w_i^B x̃_i^B + w_i^C x̃_i^C   (7)
And the mission capability I of the whole equipment system can be evaluated using
the quadratic function
I = Σ_{i=1}^{m} Σ_{j=1}^{m} a_ij x̃_i x̃_j + Σ_{i=1}^{m} b_i x̃_i + c   (8)
In the model constraints (11)–(15), a lower limit is imposed on the overall
mission capability I, an upper limit on the overall cost C, an upper limit MT on
the total working hours, and upper limits X^I and X^II on the numbers of equipment
that can be maintained at levels I and II, respectively.
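A compact Python sketch of the capability evaluation chain of Eqs. (4)-(8), following the reconstruction above (so the sign conventions for the unmaintained B and C units are assumptions rather than verbatim from the paper):

```python
def system_capability(xA, xB, xC, x1, x2, alpha, beta, gamma,
                      wA, wB, wC, a, b, c):
    """Expected mission capability I (Eqs. (4)-(8)) for m equipment kinds.
    x1[i], x2[i]: numbers of kind i maintained at levels I and II."""
    m = len(xA)
    xt = []
    for i in range(m):
        pool = xA[i] + x1[i] + x2[i]       # units at quality A after maintenance
        eA = (1 - alpha[i] - gamma[i]) * pool                               # Eq. (4)
        eB = alpha[i] * pool + (1 - beta[i]) * (xB[i] - x2[i])              # Eq. (5)
        eC = (xC[i] - x1[i]) + beta[i] * (xB[i] - x2[i]) + gamma[i] * pool  # Eq. (6)
        xt.append(wA[i] * eA + wB[i] * eB + wC[i] * eC)                     # Eq. (7)
    quad = sum(a[i][j] * xt[i] * xt[j] for i in range(m) for j in range(m))
    return quad + sum(b[i] * xt[i] for i in range(m)) + c                   # Eq. (8)
```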
3 MOPSO Algorithm
3.1 The Algorithm Framework
The SEMP model described above is a multi-objective integer programming
problem. Although the standard PSO algorithm works on continuous variables,
the truncation of real values to integers will not affect significantly the perfor-
mance of the method when the range of decision variables is large [10]. The
following presents our MOPSO algorithm for the SEMP that searches for the
Pareto-optimal front rather than a single optimal solution:
1. Set the basic algorithm parameters, and randomly generate a swarm P of p
feasible solutions.
2. For each particle η in the swarm, initialize its velocity v^η = 0, and set
p_best^η to be its initial position x^η.
3. Select all non-dominated solutions from P and save them in the archive NP.
4. Choose a solution g_best from NP such that
g_best = argmax_{θ ∈ NP} { w_1 I(θ) − w_2 C(θ) }   (16)
where w_1 and w_2 are two preference weights satisfying w_1, w_2 ∈ (0, 1) and
w_1 + w_2 = 1.
5. Update the velocity and position of each η in P according to the following
movement equations:
v^η = w v^η + c_1 r_1 (p_best^η − x^η) + c_2 r_2 (g_best − x^η)   (17)
x^η = x^η + v^η   (18)
where w is the inertia weight, c_1 and c_2 are learning factors, and r_1 and r_2
are random values in (0, 1).
6. If the position x^η violates the problem constraints (11)–(15), reset
x^η = p_best^η and reset v^η = 0.
7. Update each local best solution p_best^η.
8. Compute SI = Σ_{θ ∈ NP} I(θ) and SC = Σ_{θ ∈ NP} C(θ).
9. Update the non-dominated solution set NP based on the new swarm, and then
compute ΔSI = Σ_{θ ∈ NP} I(θ) − SI and ΔSC = SC − Σ_{θ ∈ NP} C(θ).
10. If the termination condition is satisfied, the algorithm stops; otherwise
update the inertia and preference weights according to the following equations
and go to step 4:
w = w_max − (k / k_max)(w_max − w_min)   (19)
w_1 = min(w_1 + 0.1, w_1^max) if ΔSI < ΔSC, and w_1 = max(w_1 − 0.1, w_1^min) otherwise   (20)
w_2 = 1 − w_1   (21)
where k is the current iteration number, k_max is the maximum iteration number of
the algorithm, w_max and w_min are the maximum and minimum inertia weights, and
w_1^max and w_1^min are the maximum and minimum first-preference weights,
respectively.
When the size of NP reaches the size limit |NP|_max, the following procedure is
applied for the possible inclusion of a new solution η, where (θ_a, θ_b) denotes
the pair of archived solutions with the minimum pairwise distance dis(θ_a, θ_b):
1. If η is dominated by any θ ∈ NP, then η is discarded.
2. Else if η dominates some θ ∈ NP, then remove those θ and insert η.
3. Else if min_{θ∈NP, θ≠θ_a} dis(η, θ) > dis(θ_a, θ_b), then remove θ_a and insert η.
4. Else if min_{θ∈NP, θ≠θ_b} dis(η, θ) > dis(θ_a, θ_b), then remove θ_b and insert η.
5. Else choose the closest z ∈ NP to η; if min_{θ∈NP, θ≠z} dis(η, θ) > dis(η, z),
then remove z and insert η.
6. Else discard η.
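A Python sketch of this insertion procedure is given below; it is our own rendering, and measuring dis(·,·) in the objective space is an assumption (the paper only speaks of pairwise distances between archived solutions).

```python
def try_insert(archive, eta, objs, dominates, dist, max_size):
    """Distance-based archive update following steps 1-6 above.
    objs(s) returns the objective vector of solution s."""
    if any(dominates(objs(t), objs(eta)) for t in archive):
        return                                         # step 1: discard eta
    worse = [t for t in archive if dominates(objs(eta), objs(t))]
    if worse:
        for t in worse:                                # step 2: replace dominated
            archive.remove(t)
        archive.append(eta)
        return
    if len(archive) < max_size:                        # archive not yet full
        archive.append(eta)
        return
    # (ta, tb): closest pair among archived solutions
    ta, tb = min(((u, w) for i, u in enumerate(archive) for w in archive[i+1:]),
                 key=lambda p: dist(objs(p[0]), objs(p[1])))
    d_ab = dist(objs(ta), objs(tb))
    if min(dist(objs(eta), objs(t)) for t in archive if t is not ta) > d_ab:
        archive.remove(ta); archive.append(eta)        # step 3
    elif min(dist(objs(eta), objs(t)) for t in archive if t is not tb) > d_ab:
        archive.remove(tb); archive.append(eta)        # step 4
    else:
        z = min(archive, key=lambda t: dist(objs(eta), objs(t)))
        rest = [dist(objs(eta), objs(t)) for t in archive if t is not z]
        if rest and min(rest) > dist(objs(eta), objs(z)):
            archive.remove(z); archive.append(eta)     # step 5
        # step 6: otherwise discard eta
```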
Table 1. Parameter setting in the algorithms, where M = Σ_{i=1}^{m} (x_i^A + x_i^B + x_i^C)
is the total number of equipment
4 Computational Experiments
The presented MOPSO algorithm (denoted by MOPSO-A) has been tested on
a set of SEMP problem instances and compared with two other algorithms:
5 Conclusion
References
1. Ai, B., Wu, C.: Genetic and simulated annealing algorithm and its application
to equipment maintenance resource optimization. Fire Control & Command Con-
trol 35(1), 144–145 (2010)
2. Clerc, M.: Particle Swarm Optimization. ISTE, London (2006)
3. Coello, C.A.C., Lechuga, M.S.: MOPSO: A proposal for multiple objective particle
swarm optimization. In: Proceedings of Congress on Evolutionary Computation,
vol. 2, pp. 1051–1056. IEEE Press, Los Alamitos (2002)
1 Introduction
Nurse scheduling, one among many types of staff scheduling, intends to
automatically allot working shifts to available nurses in order to maximize hospital
value/benefit subject to relevant constraints including governmental regulations, nurse
value/benefit subject to relevant constraints including governmental regulations, nurse
skill requirement, minimal on-duty hours, etc. There are several solution methods
proposed in the last decade for dealing with the nurse scheduling problem. These
methods can be divided into three categories: mathematical programming, heuristics,
and metaheuristics. Most of the methods aim to solve a single-objective
formulation; only a few of them [1-4] address a more complete description of
real-world hospital administration and attempt a multiobjective formulation of
nurse scheduling. Nevertheless, due to the high complexity of the multiobjective
context, the authors of [1-3] converted the multiobjective formulation into a
single-objective program by the weighting-sum technique. The weighting-sum
technique fails to identify optimal solutions if the Pareto front is non-convex,
and the values of the weights used to combine multiple objectives are hard to
determine.
This paper proposes a cyber swarm algorithm (CSA) for the Multi-Objective Nurse
Scheduling Problem (MONSP). The CSA is a new metaheuristic approach which
marries the major features of particle swarm optimization (PSO) and scatter search.
*
Corresponding author.
The CSA has been shown to be more effective than several state-of-the-art methods
for single-objective optimization [5]. The contribution of this paper includes the
following. (1) We devise a multiobjective version for the CSA. The proposed method,
named MOCSA, is general and can be employed to solve many classes of problems
with multiobjective context; (2) we show the effectiveness of MOCSA in tackling the
generic multiobjective nurse scheduling problem. The non-dominated solutions ob-
tained by MOCSA are superior to those produced by other competing methods in
terms of the dominance strength and the diversity measure on the solution front; and
(3) the multi-dimensional asymptotic Pareto front is shown in the objective space to
illustrate the comparative performances of competing methods.
The remainder of this paper is organized as follows. Section 2 presents a literature
review of existing methods for the nurse scheduling problem and introduces the cen-
tral concepts of multiobjective optimization. Section 3 describes the problem formally
and articulates the proposed method. Section 4 presents experimental results together
with an analysis of their implications. Finally, concluding remarks and discussions are
given in Section 5.
2 Related Works
To assist various operations performed in a hospital, a work day is normally divided
into two to four shifts (for example, a three-shift day may include day, night, and late
shift). Each nurse is allocated to a number of shifts during the scheduling period with
a set of constraints. A shift is fulfilled by a specified number of nurses with different
medical skills depending on the operations to be performed in the shift. The
constraints attached to nurse scheduling encode necessary hospital regulations,
taking into account the wage cost, execution of operations, nurses' requests, etc.
The constraints
can be classified as hard constraints and soft constraints. Hard constraints should be
strictly satisfied and a schedule violating any hard constraints will not be acceptable.
Soft constraints are desired to be satisfied as much as possible and a schedule violat-
ing soft constraints is still considered feasible.
The objective could involve the reduction of the human resource cost, the
satisfaction of nurses' requests, or the minimization of violations of any soft
constraints. Most existing works seek to optimize one objective; only a few
consider multiple objectives when searching for solutions. Berrada et al. [1] made
the first attempt to find a nurse
schedule optimizing several soft constraints simultaneously. The lexico-dominance
technique is applied where the priority order of the soft constraints is pre-specified
and is used to determine the quality of solutions. Burke et al. [3] applied the weight-
ing sum technique but the weight values are determined by the priority order of objec-
tives obtained after close consultation with hospitals. Burke et al. [4] proposed a
simulated annealing multiobjective method which generates the non-dominated solu-
tions to obtain an approximate Pareto front.
A widely accepted notion in the decision science field for multiobjective
optimization is to search for the Pareto-optimal solutions, which are not
dominated by any other solutions. A solution x dominates another solution y,
denoted x ≻ y, if x is strictly better than y in at least one objective and x is
no worse than y in the others. The plots of objective values for all
Pareto-optimal solutions form a Pareto front in the objective space. It is usually
hard to find the true Pareto front due to the high complexity of the problem
nature. Alternatively, an approximate Pareto front is searched for. The
quality of this front is evaluated by two measures: (1) The convergence measure indi-
cates how close the approximate front is converging to the true front, and (2) the di-
versity measure favors the approximate front whose plots are evenly spread on the
front. Classical multiobjective optimization methods include lexico-dominance,
weighting sum, and goal programming. However, multiple runs of the applied method
are needed to obtain a set of non-dominated solutions. Recently, metaheuristic algo-
rithms have been introduced as a viable technique for multiobjective optimization.
Notable applications have been proposed using Strength Pareto Evolutionary Algo-
rithm (SPEA II) [6], Non-dominated Sorting Genetic Algorithm (NSGA II) [7], and
Multi-Objective Particle Swarm Optimization (MOPSO) [8].
3 Proposed Method
This paper deals with the MONSP on a shift-by-shift basis. Each working day is
divided into three shifts (day, night, and late shift), and the total shifts in a
scheduling period are numbered from 1 to S (1 indicates the day shift of the first
day, 2 indicates the night shift of the first day, etc.). Assume that there are M
types of nurse skills, and
skill type m is owned by Tm nurses. The aim of the MONSP is to optimize multiple
objectives simultaneously by allotting appropriate nurses to the shifts subject to a set
of hard constraints. By using the notations introduced in Table 1, we present the
mathematical formulation of the addressed MONSP as follows.
P_mij = 1 if nurse i having skill m is satisfied with the shift j assignment;
P_mij = −1 if unsatisfied; and P_mij = 0 if there is no special preference.
x_mij = 1 if nurse i having skill m is allotted to shift j; otherwise x_mij = 0.
Minimize f_1 = Σ_{m=1}^{M} Σ_{i=1}^{T_m} Σ_{j=1}^{S} x_mij C_mj   (1)
Minimize f_2 = Σ_{m=1}^{M} Σ_{j=1}^{S} ( Σ_{i=1}^{T_m} x_mij − L_mj )   (2)
Minimize f_3 = Σ_{m=1}^{M} Σ_{i=1}^{T_m} Σ_{j=1}^{S} x_mij (1 − P_mij)   (3)
Subject to
Σ_{j=1}^{S} x_mij ≥ W_m   ∀m, i   (4)
Σ_{i=1}^{T_m} x_mij ≥ L_mj   ∀m, j   (5)
Σ_{i=1}^{T_m} x_mij ≤ U_mj   ∀m, j   (6)
Σ_{j=r}^{r+2} x_mij ≤ 1,   r = 1, 4, 7, …, S−2,   ∀m, i   (7)
Σ_{j=r}^{r+3(R_m+1)−1} x_mij ≤ R_m,   r = 1, 4, 7, …, S−2,   ∀m, i   (8)
Since the third criterion concerns the nurses' preference P_mij about the
schedule, it is converted to a minimization objective by using 1 − P_mij
(Eq. (3)). The first constraint (Eq. (4)) stipulates that the number of shifts
fulfilled by a nurse having skill m should be greater than or equal to a minimum
threshold W_m. Eq. (5) and Eq. (6) describe that the number of nurses having skill
m allotted to shift j should be a value between L_mj and U_mj. The fourth
constraint (Eq. (7)) indicates that any nurse can work at most one shift during
any working day. Finally, the fifth constraint (Eq. (8)) requests that a nurse
having skill m can serve for at most R_m consecutive working days.
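For concreteness, the sketch below (our own illustration; indices are 0-based, and the simple summed-violation penalty is an assumption, since the paper only says f4 'computes the amount of total violations') evaluates the three objectives and the constraint penalty for a candidate schedule x, with x[m][i] the 0/1 shift vector of nurse i of skill m.

```python
def evaluate(x, C, L, U, P, W, R, M, T, S):
    """Objectives f1-f3 of Eqs. (1)-(3) plus a penalty f4 for Eqs. (4)-(8)."""
    f1 = sum(x[m][i][j] * C[m][j]
             for m in range(M) for i in range(T[m]) for j in range(S))
    f2 = sum(sum(x[m][i][j] for i in range(T[m])) - L[m][j]
             for m in range(M) for j in range(S))
    f3 = sum(x[m][i][j] * (1 - P[m][i][j])
             for m in range(M) for i in range(T[m]) for j in range(S))
    f4 = 0
    for m in range(M):
        for i in range(T[m]):
            f4 += max(0, W[m] - sum(x[m][i]))           # Eq. (4): workload floor
            for r in range(0, S, 3):                    # Eq. (7): 1 shift per day
                f4 += max(0, sum(x[m][i][r:r + 3]) - 1)
            win = 3 * (R[m] + 1)                        # Eq. (8): rest-day rule
            for r in range(0, S - win + 1, 3):
                f4 += max(0, sum(x[m][i][r:r + win]) - R[m])
        for j in range(S):                              # Eqs. (5)-(6): staffing
            staffed = sum(x[m][i][j] for i in range(T[m]))
            f4 += max(0, L[m][j] - staffed) + max(0, staffed - U[m][j])
    return f1, f2, f3, f4
```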
One of the notable PSO variants is the Cyber Swarm Algorithm (CSA) [5], which
incorporates the reference set, a notion from scatter search [9], to keep the most
influential solutions. To seek the approximate Pareto-optimal solutions of the
MONSP, we propose a multiobjective version of the CSA, named MOCSA. Fig. 1 shows
the conceptual diagram of the MOCSA, which consists of four memory components. The
swarm memory component is the working memory where a population of swarm particles
evolves to improve solution quality. The individual memory reserves a separate
space for each particle and stores the pseudo non-dominated solutions by reference
to all the solutions found by this designated particle only. Note that the
constraints) for assigning a nurse to available shifts. Hence, a nurse schedule
can be encoded as a value in [0, 2^S − 1]. Assume a population of U particles is
used, where particle P_i = {p_ij} indicates the schedule for all the nurses. The
fitness of the i-th particle is a four-value vector (f_1, f_2, f_3, f_4). The
objective values evaluated using Eqs. (1)-(3) are the first three fitness values
(f_1, f_2, f_3). The fourth fitness value f_4 serves as a penalty which computes
the amount of total violations incurred by any constraint (Eqs. (4)-(8)). We
assume that a feasible solution always dominates any infeasible solution.
Exploiting guiding information. The CSA extends the learning form using pbest and
gbest by additionally including another solution guide which is systematically
selected from the reference set; the reference set stores a small number of
reference solutions, denoted RefSol[m], m = 1, 2, …, RS, observed by all particles
by reference to fitness values and solution diversity. In implementing the MOCSA,
the selection of solution guides is more complex because multiple non-dominated
solutions can play the roles of pbest, gbest and RefSol[m]. Once the three
solution guides are selected, particle P_i updates its positional vector in the
swarm memory by the guided move using Eqs. (10) and (11) as follows.
(11) as follows.
⎛ ⎛ ω1ϕ1 pbestij + ω 2ϕ 2 gbest j + ω3ϕ 3 RefSol[m] j ⎞ ⎞ ,1≤m≤RS (10)
vijm ← K ⎜ vij + (ϕ1 + ϕ 2 + ϕ 3 )⎜⎜ − pij ⎟⎟ ⎟
⎜ ω ϕ + ω ϕ + ω ϕ ⎟
⎝ ⎝ 1 1 2 2 3 3 ⎠⎠
Pi←non-dominated { ( f (P + v ) 1 ≤ k ≤ 4)
k i
m
i m ∈ [1, RS ] } (11)
where K is the constriction factor, and ω and φ are the weighting values and
cognition coefficients for the three solution guides pbest, gbest and RefSol[m].
As RefSol[m], 1 ≤ m ≤ RS, is selected in turn from the reference set, the process
generates RS candidate particles for replacing P_i. We choose the non-dominated
solution from the RS candidate particles; if there exists more than one
non-dominated solution, the tie is broken at random. Nevertheless, all the
non-dominated solutions found in the guided move are used for the experience
memory update as noted in the following.
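The following one-dimension Python sketch mirrors Eq. (10); it is our own illustration, and the uniform sampling range of the φ coefficients and the constriction value K = 0.729 are assumptions borrowed from common constriction-type PSO practice, not values given in the paper.

```python
import random

def guided_velocity(v, p, pbest, gbest, ref, w=(0.4, 0.3, 0.3), K=0.729):
    """Blend the three guides into one attractor (Eq. (10)) and move."""
    phi = [random.uniform(0.0, 2.05) for _ in range(3)]          # assumed range
    num = w[0]*phi[0]*pbest + w[1]*phi[1]*gbest + w[2]*phi[2]*ref
    den = w[0]*phi[0] + w[1]*phi[1] + w[2]*phi[2]
    return K * (v + sum(phi) * (num / den - p))
```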
Experience memory update. As shown in Fig. 1, experience memory consists of
individual memory, global memory and reference memory, where the rewarded
experience pbest, gbest and RefSol[m] are stored and updated. The individual memory
tallies the personal rewarded experience pbest for each individual particle. Because
there may exist more than one non-dominated solution in the search course of a parti-
cle (here, the non-dominance only refers to all the solutions found by this particle),
we save all these solutions in the individual memory. Any solution in the
individual memory can serve as pbest in the guided move, and we present below the
Diversity strategy [10] for selecting pbest from the individual memory. In
contrast to the individual memory, the global memory stores all the non-dominated
solutions found by the
entire swarm. Hence, the content of the global memory is used for the final output of
the approximate Pareto-optimal solutions. During the evolution, the solutions in the
global memory are also helpful in assisting the guided moving of particles by serving
as gbest. The Sigma strategy [11] is employed in our method for selecting gbest from
the global memory. The reference memory stores a small number of reference solu-
tions selected from individual and global memory. According to the original scatter
search template [9], we facilitate the 2-tier reference memory update by reference to
the fitness values and diversity of the solutions.
Selecting solution guides. First, the Diversity strategy for selecting pbest is em-
ployed where each particle selects from its individual memory a non-dominated solu-
tion as pbest that is the farthest away from the other particles in the objective space.
Thus, the particle is likely to produce a plot of objective values equally-distanced to
those of other particles, improving the diversity property of the solution front. Second,
we apply the Sigma strategy for selecting gbest from the global memory. For a given
particle, the Sigma strategy selects from the global memory a non-dominated solution
as gbest which is the closest to the line connecting the plot of the particle’s objective
values to the origin in the objective space, improving the convergence property of the
solution front. Finally, the third solution guide, RefSol[m], m = 1, 2, …, RS, is sys-
tematically selected from the reference memory. These reference solutions have good
properties of convergence and diversity, so their features should be fully explored in
the guided moving for a particle.
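A small Python sketch of the Sigma-style gbest selection described above; measuring closeness to the line through the origin by the angle between objective vectors is one natural reading of [11], and is an assumption of this illustration.

```python
import math

def sigma_gbest(particle_obj, archive_objs):
    """Pick the archived objective vector angularly closest to the line
    joining the origin and the particle's own objective vector."""
    def angular_gap(a, b):
        dot = sum(u * w for u, w in zip(a, b))
        na = math.sqrt(sum(u * u for u in a))
        nb = math.sqrt(sum(w * w for w in b))
        return 1.0 - dot / (na * nb + 1e-12)   # 0 when collinear
    return min(range(len(archive_objs)),
               key=lambda k: angular_gap(archive_objs[k], particle_obj))
```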
4 Experimental Results
To prevent bias toward a smaller number of efficient points, the Hypervolume is
normalized by the final number of solutions produced. Solutions with a smaller
Hypervolume value are more desired because they are closer to the true Pareto
front. The diversity measure, named Spacing, estimates the variance of the
distances between adjacent fitness plots. Solutions with a smaller Spacing value
are more desired because they exhibit a better representation of a front.
As all the competing algorithms are stochastic, we report the average performance
index values over 10 independent runs. Each run of a given algorithm is allowed a
budget of 80,000 fitness evaluations. Table 2 lists the values of the performance
indexes for the solution fronts produced by the competing algorithms. For
formance indexes for the solution fronts produced by the competing algorithms. For
Problem I, the MOCSA gives the smallest Hypervolume value indicating the
produced solution front converges closer to the true Pareto front than the other two
algorithms. The Spacing value for the MOCSA is also the smallest among all which
discloses that the non-dominated solutions produced by MOCSA spread more evenly
on the front. On the other hand, the NSGA II produces the greatest values (worst
performance) for both Hypervolume and Spacing, while the MOPSO generates the
intermediate values. The experimental outcome for Problem II is slightly different
from the previous case. The NSGA II gives the smallest Hypervolume value (best
performance) although its spacing value indicates that the produced solutions are not
well distributed on the front. The MOCSA produces the second smallest Hyper-
volume value and the smallest Spacing value among all competitors, supporting the
claim that the MOCSA is superior to the other two algorithms. The MOPSO generates
the worst Hypervolume value and a median Spacing value.
Table 2. Performance index values for the solution fronts produced by the competing algorithms

            Problem I               Problem II
            Hypervolume  Spacing    Hypervolume  Spacing
MOCSA       2.42E+07     1.41       9.86E+07     3.40
NSGA II     9.45E+07     2.37       8.37E+07     7.82
MOPSO       6.17E+07     2.01       1.35E+08     4.24

Fig. 2. Plots of the multiobjective values of the obtained solutions: (a) Problem I; (b) Problem II

Fig. 2(a) shows the plots of the multiobjective values of all the solutions for
Problem I obtained by the different algorithms. It is seen that the front produced
by MOCSA is closer to the origin. We can also observe that the spread of the
solutions is better
distributed on the front than those produced by the other two methods. The front gen-
erated by the MOPSO is next to that produced by the MOCSA by reference to the
visual distance to the origin. The front generated by the NSGA II is the farthest to the
origin and the obtained solutions are not evenly distributed on the front. For Problem
II as shown in Fig. 2(b), we can see the front produced by the NSGA II is the closest
to the origin although the obtained solutions are still not evenly distributed on the
front. The MOCSA produces the front next to that of NSGA II, but with better
spacing. Finally, the MOPSO front is the farthest from the origin, although the
distribution of its solutions on the front is still better than that produced by
the NSGA II.
5 Conclusions
In this paper, we have presented a multiobjective cyber swarm algorithm (MOCSA)
for solving the nurse scheduling problem. Based on a literature survey, we propose a
mathematical formulation containing three objectives and five hard constraints. In
contrast to most existing methods which transform multiple objectives into an inte-
grated one, the proposed MOCSA method tackles the generic multiobjective setting
and is able to produce approximate Pareto front. The experimental results on two
simulation problems manifest that the MOCSA outperforms NSGA II and MOPSO in
terms of convergence and diversity measures of the produced fronts.
References
1. Berrada, I., Ferland, J., Michelon, P.: A multi-objective approach to nurse scheduling with
both hard and soft constraints. Socio-Economic Planning Sciences 30, 183–193 (1996)
2. Azaiez, M.N., Al Sharif, S.S.: A 0-1 goal programming model for nurse scheduling. Com-
puters & Operations Research 32, 491–507 (2005)
3. Burke, E.K., Li, J., Qu, R.: A Hybrid Model of Integer Programming and Variable
Neighbourhood Search for Highly-Constrained Nurse Rostering Problems. European Jour-
nal of Operational Research 203, 484–493 (2010)
4. Burke, E.K., Li, J., Qu, R.: A Pareto-based search methodology for multi-objective nurse
scheduling. Annals of Operations Research (2010)
5. Yin, P.Y., Glover, F., Laguna, M., Zhu, J.X.: Cyber swarm algorithms – improving parti-
cle swarm optimization using adaptive memory strategies. European Journal of Opera-
tional Research 201, 377–389 (2010)
6. Zitzler, E., Laumanns, M., Thiele, L.: SPEA2: Improving the strength pareto evolutionary
algorithm. Technical Report 103, ETH, Switzerland (2001)
7. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic
algorithm: NSGA-II. IEEE Transaction on Evolutionary Computation 6, 42–50 (2002)
8. Coello Coello, A.C., Pulido, G.T., Lechuga, M.S.: Handling multiple objectives with parti-
cle swarm optimization. IEEE Trans. on Evolutionary Computation 8, 256–279 (2004)
9. Laguna, M., Marti, R.: Scatter Search: Methodology and Implementation in C. Kluwer
Academic Publishers, London (2003)
10. Branke, J., Mostaghim, S.: About selecting the personal best in multi-objective particle
swarm optimization. In: Runarsson, T.P., Beyer, H.-G., Burke, E.K., Merelo-Guervós, J.J.,
Whitley, L.D., Yao, X. (eds.) PPSN 2006. LNCS, vol. 4193, pp. 523–532. Springer, Heidel-
berg (2006)
11. Mostaghim, S., Teich, J.: Strategies for finding local guides in multi-objective particle
swarm optimization (MOPSO). In: Proceedings of the IEEE Swarm Intelligence Sympo-
sium 2003 (SIS 2003), Indianapolis, Indiana, USA, pp. 26–33 (2003)
A Multi-Objective Binary Harmony Search Algorithm
1 Introduction
Harmony Search (HS) is an emerging global optimization algorithm developed by
Geem in 2001 [1]. Owing to its excellent characteristics, HS has drawn more and
more attention and dozens of variants have been proposed to improve the
optimization ability. On the one hand, the control parameters of HS were investigated
and several adaptive strategies were proposed to achieve better performance. Pan et al. [2] proposed a self-adaptive global best harmony search algorithm in which the harmony memory consideration rate and the pitch adjustment rate were dynamically adapted by learning mechanisms. Wang and Huang [3] presented a self-adaptive harmony search algorithm which uses consciousness to automatically adjust the parameter values. On the other hand, various hybrid harmony search algorithms were proposed in which additional information extracted by other algorithms is combined with HS to improve the optimization performance. For instance, Li and Li [4] combined HS with a real-valued Genetic Algorithm to enhance the exploitation capability. Several hybrid HS algorithms combined with Particle Swarm Optimization (PSO) were developed to optimize numerical problems [5], pin-connected structures [6], and water network design [7]. Other related works include the fusion of HS with the Simplex Algorithm [8] or the Clonal Selection Algorithm [9].
HS has now been successfully applied to a wide range of optimization problems in the scientific and engineering fields. However, most of these works focused on single-objective optimization problems in continuous or discrete space; so far, only a few studies have been concerned with binary-coded problems or multi-objective optimization problems. On binary-coded optimization problems, Geem [10] first used HS to solve the water pump switching problem, where the candidate value for each decision variable is “0” or “1”. Then Greblicki and Kotowski [11] analyzed the properties of HS on the one-dimensional binary knapsack problem and found the optimization performance of HS unsatisfactory. Afterwards, Wang et al. [12] pointed out that the pitch adjustment rule of HS cannot perform its function for binary-coded problems, which is the root of the poor performance. To make up for this, Wang proposed a binary HS algorithm in which a new pitch adjustment operator was developed to improve the optimization ability. On multi-objective optimization problems, Geem and Hwangbo [13] studied the satellite heat pipe design problem, which requires simultaneously considering two objectives, i.e., the thermal conductance and the heat pipe mass. However, the authors transformed this multi-objective problem into a single objective function by minimizing the sum of the individual errors between the current function values and the optimal values. Geem [14] later used HS to tackle the multi-objective time-cost optimization problem for scheduling a project. In that work, dominance-based comparison for selection was adopted to achieve the trade-off between time and cost. As far as we know, no work has been reported on multi-objective binary HS (MBHS). To extend HS to tackle multi-objective binary-coded problems, a new Pareto-based multi-objective binary HS is proposed in this work.
This paper is organized as follows. Section 2 briefly introduces the standard HS algorithm. Then the proposed MBHS is described in detail in Section 3. Section 4 presents the experimental results of MBHS on the benchmark functions, together with comparisons with NSGA-II. Finally, some conclusions are drawn in Section 5.
Here x_i^new is the i-th element of the new harmony solution vector; rand() is a random number; BW is an arbitrary distance bandwidth; and x_n^new is a neighboring value of x_i^new.
If the new harmony vector is better than the worst solution vector in the HM in terms of fitness value, it is included in the HM and the existing worst harmony solution vector is excluded from the HM. This process runs iteratively until the termination criteria are satisfied.
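As a concrete illustration of the improvisation-and-update cycle just described, the following Python sketch implements the standard continuous HS loop; the symmetric adjustment by ±rand()·BW and all names are illustrative assumptions rather than the authors' exact formulation.

```python
import random

def harmony_search(f, lb, ub, hms=10, hmcr=0.9, par=0.3, bw=0.01, iters=10000):
    # Standard HS for minimizing f over box bounds lb/ub (lists of floats).
    dim = len(lb)
    hm = [[random.uniform(lb[j], ub[j]) for j in range(dim)] for _ in range(hms)]
    fit = [f(x) for x in hm]
    for _ in range(iters):
        new = []
        for j in range(dim):
            if random.random() < hmcr:                 # memory consideration
                xj = hm[random.randrange(hms)][j]
                if random.random() < par:              # pitch adjustment
                    xj += (2 * random.random() - 1) * bw
            else:                                      # random selection
                xj = random.uniform(lb[j], ub[j])
            new.append(min(max(xj, lb[j]), ub[j]))
        fn = f(new)
        worst = max(range(hms), key=fit.__getitem__)
        if fn < fit[worst]:                            # replace worst harmony
            hm[worst], fit[worst] = new, fn
    return hm[min(range(hms), key=fit.__getitem__)]
```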
where x_{i,j} ∈ {0,1} is the j-th element of the i-th harmony memory vector. Like the standard HS, MBHS uses three updating operators, namely the harmony memory consideration operator, the pitch adjustment operator, and random selection, to generate new solutions.
$$RSO = \begin{cases} 1, & r_2 \le 0.5 \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$
where x_j is the j-th bit of the new harmony solution vector; r_1 and r_2 are two independent random numbers between 0 and 1.
If an element of the new harmony comes from the HM, it needs to be adjusted by the pitch adjustment operator (PAO) with probability PAR. However, in binary space, the value of each element in the HM is bound to be “0” or “1”, so the standard definition of the PAO in HS degrades to a mutation operation [12]. If we simply abandon the PAO, the algorithm lacks an operator to perform local search. To remedy this, the pitch adjustment operator of Eq. (5) is used in MBHS.
$$x_j = \begin{cases} B_j, & r \le PAR \\ x_j, & \text{otherwise} \end{cases} \qquad (5)$$
where r is a random number and B_j is the j-th element of the best harmony solution vector in the HM. The PAO executes a local search based on the current solution and the best solution, which helps MBHS find the global optima effectively and efficiently.
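A minimal Python sketch of this solution-generation step, combining harmony memory consideration with the random selection operator of Eq. (4) and the pitch adjustment operator of Eq. (5), might look as follows; using r1 ≤ HMCR for memory consideration is the standard HS convention, assumed here.

```python
import random

def new_binary_harmony(hm, best, hmcr, par):
    # hm: list of bit vectors (the HM); best: best harmony in the HM.
    x = []
    for j in range(len(best)):
        if random.random() <= hmcr:                  # bit j taken from the HM
            bit = hm[random.randrange(len(hm))][j]
            if random.random() <= par:               # PAO, Eq. (5)
                bit = best[j]
        else:                                        # RSO, Eq. (4)
            bit = 1 if random.random() <= 0.5 else 0
        x.append(bit)
    return x
```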
3.3 Updating of HM
The newly generated harmony vector is added to the HM. Then all the solutions in the HM are sorted according to their fitness values, and the solution with the worst fitness is removed from the HM. In multi-objective optimization, the two major goals of a Pareto-based optimizer are to pursue convergence to the Pareto-optimal set as well as to maintain diversity. To achieve this, a non-domination sorting strategy based on crowding distance is adopted to sort the HM vectors.
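The ranking step can be sketched with the two standard NSGA-II helpers below [18]; they are illustrative, not the authors' code. The update then drops the solution lying in the worst front with the smallest crowding distance.

```python
def dominates(a, b):
    # True if objective vector a Pareto-dominates b (minimization).
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def crowding_distance(objs):
    # Crowding distance of each solution within one non-dominated front.
    n, m = len(objs), len(objs[0])
    dist = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: objs[i][k])
        dist[order[0]] = dist[order[-1]] = float('inf')  # keep extremes
        span = objs[order[-1]][k] - objs[order[0]][k] or 1.0
        for r in range(1, n - 1):
            dist[order[r]] += (objs[order[r + 1]][k] - objs[order[r - 1]][k]) / span
    return dist
```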
In this work, the convergence metric γ and the diversity metric Δ proposed in [18]
are adopted to evaluate the performance.
$$\gamma = \frac{\sum_{i=1}^{|H|} d_i}{|H|} \qquad (7)$$

$$\Delta = \frac{d_f + d_l + \sum_{i=1}^{HMS-1} |d_i - \bar{d}|}{d_f + d_l + (HMS - 1)\,\bar{d}} \qquad (8)$$
where d_i is the distance between two successive solutions in the obtained Pareto-optimal set; d̄ is the mean value of all the d_i; and d_f and d_l are the two Euclidean distances between the extreme solutions and the boundary solutions of the obtained non-dominated set.
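In Python, the two metrics of [18] can be sketched as follows; for γ the d_i are distances to a densely sampled true Pareto front, while Δ uses the successive gaps d_i defined above. The function names and the bi-objective sorting are illustrative assumptions.

```python
import numpy as np

def gamma_metric(front, true_front):
    # Convergence: mean distance from each obtained solution to the
    # nearest point of the (sampled) true Pareto front.
    f, t = np.asarray(front, float), np.asarray(true_front, float)
    return float(np.mean([np.linalg.norm(t - p, axis=1).min() for p in f]))

def delta_metric(front, extremes):
    # Diversity for a bi-objective front; 'extremes' are the two true
    # boundary solutions of the Pareto-optimal set.
    f = np.asarray(sorted(front, key=lambda p: p[0]), float)
    d = np.linalg.norm(f[1:] - f[:-1], axis=1)       # successive gaps d_i
    df = np.linalg.norm(f[0] - np.asarray(extremes[0], float))
    dl = np.linalg.norm(f[-1] - np.asarray(extremes[1], float))
    dbar = d.mean()
    return float((df + dl + np.abs(d - dbar).sum()) / (df + dl + len(d) * dbar))
```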
For MBHS, a reasonable set of parameter values is adopted, i.e., HMCR = 0.9 and PAR = 0.03; each decision variable is coded with 30 bits. For comparison, NSGA-II [18] with its default parameters is used to solve these problems as well. MBHS and NSGA-II were both run for 50,000 function evaluations. Tables 1 and 2 list the optimization results of MBHS and NSGA-II, and box plots of γ and Δ are given in Fig. 1 and Fig. 2.
According to the results in Tables 1 and 2, it is reasonable to claim that the proposed MBHS is superior to NSGA-II. Fig. 1 indicates that MBHS generally achieved solutions of higher quality than NSGA-II in terms of the convergence metric. In Fig. 2, the comparison of the diversity metric indicates that MBHS is able to find a better spread of solutions and clearly outperforms NSGA-II on all problems.
(Box plots for the FON, DEB1, DEB2, SCH1, and SCH2 problems, comparing the convergence metric of MBHS and NSGA-II.)
Fig. 1. Box plots of the convergence metric γ obtained by MBHS and NSGA-II
(Box plots for the FON, DEB1, DEB2, SCH1, and SCH2 problems, comparing the diversity metric of MBHS and NSGA-II.)
Fig. 2. Box plots of the diversity metric Δ obtained by MBHS and NSGA-II
Table 1. Mean and variance of the convergence metric γ obtained by MBHS and NSGA-II

         MBHS                               NSGA-II
         Mean            Variance           Mean            Variance
FON      1.9534481E-003  2.5725898E-003     1.9009196E-003  1.9787263E-004
SCH1     9.7508949E-004  5.9029912E-005     9.7769396E-004  6.9622480E-005
SCH2     7.3687049E-004  5.6615711E-005     7.4402367E-004  5.2879053E-005
DEB1     1.0286786E-003  5.8010990E-005     1.0697121E-003  6.6791139E-005
DEB2     8.3743810E-003  1.5211841E-002     9.8603419E-002  1.0217030E-001
Table 2. Mean and variance of the diversity metric Δ obtained by MBHS and NSGA-II

         MBHS                               NSGA-II
         Mean            Variance           Mean            Variance
FON      9.6845154E-002  6.2345711E-002     7.8416829E-001  2.9294262E-002
5 Conclusion
This paper presented a new multi-objective binary harmony search algorithm for tackling multi-objective optimization problems in binary space. A modified pitch adjustment operator is used to perform local search and improve the search ability of the algorithm. In addition, non-dominated sorting based on crowding distance is adopted to evaluate the solutions and update the HM, which ensures better diversity as well as convergence of MBHS. Finally, the performance of the proposed MBHS was compared with NSGA-II on five well-known multi-objective benchmark functions. The experimental results show that MBHS outperforms NSGA-II in terms of both the convergence metric and the diversity metric.
Acknowledgments
This work is supported by Research Fund for the Doctoral Program of Higher Education
of China (20103108120008), the Projects of Shanghai Science and Technology
Community (10ZR1411800 & 08160512100), ChenGuang Plan (2008CG48),
Mechatronics Engineering Innovation Group project from Shanghai Education
Commission, Shanghai University “11th Five-Year Plan” 211 Construction Project and
the Graduate Innovation Fund of Shanghai University.
References
1. Geem, Z., Kim, J., Loganathan, G.V.: A new heuristic optimization algorithm: harmony search. Simulation 76, 60–68 (2001)
2. Pan, Q., Suganthan, P., Tasgetiren, M., Liang, J.: A self-adaptive global best harmony
search algorithm for continuous optimization problems. Applied Mathematics and
Computation 216, 830–848 (2010)
3. Wang, C., Huang, Y.: Self-adaptive harmony search algorithm for optimization. Expert
Systems with Applications 37, 2826–2837 (2010)
4. Li, H., Li, L.: A novel hybrid real-valued genetic algorithm for optimization problems. In:
International Conference on Computational Intelligence and Security, pp. 91–95 (2008)
5. Omran, M., Mahdavi, M.: Global-best harmony search. Applied Mathematics and
Computation 198, 643–656 (2008)
6. Li, L., Huang, Z., Liu, F., Wu, Q.: A heuristic particle swarm optimizer for optimization of
pin connected structures. Computers & Structures 85, 340–349 (2007)
7. Geem, Z.: Particle-swarm harmony search for water network design. Engineering
Optimization 41, 297–311 (2009)
8. Jang, W., Kang, H., Lee, B.: Hybrid simplex-harmony search method for optimization
problems. In: IEEE Congress on Evolutionary Computation, pp. 4157–4164 (2008)
9. Wang, X., Gao, X.Z., Ovaska, S.J.: A hybrid optimization method for fuzzy classification
systems. In: 8th International Conference on Hybrid Intelligent Systems, pp. 264–271
(2008)
10. Geem, Z.: Harmony search in water pump switching problem. In: Wang, L., Chen, K., Ong, Y.S. (eds.) ICNC 2005. LNCS, vol. 3612, pp. 751–760. Springer, Heidelberg (2005)
11. Greblicki, J., Kotowski, J.: Analysis of the Properties of the Harmony Search Algorithm
Carried Out on the One Dimensional Binary Knapsack Problem. In: Moreno-Díaz, R.,
Pichler, F., Quesada-Arencibia, A. (eds.) EUROCAST 2009. LNCS, vol. 5717, pp. 697–
704. Springer, Heidelberg (2009)
12. Wang, L., Xu, Y., Mao, Y., Fei, M.: A Discrete Harmony Search Algorithm.
Communications in Computer and Information Science 98, 37–43 (2010)
13. Geem, Z., Hwangbo, H.: Application of harmony search to multi-objective optimization
for satellite heat pipe design. Citeseer, pp. 1–3 (2006)
14. Geem, Z.: Multiobjective Optimization of Time Cost Trade off Using Harmony Search.
Journal of Construction Engineering and Management 136, 711–716 (2010)
15. Schaffer, J.: Multiple objective optimization with vector evaluated genetic algorithms. In:
Proceedings of the 1st International Conference on Genetic Algorithms, pp. 93–100 (1985)
16. Fonseca, C., Fleming, P.: Multiobjective optimization and multiple constraint handling
with evolutionary algorithms. II. Application example. IEEE Transactions on Systems,
Man and Cybernetics, Part A: Systems and Humans 28, 38–47 (2002)
17. Deb, K.: Multi-objective genetic algorithms: Problem difficulties and construction of test
problems. Evolutionary Computation 7, 205–230 (1999)
18. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multi-objective genetic
algorithm: NSGA-II. IEEE Trans. on Evolutionary Computation 6, 182–197 (2002)
A Self-organized Approach to Collaborative Handling of
Multi-robot Systems
1 Introduction
Collaborative handling, as one of the tasks of multi-robot systems, plays an important role in research on the collaborative control of complex systems. It begins with the research on ‘two industrial robots handling a single object’ of Zheng, Y.F. and J.Y.S. Luh [1], continues in the work of Y. Kume [2] on ‘multiple robots’, and reaches maturity in recent work on motion-planning methods for multiple mobile robots in three-dimensional environments (see, for example, [3]). In the early stage of research, most of the classic approaches to collaborative handling were centralized control, which may be effective only when the number of robots is limited within a certain range [4][5][6]. Decentralized control is an effective method by which each robot is controlled by its own controller without explicit communication among robots; the method usually employs the leader-following relational mode by assigning a leader that obtains the motion information of the object [2][7]. However, it may not be the best choice because of the explicit relational mode and the bottleneck in communication and computation at the leader robot. A self-organized approach is
a good train of thought for the collaborative handling of multi-robot systems, even swarm systems [8][9][10]. The main objective of this paper is to initiate a study of a self-organized approach to the multi-robot collaborative handling problem. For representing individual movement, an autonomous motion planning graph (AMP-graph) is described. An individual autonomous motion rule (IAM-rule) including two kinds of preferences, “free-loose” and “well-distributed load-bearing”, is presented. By establishing this simple and effective individual rule model, an ideal handling formation can be formed by each robot moving autonomously under its respective preference. The simulations show that both the AMP-graph and the IAM-rule are valid and feasible.
Considering the many uncertain factors in the handling process, before continuing any further we make three necessary assumptions. First, the handling process happens in an ideal plane. Second, the rim of the object provides solid handling points that hardly produce deformation. Lastly, the handling robots have strong bearing capacity and do not sideslip or deflect during the handling process. Based on these assumptions, a self-organized approach will be designed.
(Diagram of robot Ri with its two neighbors Rpi and Rqi and the target T0, showing the constraint lines cl_ti, cl_pi, cl_qi, the velocities v_ei, v_tdi, v_tpi, and the angles θ_ei, θ_ti, θ_pi, θ_qi, θ_tpi.)
Fig. 1. The AMP-graph
As given in Definition 2, the desired linear velocity v_ei is the vector sum of v_tdi in the direction of cl_ti and v_tpi in the direction perpendicular to cl_ti. For the sake of simplicity, only the target T0 and the two neighbors Rpi, Rqi of the ith robot Ri are taken into account in the IAM-rule based on the “free-loose” preference. It is ensured that v_tpi points to the “free-loose” space while v_tdi always points in the direction of cl_0i, and two constraint conditions v_tdi = f_d(|cl_ti|, |cl_pi|, |cl_qi|) and v_tpi = f_p(|cl_pi|, |cl_qi|) are satisfied, where f_d and f_p are the vertical and horizontal potential functions. In the process of moving to the target, the rule makes all the robots coordinate into the ideal formation; all the robots tend to scatter from each other and to gather relative to the target, so we call it the IAM-rule based on the “free-loose” preference.
The “free-loose” space modeling. Let us consider the robot Ri in relative coordinates in which the Y-axis always points to the target T0. The first and fourth quadrants are defined as the positive quadrants, since θ_pti and θ_qti are positive within them, and the second and third quadrants are defined as the negative quadrants, since θ_pti and θ_qti are negative within them. Then:
$$\theta_{tpi}^{l} = \begin{cases} \theta_{ti}, & |C_{li}| \le \varepsilon \\ \theta_{ti} + (-1)^{\frac{\operatorname{sgn}(\theta_{pti}) + \operatorname{sgn}(\theta_{qti})}{2}} \cdot \dfrac{\pi}{2} \cdot \operatorname{sgn}(C_{li}), & |C_{li}| > \varepsilon \end{cases} \qquad (3.2)$$
where ε is a permissible error. Because C_li covers all of the information needed to determine the autonomous motion of Ri, we call it the interconnected characteristics parameter with the “free-loose” feature. C_li denotes the vector sum of the X-axis components of the two interconnected direction lines cl_pi and cl_qi if the robot Ri has not reached the edge of the target, or the vector sum of cl_pi and cl_qi if it has. Similarly, because θ_tpi^l covers all the possible directions of the “free-loose” space of Ri, we call it the autonomous motion direction angle with the “free-loose” feature. Specially, the desired linear velocity v_ei points in the direction of θ_ti if the “free-loose” space does not exist, that is, for a given ε, θ_tpi^l = θ_ti if |C_li| ≤ ε.
We know that the arrow of the Y-axis represents the direction in which all the robots tend to gather relative to the target, and the arrow of the X-axis represents the direction in which all the robots tend to scatter from each other on the edge of the target. Therefore, the desired angle θ_ei at every moment of autonomous motion with the IAM-rule based on the “free-loose” preference can be obtained as follows:
$$\begin{cases} \theta_{ei} = \theta_{ti}, & |C_{li}| \le \varepsilon \\ \theta_{ei} = \theta_{ti} + \arctan(v_{tpi} / v_{tdi}), & |C_{li}| > \varepsilon \text{ and } |cl_{ti}| \ne 0 \\ \theta_{ei} = \theta_{tpi}^{l}, & |C_{li}| > \varepsilon \text{ and } |cl_{ti}| = 0 \\ \theta_{ei}^{*} = \theta_{ti}, & |C_{li}| \le \varepsilon \text{ and } |cl_{ti}| = 0 \end{cases} \qquad (3.3)$$
Eq. (3.3) describes every process of multi-robot self-organized handling. According to Definition 2, the desired angle θ_ei is the deflection angle between v_ei and x_a if the two interconnected constraint lines exist and the robot has not reached the edge of the target, that is, |cl_ti| ≠ 0 and |C_li| > ε. Specially, when the two interconnected constraint lines do not exist, that is, |C_li| = 0, the desired angle θ_ei coincides with the target constraint angle θ_ti. When the robot reaches the edge of the target and the interconnected constraint lines exist, that is, |cl_ti| = 0 and |C_li| > ε, the desired angle θ_ei coincides with θ_tpi^l. When the robot reaches the edge of the target and the interconnection is negligible, that is, |cl_ti| = 0 and |C_li| ≤ ε, the robot obtains a stable desired angle θ*_ei coinciding with θ_ti.
Now, we turn to the second motion process of multi-robot self-organized handling. After a uniformly dispersed formation is formed by autonomous motion with the IAM-rule based on the “free-loose” preference, that is, |cl_ti| = 0 and |C_li| ≤ ε, all the handling robots smoothly lift the object together to measure the load-bearing data, which are used as the parameters of the IAM-rule based on the “well-distributed load-bearing” preference. Similar to the “free-loose” preference, only the load-bearings of the two nearest neighbors on the left and right sides of Ri are taken into account. It is ensured that Ri always moves along the edge of the object and in the direction of the neighbor with the larger load-bearing; the IAM-rule thus makes the load-bearings of all the robots tend towards the average, so we call it the IAM-rule based on the “well-distributed load-bearing” preference.
The “well-distributed load-bearing” space modeling. Similar to the “free-loose” preference, consider the robot Ri in relative coordinates in which the Y-axis always points to the target T0; the first and fourth quadrants are defined as the positive quadrants, since θ_pti and θ_qti are positive in them, and the second and third quadrants are defined as the negative quadrants, since θ_pti and θ_qti are negative in them. Then the “well-distributed load-bearing” space can be described as follows: the direction of the “well-distributed load-bearing” space points to the direction of the space to which the neighbor with the larger load-bearing belongs. Corresponding to the “free-loose” preference model, this description can be expressed mathematically as follows:
$$C_{bi} = G_{pi}\operatorname{sgn}(\theta_{pdi}) + G_{qi}\operatorname{sgn}(\theta_{qdi}) \qquad (3.4)$$
$$\begin{cases} \theta_{ei} = \theta_{tpi}^{b} = \theta_{ti}, & |C_{bi}| \le \varepsilon \\ \theta_{ei} = \theta_{tpi}^{b} = \theta_{ti} + \dfrac{\pi}{2}\operatorname{sgn}(C_{bi}), & |C_{bi}| > \varepsilon \\ G_{ei}^{*} = G_{0}/n, & \text{all } |C_{bi}| \le \varepsilon \end{cases} \qquad (3.5)$$
where G_pi and G_qi are the load-bearings of the two nearest neighbors on the left and right sides of Ri. Because C_bi covers all of the information needed to determine the direction
Remark 1. The effective sensing range, denoted by Rs, is the maximum range within which the omni-directional sensor of each handling robot can detect a target. If the minimum distance between the robot and the object is beyond the effective sensing range Rs, the robot follows a given point T0 = (x_t0, y_t0) within the object; otherwise, the robot follows the point T0i = (x_t0i, y_t0i) located nearest to it on the edge of the object.
Remark 2. By setting the parameters of the potential field function, the IAM-rule can maintain collision avoidance between any two robots. When the distance between two robots becomes smaller, the potential field function makes the interconnected deflection angle increase rapidly, producing a greater repulsive interaction. Specially, when no “free-loose” space exists in any direction, the robot is forced to remain stationary and wait for a chance for autonomous motion.
Remark 3. The effective interconnected radius δ is the maximum distance within which the interaction between any two robots Rpi, Rqi exists; that is, for a given δ_i, |cl_pq| is retained if |cl_pq| ≤ δ_i, and |cl_pq| = 0 if |cl_pq| > δ_i, p ≠ q ∈ {1, 2, …, n}.
Table: Parameter settings used in the simulation

Parameter   r     ¤     Rs    G     O     H
Value       0.2   0.1   8     4     0.3   0.5
Table: Initial positions of the eight handling robots and the target T0

     R1     R2     R3     R4     R5     R6     R7     R8     T0
X    0      -2.2   1.1    -0.2   3.9    2.6    -0.3   -4.8   -4.0
Y    -0.8   -4.6   -2.8   -1.3   0.2    -1.6   -7.2   -4.9   1.0
Fig. 2. The moving process of 8 handling robots with IAM-rule (36 steps)
From Fig. 2, we observe that after 36 steps all the handling robots are distributed uniformly around the edge of the target, so the IAM-rule based on the “free-loose” preference can effectively make multi-robot systems form the ideal handling formation, corresponding to formation control [12][13][14]. In the initial period, R7 follows the known point T0 within the object, since the object, from whose initial position R7 is farther away, cannot be perceived; this coincides with Remark 1. Due to the small distance between R1 and R4 in the initial period, R1 and R4 obtain two larger desired deflection angles θ_et1 and θ_et4, which coincides with Remark 2. In addition, although the robots R2, R7, and R8 are neighbors of each other, the interactions between them are negligible in the initial period because of the greater distances between them, and the re-establishment of the interaction then makes their trajectories deflect during the autonomous motion, which coincides with Remark 3. It is to be noted that, because each robot always tends to move in the direction of the “free-loose” space, the robots on the periphery of the group possess more dispersity and pull the ones within the group, via the “free-loose” space, to spread gradually to the periphery; thus the relatively dispersed characteristics of the group are finally formed. If each robot satisfies the local collision avoidance conditions under the special circumstances of Remark 2, we may call it “strict collision avoidance”.
References
1. Kim, K.I., Zheng, Y.F.: Two Strategies of Position and Force Control for Two Industrial
Robots Handling a Single Object. Robotics and Autonomous Systems 5, 395–403 (1989)
2. Kosuge, K., Oosumi, T.: Decentralized Control of Multiple Robots Handling an Object. In:
IEEE/ RJS Int.Conf. on Intelligent Robots and Systems, vol. 1, pp. 318–323 (1996)
3. Yamashita, A., Arai, T., et al.: Motion Planning of Multiple Mobile Robots for Coopera-
tive Manipulation and Transportation. IEEE Transactions on Robotics and Automa-
tion 19(2) (2003)
4. Koga, M., Kosuge, K., Furuta, K., Nosaki, K.: Coordinated Motion Control of Robot Arms
Based on the Virtual International Model. IEEE Transactions on Robotics and Autono-
mous Systems 8 (1992)
5. Wang, Z., Nakano, E., Matsukawa, T.: Cooperating Multiple Behavior-Based Robots for
Object Manipulation. In: IEEE /RSJ/GI International Conference on Intelligent Robots and
Systems IROS 1994, vol. 3, pp. 1524–1531 (1994)
6. Huang, T.-y., Wang, X.-n., Chen, X.-b.: Multirobot Time-optimal Handling Method Based
on Formation Control. Journal of System Simulation 22, 1442–1465 (2010)
7. Kosuge, K., Taguchi, D., Fukuda, T., Sakai, M., Kanitani, K.: Decentralized Coordinated
Motion Control of Manipulators with Vision and Force Sensors. In: Proc. of 1995 IEEE
Int. Conf. on Robotics and Automation, vol. 3, pp. 2456–2462 (1995)
8. Jadbabaie, A., Lin, J., Morse, A.S.: Coordination of Groups of Mobile Autonomous
Agents Using Nearest Neighbor Rules. IEEE Transactions on Automatic Control 48,
988–1001 (2003)
9. Turgut, A.E., Çelikkanat, H., Gökçe, F., Şahin, E.: Self-organized Flocking in Mobile
Robot Swarms. Swarm Intelligence 2, 97–120 (2008)
10. Gregoire, G., Tu, H.C.Y.: Moving and Staying Together Without a Leader. Physica D 181,
157–170 (2003)
11. Xu, W.B., Chen, X.B.: Artificial Moment Method for Swarm Robot Formation Control.
Science in China Series F: Information Sciences 51(10), 1521–1531 (2008)
12. Balch, T., Arkin, R.C.: Behavior-based Formation Control for Multi-robot Teams. IEEE
Transactions on Robotics and Automation 14, 926–939 (1998)
13. Lawton, J.R., Beard, R.W., Young, B.J.: A Decentralized Approach to Formation Maneu-
vers. IEEE Transactions on Robotics and Automation 19, 933–941 (2003)
14. Das, A.K., Fierro, R., et al.: A vision-based formation control framework. IEEE Transac-
tions on Robotics and Automation 18, 813–825 (2002)
An Enhanced Formation of Multi-robot Based on A*
Algorithm for Data Relay Transmission
Zhiguang Xu1, Kyung-Sik Choi1, Yoon-Gu Kim2, Jinung An2, and Suk-Gyu Lee1
1 Department of Electrical Eng., Yeungnam Univ., Gyongsan, Gyongbuk, Korea
2 Daegu Gyeongbuk Institute of Science & Technology, Daegu, Korea
[email protected], {robotics,sglee}@ynu.ac.kr,
{ryankim9,robot}@dgist.ac.kr
1 Introduction
In mobile robotics, robots execute their own tasks in unknown environments through navigation, path planning, communication, etc. Recently, researchers have focused on navigation in multi-robot systems to deal with cooperation [1], efficient path planning [2][3], keeping navigation stable [4], and collision avoidance [5]. They have obtained respectable results through simulations and some experiments.
Path planning algorithms such as the Genetic Algorithm, the Ant Colony System, the A* algorithm, and Neural Networks [6]-[9] are favored by researchers. The neural network algorithm of [9] implements path planning for multiple mobile robots that coordinate with each other while avoiding moving obstacles. The A* algorithm, as a graph search algorithm, provides the fastest search for the shortest path under a given heuristic. In [3], the A* algorithm utilizes a heuristic function to accelerate searching and reduce computational time.
In a multi-robot system, the robots are not only required to avoid obstacles, but also need to avoid collisions with each other. To solve this problem, [11] adopted a reactive multi-agent solution with decision agents and obstacle agents on a linear configuration. The avoidance decision strategy is acquired from timely observations of the decision agents' organization and from calculating the trajectories interacting with other decision agents and obstacle agents. [5] developed a step-forward approach for collision avoidance in multiple robot systems. They built
2 Related Works
In multiple robot systems, there are three main control approaches: the leader-follower based approach [12], the behavior-based approach [13], and the virtual structure approach [14]. Several control algorithms, such as the EKF [15], I/O linearization [16], and the sliding mode control method [17], are commonly used to control each robot. In practice, the processor is heavily loaded if we use the above control approaches and control methods for multi-robot formation. However, the MCU of our system is an AVR, so it is difficult to commercialize and industrialize the system this way. In our robot system, each robot just utilizes on-board sensors for localization, but redundant sensor data would place a great burden on the controller. Consequently, we adopt a more realistic practical approach to achieve our control goal, which implants the path planning algorithm in each robot so as to reduce the computational burden and satisfy our control system requirements.
The system structure for the homogeneous robots is shown in Fig. 1. There are two robots in the real experiment: one leader robot (LR) and one follower robot (FR). The FR has two missions. One is to maintain the given distance to the LR; the other is to determine the ideal temporary target based on the A* algorithm when the LR changes its moving direction. Generally, a mobile robot measures distance by motor encoders or some kind of distance measurement sensor while the robots explore freely in an experimental environment.
We consider a new approach to team robot navigation based on a wireless RF module. The wireless RF module used is a Nanotron sensor node, Ubi-nanoLOC, developed by HANBACK Electronics [18]. The WPAN module is based on the IEEE 802.15.4a protocol for high aggregate throughput communications with a precision ranging capability. Since the distance measured by the wireless module may include considerable error depending on the ambient environment, the system adopts a Kalman filter to reduce the localization error. The LR knows the whole information of the experimental environment and communicates with the other robot by an ad-hoc routing application among multiple wireless communication modules.
3 Algorithm Description
3.1 State Function and System Flow Chart
The motion of each robot is described in terms of P = (x, y, θ)T, where x, y, and θ are the x coordinate, the y coordinate, and the bearing, respectively. The trajectory of each robot has the form (x, y) with velocity v and angular velocity ω, and the model of robot Ri takes the standard kinematic form.
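The model equation itself is not reproduced here, but the state P = (x, y, θ) together with the pair (v, ω) suggests the standard unicycle kinematics, sketched below under that assumption.

```python
import math

def step(x, y, theta, v, omega, dt):
    # One Euler integration step of the assumed unicycle model:
    # x' = v cos(theta), y' = v sin(theta), theta' = omega.
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += omega * dt
    return x, y, theta
```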
Fig. 2 shows the flow chart of the LR process; the LR knows the whole information of the environment. If the LR has not reached its destination, it sends moving information (MI) to the rear robots at each time step, such as the moving distance, heading angle, and node information. When the LR arrives at a corner, it turns 90 degrees and regards its next-step position as a node.
Fig. 3 describes the flow chart of the FR maintaining a given distance range. To reduce the steps from the start point to the goal point and to maintain communication with the LR, the FRs use the A* algorithm to plan their paths, making use of the information nodes received from the LR.
For the FRs, each node received from the LR serves as a target. As the LR moves through the environment, more than one node is generated, so the FRs must reach every node as a target. However, if two nodes are very close, the FRs use the A* algorithm to obtain a shortest path and eliminate useless nodes, increasing the efficiency of navigation.
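A minimal grid-based A* of the kind the FRs could run between received nodes is sketched below; the 4-connected grid and Manhattan heuristic are illustrative assumptions, since the paper does not specify the exact search space.

```python
import heapq

def a_star(grid, start, goal):
    # grid[r][c]: 0 = free, 1 = obstacle; start/goal: (row, col) tuples.
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(h(start), 0, start, None)]          # (f, g, cell, parent)
    parent, g = {}, {start: 0}
    while frontier:
        _, gc, cur, par = heapq.heappop(frontier)
        if cur in parent:
            continue                                 # already expanded
        parent[cur] = par
        if cur == goal:                              # rebuild the path
            path = [cur]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0 and gc + 1 < g.get(nxt, 1 << 30)):
                g[nxt] = gc + 1
                heapq.heappush(frontier, (gc + 1 + h(nxt), gc + 1, nxt, cur))
    return None                                      # no path exists
```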
4.1 Simulations
Fig. 5. Trajectories of one LR and three FRs, (a) without A* algorithm, (b) using A* algorithm
Fig. 6. Step comparison histogram of one leader robot with different numbers of follower robots, with and without the A* algorithm
4.2 Experiments
In the experiment, we embed whole map information to LR, such as the distance
between target and corner information. FR navigates autonomously and follows the LR
within a given distance to keep required form. When LR arrive at the corner, it will
send the corner information to FR for executing A* algorithm.
Each robot in the team executes localization using motor encoder and Nanotron
sensor data based on Kalman Filter. From some paper, the researchers obtain θ
calculated from the relationship between motor encoder and the length of robot’s two
wheels. However, heading angle data from compass sensor is more accurate than the
data calculated from encoder, we get θ value from XG1010 and let robot go straight
using XG1010. Robots are in indoor environment and there is just one corner. The
initial distance between LR and FR is 3 meter. When LR robot move 3 meter, it will
turn left and send the node information to the rear FR via the Nanotron sensor. At this time, the FR plans an optimal path to the temporary target based on the A* algorithm to keep the required formation with the LR. We record each robot's position and orientation values at each step. When the robots go straight, the measured error of the real trajectory in the x axis and y axis is less than 1 centimeter. We then use Matlab to draw the trajectories of each robot, as shown in Fig. 7.
5 Conclusion
In a multiple mobile robot system, it is important to share the moving information of every robot to increase the efficiency of cooperation. The FRs move to their nodes (as targets) with the A* path planning algorithm, using the information nodes received from the LR. The proposed method obtains the respectable results we aimed for: the number of steps taken by FRs using the A* path planning algorithm is much smaller than that of FRs without the A* algorithm. The simulation and experimental results show that robots with the embedded A* algorithm obtain better performance in terms of efficiency.
For future research, we are going to realize this multi-robot formation control with a larger number of robots, and we will consider more complex environments, such as those containing obstacles.
Acknowledgment
This research was carried out under the General R/D Program of the Daegu
Gyeongbuk Institute of Science and Technology (DGIST), funded by the Ministry of
Education, Science and Technology (MEST) of the Republic of Korea.
References
1. Farinelli, A., Iocchi, L., Nardi, D.: Multi-robot Systems: A Classification Focused on
Coordination. IEEE Transactions on Systems, Man, and Cybernetics, Part-B:
Cybernetics 34(5), 2015–2028 (2004)
2. Wang, K.H.C., Botea, A.: Tractable Multi-Agent Path Planning on Grid Maps. In: Int.
Joint Conf. on Artificial Intelligence, pp. 1870–1875 (2009)
3. Seo, W.J., Ok, W.J., Ahn, J.H., Kang, S., Moon, B.: An Efficient Hardware Architecture of the A-star Algorithm for the Shortest Path Search Engine. In: Fifth Int. Joint Conf. INC, IMS and IDC, pp. 1499–1502 (2009)
4. Scrapper, C., Madhavan, R., Balakirsky, S.: Stable Navigation Solutions for Robots in
Complex Environments. In: Proc. IEEE Int. Workshop on Safety, Security and Rescue
Robotics (2007)
5. Cai, C., Yang, C., Zhu, Q., Liang, Y.: Collision Avoidance in Multi-Robot Systems. In:
Proc. IEEE Int. Conf. on Mechatronics and Automation, pp. 2795–2800 (2007)
6. Castillo, O., Trujillo, L., Melin, P.: Multiple objective optimization genetic algorithms for
path planning in autonomous mobile robots. Int. Journal of Computers, Systems and
Signals 6(1), 48–63 (2005)
7. Li, W., Zhang, W.: Path Planning of UAVs Swarm using Ant Colony System. In: Fifth Int.
Conf. on Natural Computation, vol. 5, pp. 288–292 (2009)
8. Yao, J., Lin, C., Xie, X., Wang, A.J., Hung, C.C.: Path planning for virtual human motion
using improved a star algorithm. In: Seventh Int. Conf. on Information Technology, pp.
1154–1158 (2010)
9. Li, H., Yang, S.X., Biletskiy, Y.: Neural Network Based Path Planning for A Multi-Robot
System with Moving Obstacles. In: Fourth IEEE Conf. on Automation Science and
Engineering (2008)
10. Otte, M.W., Richardson, S.G., Mulligan, J., Grudic, G.: Local Path Planning in Image
Space for Autonomous Robot Navigation in Unstructured Environments. Technical Report
CU-CS-1030-07, University of Colorado at Boulder (2007)
11. Sibo, Y., Gechter, F., Koukam, A.: Application of Reactive Multi-agent System to Vehicle
Collision Avoidance. In: Twentieth IEEE Int. Conf. on Tools with Artificial Intelligence,
pp. 197–204 (2008)
12. Consolini, L., Morbidi, F., Prattichizzo, D., Tosques, D.: A Geometric Characterization of
Leader-Follower Formation Control. In: IEEE International Conf. on Robotics and
Automation, pp. 2397–2402 (2007)
13. Balch, T., Arkin, R.C.: Behavior-based Formation Control for Multi-robot Teams. IEEE
Trans. on Robotics and Automation 14, 926–939 (1998)
14. Lalish, E., Morgansen, K.A., Tsukamaki, T.: Formation Tracking Control using Virtual
Structures and Deconfliction. In: Proc. of the 2006 IEEE Conf. on Decision and Control
(2006)
15. Schneider, F.E., Wildermuth, D.: Using an Extended Kalman Filter for Relative
Localisation in a Moving Robot Formation. In: Fourth Int. Workshop on Robot Motion
and Control, pp. 85–90 (2004)
16. Desai, J.P., Ostrowski, J., Kumar, R.V.: Modeling formation of multiple mobile robots. In:
Proc. of the 1998 IEEE Int. Conf. on Robotics and Automation, Leuven, Belgium (1998)
17. Sánchez, J., Fierro, R.: Sliding Mode Control for Robot Formations. In: Proc. of the 2003
IEEE Int. Symposium on Intelligent Control, Houston, Texas (2003)
18. Hanback Electronics, http://www.hanback.co.kr/
19. Atmel Corporation, http://www.atmel.com/
20. MicroInfinity, http://www.minfinity.com/
WPAN Communication Distance Expansion Method
Based on Multi-robot Cooperation Navigation
Abstract. Over the past decade, an increasing number of research and development efforts for personal or professional service robots have been attracting considerable attention and interest in industry and academia. Furthermore, the development of intelligent robots is strongly promoted as a strategic industry. To date, most practical and commercial service robots are controlled remotely. The most important technical issue in remote control is wireless communication, especially in indoor and unstructured environments where the communication infrastructure may be hampered. Therefore, we propose a multi-robot cooperation navigation method for securing the communication distance extension of remote control based on wireless personal area networks (WPANs). The concept and implementation of following navigation are introduced, and performance verification is carried out through navigation experiments in real or test-bed environments.
1 Introduction
In fire-fighting and disaster rescue situations, fire fighters always face unpredictable situations. The probability of unexpected accidents increases because they cannot effectively cope with such events, owing to which they experience mental and physical strain. In contrast, a robot can be put into dangerous environments because it can be controlled or navigated autonomously in a global environment. The use of robots to accomplish fire-fighting missions can reduce much of the strain experienced by fire fighters. This is the reason for the development and employment of professional robots for fire fighting and disaster prevention. Incidentally, fire sites are considered in either a local or a global environment. If robots are placed in a global setting, they have to secure reliable communication among themselves and with the central control system. Therefore, we approach the robot application from the point of view of fire fighting and disaster prevention, which require reliable communication and highly accurate distance measurement information.
The Kalman filter, a well-known algorithm widely applied in the robotics field, is based on linear minimum mean square error filtering for state estimation. The set of mathematical equations in the Kalman filter serves as a compensator and an optimal estimator for some types of noise. Therefore, it has been used for the stochastic estimation of measurements from noisy sensors. This filter can minimize the estimated error covariance when the robot is placed under the presumed conditions. For given spectral characteristics of an additive combination of signal and noise, the linear operation on these inputs yields the best result, with minimum square error separating the signal from the noise. The distinctive feature of the Kalman filter, described in its mathematical formulation in terms of state space analysis, is that its solution is computed recursively.
Park [1] approached the recognition of the position and orientation of a mobile robot using encoders and ubiquitous sensor networks (USNs). For this, the USNs consist of four fixed nodes and a mobile node. The robot is based on a fuzzy algorithm using information from the encoder and the USNs. Incidentally, this proposal has errors in the recognition of a USN when considering the exploration of each robot without fixed nodes. In addition, the noise caused by friction between the road surface and the wheels and by the control error of the motor affects the localization estimate acquired from the encoders, and the measurement errors accumulate while a robot navigates. In order to solve these problems, we propose a localization and navigation system based on the IEEE 802.15.4a protocol to measure the distance between the robots, and a compass sensor to obtain the heading angle of each robot. The IEEE 802.15.4a protocol allows for high aggregate throughput communication with a precision ranging capability. Nanotron Technologies developed their first Chirp spread spectrum (CSS) smart RF module, the smart nanoLOC RF with ranging capabilities. The proposed method is based on a modified Kalman filter, adapted in our system to improve the measurement quality of the wireless communication module, and on the compass sensor for reducing the error in the localization and navigation process.
This paper is organized as follows. Section 2 introduces related works and dis-
cusses localization approaches and the application of the IEEE 802.15.4a protocol to
our system. Section 3 presents the proposed multi-robot-based localization and navi-
gation. Section 4 explains and analyzes the experimental results. Finally, Section 5
presents the conclusion of this research and discusses future research directions.
2 Related Works
Absolute localization is based on telemetric or distance sensors and may avoid the
error accumulation of relative localization. Absolute localization is a global localiza-
tion using which it is possible to estimate the current pose of the mobile robot even if
the conditions of the initial pose are unknown and the robot is kidnapped and tele-
ported to a different location [2]. The basic principle of absolute localization is based
on probabilistic methods and the robot’s belief or Bayes’ rule. The former is a prob-
ability density function of the possible poses. The latter updates the belief according
to the information. Taking into account the problem of approximating the belief, we
can classify localization into Gaussian filter-based localization and non-parametric
filter-based localization. The extended Kalman filter (EKF) [4] and the unscented
Kalman filter (UKF) [3] are included in the former. Markov localization [5] and
Monte Carlo localization [2] are included in the latter.
EKF localization represents the state or pose of the robot as Gaussian density to es-
timate the pose using EKF. UKF localization addresses the approximation issues of
the EKF. The basic difference between the EKF and the UKF stems from the manner
in which Gaussian random variables (GRV) are represented for propagating through
system dynamics [3]. In the EKF, state distribution is approximated by GRV, which is
then propagated analytically through the first-order linearization of a nonlinear sys-
tem. This can introduce large errors in the true posterior mean and the covariance of
the transformed GRV, which may lead to sub-optimal performance and sometimes
divergence of the filter. The UKF addresses this problem by using a deterministic
sampling approach. The state distribution is also approximated by GRV. In contrast, it
is now represented using a minimal set of carefully chosen sample points. These sam-
ple points completely capture the true mean and covariance of the GRV, which are
propagated through the true nonlinear system. The EKF achieves only first-order
accuracy. Neither the explicit Jacobian nor the Hessian calculation is necessary for the
UKF. Remarkably, the computational complexity of the UKF is of the same order as
that of the EKF [3].
Markov localization approximates the posterior pose of a robot using a histogram
filter over a grid decomposition of the pose space. Hence, it is called grid localization.
Monte Carlo localization approximates the posterior pose of a robot using a particle
filter that represents the pose of the robot by a set of particles with important weight.
This non-parametric filter-based localization can resolve the global localization and
kidnap problem through multi-modal distribution.
IEEE 802.15, related to the wireless personal area network (WPAN), is the standard protocol developed by several task groups (TGs) in IEEE. In particular, IEEE 802.15.4 is the standard for low-power devices, low-cost establishment, and operation in the industrial, scientific, and medical (ISM) band. In addition, IEEE 802.15.4a provides enhanced ranging information among nodes through its adaptation of wireless communication. As a result, we decided to use this protocol for sensor networking. IEEE 802.15.4a was standardized in August 2007, based on low complexity, low cost, and low energy in a WPAN environment and on its capability to simultaneously allow for communication and distance measurement. IEEE 802.15.4a chose two PHY techniques, namely the ultra-wide band (UWB) method and the chirp spread spectrum (CSS) method, centered on Samsung and Nanotron [6,7]. UWB is a technique used for short-range communication in which signals with very short pulse widths are communicated in the baseband without a carrier. Owing to the extremely short pulse width, the occupied frequency bandwidth is wide. Therefore, it appears as mere noise in channels because of its low output power and does not affect other wireless devices. However, long distance communication is difficult because it is a baseband communication and the output voltage is low. Its frequency range is 3.4 GHz–10 GHz.
CSS was developed in the 1940s and is often likened to dolphin and bat communication. It has typically been used in radars because it has advantages such as strong interference resistance and suitability for long distance communication. After 1960, it expanded into industrial use, grafting a linear sweep onto the chirp signal to carry the significant information. CSS uses its entire allocated bandwidth to broadcast a signal, making it robust to channel noise. Moreover, even at low voltage, it is not much affected by multi-path fading. The frequency of the CSS method is in the 2.4 GHz ISM band.
3 System Architecture
4 Experimental Results
For this experiment, we placed two robots at certain distances in a linear corridor. Figure 4 shows the measured distance errors while keeping each distance interval between the FR and the LR. This experiment shows that the error in maintaining a specific interval decreases when the Kalman filter is applied to the distance measurement. The Kalman filter estimates a more correct distance from the predicted encoder distance information and the measured WPAN distance information, as summarized in equations (1)–(7). We also simulated how well the FR follows the LR through the leader-following operation, which is based on the WPAN distance measurement. Figure 5 shows the simulation results of the leader-following navigation of a follower robot: the FR follows the LR while navigating in a 10 m × 10 m area. The RF sensor data and compass sensor data contain uncertain error factors. Therefore, the objective of the proposed system is to achieve accuracy in the WPAN sensor network system by using the Kalman filter. However, the Kalman filter requires a considerable amount of data for the estimation, and even then the system cannot move perfectly when the measurement data are dispersed. To solve this problem, we have to ignore the dispersed data; therefore, the system cannot avoid some residual error. Figure 6 shows an experiment of the multi-robot cooperation navigation for valid wireless communication distance expansion.
$$\hat{x}_{k+1}^{-} = \hat{x}_k + u_k + w_k \qquad (1)$$

$$\theta_{k+1} = \theta_k + \frac{\Delta t \, v_k \tan\phi_k}{L} \qquad (3)$$

$$P_{k+1}^{-} = P_k + \sigma_{w_k}^2 \qquad (4)$$

$$K = \frac{P_{k+1}^{-}}{P_{k+1}^{-} + \sigma_{RF_{k+1}}^2} \qquad (5)$$

$$P_{k+1} = P_{k+1}^{-}(1 - K) \qquad (7)$$
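Putting Eqs. (1)–(7) together, a scalar Python sketch of the distance filter might look as follows; Eqs. (2) and (6) are not reproduced in the text, so the standard correction x̂ = x̂⁻ + K(z − x̂⁻) is assumed for Eq. (6), and all names are illustrative.

```python
def kalman_distance(z_wpan, u_enc, sigma_w2, sigma_rf2, x0=0.0, p0=1.0):
    # z_wpan: WPAN range measurements; u_enc: encoder-predicted motion per step.
    x, p, out = x0, p0, []
    for z, u in zip(z_wpan, u_enc):
        x_pred = x + u                      # Eq. (1), noise-free prediction
        p_pred = p + sigma_w2               # Eq. (4)
        K = p_pred / (p_pred + sigma_rf2)   # Eq. (5)
        x = x_pred + K * (z - x_pred)       # assumed form of Eq. (6)
        p = p_pred * (1 - K)                # Eq. (7)
        out.append(x)
    return out
```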
Fig. 4. Measured distance errors while keeping each interval between the FR and the LR
5 Conclusion
We proposed a multi-robot cooperation navigation method for securing a valid communication distance extension for remote control based on the WPAN. The concept and implementation of the LR-following navigation were introduced, and performance verification was carried out through navigation experiments in real or test-bed environments. The proposed multi-robot cooperation navigation method verified the effectiveness and reliability of securing valid wireless communication and expanding the valid communication distance for an indoor, special-purpose service robot.
Acknowledgments. This research was carried out under the General R/D Program sponsored by the Ministry of Education, Science and Technology (MEST) of the Republic of Korea, with partial financial support by the Ministry of Knowledge Economy (MKE), the Korea Institute for Advancement of Technology (KIAT), and the Daegu-Gyeongbuk Leading Industry Office through the Leading Industry Development for Economic Region.
References
1. Park, J.-J.: Position Estimation of a Mobile Robot Based on USN and Encoder and Development of Tele-operation System using the Internet. The Institute of Webcasting, Internet and Telecommunication (2009)
2. Thrun, S., Fox, D., Burgard, W., Dellaert, F.: Robust Monte Carlo Localization for Mobile Robots. Artificial Intelligence 128, 99–141 (2001)
3. Wan, E.A., van der Merwe, R.: The Unscented Kalman Filter. In: Kalman Filtering and Neural Networks, ch. 7. Wiley, Chichester (2001)
4. Welch, G., Bishop, G.: An Introduction to the Kalman Filter. Technical Report TR 95-041, University of North Carolina at Chapel Hill (July 2006)
5. Fox, D., Burgard, W., Thrun, S.: Active Markov Localization for Mobile Robots in Dynamic Environments. Journal of Artificial Intelligence Research 11, 391–427 (1999)
6. Jeon, H.S., Woo, S.H.: Adaptive Indoor Location Tracking System based on IEEE
802.15.4a. Korea Information and Communications Society 31, 526–536 (2006)
7. Lee, J.Y., Scholtz, R.A.: Ranging in a Dense Multipath Environment using an UWB Radio
Link. IEEE Journal on Selected Areas in Comm. 20(9) (2002)
8. http://www.hanback.co.kr/
9. http://www.aichi-steel.co.jp/
10. http://www.minfinity.com/
Relative State Modeling Based Distributed Receding
Horizon Formation Control of Multiple Robot Systems*
Abstract. Receding horizon control has been shown to be a good method for the multiple robot formation control problem. However, there are still two disadvantages in almost all receding horizon formation control (RHFC) algorithms. One is the huge computational burden due to the complicated nonlinear dynamical optimization, and the other is that most RHFC algorithms use absolute states directly, while relative states between two robots are more accurate and easier to measure in many applications. Thus, in this paper, a new relative state modeling based distributed RHFC algorithm is designed to solve the two problems referred to above. First, a simple strategy for modeling the dynamical process of the relative states is given; subsequently, the distributed RHFC algorithm is introduced and convergence is ensured by some extra constraints; finally, a formation control simulation with three ground robots is conducted, and the results show the improvement of the proposed algorithm in real-time capability and insensitivity to measurement noise.
1 Introduction
Formation control, in which multiple robot systems work together in a fixed geometric configuration, has been widely researched in the past decades. A great number of strategies have been introduced and have demonstrated their validity in both theory and practice, such as leader-following [1], behavior based [2], and virtual structure [3], etc.
Receding horizon control (RHC), also called model predictive control (MPC), with its abilities to handle constraints and perform optimization, has received more and more attention in the field of formation control recently. However, one of the major disadvantages of almost all existing receding horizon formation control (RHFC) algorithms is the huge computational burden due to the required online optimization. In order to solve this problem, distributed RHFC (DRHFC) seems a good method, and some research works have been published [4-9].
* This work is supported by the Chinese National Natural Science Foundation: 61005078 and 61035005.
However, there are some problems with DRHFC algorithms in most practical applications: 1) the absolute states of each individual robot are difficult for other robots to obtain, since intercommunication lacks reliability in poor environments; 2) most DRHFC algorithms use the absolute states directly, while relative states between two robots are more accurate and easier to measure in many applications [16].
The relative state model, i.e., one that determines the relative motion law between two robot systems while considering each individual model simultaneously in detail, is a new concept that originated from multiple satellite formation control [10]. Both relative kinematics models [11] and relative dynamics models [12] describe this kind of relative motion, and these relative state models have recently been applied to many distributed formation problems.
In this paper, a new DRHFC strategy is proposed that introduces the relative state model to deal with the above disadvantages. The remainder of this paper is organized as follows. First, in Section 2, the relative state model between two robot systems and the whole formation model are derived. Second, the formation strategy and distributed control law are realized in Section 3. Subsequently, in Section 4, simulation results are presented to verify the validity of the proposed algorithm. Finally, the conclusions are given in Section 5.
2 System Modeling
We consider the formation control problem of N (N ≥ 2) robot systems, where each individual robot's dynamical model can be denoted as follows,

$$\dot{x}_i^0 = f_i^0(x_i^0, u_i) \qquad (1)$$

where $x_i^0 \in \mathbb{R}^n$ (i = 1, 2, …, N) and $u_i \in \mathbb{R}^m$ are the state vector and control input vector of the i-th robot, respectively; $f_i^0(\cdot)$ are nonlinear smooth functions with predefined structure.
Generally, Eq. (1) describes the motion of the robot system in the global coordinate frame fixed to the earth [14-15]. Thus, $x_i^0$ is often called the absolute state.

Actually, for most member robots in a formation, only relative state information is necessary to keep a highly precise formation, so it is necessary to obtain the dynamical equation of the relative states between two robot systems of interest.
In this paper, we denote the relative model of robot i and robot j as follows,

$$\dot{x}_j^i = f_j^i(x_j^i, u_i, u_j) \qquad (2)$$

where $x_j^i \in \mathbb{R}^n$ is the relative state vector, with the same dimensions as the individual states $x_i$ and $x_j$; $u_i, u_j \in \mathbb{R}^m$ are the control inputs of robots i and j, respectively. Methods for modeling relative state equations can be found in [11] and [12].
In a formation control problem, suppose that every robot i has $n_i$ neighbor robots (the neighbors of the i-th robot are the robots that can exchange information with robot i), and that all the neighbors of robot i form a set $N_i$.

There are two roles in our formation architecture: $N_a$ ($N_a \le N$) leaders and $N - N_a$ followers. Leaders are robots that know their own desired state profiles, while followers are robots that have no a priori knowledge about their own desired state profiles and can only follow their neighbor robots to keep the formation. Thus, a leader robot can be modeled using the absolute state equation together with the relative state equations with respect to its neighbors, while a follower robot can be modeled purely through relative state equations with its neighbor robots. Each robot's state equation combined with its neighbors can thus be denoted as follows,
$$\begin{bmatrix} \dot{x}_i^0 \\ \vdots \\ \dot{x}_j^i \\ \vdots \end{bmatrix} = \begin{bmatrix} f_i^0(x_i^0, u_i) \\ \vdots \\ f_j^i(x_j^i, u_i, u_j) \\ \vdots \end{bmatrix} \qquad (3.a)$$

$$\begin{bmatrix} \vdots \\ \dot{x}_j^i \\ \vdots \end{bmatrix} = \begin{bmatrix} \vdots \\ f_j^i(x_j^i, u_i, u_j) \\ \vdots \end{bmatrix} \qquad (3.b)$$
where the vectors $x_i = [x_i^0 \cdots x_j^i \cdots]^T$ and $x_i = [\cdots x_j^i \cdots]^T$ denote the leader's and follower's states, respectively. For simplicity, Eq. (3.a) and Eq. (3.b) can be written uniformly as

$$\dot{x}_i = f_i(x_i, u_i, u_{-i}) \qquad (4)$$

where $u_{-i} = [\cdots u_j \cdots]^T$ collects all the neighbors' control inputs. Combining all the system states and models, the whole formation model can be expressed as

$$\dot{x} = f(x, u) \qquad (5)$$

where $x = [x_1, \cdots, x_N]^T$ is the total state of all robots, $u = [u_1, \cdots, u_N]^T$ the total control input, and $f(x, u) = [\cdots f_i(x_i, u_i, u_{-i}) \cdots]^T$ the stack of all the individual robots' models (4).
In the following, $\|x\|_P^2$ denotes $x^T P x$ for an arbitrary positive-definite real symmetric matrix P. Also, $\lambda_{\max}(P)$ and $\lambda_{\min}(P)$ denote the largest and smallest eigenvalues of P, respectively. $x_j^{ic}$, $x_i^{0c}$, $x_i^c$ and $x^c = [x_1^c, \cdots, x_N^c]^T$ are the desired states.
In general, the following cost function is used in RHFC algorithms,

$$L(x,u) = \sum_{i=1}^{N} L_i(x_i, u_i) = \sum_{i=1}^{N} \left\{ \gamma \left\| x_i^0 - x_i^{0c} \right\|_{Q_i^0}^2 + (1-\gamma)\,\frac{1}{2}\sum_{j \in N_i} \left\| x_j^i - x_j^{ic} \right\|_{Q_j^i}^2 + \left\| u_i \right\|_{R_i}^2 \right\} \qquad (6)$$
where

$$\gamma = \begin{cases} 1, & i \in \{1, \ldots, N_a\} \ \text{(robot } i \text{ is a leader)} \\ 0, & i \in \{N_a + 1, \ldots, N\} \ \text{(robot } i \text{ is a follower)} \end{cases}$$

is a binary constant distinguishing leaders from followers. The weighting matrices $Q_i^0$, $Q_j^i$ and $R_i$ are all positive definite, and $Q_j^i = Q_i^j$.
Let $Q = \mathrm{diag}(\cdots Q_i^0 \cdots Q_j^i \cdots)$ and $R = \mathrm{diag}(\cdots R_i \cdots)$; the integrated cost function can then be equivalently rewritten as

$$L(x,u) = \left\| x - x^c \right\|_Q^2 + \left\| u \right\|_R^2 \qquad (7)$$
The cost function (7) can be split into the following distributed cost function for each individual robot,

$$L_i(x_i, u_i) = \left\| x_i - x_i^c \right\|_{Q_i}^2 + \left\| u_i \right\|_{R_i}^2 = \gamma \left\| x_i^0 - x_i^{0c} \right\|_{Q_i^0}^2 + (1-\gamma)\,\frac{1}{2}\sum_{j \in N_i} \left\| x_j^i - x_j^{ic} \right\|_{Q_j^i}^2 + \left\| u_i \right\|_{R_i}^2 \qquad (8)$$
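For illustration, the distributed cost (8) of a single robot can be evaluated directly from its absolute and relative tracking errors. The following Python sketch assumes generic NumPy arrays and is not tied to any particular robot model:

import numpy as np

def weighted_sq_norm(v, P):
    # ||v||_P^2 = v^T P v
    return float(v @ P @ v)

def distributed_cost(gamma, x0, x0c, rel, relc, Q0, Qrel, u, R):
    # Eq. (8): leader term (gamma = 1) or halved sum of relative terms
    # (gamma = 0), plus the control effort.  rel/relc map each neighbor j
    # to x_j^i / x_j^ic, and Qrel maps j to the weighting matrix Q_j^i.
    cost = gamma * weighted_sq_norm(x0 - x0c, Q0)
    cost += (1 - gamma) * 0.5 * sum(
        weighted_sq_norm(rel[j] - relc[j], Qrel[j]) for j in rel)
    return cost + weighted_sq_norm(u, R)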
Then, the distributed formation control problem can be described as: design distributed controllers $u_i = k_i(x_i)$ by solving an optimal control problem with respect to the distributed cost function (8) for each individual robot i, such that the formation system (5) converges to the desired formation state $x^c$.
3.2 Algorithm
Since the cost $L_i(x_i, u_i)$ depends upon the relative states $x_j^i$, which are subject to the dynamics model (2), robot i must predict the relative trajectories according to $u_i$ and $u_{-i}$ over each prediction horizon. That means, during each update, robot i will receive assumed control trajectories $\hat{u}_{-i}(\cdot; t_k)$ from its neighbors [9]. Then, by solving the optimal control problem using model (2), the assumed relative state trajectories can be computed. Likewise, robot i should transmit an assumed control to all neighbors for their own behavior optimization. Thus, the optimal control problem for each individual robot system can be stated as
Problem 1. For every robot $i \in \{1, \ldots, N\}$ and at any update time $t_k$, given initial conditions $x_i(t_k)$ and assumed controls $\hat{u}_{-i}(\cdot; t_k)$ for all $s \in [t_k, t_k + T]$, find the optimal control trajectory $u_i^*(\cdot; t_k)$, where

$$J_i(x_i(t_k), u_i(\cdot; t_k)) = \int_{t_k}^{t_k+T} L_i(x_i(s; t_k), u_i(s; t_k))\,ds + M_i(x_i(t_k+T; t_k))$$
for $\tau \in [t_k, t_{k+1})$, and the receding horizon control law is updated whenever a new initial state update $x(t_k) \leftarrow x(t_{k+1})$ becomes available. Following the succinct presentation in [9], we state the control algorithm.
Algorithm 1. At time t0 with initial state xi(t0), the distributed receding horizon con-
troller for any robot i ∈ {1,… , N } is as follows,
Data: $x_i(t_0)$, $T \in (0, \infty)$, $\delta \in (0, T]$.
Initialization: At time $t_0$, solve Problem 1 for robot i, setting $\hat{u}_i(\tau; t_0) = 0$ and $\hat{u}_{-i}(\tau; t_0) = 0$ for all $\tau \in [t_0, t_0 + T]$ and removing constraint (11). At every update interval,
(1) Over any interval $[t_k, t_{k+1})$:
a) Apply $u_i^*(\tau; t_k)$, $\tau \in [t_k, t_{k+1})$,
b) Compute $\hat{u}_i(\tau; t_{k+1}) = \hat{u}_i(\tau)$ as

$$\hat{u}_i(\tau; t_{k+1}) = \begin{cases} u_i^*(\tau; t_k), & \tau \in [t_{k+1}, t_k + T) \\ 0, & \tau \in [t_k + T, t_{k+1} + T] \end{cases}$$
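In code, one update of Algorithm 1 amounts to applying the head of the newly optimized control and warm-starting the next assumed control with its tail. The Python fragment below is a minimal sketch of this step; solve_problem1 is a hypothetical stand-in for the optimizer of Problem 1, which the paper does not specify:

import numpy as np

def solve_problem1(x_i, u_hat_neighbors, T, dt):
    # Hypothetical stand-in for the optimizer of Problem 1: returns
    # u_i*(.; t_k) sampled every dt over the horizon T.
    return np.zeros((int(T / dt), 2))

def receding_horizon_update(x_i, u_hat_neighbors, T, dt, delta):
    # One update of Algorithm 1 for robot i (delta <= T).
    u_star = solve_problem1(x_i, u_hat_neighbors, T, dt)
    n_apply = int(delta / dt)
    u_apply = u_star[:n_apply]          # step (1a): apply on [t_k, t_{k+1})
    # step (1b): assumed control for the next horizon -- reuse the unused
    # tail of u_i* on [t_{k+1}, t_k+T) and pad with zeros on [t_k+T, t_{k+1}+T]
    u_hat_next = np.vstack([u_star[n_apply:],
                            np.zeros((n_apply, u_star.shape[1]))])
    return u_apply, u_hat_next          # u_hat_next is sent to the neighbors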
In this section, the stability analysis of Algorithm 1 is given; the main result is similar to the work in reference [9], so the primary lemmas and theorems are given with only brief explanation.
Lemma 1. For a given fixed horizon time T > 0, and for the positive constant $\xi$ defined by

$$\xi = 2\rho_{\max}\,\lambda_{\max}(Q)\,ANT\,\delta^{2\kappa}$$

the function $J^*(\cdot)$ satisfies

$$J^*(x(t_{k+1})) - J^*(x(t_k)) \le -\sum_{i=1}^{N} \int_{t_k}^{t_{k+1}} L_i(x_i^*(s; t_k), u_i^*(s; t_k))\,ds + \xi\delta^2 \qquad (13)$$
Theorem 1. For a given fixed horizon time T > 0 and for any state $x(t_0) \in X$ at initialization, if there exists a proper update period $\delta$ satisfying (14), then the formation converges to $x^c$ asymptotically.
A small fixed upper bound on $\delta$ is provided that guarantees that all robots reach their terminal constraint sets via the distributed receding horizon control. After applying the previous lemmas, $J^*(\cdot)$ is shown to be a Lyapunov function for the closed-loop system, and the remainder of the proof follows closely along the lines of the proof of Theorem 1 in [13].
4 Simulation
In this section, we conduct simulations to verify the proposed algorithm. Consider the two-dimensional bicycle-style robot system shown in Fig. 1; its absolute and relative state models are stated as
$$\begin{bmatrix} \dot{x}_i \\ \dot{y}_i \\ \dot{\theta}_i \\ \dot{\upsilon}_i \end{bmatrix} = \begin{bmatrix} \upsilon_i \cos\theta_i \\ \upsilon_i \sin\theta_i \\ u_{i1} \\ u_{i2} \end{bmatrix} \qquad (15.a)$$

$$\begin{bmatrix} \dot{x}_j^i \\ \dot{y}_j^i \\ \dot{\theta}_j^i \\ \dot{\upsilon}_i \end{bmatrix} = \begin{bmatrix} \upsilon_j \cos\theta_j^i - \upsilon_i + y_j^i u_{i1} \\ \upsilon_j \sin\theta_j^i - x_j^i u_{i1} \\ -u_{i1} + u_{j1} \\ u_{i2} \end{bmatrix} \qquad (15.b)$$
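To make the two models concrete, the sketch below integrates Eq. (15.a) and Eq. (15.b) with a forward Euler step; the step size and the constant inputs are illustrative assumptions, not values from the paper:

import numpy as np

def absolute_dynamics(s, u):
    # Eq. (15.a): s = [x_i, y_i, theta_i, v_i], u = [u_i1, u_i2]
    x, y, theta, v = s
    return np.array([v * np.cos(theta), v * np.sin(theta), u[0], u[1]])

def relative_dynamics(r, v_i, v_j, u_i, u_j):
    # Eq. (15.b): r = [x_ij, y_ij, theta_ij, v_i]
    x_ij, y_ij, th_ij, _ = r
    return np.array([v_j * np.cos(th_ij) - v_i + y_ij * u_i[0],
                     v_j * np.sin(th_ij) - x_ij * u_i[0],
                     -u_i[0] + u_j[0],
                     u_i[1]])

# one forward-Euler step with illustrative inputs
dt, s = 0.01, np.array([0.0, 0.0, 0.0, 1.0])
s = s + dt * absolute_dynamics(s, np.array([0.1, 0.0]))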
(Fig. 1: absolute coordinates $(x_i, y_i, \theta_i, \upsilon_i)$ and $(x_j, y_j, \theta_j, \upsilon_j)$ of the two robots, left, and relative coordinates $(x_j^i, y_j^i, \theta_j^i)$, right.)
(Figure: formation trajectories of the Leader, Follower1 and Follower2 in the X–Y plane (m), two panels.)
Since DRHFC-B uses one relative model instead of two absolute models when solving the optimal problem at every interval, the computing time is naturally reduced. The computing time of the two algorithms is shown in Fig. 4, with average cost times Time(DRHFC-A) = 3.18 s and Time(DRHFC-B) = 1.81 s, which means that DRHFC-B is more efficient than DRHFC-A. Comparisons were also conducted in different simulation environments, as shown in Table 1, and similar results can be concluded.
Fig. 3. Relative positions of robot 1 and 2
Fig. 4. Computing time at every update interval
(Figure: relative position $y_2^1$ (m) and cost (m²) over time.)
5 Conclusion
In this paper, a new decentralized receding horizon formation control based on the relative state model was proposed. The newly designed algorithm has the following advantages: 1) relative states, instead of absolute states, are used, since the former are the only requirement for most member robots in a formation and are easier to measure; 2) the computational burden and the influence of measurement noise are reduced. However, as in any classical leader-follower scheme, some disadvantages still exist in the proposed algorithm, common to most DRHFC algorithms, such as how to select proper parameters like the receding horizon time T and the update period $\delta$.
References
1. Das, A.K., Fierro, R., Kumar, V.: A vision-based formation control framework. J. IEEE
Transactions on Robotics and Automation 18(5), 813–825 (2002)
2. Balch, T., Arkin, R.C.: Behavior-based formation control for multi-robot teams. J. IEEE
Transactions on Robotics and Automation 14(6), 926–939 (1998)
3. Lewis, M.A., Tan, K.H.: High precision formation control of mobile robots using virtual
structures. J. Autonomous Robots 4(4), 387–403 (1997)
4. Camponogara, E., Jia, D., Krogh, B.H., Talukdar, S.: Distributed model predictive control.
J. IEEE Control Systems Magazine 22(1), 44–52 (2002)
5. Motee, N., Sayyar-Rodsari, B.: Optimal partitioning in distributed model predictive con-
trol. In: Proceedings of the American Control Conference, pp. 5300–5305 (2003)
6. Jia, D., Krogh, B.H.: Min-max feedback model predictive control for distributed control
with communication. In: Proceedings of the American Control Conference, pp. 4507–4512
(2002)
7. Richards, A., How, J.: A decentralized algorithm for robust constrained model predictive
control. In: Proceedings of the American Control Conference, pp. 4261–4266 (2004)
8. Keviczky, T., Borrelli, F., Balas, G.J.: Decentralized receding horizon control for large scale
dynamically decoupled systems. J. Automatica 42(12), 2105–2115 (2006)
9. Dunbar, W.B., Murray, R.M.: Distributed receding horizon control for multi-vehicle for-
mation stabilization. J. Automatica 42(4), 549–558 (2006)
10. Inalhan, G., Tillerson, M., How, J.P.: Relative dynamics and control of spacecraft forma-
tions in eccentric orbits. J. Guidance, Control, and Dynamics 25(1), 48–59 (2002)
11. Chen, X.P., Serrani, A., Ozbay, H.: Control of leader-follower formations of terrestrial
UAVs. In: Proceedings of Decision and Control, pp. 498–503 (2003)
12. Wang, Z., He, Y.Q., Han, J.D.: Multi-unmanned helicopter formation control on relative
dynamics. In: IEEE International Conference on Mechatronics and Automation, pp. 4381–
4386 (2009)
13. Chen, H., Allgower, F.: Quasi-infinite horizon nonlinear model predictive control scheme
with guaranteed stability. J. Automatica 34(10), 1205–1217 (1998)
14. Fukao, T., Nakagawa, H., Adachi, N.: Adaptive tracking control of a nonholonomic mo-
bile robot. J. IEEE Transactions on Robotics and Automation 16(5), 609–615 (2002)
15. Béjar, M., Ollero, A., Cuesta, F.: Modeling and control of autonomous helicopters. J. Ad-
vances in Control Theory and Applications 353, 1–29 (2007)
16. Leitner, J.: Formation flying system design for a planet-finding telescope-occulter system.
In: Proceedings of SPIE the International Society for Optical Engineering, pp. 66871D-10
(2007)
Simulation and Experiments of the Simultaneous
Self-assembly for Modular Swarm Robots
1 Introduction
Self-assembly has received special attention in the modular robot field and has made remarkable progress. Self-assembly can realize the autonomous construction of configurations, which refers to organizing a group of robot modules into a target robotic configuration through self-assembly without human intervention [1]. Because the basic modules in the modular swarm robotic field usually cannot move on their own, or only have a very limited ability of autonomous locomotion, their initial configuration is generally assembled manually. However, once the robotic configuration is established, the number of modules is fixed, making it difficult to add new modules without external direction [2].
Self-assembly provides an efficient way of autonomous construction for modular swarm robots [3]. A group of modules or individual robots with the same function are connected through self-assembly into robotic structures that have higher capabilities of locomotion, perception and operation. Bojinov [4], Klavins [5], O'Grady [6] et al. have proposed self-assembly control methods in different ways.
We have designed a newly developed robotic module named Sambot, which is an autonomous mobile robot combining the characteristics of chain-type and mobile self-reconfigurable robots. Each Sambot has one active docking interface and four passive
docking interfaces. It can move fully autonomously and dock with another Sambot
from four directions. Through docking with each other, multiple Sambots can organ-
ize into a collective robot [7].
The algorithm for self-assembly is complex, and because of the high cost of hardware experiments, a simulation platform for the Sambot robot is required. Using Microsoft Robotics Studio (MSRS), we design a simulation platform according to the physical Sambot system, and simulation experiments of autonomous construction for various configurations are conducted.
In our previous work [7], [8], we proposed a distributed self-assembly method based on the Sambot platform. There are three types of Sambots: Docking Sambots (DSA), the SEED and Connected Sambots (CSA). Single-DSA experiments for some configurations have been conducted, but because there is interference of the infrared sensors between multiple Sambots, simultaneous self-assembly had not been realized. In this paper, two interference problems in the Wandering and Locking phases are identified and solved. A simultaneous self-assembly method is designed to enhance the efficiency of the self-assembly of modular swarm robots. Meanwhile, the simultaneous docking of multiple Sambots in the Locking phase has been realized. The simulation and physical experiment results show that the simultaneous self-assembly control method is more effective for the autonomous construction of swarm robots.
The paper is organized as follows. In Section 2, the overall structure of the Sambot robot is described and the simulation platform of Sambot is introduced. In Section 3, two interference problems in the Wandering and Locking phases are analyzed and a simultaneous self-assembly control method is proposed. In Section 4, based on the Sambot simulation platform, simulation experiments are demonstrated to verify that the self-assembly algorithm is suitable for the autonomous construction of various configurations; the simulation results are provided and analyzed. In Section 5, physical experiments are implemented and the results are discussed. Finally, conclusions are given and ongoing work is pointed out.
Fig. 1. The structure of Sambot. (a) a Sambot robot; (b) simulated Sambot module; (c) simulated cross quadruped configuration; (d) simulated parallel quadruped configuration.
While further research is being performed, we use Microsoft Robotics Studio (MSRS) to build our simulation platform for more complex structures and large quantities of swarms. The simulation model is shown in Fig. 1(b). To realize physics-based simulation, we design a class which contains an inspection module, a control module and an execution module (as shown in Fig. 2). The inspection module contains the gyroscope, infrared sensors and bumper sensors. The control module works through ports in the simulation environment: it receives messages from the inspection module and makes decisions according to this information, and the robot then acts according to these decisions. Fig. 1(c) and (d) show the simulated cross quadruped configuration and the simulated parallel quadruped configuration.
Fig. 2. Software structure of the simulated Sambot: the inspection module (gyroscope, infrared sensor, bumper sensor) feeds the control module, which drives the execution module (Wander, Navigation, Dock) on top of the simulation engine (AGEIA PhysX, XNA render).
Fig. 3. Two interference situations. (a) A DSA's detecting infrared sensors are interfered with by another DSA. (b) Information conflict of simultaneous docking for multiple DSAs.
1. In the Wandering phase, when there is only one DSA to dock with the current configuration, the DSA searches for the Docking_Direction (infrared emitters) without interference from another DSA. However, if multiple DSAs wander simultaneously, interference from the other Sambots' infrared emitters can occur. In such cases, a DSA might mistake another DSA for the current configuration and then miss the target. As shown in Fig. 3(a), in the process of searching for the SEED or a CSA, the detecting sensors of DSA (2) detect DSA (1) before finding the SEED, and DSA (1) is mistaken for the current configuration; DSA (2) then navigates around DSA (1). Although DSA (2) can still get away from DSA (1) once DSA (1) is no longer within its perception range, this process is unprofitable. So it is necessary to distinguish the current configuration from a DSA.
2. In the Locking phase, for the simultaneous docking of multiple Sambots, an information transmission conflict can cause deadlock. Because of the CAN bus characteristics and the sensors' limitations, the bus is shared simultaneously by two or more Docking Sambots. When two docking interfaces of the current configuration are docked with Sambots A and B at the same time, Sambot A waits for the end of Sambot B's record while Sambot B waits for the end of Sambot A's record. For example, in Fig. 3(b), DSA (1) and DSA (2) are docking simultaneously with the SEED, and the SEED needs to communicate with both. However, in the previous self-assembly algorithm, the docking time difference is used to recognize which interface is docked and further define the DSA's node number in the connection state table, which is unavailable here and needs to be improved.
Fig. 4. Operation scenario of DSA detecting current configuration (here only SEED)
(Fig. 5: operation scenario of a DSA detecting another DSA at a fixed angle.)
interface with a lower number (here, back) is delayed until the information of the higher number has been transmitted and the deadlock is removed; that is, communication runs as an ordered allocation.
Two improved algorithms solving the corresponding interference problems are added to the self-assembly control method, so that multiple DSAs are able to simultaneously self-assemble into the target configuration according to the design requirements. Obviously, this shortens the assembly time, which is analyzed in the next sections through simulation and physical experiments; a sketch of the ordered-allocation idea follows Fig. 6 below.
Fig. 6. Solution to avoid the information conflict using ordered resource allocation policy
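The ordered-allocation policy of Fig. 6 can be sketched in a few lines of Python. This is a hypothetical illustration only: the interface ordering and the transmit routine are assumptions (the paper only states that a lower-numbered interface, e.g. back, waits until the higher-numbered ones have transmitted); the essential point is that the shared CAN bus is granted to one interface at a time in a fixed order, so the circular wait behind the deadlock cannot form.

# Assumed priority order: interfaces earlier in the list transmit first,
# and the lower-numbered ones (e.g. back) wait.
INTERFACE_PRIORITY = ["left", "front", "right", "back"]

def transmit_connection_record(interface):
    # Hypothetical stand-in for writing the DSA's node number on the CAN bus.
    print("recording node number for the DSA docked at", interface)

def serve_docking_events(pending_interfaces):
    # Ordered resource allocation: the shared bus is granted to one docked
    # interface at a time, so no two records are transmitted concurrently.
    for interface in INTERFACE_PRIORITY:
        if interface in pending_interfaces:
            transmit_connection_record(interface)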
Fig. 7. The self-assembly experiments of the snake-like and cross quadruped configuration on
simulation platform
Fig. 10 shows the process of the self-assembly experiments of the H-form and parallel quadruped configurations on the simulation platform, and Fig. 11 shows the distribution of completion time.
Fig. 10. The self-assembly experiments on the H-form and quadruped configuration on simula-
tion platform
Fig. 11. Distribution of completion time of the H-form and quadruped configurations on simu-
lation platform
5 Physical Experiments
Based on Sambot modules, on a platform of 1000 mm × 1000 mm, we conduct the simultaneous self-assembly experiments with multiple DSAs for both the snake-like and the quadruped configurations. The SEED is again located at the platform center, and the DSAs are put randomly at the four corners.
1. The simultaneous self-assembly of the snake-like configuration with multiple DSAs is shown in Fig. 12. For a linear configuration, the simultaneous docking conflict does not exist in the simultaneous self-assembly process, but a DSA's sensors may still be interfered with by another DSA.
2. The simultaneous self-assembly of the quadruped configuration with multiple DSAs. As indicated by the red arrows in Fig. 13, all four lateral interfaces of the SEED are Docking-Directions, which remarkably enhances the experimental efficiency. Information transmission conflicts leading to deadlock and sensor interference may both occur; however, the simultaneous self-assembly algorithm can deal with these problems. The experimental results verify the effectiveness of the algorithm.
Fig. 12. The self-assembly experiment of the snake-like configuration with multiple DSAs
Fig. 13. The self-assembly experiment of the quadruped configuration with multiple DSAs
Some ongoing research still deserves study. In particular, the wandering and navigating algorithms need further improvement, e.g., using evolutionary algorithms. Moreover, it is necessary to establish an autonomous control system for the self-assembly of given configurations, the movement of the whole configuration, the evolutionary reconfiguration to another arbitrary robotic structure, and so on.
Acknowledgments
This work was supported by the 863 Program of China (Grant No. 2009AA043901
and 2009AA043903), National Natural Science Foundation of China (Grant No.
60525314), Beijing technological new star project (Grant No. 2008A018).
References
1. Whitesides, G.M., Grzybowski, B.: Self-Assembly at All Scales. J. Science 295, 2418–
2421 (2002)
2. Christensen, A.L., O'Grady, R., Dorigo, M.: Morphology Control in a Multirobot System.
J. IEEE Robotics & Automation Magazine 14, 18–25 (2007)
3. Anderson, C., Theraulaz, G., Deneubourg, J.L.: Self-assemblages in Insect Societies. J.
Insectes Sociaux 49, 99–110 (2002)
4. Bojinov, H., Casal, A., Hogg, T.: Multiagent Control of Self-reconfigurable Robots. J.
Artificial Intelligence 142, 99–120 (2002)
5. Klavins, E.: Programmable Self-assembly. J. IEEE Control Systems Magazine 27, 43–56
(2007)
6. Christensen, A.L., O’Grady, R., Dorigo, M.: Morphology Control in a Multirobot System.
J. IEEE Robotics & Automation Magazine 14(4), 18–25 (2007)
7. Hongxing, W., Yingpeng, C., Haiyuan, L., Tianmiao, W.: Sambot: a Self-assembly Modu-
lar Robot for Swarm Robot. In: The 2010 IEEE Conference on Robotics and Automation,
pp. 66–71. IEEE Press, Anchorage (2010)
8. Hongxing, W., Dezhong, L., Jiandong, T., Tianmiao, W.: The Distributed Control and Ex-
periments of Directional Self-assembly for Modular Swarm Robot. In: The 2010 IEEE/RSJ
International Conference on Intelligent Robots and Systems, pp. 4169–4174. IEEE Press,
Taipei (2010)
Impulsive Consensus in Networks of Multi-agent
Systems with Any Communication Delays
1 Introduction
2 Consensus Algorithms
Let $\mathbb{R} = (-\infty, +\infty)$ be the set of real numbers, $\mathbb{R}^+ = [0, +\infty)$ be the set of nonnegative real numbers, and $\mathbb{Z}^+ = \{1, 2, \cdots\}$ be the set of positive integers.
$$\cdots + \sum_{m=1}^{+\infty}\sum_{v_j \in N_i} b_{ij}\,\big(x_j(t) - x_i(t)\big)\,\delta(t - t_m), \qquad (4)$$
where $b_{ij} \ge 0$ are constants called the control gains, and $\delta(t)$ is the Dirac function [9,10].
Remark 1. If $b_{ij} = 0$ for all i, j, then the protocol (4) reduces to the linear consensus protocol (3) corresponding to the neighbors of node $v_i$. Clearly, consensus protocol (4) generalizes corresponding results existing in the literature [3,7,8,9]. It should be noted that the latter part of the impulsive consensus protocol (4) has two aims. On the one hand, if $\tau(t) < \tau^*$, it can be used to accelerate the average consensus of such systems. On the other hand, if $\tau(t) \ge \tau^*$, it can solve average consensus for arbitrary communication time-delays. This point will be further illustrated through the numerical simulations.
Under the consensus protocol (4), the system (2) takes the following form

$$\begin{cases} \dot{x}(t) = -Lx(t - \tau(t)), & t \ne t_m, \ t \ge t_0, \\ \Delta x(t) = x(t) - x(t^-) = -Mx(t), & t = t_m, \ m \in \mathbb{Z}^+. \end{cases} \qquad (5)$$
In what follows, we will consider the average consensus problem of (5) with
fixed topology. We will prove that under appropriate conditions the system
achieves average consensus uniformly asymptotically.
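As a quick numerical illustration of system (5), the following Python sketch integrates the delayed dynamics with a history buffer and applies the impulsive correction at equidistant instants. The 3-agent ring, delay and gain matrix are illustrative assumptions, much smaller than the 100-agent network used in Section 4:

import numpy as np

def simulate(L, M, x0, tau, dt=1e-3, t_end=2.0, imp_dt=0.02):
    # dx/dt = -L x(t - tau) between impulses; x(t_m) = (I - M) x(t_m^-)
    x = np.array(x0, float)
    hist = [x.copy()] * (int(tau / dt) + 1)   # constant initial history
    for k in range(int(t_end / dt)):
        x = x + dt * (-L @ hist[0])           # delayed drift term
        if (k + 1) % int(imp_dt / dt) == 0:
            x = x - M @ x                     # impulsive correction Δx = -Mx
        hist.pop(0)
        hist.append(x.copy())
    return x

L = np.array([[2., -1., -1.], [-1., 2., -1.], [-1., -1., 2.]])
M = 0.3 * (np.eye(3) - np.ones((3, 3)) / 3)   # zero-row-sum gain (a guess)
print(simulate(L, M, [0., 5., 10.], tau=0.5))  # states should approach the average 5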
3 Main Results
Based on the stability theory of impulsive delayed differential equations, the following sufficient condition for average consensus of the system (5) is established.
Theorem 1. Consider the delayed dynamical network (5). Assume there exist positive constants $\alpha, \beta > 0$ such that for all $m \in \mathbb{N}$ the following conditions are satisfied:

$(A_1)$ $\big(2 + 2\lambda_2(M^s) + \lambda_2(M^TM)\big)\cdot\|L\| \le \alpha;$

$(A_2)$ $\ln\big(1 + 2\lambda_2(M^s) + \lambda_2(M^TM)\big) - \alpha(t_m - t_{m-1}) \ge \beta > 0.$

Then the delayed dynamical network (5) achieves average consensus uniformly asymptotically.
Proof. Since the graph G has a spanning tree, by Lemma 3.3 in [5] its Laplacian M has exactly one zero eigenvalue and the remaining n−1 eigenvalues all have positive real parts. Furthermore, $M^s$ is a symmetric matrix with zero row sums. Thus, the eigenvalues of the matrices $M^s$ and $M^TM$ can be ordered as

$$0 = \lambda_1(M^s) < \lambda_2(M^s) \le \cdots \le \lambda_n(M^s),$$
and $0 = \lambda_1(M^TM) < \lambda_2(M^TM) \le \cdots \le \lambda_n(M^TM)$.

On the other hand, since $M^s$ and $M^TM$ are symmetric, by the basic theory of linear algebra we know

$$\eta^T(t)\, M^s\, \eta(t) \ge \lambda_2(M^s)\,\eta^T(t)\eta(t), \qquad \mathbf{1}^T\eta = 0. \qquad (7)$$
that is,

$$V(t_m, \eta(t_m)) \le \frac{1}{1 + 2\lambda_2(M^s) + \lambda_2(M^TM)}\,V(t_m^-, \eta(t_m^-)). \qquad (10)$$

Let $\psi(t) = \dfrac{t}{1 + 2\lambda_2(M^s) + \lambda_2(M^TM)}$; then $\psi(t)$ is strictly increasing and $\psi(0) = 0$ with $\psi(t) < t$ for all $t > 0$. Hence condition (ii) of Theorem 1 in [10] is satisfied.
For any solution of Eqs. (6), suppose

$$V(t - \tau(t), \eta(t - \tau(t))) \le \psi^{-1}(V(t, \eta(t))). \qquad (11)$$

Calculating the upper Dini derivative of V(t) along the solutions of Eqs. (6), and using the inequality $x^Ty + y^Tx \le \varepsilon x^Tx + \varepsilon^{-1}y^Ty$, we get

$$D^+V(t) = -\eta^T L\,\eta(t - \tau(t)) \le \|L\| \cdot V(t, \eta(t)) + \|L\| \sup_{t-\tau \le s \le t} V(s, \eta(s)) \le \big(2 + 2\lambda_2(M^s) + \lambda_2(M^TM)\big)\|L\|\,V(t, \eta(t)) \le \alpha V(t, \eta(t)).$$
Let $g(t) \equiv 1$ and $H(t) = \alpha t$; thus condition (iii) of Theorem 1 in [10] is satisfied. Condition $(A_2)$ of Theorem 1 implies that

$$\int_{\psi(\mu)}^{\mu} \frac{ds}{H(s)} - \int_{t_{m-1}}^{t_m} g(s)\,ds = \frac{1}{\alpha}\left(\ln\mu - \ln\left[\frac{\mu}{1 + 2\lambda_2(M^s) + \lambda_2(M^TM)}\right]\right) - (t_m - t_{m-1}) = \frac{\ln\big[1 + 2\lambda_2(M^s) + \lambda_2(M^TM)\big]}{\alpha} - (t_m - t_{m-1}) \ge \frac{\beta}{\alpha} > 0.$$
Hence condition (iv) of Theorem 1 in [10] is satisfied. Let $w_1(|x|) = w_2(|x|) = |x|^2/2$, so that condition (i) of Theorem 1 in [10] is satisfied. Therefore, all the conditions of Theorem 1 in [10] are satisfied. This completes the proof of Theorem 1.
Remark 2. Theorem 1 shows that average consensus of the delayed dynamical network (5) not only depends on the topological structure of the entire network, but is also heavily determined by the impulsive gain matrix M and the impulsive interval $t_m - t_{m-1}$. In addition, the conditions of Theorem 1 are sufficient but not necessary, i.e., the dynamical network may still achieve average consensus uniformly asymptotically even if one of the conditions of Theorem 1 fails.
4 Simulations
As an application of the above theoretical results, the average consensus problem for delayed dynamical networks is worked out in this section. Simulations with various impulsive gain matrices are given to verify the effectiveness of the proposed impulsive consensus protocol, and to visualize the effect of the impulsive gains on the average consensus of the delayed dynamical networks.
Here we consider a directed network with fixed topology G having 100 agents
as in Fig. 1. It is easy to see that G has a spanning tree. Matrix L is given by
$$L = \begin{pmatrix} 2 & -1 & 0 & \cdots & -1 \\ -1 & 2 & -1 & \cdots & 0 \\ 0 & -1 & 2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -1 & 0 & 0 & \cdots & 2 \end{pmatrix}_{100 \times 100}.$$
Fig. 2. The change process of the state variables of the delayed dynamical network (5)
without impulsive gain in case τ (t) = τ ∗ = π/8
Fig. 3. Average consensus process of the agents state of the delayed dynamical network
(5) with different impulsive gains matrices in case τ (t) = 1.0
then all the conditions of Theorem 1 are satisfied, which means that the delayed dynamical network (5) achieves average consensus uniformly asymptotically.

Let the equidistant impulsive interval be $\Delta t = 0.02$. Fig. 2 shows the simulation result for the change process of the state variables of the delayed dynamical network (5) with communication delay $\tau(t) = \tau^* = \pi/2\lambda_n = \pi/8$ and impulsive gain matrix M = 0 over the time interval [0, 20].
This clearly shows that average consensus is not asymptotically reached, which is consistent with the result of Proposition 1. Fig. 3 demonstrates the change process of the state variables of the delayed dynamical network (5) with communication delay $\tau(t) = 1$ and two different impulsive gains, $m_{ij} = -0.015$ ($i \ne j$), $m_{ii} = 1.485$, $\alpha = 30$, $\beta = 2.7322$, and $m_{ij} = -0.018$ ($i \ne j$), $m_{ii} = 1.782$, $\alpha = 36$, $\beta = 2.9169$, over the time interval [0, 2]; both satisfy the conditions of Theorem 1. It can be seen that impulsive average consensus is finally achieved, and that the impulsive gain matrix heavily affects the consensus of the delayed dynamical network.
5 Conclusions
This paper has developed a distributed algorithm for average consensus in directed delayed networks of dynamic agents. We have proposed a simple impulsive consensus protocol for such networks with arbitrary communication delays, and derived some generic sufficient conditions under which all the nodes in the network achieve average consensus.
Acknowledgment
This work was supported by the National Science Foundation of China (Grant
Nos. 10972129 and 10832006), the Specialized Research Foundation for the Doc-
toral Program of Higher Education (Grant No. 200802800015), the Innovation
Program of Shanghai Municipal Education Commission (Grant No. 10ZZ61), the
Shanghai Leading Academic Discipline Project (Project No. S30106), and the
Scientific Research Foundation of Tongren College (Nos. TS10016 and TR051).
References
1. Jadbabaie, A., Lin, J., Morse, A.S.: Coordination of Groups of Mobile Autonomous
Agents Using Nearest Neighbor Rules. IEEE Trans. Autom. Contr. 48, 988–1001
(2003)
2. Fax, J.A., Murray, R.M.: Information Flow and Cooperative Control of Vehicle
Formations. IEEE Trans. Autom. Contr. 49, 1465–1476 (2004)
3. Olfati-Saber, R., Murray, R.M.: Consensus Problems in Networks of Agents with
Switching Topology and Time-Delays. IEEE Trans. Autom. Contr. 49, 1520–1533
(2004)
4. Moreau, L.: Stability of Multiagent Systems with Time-Dependent Communication
Links. IEEE Trans. Autom. Contr. 50, 169–182 (2005)
5. Ren, W., Beard, R.W.: Consensus Seeking in Multiagent Systems Under Dynam-
ically Changing Interaction Topologies. IEEE Trans. Autom. Contr. 50, 655–661
(2005)
6. Hong, Y.G., Hu, J.P., Gao, L.X.: Tracking Control for Multi-Agent Consensus with
an Active Leader and Variable Topology. Automatica 42, 1177–1182 (2006)
7. Sun, Y.G., Wang, L., Xie, G.M.: Average Consensus in Networks of Dynamic
Agents with Switching Topologies and Multiple Time-Varying Delays. Syst. Contr.
Lett. 57, 175–183 (2008)
8. Lin, P., Jia, Y.M.: Average Consensus in Networks of Multi-Agents with both
Switching Topology and Coupling Time-Delay. Physica A 387, 303–313 (2008)
9. Wu, Q.J., Zhou, J., Xiang, L.: Impulsive Consensus Seeking in Directed Networks
of Multi-Agent Systems with Communication Time-Delays. International Journal
of Systems Science (2011) (in press), doi:10.1080/00207721.2010.547630
10. Yan, J., Shen, J.H.: Impulsive Stabilization of Functional Differential Equations by
Lyapunov-Razumikhin Functions. Nonlinear Anal. 37, 245–255 (1999)
FDClust: A New Bio-inspired Divisive Clustering
Algorithm
1 Introduction
Clustering is an important data mining technique that has a wide range of applications in many areas such as biology, medicine, market research and image analysis. It is the process of partitioning a set of objects into different subsets, the goal being that the objects within a group be similar (or related) to one another and different from (or unrelated to) the objects in other groups.
Many clustering algorithms exist in the literature. At a high level, we can divide these algorithms into two classes: partitioning algorithms and hierarchical algorithms. Given a database of n objects or data tuples, a partitioning method constructs k partitions of the data, where each partition represents a cluster, whereas hierarchical clustering presents the data in the form of a hierarchy over the entity set. In hierarchical clustering methods, the number of clusters does not have to be specified a priori, and there are no initializations to be done. Hierarchical clustering is static: data assigned to a given cluster in the early stages cannot be moved between clusters afterwards. There are two approaches to building a cluster hierarchy: (i) agglomerative clustering, which builds a hierarchy in a bottom-up fashion by starting from smaller clusters and sequentially
merging them into parental nodes; (ii) divisive clustering, which builds a top-down hierarchy by splitting larger clusters into smaller ones, starting from the entire data set.
Researchers continually seek new approaches to improve the solution of the clustering problem and achieve better results. Recently, research on and with bio-inspired clustering algorithms has reached a very promising state. The basic motivation of these approaches stems from the incredible ability of social animals and other organisms (ants, bees, termites, birds, fish, etc.) to solve complex problems collectively.
These algorithms use a set of similar and rather simple artificial agents (ants, bees, individuals, etc.) to solve the clustering problem. They can be divided into three main categories according to the data representation [1]: (i) an agent represents a potential solution to the clustering problem to be optimized, as in genetic [2,3] and particle swarm optimization clustering algorithms [4,5]; (ii) data points, which are objects in the universe, are moved by agents in order to form clusters, as in ant-based clustering algorithms [6][7]; (iii) each artificial agent represents one data item, and the agents move in the universe to form groups of similar entities, for example AntTree [8] and AntClust [9].
In this work, we propose a new bio-inspired divisive clustering algorithm: the artificial Fish based Divisive Clustering algorithm (FDClust). This algorithm takes inspiration from the social organization of fish shoals and the shoal encounter phenomenon. Several studies have shown that fish shoals are assorted according to several characteristics [10][11]. During shoal encounters, an individual fish decides to join or to leave a group according to its common characteristics with the already existing group members [12][13]. Shoal encounters may result in the fission of the group into two homogeneous shoals. Thus real fish are able to solve the sorting problem, and these phenomena can be easily adapted to solve the clustering problem. In FDClust, an artificial fish represents an object to be clustered. The encounter of two artificial shoals results in the fission of the group into two clusters of similar objects. FDClust builds a binary tree of clusters, applying this process recursively to split each node into two homogeneous clusters.
The remainder of the paper is organized as follows. Section 2 first describes the social organization of fish species and then the encounter phenomenon of fish shoals. In Section 3 we present the FDClust algorithm in detail. Experimental results are presented and discussed in Section 4. Section 5 concludes the paper and gives suggestions for future work.
Fig. 1. Diagram showing the two forms of fission events that were recorded: (a) a rear fission event; (b) a lateral fission event [14]
Shoal membership is not necessarily stable over time; individuals are exchanged between groups [14]. Fish shoals are thus open groups (groups that individuals are free to leave and join). Theoretical models of open groups assert that social animals make adaptive decisions about joining groups on the basis of a number of different phenotypic traits of the existing group members. Hence, individuals prefer to associate with similar conspecifics, those of similar body length and those free of parasites [13]. Active choice of shoal mates has been documented for many fish species: during shoal encounters, individuals may actively choose neighboring fish of a similar phenotype. Fish have limited vision and thus cannot interact with all group members but only with perceived ones. Shoal encounters therefore provide an individual-based mechanism for shoal assortment. Since individuals can make decisions based on the composition of available shoals, other group members are a source of information about the most adaptive decisions [15]. Group living is likely to be based on a continuous decision-making process, with individuals constantly evaluating the profitability of joining, leaving or staying with others in each encounter with other groups. The encounters of fish shoals result in shoal fission or fusion.
Fission (but not fusion) events have been shown to be an important mechanism in generating phenotypic assortment [14]. Shoal fission events are divided into two categories (figure 1): (i) rear fission events, where the two resulting shoals maintain the same direction of travel and fission occurs due to differential swimming speeds; (ii) lateral fission events, where the two resulting shoals are separated due to different directions of travel [14].
The social organization of fish shoals is based on phenotypic similarity, and the continuous decision-making process is based on maintaining social organization with neighboring group members. The behavior of real fish during shoal encounters makes them able to solve the sorting problem collectively. Our study of these phenomena (particularly the fission events) from a clustering perspective resulted in the development of a clustering model for solving the divisive clustering problem. The core task in such a problem is to split a candidate cluster into two distant parts. In our model, this task is achieved by simulating the encounter of two groups of artificial fish. The model is described in the next section.
sub-clusters until each object forms one cluster. At each step the cluster with the highest diameter among those not yet split is partitioned into two sub-clusters. To achieve the partitioning of a group of objects into two homogeneous groups, FDClust applies a bi-partitioning procedure that takes inspiration from the shoal encounter phenomenon. During shoal encounters, real fish are able to evaluate dynamically the profitability of joining, leaving or staying with neighboring agents; this decision-making process is based on the maintenance of the social organization of the entire group. Fish shoals are phenotypically assorted by color, size and species, and shoal encounters may result in the fission of the group into two well-organized (assorted) groups. In lateral fission, groups are separated due to two different directions of swimming. To achieve the division of the candidate cluster into two sub-clusters, we use two artificial fish shoals: the encounter of these two groups of artificial fish results in a lateral fission of the group into two homogeneous groups. Artificial fish (agents) are initially scattered randomly on the clustering environment. Each agent is an object to be clustered and is randomly assigned a direction, left or right. Since real fish have only local vision, artificial agents interact only with neighboring agents to make adaptive decisions about joining or leaving a group. Each agent has to make a binary decision whether to move to the left or to the right: agents take the same direction as the most similar agents in their neighborhood. Artificial fish thus finally join their appropriate group composed of similar agents. The initial group is then separated into two sub-groups of similar objects due to the two directions of travel, left and right: the group of agents having the left direction and the group having the right direction.
then the agent $p_i$ interacts with all its neighbors; otherwise it interacts only with $n_v = s \times s$ of them.
Each agent has an initial preferred direction, left (←) or right (→), fixed randomly at the start. Agents move with identical speed. In one step, an agent can move to one of its neighboring cells, whether the left one or the right one, and it actively chooses its travel direction through the interactions with its neighboring agents. An agent interacts with at most $n_v$ nearest neighbors among those situated in its local neighborhood; agents can occupy the same cell as other agents. To decide on the next travel direction, the agent $p_i$ evaluates its similarity with the agents from $pv(p_i)$ that have the direction right (→) (respectively left (←)). These two similarities are calculated as follows:
$$sim(p_i, \rightarrow) = 1 - \frac{\displaystyle\sum_{p_j \in pv(p_i)\,/\,dir(p_j) = \rightarrow} d^2(p_i, p_j)}{m \cdot \left|\{p_j \in pv(p_i)\,/\,dir(p_j) = \rightarrow\}\right|} \qquad (1)$$

$$sim(p_i, \leftarrow) = 1 - \frac{\displaystyle\sum_{p_j \in pv(p_i)\,/\,dir(p_j) = \leftarrow} d^2(p_i, p_j)}{m \cdot \left|\{p_j \in pv(p_i)\,/\,dir(p_j) = \leftarrow\}\right|} \qquad (2)$$
If the agent $p_i$ is more similar to the agents having the right direction, then $p_i$ will move to the cell at its right, and vice versa.
FDClust starts with all objects gathered in the same cluster. At each step it applies the bi-partitioning algorithm to the cluster to be split, until each object constitutes one cluster. It is a hierarchical divisive clustering algorithm (figure 3).
Input: number of objects N, the size of the perception zone s×s, the movement step p and the number of iterations T.
Output: Cl and Cr
1. Scatter the objects of cluster C in the central square of the grid
2. Associate a random direction (→ or ←) with each object
3. For t = 1 to T do
4.   For i = 1 to N do
5.     If |pv(p_i)| = 0 then stand by, else
6.       compute sim(p_i, →) and sim(p_i, ←) using Eqs. (1) and (2)
7.       if sim(p_i, →) > sim(p_i, ←) then direction(p_i) = → and move to the right
8.       else if sim(p_i, ←) > sim(p_i, →) then direction(p_i) = ← and move to the left
9.       else stand by
10.  end
11.  If direction(p_i) = → then p_i ∈ Cr
12.  Else p_i ∈ Cl
13. end
14. end
15. Return Cl and Cr
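The bi-partitioning procedure can be prototyped compactly. The Python sketch below is a minimal illustration of Eqs. (1)–(2) and of the direction-update rule of figure 3, under the assumption that attributes are normalized so that d²(·,·)/m lies in [0, 1]; the grid geometry and the physical movement are abstracted into a user-supplied neighborhood function:

import numpy as np

def similarity(X, i, neighbors, direction, dirs, m):
    # Eq. (1)/(2): 1 minus the mean squared distance from object i to the
    # perceived neighbors currently travelling in the given direction.
    group = [j for j in neighbors if dirs[j] == direction]
    if not group:
        return -np.inf                      # nobody to join in that direction
    d2 = sum(np.sum((X[i] - X[j]) ** 2) for j in group)
    return 1.0 - d2 / (m * len(group))

def bipartition(X, neighbors_of, T=50, seed=0):
    # Split one cluster into (Cl, Cr) by simulated lateral shoal fission.
    rng = np.random.default_rng(seed)
    n, m = X.shape
    dirs = [rng.choice(["L", "R"]) for _ in range(n)]
    for _ in range(T):
        for i in range(n):
            nb = neighbors_of(i)
            if not nb:
                continue                    # no perceived neighbors: stand by
            left = similarity(X, i, nb, "L", dirs, m)
            right = similarity(X, i, nb, "R", dirs, m)
            if left != right:
                dirs[i] = "L" if left > right else "R"
    Cl = [i for i in range(n) if dirs[i] == "L"]
    Cr = [i for i in range(n) if dirs[i] == "R"]
    return Cl, Cr

Calling bipartition(X, lambda i: [j for j in range(len(X)) if j != i]) reproduces the limit case in which every agent perceives all the others.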
Table 1. Main features of the used databases

Database   N     M   K
Iris       150   4   3
Glass      214   9   6
Thyroid    215   5   3
Soybean    47    35  4
Wine       178   13  3
Yeast      1484  8   10
The main features of the databases are summarized in Table 1. In each case the num-
ber of attributes (M), the number of classes (K) and the total number of objects (N)
are specified.
To evaluate our algorithm we have used the following measures:

The intra-cluster inertia, used to determine how homogeneous the objects in the clusters are with each other (where $G_i$ is the center of cluster $C_i$ and d is the Euclidean distance):

$$I = \frac{1}{K}\sum_{i=1}^{K}\;\sum_{x_i \in C_i} d(x_i, G_i)^2 \qquad (3)$$
The recall, the precision and the F-measure are based on the idea of comparing a resulting partition with a real or reference partition. The relative recall (respectively precision and F-measure) of the reference class $C_i$ with respect to the resulting class $C_j$ are defined as follows:

$$recall(i,j) = \frac{n_{ij}}{N_i}, \qquad precision(i,j) = \frac{n_{ij}}{N_j}, \qquad F(i,j) = 2\,\frac{precision(i,j) \cdot recall(i,j)}{precision(i,j) + recall(i,j)}$$

The global values of the recall (r), the precision (p) and the F-measure (F) are obtained by aggregating these relative values over all classes, weighted by the weights $p_i$ of the classes $C_i$.
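These measures are straightforward to compute from a confusion matrix, as the following sketch illustrates; the global aggregation shown here is a standard class-size-weighted best-match mean, an assumed convention rather than necessarily the exact aggregation used in the experiments:

import numpy as np

def pairwise_measures(conf):
    # conf[i, j] = n_ij: objects of reference class Ci put in result class Cj
    Ni = conf.sum(axis=1, keepdims=True)        # reference class sizes
    Nj = conf.sum(axis=0, keepdims=True)        # result class sizes
    recall = conf / Ni
    precision = conf / Nj
    F = 2 * precision * recall / np.where(precision + recall > 0,
                                          precision + recall, 1)
    return recall, precision, F

def global_F(conf):
    # Each reference class contributes its best-matching result class,
    # weighted by its relative size p_i (assumed aggregation).
    _, _, F = pairwise_measures(conf)
    weights = conf.sum(axis=1) / conf.sum()
    return float((weights * F.max(axis=1)).sum())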
In Table 2, we present the results obtained for the FDClust, k-means, Alink, Clink, Slink and Diana algorithms. Since FDClust and k-means are stochastic, we give the min, the max, the mean and the standard deviation over 100 runs.
For the Iris database, our algorithm generates the best results according to all considered measures in comparison with the other algorithms. For the Glass and Thyroid databases, FDClust encounters some difficulty in determining the real cluster structure, but the obtained clusters are homogeneous. For the Soybean database, all algorithms generate good partitions and the results are close. For the Wine database, FDClust generates a partition of good quality in terms of inertia, recall, precision and F-measure in comparison with those obtained by the other algorithms. For the Yeast database, FDClust generates the best partition in terms of intra-cluster inertia, but like k-means it has difficulty detecting the real cluster structure.

Comparing with the other algorithms, we note that FDClust recorded good performance for all databases. Moreover, FDClust has the advantage of a lower complexity than the other hierarchical algorithms.
5 Conclusion
Bio-inspired clustering algorithms are an appropriate alternative to traditional cluster-
ing algorithms. Research on bio-inspired clustering algorithms is still an on-going field
of research. In this paper we have presented a new approach for divisive clustering with artificial fish, based on the shoal encounter phenomenon and the social organization of fish shoals. The obtained results are encouraging.
As future work, we intend to extend our algorithm by considering more than two directions of travel, so that a candidate cluster may be divided into more than two sub-clusters.
References
1. Bock, H., Gaul, W., Vichi, M.: Studies in Classification, Data Analysis, and Knowledge Organization (2005)
2. Falkenauer, E.: A new representation and operators for genetic algorithms applied to
grouping problems. Evolutionary Computation 2(2), 123–144 (1994)
3. Maulik, U., Bandyopadhyay, S.: Genetic algorithm-based clustering technique. Pattern
Recognition 33, 1455–1465 (2000)
4. Cohen, S.C.M., de Castro, L.N.: Data Clustering with Particle Swarms. In: IEEE Congress on Evolutionary Computation 2006 (2006)
5. Chen, C.-Y., Ye, F.: Particle swarm optimization algorithm and its application to clustering
analysis. In: Proceedings of IEEE International Conference on Networking, Sensing and
Control, pp. 789–794 (2004)
6. Lumer, E., Faieta, B.: Diversity and adaptation in populations of clustering ants. In: Cliff, D., Husbands, P., Meyer, J.-A., Wilson, S.W. (eds.) Proceedings of the Third International Conference on Simulation of Adaptive Behavior, pp. 501–508. MIT Press, Cambridge (1994)
7. Gzara, M., Jamoussi, S., Elkamel, A., Ben Abdallah, H.: L'algorithme CAC: des fourmis artificielles pour la classification automatique. Accepted for publication in Revue d'Intelligence Artificielle (2011)
8. Azzag, H., Guinot, C., Oliver, A., Venturini, G.: A hierarchical ant based clustering algo-
rithm and its use in three real-world applications. In: Dullaert, W., Marc Sevaux, K.S.,
Springael, J. (eds.) European Journal of Operational Research (EJOR). Special Issue on
Applications of Metaheuristics (2005)
9. Labroche, N., Monmarché, N., Venturini, G.: A new clustering algorithm based on the
chemical recognition system of ants. In: van Harmelen, F. (ed.) Proceedings of the 15th
European Conference on Artificial Intelligence, pp. 345–349 (2002)
10. Krause, J., Butlin, R.K., Peuhkuri, N., Pritchard, V.: The social organization of fish shoals: a test of the predictive power of laboratory experiments for the field. Biol. Rev. 75, 477–501 (2000a)
11. McCann, L.I., Koehn, D.J., Kline, N.J.: The effects of body size and body markings on
nonpolarized schooling behaviour of zebra fish (Brachydanio rerio). J. Psychol. 79, 71–75
(1971)
12. Krause, J., Godin, J.G.: Shoal choice in the banded killifish (Fundulus diapha-nus, Teleos-
tei, Cyprinodontidae) – Effects of predation risk, fish size, species compo-sition and size of
shoals. Ethology 98, 128–136 (1994)
13. Crook, A.C.: Quantitative evidence for assortative schooling in a coral reef. Mar. Ecol.
Prog. Ser. 179, 17–23 (1999)
14. Theodorakis, C.W.: Size segregation and effects of oddity on predation risk in minnow schools. Anim. Behav. 38, 496–502 (1989)
15. Croft, D.P., Arrowsmith, B.J., Bielby, J., Skinner, K., White, E., Couzin, I.D., Magurran, A.E., Ramnarine, I., Krause, J.: Mechanisms underlying shoal composition in the Trinidadian guppy (Poecilia reticulata). Oikos 100, 429–438 (2003)
16. Blake, C.L., Merz, C.J.: UCI repository of machine learning databases (1998)
Mining Class Association Rules from Dynamic Class
Coupling Data to Measure Class Reusability Pattern
Abstract. The increasing use of reusable components in software development in recent times has motivated researchers to pay more attention to the measurement of reusability. There is tremendous scope for using various data mining techniques to identify sets of software components that have high dependency amongst each other, which makes each of them less reusable in isolation. For the object-oriented development paradigm, class coupling has already been identified as the most important parameter affecting reusability. In this paper an attempt has been made to identify groups of classes that depend on each other while being independent from the rest of the classes in the same repository. The concepts of data mining have been used to discover patterns of reusable classes in a particular application. The paper proposes a three-step approach to discover class association rules for Java applications, identifying sets of classes that should be reused in combination. First, dynamic analysis of the Java application under consideration is performed using UML diagrams to compute the class import coupling measure. In the second step, these collected measures are represented for each class as a Class_Set and a binary Class_Vector. Finally, the third step uses the Apriori association rule mining algorithm to generate Class Association Rules (CARs) between classes. The proposed approach has been applied to sample Java programs, and our study indicates that these CARs can assist developers in the proper identification of reusable classes by discovering frequent class association patterns.
1 Introduction
Object oriented development has become widely acceptable in the software industry.
It provides many advantages over the traditional development approaches [17] and is
intended to enhance software reusability through encapsulation and inheritance [28].
In object-oriented concept, classes are basic building blocks and coupling between
classes is well-recognized structural attribute in OO software engineering. Software
Reuse is defined as the process of building or assembling software applications from
previously developed software [20]. Concept of reuse has been widely used by the
software industry in recent times. The present scenario of development is to reuse
some of the already existing quality components and to develop new, highly reusable components. The reuse of software components in software development leads to increased productivity, quality, maintainability, etc. [3,23]. The success of reusability is highly dependent on properly identifying whether a particular component is really reusable or not, and reusability measures help to develop, store and identify reusable components [21]. Reuse of class code is frequent in practice, but it is essential, and tricky, to identify the set of needed classes to reuse together or alone. Hence it is always desirable to find the classes along with their associated classes [17]. Class coupling plays a vital role in measuring reusability and in selecting classes for reuse in combination, because highly coupled classes have to be reused as a group [7]. One can define a class Ca as related to a class Cb if Ca must use Cb in all future reuse, so a group of dependent classes should be reused together to ensure the proper functioning of the application [22]. Software metrics, especially reusability metrics, are an active research area in the field of software measurement. A software metric is a quantitative indicator of an attribute of a software product or process. There are several reuse-related metric models, such as cost productivity, return on investment, maturity assessment, failure modes and reusability assessment [20].
For a developer who wants to reuse components, reusability is one of the most important characteristics, and it is necessary to measure the reusability of components in order to recognize reusable components effectively. Classes must therefore be developed to be reusable in order to effectively reuse them later. Developers should be trained or facilitated to use reusable components such as classes, because it is hard to understand the structure of classes developed by others [24]. If developers do not have any prior knowledge about the coupling of the classes they want to reuse, they need to spend time understanding the association patterns of the classes. So there is a need to develop a mechanism that helps to know what combination of classes to reuse. By viewing class association rules and patterns, a developer can predict the required set of classes and can avoid unnecessary, partial class reuse. For reuse, issues like maintaining a class code repository, deciding what groups of classes should be incorporated into the repository and their association patterns, and identifying the exact set of classes to reuse need to be addressed; this will reduce reuse effort. Data mining can be used to discover the class association rules: using data mining technology, one can find the frequently used classes and their coupling patterns in a particular Java application.
Data mining is the process of extracting new and useful knowledge from large amounts of data. Mining is widely used to solve many business problems such as customer profiling, customer behavior modeling, product recommendation and fraud detection [25]. Data mining techniques can be used to analyze software engineering data to better understand software and to assist software engineering tasks; they also help in programming, defect detection, testing, debugging, maintenance, etc. In component reuse, mining helps in numerous ways, such as deciding which components should be reused, what the right way to reuse is, and which components may often be reused in combination [25]. The general approach of mining software engineering data consists of the following steps:
2 Related Works
For the object-oriented development paradigm, class coupling has been used as an important parameter affecting reusability. Li et al. [19], Yacoub et al. [18] and Arisholm et al. [2] proposed several coupling measures. Efforts have been made by researchers to measure reusability through the coupling and cohesion of components [5]. Gui et al. [6,7] and Choi et al. [4] provided reusability measures based on coupling and cohesion. The ISA methodology [8] has been proposed to identify data-cohesive subsystems. Gui et al. [10] proposed a new static measure of coupling to assess and rank the reusability of Java components. Arisholm et al. [2] provided a method for identifying the import coupled classes of each class at design time using UML
3 Proposed Methodology
The concepts of data mining have been used to discover patterns of reusable classes in a particular application; these patterns are further helpful in reusing the classes. Association rules between classes and the class coupling behaviour are used to identify the class reusability patterns. For this purpose, an association mining algorithm [1,11] is used to mine class association rules (CARs) from the class import coupling data, and the cosine similarity measure can be applied to the class import coupling data to characterize the class coupling behaviour. Our approach to mine class association rules and class coupling behavior consists of three steps:

1. Collection of class import coupling data through UML.
2. Representation of the collected data.
3. Mining of Class Association Rules (CARs) and prediction of class import coupling behavior.

The steps are described in Sections 3.1 to 3.3.
Dynamic analysis of a program is a precondition for finding the association rules
between classes. Dynamic analysis of programs can be done through UML diagrams
[27]. Significant advantages of using UML are its language independence and the
computation of dynamic metrics based on early design artifacts. Erik Arisholm [2] used
UML models to describe dynamic coupling measures as a way to collect, for each
class, its import coupled classes, with the following formula for the class
import coupling IC_OC(Ci):

IC_OC(c1) = |{(m1, c1, c2) | ((o1, c1) ∈ R_OC) ∧ ((o2, c2) ∈ R_OC) ∧ c1 ≠ c2 ∧ ((o1, m1), (o2, m2)) ∈ ME}|
IC_OC(Ci) counts the number of distinct classes that a method in a given object
uses. This formula can be used to measure the dependency of one class on other classes
in terms of its import coupling.
3.3 Mining of Class Association Rules and Prediction of Class Import Coupling
Behavior
1: i = 1
2: Create candidate class set CSi containing all classes and their support (the support
of a class Ci is the frequency of occurrence of that class in IC_SET(application)).
3: Create large class set Li by eliminating from CSi the class sets having support
sup < min_sup
4: Repeat
5:   i = i + 1
6:   Create candidate class set CSi as the Cartesian product of sets in Li-1 and
calculate their support from IC_SET(application).
7:   Create large set Li by eliminating from CSi the class sets having sup < min_sup
8: Until (no larger class set can be built)
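The two-phase procedure above follows the Apriori scheme of [1, 11]. A minimal Python
sketch of this first phase, under the assumption that IC_SET(application) is given as a
list of per-object sets of class names (an illustrative data layout, not the authors'
implementation):

    from itertools import combinations

    def frequent_class_sets(ic_set, min_sup):
        # ic_set: list of import-coupling transactions (sets of class names).
        def support(cs):
            return sum(1 for trans in ic_set if cs <= trans)
        items = {c for trans in ic_set for c in trans}
        large = [frozenset([c]) for c in items if support(frozenset([c])) >= min_sup]
        fccs = list(large)
        while large:
            size = len(next(iter(large))) + 1
            candidates = {a | b for a, b in combinations(large, 2) if len(a | b) == size}
            large = [cs for cs in candidates if support(cs) >= min_sup]
            fccs.extend(large)
        return fccs

For the MYSHAPES example discussed below, min_sup = 3 yields {circle, square, shape} as
the largest frequent class combination set.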
The class sets remaining in the CSi give the frequent class combination set (FCCS).
After this, in the second phase, the FCCS and a minimum confidence constraint min_conf
are used to form the CAR. The support and confidence values for each pair of classes in
the FCCS are calculated using formulas (1) and (2) [1,11,3]:

support(Ci → Cj) = (number of tuples containing both Ci and Cj) / (total number of tuples)   (1)

confidence(Ci → Cj) = (number of tuples containing both Ci and Cj) / (number of tuples containing Ci)   (2)
A cosine similarity value of 1 means that the coupling patterns of classes Ci and Cj
are identical, while 0 means they are completely different [16,26]. Using cosine
similarity, one can therefore analyze which classes have similar, nearly similar or
completely different coupling patterns. In the next section, we demonstrate our approach
of mining class association rules and measuring the class coupling behavior of a sample
application.
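A short sketch making formulas (1) and (2), together with the cosine measure of section
3.3.2, concrete; the transaction layout (sets of class names) and the 0/1 class vectors
are illustrative assumptions:

    import math

    def support(transactions, a, b):
        # Formula (1): fraction of tuples containing both classes a and b.
        both = sum(1 for t in transactions if a in t and b in t)
        return both / len(transactions)

    def confidence(transactions, a, b):
        # Formula (2): tuples with both a and b relative to tuples containing a.
        with_a = sum(1 for t in transactions if a in t)
        both = sum(1 for t in transactions if a in t and b in t)
        return both / with_a if with_a else 0.0

    def cos_sim(u, v):
        # Cosine similarity of two 0/1 class import-coupling vectors.
        dot = sum(p * q for p, q in zip(u, v))
        norm = math.sqrt(sum(p * p for p in u) * sum(q * q for q in v))
        return dot / norm if norm else 0.0

A rule Ci → Cj is kept when its support and confidence clear min_sup and min_conf;
Cos_Sim = 1 then flags classes that are always import coupled to the same classes.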
152 A. Parashar and J.K. Chhabra
As the first part of the third step of the methodology, the method given in section
3.3.1 is applied on IC_SET(MYSHAPES) to find the frequent class combination set (FCCS),
shown in figure 1. The output FCCS is then used to form class association rules (CAR)
having min_conf ≥ 90%. Table 1 lists the CAR with confidence greater than 90% for the
application MYSHAPES.
To find the behavior of each class in terms of its class coupling pattern, we use the
class vector representation (C_V) of MYSHAPES (table 2) and compute the cosine
similarity measure between classes as described in section 3.3.2. Table 3 shows the
computed Cos_Sim(Class1, Class2) values.
We can measure the reusability pattern of classes by analyzing their association rules
and import coupling patterns. The CARs of the application MYSHAPES (figure 2) suggest
that whenever a class on the left-hand side of a rule is to be reused, there is a strong
probability, with 100% confidence, that the classes on the right side of the rule will
also be reused. From figure 2 it is observed that whenever the class square is reused,
the class shape will also be reused. From figure 3 it is observed that the cosine
similarity between the classes circle and shape is 1 and between myshape and square is
0.71. This suggests that the import coupling behaviors of the classes circle and shape
are exactly similar, i.e. they are always used together, while the classes myshape and
square are sometimes import coupled to some common classes.
Our study shows that the FCCS, the CARs and the Cos_Sim between classes can help a
repository designer/user predict which classes need to be reused in combination and what
the coupling pattern of the classes is. The effectiveness of class association rules
depends on the type of coupling attributes used to capture the import coupling between
classes, the way the coupling data is represented, and the accuracy of the association
mining algorithm applied to it.
Fig. 1. Generation of the Frequent Class Combination Set (FCCS) for MYSHAPES:
L1 = {circle (sup 3), square (sup 4), shape (sup 4)}; CS2 = L2 = {circle, square} (sup 3),
{circle, shape} (sup 3), {square, shape} (sup 3); CS3 = {circle, square, shape} (sup 3),
which is the FCCS

Fig. 2. CAR and their Support (rules square→shape, shape→square, circle→shape,
circle→square). Fig. 3. Cosine Similarities between Classes
6 Conclusions
In this paper, an attempt has been made to determine class reusability patterns from
dynamically collected class import coupling data of a Java application. Our initial study
indicates that the basic technique of market basket analysis (Apriori) and the cosine
similarity measure can be used constructively to find class association rules (CARs) and
class import coupling behaviour. Currently, we have deduced CARs for a sample Java
application; however, the approach can also be applied to larger Java applications.
Moreover, other association mining and clustering algorithms can be explored on class
coupling data for finding class reusability patterns.
References
1. Agrawal, R., Imielinski, T., Swami, A.: Mining Association Rules between Sets of Items
in Large Databases. In: ACM, SIGMOD, pp. 207–216 (1993)
2. Arisholm, E.: Dynamic Coupling Measurement for Object-Oriented Software. IEEE
Transactions on Software Engineering 30(8), 491–506 (2004)
3. Negandhi, G.: Apriori Algorithm Review for Finals, http://www.cs.sjsu.edu
4. Choi, M., Lee, J.: A Dynamic Coupling for Reusable and Efficient Software System. In:
5th IEEE International Conference on Software Engineering Research, Management and
Applications, pp. 720–726 (2007)
5. Mitchell, A., Power, F.: Using Object Level Run Time Metrics to Study Coupling Between
Objects. In: ACM Symposium on Applied Computing, pp. 1456–1462 (2005)
6. Gui, G., Scott, P.D.: Coupling and Cohesion Measures for Evaluation of Component Reus-
ability. In: ACM International Workshop on Mining Software Repository, pp. 18–21
(2006)
7. Taha, W., Crosby, S., Swadi, K.: A New Approach to Data Mining for Software Design.
In: 3rd International Conference on Computer Science, Software Engineering, Information
Technology, e-Business, and Applications (2004)
8. Montes, C., Carver, D.L.: Identification of Data Cohesive Subsystems Using Data Mining
Techniques. In: IEEE International Conference on Software Maintenance, pp. 16–23
(1998)
9. Xie, T., Acharya, M., Thummalapenta, S., Taneja, K.: Improving Software Reliability and
Productivity via Mining Program Source Code. In: IEEE International Symposium on Par-
allel and Distributed Processing, pp. 1–5 (2008)
10. Gui, G., Scott, P.D.: Ranking reusability of software components using coupling metrics.
Elsevier Journal of Systems and Software 80, 1450–1459 (2007)
11. Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules. In: 20th Interna-
tional Conference on Very Large Data Bases, pp. 487–499 (1994)
12. Thabtah, F.A., Cowling, P.I.: A greedy classification algorithm based on association rule.
Elsevier journal of Applied Soft Computing 07, 1102–1111 (2007)
13. Zemirline, A., Lecornu, L., Solaiman, B., Ech-Cherif, A.: An Efficient Association Rule
Mining Algorithm for Classification. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A.,
Zurada, J.M. (eds.) ICAISC 2008. LNCS (LNAI), vol. 5097, pp. 717–728. Springer, Hei-
delberg (2008)
14. Li, W., Han, J., Pei, J.: CMAR: Accurate and Efficient Classification Based on Multiple
Class-Association Rules. In: International Conference on Data Mining, pp. 369–376
(2001)
15. Yin, X., Han, J.: CPAR: Classification based on Predictive Association Rules. In:
SIAM International Conference on Data Mining (2003)
16. Cosine Similarity Measure,
http://www.appliedsoftwaredesign.com/cosineSimilarityCalculator.php
17. Lee, Y., Chang, K.H.: Reusability and Maintainability Metrics for Object-Oriented
Software. In: ACM 38th Annual Southeast Regional Conference, pp. 88–94 (2000)
18. Yacoub, S., Ammar, H., Robinson, T.: Dynamic Metrics for Object-Oriented Designs. In:
IEEE 6th International Symposium Software Metrics, pp. 50–61 (1999)
19. Li, W., Henry, S.: Object Oriented Metrics that Predict Maintainability. Technical
Report, Virginia Polytechnic Institute and State University (1993)
20. Shiva, S.J., Shala, L.A.: Software Reuse: Research and Practice. In: Proceedings of the
IEEE International Conference on Information Technology, pp. 603–609 (2007)
21. Bhatia, P.K., Mann, R.: An Approach to Measure Software Reusability of OO Design. In:
Proceedings of the 2nd National Conference on Challenges & Opportunities in Information
Technology, pp. 26–30 (2008)
22. Eickhoff, F., Ellis, J., Demurjian, S., Needham, D.: A Reuse Definition, Assessment, and
Analysis Framework for UML. In: International Conference on Software Engineering
(2003),
http://www.engr.uconn.edu/~steve/Cse298300/eickhofficse2003submit.pdf
23. Caldiera, G., Basili, V.R.: Identifying and Qualifying Reusable Software Components.
IEEE Journal of Computer 24(2), 61–70 (1991)
24. Henry, S., Lattanzi, M.: Measurement of Software Maintainability and Reusability in the
Object Oriented Paradigm. In: ACM Technical Report (1994)
25. Xie, T., Pei, J.: Data Mining for Software Engineering,
http://ase.csc.ncsu.edu/dmse/dmse.pdf
26. Cosine Similarity, http://en.wikipedia.org/wiki/Cosine_similarity
27. Gupta, V., Chhabra, J.K.: Measurement of Dynamic Metrics Using Dynamic Analysis of
Programs. In: Proceedings of the Applied Computing Conference, pp. 81–86 (2008)
28. Michail, A.: Data Mining Library Reuse Patterns in User-Selected Applications. In: 14th
IEEE International Conference on Automated Software Engineering, pp. 24–33 (1999)
29. Association Rules, http://en.wikipedia.org/wiki/Association_rule_learning
30. Jaccard Index, http://en.wikipedia.org/wiki/Jaccard_index
An Algorithm of Constraint Frequent Neighboring
Class Sets Mining Based on Separating Support Items
Abstract. Present constraint frequent neighboring class set mining algorithms
need to generate candidate frequent neighboring class sets and involve a lot of
repeated computing. This paper therefore proposes an algorithm for mining
constraint frequent neighboring class sets based on separating support items,
which is suitable for mining frequent neighboring class sets with a constraint
class set in a large spatial database. The algorithm uses the method of
separating support items to obtain the support of neighboring class sets, and
uses upward search to extract frequent neighboring class sets with the
constraint class set. In the course of mining frequent neighboring class sets,
the algorithm only needs to scan the database once, and it need not generate
candidate frequent neighboring class sets with the constraint class set. These
methods reduce repeated computing and improve mining efficiency. Experimental
results indicate that the algorithm is faster and more efficient than present
mining algorithms when extracting frequent neighboring class sets with a
constraint class set in a large spatial database.
Keywords: neighboring class set; constraint class set; separating support items;
up search; spatial data mining.
1 Introduction
Geographic information databases are an important and typical kind of spatial database;
mining spatial association rules from them is an important part of spatial data mining
and knowledge discovery, also known as spatial co-location pattern mining [1]. Spatial
co-location patterns are implicit rules expressing the structure and association of
spatial objects in geographic information databases, as well as the hierarchy and
correlation of different subsets of spatial associations or spatial data [2]. At
present, there are mainly three kinds of methods for mining spatial association rules
[3]: layer covering based on clustering [3], mining based on spatial transactions
[2, 4, 5, 6] and mining based on non-spatial transactions [3]. The first two methods can
be used to extract frequent neighboring class sets [4, 5, 6], but AMFNCS [4] and TDA [5]
are not able to efficiently extract frequent neighboring
class sets with a constraint class set, and MFNCSWCC [6] needs to generate many
candidates and performs a lot of repeated computing when it uses iterative search to
generate frequent neighboring class sets with the constraint class set. Hence, this
paper proposes an algorithm for mining constraint frequent neighboring class sets based
on separating support items, denoted CMBSSI, which need not generate candidates when
mining frequent neighboring class sets with a constraint class set.
In order not to generate candidates when mining frequent neighboring class sets, the
algorithm introduces the method of separating support items. The method separates all
items or itemsets from a spatial transaction to compute support, namely, extracting all
itemsets supported by a spatial transaction. Let C = {C1, C2, ..., Cm} be a spatial
class set and regard NCS = {Ct1, Ct2, ..., Ctk} as a transaction; the method is
expressed as follows:
Step 1. According to definitions 5 and 6, compute the Neighboring Class Set Vector
NCSV = (2^(t1−1), 2^(t2−1), ..., 2^(tk−1)).
Step 2. Let every itemset supported by the NCS be a new neighboring class set, and
compute the index interval [1, 2^k − 1]; this interval is used to generate the
Neighboring Class Set Identification NCSI_x of these new neighboring class sets.
Step 3. Compute NCSI_x = B_x · NCSV^T, x ∈ [1, 2^k − 1], where the components of the
vector B_x are the k bits of the binary representation of the integer x.
Example. C = {U, V, W, X, Y, Z} is a spatial class set and NCS = {V, X, Y}. We use
the method of separating support items to extract all itemsets supported by the NCS.
Step 1. We compute the Neighboring Class Set Vector NCSV = (2^(2−1), 2^(4−1), 2^(5−1))
= (2, 8, 16).
Step 2. We compute the index interval [1, 2^3 − 1], namely [1, 7].
Step 3. We extract all itemsets supported by the NCS as follows:
NCSI1 = B1 · NCSV^T = (0, 0, 1) · (2, 8, 16)^T = 16, corresponding to NCS1 = {Y}.
NCSI2 = B2 · NCSV^T = (0, 1, 0) · (2, 8, 16)^T = 8, corresponding to NCS2 = {X}.
NCSI3 = B3 · NCSV^T = (0, 1, 1) · (2, 8, 16)^T = 24, corresponding to NCS3 = {X, Y}.
NCSI4 = B4 · NCSV^T = (1, 0, 0) · (2, 8, 16)^T = 2, corresponding to NCS4 = {V}.
NCSI5 = B5 · NCSV^T = (1, 0, 1) · (2, 8, 16)^T = 18, corresponding to NCS5 = {V, Y}.
NCSI6 = B6 · NCSV^T = (1, 1, 0) · (2, 8, 16)^T = 10, corresponding to NCS6 = {V, X}.
NCSI7 = B7 · NCSV^T = (1, 1, 1) · (2, 8, 16)^T = 26, corresponding to NCS7 = {V, X, Y}.
Obviously, with this method we obtain all NCSk supported by NCS = {V, X, Y}, and can
thus update the support of all of them in a single pass over the transaction.
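A minimal Python sketch of the three steps, with classes encoded by their 1-based
positions t1, ..., tk in C (the function name is illustrative; positions (2, 4, 5)
encode {V, X, Y} of the example):

    def separate_support_items(ncs_positions):
        ncsv = [2 ** (t - 1) for t in ncs_positions]            # Step 1: NCSV
        k = len(ncsv)
        result = {}
        for x in range(1, 2 ** k):                              # Step 2: x in [1, 2^k - 1]
            bits = [(x >> (k - 1 - j)) & 1 for j in range(k)]   # B_x: k-bit binary of x
            ncsi = sum(b * w for b, w in zip(bits, ncsv))       # Step 3: NCSI_x = B_x . NCSV^T
            result[ncsi] = [t for b, t in zip(bits, ncs_positions) if b]
        return result

    print(separate_support_items((2, 4, 5)))
    # {16: [5], 8: [4], 24: [4, 5], 2: [2], 18: [2, 5], 10: [2, 4], 26: [2, 4, 5]}

Each transaction is thus scanned once, and the support counters of all 2^k − 1 itemsets
it supports can be incremented directly at their NCSI indices.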
Let C = {C1, C2, ..., Cm} be a spatial class set and I = {i1, i2, ..., in} an instance
set; let nk (with n = Σ nk) be the number of instances of Ck, and let l be the length of
the constraint class set.
Time complexity. The computation of CMBSSI mainly consists of three parts: computing
right instances, separating support items of the NCS, and searching frequent NCS. The
time complexity is

(2^(m−l) − 1)[n² C(m, 2) / m² + 2^(m−l−1) − 1].
[Experimental figures: runtime (ms) of MFNCSWCC and CMBSSI plotted against the support
threshold (8.15% down to 0.08%) and against the length of the constraint class set
(3 to 10).]
5 Conclusion
This paper proposes an algorithm for mining constraint frequent neighboring class sets
based on separating support items, which is suitable for mining frequent neighboring
class sets with a constraint class set in a large spatial database. In the future, we
will further discuss how to improve the space utilization ratio.
Acknowledgments. This work was fully supported by science and technology re-
search projects of Chongqing Education Commission (Project No. KJ091108), and it
was also supported by science and technology research projects of Wanzhou District
Science and Technology Committee (Project No. 2010-23-01) and Chongqing Three
Gorges University (Project No. 10QN-22, 24 and 30).
References
1. Ma, R.H., Pu, Y.X., Ma, X.D.: GIS Spatial Association Pattern Mining. Science Press,
Beijing (2007)
2. Ma, R.H., He, Z.Y.: Mining Complete and Correct Frequent Neighboring Class Sets from
Spatial Databases. Journal of Geomatics and Information Science of Wuhan Univer-
sity 32(2), 112–114 (2007)
3. Zhang, X.W., Su, F.Z., Shi, Y.S., Zhang, D.D.: Research on Progress of Spatial Association
Rule Mining. Journal of Progress in Geography 26(6), 119–128 (2007)
4. Fang, G.: An algorithm of alternately mining frequent neighboring class set. In: Tan, Y.,
Shi, Y., Tan, K.C. (eds.) ICSI 2010. LNCS, vol. 6146, pp. 588–593. Springer, Heidelberg
(2010)
5. Fang, G., Tu, C.S., Xiong, J., et al.: The Application of a Top-Down Algorithm in Neighbor-
ing Class Set Mining. In: International Conference on Intelligent Systems and Knowledge
Engineering, pp. 234–237. IEEE press, Los Alamitos (2010)
6. Fang, G., Xiong, J., Chen, X.F.: Frequent Neighboring Class Set Mining with Constraint
Condition. In: International Conference on Progress in Informatics and Computing, pp. 242–
245. IEEE press, Los Alamitos (2010)
A Multi-period Stochastic Production Planning
and Sourcing Problem with Discrete Demand
Distribution
1 Introduction
dealt with a stochastic production planning problem with a service level require-
ment, and provided non-sequential and deterministic equivalent formulations of
the model; Zäpfel [7] claimed that MRP II systems can be inadequate for the
solution of production planning problems with uncertain demand because of
the insufficiently supported aggregation process, and proposed a procedure to
generate an aggregate plan and a consistent disaggregate plan for the master
production schedule, and Kelly et al. [8] considered randomness in demand for a
single-product, single-machine line with setups in the process industry, and pro-
posed a model that incorporates mean and standard deviation of demand in the
planning horizon time periods to set production runs. Though only one product
was being made, start-ups after periods of idleness required significant setups.
On the basis of fuzzy theory, the production planning problems have also been
studied in fuzzy community. In this respect, the interested reader may refer to
Lan et al. [9,10], and Sun et al. [11,12].
The purpose of this paper is to study a realistic production planning model.
We consider production, setup, and inventory carrying costs and minimum ser-
vice level constraints at each time period, in which the demands are stochastic
with known probability distributions. Most of the stochastic production plan-
ning models in the literature may formulate the model to minimize the expected
sum of all costs [6,8,13]. In the current development, we minimize the prob-
ability that the total cost exceeds a predetermined maximum allowable cost,
where the total cost includes the sum of the inventory holding, setup and pro-
duction costs in the planning horizon. For general demand distributions, the
proposed problem is very complex, so we cannot solve it by conventional opti-
mization methods. To avoid this difficulty, we assume the demands have finite
discrete distributions, and derive the crisp equivalent forms of both probabilistic
objective function and the probability level constraints. As a consequence, the
proposed production planning problem is turned into its equivalent integer pro-
gramming problem. Since there is no “one-size-fits-all” solution that is effective
for all integer programming problems, we adopt the branch and bound method
to solve our equivalent integer production planning problem.
The rest of this paper is organized as follows. In Section 2, we formulate a
new class of stochastic production planning models with probability objective
subject to service level constraints. In Section 3, we assume the demands have
finite discrete probability distributions, and deal with the equivalent formulation
of original stochastic production planning problem. Section 4 is devoted to the
discussion of the branch and bound solution method for the equivalent integer
production planning problem. Section 5 performs some numerical experiments
via one 3-product source, 8-period production planning problem to demonstrate
the developed modeling idea. Finally, we draw our conclusions in Section 6.
2 Formulation of Problem
In this section, we will develop a new class of stochastic minimum risk pro-
gramming models for a multi-period production planning and sourcing problem.
Assume that there is a single product and N types of production sources (plants
and subcontractors). The demand for this specific product in each period is
characterized by a random variable with known probability distribution.
The costs in the objective function consist of production cost, inventory holding cost
and setup cost. The objective of the problem is to minimize the probability that the
total cost exceeds a predetermined maximum allowable cost.
Constraints on the performance of the system (related to backorders) are imposed by
requiring service levels, which force the probability of having no stockout to be
greater than or equal to a service level requirement in each period.
In addition, we adopt the following notation to model our production planning
problem: i is the index of sources, i = 1, 2, . . . , N ; t is the index of periods,
t = 1, 2, . . . , T ; cit is the unit cost of production at source i in period t; ht is the
unit cost of inventories in period t; I0 is the initial inventory; It is the inventory
level at the end of period t; sit is the fixed cost of setup at source i in period t;
yit is 1 if a setup is performed at source i in period t, and 0 otherwise; Mit is
the capacity limitation of source i at time period t; dt is the stochastic demand
in period t; αt is the service level requirement in period t; ϕ is the maximum
allowable cost, and xit is the production quantities at source i in period t.
Using the notation above, a minimum-risk stochastic production planning
model with probability service levels is formally built as
⎧ T N
⎪
⎪ min Pr{ t=1 (ht (It )+ + i=1 (sit yit + cit xit )) > ϕ}
⎨
s.t.: Pr{It ≥ 0} ≥ αt , t = 1, 2, . . . , T
(1)
⎪
⎪ xit ≤ Mit yit , i = 1, 2, . . . , N, t = 1, 2, . . . , T
⎩
xit ∈ Z+ n
, yit ∈ {0, 1}, i = 1, 2, . . . , N, t = 1, 2, . . . , T,
where (I_t)^+ = max{0, I_t}, t = 1, 2, ..., T, are the real inventory levels. For each
period t the inventory balance is

I_t = I_{t−1} + Σ_{i=1}^N x_it − d_t = I_0 + Σ_{τ=1}^t Σ_{i=1}^N x_iτ − Σ_{τ=1}^t d_τ,   (2)

where d̂^k = (d̂^k_1, d̂^k_2, ..., d̂^k_T) is the kth realization of the demand d during
the T periods, with p_k > 0 for all k and Σ_{k=1}^K p_k = 1.
In this case, the tth probability level constraint

Pr{ I_0 + Σ_{τ=1}^t Σ_{i=1}^N x_iτ − Σ_{τ=1}^t d_τ ≥ 0 } ≥ α_t   (4)

is equivalent to the crisp constraint

I_0 + Σ_{τ=1}^t Σ_{i=1}^N x_iτ ≥ Q^−_t(α_t),   (5)

where Q^−_t(α_t) is the left end-point of the closed interval of α_t-quantiles of the
probability distribution of the cumulative random demand Σ_{τ=1}^t d_τ.
Furthermore, we define a binary vector z whose components z_k, k = 1, ..., K, take the
value 1 if the corresponding set of constraints has to be satisfied and 0 otherwise. In
particular, for each scenario k, we may introduce a number M large enough that the
following inequality holds:

Σ_{t=1}^T h_t ( I_0 + Σ_{τ=1}^t Σ_{i=1}^N x_iτ − Σ_{τ=1}^t d̂^k_τ )^+ − M z_k ≤ φ − Σ_{i=1}^N Σ_{t=1}^T (s_it y_it + c_it x_it).   (6)
l^k_t = ( I_0 + Σ_{τ=1}^t Σ_{i=1}^N x_iτ − Σ_{τ=1}^t d̂^k_τ )^+.   (9)
Σ_{t=1}^T h_t ( I_0 + Σ_{τ=1}^t Σ_{i=1}^N x̂_iτ − Σ_{τ=1}^t d̂^k_τ )^+ − M ẑ_k ≤ Σ_{t=1}^T h_t l̂^k_t − M ẑ_k.   (10)
From the reformulation (7) of the production planning problem, we can see that even for
a small size of the random vector, the number K of scenarios can be very large. In
addition, problem (8) contains integer and binary decision variables; thus, problem (8)
belongs to the class of NP-hard problems. In the next section, we discuss the solution
of (8) by general-purpose optimization software.
4 Solution Method
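The scenario reformulation, with the auxiliary variables l^k_t of (9) and the big-M
constraints (6), is a standard mixed-integer program that any general-purpose
branch-and-bound code can process (Section 5 reports results obtained with Lingo 8.0).
As an illustration only, here is a minimal sketch using the open-source Python modeller
PuLP and its bundled CBC branch-and-bound solver; the tiny instance (one source, two
periods, two scenarios) and every parameter value are assumptions for demonstration, not
the data of Section 5:

    import pulp

    T, K, phi, M = 2, 2, 80.0, 1e5
    h, s, c, cap = [1.0, 1.0], [5.0, 5.0], [2.0, 2.0], [40, 40]
    d = [[10, 12], [10, 20]]          # d[k][t]: demand in scenario k, period t
    p = [0.5, 0.5]
    alpha = [0.9, 0.9]

    def left_quantile(vals, probs, a):            # smallest v with P(X <= v) >= a
        cum, pairs = 0.0, sorted(zip(vals, probs))
        for v, q in pairs:
            cum += q
            if cum >= a:
                return v

    prob = pulp.LpProblem("min_risk_plan", pulp.LpMinimize)
    x = [pulp.LpVariable(f"x_{t}", lowBound=0, cat="Integer") for t in range(T)]
    y = [pulp.LpVariable(f"y_{t}", cat="Binary") for t in range(T)]
    z = [pulp.LpVariable(f"z_{k}", cat="Binary") for k in range(K)]
    l = [[pulp.LpVariable(f"l_{t}_{k}", lowBound=0) for k in range(K)] for t in range(T)]

    prob += pulp.lpSum(p[k] * z[k] for k in range(K))          # objective: risk

    for t in range(T):
        prob += x[t] <= cap[t] * y[t]                          # capacity/setup link
        q_t = left_quantile([sum(d[k][:t + 1]) for k in range(K)], p, alpha[t])
        prob += pulp.lpSum(x[:t + 1]) >= q_t                   # crisp service level, as (5)
        for k in range(K):                                     # l bounds inventory, as (9)
            prob += l[t][k] >= pulp.lpSum(x[:t + 1]) - sum(d[k][:t + 1])

    for k in range(K):                                         # big-M constraint, as (6)
        prob += (pulp.lpSum(h[t] * l[t][k] + s[t] * y[t] + c[t] * x[t] for t in range(T))
                 - M * z[k] <= phi)

    prob.solve(pulp.PULP_CBC_CMD(msg=0))
    print(pulp.value(prob.objective), [pulp.value(v) for v in x])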
5 Numerical Experiments
In this section, we perform numerical experiments on the following example. A
manufacturer supplies products to a retailer; the manufacturer has three product
sources, N = 3, and eight production periods, T = 8. Each plant and subcontractor has a
different setup cost, production capacity and unit production cost. Suppose s_it, M_it,
c_it, h_t, α_t and φ are all predetermined by the actual situation. The manufacturer has
to meet the demands according to the service level requirements set by its customers.
Table 1. The data of the numerical example

s_it       Periods:   1     2     3     4     5     6     7     8
source 1             1500  1450  2000  1600  1200  1250  2200  1800
source 2             1200  1280  1300  1850  1600  1650  1480  2000
source 3             2500  2000  1880  1600  1980  1500  1660  1750

M_it       Periods:   1     2     3     4     5     6     7     8
source 1             5000  4000  4500  4500  4500  4800  5000  5000
source 2             6000  5500  5500  4500  4800  3800  4000  4000
source 3             6500  6500  5500  4000  4000  3800  3800  3500

c_it       Periods:   1     2     3     4     5     6     7     8
source 1             2     3     2.5   2.5   3.5   2.5   2.5   2.5
source 2             2.5   3     3     4     4.5   1.6   3     1.8
source 3             3     3.5   2     2.5   2.2   2.8   5     3.5

d_t        Periods:   1     2     3     4     5     6     7     8
value 1              3800  3760  4800  4500  4890  3200  3450  3990
p                    0.4   0.3   0.5   0.45  0.35  0.6   0.55  0.2
value 2              3290  4300  5200  5000  6100  5740  4880  4100
p                    0.6   0.7   0.5   0.55  0.65  0.4   0.45  0.8

           Periods:   1     2     3     4     5     6     7     8
h_t                  4     5     5.5   4     4.5   3     3.5   6
α_t                  0.95  0.8   0.9   0.92  0.88  0.9   0.92  0.95
Table 2. The optimal production quantities

x_it       Periods:   1     2     3     4     5     6     7     8
source 1             3800  0     0     4500  2100  1830  4590  0
source 2             0     4100  0     0     0     3800  0     4000
source 3             0     0     5500  0     4000  0     0     0
Let us assume that the demand d_t has a finite integer discrete distribution, which is
meaningful when the products are indivisible. We assume, for the sake of simplicity,
that the initial inventory level is 0, I_0 = 0, and the data used for this test are
collected in Table 1.
Since T = 8 and each period demand has two realizations, one has K = 256 scenarios. Let
φ = 1.8 × 10^5 and M = 10^6. We employ Lingo 8.0 to solve the equivalent production
planning problem (8). The obtained optimal solution of the production planning problem
is reported in Table 2, and the corresponding optimal value is 0.1412070.
From Table 2 we get the production quantities at each source in each period. The
production quantity is nonzero where the binary variable y_it = 1. From the numerical
experiment, we can see that even for a small size of the random vector, the number K of
scenarios can be very large, and because of the auxiliary variables, the scale of this
numerical example is also rather large. Furthermore, more numerical experiments for this
example have been performed with different values of the parameter φ. Figure 1 shows how
the optimal objective value varies with the predetermined maximum allowable cost φ.
Lower values of φ allow a bigger probability that the total cost exceeds the maximum
allowable cost. Nevertheless, the choice of φ is up to the capability of the decision
maker. In real life, a manufacturer who has a lower acceptable cost may suffer higher
risk than one who has a higher acceptable cost. So the manufacturer should make a
decision according to the relationship between the acceptable maximum cost and the
suffered risk.
6 Conclusions
When optimal production decisions must be reached in a stochastic environment, giving
the optimization problem its appropriate form requires a deeper probing of the
aspiration criteria in the formulation of the decision model. In addition, computational
obstacles must be overcome to find optimal production decisions. In these two respects,
the major new contributions of the current development are as follows.
(i) On the basis of minimum risk criteria, we have presented a new class of
stochastic production planning problem with probability objective subject
to service level constraints, in which product demands are characterized by
random variables. In addition, a manufacturer has a number of plants and
subcontractors and has to meet the product demands according to various
service levels prescribed by its customers.
Fig. 1. The optimal objective value (risk) versus the predetermined maximum allowable
cost φ (cost axis ×10^5)
(ii) For general demand distributions, the developed stochastic production planning
problem (1) is very complex, so we cannot solve it by conventional optimization methods.
We therefore assumed the demands have finite discrete distributions, and derived the
crisp equivalent forms of both the probability objective function and the probabilistic
level constraints. As a consequence, we turn the original production planning problem
(1) into its equivalent integer programming model (7) so that the branch-and-bound
method can be used to solve it. The equivalent alternative formulation of the integer
production planning problem (7) has also been discussed (see Proposition 1).
(iii) To demonstrate the developed modeling idea, a number of numerical experiments have
been performed on one numerical example with three product sources and eight production
periods. By changing the value of the parameter φ, we get the trade-off between an
acceptable maximum cost and the suffered risk (see Figure 1). This relationship can
serve as guidance for investment, which is meaningful in real production processing.
Acknowledgments
This work was supported by the National Natural Science Foundation of China
under Grant No.60974134, the Natural Science Foundation of Hebei Province
under Grant No.A2011201007, and the Education Department of Hebei Province
under Grant No.2010109.
References
1. Candea, D., Hax, A.C.: Production and Inventory Management. Prentice-Hall,
New Jersey (1984)
2. Das, S.K., Subhash, C.S.: Integrated Approach to Solving the Master Aggregate
Scheduling Problem. Int. J. Prod. Econ. 32(2), 167–178 (1994)
3. Dzielinski, B.P., Gomory, R.E.: Optimal Programming of Lot Sizes, Inventory and
Labor Allocations. Manag. Sci. 11, 874–890 (1965)
4. Florian, M., Klein, M.: Deterministic Production Planning with Concave Costs and
Capacity Constraints. Manag. Sci. 18, 12–20 (1971)
5. Lasdon, L.S., Terjung, R.C.: An Efficient Algorithm for Multi-Echelon Scheduling.
Oper. Res. 19, 946–969 (1971)
6. Bitran, G.R., Yanasse, H.H.: Deterministic Approximations to Stochastic Production
Problems. Oper. Res. 32(5), 999–1018 (1984)
7. Zäpfel, G.: Production Planning in the Case of Uncertain Individual Demand Extension
for an MRP II Concept. Int. J. Prod. Econ. 119, 153–164 (1996)
8. Kelly, P., Clendenen, G., Dardeau, P.: Economic Lot Scheduling Heuristic for Ran-
dom Demand. Int. J. Prod. Econ. 35(1-3), 337–342 (1994)
9. Lan, Y., Liu, Y., Sun, G.: Modeling Fuzzy Multi-Period Production Planning and
Sourcing Problem with Credibility Service Levels. J. Comput. Appl. Math. 231(1),
208–221 (2009)
10. Lan, Y., Liu, Y., Sun, G.: An Approximation-Based Approach for Fuzzy Multi-
Period Production Planning Problem with Credibility Objective. Appl. Math.
Model. 34(11), 3202–3215 (2010)
11. Sun, G., Liu, Y., Lan, Y.: Optimizing Material Procurement Planning Problem by
Two-Stage Fuzzy Programming. Comput. Ind. Eng. 58(1), 97–107 (2010)
12. Sun, G., Liu, Y., Lan, Y.: Fuzzy Two-Stage Material Procurement Planning Prob-
lem. J. Intell. Manuf. 22(2), 319–331 (2011)
13. Yıldırım, I., Tan, B., Karaesmen, F.: A Multiperiod Stochastic Production Plan-
ning and Sourcing Problem with Service Level Constraints. OR Spectrum 27(2-3),
471–489 (2005)
14. Nemhauser, G.L., Wolsey, L.A.: Integer and Combinatorial Optimization. John
Wiley & Sons, New York (1988)
15. Wolsey, L.A.: Integer Programming. John Wiley & Sons, New York (1998)
Exploration of Rough Sets Analysis in Real-World
Examination Timetabling Problem Instances
J. Joshua Thomas, Ahamad Tajudin Khader, Bahari Belaton, and Amy Leow
School of Computer Sciences, Universiti Sains Malaysia & KDU College Penang
[email protected], {tajudin,bahari}@cs.usm.my
1 Introduction
3 Dataset
Many researchers use the benchmark Carter dataset [16] to apply their methods and test
whether the results are quality feasible solutions. There are two standard datasets used
by the examination timetabling community: the Carter dataset and the ITC (International
Timetabling Competition) dataset [14]; the scientific community uses them to test any
proposed algorithm. The Carter dataset was introduced in 1996 by Carter, Laporte and Lee
in a paper published in the Journal of the Operational Research Society. One of the
major drawbacks of most articles in the timetabling literature is that testing is
limited to randomly generated problems and perhaps one practical example.
The formulation of the Carter dataset omits the following:
─ The room capacities of the examination rooms (which is why it is considered an
uncapacitated problem).
─ The fact that two consecutive examinations on different days are better than two
consecutive examinations on the same day.
Both of these scenarios would give the same penalty cost using the usual objective
function for the Carter dataset, even though in the first case the student would have
the evening (indeed, all night) to revise, as opposed to no time at all if the
examinations were truly consecutive. Indeed, each instance in the dataset just has a
number of timeslots; there is no concept of different days.
The recent examination timetabling review paper [13] explains the two versions of the
dataset and their modifications. However, the contributions of such works concern not
the data values but the problem instances. Little work has been done on modifying them
with respect to real-world scenarios provided by institutions. Table 1 shows the Carter
dataset with the problem instances.
4 Pre-processing
In real-world examination timetabling, many decisions must take into account several
factors simultaneously under various sets of (soft) constraints. Usually it is not known
which parameter(s) need to be emphasized more in order to generate a better solution or
decision; in many cases there is a trade-off between potentially conflicting criteria
for assigning exams to timeslots. In rough sets, a dataset is usually represented as a
table, where each row represents an object and every column represents an attribute that
can be evaluated for each object. This table is called an information system. The
following ordering criteria were considered when selecting which exam should be
scheduled first:
─ Number of conflicting exams, largest degree (LD)
─ Number of students enrolled, largest enrollment (LE)
─ Number of available slots, saturation degree (SD)
In each case, two out of the three criteria above were selected as input variables.
More formally, an information system is a pair (U, A), where U is a non-empty finite set
of objects called the universe and A is a non-empty finite set of attributes such that
a: U → V_a for every a ∈ A. The set V_a is called the value set of a.
For any subset of attributes B ⊆ A, the relation

IND(B) = {(x, x′) ∈ U² | ∀a ∈ B, a(x) = a(x′)}   (1)

is called the indiscernibility relation.
For instance, Table 2 defines an indiscernibility relation. The subsets of the
conditional attributes are [Course] and [Enrollment]. If we consider [No of students
Enrollment (LE)] only, objects x4 and x5 belong to the same equivalence class and are
indiscernible. The relation defines the equivalence classes identified below:
{{x1}, {x2}, {x3}, {x4}, {x5}, {x6}, {x7}, {x8}, {x9}, {x10}, {x11}, {x12}}
{{x4, x5}, {x10, x12}}
{{x1, x2}, {x3, x6, x7, x10, x11, x12}, {x4, x5}, {x8, x9}}
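A minimal Python sketch of computing the equivalence classes of IND(B) from an
information table; the table layout and attribute names are illustrative assumptions:

    from collections import defaultdict

    def equivalence_classes(table, attrs):
        # Group objects whose values agree on every attribute in attrs (IND(B)).
        classes = defaultdict(list)
        for obj, row in table.items():
            classes[tuple(row[a] for a in attrs)].append(obj)
        return list(classes.values())

    table = {"x4": {"Course": "c4", "LE": 120}, "x5": {"Course": "c5", "LE": 120}}
    print(equivalence_classes(table, ["LE"]))   # [['x4', 'x5']]: indiscernible on LE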
The rough set approach requires only indiscernibility: it is not necessary to define an
order or a distance when values of different kinds are combined (e.g. courses,
enrollment). The discretization step determines how roughly the data are processed; we
call this step "pre-processing". For instance, course or enrollment values require
establishing cut-off points. The intervals may be refined given good domain knowledge.
Setting the cut-off points is computationally expensive for large datasets, and it is
costly for a domain expert to prepare the discretization manually.
Let Λ be an information system with n objects. The discernibility matrix of Λ is a
symmetric n × n matrix with entries

m(i, j) = {a ∈ A | a(x_i) ≠ a(x_j)}  for i, j = 1, ..., n.   (2)

Each entry thus consists of the set of attributes upon which objects x_i and x_j differ.
The discernibility function f for an information system Λ is a Boolean function of m
Boolean variables a*_1, ..., a*_m defined by

f(a*_1, ..., a*_m) = ∧ { ∨ m(i, j)* | 1 ≤ j < i ≤ n, m(i, j) ≠ ∅ },   (3)

where m(i, j)* = {a* | a ∈ m(i, j)}.
Input: Information table (T) created from the dataset value columns, and n, the number
of intervals for each column value.
Output: Information table (DT) with discretized real value columns.
1. For each real value column v do
2.   Define Boolean variables B = {b_1, ..., b_n} corresponding to a set of partitions
defined on the values of column v.
3. End For
4. Create a new information table (DT) by using the set of partitions.
5. Find the objects that are discerned in the decision class.
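As a minimal sketch of the discretization idea only (the full RSBR search for cut-off
points by Boolean reasoning is more involved), the following maps a real-valued column
to interval indices given candidate cuts; the cut values are illustrative:

    import bisect

    def discretize(values, cuts):
        # Interval index of each value given sorted cut-off points;
        # len(cuts) cuts define len(cuts) + 1 intervals (e.g. low/medium/average/large).
        return [bisect.bisect_right(cuts, v) for v in values]

    enrollments = [30, 180, 470, 950]
    print(discretize(enrollments, [50, 200, 500]))   # [0, 1, 2, 3]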
For instance, Table 4 shows the intervals and cut-off points used for the Carter dataset
problem instances. The count column gives the large, average, medium and low intervals
set on the standard dataset where the criterion is the number of students enrolled,
largest enrollment (LE). Searching for the reducts of a decision table is NP-complete.
Fortunately, the Carter dataset has no reducts, and the work proceeds by setting
intervals and ranking with classification on the dataset.
Table 5. Experimental results for the rough sets discretization approach that was implemented
6 Conclusion
In this paper, we have presented an intelligent data analysis approach based on rough
sets theory for generating classification rules from the 12 observed real-world problem
instances that serve as a benchmark dataset for the examination timetabling community.
The main objective is to investigate the problem instances/dataset and, with minor
modification, to obtain better timetables. To improve the classification process, the
rough sets with Boolean reasoning (RSBR) discretization algorithm is used to discretize
the data. Further work will be done to reduce the experiment duration in order to get
better results with the rough set data analysis.
References
[1] Burke, E.K., Elliman, D.G., Ford, P.H., Weare, R.F.: Examination timetabling in British
Universities – a survey. In: Burke, E., Ross, P. (eds.) PATAT 1995. LNCS, vol. 1153, pp.
76–90. Springer, Heidelberg (1996)
[2] Burke, E.K., Elliman, D.G., Weare, R.F.: A hybrid genetic algorithm for highly con-
strained timetabling problems. In: Proceedings of the 6th International Conference on Ge-
netic Algorithms (ICGA 1995), Pittsburgh, USA, July 15-19, pp. 605–610. Morgan
Kaufmann, San Francisco (1995)
[3] Burke, E.K., Bykov, Y., Newall, J., Petrovic, S.: A time-predefined local search approach
to exam timetabling problems. IIE Transactions on Operations Engineering, 509–528
(2004)
[4] Caramia, M., Dell’Olmo, P., Italiano, G.F.: New algorithms for examination timetabling.
In: Näher, S., Wagner, D. (eds.) WAE 2000. LNCS, vol. 1982, pp. 230–241. Springer,
Heidelberg (2001)
[5] Carter, M.W., Laporte, G., Lee, S.Y.: Examination timetabling: Algorithmic strategies
and applications. Journal of the Operational Research Society, 373–383 (1996)
[6] Joshua, J., et al.: The Perception of Interaction on the University Examination Timetabl-
ing Problem. In: McCollum, B., Burke, E., George, W. (ed.) Practice and Theory of Au-
tomated Timetabling, ISBN 08-538-9973-3
[7] Al-Betar, M., et al.: A Combination of Metaheuristic Components based on Harmony
Search for The Uncapacitated Examination Timetabling. In: McCollum, B., Burke, E.,
George, W. (eds.): Practice and Theory of Automated Timetabling, ISBN 08-538-9973-3
(PATAT 2010, Ireland, Aug, selected papers) for Annals of operational research
[8] Boizumault, P., Delon, Y., Peridy, L.: Constraint logic programming for examination
timetabling. The Journal of Logic Programming 26(2), 217–233 (1996)
[9] Brailsford, S.C., Potts, C.N., Smith, B.M.: Constraint satisfaction problems: Algorithms
and applications. European Journal of Operational Research 119, 557–581 (1999)
[10] Burke, E.K., de Werra, D., Kingston, J.: Applications in timetabling. In: Yellen, J., Gross,
J.L. (eds.) Handbook of Graph Theory, pp. 445–474. Chapman Hall, CRC Press (2003)
[11] Burke, E.K., Newall, J.P.: Solving examination timetabling problems through adaption of
heuristic orderings. Annals of Operations Research 129, 107–134 (2004)
[12] Burke, E.K., Petrovic, S.: Recent research directions in automated timetabling. European
Journal of Operational Research 140, 266–280 (2002)
[13] Qu, R., Burke, E.K., McCollum, B., Merlot, L.T.G., Lee, S.Y.: A Survey of Search Me-
thodologies and Automated System Development for Examination Timetabling. Journal
of Scheduling 12(1), 55–89 (2009), online publication (October 2008), doi:
10.1007/s10951-008-0077-5.pdf
[14] McCollum, B., Schaerf, A., Paechter, B., McMullan, P., Lewis, R., Parkes, A., Di Gaspe-
ro, L., Qu, R., Burke, E.: Setting The Research Agenda in Automated Timetabling: The
Second International Timetabling Competition. INFORMS Journal on Computing 22(1),
120–130 (2010)
[15] Carter, M.W.: A survey of practical applications of examination timetabling algorithms.
Operation Research 34(2), 193–202 (1986)
[16] Carter, M.W., Laporte, G., Lee, S.Y.: Examination timetabling: Algorithmic strategies
and applications. Journal of the Operational Research Society 47, 373–383 (1996)
[17] Pawlak, Z.: Rough sets. International Journal of Computer and Information Science 11,
341–356 (1982)
[18] Pawlak, Z.: Rough Sets Theoretical Aspect of Reasoning about Data. Kluwer Academic,
Boston (1991)
[19] Pawlak, Z., Grzymala-Busse, J., Slowinski, R., Ziarko, W.: Rough sets. Communications
of the ACM 38(11), 89–95 (1995)
[20] Ślęzak, D.: Various approaches to reasoning with frequency-based decision reducts: a
survey. In: Polkowski, L., Tsumoto, S., Lin, T.Y. (eds.) Rough Sets in Soft Computing
and Knowledge Discovery: New Developments. Physica Verlag, Heidelberg (2000)
[21] Pal, S.K., Polkowski, L., Skowron, A.: Rough-Neuro Computing: Techniques for
Computing with Words. Springer, Berlin (2004)
Community Detection in Sample Networks
Generated from Gaussian Mixture Model
1 Introduction
The modern science of networks has brought significant advances to our under-
standing of complex systems [1,2,3]. One of the most relevant features of graphs
representing real systems is community structure, i.e. the organization of vertices
in clusters, with many edges joining vertices of the same cluster and compara-
tively few edges joining vertices of different clusters. Such communities can be
considered as fairly independent compartments of a network, playing a similar
role like the tissues or the organs in the human body [4,5]. Detecting communi-
ties is of great importance, which is very hard and not yet satisfactorily solved,
despite the huge effort of a large interdisciplinary community of scientists work-
ing on it over the past few years [6,7,8,9,10,11,12,13]. On a related but different
front, recent advances in computer vision and data mining have also relied heav-
ily on the idea of viewing a data set or an image as a graph or a network, in
order to extract information about the important features of the images [14].
In our previous work [12], we extend the measure of diffusion distance between
nodes in a network to a generalized form on the coarse-grained network with data
2 Framework of Coarse-Grained-Diffusion-Distance
Based Agglomerative Algorithm
2.1 Construction of Coarse-Grained Diffusion Distance
We will start with a brief review of the basic idea in [12]. Let G(S, E) be a network
with n nodes and m edges, where S is the node set, E = {e(x, y)}_{x,y∈S} is the weight
matrix and e(x, y) is the weight of the edge connecting nodes x and y. We can relate
this network to a discrete-time Markov chain with stochastic matrix P whose entries
p_1(x, y) are given by p_1(x, y) = e(x, y)/d(x), where d(x) = Σ_{z∈S} e(x, z) is the
degree of node x [3]. The process is driven by P^t = {p_t(x, y)}_{x,y∈S}, where
p_t(x, y) represents the probability of going from node x to node y through a random
walk in t time steps. This Markov chain has stationary distribution
μ(x) = d(x)/Σ_{z∈S} d(z) and it satisfies the detailed balance
condition μ(x)p_1(x, y) = μ(y)p_1(y, x). The diffusion distance D_t(x, y) between x and
y is defined as the weighted L² distance

D_t²(x, y) = Σ_{z∈S} (p_t(x, z) − p_t(y, z))² / μ(z),   (1)
where the weight μ(z)⁻¹ penalizes discrepancies on domains of low density more than
those of high density. As is well known, the transition matrix P has a set of left and
right eigenvectors and a set of eigenvalues 1 = λ_0 ≥ |λ_1| ≥ · · · ≥ |λ_{n−1}| ≥ 0,
with P φ_i = λ_i φ_i, ψ_i^T P = λ_i ψ_i^T, i = 0, 1, ..., n − 1. Note that ψ_0 = μ,
φ_0 ≡ 1 and ψ_i^T φ_j = δ_ij. The left and right eigenvectors are related according to
ψ_i(x) = φ_i(x)μ(x). The spectral decomposition of P^t is given by

p_t(x, y) = Σ_{i=0}^{n−1} λ_i^t φ_i(x)ψ_i(y) = Σ_{i=0}^{n−1} λ_i^t φ_i(x)φ_i(y)μ(y),   (2)

and the diffusion distance can be written as

D_t²(x, y) = Σ_{i=0}^{n−1} λ_i^{2t} (φ_i(x) − φ_i(y))².   (3)
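A direct numpy sketch of definition (1), computing all pairwise diffusion distances from
the weight matrix (dense and illustrative; for large n one would truncate the spectral
form (3) instead):

    import numpy as np

    def diffusion_distance(E, t):
        # E: symmetric (n, n) weight matrix with positive degrees.
        d = E.sum(axis=1)
        P = E / d[:, None]                    # p_1(x, y) = e(x, y) / d(x)
        mu = d / d.sum()                      # stationary distribution
        Pt = np.linalg.matrix_power(P, t)     # t-step transition probabilities
        n = E.shape[0]
        D2 = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):         # weighted L2 distance of rows, eq. (1)
                D2[i, j] = D2[j, i] = np.sum((Pt[i] - Pt[j]) ** 2 / mu)
        return np.sqrt(D2)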
We take a partition of S as S = ∪_{k=1}^N S_k with S_k ∩ S_l = ∅ if k ≠ l, and regard
each set S_k in the state space S = {S_1, ..., S_N} as a node of an N-node network
Ĝ(S, E_t), where E_t = {ê_t(S_k, S_l)}_{S_k,S_l∈S}, and the weight ê_t(S_k, S_l) on the
edge that connects S_k and S_l is defined as

ê_t(S_k, S_l) = Σ_{x∈S_k, y∈S_l} μ(x) p_t(x, y),

where the sum involves all the transition probabilities between x ∈ S_k and y ∈ S_l.
From the detailed balance condition, it can be verified that ê_t(S_k, S_l) =
ê_t(S_l, S_k). By setting μ̂(S_k) = Σ_{z∈S_k} μ(z), one can define a coarse-grained
Markov chain on Ĝ(S, E_t) with stationary distribution μ̂ and transition probabilities

p̂_t(S_k, S_l) = ê_t(S_k, S_l) / Σ_{m=1}^N ê_t(S_k, S_m) = (1/μ̂(S_k)) Σ_{x∈S_k, y∈S_l} μ(x) p_t(x, y).   (4)
It can easily be shown that p̂_t is a stochastic matrix on the state space S and
satisfies a detailed balance condition with respect to μ̂. More generally, we define
coarse-grained versions of ψ_i by summing over the nodes in a partition,
ψ̂_i(S_k) = Σ_{z∈S_k} ψ_i(z), and, as above, coarse-grained versions of φ_i according to
the duality condition ψ̂_i(S_k) = φ̂_i(S_k)μ̂(S_k), i.e.

φ̂_i(S_k) = ψ̂_i(S_k)/μ̂(S_k) = (1/μ̂(S_k)) Σ_{z∈S_k} φ_i(z)μ(z).

Then the coarse-grained probability p̂_t can be written in a spectral decomposition form
similar to (2):

p̂_t(S_k, S_l) = Σ_{i=0}^{n−1} λ_i^t φ̂_i(S_k) φ̂_i(S_l) μ̂(S_l).   (5)
This can be considered an extended version of (2). It leads to the diffusion distance
between communities S_k and S_l given by

D̂_t²(S_k, S_l) = Σ_{i,j=0}^{n−1} λ_i^t λ_j^t (φ̂_i(S_k) − φ̂_i(S_l))(φ̂_j(S_k) − φ̂_j(S_l)) Σ_{m=1}^N ψ̂_i(S_m) φ̂_j(S_m).   (6)

This notion of proximity of communities in the coarse-grained network reflects the
intrinsic geometry of the set S in terms of connectivity of the meta-nodes in a
diffusion process. This metric is thus a key quantity in the design of the following
algorithm, which is based on the preponderance of evidence for a given hypothesis.
The quality of a partition can be quantified by the modularity

Q = (1/2m_e) Σ_{k=1}^N Σ_{x,y∈S_k} [ e(x, y) − p_E(x, y) ],  p_E(x, y) = d(x)d(y)/(2m_e),   (7)

where m_e is the total weight of edges, m_e = Σ_{x,y∈S} e(x, y)/2. Some existing methods
find good partitions of a network into communities by optimizing the modularity over
possible divisions, which has proven highly effective in practice [7,8,11,12].
Table 1. The parameters for construction of the three sample networks generated from
the Gaussian mixture model
Table 2. The computational results obtained by our method. Here CR1 and CR2 are
the correct rates compared with the original partition {Tk } and those obtained from
k-means algorithm, respectively.
dissimilarity mentioned above, since it takes into account all the information relating
the two clusters. The only parameter of our computation is the time step t; increasing t
corresponds to propagating the local influence of each node to its neighbors.
Fig. 1. (a)150 sample points generated from the given 3-Gaussian mixture distribu-
tion. The star symbols represent the centers of each Gaussian component. The circle,
square and diamond shaped symbols represent the position of sample points in each
component, respectively. (b)The network generated from the sample points in Figure
1(a) with the parameter dist = 0.9.
3 Experimental Results
As mentioned in Section 1, we generate n sample points {xi } in two dimensional
Euclidean space subject to a K-Gaussian mixture distribution K k=1 qk G (µk ,Σk ).
Here we pick nodes n(k − 1)/K + 1 : nk/K in group Tk for simplicity, and with
this choice, approximately qk = n/K, k = 1, · · · , K. The covariance matrices
are set in diagonal form Σk = σI. The other parameters for construction of the
three sample networks generated from the Gaussian mixture model are list in Ta-
ble 1. The computational results obtained by our method are shown in Table 2.
Here CR1 and CR2 are the correct rates compared with the original partition
{Tk } and those obtained from k-means algorithm, respectively. We can see that
the number of communities are in accordance with the number of components in
the corresponding Gaussian mixture models and the two kinds of correct rates
Fig. 2. The computational results for the sample network with 150 nodes detected by
our method. (a)The modularity changing with number of communities in each iteration
for different time parameter t. (b)The community structure identified by setting t = 1,
corresponds to 3 communities represented by the colors. (c)The dendrogram of hierar-
chical structures and the optimal partition with a maximal modularity Q = 0.6344 is
denoted by a vertical dash line.
Fig. 3. (a)300 sample points generated from the given 3-Gaussian mixture distribution.
(b)The modularity changing with number of communities in each iteration for different
time parameter t. (c)The community structure identified by setting t = 3, corresponds
to 3 communities represented by the colors.
Fig. 4. (a)320 sample points generated from the given 4-Gaussian mixture distribution.
(b)The modularity changing with number of communities in each iteration for different
time parameter t. (c)The community structure identified by setting t = 3, corresponds
to 4 communities represented by the colors.
The visualization of the partitioning result and the dendrogram of hierarchical
structures are shown in Figure 2. The same goes for the other two sample networks; the
clustering results can be seen in Figure 3 and Figure 4, respectively.
4 Conclusions
In this paper, we used the coarse-grained-diffusion-distance based agglomerative
algorithm to uncover the community structure exhibited by sample networks generated from
a Gaussian mixture model. The present algorithm can identify the community structure
with a high degree of efficiency and accuracy. An appropriate number of communities can
be automatically determined without any prior knowledge.
Acknowledgements
This work is supported by the Project of the Social Science Foundation of Beijing
University of Posts and Telecommunications under Grant 2010BS06.
Efficient Reduction of the Number of Associations Rules
Using Fuzzy Clustering on the Data
1 Introduction
Nowadays, we notice a growing interest in Knowledge Discovery in Databases (KDD) methods. One important reason for this is the increasing volume of data accumulated by organizations, which remains largely under-exploited. Several solutions have been proposed, based on neural networks, trees, concept lattices, association rules, etc. [1].
Several algorithms for mining association rules have been proposed in the literature. The existing generation methods are combinatorial and generate a large number of rules (even when starting from sets of reasonable size) that are not easily exploitable [2], [3]. Several approaches for reducing this large number of rules have been proposed, such as the use of quality measures, syntactic filtering by constraints, and compression by representative or generic bases [4]. These bases constitute reduced sets of rules that preserve the most relevant rules without any loss of information. In our opinion, the large number of generated rules is due to the fact that these approaches try to determine rules directly from the enormous data set.
In this paper, we propose to extract knowledge by taking another degree of granularity into consideration in the process of knowledge extraction. We propose to define rules (meta-rules) between classes resulting from a preliminary fuzzy clustering of the data. We call the knowledge thus extracted "Meta-Knowledge". Indeed, when classifying data, we construct homogeneous groups of data having the same properties, so
defining rules between clusters implies that all the data elements belonging to those clusters will necessarily depend on these same rules. Thus, the number of generated rules is smaller, since the knowledge extraction is performed on the clusters, whose number is much lower than the number of initial data elements. We show that we can easily deduce knowledge about the initial data set if more detail is desired.
The rest of the paper is organized as follows: Section 2 presents the basic concepts of discovering association rules. Section 3 presents the problems and limits of the existing knowledge discovery approaches. Section 4 defines the theoretical foundation of this approach. Section 5 contains the principles of the new approach that we propose. Section 6 enumerates the advantages of the proposed approach. Section 7 validates the proposed approach and gives an experimental example. We finish this paper with a conclusion and a presentation of some future work.
2 Basic Concepts
In this section, we present the basic concepts of discovering association rules.
Association rule mining was developed in order to analyze basket data in a marketing environment. Input data are composed of transactions: each transaction consists of the items purchased by a consumer during a single visit. Output data are composed of rules. An example of an association rule is "90% of transactions that involve the purchase of bread and butter also include milk" [5]. Even though this method was introduced in the context of market basket analysis, it has many applications in other fields, such as web mining or text mining. It can also be used to search for frequent co-occurrences in any large data set.
The first efficient algorithm to mine association rules was Apriori [6]. Other algorithms were proposed to decrease the number of database scans and to improve computational efficiency. Among them, we mention CLOSED [7], CHARM [8], TITANIC [9], [10], GENALL [11], and PRINCE [12].
Several varieties of lattice have been introduced with these algorithms, such as iceberg concept lattices [10], where the nodes are frequent closed itemsets ordered by the inclusion relation, and the minimal generators lattice [12], where the nodes are the minimal generators (called key itemsets), also ordered by the inclusion relation. In these cases the FCA is constructed not on the data but on the found itemsets. For more detail the reader can see [12].
The only notable work that used data classification as a prior step to the generation of association rules, applied in industry, is that of Plasse et al. [13]. The proposed technique was to carry out a preliminary classification of the variables in order to obtain homogeneous groups of attributes, and then to seek the association rules inside each of these groups. They obtained groups of variables that are more restricted and homogeneous. Moreover, the rules obtained are fewer and simpler.
We define the cut, denoted α-Coupe, on the fuzzy context as the inverse of the number of clusters obtained.
We can consider two possible strategies for the application of the α-Coupe: a binary strategy (resp. a fuzzy strategy), which defines a binary membership (resp. a fuzzy membership) of the objects to the different clusters. We propose to start from the fuzzy formal context, apply an α-Coupe to the set of membership degrees, replace the latter by the values 1 and 0, and deduce the binary reduced formal context.
Table 1. The initial data set

      DB  PL  NT  LI  AT
S1    15  14  12  14  10
S2    14  15   9   8  10
S3    16  13  12  12   7
S4     7  10  14  12   8
S5    11   5  18  15  14
S6    12  11  10  10  10
S7    17   6  14  15  14
S8     9  10  12  11  10
S9     5   6  10   6  10
S10   13   7  12  14  13

Table 2. The fuzzy formal context (Object/Cluster)

      C1     C2     C3
S1    0.092  0.804  0.104
S2    0.091  0.708  0.201
S3    0.041  0.899  0.060
S4    0.071  0.100  0.829
S5    0.823  0.070  0.107
S6    0.090  0.548  0.362
S7    0.810  0.108  0.082
S8    0.036  0.066  0.898
S9    0.157  0.179  0.664
S10   0.231  0.388  0.381
In our example, α-Coupe = 1/3. Table 3 presents the binary reduced formal context after application of the α-Coupe to the fuzzy formal context presented in Table 2. Table 4 presents the fuzzy reduced formal context after application of the α-Coupe to the same fuzzy formal context.
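To make the α-Coupe concrete, the following minimal Python sketch (not the authors' code; the membership rows are taken from Table 2) applies the cut with α = 1/3 and produces the corresponding rows of the reduced binary and reduced fuzzy contexts:

    fuzzy_context = {
        "S1": [0.092, 0.804, 0.104],
        "S4": [0.071, 0.100, 0.829],
        "S6": [0.090, 0.548, 0.362],
    }
    alpha = 1 / 3  # alpha-Coupe = inverse of the number of clusters

    # Binary strategy: memberships >= alpha become 1, the rest 0 (Table 3).
    binary_context = {s: [1 if m >= alpha else 0 for m in mu]
                      for s, mu in fuzzy_context.items()}
    # Fuzzy strategy: keep the membership degree where it passes the cut (Table 4).
    reduced_fuzzy = {s: [m if m >= alpha else None for m in mu]
                     for s, mu in fuzzy_context.items()}

    print(binary_context["S6"])  # [0, 1, 1]: S6 belongs to both C2 and C3

For S6, both 0.548 and 0.362 exceed 1/3, which reproduces the row S6: 0 1 1 of Table 3.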
Generally, we can consider that the attributes of a formal concept, known as the concept intention, are the description of the concept. Thus, the relationship between an object and the concept should be the intersection of the relationships between the object and the attributes of the concept. Since each relationship between an object and an attribute is represented as a membership value in the fuzzy formal context, the intersection of these membership values should be their minimum, according to fuzzy set theory. Thus, we define the fuzzy formal concept from the fuzzy formal context.
Properties
− The number of clusters generated by a clustering algorithm is always lower than the number of starting objects to which the clustering algorithm is applied.
− All objects belonging to the same cluster have the same characteristics. These characteristics can be deduced easily knowing the center of the cluster and the distance to it.
Table 3. Reduced binary formal context

      C1  C2  C3
S1     0   1   0
S2     0   1   0
S3     0   1   0
S4     0   0   1
S5     1   0   0
S6     0   1   1
S7     1   0   0
S8     0   0   1
S9     0   0   1

Table 4. Reduced fuzzy formal context

      C1     C2     C3
S1     -     0.804   -
S2     -     0.708   -
S3     -     0.899   -
S4     -      -     0.829
S5    0.823   -      -
S6     -     0.548  0.362
S7    0.810   -      -
S8     -      -     0.898
S9     -      -     0.664
Notation
Let C1 and C2 be two clusters generated by a fuzzy clustering algorithm.
The rule C1 => C2 with a coefficient CR will be noted C1 => C2 (CR).
If the coefficient CR is equal to 1, the rule is called an exact rule.
Theorem 1
Let C1 and C2 be two clusters generated by a fuzzy clustering algorithm and verifying the properties p1 and p2 respectively. Then the following properties are equivalent:
C1 ⇒ C2 (CR) ⇔
- ∀ object O1 ∈ C1 => O1 ∈ C2 (CR)
- ∀ object O1 ∈ C1, O1 satisfies the property p1 of C1 and the property p2 of C2 (CR)
Theorem 2
Let C1, C2 and C3 be three clusters generated by a fuzzy clustering algorithm and verifying the properties p1, p2 and p3 respectively. Then the following properties are equivalent:
C1 and C2 => C3 (CR) ⇔
- ∀ object O1 ∈ C1 ∩ C2 => O1 ∈ C3 (CR)
- ∀ object O1 ∈ C1 ∩ C2, O1 satisfies the properties p1, p2 and p3 with (CR)
The proof of both theorems follows from the fact that all objects belonging to the same cluster necessarily satisfy the same property as their cluster.
Classification of data
We regroup the customers that satisfy the same property into a class (only one property). Using this type of fuzzy clustering, we have the following properties:
- The number of clusters in this case will be equal to the number of attributes.
- Class i will contain all the objects that satisfy one and the same property.
For example, with basket data in a marketing environment, we regroup all the customers who bought the same product x.
(Figure 1: two overlapping clusters, C1 (customers who bought bread) and C2 (customers who bought chocolate).)
From this matrix, we can generate rules giving associations between the different clusters. Figure 1 models an example of a classification result with an overlap between the two clusters C1 and C2. We notice that the intersection of the two clusters gives the customers who bought both bread and chocolate.
Begin
Step 1: Introduce a data set (any type of data).
Step 2: Apply a fuzzy clustering algorithm to organize the data into different groups (or clusters).
Step 3: Determine the fuzzy formal context (Object/Cluster) from the matrix obtained in step 2.
Step 4: Deduce the reduced binary formal context from the matrix obtained in step 3.
Step 5: Apply an association rule generation algorithm on the clusters to generate the set of meta-knowledge in the form of association rules between clusters.
Step 6: Generate knowledge of the data set in the form of association rules.
End
(Steps 3 to 5 are illustrated in the sketch after the figure below.)
(Figure: the proposed process, from analysis to the generation of meta-knowledge and the extraction of knowledge.)
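The following Python sketch (an illustrative reconstruction under stated assumptions, not the authors' platform) strings steps 3 to 5 together: the fuzzy memberships stand in for the output of a fuzzy clustering algorithm such as FCM, and the rule miner is reduced to pairwise cluster rules with a confidence-like coefficient CR, in place of a generic-base extractor such as Prince:

    from itertools import permutations

    # Step 3: fuzzy formal context (Object/Cluster), e.g. produced by FCM.
    fuzzy_context = {
        "S1": [0.092, 0.804, 0.104], "S4": [0.071, 0.100, 0.829],
        "S5": [0.823, 0.070, 0.107], "S6": [0.090, 0.548, 0.362],
        "S8": [0.036, 0.066, 0.898], "S9": [0.157, 0.179, 0.664],
    }
    k = 3
    alpha = 1 / k  # alpha-Coupe

    # Step 4: reduced binary formal context.
    binary = {s: {c for c, m in enumerate(mu) if m >= alpha}
              for s, mu in fuzzy_context.items()}

    # Step 5: meta-rules C_i => C_j between clusters, with coefficient CR.
    for ci, cj in permutations(range(k), 2):
        in_ci = [s for s, cs in binary.items() if ci in cs]
        if not in_ci:
            continue
        cr = sum(cj in binary[s] for s in in_ci) / len(in_ci)
        if cr > 0:
            print(f"C{ci+1} => C{cj+1} (CR = {cr:.2f})")

Step 6 then follows from Theorem 1: every object of a cluster satisfies the properties carried by the rules between clusters with the same coefficient CR.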
8 Conclusion
In this paper, we presented a new approach that permits extracting knowledge from a preliminary fuzzy clustering of the data. Generally, all the methods in this field are applied to the data (or a data variety), which is huge. They consequently generate a large number of association rules that are not easily assimilated by the human brain, and the memory space and execution time necessary for the management of the corresponding lattices are considerable. To solve this problem, we propose to build rules (meta-rules) between groups (or clusters) resulting from a preliminary fuzzy clustering of the data. This approach is based on the following main idea: when classifying data, we construct homogeneous groups, or clusters, of data each having the same properties. Consequently, defining rules between clusters implies that all data belonging to those clusters will necessarily depend on these same generated rules.
To validate this approach, we chose the FCM (Fuzzy C-Means) algorithm, which performs a fuzzy clustering to generate the clusters, and the Prince algorithm, which extracts the generic bases modeling the meta-knowledge from the initial data, from which we deduce the data set's knowledge. We have implemented a fuzzy clustering and data knowledge extraction platform. It is extensible, and it offers different fuzzy clustering algorithms and different association rule generation algorithms.
1 www.cck.rnu.tn/sbenyahia/software_release.htm
In the future, we propose to encode the obtained rules in an expert system and to offer the user the possibility of interacting with this system to satisfy his needs.
References
1. Goebel, M., Gruenwald, L.: A Survey of Data Mining and Knowledge Discovery Software
Tools. SIGKDD, ACM SIGKDD 1(1), 20–33 (1999)
2. Zaki, M.: Mining Non-Redundant Association Rules. In: Data Mining and Knowledge Dis-
covery, vol. (9), pp. 223–248 (2004)
3. Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., Lakhal, L.: Intelligent structuring and re-
ducing of association rules with formal concept analysis. In: Baader, F., Brewka, G., Eiter,
T. (eds.) KI 2001. LNCS (LNAI), vol. 2174, pp. 335–350. Springer, Heidelberg (2001)
4. Pasquier, N.: Data Mining: Algorithmes d’Extraction et de Réduction des Règles
d’Association dans les Bases de Données. Thèse, Département d’Informatique et Statis-
tique, Faculté des Sciences Economiques et de Gestion, Lyon (2000)
5. Agrawal, R., Imielinski, T., Swami, A.: Mining Association Rules between sets of items in
large Databases. In: Proceedings of the ACM SIGMOD Intl. Conference on Management
of Data, Washington, USA, pp. 207–216 (June 1993)
6. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proceedings of the
20th Int’l Conference on Very Large Databases, pp. 478–499 (June 1994)
7. Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Efficient Mining of Association Rules Us-
ing Closed Itemset Lattices. Information Systems Journal 24(1), 25–46 (1999)
8. Zaki, M.J., Hsiao, C.J.: CHARM: An Efficient Algorithm for Closed Itemset Mining. In:
Proceedings of the 2nd SIAM International Conference on Data Mining, Arlington, pp. 34–
43 (April 2002)
9. Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., Lakhal, L.: Fast Computation of Concept
Lattices Using Data Mining Techniques. In: Bouzeghoub, M., Klusch, M., Nutt, W., Sattler,
U. (eds.) Proceedings of 7th Intl. Workshop on Knowledge Representation Meets Data-
bases (KRDB 2000), Berlin, Germany, pp. 129–139 (2000)
10. Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., Lakhal, L.: Computing Iceberg Concept
Lattices with TITANIC. J. on Knowledge and Data Engineering (KDE) 2(42), 189–222
(2002)
11. Ben Tekaya, S., Ben Yahia, S., Slimani, Y.: Algorithme de construction d‘un treillis des
concepts formels et de détermination des générateurs minimaux. ARIMA Journal, 171–193
(Novembre 2005); Numéro spécial CARI 2004
12. Hamrouni, T., Ben Yahia, S., Slimani, Y.: Prince: Extraction optimisée des bases généri-
ques de règles sans calcul de fermetures. In: Proceedings of the Intl. INFORSID Confer-
ence, Editions Inforsid, Grenoble, France, May 24-27, pp. 353–368 (2005)
13. Plasse, M., Niang, N., Saporta, G., Villeminot, A., Leblond, L.: Combined use of associa-
tion rules mining and clustering methods to find relevant links between binary rare attrib-
utes in a large data set. Computational Statistics & Data Analysis 52(1), 596–613 (2007)
14. Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Pruning closed itemset lattices for associa-
tion rules. In: Proceedings of 14th International Conference Bases de Données Avancées,
Hammamet, Tunisia, October 26-30, pp. 177–196 (1998)
15. Zaki, M.J.: Generating Non-Redundant Association Rules. In: Proceedings of the 6th ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston,
MA, pp. 34–43 (August 2000)
16. Bastide, Y., Taouil, R., Pasquier, N., Stumme, G., Lakhal, L.: Mining frequent patterns with
counting inference. SIGKDD Explorations 2(2), 66–75 (2000)
A Localization Algorithm in Wireless Sensor Networks
Based on PSO
Hui Li, Shengwu Xiong, Yi Liu, Jialiang Kou, and Pengfei Duan
1 Introduction
In this paper, RSS (Received Signal Strength) [7] is used to measure the distance between two nodes. The radio range is a circle in this model. The distance between the nodes is measured according to the attenuation of the signal broadcast in the medium. The mathematical model of the wireless communication channel is as follows:

    PL(d) = PL(d0) − 10 n lg(d / d0) − Xσ    (1)

where d denotes the distance between transmitter and receiver; d0 denotes the reference distance; n denotes the channel attenuation index, whose value ranges from 2 to 4; Xσ denotes a Gaussian random noise variable; PL(d0) denotes the signal strength at distance d0 from the transmitter; and PL(d) denotes the signal strength at distance d from the transmitter. PL(d0) can be obtained from experience or from the hardware specification. In this formula, the distance d is calculated from the signal strength PL(d).
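For illustration, the following sketch inverts formula (1) while ignoring the noise term; the parameter values (d0 = 1 m, n = 4, PL(d0) = −40 dBm) are assumptions for the example, not values from the paper:

    def distance_from_rss(pl_d, pl_d0=-40.0, d0=1.0, n=4.0):
        """Solve PL(d) = PL(d0) - 10 n lg(d/d0) for d (noise term ignored)."""
        return d0 * 10 ** ((pl_d0 - pl_d) / (10.0 * n))

    print(distance_from_rss(-80.0))  # ~10.0 m for the assumed parameters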
2.2 Assumptions
In general, most localization algorithms adopt a communication model in which the communication region of a node in a two-dimensional space is a circle. However, the bounding box algorithm uses squares instead of circles to bound the possible positions of a node. An example of this algorithm is depicted in Fig. 1.
For each anchor node i, a bounding box is defined as a square centered at the position (xi, yi) of this node, with sides of size 2h (the side length of the square inscribed in the circle mentioned above) and with the four corner coordinates (xi − h, yi − h), (xi − h, yi + h), (xi + h, yi − h), (xi + h, yi + h), respectively. The intersection of all bounding boxes can be easily computed, without any need for floating point operations, by taking the maximum of the low coordinates and the minimum of the high coordinates of all bounding boxes; it is expressed by formula (5) and denoted by the shaded rectangle in Fig. 1.
From formula (6) we can then obtain the rectangular estimation range of the unknown node: within the scope of the EstimateScope, a random coordinate xrandom is drawn between xleft and xright, and yrandom between yfloor and yceiling, which gives the initial estimated position of the unknown node.
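A compact sketch of this step (anchor positions and h are illustrative assumptions): the boxes are intersected by taking the maximum of the low coordinates and the minimum of the high coordinates, and a random initial position is then drawn inside the resulting rectangle, to be refined later by PSO:

    import random

    anchors = [(10.0, 12.0), (18.0, 9.0), (14.0, 17.0)]  # (xi, yi) of anchor nodes
    h = 8.0  # half side of each bounding box

    x_left  = max(x - h for x, _ in anchors)
    x_right = min(x + h for x, _ in anchors)
    y_floor = max(y - h for _, y in anchors)
    y_ceil  = min(y + h for _, y in anchors)

    x_rand = random.uniform(x_left, x_right)
    y_rand = random.uniform(y_floor, y_ceil)
    print(f"estimate scope: [{x_left}, {x_right}] x [{y_floor}, {y_ceil}]")
    print(f"initial estimate: ({x_rand:.2f}, {y_rand:.2f})")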
In this set of experiments, we deployed 300 sensor nodes, including anchor nodes and unknown nodes, randomly distributed in a two-dimensional rectangular area of 1000 × 1000 m² (shown in Figure 2). We assume the transmission range is fixed at 200 m for both unknown nodes and anchor nodes. In the simulation of the positioning algorithm, we assume a channel attenuation index n of 4. Figure 3 shows the connectivity relation of the sensor nodes, which leads to an average connectivity of 31.8933 and an average of 6.5533 adjacent anchor nodes. In the graph, points represent nodes and edges represent the connections between neighbors who can hear each other.
In this section, the traditional bounding box and the optimized bounding box localization algorithms are simulated in the same parameter environment (original node coordinates, ratio of anchor nodes, node density, and communication radius), and the differences in positioning accuracy and localization error are analyzed.
Figure 4 shows the localization error of the different nodes before and after optimization. All of the unknown nodes have been estimated by the bounding box algorithm.
The green marks represent the localization error of the different nodes before optimization, and the blue marks represent the localization error after optimization. We can see clearly from Figure 4 that the localization error of almost all nodes decreases after optimization. This is due to the fact that in the bounding box algorithm the final position of the unknown node is computed as the center of the intersection of all rectangular estimation ranges, whereas after the estimated coordinates are optimized by PSO, we obtain more precise coordinate values for the unknown nodes.
Figure 5 shows the average location accuracy of the traditional bounding box algorithm compared with the optimized algorithm; the latter is clearly preferable to the former.
Figure 6 shows the original and optimized positions of the unknown nodes. The circles represent the true locations of the nodes, and the squares represent the estimated locations of the nodes after optimization by PSO. The longer the connecting line, the larger the error. In this graph, we can see that the estimated locations of the nodes are closer to the true locations after optimization.
4 Conclusion
Localization is an important issue for WSNs. To reduce the localization error and improve the accuracy of the estimated location, a localization algorithm in WSNs based on PSO is proposed.
In this paper, after the rectangular estimation range of an unknown node is calculated by the bounding box, the final position of the unknown node is not computed as the center of the intersection of all rectangular estimation ranges; instead, a position is drawn randomly within the rectangular estimation range and then optimized by PSO. Analysis shows that this scheme requires a small amount of computation, and simulation results show that the optimized algorithm is superior to the traditional bounding box in positioning accuracy and localization error.
References
1. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: A Survey on Sensor Networks.
IEEE Commun. Magn. 40, 102–114 (2002)
2. Tubaishat, M., Madria, S.: Sensor Networks: An Overview. IEEE Potentials 22(2), 20–23
(2003)
3. Basagni, S., Carosi, A., Melachrinoudis, E., Petrioli, C., Wang, Z.M.: Protocols and model
for sink mobility in wireless sensor networks. ACM SIGMOBILE Mobile Computing and
Communications Review 10, 28–30 (2006)
4. He, T., Huang, C., Blum, B.M., Stankovic, J.A., Abdelzher, T.: Range-Free Localization
Schemes for Large Scale Sensor Networks. In: 9th Annual International Conference on
Mobile Computing and Networking, pp. 81–95. IEEE Press, San Diego (2003)
5. You, Z., Meng, M.Q.-H., Liang, H., et al.: A Localization Algorithm in Wireless Sensor
Networks Using a Mobile Beacon Node. In: International Conference on Information
Acquisition, pp. 420–426. IEEE Press, Jeju City (2007)
6. Eberhart, R.C., Kennedy, J.: A New Optimizer Using Particle Swarm Theory. In: 6th Inter-
national Symposium on Micro Machine and Human Science, pp. 39–43. IEEE Press, Pis-
cataway (1995)
7. Chen, H., Ping, D., Xu, Y., Li, X.: A Novel Localization Scheme Based on RSS Data for
Wireless Sensor Networks. In: Advanced Web and Network Technologies, and Applications,
pp. 315–320. IEEE Press, Harbin (2008)
8. Bulusu, N., Heidemann, J., Estrin, D.: GPS-less low-cost outdoor localization for very small
devices. IEEE Personal Communications 7(5), 28–34 (2000)
9. Capkun, S., Hamdi, M., Hubaux, J.P.: GPS-Free Positioning in Mobile Ad-Hoc Networks.
In: 34th Annual Hawaii International Conference on System Sciences, pp. 255–258. IEEE
Press, Maui (2001)
Game Theoretic Approach in Routing Protocol for
Cooperative Wireless Sensor Networks
Abstract. A game theoretic method, called the first price sealed auction game, is introduced in this paper to control routing overhead in wireless sensor networks. The players of the game are the wireless nodes, with a set of strategies (forward or not). The game is played whenever an arbitrary node in the network forwards packets. In order for the game to function, a multi-stage pricing game model is established; it provides the probability that the wireless nodes forward the received packets, and the payoff of all nodes can be optimized by choosing the best neighbour node. Simulations in NS2 show that the pricing routing game model improves performance, not only decreasing the energy consumption but also prolonging the network lifetime. Finally, a numerical analysis of the nodes' payoff is given using Matlab.
1 Introduction
Wireless sensor networks (WSNs) have received significant attention in recent years. The main feature of WSNs is that of low-cost nodes with limited resources, both in terms of computational power and battery, whose purpose is sensing the environment. In order to decrease the energy consumption, numerous routing protocols have been introduced for wireless sensor networks. Our approach in this paper selectively balances the forwarding overhead on nodes by applying game theory. Game theory is a mathematical method that attempts to capture and analyze behavior in strategic situations, in which an individual's success in making choices depends on the choices of others. It ensures that the desired global objective is achieved in the presence of selfish agents. Game theory is not new to the area of telecommunications and wireless networks. It has been used to model the interaction among users and to solve routing and resource allocation problems in competitive environments; it provides incentive mechanisms for high-energy nodes to cooperate with other nodes in transferring information in networks. In [1-4], the authors give related research on these problems.
In this paper, a routing model based on a pricing game is presented. In order to use the first price sealed auction game model to select the relay node, we first organize a node and its neighbour nodes into an incomplete-information auction game: the node is the buyer
and its neighbour nodes are the sellers. Then we evolve the auction game into a multi-stage game model. Finally, we obtain a reliable routing path through relay node selection in each stage game. In each stage game, the Sink offers the buyer a payment which compensates for its energy consumption and a service reward (Sink-pay). The buyer purchases service from the sellers and selects the relay node by evaluating each seller; the selected seller gets a payment for the forwarding service (relay-reward). The stability of the routing path is very important, so the choice of the relay node is a key problem. From a game theory perspective, such a stable configuration corresponds to a Nash equilibrium [5]. In our algorithm, we first choose the best node to forward packets in each stage, and then all the selected nodes form a stable path through the multi-stage game. Thus our algorithm decreases the energy consumption and prolongs the network lifetime.
The rest of this paper is organized as follows. Section 2 surveys the related work. Section 3 introduces the network model and the game model. Section 4 provides the design of the pricing routing game model. Section 5 elaborates the proposed algorithm. Section 6 evaluates the algorithm with numerical analysis and simulation. Finally, we conclude the paper in Section 7.
2 System Model
Consider a multi-hop network consisting of one buyer node and several seller nodes. We model the WSN as an undirected graph G = <V, E>, as illustrated in Fig. 1, where V denotes the node set and E represents the link set. Each link indicates whether a buyer-seller pair can communicate with each other: if they can, there is a link between them; otherwise, there is no link between the two nodes. ei denotes the residual energy of node vi, hvi denotes the hop count from node vi to the Sink node, and h(vs, v3) denotes the minimum hop count from vs to the Sink node through node v3. Each node saves local network information, including its minimum hop count to the Sink node, its residual energy, its neighbour nodes' minimum hop counts to the Sink node, and its neighbour nodes' residual energy. We present the information matrix of node vs in Table 1.
In most cases, the key problem in achieving energy efficiency and reliability is to design an efficient incentive for accelerating cooperation. Most nodes would like to choose the cooperation strategy in order to improve their payoff, but some nodes are likely to break their agreement and drop the packets after they get the payoff. Some existing mechanisms cannot solve this problem well, and they can expend many network resources because overhead packets are transmitted frequently [5]. In WSNs, most of the nodes serve as routers, and the whole forwarding process is depicted as a multi-stage game: each stage is composed of one buyer and several sellers, where the buyer is the sending node and the sellers are the receiving nodes. We model this network based on the sealed auction game model; the whole auction game process in the network is shown in Fig. 2. At the first stage, the Sink needs the packets of the source node; the source node acts as the buyer, and it will pay a certain price to buy the forwarding service from its neighbour nodes. After the selected neighbour node receives the packets and gets the profit, the auction game enters the second stage with a new state, and the sellers of stage one become buyers. The auction game is a typical strategic game with incomplete information, because the buyer knows the valuations of all the sellers, but no seller knows the others' valuations except his own.
    βj(r(ej(t), hj,k)) = b · r(ej(t), hj,k)    (1)

where hj,k is the hop count of node j to the Sink node through neighbor node k, b is the payoff for forwarding packets, and r(ej(t), hj,k) is the link quality of node j.
Because the buyer knows the residual energy and the hop count of his neighbor nodes, he will give an average price according to the valuation information of all the neighbor nodes. The bidding function of buyer node i can be expressed as formula (2), where Σ_{j∈Ni} r(ei(t), hi,j) is the total evaluation of all the neighbor nodes' link quality, r(ei(t), hi,j) being the link quality of each neighbor node.
After the buyer and the sellers quote their prices according to their link quality, we can get the deal price φ(i, j) of buyer i and seller j, given by formula (3), where φ(i, j) is the deal price of buyer i and seller j. According to (3), when the quoted price of buyer i is larger than that of seller j, buyer i buys the forwarding service at j's price; otherwise, the deal price is 0, which means that i does not choose j as relay node. When there are multiple sellers, the seller with the lowest price is chosen by i, which is depicted as

    φ(i, j) = min_{j∈Ni} βj(hj,k, r(ej(t), hj,k))    (4)
3.1 The Payoff Function of the Source Node as Buyer in the First Stage Auction
The Sink will give a price to the source nodes which send packets to it; this is the payment that compensates for the energy consumption and service reward of the source node. We define the payoff function of source node s at time t by formula (5). The aim of the source node is to maximize its own utility, which can be expressed as

    max(us(t)) = max{αs(t) · [(hs,j(t))² · b − φ(s, j) − e_s^s(t)/e_s(t)]}    (7)

According to (5), in order to maximize its utility, the source node should increase the forwarding success rate and choose the lowest deal price φ(i, j). This implies

    max(us(t)) = max(αs(t)) · [(hs,j)² · b − min(φ(s, j) + e_s^s(t)/e_s(t))]    (8)
Assume that the energy consumed in receiving packets is ignored; the buyer (source node) and the seller (relay node) agree to transact at the price φ(s, j). The payoff function of the relay node j is depicted as

    uj(t) = φ(s, j)    (9)

In this stage, the source node withdraws from the game; relay node (seller) j of the first stage acts as the buyer, its neighbor node set Nj acts as sellers, and node j needs to buy the forwarding service from its neighbor nodes Nj. Node j and each neighbor node in Nj bid at the same time; the process is the same as in the first stage game. The final deal price is the minimum price among the neighbor nodes. Assuming k is the selected neighbor node, the payoff function of node j at time t is uj(t), given by

    uj(t) = αj(t)[(hj,k)² · b + φ(s, j) − φ(j, k) − e_j^s(t)/e_j(t)]    (10)

From the above equations, each node would like to choose the minimum-pricing node as its relay node in order to maximize its payoff. A node's forwarding success rate is affected by its neighbour nodes, and a node's payoff is constrained by its forwarding success rate. In order to maximize its payoff, a node pays its minimum-pricing neighbor node so as to ensure its forwarding success rate.
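As a concrete illustration, the following Python sketch (an illustrative reconstruction, not the authors' code) runs one stage of this auction: each neighbour quotes a price proportional to an assumed link-quality metric in the spirit of Eq. (1), the buyer selects the cheapest seller as in Eq. (4), and the stage payoff follows the form of Eq. (10). The function r() and all numeric values are assumptions:

    def r(energy, hops):
        """Assumed link-quality metric: more residual energy, fewer hops -> better."""
        return energy / hops

    b = 1.0  # reward per forwarded packet

    # neighbour id -> (residual energy e_j, hop count h_{j,k} to the Sink)
    neighbours = {"v1": (8.0, 3), "v2": (5.0, 2), "v3": (9.0, 4)}

    bids = {j: b * r(e, h) for j, (e, h) in neighbours.items()}   # Eq. (1)
    relay, price = min(bids.items(), key=lambda kv: kv[1])        # Eq. (4)

    # Buyer payoff for this stage, in the form of Eq. (10); alpha is the forwarding
    # success rate, phi_in the payment received from the previous stage, and the
    # last term the relative sending cost -- all illustrative.
    alpha, phi_in, e_send_ratio = 0.8, 1.5, 0.1
    h_jk = neighbours[relay][1]
    payoff = alpha * ((h_jk ** 2) * b + phi_in - price - e_send_ratio)
    print(f"selected relay: {relay}, deal price: {price:.2f}, payoff: {payoff:.2f}")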
4 Algorithmic Flow
Our goal is to find a reliable and stable path from the source to the destination; if such a path exists, then according to our model each node on this path will get a non-negative payoff. If a node is to give a price to its neighbours, it must know both its own evaluation information and its neighbour nodes' evaluation information. Therefore, we propose Algorithm 1 and Algorithm 2 to build the neighbour node table for each node.
1. Before each cluster head selection, each node Vi ∈ N sends a Hello message containing its node id and residual energy to its neighbours: Send(Hello, Vi, energy(ei)).
2. Each node Vi ∈ N receives the Hello messages of its neighbours: recv(Hello, Vj ∈ Ni, ej).
3. Find(Vj); Update(Vj, ej). /* update the recorded energy of Vj */
4. When the current cluster selection period is over, jump to step 1.
The neighbour node table is built up at the cluster head selection stage. In the process of sending packets, each node chooses the proper relay node to forward packets according to the neighbour node table. The buyer node and the seller nodes launch an auction for the forwarding service, and the seller nodes compete with each other for the profit from forwarding the buyer node's packets. The pricing routing game algorithm is given in Algorithm 3; a sketch of the neighbour-table exchange follows below.
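The following minimal sketch (an assumption-laden illustration, not the paper's implementation) mimics Algorithms 1 and 2: every node broadcasts a Hello message with its id and residual energy, and each receiver updates its neighbour node table. The hop count is included as well, since the system model states that each node also stores its neighbours' minimum hop counts; the in-memory loop stands in for the radio broadcast:

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        nid: str
        energy: float
        hops_to_sink: int
        table: dict = field(default_factory=dict)  # neighbour id -> (energy, hops)

        def hello(self):
            return (self.nid, self.energy, self.hops_to_sink)

        def recv_hello(self, msg):
            nid, e, h = msg
            self.table[nid] = (e, h)  # Find(Vj); Update(Vj, ej)

    nodes = [Node("v1", 8.0, 3), Node("v2", 5.0, 2), Node("v3", 9.0, 4)]
    for sender in nodes:                    # step 1: broadcast Hello
        for receiver in nodes:
            if receiver is not sender:      # assume all nodes are in radio range
                receiver.recv_hello(sender.hello())

    print(nodes[0].table)  # {'v2': (5.0, 2), 'v3': (9.0, 4)}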
We use Matlab to analyse the node's payoff, which is affected by the forwarding success rate and the link quality, and we use the game model between a source node and its neighbour nodes to calculate the source node's payoff.
We set the number of nodes to 50 and the number of BS nodes to 1; the nodes are static, the value of the reward b is 1, the hop count from the source node to the Sink is 10, and the energy consumed by sending packets is 0.1. Fig. 3 shows the payoff of the source node at different forwarding success rates α. We can see that the payoff at α = 0.8 is higher than the payoffs at α = 0.4 and α = 0.1. The experimental results
show that the larger the forwarding success rate, the higher the payoff; they also show that the source node's payoff increases as its neighbour nodes' residual energy increases. Therefore, each node would like to choose the maximum-energy neighbour node as its relay node to send packets to the destination.
For each source node Vk, select the relay node Vj ∈ Nk and send information to the neighbour nodes:
1. The source node estimates its price ρk(r(ek(t), hk,j)), and its neighbours Nk estimate their prices; for node Vj ∈ Nk, the price is βj(r(ej(t), hj,l)).
2. The neighbour nodes send their price information to the source node, and the source node determines the deal by ρk(r(ek(t), hk,j)) ≥ βj(r(ej(t), hj,l)), finding the proper node as relay node and giving the deal price φ(k, j) to the selected node.
3. Send data to Vj; calculate the payoff uk(t) of Vk.
4. For each node Vj: if Vj is the selected node, then buy the link quality from the neighbour nodes, give the deal price to the selected node, send data to Vl, and calculate the payoff of Vj:

       uj(t) = αj(t)[(hj,k)² · b + φ(k, j) − φ(j, l) − e_j^s(t)/e_j(t)]

   Else the payoff uj(t) = 0.
5. Any node not described above goes to sleep.
Fig. 3. The payoff of the source node at different forwarding success rates
Fig. 4. The payoff for different hop counts and residual energies of the neighbour nodes
The source node's payoff for different hop counts and different residual energies of the neighbour nodes is given in Fig. 4. We assume the forwarding success rate is 1 (α = 1), the hop count of the source node is 10, the energy consumed by sending packets is 0.1, and the residual energy of the source node is 8. Fig. 4 indicates that the smaller the hop count and the larger the residual energy of the neighbour nodes
are, the higher the payoff of the source node is. We also conclude that when the hop count of the neighbour nodes is larger and their residual energy is smaller, the payoff of the source node decreases rapidly, nearly to 0. In order to increase its payoff, the source node will choose the neighbour node with the minimum hop count and the largest residual energy as its relay node.
Fig. 5. The residual energy profile of a randomly selected node
Fig. 6. The profile of the sum of all active nodes' residual energy
(Figure: the residual energy of individual nodes, by node id, under LEACH and the pricing routing game.)
6 Conclusion
In this paper, we considered an energy-constrained cooperative wireless sensor network and proposed a pricing routing game model based on the first price sealed auction game. Through the pricing routing game model, we encourage the relay nodes to forward packets; each node aims at maximizing its payoff by choosing the optimal relay node. Compared to the LEACH protocol, our algorithm can effectively enhance the network's lifetime. In future work, we will discuss the network performance under the influence of dishonest nodes and cooperative nodes, respectively.
References
1. Machado, R., Tekinay, S.: A survey of game-theoretic approaches in wireless sensor net-
works. Computer Networks 52, 3047–3061 (2008)
2. Liu, Q., Liao, X.F., et al.: Dynamics of an inertial two-neuron system with time delay.
Nonlinear Dynamics 58(3), 573–609 (2009)
3. Komathy, K., Narayanasamy, P.: Best neighbor strategy to enforce cooperation among selfish
nodes in wireless ad hoc network. Computer Communications 30(18), 3721–3735 (2007)
4. Jun, C., Xiong, N.X., Yang, L.T., He, Y.: A joint selfish routing and channel assignment
game in wireless mesh networks. Computer Communications 31, 1447–1459 (2008)
5. Liu, H., Krishnamachari, B.: A price-based reliable routing game in wireless networks. In:
Proceedings of the First Workshop on Game Theory for Networks, GAMENETS 2006
(2006)
6. Heinzelman, W.B., Chandrakasan, A.P., Balakrishnan, H.: An application-specific proto-
col architecture for wireless microsensor networks. IEEE Transactions on Wireless Com-
munications 1, 660–670 (2002)
7. Zhong, S., Chen, J., Yang, Y.R.: Sprite: A Simple, Cheat-Proof, Credit-Based System for
Mobile Ad Hoc Networks. In: Proceedings of IEEE INFOCOM, pp. 1987–1997 (2003)
8. Marti, S., Giuli, T.J., Lai, K., Baker, M.: Mitigating Routing Misbehaviour in Mobile Ad
Hoc Networks. In: Proceedings of the Sixth Annual International Conference on Mobile
Computing and Networking, MobiCom 2000 (2000)
9. Lu, Y., Shi, J., Xie, L.: Repeated-Game Modeling of Cooperation Enforcement in Wireless
Ad Hoc Network. Journal of Software 19, 755–776 (2008)
10. Altman, E., Kherani, A.A., Michiardi, P., Molva, R.: Non-cooperative Forwarding in Ad-
hoc Networks, Technical Report INRIA Report No.RR-5116 (2004)
11. Wang, B., Han, Z., Liu, R.: Stackelberg game for distributed resource allocation over mul-
tiuser cooperative communication networks. IEEE Trans. Mobile Computing 8(7), 975–
990 (2009)
12. Shastry, N., Adve, R.S.: Stimulating cooperative diversity in wireless ad hoc networks
through pricing. In: Proc. IEEE Intl. Conf. Commun. (June 2006)
13. Zhong, S., Li, L., Liu, Y., Yang, Y.R.: On designing incentive-compatible routing and for-
warding protocols in wireless ad-hoc networks an integrated approach using game theoretical
and cryptographic techniques, Tech. Rep. YALEU/DCS/TR-1286, Yale University (2004)
14. Huang, J., Berry, R., Honig, M.: Auction-based spectrum sharing. ACM/Springer J. Mo-
bile Networks and Applications 11(3), 405–418 (2006)
15. Chen, J., Lian, S.G., Fu, C., Du, R.Y.: A hybrid game model based on reputation for spec-
trum allocation in wireless networks. Computer Communications 33, 1623–1631 (2010)
16. Huang, J., Han, Z., Chiang, M., Poor, H.V.: Distributed power control and relay selection
for cooperative transmission using auction theory. IEEE J. Sel. Areas Commun. 26(7),
1226–1237 (2008)
17. Chen, L., Szymanski, B., Branch, W.: Auction-Based Congestion Management for Target
Tracking in Wireless Sensor Networks. In: Proceedings of the 2009 IEEE International
Conference on Pervasive Computing and Communications (PERCOM 2009), Galveston,
TX, USA, 9-13, pp. 1–10 (2009)
A New Collaborative Filtering Recommendation
Approach Based on Naive Bayesian Method
1 Introduction
Recommendation systems are widely used by e-commerce web sites. They are a kind of information retrieval, but unlike search engines or databases they provide users with things they have never heard of before. That is, recommendation systems are able to predict users' unknown interests according to their known interests [8], [10]. There are thousands of movies that are liked by millions of people, and recommendation systems are ready to tell you which movie is of your type out of all these good movies. Though recommendation systems are very useful, the current systems still require further improvement: they often provide either only the most popular items or strange items that are not to users' taste at all. Good recommendation systems have more accurate prediction and lower computational complexity. Our work is mainly on the improvement of accuracy.
The naive Bayesian method is a famous classification algorithm [6], and it can also be used in the recommendation field. When the factors affecting the classification results are conditionally independent, the naive Bayesian method is proved to be the solution with the best performance. When it comes to the recommendation field, the naive Bayesian method is able to directly calculate the probability of a user's possible interests, and no definition of similarity or distance is required, while in
other algorithms, such as k-NN, there are usually many parameters and definitions to be determined manually. It is often fairly difficult to measure whether a definition is suitable or whether a parameter is optimal. Vapnik's principle says that when trying to solve some problem, one should not solve a more difficult problem as an intermediate step. On the other hand, although Bayesian networks [7] have good performance on this problem, they have a great computational complexity.
In this article, we design a new collaborative filtering algorithm based on the naive Bayesian method. The new algorithm has a complexity similar to that of the naive Bayesian method. However, it includes an adjustment of the independence assumption, which makes it applicable to instances where the conditional independence assumption is not obeyed strictly. The new algorithm thus provides a new, simple solution to the lack of independence, other than Bayesian networks. The good performance of the algorithm will provide users with more accurate recommendations.
2 Related Work
2.1 Recommendation Systems
As shown in Table 1, recommendation systems are implemented in many ways. They attempt to provide items which are likely to be of interest to the user according to characteristics extracted from the user's profile. Some characteristics come from the content of the items, and the corresponding method is called the content-based approach. In the same way, some come from the user's social environment, and the corresponding method is called the collaborative filtering approach [12].
The content-based approach reads the content of each item, and the similarity between items is calculated according to characteristics extracted from the content. The advantages of this approach are that the algorithm is able to handle brand-new items and that the reason for each recommendation is easy to explain. However, not all kinds of items can be read. Content-based systems mainly focus on items containing textual information [13], [14], [15]. When it comes to movies, the content-based approach does not work. Therefore, for this problem, we chose the collaborative filtering approach.
Compared to the content-based approach, the collaborative filtering approach does not care what the items are. It focuses on the relationship between users and items. That is, in this method, items in which similar users are interested are considered similar [1], [2].
Here we mainly talk about the collaborative filtering approach.
Table 1. Classification of recommendation systems: model-based and memory-based approaches
Collaborative filtering systems try to predict the interest of items for a particular user based on the items of other users' interest. Many collaborative systems have been developed in both academia and industry [1]. Algorithms for collaborative filtering can be grouped into two general classes: memory-based and model-based [4], [11].
Memory-based algorithms essentially are heuristics that make predictions based on the entire database: the value deciding whether to recommend an item is calculated as an aggregate of the other users' records for the same item [1].
In contrast to memory-based methods, model-based algorithms first build a model according to the database and then make predictions based on the model [5]. The main difference between model-based algorithms and memory-based methods is that model-based algorithms do not use heuristic rules; instead, models learned from the database provide the recommendations.
The improved naive Bayesian method belongs to the model-based algorithms, while the k-NN algorithm, which appears as a comparison later, belongs to the memory-based algorithms.
where

    q = p(mu1, mu2, ... | mx) / p(mu1, mu2, ...) = [p(mu1 | mx) / p(mu1)] · [p(mu2 | mx) / p(mu2)] · ...    (6)
Making recommendations. Now we have the prior probability for each item and the conditional probability for each pair of items. Algorithm 3 shows how we make the recommendations.
Each line of the input represents an interest record of a user, and M is the number of items. The online computation, which gives the recommendations for all users, also has a complexity of O(LM). Therefore, the total complexity is only O(LM).
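To make the scoring rule concrete, the following Python sketch (illustrative only; the paper's independence adjustment is omitted) ranks an unseen item mx for a user by p(mx) multiplied by the product of the factors p(mui | mx) / p(mui) from Eq. (6). The toy interaction table and the add-one smoothing are assumed details:

    records = [  # each line: the set of items one user is interested in
        {"m1", "m2"}, {"m1", "m2", "m3"}, {"m2", "m3"}, {"m1", "m3"},
    ]
    items = {"m1", "m2", "m3"}
    L = len(records)

    def p(item):                      # prior p(m)
        return sum(item in r for r in records) / L

    def p_cond(item, given):          # p(item | given), with add-one smoothing
        have = [r for r in records if given in r]
        return (sum(item in r for r in have) + 1) / (len(have) + 2)

    def score(candidate, interests):  # p(m_x) times the q-factors of Eq. (6)
        s = p(candidate)
        for m in interests:
            s *= p_cond(m, candidate) / p(m)
        return s

    user = {"m1"}                     # known interests of the user
    for mx in items - user:
        print(mx, round(score(mx, user), 3))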
4 Experiment
Many recommendation algorithms are in use nowadays. We compare the non-personalized recommendation and the k-NN recommendation mentioned before with our improved naive Bayesian algorithm.
4.3 Evaluation
We use the F-measure as our evaluation methodology. The F-measure is the harmonic mean of precision and recall [3]. Precision is the number of correct recommendations divided by the number of all returned recommendations, and recall is the number of correct recommendations divided by the number of all the known interests that are supposed to be discovered. A recommendation is considered correct if it is included in the group of interests that was held out as unknown. Note that the experimental result values shown later are the doubled F-measure.
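A minimal sketch of this metric (illustrative values, assuming set-valued recommendations and held-out interests):

    def f_measure(recommended, held_out):
        correct = len(set(recommended) & set(held_out))
        precision = correct / len(recommended) if recommended else 0.0
        recall = correct / len(held_out) if held_out else 0.0
        return (2 * precision * recall / (precision + recall)
                if precision + recall else 0.0)

    print(f_measure(["m2", "m3", "m5"], ["m3", "m5", "m7", "m9"]))  # ~0.571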
5 Conclusion
References
1. Adomavicius, G., Tuzhilin, A.: The next generation of recommender systems: A sur-
vey of the state-of-the-art and possible extensions. IEEE Transactions on Knowl-
edge and Data Engineering (2005)
2. Linden, G., Smith, B., York, J.: Amazon.com recommendations: Item-to-item col-
laborative filtering. IEEE Internet Computing (2003)
3. Makhoul, J., Kubala, F., Schwartz, R., Weischedel, R.: Performance measures for
information extraction. In: Proceedings of Broadcast News Workshop 1999 (1999)
4. Breese, J.S., Heckerman, D., Kadie, C.: Empirical Analysis of Predictive Algo-
rithms for Collaborative Filtering. In: Proc. 14th Conf. Uncertainty in Artificial
Intelligence (July 1998)
5. Hofmann, T.: Collaborative Filtering via Gaussian Probabilistic Latent Semantic
Analysis. In: Proc. 26th Ann. Int’l ACM SIGIR Conf. (2003)
6. Kotsiantis, S.B., Zaharakis, I.D., Pintelas, P.E.: Machine learning: a review of clas-
sification and combining techniques. Artificial Intelligence Review (2006)
7. Yuxia, H., Ling, B.: A Bayesian network and analytic hierarchy process based
personalized recommendations for tourist attractions over the Internet. Expert
System With Applications (2009)
8. Resnick, P., Varian, H.R.: Recommender systems. Communications of the ACM
(March 1997)
Statistical Approach for Calculating the Energy Consumption by Cell Phones

Abstract. Energy consumption by cell phones has a great effect on the energy crisis. Calculating and optimizing the method of phone service is essential. In our solution, we build three main models. The transition model reflects the relationship between the change of energy and time; next, we give the function of energy consumption in the steady state. The optimization approach structures the function of energy consumption and constructs a function with a convenience degree to emphasize the convenience of cell phones. Using the waste model, we obtain the waste functions under different situations and get the total wasted energy.
1 Introduction
Recently, the use of mobile computing devices has increased in both computation and communication. With the development of cell phones, landline telephones are gradually being given up. We have noticed that people's chargers stay warm even when not charging a phone. All of this drains electricity; it is not just wasting money, but also adding to the pollution created by burning fossil fuels [1]. According to one investigation, only 5% of the power drawn by cell phone chargers is actually used to charge phones; the other 95% is wasted when the charger is left plugged into the wall but not into the phone [2]. There is no doubt that calculating the energy consumption of cell phones and optimizing the method of service provided by landlines and cell phones is significant for coping with the energy crisis. Although increases in the perceived likelihood of an energy shortage had no effect, increments in the perceived noxiousness or severity of the energy crisis strengthened intentions to reduce energy consumption [3].
Over the last decades, in order to reduce the energy waste of communication equipment, many academics and politicians have put forward algorithms, methods, models, and arguments concerning energy consumption. In [4], several authors compare the power consumption of an SMT (DSP) with a CMP (DSP) under different architectural assumptions; they find that the SMT (DSP) uses up to 40% less power than the CMP (DSP) in their target environment. To reduce the idle power, Eugene Shih and Paramvir Bahl introduce a technique to increase the battery lifetime of a PDA-based phone by reducing its idle power, the power a device consumes in a
"standby" state. Using this technique, we can increase the battery lifetime by up to
115%. In that paper, they describe the design of the "wake-on-wireless" energy-saving strategy and the prototype device they implemented [5].
In this paper, we use available data to build a transition model and interpret the steady state in order to study the consequences of the change in electricity utilization after landlines are replaced by cell phones. Then we consider an optimal way of providing phone service by discussing three different cases, and we discuss the convenience of using cell phones instead of landlines. Besides, we use a population growth model and an economic growth model to predict the future energy consumption by cell phones in combination with population and economy.
This paper is organized as follows. In Section 2, we design the models of energy consumption by cell phones; we also analyze and discuss the relationship between the energy consumption and people's habits of using cell phones. Section 3 is an application of the models to a "Pseudo US", where we find that Americans waste plenty of oil because of their bad habits. In Section 4, we draw conclusions.
2 Design of Model
With the development of technology, cell phone usage is mushrooming, and many people are using cell phones and giving up their landline telephones [6], [12]. Our model only involves the "energy" consequences of the cell phone revolution. Here, we make the assumption that every cell phone comes with a battery and a charger. We design the models with the current US in mind, a country of about 300 million people.
In this paper, we develop a model to analyze how the cell phone revolution impacts electricity consumption at the national level. The basic component of our model is the household. A household can exist in one of three disjoint states at a time. The three states are as follows: (1) Initial State: the household only uses landline telephones. (2) Acquisition State: the household acquires its first cell phone. (3) Transition State: all household members have their own cell phones but the landline is retained.
Definition 2. We define "waste" in a simple way: it is the misuse of energy with no utilization. We consider the "waste" of electricity in three different cases: charging the cell phone while it is turned on, continuing to charge the cell phone after it has fully charged, and leaving the charger plugged in but not charging the device.
Definition 3. Compared with "waste", we define "consumption" as the common use of energy; no matter whether the utilization is high or low, no energy is wasted.
230 S. Pang and Z. Yu
If all the landlines are replaced by cell phones, there is a change in electricity utilization. Here, we assume that each family has only one landline phone and that each member has just one cell phone [7], and that the energy consumption of the average cell phone remains constant. We consider those who do not have cell phones as belonging to families that own a landline phone. If someone loses his cell phone, he buys a new one immediately. The energy consumed by cell phones is calculated through the following formula:

    W(t) = (H(t) · m · P1 − H(t) · P2) × t    (1)

where H(t) is the number of landline users at time t (it also represents the total number of families at time t), m is the average number of family members in the United States, P1 is the average power of cell phones on the U.S. market, and P2 is the average power of a single landline.
As W(t) changes with time t, it is possible that the energy consumption reaches a steady state. Here, "steady state" means that the growth of the energy consumption remains unchanged over time. Mathematically, we calculate the derivative of function (1), expressed as W′(t). Since

    H(t) = H(t0) · e^(ρ×t)    (2)

where H(t0) represents the total number of families at time t0 (which we can consider a constant), and ρ is the growth rate of the number of mobile phone users. Based on functions (1) and (2), we can evaluate W′(t). Generally, when W′(t) equals zero, the system reaches the steady state. Here, it is certain that H(t0) ≠ 0 and m ≥ 1; for that reason, W′(t) cannot equal zero. Consequently, only when all landline users have transformed into mobile phone users can the steady state be reached.
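A small numerical sketch of Eqs. (1) and (2) follows; every parameter value is an illustrative assumption, not the paper's data:

    import math

    m, P1, P2 = 2.6, 1.5, 3.0   # members/household, cell phone W, landline W (assumed)
    H0, rho = 100e6, 0.01       # households at t0 and growth rate (assumed)

    def H(t):
        return H0 * math.exp(rho * t)            # Eq. (2)

    def W(t):
        return (H(t) * m * P1 - H(t) * P2) * t   # Eq. (1)

    for t in (1.0, 2.0, 5.0):
        print(f"t = {t}: W = {W(t):.3e}")
    # Because H(t0) != 0 and m >= 1, W'(t) never vanishes: consumption keeps
    # changing until every landline user has switched to a cell phone.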
Consider a second "Pseudo US" [8], a country with about the same economic status as the current US. However, this emerging country has neither landlines nor cell phones. We need to find an optimal way of providing its people with phone service.
ω1
W = ω0 × ( P3 β 0 + P4 β1 ) × T + × P5 × T . (3)
m
There ω0 is the population of America who own cell phones. ω1 is the population who
don’t have cell phones, so the sum of ω0 and ω1 is the total population. T is the time
after charging from the beginning through to the next complete depletion when the
power in the mobile phone completely exhausted. P3 is the power of a cell phone when
maintain a cell phone call. P4 is the power of a cell phone when it is not used. P5 is the
average power of a single landline. β0 is the percentage of T when maintain a cell
phone call. β1 is the percentage of T when the cell phone is not used. There are three
different conditions when ω0 , ω1 takes different values:
(1) When ω0 = 0 , ω1 ≠ 0 . All of the people use landlines. At this time,
ω1
W= ×P 5 ×T .
m
(2) When ω0 ≠ 0 and ω1 ≠ 0, some people use landlines while others use cell phones. The whole energy consumption is expressed by function (3).
(3) When ω0 ≠ 0 and ω1 = 0, all of the people use cell phones. Then W = ω0 × (P3·β0 + P4·β1) × T.
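The three conditions can be folded into a single function of ω0 and ω1. A minimal sketch, with placeholder parameter values rather than the paper's:

    def phone_energy(omega0, omega1, T, m=2.6, P3=2.0, P4=0.3, P5=3.0,
                     beta0=0.05, beta1=0.95):
        """Energy consumption over period T, following formula (3).

        Setting omega0 = 0 or omega1 = 0 recovers conditions (1) and (3);
        all parameter defaults are illustrative, not taken from the paper.
        """
        cell = omega0 * (P3 * beta0 + P4 * beta1) * T
        landline = (omega1 / m) * P5 * T
        return cell + landline

    print(phone_energy(0, 3e8, T=24))      # condition (1): landlines only
    print(phone_energy(2e8, 1e8, T=24))    # condition (2): mixed
    print(phone_energy(3e8, 0, T=24))      # condition (3): cell phones only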
Considering the many ways in which people waste electricity, we divide them into three basic situations. For every situation we can derive a waste function, and thus calculate the wasted energy accurately. Here are the details:
w = p3 × t1 × N(t) × γ1 . (5)
where p3 is the rated power of cell phones, t1 is the phone standby time, N(t) is the total population of the United States at time t, and γ1 is the proportion of Americans who charge the cell phone while it is turned on. To find the functional relationship between w and t, the relationship between the population N(t) and t has to be established first. Here, we estimate the United States population N(t) using a logistic model.
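A minimal sketch of the logistic population estimate feeding the waste function (5); the parameters N0, K, r and t0 are illustrative, not the paper's fitted values:

    import math

    def logistic_population(t, N0=2.8e8, K=4.5e8, r=0.02, t0=2000):
        """Logistic model N(t) = K / (1 + (K/N0 - 1) * exp(-r * (t - t0)))."""
        return K / (1.0 + (K / N0 - 1.0) * math.exp(-r * (t - t0)))

    def waste_charging_on(t, p3=3.0, t1=1.5, gamma1=0.3):
        """Formula (5): waste from charging the phone while it is switched on."""
        return p3 * t1 * logistic_population(t) * gamma1

    print(waste_charging_on(2009))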
w = p5 × t3 × N(t) × γ3 . (7)
where p5 is the power drawn while the charger is left plugged in without charging the device, t3 is the time per day that the charger is left plugged in without charging, and γ3 is the proportion of people who leave the charger plugged in without charging the device. When the phone continues to charge after it is fully charged, we can simply take the charger's power as the main power draw.
3 Application of Model
According to our assumption, the growth rate of mobile phone users is the same as the economic growth rate. The discussion above covers the current situation; now consider population and economic growth over the next 50 years.
For each decade of the next 50 years, we predict the energy needs for providing phone service based upon the analysis in the first three parts, again assuming that electricity is provided from oil, and we interpret the predictions in terms of barrels of oil using the population and economic growth model.
The Solow neoclassical model of economic growth adds labor-quality and capital-quality elements to the Cobb-Douglas production function [10], from which we obtain the model:

α(λ) = α·(1 + λ) . (8)

Fig. 1 shows the energy needs over the next 50 years and expresses the energy consumption in terms of oil.
We consider a second "Pseudo US", a country of about 300 million people with about the same economic status as the current US. Cell phones periodically need to be recharged; however, many people always keep their chargers plugged in, and many charge their phones every night whether they need recharging or not. This causes a large amount of energy consumption. Assume that the Pseudo US supplies electricity from oil. Taking the situation in which people continue to charge their cell phones after they are fully charged as an example, we can calculate the wasted energy
according to formula (6). For a particular mobile phone, the battery capacity is C = 850 mAh [11], [12] and the battery voltage is V = 3.7 V. Thus, p4 = (C × V)/1000 = 3.145 W. Taking t = 2009, t2 = 5 and γ2 = 5% [13], we get the result w = 1.2721×10^8 J. Converted to oil, the American people waste B = w/w4 = 12.085 barrels per day in this way. Similarly, the other two situations yield 7.411 and 20.804 barrels, respectively. Thus, Americans waste 40.3 barrels of oil per day.
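The arithmetic of this worked example can be reproduced directly. In the sketch below, w4 (the oil conversion factor) is not stated in the excerpt, so it is backed out of the quoted result:

    # Battery-based estimate of the charger power, as in the worked example.
    C = 850                      # battery capacity (mAh) [11], [12]
    V = 3.7                      # battery voltage (V)
    p4 = (C * V) / 1000.0        # = 3.145 W, matching the text
    w = 1.2721e8                 # wasted energy (J), as quoted in the text
    w4 = w / 12.085              # conversion factor implied by B = w / w4 (assumed)
    B_total = 12.085 + 7.411 + 20.804   # the three waste situations, barrels/day
    print(p4, round(B_total, 1))        # 3.145, 40.3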
4 Conclusions
From the models we built, as landlines are replaced by cell phones, the pattern of electricity utilization changes. We built a transition model to estimate the consumption of energy and obtained the steady-state condition under which the growth of energy consumption remains unchanged. We find that the amount of energy consumed by phones is very large. With the energy crisis becoming more and more serious, we have to make the best use of energy and conserve it.
However, our models still have weaknesses. The model does not examine all household member dynamics, i.e., members being born, growing old enough to need cell phones, moving out, starting households of their own, and so on. Another weakness is that it ignores infrastructure: we do not compare the energy cost of cellular infrastructure (towers, base stations, servers, etc.) with that of landline infrastructure (telephone lines and switchboards).
References
1. Robert, L.H.: Mitigation of Maximum World Oil Production: Shortage scenarios. Energy
Policy 36, 881–889 (2008)
2. Mayo, R.N., Ranganathan, P.: Energy Consumption in Mobile Devices: Why Future Sys-
tems Need Requirements–Aware Energy Scale-Down. In: Falsafi, B., VijayKumar, T.N.
(eds.) PACS 2003. LNCS, vol. 3164, pp. 26–40. Springer, Heidelberg (2005)
3. Hass, J.W., Bagley, G.S., Rogers, R.W.: Coping with the Energy Crisis: Effects of Fear
Appeals upon Attitudes toward Energy Consumption. Journal of Applied Psychology 60,
754–756 (1975)
4. Stefanos, K., Girija, N., Alan, D.B., Zhigang, H.: Comparing Power Consumption of an
SMT and a CMP DSP for Mobile Phone Workloads. In: The 2001 International Conference
on Compilers, Architecture, and Synthesis for Embedded Systems (2001)
5. Eugene, S., Paramvir, B., Michael, J.S.: Wake on Wireless: An Event Driven Energy Saving
Strategy for Battery Operated Devices. In: 8th Annual International Conference on Mobile
Computing and Networking, pp. 160–171 (2002)
6. Singhal, P.: Integrated Product Policy Pilot Project. Nokia Corporation (2005)
7. Paolo, B., Andrea, R., Anibal, A.: Energy Efficiency in Household Appliances and Lighting.
Springer, New York (2001)
8. Tobler, W.R.: Pseudo-Cartograms. The Am. Cartographer 13, 40–43 (1986)
9. Sabate, J.A., Kustera, D., Sridhar, S.: Cell-phone Battery Charger Miniaturization. In: In-
dustry Applications Conference, pp. 3036–3043 (2000)
10. Meeusen, W., Broeck, J.: Efficiency Estimation from Cobb-Douglas Production Functions
with Composed Error. International Economic Review 9, 435–444 (1977)
11. Toh, C.: Maximum Battery Life Routing to Support Ubiquitous Mobile Computing in
Wireless ad hoc Networks. IEEE Communications, 138–147 (2001)
Comparison of Ensemble Classifiers in
Extracting Synonymous Chinese Transliteration
Pairs from Web
1 Introduction
There is no transliteration standard across all Chinese language regions; thus, many
different Chinese transliterations can arise. As an example, the Australian city "Sydney" has the different transliterations 悉尼 (xi ni), 雪梨 (xue li) and 雪黎 (xue li). Someone who uses the Chinese language may never know all these different Chi-
nese synonymous transliterations; hence, this level of Chinese transliteration variation
leads readers to mistake transliterated results or to retrieve incomplete results when
searching the Web for documents or pages if a trivial transliteration is submitted as the
search keyword in a search engine such as Google or Yahoo. Moreover, while varia-
tions in Chinese transliteration have already emerged in all Chinese language regions,
including China, Hong Kong and Taiwan, we still lack effective methods to address
this variation. Most research focuses on machine transliteration across two different
languages; in contrast, fewer efforts in the literature have focused on confirming whether a pair comprising a Chinese transliteration term and a Chinese term (or another Chinese transliteration) is synonymous.
In this paper, we compare several ensemble classifiers for confirming whether a pair is "synonymous" or "not synonymous". We first construct an integrated confirmation framework (ICF) that uses a majority-voting scheme and a boosting scheme [1] together to confirm pairs robustly, since majority voting and boosting have been used to reduce noise and overfitting when training classifiers.
Then, the well-known ensemble classifiers boosting [1] and bagging [2] are applied to this classification problem. The contribution of this research is that the results of the confirmation framework can be applied to construct a new database of synonymous transliterations, which can then be used to increase the size of the transliterated vocabulary, making it useful for expanding input queries in search engines such as Google and Yahoo. This could alleviate the problem of incomplete search results stemming from the existence of different transliterations of a single foreign word.
2 Decision-Making
Two major steps are included in the framework for confirming whether a pair is synonymous. First, we study two Romanization transcription systems, the National Phonetic System of Taiwan (BPMF system) and the Pinyin system, to transcribe Chinese characters into sound alphabets. The BPMF system is used to transcribe a Chinese character into a phonetic sequence for use with CSC [3] and LC [4]; the Pinyin system is used for ALINE [5], FSP [6] and PLCS [7].
Measuring the similarity of two sets of sound alphabet sequences produces a similarity score between two transliterations. Assume that we have two Chinese transliterations A = {a1, …, an, …, aN} and B = {b1, …, bm, …, bM}, where an is the nth character of A and bm is the mth character of B; N may not equal M. The characters an and bm are expanded into sound alphabet sequences an = {an,1, …, an,i, …, an,I} and bm = {bm,1, …, bm,j, …, bm,J}, respectively. The alphabets an,i and bm,j are generated by either the BPMF system or the Pinyin system.
Second, we use a dynamic programming-based approach to obtain the similarity score for a given Chinese pair, that is, a Chinese transliteration versus another Chinese term. To acquire the maximum similarity score between the two sets of sound alphabet sequences (formed from A and B, respectively), represented as score(A, B), a dynamic programming-based approach aligns A and B by adjusting the warp on the axis T(n, m) of sim(an, bm), which represents the similarity between an and bm. The recursive formula (1) is defined as follows:

T(n, m) = max{ T(n−1, m−1) + sim(an, bm), T(n−1, m), T(n, m−1) } (1)
where the formula respects the similarity range [0,1]; accordingly, the two normalized
scores in the above examples are 0.87 and 0.71, respectively.
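A minimal sketch of recursion (1), taking sim as a user-supplied similarity function on sound alphabet sequences; the toy overlap measure below is for illustration only:

    def dp_score(A, B, sim):
        """Dynamic-programming alignment score per recursion (1).

        A, B: lists of sound-alphabet sequences for two transliterations.
        sim: similarity function sim(a_n, b_m) returning a value in [0, 1].
        """
        N, M = len(A), len(B)
        T = [[0.0] * (M + 1) for _ in range(N + 1)]
        for n in range(1, N + 1):
            for m in range(1, M + 1):
                T[n][m] = max(T[n - 1][m - 1] + sim(A[n - 1], B[m - 1]),
                              T[n - 1][m],
                              T[n][m - 1])
        return T[N][M]

    # Toy usage with a crude set-overlap similarity (illustrative only):
    overlap = lambda a, b: len(set(a) & set(b)) / max(len(set(a) | set(b)), 1)
    print(dp_score([['x', 'i'], ['n', 'i']], [['x', 'u', 'e'], ['l', 'i']], overlap))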
Let X be a dataset containing a set of n data pairs, and let xj ∈ X be a pair consisting of a transliteration and another Chinese term, which corresponds to a class label yj ∈ Y, representing a synonymous pair or not a synonymous pair. Let M = {m1, …, mI} be a set of pronunciation-based approaches, where mi is the ith approach in M. For a pair xj, let scorej = {scorej,1, …, scorej,I} be a set of similarity scores, where scorej,i is measured by mi (using formula (2)) for xj, and let vj = {vj,1, …, vj,I} be a set of decisions, where vj,i is a decision (i.e., a vote) derived from scorej,i. In particular, a pair xj has three entities, namely yj, vj and scorej.
The similarity entity scorej drives the decision entity vj. Most studies in the literature take a vote vj,i that is accepted when scorej,i ≥ θi and rejected when scorej,i < θi, where the parameter θi is a threshold. A higher value of θi often brings higher precision but lower recall, whereas a lower value of θi often brings lower precision but higher recall. Nevertheless, the determination of appropriate parameters θi is usually empirical in many information retrieval applications.
Instead of requiring the parameters θi, we use the K-nearest neighbor algorithm to obtain vj with the help of scorej, because it provides a rule by which xj can be classified according to its K nearest neighbor pairs; by the same token, the vote vj,i is assigned by a majority vote over the neighbor votes v(j→k),i, k = 1, …, K, with respect to scorej,i, where "j → k" represents the kth nearest neighbor training pair of xj. Initially, we set v(r),i = yr in advance if xr is a training pair.
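A minimal sketch of this threshold-free vote: v(j),i is the majority label among the K training pairs whose scores under mi are nearest to scorej,i (K and the toy data below are illustrative):

    from collections import Counter

    def knn_vote(score_ji, train_scores, train_labels, K=5):
        """Vote for one pair under approach m_i via its K nearest training pairs.

        score_ji: similarity score of the test pair under approach m_i.
        train_scores, train_labels: scores and {+1, -1} labels of training pairs.
        """
        order = sorted(range(len(train_scores)),
                       key=lambda r: abs(train_scores[r] - score_ji))
        nearest = [train_labels[r] for r in order[:K]]
        return Counter(nearest).most_common(1)[0][0]

    print(knn_vote(0.82, [0.9, 0.85, 0.4, 0.3, 0.88, 0.2], [1, 1, -1, -1, 1, -1]))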
Since a majority-voting scheme is a well-known integrated voting approach for generating a final decision, it is applied to obtain a class label. The class label yj is determined using a majority-voting scheme on vj. In particular, the voting function h(xj) determines a predicted class label via a majority vote of vj,1, …, vj,I, and is written as

h(xj) = argmax_{y ∈ Y} Σ_{i=1..I} δ(vj,i, y) , (3)

where the function δ returns a Boolean value.
The ensemble framework proposed in this paper considers the use of multiple learning approaches M = {m1, …, mI} and multiple data fractions X1, X2, …, XT. Let εt, defined in formula (6), be the probability of training error at the tth round, and let εt,i, defined in formula (7), be the probability of training error of the comparison approach mi at the tth round.
These error entities are good candidates for driving the data distribution for Xt. A pair xj obtaining the correct vote at round t receives a lower probability value Dj(t+1) and is less likely to be drawn at round t+1; Dj(t+1) is updated accordingly.
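The excerpt does not reproduce the update formula for Dj(t+1); the sketch below shows a standard AdaBoost-style reweighting consistent with the description (correctly voted pairs are down-weighted), which is an assumption rather than ICF's verbatim rule:

    def reweight(D, correct, eps):
        """AdaBoost-style update: an assumption, not ICF's exact formula.

        D: current sampling probabilities D_j(t); correct: booleans stating
        whether pair x_j got the right vote at round t; eps: training error
        at round t (must be < 0.5 for beta < 1).
        """
        beta = eps / (1.0 - eps)
        new = [d * (beta if c else 1.0) for d, c in zip(D, correct)]
        Z = sum(new)
        return [d / Z for d in new]      # renormalize to a distribution

    D = [0.25, 0.25, 0.25, 0.25]
    print(reweight(D, [True, True, False, True], eps=0.25))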
3 Experiments
The data source is selected from the study in [3] in which the dataset contains a total of
188 transliterations collected from Web news sources. These transliterations are proper
names, including geographic, entertainment, sport, political and some personal names.
They are built as a set of pairs, some of which are synonymous and others of which are
not synonymous pairs. In other words, the class label of each pair is known in advance.
The pairs are constructed as a training dataset and are used for decision-making.
In particular, a total of 17,578 unique pairs (C(188,2)) is obtained. However, we only allow the length difference within a pair to be at most one, because the length difference between a Chinese transliteration and its actual synonym is at most one in most cases [3]. Accordingly, many pairs can be discarded; we retain a total of 12,006 pairs, which include 436 actual-synonymous pairs and 11,570 pseudo-synonymous pairs (i.e., pairs that are not synonymous).
To reduce the likelihood that the particular training data drives confirmation performance, and to remove the influence of an imbalanced training dataset, we perform a validation task involving ten different datasets selected from the training data by sampling without replacement, ensuring that the number of positive pairs equals the number of negative ones. Therefore, ten training datasets, each of which includes 436 positive pairs and 436 negative ones, are used for the experiments.
Two datasets, D50 and D97, are used for the experiments in [9] and contain translite-
rations. The second dataset, referred to as D97, is from the 2008 TIME 100 list of the
world's most influential people. There are a total of 104 names in the list, since four
entries include two names. Ninety-seven names are retained for the experiment. Seven
names are ignored, namely, Ying-Jeou Ma, Jintao Hu, Jeff Han, Jiwei Lou, Dalai Lama,
Takahashi Murakami, and Radiohead. The first five have Chinese last names that have
standard Chinese translations. The sixth term is a Japanese name for which translation
is usually not done using transliteration. The last name is that of a music band; its
translation to Chinese is not according to its pronunciation, but its meaning.
In this experiment, we input the transliterations in D50 and D97 to collect their syn-
onyms from a real-world Web corpus using the integrated confirmation framework
proposed in this paper. For each transliteration, we collected Web snippets by submit-
ting a search keyword to the Google search engine. The search keyword is used to
retrieve Web snippets; however, it does not contribute information to the confirmation
framework, which determines whether a pair is synonymous.
To construct a pair, we use the original term of the given transliteration as a search
keyword, because the original term is able to retrieve appropriate Web documents in
which the transliteration’s synonyms appear. Let a transliteration (abbreviated as TL)
be an entry. The TL’s original term (abbreviated as ORI), which is treated as the search
keyword for the search engine, is represented as QOri and is submitted to retrieve search
result Web snippets, represented as DOri. The set DOri is limited to Chinese-dominant
Web snippets. The procedure of returning a pair by collecting Web snippets from the
Google search engine is as follows.
A. For each TL in D50 and D97, we use QORI to download Web snippets DORI. In particular, we set |DORI| to 20 for each TL, because the snippets at the head of the returned list are often more relevant to the search keyword. The size of the downloaded DORI for D50 is 1,000, whereas that for D97 is 1,940.
B. We delete known vocabulary terms with the help of a Chinese dictionary for DORI
and apply an N-gram algorithm to segment Chinese n-gram terms for the remain-
ing fractional sentences in DORI. Furthermore, most synonymous transliterations
(TLs with their STs) have the same length, but some of them have different lengths
of at most one [3]. Therefore, we retain the Chinese terms from DORI while con-
trolling for length. Each Chinese term of length N is retained, with N = |TL|−1 to N = |TL|+1 and N ≥ 2. The number of remaining pairs for D50 is 9,439, whereas that for D97 is 19,263, where each pair consists of the given TL and a remaining Chinese n-gram term.
C. However, some pairs have similarities that are not high enough and thus are never
considered synonymous pairs. We set a similarity threshold to ignore those pairs.
According to the findings in [3], a lower similarity threshold can be set to 0.5 by
using the CSC approach to cover effectively all examples of synonymous transli-
terations. After discarding the pairs with similarities lower than 0.5, a total of 2,132
and 5,324 pairs are retained for D50 and D97, respectively. These pairs are confirmed by the framework proposed in this paper, as discussed in the next section. A sketch of the filtering in steps B and C follows below.
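Steps B and C reduce to a length filter followed by a similarity threshold. A minimal sketch, with csc_score standing in for the CSC similarity measure:

    def candidate_pairs(tl, ngram_terms, csc_score, theta=0.5):
        """Keep n-gram terms whose length differs from |TL| by at most one
        (and is at least 2), then drop candidates scoring below theta."""
        kept = [term for term in ngram_terms
                if max(2, len(tl) - 1) <= len(term) <= len(tl) + 1]
        return [(tl, term) for term in kept if csc_score(tl, term) >= theta]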
• Bagging [2]: This combines multiple classifiers to predict the class label for a pair by integrating their corresponding votes. The base classification algorithm we use is KNN with k=5, chosen for its simplicity.
• Boosting [1, 8]: This requires a weak learning algorithm; we use KNN with k=5 in this study.
ICF, bagging and boosting all share a parameter T, the number of iterations. One study [8] set T to 10 for the boosting scheme; we follow the same setting in our experiments.
A total of ten results are obtained for the testing data, since ten training datasets are involved in the validation process. The evaluation measure is accuracy, which is common in classification tasks. Moreover, we use a box-plot analysis to graphically compare a total of nine approaches, including ICF, boosting, bagging, MV, and five individual approaches (CSC, LC, ALINE, FSP and PLCS). The results are shown in Figure 1.
Fig. 1. Box-plot analysis for nine approaches in the testing datasets (a) D50 and (b) D97
In Figure 1, the experimental results show that the average accuracy in the confirmation of Chinese transliteration pairs for the three ensemble approaches (namely ICF, boosting, and bagging) is higher than that of the individual approaches. This is because the three ensemble approaches allow repeated learning over varying data distributions, whereas the individual approaches perform the experiment only once, driven by the particular training datasets. In addition, ICF achieves an average accuracy of 0.93 on D50 and 0.89 on D97 and is the best among the nine approaches, because it considers several individual approaches together in evaluating varying data distributions. Meanwhile, CSC achieves an average accuracy of 0.88 on D50 and 0.85 on D97 and is the best of the five individual approaches. Moreover, the shorter distance between the top and the bottom of its box demonstrates that ICF produces a much more stable performance than the others, while bagging produces the most unstable performance among the ensemble approaches; this is because ICF best achieves the learning objectives under varying data distributions. All five individual approaches produce a less stable performance than the ensemble approaches, because they are strongly affected by the training datasets.
4 Conclusions
In this paper, we propose a new ensemble framework for confirming Chinese transliteration pairs. Our framework confirms and extracts pairs of synonymous transliterations from a real-world Web corpus, which helps search engines such as Google and Yahoo retrieve more complete search results. Our framework combines the majority-voting scheme and the boosting scheme. The experimental results, comparing the proposed framework with boosting, bagging, general majority voting, and five individual approaches, demonstrate that the proposed framework is robust in improving classification accuracy and stability.
References
1. Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: Proceedings of
the 13th International Conference on Machine Learning, pp. 148–156 (1996)
2. Breiman, L.: Bagging Predictors. Machine Learning 24, 123–140 (1996)
3. Hsu, C.C., Chen, C.H., Shih, T.T., Chen, C.K.: Measuring similarity between transliterations
against noise data. ACM Transactions on Asian Language Information Processing 6, 1–20
(2007)
4. Lin, W.H., Chen, H.H.: Similarity measure in backward transliteration between different
character sets and its applications to CLIR. In: Proceedings of Research on Computational
Linguistics Conference XIII, Taipei, Taiwan, pp. 97–113 (2000)
5. Kondrak, G.: Phonetic alignment and similarity. Computers and the Humanities 37, 273–291
(2003)
6. Connolly, J.H.: Quantifying target-realization differences. Clinical Linguistics & Phonetics,
267–298 (1997)
7. Gao, W., Wong, K.-F., Lam, W.: Phoneme-based transliteration of foreign names for OOV
problem. In: Su, K.-Y., Tsujii, J., Lee, J.-H., Kwong, O.Y. (eds.) IJCNLP 2004. LNCS
(LNAI), vol. 3248, pp. 110–119. Springer, Heidelberg (2005)
8. Sun, Y., Wang, Y., Wong, A.K.C.: Boosting an associative classifier. IEEE Transactions on
Knowledge and Data Engineering 18, 988–992 (2006)
9. Hsu, C.C., Chen, C.H.: Mining Synonymous Transliterations from the World Wide Web.
ACM Transactions on Asian Language Information Processing 9(1), 1–28 (2010)
Combining Classifiers by Particle Swarms with Local
Search
Liying Yang
School of Computer Science and Technology, Xidian University, Xi’an, 710071, China
[email protected]
1 Introduction
Combining classifiers is one of the most prominent techniques currently used to
augment the accuracy of learning algorithms. Instead of evaluating a set of different algorithms against a representative validation set and selecting the best one, multiple classifier systems (MCS) integrate several models for the same problem. MCS came alive in the 1990s and almost immediately produced promising results [1][2]. From this beginning, research in this domain has increased and grown
tremendously, partly as a result of the coincident advances in the technology itself.
These technological developments include the production of very fast and low cost
computers that have made many complex pattern recognition algorithms practicable
[3]. A large number of combination schemes have been proposed in the literature [4].
Majority vote is the simplest combination method and has been a much-studied sub-
ject among mathematicians and social scientists. In majority vote, each individual has
the same importance. A natural extension to majority vote is to assign a weight to each individual, which yields the weighted combination algorithm. Since individuals differ under most circumstances, the weighted combination algorithm provides a more appropriate solution; the key to it is the weights. Two weighted combination models based on particle swarm optimization were proposed in our previous work [5][6]. To avoid the local optima of PSO-WCM, a new weighted combination model is proposed in this paper, which combines PSO with local search to combine multiple classifiers.
The basic PSO updates the velocity and position of particle i in dimension d as

vid(t+1) = vid(t) + c1·r1·(pid − xid(t)) + c2·r2·(pgd − xid(t)) , (1)
xid(t+1) = xid(t) + vid(t+1) , (2)

where t is the loop counter; i = 1, …, m; d = 1, …, D; c1 and c2 are two positive constants called the cognitive and social learning rates, respectively; and r1 and r2 are random numbers in the range [0,1]. The velocity vid is limited to [−vmax, vmax], with vmax a constant determined by the specific problem. The original version of PSO lacks a velocity control mechanism, so it has a poor ability to search at a fine grain [9]. Many researchers have worked to overcome this disadvantage. Shi and Eberhart introduced a time-decreasing inertia factor into equation (1) [10]:

vid(t+1) = μ·vid(t) + c1·r1·(pid − xid(t)) + c2·r2·(pgd − xid(t)) , (3)

where μ is the inertia factor, which balances the global wide-range exploration and local nearby exploitation abilities of the swarm. Clerc introduced a constriction factor a into equation (2) to constrain and control the velocity magnitude [11]:

xid(t+1) = xid(t) + a·vid(t+1) . (4)

Equations (3) and (4) are called the classical PSO, which is more efficient and precise than the original by adaptively adjusting these parameters.
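A minimal sketch of one classical-PSO step per equations (3) and (4); parameter defaults mirror the Section 4.3 configuration:

    import numpy as np

    def pso_step(x, v, p_best, g_best, mu, c1=2.0, c2=2.0, a=1.0, v_max=1.0):
        """One classical PSO update: inertia-weighted velocity (3), then
        the position moved by the constriction-scaled velocity (4)."""
        r1, r2 = np.random.rand(*x.shape), np.random.rand(*x.shape)
        v = mu * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        v = np.clip(v, -v_max, v_max)        # velocity limited to [-v_max, v_max]
        return x + a * v, v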
Hill-climbing is a typical local search algorithm used in many fields, partly because it is easy to implement and transforms the particles flexibly. To avoid some demerits of classical PSO, such as relapsing into local extrema and low convergence precision in the late evolutionary stage, we adopt a hybrid of particle swarm optimization and the hill-climbing algorithm, called PSO-LS in [12]. In PSO-LS, each particle has a chance of self-improvement by applying the hill-climbing algorithm before it exchanges information with the other particles in the swarm. The hill-climbing used as the local search algorithm in our work is executed as follows.
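The excerpt gives the hill-climbing procedure only in outline; the following is a minimal sketch of a radius-r climbing step consistent with the Section 4.3 configuration (perturb within radius r, keep improvements), not the paper's verbatim routine:

    import numpy as np

    def hill_climb(x, fitness, r, trials=5):
        """Local search: try random perturbations of radius r around x and
        move whenever the fitness improves (higher is better)."""
        best, best_fit = x.copy(), fitness(x)
        for _ in range(trials):
            cand = best + np.random.uniform(-r, r, size=best.shape)
            f = fitness(cand)
            if f > best_fit:
                best, best_fit = cand, f
        return best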
There are two types of constraint on the weights [13]. One is the Sum-W constraint: Σ_{i=1..K} wi = W, where K is the number of classifiers to be combined.
There are two methods for acquiring the weights in a WCM. One sets fixed weights for each classifier according to experience or other prior knowledge; the other obtains the weights by training. In previous work, we proposed a combination algorithm that determines the weights by PSO (PSO-WCM) [5].
Begin PSO-LS-WCM
  Step 1. Initialize the parameters: swarm size N, max loop times in local search T1, max iteration of PSO T2;
  Step 2. Randomly generate N particles;
  Step 3. Calculate the fitness of each of the N particles;
  Step 4. Calculate Pi (i = 1…N) and Pg; set t = 1;
  Step 5. While t <= T2
    5.1 For every particle i (i = 1…N), do local search as shown in Section 2.2;
    5.2 Update Pi and Pg;
    5.3 Update the velocity according to formula (3);
    5.4 Update the position according to formula (4);
    5.5 Evaluate the fitness of each particle in the current iteration;
    5.6 Update Pi and Pg;
    5.7 t = t + 1;
  Step 6. End while
End PSO-LS-WCM
Five classifiers used in this work are: (1) LDC, Linear Discriminant Classifier; (2)
QDC, Quadratic Discriminant Classifier; (3) KNNC, K-Nearest Neighbor Classifier
with K=3; (4) TREEC, a decision tree classifier; (5) BPXNC, a neural network classi-
fier based on MATHWORK's trainbpx with 1 hidden layer and 5 neurons in hidden
layer.
PSO-LS-WCM was applied to seven real-world problems from the UCI repository: Pima, Vehicle, Glass, Waveform, Satimage, Iris and Wine [14]. For each dataset, 2/3 of the examples were used as training data, 1/6 as validation data and 1/6 as test data. For the other combination rules and the individual classifiers, 2/3 of the examples were used as training data and 1/3 as test data. All experiments were repeated for 10 runs and the averages were taken as the final results. Note that all subsets kept the same class probability distribution as the original data sets. The characteristics of these data sets are shown in Table 1.
4.3 Configurations
Hill-climbing. Loop times in local search T1=5, r was initially set 0.1 times the search
space and linearly decreased to 0.005 times as iteration increased.
PSO. Since there are 5 classifiers, the number of weights is 5. A particle in PSO is coded as one 4-dimensional vector w = (w1, w2, w3, w4); the fifth weight w5 is computed according to Σ_{k=1..5} wk = 1. The classical PSO was adopted. Parameters were set as follows: size of the swarm N = 10; inertia factor μ linearly decreasing from 0.9 to 0.4; c1 = c2 = 2; constriction factor a = 1; for the ith particle, each dimension of the position vector xi and velocity vector vi was initialized as a random number in the range [0,1] and [-1,1], respectively; max iteration T2 = 500.
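Decoding a particle into the five weights can be written in one step; a minimal sketch of the encoding just described:

    def decode_weights(particle):
        """Map a 4-dimensional particle (w1..w4) to five classifier weights
        with w5 = 1 - (w1 + w2 + w3 + w4), per the Sum-W constraint (W = 1)."""
        w5 = 1.0 - sum(particle)
        return list(particle) + [w5]

    print(decode_weights([0.2, 0.25, 0.15, 0.1]))   # -> [0.2, 0.25, 0.15, 0.1, 0.3]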
The performance of the individual classifiers is listed in Table 2. It shows that different classifiers achieve different performance on the same task, and no classifier is superior on all problems. For comparison, the individual classifiers were combined by the majority vote rule, max rule, min rule, mean rule, median rule, product rule, PSO-WCM and PSO-LS-WCM [15]. The ensemble learning performance is given in Table 3.
Table 3 shows that PSO-LS-WCM outperforms all comparison combination rules and the best individual classifier on the data sets Pima, Satimage, Vehicle and Waveform. These data sets share a common characteristic: the sample size is large. Therefore, the optimal weights obtained on the validation set are also representative on the test set. The same is not true on the smaller data sets (Glass, Iris, and Wine), for the obvious reason that overfitting tends to occur: near-optimal weights might appear early in the process, so the succeeding optimization makes no sense. But on these small datasets, PSO-LS-WCM performs as well as the other methods or obtains a median result, which avoids selecting the worst classifier (the worst-case motivation for multiple classifier systems).
From Table 3, we can also see that PSO-LS-WCM is better than PSO-WCM. The error rates of the two combination methods are plotted in Fig. 1 to give an intuitive comparison.
Fig. 1. Error rates of PSO-WCM and PSO-LS-WCM on the seven data sets (Pima, Glass, Iris, Satimage, Vehicle, Waveform, Wine)
5 Conclusion
An evolutionary-computing-based weighted combination model is a very natural approach to linear combiners in ensemble learning: it trains base learners and combines them with specific weights rather than identical ones. We present a weighted combination method based on particle swarm optimization with local search, namely PSO-LS-WCM, which avoids the local optima of the PSO-WCM proposed in our previous work. Experiments were carried out on seven data sets from the UCI repository. They show that PSO-LS-WCM performs better than the individual classifiers, the majority voting rule, max rule, min rule, mean rule, median rule, product rule, and PSO-WCM.
References
1. Hansen, L., Salamon, P.: Neural Network Ensembles. IEEE Transactions on Pattern
Analysis and Machine Intelligence 12, 993–1001 (1990)
2. Brown, G.: Ensemble Learning. In: Encyclopedia of Machine Learning. Springer Press,
Heidelberg (2010)
3. Suen, C.Y., Lam, L.: Multiple classifier combination methodologies for different output
levels. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 52–66. Springer,
Heidelberg (2000)
4. Kuncheva, L.I.: Combining Pattern Classifiers: Methods and Algorithms. John Wiley and
Sons, Inc., Chichester (2004)
5. Yang, L.-y., Qin, Z.: Combining Classifiers with Particle Swarms. In: Wang, L., Chen, K., Ong, Y.S. (eds.) ICNC 2005. LNCS, vol. 3611, pp. 756–763. Springer, Heidelberg (2005)
6. Yang, L., Zhang, J., Wang, W.: Selecting and Combining Classifiers Simultaneously with
Particle Swarm Optimization. Information Technology Journal 8(2), 241–245 (2009)
7. Kennedy, J., Eberhart, R.: Particle Swarm Optimization. In: IEEE International Conference
on Neural Networks, Perth, Australia, vol. 4, pp. 1942–1948 (1995)
8. Eberhart, R., Kennedy, J.: A New Optimizer Using Particle Swarm Theory. In: Proceeding
of the Sixth International Symposium on Micro Machine and Human Science, Nagoya,
Japan, pp. 39–43 (1995)
9. Angeline, P.J.: Evolutionary optimization versus particle swarm optimization: philosophy and performance differences. In: Proceedings of the Seventh Annual Conference on Evolutionary Programming (1998)
10. Shi, Y., Eberhart, R.: A modified particle swarm optimizer. In: IEEE World Congress on
Computational Intelligence, pp. 69–73 (1998)
11. Clerc, M.: The Swarm and the Queen: Towards a Deterministic and Adaptive Particle
Swarm Optimization. In: Proceeding of the Congress of Evolutionary Computation, vol. 3,
pp. 1951–1957 (1999)
12. Chen, J., Qin, Z., Liu, Y., Lu, J.: Particle Swarm Optimization with Local Search. In:
Proceedings of 2005 International Conference on Neural Networks and Brain Proceedings,
ICNNB 2005, pp. 481–484 (2005)
13. Tomas, A.: Constraints in Weighted Averaging. In: Benediktsson, J.A., Kittler, J., Roli, F.
(eds.) MCS 2009. LNCS, vol. 5519, pp. 354–363. Springer, Heidelberg (2009)
14. Blake, C., Keogh, E., Merz, C.J.: UCI Repository of Machine Learning Databases (1998),
http://www.ics.uci.edu/~mlearn/MLRepository.html
15. Kittler, J., Hatef, M., Duin, R.P.W., Matas, J.: On combining classifiers. IEEE Transac-
tions On Pattern Analysis and Machine Intelligence 3, 226–239 (1998)
An Expert System Based on Analytical Hierarchy
Process for Diabetes Risk Assessment (DIABRA)
A major difficulty in reaching a correct assessment is the complexity of the risk factors, as well as the vast amount of information (including psychosocial issues, age, gender, information quality, and so on) that the expert must take into account; hence, the process becomes complicated and difficult to model.
The authors' objective is to develop an expert system for risk assessment in diabetes which personalizes its decision based on the incoming case. The proposed expert system utilizes the results of the Analytical Hierarchy Process (AHP) approach to improve the risk assessment for Type 2 diabetes.
The rest of this paper is organized as follows: Section 2 reviews the literature. In Section 3 the research methodology is explained, and Section 4 is devoted to the AHP results. Expert system development and validation are presented in Sections 5 and 6, respectively. In the last section (Section 7) we summarize our findings and present final remarks.
2 Review of Literature
3 Research Methodology
Given the aim of the study, identification of the risk factors for Type 2 diabetes plays a key role in developing the intended expert system. In order to provide a comprehensive set of risk factors, the related secondary sources were investigated and selected [17-22]. Some of the identified risk factors, grouped as Physical Risk Factors (PRF), were gathered from previous studies addressing the correlation of these factors with the risk value for diabetes [18-22]. Hence, the PRF score was calculated using the AHP approach (explained in Section 4) with regard to the findings of these studies. Next, several scenarios were developed in terms of different levels of all identified risk factors (Table 1), and finally the scenarios were evaluated by human experts (knowledge acquisition).
Table 1. Identified risk factors, their types, values and assigned levels

Risk Factor                 Type             Value / Assigned level           Comment
Fasting Blood Sugar (FBS)   Numerical index  0 → L-; 1 → L; >=2 → L+          Number of times that FBS is more than
                                                                              110 mg/dl in 5 sequential assessments.
PRF (Obesity/BMI, Diet,     Numerical index  PRF < 2 → L-; 2 <= PRF < … → …   Body Mass Index (BMI) is a measure of
Physical activity)                                                            body weight relative to height (kg/m2).
Gender                      Categorical      Female → L-; Male → L+
Family History              Categorical      No → L-; Yes → L+                Having a parent, brother, or sister
                                                                              with diabetes.
Smoking status              Categorical      No → L-; Yes → L+
Having Breakfast            Categorical      Yes → L-; No → L+
Alcohol Drinking            Categorical      No → L-; Yes → L+
Feeling of stress           Categorical      No → L-; Yes → L+
Furthermore, the methodology employed to derive the expert system can be summarized in four main phases:
Phase 1. Identifying the risk factors for Type 2 diabetes, including the Physical Risk Factors (PRF) and risk factors related to lifestyle, medical and family history and so on, all presented in Table 1.
Phase 2. Knowledge acquisition: with the help of several human experts and information acquired from the literature, several scenarios over an extended range are developed, and the chance of getting diabetes in each scenario is evaluated by conducting AHP.
Phase 3. Knowledge representation: utilizing the FOOPES shell [23] to compile the scenarios that capture the human experts' knowledge into a rule-based knowledge base.
Phase 4. Validation of the system, based on the comparative performance of the human expert and the expert system on some test samples.
In this work, the risk for Type 2 diabetes is categorized into "low", "medium" and "high" and is assessed for each scenario designed by the human experts.
4 AHP Results
In phase 1 of the implementation (explained in Section 3), the identified risk factors were categorized into two groups, physical risk factors (PRF) and risk factors related to medical and family history and the like, due to differences in their knowledge acquisition processes (Table 1).
Knowledge acquisition of the PRF and their importance was carried out by conducting the AHP framework on scenarios developed by the human experts of the "Yazd Research Center for Diabetes"; the rest of the information was acquired from the literature [17-22]. The steps of conducting the AHP framework for the PRF are as follows:
Step 1: List the PRF.
Step 2: Elicit pairwise comparisons between the risk factors as a preference/importance matrix and normalize it (Table 2).
Step 3: Develop the decision matrix.
Step 4: Multiply the preference/importance matrix by the normalized decision matrix to obtain the weighted score of each scenario (Table 3).
Step 5: The level of each scenario for PRF can then be determined using Table 1.
Step 6: Scenarios are developed at various levels of all determined risk factors, and the diabetes risk of each scenario is assessed by the experts (knowledge acquisition).
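Steps 2–4 can be reproduced with the Table 2 numbers. A minimal numpy sketch (note that Table 3 reports the weighted scores on the paper's own scale):

    import numpy as np

    # Normalized pairwise-comparison columns from Table 2 (Age, Diet, BMI, BP).
    norm = np.array([[0.12, 0.25,     0.128205, 0.090909],   # Age
                     [0.04, 0.083333, 0.102564, 0.090909],   # Diet
                     [0.48, 0.416667, 0.512821, 0.545455],   # BMI
                     [0.36, 0.25,     0.256410, 0.272727]])  # BP
    priority = norm.mean(axis=1)       # Step 2: overall priority vector
    print(priority)                    # ~ [0.147, 0.079, 0.489, 0.285], as in Table 2

    # Step 4: a scenario's weighted score combines its row with the priorities.
    s1 = np.array([0.87, 0.854167, 0.893162, 0.901515])   # scenario S1 row, Table 3
    print(s1 @ priority)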
Table 2. Priorities among the factors, obtained by a series of judgments based on pairwise comparisons of the factors

Factor   Age    Diet      BMI       BP        Overall priority
Age 0.12 0.25 0.128205 0.090909 0.147279
Diet 0.04 0.083333 0.102564 0.090909 0.079202
BMI 0.48 0.416667 0.512821 0.545455 0.488735
BP 0.36 0.25 0.25641 0.272727 0.284784
Sum 1 1 1 1 1
Table 3. Weighted scores of PRF for the scenarios, obtained by synthesizing the priorities and the normalized scores of the risk factors

Scenario   Age    Diet      BMI       BP        Weighted score (PRF)
S1 0.87 0.854167 0.893162 0.901515 3.5188442
S2 0.545 0.53125 0.561966 0.564394 2.2026098
S3 0.42 0.40625 0.384615 0.386364 1.597229
S4 0.94 0.947917 0.935897 0.931818 3.7556323
S5 0.67 0.65625 0.619658 0.621212 2.5671202
S6 0.44 0.447917 0.465812 0.462121 1.8158498
S7 0.525 0.489583 0.480769 0.488636 1.9839889
S8 0.73 0.708333 0.74359 0.75 2.9319231
S9 0.9 0.864583 0.833333 0.840909 3.4388258
S10 0.855 0.833333 0.861111 0.867424 3.4168687
S11 0.545 0.53125 0.561966 0.564394 2.2026098
S12 0.65 0.614583 0.598291 0.606061 2.4689345
S13 0.56 0.625 0.594017 0.575758 2.3547747
S14 0.69 0.697917 0.641026 0.636364 2.6653059
S15 0.565 0.5 0.523504 0.541667 2.1301709
S16 0.46 0.416667 0.42735 0.439394 1.743411
5 DIABRA Development
In order to represent the acquired knowledge, an expert system shell named FOOPES (Fuzzy Object Oriented Programming Expert System) was used. Input data are defined in the "Scenarios and input variables data window", and scenarios are represented using the "Ruled by Table Window", as shown in Figure 1.
A client requesting a consultation runs the DIABRA (DIABETES RISK ASSESSMENT) file (by pressing F5). Input data are entered into DIABRA by selecting the nearest intended item within the limited range. The results of the risk assessment for Type 2 diabetes by DIABRA are reported in the "Report window", as shown in Figure 2.
6 DIABRA Validation
The developed expert system (DIABRA) should be tested to ensure that a satisfactory performance is achieved. Hence, five unseen test samples were presented to DIABRA and its results were compared to those of the human expert. The results of the testing and evaluation demonstrate good performance of the system compared to the human experts: according to Table 4, all results coincide.
Table 4. Comparison of DIABRA and the human expert on five test samples

#    FBS  PRF  Gender  Family History  Smoking  Having Break.  Drinking  Stress  Human expert result  DIABRA result
S1   3    3    Male    No              Yes      No             No        -       High                 High
S2   2    2    Female  No              No       Yes            No        Yes     Medium               Medium
S3   1    2    Female  Yes             No       Yes            No        -       Low                  Low
S4   1    3    Male    No              Yes      No             No        No      Low                  Low
S5   3    3    Male    Yes             Yes      Yes            No        Yes     High                 High
References
1. Wild, S., Roglic, G., Green, A., Sicree, R., King, H.: Prevalence of Diabetes: Estimates for
2000 and Projections for 2030. Diabetes Care 27(5), 1047–1053 (2004)
2. Rother, K.I.: Diabetes Treatment-Bridging the Divide. The New England Journal of Medi-
cine 356(15), 1499–1501 (2007)
3. American Diabetes Association, Total prevalence of diabetes & pre-diabetes. Archived
from the original on February 08 (2006),
http://web.archive.org/web/20060208032127,
http://www.diabetes.org/diabetesstatistics/prevalence.jsp
(retrieved March 17, 2006)
4. Çinar, M., Engin, M.E., Engin, Z., Atesçi, Y.Z.: Early Prostate Cancer Diagnosis by Using
Artificial Neural Networks and Support Vector Machines. Expert Systems with Applica-
tions 36, 6357–6361 (2009)
5. Ezziane, Z.: Applications of Artificial Intelligence in Bioinformatics: A Review. Expert
System with Applications 30(1), 2–10 (2006)
6. Gaspari, M., Roveda, G., Scandellari, C., Stecchi, S.: An expert system for the evaluation
of EDS Sin multiple sclerosis. Artificial Intelligence in Medicine 25, 187–210 (2002)
An Expert System Based on Analytical Hierarchy Process for DIABRA 259
7. Pérez-Carretero, C., Laita, L.M., Roanes-Lozano, E., Lázaro, L., González-Cajal, J., Laita,
L.: A Logic and Computer Algebra-Based Expert System for Diagnosis of Anorexia.
Mathematics and Computer Sin Simulation 58, 183–202 (2002)
8. Lejbkowicz, I., Wiener, F., Nachtigal, A., Militiannu, D., Kleinhaus, U., Applbaum, Y.H.:
Bone Browser a Decision-Aid for the Radiological Diagnosis of Bone Tumors. Computer
Methods and Programs in Biomedicine 67, 137–154 (2002)
9. Lamma, E., Mello, P., Nanetti, A., Riguzzi, F., Storari, S., Valastro, G.: Artificial Intelli-
gence Techniques for Monitoring Dangerous Infections. IEEE Transactions on Information
Technology in Biomedicine 10(1), 143–155 (2006)
10. HyukIm, K., Sang Park, C.: Case-based Reasoning and Neural Network Based Expert Sys-
tem for Personalization. Expert Systems with Applications 32, 77–85 (2007)
11. Pandey, B., Mishra, R.B.: Knowledge and Intelligent Computing System in Medicine.
Computers in Biology and Medicine 39, 215–230 (2009)
12. Hernando, M.E., Gomez, E.J., Corcoy, R., del Pozo, F.: Evaluation of DIABNET, A Deci-
sion Support System for Therapy Planning in Gestational Diabetes. Computer Methods
and Programs in Biomedicine 62, 235–248 (2000)
13. Mark, A., Mateo, R., Gerardo, B.D., Lee, J.: Health Care Expert System Based on the
Group Cooperation Model. In: International Conference on Intelligent Pervasive Comput-
ing, Jeju Island, Korea, pp. 285–288 (October 2007)
14. Šušteršič, O., Rajkovič, U., Dinevski, D., Jereb, E., Rajkovič, V.: Evaluating Patients’
Health Using a Hierarchical Multi-Attribute Decision Model. Journal of International
Medical Research 37(5), 1646–1654 (2009)
15. Bohanec, M., Zupan, B., Rajkovic, V.: Applications of Qualitative Multi-Attribute Deci-
sion Models in Healthcare. International Journal of Medical Informatics 58-59, 191–205
(2000)
16. Luciano, C.N., Plácido, R.P., Tarcísio, C.P.: An Expert System Applied to the Diagnosis
of Psychological Disorders. In: International Conference on Intelligent Computing and In-
telligent Systems, ICIS, pp. 363–367. IEEE, Los Alamitos (2009)
17. Rimmi, E.B., Manson, J.E., Stampfer, M.J., Colditz, G.A., Willett, W.C., Rosner, B.: Oral
Contraceptive Use and the Risk of Type 2 diabetes Mellitus in a Large Prospective Study
of Women. Diabetologia 35, 967–972 (1992)
18. Sugimori, H., Miyakawa, M., Yoshida, K., Izuno, T., Takahashi, E., Tanaka, C.,
Nakamura, K., Hinohara, S.: Health Risk Assessment for Diabetes Mellitus Based on Lon-
gitudinal Analysis of MHTS Database. Journal of Medical Systems 22(1), 121–138 (1998)
19. Griffin, M.E., Coffey, M., Johnson, H., Scanlon, P., Foleyt, M., Stronget, N.M.: Universal
v.s. Risk Factor-Based Screening for Gestational Diabetes Mellitus: Detection Rates, Ges-
tation at Diagnosis and Outcome. British Diabetic Association Medicine 17, 26–32 (2000)
20. Park, J., Edington, D.W.: A Sequential Neural Network Model for Diabetes Prediction. Ar-
tificial Intelligence in Medicine 23, 277–293 (2001)
21. Anonymous: Am I at Risk for Diabetes? National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases, NIH Publication No. 04–4805 (2003)
22. Leontos, C., Gallivan, J.: Small Steps, Big Rewards: Your Game Plan for Preventing Type 2 Diabetes. Journal of the American Dietetic Association 12(1), 143–156 (2008)
23. FOOPES: A Fuzzy Objective Oriented Program Expert System, Developed by Roozbehani
and Amin Naseri, Tarbiat Modares University (2004)
Practice of Crowd Evacuating Process Model with
Cellular Automata Based on Safety Training
Abstract. To address the problem that cellular-automata models of the crowd evacuating process differ considerably from real evacuations, a crowd evacuating process model with cellular automata based on safety training is proposed. The crowd evacuating process based on safety training is simulated and predicted, and the result is very close to reality. Placing the shelves vertically achieves both a higher escape rate and a larger shelf area, with a total area of up to 216 m2, and with safety training the average death number is 4.2 when the fire level is 2.
1 Introduction
In recent years, many types of natural disasters and man-made events have occurred, from the United States 9/11 events to the China Wenchuan earthquake and the Japan Miyagi earthquake [1]; this has strengthened the study of the crowd emergency evacuating process. The Japanese researcher Kikuji Togawa proposed an evacuation time formula in 1955. J. Fruin derived relation curves between the average crowd speed and the crowd density using statistical methods. Henderson gave the probability distribution formula of crowd forward velocity by using Maxwell-Boltzmann thermodynamics. Researchers at Wuhan University and Hong Kong City University have established a network evacuation model, which divides the building into a network reflecting each person's specific location in geometric space and analyzes the person's moving speed within the building by using the Lagrangian method.
The key to emergency evacuation modeling is crowd evacuation modeling; its essence is the implementation of a pedestrian flow model in a specific environment. A model that can describe the evacuation accurately is needed to simulate hundreds of thousands of human activities on the computer. Cellular automata, as a mathematical model framework in which time, space and states are discrete, have a strong ability to simulate various physical systems and natural phenomena by constructing a dynamically evolving system through the interaction between elements. A series of results on pedestrian evacuation simulation based on cellular automata have been obtained, including the two-dimensional floor field model proposed by C. Burstedde and Kirchner [2-3], the discrete social force model proposed by L.Z. Yang [4], and the dynamic parameters model
proposed by Hao Yue [5]. These cellular automata models can simulate macroscopic evacuation characteristics such as jamming and clogging and faster-is-slower effects [1-7].
All these evacuation models share a common problem: we try to build complex real-life evacuation models using cellular automata whose crowds are assumed to move randomly, without any training, and with a variety of complex psychological states. These are among the most complex phenomena of life, which may only be achievable using the fourth class of cellular automata at the edge of chaos. Yet the various results are based on the elementary cellular automaton mode, which is the theoretical reason they cannot completely describe the objects' behavior. The crowd evacuating process model with cellular automata based on safety training (EPMCAST) is proposed in this paper to study the evacuating process by using elementary cellular automata, and its simulation and prediction results are more realistic than those of other models.
2 EPMCAST Model
A cellular automaton is a dynamical system discrete in time and space; each cell, distributed on a regular grid, takes one of finitely many discrete states and is updated synchronously according to the same local rules. A large number of cells constitute an evolving dynamic system through simple interactions. A cellular automaton consists of the basic cell, the cellular space, the neighborhood and the rules, and can be considered as a cellular space together with a transformation function on that space, expressed by a four-tuple [8]:

A = (d, S, N, f)

where d is the dimension of the cellular automaton, S is the finite, discrete set of cell states, N is the combination of cells in the neighborhood space, and f is the rule of change, i.e., the transition function. Unlike general dynamic models, cellular automata are not strictly defined by physical equations or functions, but by a series of rules constructed with the model; any model that satisfies these rules can be counted as a cellular automaton model. Therefore, the cellular automata model is a general term for this kind of model, or a methodological framework characterized by discrete time, space and states. Each variable takes only a finite number of states, and the rules changing the states are local in time and space. We need to construct the cellular automaton according to the actual research question, for there is no fixed mathematical formula.
The grid of a cellular automaton is specified by the grid number, grid size and boundary conditions; the number and size of the grid depend on the needs of the simulation. The two-dimensional grid structure is usually triangular, square or hexagonal [9]. The space adjacent to a cell is named its neighborhood; two-dimensional neighborhood types are the Moore type, the extended Moore type and the Margolus type. EPMCAST considers not only the spread of fire but also the evacuation. The impact of safety training includes the commanding role of the training staff in assuring a reasonable direction, ensuring an orderly evacuation, and ensuring protection from fire and smoke during the evacuating process, so as to minimize the number of victims.
The fire spreading state of a cell takes one of the values 0 (not burning), 1 (not yet burning), 2 (just ignited), 3 (burning) and 4 (burnt out). The main factors affecting the fire spread include thermal radiation, the characteristics of the building, large fire-retardant elements, and the environmental impact. The probability of fire occurrence in cell (i, j) is

Qij = Wij · Aij · Lij · Hij

where Wij is the wind load effect, Aij is the impact indicator of the building structure, Lij is the impact indicator of the fire load, and Hij combines the collapse factor coefficient determined by the fire performance requirements with the risk factor coefficient of the building; for their values see [10,11]. The fire spreading update is carried out by taking the larger Qij among adjacent cells.
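A minimal sketch of the fire-spread update, assuming the per-cell coefficients have already been multiplied into Q and that a cell can ignite only next to a burning cell; this illustrates the rule rather than reproducing the paper's exact code:

    import random

    def spread_fire(state, Q):
        """One synchronous fire-spread step. state[i][j] takes values 0..4 as
        in the text; Q[i][j] = W*A*L*H is the per-cell ignition probability."""
        rows, cols = len(state), len(state[0])
        new = [row[:] for row in state]
        for i in range(rows):
            for j in range(cols):
                if state[i][j] != 1:      # assumption: only state-1 cells ignite
                    continue
                burning_near = any(
                    0 <= i + di < rows and 0 <= j + dj < cols
                    and state[i + di][j + dj] in (2, 3)
                    for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0))
                if burning_near and random.random() < Q[i][j]:
                    new[i][j] = 2         # just ignited
        return new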
The probability of crowd movement in different directions in the cellular space (i.e., the building) is given by the preference matrix M, as shown in Figure 2. The preference matrix element values are determined by the velocity v and the direction standard deviation. When more than one person competes for the same grid cell at the same time, only one is left in the cell; the others continue to look for the best location according to their currently obtained moving directions. The probability of the surrounding cell (i, j) taking position (0, 0) is

p(i, j) = (ki,j · Mi,j) / (M−1,−1 + M−1,0 + M−1,1 + M0,−1 + M0,1 + M1,−1 + M1,0 + M1,1)
where kij is the training factor of the moving person in cell (i, j): thanks to safety training, the evacuating crowd does not panic and does not crush, knowing the locations of the fire exits and the expanding trend of fire and smoke. Its function uses the following quantities: (ip, jp) is the current cell of the person, (il, jl) is a possible exit position, (im, jm) is the position of the person currently last in line for the potential exit, and dsmoke is the degree of fire and smoke at position (im, jm). The position of each person in the grid is re-determined after each crowd time step is updated. This procedure continues until all people are evacuated out of the building or burned up.
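The competition rule can be sketched directly: a contender's chance of taking the contested cell is its training-scaled preference over the sum of the eight neighbour preferences (the matrix values below are illustrative):

    def move_probability(k, M, i, j):
        """p(i, j) for a person at neighbour offset (i, j) competing for the
        centre cell; M is the 3x3 preference matrix indexed by offsets -1..1,
        and k is the person's training factor."""
        neighbours = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
                      if (di, dj) != (0, 0)]
        denom = sum(M[di + 1][dj + 1] for di, dj in neighbours)
        return k * M[i + 1][j + 1] / denom

    M = [[0.05, 0.10, 0.05],
         [0.10, 0.00, 0.10],
         [0.15, 0.30, 0.15]]            # illustrative preference matrix
    print(move_probability(1.2, M, 1, 0))   # trained person moving from below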
3 Implementation of EPMCAST
We take the fire evacuation of the Times Mall of Yancheng City, Jiangsu Province as the actual background, and derive the best way to place the shelves so as to obtain the maximum shelf space with minimum fire casualties. Intermediate-level fires often take place in Yancheng, and the power supply room is the place most vulnerable to fire in the Times Mall. The number of safety exits must be no less than two according to state regulations. There are two safety exits in the Times Mall, one being the elevator and the other the stairs; the former cannot be used when fire breaks out, so only the latter is considered. We suppose the shelves and the cashier desk are made of metal and are non-combustible. The fire-fighting equipment fails due to the power failure when fire occurs in the power supply room, and the fire is put out by the fire brigade after all people have escaped. We design six different shelf placing styles by considering the various situations that may occur during a fire escape, the shopping mall's purpose of profit, architectural principles, and fire escape practice in public places, as shown in Table 1.
Table 1. The six shelf placing styles

Shelf No.   Placing style   Total area (m2)
1           horizontal      14 * 16 = 224
2           horizontal      14 * 14 = 196
3           vertical        24 * 8 = 192
4           vertical        24 * 9 = 216
5           vertical        26 * 9 = 234
6           dispersed       18 * 7 = 126
We divide the evacuation plane into a uniform grid; in the model space, each grid cell is occupied by an obstacle or a person, or is empty. Each person occupies one cell of 0.5 m × 0.5 m and can move in any of the directions up, down, left, right, upper left, lower left, upper right and lower right within a unit time, corresponding to k = 1, 2, ..., 8; k = 0 indicates that the person stays put. Walking speed is 1 m/s under normal circumstances, so each time step is 0.5 m / (1 m/s) = 0.5 s.
Status 0 means empty, status 1 means crowd, status 2 means fire, and status 3 means barrier. The bigger the fire-level number, the lighter the fire; a serious fire occurs when we select 1. Likewise, the bigger the crowd-level number, the smaller the crowd; the maximum crowd number is reached when we select 1. The shelf placing style is input too. A red * marks the fire source in the supermarket's power room, a blue ⊙ marks an escaping person, ∩ marks an emergency exit, further symbols mark the cashier desk and the fence at the recounting place, and the remaining white cells are shelves. The crowd evacuating process with cellular automata based on safety training is shown in Figure 3.
Fig. 3. The crowd evacuating process with cellular automata based on safety training
As the fire continues to spread, each escaping person chooses the best escape path according to the actual situation, based on the knowledge gained from escape training. Crowding appears when too many escaping people gather in one region and the overall escape velocity slows down, so the escaping person selects a path by judging the situation at all exits. The evacuation is completed in 86 seconds: the total crowd before the fire is 422 people, 421 escape, and 1 is killed in the fire.
…in the first quarter of 2010 [14], so the average number of deaths is 4, which is very close to the average of 4.2 obtained in the simulation with fire A. For the Yancheng Times Mall, where fires do not start easily, we recommend the fourth shelf placing style, which yields a higher escape rate and a larger shelf area when the number of escaping people reaches the upper limit. For dry climate areas we recommend the sixth shelf placing style, which yields the highest escape rate and achieves zero casualties for fires below level C.
Table 2. The evacuation simulation result of Yancheng Times Mall with cellular automata
based on safety training for fire A
Table 3. The evacuation simulation result of Yancheng Times Mall with cellular automata
based on safety training for fire B
Table 4. The evacuation simulation result of Yancheng Times Mall with cellular automata
based on safety training for fire C
5 Conclusion
We design a crowd evacuation process model with cellular automata based on safety training, using evacuation strategies based on safety training. The simulation results show that the model can simulate the emergency evacuation behavior of a supermarket shopping center, and that the simulation results are very close to reality. The simulation method is intuitive, flexible and scalable, and provides good ideas for emergency management research. We will extend this method to more complex evacuation situations in future work.
References
1. Real-time information on the earthquake situation in Japan (March 27, 2011), http://news.sina.com.cn/z/japanearthquake0311/index.shtml (March 11, 2011)
2. Burstedde, C., Klauck, K., Schadschneider, A., Zittartz, J.: Simulation of pedestrian dynamics using a two-dimensional cellular automaton. Physica A 295(3), 507–525 (2001)
3. Kirchner, A., Schadschneider, A.: Simulation of evacuation processes using a bionics-inspired cellular automaton model for pedestrian dynamics. Physica A 312(1), 260–276 (2002)
4. Zhao, D.L., Yang, L.Z., Li, J.: Occupants’ behavior of going with the crowd based on
cellular automata occupant evacuation model. Physica A 387, 3708–3718 (2008)
5. Hao, Y., Fu, S., Zhisheng, Y.: Simulation of pedestrian evacuation flow based on cellular automata. Physics 7, 4523–4530 (2009)
6. Wang, D., Kwok, N.M., Jia, X., Li, F.: A cellular automata based crowd behavior model.
In: Proceedings of the 2010 International Conference on Artificial Intelligence and
Computational Intelligence: Part II, Sanya, China, October 23-24 (2010)
7. Peng, Y.-C., Chou, C.: Simulation of pedestrian flow through a 'T' intersection: A multi-floor field cellular automata approach. Computer Physics Communications, 205–208 (January 2011)
8. Wolf, D.E.: Cellular automata for traffic simulations. Physica A 263(1-4), 438–445 (1999)
9. Talia, D.: Parallel Cellular Environments to Enable Scientists to Solve Complex Problems
(1999), http://www.cscfac.uk/euresco99/presentations/Talia.ppt
10. Ohgai, A., Gohnai, Y., Watanabe, K.: Cellular automata modeling of fire spread in built-up areas - A tool to aid community-based planning for disaster mitigation. Computers, Environment and Urban Systems 31(4), 441–460 (2007)
11. Xiaojing, M., Yang, L.-z., Jian, L.: A cellular automata based model for the probability of fire spread in urban areas. China Safety Science Journal 18(2), 28–33 (2008)
12. Zhao, S.-Y., Su, G.-J., He, Y., Xu, X.-H.: Research of Emergency Evacuation System
Simulation Based on Cellular Automata. Journal of Chinese Computer Systems 28(12),
2220–2224 (2007)
13. The new standards of fire level (March 27, 2011),
http://www.dys.gov.cn/Public/2007/0707/10019152.html (July 07,
2007)
14. Fire statistics for March 2010 (March 27, 2011), http://www.js119.com/news/folder15/2010/0415/2010-04-1575374.html (April 15, 2010)
Feature Selection for Unlabeled Data
Chien-Hsing Chen
Abstract. Feature selection has been explored extensively for several real-world
applications. In this paper, we address a new solution for selecting a subset of original features for unlabeled data. The concept of our feature selection method is based on a basic characteristic of clustering: a data instance usually belongs to the same cluster as its geometrically nearest neighbors and to different clusters than its geometrically farthest neighbors. In particular, our
method uses instance-based learning for quantifying features in the context of the
nearest and the farthest neighbors of every instance, such that using salient fea-
tures can raise this characteristic. Experiments on several datasets demonstrated
the effectiveness of our presented feature selection method.
1 Introduction
Feature selection has been explored extensively for several real-world applications
such as text processing [1], image representation [2] and time-series prediction [3].
Recently, many extensive studies have been proposed for feature selection in unsupervised learning, and the selected salient feature subset was found to be helpful for
cluster analysis.
In this paper, we present a new feature selection method, feature selection from the nearest and the farthest neighbors (NF), which uses instance-based learning for quantifying features in the context of the nearest and the farthest neighbors of every instance. Unlike previous studies on selecting salient features (e.g., for clustering), our method does not need a prespecified clustering algorithm for training features, and it potentially ignores noisy features, which strengthens the process of delivering salient features. This quantification is motivated by one of the most well-known characteristics of clustering: an instance usually belongs to the same cluster as its geometrically nearest neighbors and to different clusters than its geometrically farthest ones. The purpose of our feature selection method is to quantify features such that using the salient features (i.e., those with higher quantities) reinforces this well-known characteristic.
Let $\mathcal{N}_{i,K}$ and $\mathcal{F}_{i,L}$ denote the sets of the $K$ geometrically nearest and the $L$ geometrically farthest neighbors of an instance $x_i$, respectively (3). We then follow the idea of approximate clustering [4, 5] and treat the fractional instances, which include $x_i$ and the instances in $\mathcal{N}_{i,K}$ and $\mathcal{F}_{i,L}$, as at least two approximate clusters: the first cluster contains $x_i$ and the instances in $\mathcal{N}_{i,K}$, and another cluster does not simultaneously contain $x_i$ and any instance in $\mathcal{F}_{i,L}$. The reason for forming these approximate clusters is that an instance usually belongs to the same cluster as its geometrically nearest neighbors and to different clusters than its geometrically farthest ones.
We focus on these approximate clusters and further use an evaluation function to evaluate the goodness of a feature, i.e., whether the feature is informative for the approximate clusters or not. The evaluation function we used in this paper is the Silhouette Width Criterion (SWC) [6], a well-known relative clustering validation index that has been widely used to evaluate how well a dataset can be partitioned into subsets (clusters), capturing the cluster compactness for the instances within a cluster and the cluster separability for the instances between clusters. The functions that measure feature compactness (feature width from $x_i$ to its nearest neighbors) and feature separability (feature width from $x_i$ to its farthest neighbors) are defined as follows (4 and 5).
$$a_i = f(\mathcal{N}_{i,K}, x_i) \qquad (4)$$

$$b_i = f(\mathcal{F}_{i,L}, x_i) \qquad (5)$$

We evaluate the intrinsic characteristics of every individual feature. The function $f(a, b)$ represents the average distance of $b$ to all instances in $a$ for every feature, and is applied to return the per-feature compactness vector $a_i = (a_{i1}, \ldots, a_{ik}, \ldots, a_{id})$ and the per-feature separability vector $b_i = (b_{i1}, \ldots, b_{ik}, \ldots, b_{id})$. The aggregate function for $a_i$ and $b_i$, in the style of the silhouette width, is shown as follows (6).

$$s_i = \frac{b_i - a_i}{\max(a_i, b_i)} \qquad (6)$$

The feature salience vector $w$ is then updated with the learning rate $\alpha(t)$:

$$w(t+1) = (1 - \alpha(t))\, w(t) + \alpha(t)\, s_i \qquad (7)$$
3 Experiment
We denote the basic method as NF. Assume that we have a dataset with a total of N instances. We first set the initial learning rate a(1) to 0.8 for our algorithm and decrease the learning rate as a(t) = a(1)·[(T−t)/T] at the t-th iteration. The weights of all elements of the initial feature salience vector w(1) are set to the same value. The number of iterations T should be large, such that most instances can be randomly selected for training the algorithm; we thus set T to 10N, where N is the size of the training dataset. The two adaptive parameters are empirically set to 0.01 and 1, respectively. The parameters K and L depend on the training instances and are empirically set to $\sqrt{N}$ and 3, respectively.
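The following sketch illustrates the training schedule described above; the per-instance score s and the convex-combination update of the salience vector are our assumptions, since Eqs. (4)-(7) are only partially reproduced here:

```python
import numpy as np

N, d = 100, 40                      # instances, features (hypothetical sizes)
T = 10 * N                          # number of iterations, as in the text
a1 = 0.8                            # initial learning rate a(1)
w = np.full(d, 1.0 / d)             # uniform initial feature salience w(1)

rng = np.random.default_rng(0)
for t in range(1, T + 1):
    a_t = a1 * (T - t) / T          # a(t) = a(1) * [(T - t) / T]
    s = rng.random(d)               # placeholder per-feature score of one
                                    # randomly drawn instance (Eqs. (4)-(6))
    w = (1 - a_t) * w + a_t * s / s.sum()   # assumed convex-combination update
w /= w.sum()
print(w.argsort()[::-1][:5])        # indices of the five most salient features
```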
and the pixels of the salient gray-level were retained as the original gray-level pixels. The processed gray-scale images using the features selected by the NF method are shown in Figure 2. Moreover, we also used a variance metric to select features for displaying the gray-scale images. With a variance metric, a feature (or random variable) with a higher variance is treated as more salient. The resulting gray-scale images using the features selected by the variance metric are shown in Figure 3.
Comparing the results in Figures 2 and 3, we see that the NF method achieved better feature selection for discriminating these images. Using the NF method can depict a summary of the original images, which implies that the extracted salient features can discriminate the differences among images. The method using a variance metric, however, often failed to achieve this goal. For example, for the third image, "opencountry", the image remained clear with the NF method (see the third image in Figure 2), whereas the variance metric produced a very unclear image (see the third image in Figure 3). Taking another image into consideration affords the same conclusion (compare the last images in Figures 2 and 3).
Fig. 2. The gray-scale images with 128 dimensions using the NF method
Fig. 3. The gray-scale images with 128 dimensions using a variance metric
In addition, when the number of instances in a dataset is small, there is less shared context for the distance metric. From this example, it was interesting to see that our NF method could still handle the task of feature selection for distinguishing images.
4 Conclusions
Our paper provides an interesting approach to using only a subset of features for unlabeled data. Our presented method differs from filter-based methods, which usually fail to select feature subsets for clustering because the number of clusters or the clustered structure cannot be effectively predicted in advance. Instead, our method uses instance-based learning to quantify features in the context of the nearest and the farthest neighbors of every instance, such that clustering all instances with the salient features increases the cluster compactness for the instances within a cluster and the cluster separability for the instances between clusters. The experiments on several datasets demonstrated that our presented method can effectively select a feature subset that is adaptable in cluster analysis and achieves better performance than the variance metric in feature selection.
References
1. Lee, C., Lee, G.G.: Information gain and divergence-based feature selection for machine learning-based text categorization. Information Processing and Management 42, 155–165 (2006)
2. Wang, H., Li, P., Zhang, T.: Histogram features-based Fisher linear discriminant for face detection. In: Asian Conference on Computer Vision, pp. 521–530 (2006)
3. Crone, S.F., Kourentzes, N.: Feature selection for time series prediction - A combined filter
and wrapper approach for neural networks. Neurocomputing 73, 1923–1936 (2010)
4. Hathaway, R.J., Bezdek, J.C., Huband, J.M., Leckie, C., Kotagiri, R.: Approximate clustering in very large relational data, in review. Journal of Intelligent Systems (2005)
5. Feder, T., Greene, D.: Optimal algorithms for approximate clustering. In: Proceedings of the
20th Annual ACM Symposium on the Theory of Computing, pp. 434–444 (1988)
6. Kaufman, L., Rousseeuw, P.: Finding groups in data. Wiley, Chichester (1990)
7. Haykin, S.S., Widrow, B.: Least-mean-square adaptive filters. Wiley, Chichester (2003)
8. Boutemedjet, S., Bouguila, N., Ziou, D.: A hybrid feature extraction selection approach for high-dimensional non-Gaussian data clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 31(8), 1429–1443 (2009)
9. Oliva, A., Torralba, A.: Modeling the shape of the scene: A holistic representation of the
spatial envelope. International Journal of Computer Vision 42, 145–175 (2001)
10. Otsu, N.: A threshold selection method from gray-level histogram. IEEE Transactions on
Systems, Man, and Cybernetics 9(1), 62–66 (1979)
Feature Selection Algorithm Based on Least Squares
Support Vector Machine and Particle Swarm
Optimization
1 Introduction
Classification is one of the major problems in the field of pattern recognition. The accuracy of a classifier is related to the choice of classifier, the number of samples and the dimensionality of the samples. Feature selection is a key problem for classification; selecting a handful of the most informative genes is a necessary step for classification. With the development of science and technology, the dimensionality of the samples obtained in some fields is becoming larger and larger, so more and more researchers pay attention to and focus on the study of feature selection.
Recently, evolutionary algorithms based on biological intelligence have developed rapidly, and feature selection algorithms based on intelligent algorithms and their hybrids have appeared. Shoorehdeli [1] proposed a feature subset
selection algorithm based on genetic algorithm and particle swarm optimization for
face detection. Huang [2] gave a feature selection algorithm based on double parallel
feed-forward neural networks and particle swarm optimization. Yu [3] gave a feature
gene selection algorithm based on discrete particle swarm optimization and support
vector machines. Qiao [4] presented a feature subset selection algorithm based on
particle swarm optimization and support vector machines. Dai [5] gave a fast feature
selection algorithm based on support vector machines.
Because support vector machines have good classification and generalization ability, and because least squares support vector machines convert the inequality constraints into equality constraints to decrease the difficulty of training, a hybrid feature selection algorithm based on particle swarm optimization and least squares support vector machines is proposed in this paper.
2 Basic Algorithm
Particle Swarm Optimization (PSO) was introduced by Kennedy and Eberhart [6], who were inspired by research on artificial life. It is an evolutionary computational model based on swarm intelligence.
In PSO, a possible solution of the optimization problem is imagined as a point in the D-dimensional search space, called a particle. The particles fly through the search space at certain velocities, which are adjusted dynamically according to the flying experience of the particle itself and of its partners. Every particle is evaluated by a fitness calculated from the objective function. Every particle records the best position it has experienced, denoted pbest. The best position the colony has experienced is called the global best and denoted gbest.
Suppose that the search space is D-dimensional and m particles form the colony [7]. The ith particle is represented by a D-dimensional vector Xi (i = 1, 2, …, m), meaning that the ith particle is located at Xi = (xi1, xi2, …, xiD) in the search space. The position of each particle is a potential solution; we calculate a particle's fitness by substituting its position into a designated objective function, and when the fitness is higher, the corresponding Xi is "better". The ith particle's "flying" velocity is also a D-dimensional vector, denoted as Vi = (vi1, vi2, …, viD) (i = 1, 2, …, m). Denote the best
position of the ith particle as $P_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$, and the best position of the colony as $P_g = (p_{g1}, p_{g2}, \ldots, p_{gD})$, respectively. The PSO algorithm is performed using the following equations:

$$V_i(k+1) = w V_i(k) + c_1 r_1 (P_i - X_i(k)) + c_2 r_2 (P_g - X_i(k)) \qquad (1)$$

$$X_i(k+1) = X_i(k) + V_i(k+1)\, \Delta t \qquad (2)$$
where i = 1, 2, …, m; k represents the iteration number; w is the inertia weight; c1 and c2 are learning rates; r1 and r2 are random numbers between 0 and 1; Δt is the time step value; and $V_i \in [V_{min}, V_{max}]$, where Vmin and Vmax are designated vectors. The termination criterion of the iterations is whether the maximum generation or a designated value of the fitness of Pg is reached.
PSO was initially used for the optimization of continuous functions. In 1997, Kennedy and Eberhart [8] proposed the binary particle swarm optimization (BPSO).
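As an illustration of Eqs. (1)-(2), the following self-contained sketch (our code, with illustrative parameter values and the sphere function as a stand-in objective) runs a standard continuous PSO:

```python
import numpy as np

rng = np.random.default_rng(0)
D, m, iters = 10, 30, 200
w, c1, c2, dt = 0.7, 1.5, 1.5, 1.0
vmax = 0.5

f = lambda x: np.sum(x**2, axis=1)          # objective (lower is better)
X = rng.uniform(-5, 5, (m, D))              # positions
V = rng.uniform(-vmax, vmax, (m, D))        # velocities
P, pf = X.copy(), f(X)                      # personal bests and their fitness
g = P[pf.argmin()].copy()                   # global best

for _ in range(iters):
    r1, r2 = rng.random((m, D)), rng.random((m, D))
    V = w * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)   # Eq. (1)
    V = np.clip(V, -vmax, vmax)             # keep Vi in [Vmin, Vmax]
    X = X + V * dt                          # Eq. (2)
    fx = f(X)
    better = fx < pf
    P[better], pf[better] = X[better], fx[better]
    g = P[pf.argmin()].copy()

print(f(np.array([g]))[0])                  # best fitness found
```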
$$f(x, w) = \mathrm{sign}[\, w^T \varphi(x) + b \,] \qquad (3)$$

where the nonlinear mapping $\varphi(\cdot)$ maps the input data into a higher-dimensional feature space. In least squares support vector machines the following optimization problem is formulated:

$$\min_{w,e}\; J(w, e) = \frac{1}{2} w^T w + \gamma \sum_{i=1}^{N} e_i^2 \qquad (4)$$

subject to the equality constraints

$$y_i [\, w^T \varphi(x_i) + b \,] = 1 - e_i, \qquad i = 1, \ldots, N. \qquad (5)$$

The corresponding Lagrangian is

$$L(w, b, e; \alpha) = J(w, e) - \sum_{i=1}^{N} \alpha_i \big\{ y_i [\, w^T \varphi(x_i) + b \,] - 1 + e_i \big\}, \qquad (6)$$

and the conditions for optimality are
$$\begin{cases}
\dfrac{\partial L}{\partial w} = 0 \;\rightarrow\; w = \displaystyle\sum_{i=1}^{N} \alpha_i y_i \varphi(x_i) \\[2mm]
\dfrac{\partial L}{\partial b} = 0 \;\rightarrow\; \displaystyle\sum_{i=1}^{N} \alpha_i y_i = 0 \\[2mm]
\dfrac{\partial L}{\partial e_i} = 0 \;\rightarrow\; \alpha_i = \gamma e_i \\[2mm]
\dfrac{\partial L}{\partial \alpha_i} = 0 \;\rightarrow\; y_i [\, w^T \varphi(x_i) + b \,] - 1 + e_i = 0
\end{cases} \qquad (7)$$
for $i = 1, \ldots, N$. After elimination of $e$ and $w$, the solution is given by the following set of linear equations:

$$\begin{bmatrix} 0 & y^T \\ y & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \vec{1} \end{bmatrix} \qquad (8)$$
where $y = [y_1, \ldots, y_N]^T$, $\vec{1} = [1, \ldots, 1]^T$, $\alpha = [\alpha_1, \ldots, \alpha_N]^T$, and the Mercer condition

$$\Omega_{ij} = y_i y_j \, \varphi(x_i)^T \varphi(x_j) = y_i y_j \, \psi(x_i, x_j) \qquad (9)$$

is applied. Substituting $w$ in Eq. (3) with the first equation of Eqs. (7) and using Eq. (9), we have

$$f(x) = \mathrm{sign}\Big[ \sum_{i=1}^{N} \alpha_i y_i \, \psi(x, x_i) + b \Big] \qquad (11)$$

where $\alpha_i$ and $b$ are the solution of Eqs. (8). The kernel function $\psi(\cdot)$ can be chosen as a linear function $\psi(x, x_i) = x_i^T x$, a polynomial function $\psi(x, x_i) = (x_i^T x + 1)^d$, or a radial basis function $\psi(x, x_i) = \exp\{ -\|x - x_i\|^2 / \sigma^2 \}$. In the proposed algorithm the linear kernel is used (see Section 4).
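For illustration, the following sketch (our code; toy data and an assumed value of γ) trains a linear-kernel LS-SVM by solving the linear system (8) directly and then classifies with Eq. (11):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.5, (20, 2)), rng.normal(1, 0.5, (20, 2))])
y = np.hstack([-np.ones(20), np.ones(20)])
gamma, N = 10.0, len(y)

K = X @ X.T                                   # psi(x_i, x_j) = x_i^T x_j
Omega = (y[:, None] * y[None, :]) * K         # Mercer condition, Eq. (9)
A = np.zeros((N + 1, N + 1))
A[0, 1:], A[1:, 0] = y, y                     # border of the system (8)
A[1:, 1:] = Omega + np.eye(N) / gamma
rhs = np.hstack([0.0, np.ones(N)])
sol = np.linalg.solve(A, rhs)
b, alpha = sol[0], sol[1:]

predict = lambda x: np.sign((alpha * y) @ (X @ x) + b)   # Eq. (11)
print(predict(np.array([1.2, 0.8])))                     # expected: 1.0
```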
The colony is initialized by generating a set of particles randomly. However, the numbers of 1s and 0s in each particle would then be almost equal, meaning that the number of features selected by every particle is almost the same. To obtain different numbers of features, the algorithm [4] used in this paper first generates the number of 1s of a particle randomly and then distributes the 1s over the particle randomly. This scheme reflects the variety of the features.
The initial velocity is generated by the formula [11] V = Vmin + rand()·(Vmax − Vmin), where rand() is a random number between 0 and 1, and Vmax and Vmin are the maximum and minimum velocities, respectively.
Updating a position follows Eq. (2), which adds the position vector and the velocity vector; the result is then rounded and finally taken modulo 2, mapping it to 0 or 1.
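A minimal sketch of this binary encoding (our illustration) is given below: the number of ones is drawn first and placed at random positions, and positions are updated by adding the velocity, rounding, and taking the result modulo 2:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, vmin, vmax = 20, 0.1, 10.0

def init_particle():
    h = np.zeros(n_features, dtype=int)
    k = rng.integers(1, n_features + 1)     # random number of selected features
    h[rng.choice(n_features, k, replace=False)] = 1
    return h

def init_velocity():
    return vmin + rng.random(n_features) * (vmax - vmin)

h, v = init_particle(), init_velocity()
h_new = np.mod(np.round(h + v).astype(int), 2)  # update, round, map to {0,1}
print(h.sum(), h_new.sum())
```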
The purpose of feature selection is to find the feature subset with the strongest classification ability. Fitness is the measure for evaluating the feature subset denoted by a particle, and it is composed of two parts: (a) the testing accuracy and (b) the number of selected features. For each particle h, the fitness (Eq. (13)) combines these two parts, where acc(h) is the classification accuracy of the classifier constructed from the features selected according to h, and Ones(h) is the number of 1s in h. PSO searches for the global minimum, so higher accuracy means lower fitness. The parameter k balances the accuracy and the feature number; a larger k means that the feature number is more important.
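Since Eq. (13) is not reproduced here, the following sketch shows one fitness of the described shape, with lower values for higher accuracy and a k-weighted penalty on the number of selected features; the paper's exact formula may differ:

```python
# Assumed fitness shape: lower is better; acc(h) is classification accuracy,
# Ones(h) the number of selected features, k the trade-off parameter.
def fitness(acc, ones, n_features, k=0.45):
    return (1.0 - acc) + k * ones / n_features

print(fitness(acc=0.95, ones=150, n_features=7129))
```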
For better classification and generalization ability, we select the support vector machine as the classifier. The least squares support vector machine converts the inequality constraints into equality constraints, so it is easy to solve. LS-SVM is therefore selected as the classifier in this paper.
4 Numerical Experiments
We use two datasets to demonstrate the performance of the proposed algorithm. The
datasets are obtained from http://sdmc.lit.org.sg/GEDatasets/Datasets.html. Table 1
shows the details of the two datasets. The experiments are run on a Lenovo personal
computer, which utilizes a 3.0GHz Pentium IV processor with 1GB memory. This
computer runs Microsoft Windows XP operating system. All the programs are written
in C++, using Microsoft's Visual C++ 6.0 compiler. We use the original datasets
without normalization.
The parameters that should be predetermined are as follows. The kernel $\psi(\cdot)$ for LS-SVM is chosen as the linear function $\psi(x, x_i) = \varphi(x_i)^T \varphi(x) = x_i^T x$; k = 0.45 in Eq. (13); the size m of the colony is 100; the value of Vmax in each dimension is 10 and the value of Vmin in each dimension is 0.1. The performance of the proposed algorithm is summarized in Table 2.
It can be seen from Table 2 that 2.20% of the genes are selected from the ALLAML Leukemia dataset (157 of 7129 genes) and 1.07% of the genes are selected from the Lung Cancer dataset (135 of 12533 genes).
5 Conclusion
A feature selection algorithm based on PSO and LS-SVM is proposed in this paper.
LS-SVM performs well for classification problem. PSO is easy to solve and robust.
The proposed algorithm combines the advantages of LS-SVM and PSO. PSO is used
to select feature and LS-SVM is used to construct classifier. The accuracy for
classification is the main part in fitness function. Numerical experiences show that
this algorithm decreases the dimension of samples and improve the efficiency for
classification.
There is room for further improvement of the proposed algorithm, for example in the distance between two positions, the method of adding the position vector and the velocity vector, and the initialization of the colony. Such improvements should yield even better performance.
Acknowledgment
This work was supported by the funds from National Natural Science Foundation of
China (NSFC) (61073075, 60803052 and 10872077), the National High-Tech R&D
Program of China (2009AA02Z307), Jilin University ("985" and "211" project,
Scientific Frontier and interdisciplinary subject project (200903173)), Inner Mongolia
Autonomous Region Research Project of Higher Education (NJ10118 and NJ10112).
References
1. Shoorehdeli, M.A., Teshnehlab, M., Moghaddam, H.A.: Feature Subset Selection for Face
Detection Using Genetic Algorithms and Particle Swarm Optimization. In: Proceedings of
the 2006 IEEE International Conference on Networking, Sensing and Control, pp. 686–690
(2006)
2. Huang, R., He, M.Y.: Feature Selection Using Double Parallel Feedforward Neural
Networks and Particle Swarm Optimization. In: IEEE Congress on Evolutionary
Computation, pp. 692–696 (2007)
3. Yu, H.L., Gu, G.C., Zhu, C.M.: Feature Gene Selection by Combining an Improved Discrete
PSO and SVM. Journal of Harbin Engineering University 30(13), 1399–1403 (2009)
4. Qiao, L.Y., Peng, X.Y., Peng, Y.: BPSO-SVM Wrapper for Feature Subset Selection.
Acta Electronica Sinica 34(3), 496–498 (2006)
5. Dai, P., Li, N.: A Fast SVM-based Feature Selection Method. Journal of Shandong
University (Engineering Science) 40(5), 60–65 (2010)
6. Kennedy, J., Eberhart, R.C.: Particle Swarm Optimization. In: IEEE International
Conference on Neural Networks, pp. 1942–1948. IEEE Service Center, Piscataway (1995)
7. Shi, X.H., Wan, L.M., Lee, H.P., et al.: An Improved Genetic Algorithm with Variable
Population-size and a PSO-GA Based Hybrid Evolutionary Algorithm. In: Second
International Conference on Machine Learning and Cybernetics, pp. 1735–1740 (2003)
8. Eberhart, R.C., Kennedy, J.: A Discrete Binary Version of the Particle Swarm Algorithm.
In: IEEE Conference on Systems, Man, and Cybernetics, vol. 5, pp. 4104–4109. IEEE
Press, Orlando (1997)
9. Suykens, J.A.K., Vandewalle, J.: Least Squares Support Vector Machine Classifiers.
Neural Processing Letter 9, 293–300 (1999)
10. Chua, K.S.: Efficient Computations for Large Least Square Support Vector Machine
Classifiers. Pattern Recognition Letters 24, 75–80 (2003)
11. Ma, H.M., Ye, C.M., Zhang, S.: Binary Improved Particle Swarm Optimization Algorithm
for Knapsack Problem. Journal of University of Shanghai for Science and Technology 28(1),
31–34 (2006)
Unsupervised Local and Global Weighting for Feature
Selection
Abstract. In this paper we describe a process for selecting relevant features in unsupervised learning paradigms using two new weighted approaches: local observation weighting "Obs-SOM" and global observation weighting "GObs-SOM". These new methods are based on the self-organizing map (SOM) model and feature weighting. The learning algorithms provide cluster characterization by determining the feature weights within each cluster. We describe extensive testing using a novel statistical method for unsupervised feature selection. Our approach demonstrates the efficiency and effectiveness of these methods in dealing with high-dimensional data for simultaneous clustering and weighting. The models are tested on a wide variety of datasets, showing better performance for the new algorithms than for the classical SOM algorithm. We also show that, through different means of visualization, the Obs-SOM and GObs-SOM algorithms provide various pieces of information that could be used in practical applications.
1 Introduction
Feature selection for clustering or unsupervised feature selection aims at identifying
the feature subsets so that a model describing accurately the clusters can be obtained
from unsupervised learning. This improves the interpretability of the induced model,
as only relevant features are involved in it, without degrading its descriptive accuracy.
Additionally, the identification of relevant and irrelevant variables with SOM [1, 2] learning provides valuable insight into the nature of the group structure. Feature selection (or variable selection) for clustering is difficult because, unlike in supervised learning [3], there are no class labels for the dataset and no obvious criteria to guide the search. The important issue of feature selection in clustering is to provide the variables which give the "best" homogeneous clustering [4]. Therefore, we use the weight and prototype vectors π[j] and w[j] provided by our proposed weighting approaches to cluster the map and to characterize each cluster with its relevant variables. For map clustering we use hierarchical clustering [10].
3 Weighting Observations
Weighting the observations during the learning process is a technique that gives more importance to the relevant features of the weighted observation. Consider the dataset X = {x1, x2, x3, x4} and suppose that the observation x2 has greater relevance within X. In this case the weighting approach must be able to assign a higher weight value to x2 than to the other three observations. For this type of approach we propose both the local and the global weighting described in the next sections.
We base our method on initial work describing the supervised model w-LVQ2 [7]. This approach adapts weights to filter the observations during the learning process. Using this model, we weight the observations x with the weight vectors π before computing the distance, so the weight matrix acts as a filtering process for the observations. The objective function is rewritten as follows:
$$R_{Obs\text{-}SOM}(\chi, W, \pi) = \sum_{i=1}^{N} \sum_{j=1}^{|W|} K_{j,\chi(x_i)} \, \|\pi_j \otimes x_i - w_j\|^2 \qquad (1)$$
We minimize $R_{Obs\text{-}SOM}(\chi, W, \pi)$ with respect to $\pi$ while fixing $\chi$ and $W$, which yields the update rule for the feature weight vector $\pi_j(t+1)$.
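The following sketch (our illustration; random toy values) shows the core of observation weighting: the input is filtered by a unit's weight vector π_j before the distance to its prototype w_j is computed, as in the objective (1):

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, d = 6, 4
W = rng.random((n_units, d))      # prototypes w_j
Pi = rng.random((n_units, d))     # feature weight vectors pi_j
x = rng.random(d)                 # one observation

dists = np.sum((Pi * x - W) ** 2, axis=1)    # ||pi_j (x) x - w_j||^2 per unit
bmu = dists.argmin()                          # best matching unit chi(x)
print(bmu, dists[bmu])
```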
We used this dataset to show the good level of performance of both algorithms (Obs-SOM and GObs-SOM) for simultaneous clustering and feature weighting. All
observations were used to generate a map of 26×14 cells. Both learning algorithms provide two vectors for each cell: the referent vector wj = (w1j, w2j, ..., wdj) and the weight vector πj = (π1j, π2j, ..., πdj), where d = 40. Preparing data for clustering requires some preprocessing, such as normalization or standardization. In the first experimentation step, we normalized the initial dataset (Figure 1(a)) to obtain more homogeneous data (Figure 1(b)). We used variance normalization, a linear transformation that scales the values such that their variance is equal to 1.
We created 3D representations of the referent vector and weight vector provided by
classical SOM and by our methods (G/Obs-SOM). The axes X and Y indicate the
features and the referent indexes, respectively. The amplitude indicates the mean
value of each component. Examination of the two graphs (3(c), 4(b)) shows that the
noise represented by features 19 to 40 may be clearly detected with low amplitudes.
This visual analysis of the results clearly shows that the Obs-SOM algorithm provides the best results. Both the graphs of the weights π and of the prototypes W show that the features associated with noise are irrelevant, having low amplitudes. Visual analysis of the weight vectors (Figure 2(d)) showed that the weight vectors obtained with Obs-SOM give a more accurate picture. The Obs-SOM algorithm provides good results because the weight vectors act as a filter for the observations and the referents are estimated from this filtering. We applied the selection task to all parameters of the map, before and after map clustering, to check that it was possible to automatically select the features using our algorithms. This task involves detecting major changes in each input vector, represented as a signal graph. We used hierarchical clustering [10] for clustering
the map. After Obs-SOM map clustering, we obtained three clusters with a purity
index equal to 0.7076. This demonstrates that when there is no cluster (labels)
information, feature weighting can be used to find and characterize homogeneous
clusters. The importance of this index is that it gives us information about each cluster in a visual mode. The plot in the left part of the figure shows the wrongly labeled observations. In the case of the global weighting algorithm, we can see that some noise features have high values, and even for Obs-SOM the first features (1-20) do not describe the waves well. This disadvantage compared to the local weighting approach arises because the global weighting technique uses only one vector of weights for all the data, so each sample vector is weighted with the same vector of weights. After Obs-SOM map clustering with the referents W, which
are already weighted, we obtain 3 clusters. The characterization of the clusters with the "Scree Test" algorithm is provided in Table 1; for each algorithm, we present the features selected for each cluster. Both techniques (Obs-SOM, GObs-SOM) provided three clusters characterized by different features. By contrast, segmentation of the map using classical SOM provided six clusters with a purity index value of 0.662. Map segmentation was performed using hierarchical clustering with all the features. For clusters cl1, cl2 and cl3, the features selected using Obs-SOM are listed in Table 1. We found that the Obs-SOM algorithm identified relevant and informative features, giving more accurate results than classical SOM. The new and classical methods were compared
after segmentation of the map. We investigated the effect of the selected features before and after segmentation, or without segmentation, by testing this selection process in the supervised paradigm and computing the accuracy index for each method. In the case of the global weighting approach (GObs-SOM) we are not able to characterize each cluster, because the weight vector is the same for all the prototypes, but we can detect the relevant features for the whole map (dataset). We can see that the set of features selected by this global weighting algorithm (Table 1) represents the union of the relevant features obtained with the local weighting approach over all the clusters.
Table 1. Comparison of the selected variables using the traditional approach and our approaches (G/Obs-SOM). [i−j] indicates the set of selected variables.
Fig. 2. 3D visualization of the referent vector and weight vector. The axes X and Y indicate
features and the referent index values, respectively. The amplitude indicates the mean value of
each component of map 26×14 (364 cells).
Fig. 3. Comparison of purity score (classification accuracy with learning dataset) using SOM,
GObs-SOM and Obs-SOM before and after clustering map
5 Conclusion
In this paper, we have described a process for selecting relevant features in unsupervised learning paradigms using two new weighted approaches. These new methods are based on the SOM model and feature weighting. Both learning algorithms, Obs-SOM and GObs-SOM, provide cluster characterization by determining the feature weights within each cluster. We described extensive testing using a novel statistical method for unsupervised feature selection. Our approaches demonstrated the efficiency and effectiveness of these methods in dealing with high-dimensional data for simultaneous clustering and weighting. The models proposed in this paper were tested on a wide variety of datasets, showing better performance for the Obs-SOM and GObs-SOM algorithms than for the classical SOM algorithm. We also showed that, through different means of visualization, the Obs-SOM and GObs-SOM algorithms provide various pieces of information that could be used in practical applications. The global weighting approach is used when analyzing the entire clustering result rather than each cluster separately.
References
[1] Kohonen, T.: Self-organizing Maps. Springer, Berlin (1995)
[2] Vesanto, J., Alhoniemi, E.: Clustering of the self-organizing map. IEEE Transactions on Neural Networks 11(3), 586–600 (2000)
[3] Kohonen, T.: Self-organizing Maps. Springer, Berlin (2001)
[4] Frigui, H., Nasraoui, O.: Unsupervised learning of prototypes and attribute weights.
Pattern Recognition 37(3), 567–581 (2004)
[5] Yacoub, M., Bennani, Y.: Features selection and architecture optimization in
connectionist systems. IJNS 10(5) (2000)
[6] Cattell, R.: The scree test for the number of factors. Multivariate Behavioral Research 1, 245–276 (1966)
[7] Yacoub, M., Bennani, Y.: Features selection and architecture optimization in
connectionist systems. IJNS 10(5) (2000)
[8] Asuncion, A., Newman, D.J.: Uci machine learning repository (2007)
[9] Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Computing
Surveys 31(3), 264–323 (1999)
[10] Vesanto, J., Alhoniemi, E.: Clustering of the self-organizing map. IEEE Transactions on
Neural Networks 11(3), 586–600 (2000)
[11] Yacoub, M., Bennani, Y.: Une mesure de pertinence pour la sélection de variables dans
les perceptrons multicouches. RIA, Apprentissage Connexionniste, pp. 393–410 (2001)
Graph-Based Feature Recognition of Line-Like
Topographic Map Symbols
1 Introduction
Paper-based raster maps are primarily appropriate for human usage; they always require a certain level of intelligent interpretation. In GIS applications vectorized maps are preferred. In particular, governments, local authorities and service providers tend to use topographic maps in vectorized form. It is a serious challenge in every country to vectorize maps that are available in raster format. This task has been accomplished in most countries, often with the use of uncomfortable, "manual" tools, taking several years. However, it is worth dealing with the topic of raster-vector conversion: on one hand, some results of vectorization need improvement or modification; on the other hand, new maps are created that need vectorization.
The theoretical background of an intelligent raster-vector conversion system has been studied in the IRIS project [2], and several components of a prototype system have been elaborated. It became clear very early that computer support of the conversion steps can be provided at quite different levels. For example, a map symbol can be identified by a human interpreter, but the recognition can be
attempted with software, using the tools of image processing [3]. A computer system can be fairly valuable and usable even if every important decision of interpretation is made by the expert user. However, the system designed and developed by the authors aims to automate the raster-vector conversion of line-like symbols as much as possible. This aim puts the emphasis on a knowledge-based approach.
This paper deals with a part of raster-vector conversion applied in cartography, following a knowledge-based approach [1]. The line-like map symbols used in topographic maps are introduced, together with the algorithms used to recognize them. The organization of expertise into a knowledge base is also presented.
The following must be considered in connection with good-quality, automated vectorization. Raster maps can be adequately understood only by a human expert. After vectorization, the relationships used for interpretation are no longer contained in the vectorized map; it consists only of numerical and descriptive data. Automatic interpretation of image contents requires sophisticated image processing tools, which are not comparable to human perception in the majority of cases. Therefore, the level of automatic recognition must also be appropriately determined.
The topic of this article is how to interpret the printed variants of line-like map symbols and how to represent them in computer systems. This process is considered basically as the result of the interpretation and processing of map symbols. To accomplish this task it is very important to understand maps, and specifically map symbols; for a comprehensive survey, refer to [5]. Although human cognition cannot be completely understood, it is necessary to know to a certain extent how the human expert interprets graphical information. Regarding human perception, primarily points, lines (see Fig. 1) and textured surfaces are sought and distinguished. It must be realized that human perception may reveal finer or hidden information, for example how roads crossing at different levels hide each other. The human mind is also capable of abstraction, for example when it disregards the actual texture of a surface and investigates only its shape. The human eye can make some corrections, for example in the determination of the shades of color layers printed over each other.
The map interpretation process and the complexity of knowledge-based object recognition can be visualized via any example of the four different object types, that is, point, line, surface and inscription.
Line-like elements can cover a larger area on the map than their real size. For instance, in the case of a highway, a zero-width center line can represent the theoretical position of the road in the database. Beyond the graphical properties of lines, the database may contain real physical parameters, such as road width, carrying capacity, coating (concrete, asphalt), etc. Hiding is a very inherent phenomenon in maps when line-like objects, landmarks (typically roads, railways
Fig. 1. Examples of line-like map symbols (panels a)-f))
Fig. 2. Structuring elements a)-d) (binary 0/1 masks) used for morphological thinning and for fork detection
The number of layer colors is typically 5-7 in the case of topographic maps. We assume that on a printed map each pixel color can be defined as a linear combination of a surface and an object color. In the optimal case, this can be written as the equation $c = \alpha c_o + (1-\alpha) c_s$, where $c$ is the value of the current pixel, and $c_o$, $c_s$ are the respective object and surface colors, so the segmentation can be done by solving the minimization task $\min_{o \in O,\, s \in S} |c - (\alpha c_o + (1-\alpha) c_s)|$ for each pixel.
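A per-pixel sketch of this segmentation (our illustration; the color sets are hypothetical) solves for the best α in closed form for each (object, surface) color pair and keeps the pair with the smallest residual:

```python
import numpy as np

object_colors = np.array([[0., 0., 0.], [1., 0., 0.]])   # e.g. black, red
surface_colors = np.array([[1., 1., 1.]])                # e.g. white paper

def segment(c):
    """Return (error, object color, alpha) of the best linear combination."""
    best = (np.inf, None, None)
    for o in object_colors:
        for s in surface_colors:
            d = o - s
            denom = d @ d
            # least-squares alpha for c = alpha*o + (1-alpha)*s, clipped to [0,1]
            alpha = np.clip((c - s) @ d / denom, 0, 1) if denom > 0 else 0.0
            err = np.linalg.norm(c - (alpha * o + (1 - alpha) * s))
            if err < best[0]:
                best = (err, o, alpha)
    return best

err, obj, alpha = segment(np.array([0.6, 0.55, 0.5]))
print(err, obj, alpha)
```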
As the second step is a simple selection on the segmented pixels, it can be done easily. The third step consists of two different thinning methods. A general thinning method is used first, to avoid creating unneeded short lines during morphological thinning. The general thinning iteratively deletes pixels inside the shape to shrink it, without shortening it or breaking it apart. Because the result of the general thinning algorithm may contain small pixel groups, a morphological thinning is then performed, using the structuring elements shown in Fig. 2. At each iteration, the image is first thinned by the left-hand structuring element (see Fig. 2 a) and b)), then by the right-hand one, and then by the remaining six 90° rotations of the two elements. The process is repeated in cyclic fashion until none of the thinnings produces any further change. As usual, the origin of each structuring element is at its center.
The skeletonized binary image can be vectorized in the following way. Mark all object pixels black and all surface pixels white. Mark red those black pixels where N(P1) > 2, and then mark the remaining black fork points blue, using structuring elements c) and d) of Fig. 2 in the same way as the structuring elements are used in morphological thinning. Red fork points connect lines, while blue fork points connect other fork points. Mark green each black pixel that has at most one black neighbour (an end point of a line segment). A priority is thus defined over the colors: white < black < green < red < blue. The following steps vectorize the object pixels (a simplified sketch follows the list):
1. Select a green point, mark white and create a new line segment list, which
contains that point.
2. Select a black neighbour if one exists and the current point is also black; otherwise select a higher-priority point. Mark the point white and add it to the end of the list.
3. Go to Step 2, while a corresponding neighbour exists.
4. Go back to the place of the first element of the list and go to Step 2. Be
careful that new points should be added now to the front of the list. (This
step processes points in the opposite direction.)
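A simplified sketch of the tracing loop above (our illustration; fork handling and the color priority are omitted) follows one line segment on a small skeleton:

```python
import numpy as np

skel = np.array([[0,0,0,0,0],
                 [0,1,1,1,0],
                 [0,0,0,1,0],
                 [0,0,0,1,0]], dtype=int)

def neighbours(p):
    """Yield the 8-connected neighbours of pixel p that lie inside the image."""
    i, j = p
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di or dj) and 0 <= i+di < skel.shape[0] and 0 <= j+dj < skel.shape[1]:
                yield (i+di, j+dj)

def trace(start):
    """Collect one polyline from an end point, marking visited pixels white."""
    seg, p = [start], start
    skel[start] = 0
    while True:
        nxt = [q for q in neighbours(p) if skel[q] == 1]
        if not nxt:
            return seg
        p = nxt[0]
        skel[p] = 0
        seg.append(p)

print(trace((1, 1)))   # e.g. [(1,1), (1,2), (1,3), (2,3), (3,3)]
```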
Although the algorithm above vectorizes all the objects, it merges the different object types and colors. Hence, the pixels of a given object color are copied onto a separate binary image before they are vectorized.
We introduce an approach which is able to recognize the features of line-like objects, so that the corresponding attributes can be assigned to them. This assumes that the path of each object exists in the corresponding vector layer. In order to recognize a specific feature, its properties must be defined for identification. Two properties of vectorized symbols are recognized: forks (F ∼ Fork) and end-points (E ∼ End). Both are well known in fingerprint recognition, where they are called minutiae. In the case of fingerprints, a fork is an end-point in the complement pattern, so only one of them is used for identification. In our case we cannot define a complement pattern, so both forks and end-points are used.
Representation of line-like symbols is based on weighted, undirected graphs.
An EF -graph is an undirected graph with the following properties:
– Nodes are either of type E or F . The color of a node is determined by the
corresponding vector layer.
– Two nodes are connected if the line-segment sequence connecting the nodes
in the corresponding vector layer does not contain additional nodes. Edges
running between nodes of different colors can be defined by the user (in case
of multicolor objects). The weight of the edge is equal to the length of the
road connecting the two nodes, and it has the color of the corresponding
symbol part.
– There are special nodes, denoted by an index P , which occur on the trace
of a line object. These will be used to produce the final vector model.
Fig. 3. The EF graph and the elementary EF graph of a double railway line. Distances
are relative values and refer to the scale of the map. The dashed line in the elementary
EF graph represents its cyclical property.
An EF-graph can be assigned not only to the vectorized symbols but also to the vectorized map, where the line-like symbols are not separated into their kernels. For recognition we use the smallest unit of the symbol, called the kernel. The smallest unit is defined as the one from which the entire symbol can be produced by iteration. In the EF-graph there are usually only two nodes participating in the iteration; these are of type F with only a single edge, and they become the entry and exit points of the graph. In the very few cases where the entry and exit points of the smallest unit cannot be identified, the kernel of the line-like object is the object itself. A smallest unit cannot be defined for the whole vectorized map.
Figure 3 shows how a symbol is built up from its smallest units by iteration. Weights represent proportions and depend on the scale of the map. Besides weights, we can assign another attribute, a color, to each edge. In the figure almost all edges are coloured black.
The recognition of line-like objects is reduced to an extended subgraph isomorphism problem: we try to identify all occurrences of the EF graph of the symbol (the subgraph) in the EF graph of the entire map. The weights of the EF graphs are normalized with respect to the scale of the map, and the collection is sorted in decreasing order of node degrees; call this collection of sorted EF graphs S. Since the EF graphs created for maps do not contain the edges connecting nodes of different colors, this case must be handled separately. In this article, the potential edges are identified by searching for the corresponding neighbour on its own layer within the given distance of the node. The validity of a found potential edge is verified by comparing the color of the edge with the color of the segmented image pixels lying under the edge.
Subject to the conditions above, it is possible to design an algorithm for the recognition of subgraphs. While the map is being processed, recognized objects are removed by recoloring the corresponding subgraph. Two colors, say blue and red, are used to keep track of the progress of the algorithm and to ensure termination.
The following algorithm stops when there are no more red nodes left in the graph.
1. Choose an arbitrary red node U with the highest degree from the EF graph of the map.
2. Attempt to match the subgraph at node U against S, the sorted collection of EF graphs, until the first successful match, in the following way:
(a) Perform a "parallel" breadth-first search on the EF graph of the map and the EF graph of the kernel of the current symbol, with origin U. This is called successful if the degrees of all corresponding nodes match and the weights are approximately the same.
(b) In the case of success, all matching nodes become blue; otherwise they remain red.
Upon a successful match, the EF-graph of the symbol is deleted from the EF-graph of the map. Entry and exit points must not be deleted unless they are marked as E, and the degrees of the remaining F nodes must be decreased accordingly. The equality of edge weights can only be approximate, due to the curvature
of symbols. The algorithm above can be implemented so that it keeps track of the given object by using the line segments as paths in the vector data. The other difficulty is that edges between differently colored nodes are not directly defined by the vector layers. In practice, we created a spatial database containing the vectorized line segments and their color attribute; the potential edges were determined by a query looking for the existence of a neighbour with the right color within a given distance of the corresponding node.
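The following compact sketch (our illustration; tiny hand-made graphs and a simplified weight tolerance) shows the "parallel" breadth-first matching of a symbol kernel against the map's EF graph:

```python
from collections import deque

# Each graph maps a node id to a list of (neighbour, edge weight) pairs.
map_g = {0: [(1, 4.0)], 1: [(0, 4.0), (2, 4.0)], 2: [(1, 4.0)]}
kernel = {0: [(1, 4.0)], 1: [(0, 4.0), (2, 4.0)], 2: [(1, 4.0)]}

def match(u_map, u_ker, tol=0.1):
    """BFS both graphs in parallel from (u_map, u_ker); degrees must match
    and edge weights must agree approximately."""
    pair = {u_ker: u_map}
    queue = deque([(u_map, u_ker)])
    while queue:
        a, b = queue.popleft()
        if len(map_g[a]) != len(kernel[b]):
            return None
        for (nb, wk) in kernel[b]:
            if nb in pair:
                continue
            cand = [na for (na, wm) in map_g[a]
                    if na not in pair.values() and abs(wm - wk) <= tol]
            if not cand:
                return None
            pair[nb] = cand[0]
            queue.append((cand[0], nb))
    return pair

print(match(1, 1))   # e.g. {1: 1, 0: 0, 2: 2}
```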
4 Results
In this article a feature extraction method is introduced for line-like symbol vectorization in the IRIS project. The project aims to automate and support the recognition of raster images of topographic maps with a combination of digital image processing and a knowledge-based approach.
The interpretation of line-like symbols is the most difficult issue in topographic map vectorization. An EF graph representation was developed, which is used for the recognition of curved, line-like objects with regular patterns.
(Legend: fork-free line; recognized pattern; recognized circle; unrecognized pattern)
Fig. 4. Recognition results of a Hungarian topographic map line networks at map scale
1:10000 and scanning resolution of 300dpi
The method was tested on a 6 km × 4 km section (6990 × 4680 pixels) of a large-scale 1:10 000 topographic map (see Fig. 4). In our experience some spatial filters, like the Kuwahara filter and conservative smoothing, improved the quality of segmentation. During a large number of tests, symbols appearing completely on the maps were identified at a high rate (> 90%), while partially disappearing symbols (such as at junctions and discontinuities) remained mostly unidentified. In the latter case, the level of identification can be enhanced with some heuristics using the neighbouring segments.
Acknowledgement
The work was supported by the European Union and co-financed by the Eu-
ropean Social Fund (grant agreement no. TAMOP 4.2.1./B-09/1/KMR-2010-
0003).
References
[1] Corner, R.J.: Knowledge Representation in Geographic Information Systems. Ph.D.
thesis, Curtin University of Technology (December 1999)
[2] Dezső, B., Elek, I., Máriás, Z.: IRIS, Development of Automatized Raster-Vector
Conversion System. Tech. rep., Eötvös Loránd University and IKKK (November
2007) (in Hungarian)
[3] Dezső, B., Elek, I., Máriás, Z.: Image processing methods in raster-vector conver-
sion of topographic maps. In: Karras, A.D., et al. (eds.) Proceedings of the 2009
International Conference on Artificial Intelligence and Pattern Recognition, pp.
83–86 (July 2009)
[4] Janssen, R.D.T., Vossepoel, A.M.: Adaptive vectorization of line drawing images.
Computer Vision and Image Understanding 65(1), 38–56 (1997)
[5] Klinghammer, I., Papp-Váry, Á.: Földünk tükre a térkép (Map, mirror of the
Earth). Gondolat (1983)
[6] Liang, S., Chen, W.: Extraction of line feature in binary images. IEICE Transactions
on Fundamentals of Electronics Communications and Computer Sciences E91A(8),
1890–1897 (2008)
Automatic Recognition of Topographic Map
Symbols Based on Their Textures
1 Introduction
This paper¹ describes a method that recognizes symbols within the raster-vector conversion of maps [4]. Maps that contain topographic symbols are made from vector data models, whereas photos and remote sensing images contain map symbols only in raster form.
If a map symbol is identified, then two transformation steps can be made automatically [1, 3]. First, the vectorized polygon of the map symbol is removed from the vectorized map if it was mistakenly recognized as a line or polygon object. Next, the meaning of the removed symbol is assigned as an attribute to the polygon of the corresponding real object, i.e., a surface in the vector data model. For instance, after removing the symbol "vineyard", this attribute will be added to the boundary polygon of the "real" vineyard (see Fig. 1a). In practice, the attributes of the polygons are stored in a GIS database.
¹ This research was supported by the project TÁMOP-4.2.1/B-09/1/KMR-2010-003 of Eötvös Loránd University.
3 Symbol Recognition
It is important to recognize those objects of the map that represent a symbol even if they look like lines or polygons. The texture-based pattern matching algorithm developed by the authors recognizes these symbols directly. The algorithm also determines the positions of the symbols on the map. The position is needed in order to query the corresponding polygon from the vector model. This polygon is removed from the vector model and its attribute property (e.g. "vineyard") is assigned to the underlying polygon. A second query is required to determine the line or polygon that belongs to the symbol [2].
Using vector data, the kernel of the sample can be determined by a motion vector searching algorithm. The details are not discussed here, because this algorithm is known from image sequence processing, where it is used to increase the compression ratio. (For example, the MPEG standard and its variants use motion vector compensation and estimation to remove redundant image information between frames.)
[Figure: recognition workflow - raster image and vector model; kernel search on the rectangle of the sample texture; pattern matching on the raster image; removal of the symbol polygons; filtered vector model]
Correct 5 535 82 6 11
False Pos. 0 5 4 0 0
False Neg. 0 13 4 0 2
In practice, k is a constant (e.g. k = 3 for RGB images) and the value ls has an upper bound which does not depend on the size of the map. Therefore, the pattern matching algorithm runs in linear time.
8 Conclusion
References
[1] Ablameyko, S., et al.: Automatic/interactive interpretation of color map images.
Pattern Recognition 3, 69–72 (2002)
[2] Bhattacharjee, S., Monagan, G.: Recognition of cartographic symbols. In: MVA
1994 IAPR Workshop on Machine Vision Applications, Kawasaki (1994)
[3] Levachkine, S., Polchkov, E.: Integrated technique for automated digitization of
raster maps. Revista Digital Universitaria 1(1) (2000),
http://www.revista.unam.mx/vol.1/art4/
[4] Szendrei, R., Elek, I., Márton, M.: A knowledge-based approach to raster-vector
conversion of large scale topographic maps (abstract). CSCS, Szeged, Hungary
(June 2010), full paper accepted by Acta Cybernetica (in Press)
[5] Trier, O.D., et al.: Feature extraction methods for character recognition - a survey.
Pattern Recognition 29(4), 641–662 (1996)
[6] Tsai, D., Tsai, Y.: Rotation-invariant pattern matching with color ring-projection.
Pattern Recognition 35(1), 131–141 (2002)
Using Population Based Algorithms for
Initializing Nonnegative Matrix Factorization
1 Introduction
The nonnegative matrix factorization (NMF, [1]) leads to a low-rank approximation which satisfies nonnegativity constraints. Contrary to other low-rank approximations such as the SVD, these constraints may improve the sparseness of the factors and, due to the "additive parts-based" representation, also improve interpretability [1, 2]. NMF consists of reduced-rank nonnegative factors $W \in \mathbb{R}^{m \times k}$ and $H \in \mathbb{R}^{k \times n}$ with $k \ll \min\{m, n\}$ that approximate the matrix $A \in \mathbb{R}^{m \times n}$. NMF requires that all entries in A, W and H are zero or positive. The nonlinear optimization problem underlying NMF can generally be stated as

$$\min_{W,H} f(W, H) = \min_{W,H} \frac{1}{2} \|A - WH\|_F^2. \qquad (1)$$
Initialization. Algorithms for computing the NMF are iterative and require initialization of the factors W and H. NMF unavoidably converges to local minima, probably different ones for different initializations (cf. [3]). Hence, random initialization makes the experiments unrepeatable, since the solution of Eq. (1) is not
unique in this case. A proper non-random initialization can lead to faster error
reduction and better overall error at convergence. Moreover, it makes the exper-
iments repeatable. Although the benefits of good NMF initialization techniques
are well known in the literature, most studies use random initialization (cf. [3]).
The goal of this paper is to utilize population based algorithms (abbreviated
as “PBAs”) as initialization booster for NMF. The PBAs are used to initialize
the factors W and H in order to minimize the NMF objective function prior to
the factorization. The goal is to find a solution with smaller overall error at con-
vergence, and/or to speed up convergence of NMF (i.e., smaller approximation
error for a given number of NMF iterations). Instead of initializing the com-
plete factors W and H at once, we sequentially optimize single rows of W and
single columns of H, respectively. This allows for parallel/distributed computa-
tion by splitting up the initialization into several partly independent sub-tasks.
Mathematically, we consider the problem of finding a “good” (ideally the global)
solution of an optimization problem with bound constraints in the form:
$\min_{x \in \Omega} f(x), \qquad (2)$
2 Methodology
2.1 The NMF Algorithm
The general structure of NMF algorithms is given in Alg. 1. Usually, W and H are
initialized randomly and the whole algorithm is repeated several times (maxrep-
etition). In each repetition, NMF update steps are processed until a maximum
number of iterations is reached (maxiter ). These update steps are algorithm spe-
cific and differ from one NMF variant to the other. If the approximation error
drops below a pre-defined threshold, or if the shift between two iterations is very
small, the algorithm might stop before all iterations are processed.
Given matrix A ∈ R^{m×n} and k ≪ min{m, n};
for rep = 1 to maxrepetition do
W = rand(m, k);
(H = rand(k, n));
for i = 1 to maxiter do
perform algorithm specific NMF update steps;
check termination criterion;
end
end
Algorithm 1. General Structure of NMF Algorithms
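For illustration, the following Python sketch (our own, not part of the original paper) instantiates the general structure of Alg. 1, using the classical multiplicative update rules of Lee and Seung [1] as the algorithm-specific update step; the tolerance-based termination test and all numeric defaults are illustrative choices.

import numpy as np

def nmf(A, k, maxiter=100, tol=1e-4, rng=np.random.default_rng(0)):
    # General structure of Alg. 1 with multiplicative updates as the NMF variant.
    m, n = A.shape
    W = rng.random((m, k))   # W = rand(m, k)
    H = rng.random((k, n))   # H = rand(k, n)
    err_old = np.inf
    for _ in range(maxiter):
        # algorithm-specific NMF update steps (here: Lee/Seung multiplicative rules)
        H *= (W.T @ A) / (W.T @ W @ H + 1e-9)
        W *= (A @ H.T) / (W @ H @ H.T + 1e-9)
        # termination criterion: small error, or small shift between two iterations
        err = np.linalg.norm(A - W @ H, 'fro')
        if err < tol or abs(err_old - err) < tol * max(err_old, 1.0):
            break
        err_old = err
    return W, H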
where $\sigma_i$ are the singular values of D, and $d_{ij}$ is the element in the i-th row and j-th column of D. The Frobenius norm can also be computed row-wise or column-wise. The row-wise calculation is
$\|D\|_F^{RW} = \left( \sum_{i=1}^{m} |d_i^r|^2 \right)^{1/2}, \qquad (4)$
where $|d_i^r|$ is the norm of the i-th row vector of D, i.e., $|d_i^r| = ( \sum_{j=1}^{n} |r_{ji}|^2 )^{1/2}$, and $r_{ji}$ is the j-th element in row i. The column-wise calculation is
$\|D\|_F^{CW} = \left( \sum_{j=1}^{n} |d_j^c|^2 \right)^{1/2}, \qquad (5)$
with $|d_j^c|$ being the norm of the j-th column vector of D, i.e., $|d_j^c| = ( \sum_{i=1}^{m} |c_{ji}|^2 )^{1/2}$, and $c_{ji}$ being the i-th element in column j. Obviously, a reduction of the Frobenius
norm of any row or any column of D leads to a reduction of the total Frobenius
norm ||D||F . In the following, D refers to the distance matrix of the original
data and the approximation, D = A − W H.
In line 4, the input parameters for the PBAs are $a_i^r$ (the i-th row of A) and H0; the output is the initialized row vector $w_i^r$, the i-th row of W. In line 8, the input parameters are $a_j^c$ (the j-th column of A) and the already optimized factor W; the output is the initialized column vector $h_j^c$, the j-th column of H. Global parameters used
for all PBAs are upper/lower bound of the search space and the initialization
(the starting values of the PBAs), number of particles (chromosomes, fish, ...),
and maximum number of fitness evaluations. The dimension of the optimization
problem is identical to the rank k of the NMF.
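To make the data flow concrete, the following sketch initializes each row of W and each column of H with a population based optimizer; SciPy's differential evolution serves here only as a stand-in for the five PBAs studied in the paper, and the bounds, budget and seeds are illustrative assumptions rather than the authors' settings.

import numpy as np
from scipy.optimize import differential_evolution

def pba_init(A, k, bounds=(0.0, 1.0), maxiter=20, rng=np.random.default_rng(0)):
    # Row/column-wise initialization; the problem dimension equals the rank k.
    m, n = A.shape
    H0 = rng.random((k, n))                      # randomly initialized H0
    W = np.empty((m, k))
    for i in range(m):                           # first loop: rows of W
        fit = lambda x: np.linalg.norm(A[i, :] - x @ H0)
        W[i, :] = differential_evolution(fit, [bounds] * k,
                                         maxiter=maxiter, seed=1).x
    H = np.empty((k, n))
    for j in range(n):                           # second loop: columns of H
        fit = lambda x: np.linalg.norm(A[:, j] - W @ x)
        H[:, j] = differential_evolution(fit, [bounds] * k,
                                         maxiter=maxiter, seed=1).x
    return W, H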
Parallelism. All iterations within the first for -loop and within the second for -
loop in Algorithm 2 are independent from each other, i.e., the initialization
of any row of W does not influence the initialization of any other row of W
(identical for columns of H). This allows for a parallel implementation of the
proposed initialization method. In the first step, all rows of W can be initialized
concurrently. In the second step, the columns of H can be computed in parallel.
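A minimal sketch of this parallelization using Python's standard library is given below (our illustration; the paper reports no specific parallel implementation). Each worker initializes one row of W independently:

import numpy as np
from concurrent.futures import ProcessPoolExecutor
from scipy.optimize import differential_evolution

def init_row(args):
    # One independent sub-task: initialize a single row of W against H0.
    a_row, H0, k = args
    fit = lambda x: np.linalg.norm(a_row - x @ H0)
    return differential_evolution(fit, [(0.0, 1.0)] * k, maxiter=20, seed=1).x

def parallel_init_W(A, H0, k):
    # Step 1 of the method: all rows of W are initialized concurrently.
    with ProcessPoolExecutor() as pool:
        rows = pool.map(init_row, [(A[i, :], H0, k) for i in range(A.shape[0])])
    return np.vstack(list(rows))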
4 Experimental Evaluation
For PSO and DE we used the Matlab implementations from [19] and adapted them for our needs. For PSO we used the constricted Gbest topology using the
parameters suggested in [20], for DE the crossover probability parameter was set
to 0.5. For GA we adapted the Matlab implementation of the continuous genetic
algorithm available in the appendix of [21] using a mutation rate of 0.2 and a
selection rate of 0.5. For FWA we used the same implementation and parameter
settings as in the introductory paper[17], and FSS was self-implemented following
the pseudo algorithm and the parameter settings provided in [15]. All results are
based on a randomly created, dense 100×100 matrix.
Fig. 1. Left side: average appr. error per row (after initializing rows of W ). Right side:
average appr. error per column (after initializing columns of H) – rank k =5.
Fig. 2. Left side: average appr. error per row (after initializing rows of W ). Right side:
average appr. error per column (after initializing columns of H) – rank k =30.
The figures on the left side show the average (mean) approximation error per
row after initializing the rows of W (first loop in Alg. 2). The figures on the right
side show the average (mean) approximation error per column after initializing
the columns of H (second loop in Alg. 2). The legends are ordered according to
the average approximation error achieved after the maximum number of function
evaluations for each figure (top = worst, bottom = best).
Results for k=5. Fig. 1 shows the results achieved for a small NMF rank k
set to 5 (k is identical to the problem dimension of the PBAs). In Fig. 1 (A),
only 500 evaluations are used to initialize the rows of W based on the randomly
initialized matrix H0 (see Alg. 2). In Fig. 1 (B) the previously initialized rows
of W are used to initialize the columns of H – again using only 500 function
evaluations. As can be seen, GA, DE and especially FWA are (to a small degree) sensitive to the small rank k and the small number of function evaluations. PSO and FSS achieve the best approximation results; FSS is the fastest in terms of accuracy per function evaluation. The lower part (C, D) of Fig. 1 shows the
results when increasing the number of function evaluations for all PBAs from
500 to 2 500. The first 500 evaluations in (C) are identical to (A), but the results
in (D) are different from (B) since they rely on the initialization of the rows of W
(the initialization results after the maximum number of function evaluations in
Fig. 1 (A) and (C) are different). With more function evaluations, all algorithms
except FWA achieve almost identical results.
Results for k=30. With increasing complexity (i. e., increasing rank k) FWA
clearly improves its results, as shown in Fig. 2. Together with PSO, FWA clearly
outperforms the other algorithms when using only 500 function evaluations, see
Fig. 2 (A, B). With increasing number of function evaluations, all PBAs achieve
identical results when initializing the rows of W (see Fig. 2 (C)). Note that
GA needs more than 2 000 evaluations to achieve a low approximation error.
When initializing the columns of H (see Fig. 2 (D)), PSO suffers from its high
approximation error during the first iterations. The reason for this phenomenon
is the relatively sparse factor matrix W computed by PSO. Although PSO is able
to reduce the approximation error significantly during the first 500 iterations, the
other algorithms achieve slightly better results after 2 500 function evaluations.
FSS and GA achieve the best approximation accuracy. The NMF approximation
results in Section 4.2 are based on factor matrices W and H initialized with the
same parameters as Fig. 2 (C, D): k=30, 2 500 function evaluations.
than random initialization. NNDSVD shows slightly better results than PSO
and FWA, but GA, DE and especially FSS are able to achieve a smaller error
per iteration than NNDSVD. Since algorithms (B)-(D) in Fig. 3 have faster convergence per iteration than MU but also have higher cost per iteration, only
the first 25 iterations are shown. For ALSPG (B), all new initialization variants
based on PBAs are clearly better than random initialization and also achieve a
better approximation error than NNDSVD. The performance of the five PBAs
is very similar for this algorithm. FastNMF (C) and BayesNMF (D) are two
recently developed NMF algorithms which were developed after the NNDSVD
initialization. Surprisingly, when using FastNMF, NNDSVD achieves a lower approximation quality than random initialization, while all initializations based on PBAs are slightly better than random initialization. The approximation error achieved
with BayesNMF strongly depends on the initialization of W and H (similar to
ALSPG). The PSO initialization shows a slightly higher approximation error than NNDSVD, but all other PBAs are able to achieve a smaller approximation
error than the state-of-the-art initialization, NNDSVD.
5 Conclusion
In this paper we introduced new initialization variants for nonnegative matrix
factorization (NMF) using five different population based algorithms (PBAs),
particle swarm optimization (PSO), genetic algorithms (GA), fish school search
(FSS), differential evolution (DE), and fireworks algorithm (FWA). These algo-
rithms were used to initialize the rows of the NMF factor W , and the columns
of the other factor H, in order to achieve a smaller approximation error for a
given number of iterations. The proposed method allows for parallel implemen-
tation in order to reduce the computational cost for the initialization. Overall,
the new initialization variants achieve better approximation results than random
initialization and state-of-the-art methods. Especially FSS is able to significantly
reduce the approximation error of NMF (for all NMF algorithms used), but other
heuristics such as DE and GA also achieve very competitive results.
Another contribution of this paper is the comparison of the general applicabil-
ity of population based algorithms for continuous optimization problems, such as
the NMF objective function. Experiments show that all algorithms except PSO
are sensitive to the number of fitness evaluations and/or to the complexity of
the problem (the problem dimension is defined by the rank of NMF). Moreover,
the material provided in Section 4 is the first study that compares the recently
developed PBAs, fireworks algorithm and fish school search. Current work in-
cludes high performance/distributed initialization, and a detailed comparative
study of the proposed methods. A future goal is to improve NMF algorithms by
utilizing heuristic search methods to avoid NMF getting stuck in local minima.
References
[1] Lee, D.D., Seung, H.S.: Learning parts of objects by non-negative matrix factor-
ization. Nature 401(6755), 788–791 (1999)
[2] Berry, M.W., Browne, M., Langville, A.N., Pauca, P.V., Plemmons, R.J.: Algo-
rithms and applications for approximate nonnegative matrix factorization. Com-
putational Statistics & Data Analysis 52(1), 155–173 (2007)
[3] Boutsidis, C., Gallopoulos, E.: SVD based initialization: A head start for nonneg-
ative matrix factorization. Pattern Recogn. 41(4), 1350–1362 (2008)
[4] Wild, S.M., Curry, J.H., Dougherty, A.: Improving non-negative matrix factoriza-
tions through structured initialization. Patt. Recog. 37(11), 2217–2232 (2004)
[5] Xue, Y., Tong, C.S., Chen, Y., Chen, W.: Clustering-based initialization for non-
negative matrix factorization. Appl. Math. & Comput. 205(2), 525–536 (2008)
[6] Kim, H., Park, H.: Nonnegative matrix factorization based on alternating non-
negativity constrained least squares and active set method. SIAM J. Matrix Anal.
Appl. 30, 713–730 (2008)
[7] Janecek, A.G., Gansterer, W.N.: Utilizing nonnegative matrix factorization for
e-mail classification problems. In: Berry, M.W., Kogan, J. (eds.) Survey of Text
Mining III: Application and Theory. John Wiley & Sons, Inc., Chichester (2010)
[8] Stadlthanner, K., Lutter, D., Theis, F., et al.: Sparse nonnegative matrix factoriza-
tion with genetic algorithms for microarray analysis. In: IJCNN 2007: Proceedings
of the International Joint Conference on Neural Networks, pp. 294–299 (2007)
[9] Snásel, V., Platos, J., Krömer, P.: Developing genetic algorithms for boolean ma-
trix factorization. In: DATESO 2008 (2008)
[10] Lin, C.J.: Projected gradient methods for nonnegative matrix factorization. Neural
Comput. 19(10), 2756–2779 (2007)
[11] Schmidt, M.N., Laurberg, H.: Non-negative matrix factorization with Gaussian
process priors. Comp. Intelligence and Neuroscience (1), 1–10 (2008)
[12] Goldberg, D.E.: Genetic Algorithms in Search, Optimization and Machine Learn-
ing, 1st edn. Addison-Wesley Longman, Amsterdam (1989)
[13] Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of IEEE
International Conference on Neural Networks, vol. 4, pp. 1942–1948 (1995)
[14] Price, K.V., Storn, R.M., Lampinen, J.A.: Differential Evolution: A Practical Approach to Global Optimization. Springer, Heidelberg (2005)
[15] Filho, C.J.A.B., de Lima Neto, F.B., Lins, A.J.C.C., Nascimento, A.I.S., Lima,
M.P.: Fish school search. In: Chiong, R. (ed.) Nature-Inspired Algorithms for
Optimisation. SCI, vol. 193, pp. 261–277. Springer, Heidelberg (2009)
[16] Janecek, A.G., Tan, Y.: Feeding the fish – weight update strategies for the fish
school search algorithm. To appear in Proceedings of ICSI 2011: 2nd International
Conference on Swarm Intelligence (2011)
[17] Tan, Y., Zhu, Y.: Fireworks algorithm for optimization. In: Tan, Y., Shi, Y., Tan,
K.C. (eds.) ICSI 2010. LNCS, vol. 6145, pp. 355–364. Springer, Heidelberg (2010)
[18] Berry, M.W., Drmac, Z., Jessup, E.R.: Matrices, vector spaces, and information
retrieval. SIAM Review 41(2), 335–362 (1999)
[19] Pedersen, M.E.H.: SwarmOps - numerical & heuristic optimization for matlab
(2010), http://www.hvass-labs.org/projects/swarmops/matlab
[20] Bratton, D., Kennedy, J.: Defining a standard for particle swarm optimization.
In: Swarm Intelligence Symposium, SIS 2007, pp. 120–127. IEEE, Los Alamitos
(2007)
[21] Haupt, R.L., Haupt, S.E.: Practical Genetic Algorithms, 2nd edn. John Wiley &
Sons, Inc., Chichester (2005)
A Kind of Object Level Measuring Method Based on
Image Processing*
1 Introduction
Traditional level measurements, such as those for molten steel, mainly involve eddy-current probes, float measurement, isotope measurement and radar measurement. Such instruments are easily damaged in harsh environments with high temperature, dust or liquid, and they are also very expensive. This paper presents a non-contact level measuring method based on image processing that uses only one digital camera or other visual sensor to capture a single image for measurement, offering better maintainability and low cost. Thanks to its simple structure and operation, this method avoids the small-view and spatial-matching problems of three-dimensional approaches.
Currently available image-based measuring techniques include measurement based on blur extent [1], virtual stereo vision measurement [2], and measurement based on image magnification [3]. Measurement based on blur extent applies only when the lens is close to the target and is unsuitable for long distances. The principle of virtual stereo vision measurement is similar to binocular measurement: two sets of mirrors form a single virtual camera, which requires that the tilt angles of the two mirror sets be symmetrical. In addition, it has
*
This work is supported by National Natural Science Foundation of China (No. 60804068),
Natural Science Foundation of Jiangsu Province (No.BK2010261), and Cooperation Innova-
tion of Industry, Education and Academy of Jiangsu Province (No. BY2010126).
the assumption that the camera axis is perpendicular to the object, the same assumption made by the method based on image magnification.
The fuzzy method in reference [1] uses the Gaussian lens formula from optics: the image blurs when the distance from the lens to the object changes, and the object distance is calculated from the blurring extent. A wavelet algorithm is used to detect image edges. Although the authors mention that selecting an appropriate threshold can determine the blurring band width, they give no detailed theoretical analysis, and large errors occur.
Reference [4] also uses a single light source to project concentric circles onto a screen and calculates the object distance from the picture, but the image processing is not done meticulously; judging from the preprocessed images given in that work, measurement is difficult. Reference [4] also places lenses in front of the light source for color filtering, which not only increases the complexity of the equipment but also introduces noise or distortion into the collected images.
For these reasons, this paper proposes an object level measuring system based on image processing that requires only one image acquisition device for image data acquisition, a standard projection device as an auxiliary light source, and the necessary image preprocessing program. Besides calculating the object level, the system can first judge whether the object's axis is perpendicular to the image acquisition equipment and then perform fine adjustment automatically, or raise an alarm for manual adjustment, thereby preventing most errors caused by tilt.
2 Implementation
2.1 Equipment
The image collecting device used in this article is shown in Figure 1, where 1 is the support installed on top of the object, 2 is the image acquisition equipment such as a camera, 3 is the standard video projection equipment, and 4 is the test object. As the standard video projection equipment, a laser transmitter is used to project parallel light; the projected pattern can be concentric circles, equidistant parallel lines, or equidistant points on a straight line. This paper uses concentric circles as the example.
Preprocessing the original image to obtain a suitable picture is very important, since it determines whether the subsequent measurement and calculation proceed smoothly. After the original image is obtained in this system, the major processing steps are completed as follows.
1) Image subtraction processing [5], which effectively removes the background image and leaves the useful information.
2) Image grayscale processing, which converts the color image to grayscale.
3) Image binarization processing, which computes the histogram of the grayscale picture and takes its valley point as the threshold.
4) Refinement (thinning) of the binary image, from which the final skeleton image is obtained. Fig. 2 and Fig. 3 show images before and after processing, respectively.
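A hedged OpenCV sketch of these four steps follows; Otsu's method stands in for the histogram-valley threshold of step 3, and the thinning function requires the opencv-contrib-python package, so the exact routines are our approximations of the steps above.

import cv2

def preprocess(original, background):
    # 1) image subtraction: remove the background, keep useful information
    diff = cv2.absdiff(original, background)
    # 2) grayscale processing: convert the color image to grayscale
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    # 3) binarization; Otsu's threshold approximates the histogram valley point
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # 4) thinning of the binary image yields the skeleton
    return cv2.ximgproc.thinning(binary)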
From Fig. 4, it can be seen that the deformation of the projected image grows with the angle between the test object and the camera surface. To obtain accurate measurements, we need to ensure that the test object is as perpendicular to the camera surface as possible.
[Fig. 4: projected images — (a) vertical, (b) anticlockwise tilt 20°, (c) anticlockwise tilt 35°]
[Fig. 5: measurement geometry with tilt angle θ, projection angles α, and radii r1, r2]
During the angle detection process, the lens is fixed m meters from the test object, so H and the angle α are known and AB and BC can be calculated from them; for convenience we denote this length by L.
As shown in Fig. 5, we establish a coordinate system with A as the origin and the AB direction as the x axis, and we may deduce:
The linear equation of AC′ is:
$y = \tan(\alpha + \theta) \cdot x \qquad (2)$
The linear equation of BC′ is:
$y = \tan 2\alpha \cdot (x - L) \qquad (3)$
The linear equation of AC is:
$y = \tan\alpha \cdot x \qquad (4)$
The line BD′ goes through the points D′ and B, so its equation can be obtained:
$y = \frac{\tan 2\alpha \cdot \tan(\alpha + \theta)}{2\tan(\alpha + \theta) - \tan 2\alpha}\,(x - L) \qquad (7)$
where r1 and r2 can be calculated from the acquired image, and tan α is determined once the equipment is fixed.
Figure 6 (a), (b), (c) shows images acquired at different test distances; clearly, the image is larger when the object is closer to the image acquisition equipment.
As shown in Fig. 7, point O is the focus of the image acquisition device; the image of object AB is CD, and the image of object A′B′, which has been moved back some distance, is C′D′. f is the focal length of the image acquisition device, and H and H1 are the distances from the test object to the image acquisition device, which need to be evaluated. CD and C′D′ are the diameters of the projected image, which can be calculated by the program; AB and A′B′ are the same object at different positions. According to Fig. 7 and the similarity properties of triangles, we can obtain:
$H = f \times \frac{AB}{CD} \qquad (13)$
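Written out as a small helper (the variable names and the pixel-pitch conversion are our illustrative assumptions), Eq. (13) gives the object distance directly:

def object_distance(f_mm, AB_mm, CD_px, px_per_mm):
    # H = f * AB / CD (Eq. 13); CD is measured in pixels by the program,
    # so it is converted to metric units via the assumed-known pixel pitch.
    CD_mm = CD_px / px_per_mm
    return f_mm * AB_mm / CD_mm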
3 Accuracy Analysis
In this experiment, the value of tan α is 5.7671. The tilt angle θ of the object can be calculated according to formula (12). The range of θ is confined to 0–40 degrees in the following experiment due to the limitations of the actual equipment.
The experiment and the computational process are as follows:
1) As shown in Fig. 1, project the concentric circles onto the test object using the standard video projection equipment and capture the object image with the image acquisition device placed m meters away from the object; we set m to 1 meter in the experiment for convenience.
2) Rotate the object by a certain angle, 2 degrees per rotation in this experiment.
3) Compute the sizes of r1 and r2 in the acquired image according to our algorithm; the unit is pixels.
Steps 2) and 3) are repeated to obtain the various data sets. To eliminate the influence of the environment and other factors, we collected several groups of data at each angle and took their arithmetic mean; the results are shown partly in Table 1. The computed angle and its error were then obtained through formula (12). Table 1 and Fig. 8 indicate that the data errors mostly fluctuate within ±1 degree.
Table 1. Computed angles and errors (partial data)

Actual Angle    r1       r2       Computational Angle   Errors
0               180.45   180.75   0.27                  0.27
2               181.85   184.70   2.56                  0.56
4               183.95   189.62   5.00                  1.00
6               183.69   191.21   6.60                  0.60
8               185.48   195.48   8.61                  0.61
10              186.95   199.67   10.74                 0.74
12              190.19   205.24   12.38                 0.38
14              192.76   210.48   14.22                 0.22
16              195.71   216.86   16.47                 0.47
18              198.40   222.20   18.08                 0.08
20              201.48   228.38   19.83                 -0.17
22              204.90   235.86   22.06                 0.06
24              208.86   245.52   24.97                 0.97
26              212.62   251.76   25.94                 -0.06
28              218.43   263.05   28.14                 0.14
30              223.33   272.90   29.96                 -0.04
32              228.00   283.48   32.04                 0.04
34              234.76   296.48   33.84                 -0.16
36              243.24   311.71   35.45                 -0.55
38              254.22   329.89   36.76                 -1.24
40              263.67   354.67   40.34                 0.34
[Fig. 8: measurement error (degrees, within ±1.5) plotted against the actual angle (0–40 degrees)]
Table 2. Computed and actual distances

Largest Diameter D (pixel)   Computational Distance (cm)   Actual Distance (cm)   Errors (cm)
580                          49.94                         50.1                   -0.16
569                          50.85                         50.9                   -0.05
547                          52.90                         53.1                   -0.20
538                          53.81                         53.7                   0.11
519                          55.77                         55.7                   0.07
482                          60.11                         60.0                   0.11
454                          63.85                         63.9                   -0.05
439                          65.95                         65.9                   0.05
402                          72.10                         72.1                   0.00
373                          77.64                         77.8                   -0.16
338                          85.79                         85.8                   -0.01
313                          92.72                         92.7                   0.02
288                          100.86                        100.9                  -0.04
268                          108.25                        108.1                  0.15
257                          112.89                        112.9                  -0.01
244                          119.11                        119.1                  0.01
There are many sources of error, such as the ordinary low-resolution webcam used in this experiment. Manual focusing also introduces some visual deviation. Rotating the object to the experimental angle generates further errors, since the measurement tools are not absolutely precise.
4 Conclusions
Having discussed the shortcomings of existing research, this paper puts forward a new non-contact level measuring method based on image processing, together with its prototype equipment. Using only one image acquisition device and one video projection device as auxiliary light, the system can detect the tilt angle and object distance automatically. The paper also compares actual and computed distances in depth. The data errors are acceptable, so the expected measuring performance is obtained.
References
1. Faquan, Z., Liping, L., Mande, S., et al.: Measurement Method to Object Distances by Mo-
nocular Vision. Acta Photonica Sinica 38(2), 453–456 (2009)
2. Jigui, Z., Yanjun, L., Shenghua, Y., et al.: Study on Single Camera Simulating Stereo Vi-
sion Measurement Technology. Acta Photonica Sinica 25(7), 943–948 (2005)
3. Chunjin, Z., Shujua, J., Xiaoning, F.: Study on Distance Measurement Based on Monocular
Vision Technique. Journal of Shandong University of Science and Technology 26(4),
65–68 (2007)
4. Hsu, K.-S., Chen, K.-C., Li, T.-H., et al.: Development and Application of the Single-
Camera Vision Measuring System. Journal of Applied Sciences 8(13), 2357–2368 (2008)
5. Shuying, Y.: VC++ Image Processing Program Design (The Second Version). Northern
Jiaotong University Press (2005)
Fast Human Detection Using a Cascade of United Hogs
College of Computer Science and Technology, Jilin University, 130012 Changchun, China
[email protected], [email protected], [email protected]
Abstract. Accurate and efficient human detection has become an important area of research in computer vision. To overcome problems of past human detection algorithms, such as features with fixed sizes, fixed positions and a fixed number, we propose a human detection algorithm based on united Hogs. Through intersection tests and feature integration, the algorithm can dynamically generate features closer to human body contours. While basically maintaining the detection speed, our algorithm improves the detection accuracy.
1 Introduction
In recent years, with the development of image recognition, object detection in video sequences and 2D images has achieved a series of successes. For example, in the study of human face detection, Viola and Jones [1] proposed the algorithm of rectangular features with cascade boosting, which made face detection faster and more accurate. After this great success in face detection technology, human detection has become a hot issue in computer vision [2]. Useful information for human detection comes mainly from body shapes and body parts. The relevant human detection algorithms that have been proposed fall mainly into two categories: methods based on various parts of the body and methods based on a single detection window. Reference [3] describes them in detail.
Among single detection window methods, Gavrila and Philomin [4] compared the edge image of the target image with the edge images in a sample database using distance-transform (chamfer) matching in 1999. After this, Gavrila [5] organized the edge images of pedestrians in the database into a hierarchy of similar types, which sped up detection when comparing against the database. This method was successfully applied in a real-time human detection system [6]. In 2000, Papageorgiou and Poggio [7] proposed a human detection algorithm based on Haar wavelet features and SVM training. Inspired by the rectangle feature filter performing well on human face detection [1], Viola and Jones [8] combined Haar wavelets with spatial and temporal characteristics of the moving human body in 2003. Dalal and Triggs [9] studied feature types for object recognition in depth, finding that the local appearance of objects is often captured by the distribution of local gradient intensities and local edges. Thus inspired, in 2005 they proposed a human detection algorithm using Histograms of Oriented Gradients (Hog), and demonstrated by experiments that locally normalized Hog was much better than previously existing human detection features. Soon afterwards, Dalal and Triggs [10] improved the
method above with pixel-level optical flow information in 2006, which made human detection more accurate. In the same year, Zhu et al. [2] produced a fast and accurate human detection algorithm using a cascade of Hogs.
However, human detection algorithms using Hog, whether proposed by Dalal and Triggs or by Zhu et al., are static in the feature extraction stage, and the number of features obtained is limited greatly by the size of the training samples. This paper improves Hog feature extraction and reduces the number of useless features by dynamically generating new features from useful ones, which reduces the subsequent overhead of training weak classifiers, strong classifiers and cascade classifiers, and also increases the accuracy of human detection. Experiments in the second half of this paper demonstrate these improvements.
weak classifiers undergo a further intersection test and are trained into a new weak classifier in the third layer, and so on. This process is shown in Figure 1. To balance training speed against detection accuracy, we only combine features within the same layer. That is, only the new weak classifiers produced in the immediately preceding step undergo intersection testing and training. Note that the new weak classifiers may have the same irregular rectangles as ones generated before, so checks for identical irregular rectangles are needed. The more layers there are, the more useful features and useful weak classifiers there are, and adding some or all of them to the training yields higher detection accuracy. The number of layers generated by the combination of weak classifiers depends on the size of the initial blocks, the number of weak classifiers combined in each layer, and their performance. To keep features distinctive and calculations simple, there is no need to combine them all the way to the end; the algorithm can stop after a few layers (usually from 5 to 10).
The number of weak classifiers increases with more layers. Taking the number of weak classifiers, detection speed and detection precision into account, the weak classifiers ready to be combined in each layer are defined as useful weak classifiers: those with higher detection rate and lower false rate selected from the current layer. Since good weak classifiers are retained in each layer, the new weak classifiers generated from them are highly useful, so the number of blocks in the first layer can be smaller, and the scale and step can vary less. For high-resolution images (such as 64*128 pixels and above), there is no need for hundreds of thousands of initial blocks and weak classifiers, which greatly reduces the useless features. For low-resolution images (such as 16*32 pixels and below), the algorithm extracts a few features with fixed-size blocks in the first layer, and a large number of useful features can be generated by combination, which solves the problem that traditional
feature extraction relies heavily on image scale. Therefore, 1/10 of the smaller of the length and width of the training images is taken as the initial side length of the blocks (aspect ratio 1:1). That is, for 64*128-pixel images and 16*32-pixel images, the initial block sizes are 6*6 pixels and 2*2 pixels, and the step is half of the block side, as sketched below. Dynamic selection among the useful weak classifiers significantly improves the detection rate and speed.
After combination, the detection rate and false rate of the weak classifiers improve. These advanced weak classifiers fit the shape of the human body better and are therefore selected in the early stages of the cascade classifier. As long as it satisfies the requirements of a given stage of the cascade classifier, each of them can serve as a strong classifier; in this sense, a weak classifier obtained by combination is actually a strong classifier. More effective features are obtained by continually combining weak classifiers. The feature-integration method in our paper is entirely different from the AdaBoost algorithm and is a more thorough application of machine learning to feature extraction.
Fig. 2. Classification Accuracy of (a) the Fast Human Detection Algorithm by Zhu et al. and (b) the Human Detection with United Hogs Algorithm in our paper
Fig. 3. The Best of Rectangular Filter, Hog with Variable-size Blocks Filter and United Hogs Filter
Fig. 4. Stability of (a)The Best Haar, (b)The Best Hog based on Variable-size Blocks and (c)The
Best United Hogs
Figure 3 clearly shows the best Haar, the best Hog based on variable-size blocks, and the best united Hogs of our paper; "best" means the weak classifier with the highest average ratio of detection rate to false positive rate. The weak classifier in the third image of Figure 3 was generated in the 10th layer of our algorithm. It consists of 38 initial blocks, which mostly appear at the edges of the important human parts such as the arms, legs and head. The differences between the traditional rectangular
Hogs and the united Hogs can be seen from the figure: the former selects the entire edge of the human body, including some non-human edges; the latter accurately selects the edge of the body without any non-human edges, which improves the detection accuracy. Haar does not cover the whole human body and only selects a few representative regions, so its accuracy is the worst.
The differences between these three algorithms can also be shown by comparing their stability. First, the average feature values of the three best features are calculated on the test set. Then the correlation values between the feature values and their respective means are calculated. The results are shown in Figure 4, where the correlation values correspond to the peaks of the curves: the peak and variance of the best Haar are 0.5 and 0.3; the peak and variance of the best Hog based on variable-size blocks are 0.85 and 0.1; the peak and variance of the best united Hogs are 0.88 and 0.08, showing that the united Hogs feature in our paper is more stable and more suitable for human detection.
4 Summary
We propose a human detection algorithm based on united Hogs. Through intersection testing and feature integration, the algorithm can dynamically generate features closer to the human body contours. While basically maintaining the detection speed, the detection rate of our algorithm increases by 2.75% and 4.03% over the fast human detection algorithm based on Hog with variable-size blocks. Since the rectangles generated by combination are irregular, it is difficult to use integral images to speed up the calculations, so the detection speed is severely affected; this is the subject of further research in our next paper.
References
1. Viola, P., Jones, M.J.: Robust Real-Time Face Detection. J. International Journal of Com-
puter Vision 52(2), 137–154 (2004)
2. Zhu, Q., Avidan, S., Yeh, M.C., Cheng, K.T.: Fast Human Detection Using a Cascade of
Histograms of Oriented Gradients. In: Proc. IEEE International Conference on Computer
Vision and Pattern Recognition (2006)
3. Gavrila, D.M.: The Visual Analysis of Human Movement: A survey. J. Journal of Computer
Vision and Image Understanding 73(1), 82–98 (1999)
4. Gavrila, D.M., Philomin, V.: Real-time Object Detection for Smart Vehicles. In: Proc. IEEE
International Conference on Computer Vision and Pattern Recognition (1999)
5. Gavrila, D.M.: Pedestrian detection from a moving vehicle. In: Vernon, D. (ed.) ECCV
2000. LNCS, vol. 1843, pp. 37–49. Springer, Heidelberg (2002)
6. Gavrila, D.M., Giebel, J., Munder, S.: Vision-Based Pedestrian Detection: The Projector
System. In: Proc. IEEE Intelligent Vehicles Symposium (2004)
7. Papageorgiou, C., Poggio, T.: A Trainable System for Object Detection. J. International
Journal of Computer Vision 38(1), 15–33 (2000)
8. Viola, P., Jones, M., Snow, D.: Detecting Pedestrians using Patterns of Motion and Ap-
pearance. In: International Conference on Computer Vision (2003)
9. Dalal, N., Triggs, B.: Histograms of Oriented Gradients for Human Detection. In: Confer-
ence on Computer Vision and Pattern Recognition, pp. 886–893 (2005)
10. Dalal, N., Triggs, B., Schmid, C.: Human detection using oriented histograms of flow and
appearance. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3952,
pp. 428–441. Springer, Heidelberg (2006)
The Analysis of Parameters t and k of LPP on
Several Famous Face Databases
1 Introduction
As one of the most important biometric techniques, face recognition has gained much attention in the pattern recognition and machine learning areas. Subspace transformation plays an important role in face recognition, and feature extraction is one of its central issues. Subspace transformation (ST) is often used as a feature extraction method: the idea is to project the feature from the original high dimensional space to a low dimensional subspace, called the projective subspace, in which the transformed feature is easier to distinguish than the original one.
Principal Component Analysis (PCA)[12] is a widely used subspace transfor-
mation. It attempts to find the projective directions to maximize variance of the
samples. To improve classification performance, LDA[1] encodes discriminant in-
formation by maximizing the ratio between the between-class and within-class
scatters. LDA can be thought of as an extension with discriminant information
of PCA. Both PCA and LDA focus on preserving the global structure of the sam-
ples. However, Seung[10] assumed that the high dimensional visual image infor-
mation in the real world lies on or is close to a smooth low dimensional manifold.
1. 0-1 way:
$S_{ij} = \begin{cases} 1 & \text{if nodes } i \text{ and } j \text{ are connected in } G \\ 0 & \text{otherwise} \end{cases} \qquad (1)$
2. Heat kernel:
$S_{ij} = \begin{cases} \exp(-\|x_i - x_j\|^2 / 2t^2) & \text{if nodes } i \text{ and } j \text{ are connected in } G \\ 0 & \text{otherwise} \end{cases} \qquad (2)$
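As an illustration, a compact NumPy sketch of the heat-kernel similarity of Eq. (2) over a k-nearest-neighbor graph follows; the k-NN rule is one common way to build G, and this particular construction is our choice rather than the paper's prescription.

import numpy as np

def heat_kernel_similarity(X, k, t):
    # X holds one sample per row; S_ij = exp(-||x_i - x_j||^2 / (2 t^2))
    # for connected pairs, and 0 otherwise.
    n = X.shape[0]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    S = np.zeros((n, n))
    for i in range(n):
        nbr = np.argsort(sq[i])[1:k + 1]                 # k nearest neighbors, skip self
        S[i, nbr] = np.exp(-sq[i, nbr] / (2 * t ** 2))
    return np.maximum(S, S.T)                            # symmetrize the graph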
we add the 2nd, 3rd, ..., Nth rows to the 1st row and obtain |L| = 0, so the rank of L is at most N − 1. It is known that the maximum possible rank of the product of two matrices is at most the smaller of the ranks of the two matrices. Hence, $\mathrm{rank}(S_L) = \mathrm{rank}(XLX^T) \le N - 1$; similarly, $\mathrm{rank}(S_D) \le N$.
From Theorem 1, LPP, like LDA, suffers from the SSS problem. Another problem is how to measure the local structure of the samples. LPP uses the similarity matrix S; if all entries are the same, the local structure of the samples is not preserved. Without loss of generality, set each entry of S to $1/N^2$, i.e., $L = \frac{1}{N} I - \frac{1}{N^2} e e^T$, where e is a vector whose entries are all 1. The matrix $S_L$ is then equivalent to the covariance matrix in PCA [6], and LPP degenerates into PCA. Obviously, the performance of LPP depends on how the similarity matrix S is constructed. In the next section, the performance of LPP with respect to the neighborhood size k and the heat kernel parameter t on several famous face databases will be reported.
3 Experiment
3.1 Database and Experimental Set
Three well-known face databases, ORL¹, Yale² and the Extended Yale Face Database B [4] (denoted YaleB hereafter), were used in our experiments.
¹ http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
² http://cvc.yale.edu/projects/yalefaces/yalefaces.html
The ORL database collects images from 40 individuals, with 10 different images captured per individual. For each individual, images with different facial expressions and details were obtained at different times; the face in the images may be rotated, scaled or tilted to some degree. Sample images of one individual from the ORL database are shown in Figure 1.
The Yale face database contains a total of 165 gray scale images of 15 individuals, with 11 images per individual. The images demonstrate variations in lighting condition and facial expression (normal, happy, sad, sleepy, surprised, and wink). Sample images of one individual from the Yale database are shown in Figure 2.
The YaleB database contains 21888 images of 38 individuals under 9 poses and 64 illumination conditions. A subset containing 2414 frontal-pose images of the 38 individuals under different illuminations is extracted. Sample images of one individual from the YaleB database are shown in Figure 3.
[Three heat-map panels: recognition accuracy as a function of the neighborhood size k (2–29, vertical axis) and the heat kernel parameter t (1–100, horizontal axis)]
Fig. 4. The performance of LPP vs. the two parameters k and t on Yale face database
from the essential manifold structure of the samples. An alternative interpretation is that facial images lie on multiple manifolds instead of a single manifold; recently, research on multi-manifolds for face recognition has been proposed [8]. To verify the assumption that the performance is insensitive to the heat kernel parameter t and that the top performance occurs when the neighborhood size k is greater than half the number of samples, 50 cross-validation runs were performed on the Yale database. The results are illustrated in Fig. 6.
[Fig. 5: heat-map panels of recognition accuracy vs. the heat kernel parameter t (1–5, horizontal axis) and the neighborhood size k (vertical axis) on two further databases]
[Fig. 6: recognition accuracy (%) of BASE, PCA and LPP vs. the number of training samples (2–8)]
4 Conclusion
References
1. Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs. fisherfaces: recog-
nition using class specific linear projection. IEEE Transactions on Pattern Analysis
and Machine Intelligence 19(7), 711–720 (1997)
2. Belkin, M., Niyogi, P.: Laplacian eigenmaps and spectral techniques for embedding
and clustering. Advances in Neural Information Processing Systems 1, 585–592
(2002)
3. Chen, S.B., Zhao, H.F., Kong, M., Luo, B.: 2D-LPP: a two-dimensional extension
of locality preserving projections. Neurocomputing 70(4-6), 912–921 (2007)
4. Georghiades, A., Belhumeur, P., Kriegman, D.: From few to many: illumination
cone models for face recognition under variable lighting and pose. IEEE Transac-
tions on Pattern Analysis and Machine Intelligence 23(6), 643–660 (2001)
5. He, X.F., Niyogi, P.: Locality preserving projections. In: Advances in Neural In-
formation Processing Systems, vol. 16, pp. 153–160. The MIT Press, Cambridge
(2004)
6. He, X.F., Yan, S.C., Hu, Y.X., Niyogi, P., Zhang, H.J.: Face recognition using lapla-
cianfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(3),
328–340 (2005)
7. Liu, Y., Liu, Y., Chan, K.C.C.: Tensor distance based multilinear locality-preserved
maximum information embedding. IEEE Transactions on Neural Networks 21(11),
1848–1854 (2010)
8. Park, S., Savvides, M.: An extension of multifactor analysis for face recognition
based on submanifold learning. In: 2010 IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), pp. 2645–2652. IEEE, Los Alamitos (2010)
9. Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear em-
bedding. Science 290(5500), 2323 (2000)
10. Seung, H.S., Lee, D.D.: The manifold ways of perception. Science 290(5500), 2268–
2269 (2000)
11. Tenenbaum, J.B., Silva, V., Langford, J.C.: A global geometric framework for non-
linear dimensionality reduction. Science 290(5500), 2319 (2000)
12. Turk, M., Pentland, A.: Eigenfaces for recognition. Journal of Cognitive Neuro-
science 3(1), 71–86 (1991)
13. Wan, M.H., Lai, Z.H., Shao, J., Jin, Z.: Two-dimensional local graph embedding
discriminant analysis (2DLGEDA) with its application to face and palm biometrics.
Neurocomputing 73(1-2), 197–203 (2009)
14. Xu, Y., Zhong, A., Yang, J., Zhang, D.: LPP solution schemes for use with face
recognition. Pattern Recognition (2010)
15. Yu, W.W., Teng, X.L., Liu, C.Q.: Face recognition using discriminant locality
preserving projections. Image and Vision Computing 24(3), 239–248 (2006)
16. Yu, W.: Two-dimensional discriminant locality preserving projections for face
recognition. Pattern Recognition Letters 30(15), 1378–1383 (2009)
Local Block Representation for Face Recognition
1 Introduction
Learning from high-dimensional data sets is a contemporary challenge in machine learning and pattern recognition, one that becomes increasingly important as large, high-dimensional data collections need to be analysed in different application domains. Suppose that a source dataset R produces high-dimensional data that we wish to analyze. For instance, each data point could be a frame of a movie produced by a digital camera, the pixels of a high-resolution image, or the large vector-space representation of a text document, as abound in multi-modal data sets. When dealing with this type of high dimensional data, the high dimensionality is an obstacle to any efficient processing of the data [1]. Indeed, many classical data processing algorithms have a computational complexity that grows exponentially with the dimension. On the other hand, the source R may enjoy only a limited number of degrees of freedom. This means that most of the variables that describe each data point are highly correlated, at least locally, or equivalently, that the data set has a low intrinsic dimensionality. In this case, the high-dimensional representation of the data is an unfortunate (but often unavoidable) artifact of the choice
$err_{bi} = \|Y_{\Gamma_i} - Y_{\Gamma_i} W_i\|_F^2 = \|Y S_i J (I_k - W_i)\|_F^2 \qquad (1)$
1's. By the SVD of $X - \bar{x} e^T$, we can obtain the orthonormal basis Q for the
$E(T) = \sum_i E_i \equiv \sum_i \min_{c_i, L_i} \|T_i - c_i e^T - L_i \Theta_i\|_2^2 = \|T S W\|_2^2 \qquad (3)$
$W_i = (I - \frac{1}{k} e e^T)(I - \Theta_i^{+} \Theta_i) \qquad (4)$
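For concreteness, Eq. (4) can be evaluated directly with a pseudoinverse; this sketch assumes $\Theta_i$ holds the local tangent coordinates of the k points in the i-th neighborhood (a minimal illustration, not the authors' implementation):

import numpy as np

def local_alignment_matrix(Theta_i):
    # W_i = (I - ee^T/k)(I - Theta_i^+ Theta_i), following Eq. (4)
    k = Theta_i.shape[1]                     # points in the neighborhood
    centering = np.eye(k) - np.ones((k, k)) / k
    projector = np.eye(k) - np.linalg.pinv(Theta_i) @ Theta_i
    return centering @ projector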
Manifold alignment [15,16] maps several datasets into a global space through the matching points in each dataset, which is essential for data fusion and multicue data matching [13]. Suppose there are m datasets $X = \{X_1, \ldots, X_m\}$ that need to be transformed into a uniform dataset $\tilde{Y}$ with dimensionality $\tilde{d}$. The corresponding matching points in X should be mapped to the same point in $\tilde{Y}$. Let $\tilde{Y}_j$ be $X_j$'s counterpart in $\tilde{Y}$, and let $\tilde{S}_j$ be a 0-1 selection matrix satisfying
$\tilde{Y}_j = \tilde{Y} \tilde{S}_j \qquad (6)$
According to (2) and (6), we can summarize the block approximation error over all datasets as
$ERR_b = \sum_{j=1}^{m} err_{bj} = \sum_{j=1}^{m} \|\tilde{Y}_j S_{bj} B_{bj}\|_F^2 = \|\tilde{Y} \tilde{S}_b B_b\|_F^2 \qquad (7)$
where $\tilde{M}_b = (\tilde{S}_b B_b)(\tilde{S}_b B_b)^T$, $S_{bj} B_{bj}$ is the block approximation matrix for $\tilde{Y}_j$, $\tilde{S}_b = [\tilde{S}_1 S_{b1}, \ldots, \tilde{S}_m S_{bm}]$, and $B_b = \mathrm{diag}\{B_{b1}, \ldots, B_{bm}\}$. By imposing the additional constraint $\tilde{Y} \tilde{Y}^T = I_{\tilde{d}}$, the optimal $\tilde{Y}^*$ is given by the $\tilde{d}$ eigenvectors of the matrix $\tilde{M}_b$ corresponding to the 2nd to $(\tilde{d}+1)$-st smallest eigenvalues of $\tilde{M}_b$.
Learning with the label information can be regarded as the problem of approximating a multivariate function from labeled data points; the function can be real valued, as in regression, or binary valued, as in classification. It can also be regarded as a special case of dimension reduction that maps all the data points into the label space. The label error of $y_i$ is defined as $err_{li} = s_i \|y_i - f_i\|^2$, where $F = [f_1, \ldots, f_n]$ holds the label values and $s_i$ is a flag identifying the labeled points, with $s_i = 1$ if $i \in L$ and $s_i = 0$ if $i \notin L$, where L is the index set of the labeled points. By a weighted combination of the point approximation error and the label error, we get
$Err_p = \sum_{i=1}^{n} \left( (1 - a_i)^2 err_{pi} + a_i^2 err_{li} \right) = \|Y B_p (I_n - A)\|_F^2 + \|(Y - F) A\|_F^2 \qquad (8)$
with the optimal solution
$Y^* = F A A^T (M_p + A A^T)^{-1} \qquad (9)$
where $M_p = B_p (I_n - A)(I_n - A)^T B_p^T$, $a_i = \left( \frac{l}{n}(1 - a_0) + a_0 \right) s_i$ is the weight coefficient at $y_i$, l is the number of labeled points, $a_0$ is the minimal weight coefficient set by the user, and $A = \mathrm{diag}(a_i)$. We set the weight coefficient on the following intuition: if the proportion of labeled points is very small, we have to reduce our dependence on the knowledge retained only from the labeled points; if all points are labeled, we must totally discard the geometric information of the point cloud, because at that moment the label information is more reliable. As a result, the coefficient $a_i$ has to be adjusted with the proportion of labeled points.
Similarly to the point approximation, the total error defined for the block approximation is
$\xi = \sum_i \left( (1 - a_i)^2 err_i + a_i^2 err_{li} \right) = \|Y S_b B_b (I_k - A_b)\|_F^2 + \|(Y - F) A\|_F^2 \qquad (10)$
with the optimal solution
$Y^* = F A A^T (M_b + A A^T)^{-1} \qquad (11)$
(11)
Fig. 1. Images of two objects in FacePix: (a) obj1, (b) obj2
Fig. 2. Embedded manifolds before alignment: (a) obj1, (b) obj2
Fig. 3. Embedded manifolds after alignment: (a) obj1, (b) obj2
We select two image sequences from the FacePix database [13,14] for the experiments. The image sequences, corresponding to the profile view of the object taken with the camera placed at 90 degrees from the frontal view, are shown in Figure 1. The embedded manifolds before alignment are shown in Figure 2, and the embedded manifolds after alignment are shown in Figure 3.
Acknowledgment
This work was supported by the Research Foundation of Science & Technology
office of Hunan Province under Grant (No. 2010FJ4213); Project supported by Scien-
tific Research Fund of Hunan Provincial Education Department under Grant (No.
10C0498).
References
1. Zhou, D., Huang, J., Schoelkopf, B.: Learning with hypergraphs: Clustering, classification,
and embedding. In: Advances in Neural Information Processing Systems (NIPS), vol. 19
(2006)
2. Yan, S.C., Xu, D., Zhang, B., Zhang, H.J.: Graph embedding and extensions: A general
framework for dimensionality reduction. IEEE Transactions on Pattern Analysis and Ma-
chine Intelligence 29(1), 40–51 (2007)
Local Block Representation for Face Recognition 347
3. He, X., Cai, D., Yan, S., Zhang, H.J.: Neighborhood preserving embedding. In: Tenth
IEEE International Conference on Computer Vision, Beijing, vol. 2, pp. 1208–1213 (2005)
4. Belkin, M., Niyogi, P.: Semi-Supervised Learning on Riemannian Manifolds. Machine
Learning 56, 209–239 (2004)
5. Belkin, M., Niyogi, P., Sindhwani, V.: Manifold regularization: A geometric framework
for learning from examples (Technical Report TR-2004-06). University of Chicago (2004)
6. Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding.
Science 290, 2323–2326 (2000)
7. Zhang, Z.Y., Zha, H.Y.: Principal Manifolds and Nonlinear Dimension Reduction via Lo-
cal Tangent Space Alignment. CSE-02-019, Technical Report, CSE, Penn State Univ.
(2002)
8. Poggio, T., Girosi, F.A.: Theory of networks for approximation and learning. Technical
Report A. I. Memo 1140. MIT, Massachusetts (1989)
9. Ham, J., Lee, D., Saul, L.: Semisupervised Alignment of Manifolds. In: Proc. 10th Int’l
Workshop Artificial Intelligence and Statistics, pp. 120–127 (2005)
10. Lafon, S., Keller, Y., Coifman, R.R.: Data Fusion and Multicue Data Matching by Diffu-
sion Maps. IEEE Trans. on PAMI 28(11), 1784–1797 (2006)
11. Yang, G., Xu, X., Yang, G., Zhang, J.: Semi-supervised Classification by Local Coordina-
tion. In: Wong, K.W., Mendis, B.S.U., Bouzerdoum, A. (eds.) ICONIP 2010. LNCS,
vol. 6444, pp. 517–524. Springer, Heidelberg (2010)
12. Yang, G., Xu, X., Yang, G., Zhang, J.: Research of Local Approximation in Semi-
Supervised Manifold Learning. Journal of Information & Computational Science 7(13),
2681–2688 (2010)
13. He, X., Niyogi, P.: Locality preserving projections. In: Advances in Neural Information
Processing Systems, vol. 16, p. 37. The MIT Press, Cambridge (2006)
14. Zhang, T., Yang, J., Zhao, D., Ge, X.: Linear local tangent space alignment and application
to face recognition. Neurocomputing 70, 1533–1547 (2007)
15. He, X.: Incremental semi-supervised subspace learning for image retrieval. In: Proceedings
of the ACM Conference on Multimedia, New York, pp. 10–16 (October 2004)
16. Yang, X., Fu, H., Zha, H., Barlow, J.L.: Semisupervised nonlinear dimensionality reduc-
tion. In: ICML 2006, Pittsburgh, PA, pp. 1065-1072 (2006)
Feature Level Fusion of Fingerprint and
Finger Vein Biometrics
Abstract. The aim is to study fusion at the feature extraction level for fingerprint and finger vein biometrics. A novel dynamic weighting matching algorithm based on quality evaluation of interest features is proposed. First, fingerprint and finger vein images are preprocessed by filtering, enhancement, gray-scale normalization, etc., and the effective feature point-sets are extracted from the two modalities. To handle the curse of dimensionality, neighborhood elimination and retention of points belonging to specific regions are applied before and after the fusion of the feature point-sets. Then, according to the results of the feature evaluation, a dynamic weighting strategy is introduced for the fused biometrics. Finally, the fused feature point-sets of the database and query images are matched using point pattern matching and the proposed weighted matching algorithm. Experimental results based on FVC2000 and self-constructed finger vein databases show that our scheme can improve verification performance and security significantly.
1 Introduction
Whether in passports, credit cards, laptops or mobile phones, automated methods of identifying people through their anatomical features or behavioral traits are an increasing feature of modern life [1-3]. Uni-biometric systems have to contend with a variety of problems such as noisy data, intra-class variations, restricted degrees of freedom, non-universality, spoof attacks, and unacceptable error rates [2-3]. Some of these limitations can be handled by deploying multi-biometric systems that integrate the evidence presented by multiple sources of information. Ross and Jain [4] have presented an overview of multimodal biometrics with various levels of fusion, namely, sensor level, feature level, matching score level and decision level.
Despite the abundance of research papers related to multimodal biometrics [4-9],
fusion at feature level is a relatively understudied problem. Since the feature set con-
tains much richer information on the source data than the matching score or the output
decision of a matcher [4], fusion at the feature level is expected to provide better rec-
ognition performances. The existing literatures are mostly based on fingerprint, face,
speech and palmprint. Even though fingerprint is the most widely used biometric trait
[2, 9-11] and vein has high security [12], no methods for feature level fusion of the
two modalities have been proposed in the reported literature.
Thus, a novel approach to fusing fingerprint and finger vein biometrics at the feature extraction level is proposed in this paper. Despite the abundance of research papers on image quality evaluation [13-15], it is difficult to give an effective criterion for quality evaluation of the specific features. Targeting low quality features, an effective quality evaluation of fingerprint and finger vein features is given from the viewpoint of feature point-sets. Five evaluation factors classify the features into excellent and poor quality so as to set the respective fusion weights. On this basis, a weighted matching algorithm based on quality evaluation of interest features is proposed. Experimental results on fingerprint and finger vein databases show that the fusion scheme can greatly reduce the uncertainty and improve verification performance and security.
shown in Fig. 1. The input to this system is a finger vein image, and the output is the extracted minutiae set $\{V_k = (x_k^v, y_k^v, \theta_k^v) \mid 1 \le k \le K\}$ (Set_vein), where K is the number of minutiae, and $(x_k^v, y_k^v)$ and $\theta_k^v$ represent the spatial location and the local orientation, respectively.
$S_d = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \le r \qquad (1)$
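A hedged sketch of neighborhood elimination under rule (1): whenever two minutiae lie within radius r of each other, only one is retained (which one to keep is our choice; the paper does not fix it here):

import math

def neighborhood_elimination(minutiae, r):
    # Keep a subset of (x, y, theta) minutiae whose pairwise distance exceeds r.
    kept = []
    for (x, y, theta) in minutiae:
        if all(math.hypot(x - kx, y - ky) > r for (kx, ky, _) in kept):
            kept.append((x, y, theta))
    return kept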
The proposed interest features are specific features used in matching, such as minutiae, texture structure and filter features [11]; the minutiae feature is used in this paper. Fingerprint and finger vein samples can each be divided into two categories, excellent and poor, by the proposed quality evaluation of the minutiae sets. The method of quality evaluation can be briefly described as follows.
The Five Evaluation Factors. (a) The proportion of the effective area (the corresponding evaluation factor is λ1); (b) whether the number of minutiae extracted from an input sample is within a certain range (λ2); (c) the change of minutiae extracted from an input sample before and after feature reduction (λ3); (d) the ratio of the numbers of minutiae in the registered template and the input sample (λ4); (e) the degree of center deviation (λ5). Each λi (i = 1, 2, ..., 5) takes the value 1 or 0 according to the constraints (2)-(6): if λi meets the corresponding constraint, its value is 1; otherwise, it is 0. All five factors λi (i = 1, 2, ..., 5) are used for fingerprint, and three factors λi (i = 2, 3, 4) are used for finger vein.
Pmin ≤ Parea = SD / S ≤ 1 (2)

Δ1min ≤ Δ1 = K′M/V / KM/V ≤ 1 (4)
where Parea is the proportion of the effective area, K is the number of minutiae, Δ1
is the ratio of the number of minutiae before and after feature reduction, Δ2
is the ratio of the number of minutiae between the registered template and the input sample,
and (xcore, ycore) and (xz, yz) are the center coordinate and barycentric coordinate, respectively.
The parameters of the five constraint conditions are shown in Table 1.
Quality Evaluation. If the evaluation factors of the fingerprint and finger vein features
satisfy (7) and (8) respectively, the quality of the features is excellent. Otherwise,
the quality is poor.
Σ λi ≥ 3 (i = 1, 2, ..., 5) (7)

Σ λi ≥ 2 (i = 2, 3, 4) (8)
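To make the rule concrete, the following is a minimal sketch of how (7) and (8) could be coded; the function name and the dictionary encoding of the factors are ours, with the factor values assumed to come from constraints (2)-(6) and the thresholds of Table 1.

def classify_quality(factors, modality):
    """Classify a feature point-set as 'excellent' or 'poor' from its binary
    evaluation factors, following rules (7) and (8). `factors` maps each factor
    index to 0 or 1, as determined by constraints (2)-(6) and Table 1."""
    if modality == 'fingerprint':
        # Rule (7): at least three of the five factors must hold.
        return 'excellent' if sum(factors[i] for i in (1, 2, 3, 4, 5)) >= 3 else 'poor'
    # Rule (8), for finger vein: at least two of factors 2-4 must hold.
    return 'excellent' if sum(factors[i] for i in (2, 3, 4)) >= 2 else 'poor'

# Example: a fingerprint sample failing factors 1, 2 and 5 has sum 2 < 3 -> 'poor'.
print(classify_quality({1: 0, 2: 0, 3: 1, 4: 1, 5: 0}, 'fingerprint'))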
Several typical low quality images are shown in Fig. 4. (a) Effective minutiae cannot be
extracted successfully: λ1+λ2+λ4+λ5=0. (b) There are scars and many false
minutiae: λ2+λ3+λ4=0. (c) The degree of center deviation is large and the effective
area is small: λ1+λ2+λ4+λ5=0. (d) The degree of center deviation is large:
λ3+λ4+λ5=0.
The proposed approach is based on fusing the two traits by extracting independent
feature point-sets from the two modalities and making the two point-sets
compatible for concatenation. The overall idea of the algorithm is weighted matching
based on predictive quality evaluation of interest features.
Step 1: Feature point-sets of fingerprint and finger vein are extracted and expressed
as Set_fingerprint and Set_vein, where each feature point consists of the spatial location
(x, y) and the local orientation θ.
Step 2: Apply feature reduction techniques to Set_fingerprint and Set_vein to obtain
the feature point-sets Set_fingerprint′ and Set_vein′, recording the numbers of
minutiae K and K′ and the center coordinate (xcore, ycore).
Step 3: Evaluate the feature point-sets and obtain the evaluation factors. According
to the quality evaluation factors of Set_fingerprint′ and Set_vein′, the features are
classified into two groups of excellent and poor quality.
Step 4: Set the fusion weights (λM and λV) of the fingerprint and finger vein features as follows:
λM = 1, λV = 2: initial values;
λM = 2, λV = 2: if M is excellent and V is poor;
λM = 1, λV = 4: if M is poor and V is excellent.
Step 5: Match the concatenated point-sets.
a) Find the matched point pairs. Points i and j are matched when their spatial and direction distances are within thresholds:

sd = sqrt((x′j − xi)² + (y′j − yi)²) ≤ r0 (9)
where the points i and j are represented by (xi, yi, θi) of the concatenated database and
query point-sets Set_fusion and Set_fusion′, sd is the spatial distance and dd is the
direction distance.
b) Compute the matching score. The final matching score is calculated using (11) and
(12) based on the number of matched pairs found between the database and query sets.
score = c × (λM · NM + λV · NV) / (λM · Nmax_finger + λV · Nmax_vein) (11)
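A minimal sketch of Step 4 and the score (11) follows; the branch returning the initial values for the remaining quality combinations, and the default value of c, are our assumptions, since neither appears in this excerpt.

def fusion_weights(m_quality, v_quality):
    """Step 4: set the fusion weights (lambda_M, lambda_V) from the quality labels.
    The fall-through branch (both excellent or both poor) returning the initial
    values is our assumption."""
    if m_quality == 'excellent' and v_quality == 'poor':
        return 2, 2
    if m_quality == 'poor' and v_quality == 'excellent':
        return 1, 4
    return 1, 2  # initial values

def matching_score(n_m, n_v, n_max_finger, n_max_vein, lam_m, lam_v, c=100):
    """Weighted matching score of equation (11): n_m and n_v are the numbers of
    matched fingerprint and vein pairs, n_max_* the maximal possible counts,
    and c a scaling constant (its value here is illustrative)."""
    return c * (lam_m * n_m + lam_v * n_v) / (lam_m * n_max_finger + lam_v * n_max_vein)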
4 Experimental Results
The proposed approach has been tested on the public-domain databases
FVC2000_DB1_B-DB4_B and on self-constructed finger vein databases. The fingerprint
databases consist of 40 individuals with 8 images per individual. The finger
vein images were acquired using an infrared sensor; the finger vein databases
consist of 40 individuals with 8 images per individual, captured under different
illumination or at different times. All fingers point vertically upward, and the
acquired vein images have 256 gray levels and a size of 135×235 pixels.
The experiments were conducted in the several sessions described below, recording the
False Acceptance Rate (FAR), False Rejection Rate (FRR) and Accuracy (computed at
the threshold where performance is maximal).
The fingerprint and finger vein recognition systems were tested before and after
reduction of the dimension (RD) respectively. The matching score is computed using
point pattern matching independently for fingerprint and finger vein. The individual
system performance was recorded and the results were computed for each modality as
shown in Table 2. From the experimental results, the reduction of the dimension in-
creased the recognition accuracy by 1.48% for fingerprint and 0.36% for finger vein.
In the second session, multimodal fusion was tested on the multimodal databases
acquired by the authors with fingerprint and finger vein. The composite databases
consist of 40 pairs of images, each pair composed of a fingerprint sample and a finger
vein sample. For comparison, the results are computed for multimodal fusion at the
matching score and feature extraction levels, as shown in Table 3. Weighted-average
fusion was used at the matching score level. The results show that our dynamic
weighting fusion scheme achieved 98.9% recognition accuracy, an improvement of
6.6% and 9.6% over the fingerprint and finger vein modalities respectively, and of
5.4% over fusion recognition at the matching score level. Moreover, compared with
unweighted feature level fusion, dynamic weighting fusion enhanced the recognition
accuracy by 3.1%.
5 Summary
A multimodal biometric system based on the integration of fingerprint and finger vein
traits at the feature extraction level was proposed. Multimodal biometric systems offer
several advantages, including ease of use, robustness to noise, low cost and high
security. From the viewpoint of feature point-sets, an effective quality evaluation
is given for fingerprint and finger vein features. On this basis, a dynamic weighting
matching algorithm based on predictive quality evaluation of interest features was
proposed. Experimental results show that our scheme achieved a 98.9% recognition
rate, an improvement of 6.6% and 9.6% over the fingerprint and finger vein modalities
alone, respectively. The scheme can improve verification performance and security
significantly. Future work will derive a mathematical model of the best weights for
fusion at the feature extraction level.
Acknowledgments. We thank the editor and the reviewers for their helpful comments. This
work was financially supported by (a) the Funds for Visiting Scholars of the State Key
Laboratory (Project No. 2007DA10512709403) and (b) the Fundamental Research
Funds for the Central Universities (Project No. CDJXS11150014).
References
1. Jain, A.K.: Biometric recognition: Q&A. Nature 449(6), 38–40 (2007)
2. Maltoni, D., Maio, D., Jain, A.K., et al.: Handbook of Fingerprint Recognition, 2nd edn.
Springer, London (2009)
3. Jain, A.K., Ross, A.: Multibiometric systems. Communications of the ACM 47(1), 34–40
(2004)
4. Ross, A., Jain, A.K.: Information fusion in biometrics. Pattern Recognition Letters 24,
2115–2125 (2003)
5. Darwish, A.A., Zaki, W.M., Saad, O.M., et al.: Human authentication using face and fin-
gerprint biometrics. In: 2010 Second International Conference on Computational Intelli-
gence, Communication Systems and Networks, Liverpool, United Kingdom, pp. 274–278
(2010)
6. Tong, Y., Wheeler, F.W., Liu, X.M.: Improving biometric identification through quality-
based face and fingerprint biometric fusion. In: 2010 IEEE Computer Society Conference
on Computer Vision and Pattern Recognition, pp. 53–60. CVPRW, San Francisco (2010)
7. Hong, L., Jain, A.K.: Integrating faces and fingerprints for personal identification. IEEE
Transactions on Pattern Analysis and Machine Intelligence 20(12), 1295–1307 (1998)
8. Sun, A.B., Zhang, D.X., Zhang, M.: Multiple features based intelligent biometrics verifica-
tion model. Computer Science 37(2), 221–224 (2010)
9. Rattani, A., Kisku, D.R., Bicego, M., et al.: Feature Level Fusion of Face and Fingerprint
Biometrics. In: First IEEE International Conference on Biometrics: Theory, Applications,
and Systems, pp. 1–6. BTAS, Crystal City (2007)
10. Luo, X.P., Tian, J.: Image Enhancement and Minutia Matching Algorithms in Automated
Fingerprint Identification System. Journal of Software 13(5), 946–956 (2002)
11. Jain, A.K., Prabhakar, S., Hong, L., et al.: Filterbank-Based Fingerprint Matching. IEEE
Transactions On Image Processing 9(5), 846–859 (2000)
12. Yu, C.B., Qin, H.F.: Research on extracting human finger vein pattern characteristics.
Computer Engineering and Applications 44(24), 175–177 (2008)
13. Alonso-Fernandez, F., Fierrez, J., Ortega-Garcia, J., et al.: A comparative study of fingerprint image-
quality estimation methods. IEEE Trans. on Information Forensics and Security 2(4),
734–743 (2007)
14. Liu, L.H., Tan, T.Z.: Research on fingerprint image quality automatic measures. Computer
Engineering and Applications 45(9), 164–167 (2009)
15. Hu, M., Li, D.C.: A Method for Fingerprint Image Quality Estimation. Computer Technol-
ogy And Development 20(2), 125–128 (2010)
A Research of Reduction Algorithm for Support Vector
Machine
1 Introduction
Support vector machines have been widely used in pattern recognition [1], bioinformatics and
text classification, and have advantages over other statistical methods. However,
the support vector machine, as an emerging technology, still has many problems. Current
research on support vector machines covers two aspects: first, studying the properties of
support vector machines; second, exploring new fields of application.
Generalization ability and response time are two important criteria for support vector
machines. The factors that influence the response time of a support vector machine are the
sizes of the training dataset and of the set of support vectors. We want to reduce the number of
training data and support vectors while keeping the classification accuracy.
Liu Xiangdong and Chen Zhaoqian [2] presented a fast support vector machine
classification algorithm that uses a small number of support vectors instead of all the
support vectors, but this method requires solving complex optimization problems. Li
Honglian [3] put forward an SVM learning strategy for large-scale training datasets; the
classification accuracy and the number of support vectors of this method are good,
but the method needs to select a threshold. This paper presents a reduction algorithm for
support vector machines. First, the KNN algorithm is used to remove the noise data and
the borderline data. Then the KNN algorithm is used to extract support vectors from the
training dataset to form a new dataset. Finally, the classifier is obtained from the new
training dataset. The experimental studies show that the proposed algorithm can
reduce the number of training data and support vectors while keeping
the classification accuracy of the original training dataset.
yi(w · xi + b) ≥ 1 − ξi, ξi ≥ 0, i = 1, 2, …, l

C > 0 is referred to as the penalty cost; it is a trade-off between model complexity and
accuracy on the training dataset. By introducing Lagrange multipliers to solve the
above problem, the optimal hyperplane is:

f(x) = sign(Σi=1..l αi yi xi · x + b) (5)
The training data may not be linearly separable in the original input space for some datasets. To solve this problem, the support vector machine
maps the original data (xi, yi) to the high-dimensional Hilbert space (Φ(xi), yi) through a
nonlinear function Φ [4], so that the training dataset can be linearly separable in this
high-dimensional space. For xi, xj ∈ X ⊆ R^n, if there exists a mapping Φ from R^n to R^m, n ≪ m, then

k(xi, xj) = ⟨Φ(xi), Φ(xj)⟩ (6)
where ⟨·,·⟩ is the inner product of the space R^m and k(xi, xj) is a kernel function. The
optimization problem is transformed into:

min_{w,b} (1/2)‖w‖²  s.t.  yi(w · Φ(xi) + b) ≥ 1, i = 1, …, l (7)

By introducing the Lagrange multipliers ai, the optimal hyperplane is:

f(x) = sgn(Σi=1..l ai* yi k(xi, x) + b) (8)
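As a concrete illustration of (6) and (8), the sketch below evaluates the kernel decision function with an RBF kernel; the kernel choice and all names are illustrative.

import numpy as np

def rbf_kernel(xi, xj, gamma=0.5):
    # k(xi, xj) = <Phi(xi), Phi(xj)> of equation (6), computed implicitly.
    return np.exp(-gamma * np.sum((xi - xj) ** 2))

def decision(x, sv_x, alpha, sv_y, b, kernel=rbf_kernel):
    """Optimal hyperplane of equation (8): f(x) = sgn(sum_i a_i* y_i k(x_i, x) + b).
    Points with a_i = 0 contribute nothing, which is why they can be removed."""
    s = sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, sv_y, sv_x))
    return np.sign(s + b)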
Because a point with ai = 0 has no effect on the hyperplane, we can first remove the
points with ai = 0 to reduce the training dataset. The literature [5] proposed the Boundary
Nearest Support Vector Machine, whose idea is to find the support vectors in order to
compress the training dataset. The training dataset is divided into two subsets according
to the class label. For each data point, we find the k-nearest points from the heterogeneous
(opposite-class) subset; these heterogeneous points then form a new training dataset.
Figure 1 shows that in the case of no misclassified data, the method can extract all the
support vectors and keep the
classification accuracy. Using the Boundary Nearest Support Vector Machine on the
dataset shown in Figure 1, the classification hyperplane and support vectors are as
shown in Figure 2. However, when there are many misclassified points in the dataset,
these misclassified points are support vectors; this is also a cause of an excessive number
of support vectors. For the dataset shown in Figure 3, the classification hyperplane and
the support vectors obtained by the Boundary Nearest Support Vector Machine are shown in
Figure 4. In this case, the classification result on the test dataset is very poor.
As shown in Figure 5, the positive data are mainly distributed in the following forms.
Fig. 2. The hyperplane and support vectors of the ideal training dataset
Fig. 4. The hyperplane and support vectors of the training dataset with noisy examples
(1) Noise points, for instance the positive points in the bottom left corner.
(2) Borderline points, close to the boundary between the positive and negative
regions.
(3) Data that have no effect on the hyperplane, for example the positive data in
the upper right corner.
(4) Safe points, which are important to the hyperplane.
If we delete the borderline points when the data of the two classes are severely mixed, we
can avoid an overly complex hyperplane and poor generalization ability [6]. By
analyzing the distribution of the dataset, we take the following steps to compress
the sample dataset: remove the noise data, part of the borderline points, and the data
that have no effect on the hyperplane, and retain the safe points. Based on the above analysis, a
reduction algorithm for support vector machines is proposed. First, using the KNN
algorithm, we find the noise points and part of the borderline points in the training
dataset and remove them. Then we look for support vectors to form a new training dataset.
For convenience, in the following sections this algorithm is called the KNSVM
algorithm.
The distance between two points in feature space is computed in kernel form:

d(xi, xj) = ‖Φ(xi) − Φ(xj)‖ = sqrt(k(xi, xi) − 2k(xi, xj) + k(xj, xj)) (9)
Given the training set T = {(x1, y1), …, (xl, yl)} ∈ (X×Y)^l, xi ∈ X = R^n, yi ∈ Y, i = 1, 2, …, l.
Here are the specific steps of the KNSVM algorithm.
(1) Calculate the distance between each point and all other points, recording each
point's distance to itself as ∞.
(2) For each point xi of the dataset, find its k-nearest points; if the class labels of
all k-nearest points are identical and different from that of xi, delete the
point xi.
(3) Update the distance matrix. Calculate the k-nearest heterogeneous points for each
point; these heterogeneous points form a new training dataset.
(4) Train the classifier on the new training dataset.
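The steps can be sketched as follows, using the feature-space distance of equation (9); binary class labels are assumed and the function names are ours.

import numpy as np

def kernel_distance_matrix(X, kernel):
    """Pairwise feature-space distances via equation (9):
    d(xi, xj) = sqrt(k(xi, xi) - 2 k(xi, xj) + k(xj, xj))."""
    n = len(X)
    K = np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])
    d = np.diag(K)
    D = np.sqrt(np.maximum(d[:, None] - 2.0 * K + d[None, :], 0.0))
    np.fill_diagonal(D, np.inf)  # step (1): each point's own distance set to infinity
    return D

def knsvm_reduce(X, y, k, kernel):
    """Steps (1)-(3) of KNSVM: remove noise points, then keep each remaining
    point's k nearest heterogeneous neighbours as the new training set."""
    D = kernel_distance_matrix(X, kernel)
    # Step (2): delete xi when its k nearest neighbours all carry the opposite label.
    keep = [i for i in range(len(X))
            if not np.all(y[np.argsort(D[i])[:k]] != y[i])]
    X, y = X[keep], y[keep]
    D = D[np.ix_(keep, keep)]  # step (3): update the distance matrix
    selected = set()
    for i in range(len(X)):
        hetero = np.where(y != y[i])[0]
        selected.update(int(j) for j in hetero[np.argsort(D[i, hetero])[:k]])
    idx = sorted(selected)
    return X[idx], y[idx]  # step (4): train the final classifier on this set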
The algorithm first removes the noise points and part of the borderline points, then looks
for support vectors to form a new dataset, and finally trains a classifier on the new dataset. The
reduction algorithm calculates the distance matrix only once, and the implementation process
is simple.
4 Simulation Research
4.1 Experiment Environment
In order to verify the extent of reduction of support vectors and training data achieved by
the KNSVM algorithm, we select training sets of different sizes for study [7]. The experimental
datasets used are shown in the following table.
We use the RBF kernel function for the support vector machine in the experiment. The KNN
algorithm is used twice in the KNSVM algorithm. We compare the standard support vector
machine and KNSVM in the experiment. The result of the standard SVM is shown in
Table 2, while the result of the KNSVM is shown in Table 3.
From the comparison of Tables 2 and 3, we can see that the numbers of
training data and support vectors are reduced significantly using the KNSVM
algorithm compared with the standard SVM. But when the training dataset is severely
unbalanced, the algorithm has a drawback: it may delete the minority class, and
the classification result on the test dataset may not be better than that of the standard SVM.
It is better to consider the characteristics of the dataset when choosing the appropriate model.
5 Conclusion
This paper presents a reduction algorithm for support vector machines that combines the
support vector machine with the KNN algorithm. First, we use the KNN algorithm to
prune the training dataset. Then the KNN algorithm is used to select the k-nearest
heterogeneous points for each point. Finally, the classifier is trained on the new dataset.
Experimental results show that using the KNSVM algorithm, the numbers of training data
and support vectors are greatly reduced while keeping the accuracy on the
test dataset.
References
1. Lin, Y.Y., Liu, T.L., Fuh, C.S.: Local ensemble kernel learning for object category recog-
nition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition,
pp. 1–8. IEEE Press, Washington D.C (2007)
2. Liu, X.D., Chen, Z.Q.: A Fast Classification Algorithm of Support Vector Machines. Journal
of Computer Research and Development 41(8), 1327–1332 (2004)
3. Li, H.L., Wang, C.H., Yuan, B.Z., et al.: A Learning Strategy of SVM Used to Large
Training Set. Chinese Journal of Computers 27(5), 716–719 (2004)
4. Nguyen, C.H., Ho, T.B.: An Efficient Kernel Matrix Evaluation Measure. Pattern Recogni-
tion 41, 3366–3372 (2008)
5. Feng, G.H., Li, Y.J., Zhu, S.M.: Boundary Nearest Support Vector Machines. Application
Research of Computers 23(4), 11–12 (2006)
6. Ke, H.X., Zhang, X.G.: Edit support vector machines. In: Proceedings of International Joint
Conference on Neural Networks, Washington, USA, pp. 1464–1467 (2001)
7. A Library for Support Vector Machines,
http://www.csie.ntu.edu.tw/~cjlin/libsvm
Fast Support Vector Regression Based on Cut
Abstract. In general, similar input data have similar output target values. A novel
Fast Support Vector Regression (FSVR) trained on a reduced training set is proposed.
First, the improved learning machine divides the training data into blocks using
traditional clustering methods such as the K-means and FCM clustering techniques.
Second, a membership function on each block is defined from the corresponding
target values of the training data; every training datum receives a membership degree
in the interval [0, 1], which varies the penalty coefficient by multiplying C. Third,
the reduced training set, consisting of the data whose membership degrees are greater
than or equal to a suitably selected parameter λ, is used to train the FSVR.
Experimental results on traditional machine learning data sets show that the FSVR
can not only achieve better or acceptable performance but also downsize the number
of training data and speed up training.
1 Introduction
Support Vector Machines (SVMs), based on statistical learning theory, have been an elegant
and powerful tool for classification and regression over the past decade as a modern
machine learning approach [1]. SVMs have generated great interest in the machine learning
community due to their excellent generalization performance in a wide range of learning
problems, such as handwritten digit recognition [1], disease diagnosis [2] and face
detection [3]. On the other hand, applications of SVR, such as forecasting of financial
markets [4] and prediction of highway traffic flow [5], have been developed. However,
training SVMs is still expensive, especially for large-scale learning problems,
i.e., O(N³), where N is the total size of the training data. Since the size of the training
set determines the computational burden of SVMs, how to make the training process
effective is a key focus for SVMs. The first strategy is to improve the speed of the
quadratic programming. Many fast algorithms such as Chunking [6], SMO [7],
SVMTorch [8], LIBSVM [9], and the modified finite Newton method [10] have been
presented to reduce the training set and speed up training. In the Chunking method,
training an SVM is equivalent to solving a linearly constrained quadratic programming
(QP) problem in a number of variables equal to the number of data points. SMO
is an extreme version of this approach which updates only two variables at each iteration:
the minimal number of variables that can be optimized while fulfilling the linear
equality constraint. For a working set of two variables, the key point is that the
optimization sub-problem can be solved analytically without explicitly invoking a
quadratic optimizer. SVMTorch is a decomposition algorithm intended to efficiently
solve large-scale regression problems using SVMs. LIBSVM is related to Nystrom
sampling [11] with active selection of support vectors and estimation in the
primal space.
The second strategy is to reduce the scale of the training set to alleviate the computational
burden of SVM training algorithms. The reduced set strategy has been
successfully applied to SVM pattern recognition problems [12] and regression
problems [13]. In a heuristic method for accelerating SVM regression training,
all the training data are first divided into several groups using some clustering method,
and then for each group some training data are discarded based on a measure of
similarity among examples; both the clustering and the similarity computation depend on
the dimension of the input data. In an interesting estimated ε-tube based pattern
selection method, the probability of a training point falling in the ε-tube is computed
according to the difference between the true target value and the outputs of several
regressors, each of which is trained on a smaller bootstrap sample of the original
training set.
In this paper, a reduced training set is used for regression problems. The training data
are processed in two steps before training the SVR. The first step is the partition of
the original training set: clustering methods on the inputs of the training data, such as
K-means and FCM clustering, are used to divide the training set into different clusters
(blocks). The second step is to discard the redundant data: the membership function, or
importance of each datum to the SVR, is defined from the corresponding target values on
each cluster, and a datum is selected to train the SVR when its membership degree is
greater than or equal to the selected parameter λ.
The paper is organized as follows. Section 2 gives a brief overview of SVR. In Section 3, we
explain the idea of the paper in detail. In Section 4, we compare the improved fast
training method with the traditional algorithm on benchmark data sets. Some
conclusions are given finally.
In space R d , the traditional clustering methods, such as the K-means and FCM clus-
tering, are used to partition the training set into blocks.
K-means clustering uses a two-phase iterative algorithm to minimize the sum of
point-to-centroid distances, summed over all k clusters: The first phase uses what the
literature often describes as "batch" updates, where each iteration consists of reassign-
ing points to their nearest cluster centroid, all at once, followed by recalculation of
cluster centroids. You can think of this phase as providing a fast but potentially only
approximate solution as a starting point for the second phase. The second phase uses
what the literature often describes as "on-line" updates, where points are individually
reassigned if doing so will reduce the sum of distances, and cluster centroids are re-
computed after each reassignment. Each iteration during this second phase consists of
one pass through all the points.
FCM clustering was originally introduced by Jim Bezdek in 1981 as an improvement
on earlier clustering methods. It is a data clustering technique wherein each data point
belongs to a cluster to some degree that is specified by a membership grade. It provides
a method that shows how to group data points that populate some multidimensional
space into a specific number of different clusters. FCM clustering starts with an initial
guess for the cluster centers, which intends to mark the mean location of each cluster.
The initial guess for these cluster centers is most likely incorrect. Additionally, FCM
assigns every data point a membership grade for each cluster. By iteratively updating
the cluster centers and the membership grades for each data point, FCM iteratively
moves the cluster centers to the "right" location within a data set. This iteration is based
on minimizing an objective function that represents the distance from any given data
point to a cluster center weighted by that data point's membership grade.
For each block, the distances between the targets of the data in the block and their mean
value are calculated by formula (1), where nl denotes the size of the block, and the
membership function of the block is defined by formula (5).
The membership functions turn the training set into a fuzzy training set by attaching
a membership value to every training datum: in the original training set the samples
are pairs (xi, yi), while in the fuzzy training set the samples are triplets
(xi, yi, membership(i)). Data whose target values are near the mean of all the
target values have large membership values; conversely, distant data have small
membership values. The membership values (degrees), all falling in the interval
[0, 1], make it convenient to select a suitable cut parameter λ. The parameter
λ is used to cut off training data that can be neither misclassified points nor support
vectors. As is known, the generalization ability is related to the support vectors falling
into the ε-tube, which form a subset of the whole training set, namely the reduced
training set; hence a suitable parameter λ exists for SVR and FSVR.
The reduced training set is derived from the original training set. Compared
with the traditional SVR, the FSVR obtains the ε-tube and the estimating function on
the reduced training set (the irregular tube mentioned above), while the traditional SVR
obtains them on the entire training set. The total number of training data is
reduced compared with the original training set, which consumes less time in terms
of the complexity of SVR. The proposed algorithm can be described as follows.
Algorithm. FSVR based on a training set with n elements.
Step 1. Divide the inputs of the training data into m blocks using the clustering
methods.
Step 2. Define the membership function on each block using formula (5).
Fig. 2. The selections of the reduced training set with different parameters
Step 3. Select the cut parameter λ and form the reduced training set.
Step 4. Train the FSVR.
Step 5. Verify the performance of the FSVR; if it is acceptable, output the support
vectors and the values of the corresponding Lagrange multipliers for testing; otherwise,
go to Step 3.
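A minimal sketch of Steps 1-4, assuming scikit-learn's KMeans and SVR; the membership degree used below (one minus the normalized deviation of each target from its block's mean) stands in for the paper's formula (5) and is only a plausible surrogate with the stated property that data near the block mean receive the largest degrees. The synthetic sinc data mirror the experiment of Section 4.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVR

# Illustrative data: noisy sinc targets on (-10, 10), as in Section 4.
rng = np.random.default_rng(0)
X = rng.uniform(-10.0, 10.0, size=(5000, 1))
y = np.sinc(X.ravel()) + rng.uniform(-0.2, 0.2, size=5000)

def fsvr_reduce(X, y, n_clusters=50, lam=0.9):
    """Steps 1-3: cluster the inputs into blocks, give every datum a membership
    degree in [0, 1] from its target's deviation from the block mean (surrogate
    for formula (5)), and keep the data with membership >= lam."""
    blocks = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    member = np.ones(len(y))
    for c in range(n_clusters):
        idx = np.where(blocks == c)[0]
        dev = np.abs(y[idx] - y[idx].mean())
        if dev.max() > 0:
            member[idx] = 1.0 - dev / dev.max()
    keep = member >= lam
    return X[keep], y[keep], member[keep]

X_red, y_red, m = fsvr_reduce(X, y, lam=0.9)
# Step 4: train the FSVR; the membership degree varies the penalty by scaling C.
model = SVR(kernel='rbf', C=100.0).fit(X_red, y_red, sample_weight=m)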
In Step 3, the parameter λ belongs to the interval [0, 1] and is selected in descending
order, e.g., λ = 1.0, 0.9, 0.8, 0.7, …. By definition (5), the membership degrees of the
data points nearest to the estimate function are maximal, so points with larger membership
degrees lie in the neighborhood of the estimate function. Whether a sample belongs to the
reduced training set is determined by its membership degree, so a larger parameter λ means
a smaller reduced training set. As an example, the data points falling into the interval
(-10, -8) in Figure 1 are extracted to illustrate the cut process: there are 503 such data
points, their inputs lie in the interval (-10, -8), their target values belong to
(-0.26, 0.32), and the membership degrees defined by the outputs fall into the interval
[0, 1]. Of these 503 training data, 94, 149, 209 and 260 elements form the corresponding
reduced training sets for λ = 0.8, 0.7, 0.6, 0.5 respectively. Figure 2 shows the
selection process: both the red points and the blue ones are training points, the red
points are the selected ones, and the curve represents the estimate function on the
interval (-10, -8).
A training set (xi, yi) and a testing set,4 each with 5000 data, are created where the
xi are uniformly randomly distributed on the interval (-10, 10). In order to make the
regression problem 'real', large uniform noise distributed in [-0.2, 0.2] has been added
to all the training data, while the testing data remain noise-free. Table 1 reports the
results with 50 clusters, C = 100, an RBF kernel with width 0.3, and different cut
parameters λ: the number of training data (Ntr), training and testing time, and training
and testing accuracy. It can be seen from Table 1 that the FSVR spends 1.141 seconds of
CPU time to obtain an acceptable testing root mean square error (RMSE) of 0.0052671,
whereas it takes 156.22 seconds of CPU time for the traditional SVR to reach an RMSE
of 0.00632. The FSVR runs 137 times faster than the traditional SVR. The left part of
Fig. 3 shows the true and the approximated function of the FSVR when λ = 0.9; the right
part of Fig. 3 shows the true and the approximated function of the traditional SVR. In
the figure, the dashed line represents the real curve of sinc, and the solid line
represents the curve of the SVR or FSVR.
The performance of the FSVR and SVR is compared on real-world benchmark
datasets of large size. The abalone dataset contains 4177 data, each consisting of 7
attributes and 1 target value. Before training, 3000 training data and 1177 testing data
are randomly drawn from the whole dataset. The simulation results using the
linear kernel are shown in Table 1; from the table, we can see that the FSVR reduces the
testing error and training time. KIN40K, comprising 40000 samples, was generated with
maximum nonlinearity and little noise, giving a very difficult regression task. We
select 30000 samples as the training data, and the remaining 10000 samples are regarded as
the testing data. Since it takes a long time for SVR to perform the training process on
KIN40K, we only select the FSVR whose performance approaches that of the
traditional one. From Table 1, we can see that the running time of FSVR is reduced
markedly, and the performance is acceptable when the parameter λ is set to 0.9.
1 ftp://ftp.ics.uci.edu/pub/machine-learning-databases
2 http://ida.first.fraunhofer.de/~anton/data.html
3 http://www.support-vector.net/
4 http://www.ntu.edu.sg/home/egbhuang/
Table 1. Performance comparison for learning by using FSVR and SVR for sinc function
Fig. 3. Outputs of the FSVR on the reduced training set when λ = 0.9 and SVR on testing
data
4 Conclusions
In this paper, we propose a method to reduce the training data for support vector
regression estimation based on the integration of the inputs and the outputs of the
training data: the data near the estimate function are extracted to construct the reduced
training set and form the FSVR. In order to improve the performance of the FSVR, the
product of the membership degree and the traditional penalty coefficient C is adopted to
increase the penalty on training data with large errors. The experimental results on
machine learning benchmark data sets demonstrate the superiority of the FSVR,
which can not only reduce the training complexity but also achieve better or acceptable
generalization ability. Unfortunately, the preprocessing before training and
the tuning of suitable parameters during the training process increase the time
consumption. Whether the proposed method is appropriate is determined by the users, who
must decide whether they need a learning machine with high speed and acceptable
performance, or one with maximal performance.
References
[1] Vapnik, V.N.: Statistical Learning Theory. John Wiley & Sons, New York (1998)
[2] Wee, J.W., Lee, C.H.: Concurrent Support Vector Machine Processor for Disease Diag-
nosis. In: Pal, N.R., Kasabov, N., Mudi, R.K., Pal, S., Parui, S.K. (eds.) ICONIP 2004.
LNCS, vol. 3316, pp. 1129–1134. Springer, Heidelberg (2004)
[3] Buciu, L., Kotropoulos, C., Pitas, I.: Combining Support Vector Machines for Accurate
Face Detection. In: Proceeding of International Conference on Image Processing, Thessa-
loniki, Greece, pp. 1054–1057 (2001)
[4] Yang, H., Chan, L., King, I.: Support Vector Machine Regression for Volatile Stock
Market Prediction. In: Jünger, M., Naddef, D. (eds.) Computational Combinatorial Opti-
mization. LNCS, vol. 2241, pp. 391–396. Springer, Heidelberg (2001)
[5] Ding, A., Zhao, X., Jiao, L.: Traffic Flow Time Series Prediction Based on Statistics
Learning Theory. In: Proceedings of International Conference on Intelligent Transporta-
tion System, Singapore, pp. 727–730 (2002)
[6] Osuna, E., Freund, R., Girosi, F.: An Improved Training Algorithm for Support Vector
Machines. In: Proceedings of Workshop on Neural Networks for Signal Processing, Ame-
lea Island, pp. 276–285 (1997)
[7] Platt, J.C.: Fast Training of Support Vector Machines Using Sequential Minimal Optimi-
zation. In: Advances in Kernel Methods: Support Vector Learning, pp. 185–208 (1999)
[8] Collobert, R., Bengio, S.: SVMTorch: Support vector machines for large-scale regression
problems. Journal of Machine Learning 1(2), 143–160 (2001)
[9] Chang, C.C., Lin, C.J.: LIBSVM: A Library for Support Vector Machines (2001),
http://www.csie.ntu.edu.tw/~cjlin
[10] Keerthi, S.S., DeCoste, D.M.: A Modified Finite Newton Method for Fast Solution of
Large Scale Linear SVMs. Journal of Machine Learning Research 6, 341–361 (2005)
[11] Girolami, M.: Orthogonal Series Density Estimation and the Kernel Eigenvalue Problem.
Neural Computation 14(3), 669–688 (2002)
[12] Shin, H.J., Cho, S.: Invariance of Neighborhood Relation Under Input Space to Feature
Space Mapping. Pattern Recognition Letter 26(6), 701–718 (2004)
[13] Wang, W.J., Xu, Z.B.: A Heuristic Training for Support Vector Regression. Neurocom-
puting 61, 259–275 (2005)
Using Genetic Algorithm for Parameter Tuning on ILC
Controller Design
Abstract. In this project we use the ILC control method to manipulate the arms of a
robot with two degrees of freedom. First we implement the dynamic equations of the
robot according to Schilling's robotics book [2]. The implementation was done in the
MATLAB SIMULINK environment. A genetic algorithm was used for tuning the
coefficients of the PD controllers (proportional and derivative gains). We also use a
multi-objective genetic algorithm to obtain the coefficients of the ILC PD controllers.
1 Introduction
In robotics and precision machining, high positioning accuracy and low tracking errors
(position and velocity errors) are very important performance indices. On the other
hand, for the sake of practical feasibility of implementation, the use of classical linear
controllers such as PD in robotic systems has attracted great attention from industry
for many decades. As a matter of fact, the majority of industrial robots are
controlled with the popular PD laws/algorithms.
We begin with a brief introduction to the ILC control method in Section 2. After
that we discuss our simulated model [2] in MATLAB, and then the use of
the MATLAB genetic toolbox is described. We have also attached some pictures of
our MATLAB model diagram and the results achieved from running the
model; the results are presented as graphs obtained from the simulated model.
strings undergo the evolutionary process in which the traits of the selected strings
(which may or may not be good) are selected and combined to form new strings for
the next generation. In theory, with each generation, the strings in the population
should return better and better cost functions. In practice, there is a limit to the cost
functions that strings can achieve, depending on the objective functions and the limits
imposed on the model parameters. Further, there is the possibility that the best traits
are not found [5]. We use a toolbox named MOEA, written at the National
University of Singapore.
Fig. 2 shows the block diagram of our model in the Simulink environment of Matlab: for each joint, a PD controller and an ILC controller act on the robot dynamic model, the desired angles qd are compared with the joint outputs q, and the error of each iteration, Error(i, t), is fed back to the ILC controllers.
u_{i+1}(t) = u_i(t) + Φ e_i(t) + Γ ė_i(t) (1)

where Φ and Γ are diagonal proportional and derivative gain matrices, respectively, and e_i(t) and ė_i(t) denote the position and velocity errors at iteration i. Both Φ and Γ are positive definite matrices.
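A minimal sketch of applying a PD-type learning update of this form over one trial follows; the trajectories, time step and gain values are illustrative assumptions.

import numpy as np

def ilc_update(u_prev, e, e_dot, phi, gamma):
    """PD-type iterative learning update, applied sample by sample:
    u_{i+1}(t) = u_i(t) + Phi e_i(t) + Gamma de_i(t)/dt.
    u_prev, e and e_dot have shape (T, n_joints); phi and gamma are the
    diagonals of the gain matrices."""
    return u_prev + e @ np.diag(phi) + e_dot @ np.diag(gamma)

# Illustrative trial for the 2-DOF arm (all numbers are assumptions):
T, dt = 1000, 0.01
qd = np.stack([np.sin(np.linspace(0, 2 * np.pi, T))] * 2, axis=1)  # desired angles
q = np.zeros((T, 2))                                               # measured angles
e = qd - q                                                         # position error
e_dot = np.gradient(e, dt, axis=0)                                 # velocity error
u = ilc_update(np.zeros((T, 2)), e, e_dot, phi=[2.0, 2.0], gamma=[0.5, 0.5])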
Fig. 3 shows the PD-type learning control scheme. The performance of the PD-type
learning control depends upon the proportional gain Φ and the derivative gain Γ.
Stability, settling time, maximum overshoot and many other system performance
indicators depend upon the values of Φ and Γ. The proposed strategy utilizes the GA as an
optimization and search tool to determine optimum values for the gains. The
performance index or cost function chosen is the error accumulated by the system until it
reaches and stays within a range specified by an absolute percentage of the final value.
Hence, the role of the GA is to find optimum values of the gains Φ and Γ. In this case,
the integral of absolute error (IAE) is used for minimizing the error and generating the
controller parameters:

IAE = Σ_{t=1}^{N} |Error(t)| (2)
where Error = r(t) − y(t), N is the sample size, r(t) is the reference input and y(t) is the
measured variable. The function in Eq. (2) can thus be minimized by applying a suitable
tuning algorithm.
A genetic algorithm is used here as the tuning algorithm. Genetic algorithms constitute
stochastic search methods that have been used as optimization tools in a wide spectrum
of applications such as parameter identification and control structure design.
GAs have also found widespread use in controller optimization, particularly in the
fields of fuzzy logic and neural networks. The GA used here initializes a random
population of the two variables Φ and Γ. The algorithm evaluates all members of
the population based on the specified performance index, and then applies
the GA operators such as reproduction, crossover and mutation to generate a new
population based on the performance of its members. The best
member or gene of the population is chosen and saved for the next generation. The GA again
applies all operations and selects the best gene among the new population; the best
gene of the new population is compared to the best gene of the previous population, and the
best among all is selected to represent Φ and Γ.
We choose the proportional and derivative coefficients (P, D) of the ILC controllers
and α1, α2 as genes in one chromosome. These genes form the genetic parameters that
are adjusted according to the objectives. Thus we have six genes or parameters which
form a chromosome, and two objectives for each such chromosome. The structure of
one chromosome is as follows:
P1 D1 P2 D2 α1 α2
In this implementation, seven (7) bits were assigned to each gene. Two objectives, F1 and
F2, were defined as follows:
F1 = Error1 + Error2;
F2 = (Error1 > Errorp1) + (Error2 > Errorp2);
Error1 denotes the total error in the current iteration for the first arm joint, and Errorp1
the total error of the previous iteration for the first arm; Error2 and Errorp2 are the
analogous quantities for the second degree of freedom.
F1 expresses that the total error of the first and second joints must be minimized.
F2 consists of two inequalities: each evaluates to one (1) if the total error of the
current iteration for the corresponding arm is greater than that of the previous
iteration, and to zero (0) otherwise.
Therefore the minimum of F2, namely zero, occurs when (Error1 <= Errorp1) and
(Error2 <= Errorp2), and the maximum of F2 occurs when both inequalities hold
simultaneously. For each chromosome the model was run 10 times, so F2_total equals the
sum of the F2 values and F1_total equals the sum of the F1 values.
We compared the results of MOEA with those of GAOT. Both perform well for 10
iterations, but the GAOT outcomes give a better average error than MOEA, because GAOT
gives an average error less than that of MOEA for iteration numbers beyond 10.
Applying a constraint such as Error(k+1) <= Error(k) to the objectives is not possible
in MOEA, but it is possible to impose it on the goals in GAOT.
The fitness that we use in this toolbox is:
Val = (1/(Error1+Error2))*(1/count) + (1/Error)
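To make the encoding concrete, here is a sketch of decoding a 42-bit chromosome (seven bits per gene) and computing the two objectives over the 10 runs; the gene range [lo, hi] is an assumption, since the paper does not state it.

def decode(chrom, lo=0.0, hi=10.0, bits=7):
    """Decode a binary chromosome into the six genes (P1, D1, P2, D2, alpha1,
    alpha2), seven bits per gene; [lo, hi] is an assumed gene range."""
    genes = []
    for g in range(6):
        word = chrom[g * bits:(g + 1) * bits]
        value = int(''.join(str(b) for b in word), 2)
        genes.append(lo + (hi - lo) * value / (2 ** bits - 1))
    return genes

def objectives(errors1, errors2):
    """errors1, errors2: total errors of joints 1 and 2 over the 10 runs of
    the model for one chromosome. Returns (F1_total, F2_total)."""
    f1 = sum(e1 + e2 for e1, e2 in zip(errors1, errors2))
    # F2 counts, per run, how many joints' error exceeds that of the previous run.
    f2 = sum(int(errors1[i] > errors1[i - 1]) + int(errors2[i] > errors2[i - 1])
             for i in range(1, len(errors1)))
    return f1, f2

# Example: a 42-bit chromosome of six 7-bit genes.
print(decode([0] * 14 + [1] * 28))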
5 The Results
Figs. 4 and 5 show the total error of the 1st and 2nd joints versus the number of iterations (1 to 10).
Fig. 4. Total error of the first joint. Fig. 5. Total error of the second joint.
Figs. 6 and 7 indicate the desired and actual values of the 1st and 2nd joint angles.
Fig. 6. Desired and actual angle of the first joint. Fig. 7. Desired and actual angle of the second joint.
These results illustrate the decrease in total error over 10 iterations after the tuning
phase:
Iteration: 1 Total Error 7.6864
Iteration: 2 Total Error 1.0291
Iteration: 3 Total Error 0.46992
Iteration: 4 Total Error 0.19265
Iteration: 5 Total Error 0.09609
Iteration: 6 Total Error 0.07436
Iteration: 7 Total Error 0.064329
Iteration: 8 Total Error 0.066136
Iteration: 9 Total Error 0.061094
Iteration: 10 Total Error 0.055988
The tool-space trajectory plots are shown below. These pictures were plotted with the
XY Graph block in the Matlab Simulink environment.
Fig. 8. Tool space trajectory in iteration NO.1 Fig. 9. Tool space trajectory in iteration NO.3
Fig. 10. Tool space trajectory in iteration NO.5 Fig. 11. Tool space trajectory in iteration NO.7
Fig. 12. Tool space trajectory in iteration NO.9 Fig. 13. Tool space trajectory in iteration
NO.10
References
1. Sciavicco, L., Siciliano, B.: Modeling and Control of Robot Manipulators. McGraw-Hill,
New York (1996)
2. Schilling, R.J.: Fundamental of Robotics. Prentice Hall, Englewood Cliffs (1990)
3. Bien, Z., Xu, J.-X.: Iterative learning control. In: Analysis, Design, Integration and Applica-
tion. Kluwer Academic Publisher, Dordrecht (1998)
4. Boming, S.: On Iterative Learning, PhD thesis, National University of Singapore (1996)
5. Fonseca, C.M., Fleming, P.J.: An overview of evolutionary algorithms in multiobjective
optimization. Evolutionary Computation 3(1), 1–16 (1998)
6. Goldberg, D.E.: Genetic Algorithms in Search, Optimization and Machine Learning. Addison
Wesley, Reading (1989)
Controller Design for a Heat Exchanger in Waste Heat
Utilizing Systems
1 Introduction
Among the different ways of recycling waste heat, the Organic Rankine Cycle (ORC) system
is preferred because of its high reliability, flexibility and low maintenance requirements.
The key components of an ORC system are the evaporator and the condenser. Usually,
the moving boundary method or a discrete method is used to build the evaporator model.
Compared with the discrete model, the moving boundary model of two-phase flow in the
evaporator is less complex, as it is characterized by smaller order and higher
computational speed [1]. The moving boundary model has been verified to be effective for
describing the dynamic characteristics of the evaporator [2], [3].
A properly controlled evaporator plays a key role in achieving high performance in an ORC
system. The most popular controller used in industrial processes is the PID controller.
Gruhle and Isermann designed a PI controller to keep the evaporator superheat at a fixed
point [4], but a PID controller behaves optimally only for the operating point for which it
is designed. A practical ORC system always operates over a wide range of operating
conditions, so it is necessary to design a self-tuning PID controller. The genetic algorithm
has been recognized as an effective and efficient technique for solving the optimization
problem of PID controllers [5], [6]. A self-tuning PID controller based on a genetic
algorithm is employed here to control the superheated temperature in waste heat
utilizing systems.
This paper is organized as follows: Section 2 describes the dynamic characteristics
of the evaporator. Section 3 solves for the self-tuning PID parameters based on the genetic
algorithm. Section 4 presents the simulation studies on the evaporator and compares the
self-tuning controller with the Ziegler-Nichols PID controller. Section 5 concludes
the paper.
Fig. 1. Schematic of the evaporator, divided along the coordinate z into a sub-cooled zone, a two-phase zone and a superheated zone of lengths L1, L2 and L3, with refrigerant temperatures Tr1-Tr3 and densities ρ1-ρ3, wall temperatures Tw1-Tw3, gas-side temperatures Ta1-Ta3, inlet mass flow ṁi with enthalpy hi, outlet mass flow ṁo with enthalpy ho, and saturation enthalpies hl and hg at the zone boundaries.
Nomenclature
A — area (m²); P — pressure (Pa); L — length (m); h — specific enthalpy (J/kg); D — diameter (m); ρ — density (kg/m³); T — temperature (°C); z — length coordinate (m); v — velocity (m/s); ṁ — mass flow rate (kg/s); cp — heat capacity (J/m²·°C); α — heat transfer coefficient (W/m²·°C).
Subscripts: r — working fluid; w — wall; a — gas; i — inlet or inner; o — outlet or outer; 1 — sub-cooled; 2 — two-phase; 3 — superheated; s — steady state.
The moving boundary method is one way to investigate the dynamic characteristics
of the evaporator. The evaporator is divided into zones and separated by boundaries.
Several assumptions must be made in order to simplify the model.
(1) The evaporator is a long, thin, horizontal tube;
(2) The working fluid is mixed adequately and the working fluid flowing through the
evaporator tube can be modeled as a one-dimensional fluid flow;
(3) The pressure drop along the evaporator tube, caused by momentum change in
working fluid and viscous friction, is negligible;
(4) Axial heat conduction in the working fluid as well as in the pipe wall is
negligible.
The pressure loss is assumed negligible, so the momentum balance is superfluous.
The governing equations, derived from the mass and energy conservation principles,
are represented by [3]:
∂(Aρ)/∂t + ∂ṁ/∂z = 0 (1)
Choosing the state vector as x = [L1, L2, P, ho, Tw1, Tw2, Tw3]^T and the vector of
control input as u = [ṁi, hi, ṁo, va]^T, the compact state space form [3] is as follows:
ẋ = D⁻¹ f(x, u) (4)
This model has been reduced to a compact, lumped-parameter form; a linear
model around an operating point can also be obtained:
δẋ = A δx + B δu (5)

where A = D⁻¹ ∂f(x, u)/∂x |(xs, us) and B = D⁻¹ ∂f(x, u)/∂u |(xs, us).
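The Jacobians in (5) can be approximated by finite differences around the steady state; the sketch below assumes the model function f and the matrix D⁻¹ of [3] are available as a callable and an array.

import numpy as np

def linearize(f, D_inv, xs, us, eps=1e-6):
    """Finite-difference Jacobians of x_dot = D^(-1) f(x, u), equation (4),
    giving the matrices A and B of the linear model (5) at (xs, us)."""
    n, m = len(xs), len(us)
    f0 = f(xs, us)
    dfdx = np.zeros((n, n))
    dfdu = np.zeros((n, m))
    for j in range(n):
        dx = np.zeros(n); dx[j] = eps
        dfdx[:, j] = (f(xs + dx, us) - f0) / eps
    for j in range(m):
        du = np.zeros(m); du[j] = eps
        dfdu[:, j] = (f(xs, us + du) - f0) / eps
    return D_inv @ dfdx, D_inv @ dfdu  # A = D^-1 df/dx, B = D^-1 df/du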
It is important to maintain an appropriate superheated temperature at the outlet of
the evaporator. The efficiency of the evaporator becomes lower with a higher superheated
temperature, since a shorter section is used for evaporation and energy is
wasted. However, if the superheated temperature is too small, some liquid may enter
the turbine.
u(k) = Kp e(k) + Ki Σ_{j=0}^{k} e(j) + Kd [e(k) − e(k−1)] (6)
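For concreteness, a minimal implementation of the discrete PID law (6), seeded here with the Ziegler-Nichols gains quoted in Section 4.

class DiscretePID:
    """u(k) = Kp e(k) + Ki sum_{j<=k} e(j) + Kd (e(k) - e(k-1)), equation (6)."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_e = 0.0

    def step(self, e):
        self.integral += e
        u = self.kp * e + self.ki * self.integral + self.kd * (e - self.prev_e)
        self.prev_e = e
        return u

# Example: regulating the outlet temperature toward Tset = 140 degrees C.
pid = DiscretePID(kp=0.24, ki=0.015, kd=0.06)  # Ziegler-Nichols values, Section 4
u = pid.step(140.0 - 137.6)                    # first error sample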
Fig. 2. Structure of the self-tuning PID control system: a genetic algorithm adjusts Kp, Ki and Kd of the PID controller in the closed loop from the set point r and error e(t) to the evaporator output y.
J = ∫₀^∞ (w1|e(t)| + w2 u²(t)) dt + w3 tu (7)

where tu is the rise time. To avoid overshoot, once overshoot occurs (e(t) < 0) an additional penalty term w4|e(t)| (w4 ≫ w1) is included in the integrand:

J = ∫₀^∞ (w1|e(t)| + w2 u²(t) + w4|e(t)|) dt + w3 tu (8)
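A sketch of evaluating the cost (7)/(8) on a sampled closed-loop response follows; it relies on the form reconstructed above and shares its caveat, and the array names are assumptions.

import numpy as np

def cost_J(e, u, dt, t_u, w1=1.0, w2=1.0, w3=1.0, w4=100.0):
    """Discretized (7)/(8): integrate w1|e| + w2 u^2 over the response, add
    w3*t_u, and penalize overshoot samples (e < 0) with an extra w4|e| term."""
    integrand = w1 * np.abs(e) + w2 * u ** 2
    integrand[e < 0] += w4 * np.abs(e[e < 0])  # overshoot penalty, w4 >> w1
    return float(np.sum(integrand) * dt + w3 * t_u)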
4 An Illustrative Example
In the simulation, the population size is 30, the number of generations is 100, the
crossover probability is Pc = 0.9, and the mutation probability is Pm < 0.01. The weights
in equation (8) are w1 = 1, w2 = 1, w3 = 1 and w4 = 100. The searching ranges of the PID
parameters are set to Kp ∈ [0, 1], Ki ∈ [0, 1] and Kd ∈ [0, 1].
R245fa is adopted as the organic working fluid. The initial steady condition is:
evaporation pressure P = 2 MPa, working fluid mass flow ṁ = 3.72 kg/s, evaporator
outlet temperature Tout = 137.6 °C, and velocity of the low-quality exhaust
va = 4.03 m/s. The set point of the evaporator outlet temperature is Tset = 140 °C.
Figure 3 shows that the objective function J in equation (8) converges when the PID
parameters are optimized using the genetic algorithm. Figure 4 shows that the
self-tuning PID controller is better than the Ziegler-Nichols PID controller (Kp = 0.24,
Ki = 0.015, Kd = 0.06): the step response with the self-tuning PID controller has smaller
overshoot and settling time. The self-tuning PID controller is more complex than
the Ziegler-Nichols PID controller, but the proposed algorithm can adjust the
parameters online. Figure 5 shows the profile of the manipulated variable corresponding
to the self-tuning PID controller; the velocity of the exhaust changes with small
chatter due to the disturbances in the closed-loop control system.
Fig. 3. Convergence of the best objective function value J.
Fig. 4. Step responses of the outlet temperature Tout (°C) under the Ziegler-Nichols (ZN) PID and GA-tuned PID controllers.
Fig. 5. Manipulated variable u (m/s) under the self-tuning PID controller.
Fig. 6. Set point rin and output yout of the outlet temperature (°C).
5 Conclusions
In this paper, a self-tuning PID controller based on a genetic algorithm has been
employed to control the outlet temperature of the evaporator in a waste heat utilizing
system. Simulation results demonstrate that the self-tuning controller outperforms the
Ziegler-Nichols PID controller. Therefore the genetic algorithm is a reasonable and
effective method for optimizing the parameters of PID controllers.
Acknowledgement
This work was supported by the China National Science Foundation under Grant
(60974029) and National Basic Research Program of China under Grant (973 Program,
2011CB710706). These are gratefully acknowledged.
References
1. Wei, D.H., Lu, X.S., Lu, Z., Gu, J.M.: Dynamic Modeling and Simulation of an Organic
Rankine Cycle (ORC) System for Waste Heat Recovery. J. Applied Thermal Engineering 8,
1216–1224 (2008)
2. Jensen, J.M., Tummescheit, H.: Moving Boundary Models for Dynamic Simulation of
Two-Phase Flows. In: 2nd International Modelica Conference, Oberpfaffenhofen, pp.
235–344 (2002)
3. He, X.D.: Dynamic Modeling and Multivariable Control of Vapor Compression Cycles in
Air Conditioning Systems. Ph.D. Thesis, Massachusetts Institute of Technology, Department of
Mechanical Engineering (1996)
386 J. Zhang et al.
4. Gruhle, W.D., Isermann, R.: Modeling and Control of a Refrigerant Evaporator. In:
American Control Conference, Darmstadt, pp. 234–240 (1985)
5. Lin, G.H., Liu, G.F.: Tuning PID Controller Using Adaptive Genetic Algorithms. In: 5th
International Conference on Computer Science & Education, pp. 519–523. IEEE Press,
Hefei (2010)
6. Singh, R., Sen, I.: Tuning of PID Controller Based AGC System Using Genetic Algorithms.
In: 2004 IEEE Region 10 Conference, TENCON 2004, pp. 531–534. IEEE Press, Bangalore
(2004)
7. Liu, J.K.: Advanced PID Control MATLAB Simulation, 2nd edn. Electronic Industry Press,
Beijing (2006)
Test Research on Radiated Susceptibility of Automobile
Electronic Control System
1 Introduction
With the development of electronic technology, the frequency of many circuits
and electronic devices keeps increasing while the operating voltage keeps decreasing.
Consequently, the susceptibility and vulnerability to EMP (electromagnetic
pulse) tend to increase as well [1]. Moreover, electronic systems make wide use of
integrated circuits, which are relatively sensitive to EMP. Therefore, a high strength
EMP will lead to code errors and memory loss in LSI (Large Scale Integrated)
circuits, and may even result in the failure or burnout of electronic devices. The
facts mentioned above constitute a great threat to the normal operation of the modern
automobile, which is characterized by the mass application of ECS. Hence it has become
extremely necessary and urgent to conduct research into the influence of electromagnetic
radiation on automobile ECS [2].
In order to assure the electromagnetic compatibility and susceptibility of military
materiel, a series of military standards came into being, such as GJB151, GJB152,
MIL-STD-461 and MIL-STD-464. Take GJB151A [3] for example: it provides nineteen
electromagnetic emission and susceptibility requirement items, including three
RS items (RS101, RS103 and RS105). MIL-STD-461F [4] provides eighteen emission and
susceptibility requirement items, also including three RS items (RS101, RS103
and RS105). According to these standards, electromagnetic compatibility, interference
and susceptibility design and testing had been carried out before the batch
production of a specific automobile. In fact, such tests have limits both in strength
and in frequency range. There is much evidence indicating that
the main threat to the automobile may be high strength EMP on a modern battlefield.
Because of the lack of tests on the influence of high strength EMP environments on
automobiles, we do not know whether the ECS of an engine can withstand high strength
EMP, and, if an influence exists, how and to what extent the system will be affected.
In this paper, a special RS test has been designed and carried out, and the test
results have been analyzed. Through the test, the influences of high strength EMP on the
automobile ECS have been identified, the susceptibility and vulnerability to EMP have
been researched, and the damage characteristics and laws under the EMP environment have
been probed. This can provide not only technical support for decreasing the
vulnerability of the automobile ECS, but also design rules for battlefield damage
assessment and repair.
A subsystem of a typical automobile engine ECS has been selected for the RS test.
Because of the limits of the test devices and space, an analog device has been designed
that can simulate the actual operation of the ECS subsystem. The device consists of an
ECU (electronic control unit), a mass air flow sensor, a temperature sensor, two
electromagnetic injectors, an ignition controller, a storage battery, a signal generator
and a blower (Fig. 1).
Each element of the device is taken from the original automobile except the signal
generator and the blower, and the elements are connected by the control wiring harness
from the original automobile. The blower is adopted to simulate the inlet condition of
the engine, and the signal generator is applied to simulate the speed signal of the
engine, so that the electronic control system can operate independently.
Fig. 2. Test configuration in the GTEM cell: the EUT and its power supply are placed inside the cell with a field intensity meter and a power meter; a monitoring device is connected to a portable computer outside via optical fiber.
Fig. 3. Test configuration in the shielded enclosure: the EUT and its power supply are placed inside with a field intensity meter; the monitoring device is connected to a portable computer via optical fiber.
The monitored waveforms are the injection signal, the rotary (speed) signal and the ignition signal.
d) When the testing frequency is 120 MHz and the field strength is 50 V/m, there is
derangement of the fuel injection signal, a decrease in the amplitude of the ignition
signal, and the phenomenon of misfire (see Fig. 6. a.); when the field strength is
increased to 67 V/m, the ignition and fuel injection signals disappear (see Fig. 6. b.);
and when the field strength is then gradually reduced to 55 V/m, the ignition and fuel
injection signals are restored to the interfered state.
e) When the testing frequency is 32 MHz and the field strength reaches 53 V/m, the
ignition and fuel injection signals disappear, and when the field strength exceeds
200 V/m, damage to the ECU devices occurs.
When penetrating into electrical devices, the energy of an EMP imposes adverse effects on the equipment, usually in two forms: one is functional damage to the electronic installations, and the other is malfunction of the ECS [7].
System malfunction refers to a temporary interruption of device operation under the impact of EMP. Generally, system malfunction is divided into two cases. In the first, the transient interference generated by the EMP appears on one input point of the circuit while the other input points remain at their original levels, and the output is temporarily changed. Under this circumstance, the interfering signal produced during the electromagnetic transient passes through the amplifying circuit and is mistakenly treated as a control signal, which in turn leads to derangement of the ignition and injection signals. Since the input wire, DC power cord and grounding line are usually made of common unshielded conductors, the system is extremely sensitive to such interference and is vulnerable to operational malfunction [6].
The main reason for signal disappearance is the recurrent reset or crash of the microcontroller.
The microcontroller is the core component of the electronic control system. Therefore, a watchdog circuit is usually set up to detect abnormal behavior of the microcontroller program under electromagnetic interference and restore the program to its normal condition as soon as possible. But when the amplitude and frequency of the interference reach a certain degree, recurrent resets or crashes of the microcontroller are likely to occur, whose external manifestation is the disappearance of signals. When the amplitude of the interference signal decreases, the system returns to the interfered state it was in before the signals disappeared.
After a great number of repeated experiments and intensive analysis of the operating principle of the microcontroller, we have come to the following conclusion [7]: under EMP irradiation, the microcontroller may be reset unintentionally, either because an interference signal on the RST pin is mistaken for a reset signal, or because an interference signal on the reset line of the CPU resets the microcontroller directly. A deliberate reset of the microcontroller nominally requires a high level lasting no less than two machine cycles on the RST pin, i.e., at least 2 μs at a crystal frequency of 12 MHz [8]. Although the positive or negative pulses of the interference signal are usually shorter than 2 μs and thus seem unable to cause a reset, that nominal condition is not strictly necessary: the reset circuit inside the CPU samples the state of RST at S5P2 of each machine cycle, so if RST is sampled high in two successive samplings, the CPU is reset as well. Since the duration of the interference signal on the RST pin is close to 2 μs, the possibility of it being sampled high in two successive samplings remains.
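As an illustration of this sampling argument, the following sketch (not from the paper; the machine-cycle length and pulse timings are assumed values for a 12 MHz crystal) checks whether an interference pulse on RST happens to span two successive per-cycle sample instants:

import itertools

MACHINE_CYCLE_US = 1.0  # one machine cycle at a 12 MHz crystal (assumed)

def reset_triggered(pulse_start_us, pulse_width_us, n_cycles=100):
    """True if RST is sampled high at two successive per-cycle sample points."""
    samples = [k * MACHINE_CYCLE_US for k in range(n_cycles)]
    high = [pulse_start_us <= t < pulse_start_us + pulse_width_us
            for t in samples]
    return any(a and b for a, b in zip(high, high[1:]))

# A 1.9 us pulse (below the nominal 2 us requirement) still resets the MCU
# when it straddles two sample instants; a much shorter pulse does not.
print(reset_triggered(4.95, 1.9))   # True
print(reset_triggered(5.30, 0.5))   # False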
A crash refers to a state in which the computer system is executing an "endless loop" program, and only pressing the reset key can break out of the loop.
Specifically, a crash is caused by an abnormal jump of the program, whose landing point is random. When the program jumps to a non-initial byte of an instruction, the latter part of that instruction may be combined with the former part of the following instruction and executed as a single instruction. Consequently, the resulting instruction sequence becomes a new, unrecognizable program that may evolve into an endless loop [9]. An abnormal jump of the program does not necessarily cause a crash, however: the chance of a crash is slim when the program jumps to the initial byte of an instruction. It is the alteration of the PC (program counter) content inside the CPU that leads to the abnormal jump. EMP can easily be coupled into the CPU through the data bus by means of front-door or back-door coupling; the PC value is thereby changed and the program crashes.
The test research indicates that various hazardous electromagnetic sources exert their influence mainly through conductive coupling or radiative coupling of energy. The EMP damage mechanisms for the electric system and the electronic control system can be summarized in the following three aspects:
a) Thermal Effect
The EMP thermal effect is an adiabatic process that generally occurs within nanoseconds or microseconds. This effect causes overheating of microelectronic devices and electromagnetically sensitive circuits, leads to burnout of the metal strips in input protection resistors and CMOS (Complementary Metal Oxide Semiconductor) devices, and ultimately results in functional degradation or failure of the circuit [10].
b) Interference and Surge Effect
The RFI generated by EMP causes electrical noise, EMI (electromagnetic interference), and malfunction or functional failure of the electronic circuit. Besides, the transient overvoltage or surge produced by EMP also causes hard damage, mainly manifested as short circuits, open circuits, PN-junction breakdown, and oxide breakdown of semiconductor devices.
c) High-Electric-Field Effect
A high electric field can not only cause dielectric breakdown of MOS gate oxides or between wires, and thereby circuit failure, but also imposes adverse effects on the operating reliability of sensitive devices.
4 Conclusions
The RS test within the band 10 kHz–18 GHz has shown that the electronic control system is sensitive in the band 4 MHz–260 MHz, which is attributable to the 16 MHz CPU crystal frequency in the ECU of the electronic control system. When the field strength of electromagnetic radiation within the sensitive band rises to a certain degree, signal derangement, signal loss and even damage to electronic components of the ECS occur.
Finally, because the test was not exhaustive, and the measurement of the threshold values and the analysis of the phenomena mentioned above are not yet precise, deeper and further research remains to be done.
References
1. Shuzhong, W., Zhenxin, Z.: The Effects of EMP on Electronic Circuits and its Protection.
Electronics Quality (8), 75–77 (2008)
2. Shenghui, Y.: The Measurements of Automobile Equipment Support in the Complex Elec-
tromagnetic Environment. Journal of Academy of Military Transportation 11(4), 32–35
(2009)
3. GJB151A. Electromagnetic Emission and Susceptibility Requirements for Military
Equipment and Subsystems, pp. 4–5. Publishing House of Defense Industry, Beijing
(1997)
4. MIL-STD-461F. Requirements for the Control of Electromagnetic Interference Character-
istics of Subsystems and Equipment, pp. 25–26. U.S. Government Printing, Washington
(2007)
5. GJB152A. Measurement of Electromagnetic Emission and Susceptibility for Military
Equipment and Subsystems, pp. 1–87. Publishing House of Defense Industry, Beijing
(1997)
6. MIL-STD-464C. Electromagnetic Environmental Effects Requirements for Systems, pp.
1–156. U.S. Government Printing, Washington (2010)
7. Shenghui, Y.: Study on Irradiation Effects of Nuclear Electromagnetic Pulse to Vehicle
Equipment Electronic Control System. Journal of Academy of Military Transporta-
tion 12(4), 46–51 (2010)
8. Guangdi, L.: Fundamentals of Single-chip Microcomputer, pp. 20–24. Press of Beijing
University of Aeronautics and Astronautics, Beijing (2006)
9. Huanxiang, L.C.: Software Anti-interference Design of Monolithic Application System. Wuhan Univ. of Sci. & Tech. 23(2), 193–195 (2000)
10. Shanghe, L.: Electromagnetic Environment Effect and its Development Trends. National
Defense Science and Technology 29(1), 4–6 (2008)
Forgeability Attack of Two DLP-Based Proxy
Blind Signature Schemes
1 Introduction
In traditional digital signature schemes, the binding between a user and his public key needs to be ensured. The usual way to provide this assurance is through certificates signed by a trusted third party, namely public-key certificates. As a consequence, the system requires large storage and computing time to store and verify each user's public key and the corresponding certificate. In 1984, Shamir [2] introduced the concept of identity-based public key cryptosystems to simplify key management procedures in the certificate-based public key setting. In an ID-based mechanism, the user's public key is simply his identity (such as an email or IP address). Since then, various ID-based encryption and signature schemes have been proposed, many of them based on bilinear pairings on elliptic or hyper-elliptic curves. The signatures in these schemes are in general short.
The notion of blind signature was introduced by D. Chaum [4]; it provides anonymity for the signed message. Since its introduction, blind signature schemes [4,5,6,7,8,9,10] have been used in numerous applications, most prominently in anonymous voting and anonymous e-cash. At the same time, to meet
practical demands, many variants of the blind signature have appeared, such as partially blind signatures and group blind signatures.
Informally, a blind signature allows a user to obtain signatures from an authority on any document in such a way that the authority learns nothing about the message being signed. The most important property of blind signatures is unforgeability, which requires that it be impossible for any malicious user who engages in k runs of the protocol with the signer to obtain strictly more than k valid message-signature pairs. The basic idea of most existing blind signatures is that the requester chooses some random factors and embeds them into the message to be signed. The random factors are kept secret so the signer cannot recover the message. Upon receiving the blinded signature from the signer, the requester removes the random factors to obtain a valid signature. So far, two ID-based blind signature schemes based on bilinear pairings have been proposed: the first was proposed by Zhang and Kim [16] at Asiacrypt 2002, the other at ACISP 2003.
The notion of proxy signature was introduced by Mambo et al. in 1996 [15]. A proxy signature scheme allows an entity, called the original signer, to delegate his signing capability to one or more entities, called proxy signers. Since its introduction, proxy signatures have been suggested for use in many applications [16,17,18], particularly in distributed computing, where delegation of rights is quite common. Examples discussed in the literature include distributed systems, grid computing, mobile agent applications, distributed shared object systems, global distribution networks, and mobile communications. To suit different situations, many proxy signature variants have been produced, such as one-time proxy signatures, proxy blind signatures, multi-proxy signatures, and so on. Since the proxy signature appeared, it has attracted great attention from researchers.
The proxy signature and the blind signature have their respective advantages, and in some real situations both must be applied concurrently, for example in anonymous proxy electronic voting. The first proxy blind signature was proposed by Lin and Jan [3] in 2000. Later, Tan et al. [5] proposed a proxy blind signature scheme. However, in 2003, Lal et al. [8] pointed out that Tan et al.'s scheme was insecure and proposed a new proxy blind signature scheme based on Mambo et al.'s scheme [6]. In 2004, Wang et al. [10] demonstrated that Tan's scheme was insecure and proposed two effective attacks. In 2005, Sun et al. [11] showed that Tan et al.'s schemes did not satisfy the unforgeability and unlinkability properties, and also pointed out that Lal's scheme [8] did not possess the unlinkability property either. In 2004, Xue and Cao showed that there is a weakness in Tan et al.'s scheme [4] and Lal et al.'s scheme [8], since the proxy signer can link the blinded message to the signature or plaintext with great probability. In 2007, Li et al. [11] proposed a proxy blind signature scheme using a verifiable self-certified public key and compared its efficiency with Tan et al.'s [5]. Recently, Yang et al. proposed a new scheme [12] and showed it to be more efficient than Li et al.'s [11]. Based on Yang et al.'s proxy blind signature, Kar et al. and Nway Oo et al. proposed new proxy blind signature schemes in [1] and [2], respectively. In this paper, by analyzing Kar et al.'s scheme and Nway
Oo et al.'s scheme, we show that the two schemes are insecure against forgery: they are universally forgeable; in other words, anyone is able to forge a proxy blind signature on an arbitrary message. We also analyze the cause of this attack, and the corresponding attack is given.
The rest of the paper is organized as follows: Section 2 gives some preliminary knowledge related to the paper; the following sections recall the two proxy blind signature schemes and analyze their security, presenting the forgery attacks; finally, we conclude the paper.
2 Preliminaries
In this section, we will review security requirements of proxy blind signature.
– Distinguishability: The proxy blind signature must be distinguishable from
a normal signature.
– Non-repudiation: Neither the original signer nor the proxy signer can sign on
behalf of the other party. This means that they cannot deny their signatures
against anyone.
– Verifiability: The verifier should be able to verify the proxy signature in a
similar way to the verification of the original signature.
– Unforgeability: Only the designated proxy signer can create a valid proxy
signature for the original signer (even the original signer cannot do it).
– Identifiability: Anyone can determine the identity of the corresponding proxy
signer from a proxy signature.
– Prevention of misuse: It should be ensured that the proxy key pair is used only for creating proxy signatures that conform to the delegation information. In case of any misuse of the proxy key pair, the responsibility of the proxy signer should be determined explicitly.
– Unlinkability: After the proxy blind signature is created, the proxy signer can link neither the message nor the signature to the corresponding signing session.
Definition 1 (Blindness). Let S be a probabilistic polynomial-time algorithm, and let U0 and U1 be two honest users. U0 and U1 engage in the signature issuing protocol with S on messages m_b and m_{1−b}, and output signatures δ_b and δ_{1−b}, respectively, where b is randomly chosen from {0, 1}. Then (m_0, m_1, δ_b, δ_{1−b}) is sent to S, and S outputs b′ ∈ {0, 1}. Blindness requires that, for all such S, U0 and U1, for any constant c and all sufficiently large n, the probability that b′ = b is less than 1/2 + 1/n^c.
4.1 Forgeability
Here we show that the scheme is insecure against forgery: anyone can produce a forged proxy blind signature on behalf of the original signer. The corresponding attack is given below.
– Let m′ be a forged message.
– To produce a forgery, the adversary randomly chooses l ∈ Z_q and computes e′ = h(m′ ∥ g^l mod p).
– The adversary then sets s′ = g^l (y_A y_B K^{H(m_w ∥ K)})^{e′} mod p.
– Finally, the resultant proxy blind signature on message m′ is δ′ = (m′, m_w, s′, e′, K).
In the following, we show that the forged proxy blind signature is valid and passes verification. Indeed,
h(m′ ∥ s′ (y_A y_B K^{H(m_w ∥ K)})^{−e′}) = h(m′ ∥ g^l (y_A y_B K^{H(m_w ∥ K)})^{e′} (y_A y_B K^{H(m_w ∥ K)})^{−e′}) = h(m′ ∥ g^l) = e′
Obviously, the forged proxy blind signature δ′ = (m′, m_w, s′, e′, K) passes the verification equation; thus our attack is valid.
The reason such an attack is possible is that the form of s′ in the proxy blind signature is not fixed, which allows an adversary to choose a suitable form to construct a forgery.
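To make the attack concrete, the following toy sketch (hypothetical small parameters; SHA-256 standing in for both hash functions h and H, which is not how the schemes instantiate them) carries out the forgery and checks that it passes the verification equation:

import hashlib

p, q, g = 23, 11, 4                      # toy DLP group: g has order q mod p
yA, yB, K = pow(g, 3, p), pow(g, 5, p), pow(g, 7, p)  # arbitrary public values
mw, m = b"warrant", b"forged message"

def h(*parts):
    """Toy hash into Z_q (SHA-256 stand-in for the schemes' h and H)."""
    return int.from_bytes(hashlib.sha256(b"|".join(parts)).digest(), "big") % q

l = 6                                     # adversary's random l in Z_q
e = h(m, str(pow(g, l, p)).encode())      # e' = h(m' || g^l mod p)
Y = (yA * yB * pow(K, h(mw, str(K).encode()), p)) % p
s = (pow(g, l, p) * pow(Y, e, p)) % p     # s' = g^l * Y^{e'} mod p

# The verifier recomputes h(m' || s' * Y^{-e'} mod p) and compares with e'.
lhs = h(m, str((s * pow(Y, -e, p)) % p).encode())
print(lhs == e)                           # True: the forgery verifies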
r = g^k mod p
– If r* = 0, the user C has to select a new tuple (u, v); otherwise, the user sends e to the proxy signer.
– After receiving e, the proxy signer B computes
s* = k + e·s mod q    (4)
6 Conclusion
As an important cryptographic primitive, the proxy blind signature plays an important role in secure e-commerce, such as e-cash and e-voting, and unforgeability is an essential property of any proxy blind signature scheme. In this paper, we give a security analysis of two DLP-based proxy blind signature schemes [1,2] and show that both are insecure: they are universally forgeable; in other words, anyone is able to forge a proxy blind signature on an arbitrary message. How to design a secure proxy blind signature scheme remains an open problem.
References
1. Kar, B., Sahoo, P.P., Das, A.K.: A Secure Proxy Blind Signature Scheme Based
on DLP. In: MINES 2010, pp. 477–480 (2010)
2. Oo, A.N., Thein, N.: DLP based Proxy Blind Signature Scheme with Low-
Computation. In: 2009 The Fifth International Joint Conference on INC, IMS,
and IDC, pp. 285–288 (2009)
3. Lin, W.D., Jan, J.K.: A security personal learning tools using a proxy blind sig-
nature scheme. In: Proc. of Int. Conference on Chinese Language Computing, pp.
273–277 (2000)
4. Chaum, D.: Blind signature for untraceable payment. In: Advances in Cryptology-
Crypto 1982, pp. 199–203. Springer, Heidelberg (1983)
5. Tan, Z.W., Liu, Z.J., Tang, C.M.: A proxy blind signature scheme based on DLP. Journal of Software 14(11), 1931–1935
6. Kim, J.-H., Kim, K., Lee, C.S.: An Efficient and Provably Secure Threshold Blind
Signature. In: Kim, K.-c. (ed.) ICISC 2001. LNCS, vol. 2288, pp. 318–327. Springer,
Heidelberg (2002)
7. Wang, S., Bao, F., Deng, R.H.: Cryptanalysis of a Forward Secure Blind Signature
Scheme with Provable Security. In: Qing, S., Mao, W., López, J., Wang, G. (eds.)
ICICS 2005. LNCS, vol. 3783, pp. 53–60. Springer, Heidelberg (2005)
8. Wang, S.H., Wang, G.L., Bao, F., Wang, J.: Cryptanalysis of a proxy blind signa-
ture scheme based on DLP. Journal of Software 16(5), 911–915 (2005)
9. Li, J.G., Wang, S.H.: New Efficient Proxy Blind Signature Scheme Using Verifiable
Self-certified Public Key. International Journal of Network Security 4(2), 193–200
10. Okamoto, T., Inomata, A., Okamoto, E.: A proposal of short proxy signature us-
ing pairing. In: The Proceedings of the International Conference on Information
Technology: Coding and Computing, pp. 631–635 (2005)
11. Pointcheval, D.: Security Arguments for Digital Signatures and Blind Signatures.
Journal of Cryptology 13(3), 361–396
12. Zhang, F., Kim, K.: ID-Based Blind Signature and Ring Signature from Pairings.
In: Zheng, Y. (ed.) ASIACRYPT 2002. LNCS, vol. 2501, pp. 533–547. Springer,
Heidelberg (2002)
13. Zhang, F., Kim, K.: Efficient ID-based Blind Signature and Proxy signature from
Bilinear Pairings. In: Safavi-Naini, R., Seberry, J. (eds.) ACISP 2003. LNCS,
vol. 2727, pp. 312–323. Springer, Heidelberg (2003)
14. Wu, Q., Susilo, W., Mu, Y., Zhang, F.: Efficient Partially Blind Signatures with
Provable Security. In: Gavrilova, M.L., Gervasi, O., Kumar, V., Tan, C.J.K.,
Taniar, D., Laganá, A., Mun, Y., Choo, H. (eds.) ICCSA 2006. LNCS, vol. 3982,
pp. 345–354. Springer, Heidelberg (2006)
15. Mambo, M., Usuda, K., Okamot, E.: Proxy signature: delegation of the power to
sign messages. IEICE Trans. Fundamentals E79-A(9), 1338–1353 (1996)
16. Xu, J., Zhang, Z., Feng, D.: ID-Based Proxy Signature Using Bilinear Pairings.
In: Auer, P., Meir, R. (eds.) COLT 2005. LNCS (LNAI), vol. 3559, pp. 359–367.
Springer, Heidelberg (2005)
17. Zhang, F., Kim, K.: Efficient ID-based blind signature and proxy signature from
pairings. In: Safavi-Naini, R., Seberry, J. (eds.) ACISP 2003. LNCS, vol. 2727, pp.
312–323. Springer, Heidelberg (2003)
18. Shim, K.-A.: An Identity-Based Proxy Signature Scheme from Pairings. In: Ning,
P., Qing, S., Li, N. (eds.) ICICS 2006. LNCS, vol. 4307, pp. 60–71. Springer,
Heidelberg (2006)
Key Cutting Algorithm and Its Variants for
Unconstrained Optimization Problems
Abstract. This paper presents the key cutting algorithm and its variants. The algorithm emulates the work of locksmiths defeating a lock: the best key that matches a given lock is regarded as an optimal solution of the corresponding optimization problem. The basic structure of the key cutting algorithm is as simple as that of genetic algorithms, in which a string of binary numbers is employed as a key to open the lock. In this paper, four variants of the original algorithm are proposed. The modification is mainly in the key cutting selection: various criteria for the key cutting probability are added in order to improve the search speed and the solution convergence. To evaluate them, four standard test functions are used, and the best solutions obtained from the key cutting variants are compared with those obtained from genetic algorithms. The results confirm the effectiveness of the key cutting algorithm and its variants in solving unconstrained optimization problems.
1 Introduction
Locksmithing is the science and art of making and defeating locks. The lock is a classical mechanism to secure buildings, rooms, cabinets, storage facilities, etc., and a key is a tool used to open the lock. A "smith" of any kind is one who shapes metal pieces; locksmithing, as its name implies, is the assembly and design of locks and their respective keys [1]. Although locksmiths originally made entire locks to maintain the security of homes, businesses, automobiles and so on, people often encounter the everyday situation of losing their keys, and locksmiths can help them open their locks. Lock picking [2] is an essential locksmith skill for opening a lock without the correct key while not damaging the lock. There are various techniques for picking different types of locks. The simplest starts with a blank key and uses the following method to obtain a functioning key: a hook pick is inserted into the lock, from which the number and exact locations of the key teeth are determined; the teeth of the initial key blank are then adjusted until a key that can open the lock is found.
In November 2009, Jing Qin introduced the key cutting algorithm, which emulates the lock-picking work of locksmiths [3]. The algorithm is simple to understand and implement. In that paper, a 9-number puzzle and a quadratic
function of a single variable were used for testing. The results were satisfactory but limited: in our further work, this key cutting algorithm always fails when the number of control variables is two or greater. Starting from the original key cutting algorithm, some modifications are therefore made to improve the algorithm's performance on unconstrained optimization problems. In this paper, Section 2 gives a description of the original key cutting algorithm, while its variants are illustrated in Section 3. Section 4 shows test results and discussion. The last section, Section 5, is the conclusion.
2.1 Definitions
Definition 1: Lock
The lock is defined as an objective function of unconstrained optimization problem. It
requires a solution that is called a “key” to open the lock.
Definition 2: Key
A key is one possible solution to a given objective function.
Definition 3: Key Tooth
A set of key teeth is a binary string representing an encoded key as in Fig. 1.
Definition 4: Key Set
A key set is a collection of possible keys to open the lock, like a collection of keys on a key ring.
Definition 5: Key Fitness
The key fitness represents a degree of the key and lock matching. The key with a
higher fitness is more suitable to fit the lock.
Definition 6: Similarity
The degree of similarity among all keys in a key set can reveal the correct tooth locations of the key that fits the lock.
Definition 7: Key Cutting
Key cutting is a step to adjust one tooth on a key or to change one bit of a string.
Definition 8: Key Cutting Probability
The key cutting probability is the probability that controls the variation of one tooth of a bit string. It can be calculated based on the similarity of the key set.
Definition 9: Key Selection (Key Picking)
Key selection chooses a subset of a key set to create a new key set for the next iteration.
f(x_1, x_2, x_3) = 30 + Σ_{i=1}^{3} [ (x_i − 20)^2 / 10 − 9 cos(2π x_i / 5) ]    (3)
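A minimal sketch of the key cutting loop implied by Definitions 1–9 and the test function (3) is given below. The variable range, the similarity-based cutting probability, and the elitist key picking are our assumptions for illustration; the paper's exact variant rules (KCA1–KCA4) are not reproduced here.

import math, random

BITS, LO, HI = 16, 0.0, 40.0       # 16-bit resolution per variable; range assumed

def decode(key, nvars=3):
    """Decode a binary key into nvars real variables."""
    xs = []
    for i in range(nvars):
        v = int("".join(map(str, key[i*BITS:(i+1)*BITS])), 2)
        xs.append(LO + (HI - LO) * v / (2**BITS - 1))
    return xs

def lock(key):
    """Objective function (3); a lower value means a better-fitting key."""
    xs = decode(key)
    return 30 + sum((x - 20)**2 / 10 - 9*math.cos(2*math.pi*x/5) for x in xs)

def cutting_probability(keys):
    """Per-tooth cutting probability from column-wise similarity (assumed rule):
    teeth on which the keys agree are likely correct, so cut them less often."""
    n = len(keys)
    sim = [sum(k[j] for k in keys) / n for j in range(len(keys[0]))]
    return [1 - abs(2*s - 1) for s in sim]

keys = [[random.randint(0, 1) for _ in range(3*BITS)] for _ in range(80)]
for _ in range(50):                 # population 80, 50 iterations (as in the tests)
    prob = cutting_probability(keys)
    for k in keys:
        j = random.randrange(len(k))        # key cutting: adjust one tooth
        if random.random() < prob[j]:
            k[j] ^= 1
    keys.sort(key=lock)                     # key picking: keep the better half
    keys = keys[:40] + [k[:] for k in keys[:40]]

print(decode(keys[0]), lock(keys[0]))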
The test of this function is carried out by applying the same parameter setting to all
the key cutting algorithms and genetic algorithms as follows.
• Population size is 80
• Maximum iteration is 50
• No stalled generation is applied
• 16-bit resolution is used for each variable
After 30 trials, a representative convergence curve for each method is shown in Fig. 3. KCA2 and KCA4 are the two best methods at finding the best objective function value, while GA is the fastest.
The test of this function is carried out by applying the same parameter setting to all
the key cutting algorithms and genetic algorithms as follows.
• Population size is 80
• Maximum iteration is 50
• No stalled generation is applied
• 16-bit resolution is used for each variable
After 30 trials, a representative convergence curve for each method is shown in Fig. 4. KCA4 is the best method at finding the best objective function value, while GA is the fastest.
The test of this function is carried out by applying the same parameter setting to all
the key cutting algorithms and genetic algorithms as follows.
• Population size is 80
• Maximum iteration is 100
• No stalled generation is applied
• 20-bit resolution is used for each variable
After 30 trials, a representative convergence curve for each method is shown in Fig. 5. KCA4 and GA are the two best methods at finding the best objective function value, while GA is still the fastest.
The test of this function is carried out by applying the same parameter setting to all
the key cutting algorithms and genetic algorithms as follows.
• Population size is 30
• Maximum iteration is 50
• No stalled generation is applied
• 20-bit resolution is used for each variable
After 30 trials, a representative convergence curve for each method is shown in Fig. 6. KCA4 is the best method at finding the best objective function value, while GA is the fastest.
5 Conclusion
This paper has presented the key cutting algorithm and its variants for solving multivariate optimization problems. The proposed algorithms were challenged with four standard test functions, and their results were compared with those obtained by genetic algorithms. The key cutting algorithm with modification 4 (KCA4) shows the best performance at finding the best solution among them. However, in these tests the key cutting algorithms are slower than genetic algorithms, because no stalled-iteration criterion was applied to any of the methods. Since all the key cutting algorithms have fast convergence characteristics, if an appropriate stalled-iteration criterion were used, the key cutting algorithms would be expected to perform faster than genetic algorithms.
References
1. Phillips, B.: The Complete Book of Locks and Locksmithing. McGraw-Hill, Chicago
(2005)
2. McCloud, M.: Lock Picking Basics. Standard Publication Inc. (2004)
3. Qin, J.: A New Optimization Algorithm and Its Application – Key Cutting Algorithm. In:
2009 IEEE International Conference on Grey Systems and Intelligent Services, pp. 1537–
1541. IEEE Press, New York (2009)
4. Goldberg, D.E.: Genetic Algorithms in Search, Optimization, and Machine Learning. Addi-
son-Wesley, Reading (1989)
5. Charuwat, T., Kulworawanichpong, T.: Genetic Based Distribution Service Restoration
with Minimum Average Energy Not Supplied. In: Beliczynski, B., Dzielinski, A., Iwa-
nowski, M., Ribeiro, B. (eds.) ICANNGA 2007. LNCS (LNAI), vol. 4431, pp. 230–239.
Springer, Heidelberg (2007)
6. He, S., Wu, Q.H., Saunders, J.R.: Group Search Optimizer: An Optimization Algorithm In-
spired by Animal Searching Behavior. IEEE Transactions Evolutionary Computation 13,
973–990 (2009)
Transmitter-Receiver Collaborative-Relay
Beamforming by Simulated Annealing
Dong Zheng1, Ju Liu1,2,*, Lei Chen1, Yuxi Liu1, and Weidong Guo1
1 Shandong University, Jinan, 250100, China
2 Southeast University, Nanjing, 210096, China
1 Introduction
The insistent demand for more spectrally efficient technologies has recently drawn much attention to multiple-input multiple-output (MIMO) systems. Space diversity can be fully exploited by equipping multiple antennas at both the transmitter and the receiver. However, the limited space, complexity and power constraints of mobile terminals challenge the implementation of multiple antennas and make the potential benefits difficult to realize. Another type of diversity, cooperative diversity, by which users relay each other's information and form a virtual multi-antenna system, has opened a new research avenue [1]-[4]. Amplify-and-forward, decode-and-forward, and compress-and-forward are three common fixed relaying schemes. Among them, amplify-and-forward (AF), which simply amplifies the received noisy signal and forwards it to other relay nodes or the destination, is arguably the most attractive strategy due to its simplicity. These schemes have been well studied under different assumptions about CSI [4]-[6].
This work was supported by National Natural Science Foundation of China
(60872024), the Cultivation Fund of the Key Scientific and Technical Innovation
Project (708059), Open Research Fund of National Mobile Communications Re-
search Laboratory (2010D10), and Independent Innovation Foundation of Shandong
University (2010JC007). Corresponding author: Ju Liu ([email protected]).
2 Model Description
We consider a three-hop relay network with a source S, a destination D and two clusters of relay nodes, namely cluster one {T_m}_{m=1}^{M} with M relay nodes and cluster two {R_k}_{k=1}^{K} with K relay nodes, as shown in Fig. 1.
[Fig. 1: three-hop relay network. The source S reaches the first cluster {T_m} through channels f_m; cluster one applies weights w_m and reaches the second cluster {R_k} through the channel matrix H; cluster two applies weights v_k and reaches D through channels g_k. Noises n_T, n_R, n_D enter at the two clusters and the destination.]
We assume that, during the first stage, the received signal at T_m is
y_m = √P_S · f_m s + n_{T,m}    (1)
where s is the information symbol with E|s|^2 = 1, P_S is the transmit power of the source S, and n_{T,m} is complex Gaussian noise with zero mean and variance σ_T^2. During the second stage, the received signal y_m is weighted by a power normalization factor l_m = 1/√(|f_m|^2 P_S + σ_T^2) and a beamforming weight w_m. Therefore the received signal at R_k is
u_k = Σ_{m=1}^{M} h_{k,m} w_m l_m (√P_S f_m s + n_{T,m}) + n_{R,k}    (2)
in which n_{R,k} is complex Gaussian noise with zero mean and variance σ_R^2. Then R_k retransmits the received noisy signal multiplied by a similar power normalization factor d_k = 1/√(Σ_{m=1}^{M} |h_{k,m} w_m l_m (√P_S f_m s + n_{T,m})|^2 + σ_R^2) and the beamforming weight v_k; hence the received signal at D is
r = Σ_{k=1}^{K} g_k v_k d_k u_k + n_D = Σ_{k=1}^{K} g_k v_k d_k { Σ_{m=1}^{M} h_{k,m} w_m l_m (√P_S f_m s + n_{T,m}) + n_{R,k} } + n_D    (3)
where n_D is complex Gaussian noise with zero mean and variance σ_D^2. Eq. (3) can be represented in matrix form as
r = √P_S · v^H DGHLF w · s + v^H DGHLW n_T + v^H DG n_R + n_D    (4)
where the first term is the desired signal and the remaining terms are noise. As a result, the instantaneous SNR at D is given by
Γ = P_S |v^H DGHLFw|^2 / ( σ_T^2 |v^H DGHLw|^2 + σ_R^2 |v^H Dg|^2 + σ_D^2 )    (5)
2
where PT and PR are the total power constraints of the first relay cluster and
the second relay cluster, respectively.
f(w) = − (v^H R v) / (v^H Q v)    (8)
where R = JJ^H, Q = σ_T^2 KK^H + σ_R^2 Dgg^H D^H + (σ_D^2 / P_R) I, J = DGHLFw, K = DGHLw, and v = √P_R · P(Q^{−1}R). Note that P_R is used to meet the power
constraint of the second cluster, and P(·) denotes the principal eigenvector of a matrix. A neighborhood searching function generates new states in the neighborhood of the current state; this can be done by adding a perturbation to the current state. The pseudo-code is given as Algorithm 1:
Algorithm 1. CRBF by SA
Input: w_0, t_0
Initialization: w_p = w_0, t_i = t_0
1:  while i < C_out do
2:    while j < C_in do
3:      w_c = Generate(w_p)
4:      if min{1, exp(−(f(w_c) − f(w_p)) / (k t_i))} ≥ rand[0, 1] then
5:        w_p := w_c
6:      end if
7:      j := j + 1
8:    end while
9:    t_{i+1} = α t_i   {α is set to 0.95 here}
10:   i := i + 1
11: end while
Output: w = w_p, v = √P_R · P(Q^{−1}R), f(w)
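The following numpy sketch mirrors Algorithm 1 under assumed random channel matrices (the instances of D, G, H, L, F, the cooling constants and the perturbation size are placeholders, not the paper's simulation values); for each candidate w it sets v to the principal eigenvector of Q^{-1}R, as in (8), and applies the Metropolis acceptance rule:

import numpy as np

rng = np.random.default_rng(0)
M, K2 = 6, 6
sT2, sR2, sD2, PR = 0.1, 0.1, 0.1, 10.0
F = np.diag(rng.standard_normal(M) + 1j*rng.standard_normal(M))
L = np.eye(M)                       # normalization factors folded into identity here
H = (rng.standard_normal((K2, M)) + 1j*rng.standard_normal((K2, M))) / np.sqrt(2)
g = rng.standard_normal(K2) + 1j*rng.standard_normal(K2)
G, D = np.diag(g), np.eye(K2)

def f(w):
    """Objective (8): -v^H R v / v^H Q v with v the principal eigenvector of Q^{-1}R."""
    J = D @ G @ H @ L @ F @ w       # J = DGHLFw
    Kv = D @ G @ H @ L @ w          # K = DGHLw
    R = np.outer(J, J.conj())
    Q = (sT2*np.outer(Kv, Kv.conj()) + sR2*np.outer(D @ g, (D @ g).conj())
         + (sD2/PR)*np.eye(K2))
    vals, vecs = np.linalg.eig(np.linalg.solve(Q, R))
    v = vecs[:, np.argmax(vals.real)]          # P(Q^{-1} R)
    return -np.real(v.conj() @ R @ v) / np.real(v.conj() @ Q @ v)

wp = rng.standard_normal(M) + 1j*rng.standard_normal(M)
wp /= np.linalg.norm(wp)
t, alpha, kB = 1.0, 0.95, 1.0
for _ in range(50):                              # outer loop (Cout)
    for _ in range(20):                          # inner loop (Cin)
        wc = wp + 0.1*(rng.standard_normal(M) + 1j*rng.standard_normal(M))
        wc /= np.linalg.norm(wc)                 # neighborhood perturbation
        d = f(wc) - f(wp)
        if d <= 0 or rng.random() < np.exp(-d / (kB*t)):
            wp = wc                              # Metropolis acceptance
    t *= alpha                                   # geometric cooling, alpha = 0.95
print("final objective value:", -f(wp))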
where w = √P_T ŵ, v = √P_R v̂, and ŵ, v̂ are the normalized unit vectors of w and v, respectively. Let the equivalent channel matrix A = GHLF have a singular value decomposition (SVD) A = UΛZ, where U and Z are unitary matrices and Λ is the diagonal matrix of singular values of A. Then we can choose ŵ as the column vector of Z corresponding to the largest singular value of A. Once ŵ has been determined, the objective function in (10) takes the form
max_{v̂}  P_S P_T P_R v̂^H FF^H v̂ / ( σ_T^2 P_T P_R v̂^H GG^H v̂ + σ_R^2 P_R v̂^H Dgg^H D^H v̂ + σ_D^2 v̂^H v̂ )    (11)
which is a generalized Rayleigh quotient,
SNR(v̂) = (v̂^H R v̂) / (v̂^H Q v̂)    (12)
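A short sketch of this suboptimal choice (random matrices assumed in place of the real channels) picks ŵ as the right singular vector of A = GHLF associated with the largest singular value:

import numpy as np

rng = np.random.default_rng(1)
M = K2 = 6
F = np.diag(rng.standard_normal(M) + 1j*rng.standard_normal(M))
L = np.eye(M)
H = (rng.standard_normal((K2, M)) + 1j*rng.standard_normal((K2, M))) / np.sqrt(2)
G = np.diag(rng.standard_normal(K2) + 1j*rng.standard_normal(K2))

A = G @ H @ L @ F                  # equivalent channel between source and cluster two
U, S, Zh = np.linalg.svd(A)        # A = U diag(S) Zh
w_hat = Zh.conj().T[:, 0]          # right singular vector of the largest sigma
print(np.isclose(np.linalg.norm(A @ w_hat), S[0]))  # True: w_hat attains sigma_max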
5 Simulation Results
In this section, the performance of the proposed distributed beamforming solution in the CRBF system is presented. Consider a network with M relay nodes in the first relay cluster and K relay nodes in the second relay cluster. The cooperative network experiences independent Rayleigh flat fading. Assume that E{|f_m|^2} = 1 for m = 1, 2, ..., M/2 and E{|f_m|^2} = 2 for m = M/2 + 1, ..., M; E{|g_k|^2} = 1 for k = 1, 2, ..., K/2 and E{|g_k|^2} = 2 for k = K/2 + 1, ..., K; and E{|h_{k,m}|^2} = 1 for all k, m. The source SNR is defined as P_S/N, and all the nodes in the proposed network have the same power level, set to 1 without loss of generality. Throughout our simulations, the source SNR (transmit power) is assumed to be 10 dB, and the total transmit power increases from −5 dB to 20 dB.
Fig. 2 shows the average received SNR versus the maximum allowable power of relay cluster one with M = 6, 10, K = 10 and P2 = 10 dB, while Fig. 3 plots the same against the maximum allowable power of the second relay cluster with M = 10, K = 6, 10 and PT = 10 dB. Fig. 2 illustrates that as the power of cluster one increases, the output SNR improves accordingly. The received SNR is high even if P1 is low, because the noise introduced in the first two hops can be suppressed by effectively exploiting the spatial diversity of the channel (especially the channel between the two clusters) through proper adjustment of w and v. The output SNR then saturates, as it is constrained by the noise σ_D^2 introduced in the last hop. In contrast, in Fig. 3, we see that the received SNR rises almost linearly
[Fig. 2: Received SNR at the destination versus the total relay SNR (−5 to 20 dB) of cluster one, comparing the SA-based solution with the suboptimal solution for [M, K] = [10, 10] and [6, 10].]
[Fig. 3: Received SNR at the destination versus the total relay SNR (−5 to 20 dB) of cluster two, comparing the SA-based solution with the suboptimal solution for [M, K] = [10, 10] and [10, 6].]
with the increase of the total relay power P_R, since the increased power of the second cluster helps to overcome the constraint of the noise σ_D^2.
Both figures indicate that the SNR at the receiver improves as the relay number of either cluster increases, because there are better chances to select more suitable relays to forward the signal to the destination. Furthermore, we can observe that the SA approach is about 3 dB better than the suboptimal method. Moreover, simulation results show that both approaches outperform the fixed power allocation strategy.
6 Conclusion
In this paper, we proposed an SA-based approach to improve the SNR at the receiver for the three-hop multiple-relay network, a stochastic global
References
1. Laneman, J.N., Wornell, G.W.: Cooperative diversity in wireless networks: Efficient
protocols and outage behavior. IEEE Trans. Info. Theory 50, 3062–3080 (2004)
2. Sendonaris, A., Erkip, E., Aazhang, B.: User cooperation diversity - Part I. System
description. IEEE Trans. Commun. 51, 1927–1938 (2003)
3. Sendonaris, A., Erkip, E., Aazhang, B.: User cooperation diversity - Part II. Implementation aspects and performance analysis. IEEE Trans. Commun. 51, 1939–1948 (2003)
4. Havary-Nassab, V., Shahbazpanahi, S., Grami, A., Luo, Z.-Q.: Distributed beam-
forming for relay networks based on second-order statistics of the channel state
information. IEEE Trans. Signal Process 56(9), 4306–4316 (2008)
5. Zheng, G., Wong, K.-K., Paulraj, A., Ottersten, B.: Collaborative-relay beam-
forming with perfect CSI: optimum and distributed implementation. IEEE Signal
Processing Letters 16(4) (April 2009)
6. Jing, Y., Jafarkhani, H.: Network beamforming using relays with perfect channel
information. IEEE Trans. Info. Theory 55, 2499–2517 (2009)
7. Manfred, G., Peter, W.: A review of heuristic optimization methods in economet-
rics. In: Working papers, Swiss Finance Institute Research Paper Series, vol. (8-12),
pp. 8–12 (2008)
8. Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing.
Science New Series 220(4598), 671–680 (1983)
9. Eglese, R.W.: Simulated annealing: a tool for operational research. European Jour-
nal of Operational Research 46(3), 271–281 (1990)
10. Trucco, A., Murino, V.: Stochastic optimization of linear sparse arrays. IEEE Jour-
nal of Oceanic Engineering 24(3), 291–299 (1999)
11. Cardone, G., Cincotti, G., Pappalardo, M.: Design of wide-band arrays for low side-lobe level beam patterns by simulated annealing. IEEE Trans. Ultrasonics, Ferroelectrics, and Frequency Control 49(8), 1050–1059 (2002)
Calculation of Quantities of Spare Parts and the
Estimation of Availability in the Repaired as Old Models
1 Key Laboratory of Natural Resources of Changbai Mountain & Functional Molecules (Yanbian University), Ministry of Education, 133002 Yanji, China
2 Department of Information Management, Peking University, 100871 Beijing, China
[email protected]
Abstract. In this paper, based on the repaired-as-old model under the same storage condition, the quantity of spare parts M for N identical systems is derived on the condition that failures are repaired with probability P0, and a special example is given to prove feasibility. Besides, an availability function expression is provided, and, taking a system that follows the Weibull distribution as the case, the validity of the system is shown through calculation.
1 Introduction
Many pieces of equipment and many systems are repairable. This includes two cases: one is the need for maintenance during operation, and the other is maintenance during storage. This article gives statistical analyses of the time required for maintenance in the second case. In actual fact, however, what concerns people most is the storage quantity of spare parts and how to improve the availability of equipment. Allen and D'Esopo [1] proposed the idea that spare parts should be classified in the 1960s. Cohen [2] divided needs into urgent needs and ordinary ones. Moore [3] classified spare parts according to their functions. Because of the influence of spare parts on manufacturing and economy, many scholars have studied the quantities of spare parts needed. Flint [4] advised developing partnerships and resource sharing to reduce the cycle time. Besides, Foote [5] studied stock prediction, and Luxhoj and Rizzo [6] obtained a method for the quantities of spare parts needed for the same set based on a set model. Kamath [7] used a Bayesian method to predict the quantities of spare parts needed. Yu [8] put forward and discussed the repaired-as-old model and offered a calculation formula for the quantity of spare parts M that can meet the needs of the equipment, and Yan [9] studied the calculation of availability of equipment composed of only one part. Based on this, this paper considers a repaired-as-old model that reduces the quantity of spare parts, and the validity of the system is shown through calculation.
* Corresponding author. Head of the Department of Mathematics.
Define X_t = 1 if the system is normal at time t, and X_t = 0 if it is abnormal at time t.
Denote the distribution function of the first failure time Z of the system by F(t). Under the model assumptions, when b = 0, the availability at time t is A(t) = P(X_t = 1) = 1 − F(t); when b > 0, write a_k = k·a (k = 0, 1, ...) and b_k = k·a + b (k = 1, 2, ...). From the model assumptions, when a_k < t ≤ b_k,
A(t) = P(X_t = 1 | X_{a_k} = 1) P(X_{a_k} = 1) = [ (1 − F(t)) / (1 − F(a_k)) ] A(a_k)    (2.2)
When b_k < t ≤ a_{k+1} (k ≥ 1),
A(t) = P(X_t = 1, X_{a_k} = 0) + P(X_t = 1, X_{a_k} = 1)
     = P(X_t = 1 | X_{a_k} = 0) P(X_{a_k} = 0) + P(X_t = 1 | X_{a_k} = 1) P(X_{a_k} = 1)
     = [ (1 − F(t)) / (1 − F(b_k)) ] (1 − A(a_k)) + [ (1 − F(t)) / (1 − F(a_k)) ] A(a_k)    (2.3)
We can see from (2.2) and (2.3) that the availability function A(t) at any time t can be obtained once all A(a_k), k = 1, 2, ..., ⌊t/a⌋, are known. In formula (2.3), letting t = a_{k+1}, when k ≥ 1 we obtain a recursion for A(a_{k+1}) in terms of A(a_k); starting from A(a_1) = 1 − F(a_1), we can calculate A(a_k) for k = 1, 2, ..., ⌊t/a⌋, and furthermore the expression of the availability function A(t) at any time t can be obtained.
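A minimal sketch of the recursion (2.2)–(2.3) is given below. The Weibull life distribution echoes the paper's case study, but the shape and scale parameters and the inspection/repair times a, b are hypothetical:

import math

a, b = 10.0, 2.0                 # inspection interval and repair time (hypothetical)
beta, eta = 1.5, 100.0           # Weibull shape and scale (hypothetical)
F = lambda t: 1 - math.exp(-((max(t, 0.0) / eta) ** beta))  # life CDF

def A(t):
    """Availability at time t via the recursion (2.2)-(2.3)."""
    if t <= a:
        return 1 - F(t)                          # before the first inspection
    k = math.ceil(t / a) - 1                     # t lies in (a_k, a_{k+1}]
    ak, bk = k * a, k * a + b
    if t <= bk:                                  # a_k < t <= b_k: formula (2.2)
        return (1 - F(t)) / (1 - F(ak)) * A(ak)
    return ((1 - F(t)) / (1 - F(bk)) * (1 - A(ak))   # b_k < t <= a_{k+1}: (2.3)
            + (1 - F(t)) / (1 - F(ak)) * A(ak))

print(A(25.0), A(95.0))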
The probability that at least N of the N + m units are available at time t is
P = Σ_{j=N}^{N+m} C(N+m, j) A(t)^j (1 − A(t))^{N+m−j}
  = Σ_{j=0}^{N+m} C(N+m, j) A(t)^j (1 − A(t))^{N+m−j} − Σ_{j=0}^{N−1} C(N+m, j) A(t)^j (1 − A(t))^{N+m−j}    (2.5)
When P0 = 0.6, we obtain M = 3 through formula (2.5) by looking up the binomial distribution table. In like manner, we get M = 4, 5 and 9 when P0 is 0.7, 0.8 and 0.9, respectively.
If one rubber bearing costs 3000 yuan, the corresponding funds are 9000 yuan, 12000 yuan, 15000 yuan and 27000 yuan, as shown in Table 1.
From the analysis of the results, we can see that the more spare parts there are, the higher the reliability, and the more funds are used. When maintenance funds are insufficient, how to balance reliability and cost is the key consideration.
Z_1^{(i)}: the storage life span of part i from t = 0, Z_1^{(i)} ~ F_i(t), i = 1, 2;
T_1^{(i)}: the time interval of part i from the beginning of storage to the first update, i = 1, 2;
T_j^{(i)}: the time interval of part i from the (j−1)-th update to the j-th update, i = 1, 2; j = 2, 3, ...
Then
g_j^{(1)} = P(T_1^{(1)} = b_j) = P((j−1)a < Z_1^{(1)} ≤ ja) = F_1(ja) − F_1((j−1)a) = R_1((j−1)a) − R_1(ja), j = 1, 2, ...,
h_j^{(1)} = P(T_2^{(1)} = ja) = P((j−1)a − b < Z_2^{(1)} ≤ ja − b) = F_1(ja − b) − F_1((j−1)a − b) = R_1((j−1)a − b) − R_1(ja − b), j = 1, 2, ...,
g_j^{(2)} = P(T_1^{(2)} = c_j) = P((j−1)a < Z_1^{(2)} ≤ ja) = F_2(ja) − F_2((j−1)a) = R_2((j−1)a) − R_2(ja), j = 1, 2, ...,
h_j^{(2)} = P(T_2^{(2)} = ja) = P((j−1)a − c < Z_2^{(2)} ≤ ja − c) = F_2(ja − c) − F_2((j−1)a − c) = R_2((j−1)a − c) − R_2(ja − c), j = 1, 2, ...,
and in general, for part n,
g_j^{(n)} = P(T_1^{(n)} = n_j) = P((j−1)a < Z_1^{(n)} ≤ ja) = F_n(ja) − F_n((j−1)a) = R_n((j−1)a) − R_n(ja), j = 1, 2, ...,
h_j^{(n)} = P(T_2^{(n)} = ja) = P((j−1)a − n < Z_2^{(n)} ≤ ja − n) = F_n(ja − n) − F_n((j−1)a − n) = R_n((j−1)a − n) − R_n(ja − n), j = 1, 2, ...
4 Availability Function
Because the availability at time t during the updating process depends on the last update before time t, we need to study the distribution of S_N^{(i)}(t) (i = 1, 2, ..., n). The distribution function of s_{k+1}^{(i)} = Σ_{j=1}^{k+1} T_j^{(i)} (i = 1, 2, ..., n) is the convolution of g^{(i)}(s) with the k-fold convolution of h^{(i)}(s).
So the relation between the distributions and the generating functions is as follows:
v_j^{(i)(n)} = Σ_{k=n}^{j−1} v_k^{(i)(n−1)} h_{j−k}^{(i)},  i = 1, 2, ..., n;  j = n + 1, n + 2, ...,
with the lower-order terms v_j^{(i)(1)}, ..., v_j^{(i)(n−1)} defined analogously.
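A small numpy sketch of this convolution structure (with hypothetical pmfs for g and h supported on the lattice of multiples of a) is:

import numpy as np

g = np.array([0.0, 0.5, 0.3, 0.2])   # P(T_1 = j*a), j = 0..3 (hypothetical pmf)
h = np.array([0.0, 0.6, 0.4])        # P(T_j = j*a) for j >= 2 (hypothetical pmf)

def pmf_s(k):
    """pmf of s_{k+1} = T_1 + ... + T_{k+1}: g convolved k times with h."""
    out = g.copy()
    for _ in range(k):
        out = np.convolve(out, h)
    return out

print(pmf_s(2))          # distribution of s_3 on the lattice {0, a, 2a, ...}
print(pmf_s(2).sum())    # 1.0: still a valid probability mass function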
So far we can obtain the analytical availability function of the equipment; the specific calculation is as follows:
(i) when 0 ≤ t < a_1,
A(t) = P(X_t = 1) = P(Z_1 > t) = R(t)
(ii) when a_1 ≤ t < a_2,
A(t) = P(X_t = 1) = P(X_t = 1, X_{a_1} = 1) + P(X_t = 1, X_{a_1} = 0) = R(t) + P(X_t = 1, X_{a_1} = 0)
(iii) when a_m ≤ t < a_{m+1} (m ≥ 2), there will be n kinds of cases:
Case 1: only part 1 is updated.
P(X_t = 1, S_N^{(1)}(t) = b_m, S_N^{(2)}(t) ≠ c_m)
= P(X_t = 1 | S_N^{(1)}(t) = b_m) P(S_N^{(1)}(t) = b_m) [ P(X_t = 1, S_N^{(2)}(t) = 0) + Σ_{i=1}^{m−1} P(X_t = 1, S_N^{(2)}(t) = c_i) ]
So we have
P(X_t = 1, S_N^{(1)}(t) = b_m, S_N^{(2)}(t) ≠ c_m, ..., S_N^{(n)}(t) ≠ n_m)
= R^2(t − b_m) Σ_{j=1}^{m} v_m^{(1)(j−1)} ( 1 + Σ_{l=1}^{m−1} Σ_{j=1}^{l} v_l^{(2)(j−1)} )
Case 2: only part 2 is updated.
P(X_t = 1, S_N^{(2)}(t) = c_m, S_N^{(1)}(t) ≠ b_m, ..., S_N^{(n)}(t) ≠ n_m)
= R^2(t − c_m) Σ_{j=1}^{m} v_m^{(2)(j−1)} ( 1 + Σ_{l=1}^{m−1} Σ_{j=1}^{l} v_l^{(1)(j−1)} )
⋮
Case n: only part n is updated.
P(X_t = 1, S_N^{(n)}(t) = n_m, S_N^{(i)}(t) ≠ b_m, c_m, ... for i = 1, 2, ..., n−1)
= R^2(t − n_m) Σ_{j=1}^{m} v_m^{(n)(j−1)} ( 1 + Σ_{i=1}^{n−1} Σ_{l=1}^{m−1} Σ_{j=1}^{l} v_l^{(i)(j−1)} )
Besides, for the first failure time of the system we can obtain, for any two distinct parts i and j,
F(t) = P(Z_i ≤ t ∪ Z_j ≤ t) = P(Z_i ≤ t) + P(Z_j ≤ t) − P(Z_i ≤ t) P(Z_j ≤ t) = F_i(t) + F_j(t) − F_i(t) F_j(t)
5 Conclusion
This paper derives the quantity of spare parts M for N identical systems based on the repaired-as-old model, on the condition that failures are repaired with probability P0, and a special example is given to prove feasibility. Besides, an availability function expression is provided, and, taking a system that follows the Weibull distribution as the case, the validity of the system is shown through calculation.
References
[1] Allen, S.G., D’esopo, D.A.: An ordering policy for repairable stock items. Operations
Research 16(3), 82–489 (1968)
[2] Cohen, M.A., Kleindorfer, P.R., Lee, H., et al.: Multi-item service constrained(s, S) policy
for spare parts logistics system. Naval Research Logistics (39), 561–577 (1992)
[3] Moore, R.: Establishing an inventory management program. Plant Engineering 50(3),
113–116 (1996)
[4] Flint, P.: Too much of a good thing: Better inventory management could save the industry
millions while improving reliability. Air Transport World (32), 103–106 (1995)
[5] Foote, B.: On the implementation of a control-based forecasting system for air-craft spare
parts procurement. IIE Transactions 27(2), 210–216 (1995)
[6] Luxhoj, J.T., Rizzo, T.P.: Probabilistic spares provisioning for repairable population
models. Journal of Business Logistics 9(1), 95–117 (1988)
[7] Rajashree, K.K., Pakkala, T.P.M.: A Bayesian approach to a dynamic inventory model
under an unknown demand distribution. Computers & Operations Research 29, 403–422
(2002)
[8] Dan, Y., Xia, Y., Guoying, L.: Fiducial inference for repaired as old Weibull distributed
systems. Chinese Journal of Applied Probability and Statistics 20(2), 197–204 (2004)
[9] Xia, Y., Dan, Y., Guoying, L.: Fiducial inference for a kind of repairable equipment.
Journal of Systems Science and Mathematics Sciences 24(1), 17–27 (2004)
[10] Aronis, K.P., Magou, I., Dekker, R., et al.: Inventory control of spare parts using a
Bayesian approach: a case study. European Journal of Operational Research 154, 730–739
(2004)
The Design of the Algorithm of Creating Sudoku Puzzle
1 Introduction
Sudoku is a well-known and time-honored game. The original Sudoku puzzle is closely related to the Latin square. It first appeared as a logic-based placement puzzle in Dell Pencil Puzzles and Word Games in 1979, and in 1984 Nobuhiko Kanamoto introduced it to Japan. The modern Sudoku puzzle was devised by Howard Garns in Indianapolis in 1979. Later, Wayne Gould picked up a Japanese Sudoku magazine and became so enamored of the puzzle that he spent six years writing a program named "Pappocom", which can automatically generate puzzles of varying difficulty levels.
The aim of the Sudoku puzzle is to place a digit from 1 through 9 in each cell of a 9×9 grid made up of 3×3 sub-grids (called "blocks"), starting from various digits given in some cells (the "givens") with the others empty; each row, column, and block must contain only one instance of each numeral. A large number of mathematicians and computer engineers are now researching Sudoku puzzle problems [1-5]. In this paper, we consider the classical Sudoku puzzle with 9×9 cells.
As we know, developing an algorithm to generate Sudoku puzzles is harder than solving them. The difficult and key aspects are how to define the standard for difficulty levels and how to guarantee a unique solution for the Sudoku puzzle generated by our algorithm.
*
Corresponding author.
Fig. 1.
3 Our Algorithm
Level 2. There is only one candidate for a given row, column or box, but it is hidden among other candidates.
Level 3. If two cells in a group (row, column or block) contain an identical pair of candidates and only those two candidates, then no other cells in that group can take those values.
Level 4. Three cells in a group contain no candidates other than the same three candidates; the cells do not have to contain every candidate of the triple. If these candidates are found in other cells of the group, they can be excluded there.
Our main idea is as follows: first generate a complete Sudoku grid without any blank cell at random, and then empty cells step by step according to the difficulty level of the final puzzle we need. During this process, which we call "digging holes", we guarantee at each step that the puzzle has a unique solution.
If none of the remaining filled cells can be emptied, digging ends. The derived grid is a Sudoku puzzle we define as Level 1. For example, in Figure 4, cell B8 cannot be emptied, because its candidates would be 6 and 7 after emptying the cell. If we empty cell G9, it has only the candidate 7, so G9 can be emptied.
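A compact sketch of the "dig holes" idea follows. It is not the paper's flow-chart code: the Level 1 single-candidate test is replaced here by a full uniqueness check with a solution-counting backtracker, and the hole target is an assumed parameter.

import random

def candidates(grid, r, c):
    """Digits that can legally go in the empty cell (r, c)."""
    used = (set(grid[r]) | {grid[i][c] for i in range(9)} |
            {grid[i][j] for i in range(r//3*3, r//3*3 + 3)
                         for j in range(c//3*3, c//3*3 + 3)})
    return [d for d in range(1, 10) if d not in used]

def count_solutions(grid, limit=2):
    """Backtracking solver that stops counting at `limit` solutions."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                total = 0
                for d in candidates(grid, r, c):
                    grid[r][c] = d
                    total += count_solutions(grid, limit - total)
                    grid[r][c] = 0
                    if total >= limit:
                        break
                return total
    return 1                               # no empty cell: one complete solution

def full_grid():
    """Randomly generate a complete valid Sudoku grid by backtracking."""
    grid = [[0] * 9 for _ in range(9)]
    def fill(pos=0):
        if pos == 81:
            return True
        r, c = divmod(pos, 9)
        cands = candidates(grid, r, c)
        random.shuffle(cands)
        for d in cands:
            grid[r][c] = d
            if fill(pos + 1):
                return True
            grid[r][c] = 0
        return False
    fill()
    return grid

def dig_holes(grid, target=45):            # target number of holes (assumed)
    """Empty cells one by one, keeping a hole only if uniqueness survives."""
    cells = [(r, c) for r in range(9) for c in range(9)]
    random.shuffle(cells)
    dug = 0
    for r, c in cells:
        if dug == target:
            break
        saved, grid[r][c] = grid[r][c], 0
        if count_solutions(grid) == 1:
            dug += 1                       # hole kept: the puzzle is still unique
        else:
            grid[r][c] = saved             # restore the given and try another cell
    return grid

for row in dig_holes(full_grid()):
    print(row)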
The algorithm flow chart of Level 1 is as follows.
[Fig. 2: flow chart of the Level 1 generation algorithm — generate a random complete grid; repeatedly pick a random cell (x, y), counting attempts up to t; if the cell is still filled, tally the digits appearing in its row, its column and its 3×3 block; empty the cell only if the tallies show it would have a single candidate; otherwise keep it and continue.]
any other candidate does not satisfy this condition. If not, we restore the cell and go to Step 1; otherwise we continue with Step 1 directly.
If none of the remaining filled cells can be emptied, digging ends. The derived grid is a Sudoku puzzle we define as Level 2. The algorithm flow chart of Level 2 is as follows.
[Fig. 3: flow chart of the Level 2 generation algorithm — generate a random complete grid; repeatedly pick a random cell (x, y), counting attempts up to t; if the cell is still filled, tally the candidates of the other cells in its row, column and 3×3 block; empty the cell only if its digit is a hidden single there; otherwise keep it and continue.]
For example, in Figure 4, there is no 7 among the candidates and givens of the cells in row C except cell C5, so C5 can only be filled with 7.
Fig. 4.
Fig. 5.
State Grid Information & Telecommunication Co., Ltd., 28th Floor, Times Fortune Building, No. 1 Hang Feng Road, 100070, Fengtai District, Beijing, China
[email protected]
1 Introduction
With economic development, people will face energy shortages in the 21st century. The sustainable development of energy has become a worldwide focus, and energy saving, emission reduction and the low-carbon economy have become an international trend.
There is a lack of interaction between the traditional grid and its users: users cannot obtain real-time information on their power consumption, much less control home appliances remotely, and as a result energy is wasted everywhere. Therefore, the State Grid Corporation is actively changing the development mode of power, and the smart grid, which makes up for the deficiencies of the traditional grid, has come into being. It can realize real-time interactive response between user and grid, enhance the comprehensive service capacity of the network, and meet the demand for interactive marketing. The construction of the user-side communication network is the
*
Jianming Liu (1955–), male, from Rongcheng, Shandong Province; senior engineer (professor level) and doctoral tutor, researching power system information and communication technologies, and smart grid research, application and promotion.
foundation of intelligent electricity use, and is the premise of and guarantee for the informatization, automation and interactivity of power use.
At this stage, in the field of smart grid construction, developed countries concentrate mainly on household energy management and renewable-energy access; in our country, intelligent information gathering, the smart home, electricity value-added services and so on have gone through a period of early exploration and pilot construction. With the development of intelligent electricity services, the need for service information integration and interaction will increase. A safe, reliable, high-bandwidth and practical intelligent power communication network urgently needs to be developed and constructed to achieve good interaction between user and grid.
With the development of "Triple Play", more and more related value-added services can be used in the smart consumption interaction system; these will develop considerably in the coming years, while the power information collection services and smart home appliances are already in use. These services differ in their requirements on the network in terms of security, stability and real-time performance.
Table 1 shows the requirements that intelligent use of electric power places on the communication network. The two-way interactive communication system for intelligent power use is characterized by a wide range of node types, diverse services, concentrated data flows, and differing bandwidth requirements. To meet the demands of these services, the system construction should be combined with the service characteristics, adopting appropriate means of communication to complete a communication network system matching the characteristics of the power system.
Table 1. Requirements of intelligent power services on the communication network

Service                                          Bandwidth                                              Security                          Stability   Real time
Power information collection                     one-way meter: 573 bps; three-phase meter: 684 bps    high                              middle      low
Smart home appliance                             about 10 kbps                                          high                              middle      low
"Triple Play" and related value-added services   phone: 100 bps; IPTV (HD): 8 Mbps;                     according to the content supplier high        high
                                                 Internet: according to the customer's needs
The scope of application of narrowband power line carrier technology is greatly limited by its low transmission rate, short transmission distance, and stability visibly affected by the power network. However, for communication services with low requirements on reliability and data bandwidth, narrowband technology can be applied because it is simple to construct and avoids new wiring. It is mainly applied in the functional modules of electricity information collection and the smart home system.
Fig. 2. Overall Network Architecture based on Optical Fiber Composite Low-Voltage Cable
From the power company to the user side, the demands on channel bandwidth, real-time performance, safety, and reliability are high. Optical fiber communication technology can meet the two-way interactive functions of intelligent power consumption and support the construction of the smart power grid. Besides, it can meet the demand of intelligent power consumption and carry triple-play services.
The OPLC (Optical Fiber Composite Low-Voltage Cable) is laid on the communication lines of 10 kV and below, that is, from the 10 kV substation to the power distribution boxes in buildings, then to the meter boxes on each floor, and finally to the distribution boxes in users' houses. Optical fiber access can follow the power lines as they are installed, which reduces wiring construction and material costs.
The OLT, deployed in the 10 kV or 110 kV substation, provides centralized access to the external networks (Internet, radio and television network, telecommunication network) and performs optical/electrical conversion, bandwidth allocation, control of channel connections, and real-time monitoring, management, and maintenance. The ONUs, deployed in users' houses and at the meter boxes on each floor, realize the transparent transfer of user data, voice and video services, and the upload of meter data. The optical interface in the fiber power meter, which integrates the optical network unit (ONU) function, can communicate directly with the OLT to realize the acquisition and control of the meter data.
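As a rough, illustrative sanity check (our own arithmetic, not from the paper), the per-household load implied by Table 1 sits far below the per-user share of a standard 1 Gbit/s EPON; the EPON rate and split ratio below are assumptions:

# Per-household load from Table 1, in bit/s (illustrative only).
meter      = 684         # three-way (three-phase) meter reading
smart_home = 10_000      # smart home appliance traffic, about 10 kbps
phone      = 100         # voice service per Table 1
iptv_hd    = 8_000_000   # one HD IPTV stream
load = meter + smart_home + phone + iptv_hd

epon_downstream = 1_000_000_000   # assumed standard 1 Gbit/s EPON
split_ratio     = 64              # assumed ONUs sharing one OLT port
per_onu = epon_downstream / split_ratio

print(f"load per household ~ {load / 1e6:.2f} Mbit/s")    # ~8.01 Mbit/s
print(f"share per ONU      ~ {per_onu / 1e6:.2f} Mbit/s") # ~15.63 Mbit/s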
In the user's house, the network diagram of devices such as the control center, intelligent interactive terminals, intelligent interactive set-top boxes, smart sockets, and handheld terminals is shown in Figure 3. The control center is connected to the interactive terminal equipment via Ethernet or WIFI to realize the terminal operations. Owing to its lower bandwidth demand, the power information gathering business adopts power line communication technology to avoid re-wiring. Within the small indoor range, intelligent appliance control, home security, and water and gas metering should be realized. Because of the complexity of
5 Conclusions
This article has analyzed the characteristics of the electric power communication network and the business requirements of intelligent power consumption on the user side, and has proposed a communication network structure for the smart two-way interactive service system. By selecting the best networking solution according to the applicable scope of each communication technology, a flexible and reliable communication network system integrating OPLC, EPON, and micro-power wireless communication has been built; it has been verified in pilot projects and has achieved good results. This smart two-way interactive service system based on multiple communication technologies is significant for the development of the smart grid: it integrates several subsystems, realizes intelligent power consumption, and supports the construction of the smart grid.
References
1. Chen, L.: The Design of Intelligent Village and Intelligent Building System. China Architecture & Building Press, Beijing (2000)
2. Zhang, F., Zhang, C.: Study on the short range wireless communication technique and its
merging developing trends. Electrical Measurement & Instrumentation 10, 48–52 (2007)
3. Qi, M., Qi, C., Huang, T.: Home automation system based on power line carrier communi-
cation technology. Electric Power Automation Equipment 25(3), 72–75 (2005)
4. Pu, L.: Design of Wireless Communication Protocol for Home LAN. Intelligent Ubiquitous
Computing and Education, 374–377 (2009)
5. Liu, J., Zhao, B., Li, X.: The report of transmission power line carrier-current communica-
tion test in State Grid Information & Telecommunication Co., LTD, 5 (2009)
6. Akyildiz, I., Su, W., Sankarasubramaniam, Y., et al.: Wireless Sensor Networks: A Survey. Computer Networks 38(4), 393–422 (2002)
A Micro Wireless Video Transmission System
1 Introduction
With the development of multimedia technology, video surveillance has been widely applied in factory workshops, road traffic, banks, mines, malls, airport security, hospitals, and so on [1-3]. However, there are important locations, such as unmanned substations and mobile cars or ships, that wired surveillance has not reached or cannot reach; and for unattended patients, old people, robots, and so on, wired lines are impractical. Wireless video transmission is then the best option.
With the development of wireless communication and networking systems and of video coding technology, wireless video surveillance has become feasible. For short-range wireless communication, Bluetooth has been used [4], but it is not practical for the high bandwidth needed for video surveillance. At the end of the last century, the second generation (2G) brought digital mobile communication, including the GSM, GPRS, and CDMA networks; reference [5] introduced the use of these three networks for wireless video, where the maximum data transmission rate was no more than 200 kbit/s [6-8].
At the beginning of this century, the third generation (3G) is characterized by its ability to carry data at much higher rates: it enables multimedia communication with bit rates from 144 to 384 kbps outdoors and up to 2 Mbps indoors [4]. Although 2G and 3G radio access networks are now common ways for users to access data, accessing more data usually costs more money [5], and the pricing trend is progressing toward the unlimited Internet model. WLANs are of different types, such as 802.11, 802.11b, 802.11g, 802.11a, and 802.11n networks,
where the maximum data transmission rates are 1, 2, 11, 54, and 600 Mbps, respectively [9]. IEEE 802.11n is a new high-speed transmission scheme used in WLANs, focused on general data transmission and wireless Internet services.
In order to transmit more video channels or higher-quality video representations within existing digital transmission capacities, video compression technologies have evolved over the last decade through the series MPEG-1, MPEG-2, MPEG-4, and H.264 [10]. H.264/AVC is the newest international video coding standard [11]. References [10, 12-14] show that H.264 achieves substantially better video quality than H.263, MPEG-2, and MPEG-4, saving up to 50% in bit rate compared with H.263 or MPEG-4 coding schemes; in other words, H.264 offers significantly higher coding quality at the same bit rate [13]. Therefore, this paper uses H.264 to compress the video before wireless transmission.
On the other hand, most wired video surveillance devices are bulky, so they cannot be used in small locations or for portable surveillance. This paper uses TI DaVinci technology to process the video and applies H.264 for video compression. The compressed video packets are then converted to Real-time Transport Protocol (RTP) format [15-16] and transmitted to the PC client through the WIFI communication module. All the hardware was designed and built into a micro wireless video transmission server, and VC++ was used to design the video surveillance interface on the PC client.
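As an illustration of the packetization step, here is a minimal sketch of wrapping one compressed unit in the 12-byte RTP header of RFC 3550 (payload type 96 and the 90 kHz clock are common conventions for H.264, not values stated in the paper; fragmentation of large units is omitted):

import struct

def rtp_packet(payload: bytes, seq: int, timestamp: int,
               ssrc: int = 0x12345678, payload_type: int = 96) -> bytes:
    # 12-byte RTP header (RFC 3550): V=2, no padding/extension/CSRC,
    # marker clear, then sequence number, timestamp, and SSRC.
    byte0 = 2 << 6
    byte1 = payload_type & 0x7F
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload

# With the 90 kHz RTP clock, the timestamp advances 3600 per frame at 25 fps.
pkt = rtp_packet(b"<one H.264 NAL unit>", seq=0, timestamp=0)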
(Figure: cameras and the wireless video transmission server connect through an AP over the WIFI network to the PC client, which receives/sends packets and encodes/decodes.)
and 5 V in order to save power. The cameras and LCD are powered off when they are not needed. A battery-monitoring ADC watches the battery voltage to prevent the system from being shut down by a sudden power loss. At the same time, to preserve the lithium battery's life, the user is prompted to replace the battery when its voltage drops to 3.4 V; if the voltage drops to 3.3 V, the system shuts down automatically. The power module diagram is shown in Fig. 3.
Fig. 3. System hardware configuration
Fig. 4. The circuit board of the wireless video transmission server
3 System Software
A Client/Server (C/S) structure was used to design the system software, divided into server-side software and client software, i.e., the wireless video transmission server and the PC client. The server-side software consists of four modules: video capture, video compression, video transmission, and control processing. The client-side software was designed with three functional modules: video receiving, video decoding, and video displaying.
Several operating systems are available for the current embedded environment; embedded Linux is one of them, promoted as inexpensive, high quality, reliable, widely available, and well supported. The server therefore uses the embedded Linux operating system to realize multi-threaded operation.
The server-side software flow chart is shown in Fig. 5. After the Linux system boots and the hardware of the processor module is initialized, the system calls "Server_initWifi" to initialize the hardware and configure the network parameters of the WIFI module. Then "Server_initVideoEncode" initializes video encoding; at the same time it creates a video-coding thread to capture video and initializes the video encoding queue that stores the encoded video. Thirdly, "Server_initUserinfo" and "Server_initCheckClientHeart" initialize the user information and create the client-heartbeat checking thread, respectively. Finally, "Server_initNet" initializes the network, creates the listen thread, and waits for the PC client to connect.
(Fig. 5 flow: Start → Server_initWifi → Server_initVideoEncode → Server_initUserinfo → Server_initCheckClientHeart → Server_initNet → End.)
Fig. 5. Server-side software flow chart
Fig. 6. Wireless video transmission process
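A minimal sketch of the start-up sequence of Fig. 5, with Python stand-ins for the server routines named above (the real system implements them in C on embedded Linux; every body here is a placeholder):

import queue
import threading
import time

encoded_frames: queue.Queue = queue.Queue()   # queue storing encoded video

def server_init_wifi():
    pass   # stand-in for Server_initWifi: configure the WIFI module

def server_init_video_encode():
    # Stand-in for Server_initVideoEncode: start the video-coding thread
    # that captures frames and fills the encoded-frame queue.
    def capture_and_encode():
        while True:
            encoded_frames.put(b"\x00" * 1024)   # placeholder H.264 frame
            time.sleep(1 / 25)                   # 25 fps, as in the experiment
    threading.Thread(target=capture_and_encode, daemon=True).start()

def server_init_userinfo():
    pass   # stand-in for Server_initUserinfo

def server_init_check_client_heart():
    pass   # stand-in for Server_initCheckClientHeart

def server_init_net():
    pass   # stand-in for Server_initNet: listen and wait for the PC client

if __name__ == "__main__":
    server_init_wifi()
    server_init_video_encode()
    server_init_userinfo()
    server_init_check_client_heart()
    server_init_net()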
When the user wants to close the client, tearing down the link between client and server is an interactive process. First, the client sends an exit command through the control-socket thread; when the server receives the command, it returns the appropriate information accepting the exit command. After that, it releases the video thread and then exits the control thread.
The client software is designed in three layers: the hardware layer, the soft control and decoding layer, and the soft interface layer, as shown in Fig. 7. The hardware layer receives the video packets and control packets by calling the Windows API. The soft control and decoding layer is responsible for packing and sending the control commands from the soft interface layer, and for receiving video data packets and decoding them for timely display in the soft interface layer.
The soft control and decoding layer consists of a video decoding module and a control module, which run as different threads. The control threads take the control commands from the "Control Button" to change the video bit rate, frame rate, encoding mode, and so on. Each control thread has a specific socket, through which the client and server communicate. In order to let the client monitor several servers simultaneously, and to exclude interference from different wireless networks, the client must log in and be certified before the connection with the server is established; the listening port for authentication is the port bound to the socket. For each user who passes authentication, a control thread is created. When users have logged in and been certified, they can receive the video data packets sent from the server.
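A sketch of the listening/authentication logic described above (the port number, credential format, and message framing are our assumptions):

import socket
import threading

def control_thread(conn: socket.socket) -> None:
    # One control thread per authenticated client: receive commands
    # (bit rate, frame rate, encoding mode, ...) and acknowledge them.
    with conn:
        while True:
            cmd = conn.recv(64)
            if not cmd or cmd == b"EXIT":   # client-initiated teardown
                conn.sendall(b"EXIT_OK")    # acknowledge, then release
                break
            conn.sendall(b"OK")

def listen_and_authenticate(port: int = 5000) -> None:
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", port))       # the authentication port bound to the socket
    srv.listen()
    while True:
        conn, _ = srv.accept()
        if conn.recv(64) == b"user:password":   # assumed credential check
            conn.sendall(b"AUTH_OK")
            threading.Thread(target=control_thread, args=(conn,),
                             daemon=True).start()
        else:
            conn.close()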
In the experiment, the client and server form a wireless LAN and are set on the same IP network segment. The service set identifier (SSID) of the server is CQUvideo, so the SSID of the wireless router is also set to CQUvideo. The wireless transmission rate is set to 25 captured video frames per second. The server start-up can be monitored by the client through a hyper terminal.
When the client software runs, it connects to the server to log in and be authenticated. The total number of sampled frames is set to 250, and each captured frame is 352×288. It takes about 4 s to capture, transmit, and display all the frames. The client software interface is shown in Fig. 8. Judging from the surveillance video, the wireless transmission is normal and reliable; because RTP is used to transmit the video, the video is almost synchronous with the scene, with only about 120 ms delay, which is almost negligible. The system can thus monitor real-time wireless video transmission.
5 Conclusions
This paper realized a micro wireless video transmission system. Compared with conventional wired surveillance systems, the system uses WIFI to transmit video and offers four significant advantages. First, the server is small and miniaturized, which allows it to be used in small locations or as robot eyes. Second, it is convenient for users to build a network with an Internet LAN based on WIFI wireless communications. Third, battery operation makes it applicable where no mains power is available. Fourth, the server is completely wireless, needing no wiring and saving a great deal of work.
In future work, the system will be applied to industrial production control, robot vision, medical video transmission, and other fields, and will be improved and upgraded for each specific application.
References
1. Foresti, G.L.: Object Recognition and Tracking for Remote Video Surveillance. IEEE
Transactions on Circuits and Systems for Video Technology 9, 1045–1062 (1999)
2. Haering, N., Venetianer, P.L., Lipton, A.: The evolution of video surveillance: an overview. Machine Vision and Applications 19, 279–290 (2008)
3. Remagnino, P., Velastin, S.A., Trivedi, M.: Novel concepts and challenges for the next generation of video surveillance systems. Machine Vision and Applications 18, 135–137 (2007)
4. Budagavi, M., Heinzelman, W.R., Webb, J., Talluri, R.: Wireless MPEG-4 video communication on DSP chips. IEEE Signal Processing Magazine 17, 36–53 (2000)
5. Etoh, M., Yoshimura, T.: Wireless video applications in 3G and beyond. IEEE Wireless
Communications 12, 66–72 (2005)
6. Lehtoranta, O., Suhonen, J., Hännikäinen, M., Lappalainen, V., Hämäläinen, T.D.: Comparison of video protection methods for wireless networks. Signal Processing: Image Communication 18, 861–877 (2003)
7. Erdmann, C., Vary, P., Fischer, K., Xu, W., Marke, M., Fingscheidt, T., Varga, I., Kaindl,
M., Quinquis, C., Kövesi, B., Massaloux, D.: A candidate proposal for a 3GPP adaptive
multi-rate wideband speech codec. IEEE International Conference on Acoustics 2, 757–
760 (2001)
8. Rahnema, M.: Overview of the GSM System and Protocol Architecture. IEEE Communi-
cations Magazine 31, 92–100 (1993)
9. Lin, C.-F., Hung, S.-I., Chiang, I.-H.: An 802.11n wireless local area network transmission
scheme for wireless telemedicine applications. Proceedings of the Institution of Mechani-
cal Engineers, Part H: Journal of Engineering in Medicine 224, 1201–1208 (2010)
10. Wiegand, T., Sullivan, G.J., Bjontegaard, G., Luthra, A.: Overview of the H.264/AVC Video Coding Standard. IEEE Transactions on Circuits and Systems for Video Technology 13, 560–576 (2003)
11. Joint Video Team of ITU-T and ISO/IEC JTC 1: Draft ITU-T Recommendation and Final
Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC
14496-10 AVC). Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT-
G050 (2003)
12. Schwarz, H., Marpe, D., Wiegand, T.: Overview of the scalable video coding extension of the H.264/AVC standard. IEEE Transactions on Circuits and Systems for Video Technology 17, 1103–1120 (2007)
13. Yu, H., Lin, Z., Pan, F.: Applications and Improvement of H.264 in Medical Video Compression. IEEE Transactions on Circuits and Systems I: Regular Papers 52, 2707–2716 (2005)
14. Saponara, S., Blanch, C., Denolf, K., Bormans, J.: The JVT advanced video coding stan-
dard: complexity and performance analysis on a tool-by-tool basis. In: Proc. 13th Int.
Packetvideo Workshop, Nantes, France (2003)
15. Basso, A., Cash, G.L., Civanlar, M.R.: Real-time MPEG-2 delivery based on RTP: Im-
plementation issues. Signal Processing: Image Communication 15, 165–178 (1999)
16. Busse, I., Deffner, B., Schulzrinne, H.: Dynamic QoS control of multimedia applications
based on RTP. Computer Communications 19, 49–58 (1996)
Inclusion Principle for Dynamic Graphs
1 School of Control Science and Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
[email protected]
2 School of Electronics and Information Engineering, Liaoning University of Science and Technology, Anshan, Liaoning 114051, China
[email protected]
Abstract. From the point of view of graphs, complex systems can be described by dynamic graphs. The correlative theory of dynamic graphs is therefore introduced, and an inclusion principle for dynamic graphs is provided. Based on the inclusion principle and a permuted transformation, a decomposition method for dynamic graphs is proposed. With this approach, a graph can be decomposed into a series of pair-wise subgraphs with the desired recurrent reverse order in the expanded space of the graph. These decoupled pair-wise subgraphs may be designed to have their own controllers or coordinators. This provides a theoretical framework for the decomposition of complex systems and is also convenient for the decentralized control or coordination of complex systems.
1 Introduction
Research on complex systems has currently become a worldwide focus. A complex system may be interpreted as composed of multiple small systems, and these subsystems are generally easier to study, whether analytically, experimentally, or computationally. However, there is hitherto no uniform definition of complex systems in the literature. For the purposes of this paper, we call a system complex when it consists of multiple homogeneous subsystems with many dynamic interconnections between them, such as multi-agent systems, electric power systems, multi-vehicle systems, etc. [1-2]. In general, the dynamic interconnections are time-variant. Since complex systems have high dimensions and variable topology constraints, traditional centralized control methods are obviously not suited to them, and decentralized control has become the method in common use. Thus, how to decompose such systems is worth researching; in addition, decomposition is the premise of control or coordination.
From the point of view of graphs, complex systems may be described by dynamic graphs, where the vertices and edges denote the subsystems and the dynamic interconnections, respectively. Thus, when complex systems are discussed, they may be abstracted as dynamic graphs. Based on the above, we will provide a decomposition method for dynamic graphs, with which complex systems and complex networks can be decomposed.
2 Dynamic Graphs
Consider a weighted directed graph D = (V, E), an ordered pair with N vertices in the set V and edges in the set E. The vertices {v_1, v_2, ..., v_N} are connected by edges (v_i, v_j), each edge being oriented from v_j to v_i, where i, j ∈ {1, 2, ..., N}. We assign a weight e_{ij} to each edge (v_j, v_i) ∈ D, while e_{ij} = 0 if (v_j, v_i) ∉ D.
First let us define a space Ω of graphs with a fixed number N of vertices as a linear space over the field R of real numbers. For any D_1, D_2 ∈ Ω there is a unique graph (D_1 + D_2) ∈ Ω called the sum of D_1 and D_2, and for any D ∈ Ω and any α ∈ R there is a unique graph αD ∈ Ω called the multiplication of the graph D by the scalar α.
Consider the space Ω of graphs and a family of mappings Φ(t, D) which assigns to any graph D ∈ Ω and any time t ∈ R a graph Φ ∈ Ω. Following [3], the definition of dynamic graphs is given below.
Definition 1. A dynamic graph D is a one-parameter mapping Φ : R × Ω → Ω of the space Ω into itself satisfying: (i) Φ(t_0, D_0) = D_0, ∀t_0 ∈ R, ∀D_0 ∈ Ω; (ii) Φ(t, D) is continuous, ∀t ∈ R, D ∈ Ω; (iii) Φ(t_2, Φ(t_1, D)) = Φ(t_1 + t_2, D), ∀t_1, t_2 ∈ R, ∀D ∈ Ω.
Definition 2. The number of other vertices connected to the vertex v_i is called its degree and is denoted by d_{v_i}. A graph D' = (V', E') with V' ⊆ V and E' ⊆ E is called a subgraph of D.
With the dynamic graph D we associate the isomorphic concept of the N × N adjacency matrix E = (e_{ij}); following [3], the graph can be stated in the bordered form

D^{(N+1)} = \begin{bmatrix} 0 & V_{(N)} \\ V_{(N)}^{T} & E_{N\times N} \end{bmatrix}
          = \begin{bmatrix} 0 & v_1 & v_2 & \cdots & v_{N-1} & v_N \\
                            v_1 & 0 & e_{12} & \cdots & e_{1,N-1} & e_{1N} \\
                            v_2 & e_{21} & 0 & \cdots & e_{2,N-1} & e_{2N} \\
                            \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
                            v_{N-1} & e_{N-1,1} & e_{N-1,2} & \cdots & 0 & e_{N-1,N} \\
                            v_N & e_{N1} & e_{N2} & \cdots & e_{N,N-1} & 0 \end{bmatrix}   (1)
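A small sketch of assembling representation (1) (our own numpy helper, purely illustrative):

import numpy as np

def dynamic_graph_matrix(v: np.ndarray, E: np.ndarray) -> np.ndarray:
    # Border the N x N adjacency matrix E with the vertex vector V_(N),
    # as in (1): first row/column carry the vertices, the rest is E.
    N = len(v)
    D = np.zeros((N + 1, N + 1))
    D[0, 1:] = v          # top border: V_(N)
    D[1:, 0] = v          # left border: V_(N)^T
    D[1:, 1:] = E         # adjacency weights e_ij (zero diagonal)
    return D

# Example: three vertices and weights e_12 (edge v2 -> v1), e_31 (v1 -> v3).
v = np.array([1.0, 2.0, 3.0])
E = np.array([[0.0, 0.5, 0.0],
              [0.0, 0.0, 0.0],
              [0.7, 0.0, 0.0]])
D4 = dynamic_graph_matrix(v, E)   # the 4 x 4 matrix D^(N+1) of (1)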
\bar D^{(\bar N+1)} = \begin{bmatrix} 0 & \bar V_{(\bar N)} \\ \bar V_{(\bar N)}^{T} & \bar E_{\bar N\times\bar N} \end{bmatrix}   (2)
\bar V - V = \{v_{N+1}, v_{N+2}, \ldots, v_{\bar N}\},
\bar E - E = \{e_{ij}, e_{ji} \mid i = 1,\ldots,N;\ j = N+1,\ldots,\bar N\} \cup \{e_{ij}, e_{ji} \mid i,j = N+1,\ldots,\bar N\}   (3)
S_i:\ \dot x_i = f_i(x_i,u_i,t) + \sum_{j=1,\,j\ne i}^{N} e_{ij}\,h_j(x_j,u_j,t);\quad y_i = g_i(x_i,t),\ i = 1,2,\ldots,N   (4)
where x_i ∈ R^{n_i}, u_i ∈ R^{m_i} and y_i ∈ R^{l_i} are the state, input, and output vectors of the i-th subsystem, respectively; f_i(x_i,u_i,t), h_j(x_j,u_j,t) and g_i(x_i,t) are proper functions, which may be linear or nonlinear; e_{ij} is the dynamic interconnection coefficient between subsystems i and j, a function of time t and/or state x; e_{ii} = 0 means the i-th subsystem has no self-connection. The variables satisfy
n = \sum_{i=1}^{N} n_i,\quad m = \sum_{i=1}^{N} m_i,\quad l = \sum_{i=1}^{N} l_i,   (5)
x = [x_1^T, x_2^T, \ldots, x_N^T]^T,\quad u = [u_1^T, u_2^T, \ldots, u_N^T]^T,\quad y = [y_1^T, y_2^T, \ldots, y_N^T]^T
where x ∈ R^n, u ∈ R^m and y ∈ R^l are the state, input, and output vectors of the overall system, respectively. The matrix form of system (4) can be described by
S:\ \dot x = f(x,u,t) + E \cdot h(x,u,t);\quad y = g(x,t),\ (N \ge 3)   (6)
f(x,u,t) = [f_1,\ldots,f_N]^T,\quad h(x,u,t) = [h_1,\ldots,h_N]^T,\quad g(x,t) = [g_1,\ldots,g_N]^T.   (7)
According to the concept of a dynamic graph, if we use vertices to denote the subsystems and the e_{ij} to denote edges, a complex dynamic system can be represented by a dynamic graph. In order to decompose the system, we next propose an inclusion principle for dynamic graphs, by which complex systems can be decomposed into a series of pair-wise subsystems.
Definition 5. In its expanded space of graphs, the dynamic graph D^{(N+1)} is said to consist of N(N−1)/2 pair-wise subgraphs D_{ij} if the degree d_{v_i} of each vertex v_i is N, that is, at least e_{ij} ≠ 0 and/or e_{ji} ≠ 0. The subscript of D_{ij} is called a pair sequence.
Definition 5 implies that D^{(N+1)} can be taken as a multi-overlapping graph of the D_{ij}. This idea of multi-overlapping provides a decomposition method for dynamic graphs. According to the pair sequence of D_{ij} in Definition 5, we give a special sequence of the pair-wise subgraphs with recurrent reverse order subscripts.
Definition 6. In its expanded space of graphs, the dynamic graph D^{(N+1)} is said to consist of N(N−1)/2 pair-wise subgraphs with recurrent reverse order subscripts if its pair-wise subgraphs D_{ij} are arranged as
The special sequence of D_{ij} allows a rearrangement of the pair-wise subgraphs when the last few vertices are disconnected from the dynamic graph D^{(N+1)}. Furthermore, it also benefits a reconstruction when new vertices are added to the dynamic graph D^{(N+1)}, since the dynamic graph can be expanded by adding vertices.
In order to obtain pair-wise subgraphs from dynamic graphs, let us introduce the inclusion principle of dynamic graphs and the recurrent reverse order transform. Based on the correlative theory in the literature [4-10] and system (4), the inclusion principle of dynamic graphs is given as follows:
Theorem 1. The expanded dynamic graph \bar D^{(\bar N+1)} = (\bar V, \bar E) includes the dynamic graph D^{(N+1)} = (V, E), written \bar D^{(\bar N+1)} ⊃ D^{(N+1)}, if there exists a pair of full-rank matrices {R, S} satisfying SR = I_N such that, for any D_0^{(N+1)} = (V_0, E_0), the condition \bar E_0 = RE_0 implies E(t, E_0) = S\,\bar E(t, \bar E_0) and V_{(N)}^{T} = S\,\bar V_{(\bar N)}^{T} for all t ≥ t_0; here the degree d_{v_i} of each vertex v_i is N − 1.
The relations between \bar D^{(\bar N+1)} and D^{(N+1)} can be represented by
where
\bar V_{(\bar N)} = (\underbrace{v_1,\ldots,v_1}_{N-1},\ \underbrace{v_2,\ldots,v_2}_{N-1},\ \ldots,\ \underbrace{v_N,\ldots,v_N}_{N-1}),\qquad \bar E = (\bar E_{ij})_{N\times N},\ i,j = 1,2,\ldots,N,   (12)
where the M_{ij}^{E} are also block matrices with dimension (N-1)n_i \times (N-1)n_j.
The pair-wise subgraphs are obtained by permutation. First of all, let us introduce the permuted transform before giving the permuted inclusion principle [4,7,10].
Definition 7. By partitioning an identity matrix I_{n\times n} into M sub-identity matrices I_1, \ldots, I_k, \ldots, I_M with proper dimensions, we call
p_{k(k+1)} = \mathrm{blockdiag}\!\left(I_1, \ldots, I_{k-1}, \begin{bmatrix} 0 & I_k \\ I_{k+1} & 0 \end{bmatrix}, I_{k+2}, \ldots, I_M\right),
p_{k(k+1)}^{-1} = \mathrm{blockdiag}\!\left(I_1, \ldots, I_{k-1}, \begin{bmatrix} 0 & I_{k+1} \\ I_k & 0 \end{bmatrix}, I_{k+2}, \ldots, I_M\right)   (14)
the basic column exchange matrix and the basic row exchange matrix, respectively; their products are column group permutation matrices and row group permutation matrices, respectively.
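A sketch of the basic exchange matrix of (14) for blocks of possibly different sizes (our own numpy construction, not from [4,7,10]):

import numpy as np

def basic_exchange(sizes: list, k: int) -> np.ndarray:
    # p_{k(k+1)} of (14): identities I_1..I_{k-1}, then the exchange block
    # [[0, I_k], [I_{k+1}, 0]], then I_{k+2}..I_M (k is 1-based).
    nk, nk1 = sizes[k - 1], sizes[k]
    swap = np.zeros((nk + nk1, nk + nk1))
    swap[:nk, nk1:] = np.eye(nk)       # upper-right block: I_k
    swap[nk:, :nk1] = np.eye(nk1)      # lower-left block: I_{k+1}
    blocks = ([np.eye(s) for s in sizes[:k - 1]] + [swap]
              + [np.eye(s) for s in sizes[k + 1:]])
    n = sum(sizes)
    P = np.zeros((n, n))
    pos = 0
    for b in blocks:
        m = b.shape[0]
        P[pos:pos + m, pos:pos + m] = b
        pos += m
    return P

# Right-multiplying a matrix by p_{k(k+1)} exchanges its k-th and (k+1)-th
# column groups; multiplying by the transpose exchanges the row groups.
P23 = basic_exchange([2, 3, 3, 2], k=2)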
If we want to obtain the special sequence of D_{ij} in Definition 6, the following transforms can be used:
P = \prod_{i=1}^{N-2}\prod_{j=1}^{N-i-1}\prod_{k=(N-i)-i(j-1)}^{N+i-i(j+1)-1} p_{k(k+1)},\qquad
P^{-1} = \prod_{i=1}^{N-2}\prod_{j=1}^{N-i-1}\prod_{k=(N-i)-i(j-1)}^{N+i-i(j+1)-1} p_{k(k+1)}^{T}   (16)
6 Conclusions
This paper has proposed a decomposition method for dynamic graphs based on the permuted inclusion principle, which can be used to decompose complex systems. It provides a theoretical framework for researching the decentralized control or coordination of complex systems and complex networks.
Acknowledgment
The research reported herein was supported by the NSF of China under Grant No. 60874017.
References
[1] Zhang, Z.D., Jia, L.M., Chai, Y.Y.: On General Control Methodology for Complex Systems. In: Proceedings of the 27th Chinese Control Conference, Kunming, Yunnan, China, pp. 504–508 (2008)
[2] Ouyang, X.Y., Chen, X.B., Wang, W.: Modeling and decomposition of complex dynamic
interconnected systems. In: The 13th IFAC Symposium on Information Control Problems
in Manufacturing, Moscow, Russia, pp. 1006–1011 (2009)
[3] Šiljak, D.D.: Dynamic graphs. Nonlinear Analysis: Hybrid Systems 2, 544–567 (2008)
[4] Chen, X.B., Stankovic, S.S.: Decomposition and decentralized control of systems with
multi-overlapping structure. Automatica 41, 1765–1772 (2005)
[5] Chen, X.B., Stankovic, S.S.: Dual inclusion principle for overlapping interconnected systems. I. J. Control 77(13), 1212–1222 (2004)
[6] Ikeda, M., Šiljak, D.D., White, D.E.: Decentralized control with overlapping information
sets. J. Optimization Theory and Applications. 34(2), 279–310 (1981)
[7] Chen, X.-B., Stankovic, S.S.: Overlapping decentralized approach to automation
generation control of multi-area power systems. I.J. Control 80(3), 386–402 (2007)
[8] Chen, X., Stankovic, S.S.: Inclusion principle of stochastic discrete-time systems. Acta
Automatica Sinica 23(1), 94–98 (1997)
[9] Ikeda, M., Šiljak, D.D.: Lotka-Volterra Equations: Decomposition, Stability, and Structure. Journal of Mathematical Biology 9(1), 65–83 (1980)
[10] Chen, X.B., Xu, W.B., Huang, T.Y., Ouyang, X.Y., Stankovic, S.S.: Pair-wise
decomposition for coordinated control of complex systems. Submitted to Information
Sciences (2010)
Lie Triple Derivations for the Parabolic
Subalgebras of gl(n, R)
1 Introduction
Let R be a commutative ring with identity, gl(n, R) the general linear Lie algebra
consisting of all n × n matrices over R and with the bracket operation [x, y] =
xy − yx. We denote by E the identity matrix in gl(n, R), Ei,j the matrix in
gl(n, R) whose sole nonzero entry 1 is in the (i, j) position, t the subset of
gl(n, R) consisting of all n × n upper triangular matrices over R, and d the
subset of gl(n, R) consisting of all n × n diagonal matrices over R.
Recently, significant work has been done in studying the derivations of general
linear Lie algebra and its subalgebras, such as the subalgebra consisting of all
n × n upper triangular matrices (see [1,2,3]), the parabolic subalgebra (see [4])
and Lie triple derivations of nest algebras and TUHF algebras (see [5,6,7]). In
[8], Li and Wang described the generalized Lie triple derivations for the maximal
nilpotent subalgebras of classical complex simple Lie algebras.
The purpose of this paper is to describe any Lie triple derivation for the
parabolic subalgebra P of gl(n, R). The main result can be summarized as
follows.
Theorem 1. Let R be a commutative ring with identity and let P = t + \sum_{1\le i<j\le n} A_{j,i}E_{j,i} be a parabolic subalgebra of gl(n, R), with Φ = \{A_{j,i} ∈ I(R) \mid 1 ≤ i < j ≤ n\} a flag of ideals of R. Then every Lie triple derivation φ of P can be uniquely written as follows.
This work is supported by a grant-in-aid for Innovation Fund from Guangxi Univer-
sity for Nationalities, China.
Lemma 1. Define η_χ : P → P by
η_χ(x) = χ(D_x)E + \sum_{1\le i,j\le n} ((j-i+1) \bmod 2)\, r\, a_{i,j}E_{i,j},
where D_x denotes the projection of x to d (D_x = \sum_{i=1}^{n} a_{i,i}E_{i,i} when x = \sum_{1\le i,j\le n} a_{i,j}E_{i,j}) and 2r = 0, r ∈ R. Then η_χ is a Lie triple derivation, provided that χ is suitable for central triple derivations.
Proof. Let
x = \sum_{1\le i,j\le n} a_{i,j}E_{i,j},\quad y = \sum_{1\le i,j\le n} b_{i,j}E_{i,j},\quad z = \sum_{1\le i,j\le n} c_{i,j}E_{i,j},   (2)
where all a_{i,j}, b_{i,j}, c_{i,j} lie in R and a_{q,p}, b_{q,p}, c_{q,p} ∈ A_{q,p} for 1 ≤ p < q ≤ n. Note that
[[x,y],z] = \sum_{1\le i,j\le n} d_{i,j}E_{i,j},   (3)
where
d_{i,j} = \sum_{1\le r,k\le n} (a_{i,r}b_{r,k}c_{k,j} - b_{i,r}a_{r,k}c_{k,j} - c_{i,k}a_{k,r}b_{r,j} + c_{i,k}b_{k,r}a_{r,j}).   (4)
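For readability, (4) is simply the entrywise expansion of the double bracket: since
[[x, y], z] = (xy - yx)z - z(xy - yx) = xyz - yxz - zxy + zyx,
and (xyz)_{i,j} = \sum_{1\le r,k\le n} a_{i,r} b_{r,k} c_{k,j} (and similarly for the other three products), the four terms of (4) are exactly the (i, j) entries of these four matrix products.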
Let
B = (b_{i,j}) = \begin{pmatrix} d_{1,1}^{(r,1)} & d_{1,1}^{(r,2)} & \cdots & d_{1,1}^{(r,n)} \\ d_{2,2}^{(r,1)} & d_{2,2}^{(r,2)} & \cdots & d_{2,2}^{(r,n)} \\ \vdots & \vdots & & \vdots \\ d_{n,n}^{(r,1)} & d_{n,n}^{(r,2)} & \cdots & d_{n,n}^{(r,n)} \end{pmatrix};
using (5), we have that b_{i,i} = 0 for i = 1, 2, ..., n and b_{i,j} + b_{j,i} = 0 for 1 ≤ i < j ≤ n. Then
\sum_{i=1}^{n} d_{i,i}\chi(E_{i,i}) = \sum_{1\le i<j\le n} b_{i,j}\chi(E_{i,i} - E_{j,j}) = 0   (7)
and
η_χ([[x,y],z]) = \sum_{1\le i,j\le n} ((j-i+1) \bmod 2)\, r\, d_{i,j}E_{i,j},   (8)
which equals [[η_χ(x),y],z] + [[x,η_χ(y)],z] + [[x,y],η_χ(z)]; thus η_χ is a Lie triple derivation.
(a) π(A_{n,1}) ⊆ (\bigcap_{i=1}^{n-1} B_{n,i}) ∩ (\bigcap_{i=2}^{n} B_{i,1}),
(b) π(A_{n,j}A_{j,1}) = 0 for j = 2, 3, ..., n − 1.
Lemma 2. Define ρ_π : P → P by
ρ_π(\sum_{1\le i,j\le n} a_{i,j}E_{i,j}) = π(a_{n,1})E_{1,n}.
Proof. Let
x = \sum_{1\le i,j\le n} a_{i,j}E_{i,j},\quad y = \sum_{1\le i,j\le n} b_{i,j}E_{i,j},\quad z = \sum_{1\le i,j\le n} c_{i,j}E_{i,j},   (9)
where all a_{i,j}, b_{i,j}, c_{i,j} lie in R and a_{q,p}, b_{q,p}, c_{q,p} ∈ A_{q,p} for 1 ≤ p < q ≤ n. Note that
[[x,y],z] = \sum_{1\le i,j\le n} d_{i,j}E_{i,j},   (10)
where
d_{i,j} = \sum_{1\le r,k\le n} (a_{i,r}b_{r,k}c_{k,j} - b_{i,r}a_{r,k}c_{k,j} - c_{i,k}a_{k,r}b_{r,j} + c_{i,k}b_{k,r}a_{r,j}).   (11)
3 Proof of Theorem 1
In this section, we will give the proof of our main result.
Proof. If n = 1, then it is easy to determine the Lie triple derivations of P . From
now on, we suppose that n > 1. Let φ be any Lie triple derivation of P .
Firstly, we show that there exists some x0 ∈ P such that (φ − adx0 )(d) ⊆ d.
Suppose that φ(E_{i,i}) = \sum_{1\le p,q\le n} α_{p,q}^{(i)} E_{p,q} with all α_{p,q}^{(i)} ∈ R and α_{l,k}^{(i)} ∈ A_{l,k}, 1 ≤ k < l ≤ n. By applying φ on the two sides of
0 = [[E_{1,1}, E_{j,j}], E_{k,k}],   (14)
we can obtain that α_{k,j}^{(1)} = α_{j,k}^{(1)} = 0 for k ≠ j, k ≠ 1 and j ≠ 1. Choose x_1 = \sum_{i=2}^{n} (α_{i,1}^{(1)} E_{i,1} - α_{1,i}^{(1)} E_{1,i}); then φ − ad x_1 sends E_{1,1} to d. Write (φ − ad x_1)(E_{i,i}) in the form \sum_{1\le p,q\le n} β_{p,q}^{(i)} E_{p,q} with all β_{p,q}^{(i)} ∈ R, β_{l,k}^{(i)} ∈ A_{l,k}, 1 ≤ k < l ≤ n. By applying φ − ad x_1 on the two sides of
0 = [E_{1,1}, [E_{1,1}, E_{2,2}]],\qquad 0 = [[E_{2,2}, E_{j,j}], E_{k,k}],   (15)
we can obtain that
β_{1,2}^{(2)} = β_{2,1}^{(2)} = 0,\qquad β_{k,j}^{(2)} = β_{j,k}^{(2)} = 0 for k ≠ j, k ≠ 2 and j ≠ 2.   (16)
Choose x_2 = \sum_{i=3}^{n} (β_{i,2}^{(2)} E_{i,2} - β_{2,i}^{(2)} E_{2,i}); then φ − ad x_1 − ad x_2 sends E_{2,2} to d (and also sends E_{1,1} to d). Generally, suppose that φ − \sum_{i=1}^{k-1} ad x_i sends E_{1,1}, E_{2,2}, ..., E_{k−1,k−1} to d, respectively, and that
(φ − \sum_{i=1}^{k-1} ad x_i)(E_{i,i}) = \sum_{1\le p,q\le n} γ_{p,q}^{(i)} E_{p,q} with all γ_{p,q}^{(i)} ∈ R, γ_{l,j}^{(i)} ∈ A_{l,j}, 1 ≤ j < l ≤ n.   (17)
By applying φ − \sum_{i=1}^{k-1} ad x_i on the two sides of
Secondly, we prove that for any 1 ≤ i < j ≤ n, A_{j,i}E_{j,i} + RE_{i,j} is stable under the action of φ_1. For fixed i < j and a_{j,i} ∈ A_{j,i}, by applying φ_1 on the two sides of
a_{j,i}E_{j,i} = [E_{j,j}, [E_{j,j}, a_{j,i}E_{j,i}]],\qquad a_{j,i}E_{j,i} = [E_{i,i}, [E_{i,i}, a_{j,i}E_{j,i}]],   (19)
with r_i ∈ R, s_i ∈ A_{i+1,i}.
Let
y_0 = diag(0,\ r_1,\ r_1 + r_2,\ \ldots,\ \sum_{i=1}^{n-1} r_i)   (23)
φ_2(E_{i,i}) = \sum_{p=1}^{2} δ_{p,p}^{(i)} E_{p,p} ∈ d,\qquad φ_2(E_{1,2}) = s_1 E_{2,1}.   (35)
E1,2 = [E1,1 , [E1,1 , E1,2 ]], E1,2 = [E2,2 , [E2,2 , E1,2 ]], E1,2 = [[E1,1 , E1,2 ], E2,2 ],
(36)
we obtain that
2(δ_{1,1}^{(1)} − δ_{2,2}^{(1)}) = 0,\quad 2(δ_{1,1}^{(2)} − δ_{2,2}^{(2)}) = 0,\quad δ_{1,1}^{(1)} − δ_{2,2}^{(1)} = δ_{1,1}^{(2)} − δ_{2,2}^{(2)},   (37)
respectively. Put
δ_{1,1}^{(1)} − δ_{2,2}^{(1)} = δ_{1,1}^{(2)} − δ_{2,2}^{(2)} = v,   (38)
then φ_2(E_{i,i}) = δ_{i,i}^{(i)} E + vE_{i,i}. By applying φ_2 on the two sides of
a_{2,1}(E_{1,1} − E_{2,2}) = [[E_{1,1}, E_{1,2}], a_{2,1}E_{2,1}],   (39)
we get that
a_{2,1}(δ_{1,1}^{(1)} − δ_{2,2}^{(2)}) = 0,\qquad φ_2(a_{2,1}E_{2,1}) ⊆ RE_{1,2}.   (40)
Let χ : d → R be a homomorphism of R-modules defined by χ(E_{i,i}) = δ_{i,i}^{(i)}, i = 1, 2. Then χ is suitable for central triple derivations. For x = \sum_{1\le i,j\le 2} a_{i,j}E_{i,j}, assume that
η_χ(x) = χ(D_x)E + \sum_{1\le i,j\le 2} ((j−i+1) \bmod 2)\, v\, a_{i,j}E_{i,j},   (41)
where D_x denotes the projection of x to d (D_x = \sum_{i=1}^{2} a_{i,i}E_{i,i}). Denote φ_2 − η_χ by φ_3. Then
φ_3(d) = 0,\quad φ_3(E_{1,2}) ⊆ A_{2,1}E_{2,1},\quad φ_3(A_{2,1}E_{2,1}) ⊆ RE_{1,2}.   (42)
Define two homomorphisms of R−modules σ1 : R → A2,1 , σ2 : A2,1 → R such
that
φ3 (a1,2 E1,2 ) = σ1 (a1,2 )E2,1 , φ3 (a2,1 E2,1 ) = σ2 (a2,1 )E1,2 .
Similarly to Case 1, we can prove that (σ_1, σ_2) is suitable for permutation triple derivations and that φ = ad x_0 − ad y_0 + η_χ + φ(σ_1, σ_2).
References
1. Cao, Y.: Automorphisms of certain Lie algebras of upper triangular matrices over a commutative ring. J. Algebra 189, 506–513 (1997)
2. Jondrup, S.: Automorphisms and derivations of upper triangular matrix rings. Linear Algebra Appl. 221, 205–218 (1995)
3. Ou, S., Wang, D., Yao, R.: Derivations of the Lie algebra of strictly upper triangular
matrices over a commutative ring. Linear Algebra Appl. 424, 378–383 (2007)
4. Wang, D., Yu, Q.: Derivations of the parabolic subalgebras of the general linear Lie
algebra over a commutative ring. Linear Algebra Appl. 418, 763–774 (2006)
5. Ji, P., Wang, L.: Lie triple derivations of TUHF algebras. Linear Algebra Appl. 403,
399–408 (2005)
6. Zhang, J., Wu, B., Cao, H.: Lie triple derivations of nest algebras. Linear Algebra
Appl. 416, 559–567 (2006)
7. Wang, H., Li, Q.: Lie triple derivations of the Lie algebra of strictly upper triangular
matrices over a commutative ring. Linear Algebra Appl. 430, 66–77 (2009)
8. Li, H., Wang, Y.: Generalized Lie triple derivations. Linear and Multilinear Algebra 59, 237–247 (2011)
Non-contact Icing Detection on Helicopter and
Experiments Research
Abstract. This paper puts forward a new non-contact icing detection approach in which an infrared laser radiates directly onto the detected surface. Because the absorption rates of ice and of the detected surface differ greatly, the energy reflected to the photoelectric detector is markedly different in the two cases. The received energy is processed by the signal processing circuit so that the icing information along the chord of the rotor can be obtained. The method is validated by experiments using an infrared laser with a wavelength of 1450 nm; the influence of the flapping and torque movements of the rotor on the signal amplitude is discussed, and corresponding measures to reduce this influence are put forward for the icing detection system according to the changing rule of the signal amplitude.
1 Introduction
Helicopter icing refers to the accumulation of ice on the helicopter body surface during flight in the atmosphere. Such icing has a serious impact on flight safety: it reduces the lift coefficient, increases the drag coefficient and fuel consumption, obstructs the instruments of the flight static-pressure system, and can even cause vibration, seriously affect stability and controllability, reduce flight safety, and in the worst case cause destructive accidents [1]. To ensure flight safety in icing conditions, icing detection on the key parts of a helicopter (especially the rotor system) is badly needed; together with the helicopter's de-icing devices, it can reduce the possibility of a crash in icy weather.
The helicopter rotor is a moving part, so traditional contact icing detection methods would damage its structure; moreover, there are normally no spare slip-ring contacts available for an icing detection system, so transferring the signal is difficult. In view of the characteristics of the helicopter rotor and the actual needs, this paper puts forward a new non-contact icing detection method: it does not change the original structure of the rotor surface, is easy to install, and can solve the problem of real-time icing detection on the helicopter rotor.
The principle of infrared non-contact icing detection is as follows. The infrared laser is modulated by a fixed-frequency square wave and emits energy onto the lower surface of the rotor; apart from the energy absorbed by the rotor, the rest is reflected, and part of the reflected energy reaches the infrared detector through the optical fiber system. At the chosen wavelength the infrared absorption rate of ice is greater than that of the rotor, so if the rotor surface is iced, the reflected energy received by the detector drops markedly; this signal is converted, processed, and analyzed by the signal processing circuit to judge whether the rotor is iced. The icing distribution on the rotor surface can be calculated through successive detections.
Because the probe's light-receiving device consists of 16 fibers whose diameter is only 1 mm, the received light intensity is weak, and so is the electrical signal converted by the photoelectric detector; the signal processing circuit therefore adopts a synchronous integrator to improve the signal-to-noise ratio. According to the dynamic features of the helicopter rotor and the propagation characteristics of infrared radiation, the system adopts a double-threshold comparator to judge the crossing times of the iced and non-iced areas of the rotor; using the pulse-widening circuit, the DSP can analyze and calculate the icing distribution on the rotor surface. The system frame of the infrared-laser non-contact icing detection is shown in Fig. 1. The whole detection system consists of the optical system, the signal processing circuit, and the DSP microprocessor.
In Fig. 1, the square wave generator generates a 100 kHz square wave that, on the one hand, drives the infrared laser and, on the other hand, serves as the reference signal of the synchronous integrator. The infrared laser emits a beam onto the rotor; the reflected light is transmitted to the photoelectric detector through the optical fibers and converted to a current signal; the preamplifier converts the current signal to a corresponding voltage signal; the programmable gain amplifier and the amplifier further amplify the signal; and the synchronous integrator improves the signal-to-noise ratio so that the signal is strengthened. The output of the synchronous integrator is processed by the double-gate voltage comparator to judge whether there is ice on the rotor surface. Because of the high modulation frequency, it is difficult to trigger a DSP interrupt directly with the pulse signal, and there is not enough time to process the data within one pulse width, so a dual-channel pulse-widening circuit is adopted: when an icy area of the rotor crosses above the detector, the output of the high-gate comparator's widening circuit stays low while the low-gate comparator's widening circuit stays high; when an ice-free area crosses above the detector, both outputs of the dual-channel widening circuit stay high. The two output signals serve as DSP interrupt clocks; in this way the system can judge the icing condition, raise the icing alarm, and drive the de-icing control system.
There are several key parts in the design of the signal processing circuit: the preamplifier, the synchronous integrator, the threshold comparator, and the pulse-widening circuit.
(1) Preamplifier: the selection and design of the preamplifier circuit is significant, since its performance directly affects all subsequent signal processing. The preamplifier requires ultra-low drift, high gain, wide frequency band, and high speed; this system adopts the AD8065 amplifier. To raise the signal amplitude as much as possible, the feedback resistor should be large, but too large a feedback resistance narrows the frequency band and distorts the output of the preamplifier; the value of the feedback resistor in this system is 15 MΩ. A 5 pF feedback capacitor is added for phase compensation to eliminate the oscillation caused by the large feedback resistance.
(2) Synchronous integrator: under the control of PP1, a synchronizing pulse derived from the reference signal, the high-speed analog electronic switch DG419 alternately charges C158 and C159 with integration time constants R_{W107}C158 and R_{W107}C159. After a number of cycles of synchronous alternating charging, the signal at the reference frequency is pulled out of the strong noise by repeated accumulation and averaging. The synchronous integrator uses cross-correlation detection: suppose the input signal is f_1(t) = s_1(t) + n(t) and the reference signal is f_2(t) = s_2(t); then the correlation output is
R_{12}(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} f_1(t)\,f_2(t-\tau)\,dt = \lim_{T\to\infty}\frac{1}{2T}\left[\int_{-T}^{T} s_1(t)\,s_2(t-\tau)\,dt + \int_{-T}^{T} n(t)\,s_2(t-\tau)\,dt\right].
Since the noise n(t) is uncorrelated with the reference signal s_2(t), the second integral tends to zero, and only the correlation of the useful signal remains.
The design of the synchronous integrator requires both a high SNR and rapidity, and the choice of the integration time constant greatly influences the whole system. If the time constant is too long, stronger noise can be suppressed very well, but at the cost of more measuring time; at the same time, too large an integration time constant smooths fast signals and affects the measurement accuracy of the system. Therefore, we obtained an appropriate integration time constant through circuit simulation and experiments, so that it both improves the signal-to-noise ratio and meets the dynamic demand of the detected rotor. In this system, the largest sampling integration time constant is t = 2R_{W107}C158 = 0.02 s.
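To illustrate the principle, here is a toy Python sketch of synchronous detection (our own illustration, not the DG419 circuit; all parameter values are assumptions):

import random

def synchronous_integrate(amplitude=0.01, noise_rms=1.0,
                          n_samples=500_000, samples_per_half_cycle=5):
    # Multiply the noisy input by a +/-1 reference square wave and average:
    # the in-phase signal accumulates while the noise averages toward zero.
    acc = 0.0
    for i in range(n_samples):
        ref = 1.0 if (i // samples_per_half_cycle) % 2 == 0 else -1.0
        x = amplitude * ref + random.gauss(0.0, noise_rms)  # signal buried in noise
        acc += x * ref
    return acc / n_samples

print(synchronous_integrate())  # approximately 0.01, despite the heavy noise

Longer averaging (a larger effective time constant) suppresses more noise but, as noted above, costs measurement time and smooths fast signals.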
(3) Dual-gate comparator: the system adopts a dual-channel comparator LM119. Suppose the output amplitude of the synchronous integrator is V1 when the infrared laser is on neither the rotor nor ice, V2 when the laser is on an icy area, and V3 when the laser is on the bare rotor. The low threshold voltage VL is set so that V2 > VL > V1, and the high threshold voltage VH so that V3 > VH > V2. This ensures that when the rotor is not iced both comparators output pulse signals, while when the rotor is iced only the low-gate comparator outputs pulses and the high-gate comparator outputs a low level. When there is a single icing area on the rotor, the output waveforms of the double-gate comparator in the iced and ice-free states are as shown in Fig. 3: t1 represents the time when the infrared laser is on the rotor, t2 the time when the laser is on neither the rotor nor the ice area, and t3 the time when the laser is on the ice area. With the two thresholds set reasonably, the system can detect the position and extent of the icing area on any rotor.
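The decision logic of the dual-gate comparator can be summarized in a short sketch (the threshold values are assumptions consistent with the ordering V1 < VL < V2 < VH < V3; VH follows the 1.3–1.8 V range recommended in the experiments below):

def classify(v_out: float, v_low: float = 0.5, v_high: float = 1.5) -> str:
    # Compare one synchronous-integrator sample against the two gates.
    if v_out > v_high:
        return "bare rotor"   # both comparators output pulses
    if v_out > v_low:
        return "ice"          # only the low-gate comparator fires
    return "off rotor"        # the laser spot is between blades

# A stream of such classifications yields the crossing times t1, t2, t3 of
# Fig. 3, from which the position and extent of the icing area follow.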
(4) Pulse-widening circuit: the system adopts a dual retriggerable-resettable monostable multivibrator HEF4528BP to widen the pulses. The circuit outputs a high level while pulse signals are present and a low level otherwise.
(5) The DSP digital signal processing circuit determines the icing extent and position by comparing the output signals of the HEF4528BP, whose two channels correspond to the two threshold comparators.
is 10 mm, parallel to the rotor's span direction. In order to study the influence of ice of different thicknesses and types on the signal amplitude, the spray console can set each spray time to control the thickness of ice and frost, and the frozen well can set the temperature to simulate clear ice and rime ice. The lifting platform and the angle frame are used to vary the distance and the angles between the laser source and the rotor.
3.2 Contrast Test between Two Infrared Lasers with Different Wavelengths
The two lasers, at 940 nm and 1450 nm, are tested separately at the same height and incident angle. The temperature of the frozen well is set to −15 °C and the spray time is controlled; we record the output amplitude of the synchronous integrator, and the normalized amplitudes are shown in Fig. 5.
Fig. 5 demonstrates that the absorption of infrared energy at 1450 nm is far greater than at 940 nm, so the output amplitude of the synchronous integrator differs significantly depending on whether the rotor is iced, which is helpful for icing recognition and measurement. In reference [4], the absorption rate of single-crystal ice at 1450 nm is much greater than at 940 nm, but this considers only the internal absorption characteristics of the ice crystal; in this system, the output amplitude of the synchronous integrator is determined not only by the absorption rate but also by the surface character of the ice and the rotor. Therefore, the system adopts the laser source with the 1450 nm wavelength.
The rotor's flapping changes the height and the angle along the span direction, so the height and the angle are varied with special equipment to examine the influence of these changes on the signal amplitude.
From Fig. 6 we can see that as the distance between the laser source and the rotor increases over the height range from 1650 mm to 1800 mm, the output amplitude of the synchronous integrator decreases: the maximum change is 0.4 V without icing and 0.35 V with icing. For this measurement, the amplitude decrease caused by the flapping height is not sufficient to cause a misjudgment of the icing state, so no compensation for this decrease is necessary.
From Fig. 7 we can see that as the flapping angle increases from 0 to 8 degrees, the output amplitude of the synchronous integrator decreases: by 1.1 V without icing and by 0.5 V with icing. Since icing also reduces the signal amplitude, the system may not distinguish the real reason for the decrease, and the flapping of the rotor could be mistaken for icing information; the high gate voltage should therefore be set between 1.3 V and 1.8 V to improve the accuracy of the system.
The bending moment of the rotor changes the angle along the chord direction of the rotor; the output amplitude of the synchronous integrator as a function of the bending angle is shown in Fig. 8. From Fig. 8 we can see that as the bending angle increases from 0 to 8 degrees, the output amplitude decreases: by 0.6 V without icing and by 1 V with icing. This is likely to cause an icing misjudgment, so the high gate voltage should again be set between 1.3 V and 1.8 V to improve the accuracy of the system.
4 Conclusion
The experimental studies have shown that ice absorbs the 1450 nm laser wavelength better than the 940 nm wavelength, so 1450 nm is more suitable for non-contact ice detection, and the non-contact infrared ice detection system is feasible. In flight, the flapping height has little effect on the detection, while the flapping angle and the torsion angle have a great effect, so compensation measures are needed for the system.
References
[1] Qida, W., Tongguang, W.: Icing Detection Technology Progress on Helicopter Rotor. Aeronautical Manufacturing Technology 3, 62–64 (2009)
[2] Ligang, W., Tiancheng, J.: Circuit noise analysis and circuit design based on the
photoelectric diode detection. Journal of Daqing Petroleum Institute 2(133), 88–91
[3] Qingyong, Z.: Weak Signal Detection, pp. 66–75. Zhejiang University Press (2003)
[4] Hobbs, P.V.: Ice physics, pp. 355–455 (1998)
Research on Decision-Making Simulation of
"Gambler's Fallacy" and "Hot Hand"
Abstract. The "gambler's fallacy" and the "hot hand" are considered as typical
examples of misunderstanding random chronological events. People have two
different expectations on the same historical information: gambler's fallacy and
hot hand. This paper analyzes the occurring numbers of the four effects which are
"gambler’s fallacy", "hot hand", "hot outcome" and "stock of luck" and their
revenues in a series of random chronological events by describing the
decision-making preferences of heterogeneous individuals with the method of
computer simulation. We can conclude from the simulation process of coin flips
that there are no differences among the occurring numbers and the revenues of
these four effects mentioned above. However, they are different when a single
period is focused on, which conforms to the beliefs and behavior in the real
decision-making situation.
1 Introduction
Almost every decision made in reality involves uncertainty, and people intend to maximize their profits by choosing an optimal option. Research on decision-making under uncertainty reveals that our beliefs about the probabilities of future events usually deviate from Bayes' rule. With such response biases, many people are caught in the gambler's fallacy or related effects and expect a systematic reversal in the outcomes of random events. Extensive studies have been carried out in behavioral economics, psychology, and neuroeconomics (e.g., Camerer, Loewenstein, & Prelec, 2005; Gilovich, Vallone, & Tversky, 1985; Kahneman, 2002; Rabin & Vayanos, 2010). Further research on finding regularities in these random sequences and comparing the revenues of the gambler's fallacy and the hot hand deserves to be carried out; real experiments are difficult to realize because of high cost and complicated procedures, but experimental economics and computer simulation provide strong support for testing these effects in large-sample experiments. This paper attempts to test the revenues of the "gambler's fallacy", "hot outcome", "hot hand", and "stock of luck" in decision-making by computer simulation and to explore which is the optimal decision-making mode.
2 Literature Review
Research on the gambler's fallacy and the hot hand primarily involves the following four aspects. The first is confirming that the gambler's fallacy and other recency effects exist in real decision-making; secondly, some scholars study why the gambler's fallacy and the hot hand exist; the third research area is to identify the mechanism that generates them and to interpret the rationality of the existence of such false beliefs; the fourth type of study connects the false beliefs with people's behavior, such as the correlation of the false beliefs with confidence levels, abnormal reactions in the stock market, etc.
The gambler's fallacy is a subjective judgment about the probability of objective events: a belief in negative autocorrelation of a non-autocorrelated random sequence of outcomes such as coin flips (Peter Ayton & Ilan Fischer, 2004). The hot outcome is its opposite: a belief in positive autocorrelation of a non-autocorrelated random sequence of outcomes of objective events. In contrast, the hot hand is a subjective judgment about the probability of subjective events: a belief in positive autocorrelation of a non-autocorrelated random sequence of outcomes of subjective events (Peter Ayton & Ilan Fischer, 2004). The hot hand also has its opposite bias, the "stock of luck", which suggests that an individual's luck is a fixed quantity: when the luck is used up, the probability of winning is reduced.
We define random numbers 0 and 1 as representing the back and front sides of the coin respectively, each occurring with probability 50%. The outcomes are stored in the array a[i], whose size is the total number of coin flips; thus a[i] equals the outcome of the coin flip at time i: a[i]=1 represents the front side and a[i]=0 the back side. The guess is also represented by the random numbers 0 and 1, each guessed with probability 50%, which equals the expected probability of a rational individual. The figure guessed at time i is stored in b[i]; if a[i]=b[i], the guess and the coin flip are identical.
Simulation procedure of the gambler's fallacy. When the outcomes of the coin flips have been identical three times in a row, the individual expects that the next outcome will be the opposite because of the gambler's fallacy, namely, that a[i+2] and a[i+3] are not equal. After the fourth guess, if a[i+2] and a[i+3] are not equal, the guess under the gambler's-fallacy mode is correct and the revenue increases by 2; in contrast, if a[i+2]=a[i+3], the guess is wrong and the revenue decreases by 2.
Simulation procedure of hot outcome. On the contrary, under the hot-outcome mode the individual expects the fourth outcome of the coin flips to be the same as the third. If a[i+2]=a[i+3] after the fourth guess, the decision under the hot-outcome mode is right and the revenue increases by 2; if a[i+2] and a[i+3] are not equal, the revenue decreases by 2.
Simulation procedure of hot hand. Two cases arise here. First, when the guesses and the coin flips have matched three times sequentially, that is, b[i]=a[i], b[i+1]=a[i+1], and b[i+2]=a[i+2] simultaneously, the individual expects his next guess to be correct as well. If b[i+3]=a[i+3], then 3 is added to the revenue, and otherwise 3 is subtracted (the individual raises his bet to 3 because he has won three times under the hot hand). Secondly, if the guesses have been wrong three times in a row, the individual under the hot hand decreases his bet: the revenue increases by 1 after a correct fourth guess and decreases by 1 after a wrong one.
Simulation procedure of the stock of luck. There are also two cases for the stock of luck. First, in contrast to the hot hand, when the outcomes of the guess and the coin flips coincide three times in a row as mentioned above, the individual expects his decision not to be correct in the next guess under the stock-of-luck decision-making mode. That is, the individual expects that b[i+3] and a[i+3] are not equal under the conditions b[i]=a[i], b[i+1]=a[i+1], b[i+2]=a[i+2]. In this case, the individual reduces his bet to 1: he gains 1 if he wins the fourth guess, and loses 1 otherwise. Second, if the decisions stay wrong in three sequential guesses under the stock of luck, the individual raises his bet to 3 because of the expected "reversal of fortune". The revenue then increases by 3 after a correct decision and decreases by 3 in the opposite situation.
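The four decision rules lend themselves to a direct simulation sketch. The Python fragment below is our illustration (names and structure are ours, not the authors' code); it implements the streak-triggered branch of each rule, the losing-streak branches of the hot hand and stock of luck being symmetric.

```python
import random

def simulate(mode, flips=1000, seed=None):
    """One period of coin guessing under one decision-making mode.
    mode: 'GF' (gambler's fallacy), 'HO' (hot outcome),
          'HH' (hot hand), 'SL' (stock of luck).
    Returns (occurrence count, revenue) of the mode's trigger condition."""
    rng = random.Random(seed)
    a = [rng.randint(0, 1) for _ in range(flips)]   # coin outcomes
    b = [rng.randint(0, 1) for _ in range(flips)]   # random guesses
    count = revenue = 0
    for i in range(flips - 3):
        if mode in ('GF', 'HO'):
            if not (a[i] == a[i + 1] == a[i + 2]):  # needs three identical flips
                continue
            count += 1
            same = a[i + 3] == a[i + 2]
            win = (not same) if mode == 'GF' else same
            revenue += 2 if win else -2             # bet of 2 per the text
        else:
            if not (b[i] == a[i] and b[i+1] == a[i+1] and b[i+2] == a[i+2]):
                continue                            # needs three correct guesses
            count += 1
            win = b[i + 3] == a[i + 3]
            bet = 3 if mode == 'HH' else 1          # HH raises, SL lowers the bet
            revenue += bet if win else -bet
    return count, revenue

for mode in ('GF', 'HO', 'HH', 'SL'):
    print(mode, simulate(mode, seed=42))
```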
(Figures: per-period occurring numbers and revenues of the effects over the 50 simulation periods; legends GF, HC and HH.)
4 Result Analysis
The simulation involved 50 periods, with 1000 coin flips in every period. At the end of every period, the occurring number and the revenue of every effect were counted and recorded. From the comparative analysis above we conclude that there was no difference among the revenues and the occurring numbers of the four effects; that is, the revenues of the four effects were homogeneous. In a sense, this is one of the reasons for individual differences in decision-making: a decision based on objective knowledge of random events shows no advantage over the false beliefs, which may be the cause of the wide existence of the false beliefs (Andreas Wilke & H. Clark Barrett, 2009).
As shown in the tables above, the occurring proportion of every effect is 1/8, which conforms to our rational expectation. The distribution of the revenues of every effect is a normal distribution with mean 0. Compared to the revenues under the completely random decision-making mode, there is again no significant difference. However, differences do appear when a single simulation run is examined.
5 Conclusion
This paper reviews the definitions of concepts such as the gambler's fallacy and the hot hand and their manifestations in reality, and then summarizes the causes that generate the four effects. On this basis, we analyze the occurring numbers and the revenues of the four effects by describing the decision-making preferences of heterogeneous individuals through computer simulation. We conclude that the occurring numbers
and the revenues of the four effects are not significantly different, according to the analysis in Stata 10.0. Nevertheless, some differences exist in the beliefs and behavior of real decision-making.
People's behavior is influenced by their own preferences and beliefs. Preference is the driver of behavior, and belief is the understanding of the relation between behavior and result, which can also be considered an expectation of future events. People in different countries form their own unique behavioral preferences because of distinct national cultures and institutional environments. Different decision-making beliefs are generated in different situations, and beliefs about recency effects such as the gambler's fallacy and the hot hand can be interchangeable under certain conditions. People in different countries, with different preferences, favor a certain kind of effect, resulting in a "collective effect" of accumulated actions. Thus, decision-making analysis from a cross-cultural perspective is a valid way to study the collection of false beliefs.
References
1. Sundali, J., Croson, R.: Biases in casino betting: The hot hand and the gambler’s fallacy.
Judgment and Decision Making 1(1), 1–12 (2006)
2. Sun, Y., Wang, H.: Gambler’s fallacy, hot hand belief, and the time of patterns. Judgment
and Decision Making 5(2), 124–132 (2010)
3. Ayton, P., Fischer, I.: The gambler’s fallacy and the hot-hand fallacy: Two faces of
subjective randomness. Memory and Cognition (32), 1369–1378 (2004)
4. Guryan, J., Kearney, M.S.: Gambling at Lucky Stores: Empirical Evidence from State
Lottery Sales. American Economic Review 98(1), 458–473 (2008)
5. Rabin, M.: The Gambler’s and Hot-Hand Fallacies: Theory and Applications. Review of
Economic Studies 77, 730–778 (2010)
An Integration Process Model of Enterprise Information
System Families Based on System of Systems
1 Introduction
System of systems (SoS) is a set of theories and methods proposed and developed to solve complex system-family issues. Before integration, the constituent systems are spatially distributed and complementary in terms of capabilities. By effectively integrating the constituent systems, SoS enhances problem-solving and responsiveness to challenging opportunities without changing the existing systems' working environments. SoS is highly adaptable to dynamic and unstable external environments [1]. Andrew P. Sage et al. pointed out that many of today's systems are no longer engineered as stand-alone systems by an individual institution, but as part of an integrated system of systems, a federation of systems, or a systems family [2].
With the increasing demand for integrated supply chains and enterprise collaborative management, an enterprise information system family needs to be integrated as a SoS among the enterprises along the industry chain and within an enterprise itself, to facilitate collaboration between enterprises or within an enterprise. Studying the integration process of an enterprise information system family using SoS theories and methods casts light on how to integrate an enterprise information SoS, and how to systemize, regulate and guide the integration of an enterprise information system family.
Fig. 1. The integration process model of an enterprise information system family based on SoS
The sub-processes shown in Fig. 3 also involve four stages. In the first two stages, component-system integration objectives and requirements are planned and defined based on the SoS integration sub-processes. In the next two stages, component-system combination and connection schemes are developed, and component-system integration is implemented by devising specific implementation schemes for engineering organization, management and technologies.
As the enterprise information system family changes with its integration, the SoS thus integrated evolves dynamically. In the process model, each completed integration process waits in line for the next round based on the need for SoS evolution. The activities in the process follow a decomposition principle: for example, SoS objectives can be decomposed into sub-SoS objective activities, so each iterative integration can be viewed as the construction process of a next SoS. An episode of the iterative processes is shown in Fig. 4.
Fig. 4. An episode of the iterative processes that favor dynamic evolution of integration
In the process model, the activities in the integration sub-processes at the two different levels form mapping relations, which can be divided into horizontal and vertical relations according to the mapping directions. The mapping relations of sub-process activities are shown in Fig. 5.
5 Conclusion
The integration process of an enterprise information system family, which differs from constructing an individual system, concerns a complex, comprehensive SoS. The concepts and methods of SoS help solve complex multi-system integration issues. Based on SoS, this paper makes a useful exploration by proposing a process model that guides and regulates enterprise information system family integration. Much work remains in the SoS-based integration process of an enterprise information system family, such as process refinement, assessment, and optimization.
References
1. Songbao, Z., Weiming, Z., Zhong, L., et al.: Research of Structure Analyzing and
Modeling for Complex System of Systems. Journal of National University of Defense
Technology 28(1), 62–67 (2006)
2. Sage, A.P., Biemer, S.M.: Process for System Family Architecting, Design, and
Integration. IEEE System Journal 1(1), 5–16 (2007)
3. Carlock, P.G., Fenton, R.E.: System of Systems (SoS) Enterprise Systems Engineering for
Information-intensive Organizations. System Engineering 4(4), 242–261 (2001)
4. Morganwalp, J., Sage, A.P.: A System of Systems Focused Enterprise Architecture
Framework. Information, Knowledge System Management 3(2), 87–105 (2003)
5. Stephenson, S.V., Sage, A.P.: Architecting for Enterprise Resource Planning. Information,
Knowledge, System Management 6(1-2), 81–121 (2007)
6. Maier, M.: Architecting Principles for Systems-of-systems. System Engineering 1(4), 267–
284 (1998)
7. Sage, A.P., Cuppan, C.D.: On the Systems Engineering and Management of Systems of
Systems and federation of systems. Information Knowledge System Management 2(4),
325–345 (2001)
8. Boardman, J., Sauser, B.: System of Systems - the Meaning of. In: Proceedings of the 2006
IEEE/SMC International Conference on System of Systems Engineering, pp. 118–123.
IEEE Press, New York (2006)
A Linear Multisensor PHD Filter Using the
Measurement Dimension Extension Approach
1 Introduction
Since it was proposed by Mahler [1], the probability hypothesis density (PHD) filter has been widely studied, especially in target tracking. This research can be divided into two classes according to its development: PHD-based algorithms and cardinalized PHD (CPHD) based algorithms. In the PHD-based class, because the PHD filter involves nonlinear functions, the particle-PHD filter was first given in references [2,3,4]. The particle-PHD filter can deal with nonlinear tracking systems, but it needs a heavier computational load. A further important work is the Gaussian mixture PHD (GM-PHD) filter due to Vo et al. [5]. The GM-PHD filter can estimate the target states without state-clustering algorithms. To improve state estimation, Nandakumaran et al. proposed the PHD smoother [8]. Combining the interacting multiple model, Kirubarajan et al. proposed the multiple-model PHD filter for tracking maneuvering targets [6]. On the CPHD side, since the expected number of targets is very unstable in the presence of missed detections and/or significantly large false-alarm densities during PHD filter propagation [9,10], Mahler further proposed the CPHD filter [10]. The CPHD filter propagates not only the PHD but also the entire probability distribution of the number of targets. Analytic solutions of the CPHD filter were proposed by Vo et al. [11].
Nevertheless, most PHD- and CPHD-based algorithms rely on single-sensor observations. Mahler investigated the multisensor PHD filter in reference [1] and pointed out that the resulting PHD formula is impractical in the multisensor case due to its complexity. He proposed an approximate multisensor PHD algorithm through the product of individual sensor PHDs. In this algorithm, the sensors are independent, with their respective observation spaces.
In the single-sensor case, the PHD filter proceeds iteratively through the following time-update and measurement-update steps.

Prediction step:

$$D_{k+1|k}(x) = \gamma_{k+1|k}(x) + \int \big[P_S(u)\,f_{k+1|k}(x|u) + \beta_{k+1|k}(x|u)\big]\,D_k(u)\,du \qquad (1)$$

Update step:

$$D_{k+1}(x|Z_{k+1}) = F_{k+1}(Z_{k+1}|x)\,D_{k+1|k}(x) \qquad (2)$$

$$F_{k+1}(Z_{k+1}|x) = 1 - P_D(x) + \sum_{z \in Z_{k+1}} \frac{P_D(x)\,g_{k+1}(z|x)}{\lambda c_{k+1}(z) + \int P_D(x)\,g_{k+1}(z|x)\,D_{k+1|k}(x)\,dx} \qquad (3)$$
where $\gamma_{k+1|k}(x)$ is the target birth intensity at time step $k+1$, $P_D(x)$ is the detection probability, $\beta_{k+1|k}(x|u)$ is the target spawning intensity for a target with state $u$ at time step $k$ and state $x$ at time step $k+1$, $P_S(u)$ denotes the target survival probability, $\lambda$ is the average clutter intensity, $c_{k+1}(z)$ is the clutter density of each clutter point, and $g_{k+1}(z|x)$ is the measurement likelihood function.
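Equations (1)-(3) admit the closed-form Gaussian-mixture implementation of Vo et al. [5]. For orientation, the following Python sketch (our illustration, not code from [5] or from this paper) performs one GM-PHD measurement update for a linear-Gaussian model; the product `lam * c` plays the role of the clutter term in (3).

```python
import numpy as np

def gm_phd_update(comps, Z, H, R, p_d, lam, c):
    """One measurement update of a Gaussian-mixture PHD per Eq. (3).
    comps: list of predicted components (weight w, mean m, covariance P).
    Z: list of measurement vectors; lam * c: clutter intensity * density."""
    # Missed-detection part: (1 - P_D) times the predicted intensity.
    updated = [((1.0 - p_d) * w, m, P) for (w, m, P) in comps]
    for z in Z:
        terms = []
        for (w, m, P) in comps:
            S = H @ P @ H.T + R                      # innovation covariance
            K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
            r = z - H @ m                            # innovation
            q = np.exp(-0.5 * r @ np.linalg.solve(S, r)) \
                / np.sqrt(np.linalg.det(2.0 * np.pi * S))  # q(z) = N(z; Hm, S)
            terms.append((p_d * w * q,
                          m + K @ r,
                          (np.eye(len(m)) - K @ H) @ P))
        denom = lam * c + sum(t[0] for t in terms)   # denominator of Eq. (3)
        updated += [(wq / denom, mu, Pu) for (wq, mu, Pu) in terms]
    return updated
```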
Mahler proposed an approximation of the multisensor PHD filter as follows:

$$D_{k+1}(x) \cong F^{[1]}_{k+1}(Z^{[1]}_{k+1}|x) \cdots F^{[s]}_{k+1}(Z^{[s]}_{k+1}|x)\,D_{k+1|k}(x) \qquad (4)$$

where $s$ is the number of sensors. It can be seen from (4) that the PHD $D_{k+1}(x)$ is an $(s+1)$-fold application of the Poisson approximation [1]. This update process cannot be used when the sensors are correlated.
$$x_{k+1} = F_k x_k + G_k w_k \qquad (5)$$

$$y^{[j]}_{k+1} = H^{[j]}_{k+1} x_{k+1} + v^{[j]}_{k+1} \qquad (6)$$

$$z_{k+1} = A_{k+1} y_{k+1} + b_{k+1}, \qquad y_{k+1} = \big[y^{[1]}_{k+1}, \cdots, y^{[s]}_{k+1}\big]^T \qquad (7)$$
where $z^{[j]}_{k+1}$ denotes the measurement of the $j$th sensor. Obviously, the individual observations $\{z^{[j]}_{k+1}\}$ are dependent. Equation (6) is the measurement model of each sensor, and equation (7) is the proposed linear fusion model of the sensors. We aim to derive the PHD intensity conditioned on the measurement set $Z_{k+1}$, i.e., $D(x_{k+1}|Z_{k+1})$, where $Z_{k+1} = \{Z^{[1]}_{k+1}, \cdots, Z^{[s]}_{k+1}\}$ and $Z^{[j]}_{k+1} = \{z^{[j]}_{k+1,1}, \cdots, z^{[j]}_{k+1,n_j}\}$. By augmenting the measurements, we rewrite the measurement function (7) as

$$z_{k+1} = H^{A}_{k+1} x_{k+1} + v^{A}_{k+1}$$

where $H^{A}_{k+1} = A_{k+1} H_{k+1}$ and $v^{A}_{k+1} = A_{k+1} v_{k+1} + b_{k+1}$. Our next main task is to obtain the corresponding LMPHD filter after the MDE.
$$D_{k+1}(x|Z_{k+1}) = \frac{1}{f_{k+1}(Z_{k+1}|Z_k)} \left.\frac{\delta^{m_s} \cdots \delta^{m_1}\,\delta F}{\delta^{m_s} z^{[s]} \cdots \delta^{m_1} z^{[1]}\,\delta x}\right|_{[0,\cdots,0,1]} \qquad (13)$$

$$F[g_1, \cdots, g_s, h] = \int g_1^{Z^{[1]}} \cdots g_s^{Z^{[s]}}\, h^X\, f_{k+1}(Z^{[1]}|X) \cdots f_{k+1}(Z^{[s]}|X)\, f_{k+1|k}(X|Z_k)\, \delta Z^{[1]} \cdots \delta Z^{[s]}\, \delta X \qquad (14)$$

$$\frac{\delta^{m_s} F}{\delta^{m_s} z^{[s]}} = \frac{\delta^{m_s} F}{\delta z^{[s]}_{m_s} \cdots \delta z^{[s]}_{1}} \qquad (15)$$
Obviously, it is intractable to obtain the updated PHD $D_{k+1}(x|Z_{k+1})$ directly. We therefore adopt the MDE approach (9) with the extended measurement $z_{k+1}$. The above multisensor PHD then reduces to the single-sensor PHD [1] (p. 1173, equations (110), (111)), but here the extended measurements consist of all combinations of the sensor measurements. That is,

$$D_{k+1}(x|Z_{k+1}) = \frac{1}{f_{k+1}(z_{k+1}|Z_k)} \left.\frac{\delta^{L_s+1} F}{\delta z_{L_s} \cdots \delta z_1\,\delta x}\right|_{[0,1]} \qquad (16)$$

where $L_s$ is the total number of combinations, $L_s = \prod_{l=1}^{s} m_l$, and $m_l$ is the number of measurements of the $l$th sensor.
$$F_{k+1}(Z_{k+1}|x) = 1 - P_D(x) + \sum_{[z^{[1]}, \cdots, z^{[s]}]^T \in Z_{k+1}} \frac{P_D(x)\,g_{k+1}([z^{[1]}, \cdots, z^{[s]}]^T|x)}{\lambda_s c_{k+1}([z^{[1]}, \cdots, z^{[s]}]^T) + \int P_D(x)\,g_{k+1}(z|x)\,D_{k+1|k}(x)\,dx} \qquad (17)$$
Clutter Density and Clutter Intensity. Assume the observation space of the $l$th sensor to be a surface $S_l$. The multisensor observation space can then be described as a super-cylinder $C$ consisting of these sensor spaces, i.e., $C = S_1 \times \cdots \times S_s$. Therefore, the clutter intensity is proposed as:

$$\lambda = \lambda_1 + \cdots + \lambda_s \qquad (18)$$

where $\lambda_1, \cdots, \lambda_s$ are the clutter intensities of the individual sensors. The clutter density of each clutter point is proposed as follows:

$$c_k([z^{[1]}, \cdots, z^{[s]}]^T) = \frac{\lambda}{V(C)} \qquad (19)$$

where $V(C) = V(S_1 \times \cdots \times S_s)$ denotes the volume of the super-cylinder.
$$R^{A}_{k+1} = A_{k+1}\,\mathrm{diag}\big[R^{[1]}_{k+1}, \cdots, R^{[s]}_{k+1}\big]\,A^{T}_{k+1} \qquad (20)$$

$$\bar{v}_{k+1} = E(v^{A}_{k+1}) = b_{k+1} \qquad (21)$$

Given these parameters, the LMPHD filter can proceed like the single-sensor PHD filter. Similarly, the GM-PHD filter can also be used within the LMPHD filter.
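To make the MDE bookkeeping concrete, the following Python sketch (our illustration; the numeric measurement values are invented) forms the extended measurement set as the Cartesian product of the per-sensor measurements and computes the fused noise statistics of Eqs. (20)-(21).

```python
import itertools
import numpy as np
from scipy.linalg import block_diag

def extend_measurements(sensor_meas):
    """Extended measurement set: the Cartesian product of the individual
    sensors' measurement lists, giving M_k = m_1 * ... * m_s stacked
    vectors of dimension d = d_1 + ... + d_s."""
    return [np.concatenate(c) for c in itertools.product(*sensor_meas)]

def extended_noise(A, R_list, b):
    """Noise statistics of the fused measurement z = A y + b, Eqs. (20)-(21)."""
    R_A = A @ block_diag(*R_list) @ A.T   # Eq. (20)
    v_bar = b                             # Eq. (21): E(v^A) = b
    return R_A, v_bar

# Two position sensors as in the simulation section; measurement values
# here are invented for illustration only.
A = np.array([[0.8, 0.0, 0.2, 0.0],
              [0.0, 0.8, 0.0, 0.2],
              [0.6, 0.0, 0.4, 0.0],
              [0.0, 0.6, 0.0, 0.4]])
R1, R2 = np.diag([25.0, 25.0]), np.diag([50.0, 50.0])
z1 = [np.array([250.1, 249.8]), np.array([-251.0, -250.3])]  # sensor 1: m_1 = 2
z2 = [np.array([250.9, 250.5])]                              # sensor 2: m_2 = 1
Z_ext = extend_measurements([z1, z2])        # M_k = 2 * 1 = 2 extended vectors
R_A, v_bar = extended_noise(A, [R1, R2], b=np.zeros(4))
```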
4 Simulation
In this section, three targets with constant-velocity (CV) motion are simulated in the x-y coordinate plane. We suppose two sensors observing the same region $[-1000, 1000] \times [-1000, 1000]\,\mathrm{m}^2$. The two sensors are placed at different positions and differ in precision. The system parameters are as follows: the initial states are $[250\,\mathrm{m}, 5\,\mathrm{m/s}, 250\,\mathrm{m}, -12\,\mathrm{m/s}]$, $[-250\,\mathrm{m}, 12\,\mathrm{m/s}, -250\,\mathrm{m}, -5\,\mathrm{m/s}]$ and $[-250\,\mathrm{m}, 12\,\mathrm{m/s}, -250\,\mathrm{m}, -5\,\mathrm{m/s}]$ for targets 1, 2 and 3, respectively. The detection probability is $P_D = 0.9$. The process covariance is $Q_k = \mathrm{diag}(25, 25)\,\mathrm{m}^2$. The measurement covariances are $R_1 = \mathrm{diag}(25, 25)\,\mathrm{m}^2$ and $R_2 = \mathrm{diag}(50, 50)\,\mathrm{m}^2$ for sensors 1 and 2, respectively; that is, sensor 1 performs better than sensor 2. The measurement functions are position observations. We assume in the simulation that the two sensors share the same reference coordinates. Accordingly, we fuse the two sensors as follows:
$$y_k = H_k x_k + v_k, \qquad z_k = A_k y_k + b_k$$

$$H_k = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad A_k = \begin{bmatrix} 0.8 & 0 & 0.2 & 0 \\ 0 & 0.8 & 0 & 0.2 \\ 0.6 & 0 & 0.4 & 0 \\ 0 & 0.6 & 0 & 0.4 \end{bmatrix}, \qquad b_k = 0$$
The GM-PHD filter is used here to track the targets. Gaussian terms whose weights exceed the threshold 0.5 are selected as the estimates. Figs. 1(a) and 1(b) suggest that the proposed algorithm is effective in target tracking. We further compare the proposed algorithm with the PHD filter in Figs. 2 and 3. Fig. 2 shows that the proposed algorithm estimates the number of targets more stably than the PHD filter. Fig. 3(a) shows the Wasserstein distances of the two algorithms: the PHD filter exhibits some fluctuation due to its estimation of the target number. The same can be seen in Fig. 3(b), where the OSPA distance is adopted. However, the proposed algorithm needs more computing time than the PHD filter, because it must process more extended measurements and higher-dimensional vectors.
Fig. 1. (a) The proposed linear MPHD filter: true tracks and estimated tracks in the x-y plane; (b) tracks in the x and y coordinates over time.

Fig. 2. True and estimated numbers of targets for the proposed algorithm and the original PHD filter.

Fig. 3. (a) Wasserstein distance and (b) OSPA distance for the proposed algorithm and the original PHD filter.
Here, the number of extended measurements is $M_k = m_1 \times \cdots \times m_s$ and their dimension is $d = d_1 + \cdots + d_s$, where $d_1, \cdots, d_s$ are the measurement dimensions of the individual sensors.
5 Conclusion
We proposed an LMPHD filter based on the MDE approach. Although the new filter has the same form as the original PHD filter, it can handle the case where the sensors are linearly correlated. We also derived some parameters of the new extended measurement in the LMPHD filter. Future work may focus on two aspects: computational complexity and nonlinear correlation. First, the proposed filter must update over the product of all sensor measurement counts, compared with their sum in the single-sensor PHD filter. Second, radar plays an important role in target tracking, and radar networks are a new tendency in future applications; nevertheless, radar is a nonlinear sensor system, and how to handle the nonlinear multisensor case will be a popular topic.
References
1. Mahler, R.P.S.: Multitarget Bayes Filtering via First-Order Multitarget Moments. IEEE Transactions on Aerospace and Electronic Systems 39(4), 1152–1178 (2003)
2. Vo, B., Singh, S., Doucet, A.: Sequential Monte Carlo implementation of the PHD
filter for multi-target tracking. In: Proceedings of the International Conference on
Information Fusion, Cairns, Australia, pp. 792–799 (2003)
3. Sidenbladh, H.: Multi-target particle filtering for the probability hypothesis density.
In: Proceedings of the International Conference on Information Fusion, Cairns,
Australia, pp. 800–806 (2003)
4. Zajic, T., Mahler, R.: A particle-systems implementation of the PHD multitarget
tracking filter. In: Signal Processing, Sensor Fusion, and Target Recognition XII,
pp. 291–299 (2003)
5. Vo, B.-N., Ma, W.-K.: The Gaussian Mixture Probability Hypothesis Density Fil-
ter. IEEE Transactions on signal processing 54(11), 4091–4104 (2006)
6. Punithakumar, K., Kirubarajan, T., Sinha, A.: Multiple-model probability hy-
pothesis density filter for tracking maneuvering targets. IEEE Transactions on
Aerospace and Electronic Systems 44(1), 87–88 (2008)
7. Vo, B.N., Pasha, A., Tuan, H.D.: A Gaussian mixture PHD filter for nonlinear
jump Markov models. In: Proceedings of the 45th IEEE Conference on Decision
and Control, pp. 3162–3166. IEEE, San Diego (2006)
8. Nandakumaran, N., Punithakumar, K., Kirubarajan, T.: Improved multi-target
tracking using probability hypothesis density smoothing. In: Drummond, O.E. (ed.)
Proc. Signal and Data Processing of Small Targets, vol. 6699 (August 2007)
9. Erdinc, O., Willet, P., Bar-Shalom, Y.: A Physical-Space Approach for the Proba-
bility Hypothesis Density and Cardinalized Probability Density Filters. In: Signal
and Data Processing of Small Targets, Proc. of SPIE, vol. 6236, pp. 1–12 (2006)
10. Mahler, R.: PHD filters of higher order in target number. IEEE Trans. Aerosp.
Electron. Syst. 43(3), 1523–1543 (2007)
11. Vo, B.-T., Vo, B.-N., Cantoni, A.: Analytic Implementations of the Cardinal-
ized Probability Hypothesis Density Filter. IEEE Transactions on Signal Process-
ing 55(7), 3553–3567 (2007)
An Improved Particle Swarm Optimization for Uncertain
Information Fusion
1 Introduction
The integration of uncertain multi-sensor information is commonly handled by information fusion algorithms [1-2]. Information often contains uncertainties, which are usually related to physical constraints, detection algorithms, and the transmitting channels of the sensors. While intuitive approaches such as Dempster-Shafer fusion, Dezert-Smarandache fusion, and Smets' Transferable Belief Model [3-5] aggregate all available information, they do not always guarantee optimal results. Acknowledging that these measurement techniques have associated measurement costs, the essence is to derive a fusion process that minimizes global uncertainties.
Nowadays, systems increasingly rely on information fusion techniques to automate processes and make decisions. An informed decision maker, meanwhile, often relies on various forms of data fusion models to assess the current situation. D-S evidence theory is an excellent method of information fusion: it adopts belief functions rather than probabilities, and it does not require binary mutually exclusive assumptions about uncertain events.
According to Definitions 1 and 2, we can clearly see the particle swarm optimization process. If the weight is adjusted according to the particle velocity evolution degree and the particle aggregation degree, we can couple the weight to the particle optimization process and thus adjust population diversity by adjusting the weight.
Therefore the weight changes along with the particle velocity evolution and aggregation degrees, which addresses the particle prematurity problem within the mathematical model. When E(x) is large, the velocity evolution is quick and the algorithm may continue to search in a large space, i.e., the swarm optimizes over a wide range. When E(x) is small, ω may be reduced so that the swarm searches in a small scope and finds the optimum quickly. When A(x) is small, the swarm is quite scattered and not prone to falling into local optima; as A(x) increases, the swarm falls into local optima more easily, so we increase ω to enlarge the swarm's search space and improve its global search capability. In conclusion, ω increases as the particle velocity evolution E(x) decreases or as the particle aggregation degree A(x) increases. Therefore ω is determined by E(x) and A(x), and the functional relation may be expressed as:

$$\omega = f(E(x), A(x)) = \omega_0 - 0.55\,E(x) + 0.15\,A(x) \qquad (3)$$

where $\omega_0$ is the initial value of $\omega$; generally, $\omega_0 = 0.9$. By definition $0 < E(x) \le 1$ and $0 < A(x) \le 1$, so $\omega_0 - 0.55 < \omega < \omega_0 + 0.15$, which keeps $\omega$ within the range required for convergence.
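Eq. (3) drops straightforwardly into a standard PSO velocity update. The Python sketch below is our illustration (function names are ours); E(x) and A(x) are assumed to be computed elsewhere according to Definitions 1 and 2.

```python
import random

def inertia_weight(E_x, A_x, w0=0.9):
    """Adaptive inertia weight of Eq. (3). E(x) and A(x) are the velocity
    evolution and aggregation degrees from Definitions 1 and 2 (both in
    (0, 1]); large E(x) shrinks w, large A(x) grows w."""
    return w0 - 0.55 * E_x + 0.15 * A_x

def update_velocity(v, x, pbest, gbest, E_x, A_x, c1=2.0, c2=2.0):
    """Standard PSO velocity update with the adaptive weight
    (c1 = c2 = 2.0, as in the experiments section)."""
    w = inertia_weight(E_x, A_x)
    return (w * v
            + c1 * random.random() * (pbest - x)
            + c2 * random.random() * (gbest - x))
```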
Since the aim of this paper is to define a meaningful metric distance for BPAs, let $w_1, w_2$ be the corresponding weights of the evidence $m_1, m_2$ in information fusion.
It can easily be seen that $\sum w_i = 1$; thus, the weights express the relative importance of the collected evidence. Traditional optimization methods could be used to solve Eq. (8), but considering the real-time demand of information fusion and the accuracy and validity of fusion results under conflicting evidence, a more suitable solution based on the improved PSO is given to meet these requirements. A summary of the new uncertain information fusion method is given in Table 1.
The reliability weight $w_i$ of each piece of evidence can be acquired by the above analysis, so we can revise the evidence theory by $w_i$. Considering that the evidence sources themselves have different importance, the revised evidence should not be a simple average; the new evidence source probability assignment is defined as follows:

$$mae(m) = \sum_{i=1}^{n} w_i m_i \qquad (9)$$
The new probability assignment is thereby acquired, and we can combine it using the D-S combination rule.
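The revision-then-combination step can be sketched as follows (Python; our illustration, not the authors' code). The weights below stand in for the values the improved PSO would return, and combining the revised evidence with itself follows the common averaging variant of weighted D-S fusion.

```python
def weighted_average_bpa(bpas, weights):
    """Revised evidence source of Eq. (9): mae(m) = sum_i w_i * m_i with
    sum_i w_i = 1. Each BPA maps frozenset focal elements to mass."""
    focal = set().union(*bpas)
    return {f: sum(w * m.get(f, 0.0) for w, m in zip(weights, bpas))
            for f in focal}

def dempster_combine(m1, m2):
    """Classic D-S combination rule with conflict normalisation."""
    combined, conflict = {}, 0.0
    for f1, v1 in m1.items():
        for f2, v2 in m2.items():
            if f1 & f2:
                combined[f1 & f2] = combined.get(f1 & f2, 0.0) + v1 * v2
            else:
                conflict += v1 * v2
    return {f: v / (1.0 - conflict) for f, v in combined.items()}

# Sources S1 and S2 from Data 1; the weights stand in for the PSO output.
A, B, C = frozenset('A'), frozenset('B'), frozenset('C')
m1 = {A: 0.7, B: 0.1, C: 0.1, A | C: 0.1}
m2 = {A: 0.1, B: 0.8, C: 0.05, A | C: 0.05}
mae = weighted_average_bpa([m1, m2], [0.6, 0.4])
print(dempster_combine(mae, mae))   # combine the revised evidence by D-S
```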
4 Numerical Experiments
The new method has been analyzed above; this section gives two numerical examples to aid understanding of the proposed method.
A. Initial experiments
In the experimental arrangement, two groups of common data are used to acquire the basic probability assignments. The problem is solved using two different analysis types, namely case 1 using the D-S rule and case 2 using Yager's rule to combine evidence at the sub-system level for each type. We used the modified PSO algorithm to solve this optimization problem, along with the D-S evidence combination program, in a Matlab implementation. The PSO parameters were: population size = 15, $t_{max} = 500$, $c_1 = c_2 = 2.0$, $\omega_0 = 0.9$, and a stopping convergence criterion (in terms of change in the objective function value) of $10^{-8}$ over 200 consecutive iterations.
Assuming a frame of discernment Θ = {A, B, C} and sources S1, S2, S3, S4, such that
Data 1:
S1 : m( A) = 0.7 m( B) = 0.1 m(C ) = 0.1 m( A, C ) = 0.1
S2 : m( A) = 0.1 m( B) = 0.8 m(C ) = 0.05 m( A, C ) = 0.05
S3 : m( A) = 0.4 m( B ) = 0.3 m(C ) = 0.2 m( A, C ) = 0.1
S4 : m( A) = 0.4 m( B) = 0.2 m(C ) = 0.1 m( A, C ) = 0.3
Data 2:
S1 : m( A) = 0.001 m( B ) = 0.199 m(C ) = 0.8
S2 : m( A) = 0.9 m( B ) = 0.05 m(C ) = 0.05
S3 : m( A) = 0.3 m( B ) = 0.6 m(C ) = 0.1
S4 : m( A) = 0.4 m( B) = 0.4 m(C ) = 0.2
The results of combination by the three different rules are shown in Table 2. We can see that when the conflict between pieces of evidence is relatively small, this paper's method is slightly better than the D-S and Yager rules, which reflects the foundational status of D-S evidence theory; there is then not much difference among the effectiveness of linear-combination sensor fusion methods. If the identification object is a multi-element set, however, this paper's method is more reasonable and simpler to compute.
The numbers in Table 3 confirm the conclusions above. We can see that when the conflict between pieces of evidence is relatively large, the experimental results are much better than Yager's rule, and even better than the D-S rule. When there is contradiction among evidence from different sources, Yager's rule treats that contradiction as coming from ignorance; if additional knowledge is available, the contradiction might be resolved. The Yager rule is thus more conservative than the D-S rule. Therefore, we have reason to believe that this paper's method yields the best combination result among the three algorithms.
B. Evidence conflict and robustness
In this section, we consider the combination of two evidence sources by classic D-S evidence theory and by the improved algorithm. Assuming a frame of discernment Θ = {A, B, C} and sources S1 and S2, such that
Data 3:
$m_1$: $m_1(A) = 0.99$, $m_1(B) = 0.01$, $m_1(C) = 0$
$m_2$: $m_2(A) = 0$, $m_2(B) = 0.01$, $m_2(C) = 0.99$
$m_1'$: $m_1'(A) = 0.98$, $m_1'(B) = 0.01$, $m_1'(C) = 0.01$
The results of combination by the two different rules are shown in Table 4.
We can see that the evidence in this test is highly conflicting. In these two pieces of evidence, the focal elements A and C intuitively obtain higher support, approximately 50% each, and should receive more support after combination. However, combination based on the D-S rule gives focal element B almost certain support, which is clearly unreasonable. The proposed algorithm, by modifying the conflicting evidence, obtains more reasonable results; therefore, the improved algorithm can effectively avoid the defects of the traditional D-S combination rule when the evidence conflicts.
If the evidence $m_1$ is replaced by $m_1'$, the combination results are as shown in Table 5.
5 Conclusion
To address traditional D-S evidence theory's problems with highly conflicting evidence, this paper proposes a new method. First, we handle the evidence with a weighted D-S theory. Then an optimization model for obtaining the sensor weights is set up. Finally, we use the improved PSO to acquire the reliability weights describing the relationships between pieces of evidence and use them to modify D-S theory. Numerical experiments show that the new method is more effective.
References
[1] Wan, S.: Fusion Method for Uncertain Multi-sensor Information. In: International
Conference on Intelligent Computation Technology and Automation, vol. 1, pp. 1056–
1060 (2008)
[2] Chen, L., Huang, J.: Research of Uncertainty. Journal of Circuit and System 9(3),
105–111 (2004)
[3] Shafer, G.: A mathematical theory of evidence, pp. 19–63. Princeton University Press,
Princeton (1976)
[4] Dezert, J., Smarandache, F.: DSmT: A New Paradigm Shift for Information Fusion. In:
COGnitive systems with Interactive Sensors International Conference, Paris, March 2006,
pp. 1–11 (2006)
[5] Ristic, B., Smets, P.: Target Classification Approach Based On the Belief Function
Theory. IEEE Transactions on Aerospace and Electronics Systems 41(2), 1097–1103
(2005)
[6] Capelle, A.S., Fernandez-Maloigne, C., Colot, O.: Introduction of Spatial Information
within the Context of Evidence Theory. In: IEEE International Conference On Acoustics,
Speech, and Signal Processing, vol. 2, pp. 785–788 (2003)
[7] Wu, Z., Wu, G.: A new improvement of evidence combination. Computer and
Modern 12, 116–117 (2007)
[8] Eberhart, R.C., Kennedy, J.: A new optimizer using particle swarm theory. In:
Proceedings of the sixth International Symposium on Micro and Human Science, Nagoya,
Japan, pp. 39–43 (1995)
[9] Sentz, K., Ferson, S.: Combination of evidence in Dempster Shafer theory. TR 0835,
Sandia National Laboratories, Albuquerque, New Mexico (2002)
[10] Yager, R.: On the Dempster-Shafer framework and new combination rules. Information
Sciences 41, 93–137 (1987)
[11] Dubois, D., Prade, H.: Representation and combination of uncertainty with belief functions and possibility measures. Computational Intelligence 4, 244–264 (1988)
[12] Jousselme, A.-L., Grenier, D., Bossé, E.: A new distance between two bodies of evidence.
Information Fusion 2, 91–101 (2001)
Three-Primary-Color Pheromone for Track Initiation
1 Introduction
Multi-target tracking (MTT) has received considerable interest over the last decade, with applications in civil and military areas [1,2]. In general, MTT includes the phases of track initiation, data association and state estimation, among which track initiation determines the number of targets as well as the initial state estimates for the state estimator; poor track initiation may result in target loss or increased computational burden. So far, four popular track initiation techniques are generally used in radar tracking, namely the rule-based method, the logic-based method, the Hough transform and the modified Hough transform method [3]. In this work, however, we focus on the bearings-only track initiation problem in the sonar-based tracking of submarines. Since the number of measurement candidates grows exponentially with the number of sensors ($m$) or polynomially with the number of targets, many attempts have been made, including various evolutionary algorithms, among which the Ant Colony Optimization (ACO) approach is recognized as a competitive one [4,5]. However, these algorithms need to be improved for practical tracking scenarios because they assume the number of tracks is known a priori.
Biologists report that there are nearly 20,000 species of ants that vary in size, color, and way of life. Most are a dull, drab color such as brown, rust, or black, but some are yellow, green, blue, or purple. Inspired by these colored ants, we propose an ant system with three primary colors to jointly identify the number of tracks to be initiated and their individual tracks.
2 Background
In the generic ACO algorithm, ants communicate with each other indirectly by stigmergy; such behavior can be replaced by a more direct communication means called color similarity of pheromone. Color similarity comparison can generally be conducted in two steps: color space conversion and color difference computation. Since there are many techniques for color space conversion, the adopted conversion strategy generally depends on the application. In this work, to increase the color discrimination ability of each ant in the "subtractive" color mixing model, the following conversion steps are employed:
Step 1): From CMY to standard RGB space. Since standard RGB component
values vary between 0 and 1, a cheap and simple transform from CMY space to
standard RGB space is adopted, namely, R = 1 − C , G = 1 − M , and B = 1 − Y ,
respectively, where each component of CMY lies in the range of [0,1] as well.
Step 2): From standard RGB to CIE XYZ. According to human vision tristimulus,
the conversion law from standard RGB to CIE XYZ is introduced as [6]
Step 3): From CIE XYZ to CIE LAB. Because CIE LAB is more perceptually uniform than CIE XYZ (i.e., a change of the same amount in a color value produces a change of about the same visual importance), the color difference comparison is preferably conducted in the CIE LAB space, computed as
$$L^* = 116\,f(Y/Y_n) - 16$$

$$a^* = 500\big[f(X/X_n) - f(Y/Y_n)\big] \qquad \text{where } f(t) = \begin{cases} t^{1/3} & t > (6/29)^3 \\ \frac{1}{3}\left(\frac{29}{6}\right)^2 t + \frac{4}{29} & \text{otherwise} \end{cases} \qquad (2)$$

$$b^* = 200\big[f(Y/Y_n) - f(Z/Z_n)\big]$$

where the typical ranges of the three values are $L^* \in [0, 100]$, $a^* \in [-100, 100]$ and $b^* \in [-100, 100]$, respectively; $(X_n, Y_n, Z_n)$ denotes the tristimulus value of the reference white point in the CIE XYZ space, given by $(95.017, 100.0, 108.813)$.
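The conversion chain of Steps 1-3 can be sketched in Python as below (our illustration, not the authors' code). Since the Step 2 matrix is cited from [6] rather than reproduced here, the common sRGB/D65 matrix is assumed; the reference white is taken from the text.

```python
import numpy as np

# Linear-RGB -> CIE XYZ matrix: the paper's exact Step 2 matrix is not
# reproduced above, so the common sRGB/D65 matrix is assumed here.
RGB2XYZ = np.array([[0.4124, 0.3576, 0.1805],
                    [0.2126, 0.7152, 0.0722],
                    [0.0193, 0.1192, 0.9505]])
WHITE = np.array([95.017, 100.0, 108.813])   # (X_n, Y_n, Z_n) from the text

def f(t):
    """Piecewise function of Eq. (2)."""
    return np.where(t > (6 / 29) ** 3,
                    np.cbrt(t),
                    (29 / 6) ** 2 * t / 3 + 4 / 29)

def cmy_to_lab(c, m, y):
    """Steps 1-3: CMY -> standard RGB -> CIE XYZ -> CIE LAB."""
    rgb = 1.0 - np.array([c, m, y])          # Step 1: R=1-C, G=1-M, B=1-Y
    xyz = 100.0 * (RGB2XYZ @ rgb)            # Step 2, scaled to [0, 100]
    fx, fy, fz = f(xyz / WHITE)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)  # L*, a*, b*

def delta_E(lab1, lab2):
    """CIE76 color difference: Euclidean distance in LAB."""
    return float(np.linalg.norm(np.subtract(lab1, lab2)))
```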
For a multi-sensor multi-target bearings-only tracking system, the sampling data from the first four scans are generally utilized to initiate tracks. We thus obtain four search spaces, denoted by $\Omega_1$, $\Omega_2$, $\Omega_3$ and $\Omega_4$, each formed by intersecting the bearing measurements (lines of sight) of the same scan, as shown in Fig. 1. Since our algorithm is based on the three primary colors, we consider three groups with equal numbers of ants. Initially, the three groups of ants are mixed together and placed randomly on position candidates in the first search space $\Omega_1$. Afterwards, each ant visits the position candidates in the next search space probabilistically. Suppose that an ant with pheromone of color $s$¹ is located at position $i$ in $\Omega_k$ ($k = 1, 2, 3$); then the ant will visit position $j$ in the next search space by applying the following probabilistic formula:
$$P^{(s)}_{i,j} = \frac{e^{-\alpha \cdot \Delta E(w_s, w_j)}\,\eta_{i,j}}{\sum_{l \in \Omega_{k+1}} e^{-\alpha \cdot \Delta E(w_s, w_l)}\,\eta_{i,l}} \qquad (4)$$
where $e^{-\alpha \cdot \Delta E(w_s, w_j)}$ denotes the pheromone color similarity between the current ant and the path from $i$ to $j$; $\eta_{i,j}$ denotes the problem-dependent heuristic function; and $\alpha$ is an adjustable positive parameter whose value determines the degree of pheromone color similarity among candidates. Note that the smaller the pheromone color similarity $e^{-\alpha \cdot \Delta E(w_s, w_j)}$, the bigger the color difference. To eliminate the effect of outliers and simplify the computation of each ant, $\eta_{i,j}$ is defined as
$$\eta_{i,j} = \begin{cases} 1 & \text{if } r_1 \le D_{i,j} \le r_2,\ j \in \gamma \\ \kappa & \text{otherwise} \end{cases} \qquad (5)$$

where $\kappa$ is a constant between 0 and 1, $D_{i,j}$ is the distance between $i$ and $j$, and $\gamma$ denotes an annular gate region whose inner and outer radii are determined by $r_1 = \|v_{min}\| \cdot T$ and $r_2 = \|v_{max}\| \cdot T$, respectively, with sampling interval $T$.
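For concreteness, here is a minimal Python sketch of the transition rule (4) with the gate heuristic (5); the names are ours, and ΔE is read as the CIE76 Euclidean distance in LAB, one natural reading of the color difference of Section 2.

```python
import math
import random

def delta_E(lab1, lab2):
    """Euclidean (CIE76) color difference in LAB."""
    return math.dist(lab1, lab2)

def gate_heuristic(D_ij, r1, r2, kappa=0.1):
    """Eq. (5): full weight inside the annular gate [r1, r2], kappa outside."""
    return 1.0 if r1 <= D_ij <= r2 else kappa

def choose_next(ant_color, candidates, alpha=1.0):
    """Eq. (4): pick the next position candidate with probability proportional
    to pheromone color similarity times the gate heuristic. `candidates` is a
    list of (w_j, eta_ij) pairs, w_j being the LAB pheromone color on path ij."""
    weights = [math.exp(-alpha * delta_E(ant_color, w_j)) * eta_ij
               for (w_j, eta_ij) in candidates]
    return random.choices(range(len(candidates)), weights=weights, k=1)[0]
```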
While walking from $i$ to $j$, the ant deposits its corresponding color pheromone with a given amount $\tau^s_0$:

$$\tau_{i,j} \leftarrow \tau_{i,j} + \tau^s_0 \qquad (6)$$

where $\tau^s_0$ is the locally added pheromone amount of color $s$, and $\tau_{i,j}$ is the resulting pheromone obtained by mixing the three primary colors in their individual amounts.
Once all ants of a given iteration have finished their individual tours, the pheromone amount on established tracks is updated globally, which directly changes the colors on these tracks. We consider three global update cases, one for each primary color.
¹ Without loss of generality, s = 1, 2, 3 represents cyan, magenta, and yellow, respectively.
(Figures: the search spaces Ω₂–Ω₄ formed by the bearings of Sensors 1 and 2; panels (a) and (b) with the (ρ, θ) parameter space; panel (c) showing the pairwise color differences ΔE_p(1,2), ΔE_p(2,3) and ΔE_p(1,3) between the segments of track p.)
In these updates, $\Delta\tau^{s,p}_{i,j}$ is the amount of pheromone of color $s$ added on the segment $ij$ of track $p$. In our ant system, $\Delta\tau^{s,p}_{i,j}$ is defined as follows:

$$\Delta\tau^{s,p}_{i,j} = Q^s_0 J_p \qquad (8)$$

where $Q^s_0$ is an adjustment constant related to the $s$th of the three primary colors, and $J_p$ is the objective function discussed below.
The Hough transform (H-T) has been recognized as a robust technique for line or curve detection in the image detection field; it is in essence a transform from a point
As shown in Fig. 2(c), any two segments on track $p$ are selected to compute the corresponding color difference, so three terms in total must be calculated.
4 Numerical Simulation
Numerous simulations of different cases were conducted on a DELL 6 GHz processor with 1.99 GB RAM; we present only the case of three-track initiation due to layout restrictions. Figs. 3(a) and 4(a) present the obtained colored boards in clutter-free and cluttered environments, respectively. Since our goal is to select all potential tracks of nearly cyan, magenta, or yellow, the tracks extracted using both color and distance difference thresholds ($\varepsilon_c = 60$, $\varepsilon_d = 400$) are plotted in Figs. 3(b) and 4(b); each is nearly cyan, magenta, or yellow. The simulation results indicate that the proposed algorithm enjoys robust track initiation performance in both clutter-free and cluttered environments.
(Figs. 3 and 4: colored boards and the extracted tracks 1–3 with ghosts, plotted in x-y coordinates (m), for the clutter-free and cluttered environments, respectively.)
References
1. Cheng, H.-Y., Hwang, J.-N.: Adaptive particle sampling and adaptive appearance for
multiple video object tracking. Signal Processing 89(9), 1844–1849 (2009)
2. Vo, B.-N., Singh, S., Doucet, A.: Sequential Monte Carlo methods for multi-target filtering
with random finite sets. IEEE Trans. On Aerospace & Electronic Systems 41(4), 1224–
1245 (2005)
3. Hu, Z., Leung, H., Blanchette, M.: Statistical performance analysis of track initiation
techniques. IEEE Transactions on Signal Processing 45(2), 445–456 (1997)
4. Xu, B., Chen, Q., Wang, Z.: Ants for Track Initiation of Bearings-Only Tracking.
Simulation Modelling Practice and Theory 16(6), 626–638 (2008)
5. Xu, B., Wang, Z.: A Multi-objective-ACO-Based Data Association Method for Bearings-
Only Multi-Target Tracking. Communications in Nonlinear Science and Numerical
Simulation 12(8), 1360–1369 (2007)
6. Albers, J.: Interaction of Color. Revised and Expanded edn. Yale University Press, New
Haven (2006)
7. Bhattacharya, P., Rosenfeld, A., Weiss, I.: Point-to-line mappings as Hough transforms. Pattern Recognition Letters 23, 1705–1710 (2002)
Visual Tracking of Multiple Targets by
Multi-Bernoulli Filtering of Background
Subtracted Image Data
1 Introduction
Single-view visual tracking techniques invariably consist of detection followed by
filtering. A detection module generates point measurements from the images in
the video sequence which are then utilised as inputs by a filtering module, which
estimates the number of targets and their states (properties such as location
and size). Detection is an integral part of single-view visual tracking techniques.
There is a large body of literature on models and techniques for detecting tar-
gets based on various background and foreground models. One of the most pop-
ular approaches is the detection of targets based on matching colour histograms
of rectangular blobs [1,2]. Other recent methods include a game-theoretic ap-
proach [3], using human shape models [4,5], multi-modal representations [6],
sample-based detection [7], range segmentation [8] and a multi-step detection
scheme including median filtering, thresholding, binary morphology and con-
nected components analysis [9].
Detection compresses the information in the image into a finite set of point measurements, and is efficient in terms of memory as well as computational requirements. However, this approach may not be adequate when the information
loss incurred in the detection process becomes significant. Another problem with
using detection is the selection of a suitable measurement model for the filtering
algorithm. Modelling the detection process in a computationally tractable man-
ner is a difficult problem. In practice, the selection of the measurement model is
done on an ad-hoc basis and requires the manual tuning of model parameters.
Using random finite set (RFS) theory, a tractable framework for tracking mul-
tiple targets from video data without detection was recently introduced in [10].
This work led to a novel method for tracking multiple targets in video and has been successfully demonstrated on tracking sports players [11]. However, this method requires prior information about the visual appearance of the targets to be tracked, and is most useful in cases where a visual target model is available either a priori or from training data. In many applications, such as people
surveillance, there is no prior information about the visual appearance of the
targets and a new algorithm is needed.
This paper presents a novel algorithm that tracks multiple moving targets
directly from the image without any training data. Our proposed algorithm
gradually learns and updates a probabilistic background model based on ker-
nel density estimation. The resulting background model is then subtracted to
generate a grey scale foreground image from which the multi-target posterior
distribution can be computed analytically using the multi-Bernoulli update of
[10]. A sequential Monte Carlo implementation of the multi-Bernoulli filter is de-
tailed and demonstrated through case studies involving people tracking in video
sequences.
2 Background
In the context of jointly estimating the number of states and their values, the
collection of states, referred to as the multi-target state, is naturally represented
as a finite set. The rationale behind this representation traces back to a funda-
mental consideration in estimation theory, namely estimation error; see for example [10].
Since the state and measurement are treated as realisations of random variables
in the Bayesian estimation paradigm, the finite-set-valued (multi-target) state X
is modelled as a random finite set (RFS). Mahler’s Finite Set Statistics (FISST)
provides powerful yet practical mathematical tools for dealing with RFSs [12],
[13], based on a notion of integration and density that is consistent with the well-
established point process theory [14]. FISST has attracted substantial interest
from academia as well as the commercial sector with the developments of the
Probability Hypothesis Density (PHD) and Cardinalized PHD filters [12],[14],
[15], [16], [17].
Let us denote the frame image observation by y = [y1 . . . ym ]. Then, using the
FISST notion of integration and density, we can compute the posterior probability
density π(·|y) of the multi-target state from the prior density via Bayes rule:
$$\pi(X|y) = \frac{g(y|X)\,\pi(X)}{\int g(y|X)\,\pi(X)\,\delta X} \qquad (1)$$

where $g(y|X)$ is the probability density (likelihood) of the observation $y$ given the multi-target state $X$, and the integral over the space of finite sets is defined as follows:

$$\int f(X)\,\delta X \triangleq \sum_{i=0}^{\infty} \frac{1}{i!} \int f(\{x_1, \ldots, x_i\})\,dx_1 \cdots dx_i. \qquad (2)$$
If the multi-target RFS has a multi-Bernoulli prior distribution $\{(r^{(i)}, p^{(i)})\}_{i=1}^{M}$, then the posterior distribution of $X$, given by Bayes rule (1), is also multi-Bernoulli with parameters $\{(r^{(i)}_{\mathrm{updated}}, p^{(i)}_{\mathrm{updated}})\}_{i=1}^{M}$, where:
3 Visual Likelihood
Using background subtraction, each frame image is transformed into a grey scale
image in which each pixel value is the probability density of the pixel belong-
ing to the background. The background subtraction method used in this work
is based on kernel density estimation which has been quite popular in visual
tracking [18,19,20]. The resulting grey scale image is then used as input to the
multi-target filter. For simplicity of notation, we will use the y and yi symbols
for the background subtracted grey scale image and its pixel values (which are
indeed the probability density values of the actual pixel in the colour image to
belong to background). We also assume that the yi values are normalised to the
interval [0, 1].
where $N(x; x_0, \sigma) \triangleq \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-x_0)^2}{2\sigma^2}\right)$, and $\sigma_r$, $\sigma_g$ and $\sigma_I$ are the bandwidths of the Gaussian kernels for the rgI colours, user-defined parameters chosen between 0 and 1. To normalise the $p_i(k)$ values to vary within [0,1], the density normalisation factors $\big((2\pi)^{3/2}\sigma_r\sigma_g\sigma_I\big)^{-1}$ are removed, which results in the normalised $y_i$ values:
$$\bar{y}_j = \frac{1}{m_j} \sum_{i \in T(x_j)} y_i$$

where $m_j = |T(x_j)|$ is the number of pixels within the region $T(x_j)$ defined by the state $x_j$. The likelihood that the region $T(x_j)$ includes a target is expressed as a function of $\bar{y}_j$, denoted by $g_F(\bar{y}_j)$. This function should be strictly decreasing on [0,1]; an appropriate choice is $g_F(\bar{y}_j) = \zeta_F \exp(-\bar{y}_j/\delta_F)$, where $\delta_F$ is a control parameter tuning the sensitivity to large average pixel values and $\zeta_F$ is a normalising constant; see Fig. 1(a). Based on independence assumptions, the likelihood that all elements of the state set $X$ include target regions in the background-subtracted image is given by $\prod_{j=1}^{n} g_F(\bar{y}_j)$.
The rest of the pixels in the image, which do not belong to any of the regions $T(x_j)$ ($j = 1, \ldots, n$), are highly likely to belong to the background. This is an important condition; otherwise, there might be more than $n$ targets in the scene, violating the premise that there are $n$ targets. Let us denote the rest of the image by:

$$y_{-X} \triangleq (y) - \bigcup_{j=1}^{n} \{y_i \,|\, i \in T(x_j)\} \qquad (12)$$
where $(y)$ is the result of mapping the matrix $y$ to a set containing all the pixel values. We also construct a new image by filling up all the target regions with background pixels (all $y_i$ values equal to 1), and denote the set of its pixel values by $(y; X)$, which can be expressed as:

$$(y; X) = \Big[\bigcup_{j=1}^{n} \underbrace{\{1, \cdots, 1\}}_{m_j \text{ times}}\Big] \cup y_{-X}. \qquad (13)$$

The (possibly weighted) average of the pixels belonging to $(y; X)$ is given by:

$$\bar{y}_B = \frac{1}{m}\left(\sum_{i=1}^{m} y_i + \sum_{j=1}^{n} \sum_{i \in T(x_j)} (1 - y_i)\right). \qquad (14)$$
Fig. 1. (a) The foreground likelihood $g_F(\bar{y}_j) = \zeta_F \exp(-\bar{y}_j/\delta_F)$ over $\bar{y}_j \in [0,1]$; (b) the background likelihood $g_B(\bar{y}_B) = \zeta_B \exp(\bar{y}_B/\delta_B)$ over $\bar{y}_B \in [0,1]$.
This average lies within [0,1] and is expected to be very close to 1. Indeed, if any targets exist in the image but are not included in the hypothesised state $X$, the low values of the pixels belonging to those target regions will decrease $\bar{y}_B$. If the average target size is small relative to the whole image, this decreasing effect can be small. Therefore, the likelihood of $\bar{y}_B$ representing a background region should be large only for $\bar{y}_B$ values very close to 1. It is important to note that scattered noise (e.g. salt-and-pepper noise) in the background-subtracted image may reduce $\bar{y}_B$, similar to the effect of small targets. To prevent this, we remove such tiny noise and other small-valued areas of the image by morphologically closing the image (dilation followed by erosion using a small structural element).
We denote the likelihood of $\bar{y}_B$ representing an all-background region by $g_B(\bar{y}_B)$, which is expected to be an increasing function of $\bar{y}_B$ on [0,1]. We choose the exponential function $g_B(\bar{y}_B) = \zeta_B \exp(\bar{y}_B/\delta_B)$, where $\delta_B$ is a control parameter tuning the sensitivity to deviations of the average pixel value from 1, and $\zeta_B$ is a normalising constant; see Fig. 1(b). As we will see later, the exponential form is necessary to provide a separable form of the total likelihood. Substituting $\bar{y}_B$ from equation (14), we derive:
Replacing y B from equation (14), we derive:
m n
i=1 yi + j=1 i∈T (xj ) (1 − yi )
gB (y B ) = ζB exp (15)
m δB
m n
i=1 y i j=1 i∈T (x j ) (1 − y i)
= ζB exp exp (16)
m δB m δB
m
i=1 yi
n
mj − i∈T (xj ) yi
= ζB exp exp (17)
m δB j=1
m δB
m n
i=1 yi
mj (1 − y j )
= ζB exp exp . (18)
m δB j=1
m δB
Finally, the total likelihood of the image $y$ for the given set of states $X$ is given by:

$$g(y|X) = g_B(\bar{y}_B) \prod_{j=1}^{n} g_F(\bar{y}_j). \qquad (19)$$
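Because (19) is separable, its evaluation is cheap. The sketch below (our illustration; the δ values are arbitrary, and the normalising constants ζ are dropped by working in log form) computes the likelihood of a hypothesised set of target regions directly from a background-subtracted image, using Eq. (14) for the background average.

```python
import numpy as np

def total_log_likelihood(y, regions, delta_F=0.1, delta_B=0.05):
    """Log of the separable likelihood (19) for a background-subtracted
    image y (values in [0, 1]; 1 = background) and a list of boolean masks
    T(x_j), one per hypothesised target. Normalising constants zeta_F and
    zeta_B are dropped, which is harmless inside a Bayes update."""
    m = y.size
    # Foreground terms: log g_F(ybar_j) = -ybar_j / delta_F (+ const)
    log_gF = sum(-y[mask].mean() / delta_F for mask in regions)
    # Background term via Eq. (14): fill target regions with value 1
    ybar_B = (y.sum() + sum((1.0 - y[mask]).sum() for mask in regions)) / m
    log_gB = ybar_B / delta_B   # log g_B(ybar_B) = ybar_B / delta_B (+ const)
    return log_gF + log_gB

# Example on a synthetic image with one hypothesised 40x30 target region.
y = np.random.rand(240, 320)
mask = np.zeros_like(y, dtype=bool)
mask[100:140, 150:180] = True
print(total_log_likelihood(y, [mask]))
```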
where $\varrho^{(i)}_k = \sum_{j=1}^{L^{(i)}_{k|k-1}} w^{(i,j)}_{k|k-1}\, g_{y_k}\big(x^{(i,j)}_{k|k-1}\big)$ [10].
Similar to the MeMBer filter [21], the updated particles are resampled with
the number of particles reallocated in proportion to the probability of existence
as well as restricted between a minimum Lmin and maximum Lmax . To reduce
the growing number of multi-Bernoulli parameters, those with probabilities of
existence less than a small threshold (set at 0.01) are removed. In addition,
the targets with substantial overlap are merged. Finally, the number of targets
and their states are estimated via finding the multi-Bernoulli parameters with
existence probabilities larger than a threshold (set at 0.5 in our experiments).
Each target state estimate is then given by the weighted average of the particles
of the corresponding density.
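A minimal sketch of this housekeeping step, under an assumed particle representation of each Bernoulli component (the dict layout is ours, not the authors'):

```python
import numpy as np

def prune_and_estimate(bernoullis, r_prune=0.01, r_est=0.5):
    """Housekeeping as described above, on hypotheses of the form
    {'r': existence prob., 'particles': (L, d) array, 'weights': (L,) array}:
    drop hypotheses with r below r_prune, and report a weighted-mean state
    estimate for every hypothesis with r above r_est."""
    kept = [b for b in bernoullis if b['r'] >= r_prune]
    estimates = [np.average(b['particles'], axis=0, weights=b['weights'])
                 for b in kept if b['r'] > r_est]
    return kept, estimates
```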
5 Tracking Experiments
We demonstrate our method for tracking moving people in three video sequences
from the CAVIAR dataset1 which is a benchmark for visual tracking experi-
ments. The tracking results are available to download and view from our home
page.2 The first video shows two persons each entering the lobby of a lab in
INRIA and leaving the environment. The second video shows people walking in
a shopping centre and occasionally visiting a shop that is in the front view of
the camera. The third video shows four people entering the same place as in the
first video, walking together and leaving the lobby. Except for a small number
of frames, the four people are relatively accurately detected and tracked at all
times. In this video, we also show the background subtracted (grey scale) images
to give an indication of how our tracking method uses the results of background
subtraction.
Figure 2 shows snapshots of the third video. It demonstrates that in general,
our method can accurately track multiple targets in the video. The tracking
results in the frames shown in Fig. 2 also present the ability of our tracking
technique in detecting the arrival of new targets into the scene and tracking them
while moving and interacting with other targets, and detecting their departure
from the scene.
6 Conclusions
A novel algorithm for tracking multiple targets directly from image observa-
tions has been presented. Using kernel density estimation, the proposed algo-
rithm gradually learns and updates a probabilistic background model which is
then used to generate a grey scale foreground image. A separable likelihood
function has been derived for the grey scale foreground image, which enabled
an efficient multi-target filtering technique called multi-Bernoulli filtering to be
applied. The method has been evaluated in three tracking scenarios from the
CAVIAR datasets, showing that multiple persons can be tracked accurately.
¹ http://homepages.inf.ed.ac.uk/rbf/CAVIARDATA1/
² Video 1: www.dlsweb.rmit.edu.au/eng1/Mechatronics/Case01.mpg
Video 2: www.dlsweb.rmit.edu.au/eng1/Mechatronics/Case02.mpg
Video 3: www.dlsweb.rmit.edu.au/eng1/Mechatronics/Case03.mpg
(Snapshots of frames 83, 152, 241 and 327 of 491, each shown alongside its background-subtracted image.)
Fig. 2. Tracking of up to four people in a video sequence from CAVIAR dataset. The
selected frames show that the method is capable of detecting and tracking multiple
moving objects as they enter the scene, interact and leave the scene.
Acknowledgement
This work was supported by ARC Discovery Project grant DP0880553. Authors
thank Dr Branko Ristic from DSTO, Australia for his contribution.
References
1. Okuma, K., Taleghani, A., De Freitas, N., Little, J., Lowe, D.: A boosted particle
filter: Multitarget detection and tracking. In: Pajdla, T., Matas, J(G.) (eds.) ECCV
2004. LNCS, vol. 3021, pp. 28–39. Springer, Heidelberg (2004)
2. Kristan, M., Per, J., Pere, M., Kovacic, S.: Closed-world tracking of multiple in-
teracting targets for indoor-sports applications. Computer Vision and Image Un-
derstanding 113(5), 598–611 (2009)
3. Yang, M., Yu, T., Wu, Y.: Game-theoretic multiple target tracking. In: ICCV 2007,
Rio de Janeiro, Brazil (2007), http://dx.doi.org/10.1109/ICCV.2007.4408942
4. Wu, B., Nevatia, R.: Detection and tracking of multiple, partially occluded humans
by Bayesian combination of edgelet based part detectors. IJCV 75(2), 247–266
(2007)
5. Zhao, T., Nevatia, R., Wu, B.: Segmentation and tracking of multiple humans in
crowded environments. PAMI 30(7), 1198–1211 (2008)
6. Apewokin, S., Valentine, B., Bales, R., Wills, L., Wills, S.: Tracking multiple pedes-
trians in real-time using kinematics. In: CVPR 2008 Workshops, Anchorage, AK,
United states (2008), http://dx.doi.org/10.1109/CVPRW.2008.4563149
7. Zhu, L., Zhou, J., Song, J.: Tracking multiple objects through occlusion with online
sampling and position estimation. Pattern Recognition 41(8), 2447–2460 (2008)
8. Parvizi, E., Wu, Q.J.: Multiple object tracking based on adaptive depth segmen-
tation. In: Canadian Conference on Computer and Robot Vision – CRV 2008,
Windsor, ON, Canada, pp. 273–277 (2008)
9. Abbott, R., Williams, L.: Multiple target tracking with lazy background subtrac-
tion and connected components analysis. Machine Vision and Applications 20(2),
93–101 (2009)
10. Vo, B.N., Vo, B.T., Pham, N.T., Suter, D.: Bayesian multi-object estimation from image observations. In: Fusion 2009, Seattle, Washington, pp. 890–898 (2009)
11. Hoseinnezhad, R., Vo, B.N., Suter, D., Vo, B.T.: Multi-object filtering from image
sequence without detection. In: ICASSP, Dallas, TX, pp. 1154–1157 (2010)
12. Mahler, R.: Multi-target Bayes filtering via first-order multi-target moments. IEEE
Trans. Aerospace & Electronic Systems 39(4), 1152–1178 (2003)
13. Mahler, R.: Statistical multisource-multitarget information fusion. Artech House,
Boston (2007)
14. Vo, B.N., Singh, S., Doucet, A.: Sequential Monte Carlo methods for multi-target
filtering with random finite sets. IEEE Tran. AES 41(4), 1224–1245 (2005)
15. Vo, B.N., Ma, W.K.: The Gaussian mixture probability hypothesis density filter.
IEEE Trans. Signal Proc. 54(11), 4091–4104 (2006)
16. Mahler, R.: PHD filters of higher order in target number. IEEE Trans. Aerospace
& Electronic Systems 43(4), 1523–1543 (2007)
17. Vo, B.T., Vo, B.N., Cantoni, A.: Analytic implementations of the Cardinalized
Probability Hypothesis Density filter. IEEE Trans. Signal Processing 55(7), 3553–
3567 (2007)
18. Tyagi, A., Keck, M., Davis, J.W., Potamianos, G.: Kernel-based 3D tracking. In:
CVPR 2007, Minneapolis, Minnesota, USA (2007)
19. Elgammal, A., Duraiswami, R., Harwood, D., Davis, L.S.: Background and fore-
ground modeling using nonparametric kernel density estimation for visual surveil-
lance. Proceedings of the IEEE 90(7), 1151–1162 (2002)
20. Han, B., Comaniciu, D., Zhu, Y., Davis, L.S.: Sequential kernel density approx-
imation and its application to real-time visual tracking. PAMI 30(7), 1186–1197
(2008)
21. Vo, B.T., Vo, B.N., Cantoni, A.: The cardinality balanced multi-target multi-
Bernoulli filter and its implementations. IEEE Transactions on Signal Process-
ing 57(2), 409–423 (2009)
Mobile Robotics in a Random Finite Set
Framework
1 Introduction
probabilistic robot and sensor models and attempt to extract optimal estimates
of the map and robots.
By far the most common approach to the problem is to use a random vector
framework, in which the map and robot paths are modeled as random vec-
tors containing positional information about the features and robot locations
respectively [1]. While this model is the basis for the majority of existing mobile
robotics algorithms, it requires independent data association and map manage-
ment routines to respectively assign measurements to features and to estimate
the number of features in the map [5], [6]. Recently, a new framework has been
developed using Random Finite Set (RFS) models [7], [8], [9], which alleviates
the need for independent routines and unifies the stochastic mobile robotics
framework into a single Bayesian recursion. This new approach admits numerous benefits, such as the removal of explicit data association, increased robustness to measurement error, integrated map management, straightforward fusion of multiple robot map estimates and expected-map estimation, and it can be readily applied to single- or multiple-robot scenarios.
This paper advocates a fully integrated Bayesian framework for mobile robotics
under DA uncertainty and unknown feature number. The key to this formula-
tion is the representation of the map as a finite set of features. Indeed, from an estimation viewpoint, it is argued below that the map is a finite set and not a vector. Using Random Finite Set (RFS) theory, mobile robotics is
then posed as a Bayesian filtering problem in which the posterior distribution of
the set-valued map is propagated forward in time as measurements arrive. In the
case of an unknown robot path, the joint density including the robot trajectory
can be propagated. A tractable solution which propagates the first order mo-
ment of the map, its Probability Hypothesis Density (PHD), is presented. The
PHD construct can also be interpreted in terms of occupancy maps [9], [10]. In
this paper, both the map estimation from a known robot path and joint map/
trajectory estimation from an unknown robot path are examined separately. In
particular, mapping robustness to multiple robots, which may interfere with the
map building process, is demonstrated.
2 Background
Map estimation is closely related to the multi-target filtering problem, where the
aim is to jointly estimate the time-varying number of targets (features) and their
states from sensor measurements in the presence of data association uncertainty,
detection uncertainty, clutter and noise. The first systematic treatment of this
problem using random set theory was conceived by Mahler in 1994 [11], which
later developed into Finite Set Statistics and the Probability Hypothesis Density
(PHD) filter in 2003 [12]. A detailed treatment can be found in [13]. The mobile
robotics problem was first formulated in an RFS framework in [7], with mapping
and localisation algorithms presented in [8]. The approach modeled the joint vehicle trajectory and map as a single RFS, and recursively propagated its first-order moment. Stemming from the popular FastSLAM algorithm [6], a factored
where $\bar{\mathcal{M}}_{k-1} = \mathcal{M} - \mathcal{M}_{k-1}$, i.e., the set of features that are not in $\mathcal{M}_{k-1}$. If $f_{k|k-1}(\mathcal{M}_k|\mathcal{M}_{k-1}, X_{k-1})$ then represents the RFS feature map state transition density, the generalised Bayesian RFS robotic mapping recursion can be written [17],

$$p_{k|k-1}(\mathcal{M}_k|Z_{0:k-1}, X_{0:k}) = \int f_{k|k-1}(\mathcal{M}_k|\mathcal{M}_{k-1}, X_k)\,p_{k-1}(\mathcal{M}_{k-1}|Z_{0:k-1}, X_{0:k-1})\,\delta\mathcal{M}_{k-1} \qquad (3)$$

$$p_k(\mathcal{M}_k|Z_{0:k}, X_{0:k}) = \frac{g_k(Z_k|\mathcal{M}_k, X_k)\,p_{k|k-1}(\mathcal{M}_k|Z_{0:k-1}, X_{0:k})}{\int g_k(Z_k|\mathcal{M}, X_k)\,p_{k|k-1}(\mathcal{M}|Z_{0:k-1}, X_{0:k})\,\delta\mathcal{M}} \qquad (4)$$

where $g_k(Z_k|\cdot)$ denotes the likelihood of the RFS measurement and $\delta$ denotes a set integral. Integration over the map requires integration over all possible feature maps (all possible locations and numbers of features).
$$p_k(\mathcal{M}_k, X_{1:k}|Z_{0:k}, U_{0:k-1}, X_0) = p_k(X_{1:k}|Z_{0:k}, U_{0:k-1}, X_0)\,p_k(\mathcal{M}_k|Z_{0:k}, X_{0:k}) \qquad (5)$$
where $U_{0:k-1}$ denotes the random vector of robot control inputs. Note that the second term is exactly equivalent to the posterior of (4). The first term can be calculated via

$$p_k(X_{1:k}|Z_{0:k}, U_{0:k-1}, X_0) = g_k(Z_k|\mathcal{M}_k, X_k)\,\frac{p_{k|k-1}(\mathcal{M}_k|Z_{0:k-1}, X_{0:k})}{p_k(\mathcal{M}_k|Z_{0:k}, X_{0:k})} \times \frac{p_{k|k-1}(X_{1:k}|Z_{0:k-1}, U_{0:k-1}, X_0)}{g_k(Z_k|Z_{0:k-1})} \qquad (6)$$
Further details can be seen in [9], [10], [15], [16].
where $b(m|X_k)$ is the PHD of the new feature RFS, $B(X_k)$. The PHD corrector equation is then,

$$v_k(m|X_{0:k}) = v_{k|k-1}(m|X_{0:k})\left[1 - P_D(m|X_k) + \sum_{z\in Z_k}\frac{\Lambda(m|X_k)}{c_k(z|X_k) + \int \Lambda(\zeta|X_k)\,v_{k|k-1}(\zeta|X_{0:k})\,d\zeta}\right] \qquad (10)$$
Fig. 1. An example of a map PHD superimposed on the true map, represented by black dots. The peaks of the PHD represent locations with the highest concentration of the expected number of features. The PHD on the left is at time k − 1, and that on the right is at time k.
4 Filter Implementations
This section outlines a Gaussian Mixture implementation of the proposed filters.
For the Robotic Mapping filter of Section 3.1, let the map PHD at time k − 1 be the Gaussian mixture

$$v_{k-1}(m|X_{k-1}) = \sum_{j=1}^{J_{k-1}} \eta_{k-1}^{(j)}\,\mathcal{N}\big(m;\,\mu_{k-1}^{(j)}, P_{k-1}^{(j)}\big) \qquad (11)$$
and let the new feature intensity be

$$b(m|X_k) = \sum_{j=1}^{J_{b,k}} \eta_{b,k}^{(j)}\,\mathcal{N}\big(m;\,\mu_{b,k}^{(j)}, P_{b,k}^{(j)}\big) \qquad (12)$$

where $J_{b,k}$ defines the number of Gaussians in the new feature intensity at time k and $\eta_{b,k}^{(j)}$, $\mu_{b,k}^{(j)}$ and $P_{b,k}^{(j)}$ are the corresponding components. The predicted intensity is therefore also a Gaussian mixture,

$$v_{k|k-1}(m|X_k) = \sum_{j=1}^{J_{k|k-1}} \eta_{k|k-1}^{(j)}\,\mathcal{N}\big(m;\,\mu_{k|k-1}^{(j)}, P_{k|k-1}^{(j)}\big) \qquad (13)$$
which consists of $J_{k|k-1} = J_{k-1} + J_{b,k}$ Gaussians representing the union of the prior map intensity, $v_{k-1}(m|X_{k-1})$, and the proposed new feature intensity, according to (9). Since the measurement likelihood is also of Gaussian form, it follows from (10) that the posterior map PHD, $v_k(m|X_k)$, is then also a Gaussian mixture given by,

$$v_k(m|X_k) = v_{k|k-1}(m|X_k)\big[1 - P_D(m|X_k)\big] + \sum_{z\in Z_k}\sum_{j=1}^{J_{k|k-1}} v_{G,k}^{(j)}(z, m|X_k) \qquad (14)$$

$$\eta_k^{(j)}(z|X_k) = \frac{P_D(m|X_k)\,\eta_{k|k-1}^{(j)}\,q^{(j)}(z, X_k)}{c(z) + \sum_{\ell=1}^{J_{k|k-1}} P_D(m|X_k)\,\eta_{k|k-1}^{(\ell)}\,q^{(\ell)}(z, X_k)} \qquad (15)$$

where $q^{(j)}(z, X_k) = \mathcal{N}\big(z;\,H_k\mu_{k|k-1}^{(j)}, S_k^{(j)}\big)$.
The terms $\mu_{k|k}^{(j)}$, $P_{k|k}^{(j)}$ and $S_k^{(j)}$ can be obtained using any standard filtering technique such as the EKF or UKF; in this paper, the EKF updates are adopted. The clutter RFS, $C_k$, is assumed Poisson distributed in number [2] and uniformly spaced over the mapping region, and Gaussian management methods are carried out as in [19].
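As a rough illustration of this corrector, the following numpy sketch applies (14)–(15) for a linear-Gaussian measurement model; the pose-dependent detection probability is simplified to a constant, and all names and parameter values are hypothetical rather than the authors' implementation.

```python
import numpy as np

def gauss(x, mu, S):
    """Gaussian density N(x; mu, S)."""
    d = x - mu
    return float(np.exp(-0.5 * d @ np.linalg.solve(S, d))
                 / np.sqrt(np.linalg.det(2.0 * np.pi * S)))

def gm_phd_update(w, m, P, Z, H, R, p_d=0.95, clutter=1e-4):
    """One Gaussian-mixture PHD corrector step in the spirit of (14)-(15):
    keep every prior component scaled by the missed-detection probability,
    and add one Kalman-updated copy of each component per measurement."""
    new_w = [(1.0 - p_d) * wj for wj in w]          # missed-detection terms
    new_m, new_P = list(m), list(P)
    for z in Z:
        S = [H @ Pj @ H.T + R for Pj in P]          # innovation covariances
        K = [Pj @ H.T @ np.linalg.inv(Sj) for Pj, Sj in zip(P, S)]
        q = [wj * p_d * gauss(z, H @ mj, Sj)        # detection likelihoods
             for wj, mj, Sj in zip(w, m, S)]
        denom = clutter + sum(q)                    # normaliser of (15)
        for j in range(len(w)):
            new_w.append(q[j] / denom)
            new_m.append(m[j] + K[j] @ (z - H @ m[j]))
            new_P.append((np.eye(len(m[j])) - K[j] @ H) @ P[j])
    return new_w, new_m, new_P
```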
For the Robotic Mapping filter of Section 3.2, the location density of (6) can be propagated via particle filtering techniques [6], [15]. If the vehicle transition density is chosen as the proposal distribution, the weighting for the i-th particle becomes,

$$w_k^{(i)} = g_k\big(Z_k|Z_{0:k-1}, X_{0:k}^{(i)}\big)\,w_{k-1}^{(i)}. \qquad (16)$$
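A minimal sketch of this weighting step, assuming a user-supplied map-conditioned likelihood function; the names are hypothetical stand-ins for the quantities in (16).

```python
import numpy as np

def reweight_particles(weights, particles, Zk, map_likelihood):
    """Trajectory-particle reweighting per (16): scale each particle's
    weight by the measurement likelihood under its own map filter,
    then renormalise. 'map_likelihood' stands in for
    g_k(Z_k | Z_0:k-1, X_0:k)."""
    w = np.array([wi * map_likelihood(p, Zk)
                  for wi, p in zip(weights, particles)])
    return w / w.sum()
```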
Fig. 2. Left: The simulated environment showing point features (green circles). A sam-
ple measurement history plotted from the robot trajectory (green line) is shown. Right:
Comparison of mapping error vs. measurement noise for the proposed filters and clas-
sical vector EKF solutions.
Fig. 3. Left: Feature mapping error vs. clutter density for vector based NN-EKF and
JCBB-EKF approaches and the proposed PHD framework, with the PHD approach
seen to perform well in high clutter. Right: Comparison of the map estimation error in
the presence of increasing densities per square meter of mobile robots.
Fig. 4. Left: The average estimated number of features in the map vs. ground truth
for each approach. The feature number estimate from the proposed approach can be
seen to closely track that of the ground truth. Right: A comparative plot of the mean
and standard deviation of the map estimation error vs. time. Note that the ‘ideal’ error
converges to zero, an important property for robotic mapping filters.
(Fig. 5, left panel: average positional RMSE (std) versus time index for GPS, RB-PHD-SLAM, NN-FastSLAM-FE and NN-FastSLAM-LQ. Right panel: the robot hardware, including an FMCW MMWR radar, two LMS 200 laser scanners, wheel and steering encoders, speed and motor controllers, an E-stop, a wireless modem and an industrial PC.)
Fig. 5. Left: The mean and standard deviation of the expected trajectory estimates of
the proposed RFS approach versus that of FastSLAM over 50 MC runs. LQ refers to
an implementation with the ‘landmark quality’ method of [4]. Right: The Autonomous
Robot used in experimental trials.
Fig. 6. A: Raw radar measurements and noisy vehicle path. B: The scan map plotted
from the GPS path. C: Posterior Estimate from FastSLAM, D: Posterior Estimate from
PHD-SLAM.
and mapping noise. Experimental results based on a 77 GHz millimeter wave radar mounted on an autonomous robot, as seen in Figure 5, are shown in Figure 6.
6 Conclusion
This paper showed, from a fundamental estimation viewpoint, that a feature-based map is a finite set, and subsequently presented a Bayesian filtering formulation as well as a tractable solution for the feature-based mobile robotics problem. The framework outlined here presents a new direction of research for the multiple mobile robot community, which naturally encapsulates the inherent system uncertainty. Both a mapping-only and a joint robot trajectory / map filter were introduced and analysed. In contrast to existing frameworks, the RFS approach to mobile robotics jointly estimates the number of features in the map as well as their individual locations in the presence of data association uncertainty and clutter. It was also shown that this Bayesian formulation admits a number of optimal Bayes estimators for mobile robotics problems. Analysis was carried out in a simulated environment through Monte Carlo trials, demonstrating the robustness of the proposed filter, particularly in the presence of large data association uncertainty and clutter, and illustrating the merits of adopting an RFS approach in swarm-based robotics applications.
Acknowledgements
References
1. Smith, R., Self, M., Cheeseman, P.: Estimating uncertain spatial relationships in
robotics. In: Autonomous Robot Vehicles, pp. 167–193 (1990)
2. Makarsov, D., Durrant-Whyte, H.: Mobile vehicle navigation in unknown envi-
ronments: a multiple hypothesis approach. In: IEE Proceedings of Contr. Theory
Applict., vol. 142 (July 1995)
3. Thrun, S.: Particle filters in robotics. In: Uncertainty in AI (UAI) (2002)
4. Dissanayake, G., Newman, P., Durrant-Whyte, H., Clark, S., Csorba, M.: A solu-
tion to the simultaneous localization and map building (SLAM) problem. IEEE
Transactions on Robotic and Automation 17(3), 229–241 (2001)
5. Guivant, J., Nebot, E., Baiker, S.: Autonomous navigation and map building using
laser range sensors in outdoor applications. Journal of Robotic Systems 17(10),
565–583 (2000)
6. Montemerlo, M., Thrun, S., Siciliano, B.: FastSLAM: A Scalable Method for the
Simultaneous Localization and Mapping Problem in Robotics. Springer, Heidelberg
(2007)
7. Mullane, J., Vo, B., Adams, M., Wijesoma, W.: A random set formulation for
Bayesian SLAM. In: Proceedings of the IEEE/RSJ International Conference on
Intelligent Robots and Systems, France (September 2008)
8. Mullane, J., Vo, B., Adams, M., Wijesoma, W.: A random set approach to SLAM.
In: Proceedings of the IEEE International Conference on Robotics and Automation
(ICRA) workshop on Visual Mapping and Navigation in Outdoor Environments,
Japan (May 2009)
9. Mullane, J., Vo, B., Adams, M., Vo, B.: A random finite set approach to Bayesian
SLAM. IEEE Transactions on Robotics 27(2), 268–283 (2011)
10. Mullane, J., Vo, B., Adams, M., Vo, B.: Random Finite Sets for Robot Mapping
& SLAM. Springer Tracts in Advanced Robotics (to appear)
11. Mahler, R.: Global integrated data fusion. In: Proc. 7th Nat. Symp. on Sensor
Fusion, vol. 1, pp. 187–199 (1994)
12. Mahler, R.: Multi-target Bayes filtering via first-order multi-target moments. IEEE
Transactions on AES 4(39), 1152–1178 (2003)
13. Mahler, R.: Statistical Multisource Multitarget Information Fusion. Artech House
(2007)
14. Kalyan, B., Lee, K., Wijesoma, W.: FISST-SLAM: Finite set statistical approach
to simultaneous localization and mapping. International Journal of Robotics Re-
search 29(10), 1251–1262 (2010), Published online first in October 2009
15. Mullane, J., Vo, B., Adams, M.: Rao-blackwellised PHD SLAM. In: Proceedings of
the IEEE International Conference on Robotics and Automation (ICRA), Alaska,
USA (May 2010)
16. Mullane, J., Keller, S., Rao, A., Adams, M., Yeo, A., Hover, F., Patrikalakis, N.:
X-band radar based SLAM in Singapore's off-shore environment. In: Proceedings
of the 11th IEEE ICARCV, Singapore (December 2010)
17. Vo, B., Singh, S., Doucet, A.: Sequential Monte Carlo methods for multi-target
filtering with random finite sets. IEEE Transactions on Aerospace and Electronic
Systems 41(4), 1224–1245 (2005)
18. Shoudong, H., Zhan, W., Dissanayake, G.: Sparse local submap joining filter for
building large-scale maps. IEEE Transactions on Robotics 24(5), 1121–1130 (2008)
19. Vo, B., Ma, W.: The Gaussian mixture probability hypothesis density filter. IEEE
Transactions on Signal Processing 54(11), 4091–4104 (2006)
IMM Algorithm for a 3D High Maneuvering Target
Tracking
1 Introduction
The problem of tracking maneuvering targets has been studied extensively since the mid 1960s. However, it is still a challenge to track targets that fly at high speeds, particularly those performing "high-g" turns in 3D space. When targets maneuver in a horizontal plane with nearly constant speed and turn rate and have little or limited vertical maneuver (such as civilian aircraft in an ATC system), many 2D horizontal models and algorithms can lead to an acceptable accuracy [1]. In practice, however, targets (such as military aircraft) may perform arbitrary trajectories in 3D space at high speed or in coordinated turns. In this situation, horizontal or decoupled models may lead to an unacceptable accuracy, so it is necessary to investigate an algorithm for 3D maneuvering target tracking.
The multiple model approach has been observed to be a successful method for maneuvering target tracking [2]. The results of previous investigations also indicate that the IMM algorithm is the superior technique for tracking maneuvering targets when the computational requirements of the technique are considered [1, 2, 3]. The IMM algorithm uses model (Markov chain state) probabilities to weight the inputs and outputs of a bank of parallel Kalman filters (or other filters) at each time instant. The key issue in the IMM algorithm is how to select models that match the real motion mode; in other words, the combination of models in the IMM algorithm plays an important role in the final tracking accuracy.
2 3D Motion Models
The Singer model was proposed by Singer in 1970 [5]. It assumes that the target acceleration $a(t)$ is a zero-mean, first-order stationary Markov process with autocorrelation [5]

$$R_a(\tau) = E\{a(t)\,a(t+\tau)\} = \sigma_a^2\,e^{-\alpha|\tau|} \quad (\alpha \ge 0) \qquad (2.4)$$

where $\alpha$ is the reciprocal of the maneuver time constant and depends on how long the maneuver lasts, and $\sigma_a^2$ is the "instantaneous variance" of the acceleration. Such a process $a(t)$ is the state process of a linear time-invariant system
The constant speed coordinated turn model [7, 8] assumes that the target moves along a circle at a constant turn rate in a plane (for a constant speed motion, the acceleration vector is orthogonal to the velocity vector). For an arbitrary plane of maneuver, the acceleration can be described as

$$a = \Omega \times v \qquad (2.11)$$

where $\Omega$ is the (constant) turn rate vector, $\dot{\Omega} = 0$, and $v$ is the velocity vector. Taking the derivative of (2.11) leads to the equivalent form

$$\dot{a} = (\Omega \cdot v)\,\Omega - (\Omega \cdot \Omega)\,v \qquad (2.12)$$

Using the fact that $v$ is orthogonal to $\Omega$, that is, $\Omega \perp v$, (2.12) can be reformulated as

$$\dot{a} = -\omega^2 v \qquad (2.13)$$

where $\omega$ is defined as

$$\omega \triangleq \|\Omega\| = \frac{\|a\|}{\|v\|} \qquad (2.14)$$

If the acceleration perturbations are modeled as white noise $w$, (2.13) can be expressed as

$$\dot{a} = -\omega^2 v + w \qquad (2.15)$$

The corresponding 3D discrete-time model is

$$x(k) = \mathrm{diag}[F(\omega), F(\omega), F(\omega)]\,x(k-1) + w(k) \qquad (2.16)$$

where

$$F(\omega) = \begin{bmatrix} 1 & \dfrac{\sin\omega T}{\omega} & \dfrac{1-\cos\omega T}{\omega^2} \\[1ex] 0 & \cos\omega T & \dfrac{\sin\omega T}{\omega} \\[1ex] 0 & -\omega\sin\omega T & \cos\omega T \end{bmatrix} \qquad (2.17)$$

and T is the sampling period.
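A small numpy sketch of the transition matrix construction in (2.16)–(2.17), assuming a per-axis state [position, velocity, acceleration]; names and example values are hypothetical.

```python
import numpy as np

def ct_block(omega, T):
    """Per-axis block F(omega) of (2.17)."""
    s, c = np.sin(omega * T), np.cos(omega * T)
    return np.array([[1.0, s / omega, (1.0 - c) / omega ** 2],
                     [0.0, c,          s / omega],
                     [0.0, -omega * s, c]])

def ct_transition_3d(omega, T):
    """Block-diagonal 9x9 transition matrix of (2.16) over the x, y, z axes."""
    return np.kron(np.eye(3), ct_block(omega, T))

F = ct_transition_3d(omega=0.1, T=1.0)   # e.g. a 0.1 rad/s turn with 1 s sampling
```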
3 IMM Estimator
Here we consider a typical linear dynamic system, which can be represented as

$$X(k) = F(k)\,X(k-1) + W(k) \qquad (3.1)$$
$$Z(k) = H(k)\,X(k) + V(k) \qquad (3.2)$$
The main steps of the IMM estimator [3, 9, 10, 11] are as follows:
Step 1- Model Interaction or Mixing
The mode-conditioned state estimate and the associated covariances from the
previous iteration are mixed to obtain the initial condition for the mode-matched
filters. The initial condition in cycle k for the Kalman filter matched to the j-th mode
is computed using
$$\hat{X}_j^0(k-1) = \sum_{i=1}^{r}\hat{X}_i(k-1)\,\mu_{i|j}(k-1) \qquad (3.3)$$

and

$$P_j^0(k-1) = \sum_{i=1}^{r}\mu_{i|j}(k-1)\Big\{P_i(k-1) + \big[\hat{X}_i(k-1) - \hat{X}_j^0(k-1)\big]\big[\hat{X}_i(k-1) - \hat{X}_j^0(k-1)\big]'\Big\} \qquad (3.4)$$
where r is the number of model-matched filters used. The state estimates and their covariance matrix at time k−1 conditioned on the i-th model are denoted by $\hat{X}_i(k-1)$ and $P_i(k-1)$, respectively; $\mu_{i|j}(k-1)$ are the mixing probabilities and can be described as

$$\mu_{i|j}(k-1) \triangleq P\big\{m(k-1)=i \mid m(k)=j,\,Z^{k-1}\big\} = \frac{p_{ij}\,\mu_i(k-1)}{\sum_{l=1}^{r} p_{lj}\,\mu_l(k-1)} \quad (i, j = 1, 2, \dots, r) \qquad (3.5)$$

where m(k) is the index of the model in effect in the interval (k−1, k]. $\mu_i(k)$ is the probability that model i (i = 1, 2, …, r) is in effect in the above interval and can be expressed as

$$\mu_i(k) \triangleq P\big\{m(k)=i \mid Z^k\big\} \qquad (3.6)$$

The cumulative set of measurements up to and including scan k is denoted by $Z^k$. $p_{ij}$ is the model transition probability and is defined as

$$p_{ij} \triangleq P\big\{m(k)=j \mid m(k-1)=i\big\} \qquad (3.7)$$

The definitions of m(k−1), $\mu_i(k-1)$ and $Z^{k-1}$ are similar to the definitions of m(k), $\mu_i(k)$ and $Z^k$.
Step 2- Model-conditioned Filtering
According to the outline of the Kalman filtering, the mixed state $\hat{X}_j^0(k-1)$ and the associated covariance matrix $P_j^0(k-1)$ are matched to each model to yield the mode-conditioned estimates $\hat{X}_j(k)$; the combined state estimate is then

$$\hat{X}(k) = \sum_{j=1}^{r}\hat{X}_j(k)\,\mu_j(k) \qquad (3.11)$$
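As an illustration, a minimal numpy sketch of the interaction step (3.3)–(3.5) and the output combination (3.11); the mode-matched Kalman filters themselves are omitted, and all names are hypothetical.

```python
import numpy as np

def imm_mix(x_hats, Ps, mu, p_trans):
    """IMM interaction step of (3.3)-(3.5): mix the r mode-conditioned
    estimates into the initial condition of each mode-matched filter."""
    r = len(mu)
    c = p_trans.T @ mu                        # c_j = sum_l p_lj mu_l
    mu_mix = (p_trans * mu[:, None]) / c      # mu_mix[i, j] = p_ij mu_i / c_j
    x0, P0 = [], []
    for j in range(r):
        xj = sum(mu_mix[i, j] * x_hats[i] for i in range(r))
        Pj = sum(mu_mix[i, j]
                 * (Ps[i] + np.outer(x_hats[i] - xj, x_hats[i] - xj))
                 for i in range(r))
        x0.append(xj)
        P0.append(Pj)
    return x0, P0

def imm_output(x_hats, mu):
    """Combined state estimate of (3.11)."""
    return sum(mj * xj for mj, xj in zip(mu, x_hats))
```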
The model transition probability matrix used in the simulations is

$$p_{ij} = \begin{bmatrix} 0.97 & 0.03 & 0 \\ 0 & 0.75 & 0.25 \\ 0.05 & 0 & 0.95 \end{bmatrix} \qquad (4.1)$$
When the target performs turns B and C, the RMSE of the three algorithms in X, Y and Z are shown in Fig. 1; Fig. 2 is a magnified view of Fig. 1 between 200 and 250 s.
It is clearly shown that when the target is not maneuvering, the performance of the three algorithms is almost the same. At turn B, CV-Singer-3DCSCT and CV-CSM-3DCSCT have almost the same RMSE and are slightly better than CV-CA-3DCSCT in tracking accuracy. When the assumption of the 3DCSCT model is slightly violated, such as at turn C, CV-Singer-3DCSCT and CV-CSM-3DCSCT also have almost the same RMSE; however, they are much better than CV-CA-3DCSCT.
Fig. 1. RMSE of the three IMM algorithms in X, Y and Z. Fig. 2. RMSE of the three IMM algorithms at turn C.
5 Conclusions
The benefits of using the CSM and the 3DCSCT model in an IMM algorithm to track a 3D high maneuvering target have been clearly demonstrated in this paper. When the target performs "high-g" turns in 3D space, the IMM algorithm utilizing the CSM is better than the other two IMM algorithms, which use the Singer and CA models. However, how to choose the parameters in the models and filters is an important issue to be addressed in future study.
References
1. Li, X.R., Jilkov, V.P.: Survey of maneuvering target tracking. Part V: Multiple-models. In: SPIE, vol. 4048, pp. 212–236 (2000)
2. Blom, H.A., Bar-Shalom, Y.: The interacting multiple model algorithm for systems with
markovian switching coefficient. IEEE Transactions on Automatic Control 33(8), 780–783
(1988)
3. Watson, G.A., Blair, W.D.: IMM algorithm for tracking targets that maneuver through
coordinated turn. In: SPIE, vol. 1698, pp. 236–247 (1992)
4. Nabaa, N., Bishop, R.H.: Validation and comparison of coordinated turn aircraft maneuver models. IEEE Transactions on Aerospace and Electronic Systems 36(1), 250–259 (2000)
5. Singer, R.A.: Estimating optimal tracking filter performance for manned maneuvering
targets. IEEE Transactions on Aerospace and Electronic Systems 6(4), 473–483 (1970)
6. Zhou, H.R., Jin, Z.L., Wang, P.D.: Maneuvering target tracking, pp. 135–145. National
Defence Industry Press, Beijing (1991)
7. Tahk, M., Speyer, J.L.: Target tracking problems subject to kinematic constraints. IEEE
Transactions on Automatic Control 35(3), 324–326 (1990)
8. Alouani, A.T., Blair, W.D.: Use of a kinematic constraint in tracking constant speed,
maneuvering targets. IEEE Transactions on Automatic Control 38(7), 1107–1111 (1993)
9. Bar-Shalom, Y., Li, X.R., Kirubarajan, T.: Estimation with applications to tracking and
navigation: theory, algorithms, and software, pp. 453–457. Wiley, New York (2001)
10. Li, X.R., Jilkov, V.P.: Survey of maneuvering target tracking. Part V: multiple-model
methods. IEEE Transactions on Aerospace and Electronic Systems 41(4), 1255–1321
(2005)
11. Kadirkamanathan, V., Li, P., Kirubarajan, T.: Sequential Monte Carlo filtering vs. the
IMM estimator for fault detection and isolation in nonlinear systems. In: SPIE, vol. 4389,
pp. 263–274 (2001)
A New Method Based on Ant Colony Optimization for
the Probability Hypothesis Density Filter*
1 Introduction
Multi-target tracking (MTT) is regarded as a classic but intractable problem in a wide variety of contexts. According to recent literature [1-5], data association (DA) approaches form the mainstream in MTT, but due to its combinatorial nature, the DA problem makes up the bulk of the computational load in the MTT field. The random finite set (RFS) formulation, which avoids explicit associations between measurements and tracks, has become an alternative in the recent decade. In particular, the probability hypothesis density (PHD) filter [6], a novel RFS-based filter, and its implementations have generated substantial interest.
The PHD filter operates on the single-target state space and avoids the combinatorial problem that arises from DA. This salient feature renders the PHD filter extremely attractive. However, the PHD recursion involves multiple integrals that have no closed form solutions in general. Fortunately, two methods have been successfully developed for approximating the PHD filter so that it can be implemented [7-8]: the sequential Monte Carlo PHD method (SMCPHD) [7] and the Gaussian mixture PHD method (GMPHD) [8]. Hundreds of papers based on the methods in [7-8] have been proposed in the recent decade, but most of them either apply the two methods directly in different fields or modify them with traditional DA algorithms.
* This work is supported by the National Natural Science Foundation of China (No. 60804068), the Natural Science Foundation of Jiangsu Province (No. BK2010261), and the Cooperation Innovation of Industry, Education and Academy of Jiangsu Province (No. BY2010126).
So far, there have been few reports on ant-based applications to parameter estimation or multi-target tracking, except [9-11]. In this work, a novel approximating method based on ant colony optimization (ACO) for the PHD filter is proposed. The remainder of this paper is organized as follows: Section 2 presents the background on the PHD filter; Section 3 describes the principle of the proposed method; numerical simulations are conducted and corresponding results are analyzed in Section 4; finally, conclusions are drawn in Section 5.
$$\int_S v(x)\,dx = \int |X \cap S|\,P(dX) \qquad (1)$$

where $|X|$ denotes the cardinality of a set X. In other words, the integral of v over any region S gives the expected number of elements of X that are in S. This intensity is commonly known in the tracking literature as the PHD.
Let $v_k$ and $v_{k|k-1}$ denote the respective intensities associated with the multi-target posterior density $p_k$ and the multi-target predicted density $p_{k|k-1}$. The PHD filter propagates the posterior intensity in time via the PHD recursion (2) and (3):

$$v_{k|k-1}(x) = \int p_{S,k}(\varsigma)\,f_{k|k-1}(x|\varsigma)\,v_{k-1}(\varsigma)\,d\varsigma + \int \beta_{k|k-1}(x|\varsigma)\,v_{k-1}(\varsigma)\,d\varsigma + \gamma_k(x) \qquad (2)$$

$$v_k(x) = \big[1 - p_{D,k}(x)\big]\,v_{k|k-1}(x) + \sum_{z\in Z_k}\frac{p_{D,k}(x)\,g_k(z|x)\,v_{k|k-1}(x)}{\kappa_k(z) + \int p_{D,k}(\xi)\,g_k(z|\xi)\,v_{k|k-1}(\xi)\,d\xi} \qquad (3)$$
where $\gamma_k(\cdot)$ denotes the intensity of the birth RFS $\Gamma_k$ at time k, $\beta_{k|k-1}(\cdot|\varsigma)$ denotes the intensity of the RFS $B_{k|k-1}(\varsigma)$ spawned at time k by a target with previous state $\varsigma$, $p_{S,k}(\varsigma)$ denotes the probability that a target still exists at time k given that its previous state is $\varsigma$, $f_{k|k-1}(\cdot|\varsigma)$ denotes the transition probability density of individual targets, $p_{D,k}(x)$ denotes the probability of detection given a state x at time k, $g_k(z|\cdot)$ denotes the likelihood of individual targets, and $\kappa_k(\cdot)$ denotes the intensity of the clutter RFS $K_k$ at time k.
In the third phase, the extremum search process is executed. Suppose the value of candidate i is denoted by $v_{k|k-1}^{(i)}$ and the values of its neighbors are denoted by $[v_{k|k-1}^{(i-1)}, v_{k|k-1}^{(i+1)}]$. If ant $a_m$ is located on candidate i, it will move to its left or right neighbor according to four moving behaviors, designed as follows:
• If $v_{k|k-1}^{(i-1)} < v_{k|k-1}^{(i)}$ and $v_{k|k-1}^{(i)} < v_{k|k-1}^{(i+1)}$ hold, ant $a_m$ will move to candidate i + 1.
• If $v_{k|k-1}^{(i-1)} > v_{k|k-1}^{(i)}$ and $v_{k|k-1}^{(i)} > v_{k|k-1}^{(i+1)}$ hold, ant $a_m$ will move to candidate i − 1.
• If $v_{k|k-1}^{(i-1)} \le v_{k|k-1}^{(i)}$ and $v_{k|k-1}^{(i)} \ge v_{k|k-1}^{(i+1)}$ hold, ant $a_m$ will select candidate i + 1 or i − 1 with a given probability threshold $P_0$.
• If $v_{k|k-1}^{(i-1)} > v_{k|k-1}^{(i)}$ and $v_{k|k-1}^{(i)} < v_{k|k-1}^{(i+1)}$ hold, ant $a_m$ will select candidate i + 1 or i − 1 with probability P, which is given by

$$P_{ij}^{(m)} = \frac{\tau_{ij}^{(m)}\,e^{-\eta_{ij}/C_1}}{\sum_{n\in\{i-1,\,i+1\}}\tau_{in}^{(m)}\,e^{-\eta_{in}/C_1}} \qquad (6)$$
where $C_2$ is a given positive constant. If the number of ants that have moved to candidate i at iteration t is l, the pheromone on candidate i is updated according to

$$\tau_i(t) = (1-\rho)\,\tau_i(t) + \sum_{m=1}^{l}\Delta\tau_i^m(t) \qquad (8)$$
Meanwhile, all ants will stay on points with locally maximum intensity function value. But not all of these points originate from true targets, so in the final phase the state extraction of targets is executed depending on the measurements at each time step. Given the importance density $g'_k(z|x)$, formula (3) can be approximated by

$$v_k(x) \approx \big[1 - p_{D,k}(x)\big]\,v_{k|k-1}(x) + \sum_{z\in Z_k}\frac{p_{D,k}(x)\,g'_k(z|x)\,v_{k|k-1}(x)}{\kappa_k(z) + C_3\,p_{D,k}(x)\,g'_k(z|x)\,v_{k|k-1}(x)} \qquad (9)$$

where $C_3$ is a given positive parameter. The value of formula (9) is computed for each candidate on which ants stay; if $v_k(x_i)$ of candidate i is smaller than a given parameter $\varepsilon$, all ants staying on candidate i die, and the candidates with surviving ants are regarded as states originating from true targets. These candidates are then used to initialize the process at the next time step.
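A minimal Python sketch of the moving rules above (hypothetical names; the valley case uses a plain coin flip as a stand-in for the pheromone-based probability of (6)).

```python
import numpy as np

def move_ant(i, v, P0=0.5, rng=None):
    """One move of an ant sitting on candidate i of the sampled
    intensity v, following the four behaviors described above."""
    rng = np.random.default_rng() if rng is None else rng
    left, mid, right = v[i - 1], v[i], v[i + 1]
    if left < mid < right:                   # intensity rises to the right
        return i + 1
    if left > mid > right:                   # intensity rises to the left
        return i - 1
    if left <= mid >= right:                 # local peak: jitter via threshold P0
        return i + 1 if rng.random() < P0 else i - 1
    return i + 1 if rng.random() < 0.5 else i - 1   # valley: stand-in for (6)
```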
4 Numerical Simulations
For illustration purposes, a two-dimensional scenario with an unknown and time-varying number of targets observed in clutter over the surveillance region [1 km, 3 km] × [14 km, 16 km] is considered. The Poisson birth RFS $\Gamma_k$ has intensity $\gamma_k(x) = 0.1\,\mathcal{N}(x; m_r, P_r)$ with $m_r = [2000, 50, 14816, -50]^T$ and $P_r = \mathrm{diag}([100, 10, 100, 10]^T)$, and the other parameters are set to be the same as in [10]. The importance densities used are $p_k = f_{k|k-1}$, $q_k = \mathcal{N}(\cdot;\,\bar{x}, Q)$ and $g'_k = g_k(z|x)$. Additionally, the ant-based parameters are set as follows: $N_{iteration} = 500$, $C_1 = 0.5$, $C_2 = 100$, $C_3 = 1.0$, $\rho = 0.2$, $\varepsilon = 10^{-20}$.
(Fig. 1 panels: true target tracks and measurements, plotted as x (m) and y (m) versus time step and in the x–y plane. Fig. 2 panels: position estimates versus time for the two methods. Fig. 3 panels: estimated number of targets and OSPA distance (in m) versus time step.)
Fig. 3. Target number estimate and OSPA distance of proposed method and SMCPHD
Figure 1 shows the true target tracks in the clutter environment. Figure 2 shows the position estimates based on the proposed method and SMCPHD (500 particles). Figure 3 shows the target number estimates and OSPA distances of the proposed method and SMCPHD. From Figures 2 and 3, it can be observed that the performance of the proposed method is close to that of SMCPHD in this kind of scenario, while our method is simpler than the SMCPHD method due to its approximate representation.
5 Conclusions
A new approximate estimation method based on the ACO algorithm for the PHD filter has been proposed. The method is composed of four phases, and its key idea is that an ACO-based extremum search deals with the approximate recursive function. Simulations show that the proposed method is close to SMCPHD according to the OSPA distance metric, while being simpler than SMCPHD. Future work will focus on the estimation accuracy of the proposed method and on extending it to maneuvering target tracking.
References
1. Lee, M.S., Kim, Y.H.: New Data Association Method for Automotive Radar Tracking. IEE
Proc.-Radar Sonar Navig. 148(5), 297–301 (2001)
2. Li, X.R., Bar-Shalom, Y.: Tracking in Clutter with Nearest Neighbor Filters: Analysis and
Performance. IEEE Trans. On Aerospace and Electronic Systems 32(3), 995–1010 (1996)
3. Li, X.R.: Tracking in Clutter with Strongest Neighbor Measurements –Part I: Theoretical
Analysis. IEEE Trans. On Automatic Control 43(11), 1560–1578 (1998)
4. Fortmann, T., Bar-Shalom, Y., Scheffe, M.: Sonar Tracking of Multiple Targets Using Joint
Probabilistic Data Association. IEEE Journal of Oceanic Engineering, OE 8, 173–183 (1983)
5. Blackman, S.S.: Multiple Hypothesis Tracking for Multiple Target Tracking. IEEE A&E
Systems Magazine 19(1), 5–18 (2004)
6. Mahler, R.: Multi-target Bayes Filtering via First-order Multi-target Moments. IEEE
Trans. AES 39(4), 1152–1178 (2003)
7. Vo, B., Singh, S., Doucet, A.: Sequential Monte Carlo Implementation of the PHD Filter
for Multi-target Tracking. In: Proc. Int’l Conf. on Information Fusion, Cairns, Australia,
pp. 792–799 (2003)
8. Vo, B., Ma, W.K.: The Gaussian Mixture Probability Hypothesis Density Filter. IEEE
Trans. Signal Processing 54(11), 4091–4104 (2006)
9. Nolle, L.: On a Novel ACO-Estimator and its Application to the Target Motion Analysis
problem. Knowledge-Based Systems 21(3), 225–231 (2008)
10. Xu, B.L., Vo, B.: Ant Clustering PHD Filter for Multiple Target Tracking. Applied Soft
Computing 11(1), 1074–1086 (2011)
11. Xu, B.L., Chen, Q.L., Zhu, J.H., Wang, Z.Q.: Ant Estimator with Application to Target
Tracking. Signal Processing 90(5), 1496–1509 (2010)
12. Dorigo, M., Maniezzo, V., Colorni, A.: Positive Feedback as a Search Strategy. Technical Report 91-016, Dipartimento di Elettronica, Politecnico di Milano, Milan, Italy (1991)
13. Pang, C.Y., Li, X.: Applying Ant Colony Optimization to Search All Extreme Points of
Function. In: 5th IEEE Conf. on industrial Electronics and Applications, pp. 1517–1521
(2009)
A Hybrid Algorithm Based on Fish School
Search and Particle Swarm Optimization for
Dynamic Problems
1 Introduction
The optimal solutions of many real-world problems may vary over time. For example, the optimal routes in a computer network can change dynamically due to node failures or unavailable links. Therefore, optimization algorithms to solve real-world problems should present the capability to deal with dynamic environments, in which the optima can change along the time.
Many bio-inspired optimization algorithms have been proposed in the last two decades. Among them are the swarm intelligence algorithms, which were conceived based on collective behaviors. In general, swarm algorithms are inspired by groups of animals, such as flocks of birds, schools of fish, hives of bees, colonies of ants, etc. Although a lot of swarm-based algorithms have already been proposed, only a few were designed to tackle dynamic problems.
One of the most used swarm intelligence algorithms is Particle Swarm Optimization (PSO). Despite its fast convergence capability, the vanilla version of PSO cannot tackle dynamic optimization problems. This occurs because the entire swarm often increases the exploitation around a good region of the search space, reducing the overall diversity of the population. However, some variations of PSO have been created in order to increase the capacity to escape from regions of the search space where the optimum is not located anymore [1,2,3].
On the other hand, another swarm intelligence algorithm proposed in 2008, the Fish School Search algorithm (FSS) [4,5,6], presents a very interesting feature that can be very useful for dynamic environments: an operator, called the collective-volitive movement, which controls the granularity of the search and can expand or contract the school.
2 Background
2.1 PSO (Particle Swarm Optimization)
Particle Swarm Optimization is a population-based optimization algorithm inspired by the behavior of flocks of birds. It was first introduced by Kennedy and Eberhart [7] and has been widely applied to solve optimization problems. The standard approach is composed of a swarm of particles, where each one has a position $\vec{x}_i$ within the search space, and each position represents a solution for the problem. The particles fly through the search space of the problem searching for the best solution, according to the current velocity $\vec{v}_i$, the best position found by the particle itself ($\vec{P}_{best_i}$) and the best position found by the entire swarm during the search so far ($\vec{G}_{best}$).
According to the approach proposed by Shi and Eberhart [8] (also called the inertia PSO), the velocity of a particle i is evaluated at each iteration of the algorithm by using the following equation:

$$\vec{v}_i(t+1) = w\vec{v}_i(t) + r_1 c_1\big[\vec{P}_{best_i} - \vec{x}_i(t)\big] + r_2 c_2\big[\vec{G}_{best} - \vec{x}_i(t)\big], \qquad (1)$$
where r1 and r2 are numbers randomly generated in the interval [0, 1]. The
inertia weight (w) controls the influence of the previous velocity and balances
the exploration-exploitation behavior along the process. It generally decreases
from 0.9 to 0.4 during the algorithm execution. c1 and c2 are called the cognitive and social acceleration constants, respectively, and weight the influence of the memory of the particle and the information acquired from the neighborhood.
The position of each particle is updated based on the velocity of the particle,
according to the following equation:
$$\vec{x}_i(t+1) = \vec{x}_i(t) + \vec{v}_i(t+1). \qquad (2)$$
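For concreteness, a minimal numpy sketch of the inertia-PSO update of (1)–(2) for a whole swarm at once; the names are hypothetical.

```python
import numpy as np

def pso_step(x, v, pbest, gbest, w, c1=1.494, c2=1.494, rng=None):
    """Inertia-PSO update of (1)-(2) for a swarm stored as an (n, d) array."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(x.shape)
    r2 = rng.random(x.shape)
    v = w * v + r1 * c1 * (pbest - x) + r2 * c2 * (gbest - x)
    return x + v, v
```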
Since the standard PSO cannot tackle dynamic problems, due to its low capacity to increase the diversity after the entire swarm has converged to a single region of the search space, many efforts to overcome this weakness have been made. The simplest idea is to restart the particles every time the search space changes. However, in this case all the previous information obtained from the problem during the search process is lost.
An interesting approach introduced by Blackwell and Bentley [1] is the Charged PSO, which uses the idea of electrostatic charges. Some particles are charged (they repel each other) and some others are neutral. In general, the neutral particles tend to exploit towards a single sub-region of the search space, whereas the charged particles never converge to a unique spot; instead, the charged particles are constantly exploring in order to maintain diversity.
In order to consider the effect of the charged particles, the velocity equation receives a fourth term, as shown in equation (3). This term is defined as the acceleration of the particle i ($\vec{a}_i$) and can be seen in equation (4).

$$\vec{v}_i(t+1) = w\vec{v}_i(t) + r_1 c_1\big[\vec{P}_{best_i} - \vec{x}_i(t)\big] + r_2 c_2\big[\vec{G}_{best} - \vec{x}_i(t)\big] + \vec{a}_i(t). \qquad (3)$$

$$\vec{a}_i(t) = \begin{cases} \displaystyle\sum_{j\neq i} \frac{Q_i Q_j}{\|\vec{r}_{ij}(t)\|^3}\,\vec{r}_{ij}(t), & \text{if } R_c \le \|\vec{r}_{ij}(t)\| \le R_p, \\[1ex] 0, & \text{otherwise}, \end{cases} \qquad (4)$$

where $\vec{r}_{ij}(t) = \vec{x}_i(t) - \vec{x}_j(t)$, $Q_i$ is the charge magnitude of the particle i, $R_c$ is the core radius and $R_p$ is the perception limit of the particle. Neutral particles have charge value equal to zero, i.e. $Q_i = 0$.
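A minimal numpy sketch of the acceleration term of (4); the O(n²) double loop mirrors the pairwise sum, and all names are hypothetical.

```python
import numpy as np

def charged_acceleration(X, Q, Rc, Rp):
    """Acceleration term of (4) for particle positions X (n, d) and
    charges Q (n,); neutral particles (Q[i] = 0) contribute nothing."""
    a = np.zeros_like(X)
    for i in range(len(X)):
        for j in range(len(X)):
            if i == j:
                continue
            r_ij = X[i] - X[j]
            dist = np.linalg.norm(r_ij)
            if Rc <= dist <= Rp:           # only within the perception shell
                a[i] += Q[i] * Q[j] * r_ij / dist ** 3
    return a
```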
The Fish School Search (FSS) is an optimization algorithm based on the gregarious behavior of oceanic fish. It was first proposed by Bastos-Filho et al. in 2008 [4]. In the FSS, each fish represents a solution for the problem. The success of a fish during the search process is indicated by its weight. The FSS has four operators, which are executed for each fish of the school at each iteration: (i) individual movement, which is responsible for the local search (with step size $step_{ind}$); (ii) feeding, which updates the fish weights, indicating the degree of success or failure during
the search process so far; (iii) collective-instinctive movement, which makes all fish move toward a resultant direction; and (iv) collective-volitive movement, which controls the granularity of the search. In this paper, as we are dealing with dynamic environments, only the feeding and collective-volitive movement operators are used to build the proposed hybrid algorithm.
Feeding operator
The feeding operator determines the variation of the fish weight at each iteration.
One should notice that a fish can increase or decrease its weight depending,
respectively, on the success or failure during the search process. The weight of
the fish is evaluated according to the following equation:
$$W_i(t+1) = W_i(t) + \frac{\Delta f_i}{\max(|\Delta f|)}, \qquad (5)$$

where $W_i(t)$ is the weight of the fish i, $\Delta f_i$ is the variation of the fitness function between the new position and the current position of the fish, and $\max(|\Delta f|)$ is the absolute value of the greatest fitness variation among all fish. A parameter $w_{scale}$ limits the maximum weight of the fish: the weight of each fish can vary between 1 and $w_{scale}$ and has an initial value equal to $w_{scale}/2$.
We use equation (7) to perform the fish school expansion (sign +) or contraction (sign −):

$$\vec{x}_i(t+1) = \vec{x}_i(t) \pm step_{vol}\,r_1\,\frac{\vec{x}_i(t) - \vec{B}(t)}{d(\vec{x}_i(t), \vec{B}(t))}, \qquad (7)$$

where $r_1$ is a number randomly generated in the interval [0, 1] and $d(\vec{x}_i, \vec{B})$ evaluates the Euclidean distance between the particle i and the barycenter $\vec{B}$. $step_{vol}$ is called the volitive step and controls the step size of the fish. The $step_{vol}$ is bounded by two parameters ($step_{vol\ min}$ and $step_{vol\ max}$) and decreases linearly from $step_{vol\ max}$ to $step_{vol\ min}$ along the algorithm iterations. This helps the algorithm to initialize with an exploration behavior and change dynamically to an exploitation behavior.
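A minimal numpy sketch of this operator per equation (7), assuming the change in the school's total weight is known; names are hypothetical.

```python
import numpy as np

def volitive_move(X, W, step_vol, school_got_heavier, rng=None):
    """Collective-volitive movement of (7): contract towards the weighted
    barycenter when the school gained weight, dilate away otherwise."""
    rng = np.random.default_rng() if rng is None else rng
    b = (W[:, None] * X).sum(axis=0) / W.sum()        # weighted barycenter
    d = np.linalg.norm(X - b, axis=1, keepdims=True)
    d = np.where(d == 0.0, 1e-12, d)                  # guard against zero distance
    sign = -1.0 if school_got_heavier else 1.0        # - contracts, + expands
    return X + sign * step_vol * rng.random((len(X), 1)) * (X - b) / d
```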
3 Volitive PSO
This section presents the proposed algorithm, called Volitive PSO, which is a hybridization of the FSS and PSO algorithms. Our proposal is to include two FSS operators in the inertia PSO: the feeding and the collective-volitive movement. In the Volitive PSO, each particle becomes a weighted particle, where the weight is used to drive the collective-volitive movement, resulting in expansion or contraction of the school. In our proposal, $step_{vol}$ does not decrease linearly; it decreases according to equation (8), where the volitive step decay percentage ($decay_{vol}$) must be in the interval [0, 100]:

$$step_{vol}(t+1) = step_{vol}(t)\,\frac{100 - decay_{vol}}{100}. \qquad (8)$$

The $step_{vol}$ is reinitialized to $step_{vol\ max}$ when a change in the environment is detected. We use a sentry particle [9] to detect these changes: the fitness of the sentry particle is evaluated at the end of each iteration and again at the beginning of the next iteration. Algorithm 1.1 shows the Volitive PSO pseudocode.
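A minimal sketch of the decay rule (8) together with the sentry-based reset; 'fitness' and 'sentry' are hypothetical stand-ins for the objective function and the fixed sentry position.

```python
def volitive_step_update(step_vol, decay_vol, step_vol_max,
                         fitness, sentry, f_sentry_prev):
    """Geometric step decay of (8), reset to step_vol_max when the sentry
    particle's fitness changes between iterations (environment change)."""
    f_now = fitness(sentry)
    if f_now != f_sentry_prev:                       # environment changed
        return step_vol_max, f_now
    return step_vol * (100.0 - decay_vol) / 100.0, f_now
```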
4 Simulation Setup
In this section we present the benchmark function, the metric to measure the qual-
ity of the algorithms and the values for the parameters used in the simulations.
We used the DF1 benchmark function proposed by Morrison and De Jong [10] in our simulations. DF1 is composed of a set of random peaks with different heights and slopes. The number of peaks, their heights, slopes, and positions within the search space are adjustable. The function for an N-dimensional space is defined according to equation (9):

$$f(\vec{x}) = \max_{i=1,2,\dots,P}\big[H_i - S_i\,\|\vec{x} - \vec{x}_i\|\big], \qquad (9)$$
The mean fitness metric was introduced by Morrison [11], who argued that a representative performance metric to measure the quality of an algorithm in a dynamic environment should reflect the performance of the algorithm across the entire range of environment dynamics. The mean fitness is the average over all previous fitness values, as defined below:

$$F_{mean}(T) = \frac{\sum_{t=1}^{T} F_{best}(t)}{T}, \qquad (11)$$

where T is the total number of iterations and $F_{best}(t)$ is the fitness of the best particle after iteration t. The advantage of the mean fitness is that it represents the entire algorithm performance history.
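A short numpy sketch of (11), computing the running mean fitness for every T at once; the input name is hypothetical.

```python
import numpy as np

def mean_fitness(best_per_iteration):
    """Running mean fitness of (11): entry T-1 holds F_mean(T)."""
    best = np.asarray(best_per_iteration, dtype=float)
    return np.cumsum(best) / np.arange(1, best.size + 1)
```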
We also used the collective mean fitness [11], that is simply the average value
of the mean fitness at the last iteration over a predefined number of trials.
All results presented in this paper are the average values after 30 trials. We
used 10,000 iterations for all algorithms. We performed the experiments in two
situations: (i) 10 dimensions and 10 peaks and (ii) 30 dimensions and 30 peaks.
In this paper, only the peak positions are varied along the iterations. The heights and slopes of the peaks were initialized randomly within the predefined intervals. The parameters used for the DF1 function are $H_{base} = 40$, $H_{range} = 20$, $H_{scale} = 0.5$, $r_h = 3.2$, $S_{base} = 1$, $S_{range} = 7$, $S_{scale} = 0.5$, $r_s = 1.2$, $x_{base\,id} = -10$, $x_{range\,id} = 20$, $x_{scale\,id} = 0.7$, $r_{x_d} = 3.2$.
For all PSO algorithms, we used 50 particles, local topology, c1 and c2 equal
to 1.494 [12] and w decreasing linearly from 0.9 to 0.4 along 100 iterations. We
set up w = 0.9 every time an environment change is detected. We chose the
local topology since it helps to avoid premature convergence to a local optimum,
which is good for optimization in dynamic environments. The Charged PSO was
tested empirically with 30%, 50% and 70% of charged particles, and for Q = 4,
Q = 8, Q = 12 and Q = 16. In both scenarios, the best results were achieved
for 30% of charged particles and Q = 12. Hence, these values were used. For
the FSS, we used 50 fish, Wscale = 500, initial and final individual step equal to
2% and 0.01%, and initial and final volitive step equal to 40% and 0.1%. stepind
and stepvol decreases linearly along 100 iterations and are reinitialized when a
change in environment occurs. For the Volitive PSO, we used wscale = 500, and
stepvol min = 0.01%.
5 Results
5.1 Analysis of the Parameters
This section presents an analysis of the influence of the parameters $decay_{vol}$ and $step_{vol\ max}$ on the performance of the Volitive PSO. As preliminary results showed that the algorithm is more sensitive to the $decay_{vol}$ parameter and that high values of $decay_{vol}$ do not yield good performance, we tested the following $decay_{vol}$ values: 0%, 10% and 25%. For each $decay_{vol}$ value, we varied the $step_{vol\ max}$ value; the box plots of the mean fitness at the last iteration are shown in Figure 1.
For case 1 (10 dimensions and 10 peaks), the average mean fitness values for the different $step_{vol\ max}$ settings are not very different (as shown in Figures 1(a), 1(c) and 1(e)); however, slightly better results can be observed for $decay_{vol} = 10\%$. For case 2 (30 dimensions and 30 peaks), the best results were achieved for $decay_{vol}$ equal to 0%, which indicates that it is better not to diminish the $step_{vol}$ for spaces with higher dimensionality. The best results for case 2 were achieved when $step_{vol\ max} = 40\%$ and $decay_{vol} = 0\%$; hence, we used these values for the comparison presented in the next subsection.
(a) decayvol = 0%, 10d and 10 peaks. (b) decayvol = 0%, 30d and 30 peaks.
(c) decayvol = 10%, 10d and 10 peaks. (d) decayvol = 10%, 30d and 30 peaks.
(e) decayvol = 25%, 10d and 10 peaks. (f) decayvol = 25%, 30d and 30 peaks.
Fig. 1. Analysis of the parameters decayvol and stepvol max of the Volitive PSO
algorithm
Table 1. Collective Mean Fitness – Average (standard deviation) after 10,000 iterations
Table 1 shows the collective mean fitness (and standard deviation in parentheses) after 10,000 iterations. One can observe that the Volitive PSO also achieved a lower standard deviation in both cases.
6 Conclusion
In this paper we proposed a hybrid FSS-PSO algorithm for dynamic optimization. We showed that the collective-volitive movement operator applied to the PSO can help to maintain diversity when the search space varies over time, without reducing the exploitation capability. Some preliminary results showed that the volitive step must not decay quickly, which indicates the important role of the FSS operator in generating diversity after environmental changes. Further research includes a deeper analysis of the Volitive PSO and more tests varying the peak heights and slopes. We also intend to analyze the dynamics of the swarm within the search space.
Acknowledgments
The authors acknowledge the financial support from CAPES, CNPq and Uni-
versity of Pernambuco for scholarships, support and travel grants.
References
1. Blackwell, T.M., Bentley, P.J.: Dynamic Search with Charged Swarms. In: Proceed-
ings of the Genetic and Evolutionary Computation Conference, pp. 19–26 (2002)
2. Rakitianskaia, A., Engelbrecht, A.P.: Cooperative charged particle swarm opti-
miser. In: Congress on Evolutionary Computation, CEC 2008, pp. 933–939 (June
2008)
3. Nickabadi, A., Ebadzadeh, M.M., Safabakhsh, R.: Evaluating the performance of
DNPSO in dynamic environments. In: IEEE International Conference on Systems,
Man and Cybernetics, pp. 2640–2645 (October 2008)
4. Bastos-Filho, C.J.A., Neto, F.B.L., Lins, A.J.C.C., Nascimento, A.I.S., Lima, M.P.:
A novel search algorithm based on fish school behavior. In: IEEE International
Conference on Systems, Man and Cybernetics, pp. 2646–2651. IEEE, Los Alamitos
(October 2009)
5. Bastos-Filho, C.J.A., Neto, F.B.L., Sousa, M.F.C., Pontes, M.R.: On the Influence
of the Swimming Operators in the Fish School Search Algorithm. In: SMC, pp.
5012–5017 (October 2009)
6. Bastos-Filho, C.J.A., de Lima Neto, F.B., Lins, A.J.C.C., Nascimento, A.I.S., Lima,
M.P.: Fish school search. In: Chiong, R. (ed.) Nature-Inspired Algorithms for Op-
timisation. SCI, vol. 193, pp. 261–277. Springer, Heidelberg (2009)
7. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: Proceedings of the IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948 (1995)
8. Shi, Y., Eberhart, R.: A modified particle swarm optimizer. In: The 1998 IEEE
International Conference on Evolutionary Computation Proceedings, IEEE World
Congress on Computational Intelligence, pp. 69–73 (1998)
9. Carlisle, A., Dozier, G.: Applying the particle swarm optimizer to non-stationary
environments. Phd thesis, Auburn University, Auburn, AL (2002)
10. Morrison, R.W., Jong, K.A.D.: A test problem generator for non-stationary envi-
ronments. In: Proc. of the 1999 Congr. on Evol. Comput., pp. 2047–2053 (1999)
11. Morrison, R.W.: Performance Measurement in Dynamic Environments. In:
GECCO Workshop on Evolutionary Algorithms for Dynamic Optimization Prob-
lems, pp. 5–8 (2003)
12. Eberhart, R.C., Shi, Y.: Particle Swarm Optimization: Developments, Applications
and Resources. In: Proceedings of the IEEE Congress on Evolutionary Computa-
tion, CEC 2001 (2001)
Feeding the Fish – Weight Update Strategies
for the Fish School Search Algorithm
1 Introduction
The Fish School Search (FSS) algorithm [1, 2, 3] is a recently developed swarm
intelligence algorithm based on the social behavior of schools of fish. By living
in swarms, the fish improve survivability of the whole group due to mutual pro-
tection against enemies. Moreover, the fish perform collective tasks in order to
achieve synergy (e.g. finding locations with lots of food). Comparable to real
fish that swim in the aquarium in order to find food, the artificial fish search
(swim) the search space (aquarium) for the best candidate solutions (locations
with most food ). The location of each fish represents a possible solution to the
problem – comparable to locations of particles in Particle Swarm Optimization
(PSO, [4]). The individual success of a fish is measured by its weight – conse-
quently, promising areas can be inferred from regions where bigger ensembles of
fish are located. As for other heuristic search algorithms we consider the prob-
lem of finding a "good" (ideally the global) solution of an optimization problem with bound constraints in the form $\min_{x\in\Omega} f(x)$, where $f: \mathbb{R}^N \to \mathbb{R}$ is a nonlinear objective function and $\Omega$ is the feasible region. Since we do not assume that f is convex, f may possess many local minima. Solving such tasks for high-dimensional real-world problems may be expensive in terms of runtime if exact algorithms are used. Various nature-inspired algorithms have shown to be able
to perform well in the presence of these difficulties. Even though these algorithms are only meta-heuristics, i.e., there is no proof that they reach the global optimum, such techniques often achieve a reasonably good solution for the given task at hand in a reasonable amount of time.
Related work. The FSS algorithm was introduced to the scientific community in 2008 [1]. This paper was extended to a book chapter [2], where FSS was evaluated and compared to different variants of PSO. Results indicate that FSS is able to achieve better results than PSO on several benchmark functions, especially on multimodal functions with several local minima. In another study [3], the same authors analyzed the importance of the swimming operators of FSS and showed that all operators have strong influence on the results. Although for some benchmarks the individual operator alone sometimes produced better results than all operators together, the results using only the individual operator are highly sensitive to the initial and also final values of $step_{ind}$ and $step_{vol}$. Moreover, it was shown that a rather large initial value for $step_{ind}$ ($step_{ind\ initial} = 10\%$) generally achieved the best results. In a very recent study, FSS has been used successfully to initialize the factors of the non-negative matrix factorization (NMF) [5].
In this work we aim at investigating the influence of newly developed weight
update strategies for FSS as well as the influence of a non-linear decrease of
the step-size parameters stepind and stepvol . We introduce and compare weight
update strategies based on a linear decrease of weights, as well as a fitness based
weight decrease strategy. Moreover, we introduce a combination of (i) this fit-
ness based weight decrease strategy, (ii) the non-linear decrease of the step-size
parameters, and (iii) a newly introduced dilation multiplier which breaks the
symmetry between contraction and dilation but can be useful in some situa-
tions to escape from local minima. Experimental evaluation performed on five
benchmark functions shows that especially the non-linear decrease of the step-
size parameters is an effective and efficient way to significantly speed up the
convergence of FSS and also to achieve better fitness per iteration results.
$$\Delta x = n - x. \qquad (3)$$

If no individual movement occurs, $\Delta f = 0$ and $\Delta x = 0$. The parameter $step_{ind}$ decreases linearly during the iterations:

$$step_{ind}(t+1) = step_{ind}(t) - \frac{step_{ind\ initial} - step_{ind\ final}}{\text{number of iterations}}. \qquad (4)$$
B. Feeding: Fish can increase their weight depending on the success of the individual movement according to

$$w_i(t+1) = w_i(t) + \frac{\Delta f(i)}{\max(\Delta f)}, \qquad (5)$$

where $w_i(t)$ is the weight of fish i, $\Delta f(i)$ is the difference of the fitness at the current and the new location, and $\max(\Delta f)$ is the maximum $\Delta f$ over all fish. An additional parameter $w_{scale}$ limits the weight of a fish ($1 \le w_i \le w_{scale}$).
C. Collective instinctive movement: After all fish have moved individually, a weighted average of the individual movements, based on the instantaneous success of all fish, is computed. All fish that successfully performed individual movements influence the resulting direction of the school movement (i.e., only fish whose $\Delta x \neq 0$ influence the direction). The resulting direction $m(t)$ is evaluated by

$$m(t) = \frac{\sum_{i=1}^{N}\Delta x_i\,\Delta f_i}{\sum_{i=1}^{N}\Delta f_i}. \qquad (6)$$

Then, all fish of the school update their positions according to $m(t)$:

$$x(t+1) = x(t) + m(t). \qquad (7)$$
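A minimal numpy sketch of (6)–(7), assuming the displacement and fitness-variation arrays produced by the individual movement; names are hypothetical.

```python
import numpy as np

def instinctive_move(X, dX, df):
    """Collective-instinctive movement of (6)-(7): shift the whole school
    along the fitness-weighted mean of the successful displacements."""
    mask = df != 0                     # only fish that actually moved count
    if not mask.any():
        return X                       # no successful moves this iteration
    m = (dX[mask] * df[mask, None]).sum(axis=0) / df[mask].sum()
    return X + m
```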
D. Collective volitive movement: When the school is successful, it contracts towards its barycenter; otherwise, the radius of the fish school is dilated in order to cover a bigger area of the search space. First, the barycenter b (center of mass/gravity) needs to be calculated:

$$b(t) = \frac{\sum_{i=1}^{N} x_i\,w_i(t)}{\sum_{i=1}^{N} w_i(t)}. \qquad (8)$$

When the total weight of the school increased in the current iteration, all fish must update their location according to

$$x(t+1) = x(t) - step_{vol}\,\mathrm{rand}_u(0,1)\,\frac{x(t) - b(t)}{\mathrm{distance}(x(t), b(t))}, \qquad (9)$$

and when the total weight decreased in the current iteration the update is

$$x(t+1) = x(t) + step_{vol}\,\mathrm{rand}_u(0,1)\,\frac{x(t) - b(t)}{\mathrm{distance}(x(t), b(t))}. \qquad (10)$$
(Figure: behavior of $step_{ind}$ as a percentage of the search space amplitude, decaying from 10% to 0.001% over 5000 iterations, for the linear and non-linear variants.)
Fig. 1. Linear and non-linear decrease of $step_{ind}$ and $step_{vol}$
• S1 (weight update) – linear decrease of weights: Here, the weights of all fish are decreased linearly in each iteration by a pre-defined factor $\Delta_{lin}$, such that after the weight update in Eqn. (5) the weight of all fish is reduced by $w_i = w_i - \Delta_{lin}$, and all weights smaller than 1 are set to 1 (see the sketch after this list).
• S2 (weight update) – fitness-based decrease of weights: Here, not all fish have their weights diminished by the same factor; instead, fish in poor regions lose weight more quickly. If f(x) is a vector containing the fitness values of all fish at their current location, the weight of the fish will be decreased by
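A minimal numpy sketch combining the feeding step of Eqn. (5) with the S1 linear decrease; S2's exact scaling formula is truncated above and is therefore omitted, and all names are hypothetical.

```python
import numpy as np

def feed_and_decay_s1(W, df, delta_lin, w_scale):
    """Feeding per (5) followed by the S1 linear weight decrease;
    weights are clipped back into the valid range [1, w_scale]."""
    W = W + df / np.abs(df).max()      # feeding: normalise by largest variation
    W = W - delta_lin                  # S1: uniform linear decrease
    return np.clip(W, 1.0, w_scale)    # weights below 1 are set to 1
```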
4 Experimental Setup
Table 1 shows the benchmark functions used for minimization in this paper, as
well as the search space and the optimum point for each function. The initializa-
tion subspace was chosen to be in the interval [up/2, up], where up is the upper
Function | Search space | Optimum
$F_{Rastrigin}(x) = 10D + \sum_{i=1}^{D} \left[ x_i^2 - 10\cos(2\pi x_i) \right]$ | $-5.12 \le x_i \le 5.12$ | $0.0^D$
$F_{Rosenbrock}(x) = \sum_{i=1}^{D-1} \left[ 100(x_{i+1} - x_i^2)^2 + (1 - x_i)^2 \right]$ | $-30 \le x_i \le 30$ | $1.0^D$
$F_{Sphere}(x) = \sum_{i=1}^{D} x_i^2$ | $-100 \le x_i \le 100$ | $0.0^D$
limit of the search space for each function (similar to [3]). We used the same
settings as in [3]: 5 000 iterations, 30 dimensions, and 30 fish, leading to 300 000
function evaluations. We performed 15 trials per function. For all experiments,
stepind initial was set to 10% of up, and stepind final was set to 0.001% of up.
5 Evaluation
In this section we evaluate the update strategies introduced in Section 3. First,
we discuss each strategy separately and focus especially on fitness per iteration
aspects, i.e. how many iterations are needed in order to achieve a given fitness.
Later we compare the best results achieved by all update strategies to each other
and discuss the increase of computational cost caused by the update strategies.
In all figures basic FSS refers to the basic FSS algorithm as presented in [3].
S1 - linear decrease of weights. The results for strategy S1 are shown in
Fig. 2 for four different values of Δlin (abbreviated as "Δ lin 0.0XXX" in the
figure) ranging from 0.0125 to 0.075. Subplots (B) to (F) show the fitness per
iteration for the five benchmark functions, and Subplot (A) shows the average
(mean) weight of all fish per iteration, which decreases with increasing Δlin.
Obviously, in most cases S1 is not able to improve the results, but for the
Rastrigin function the final result after 5 000 iterations can be clearly improved
when Δlin is set to 0.075 (and partly also for 0.05). However, this update strategy
is generally neither able to improve the final results after 5 000 iterations nor to
achieve better results after a given number of iterations.
S2 - fitness-based decrease of weights. The results for strategy S2 are
shown in Fig. 3. Subplot (A) again shows the average (mean) weight of all fish
per iteration; the average weight is very similar to Subplot (A) of Fig. 2. The
parameter cfit that scales the decrease of the weights is abbreviated as "c fit X".
The results for the Rastrigin function are even better than for strategy S1,
[Fig. 2. Strategy S1: average weight per iteration (A) and fitness per iteration for basic FSS vs. Δ lin 0.0250/0.0500/0.0750; panels include (C) Griewank, (D) Rastrigin, (E) Rosenbrock, (F) Sphere.]

[Fig. 3. Strategy S2: average weight per iteration (A) and fitness per iteration for basic FSS vs. c fit 3/4/5; panels include (C) Griewank, (D) Rastrigin, (E) Rosenbrock, (F) Sphere.]

[Fig. 4. Strategy S3: behavior of stepind (A, from 10% down to 0.001%) and fitness per iteration for basic FSS vs. "interpolated" and "non-linear"; panels include (C) Griewank, (D) Rastrigin, (E) Rosenbrock, (F) Sphere.]

[Fig. 5. Strategy S4: fitness per iteration for basic FSS, "non-linear", and dilation multipliers c dil 4/5; panels include (C) Griewank, (D) Rastrigin, (E) Rosenbrock, (F) Sphere.]
Table 2. Comparison of mean value and standard deviation (in small font under the mean
value) over 15 trials after 5 000 iterations for the five benchmark functions. The best
results are highlighted in bold. Last row: computational cost.
and also the results for the Rosenbrock function could be improved slightly. As
Table 2 indicates, this strategy achieves the best final result of all strategies for
the Rastrigin function after 5 000 iterations. For the other benchmark functions,
this strategy performs equally well or worse than basic FSS.
S3 - non-linear decrease of stepind and stepvol. The S3 results are shown in Fig. 4,
where stepnonlinear(t) is abbreviated as "non-linear". "Interpolated" shows the
results using an interpolation of "basic FSS" and "non-linear", i.e. stepinterpol(t)
= stepind(t) − [stepind(t) − stepnonlinear(t)]/2. Subplot (A) shows the behavior
of stepind and should be compared to Fig. 1. The results indicate that this non-
linear decrease of the step-size parameters significantly improves the fitness per
iteration for all five benchmark functions. Generally, "non-linear" achieves the
best results, followed by "interpolated". For some functions, such as (D) or (E),
this strategy needs only about half as many iterations as basic FSS to achieve
almost the same results as basic FSS reaches after 5 000 iterations.
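Since only the endpoints of the non-linear schedule (10% and 0.001% of up, see Fig. 1) are given in this excerpt, the sketch below uses a geometric decay through those endpoints as one plausible reading of the "non-linear" curve, alongside the linear schedule of Eqn. (4) for comparison; the function names are ours:

import numpy as np

def step_linear(t, T, s0, sT):
    # Linear decrease from s0 to sT over T iterations, cf. Eqn. (4).
    return s0 - t * (s0 - sT) / T

def step_geometric(t, T, s0, sT):
    # Geometric (exponential) decay through the same endpoints; one
    # plausible reading of the non-linear curve sketched in Fig. 1.
    return s0 * (sT / s0) ** (t / T)

T, up = 5000, 100.0                  # e.g. the Sphere function: up = 100
s0, sT = 0.10 * up, 0.00001 * up     # 10% and 0.001% of up
for t in (0, 2500, 5000):
    print(t, step_linear(t, T, s0, sT), step_geometric(t, T, s0, sT))

The geometric curve spends far fewer iterations at large step sizes, which is consistent with the faster early convergence observed for "non-linear" in Fig. 4.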
S4 - combination of S2, S3 and the dilation multiplier. The results for
strategy S4 are shown in Fig. 5 and are compared to basic FSS and "non-linear"
from strategy S3. The dilation multiplier cdil is abbreviated as "c dil X". Since
the weight of all fish is reset to 1 whenever a dilation occurs, the average (mean)
weight per iteration is relatively low (see Subplot (A)). Generally, this strategy
achieves results similar to strategy S3, but it clearly improves the results for the
Rastrigin function, achieving a better final result after 5 000 iterations and also
better fitness per iteration for "c dil 5".
Comparison of final results. Table 2 shows a comparison of the mean values
and the standard deviations after 5 000 iterations. As can be seen, the results for
all five benchmark functions could be improved. Overall, strategy S4 achieves
the best results, followed by S3. S1 and S2 are better than or equal to basic FSS
for 4 out of 5 benchmark functions.
Computational cost. The last row of Table 2 shows the increase in computa-
tional cost caused by the additional computations of the update steps. Example:
the runtime for S1 is 1.0017 times as long as the runtime for basic FSS. This
indicates that the increase in runtime is only marginal and further motivates the
utilization of the presented update steps.
6 Conclusion
In this paper we presented new update strategies for the Fish School Search
algorithm. We investigated the influence of newly developed weight update strategies
as well as the influence of a non-linear decrease of the step-size parameters stepind
and stepvol. Results indicate that strategies S3 and S4 are able to significantly
improve the fitness per iteration for all benchmark functions and also achieve
better final results after 5 000 iterations when compared to the basic implementation
of FSS. The results motivate further research on update strategies for
FSS and on adapting the non-linear decrease of the step-size parameters to
other search heuristics.
References
[1] Bastos Filho, C., Lima Neto, F., Lins, A., Nascimento, A.I.S., Lima, M.: A novel
search algorithm based on fish school behavior. In: IEEE International Conference
on Systems, Man and Cybernetics, SMC 2008, pp. 2646–2651 (2008)
[2] Bastos Filho, C., Lima Neto, F., Lins, A., Nascimento, A.I.S., Lima, M.: Fish
school search: An overview. In: Chiong, R. (ed.) Nature-Inspired Algorithms for
Optimisation. SCI, vol. 193, pp. 261–277. Springer, Heidelberg (2009)
[3] Bastos Filho, C., Lima Neto, F., Sousa, M., Pontes, M., Madeiro, S.: On the in-
fluence of the swimming operators in the fish school search algorithm. In: Int.
Conference on Systems, Man and Cybernetics, pp. 5012–5017 (2009)
[4] Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of IEEE
International Conference on Neural Networks, vol. 4, pp. 1942–1948 (1995)
[5] Janecek, A.G., Tan, Y.: Using population based algorithms for initializing non-
negative matrix factorization. In: ICSI 2011: Second International Conference on
Swarm Intelligence (to appear, 2011)
[6] Goldberg, D.E.: Genetic Algorithms in Search, Optimization and Machine Learning, 1st
edn. Addison-Wesley Longman, Boston (1989)
[7] Storn, R., Price, K.: Differential Evolution - A Simple and Efficient Heuristic
for Global Optimization over Continuous Spaces. Journal of Global Optimiza-
tion 11(4), 341–359 (1997)
[8] Tan, Y., Zhu, Y.: Fireworks algorithm for optimization. In: Tan, Y., Shi, Y., Tan,
K.C. (eds.) ICSI 2010. LNCS, vol. 6145, pp. 355–364. Springer, Heidelberg (2010)
Density as the Segregation Mechanism in Fish School
Search for Multimodal Optimization Problems
1 Introduction
Multimodal Optimization Problems (MMOP) occur in various fields including
geophysics [1], electromagnetism [2], climatology [3] and logistics [4, 5], among
others. Finding more than one optimal solution of an MMOP can be useful for
two main reasons [6, p. 88]: (1) to provide insights into the function's landscape; and (2) to
allow the selection of alternative solutions, e.g. when the dynamic nature of constraints in
the search space makes a previous optimum solution infeasible.
Several methods based on computational models inspired by natural processes have
been proposed to deal with MMOPs. For example, Particle Swarm Optimization (PSO)
[7] is an effective optimization method [8] for which several approaches have been
proposed in order to make it able to capture multiple optimal solutions of MMOPs
[9, 10, 11, 12, 13, 14]. Moreover, new methods for MMOPs have been proposed based
on, for example, a swarm of glowworms [6].
Although several swarm-based methods have been proposed to deal with MMOPs,
there are still two important issues to be addressed. The first one concerns the
fact that the performance of most of the proposed methods depends on manual parameter
adjustment. The second one concerns the reduction in performance when the
dimensionality of the problem increases. In other words, many current methods
present low performance when applied to MMOPs with more than five dimensions, for example.
In this article, we discuss a density-based segregation mechanism, built on top of the
original Fish School Search (FSS) algorithm [15], that enables the simultaneous
capture of multiple optimal solutions of MMOPs. The new proposal allows each fish to
find different food sources (i.e. different optimal solutions). So, at each iteration
of the algorithm, the school can be divided into subgroups. Each created subgroup
corresponds to a possible solution to the multimodal problem. At the end of the
execution, the set of all captured solutions is provided.
The paper is organized as follows: in Section 2, we give a detailed description of
the FSS algorithm. In Section 3, we give a detailed description of the modified,
density-based FSS algorithm. In Section 4, we compare the performance of density
FSS with the performance of other algorithms, namely NichePSO and GSO, on a set
of well-known multimodal benchmark functions. The conclusion is provided and
commented upon in Section 5.
where xi(t) is the current position of the fish in dimension i, xi(t+1) is the newly calculated
position of the fish for dimension i, and rand() is a function that returns numbers
uniformly distributed in a given interval. The stepind is calculated as a percentage
of xmax for all dimensions i. stepind decreases linearly during the iterations via
stepind(t+1) = stepind(t) − (stepind initial − stepind final)/iterations in order to improve
the exploitation ability in later iterations, where iterations is the number of iterations
used in the simulation. The stepind initial and stepind final are the initial and the final
individual movement steps, respectively. Note that stepind initial must be higher than
stepind final in order to allow the gradual shift from exploration to exploitation modes of
operation along the iterations.
of the search space (i.e. aquarium). The amount of food that a fish eats depends on
the improvement in its fitness and the largest improvement in the fitness of the entire
school. The weight of a fish is updated according to Wi(t+1) = Wi(t) + Δfi/max(Δf),
where Wi(t) is the weight of fish i, Δfi is the difference of the fitness at the current and
the new position of fish i, and max(Δf) is a function that returns the maximum difference
of the fitness values among all the fish. One should remember that Δfi = 0 for a fish
that does not perform the individual movement at the current iteration.
$$I(t) = \frac{\sum_{i=1}^{N} \Delta x_i \, \Delta f_i}{\sum_{i=1}^{N} \Delta f_i}, \quad (2)$$
$$x(t+1) = x(t) - step_{vol} \cdot rand(0,1) \cdot \frac{x(t) - B(t)}{distance(x(t), B(t))}, \quad (5)$$

$$x(t+1) = x(t) + step_{vol} \cdot rand(0,1) \cdot \frac{x(t) - B(t)}{distance(x(t), B(t))}, \quad (6)$$
where distance() is a function which returns the Euclidean distance between the
barycenter and the fish's current position, and stepvol is a predetermined step used to
control the displacement from/to the barycenter.
The stepvol must be of the same order of magnitude as the step used in the individual
movement. Since stepvol is multiplied by a factor drawn from the uniform distribution
on the interval [0,1], with expected value 0.5, it is usually set to twice the stepind value.
In (7), Pi must be evaluated for each fish i and it represents the amount of food
fish i will receive after sharing Δfi. Each other fish j will receive $(1/d_{R_{ij}})^{q_{ij}}$ of Pi, as given
in (8). In (7), when i = j, we have $d_{R_{ij}} = 0$ and $q_{ij} = 0$, resulting in $0^0$. For this case,
computationally, we consider $0^0 = 1$. Note that if, for a given fish i, we have $\min d_{ik} = 0$,
we decided to consider $\min d_{ik} = 4.9 \times 10^{-324}$, as this is the lowest possible value for the
numerical precision of the double data type in our computer set-up.

$$C(i, j) = \frac{P_i}{(d_{R_{ij}})^{q_{ij}}} = \frac{\Delta f_i}{(d_{R_{ij}})^{q_{ij}} \sum_{k=1}^{N} \left( \frac{1}{d_{R_{ik}}} \right)^{q_{ik}}}. \quad (8)$$
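As a concrete illustration of (8), the sketch below computes the full matrix C(i, j) given Δf and the matrices of distance terms d_Rij and exponents q_ij; how d_Rij and q_ij are constructed is defined earlier in the paper and is simply taken as input here, and the function name is ours:

import numpy as np

def food_share(df, dR, q, eps=4.9e-324):
    # C[i, j]: food fish j receives from fish i, per Eqn. (8). df holds the
    # fitness gains Δf_i; dR and q are the (N, N) matrices of distance terms
    # d_Rij and exponents q_ij, taken as given here.
    dR = np.where(dR == 0.0, eps, dR)      # min d_ik = 0 is replaced (see text)
    inv_pow = (1.0 / dR) ** q              # (1/d_Rik)^(q_ik)
    denom = inv_pow.sum(axis=1)            # sum over k, for each fish i
    # For i == j the convention 0^0 = 1 holds, since eps^0 evaluates to 1.
    return df[:, None] / ((dR ** q) * denom[:, None])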
According to (7) and (8), each fish j will receive an amount of food that becomes exponentially
smaller according to $(d_{R_{ij}})^{-q_{ij}}$. This equation is based on the one used by Martinetz and
Schulten [16, p. 520] in step (iv) of their proposed algorithm for creating Topology
Representing Networks (TRN). For density FSS, the expression $e^{-k_i/\lambda}$ of [16, p. 520]
was adapted to $(d_{R_{ij}})^{-q_{ij}}$ in order to quantify the amount of food a fish j will receive
because of the successful foraging behavior of another "colleague" fish i. The greater the
value of $q_{ij}$ (i.e. the greater the density of fish around fish i), the smaller the
amount of food available for fish j. This means that crowded areas exert little influence
over other fish.
At the end of each iteration, each fish i receives an amount of food C(j, i)
from every other fish j that successfully found food. The sum of the amounts C(j, i) over all
other fish j corresponds to the total amount of food fish i received in the given iteration.
Then, the weight Wi(t) of fish i at the t-th iteration is updated according to (9).
$$W_i(t+1) = W_i(t) + \sum_{j=1}^{Q} \frac{\Delta f_j}{(d_{R_{ij}})^{q_{ij}} \sum_{k=1}^{N} \left( \frac{1}{d_{R_{jk}}} \right)^{q_{jk}}}, \quad (9)$$
where Q is the number of fish that successfully found food at the t-th iteration. In this
new proposal, differently from real fish in nature, we assume that the weights of the
artificial fish do not decrease along the iterations.
In the algorithm derived here, each fish i has a memory Mi = {Mi1, Mi2, ..., MiN}, where
N is the number of fish in the school and Mij quantifies the influence of fish j over
fish i. Mij depends on the total amount of food fish i received because of the
foraging behavior of fish j (i.e. C(j, i)) along the entire execution of the algorithm.
The bigger C(j, i) is, the greater the influence of fish j over fish i. This exerted
influence manifests itself in terms of how synchronized the behavior of fish i
will be with the foraging behavior of fish j.
After the Feeding Operator is computed, the Memory Operator updates Mij as shown
in (10). In (10), 0 ≤ ρ ≤ 1 is a parameter that controls the influence of one fish over
every other. For example, if the value of ρ is close to 1, then in general, just after a relatively
small number of iterations (e.g. 10 iterations), the memory of each fish may be greatly
changed. That is, fish here learn and forget rather quickly.
$$M_{ij}(t+1) = (1-\rho)\, M_{ij}(t) + \frac{\Delta f_j}{(d_{R_{ij}})^{q_{ij}} \sum_{k=1}^{N} \left( \frac{1}{d_{R_{jk}}} \right)^{q_{jk}}} = (1-\rho)\, M_{ij}(t) + C(j, i). \quad (10)$$
For density-based FSS, the resultant behavior Ii for each fish i is evaluated as shown
in (11). In (11), Ii is a sum of the directions taken by each fish j during the Individual
Movement Operator, weighted by Mij. Note that, contrary to FSS, even if a fish j
does not locate food (i.e. Δxj = 0), it still influences the resultant behavior Ii of the other
fish i. In other words, even if fish j does not locate food, fish i will mimic its behavior
(i.e. remain stationary) according to the memorized value Mij.
$$I_i(t) = \frac{\sum_{j=1}^{N} \Delta x_j \, M_{ij}}{\sum_{k=1}^{N} M_{ik}}. \quad (11)$$
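A compact sketch of the Memory Operator (10) and the resultant behavior (11), assuming C(j, i) has already been computed as in (8); the function and variable names are ours:

import numpy as np

def memory_and_direction(M, C, dx, rho=0.3):
    # Memory Operator, Eqn. (10): old memories decay by (1 - rho) and the
    # fresh food shares C(j, i) are added; C[j, i] is stored at C.T[i, j].
    M = (1.0 - rho) * M + C.T
    # Resultant behavior, Eqn. (11): I_i = sum_j dx_j * M_ij / sum_k M_ik.
    I = (M @ dx) / M.sum(axis=1, keepdims=True)
    return M, I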
In our new approach, at each iteration of the algorithm, the main school is partitioned
into subgroups. A fish i will be in the same subgroup as another fish j if and only if

$$M_{ij} = \max_{1 \le k \le N} M_{ik} \quad \text{or} \quad M_{ji} = \max_{1 \le k \le N} M_{jk}, \quad (12)$$

where N is the number of fish in the main school. Therefore, fish i is in the same
subgroup as fish j if and only if fish j is the fish that exerts the largest influence over
fish i, or fish i is the fish that exerts the largest influence over fish j.
The partition of the main school into subgroups is illustrated in
Algorithm 1. In this procedure, a fish i is chosen randomly from the main school.
After that, all other fish j of the main school that are in the same subgroup as fish
i, according to the definition in (12), are removed from the main school and put in the
subgroup of fish i. Then, for each such fish j, all other fish k that are in the same
subgroup as fish j are also removed from the main school. This selection of fish
belonging to the subgroup of fish i is repeated until all the fish in the subgroup of
fish i have been removed from the main school. Then, another fish i is chosen randomly
from the main school and the procedure is repeated until all the fish have been removed
from the main school; a sketch of this procedure is given below.
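A minimal Python sketch of this partitioning, assuming the memory matrix M from (10) and the linkage rule (12); treating a fish as never being its own most influential neighbor is our assumption, and the function name is ours:

import random
import numpy as np

def partition_school(M):
    # Partition the school per (12): fish i and j are linked iff one is the
    # other's most influential mate. M[i, j] is the influence of j over i.
    M = np.array(M, dtype=float).copy()
    np.fill_diagonal(M, -np.inf)            # a fish is not its own neighbor
    top = M.argmax(axis=1)                  # most influential fish for each i
    main = set(range(len(M)))
    groups = []
    while main:
        i = random.choice(tuple(main))      # random seed fish, as in Algorithm 1
        main.remove(i)
        group, frontier = {i}, [i]
        while frontier:                     # grow the subgroup transitively
            a = frontier.pop()
            for j in [j for j in main if top[a] == j or top[j] == a]:
                main.remove(j)
                group.add(j)
                frontier.append(j)
        groups.append(group)
    return groups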
4 Experiments
In this section the performance of our new approach is compared to the performance of
NichePSO [10] and GSO [6] for benchmark functions with two and more dimensions
and with a finite number of optimal solutions.
4.1 Methodology
For the experiments in this section, we used the set of multimodal functions shown in
Table 1.
For all benchmark functions, we consider that: (1) the entities are distributed uniformly
over the search space; to ensure a uniform distribution of the entities in the search
space, we used Faure sequences [18] to generate a uniform sequence of pseudo-random
numbers; (2) the number of iterations for density FSS is halved with respect to the number
of iterations of the algorithms NichePSO and GSO in all experiments, since in density
FSS each fish performs two calls to the objective function at each iteration; (3) if the
normalized distance between two captured optima i and j, for any pair of optima, is less than
0.01, we assume those optima to be the same optimal solution; in this case, only
the fittest optimum is kept, and the other one is discarded.
The normalized distance dN(i, j) between two points i and j is given as

$$d_N(i, j) = \frac{(x^N_i - x^N_j) \cdot (x^N_i - x^N_j)}{D},$$

where $x^N = \left( \frac{x_1}{x_{1max}}, \frac{x_2}{x_{2max}}, \ldots, \frac{x_D}{x_{Dmax}} \right)$, D is the number of dimensions of the MMOP, and
$x_{kmax}$ is the upper bound of dimension k; (4) for all selected optima, we considered
that density FSS and NichePSO have captured an optimal solution k if the normalized
distance between k and the optima closest to k is less than 0.005; in GSO [6, p. 99], an
optimal solution k is captured when at least three glowworms are located at a distance
less than ε = 0.05 from k. In this paper, the value of ε was modified to 0.005, following
the procedure used for the algorithms NichePSO and density FSS, as described earlier.
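A short sketch of the de-duplication step (3), using the normalized distance defined above; the helper names and the maximization assumption are ours:

import numpy as np

def normalized_distance(xi, xj, xmax):
    # d_N(i, j) over coordinates normalized by the per-dimension upper bounds.
    v = xi / xmax - xj / xmax
    return np.dot(v, v) / len(xi)

def merge_optima(optima, fitness, xmax, tol=0.01):
    # Keep only the fittest of any group of captured optima closer than tol
    # (higher fitness is better, assuming maximization of the peaks).
    order = np.argsort(fitness)[::-1]
    kept = []
    for idx in order:
        if all(normalized_distance(optima[idx], optima[k], xmax) >= tol
               for k in kept):
            kept.append(idx)
    return [optima[k] for k in kept]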
Table 1. Multimodal benchmark functions. The domain of the search space and the corresponding
number of peaks for that domain are given in the second and the third columns, respectively.
$F_1(X) = \sum_{i=1}^{m} \cos^2(X(i)),\ X \in \mathbb{R}^m$ | $[-\pi, \pi]^m$ | $3^m$ | [6]
$F_2(x, y) = \cos^2(x) + \sin^2(y)$ | $[-5, 5]^2$ | 12 | [11, 6]
$F_3(x) = 1 + \sum_{i=1}^{n} \frac{x_i^2}{4000} - \prod_{i=1}^{n} \cos\left( \frac{x_i}{\sqrt{i}} \right)$ | $[-29, 29]^2$ | 124 | [17]
$F_4(x, y) = 200 - (x^2 + y^2 - 11)^2 - (x + y^2 - 7)^2$ | $[-6, 6]^2$ | 4 | [10, 6]
$F_5(x, y) = 3(1-x)^2 e^{-[x^2+(y+1)^2]} - 10\left( \frac{x}{5} - x^3 - y^5 \right) e^{-[x^2+y^2]} - \frac{1}{3} e^{-[(x+1)^2+y^2]}$ | $[-3, 3]^2$ | 3 | [6]
The parameter configuration for NichePSO was the same as that used in [14, p. 2300]:
c1 = c2 = 1.2, w linearly decreasing from 0.7 to 0.2, δ = 10^{-4}, μ = 10^{-2},
and ε = 0.1. Those values are used for all experiments performed in this paper. For
GSO, we used the same configuration as described in [6, p. 99]: ρ = 0.4, γ = 0.6,
β = 0.08, nt = 5, s = 0.03, l0 = 5. The value rs = 2 was chosen
for all experiments based on the results presented for GSO in [6, pp. 109–110]. For
density FSS, the parameter values were: ρ = 0.3, stepinit = 0.05, decaymin = 0.999,
decaymaxini = 0.99, and decaymaxend = 0.95. All those values were determined through
extensive preliminary numerical experiments.
Table 2. Comparison of the performance of the algorithms NichePSO, GSO and density FSS
regarding the percentage of the total number of combinations for which the algorithms captured
on average more than 95% of the number of optimal solutions of one MMOP
F1 F2 F3 F4 F5 F6 F7
In order to compare the performance of the algorithms NichePSO, GSO and density
FSS, we chose the following metric: the percentage of the total number of combinations for
which an algorithm captured on average more than 95% of the number of optimal solutions
(i.e. number of peaks) of one MMOP. Table 2 summarizes the performance of the
algorithms. In general, as one can note in Table 2, density FSS outperformed
NichePSO and GSO for all functions used in this section with respect to this
metric. For the function F7, for example, density FSS captured on average more than 95%
of the optimal solutions for 23.48% of the total number of combinations, whereas NichePSO
and GSO failed to capture more than 95% of the optimal solutions for all combinations.
5 Conclusions
In this paper, the FSS algorithm was adapted to locate multiple optima of an
MMOP simultaneously. We were able to produce a new mechanism (and algorithm), based on
the principle of density, that affords the segregation needed for splitting the fish school and,
consequently, is able to locate multiple optima. Two new operators are proposed for
the partition of the fish school into subswarms, such that each created subswarm
corresponds to one potential optimal solution of a given MMOP. At the end of the
execution of the algorithm, the set of captured optima of the given MMOP is
produced.
The experimental results demonstrate that density-based FSS is a far better
approach to MMOPs than NichePSO and GSO. The reason for that is the evident
ability of density FSS to simultaneously capture multiple optima without heavy
additional parameterization costs. The highlights of the current proposal are: (1) it
outperforms NichePSO and GSO for all benchmark functions; and (2) it has the ability
to tackle MMOPs of more than two dimensions without the need for manual parameter
adjustments.
Acknowledgments
The authors acknowledge the financial support from CAPES, CNPq and University of
Pernambuco for scholarships, support and travel grants.
References
[1] Koper, K., Wysession, M., Wiens, D.: Multimodal function optimization with a niching
genetic algorithm: A seismological example. Bulletin of the Seismological Society of
America 89(4), 978–988 (1999)
[2] Dilettoso, E., Salerno, N.: A self-adaptive niching genetic algorithm for multimodal
optimization of electromagnetic devices. IEEE Transactions on Magnetics 42(4), 1203–
1206 (2006)
[3] El Imrani, A., Zine El Abidine, H., Limouri, M., Essaid, A.: Multimodal optimization of
thermal histories. Comptes Rendus de l’Academie de Sciences - Serie IIa: Sciences de la
Terre et des Planetes 329(8), 573–577 (1999)
[4] Luh, G.-C., Chueh, C.-H.: Job shop scheduling optimization using multi-modal immune
algorithm. In: Okuno, H.G., Ali, M. (eds.) IEA/AIE 2007. LNCS (LNAI), vol. 4570, pp.
1127–1137. Springer, Heidelberg (2007)
[5] Naraharisetti, P., Karimi, I., Srinivasan, R.: Supply chain redesigns - multimodal
optimization using a hybrid evolutionary algorithm. Industrial and Engineering Chemistry
Research 48(24), 11094–11107 (2009)
[6] Krishnanand, K., Ghose, D.: Glowworm swarm optimization for simultaneous capture of
multiple local optima of multimodal functions. Swarm Intelligence 3(2), 87–124 (2009)
[7] Eberhart, R., Kennedy, J.: A new optimizer using particle swarm theory. Micro Machine
and Human Science, 39–43 (1995)
[8] Shi, Y., Eberhart, R.C.: An empirical study of particle swarm optimization. In: IEEE
Congress on Evolutionary Computation, pp. 1945–1960 (1999)
[9] Parsopoulos, K., Vrahatis, M.N.: Modification of the particle swarm optimizer for locating
all the global minima. In: Karny (ed.) Artificial Neural Networks and Genetic Algorithms,
pp. 324–327 (2001)
[10] Brits, R., Engelbrecht, A.P., van den Bergh, F.: A niching particle swarm optimizer. In:
Proceedings of the 4th Asia-Pacific conference on simulated evolution and learning, pp.
692–696 (2002)
[11] Parsopoulos, K., Vrahatis, M.N.: On the computation of all global minimizers through
particle swarm optimization. IEEE Transactions on Evolutionary Computation 8(3), 211–
224 (2004)
[12] Brits, R., Engelbrecht, A., van den Bergh, F.: Locating multiple optima using particle swarm
optimization. Applied Mathematics and Computation 189(2), 1859–1883 (2007)
[13] Ozcan, E., Yilmaz, M.: Particle swarms for multimodal optimization. In: Beliczynski, B.,
Dzielinski, A., Iwanowski, M., Ribeiro, B. (eds.) ICANNGA 2007. LNCS, vol. 4431, pp.
366–375. Springer, Heidelberg (2007)
[14] Engelbrecht, A., Van Loggerenberg, L.: Enhancing the nichepso. In: IEEE Congress on
Evolutionary Computation, CEC, pp. 2297–2302 (2007)
[15] Bastos-Filho, C., de Lima Neto, F., Lins, A., Nascimento, A., Lima, M.: Fish school search.
In: Chiong, R. (ed.) Nature-Inspired Algorithms for Optimisation. SCI, vol. 193, pp. 261–
277. Springer, Heidelberg (2009)
[16] Martinetz, T.M., Schulten, K.J.: Topology representing networks. Neural Networks 7(3),
507–522 (1994)
[17] Griewank, A.: Generalized descent for global optimization. Journal of Optimization Theory
and Applications 34, 11–39 (1981)
[18] Thiemard, E.: Economic generation of low-discrepancy sequences with a b-ary gray
code. Technical report, Departement de Mathematiques, Ecole Polytechnique Federale de
Lausanne, CH-1015 Lausanne, Switzerland (1998)
Mining Coherent Biclusters with Fish School Search
1 Introduction
Fish schools are one of the best examples of collective animal behavior [17]. Schools
are groups composed of many fish, usually of the same species, acting as a single unit
and moving in more or less harmonious patterns throughout the oceans. These groups
show a streamlined structure and uniform behavior aiming at avoiding predators and
finding food. Fish join schools for selfish reasons; therefore, in order for schooling to
improve fitness, schools must offer benefits greater than the costs of increased
visibility to predators, increased competition, and energetic instability [12].
Recently, a novel swarm intelligence metaheuristic named Fish School Search
(FSS) was introduced by Bastos Filho et al. [2]. In a nutshell, FSS is inspired by the
collective behavior displayed by real fish schools and thus is composed of operators
that mimic their feeding and swimming activities. Together these operators afford
salient computational properties such as [2][14]: (i) high-dimensional search abilities;
(ii) on-the-‘swim’ selection between exploration and exploitation; and (iii) self-
adaptable guidance towards sought solutions (which can be multimodal).
In FSS, the school “swims” (searches) for “food” (candidate solutions) in the
“aquarium” (search space). The weight of each fish acts as a sort of memory of its
individual success, and both individual and collective movements are performed so as
to locate and explore promising areas of the aquarium. So far, this algorithm has been
adopted with success to solve continuous optimization problems [2][14]. In this paper,
we follow a different perspective by providing a preliminary assessment of the
potentials of FSS while tackling a non-trivial data mining task known as biclustering.
The main idea behind biclustering is to simultaneously cluster both rows and
columns of a data matrix, allowing the extraction of contextual information from it [13].
This notion can be traced back to the 1960s, though it has become better known
since the beginning of the last decade, when it was reintroduced by Cheng and Church
[3] in the domain of gene expression data analysis. Biclustering techniques have been
applied in different contexts, such as bioinformatics, time series expression data, text
mining, and collaborative filtering [5]-[9][13]. Some of their advantages over
conventional clustering algorithms are [3][9][13][18]: (i) they can properly deal with
missing data and corrupted measurements by automatically selecting rows and columns
with more coherent values and dropping those corrupted with noise; (ii) they group
items based on a similarity measure that depends on a context, i.e. a subset of the
attributes, describing not only the grouping but the context as well; and (iii) they allow
rows and columns to be simultaneously included in multiple biclusters.
Many different biclustering algorithms can be found in the literature [13][16][18].
In particular, due to the highly combinatorial nature of this problem, bio-inspired
metaheuristics have been successfully adopted to tackle it, such as genetic algorithms
(GA), particle swarm optimization (PSO), artificial immune systems (AIS), and ant
colony optimization (ACO) [7]-[9][15][19]. In this study, to assess the performance of
FSS in mining coherent and sizeable biclusters, a simple modification of the
algorithm was performed in order to allow for the representation of binary solutions.
Two datasets, one related to bioinformatics [4] and the other to collaborative filtering
[6], were considered, and the assessment is done here having as yardstick the levels of
performance exhibited by GA and PSO. Overall, the results achieved so far suggest
that the FSS algorithm is competitive in terms of locating coherent biclusters and
prevails in terms of better computational efficiency.
The rest of the paper is organized as follows: Section 2 provides a brief account on
the biclustering problem. Section 3 describes the FSS algorithm while Section 4
reviews the main steps of the GA and PSO algorithms. In Section 5, the empirical
results achieved so far are discussed, while, in Section 6, we provide final remarks.
$$MSR(I, J) = \frac{1}{|I| \, |J|} \sum_{i \in I, \, j \in J} (a_{ij} - a_{iJ} - a_{Ij} + a_{IJ})^2. \quad (2)$$

In (1) and (2), I denotes the set of rows, J the set of columns, $a_{ij}$ is an element
of the submatrix, $a_{iJ}$ stands for the average of the i-th row, $a_{Ij}$ indicates the average of
the j-th column, $a_{IJ}$ is the average of the whole submatrix, and δ is a threshold to be
pre-defined by the user.
Each individual in the chosen algorithms (GA, PSO and FSS) denotes a candidate
bicluster and is formed by two bit strings of sizes n and m, where n (m) denotes the total
number of rows (columns) of the data matrix [19]. The solution encoding is thus
binary, with the bit '1' ('0') meaning that the corresponding row or column belongs
(does not belong) to the bicluster. So, a matrix element needs to have both its row and
column set to '1' to effectively belong to the bicluster.
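Under this encoding, extracting the submatrix and scoring it with the MSR of (2) is straightforward; a minimal NumPy sketch, in which the function names and the random example data are ours:

import numpy as np

def decode(row_bits, col_bits, data):
    # The two binary strings select the rows and columns of the bicluster.
    I = np.flatnonzero(row_bits)
    J = np.flatnonzero(col_bits)
    return data[np.ix_(I, J)]

def msr(sub):
    # Mean squared residue of a submatrix, Eqn. (2) (Cheng and Church [3]).
    row_mean = sub.mean(axis=1, keepdims=True)   # a_iJ
    col_mean = sub.mean(axis=0, keepdims=True)   # a_Ij
    all_mean = sub.mean()                        # a_IJ
    return ((sub - row_mean - col_mean + all_mean) ** 2).mean()

data = np.random.rand(6, 5)                      # toy data matrix
sub = decode(np.array([1, 0, 1, 1, 0, 1]), np.array([1, 1, 0, 1, 0]), data)
print(msr(sub))                                  # compare against threshold δ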
evaluation of the fitness function of the previous and current fish position, aiming at
modeling the difference of food concentration on these sites. If the fish did not find a
better position, it is assumed that its weight remains constant, according to [14].
After feeding, the collective-instinctive movement takes place by calculating a
weighted average of the individual movements based on the immediate success of all
fish in the school [2][14]. Only those fish that had successful individual movements
influence the resulting direction of this collective movement. When the overall
direction is computed, each fish is repositioned.
Finally, the third swimming operator, referred to as collective-volitive movement
[2][14], is devised as an overall success/failure evaluation based on the incremental
weight variation of the whole fish school. If the school is accumulating weight, the
radius of the school should contract; otherwise, it should enlarge. This amplification
or contraction is applied as a small step drift to every fish position taking as reference
the school’s barycenter. The barycenter is calculated by considering all fish positions
and their respective weights. Also in this movement, a control parameter, stepvol, is
adopted to determine the effective fish displacement in the aquarium [2][14]. The
value of this parameter can be set as a function of stepind [14].
In this paper, in order to cope with the biclustering problem, a binary encoding of
the candidate solutions has been adopted. Since the standard FSS algorithm was
originally conceived to deal with continuous optimization problems [2][14], we have
resorted to the same trick adopted in the context of PSO to convert the representation
of the solutions from real to binary (see Subsection 4.2). By this means, a vector of
binary positions effectively encodes the solution (bicluster) associated with each
fish after the individual movement operator.
The main steps of the FSS algorithm adopted here are described below [2][14]:
Randomly initialize the first population of fish
Evaluate the fitness value of each fish
While termination conditions are not satisfied
    For each fish i
        Update its position applying the individual movement operator:
            v_i(t+1) = x_i(t) + rand(-1, 1) * step_ind
            x_i(t+1) = 1 if rand(0, 1) < sig(v_i(t+1)), and 0 otherwise
        Evaluate the fitness value of fish i
        Apply the feeding operator:
            W_i(t+1) = W_i(t) + Δf_i / max(Δf)
    end for
    Calculate the weighted average of the individual movements:
            I(t) = Σ_i Δx_i Δf_i / Σ_i Δf_i
    For each fish i
        Apply the collective-instinctive and collective-volitive movements,
        mapping each updated position back to binary with sig() and a
        draw from rand(0, 1), as above
    end for
    Update the individual and volitive steps:
            step_ind(t+1) = step_ind(t) - (step_ind_initial - step_ind_final) / #iterations
            step_vol(t+1) = 2 * step_ind(t+1)
end while
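The real-to-binary conversion used above, and again by the discrete PSO of Subsection 4.2, can be sketched as follows; a minimal illustration assuming the standard logistic function, with a function name of our own:

import numpy as np

def to_binary(v, rng=np.random):
    # Logistic squashing of a real-valued vector into bit probabilities,
    # then a uniform draw per dimension decides each bit.
    prob = 1.0 / (1.0 + np.exp(-v))
    return (rng.random(v.shape) < prob).astype(int)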
4 Contestant Algorithms
In the sequel, we briefly outline the main steps behind the GA and PSO algorithms as
investigated here for biclustering purposes.
Genetic Algorithms are general-purpose search algorithms that use the vocabulary
and principles borrowed from natural genetics [10]. In a nutshell, a GA instance
moves from one population of individuals (solutions), referred to as chromosomes, to
a new population, using selection (to reproduce and to survive) together with
genetics-inspired operators, such as crossover and mutation. By this means,
individuals with better genetic features are more likely to survive and produce
offspring increasingly fit in the next generations, while less fit individuals tend to
disappear. The best-known form of a GA (referred to as the standard or simple GA [10])
employs a binary encoding of the solutions, whereby a genotype-phenotype mapping
is used for the interpretation and evaluation of the individuals. The pseudocode of the GA
instance used in this paper is presented next:
Randomly initialize the first population of chromosomes
Evaluate the fitness value of each chromosome
While termination conditions are not satisfied
Select parents to generate new chromosomes
Apply genetic operators (crossover and mutation) with a given probability
Evaluate the new individuals
Choose individuals to form the new generation
end while
The PSO algorithm maintains a swarm of particles where each particle represents a
solution to the problem [11]. During its flight, all particles perform three basic
operations, namely, they evaluate themselves, compare the quality of their solutions
with that of their neighbors, and try to mimic that neighbor with the best performance
so far. By this means, the position of each particle is adjusted each iteration based on
its own previous experience and the experience of its social neighbors.
Originally, two PSO variants were developed, which differ in the type of the
particles' neighborhoods: (i) Global Best PSO, aka gbest PSO, in which the
neighborhood of each particle is the entire swarm; and (ii) Local Best PSO, aka lbest
PSO, which creates a neighborhood for each particle comprising a number of local
neighbors, possibly including the particle itself [11].
Each particle i has a set of attributes, namely, its current position (xi), its current
velocity (vi), the best position discovered by the particle so far (pbest), and the best
position discovered so far by its associated neighborhood (gbest or lbest). All
particles start with randomly initialized velocities and positions. The position of a
particle is changed by adding a velocity to the current position. It is the velocity
vector that drives the search process, and it reflects both the experiential knowledge of
the particle and socially exchanged information from the particle's neighborhood.
In the update of the particle's velocity, three control parameters assume important
roles, namely: the inertia weight, w, which is a sort of momentum factor controlling
the influence of the previous velocity; and c1 and c2, which are acceleration
coefficients (known as the cognitive and social factors), delimiting how strongly the
particle is attracted by the regions containing pbest and gbest, respectively. The
impact of the latter parameters is modulated by random variables, which are
responsible for the stochastic nature of the algorithm [11][19].
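For reference, the standard gbest-PSO velocity update with these parameters can be sketched as follows (a generic textbook form [11], with a function name of our own):

import numpy as np

def pso_velocity(v, x, pbest, gbest, w=0.7, c1=1.2, c2=1.2, rng=np.random):
    # w scales the previous velocity (momentum); c1, c2 scale the cognitive
    # and social pulls; r1, r2 supply the stochastic modulation.
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    return w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)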
In this study, a discrete version of the gbest PSO was used to search for biclusters
as particles. In this case, for a particle flying over a binary space, the values of its
position and velocity vectors must lie in the range [0, 1]. A straightforward trick is to
use a logistic function over the velocity to transform it from real to binary spaces [19]:
$sig(v) = 1 / (1 + e^{-v})$.
By this means, the dimensions of the velocity vector of each particle are interpreted as
probability thresholds, and the components of the new position vector of the particle
are calculated by randomly choosing a number in [0, 1] and then verifying whether
this number is higher than the respective threshold. The pseudocode of the PSO
instance used in this paper is presented next:
runtime (2nd column), the rate of overlap among the biclusters of the last
generation (5th column), the iteration where the best bicluster was found for the
first time (6th column), and the rate of success (in 30 trials) in locating coherent
and sizeable biclusters (7th column) are also given. One can notice that FSS is
indeed sensitive to the calibration of stepind and stepvol, showing better search
behavior for biclusters with higher initial values for these control parameters.
(stepind, stepvol) | Time (s) | MSR | Volume | Overlap | Iteration | Success (%)
(10, 1) | 440.964 ± 3.808 | 256.822 ± 34.674 | 2,953.5 ± 55.516 | 0.258 ± 0.074 | 22 ± 12.749 | 100
(10, 0.1) | 440.046 ± 4.201 | 257.057 ± 36.914 | 3,008.2 ± 269.947 | 0.303 ± 0.098 | 21.85 ± 10.302 | 95
(10, 0.01) | 439.052 ± 1.342 | 256.457 ± 60.529 | 3,067.95 ± 413.851 | 0.297 ± 0.06 | 19.25 ± 9.419 | 90
(10, 0.001) | 439.157 ± 1.351 | 253.615 ± 52.875 | 3,036.3 ± 301.966 | 0.303 ± 0.094 | 25.35 ± 10.189 | 95
(1, 0.1) | 439.273 ± 1.358 | 315.87 ± 183.119 | 3,915.55 ± 2,115.46 | 0.281 ± 0.063 | 17.5 ± 12.352 | 60
(1, 0.01) | 444.821 ± 13.441 | 332.083 ± 164.295 | 3,467 ± 1,184.546 | 0.246 ± 0.039 | 14.4 ± 8.958 | 45
(1, 0.001) | 442.096 ± 5.814 | 333.791 ± 152.209 | 3,750.3 ± 1,786.454 | 0.247 ± 0.064 | 15.65 ± 8.235 | 40
(0.1, 0.01) | 440.338 ± 1.829 | 457.138 ± 196.66 | 4,768.8 ± 2,156.878 | 0.228 ± 0.016 | 6.45 ± 5.125 | 20
(0.1, 0.001) | 439.444 ± 1.331 | 527.752 ± 115.233 | 6,498.9 ± 2,215.774 | 0.227 ± 0.009 | 8.2 ± 4.372 | 5
(stepind, stepvol) | Time (s) | MSR | Volume | Overlap | Iteration | Success (%)
(10, 1) | 284.666 ± 17.539 | 0.658 ± 0.026 | 504,929.4 ± 14,549.076 | 0.402 ± 0.011 | 32.95 ± 12.275 | 100
(10, 0.1) | 290.29 ± 17.574 | 0.662 ± 0.02 | 509,949.95 ± 15,747.593 | 0.413 ± 0.018 | 38.6 ± 14.873 | 100
(10, 0.01) | 286.56 ± 18.938 | 0.662 ± 0.03 | 509,269.45 ± 17,080.4 | 0.397 ± 0.025 | 35.1 ± 16.635 | 100
(10, 0.001) | 289.01 ± 14.671 | 0.671 ± 0.025 | 512,090.05 ± 14,601.992 | 0.409 ± 0.02 | 39.1 ± 12.957 | 100
(1, 0.1) | 261.52 ± 11.832 | 0.671 ± 0.025 | 459,960.25 ± 13,354.922 | 0.349 ± 0.036 | 14.45 ± 9.801 | 100
(1, 0.01) | 266.843 ± 13.277 | 0.664 ± 0.022 | 464,786.9 ± 12,737.374 | 0.343 ± 0.04 | 20.85 ± 12.758 | 100
(1, 0.001) | 263.142 ± 13.73 | 0.662 ± 0.017 | 458,808.75 ± 14,839.044 | 0.348 ± 0.044 | 17.2 ± 14.062 | 100
(0.1, 0.01) | 251.749 ± 5.104 | 0.667 ± 0.023 | 445,906.55 ± 6,306.942 | 0.269 ± 0.002 | 7.6 ± 6.116 | 100
(0.1, 0.001) | 253.378 ± 3.617 | 0.675 ± 0.026 | 447,246.6 ± 8,891.807 | 0.269 ± 0.002 | 7.1 ± 4.09 | 100
As far as the sizes of the elicited biclusters are concerned, the FSS method has not
accomplished the same level of performance as demonstrated by GA and, mostly, by PSO,
with the latter noticeably leading in this measure.
6 Final Remarks
In this paper, we presented a first-round empirical evaluation of the performance of a
new bio-inspired metaheuristic, Fish School Search [2], in tackling the non-trivial
biclustering task. When compared to GA and PSO, the results achieved for two real-
world datasets suggest that FSS is indeed very competitive in terms of quickly locating
coherent biclusters. As future work, we shall conduct further experiments with other
datasets from bioinformatics [5][7] and text mining [9], and investigate the use of
alternative fitness functions to help FSS better locate more sizeable biclusters.
Acknowledgment
This work was financially supported by CNPq (via Grant # 312934/2009-2) and
CAPES/PROSUP (via a master degree scholarship).
References
1. Barkow, S., Bleuler, S., Prelić, A., Zimmermann, P., Zitzler, E.: BicAT: A Biclustering
Analysis Toolbox. Bioinformatics 22, 1282–1283 (2006)
2. Filho, C.J.A.B., de Lima Neto, F.B., Lins, A.J.C.C., Nascimento, A.I.S., Lima, M.P.: Fish
School Search. In: Chiong, R. (ed.) Nature-Inspired Algorithms for Optimisation. SCI,
vol. 193, pp. 261–277. Springer, Heidelberg (2009)
3. Cheng, Y., Church, G.M.: Biclustering of Expression Data. In: International Conference on
Intelligent Systems for Molecular Biology, pp. 93–103 (2000)
4. Cho, R., Campbell, M., Winzeler, E., Steinmetz, L., Conway, A., Wodicka, L., Wolfsberg,
T., Gabrielian, A., Landsman, D., Lockhart, D., Davis, R.: A Genome-Wide
Transcriptional Analysis of the Mitotic Cell Cycle. Mol. Cell 2, 65–73 (1998)
5. Coelho, G.P., de França, F.O., Von Zuben, F.J.: Multi-Objective Biclustering: When Non-
dominated Solutions are not Enough. J. Math. Model Algor. 8, 175–202 (2009)
6. de Castro, P.A.D., de França, F.O., Ferreira, H.M., Von Zuben, F.J.: Applying Biclustering
to Perform Collaborative Filtering. In: International Conference on Intelligent System
Design and Applications, pp. 421–426 (2007)
7. de Castro, P.A.D., de França, F.O., Ferreira, H.M., Von Zuben, F.J.: Applying Biclustering
to Text Mining: An Immune-Inspired Approach. In: de Castro, L.N., Von Zuben, F.J.,
Knidel, H. (eds.) ICARIS 2007. LNCS, vol. 4628, pp. 83–94. Springer, Heidelberg (2007)
8. de França, F.O., Coelho, G.P., Von Zuben, F.J.: bicACO: An Ant Colony Inspired
Biclustering Algorithm. In: Dorigo, M., Birattari, M., Blum, C., Clerc, M., Stützle, T.,
Winfield, A.F.T. (eds.) ANTS 2008. LNCS, vol. 5217, pp. 401–402. Springer, Heidelberg
(2008)
9. Divina, F., Aguilar-Ruiz, J.S.: Biclustering of Expression Data with Evolutionary
Computation. IEEE Trans. Knowl. Data Eng. 18, 590–602 (2006)
10. Eiben, A.E., Smith, J.: Introduction to Evolutionary Computing, 2nd edn. Springer,
Heidelberg (2007)
11. Engelbrecht, A.P.: Fundamentals of Computational Swarm Intelligence. Wiley, Chichester
(2007)
12. Hamilton, W.D.: Geometry for the Selfish Herd. J. Theor. Biol. 31, 295–311 (1970)
13. Madeira, S.C., Oliveira, A.L.: Biclustering Algorithms for Biological Data Analysis: A
Survey. IEEE/ACM Trans. Comput. Biol. Bioinform. 1, 24–45 (2004)
14. Madeiro, S.S.: Multimodal Search by Density-Based Fish Schools (in Portuguese). Master
Dissertation, University of Pernambuco (2010)
15. Mitra, S., Banka, H.: Multi-objective Evolutionary Biclustering of Gene Expression Data.
Pattern Recogn. 39, 2464–2477 (2006)
16. Prelić, A., Bleuler, S., Zimmermann, P., Wille, A., Bühlmann, P., Gruissem, W., Hennig,
L., Thiele, L., Zitzler, E.: A Systematic Comparison and Evaluation of Biclustering
Methods for Gene Expression Data. Bioinformatics 22, 1122–1129 (2006)
17. Sumpter, D.: Collective Animal Behavior. Princeton Univ. Press, Princeton (2010)
18. Tanay, A., Sharan, R., Shamir, R.: Biclustering Algorithms: A Survey. In: Srinivas, A.
(ed.) Handbook of Computational Molecular Biology, Chapman & Hall/CRC (2005)
19. Xie, B., Chen, S., Liu, F.: Biclustering of Gene Expression Data Using PSO-GA Hybrid.
In: International Conference Bioinformatics and Biomedical Engineering, pp. 302–305
(2007)