
Information Sciences 182 (2012) 93–114


Multiple sequence alignment using the Hidden Markov Model trained by an improved quantum-behaved particle swarm optimization

Jun Sun *, Xiaojun Wu, Wei Fang, Yangrui Ding, Haixia Long, Wenbo Xu
Key Laboratory of Advanced Control for Light Industry (Ministry of Education, China), Jiangnan University, No. 1800, Lihu Avenue, Wuxi, Jiangsu 214122, China
Department of Computer Science and Technology, Jiangnan University, No. 1800, Lihu Avenue, Wuxi, Jiangsu 214122, China

Article info

Article history: Available online 18 November 2010

Keywords: Multiple sequence alignment; Hidden Markov Model; Parameter optimization; Quantum-behaved particle swarm optimization; Population diversity

Abstract

Multiple sequence alignment (MSA) is an NP-complete and important problem in bioinformatics. For MSA, Hidden Markov Models (HMMs) are known to be powerful tools. However, the training of HMMs is computationally hard, so that metaheuristic methods such as simulated annealing (SA), evolutionary algorithms (EAs) and particle swarm optimization (PSO) have been employed to tackle the training problem. In this paper, quantum-behaved particle swarm optimization (QPSO), a variant of PSO, is first analyzed mathematically, and then an improved version is proposed to train the HMMs for MSA. The proposed method, called diversity-maintained QPSO (DMQPSO), is based on the analysis of QPSO and integrates a diversity control strategy into QPSO to enhance the global search ability of the particle swarm. To evaluate the performance of the proposed method, we use DMQPSO, QPSO and other algorithms to train the HMMs for MSA on three benchmark datasets. The experiment results show that the HMMs trained with DMQPSO and QPSO yield better alignments for the benchmark datasets than other most commonly used HMM training methods such as Baum–Welch and PSO.

© 2010 Elsevier Inc. All rights reserved.

1. Introduction

Multiple sequence alignment (MSA) of nucleotides or amino acids is one of the most important and challenging problems
in bioinformatics. It is an extension of pairwise alignment to incorporate more than two sequences at a time. Multiple align-
ment methods try to align all of the sequences in a given query set, and the resulting aligned sequences are often used to
construct phylogenetic trees, to find protein families, to predict secondary and tertiary structures of new sequences, and
to demonstrate the homology between new sequences and existing families [38]. Multiple sequence alignment is computa-
tionally difficult to produce and most formulations of the problem lead to NP-complete combinatorial optimization prob-
lems [66]. Nevertheless, the utility of these alignments in bioinformatics has led to the development of a variety of
methods suitable for aligning three or more sequences.
The technique of dynamic programming is theoretically applicable to any number of sequences. However, since it is com-
putationally expensive in both time and memory, it is rarely used for more than three or four sequences in its most basic
form. One method to tackle this problem is to use the so-called "progressive alignment" strategies [17,60]. Put briefly,
the progressive alignment strategy repeats the following steps until all given sequences are aligned. First, two sequences
are chosen from the given multiple sequences. Then the two sequences are aligned by the dynamic programming algorithm

* Corresponding author. Address: Department of Computer Science and Technology, Jiangnan University, No. 1800, Lihu Avenue, Wuxi, Jiangsu 214122, China. Tel./fax: +86 510 85912136.
E-mail address: [email protected] (J. Sun).

0020-0255/$ - see front matter © 2010 Elsevier Inc. All rights reserved.
doi:10.1016/j.ins.2010.11.014

and finally are replaced with the resulting pairwise alignment. Though it is certain that this strategy allows us to align a large
number of given sequences in a practical amount of computation time, the resulting alignment is often not necessarily opti-
mal, because the result is affected by the local (partial) information produced by each pairwise alignment. The most well-
known representative utilizing progressive alignments is the MSA program Clustal W [60].
An alternative to progressive alignment methods is to use stochastic optimization methods, such as simulated annealing
(SA) [26,34] or evolutionary algorithms (EAs) [7,39,62]. These approaches execute a series of steps of updating the alignment
to improve the objective function value, which measures multiple alignment quality, to find an optimal alignment. The EAs
evolve a population of alignments in a quasi evolutionary manner and gradually improve the fitness of the population as
measured by the objective function.
Another efficient approach, based on probabilistic models such as Hidden Markov Models (HMMs) [2,3,27,36], is cur-
rently one of the most popular techniques for multiple sequence alignment. HMMs have been applied to MSA and have
been shown to be efficient tools for the problem [23]. For MSA, HMMs are employed to create a sequence of gap insertion and deletion operations to align the sequences. Generally, an HMM topology used for the MSA problem requires roughly as many states as the average length of the sequences in the problem. Therefore, one issue of the HMM approach is that there is no known deterministic algorithm that can guarantee to yield an optimally trained HMM within reasonable computational time. The most common way to deal with this problem is to employ approximation algorithms based on statistics and re-estimation, such as the most widely used Baum–Welch (BW) algorithm, which is also known as the forward–backward algorithm [4]. Gradient methods [3] have also been used to optimize the parameters of an HMM. However, these methods are local search techniques that usually result in sub-optimally trained HMMs. Another possibility is to estimate the parameters of an HMM by
random optimization algorithms, such as SA [14] and EAs [28,51]. SA or EAs ensure a higher chance of reaching a global opti-
mum by starting with single or multiple random search points and updating the candidate solutions randomly. However, SA and EAs are often criticized for drawbacks such as a lack of local search ability, premature convergence and slow convergence speed.
During the last decade, the development of optimization theory saw the emergence of swarm intelligence, a category of random search methods for solving global optimization problems. Ant colony (AC) and particle swarm optimization (PSO) are two paradigms of this kind [8,9,15,24,50,64], and have recently been shown to be efficient tools for solving the MSA problem [29,46].
The PSO algorithm was originally proposed by Kennedy as a simulation of the social behavior of bird flocks [24]. It can be easily implemented and is computationally inexpensive, since its memory and CPU speed requirements are low. PSO has proved to be an efficient approach for many continuous global optimization problems, and in some cases it does not suffer from the difficulties encountered by GAs [1]. During the last decade, there have been many remarkable developments in the field of PSO, particularly in improvements and applications of the algorithm [9,13,16,22,25,32,37,43,44,50,63,68,69].
Recently, a new variant of PSO, called quantum-behaved particle swarm optimization (QPSO), has been proposed in order
to improve the global search ability of the original PSO [53–55]. The iterative equation of QPSO differs greatly from that of PSO in that it needs no velocity vectors for the particles, has fewer parameters to adjust and can be implemented more easily. It has been proved that this iterative equation makes QPSO globally convergent [59].
The QPSO algorithm has aroused the interest of many researchers from different communities. It has been shown to
successfully solve a wide range of continuous optimization problems. Among these applications, it has been used to tackle
constraint optimization problems [57], multi-objective optimization problems [40], neural network training [31], economic
dispatch problems [58], electromagnetic design [11,35], semiconductor design [48], clustering problems [56], system iden-
tification [18], mechanical design [12], image processing [19,30], bioinformatics [5,6], to name only a few.
In addition to the applications, many efficient strategies have been proposed to improve the QPSO algorithm. For exam-
ple, Liu et al. introduced the mutation operation into QPSO to improve the search ability of the algorithm [33]. In [67], the
authors proposed a local QPSO (LQPSO) as a generalized local search operator and incorporated LQPSO into a main QPSO
algorithm, which leads to a hybrid QPSO scheme QPSO-LQPSO with enhanced searching qualities. In [12], it was shown that
chaotic mutation operation could diversify the population of QPSO and thus improve the performance of the algorithm. Pant
et al. developed a new variant of QPSO, which used interpolation based recombination operator for generating a new solu-
tion vector in the search space [41]. They also proposed a new mutation operator called Sobol mutation to enhance the per-
formance of the QPSO algorithm [42]. In [21], the authors proposed a new improved QPSO, using the better recording
locations of all particles and the mutation of the best behaved particle in order to filtrate the particle swarm and accelerate
the convergence.
While empirical evidence has shown that the QPSO algorithm works well, there has thus far been little insight into how it
works and the algorithm has not been used to solve the MSA problem. In this paper, we analyze the behavior of a single particle in QPSO, deriving the necessary and sufficient condition for the probabilistic boundedness of the particle, which can guarantee that the particle swarm converges. Then, based on this analysis, we propose an improved QPSO, called diversity-main-
tained QPSO (DMQPSO), in which the diversity is maintained at a certain level to enhance the global search ability of
QPSO. Finally, the DMQPSO algorithm is used to train the HMMs for MSA and tested on three benchmark datasets.
The rest of this paper is organized as follows. In Section 2, we describe the topology and training of the HMM used for
MSA. Section 3 presents the principle of QPSO. Section 4 presents the analyses of the QPSO algorithm and the proposed
DMQPSO algorithm. Section 5 describes how to apply DMQPSO to HMM training for MSA. The experiment results on bench-
mark datasets are provided and discussed in Section 6. Some concluding remarks are offered in the last section.

2. Hidden Markov Model for MSA

2.1. Topology of HMM for MSA

The HMM structure used in this study is the standard topology for the MSA problem originally suggested by Krogh et al.
[27]. Fig. 1 shows a simple topology example of the HMM described as a directed graph. The HMM consists of a set of q states (S1, S2, ..., Sq) that are divided into three groups: match (M), insert (I) and delete (D). In addition, there are two special states: the begin state and the end state. States are connected to each other by transition probabilities $a_{ij}$, where $a_{ij} \ge 0$ ($1 \le i, j \le q$) and $\sum_{j=1}^{q} a_{ij} = 1$ ($1 \le i \le q$). A match or insert state $S_j$ emits an observable (a symbol) $v_k$ from an output alphabet $\Sigma$ with a probability $b_j(k)$, where $b_j(k) \ge 0$ ($1 \le j \le q$, $1 \le k \le m$) and $\sum_{k=1}^{m} b_j(k) = 1$ ($1 \le j \le q$). Here m is the number of observables. Delete states, the begin state and the end state do not emit observables and are thus called silent states.
Starting from the begin state and moving to the end state, the HMM generates sequences (strings of observables) by making nondeterministic walks that randomly go from state to state according to the transitions. Each walk yields a path p = (p1, p2, ...) of visited states and a sequence consisting of the observables emitted along the path. When the HMM is applied to MSA, the sequence of observables is given in the form of an unaligned sequence of amino acids. The goal of MSA is thus to find a path p which generates the best alignment. We can use the forward and Viterbi algorithms to determine the probability $P(o \mid \lambda)$ of a given sequence o being generated by an HMM $\lambda$, and to derive the path p with the maximal probability of generating o [45].

2.2. Training HMM for MSA

For a given sequence o and an HMM $\lambda$, the goal of training the HMM is to determine the parameters (transition and emission probabilities) of $\lambda$ such that $P(o \mid \lambda)$ is maximized. This task is usually tackled either by the Baum–Welch technique, which is based on statistical re-estimation formulas [45], or by random search methods such as SA [14] or EAs [28,51,62]. Before training, the length of the HMM should be determined. A commonly used estimate is the average length of the sequences to be aligned. After training, a better model length can be chosen with a heuristic method known as model surgery [27].
The quality of the HMM needs to be evaluated during the training. Generally, a log-odds score is used for this purpose, which is based on a log-likelihood score [45] given by

\[ \text{log-odds}(O, \lambda) = \frac{1}{N} \sum_{i=1}^{N} \log_2 \frac{P(O_i \mid \lambda)}{P(O_i \mid \lambda_N)} \qquad (1) \]

where $O = (O_1, O_2, \ldots, O_N)$ is the set of unaligned sequences, $\lambda$ is the trained HMM, and $\lambda_N$ is a null-hypothesis model. In this paper, a random model is chosen as the null-hypothesis model.
The final step after the training of the HMM is to interpret the learned sequences as a multiple sequence alignment. The HMM model from the training phase is considered to be a profile for the set of sequences. Thus the unaligned sequences can be aligned by such a profile HMM $\lambda$.
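Assuming the per-sequence likelihoods $P(O_i \mid \lambda)$ and $P(O_i \mid \lambda_N)$ have already been computed (for instance with a forward recursion like the one sketched above), Eq. (1) reduces to the following few lines; the random null model simply supplies the denominator likelihoods. The function name is ours.

import math

def log_odds(likelihoods, null_likelihoods):
    """Log-odds score of Eq. (1): the average log2 ratio of the model
    likelihood to the null-model likelihood over the N unaligned sequences."""
    assert len(likelihoods) == len(null_likelihoods)
    n = len(likelihoods)
    return sum(math.log2(p / q) for p, q in zip(likelihoods, null_likelihoods)) / n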

3. Quantum-behaved particle swarm optimization

In the PSO with M individuals, each individual is treated as a volume-less particle in the D-dimensional space, with the current position vector and velocity vector of particle i at the nth iteration represented as $X_{i,n} = (X^1_{i,n}, X^2_{i,n}, \ldots, X^D_{i,n})$ and $V_{i,n} = (V^1_{i,n}, V^2_{i,n}, \ldots, V^D_{i,n})$. The particle moves according to the following equations:

\[ V^j_{i,n+1} = w \cdot V^j_{i,n} + c_1 r^j_{i,n}\left(P^j_{i,n} - X^j_{i,n}\right) + c_2 R^j_{i,n}\left(G^j_n - X^j_{i,n}\right) \qquad (2) \]

\[ X^j_{i,n+1} = X^j_{i,n} + V^j_{i,n+1} \qquad (3) \]

Fig. 1. An example of HMM of length 3 for MSA problem.



for i = 1, 2, ..., M and j = 1, 2, ..., D, where $c_1$ and $c_2$ are called acceleration coefficients. The parameter w is known as the inertia weight, which can be adjusted to balance the explorative search and the exploitive search of the particle. The vector $P_{i,n} = (P^1_{i,n}, P^2_{i,n}, \ldots, P^D_{i,n})$ is the best previous position (the position giving the best objective function value or fitness value) of particle i, called the personal best (pbest) position, and the vector $G_n = (G^1_n, G^2_n, \ldots, G^D_n)$ is the position of the best particle among all the particles in the population, called the global best (gbest) position. Without loss of generality, we consider the following maximization problem:

\[ \text{Maximize } f(X), \quad \text{s.t. } X \in S \subseteq \mathbb{R}^D \qquad (4) \]

where f(X) is an objective function continuous almost everywhere and S is the feasible space. Accordingly, $P_{i,n}$ can be updated by

\[ P_{i,n} = \begin{cases} X_{i,n} & \text{if } f(X_{i,n}) > f(P_{i,n-1}) \\ P_{i,n-1} & \text{if } f(X_{i,n}) \le f(P_{i,n-1}) \end{cases} \qquad (5) \]

and $G_n$ can be found by $G_n = P_{g,n}$, where $g = \arg\max_{1 \le i \le M} f(P_{i,n})$. The parameters $r^j_{i,n}$ and $R^j_{i,n}$ are two different sequences of random numbers distributed uniformly on (0, 1), which is denoted by $r^j_{i,n}, R^j_{i,n} \sim U(0,1)$. Generally, the value of $V^j_{i,n}$ is restricted to the interval $[-V_{max}, V_{max}]$.
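The update rules of Eqs. (2), (3) and (5) translate directly into vectorized code. The sketch below performs one PSO iteration for a maximization problem; the pbest and gbest bookkeeping of Eq. (5) is assumed to be done by the caller, and the function name and default parameter values are ours.

import numpy as np

def pso_step(X, V, P, G, w, c1=2.0, c2=2.0, v_max=1.0):
    """One PSO move following Eqs. (2) and (3).

    X, V : (M, D) current positions and velocities
    P    : (M, D) personal best (pbest) positions
    G    : (D,)   global best (gbest) position
    """
    M, D = X.shape
    r = np.random.rand(M, D)                      # r_{i,n}^j ~ U(0, 1)
    R = np.random.rand(M, D)                      # R_{i,n}^j ~ U(0, 1)
    V = w * V + c1 * r * (P - X) + c2 * R * (G - X)
    V = np.clip(V, -v_max, v_max)                 # restrict velocities to [-Vmax, Vmax]
    return X + V, V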
Trajectory analysis in [10] showed that convergence of the PSO algorithm may be achieved if each particle converges to its local attractor $p_{i,n} = (p^1_{i,n}, p^2_{i,n}, \ldots, p^D_{i,n})$, defined at the coordinates

\[ p^j_{i,n} = \frac{c_1 r^j_{i,n} P^j_{i,n} + c_2 R^j_{i,n} G^j_n}{c_1 r^j_{i,n} + c_2 R^j_{i,n}}, \quad 1 \le j \le D \qquad (6) \]

or

\[ p^j_{i,n} = \varphi^j_{i,n} \cdot P^j_{i,n} + \left(1 - \varphi^j_{i,n}\right) \cdot G^j_n \qquad (7) \]

where $\varphi^j_{i,n} = c_1 r^j_{i,n} / (c_1 r^j_{i,n} + c_2 R^j_{i,n})$, with $r^j_{i,n}$ and $R^j_{i,n}$ being the random numbers in Eq. (2). In PSO, the acceleration coefficients $c_1$ and $c_2$ are generally set to be equal, i.e. $c_1 = c_2$, and thus $\varphi^j_{i,n}$ is a sequence of uniformly distributed random numbers on (0, 1). As a result, Eq. (7) can be restated as

\[ p^j_{i,n} = \varphi^j_{i,n} \cdot P^j_{i,n} + \left(1 - \varphi^j_{i,n}\right) \cdot G^j_n, \quad \varphi^j_{i,n} \sim U(0,1) \qquad (8) \]

In QPSO, each single particle is treated as a spin-less one moving in quantum space. The state of the particle is thus characterized by a wave function $\psi$, where $|\psi|^2$ is the probability density function of its position. Inspired by the convergence analysis of the particle in PSO [10], we assume that, at the nth iteration, particle i flies in the D-dimensional space with a $\delta$ potential well centered at $p^j_{i,n}$ on the jth dimension ($1 \le j \le D$). Letting $Y^j_{i,n+1} = |X^j_{i,n+1} - p^j_{i,n}|$, we can obtain the normalized wave function at iteration n + 1

\[ \psi\left(Y^j_{i,n+1}\right) = \frac{1}{\sqrt{L^j_{i,n}}} \exp\left(-Y^j_{i,n+1} / L^j_{i,n}\right) \qquad (9) \]

which satisfies the bound condition that $\psi(Y^j_{i,n+1}) \to 0$ as $Y^j_{i,n+1} \to \infty$. Here $L^j_{i,n}$ is the characteristic length of the wave function. By the statistical interpretation of the wave function, the probability density function is given by

\[ Q\left(Y^j_{i,n+1}\right) = \left|\psi\left(Y^j_{i,n+1}\right)\right|^2 = \frac{1}{L^j_{i,n}} \exp\left(-2 Y^j_{i,n+1} / L^j_{i,n}\right) \qquad (10) \]

and thus the probability distribution function is

\[ F\left(Y^j_{i,n+1}\right) = 1 - \exp\left(-2 Y^j_{i,n+1} / L^j_{i,n}\right) \qquad (11) \]

Using the Monte Carlo method, we can measure the jth component of the position of particle i at the (n + 1)th iteration by

\[ X^j_{i,n+1} = p^j_{i,n} \pm \frac{L^j_{i,n}}{2} \ln\left(1 / u^j_{i,n+1}\right), \quad u^j_{i,n+1} \sim U(0,1) \qquad (12) \]

where $u^j_{i,n+1}$ is a sequence of random numbers uniformly distributed on (0, 1). The value of $L^j_{i,n}$ is determined by

\[ L^j_{i,n} = 2\alpha \cdot \left|X^j_{i,n} - C^j_n\right| \qquad (13) \]

where $C_n = (C^1_n, C^2_n, \ldots, C^D_n)$ is called the mean best (mbest) position, defined as the average of the pbest positions of all particles, i.e. $C^j_n = (1/M) \sum_{i=1}^{M} P^j_{i,n}$ ($1 \le j \le D$). Thus the position of the particle updates according to the following equation:

\[ X^j_{i,n+1} = p^j_{i,n} \pm \alpha \cdot \left|X^j_{i,n} - C^j_n\right| \cdot \ln\left(1 / u^j_{i,n+1}\right) \qquad (14) \]

The parameter $\alpha$ in Eqs. (13) and (14) is called the contraction–expansion (CE) coefficient, which can be adjusted to balance the local search and the global search of the algorithm during the optimization process. The current position of the particle in QPSO is thus updated according to Eqs. (8) and (14).
The QPSO algorithm starts with the initialization of the particles' current positions and their pbest positions (setting $P_{i,0} = X_{i,0}$), followed by the iteration of updating the particle swarm. At each iteration, the mbest position of the particle swarm is computed and the current position of each particle is updated according to Eqs. (8) and (14). Before each particle updates its current position, its fitness value is evaluated and then its pbest position and the current gbest position are updated. In Eq. (14), the probability of using either the operation "+" or the operation "-" is equal to 0.5. The search procedure continues until the termination condition is met.
We outline the procedure of the QPSO algorithm as follows:

Procedure of the QPSO:

Step 1: Initialize the population;
Step 2: Execute the following steps;
Step 3: Compute the mean best position C;
Step 4: Properly select the value of $\alpha$;
Step 5: For each particle in the population, execute Step 6 to Step 8;
Step 6: Evaluate the objective function value $f(X_{i,n})$;
Step 7: Update $P_{i,n}$ and $G_n$;
Step 8: Update each component of the particle's position according to Eqs. (8) and (14);
Step 9: While the termination condition is not met, return to Step 2;
Step 10: Output the results.
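The procedure above can be condensed into a few dozen lines. The sketch below maximizes an objective f with QPSO, using Eq. (8) for the local attractors, Eq. (13) for the mbest position and Eq. (14) for the position update; $\alpha$ is decreased linearly, as recommended for the original QPSO (see Eq. (25) in Section 4.2). The function signature, the [0, 1]^D initialization range and the default parameter values are our own choices for illustration.

import numpy as np

def qpso(f, dim, n_particles=20, n_iter=1000, alpha1=1.0, alpha2=0.5):
    X = np.random.rand(n_particles, dim)            # current positions in [0, 1]^D
    P = X.copy()                                     # pbest positions
    pbest = np.array([f(x) for x in X])              # pbest fitness values
    for n in range(n_iter):
        alpha = (alpha1 - alpha2) * (n_iter - n) / n_iter + alpha2   # CE coefficient
        C = P.mean(axis=0)                           # mbest position (Eq. (13))
        for i in range(n_particles):                 # evaluate and update pbest (Eq. (5))
            fx = f(X[i])
            if fx > pbest[i]:
                pbest[i], P[i] = fx, X[i].copy()
        G = P[np.argmax(pbest)]                      # gbest position
        phi = np.random.rand(n_particles, dim)
        p = phi * P + (1.0 - phi) * G                # local attractors (Eq. (8))
        u = np.random.rand(n_particles, dim)
        sign = np.where(np.random.rand(n_particles, dim) < 0.5, 1.0, -1.0)
        X = p + sign * alpha * np.abs(X - C) * np.log(1.0 / u)   # Eq. (14)
    return P[np.argmax(pbest)], pbest.max()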

4. Analysis of QPSO and the diversity-maintained QPSO

4.1. Analysis of QPSO

The value of the control parameter $\alpha$ in the QPSO algorithm may result in a kind of explosion, with position coordinates careening toward infinity. This section demonstrates that a properly selected $\alpha$ can prevent explosion, and further, that this coefficient can induce the particle to be probabilistic bounded.
It appears from Eq. (14) that each dimension is updated independently from the others. The only link between the dimensions of the problem space is introduced via the objective function, and in turn, through the locations of the personal and global best positions found so far, as well as the mean best position among the particles. Without loss of generality, the issue of probabilistic boundedness for a particle in the D-dimensional space can be reduced to that of a single particle in the one-dimensional space. Therefore, Eq. (14) can be simplified as

\[ X_{n+1} = p \pm \alpha \cdot |X_n - C| \cdot \ln(1/u_{n+1}), \quad u_{n+1} \sim U(0,1) \qquad (15) \]

In the above equation, p, along with C, is treated as a probabilistic bounded random variable, instead of a constant as in [10]. Here, the probabilistic boundedness of p and C means that $P\{\sup |p| < \infty\} = 1$ and $P\{\sup |C| < \infty\} = 1$. The position sequence $\{X_n\}$ is a sequence of random variables and $\{u_n\}$ is a sequence of independent identically distributed (i.i.d.) random variables with $u_n \sim U(0,1)$ for all n.
Now we tackle the behavior of the particle in QPSO, beginning by rewriting Eq. (15) as

\[ X_{n+1} - C = p - C \pm \alpha \cdot |X_n - C| \cdot \ln(1/u_{n+1}), \quad u_{n+1} \sim U(0,1) \qquad (16) \]

The following theorem can be deduced based on this equality.

Theorem 1. The necessary and sufficient condition for the position sequence $\{X_n\}$ of the particle to be probabilistic bounded is that $\alpha \le e^{\gamma} \approx 1.781$, where $\gamma \approx 0.577215665$ is the Euler constant.

Proof. Let $\lambda_n = \alpha \ln(1/u_n)$. Since $\lambda_n$ is a continuous random variable, $P\{\lambda_n = 1\} = 0$. Considering that $P\{\sup |C - p| < \infty\} = 1$, we have $P\{\sup |C - p|/(1 - \lambda_n) < \infty\} = 1$. Let $\sup |C - p|/(1 - \lambda_n) = r$, where $0 < r < \infty$; we then have that for every $n > 0$, $|C - p|/(1 - \lambda_n) \le r$, namely

\[ |C - p| \le r(1 - \lambda_n) \qquad (17) \]

Proof of sufficiency: According to (16), we have

\[ |X_n - C| \le |C - p| + \alpha \cdot |X_{n-1} - C| \cdot \ln(1/u_n) = |C - p| + \lambda_n |X_{n-1} - C| \qquad (18) \]

Thus we can obtain the inequality below by replacing $|C - p|$ in (18) by that in (17):

\[ |X_n - C| - r \le \lambda_n \left(|X_{n-1} - C| - r\right) \]

from which we find that

\[ |X_n - C| \le r + \left(|X_0 - C| - r\right) \prod_{i=1}^{n} \lambda_i \qquad (19) \]

Since $\prod_{i=1}^{n} \lambda_i > 0$, the following inequality holds:

\[ \sup |X_n - C| \le \sup\left[r + \left(|X_0 - C| - r\right) \prod_{i=1}^{n} \lambda_i\right] \le r + \sup\left(|X_0 - C| - r\right) \cdot \sup\left(\prod_{i=1}^{n} \lambda_i\right) \]

Referring to Theorem A1 in the Appendix, we have that, whenever $\alpha \le e^{\gamma}$,

\[ P\{\sup b_n < \infty\} = P\left\{\sup\left(\prod_{i=1}^{n} \lambda_i\right) < \infty\right\} = 1 \]

where $b_n = \prod_{i=1}^{n} \lambda_i$. Since $r < \infty$, $P\{\sup(|X_0 - C| - r) < \infty\} = 1$. Therefore

\[ P\{\sup |X_n - C| < \infty\} \ge P\left\{r + \sup\left(|X_0 - C| - r\right) \cdot \sup\left(\prod_{i=1}^{n} \lambda_i\right) < \infty\right\} = P\left\{\sup\left(|X_0 - C| - r\right) \cdot \sup\left(\prod_{i=1}^{n} \lambda_i\right) < \infty\right\} \]
\[ \ge P\left\{\left[\sup(|X_0 - C| - r) < \infty\right] \cap \left(\sup b_n < \infty\right)\right\} = P\{\sup(|X_0 - C| - r) < \infty\} + P\{\sup b_n < \infty\} - P\left\{\left[\sup(|X_0 - C| - r) < \infty\right] \cup \left(\sup b_n < \infty\right)\right\} \]
\[ = 1 + 1 - P\left\{\left[\sup(|X_0 - C| - r) < \infty\right] \cup \left(\sup b_n < \infty\right)\right\} \ge 1 \]

We immediately have that $P\{\sup |X_n - C| < \infty\} = 1$, implying that $|X_n - C|$ is probabilistic bounded. Consequently, $X_n$ is also probabilistic bounded.

Proof of necessity: According to (16), we have the following inequality:

\[ |X_n - C| \ge -|C - p| + \alpha \cdot |X_{n-1} - C| \cdot \ln(1/u_n) = -|C - p| + \lambda_n |X_{n-1} - C| \qquad (20) \]

Replacing $|C - p|$ in (20) by that in (17), we find that

\[ |X_n - C| + r \ge \lambda_n \left(|X_{n-1} - C| + r\right) \qquad (21) \]

from which we obtain

\[ |X_n - C| + r \ge \left(|X_0 - C| + r\right) \prod_{i=1}^{n} \lambda_i \qquad (22) \]

Thus we have that

\[ \sup\left(|X_n - C| + r\right) \ge \sup\left[\left(|X_0 - C| + r\right) \prod_{i=1}^{n} \lambda_i\right] = \sup\left[\left(|X_0 - C| + r\right) b_n\right] = \sup\left(|X_0 - C| + r\right) \cdot \sup b_n \]

where $b_n = \prod_{i=1}^{n} \lambda_i$. Consequently, it follows that

\[ P\{\sup(|X_n - C| + r) < \infty\} \le P\{\sup(|X_0 - C| + r) \cdot \sup b_n < \infty\} = P\left\{\left[\sup(|X_0 - C| + r) < \infty\right] \cap \left[\sup b_n < \infty\right]\right\} \qquad (23) \]

If $X_n$ is bounded, $P\{\sup |X_n - C| < \infty\} = 1$. Since $r < \infty$, $P\{\sup(|X_n - C| + r) < \infty\} = 1$. Inequality (23) immediately results in $P\{[\sup(|X_0 - C| + r) < \infty] \cap [\sup b_n < \infty]\} = 1$. Due to the probabilistic boundedness of $|X_0 - C|$, we have that $P\{\sup(|X_0 - C| + r) < \infty\} = 1$, from which we obtain $P\{\sup b_n < \infty\} = 1$. According to Theorem A1 in the Appendix, $\alpha \le e^{\gamma}$ is thus the necessary condition for the probabilistic boundedness of $X_n$.
This completes the proof of the theorem. □

The above theorem reveals that the behavior of the particle in QPSO is related to the convergence of $b_n$. Besides, the particle is also influenced by the point C. In practice, when the QPSO algorithm is running, the personal best positions of all the particles converge to the same point. This implies that $\lim_{n\to\infty} |C - p| = 0$, leading to the fact that if and only if $\alpha < e^{\gamma}$, $\lim_{n\to\infty} |X_n - C| = 0$ or $\lim_{n\to\infty} |X_n - p| = 0$, according to (19). According to the proof of Theorem A1 in the Appendix, when $\alpha = e^{\gamma}$, $b_n$ can be any positive real number and the particle's position is probabilistic bounded. However, when $\alpha > e^{\gamma}$, the particle diverges. In [55], it was shown by stochastic simulation that when $\alpha \le 1.7$, the particle is probabilistic bounded, and when $\alpha \ge 1.8$, it diverges. Thus, the theoretical results on the particle's behavior are consistent with the simulation results.
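As a quick numerical illustration of the threshold in Theorem 1 (not part of the paper), one can fix p = C = 0, so that Eq. (15) reduces to $|X_{n+1}| = \alpha \ln(1/u_{n+1}) |X_n|$ and the behavior of the particle is governed entirely by the product $b_n$ of the $\lambda_i$. The sketch below tracks $\log_{10} |X_n|$ under this simplification; with $\alpha = 1.7$ the trajectory drifts downward (bounded), while with $\alpha = 1.8$ it grows, matching the simulation thresholds quoted from [55].

import numpy as np

def log10_abs_position(alpha, n_steps=5000, seed=0):
    """log10 |X_n| for the simplified recursion |X_{n+1}| = alpha*ln(1/u)*|X_n|,
    starting from |X_0| = 1; p and C are held fixed at 0 for illustration."""
    rng = np.random.default_rng(seed)
    log_x = 0.0
    for _ in range(n_steps):
        log_x += np.log(alpha * np.log(1.0 / rng.random()))
    return log_x / np.log(10.0)

# The expected drift per step is (ln(alpha) - gamma) / ln(10): negative for
# alpha < e^gamma ~ 1.781 and positive for larger alpha.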

4.2. The proposed DMQPSO

In a PSO system, with the fast information flow among particles due to its collectiveness, the diversity of the particle
swarm declines rapidly, leaving the PSO algorithm with great difficulties in escaping local optima. In the QPSO algorithm,
diversity loss of the whole population is also inevitable. Inspired by works undertaken by Ursem and Riget et al. [47,65],
we propose a diversity-maintained QPSO (DMQPSO) in this paper.
As in [47,65], the population diversity of DMQPSO, denoted by $d_n$, is measured by the average Euclidean distance from the particles' current positions to the centroid of the swarm, namely

\[ d_n = \frac{1}{M \cdot |A|} \sum_{i=1}^{M} \sqrt{\sum_{j=1}^{D} \left(X^j_{i,n} - \bar{X}^j_n\right)^2} \qquad (24) \]

where $\bar{X}^j_n = (1/M) \sum_{i=1}^{M} X^j_{i,n}$. In (24), |A| is the length of the longest diagonal of the search space, and D is the dimension of the problem. Hence, we may guide the search of the particles with this diversity measure while the algorithm is running.
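Eq. (24) amounts to the mean distance of the particles from the swarm centroid, scaled by the longest diagonal |A| of the search space. A minimal sketch (names ours):

import numpy as np

def swarm_diversity(X, diag_len):
    """Population diversity d_n of Eq. (24).

    X        : (M, D) matrix of current particle positions
    diag_len : length |A| of the longest diagonal of the search space,
               e.g. sqrt(D) for the [0, 1]^D scope used in Section 6
    """
    centroid = X.mean(axis=0)                        # centroid \bar{X}_n
    distances = np.linalg.norm(X - centroid, axis=1) # per-particle Euclidean distance
    return distances.mean() / diag_len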
At the beginning of the search, the diversity of the population is relatively high after initialization. As the evolution proceeds, the convergence of the particles causes the diversity to decline, which enhances the local search ability (exploitation) but weakens the global search ability (exploration) of the algorithm. At the early or middle stage of the evolution, the decline of the diversity is necessary for the particle swarm to search effectively. However, after the middle or at the later stage, the particles may converge into such a small region that the diversity of the swarm is very low and further search is difficult. At that time, if the particle with the global best position is at a local optimum or sub-optimum, premature convergence occurs.
To avoid premature convergence and improve the performance of the algorithm, we introduce a diversity control method into QPSO and develop the DMQPSO algorithm. Unlike Ursem's and Riget's works [47,65], in DMQPSO only a lower bound $d_{low}$ is set for $d_n$ to prevent the diversity from constantly decreasing. The procedure of the algorithm is as follows.
After initialization, the algorithm runs in convergence mode. The analysis of QPSO in Section 4.1 has shown that when $\alpha \le e^{\gamma}$, the particle is probabilistic bounded, which leads to convergence of the particle swarm; otherwise the particle swarm explodes. In this study, the convergence mode is realized by varying $\alpha$ linearly over the course of the run. That is, $\alpha$ is calculated as

\[ \alpha = (\alpha_1 - \alpha_2) \cdot (n_{max} - n)/n_{max} + \alpha_2 \qquad (25) \]

where $\alpha_1$ and $\alpha_2$ ($\alpha_1 > \alpha_2$) are the initial and final values of $\alpha$, respectively, n is the current iteration number and $n_{max}$ is the maximum number of allowable iterations. This parameter control method is also adopted in the original QPSO and generally results in good performance of the algorithm [54,55]. Over the course of the evolution, if the diversity measure of the swarm drops below the lower bound $d_{low}$, the particle swarm switches to explosion mode, in which the particles are controlled to explode in order to increase the diversity until it is larger than $d_{low}$. As suggested by the analysis of QPSO, we may set $\alpha = \alpha_0 > e^{\gamma} \approx 1.781$ to make the particles diverge, which in turn results in the increase of $d_n$. We outline the DMQPSO algorithm below.

Procedure of the DMQPSO

Step 0: Initialize the particles with random positions; set the pbest position of each particle as $P_{i,0} = X_{i,0}$;
Step 1: For n = 1 to $n_{max}$ (maximum number of iterations), execute the following steps;
Step 2: Calculate the mean best position C among the particles;
Step 3: Compute the value of $\alpha$ according to (25) (convergence mode);
Step 4: Measure $d_n$ according to Eq. (24). If $d_n < d_{low}$, set $\alpha = \alpha_0$ (explosion mode);
Step 5: For each particle, execute Step 6 to Step 9;
Step 6: Compute its objective function value $f(X_{i,n})$. If $f(X_{i,n}) > f(P_{i,n})$, then $P_{i,n} = X_{i,n}$;
Step 7: Select the current global best (gbest) position $G_n$;
Step 8: For each dimension of each particle, get the stochastic point $p^j_{i,n}$ by Eq. (8);
Step 9: Update each component of the current position by Eq. (14) and return to Step 1.

It can be seen that DMQPSO runs in convergence mode during most of the iterations. Only when the diversity declines below $d_{low}$ do the particles fly in explosion mode. The explosion process is transitory, and once the diversity is above the threshold, the population returns to convergence mode. Moreover, our preliminary experiment results for several widely used benchmark functions showed that setting $d_{low} \approx 10^{-4}$ generally leads the algorithm to good performance.
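In code, the only change DMQPSO makes to the QPSO sketch given in Section 3 is the choice of the CE coefficient at each iteration: Eq. (25) in convergence mode, and the fixed value $\alpha_0$ in explosion mode whenever the diversity of Eq. (24) falls below $d_{low}$. A minimal sketch, with default values taken from the settings reported in Section 6.2.1:

def dmqpso_alpha(n, n_max, diversity, d_low=1e-4, alpha1=1.0, alpha2=0.5, alpha0=2.0):
    """CE coefficient for iteration n of DMQPSO: explosion mode when the swarm
    diversity has dropped below d_low, otherwise the linear schedule of Eq. (25)."""
    if diversity < d_low:
        return alpha0                                          # explosion mode
    return (alpha1 - alpha2) * (n_max - n) / n_max + alpha2    # convergence mode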

5. DMQPSO trained HMMs for MSA

5.1. Training HMMs by DMQPSO

Now we turn our attention to the application of DMQPSO to training the HMM for MSA. When using the DMQPSO algo-
rithm to train the HMM, we keep the length of the HMM constant during training and only optimize the parameters of the

HMM, namely, the transition and emission probabilities. The candidate solution for the HMM is represented as the position
of a particle with real encoding of l transitions and m emission probabilities. Thus the dimension of the search space for the
HMM training is l + m.
During each iteration of DMQPSO, a copy of the population is created. All particles in this copy are normalized such that the constraints on the transition and emission probabilities mentioned in Section 2 are satisfied. Each particle in the copy population X' is evaluated either by the log-odds score in Eq. (1) or by the sum-of-pairs score (SOP), which is described in the next subsection.
The procedure of HMM training with DMQPSO is outlined as follows:

Step 0: Initialize the populations of the particles' current positions X and personal best positions P, and normalize P;
Step 1: For n = 1 to $n_{max}$ (maximum number of iterations), execute the following steps;
Step 2: Copy the population of the particles' current positions X to X';
Step 3: Calculate the mean best position C among the particles;
Step 4: Compute the value of $\alpha$ according to (25) (convergence mode);
Step 5: Measure $d_n$ according to Eq. (24). If $d_n < d_{low}$, set $\alpha = \alpha_0$ (explosion mode);
Step 6: Normalize the population X';
Step 7: For each particle, execute Steps 8-11;
Step 8: Compute the objective function value of its position in X', $f(X'_{i,n})$. If $f(X'_{i,n}) > f(P_{i,n})$, then $P_{i,n} = X'_{i,n}$;
Step 9: Select the current global best (gbest) position $G_n$;
Step 10: For each dimension of each particle, get the stochastic point $p^j_{i,n}$ by Eq. (8);
Step 11: Update each component of the current position in population X by Eq. (14) and return to Step 1.
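The normalization in Steps 0 and 6 is not spelled out in the paper; a simple way to realize it is to rescale, within each group of components that must form a probability distribution (the outgoing transitions of one state, or the emission vector of one match or insert state), so that the group sums to one. The sketch below is our assumption of such a projection, not the authors' exact procedure.

import numpy as np

def normalize_particle(x, prob_groups):
    """Project a raw particle position onto valid HMM parameters.

    x           : flat vector of the l transition and m emission components
    prob_groups : list of index lists; each group must form a probability distribution
    """
    x = np.abs(np.asarray(x, dtype=float))          # probabilities must be non-negative
    for idx in prob_groups:
        s = x[idx].sum()
        x[idx] = x[idx] / s if s > 0 else 1.0 / len(idx)   # uniform fallback for an all-zero group
    return x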

5.2. MSA with the trained HMM

After training the HMM with DMQPSO, the output gbest position of the particles represents the optimized parameters of the HMM, i.e. the transition and emission probabilities. The trained HMM can be considered as a profile for the set of sequences. Thus we can align the multiple sequences based on the HMM using the Viterbi algorithm [45]. Finally, the resulting alignment of the multiple sequences is evaluated according to the sum-of-pairs score (SOP) [46].

5.3. Scoring of MSA

We employ two different methods for scoring the test alignments obtained in the experiments. The first one does not rely on any form of prior knowledge about the structure of the resulting alignment, while the second requires a reference alignment for comparison, where the reference alignment refers to a manually refined alignment that is believed to be of high quality. Both scoring methods are based on the widely used sum-of-pairs (SOP) scoring function.
For the experiments without prior knowledge regarding the structure of the resulting alignments, we use the standard sum-of-pairs scoring as follows:

\[ \text{sum-of-pairs (SOP)} = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} D(l_i, l_j) \qquad (26) \]

where $l_i$ is aligned sequence i, and D is a distance metric. In this study, the widely accepted BLOSUM62 replacement matrix [20] is used as the distance metric. To prevent the accumulation of many gaps in an alignment, we deduct from the sum-of-pairs function an affine gap cost given by

\[ \text{Gap Cost} = GOP + n \cdot GEP \qquad (27) \]

where GOP is a fixed penalty for opening a gap, GEP is the penalty for extending the gap, and n is the number of gap symbols in the gap. With the gap cost calculated for each gap in each of the aligned sequences, the sum of these costs is then deducted from the sum-of-pairs score.
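Eqs. (26) and (27) combine into the scoring routine sketched below. How columns containing gap symbols contribute to the pairwise term is not specified in the text, so here they are simply skipped; the substitution scores come from a dictionary such as BLOSUM62 (assumed to contain both key orders), and all names are ours.

def sop_score(aligned, subst, gop=11, gep=2, gap='-'):
    """Sum-of-pairs score of Eq. (26) minus the affine gap costs of Eq. (27).

    aligned : list of equal-length aligned sequences (strings)
    subst   : dict mapping residue pairs, e.g. ('A', 'R'), to substitution scores
    """
    total = 0
    for i in range(len(aligned) - 1):
        for j in range(i + 1, len(aligned)):
            for a, b in zip(aligned[i], aligned[j]):
                if a != gap and b != gap:
                    total += subst[(a, b)]
    # deduct GOP + n * GEP for every run of n gap symbols in every sequence
    for seq in aligned:
        run = 0
        for c in seq + 'X':                  # sentinel flushes a trailing gap run
            if c == gap:
                run += 1
            elif run:
                total -= gop + run * gep
                run = 0
    return total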
In the experiments where a reference alignment is available, a modified sum-of-pairs score is employed to evaluate the alignment [61]. Given a test alignment of N sequences consisting of M columns, we denote the ith column of the alignment by $A_{i1}, A_{i2}, \ldots, A_{iN}$. For each pair of residues $A_{ij}$ and $A_{ik}$, $p_{ijk}$ is defined such that $p_{ijk} = 1$ if residues $A_{ij}$ and $A_{ik}$ from the test alignment are aligned with each other in the reference alignment; otherwise $p_{ijk} = 0$. Thus we define the column score by

\[ S_i = \sum_{j=1, j \ne k}^{N} \sum_{k=1}^{N} p_{ijk} \qquad (28) \]

The sum-of-pairs score for the test alignment is then given by

\[ \text{sum-of-pairs (SOP)} = \frac{\sum_{i=1}^{M} S_i}{\sum_{i=1}^{M_r} S_{ri}} \qquad (29) \]

where $M_r$ is the number of columns in the reference alignment and $S_{ri}$ is the score $S_i$ for the ith column of the reference alignment.

6. Experiments

6.1. Benchmark datasets

We tested the performance of DMQPSO in training the HMMs for MSA on three benchmark datasets: (1) simulated nucleotide data sets, (2) four protein families, and (3) amino-acid data sets from the BAliBASE benchmark alignment database.

6.1.1. Benchmark dataset A


Nucleotide sequences (20 sequences, 300 characters) were generated with the program Rose [52] (JC model, mean sub-
stitution rate = 0.013, and insert/delete probability = 0.03) as follows: a root random sequence (of length 500) was evolved
on a random tree to yield sequences of ‘low’ or ‘high’ mean divergences, i.e. with an average number of substitutions per site
of 0.5 or 1.0, respectively. Furthermore, the insertion/deletion length distribution was set to ‘short’ (frequencies of gaps of
length 1–3 = 0.8, 0.1, 0.1) or ‘long’ (frequencies of gaps of length 1–7 = 0.3, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1). Hence the four ‘mean
divergence-mean gap length' conditions tested here are 'low-short', 'low-long', 'high-short', and 'high-long'. 50 random data sets were generated under each of the four combinations of settings, and the average performance is reported. On this dataset, the sum-of-pairs score given by Eq. (26) is employed to score the resulting alignments.

6.1.2. Benchmark dataset B


The second dataset consisted of four large sets of unaligned sequences for the four protein families G5, CagY_M, Interferon and Biopterin_H, which were extracted from the Pfam database at http://pfam.jouy.inra.fr/browse.shtml [49]. The average, minimum and maximum lengths, the number of sequences and the size of the training set of each family are listed in Table 1. All of these protein families have previously been used in MSA studies with HMMs and GAs [14,62].
In order to check over-fitting of the HMM model, the four protein family sets were divided into training and validation
sets. The size of each training set was 150 as shown in Table 1. The validation sets were the original datasets excluding the
training sets. On this dataset, the sum-of-pairs score function is also given by Eq. (26).

6.1.3. Benchmark dataset C


The third group of datasets was from the BAliBASE (benchmark alignment database) database, which can be found at http://www.cs.nmsu.edu/jinghe/CS516BIOINFO/Fall05/BAliBASE/align_index.html. It contains several manually refined multiple sequence alignments specifically designed for the evaluation and comparison of multiple sequence alignment methods. We selected twelve sequence sets from the first reference set in the BAliBASE database, which are listed in Table 2. The first column in the table lists the names of the selected sequence sets, N is the number of sequences in each set, the third column includes the minimum and maximum lengths of the sequences in each set, and the fourth column lists the

Table 1
The four protein families.

Family        N     LSEQ (min, max)    LSEQ    T
G5            277   79 (67, 88)        79      150
CagY_M        549   31 (24, 35)        31      150
Interferon    375   164 (23, 200)      164     150
Biopterin_H   343   170 (13, 359)      170     150

N, number of sequences; LSEQ, length of sequences; T, size of training set.

Table 2
The 12 benchmark datasets from BAliBASE.

Name     N    LSEQ (min, max)    Identity (%)
1aboA    5    (49, 80)           <25
1idy     5    (49, 58)           <25
451c     5    (70, 87)           20–40
1krn     5    (66, 82)           >35
1bbt3    5    (149, 192)         <25
kinase   5    (263, 276)         <25
1pii     4    (247, 259)         20–40
5ptp     5    (222, 245)         >35
gal4     5    (335, 395)         <25
1ajsA    4    (258, 387)         <25
glg      5    (438, 486)         20–40
1tag     5    (806, 928)         >35

identity of the sequences. Since the reference alignments for these datasets are available, we employed the sum-of-pairs
score in Eq. (29) for determining the quality of the resulting test alignments.

6.2. Experiment settings

6.2.1. Parameter settings of the algorithms


The proposed DMQPSO algorithm, QPSO, PSO and the BW algorithm were used to train the HMMs on the datasets. The populations of size 20 in DMQPSO, QPSO and PSO were initialized randomly with a uniform distribution in the search scope $[0, 1]^D$. The parameters of PSO and QPSO were determined according to published recommendations [50,54,55]. For DMQPSO, the value of $\alpha$ in convergence mode was controlled via $\alpha_1$ and $\alpha_2$ as in the original QPSO [54,55], while in explosion mode it was set to $\alpha_0 = 2.0$, consistent with the fact that setting $\alpha_0 > e^{\gamma} \approx 1.781$ makes the particles explode. Our preliminary experiments showed that the performance of DMQPSO on MSA problems was not sensitive to the value of $\alpha_0$ as long as $\alpha_0$ was set to be larger than 1.781. It appeared that if the population diversity was below the lower bound $d_{low}$, the algorithm needed only

Table 3
The results of HMM log-odds scores for nucleotide sequences and the average execution time over 20 runs of each experiment (bold figures represent the best results obtained for each sequence set).

Nucleotide    Algorithm    Average log-odds score (standard deviation)    Normalized score    Average execution time (h)
Low-short BW 394.6 1.2140 0.0404
PSO 418.4 (7.35) 0.4018 4.3503
QPSO 448.8 (9.68) 0.6356 4.3435
DMQPSO 458.9 (5.80) 0.9802 4.3543
Low-long BW 226.5 0.7540 0.0378
PSO 215.8 (8.26) 0.9158 4.2912
QPSO 310.7 (4.68) 0.5195 4.3277
DMQPSO 352.4 (3.76) 1.1502 4.3421
High-short BW 186.3 0.9481 0.0345
PSO 195.6 (2.31) 0.7374 5.3179
QPSO 253.8 (3.45) 0.5811 5.4497
DMQPSO 276.9 (2.41) 1.1044 5.4770
High-long BW 280.8 0.6056 0.0371
PSO 256.4 (8.52) 1.0696 5.3424
QPSO 345.8 (9.73) 0.6304 5.3689
DMQPSO 367.6 (8.43) 1.0449 5.3726

Table 4
The results of SOP scores for the final alignments of nucleotide sequences and the average execution time over 20 runs of each experiment (bold figures represent the best results obtained for each sequence set).

Nucleotide    Algorithm    Average SOP score (standard deviation)    Normalized score    Average execution time (h)
Low-short CW 2514.9 0.2836 –
BW 2024.7 1.2356 0.0430
PSO 2154.3 (25.36) 0.8339 5.9012
QPSO 2638.4 (20.364) 0.6663 5.9824
DMQPSO 2784.7 (10.54) 1.1197 6.1269
Low-long CW 8760.7 0.5038 –
BW 8623.9 0.6942 0.0419
PSO 8430.8 (15.0154) 0.9629 5.9170
QPSO 9873.3 (18.462) 1.0444 5.9249
DMQPSO 9925.1 (16.5) 1.1165 5.9293
High-short CW 4316.5 0.128 –
BW 4270.2 1.2093 0.0477
PSO 4294.7 (13.5643) 0.6371 5.8703
QPSO 4351.9 (13.375) 0.6988 5.8937
DMQPSO 4376.6 (12.645) 1.2756 5.9122
High-long CW 7781.7 0.0683 –
BW 7418.2 0.9695 0.0469
PSO 7538.5 (22.8645) 0.6713 6.5177
QPSO 7848.8 (29.6881) 0.0981 6.5305
DMQPSO 8458.9 (25.8046) 1.6109 6.5763

a few iterations to restore the diversity to above $d_{low}$; accordingly, the total number of iterations spent in explosion mode was much smaller than that spent in convergence mode. Therefore, there is little difference among the numbers of iterations in explosion mode caused by different values of $\alpha_0$. As long as $\alpha_0 > 1.781$, it has less effect on the performance of the algorithm than $\alpha_1$ and $\alpha_2$ in convergence mode. Here we recommend that $\alpha_0$ be set to 2.0 for HMM training problems, according to our preliminary experiments.
Another important parameter of DMQPSO is $d_{low}$, whose value determines the level at which the diversity is maintained to balance the exploration and exploitation of the swarm at the later stage of the search. If it is set too large, DMQPSO has stronger global search ability but the efficiency of the local search is weakened; if too small, the algorithm has better local convergence but its global search ability is not significantly enhanced compared to the original QPSO. Our preliminary experiments suggested that selecting the value of $d_{low}$ in the interval $[10^{-6}, 10^{-3}]$ results in satisfactory performance of the algorithm. In our study, we set $d_{low}$ to $10^{-4}$, which appeared to lead DMQPSO to good performance on average, although for each instance in our experiments a better solution may be obtained by selecting another value for $d_{low}$. Here we list the detailed parameter settings for PSO, QPSO and DMQPSO as follows:

Fig. 2. Average fitness values found by the algorithms during training of the HMM for low-short nucleotide sequences, with the fitness values being (a) log-odds scores and (b) sum-of-pairs scores.

Table 5
The results of HMM log-odds scores for the training sets of the four protein families and the average execution time over 20 runs of each experiment (bold figures represent the best results obtained for each sequence set).

Family    Algorithm    Average log-odds score (standard deviation)    Normalized score    Average execution time (h)
G5 BW 103.146 0.8232 0.0595
PSO 101.4500 (1.89) 0.8694 8.6404
QPSO 154.94 (1.06) 0.5876 8.9469
DMQPSO 173.94 (0.94) 1.1051 9.0768
CagY_M BW 11.178 1.2711 0.0357
PSO 20.090 (1.14) 0.2958 7.0527
QPSO 28.255 (0.92) 0.5978 7.5136
DMQPSO 31.649 (0.48) 0.9692 7.6140
Interferon BW 158.314 0.4138 0.0714
PSO 141.736 (2.41) 1.1985 12.2936
QPSO 179.549 (1.43) 0.5913 12.5230
DMQPSO 188.63 (1.11) 1.0211 12.7288
Biopterin_H BW 162.431 1.1474 0.0754
PSO 179.521 (4.38) 0.0738 14.3029
QPSO 179.549 (1.43) 0.0721 14.4857
DMQPSO 201.284 (4.82) 1.2933 14.7279

PSO: population size M = 20; w decreasing linearly from 0.9 to 0.4; $c_1 = c_2 = 2.0$; $V_{max} = 1.0$.
QPSO: population size M = 20; $\alpha_1 = 1.0$ and $\alpha_2 = 0.5$.
DMQPSO: population size M = 20; $\alpha_1 = 1.0$ and $\alpha_2 = 0.5$; $\alpha_0 = 2.0$; $d_{low} = 0.0001$.

6.2.2. Other experiment configurations


For nucleic acids, all pairwise alignments were computed with the affine-gap-score algorithm and the IUB substitution
score table was used as the distance metric in the sum-of-pairs scoring function, and for penalties, a GOP of 15 and a GEP
of 7 were used. For amino-acid data, the BLOSUM62 replacement matrix was used as the distance metric in the sum-of-pairs scoring function, with GOP and GEP set to 11 and 2, respectively. These parameter values were configured as in [46].
For each set of sequences in datasets A, B and C, we performed HMM training experiments with the four training algo-
rithms with log-odds score and sum-of-pairs score as the objective function, respectively. Except for BW, a deterministic
algorithm given a fixed initial HMM, we repeated each experiment 20 times with each run executing 1000 iterations.

Table 6
The results of HMM log-odds scores for the validation sets of the four protein families and the average execution time over 20 runs of each experiment (bold figures represent the best results obtained for each sequence set).

Family    Algorithm    Average log-odds score (standard deviation)    Normalized score    Average execution time (h)
G5 BW 78.357 1.4202 0.0529
PSO 141.4500 (2.42) 0.1033 6.9691
QPSO 154.94 (1.12) 0.4291 7.0767
DMQPSO 173.94 (0.85) 0.8878 7.0856
CagY_M BW 12.832 1.4766 0.1729
PSO 20.090 (2.13) 0.3478 10.4372
QPSO 20.255 (1.08) 0.3892 10.4611
DMQPSO 21.649 (0.56) 0.7396 10.5357
Interferon BW 102.652 0.7919 0.3196
PSO 95.736 (1.54) 0.9324 17.9688
QPSO 179.549 (1.28) 0.7699 18.1971
DMQPSO 188.63 (1.06) 0.9544 18.2801
Biopterin_H BW 171.281 0.3689 0.3254
PSO 162.292 (2.38) 1.1665 19.9303
QPSO 179.549 (1.28) 0.3648 20.268
DMQPSO 188.63 (1.06) 1.1706 20.8479

Table 7
The results of SOP scores (divided by 1000) for the final alignments of the training sets of the four protein families and the average execution time over 20 runs of each experiment (bold figures represent the best results obtained for each sequence set).

Family    Algorithm    Average SOP score (standard deviation)    Normalized score    Average execution time (h)
G5 CW 189 0.5737 –
BW 192 0.4525 0.0737
PSO 176 (113.1) 1.0989 13.1803
QPSO 229 (63.7) 1.0423 13.2774
DMQPSO 230 (18.4) 1.0827 13.3631
CagY_M CW 142 1.1306 –
BW 138 0.9067 0.0602
PSO 120 (131.5) 0.1007 9.8439
QPSO 106 (103.37) 0.8844 9.8653
DMQPSO 103 (96.53) 1.0523 10.0782
Interferon CW 3226 0.9456 –
BW 3294 0.8429 0.0873
PSO 3772 (101.4) 0.1216 15.7066
QPSO 4136 (98.5) 0.4277 15.7198
DMQPSO 4835 (87.17) 1.4825 15.753
Biopterin_H CW 4015 0.6142 –
BW 4113 0.3697 0.0945
PSO 3924 (59.27) 0.8412 18.2017
QPSO 4328 (92.39) 0.1666 18.6513
DMQPSO 4926 (113.46) 1.6585 19.6002

6.3. Experiment results

6.3.1. Results for dataset A


Tables 3 and 4 summarize the experiment results for HMM training on dataset A. Table 3 lists the results of the experiments where the log-odds score was used as the quality measure for the HMMs. It shows that the DMQPSO algorithm was able to generate HMMs with better average log-odds scores than the HMMs trained with BW, PSO and QPSO, and that the QPSO algorithm had the second best performance in training the HMMs among all the algorithms. Table 4 shows the results for the average best sum-of-pairs scores of the alignments produced by the HMMs for the training sets. The sum-of-pairs scores of the alignments generated by Clustal W (CW) are also listed in Table 4. As is evident from the results, DMQPSO and QPSO achieved the best and the second best sum-of-pairs scores among all the methods. Alignments generated with Clustal W yielded better scores than those produced by the HMMs trained with PSO and BW.

Table 8
The results of HMM log-odds scores for the BAliBASE test sets and the average execution time over 20 runs of each experiment (bold figures represent the best results obtained for each sequence set).

Dataset    Algorithm    Average log-odds score (standard deviation)    Normalized score    Average execution time (h)
1aboA BW 46.3814 1.2643 0.0012
PSO 63.8205 (0.6243) 0.3374 0.0366
QPSO 84.2931 (0.8274) 0.7506 0.0397
DMQPSO 86.1835 (0.6321) 0.8511 0.0440
1idy BW 42.0576 1.2887 0.0011
PSO 59.7932 (0.4202) 0.2217 0.0355
QPSO 71.4864 (0.7630) 0.4818 0.0378
DMQPSO 80.5751 (0.5721) 1.0286 0.0425
451c BW 68.3522 1.2966 0.0016
PSO 89.1605 (0.5126) 0.0405 0.0426
QPSO 106.3024 (0.6950) 1.142 0.0440
DMQPSO 90.3075 (0.7453) 0.1142 0.0470
1krn BW 69.0222 1.1604 0.0013
PSO 81.9846 (0.9212) 0.4851 0.0402
QPSO 103.6417 (0.8287) 0.6432 0.0413
DMQPSO 110.5327 (0.6451) 1.0022 0.0448
1bbt3 BW 172.3816 0.7714 0.0035
PSO 169.2160 (0.8528) 0.8687 0.4926
QPSO 211.4329 (0.7319) 0.4293 0.5279
DMQPSO 236.8514 (0.5421) 1.2108 0.5754
kinase BW 214.9693 0.8306 0.0074
PSO 211.2745 (0.8959) 0.8681 0.7359
QPSO 356.8937 (0.6798) 0.6109 0.7513
DMQPSO 403.8526 (0.4665) 1.0878 0.7789
1pii BW 213.0459 1.3629 0.0061
PSO 277.0576 (0.9306) 0.0996 0.6736
QPSO 328.1439 (0.8513) 0.9087 0.6925
DMQPSO 310.1645 (0.4472) 0.5538 0.7245
5ptp BW 266.5928 1.0248 0.0058
PSO 311.5647 (0.9254) 0.6103 0.6474
QPSO 428.8537 (0.6151) 0.4706 0.6612
DMQPSO 504.1372 (0.6508) 1.1644 0.6850
gal4 BW 347.2819 1.0125 0.0119
PSO 389.3147 (0.7413) 0.5863 1.5203
QPSO 484.5218 (0.6931) 0.3792 1.5871
DMQPSO 567.3841 (0.5916) 1.2195 1.6979
1ajsA BW 326.4896 1.1082 0.0089
PSO 381.6639 (0.8374) 0.5289 0.7655
QPSO 483.7352 (0.5643) 0.5427 0.7994
DMQPSO 536.2753 (0.4375) 1.0944 0.8154
glg BW 380.7306 0.8528 0.0180
PSO 395.1211 (0.9201) 0.7034 5.7376
QPSO 486.5318 (0.5950) 0.2458 5.7533
DMQPSO 589.0737 (0.7475) 1.3105 5.7845
1tag BW 729.3726 0.9815 0.0415
PSO 763.7521 (1.0002) 0.6484 8.9221
QPSO 875.6509 (0.7841) 0.4357 9.1275
DMQPSO 953.9467 (0.5324) 1.1943 9.3188

Fig. 2 illustrates typical examples of the convergence of the average log-odds scores and sum-of-pairs scores during the search process, averaged over 20 runs of each algorithm. It shows that BW converged very fast to local optima. PSO had the fastest convergence speed among the three heuristics but may encounter premature convergence. Although the DMQPSO algorithm had the slowest convergence speed, it was able to find the solutions with the best quality due to its stronger global search ability.

6.3.2. Results for dataset B


Tables 5–8 list the experiment results for the HMM training and validation sets of dataset B. Tables 5 and 6 give the results for the experiments where log-odds scores were used as the fitness values. They show that DMQPSO had better log-odds scores for HMM training than any of its competitors and that QPSO yielded better scores than PSO and BW. The PSO algorithm did not show better log-odds scores than BW for some sequence sets, such as family G5, Interferon, and the validation sets of CagY_M and Interferon. Tables 7 and 8 summarize the results for sum-of-pairs scores used as fitness values. It is evident that DMQPSO and QPSO yielded the best and the second best scores, respectively. Fig. 3 shows that DMQPSO and QPSO had better convergence properties than their competitors.

6.3.3. Results for dataset C


Table 8 shows the results obtained from the experiments conducted on dataset C when the log-odds scores were used as
the fitness functions. The DMQPSO and QPSO algorithms were able to produce the HMMs that had better average log-odds
scores than the HMMs trained with BW and PSO, whereas the results of PSO were only comparable to those of BW.
Table 9 shows the results of the experiments performed on dataset C with the best SOP scores used as the fitness values. Apart from the results for the HMM methods, the table also shows the scores achieved by the well
known and widely used Clustal W and the scores obtained by SA-trained HMMs recorded from [46]. The results of DMQPSO
were better than those of QPSO, PSO and BW, but not as good as those of Clustal W on some sequence sets. However, the
results are still remarkable considering that Clustal W is a highly specialized algorithm for multiple sequence alignment,
which, for instance, estimates the evolutionary distance between all sequences in the set to create an evolutionary tree be-
fore it aligns the sequences (using iterative pairwise alignment) [17]. Our comparison shows that DMQPSO and QPSO pro-
duced better scores than BW, SA and PSO for all test sets. This is quite remarkable since BW and SA are the most commonly
used techniques for HMM training.
Fig. 4 shows that DMQPSO had the best convergence properties among all the training algorithms. Moreover, Figs. 5–9 show the reference alignment of the 1idy sequences and the best resulting alignments obtained by the HMMs trained with BW, PSO, QPSO and DMQPSO.

6.4. Further evaluation

In order to make an overall performance comparison among all the tested methods, we normalized the average scores
over all the tested algorithms provided in Tables 3–9, listing these normalized scores in the corresponding tables. The normalized score is defined by

Fig. 3. Average fitness values found by the algorithms during training of the HMM for Interferon protein sequences from the Pfam database, with the fitness values being (a) log-odds scores and (b) sum-of-pairs scores.

Table 9
The results of SOP scores for the BAliBASE test sets and the average execution time over 20 runs of each experiment (bold figures represent the best results obtained for each sequence set).

Dataset    Algorithm    SOP score    Normalized score    Average execution time (h)
1aboA CW 0.714 0.031 –
BW 0.6418 1.4491 0.0022
PSO 0.6974 0.3571 0.0432
QPSO 0.7519 0.7133 0.0469
DMQPSO 0.7728 1.1238 0.0501
1idy CW 0.705 0.2266 –
BW 0.5132 1.2328 0.0021
PSO 0.5658 0.8326 0.0412
QPSO 0.7763 0.7691 0.0439
DMQPSO 0.8158 1.0697 0.0491
451c CW 0.7190 1.3568 –
BW 0.3989 1.0766 0.0025
PSO 0.4519 0.6737 0.0518
QPSO 0.5027 0.2875 0.0540
DMQPSO 0.6301 0.681 0.0584
1krn CW 1.000 0.8623 –
BW 0.8182 0.9183 0.0023
PSO 0.7863 1.2308 0.0472
QPSO 0.9585 0.4558 0.0491
DMQPSO 0.9968 0.831 0.0520
1bbt3 CW 0.638 0.1148 –
BW 0.5347 1.4478 0.0053
PSO 0.6219 0.3226 0.6618
QPSO 0.7146 0.8736 0.6873
DMQPSO 0.7253 1.0117 0.7226
kinase CW 0.7360 1.1469 –
BW 0.2268 1.2261 0.0087
PSO 0.3061 0.8566 0.8055
QPSO 0.5753 0.398 0.8318
DMQPSO 0.6053 0.5378 0.8513
1pii CW 0.8640 1.1266 –
BW 0.1647 1.2266 0.0073
PSO 0.2738 0.8595 0.7369
QPSO 0.6372 0.3634 0.7514
DMQPSO 0.7064 0.5962 0.7853
5ptp CW 0.9660 1.0588 –
BW 0.6053 1.2958 0.0062
PSO 0.6831 0.7879 0.7183
QPSO 0.8572 0.3486 0.7372
DMQPSO 0.9074 0.6763 0.7420
gal4 CW 0.483 0.3865 –
BW 0.2017 1.4019 0.0121
PSO 0.3185 0.6593 1.5898
QPSO 0.5294 0.6815 1.6812
DMQPSO 0.5784 0.9931 1.7420
1ajsA CW 0.571 0.6134 –
BW 0.2864 1.2105 0.0097
PSO 0.3245 0.9663 1.2831
QPSO 0.5914 0.7442 1.3207
DMQPSO 0.6031 0.8191 1.3863
glg CW 0.9410 0.9836 –
BW 0.5691 1.361 0.0246
PSO 0.6684 0.735 6.5017
QPSO 0.8569 0.4534 6.6381
DMQPSO 0.8895 0.659 6.7512
1tag CW 0.9630 1.373 –
BW 0.6453 1.14 0.0457
PSO 0.6931 0.7619 15.7178
QPSO 0.7953 0.0465 15.807
DMQPSO 0.8504 0.4823 15.8623

[Fig. 4 appears here; see caption below. The y-axis is Log-odds Score, the x-axis is No. of Iteration, and the curves correspond to BW, PSO, QPSO and DMQPSO.]

Fig. 4. Average log-odds scores found by the algorithms during training of the HMM for 1idy sequences from the BAliBASE database.

Fig. 5. 1idy reference alignments.

Fig. 6. The best resulting alignments generated by the HMM trained with BW for 1idy sequences.

Fig. 7. The best resulting alignments generated by the HMM trained with PSO for 1idy sequences.

Fig. 8. The best resulting alignments generated by the HMM trained with QPSO for 1idy sequences.

Fig. 9. The best resulting alignments generated by the HMM trained with DMQPSO for 1idy sequences.

Table 10
The average normalized scores for all sets of tested sequences (Bold figures represent the best results obtained for each sequence set).

Algorithms   Average normalized log-odds score   Average normalized SOP score   Total average normalized score
CW           –                                    0.2654                         0.2654
BW           -1.0079                             -1.0833                        -1.0422
PSO          -0.5470                             -0.7055                        -0.6190
QPSO          0.5444                              0.5294                         0.5376
DMQPSO        1.0105                              0.9940                         1.0030

\[
\text{Normalized score (NS)} = \frac{S_i - \bar{S}}{\sigma_S} \tag{30}
\]
where $S_i$ is the score, $\bar{S}$ is the mean of the scores and $\sigma_S$ is the standard deviation of the scores. After that, we averaged the
normalized log-odds scores, the normalized SOP scores and all of the two kinds of normalized scores over all the tested
sequence sets (named as total normalized scores), and listed them in Table 10. It is shown that each type of normalized
score produced by the DMQPSO-trained HMMs was better than that produced by any other competitor, indicating that the HMMs
trained by DMQPSO had the best overall performance on the MSA problems of the sequence sets. The QPSO algorithm
showed the second best overall performance and the BW method performed worst in training the HMMs for the MSA prob-
lems. CW outperformed the PSO-trained HMMs but its performance was worse than that of the HMMs trained by QPSO or
DMQPSO, which implies that a good training algorithm is crucial if the HMM is to play to its strengths in solving MSA
problems.
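To make the normalization in Eq. (30) concrete, the short Python sketch below (not part of the original experiments; the function and variable names are illustrative) computes the normalized scores for one sequence set. Using the sample standard deviation reproduces, to rounding, the normalized SOP figures reported in Table 9.

    from statistics import mean, stdev

    def normalized_scores(scores):
        """Eq. (30): NS_i = (S_i - mean(S)) / std(S) for each algorithm's score on one set."""
        s_bar = mean(scores.values())
        sigma = stdev(scores.values())   # sample standard deviation
        return {alg: (s - s_bar) / sigma for alg, s in scores.items()}

    # Example: SOP scores of the five methods on the 1aboA set (first block of Table 9).
    sop_1aboA = {"CW": 0.714, "BW": 0.6418, "PSO": 0.6974, "QPSO": 0.7519, "DMQPSO": 0.7728}
    print(normalized_scores(sop_1aboA))
    # {'CW': -0.031, 'BW': -1.449, 'PSO': -0.357, 'QPSO': 0.713, 'DMQPSO': 1.124} (rounded)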
A further evaluation of the training algorithms was undertaken by comparing the computational costs of the algorithms.
Since the CW program is a progressive alignment method that does not involve HMM training and can complete a multiple sequence
alignment within one minute, we only recorded and compared the computational cost of the algorithms in HMM training
for MSA. The average execution time over 20 runs of each experiment is listed in Tables 3–8. It is evident that BW con-
sumed the least computational time since it is a local search technique. DMQPSO was slightly more time-consuming in
training the HMMs for MSA than QPSO and PSO because of the extra computation of the diversity measure. However, this
modest additional cost is worthwhile given the significant performance advantages of DMQPSO over its
competitors.

7. Conclusions

In this paper, we analyzed the QPSO algorithm and proposed an improved version of the algorithm, DMQPSO, to train the
HMMs for MSA. The proposed DMQPSO, along with QPSO, PSO and BW, was tested in training the HMMs on three benchmark
datasets. The results were evaluated to compare the performances of all the competitor algorithms.
The analysis of QPSO showed that when the contraction-expansion (CE) coefficient $\alpha \le e^{\gamma} \approx 1.781$, the particle's position is probabilistically bounded;
otherwise it diverges. Based on this proposition, we introduced a diversity-maintaining strategy into QPSO and thus devel-
oped the DMQPSO algorithm, in which $\alpha$ is set to a value larger than 1.781 to make the particles explode once the population
diversity declines below the pre-specified threshold value $d_{low}$. By maintaining the diversity above the threshold, the
DMQPSO algorithm can effectively avoid premature convergence and thus has a much stronger global search ability than
QPSO.
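As a rough illustration of this control strategy (a schematic sketch, not the authors' code; the diversity measure, the threshold d_low and the coefficient values below are placeholder choices), the per-iteration decision can be written as:

    import numpy as np

    def diversity(positions):
        # Mean Euclidean distance of the particles from the swarm centroid
        # (one common diversity measure; the paper's exact definition may differ).
        centroid = positions.mean(axis=0)
        return float(np.mean(np.linalg.norm(positions - centroid, axis=1)))

    def choose_ce_coefficient(positions, d_low, alpha_conv=0.75, alpha_expl=2.0):
        # Convergence mode (alpha < 1.781) while diversity stays above the threshold;
        # explosion mode (alpha > 1.781) once diversity falls below d_low, so that the
        # particles are no longer probabilistically bounded and spread out again.
        return alpha_conv if diversity(positions) >= d_low else alpha_expl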
From the experiment results for training the HMMs for MSA, it can be observed that with log-odds scores as the objective
functions (or fitness values), DMQPSO was able to produce the HMMs that had better average log-odds scores than the
HMMs trained with BW, PSO and QPSO for all three training sets. The results also show that with sum-of-pairs as the objective func-
tion, DMQPSO was also able to yield better HMMs than its competitors. On the third dataset, the resulting alignments by the
HMMs trained with DMQPSO and QPSO did not have better sum-of-pairs scores on some sequence sets than those by Clustal
W program, which is a highly specialized algorithm for multiple sequence alignment. In order to make an overall perfor-
mance evaluation, we averaged the normalized scores of the tested methods and found that the DMQPSO-trained HMMs
had the best average normalized scores, that is, a better overall performance on the tested MSA problems than Clustal W
and the HMMs trained by any other compared training method. Moreover, comparison of the execution time among the
algorithms shows that in each experiment, the proposed DMQPSO algorithm consumed slightly more CPU time. However,
considering its significant performance advantages over its competitors, we can conclude that DMQPSO is a very efficient
approach to training the HMMs for MSA problems.

Acknowledgements

This work is supported by Natural Science Foundation of Jiangsu Province, China (Project No: BK2010143), by the
Fundamental Research Funds for the Central Universities (Project No: JUSRP21012), by the innovative research team
project of Jiangnan University (Project No: JNIRT0702), and by National Natural Science Foundation of China (Project No:
60973094).

Appendix A

Lemma A1. We have the following improper integral
\[
\int_0^1 \ln[\ln(1/x)]\,dx = -\gamma \tag{A1}
\]
where $\gamma \approx 0.577215665$ is the Euler constant.

Proof. Letting $s = \ln(1/x)$, i.e. $x = e^{-s}$, we have $\int_0^1 \ln[\ln(1/x)]\,dx = \int_0^{\infty} e^{-s}\ln s\,ds$. Since $\Gamma(m) = \int_0^{\infty} x^{m-1}e^{-x}\,dx$, where $\Gamma(\cdot)$ is the gamma function,
$\Gamma'(m) = \int_0^{\infty} x^{m-1}e^{-x}\ln x\,dx$. We have $\Gamma'(1) = \int_0^{\infty} e^{-x}\ln x\,dx = -\gamma$, which implies that
\[
\int_0^1 \ln[\ln(1/x)]\,dx = \int_0^{\infty} e^{-s}\ln s\,ds = \Gamma'(1) = -\gamma.
\]
This completes the proof of the lemma. $\square$
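As an illustrative numerical check of (A1) (not part of the original appendix), averaging ln[ln(1/u)] over many samples u ~ U(0,1) gives a value close to -γ ≈ -0.5772:

    import math, random

    # Monte Carlo estimate of the integral in (A1), i.e. E{ln[ln(1/u)]} for u ~ U(0, 1).
    random.seed(0)
    n = 1_000_000
    total = 0.0
    for _ in range(n):
        u = min(max(random.random(), 1e-12), 1.0 - 1e-12)   # keep u strictly inside (0, 1)
        total += math.log(math.log(1.0 / u))
    print(total / n)   # close to -0.5772, i.e. minus the Euler constant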

Lemma A2. If $\{u_n\}$ is a sequence of independent identically distributed random variables with $u_n \sim U(0,1)$ for all $n$ and $\xi_n = \ln[\ln(1/u_n)]$, then
\[
\frac{1}{n}\sum_{i=1}^{n}\xi_i \xrightarrow{a.s.} -\gamma \tag{A2}
\]

Proof. Since $\{u_n\}$ is a sequence of independent identically distributed (i.i.d.) random variables, $\{\xi_n\}$ is also a sequence of i.i.d.
random variables. Lemma A1 implies that
\[
E(\xi_n) = E\{\ln[\ln(1/u_n)]\} = \int_0^1 \ln[\ln(1/x)]\,dx = -\gamma.
\]
Thus, by Kolmogorov's Strong Law of Large Numbers, we have
\[
\frac{1}{n}\sum_{i=1}^{n}\xi_i \xrightarrow{a.s.} E(\xi_n) = -\gamma. \qquad \square
\]

Theorem A1. Let $k_i = \alpha\ln(1/u_i)$ and $b_n = \prod_{i=1}^{n}k_i$, where $u_i \sim U(0,1)$. The necessary and sufficient condition for $b_n$ to be probabilistically
bounded, i.e. $P\{\sup b_n < \infty\} = 1$, is that $\alpha \le e^{\gamma} \approx 1.781$.

Proof. Let $\xi_i = \ln[\ln(1/u_i)]$ and consider the following three possible cases.

(a) If $\alpha < e^{\gamma}$, we have the following proof.

(i) From Lemma A2, we have that $\forall m \in Z^{+}$, $\exists K_1 \in Z^{+}$ such that whenever $k \ge K_1$,
\[
P\left\{\ln\alpha - \gamma - 1/m < \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < \ln\alpha - \gamma + 1/m\right\} = 1. \tag{A3}
\]
Since $\alpha < e^{\gamma}$, $\ln\alpha < \gamma$. We have
\[
\ln\alpha - \gamma + 1/m < 1/m, \tag{A4}
\]
and therefore
\[
\left\{\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\} \supseteq \left\{\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < \ln\alpha - \gamma + 1/m\right\} \supseteq \left\{\ln\alpha - \gamma - 1/m < \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < \ln\alpha - \gamma + 1/m\right\}.
\]
From (A3), we have
\[
P\left\{\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\} \ge P\left\{\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < \ln\alpha - \gamma + 1/m\right\} \ge P\left\{\ln\alpha - \gamma - 1/m < \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < \ln\alpha - \gamma + 1/m\right\} = 1,
\]
and thus
\[
P\left\{\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\} = 1. \tag{A5}
\]
Since $-m/k < 1/m$, we find that
\[
\left\{\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < -m/k\right\} = \left\{\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\} \setminus \left\{-m/k \le \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\},
\]
resulting in the fact that
\[
P\left\{\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < -m/k\right\} = P\left\{\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\} - P\left\{-m/k \le \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\} = 1 - P\left\{-m/k \le \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\}. \tag{A6}
\]

(ii) $\forall m \in Z^{+}$, $\exists K_2 = m^2$ such that whenever $k \ge K_2$, $-m/k \ge -1/m$, from which we have
\[
\left\{-m/k \le \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\} \subseteq \left\{-1/m \le \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\},
\]
and thus have
\[
P\left\{-m/k \le \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\} \le P\left\{-1/m \le \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\}. \tag{A7}
\]
From (A6) and (A7), we have that $\forall m \in Z^{+}$, $\exists K = \max(K_1, K_2)$ such that whenever $k \ge K$,
\[
P\left\{\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < -m/k\right\} = 1 - P\left\{-m/k \le \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\} \ge 1 - P\left\{-1/m \le \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right\},
\]
which is equivalent to
\[
P\left\{\bigcap_{m=1}^{\infty}\bigcup_{n=1}^{\infty}\bigcap_{k=n}^{\infty}\left(\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < -m/k\right)\right\} \ge 1 - P\left\{\bigcap_{m=1}^{\infty}\bigcup_{n=1}^{\infty}\bigcap_{k=n}^{\infty}\left(-1/m \le \ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < 1/m\right)\right\} = 1 - P\left\{\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\xi_i = -\ln\alpha\right\}. \tag{A8}
\]
Since $\frac{1}{n}\sum_{i=1}^{n}\xi_i \xrightarrow{a.s.} -\gamma$, we can get that $P\{\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\xi_i = -\ln\alpha\} = P\{\ln\alpha = \gamma\}$. The condition that $\alpha < e^{\gamma}$ implies that
$P\{\ln\alpha = \gamma\} = 0$, so
\[
P\left\{\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\xi_i = -\ln\alpha\right\} = 0. \tag{A9}
\]
From inequality (A8), we can obtain
\[
P\left\{\bigcap_{m=1}^{\infty}\bigcup_{n=1}^{\infty}\bigcap_{k=n}^{\infty}\left(\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < -m/k\right)\right\} \ge 1 - P\left\{\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\xi_i = -\ln\alpha\right\} = 1 - 0 = 1,
\]
and thus $P\left\{\bigcap_{m=1}^{\infty}\bigcup_{n=1}^{\infty}\bigcap_{k=n}^{\infty}\left(\ln\alpha + \frac{1}{k}\sum_{i=1}^{k}\xi_i < -m/k\right)\right\} = 1$, implying that $P\{\lim_{n\to\infty}(n\ln\alpha + \sum_{i=1}^{n}\xi_i) = -\infty\} = 1$ and thus
$P\{\lim_{n\to\infty}\ln[\alpha^{n}\prod_{i=1}^{n}\ln(1/u_i)] = -\infty\} = 1$. Consequently we have $P\{\lim_{n\to\infty}b_n = 0\} = 1$, or $b_n \xrightarrow{p} 0$. Since convergence in proba-
bility implies convergence in distribution, we immediately have $b_n \xrightarrow{d} 0$.

(b) If $\alpha = e^{\gamma}$, $\ln\alpha = \gamma$. Since $\frac{1}{n}\sum_{i=1}^{n}\xi_i \xrightarrow{a.s.} -\gamma$, we obtain that $P\{\lim_{n\to\infty}|\frac{1}{n}\sum_{i=1}^{n}\xi_i + \gamma| = 0\} = 1$ and in turn
\[
P\left\{\lim_{n\to\infty}\left|\frac{1}{n}\sum_{i=1}^{n}\xi_i + \ln\alpha\right| = 0\right\} = 1,
\]
which means that $\forall m \in Z^{+}$, $\exists K \in Z^{+}$, such that whenever $k \ge K$, $P\{|\frac{1}{k}\sum_{i=1}^{k}\xi_i + \ln\alpha| < 1/m\} = 1$, namely
\[
P\left\{\left|\sum_{i=1}^{k}\xi_i + k\ln\alpha\right| < k/m\right\} = P\left\{\left|\ln\left[\alpha^{k}\prod_{i=1}^{k}\ln(1/u_i)\right]\right| < k/m\right\} = P\{|\ln b_k| < k/m\} = 1. \tag{A10}
\]
The above proposition is equivalent to the following one that
\[
P\left\{\bigcap_{m=1}^{\infty}\bigcup_{n=1}^{\infty}\bigcap_{k=n}^{\infty}\left(|\ln b_k| < k/m\right)\right\} = P\left\{\bigcap_{m=1}^{\infty}\left(\lim_{n\to\infty}\ln b_n < \infty\right)\right\} = P\left\{\lim_{n\to\infty}\ln b_n < \infty\right\} = 1,
\]
scilicet that $P\{\lim_{n\to\infty}b_n < \infty\} = 1$, which means that when $n \to \infty$, the limit of $b_n$ can be any positive real number, but not
infinity, implying that $b_n$ is probabilistically bounded.

(c) If $\alpha > e^{\gamma}$, $\ln\alpha > \gamma$. Since $\frac{1}{n}\sum_{i=1}^{n}\xi_i \xrightarrow{a.s.} -\gamma$, we have $P\{\lim_{n\to\infty}|\frac{1}{n}\sum_{i=1}^{n}\xi_i + \ln\alpha| > 0\} = 1$, which means that $\exists \beta > 0$, such
that
\[
P\left\{\lim_{n\to\infty}\left|\frac{1}{n}\sum_{i=1}^{n}\xi_i + \ln\alpha\right| = \beta\right\} = 1. \tag{A11}
\]
And it is then easy to deduce that $\forall m \in Z^{+}$, $\exists K \in Z^{+}$, such that whenever $k \ge K$, $P\{||\frac{1}{k}\sum_{i=1}^{k}\xi_i + \ln\alpha| - \beta| < 1/m\} = 1$,
namely,
\[
P\left\{\beta - 1/m < \left|\frac{1}{k}\sum_{i=1}^{k}\xi_i + \ln\alpha\right| < \beta + 1/m\right\} = 1.
\]
Thus we can obtain that
\[
P\left\{\left|\frac{1}{k}\sum_{i=1}^{k}\xi_i + \ln\alpha\right| > \beta - 1/m\right\} = P\left\{\left|\sum_{i=1}^{k}\xi_i + k\ln\alpha\right| > k\beta - k/m\right\} = P\left\{\left|\ln\left[\alpha^{k}\prod_{i=1}^{k}\ln(1/u_i)\right]\right| > k\beta - k/m\right\} = P\{|\ln b_k| > k\beta - k/m\} = 1.
\]
Since $k\beta > k\beta - k/m$, $\{|\ln b_k| > k\beta - k/m\} \supseteq \{|\ln b_k| > k\beta\}$, and due to the properties of the probability measure, $P\{|\ln b_k| > k\beta\} = 1$. The proposition is equivalent to the following one that
\[
P\left\{\bigcap_{m=1}^{\infty}\bigcup_{n=1}^{\infty}\bigcap_{k=n}^{\infty}\left(|\ln b_k| > k\beta\right)\right\} = P\left\{\bigcap_{m=1}^{\infty}\left(\lim_{n\to\infty}|\ln b_n| = +\infty\right)\right\} = P\left\{\lim_{n\to\infty}|\ln b_n| = +\infty\right\} = 1. \tag{A12}
\]
This proves that when $n \to +\infty$, $b_n$ is divergent.

From the above three cases, we find that the theorem follows. This completes the proof of the theorem. $\square$
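The three cases of Theorem A1 can also be illustrated numerically (an illustrative sketch, not material from the paper): by Lemma A2, $(1/n)\ln b_n = \ln\alpha + (1/n)\sum_{i=1}^{n}\xi_i$ tends to $\ln\alpha - \gamma$, which is negative for $\alpha < e^{\gamma}$, zero at $\alpha = e^{\gamma}$ and positive for $\alpha > e^{\gamma}$.

    import math, random

    def mean_log_bn(alpha, n, rng):
        # (1/n) * ln(b_n) = ln(alpha) + (1/n) * sum_i ln(ln(1/u_i)) for one sample path.
        total = n * math.log(alpha)
        for _ in range(n):
            u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)   # keep u strictly inside (0, 1)
            total += math.log(math.log(1.0 / u))
        return total / n

    rng = random.Random(0)
    for alpha in (1.5, math.exp(0.5772156649), 2.0):         # below, at and above e^gamma
        print(f"alpha = {alpha:.4f}:  (1/n) ln(b_n) ~ {mean_log_bn(alpha, 100_000, rng):+.4f}")
    # Negative below e^gamma (b_n -> 0), about zero at e^gamma, and positive above it
    # (|ln b_n| grows roughly linearly in n, i.e. b_n diverges), matching cases (a)-(c).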

References

[1] P.J. Angeline, Evolutionary optimization versus particle swarm optimization: philosophy and performance differences, Evolutionary Programming VII,
Lecture Notes in Computer Science 1447 (1998) 601–610.
[2] R.deA. Araújo, Swarm-based translation-invariant morphological prediction method for financial time series forecasting, Information Sciences 180
(2010) 4784–4805.
[3] P. Baldi, Y. Chauvin, T. Hunkapiller, M.A. McClure, Hidden Markov Models of biological primary sequence information, in: Proceedings of National
Academy of Sciences USA, vol. 91, 1994, pp. 1059–1063.
[4] L.E. Baum, T. Petrie, G. Soules, N. Weiss, A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains,
Annals of Mathematical Statistics 41 (1970) 164–171.
[5] Y. Cai, J. Sun, J. Wang, Y. Ding, N. Tian, X. Liao, W. Xu, Optimizing the codon usage of synthetic gene with QPSO algorithm, Journal of Theoretical Biology
254 (2008) 123–127.
[6] W. Chen, J. Sun, Y. Ding, W. Fang, W. Xu, Clustering of gene expression data with quantum-behaved particle swarm optimization, in: Proceedings of the
Twenty First International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems (IEA/AIE 2008), 2008, pp. 388–
396.
[7] K. Chellapilla, G.B. Fogel, Multiple sequence alignment using evolutionary programming, in: Proceedings of the Congress on Evolutionary
Computation, 1999, pp. 445–452.
[8] M. Chica, Ó. Cordón, S. Damas, J. Bautista, Multiobjective constructive heuristics for the 1/3 variant of the time and space assembly line balancing
problem: ACO and random greedy search, Information Sciences 180 (2010) 3465–3487.
[9] M. Clerc, The swarm and the queen: towards a deterministic and adaptive particle swarm optimization, in: Proceedings of 1999 Congress on
Evolutionary Computation, 1999, pp. 1951–1957.
[10] M. Clerc, J. Kennedy, The particle swarm-explosion, stability and convergence in a multidimensional complex space, IEEE Transactions on Evolutionary
Computation 6 (2002) 58–73.
[11] L.S. Coelho, P. Alotto, Global optimization of electromagnetic devices using an exponential quantum-behaved particle swarm optimizer, IEEE
Transactions on Magnetics 44 (2008) 1074–1077.

[12] L.S. Coelho, A quantum particle swarm optimizer with chaotic mutation operator, Chaos, Solitons & Fractals 37 (2008) 1409–1418.
[13] W. Du, B. Li, Multi-strategy ensemble particle swarm optimization for dynamic optimization, Information Sciences 178 (2008) 3096–3109.
[14] S.R. Eddy, Multiple alignment using Hidden Markov Models, in: Proceedings of the International Conference on Intelligent Systems for Molecular
Biology, 1995, pp. 114–120.
[15] I. Ellabib, P. Calamai, O. Basir, Exchange strategies for multiple ant colony system, Information Sciences 177 (2007) 1248–1264.
[16] A. Elhossini, S. Areibi, R. Dony, Strength pareto particle swarm optimization and hybrid EA-PSO for multi-objective optimization, Evolutionary
Computation 18 (2010) 127–156.
[17] D.F. Feng, R.F. Doolittle, Progressive sequence alignment as a prerequisite to correct phylogenetic trees, Journal of Molecular Evolution 25 (1987) 351–
360.
[18] F. Gao, Parameters estimation on-line for Lorenz system by a novel quantum-behaved particle swarm optimization, Chinese Physics B 17 (2008) 1196–
1201.
[19] H. Gao, W. Xu, J. Sun, Y. Tang, Multilevel thresholding for image segmentation through an improved quantum-behaved particle swarm algorithm, IEEE
Transactions on Instrumentation and Measurement 59 (2010) 934–946.
[20] S. Henikoff, J.G. Henikoff, Amino acid substitution matrices from protein blocks, in: Proceedings of National Academy of Sciences USA, vol. 89, 1992, pp.
10915–10919.
[21] Z. Huang, Y.J. Wang, C.J. Yang, C.Z. Wu, A new improved quantum-behaved particle swarm optimization model, in: Proceedings of the 5th IEEE
Conference on Industrial Electronics and Applications, 2009, pp. 1560–1564.
[22] S. Janson, M. Middendorf, A hierarchical particle swarm optimizer and its adaptive variant, IEEE Transactions on Systems, Man and Cybernetics, Part B:
Cybernetics 6 (2005) 1272–1282.
[23] K. Karplus, C. Barrett, R. Hughey, Hidden Markov Models for detecting remote protein homologies, Bioinformatics 14 (1998) 846–856.
[24] J. Kennedy, R.C. Eberhart, Particle swarm optimization, in: Proceedings of IEEE International Conference on Neural Networks, 1995, pp. 1942–
1948.
[25] J. Kennedy, Bare bones particle swarms, in: Proceedings of IEEE Swarm Intelligence Symposium, Indianapolis, IN, 2003, pp. 80–87.
[26] J. Kim, S. Pramanik, M.J. Chung, Multiple sequence alignment using simulated annealing, Bioinformatics 10 (1994) 419–426.
[27] A. Krogh, M. Brown, I.S. Mian, K. Sjolander, D. Haussler, Hidden Markov Models in computational biology: applications to protein modeling, Journal of
Molecular Biology 235 (1994) 1501–1531.
[28] S. Kwong, C. Chau, K. Man, K. Tang, Optimisation of HMM topology and its model parameters by genetic algorithm, Pattern Recognition 34 (2001) 509–
522.
[29] Z.-J. Lee, S.-F. Su, C.-C. Chuang, K.-H. Liu, Genetic algorithm with ant colony optimization (GA-ACO) for multiple sequence alignment, Applied Soft
Computing 8 (2008) 55–78.
[30] X.J. Lei, A.L. Fu, Two-dimensional maximum entropy image segmentation method based on quantum-behaved particle swarm optimization algorithm,
in: Proceedings of the Fourth International Conference on Natural Computation, 2008, pp. 692–696.
[31] S.-Y. Li, R.-G. Wang, W.-W. Hu, J.-Q. Sun, A new QPSO based BP neural network for face detection, Advances in Soft Computing, Fuzzy Information and
Engineering, vol. 40, Springer, 2007.
[32] X. Li, Niching without niching parameters: particle swarm optimization using a ring topology, IEEE Transactions on Evolutionary Computation 14
(2010) 150–169.
[33] J. Liu, W.B. Xu, J. Sun, Quantum-behaved particle swarm optimization with mutation operator, in: Proceedings of the 17th IEEE International
Conference on Tools with Artificial Intelligence, 2005, pp. 237–240.
[34] A.V. Lukashin, J. Engelbrecht, S. Brunak, Multiple alignment using simulated annealing: branch point definition in human mRNA splicing, Nucleic Acids
Research 20 (1992) 2511–2516.
[35] S.M. Mikki, A.A. Kishk, Quantum particle swarm optimization for electromagnetics, IEEE Transactions on Antennas and Propagation 54 (2006) 2764–
2775.
[36] G. Mongillo, S. Deneve, Online learning with Hidden Markov Models, Neural Computation 20 (2008) 1706–1716.
[37] M.A. Montes de Oca, T. Stutzle, M. Birattari, M. Dorigo, Frankenstein’s PSO: a composite particle swarm optimization algorithm, IEEE Transactions on
Evolutionary Computation 13 (2009) 1120–1132.
[38] D.W. Mount, Bioinformatics: Sequence and Genome Analysis, Cold Spring Harbor Laboratory Press, 2001.
[39] C. Notredame, D.G. Higgins, SAGA: sequence alignment by genetic algorithm, Nucleic Acids Research 24 (1996) 1515–1524.
[40] S.N. Omkar, R. Khandelwal, T.V.S. Ananth, G.N. Naik, S. Gopalakrishnan, Quantum behaved particle swarm optimization (QPSO) for multi-
objective design optimization of composite structures, Expert Systems with Applications 36 (2009) 11312–11322.
[41] M. Pant, R. Thangaraj, A. Abraham, A new quantum behaved particle swarm optimization, in: Proceedings of the 10th Annual Conference on Genetic
and Evolutionary Computation, 2008, pp. 87–94.
[42] M. Pant, R. Thangaraj, V.P. Singh, Sobol mutated quantum particle swarm optimization, International Journal of Recent Trends in Engineering 1
(2009) 95–99.
[43] R. Poli, Mean and variance of the sampling distribution of particle swarm optimizers during stagnation, IEEE Transactions on Evolutionary
Computation 13 (2009) 712–721.
[44] D. Parrott, X. Li, Locating and tracking multiple dynamic optima by a particle swarm model using speciation, IEEE Transactions on Evolutionary
Computation 10 (2006) 440–458.
[45] L.R. Rabiner, A tutorial on Hidden Markov Models and selected applications in speech recognition, in: Proceedings of the IEEE, 1989, pp. 257–285.
[46] T.K. Rasmussen, T. Krink, Improved Hidden Markov Model training for multiple sequence alignment by a particle swarm optimization-evolutionary
algorithm hybrid, BioSystems 72 (2003) 5–17.
[47] J. Riget, J.S. Vesterstrøm, A Diversity-Guided Particle Swarm Optimizer – The ARPSO, Technical Report, University of Aarhus, Denmark, 2002.
[48] S.L. Sabat, L.S. Coelho, A. Abraham, MESFET DC model parameter extraction using quantum particle swarm optimization, Microelectronics Reliability
49 (2009) 660–666.
[49] E.L. Sonnhammer, S.R. Eddy, R. Durbin, Pfam: a comprehensive database of protein families based on seed alignments, Proteins 28 (1997) 405–420.
[50] Y. Shi, R.C. Eberhart, A modified particle swarm optimizer, in: Proceedings of IEEE International Conference on Evolutionary Computation, 1998, pp.
69–73.
[51] M. Slimane, G. Venturini, J.A. de Beauville, T. Brouard, A. Brandeau, Optimizing Hidden Markov Models with a genetic algorithm, Lecture Notes in
Computer Science 1063 (1996) 384–396.
[52] J. Stoye, D. Evers, F. Meyer, Rose: generating sequence families, Bioinformatics 14 (1998) 157–163.
[53] J. Sun, B. Feng, W.B. Xu, Particle swarm optimization with particles having quantum behavior, in: Proceedings of Congress on Evolutionary
Computation, Portland, USA, June 2004, pp. 326–331.
[54] J. Sun, W.B. Xu, B. Feng, A global search strategy of quantum-behaved particle swarm optimization, in: Proceedings of IEEE Conference on Cybernetics
and Intelligent Systems, Singapore, December 2004, pp. 111–116.
[55] J. Sun, W.B. Xu, B. Feng, Adaptive parameter control for quantum-behaved particle swarm optimization on individual level, in: Proceedings of IEEE
International Conference on Systems, Man and Cybernetics, Hawaii, October 2005, vol. 4, pp. 3049–3054.
[56] J. Sun, W.B. Xu, B. Ye, Quantum-behaved particle swarm optimization clustering algorithm, in: Proceedings of the International Conference on
Advanced Data Mining and Applications, 2006, pp. 340–347.
[57] J. Sun, J. Liu, W.B. Xu, Using quantum-behaved particle swarm optimization algorithm to solve non-linear programming problems, International
Journal of Computer Mathematics 84 (2007) 261–272.

[58] J. Sun, W. Fang, D. Wang, W. Xu, Solving the economic dispatch problem with a modified quantum-behaved particle swarm optimization method,
Energy Conversion and Management 50 (2009) 2967–2975.
[59] J. Sun, W. Fang, Z. Xie, C.-H. Lai, W. Xu, Particle swarm optimization with particles having quantum behavior: convergence analysis and performance
evaluation, submitted for publication.
[60] J.D. Thompson, D.G. Higgins, T.J. Gibson, CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence
weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Research 22 (1994) 4673–4680.
[61] J. Thompson, F. Plewniak, O. Poch, A comprehensive comparison of multiple sequence alignment programs, Nucleic Acids Research 27 (1999) 2682–
2690.
[62] R. Thomsen, Evolving the topology of Hidden Markov Models using evolutionary algorithms, in: Proceedings of Parallel Problem Solving from Nature
VII (PPSN), 2002, pp. 861–870.
[63] P.K. Tripathi, S. Bandyopadhyay, S.K. Pal, Multi-objective particle swarm optimization with time variant inertia and acceleration coefficients,
Information Sciences 177 (2007) 5033–5049.
[64] C. Twomey, T. Stützle, M. Dorigo, M. Manfrin, M. Birattari, An analysis of communication policies for homogeneous multi-colony ACO algorithms,
Information Sciences 180 (2010) 2390–2404.
[65] R.K. Ursem, Diversity-Guided evolutionary algorithms, in: Proceedings of the Parallel Problem Solving from Nature Conference, 2002, pp. 462–471.
[66] L. Wang, T. Jiang, On the complexity of multiple sequence alignment, Journal of Computational Biology 1 (1994) 337–348.
[67] J. Wang, Y. Zhou, Quantum-behaved particle swarm optimization with generalized local search operator for global optimization, in: Proceedings of
International Conference on Intelligent Computing, 2007, pp. 344–352.
[68] Y. Wang, Y. Yang, Particle swarm optimization with preference order ranking for multi-objective optimization, Information Sciences 179 (2009) 1944–
1959.
[69] Q. Wu, R. Law, Complex system fault diagnosis based on a fuzzy robust wavelet support vector classifier and an adaptive Gaussian particle swarm
optimization, Information Sciences 180 (2010) 4514–4528.
