Engineering Applications of Artificial Intelligence 52 (2016) 168–180

journal homepage: www.elsevier.com/locate/engappai

A novel unsupervised approach based on a genetic algorithm for structural damage detection in bridges
Moisés Silva a,*, Adam Santos a, Eloi Figueiredo b, Reginaldo Santos a, Claudomiro Sales a, João C.W.A. Costa a
a Applied Electromagnetism Laboratory, Universidade Federal do Pará, R. Augusto Corrêa, Guamá 01, Belém, 66075-110 Pará, Brazil
b Faculty of Engineering, Universidade Lusófona de Humanidades e Tecnologias, Campo Grande 376, 1749-024 Lisbon, Portugal

Article info

Article history:
Received 18 September 2015
Received in revised form 29 February 2016
Accepted 2 March 2016

Keywords:
Structural health monitoring
Genetic algorithm
Concentric hypersphere algorithm
Damage detection
Environmental and operational variability
Clustering

Abstract

This paper proposes a novel unsupervised and nonparametric genetic algorithm for decision boundary analysis (GADBA) to support the structural damage detection process, even in the presence of linear and nonlinear effects caused by operational and environmental variability. This approach is rooted in the search of an optimal number of clusters in the feature space, representing the main state conditions of a structural system, also known as the main structural components. This genetic-based clustering approach is supported by a novel concentric hypersphere algorithm to regularize the number of clusters and mitigate the cluster redundancy. The superiority of the GADBA is compared to state-of-the-art approaches based on the Gaussian mixture models and the Mahalanobis squared distance, on data sets from monitoring systems installed on two bridges: the Z-24 Bridge and the Tamar Bridge. The results demonstrate that the proposed approach is more efficient in the task of fitting the normal condition and its structural components. This technique also revealed to have better classification performance than the alternative ones in terms of false-positive and false-negative indications of damage, suggesting its applicability for real-world structural health monitoring applications.

© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

Improved and more continuous condition assessment of bridges has been demanded by modern societies to better face the challenges presented by aging civil infrastructure (Figueiredo et al., 2013). In the last two decades, bridge condition assessment approaches have been developed independently based on two concepts: bridge management systems (BMSs) and structural health monitoring (SHM). The BMS is a visual inspection-based decision-support tool developed to analyze engineering and economic factors and to assist the authorities in determining how and when to make decisions regarding maintenance, repair and rehabilitation of structures (Lee et al., 2008; Wenzel, 2009). On the other hand, SHM traditionally refers to the process of implementing monitoring systems to measure the structural responses in real time and to identify anomalies and/or damage at early stages (Farrar and Worden, 2007).

Even with the inherent limitation imposed by visual inspections, the BMS has already been accepted by bridge owners around the world (Miyamoto et al., 2001; Estes and Frangopol, 2003; Gattulli and Chiaramonte, 2005). At the same time, SHM is becoming increasingly attractive due to its potential ability to detect damage, with the consequent life-safety and economic benefits (Worden et al., 2007). The authors believe that all approaches to SHM can be posed in the context of a statistical pattern recognition (SPR) paradigm. This SPR paradigm for the development of SHM solutions is described as a four-phase process (Farrar et al., 2001): (1) operational evaluation, (2) data acquisition, (3) feature extraction, and (4) statistical modeling for feature classification. Inherent in the data acquisition, feature extraction, and statistical modeling portions of this paradigm, data normalization is the process of separating changes in damage-sensitive features caused by damage from those caused by varying operational and environmental conditions (Sohn and Farrar, 2001). These influences on the structural response have been cited as one of the major challenges for the transition of SHM technology from research to practice (Sohn, 2007; Xia et al., 2012).

The focus of this study is on the fourth phase, which is concerned with the implementation of algorithms that analyze and learn the distributions of the damage-sensitive features extracted from the raw data, in an effort to determine the structural health condition (Worden and Manson, 2007). Therefore, in the hierarchical structure of damage identification, this paper addresses the first level, i.e., the damage detection level (Figueiredo et al., 2011).

* Corresponding author.
E-mail address: [email protected] (M. Silva).

http://dx.doi.org/10.1016/j.engappai.2016.03.002
0952-1976/© 2016 Elsevier Ltd. All rights reserved.

Numerous studies have established the concept of automatically discovering and characterizing the normal condition of bridges, even when they are affected by extreme operational and environmental conditions (Figueiredo and Cross, 2013; Figueiredo et al., 2014). In those studies, damage detection is carried out on the basis of an outlier detection strategy using distance metrics and machine learning algorithms, which permits one to track the outlier formation in time regarding the chosen groups of state conditions. In contrast with approaches that consist of directly measuring parameters related to operational and environmental variations (e.g., traffic loading and temperature) (Peeters et al., 2001; Peeters and Roeck, 2001; Ni et al., 2005; Kullaa, 2009), these algorithms pave the way for data-based models applicable to structural systems of arbitrary complexity, with the advantage of eschewing both the measurement of operational and environmental variations and physics-based model approaches.

Therefore, coupled with the results highlighted in the authors' previous publication (Figueiredo and Cross, 2013), which suggest the potential of cluster-based algorithms for damage detection under operational and environmental variability, this paper proposes an unsupervised and nonparametric approach using a genetic algorithm (GA) to detect structural damage in bridges, namely a genetic algorithm for decision boundary analysis (GADBA). Combined with the robust search capability inherent in GAs, this study presents a new method to characterize the main clusters (components) that correspond to the normal state conditions of a bridge as well as a new algorithm to regularize the optimal number of clusters and mitigate the cluster redundancy, namely the concentric hypersphere (CH) algorithm. Additionally, an objective function is proposed to evaluate the quality of different component configurations.

The proposed GADBA-based approach is summarized in two steps: (i) the main normal state conditions of a system are automatically discovered by clustering the training observations according to the closest centroids, which are targets of the optimization performed by the GA; this optimization defines boundary regions between the clusters and reduces the number of discovered state conditions; (ii) the damage detection strategy is based on the Euclidean distances between the test observations and the optimized centroids. For each observation, the minimum distance to the centroids represents the damage indicator (DI).

To test the superiority of the proposed approach, standard data sets from the Z-24 Bridge, in Switzerland, and the Tamar Bridge, in England, are used. The classification performance is evaluated on the basis of Type I/Type II error trade-offs. In SHM, in the context of damage detection, a Type I error is a false-positive indication of damage and a Type II error is a false-negative indication of damage.

The overall organization of this paper is as follows. In Section 2, a brief review of the most traditional cluster-based and bioinspired methods for damage detection is presented, along with a discussion that synthesizes the relevant genetic-based clustering approaches available in the literature. Section 3 describes all the new constraints and mechanisms developed to cluster the normal state conditions of bridges, by using the GADBA-based approach, and to detect damage based on the identified clusters. Section 4 highlights a structural description of both bridges as well as a summary of the data sets from the bridges that encompass a wide spectrum of challenges associated with practical damage detection problems. Section 5 presents the applicability of the proposed approach on such real-world data sets and compares its performance with two other approaches. Finally, Section 6 summarizes and discusses the implementation and analysis carried out in this study.

2. Related work

For most civil engineering infrastructure where SHM systems are applied, unsupervised learning algorithms are often required because only data from the undamaged (or normal) condition are available (Farrar and Worden, 2013). Some of the traditional unsupervised machine learning algorithms and their adaptations for damage detection in bridges can be found in the following references: Figueiredo et al. (2011); Sohn et al. (2002); Hsu and Loh (2010); Hakim and Razak (2014); Santos et al. (2016, 2015); Xu et al. (2004); Liu et al. (2011). Herein, the most traditional cluster-based and bioinspired methods for damage detection are discussed. Moreover, the most relevant genetic-based clustering methods are also introduced.

2.1. Traditional cluster-based damage detection methods

The approaches based on the Mahalanobis squared distance (MSD) and the Gaussian mixture model (GMM) are relevant, as they operate on a set of clusters representing undamaged state conditions (Figueiredo and Cross, 2013; Figueiredo et al., 2014).

The MSD-based approach is one of the most traditional methods for damage detection, having widespread use in real scenarios due to its ability to identify outliers (Worden et al., 2007; Worden and Manson, 2007; Nguyen et al., 2014), as it assumes that the normal condition is encoded by a unique cluster from a multivariate Gaussian distribution. When abnormal observations appear statistically inconsistent with the rest of the data, it is assumed that the data have been generated by an alternative source, which is not related to the normal condition established with a mean vector and a covariance matrix derived from the baseline data sets obtained under operational and environmental conditions. However, when nonlinearities are present in the observations, the MSD fails in modeling the normal condition of a bridge as it assumes that the baseline data might follow a multivariate Gaussian distribution (Figueiredo and Cross, 2013).

A new concept based on GMMs was developed in Figueiredo and Cross (2013) and Figueiredo et al. (2014) as a two-step damage detection strategy. In the first step, the GMM-based approach is applied to model the main clusters that correspond to the normal and undamaged state conditions of a bridge, even when it is affected by unknown operational and environmental conditions. In the former study, the parameters of the GMMs are estimated from the training data, using maximum likelihood estimation based on the expectation-maximization (EM) algorithm. To improve the parameter estimation of the GMMs, a Bayesian approach based on a Markov-chain Monte Carlo method is applied in the latter study. In the second step, damage detection is performed on the basis of an MSD outlier formation regarding the chosen clusters of main states. Although these approaches have revealed better damage detection performance when compared to MSD, they also assume Gaussian distributions, which may compromise the reliable estimation of structural components, and their training phase is quite slow as several replications of the EM algorithm are required.

2.2. Bioinspired damage detection methods

The most traditional bioinspired methods for damage detection in SHM correlate a complex physics-based model with measured data from the monitored structure. A set of variables is updated to obtain the minimum difference between the numerical and experimental data. Then, damage is modeled as a reduction in

structural stiffness and detected by comparing the undamaged and damaged states.

Based on this usual approach, a real-coded GA is proposed in Xia and Hao (2001) and applied to structural damage identification using vibration data and modal parameters. This approach identifies damage by directly comparing changes in the measurements before and after damage occurrence using two finite element models (FEMs) built with data from undamaged and damaged conditions. The GA minimizes an objective function that combines parameters related to mode shapes and frequency changes to update the reference FEM, and then obtains another FEM that reproduces the measured vibration data of the damaged state.

A damage detection approach based on particle swarm optimization is applied to structural elements in Gökdağ and Yildiz (2001). The damage location and extent are identified by minimization of an objective function based on some modal parameters. A FEM of a Timoshenko beam is used to attest the reliability of this method. Damage is simulated by adding stiffness loss in some elements.

A real-coded GA coupled with a local search method is developed in Meruane and Heylen (2011) to locate and quantify structural damage. The main goal of this approach is to select the elemental stiffness reduction factors, defined as the ratio of the stiffness reduction to the initial stiffness. The objective function is composed of five fundamental functions with an additional damage penalization term. The algorithm is compared to the inverse eigen-sensitivity and response function methods on measured data from a tridimensional frame structure with different degrees of freedom. The results indicate that this GA is more appropriate to detect damage than conventional optimization methods. However, as demonstrated by Friswell and Penny (2002), for high frequencies, this damage detection procedure has limitations in quantifying the damage severity for cracks, and it is also affected by mesh density.

In addition, a parallel genetic algorithm (PGA) to improve the time cost of the aforementioned approach is considered in Meruane and Heylen (2010). The reliability of the PGA is verified in two test case structures: an airplane subjected to three levels of damage and a multiple cracked reinforced concrete beam subjected to a nonsymmetrical static load. Despite the reduction in time cost, the results indicate that the modeling improvement provided by the PGA depends on each specific problem.

A new method for damage detection, localization and severity estimation on any kind of structure using a GA is proposed in Chou and Ghaboussi (2001). A physics-based model is developed from the information on the geometry of the structure using section properties treated as unknowns to be determined from the undamaged condition. To determine these parameters, a GA is applied to minimize the difference between the measured displacements of the bridge under normal condition and the computed displacements from the model coded in the individuals. When compared to the FEM, the results suggest that the GA can perform damage localization and severity estimation with more reliability. However, this method was only tested in structures under normal effects of traffic loading and did not consider other common types of variability (e.g., temperature, humidity, wind speed and boundary conditions).

A two-step hybrid damage detection strategy is proposed in He and Hwang (2007). Based on grey relation analysis (GRA) and an adaptive real-parameter simulated annealing genetic algorithm (ARSAGA), this method reduces the number of displacement variables by excluding the structural elements with less damage probability using a FEM that models a damaged structure. A comparison analysis between ARSAGA and a real-parameter genetic algorithm (RGA) demonstrates the superiority of the ARSAGA over the RGA, regarding the localization of damage and its extent. However, the ARSAGA may be difficult to employ in real SHM scenarios, particularly in bridge monitoring, as it depends on the availability of damaged data and the current localization of some damaged elements.

In all aforesaid methods, the use of complex physics-based models is pointed out as a drawback to full implementation of SHM, due to the high complexity imposed when modeling the operational and environmental influences, which remain not fully understood (Reynders et al.), mainly in the case of bridge monitoring.

2.3. Genetic-based clustering methods

In recent years, the searching capability of GAs has been exploited to establish appropriate clusters. Basically, GAs are stochastic techniques to optimize objective functions guided by evolutionary principles, with capabilities to find solutions of multimodal complex optimization problems by taking into account several restrictions (Chambers, 2000; Goldberg, 1989).

The GA-based clustering is a well-known unsupervised algorithm to solve clustering problems in m-dimensional Euclidean space, $\mathbb{R}^m$ (Maulik and Bandyopadhyay, 2000). This approach is very similar to the K-means algorithm (Jain, 2010), where the observations are divided into a given number ($K$) of subsets (clusters), whose centroids are determined by applying genetic operators (selection, crossover and mutation). The main challenge of this technique is to estimate the correct number of clusters, which represents the number of normal state conditions of a system.

A GA-based clustering method, COWCLUS, is proposed in Cowgill et al. (1999). In this method, the variance ratio criterion (VRC) is used as an objective function to define the internal cluster homogeneity and the degree of isolation between different clusters. The results demonstrate that this method outperforms K-means by optimizing the VRC function. However, the number of clusters present in the data must be defined as input of the algorithm.

A genetically guided algorithm (GGA) developed for clustering is applied to brain tissue magnetic resonance imaging in Hall et al. (1999), using objective functions from two other algorithms: fuzzy c-means (FCM) (Wen and Celebi, 2011) and hard c-means (HCM) (Runkler and Keller, 2012). This approach consists of minimizing a function adapted from the original objective functions used in FCM and HCM, replacing the fuzzy partition matrix by another matrix that represents a distance measure from each observation to all centroids. The comparison of the GGA with FCM and HCM demonstrates that the GGA provides equivalent results in terms of a "good" clustering. Although this method has proved to be successful in modeling overlapping clusters, it may be difficult to employ in real applications, mainly when there is no prior knowledge about the data structure or the number of clusters.

3. Genetic algorithm for decision boundary analysis

In general, the GADBA capabilities for searching and optimizing are presented in this paper with the purpose of grouping data into logical structural components given a maximum number of clusters, $K_{\max}$, resulting in suitable geometric centers (centroids) for each cluster in the Euclidean space, $\mathbb{R}^m$. In particular, the task of the proposed CH algorithm is to support the automatic identification of the number of clusters, $K$, by choosing the appropriate centers $C = c_1, c_2, \ldots, c_K$ of each cluster, through the maximization

of the objective function proposed for the GADBA, which contributes to using the lowest possible number of clusters. Essentially, the GADBA-based approach performs the CH algorithm on the set of solutions in each generation, aiming to estimate the correct number of components through an agglomerative clustering process.

For general purposes in SHM, the training matrix $X \in \mathbb{R}^{n \times m}$ is composed of $n$ observations under operational and environmental variability when the structure is undamaged, where $m$ is the number of features per observation obtained during the feature extraction phase. The test matrix $Z \in \mathbb{R}^{w \times m}$ is defined as a set of $w$ observations collected during the undamaged/damaged conditions of the structure. Note that an observation represents a feature vector encoding the structural condition at a given time.

3.1. Individual representation

The individual representation assumed for a candidate solution is described herein. Each individual (also known as a chromosome) is a real vector of $K \times m$ genes composed of centroid positions, as shown in Fig. 1.

F(1, 1)  F(1, ...)  F(1, m)  F(2, 1)  F(2, ...)  F(2, m)  ...  F(K, 1)  F(K, ...)  F(K, m)

Fig. 1. Representation scheme of a single individual.

In the individual representation, $F(i, j)$ is the real value of the $i$-th centroid in the $j$-th dimension. The number of genes varies between $m$ and $K_{\max} \times m$, in such a way that the length must be a multiple of $m$. The initial population $P(t = 0)$ is created by randomly choosing a number $K$ in the interval $[1, \ldots, K_{\max}]$ for each individual. The $K$ centroid positions are also randomly initialized by selecting $K$ observations from the training matrix $X$. The process is repeated for all $|P|$ individuals to be generated.

3.2. Genetic operators

Aiming to perform the tasks of mutation, parent selection and survival selection, three well-known methods are highlighted herein and adopted to support the GADBA-based approach. The mutation process controls the exploration of the solution space by means of performing changes in the individuals. In this study, this process is composed of two steps:

(i) the number of centroids is changed via a stochastic variation method. An increment rate is previously determined by computing the inverse of the maximum number of clusters, $T_x = 1 / K_{\max}$. A random real value $T_r$ defined in the range $[0, 1]$ is used to determine the number of centroids to be enabled in the offspring individual by applying $K_{new} = \lceil T_r / T_x \rceil$. In the case of $K < K_{new} \le K_{\max}$, the missing positions are completed by selecting $K_{new} - K$ observations at random from $X$; otherwise the last centroids are eliminated;

(ii) the mutation occurs in each centroid position in a stochastic manner. A mutation probability $p_{mut}$ is associated to all positions, which are subjected to the Gaussian mutation,

$$F_{i,j} = F_{i,j} + N(0, 1), \tag{1}$$

where $N(0, 1)$ is a random number from a Gaussian distribution with zero mean and unitary standard deviation, and $F_{i,j}$ is the real value of the $i$-th centroid in the $j$-th dimension.

The selection operator drives the search towards a promising region in the feature space. The parent selection method is based on the well-known tournament with reposition. This method creates a subset $R$ by randomly selecting $|R|$ individuals from the population. Afterwards, only the best individual is selected from $R$ and submitted to the crossover process with another individual selected in the same manner. Besides, the survival selection is based on the elitism concept (Deb et al., 2002), in which two sets of parents $I_p$ and offspring $I_c$ are joined, creating a set $I_u = I_p \cup I_c$. Then, a new fitness value is calculated based on the Pareto front and crowding distance. The solutions that compose the new set $I_u$ are sorted to select the $|P|$ best individuals as the new population set $P(t + 1)$.

The stopping criteria are: when the maximum number of generations is reached and/or the difference of the fitness between the two best individuals, of the last two generations, is less than a given threshold $\epsilon$ (e.g., $\epsilon = 5$).

3.3. Recombination

Recombination performs the exploration towards the known solution space, aiming to refine the prior knowledge. Although many different recombination operators are suggested in the literature (Hruschka et al., 2009; Mitchell, 1998), this study develops a strategy that combines not only useful segments of different parents, but also the centroid positions. The recombination method operates in three steps using two probability parameters defined as input, $p_{rec}$ and $p_{pos}$:

(i) for each pair of parents $P_i$ and $P_j$, if a random number $r \le p_{rec}$, then two cut points $\pi_1$ and $\pi_2$ are randomly generated, corresponding to a range within centroid positions of both parents, such that $1 \le \pi_1 < \pi_2 \le \min(K_i, K_j)$. The centroids in the range are switched to form two offspring individuals. In the case that $p_{rec}$ is not satisfied, both parents become the new offspring individuals;

(ii) each centroid position receives a random number $r \in [0, 1]$, in such a way that if $r \le p_{pos}$, then for each pair of parent genes, an arithmetic recombination is performed according to

$$F^{(i)}_{x,t} = F^{(i)}_{x,t} + \left( F^{(j)}_{y,t} - F^{(i)}_{x,t} \right) T, \tag{2}$$

$$F^{(j)}_{x,t} = F^{(j)}_{x,t} + \left( F^{(i)}_{y,t} - F^{(j)}_{x,t} \right) T, \tag{3}$$

where $T$ is a random value defined in $[0, 1]$, and $F^{(i)}_{x,t}$ and $F^{(j)}_{x,t}$ are the $t$-th positions of the $x$-th centroid from the $i$-th and $j$-th parents, respectively;

(iii) finally, a length ratio, $\lambda$, defines the number of centroids enabled in each offspring individual. Note that the parents already have $\lambda_i$ and $\lambda_j$ length ratios associated to themselves,

$$\lambda = \frac{K}{K_{\max}}. \tag{4}$$

Hence, $\lambda$ maps the number of clusters, $K$, to the interval $(0, 1]$. Hereafter, another arithmetic recombination is performed on the parents' length ratios to generate $\lambda'_i$ and $\lambda'_j$ for the offspring individuals. Thus, the numbers of clusters ($K_i$ and $K_j$) enabled in the final offspring individuals are

$$K_i = \frac{\max(\lambda_i, \lambda_j)}{\lambda'_i}, \tag{5}$$

$$K_j = \frac{\max(\lambda_i, \lambda_j)}{\lambda'_j}. \tag{6}$$

3.4. Objective function

Based on the approaches that create clusters from circular distributions (MacQueen, 1967), a nonlinear metric to characterize

Fig. 2. CH algorithm using linear inflation.

different clusters is proposed. This metric is used as the objective manner on a list of centroids, by evaluating the boundary regions
function, which intends to evaluate different set of clustering that limit each cluster, being divided in three steps:
solutions by taking into account the observation dispersion in
relation to the centroids and the proximity between centroids. The Step 1: Centroid displacement: For each cluster, its centroid is
objective function assumes that each component (representing a dislocated to the position with greater observation den-
structural behavior) from the training matrix introduces a quasi- sity, i.e., the mean of its observations.
circular cluster of observations, allowing the damage detection in Step 2: Linear inflation of concentric hyperspheres: Linear infla-
the presence of operational and environmental variability and tion occurs on each centroid, of a candidate solution, by
when damage introduces new orthogonal components. In addi- progressively increasing an initial hypersphere radius,
tion, to evaluate the data dispersion around each centroid, the
R0 ¼ log 10 ð J C i  xmax J þ 1Þ; ð12Þ
density of the observations in the clusters is also considered.
Therefore, the first term of the objective function takes the where Ci is the centroid of the i-th cluster and xmax is its
summation of each distance among the centroids (Ci and Cj), farthest observation, such that J C i xmax J is the radius
of the cluster centered in Ci. However, the radius grows
KX
1 X
K
G1 ð J C i  C j J Þ; ð7Þ up in the form of an arithmetic progression with com-
i ¼ 1 j ¼ iþ1 mon difference equal to R0 . The creation of new hyper-
where $G_1$ is a nonlinear penalization function defined as

$$G_1(d_1) = \frac{1 - e^{-d_1}}{e^{-d_1}}. \qquad (8)$$

As Eq. (8) increases monotonically for all $d_1 > 0$, one easily concludes that when $G_1$ increases, the distances between centroids also increase. The second term is defined as

$$G_2\left(\sum_{k=1}^{K} \sum_{\forall x \in C_k} \lVert C_k - x \rVert\right), \qquad (9)$$

where

$$G_2(d_2) = \frac{1}{e^{2 d_2}}. \qquad (10)$$

In this case, Eq. (10) increases as the summation of the norms decreases for all $d_2 > 0$. Therefore, the objective function is defined by the combination of these two terms, regularized by the number of components and the standard deviation of all distances between centroids,

$$F(x, C, K, \sigma) = \frac{1}{\sigma K}\left(\sum_{i=1}^{K-1} \sum_{j=i+1}^{K} G_1\left(\lVert C_i - C_j \rVert\right) + G_2\left(\sum_{k=1}^{K} \sum_{\forall x \in C_k} \lVert C_k - x \rVert\right)\right), \qquad (11)$$

where the maximization of $F(\cdot)$ provides the optimal clustering solution by maximization of distances between centroids (Eq. (7)) and minimization of data dispersion around the centroids (Eq. (9)).

3.5. Concentric hypersphere algorithm

In this study, the GA's capabilities are combined with the objective function to evaluate different clustering configurations for a given maximum number of clusters, $K_{\max}$. Thus, the novel CH algorithm is proposed to regularize the number of clusters encoded within the individuals. This algorithm works in an iterative

spheres is set by a criterion based on the positive variation of the observation density between two consecutive inflations, defined as the inverse of the variance; otherwise the process is stopped.

Step 3: Cluster agglutination: If there is more than one centroid inside the inflated hypersphere, these centroids are agglutinated to create a unique representative centroid as the mean of the initial centroids. On the other hand, if only the pivot centroid is within the inflated hypersphere, this centroid is assumed to be at the geometric center of a real cluster and the agglutination is not performed.

For completeness, the CH algorithm is summarized in Fig. 2, which presents an example of the method applied to a three-component scenario with a five-centroid candidate solution. Initially, in Fig. 2(a), the centroids are moved to the center of their clusters, as indicated in Step 1. In Fig. 2(b) and (c), two centroids are agglutinated to form one cluster, since they lie under the same cluster. On the other hand, in Fig. 2(d), only one centroid is located under a real cluster; therefore, the CH algorithm is stopped after Step 2. In the case where the agglutination process occurs, all centroids analyzed before are evaluated again to check whether the new one is positioned under another cluster or closer to a boundary region.

The steps of the CH algorithm are summarized in Algorithm 1. Initially, it identifies the cluster to which each observation belongs and moves the centroids to the mean of their observations. Then, a hypersphere is built on a pivot centroid, by inflation until the density between two consecutive hyperspheres decreases. Finally, the agglutination of all centroids within the last hypersphere is performed, by replacing these centroids by their mean. The process is repeated until convergence, i.e., the solution is composed of only one centroid or there is no centroid agglutination after evaluating all centroids.
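As a minimal sketch of how Eqs. (8)-(11) combine, the Python fragment below evaluates the objective function for a candidate set of centroids. Several choices are assumptions that the text does not state: each observation is assigned to its nearest centroid, σ is taken as the population standard deviation of the pairwise centroid distances, and the names `g1`, `g2` and `objective` are purely illustrative. Since σ vanishes when there is only one pairwise distance (or when all of them are equal), the sketch simply rejects that case rather than guessing how it should be handled.

```python
import numpy as np


def g1(d1):
    """Eq. (8): penalization that grows with the centroid separation d1."""
    return (1.0 - np.exp(-d1)) / np.exp(-d1)


def g2(d2):
    """Eq. (10): penalization that shrinks as the total dispersion d2 grows."""
    return 1.0 / np.exp(2.0 * d2)


def objective(X, C):
    """Eq. (11) for observations X (n x m) and centroids C (K x m)."""
    X = np.asarray(X, dtype=float)
    C = np.asarray(C, dtype=float)
    K = len(C)
    # Pairwise centroid distances ||C_i - C_j|| for i < j.
    i, j = np.triu_indices(K, k=1)
    pair_dist = np.linalg.norm(C[i] - C[j], axis=1)
    # Assumption: sigma is the population standard deviation of these distances.
    sigma = pair_dist.std() if pair_dist.size else 0.0
    if sigma == 0.0:
        raise ValueError("sigma is zero; Eq. (11) is undefined for this configuration")
    # Assumption: each observation belongs to its nearest centroid.
    dist_to_centroids = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
    dispersion = dist_to_centroids.min(axis=1).sum()
    return (g1(pair_dist).sum() + g2(dispersion)) / (sigma * K)


# Toy example: the same centroids score higher when the data sit closer to them.
C = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
X_tight = C.copy()
X_loose = C + 0.1
print(objective(X_tight, C), objective(X_loose, C))
```

In this toy case the separation term is identical for both data sets, so the difference in the score comes only from the dispersion term of Eq. (9).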
M. Silva et al. / Engineering Applications of Artificial Intelligence 52 (2016) 168–180 173

Algorithm 1. Summary of the CH algorithm.

    i ← 1
    createClusters(C, X, n)
    move(C, X, n)
    while i ≤ |C| AND |C| > 1 do
        radius0 ← calcRadius(C[i], X, n)
        radius, density0, density1, delta0, delta1 ← 0
        repeat
            radius ← radius + radius0
            H ← calcHypersphere(C, i, X, n, radius)
            density0 ← density1
            density1 ← calcDensity(H)
            delta0 ← delta1
            delta1 ← |density0 − density1|
        until (delta0 > delta1)
        j ← reduce(C, H)
        if j > 0 then
            i ← 1
            createClusters(C, X, n)
        else
            i ← i + 1
        end if
    end while

3.6. Structural damage classification

After the definition of the optimal number of components embedded in the training data, the damage detection process is carried out through a global DI estimated for each test observation. The DIs are generated through a method known as distributed DIs (Figueiredo et al., 2014). Basically, for a given test feature vector, $z_i$, the Euclidean distance to all centroids is calculated, and DI(i) is taken as the smallest distance,

$$DI(i) = \min\left(\lVert z_i - c_1 \rVert, \lVert z_i - c_2 \rVert, \ldots, \lVert z_i - c_K \rVert\right), \qquad (13)$$

where $c_1, c_2, \ldots, c_K$ are the centroids of K different components. In this study, the threshold is defined for 95% of confidence on the DIs, taking into account only the baseline data used in the training process. Thus, if this approach has learned the baseline condition, i.e., the identified components suitably represent the undamaged and normal condition under all possible operational and environmental conditions, then this approach should output less than 5% of false alarms for the undamaged data used in the test phase.

3.7. Summary of the GADBA-based approach

Many variants of genetic operators are available in the literature. However, the proposed approach aims to reach satisfactory results by keeping its structure as simple as possible. A general schematic of the GADBA-based approach is summarized in Algorithm 2.

As each individual in the population represents a candidate solution, the final result is the one with the best fitness provided by the objective function. At the start of the process, the CH algorithm is performed on all individuals in the population, and their associated parameters are updated at iteration t = 0. Then, the objective function is computed for each updated individual. Genetic operators are applied until convergence, i.e., when the value given by the objective function does not change significantly for ten generations, providing the best set of centroids for the clustering problem. Finally, the CH algorithm is used to refine the best achieved model and the DIs are estimated by applying Eq. (13). P(t) denotes a population set of size |P| at generation t. P′(t) and P″(t) are the resulting populations after recombination and mutation, respectively. P(t + 1) is the resulting set of the selection operation on P(t) ∪ P″(t) with size 2|P|. The initial population P(t = 0) is randomly generated.

Algorithm 2. Summary of the GADBA-based approach.

    t = 0
    Initialize population P(t)
    while convergence is not reached do
        CH(P(t))
        Evaluate(P(t))
        P′(t) = Recombine(P(t))
        P″(t) = Mutate(P′(t))
        Evaluate(P″(t))
        P(t + 1) = Select(P(t) ∪ P″(t))
        t = t + 1
    end while
    P_max = max(P(t).fitness)
    P_best = CH(P_max)
    DI = damageIndicator(P_best, Z)

4. Test structures and data sets

The applicability of the proposed approach, and its comparison with state-of-the-art ones, are evaluated using the damage-sensitive features extracted from the data sets of the Z-24 and Tamar Bridges. In the case of the Z-24 Bridge, the standard data sets are unique in the sense that they combine one-year monitoring of the healthy condition, realistic damage scenarios artificially introduced and effects of operational and environmental variability. In contrast, monitoring was carried out on the Tamar Bridge for almost two years, generating data sets related only to undamaged scenarios. Its importance derives from the fact that in real monitoring systems, damage or variability effects occur naturally.

4.1. The Z-24 Bridge

The Z-24 Bridge was a standard post-tensioned concrete box girder bridge composed of a main span of 30 m and two side-spans of 14 m, as shown in Fig. 3. The bridge, before complete demolition, was extensively instrumented and tested with the purpose of providing a feasibility tool for vibration-based SHM in civil engineering. A long-term monitoring test was carried out, from 11th of November 1997 until 10th of September 1998, to quantify the operational and environmental variability present on the bridge and detect the existence of damage artificially introduced in the last month of operation. Every hour, eight accelerometers captured the vibrations of the bridge and an array of sensors measured environmental parameters, such as temperature at several locations, for 11 min. Progressive damage tests (settlement, concrete spalling, landslide at abutment, concrete hinge failure, anchor head failure, and rupture of tendons) were carried out in a one-month time period shortly before the demolition of the bridge (from 4th of August to 10th of September 1998), to prove that realistic damage has a measurable influence on the bridge dynamics (Peeters et al., 2001).

To verify the applicability of the proposed approach for long-term monitoring, daily monitoring data measured at 5 a.m. (because of the lower differential temperature on the bridge) from an array of accelerometers are used to extract damage-sensitive features, which yields a feature vector (observation) per day of operation. An automatic modal analysis procedure based on the
frequency domain decomposition was developed to extract the natural frequencies (Peeters and Roeck, 1999). It was verified that the automatic procedure was only able to estimate the first three frequencies with high reliability, yielding a three-dimensional feature vector per day (Figueiredo et al., 2014). During the feature extraction process, it was observed that the first and third natural frequencies are strongly correlated (with a correlation coefficient of 0.94), which permits one to perform dimension reduction of the extracted feature vectors from three to two. The first two natural frequencies, along with circles marking the observations below 0 °C, are depicted in Fig. 4(a).

The last 38 observations correspond to the progressive damage testing period, which is highlighted, especially in the second frequency, by a clear drop in the magnitude. Note that the damage scenarios are carried out in a sequential manner, which causes cumulative degradation of the bridge. Therefore, in this study, it is assumed that the bridge operates within its undamaged condition (baseline condition), even though under operational and environmental variability, from 11th of November 1997 to 3rd of August 1998 (1–197 observations). On the other hand, the bridge is assumed to be in its damaged condition from 4th of August to 10th of September 1998 (198–235 observations). The observed jumps in the natural frequencies are related to the asphalt layer in cold periods, which significantly contributes to the stiffness of the bridge. Actually, Peeters et al. (2001) showed the existence of a bilinear behavior in the natural frequencies for below and above freezing temperature.

In conclusion, the statistical modeling is carried out taking into account only the first two frequencies and using all 235 observations, resulting in 197 observations from the undamaged condition (1–197 observations) and 38 observations from the damaged condition (198–235 observations). The corresponding training and test matrices are $X^{197 \times 2}$ and $Z^{235 \times 2}$, respectively. The heterogeneity among observations in a two-dimensional space is evidenced in Fig. 4(b), which suggests the existence of components that may be found through latent variables and clustering methods.

Fig. 3. Longitudinal section (upper) and top view (bottom) of the Z-24 Bridge (Peeters et al., 2001).

Fig. 4. The first two natural frequencies extracted daily at 5 a.m. from 11th of November 1997 to 10th of September 1998 (a); feature distribution of the two most relevant natural frequencies (b). [Panel (a): Frequency (Hz) versus Observations (Days), with the baseline condition, the damaged condition and temperatures below 0 °C marked; panel (b): F2 (Hz) versus F1 (Hz), undamaged and damaged observations.]

4.2. The Tamar Bridge

The Tamar Bridge (Fig. 5) is situated in the south-west of the United Kingdom and connects Saltash in the county of Cornwall with the city of Plymouth in Devon. This bridge is a major road across the River Tamar and plays a significant role in the local economy. Initially, in 1961, the bridge had a main span of 335 m and side spans of 114 m. If the anchorage and approach are included, the overall length of the structure is 643 m. The bridge stands on two concrete towers with a height of 73 m with the bridge deck suspended at mid-height (Cross et al., 2013).

Fig. 5. The Tamar Suspension Bridge viewed from River Tamar margins (a) and cantilever (b) perspective.

In the late 1990s, the structure was upgraded following an EU directive. Various sensor systems were installed to extract data such as tensions on stays, accelerations, wind speed, temperature, deflection and tilt. Eight accelerometers were installed in orthogonal pairs on four stay cables and three sensors measured deck accelerations. The time series were stored with a sampling frequency of 64 Hz at 10 min intervals. The data were then passed to a computer-based system and, via the covariance-driven stochastic subspace identification (Peeters and Roeck, 1999), the natural frequencies were calculated (more detail in Cross et al., 2013). The first five natural frequencies are illustrated in Fig. 6, for the period from 1st of July 2007 to 24th of February 2009 (602 observations).

Fig. 6. The first five natural frequencies extracted from the Tamar Bridge. [Frequency (Hz) versus Observations (Days); all observations belong to the baseline condition (BC), with observations 1–363 marked as training data.]

Herein, there are no damaged observations known in advance, and so it is assumed that all observations are extracted from the undamaged condition. Therefore, only Type I errors can be identified. From a total of 602 observations, the first 363 are used for statistical modeling in the training process (corresponding to one-year monitoring from 1st of July 2007 to 30th of June 2008) and the entire data set is used in the test process, yielding a training matrix $X^{363 \times 5}$ (1–363 observations) and a test matrix $Z^{602 \times 5}$ (1–602 observations).
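To make the role of the training and test matrices concrete, the following Python sketch applies the damage indicator of Eq. (13) and the 95% threshold of Section 3.6 to a data layout like that of the Z-24 Bridge (197 baseline observations followed by 38 damaged ones, two features each). It is only an illustration under stated assumptions: the observations are synthetic random draws, not the bridge measurements; the centroid coordinates are borrowed from the values reported in Tables 1 and 2 merely to give the toy data a plausible scale; and the function names `damage_indicators` and `classify` are hypothetical.

```python
import numpy as np


def damage_indicators(Z, centroids):
    """Eq. (13): distance from each observation to its nearest centroid."""
    dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return dists.min(axis=1)


def classify(X_train, Z_test, centroids, damaged_mask, confidence=95.0):
    """Threshold the test DIs at a percentile of the training DIs."""
    threshold = np.percentile(damage_indicators(X_train, centroids), confidence)
    alarms = damage_indicators(Z_test, centroids) > threshold
    type_1 = int(np.sum(alarms & ~damaged_mask))   # false alarms
    type_2 = int(np.sum(~alarms & damaged_mask))   # missed damage
    return threshold, type_1, type_2


# Synthetic stand-in for the data layout (NOT the real measurements).
rng = np.random.default_rng(0)
centroids = np.array([[3.97, 5.19], [4.17, 5.29], [4.30, 5.60]])
baseline = centroids[rng.integers(0, 3, size=197)] + rng.normal(0.0, 0.03, size=(197, 2))
damaged = np.array([3.86, 4.84]) + rng.normal(0.0, 0.03, size=(38, 2))

X = baseline                          # training matrix, 197 x 2
Z = np.vstack([baseline, damaged])    # test matrix, 235 x 2
damaged_mask = np.arange(len(Z)) >= 197

threshold, type_1, type_2 = classify(X, Z, centroids, damaged_mask)
print(threshold, type_1, type_2)
```

Because the threshold is the 95th percentile of the training DIs and the same baseline observations reappear in the test matrix, roughly 5% of them exceed it by construction, which is why a Type I rate close to 5% is read as evidence that the baseline condition has been learned.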

5. Results: statistical modeling and feature classification

In this section, the performances of the GADBA-, GMM-, and MSD-based approaches are compared in terms of the Type I and Type II errors. The GADBA works through some previously defined parameters. The number of iterations required to infer the convergence of the fitness value, when the best solution is achieved, is equal to 5, considering an oscillation of the best fitness in the order of $10^{-4}$. The crossover and mutation probabilities are 0.8 and 0.01, respectively. The size of the ring in the tournament method for individual selection is set to 4. Furthermore, the population size and the maximum number of components are taken to be 20 and 10, respectively. Most of the parameters were adopted based on the performance observed in recent research studies (Bandyopadhyay and Maulik, 2002). The GMM-based approach was set as described in Figueiredo et al. (2014), where the parameters are estimated from the training data using the EM algorithm. The MSD-based approach was set as described in Figueiredo and Cross (2013), where the covariance matrix and mean vector were defined based on the training data.

Fig. 7. Centroids along with the observations using the data sets from the Z-24 Bridge: (a) all the 235 observations; (b) 1–197 observations corresponding to the baseline condition.

Table 1
Comparison of the parameter estimation using the CH and EM algorithms on the entire data sets (1–235) from the Z-24 Bridge (standard errors smaller than 10e−003).

Algorithm  Description  Cluster 1     Cluster 2     Cluster 3     Cluster 4
CH         Weight (%)   69            10            6             15
CH         Mean (Hz)    (3.97, 5.18)  (4.17, 5.28)  (4.31, 5.59)  (3.86, 4.84)
EM         Weight (%)   64            21            15            -
EM         Mean (Hz)    (3.97, 5.19)  (4.16, 5.32)  (3.86, 4.82)  -

5.1. The Z-24 Bridge

For all 235 observations, the four centroids corresponding to the same number of clusters or structural components (K = 4) are
plotted in Fig. 7(a), as suggested by the CH algorithm. As indicated in Table 1, the first cluster is centered at (3.97, 5.18), attracting around 69% of all assigned data. In this case, this cluster is possibly related to the baseline condition obtained under small environmental and operational influences. The second cluster is centered at (4.17, 5.28) and is assigned 10% of the observations. The authors speculate that it might be related to a gradual decrease of temperature in the asphalt layer, enough to slightly change the elastic properties of the structure. The third cluster, centered at (4.31, 5.59), attracts 6% of observations and may be related to changes in the structural response derived from stiffness changes in the asphalt layer caused by freezing temperatures. The fourth cluster is positioned in the lower region of the feature space, centered at (3.86, 4.84). It embeds around 15% of all observations and is related to the region of the space assigned to the damaged condition. As demonstrated in Figueiredo et al. (2014), these results suggest the possibility of correlating physical states of the structure with a finite and well-defined number of main structural components. Figueiredo et al. (2014) show the existence of this phenomenon, which is attributed to the natural grouping of similar observations in certain regions of the feature space. Comparing the results from the CH and EM algorithms, one may verify the similarity of the results in Table 1. However, the EM algorithm agglutinates the second and third clusters suggested by the CH algorithm, incorporating all gradual changes in the asphalt layer into one cluster only.

The challenge of simulating damage in high capital expenditure civil engineering structures is well known, namely due to the one-of-a-kind structural type, the cost associated with the simulation of damage in such infrastructure, and the unfeasibility of covering all damage scenarios (Figueiredo et al., 2014; Westgate and Brownjohn, 2011). Therefore, unsupervised approaches are often required, as long as the existence of data from the undamaged condition is known a priori. Thus, and for real applications, the centroids defined by the CH algorithm are shown in Fig. 7(b), taking into account only feature vectors from the baseline condition. In this case, three clusters are positioned close to each other, as indicated in Table 2. Comparing the results obtained from the CH and EM algorithms, one can verify, once again, similarities in the cluster location. However, the CH algorithm splits the observations under gradual freezing effects into two clusters.

Table 2
Comparison of the parameter estimation using the CH and EM algorithms on the baseline condition data (1–197) from the Z-24 Bridge (standard errors smaller than 10e−003).

Algorithm  Description  Cluster 1     Cluster 2     Cluster 3
CH         Weight (%)   81            12            7
CH         Mean (Hz)    (3.97, 5.19)  (4.17, 5.29)  (4.30, 5.60)
EM         Weight (%)   81            19            -
EM         Mean (Hz)    (3.97, 5.19)  (4.22, 5.39)  -

In relation to convergence, on average, after several runs with different initial populations, the GADBA-based approach converges to consistent results at 55 generations. In turn, the GMM executes 31 iterations when run for a three-component scenario. However, to automatically estimate the number of components, the Bayesian information criterion was adopted. Thus, around 1000 iterations of the GMM, under different initial conditions and numbers of parameters, were necessary before complete convergence.

The DIs obtained from the test matrix, $Z^{235 \times 2}$, are highlighted in Fig. 8. It shows that the GADBA-based approach outputs a monotonic relationship in the amplitude of the DIs related to the damage level accumulation, whereas the GMM fails to establish this relationship. In the case of the MSD-based approach, patterns in the DIs caused by the freezing effects can be pointed out, which indicates that this approach is not able to completely remove the effects of environmental variations and so is not effective at modeling the normal condition.

Therefore, to quantify the classification performance, Table 3 summarizes the Type I and Type II errors for the test matrix. Basically, the GADBA- and GMM-based approaches have the same classification performance, reaching 5.07% and 2.63% of Type I and Type II errors, respectively, and a total amount of errors equal to 4.68%. These results are quite similar due to the function adopted to evaluate the observation density within the inflated hyperspheres. However, the GADBA filters nearly all operational and environmental variability, especially in the damaged observations, unlike the GMM, which provides a poor data normalization in these observations. As expected, the MSD-based approach obtained a similar result in relation to the amount of Type I errors; however, the Type II errors reached over 39%, demonstrating its inefficiency when classifying abnormal conditions.

5.2. The Tamar Bridge

The data sets from the Z-24 Bridge are unique, as the existence of damage was known a priori. On the other hand, the data sets from the Tamar Bridge represent the most common situation observed in real-world SHM applications on bridges, as there is no indication of damage in advance.

Following the same procedure carried out in the previous subsection, the clusters and centroids defined by the GADBA-based approach, during model estimation, are shown in Fig. 9, in a two-dimensional representation. Table 4 summarizes the centroid locations in the original five-dimensional feature space along with the distribution weights inferred by both the CH (K = 3) and the EM (K = 3) algorithms. Even though the number of centroids is equal for both approaches, one can assert that the CH appears to perform a better modeling of the underlying components than the EM algorithm, because the GADBA is less sensitive to the choice of the initial parameters and also guides the solutions towards the global optimum. The second frequency is observed to contribute considerably to the distinction of these clusters, related to structural components. In addition, the GADBA-based approach converged after 86 generations, whereas the GMM required around 1000 iterations to achieve complete convergence.

Furthermore, for the CH algorithm, one can see that the three hyperspheres defined through the linear inflation step have the expected behavior, stopping their inflations close to the boundary of each cluster, as shown in Fig. 9. This behavior is verified at the boundary regions, where one can find, especially between the first and third components, the lowest concentration of observations. On the other hand, one can find a high concentration of observations around the first and second centroids.

For an overall analysis, the DIs for all observations in the test matrix, $Z^{602 \times 5}$, are plotted in Fig. 10, where the set of observations from 1 to 363 is used in the training phase. For the GMM-based approach, a concentration of outliers in the data not used in the training phase is observed, suggesting an inappropriate modeling of the normal condition. On the other hand, the GADBA-based approach seems to output a random pattern among the expected outlier observations, especially among the ones not used in the training process, suggesting a proper understanding of the normal condition by the clusters defined by the CH algorithm. Note that, in this case, there is no indication of either damage or extreme operational and environmental variability in the data sets.

For completeness, Table 5 summarizes the Type I errors for all three approaches. The Type II errors are not summarized herein as
there is no indication of structural damage. The total number of Type I errors is 32 (5.32%), 65 (10.8%) and 30 (4.98%) for the GADBA-, GMM- and MSD-based approaches, respectively. Therefore, as the percentage of errors given by the GADBA is close to the 5% level of significance assumed in the training process, one concludes that the GADBA-based approach offers the best model to filter the environmental and operational influences and fit the normal condition of the bridge. The importance of this result is rooted in the fact that this scenario is close to the ones found in real-world monitoring, where there is no indication of damage a priori, which permits one to reduce the number of false alarms and increase the reliability of the SHM system.

Fig. 8. Damage indicators along with a threshold based on a cut-off value of 95% over the baseline data (1–197) from the Z-24 Bridge: (a) GADBA-, (b) GMM-, and (c) MSD-based approaches. [Each panel: DI versus Observations (Days), with the baseline condition (BC), the damaged condition (DC) and outliers marked.]

Table 3
Number and percentage of Type I and Type II errors for each approach using the data sets from the Z-24 Bridge.

Approach  Type I      Type II      Total
GADBA     10 (5.07%)  1 (2.63%)    11 (4.68%)
GMM       10 (5.07%)  1 (2.63%)    11 (4.68%)
MSD       10 (5.07%)  15 (39.47%)  25 (10.63%)

Table 4
Parameter estimation using the CH and EM algorithms on the baseline condition data (1–363) from the Tamar Bridge (approximation errors smaller than 10e−003).

Algorithm  Description  Feature  Cluster 1  Cluster 2  Cluster 3
CH         Weight (%)   -        36         52         12
CH         Mean (Hz)    F1       0.38       0.39       0.38
                        F2       0.46       0.48       0.44
                        F3       0.59       0.60       0.59
                        F4       0.68       0.68       0.68
                        F5       0.72       0.73       0.72
EM         Weight (%)   -        40         26         34
EM         Mean (Hz)    F1       0.38       0.39       0.39
                        F2       0.46       0.46       0.48
                        F3       0.59       0.59       0.60
                        F4       0.68       0.68       0.69
                        F5       0.72       0.73       0.73

Fig. 9. The three main clusters defined by the CH algorithm, with their centroids and corresponding final hyperspheres in the two-dimensional feature space using only the first two frequencies from the Tamar Bridge. [F2 (Hz) versus F1 (Hz); first, second and third components and centroids marked.]

Fig. 10. Damage indicators along with a threshold based on a cut-off value of 95% over the training data: (a) GADBA-, (b) GMM- and (c) MSD-based approaches. [Each panel: DI versus Observations (Days), with the baseline condition (BC), outliers and threshold marked.]

Table 5
Number and percentage of Type I errors for each approach using the data sets from the Tamar Bridge.

Approach  Type I
GADBA     32 (5.32%)
GMM       65 (10.8%)
MSD       30 (4.98%)

6. Summary and conclusions

This paper presented the performance of an unsupervised and nonparametric cluster-based approach (GADBA) applied to detect damage in bridges, even in the presence of environmental and operational influences. This approach is supported by a novel method (CH) based on spatial geometry and sample density of each cluster, aiming to eliminate redundant clusters, also known as structural components.

The proposed approach was compared with two alternative parametric cluster-based approaches extensively studied in the literature (GMM and MSD), through their applications on two conceptually different but real-world data sets, from the Z-24 and Tamar Bridges, located in Switzerland and the United Kingdom, respectively. The structures were subjected to environmental and operational influences, which could cause structural changes.

In terms of result analysis, as verified on both bridges, the GADBA-based approach proves to be: (i) as robust as the GMM-based one in detecting the existence of damage; and (ii) potentially more effective at modeling the baseline condition and attenuating the effects of the operational and environmental variability, as suggested by the minimization of false alarms on the data from the Tamar Bridge.

In terms of theory formulation, the proposed approach assumes no particular underlying distribution and its genetically guided characteristic increases the chance of obtaining a solution close to the global optimum. On the other hand, the GMM assumes the existence of Gaussian distributions and the EM converges toward a local optimum. Therefore, the GADBA-based approach is conceptually simpler to deploy in real-world applications and to embed in hardware (e.g., sensor nodes), in situations where it is not possible to make any assumption about the data distribution. Besides, the CH algorithm provides special capabilities (inflation and observation density analysis) to regularize the number of components and define better clusters, resulting in more accurate models to accomplish data normalization. In addition, compared to GMM, the GADBA-based approach demonstrates faster convergence in both case studies. It is also important to note that several runs of GMM are needed to automatically estimate the number of components using some off-line penalization criterion.

Finally, based on the data sets used in this study, both GADBA- and GMM-based approaches fit the well-known no-free-lunch theorem, in which machine learning algorithms are classified into two classes: specialized methods for some category of problems
and methods that maintain a reasonable performance in the solution of most problems. Thus, the GMM fits the category of specialized methods that do not generate good results for all types of applications. On the other hand, the GADBA fits the category in which results are often acceptable, i.e., it is superior in terms of generalization.

Acknowledgments

The authors acknowledge the financial support received from CNPq (Grant 142236/2014-4 and Grant 454483/2014-7) and CAPES. We also would like to acknowledge Prof. Guido De Roeck, from the Katholieke Universiteit Leuven, and Prof. James Brownjohn, from the University of Exeter, for giving us the entire data as well as documentation from the Z-24 and Tamar Bridges.

References

Bandyopadhyay, S., Maulik, U., 2002. An evolutionary technique based on k-means algorithm for optimal clustering in rn. Inf. Sci. 146 (1–4), 221–237. http://dx.doi.org/10.1016/S0020-0255(02)00208-6.
Chambers, L.D., 2000. The Practical Handbook of Genetic Algorithms: Applications, Second Edition, 2nd ed. Chapman and Hall CRC, Boca Raton, Florida.
Chou, J.-H., Ghaboussi, J., 2001. Genetic algorithm in structural damage detection. Comput. Struct. 79 (14), 1335–1353. http://dx.doi.org/10.1016/S0045-7949(01)00027-X.
Cowgill, M., Harvey, R., Watson, L., 1999. A genetic algorithm approach to cluster analysis. Comput. Math. Appl. 37 (7), 99–108. http://dx.doi.org/10.1016/S0898-1221(99)00090-5.
Cross, E., Koo, K., Brownjohn, J., Worden, K., 2013. Long-term monitoring and data analysis of the Tamar bridge. Mech. Syst. Signal Process. 35 (1–2), 16–34. http://dx.doi.org/10.1016/j.ymssp.2012.08.026.
Deb, K., Pratap, A., Agarwal, S., Meyarivan, T., 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evolut. Comput. 6 (2), 182–197. http://dx.doi.org/10.1109/4235.996017.
Estes, A., Frangopol, D., 2003. Updating bridge reliability based on bridge management systems visual inspection results. J. Bridge Eng. 8 (6), 374–382. http://dx.doi.org/10.1061/(ASCE)1084-0702(2003)8:6(374).
Farrar, C.R., Worden, K., 2007. An introduction to structural health monitoring. Philos. Trans. R. Soc.: Math. Phys. Eng. Sci. 365 (1851), 303–315. http://dx.doi.org/10.1098/rsta.2006.1928.
Farrar, C.R., Worden, K., 2013. Structural Health Monitoring: A Machine Learning Perspective. John Wiley & Sons, Inc., Hoboken NJ, United States.
Hakim, S., Razak, H.A., 2014. Modal parameters based structural damage detection using artificial neural networks - a review. Smart Struct. Syst. 14 (2), 159–189. http://dx.doi.org/10.12989/sss.2014.14.2.159.
Hall, L., Ozyurt, I., Bezdek, J., 1999. Clustering with a genetically optimized approach. IEEE Trans. Evolut. Comput. 3 (2), 103–112. http://dx.doi.org/10.1109/4235.771164.
He, R.-S., Hwang, S.-F., 2007. Damage detection by a hybrid real-parameter genetic algorithm under the assistance of grey relation analysis. Eng. Appl. Artif. Intell. 20 (7), 980–992. http://dx.doi.org/10.1016/j.engappai.2006.11.020.
Hruschka, E., Campello, R., Freitas, A., de Carvalho, A., 2009. A survey of evolutionary algorithms for clustering. IEEE Trans. Syst. Man Cybern. Part C: Appl. Rev. 39 (2), 133–155. http://dx.doi.org/10.1109/TSMCC.2008.2007252.
Hsu, T.-Y., Loh, C.-H., 2010. Damage detection accommodating nonlinear environmental effects by nonlinear principal component analysis. Struct. Control Health Monitor. 17 (3), 338–354. http://dx.doi.org/10.1002/stc.320.
Jain, A.K., 2010. Data clustering: 50 years beyond k-means. Pattern Recognit. Lett. 31 (8), 651–666. http://dx.doi.org/10.1016/j.patrec.2009.09.011.
Kullaa, J., 2009. Eliminating environmental or operational influences in structural health monitoring using the missing data analysis. J. Intell. Mater. Syst. Struct. 20 (11), 1381–1390. http://dx.doi.org/10.1177/1045389X08096050.
Lee, J., Sanmugarasa, K., Blumenstein, M., Loo, Y.-C., 2008. Improving the reliability of a Bridge Management System (BMS) using an ANN-based Backward Prediction Model (BPM). Autom. Constr. 17 (6), 758–772. http://dx.doi.org/10.1016/j.autcon.2008.02.008.
Liu, Y.-Y., Ju, Y.-F., Duan, C.-D., Zhao, X.-F., 2011. Structure damage diagnosis using neural network and feature fusion. Eng. Appl. Artif. Intell. 24 (1), 87–92. http://dx.doi.org/10.1016/j.engappai.2010.08.011.
MacQueen, J.B., 1967. Some methods for classification and analysis of multivariate observations. In: Cam, L.M.L., Neyman, J. (Eds.), Proceedings of the fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1. University of California Press, Berkeley, California, pp. 281–297.
Maulik, U., Bandyopadhyay, S., 2000. Genetic algorithm-based clustering technique. Pattern Recognit. 33 (9), 1455–1465. http://dx.doi.org/10.1016/S0031-3203(99)00137-5.
Meruane, V., Heylen, W., 2010. Damage detection with parallel genetic algorithms and operational modes. Struct. Health Monitor. 9 (6), 481–496. http://dx.doi.org/10.1177/1475921710365400.
Meruane, V., Heylen, W., 2011. An hybrid real genetic algorithm to detect structural damage using modal properties. Mech. Syst. Signal Process. 25 (5), 1559–1573. http://dx.doi.org/10.1016/j.ymssp.2010.11.020.
Mitchell, M., 2009. An Introduction to Genetic Algorithms. MIT Press, Cambridge, MA, USA.
Miyamoto, A., Kawamura, K., Nakamura, H., 2001. Development of a bridge management system for existing bridges. Adv. Eng. Softw. 32 (10–11), 821–833. http://dx.doi.org/10.1016/S0965-9978(01)00034-5.
Nguyen, T., Chan, T.H., Thambiratnam, D.P., 2014. Controlled Monte Carlo data generation for statistical damage identification employing Mahalanobis squared distance. Struct. Health Monitor. 13 (4), 461–472. http://dx.doi.org/10.1177/1475921714521270.
Ni, Y., Hua, X., Fan, K., Ko, J., 2005. Correlating modal properties with temperature using long-term monitoring data and support vector machine technique. Eng. Struct. 27 (12), 1762–1773. http://dx.doi.org/10.1016/j.engstruct.2005.02.020.
Peeters, B., Roeck, G.D., 1999. Reference-based stochastic subspace identification for output-only modal analysis. Mech. Syst. Signal Process. 13 (6), 855–878. http://dx.doi.org/10.1006/mssp.1999.1249.
Farrar, C.R., Doebling, S.W., Nix, D.A., 2001. Vibration-based structural damage Peeters, B., Roeck, G. De, 2001. One-year monitoring of the z24-bridge: environ-
identification. Philos. Trans. R. Soc.: Math. Phys. Eng. Sci. 359 (1778), 131–149. mental effects versus damage events. Earthq. Eng. Struct. Dyn. 30 (2), 149–171.
http://dx.doi.org/10.1098/rsta.2000.0717. http://dx.doi.org/10.1002/1096-9845(200102)30:2〈149::AID-EQE1〉3.0.CO;2-Z.
Figueiredo, E., Cross, E., 2013. Linear approaches to modeling nonlinearities in long- Peeters, B., Maeck, J., Roeck, G.D., 2001. Vibration-based damage detection in civil
term monitoring of bridges. J. Civ. Struct. Health Monitor. 3 (3), 187–194. http: engineering: excitation sources and temperature effects. Smart Mater. Struct.
//dx.doi.org/ 10 (3), 518. http://dx.doi.org/10.1088/0964-1726/10/3/314.
10.1007/s13349-013-0038-3. Reynders, E., Wursten, G., De Roeck, G., 2014. Output-only structural health mon-
Figueiredo, E., Park, G., Farrar, C.R., Worden, K., Figueiras, J., 2011. Machine learning itoring in changing environmental conditions by means of nonlinear system
algorithms for damage detection under operational and environmental varia- identification. Struct. Health Monitor 6 (13), 82–93. http://dx.doi.org/10.1177/
bility. Struct. Health Monitor. 10 (6), 559–572. http://dx.doi.org/10.1177/ 1475921713502836.
1475921710388971. Runkler, T., Keller, J., 2012. Fuzzy approaches to hard c-means clustering. In: 2012
Figueiredo, E., Moldovan, I., Marques, M.B., 2013. Condition Assessment of Bridges: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2012, pp. 1–7.
Past, Present, and Future—A Complementary Approach, Universidade Católica http://dx.doi.org/10.1109/FUZZ-IEEE.2012.6251343.
Editora, Portugal. Santos, A., Silva, M., Sales, C., Costa, J., Figueiredo, E., 2015. Applicability of linear
Figueiredo, E., Radu, L., Worden, K., Farrar, C.R., 2014. A Bayesian approach based on and nonlinear principal component analysis for damage detection. In: 2015
a Markov-chain Monte Carlo method for damage detection under unknown IEEE International Instrumentation and Measurement Technology Conference
sources of variability. Eng. Struct. 80 (0), 1–10. http://dx.doi.org/10.1016/j. (I2MTC), pp. 869–874. http://dx.doi.org/10.1109/I2MTC.2015.7151383.
engstruct.2014.08.042. Santos, A., Figueiredo, E., Silva, M., Sales, C., Costa, J., 2016. Machine learning
Friswell, M.I., Penny, J.E.T., 2002. Crack modeling for structural health monitoring. algorithms for damage detection: kernel-based approaches. J. Sound Vib. 363
Struct. Health Monitor. 1 (2), 139–148. http://dx.doi.org/10.1177/ (17), 584–599. http://dx.doi.org/10.1016/j.jsv.2015.11.008.
1475921702001002002. Sohn, H., 2007. Effects of environmental and operational variability on structural
Gattulli, V., Chiaramonte, L., 2005. Condition assessment by visual inspection for a health monitoring. Philos. Trans. R. Soc.: Math. Phys. Eng. Sci. 365 (1851),
bridge management system. Comput.-Aided Civ. Infrastruct. Eng. 20 (2), 539–560. http://dx.doi.org/10.1098/rsta.2006.1935.
95–107. http://dx.doi.org/10.1111/j.1467-8667.2005.00379.x. Sohn, H., Farrar, C.R., 2001. Damage diagnosis using time series analysis of vibration
Gkda, H., Yildiz, A.R., 2001. Structural damage detection using modal parameters signals. Smart Mater. Struct. 10 (3), 446–451. http://dx.doi.org/10.1088/
and particle swarm optimization. Mater. Test. 54 (6), 416–420. http://dx.doi. 0964-1726/10/3/304.
org/10.3139/120.110346. Sohn, H., Worden, K., Farrar, C.R., 2002. Statistical damage classification under
Goldberg, D.E., 1989. Genetic Algorithms in Search, Optimization and Machine changing environmental and operational conditions. J. Intell. Mater. Syst. Struct.
Learning, 1st ed. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 13 (9), 561–574. http://dx.doi.org/10.1106/104538902030904.
USA.
180 M. Silva et al. / Engineering Applications of Artificial Intelligence 52 (2016) 168–180

Wen, Q., Celebi, M., 2011. Hard versus fuzzy c-means clustering for color quanti- Xia, Y., Hao, H., 2001. A genetic algorithm for structural damage detection based on
zation. EURASIP J. Adv. Signal Process. 2011 (1), 118. http://dx.doi.org/10.1186/ vibration data. In: XIX International Modal Analysis Conference, pp. 1381–1387.
1687-6180-2011-118. Xia, Y., Chen, B., Weng, S., Ni, Y.-Q., Xu, Y.-L., 2012. Temperature effect on vibration
Wenzel, H., 2009. Health Monitoring of Bridges. John Wiley & Sons, Inc., United properties of civil structures: a literature review and case studies. J. Civ. Struct.
States. Health Monit. 2 (1), 29–46. http://dx.doi.org/10.1007/s13349-011-0015-7.
Westgate, K.Y.K.R.J., Brownjohn, J.M.W., 2011. Environmental effects on a suspen- Xu, B., Wu, Z., Chen, G., Yokoyama, K., 2004. Direct identification of structural
sion bridge's dynamic response, Leuven, Belgium. parameters from dynamic responses with neural networks. Eng. Appl. Artif.
Worden, K., Manson, G., 2007. The application of machine learning to structural Intell. 17 (8), 931–943. http://dx.doi.org/10.1016/j.engappai.2004.08.010.
health monitoring. Philos. Trans. R. Soc.: Math. Phys. Eng. Sci. 365 (1851),
515–537. http://dx.doi.org/10.1098/rsta.2006.1938.
Worden, K., Farrar, C.R., Manson, G., Park, G., 2007. The fundamental axioms of
structural health monitoring. Philos. Trans. R. Soc.: Math. Phys. Eng. Sci. 463
(2082), 1639–1664. http://dx.doi.org/10.1098/rspa.2007.1834.
