Deep Learning Models for Bridge Deck Evaluation Using Impact Echo
Sattar Dorafshan, University of North Dakota
1* Corresponding Author, Assistant Professor, Department of Civil Engineering, University of North Dakota
2 Nondestructive Evaluation Program and Laboratory Manager, Federal Highway Administration
Abstract
One of the challenges in using non-destructive evaluation methods (NDE) for bridge
evaluation is their dependency on experienced users to perform the tests and interpret the results.
Impact echo (IE) is a common NDE method to detect subsurface defects in concrete bridge
decks. The conventional approach for analyzing the IE data (peak frequency method) requires
user expertise and user-defined parameters that could hinder broad field implementation. In this
paper, the feasibility of using deep learning for autonomous subsurface defect detection has been
studied to introduce automation in IE data analysis. A set of eight lab-made reinforced concrete
bridge specimens with known subsurface defects were made at the Federal Highway Administration (FHWA) Advanced Sensing Technology (FAST) NDE laboratory, and a total of 2016 IE records was collected from these specimens.
convolutional neural network (CNN), and a 1D recurrent neural network using bidirectional
long-short term memory units, were developed and applied on the IE data. In addition, two-
dimensional (2D) CNNs with AlexNet, GoogleNet, and ResNet architectures were applied on the
2D representatives of the IE data, i.e., spectrograms. The proposed 1D CNN achieved an overall
accuracy of 0.88 by classifying 0.70 of the defects and 0.95 of the sound regions correctly. The
accuracy varied between 0.80-0.86 for the rest of the models. The 1D CNN was considerably
faster to train compared to the other investigated models. The deep learning models were also
used to generate a defect map for each specimen that matched or exceeded the quality of the maps produced by the conventional peak frequency method.
Keywords: Deep learning, Impact Echo, Bridge Decks, Convolutional Neural Networks
Introduction
Bridge inspection has been included in the Code of Federal Regulations for more than
half a century [1]. Conventional visual inspections [1-2] are time-consuming and dependent on
the inspector’s skills and experience [3-6]. Considering only one type of infrastructure (bridges)
and one bridge element (decks), bridge decks cover almost 4 billion square meters of the US
transportation grid [7], which is equivalent to an area larger than the City of Detroit in the United
States. Bridge decks are subjected to the same biannual inspections as other parts of bridges.
Bridge decks exhibit more defects as they age, making visual
inspections even more burdensome. Bridge inspectors use nondestructive evaluation (NDE)
techniques for comprehensive bridge deck evaluation as these methods provide essential information about subsurface conditions. Recent developments in data-driven inspection and maintenance planning have led to implementing artificial intelligence (AI) for bridge evaluation. One can
define a set of features in inspection data and employ AI to create statistical classifiers using
supervised learning to detect bridge defects, i.e., machine learning. However, defining the features still requires an expert's opinion. It is possible to learn both the features and the classifier directly through training on annotated inspection datasets, further reducing the role of experts in the classification process, i.e., deep learning. An artificial neural network (ANN) is a computational system that learns to perform tasks without being programmed with task-specific rules.
Researchers have used machine learning techniques for semi-autonomous surface defect
detection from visual images to reduce the cost and time associated with the current manned
practice, and to minimize the role of humans for interpreting the inspection data [8-10]. The
application of machine learning for bridge condition evaluations has been mostly focused on
surface defects in visual images of bridge decks [11-20]. Deep learning frameworks have been
leveraged for characterization of acoustic emission signals in metallic plates [21-22]. Deep
learning architectures have been shown to be superior to humans in terms of recognizing an object
or a pattern in visual images [23-25]. Deep learning models (DLMs) also have been used in past
studies to detect surface defects in bridges and other structures [9, 25-30].
A variety of NDE methods can be used to detect different subsurface defects in bridge
decks [35-36]. These methods include sonic and ultrasonic methods [37-41], ground penetrating
radar [43-47], electrical resistivity [48-49], and impact echo (IE) [50-52]. Researchers have used
machine learning to detect subsurface defects through NDE data [57-61]. In particular,
researchers have developed classifiers and prediction models using signal processing to classify
IE processed data [62-65]; however, these classifiers still require manual feature selection in the
IE signals which limits their applications in the field. DLMs have not been designed for or
implemented on IE data. DLMs are superior to statistical classifiers, i.e., unsupervised learning
methods, because they depend on considerably fewer user-defined parameters [9,25]. The
dependency on human expertise could be reduced by combining two or more NDE methods
using homogenous and heterogeneous data fusion techniques in concrete [66] and steel [67]
structures. However, extracting and combining the proper features from these sources can be challenging.
The authors hypothesize that DLMs can be used for robust and unbiased IE data analysis as one of the essential steps toward introducing automation to bridge evaluation. The scope of this paper includes:
• Forming a fully annotated dataset made of raw IE signals and their transformations;
• Developing and training DLMs capable of separating defected and intact concrete regions.
Background
The American Society for Testing and Materials (ASTM) adopted this technology in 1998 to be formally used for thickness measurement of concrete structures. A typical IE test setup consists of an impact source, a nearby receiver, and a data acquisition system, as shown in Figure 1. The impact source generates seismic waves through and on the inspected region. When the waves hit the bottom of the medium, i.e., the backwall, or a subsurface irregularity, they are reflected. These reflections, i.e., echoes, then travel back to the surface where their amplitudes are recorded by the receiver [36]. The travel time of the echoes could indicate the location of the reflectors, and the IE test can be used to determine the thickness of the deck if applied on a sound region, or to detect subsurface defects otherwise.
Figure 1. (a) IE impact source and receiver, (b) data acquisition system.
The following relationship exists between the travel time of echoes in seconds (t), the echo's velocity in concrete in meters per second (Cp), and the traveled distance in meters (d):

t = d / Cp   (1)
If the echoes all correspond to the backwall reflection, then the traveled distance would roughly be equal to 2T, where T is the thickness of the deck. Performing a Fourier transform on IE signals, one can establish a relationship between the deck thickness and the frequency associated with the backwall echo, i.e., the thickness frequency (f_IE):

T = β Cp / (2 f_IE)   (2)

where β is a correction factor influenced by the material properties of the propagation medium.
In theory, a distinctive peak in the frequency response, close to the thickness frequency (f_IE), indicates sound concrete. The presence of defects could be manifested in the spectra in two distinct peaks, a single peak at higher frequencies than f_IE, or a low-frequency response of the flexural mode [36]. However, these patterns are not always present in the frequency response. Figure 2a and b show the IE signals associated with a sound and a defected (shallow delamination) region. The frequency responses exhibit a single peak, Figure 2c and d. Without knowing the value of the thickness frequency or past engagement with the IE data, it is difficult to recognize the difference between the two responses. In addition, the shape of the frequency responses in these figures did not initially follow the theoretical patterns.
Figure 2. IE signals: (a) sound, (b) defected (shallow delamination); frequency response without processing: (c) sound, (d) defected; frequency response with processing: (e) sound, (f) defected concrete.
Figure 2e and f show the frequency responses after a Blackman Harris filter was applied
[69]. Only after the filtering did the frequency responses start to match the theoretical interpretations. Note that the IE tests in Figure 2 were performed in a controlled environment. IE signals and their frequency responses collected in practice tend to be noisier than laboratory data, which makes them more difficult to classify with the peak frequency method. Therefore, practitioners and inspectors demand alternatives that could produce
consistent results. Unlike the peak frequency method that heavily relies on user-defined
parameters and expertise, DLMs could potentially analyze raw IE data in an autonomous manner
which could lead to a broader implementation of this technique in bridge inspection and
evaluation.
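The peak frequency workflow described above can be illustrated with a short NumPy sketch. The window coefficients are the standard 4-term Blackman-Harris values, and the synthetic damped sinusoid is a hypothetical stand-in for an IE record; neither reflects the paper's actual processing code.

```python
import numpy as np

def blackman_harris(n):
    """Standard 4-term Blackman-Harris window (not the paper's implementation)."""
    a = [0.35875, 0.48829, 0.14128, 0.01168]
    k = np.arange(n)
    return (a[0] - a[1] * np.cos(2 * np.pi * k / (n - 1))
            + a[2] * np.cos(4 * np.pi * k / (n - 1))
            - a[3] * np.cos(6 * np.pi * k / (n - 1)))

def peak_frequency(signal, fs):
    """Window the signal, take the FFT magnitude, and return the peak frequency."""
    windowed = signal * blackman_harris(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC bin

# Synthetic stand-in: a damped 9.6 kHz "thickness" echo, sampled at 200 kHz
# for 2000 points, mirroring the dataset's acquisition settings.
fs = 200_000
t = np.arange(2000) / fs
sig = np.sin(2 * np.pi * 9600 * t) * np.exp(-t * 800)
```

For this synthetic record, `peak_frequency(sig, fs)` recovers a peak near 9.6 kHz; a shifted or split peak in field data is what the peak frequency method inspects manually.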
Experimental Program
Deck Specimens
Eight reinforced concrete specimens, 3.0 m long, 1.0 m wide, and 0.2 m thick, were
constructed at the FAST NDE Laboratory to have four types of artificial defects: shallow delamination (above the specimens' top rebar level), deep delamination (above the specimens' bottom rebar level), voids, and honeycombing.
Initially, the specimens were made to study the effectiveness of NDE methods on
specimens with different types of overlay systems [35]. The specimens were constructed with a normal-weight concrete mix with a water-to-cement ratio of 0.37 and a 28-day compressive
strength of 27.6 MPa, with two mats of uncoated steel reinforcement with no. 5 rebar (15.8 mm
diameter) spaced 203 mm in both longitudinal and transverse directions. The specimens were
inspected using several NDE methods including IE before and after placing the overlays. In this
study, the IE data of specimens without overlays was investigated. All defects were 0.30 m long
along the length of the specimens, and 0.20 m wide along the width of the specimens. A transverse crack was also artificially made in the middle of all specimens. Details about the specimen construction can be found in [35].
Annotated IE Dataset
The specimens were marked with a marker to create a grid system on their surface with
100 mm spacing (longitudinally and transversely). The IE tests were performed on the grid
points using a custom-made IE device, an accelerometer (receiver), and steel spheres (11mm in
diameter) on spring rods (impact source), as shown in Figure 1. The sampling frequency was 200
kHz for 10 ms of IE data acquisition at any point of the grid. Each IE signal consisted of 2000
points (shown in Figure 2a and b) and was saved by a name that represented the location of the
receiver on the grid. No IE tests were conducted on and within 100 mm of the transverse crack.
Table 1 shows the classification of the IE test signals into sound and defected concrete,
along with their quantities. In this table, class 0 is assigned to sound concrete and class 1 is assigned to defected concrete.
Deep Learning Models
A recurrent neural network (RNN) is a type of DLM that has been used for classification of
time-variant signals. Hidden layer(s) in a RNN include the neurons that have a memory to
preserve the information about the previous time step [71]. In this study, a bidirectional long
short-term memory (biLSTM) network was developed and implemented on the IE signals. A
biLSTM network is a combination of two LSTM networks, one working on the signal in the forward direction and the other in the reverse direction.
Convolutional neural networks (CNNs) are another type of DLMs that have been widely
and successfully used for image classification in different disciplines, including defect detection
in civil infrastructure [27-34]. CNNs are commonly used on 2D input data, i.e., images; however,
it is possible to construct CNNs for 1D data as well [73-74]. In addition to the biLSTM model, a 1D CNN was developed and applied directly on the IE signals.
To cover other possible analysis methods, the authors also investigated 2D CNNs on 2D
representatives of the IE data. In this paper, short-time Fourier transform (STFT) [74] was used
to generate spectrograms, as seen in Figure 4. In the spectrograms, the horizontal axis represents
time and the vertical axis represents frequency. The spectrograms provide the opportunity for
implementing famous CNN architectures, commonly used in the realm of visual images, for IE
classification. Note that the spectrograms were generated using all the existing frequencies (from 0 up to the Nyquist frequency).
Figure 4. 2D representatives of IE signals, where the horizontal axis represents time and the vertical axis represents frequency: (a) sound concrete, (b) defected concrete (with shallow delamination).
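A framework-free sketch of how such an STFT spectrogram can be computed from a signal follows. The frame length, hop, and Hann window are illustrative choices, not the settings used in this study.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=64):
    """Short-time Fourier transform magnitude (spectrogram), a minimal sketch.
    frame_len and hop are illustrative, not the paper's parameters."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Transpose so rows are frequency bins and columns are time frames,
    # matching the layout described for Figure 4.
    return np.abs(np.fft.rfft(frames, axis=1)).T

# A random 2000-point signal stands in for one IE record.
spec = stft_magnitude(np.random.default_rng(0).standard_normal(2000))
```

Rendering `spec` as an image (e.g., with a color map) yields the kind of 2D input fed to the AlexNet, GoogleNet, and ResNet models.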
1D Deep Learning
The biLSTM network architecture is shown in Figure 5 (the number of neurons in the
middle layer is schematic). The network consisted of an input layer to read the IE data, a hidden
layer (biLSTM layer with 200 neurons, a fully connected layer, and a softmax layer), and an
output (classification) layer to label the IE data. The number of neurons was selected empirically
and the network architecture was inspired by [75]. The fully connected layer connects the parameters from the past layers together in a way that the softmax layer can assign a probability to each class. The classification layer assigns a label to the data, 0 or 1, according to the highest probability.
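The softmax and classification steps described above can be sketched generically as follows (an illustration of the operations, not the paper's MATLAB implementation):

```python
import numpy as np

def softmax(z):
    """Convert fully connected outputs into class probabilities."""
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

def classify(z):
    """Classification layer: label 0 (sound) or 1 (defected) by highest probability."""
    return int(np.argmax(softmax(z)))
```

For example, `classify(np.array([3.0, -1.0]))` labels the input as sound (0), while `classify(np.array([0.2, 1.5]))` labels it as defected (1).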
Figure 6 shows the architecture of the proposed 1D CNN. This network is a combination of conventional layers in the deep learning literature, including convolution, batch normalization, and rectified linear unit (CBR) layers; max pooling (MP) layers; fully connected, rectified linear unit, and dropout (FRD) layers; and a softmax layer. The size and number of convolution kernels for each CBR layer are shown in Figure 6. In a CBR layer, batch normalization is applied to the result of the convolution operation on the input from the previous layer (i.e., the activation maps). The batch normalization layer increases the speed of training by normalizing the activation maps. The rectified linear unit (ReLU) is an activation function that adds non-linearity to the normalized activation maps. If a value in the normalized activation map is negative, the ReLU assigns 0 to it; positive values are unchanged. The MP layer analyzes sets of neighboring locations in each activation map and represents them by their maximum value to reduce the size of the input and avoid overfitting. In the FRD layers, dropout randomly deactivates a fraction of the responses during training to reduce overfitting.
2D Deep Learning
Three well-known CNN architectures were trained and applied on the IE spectrograms: AlexNet [76], GoogleNet [77], and ResNet [77]. Each won the ImageNet image classification competition (in 2012, 2014, and 2015, respectively) [78]. For brevity, these networks are not discussed in detail here, given that most of their layers were already defined above. Figure 7 shows the AlexNet architecture. The main contribution of the GoogleNet architecture to the realm of CNNs was the introduction and application of an "inception" layer. The inception layer consists of a combination of convolution and max pooling kernels with the configuration shown in Figure 8. With an inception layer, a concatenation filter is applied
to make the output of the inception layer compatible with the rest of the network. The AP in the GoogleNet architecture stands for average pooling and works similar to the MP layer except that it averages the values instead of taking the maximum. The ResNet architecture introduced skip connections (a combination of convolution and max pooling) between the convolution kernels to ease the training of very deep networks (Figure 9).
The annotated IE dataset was split into a training and a testing dataset. The IE data of one specimen was used as the testing dataset, while the rest were used as the training dataset for each DLM. By rotating the order, eight training and testing datasets were formed, which were then used to
train eight networks using every investigated DLM. Table 2 shows the size and labels of the
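The leave-one-specimen-out rotation described above can be sketched as:

```python
# Leave-one-specimen-out rotation: each specimen serves exactly once as the
# test set while the remaining seven form the training set.
specimens = [f"S{i}" for i in range(1, 9)]
splits = [(specimens[:k] + specimens[k + 1:], [specimens[k]])
          for k in range(len(specimens))]
```

This yields eight (train, test) pairs, so the reported metrics reflect a model never having seen the tested specimen's data.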
The biLSTM and the 1D CNN were fully trained (FT) using the training datasets. For the
2D CNNs, a method called transfer learning (TL) was also used. In this mode, parameters
obtained from trained networks on large image datasets, i.e., ImageNet [79], are squeezed into
the architecture of the new models before being trained on a new dataset, i.e., the IE data.
Therefore, only the end layers (with respect to the input layer) are altered during the training.
The weights in the first layers consist of generic feature extractors which potentially are effective
on any image while the weights in the end layers are updated through re-training to match the
new dataset [30]. TL is very effective when training deep architectures on small datasets.
Therefore, AlexNet-based model was trained in both FT and TL manners while GoogleNet-
based and ResNet-based models were only TL trained. The layers retrained in the TL mode differed for each DLM, as shown in Figure 7, Figure 8, and Figure 9 by dashed circles.
The DLMs were trained on a workstation with an Intel® Core™ i7 CPU and two GeForce GTX 1080 Ti graphics processing units (GPUs). MATLAB 2019a was used to program the DLMs. Before training began, several training
parameters, i.e., hyperparameters, were determined for each model. These parameters and their
values (shown in Table 3) were chosen empirically to optimize the training process. In Table 3,
solver refers to the type of algorithm used to optimize the model during the training. The purpose
of an optimizer is to solve the gradient descent of the model in order to minimize the model’s
error. An adaptive moment estimation (ADAM) optimizer was selected for the biLSTM model
while the stochastic gradient descent with momentum (SGDM) was chosen for the rest. The
optimizers were selected based on their past successful implementations on each type of ANN.
There are other parameters that should be determined empirically before the training
begins. Mini batch size is the size of the segments of the training data used to calculate the loss and error of the models. An iteration is one forward and backward pass over a batch of data, while one epoch is completed when all the training data has been processed once. The learning rate controls the size of the parameter updates used to minimize the loss. The number of retrained layers in each network is also shown in the table.
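The relationship between training-set size, mini batch size, and iterations per epoch is simple arithmetic; the dataset size below is hypothetical, not one of the paper's exact counts.

```python
def iterations_per_epoch(n_train, batch_size):
    """One epoch = every training sample seen once; each iteration processes
    one mini batch, so iterations per epoch is a ceiling division."""
    return -(-n_train // batch_size)  # ceiling division without math.ceil

# Hypothetical example: ~7/8 of the 2016 records with a batch size of 128.
iters = iterations_per_epoch(1764, 128)
```

With these assumed numbers, one epoch corresponds to 14 iterations, which is how the iteration counts in the training plots relate to epochs.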
The performance of a DLM was evaluated using binary classification metrics (equations
3 through 6). The DLMs labeled an input signal or an image as the class that had the highest
probability among the sound and defected classes. When a model labeled a defected input
correctly, it was a true positive (TP) whereas false negative (FN) was when the model labeled it
as sound. When the model labeled a sound input correctly, it was a true negative (TN) whereas
false positive (FP) was when the model labeled it as defected. The true positive rate (TPR), true
negative rate (TNR), accuracy (ACC), and F1 score were used to evaluate and compare the
TPR = TP / (TP + FN)   (3)

TNR = TN / (TN + FP)   (4)

ACC = (TP + TN) / (TP + FP + TN + FN)   (5)

F1 = 2TP / (2TP + FP + FN)   (6)
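Equations (3) through (6) translate directly into code; a minimal sketch:

```python
def binary_metrics(tp, fn, tn, fp):
    """Eqs. (3)-(6): TPR, TNR, ACC, and F1 from confusion-matrix counts."""
    tpr = tp / (tp + fn)                       # true positive rate, Eq. (3)
    tnr = tn / (tn + fp)                       # true negative rate, Eq. (4)
    acc = (tp + tn) / (tp + fp + tn + fn)      # accuracy, Eq. (5)
    f1 = 2 * tp / (2 * tp + fp + fn)           # F1 score, Eq. (6)
    return tpr, tnr, acc, f1
```

For instance, 70 true positives, 30 false negatives, 95 true negatives, and 5 false positives give a TPR of 0.70 and a TNR of 0.95, mirroring the scale of the results reported below.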
Training Results
The results of training on each specimen are shown in Figure 10. The legends of these
plots represent the specimen id that was used for testing. For instance, a training process labeled
as S1 is when the DLMs are trained on specimens S2 through S8. All models reached a high training accuracy by the end of the training.
Figure 10. Results of training using different deep learning models: (a) biLSTM, (b) 1D CNN, (c) AlexNet (fully trained), (d) AlexNet (transfer learning), (e) GoogleNet (transfer learning), (f) ResNet (transfer learning).
Each model required a different amount of time to finish one iteration based on its architecture. The 2D models were more time-consuming than the 1D models since more calculations are required to analyze an image, i.e., a spectrogram, than a signal. As the models got deeper, they required more time to finish one iteration, with ResNet spending 1.21 s, GoogleNet spending 0.57 s, and AlexNet (TL) spending 0.28 s. The AlexNet FT required 0.36 s per iteration since more parameters had to be determined compared to TL. With 0.11 s per iteration, the biLSTM model required twice the time the 1D CNN spent on one iteration, i.e., only 0.05 s, which makes both methods considerably faster than the 2D CNNs. As seen in Figure 10a, the biLSTM model required more iterations than the convolutional networks to reach a desirable accuracy (e.g., 95%) at the end of the training. Taking this fact into consideration, the proposed 1D CNN was the fastest model to train.
Testing Results
After the training was completed, DLMs were used to classify the IE data in the testing
dataset. The testing data is not used during the training of the model. The results are shown in
Figure 11, and Table 4 shows the mean, coefficient of variation (COV), maximum, and
minimum values of the performance metrics (equations 3 through 6) for each DLM. The results
are promising and relatively consistent (low COV values). Looking at the results, 1D CNN
produced the best average TPR, 0.70, with the lowest COV, 0.14. The ResNet model resulted in
the lowest TPR values with only 0.53 on average. The 1D CNN model also produced the highest
TNR and ACC on average with 0.95 and 0.88 respectively, which were consistent for all
specimens (COV of 0.04). The biLSTM model produced the lowest average TNR and ACC
values of 0.87 and 0.80, respectively. In terms of F1 scores, the 1D CNN was also the winner
among the investigated models with 0.75; the ResNet and biLSTM models shared the low of
0.61. Comparing the 2D DLMs, AlexNet FT had the best overall performance by achieving a TPR of 0.70, TNR of 0.92, ACC of 0.86, and F1 score of 0.72. Results indicate that the shallower the CNN was, the better it performed on the binary annotated IE dataset. In addition, the
1D models were more accurate than the 2D models since they were applied directly on the
signal, not the 2D approximation. One challenge of using the 2D representations is that all frequencies were included. Considering the conventional IE analysis (the peak frequency method described in the Background section), the aforementioned patterns exhibit themselves in specific ranges of frequencies
[80]. Unlike visual images, the horizontal and vertical axes in spectrograms represent time and frequency, which limits the 2D CNNs' spatial invariance for feature detection. In addition, the
pixels in the visual images often belong to an object whereas there are no objects in the
spectrograms. The challenges of using DLMs on the 2D spectrograms of signals have been
previously discussed in [81-82]. The TNR values were consistently higher than the TPR values. As shown in Table 2, the number of training data labeled as 0, i.e., sound, was almost four times larger than the number labeled as 1, which would justify the high TNRs. However, unlike other
applications of deep learning, this imbalance is benign since it matches the reality of in-service
bridges. The second reason was due to the variability of the defected regions compared to the
sound concrete. The defects had four sources: shallow delamination, voids, honeycombing, and
deep delamination, as shown in Figure 3. This difference was likely manifested in some of
features extracted by the DLMs from the IE signals and consequently the spectrograms within
the same class, i.e., 1. One solution is to designate a class for each defect type if enough data for each type becomes available.
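Another common way to counter such imbalance is inverse-frequency class weighting during training. The counts below are hypothetical, chosen only to mirror the roughly four-to-one sound-to-defected split noted above; this weighting scheme is a general recipe, not something applied in this study.

```python
def class_weights(counts):
    """Inverse-frequency weights so the minority (defected) class is not
    under-learned. One common recipe: weight_c = N / (n_classes * count_c)."""
    total = sum(counts.values())
    return {c: total / (len(counts) * n) for c, n in counts.items()}

# Hypothetical 4:1 sound-to-defected split (illustrative counts only).
w = class_weights({0: 1600, 1: 400})
```

Here the defected class (1) receives a weight of 2.5 versus 0.625 for the sound class, amplifying the loss contribution of the rarer class.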
The performance of the 2D CNNs seems to be counterintuitive at first since the deeper
networks resulted in poorer performance. The performance of these models does not always
follow their performance on ImageNet [83-84]. In addition, deeper networks, e.g., GoogleNet
and ResNet, do not necessarily perform better than shallower networks, e.g., AlexNet, for binary
classifications [27]. For better implementation of 2D CNNs, more labeled spectrograms are required. Note that the spectrograms could be generated differently, using other transforms such as the wavelet transform.
Figure 11. Testing metrics for the investigated DLMs: (a) using IE signals, (b) using IE spectrograms.
To investigate sources of false reports, the misclassified IE data are shown in Table 5.
Almost all the mutual false negatives between the investigated DLMs were associated with the
IE data on the boundary. Higher false negatives in some models (S6 compared to S3-S5), while the models were trained on relatively similar datasets, indicate that this inconsistency has external sources. It is possible that the IE instrument was not properly positioned during the data acquisition at these locations.
In this section, the 1D CNN, as the best performing DLM, was compared to other
methods for IE data analysis in two subsections. In subsection one, a set of defect maps for
studied specimens were generated and compared using the proposed 1D CNN and the peak
frequency method. In subsection two, TPR, TNR, ACC, and F1 were computed and compared
using the proposed 1D CNN and two machine learning approaches: support vector machine (SVM) and wavelet decomposition coupled with the extreme learning method (ELM).
Defect maps were used to compare the investigated DLMs to the peak frequency method.
A defect map shows the relationship between the spatial location of subsurface defects to a
certain property of the inspected deck obtained through a particular NDE method [35,48,52].
For the peak frequency method, this property is the frequency associated with the highest amplitude, e.g., the vertical axis of the spectra shown in Figure 2. Using these frequencies, one can generate contour plots to present
the defect maps. A similar approach was taken to create contour plots using the DLMs by plotting the probability assigned to the defected class, which is essentially the probability of defect
presence. Figure 12 shows the defect maps generated by the peak frequency method and the best performing DLM, with panels labeled S7 (conventional), S7 (1D CNN), S8 (conventional), and S8 (1D CNN).
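Assembling a contour-ready defect map from per-point probabilities amounts to arranging them on the 100 mm test grid. The sketch below uses an illustrative grid size and random stand-in probabilities, not the specimens' actual point counts or model outputs.

```python
import numpy as np

# Hypothetical grid of IE test points (rows x columns are illustrative).
n_rows, n_cols = 9, 29
rng = np.random.default_rng(1)
probs = rng.random(n_rows * n_cols)          # stand-in for softmax defect probabilities
defect_map = probs.reshape(n_rows, n_cols)   # row-major: one row per transverse line
x = np.arange(n_cols) * 0.1                  # longitudinal positions in meters (100 mm grid)
y = np.arange(n_rows) * 0.1                  # transverse positions in meters
```

Passing `x`, `y`, and `defect_map` to a contour-plotting routine then reproduces the style of map shown in Figure 12, with high values marking probable defects.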
Shallow delamination was detected more easily by both approaches, which was expected [35]. The proposed 1D DLM had difficulties when used on S6 and S8, as expected, due to possible construction or data acquisition errors. The 1D CNN provided a precise prediction of the defect locations for all four defect types in all the specimens. In addition, the 1D CNN produced a better distinction between the sound and the defected concrete through a more tangible contrast change in the defect maps compared to the frequency method. The results show not only that deep learning is a feasible solution for IE data analysis but that it was more robust in generating defect maps than the conventional method.
Model | TPR (Mean, COV, Max, Min) | TNR (Mean, COV, Max, Min) | ACC (Mean, COV, Max, Min) | F1 (Mean, COV, Max, Min)
1D CNN (proposed) | 0.70, 0.14, 0.81, 0.48 | 0.95, 0.04, 1.00, 0.85 | 0.88, 0.04, 0.92, 0.83 | 0.75, 0.08, 0.83, 0.65
SVM [] |
ELM |
The DLMs were successfully implemented for bridge evaluation in the previous section.
However, testing datasets were similar to training datasets in terms of defects, geometry, and
material properties. Even though there exist IE field datasets in some public repositories, such as FHWA InfoBridge [7], they are not validated with the ground truth. Therefore, the investigated DLMs could not be validated on such datasets. To mimic a field investigation, the models were
used on a set of IE data acquired using a different IE device on a different bridge specimen. The
bridge specimen was built in the Center for Transportation Infrastructure Systems at the
University of Texas at El Paso, as shown in Figure 13 [85]. The 0.22 m thick specimen was
inspected by applying an IE device on 126 grid points with a sampling frequency of 400 kHz for
5 ms of data acquisition.
Figure 13. Schematic plan of the validation bridge specimen with delamination.
First, the IE data of the specimen were analyzed using the peak frequency analysis which
resulted in a defect map shown in Figure 14a. The user-tailored frequency method detected most of the defected areas. Then the proposed 1D DLM was used to evaluate the new specimen (Figure 14b). The 1D CNN model was trained on all eight specimens using the same training parameters as before. The 1D CNN detected most of the defected area of the specimen; however, it reported
several false positives. Since no IE data of the new specimen was used during the training, the new dataset was unfamiliar to the models. Yet, the 1D CNN was able to detect most of the damaged areas. This shows the possibility of using these
models for bridge evaluation using IE in the field. With a larger and more diverse annotated dataset, substantially more accurate defect maps than the one shown in Figure 14b could likely be acquired.
(a)
(b)
Figure 14. Defect maps of the validation specimen using (a) frequency method, (b) 1D CNN.
Conclusion
The feasibility of using deep learning models (DLMs) to classify impact echo (IE) data
on bridge decks to localize subsurface defects has been investigated in this paper for the first
time. Eight lab-made bridge specimens with four types of defects were constructed at the Federal Highway Administration (FHWA) Advanced Sensing Technology (FAST) NDE laboratory. These specimens were inspected using IE. Six DLMs were trained on the data from seven
specimens and then were used to classify the IE data on the remaining specimen. A bidirectional
long short-term memory (biLSTM) and a one-dimensional convolutional neural network (1D
CNN) were developed and applied directly on the IE signals. In addition, the IE signals were
converted into 2D representation, i.e., color images, through the short-time Fourier transform to
use pre-trained 2D CNNs with AlexNet, GoogleNet, and ResNet architectures. The proposed 1D
CNN outperformed the other investigated models by detecting 0.70 of the defects on average. This model
also had the lowest false positive average (0.05), highest accuracy (0.88), and highest F1 score
(0.75). The proposed 1D CNN was also considerably faster during the training compared to the
other DLMs. The defect maps generated by 1D CNN matched or exceeded the ones generated by
the peak frequency method by providing a more accurate localization of the defects. The
proposed 1D CNN was applied on a validation specimen with different dimensions to mimic the
field testing. The model was able to detect most of the defected areas although it reported more
false positives than the conventional peak frequency method due to lack of diversity in the
training dataset.
AUTHOR CONTRIBUTIONS
Formal analysis, Investigation, Writing - Original Draft, Writing - Review & Editing, and Visualization.
Acknowledgment
This research was sponsored by the Federal Highway Administration (FHWA). The
contents do not necessarily reflect the official views or policies of the FHWA. This research was
performed while the first author held a National Research Council (NRC) Research
References
[1] Hearn, G. (2007). Bridge inspection practices, A Synthesis of Highway Practice (Vol. 375).
[2] Ryan, T. W., Hartle, R. A., Mann, J. E., & Danovich, L. J. (2012). Bridge inspector’s
reference manual. Report No. FHWA NHI, 03-001, U.S. Department of Transportation,
Washington, DC.
[3] Graybeal, B. A., Phares, B. M., Rolander, D. D., Moore, M., & Washer, G. (2002). Visual
[4] Dorafshan, S., & Maguire, M. (2018). Bridge inspection: human performance, unmanned
aerial systems and automation. Journal of Civil Structural Health Monitoring, 8(3), 443-476.
[5] Washer, G.; Hammed, M.; Brown, H.; Connor, R.; Jensen, P.; Fogg, J.; Salazar, J.; Leshko,
B.; Koonce, J.; Karper, C. Guidelines to Improve the Quality of Element-Level Bridge
Inspection Data; No. NCHRP Project 12-104; 2018, Washington DC, USA.
[6] Dorafshan, S., Maguire, M., Hoffer, N. V., & Coopmans, C. (2017, June). Challenges in
bridge inspection using small unmanned aerial systems: Results and lessons learned. In 2017
[8] Lattanzi, D., & Miller, G. (2017). Review of robotic infrastructure inspection
[9] Dorafshan, S., Thomas, R. J., & Maguire, M. (2018). Comparison of deep convolutional
[10] Gibb, S., La, H. M., Le, T., Nguyen, L., Schmid, R., & Pham, H. (2018). Nondestructive
evaluation sensor fusion with autonomous robotic system for civil infrastructure
[11] Dorafshan, S., Thomas, R. J., & Maguire, M. (2019). Benchmarking Image Processing
[12] Kim, H., Ahn, E., Cho, S., Shin, M., & Sim, S. H. (2017). Comparative analysis of image
binarization methods for crack identification in concrete structures. Cement and Concrete
[13] Farhidzadeh, A., Ebrahimkhanlou, A., & Salamone, S. (2014, March). A vision-based
Structural and Biological Systems 2014 (Vol. 9064, p. 90642H). International Society for
[14] Dorafshan, S., & Maguire, M. (2017, June). Autonomous detection of concrete cracks on
bridge decks and fatigue cracks on steel members. In Digital Imaging 2017 (pp. 33-44).
[15] Luo, Q., Ge, B., & Tian, Q. (2019). A fast adaptive crack detection algorithm based on a
double-edge extraction operator of FSM. Construction and Building Materials, 204, 244-254.
[16] Koch, C., Georgieva, K., Kasireddy, V., Akinci, B., & Fieguth, P. (2015). A review on
computer vision based defect detection and condition assessment of concrete and asphalt
[17] Dorafshan, S., Maguire, M., & Chang, M. (2017, March). Comparing automated image-
based crack detection techniques in the spatial and frequency domains. In 26th ASNT
[18] Omar, T., & Nehdi, M. L. (2017). Remote sensing of concrete bridge decks using
[19] Dorafshan, S.; Maguire, M.; Qi, X. Automatic Surface Crack Detection in Concrete
Structures Using OTSU Thresholding and Morphological Operations; Paper 1234; Civil and
[20] Dixit, A., & Wagatsuma, H. (2018, October). Comparison of Effectiveness of Dual Tree
Complex Wavelet Transform and Anisotropic Diffusion in MCA for Concrete Crack
[21] Ebrahimkhanlou, A., Dubuc, B., & Salamone, S. (2019). A generalizable deep learning
framework for localizing and characterizing acoustic emission sources in riveted metallic
[22] Ebrahimkhanlou, A., Dubuc, B., & Salamone, S. (2019, April). A deep learning-based
metallic panels using only one sensor. In Health Monitoring of Structural and Biological
Systems XIII (Vol. 10972, p. 1097209). International Society for Optics and Photonics.
[23] He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1026-1034).
[24] Dodge, S., & Karam, L. (2017, July). A study and comparison of human and deep
[25] Dorafshan, S., Thomas, R. J., & Maguire, M. (2018). SDNET2018: An annotated image
dataset for non-contact concrete crack detection using deep convolutional neural
[26] Dung, C. V. (2019). Autonomous concrete crack detection using deep fully convolutional
[27] Doulamis, A., Doulamis, N., Protopapadakis, E., & Voulodimos, A. (2018, October).
Combined convolutional neural networks and fuzzy spectral clustering for real time crack
[28] Dorafshan, S., Thomas, R. J., Coopmans, C., & Maguire, M. (2018, June). Deep learning
neural networks for sUAS-assisted structural inspections: Feasibility and application. In 2018
[29] Atha, D. J., & Jahanshahi, M. R. (2018). Evaluation of deep learning approaches based
[30] Cha, Y. J., Choi, W., Suh, G., Mahmoudkhani, S., & Büyüköztürk, O. (2018).
Autonomous structural visual inspection using region‐based deep learning for detecting
multiple damage types. Computer‐Aided Civil and Infrastructure Engineering, 33(9), 731-
747.
[31] Azari, H., Lin, S., & Meng, D. (2019). Nondestructive Corrosion Evaluation of
[32] FHWA (2015). Nondestructive Evaluation (NDE) Web Manual, Version 1.0. FAST NDE
[33] Popovics, J. S., Roesler, J. R., Bittner, J., Amirkhanian, A. N., Brand, A. S., Gupta, P., &
Flowers, K. (2017). Ultrasonic imaging for concrete infrastructure condition assessment and
[34] Jalinoos, F., Tran, K. T., Nguyen, T. D., & Agrawal, A. K. (2017). Evaluation of Bridge
[35] Lin, S., Shams, S., Choi, H., & Azari, H. (2018). Ultrasonic imaging of multi-layer
[36] Nguyen, T. D., Tran, K. T., & Gucunski, N. (2016). Detection of bridge-deck
[37] Azari, H., Meng, D., Choi, H., Shams, S., & Lin, S. (2019). Estimation of Wave Velocity
[38] Yehia, S., Abudayyeh, O., Nabulsi, S., & Abdelqader, I. (2007). Detection of common
[39] Dinh, K., Zayed, T., Romero, F., & Tarussov, A. (2014). Method for analyzing time-
series GPR data of concrete bridge decks. Journal of Bridge Engineering, 20(6), 04014086.
[40] Ma, X., Liu, H., Wang, M. L., & Birken, R. (2018). Automatic detection of steel rebar in
bridge decks from ground penetrating radar data. Journal of Applied Geophysics, 158, 93-
102.
[41] Dinh, K., Gucunski, N., Kim, J., & Duong, T. H. (2017). Method for attenuation
assessment of GPR data from concrete bridge decks. NDT & E International, 92, 50-58.
[42] Dinh, K., Gucunski, N., & Duong, T. H. (2018). An algorithm for automatic localization
and detection of rebars from GPR data of concrete bridge decks. Automation in
[43] Romero, F. A., Barnes, C. L., Azari, H., Nazarian, S., & Rascoe, C. D. (2019). Validation
[44] Gucunski, N., Romero, F., Kruschwitz, S., Feldmann, R., Abu-Hawash, A., & Dunn, M.
[45] La, H. M., Lim, R. S., Basily, B. B., Gucunski, N., Yi, J., Maher, A., ... & Parvardeh, H.
(2013). Mechatronic systems design for an autonomous robotic system for high-efficiency
1655-1664.
[46] Tawhed, W. F., & Gassman, S. L. (2002). Damage assessment of concrete bridge decks
[47] Kee, S. H., Oh, T., Popovics, J. S., Arndt, R. W., & Zhu, J. (2011). Nondestructive bridge
deck testing with air-coupled impact-echo and infrared thermography. Journal of Bridge
[48] Gucunski, N., Kee, S., La, H., Basily, B., & Maher, A. (2015). Delamination and
concrete quality assessment of concrete bridge decks using a fully autonomous RABIT
[49] Choi, H., Shams, S., & Azari, H. (2018). Frequency Wave Number–Domain Analysis of
04018015.
[50] Choi, H., & Azari, H. (2017). Guided wave analysis of air-coupled impact-echo in
[51] Azari, H., Nazarian, S., & Yuan, D. (2014). Assessing sensitivity of impact echo and
[52] Besaw, L. E., & Stimac, P. J. (2015, May). Deep convolutional neural networks for
classifying GPR B-scans. In Detection and Sensing of Mines, Explosive Objects, and
Obscured Targets XX (Vol. 9454, p. 945413). International Society for Optics and Photonics.
[53] Kim, N., Kim, K., An, Y. K., Lee, H. J., & Lee, J. J. (2018). Deep learning-based
underground object detection for urban road pavement. International Journal of Pavement
Engineering, 1-13.
[54] Kaur, P., Dana, K. J., Romero, F. A., & Gucunski, N. (2016). Automated GPR rebar
analysis for robotic bridge deck evaluation. IEEE transactions on cybernetics, 46(10), 2265-
2276.
[55] (2017). A new method to determine locations of rebars and estimate cover thickness of RC
structures using GPR data. Construction and Building Materials, 140, 257-273.
[56] Epp, T., Svecova, D., & Cha, Y. J. (2018). Semi-Automated Air-Coupled Impact-Echo
[57] Xu, J., Ren, Q., & Shen, Z. (2018). Analysis method of impact echo based on variational
[58] Li, B., Cao, J., Xiao, J., Zhang, X., & Wang, H. (2014, June). Robotic impact-echo non-
destructive evaluation based on fft and svm. In Proceeding of the 11th World Congress on
[59] Igual, J., Salazar, A., Safont, G., & Vergara, L. (2015). Semi-supervised Bayesian
[60] Amini, K., Cetin, K., Ceylan,H., Taylor, P., (2018). Development of prediction models
for mechanical properties and durability of concrete using combined nondestructive tests.
[61] Völker, C., & Shokouhi, P. (2015). Multi sensor data fusion approach for automatic
[62] Xiao, X., Gao, B., yun Tian, G., & qing Wang, K. (2019). Fusion model of inductive
thermography and ultrasound for nondestructive testing. Infrared Physics & Technology,
101, 162-170.
[63] Zhang, J. K., Yan, W., & Cui, D. M. (2016). Concrete condition assessment using
[64] ASTM C 1383. (2004). Test method for measuring the P-wave speed and the thickness of
concrete plates using the Impact-Echo method, ASTM Standards. vol. 04.02. ASTM: West
Conshohocken, PA.
[65] Medina, R., & Garrido, M. (2007). Improving impact-echo method by using cross-
[66] Lin, S., Meng, D., Choi, H., Shams, S., & Azari, H. (2018). Laboratory assessment of
[67] Elman, J. L. (1990). Finding structure in time. Cognitive science, 14(2), 179-211.
[68] Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673-2681.
[69] Abdeljaber, O., Avci, O., Kiranyaz, M. S., Boashash, B., Sodano, H., & Inman, D. J.
(2018). 1-D CNNs for structural damage detection: verification on a structural health
[70] Zhang, Y., Miyamori, Y., Mikami, S., & Saito, T. (2019). Vibration‐based structural state
Infrastructure Engineering.
[71] Carino, N. J., Sansalone, M., & Hsu, N. N. (1986). Flaw detection in concrete by
[72] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
[73] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1-9).
[74] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).
[75] Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). Imagenet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 248-255). IEEE.
[76] Abdel-Hamid, O., Deng, L., & Yu, D. (2013, August). Exploring convolutional neural
network structures and optimization techniques for speech recognition. In Interspeech (Vol.
[77] Wyse, L. (2017, May). Audio spectrogram representations for processing with convolutional neural networks. In Proceedings of the First International Workshop on Deep Learning and Music, Anchorage, US (pp. 37-41).
[78] Chorowski, J., Weiss, R. J., Saurous, R. A., & Bengio, S. (2018, April). On using
backpropagation for speech texture generation and voice conversion. In 2018 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2256-
2260). IEEE.
[79] Pritt, M., & Chern, G. (2017, October). Satellite Image Classification with Deep
Learning. In 2017 IEEE Applied Imagery Pattern Recognition Workshop (AIPR) (pp. 1-7).
IEEE.
[80] Wu, R. T., Singla, A., Jahanshahi, M. R., Bertino, E., Ko, B. J., & Verma, D. (2019).
Pruning deep convolutional neural networks for efficient edge computing in condition
[81] Azari, H., Yuan, D., Nazarian, S., & Gucunski, N. (2012). Sonic methods to detect
delamination in concrete bridge decks: Impact of testing configuration and data analysis