Earthquake Prediction Using Neural Network
Int. J. Neur. Syst. 2007.17:13-33.

Neural networks are investigated for predicting the magnitude of the largest seismic event in the following month based on the analysis of eight mathematically computed parameters known as seismicity indicators. The indicators are selected based on the Gutenberg-Richter and characteristic earthquake magnitude distributions and on the conclusions drawn by recent earthquake prediction studies. Since there is no known established mathematical or even empirical relationship between these indicators and the location and magnitude of a succeeding earthquake in a particular time window, the problem is modeled using three different neural networks: a feed-forward Levenberg-Marquardt backpropagation (LMBP) neural network, a recurrent neural network, and a radial basis function (RBF) neural network. Prediction accuracies of the models are evaluated using four different statistical measures: the probability of detection, the false alarm ratio, the frequency bias, and the true skill score or R score. The models are trained and tested using data for two seismically different regions: southern California and the San Francisco Bay region. Overall, the recurrent neural network model yields the best prediction accuracies compared with the LMBP and RBF networks. While at present earthquake prediction cannot be made with a high degree of certainty, this research provides a scientific approach for evaluating the short-term seismic hazard potential of a region.
∗Corresponding author.
March 14, 2007 17:42 00089 FA2
magnitudes and frequencies for certain seismic zones.6,32

Another temporal earthquake distribution model that is being given increasing scientific attention in recent years is the characteristic earthquake distribution model originally proposed by Kagan and Jackson.42 Several active seismic zones exhibit a "recurring" or "characteristic" trend when it comes to major seismic events (also known as characteristic events). In such regions, characteristic events are punctuated by almost uniform time intervals. Several recent attempts have been made to quantify the characteristic magnitude and intervening time period for major seismic zones.31,58,62

Earthquake parameter prediction based on the analysis of observed precursor signals prior to major earthquakes is also a popular field of research. For example, earthquake precursors such as foreshocks before large earthquakes have been used to predict earthquake parameters.14 Foreshocks can be defined as a series of seismic events of magnitudes higher than the magnitude of normal or background seismic activity for a particular region that precede a large earthquake.

Another important focus of earthquake prediction studies has been the phenomenon of seismic quiescence observed in certain regions.16 Seismic quiescence is defined as the apparent lull in the normal seismicity of a region for some time before a major earthquake. Normal seismicity for a seismic zone is usually measured in terms of the earthquake energy released or the rate of its release. The extent of quiescence has been related to the magnitude of the succeeding earthquake based on the elastic rebound theory.64 The rate of release of seismic energy or its square root may provide a means of predicting future earthquakes.

However, due to the extremely non-linear and complex geophysical processes that lead to earthquake occurrence, there is no accurate mathematical or empirical relationship between any physically recordable parameter and the time of occurrence, magnitude, or location of a future earthquake. There are several factors that contribute to the non-linearity of earthquake occurrence time, location, and magnitude, such as the stress state of the fault after the last earthquake, healing of gouge and other chemical processes, and changes in pore pressure due to compaction and fluid migration.54 Therefore, such relationships, if they exist, are expected to be highly non-linear and complex.

In general, neural network modeling has been found to be an effective solution approach when (a) many variables of diverse types need to be included in the problem formulation, (b) the form of the relationship between the dependent and independent variables is unknown, and (c) future data need to be included in the model.63 Neural networks have been used for the solution of a variety of complex problems, from image recognition2 to design automation,7 construction scheduling,4 automatic detection of traffic incidents in intelligent freeway systems,5 and forecasting nonlinear natural phenomena such as earthquakes49,50,52,10,46,51 and avalanches.63

In this work, three neural network models are presented for predicting the magnitude of the largest seismic event in the following month based on the analysis of eight mathematically defined seismicity indicators. For a given seismic region, eight seismicity indicators calculated from a pre-defined number of significant seismic events (here-on called simply events) before the beginning of each month are used as inputs to predict the occurrence or non-occurrence of an earthquake of a pre-defined threshold magnitude or more during that month using a multi-layer neural network.

2. Seismicity Indicators

In this section, eight mathematically defined seismic parameters known as seismicity indicators are presented. These indicators can be used to evaluate the seismic potential of a region. Three of the eight parameters are independent of the temporal distribution of the earthquake magnitude assumed. They are the time elapsed over a predefined number (n) of events (T), the mean magnitude of the last n events (Mmean), and the rate of release of the square root of energy (dE1/2).

Three indicators are based on the Gutenberg-Richter inverse power law temporal magnitude distribution. They are the slope of the Gutenberg-Richter inverse power-law curve, known as the b value; the summation of the mean square deviation about the regression line based on the Gutenberg-Richter inverse power law, known as the η value; and the magnitude deficit, or the difference between observed and expected magnitudes based on the Gutenberg-Richter inverse power law, the ∆M value.33

The remaining two indicators are based on the characteristic temporal earthquake magnitude distribution. They are the mean time between
Neural Network Models for Earthquake Magnitude Prediction Using Multiple Seismicity Indicators 15
Table 1. Seismicity indicator symbols and their description and magnitude distribution (GR represents Gutenberg-Richter power-law distribution and C represents characteristic earthquake distribution).

Symbol   Description                                Distribution
T        Elapsed time                               —
Mmean    Mean magnitude                             —
dE1/2    Rate of square root of seismic energy      —
β        Slope of magnitude-log(frequency) plot     GR
η        Mean square deviation                      GR
∆M       Magnitude deficit                          GR
µ        Mean time                                  C
c        Coefficient of variation                   C
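The three distribution-independent indicators of Table 1 (T, Mmean, and dE1/2) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code; in particular, the energy relation log10 E = 11.8 + 1.5M (ergs) is the standard Gutenberg-Richter energy formula and is an assumption here, since the constants used in the paper are not shown.

```python
import math

def simple_indicators(times, mags):
    """Distribution-independent seismicity indicators from the last n events.

    times: event occurrence times (e.g. in days), oldest first
    mags:  corresponding magnitudes
    Returns (T, Mmean, dE1/2).
    """
    n = len(mags)
    T = times[-1] - times[0]            # time elapsed over the n events
    m_mean = sum(mags) / n              # mean magnitude of the last n events
    # Assumed Gutenberg-Richter energy relation: log10 E = 11.8 + 1.5 M (ergs).
    sqrt_e = [math.sqrt(10 ** (11.8 + 1.5 * m)) for m in mags]
    de_half = sum(sqrt_e) / T           # rate of square-root energy release
    return T, m_mean, de_half
```

For example, five events over 60 days with magnitudes 3.1-4.0 give T = 60 and Mmean = 3.52; dE1/2 then measures how fast square-root energy accumulated over that window.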
Table 2. Earthquake magnitude recordings in southern California for the 10 years between Jan. 1st 1987 and Dec. 31st 1996, showing a distribution similar to the magnitude-frequency inverse-power law distribution of Gutenberg and Richter.

Magnitude (M) range   Count per M range   Cumulative total above lower M in range
2.5–2.9               9471                13590
3.0–3.4               2784                4119
3.5–3.9               912                 1335
4.0–4.4               285                 423
4.5–4.9               90                  128
5.0–5.4               32                  48
5.5–5.9               10                  16
6.0–6.4               3                   6
6.5–6.9               2                   3

Fig. 1 shows a plot of the cumulative seismic energy released versus time for a hypothetical seismic region, representative of the earthquake occurrence process in several seismic regions.68,73,20 In Fig. 1, the region of the graph between points O and A is approximately linear and depicts background seismic activity in the region. The graph between A and B is a plateau showing disruption in background seismic activity, or seismic quiescence. Cumulative energy released increases abruptly at point B (end of quiescence, where stored energy reaches a region-dependent threshold), indicating a major earthquake. Therefore, the seismic energy released during such earthquakes can be approximated from the rate of background seismic energy release.

Fig. 1. Cumulative seismic energy released versus time illustrating the earthquake occurrence process for a hypothetical seismic region that exhibits seismic quiescence. (Plot labels: background rate between O and A, quiescence between A and B; horizontal axis: time.)

Figure 2 plots the logarithm of the frequency of occurrence of events of equal or greater magnitude versus magnitude for a sample set of data recorded in southern California for the 10-year period between Jan 1st 1986 and Dec 31st 1995. The figure illustrates the Gutenberg-Richter inverse power law.

The values of a and b can be calculated using linear least squares regression analysis as follows23:

b = [n Σ(Mi log Ni) − ΣMi Σlog Ni] / [(ΣMi)² − n ΣMi²]    (6)
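Equation (6) is the ordinary least-squares slope of the magnitude versus log-frequency regression. A minimal Python sketch, under the assumption that the magnitudes are the lower bounds of the Table 2 bins and the counts are the cumulative totals of events of that magnitude or greater:

```python
import math

def gr_b_value(mags, counts):
    """Least-squares b value per Eq. (6).

    mags:   magnitude values M_i
    counts: N_i = number of events of magnitude M_i or greater, so that
            log10 N decreases roughly linearly with M under the
            Gutenberg-Richter law.
    """
    n = len(mags)
    logn = [math.log10(c) for c in counts]
    sum_m = sum(mags)
    sum_logn = sum(logn)
    sum_m_logn = sum(m * ln for m, ln in zip(mags, logn))
    sum_m2 = sum(m * m for m in mags)
    # Eq. (6): the sign convention yields a positive b for a decaying curve.
    return (n * sum_m_logn - sum_m * sum_logn) / (sum_m ** 2 - n * sum_m2)
```

On data that exactly follow log10 N = a − bM, the function recovers b; the intercept a then follows from a = (Σlog Ni + b ΣMi)/n.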
Fig. 2. Earthquake magnitude versus the logarithm of the frequency of occurrence of events of equal or greater magnitude. (Vertical axis: log frequency, 0 to 3; horizontal axis: magnitude, 0 to 8.)

Since an event of the largest magnitude will likely occur only once among the n events, N = 1, log N = 0 and Eq. (3) yields

Mmax,expected = a/b    (10)

(7) Mean time between characteristic or typical events (µ value)
This is the average time or gap observed between characteristic or typical events among the last n events. Several seismic zones, including the well-studied Parkfield, California, exhibit periodic trends in the gradual stress built-up and
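Equation (10) and the ∆M indicator defined in Section 2 reduce to one line each. A hedged sketch; the sign convention for ∆M (observed minus expected) is taken from the earlier definition of the magnitude deficit:

```python
def expected_max_magnitude(a, b):
    """Eq. (10): setting N = 1 (log N = 0) in log N = a - b*M gives
    the expected magnitude of the largest event, M = a/b."""
    return a / b

def magnitude_deficit(observed_max, a, b):
    """Delta-M indicator: observed minus expected largest magnitude."""
    return observed_max - expected_max_magnitude(a, b)
```

For instance, with a = 6.0 and b = 1.0 the expected largest magnitude is 6.0, and an observed largest event of 5.5 gives ∆M = −0.5, a deficit relative to the Gutenberg-Richter expectation.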
3. Neural Network Models for Categorical Earthquake Magnitude Prediction

There have been a number of efforts using neural networks to predict earthquake parameters during the past decade or so. Lakkos et al.49 used a feed-forward backpropagation (BP) neural network for predicting earthquake magnitude using, as input, variations of the electrotelluric field (also called seismic electric signals) occurring between a few days and a few hours prior to major earthquakes. The succeeding earthquake magnitude and epicentral depth are the predicted output. The BP neural network was trained using seismic electric signals recorded in Western Greece. The authors report predictions were

parameters used are limited to those based on the Gutenberg-Richter inverse power law, which is not applicable in all seismic regions.

Negarestani et al.55 use a BP neural network to differentiate environmentally induced changes in soil radon concentration from those caused by earthquakes. The authors propose the model as a tool for earthquake time prediction based on measuring soil radon concentration. The study reports improvement in results using neural networks compared with a linear time-series analysis when used on soil radon readings from a site in Thailand.

Kerh and Chu46 use seismic data recorded in the Kaohsiung region of Taiwan in a BP neural network to predict peak ground acceleration values. The
in a seismic region using eight seismicity indicators described earlier as input simultaneously. They are a feed-forward Levenberg-Marquardt backpropagation (LMBP) neural network, a recurrent neural network, and a radial basis function (RBF) neural network. For all three networks, the output is either 1, representing the occurrence of an earthquake of predefined threshold magnitude or greater during the following month, or 0 otherwise. Network operation is repeated by gradually increasing the threshold magnitude in increments of 0.5. The largest earthquake magnitude during the following month is the particular threshold magnitude above which the network output changes to 0.

3.1. Levenberg-Marquardt backpropagation network

Backpropagation was developed by generalizing the Widrow-Hoff learning rule to multi-layer networks using nonlinear differentiable transfer functions.13,59 Architecture of the BP neural network model for predicting the occurrence of an earthquake of threshold magnitude or greater during the following month is shown in Fig. 3. Each training dataset corresponds to one month in the historical record and contains an eight by one input vector and a single output. Each input node represents one of the eight seismicity indicators calculated for each month for a predefined number of seismic events prior to that month using the definitions described in the previous section. Network output is obtained as

Opi = f( Σ_{j=1}^{n} si · wj )    (13)

where Opi is the predicted occurrence (1) or non-occurrence (0) of an earthquake of threshold magnitude or greater during the ith month, si is the 8 × 1 input vector of seismicity indicators for the ith month, wj is the weight vector associated with the jth hidden layer, f is the transfer function, and n is the total number of hidden layers. In this research, the tan-sigmoid function was found to be the most suitable transfer function. The number of hidden layers and number of nodes in each hidden layer are determined by trial-and-error to obtain the best results.

Backpropagation is essentially a gradient-descent algorithm in which the network weights are moved along the negative of the gradient of a performance or error function. The problem is formulated as an unconstrained optimization problem, that is, to minimize a performance function (z) defined as the mean square difference between the predicted and observed (desired) output:

z = (1/N) Σ_{i=1}^{N} (Ooi − Opi)²    (14)

Fig. 3. Architecture of the backpropagation neural network for predicting the occurrence of an earthquake of threshold magnitude or greater during the following month. (Input nodes include the eight seismicity indicators, e.g. ∆M and Mmean.)
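Equations (13) and (14) can be sketched directly. This is an illustrative Python sketch, not the authors' MATLAB implementation: the tan-sigmoid is taken to be tanh (as in MATLAB's tansig), and the 0.5 decision threshold used to binarize the continuous output is an assumption, since the paper does not state one.

```python
import math

def tansig(x):
    """Tan-sigmoid transfer function, range (-1, 1)."""
    return math.tanh(x)

def network_output(s, weights):
    """Eq. (13): O_pi = f(sum_j s_i . w_j), one weight vector w_j per
    hidden layer; the continuous value is thresholded (assumed at 0.5)
    to yield the binary occurrence prediction."""
    total = sum(sum(si * wij for si, wij in zip(s, w)) for w in weights)
    return 1 if tansig(total) >= 0.5 else 0

def performance(observed, predicted):
    """Eq. (14): mean square difference between observed outputs O_oi
    and predicted outputs O_pi over N training datasets."""
    n = len(observed)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
```

With four months of outputs and one mismatch, the performance function evaluates to 1/4, which is the quantity the Levenberg-Marquardt learning rule drives toward zero.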
where Ooi is the observed occurrence (1) or non-occurrence (0) of an event of threshold magnitude or greater during the ith month, and N is the total number of training datasets.

Batch training using gradient descent algorithms is often too slow for large or complicated problems.2 Faster training can be accomplished using other numerical optimization-based algorithms such as the conjugate gradient,2 quasi-Newton,17 and Levenberg-Marquardt34 algorithms, which attain convergence between 10 and 100 times faster than the standard BP algorithm. The Levenberg-Marquardt backpropagation algorithm is used as the learning rule in this work.

There are three basic parameters that determine the accuracy and training time for the Levenberg-Marquardt BP algorithm. They are (a) the number of iterations to be performed before training is stopped, or the number of epochs, (b) the value of the mean square error function at which training stops, and (c) the minimum magnitude of the gradient below which training stops.

3.2. Recurrent network

BP networks are also called static networks because the order in which training datasets are used in the network is immaterial. For instance, seismicity indicators corresponding to each month in the historical record (input) and the observed occurrence or non-occurrence of an event of threshold magnitude or greater during that month (output) can be used in a simple BP network in any random order. In contrast to BP networks, recurrent networks have the capacity to retain past results by incorporating a time-delay.24 Because of this ability to operate not only on the input space but also on an internal state space, recurrent networks have been used in a number of problems involving time series, e.g., acoustic-phonetic decoding of continuous speech27 and prediction of volatile stock trading trends.30 Temporal earthquake magnitude recordings also form time-series data and, as such, recurrent neural networks are suitable for predicting earthquake trends. To the best of the authors' knowledge, no research has been published on the prediction of earthquake parameters using recurrent neural networks.

Architecture of the recurrent network for predicting the occurrence of an earthquake of threshold magnitude or greater during the following month is shown in Fig. 4. Similar to the BP neural network, the input layer has eight nodes representing the eight seismicity indicators corresponding to a given month in the historical record. The number of hidden layers

Fig. 4. Architecture of the recurrent neural network developed for predicting the occurrence of an earthquake of threshold magnitude or greater during the following month. (The output of the recurrent layer is fed back through a feedback connection and combined with the hidden-layer output.)
In the recurrent neural network, in every iteration, the network output is passed through a recurrent layer and the output of the recurrent layer is added to the output of the hidden layer; the sum is used as the argument of the transfer function to obtain the network output in the succeeding iteration as follows:

Opi = f[ Σ_{j=1}^{n} si · wj + Op(i−1) · wr ]    (15)

where wr is the vector of weights of the links connecting the nodes in the recurrent layer to the output node. Therefore, network output is obtained as

Opi = Σ_{j=1}^{n} Φ(si − wj) = Σ_{j=1}^{n} exp(−‖si − wj‖² / σj²)    (16)

where σj is known as a width factor associated with the jth hidden layer. Theoretically, wj and σj determine the center and spread of the Gaussian bell curve, respectively. In this research, it was observed through numerical experimentation that the width factor σ does not contribute to the network's ability for function approximation and therefore its value is set equal to one. The Levenberg-Marquardt training algorithm is used to minimize the mean-square error function.
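The two output rules, Eq. (15) for the recurrent network and Eq. (16) for the RBF network, can be sketched as follows. This is an illustrative sketch under stated simplifications: the recurrent feedback term Op(i−1)·wr is collapsed to a scalar weight, and the Gaussian in Eq. (16) is taken in its standard form with σ = 1, consistent with the width-factor discussion above.

```python
import math

def recurrent_output(s, weights, w_r, prev_output):
    """Eq. (15): the previous output O_p(i-1), scaled by the recurrent
    weight w_r (a scalar here for illustration), is added to the
    hidden-layer sum before applying the transfer function."""
    hidden = sum(sum(si * wij for si, wij in zip(s, w)) for w in weights)
    return math.tanh(hidden + prev_output * w_r)

def rbf_output(s, centers, sigma=1.0):
    """Eq. (16): sum of Gaussian responses; each hidden node responds to
    the squared Euclidean distance between the input s and its center
    w_j, with width factor sigma set to 1 as in the paper."""
    return sum(
        math.exp(-sum((si - wi) ** 2 for si, wi in zip(s, w)) / sigma ** 2)
        for w in centers
    )
```

An input lying exactly on a Gaussian center contributes a response of 1; responses decay toward 0 as the Euclidean distance to the center grows, which is what lets the RBF hidden layer localize in the indicator space.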
Fig. 5. Architecture of the radial basis function neural network for predicting the occurrence of an earthquake of threshold magnitude or greater during the following month. (The eight input nodes T, b, ∆M, Mmean, η, dE1/2, µ, and c feed a hidden layer that operates on the Euclidean distance ‖Si − Wj‖ between the input vector and the vector of weights of the links connecting the input and hidden layers, using the Gaussian transfer function. Output is 1 if an earthquake of threshold magnitude or greater occurs, 0 if not.)
magnitude or greater occurred and was predicted; Npi (predicted-incorrect) is the number of months during which an earthquake of threshold magnitude or greater did not occur but was predicted; Nnc (not predicted-correct) is the number of months during which an earthquake of threshold magnitude or greater did not occur and was not predicted; and Nni (not predicted-incorrect) is the number of months during which an earthquake of threshold magnitude or greater occurred but was not predicted. These variables are summarized in Table 4.

Table 4. Parameters used for the computation of POD, FAR, FB and R scores. "Yes" denotes predicted or observed occurrence of an earthquake of threshold magnitude or greater and "No" denotes the predicted or observed non-occurrence of an earthquake of threshold magnitude or greater.

                 Predicted
Observed     Yes     No
Yes          Npc     Nni
No           Npi     Nnc

Another commonly used forecast verification method is computing the so-called skill scores. Skill scores are a measure of the skill (or competence) of the prediction tool (neural network modeling in this case) in predicting a particular parameter (earthquake magnitude in this case).56 In this research, the Hanssen-Kuiper skill score, also known as the true skill or R score, is used. It is defined as the difference between the probability of detection and the false alarm ratio for each predicted magnitude:

R = POD − FAR = Npc/(Npc + Nni) − Npi/(Npi + Nnc)    (20)

The R score is −1 if no correct predictions are made and +1 if all predictions are correct. This score is considered advantageous over POD and FAR
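Using the four contingency counts of Table 4, the verification measures follow in a few lines. A hedged Python sketch: POD, FAR, and R follow the standard Hanssen-Kuiper definitions (hits over observed "yes" months, false alarms over observed "no" months); the frequency-bias line uses the standard forecast-verification definition, which is an assumption since the paper's FB formula is not reproduced here.

```python
def skill_scores(n_pc, n_pi, n_nc, n_ni):
    """Verification measures from the Table 4 contingency counts.

    n_pc: predicted and occurred (hits)
    n_pi: predicted but did not occur (false alarms)
    n_nc: not predicted and did not occur (correct negatives)
    n_ni: not predicted but occurred (misses)
    """
    pod = n_pc / (n_pc + n_ni)           # probability of detection
    far = n_pi / (n_pi + n_nc)           # false alarm rate
    fb = (n_pc + n_pi) / (n_pc + n_ni)   # frequency bias: predicted vs observed "yes"
    r = pod - far                        # Hanssen-Kuiper R score, in [-1, +1]
    return pod, far, fb, r
```

For example, 8 hits, 2 false alarms, 6 correct negatives, and 2 misses give POD = 0.8, FAR = 0.25, FB = 1.0, and R = 0.55; a perfect forecaster reaches R = +1 and a forecaster that is always wrong reaches R = −1.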
Table 5. Sample readings from the SCEC catalog showing search parameters,
parameter values and events that matched the search criteria.
longitude (this region includes most of north-central California). The three neural network models are trained and tested separately using historical seismic data recorded at two different regions: southern California and the San Francisco Bay region as defined

arrays form 492 training datasets. Sample training datasets for the 12 months between January and December of 1994 for southern California, showing eight-element input vectors and the corresponding desired outputs (M = 4.5 or greater), are presented in Table 6.

Table 6. Sample training datasets for the 12 months between January and December of 1994 for southern California showing eight-element input vectors and the corresponding desired outputs (M = 4.5 or greater).
predict the occurrence of an earthquake of magnitude 4.5 or greater for the nine months between January and September of 2005, using the 8 × 1 input vectors of seismicity indicators for these months as input. The R scores for the nine predictions are computed for each of the three neural network models. The best network architectures are obtained by gradually increasing the number of hidden layers and the number of nodes in each hidden layer until the increase in R scores by doing so is less than 0.01. The best architectures for the BP and recurrent neural network models were found to be the same for both seismic regions, whereas different best architectures were found in the case of the RBF neural network model for the two regions.

and small earthquakes is studied using the statistical measures described in an earlier section.

Southern California

The computed values of POD, FAR, FB, and R score for different magnitudes using each of the three neural network models are summarized in Table 7. This table also includes the values of the probabilities of occurrence of earthquakes of different magnitudes in a given month based on the Poisson's null hypothesis (p0).

(a) Prediction of large earthquakes (Magnitude 6.5 or greater)
There were two months in the testing period dur-
Table 7. Computed values of p0, POD, FAR, FB, and R scores for different magnitudes using the three neural network models for southern California.

                       BP                          Recurrent                   RBF
M     p0       POD   FAR   FB    R       POD   FAR   FB    R       POD   FAR   FB    R
4.5   0.301    0.52  0.44  0.86  0.08    0.67  0.31  0.90  0.36    0.44  0.45  0.90  −0.01
5.0   0.131    0.46  0.33  0.68  0.13    0.80  0.29  1.00  0.51    0.64  0.27  0.79  0.37
5.5   0.052    0.50  0.50  1.00  0.00    0.75  0.25  1.00  0.50    0.75  0.40  0.86  0.35
6.0   0.021    0.00  0.00  0.00  0.00    0.00  0.00  0.00  0.00    0.00  0.00  0.00  0.00
6.5   0.011    0.00  0.00  0.00  0.00    1.00  0.00  1.00  1.00    1.00  0.00  1.00  1.00
7.0   0.003    0.00  0.00  0.00  0.00    1.00  0.00  1.00  1.00    0.50  0.50  1.00  0.00
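The R column of Table 7 can be spot-checked against the definition R = POD − FAR; a quick sketch over a few of the tabulated rows:

```python
# (POD, FAR, R) triplets taken from Table 7.
rows = [
    (0.52, 0.44, 0.08),   # BP, M = 4.5
    (0.67, 0.31, 0.36),   # recurrent, M = 4.5
    (0.80, 0.29, 0.51),   # recurrent, M = 5.0
    (0.64, 0.27, 0.37),   # RBF, M = 5.0
]
for pod, far, r in rows:
    # R = POD - FAR should hold up to the table's two-decimal rounding
    assert abs((pod - far) - r) < 0.015, (pod, far, r)
```

All four rows satisfy the identity exactly, which confirms how the R column of the table was derived from the POD and FAR columns.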
predicted the occurrence of an earthquake of magnitude between 6.5 and 7.0, yielding an FAR of 0.0.

(b) Prediction of moderate earthquakes (Magnitude 5.5 or greater but less than 6.5)
There was a single month in the testing period during which an earthquake of magnitude between 6.0 and 6.5 occurred (April and June 1992). The BP neural network did not predict this earthquake and did not make any false prediction of an earthquake of magnitude between 6.0 and 6.5, yielding a POD and FAR of 0.00. The recurrent network did not predict this event, yielding a POD of 0.0, and did not falsely predict any earthquake of magnitude between 6.0 and 6.5, yielding an FAR of 0.00. The RBF neu-

earthquakes of magnitude between 5.0 and 5.5 during six months, yielding an FAR of 0.27.

There were 27 months in the testing period during which an earthquake of magnitude between 4.5 and 5.0 occurred. The BP neural network correctly predicted 14 of these events, yielding a POD of 0.52, and falsely predicted earthquakes of magnitude between 4.5 and 5.0 during 11 months, yielding an FAR of 0.44. The recurrent neural network correctly predicted 18 of the 27 events, yielding a POD of 0.67, and falsely predicted earthquakes of magnitude between 4.5 and 5.0 for eight months, yielding an FAR of 0.31. The RBF neural network correctly predicted 12 of the 27 events, yielding a POD of 0.44, and falsely predicted earthquakes of magnitude between
Table 8. Computed values of R scores for different predicted magnitudes, using different
input vectors obtained by removing one parameter at a time for the southern California
region.
Table 9. Computed values of R scores for different predicted magnitudes, using input vectors (seismicity indicators) calculated from different threshold magnitudes and number of events of threshold magnitude or greater for the southern California region.
events in computing seismicity indicators, as summarized in Table 9.

San Francisco bay region

(a) Prediction of large earthquakes (Magnitude 6.5 or greater)
There was no month in the testing period during which an earthquake of magnitude greater than 6.5 occurred. Therefore POD for all three networks for M = 6.5 and M = 7.0 is 0.0. None of the networks falsely predicted earthquakes of magnitude greater than 6.5, yielding an FAR of 0.0 for M = 6.5 and M = 7.0.

(b) Prediction of moderate earthquakes (Magnitude 5.5 or greater but less than 6.5)
There was no month in the testing period during which an earthquake of magnitude between 6.0 and 6.5 occurred. Therefore the POD for all three networks is 0.0 for M = 6.0. The BP neural network falsely predicted seven events of magnitude between 6.0 and 6.5, yielding an FAR of 1.00. The recurrent network did not falsely predict any event of magnitude between 6.0 and 6.5, yielding an FAR of 0.00. The RBF neural network falsely predicted 3 events of magnitude between 6.0 and 6.5, yielding an FAR of 1.00 for M = 6.0.

There was no month in the testing period during which an earthquake of magnitude between 5.5 and 6.0 occurred. Therefore the POD for all three networks is 0.00 for M = 5.5. The BP neural network falsely predicted 18 events of magnitude between 5.5 and 6.0, yielding an FAR of 1.00. The recurrent network did not falsely predict any earthquake of magnitude between 5.5 and 6.0, yielding an FAR of 0.00. The RBF network falsely predicted 10 earthquakes of magnitude between 5.5 and 6.0, yielding an FAR of 1.00 for M = 5.5.

(c) Prediction of small earthquakes (Magnitude 4.5 or greater but less than 5.5)
There were four months in the testing period during which an earthquake of magnitude between 5.0 and 5.5 occurred. The BP neural network did not predict any of these events, yielding a POD of 0.00, and falsely predicted an earthquake of magnitude between 5.0 and 5.5 for three months, yielding an FAR of 1.00 for M = 5.0. The recurrent neural network correctly predicted two of the four events, yielding a POD of 0.50, and falsely predicted an earthquake of magnitude between 5.0 and 5.5 during one month, yielding an FAR of 0.33 for M = 5.0. The RBF neural network predicted one of the four events, yielding a POD of 0.25, and falsely predicted an earthquake of magnitude between 5.0 and 5.5 for one month, yielding an FAR of 0.50 for M = 5.0.

There were eight months during which an earthquake of magnitude between 4.5 and 5.0 was recorded. The BP neural network predicted two of these events, yielding a POD of 0.25, and falsely predicted earthquakes of magnitude between 4.5 and 5.0 during two months, yielding an FAR of 0.50 for M = 4.5. The recurrent neural network predicted five of the eight events, yielding a POD of 0.63, and falsely predicted earthquakes of magnitude between 4.5 and 5.0 for four months, yielding an FAR of 0.44 for M = 4.5. The RBF neural network correctly predicted four of the eight events, yielding a POD of 0.50, and falsely predicted two earthquakes of magnitude between 4.5 and 5.0, yielding an FAR of 0.33 for M = 4.5.

(d) Parametric Analysis
Similar to the southern California region, parametric studies are performed by removing the seismicity indicators one at a time, and in each case R scores are computed for different magnitudes. It is observed that prediction accuracies are influenced by seismicity indicators based on the Gutenberg-Richter earthquake magnitude-frequency distribution (b, η, and ∆M) and not significantly altered by seismicity indicators based on the characteristic earthquake distribution model (µ and c), leading to the conclusion that the Gutenberg-Richter earthquake magnitude-frequency relationship is better suited to the San Francisco bay region compared with the characteristic earthquake distribution. The R scores are significantly reduced if dE1/2 is removed from the input vector, leading to the conclusion that the San Francisco bay region shows strong seismic quiescence trends.

As in the case of the southern California region, it is observed that no major improvement in prediction accuracies is achieved as a result of considering large numbers of low magnitude events for computing seismicity indicators.

7. Final Comments

Earthquake parameter prediction is a highly complex problem because it involves a large number of variables whose effects are not completely understood. Neural networks have not been used extensively for earthquake parameter prediction even though they have been used as an effective prediction tool in a variety of fields from economics to signal processing. In this paper, three neural network models are presented for earthquake magnitude prediction using eight mathematically computed seismic parameters as input. The goal is to predict the magnitude of the largest earthquake in a following period, for example, one month.

Overall, the recurrent neural network model yields the best prediction accuracies compared with the
BP and RBF networks. This result is consistent Table 10. Observed largest earthquake, and correspond-
with the fact that recurrent neural networks have ing location compared with predicted magnitude for each
month in the testing dataset for Southern California
the inherent capacity to model time-series data
using the recurrent neural network. Blank cells indicate
better compared with other networks. The study lack of observed or predicted events of magnitude 4.5 or
that is most closely related to the present work greater during the corresponding month.
was undertaken by Ma et al.,52 where six seis-
micity indicators were used as input to a BP Month Observed Observed location Predicted
Mmax Mmax
neural network to predict the magnitude of the Latitude Longitude
largest earthquake in the following month. How-
ever, the indicators were limited to those based Jan-91 — — — —
Feb-91 — — — —
on the Gutenberg-Richter magnitude-frequency rela-
Mar-91 — — — —
tionship alone. Therefore, their model is not appli- Apr-91 — — — —
cable in regions where seismicity data do not follow May-91 — — — —
this relationship. When tested on the two seismic Jun-91 5.8 34.27 −117.99 5.5
regions, all three neural network models developed Jul-91 — — — —
Int. J. Neur. Syst. 2007.17:13-33. Downloaded from www.worldscientific.com
Month Observed Observed location Predicted Month Observed Observed location Predicted
Mmax Mmax Mmax Mmax
Latitude Longitude Latitude Longitude
Neural Network Models for Earthquake Magnitude Prediction Using Multiple Seismicity Indicators 31
Aug-04 — — — —
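The POD and FAR figures quoted above follow the standard contingency-table definitions used in forecast verification, which also cover the frequency bias and the true skill (R) score used to evaluate the models. A minimal sketch of these measures, using the BP network's counts for 4.5 <= M < 5.0 as an example (the count of correct negatives is a hypothetical value, not taken from the paper):

```python
# Contingency-table forecast verification measures; a sketch consistent
# with the POD and FAR values quoted in the text, not the authors' code.

def verification_scores(hits, false_alarms, misses, correct_negatives):
    """Return POD, FAR, frequency bias and R score from event counts."""
    pod = hits / (hits + misses)                    # probability of detection
    far = false_alarms / (hits + false_alarms)      # false alarm ratio
    bias = (hits + false_alarms) / (hits + misses)  # frequency bias
    # True skill score (R score): POD minus the probability of false detection.
    r = pod - false_alarms / (false_alarms + correct_negatives)
    return pod, far, bias, r

# Two of eight events predicted, with two false alarms;
# correct_negatives=90 is an illustrative assumption.
pod, far, bias, r = verification_scores(hits=2, false_alarms=2,
                                        misses=6, correct_negatives=90)
print(round(pod, 2), round(far, 2))  # 0.25 0.5
```

These values reproduce the POD of 0.25 and FAR of 0.50 reported for the BP network.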
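The remark that recurrent networks have an inherent capacity to model time-series data can be illustrated with a minimal Elman-style cell (cf. ref. 24): the previous hidden state is fed back alongside each new input vector, so the output for a given month depends on the preceding months. The layer sizes, random untrained weights, and function names below are illustrative assumptions, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 8, 12            # eight seismicity indicators as input
W_in = rng.normal(scale=0.1, size=(n_hidden, n_in))    # input weights
W_rec = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # recurrent weights
w_out = rng.normal(scale=0.1, size=n_hidden)           # output weights

def run_sequence(xs):
    """Feed a sequence of indicator vectors; return one output per step."""
    h = np.zeros(n_hidden)                 # context (previous hidden state)
    outputs = []
    for x in xs:
        h = np.tanh(W_in @ x + W_rec @ h)  # state depends on the history
        outputs.append(float(w_out @ h))   # scalar magnitude estimate
    return outputs

months = [rng.normal(size=n_in) for _ in range(6)]  # six months of indicators
preds = run_sequence(months)
print(len(preds))  # 6
```

A feed-forward BP network, by contrast, would map each month's indicators to an output independently, with no carried state.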
22. H. Dai and C. Macbeth, Application of backpropagation neural networks to identification of seismic arrival types, Journal of Geophysical Research 102(B7) (1997) 15105–15113.
23. N. Draper and H. Smith, Applied Regression Analysis (Wiley, New York, NY, 1966).
24. J. L. Elman, Finding structure in time, Cognitive Science 14 (1990) 179–211.
25. W. Ellsworth, M. Mathews, R. Nadeau, S. Nishenko, P. Reasenberg and R. Simpson, A physically-based earthquake recurrence model for estimation of long-term earthquake probabilities, Workshop on Earthquake Recurrence: State-of-the-art and Directions for the Future, February 22–25, Rome, Italy (1999).
26. M. W. Firebaugh, Artificial Intelligence, A Knowledge Based Approach (Boyd and Fraser, Boston, MA, 1988).
37. M. Ishimoto and K. Iida, Observations of earthquakes registered with the micro seismograph constructed recently, Bulletin of Earthquake Research, University of Tokyo 17 (1939) 443–478 [Translated from Japanese].
38. D. Jackson, Hypothesis testing and earthquake prediction, Earthquake Prediction: The Scientific Challenge, February 10–11, Irvine, CA (1995).
39. D. Jackson and Y. Y. Kagan, Parkfield earthquake: Not likely this year, Seismological Research Letters 69(2) (1998) 151.
40. S. Jaume, D. Weatherley and P. Mora, Accelerating moment release and the evolution of event time and size statistics, results from two cellular automaton models, Pure and Applied Geophysics 157(11) (2000) 2209–2226.
41. Y. Y. Kagan, VAN earthquake predictions, a statistical evaluation, Geophysical Research Letters 23(11)
…on Neural Networks, August 19–21, Dalian, China (2004), pp. 962–969.
52. L. Ma, L. Zhu and Y. Shi, Attempts at using seismicity indicators for the prediction of large earthquakes by Genetic Algorithm-Neural Network method, Asia-Pacific Economic Cooperation for Earthquake Simulation, January 31–February 5, Brisbane, Australia (1999).
53. D. Marquardt, An algorithm for the least-squares estimation of non-linear parameters, Journal of Applied Mathematics 11 (1963) 431–441.
54. M. Matthews, W. Ellsworth and P. Reasenberg, A Brownian model for recurrent earthquakes, Bulletin of the Seismological Society of America 92(6) (2002) 2233–2250.
55. A. Negarestani, S. Sestayeshi, M. Ghannadi-Maragheh and B. Akashe, Layered neural networks based analysis of radon concentration and environmental parameters in earthquake prediction, Journal of Environmental Radioactivity 62(3) (2002) 225–233.
56. T. Read and N. Cressie, Goodness-of-Fit Statistics for Discrete Multivariate Data (Springer Verlag, New York, NY, 1988).
57. H. Reid, The mechanism of the earthquake; The California earthquake of April 18, 1906, Report of the State Earthquake Investigation Commission, Carnegie Institute of Washington, Washington D.C. 2 (1910) 16–28.
58. E. Roeloffs, The Parkfield, California earthquake experiment, An update in 2000, Current Science 79(9) (2000) 1226–1236.
59. D. Rumelhart, G. Hinton and R. Williams, Learning representations by back-propagating errors, Nature 323(9) (1986) 533–536.
60. M. Sharma and M. Arora, Prediction of seismicity cycles in the Himalayas using artificial neural networks, Acta Geophysica Polonica 53(3) (2005) 299–309.
61. Y. Shi, J. Liu and G. Zhang, An evaluation of Chinese annual earthquake predictions, 1990–1998, Journal of Applied Probability 38(A) (2001) 222–231.
62. K. Sieh, The repetition of large earthquake ruptures, Proceedings of the National Academy of Sciences 93(9) (1996) 3764–3771.
63. J. Stephens, E. Adams, X. Huo, J. Dent and J. Hicks, Use of neural networks in avalanche forecasting, in Proceedings of the International Snow Science Workshop, October 30–November 3, Snowbird, USA (1994), pp. 327–340.
64. C. Thanassoulas, Earthquake prediction based on electrical signals recorded on ground surface, in Proceedings: Possible Correlation between Electromagnetic Earth-Fields and Future Earthquakes, Bulgarian Academy of Sciences, Sofia, Bulgaria (2001), pp. 19–30.
65. K. Tiampo, J. Rundle, S. McGinnis, S. Gross and W. Klein, Mean-field threshold systems and phase dynamics: An application to earthquake fault systems, Europhysics Letters 60 (2002) 481–487.
66. P. Varshney, Distributed Detection and Data Fusion (Springer-Verlag, New York, NY, 1997).
67. D. Vere-Jones, R. Robinson and W. Wang, Remarks on the accelerated moment release model: Problems of model formulation, simulation and estimation, Geophysics Journal International 144(3) (2001) 517–531.
68. M. Wyss, P. Bodin and R. E. Haberman, Seismic quiescence at Parkfield; an independent indication of an imminent earthquake, Nature 345(290) (1990) 426–428.
69. Working Group on California Earthquake Probabilities, Earthquake probabilities in the San Francisco bay region, United States Geological Survey Open-File Report 03–214 (2003).
70. S. Xu, Ability evaluation for earthquake prediction, Science Books and Periodicals Press, Seismology Volume, 586–590 [Translated from Chinese] (1989).
71. I. Zaliapin, V. Keilis-Borok and G. Axen, Premonitory spreading of seismicity of faults' network in southern California: Precursor accord, Journal of Geophysical Research 107(B10) (2002) 2221.
72. I. Zaliapin, V. Keilis-Borok and M. Ghil, A Boolean delay equation model of colliding cascades; Part II: Prediction of critical transitions, Journal of Statistical Physics 111(3) (2003) 839–861.
73. G. Zoller, S. Hainzl and J. Kurths, A systematic test on precursory seismic quiescence in Armenia, Natural Hazards 26(3) (2002) 245–263.