

International Journal of Neural Systems, Vol. 17, No. 1 (2007) 13–33
© World Scientific Publishing Company
NEURAL NETWORK MODELS FOR EARTHQUAKE MAGNITUDE PREDICTION USING MULTIPLE SEISMICITY INDICATORS
ASHIF PANAKKAT and HOJJAT ADELI∗
Department of Civil and Environmental Engineering and Geodetic Science,
The Ohio State University, Columbus, Ohio 43210 USA

[email protected]

∗Corresponding author.

Neural networks are investigated for predicting the magnitude of the largest seismic event in the following month based on the analysis of eight mathematically computed parameters known as seismicity indicators. The indicators are selected based on the Gutenberg-Richter and characteristic earthquake magnitude distributions and also on the conclusions drawn by recent earthquake prediction studies. Since there is no known established mathematical or even empirical relationship between these indicators and the location and magnitude of a succeeding earthquake in a particular time window, the problem is modeled using three different neural networks: a feed-forward Levenberg-Marquardt backpropagation (LMBP) neural network, a recurrent neural network, and a radial basis function (RBF) neural network. Prediction accuracies of the models are evaluated using four different statistical measures: the probability of detection, the false alarm ratio, the frequency bias, and the true skill score or R score. The models are trained and tested using data for two seismically different regions: Southern California and the San Francisco bay region. Overall, the recurrent neural network model yields the best prediction accuracies compared with the LMBP and RBF networks. While at present earthquake prediction cannot be made with a high degree of certainty, this research provides a scientific approach for evaluating the short-term seismic hazard potential of a region.

Keywords: Earthquake prediction; neural networks; seismicity; seismology.

1. Introduction

Prediction of the time of occurrence, magnitude, and epicentral location of future large earthquakes has been the subject of a number of scientific efforts with distinctly different conclusions in recent years. While some scientists have concluded that the prediction of the time of occurrence, magnitude, or location of a single future earthquake cannot be done,29 others have suggested several procedures for predicting these parameters.18,41,52,25,15,16,47,48,28,54,36,72,31,51,60 Such procedures are based on either the study of precursory phenomena before earthquakes, such as seismic quiescence, changes in magnetic and electric signals recorded at seismic sites, and abnormal animal behavior, or the analysis of historical earthquake data recorded in seismic catalogs.

Earthquake magnitude prediction studies based on the analysis of historical earthquake data assume a temporal distribution model for earthquake magnitude for the seismic region considered in the study. Such models describe the frequencies of occurrence of seismic events as functions of their magnitudes. The most widely used magnitude-frequency model for hazard estimation is that based on the Gutenberg-Richter inverse power law.33 For example, it is the basis for the new U.S. National Earthquake Hazard Maps.54 The law is based on an original empirical power law put forth by Ishimoto and Iida.37

The Gutenberg-Richter inverse power law establishes an inverse linear relationship between the magnitude of seismic events and the logarithm of the frequency of occurrence of events of magnitude equal to or greater than that magnitude. Probabilistic earthquake magnitude-frequency distributions such as those based on the Gamma and Weibull distributions have also been employed to relate earthquake magnitudes and frequencies for certain seismic zones.6,32

Another temporal earthquake distribution model that is being given increasing scientific attention in recent years is the characteristic earthquake distribution model originally proposed by Kagan and Jackson.42 Several active seismic zones exhibit a "recurring" or "characteristic" trend when it comes to major seismic events (also known as characteristic events). In such regions, characteristic events are punctuated by almost uniform time intervals. Several recent attempts have been made to quantify the characteristic magnitude and intervening time period for major seismic zones.31,58,62

Earthquake parameter prediction based on the analysis of observed precursor signals prior to major earthquakes is also a popular field of research. For example, earthquake precursors such as foreshocks before large earthquakes have been used to predict earthquake parameters.14 Foreshocks can be defined as a series of seismic events of magnitudes higher than the magnitude of normal or background seismic activity for a particular region that precede a large earthquake.

Another important focus of earthquake prediction studies has been the phenomenon of seismic quiescence observed in certain regions.16 Seismic quiescence is defined as the apparent lull in the normal seismicity of a region for some time before a major earthquake. Normal seismicity for a seismic zone is usually measured in terms of the earthquake energy released or the rate of its release. The extent of quiescence has been related to the magnitude of the succeeding earthquake based on the elastic rebound theory.64 The rate of release of seismic energy or its square root may provide a means of predicting future earthquakes.

However, due to the extremely non-linear and complex geophysical processes that lead to earthquake occurrence, there is no accurate mathematical or empirical relationship between any physically recordable parameter and the time of occurrence, magnitude or location of a future earthquake. There are several factors that contribute to the non-linearity of earthquake occurrence time, location, and magnitude, such as the stress state of the fault after the last earthquake, healing of gouge and other chemical processes, and changes in pore pressure due to compaction and fluid migration.54 Therefore, such relationships, if they exist, are expected to be highly non-linear and complex.

In general, neural network modeling has been found to be an effective solution approach when (a) many variables of diverse types need to be included in the problem formulation, (b) the form of the relationship between the dependent and independent variables is unknown, and (c) future data needs to be included in the model.63 Neural networks have been used for the solution of a variety of complex problems from image recognition,2 to design automation,7 construction scheduling,4 automatic detection of traffic incidents in intelligent freeway systems,5 and forecasting nonlinear natural phenomena such as earthquakes49,50,52,10,46,51 and avalanches.63

In this work, three neural network models are presented for predicting the magnitude of the largest seismic event in the following month based on the analysis of eight mathematically defined seismicity indicators. For a given seismic region, eight seismicity indicators calculated from a pre-defined number of significant seismic events (here-on called simply events) before the beginning of each month are used as inputs to predict the occurrence or non-occurrence of an earthquake of a pre-defined threshold magnitude or more during that month using a multi-layer neural network.

2. Seismicity Indicators

In this section, eight mathematically-defined seismic parameters known as seismicity indicators are presented. These indicators can be used to evaluate the seismic potential of a region. Three of the eight parameters are independent of the temporal distribution of the earthquake magnitude assumed. They are the time elapsed over a predefined number (n) of events (T), the mean magnitude of the last n events (Mmean), and the rate of release of the square root of energy (dE1/2).

Three indicators are based on the Gutenberg-Richter inverse power law temporal magnitude distribution. They are the slope of the Gutenberg-Richter inverse power-law curve, known as the b value, the summation of the mean square deviation about the regression line based on the Gutenberg-Richter inverse power law, known as the η value, and the magnitude deficit or the difference between observed and expected magnitudes based on the Gutenberg-Richter inverse power law, or the ∆M value.33

The remaining two indicators are based on the characteristic temporal earthquake magnitude distribution. They are the mean time between characteristic or typical events, known as the µ value, and the coefficient of variation of the mean time, or the aperiodicity of the mean, known as the c value.42 The eight indicators, their descriptions, and the magnitude distribution assumed as their basis are summarized in Table 1.

Table 1. Seismicity indicator symbols, their descriptions, and the assumed magnitude distribution (GR represents the Gutenberg-Richter power-law distribution and C represents the characteristic earthquake distribution).

Seismicity indicator   Description                               Assumed distribution
T                      Elapsed time                              —
Mmean                  Mean magnitude                            —
dE1/2                  Rate of square root of seismic energy     —
b                      Slope of magnitude-log(frequency) plot    GR
η                      Mean square deviation                     GR
∆M                     Magnitude deficit                         GR
µ                      Mean time                                 C
c                      Coefficient of variation                  C

(1) The T value
The time elapsed over the last n events of magnitude greater than a predefined threshold value is defined as

$T = t_n - t_1$   (1)

where $t_n$ is the time of occurrence of the nth event and $t_1$ is the time of occurrence of the 1st event. Most earthquakes are preceded by significant precursor activity such as a series of smaller magnitude quakes called foreshocks.14 In fact, some of the most popular earthquake prediction models such as the colliding cascades model72 and other earthquake dynamic studies are based on observing the frequency and intensity of foreshocks. The T value can be a measure of the frequency of foreshocks depending on the threshold value chosen for the magnitude. In this case a large T value indicates a relative lack of foreshocks, which in many seismic regions may indicate a lower probability of occurrence of a forthcoming large seismic event. Inversely, a small T value indicates a relatively high foreshock frequency and a higher probability of occurrence of a forthcoming large seismic event.

(2) The Mean Magnitude
The mean of the Richter magnitudes of the last n events is defined as

$M_{mean} = \sum M_i / n$   (2)

Together with the T value (which is a measure of the frequency of foreshocks), the mean of the magnitudes of foreshocks is also a crucial indicator of an impending earthquake in some regions. According to the accelerated release hypothesis18 and its modifications,40,67 the energy released from a fractured fault increases exponentially as the time to earthquake occurrence gets shorter. In other words, the observed magnitudes of foreshocks increase immediately before the occurrence of a major earthquake.

(3) The rate of square root of seismic energy released (dE1/2)
The rate of square root of seismic energy released over time T is defined as

$dE^{1/2} = \sum E^{1/2} / T$   (3)

where $E^{1/2}$ is the square root of the seismic energy E calculated from the corresponding Richter magnitude using the following empirical relationship45

$E = 10^{11.8 + 1.5M}$ ergs   (4)

Most seismic regions can be approximated as open physical systems with a gradual build-up of energy through the movement of lithospheric plates. Such systems remain in relative equilibrium if this gradual build-up is released through regular low-magnitude (or background) seismic activity.58 If the background seismic activity is disrupted for significantly long periods of time (seismic quiescence) due to frictional or mechanical reasons, the physical system accumulates energy that will be released abruptly in the form of a major seismic event when the stored energy reaches a threshold.65

Figure 1 shows the plot of the cumulative seismic energy released versus time for a hypothetical seismic region, representative of the earthquake occurrence process in several seismic regions.68,73,20 In Fig. 1, the region of the graph between points O and A is approximately linear and depicts background seismic activity in the region. The graph between A and B is a plateau showing disruption in background seismic activity, or seismic quiescence. Cumulative energy released increases abruptly at point B (end of quiescence, where stored energy reaches a region-dependent threshold) indicating a major earthquake. Therefore, the seismic energy released during such earthquakes can be approximated from the rate of background seismic activity (slope of the approximately linear portion between points O and A of Fig. 1) and the period of quiescence (distance along the time axis between points A and B).73 The rate of seismic energy released is therefore a crucial seismicity indicator in regions that show quiescence.

Fig. 1. Cumulative seismic energy released versus time illustrating the earthquake occurrence process for a hypothetical seismic region that exhibits seismic quiescence. (Axes: time versus cumulative seismic energy released; labeled features: background rate between O and A, quiescence plateau between A and B.)

Table 2. Earthquake magnitude recordings in Southern California for the 10 years between Jan. 1st 1987 and Dec. 31st 1996, showing a distribution similar to the magnitude-frequency inverse power-law distribution of Gutenberg and Richter.

Magnitude (M) range   Count per M range   Cumulative total above lower M in range
2.5–2.9               9471                13590
3.0–3.4               2784                4119
3.5–3.9               912                 1335
4.0–4.4               285                 423
4.5–4.9               90                  128
5.0–5.4               32                  48
5.5–5.9               10                  16
6.0–6.4               3                   6
6.5–6.9               2                   3
7.0–7.4               1                   1

Table 3. Peak magnitude and corresponding seismic energy for the 12 months of 1994.

Month    Peak recorded magnitude   Seismic energy released (ergs)
Jan 94   6.70                      4.31 × 10^20
Feb 94   4.09                      8.91 × 10^17
Mar 94   5.24                      4.57 × 10^19
Apr 94   4.78                      9.66 × 10^18
May 94   4.40                      2.6 × 10^18
Jun 94   4.97                      1.79 × 10^19
Jul 94   3.93                      4.95 × 10^17
Aug 94   4.85                      1.18 × 10^19
Sep 94   3.76                      2.75 × 10^17
Oct 94   4.18                      1.17 × 10^18
Nov 94   4.21                      1.34 × 10^18
Dec 94   4.88                      4.07 × 10^18

(4) Slope of the log of the earthquake frequency versus magnitude curve (b value)
This parameter is based on the so-called Gutenberg-Richter inverse power law for earthquake magnitude and frequency, which is expressed as

$\log_{10} N = a - bM$   (5)

where N is the number of events of magnitude M or greater, and a and b are constants. The parameter b (known in the earthquake prediction literature as the b-value) is the slope of the approximately linear plot between earthquake magnitude and the logarithm of the frequency of occurrence of events of equal or greater magnitude. As an example, Fig. 2 shows the plot of earthquake magnitude versus the logarithm of the frequency of occurrence of events of equal or greater magnitude for a sample set of data recorded in Southern California for the 10-year period between Jan 1st 1986 and Dec 31st 1995. The figure illustrates the Gutenberg-Richter inverse power law. The values of a and b can be calculated using linear least squares regression analysis as follows23:

$b = \frac{n \sum (M_i \log N_i) - \sum M_i \sum \log N_i}{(\sum M_i)^2 - n \sum M_i^2}$   (6)

$a = \sum (\log_{10} N_i + bM_i)/n$   (7)

where $M_i$ is the magnitude of the ith event, $N_i$ is the number of events with magnitude $M_i$ or greater, and n is the total number of seismic events.

Fig. 2. Earthquake magnitude versus the logarithm of the frequency of occurrence of events of equal or greater magnitude for a sample set of data recorded in southern California for the 10-year period between Jan 1st 1986 and Dec 31st 1995. (Axes: magnitude versus log10(frequency).)

(5) Summation of the mean square deviation from the regression line based on the Gutenberg-Richter inverse power law (η value)
This parameter is defined based on the Gutenberg-Richter magnitude-frequency relationship as follows:

$\eta = \frac{\sum (\log_{10} N_i - (a - bM_i))^2}{n - 1}$   (8)

This is a measure of the conformance of the observed seismic data to the Gutenberg-Richter inverse power-law relationship. The lower the η value, the more likely it is that the observed distribution can be estimated using the inverse power law, whereas a high η value indicates higher randomness and the inappropriateness of the power law for describing the magnitude-frequency distribution.

(6) Magnitude deficit or the difference between the largest observed magnitude and the largest expected magnitude based on the Gutenberg-Richter relationship (∆M value)
This is defined as

$\Delta M = M_{max,observed} - M_{max,expected}$   (9)

where $M_{max,observed}$ is the maximum observed magnitude in the last n events and $M_{max,expected}$ is the maximum magnitude expected in the last n events based on the inverse power-law relationship. Since an event of the largest magnitude will likely occur only once among the n events, N = 1, log N = 0 and Eq. (5) yields

$M_{max,expected} = a/b$   (10)

(7) Mean time between characteristic or typical events (µ value)
This is the average time or gap observed between characteristic or typical events among the last n events. Several seismic zones, including the well-studied Parkfield, California, exhibit periodic trends in the gradual stress build-up and subsequent release through large earthquakes according to the elastic rebound hypothesis.57 For the Parkfield region, Kagan and Jackson42 found that the intervening times between large earthquakes are relatively constant. Such large earthquakes are known as characteristic events. In this context magnitudes are defined within a given range of approximation. For example, earthquakes of magnitude 7 to 7.5 are grouped together as one characteristic magnitude. Characteristic events should ideally be separated by approximately equal time periods. The mean time µ is then given by

$\mu = \sum t_i^{characteristic} / n_{characteristic}$   (11)

where $t_i^{characteristic}$ is the observed elapsed time between characteristic events of magnitude $M_i$ and $n_{characteristic}$ is the total number of characteristic events.

(8) Coefficient of variation of the mean time between characteristic events (µ), also known as the aperiodicity of the mean (c value)
This parameter is a measure of the closeness of the magnitude distribution of the seismic region to the characteristic distribution and is defined mathematically as

c = (standard deviation of the observed times) / µ   (12)

A high c value indicates a large difference between the calculated mean time and the observed mean time between characteristic events, and vice versa.
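The eight indicators can be computed directly from a windowed list of catalog events. The Python sketch below follows the definitions of Eqs. (1)–(12); the function name, the use of NumPy, and the simple magnitude cutoff used to pick out characteristic events are assumptions made for illustration and are not part of the original study.

```python
import numpy as np

def seismicity_indicators(times, mags, char_min_mag=6.0):
    """Compute the eight indicators of Eqs. (1)-(12) from the last n events.
    `times` are event times in days (ascending) and `mags` the corresponding
    Richter magnitudes. `char_min_mag` is an assumed cutoff used here to group
    'characteristic' events; the paper defines these per region instead."""
    times = np.asarray(times, dtype=float)
    mags = np.asarray(mags, dtype=float)
    n = len(mags)

    # Distribution-independent indicators, Eqs. (1)-(3)
    T = times[-1] - times[0]                      # elapsed time over the n events
    M_mean = mags.mean()                          # mean magnitude
    energy = 10.0 ** (11.8 + 1.5 * mags)          # seismic energy in ergs, Eq. (4)
    dE_sqrt = np.sqrt(energy).sum() / T           # rate of square root of energy

    # Gutenberg-Richter based indicators, Eqs. (5)-(10)
    Mi = np.sort(mags)
    N_i = np.array([(mags >= m).sum() for m in Mi], dtype=float)  # cumulative counts
    logN = np.log10(N_i)
    b = (n * (Mi * logN).sum() - Mi.sum() * logN.sum()) / (Mi.sum() ** 2 - n * (Mi ** 2).sum())  # Eq. (6)
    a = (logN + b * Mi).sum() / n                                  # Eq. (7)
    eta = ((logN - (a - b * Mi)) ** 2).sum() / (n - 1)             # Eq. (8)
    dM = mags.max() - a / b                                        # Eqs. (9)-(10)

    # Characteristic-distribution indicators, Eqs. (11)-(12)
    char_times = times[mags >= char_min_mag]
    if len(char_times) >= 2:
        gaps = np.diff(char_times)                # observed times between characteristic events
        mu = gaps.mean()                          # mean time, Eq. (11)
        c = gaps.std() / mu if mu > 0 else 0.0    # coefficient of variation, Eq. (12)
    else:
        mu, c = T, 0.0                            # fallback when too few characteristic events

    return T, M_mean, dE_sqrt, b, eta, dM, mu, c
```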

3. Neural Network Models for Categorical Earthquake Magnitude Prediction

There have been a number of efforts using neural networks to predict earthquake parameters during the past decade or so. Lakkos et al.49 used a feed-forward backpropagation (BP) neural network for predicting earthquake magnitude using variations of electrotelluric fluid (also called seismic electric signals), occurring between a few days to a few hours prior to major earthquakes, as input. The succeeding earthquake magnitude and epicentral depth are the predicted output. The BP neural network was trained using seismic electric signals recorded in Western Greece. The authors report predictions were within 0.5 Richter in magnitude and 0.30 in epicentral location when tested using data not part of the training dataset.

Leach and Dowla50 use velocity, amplitude, and spectral characteristics of primary or P waves as input in a BP network to predict the arrival time, intensity and duration of secondary S as well as Rayleigh and Love waves. When tested using earthquake wave data recorded in Landers, CA, the network yielded correlation coefficients in the range 0.6–0.9. However, the network can only begin prediction after the arrival of P waves, thus providing warning time in the order of seconds, which is usually too short for life-saving measures and initiating emergency management efforts.

Dai and Macbeth22 use the BP neural network to differentiate between P and S waves arriving at a recording station. Network training is done by converting P and S "noise bursts" into special training data based on their degree of polarization. The authors report 60–75% accuracy in identifying P and S waves when the model was tested using data from two different recording stations. It should be noted that in this work the neural network has been used as a classification tool to play the role of an expert seismologist and not for earthquake parameter prediction.

Ma et al.52 use the BP neural network to predict magnitudes of future large earthquakes in Northern and Southwestern China using six seismicity indicators, namely T, Mmean, dE1/2, b, η and ∆M, as network input. The model yielded an average R score, defined later in this paper, of 0.12 for Northern China and 0.10 for Southwest China. The seismic parameters used are limited to those based on the Gutenberg-Richter inverse power law, which is not applicable in all seismic regions.

Negarestani et al.55 use a BP neural network to differentiate environmentally induced changes in soil radon concentration from those caused by earthquakes. The authors propose the model as a tool for earthquake time prediction based on measuring soil radon concentration. The study reports improvement in results using neural networks compared with a linear time-series analysis when used on soil radon readings from a site in Thailand.

Kerh and Chu46 use seismic data recorded in the Kaohsiung region of Taiwan in a BP neural network to predict peak ground acceleration values. The network inputs are epicentral distance from measuring stations, focal depths, and magnitudes of past earthquakes. The neural network model yielded correlation coefficients in the range 0.6–0.98 compared with 0.6–0.75 obtained from other non-linear regression analysis techniques.

All neural network approaches for earthquake parameter prediction reported in the literature use the simple BP neural network, with the exception of one recent conference proceedings paper by Liu et al.51 who use an ensemble of radial basis function (RBF) neural networks for earthquake magnitude prediction using past earthquake magnitude data as network input. A bootstrap aggregating algorithm is used to combine and average predictions from the component neural networks. When tested on 30 past earthquakes recorded in China, the RBF ensemble yielded lower prediction errors than single networks for all 30 earthquakes.

Sharma and Arora60 use a multi-layer feed-forward BP network to model seismicity cycles in the Himalayan region. The authors divide the Himalayas into six seismogenic sub-regions. The recorded earthquake magnitude data for these six sub-regions between 1970 and 1998 are used as a time series in order to recognize patterns using the BP network. The authors report moderate success in predicting the 1999 earthquake of Sinchen, Tibet, with magnitude M = 5.4, where the time of occurrence was predicted within 100 days of the actual occurrence of the earthquake.

In the present work, three neural network models are presented and tested for predicting the largest earthquake magnitude during the following month in a seismic region using the eight seismicity indicators described earlier as input simultaneously. They are a feed-forward Levenberg-Marquardt backpropagation (LMBP) neural network, a recurrent neural network, and a radial basis function (RBF) neural network. For all three networks, the output is either 1, representing the occurrence of an earthquake of predefined threshold magnitude or greater during the following month, or 0 otherwise. Network operation is repeated by gradually increasing the threshold magnitude in increments of 0.5. The largest earthquake magnitude during the following month is the particular threshold magnitude above which the network output changes to 0.

3.1. Levenberg-Marquardt backpropagation network

Backpropagation was developed by generalizing the Widrow-Hoff learning rule to multi-layer networks using nonlinear differentiable transfer functions.13,59 The architecture of the BP neural network model for predicting the occurrence of an earthquake of threshold magnitude or greater during the following month is shown in Fig. 3.

Fig. 3. Architecture of the backpropagation neural network for predicting the occurrence of an earthquake of threshold magnitude or greater during the following month. (Eight-node input layer, one node per seismicity indicator; hidden layers; a single output node equal to 1 if an earthquake of magnitude equal to or greater than the threshold magnitude occurs, and 0 otherwise.)

Each training dataset corresponds to one month in the historical record and contains an eight by one input vector and a single output. Each input node represents one of the eight seismicity indicators calculated for each month for a predefined number of seismic events prior to that month using the definitions described in the previous section. The network output is obtained as

$O_{pi} = f\left(\sum_{j=1}^{n} s_i \cdot w_j\right)$   (13)

where $O_{pi}$ is the predicted occurrence (1) or non-occurrence (0) of an earthquake of threshold magnitude or greater during the ith month, $s_i$ is the 8 × 1 input vector of seismicity indicators for the ith month, $w_j$ is the weight vector associated with the jth hidden layer, f is the transfer function, and n is the total number of hidden layers. In this research the tan-sigmoid function was found to be the most suitable transfer function. The number of hidden layers and the number of nodes in each hidden layer are determined by trial-and-error to obtain the best results.
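The categorical output of Eq. (13) and the 0.5-increment threshold sweep described at the beginning of this section can be sketched as follows. The layer shapes, the 0.5 decision rule applied to the tan-sigmoid output, and the use of one trained network per threshold magnitude are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def tansig(x):
    # tan-sigmoid transfer function used in the paper
    return np.tanh(x)

def forward(indicators, weights):
    """Feed-forward pass in the spirit of Eq. (13): `indicators` is the 8-element
    vector s_i and `weights` a list of weight matrices, one per layer.
    Returns 1 if an event of the current threshold magnitude or greater is
    predicted for the following month, 0 otherwise."""
    a = np.asarray(indicators, dtype=float)
    for W in weights:
        a = tansig(W @ a)
    return int(a.item() > 0.5)      # assumed decision rule for the binary output

def largest_predicted_magnitude(indicators, networks, m_start=4.5, m_max=8.0):
    """Sweep the threshold magnitude in 0.5 increments; the largest predicted
    magnitude is the last threshold for which the output is still 1.
    `networks` maps each threshold magnitude to trained weights (one network
    per threshold is an assumption of this sketch)."""
    predicted = None
    m = m_start
    while m <= m_max and m in networks:
        if forward(indicators, networks[m]) != 1:
            break                    # output changed to 0; previous threshold is the answer
        predicted = m
        m = round(m + 0.5, 1)
    return predicted
```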

Backpropagation is essentially a gradient-descent algorithm in which the network weights are moved along the negative of the gradient of a performance or error function. The problem is formulated as an unconstrained optimization problem, that is, to minimize a performance function (z) defined as the mean square difference between the predicted and observed (desired) output:

$z = \frac{1}{N} \sum_{i=1}^{N} (O_{oi} - O_{pi})^2$   (14)

where $O_{oi}$ is the observed occurrence (1) or non-occurrence (0) of an event of threshold magnitude or greater during the ith month, and N is the total number of training datasets.

Batch training using gradient descent algorithms is often too slow for large or complicated problems.2 Faster training can be accomplished using other numerical optimization-based algorithms such as the conjugate gradient,2 quasi-Newton,17 and Levenberg-Marquardt34 algorithms, which attain convergence anywhere between 10 to 100 times faster than the standard BP algorithm. The Levenberg-Marquardt backpropagation algorithm is used as the learning rule in this work.

There are three basic parameters that determine the accuracy and training time for the Levenberg-Marquardt BP algorithm. They are (a) the number of iterations to be performed before training is stopped, or the number of epochs, (b) the value of the mean square error function at which training stops, and (c) the minimum magnitude of the gradient below which training stops.

3.2. Recurrent network

BP networks are also called static networks because the order in which training datasets are used in the network is immaterial. For instance, the seismicity indicators corresponding to each month in the historical record (input) and the observed occurrence or non-occurrence of an event of threshold magnitude or greater during that month (output) can be used in a simple BP network in any random order. In contrast to BP networks, recurrent networks have the capacity to retain past results by incorporating a time delay.24 Because of this ability to operate not only on the input space, but also on an internal state space, recurrent networks have been used in a number of problems involving time series, e.g., acoustic phonetic decoding of continuous speech27 and prediction of volatile stock trading trends.30 Temporal earthquake magnitude recordings also form time-series data and, as such, recurrent neural networks are suitable for predicting earthquake trends. To the best of the authors' knowledge, no research has been published on the prediction of earthquake parameters using recurrent neural networks.

The architecture of the recurrent network for predicting the occurrence of an earthquake of threshold magnitude or greater during the following month is shown in Fig. 4. Similar to the BP neural network, the input layer has eight nodes representing the eight seismicity indicators corresponding to a given month in the historical record. The number of hidden layers and the number of nodes in each hidden layer are determined by trial-and-error for best results.

Fig. 4. Architecture of the recurrent neural network developed for predicting the occurrence of an earthquake of threshold magnitude or greater during the following month. (Eight-node input layer, one node per seismicity indicator; hidden layer; recurrent layer whose output is fed back through a feedback connection; single output node equal to 1 if an earthquake of magnitude equal to or greater than the threshold magnitude occurs, and 0 if not.)

In a recurrent neural network, in every iteration, the network output is passed through a recurrent layer and the output of the recurrent layer is added to the output of the hidden layer; the sum is used as the argument of the transfer function to obtain the network output in the succeeding iteration as follows:

$O_{pi} = f\left[\sum_{j=1}^{n} s_i \cdot w_j + O_{p(i-1)} \cdot w_r\right]$   (15)

where $w_r$ is the vector of weights of the links connecting the nodes in the recurrent layer to the output node. Therefore, the network output is obtained as a function of not only the input vector, but also the predicted occurrence (1) or non-occurrence (0) of an earthquake of threshold magnitude or greater during the preceding time period (month). The Levenberg-Marquardt BP algorithm is used for training the network and finding the two sets of weights, $w_j$ and $w_r$, which minimize the mean square error function.
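A minimal sketch of the recurrent update of Eq. (15), assuming the previous month's binary prediction is fed back through a single recurrent weight; the variable names, the tan-sigmoid output node, and the 0.5 decision rule are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def recurrent_predictions(monthly_indicators, W_hidden, w_out, w_r):
    """Month-by-month categorical predictions in the spirit of Eq. (15).
    `monthly_indicators` is a sequence of 8-element vectors in chronological
    order, `W_hidden` the input-to-hidden weights, `w_out` the hidden-to-output
    weights and `w_r` the weight on the fed-back previous output."""
    outputs = []
    prev = 0.0                                   # assumed initial feedback value
    for s in monthly_indicators:
        hidden = np.tanh(W_hidden @ np.asarray(s, dtype=float))
        z = w_out @ hidden + w_r * prev          # hidden-layer output plus recurrent feedback
        o = np.tanh(z)                           # tan-sigmoid transfer function
        pred = int(o > 0.5)                      # occurrence (1) or non-occurrence (0)
        outputs.append(pred)
        prev = float(pred)                       # feed the prediction back for the next month
    return outputs
```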

3.3. Radial basis function (RBF) network

In an RBF neural network, the output is computed as a function of the Euclidean distance between the input vector and a prototype vector.3,5,13,43 The prototype vector is conveniently defined as the vector of weights of the links connecting the input layer to the hidden layer (or one hidden layer to the next hidden layer). Usually, RBF neural networks require more nodes than BP networks, but provide significantly improved results, especially when a large number of training datasets is available. Also, an RBF network can usually be trained in a fraction of the time required to train a BP network.3

The architecture of the RBF neural network for predicting the occurrence of an earthquake of threshold magnitude or greater during the following month is shown in Fig. 5. The network has an eight-node input layer and a single output. In most applications of RBF neural networks, one hidden layer is found to be sufficient. In this research, however, we also investigate the improvement in prediction accuracy using more than one hidden layer. The Gaussian function, which is the most commonly used radial basis transfer function, is used to compute the network output as

$O_{pi} = \Phi(s_i - w_j) = \sum_{j=1}^{n} e^{-\left\|\frac{s_i - w_j}{\sigma_j}\right\|^2}$   (16)

where $\sigma_j$ is known as a width factor associated with the jth hidden layer. Theoretically, $w_j$ and $\sigma_j$ determine the center and spread of the Gaussian bell curve, respectively. In this research, it was observed through numerical experimentation that the width factor σ does not contribute to the network's ability for function approximation and therefore its value is set equal to one. The Levenberg-Marquardt training algorithm is used to minimize the mean-square error function.
basis transfer function, is used to compute network months during which an earthquake of threshold

Fig. 5. Architecture of the radial basis function neural network for predicting the occurrence of an earthquake of threshold magnitude or greater during the following month. (Eight-node input layer, one node per seismicity indicator; a hidden layer that operates on the Euclidean distance between the input vector and the vector of weights of the links connecting the input and hidden layers using the Gaussian transfer function; a single output node equal to 1 if an earthquake of magnitude equal to or greater than the threshold magnitude occurs, and 0 if not.)

4. Prediction Verification and Hypothesis Testing

Earthquake parameter prediction is a complicated problem involving a large number of unknown variables. No prediction model is suitable for all seismic regions. Emergency management organizations such as the Federal Emergency Management Agency in the U.S. base their public warning and evacuation procedures on the probability associated with a prediction. In this research, the prediction accuracies obtained from the three different neural network models are compared using four different statistical measures: the probability of detection (POD), false alarm ratio (FAR), frequency bias (FB), and R score.66 Prediction accuracies are also compared with the probability of detection for each magnitude obtained using the Poisson null hypothesis. Testing is limited to the prediction of earthquakes of Richter magnitudes 4.5 or greater, since seismic events of lower magnitudes do not have much structural engineering or emergency management significance.

The POD (or hit rate), FAR and FB are calculated for categorical predictions (such as the prediction of the occurrence of an earthquake of a threshold magnitude or greater) using the following equations:

$POD = \frac{N_{pc}}{N_{pc} + N_{ni}}$   (17)

$FAR = \frac{N_{pi}}{N_{pc} + N_{pi}}$   (18)

$FB = \frac{N_{pc} + N_{pi}}{N_{pc} + N_{ni}}$   (19)

where $N_{pc}$ (predicted-correct) is the number of months during which an earthquake of threshold magnitude or greater occurred and was predicted; $N_{pi}$ (predicted-incorrect) is the number of months during which an earthquake of threshold magnitude or greater did not occur but was predicted; $N_{nc}$ (not predicted-correct) is the number of months during which an earthquake of threshold magnitude or greater did not occur and was not predicted; and $N_{ni}$ (not predicted-incorrect) is the number of months during which an earthquake of threshold magnitude or greater occurred but was not predicted. These variables are summarized in Table 4.

Table 4. Parameters used for the computation of POD, FAR, FB and R scores. "Yes" denotes the predicted or observed occurrence of an earthquake of threshold magnitude or greater and "No" denotes the predicted or observed non-occurrence of an earthquake of threshold magnitude or greater.

Observed \ Predicted   Yes    No
Yes                    Npc    Nni
No                     Npi    Nnc

Another commonly used forecast verification method is computing the so-called skill scores. Skill scores are a measure of the skill (or competence) of the prediction tool (neural network modeling in this case) in predicting a particular parameter (earthquake magnitude in this case).56 In this research, the Hanssen-Kuiper skill score, also known as the real skill or R score, is used. It is defined as the difference between the probability of detection and the false alarm ratio for each predicted magnitude:

$R = POD - FAR = \frac{N_{pc}}{N_{pc} + N_{ni}} - \frac{N_{pi}}{N_{pc} + N_{pi}}$   (20)

The R score is −1 if no correct predictions are made and +1 if all predictions are correct. This score is considered advantageous over POD and FAR because it includes an equal representation of both correct and incorrect predictions.

Further, the models are evaluated using a hypothesis verification technique: the probabilities of detection calculated for different predicted magnitudes using the three neural networks are compared with the probability of occurrence of those magnitudes based on the Poisson null hypothesis. Assuming that earthquakes occur at completely random intervals, the probability of an earthquake of a certain magnitude occurring during a given month based on the Poisson null hypothesis ($p_0$) is given by38

$p_0 = 1 - e^{-rt}$   (21)

where r is the frequency of occurrence of earthquakes of the particular magnitude in the historical record and t is the time period considered (for example, t = 1 month). A prediction model for an earthquake of a given magnitude is superior to the Poisson null hypothesis only if the POD for that magnitude is greater than $p_0$.

5. Parametric Analysis of Earthquake Prediction

For each neural network model, the effect of any input parameter on the prediction accuracies is investigated by removing that parameter from the input vector. In that case the input layer will have seven nodes. A comparison of the results obtained from the neural network with a 7-node input layer with those of the network with an 8-node input layer will provide the significance of the removed parameter. Significant parameters provide a larger POD and R score and a lower FB and FAR for predicted magnitudes. Parametric analysis in this research is divided into three main categories as follows.

5.1. Parametric analysis for determining the most suitable frequency-magnitude relationship

Parametric analysis of the seismicity indicators will help to determine what magnitude-frequency relationship is the most suitable for the seismic region under study. If b, η or ∆M are significant parameters, the Gutenberg-Richter inverse power law is probably the most suitable magnitude-frequency relationship for the region. On the other hand, if µ and c are significant parameters, the characteristic earthquake distribution is probably the most suitable for the region.

5.2. Parametric analysis for determining seismic quiescence patterns

If higher POD and R scores and lower FAR and FB are calculated when dE1/2 is part of the input vector compared to when it is not, it can be concluded that the seismic region exhibits seismic quiescence characteristics.

5.3. Parametric analysis for determining the optimum threshold magnitude and the size of the dataset for calculating seismicity indicators

The threshold magnitude and the number of seismic events of threshold magnitude or greater used for computing the seismicity indicators corresponding to each month in the historical record are varied to determine their statistical significance. For example, what is the relative statistical significance of 100 events of magnitude 4.0 or more compared with 200 events of magnitude 3.5 or more in computing the seismicity indicators corresponding to each month?

6. Example Applications

The Southern California Earthquake Data Center (SCEC), a joint venture between the United States Geological Survey (USGS) and the California Institute of Technology, maintains a list of earthquake recordings (time of occurrence, geographic location of the epicenter, depth and magnitude) for the region. Their catalog is available over the internet at www.scec.org, and is searchable for different magnitude, time, location and depth ranges. Table 5 shows sample readings from the SCEC catalog with search parameters, parameter values, and events that matched the search criteria.

Data recorded for two seismic regions obtained from the SCEC website is used to train and test the neural network models developed. The first region is southern California, described by the geographic coordinates 32 and 36 N latitude and 114 and 120 W longitude.69 The second region is the San Francisco bay area, which has been observed historically to have different seismicity patterns from southern California.39
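A hedged sketch of how a downloaded catalog might be reduced to the regional event lists used here; the CSV layout and column names are assumptions for illustration and do not reflect the actual SCEC file format.

```python
import pandas as pd

def load_region(csv_path, lat_range, lon_range, min_mag=2.5):
    """Filter a catalog file to one region and minimum magnitude. The columns
    date, latitude, longitude and magnitude are assumed names."""
    cat = pd.read_csv(csv_path, parse_dates=["date"])
    in_region = (
        cat["latitude"].between(*lat_range)
        & cat["longitude"].between(*lon_range)
        & (cat["magnitude"] >= min_mag)
    )
    return cat[in_region].sort_values("date").reset_index(drop=True)

# Southern California and San Francisco bay windows as defined in the text
# (longitudes written as negative degrees west).
socal = load_region("catalog.csv", (32.0, 36.0), (-120.0, -114.0))
sfbay = load_region("catalog.csv", (37.5, 40.0), (-123.5, -116.0))
```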

Table 5. Sample readings from the SCEC catalog showing search parameters, parameter values and events that matched the search criteria.

Search parameter   Parameter range                   Event 1    Event 2    Event 3
Magnitude          2.75–3.50                         3.20       2.82       2.97
Date, Time         05/15/2005, 00:00:00–23:59:59     13:29:56   16:14:08   18:34:46
Latitude           32°–36.5°                         33.45      34.36      32.35
Longitude          −114.75°–−121°                    −116.63    −116.91    −115.32
Epicentral depth   0 m–900 m                         196 m      215 m      36 m

In this research, the San Francisco bay region is defined by the geographic coordinates 37.5 and 40 N latitude and 116 and 123.5 W longitude (this region includes most of north-central California). The three neural network models are trained and tested separately using historical seismic data recorded in the two regions: southern California and the San Francisco Bay region as defined above.

For training the networks, the eight seismicity indicators corresponding to each month between January 1950 and December 1990 (four hundred ninety-two 8 × 1 input vectors, corresponding to 492 months) for each region are calculated and stored in respective two-dimensional arrays, where each column represents one set of seismicity indicators. The indicators are calculated from a predetermined number, say 100, of events of a certain threshold magnitude, say 4.5, or greater prior to each month. The desired network output, which is the observed occurrence (1) or non-occurrence (0) of an earthquake of threshold magnitude or greater during each of the 492 months for each of the two regions, is stored in respective one-dimensional arrays. For each region, these two arrays form 492 training datasets. Sample training datasets for the 12 months between January and December of 1994 for southern California, showing the eight-element input vectors and the corresponding desired outputs (M = 4.5 or greater), are presented in Table 6.

Convergence criteria for training each network include limiting the value of the mean square error function to 0.001 and the maximum number of iterations to 1000. Network operation is repeated by increasing the threshold magnitude in increments of 0.5 to obtain the largest predicted magnitude.
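The monthly training-set construction described above can be sketched as follows, reusing the seismicity_indicators() function sketched earlier; the pandas-based month handling and the column names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def build_training_set(catalog, start="1950-01", end="1990-12",
                       n_events=100, event_threshold=4.5, target_threshold=4.5):
    """For each month: the 8 indicators computed from the last `n_events` events
    of magnitude `event_threshold` or greater before the month begins (input),
    and whether an event of `target_threshold` or greater occurred during the
    month (output)."""
    months = pd.period_range(start, end, freq="M")
    strong = catalog[catalog["magnitude"] >= event_threshold]
    X, y = [], []
    for month in months:
        month_start = month.to_timestamp()
        prior = strong[strong["date"] < month_start].tail(n_events)
        if len(prior) < n_events:
            continue                                   # not enough history yet
        t_days = (prior["date"] - prior["date"].iloc[0]).dt.days.to_numpy()
        X.append(seismicity_indicators(t_days, prior["magnitude"].to_numpy()))
        during = catalog[(catalog["date"] >= month_start)
                         & (catalog["date"] < (month + 1).to_timestamp())]
        y.append(int((during["magnitude"] >= target_threshold).any()))
    return np.array(X), np.array(y)
```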

Table 6. Sample training datasets for the 12 months between January and December of 1994 for southern California showing eight-element input vectors and the corresponding desired outputs (M = 4.5 or greater).

Month    T (days)   b      η      ∆M     Mmean   dE1/2 (×10^19 ergs)   µ (days)   c      Desired output
Jan 94   1875       0.89   0.34   1.48   4.41    0.023                 28         0.50   1
Feb 94   1688       0.91   0.28   2.50   4.99    0.015                 35         0.37   0
Mar 94   1753       0.80   0.15   0.44   5.10    0.009                 17         0.55   1
Apr 94   1225       0.72   0.56   0.58   5.15    0.067                 38         0.78   1
May 94   1311       0.77   0.48   0.63   4.37    0.004                 55         0.25   0
Jun 94   988        0.92   0.18   2.48   4.10    0.101                 64         0.12   1
Jul 94   1028       0.83   0.67   1.14   4.01    0.001                 58         0.87   0
Aug 94   1191       0.98   0.59   0.35   4.85    0.055                 29         0.66   1
Sep 94   1226       0.99   0.46   0.88   4.27    0.126                 37         0.54   0
Oct 94   1365       0.92   0.80   1.60   4.09    0.005                 44         0.68   1
Nov 94   1028       0.88   0.15   1.71   4.66    0.098                 19         0.81   0
Dec 94   988        0.78   0.29   1.01   4.58    0.085                 34         0.78   1

6.1. Best network architecture

For each seismic region, all three neural networks are initially modeled with one hidden layer (and one recurrent layer in the case of the recurrent neural network) with four nodes in each layer, using the tan-sigmoid transfer function. Once trained, the networks with this base architecture are used to predict the occurrence of an earthquake of magnitude 4.5 or greater for the nine months between January and September of 2005, using the 8 × 1 input vectors of seismicity indicators for these months as input. The R scores for the nine predictions are computed for each of the three neural network models. The best network architectures are obtained by gradually increasing the number of hidden layers and the number of nodes in each hidden layer until the increase in R scores by doing so is less than 0.01. The best architectures for the BP and recurrent neural network models were found to be the same for both seismic regions, whereas different best architectures were found in the case of the RBF neural network model for the two regions.

Since the chronological order of the input has no significance in BP and RBF networks, network weights are updated after all 492 training datasets are applied. The best architecture for the BP neural network has two hidden layers, each with eight nodes. The best RBF network architecture has one hidden layer with eight nodes for the Southern California region and two hidden layers, each with eight nodes, for the San Francisco Bay region. The chronological order of the input data is maintained in the recurrent neural network, and network weights are adjusted after each training dataset is applied. The best recurrent neural network architecture has one hidden layer with eight nodes and one recurrent layer with four nodes.

6.2. Prediction results

The neural network models were tested by comparing the network output with the observed earthquake magnitudes for the 177 months between January 1991 and September 2005 (the testing period). The accuracy of each network in predicting large, moderate, and small earthquakes is studied using the statistical measures described in an earlier section.

Southern California

The computed values of POD, FAR, FB, and R score for different magnitudes using each of the three neural network models are summarized in Table 7. This table also includes the values of the probabilities of occurrence of earthquakes of different magnitudes in a given month based on the Poisson null hypothesis (p0).

(a) Prediction of large earthquakes (Magnitude 6.5 or greater)

There were two months in the testing period during which an earthquake of magnitude greater than 7.0 occurred (June 1992 and October 1999). The BP neural network did not predict either of these earthquakes and did not make any false prediction of an earthquake of magnitude 7.0 or greater, yielding a POD and FAR of 0.0. The recurrent neural network correctly predicted both of these earthquakes, yielding a POD of 1.0, and did not falsely predict any earthquake of magnitude 7.0 or greater, yielding an FAR of 0.00. The RBF neural network correctly predicted the event of June 1992, but did not predict the event of October 1999, yielding a POD of 0.5. The network also falsely predicted an earthquake of magnitude 7.0 or greater during one month (March 2000), yielding an FAR of 0.5.

There was a single month in the testing period during which an earthquake of magnitude between 6.5 and 7.0 occurred (January 1994). The BP neural network failed to predict this event, yielding a POD of 0.0. Both the recurrent neural network and the RBF neural network correctly predicted this event, yielding a POD of 1.0. None of the networks falsely predicted the occurrence of an earthquake of magnitude between 6.5 and 7.0, yielding an FAR of 0.0.

Table 7. Computed values of p0, POD, FAR, FB, and R scores for different magnitudes using the three neural network models for southern California.

Magnitude   p0      Backpropagation network         Recurrent network               Radial-basis network
                    POD    FAR    FB     R score    POD    FAR    FB     R score    POD    FAR    FB     R score
4.5         0.301   0.52   0.44   0.86   0.08       0.67   0.31   0.90   0.36       0.44   0.45   0.90   −0.01
5.0         0.131   0.46   0.33   0.68   0.13       0.80   0.29   1.00   0.51       0.64   0.27   0.79   0.37
5.5         0.052   0.50   0.50   1.00   0.00       0.75   0.25   1.00   0.50       0.75   0.40   0.86   0.35
6.0         0.021   0.00   0.00   0.00   0.00       0.00   0.00   0.00   0.00       0.00   0.00   0.00   0.00
6.5         0.011   0.00   0.00   0.00   0.00       1.00   0.00   1.00   1.00       1.00   0.00   1.00   1.00
7.0         0.003   0.00   0.00   0.00   0.00       1.00   0.00   1.00   1.00       0.50   0.50   1.00   0.00

(b) Prediction of moderate earthquakes (Magnitude 5.5 or greater but less than 6.5)

There was a single month in the testing period during which an earthquake of magnitude between 6.0 and 6.5 occurred (April and June 1992). The BP neural network did not predict this earthquake and did not make any false prediction of an earthquake of magnitude between 6.0 and 6.5, yielding a POD and FAR of 0.00. The recurrent network did not predict this event, yielding a POD of 0.0, and did not falsely predict any earthquake of magnitude between 6.0 and 6.5, yielding an FAR of 0.00. The RBF neural network did not predict the event, yielding a POD of 0.00, and did not falsely predict any event of magnitude between 5.5 and 6.5, yielding an FAR of 0.00.

There were four months in the testing period during which an earthquake of magnitude between 5.5 and 6.0 occurred. The BP network correctly predicted two of these events, yielding a POD of 0.50, and falsely predicted earthquakes of magnitude between 5.5 and 6.0 during two months, yielding an FAR of 0.50. The recurrent network correctly predicted three of these earthquakes, yielding a POD of 0.75, and falsely predicted earthquakes of magnitude between 5.5 and 6.0 during one month only, yielding an FAR of 0.25. The RBF neural network correctly predicted three of the four events, yielding a POD of 0.75, and incorrectly predicted earthquakes of magnitude between 5.5 and 6.0 during two months, yielding an FAR of 0.40.

(c) Prediction of small earthquakes (Magnitude 4.5 or greater but less than 5.5)

There were 25 months in the testing period during which an earthquake of magnitude between 5.0 and 5.5 occurred. The BP neural network correctly predicted 12 of these events, yielding a POD of 0.46, and falsely predicted earthquakes of magnitude between 5.0 and 5.5 during six months, yielding an FAR of 0.33. The recurrent network correctly predicted 18 of the 25 events, yielding a POD of 0.72, and falsely predicted earthquakes of magnitude between 5.0 and 5.5 during eight months, yielding an FAR of 0.30. The RBF neural network correctly predicted 16 of the 25 events, yielding a POD of 0.64, and falsely predicted earthquakes of magnitude between 5.0 and 5.5 during six months, yielding an FAR of 0.27.

There were 27 months in the testing period during which an earthquake of magnitude between 4.5 and 5.0 occurred. The BP neural network correctly predicted 14 of these events, yielding a POD of 0.52, and falsely predicted earthquakes of magnitude between 4.5 and 5.0 during 11 months, yielding an FAR of 0.44. The recurrent neural network correctly predicted 18 of the 27 events, yielding a POD of 0.67, and falsely predicted earthquakes of magnitude between 4.5 and 5.0 for eight months, yielding an FAR of 0.31. The RBF neural network correctly predicted 12 of the 27 events, yielding a POD of 0.44, and falsely predicted earthquakes of magnitude between 4.5 and 5.0 for nine months, yielding an FAR of 0.45.

(d) Parametric Analysis

Parametric studies are performed by removing the seismicity indicators one at a time. The computed values of the R score for prediction of earthquakes of magnitude 5.0, 6.0 and 7.0 using the three neural network models are presented in Table 8. It is observed that for the southern California region, seismicity indicators obtained from the Gutenberg-Richter earthquake magnitude-frequency relationship (b, η, and ∆M) have very little impact on the computed R score and therefore the prediction accuracy, whereas indicators based on the characteristic earthquake distribution (µ and c) have a significant impact on the prediction accuracy. It is therefore concluded that the characteristic earthquake distribution model better suits seismic patterns in southern California. The indicator corresponding to seismic energy (dE1/2) has significantly less effect on the R score of the predicted magnitudes for the southern California region than for the San Francisco bay region (to be discussed in the next section), therefore suggesting that the San Francisco region shows significantly more seismic quiescence patterns.

To determine the optimum value of the threshold magnitude and the number of events used to calculate the seismicity indicators, the three neural networks are trained and tested and the results are compared for three cases: 200 events of magnitude 3.5 or more, 100 events of magnitude 4.5 or more, and 50 events of magnitude 5.0 or more prior to each month. No substantial improvement in prediction results was observed as a result of considering smaller-magnitude events in computing the seismicity indicators, as summarized in Table 9.
March 14, 2007 17:42 00089 FA2

Neural Network Models for Earthquake Magnitude Prediction Using Multiple Seismicity Indicators 27
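For clarity, the detection statistics quoted throughout this section are consistent with the usual contingency-table definitions: the POD is the fraction of observed events in a magnitude range that were correctly predicted, and the FAR is the fraction of issued predictions in that range that did not verify. The short Python sketch below illustrates the calculation; the function names and the worked example are illustrative only and are not part of the software used in this study.

    # Illustrative sketch (not the authors' code): verification statistics
    # for one magnitude range.
    #   hits         - months with an observed event that was correctly predicted
    #   misses       - months with an observed event that was not predicted
    #   false_alarms - months with a prediction but no observed event

    def pod(hits, misses):
        """Probability of detection: fraction of observed events that were predicted."""
        observed = hits + misses
        return hits / observed if observed else 0.0

    def far(hits, false_alarms):
        """False alarm ratio: fraction of issued predictions that did not verify."""
        predicted = hits + false_alarms
        return false_alarms / predicted if predicted else 0.0

    # Example: recurrent network, southern California, 5.5 <= M < 6.0
    # (three of four observed events predicted, one false alarm).
    print(pod(3, 1))   # 0.75
    print(far(3, 1))   # 0.25

The R score reported in Tables 8 and 9 follows the definition given earlier in the paper and is not reproduced in this sketch.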

Table 8. Computed values of R scores for different predicted magnitudes, using different input vectors obtained by removing one parameter at a time for the southern California region.

Predicted    Parameter removed      BP       Recurrent    RBF
magnitude    from input vector

5.0          None                   0.13     0.51         0.37
             b                      0.13     0.44         0.33
             η                      0.11     0.42         0.31
             ∆M                     0.14     0.33         0.28
             µ                      0.01     0.27         0.12
             c                      0.04     0.20         0.22
             dE^1/2                 0.11     0.39         0.32

6.0          None                   0.00     0.00         0.00
             b                      0.00     0.00         0.00
             η                      0.00     0.00         0.00
             ∆M                     0.00     0.00         0.00
             µ                      −1.00    −1.00        −1.00
             c                      0.00     0.00         −1.00
             dE^1/2                 −1.00    0.00         0.00

7.0          None                   0.00     1.00         0.00
             b                      0.00     1.00         0.00
             η                      0.00     1.00         0.00
             ∆M                     0.00     1.00         0.00
             µ                      −1.00    0.00         0.00
             c                      −1.00    0.00         −1.00
             dE^1/2                 0.00     0.00         0.00
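The leave-one-out study summarized in Table 8 can be expressed compactly as a loop over the input columns. The sketch below is illustrative only: train_network and r_score are hypothetical placeholders for the training and evaluation steps described earlier in the paper, and the indicator list simply names the six indicators that appear in Table 8.

    # Illustrative sketch (not the authors' code): leave-one-indicator-out
    # ablation of the kind summarized in Table 8.
    import numpy as np

    INDICATORS = ["b", "eta", "dM", "mu", "c", "dE_sqrt"]   # columns of X, as in Table 8

    def ablation_scores(X, y, train_network, r_score, magnitude=5.0):
        """Map each removed indicator to the resulting R score (placeholder callables assumed)."""
        scores = {"None": r_score(train_network(X, y), X, y, magnitude)}
        for j, name in enumerate(INDICATORS):
            X_reduced = np.delete(X, j, axis=1)      # drop one input column
            model = train_network(X_reduced, y)      # retrain on the reduced inputs
            scores[name] = r_score(model, X_reduced, y, magnitude)
        return scores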

Table 9. Computed values of R scores for different predicted magnitudes, using input vectors (seismicity indicators) calculated from different threshold magnitudes and numbers of events of threshold magnitude or greater for the southern California region.

Predicted    Threshold    No. of    BP       Recurrent    RBF
magnitude    magnitude    events

5.0          3.5          200       0.16     0.50         0.35
             4.5          100       0.13     0.51         0.37
             5.5          50        0.12     0.48         0.29

6.0          3.5          200       0.00     0.00         0.00
             4.5          100       0.00     0.00         0.00
             5.5          50        0.00     0.00         0.00

7.0          3.5          200       0.00     1.00         0.00
             4.5          100       0.00     1.00         0.00
             5.5          50        0.00     1.00         0.00
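The three cases in Table 9 differ only in how the event window preceding each month is selected. A sketch of that selection step is given below; the catalog format (parallel arrays of origin times and magnitudes, ordered by time) is an assumption made for illustration and is not a description of the catalogs used in this study.

    # Illustrative sketch (not the authors' code): select the last n_events
    # earthquakes of magnitude >= threshold that occurred before a given
    # month, e.g. threshold = 4.5 and n_events = 100 as in Table 9.
    import numpy as np

    def events_before_month(times, mags, month_start, threshold, n_events):
        """times and mags are assumed sorted by origin time."""
        mask = (times < month_start) & (mags >= threshold)
        idx = np.where(mask)[0][-n_events:]          # most recent qualifying events
        return times[idx], mags[idx]

The seismicity indicators for the month would then be computed from the returned subset of the catalog.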

San Francisco bay region

(a) Prediction of large earthquakes (Magnitude 6.5 or greater)
There was no month in the testing period during which an earthquake of magnitude greater than 6.5 occurred. Therefore the POD for all three networks for M = 6.5 and M = 7.0 is 0.00. None of the networks falsely predicted earthquakes of magnitude greater than 6.5, yielding an FAR of 0.00 for M = 6.5 and M = 7.0.

(b) Prediction of moderate earthquakes (Magnitude 5.5 or greater but less than 6.5)
There was no month in the testing period during which an earthquake of magnitude between 6.0 and 6.5 occurred. Therefore the POD for all three networks is 0.00 for M = 6.0. The BP neural network falsely predicted seven events of magnitude between 6.0 and 6.5, yielding an FAR of 1.00. The recurrent network did not falsely predict any event of magnitude between 6.0 and 6.5, yielding an FAR of 0.00. The RBF neural network falsely predicted three events of magnitude between 6.0 and 6.5, yielding an FAR of 1.00 for M = 6.0.

There was also no month in the testing period during which an earthquake of magnitude between 5.5 and 6.0 occurred. Therefore the POD for all three networks is 0.00 for M = 5.5. The BP neural network falsely predicted 18 events of magnitude between 5.5 and 6.0, yielding an FAR of 1.00. The recurrent network did not falsely predict any earthquake of magnitude between 5.5 and 6.0, yielding an FAR of 0.00. The RBF network falsely predicted 10 earthquakes of magnitude between 5.5 and 6.0, yielding an FAR of 1.00 for M = 5.5.

(c) Prediction of small earthquakes (Magnitude 4.5 or greater but less than 5.5)
There were four months in the testing period during which an earthquake of magnitude between 5.0 and 5.5 occurred. The BP neural network did not predict any of these events, yielding a POD of 0.00, and falsely predicted an earthquake of magnitude between 5.0 and 5.5 for three months, yielding an FAR of 1.00 for M = 5.0. The recurrent neural network correctly predicted two of the four events, yielding a POD of 0.50, and falsely predicted an earthquake of magnitude between 5.0 and 5.5 during one month, yielding an FAR of 0.33 for M = 5.0. The RBF neural network predicted one of the four events, yielding a POD of 0.25, and falsely predicted an earthquake of magnitude between 5.0 and 5.5 for one month, yielding an FAR of 0.50 for M = 5.0.

There were eight months during which an earthquake of magnitude between 4.5 and 5.0 was recorded. The BP neural network predicted two of these events, yielding a POD of 0.25, and falsely predicted earthquakes of magnitude between 4.5 and 5.0 during two months, yielding an FAR of 0.50 for M = 4.5. The recurrent neural network predicted five of the eight events, yielding a POD of 0.63, and falsely predicted earthquakes of magnitude between 4.5 and 5.0 for four months, yielding an FAR of 0.44 for M = 4.5. The RBF neural network correctly predicted four of the eight events, yielding a POD of 0.50, and falsely predicted two earthquakes of magnitude between 4.5 and 5.0, yielding an FAR of 0.33 for M = 4.5.

(d) Parametric Analysis
Similar to the Southern California region, parametric studies are performed by removing the seismicity indicators one at a time, and in each case R scores are computed for different magnitudes. It is observed that the prediction accuracies are influenced by seismicity indicators based on the Gutenberg-Richter earthquake magnitude-frequency distribution (b, η, and ∆M) and are not significantly altered by seismicity indicators based on the characteristic earthquake distribution model (µ and c), leading to the conclusion that the Gutenberg-Richter earthquake magnitude-frequency relationship is better suited to the San Francisco bay region than the characteristic earthquake distribution. The R scores are significantly reduced if dE^1/2 is removed from the input vector, leading to the conclusion that the San Francisco bay region shows strong seismic quiescence trends.

As in the case of the southern California region, it is observed that no major improvement in prediction accuracies is achieved as a result of considering large numbers of low-magnitude events for computing the seismicity indicators.

7. Final Comments

Earthquake parameter prediction is a highly complex problem because it involves a large number of variables whose effects are not completely understood. Neural networks have not been used extensively for earthquake parameter prediction, even though they have been used as an effective prediction tool in a variety of fields from economics to signal processing. In this paper, three neural network models are presented for earthquake magnitude prediction using eight mathematically computed seismic parameters as input. The goal is to predict the magnitude of the largest earthquake in a following period, for example, one month.
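As a concrete illustration of this prediction target, the sketch below groups an earthquake catalog by calendar month and extracts the largest observed magnitude in each month, which is the quantity compared against the predictions in Table 10. The pandas-based catalog format and the tiny synthetic example are assumptions made for illustration only.

    # Illustrative sketch (not the authors' code): the monthly prediction
    # target is the largest observed magnitude in each calendar month.
    import pandas as pd

    def monthly_largest(catalog):
        """catalog: DataFrame with 'time' (datetime) and 'mag' columns (assumed format)."""
        month = catalog["time"].dt.to_period("M")
        return catalog.groupby(month)["mag"].max()

    # Example with a small synthetic catalog.
    catalog = pd.DataFrame({
        "time": pd.to_datetime(["1992-06-28", "1992-06-29", "1992-07-11"]),
        "mag":  [7.3, 6.5, 5.7],
    })
    print(monthly_largest(catalog))   # 1992-06 -> 7.3, 1992-07 -> 5.7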
Overall, the recurrent neural network model yields the best prediction accuracies compared with the BP and RBF networks. This result is consistent with the fact that recurrent neural networks have the inherent capacity to model time-series data better than other network types. The study most closely related to the present work was undertaken by Ma et al.,52 where six seismicity indicators were used as input to a BP neural network to predict the magnitude of the largest earthquake in the following month. However, the indicators were limited to those based on the Gutenberg-Richter magnitude-frequency relationship alone; their model is therefore not applicable in regions where seismicity data do not follow this relationship. When tested on the two seismic regions, all three neural network models developed in this research yielded larger and therefore superior R scores (Table 7) than those reported by Ma et al. (0.10 for Northern China and 0.12 for Southwestern China).

The following results from the parametric analysis are in agreement with the general consensus in the field of seismology:

1. The San Francisco bay region shows seismic quiescence characteristics.
2. The characteristic earthquake distribution is suited to model seismic data in Southern California.
3. The occurrence of large earthquakes has a significant influence on future seismic activity in a region, whereas the occurrence (even in large numbers) of small earthquakes usually does not influence future seismic activity.

It appears that the networks make false predictions for months immediately preceded or followed by significantly high seismic activity. Also, since only one event is predicted per month, the networks tend to miss significant aftershocks and pre-shocks, as these usually occur within 30 days of the main shock. It is expected that the results can be improved by computing seismicity indicators for time periods shorter than one month and for smaller regions.

The observed and predicted largest events using the recurrent neural network are compared for each month in the testing dataset for Southern California in Table 10. This table also presents the observed latitude and longitude of the largest earthquake during each month.
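Table 10 below lists the month-by-month predictions of the recurrent network. As an illustration of why a recurrent architecture can exploit the temporal ordering of the monthly indicator vectors, a minimal Elman-style forward pass is sketched here; the layer sizes and weights are placeholders, and this is not the architecture or training procedure used in this study.

    # Illustrative sketch (not the authors' code): forward pass of a simple
    # Elman-type recurrent network.  The hidden (context) state carries
    # information from earlier months forward in time.
    import numpy as np

    def elman_forward(X, W_in, W_rec, W_out, b_h, b_o):
        """X: (n_months, n_indicators) sequence of monthly indicator vectors."""
        h = np.zeros(W_rec.shape[0])                 # context state
        outputs = []
        for x in X:                                  # one step per month
            h = np.tanh(W_in @ x + W_rec @ h + b_h)
            outputs.append(W_out @ h + b_o)          # predicted magnitude for the next month
        return np.array(outputs)

    # Placeholder sizes: eight indicators, 12 hidden units, one output.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(24, 8))                     # two years of monthly indicator vectors
    W_in = rng.normal(scale=0.1, size=(12, 8))
    W_rec = rng.normal(scale=0.1, size=(12, 12))
    W_out = rng.normal(scale=0.1, size=(1, 12))
    print(elman_forward(X, W_in, W_rec, W_out, np.zeros(12), np.zeros(1)).shape)   # (24, 1)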
Table 10. Observed largest earthquake and corresponding location compared with predicted magnitude for each month in the testing dataset for Southern California using the recurrent neural network. Blank cells indicate lack of observed or predicted events of magnitude 4.5 or greater during the corresponding month.

Month    Observed Mmax    Latitude    Longitude    Predicted Mmax
Jan-91   —       —        —          —
Feb-91   —       —        —          —
Mar-91   —       —        —          —
Apr-91   —       —        —          —
May-91   —       —        —          —
Jun-91   5.8     34.27    −117.99    5.5
Jul-91   —       —        —          —
Aug-91   —       —        —          —
Sep-91   5.01    35.81    −119.42    —
Oct-91   —       —        —          —
Nov-91   —       —        —          —
Dec-91   4.68    35.20    −116.68    4.5
Jan-92   —       —        —          —
Feb-92   —       —        —          —
Mar-92   —       —        —          —
Apr-92   6.1     33.96    −116.32    5.5
May-92   4.98    33.94    −116.3     4.5
Jun-92   7.3     34.2     −116.44    7
Jul-92   5.67    35.21    −118.07    5.5
Aug-92   5.23    34.2     −116.86    5.0
Sep-92   5.26    34.06    −116.36    5.0
Oct-92   4.59    34.6     −116.64    —
Nov-92   5.29    34.34    −116.9     5.0
Dec-92   5.26    34.37    −116.9     5.0
Jan-93   —       —        —          —
Feb-93   4.5     35.03    −116.97    —
Mar-93   —       —        —          —
Apr-93   —       —        —          —
May-93   5.19    35.15    −119.1     5.0
Jun-93   —       —        —          —
Jul-93   —       —        —          —
Aug-93   5       34.03    −116.32    —
Sep-93   —       —        —          —
Oct-93   —       —        —          —
Nov-93   4.64    35.97    −120.00    4.5
Dec-93   —       —        —          —
Jan-94   6.7     34.21    −118.54    6.5
Feb-94   —       —        —          —
Mar-94   5.24    34.23    −118.48    5.0
Apr-94   4.78    34.19    −117.1     —
May-94   —       —        —          —
Jun-94   4.97    34.27    −116.4     4.5
Jul-94   —       —        —          —
Aug-94   4.85    34.64    −116.52    4.5
Sep-94   —       —        —          —
Oct-94   4.50    34.28    −116.42    —
Nov-94   —       —        —          —
Dec-94   4.88    35.91    −120.00    4.5
Jan-95   —       —        —          —
Feb-95   —       —        —          —
Mar-95   —       —        —          —
Apr-95   —       —        —          —
May-95   4.77    33.91    −116.29    4.5
Jun-95   5.02    34.39    −118.67    —
Jul-95   —       —        —          —
Aug-95   5.36    35.78    −117.66    5.0
Sep-95   5.75    35.76    −117.64    5.5
Oct-95   —       —        —          —
Nov-95   —       —        —          —
Dec-95   —       —        —          —
Jan-96   5.17    35.76    −117.65    5.0
Feb-96   —       —        —          —
Mar-96   —       —        —          —
Apr-96   —       —        —          —
May-96   —       —        —          —
Jun-96   —       —        —          —
Jul-96   —       —        —          —
Aug-96   —       —        —          —
Sep-96   —       —        —          —
Oct-96   —       —        —          —
Nov-96   5.3     36       −117.65    5.0
Dec-96   —       —        —          —
Jan-97   —       —        —          —
Feb-97   —       —        —          —
Mar-97   5.26    34.97    −116.82    5.0
Apr-97   5.07    34.37    −118.67    —
May-97   —       —        —          —
Jun-97   4.76    32.63    −118.11    4.5
Jul-97   4.89    33.4     −116.35    4.5
Aug-97   —       —        —          —
Sep-97   —       —        —          5.0
Oct-97   —       —        —          5.0
Nov-97   —       —        —          —
Dec-97   —       —        —          —
Jan-98   —       —        —          —
Feb-98   —       —        —          —
Mar-98   5.23    36       −117.64    5.0
Apr-98   —       —        —          —
May-98   —       —        —          —
Jun-98   —       —        —          —
Jul-98   4.75    35.95    −117.53    4.5
Aug-98   4.78    34.12    −116.93    —
Sep-98   —       —        —          —
Oct-98   4.82    34.32    −116.84    4.5
Nov-98   —       —        —          —
Dec-98   —       —        —          —
Jan-99   —       —        —          5.0
Feb-99   —       —        —          —
Mar-99   —       —        —          —
Apr-99   —       —        —          —
May-99   4.93    34.06    −116.37    4.5
Jun-99   4.92    32.38    −115.24    4.5
Jul-99   —       —        —          —
Aug-99   —       —        —          —
Sep-99   4.8     32.27    −115.23    4.5
Oct-99   7.1     34.59    −116.27    7
Nov-99   —       —        —          —
Dec-99   —       —        —          —
Jan-00   —       —        —          4.5
Feb-00   —       —        —          —
Mar-00   —       —        —          —
Apr-00   —       —        —          5.0
May-00   —       —        —          —
Jun-00   4.51    34.78    −116.3     —
Jul-00   —       —        —          —
Aug-00   —       —        —          —
Sep-00   —       —        —          4.5
Oct-00   —       —        —          —
Nov-00   —       —        —          —
Dec-00   —       —        —          —
Jan-01   —       —        —          —
Feb-01   5.13    34.29    −116.95    5.0
Mar-01   —       —        —          —
Apr-01   —       —        —          5.0
May-01   —       —        —          —
Jun-01   —       —        —          —
Jul-01   5.1     36.02    −117.87    5.0
Aug-01   —       —        —          —
Sep-01   —       —        —          —
Oct-01   5.09    33.51    −116.51    4.5
Nov-01   —       —        —          —
Dec-01   4.89    32       −115.01    —
Jan-02   —       —        —          —
Feb-02   5.7     32.32    −115.32    —
Mar-02   4.6     33.67    −119.33    4.5
Apr-02   —       —        —          —
May-02   —       —        —          5.0
Jun-02   4.87    36.69    −116.34    —
Jul-02   —       —        —          —
Aug-02   —       —        —          —
Sep-02   4.75    33.92    −117.78    4.5
Oct-02   4.77    34.8     −116.27    —
Nov-02   —       —        —          —
Dec-02   4.84    32.23    −115.8     4.5
Jan-03   4.54    35.32    −118.65    —
Feb-03   5.37    34.31    −116.85    5.0
Mar-03   4.64    34.36    −116.13    —
Apr-03   —       —        —          —
May-03   —       —        —          —
Jun-03   —       —        —          5.0
Jul-03   —       —        —          —
Aug-03   —       —        —          —
Sep-03   —       —        —          —
Oct-03   —       —        —          4.5
Nov-03   —       —        —          4.5
Dec-03   —       —        —          —
Jan-04   —       —        —          —
Feb-04   —       —        —          —
Mar-04   —       —        —          5.0
Apr-04   —       —        —          —
May-04   —       —        —          —
Jun-04   5.27    32.33    −117.92    5.5
Jul-04   —       —        —          —
Aug-04   —       —        —          —
Sep-04   5.04    35.39    −118.62    —
Oct-04   —       —        —          —
Nov-04   —       —        —          —
Dec-04   —       —        —          4.5
Jan-05   —       —        —          4.5
Feb-05   —       —        —          —
Mar-05   —       —        —          —
Apr-05   5.15    35.03    −119.18    5.0
May-05   —       —        —          —
Jun-05   5.2     33.53    −116.57    5.0
Jul-05   —       —        —          —
Aug-05   4.59    33.17    −115.64    4.5
Sep-05   5.11    33.16    −115.64    5.0

References

1. H. Adeli, Neural networks in civil engineering: 1989–2000, Computer Aided Civil and Infrastructure Engineering 16(2) (2001) 126–142.
2. H. Adeli and S. L. Hung, Machine Learning-Neural Networks, Genetic Algorithms and Fuzzy System (Wiley, New York, NY, 1995).
3. H. Adeli and A. Karim, Fuzzy-wavelet RBFNN model for freeway incident detection, Journal of Transportation Engineering, ASCE 126(6) (2000) 464–471.
4. H. Adeli and A. Karim, Construction Scheduling, Cost Optimization, and Management — A New Model Based on Neurocomputing and Object Technologies (Spon Press, London, UK, 2001).
5. H. Adeli and A. Karim, Wavelets in Intelligent Transportation Systems (John Wiley and Sons, New York, NY, 2005).
6. H. Adeli and J. Mohammadi, Seismic risk analysis based on Weibull distribution, in Proceedings of the Eighth World Conference on Earthquake Engineering, Vol. 1, San Francisco, July 21–28 (1984), pp. 191–198.
7. H. Adeli and H. S. Park, Neurocomputing for Design Automation (CRC Press, Boca Raton, FL, 1998).
8. H. Adeli and G. F. Sirca, Neural network model for uplift capacity of metal roof panels, Journal of Structural Engineering 127(11) (2001) 1276–1285.
9. K. Aki, Introduction to seismology for earthquake prediction, in Proceedings of the Seventh Workshop on Non-Linear Dynamics and Earthquake Prediction, September 29–October 11, Trieste, Italy (2003).
10. M. Anghel and Y. Ben-Zion, Non-linear system identification and forecasting of earthquake fault dynamics using artificial neural networks, in Proceedings of the Fall Meeting of the American Geophysical Union, December 10–14, San Francisco, CA (2001).
11. W. Bakun and A. Lindh, The Parkfield, CA earthquake prediction experiment, Science 229(4714) (1985) 619–624.
12. J. Barnes, An algorithm for solving non-linear equations based on the secant method, Computer Journal 8(1) (1965) 66–67.
13. C. Bishop, Neural Networks for Pattern Recognition (Oxford University Press, Oxford, UK, 1995).
14. D. Boore, Comparisons of ground motions from the 1999 Chi-Chi earthquake with empirical predictions largely based on data from California, Bulletin of Seismological Society of America 91(5) (2001) 1212–1217.
15. D. Brehm and L. Braile, Intermediate-term earthquake prediction using the modified time-to-failure method in the New Madrid seismic zone, Bulletin of Seismological Society of America 88(2) (1998) 564–580.
16. D. Brehm and L. Braile, Intermediate-term earthquake prediction using the modified time-to-failure method in Southern California, Bulletin of Seismological Society of America 89(1) (1999) 275–293.
17. C. G. Broyden, A class of methods for solving non-linear simultaneous equations, Mathematics of Computation 19(1) (1965) 577–593.
18. C. Bufe and D. Varnes, Predictive modeling of seismic cycle in the greater San Francisco bay region, Journal of Geophysical Research 98 (1993) 9871–9983.
19. Y. Chen, J. Liu, B. Tsai and C. Chen, Statistical tests for pre-earthquake ionospheric anomaly, Terrestrial Atmospheric and Oceanic Sciences 15(3) (2004) 385–396.
20. G. Chouliaras and G. Stavrakakis, Evaluating seismic quiescence in Greece, in Proceedings of the European Geophysical Society Joint Assembly, April 7–11, Nice, France (2003).
21. P. Cortez, M. Rocha and J. Neves, Evolving time-series forecasting neural network models, in Proceedings of the International Symposium on Adaptive Systems: Evolutionary Computation and Probabilistic Graphical Models, March 19–23, Havana, Cuba (2001), pp. 84–91.
22. H. Dai and C. Macbeth, Application of backpropagation neural networks to identification of seismic arrival types, Journal of Geophysical Research 102(B7) (1997) 15105–15113.
23. N. Draper and H. Smith, Applied Regression Analysis (Wiley, New York, NY, 1966).
24. J. L. Elman, Finding structure in time, Cognitive Science 14 (1999) 179–211.
25. W. Ellsworth, M. Mathews, R. Nadeau, S. Nishenko, P. Reasenberg and R. Simpson, A physically-based earthquake recurrence model for estimation of long-term earthquake probabilities, Workshop on Earthquake Recurrence: State-of-the-art and Directions for the Future, February 22–25, Rome, Italy (1999).
26. M. W. Firebaugh, Artificial Intelligence, A Knowledge Based Approach (Boyd and Fraser, Boston, MA, 1988).
27. F. Freitag and E. Monte, Acoustic-phonetic decoding based on Elman predictive neural networks, in Proceedings of the International Conference on Spoken Language Processing, October 3–6, Philadelphia, USA (1996), pp. 522–533.
28. A. Gabrielov, I. Zaliapin, W. Newman and V. Kelis-Borok, Colliding cascades model for earthquake prediction, Geophysical Journal International 143(2) (2000) 427–437.
29. R. J. Geller, D. D. Jackson, Y. Y. Kagan and F. Mulgaria, Earthquakes cannot be predicted, Science 275(5306) (1997) 1616–1617.
30. C. Giles, S. Lawrence and A. Tsoi, Noisy time-series prediction using a recurrent neural network and grammatical inference, Machine Learning 44(2) (2001) 161–183.
31. J. Gomez and A. Pacheco, The minimalist model of characteristic earthquakes as a useful tool for description of the recurrence of large earthquakes, Bulletin of Seismological Society of America 95(5) (2004) 1960–1967.
32. A. Gonzalez, M. Vasquez-Prada, J. Gomez and A. Pacheko, Using synchronization to improve earthquake forecasting in cellular automaton models, Condensed Matter 04(035) (2004) 93.
33. B. Gutenberg and C. F. Richter, Earthquake magnitude, intensity, energy and acceleration, Bulletin of the Seismological Society of America 46(1) (1956) 105–146.
34. M. T. Hagan, H. B. Demuth and M. Beale, Neural Network Design (PWS Publishing Company, Boston, MA, 1996).
35. R. A. Harris, Forecasts of the 1989 Loma Prieta, California earthquake, Bulletin of the Seismological Society of America 88(4) (1998) 898–916.
36. Y. Honkura, M. Matsushima, N. Oshiman, M. Tuncer, S. Baris, A. Ito, Y. Iio and A. Isikara, Small electric and magnetic signals observed before the arrival of seismic wave, Earth Planets Space 54(E) (2002) 9–12.
37. M. Ishimoto and K. Iida, Observations of earthquakes registered with the micro seismograph constructed recently, Bulletin of Earthquake Research, University of Tokyo 17 (1939) 443–478 [Translated from Japanese].
38. D. Jackson, Hypothesis testing and earthquake prediction, Earthquake Prediction: The Scientific Challenge, February 10–11, Irvine, CA (1995).
39. D. Jackson and Y. Y. Kagan, Parkfield earthquake: Not likely this year, Seismological Research Letters 69(2) (1998) 151.
40. S. Jaume, D. Weatherley and P. Mora, Accelerating moment release and the evolution of event time and size statistics, results from two cellular automaton models, Pure and Applied Geophysics 157(11) (2000) 2209–2226.
41. Y. Y. Kagan, VAN earthquake predictions, a statistical evaluation, Geophysical Research Letters 23(11) (1996) 1315–1318.
42. Y. Y. Kagan and D. Jackson, Long-term earthquake clustering, Geophysical Journal International 104 (1991) 117–133.
43. A. Karim and H. Adeli, Radial basis function neural network for work zone capacity and queue estimation, Journal of Transportation Engineering, ASCE 129(5) (2003) 494–502.
44. K. Kasahara, Earthquake Mechanics (Cambridge University Press, Cambridge, 1981).
45. V. I. Kelis-Borok and V. G. Kossobokov, Premonitory activation of earthquake flow: Algorithm M8, Physics of the Earth and Planetary Interiors 61 (1990) 73–83.
46. T. Kerh and D. Chu, Neural networks approach and micro-tremor measurements in estimating peak ground acceleration due to strong motion, Advances in Engineering Software 33(11) (2002) 733–742.
47. J. Kirschvink, Earthquake prediction by animals, evolution and sensory perception, Bulletin of the Seismological Society of America 90(2) (2000) 312–323.
48. L. Knopoff, The magnitude distribution of declustered earthquakes in Southern California, in Proceedings of the National Academy of Sciences 97(22) (2000), pp. 11880–11884.
49. S. Lakkos, A. Hadjiprocopis, R. Comley and P. Smith, A neural network scheme for earthquake prediction based on the seismic electric signals, Neural Networks for Signal Processing, Proceedings of the 1994 IEEE Signal Processing Society Workshop, September 6–8, Ermioni, Greece (1994), pp. 681–689.
50. R. Leach and F. Dowla, Earthquake early warning system using real-time signal processing, Neural Networks for Signal Processing, Proceedings of the 1996 IEEE Signal Processing Society Workshop, September 4–6, Kyoto, Japan (1996), pp. 463–472.
51. Y. Liu, Y. Wang, Y. Li, B. Zhang and G. Wu, Earthquake prediction by RBF neural network ensemble, in Proceedings of the International Symposium on Neural Networks, August 19–21, Dalian, China (2004), pp. 962–969.
52. L. Ma, L. Zhu and Y. Shi, Attempts at using seismicity indicators for the prediction of large earthquakes by Genetic Algorithm-Neural Network method, Asia-Pacific Economic Cooperation for Earthquake Simulation, January 31–February 5, Brisbane, Australia (1999).
53. D. Marquardt, An algorithm for the least-squares estimation of non-linear parameters, Journal of Applied Mathematics 11 (1963) 431–441.
54. M. Matthews, W. Ellsworth and P. Reasenberg, A Brownian model for recurrent earthquakes, Bulletin of the Seismological Society of America 92(6) (2002) 2233–2250.
55. A. Negarestani, S. Sestayeshi, M. Ghannadi-Maragheh and B. Akashe, Layered neural networks based analysis of radon concentration and environmental parameters in earthquake prediction, Journal of Environmental Radioactivity 62(3) (2002) 225–233.
56. T. Read and N. Cressie, Goodness-of-Fit Statistics for Discrete Multivariate Data (Springer Verlag, New York, NY, 1988).
57. H. Reid, The mechanism of the earthquake; The California earthquake of April 18, 1906, Report of the State Earthquake Investigation Commission, Carnegie Institute of Washington, Washington D.C. 2 (1910) 16–28.
58. E. Roeloffs, The Parkfield, California earthquake experiment, An update in 2000, Current Science 79(9) (2000) 1226–1236.
59. D. Rumelhart, G. Hinton and R. Williams, Learning representations by back-propagating errors, Nature 323(9) (1986) 533–536.
60. M. Sharma and M. Arora, Prediction of seismicity cycles in the Himalayas using artificial neural networks, Acta Geophysica Polonica 53(3) (2005) 299–309.
61. Y. Shi, J. Liu and G. Zhang, An evaluation of Chinese annual earthquake predictions, 1990–1998, Journal of Applied Probability 38(A) (2001) 222–231.
62. K. Sieh, The repetition of large earthquake ruptures, in Proceedings of the National Academy of Sciences 93(9) (1996) 3764–3771.
63. J. Stephens, E. Adams, X. Huo, J. Dent and J. Hicks, Use of neural networks in avalanche forecasting, in Proceedings of the International Snow Science Workshop, October 30–November 3, Snowbird, USA (1994), pp. 327–340.
64. C. Thanassoulas, Earthquake prediction based on electrical signals recorded on ground surface, in Proceedings: Possible Correlation between Electromagnetic Earth-Fields and Future Earthquakes, Bulgarian Academy of Sciences, Sofia, Bulgaria (2001), pp. 19–30.
65. K. Tiampo, J. Rundle, S. McGinnis, S. Gross and W. Klein, Mean-field threshold systems and phase dynamics: An application to earthquake fault systems, Europhysics Letters 60 (2002) 481–487.
66. P. Varshney, Distributed Detection and Data Fusion (Springer-Verlag, New York, NY, 1997).
67. D. Vere-Jones, R. Robinson and W. Wang, Remarks on the accelerated release moment model: Problems of model formulation, simulation and estimation, Geophysical Journal International 144(3) (2001) 517–531.
68. M. Wyss, P. Bodin and R. E. Haberman, Seismic quiescence at Parkfield; an independent indication of an imminent earthquake, Nature 345(290) (1990) 426–428.
69. Working Group on California Earthquake Probabilities, Earthquake probabilities in the San Francisco bay region, United States Geological Survey Open-File Report 03–214 (2003).
70. S. Xu, Ability evaluation for earthquake prediction, Science Books and Periodicals Press — Seismology Volume 586–590 [Translated from Chinese] (1989).
71. I. Zaliapin, V. Kelis-Borok and G. Axen, Premonitory spreading of seismicity of faults' network in southern California: Precursor accord, Journal of Geophysical Research 107(B10) (2002) 2221.
72. I. Zaliapin, V. Kelis-Borok and M. Ghil, A Boolean delay equation model of colliding cascades; Part II: Prediction of critical transitions, Journal of Statistical Physics 111(3) (2003) 839–861.
73. G. Zoller, S. Hainzl and J. Kurths, A systematic test on precursory seismic quiescence in Armenia, Natural Hazards 26(3) (2002) 245–263.
