Clustering of Earthquake Records
Davoud Rezazadeh
M.Sc Earthquake Engineering, Kharazmi University, Tehran, Iran
Mohsen Shahrouzi
Faculty of Engineering, Kharazmi University, Tehran, Iran
ABSTRACT
Various methods have been proposed for clustering, including the k-means algorithm [1], which is one of the most
widely used techniques. Because the result of this algorithm depends on the initial point, the present study
combines it with the Opposition-Switching Search (OSS) algorithm. In this paper, the data matrix
contains over 100 earthquake records covering different soil types, fault mechanisms and magnitudes. To form the
matrix, various attributes are used, such as seismic energy; peak displacement, velocity and acceleration of strong ground
motion; spectral intensity; and earthquake duration. Clustering is then performed using the new hybrid
method and compared with the traditional k-means. The results of several tests indicate an improvement in
evaluation indicators such as the silhouette index. In the present study, the silhouette value is selected as the
objective function.
INTRODUCTION
Data mining has been extensively studied in various sciences, and data clustering is one of its main topics.
Clustering is defined as the unsupervised process of dividing a set of data into separate subsets. In other
words, clustering algorithms attempt to group data based on their similarity, so that the resulting clusters are
meaningful and usable. One of the simplest clustering algorithms is k-means, which is among the most
popular and well-known data clustering methods because of its simple implementation and low computational cost. However, this
algorithm is very sensitive to its initial values and is easily trapped in local optima. Therefore, researchers have
used meta-heuristic algorithms to escape local optima [1].
SEE 8
K-MEANS ALGORITHM
Clustering methods are divided into fuzzy and deterministic categories. In this paper, the k-means clustering method is
used, which is one of the most commonly used clustering algorithms. The letter k in the name of the
algorithm refers to the fixed number of clusters that the algorithm seeks, based on the proximity
of the data points. The process of the k-means algorithm is as follows:
1. State the problem, identify the options and the criteria, and form the data matrix
2. Select k data points as the initial cluster centers
3. Determine the distance of each remaining data point from every cluster center
4. Assign each data point to the cluster whose center is closest
5. Calculate the mean of each cluster as the new center of that cluster
6. Repeat steps 3 to 5 until the stopping condition is met
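The iterative part of the procedure above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function name kmeans, the Euclidean distance, the random choice of initial centers and the stopping rule (assignments no longer change) are all assumptions made for the example.

```python
import math
import random


def kmeans(data, k, max_iter=100, seed=0):
    """Plain k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Select k distinct data points as the initial cluster centers.
    centers = [list(p) for p in rng.sample(data, k)]
    labels = [-1] * len(data)
    for _ in range(max_iter):
        # Assign every point to its nearest center (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in data
        ]
        # Assumed stopping condition: the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each center as the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy data matrix: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(points, k=2)
print(labels)
```

Because the initial centers are drawn at random, different seeds can lead to different partitions on less clearly separated data, which is the sensitivity that motivates the hybrid method.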
$$\bar{X} = X^{U} + X^{L} - X \qquad (3)$$
where $X^{L}$ and $X^{U}$ are the corresponding lower and upper bounds, respectively.
Another feature is switching between a position and its opposite as the starting point for defining the movement
direction in the search space.
An elitist strategy is also implemented, saving the best-found solution as the target of such walks.
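As a small worked illustration of Eq. (3), the sketch below computes the opposite of a point inside a box, reading the relation as upper bound plus lower bound minus the point, applied coordinate by coordinate. The function name opposite and the sample bounds are assumptions chosen for the example.

```python
def opposite(x, lower, upper):
    """Opposite of point x within the box [lower, upper], as read from Eq. (3)."""
    return [u + l - xi for xi, l, u in zip(x, lower, upper)]


x = [1.0, 4.0]
lower = [0.0, 0.0]
upper = [10.0, 5.0]
x_opp = opposite(x, lower, upper)
print(x_opp)  # [9.0, 1.0]
```

Applying the operator twice returns the original point, so a position and its opposite mirror each other about the center of the search box.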
The OSS algorithm is proposed via the following steps:
1. Generate a population of n individual agents by randomly locating them in the design space.
2. Evaluate the objective function for the entire population.
3. Repeat the following steps until the termination criterion is satisfied:
3.1. Update the best-so-far solution, known as the global best, $X^{Gb}$.
3.2. Apply the crossover
$$Y_i = Crs(X_i) \qquad (4)$$
where $Crs(X_i)$ is a crossover operator on the individuals denoted by $X_i$.
3.3. Switch the pseudo-mean $Z_i$ to either $Y_i$ or its opposite $\bar{Y}_i$ by equal chance. Take the type-I velocity vector as
$$V^{I} = rand \times (X^{Gb} - Z_i) \qquad (5)$$
3.4. Switch the type-II velocity $V^{II}$ to either $S_1$ or $S_2$ by equal chance. They are given as follows:
$$S_1 = rand \times (X^{Gb} - X_i) \qquad (6)$$
$$S_2 = rand \times (X_i - X^{Gb}) \qquad (7)$$
3.5. Generate the candidate solution by $X^{Cd,i} = X_i + V^{I} + V^{II}$. Modify $X^{Cd,i}$ to fall between its lower and upper
bounds.
3.6. Evaluate the objective function for the candidate solution.
3.7. Substitute $X_i$ with $X^{Cd,i}$ if the candidate solution is better than $X_i$.
As a common termination criterion, the OSS algorithm loops for up to N iterations.
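The loop described above can be sketched as follows. This is an illustrative reading under stated assumptions rather than the reference implementation: the crossover operator is not specified in the text, so a uniform crossover with a randomly chosen mate is assumed; rand is drawn independently for each coordinate; the sketch minimizes, whereas a silhouette-based merit function would be maximized; and the names oss_minimize and sphere are hypothetical.

```python
import random


def oss_minimize(f, lower, upper, n=20, n_iter=200, seed=0):
    """Minimize f over the box [lower, upper] with an OSS-style search."""
    rng = random.Random(seed)

    def opposite(x):
        # Opposition as in Eq. (3): upper + lower - x, per coordinate.
        return [hi + lo - v for v, lo, hi in zip(x, lower, upper)]

    def clip(x):
        # Push a candidate back inside its lower and upper bounds.
        return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]

    # Random initial population and its objective values.
    pop = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
           for _ in range(n)]
    cost = [f(x) for x in pop]

    for _ in range(n_iter):
        # Global best so far: the elitist target of the walks.
        x_gb = pop[min(range(n), key=cost.__getitem__)]
        for i in range(n):
            x_i = pop[i]
            # ASSUMPTION: Crs is a uniform crossover with a random mate.
            mate = pop[rng.randrange(n)]
            y = [a if rng.random() < 0.5 else b for a, b in zip(x_i, mate)]
            # Pseudo-mean Z: Y or its opposite, by equal chance (Eq. 5).
            z = y if rng.random() < 0.5 else opposite(y)
            v1 = [rng.random() * (g - zj) for g, zj in zip(x_gb, z)]
            # Type-II velocity: S1 or S2, by equal chance (Eqs. 6 and 7).
            sign = 1.0 if rng.random() < 0.5 else -1.0
            v2 = [sign * rng.random() * (g - xj) for g, xj in zip(x_gb, x_i)]
            cand = clip([xj + a + b for xj, a, b in zip(x_i, v1, v2)])
            c = f(cand)
            # Greedy replacement: keep the candidate only if it is better.
            if c < cost[i]:
                pop[i], cost[i] = cand, c

    best = min(range(n), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(x):
    """Hypothetical test objective: sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_cost = oss_minimize(sphere, [-5.0] * 3, [5.0] * 3)
print(best_x, best_cost)
```

Because an individual is replaced only when its candidate is better, the best objective value in the population never deteriorates from one iteration to the next.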
Table 2. Continued
Name and year of earthquake occurrence | Record station | Name and year of earthquake occurrence | Record station | Name and year of earthquake occurrence | Record station
--- | --- | --- | --- | --- | ---
Whittier Narrows-02 1987 | Carson - Water St | Yountville 2000 | APEEL 2 - Redwood City | Yountville 2000 | Larkspur Ferry Terminal
Yountville 2000 | Treasure Island | Borah Peak, ID-01 1983 | TAN-719 | Borah Peak, ID-01 1983 | TRA-642 ETR Reactor Bldg(Bsmt)
Borrego 1942 | El Centro Array #9 | Central Calif-01 1954 | Hollister City Hall | Corinth, Greece 1981 | Corinth
Dinar, Turkey 1995 | Dinar | Drama, Greece 1985 | Drama | El Alamo 1956 | El Centro Array #9
Friuli, Italy-01 1976 | Codroipo | Friuli, Italy-02 1976 | Codroipo | Gazli, USSR 1976 | Karakyr
Hollister-01 1961 | Hollister City Hall | Humbolt Bay 1937 | Ferndale City Hall | Imperial Valley-03 1951 | El Centro Array #9
Irpinia, Italy-01 1980 | Bovino | Kern County 1952 | LA - Hollywood Stor FF | Kozani, Greece-01 1995 | Kardista
Little Skull Mtn, NV 1992 | Station #1-Lathrop Wells | Little Skull Mtn, NV 1992 | Station #5-Pahrump 1 | Lytle Creek 1970 | Colton - So Cal Edison
Lytle Creek 1970 | LA - Hollywood Stor FF | Mammoth Lakes-01 1980 | Mammoth Lakes H. S | Manjil, Iran 1990 | Abhar
Manjil, Iran 1990 | Rudsar | Northern Calif-01 1941 | Ferndale City Hall | Northwest Calif-01 1938 | Ferndale City Hall
Northwest China-02 1997 | Xiker | Northwest China-03 1997 | Jiashi | Oroville-03 1975 | Pacific Heights Rd (OR4)
Parkfield 1966 | Cholame - Shandon Array #5 | Parkfield 1966 | Shandon Array #8 | Point Mugu 1973 | Port Hueneme
San Fernando 1971 | Whittier Narrows Dam | Superstition Hills-02 1987 | Parachute Test Site | Tabas, Iran 1978 | Boshrooyeh
Taiwan SMART1(33) 1985 | SMART1 M01 | Chi-Chi, Taiwan-04 1999 | HWA002 | Chi-Chi, Taiwan-04 1999 | KAU003
Chi-Chi, Taiwan-04 1999 | TTN042 | Chi-Chi, Taiwan-04 1999 | CHY102 | Chuetsu-oki, Japan 2007 | AKTH05
Chuetsu-oki, Japan 2007 | FKIH03 | Chuetsu-oki, Japan 2007 | FKSH07 | Coyote Lake 1979 | Gilroy Array #1
Duzce, Turkey 1999 | Bolu | Duzce, Turkey 1999 | IRIGM 487 | Duzce, Turkey 1999 | Lamont 1060
El Mayor-Cucapah, Mexico 2010 | Blythe | El Mayor-Cucapah, Mexico 2010 | Toro Canyon | El Mayor-Cucapah, Mexico 2010 | Westside Elementary School
Hector Mine 1999 | 12440 Imperial Hwy, North Grn | Hector Mine 1999 | Alhambra - LA Co PW HQ FF | Hector Mine 1999 | Griffith Park Observatory
Hollister-02 1961 | Hollister City Hall | Hollister-03 1974 | Gilroy Array #1 | Hollister-03 1974 | Hollister City Hall
Iwate, Japan 2008 | FKSH15 | Iwate, Japan 2008 | IWTH09 | Iwate, Japan 2008 | IWTH14
Kobe, Japan 1995 | Kobe University | Kocaeli, Turkey 1999 | Gebze | Kocaeli, Turkey 1999 | Izmit
Kocaeli, Turkey 1999 | Yarimca | Landers 1992 | Lucerne | Landers 1992 | Yermo Fire Station
Loma Prieta 1989 | Gilroy Array #3 | Loma Prieta 1989 | Los Gatos - Lexington Dam | Morgan Hill 1984 | Coyote Lake Dam - Southwest Abutment
Morgan Hill 1984 | Gilroy Array #1 | Morgan Hill 1984 | Gilroy Array #6 | Niigata, Japan 2004 | FKIH03
Niigata, Japan 2004 | GIFH11 | Niigata, Japan 2004 | TKYH13 | Niigata, Japan 2004 | YMN001
Northridge-01 1994 | Pacoima Dam (downstr) | Northridge-01 1994 | Pacoima Dam (upper left) | Northridge-06 1994 | LA - Wonderland Ave
Parkfield-02, CA 2004 | PARKFIELD - TURKEY FLAT #1 (0M) | San Fernando 1971 | Cedar Springs, Allen Ranch | San Fernando 1971 | Pacoima Dam (upper left abut)
San Fernando 1971 | Pasadena - Old Seismo Lab | Tabas, Iran 1978 | Tabas | Tottori, Japan 2000 | FKIH03
Tottori, Japan 2000 | FKOH05 | Tottori, Japan 2000 | FKOH06 | |
NUMERICAL EXAMPLES
1. Clustering 50 earthquakes into 5 clusters: In the first example, the 50 records are divided into 5 clusters, once using the
OSS algorithm and again using the k-means algorithm. Figures 1 and 2 and the resulting index values show the
superiority of the OSS algorithm over the k-means method. In particular, fewer records were assigned a negative index value
by the new algorithm, and the sum of the index values, which is the merit function, was higher.
Figure 1. The result of clustering 50 earthquake records with the OSS algorithm (5 clusters)
Figure 2. The result of clustering 50 earthquake records with the k-means algorithm (5 clusters)
2. Clustering 80 earthquakes into 6 clusters: In the second example, 80 earthquake records are divided into 6 clusters;
the results of the OSS algorithm are shown in Figure 3 and those of the k-means algorithm in Figure 4.
3. Clustering 110 earthquakes into 7 clusters: In the third example, 110 earthquake records are divided into 7 clusters;
the results of the OSS algorithm are shown in Figure 5 and those of the k-means algorithm in Figure 6.
The statistical results of repeated runs of each algorithm, starting from different random populations, are
presented in Table 3. As can be seen, the superiority of the proposed method over the k-means algorithm holds not only for the
best results, which could be coincidental, but also for the average and worst-case results in all three examples.
Figure 3. The result of clustering 80 earthquake records with the OSS algorithm (6 clusters)
Figure 4. The result of clustering 80 earthquake records with the k-means algorithm (6 clusters)
Figure 5. The result of clustering 110 earthquake records with the OSS algorithm (7 clusters)
Figure 6. The result of clustering 110 earthquake records with the k-means algorithm (7 clusters)
CONCLUSIONS
As can be seen in the figures, the OSS algorithm outperforms the k-means method in all three
examples, which indicates that its superiority is not accidental. Therefore, the use of a meta-heuristic algorithm is
recommended to improve clustering quality.
REFERENCES
[1] Shahrouzi, M., RashidiMoghadam, M.: Ground motion clustering by a hybrid K-Means and Colliding Bodies
Optimization. Int. J. Optim. Civ. Eng. 6, 567–578 (2016).
[2] Geem, Z.W.: Music-inspired harmony search algorithm: theory and applications (2009).
[3] Shahrouzi, M., Sazjini, M.: Refined harmony search for optimal scaling and selection of accelerograms. Sci. Iran. 19,
218–224 (2012). https://doi.org/10.1016/j.scient.2012.02.002
[4] Shahrouzi, M.: Optimal Spectral Matching of Strong Ground Motion by Opposition-Switching Search. In: EngOpt
2018 Proc. 6th Int. Conf. Eng. Optim., pp. 713–724. Springer International Publishing, Lisbon (2019).
[5] Rojas-Morales, N., Riff Rojas, M.C., Montero Ureta, E.: A survey and classification of opposition-based metaheuristics.
Comput. Ind. Eng. 110, 424–435 (2017). https://doi.org/10.1016/j.cie.2017.06.028
[6] Shahrouzi, M., Pashaei, M.: Stochastic directional search: an efficient heuristic for structural optimization of building
frames. Sci. Iran. 20, 1124–1132 (2013).
[7] Pacific Earthquake Engineering Research Center, PEER Strong Motion Database, http://peer.berkeley.edu/smcat/