CLUSTERING OF EARTHQUAKE RECORDS BY A HYBRID OPPOSITION-SWITCHING SEARCH

Davoud Rezazadeh
M.Sc Earthquake Engineering, Kharazmi University, Tehran, Iran
Mohsen Shahrouzi
Faculty of Engineering, Kharazmi University, Tehran, Iran

Keywords: Clustering, Metaheuristic Algorithms, Opposition Switching Search, k-means

ABSTRACT
Various methods have been proposed for clustering, including the k-means algorithm [1], which is one of the most widely used techniques. Because the result of this algorithm depends on its initial points, the present study combines it with the Opposition-Switching Search (OSS) algorithm. The data matrix covers over 100 earthquake records from different soil types, fault mechanisms and magnitudes. To form the matrix, various attributes are used, such as seismic energy, peak displacement, velocity and acceleration of strong ground motion, spectral intensity and earthquake duration. Clustering is then performed with the new hybrid method and compared with traditional k-means. The results of several tests indicate an improvement in evaluation indicators such as the silhouette index. In the present study, the Silhouette Value is selected as the objective function.

INTRODUCTION
Data mining has been extensively studied in various sciences, and data clustering is one of its main topics. Clustering is defined as the unsupervised process of dividing a set of data into separate subsets. In other words, clustering algorithms attempt to group data based on their similarity, so that the created clusters are meaningful and usable. One of the simplest clustering algorithms is k-means, which is among the most popular and well-known data clustering methods because of its simple implementation and low computational cost. However, this algorithm is very sensitive to its initial values and is easily trapped in local optima. Therefore, researchers have used meta-heuristic algorithms to escape local optima [1].

OPTIMIZATION PROBLEM FORMULATION


The formed clusters are assessed with validation indices to interpret the results. In the present paper, the Silhouette Value is selected among the validation indices because of its prevalence in previous studies. For the i-th characteristic vector, this index is denoted by $s_i$ and defined in relation (1):

$$ s_i = \frac{b_i - a_i}{\max(a_i, b_i)} \qquad (1) $$

where $a_i$ is the average distance of characteristic vector i from all other characteristic vectors in its own cluster, and $b_i$ is the minimum, taken over the other clusters, of the mean distance of characteristic vector i from all characteristic vectors of that cluster. If the value of $s_i$ is close to 1, it can be concluded that characteristic vector i is in its proper cluster. On the other hand, if the value of $s_i$ is close to -1, it can be deduced that characteristic vector i does not belong to a proper cluster. The $s_i$ values of all characteristic vectors are combined to judge the overall clustering quality, which is to be maximized as the merit function.

International Institute of Earthquake Engineering and Seismology (IIEES) 1

v
SEE 8

$$ \text{Max. Fitness} = \sum_{i} s_i \quad \text{subject to} \quad -1 \le s_i \le 1 \qquad (2) $$

According to relation (2), the fitness is obtained as the sum of the index values [1].
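Relations (1) and (2) can be illustrated with a minimal sketch. It assumes Euclidean distance between characteristic vectors and assigns $s_i = 0$ to a record that is alone in its cluster; both choices, the function names `silhouette_values` and `fitness`, and the toy points are assumptions made for illustration and are not taken from the paper.

```python
import math

def silhouette_values(points, labels):
    """Return s_i of relation (1) for every point (Euclidean distance assumed)."""
    n = len(points)
    clusters = sorted(set(labels))
    values = []
    for i in range(n):
        # Mean distance from point i to the members of each cluster (itself excluded).
        mean_dist = {}
        for c in clusters:
            members = [j for j in range(n) if labels[j] == c and j != i]
            if members:
                total = sum(math.dist(points[i], points[j]) for j in members)
                mean_dist[c] = total / len(members)
        own = labels[i]
        others = [d for c, d in mean_dist.items() if c != own]
        if own not in mean_dist or not others:
            values.append(0.0)  # assumed convention: singleton cluster or single cluster
            continue
        a, b = mean_dist[own], min(others)
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values

def fitness(points, labels):
    """Merit function of relation (2): the sum of all s_i."""
    return sum(silhouette_values(points, labels))

# Toy data: two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_values(toy_points, [0, 0, 1, 1])
bad = silhouette_values(toy_points, [0, 0, 0, 1])
```

With the natural labeling every $s_i$ is close to 1, whereas placing the third point in the wrong cluster drives its $s_i$ below zero, which is the behaviour described for relation (1).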

K-MEANS ALGORITHM
Clustering methods are divided into fuzzy and deterministic categories. In this paper, the k-means clustering method is used, which is one of the most commonly used clustering algorithms. The letter k in the name of the algorithm refers to the fixed number of clusters that the algorithm seeks, based on the proximity of the data points. The process of the k-means algorithm is as follows:
1. State the problem, identify the options and criteria, and form the data matrix
2. Select k data points as the initial cluster centers
3. Determine the distances of the remaining data points from the cluster centers
4. Assign each data point to the cluster whose center is closest
5. Calculate the mean of each cluster as the new center of that cluster
6. Repeat steps 3 to 5 until the stopping conditions are met
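The assignment and update loop described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes Euclidean distance, takes the initial centers as an explicit argument (so that the dependence on the starting point is visible), and stops when the assignments no longer change. The name `kmeans` and the toy data are hypothetical.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Plain k-means from the given initial centers; returns (labels, centers)."""
    centers = [tuple(c) for c in centers]
    k = len(centers)
    labels = None
    for _ in range(max_iter):
        # Assign every point to the nearest center.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # stopping condition: assignments unchanged
            break
        labels = new_labels
        # Move every center to the mean of its members (empty clusters keep their center).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
km_labels, km_centers = kmeans(toy_points, [toy_points[0], toy_points[3]])
```

Because the outcome depends on the centers passed in, a search method can be wrapped around this routine to explore different starting points.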

OPPOSITION SWITCHING SEARCH ALGORITHM


Opposition-Based Learning (OBL) is a recent concept for improving the performance of certain procedures in artificial intelligence. Shahrouzi [2] first used OBL to develop a population-based algorithm called Opposition-Switching Search (OSS). It is suitable for continuous optimization and utilizes the following features:
• Information sharing between the search agents. It is performed via a crossover that generates a pseudo-mean solution by picking each of its components from randomly chosen members of the population.
• Taking the opposition of a solution into account as well. A simple definition is utilized so that the opposite of a typical solution $X$ is given by:

$$ \bar{X} = X^{U} + X^{L} - X \qquad (3) $$

where $X^{L}$ and $X^{U}$ are the corresponding lower and upper bounds, respectively.
• Switching between a position and its opposition as the starting point for defining the movement direction in the search space.
• An elitist strategy that saves the best-found solution as the target of such walks.
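As a small worked illustration of relation (3), with hypothetical numbers that are not taken from the paper, consider a two-variable solution with bounds $X^{L} = (0, 0)$ and $X^{U} = (10, 10)$:

$$ X = (2, 7) \;\Rightarrow\; \bar{X} = (10 + 0 - 2,\; 10 + 0 - 7) = (8, 3) $$

The opposite is the mirror image of $X$ about the center of the search box, and applying relation (3) to $\bar{X}$ returns the original $X$.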
The OSS algorithm proceeds via the following steps:
• Generate a population of n individual agents by randomly locating them in the design space
• Evaluate the objective function for the entire population
• Repeat the following steps until the termination criterion is satisfied:
   ◦ Update the best-so-far solution, known as the global best, $X^{Gb}$
   ◦ For every i-th individual do:
      - Generate the sharing solution $Y$ by

        $$ Y = Crs(X^{i}) \qquad (4) $$

        where $Crs(\cdot)$ is a crossover operator on the individuals denoted by $X^{i}$.
      - Switch the pseudo-mean $Z$ to either $Y$ or $\bar{Y}$ by equal chance. Take the type-I velocity vector as

        $$ V^{I} = rand \cdot (X^{Gb} - Z) \qquad (5) $$

      - Switch the type-II velocity $V^{II}$ to either $S^{1}$ or $S^{2}$ by equal chance. They are given as follows:

        $$ S^{1} = rand \cdot (X^{Gb} - X^{i}) \qquad (6) $$

        $$ S^{2} = rand \cdot (X^{i} - X^{Gb}) \qquad (7) $$

      - Generate the candidate solution by $X^{Cd,i} = X^{i} + V^{I} + V^{II}$. Modify $X^{Cd,i}$ to fall between its lower and upper bounds.
      - Evaluate the objective function for the candidate solution
      - Substitute $X^{i}$ with $X^{Cd,i}$ if the candidate solution is better than $X^{i}$.
As a common termination criterion, the OSS algorithm loops for up to N iterations.
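The loop above can be sketched for a generic continuous maximization problem. This is an illustrative reading of relations (3) to (7), not the authors' code. It assumes that `rand` is drawn independently per component from the uniform distribution on [0, 1), that bounds are enforced by clipping, and that "better" means a larger objective value. The name `oss_maximize`, its parameters and the test function are hypothetical, and the coupling with k-means is not shown.

```python
import random

def oss_maximize(f, lower, upper, n_agents=20, n_iter=100, seed=0):
    """Sketch of the OSS loop; returns (best_x, best_f, best-so-far history)."""
    rng = random.Random(seed)
    dim = len(lower)

    def opposite(x):  # relation (3)
        return [lower[d] + upper[d] - x[d] for d in range(dim)]

    def clip(x):  # assumed bound handling
        return [min(max(x[d], lower[d]), upper[d]) for d in range(dim)]

    pop = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(n_agents)]
    fit = [f(x) for x in pop]
    history = []
    for _ in range(n_iter):
        g = max(range(n_agents), key=lambda j: fit[j])
        x_gb = list(pop[g])  # global best for this iteration
        history.append(fit[g])
        for i in range(n_agents):
            # Relation (4): each component comes from a randomly chosen member.
            y = [pop[rng.randrange(n_agents)][d] for d in range(dim)]
            z = y if rng.random() < 0.5 else opposite(y)
            v1 = [rng.random() * (x_gb[d] - z[d]) for d in range(dim)]  # (5)
            sign = 1.0 if rng.random() < 0.5 else -1.0  # choose S1 (6) or S2 (7)
            v2 = [sign * rng.random() * (x_gb[d] - pop[i][d]) for d in range(dim)]
            cand = clip([pop[i][d] + v1[d] + v2[d] for d in range(dim)])
            f_cand = f(cand)
            if f_cand > fit[i]:  # greedy replacement keeps the elite
                pop[i], fit[i] = cand, f_cand
    g = max(range(n_agents), key=lambda j: fit[j])
    history.append(fit[g])
    return pop[g], fit[g], history

def demo_objective(x):
    # Hypothetical test function with its maximum at (1, -2).
    return -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)

best_x, best_f, best_history = oss_maximize(demo_objective, [-5.0, -5.0], [5.0, 5.0])
```

Because an agent is replaced only by a better candidate, the best-so-far value recorded in `best_history` never decreases from one iteration to the next.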

CHARACTERISTICS OF EARTHQUAKE IN DATA MATRIX


In this paper, the PEER NGA site is used to form the data matrix. The data matrix contains over 100 earthquake records from around the world with different magnitudes, fault mechanisms, soil types and focal depths. The selected earthquakes have magnitudes larger than 5, which are significant for seismic hazard. Because design codes require the longitudinal and transverse components of earthquakes to be selected for time-history analysis, only these two components are used to form the data matrix, and the vertical component is omitted. The accelerogram selection criteria used in Standard 2800 are earthquake magnitude, fault distance, soil type and duration of strong motion [6]. These parameters represent, as far as possible, the characteristics of the area in question. A number of other criteria are also used here, such as peak ground acceleration (PGA), peak ground velocity (PGV), peak ground displacement (PGD), and Arias intensity and Housner intensity as two parameters proportional to the earthquake energy input to structures. Three definitions of duration, namely bracketed, uniform and significant duration, were used to calculate the duration of strong ground motion. In the algorithms, each accelerogram is considered an object, and each attribute is an element of the accelerogram's characteristic vector. Each accelerogram should be placed in the cluster whose other members it most resembles. The criterion of similarity between the members is determined by the objective function used in the algorithm, which is the sum of the silhouette values obtained for the accelerograms.
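Three of the attributes named above follow standard definitions and can be computed directly from a sampled accelerogram: PGA is the peak absolute acceleration, Arias intensity is $\frac{\pi}{2g}\int a(t)^2\,dt$, and the significant duration D5-95 is the time between 5% and 95% of the Arias intensity build-up. The sketch below assumes acceleration samples in m/s² at a constant time step and trapezoidal integration; the paper does not state how its values were computed, and the function name and the synthetic signal are hypothetical.

```python
import math

def strong_motion_features(acc, dt, g=9.81):
    """PGA (m/s^2), Arias intensity (m/s) and D5-95 (s) of one accelerogram."""
    pga = max(abs(a) for a in acc)
    # Cumulative integral of a(t)^2 by the trapezoidal rule.
    cum = [0.0]
    for a0, a1 in zip(acc, acc[1:]):
        cum.append(cum[-1] + 0.5 * (a0 * a0 + a1 * a1) * dt)
    arias = math.pi / (2.0 * g) * cum[-1]
    # First sample index at which the build-up reaches 5% and 95% of the total.
    t5 = next(k for k, c in enumerate(cum) if c >= 0.05 * cum[-1]) * dt
    t95 = next(k for k, c in enumerate(cum) if c >= 0.95 * cum[-1]) * dt
    return {"PGA": pga, "Arias": arias, "D5-95": t95 - t5}

# Synthetic check signal: constant 1 m/s^2 for 10 s (not a real record).
demo_features = strong_motion_features([1.0] * 1001, 0.01)
```

For this constant signal the energy builds up linearly, so the significant duration is about 90% of the 10 s record. Note that Table 1 lists PGA in units of g, so the value returned here would be divided by `g` to match.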

Table 1. Characteristics of earthquakes used in the data matrix

Feature name          Unit
PGA                   g
PGV                   m/s
PGD                   cm
Arias intensity       m/s
Housner intensity     cm
Moment magnitude      Energy
Predominant period    Second
D5-95                 Second
Fault type            Dip-slip, strike-slip, . . .
Depth to hypocenter   km
Type of soil          Dimensionless (1-4)

Table 2. Characteristics of earthquakes used

Earthquake and year             Record station
Chi-Chi, Taiwan 1999            CHY054
Chi-Chi, Taiwan 1999            CHY076
Christchurch, New Zealand 2011  Christchurch Resthaven
Christchurch, New Zealand 2011  Pages Road Pumping Station
Coalinga-198301                 Parkfield - Cholame 2WA
Coalinga-198301                 Parkfield - Fault Zone
Darfield, New Zealand 2010      Christchurch Resthaven
Duzce, Turkey 1999              Ambarli
Imperial Valley- 60-9191        El Centro Array #3
Imperial Valley-197907          El Centro Array #3
Loma Prieta 1989                Foster City - Menhaden Court
Loma Prieta 1989                Larkspur Ferry Terminal (FF)
Loma Prieta 1989                Treasure Island
Morgan Hill 1984                Foster City - APEEL 1
Northridge-01 1994              Carson - Water St
Parkfield-02, CA 2004           Parkfield - Cholame 2WA
Parkfield-02, CA 2004           Parkfield - Fault Zone
Tottori, Japan 2000             SMN002
Tottori, Japan 2000             AIC003
Tottori, Japan 2000             TTR005
Whittier Narrows-01 1987        Carson - Water St
Whittier Narrows-02 1987        Carson - Water St
Yountville 2000                 APEEL 2 - Redwood City
Yountville 2000                 Larkspur Ferry Terminal
Yountville 2000                 Treasure Island
Borah Peak, ID-01 1983          TAN-719
Borah Peak, ID-01 1983          TRA-642 ETR Reactor Bldg(Bsmt)
Borrego 1942                    El Centro Array #9
Central Calif-01 1954           Hollister City Hall
Corinth, Greece 1981            Corinth
Dinar, Turkey 1995              Dinar
Drama, Greece 1985              Drama
El Alamo 1956                   El Centro Array #9
Friuli, Italy-01 1976           Codroipo
Friuli, Italy-02 1976           Codroipo
Gazli, USSR 1976                Karakyr
Hollister-01 1961               Hollister City Hall
Humbolt Bay 1937                Ferndale City Hall
Imperial Valley-03 1951         El Centro Array #9
Irpinia, Italy-01 1980          Bovino
Kern County 1952                LA - Hollywood Stor F
Kozani, Greece-01 1995          Kardista
Little Skull Mtn, NV 1992       Station #1-Lathrop Wells
Little Skull Mtn, NV 1992       Station #5-Pahrump 1
Lytle Creek 1970                Colton - So Cal Edison
Lytle Creek 1970                LA - Hollywood Stor FF
Mammoth Lakes-01 1980           Mammoth Lakes H. S
Manjil, Iran 1990               Abhar
Manjil, Iran 1990               Rudsar
Northern Calif-01 1941          Ferndale City Hall
Northwest Calif-01 1938         Ferndale City Hall
Northwest China-02 1997         Xiker
Northwest China-03 1997         Jiashi
Oroville-03 1975                Pacific Heights Rd (OR4)
Parkfield 1966                  Cholame - Shandon Array #5
Parkfield 1966                  Shandon Array #8
Point Mugu 1973                 Port Hueneme
San Fernando 1971               Whittier Narrows Dam
Superstition Hills-02 1987      Parachute Test Site
Tabas, Iran 1978                Boshrooyeh
Taiwan SMART1(33) 1985          SMART1 M01
Chi-Chi, Taiwan-04 1999         HWA002
Chi-Chi, Taiwan-04 1999         KAU003
Chi-Chi, Taiwan-04 1999         TTN042
Chi-Chi, Taiwan-04 1999         CHY102
Chuetsu-oki, Japan 2007         AKTH05
Chuetsu-oki, Japan 2007         FKIH03
Chuetsu-oki, Japan 2007         FKSH07
Coyote Lake 1979                Gilroy Array #1
Duzce, Turkey 1999              Bolu
Duzce, Turkey 1999              IRIGM 487
Duzce, Turkey 1999              Lamont 1060
El Mayor-Cucapah, Mexico 2010   Blythe
El Mayor-Cucapah, Mexico 2010   Toro Canyon
El Mayor-Cucapah, Mexico 2010   Westside Elementary School
Hector Mine 1999                12440 Imperial Hwy, North Grn
Hector Mine 1999                Alhambra - LA Co PW HQ FF
Hector Mine 1999                Griffith Park Observatory
Hollister-02 1961               Hollister City Hall
Hollister-03 1974               Gilroy Array #1
Hollister-03 1974               Hollister City Hall
Iwate, Japan 2008               FKSH15
Iwate, Japan 2008               IWTH09
Iwate, Japan 2008               IWTH14
Kobe, Japan 1995                Kobe University
Kocaeli, Turkey 1999            Gebze
Kocaeli, Turkey 1999            Izmit
Kocaeli, Turkey 1999            Yarimca
Landers 1992                    Lucerne
Landers 1992                    Yermo Fire Station
Loma Prieta 1989                Gilroy Array #3
Loma Prieta 1989                Los Gatos - Lexington Dam
Morgan Hill 1984                Coyote Lake Dam - Southwest Abutment
Morgan Hill 1984                Gilroy Array #1
Morgan Hill 1984                Gilroy Array #6
Niigata, Japan 2004             FKIH03
Niigata, Japan 2004             GIFH11
Niigata, Japan 2004             TKYH13
Niigata, Japan 2004             YMN001
Northridge-01 1994              Pacoima Dam (downstr)
Northridge-01 1994              Pacoima Dam (upper left)
Northridge-06 1994              LA - Wonderland Ave
Parkfield-02, CA 2004           PARKFIELD - TURKEY FLAT #1 (0M)
San Fernando 1971               Cedar Springs, Allen Ranch
San Fernando 1971               Pacoima Dam (upper left abut)
San Fernando 1971               Pasadena - Old Seismo Lab
Tabas, Iran 1978                Tabas
Tottori, Japan 2000             FKIH03
Tottori, Japan 2000             FKOH05
Tottori, Japan 2000             FKOH06

NUMERICAL EXAMPLES
1. Clustering 50 earthquakes into 5 clusters: In the first example, 50 records are divided into 5 clusters, once using the OSS algorithm and once using the k-means algorithm. Figures 1 and 2 and the resulting index values show the superiority of the OSS algorithm over the k-means method: fewer records received a negative index value with the new algorithm, and the sum of the index values, i.e. the merit function, was higher.


Figure 1. The result of clustering 50 earthquake records with the OSS algorithm (5 clusters)

Figure 2. The result of clustering 50 earthquake records with the k-means algorithm (5 clusters)

2. Clustering 80 earthquakes into 6 clusters: In the second example, 80 earthquake records are divided into 6 clusters. The results of the OSS algorithm are shown in Figure 3 and those of k-means in Figure 4.
3. Clustering 110 earthquakes into 7 clusters: In the third example, 110 earthquake records are divided into 7 clusters. The results of the OSS algorithm are shown in Figure 5 and those of k-means in Figure 6.
The statistical results of repeated runs of each algorithm, starting from different random populations, are presented in Table 3. As can be seen, the proposed method outperforms the k-means algorithm not only in the best results but also in the mean and worst-case results of all three examples.

Figure 3. The result of clustering 80 earthquake records with the OSS algorithm (6 clusters)


Figure 4. The result of clustering 80 earthquake records with the k-means algorithm (6 clusters)

Figure 5. The result of clustering 110 earthquake records with the OSS algorithm (7 clusters)

Figure 6. The result of clustering 110 earthquake records with the k-means algorithm (7 clusters)

Table 3. Results of clustering examples (over 20 runs)

Example     Method     Mean       Best       Worst
Example 1   K-means    31.0284    32.5961    28.2943
Example 1   OSS        38.8305    38.8305    38.8305
Example 2   K-means    36.1482    39.3947    31.2596
Example 2   OSS        46.7104    46.0238    46.0238
Example 3   K-means    49.3642    52.4231    46.2967
Example 3   OSS        67.4112    67.4112    67.4112
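Since the fitness in relation (2) is a sum over records, the three examples can be put on a common scale by dividing by the number of records (50, 80 and 110). The short sketch below does this arithmetic for the Mean column of Table 3, with the values copied as printed, and also reports the relative gain of OSS over k-means. The names `TABLE3_MEAN` and `summarize` are illustrative only.

```python
# Mean column of Table 3, copied as printed: (records, k-means mean, OSS mean).
TABLE3_MEAN = {
    "Example 1": (50, 31.0284, 38.8305),
    "Example 2": (80, 36.1482, 46.7104),
    "Example 3": (110, 49.3642, 67.4112),
}

def summarize(table):
    """Per-record average silhouette and relative gain of OSS over k-means."""
    rows = {}
    for name, (n_records, kmeans_sum, oss_sum) in table.items():
        rows[name] = {
            "kmeans_per_record": kmeans_sum / n_records,
            "oss_per_record": oss_sum / n_records,
            "relative_gain": (oss_sum - kmeans_sum) / kmeans_sum,
        }
    return rows

table3_summary = summarize(TABLE3_MEAN)
for name, row in table3_summary.items():
    print(f"{name}: k-means {row['kmeans_per_record']:.3f}, "
          f"OSS {row['oss_per_record']:.3f}, gain {row['relative_gain']:.1%}")
```

On this per-record scale the mean silhouette values can be compared across examples of different size, which the raw sums do not allow.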


CONCLUSIONS
As can be seen in the figures, the OSS algorithm outperforms the k-means method in all three examples, which indicates that its superiority was not accidental. Therefore, the use of a meta-heuristic algorithm is recommended to improve clustering quality.

REFERENCES
Shahrouzi, M., RashidiMoghadam, M.: Ground motion clustering by a hybrid K-Means and Colliding Bodies Optimization. Int. J. Optim. Civ. Eng. 6, 567–578 (2016).
Geem, Z.W.: Music-inspired harmony search algorithm: theory and applications (2009).
Shahrouzi, M., Sazjini, M.: Refined harmony search for optimal scaling and selection of accelerograms. Sci. Iran. 19, 218–224 (2012). https://doi.org/10.1016/j.scient.2012.02.002
Shahrouzi, M.: Optimal Spectral Matching of Strong Ground Motion by Opposition-Switching Search. In: EngOpt 2018 Proc. 6th Int. Conf. Eng. Optim., pp. 713–724. Springer International Publishing, Lisbon (2019).
Rojas-Morales, N., Riff Rojas, M.C., Montero Ureta, E.: A survey and classification of opposition-based metaheuristics. Comput. Ind. Eng. 110, 424–435 (2017). https://doi.org/10.1016/j.cie.2017.06.028
Shahrouzi, M., Pashaei, M.: Stochastic directional search: an efficient heuristic for structural optimization of building frames. Sci. Iran. 20, 1124–1132 (2013).
Pacific Earthquake Engineering Research Center: PEER Strong Motion Database, http://peer.berkeley.edu/smcat/
