HMTK Tutorial
Hazard Modeller’s Toolkit - User Guide
Copyright © 2014 GEM Foundation
Citation
Please cite this document as:
Weatherill, G. A. (2014) OpenQuake Hazard Modeller’s Toolkit - User Guide. Global Earthquake Model (GEM). Technical Report
Disclaimer
The “Hazard Modeller’s Toolkit - User Guide” is distributed in the hope that it will be useful,
but without any warranty: without even the implied warranty of merchantability or fitness for a
particular purpose. While every precaution has been taken in the preparation of this document,
in no event shall the authors of the manual and the GEM Foundation be liable to any party for
direct, indirect, special, incidental, or consequential damages, including lost profits, arising out
of the use of information contained in this document or from the use of programs and source
code that may accompany it, even if the authors and GEM Foundation have been advised of
the possibility of such damage. The Book provided hereunder is on an “as is” basis, and the
authors and GEM Foundation have no obligations to provide maintenance, support, updates,
enhancements, or modifications.
The current version of the book has been revised only by members of the GEM Model Facility
and must be considered a draft copy.
License
This Book is distributed under the Creative Commons License Attribution-NonCommercial-
NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) (see link below). You can download this Book and
share it with others as long as you provide proper credit, but you cannot change it in any way or
use it commercially.
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1 The Development Process 5
1.2 Getting Started and Running the Software 6
1.2.1 Current Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.2 About this Tutorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.3 Visualisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2 Catalogue Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.1 The Earthquake Catalogue 17
2.1.1 The Catalogue Format and Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.1.2 The “Selector” Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2 Declustering 24
2.2.1 GardnerKnopoff1974 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.2.2 AFTERAN (Musson1999PSHABalkan) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3 Completeness 27
2.3.1 Stepp1971 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.4 Recurrence Models 30
2.4.1 Aki1965 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4.2 Maximum Likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4.3 KijkoSmit2012 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4.4 Weichert1980 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5 Maximum Magnitude 32
2.5.1 Kijko2004 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.5.2 Cumulative Moment (MakropoulosBurton1983) . . . . . . . . . . . . . . . . . . . . . . . 34
2.6 Smoothed Seismicity 36
2.6.1 frankel1995 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.6.2 Implementing the Smoothed Seismicity Analysis . . . . . . . . . . . . . . . . . . . . . . 36
3 Hazard Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.1 Source Model and Hazard Tools 39
3.1.1 The Source Model Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.1.2 The Source Model Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.2 Hazard Calculation Tools 46
4 Geology Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.1 Fault Recurrence from Geology 49
4.1.1 Epistemic Uncertainties in the Fault Modelling . . . . . . . . . . . . . . . . . . . . . . . 49
4.1.2 Tectonic Regionalisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.1.3 Definition of the Fault Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.1.4 Fault Recurrence Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.1.5 Running a Recurrence Calculation from Geology . . . . . . . . . . . . . . . . . . . . 61
5 Geodetic Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.1 Recurrence from Geodetic Strain 67
5.1.1 The Recurrence Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.1.2 Running a Strain Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Books 71
Articles 71
Reports 71
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
I Appendices 75
1. Introduction
The Hazard Modeller’s Toolkit (or “openquake.hmtk”) is a Python library of functions originally
written by scientists at the GEM Model Facility, and now maintained by the GEM Foundation
Secretariat. The HMTK is intended to provide scientists and engineers with the tools to help
create the seismogenic input models that go into the OpenQuake hazard engine. The process of
developing a hazard model is a complex and often challenging one, and while many aspects of
the practice are relatively common, the choice of certain methods or tools for undertaking each
step can be a matter of judgement. The intention of this software is to provide scientists and
engineers with the means to apply many of the most commonly used algorithms for preparing
seismogenic source models using seismicity and geological data.
This manual is Version 2.0 of the HMTK tutorial. The major differences in the toolkit and
the tutorial compared to the original release are i) the HMTK is now contained in the OpenQuake
Engine, and does not require any separate installation, ii) the OpenQuake hazardlib source
classes have been adopted in order to ensure full compatibility and consistency between the
two libraries, and iii) the plotting functions that produce maps now use Generic Mapping Tools
(GMT) and Python scripts housed in the OpenQuake Model Building Toolkit.
The development of the toolkit has been guided by the following objectives:
Portability Reduction in the number of Python dependencies to allow for a high degree of
cross-platform deployment.
Adaptability Cleaner separation of methods into self-contained components that can be imple-
mented and tested without requiring adaptation of the remainder of the code.
Abstraction This concept is often a critical component of object-oriented development. It de-
scribes the specification of the core behaviour of a method, which implementations (by
means of subclasses) must follow. For example, a declustering algorithm must follow a
common behaviour path, in this instance: i) reading an earthquake catalogue and some
configurable parameters, ii) identifying the clusters of events, iii) identifying the main-
shocks from within each cluster, and iv) returning this information to the user. The details of the
implementation are then dependent on the algorithm, providing that the core flow is met.
This is designed to allow the algorithms to be interchangeable, in the sense that different
methods for a particular task can be selected with no (or at least minimal) modification to
the rest of the code.
Usability The creation of a library which could itself be embedded within larger applications
(e.g. as part of a graphical user interface).
Feature                 Algorithm
-----------------------------------------------------------------------
Seismicity
  Declustering          GardnerKnopoff1974
                        AFTERAN (Musson1999)
  Completeness          Stepp1971
  Recurrence            Maximum Likelihood (Aki1965)
                        Time-dependent MLE
                        Weichert1980
  Smoothed Seismicity   frankel1995
Geology
  Recurrence            AndersonLuco1983 "Arbitrary"
                        AndersonLuco1983 "Area MMAX"
                        Characteristic (Truncated Gaussian)
                        YoungsCoppersmith1985 Exponential
                        YoungsCoppersmith1985 Characteristic
Geodetic Strain
  Recurrence            Seismic Hazard Inferred from Tectonics (SHIFT)
                        (BirdLiu2007; Bird_etal2010)
The openquake.hmtk is a library of tools rather than a stand-alone application, and requires
some investment of time from the user to understand the functionalities and learn how to link
the various tools together into a workflow that will be suitable for the modelling problem at hand.
This manual is designed to explain the various functions in the toolkit and to provide some
illustrative examples showing how to implement them for particular contexts and applications.
The tutorial itself does not specifically require a working knowledge of Python. However,
an understanding of the basic python data types, and ideally some familiarity with the use
of Python objects, is highly desirable. Users who are new to Python are recommended to
familiarise themselves with Appendix A of this tutorial. This provides a brief overview of
the Python programming language and should introduce concepts such as classes and dictio-
naries, which will be encountered in due course. For more detail on the complete Python
language, a comprehensive overview of its features and usage can be found in the standard Python
documentation (http://docs.python.org/2/tutorial/). Where necessary, particular Python programming concepts
will be explained in further detail.
The code snippets (indicated by verbatim text) can be executed from within an ”Interactive
Python (IPython)” environment, or may form the basis for usage of the openquake.hmtk in other
python scripts that the user may wish to construct themselves. If not already installed on
your system, IPython can be installed from the Python package repository by entering:
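~$ pip install ipython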
An “interactive” session can then be opened by typing ipython at the command prompt. If
matplotlib is installed and you wish to use the plotting functionalities described herein then
you should open IPython with the command:
~$ ipython --pylab
For a more visual application of the openquake.hmtk the reader is encouraged to utilise the
“IPython Notebook” (http://ipython.org/notebook.html). This novel tool implements IPython
inside a web-browser environment, permitting the user to create and store real Python workflows
that can be retrieved and executed, whilst allowing for images and text to be embedded. A
screenshot of the openquake.hmtk used in an IPython Notebook environment is shown in Figure
1.1. From version 1.0 of IPython onwards, the Notebook is included in the IPython installation. A notebook session
can then be started via the command:
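~$ ipython notebook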
1.2.3 Visualisation
In addition to the scientific tools, which will be described in detail in due course, the original
version of the openquake.hmtk also included a set of functionalities for visualisation of data and
results pertinent to the preparation of seismic hazard input models. While not considered an
essential component of the openquake.hmtk, the usage of the plotting functions can facilitate
model development. Particular visualisation functions shall be referred to where relevant for the
particular tool or data set.
The current version of the HMTK includes most of the original visualisation tools. However,
the tools for map creation have been deprecated and replaced by a set of mapping functions in
the OpenQuake Model Building Toolkit (MBTK). These tools now use Generic Mapping Tools
(GMT), and so were moved outside of the HMTK library so as not to add GMT as a dependency of
the OpenQuake Engine. The mapping functions are nevertheless described in this tutorial in
order to provide users with a replacement for the deprecated functions.
Map Creation
An IPython Notebook demonstrating how to use the mapping methods can be found in the
MBTK. The basic functionalities are described herein.
To set-up a simple basemap it is necessary to define the configuration of the plot (such as
spatial limit and coastline resolution). This is done as follows:
In [1]: from openquake.plt.mapping import HMTKBaseMap

In [2]: map_config = {"min_lon": 18.0,
                      "max_lon": 32.0,
                      "min_lat": 33.0,
                      "max_lat": 43.0,
                      "title": "Title of Map"}

In [3]: basemap1 = HMTKBaseMap(map_config)
The class HMTKBaseMap contains a set of methods for mapping catalogue data or simplified
source models:
This function will overlay an earthquake catalogue onto the basemap. The input value cat is the
earthquake catalogue as an instance of the class
openquake.hmtk.seismicity.catalogue.Catalogue (see the next section for details). The
catalogue is the only mandatory parameter, but the user can also specify the following optional
parameters:
• scale: a scaling coefficient that sets the symbol size per magnitude m. The symbol size follows the
equation scale × 10^(−1.5 + 0.3m), where m is the magnitude. See the GMT documentation.
• cpt_file: name of an existing colour palette used to colour the earthquake markers. If not specified,
the default "tmp.cpt" is generated based on the catalogue color_field.
• color_field: the parameter used to colour the earthquake markers. The given field must
correspond to a catalogue header. If not specified, the markers are coloured by depth.
• logscale: if ‘True’, generates the colour palette according to a log scale. ‘False’ uses a
linear colour scale. Default is ‘True’. Ignored if cpt_file is specified.
.add_source_model(model)
This method adds a source model to the basemap. The input value model is an instance of the
class openquake.hazardlib.nrml.SourceModel, which replaces the mtk source classes (see
Section 3.1). An example of a source model plot is shown in Figure 1.5.
NB: At present, only the following source typologies can be plotted automatically:
• Point sources
• Simple faults
• Complex faults
• Area sources
This method overlays a set of data points with colour scaled according to the data values. Three
data arrays are required: one each with the longitude and latitude coordinates of the data
points, and data, a set of scalar values (e.g. magnitude or depth, if plotting an earthquake
catalogue) associated with those points. In addition to these, the method takes four optional
keyword parameters:
This method overlays a set of data points with size scaled according to the data values. Three
data arrays are required: one each with the longitude and latitude coordinates of points to
be plotted, and data, a set of scalar values associated with those points. In addition to these, the
method takes eight optional keyword parameters:
• shape: a string indicating the shape of the data markers, using GMT syntax starting with
‘-S’ (see GMT psxy markers). The default, ‘-Ss’, plots squares.
• logplot: if True, use a logscale to create the marker sizes. Default is False.
• color: a string that indicates the marker colour (see GMT psxy markers). Default is ‘blue’.
• smin: size of the smallest symbol in cm. Marker size is computed as
smin + coeff × data^sscale. Default is 0.01 cm.
• coeff: used with sscale and smin to set the marker sizes. Default is 1.0.
• sscale: used with coeff and smin to set the marker sizes. Default is 2.0.
• label: a string that corresponds to the data array.
• legend: if True, adds a legend to the plot. Default is True.

Figure 1.3 – Example visualisation of a source model for Papua New Guinea (ghasemi2016) with
area sources (blue) and a complex fault.
.add_focal_mechanism(filename, mech_format)
This method overlays focal mechanisms. The string filename indicates a file containing focal
mechanism data. mech_format is a string indicating the data format used by filename, with
two options, focal mechanism (‘FM’) and seismic moment tensor (‘MT’), both using the
Harvard CMT convention as described by GMT psmeca.
The .savemap() method saves the current map to file, and takes the following optional parameters:
• filename: a string used to name the map, which includes a suffix (limited to ‘.pdf’, ‘.png’,
and ‘.jpg’) indicating the desired file type. If not specified, the map is saved as map.pdf in
the directory output_folder that was assigned during the HMTKBaseMap instantiation.
• save_script: if True, the GMT commands are saved to a shell script, and this together with all
files needed to create the map are saved in output_folder. If False (the default), all the
temporary files are erased and only the map is saved.
• verb (verbose): if True, GMT commands are printed as they are executed.
The save_script option gives the user more flexibility to modify the plot settings than are
available through the methods, while providing the structure of the GMT script as a starting point.
NB: Take care not to overwrite scripts that have been customized by rerunning the mapping
code!
The .savemap() method is used as follows (continuing from the above Python lines; the call below is an illustrative sketch using the parameters described above):
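In [4]: basemap1.savemap(filename="my_map.png", save_script=False, verb=True)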
2. Catalogue Tools
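Before using the catalogue tools an earthquake catalogue, stored in the openquake.hmtk csv format,
must be loaded into memory. A minimal sketch of this step is shown below (the file path is
illustrative and the parser import is an assumption based on the openquake.hmtk layout, so it
should be checked against the library):

>> from openquake.hmtk.parsers.catalogue import CsvCatalogueParser

>> input_file = 'path/to/catalogue.csv'
>> parser = CsvCatalogueParser(input_file)
>> catalogue = parser.read_file()

The resulting catalogue object is used throughout the examples in this chapter.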
N.B. The csv file can also contain additional attributes of the catalogue, and these will be
parsed correctly; however, if an attribute is not one that is specifically recognised by the
catalogue class then a message will be displayed indicating that this is the case.
This is expected behaviour and simply indicates that although the data are given in the
input file, they are not retained in the data dictionary.
Attribute         Description
--------------------------------------------------------------------------
eventID*          A unique identifier (integer) for each earthquake in the catalogue
Agency            The code (string) of the recording agency for the event solution
year*             Year of event (integer) in the range -10000 to present
                  (events before the common era (BCE) should have a negative value)
month*            Month of event (integer)
day*              Day of event (integer)
hour*             Hour of event (integer) - if unknown then set to 0
minute*           Minute of event (integer) - if unknown then set to 0
second*           Second of event (float) - if unknown then set to 0.0
timeError         Error in event time (float)
longitude*        Longitude of event, in decimal degrees (float)
latitude*         Latitude of event, in decimal degrees (float)
SemiMajor90       Length (km) of the semi-major axis of the 90 % confidence
                  ellipsoid for location error (float)
SemiMinor90       Length (km) of the semi-minor axis of the 90 % confidence
                  ellipsoid for location error (float)
ErrorStrike       Azimuth (in degrees) of the 90 % confidence ellipsoid for
                  location error (float)
depth*            Depth (km) of earthquake (float)
depthError        Uncertainty (as standard deviation) in earthquake depth (km) (float)
magnitude*        Homogenised magnitude of the event (float) - typically Mw
sigmaMagnitude*   Uncertainty on the homogenised magnitude (float) - typically Mw

Table 2.1 – List of Attributes in the Earthquake Catalogue File (* Indicates Essential)
The catalogue class contains several helpful methods (called via catalogue. ...):
• catalogue.get_number_events() Returns the number of events currently in the cata-
logue (integer)
• catalogue.load_to_array(keys) Returns a numpy array of floating data, with the
columns ordered according to the list of keys. If the key corresponds to a string item (e.g.
Agency) then an error will be raised.
>> catalogue.load_to_array(['year', 'longitude', 'latitude',
                            'depth', 'magnitude'])
array([[ 1910. ,  26.941,  38.507,  13.2,  6.5],
       [ 1910. ,  22.190,  37.720,  20.4,  6.5],
       [ 1910. ,  28.881,  33.274,  25.0,  6.0],
       ...,
       [ 2009. ,  20.054,  39.854,  20.2,  4.8],
       [ 2009. ,  23.481,  38.050,  15.2,  5.2],
       [ 2009. ,  28.959,  34.664,  18.4,  4.1]])
• catalogue.get_decimal_time()
Returns the time of the earthquake in a decimal format
• catalogue.hypocentres_as_mesh()
Returns the hypocentres of an earthquake as an instance of the class
“openquake.hazardlib.geo.mesh.Mesh” (useful for geospatial functions)
• catalogue.hypocentres_to_cartesian()
Returns the hypocentres in a 3D cartesian framework
• catalogue.purge_catalogue(flag_vector)
Purges the catalogue of all events marked as False in the boolean vector. This is used for remov-
ing foreshocks and aftershocks from a catalogue after the application of a declustering
algorithm.
• catalogue.sort_catalogue_chronologically()
Sorts the catalogue into chronological order.
N.B. Some methods will implicitly assume that the catalogue is in chronological order, so
it is recommended to run this function if you believe that there may be events out of order
• catalogue.select_catalogue_events(IDX)
Orders the catalogue according to the event order specified in IDX. Behaves the same as
purge_catalogue(IDX) if IDX is a boolean vector
• catalogue.get_depth_distribution(depth_bins, normalisation=False,
bootstrap=None)
Returns a depth histogram for the catalogue using bins specified by depth_bins. If
normalisation=True then the function will return the histogram as a probability mass
function, otherwise the original count will be returned. If uncertainties are reported on
depth such that one or more values in
catalogue.data[’depthError’] are greater than 0., the function will perform a boot-
strap analysis, taking into account the depth error, with the number of bootstraps given by
the keyword bootstrap.
To generate a simple histogram plot of hypocentral depth, the process below can be
followed to produce a depth histogram similar to the one shown in Figure 2.1:
>> from openquake.hmtk.plotting.seismicity.catalogue_plots import \
       plot_depth_histogram

>> depth_bin = 5.0
>> plot_depth_histogram(catalogue,
                        depth_bin,
                        filename="/path/to/image.eps",
                        filetype="eps")
Figure 2.1 – Example depth histogram of the catalogue (count versus depth in km)
• catalogue.get_magnitude_depth_distribution(magnitude_bins, depth_bins,
normalisation=False, bootstrap=None)
Returns a two-dimensional histogram of magnitude and hypocentral depth, with the cor-
responding bins defined by the vectors magnitude_bins and depth_bins. The options
normalisation and bootstrap are the same as for the one dimensional histogram. The
usage is illustrated below:
# Define depth bins for (e.g.) 0. - 150 km in intervals of 5 km
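# (The remainder of this snippet is an illustrative sketch; the bin edges and
#  variable names below are assumptions rather than part of the original text.)
>> import numpy as np
>> depth_bins = np.arange(0., 155., 5.)
# Define magnitude bins, e.g. 4.0 - 8.0 in intervals of 0.1
>> magnitude_bins = np.arange(4.0, 8.1, 0.1)
>> mag_depth_hist = catalogue.get_magnitude_depth_distribution(magnitude_bins,
                                                               depth_bins)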
Figure 2.2 – Example magnitude-depth density plot
• catalogue.get_magnitude_time_distribution(magnitude_bins, time_bins,
normalisation=False, bootstrap=None)
Returns a 2D histogram of magnitude with time. time_bins are the bin edges for the
time windows, in decimal years. The usage mirrors that of the magnitude-depth distribution, as sketched below:
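# An illustrative sketch (the bin edges are assumptions), continuing from the
#  previous snippet:
>> time_bins = np.arange(1900., 2010., 10.)
>> mag_time_hist = catalogue.get_magnitude_time_distribution(magnitude_bins,
                                                             time_bins)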
To automatically generate a plot, similar to that shown in Figure 2.3, run the following:
>> from openquake.hmtk.plotting.seismicity.catalogue_plots import \
       plot_magnitude_time_density
>> magnitude_bin_width = 0.1
>> time_bin_width = 0.1
>> plot_magnitude_time_density(catalogue,
                               magnitude_bin_width,
                               time_bin_width,
                               filename="/path/to/image.eps",
                               filetype="eps")
Figure 2.3 – Example magnitude-time density plot (magnitude versus time in years)
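The selection of sub-sets of events from the catalogue is handled by the openquake.hmtk “Selector”
class. A minimal sketch of its instantiation is shown below (the import path follows the
openquake.hmtk layout and the variable names are illustrative):

>> from openquake.hmtk.seismicity.selector import CatalogueSelector

>> selector1 = CatalogueSelector(catalogue, create_copy=True)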
The optional keyword create_copy ensures that when the events not selected are purged
from the catalogue a “deepcopy” is taken of the original catalogue. This ensures that the original
catalogue remains unmodified when a subset of events is selected.
The catalogue selector class has the following methods:
.within_polygon(polygon, distance=None)
Selects events within a polygon described by the class openquake.hazardlib.geo.
polygon.Polygon. distance is the distance (in km) to use as a buffer, if required. Optional
keyword arguments upper_depth and lower_depth can be used to limit the depth range of
the catalogue returned by the selector to only those events whose hypocentres are within the
specified depth limits.
.circular_distance_from_point(point, distance, distance_type="epicentral")
Selects events within a given distance from a location. The location (point) is an instance of
the openquake.hazardlib.geo.point.Point class, whilst distance is the selection distance (km)
and distance_type can be either "epicentral" or "hypocentral".
.cartesian_square_centred_on_point(point, distance)
Selects events within a square of side length distance, centred on a location (represented as an
openquake Point class).
.within_joyner_boore_distance(surface, distance)
Returns earthquakes within a distance (km) of the surface projection (“Joyner-Boore” dis-
tance) of a fault surface. The fault surface must be defined as an instance of the class
openquake.hazardlib.geo.surface.simple_fault.SimpleFaultSurface or
openquake.hazardlib.geo.surface.complex_fault.ComplexFaultSurface.
.within_rupture_distance(surface, distance)
Returns earthquakes within a distance (km) of a fault surface. The fault surface must be
defined as an instance of the class
openquake.hazardlib.geo.surface.simple_fault.SimpleFaultSurface or
openquake.hazardlib.geo.surface.complex_fault.ComplexFaultSurface.
.within_time_period(start_time=None, end_time=None)
Selects earthquakes within a time period. Times must be input as instances of a datetime
object. For example:
>> from datetime import datetime
>> selector1 = CatalogueSelector(catalogue1, create_copy=True)
# Early time limit is 1 January 1990 00:00:00
>> early = datetime(1990, 1, 1, 0, 0, 0)
# Late time limit is 31 December 1999 23:59:59
>> late = datetime(1999, 12, 31, 23, 59, 59)
>> catalogue_nineties = selector1.within_time_period(start_time=early,
                                                     end_time=late)
.within_depth_range(lower_depth=None, upper_depth=None)
Selects earthquakes whose hypocentres are within the range specified by the lower depth
limit (lower_depth) and the upper depth limit (upper_depth), both in km.
.within_magnitude_range(lower_mag=None, upper_mag=None)
Selects earthquakes whose magnitudes are within the range specified by the lower limit
(lower_mag) and the upper limit (upper_mag).
2.2 Declustering
To identify the Poissonian rate of seismicity, it is necessary to remove foreshocks/aftershocks/swarms
from the catalogue. The Modeller’s Toolkit contains, at present, two algorithms to undertake this
task, with more under development.
2.2.1 GardnerKnopoff1974
The most widely applied simple windowing algorithm is that of GardnerKnopoff1974. Orig-
inally conceived for Southern California, the method simply identifies aftershocks by virtue
of fixed time-distance windows proportional to the magnitude of the main shock. Whilst this
premise is relatively simple, the manner in which the windows are applied can be ambiguous.
Four different possibilities can be considered (LuenStark2012):
1. Search events in magnitude-descending order. Remove an event if it is in the window of a
larger event
2. Remove every event that is inside the window of a previous event, including larger events
3. An event is in a cluster if, and only if, it is in the window of at least one other event in the
cluster. In every cluster remove all events except the largest
4. In chronological order, if the ith event is in the window of a preceding larger shock that
has not already been deleted, remove it. If a larger shock is in the window of the ith event,
delete the ith event. Otherwise retain the ith event.
It is the first of the four options that is implemented in the current toolkit, whilst others may be
considered in future. The algorithm is capable of identifying foreshocks and aftershocks, simply
by applying the windows forward and backward in time from the mainshock. No distinction
is made between primary aftershocks (those resulting from the mainshock) and secondary or
tertiary aftershocks (those originating due to the previous aftershocks); however, it is assumed
all would occur within the window.
Several modifications to the time and distance windows have been suggested, which are
summarised in vanStiphout2012. The windows originally suggested by GardnerKnopoff1974
are approximated by:
$$\mathrm{distance\ (km)} = e^{1.77 + \sqrt{0.037 + 1.02M}} \tag{2.1}$$

$$\mathrm{time\ (decimal\ years)} = \begin{cases} \left| e^{-3.95 + \sqrt{0.62 + 17.32M}} \right| & \text{if } M \geq 6.5 \\ 10^{2.8 + 0.024M} & \text{otherwise} \end{cases} \tag{2.2}$$
A comparison of the expected window sizes with magnitude is shown for distance and time
in Figure 2.4.
Figure 2.4 – Scaling of declustering time and distance windows with magnitude
The GardnerKnopoff1974 algorithm and its derivatives represent the most computationally
straightforward approach to declustering. The time_distance_window attribute of the configuration
indicates the choice of time and distance window scaling model from the three available in the
toolkit. As the current version of this algorithm considers the events in descending-magnitude
order, the parameter fs_time_prop defines the size of the time window used for searching for
foreshocks, as a fractional proportion of the size of the aftershock window (the distance windows
are always equal for both fore- and aftershocks). So for an evenly sized time window for foreshocks
and aftershocks, the fs_time_prop parameter should equal 1. For shorter or longer foreshock time
windows this parameter can be reduced or increased respectively.
To run a declustering analysis on the earthquake catalogue it is necessary to set-up the config-
uration using a python dictionary (see Appendix A). A config file for the GardnerKnopoff1974
algorithm, using for example the Uhrhammer1986 time-distance windows with equal sized
time window for aftershocks and foreshocks, would be created as shown:
>> from openquake.hmtk.seismicity.declusterer.distance_time_windows import \
       UhrhammerWindow

>> declust_config = {'time_distance_window': UhrhammerWindow(),
                     'fs_time_prop': 1.0}
To run the declustering algorithm simply import and run the algorithm as shown:
>> from openquake.hmtk.seismicity.declusterer.dec_gardner_knopoff import \
       GardnerKnopoffType1

>> declustering = GardnerKnopoffType1()

>> cluster_index, cluster_flag = declustering.decluster(catalogue,
                                                        declust_config)
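The decluster method returns two vectors: cluster_index, assigning each event to a cluster, and
cluster_flag, distinguishing mainshocks from fore- and aftershocks. A minimal sketch of how the
output might be used to retain only the mainshocks is shown below (it assumes the convention
that mainshocks are flagged with 0, which should be verified against the algorithm documentation):

# Keep only events flagged as mainshocks (assumed flag value of 0)
>> mainshock_flag = cluster_flag == 0
>> catalogue.purge_catalogue(mainshock_flag)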
2.3 Completeness
In the earliest stages of processing an instrumental seismic catalogue to derive inputs for seismic
hazard analysis, it is necessary to determine the magnitude completeness threshold of the
catalogue. To outline the meaning of the term ”magnitude completeness” and the requirements
for its analysis as an input to PSHA, the terminology of MignanWoessner2012 is adopted.
This defines the magnitude of completeness as the ”lowest magnitude at which 100 % of the
events in a space-time volume are detected (RydelekSacks1989; WoessnerWiemer2005)”.
Incompleteness of an earthquake catalogue will produce bias when determining models of
earthquake recurrence, which may have a significant impact on the estimation of hazard at a
site. Identification of the completeness magnitude of an earthquake catalogue is therefore a clear
requirement for the processing of input data for seismic hazard analysis.
It should be noted that this summary of methodologies for estimating completeness is directed
toward techniques that can be applied to a ”typical” instrumental seismic catalogue. We therefore
make the assumption that the input data will contain basic information for each earthquake such
as time, location, magnitude. We do not make the assumption that network-specific or station-
specific properties (e.g., configuration, phase picks, attenuation factors) are known a priori. This
limits the selection of methodologies to those classed as estimators of ”sample completeness”,
which defines completeness on the basis of the statistical properties of the earthquake catalogue,
rather than ”probability-based completeness”, which defines the probability of detection given
knowledge of the properties of the seismic network (SchorlemmerWoessner2008). This there-
fore excludes the methodology of SchorlemmerWoessner2008, and similar approaches such as
that of Felzer2008.
The current workflows assume that completeness will be applied to the whole catalogue,
ideally returning a table of time-varying completeness. The option to explore spatial variation
in completeness is not explicitly supported, but could be accommodated by an appropriate
configuration of the toolkit.
In the current version of the Modeller’s Toolkit the Stepp1971 methodology for analysis of
catalogue completeness is implemented. Further methods are in development and will be included
in future releases.
2.3.1 Stepp1971
This is one of the earliest analytical approaches to estimation of completeness magnitude. It
is based on estimators of the mean rate of recurrence of earthquakes within given magnitude
and time ranges, identifying the completeness magnitude when the observed rate of earthquakes
above MC begins to deviate from the expected rate. If a time interval (Ti ) is taken, and the
earthquake sequence assumed Poissonian, then the unbiased estimate of the mean rate of events
per unit time interval of a given sample is:
$$\lambda = \frac{1}{n} \sum_{i=1}^{n} T_i \tag{2.4}$$

with variance $\sigma_\lambda^2 = \lambda / n$. Taking the unit time interval to be 1 year, the standard deviation of
the estimate of the mean is:

$$\sigma_\lambda = \sqrt{\lambda} / \sqrt{T} \tag{2.5}$$
where T is the sample length. As the Poisson assumption implies a stationary process, σλ
behaves as 1/√T in the sub-interval of the sample in which the mean rate of occurrence of a
magnitude class is constant. Time variation of MC can usually be inferred graphically from the
analysis, as is illustrated in Figure 2.5. In this example, the deviation from the 1/√T line for
each magnitude class occurs at around 40 years for 4.5 < M < 5, 100 years for 5.0 < M < 6.0,
approximately 150 years for 6.0 < M < 6.5 and 300 years for M > 6.5. Knowledge of the
sources of earthquake information for a given catalogue may usually be reconciled with the
completeness time intervals.
The analysis of Stepp1971 is a coarse, but relatively robust, approach to estimating the
temporal variation in completeness of a catalogue. It has been widely applied since its develop-
ment. The accuracy of the completeness magnitude depends on the magnitude and time intervals
considered, and a degree of judgement is often needed to determine the time at which the rate
deviates from the expected values. It has tended to be applied to catalogues on a large scale, and
for relatively higher completeness magnitudes.
To translate the methodology from a largely graphical method into a computational one,
the completeness period needs to be identified automatically as the point at which
the gradient of the observed values decreases with respect to that expected from a Poisson
process (see Figure 2.5). In the implementation found within the current toolkit, the divergence point is
identified by fitting a two-segment piecewise linear function to the observed data. Although a
two-segment piecewise linear function is normally fit with four parameters (intercept, slope1,
slope2 and crossover point), by virtue of the assumption that for the complete catalogue the rate
is stationary such that σλ = 1/√T, the slope of the first segment can be fixed at
−0.5, and the second slope should be constrained such that slope2 ≤ −0.5, whilst the crossover
point (xc) is subject to the constraint xc ≥ 0.0. Thus it is possible to fit the two-segment linear
function using constrained optimisation with only three free parameters. For this purpose the
toolkit minimises the residual sum-of-squares of the model fit using numerical optimisation.
To run the Stepp1971 algorithm the configuration parameters should be entered in the form of a Python dictionary, for example:
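# An illustrative sketch of the configuration; the parameter values shown here
# are assumptions for the purposes of the example, not recommendations:
>> comp_config = {'magnitude_bin': 0.5,
                  'time_bin': 5.0,
                  'increment_lock': True}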
The algorithm has several configurable options. The time_bin parameter describes the
size of the time window in years, the magnitude_bin parameter describes the size of the
magnitude bin, and sensitivity is as described previously. The final option (increment_lock) is
used to ensure consistency in the results by avoiding the completeness magnitude
increasing for the latest intervals in the catalogue simply due to the variability associated
with their short duration. If increment_lock is set to True, the program will ensure that the
completeness magnitude for shorter, more recent windows is less than or equal to that of
older, longer windows. This is often a condition for some recurrence analysis tools, so it
may be advisable to set this option to True in certain workflows. Otherwise it should be set
to False to show the apparent variability. Some degree of judgement is necessary here.
In particular, it is expected that the user may be aware of circumstances particular to their
catalogue for which a recent increase in completeness magnitude is expected (for example, a
certain recording network no longer being operational).
The process of running the algorithm is shown below:
>> from openquake.hmtk.seismicity.completeness.comp_stepp_1971 import \
       Stepp1971

>> completeness_algorithm = Stepp1971()

>> completeness_table = completeness_algorithm.completeness(catalogue,
                                                            comp_config)

>> completeness_table
array([[ 1990. ,  4.25],
       [ 1962. ,  4.75],
       [ 1959. ,  5.25],
       [ 1906. ,  5.75],
       [ 1906. ,  6.25],
       [ 1904. ,  6.75],
       [ 1904. ,  7.25]])
If a completeness_table is input then this will override the selection of the completeness
algorithm, and the calculation will take the values in completeness_table directly.
2.4.1 Aki1965
The classical maximum likelihood estimator for a simple unbounded GutenbergRichter1944
model is that of Aki1965, adapted for binned magnitude data by Bender1983. It assumes a fixed
completeness magnitude (MC ) for the catalogue, and a simple power law recurrence model. It
does not explicitly take into account magnitude uncertainty.
$$b = \frac{\log_{10}(e)}{\bar{m} - m_0 + \Delta M / 2} \tag{2.6}$$
where m̄ is the mean magnitude, m0 the minimum magnitude and ∆M the discretisation interval
of magnitude within a given sample.
$$\hat{b} = \frac{1}{S} \sum_{i=1}^{S} w_i b_i \tag{2.7}$$
2.4.3 KijkoSmit2012
A recent adaptation of the Aki1965 estimator of b-value for a catalogue containing different
completeness periods has been proposed by KijkoSmit2012. Dividing the earthquake catalogue
into s sub-catalogues, each with its own level of completeness, the joint likelihood is:

$$L = \prod_{i=1}^{s} \prod_{j=1}^{n_i} \beta \exp\left[-\beta \left(m_j^i - m_{min}^i\right)\right] \tag{2.8}$$
$$\beta = \left(\frac{r_1}{\beta_1} + \frac{r_2}{\beta_2} + \cdots + \frac{r_s}{\beta_s}\right)^{-1} \tag{2.9}$$

where ri = ni/n and n = ∑ ni (i = 1, ..., s), with ni the number of events above the level of completeness mi.
>> kijko_smit_config = {'magnitude_interval': 0.1,
                        'reference_magnitude': None}
2.4.4 Weichert1980
Recognising the typical conditions of an earthquake catalogue, Weichert1980 developed a
maximum likelihood estimator of b for grouped magnitudes and unequal periods of observation.
The likelihood formulation for this approach is:
$$L\left(\beta \,|\, n_i, m_i, t_i\right) = \frac{N!}{\prod_i n_i!} \prod_i p_i^{n_i} \tag{2.10}$$
where L is the likelihood of β, ni the number of earthquakes in magnitude bin mi, and ti the
corresponding period of observation. The parameter pi is defined as:
$$p_i = \frac{t_i \exp(-\beta m_i)}{\sum_j t_j \exp(-\beta m_j)} \tag{2.11}$$
The extremum of ln(L) is found at:

$$\frac{\sum_i t_i m_i \exp(-\beta m_i)}{\sum_j t_j \exp(-\beta m_j)} = \frac{\sum_i n_i m_i}{N} = \bar{m} \tag{2.12}$$
The computational implementation of this method is given as an appendix to Weichert1980.
This formulation of the maximum likelihood estimator for b-value, and consequently seis-
micity rate, is in widespread use, with applications in many national seismic hazard analyses
(usgsNSHM1996; usgsNSHM2002). The algorithm has been demonstrated to be efficient and
unbiased for most applications. It is recognised by Felzer2008 that an implicit assumption is
made regarding the stationarity of the seismicity for all the time periods.
To implement the Weichert1980 recurrence estimator, the configuration properties are
defined as:
>> weichert_config = {'magnitude_interval': 0.1,
                      'reference_magnitude': None,
                      # The remaining parameters are optional
                      'bvalue': 1.0,
                      'itstab': 1E-5,
                      'maxiter': 1000}
As the Weichert1980 algorithm reaches the MLE estimate by iteration, three additional
optional parameters control the iteration process: bvalue is the initial guess for
the b-value, itstab the difference in b-value required to reach convergence, and maxiter the
maximum number of iterations. Note that the iterative nature of the Weichert1980 algorithm can
result in very slow convergence and unstable behaviour when the magnitudes infer b-values that
are very small, or even negative; this can occur when very few events are in the resulting
catalogue, or when the magnitudes converge within a narrow range.
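A sketch of how the Weichert1980 estimator might then be run is given below (the module path,
class name and returned values are assumptions based on the layout of the occurrence algorithms
in openquake.hmtk, and should be checked against the library):

>> from openquake.hmtk.seismicity.occurrence.weichert import Weichert

>> recurrence = Weichert()
>> bval, sigma_b, rate, sigma_rate = recurrence.calculate(catalogue,
                                                          weichert_config,
                                                          completeness=completeness_table)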
2.5.1 Kijko2004
Three different estimators of maximum magnitude are given by Kijko2004, each depending on
a different set of assumptions:
1. ”Fixed b-value”: Assumes a single b-value with no uncertainty
2. ”Uncertain b-value”: Assumes an uncertain b-value defined by an expected b and its
standard deviation
3. ”Non-Parametric Gaussian”: Assumes no functional form (can be applied to seismicity
observed to follow a more characteristic distribution)
Each of these estimators assumes the general form:
$$m_{max} = m_{max}^{obs} + \Delta \tag{2.13}$$

$$\sigma_{m_{max}} = \sqrt{\sigma_{m_{max}^{obs}}^2 + \Delta^2} \tag{2.14}$$
In the three estimators some lower bound magnitude constraint must be defined. For those
estimators that assume an exponential recurrence model the lower bound magnitude must be
specified by the user. For the non-parametric Gaussian method an explicit lower bound
magnitude does not have to be specified; however, the estimation is conditioned upon the largest
N magnitudes, where N must be specified by the user.
If the user wishes to input a maximum magnitude that is larger than that observed in the
catalogue (e.g. a known historical magnitude), this can be specified in the config file using
input_mmax with the corresponding uncertainty defined by
input_mmax_uncertainty. If these are not defined (i.e. set to None) then the maximum
magnitude will be taken from the catalogue.
All three estimators require an iterative solution, therefore additional parameters can be
specified in the configuration file that control the iteration process: tolerance, the difference in
the mmax estimate for the algorithm to be considered converged, and maximum_iterations, the
maximum number of iterations for stability.
”Fixed b-value”
For a catalogue of n earthquakes, whose magnitudes are distributed by a GutenbergRichter1944
distribution with a fixed b-value, the increment of maximum magnitude is determined via:
$$\Delta = \int_{m_{min}}^{m_{max}} \left[\frac{1 - \exp\left[-\beta \left(m - m_{min}\right)\right]}{1 - \exp\left[-\beta \left(m_{max}^{obs} - m_{min}\right)\right]}\right]^{n} dm \tag{2.15}$$
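The configuration and execution of the ”fixed b-value” estimator follow the same pattern as the
”uncertain b-value” example shown below; the following sketch is illustrative, and the module and
class names are assumptions made by analogy with that example:

>> mmax_config = {'input_mmax': None,   # take the maximum magnitude from the catalogue
                  'input_mmax_uncertainty': None,
                  'b-value': 1.0,
                  'input_mmin': 5.0,
                  'tolerance': 1.0E-5,
                  'maximum_iterations': 1000}

>> from openquake.hmtk.seismicity.max_magnitude.kijko_sellevol_fixed_b import \
       KijkoSellevolFixedb

>> mmax_estimator = KijkoSellevolFixedb()
>> mmax, mmax_uncertainty = mmax_estimator.get_mmax(catalogue, mmax_config)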
”Uncertain b-value”
For a catalogue of n earthquakes, whose magnitudes are distributed by a GutenbergRichter1944
distribution with an uncertain b-value, characterised by an expected term (b) and a corre-
sponding uncertainty (σb), the increment of maximum magnitude is determined via:
$$\Delta = C_\beta^{\,n} \int_{m_{min}}^{m_{max}} \left[1 - \left(\frac{p}{p + m - m_{min}}\right)^{q}\right]^{n} dm \tag{2.16}$$

where β = b ln(10.0), p = β / σβ², q = (β / σβ)² and Cβ is a normalising coefficient
determined via:

$$C_\beta = \frac{1}{1 - \left[p / \left(p + m_{max} - m_{min}\right)\right]^{q}} \tag{2.17}$$
In both the fixed and uncertain ”b” cases a minimum magnitude will need to be input into
the calculation. If this value is lower than the minimum magnitude observed in the catalogue
the iterator may not stabilise to a satisfactory value, so it is recommended to use a minimum
magnitude that is greater than the minimum found in the observed catalogue.
The execution of the ”uncertain b-value” estimator is undertaken in a manner very similar to that of
the fixed b-value estimator, the only additional parameter being the sigma-b term:
>> mmax_config = {'input_mmax': 7.6,
                  'input_mmax_uncertainty': 0.22,
                  'b-value': 1.0,
                  'sigma-b': 0.15,
                  'input_mmin': 5.0,
                  'tolerance': 1.0E-5,
                  'maximum_iterations': 1000}

>> from openquake.hmtk.seismicity.max_magnitude.kijko_sellevol_bayes \
       import KijkoSellevolBayes

>> mmax_estimator = KijkoSellevolBayes()

>> mmax, mmax_uncertainty = mmax_estimator.get_mmax(catalogue, mmax_config)
Non-Parametric Gaussian
The non-parametric Gaussian estimator for maximum magnitude mmax is defined as:
$$\Delta = \int_{m_{min}}^{m_{max}} \left[\frac{\sum_{i=1}^{n} \left[\Phi\left(\frac{m - m_i}{h}\right) - \Phi\left(\frac{m_{min} - m_i}{h}\right)\right]}{\sum_{i=1}^{n} \left[\Phi\left(\frac{m_{max} - m_i}{h}\right) - \Phi\left(\frac{m_{min} - m_i}{h}\right)\right]}\right]^{n} dm \tag{2.18}$$
where mmin and mmax are the minimum and maximum magnitudes from a set of n events, Φ is
the standard normal cumulative distribution function, and h is a kernel smoothing factor given by
Silverman’s rule of thumb:

$$h = 0.9 \cdot \min\left(\sigma, \frac{IQR}{1.34}\right) \cdot n^{-1/5} \tag{2.19}$$

with σ the standard deviation of the set of n earthquake magnitudes mi (i = 1, 2, ..., n),
and IQR the inter-quartile range.
Therefore the uncertainty on mmax is conditioned primarily on the uncertainty of the largest
observed magnitude. As in many catalogues the largest observed magnitude may be an earlier
historical event, which will be associated with a large uncertainty, this estimator tends towards
large uncertainties on mmax .
Due to the need to define some additional parameters the configuration file is slightly
different. No b-value or minimum magnitude needs to be specified; however, the algorithm will
consider only the largest number_earthquakes magnitudes (or all magnitudes if the number
of observations is smaller). The algorithm also numerically approximates the integral of the
Gaussian pdf, so number_samples is the number of samples of the distribution. The rest of the
execution remains the same as for the exponential recurrence estimators of Mmax :
>> mmax_config = {'input_mmax': 7.6,
                  'input_mmax_uncertainty': 0.22,
                  'number_samples': 51,       # Default
                  'number_earthquakes': 100,  # Default
                  'tolerance': 1.0E-5,
                  'maximum_iterations': 1000}

>> from openquake.hmtk.seismicity.max_magnitude.kijko_nonparametric_gaussian \
       import KijkoNonParametricGaussian

>> mmax_estimator = KijkoNonParametricGaussian()

>> mmax, mmax_uncertainty = mmax_estimator.get_mmax(catalogue, mmax_config)
2.5.2 Cumulative Moment (MakropoulosBurton1983)
In the cumulative moment method the cumulative seismic moment release of the catalogue is plotted
against time; the gradient of the line joining the start and end of this plot indicates the mean
moment release for the input catalogue in question. Two further straight lines are defined with
gradients equal to that of the slope of mean cumulative moment release, both enveloping the
cumulative plot. The vertical distance between these two lines indicates the total amount of
moment that may be released in the region if no earthquakes were to occur in the corresponding
time (i.e. the distance between the upper and lower bounding lines on the time axis). This concept
is illustrated in Figure 2.6.
The cumulative moment estimator of mmax , whilst simple in concept, has several key advan-
tages. As a non-parametric method it is independent of any assumed probability distribution
and cannot estimate mmax lower than the observed mmax . It is also principally controlled by the
largest events in the catalogue, thus making it relatively insensitive to uncertainties in completeness
or lower bound threshold. In practice, this estimator, and to some extent that of Kijko2004
are dependent on having a sufficiently long record of events relative to the strain cycle for
the region in question, such that the estimate of average moment release is stable. This will
obviously depend on the length of the catalogue, and for some regions, particularly those in low
strain intraplate environments, it is often the case that mmax will be close to the observed mmax .
Therefore it may be the case that it is most appropriate to use these techniques on a larger scale,
either considering multiple sources or an appropriate tectonic proxy.
For the cumulative moment estimator it is possible to take into account the uncertainty on
mmax by applying bootstrap sampling to the observed magnitudes and their respective uncer-
tainties. This has the advantage that σmmax is not controlled by the uncertainty on the observed
mmax , as it is for the Kijko2004 algorithm. Instead it takes into account the uncertainty on
all the magnitudes in the catalogue. The cost of this, however, is that this method is more
computationally intensive, and therefore slower, than Kijko2004, depending on the number of
bootstrap samples the user chooses.
The algorithm is slightly simpler to run than the Kijko2004 methods; however, due to the
bootstrapping process it is slightly slower. It is run as per the following example:
>> mmax_config = {'number_bootstraps': 1000}

>> from openquake.hmtk.seismicity.max_magnitude.cumulative_moment_release \
       import CumulativeMoment

>> mmax_estimator = CumulativeMoment()

>> mmax, mmax_uncertainty = mmax_estimator.get_mmax(catalogue, mmax_config)
For the cumulative moment algorithm the only user configurable parameter is the
number_bootstraps, which is the number of samples used during the bootstrapping process.
2.6.1 frankel1995
A smoothed seismicity method that has one of the clearest precedents for use in seismic hazard
analysis is that of frankel1995, originally derived to characterise the seismicity of the Central
and Eastern United States as part of the 1996 National Seismic Hazard Maps of the United States.
The method applies a simple isotropic Gaussian smoothing kernel to derive the expected rate
of events at each cell ñi from the observed rate n j of seismicity in a grid of j cells. This kernel
takes the form:
$$\tilde{n}_i = \frac{\sum_j n_j e^{-d_{ij}^2 / c^2}}{\sum_j e^{-d_{ij}^2 / c^2}} \tag{2.20}$$
In the implementation of the algorithm, two steps are taken that we prefer to make config-
urable options here. The first step is that the time-varying completeness is accounted for using a
correction factor (t f ) based on the Weichert1980 method:
$$t_f = \frac{\sum_i e^{-\beta m_{c_i}}}{\sum_i T_i e^{-\beta m_{c_i}}} \tag{2.21}$$
where mci is the completeness magnitude corresponding to the mid-point of each completeness
interval, and Ti the duration of the completeness interval. The completeness magnitude bins must
be evenly spaced; hence, within the application of the procedure a function is executed to render
the input completeness table into one in which the magnitudes are evenly spaced with a width of
0.1 magnitude units.
Next, set up the smoothing algorithm and the corresponding kernel:
# Imports the smoothed seismicity algorithm
>> from openquake.hmtk.seismicity.smoothing.smoothed_seismicity import \
       SmoothedSeismicity

# Imports the Kernel function
>> from openquake.hmtk.seismicity.smoothing.kernels.isotropic_gaussian \
       import IsotropicGaussian

# Grid limits should be set up as
# [min_long, max_long, spc_long,
#  min_lat, max_lat, spc_lat,
#  min_depth, max_depth, spc_depth]
>> grid_limits = [0., 10., 0.1, 0., 10., 0.1, 0., 60., 30.]
# Assuming a b-value of 1.0
>> smooth_seis = SmoothedSeismicity(grid_limits,
                                    use_3d=True,
                                    bvalue=1.0)
The smoothed seismicity function needs to be set up with three variables: i) the extent (and
spacing) of the grid, ii) the choice to use 3D smoothing (i.e. distances are taken as hypocentral
rather than epicentral) and iii) the input b-value. The extent of the grid can also be defined from
the catalogue. If preferred the user need only specify the spacing of the longitude-latitude grid
(as a single floating point value), then the grid will be defined by taking the bounding box of the
earthquake catalogue and extended by the total smoothing length (i.e. the bandwidth (in km)
multiplied by the maximum number of bandwidths).
To run the smoothed seismicity analysis, the configurable parameters are: BandWidth the
bandwidth of the Gaussian kernel (in km), Length_Limit the number of bandwidths considered
as a maximum smoothing length, and increment chooses whether to output the incremental
a-value (for consistency with the original frankel1995 methodology) or the cumulative a-value
(corresponding to the a-value of the Gutenberg-Richter model).
The algorithm requires two essential inputs (the earthquake catalogue and the config file),
and three optional inputs:
• completeness_table A table of completeness magnitudes and their corresponding
completeness years (as output from the completeness algorithms)
• smoothing_kernel An instance of the required smoothing kernel class (currently only
Isotropic Gaussian is supported - and will be used if not specified)
• end_year The final year of the catalogue. This will be taken as the last year found in the
catalogue, if not specified by the user
The analysis is then run via:
# Set up config (e.g. 50 km bandwidth, up to 3 bandwidths)
>> config = {'Length_Limit': 3.,
             'BandWidth': 50.,
             'increment': True}
# Run the analysis!
>> output_data = smooth_seis.run_analysis(
       catalogue,
       config,
       completeness_table,
       smoothing_kernel=IsotropicGaussian(),
       end_year=None)

# To write the resulting data to a csv file
>> smooth_seis.write_to_csv('path/to/output_file.csv')
The resulting output will be a csv file with the following columns:
Longitude, Latitude, Depth, Observed Count, Smoothed Rate, b-value
where Observed Count is the observed number of earthquakes in each cell, and
Smoothed Rate is the smoothed seismicity rate.
3. Hazard Tools
Point Source
<pointGeometry>
<gml:Point>
<gml:pos>-122.0 38.0</gml:pos>
</gml:Point>
<upperSeismoDepth>0.0</upperSeismoDepth>
<lowerSeismoDepth>10.0</lowerSeismoDepth>
</pointGeometry>
<magScaleRel></magScaleRel>
<ruptAspectRatio></ruptAspectRatio>
<nodalPlaneDist>
<nodalPlane probability="" strike="" dip="" rake="" />
<nodalPlane probability="" strike="" dip="" rake="" />
</nodalPlaneDist>
<hypoDepthDist>
<hypoDepth probability="" depth="" />
<hypoDepth probability="" depth="" />
</hypoDepthDist>
</pointSource>
</sourceModel>
</nrml>
Area Source
<upperSeismoDepth>0.0</upperSeismoDepth>
<lowerSeismoDepth>10.0</lowerSeismoDepth>
</areaGeometry>
<magScaleRel></magScaleRel>
<ruptAspectRatio></ruptAspectRatio>
<nodalPlaneDist>
<nodalPlane probability="" strike="" dip="" rake="" />
<nodalPlane probability="" strike="" dip="" rake="" />
</nodalPlaneDist>
<hypoDepthDist>
<hypoDepth probability="" depth="" />
<hypoDepth probability="" depth="" />
</hypoDepthDist>
</areaSource>
</sourceModel>
</nrml>
Simple Fault Source
<simpleFaultGeometry>
<gml:LineString>
<gml:posList>
-121.82290 37.73010
-122.03880 37.87710
</gml:posList>
</gml:LineString>
<dip>45.0</dip>
<upperSeismoDepth>10.0</upperSeismoDepth>
<lowerSeismoDepth>20.0</lowerSeismoDepth>
</simpleFaultGeometry>
<magScaleRel></magScaleRel>
<ruptAspectRatio></ruptAspectRatio>
<rake></rake>
</simpleFaultSource>
</sourceModel>
</nrml>
Complex Fault Source
<complexFaultGeometry>
<faultTopEdge>
<gml:LineString>
<gml:posList>
-124.704 40.363 0.5493260E+01
-124.977 41.214 0.4988560E+01
-125.140 42.096 0.4897340E+01
</gml:posList>
</gml:LineString>
</faultTopEdge>
<intermediateEdge>
<gml:LineString>
<gml:posList>
-124.704 40.363 0.5593260E+01
-124.977 41.214 0.5088560E+01
-125.140 42.096 0.4997340E+01
</gml:posList>
</gml:LineString>
</intermediateEdge>
<intermediateEdge>
<gml:LineString>
<gml:posList>
-124.704 40.363 0.5693260E+01
-124.977 41.214 0.5188560E+01
-125.140 42.096 0.5097340E+01
</gml:posList>
</gml:LineString>
</intermediateEdge>
<faultBottomEdge>
<gml:LineString>
<gml:posList>
-123.829 40.347 0.2038490E+02
-124.137 41.218 0.1741390E+02
-124.252 42.115 0.1752740E+02
</gml:posList>
</gml:LineString>
</faultBottomEdge>
</complexFaultGeometry>
<magScaleRel></magScaleRel>
<ruptAspectRatio></ruptAspectRatio>
<rake></rake>
</complexFaultSource>
</sourceModel>
</nrml>
To load in a source model such as those shown above, in an IPython environment simply
execute the following:
>> from openquake.hmtk.parsers.source_model.nrml04_parser import \
       nrmlSourceModelParser

>> model_filename = 'path/to/source_model_file.xml'

>> model_parser = nrmlSourceModelParser(model_filename)

>> model = model_parser.read_file()
Area source - ID: 1, name: Quito
Point Source - ID: 2, name: point
Simple Fault source - ID: 3, name: Mount Diablo Thrust
Complex Fault Source - ID: 4, name: Cascadia Megathrust
If loaded successfully a list of the source typology, ID and source name for each source will
be returned to the screen as shown above. The variable model contains the whole source model,
and can support multiple typologies (i.e. point, area, simple fault and complex fault).
If a list of sources is already provided, these can be passed to the class at the creation:
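A minimal sketch of this is shown below, assuming the mtkSourceModel container class from openquake.hmtk.sources.source_model; the constructor arguments shown, and the pre-existing list of HMTK source objects called sources, are illustrative assumptions:
>> from openquake.hmtk.sources.source_model import mtkSourceModel
# 'sources' is assumed to be an existing list of HMTK source objects
# (e.g. mtkPointSource, mtkAreaSource, mtkSimpleFaultSource instances)
>> model = mtkSourceModel(identifier='001',
                          name='Example Source Model',
                          sources=sources)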
The optional parameters control the discretisation of the geometry of the corresponding sources, if they are present in the model: simple_mesh_spacing is the mesh spacing (in km) of the simple fault typology; complex_mesh_spacing is the mesh spacing (in km) of the complex fault typology; and area_discretisation is the spacing of the mesh of nodes used to discretise the area sources.
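As an illustration, these discretisation parameters might then be supplied when converting the loaded model into its oq-hazardlib representation. This is only a sketch: the method name convert_to_oqhazardlib and the exact keyword spellings are assumptions and should be verified against the mtkSourceModel class in your version of the toolkit.
>> from openquake.hazardlib.tom import PoissonTOM
# Assumed conversion method and keyword names - verify against the class
>> oq_model = model.convert_to_oqhazardlib(PoissonTOM(1.0),
                                            simple_mesh_spacing=1.0,
                                            complex_mesh_spacing=4.0,
                                            area_discretisation=10.0)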
Default Values
In ideal circumstances the user will have defined, for each source, the complete input model needed for a PSHA calculation before converting to either the nrml or the oq-hazardlib format. It is recognised, however, that it may still be desirable to generate a hazard model from the source model even if some information (such as the hypocentral depth distribution or nodal plane distribution) remains incomplete. This might be the case if one wishes to explore the sensitivity of the hazard curve to certain aspects of the modelling process. The default values are assumed to be as follows:
• Aspect Ratio: 1.0
• Magnitude Scaling Relation: wells1994 (“WC1994”)
• Nodal Plane Distribution: Strike = 0.0, Dip = 90.0, Rake = 0.0, Weight=1.0
• Hypocentral Depth Distribution: Depth = 10.0 km, Weight = 1.0
Selects the earthquakes within a distance of a simple fault source, where selector is an in-
stance of the HMTK “selector” class, distance is the distance from the fault, distance_metric
is the type of distance metric used (“joyner-boore” or “rupture”).
.create_oqnrml_source(use_defaults=False)
Converts the mtkSimpleFaultSource into its equivalent OpenQuake nrml model.
.create_oqhazardlib_source(tom, mesh_spacing, use_defaults=False)
Converts the source model into its equivalent oq-hazardlib class. tom is the temporal
occurrence model and mesh_spacing is the spacing (in km) of the mesh of nodes used to
discretise the fault surface.
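For example, for an existing mtkSimpleFaultSource instance (called fault_source here, which is assumed rather than defined above), the conversion could be invoked with a Poisson temporal occurrence model over one year and a 1 km mesh spacing:
>> from openquake.hazardlib.tom import PoissonTOM
# fault_source is assumed to be an existing mtkSimpleFaultSource instance
>> oq_fault = fault_source.create_oqhazardlib_source(PoissonTOM(1.0),
                                                     1.0,
                                                     use_defaults=True)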
Alternatively, if you have your site data in an array (as would be the case if you were loading from a csv file), you can use a built-in HMTK function to create the site model:
# For the same sites as in the previous example
>> from openquake.hmtk.hazard import site_array_to_collection
>> site_array = np.array([[30.0, 40.0, 760., 1.0, 100., 5.0, 1.],
                          [30.5, 40.5, 500., 1.0, 100., 5.0, 2.],
                          [31.0, 40.6, 200., 0.0, 100., 5.0, 3.]])
>> sites = site_array_to_collection(site_array)
3. Define the GMPE tectonic regionalisation. In this case we consider only one tectonic region type (Active Shallow Crust) and one GMPE (akkar2010).
1 # The Akkar & Bommer (2010) GMPE is known to
2 # OpenQuake as AkkarBommer2010
3 >> gmpe_model = { " Active Shallow Crust " : " AkkarBommer2010 " }
4. Define the intensity measure types and corresponding intensity measure levels
1 >> imt_list = [ " PGA " , " SA (1.0) " ]
2 >> pga_iml = [0.001 , 0.01 , 0.02 , 0.05 , 0.1 ,
3 0.2 , 0.4 , 0.6 , 0.8 , 1.0 , 2.0]
4 >> sa1_iml = [0.001 , 0.01 , 0.02 , 0.05 , 0.1 ,
5 0.2 , 0.3 , 0.5 , 0.7 , 1.0 , 1.5]
6 >> iml_list = [ pga_iml , sa1_iml ]
3 sites ,
4 gmpe_model ,
5 iml_list ,
6 imt_list ,
7 truncation_level =3.0 ,
8 source_integration_dist = None ,
9 rupture_integration_dist = None )
6. The output, in the above example “haz_curves”, is a dictionary that has the following
form:
1
2 >> haz_curves
3 { PGA : np . array ([[ P ( IML_1 ) , P ( IML_2 ) , ... P ( IML_nIML )] ,
4 [ P ( IML_1 ) , P ( IML_2 ) , ... P ( IML_nIML )] ,
5 [ P ( IML_1 ) , P ( IML_2 ) , ... P ( IML_nIML )]]) ,
6 SA (1.0): np . array ([[ P ( IML_1 ) , P ( IML_2 ) , ... P ( IML_nIML )] ,
7 [ P ( IML_1 ) , P ( IML_2 ) , ... P ( IML_nIML )] ,
8 [ P ( IML_1 ) , P ( IML_2 ) , ... P ( IML_nIML )]])}
where $P(IML_i)$ is the probability of exceeding intensity measure level i in the period of the temporal occurrence model (50 years in this case). So for each intensity measure type there is a corresponding 2-D array of values with $N_{SITES}$ rows and $N_{IMLS}$ columns, where $N_{SITES}$ is the number of sites in the site model and $N_{IMLS}$ is the number of intensity measure levels defined for the specific intensity measure type.
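Each entry can be manipulated as an ordinary numpy array. For instance, to extract the PGA hazard curve of the first site in the site model (an illustrative snippet based on the dictionary shown above):
>> pga_curves = haz_curves["PGA"]
>> pga_curves.shape      # (number of sites, number of intensity measure levels)
>> site_1_curve = pga_curves[0, :]   # hazard curve for the first site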
Fault Recurrence from Geology
Epistemic Uncertainties in the Fault Modelling
Tectonic Regionalisation
Definition of the Fault Input
Fault Recurrence Models
Running a Recurrence Calculation from Geology
4. Geology Tools
$\dot{M}_o = c\,\mu A \dot{s}$ (4.1)
where A is the area of the fault surface (in km²), µ is the shear modulus (characterised in the toolkit in terms of GigaPascals, GPa), ṡ the slip rate and c the coefficient of seismogenic coupling. Slip rates must be input in mm yr⁻¹; lengths and areas in km and km² respectively. The magnitude frequency distribution calculators differ primarily in the manner in which this moment rate is distributed in order to constrain the activity rate. The different calculators are described below.
Within the toolkit a purely decision-based epistemic uncertainty analysis is supported. This requires that, for each parameter for which epistemic uncertainty is to be considered, the user must specify the alternative values and the corresponding weightings. Currently, epistemic uncertainty is supported on six different parts of the model:
1. Slip Rate (mm yr−1 )
2. Magnitude Scaling Relation
3. Shear Modulus (GPa)
4. Displacement to Length Ratio
5. Number of standard deviations on the Magnitude Scaling Relation
6. Configuration of the Magnitude Frequency Distribution
As input to the function, these epistemic uncertainties must be represented as a list of tuples,
with a corresponding value and weight. For example, if one wished to characterise the slip on a
fault by three values (e.g. 3.0, 5.0, 7.0) with corresponding weightings (e.g. 0.25, 0.5, 0.25), the
slip should be defined as shown below:
1 >> slip = [(3.0 , 0.25) , (5.0 , 0.5) , (7.0 , 0.25)]
In characterising uncertainty in this manner the user is essentially creating a logic tree for
each source, in which the total number of activity rate models (i.e. the end branches of the logic
tree) is the product of the number of alternative values input for each supported parameter. The
user can make a choice as to how they wish this uncertainty to be represented in the resulting
hazard model:
Complete Enumeration
This will essentially reproduce the fault in the source model N times, where N is the total number of end branches; on each branch the resulting activity rate is multiplied by the weight of the branch.
Collapsed Branches
In some cases it may become too costly to reproduce the fault models separately for each end branch, and the user may simply wish to collapse the logic tree into a single activity rate. This rate, represented by an incremental magnitude frequency distribution, is the sum of the weighted activity rates of all the branches. To calculate this the program will determine the minimum and maximum magnitude across all the branches, then, using a user-specified bin width, will calculate the weighted sum of the occurrence rates in each bin.
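The weighted summation itself is straightforward. The following sketch (an illustration of the principle only, not the HMTK implementation, with hypothetical rates) shows two branch MFDs defined on a common set of magnitude bins being collapsed into a single rate model:
>> import numpy as np
>> bin_width = 0.1
>> mags = np.arange(5.0, 7.0 + bin_width, bin_width)   # common magnitude bins
# Hypothetical incremental occurrence rates for two branches
>> rates_branch_1 = 10.0 ** (4.0 - 1.0 * mags)
>> rates_branch_2 = 10.0 ** (3.8 - 0.9 * mags)
>> weights = [0.6, 0.4]
# Collapsed rate = weighted sum of the branch rates in each bin
>> collapsed_rates = weights[0] * rates_branch_1 + weights[1] * rates_branch_2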
N.B. When collapsing the branches, the original magnitude scaling relations used on the branches and the scaling relation associated with the source in the resulting OpenQuake source model are not necessarily the same! The user will need to specify the scaling relation that will be assigned to the fault source when the branches are collapsed.
Magnitude Scaling Relations
To ensure compatibility with the OpenQuake engine, the scaling relations are taken directly
from the OpenQuake library. Therefore the only scaling relations available are those that can
be currently found in the oq-hazardlib (wells1994 and thomas2010 at the time of writing). To
implement new magnitude scaling relations the reader is referred to the documentation and
source code for the OpenQuake hazard library (http://docs.openquake.org/oq-hazardlib).
Here the tectonic regionalisation represents a list of categories (albeit only one is shown above). The list's elements are demarcated by the "-" symbol and all indentation is relative to that symbol. So, in the above example, a single region class has been defined with the name "Active Shallow Crust" and a unique identifier code "001". Default values are provided for the three data attributes: magnitude scaling relation, shear modulus and displacement to length ratio. Each attribute is associated with a dictionary containing two keys, "Value" and "Weight", which define the lists of values and the corresponding weights, respectively.
N.B. If, for any of the above attributes, the number of weights is not the same as the number
of values, or the weights do not sum to 1.0 then an error will be raised!
A fault model must be defined with both a model identifier key and a model name ("001" and "Template Simple Fault" in the example below). Each fault is then defined as an element of the Fault_Model list.
Fault_Model_ID: 001
Fault_Model_Name: Template Simple Fault
Fault_Model:
- ID: 001
Tectonic_Region: Active Shallow Crust
Fault_Name: A Simple Fault
Fault_Geometry: {
Fault_Typology: Simple,
# For simple typology, defines the trace in terms of Long., Lat.
Fault_Trace: [30.0, 30.0,
30.0, 31.5],
- Model_Name: YoungsCoppersmithExponential
# Example constructor of the Youngs & Coppersmith (1985) Exponential model
MFD_spacing: 0.1
Model_Weight: 0.3
Maximum_Magnitude:
Maximum_Magnitude_Uncertainty:
Minimum_Magnitude: 5.0
b_value: [0.8, 0.05]
####################################################
# Example constructor of the Youngs & Coppersmith (1985) Characteristic model
- Model_Name: YoungsCoppersmithCharacteristic
Model_Weight: 0.3
Maximum_Magnitude:
Maximum_Magnitude_Uncertainty:
Minimum_Magnitude: 5.0
MFD_spacing: 0.1
b_value: [0.8, None]
Shear_Modulus: {
Value: [30., 35.0],
Weight: [0.8, 0.2]}
Magnitude_Scaling_Relation: {
Value: [WC1994],
Weight: [1.0]}
Scaling_Relation_Sigma: {
Value: [-1.5, 0.0, 1.5],
Weight: [0.15, 0.7, 0.15]}
Aspect_Ratio: 1.5
Displacement_Length_Ratio: {
Value: [1.25E-5, 1.5E-5],
Weight: [0.5, 0.5]}
$M_o(M_W) = 10.0^{16.05 + 1.5 M_W} = M_o(M_{MAX})\, e^{-\bar{d}\,\Delta M}$ (4.2)
where $\Delta M = M_{MAX} - M_W$ and $\bar{d} = d\,\log_e(10.0)$, with d equal to 1.5.
AndersonLuco1983 “Arbitrary”
The study of AndersonLuco1983 defines three categories of models for deriving a magnitude recurrence distribution from the geological slip. The first model refers the recurrence to the entire fault, which we call the “Arbitrary” model here. The second model refers the recurrence to the rupture area of the maximum earthquake, the “Area Mmax” model here. The third category relates the recurrence to a specific site on the fault, and is not yet implemented in the toolkit. Within each of the three categories there are three different subcategories, which allow for different forms of tapering at the upper edge of the distribution. The reader is referred to the original paper of AndersonLuco1983 for further details and a complete derivation of the models. The different forms of the recurrence model are referred to here as types 1, 2 and 3, which correspond to equations 4, 3 and 2 in the original paper of AndersonLuco1983.
1. The ‘first’ type of AndersonLuco1983 arbitrary model is defined as:
$N(M_W \geq M) = \frac{\bar{d} - \bar{b}}{\bar{d}}\,\frac{\mu A \dot{s}}{M_o(M_{MAX})}\,\exp\left(\bar{b}\left[M_{MAX} - M\right]\right)$ (4.3)
where $\bar{b} = b\,\log_e(10.0)$, $M_o(M_{MAX})$ is the moment of the maximum magnitude, and A and ṡ are the area of the fault and the slip rate, as defined previously.
2. The ‘second’ type of model is defined as:
$N(M_W \geq M) = \frac{\bar{d} - \bar{b}}{\bar{b}}\,\frac{\mu A \dot{s}}{M_o(M_{MAX})}\left[\exp\left(\bar{b}\left[M_{MAX} - M\right]\right) - 1\right]$ (4.4)
3. The ‘third’ type of model is defined as:
$N(M_W \geq M) = \frac{\bar{d}\left(\bar{d} - \bar{b}\right)}{\bar{b}}\,\frac{\mu A \dot{s}}{M_o(M_{MAX})}\left\{\frac{1}{\bar{b}}\left[\exp\left(\bar{b}\left[M_{MAX} - M\right]\right) - 1\right] - \left[M_{MAX} - M\right]\right\}$ (4.5)
- Model_Name: AndersonLucoArbitrary
# Example constructor of the Anderson & Luco (1983) - Arbitrary Exponential
# Type - chooses between type 1 (’First’), type 2 (’Second’) or type 3 (’Third’)
Type: First
MFD_spacing: 0.1
Model_Weight: 0.1
# Maximum Magnitude of the exponential distribution
Maximum_Magnitude:
Maximum_Magnitude_Uncertainty:
Minimum_Magnitude: 4.5
# b-value of the exponential distribution as [expected, uncertainty]
b_value: [0.8, 0.05]
The Model_Name and the Type are self-explanatory. Model_Weight is the weighting of the particular MFD within the epistemic uncertainty analysis. MFD_spacing is the spacing of the evenly discretised magnitude frequency distribution that will be output. The three parameters Minimum_Magnitude, Maximum_Magnitude and Maximum_Magnitude_Uncertainty define the bounding limits of the MFD and the standard deviation of $M_{MAX}$ in $M_W$ units. Minimum_Magnitude is an essential attribute, whereas Maximum_Magnitude and Maximum_Magnitude_Uncertainty are optional. If not specified, the code will calculate Maximum_Magnitude and Maximum_Magnitude_Uncertainty from the magnitude scaling relationship. Finally, as these are exponential models the b-value must be specified. Here it is preferred that the b-value is specified as a pair [b-value, b-value uncertainty], although at present the epistemic uncertainty on the b-value does not propagate (this may change in future!).
As with the catalogue tools, plotting functions are available to assist the user in understanding the nature of the recurrence model used for a given fault. To illustrate the impact of the choice of the ‘first’, ‘second’ and ‘third’ type of model we consider a simple fault with the following properties: along-strike length = 100 km, down-dip width = 20 km, rake = 0.0 (strike-slip) and slip rate = 10 mm/yr. The wells1994 magnitude scaling relation is assumed. The fault and the three magnitude frequency distributions are configured as shown:
>> slip = 10.0  # Slip rate in mm/yr
# Area = along-strike length (km) * down-dip width (km)
>> area = 100.0 * 20.0
# Rake = 0.
>> rake = 0.
# Magnitude Scaling Relation
>> from openquake.hazardlib.scalerel.wc1994 import WC1994
>> msr = WC1994()

>> and_luc_config1 = {'Model_Name': 'AndersonLucoArbitrary',
                      'Model_Type': 'First',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}
>> and_luc_config2 = {'Model_Name': 'AndersonLucoArbitrary',
                      'Model_Type': 'Second',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}
>> and_luc_config3 = {'Model_Name': 'AndersonLucoArbitrary',
                      'Model_Type': 'Third',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}
Figure 4.1 – Comparison of magnitude frequency distributions for the specified fault using the three different models of the AndersonLuco1983 “Arbitrary” configuration
The models are then compared, as shown in Figure 4.1, using the following commands:
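A sketch of such a comparison is given below. It mirrors the calls shown for the other models later in this chapter; plot_recurrence_models is assumed to have been imported from the HMTK fault plotting tools.
# plot_recurrence_models is assumed to be imported from the HMTK plotting tools
>> anderson_luco_arbitrary = [and_luc_config1,
                              and_luc_config2,
                              and_luc_config3]
>> plot_recurrence_models(anderson_luco_arbitrary,
                          area,
                          slip,
                          msr,
                          rake,
                          msr_sigma=0.0)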
The second set of models from AndersonLuco1983 considers the case in which the recurrence model is referred to the rupture area of the maximum earthquake specified on the fault. As the area is not extracted directly from the geometry, additional information must be provided, namely the aspect ratio of ruptures on the fault and the displacement to length ratio (α) of the fault, from which the parameter β is defined:
$\beta = \sqrt{\frac{\alpha M_o(0)}{\mu W}}$ (4.6)
The three types of AndersonLuco1983 “Area Mmax” model are then calculated via:
1. Type 1 (“First”)
$N(M_W \geq M) = \frac{\bar{d} - \bar{b}}{\bar{d}}\,\frac{\dot{s}}{\beta^2}\,\exp\left(-\frac{\bar{d}}{2} M_{MAX}\right)\exp\left(\bar{b}\left[M_{MAX} - M\right]\right)$ (4.7)
2. Type 2 (“Second”)
$N(M_W \geq M) = \frac{\bar{d} - \bar{b}}{\bar{b}}\,\frac{\dot{s}}{\beta^2}\,\exp\left(-\frac{\bar{d}}{2} M_{MAX}\right)\left[\exp\left(\bar{b}\left[M_{MAX} - M\right]\right) - 1\right]$ (4.8)
3. Type 3 (“Third”)
$N(M_W \geq M) = \frac{\bar{d}\left(\bar{d} - \bar{b}\right)}{\bar{b}}\,\frac{\dot{s}}{\beta^2}\,\exp\left(-\frac{\bar{d}}{2} M_{MAX}\right)\left\{\frac{1}{\bar{b}}\left[\exp\left(\bar{b}\left[M_{MAX} - M\right]\right) - 1\right] - \left[M_{MAX} - M\right]\right\}$ (4.9)
As the rupture aspect ratio and displacement to length ratio are attributes of the fault and not of the MFD, the MFD configuration is the same as that of the AndersonLuco1983 “Arbitrary” calculator, albeit that Model_Name must now be specified as AndersonLucoAreaMmax. As before, the maximum magnitude and its uncertainty are optional, and will be taken from the magnitude scaling relation if not specified in the configuration. This is permitted simply to ensure flexibility of the algorithm, although given the context of the “Area Mmax” algorithm it is understood that the maximum magnitude should be interpreted by the modeller. If this is not the case, and the maximum magnitude is intended to be constrained using the geometry of the rupture, the “Arbitrary” model may be preferable.
The three distributions can be compared visually for the same fault using the plotting tools shown previously. The example below, using the same fault properties defined previously, will generate a plot similar to that shown in Figure 4.2.
>> and_luc_config1 = {'Model_Name': 'AndersonLucoAreaMmax',
                      'Model_Type': 'First',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}
>> and_luc_config2 = {'Model_Name': 'AndersonLucoAreaMmax',
                      'Model_Type': 'Second',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}
>> and_luc_config3 = {'Model_Name': 'AndersonLucoAreaMmax',
                      'Model_Type': 'Third',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}
>> anderson_luco_area_mmax = [and_luc_config1,
                              and_luc_config2,
                              and_luc_config3]
>> plot_recurrence_models(anderson_luco_area_mmax,
                          area,
                          slip,
                          msr,
                          rake,
                          disp_length_ratio=1.25E-5,
                          msr_sigma=0.0)
Figure 4.2 – Comparison of magnitude frequency distributions for the specified fault using the three different models of the AndersonLuco1983 “Area-Mmax” configuration
Characteristic
Although the term “Characteristic” may take on different meanings in the literature, in the present calculator it refers to the circumstance in which the fault is assumed to rupture with magnitudes distributed in a narrow range around a single characteristic magnitude. The model is therefore a truncated Gaussian distribution, for which the following must be specified: the mean characteristic magnitude, the uncertainty (in magnitude units) and the number of standard deviations above and below the mean to be used as truncation limits.
Figure 4.3 – Magnitude frequency distribution for the specified fault configuration using the “Characteristic” model
The distribution is shown, for the fault example defined previously, in Figure 4.3, which is
generated using the code example shown below:
>> characteristic = [{'Model_Name': 'Characteristic',
                      'MFD_spacing': 0.05,
                      'Model_Weight': 1.0,
                      'Maximum_Magnitude': None,
                      'Sigma': 0.15,
                      'Lower_Bound': -3.0,
                      'Upper_Bound': 3.0}]
>> plot_recurrence_models(characteristic,
                          area,
                          slip,
                          msr,
                          rake,
                          msr_sigma=0.0)
YoungsCoppersmith1985 “Exponential”
This model is another form of “Exponential” model and is noted as being similar in construct to
the AndersonLuco1983 Type 2 models. It is included here mostly for completeness. The model
is given as:
where $M_o^{MAX}$ is the moment corresponding to the maximum magnitude. The inputs for the model are defined in a similar manner as for the AndersonLuco1983 models:
- Model_Name: YoungsCoppersmithExponential
# Example constructor of the Youngs & Coppersmith (1985) Exponential model
MFD_spacing: 0.1
Model_Weight: 0.3
Maximum_Magnitude:
Maximum_Magnitude_Uncertainty:
Minimum_Magnitude: 5.0
b_value: [0.8, 0.05]
Note that all of the exponential models described here contain the term d − b, or some variant
thereof, where d is equal to 1.5. This introduces the condition that b ≤ 1.5.
YoungsCoppersmith1985 “Characteristic”
The YoungsCoppersmith1985 model is a hybrid model, comprising an exponential distribution for lower magnitudes and a fixed recurrence rate for the characteristic magnitude $M_C$. The exponential component of the model is described via:
$N(M) - N(M_C) = \frac{\mu A \dot{s}\; e^{-\beta\left(M_{MAX} - M - 0.5\right)}}{\left[1 - e^{-\beta\left(M_{MAX} - M - 0.5\right)}\right] M_o^{MAX}\left[\frac{b\,10^{-c/2}}{c - b} + \frac{b\,e^{\beta}\left(1 - 10^{-c/2}\right)}{c}\right]}$ (4.11)
where β = b ln (10), c = 1.5 and all other parameters are described as for the YoungsCoppersmith1985
“Exponential” and AndersonLuco1983 models. The rate for the characteristic magnitude is then
given by:
Figure 4.4 – Magnitude frequency distribution for the specified fault configuration using the YoungsCoppersmith1985 “Exponential” and “Hybrid” models
The parser contains a method read_file(x), which takes as an input the desired mesh spacing (in km) used to create the mesh of the fault surface. In the example below this is set to 1.0 km, but for large complex faults (such as large subduction zones) it may be desirable to select a larger spacing to avoid storing a large mesh in RAM.
>> from openquake.hmtk.parsers.faults.fault_yaml_parser import \
       FaultYmltoSource

>> input_file = 'path/to/fault_model_file.yml'

>> parser = FaultYmltoSource(input_file)

# Spacing of the fault mesh (km) must be specified at input (here 1.0 km)
>> fault_model, tect_reg = parser.read_file(1.0)
In the second step, we simply execute the method to calculate the recurrence on the fault, and then write the resulting source model to an xml file.
# Builds the fault model
>> fault_model.build_fault_model()
# Specify the xml file for writing the output model
>> output_file = 'path/to/output_source_model_file.xml'

>> fault_model.source_model.serialise_to_nrml(output_file,
                                              use_defaults=True)
The serialiser takes as an optional input the choice to accept default values for certain attributes that may be missing from the fault source definition; these are the same default values described previously.
Epistemic Uncertainties
The following example illustrates the case in which epistemic uncertainties are incorporated into the analysis. A demonstration file (tests/parsers/faults/yaml_examples/simple_fault_example_4branch.yml) is included, which considers two epistemic uncertainties for a specific fault: slip rate and magnitude frequency distribution type. The fault has two estimates of slip rate (5 mm yr⁻¹ and 7 mm yr⁻¹), each assigned a weighting of 0.5. Two magnitude frequency distribution types (Characteristic and AndersonLuco1983 “Arbitrary”) are assigned, with weights of 0.7 and 0.3 respectively. The analysis is run in the manner described previously. Firstly we consider the case when the different options are enumerated (the default option):
>> from openquake.hmtk.parsers.faults.fault_yaml_parser import \
       FaultYmltoSource
>> input_file = \
       'tests/parsers/faults/yaml_examples/simple_fault_example_4branch.yml'
>> parser = FaultYmltoSource(input_file)
>> fault_model, tect_reg = parser.read_file(1.0)
>> fault_model.build_fault_model()
>> output_file = 'path/to/output_source_model_file.xml'
>> fault_model.source_model.serialise_to_nrml(output_file,
                                              use_defaults=True)
As four different activity rates are produced, the source is duplicated four times, each with an activity rate that corresponds to the rate calculated for the specific branch, multiplied by the weight of the branch. The output is a nrml file with four sources, as illustrated below:
<gml:LineString>
<gml:posList>30.0 30.0 30.0 31.0</gml:posList>
</gml:LineString>
<dip>30.0</dip>
<upperSeismoDepth>0.0</upperSeismoDepth>
<lowerSeismoDepth>20.0</lowerSeismoDepth>
</simpleFaultGeometry>
<magScaleRel>WC1994</magScaleRel>
<ruptAspectRatio>1.5</ruptAspectRatio>
<incrementalMFD minMag="4.5" binWidth="0.1">
<occurRates>0.0566539993476 0.0471227441454 0.0391949913751
0.0326009738345 0.0271163089382 0.0225543633808
0.0187599023404 0.0156038071162 0.0129786814505
0.0107951970272 0.00897905378917 0.00746845164061
0.00621198750089 0.00516690614979 0.00429764534408
0.00357462569825 0.00297324415106 0.00247303676749
0.00205698238781 0.00171092342797 0.00142308412252
0.00118366981634 0.000984533670182 0.000818899438288
0.000681130884944 0.000566539993476</occurRates>
</incrementalMFD>
<rake>-90.0</rake>
</simpleFaultSource>
</sourceModel>
</nrml>
If, alternatively, the user wishes to collapse the logic tree branches to give a single source as
an output, this option can be selected as follows:
1 ...
2 >> from openquake . hazardlib . scalerel . wc1994 import WC1994
3 >> fault_model . build_fault_model ( collapse = True ,
4 rendered_msr = WC1994 ())
5 ...
To collapse the branches it is simply necessary to set the input collapse to True. The second option requires further explanation. A seismogenic source requires the definition of a corresponding magnitude scaling relation, even if multiple scaling relations have been used as part of the epistemic uncertainty analysis. As collapsing the branches means that the activity rate is no longer associated with a single specified scaling relation, but is an aggregate of many, the user must select the scaling relation to be associated with the output activity rate, for use in the OpenQuake hazard calculation. Therefore the input rendered_msr must be given an instance of one of the supported magnitude scaling relationships.
The output file should resemble the following:
5. Geodetic Tools
$\dot{M}_o = c\,\mu A \dot{s}$ (5.1)
where A is the area of the coupled surface, ṡ is the slip rate, µ the shear modulus and c the coefficient describing the fraction of seismogenic coupling. Initially, the moment-rate tensor may be related to the strain rate tensor ($\dot{\varepsilon}_{ij}$) using the formula of Kostrov1974:
$\dot{M}_{o_{ij}} = 2\,\mu H A\,\dot{\varepsilon}_{ij}$ (5.2)
where H is the seismogenic thickness and A the area of the seismogenic source. Adopting the definition for a deforming continuum given by BirdLiu2007 the moment rate is equal to:
$\dot{M}_o = A \left\langle cz \right\rangle \mu \begin{cases} 2\dot{\varepsilon}_3 & : \dot{\varepsilon}_2 < 0 \\ -2\dot{\varepsilon}_1 & : \dot{\varepsilon}_2 \geq 0 \end{cases}$ (5.3)
where it is assumed that $\dot{\varepsilon}_1 \leq \dot{\varepsilon}_2 \leq \dot{\varepsilon}_3$ and $\dot{\varepsilon}_1 + \dot{\varepsilon}_2 + \dot{\varepsilon}_3 = 0$ (i.e. no volumetric changes are observed). The coupled seismogenic thickness ($\langle cz \rangle$) is a characteristic of each tectonic zone, and for the current purposes corresponds to the values (and regionalisation) proposed by BirdKagan2004.
The BirdLiu2007 Approach
As the regionalisation of BirdKagan2004 underpins the BirdLiu2007 methodology, the following approach is used to derive the shallow seismicity rate. The geodetic moment rate is first divided by the “model” moment rate ($\dot{M}_o^{CMT}$), which is the integral of the tapered Gutenberg-Richter distribution fit to the subset of the Global CMT catalogue for each tectonic zone, then multiplied by the number of events in the sub-catalogue ($\dot{N}_{CMT}$) above the threshold (completeness) magnitude for the sub-catalogue ($m_T^{CMT}$):
$\dot{N}\left(m > m_T^{CMT}\right) = \frac{\dot{M}_o\,\dot{N}_{CMT}}{\dot{M}_o^{CMT}}$ (5.4)
The forecast rate of seismicity greater than mT (Ṅ (m > mT )) for a particular zone (or cell) is
described using the tapered Gutenberg-Richter distribution:
$\dot{N}\left(m > m_T\right) = \dot{N}\left(m > m_T^{CMT}\right)\left(\frac{M_o\left(m_T\right)}{M_o\left(m_T^{CMT}\right)}\right)^{-\beta}\exp\left(\frac{M_o\left(m_T^{CMT}\right) - M_o\left(m_T\right)}{M_o\left(m_c\right)}\right)$ (5.5)
In the present format, the values exx, eyy, exy describe the horizontal components of the strain tensor (in the example above in terms of nanostrain, 10⁻⁹). The region term corresponds to the region (in this instance the Kreemer_etal2003 class) to which the cell belongs: Intra-plate [IPL], Subduction [S], Continental [C], Oceanic [O] and Ridge [R]. If the user does not have a previously defined regionalisation, one can be assigned using the 0.6° × 0.5° cells of the built-in global regionalisation.
The simplest strain workflow is to implement the model as defined by Bird_etal2010. This
process is illustrated in the following steps:
1. To load the csv data the ReadStrainCsv tool is used:
>> from openquake.hmtk.parsers.strain.strain_csv_parser import \
       ReadStrainCsv, WriteStrainCsv

# Load the file
>> reader = ReadStrainCsv('path/to/strain/file.csv')
>> strain_model = reader.read_data(scaling_factor=1E-9)
2. If the regionalisation is not supplied within the input file then it can be assigned initially from the in-built Kreemer_etal2003 regionalisation. This is done as follows (and may take time to execute depending on the size of the data):
# (Optional) To assign the regionalisation from Kreemer et al. (2003)
>> from openquake.hmtk.strain.regionalisation.kreemer_regionalisation \
       import KreemerRegionalisation
>> regionalisation = KreemerRegionalisation()
>> strain_model = regionalisation.get_regionalisation(strain_model)
3. The next step is to implement the Shift calculations. The Shift module must first be imported and the magnitudes for which the activity rates are to be calculated must be defined as a list (or array). The strain data is input into the calculation along with two other configurable options: cumulative decides whether to return the cumulative rate of events above each magnitude (True) or the incremental activity rate for the bin M(i) : M(i+1) (False); in_seconds decides whether to return the rates per second, for consistency with Bird_etal2010 (True), or as annual rates (False).
>> from openquake.hmtk.strain.shift import Shift
# In this example, calculate cumulative annual rates
# for M > 5., 6., 7., 8.
>> magnitudes = [5., 6., 7., 8.]
>> model = Shift(magnitudes)
>> model.calculate_activity_rate(strain_model,
                                 cumulative=True,
                                 in_seconds=False)
4. Finally the resulting model can be written to a csv file. This will be in the same format as
the input file, now with additional attributes and activity rates calculated.
# Export the resulting rates to a csv file
>> writer = WriteStrainCsv('path/to/output/file.csv')
>> writer.write_file(model.strain, scaling_factor=1E-9)
Additional support for writing a continuum model to a nrml Point Source model is envisaged,
although further work is needed to determine the optimum approach for defining the seismogenic
coupling depths, hypocentral depths and focal mechanisms.
Bibliography
Books
Articles
Other Sources
Part I
Appendices
Basic Data Types
Scalar Parameters
Iterables
Dictionaries
Loops and Logicals
Functions
Classes and Inheritance
Simple Classes
Inheritance
Abstraction
Numpy/Scipy
A. The 10 Minute Guide to Python!
The HMTK is intended to be used by scientists and engineers without the necessity of having an
existing knowledge of Python. It is hoped that the examples contained in this manual should
provide enough context to allow the user to understand how to use the tools for their own needs.
In spite of this, however, an understanding of the fundamentals of the Python programming
language can greatly enhance the user experience and permit the user to join together the tools in
a workflow that best matches their needs.
The aim of this appendix is therefore to introduce some fundamentals of the Python program-
ming language in order to help understand how, and why, the HMTK can be used in a specific
manner. If the reader wishes to develop their knowledge of the Python programming language
beyond the examples shown here, there is a considerable body of literature on the topic from
both a scientific and developer perspective.
• integer An integer number. If the decimal point is omitted when the number is typed, it will be considered an integer
1 >> b = 3
2 >> print b , type ( b )
3 3 < type ’ int ’ >
The functions float() and int() convert a number to a float or an integer respectively. Note that taking int() of a fractional number truncates it to its integer part (e.g. int(3.5) gives 3)
1 >> float ( b )
2 3.0
3 >> int ( a )
4 3
• string A text string (technically a “list” of text characters). The string is indicated by the
quotation marks ”something” or ’something else’
1 >> c = " apples "
2 >> print c , type ( c )
3 apples < type ’ str ’ >
• bool For logical operations python can recognise a variable with a boolean data type (True
/ False).
1 >> d = True
2 >> if d :
3 print "y"
4 else :
5 print "n"
6 y
7 >> d = False
8 >> if d :
9 print "y"
10 else :
11 print "n"
12 n
Care should be taken in Python as the value 0 and 0.0 are both recognised as False if
applied to a logical operation. Similarly, booleans can be used in arithmetic where True
and False take the values 1 and 0 respectively
1 >> d = 1.0
2 >> if d :
3 print "y"
4 else :
5 print "n"
6 y
7 >> d = 0.0
8 >> if d :
9 print "y"
10 else :
11 print "n"
12 n
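For instance, the following short (illustrative) snippet shows booleans behaving as 1 and 0 in arithmetic:
>> True + True
2
>> 3.0 * False
0.0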
Scalar Arithmetic
Scalars support basic mathematical operations (# indicates a comment):
1 >> a = 3.0
2 >> b = 4.0
3 >> a + b # Addition
4 7.0
5 >> a * b # Multiplication
6 12.0
7 >> a - b # Subtraction
8 -1.0
9 >> a / b # Division
10 0.75
11 >> a ** b # Exponentiation
12 81.0
13 # But integer behaviour can be different !
14 >> a = 3; b = 4
15 >> a / b
16 0
17 >> b / a
18 1
A.1.2 Iterables
Python can also define variables as lists, tuples and sets. These data types can form the basis for
iterable operations. It should be noted that unlike other languages, such as Matlab or Fortran,
Python iterable locations are zero-ordered (i.e. the first location in a list has an index value of 0,
rather than 1).
• List A simple list of objects, which have the same or different data types. Data in lists can
be re-assigned or replaced
1 >> a_list = [3.0 , 4.0 , 5.0]
2 >> print a_list
3 [3.0 , 4.0 , 5.0]
4 >> another_list = [3.0 , " apples " , False ]
5 >> print another_list
6 [3.0 , ’ apples ’ , False ]
7 >> a_list [2] = -1.0
8 >> print a_list
9 [3.0 , 4.0 , -1.0]
• Tuples Collections of objects that can be iterated upon. As with lists, they can support
mixed data types. However, objects in a tuple cannot be re-assigned or replaced.
1 >> a_tuple = (3.0 , " apples " , False )
2 >> print a_tuple
3 (3.0 , ’ apples ’ , False )
4 # Try re - assigning a value in a tuple
5 >> a_tuple [2] = -1.0
6 TypeError Traceback ( most recent call last )
7 < ipython - input -43 -644687 cfd23c > in < module >()
8 ----> 1 a_tuple [2] = -1.0
9
10 TypeError : ’ tuple ’ object does not support item assignment
• Sets A set is a special case of an iterable in which the elements are unordered, but contains
more enhanced mathematical set operations (such as intersection, union, difference, etc.)
1 >> from sets import Set
2 >> x = Set ([3.0 , 4.0 , 5.0 , 8.0])
3 >> y = Set ([4.0 , 7.0])
4 >> x . union ( y )
5 Set ([3.0 , 4.0 , 5.0 , 7.0 , 8.0])
6 >> x . intersection ( y )
7 Set ([4.0])
8 >> x . difference ( y )
9 Set ([8.0 , 3.0 , 5.0]) # Notice the results are not ordered !
Indexing
For some iterables (including lists, sets and strings) Python allows for subsets of the iterable
to be selected and returned as a new iterable. The selection of elements within the set is done
according to the index of the set.
1 >> x = range (0 , 10) # Create an iterable
2 >> print x
3 [0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9]
4 >> print x [0] # Select the first element in the set
5 0 # recall that iterables are zero - ordered !
6 >> print x [ -1] # Select the last element in the set
7 9
8 >> y = x [:] # Select all the elements in the set
9 >> print y
10 [0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9]
11 >> y = x [:4] # Select the first four elements of the set
12 >> print y
13 [0 , 1 , 2 , 3]
14 >> y = x [ -3:] # Select the last three elements of the set
15 >> print y
16 [7 , 8 , 9]
17 >> y = x [4:7] # Select the 4 th , 5 th and 6 th elements
18 >> print y
19 [4 , 5 , 6]
A.1.3 Dictionaries
Python is capable of storing multiple data types associated with a map of variable names inside a
single object. This is called a “Dictionary”, and works in a similar manner to a “data structure” in
languages such as Matlab. Dictionaries are used frequently in the HMTK as ways of structuring
inputs to functions that share a common behaviour but may take different numbers and types of
parameters on input.
1 >> earthquake = { " Name " : " Parkfield " ,
2 " Year " : 2004 ,
3 " Magnitude " : 6.1 ,
4 " Recording Agencies ": [ " USGS " , " ISC " ]}
5 # To call or view a particular element in a dictionary
6 >> print earthquake [ " Name " ] , earthquake [ " Magnitude " ]
7 Parkfield 6.1
Loops and Logicals
Logical tests use the if, elif and else statements, with the condition followed by a colon and the dependent code indented:
1 >> a = 3.5
2 >> if ( a <= 1.0) or ( a > 3.0):
3 b = a - 1.0
4 else :
5 b = a ** 2.0
6 >> print b
7 2.5
Looping
There are several ways to apply looping in python. For simple mathematical operations, the
simplest way is to make use of the range function:
1 >> for i in range (0 , 5):
2 print i , i ** 2
3 0 0
4 1 1
5 2 4
6 3 9
7 4 16
The same could be achieved using a while loop (though this approach may be less desirable depending on the circumstance):
1 >> i = 0
2 >> while i < 5:
3 print i , i ** 2
4 i += 1
5 0 0
6 1 1
7 2 4
8 3 9
9 4 16
The same results can be generated, arguably more cleanly, by making use of the enumerate
function:
1 >> fruit_data = [ " apples " , " oranges " , " bananas " , " lemons " ,
2 " cherries " ]
3 >> for i , fruit in enumerate ( fruit_data ):
4 print i , fruit
5 0 apples
6 1 oranges
7 2 bananas
8 3 lemons
9 4 cherries
As with many other programming languages, Python contains the statements break to break
out of a loop, and continue to pass to the next iteration.
1 >> i = 0
2 >> while i < 10:
3 if i == 3:
4 i += 1
5 continue
6 elif i == 5:
7 break
8 else :
9 print i , i ** 2
10 i += 1
11 0 0
12 1 1
13 2 4
14 4 16
A.2 Functions
Python easily supports the definition of functions. A simple example is shown below. Pay careful
attention to indentation and syntax!
1 >> def a_simple_multiplier (a , b ):
2 """
3 Documentation string - tells the reader the function
4 will multiply two numbers , and return the result and
5 the square of the result
6 """
7 c = a * b
8 return c , c ** 2.0
9
10 >> x = a_simple_multiplier (3.0 , 4.0)
11 >> print x
12 (12.0 , 144.0)
In the above example the function returns two outputs. If only one output is assigned then
that output will take the form of a tuple, where the elements correspond to each of the two
outputs. To assign directly, simply do the following:
1 >> x , y = a_simple_multiplier (3.0 , 4.0)
2 >> print x
3 12.0
4 >> print y
5 144.0
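As a simple illustration, a magnitude converter class of the kind discussed below might be written as follows. This is only a sketch: the conversion coefficients are illustrative assumptions, chosen so that the converted values printed in the inheritance and abstraction examples that follow are reproduced.
>> class MagnitudeConverter(object):
       """
       Converts a magnitude from a local scale to a reference scale,
       using a different formula before and after a cut-off year
       """
       def __init__(self, converter_year):
           """
           Store the cut-off year separating the two formulae
           """
           self.converter_year = converter_year

       def convert(self, magnitude, year):
           """
           Converts the magnitude depending on the year of the event
           """
           if year < self.converter_year:
               # Coefficients are illustrative assumptions
               converted_magnitude = 0.2 + 1.1 * magnitude
           else:
               # Coefficients are illustrative assumptions
               converted_magnitude = 0.1 + 0.94 * magnitude
           return converted_magnitude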
In this example the class holds both the attribute converter_year and the method to convert the magnitude. The class is created (or “instantiated”) with only the information regarding the cut-off year at which to switch between the two conversion formulae. The class then has a method to convert a specific magnitude depending on its year.
A.3.2 Inheritance
Classes can be useful in many ways in programming. One such way is due to the property of
inheritance. This allows for classes to be created that can inherit the attributes and methods of
another class, but permit the user to add on new attributes and/or modify methods.
In the following example we create a new magnitude converter, which may work in the same
way as the MagnitudeConverter class, but with different conversion methods.
1 >> class NewMag nitud eConve rter ( MagnitudeConverter ):
2 """
3 A magnitude converter using different conversion
4 formulae
5 """
6 def convert ( self , magnitude , year ):
7 """
8 Converts the magnitude from one scale to another
9 - differently !!!
10 """
11 if year < self . converter_year :
12 converted_magnitude = -0.1 + 1.05 * magnitude
13 else :
14 converted_magnitude = 0.4 + 0.8 * magnitude
15 return converted_magnitude
16 # Now compare converters
17 >> converter1 = MagnitudeConverter (1990)
18 >> converter2 = Ne wMagni tudeCo nvert er (1990)
19 >> mag1 = converter1 . convert (5.0 , 1987)
20 >> print mag1
21 5.7
22 >> mag2 = converter2 . convert (5.0 , 1987)
23 >> print mag2
24 5.15
25 >> mag3 = converter1 . convert (5.0 , 1994)
26 >> print mag3
27 4.8
28 >> mag4 = converter2 . convert (5.0 , 1994)
29 >> print mag4
30 4.4
A.3.3 Abstraction
Inspection of the HMTK code (https://github.com/gem/oq-engine) shows frequent usage of classes
and inheritance. This is useful in our case if we wish to make available different methods for
the same problem. In many cases the methods may have similar logic, or may provide the same
types of outputs, but the specifics of the implementation may differ. Functions or attributes that
are common to all methods can be placed in a “Base Class”, permitting each implementation of
a new method to inherit the “Base Class” and its functions/attributes/behaviour. The new method
will simply modify those aspects of the base class that are required for the specific method in
question. This allows functions to be used interchangeably, thus allowing for a "mapping" of
data to specific methods.
An example of abstraction is shown using our two magnitude converters shown previously.
Imagine that a seismic recording network (named "XXX") has a model for converting from their
locally recorded magnitude to a reference global scale (for the purposes of narrative, imagine that
a change in recording procedures in 1990 results in a change of conversion model). A different
recording network (named “YYY”) has a different model for converting their local magnitude
to a reference global scale (and we imagine they also changed their recording procedures, but
they did so in 1994). We can create a mapping that would apply the correct conversion for each
locally recorded magnitude in a short catalogue, provided we know the local magnitude, the year
and the recording network.
1 >> CONVERSION_MAP = { " XXX " : MagnitudeConverter (1990) ,
2 " YYY " : Ne wMagn itudeC onver ter (1994)}
3 >> e arthquake_catalogue = [(5.0 , " XXX " , 1985) ,
4 (5.6 , " YYY " , 1992) ,
5 (4.8 , " XXX " , 1993) ,
6 (4.4 , " YYY " , 1997)]
7 >> for earthquake in earthquake_catalogue :
8 converted_magnitude = \ # Line break for long lines !
9 CONVERSION_MAP [ earthquake [1]]. convert ( earthquake [0] ,
10 earthquake [2])
11 print earthquake , converted_magnitude
12 (5.0 , " XXX " , 1985) 5.7
13 (5.6 , " YYY " , 1992) 5.78
14 (4.8 , " XXX " , 1993) 4.612
15 (4.4 , " YYY " , 1997) 3.92
So we have a simple magnitude homogeniser that applies the correct function depending on the network and year. It then becomes a very simple matter to add on new converters for new agencies; hence we have a “toolkit” of conversion functions!
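For instance, a converter for a hypothetical new network (“ZZZ”, with a purely illustrative change-over year) could simply be added to the mapping:
>> CONVERSION_MAP["ZZZ"] = MagnitudeConverter(2000)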
A.4 Numpy/Scipy
Python has two powerful libraries for undertaking mathematical and scientific calculation,
which are essential for the vast majority of scientific applications of Python: Numpy (for
multi-dimensional array calculations) and Scipy (an extensive library of applications for maths,
science and engineering). Both libraries are critical to both OpenQuake and the HMTK. Each
package is so extensive that a comprehensive description requires a book in itself. Fortunately
there is abundant documentation via the online help for Numpy www.numpy.org and Scipy
www.scipy.org, so we do not need to go into detail here.
The particular facet we focus upon is the way in which Numpy operates with respect to vector arithmetic. Users familiar with Matlab will recognise many similarities in the way the Numpy package undertakes array-based calculations. Likewise, as with Matlab, code that is well vectorised is significantly faster and more efficient than the pure Python equivalent.
The following shows how to undertake basic array arithmetic operations using the Numpy
library
1 >> import numpy as np
2 # Create two vectors of data , of equal length
3 >> x = np . array ([3.0 , 6.0 , 12.0 , 20.0])
4 >> y = np . array ([1.0 , 2.0 , 3.0 , 4.0])
5 # Basic arithmetic
6 >> x + y # Addition ( element - wise )
7 np . array ([4.0 , 8.0 , 15.0 , 24.0])
8 >> x + 2 # Addition of scalar
9 np . array ([5.0 , 8.0 , 14.0 , 22.0])
10 >> x * y # Multiplication ( element - wise )
11 np . array ([3.0 , 12.0 , 36.0 , 80.0])
12 >> x * 3.0 # Multiplication by scalar
13 np . array ([9.0 , 18.0 , 36.0 , 60.0])
14 >> x - y # Subtraction ( element - wise )
15 np . array ([2.0 , 4.0 , 9.0 , 16.0])
16 >> x - 1.0 # Subtraction of scalar
17 np . array ([2.0 , 5.0 , 11.0 , 19.0])
18 >> x / y # Division ( element - wise )
19 np . array ([3.0 , 3.0 , 4.0 , 5.0])
20 >> x / 2.0 # Division over scalar
21 np . array ([1.5 , 3.0 , 6.0 , 10.0])
22 >> x ** y # Exponentiation ( element - wise )
23 np . array ([3.0 , 36.0 , 1728.0 , 160000.0])
24 >> x ** 2.0 # Exponentiation ( by scalar )
25 np . array ([9.0 , 36.0 , 144.0 , 400.0])
Numpy contains a vast set of mathematical functions that can be operated on a vector (e.g.):
1 >> x = np . array ([3.0 , 6.0 , 12.0 , 20.0])
2 >> np . exp ( x )
3 np . array ([2.00855369 e +01 , 4.03428793 e +02 , 1.62754791 e +05 ,
4 4.85165195 e +08])
5 # Trigonometry
6 >> theta = np . array ([0. , np . pi / 2.0 , np . pi , 1.5 * np . pi ])
7 >> np . sin ( theta )
8 np . array ([0.0000 , 1.0000 , 0.0000 , -1.0000])
9 >> np . cos ( theta )
10 np . array ([1.0000 , 0.0000 , -1.0000 , 0.0000])
Some of the most powerful functions of Numpy, however, come from its logical indexing:
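A brief illustrative example, using the same array as before, is shown below:
>> x = np.array([3.0, 6.0, 12.0, 20.0])
>> idx = x > 5.0           # Boolean array marking the elements greater than 5
>> print idx
[False  True  True  True]
>> print x[idx]            # Select only the elements satisfying the condition
[  6.  12.  20.]
>> x[x > 10.0] = 0.0       # Logical indexing can also be used for assignment
>> print x
[ 3.  6.  0.  0.]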
The reader is referred to the online documentation for the full set of functions!