CLIMADA documentation
Release 6.0.2-dev
CLIMADA contributors
CLIMADA (CLIMate ADAptation) is a free and open-source software framework for climate risk assessment and adaptation option appraisal. Designed by a large scientific community, it helps researchers, policymakers, and businesses analyse the impacts of natural hazards and explore adaptation strategies.
CLIMADA is primarily developed and maintained by the Weather and Climate Risks Group at ETH Zürich.
If you use CLIMADA for your own scientific work, please reference the appropriate publications according to the Citation Guide.
This is the documentation of the CLIMADA core module which contains all functionalities necessary for performing
climate risk analysis and appraisal of adaptation options. Modules for generating different types of hazards and other
specialized applications can be found in the CLIMADA Petals module.
Useful links: WCR Group | CLIMADA Petals | CLIMADA website | Mailing list
Getting Started Getting started with CLIMADA: How to install? What are the basic concepts and functionalities?
Getting started
User Guide Want to go more in depth? Check out the User guide. It contains detailed tutorials on the different concepts,
modules and possible usage of CLIMADA.
To the user guide!
API Reference The reference guide contains a detailed description of the CLIMADA API. The API reference describes each module, class, method, and function.
To the reference guide!
Developer guide Saw a typo in the documentation? Want to improve existing functionalities? Want to extend them?
The contributing guidelines will guide you through the process of improving CLIMADA.
To the development guide!
Hint
ReadTheDocs hosts multiple versions of this documentation. Use the drop-down menu on the bottom left to switch
versions. stable refers to the most recent release, whereas latest refers to the latest development version.
Copyright Notice
Chapter 1: Getting Started
See also
You don’t have mamba or conda installed, or you are looking for advanced installation instructions? Look up our
detailed instructions on CLIMADA installation.
Getting started
The Getting started section, where you are currently, presents the very basics of CLIMADA.
For instance, to start learning about CLIMADA, you can have a look at the introduction.
You can also have a look at the paper repository to get an overview of research projects conducted with CLIMADA.
Programming in Python
It is best to have some basic knowledge of Python programming before starting with CLIMADA. But if you need a quick
introduction or reminder, have a look at the short Python Tutorial. Also have a look at the Python Dos and Don'ts
guide and at the Python Performance Guide for best-practice tips.
Tutorials
A good way to start using CLIMADA is to have a look at the tutorials in the User Guide. The 10-minute CLIMADA tutorial
will give you a quick introduction to CLIMADA, with a brief example of how to calculate your first impacts, as well as
your first appraisal of adaptation options, while the Overview presents the whole structure of CLIMADA in more
depth. You can then look at the specific tutorials for each module (for example, if you are interested in a specific hazard,
like Tropical Cyclones, or in learning to estimate the value of asset exposure, …).
Contributing
If you would like to participate in the development of CLIMADA, carefully read the Developer Guide. There you will find
how to set up an environment for developing new features for CLIMADA, as well as the workflow and rules to follow to make
sure you can implement a valuable contribution!
API Reference
The API reference presents the documentation of the internal modules, classes, methods and functions of CLIMADA.
Changelog
In the Changelog section, you can have a look at all the changes made between the different versions of CLIMADA.
External links
The top bar of this website also links to the documentation of CLIMADA Petals, the webpage of the Weather and Climate
Risks Group at ETH, and the official CLIMADA website.
Other Questions
If you cannot find your answer in the other guides provided here, you can open an issue for somebody to help you.
1.2.2 Introduction
CLIMADA implements a fully probabilistic risk assessment model. According to the IPCC [1], natural risks emerge
through the interplay of climate and weather-related hazards, the exposure of goods or people to this hazard, and the
specific vulnerability of exposed people, infrastructure and environment.
The unit of measurement for risk in CLIMADA is selected based on its relevance to the specific decision-making context
and is not limited to monetary units alone. For instance, wildfire risk may be quantified by the burned area (hazard) and
the exposure could be measured by the population density or the replacement value of homes. Consequently, risk could be
expressed in terms of the number of people affected for evacuation planning, or the cost of repairs for property insurance
purposes.
Risk has been defined by the International Organization for Standardization as the “effect of uncertainty on objectives” as
the potential for consequences when something of value is at stake and the outcome is uncertain, recognizing the diversity
of values. Risk can then be quantified as the combination of the probability of a consequence and its magnitude:

$$\text{risk} = \text{probability} \times \text{severity}$$

In the simplest case, $\times$ stands for a multiplication, but more generally, it represents a convolution of the respective
distributions of probability and severity. We approximate the severity as follows:

$$\text{severity} = F(\text{hazard intensity}, \text{exposure}, \text{vulnerability}) = \text{exposure} \cdot f_{\text{imp}}(\text{hazard intensity})$$

where $f_{\text{imp}}$ is the impact function which parametrizes to what extent an exposure will be affected by a specific hazard.
While the term ‘vulnerability function’ is broadly used in the modelers community, we adopt the broader term
‘impact function’. Impact functions can be vulnerability functions or structural damage functions, but could also
be productivity functions or warning levels. This definition also explicitly includes the option of opportunities
(i.e. negative damages).
Using this approach, CLIMADA constitutes a platform to analyse risks of different hazard types in a globally consistent
fashion at different resolution levels, at scales from multiple kilometres down to metres, tailored to the specific
requirements of the analysis.
References
[1] IPCC: Climate Change 2014: Impacts, Adaptation and Vulnerability. Part A: Global and Sectoral Aspects.
Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change,
edited by C. B. Field, V. R. Barros, D. J. Dokken, K. J. Mach, M. D. Mastrandrea, T. E. Bilir, M. Chatterjee, K. L. Ebi,
Y. O. Estrada, R. C. Genova, B. Girma, E. S. Kissel, A. N. Levy, S. MacCracken, P. R. Mastrandrea, and L. L. White,
Cambridge University Press, United Kingdom and New York, NY, USA., 2014.
1.2.3 Installation
The following sections will guide you through the installation of CLIMADA and its dependencies.
Attention
CLIMADA has a complicated set of dependencies that cannot be installed with pip alone. Please follow the
installation instructions carefully! We recommend using a conda-based Python environment manager such as Mamba or
Conda for creating a suitable software environment to execute CLIMADA.
All following instructions should work on any operating system (OS) that is supported by conda, including in particular:
Windows, macOS, and Linux.
Hint
If you need help with the vocabulary used on this page, refer to the Glossary.
• Open the “Terminal” app, copy-paste the two commands below, and hit enter:

curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
Python Versions
CLIMADA is primarily tested against a supported Python version, but it is allowed to run with others. If you follow
the installation instructions exactly, you will create an environment with the supported version. Depending on your
setup, you are free to choose another allowed version, but we recommend the supported one.
Hint
When mentioning the terms “terminal” or “command line” in the following, we are referring to the “Terminal” apps
on macOS or Linux and the “Miniforge Prompt” on Windows.
CLIMADA is divided into two packages, CLIMADA Core (climada_python) and CLIMADA Petals (climada_petals).
The Core contains all the modules necessary for probabilistic impact, averted damage, uncertainty and forecast
calculations. Data for hazard, exposures and impact functions can be obtained from the CLIMADA Data API. Hazard and
Exposures subclasses are included as demonstrators only.
Attention
CLIMADA Petals is not a standalone module and requires CLIMADA Core to be installed!
CLIMADA Petals contains all the modules for generating data (e.g., TC_Surge, WildFire, OpenStreetMap, …). New
modules are developed and tested here. Some data created with modules from Petals is available to download from the
Data API. This works with just CLIMADA Core installed. CLIMADA Petals can be used to generate additional data of
this type, or to have a look at the tutorials for all data types available from the API.
Both installation approaches mentioned above support CLIMADA Petals. If you are unsure whether you need Petals, you
can install the Core first and later add Petals in both approaches.
Simple Instructions
These instructions will install the most recent stable version of CLIMADA without cloning its repository.
1. Open the command line. Create a new Conda environment with CLIMADA by executing the environment-creation command (sketched below).
2. Activate the environment. You should now see (climada_env) appear at the beginning of your command prompt. This means the environment is activated.
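A minimal sketch of these two steps, assuming CLIMADA is installed from the conda-forge channel and the environment is called climada_env:

mamba create -n climada_env -c conda-forge climada
mamba activate climada_env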
3. Verify that everything is installed correctly by executing a single test:
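A sketch of such a single test call, assuming the test module climada.engine.test.test_impact:

python -m unittest climada.engine.test.test_impact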
Executing CLIMADA for the first time will take some time because it will generate a directory tree in your
home/user directory. After a while, some text should appear in your terminal. In the end, you should see an
“Ok”. If so, great! You are good to go.
4. Optional: Install CLIMADA Petals into the environment:
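A sketch of the Petals installation, assuming the package is also available on conda-forge under the name climada-petals:

mamba install -n climada_env -c conda-forge climada-petals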
Warning
If you followed the Simple Instructions before, make sure you either remove the environment (see the command sketched below)
before you continue, or use a different environment name for the following instructions (e.g. climada_dev
instead of climada_env).
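A sketch of the removal command, assuming the environment name climada_env:

mamba env remove -n climada_env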
1. If you are using a Linux OS, make sure you have git installed (Windows and macOS users are good to go once
Conda is installed). On Ubuntu and Debian, you may use APT:
apt update
apt install git
Both commands will probably require administrator rights, which can be enabled by prepending sudo.
2. Create a folder for your code. We will call it the workspace directory. To make sure that your user can
manipulate it without special privileges, use a subdirectory of your user/home directory. Do not use a directory that is
synchronized by cloud storage systems like OneDrive, iCloud or Polybox!
3. Open the command line and navigate to the workspace directory you created using cd. Replace <path/to/workspace>
with the path of the workspace directory:
cd <path/to/workspace>
4. Clone CLIMADA from its GitHub repository. Enter the directory and check out the branch of your choice. The
latest development version will be available under the branch develop.
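A sketch of this step, assuming the repository of the CLIMADA-project organisation on GitHub:

git clone https://github.com/CLIMADA-project/climada_python.git
cd climada_python
git checkout develop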
Hint
Use the wildcard .* at the end to allow a downgrade of the bugfix version of Python. This increases
compatibility when installing the requirements in the next step.
Note
You may choose any of the allowed Python versions from the list above.
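The environment-creation step (step 5) that this hint refers to typically looks like the following sketch; the Python version given here is only a placeholder for whichever version is currently supported:

mamba create -n climada_env "python=3.11.*"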
6. Use the default environment specs in env_climada.yml to install all dependencies. Then activate the environment:
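A sketch of this step, assuming the environment file lives in the repository's requirements folder:

mamba env update -n climada_env -f requirements/env_climada.yml
mamba activate climada_env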
7. Install the local CLIMADA source files as Python package using pip:
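A sketch of the pip command, matching the hint below (an editable install of the current directory):

python -m pip install -e ./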
Hint
Using a path ./ (referring to the path you are currently located at) will instruct pip to install the local files
instead of downloading the module from the internet. The -e (for “editable”) option further instructs pip to
link to the source files instead of copying them during installation. This means that any changes to the source
files will have immediate effects in your environment, and re-installing the module is never required.
Further note that this works only for the source files, not for the dependencies. If you change the latter, you will
need to update the environment as in step 6.
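Finally, you can verify the installation by running a single test; a sketch assuming the same test module as in the simple instructions:

python -m unittest climada.engine.test.test_impact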
Executing CLIMADA for the first time will take some time because it will generate a directory tree in your
home/user directory. If this test passes, great! You are good to go.
Advanced users, or reviewers, may also want to check out the features of a specific branch other than develop. To do so,
assuming you installed CLIMADA in editable mode (pip install with the -e flag), you just have to:

git fetch
git checkout <branch>
git pull
This will work most of the time, except if the target branch defines new dependencies that you don't already have in your
environment (as they will not get installed this way). In that case, you can install these dependencies yourself, or create a
new environment with the new requirements from the branch.
If you did not install CLIMADA in editable mode, you can also reinstall CLIMADA from its folder after switching the
branch (pip install [-e] ./ ).
Building the documentation and running the entire test suite of CLIMADA requires additional dependencies which are
not installed by default. They are also not needed for using CLIMADA. However, if you want to develop CLIMADA,
we strongly recommend you install them.
With the climada_env activated, enter the workspace directory and then the CLIMADA repository as above. Then,
add the dev extra specification to the pip install command (mind the quotation marks, and see also pip install
examples):
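A sketch of this command, assuming the dev extra is defined in the package metadata:

python -m pip install -e "./[dev]"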
The developer dependencies also include pre-commit, which is used to install and run automated, so-called pre-commit
hooks before a new commit. In order to use the hooks defined in .pre-commit-config.yaml, you need to install the
hooks first. With the climada_env activated, execute
pre-commit install
Please refer to the guide on pre-commit hooks for information on how to use this tool.
For executing the pre-defined test scripts in exactly the same way as they are executed by the automated CI pipeline, you
will need make to be installed. On macOS and on Linux it is pre-installed. On Windows, it can easily be installed with
Conda:
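A sketch of installing make into the environment via conda-forge:

mamba install -n climada_env -c conda-forge make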
Instructions for running the test scripts can be found in the Testing Guide.
If you are unsure whether you need Petals, see the notes above.
To install CLIMADA Petals, we assume you have already installed CLIMADA Core with the advanced instructions above.
1. Open the command line and navigate to the workspace directory.
2. Clone CLIMADA Petals from its repository. Enter the directory and check out the branch of your choice. The
latest development version will be available under the branch develop.
3. Update the Conda environment with the specifications from Petals and activate it:
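A sketch of steps 2 and 3, assuming the Petals repository name climada_petals and the same environment file name as the Core:

git clone https://github.com/CLIMADA-project/climada_petals.git
cd climada_petals
git checkout develop
mamba env update -n climada_env -f requirements/env_climada.yml
mamba activate climada_env
python -m pip install -e ./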
Code Editors
JupyterLab
2. Make sure that the climada_env is activated (see above) and then start JupyterLab:
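A sketch of installing JupyterLab into the environment and starting it (the install step is an assumption; the package name on conda-forge is jupyterlab):

mamba install -n climada_env -c conda-forge jupyterlab
mamba activate climada_env
jupyter lab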
Basic Setup
2. Install the Python and Jupyter extensions. In the left sidebar, select the “Extensions” symbol, enter “Python” in the
search bar and click Install next to the “Python” extension. Repeat this process for “Jupyter”.
3. Open a Jupyter Notebook or create a new one. On the top right, click on Select Kernel, select Python Environments…
and then choose the Python interpreter from the climada_env.
See the VSCode docs on Python and Jupyter Notebooks for further information.
Hint
Both of the following setup instructions work analogously for Core and Petals. The specific instructions for Petals are
shown in square brackets: []
Workspace Setup
Setting up a workspace for the CLIMADA source code is only available for advanced installations.
1. Open a new VSCode window. Below Start, click Open…, select the climada_python [climada_petals] repos-
itory folder in your workspace directory, and click on Open on the bottom right.
2. Click File > Save Workspace As… and store the workspace settings file next to (not in!) the climada_python
[climada_petals] folder. This will enable you to load the workspace and all its specific settings in one go.
3. Open the Command Palette by clicking View > Command Palette or by using the shortcut keys Ctrl+Shift+P
(Windows, Linux) / Cmd+Shift+P (macOS). Start typing “Python: Select Interpreter” and select it from the drop-
down menu. If prompted, choose the option to set the interpreter for the workspace, not just the current folder.
Then, choose the Python interpreter from the climada_env.
For further information, refer to the VSCode docs on Workspaces.
After you set up a workspace, you might want to configure the test explorer for easily running the CLIMADA test suite
within VSCode.
Note
1. In the left sidebar, select the “Testing” symbol, and click on Configure Python Tests.
2. Select “pytest” as test framework and then select climada [climada_petals] as the directory containing the
test files.
3. Select “Testing” in the Activity Bar on the left or through View > Testing. The “Test Explorer” in the left sidebar
will display the tree structure of modules, files, test classes and individual tests. You can run individual tests or test
subtrees by clicking the Play buttons next to them.
4. By default, the test explorer will show test output for failed tests when you click on them. To view the logs for any
test, click on View > Output, and select “Python Test Log” from the dropdown menu in the view that just opened.
If there are errors during test discovery, you can see what’s wrong in the “Python” output.
For further information, see the VSCode docs on Python Testing.
Spyder
Installing Spyder into the existing Conda environment for CLIMADA might fail depending on the exact versions of
dependencies installed. Therefore, we recommend installing Spyder in a separate environment, and then connecting it to
a kernel in the original climada_env.
1. Follow the Spyder installation instructions. You can follow the “Conda” installation instructions. Keep in mind you
are using mamba, though!
2. Check the version of the Spyder kernel in the new environment:
- spyder-kernels=X.Y.Z=<hash>
Copy the part spyder-kernels=X.Y.Z (until the second =) and paste it into the following command to install
the same kernel version into the climada_env:
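A sketch of this command, assuming the package is available on conda-forge (replace X.Y.Z with the version you copied):

mamba install -n climada_env -c conda-forge spyder-kernels=X.Y.Z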
3. Obtain the path to the Python interpreter of your climada_env. Execute the following commands:
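A sketch of these commands:

mamba activate climada_env
python -c "import sys; print(sys.executable)"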
5. Set the Python interpreter used by Spyder to the one of climada_env. Select Preferences > Python Interpreter >
Use the following interpreter and paste the interpreter path you copied from the climada_env.
FAQs
Answers to frequently asked questions.
Updating CLIMADA
We recommend keeping CLIMADA up-to-date. To update, follow the instructions based on your installation type:
• Simple Instructions: Update CLIMADA using mamba:
• Advanced Instructions: Move into your local CLIMADA repository and pull the latest version of your respective
branch:
cd <path/to/workspace>/climada_python
git pull
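For the simple-instructions case in the first bullet, the update command is likely the following (an assumption, using the conda-forge package):

mamba update -n climada_env -c conda-forge climada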
You might use CLIMADA in code that requires more packages than the ones readily available in the CLIMADA Conda
environment. If so, prefer installing these packages via Conda, and only rely on pip if that fails. The default channels
of Conda sometimes contain outdated versions. Therefore, use the conda-forge channel:
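A sketch, with <package> as a placeholder for the package you need:

mamba install -n climada_env -c conda-forge <package>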
If you followed the installation instructions, you already executed a single unit test. This test, however, will not cover all
issues that could occur within your installation setup. If you are unsure if everything works as intended, try running all
unit tests. This is only available for advanced setups! Move into the CLIMADA repository, activate the environment and
then execute the tests:
cd <path/to/workspace>/climada_python
mamba activate climada_env
python -m unittest discover -s climada -p "test*.py"
Error: ModuleNotFoundError
Something is wrong with the environment you are using. After each of the following steps, check if the problem is solved,
and only continue if it is not:
1. Make sure you are working in the CLIMADA environment (activate it with mamba activate climada_env).
2. If the error persists, remove the environment and repeat the installation from scratch:

mamba deactivate
mamba env remove -n climada_env
Logging Configuration
Climada makes use of the standard logging package. By default, the “climada”-Logger is detached from logging.
root, logging to stdout with the level set to WARNING.
If you prefer another logging configuration, e.g., for using Climada embedded in another application, you can opt out of
the default pre-configuration by setting the config value for logging.climada_style to false in the configuration
file climada.conf.
Changing the logging level can be done in multiple ways:
• Adjust the configuration file climada.conf by setting the value of the global.log_level property. This
only has an effect if logging.climada_style is set to true, though.
• Set a global logging level in your Python script:
import logging
logging.getLogger('climada').setLevel(logging.ERROR) # to silence all warnings
We experienced several issues with the default conda package manager lately. This is likely due to the large dependency
set of CLIMADA, which makes solving the environment a tedious task. We therefore switched to the more performant
mamba and recommend using it.
Caution
In theory, you could also use an Anaconda or Miniconda distribution and replace every mamba command in this guide
with conda. In practice, however, conda is often unable to solve an environment that mamba solves without issues
in few seconds.
Conda might report a permission error on macOS Mojave. Carefully follow these instructions: https://github.com/conda/conda/issues/8440#issuecomment-481167572
This may happen when a demo file from CLIMADA was not updated after the change in the impact function naming
pattern from if_ to impf_ when CLIMADA v2.2.0 was released.
print("Addition: 2 + 2 =", 2 + 2)
print("Substraction: 50 - 5*6 =", 50 - 5 * 6)
print("Use of parenthesis: (50 - 5*6) / 4 =", (50 - 5 * 6) / 4)
print("Classic division returns a float: 17 / 3 =", 17 / 3)
print("Floor division discards the fractional part: 17 // 3 =", 17 // 3)
print("The % operator returns the remainder of the division: 17 % 3 =", 17 % 3)
print("Result * divisor + remainder: 5 * 3 + 2 =", 5 * 3 + 2)
print("5 squared: 5 ** 2 =", 5**2)
print("2 to the power of 7: 2 ** 7 =", 2**7)
The integer numbers (e.g. 2, 4, 20) have type int, the ones with a fractional part (e.g. 5.0, 1.6) have type float. Operators
with mixed type operands convert the integer operand to floating point:
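A short sketch illustrating this, in the same style as the examples above:

print("int * float gives a float: 3 * 3.75 =", 3 * 3.75)
print("int + float gives a float: 4 + 1.6 =", 4 + 1.6)
print(type(4 + 1.6))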
Strings can be enclosed in single quotes (’…’) or double quotes (”…”) with the same result. \ can be used to escape quotes.
If you don’t want characters prefaced by \ to be interpreted as special characters, you can use raw strings by adding an r
before the first quote.
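A short sketch illustrating these points (the strings are purely illustrative):

print('doesn\'t')         # escape a single quote with a backslash
print("doesn't")          # or simply use double quotes
print(r"new\name")        # raw string: \n is kept as two characters
print("new\name")         # regular string: \n is interpreted as a newline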
Strings can be indexed (subscripted), with the first character having index 0.
Indices may also be negative numbers, to start counting from the right. Note that since -0 is the same as 0, negative indices
start from -1.
word = "Python"
print("word = ", word)
print("Character in position 0: word[0] =", word[0])
print("Character in position 5: word[5] =", word[5])
print("Last character: word[-1] =", word[-1])
print("Second-last character: word[-2] =", word[-2])
print("word[-6] =", word[-6])
In addition to indexing, slicing is also supported. While indexing is used to obtain individual characters, slicing allows
you to obtain a substring:
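A short sketch, reusing the word variable defined above:

print("word[0:2] =", word[0:2])   # characters from position 0 (included) to 2 (excluded)
print("word[2:5] =", word[2:5])   # characters from position 2 (included) to 5 (excluded)
print("word[:2] =", word[:2])     # an omitted first index defaults to zero
print("word[4:] =", word[4:])     # an omitted second index defaults to the length of the string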
Lists
Lists can be written as a list of comma-separated values (items) between square brackets. Lists might contain items of
different types, but usually the items all have the same type.
Like strings (and all other built-in sequence types), lists can be indexed and sliced. Unlike strings, which are immutable,
lists are a mutable type, i.e. it is possible to change their content:
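A short sketch illustrating both indexing/slicing and mutability (the variable names are illustrative):

squares_list = [1, 4, 9, 16, 25]
print("squares_list[0] =", squares_list[0])      # indexing returns the item
print("squares_list[-3:] =", squares_list[-3:])  # slicing returns a new list
cubes = [1, 8, 27, 65, 125]  # something's wrong here: 4**3 is 64, not 65
cubes[3] = 64                # lists are mutable, so we can replace the wrong value
print("cubes =", cubes)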
List comprehensions provide a concise way to create lists. Common applications are to make new lists where each element
is the result of some operations applied to each member of another sequence or iterable, or to create a subsequence of
those elements that satisfy a certain condition.
squares = []
for x in range(10):
squares.append(x**2)
squares
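The same list can be built more concisely with a list comprehension; a short sketch:

squares = [x**2 for x in range(10)]
squares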
# lambda functions: functions that are not bound to a name, e.g. lambda x: x**2
# map applies a function to all the items in an input list: map(function_to_apply, list_of_inputs)
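A short sketch illustrating the two comments above (the variable names are illustrative):

doubled = list(map(lambda x: x * 2, [1, 2, 3, 4]))
print("doubled =", doubled)
squares_map = list(map(lambda x: x**2, range(10)))
print("squares_map =", squares_map)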
Tuples
A tuple consists of a number of values separated by commas, for instance:
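A short sketch (the values are illustrative):

t = (12345, 54321, "hello!")
print("t[0] =", t[0])
print("t =", t)
x, y, z = t  # tuple unpacking
print("x, y, z =", x, y, z)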
Tuples are immutable, and usually contain a heterogeneous sequence of elements that are accessed via unpacking or
indexing. Lists are mutable, and their elements are usually homogeneous and are accessed by iterating over the list.
Sets
A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating
duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric
difference.
Curly braces or the set() function can be used to create sets.
"crabgrass" in basket
a | b # letters in a or b or both
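The two fragments above rely on variables that are not defined in this excerpt; a self-contained sketch with assumed example values:

basket = {"apple", "orange", "apple", "pear", "orange", "banana"}
print(basket)                  # duplicates have been removed
print("crabgrass" in basket)   # fast membership testing -> False

a = set("abracadabra")
b = set("alacazam")
print(a - b)   # letters in a but not in b
print(a | b)   # letters in a or b or both
print(a & b)   # letters in both a and b
print(a ^ b)   # letters in a or b but not both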
Dictionaries
Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be any immutable
type; strings and numbers can always be keys.
It is best to think of a dictionary as an unordered set of key: value pairs, with the requirement that the keys are unique
(within one dictionary). A pair of braces creates an empty dictionary: {}. Placing a comma-separated list of key:value
pairs within the braces adds initial key:value pairs to the dictionary; this is also the way dictionaries are written on output.
tel["jack"]
del tel["sape"]
tel["irv"] = 4127
tel
list(tel.keys())
sorted(tel.keys())
"guido" in tel
Functions
We can create a function that writes the Fibonacci series to an arbitrary boundary:
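A sketch of such a definition, following the classic example; the calls further below rely on a function named fib:

def fib(n):
    """Print the Fibonacci series up to n."""
    a, b = 0, 1
    while a < n:
        print(a, end=" ")
        a, b = b, a + b
    print()


fib(100)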
The value of the function name has a type that is recognized by the interpreter as a user-defined function. This value can
be assigned to another name which can then also be used as a function. This serves as a general renaming mechanism:
print(fib)
print(type(fib)) # function type
f = fib
f(100)
Be careful when using mutable types as inputs in functions, as they might be modified:
def dummy(x):
    x += x
xx = 5
print("xx before function call: ", xx)
dummy(xx)
print("xx after function call: ", xx)
yy = [5]
print("yy before function call: ", yy)
dummy(yy)
print("yy after function call: ", yy)
The most useful form is to specify a default value for one or more arguments. This creates a function that can be called
with fewer arguments than it is defined to allow. For example:
Functions can also be called using keyword arguments of the form kwarg=value:
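A short sketch illustrating both default values and keyword arguments (the function greet is purely illustrative):

def greet(name, greeting="Hello", punctuation="!"):
    print(greeting + ", " + name + punctuation)


greet("Ada")                    # uses both default values
greet("Ada", "Good morning")    # overrides one default positionally
greet("Ada", punctuation="?")   # keyword argument of the form kwarg=value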
Default None values: None default values can be used to handle optional parameters.
def test(x=None):
    if x is None:
        print("no x here")
    else:
        print(x)


test()
Objects
Example class definition:
When a class defines an __init__() method, class instantiation automatically invokes __init__() for the newly created class
instance:
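A sketch of such a class definition, chosen to be consistent with the instance code below (which accesses the name and tricks attributes):

class Dog:
    """A simple example class."""

    def __init__(self, name):
        self.name = name    # instance variable unique to each instance
        self.tricks = []    # list of tricks, unique to each instance

    def add_trick(self, trick):
        self.tricks.append(trick)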
d = Dog("Fido")  # creates a new instance of the class and assigns this object to the local variable d
d.name
e = Dog("Buddy")  # creates a new instance of the class and assigns this object to the local variable e
d.tricks  # unique to d
e.tricks  # unique to e
Inheritance:
A derived class can override any methods of its base class or classes, and a method can call the method of a base class
with the same name. Example:
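A short sketch (the class names are illustrative):

class Animal:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is an animal"


class Cat(Animal):
    def describe(self):
        # override the base-class method and extend its result via super()
        return super().describe() + " of the feline kind"


print(Cat("Tom").describe())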
Python supports a form of multiple inheritance as well. A class definition with multiple base classes looks like this:
class DerivedClassName(Base1, Base2, Base3):
“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is
a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated
as a non-public part of the API (whether it is a function, a method or a data member).
Example of internal class use of the private method __update. The user is not meant to use __update, but update. However,
__update can be used internally, e.g. to be called from the __init__ method:
class Mapping:
def __init__(self, iterable):
self.items_list = []
self.__update(iterable)
class MappingSubclass(Mapping):
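    def update(self, keys, values):
        # sketch of the overriding method, following the standard Python tutorial
        # example this passage refers to; the base class Mapping would also define
        # update() and the private alias __update = update, which its __init__
        # relies on (omitted in this excerpt)
        for item in zip(keys, values):
            self.items_list.append(item)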
Chapter 2: User Guide
This user guide contains all the detailed tutorials about the different parts of CLIMADA. If you are a new user, we advise
you to have a look at the 10 minutes CLIMADA tutorial, which introduces the basics briefly, or the full Overview, which goes
more in depth.
You can then go on to more specific tutorials about Hazard, Exposures or Impact, or advanced usage such as Uncertainty
Quantification.
Hazard objects
First, we read a demo hazard file that includes information about several tropical cyclone events.
import warnings

from climada.hazard import Hazard
from climada.util.constants import HAZ_DEMO_H5  # demo file shipped with CLIMADA

warnings.filterwarnings("ignore")
haz = Hazard.from_hdf5(HAZ_DEMO_H5)
We can infer some information from the Hazard object. The central piece of the hazard object is a sparse matrix at
haz.intensity that contains the hazard intensity values for each event (axis 0) and each location (axis 1).
print(
    f"The hazard object contains {haz.intensity.shape[0]} events.\n"
    f"The maximal intensity contained in the Hazard object is {haz.intensity.max():.2f} {haz.units}."
)
The probabilistic event set and its single events can be plotted. For instance, below we plot maximal intensity per grid
point over the whole event set.
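A sketch of such a plot call; in Hazard.plot_intensity, event=0 is assumed here to stand for the maximum intensity over all events:

haz.plot_intensity(event=0);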
Exposure objects
Now, we read a demo exposure file containing the location and value of a number of exposed assets in Florida.

from climada.entity import Exposures
from climada.util.constants import EXP_DEMO_H5  # demo file shipped with CLIMADA

exp = Exposures.from_hdf5(EXP_DEMO_H5)
We can print some basic information about the exposure object. The central information of the exposure object is con-
tained in a geopandas.GeoDataFrame at exp.gdf.
print(
    f"In the exposure object, a total amount of {exp.value_unit} {exp.gdf.value.sum() / 1_000_000_000:.2f}B "
    f"is distributed among {len(exp.gdf)} points."
)
In the exposure object, a total amount of USD 657.05B is distributed among 50 points.
exp.plot_basemap(figsize=(6, 6));
Impact Functions
To model the impact to the exposure that is caused by the hazard, CLIMADA makes use of an impact function. This
function relates both the percentage of assets affected (PAA, red line below) and the mean damage degree (MDD, blue line
below) to the hazard intensity. The multiplication of PAA and MDD results in the mean damage ratio (MDR, black
dashed line below), which relates the hazard intensity to corresponding relative impact values. Finally, a multiplication with
the exposure values results in the total impact.
Below, we read and plot a standard impact function for tropical cyclones.
from climada.entity import ImpactFuncSet, ImpfTropCyclone

impf_tc = ImpfTropCyclone.from_emanuel_usa()
impf_set = ImpactFuncSet([impf_tc])
impf_set.plot();
Impact calculation
Having defined hazard, exposure, and impact function, we can finally perform the impact calculation.
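A sketch of the calculation, using the ImpactCalc class referenced elsewhere in this guide, with the objects defined above:

from climada.engine import ImpactCalc

imp = ImpactCalc(exp, impf_set, haz).impact(save_mat=True)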
The Impact object contains the results of the impact calculation (including event- and location-wise impact information
when save_mat=True).
print(
    f"The total expected annual impact over all exposure points is {imp.unit} {imp.aai_agg / 1_000_000:.2f} M."
)
The total expected annual impact over all exposure points is USD 288.90 M.
The largest estimated single-event impact is USD 20.96 B.
The largest expected annual impact for a single location is USD 9.58 M.
Several visualizations of impact objects are available. For instance, we can plot the expected annual impact per location
on a map.
imp.plot_basemap_eai_exposure(figsize=(6, 6))
To analyze the effect of the adaptation measure, we can, for instance, plot the impact exceedance frequency curves that
describe, according to the given data, how frequent different impacts thresholds are expected to be exceeded.
ax = imp.calc_freq_curve().plot(label="Without measure")
new_imp.calc_freq_curve().plot(axis=ax, label="With measure")
ax.legend()
<matplotlib.legend.Legend at 0x17f5ef710>
This tutorial
This tutorial is for people new to CLIMADA who want to get a high level understanding of the model and work through
an example risk analysis. It will list the current features of the model, and go through a complete CLIMADA analysis to
give an idea of how the model works. Other tutorials go into more detail about different model components and individual
hazards.
CLIMADA classes
This is a full directory of tutorials for CLIMADA’s classes to use as a reference. You don’t need to read all this to do this
tutorial, but it may be useful to refer back to.
Core (climada_python):
• Hazard: a class that stores sets of geographic hazard footprints, (e.g. for wind speed, water depth and fraction,
drought index), and metadata including event frequency. Several predefined extensions to create particular hazards
from particular datasets and models are included with CLIMADA:
– Tropical cyclone wind: global hazard sets for tropical cyclone events, constructing statistical wind fields from
storm tracks. Subclasses include methods and data to calculate historical wind footprints, create forecast
ensembles from ECMWF tracks, and create climatological event sets for different climate scenarios.
– European windstorms: includes methods to read and plot footprints from the Copernicus WISC dataset and
for DWD and ICON forecasts.
• Entity: this is a container that groups CLIMADA's socio-economic models. It is where the Exposures and Impact
Functions are stored, which can then be combined with a hazard for a risk analysis (using the Engine’s Impact class).
It is also where Discount Rates and Measure Sets are stored, which are used in adaptation cost-benefit analyses
(using the Engine’s CostBenefit class):
– Exposures: geolocated exposures. Each exposure is associated with a value (which can be a dollar value,
population, crop yield, etc), information to associate it with impact functions for the relevant hazard(s) (in the
Entity's ImpactFuncSet), a geometry, and other optional properties such as deductibles and cover. Exposures
can be loaded from a file, specified by the user, or created from regional economic models accessible within
CLIMADA, for example:
∗ LitPop: regional economic model using nightlight and population maps together with several economic
indicators
∗ Polygons_lines: use this module if you have your exposure in the form of shapes/polygons or in the
form of lines.
– ImpactFuncSet: functions to describe the impacts that hazards have on exposures, expressed in terms of e.g.
the % dollar value of a building lost as a function of water depth, or the mortality rate for over-70s as a function
of temperature. CLIMADA provides some common impact functions, or they can be user-specified. The
following is an incomplete list:
∗ ImpactFunc: a basic adjustable impact function, specified by the user
∗ IFTropCyclone: impact functions for tropical cyclone winds
∗ IFRiverFlood: impact functions for river floods
∗ IFStormEurope: impact functions for European windstorms
– DiscRates: discount rates per year
– MeasureSet: a collection of Measure objects that together describe any adaptation measures being modelled.
Adaptation measures are described by their cost, and how they modify exposure, hazard, and impact functions
(and have a method to do these things). Measures also include risk transfer options.
• Engine: the CLIMADA Engine contains the Impact and CostBenefit classes, which are where the main model
calculations are done, combining Hazard and Entity objects.
– Impact: a class that stores CLIMADA’s modelled impacts and the methods to calculate them from Exposure,
Impact Function and Hazard classes. The calculations include average annual impact, expected annual impact
by exposure item, total impact by event, and (optionally) the impact of each event on each exposure point.
Includes statistical and plotting routines for common analysis products.
– Impact_data: The core functionality of the module is to read disaster impact data as downloaded from the In-
ternational Disaster Database EM-DAT (www.emdat.be) and produce a CLIMADA Impact()-instance from
it. The purpose is to make impact data easily available for comparison with simulated impact inside CLI-
MADA, e.g. for calibration purposes.
– CostBenefit: a class to appraise adaptation options. It uses an Entity’s MeasureSet to calculate new Impacts
based on their adjustments to hazard, exposure, and impact functions, and returns statistics and plotting rou-
tines to express cost-benefit comparisons.
– Unsequa: a module for uncertainty and sensitivity analysis.
– Unsequa_helper: The InputVar class provides a few helper methods to generate generic uncertainty input
variables for exposures, impact function sets, hazards, and entities (including measures cost and disc rates).
This tutorial complements the general tutorial on the uncertainty and sensitivity analysis module unsequa.
– Forecast: This class deals with weather forecasts and uses CLIMADA ImpactCalc.impact() to forecast im-
pacts of weather events on society. It mainly does one thing: It contains all plotting and other functionality
that are specific for weather forecasts, impact forecasts and warnings.
climada_petals:
• Hazard:
– Storm surge: Tropical cyclone surge from linear wind-surge relationship and a bathtub model.
– River flooding: global water depth hazard for flood, including methods to work with ISIMIP simulations.
– Crop modelling: combines ISIMIP crop simulations and UN Food and Agriculture Organization data. The
module uses crop production as exposure, with hydrometeorological ‘hazard’ increasing or decreasing pro-
duction.
– Wildfire (global): This class is used to model the wildfire hazard using the historical data available and
creating synthetic fires which are summarized into event years to establish a comprehensive probabilistic risk
assessment.
– Landslide: This class is able to handle two different types of landslide source files (in one case, already the
finished product of some model output, in the other case just a historic data collection).
– TCForecast: This class extends the TCTracks class with methods to download operational ECMWF ensemble
tropical storm track forecasts, read the BUFR files they’re contained in and produce a TCTracks object that
can be used to generate TropCyclone hazard footprints.
– Emulator: Given a database of hazard events, this module climada.hazard.emulator provides tools to subsample
events (or time series of events) from that event database.
– Drought (global): tutorial under development
• Entity:
– Exposures:
∗ BlackMarble: regional economic model from nightlight intensities and economic indicators (GDP, in-
come group). Largely succeeded by LitPop.
∗ OpenStreetMap: CLIMADA provides some ways to make use of the entire OpenStreetMap data world
and to use those data within the risk modelling chain of CLIMADA as exposures.
• Engine:
– SupplyChain: This class allows assessing indirect impacts via Input-Output modelling.
This list will be updated periodically along with new CLIMADA releases. To see the latest, development version of all
tutorials, see the tutorials page on the CLIMADA GitHub.
2.2.4 Hazard
Hazards are characterized by their frequency of occurrence and the geographical distribution of their intensity. The
Hazard class collects events of the same hazard type (e.g. tropical cyclone, flood, drought, …) with intensity values over
the same geographic centroids. They might be historical events or synthetic.
See the Hazard tutorial to learn about the Hazard class in more detail, and the CLIMADA features section of this document
to explore tutorials for different hazards, including tropical cyclones, as used here.
Tropical cyclones in CLIMADA and the TropCyclone class work like any hazard, storing each event's wind speeds
at the geographic centroids specified for the class. Pre-calculated hazards can be loaded from files (see the full Hazard
tutorial), but they can also be modelled from a storm track using the TCTracks class, based on a storm's parameters at
each time step. This is how we'll construct the hazards for our example.
So before we create the hazard, we will create our storm tracks and define the geographic centroids for the locations we
want to calculate hazard at.
Storm tracks
Storm tracks are created and stored in a separate class, TCTracks. We use its method from_ibtracs_netcdf to create
the tracks from the IBTrACS storm tracks archive. In the next block we will download the full dataset, which might take
a little time. However, plotting the whole dataset takes too long (see the second block), so we choose a shorter time range
here to show the function. See the full TropCyclone tutorial for more detail and troubleshooting.
import numpy as np
from climada.hazard import TCTracks
import warnings  # To hide the warnings

warnings.filterwarnings("ignore")

tracks = TCTracks.from_ibtracs_netcdf(
    provider="usa", basin="NA"
)  # Here we download the full dataset for the analysis
# afterwards (e.g. return period), but you can also use "year_range" to adjust the range of the dataset to be downloaded.
# While doing that, you need to make sure that the year 2017 is included if you want to run the blocks with the code
# subsetting a specific tropical cyclone, which happened in 2017. (Of course, you can also change the subsetting code.)
This will load all historical tracks in the North Atlantic into the tracks object (since we set basin='NA'). The
TCTracks.plot method will plot the downloaded tracks, though there are too many for the plot to be very useful:
# plotting tracks can be very time consuming, depending on the number of tracks.
# So we choose only a few here, by limiting the time range to one year
tracks_2017 = TCTracks.from_ibtracs_netcdf(provider="usa", basin="NA", year_range=(2017, 2017))
tracks_2017.plot();  # This may take a very long time
It’s also worth adding additional time steps to the tracks (though this can be memory intensive!). Most tracks are reported
at 3-hourly intervals (plus a frame at landfall). Event footprints are calculated as the maximum wind from any time step.
For a fast-moving storm these combined three-hourly footprints give quite a rough event footprint, and it’s worth adding
extra frames to smooth the footprint artificially (try running this notebook with and without this interpolation to see the
effect):
tracks.equal_timestep(time_step_h=0.5)
Now, irresponsibly for a risk analysis, we’re only going to use these historical events: they’re enough to demonstrate
CLIMADA in action. A proper risk analysis would expand it to include enough events for a statistically robust climatology.
See the full TropCyclone tutorial for CLIMADA’s stochastic event generation.
Centroids
A hazard’s centroids can be any set of locations where we want the hazard to be evaluated. This could be the same as
the locations of your exposure, though commonly it is on a regular lat-lon grid (with hazard being imputed to exposure
between grid points).
Here we’ll set the centroids as a 0.1 degree grid covering Puerto Rico. Centroids are defined by a Centroids class,
which has the from_pnt_bounds method for generating regular grids and a plot method to inspect the centroids.
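A sketch of such a grid, assuming approximate bounds for Puerto Rico (the exact numbers are placeholders):

from climada.hazard import Centroids

min_lat, max_lat, min_lon, max_lon = 17.5, 19.0, -68.0, -65.0
cent = Centroids.from_pnt_bounds((min_lon, min_lat, max_lon, max_lat), res=0.1)
cent.plot();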
Hazard footprint
Now we’re ready to create our hazard object. This will be a TropCyclone class, which inherits from the Hazard class,
and has the from_tracks constructor method to create a hazard from a TCTracks object at given centroids.
In 2017 Hurricane Maria devastated Puerto Rico. In the IBTrACS event set, it has ID 2017260N12310 (we use this
rather than the name, as IBTrACS contains three North Atlantic storms called Maria). We can plot the track:
tracks.subset({"sid": "2017260N12310"}).plot();  # This is how we subset a TCTracks object
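The hazard construction itself is a separate step; a sketch, assuming the tracks and centroids objects from above:

from climada.hazard import TropCyclone

haz = TropCyclone.from_tracks(tracks, centroids=cent)
haz.check()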
haz.plot_intensity(event="2017260N12310");
A Hazard object also lets us plot the hazard at different return periods. The IBTrACS archive produces footprints from
1980 onwards (CLIMADA discarded earlier events) and so the historical period is short. Therefore these plots don't make
sense as 'real' return periods, but we're being irresponsible and demonstrating the functionality anyway.
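A sketch of such a return-period plot (the chosen periods are placeholders):

haz.plot_rp_intensity(return_periods=(5, 10, 20, 40));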
See the TropCyclone tutorial for full details of the TropCyclone hazard class.
We can also recalculate event sets to reflect the effects of climate change. The apply_climate_scenario_knu method
applies changes in intensity and frequency projected due to climate change, as described in ‘Global projections of in-
tense tropical cyclone activity for the late twenty-first century from dynamical downscaling of CMIP5/RCP4.5 scenarios’
(Knutson et al. 2015). See the tutorial for details.
Exercise: Extend this notebook’s analysis to examine the effects of climate change in Puerto Rico.
You’ll need to extend the historical event set with stochastic tracks to create a robust statistical storm
climatology - the TCTracks class has the functionality to do this. Then you can apply the ap-
ply_climate_scenario_knu method to the generated hazard object to create a second hazard clima-
tology representing storm activity under climate change. See how the results change using the different
hazard sets.
Next we’ll work on exposure and vulnerability, part of the Entity class.
2.2.5 Entity
The entity class is a container class that stores exposures and impact functions (vulnerability curves) needed for a risk
calculation, and the discount rates and adaptation measures for an adaptation cost-benefit analysis.
As with Hazard objects, Entities can be read from files or created through code. The Excel template can be found in
climada_python/climada/data/system/entity_template.xlsx.
In this tutorial we will create an Exposure object using the LitPop economic exposure module, and load a pre-defined
wind damage function.
Exposures
The Entity’s exposures attribute contains geolocalized values of anything exposed to the hazard, whether monetary
values of assets or number of human lives, for example. It is of type Exposures.
See the Exposures tutorial for more detail on the structure of the class, and how to create and import exposures. The
LitPop tutorial explains how CLIMADA models economic exposures using night-time light and economic data, and is
what we’ll use here. To combine your exposure with OpenStreetMap’s data see the OSM tutorial.
LitPop is a module that allows CLIMADA to estimate exposed populations and economic assets at any point on the planet
without additional information, and in a globally consistent way. Before we try it out with the next code block, we’ll need
to download a data set and put it into the right folder:
1. Go to the download page on Socioeconomic Data and Applications Center (sedac).
2. You’ll be asked to log in or register. Please register if you don’t have an account.
3. Wait until several drop-down menus show up.
4. Choose in the drop-down menus: Temporal: single year, FileFormat: GeoTiff, Resolution: 30 seconds. Click
“2020” and then “create download”.
5. Copy the file “gpw_v4_population_count_rev11_2020_30_sec.tif” into the folder “~/climada/data”. (Or you can
run the block once to find the right path in the error message)
Now we can create an economic Exposure dataset for Puerto Rico.
from climada.entity import LitPop

exp_litpop = LitPop.from_countries(
    "Puerto Rico", res_arcsec=120
)  # We'll go lower resolution than default to keep it simple
LitPop’s default exposure is measured in US Dollars, with a reference year depending on the most recent data available.
Once we’ve created our impact function we will come back to this Exposure and give it the parameters needed to connect
exposure to impacts.
Impact functions
Impact functions describe a relationship between a hazard’s intensity and your exposure in terms of a percentage loss.
The impact is described through two terms. The Mean Degree of Damage (MDD) gives the percentage of an exposed
asset’s numerical value that’s affected as a function of intensity, such as the damage to a building from wind in terms of
its total worth. Then the Proportion of Assets Affected (PAA) gives the fraction of exposures that are affected, such as
the mortality rate in a population from a heatwave. These multiply to give the Mean Damage Ratio (MDR), the average
impact to an asset.
Impact functions are stored as the Entity’s impact_funcs attribute, in an instance of the ImpactFuncSet class which
groups one or more ImpactFunc objects. They can be specified manually, read from a file, or you can use CLIMADA's
pre-defined impact functions. We'll use a pre-defined function for tropical storm wind damage stored in the
ImpfTropCyclone class.
See the Impact Functions tutorial for a full guide to the class, including how data are stored and reading and writing to
files.
We initialise an impact function with the ImpfTropCyclone class, and use its from_emanuel_usa method to load the
Emanuel (2011) impact function. (The class also contains regional impact functions for the full globe, but we won't use
these for now.) The class's plot method visualises the function, which we can see is expressed just through the Mean
Degree of Damage, with all assets affected.
from climada.entity import ImpfTropCyclone

imp_fun = ImpfTropCyclone.from_emanuel_usa()
imp_fun.plot();
The plot title also includes information about the function's ID, which was also set by the from_emanuel_usa class
method. The hazard type is "TC" and the function ID is 1; a study might use several impact functions - for different
hazards, or for different types of exposure.
We then create an ImpactFuncSet object to store the impact function. This is a container class, and groups a study’s
impact functions together. Studies will often have several impact functions, due to multiple hazards, multiple types of
exposure that are impacted differently, or different adaptation scenarios. We add it to our Entity object.
imp_fun_set = ImpactFuncSet([imp_fun])
Finally, we can update our LitPop exposure to point to the TC 1 impact function. This is done by adding a column to the
exposure:
exp_litpop.gdf["impf_TC"] = 1
Here the impf_TC column tells the CLIMADA engine that for a tropical cyclone (TC) hazard, it should use the first
impact function defined for TCs. We use the same impact function for all of our exposure.
This is now everything we need for a risk analysis, but while we’re working on the Entity class, we can define the adaptation
measures and discount rates needed for an adaptation analysis. If you’re not interested in the cost-benefit analysis, you
can skip ahead to the Impact section.
Adaptation measures
CLIMADA’s adaptation measures describe possible interventions that would change event hazards and impacts, and the
cost of these interventions.
They are stored as Measure objects within a MeasureSet container class (similarly to ImpactFuncSet containing
several ImpactFuncs), and are assigned to the measures attribute of the Entity.
See the Adaptation Measures tutorial on how to create, read and write measures. CLIMADA doesn’t yet have pre-defined
adaptation measures, mostly because they are hard to standardise.
The best way to understand an adaptation measure is by an example. Here’s a possible measure for the creation of coastal
mangroves (ignore the exact numbers, they are just for illustration):
import numpy as np
from climada.entity import Measure, MeasureSet

meas_mangrove = Measure(
name="Mangrove",
haz_type="TC",
color_rgb=np.array([0.2, 0.2, 0.7]),
cost=500000000,
mdd_impact=(1, 0),
paa_impact=(1, -0.15),
hazard_inten_imp=(1, -10),
)
meas_set = MeasureSet(measure_list=[meas_mangrove])
meas_set.check()
We can apply these measures to our existing Exposure, Hazard and Impact functions, and plot the old and new impact
functions:
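A sketch of applying the measure, assuming the exposure, impact function set and hazard objects defined earlier in this tutorial:

new_exp, new_impfs, new_haz = meas_mangrove.apply(exp_litpop, imp_fun_set, haz)
# plot the original and the modified impact functions
imp_fun_set.plot()
new_impfs.plot();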
Let’s define a second measure. Again, the numbers here are made up, for illustration only.
meas_buildings = Measure(
name="Building code",
haz_type="TC",
color_rgb=np.array([0.2, 0.7, 0.5]),
cost=100000000,
hazard_freq_cutoff=0.1,
)
meas_set.append(meas_buildings)
meas_set.check()
This measure describes an upgrade to building codes to withstand 10-year events. The measure costs 100,000,000 USD
and, through hazard_freq_cutoff = 0.1, removes events with calculated impacts below the 10-year return period.
The Adaptation Measures tutorial describes other parameters for describing adaptation measures, including risk transfer,
assigning measures to subsets of exposure, and reassigning impact functions.
We can compare the 5- and 20-year return period hazard (remember: not a real return period due to the small event set!)
with the adjusted hazard once low-impact events are removed. A sketch of how the adjusted hazard can be obtained follows below.
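A sketch, again assuming the objects defined earlier in this tutorial:

new_exp, new_impfs, buildings_haz = meas_buildings.apply(exp_litpop, imp_fun_set, haz)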
haz.plot_rp_intensity(return_periods=(5, 20))
buildings_haz.plot_rp_intensity(return_periods=(5, 20));
It shows there are now very few events at the 5-year return period - the new building codes removed most of these from
the event set.
Discount rates
The disc_rates attribute is of type DiscRates. This class contains the discount rates for the following years and
computes the net present value for given values.
See the Discount Rates tutorial for more details about creating, reading and writing the DiscRates class, and how it is
used in calculations.
Here we will implement a simple, flat 2% discount rate.
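A sketch of such a discount rate definition; the year range is a placeholder:

import numpy as np
from climada.entity import DiscRates

years = np.arange(1950, 2101)
rates = np.ones(years.size) * 0.02
disc = DiscRates(years=years, rates=rates)
disc.plot();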
We are now ready to move to the last part of the CLIMADA model for Impact and Cost Benefit analyses.
Define Entity
We are now ready to define our Entity object that contains the exposures, impact functions, discount rates and measures.
from climada.entity import Entity

ent = Entity(
    exposures=exp_litpop,
    disc_rates=disc,
    impact_func_set=imp_fun_set,  # assumed completion; the remaining arguments were cut off in the source
    measure_set=meas_set,
)
2.2.6 Engine
The CLIMADA Engine is where the main risk calculations are done. It contains two classes, Impact, for risk assessments,
and CostBenefit, to evaluate adaptation measures.
Impact
Let us compute the impact of historical tropical cyclones in Puerto Rico.
Our work above has given us everything we need for a risk analysis using the Impact class. By computing the impact for
each historical event, the Impact class provides different risk measures, such as the expected annual impact per exposure,
the probable maximum impact for different return periods, and the total average annual impact.
Note: the configurable parameter CONFIG.max_matrix_size controls the maximum matrix size contained in a chunk.
You can decrease its value if you are having memory issues when using the Impact's calc method. A high value will
make the computation fast, but increase the memory use. (See the config guide on how to set configuration values.)
CLIMADA calculates impacts by providing exposures, impact functions and hazard to an Impact object’s calc method:
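In recent CLIMADA versions this is done through the ImpactCalc class referenced elsewhere in this guide; a sketch using the Entity defined above:

from climada.engine import ImpactCalc

imp = ImpactCalc(ent.exposures, ent.impact_funcs, haz).impact(save_mat=False)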
A useful parameter for the calc method is save_mat. When set to True (default is False), the Impact object saves
the calculated impact for each event at each point of exposure, stored as a (large) sparse matrix in the imp_mat attribute.
This allows for more detailed analysis at the event level.
The Impact class includes a number of analysis tools. We can plot an exceedance frequency curve, showing us how
often different damage thresholds are reached in our source data (remember this is only 40 years of storms, so not a full
climatology!).
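A sketch of such a plot, using the calc_freq_curve method shown earlier in the overview:

freq_curve = imp.calc_freq_curve()
freq_curve.plot();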
For additional functionality, including plotting the impacts of individual events, see the Impact tutorial.
Exercise: Plot the impacts of Hurricane Maria. To do this you’ll need to set save_mat=True in the earlier
ImpactCalc.impact().
We recommend using CLIMADA's writers in hdf5 or csv whenever possible. It is also possible to save our variables in
pickle format using the save function and load them with load. This will save your results in the folder specified in the
configuration file. The default folder is a results folder which is created in the current path (see the default configuration
file climada/conf/defaults.conf). The pickle format is a transient format and should be avoided when possible.
import os
from climada.util import save, load
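A sketch of saving and re-loading the Impact object in pickle format (the file name is a placeholder):

save("impact_puerto_rico.p", imp)
imp_loaded = load("impact_puerto_rico.p")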
Impact also has write_csv() and write_excel() methods to save the impact variables, and
write_sparse_csr() to save the impact matrix (impact per event and exposure). Use the Impact tutorial to
get more information about these functions and the class in general.
from climada.engine import CostBenefit

cost_ben = CostBenefit()
cost_ben.calc(haz, ent, future_year=2040)  # prints costs and benefits
cost_ben.plot_cost_benefit()  # plot cost benefit ratio and averted damage of every exposure
cost_ben.plot_event_view(return_per=(10, 20, 40));  # plot averted damage of each measure for every return period
This is just the start. Analyses improve as we add more adaptation measures into the mix.
Cost-benefit calculations can also include
• climate change, by specifying the haz_future parameter in CostBenefit.calc()
• changes to economic exposure over time (or to whatever exposure you’re modelling) by specifying the ent_future
parameter in CostBenefit.calc()
• different functions to calculate risk benefits. These are specified in CostBenefit.calc() and by default use
changes to average annual impact
• linear, sublinear and superlinear evolution of impacts between the present and future, specified in the
imp_time_depen parameter in CostBenefit.calc()
And once future hazards and exposures are defined, we can express changes to impacts over time as waterfall diagrams.
See the CostBenefit class for more details.
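A sketch of how these options could be combined, assuming future hazard and entity objects (haz_future, ent_future) have been prepared analogously to haz and ent; parameter values are illustrative:
cost_ben = CostBenefit()
cost_ben.calc(
    haz, ent,
    haz_future=haz_future,  # hazard under a climate change scenario
    ent_future=ent_future,  # exposures/entity projected to the future year
    future_year=2040,
    imp_time_depen=1,  # linear evolution of impacts between the present and 2040
)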
Exercise: repeat the above analysis, creating future climate hazards (see the first exercise), and future ex-
posures based on projected economic growth. Visualise it with the CostBenefit.plot_waterfall()
method.
Note that intensity and fraction are scipy.sparse matrices of size num_events x num_centroids. The fraction
attribute is optional. The Centroids class contains the geographical coordinates where the hazard is defined. A Centroids
instance provides the coordinates either as points or raster data together with their Coordinate Reference System
(CRS). The default CRS used in CLIMADA is EPSG:4326. Centroids moreover provides methods to compute
centroid areas, an on-land mask, a country ISO mask, or the distance to coast.
%matplotlib inline
import warnings

import numpy as np

from climada.hazard import Hazard
from climada.util.constants import HAZ_DEMO_FL

warnings.filterwarnings("ignore")

# read intensity from raster file HAZ_DEMO_FL and set frequency for the contained event
haz_ven = Hazard.from_raster(
    [HAZ_DEMO_FL], attrs={"frequency": np.ones(1) / 2}, haz_type="FL"
)
haz_ven.check()
event_id: [1]
event_name: ['1']
date: [1.]
frequency: [0.5]
orig: [ True]
min, max fraction: 0.0 1.0
EXERCISE:
# Solution:
# 2. Transformations of the coordinates can be set using the transform option and Affine
from rasterio import Affine

haz = Hazard.from_raster(
    [HAZ_DEMO_FL],
    haz_type="FL",
    transform=Affine(
        0.009000000000000341,
        0.0,
        -69.33714959699981,
        0.0,
        -0.009000000000000341,
        10.42822096697894,
    ),
    height=500,
    width=501,
)
haz.check()
print("\n Solution 2:")
print("raster info:", haz.centroids.get_meta())
print("intensity size:", haz.intensity.shape)
# 3. Part of the raster can be loaded using the window or geometry arguments
from rasterio.windows import Window
Solution 1:
centroids CRS: epsg:2201
raster info: {'crs': <Projected CRS: EPSG:2201>
Name: REGVEN / UTM zone 18N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Venezuela - west of 72°W.
- bounds: (-73.38, 7.02, -71.99, 11.62)
Coordinate Operation:
Solution 2:
raster info: {'crs': <Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- undefined
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
, 'height': 500, 'width': 501, 'transform': Affine(0.009000000000000341, 0.0, -69.33714959699981,
Solution 3:
raster info: {'crs': <Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- undefined
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
, 'height': 30, 'width': 20, 'transform': Affine(0.009000000000000341, 0.0, -69.2471495969998,
• MATLAB: Hazards generated with CLIMADA's MATLAB version (.mat format) can be read using from_mat().
• vector data: Use Hazard's from_vector constructor to read shape data (all formats supported by fiona).
• hdf5: Hazards generated with CLIMADA in Python (.h5 format) can be read using from_hdf5().
# Hazard needs to know the acronym of the hazard type to be constructed! Use 'NA' if not known.
from climada.util.constants import HAZ_DEMO_H5

haz_tc_fl = Hazard.from_hdf5(HAZ_DEMO_H5)  # historic tropical cyclones in Florida from 1990 to 2004
haz_tc_fl.check()  # always use the check() method to see if the hazard has been loaded correctly
# setting points
import numpy as np
from scipy import sparse
lat = np.array(
[
26.933899,
26.957203,
26.783846,
26.645524,
26.897796,
26.925359,
26.914768,
26.853491,
26.845099,
26.82651,
26.842772,
26.825905,
26.80465,
26.788649,
26.704277,
26.71005,
26.755412,
26.678449,
lon = np.array(
[
-80.128799,
-80.098284,
-80.748947,
-80.550704,
-80.596929,
-80.220966,
-80.07466,
-80.190281,
-80.083904,
-80.213493,
-80.0591,
-80.630096,
-80.075301,
-80.069885,
-80.656841,
haz = Hazard(
haz_type="TC",
intensity=intensity,
fraction=fraction,
centroids=Centroids(lat=lat, lon=lon), # default crs used
units="m",
event_id=np.arange(n_ev, dtype=int),
),
orig=np.zeros(n_ev, bool),
frequency=np.ones(n_ev) / n_ev,
)
haz.check()
haz.centroids.plot();
# using from_pnt_bounds
# bounds
left, bottom, right, top = (
-72,
-3.0,
-52.0,
22,
) # the bounds refer to the bounds of the center of the pixel
# resolution
res = 0.5
centroids = Centroids.from_pnt_bounds(
(left, bottom, right, top), res
) # default crs used
# the same can be done with the method `from_meta`, by definition of a raster meta dictionary (sketched below)
import rasterio
from climada.util.constants import DEF_CRS
# raster info:
# border upper left corner (of the pixel, not of the center of the pixel)
max_lat = top + res / 2
min_lon = left - res / 2
# resolution in lat and lon
d_lat = -res # negative because starting in upper corner
d_lon = res # same step as d_lat
# number of points
n_lat, n_lon = centroids.shape
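The cell defining centroids_from_meta is not shown in this extract; a sketch of how the raster meta dictionary could be assembled from the quantities computed above and passed to Centroids.from_meta (assuming that constructor accepts a meta dictionary with these keys):
meta = {
    "crs": DEF_CRS,
    "height": n_lat,
    "width": n_lon,
    "transform": rasterio.Affine(d_lon, 0.0, min_lon, 0.0, d_lat, max_lat),
}
centroids_from_meta = Centroids.from_meta(meta)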
centroids_from_meta == centroids
True
import numpy as np
from scipy import sparse
haz = Hazard(
"TC",
centroids=centroids,
intensity=intensity,
fraction=fraction,
units="m",
event_id=np.arange(n_ev, dtype=int),
event_name=[
"ev_12",
),
orig=np.zeros(n_ev, bool),
frequency=np.ones(n_ev) / n_ev,
)
haz.check()
print("Check centroids borders:", haz.centroids.total_bounds)
haz.centroids.plot();
• centroids properties such as area per pixel, distance to coast, country ISO code, on-land mask or elevation are
available through different set_XX() methods.
• set_lat_lon_to_meta() computes the raster meta dictionary from the present lat and lon.
set_meta_to_lat_lon() computes lat and lon of the centers of the pixels described in the attribute meta.
The raster meta information contains at least: width, height, crs and transform data (use help(Centroids)
for more info). Using raster centroids can increase computing performance for several computations.
• when using lats and lons (vector data) the geopandas.GeoSeries geometry attribute contains the CRS information
and can be filled with point shapes to perform different computations. The geometry points can then be
released using empty_geometry_points().
EXERCISE:
# help(hist_tc.centroids)  # if you want to run it, do it after you execute the next block
# SOLUTION:
# 2. Generate a hazard with historical hurricanes occurring between 1995 and 2001.
hist_tc = haz_tc_fl.select(date=("1995-01-01", "2001-12-31"), orig=True)
print("Number of historical events between 1995 and 2001:", hist_tc.size)
# 3. How many historical hurricanes occurred in 1999? Which was the year with most hurricanes between 1995 and 2001?
# 4. What is the number of centroids with distance to coast smaller than 1km?
num_cen_coast = np.argwhere(hist_tc.centroids.get_dist_coast() < 1000).size
print("Number of centroids close to coast: ", num_cen_coast)
help(haz_tc_fl.plot_intensity)
help(haz_tc_fl.plot_rp_intensity)
(The help output lists each method's parameters, the returned matplotlib.axes._subplots.AxesSubplot, and the ValueError raised for an invalid Hazard instance.)
# 4. tropical cyclone intensities maps for the return periods [10, 50, 75, 100]
exceedance_intensities, label, column_label = haz_tc_fl.local_exceedance_intensity(
# 5. tropical cyclone return period maps for the threshold intensities [30, 40]
return_periods, label, column_label = haz_tc_fl.local_return_period([30, 40])
from climada.util.plot import plot_from_gdf
# 7. intensities of all the events in centroid closest to lat, lon = (26.5, -81)
haz_tc_fl.plot_intensity(centr=(26.5, -81));
# 7. one figure with two plots: maximum intensities and selected centroid with all intensities:
# If you see an error message, try to create a folder named results in the tutorial repository.
haz_tc_fl.write_hdf5("results/haz_tc_fl.h5")
haz = Hazard.from_hdf5("results/haz_tc_fl.h5")
haz.check()
Pickle will work as well, but note that pickle has a transient format and should be avoided when possible:
# this generates a results folder in the current path and stores the output there
save("tutorial_haz_tc_fl.p", haz_tc_fl)
Coordinates
time
latitude
longitude
Descriptive variables
time_step
radius_max_wind
max_sustained_wind
central_pressure
environmental_pressure
Attributes
max_sustained_wind_unit
central_pressure_unit
sid
name
orig_event_flag
data_provider
basin
id_no
category
The best-track historical data from the International Best Track Archive for Climate Stewardship (IBTrACS) can easily
be loaded into CLIMADA to study the historical records of TC events. The constructor from_ibtracs_netcdf()
generates the Datasets for tracks selected by IBTrACS id, or by basin and year range. The first time it is called, it
downloads the IBTrACS v4 data in netCDF format and stores it in ~/climada/data/. The tracks can be accessed later
either using the attribute data or using get_track(), which allows selecting tracks by name or id. Use the method
append() to extend the data list.
If you get an error downloading the IBTrACS data, try to manually access https://www.ncei.noaa.gov/data/
international-best-track-archive-for-climate-stewardship-ibtracs/v04r01/access/netcdf/, click on the file IBTrACS.
ALL.v04r01.nc and copy it to ~/climada/data/.
%matplotlib inline
from climada.hazard import TCTracks
sel_ibtracs = TCTracks.from_ibtracs_netcdf(
provider="usa", year_range=(1993, 1994), basin="EP", correct_pres=False
)
print("Number of tracks:", sel_ibtracs.size)
ax = sel_ibtracs.plot()
ax.get_legend()._loc = 2 # correct legend location
ax.set_title("1993-1994, EP") # set title
track1 = TCTracks.from_ibtracs_netcdf(
provider="usa", storm_id="2007314N10093"
) # SIDR 2007
track2 = TCTracks.from_ibtracs_netcdf(
provider="usa", storm_id="2016138N10081"
) # ROANU 2016
track1.append(track2.data) # put both tracks together
ax = track1.plot()
ax.get_legend()._loc = 2 # correct legend location
ax.set_title("SIDR and ROANU"); # set title
tr_irma.get_track("2017242N16333")
<xarray.Dataset>
Dimensions: (time: 123)
Coordinates:
* time (time) datetime64[ns] 2017-08-30 ... 2017-09-13T1...
lat (time) float32 16.1 16.15 16.2 ... 36.2 36.5 36.8
lon (time) float32 -26.9 -27.59 -28.3 ... -89.79 -90.1
Data variables:
Once tracks are present in TCTracks, one can generate synthetic tracks for each present track based on a directed random
walk. Note that the tracks should be interpolated to the same timestep before the generation of probabilistic events.
calc_perturbed_trajectories() generates an ensemble of nb_synth_tracks synthetic tracks for every present track.
The methodology perturbs the track locations, and if decay is True it additionally includes the decay
of wind speed and central pressure drop after landfall. No other track parameter is perturbed.
# here we use tr_irma retrieved from IBTrACS with the function above
# select number of synthetic tracks (nb_synth_tracks) to generate per present tracks.
tr_irma.equal_timestep()
tr_irma.calc_perturbed_trajectories(nb_synth_tracks=5)
tr_irma.plot();
# see more configuration options (e.g. amplitude of max random starting point shift in decimal degree; max_shift_ini)
tr_irma.data[-1] # last synthetic track. notice the value of orig_event_flag and name
<xarray.Dataset>
Dimensions: (time: 349)
Coordinates:
* time (time) datetime64[ns] 2017-08-30 ... 2017-09-13T1...
lon (time) float64 -27.64 -27.8 -27.96 ... -97.81 -97.93
lat (time) float64 15.39 15.41 15.42 ... 27.41 27.49
Data variables:
time_step (time) float64 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
radius_max_wind (time) float64 60.0 60.0 60.0 ... 60.0 60.0 60.0
radius_oci (time) float64 180.0 180.0 180.0 ... 350.0 350.0
max_sustained_wind (time) float64 30.0 30.67 31.33 ... 15.0 14.99 14.96
central_pressure (time) float64 1.008e+03 1.008e+03 ... 1.005e+03
environmental_pressure (time) float64 1.012e+03 1.012e+03 ... 1.008e+03
basin (time) <U2 'NA' 'NA' 'NA' 'NA' ... 'NA' 'NA' 'NA'
on_land (time) bool False False False ... False True True
dist_since_lf (time) float64 nan nan nan nan ... nan 7.605 22.71
Attributes:
max_sustained_wind_unit: kn
central_pressure_unit: mb
name: IRMA_gen5
sid: 2017242N16333_gen5
orig_event_flag: False
data_provider: ibtracs_usa
id_no: 2017242016333.05
category: 5
EXERCISE
# SOLUTION:
import numpy as np
ECMWF publishes tropical cyclone forecast tracks free of charge as part of the WMO essentials. These tracks are detected
automatically in the ENS and HRES models. The non-supervised nature of the model may lead to artefacts.
The tc_fcast trackset below inherits from TCTracks, but contains some additional metadata that follows ECMWF’s
definitions. Try plotting these tracks and compare them to the official cones of uncertainty! The example track at
tc_fcast.data[0] shows the data structure.
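The cell creating tc_fcast is not part of this extract; a minimal sketch, assuming the TCForecast class shipped with CLIMADA (requires internet access to ECMWF's open data):
from climada.hazard import TCForecast

tc_fcast = TCForecast()
tc_fcast.fetch_ecmwf()  # download the current ECMWF forecast tracks
print("Number of forecast tracks:", tc_fcast.size)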
In addition to the historical records of TCs (IBTrACS), the probabilistic extension of these tracks, and the ECMWF Forecast
tracks, CLIMADA also features functions to read in synthetic TC tracks from other sources. These include synthetic
storm tracks from Kerry Emanuel’s coupled statistical-dynamical model (Emanuel et al., 2006 as used in Geiger et al.,
2016), from an open source derivative of Kerry Emanuel’s model FAST, synthetic storm tracks from a second coupled
statistical-dynamical model (CHAZ) (as described in Lee et al., 2018), and synthetic storm tracks from a fully statistical
model (STORM) (Bloemendaal et al., 2020). However, these functions are partly under development and/or targeted at
advanced users of CLIMADA in the context of very specific use cases. They are thus not covered in this tutorial.
When setting tropical cyclones from tracks, the centroids onto which the wind gusts (the hazard intensity) are mapped can
be provided. If no centroids are provided, the global centroids GLB_NatID_grid_0360as_adv_2.mat are used.
From the track properties the 1 min sustained peak gusts are computed in each centroid as the sum of a circular wind
field (following Holland, 2008) and the translational wind speed that arises from the storm movement. We incorporate the
decline of the translational component from the cyclone centre by multiplying it by an attenuation factor. See CLIMADA
v1 and references therein for more information.
# construct centroids
min_lat, max_lat, min_lon, max_lon = 16.99375, 21.95625, -72.48125, -61.66875
cent = Centroids.from_pnt_bounds((min_lon, min_lat, max_lon, max_lat), res=0.12)
cent.plot()
# and then the kernel will be killed: so, don't use this function without given centroids!
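# Sketch (not part of the original cell): tc_irma is assumed to be computed from
# the Irma tracks (tr_irma) and the centroids defined above, e.g.:
from climada.hazard import TropCyclone

tc_irma = TropCyclone.from_tracks(tr_irma, centroids=cent)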
tc_irma.check()
tc_irma.plot_intensity("2017242N16333")
# IRMA
tc_irma.plot_intensity("2017242N16333_gen2"); # IRMA's synthetic track 2
apply_climate_scenario_knu implements the changes in frequency due to climate change described in Knutson et
al. (2020) and Jewson et al. (2021). This requires passing the RCP scenario of interest, the projection's future reference
year, the projection's percentile of interest, and the historical baseline period. For simplicity we keep the latter two at their
default values and only specify the RCP (45) and the future reference year (2055).
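The call producing tc_irma_cc is not shown in this extract; a sketch with parameter names assumed from the description above (check the API reference for the exact signature):
# assumed parameter names; percentile and baseline are kept at their defaults
tc_irma_cc = tc_irma.apply_climate_scenario_knu(rcp_scenario=45, ref_year=2055)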
rel_freq_incr = np.round(
(np.mean(tc_irma_cc.frequency) - np.mean(tc_irma.frequency))
/ np.mean(tc_irma.frequency)
* 100,
0,
)
print(
    f"\nA TC like Irma would undergo a frequency increase of about {rel_freq_incr} % in 2055 under RCP 45"
)
Note: this method to implement climate change is simplified and does only take into account changes in TC frequency.
However, how hurricane damage changes with climate remains challenging to assess. Records of hurricane damage
exhibit widely fluctuating values because they depend on rare, landfalling events which are substantially more volatile
than the underlying basin-wide TC characteristics. For more accurate future projections of how a warming climate might
shape TC characteristics, a two-step process is needed. First, an understanding of how climate change affects
critical environmental factors (like SST, humidity, etc.) that shape TCs is required. Second, the means of simulating how
these changes impact TC characteristics (such as intensity, frequency, etc.) are necessary. Statistical-dynamical models
(Emanuel et al., 2006 and Lee et al., 2018) are physics-based and allow for such climate change studies. However, this
goes beyond the scope of this tutorial.
Multiprocessing is part of the tropical cyclone module. Simply provide a process pool as method argument. Below is an
example of how large amounts of data could be processed.
WARNING: Running multiprocessing code from Jupyter Notebooks can be cumbersome. It’s suggested to copy the code
and paste it into an interactive python console.
from climada.hazard import TCTracks, Centroids, TropCyclone
tc_track.equal_timestep(pool=pool)
tc_track.calc_perturbed_trajectories(pool=pool)  # OPTIONAL: if you want to generate a probabilistic set of TC tracks.
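The pool object itself is not created in the snippet above; CLIMADA's tutorials typically use pathos for this, so a sketch of the setup could look as follows (assuming pathos is installed):
from pathos.pools import ProcessPool as Pool

pool = Pool()  # spawn worker processes and pass `pool` to the methods above
# ... run equal_timestep / calc_perturbed_trajectories with pool=pool ...
pool.close()
pool.join()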
d) Making videos
Videos of a tropical cyclone hitting specific centroids can be created with the method video_intensity().
WARNING: Creating an animated gif file may consume a lot of memory, up to the point where the os starts swapping
or even an ‘out-of-memory’ exception is thrown.
# Note: execution of this cell will fail unless there is enough memory available (> 10G)
tc_video = TropCyclone()
tc_list contains a list with TropCyclone instances plotted at each time step; tr_coord contains a list with the track
path coordinates plotted at each time step.
Animated gif images occupy a lot of space. Using mp4 as output format makes the video sequences much smaller!
However this requires the package ffmpeg to be installed, which is not part of the ordinary climada environment. It can
be installed by executing the following command in a console:
Creating the same video as above in mp4 format can then be done as follows:
# Note: execution of this cell will fail unless there is enough memory available (> 12G) and ffmpeg is installed
import shutil
from matplotlib import animation
from matplotlib.pyplot import rcParams
rcParams["animation.ffmpeg_path"] = shutil.which("ffmpeg")
writer = animation.FFMpegWriter(bitrate=500)
REFERENCES:
• Bloemendaal, N., Haigh, I. D., de Moel, H., Muis, S., Haarsma, R. J., & Aerts, J. C. J. H. (2020).
Generation of a global synthetic tropical cyclone hazard dataset using STORM. Scientific Data, 7(1).
https://doi.org/10.1038/s41597-020-0381-2
• Emanuel, K., S. Ravela, E. Vivant, and C. Risi, 2006: A Statistical Deterministic Approach to Hurricane Risk
Assessment. Bull. Amer. Meteor. Soc., 87, 299–314, https://doi.org/10.1175/BAMS-87-3-299.
• Geiger, T., Frieler, K., & Levermann, A. (2016). High-income does not protect against hurricane losses. Environ-
mental Research Letters, 11(8). https://doi.org/10.1088/1748-9326/11/8/084012
• Knutson, T. R., Sirutis, J. J., Zhao, M., Tuleya, R. E., Bender, M., Vecchi, G. A., … Chavas, D. (2015). Global
projections of intense tropical cyclone activity for the late twenty-first century from dynamical downscaling of
CMIP5/RCP4.5 scenarios. Journal of Climate, 28(18), 7203–7224. https://doi.org/10.1175/JCLI-D-15-0129.1
• Lee, C. Y., Tippett, M. K., Sobel, A. H., & Camargo, S. J. (2018). An environmentally forced
tropical cyclone hazard model. Journal of Advances in Modeling Earth Systems, 10(1), 223–241.
https://doi.org/10.1002/2017MS001186
%matplotlib inline
import matplotlib.pyplot as plt
Reading Data
StormEurope was written under the presumption that you’d start out with WISC storm footprint data in netCDF format.
This notebook works with a demo dataset. If you would like to work with the real data: (1) follow the link and
download the file C3S_WISC_FOOTPRINT_NETCDF_0100.tgz from the Copernicus Windstorm Information Service,
(2) unzip it, (3) uncomment the last two lines in the following code block, and (4) adjust the variable "WISC_files".
We first construct an instance and then point the reader at a directory containing compatible .nc files. Since there are
other files in there, we must be explicit and use a globbing pattern; supplying incompatible files will make the reader fail.
The reader actually calls climada.util.files_handler.get_file_names, so it’s also possible to hand it an ex-
plicit list of filenames, or a dirname, or even a list of glob patterns or directories.
storm_instance = StormEurope.from_footprints(WS_DEMO_NC)
# WISC_files = '/path/to/folder/C3S_WISC_FOOTPRINT_NETCDF_0100/fp_era[!er5]*_0.nc'
# storm_instance = StormEurope.from_footprints(WISC_files)
Introspection
Let’s quickly see what attributes this class brings with it:
?storm_instance
Type: StormEurope
String form: <climada.hazard.storm_europe.StormEurope object at 0x7f2a986b4c70>
File: ~/code/climada_python/climada/hazard/storm_europe.py
Docstring:
A hazard set containing european winter storm events. Historic storm
events can be downloaded at https://cds.climate.copernicus.eu/ and read
with `from_footprints`. Weather forecasts can be automatically downloaded from
https://opendata.dwd.de/ and read with from_icon_grib(). Weather forecast
from the COSMO-Consortium https://www.cosmo-model.org/ can be read with
from_cosmoe_file().
Attributes
----------
ssi_wisc : np.array, float
Storm Severity Index (SSI) as recorded in
the footprint files; apparently not reproducible from the footprint
values only.
ssi : np.array, float
SSI as set by set_ssi; uses the Dawkins
definition by default.
Init docstring: Calls the Hazard init dunder. Sets unit to 'm/s'.
You could also try listing all permissible methods with dir(storm_instance), but since that would include the methods
from the Hazard base class, you wouldn’t know what’s special. The best way is to read the source: uncomment the
following statement to read more.
# StormEurope??
storm_instance.set_ssi(
method="wind_gust",
intensity=storm_instance.intensity,
# the above is just a more explicit way of passing the default
on_land=True,
threshold=25,
sel_cen=None,
# None is default. sel_cen could be used to subset centroids
)
Probabilistic Storms
This class allows generating probabilistic storms from historical ones according to a method outlined in Schwierz et al.
2010. This means that per historical event, we generate 29 new ones with altered intensities. Since it’s just a bunch of
vector operations, this is pretty fast.
However, we should not return the entire probabilistic dataset in-memory: in trials, this used up 60 GB of RAM, thus
requiring a great amount of swap space. Instead, we must select a country by setting the reg_id parameter to an ISO_N3
country code used in the Natural Earth dataset. It is also possible to supply a list of ISO codes. If your machine is up for
the job of handling the whole dataset, set the reg_id parameter to None.
Since assigning each centroid a country ID is a rather inefficient affair, you may need to wait a minute or two for the entire
WISC dataset to be processed. For the small demo dataset, it runs pretty quickly.
%%time
storm_prob = storm_instance.generate_prob_storms(reg_id=528)
storm_prob.plot_intensity(0);
fig.tight_layout()
We can get much more fancy in our calls to generate_prob_storms; the keyword arguments after ssi_args are
passed on to _hist2prob, allowing us to tweak the probabilistic permutations.
ssi_args = {
"on_land": True,
"threshold": 25,
}
storm_prob_xtreme = storm_instance.generate_prob_storms(
reg_id=[56, 528], # BEL and NLD
spatial_shift=2,
ssi_args=ssi_args,
power=1.5,
scale=0.3,
)
We can now check out the SSI plots of both these calculations. The comparison between the historic and probabilistic SSI
values only makes sense for the full dataset.
storm_prob_xtreme.plot_ssi(full_area=True)
storm_prob.plot_ssi(full_area=True);
2.3.4 Using the Copernicus Seasonal Forecast Tools package to create a hazard
object
Introduction
The copernicus-seasonal-forecast-tools package was developed to manage seasonal forecast data from the Copernicus
Climate Data Store (CDS) for the U-CLIMADAPT project. It offers comprehensive tools for downloading, process-
ing, computing climate indices, and generating hazard objects based on seasonal forecast datasets, particularly Seasonal
forecast daily and subdaily data on single levels. The package is tailored to integrate seamlessly with CLIMADA,
supporting climate risk assessment and the development of effective adaptation strategies.
Features:
• Automated download of the high-dimensional seasonal forecasts data via the Copernicus API
• Preprocessing of sub-daily forecast data into daily formats
• Calculation of heat-related climate indices (e.g., heatwave days, tropical nights)
• Conversion of processed indices into CLIMADA hazard objects ready for impact modelling
• Flexible modular architecture to accommodate additional indices or updates to datasets
In this tutorial, you can see a simple example of how to retrieve and process data from Copernicus, calculate a heat-related
index, and create a hazard object. For more detailed documentation and advanced examples, please visit the repository
or the documentation.
Prerequisites:
1. CDS account and API key: Register at https://cds.climate.copernicus.eu
2. CDS API client installation: pip install cdsapi
3. CDS API configuration: Create a .cdsapirc file in your home directory with your API key and URL. For instructions,
visit: https://cds.climate.copernicus.eu/how-to-api#install-the-cds-api-client
4. Dataset Terms and Conditions: After selecting the dataset to download, make sure to accept the terms and
conditions on the corresponding dataset webpage in the CDS portal before running this notebook. Here,
https://cds.climate.copernicus.eu/datasets/seasonal-original-single-levels?tab=download.
For more information, visit the comprehensive CDS API setup guide, which walks you through each step of the process.
Once configured, you’ll be ready to explore and analyze seasonal forecast data.
Note: Ensure you have the necessary permissions and comply with the CDS data usage policies when using this
package. You can view the terms and conditions at https://cds.climate.copernicus.eu/datasets/seasonal-original-single-
levels?tab=download. You can find them at the bottom of the download page.
# Import packages
import warnings
import datetime as dt
warnings.filterwarnings("ignore")
from seasonal_forecast_tools import SeasonalForecast, ClimateIndex
from seasonal_forecast_tools.utils.coordinates_utils import bounding_box_from_countries
Set up parameters
To configure the package for working with Copernicus forecast data and converting it into a hazard object for CLIMADA,
you will need to define several essential parameters. These settings are crucial as they specify the type of data to be
retrieved, the format, the forecast period, and the geographical area of interest. These parameters influence how the
forecast data is processed and transformed into a hazard object.
Below, we outline these parameters and use an example for the Tmax – Maximum Temperature index to demonstrate the
seasonal forecast functionality.
To learn more about what these parameters entail and their significance, please refer to the documentation on the CDS
webpage.
Overview of parameters
index_metric: Defines the type of index to be calculated. There are currently 12 predefined options available, including
temperature-based indices (Tmean – Mean Temperature, Tmin – Minimum Temperature, Tmax – Maximum Tempera-
ture), heat stress indicators (HIA – Heat Index Adjusted, HIS – Heat Index Simplified, HUM – Humidex, AT – Apparent
Temperature, WBGT – Wet Bulb Globe Temperature (Simple)), and extreme event indices (HW – Heat Wave, TR – Tropical
Nights, TX30 – Hot Days).
• Heat Waves (“HW”):
If index_metric is set to ‘HW’ for heat wave calculations, additional parameters can be specified to fine-tune the
heat wave detection:
– threshold: Temperature threshold above which days are considered part of a heat wave. Default is 27°C.
– min_duration: Minimum number of consecutive days above the threshold required to define a heat wave
event. Default is 3 days.
– max_gap: Maximum allowable gap (in days) between two heat wave events to consider them as one single
event. Default is 0 days.
• Tropical Nights (“TR”):
If index_metric is set to ‘TR’ for tropical nights, an additional parameter can be specified to set the threshold:
– threshold: Nighttime temperature threshold, above which a night is considered “tropical.” Default is 20°C.
• ⚠ Flexibility: Users can define and integrate their own indices into the pipeline to extend the analysis according
to their specific needs.
format : Specifies the format of the data to be downloaded, "grib" or "netcdf". Copernicus does NOT recommend the
netcdf format for operational workflows, since conversion to netcdf is considered experimental. More information here.
originating_centre: Identifies the source of the data. A standard choice is “dwd” (German Weather Service), one of
eight providers including ECMWF, UK Met Office, Météo France, CMCC, NCEP, JMA, and ECCC.
system: Refers to a specific model or configuration used for forecasts. In this script, the default value is "21", which
corresponds to GCFS (German Climate Forecast System) version 2.1. More details can be found in the CDS
documentation.
year_list: A list of years for which data should be downloaded and processed.
initiation_month: A list of the months in which the forecasts are initiated. Example: [“March”, “April”].
forecast_period: Specifies the months relative to the forecast’s initiation month for which the data is forecasted. Example:
[“June”, “July”, “August”] indicates forecasts for these months. The maximum available is 7 months.
• ⚠ Important: When an initiation month is in one year and the forecast period in the next, the system recognizes the
forecast extends beyond the initial year. Data is retrieved based on the initiation month, with lead times covering
the following year. The forecast is stored under the initiation year’s directory, ensuring consistency while spanning
both years.
area_selection: This determines the geographical area for which the data should be downloaded. It can be set to
• Global coverage:
– Use the predefined function bounding_box_global() to select the entire globe.
• Custom geographical bounds (cardinal coordinates):
– Input explicit latitude/longitude limits (in EPSG:4326).
– bounds = bounding_box_from_cardinal_bounds(northern=49, eastern=20, southern=40, western=10)
• Country codes (ISO alpha-3):
– Provide a list of ISO 3166-1 alpha-3 country codes (e.g., "DEU" for Germany, "CHE" for Switzerland). The
bounding box is constructed as the union of all selected countries. See this wikipedia page for the country
codes.
– bounds = bounding_box_from_countries([“CHE”, “DEU”])
overwrite: Boolean flag that, when set to True, forces the system to redownload and reprocess existing files.
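The notebook cell assigning these parameters is not included in this extract; a sketch with illustrative values for the Tmax example used below (all values are assumptions, not package defaults):
index_metric = "Tmax"  # maximum temperature index
year_list = [2022]  # initiation year(s)
initiation_month = ["November"]  # forecast initiation month(s)
forecast_period = ["December", "January", "February"]  # forecasted months
data_format = "grib"  # recommended download format
originating_centre = "dwd"  # German Weather Service
system = "21"  # GCFS 2.1
bounds = bounding_box_from_countries(["CHE", "DEU"])  # area of interest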
# Describe the selected climate index and the associated input data
forecast = SeasonalForecast(
index_metric=index_metric,
year_list=year_list,
forecast_period=forecast_period,
initiation_month=initiation_month,
bounds=bounds,
data_format=data_format,
originating_centre=originating_centre,
system=system,
)
The variables required for your selected index will be printed below. This allows you to see which data will be accessed
and helps estimate the data volume.
forecast.explain_index()
You can now call the forecast.download_and_process_data method, which efficiently retrieves and organizes
Copernicus forecast data. It checks for existing files to avoid redundant downloads and stores data by format (grib or
netCDF), year, and month. The files are then processed for further analysis, such as calculating climate indices or creating
hazard objects within CLIMADA. The main aspects of this process are:
• Data Download: The method downloads the forecast data for the selected years, months, and regions. The data
is retrieved in grib or netCDF formats, which are commonly used for storing meteorological data. If the required
files already exist in the specified directories, the system will skip downloading them, as indicated by the log
messages such as:
“Corresponding grib file SYSTEM_DIR/copernicus_data/seasonal_forecasts/dwd/sys21/2023/init03/valid06_08/downloaded_data/grib
already exists.”
• Data Processing: After downloading (or confirming the existence of) the files, the system converts them into
daily netCDF files. Each file contains gridded, multi-ensemble data for daily mean, maximum, and minimum,
structured by forecast step, ensemble member, latitude, and longitude. The log messages confirm the existence or
creation of these files, for example:
“Daily file SYSTEM_DIR/copernicus_data/seasonal_forecasts/dwd/sys21/2023/init03/valid06_08/processed_data/TX30_boundsW4_S
already exists.”
• Geographic and Temporal Focus: The files are generated for a specific time frame (e.g., June and July 2022) and
a predefined geographic region, as specified by the parameters such as bounds, month_list, and year_list.
This ensures that only the selected data for your analysis is downloaded and processed.
• Data Completeness: Messages like “already exists” ensure that you do not redundantly download or process data,
saving time and computing resources. However, if the data files are missing, they will be downloaded and processed
as necessary.
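A minimal call on the forecast object configured above; the (truncated) dictionary of file paths it returns is shown below:
forecast.download_and_process_data()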
(Output truncated: a dictionary of paths ending in .../Tmax_boundsN-59_S-35_E-52_W-29.grib and .../Tmax_boundsN-59_S-35_E-52_W-29.nc)
From here, you can consult the data created by calling xarray. This will display the structure of the dataset, including
dimensions such as time (here called steps), latitude, longitude, and ensemble members, as well as coordinates, data
variables such as the processed daily values of temperature at two meters (mean, max, and min), and associated metadata
and attributes.
This already processed daily data can be used as needed; or you can now also calculate a heat-related index as in the
following cells.
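The file_path variable used in the next cell is not defined in this extract; it is assumed to point at one of the processed daily netCDF files reported above, for example:
# placeholder: substitute the actual path printed by download_and_process_data
file_path = "<path to a processed daily netCDF file>"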
import xarray as xr
ds = xr.open_dataset(file_path)
ds
# You can also just select the data for the first time step of the first ensemble member
ds.isel(step=0, number=0)
If you decide to calculate an index, you can call the forecast.calculate_index method to compute specific climate
indices (such as Maximum Temperature). The output is automatically saved and organized in a structured format for
further analysis. Here are some details:
• Index Calculation: The method processes seasonal forecast data to compute the selected index for the chosen
years, months, and regions. This index represents a specific climate condition, such as the Maximum Temperature
("Tmax") over the forecast period, as defined in the parameters.
• Data Storage: The calculated index data is saved in netCDF format. These files are automatically saved in
directories specific to the index and time period. The file paths are printed below the processing steps. For
example, the computed index values are stored in:
“SYSTEM_DIR/copernicus_data/seasonal_forecasts/dwd/sys21/2023/init03/valid06_08/indices/TX30/TX30_boundsW4_S44_E11_N4
Similarly, the statistics of the index (e.g., mean, max, min, std) are saved in:
“SYSTEM_DIR/copernicus_data/seasonal_forecasts/dwd/sys21/2023/init03/valid06_08/indices/TX30/TX30_boundsW4_S44_E11_N4
These files ensure that both the raw indices and their statistical summaries are available for detailed analysis.
Each file contains data for a specific month and geographic region, as defined in the parameters. This allows you
to analyze how the selected climate index varies over time and across different locations.
• Completeness of Data Processing: Messages ‘Index Tmax successfully calculated and saved for…’ confirm the
successful calculation and storage of the index, ensuring that all requested data has been processed and saved
correctly.
# Calculate index
forecast.calculate_index(
hw_threshold=hw_threshold, hw_min_duration=hw_min_duration, hw_max_gap=hw_max_gap
)
...59_S-35_E-52_W-29_daily.nc'),
'monthly': PosixPath('/Users/daraya/climada/data/copernicus_data/seasonal_forecasts/dwd/sys21/2022/init11/valid12_02/indices/Tmax/Tmax_boundsN-59_S-35_E-52_W-29_monthly.nc'),
'stats': PosixPath('/Users/daraya/climada/data/copernicus_data/seasonal_forecasts/dwd/sys21/2022/init11/valid12_02/indices/Tmax/Tmax_boundsN-59_S-35_E-52_W-29_stats.nc')}}
We can explore the properties of the daily file containing the calculated index and, for example, visualize the values for
each ensemble member on a specific date. This enables a quick visual inspection of how the predicted Tmax varies across
ensemble members on that day.
)
ds_daily
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
target_datetime = pd.to_datetime(
target_date
) # Find matching step index for target date
index_match = np.where(forecast_dates.normalize() == target_datetime.normalize())[0]
if len(index_match) == 0:
    raise ValueError(
        f"Date {target_date} not found in forecast data.\nAvailable dates: {forecast_dates.strftime('%Y-%m-%d').tolist()}"
    )
step_index = index_match[0]
data = ds_daily[index_metric].isel(step=step_index)
for i in range(50):
ax = axs[i]
p = data.isel(number=i).plot(
ax=ax,
transform=ccrs.PlateCarree(),
x="longitude",
y="latitude",
add_colorbar=False,
cmap="viridis",
)
ax.coastlines(color="white")
ax.add_feature(cfeature.BORDERS, edgecolor="white")
ax.set_title(f"Member {i+1}", fontsize=8)
ax.set_xticks([])
ax.set_yticks([])
plt.subplots_adjust(
bottom=0.1, top=0.93, left=0.05, right=0.95, wspace=0.1, hspace=0.1
) # Add shared colorbar
cbar_ax = fig.add_axes([0.15, 0.05, 0.7, 0.015])
fig.colorbar(p, cax=cbar_ax, orientation="horizontal", label=index_metric)
We can also access the monthly index data, where the step coordinate now represents monthly values instead of daily
ones, reflecting the aggregation over each forecast month.
import datetime
ds_monthly = xr.open_dataset(
    "/Users/daraya/climada/data/copernicus_data/seasonal_forecasts/dwd/sys21/2022/init11/valid12_02/indices/Tmax/Tmax_boundsN-59_S-35_E-52_W-29_monthly.nc"
)
for step in ds_monthly.step.values:
print(str(step))
2022-12
2023-01
2023-02
You can now also explore ensemble statistics over time using the precomputed stats.nc file. The file contains ten
statistical variables: mean, median, max, min, standard deviation, and percentiles from p5 to p95, computed across
ensemble members. These stored variables allow for easy plotting and visualization of how ensemble members vary
across forecast months. As expected, the spread in member predictions tends to increase toward the end of the forecast
period.
)
ds_stats
# Extract statistics
steps = ds_stats["step"].values
mean = ds_stats["ensemble_mean"].mean(dim=["latitude", "longitude"])
median = ds_stats["ensemble_median"].mean(dim=["latitude", "longitude"])
std = ds_stats["ensemble_std"].mean(dim=["latitude", "longitude"])
min_ = ds_stats["ensemble_min"].mean(dim=["latitude", "longitude"])
max_ = ds_stats["ensemble_max"].mean(dim=["latitude", "longitude"])
p5 = ds_stats["ensemble_p5"].mean(dim=["latitude", "longitude"])
p25 = ds_stats["ensemble_p25"].mean(dim=["latitude", "longitude"])
p75 = ds_stats["ensemble_p75"].mean(dim=["latitude", "longitude"])
p95 = ds_stats["ensemble_p95"].mean(dim=["latitude", "longitude"])
# Plot
plt.figure(figsize=(10, 6))
plt.plot(steps, mean, label="Mean", color="black", linewidth=2)
plt.plot(steps, median, label="Median", color="orange", linestyle="--")
plt.fill_between(steps, min_, max_, color="skyblue", alpha=0.3, label="Min-Max Range")
plt.fill_between(
steps, mean - std, mean + std, color="salmon", alpha=0.4, label="Mean ± Std"
)
plt.plot(steps, p5, label="P5", linestyle=":", color="gray")
plt.plot(steps, p25, label="P25", linestyle="--", color="gray")
plt.plot(steps, p75, label="P75", linestyle="--", color="gray")
plt.plot(steps, p95, label="P95", linestyle=":", color="gray")
plt.title("Ensemble Statistics Over Time")
plt.xlabel("Forecast Month")
plt.ylabel("Index Value (averaged over space)")
plt.xticks(rotation=45)
plt.grid(True)
plt.legend()
plt.tight_layout()
plt.show()
You can then call the forecast.save_index_to_hazard method (used in the cells below) to convert the processed index
from the Copernicus forecast data into a hazard object.
• Hazard Object Creation: The method processes seasonal forecast data for specified years and months, converting
these into hazard objects. These objects encapsulate potential risks associated with specific weather events or
conditions, such as Maximum Temperature (‘Tmax’) indicated in the parameters, over the forecast period.
• Data Storage: The hazard data for each ensemble member of the forecast is saved as HDF5
files. These files are automatically stored in specific directories corresponding to each month
and type of hazard. The file paths are printed below the processing steps. For example,
"SYSTEM_DIR/copernicus_data/seasonal_forecasts/dwd/sys21/2023/init03/valid06_08/hazard/TX30/TX30_boundsW4_S44_E11_N48.hd
HDF5 is a versatile data model that efficiently stores large volumes of complex data.
Each file is specific to a particular month and hazard scenario (‘Tmax’ in this case) and covers all ensemble members for
that forecast period, aiding in detailed risk analysis.
• Completeness of Data Processing: Messages like ‘Completed processing for 2022-07. Data saved in…’ confirm
the successful processing and storage of the hazard data for that period, ensuring that all requested data has been
properly handled and stored.
forecast.save_index_to_hazard()
{'2022_init11_valid12_02': PosixPath('/Users/daraya/climada/data/copernicus_data/seasonal_forecasts/dwd/sys21/2022/init11/valid12_02/hazard/Tmax/Tmax_boundsN-59_S-35_E-52_W-29.hdf5')}
You can always inspect the properties of the Hazard object or visualize its contents. Note that the date attribute uses serial
date numbers (ordinal format), which is common in climate data. To convert these to standard datetime format, you can
use datetime.datetime.fromordinal.
# Load the hazard and plot intensity for the selected grid
forecast.save_index_to_hazard()
initiation_month_str = f"{month_name_to_number(initiation_month[0]):02d}"
forecast_month_str = f"{forecast.valid_period_str[-2:]}" # Last month in valid period
forecast_year = year_list[0]
path_to_hazard = forecast.get_pipeline_path(
forecast_year, initiation_month_str, "hazard"
)
hazard = Hazard.from_hdf5(path_to_hazard)
Hazard attributes:
- Shape of intensity (time, gridpoint): (150, 56)
- Centroids: (7, 8)
- Units: °C
- event_id: [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108
109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
145 146 147 148 149 150]
- frequency: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1.]
- min, max fraction: 0.0 0.0
- Date: [738490 738521 738552 738490 738521 738552 738490 738521 738552 738490
738521 738552 738490 738521 738552 738490 738521 738552 738490 738521
738552 738490 738521 738552 738490 738521 738552 738490 738521 738552
738490 738521 738552 738490 738521 738552 738490 738521 738552 738490
738521 738552 738490 738521 738552 738490 738521 738552 738490 738521
738552 738490 738521 738552 738490 738521 738552 738490 738521 738552
(None,
['member0',
'member0',
'member0',
'member1',
'member1',
'member1',
'member2',
'member2',
'member2',
'member3',
'member3',
'member3',
'member4',
'member4',
'member4',
'member5',
'member5',
'member5',
'member6',
'member6',
'member6',
'member7',
'member7',
'member7',
'member8',
'member8',
'member8',
'member9',
'member9',
'member9',
'member10',
'member10',
'member10',
'member11',
'member11',
'member11',
'member12',
'member12',
'member12',
forecast_month = int(forecast.valid_period_str[-2:])
forecast_year = (
initiation_year + 1
if int(initiation_month_str) > forecast_month
else initiation_year
)
path_to_hazard = forecast.get_pipeline_path(
initiation_year, initiation_month_str, "hazard"
)
haz = Hazard.from_hdf5(path_to_hazard)
if haz:
available_dates = sorted(set(haz.date))
readable_dates = [
dt.datetime.fromordinal(d).strftime("%Y-%m-%d") for d in available_dates
]
print("Available Dates Across Members:", readable_dates)
target_date = dt.datetime(
forecast_year, forecast_month, 1
).toordinal() # Look for the first day of the last forecast month
closest_date = min(available_dates, key=lambda x: abs(x - target_date))
closest_date_str = dt.datetime.fromordinal(closest_date).strftime("%Y-%m-%d")
else:
print("No hazard data found for the selected period.")
Now you have a Hazard object that you can use in your specific impact assessment. In addition, you also have access
to daily and monthly index estimates, along with ensemble statistics. Of course, the original hourly data of the climate
variables related to your index of interest is also available, including their daily statistics.
If you would like to explore more advanced examples, please visit the package repository. There, you will find additional
Python notebooks as well as links to plug-and-play Google Colab notebooks demonstrating the full capabilities of the
package.
Resources
• Copernicus Seasonal Forecast Tools package
• Copernicus Seasonal Forecast Tools documentation
• Copernicus Seasonal Forecast Tools demo
• Copernicus Seasonal Forecast Tools extended demonstration
Additional resources:
• U-CLIMADAPT Project
• Seasonal forecast daily and subdaily data on single levels
• Copernicus Climate Data Store
• CLIMADA Documentation
*) an Exposures object is valid without such a column, but it’s required for impact calculation
Apart from data the Exposures object has the following attributes and properties:
import numpy as np
from climada.entity import Exposures
latitude = [1, 2, 3] * 3
longitude = [4] * 3 + [5] * 3 + [6] * 3
exp_arr = Exposures(
    lat=latitude,  # list or array
    lon=longitude,  # instead of lat and lon one can provide an array of Points through the geometry argument
In case you are unfamiliar with the data structure, check out the pandas DataFrame documentation.
import numpy as np
from pandas import DataFrame
from climada.entity import Exposures
# Fill a pandas DataFrame with the 3 mandatory variables (latitude, longitude, value) for a number of assets (10'000).
# We will do this with random dummy data for purely illustrative reasons:
exp_df = DataFrame()
n_exp = 100 * 100
# provide value
exp_df["value"] = np.random.random_sample(n_exp)
# provide latitude and longitude
lat, lon = np.mgrid[
15 : 35 : complex(0, np.sqrt(n_exp)), 20 : 40 : complex(0, np.sqrt(n_exp))
]
exp_df["latitude"] = lat.flatten()
exp_df["longitude"] = lon.flatten()
# For each exposure entry, specify which impact function should be taken for which hazard type.
# In this case, we only specify the IDs for tropical cyclone (TC); here, each exposure entry will be treated with
# Generate Exposures from the pandas DataFrame. This step converts the DataFrame into
# a CLIMADA Exposures instance!
exp = Exposures(exp_df)
print(f"exp has the type: {type(exp)}")
print(f"and contains a GeoDataFrame exp.gdf: {type(exp.gdf)}\n")
description: None
ref_year: 2018
value_unit: USD
crs: EPSG:4326
data: (10000 entries)
value impf_TC geometry
0 0.533764 1 POINT (20.00000 15.00000)
1 0.995993 1 POINT (20.20202 15.00000)
2 0.603523 1 POINT (20.40404 15.00000)
3 0.754253 1 POINT (20.60606 15.00000)
9996 0.069044 1 POINT (39.39394 35.00000)
9997 0.116560 1 POINT (39.59596 35.00000)
9998 0.239856 1 POINT (39.79798 35.00000)
9999 0.099568 1 POINT (40.00000 35.00000)
In case you are unfamiliar with the data structure, check out the geopandas GeoDataFrame documentation. The main
difference to the example above (pandas DataFrame) is that, while previously we provided latitudes and longitudes which
were then converted to a geometry GeoSeries using the set_geometry_points method, GeoDataFrames already
come with a defined geometry GeoSeries. In this case, we take the geometry info and use the set_lat_lon method to
explicitly provide latitudes and longitudes. This example focuses on data with POINT geometry, but in principle, other
geometry types (such as POLYGON and MULTIPOLYGON) would work as well.
import numpy as np
import geopandas as gpd
from climada.entity import Exposures
# data source: https://www.naturalearthdata.com/downloads/110m-cultural-vectors/
world = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))
# Generate Exposures: value, latitude and longitude for each exposure entry.
world["value"] = np.arange(n_exp)
description: None
ref_year: 2018
value_unit: USD
crs: EPSG:4326
data: (243 entries)
name value geometry
0 Vatican City 0.876947 POINT (12.45339 41.90328)
1 San Marino 0.895454 POINT (12.44177 43.93610)
2 Vaduz 0.373366 POINT (9.51667 47.13372)
3 Lobamba 0.422729 POINT (31.20000 -26.46667)
239 São Paulo 0.913955 POINT (-46.62697 -23.55673)
240 Sydney 0.514479 POINT (151.21255 -33.87137)
241 Singapore 0.830635 POINT (103.85387 1.29498)
242 Hong Kong 0.764571 POINT (114.18306 22.30693)
# For each exposure entry, specify which impact function should be taken for which hazard type.
# In this case, we only specify the IDs for tropical cyclone (TC); here, each exposure entry will be treated with
description: None
ref_year: 2018
value_unit: USD
crs: EPSG:4326
data: (243 entries)
name value geometry impf_TC
0 Vatican City 0.876947 POINT (12.45339 41.90328) 1
1 San Marino 0.895454 POINT (12.44177 43.93610) 1
2 Vaduz 0.373366 POINT (9.51667 47.13372) 1
3 Lobamba 0.422729 POINT (31.20000 -26.46667) 1
239 São Paulo 0.913955 POINT (-46.62697 -23.55673) 1
240 Sydney 0.514479 POINT (151.21255 -33.87137) 1
241 Singapore 0.830635 POINT (103.85387 1.29498) 1
242 Hong Kong 0.764571 POINT (114.18306 22.30693) 1
The fact that Exposures is built around a geopandas.GeoDataFrame offers all the useful functionalities that come
with the package. The following examples showcase only a few of those.
)
sel_polygon.data
Geopandas can read almost any vector-based spatial data format including ESRI shapefile, GeoJSON files and more,
see readers geopandas. Pandas supports formats such as csv, html or sql; see readers pandas. Using the corresponding
readers, DataFrame and GeoDataFrame can be filled and provided to Exposures following the previous examples.
If you manually collect exposure data, Excel may be your preferred option. In this case, it is easiest if you format your data
according to the structure provided in the template climada_python/climada/data/system/entity_template.xlsx,
in the sheet assets.
import pandas as pd
from climada.util.constants import ENT_TEMPLATE_XLS
from climada.entity import Exposures
# Read your Excel file into a pandas DataFrame (we will use the template example for this demonstration):
file_name = ENT_TEMPLATE_XLS
exp_templ = pd.read_excel(file_name)
As we can see, the general structure is the same as always: the exposure has latitude, longitude and value columns.
Further, this example specified several impact function ids: some for Tropical Cyclones (impf_TC), and some for Floods
(impf_FL). It also provides some meta-info (region_id, category_id) and insurance info relevant to the impact
calculation in later steps (cover, deductible).
Last but not least, you may have your exposure data stored in a raster file. Raster data may be read in from any file-type
supported by rasterio.
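The cell creating exp_raster is not fully shown below; a minimal sketch, assuming the Exposures.from_raster constructor (window and geometry arguments can restrict the area that is read):
from climada.entity import Exposures
from climada.util.constants import HAZ_DEMO_FL

exp_raster = Exposures.from_raster(HAZ_DEMO_FL)  # read exposure values from the demo raster
exp_raster.check()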
from rasterio.windows import Window
from climada.util.constants import HAZ_DEMO_FL
from climada.entity import Exposures
# We take an example with a dummy raster file (HAZ_DEMO_FL); running the method set_from_raster directly loads the
# data. Further keyword arguments allow specifying a window, if not the entire file should be read, or a bounding box. Check them out.
exp_raster.derive_raster()
from climada.util.constants import EXP_DEMO_H5

exp_hdf5 = Exposures.from_hdf5(EXP_DEMO_H5)
print(exp_hdf5)
description: None
ref_year: 2016
value_unit: USD
crs: EPSG:4326
data: (50 entries)
value impf_TC deductible cover category_id region_id \
0 1.392750e+10 1 0.0 1.392750e+10 1 1.0
1 1.259606e+10 1 0.0 1.259606e+10 1 1.0
2 1.259606e+10 1 0.0 1.259606e+10 1 1.0
3 1.259606e+10 1 0.0 1.259606e+10 1 1.0
46 1.264524e+10 1 0.0 1.264524e+10 1 1.0
47 1.281438e+10 1 0.0 1.281438e+10 1 1.0
48 1.260291e+10 1 0.0 1.260291e+10 1 1.0
49 1.262482e+10 1 0.0 1.262482e+10 1 1.0
geometry
0 POINT (-80.12880 26.93390)
1 POINT (-80.09828 26.95720)
2 POINT (-80.74895 26.78385)
3 POINT (-80.55070 26.64552)
46 POINT (-80.11640 26.34907)
47 POINT (-80.08385 26.34635)
48 POINT (-80.24130 26.34802)
49 POINT (-80.15886 26.34796)
Visualize Exposures
The method plot_hexbin() uses cartopy and matplotlib’s hexbin function to represent the exposures values as 2d bins
over a map. Configure your plot by fixing the different inputs of the method or by modifying the returned matplotlib
figure and axes.
The method plot_scatter() uses cartopy and matplotlib's scatter function to represent the point values over a 2d
map. As usual, it returns the figure and axes, which can be modified afterwards.
The method plot_raster() rasterizes the points into the given resolution. Use the save_tiff option to save the
resulting tiff file and the res_raster option to re-set the raster's resolution.
Finally, the method plot_basemap() plots the scatter points over a satellite image using contextily library.
# further keyword arguments to play around with: pop_name, buffer, gridsize, ...
Plotting exp_df.
exp_gpd.to_crs("epsg:3035", inplace=True)
exp_gpd.plot_scatter(pop_name=False);
ax = exp.plot_raster()
# plot with same resolution as data
add_cntry_names(
ax,
[
exp.gdf["longitude"].min(),
exp.gdf["longitude"].max(),
exp.gdf["latitude"].min(),
exp.gdf["latitude"].max(),
],
)
Since Exposures is a GeoDataFrame, any function for visualization from geopandas can be used. Check making maps
and examples gallery.
import fiona
fiona.supported_drivers
from climada import CONFIG
results = CONFIG.local_data.save_dir.dir()
# DataFrame saved to csv format. Geometry is written as a string, metadata is not saved!
exp_templ.gdf.to_csv(results.joinpath("exp_templ.csv"), sep="\t")
Optionally use CLIMADA's save option to save it in pickle format. This allows you to quickly restore the object in its current
state and take up your work right where you left it the next time. Note, however, that pickle is a transient format and is
not suitable for storing data persistently.
Background
The modeling of economic disaster risk on a global scale requires high-resolution maps of exposed asset values. We have
developed a generic and scalable method to downscale national asset value estimates proportional to a combination of
nightlight intensity (“Lit”) and population data (“Pop”).
Asset exposure value is disaggregated to the grid points proportionally to Lit^m Pop^n, computed at each grid cell as
Lit^m Pop^n = Lit^m × Pop^n, with exponents (m, n) ∈ ℝ⁺ (default values are m = n = 1).
For more information please refer to the related publication (https://doi.org/10.5194/essd-12-817-2020) and data archive
(https://doi.org/10.3929/ethz-b-000331316).
How to cite: Eberenz, S., Stocker, D., Röösli, T., and Bresch, D. N.: Asset exposure data for global physical risk assessment,
Earth Syst. Sci. Data, 12, 817–833, https://doi.org/10.5194/essd-12-817-2020, 2020.
Input data
Note: All required data except for the population data from Gridded Population of the World (GPW) is downloaded
automatically when a LitPop.from_* method is called.
Warning: Processing the data for the first time can take up huge amounts of RAM (>10 GB), depending on country or
region size. Consider using the wrapper function of the data API to download readily computed LitPop exposure data for
default values (n = m = 1) on demand.
Nightlight intensity
Black Marble annual composite of the VIIRS day-night band (Grayscale) at 15 arcsec resolution is downloaded from
the NASA Earth Observatory: https://earthobservatory.nasa.gov/Features/NightLights (available for 2012 and 2016 at
15 arcsec resolution (~500m)). The first time a nightlight image is used, it is downloaded and stored locally. This might
take some time.
Population count
Gridded Population of the World (GPW), v4: Population Count, v4.10, v4.11 or later versions (2000, 2005, 2010, 2015,
2020), available from http://sedac.ciesin.columbia.edu/data/collection/gpw-v4/sets/browse.
The GPW file of the year closest to the requested year (reference_year) is required. To download GPW data a (free)
login for the NASA SEDAC website is required.
Direct download links are available, also for older versions, e.g.:
• v4.11: http://sedac.ciesin.columbia.edu/downloads/data/gpw-v4/gpw-v4-population-count-rev11/gpw-v4-
population-count-rev11_2015_30_sec_tif.zip
• v4.10: http://sedac.ciesin.columbia.edu/downloads/data/gpw-v4/gpw-v4-population-count-rev10/gpw-v4-
population-count-rev10_2015_30_sec_tif.zip,
The easiest way to download existing data is using the wrapper function of the data API.
Readily computed LitPop asset exposure data based on $Lit^1 Pop^1$ for 224 countries, distributing produced capital /
non-financial wealth of 2014 at a resolution of 30 arcsec can also be downloaded from the ETH Research Repository:
https://doi.org/10.3929/ethz-b-000331316. The dataset contains gridded data for more than 200 countries as CSV files.
Attributes
The LitPop class inherits from Exposures. It adds the following attributes:
exponents : Defining powers (m, n) with which nightlights and population go into Lit**m * Pop**n.
fin_mode : Socio-economic indicator to be used as total asset value for disaggregation.
Key Methods
• from_countries: set exposure for one or more countries, see section from_countries below.
• from_nightlight_intensity: wrapper around from_countries and from_shape to load nightlight data
to exposure.
• from_population: wrapper around from_countries and from_shape_population to load pure popula-
tion data to exposure. This can be used to initiate a population exposure set.
• from_shape_and_countries: given a shape and a list of countries, exposure is initiated for the countries and
then cropped to the shape. See section Set custom shapes below.
• from_shape: given any shape or geometry and an estimate of total values, exposure is initiated for the shape
directly. See section Set custom shapes below.
from_countries
In the following, we will create exposure data sets and plots for a variety of countries, comparing different settings.
Default Settings
By default, the exposure entity is initiated using the default parameters, i.e. a resolution of 30 arcsec, produced capital
‘pc’ as total asset value, and the exponents (1, 1).
# Initiate a default LitPop exposure entity for Switzerland and Liechtenstein (ISO3-Codes 'CHE' and 'LIE'):
try:
    exp = LitPop.from_countries(
        ["CHE", "Liechtenstein"]
    )  # you can provide either single countries or a list of countries
except FileExistsError as err:
    print(
        "Reason for error: The GPW population data has not been downloaded, c.f. section 'Input data' above."
    )
    raise err
exp.plot_scatter()
Instead of produced capital, we can also downscale other available macroeconomic indicators as estimates of asset value.
The indicator can be set via the parameter fin_mode, either to ‘pc’, ‘pop’, ‘gdp’, ‘income_group’, ‘nfw’, ‘tw’, ‘norm’, or
‘none’. See the descriptions of each alternative above in the introduction.
We can also change the resolution via res_arcsec and the exponents.
The default resolution is 30 arcsec ≈ 1 km. A resolution of 3600 arcsec = 1 degree corresponds to roughly 110 km close
to the equator.
from_population
Let’s initiate an exposure instance with the financial mode “income_group” and at a resolution of 120 arcsec (roughly 4
km).
# Initiate a LitPop exposure entity for Costa Rica with varied resolution, fin_mode, and exponents:
exp = LitPop.from_countries(
    "Costa Rica", fin_mode="income_group", res_arcsec=120, exponents=(1, 1)
)  # change the parameters and see what happens...
# exp = LitPop.from_countries('Costa Rica', fin_mode='gdp', res_arcsec=90, exponents=(3,0))  # example of variation
exp.plot_raster()
# note the log scale of the colorbar
exp.plot_scatter();
Reference year
Additionally, we can change the year our exposure is supposed to represent. For this, nightlight and population data
are used that are closest to the requested years. Macroeconomic indicators like produced capital are interpolated from
available data or scaled proportional to GDP.
Let’s load a population exposure map for Switzerland in 2000 and 2021 with a resolution of 300 arcsec:
# You may want to check if you have downloaded dataset Gridded Population of the World (GPW), v4: Population Count, v4.11
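The cell creating the two exposures is a sketch here, assuming the from_population parameters countries, res_arcsec, and reference_year used elsewhere in this tutorial:
pop_2000 = LitPop.from_population(countries=["CHE"], res_arcsec=300, reference_year=2000)
pop_2021 = LitPop.from_population(countries=["CHE"], res_arcsec=300, reference_year=2021)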
pop_2000.plot_scatter()
pop_2021.plot_scatter()
"""Note the difference in total values on the color bar."""
These wrapper methods can be used to produce exposures that show pure nightlight intensity or pure population count.
res = 30  # If you don't get an output after a very long time with country = "MEX", try with res = 100
country = "JAM" # Try different countries, i.e. 'JAM', 'CHE', 'RWA', 'MEX'
markersize = 4 # for plotting
buffer_deg = 0.04
exp_nightlights = LitPop.from_nightlight_intensity(
countries=country, res_arcsec=res
) # nightlight intensity
exp_nightlights.plot_hexbin(linewidth=markersize, buffer=buffer_deg)
# Compare to the population map:
exp_population = LitPop.from_population(countries=country, res_arcsec=res)
exp_population.plot_hexbin(linewidth=markersize, buffer=buffer_deg)
# Compare to default LitPop exposures:
exp = LitPop.from_countries(countries=country, res_arcsec=res)
exp.plot_hexbin(linewidth=markersize, buffer=buffer_deg);
For Switzerland, population is resolved on the 3rd administrative level, with 2538 distinct geographical units. Therefore,
the purely population-based map is highly resolved.
For Jamaica, population is only resolved on the 1st administrative level, with only 14 distinct geographical units. There-
fore, the purely population-based map shows large monotonous patches. The combination of Lit and Pop results in a
concentration of asset value estimates around the capital city Kingston.
A population exposure for a custom shape can be initiated directly via from_population without providing total_value.
import time
import climada.util.coordinates as u_coord
import climada.entity.exposures.litpop as lp
country_iso3a = "USA"
state_name = "Florida"
resolution_arcsec = 600
"""First, we need to get the shape of Florida:"""
admin1_info, admin1_shapes = u_coord.get_admin1_info(country_iso3a)
admin1_info = admin1_info[country_iso3a]
admin1_shapes = admin1_shapes[country_iso3a]
admin1_names = [record["name"] for record in admin1_info]
print(admin1_names)
for idx, name in enumerate(admin1_names):
if admin1_names[idx] == state_name:
break
print("Florida index: " + str(idx))
Florida index: 20
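The call that creates the Florida exposure is a sketch here, assuming the from_shape_and_countries method introduced above (argument order and keyword names are assumptions):
start = time.process_time()
exp = LitPop.from_shape_and_countries(
    admin1_shapes[idx], country_iso3a, res_arcsec=resolution_arcsec, fin_mode="pc"
)
print(f"\n Runtime `from_shape_and_countries` : {time.process_time() - start:1.2f} sec.\n")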
exp.plot_scatter(vmin=100, buffer=0.5)
"""Note the differences in computational speed and total value between the two␣
,→approaches"""
'Note the differences in computational speed and total value between the two␣
,→approaches'
import time
from shapely.geometry import Polygon
"""initiate LitPop exposures for a geographical box around the city of Zurich:"""
bounds = (8.41, 47.25, 8.70, 47.47)  # (min_lon, min_lat, max_lon, max_lat)
total_value = 1000  # required user input for `from_shape`, here we just assume USD 1000 of total value
shape = Polygon(
[
(bounds[0], bounds[3]),
(bounds[2], bounds[3]),
(bounds[2], bounds[1]),
(bounds[0], bounds[1]),
]
)
import time
start = time.process_time()
exp = LitPop.from_shape(shape, total_value)
print(f"\n Runtime `from_shape` : {time.process_time() - start:1.2f} sec.\n")
exp.plot_scatter()
"""Note the difference in total value between the two exposure sets!"""
"""For comparison, initiate population exposure for a geographical box around the␣
,→city of Zurich:"""
start = time.process_time()
exp_pop = LitPop.from_population(shape=shape)
print(f"\n Runtime `from_population` : {time.process_time() - start:1.2f} sec.\n")
exp_pop.plot_scatter()
"""Population exposure for a custom shape can be initiated directly via `set_
,→population` without providing `total_value`"""
'Population exposure for a custom shape can be initiated directly via `set_
,→population` without providing `total_value`'
How To:
The intermediate downscaling layer can be activated with the parameter admin1_calc=True.
ent_adm0 = LitPop.from_countries(
"CHE", res_arcsec=120, fin_mode="gdp", admin1_calc=False
)
ent_adm1 = LitPop.from_countries(
"CHE", res_arcsec=120, fin_mode="gdp", admin1_calc=True
)
ent_adm1.check()
print("Done.")
Done.
# Plotting:
from matplotlib import colors
Quick example
Get example polygons (provinces), lines (rails), points exposure for the Netherlands, and create one single Exposures.
Get demo winter storm hazard and a corresponding impact function.
IMPF = ImpfStormEurope.from_welker()
IMPF_SET = ImpactFuncSet([IMPF])
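The demo hazard HAZ used below is assumed to be the winter storm demo footprint, and EXP_MIX the concatenation (via Exposures.concat) of the polygon, line, and point exposures introduced in the following sections; a sketch of the hazard setup:
from climada.hazard import StormEurope
from climada.util.constants import WS_DEMO_NC

HAZ = StormEurope.from_footprints(WS_DEMO_NC)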
# disaggregate in the same CRS as the exposures are defined (here degrees), resolution 0.2 degrees
impact = u_lp.calc_geom_impact(
exp=EXP_MIX,
impf_set=IMPF_SET,
haz=HAZ,
res=0.2,
to_meters=False,
disagg_met=u_lp.DisaggMethod.DIV,
disagg_val=None,
agg_met=u_lp.AggMethod.SUM,
)
/Users/ckropf/Documents/Climada/climada_python/climada/util/lines_polys_handler.py:931: UserWarning: Geometry is in a geographic CRS. Results from 'length' are likely incorrect.
line_lengths = gdf_lines.length
u_lp.plot_eai_exp_geom(impact);
# disaggregate in meters
# aggregate by summing
impact = u_lp.calc_geom_impact(
exp=EXP_MIX,
impf_set=IMPF_SET,
haz=HAZ,
res=1000,
to_meters=True,
disagg_met=u_lp.DisaggMethod.FIX,
disagg_val=1.0,
agg_met=u_lp.AggMethod.SUM,
);
ax = u_lp.plot_eai_exp_geom(
impact, legend_kwds={"label": "percentage", "orientation": "horizontal"}
)
Polygons
Polygons or shapes are a common geographical representation of countries, states etc. as for example in NaturalEarth.
Map data, as for example buildings, etc. obtained from openstreetmap (see tutorial here), also frequently come as (multi-
)polygons. Here we want to show you how to deal with exposure information as polygons.
Load data
Let’s assume we are given the following data: the polygons of the admin-1 regions of the Netherlands and an exposure
value for each, which we gather in a GeoDataFrame. We want to know the impact of Lothar on each admin-1 region.
In this tutorial, we shall see how to compute impacts for exposures defined on shapely geometries (polygons and/or lines).
The basic principle is to disaggregate the geometries to a raster of points, compute the impact per points, and then re-
aggregate. To do so, several methods are available. Here is a brief overview.
# Imports
import geopandas as gpd
import pandas as pd
from pathlib import Path
def gdf_poly():
from cartopy.io import shapereader
from climada_petals.entity.exposures.black_marble import country_iso_geom
# assign a value to each admin-1 area (assumption 100'000 USD per inhabitant)
population_prov_NL = {
"Drenthe": 493449,
"Flevoland": 422202,
"Friesland": 649988,
"Gelderland": 2084478,
"Groningen": 585881,
"Limburg": 1118223,
"Noord-Brabant": 2562566,
"Noord-Holland": 2877909,
"Overijssel": 1162215,
"Zuid-Holland": 3705625,
"Utrecht": 1353596,
"Zeeland": 383689,
}
value_prov_NL = {
n: 100000 * population_prov_NL[n] for n in population_prov_NL.keys()
}
exp_nl_poly = Exposures(gdf_poly())
exp_nl_poly.gdf["impf_WS"] = 1
exp_nl_poly.gdf.head()
population value \
Drenthe 493449 49344900000
Flevoland 422202 42220200000
Friesland 649988 64998800000
Gelderland 2084478 208447800000
Groningen 585881 58588100000
geometry impf_WS
Drenthe POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
Flevoland POLYGON ((5.74046 52.83874, 5.75012 52.83507, ... 1
# take a look
exp_nl_poly.gdf.plot("value", legend=True, cmap="OrRd")
<AxesSubplot:>
# define hazard
storms = StormEurope.from_footprints(WS_DEMO_NC)
# define impact function
impf = ImpfStormEurope.from_welker()
impf_set = ImpactFuncSet([impf])
/Users/ckropf/Documents/Climada/climada_python/climada/hazard/centroids/centr.py:822: UserWarning: Geometry is in a geographic CRS. Results from 'buffer' are likely incorrect.
All in one: The main method calc_geom_impact provides several disaggregation keywords, specifying
• the target resolution (res),
• the method on how to distribute the values of the original geometries onto the newly generated interpolated points
(disagg_met)
• the source (and number) of the value to be distributed (disagg_val).
• the aggregation method (agg_met)
disagg_met can be either fixed (FIX), replicating the original shape’s value onto all points, or divided evenly (DIV), in
which case the value is divided equally onto all new points. disagg_val can either be taken directly from the exposure
gdf’s value column (None) or be indicated here explicitly (float). The resolution can be given in the gdf’s original format (mostly
degrees lat/lon) or in metres. agg_met can currently only be (SUM), where the value is summed over all points in
the geometry.
Polygons can also be disaggregated on a given fixed grid, see example below
Example 1: Target resolution in degrees lat/lon, equal (average) distribution of values from exposure gdf among points.
imp_deg = u_lp.calc_geom_impact(
exp=exp_nl_poly,
impf_set=impf_set,
haz=storms,
res=0.005,
disagg_met=u_lp.DisaggMethod.DIV,
disagg_val=None,
agg_met=u_lp.AggMethod.SUM,
)
u_lp.plot_eai_exp_geom(imp_deg);
Example 2: Target resolution in metres, equal (divide) distribution of values from exposure gdf among points.
imp_m = u_lp.calc_geom_impact(
exp=exp_nl_poly,
impf_set=impf_set,
haz=storms,
res=500,
to_meters=True,
disagg_met=u_lp.DisaggMethod.DIV,
disagg_val=None,
agg_met=u_lp.AggMethod.SUM,
)
u_lp.plot_eai_exp_geom(imp_m);
For this specific case, both disaggregation methods provide a relatively similar result, given the chosen numbers:
Example 3: Target predefined grid, equal (divide) distribution of values from exposure gdf among points.
res = 0.1
(_, _, xmax, ymax) = exp_nl_poly.gdf.geometry.bounds.max()
(xmin, ymin, _, _) = exp_nl_poly.gdf.geometry.bounds.min()
bounds = (xmin, ymin, xmax, ymax)
height, width, trafo = u_coord.pts_to_raster_meta(bounds, (res, res))
x_grid, y_grid = u_coord.raster_to_meshgrid(trafo, width, height)
imp_g = u_lp.calc_grid_impact(
exp=exp_nl_poly,
impf_set=impf_set,
haz=storms,
grid=(x_grid, y_grid),
disagg_met=u_lp.DisaggMethod.DIV,
disagg_val=None,
agg_met=u_lp.AggMethod.SUM,
)
u_lp.plot_eai_exp_geom(imp_g);
Step 1: Disaggregate polygon exposures to points. It is useful to do this separately when the discretized exposure is used
several times, for example to compute impacts with different hazards.
Several disaggregation methods can be used as shown below:
# Disaggregate exposure to 10'000 metre grid, each point gets average value within polygon.
exp_pnt = u_lp.exp_geom_to_pnt(
exp_nl_poly,
res=10000,
to_meters=True,
disagg_met=u_lp.DisaggMethod.DIV,
disagg_val=None,
)
exp_pnt.gdf.head()
population value \
Drenthe 0 493449 2.056038e+09
1 493449 2.056038e+09
2 493449 2.056038e+09
3 493449 2.056038e+09
4 493449 2.056038e+09
geometry_orig impf_WS \
Drenthe 0 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
1 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
2 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
3 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
4 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
exp_pnt2 = u_lp.exp_geom_to_pnt(
exp_nl_poly,
res=0.1,
to_meters=False,
disagg_met=u_lp.DisaggMethod.FIX,
disagg_val=None,
)
exp_pnt2.gdf.head()
population value \
Drenthe 0 493449 49344900000
1 493449 49344900000
2 493449 49344900000
3 493449 49344900000
4 493449 49344900000
geometry_orig impf_WS \
Drenthe 0 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
1 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
2 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
3 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
4 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
# Disaggregate exposure to 1'000 metre grid, each point gets value corresponding to
# its representative area (1'000^2).
exp_pnt3 = u_lp.exp_geom_to_pnt(
exp_nl_poly,
res=1000,
to_meters=True,
disagg_met=u_lp.DisaggMethod.FIX,
disagg_val=10e6,
)
exp_pnt3.gdf.head()
population value \
Drenthe 0 493449 10000000.0
1 493449 10000000.0
2 493449 10000000.0
3 493449 10000000.0
4 493449 10000000.0
geometry_orig impf_WS \
Drenthe 0 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
1 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
2 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
3 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
4 POLYGON ((7.07215 52.84132, 7.06198 52.82401, ... 1
# Disaggregate exposure to 1'000 metre grid, with a total value of 1 per polygon (disagg_val=1, DIV).
# After disaggregation, each point has a value equal to the percentage of the area of the polygon it represents.
exp_pnt4 = u_lp.exp_geom_to_pnt(
exp_nl_poly,
res=1000,
to_meters=True,
disagg_met=u_lp.DisaggMethod.DIV,
disagg_val=1,
)
exp_pnt4.gdf.tail()
population value \
Zeeland 1897 383689 0.000526
1898 383689 0.000526
1899 383689 0.000526
1900 383689 0.000526
1901 383689 0.000526
geometry_orig impf_WS \
Zeeland 1897 MULTIPOLYGON (((3.45388 51.23563, 3.42194 51.2... 1
1898 MULTIPOLYGON (((3.45388 51.23563, 3.42194 51.2... 1
1899 MULTIPOLYGON (((3.45388 51.23563, 3.42194 51.2... 1
1900 MULTIPOLYGON (((3.45388 51.23563, 3.42194 51.2... 1
1901 MULTIPOLYGON (((3.45388 51.23563, 3.42194 51.2... 1
res = 0.1
(_, _, xmax, ymax) = exp_nl_poly.gdf.geometry.bounds.max()
(xmin, ymin, _, _) = exp_nl_poly.gdf.geometry.bounds.min()
bounds = (xmin, ymin, xmax, ymax)
population value \
Zeeland 17 383689 0.045455
18 383689 0.045455
19 383689 0.045455
20 383689 0.045455
21 383689 0.045455
geometry_orig impf_WS \
Zeeland 17 MULTIPOLYGON (((3.45388 51.23563, 3.42194 51.2... 1
18 MULTIPOLYGON (((3.45388 51.23563, 3.42194 51.2... 1
19 MULTIPOLYGON (((3.45388 51.23563, 3.42194 51.2... 1
20 MULTIPOLYGON (((3.45388 51.23563, 3.42194 51.2... 1
21 MULTIPOLYGON (((3.45388 51.23563, 3.42194 51.2... 1
# Point-impact
imp_pnt = ImpactCalc(exp_pnt3, impf_set, hazard=storms).impact(save_mat=True)
# Aggregated impact (Note that you need to pass the gdf and not the exposures)
imp_geom = u_lp.impact_pnt_agg(imp_pnt, exp_pnt3.gdf, agg_met=u_lp.AggMethod.SUM)
imp_geom.eai_exp
imp_geom.at_event
array([4321211.03400214, 219950.4291506 ])
imp_geom.aai_agg
412832.8602866131
Lines
Lines are a common geographical representation of transport infrastructure like streets, train tracks, or power lines. Here
we will play it through for the case of winter storm Lothar’s impact on the Dutch railway system:
Loading Data
Note: Hazard and impact functions data have been loaded above.
def gdf_lines():
gdf_lines = gpd.read_file(Path(DEMO_DIR, "nl_rails.gpkg"))
gdf_lines = gdf_lines.to_crs(epsg=4326)
return gdf_lines
exp_nl_lines = Exposures(gdf_lines())
exp_nl_lines.gdf["impf_WS"] = 1
exp_nl_lines.gdf["value"] = 1
exp_nl_lines.gdf.head()
value
0 1
exp_nl_lines.gdf.plot("value", cmap="inferno");
Example 1: Disaggregate values evenly among road segments; split in points with 0.005 degree distances.
imp_deg = u_lp.calc_geom_impact(
exp=exp_nl_lines,
impf_set=impf_set,
haz=storms,
res=0.005,
disagg_met=u_lp.DisaggMethod.DIV,
disagg_val=None,
agg_met=u_lp.AggMethod.SUM,
)
/Users/ckropf/Documents/Climada/climada_python/climada/util/lines_polys_handler.py:931: UserWarning: Geometry is in a geographic CRS. Results from 'length' are likely incorrect.
line_lengths = gdf_lines.length
u_lp.plot_eai_exp_geom(imp_deg);
Example 2: Disaggregate values evenly among road segments; split in points with 500 m distances.
imp_m = u_lp.calc_geom_impact(
exp=exp_nl_lines,
impf_set=impf_set,
haz=storms,
res=500,
to_meters=True,
disagg_met=u_lp.DisaggMethod.DIV,
disagg_val=None,
agg_met=u_lp.AggMethod.SUM,
)
u_lp.plot_eai_exp_geom(imp_m);
import numpy as np
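The comparison that produced the output below is a sketch here, using the two impacts computed above:
# largest relative difference of the expected annual impact per line between the two resolutions
rel_diff = np.max(np.abs(imp_deg.eai_exp - imp_m.eai_exp) / np.abs(imp_deg.eai_exp))
print(f"The largest relative difference between degrees and meters impact in this example is {rel_diff}")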
The largest relative difference between degrees and meters impact in this example is 0.09803913811822067
Step 1: As in the polygon example above, there are several methods to disaggregate line exposures into point exposures,
a few of which are shown here:
/Users/ckropf/Documents/Climada/climada_python/climada/util/lines_polys_handler.py:931: UserWarning: Geometry is in a geographic CRS. Results from 'length' are likely incorrect.
line_lengths = gdf_lines.length
exp_pnt4 = u_lp.exp_geom_to_pnt(
exp_nl_lines,
res=1000,
to_meters=True,
disagg_met=u_lp.DisaggMethod.FIX,
disagg_val=1000,
)
exp_pnt4.gdf.head()
Step 2 & 3: The procedure is analogous to the example provided above for polygons.
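A hedged sketch of those two steps for the line exposures (the variable names imp_pnt_lines and imp_geom_lines are introduced here for illustration):
# Step 2: compute the impact per disaggregated point
imp_pnt_lines = ImpactCalc(exp_pnt4, impf_set, storms).impact(save_mat=True)

# Step 3: aggregate the point impacts back to the original line geometries
imp_geom_lines = u_lp.impact_pnt_agg(imp_pnt_lines, exp_pnt4.gdf, u_lp.AggMethod.SUM)
u_lp.plot_eai_exp_geom(imp_geom_lines);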
Quick example
Here we provide a quick example of an impact calculation with CLIMADA and OpenStreetMap (OSM) data. We use
in this example main roads in Honduras as exposures, and historical tropical cyclones as hazard. We load the OSM data
using osm-flex and disaggregate the exposures, compute the damages, and reaggregate the exposures to their original
shape using the function calc_geom_impact from the util module lines_polys_handler. For more details on the
lines_polys_handler module, please refer to the documentation.
import logging
import warnings

import osm_flex
import osm_flex.extract

from climada import CONFIG
from climada.util.config import LOGGER

warnings.filterwarnings("ignore")
LOGGER.setLevel(logging.ERROR)
osm_flex.enable_logs()
The first step is to download a raw osm.pbf file (“data dump”) for Honduras from geofabrik.de and extract the layer of
interest (here roads). See the set-up CLIMADA exposures from OpenStreetMap section for more details.
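The download itself is a sketch here, assuming the osm-flex download helper get_country_geofabrik:
import osm_flex.download

# hypothetical: download the Honduras ('HND') data dump from geofabrik.de and keep its local path
path_ctr_dump = osm_flex.download.get_country_geofabrik("HND")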
# lets extract all roads from the Honduras file, via the wrapper
gdf_roads = osm_flex.extract.extract_cis(path_ctr_dump, "road")
# set crs
gdf_roads = gdf_roads.to_crs(epsg=4326)
Next, we set up the exposure, and select our hazard and vulnerability.
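A sketch of a possible hazard choice (historical tropical cyclone wind fields for Honduras from the CLIMADA Data API; the property filter is an assumption and may need to be narrowed further):
from climada.util.api_client import Client

client = Client()
haz = client.get_hazard("tropical_cyclone", properties={"country_iso3alpha": "HND"})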
# exposures
exp_line = Exposures(gdf_roads)
# impact function
impf_line = ImpfTropCyclone.from_emanuel_usa()
impf_set = ImpactFuncSet([impf_line])
Finally, we use the wrapper function calc_geom_impact to compute the impacts in one line of code. As a reminder,
calc_geom_impact covers the 3 steps of shapes-to-points disaggregation, impact calculation, and reaggregation
to the original shapes. calc_geom_impact requires the user to specify a target resolution for the disaggregation (res),
as well as how to assign a value to the disaggregated exposure (disagg_met and disagg_val). Here, we arbitrarily
decide to give a fixed value of 100k USD to each road segment of 500m, but note that other options are possible.
# disaggregate in metres (to_meters=True), resolution 500m
impact = u_lp.calc_geom_impact(
exp=exp_line,
impf_set=impf_set,
haz=haz,
res=500,
to_meters=True,
disagg_met=u_lp.DisaggMethod.FIX,
disagg_val=1e5,
agg_met=u_lp.AggMethod.SUM,
);
osm-flex
osm-flex is a python package which allows flexible extraction of data from OpenStreetMap. See osm-flex and the associated
publication for more information: Mühlhofer, Kropf, Riedel, Bresch and Koks: OpenStreetMap for Multi-Faceted Climate
Risk Assessments. Environ. Res. Commun. 6 015005, doi: 10.1088/2515-7620/ad15ab. Obtaining a CLIMADA exposures
object from OpenStreetMap using osm-flex consists of the following steps:
1. Download a raw osm.pbf file (“data dump”) for a specific country or region from geofabrik.de
2. Extract the features of interest (e.g. a road network) as a geodataframe
3. Pre-process; apply pre-processing steps such as clipping, simplifying, or reprojecting the retrieved layer.
4. Cast the geodataframe into a CLIMADA Exposures object.
5. Disaggregate complex-shape exposures into points for the impact calculation.
Once those 5 steps are completed, one can proceed with the impact calculation. For more details on how to use lines and
polygons as exposures within CLIMADA, please refer to the documentation.
In the following, we illustrate how to obtain different exposures types such as forests or healthcare facilities, and how to
use them within CLIMADA as points, lines, and polygons exposures. We also briefly illustrate the use of the simplify
module available within the osm-flex package.
First, we need to select a specific country and download its data from geofabrik.de. It is possible to download data from
specific countries using iso3 codes or from regions directly.
We next extract the exposures data of interest from OSM using the extract() method which allows us to query any
tags available on OpenStreetMap. Two variables have to be specified: osm_keys, a list with all the columns to report in
the GeoDataFrame, and osm_query, a string of key-value constraints to apply during the search. We illustrate its use by
querying the download of forests for Honduras.
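The extraction call itself is a sketch here; the osm_keys and osm_query values are illustrative and the osm_flex.extract.extract argument order (path, geometry type, keys, query) is an assumption:
# extract forest polygons from the Honduras data dump
gdf_forest = osm_flex.extract.extract(
    path_ctr_dump, "multipolygons", ["landuse", "name"], "landuse='forest'"
)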
# set crs
gdf_forest = gdf_forest.to_crs(epsg=4326)
# Plot results
ax = gdf_forest.plot(
figsize=(15, 15),
alpha=1,
markersize=5,
color="blue",
edgecolor="blue",
label="forests HND",
)
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=handles, loc="upper left")
ax.set_title("Forests Honduras", fontsize=25)
plt.show()
Alternatively, we can use the extract_cis method to download specific types of critical infrastructures available on
OSM.
# lets extract all healthcares from the Honduras file, via the wrapper
gdf_hc = osm_flex.extract.extract_cis(path_ctr_dump, "healthcare")
# set crs
gdf_hc = gdf_hc.to_crs(epsg=4326)
# plot results
ax = gdf_hc.plot(
figsize=(15, 15),
alpha=1,
markersize=5,
color="blue",
edgecolor="blue",
label="healthcares HND",
)
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=handles, loc="upper left")
ax.set_title("Healthcare facilities Honduras", fontsize=25)
plt.show()
It can be necessary to apply some preprocessing steps before using the retrieved OSM data as CLIMADA exposures. In
particular, the following two pre-processing tasks are available as modules within the osm-flex package:
1. Clipping; allows to clip the country data to a user-determined region.
2. Simplify; in some cases, simplifying the retrieved data is necessary to remove redundant or erroneous geometries.
In the following, we show how to simplify some data retrieved from OpenStreetMap using the osm_flex.simplify
module. For more details on clipping or on other features available within osm_flex, please refer to its documentation.
# here we illustrate how to simplify the polygon-based forest layer by removing small polygons
import osm_flex.simplify as sy

min_area = 100  # remove all areas < 100m2 (always in units of respective CRS)
gdf_forest = sy.remove_small_polygons(gdf_forest, min_area)
print(f"Number of results after removal of small polygons: {len(gdf_forest)}")
The last step consists of transforming the exposure data obtained from OSM into CLIMADA-readable objects. This is
simply done using the CLIMADA Exposures class.
gdf_forest = gdf_forest.to_crs(
epsg=4326
) # !ensure that all exposures are in the same CRS!
exp_poly = Exposures(gdf_forest)
Additionally, multiple exposures of different types can be combined within a single CLIMADA Exposures object using
concat.
exp_points = Exposures(gdf_hc)
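A sketch of the concatenation, assuming the polygon, line, and point exposures created above (exp_poly, exp_line, exp_points):
exp_mix = Exposures.concat([exp_poly, exp_line, exp_points])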
exp_mix.plot()
The last step before proceeding to the usual impact calculation consists of transforming all the exposure data that is in
a format other than points (e.g. lines, polygons) into point data (disaggregation) and assigning values to it. Those two
tasks can be done simultaneously using the util function exp_geom_to_pnt. Disaggregating and assigning values to the
disaggregated exposures requires the following:
1. Specify a resolution for the disaggregation (res).
2. Specify a value to be disaggregated (disagg_val).
3. Specify how to distribute the value to the disaggregated points (disagg_met).
In the following, we illustrate how to disaggregate our mixed-types exposures to a 10km-resolution, arbitrarily assigning a
fixed value of 500k USD to each point. For more details on how to use lines and polygons as exposures within CLIMADA,
please refer to the documentation.
exp_mix_pnt = u_lp.exp_geom_to_pnt(
exp_mix,
res=10000,
to_meters=True,
disagg_met=u_lp.DisaggMethod.FIX,
disagg_val=5e5,
)
exp_mix_pnt.plot()
What is an Impact?
The impact is the combined effect of hazard events on a set of exposures mediated by a set of impact functions. By
computing the impact for each event (historical and synthetic) and for each exposure value at each geographical location,
the Impact class provides different risk measures, such as the expected annual impact per exposure, the probable maximum
impact for different return periods, and the total average annual impact.
All other methods compute values from the attributes set by ImpactCalc.impact(). For example, one can compute
the frequency exceedance curve, plot impact data, or compute traditional risk transfer over impact.
In CLIMADA, impacts are computed using the Impact class. The computation of the impact requires an Exposure ,
an ImpactFuncSet, and a Hazard object. For details about how to define Exposures , Hazard, Impact Functions see
the respective tutorials.
The steps of an impact calculation are typically:
• Set exposure
• Set hazard and hazard centroids
• Set impact functions in impact function set
• Compute impact
• Visualize, save, use impact output
Hints: Before computing the impact of a given Exposure and Hazard, it is important to correctly match the Exposures
coordinates with the Hazard Centroids. Try to have similar resolutions in Exposures and Hazard. During the impact
calculation, the nearest neighbor among the Hazard's Centroids is searched for each Exposure.
Hint: Set first the Exposures and use its coordinates information to set a matching Hazard.
Hint: The configuration value max_matrix_size controls the maximum matrix size contained in a chunk. By default
it is set to 1e9 in the default config file. A high value makes the computation fast at the cost of increased memory
consumption. You can decrease its value if you are having memory issues with the ImpactCalc.impact() method.
(See the config guide on how to set configuration values).
We present a detailed example for the hazard Tropical Cyclones and the exposures from LitPop .
Reminder: The exposures must be defined according to your problem either using CLIMADA exposures such as Black-
Marble, LitPop, OSM, extracted from external sources (imported via csv, excel, api, …) or directly user defined.
As a reminder, exposures are geopandas dataframes with at least columns ‘latitude’, ‘longitude’ and ‘value’ of exposures.
For impact calculations, each exposure must be assigned the id of the corresponding impact function to use (defined by the column
impf_*) and the associated hazard centroids. This is done after defining the impact function(s) and the
hazard(s).
See tutorials on Exposures , Hazard, ImpactFuncSet for more details.
Exposures are either defined as a series of (latitude/longitude) points or as a raster of (latitude/longitude) points. Fun-
damentally, this changes nothing for the impact computations. For a larger number of points, consider using a
raster, which might be computationally more efficient. For a low number of points, avoid using a raster if this adds a lot
of exposure values equal to 0.
We shall here use a raster example.
# To execute this cell on your computer, the GPW population data must be available locally. If you haven't downloaded it before,
# please have a look at the section 'Input data' of the LitPop tutorial.
%matplotlib inline
import numpy as np
from climada.entity import LitPop
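The cell defining exp_lp is a sketch here; judging from the raster metadata printed below, the original example uses a Caribbean country at 300 arcsec resolution (the country code is an assumption):
# Exposure from LitPop (hypothetical reconstruction of the original cell)
exp_lp = LitPop.from_countries(countries=["CUB"], res_arcsec=300, fin_mode="pc")
exp_lp.check()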
exp_lp.gdf.head()
impf_
0 1
1 1
2 1
3 1
4 1
Raster properties exposures: {'width': 129, 'height': 41, 'crs': <Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- undefined
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
, 'transform': Affine(0.08333333000000209, 0.0, -84.91666666500001,
0.0, -0.08333332999999854, 23.249999994999996)}
Let us define a tropical cyclone hazard using the TropCyclone and TCTracks modules.
# Load historical tropical cyclone tracks from ibtracs over the North Atlantic basin between 2010-2012
ibtracks_na = TCTracks.from_ibtracs_netcdf(
provider="usa", basin="NA", year_range=(2010, 2012), correct_pres=True
)
print("num tracks hist:", ibtracks_na.size)
ibtracks_na.equal_timestep(
    0.5
)  # Interpolation to make the track smooth and to allow applying calc_perturbed_trajectories
From the tracks, we generate the hazards (the tracks are only the coordinates of the centre of the cyclones; the full cyclone,
however, affects a region around the track).
First we define the set of centroids, which are the geographical points where the hazard has a defined value. In our case, we use the coordinates of the exposures as centroids.
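A sketch of this centroid definition (the exact constructor may differ between CLIMADA versions; Centroids.from_lat_lon is assumed here):
from climada.hazard import Centroids

lat = exp_lp.gdf["latitude"].values
lon = exp_lp.gdf["longitude"].values
centrs = Centroids.from_lat_lon(lat, lon)
centrs.check()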
# Using the tracks, compute the windspeed at the location of the centroids
tc = TropCyclone.from_tracks(ibtracks_na, centroids=centrs)
tc.check()
Hint: The operation of computing the windspeed at different locations is in general computationally expensive. Hence,
if you have a lot of tropical cyclone tracks, you should first make sure that all your tropical cyclones actually affect your
exposure (remove those that don’t). Then, be careful when defining the centroids. For a large country like China, there is
no need for centroids 500km inland (no tropical cyclone gets that far inland).
Impact function
For Tropical Cyclones, some calibrated default impact functions exist. Here we will use the one from Emanuel (2011).
# impact function TC
impf_tc = ImpfTropCyclone.from_emanuel_usa()
Recall that the exposures, hazards and impact functions must be matched in the impact calculation. Here this is simple,
since there is a single impact function for all the hazards. We must simply make sure that the exposure is assigned this
impact function by renaming the impf_ column to impf_TC (i.e. appending the hazard type of the impact function) and
setting the values of the column to the id of the impact function.
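A minimal sketch of that step (assuming the default column name impf_ and the impact function id 1 used by from_emanuel_usa):
# rename the default impact function column to the hazard type 'TC' and point it to impact function id 1
exp_lp.gdf.rename(columns={"impf_": "impf_TC"}, inplace=True)
exp_lp.gdf["impf_TC"] = 1
exp_lp.gdf["impf_TC"].head()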
impf_TC
0 1
1 1
2 1
3 1
4 1
Impact computation
We are finally ready for the impact computation. This is the simplest step. Just give the exposure, impact function and
hazard to the ImpactCalc.impact() method.
Note: we did not specifically assign centroids to the exposures. Hence, the default is used - each exposure is associated
with the closest centroids. Since we defined the centroids from the exposures, this is a one-to-one mapping.
Note: we did not define an Entity in this impact calculations. Recall that Entity is a container class for Exposures,
Impact Functions, Discount Rates and Measures. Since we had only one Exposure and one Impact Function, the container
would not have added any value, but for more complex projects, the Entity class is very useful.
# Compute impact
from climada.engine import ImpactCalc
exp_lp.gdf
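The impact computation itself is a sketch here (impf_set gathers the Emanuel impact function defined above; tc is the tropical cyclone hazard):
from climada.entity import ImpactFuncSet

impf_set = ImpactFuncSet([impf_tc])
impf_set.check()

# compute the impact; save_mat=False means the per-event, per-exposure impact matrix is not stored
imp = ImpactCalc(exp_lp, impf_set, tc).impact(save_mat=False)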
For example we can now obtain the aggregated average annual impact or plot the average annual impact in each exposure
location.
imp.plot_hexbin_eai_exposure(buffer=1);
Impact concatenation
There can be cases in which an impact function for a given hazard type is not constant throughout the year. This is, for
example, the case in the context of agriculture: if a crop has already been harvested, the impact of a certain weather
event can be much lower or even zero. For such situations of two or more different impact functions for the same hazard
and exposure type, it can be useful to split the events into subsets and compute impacts separately. In order to then analyze
the total impact, the different impact subsets can be concatenated using the Impact.concat method. This is done here
for the hypothetical example using LitPop as exposure and TCs as hazard. For illustration purposes, we misuse the LitPop
exposure in this case as exposure of a certain crop. We assume a constant harvest day (17 October) after which the impact
function is reduced by a factor of 10.
First, we prepare the hazard subsets.
# loop over all events and check if they happened before or after harvest
# (assumed harvest day: 17 October, i.e. day of year 290 in non-leap years)
import datetime as dt

event_ids_post_harvest = []
event_ids_pre_harvest = []
for event_id in tc.event_id:
    event_date = tc.date[np.where(tc.event_id == event_id)[0][0]]
    year = dt.date.fromordinal(int(event_date)).year
    day_of_year = event_date - dt.date(year, 1, 1).toordinal() + 1
    if day_of_year > 290:
        event_ids_post_harvest.append(event_id)
    else:
        event_ids_pre_harvest.append(event_id)
tc_post_harvest = tc.select(event_id=event_ids_post_harvest)
tc_pre_harvest = tc.select(event_id=event_ids_pre_harvest)
# print('pre-harvest:', tc_pre_harvest.event_name)
# print('post-harvest:', tc_post_harvest.event_name)
Now we get two different impact functions, one valid for the exposed crop before harvest and one after harvest. Then, we
compute the impacts for both phases separately.
# impact function TC
impf_tc = ImpfTropCyclone.from_emanuel_usa()
# impact function TC after harvest: mdd reduced by a factor of 10
impf_tc_posth = ImpfTropCyclone.from_emanuel_usa()
impf_tc_posth.mdd = impf_tc.mdd * 0.1
# add the impact function to an Impact function set
impf_set = ImpactFuncSet([impf_tc])
impf_set_posth = ImpactFuncSet([impf_tc_posth])
impf_set.check()
impf_set_posth.check()
# plot
impf_set.plot()
impf_set_posth.plot()
# Compute impacts
imp_preh = ImpactCalc(exp_lp, impf_set, tc_pre_harvest).impact(save_mat=True)
imp_posth = ImpactCalc(exp_lp, impf_set_posth, tc_post_harvest).impact(save_mat=True)
Now, we can concatenate the impacts again and plot the results
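A sketch of the concatenation step, using the Impact.concat method mentioned above (the variable name imp_tot matches its use in the plots below):
from climada.engine import Impact

imp_tot = Impact.concat([imp_preh, imp_posth])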
# plot result
import matplotlib.pyplot as plt
ax = imp_preh.plot_hexbin_eai_exposure(gridsize=100, adapt_fontsize=False)
ax.set_title("Expected annual impact: Pre-Harvest")
ax = imp_posth.plot_hexbin_eai_exposure(gridsize=100, adapt_fontsize=False)
ax.set_title("Expected annual impact: Post-Harvest")
ax = imp_tot.plot_hexbin_eai_exposure(gridsize=100, adapt_fontsize=False)
ax.set_title("Expected annual impact: Total")
%matplotlib inline
# EXAMPLE: POINT EXPOSURES WITH POINT HAZARD
import numpy as np
from climada.entity import Exposures, ImpactFuncSet, ImpfTropCyclone
from climada.hazard import Centroids, TCTracks, TropCyclone
from climada.engine import ImpactCalc
# Compute Impact
imp_pnt = ImpactCalc(exp_pnt, impf_pnt, tc_pnt).impact()
# nearest neighbor of exposures to centroids gives identity
print(
"Nearest neighbor hazard.centroids indexes for each exposure:",
exp_pnt.gdf["centr_TC"].values,
)
imp_pnt.plot_scatter_eai_exposure(ignore_zero=False, buffer=0.05);
exp_ras = LitPop.from_countries(
countries=["VEN"], res_arcsec=300, fin_mode="income_group"
)
exp_ras.gdf.reset_index()
exp_ras.check()
exp_ras.plot_raster()
print("\n Raster properties exposures:", exp_ras.meta)
# Compute impact
imp_ras = ImpactCalc(exp_ras, impf_ras, haz_ras).impact(save_mat=False)
# nearest neighbor of exposures to centroids is not identity because litpop does not contain data outside the country polygon
print(
"\n Nearest neighbor hazard.centroids indexes for each exposure:",
exp_ras.gdf["centr_FL"].values,
)
imp_ras.plot_raster_eai_exposure();
Raster properties exposures: {'width': 163, 'height': 138, 'crs': <Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- undefined
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
, 'transform': Affine(0.08333333000000209, 0.0, -73.41666666500001,
0.0, -0.08333332999999987, 12.166666665)}
2023-01-26 11:59:23,374 - climada.util.coordinates - INFO - Reading C:\Users\F80840370\climada\demo\data\SC22000_VE__M1.grd.gz
Visualization
Making plots
The expected annual impact per exposure can be visualized through different methods:
plot_hexbin_eai_exposure(), plot_scatter_eai_exposure(), plot_raster_eai_exposure() and
plot_basemap_eai_exposure() (similarly as with Exposures).
imp_pnt.plot_basemap_eai_exposure(buffer=5000);
Making videos
Given a fixed exposure and impact functions, a sequence of hazards can be visualized hitting the exposures.
# exposure
from climada.entity import add_sea
from climada_petals.entity import BlackMarble
exp_video = BlackMarble()
exp_video.set_countries(["Cuba"], 2016, res_km=2.5)
exp_video.check()
# impact function
impf_def = ImpfTropCyclone.from_emanuel_usa()
impfs_video = ImpactFuncSet([impf_def])
impfs_video.check()
track_name = "2017242N16333"
tr_irma = TCTracks.from_ibtracs_netcdf(provider="usa", storm_id=track_name)  # IRMA 2017
tc_video = TropCyclone()
tc_list, _ = tc_video.video_intensity(
track_name, tr_irma, centr_video
)  # empty file name, to not write the video
For example, 100 different ImpactFunc can represent 100 types of buildings exposed to tropical cyclone wind damage. These 100 ImpactFunc are all gathered in
an ImpactFuncSet.
Users may use ImpactFunc.check() to check that the attributes have been set correctly. The mean damage ratio mdr
(mdr=mdd*paa) is calculated by the method ImpactFunc.calc_mdr().
The ImpactFuncSet class contains all the ImpactFunc classes. Users are not required to define any attributes in
ImpactFuncSet.
To add an ImpactFunc into an ImpactFuncSet, simply use the method ImpactFuncSet.append(ImpactFunc).
If the user only has one impact function, they should still generate an ImpactFuncSet that contains that single impact function.
ImpactFuncSet is the object used in the impact calculation.
Here we generate an impact function with random dummy data for illustrative reasons. Assuming this impact function is
a function that relates building damage to tropical cyclone (TC) wind, with an arbitrary id 3.
import numpy as np
from climada.entity import ImpactFunc
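A minimal sketch of such a dummy impact function (the arrays are random and purely illustrative; the keyword names are assumptions following the ImpactFunc attribute names):
# dummy impact function relating building damage to TC wind, with arbitrary id 3
impf_id = 3
intensity = np.linspace(0, 100, num=15)
mdd = np.concatenate((np.array([0.0]), np.sort(np.random.rand(14))))
paa = np.sort(np.random.rand(15))

imp_fun = ImpactFunc(
    haz_type="TC",
    id=impf_id,
    intensity=intensity,
    mdd=mdd,
    paa=paa,
    intensity_unit="m/s",
    name="TC building damage (dummy)",
)
imp_fun.check()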
The method plot() uses the matplotlib’s axes plot function to visualise the impact function. It returns a figure and axes,
which can be modified by users.
ImpfTropCyclone is a derived class of ImpactFunc. This built-in impact function estimates the insured property
damages caused by tropical cyclone wind in the USA, following the reference paper Emanuel (2011).
To generate this impact function, the method from_emanuel_usa() is used.
# Here we generate the impact function for TC damage using the formula of Emanuel 2011
impFunc_emanuel_usa = ImpfTropCyclone.from_emanuel_usa()
# plot the impact function
impFunc_emanuel_usa.plot();
import numpy as np
import matplotlib.pyplot as plt
from climada.entity import ImpactFunc, ImpactFuncSet
The method plot() in ImpactFuncSet also uses matplotlib’s axes plot function to visualise the impact functions,
returning a figure with all the subplots of impact functions. Users may modify these plots.
Users may want to retrieve a particular impact function from an ImpactFuncSet. Using the method get_func(haz_type,
id), it returns the ImpactFunc of the desired impact function. Below is an example of extracting the TC impact
function with id 1 and using plot() to visualise the function.
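A sketch of this extraction (argument order as described above):
# extract the TC impact function with id 1 and plot it
impf_tc_1 = imp_fun_set.get_func("TC", 1)
impf_tc_1.plot();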
If there is an unwanted impact function in the ImpactFuncSet, we may remove it from the set using the method
remove_func(haz_type, id).
For example, the previously generated impact function set imp_fun_set contains an unwanted TC impact function with
id 3, which we would like to remove from the set.
# first plotting all the impact functions in the impact function set to see what is in there:
imp_fun_set.plot();
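A sketch of the removal (argument order as described above):
# remove the unwanted TC impact function with id 3 and plot the remaining functions
imp_fun_set.remove_func("TC", 3)
imp_fun_set.plot();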
Impact functions defined in an excel file following the template provided in sheet impact_functions of
climada_python/climada/data/system/entity_template.xlsx can be ingested directly using the method
from_excel().
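A sketch of reading the template file shipped with CLIMADA (ENT_TEMPLATE_XLS is assumed to point to the entity_template.xlsx mentioned above):
from climada.util.constants import ENT_TEMPLATE_XLS

imp_set_xlsx = ImpactFuncSet.from_excel(ENT_TEMPLATE_XLS)
imp_set_xlsx.plot();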
Users may write the impact functions in Excel format using write_excel() method.
Alternatively, users may also save the impact functions into pickle format, using CLIMADA in-built function save().
Note that pickle has a transient format and should be avoided when possible.
# this generates a results folder in the current path and stores the output there
save("tutorial_impf_set.p", imp_set_xlsx)
plt.tight_layout()
Measure class
A measure is characterized by the following attributes:
Related to measure’s description:
• name (str): name of the action
• haz_type (str): related hazard type (peril), e.g. TC
• color_rgb (np.array): integer array of size 3. Gives color code of this measure in RGB
• cost (float): discounted cost (in same units as assets). Needs to be provided by the user. See the example provided
in climada_python/climada/data/system/entity_template.xlsx sheets _measures_details and
_discounting_sheet to see how the discounting is done.
hazard_freq_cutoff modifies the hazard by setting the intensities of the events whose impact exceedance frequencies
are greater than hazard_freq_cutoff to 0.
imp_fun_map indicates the ids of the impact function to replace and its replacement. The impf_XX variable of Expo-
sures with the affected impact function id will be correspondingly modified (XX refers to the haz_type of the measure).
exp_region_id will apply all the previous changes only to the indicated region_id. This means that only the exposures
with that region_id and the hazard’s centroids close to them will be modified with the previous changes; the other regions
remain unaffected by the measure.
risk_transf_attach and risk_transf_cover are the attachment point (deductible) and cover of the risk transfer applied to each event.
Methods description:
The method check() validates the attributes. apply() applies the measure to a given exposure, impact function and
hazard, returning their modified values. The parameters related to insurability (risk_transf_attach and risk_transf_cover)
affect the resulting impact and are therefore not applied within the apply() method.
calc_impact() calls apply(), applies the insurance parameters and returns the final impact and risk transfer of the
measure. This method is called from the CostBenefit class.
The method apply() allows to visualize the effect of a measure. Here are some examples:
# define measure
meas = Measure(
name="Mangrove",
haz_type="TC",
color_rgb=np.array([1, 1, 1]),
cost=500000000,
mdd_impact=(1, 0),
paa_impact=(1, -0.15),
hazard_inten_imp=(1, -10), # reduces intensity by 10
)
# impact functions
impf_tc = ImpfTropCyclone.from_emanuel_usa()
impf_all = ImpactFuncSet([impf_tc])
impf_all.plot()
# effect of hazard_freq_cutoff
import numpy as np
from climada.entity import ImpactFuncSet, ImpfTropCyclone, Exposures
from climada.entity.measures import Measure
from climada.hazard import Hazard
from climada.engine import ImpactCalc
# define measure
meas = Measure(
name="Mangrove",
haz_type="TC",
color_rgb=np.array([1, 1, 1]),
cost=500000000,
hazard_freq_cutoff=0.0255,
)
# impact functions
impf_tc = ImpfTropCyclone.from_emanuel_usa()
impf_all = ImpactFuncSet([impf_tc])
# Hazard
haz = Hazard.from_hdf5(HAZ_DEMO_H5)
haz.check()
# Exposures
exp = Exposures.from_hdf5(EXP_DEMO_H5)
exp.check()
# new hazard
new_exp, new_impfs, new_haz = meas.apply(exp, impf_all, haz)
# if you look at the maximum intensity per centroid: new_haz does not contain the event with smaller impact (the most frequent)
haz.plot_intensity(0)
new_haz.plot_intensity(0)
# you might also compute the exceedance frequency curve of both hazard
imp = ImpactCalc(exp, impf_all, haz).impact()
ax = imp.calc_freq_curve().plot(label="original")
<matplotlib.legend.Legend at 0x7f433756d970>
# effect of exp_region_id
import numpy as np
from climada.entity import ImpactFuncSet, ImpfTropCyclone, Exposures
from climada.entity.measures import Measure
from climada.hazard import Hazard
from climada.engine import ImpactCalc
# define measure
meas = Measure(
name="Building code",
haz_type="TC",
color_rgb=np.array([1, 1, 1]),
cost=500000000,
hazard_freq_cutoff=0.00455,
exp_region_id=[1], # apply measure to points close to exposures with region_id=1
)
# impact functions
impf_tc = ImpfTropCyclone.from_emanuel_usa()
impf_all = ImpactFuncSet([impf_tc])
# Hazard
haz = Hazard.from_hdf5(HAZ_DEMO_H5)
haz.check()
# Exposures
exp = Exposures.from_hdf5(EXP_DEMO_H5)
exp.check()
# new hazard
new_exp, new_impfs, new_haz = meas.apply(exp, impf_all, haz)
# the cutoff has been applied only in the region of the exposures
haz.plot_intensity(0)
new_haz.plot_intensity(0)
# the exceedance frequency has only been computed for the selected exposures before doing the cutoff.
# since we have removed the hazard of the places with exposure, the new exceedance frequency curve is zero.
# define measure
meas = Measure(
name="Insurance",
haz_type="TC",
color_rgb=np.array([1, 1, 1]),
cost=500000000,
risk_transf_attach=5.0e8,
risk_transf_cover=1.0e9,
)
# impact functions
impf_tc = ImpfTropCyclone.from_emanuel_usa()
impf_all = ImpactFuncSet([impf_tc])
# Hazard
haz = Hazard.from_hdf5(HAZ_DEMO_H5)
haz.check()
# Exposures
exp = Exposures.from_hdf5(EXP_DEMO_H5)
exp.check()
# impact before
imp = ImpactCalc(exp, impf_all, haz).impact()
ax = imp.calc_freq_curve().plot(label="original")
risk_transfer 2.7e+07
MeasureSet class
Similarly to the ImpactFuncSet, MeasureSet is a container which handles Measure instances through the methods
append(), extend(), remove_measure() and get_measure(). Use the check() method to make sure all the
measures have been properly set.
For a complete class documentation, refer to the Python modules docs: climada.entity.measures.measure_set.
MeasureSet
# build measures
import numpy as np
import matplotlib.pyplot as plt
from climada.entity.measures import Measure, MeasureSet
meas_1 = Measure(
haz_type="TC",
name="Mangrove",
color_rgb=np.array([1, 1, 1]),
cost=500000000,
mdd_impact=(1, 2),
paa_impact=(1, 2),
hazard_inten_imp=(1, 2),
risk_transf_cover=500,
)
meas_2 = Measure(
haz_type="TC",
name="Sandbags",
color_rgb=np.array([1, 1, 1]),
cost=22000000,
)
Sandbags 22000000
An example of use: we define discount rates and apply them to a coastal protection scheme which initially costs 100 mn
USD plus 75’000 USD maintenance each year, starting after 10 years. The net present value of the project can be calculated
as displayed:
%matplotlib inline
import numpy as np
from climada.entity import DiscRates
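The discount rates themselves are a sketch here (a flat 2% per year over a broad period is assumed purely for illustration):
# define discount rates covering the period of interest
years = np.arange(1950, 2100)
rates = np.ones(years.size) * 0.02
disc = DiscRates(years=years, rates=rates)
disc.check()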
# Compute net present value between present year and future year.
ini_year = 2019
end_year = 2050
val_years = np.zeros(end_year - ini_year + 1)
val_years[0] = 100000000 # initial investment
val_years[10:] = 75000 # maintenance from 10th year
npv = disc.net_present_value(ini_year, end_year, val_years)
print("net present value: {:.5e}".format(npv))
# write file
disc.write_excel("results/tutorial_disc.xlsx")
Pickle can always be used as well, but note that pickle has a transient format and should be avoided when possible:
# this generates a results folder in the current path and stores the output there
save("tutorial_disc.p", disc)
Data Source
The International Disaster Database EM-DAT www.emdat.be
Download: https://public.emdat.be/ (register for free and download data to continue)
Demo data
The demo data used here (demo_emdat_impact_data_2020.csv) contains entries for the disaster subtype “Tropical cy-
clone” from 2000 to 2020.
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
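The helper functions are assumed to come from climada.engine.impact_data; the file path is an example and should point to where you stored the downloaded demo file:
from climada.engine.impact_data import (
    clean_emdat_df,
    emdat_countries_by_hazard,
    emdat_impact_yearlysum,
    emdat_to_impact,
)

emdat_file_path = "demo_emdat_impact_data_2020.csv"  # adapt to your local path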
clean_emdat_df()
"""Create DataFrame df with EM-DAT entries of tropical cyclones in Thailand and Viet␣
,→Nam in the years 2005 and 2006"""
df = clean_emdat_df(
emdat_file_path,
countries=["THA", "Viet Nam"],
hazard=["TC"],
year_range=[2005, 2006],
)
print(df)
... End Day Total Deaths No Injured No Affected No Homeless Total Affected \
0 ... 30.0 75.0 28.0 337632.0 NaN 337660.0
1 ... 30.0 10.0 NaN 2000.0 NaN 2000.0
2 ... 19.0 8.0 NaN 8500.0 NaN 8500.0
[8 rows x 43 columns]
emdat_countries_by_hazard()
Pick a hazard and a year range to get a list of countries affected from the EM-DAT data.
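A sketch of such a query (the keyword names and the order of the returned lists are assumptions):
country_names, iso3_codes = emdat_countries_by_hazard(
    emdat_file_path, hazard="TC", year_range=(2000, 2020)
)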
print(country_names)
print(iso3_codes)
[..., 'Pakistan', 'Philippines', 'Hong Kong', 'Korea, Republic of', 'Nicaragua', 'Oman', 'Japan', 'Puerto Rico', 'Thailand', 'Martinique', 'Papua New Guinea', 'Tonga', 'Venezuela, Bolivarian Republic of', 'Viet Nam', 'Saint Vincent and the Grenadines', ...]
['CHN', 'DOM', 'ATG', 'FJI', 'AUS', 'BGD', 'BLZ', 'BRB', 'COK', 'CAN', 'BHS', 'GTM', 'JAM', 'LCA', 'MDG', 'MEX', 'PRK', 'SLV', 'MMR', 'PYF', 'SLB', 'TWN', 'IND', 'USA', 'HND', 'HTI', 'PAK', 'PHL', 'HKG', 'KOR', 'NIC', 'OMN', 'JPN', 'PRI', 'THA', 'MTQ', 'PNG', 'TON', 'VEN', 'VNM', 'VCT', 'VUT', 'DMA', 'CUB', 'COM', 'MOZ', 'MWI', 'WSM', 'ZAF', 'LKA', 'PLW', 'WLF', 'SOM', 'SYC', 'REU', 'KIR', 'CPV', 'FSM', 'PAN', 'CRI', 'YEM', 'TUV', 'MNP', 'COL', 'AIA', 'DJI', 'KHM', 'MAC', 'IDN', 'GLP', 'TCA', 'KNA', 'LAO', 'MUS', 'MHL', 'PRT', 'VIR', 'ZWE', 'BLM', 'VGB', 'MAF', 'SXM', 'TZA']
emdat_to_impact()
function to load EM-DAT impact data and return impact set with impact per event
Parameters:
Optional parameters:
• hazard_type_emdat (list or str): List of disaster (sub-)types according to EM-DAT terminology or CLIMADA hazard
type abbreviations, e.g. [‘Wildfire’, ‘Forest fire’] or [‘BF’]
• year_range (list with 2 integers): start and end year e.g. [1980, 2017]
• countries (list of str): country ISO3-codes or names, e.g. [‘JAM’, ‘CUB’]. Set to None or [‘all’] for all countries
• reference_year (int): reference year of exposures for normalization. Impact is scaled proportional to GDP to the
value of the reference year. No scaling for reference_year=0 (default)
• imp_str (str): Column name of impact metric in EMDAT CSV, e.g. ‘Total Affected’; default = “Total Damages”
Returns:
• impact_instance (instance of climada.engine.Impact): Impact() instance (same format as output from CLIMADA
impact computations). Values are scaled with GDP to reference_year if reference_year not equal 0. im-
pact_instance.eai_exp holds expected annual impact for each country. impact_instance.coord_exp holds rough
central coordinates for each country.
• countries (list): ISO3-codes of countries in the same order as in impact_instance.eai_exp
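The call that produced the printout below is a sketch here, mirroring the signature used further down for the Philippines example:
# global TC impacts reported in EM-DAT between 2000 and 2009 (monetary damages, the default imp_str)
impact_emdat, countries = emdat_to_impact(emdat_file_path, "TC", year_range=(2000, 2009))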
print(
"Number of TC events in EM-DAT 2000 to 2009 globally: %i"
% (impact_emdat.event_id.size)
)
print(
"Global annual average monetary damage (AAI) from TCs as reported in EM-DAT 2000␣
,→to 2009: USD billion %2.2f"
% (impact_emdat.aai_agg / 1e9)
)
# People affected
impact_emdat_PHL, countries = emdat_to_impact(
emdat_file_path,
"TC",
countries="PHL",
year_range=(2013, 2013),
imp_str="Total Affected",
)
print(
"Number of TC events in EM-DAT in the Philipppines, 2013: %i"
% (impact_emdat_PHL.event_id.size)
)
print("\nPeople affected by TC events in the Philippines in 2013 (per event):")
print(impact_emdat_PHL.at_event)
print("\nPeople affected by TC events in the Philippines in 2013 (total):")
print(int(impact_emdat_PHL.aai_agg))
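The monetary-damage counterpart used in the scatter plot below is a sketch here (default imp_str, i.e. ‘Total Damages’):
impact_emdat_PHL_USD, _ = emdat_to_impact(
    emdat_file_path, "TC", countries="PHL", year_range=(2013, 2013)
)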
ax = plt.scatter(impact_emdat_PHL_USD.at_event, impact_emdat_PHL.at_event)
plt.title("Typhoon impacts in the Philippines, 2013")
plt.xlabel("Total Damage [USD]")
plt.ylabel("People Affected");
# plt.xscale('log')
# plt.yscale('log')
emdat_impact_yearlysum()
function to load EM-DAT impact data and return DataFrame with impact summed per year and country
Parameters:
Optional parameters:
• hazard (list or str): List of disaster (sub-)types according to EM-DAT terminology or CLIMADA hazard type abbre-
viations, e.g. [‘Wildfire’, ‘Forest fire’] or [‘BF’]
• year_range (list with 2 integers): start and end year e.g. [1980, 2017]
• countries (list of str): country ISO3-codes or names, e.g. [‘JAM’, ‘CUB’]. Set to None or [‘all’] for all countries
• reference_year (int): reference year of exposures for normalization. Impact is scaled proportional to GDP to the
value of the reference year. No scaling for reference_year=0 (default)
• imp_str (str): Column name of impact metric in EMDAT CSV, e.g. ‘Total Affected’; default = “Total Damages”
• version (int): given EM-DAT data format version (i.e. year of download), changes naming of columns/variables
(default: 2020)
Returns:
• DataFrame with the impact summed per year and country
yearly_damage_normalized_to_2019 = emdat_impact_yearlysum(
emdat_file_path,
countries="USA",
hazard="Tropical cyclone",
year_range=None,
reference_year=2019,
)
yearly_damage_current = emdat_impact_yearlysum(
emdat_file_path,
countries=["USA"],
hazard="TC",
)
[2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2014
2015 2016 2017 2018 2019 2020]
What is a cost-benefit?
A cost-benefit analysis in CLIMADA lets you compare the effectiveness of different hazard adaptation options.
The cost-benefit ratio describes how much loss you can prevent per dollar of expenditure (or whatever currency you’re
using) over a period of time. When a cost-benefit ratio is less than 1, the cost is less than the benefit and CLIMADA is
predicting a worthwhile investment. Smaller ratios therefore represent better investments. When a cost-benefit is greater
than 1, the cost is more than the benefit and the offset losses are less than the cost of the adaptation measure: based on
the financials alone, the measure may not be worth it. Of course, users may have factors beyond just cost-benefits that
influence decisions.
CLIMADA doesn’t limit cost-benefits to just financial exposures. The cost-benefit ratio could represent hospitalisations
avoided per Euro spent, or additional tons of crop yield per Swiss Franc.
The cost-benefit calculation has a few complicated components, so in this section we’ll build up the calculation step by
step.
Simple cost-benefits
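The simple cost-benefit equation itself is not reproduced in this extract. A minimal reconstruction consistent with the surrounding text, for an evaluation period of $N$ years:

$$\text{benefit} = N \cdot \big( \text{AAI without measures} - \text{AAI with measures} \big), \qquad \text{cost-benefit ratio} = \frac{\text{cost}}{\text{benefit}}$$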
Time-dependence
The above equation works well when the only thing changing is an adaptation measure. But usually a CLIMADA cost-benefit calculation will want to describe a climate and exposure that also change over time. In this case it’s not enough to multiply the change in average annual impact by the number of years we’re evaluating over; we need to calculate a benefit for every year separately and sum them up.
We can modify the benefit part of cost-benefit to reflect this. CLIMADA doesn’t assume that the user will have explicit
hazard and impact objects for every year in the study period, and so interpolates between the impacts at the start and the
end of the period of interest. If we’re evaluating between years T0 , usually close to the present, and T1 in the future, then
we can say:
$$\text{benefit} = \sum_{t=T_0}^{T_1} \alpha(t) \left( \text{AAI with measures}_{T_1} - \text{AAI with measures}_{T_0} \right) - N \cdot \text{AAI without measures}_{T_0}$$
Where $\alpha(t)$ is a function of the year $t$ describing the interpolation of hazard and exposure values between $T_0$ and $T_1$. The function returns values in the range [0, 1], usually with $\alpha(T_0) = 0$ and $\alpha(T_1) = 1$.
Note that:
• This calculation now requires three separate impact calculations: present-day impacts without measures imple-
mented, present-day impacts with measures implemented, and future impacts with measures implemented.
• Setting α(t) = 1 for all values of t simplifies this to the first cost-benefit equation above.
CLIMADA lets you set $\alpha(t) = 1$ for all years $t$, or as $\alpha_k(t) = \frac{(t - T_0)^k}{(T_1 - T_0)^k}$ for $t \in [T_0, T_1]$,
where $k$ is user-provided and called imp_time_depen. This expression is a polynomial curve between $T_0$ and $T_1$ normalised so that $\alpha_k(T_0) = 0$ and $\alpha_k(T_1) = 1$. The choice of $k$ determines how quickly the transition occurs between the present and future. When $k = 1$ the function is a straight line. When $k > 1$ change begins slowly and speeds up over time. When $k < 1$ change begins quickly and slows over time.
If this math is tough, the key takeaways are
• Cost benefit calculations take a long view, summing the benefits of adaptation measures over many years in a
changing world
• CLIMADA describes how the change from the present to the future scenarios happens with the imp_time_depen
parameter. With values < 1 the change starts quickly and slows down. When it is equal to 1 change is steady. When
it’s > 1 change starts slowly and speeds up.
Discount rates
$$D(t) = \prod_{y=T_0}^{t} \big(1 - d(y)\big)$$
With a constant 1.4% discount rate, we have $D(t) = 0.986^{\,t-T_0}$. With a discount rate of zero we have $D(t) = 1$.
Adding this to our equation for total benefits we get:
$$\text{benefit} = \sum_{t=T_0}^{T_1} \alpha(t) D(t) \left( \text{AAI with measures}_{T_1} - \text{AAI with measures}_{T_0} \right) - N \cdot \text{AAI without measures}_{T_0}$$
Note:
• Setting the rates to zero (d(t) = 0) means D(t) = 1 and the term drops out of the equation.
• Be careful with your choice of discount rate when your exposure is non-economic. It can be hard to justify applying
rates to e.g. ecosystems or human lives.
Each dictionary stored in the attributes imp_meas_future and imp_meas_present has entries:
The dictionary will also include a ‘no measure’ entry with the same structure, giving the impact analysis when no measures
are implemented.
These are:
• hazard (Hazard object): the present-day or baseline hazard event set
• entity (Entity object): the present-day or baseline Entity object. Entity is the container class containing
– exposure (Exposures object): the present-day or baseline exposure
– disc_rates (DiscRates object): the discount rates to be applied in the cost-benefit calculation. Only dis-
count rates from entity and not ent_future are used.
– impact_funcs (ImpactFuncSet object): the impact functions required to calculate impacts from the present-
day hazards and exposures
– measures (MeasureSet object): the set of measures to implement in the analysis. This will almost always be
the same as the measures in the ent_future Entity (if set).
• haz_future (Hazard object, optional): the future hazard event set, if different from present.
• ent_future (Entity object, optional): the future Entity, if different from present. Note that the same adaptation
measures must be present in both entity and ent_future.
• future_year (int): the year of the future scenario. This is only used if the Entity’s exposures.ref_year isn’t
set, or no future entity is provided.
• risk_func (function): the risk function used to summarise the annual impacts that describe the benefits. The default is risk_aai_agg, the average annual impact on the Exposures (defined in the CostBenefit module). This function can be replaced with any function that takes an Impact object as input and returns a number. The CostBenefit module provides two other functions, risk_rp_100 and risk_rp_250, the 100-year and 250-year return period impacts respectively.
• imp_time_depen (float): This describes how hazard and exposure evolve over time in the calculation. In the
descriptions above this is the parameter k defining αk (t). When > 1 change is superlinear and occurs nearer the
start of the analysis. When < 1 change is sublinear and occurs nearer the end.
• save_imp (boolean): whether to save the hazard- and location-specific impact data. This is used in many follow-on calculations, but the stored data can be very large, so only enable it if you need it.
Download hazard
We will get data for present day tropical cyclone hazard in Haiti, and for 2080 hazard under the RCP 6.0 warming scenario.
Note that the Data API provides us with a full event set of wind footprints rather than a TCTracks track dataset, meaning
we don’t have to generate the wind fields ourselves.
client = Client()
future_year = 2080
haz_present = client.get_hazard(
"tropical_cyclone",
properties={
"country_name": "Haiti",
"climate_scenario": "historical",
"nb_synth_tracks": "10",
},
)
haz_future = client.get_hazard(
"tropical_cyclone",
properties={
"country_name": "Haiti",
"climate_scenario": "rcp60",
"ref_year": "2080",  # assumed continuation of the truncated cell
"nb_synth_tracks": "10",
},
)
We can plot the hazards and show how they are forecast to intensify. For example, showing the strength of a 50-year
return period wind in present and future climates:
exp_present = client.get_litpop(country="Haiti")
For 2080’s economic exposure we will use a crude approximation, assuming the country will experience 2% economic
growth annually:
import copy
exp_future = copy.deepcopy(exp_present)
exp_future.ref_year = future_year
n_years = exp_future.ref_year - exp_present.ref_year + 1
growth_rate = 1.02
growth = growth_rate**n_years
exp_future.gdf["value"] = exp_future.gdf["value"] * growth
We can plot the current and future exposures. The default scale is logarithmic and we see how the values of exposures
grow, though not by a full order of magnitude.
We then need to map the exposure points to the hazard centroids. (Note: we could have done this earlier before we copied
the exposure, but not all analyses will have present and future exposures and hazards on the same sets of points.)
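A minimal sketch of this mapping step (the original cell is not shown here), assuming the standard Exposures.assign_centroids method:
# Map each exposure point to its nearest hazard centroid
exp_present.assign_centroids(haz_present)
exp_future.assign_centroids(haz_future)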
In this analysis we’ll use the popular sigmoid curve impact function from Emanuel (2011).
impf_tc = ImpfTropCyclone.from_emanuel_usa()
# Rename the impact function column in the exposures and assign hazard IDs
# This is more out of politeness, since if there's only one impact function
# and one `impf_` column, CLIMADA can figure it out
exp_present.gdf.rename(columns={"impf_": "impf_TC"}, inplace=True)
exp_present.gdf["impf_TC"] = 1
exp_future.gdf.rename(columns={"impf_": "impf_TC"}, inplace=True)
exp_future.gdf["impf_TC"] = 1
For adaptation measures we’ll follow some of the examples from the Adaptation MeasureSet tutorial. See the tutorial to
understand how measures work in more depth.
These numbers are completely made up. We implement one measure that reduces the (effective) wind speed by 5 m/s
and one that completely protects 10% of exposed assets.
import numpy as np
import matplotlib.pyplot as plt
from climada.entity.measures import Measure, MeasureSet
meas_2 = Measure(
haz_type="TC",
name="Measure B",
color_rgb=np.array([0.1, 0.1, 0.8]),
cost=220000000,
paa_impact=(1, -0.10), # 10% fewer assets affected
)
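The cell defining the wind-speed measure and the combined measure set is not reproduced here. A minimal sketch, with an illustrative cost and colour, assuming the Measure attribute hazard_inten_imp = (a, b), which maps the hazard intensity to a * intensity + b:
# Sketch of "Measure A": reduce the effective wind speed by 5 m/s (cost is illustrative)
meas_1 = Measure(
    haz_type="TC",
    name="Measure A",
    color_rgb=np.array([0.8, 0.1, 0.1]),
    cost=100000000,
    hazard_inten_imp=(1, -5),  # intensity -> 1 * intensity - 5 m/s
)

# Bundle both measures so they can be passed to the Entity objects below
meas_set = MeasureSet(measure_list=[meas_1, meas_2])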
We’ll define two discount rate objects so that we can compare their effect on a cost-benefit. First, a zero discount rate, where preventing loss in 2080 is valued the same as preventing it this year. Second, the often-used 1.4% per year.
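The discount-rate cells are not shown in this extract. A minimal sketch, assuming the rates span the 2018 to 2080 analysis period and using the names discount_zero and discount_stern referenced below:
from climada.entity import DiscRates

year_range = np.arange(2018, 2081)
discount_zero = DiscRates(years=year_range, rates=np.zeros(year_range.size))
discount_stern = DiscRates(years=year_range, rates=np.full(year_range.size, 0.014))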
Now we have everything we need to create Entities. Remember, Entity is a container class for grouping Exposures,
Impact Functions, Discount Rates and Measures.
In this first example we’ll set discount rates to zero.
entity_present = Entity(
exposures=exp_present,
disc_rates=discount_zero,
impact_func_set=impf_set,
measure_set=meas_set,
)
entity_future = Entity(
exposures=exp_future,
disc_rates=discount_zero,
impact_func_set=impf_set,
measure_set=meas_set,
)
We are now ready to perform our first cost-benefit analysis. We’ll start with the simplest and build up complexity.
The first analysis looks solely at the effects of introducing adaptation measures. It assumes no climate change and no economic growth. It evaluates the benefit over the period 2018 (present) to 2080 (future) and sets the discount rate to zero.
costben_measures_only = CostBenefit()
costben_measures_only.calc(
haz_present,
entity_present,
haz_future=None,
ent_future=None,
future_year=future_year,
risk_func=risk_aai_agg,
imp_time_depen=None,
save_imp=True,
)
Combining measures
We can also combine the measures to give the cost-benefit of implementing everything:
combined_costben = costben_measures_only.combine_measures(
["Measure A", "Measure B"],
"Combined measures",
new_color=np.array([0.1, 0.8, 0.8]),
disc_rates=discount_zero,
)
Note: the method of combining measures is naive. The offset impacts are summed over the event set while not letting the
impact of any single event drop below zero (it therefore doesn’t work in analyses where impacts can go below zero).
Finally, we can see how effective the adaptation measures are at different return periods. The plot_event_view plot
shows the difference in losses at different return periods in the future scenario (here the same as the present scenario) with
the losses offset by the adaptation measures shaded.
We see that Measure A, which reduces wind speeds by 5 m/s, is able to completely stop impacts at the 25-year return period, and that at 250 years – the strongest events – the measures have greatly reduced effectiveness.
Cost-benefit #2: adaptation measures with climate change and economic growth
Our next analysis will introduce a change in future scenarios. We’ll add haz_future and entity_future into the mix. We’ll set imp_time_depen to 1, meaning we interpolate linearly between the present and future hazard and exposures in our summation over years. We’ll still keep the discount rate at zero.
costben = CostBenefit()
costben.calc(
haz_present,
entity_present,
haz_future=haz_future,
ent_future=entity_future,
future_year=future_year,
risk_func=risk_aai_agg,
imp_time_depen=1,
save_imp=True,
)
Waterfall plots
Now that there are more components in the analysis, we can use more of the CostBenefit class’s visualisation methods. The waterfall plot is the clearest way to break down the components of risk:
ax = waterfall()
The waterfall plot breaks down the average annual risk faced in 2080 (this is $0.984 bn, as printed out during the cost-benefit calculation).
We see the baseline 2018 risk in blue. The ‘Economic development’ bar in orange shows the change in annual impacts resulting from growth in exposure, and the ‘Climate change’ bar in green shows the additional change from changes in the hazard.
In this analysis, then, we see that changes in annual losses are likely to be driven by both economic development and
climate change, in roughly equal amounts.
The plot_arrow_averted graph builds on this, adding an indication of the risk averted to the waterfall plot. It’s slightly
awkward to use, which is why we wrote a function to create the waterfall plot earlier:
costben.plot_arrow_averted(
axis=waterfall(),
in_meas_names=["Measure A", "Measure B"],
accumulate=True,
combine=False,
risk_func=risk_aai_agg,
disc_rates=None,
imp_time_depen=1,
)
Exercise: In addition, the plot_waterfall_accumulated method is available to produce a waterfall plot from a
different perspective. Instead of showing a breakdown of the impacts from the year of our future scenario, it accumulates
the components of risk over the whole analysis period. That is, it sums the components over every year between 2018
(when the entire risk is the baseline risk) to 2080 (when the breakdown is the same as the plot above). The final plot
has the same four components, but gives them different weightings. Look up the function in the climada.engine.
cost_benefit module and try it out. Then try changing the value of the imp_time_depen parameter, and see how
front-loading or back-loading the year-on-year changes gives different totals and different breakdowns of risk.
Next we will introduce discount rates to the calculations. Recall that discount rates are factors used to convert future
impacts into present-day impacts, based on the idea that an impact in the future is less significant than the same impact
today.
We will work with the annual 1.4% discount that we defined earlier in the discount_stern object. Let’s define two
new Entity objects with these discount rates:
entity_present_disc = Entity(
exposures=exp_present,
disc_rates=discount_stern,
impact_func_set=impf_set,
measure_set=meas_set,
)
entity_future_disc = Entity(
exposures=exp_future,
disc_rates=discount_stern,
impact_func_set=impf_set,
measure_set=meas_set,
)
costben_disc = CostBenefit()
costben_disc.calc(
haz_present,
entity_present_disc,
haz_future=haz_future,
ent_future=entity_future_disc,
future_year=future_year,
risk_func=risk_aai_agg,
imp_time_depen=1,
save_imp=True,
)
print(costben_disc.imp_meas_future["no measure"]["impact"].imp_mat.shape)
With scenarios like this, the CostBenefit.plot_cost_benefit method shows a 2-dimensional representation of the
cost-benefits.
ax = costben_disc.plot_cost_benefit()
The x-axis here is damage averted over the 2018-2080 analysis period. The y-axis is the Benefit/Cost ratio (so higher is
better). This means that the area of each shape represents the total benefit of the measure. Furthermore, any measure
which goes above 1 on the y-axis gives a larger benefit than the cost of its implementation.
The average annual impact and the total climate risk are marked on the x-axis. The width between the last measure bar and the total climate risk is the residual risk.
Exercise: How sensitive are cost benefit analyses to different parameters? Let’s say an adaptation measure is a ‘good
investment’ if the benefit is greater than the cost over the analysis period, and it’s a ‘bad investment’ if the benefit is less
than the cost.
• Using the hazards and exposures from this tutorial, can you design an impact measure that is a good investment
when no discount rates are applied, and a bad investment when a 1.4% (or higher) discount rate is applied?
• Create hazard and exposure objects for the same growth and climate change scenarios as this tutorial, but for the
year 2040. Can you design an impact measure that is a good investment when evaluated out to 2080, but a bad
investment when evaluated to 2040?
• Using the hazards and exposures from this tutorial, can you design an impact measure that is a good investment when imp_time_depen = 1/4 (change happens closer to 2018) and a bad investment when imp_time_depen = 4 (change happens closer to 2080)?
Finally we can use some of the functionality of the objects stored within the CostBenefit object. Remember that many
impact calculations have been performed to get here, and if imp_mat was set to True, the data has been stored (or … it
will be. I found a bug that stops it being saved while writing the tutorial.)
So this means that you can, for example, plot maps of return period hazard with different adaptation measures applied (or
with all applied, using combine_measures).
Another thing to explore is exceedance curves, which are stored. Here are the curves for the present, future unadapted
and future adapted scenarios:
combined_costben_disc = costben_disc.combine_measures(
["Measure A", "Measure B"],
"Combined measures",
new_color=np.array([0.1, 0.8, 0.8]),
disc_rates=discount_stern,
)
efc_present = costben_disc.imp_meas_present["no measure"]["efc"]
efc_future = costben_disc.imp_meas_future["no measure"]["efc"]
efc_combined_measures = combined_costben_disc.imp_meas_future["Combined measures"][
"efc"
]
ax = plt.subplot(1, 1, 1)
efc_present.plot(axis=ax, color="blue", label="Present")
efc_future.plot(axis=ax, color="orange", label="Future, unadapted")
efc_combined_measures.plot(axis=ax, color="green", label="Future, adapted")
leg = ax.legend()
Conclusion
Cost-benefits calculations can be powerful policy tools, but they are as much an art as a science. Describing your adaptation
measures well, choosing the period to evaluate them over, describing the changing climate and picking a discount rate
will all affect your results and whether a particular measure is worth implementing.
Take the time to explain these choices to yourself and anyone else who wants to understand your calculations. It is also
good practice to run sensitivity tests on your results: how much do your conclusions change when you use other plausible
setups for the calculation?
import numpy as np
array([2, 2, 2, 0, 4, 5, 4, 2, 3, 1])
[array([8, 3]),
array([7, 0]),
array([4, 6]),
array([], dtype=int32),
array([5, 9, 1, 2]),
array([1, 6, 0, 7, 2]),
array([4, 9, 5, 8]),
array([9, 8]),
array([5, 3, 4]),
array([1])]
0.7746478873239436
# compare the resulting yimp with our step-by-step computation without applying the correction factor:
# and here the same comparison with applying the correction factor (default settings):
yimp, sampling_vect = yearsets.impact_yearset(imp, sampled_years=list(range(1, 11)))
print(
    "The same can be shown for the case of applying the correction factor. "
    "The yimp.at_event values equal our step-by-step computed imp_per_year:"
)
print("yimp.at_event = ", yimp.at_event)
print("imp_per_year = ", imp_per_year / correction_factor)
The same can be shown for the case of applying the correction factor. The yimp.at_event values equal our step-by-step computed imp_per_year:
We first explain the method’s functionality and options using a mock Hazard object such that the computation can be easily followed. Further below, we apply the methods to realistic Hazard and Impact objects. If you are already familiar with
local exceedance values and return values, you can directly jump to the section about Method comparison for a realistic
Hazard object.
# import packages
import numpy as np
from scipy import sparse
import warnings

from climada.hazard import Hazard, Centroids  # needed for the mock hazard below
warnings.filterwarnings("ignore")
# hazard intensity
intensity = sparse.csr_matrix([[0, 0, 10, 50], [0, 100, 100, 100]])
# hazard centroids
centroids = Centroids(lat=np.array([2, 2, 1, 1]), lon=np.array([1, 2, 1, 2]))
# define hazard
hazard = Hazard(
haz_type="TC",
intensity=intensity,
fraction=np.full_like(intensity, 1),
centroids=centroids,
event_id=np.array([1, 2]),
event_name=["ev1", "ev2"],
date=np.array([1, 2]),
orig=np.array([True, True]),
frequency=np.array([9.0 / 100, 1.0 / 100]),
frequency_unit="1/year",
units="m/s",
)
hazard.intensity_thres = 0
Now, the question is how to estimate the exceedance intensity for new return periods, e.g., 5, 30, and 150 years.
For the return periods inside the range of observed return periods (here, 30 years), one can simply interpolate between the
data points. For return periods outside the range of observed return periods (here, 5 and 150 years), the cautious answer
is “we don’t know” and one returns NaN. This behaviour is given using method='interpolate' which is the default
setting.
Note that, by default, the linear interpolation between data points is done after converting the data to logarithmic scale.
We do this because, when extrapolating, logarithmic scales avoid negative numbers. The scale choice can be controlled
by changing the boolean parameters log_frequency and log_intensity.
Option 2: extrapolation
If the user wants to estimate the return periods outside the range of observed return periods (here, 5 and 150 years), they
can use method='extrapolate'. This just extends the last interpolation piece inside the data range beyond the data
borders. If there is only a single (nonzero) data point, this setting returns the given intensity (e.g., 100 m/s for centroid B)
for return periods above the observed return period (e.g., 100 years for centroid B), and zero intensity for return periods
below. Centroids where all events have zero intensity will be assigned zero exceedance intensity for any return period.
Option 3: constant extrapolation
Users who want to extrapolate in a more cautious way can use method='extrapolate_constant'. Here, return periods above the largest observed return period are assigned the largest intensity, and return periods below the smallest observed return period are assigned 0.
Option 4: stepfunction
Finally, instead of interpolating between the data points, one can use method='stepfunction'. Here, a user-provided
return period will be assigned an exceedance intensity equal to the intensity corresponding to the closest observed return period that is below the user-provided return period. The extrapolation behaviour is the same as for
method='extrapolate_constant'.
# method: constant extrapolation
local_return_period, title, column_label = hazard.local_return_period(
    threshold_intensities=[5, 30, 150], method="extrapolate_constant"
)
plot_from_gdf(local_return_period, title, column_label, smooth=False, figsize=(10, 6));
# method: extrapolation
local_return_period, title, column_label = hazard.local_return_period(
    threshold_intensities=[5, 30, 150], method="extrapolate"
)
plot_from_gdf(local_return_period, title, column_label, smooth=False, figsize=(10, 6));
haz_tc_fl = Hazard.from_hdf5(
get_test_file("HAZ_DEMO_FL_15")
) # Historic tropical cyclones in Florida from 1990 to 2004
haz_tc_fl.check()  # Always use the check() method to see if the hazard has been loaded correctly
Next, we plot the hazard’s local exceedance intensities for user-specified return periods of 10, 100 and 200 years. We use the setting method="extrapolate". Furthermore, we indicate a specific centroid (red circle) that we will analyse in more detail below.
Now, we calculate the local exceedance intensities for return periods ranging from 5 to 250 years, using the four different options explained above.
# (definition of test_return_periods, covering return periods from 5 to 250 years, truncated in this extract)
extrapolated = haz_tc_fl.local_exceedance_intensity(
return_periods=test_return_periods, method="extrapolate"
)[0]
extrapolated_constant = haz_tc_fl.local_exceedance_intensity(
return_periods=test_return_periods, method="extrapolate_constant"
)[0]
stepfunction = haz_tc_fl.local_exceedance_intensity(
return_periods=test_return_periods, method="stepfunction"
)[0]
Finally, we focus on a specific centroid (red circle in the above plots) and show how the different options of Hazard.local_exceedance_intensity() can lead to different results. The user-specified return periods from above are indicated as dotted lines. Note in particular that the return periods of 5 years and 200 years that we considered above lie outside the range of observed values for this centroid (blue scatter points). Thus, depending on the extrapolation choice, Hazard.local_exceedance_intensity() either returns NaN (method="interpolate", the default option) or different extrapolated estimates.
2.6.3 Compute local exceedance impacts and local return periods of impact objects
Completely analogously to the methods Hazard.local_exceedance_intensity() and Hazard.local_return_period() of a Hazard object explained above, an Impact object has the methods Impact.local_exceedance_impact() and Impact.local_return_period() (to be added soon).
import warnings
warnings.filterwarnings("ignore")
client = Client()
haz_tc_haiti = client.get_hazard(
"tropical_cyclone",
properties={
"country_name": "Haiti",
"climate_scenario": "historical",
"nb_synth_tracks": "10",
},
)
haz_tc_haiti.check()
# prepare exposure
exposure = client.get_litpop(country="Haiti")
exposure.check()
impf_tc = ImpfTropCyclone.from_emanuel_usa()
impf_set = ImpactFuncSet([impf_tc])
impf_set.check()
# compute impact
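# Sketch of the computation referred to above (the original cell is not shown here):
from climada.engine import ImpactCalc

impact = ImpactCalc(exposure, impf_set, haz_tc_haiti).impact()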
As can be seen in the following example, this binning can lead to a different and more stable extrapolation (which might be desirable in particular for the local_return_period method), and to a smoother interpolation. Note that, due to binning, the data range of observed return periods for method='interpolate' may be reduced.
As an example, we consider a hazard object that contains one centroid and five intensities, all of which occur with a
frequency of 1/10years. Importantly, the two maximal intensities are very similar, which strongly affects the extrapolation
behaviour.
hazard_binning = Hazard(
intensity=sparse.csr_matrix([[80.0], [80.02], [70.0], [70.0], [60.0]]),
frequency=np.array([0.1, 0.1, 0.1, 0.1, 0.1]),
frequency_unit="1/year",
units="m/s",
centroids=Centroids(lat=np.array([1]), lon=np.array([2])),
)
Given the hazard distribution in the hazard object (black points), the user might prefer a flat extrapolation behaviour towards large return periods, in which case the default bin_decimals=None is the correct choice.
However, in the inverse problem of computing a return period for a given hazard intensity, not binning the values can
lead to an unbounded extrapolation, and binning the values might be a good choice.
A rough schema of how to perform uncertainty and sensitivity analysis (taken from Kropf (2021))
1. Kropf, C.M. et al. Uncertainty and sensitivity analysis for global probabilistic weather and climate risk modelling: an implementation in the CLIMADA platform (2021)
2. Pianosi, F. et al. Sensitivity analysis of environmental models: A systematic review with practical workflow. Environmental Modelling & Software 79, 214–232 (2016)
3. Douglas-Smith, D., Iwanaga, T., Croke, B. F. W. & Jakeman, A. J. Certain trends in uncertainty and sensitivity analysis: An overview of software tools and techniques. Environmental Modelling & Software 124, 104588 (2020)
4. Knüsel, B. Epistemological Issues in Data-Driven Modeling in Climate Research. (ETH Zurich, 2020)
5. Saltelli, A. et al. Why so many published sensitivity analyses are false: A systematic review of sensitivity analysis practices. Environmental Modelling & Software 114, 29–39 (2019)
6. Saltelli, A. & Annoni, P. How to avoid a perfunctory sensitivity analysis. Environmental Modelling & Software 25, 1508–1517 (2010)
InputVar
The InputVar class is used to define uncertainty variables.
An input uncertainty parameter is a numerical input value that has a certain probability density distribution in your
model, such as the total exposure asset value, the slope of the vulnerability function, the exponents of the litpop exposure,
the value of the discount rate, the cost of an adaptation measure, …
The probability density distributions (values of distr_dict) of the input uncertainty parameters (keyword arguments of the func and keys of the distr_dict) can be any of the ones defined in scipy.stats.
Several helper methods exist to make generic InputVar objects for Exposures, ImpactFuncSet, Hazard, and Entity (including DiscRates and Measures). These are described in detail in the tutorial Helper methods for InputVar. They are a good basis for your own computations.
Suppose we assume that the GDP value used to scale the exposure has a relative error of ±10%.
import warnings
exp_base = Exposures.from_hdf5(EXP_DEMO_H5)
# Define the function that returns an exposure with scaled total asset value.
# Here x_exp is the input uncertainty parameter and exp_func the InputVar.func.
def exp_func(x_exp, exp_base=exp_base):
    exp = exp_base.copy()
    exp.gdf["value"] *= x_exp
    return exp
exp_distr = {
"x_exp": sp.stats.uniform(0.9, 0.2),
}
exp_iv = InputVar(exp_func, exp_distr)
# Uncertainty parameters
exp_iv.labels
['x_exp']
# Defined distribution
exp_iv.plot(figsize=(5, 3));
Suppose we want to test different exponents (m=1,2 ; n=1,2) for the LitPop exposure for the country Switzerland.
# A faster method would be to first create a dictionary with all the exposures. This however ...
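The cell defining litpop_cat and the exponent bounds is not shown in this extract. A minimal sketch, assuming litpop_cat builds a LitPop exposure for Switzerland with the sampled exponents and that m_min, m_max, n_min, n_max bound the tested values (the resolution is an illustrative choice):
from climada.entity import LitPop

m_min, m_max = 1, 2  # exponents to test, as stated above
n_min, n_max = 1, 2

def litpop_cat(m, n):
    # Build the LitPop exposure for Switzerland with exponents (m, n)
    return LitPop.from_countries("CHE", res_arcsec=150, exponents=(m, n))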
distr_dict = {
"m": sp.stats.randint(low=m_min, high=m_max + 1),
"n": sp.stats.randint(low=n_min, high=n_max + 1),
}
cat_iv = InputVar(
litpop_cat, distr_dict
) # One can use either of the above definitions of litpop_cat
# Uncertainty parameters
cat_iv.labels
['m', 'n']
cat_iv.evaluate(m=1, n=2).plot_raster();
cat_iv.plot(figsize=(10, 3));
UncOutput
The UncOutput class is used to store data from sampling, uncertainty and sensitivity analyses. An UncOutput object can be saved to and loaded from .hdf5. The classes UncImpactOutput and UncCostBenefitOutput are extensions of UncOutput specific to CalcImpact and CalcCostBenefit, respectively.
Data attributes are specific to each output class (UncImpactOutput and UncCostBenefitOutput); the corresponding attribute table is not reproduced in this extract.
Here we show an example loaded from file. In the sections below this class is extensively used and further examples can
be found.
apiclient = Client()
ds = apiclient.get_dataset_info(name=TEST_UNC_OUTPUT_IMPACT, status="test_dataset")
_target_dir, [filename] = apiclient.download_dataset(ds)
# If you produced your own data, you do not need the API. Just replace 'filename' with the path to your file.
unc_imp = UncOutput.from_hdf5(filename)
apiclient = Client()
ds = apiclient.get_dataset_info(name=TEST_UNC_OUTPUT_COSTBEN, status="test_dataset")
_target_dir, [filename] = apiclient.download_dataset(ds)
# If you produced your own data, you do not need the API. Just replace 'filename' with the path to your file.
unc_cb = UncOutput.from_hdf5(filename)
unc_cb.get_uncertainty().tail()
[5 rows x 29 columns]
CalcImpact
In this example, we model the impact function for tropical cyclones based on the parametric function suggested in Emanuel (2015) with 4 parameters. The exposures’ total value varies between 80% and 120%. For the hazard, we assume we have no good error estimate and thus do not define an InputVar for it.
# Define the input variable functions
import numpy as np

# The impf_func signature and the sigmoid definition are truncated in this extract;
# the parameter names (G, v_half, vmin, k, _id) are assumed from their later use.
def impf_func(G, v_half, vmin, k, _id=1):
    def sigmoid_func(v, G, v_half, vmin, k):
        # Emanuel-type sigmoid: fraction of affected assets as a function of wind speed v
        xhi = max((v - vmin), 0) / (v_half - vmin)
        return G * xhi**k / (1 + xhi**k)

    intensity_unit = "m/s"
    intensity = np.linspace(0, 150, num=100)
    mdd = np.repeat(1, len(intensity))
    paa = np.array([sigmoid_func(v, G, v_half, vmin, k) for v in intensity])
    imp_fun = ImpactFunc("TC", _id, intensity, mdd, paa, intensity_unit)
    imp_fun.check()
    impf_set = ImpactFuncSet([imp_fun])
    return impf_set
haz = Hazard.from_hdf5(HAZ_DEMO_H5)
exp_base = Exposures.from_hdf5(EXP_DEMO_H5)
# It is a good idea to assign the centroids to the base exposures in order to avoid repeating this operation for each sample
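# Sketch of the centroid assignment referred to in the comment above (cell not shown):
exp_base.assign_centroids(haz)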
import scipy as sp
from climada.engine.unsequa import InputVar
exp_distr = {
"x_exp": sp.stats.beta(10, 1.1)
} # This is not really a reasonable distribution but is used
# here to show that you can use any scipy distribution.
impf_distr = {
"G": sp.stats.truncnorm(0.5, 1.5),
"v_half": sp.stats.uniform(35, 65),
"vmin": sp.stats.uniform(0, 15),
"k": sp.stats.uniform(1, 4),
}
impf_iv = InputVar(impf_func, impf_distr)
exp_iv = InputVar(exp_func, exp_distr)  # assumed continuation; exp_func as defined earlier
ax = exp_iv.plot(figsize=(6, 4))
plt.yticks(fontsize=16)
plt.xticks(fontsize=16);
Next, we generate samples for the uncertainty parameters using the default methods. Note that depending on the chosen SAlib method, the effective number of samples differs from the input variable N. For the default ‘saltelli’, with calc_second_order=True, the effective number is N(2D+2), with D the number of uncertainty parameters. See SAlib for more information.
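The cell that builds the CalcImpact object and draws the samples is not reproduced here. A minimal sketch, assuming the three input variables defined above and the default ‘saltelli’ sampling (N is an illustrative choice):
from climada.engine.unsequa import CalcImpact

calc_imp = CalcImpact(exp_iv, impf_iv, haz)
output_imp = calc_imp.make_sample(N=2**7)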
output_imp.plot_sample(figsize=(15, 8));
Now we can compute the value of the impact metrics for all the samples. In this example, we additionally chose to restrict the return periods to 50, 100, and 250 years. By default, eai_exp and at_event are not stored.
The distributions of the metric outputs are stored as dictionaries of pandas DataFrames. The metrics are taken directly from the output of the CLIMADA impact calculation. For each metric, one DataFrame is made.
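The corresponding computation cell is not reproduced here; a minimal sketch with the return-period restriction mentioned above:
output_imp = calc_imp.uncertainty(output_imp, rp=[50, 100, 250])
output_imp.uncertainty_metrics  # attribute name assumed; lists the stored metrics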
['aai_agg', 'freq_curve']
aai_agg
1531 2.905571e+09
1532 3.755172e+09
1533 1.063119e+09
1534 2.248718e+09
1535 1.848139e+09
Accessing the uncertainty is in general done via the method get_uncertainty(). If no metrics are specified, all metrics are returned.
output_imp.get_uncertainty().tail()
The distributions of the one-dimensional metrics (eai_exp and at_event are never shown with this method) can be visualised with plots.
output_imp.plot_uncertainty(figsize=(12, 12));
Now that a distribution of the impact metrics has been computed for each sample, we can also compute the sensitivity indices for each metric with respect to each uncertainty parameter. Note that the chosen method for the sensitivity analysis should correspond to its sampling partner as defined in the SAlib package.
The sensitivity indices dictionaries output by the SAlib methods are stored in the same structure of nested dictionaries as the metrics distributions. Note that, depending on the chosen sensitivity analysis method, the returned indices dictionary will contain specific types of sensitivity indices with specific names. Please get familiar with SAlib for more information.
Note that in our case several of the second-order sensitivity indices are negative. For the default method sobol, this indicates that the algorithm has not converged and cannot give reliable values for these sensitivity indices. If this happens, please use a larger number of samples. Here we will focus on the first-order indices.
output_imp = calc_imp.sensitivity(output_imp)
output_imp.sensitivity_metrics
['aai_agg', 'freq_curve']
output_imp.get_sens_df("aai_agg").tail()
To obtain the sensitivity in terms of a particular sensitivity index, use the method get_sensitivity(). If none is specified, the value of the index for all metrics is returned.
output_imp.get_sensitivity("S1")
Sometimes, it is useful to simply know what is the largest sensitivity index for each metric.
output_imp.get_largest_si(salib_si="S1")
The value of the sensitivity indices can be plotted for each metric that is one-dimensional (eai_exp and at_event are
not shown in this plot).
We see that both the errors in freq_curve and in aai_agg are mostly determined by x_exp and v_half. Finally, we
see small differences in the sensitivity of the different return periods.
Note that since we have quite a few measures, the imp_meas_fut and imp_meas_pres plots are too crowded. We can select
only the other metrics easily. In addition, instead of showing first order sensitivity ‘S1’, we can plot the total sensitivity
‘ST’.
One can also visualise the second-order sensitivity indices in the form of a correlation matrix.
output_imp.plot_sensitivity_second_order(figsize=(12, 8));
We shall use the same uncertainty variables as in the previous section but show a few possibilities to use non-default
method arguments.
output_imp2.plot_sample(figsize=(15, 8));
start = time.time()
output_imp2 = calc_imp2.uncertainty(
output_imp2, rp=[50, 100, 250], calc_eai_exp=True, calc_at_event=True, processes=4
)
end = time.time()
time_passed = end - start
print(f"Time passed with pool: {time_passed}")
start2 = time.time()
output_imp2 = calc_imp2.uncertainty(
output_imp2, rp=[50, 100, 250], calc_eai_exp=True, calc_at_event=True
)
end2 = time.time()
time_passed_nopool = end2 - start2
print(f"Time passed without pool: {time_passed_nopool}")
# Add the original value of the impacts (without uncertainty) to the uncertainty plot
from climada.engine import ImpactCalc
# Use the method 'rbd_fast', which is recommended in pair with 'latin'. In addition, change one of the kwargs.
Since we computed the distribution and sensitivity indices for the total impact at each exposure point, we can plot a map of the largest sensitivity index in each exposure location. For every location, the most sensitive parameter is v_half, meaning that the average annual impact at each location is most sensitive to the uncertainty in the impact function slope scaling parameter.
output_imp2.plot_sensitivity_map();
output_imp2.get_largest_si(salib_si="S1", metric_list=["eai_exp"]).tail()
CalcDeltaImpact
The main goal of this class is to perform an uncertainty and sensitivity analysis of the “delta” impact between a reference
state and future (or any other “to be compared”) state.
Classical example: risk increase in the future with climate change and socio-economic development. In this case, the uncertainty and sensitivity analysis is performed on the estimated (delta) risk increase in the future relative to the present-day baseline.
The uncertainty and sensitivity analysis for CalcDeltaImpact is completely analogous to the Impact case. It is slightly
more complex as there are more input variables.
Note that the logic of this class works with any comparison between an initial (reference) and a final (altered) risk or impact state, and is not limited to the scope of climate change and socio-economic development in the future.
import numpy as np
Load the hazard set and apply climate change factors to it. This yields a hazard representation in 2050 under 4 RCP
scenarios. For a full documentation of this function please refer to the TropCyclone tutorial.
# pack future hazard sets into dictionary - we want to sample from this dictionary later
exp_base = Exposures.from_hdf5(EXP_DEMO_H5)
# It is a good idea to assign the centroids to the base exposures in order to avoid repeating this operation for each sample
import scipy as sp
from climada.engine.unsequa import InputVar
exp_distr = {
"x_exp": sp.stats.beta(10, 1.1)
} # This is not really a reasonable distribution but is used
# here to show that you can use any scipy distribution.
impf_distr = {
"G": sp.stats.truncnorm(0.5, 1.5),
"v_half": sp.stats.uniform(35, 65),
"vmin": sp.stats.uniform(0, 15),
"k": sp.stats.uniform(1, 4),
}
impf_iv = InputVar(impf_func, impf_distr)
Next we define the function for the future hazard representation. It’s a simple function that allows us to draw from
the hazard dictionary of hazard sets under different RCP scenarios. Note, we do not investigate other hazard related
uncertainties in this example.
# future
def haz_fut_func(rcp_scenario):
haz_fut = tc_haz_fut_dict[rcp_key[rcp_scenario]]
return haz_fut
In contrast to CalcImpact, we define InputVars for the initial and final states of the exposure, impact function, and hazard. This class requires 6 input variables. For the sake of simplicity, we did not define varying input variables for the initial and future exposure and vulnerability in this example. Hence, exp_iv and impf_iv are passed to CalcDeltaImpact twice.
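The instantiation cell is not reproduced here. A minimal sketch, using hypothetical names for the present-day hazard (haz_initial) and the future-hazard input variable (haz_fut_iv), and assuming CalcDeltaImpact takes the initial and then the final (exposure, impact function, hazard) variables in that order:
from climada.engine.unsequa import CalcDeltaImpact

# Future hazard: sample one of the four RCP scenarios (distribution is illustrative)
haz_fut_iv = InputVar(haz_fut_func, {"rcp_scenario": sp.stats.randint(low=0, high=4)})

calc_imp = CalcDeltaImpact(
    exp_iv, impf_iv, haz_initial,  # initial (present-day) state; haz_initial is a hypothetical name
    exp_iv, impf_iv, haz_fut_iv,   # final (future) state
)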
The input parameter x_exp is shared among at least 2 input variables. Their uncertainty is thus computed with the same samples for this input parameter.
The input parameter G is shared among at least 2 input variables. Their uncertainty is thus computed with the same samples for this input parameter.
The input parameter v_half is shared among at least 2 input variables. Their uncertainty is thus computed with the same samples for this input parameter.
The input parameter vmin is shared among at least 2 input variables. Their uncertainty is thus computed with the same samples for this input parameter.
The input parameter k is shared among at least 2 input variables. Their uncertainty is thus computed with the same samples for this input parameter.
output_imp = calc_imp.make_sample(N=2**7)
output_imp.get_samples_df().tail()
output_imp = calc_imp.uncertainty(output_imp)
Plotting functionalities work analogous to CalcImpact. By setting calc_delta=True, the axis labels are adjusted.
output_imp.plot_uncertainty(calc_delta=True)
output_imp.plot_rp_uncertainty(calc_delta=True)
# compute sensitivity
output_imp = calc_imp.sensitivity(output_imp)
# plot sensitivity
output_imp.plot_sensitivity()
The rest of the functionalities that apply to CalcImpact also work for the CalcDeltaImpact class. Hence, refer to the CalcImpact examples above for further details.
CalcCostBenefit
The uncertainty and sensitivity analysis for CostBenefit is completely analogous to the Impact case. It is slightly more
complex as there are more input variables.
import copy
from climada.util.constants import ENT_DEMO_TODAY, ENT_DEMO_FUTURE, HAZ_DEMO_H5
from climada.entity import Entity
from climada.hazard import Hazard
# The enclosing function definition is truncated in this extract; its name (ent_today_func) is assumed.
def ent_today_func(x_ent):
    entity = Entity.from_excel(ENT_DEMO_TODAY)
    entity.exposures.ref_year = 2018
    entity.exposures.gdf["value"] *= x_ent
    return entity
# Entity in the future has a ±10% uncertainty in the cost of all the adaptation measures
def ent_fut_func(m_fut_cost):
# In-function imports needed only for parallel computing on Windows
from climada.entity import Entity
from climada.util.constants import ENT_DEMO_FUTURE
entity = Entity.from_excel(ENT_DEMO_FUTURE)
entity.exposures.ref_year = 2040
for meas in entity.measures.get_measure("TC"):
meas.cost *= m_fut_cost
return entity
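The cells wrapping these functions into input variables are not shown here. A minimal sketch with illustrative distributions (the ±10% bound for the future measure costs follows the comment above; ent_today_func is the assumed name of the present-day entity function):
ent_today_iv = InputVar(ent_today_func, {"x_ent": sp.stats.uniform(0.9, 0.2)})
ent_fut_iv = InputVar(ent_fut_func, {"m_fut_cost": sp.stats.uniform(0.9, 0.2)})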
haz_base = Hazard.from_hdf5(HAZ_DEMO_H5)
def haz_fut_func(x_haz_fut):
    haz = copy.deepcopy(haz_base)
    haz.intensity = haz.intensity.multiply(x_haz_fut)
    return haz
haz_today = haz_base
haz_fut_distr = {
"x_haz_fut": sp.stats.uniform(1, 3),
}
haz_fut_iv = InputVar(haz_fut_func, haz_fut_distr)
ent_avg = ent_today_iv.evaluate()
ent_avg.exposures.gdf.head()
For examples of how to use non-default parameters, please see the impact example above.
unc_cb = CalcCostBenefit(
haz_input_var=haz_today,
ent_input_var=ent_today_iv,
haz_fut_input_var=haz_fut_iv,
ent_fut_input_var=ent_fut_iv,
)
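The sampling cell is not shown here. A minimal sketch that switches off second-order interactions, as referenced in the sensitivity call further below (N is an illustrative choice):
output_cb = unc_cb.make_sample(N=2**5, sampling_kwargs={"calc_second_order": False})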
# with pool
output_cb = unc_cb.uncertainty(output_cb, processes=4)
Net Present Values: Measure | Cost (USD bn) | Benefit (USD bn) | Benefit/Cost (the printed table rows are not reproduced in this extract)
The output of CostBenefit.calc is rather complex in its structure. The metrics dictionary inherits this complexity.
['imp_meas_present',
'imp_meas_future',
'tot_climate_risk',
'benefit',
'cost_ben_ratio']
We can plot the distributions for the top metrics of our choice.
Analogously to the impact example, now that we have a metric distribution, we can compute the sensitivity indices.
Since we used the default sampling method, we can use the default sensitivity analysis method. However, since we used
calc_second_order = False for the sampling, we need to specify the same for the sensitivity analysis.
output_cb = unc_cb.sensitivity(
output_cb, sensitivity_kwargs={"calc_second_order": False}
)
The sensitivity indices can be plotted. For the default method ‘sobol’, by default the ‘S1’ sensitivity index is plotted.
Note that since we have quite a few measures, the plot must be adjusted a bit or dropped. Also see that for many metrics, the sensitivity to certain uncertainty parameters appears to be 0. However, this result is to be treated with care. Indeed, for demonstration purposes we used a rather low number of samples, which is indicated by the large confidence intervals (vertical black lines) for most sensitivity indices. For a more robust result, the analysis should be repeated with more samples.
Advanced examples
Coupled variables
In this example, we show how you can define correlated input variables. Suppose your exposures and hazards are condi-
tioned on the same Shared Socio-economic Pathway (SSP). Then, you want that only exposures and hazard belonging to
the same SSP are present in each sample.
In order to achieve this, you must simply define an uncertainty parameter that shares the same name and the same distri-
bution for both the exposures and the hazard uncertainty variables.
In this example we look at the case where many scenarios are tested in the uncertainty analysis. For instance, suppose
you have data for different Shared Socio-economic Pathways (SSP) and different Climate Change projections. From the
SSPs, you have a number of Exposures, saved to files. From the climate projections, you have a number of Hazards, saved
to file.
The task is to sample from the SSPs and the Climate change scenarios for the uncertainty and sensitivity analysis efficiently.
For demonstration purposes, we will use below as exposure files the LitPop exposures for three countries, and as hazard files the winter storms for the same three countries. Instead of having SSPs, we now want to only combine exposures and hazards of the same countries.
client = Client()
def get_litpop(iso):
return client.get_litpop(country=iso)
def get_ws(iso):
properties = {
"country_iso3alpha": iso,
}
return client.get_hazard("storm_europe", properties=properties)
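The cells that build the per-country lists and the two uncertainty functions are not reproduced here. A minimal sketch, with an illustrative country selection, assuming the functions pick one country via the shared cnt parameter and apply the scaling parameters used in the distributions below:
import copy

iso_list = ["CHE", "DEU", "ITA"]  # illustrative choice of countries
exp_list = [get_litpop(iso) for iso in iso_list]
haz_list = [get_ws(iso) for iso in iso_list]

def exp_func(x_exp, cnt, exp_list=exp_list):
    exp = exp_list[int(cnt)].copy()
    exp.gdf["value"] *= x_exp
    return exp

def haz_func(i_haz, cnt, haz_list=haz_list):
    haz = copy.deepcopy(haz_list[int(cnt)])
    haz.intensity = haz.intensity.multiply(i_haz)
    return haz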
exp_distr = {
"x_exp": sp.stats.uniform(0.9, 0.2),
"cnt": sp.stats.randint(
low=0, high=len(exp_list)
), # use the same parameter name accross input variables
}
exp_iv = InputVar(exp_func, exp_distr)
haz_distr = {
"i_haz": sp.stats.norm(1, 0.2),
"cnt": sp.stats.randint(low=0, high=len(haz_list)),
}
haz_iv = InputVar(haz_func, haz_distr)
impf = ImpfStormEurope.from_schwierz()
impf_set = ImpactFuncSet()
impf_set.append(impf)
impf_iv = InputVar.impfset([impf_set], bounds_mdd=[0.9, 1.1])
The input parameter cnt is shared among at least 2 input variables. Their uncertainty is thus computed with the same samples for this input parameter.
# as we can see, there is only a single input parameter "cnt" to select the country for both the exposures and the hazard
output_imp.samples_df.tail()
output_imp = calc_imp.uncertainty(output_imp)
output_imp.aai_agg_unc_df.tail()
Loading Hazards or Exposures from file is a rather lengthy operation. Thus, we want to minimize the reading operations, ideally reading each file only once. Simultaneously, Hazard and Exposures objects can be large in memory, so we would like to have at most one of each loaded at a time. Thus, we do not want to use the list capability of the helper methods InputVar.exposures and InputVar.hazard.
For demonstration purposes, we will use below as exposure files the LitPop exposures for three countries, and as hazard files the winter storms for the same three countries. Note that this does not make a lot of sense for an uncertainty analysis. For your use case, please replace the set of exposure and/or hazard files with meaningful sets, for instance sets of exposures for different resolutions or hazards for different model runs.
client = Client()
def get_litpop_path(iso):
properties = {
"country_iso3alpha": iso,
"res_arcsec": "150",
"exponents": "(1,1)",
"fin_mode": "pc",
}
litpop_datasets = client.list_dataset_infos(
data_type="litpop", properties=properties
)
ds = litpop_datasets[0]
download_dir, ds_files = client.download_dataset(ds)
return ds_files[0]
def get_ws_path(iso):
properties = {
"country_iso3alpha": iso,
}
hazard_datasets = client.list_dataset_infos(
data_type="storm_europe", properties=properties
)
ds = hazard_datasets[0]
download_dir, ds_files = client.download_dataset(ds)
return ds_files[0]
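The cell collecting the file paths is not shown here; a minimal sketch with an illustrative country selection:
iso_list = ["CHE", "DEU", "ITA"]  # illustrative
f_exp_list = [get_litpop_path(iso) for iso in iso_list]
f_haz_list = [get_ws_path(iso) for iso in iso_list]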
# Assumed enclosing definition (truncated in this extract): load the exposure file selected by f_exp
def exp_func(x_exp, f_exp, f_exp_list=f_exp_list):
    exp_base = Exposures.from_hdf5(f_exp_list[int(f_exp)])
    exp = exp_base.copy()
    exp.gdf["value"] *= x_exp
    return exp
exp_distr = {
"x_exp": sp.stats.uniform(0.9, 0.2),
"f_exp": sp.stats.randint(low=0, high=len(f_exp_list)),
}
exp_iv = InputVar(exp_func, exp_distr)
# Assumed enclosing definition (truncated in this extract): load the hazard file selected by f_haz
def haz_func(i_haz, f_haz, f_haz_list=f_haz_list):
    haz_base = Hazard.from_hdf5(f_haz_list[int(f_haz)])
    haz = copy.deepcopy(haz_base)
    haz.intensity *= i_haz
    return haz
haz_distr = {
"i_haz": sp.stats.norm(1, 0.2),
"f_haz": sp.stats.randint(low=0, high=len(f_haz_list)),
}
haz_iv = InputVar(haz_func, haz_distr)
# Assumed enclosing definition (truncated in this extract); parameter names follow their use in impf_distr below
def impf_func(G, v_half, vmin, k, _id=1):
    def sigmoid_func(v, G, v_half, vmin, k):
        xhi = max((v - vmin), 0) / (v_half - vmin)
        return G * xhi**k / (1 + xhi**k)

    imp_fun = ImpactFunc()
    imp_fun.haz_type = "WS"
    imp_fun.id = _id
    imp_fun.intensity_unit = "m/s"
    imp_fun.intensity = np.linspace(0, 150, num=100)
    imp_fun.mdd = np.repeat(1, len(imp_fun.intensity))
    imp_fun.paa = np.array(
        [sigmoid_func(v, G, v_half, vmin, k) for v in imp_fun.intensity]
    )
    imp_fun.check()
    impf_set = ImpactFuncSet()
    impf_set.append(imp_fun)
    return impf_set
impf_distr = {
"G": sp.stats.truncnorm(0.5, 1.5),
"v_half": sp.stats.uniform(35, 65),
"vmin": sp.stats.uniform(0, 15),
"k": sp.stats.uniform(1, 4),
}
impf_iv = InputVar(impf_func, impf_distr)
Now that the samples have been generated, it is crucial to order the samples in order to minimize the number of times files have to be loaded. In this case, loading the hazards takes more time than loading the exposures. We thus sort first by hazards (which then each have to be loaded a single time), and then by exposures (which each have to be loaded once per hazard).
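The ordering step itself is not shown in this extract. One way to do it, as a sketch using plain pandas on the samples dataframe (sorting first by hazard file, then by exposure file):
output_imp.samples_df.sort_values(by=["f_haz", "f_exp"], inplace=True, ignore_index=True)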
We can verify how the samples are ordered. In the graph below, it is confirmed that the hazards are ordered, and thus each hazard will be loaded only once. The exposures, on the other hand, change at most once per hazard.
e = output_imp.samples_df["f_exp"].values
h = output_imp.samples_df["f_haz"].values
Note that due to the very small number of samples chosen here for illustrative purposes, not all combinations of hazard
and exposures are part of the samples. This is due to the nature of the Sobol sequence (default sampling method).
plt.plot(e, label="exposures")
plt.plot(h, label="hazards")
plt.xlabel("samples")
plt.ylabel("file number")
plt.legend();  # assumed continuation of the truncated cell
output_imp = calc_imp.uncertainty(output_imp)
import warnings
Exposures
The following types of uncertainties can be added:
• ET: scale the total value (homogeneously)
The value at each exposure point is multiplied by a number sampled uniformly from a distribution with (min,
max) = bounds_totvalue
• EN: multiplicative noise (inhomogeneous)
The value of each exposure point is independently multiplied by a random number sampled uniformly from
a distribution with (min, max) = bounds_noise. EN is the value of the seed for the uniform random number
generator.
• EL: sample uniformly from exposure list
From the provided list of exposures, elements are uniformly sampled. For example, LitPop instances with different exponents.
If a bounds argument is None, the corresponding parameter is assumed to have no uncertainty.
exp_base = Exposures.from_hdf5(EXP_DEMO_H5)
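The helper call that defines exp_iv is not shown here. A minimal sketch with assumed keyword names (bounds_totval, bounds_noise; they may differ between CLIMADA versions) and illustrative bounds:
exp_iv = InputVar.exposures(
    [exp_base],
    bounds_totval=[0.9, 1.1],  # ET: scale the total value
    bounds_noise=[0.9, 1.2],   # EN: multiplicative noise per exposure point
)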
# The difference in total value between the base exposure and the average input uncertainty exposure
# due to the random noise on each exposure point (the average change in the total value is 1.0).
avg_exp = exp_iv.evaluate()
(sum(avg_exp.gdf["value"]) - sum(exp_base.gdf["value"])) / sum(exp_base.gdf["value"])
0.03700231587024304
# The values for EN are seeds for the random number generator for the noise sampling and ...
# Define a generic method to make litpop instances with different exponent pairs.
from climada.entity import LitPop
def generate_litpop_base(
    impf_id, value_unit, haz, assign_centr_kwargs, choice_mn, **litpop_kwargs
):
    # In-function imports needed only for parallel computing on Windows
    from climada.entity import LitPop

    litpop_base = []
    for [m, n] in choice_mn:
        print("\n Computing litpop for m=%d, n=%d \n" % (m, n))
        litpop_kwargs["exponents"] = (m, n)
        exp = LitPop.from_countries(**litpop_kwargs)
        exp.gdf["impf_" + haz.haz_type] = impf_id
        exp.gdf.drop("impf_", axis=1, inplace=True)
        if value_unit is not None:
            exp.value_unit = value_unit
        exp.assign_centroids(haz, **assign_centr_kwargs)
        litpop_base.append(exp)
    return litpop_base
haz = Hazard.from_hdf5(HAZ_DEMO_H5)
choice_mn = [[0, 0.5], [0, 1], [0, 2]] # Choice of exponents m,n
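The cell defining the keyword arguments passed to generate_litpop_base is not shown here. A minimal sketch with illustrative values (country, resolution and impact function id are assumptions):
impf_id = 1
value_unit = "USD"
litpop_kwargs = {
    "countries": ["CUB"],  # illustrative country
    "res_arcsec": 300,     # illustrative resolution
    "reference_year": 2020,
}
assign_centr_kwargs = {}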
litpop_list = generate_litpop_base(
impf_id, value_unit, haz, assign_centr_kwargs, choice_mn, **litpop_kwargs
)
# To choose n=0.5, we have to set EL=1 (the index of 0.5 in choice_n = [0, 0.5, 1, 2])
pop_half = litpop_iv.evaluate(ET=1, EL=1)
pop_half.gdf.tail()
impf_TC centr_TC
5519 1 619
5520 1 619
5521 1 618
5522 1 617
5523 1 617
pop_half.plot_hexbin();
# To choose n=1, we have to set EL=2 (the index of 1 in choice_n = [0, 0.5, 1, 2])
pop_one = litpop_iv.evaluate(ET=1, EL=2)
pop_one.gdf.tail()
impf_TC centr_TC
5519 1 619
5520 1 619
5521 1 618
5522 1 617
5523 1 617
pop_one.plot_hexbin();
# The values for EN are seeds for the random number generator for the noise sampling.
Hazard
The following types of uncertainties can be added:
• HE: sub-sampling events from the total event set
For each sub-sample, n_ev events are sampled with replacement. HE is the value of the seed for the uniform
random number generator.
• HI: scale the intensity of all events (homogeneously)
The intensity of all events is multiplied by a number sampled uniformly from a distribution with (min, max)
= bounds_int
• HA: scale the fraction of all events (homogeneously)
The fraction of all events is multiplied by a number sampled uniformly from a distribution with (min, max)
= bounds_frac
• HF: scale the frequency of all events (homogeneously)
The frequency of all events is multiplied by a number sampled uniformly from a distribution with (min, max)
= bounds_freq
• HL: sample uniformly from the hazard list
From the provided list of hazards, elements are uniformly sampled. For example, hazard outputs from
dynamical models for different input factors.
If bounds is None, the corresponding parameter is assumed to have no uncertainty.
haz_base = Hazard.from_hdf5(HAZ_DEMO_H5)
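The corresponding hazard input variable is likewise omitted in this excerpt. A minimal sketch, assuming the InputVar.haz helper and the bounds arguments described above (the bound values and the number of sub-sampled events are illustrative only):
# Sketch: hazard input variable with event sub-sampling (HE) and homogeneous
# scaling of intensity (HI) and frequency (HF); argument names assumed.
haz_iv = InputVar.haz(
    haz_list=[haz_base],
    n_ev=10,
    bounds_int=[0.9, 1.1],
    bounds_freq=[0.9, 1.1],
)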
Note that HE is not a univariate distribution; for each sample it corresponds to the names of the sub-sampled events.
However, to simplify the data stream, HE is saved as the seed for the random number generator that made the sample.
Hence, the value of HE is a label for the given sample. If really needed, the exact chosen events can be obtained as
follows.
import numpy as np
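The code that recovers the sub-sampled events is not part of this excerpt. The following is an illustrative sketch only: it assumes that the sub-sampling internally draws n_ev event names with replacement from a NumPy RandomState seeded with the value of HE, which may differ from the actual implementation.
HE = 12345  # hypothetical seed value taken from one sample
rng = np.random.RandomState(int(HE))
# Draw 10 event names with replacement, mimicking the assumed sub-sampling
chosen_events = list(rng.choice(haz_base.event_name, size=10))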
'1998209N11335'
# The values for HE are seeds for the random number generator for the event sub-sampling.
ImpactFuncSet
The following types of uncertainties can be added:
• MDD: scale the mdd (homogeneously)
The value of mdd at each intensity is multiplied by a number sampled uniformly from a distribution with
(min, max) = bounds_mdd
• PAA: scale the paa (homogeneously)
The value of paa at each intensity is multiplied by a number sampled uniformly from a distribution with
(min, max) = bounds_paa
• IFi: shift the intensity (homogeneously)
All intensity values are shifted by a random number sampled uniformly from a distribution with (min,
max) = bounds_int
• IL: sample uniformly from the impact function set list
From the provided list of impact function sets, elements are uniformly sampled. For example, impact
functions obtained from different calibration methods.
impf = ImpfTropCyclone.from_emanuel_usa()
impf_set_base = ImpactFuncSet([impf])
It is necessary to specify the hazard type and the impact function id. For simplicity, the default uncertainty input variable
only looks at the uncertainty on one single impact function.
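The uncertainty input variable impf_iv used below is not shown in this excerpt. A minimal sketch, assuming the InputVar.impfset helper (argument names and bound values are assumptions based on the descriptions above):
# Sketch: impact function input variable with uncertainty on mdd, paa and the
# intensity shift, for hazard type TC and impact function id 1.
impf_iv = InputVar.impfset(
    impf_set_list=[impf_set_base],
    bounds_mdd=[0.9, 1.1],
    bounds_paa=[0.8, 1.2],
    bounds_impfi=[-2, 5],
    haz_id_dict={"TC": [1]},
)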
# Plot the impact function for 50 random samples (note for the expert, these are not global)
n = 50
ax = impf_iv.evaluate().plot()
inten = impf_iv.distr_dict["IFi"].rvs(size=n)
mdd = impf_iv.distr_dict["MDD"].rvs(size=n)
for i, m in zip(inten, mdd):
impf_iv.evaluate(IFi=i, MDD=m).plot(axis=ax)
ax.get_legend().remove()
Entity
The following types of uncertainties can be added:
• DR: value of the constant discount rate (homogeneously)
The value of the discount rate for all years is sampled uniformly from a distribution with (min, max) =
bounds_disc
• CO: scale the cost (homogeneously)
The cost of all measures is multiplied by the same number sampled uniformly from a distribution with (min,
max) = bounds_cost
• ET: scale the total value (homogeneously)
The value at each exposure point is multiplied by a number sampled uniformly from a distribution with (min,
max) = bounds_totval
• EN: multiplicative noise (inhomogeneous)
The value of each exposure point is independently multiplied by a random number sampled uniformly from
a distribution with (min, max) = bounds_noise. EN is the value of the seed for the uniform random number
generator.
• EL: sample uniformly from the exposures list
From the provided list of exposures, elements are uniformly sampled. For example, LitPop instances with
different exponents.
• MDD: scale the mdd (homogeneously)
The value of mdd at each intensity is multiplied by a number sampled uniformly from a distribution with
(min, max) = bounds_mdd
• PAA: scale the paa (homogeneously)
The value of paa at each intensity is multiplied by a number sampled uniformly from a distribution with
(min, max) = bounds_paa
• IFi: shift the intensity (homogeneously)
All intensity values are shifted by a random number sampled uniformly from a distribution with (min,
max) = bounds_impfi
If bounds is None, the corresponding parameter is assumed to have no uncertainty.
ent = Entity.from_excel(ENT_DEMO_TODAY)
ent.exposures.ref_year = 2018
ent.check()
ent_iv = InputVar.ent(
impf_set_list=[ent.impact_funcs],
disc_rate=ent.disc_rates,
exp_list=[ent.exposures],
meas_set=ent.measures,
bounds_disc=[0, 0.08],
bounds_cost=[0.5, 1.5],
bounds_totval=[0.9, 1.1],
bounds_noise=[0.3, 1.9],
bounds_mdd=[0.9, 1.05],
bounds_paa=None,
bounds_impfi=[-2, 5],
haz_id_dict={"TC": [1]},
)
ent_iv.plot();
# Define a generic method to make litpop instances with different exponent pairs.
from climada.entity import LitPop
def generate_litpop_base(
impf_id, value_unit, haz, assign_centr_kwargs, choice_mn, **litpop_kwargs
):
# In-function imports needed only for parallel computing on Windows
from climada.entity import LitPop
litpop_base = []
for [m, n] in choice_mn:
print("\n Computing litpop for m=%d, n=%d \n" % (m, n))
litpop_kwargs["exponents"] = (m, n)
exp = LitPop.from_countries(**litpop_kwargs)
exp.gdf["impf_" + haz.haz_type] = impf_id
exp.gdf.drop("impf_", axis=1, inplace=True)
if value_unit is not None:
exp.value_unit = value_unit
exp.assign_centroids(haz, **assign_centr_kwargs)
litpop_base.append(exp)
return litpop_base
haz = Hazard.from_hdf5(HAZ_DEMO_H5)
choice_mn = [[1, 0.5], [0.5, 1], [1, 1]] # Choice of exponents m,n
litpop_list = generate_litpop_base(
impf_id, value_unit, haz, assign_centr_kwargs, choice_mn, **litpop_kwargs
)
ent = Entity.from_excel(ENT_DEMO_TODAY)
ent.exposures.ref_year = 2020
ent.check()
ent_iv.evaluate().exposures.plot_hexbin();
Entity Future
The following types of uncertainties can be added:
• CO: scale the cost (homogeneously)
The cost of all measures is multiplied by the same number sampled uniformly from a distribution with (min,
max) = bounds_cost
• EG: scale the exposures growth (homogeneously)
The value at each exposure point is multiplied by a number sampled uniformly from a distribution with (min,
max) = bounds_eg
• EN: multiplicative noise (inhomogeneous)
The value of each exposure point is independently multiplied by a random number sampled uniformly from
a distribution with (min, max) = bounds_noise. EN is the value of the seed for the uniform random number
generator.
• EL: sample uniformly from the exposures list
From the provided list of exposures, elements are uniformly sampled. For example, LitPop instances with
different exponents.
• MDD: scale the mdd (homogeneously)
The value of mdd at each intensity is multiplied by a number sampled uniformly from a distribution with
(min, max) = bounds_mdd
• PAA: scale the paa (homogeneously)
The value of paa at each intensity is multiplied by a number sampled uniformly from a distribution with
(min, max) = bounds_paa
• IFi: shift the impact function intensity (homogeneously)
All intensity values are shifted by a random number sampled uniformly from a distribution with (min,
max) = bounds_impfi
• IL: sample uniformly from the impact function set list
From the provided list of impact function sets, elements are uniformly sampled. For example, impact
functions obtained from different calibration methods.
If bounds is None, the corresponding parameter is assumed to have no uncertainty.
ent_fut = Entity.from_excel(ENT_DEMO_FUTURE)
ent_fut.exposures.ref_year = 2040
ent_fut.check()
entfut_iv = InputVar.entfut(
impf_set_list=[ent_fut.impact_funcs],
exp_list=[ent_fut.exposures],
meas_set=ent_fut.measures,
bounds_cost=[0.6, 1.2],
bounds_eg=[0.8, 1.5],
bounds_noise=None,
bounds_mdd=[0.7, 0.9],
bounds_paa=[1.3, 2],
haz_id_dict={"TC": [1]},
)
# Define a generic method to make litpop instances with different exponent pairs.
from climada.entity import LitPop
def generate_litpop_base(
impf_id, value_unit, haz, assign_centr_kwargs, choice_mn, **litpop_kwargs
):
# In-function imports needed only for parallel computing on Windows
from climada.entity import LitPop
litpop_base = []
for [m, n] in choice_mn:
print("\n Computing litpop for m=%d, n=%d \n" % (m, n))
litpop_kwargs["exponents"] = (m, n)
exp = LitPop.from_countries(**litpop_kwargs)
exp.gdf["impf_" + haz.haz_type] = impf_id
exp.gdf.drop("impf_", axis=1, inplace=True)
if value_unit is not None:
exp.value_unit = value_unit
exp.assign_centroids(haz, **assign_centr_kwargs)
litpop_base.append(exp)
return litpop_base
haz = Hazard.from_hdf5(HAZ_DEMO_H5)
choice_mn = [[1, 0.5], [0.5, 1], [1, 1]] # Choice of exponents m,n
litpop_list = generate_litpop_base(
impf_id, value_unit, haz, assign_centr_kwargs, choice_mn, **litpop_kwargs
)
ent_fut = Entity.from_excel(ENT_DEMO_FUTURE)
ent_fut.exposures.ref_year = 2040
ent_fut.check()
entfut_iv = InputVar.entfut(
impf_set_list=[ent_fut.impact_funcs],
exp_list=litpop_list,
meas_set=ent_fut.measures,
bounds_cost=[0.6, 1.2],
bounds_eg=[0.8, 1.5],
bounds_noise=None,
bounds_mdd=[0.7, 0.9],
bounds_paa=[1.3, 2],
haz_id_dict={"TC": [1]},
)
# generate hazard
hazard, haz_model, run_datetime, event_date = generate_WS_forecast_hazard()
# generate hazard with forecasts from past dates (works only if the files have already been downloaded)
# generate vulnerability
impact_function = ImpfStormEurope.from_welker()
impact_function_set = ImpactFuncSet([impact_function])
Here you see a different plot highlighting the spread of the impact forecast calculated from the different ensemble members
of the weather forecast.
CH_WS_forecast.plot_hist(save_fig=False, close_fig=False);
It is possible to color the pixels depending on the probability that a certain impact threshold is reached at a certain grid
point.
CH_WS_forecast.plot_exceedence_prob(
threshold=5000, save_fig=False, close_fig=False, proj=ccrs.epsg(2056)
);
It is possible to color the cantons of Switzerland with warning colors, based on aggregated forecasted impacts in their
area.
import fiona
from cartopy.io import shapereader
from climada.util.config import CONFIG
# create a file containing the polygons of Swiss cantons using natural earth
cantons_file = CONFIG.local_data.save_dir.dir() / "cantons.shp"
adm1_shape_file = shapereader.natural_earth(
resolution="10m", category="cultural", name="admin_1_states_provinces"
)
if not cantons_file.exists():
with fiona.open(adm1_shape_file, "r") as source:
with fiona.open(cantons_file, "w", **source.meta) as sink:
for f in source:
if f["properties"]["adm0_a3"] == "CHE":
sink.write(f)
CH_WS_forecast.plot_warn_map(cantons_file, save_fig=False, close_fig=False);
import numpy as np
from pandas import DataFrame

exp_df = DataFrame()
exp_df["value"] = np.ones_like(
    hazard.centroids.lat[centroid_selection]
)  # provide value
exp_df["latitude"] = hazard.centroids.lat[centroid_selection]
exp_df["longitude"] = hazard.centroids.lon[centroid_selection]
exp_df["impf_WS"] = np.ones_like(hazard.centroids.lat[centroid_selection], int)
# Generate Exposures
exp = Exposures(exp_df)
exp.check()
exp.value_unit = "warn_level"
Each grid point now has a warn level between 1 and 5 assigned for each event. The cantons can now be colored based on
a threshold at the grid point level: for each warning level, it is assessed whether 50% of the grid points in the area of a canton have at
least a 50% probability of reaching the specified threshold.
warn_forecast.plot_warn_map(
cantons_file,
thresholds=[2, 3, 4, 5],
decision_level="exposure_point",
probability_aggregation=0.5,
area_aggregation=0.5,
title="DWD ICON METEOROLOGICAL WARNING",
explain_text="warn level based on wind gust thresholds",
save_fig=False,
close_fig=False,
proj=ccrs.epsg(2056),
);
2.9.1 Overview
The basic idea of the calibration is to find a set of parameters for an impact function that minimizes the deviation be-
tween the calculated impact and some impact data. For setting up a calibration task, users have to supply the following
information:
• Hazard and Exposure (as usual, see the tutorial)
• The impact data to calibrate the model to
• An impact function definition depending on the calibrated parameters
• Bounds and constraints of the calibrated parameters (depending on the calibration algorithm)
• A “cost function” defining the single-valued deviation between impact data and calculated impact
• A function for transforming the calculated impact into the same data structure as the impact data
This information defines the calibration task and is inserted into the Input object. Afterwards, the user may insert this
object into one of the optimizer classes. Currently, the following classes are available:
• BayesianOptimizer: Uses Bayesian optimization to sample the parameter space.
• ScipyMinimizeOptimizer: Uses the scipy.optimize.minimize function for determining the best param-
eter set.
The following tutorial walks through the input data preparation and the setup of a BayesianOptimizer instance for
calibration. For a brief example, refer to Quickstart. If you want to go through a somewhat realistic calibration task
step-by-step, continue with Calibration Data.
import logging
import climada
logging.getLogger("climada").setLevel("WARNING")
Quickstart
This section gives a very quick overview of assembling a calibration task. Here, we calibrate a single impact function for
damage reports in Mexico (MEX) from the TC with IBTrACS ID 2010176N16278.
import pandas as pd
from sklearn.metrics import mean_squared_log_error
# Create input
inp = Input(
hazard=hazard,
exposure=exposure,
data=data,
# Generate impact function from estimated parameters
impact_func_creator=lambda v_half: ImpactFuncSet(
[ImpfTropCyclone.from_emanuel_usa(v_half=v_half, impf_id=1)]
),
# Estimated parameter bounds
bounds={"v_half": (26, 100)},
# Cost function
cost_func=mean_squared_log_error,
# Transform impact to pandas Dataframe with same structure as data
    impact_to_dataframe=lambda impact: impact.impact_at_reg(exposure.gdf["region_id"]),
)
# Run optimization
with log_level("WARNING", "climada.engine.impact_calc"):
output = opt.run(controller)
# Analyse results
# Optimal value
output.params
{'v_half': 48.30330549244917}
Follow the next sections of the tutorial for a more in-depth explanation.
Each entry in the database refers to an economic impact for a specific country and TC event. The TC events are identified
by the ID assigned from the International Best Track Archive for Climate Stewardship (IBTrACS). We now want to
reshape this data so that impacts are grouped by event and country.
To achieve this, we iterate over the unique track IDs, select all reported damages associated with each ID, and concatenate
the results. For missing entries, pandas sets the value to NaN. We assume that missing entries mean that no damages
were reported (this is a strong assumption) and set all NaN values to zero. Then, we transpose the dataframe so that each
row represents an event and each column states the damage for a specific country. Finally, we set the track ID to be the
index of the data frame.
track_ids = emdat_subset["ibtracsID"].unique()
data = pd.pivot_table(
emdat_subset,
values="emdat_impact_scaled",
index="ibtracsID",
columns="region_id",
# fill_value=0,
)
data
region_id 28 44 92 132 \
This is the data against which we want to compare our model output. Let’s continue setting up the calibration!
client = Client()
exposure = LitPop.concat(
[
client.get_litpop(country_to_iso(country_id, representation="alpha3"))
        for country_id in data.columns  # assuming the country ids are the columns of the impact data
    ]
)
client = Client()
tc_dataset_infos = client.list_dataset_infos(data_type="tropical_cyclone")
client.get_property_values(
client.list_dataset_infos(data_type="tropical_cyclone"),
known_property_values={"event_type": "observed", "spatial_coverage": "global"},
)
{'res_arcsec': ['150'],
'event_type': ['observed'],
'spatial_coverage': ['global'],
'climate_scenario': ['None']}
We will use the CLIMADA Data API to download readily computed wind fields from TC tracks. The API provides a
large dataset containing all historical TC tracks. We will download them and then select the subset of TCs for which we
have impact data by using select().
client = Client()
all_tcs = client.get_hazard(
"tropical_cyclone",
properties={"event_type": "observed", "spatial_coverage": "global"},
)
hazard = all_tcs.select(event_names=track_ids.tolist())
NOTE: Discouraged! This will usually take longer than using the Data API.
Alternatively, CLIMADA provides the TCTracks class, which lets us download the tracks of TCs using their IBTrACS
IDs. We then have to equalize the time steps of the different TC tracks.
The track and intensity of a cyclone are not sufficient to compute impacts in CLIMADA. We first have to re-compute a
wind field from each track at the locations of interest. For consistency, we simply choose the coordinates of the exposure.
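A hedged sketch of this alternative route, assuming the track IDs collected from the impact data above and re-using, for illustration, the centroids of the hazard downloaded from the Data API:
from climada.hazard import TCTracks, TropCyclone

# Download the tracks by their IBTrACS IDs (slow compared to the Data API)
tracks = TCTracks.from_ibtracs_online(storm_id=track_ids.tolist())
# Equalize the time steps of the different tracks (step in hours)
tracks.equal_timestep(time_step_h=0.5)
# Re-compute the wind fields on a set of centroids (here, the centroids of the hazard obtained above)
hazard_recomputed = TropCyclone.from_tracks(tracks, centroids=all_tcs.centroids)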
We will be using the BayesianOptimizer, which requires very little information on the parameter space beforehand.
One crucial piece of information, though, is the bounds of the parameters. Initial values are not needed because the optimizer
first samples the bounded parameter space uniformly and then iteratively “narrows down” the search. We choose a v_half
between v_thresh and 150, and a scale between 0.01 (it must never be zero) and 1.0. Specifying the bounds as a dictionary
(a must in the case of BayesianOptimizer) also serves the purpose of naming the parameters we want to calibrate. Notice
that these names have to match the arguments of the impact function generator.
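Such a bounds definition might look as follows (a sketch; the lower bound for v_half assumes a v_thresh of roughly 26 m/s, consistent with the Quickstart above):
# Parameter bounds; the keys must match the arguments of the impact function generator
bounds = {"v_half": (26.0, 150.0), "scale": (0.01, 1.0)}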
Defining the cost function is crucial for the result of the calibration. You can choose what is best suited for your application.
Often, it is not clear which function works best, and it's a good idea to try out a few. Because the impacts of different
events may vary over several orders of magnitude, we select the mean squared logarithmic error (MSLE). This and
other error measures are readily supplied by the sklearn package.
The cost function must be defined as a function that takes the impact object calculated by the optimization algorithm
and the input calibration data as arguments, and that returns a single number. This number represents a “cost” of the
parameter set used for calculating the impact. A higher cost therefore is worse, a lower cost is better. Any optimizer will
try to minimize the cost.
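As an illustration, a custom cost function wrapping the MSLE could be written as follows (a sketch; it assumes the two arguments it receives are the aligned calibration data and modelled impact data frames, as suggested by the direct use of mean_squared_log_error in the Quickstart, and that missing entries should be treated as zero):
from sklearn.metrics import mean_squared_log_error

def msle_cost(true_values, modelled_values):
    # Treat missing entries as zero impact and compute the mean squared logarithmic error
    return mean_squared_log_error(true_values.fillna(0), modelled_values.fillna(0))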
Note that the impact object is an instance of Impact, whereas the input calibration data is a pd.DataFrame. To compute
the MSLE, we first have to transform the impact into the same data structure, meaning that we have to aggregate the
point-wise impacts by event and country. The function performing this transformation task is provided to the Input via
its impact_to_dataframe attribute. Here we choose climada.engine.impact.Impact.impact_at_reg(),
which aggregates over countries by default. To improve performance, we can supply this function with our known region
IDs instead of re-computing them in every step.
Computations on data frames align columns and indexes. The indexes of the calibration data are the IBTrACS IDs, but
the indexes of the result of impact_at_reg are the hazard event IDs, which at this point are only integer numbers. To
resolve that, we adjust our calibration dataframe to carry the respective Hazard.event_id as index.
data = data.rename(
index={
hazard.event_name[idx]: hazard.event_id[idx]
for idx in range(len(hazard.event_id))
}
)
data.index.rename("event_id", inplace=True)
data
region_id 28 44 92 132 \
event_id
1333 NaN NaN NaN NaN
1339 1.394594e+07 NaN NaN NaN
1344 NaN NaN NaN NaN
1351 NaN NaN NaN NaN
1361 NaN 4.352258e+07 NaN NaN
3686 NaN NaN NaN NaN
3691 NaN NaN NaN NaN
1377 NaN NaN NaN NaN
1390 NaN NaN NaN NaN
3743 NaN NaN NaN NaN
1421 NaN NaN NaN 1281242.483
1426 NaN 8.362720e+07 NaN NaN
3777 NaN NaN NaN NaN
3795 NaN NaN NaN NaN
1450 NaN NaN NaN NaN
1454 2.111764e+08 1.801876e+06 3.000000e+09 NaN
1458 NaN NaN NaN NaN
Users can control the “density”, and thus the accuracy, of the sampling by adjusting the controller parameters. Increasing
init_points, n_iter, min_improvement_count, and max_iterations, and decreasing min_improvement,
generally increases density and accuracy, but leads to longer runtimes.
We suggest using the from_input classmethod for a convenient choice of sampling density based on the parameter
space. The two parameters init_points and n_iter are set to b^N, where N is the number of estimated parameters
and b is the sampling_base parameter, which defaults to 4.
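A sketch of this convenience constructor, assuming sampling_base is an optional keyword of from_input:
# init_points and n_iter are then set to sampling_base ** number_of_parameters
controller = BayesianOptimizerController.from_input(inp, sampling_base=4)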
Now we can finally execute our calibration task! We plug all input parameters into an instance of Input, and then
create the optimizer instance with it. The Optimizer.run method returns an Output object, whose params attribute
holds the optimal parameters determined by the calibration.
Notice that the BayesianOptimization maximizes a target function. Therefore, higher target values are better than
lower ones in this case.
from climada.util.calibrate import Input, BayesianOptimizer, BayesianOptimizerController
p_space_df = bayes_output.p_space_to_dataframe()
p_space_df
Parameters Calibration
scale v_half Cost Function
Iteration
0 0.422852 115.264302 2.726950
1 0.010113 63.349706 4.133135
2 0.155288 37.268453 0.800611
3 0.194398 68.718642 1.683610
4 0.402800 92.721038 2.046407
... ... ... ...
246 0.790590 50.164555 0.765166
247 0.788425 48.043636 0.768636
248 0.826704 49.932348 0.764991
249 0.880736 49.290532 0.767437
250 0.744523 51.596642 0.770761
In contrast, the controller only tracks the consecutive improvements of the best guess.
controller.improvements()
p_space_df["Parameters"]
scale v_half
Iteration
0 0.422852 115.264302
1 0.010113 63.349706
2 0.155288 37.268453
3 0.194398 68.718642
4 0.402800 92.721038
... ... ...
246 0.790590 50.164555
247 0.788425 48.043636
248 0.826704 49.932348
249 0.880736 49.290532
250 0.744523 51.596642
Notice that the optimal parameter set is not necessarily the last entry in the parameter space! Therefore, let’s order the
parameter space by the ascending cost function values.
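Assuming the two-level column labels shown above, this can be done with sort_values:
p_space_df = p_space_df.sort_values(("Calibration", "Cost Function"))
p_space_df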
Parameters Calibration
scale v_half Cost Function
Iteration
216 0.993744 52.255606 0.763936
173 0.985721 52.026516 0.764000
177 0.881143 51.558665 0.764718
146 0.967284 51.034702 0.764947
248 0.826704 49.932348 0.764991
... ... ... ...
35 0.028105 118.967924 6.298332
95 0.012842 102.449398 6.635728
199 0.031310 143.537900 7.262185
99 0.025663 141.236104 7.504095
232 0.010398 147.113486 9.453765
The BayesianOptimizerOutput supplies the plot_p_space() method for convenience. If we calibrated more than
two parameters, it would produce a plot for each parameter combination.
bayes_output.plot_p_space(x="v_half", y="scale")
p_space_df["Parameters"].iloc[0, :].to_dict()
Here we show how the variability in parameter combinations with similar cost function values (as seen in the plot of the
parameter space) translates to varying impact functions. In addition, the hazard value distribution is shown. Together, this
provides an intuitive overview of the robustness of the optimization, given the chosen cost function. It does NOT
provide a view of the sampling uncertainty (as, e.g., bootstrapping or cross-validation would) NOR of the suitability of the cost
function, which is chosen by the user.
This functionality is only available from the BayesianOptimizerOutputEvaluator, which is tailored to Bayesian optimizer
outputs. It includes all functions from OutputEvaluator.
The target function has limited meaning outside the calibration task. To investigate the quality of the calibration, it is
helpful to compute the impact with the impact function defined by the optimal parameters. The OutputEvaluator
readily computed this impact when it was created. You can access the impact via the impact attribute.
import numpy as np
impact_data = output_eval.impact.impact_at_reg(exposure.gdf["region_id"])
impact_data.set_index(np.asarray(hazard.event_name), inplace=True)
impact_data
28 44 92 132 \
2010176N16278 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
2010236N12341 2.382896e+07 0.000000e+00 6.319901e+07 0.000000
2010257N16282 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
2010302N09306 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
2011233N15301 0.000000e+00 1.230474e+09 2.364701e+07 0.000000
2011279N10257 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
2012215N12313 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
2012166N09269 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
2012296N14283 0.000000e+00 1.323275e+08 0.000000e+00 0.000000
2014253N13260 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
2015293N13266 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
2015242N12343 0.000000e+00 0.000000e+00 0.000000e+00 227605.609803
2015270N27291 0.000000e+00 8.433712e+05 0.000000e+00 0.000000
2016248N15255 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
2017219N16279 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
2017242N16333 3.750206e+08 1.354638e+06 4.684316e+08 0.000000
2017260N12310 0.000000e+00 0.000000e+00 3.413228e+07 0.000000
670 796
2010176N16278 0.000000e+00 0.000000e+00
2010236N12341 0.000000e+00 0.000000e+00
2010257N16282 0.000000e+00 0.000000e+00
2010302N09306 4.578844e+06 6.995127e+05
2011233N15301 0.000000e+00 1.383507e+08
2011279N10257 0.000000e+00 0.000000e+00
2012215N12313 0.000000e+00 0.000000e+00
2012166N09269 0.000000e+00 0.000000e+00
2012296N14283 0.000000e+00 0.000000e+00
2014253N13260 0.000000e+00 0.000000e+00
2015293N13266 0.000000e+00 0.000000e+00
2015242N12343 0.000000e+00 0.000000e+00
2015270N27291 0.000000e+00 0.000000e+00
2016248N15255 0.000000e+00 0.000000e+00
2017219N16279 0.000000e+00 0.000000e+00
2017242N16333 0.000000e+00 8.060133e+08
2017260N12310 0.000000e+00 9.235309e+07
We can now compare the modelled and reported impact data on a country- or event-basis. The OutputEvaluator also
has methods for that. In both of these, you can supply a transformation function with the data_transf argument. This
transforms the data to be plotted right before plotting. Recall that we set the event IDs as index for the data frames. To
better interpret the results, it is useful to transform them into event names again, which are the IBTrACS IDs. Likewise,
we use the region IDs for region identification. It might be nicer to transform these into country names before plotting.
from climada.util import coordinates as u_coord

def country_code_to_name(code):
    return u_coord.country_to_iso(code, representation="name")
event_id_to_name = {
    hazard.event_id[idx]: hazard.event_name[idx] for idx in range(len(hazard.event_id))
}
output_eval.plot_at_event(
    data_transf=lambda x: x.rename(index=event_id_to_name), logy=True
)
Finally, we can do an event- and country-based comparison with a heatmap using plot_event_region_heatmap().
Since the magnitude of the impact values may differ strongly, this method compares them on a logarithmic scale: it divides
each modelled impact by the observed impact and takes the decadic logarithm. The result tells us how many orders
of magnitude our model was off. Again, the considerations for “nicer” index and columns apply.
output_eval.plot_event_region_heatmap(
    data_transf=lambda x: x.rename(index=event_id_to_name, columns=country_code_to_name)
)
<Axes: >
hazard_irma = all_tcs.select(event_names=["2017242N16333"])
data_irma = data.loc[1454, :].to_frame().T
data_irma
Let’s first calibrate the impact function only on this event, including all data we have.
# Evaluate output
output_eval = OutputEvaluator(input, bayes_output)
output_eval.impf_set.plot()
If we now remove some of the damage reports and repeat the calibration, the respective impacts computed by the model
will be ignored. For Saint Kitts and Nevis, and for the Turks and Caicos Islands, the impact is overestimated by the model.
Removing these regions from the estimation should shift the estimated parameters accordingly, because by default, impacts
for missing data points are ignored with missing_data_value=np.nan.
However, the calibration should change in the other direction once we require the modeled impact at missing data points
to be zero:
Actively requiring that the model calibrates towards zero impact in the two dropped regions means that it will typically
strongly overestimate the impact there (because an impact actually took place). This will “flatten” the vulnerability curve,
causing a strong underestimation in the other regions.
1. Run the calibration again, but change the number of initial steps and/or iteration steps.
2. Use a different cost function, e.g., an error measure based on a ratio rather than a difference.
3. Also calibrate the v_thresh parameter. This requires adding constraints, because v_thresh < v_half.
4. Calibrate different impact functions for houses in Mexico and Puerto Rico within the same optimization task.
5. Employ the ScipyMinimizeOptimizer instead of the BayesianOptimizer.
import webbrowser
import ee
ee.Initialize()
image = ee.Image("srtm90_v4")
print(image.getInfo())
# Access a collection
collection = "LANDSAT/LE07/C01/T1" # Landsat 7 raw images collection
Once you have a collection, specify the time range and the area of interest. Then, use the methods of the
obtain_image_type(collection, time_range, area) series, depending on the type of product needed.
Time range
It depends on the image acquisition period of the targeted satellite and the type of images desired (without clouds, from a
specific period, …).
Area
GEE needs a special format for defining an area of interest. It has to be a GeoJSON polygon, and the coordinates should
first be defined in a list and then converted using ee.Geometry. It is also possible to use data obtained via the Exposures layer.
Some examples are given below.
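For instance, an area of interest and a time range could be defined as follows (a sketch; the coordinates and dates are illustrative only and do not reproduce the exact values used for the outputs below):
# Hypothetical rectangle around Dresden as a GeoJSON-style polygon (lon, lat pairs)
area_dresden = ee.Geometry.Polygon(
    [[[13.6, 50.96], [13.9, 50.96], [13.9, 51.14], [13.6, 51.14], [13.6, 50.96]]]
)
time_range_dresden = ["2002-07-28", "2002-08-05"]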
collection_dresden = "LANDSAT/LE07/C01/T1"
print(type(area_dresden))
collection_swiss = ee.ImageCollection("CIESIN/GPWv4/population-density")
print(type(collection_swiss))
<class 'ee.geometry.Geometry'>
<class 'ee.imagecollection.ImageCollection'>
<class 'ee.image.Image'>
def obtain_image_landsat_composite(collection, time_range, area):
    """
    Parameters:
        collection (): name of the collection
        time_range (['YYYY-MT-DY','YYYY-MT-DY']): must be inside the available data
        area (ee.geometry.Geometry): area of interest
    Returns:
        image_composite (ee.image.Image)
    """
    collection = ee.ImageCollection(collection)
def obtain_image_median(collection, time_range, area):
    """
    Parameters:
        collection (): name of the collection
        time_range (['YYYY-MT-DY','YYYY-MT-DY']): must be inside the available data
        area (ee.geometry.Geometry): area of interest
    Returns:
        image_median (ee.image.Image)
    """
    collection = ee.ImageCollection(collection)
def obtain_image_sentinel(collection, time_range, area):
    """
    Parameters:
        collection (): name of the collection
        time_range (['YYYY-MT-DY','YYYY-MT-DY']): must be inside the available data
        area (ee.geometry.Geometry): area of interest
    Returns:
        sentinel_median (ee.image.Image)
    """
    sentinel_filtered = (
        ee.ImageCollection(collection)
        .filterBounds(area)
        .filterDate(time_range[0], time_range[1])
        .filter(ee.Filter.lt("CLOUDY_PIXEL_PERCENTAGE", 20))
        .map(maskclouds)
    )
    sentinel_median = sentinel_filtered.median()
    return sentinel_median
# Application to examples
composite_dresden = obtain_image_landsat_composite(
collection_dresden, time_range_dresden, area_dresden
)
median_swiss = obtain_image_median(collection_swiss, time_range_swiss, area_swiss)
zurich_median = obtain_image_sentinel(collection_zurich, time_range_zurich, area_zurich)
<class 'ee.image.Image'>
<class 'ee.image.Image'>
def get_region(geom):
"""Get the region of a given geometry, needed for exporting tasks.
Parameters:
geom (ee.Geometry, ee.Feature, ee.Image): region of interest
Returns:
region (list)
"""
if isinstance(geom, ee.Geometry):
region = geom.getInfo()["coordinates"]
elif isinstance(geom, (ee.Feature, ee.Image)):
region = geom.geometry().getInfo()["coordinates"]
return region
region_dresden = get_region(area_dresden)
region_swiss = get_region(area_swiss)
region_zurich = get_region(area_zurich)
def get_url(name, image, scale, region):
    """Create a download URL for the given image (function name assumed; it is not shown in this excerpt).
    Parameters:
        name (str): name of the created folder
        image (ee.image.Image): image to export
        scale (int): resolution of export in meters (e.g: 30 for Landsat)
        region (list): region of interest
    Returns:
        path (str)
    """
    path = image.getDownloadURL({"name": (name), "scale": scale, "region": (region)})
    webbrowser.open_new_tab(path)
    return path
# For the example of Zürich, due to size, it doesn't work on Jupyter Notebook but it works on Python
print(url_swiss)
print(url_dresden)
print(url_landcover)
https://earthengine.googleapis.com/api/download?docid=6b6d96f567d6a055188c8c17dd24bcb8&token=00a796601efe425c821777a284bff361
https://earthengine.googleapis.com/api/download?docid=15182f82ba65ce24f62305e4465ac21c&token=5da59a20bb84d79bcf7ce958855fe848
https://earthengine.googleapis.com/api/download?docid=07c14e22d96a33fc72a7ba16c2178a6a&token=0cfa0cd6537257e96600d10647375ff4
import numpy as np
import matplotlib.pyplot as plt
from skimage import data
from skimage.color import rgb2gray
from skimage.io import imread

image_pop = imread(swiss_pop)
plt.figure(figsize=(12, 12))
plt.imshow(image_pop, cmap="Reds", interpolation="nearest")
plt.colorbar()
plt.axis()
plt.show()
from skimage.filters import threshold_local, threshold_otsu

global_thresh = threshold_otsu(image_dresden_crop)
binary_global = image_dresden_crop > global_thresh
block_size = 35
adaptive_thresh = threshold_local(image_dresden_crop, block_size, offset=10)
binary_adaptive = image_dresden_crop > adaptive_thresh
# Figure setup assumed: three panels comparing the original and thresholded images
fig, ax = plt.subplots(ncols=3, figsize=(12, 4))
ax[0].imshow(image_dresden_crop)
ax[0].set_title("Original")
ax[1].imshow(binary_global)
ax[1].set_title("Global thresholding")
ax[2].imshow(binary_adaptive)
ax[2].set_title("Adaptive thresholding")
for a in ax:
a.axis("off")
plt.show()
print(np.sum(binary_global))
64832
client = Client()
import pandas as pd
data_types = client.list_data_type_infos()
dtf = pd.DataFrame(data_types)
dtf.sort_values(["data_type_group", "data_type"])
properties
3 [{'property': 'crop', 'mandatory': True, 'desc...
0 [{'property': 'res_arcsec', 'mandatory': False...
5 []
2 [{'property': 'res_arcsec', 'mandatory': False...
4 [{'property': 'country_iso3alpha', 'mandatory'...
1 [{'property': 'res_arcsec', 'mandatory': True,...
litpop_dataset_infos = client.list_dataset_infos(data_type="litpop")
all_properties = client.get_property_values(litpop_dataset_infos)
all_properties.keys()
# as datasets are usually available per country, choosing a country or global dataset reduces the options
# here we want to see which datasets are available for litpop globally:
client.get_property_values(
    litpop_dataset_infos, known_property_values={"spatial_coverage": "global"}
)
{'res_arcsec': ['150'],
'exponents': ['(0,1)', '(1,1)', '(3,0)'],
'fin_mode': ['pop', 'pc'],
'spatial_coverage': ['global']}
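The same query can be restricted to a single country; a sketch of such a call, assuming Switzerland as in the output below:
client.get_property_values(
    litpop_dataset_infos, known_property_values={"country_name": "Switzerland"}
)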
{'res_arcsec': ['150'],
'exponents': ['(3,0)', '(0,1)', '(1,1)'],
'fin_mode': ['pc', 'pop'],
'spatial_coverage': ['country'],
'country_iso3alpha': ['CHE'],
'country_name': ['Switzerland'],
'country_iso3num': ['756']}
The get_hazard method gets the dataset information, downloads the data, and opens it as a hazard instance:
tc_dataset_infos = client.list_dataset_infos(data_type="tropical_cyclone")
client.get_property_values(
tc_dataset_infos, known_property_values={"country_name": "Haiti"}
)
{'res_arcsec': ['150'],
'climate_scenario': ['rcp26', 'rcp45', 'rcp85', 'historical', 'rcp60'],
'ref_year': ['2040', '2060', '2080'],
'nb_synth_tracks': ['50', '10'],
'spatial_coverage': ['country'],
'tracks_year_range': ['1980_2020'],
'country_iso3alpha': ['HTI'],
'country_name': ['Haiti'],
'country_iso3num': ['332'],
'resolution': ['150 arcsec']}
client = Client()
tc_haiti = client.get_hazard(
    "tropical_cyclone",
    properties={
        "country_name": "Haiti",
        "climate_scenario": "rcp45",
        "ref_year": "2040",
        "nb_synth_tracks": "10",
    },
)
https://climada.ethz.ch/data-api/v1/dataset climate_
,→scenario=rcp45 country_name=Haiti data_type=tropical_
,→cyclone limit=100000 name=None nb_synth_tracks=10 ref_
,→year=2040 status=active version=None
2022-07-01 15:55:23,593 - climada.util.api_client - WARNING - Download failed: /Users/
,→szelie/climada/data/hazard/tropical_cyclone/tropical_cyclone_10synth_tracks_
,→150arcsec_rcp45_HTI_2040/v1/tropical_cyclone_10synth_tracks_150arcsec_rcp45_HTI_
,→2040/v1/tropical_cyclone_10synth_tracks_150arcsec_rcp45_HTI_2040.hdf5
The get_litpop method gets the default LitPop exposure, with exponents (1,1) and 'produced capital' as financial mode. If no country is given, the global
dataset will be downloaded.
litpop_default = client.get_property_values(
    litpop_dataset_infos, known_property_values={"fin_mode": "pc", "exponents": "(1,1)"}
)
litpop = client.get_litpop(country="Haiti")
imp_fun = ImpfTropCyclone.from_emanuel_usa()
imp_fun.check()
imp_fun.plot()
imp_fun_set = ImpactFuncSet([imp_fun])
litpop.impact_funcs = imp_fun_set
crop_dataset_infos = client.list_dataset_infos(data_type="crop_production")
client.get_property_values(crop_dataset_infos)
rice_exposure = client.get_exposures(
    exposures_type="crop_production",
    properties={"crop": "ric"},  # property value assumed for illustration
)
centroids = client.get_centroids()
centroids.plot()
https://climada.ethz.ch/data-api/v1/dataset data_
,→type=centroids extent=(-180, 180, -90,␣
,→90) limit=100000 name=None res_arcsec_land=150 res_
,→arcsec_ocean=1800 status=active version=None
2022-07-01 15:59:42,013 - climada.hazard.centroids.centr - INFO - Reading /Users/
,→szelie/climada/data/centroids/earth_centroids_150asland_1800asoceans_distcoast_
,→regions/v1/earth_centroids_150asland_1800asoceans_distcoast_region.hdf5
,→",6378137,298.257223563,LENGTHUNIT["metre",1]],ENSEMBLEACCURACY[2.0]],PRIMEM[
,→"Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],CS[ellipsoidal,2],AXIS[
<GeoAxesSubplot:>
For many hazards, limiting the latitude extent to [-60, 60] is sufficient and will reduce the computational resources required.
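A sketch of such a restricted query, assuming get_centroids accepts an extent argument of the form (lon_min, lon_max, lat_min, lat_max) as suggested by the request logs:
# Restrict the centroids to latitudes between -60 and 60 degrees
centroids_60 = client.get_centroids(extent=(-180, 180, -60, 60))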
https://climada.ethz.ch/data-api/v1/dataset data_
,→type=centroids extent=(-180, 180, -90,␣
,→90) limit=100000 name=None res_arcsec_land=150 res_
,→arcsec_ocean=1800 status=active version=None
2022-07-01 15:59:27,602 - climada.hazard.centroids.centr - INFO - Reading /Users/
,→szelie/climada/data/centroids/earth_centroids_150asland_1800asoceans_distcoast_
,→regions/v1/earth_centroids_150asland_1800asoceans_distcoast_region.hdf5
,→",6378137,298.257223563,LENGTHUNIT["metre",1]],ENSEMBLEACCURACY[2.0]],PRIMEM[
,→"Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],CS[ellipsoidal,2],AXIS[
<GeoAxesSubplot:>
centroids_hti = client.get_centroids(country="HTI")
https://climada.ethz.ch/data-api/v1/dataset data_
,→type=centroids extent=(-180, 180, -90,␣
,→90) limit=100000 name=None res_arcsec_land=150 res_
,→arcsec_ocean=1800 status=active version=None
2022-07-01 16:01:24,328 - climada.hazard.centroids.centr - INFO - Reading /Users/
,→szelie/climada/data/centroids/earth_centroids_150asland_1800asoceans_distcoast_
,→regions/v1/earth_centroids_150asland_1800asoceans_distcoast_region.hdf5
Server
The CLIMADA data file server is hosted on https://data.iac.ethz.ch and can be accessed via a REST API at
https://climada.ethz.ch. For REST API details, see the documentation.
Client
?Client
Init docstring:
Constructor of Client.
Data API host and chunk_size (for download) are configurable values.
Default values are 'climada.ethz.ch' and 8096 respectively.
File: c:\users\me\polybox\workshop\climada_python\climada\util\api_client.py
Type: type
Subclasses:
client = Client()
client.chunk_size
8192
The url to the API server and the chunk size for the file download can be configured in ‘climada.conf’. Just replace the
corresponding default values:
"data_api": {
"host": "https://climada.ethz.ch",
"chunk_size": 8192,
"cache_db": "{local_data.system}/.downloads.db"
}
The other configuration value affecting the data_api client, cache_db, is the path to an SQLite database file which
keeps track of the files that were successfully downloaded from the API server. Before the Client attempts to download any
file from the server, it checks whether the file has been downloaded before and, if so, whether the previously downloaded
file still looks good (i.e., size and time stamp are as expected). If all of this is the case, the file is simply read from disk
without submitting another request.
Metadata
Unique Identifiers
Any dataset can be identified with data_type, name and version. The combination of the three is unique in
the API server's underlying database. However, sometimes the name is already enough for identification. All
datasets also have a UUID, a universally unique identifier, which is part of their individual url. A dataset can be
queried directly by its UUID:
client.get_dataset_info_by_uuid("b1c76120-4e60-4d8f-99c0-7e1e7b7860ec")
DatasetInfo(uuid='b1c76120-4e60-4d8f-99c0-7e1e7b7860ec',
            data_type=DataTypeShortInfo(data_type='litpop', data_type_group='exposures'),
            properties={'country_name': 'South Georgia and the South Sandwich Islands', ...},
            file_name='LitPop_assets_pc_150arcsec_SGS.hdf5', ...)
or by filtering:
As stated above, get_dataset_info (or get_dataset_info_by_uuid) returns a DatasetInfo object and list_dataset_infos returns a list
thereof.
?DatasetInfo
Init signature:
DatasetInfo(
uuid: str,
data_type: climada.util.api_client.DataTypeShortInfo,
name: str,
?FileInfo
Init signature:
FileInfo(
uuid: str,
url: str,
file_name: str,
file_format: str,
file_size: int,
check_sum: str,
) -> None
Docstring: file data from CLIMADA data API.
File: c:\users\me\polybox\workshop\climada_python\climada\util\api_client.py
Type: type
Subclasses:
There are convenience functions to easily convert datasets into pandas DataFrames, such as into_datasets_df:
?client.into_datasets_df
Signature: client.into_datasets_df(dataset_infos)
Docstring:
Convenience function providing a DataFrame of datasets with properties.
Parameters
----------
dataset_infos : list of DatasetInfo
as returned by list_dataset_infos
client = Client()
litpop_datasets = client.list_dataset_infos(
data_type="litpop",
properties={"country_name": "South Georgia and the South Sandwich Islands"},
)
litpop_df = client.into_datasets_df(litpop_datasets)
litpop_df
description \
0 LitPop asset value exposure per country: Gridd...
1 LitPop population exposure per country: Gridde...
2 LitPop asset value exposure per country: Gridd...
license \
0 Attribution 4.0 International (CC BY 4.0)
1 Attribution 4.0 International (CC BY 4.0)
2 Attribution 4.0 International (CC BY 4.0)
country_name country_iso3num
0 South Georgia and the South Sandwich Islands 239
1 South Georgia and the South Sandwich Islands 239
2 South Georgia and the South Sandwich Islands 239
Download
The wrapper functions get_exposures or get_hazard fetch the information, download the file, and open the file as a CLIMADA
object. But one can also just download dataset files using the method download_dataset, which takes a DatasetInfo
object as an argument and downloads all files of the dataset to a directory in the local file system.
?client.download_dataset
Signature:
client.download_dataset(
dataset,
target_dir=WindowsPath('C:/Users/me/climada/data'),
organize_path=True,
)
Docstring:
Download all files from a given dataset to a given directory.
Parameters
----------
dataset : DatasetInfo
the dataset
target_dir : Path, optional
target directory for download, by default `climada.util.constants.SYSTEM_DIR`
organize_path: bool, optional
if set to True the files will end up in subdirectories of target_dir:
[target_dir]/[data_type_group]/[data_type]/[name]/[version]
by default True
Returns
-------
download_dir : Path
the path to the directory containing the downloaded files,
will be created if organize_path is True
downloaded_files : list of Path
the downloaded files themselves
Raises
------
Exception
when one of the files cannot be downloaded
File: c:\users\me\polybox\workshop\climada_python\climada\util\api_client.py
Type: method
Cache
The method avoids superfluous downloads by keeping track of all downloads in a sqlite db file. The client will make sure
that the same file is never downloaded to the same target twice.
Examples
(WindowsPath('C:/Users/me/climada/data/exposures/litpop/LitPop_assets_pc_150arcsec_
,→SGS/v1/LitPop_assets_pc_150arcsec_SGS.hdf5'),
True)
(PosixPath('/home/yuyue/climada/data/hazard/tropical_cyclone/tropical_cyclone_50synth_
,→tracks_150arcsec_rcp26_BRA_2040/v1/tropical_cyclone_50synth_tracks_150arcsec_rcp26_
,→BRA_2040.hdf5'),
True)
If the dataset contains only one file (which is most commonly the case) this file can also be downloaded and accessed in
a single step, using the get_dataset_file method:
Client().get_dataset_file(
data_type="litpop",
properties={
"country_name": "South Georgia and the South Sandwich Islands",
"fin_mode": "pop",
},
)
WindowsPath('C:/Users/me/climada/data/exposures/litpop/LitPop_pop_150arcsec_SGS/v1/
,→LitPop_pop_150arcsec_SGS.hdf5')
By default, the API Client downloads files into the ~/climada/data directory.
Over time, obsolete files may accumulate in this directory, because a newer version of these
files is available from the CLIMADA data API, or because the corresponding dataset has expired altogether.
To prevent file rot and free disk space, it's possible to remove all outdated files at once by simply calling
Client().purge_cache(). This will remove all files that were ever downloaded with the api_client.Client and for which a
newer version exists, even when the newer version has not been downloaded yet.
Offline Mode
The API Client is silently used in many methods and functions of CLIMADA, including the installation test that is run
to check whether the CLIMADA installation was successful. Most methods of the client send GET requests to the API
server, assuming the latter is accessible through a working internet connection. If this is not the case, the functionality of
CLIMADA is severely limited, if not altogether lost. Often this is an unnecessary restriction, e.g., when a user wants to
access a file through the API Client that is already downloaded and available in the local filesystem.
In such cases the API Client runs in offline mode. In this mode the client falls back to previous results for the same call in
case there is no internet connection or the server is not accessible.
To turn this feature off and make sure that all results are current and up to date, at the cost of failing when there is no
internet connection, one has to disable the cache. This can be done programmatically, by initializing the API Client with
the optional argument cache_enabled:
client = Client(cache_enabled=False)
Or it can be done through configuration. Edit the climada.conf file in the working directory or in ~/climada/ and
change the “cache_enabled” value, like this:
...
"data_api": {
...
"cache_enabled": false
},
...
While cache_enabled is true (the default), every result from the server is stored as a json file in ~/climada/data/.apicache/
under a unique name derived from the method and arguments of the call. If the very same call is made again later, at a time
when the server is not accessible, the client falls back to the cached result from the previous call.
Please find the code to reproduce selected CLIMADA-related scientific publications in our repository of scientific
publications.
As a key link, please use https://wcr.ethz.ch/research/climada.html, as it provides a brief introduction, especially for those
not familiar with GitHub.
THREE
DEVELOPER GUIDE
3. Implement your changes and commit them with meaningful and well formatted commit messages.
4. Add unit and integration tests to your code, if applicable.
5. Use Pylint for a static code analysis of your code with CLIMADA’s configuration .pylintrc:
pylint
NOTE: Only team members are allowed to push to the original repository. Most contributors are/will be team
members. To be added to the team list and get permissions please contact one of the owners. Alternatively, you
can fork the CLIMADA repository and add this fork as a new remote to your local repository. You can then push
to the fork remote:
8. On the CLIMADA-project/climada_python GitHub repository, create a new pull request with target branch de-
velop. This also works if you pushed to a fork instead of the main repository. Add a description and explanation
of your changes and work through the pull request author checklist provided. Feel free to request reviews from
specific team members.
9. After approval of the pull request, the branch is merged into develop and your changes will become part of the
next CLIMADA release.
3.4 Resources
The CLIMADA documentation provides several Developer Guides. Here's a selection of the commonly required information:
• How to use Git and GitHub for CLIMADA development: Development and Git and CLIMADA
• Coding instructions for CLIMADA: Python Dos and Don’ts, Performance Tips, CLIMADA Conventions
• How to execute tests in CLIMADA: Testing and Continuous Integration
Note on dependencies
Climada dependencies are handled with the requirements/env_climada.yml file. When you run mamba env
update -n <your_env> -f requirements/env_climada.yml, the content of that file is used to install the
dependencies. Thus, if you are working on a branch that changes the dependencies, make sure to be on that branch before
running the command.
get the latest data from the remote repository and update your branch
git pull
Once you have set up everything (including pre-commit hooks) you will be able to:
see your locally modified files
git status
Pre-Commit Hooks
Climada developer dependencies include pre-commit hooks to help ensure code linting and formatting. See Code
Formatting for our conventions regarding formatting. These hooks will run on all staged files and verify:
• the absence of trailing whitespace
• that files end in a newline and only a newline
• the correct sorting of imports using isort
• the correct formatting of the code using black
If you have installed the pre-commit hooks (see Install developer dependencies), they will be run each time you attempt
to create a new commit, and the usual git flow can change slightly:
If any check fails, you will be warned, and the hooks will apply corrections where possible (such as reformatting the code
with black). As files are modified by the hooks, you are required to stage them again (hooks cannot stage their modifications,
only you can) and commit again.
As an example, suppose you made an improvement to Centroids and want to commit these changes. You would run:
$ git status
On branch feature/<new_feature>
Your branch is up-to-date with 'origin/<new_feature>'.
Changes to be committed:
Now we try to commit, assuming that the imports are not correctly sorted and some of the code is not correctly formatted:
Fixing [...]/climada_python/climada/hazard/centroids/centr.py
black-jupyter............................................................Failed
- hook id: black-jupyter
- files were modified by this hook
reformatted climada/hazard/centroids/centr.py
All done!
Note that the commit was aborted and the problems were fixed. However, the changes applied by the hooks are not staged
yet. You have to run git add again to stage them:
$ git status
On branch feature/<new_feature>
Your branch is up-to-date with 'origin/<new_feature>'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: climada/hazard/centroids/centr.py
After that, you can execute the commit and the hooks should pass:
All done!
Make unit and integration tests on your code, preferably during development
Writing new code requires writing new tests: Please read our Guide on unit and integration tests
Steps
1) Make sure the develop branch is up to date on your own machine
2) Merge develop into your feature branch and resolve any conflicts
In the case of more complex conflicts, you may want to speak with others who worked on the same code. Your IDE
should have a tool for conflict resolution.
3) Check all the tests pass locally
make unit_test
make integ_test
4) Perform a static code analysis using pylint with CLIMADA's configuration .pylintrc (in the climada root
directory). Jenkins executes it after every push.
To do it locally, your IDE probably provides a tool, or you can run make lint and see the output in pylint.log.
5) Push to GitHub. If you’re pushing this branch for the first time, use
git push
6) Check all the tests pass on the WCR Jenkins server (https://ied-wcr-jenkins.ethz.ch). See Emanuel’s presentation
for how to do this! You should regularly be pushing your code and checking this!
7) Create the pull request!
• On the CLIMADA GitHub page, navigate to your feature branch (there’s a drop-down menu above the file
structure, pointing by default to main).
• Above the file structure is a branch summary and an icon to the right labelled “Pull request”.
• Choose which branch you want to merge with. This will usually be develop, but may be another feature
branch for more complex feature development.
• Give your pull request an informative title (like a commit message).
• Write a description of the pull request. This can usually be adapted from your branch’s commit messages
(you wrote informative commit messages, didn’t you?), and should give a high-level summary of the changes,
specific points you want the reviewers’ input on, and explanations for decisions you’ve made. The code doc-
umentation (and any references) should cover the more detailed stuff.
• Assign reviewers in the page’s right hand sidebar. Tag anyone who might be interested in reading the code.
You should already have found one or two people who are happy to read the whole request and sign it off
(they could also be added to ‘Assignees’).
• Create the pull request.
• Contact the reviewers to let them know the request is live. GitHub’s settings mean that they may not be alerted
automatically. Maybe also let people know on the WCR Slack!
8) Talk with your reviewers
• Use the comment/chat functionality within GitHub’s pull requests - it’s useful to have an archive of discussions
and the decisions made.
• Take comments and suggestions on board, but you don’t need to agree with everything and you don’t need to
implement everything.
• If you feel someone is asking for too many changes, prioritise, especially if you don’t have time for complex
rewrites.
• If the suggested changes and/or features don’t block functionality and you don’t have time to fix them, they can be moved to Issues.
• Chase people up if they’re slow. People are slow.
9) Once you have implemented the requested changes, respond to each comment with the corresponding commit that implements it.
10) If the review takes a while, remember to merge develop back into the feature branch every now and again (and
check the tests are still passing on Jenkins).
Anything pushed to the branch is added to the pull request.
11) Once everyone reviewing has said they’re satisfied with the code you can merge the pull request using the GitHub
interface.
Delete the branch once it’s merged; there’s no reason to keep it. (Also try not to re-use that branch name later.)
12) Update the develop branch on your local machine.
Also see the Reviewer Guide and Reviewer Checklist!
Commit more often than you think, and use informative commit messages
• Committing often makes mistakes less scary to undo
Git compares file versions by text tokens. Jupyter Notebooks typically contain a lot of metadata, along with binary data
like image files. Simply re-running a notebook can change this metadata, which will be reported as file changes by Git.
This causes excessive Diff reports that cannot be reviewed conveniently.
To avoid committing changes of unrelated metadata, open Jupyter Notebooks in a text editor instead of your browser
renderer. When committing changes, make sure that you indeed only commit things you did change, and revert any
changes to metadata that are not related to your code updates.
Several code editors use plugins to render Jupyter Notebooks. Here we collect the instructions to inspect Jupyter Note-
books as plain text when using them:
• VSCode: Open the Jupyter Notebook. Then open the internal command prompt (Ctrl + Shift + P or Cmd +
Shift + P on macOS) and type/select ‘View: Reopen Editor with Text Editor’
Merge the remote develop branch into your feature branch every now and again
• This way you’ll find conflicts early
Questions
https://xkcd.com/1597/
– (It’s possible to set more than one remote repository, e.g. you might set one up on a network-restricted
computing cluster)
• push, pull and pull request
– You push your work when you send it from your local machine to the remote repository
– You pull from the remote repository to update the code on your local machine
– A pull request is a standardised review process on GitHub. Usually it ends with one branch merging into
another
• Conflict resolution
– Sometimes two people have made changes to the same bit of code. Usually this comes up when you’re trying
to merge branches. The changes have to be manually compared and the code edited to make sure the ‘correct’
version of the code is kept.
3.7.2 Gitflow
Gitflow is a particular way of using git to organise projects that have
• multiple developers
• working on different features
• with a release cycle
It means that
• there’s always a stable version of the code available to the public
• the chances of two developers’ code conflicting are reduced
• the process of adding and reviewing features and fixes is more standardised for everyone
Gitflow is a convention, so you don’t need any additional software.
• … but if you want you can get some: a popular extension to the git command line tool allows you to issue more
intuitive commands for a Gitflow workflow.
• Mac/Linux users can install git-flow from their package manager, and it’s included with Git for Windows
• The critical difference between Gitflow and ‘standard’ git is that almost all of your work takes place on the develop
branch, instead of the main (formerly master) branch.
• The main branch is reserved for planned, stable product releases, and it’s what the general public download when
they install CLIMADA. The developers almost never interact with it.
• This is common to many workflows: when you want to add something new to the model you start a new branch,
work on it locally, and then merge it back into develop with a pull request (which we’ll cover later).
• By convention we name all CLIMADA feature branches feature/* (e.g. feature/meteorite).
• Features can be anything, from entire hazard modules to a smarter way to do one line of a calculation. Most of the work you’ll do on CLIMADA will be a feature of one size or another.
• We’ll talk more about developing CLIMADA features later!
Developer guidelines:
• All tests must pass before submitting a pull request.
• Integration tests don’t run on feature branches in Jenkins, therefore developers are requested to run them locally.
• After a pull request was accepted and the changes are merged to the develop branch, integration tests may still fail
there and have to be addressed.
Developer guidelines:
• Make sure the coverage of novel code is at 100% before submitting a pull request.
Be aware that full code coverage alone does not guarantee that all required tests have been written!
The following artificial example would have 100% coverage and still obviously misses a test for y(False):
import unittest

class TestXY(unittest.TestCase):
    # a test_x method exercising both branches of x is defined here as well

    def test_y(self):
        self.assertEqual(y(True), 0.25)

unittest.TextTestRunner().run(unittest.TestLoader().loadTestsFromTestCase(TestXY));
..
been here
been there
been everywhere
been here
----------------------------------------------------------------------
Ran 2 tests in 0.003s
OK
Developer guidelines:
• High Priority Warnings are as severe as test failures and must be addressed at once.
• Do not introduce new Medium Priority Warnings.
• Try to avoid introducing Low Priority Warnings, in any case their total number should not increase.
climada_ci_night
Branch: develop
Runs when climada_install_env has finished successfully
• runs all test modules
• runs static code analysis
climada_branches
Branch: any
Runs when a commit is pushed to the repository
• runs all test modules outside of climada.test
• runs static code analysis
climada_data_api
Branch: develop
Runs every day at 0:20AM CET
• tests availability of external data APIs
climada_data_api
Branch: develop
No automated running
• tests executability of CLIMADA tutorial notebooks.
Note
As of CLIMADA v4.0, the default CI technology remains Jenkins. GitHub Actions CI is currently considered experi-
mental for CLIMADA development.
Has something similar already been implemented? This is far from trivial to answer! First, search for functions in
the same module where you’d be implementing the new piece of code. Then, search in the util folders, there’s a lot
of functions in some of the scripts! You could also search the index (a list of all functions and global constants) in the
climada documentation for key-words that may be indicative of the functionality you’re looking for.
Don’t expect this process to be fast!
Even if you want to implement just a small helper function, which might take 10mins to write, it may take you 30mins to
check the existing code base! That’s part of the game! Even if you found something, most likely, it’s not the exact same
thing which you had in mind. Then, ask yourself how you can re-use what’s there, or whether you can easily add another
option to the existing method to also fit your case, and only if it’s nearly impossible or highly unreadable to do so, write
your own implementation.
Can my code serve others? You probably have a very specific problem in mind. Yet, think about other use-cases, where
people may have a similar problem, and try to either directly account for those, or at least make it easy to configure to
other cases. Providing keyword options and hard-coding as few things as possible is usually a good thing. For example,
if you want to write a daily aggregation function for some time-series, consider that other people might find it useful to
have a general function that can also aggregate by week, month or year.
Can I get started? Before you finally start coding, be sure about where to place your code. Functions in non-util modules should be specific to that module (e.g. a file-reader function is probably not river-flood specific, so put it into the util section, not the RiverFlood module, even if that’s what you’re currently working on)! If unsure, talk with other people about where your code should go.
If you’re implementing more than just a function or two, or even an entirely new module, the planning process should be
talked over with someone doing climada-administration.
Clean Code
A few basic principles:
• Follow the PEP 8 Style Guide. It contains, among others, recommendations on:
– code layout
– basic naming conventions
– programming recommendations
– commenting (in detail described in Chapter 4)
– varia
• Perform a static code analysis - or: PyLint is your friend
• Follow the best practices of Correctness - Tightness - Readability
• Adhere to principles of pythonic coding (idiomatic coding, the “python way”)
• Indentation: 4 spaces per level. For continuation lines, decide between vertical alignment & hanging indentation as described in PEP 8.
• Blank lines: two blank lines surround top-level function and class definitions, one blank line surrounds method definitions inside a class, and blank lines may be omitted between a bunch of related one-liners (e.g. a set of dummy implementations).
• Whitespaces:
– None immediately inside parentheses, brackets or braces; after trailing commas; or for keyword assignments in functions.
– Do use single spaces around assignments (i = i + 1), around comparisons (>=, ==, etc.) and around booleans (and, or, not).
– A short example following these rules is sketched below.
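A minimal sketch of the whitespace rules above (spam, ham and eggs are the meaningless placeholder names used by PEP 8 itself):
# Correct: no space just inside brackets or braces, a single space after commas
# and around assignments, comparisons and booleans.
ham = [0, 1, 2]
eggs = "eggs"
spam = {eggs: 2}
i = 0
if i >= 0 and not spam:
    i = i + 1

# Wrong (extra or missing spaces):
# spam = { eggs : 2 }
# i=i+1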
A short typology: b (single lowercase letter); B (single uppercase letter); lowercase; lower_case_with_underscores; UPPERCASE; UPPER_CASE_WITH_UNDERSCORES; CapitalizedWords (or CapWords, or CamelCase); mixedCase; Capitalized_Words_With_Underscores (ugly!)
A few basic rules:
• packages and modules: short, all-lowercase names. Underscores can be used in the module name if it improves
readability. E.g. numpy, climada
• classes: use the CapWords convention. E.g. RiverFlood
• functions, methods and variables: lowercase, with words separated by underscores as necessary to improve read-
ability. E.g. from_raster(), dst_meta
• function- and method arguments: Always use self for the first argument to instance methods, and cls for the first argument to class methods.
• constants: all capital letters with underscores, e.g. DEF_VAR_EXCEL
Use of underscores
• _single_leading_underscore: weak “internal use” indicator. E.g. from M import * does not import
objects whose names start with an underscore. A side-note to this: Always decide whether a class’s methods and
instance variables (collectively: “attributes”) should be public or non-public. If in doubt, choose non-public; it’s
easier to make it public later than to make a public attribute non-public. Public attributes are those that you expect
unrelated clients of your class to use, with your commitment to avoid backwards incompatible changes. Non-
public attributes are those that are not intended to be used by third parties; you make no guarantees that non-public
attributes won’t change or even be removed. Public attributes should have no leading underscores.
• single_trailing_underscore_: used by convention to avoid conflicts with Python keywords, e.g. tkinter.Toplevel(master, class_='ClassName')
• comparisons to singletons like None should always be done with is or is not, never the equality operators.
• Use is not operator rather than not ... is.
• Be consistent in return statements. Either all return statements in a function should return an expression, or none
of them should. Any return statements where no value is returned should explicitly state this as return None.
# Correct
def foo(x):
if x >= 0:
return math.sqrt(x)
else:
return None
# Wrong
def foo(x):
if x >= 0:
return math.sqrt(x)
• Object type comparisons should always use isinstance() instead of comparing types directly:
# Correct:
if isinstance(obj, int):
# Wrong:
if type(obj) is type(1):
• Remember: sequences (strings, lists, tuples) are false if empty; this can be used:
# Correct:
if not seq:
if seq:
# Wrong:
if len(seq):
if not len(seq):
# Correct:
if greeting:
# Wrong:
if greeting == True:
• Use ‘’.startswith() and ‘’.endswith() instead of string slicing to check for prefixes or suffixes.
# Correct:
if foo.startswith('bar'):
# Wrong:
if foo[:3] == 'bar':
• Context managers exist and can be useful (mainly for opening and closing files); a minimal sketch is given below.
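As a minimal sketch of a context manager used for file handling (the file name and content are invented for this illustration):
# The with-statement closes the file automatically, even if an exception is
# raised inside the block.
with open("results.txt", "w", encoding="utf-8") as file:
    file.write("impact: 42\n")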
Static code analysis detects style issues, bad practices, potential bugs, and other quality problems in your code, all without
having to actually execute it. In Spyder, this is powered by the best in class Pylint back-end, which can intelligently detect
an enormous and customizable range of problem signatures. It follows the style recommended by PEP 8 and also includes
the following features: Checking the length of each line, checking that variable names are well-formed according to the
project’s coding standard, checking that declared interfaces are truly implemented.
A detailed instruction can be found here.
In brief: In the editor, select the Code Analysis pane (if not visible, go to View -> Panes -> Code Analysis) and the file you want to be analyzed; hit the Analyze button. The output lists messages of several categories:
• convention and refactoring messages,
• warnings,
• errors,
together with a global score regarding code quality.
All messages have a line reference and a short description of the issue. Errors must be fixed, as they are a no-go for actually executing the script. Warnings and refactoring messages should be taken seriously; so should the convention messages, even though some of the naming conventions etc. may not fit the project style. This is configurable.
In general, there should be no errors and warnings left, and the overall code quality should be in the “green” range
(somewhere above 5 or so).
There are advanced options to configure the type of warnings and other settings in pylint.
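To run the same analysis from the command line, you can call pylint from the repository root so that the .pylintrc configuration is picked up (the module path below is just an example; point it at the files you want to check):
pylint --rcfile=.pylintrc climada/hazard/centroids/centr.py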
Correctness
Methods and functions must return correct and verifiable results, not only under the best circumstances but in any possible context. I.e. ideally there should be unit tests exploring the full space of parameters, configurations and data states. This is often clearly an unachievable goal, but we still aim for it.
Tightness
• Avoid code redundancy.
• Make the program efficient, use profiling tools for detection of bottlenecks.
• Try to minimize memory consumption.
• Don’t introduce new dependencies (library imports) when the desired functionality is already covered by existing
dependencies.
• Stick to already supported file types.
Readability
• Write complete Python Docstrings.
• Use meaningful method and parameter names, and always annotate the data types of parameters and return values.
• No context-dependent return types! Also: Avoid None as return type, rather raise an Exception instead.
• Be generous with defining Exception classes.
• Comment! Comments are welcome to be redundant. And whenever there is a particular reason for the way some-
thing is done, comment on it! See below for more detail.
• For functions which implement mathematical or scientific concepts, add the actual mathematical formula as a comment or to the docstrings. This will help maintain a high level of scientific accuracy. E.g. how are the random walk tracks computed for tropical cyclones?
Pythonic Code
In Python, there are certain structures that are specific to the language, or at least the syntax of how to use them. This is
usually referred to as “pythonic” code.
There is an extensive overview of crucial “pythonic” structures and methods in the Python 101 library.
A few important examples are:
• iterables such as dictionaries, tuples, lists
• iterators and generators (a very useful construct when it comes to code performance, as the implementation of
generators avoids reading into memory huge iterables at once, and allows to read them lazily on-the-go; see this
blog post for more details)
• f-strings (“formatted string literals”): they have an f at the beginning and curly braces containing expressions that will be replaced with their values, e.g. f"sum = {1 + 2}" evaluates to "sum = 3"
• decorators (a design pattern in Python that allows a user to add new functionality to an existing object without
modifying its structure). Something like:
@uppercase_decorator
def say_hi():
return "hello there"
• type checking (Python is a dynamically typed language; also: cf. “Duck typing”. Yet, as a best practice, variables
should not change type once assigned)
• Do not use mutable default arguments in your functions (e.g. lists): the default object is created only once, so it will be mutated and shared across future calls of the function too. The correct implementation is to default to None and create the list inside the function; see the sketch after this list.
• lambda functions (little, anonymous functions, sth like high_ord_func(2, lambda x: x * x))
• list comprehensions (a short and possibly elegant syntax to create a new list in one line, sth like newlist = [x
for x in range(10) if x < 5] returns [0, 1, 2, 3, 4])
It is recommended to look up the above concepts in case you are not familiar with them.
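As a minimal sketch of the mutable-default-argument pitfall mentioned above (the function and variable names are invented for this illustration):
# Bad: the default list is created once, when the function is defined,
# and is then shared by every call that does not pass its own list.
def append_to(element, to=[]):
    to.append(element)
    return to

print(append_to(1))  # [1]
print(append_to(2))  # [1, 2]  <- the first call's data is still there

# Good: use None as a sentinel and create a fresh list inside the function.
def append_to_fixed(element, to=None):
    if to is None:
        to = []
    to.append(element)
    return to

print(append_to_fixed(1))  # [1]
print(append_to_fixed(2))  # [2]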
Comments are for developers. They describe parts of the code where necessary to facilitate the understanding of program-
mers. They are marked by putting a # in front of every comment line (for multi-liners, wrapping them inside triple double
quotes """ is basically possible, but discouraged to not mess up with docstrings). A documentation string (docstring) is
a string that describes a module, function, class, or method definition. The docstring is a special attribute of the object
(object.__doc__) and, for consistency, is surrounded by triple double quotes ("""). This is also where elaboration of
the scientific foundation (explanation of used formulae, etc.) should be documented.
A few general rules:
• Have a look at this blog-post on commenting basics
• Comments should be D.R.Y (“Don’t Repeat Yourself.”)
• Obvious naming conventions can avoid unnecessary comments (cf. families_by_city[city] vs.
my_dict[p])
Numpy-style docstrings
Full reference can be found here. The standards are such that they use re-structured text (reST) syntax and are rendered
using Sphinx.
There are several sections in a docstring, with headings underlined by hyphens (---). The sections of a function’s docstring
are:
1. Short summary: A one-line summary that does not use variable names or the function name
2. Deprecation warning (use if applicable): to warn users that the object is deprecated, including the version in which the object was deprecated, when it will be removed, the reason for deprecation, and the new recommended way of obtaining the same functionality. Use the deprecated Sphinx directive (.. deprecated::).
3. Extended Summary: A few sentences giving an extended description to clarify functionality, not to discuss imple-
mentation detail or background theory (see Notes section below!)
4. Parameters: Description of the function arguments, keywords and their respective types. Enclose variables in single backticks in the description. The colon must be preceded by a space, or omitted if the type is absent. For the parameter types, be as precise as possible. If it is not necessary to specify a keyword argument, use optional after the type specification: e.g. x: int, optional. Default values of optional parameters can also be detailed in the description (e.g. ... description of parameter ... (default is -1)).
5. Returns: Explanation of the returned values and their types. Similar to the Parameters
section, except the name of each return value is optional, type isn’t. If both the name
and type are specified, the Returns section takes the same form as the Parameters section.
There is a range of other sections that can be included, if sensible and applicable, such as Yields (for generator functions only), Raises (which errors get raised and under what conditions), See Also (refer to related code), Notes (additional information about the code, possibly including a discussion of the algorithm; may include mathematical equations, written in LaTeX format), References, and Examples (to illustrate usage).
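As a brief, hypothetical illustration of these sections (the function, its arguments and its contents are invented for this example and are not part of CLIMADA):
import numpy as np


def exceedance_fraction(intensity, threshold=10.0):
    """Compute the fraction of events exceeding an intensity threshold.

    Parameters
    ----------
    intensity : numpy.ndarray
        Hazard intensity per event.
    threshold : float, optional
        Intensity threshold (default is 10.0).

    Returns
    -------
    float
        Fraction of events with intensity strictly above `threshold`.
    """
    return float(np.mean(intensity > threshold))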
Importing
General remarks
• Imports should be grouped in the following order:
– Standard library imports (such as re, math, datetime, cf. here )
– Related third party imports (such as numpy)
– Local application/library specific imports (such as climada.hazard.base)
• You should put a blank line between each group of imports (a short sketch of this grouping is given after this list).
• Don’t introduce new dependencies (library imports) when the desired functionality is already covered by existing
dependencies.
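A short sketch of the import grouping described above (the specific imports are chosen only for illustration):
# Standard library imports
import datetime
import re

# Related third party imports
import numpy as np

# Local application/library specific imports
from climada.hazard.base import Hazard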
Avoid circular importing!!
Circular imports are a form of circular dependencies that are created with the import statement in Python; e.g. module A
loads a method in module B, which in turn requires loading module A. This can generate problems such as tight coupling
between modules, reduced code reusability, more difficult maintenance. Circular dependencies can be the source of
potential failures, such as infinite recursions, memory leaks, and cascade effects. Generally, they can be resolved with
better code design. Have a look here for tips to identify and resolve such imports.
Varia
• there are absolute imports (using the full path starting from the project’s root folder) and relative imports (using the path starting from the current module to the desired module; usually in the form from .<module/package> import X; dots . indicate how many directories upwards to traverse: a single dot corresponds to the current directory, two dots indicate one folder up, etc.)
• generally try to avoid star imports (e.g. from packagename import *)
Importing utility functions
When importing CLIMADA utility functions (from climada.util), the convention is to import the function as
“u_name_of_function”, e.g.:
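A sketch of this convention (name_of_function is a hypothetical placeholder, not an actual member of climada.util.dates_times; substitute the utility you actually need):
# "name_of_function" is a placeholder used only to show the naming pattern.
from climada.util.dates_times import name_of_function as u_name_of_function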
Debugging
When writing code, you will encounter bugs and hence go through (more or less painful) debugging. Depending on the
IDE you use, there are different debugging tools that will make your life much easier. They offer functionalities such as
stopping the execution of the function just before the bug occurs (via breakpoints), allowing to explore the state of defined
variables at this moment of time.
For Spyder specifically, have a look at the instructions on how to use ipdb.
Exception handling
CLIMADA guidelines
1. Catch specific exceptions if possible, i.e., do not catch all exceptions unless you need to.
2. Do not catch exceptions you do not handle.
3. Write a clear, explanatory message when you raise an error (similarly to when you use the logger to inform the user). Think of future users and how it helps them understand the error and debug their code.
4. Catch an exception where it arises.
5. When you catch an exception and raise an error, it is often (but not always) good practice not to throw away the originally caught exception, as it may contain useful information for debugging (use raise ... from ...).
# Bad (1)
x = 1
try:
l = len(events)
if l < 1:
print("l is too short")
except:
pass
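A possible improved version of the snippet above, following these guidelines (the variable names and error messages are illustrative only):
events = ["storm_a", "storm_b"]  # placeholder data for illustration

# Catch only the specific exception you can handle, keep the original exception
# via "raise ... from ...", and make the message explanatory.
try:
    n_events = len(events)
except TypeError as err:
    raise TypeError("events must be a sequence of hazard events") from err

if n_events < 1:
    raise ValueError("events is empty: at least one event is required")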
Exceptions reminder
Logging
CLIMADA guidelines
• In CLIMADA, do not use print statements; any output must go through the LOGGER.
• For any logging message, always think about the audience: what information does a user or developer need? This also implies carefully choosing the correct LOGGER level. For instance, if some information is only relevant for debugging, use the debug level; in that case, make sure that the message actually helps the debugging process! If a message just informs the user about certain default parameters, use the info level. See below for more details about logger levels.
• Do not overuse the LOGGER, and think about which logging level is appropriate. Logged errors must be useful for debugging.
You can set the level of the LOGGER using climada.util.config.LOGGER.setLevel(logging.XXX). This way you can, for instance, 'turn off' info messages when you are building an application. For example, to set the logger to the "ERROR" level, use:
import logging
from climada.util.config import LOGGER
LOGGER.setLevel(logging.ERROR)
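As a short sketch of how messages at different levels might be emitted in a module (the messages themselves are invented for illustration):
import logging

LOGGER = logging.getLogger(__name__)

LOGGER.debug("intermediate matrix shape: %s", (100, 3))      # detail for developers
LOGGER.info("using default normalization factor 1.0")        # information for users
LOGGER.warning("no events found, the impact will be empty")  # unexpected, but recoverable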
“Logging is a means of tracking events that happen when some software runs.”
When to use logging
“Logging provides a set of convenience functions for simple logging usage. These are debug(), info(), warning(), error()
and critical(). To determine when to use logging, see the table below, which states, for each of a set of common tasks,
the best tool to use for it.”
Logger level
“The logging functions are named after the level or severity of the events they are used to track. The standard levels and their applicability are described below (in increasing order of severity):”
• DEBUG: detailed information, typically of interest only when diagnosing problems
• INFO: confirmation that things are working as expected
• WARNING: an indication that something unexpected happened, or of some problem in the near future; the software is still working as expected
• ERROR: due to a more serious problem, the software has not been able to perform some function
• CRITICAL: a serious error, indicating that the program itself may be unable to continue running
3.9.3 Python performance tips and best practice for CLIMADA developers
This guide covers the following recommendations:
• Use profiling tools to find and assess performance bottlenecks.
• Replace for-loops by built-in functions and efficient external implementations.
• Consider algorithmic performance, not only implementation performance.
• Get familiar with NumPy: vectorized functions, slicing, masks and broadcasting.
• Miscellaneous: sparse arrays, Numba, parallelization, huge files (xarray), memory, pickle format.
⚠ Don’t over-optimize at the expense of readability and usability.
Profiling
Python comes with powerful packages for the performance assessment of your code. Within IPython and notebooks,
there are several magic commands for this task:
• %time: Time the execution of a single statement
• %timeit: Time repeated execution of a single statement for more accuracy
• %%timeit Does the same as %timeit for a whole cell
• %prun: Run code with the profiler
• %lprun: Run code with the line-by-line profiler
• %memit: Measure the memory use of a single statement
• %mprun: Run code with the line-by-line memory profiler
More information on profiling in the Python Data Science Handbook.
Also useful: unofficial Jupyter extension Execute Time.
While it’s easy to assess how fast or slow parts of your code are, including finding the bottlenecks, generating an improved
version of it is much harder. This guide is about simple best practices that everyone should know who works with
Python, especially when models are performance-critical.
In the following, we will focus on arithmetic operations because they play an important role in CLIMADA. Operations
on non-numeric objects like strings, graphs, databases, file or network IO might be just as relevant inside and outside of
the CLIMADA context. Some of the tips presented here do also apply to other contexts, but it’s always worth looking
for context-specific performance guides.
General considerations
This section will be concerned with:
• for-loops and built-ins
• external implementations and converting data structures
• algorithmic efficiency
• memory usage
As this section’s toy example, let’s assume we want to sum up all the numbers in a list:
list_of_numbers = list(range(10000))
for-loops
A developer with a background in C++ would probably loop over the entries of the list:
%%timeit
result = 0
for i in list_of_numbers:
result += i
332 µs ± 65.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit sum(list_of_numbers)
54.9 µs ± 5.63 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
The timing improves by a factor of 5-6 and this is not a coincidence: for-loops generally tend to get prohibitively
expensive when the number of iterations increases.
• When you have a for-loop with many iterations in your code, check for built-in functions or efficient external implementations of your programming task.
A special case worth noting are append operations on lists which can often be replaced by more efficient list comprehen-
sions.
• When you find an external library that solves your task efficiently, always consider that it might be necessary to convert your data structure, which takes time.
For arithmetic operations, NumPy is a great library, but if your data comes as a Python list, NumPy will spend quite some time converting it to a NumPy array:
import numpy as np
%timeit np.sum(list_of_numbers)
572 µs ± 80 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
If the data is converted to an ndarray once, the summation itself is much faster:
ndarray_of_numbers = np.array(list_of_numbers)
%timeit np.sum(ndarray_of_numbers)
10.6 µs ± 1.56 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Indeed, this is 5-6 times faster than the built-in sum and 20-30 times faster than the for-loop.
Even for such a basic task as summing, there exist several implementations whose performance can vary more than you
might expect:
%timeit ndarray_of_numbers.sum()
%timeit np.einsum("i->", ndarray_of_numbers)
9.07 µs ± 1.39 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
5.55 µs ± 383 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
This is up to 50 times faster than the for-loop. More information about the einsum function will be given in the NumPy
section of this guide.
Efficient algorithms
In our toy example, the numbers to sum are simply 0, 1, ..., n, so the result can also be obtained directly from the closed-form (Gauss) formula n(n+1)/2 without any summing:
n = max(list_of_numbers)
%timeit 0.5 * n * (n + 1)
83.1 ns ± 2.5 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
Not surprisingly, this is almost 100 times faster than even the fastest implementation of the 10,000 summing operations listed above.
You don’t need a degree in maths to find algorithmic improvements. Other algorithmic improvements that are often easy
to detect are:
• Filter your data set as much as possible to perform operations only on those entries that are really relevant.
Example: When computing a physical hazard (e.g. extreme wind) with CLIMADA, restrict to Centroids on land
unless you know that some of your exposure is off shore.
• Make sure to detect inconsistent or trivial input parameters early on, before starting any operations. Example:
If your code does some complicated stuff and applies a user-provided normalization factor at the very end, make
sure to check that the factor is not 0 before you start applying those complicated operations.
• In general: Before starting to code, take pen and paper and write down what you want to do from an algorithmic perspective.
Memory usage
• Be careful with deep copies of large data sets and only load portions of large files into memory as needed.
Write your code in such a way that you handle large amounts of data chunk by chunk so that Python does not need to
load everything into memory before performing any operations. When you do, Python’s generators might help you with
the implementation.
• Allocating unnecessary amounts of memory might slow down your code substantially due to swapping.
Vectorized functions
We mentioned above that Python’s for-loops are really slow. This is even more important when looping over the entries
in a NumPy array. Fortunately, NumPy’s masks, slicing notation and vectorization capabilities help to avoid for-loops in
almost every possible situation:
%%timeit
# SLOW: summing over columns using loops
output = np.zeros(100)
for row_i in range(input_arr.shape[0]):
for col_i in range(input_arr.shape[1]):
output[row_i] += input_arr[row_i, col_i]
145 µs ± 5.47 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
# FAST: the same row sums with NumPy's vectorized sum along an axis
# (a sketch of the fast variant that the following timing refers to)
%timeit input_arr.sum(axis=1)
4.23 µs ± 216 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In the special case of multiplications and sums (linear operations) over the axes of two multi-dimensional arrays, NumPy’s
einsum is even faster:
2.38 µs ± 214 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
many_vectors = np.random.rand(1000, 3)
%timeit np.sqrt((many_vectors**2).sum(axis=1))
%timeit np.linalg.norm(many_vectors, axis=1)
%timeit np.sqrt(np.einsum("...j,...j->...", many_vectors, many_vectors))
24.4 µs ± 2.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
26.5 µs ± 2.44 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
9.5 µs ± 91.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
For more information about the capabilities of NumPy’s einsum function, refer to the official NumPy documentation.
However, note that future releases of NumPy will eventually improve the performance of core functions, so that einsum
will become an example of over-optimization (see above) at some point. Whenever you use einsum, consider adding a
comment that explains what it does for users that are not familiar with einsum’s syntax.
Not only sum, but many NumPy functions come with similar vectorization capabilities. You can take minima, maxima,
means or standard deviations along selected axes. But did you know that the same is true for the diff and argmin
functions?
arr = np.array([[4, 2, 6], [2, 3, 4], [3, 3, 3], [3, 2, 4]])
arr.argmin(axis=1)
array([1, 0, 0, 1])
Broadcasting
When operations are performed on several arrays, possibly of differing shapes, be sure to use NumPy’s broadcasting
capabilities. This will save you a lot of memory and time when performing arithmetic operations.
Example: We want to multiply the columns of a two-dimensional array by values stored in a one-dimensional array. There
are two naive approaches to this:
input_arr = np.random.rand(100, 3)
col_factors = np.random.rand(3)
5.67 µs ± 718 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%%timeit
# SLOW: loop over columns and factors
output = input_arr.copy()
for i, factor in enumerate(col_factors):
output[:, i] *= factor
9.63 µs ± 95.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
The idea of broadcasting is that NumPy automatically matches axes from right to left and implicitly repeats data along missing axes if necessary:
# shapes (100, 3) * (3,) are broadcast automatically (a sketch of the timed line)
%timeit input_arr * col_factors
1.41 µs ± 51.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
For automatic broadcasting, the trailing dimensions of two arrays have to match. NumPy is matching the shapes of the
arrays from right to left. If you happen to have arrays where other dimensions match, you have to tell NumPy which
dimensions to add by adding an axis of length 1 for each missing dimension:
Because this concept is so important, there is a short-hand notation for adding an axis of length 1. In the slicing notation,
add None in those positions where broadcasting should take place.
input_arr = np.random.rand(7, 3, 5, 4, 6)
factors = np.random.rand(7, 3, 4)
output = factors[:, :, None, :, None] * input_arr
While in-place operations are generally faster than long and explicit expressions, they shouldn’t be over-estimated when
looking for performance bottlenecks. Often, the loss in code readability is not justified because NumPy’s memory man-
agement is really fast.
2/7 Don’t over-optimize!
%%timeit
# explicit expression with temporary arrays (a sketch of the cell this timing
# refers to; arr_a, arr_b and arr_c are large arrays created beforehand)
arr_d = (arr_a + arr_b) * arr_c - arr_a + arr_c
17.3 ms ± 820 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
# almost same performance: in-place operations
arr_d = arr_a + arr_b
arr_d *= arr_c
arr_d -= arr_a
arr_d += arr_c
17.4 ms ± 618 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# You may want to install the module "memory_profiler" first: activate the
# environment climada_env in an Anaconda prompt and install it there, for
# instance with "pip install memory_profiler", then load the extension:
%load_ext memory_profiler
%%memit
# almost same memory usage: in-place operations
arr_d = arr_a + arr_b
arr_d *= arr_c
arr_d -= arr_a
arr_d += arr_c
Miscellaneous
Sparse matrices
In many contexts, we deal with sparse matrices or sparse data structures, i.e. two-dimensional arrays where most of the
entries are 0. In CLIMADA, this is especially the case for the intensity attributes of Hazard objects. This kind of
data is usually handled using SciPy’s submodule scipy.sparse.
• When dealing with sparse matrices make sure that you always understand exactly which of your variables are sparse and which are dense, and only switch from sparse to dense when absolutely necessary.
• Multiplications (multiply) and matrix multiplications (dot) are often faster than operations that involve masks or indexing.
As an example for the last rule, consider the problem of multiplying certain rows of a sparse array by a scalar:
In the following cells, note that the code given on the same line as the %%timeit statement is not timed; it is the setup line.
/home/tovogt/.local/share/miniconda3/envs/tc/lib/python3.7/site-packages/scipy/sparse/data.py:55: RuntimeWarning: overflow encountered in multiply
self.data *= other
1.52 ms ± 155 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
340 µs ± 7.32 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
400 µs ± 6.43 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
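The timings above refer to code that is not reproduced here; as a rough sketch, two multiplication-based ways to scale selected rows without masking or indexing the sparse matrix directly could look like this (matrix size, density and the scaling factor are invented for this illustration):
import numpy as np
from scipy import sparse

# A random sparse matrix and a per-row scaling factor (1.0 for unscaled rows).
matrix = sparse.random(1000, 1000, density=0.01, format="csr")
row_mask = np.zeros(1000, dtype=bool)
row_mask[::7] = True  # the rows to be scaled
factors = np.where(row_mask, 5.0, 1.0)

# Broadcast a (n, 1) factor array against the matrix ...
scaled_a = matrix.multiply(factors[:, None]).tocsr()

# ... or, equivalently, multiply from the left by a diagonal matrix of factors.
scaled_b = sparse.diags(factors) @ matrix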
As a last resort, if there’s no way to avoid a for-loop even with NumPy’s vectorization capabilities, you can use the @njit
decorator provided by the Numba package:
from numba import njit

@njit
def sum_array(arr):
    result = 0.0
    for i in range(arr.shape[0]):
        result += arr[i]
    return result
In fact, the Numba function is more than 100 times faster than without the decorator:
%timeit sum_array(input_arr)
10.9 µs ± 444 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
1.84 ms ± 65.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
However, whenever available, NumPy’s own vectorized functions will usually be faster than Numba.
%timeit np.sum(input_arr)
%timeit input_arr.sum()
%timeit np.einsum("i->", input_arr)
7.6 µs ± 687 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
5.27 µs ± 411 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
7.89 µs ± 499 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
• Make sure you understand the basic idea behind Numba before using it, read the Numba docs.
• Don’t use @jit, but use @njit which is an alias for @jit(nopython=True).
When you know what you are doing, the fastmath and parallel options can boost performance even further: read
more about this in the Numba docs.
Parallelizing tasks
Depending on your hardware setup, parallelizing tasks using pathos and Numba’s automatic parallelization feature can
improve the performance of your implementation.
• Expensive hardware is no excuse for inefficient code.
Many tasks in CLIMADA could profit from GPU implementations. However, currently there are no plans to include
GPU support in CLIMADA because of the considerable development and maintenance workload that would come with
it. If you want to change this, contact the core team of developers, open an issue or mention it in the bi-weekly meetings.
When dealing with NetCDF datasets, memory is often an issue, because even if the file is only a few megabytes in size,
the uncompressed raw arrays contained within can be several gigabytes large (especially when data is sparse or similarly
structured). One way of dealing with this situation is to open the dataset with xarray.
• xarray allows reading the shape and type of variables contained in the dataset without loading any of the actual data into memory.
Furthermore, when loading slices and arithmetically aggregating variables, memory is allocated not more than necessary,
but values are obtained on-the-fly from the file.
⚠ Note that opening a dataset should be done with a context manager, to ensure proper closing of the file:
with xr.open_dataset("saved_on_disk.nc") as ds:
    ...  # work with the dataset here; values are only read from disk when accessed
pickle is the standard Python library serialization module. It has the nice feature of being able to save most Python objects (standard and user-defined) using simple methods. However, pickle has limited portability: pickle files are specific to the Python environment they were created in. This means that pickle files may not be compatible across different Python versions or environments, which can make it challenging to share data between systems. As such, it should only be used for temporary storage and not for persistent storage.
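A minimal usage sketch (the file name and the data are invented for this illustration):
import pickle

data = {"event_id": [1, 2, 3], "intensity": [0.2, 0.5, 0.9]}

# Serialize to a temporary file ...
with open("tmp_results.pkl", "wb") as file:
    pickle.dump(data, file)

# ... and load it back later, in the same environment.
with open("tmp_results.pkl", "rb") as file:
    data_restored = pickle.load(file)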
Take-home messages
We conclude by repeating the gist of this guide:
• Use profiling tools to find and assess performance bottlenecks.
• Replace for-loops by built-in functions and efficient external implementations.
• Consider algorithmic performance, not only implementation performance.
• Get familiar with NumPy: vectorized functions, slicing, masks and broadcasting.
• Miscellaneous: sparse arrays, Numba, parallelization, huge files (xarray), memory.
⚠ Don’t over-optimize at the expense of readability and usability.
black
We chose black as our formatter because it perfectly fits this need; quoting directly from the project:
Black is the uncompromising Python code formatter. By using it, you agree to cede control over minutiae
of hand-formatting. In return, Black gives you speed, determinism, and freedom from pycodestyle nagging
about formatting. You will save time and mental energy for more important matters. Blackened code looks
the same regardless of the project you’re reading. Formatting becomes transparent after a while and you can
focus on the content instead. Black makes code review faster by producing the smallest diffs possible.
black automatically reformats your Python code to conform to the PEP 8 style guide, among other guidelines. It takes
care of various aspects, including:
• Line Length: By default, it wraps lines to 88 characters, though this can be adjusted.
• Indentation: Ensures consistent use of 4 spaces for indentation.
• String Quotes: Converts all strings to use double quotes by default.
• Spacing: Adjusts spacing around operators and after commas to maintain readability.
For installation and more in-depth information on black, refer to its documentation.
Plugins executing black are available for our recommended IDEs:
• VSCode: Black Formatter Plugin
• Spyder: See this SO post
• JupyterLab: Code Formatter Plugin
isort
isort is a Python utility to sort imports alphabetically and automatically separate them into sections and by type.
Just like black, it ensures consistency of the code, focusing on the imports.
For installation and more in-depth information on isort, refer to its documentation.
A VSCode plugin is available.
git fetch -t
git checkout develop-white
git checkout develop-black
2. Switch to your feature branch and merge develop-white (in order to get the latest changes in develop before switching to black):
git merge develop-white
If merge conflicts arise, resolve them and conclude the merge as instructed by Git. It also helps to check if the tests pass after the merge.
3. Install and run the pre-commit hooks:
pre-commit install
pre-commit run --all-files
git add -u
git commit
Resolve all conflicts by choosing “Ours” over “Theirs” (“Current Change” over the “Incoming Change”).
Again, fix merge conflicts if they arise and check if the tests pass. Accept the incoming changes for the tutorials
1_main, Exposures, LitPop Impact, Forecast and TropicalCyclone unless you made changes to those. Again, the
file with the most likely merging conflicts is CHANGELOG.md, which should probably be resolved by accepting
both changes.
7. Finally, push your latest changes:
Manual download
As indicated in the software and tutorials, other data might need to be downloaded manually by the user. The following table shows these data sources, the version used, their current availability, and where they are used within CLIMADA:
f(1, 2, 3, 4, 5, 6, 7, 8)
This of course pleads the case on a strictly formal level. No real complexities have been reduced during the making of
this example.
Nevertheless there is the benefit of reduced test case requirements. And in real life, real complexity will be reduced.
Hard Coded
Hard coding constants is the preferred way to deal with strings that are used to identify objects or files.
# suboptimal
my_dict = {"x": 4}
if my_dict["x"] > 3:
msg = "well, arh, ..."
msg
# good
X = "x"
my_dict = {X: 4}
if my_dict[X] > 3:
msg = "yeah!"
msg
'yeah!'
# possibly overdoing it
X = "x"
Y = "this doesn't mean that every string must be a constant"
my_dict = {X: 4}
if my_dict[X] > 3:
msg = Y
msg
import pandas as pd

X = "x"
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
try:
    df.X  # attribute access does not work with the string constant
except AttributeError:
    from sys import stderr

    stderr.write("df.X does not work, but df[X] does\n")  # illustrative message
df[X]  # item access with the constant works
0    1
1    2
2    3
Name: x, dtype: int64
Configurable
When it comes to absolute paths, it is strongly suggested not to use hard-coded constant values, for obvious reasons. But also relative paths can cause problems. In particular, they may point to a location where the user does not have sufficient access permissions. In order to avoid these problems, all path constants in CLIMADA are supposed to be defined through configuration.
→ paths must be configurable
The same applies to urls to external resources, databases or websites. Since they may change at any time, their addresses
are supposed to be defined through configuration. Like this it will be possible to access them without the need of tampering
with the source code or waiting for a new release.
→ urls must be configurable
Another category of constants that should go into the configuration file are system specifications, such as the number of CPUs available for CLIMADA or memory settings.
→ OS settings must be configurable
3.11.2 Configuration
Configuration files
The proper place to define constants that a user may want (or need) to change without changing the CLIMADA installation
are the configuration files.
These are files in json format with the name climada.conf. There is a default config file that comes with the installation
of CLIMADA. But it’s possible to have several of them. In this case they are complementing one another.
CLIMADA looks for configuration files upon import climada. There are four locations to look for configuration files:
Format
A configuration file is a JSON file, with the additional restriction that all keys must be strings without a ‘.’ (dot) character.
The JSON format looks a lot like a Python dict. But note, that all strings must be surrounded by double quotes and
trailing commas are not allowed anywhere.
For configuration values that belong to a particular module it is suggested to reflect the code repositories file structure in
the json object. For example, if a configuration for my_config_value that belongs to the module climada.util.
dates_times is wanted, it would be defined as
{
"util": {
"dates_times": {
"my_config_value": 42
}
}
}
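With such a file in place, the value could then be accessed roughly as follows (this assumes CONFIG is importable from climada.util.config, following the access pattern described further below):
from climada.util.config import CONFIG

CONFIG.util.dates_times.my_config_value.int()  # -> 42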
Configuration string values can be referenced from other configuration values. E.g.
{
"a": "x",
"b": "{a}y"
}
In this example, the value of b resolves to "xy".
CONFIG.hazard
Data Types
The configuration itself and its attributes have the data type climada.util.config.Config
CONFIG.__class__, CONFIG.hazard.trop_cyclone.random_seed.__class__
(climada.util.config.Config, climada.util.config.Config)
The actual configuration values can be accessed as basic types (bool, float, int, str), provided that the definition is according
to the respective data type:
CONFIG.hazard.trop_cyclone.random_seed.int()
54
try:
    CONFIG.hazard.trop_cyclone.random_seed.str()
except Exception as e:
    from sys import stderr

    stderr.write(f"cannot convert random_seed to str: {e}\n")  # illustrative handling
However, configuration string values can be converted to pathlib.Path objects if they are pointing to a directory.
CONFIG.hazard.storm_europe.forecast_dir.dir()
Note that converting a configuration string to a Path object like this will create the specified directory on the fly, unless
dir is called with the parameter create=False.
Default Configuration
The configuration file climada/conf/climada.conf contains the default configuration.
On the top level it has the following attributes:
• local_data: definition of main paths for accessing and storing CLIMADA related data
– system: top directory, where (persistent) climada data is stored
default: ~/climada/data
– demo: top directory for data that is downloaded or created in the CLIMADA tutorials
default: ~/climada/demo/data
– save_dir: directory where transient (non-persistent) data is stored
default: ./results
• log_level: minimum log level shown by logging, one of DEBUG, INFO, WARNING, ERROR or CRITICAL.
default: INFO
• max_matrix_size: maximum matrix size that can be used, can be decreased in order to avoid memory issues
default: 100000000 (1e8)
• exposures: exposures modules specific configuration
• hazard: hazard modules specific configuration
CONFIG.__dict__.keys()
Test Configuration
The configuration values for unit and integration tests are not part of the default configuration, since they are irrelevant
for the regular CLIMADA user and only aimed for developers.
The default test configuration is defined in the climada.conf file of the installation directory. This file contains paths to files that are read during tests. If they are part of the GitHub repository, their path generally starts with the climada folder within the installation directory:
{
"_comment": "this is a climada configuration file meant to supersede the default configuration in climada/conf during test",
"test_directory": "./climada",
"test_data": "{test_directory}/test/data",
"disc_rates": {
"test_data": "{test_directory}/entity/disc_rates/test/data"
}
}
Obviously, the default test_directory is given as the relative path to ./climada. This is fine if (but only if) unit or
integration tests are started from the installation directory, which is the case in the automated tests on the CI server.
Developers who intend to start a test from another working directory may have to edit this file and replace the relative
path with the absolute path to the installation directory:
{
"_comment": "this is a climada configuration file meant to supersede the default␣
,→ configuration in climada/conf during test",
"test_directory": "/path/to/installation-dir/climada",
"test_data": "{test_directory}/test/data",
"disc_rates": {
"test_data": "{test_directory}/entity/disc_rates/test/data"
}
}
Data Initialization
When import climada is executed in a python script or shell, data files from the installation directory are copied to
the location specified in the current configuration.
This happens only when climada is used for the first time with the current configuration. Subsequent execution will only
check for presence of files and won’t overwrite existing files.
Thus, the home directory will automatically be populated with a climada directory and several files from the repository
when climada is used.
To prevent this and keep the home directory clean, create a config file ~/.config/climada.conf with customized
values for local_data.system and local_data.demo.
As an example, a file with the following content would suppress creation of directories and copying of files during execution
of CLIMADA code:
{
"local_data": {
"system": "/path/to/installation-dir/climada/data/system",
"demo": "/path/to/installation-dir/climada/data/demo"
}
}
Basic structure
Every tutorial should cover the following main points. Additional features characteristic to the modules presented can
and should be added as seen fit.
Introduction
Walk users through the core functions of the module and illustrate how the feature can be used. This obviously is dependent
on the feature itself. A few core points should be considered when creating the tutorial:
• SIZE MATTERS!
– each notebook as a total should not exceed the critical (yet vague) size of “a couple MB”
– keep the size of data you use as examples in the tutorial in mind
– we aim for computational efficiency
– a lean, well-organized, concise notebook is more informative than a long, messy all-encompassing one.
• follow the general CLIMADA naming convention for the notebook. For example: “climada_hazard_TropCyclone.ipynb”
Good examples
The following examples can be used as templates and inspiration for your tutorial:
• Exposure tutorial
• Hazard tutorial
Headers
To structure your tutorial, use headers of different levels to create sections and subsections.
To create a header, write the symbol # before your header name:
‘#’ : creates a header of level 1
‘##’ : creates a header of level 2
‘###’ : creates a header of level 3
‘####’ : creates a header of level 4
The title of the tutorial should be of level 1 (#), should have its own cell, and should be the first cell of the notebook.
Local Build
You can also build and browse the documentation on your machine. This can be useful if you want to access the docu-
mentation of a particular feature branch or to check your updates to the documentation.
For building the documentation, you need to follow the advanced installation instructions. Make sure to install the devel-
oper requirements as well.
Then, activate the climada_env and navigate to the doc directory:
conda activate climada_env
cd doc
Next, execute make (this might take a while when executed for the first time):
make html
The documentation will be placed in doc/_build/html. Simply open the page doc/_build/html/index.html
with your browser.
For re-creating the documentation environment, we provide a Dockerfile. You can use it to build a new environment
and extract the exact versions from it. This might be necessary when we upgrade to a new version of Python, or when
dependencies are updated. NOTE: Your machine must be able to run/virtualize an AMD64 OS.
cd climada_python
5. You have now entered the container. Activate the conda environment and export its specs:
Copy and paste the shell output of the last command into the requirements/env_docs.yml file in the CLIMADA repository, overwriting all its contents.
3.13 Testing
3.13.1 Notes on Testing
Any programming code that is meant to be used more than once should have a test, i.e., an additional piece of programming code that is able to check whether the original code is doing what it’s supposed to do.
Writing tests is work. As a matter of fact, it can be a lot of work; depending on the program, often more than writing the original code.
Luckily, it essentially always follows the same basic procedure, and there are a lot of tools and frameworks available to facilitate this work.
In CLIMADA we use the test runner pytest for the execution of tests.
Why do we write tests?
• The code is most certainly buggy if it’s not properly tested.
• Software without tests is worthless. It won’t be trusted and therefore it won’t be used.
When do we write tests?
• Before implementation. A very good idea. It is called Test Driven Development.
• During implementation. Test routines can be used to run code even while it’s not fully implemented. This is
better than running it interactively, because the full context is set up by the test.
By command line:
python -m unittest climada.x.test_y.TestY.test_z
Interactively:
climada.x.test_y.TestY().test_z()
• Right after implementation. In case the coverage analysis shows that there are missing tests, see Test Coverage.
• Later, when a bug was encountered. Whenever a bug gets fixed, also the tests need to be adapted or amended.
Testing types
Despite the common basic procedure, many different kinds of tests are distinguished (see Wikipedia: Software testing). Very commonly a distinction is made based on levels:
• Unit Test: tests only a small part of the code, a single function or method, essentially without interaction between
modules
• Integration Test: tests whether different methods and modules work well with each other
• System Test: tests the whole software at once, using the exposed interface to execute a program
Unit Tests
Unit tests are meant to check the correctness of program units, i.e., single methods or functions. They are supposed to be fast, simple and easy to write.
Developer guidelines:
• Ideally, each method or function should have at least one test method.
Naming suggestion: def xy() → def test_xy(), def test_xy_suffix1(), def test_xy_suffix2()
Functions that are created for the sole purpose of structuring the code do not necessarily have their own unit test.
• Aim at having very fast unit tests!
There will be hundreds of unit tests, and in general they are called in corpore and expected to finish within a reasonable amount of time.
Less than 10 milliseconds is good; 2 seconds is the maximum acceptable duration.
• A unit test shouldn’t call more than one climada method or function.
The motivation to combine more than one method in a test is usually creation of test data. Try to provide test data by
other means. Define them on the spot (within the code of the test module) or create a file in a test data directory that
can be read during the test. If this is too tedious, at least move the data acquisition part to the constructor of the test
class.
• Do not use external resources in unit tests.
Methods depending on external resources can be skipped from unit tests.
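A minimal sketch of a unit test following these guidelines is given below. The function under test is a hypothetical stand-in for a CLIMADA method; the test data is defined on the spot and no external resources are used:

import unittest


def exceedance_ratio(values, threshold):
    # hypothetical function under test: fraction of values strictly above a threshold
    return sum(v > threshold for v in values) / len(values)


class TestExceedanceRatio(unittest.TestCase):
    def test_exceedance_ratio(self):
        self.assertEqual(exceedance_ratio([1, 2, 3, 4], 2), 0.5)

    def test_exceedance_ratio_all_below(self):
        self.assertEqual(exceedance_ratio([1, 2], 10), 0.0)


if __name__ == "__main__":
    unittest.main()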
Integration Tests
Integration tests are meant to check the correctness of interaction between units of a module or a package.
As a general rule, more work is required to write integration tests than to write unit tests and they have longer runtime.
Developer guidelines:
System Tests
System tests are meant to check whether the whole software package is working correctly.
In CLIMADA, the system test that checks the core functionality of the package is executed by calling make in-
stall_test from the installation directory.
Error Messages
When a test fails, make sure the raised exception contains all information that might be helpful to identify the exact
problem.
If the error message is ever going to be read by someone other than the developer of the test, it is best to assume the
reader is completely naive about CLIMADA.
Writing extensive failure messages will eventually save more time than it takes to write them.
Putting the failure information into logs is neither required nor sufficient: the automated tests are built around error
messages, not logs.
Anything written to stdout by a test method is useful mainly for the developer of the test.
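For instance, a failure message that states the expectation, the observed value, and a hint about the likely cause could look like the following sketch; the variable names and the hint are illustrative only, not taken from CLIMADA:

import unittest


class TestErrorMessage(unittest.TestCase):
    def test_event_count(self):
        # illustrative data standing in for the result of an impact calculation
        at_event, n_events = [0.0, 1.5, 2.0], 3
        self.assertEqual(
            len(at_event),
            n_events,
            f"Expected one impact value per event ({n_events}), got {len(at_event)}; "
            "check whether events were dropped before the impact calculation.",
        )


if __name__ == "__main__":
    unittest.main()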
Test Coverage
Coverage is a measure of how much of your code is actually checked by the tests. One distinguishes between line coverage
and branch or conditionals coverage. The line coverage reports the percentage of all lines of code covered by the tests. The
branch coverage reports the percentage of all possible branches covered by the tests. Achieving a high branch coverage
is much harder than a high line coverage.
In CLIMADA, we aim for a high line coverage (only). Ideally, any new code should have a line coverage of 100%,
meaning every line of code is tested. You can inspect the test coverage of your local code by following the instructions
for executing tests below.
See the Continuous Integration Guide for information on how to inspect coverage of the automated test pipeline.
Test files
For integration tests it can be required to read data from a file, in order to set up a test that aims to check functionality with
non-trivial data, beyond the scope of unit tests. Some of these test files can be found in the climada/**/test/data
directories or in the climada/data directory. As is mostly the case with large test data, such files are not well suited for a
Git repository.
The preferable alternative is to post the data to the Climada Data-API with status test_dataset and retrieve the files
on the fly from there during tests. To do this one can use the convenience method climada.test.get_test_file:
my_test_file = get_test_file(
ds_name="my-test-file", file_format="hdf5"
) # returns a pathlib.Path object
Behind the scenes, get_test_file uses the climada.util.api_client.Client to identify the appropriate dataset
and downloads the respective file to the local dataset cache (~/climada/data/*).
The following example illustrates how access to an external resource can be injected as a function argument, so that a unit test can replace it with a local dummy:

from pathlib import Path
import unittest

import climada


def x(download_file=climada.util.files_handler.download_file):
    # the external resource access is an argument and can be replaced in tests
    filepath = download_file("http://real_data.ch")
    return Path(filepath).stat().st_size


class TestX(unittest.TestCase):
    @staticmethod
    def download_file_dummy(url):
        # stands in for the real download; points to a small local test file
        return "phony_data.ch"

    def test_x(self):
        self.assertEqual(44, x(download_file=self.download_file_dummy))
Developer guideline:
Test Configuration
Use the configuration file climada.config in the installation directory to define file paths and external resources used
during tests (see the Constants and Configuration Guide).
pytest <path>
where you replace <path> with a Python file containing tests or an entire directory containing multiple test files. Pytest
will walk through all subdirectories of <path> and try to discover all tests. For example, to execute all tests within the
CLIMADA repository, execute
pytest climada/
Installation Test
From the installation directory run
make install_test
It lasts about 45 seconds. If it succeeds, CLIMADA is properly installed and ready to use.
Unit Tests
From the installation directory run
make unit_test
It lasts about 5 minutes and runs unit tests for all modules.
Integration Tests
From the installation directory run
make integ_test
It lasts about 15 minutes and runs extensive integration tests, during which also data from external resources is read. An
open internet connection is required for a successful test run.
Coverage
Executing make unit_test and make integ_test provides local coverage reports as HTML pages at coverage/
index.html. You can open this file with your browser.
Creating a new Python environment is often not necessary (e.g., for a minor feature branch that does not change the
dependencies, you can probably use a generic development environment in which you installed CLIMADA in editable
mode), but it can help in some cases (for instance, when dependencies change).
Here is a generic set of instructions which should always work, assuming you have already cloned the climada repository
and are at the root of that folder:
Do not use a mutable default argument:

def function(default=[]):

but use

def function(default=None):
    if default is None:
        default = []

Prefer existing standard library functionality; for instance, instead of counting occurrences with a plain dict,

d = {}
for color in colors:
    d[color] = d.get(color, 0) + 1

use collections.defaultdict:

import collections

d = collections.defaultdict(int)
for color in colors:
    d[color] += 1
• Did the code writer perform a static code analysis? Does the code respect PEP 8 (see also the pylint config file)?
• Did the code writer perform profiling and check that there are no obviously inefficient parts in the code
(computation-time-wise and memory-wise)?
mkdir -p /cluster/project/climate/$USER \
/cluster/work/climate/$USER
module load \
gcc/12.2.0 \
stack/2024-06 \
python/3.11.6 \
hdf5/1.14.3 \
geos/3.9.1 \
sqlite/3.43.2 \
eccodes/2.25.0 \
gdal/3.6.3 \
eth_proxy
(The last two lines may seem odd, but they work around a conflicting dependency version situation.)
You need to execute this every time you log in to Euler before CLIMADA can be used. To save yourself from doing it
manually, append these lines to the ~/.bashrc script, which is automatically executed upon logging in to Euler.
envname=climada_env
# create environment
python -m venv --system-site-packages /cluster/project/climate/$USER/venv/$envname
# activate it
. /cluster/project/climate/$USER/venv/$envname/bin/activate
3. Install dependencies
pip install \
dask[dataframe] \
fiona==1.9 \
gdal==3.6 \
netcdf4==1.6.2 \
rasterio==1.4 \
pyproj==3.7 \
geopandas==1.0 \
xarray==2024.9 \
sparse==0.15
4. Install Climada
There are two options. Either install from the downloaded repository (option A), or use a particular released version
(option B).
option A
cd climada_python
pip install -e .
If you need to work with a specific branch of CLIMADA, you can do so by checking out the target branch your_branch
with git checkout your_branch after having cloned the CLIMADA repository and before running pip install
-e ..
option B
{
"local_data": {
"system": "/cluster/work/climate/USERNAME/climada/data",
"demo": "/cluster/project/climate/USERNAME/climada/data/demo",
"save_dir": "/cluster/work/climate/USERNAME/climada/results"
}
}
This should print the usual “OK” in the end. Once that has succeeded, you may want to test the installation on a compute
node as well, just for the sake of it:
cd climada_petals
pip install -e .
mkdir -p ~/.config/euler/jupyterhub
cat > ~/.config/euler/jupyterhub/jupyterlabrc <<EOF
module purge
EOF
ln -s /cluster/software/others/services/jupyterhub/scripts/jupyterlab.sh ~/.config/euler/jupyterhub/jupyterlab.sh
1. create a recipe
Create a file recipe.txt with the following content:
Bootstrap: docker
From: nvidia/cuda:12.0.0-devel-ubuntu22.04
%labels
version="1.0.0"
description="climada"
%post
# Install requirements
apt-get -y update
DEBIAN_FRONTEND="noninteractive" TZ="Europe/Rome" apt-get -y install tzdata
apt-get install -y `apt-cache depends openssh-client | awk '/Depends:/{print$2}'`
apt-get download openssh-client
dpkg --unpack openssh-client*.deb
rm /var/lib/dpkg/info/openssh-client.postinst -f
dpkg --configure openssh-client
apt-get -y install tk tcl rsync wget curl git patch
mkdir -p /opt/software
# Install jupyter
python -m pip install jupyterhub jupyterlab
%environment
#export LC_ALL=C
%runscript
. /opt/software/conda/bin/activate && conda activate climada_env
$@
sbatch \
--ntasks=1\
--cpus-per-task=1 \
--time=1:00:00 \
--job-name="build-climada-container" \
--mem-per-cpu=4096 \
--wrap="singularity build --sandbox /cluster/project/[path/to]/climada.sif recipe.txt"
3. Configure jupyterhub
Create a file ~/.config/euler/jupyterhub/jupyterlabrc with the following content:
#!/bin/bash
export JUPYTER_HOME=${JUPYTER_HOME:-$HOME}
export JUPYTER_DIR=${JUPYTER_DIR:-/}
export JUPYTER_EXTRA_ARGS=${JUPYTER_EXTRA_ARGS:-}
warn_jupyterhub
sleep 1
echo $PYTHON_EULER_ROOT
echo $JUPYTER_EXTRA_ARGS
echo $PROXY_PORT
export PYTHON_ROOT=/opt/software/conda/envs/climada_env
module purge
export APPTAINER_BIND="/cluster,$TMPDIR,$SCRATCH"
singularity exec --nv \
--env="NVIDIA_VISIBLE_DEVICES=all" \
--bind /cluster/project/[path/to]/climada_python:/opt/climada_workspace/climada_python \
--bind /cluster/project/[path/to]/climada_petals:/opt/climada_workspace/climada_petals \
/cluster/project/[path/to]/climada.sif \
/bin/bash <<EOF
. /opt/software/conda/bin/activate && conda activate climada_env
export JUPYTER_CONFIG_PATH=$HOME/.jupyterlab:$PYTHON_ROOT/share/jupyter
export JUPYTER_CONFIG_DIR=$HOME/.jupyterlab
export JUPYTER_PATH=$PYTHON_ROOT/share/jupyter
export JUPYTERLAB_DIR=$PYTHON_ROOT/share/jupyter/lab
export JUPYTERLAB_ROOT=$PYTHON_ROOT
export http_proxy=http://proxy.ethz.ch:3128
export https_proxy=http://proxy.ethz.ch:3128
jupyter lab build
jupyterhub-singleuser \
--preferred-dir="$JUPYTER_HOME" \
--notebook-dir="$JUPYTER_DIR" $JUPYTER_EXTRA_ARGS \
--keyfile="$CONFIG_PATH/jupyter.key" \
--certfile="$CONFIG_PATH/jupyter.crt" \
--port="$PROXY_PORT"
EOF
• Sarah Hülsen
• Timo Schmid
• Luca Severino
• Samuel Juhel
• Valentin Gebhart
FOUR
API REFERENCE
The API reference contains the whole specification of the code, that is, every module, class (and its attributes), and
function that is available (and documented).
class climada.engine.unsequa.calc_base.Calc
Bases: object
Base class for uncertainty quantification
Contains the generic sampling and sensitivity methods. For computing the uncertainty distribution for specific
CLIMADA outputs see the subclass CalcImpact and CalcCostBenefit.
_input_var_names
Names of the required uncertainty variables.
Type
tuple(str)
_metric_names
Names of the output metrics.
Type
tuple(str)
Notes
Parallelization logic: for the computation of the uncertainty, users may specify a number N of processes on which to
perform the computations in parallel. Since the computation for each individual sample of the input parameters is
independent of the others, we implemented a simple distribution over the processes.
1. The samples are divided into N equal sub-sample chunks
2. Each chunk of samples is sent as one batch to a process
Hence, this is equivalent to the user running the computation N times, once for each sub-sample. Note that for
each process, all the input variables must be copied once, and hence each parallel process requires roughly the same
amount of memory as if a single process were used.
This approach differs from the usual parallelization strategy (where individual samples are distributed), because
each sample requires the entire input data. With this method, copying data between processes is reduced to a
minimum.
Parallelization is currently not available for the sensitivity computation, as this requires all samples simultaneously
in the current implementation of the SALib library.
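The chunking logic described above can be pictured with the following illustrative sketch; it is not CLIMADA's actual implementation, and the evaluate function and sample values are placeholders:

import numpy as np
from multiprocessing import Pool


def evaluate(chunk):
    # placeholder for computing the output metrics of one chunk of samples
    return [float(np.sum(sample)) for sample in chunk]


if __name__ == "__main__":
    processes = 4
    samples = np.random.rand(1000, 3)            # one row per parameter sample
    chunks = np.array_split(samples, processes)  # N roughly equal sub-samples
    with Pool(processes) as pool:
        results = pool.map(evaluate, chunks)     # one chunk per process
    metrics = [value for chunk_result in results for value in chunk_result]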
__init__()
Empty constructor to be overwritten by subclasses
check_distr()
Log warning if input parameters repeated among input variables
Return type
True.
property input_vars
Uncertainty variables
Returns
All uncertainty variables associated with the calculation
Return type
tuple(UncVar)
property distr_dict
Dictionary of the input variable distribution
Probability density distribution of all the parameters of all the uncertainty variables listed in self.InputVars
Returns
distr_dict – Dictionary of all probability density distributions.
Return type
dict( sp.stats objects )
est_comp_time(n_samples, time_one_run, processes=None)
Estimate the computation time
Parameters
• n_samples (int/float) – The total number of samples
• time_one_run (int/float) – Estimated computation time for one parameter set in seconds
• processes (int, optional) – number of processes that would be used for parallel computation. The default
is None.
Return type
Estimated computation time in secs.
make_sample(N, sampling_method='saltelli', sampling_kwargs=None)
Make samples of the input variables
For all input parameters, sample from their respective distributions using the chosen sampling_method from
SALib. https://salib.readthedocs.io/en/latest/api.html
This sets the attributes unc_output.samples_df, unc_output.sampling_method, unc_output.sampling_kwargs.
Parameters
• N (int) – Number of samples as used in the sampling method from SALib
Notes
The ‘ff’ sampling method does not require a value for the N parameter; the input N value is hence ignored
for this method. The ‘ff’ sampling method requires the number of uncertainty parameters to be a power of 2.
Users can generate dummy variables to meet this requirement. Please refer to
https://salib.readthedocs.io/en/latest/api.html for more details.
µ See also
SALib.sample
sampling methods from SALib: https://salib.readthedocs.io/en/latest/api.html
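A possible call pattern is sketched below; this is only a sketch, in which exp_iv, impf_iv and haz_iv are assumed to be previously defined InputVar objects (or plain Exposures, ImpactFuncSet and Hazard objects):

from climada.engine.unsequa import CalcImpact

# exp_iv, impf_iv, haz_iv: assumed to exist, see climada.engine.unsequa.input_var
calc_imp = CalcImpact(exp_iv, impf_iv, haz_iv)
unc_sample = calc_imp.make_sample(N=2**7)  # 'saltelli' sampling by default
print(unc_sample.n_samples)
print(unc_sample.samples_df.head())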
Notes
The variables ‘Em’,’Term’,’X’,’Y’ are removed from the output of the ‘hdmr’ method to ensure compatibility
with unsequa. The ‘Delta’ method is currently not supported.
Returns
sens_output – Uncertainty data object with all the sensitivity indices, and all the uncertainty
data copied over from unc_output.
Return type
climada.engine.unsequa.UncOutput
climada.engine.unsequa.calc_cost_benefit module
Type
tuple(str)
_metric_names
Names of the cost benefit output metrics (‘tot_climate_risk’, ‘benefit’, ‘cost_ben_ratio’, ‘imp_meas_present’,
‘imp_meas_future’)
Type
tuple(str)
__init__(haz_input_var: InputVar | Hazard, ent_input_var: InputVar | Entity, haz_fut_input_var: InputVar |
Hazard | None = None, ent_fut_input_var: InputVar | Entity | None = None)
Initialize UncCalcCostBenefit
Sets the uncertainty input variables, the cost benefit metric_names, and the units.
Parameters
• haz_input_var (climada.engine.uncertainty.input_var.InputVar or climada.hazard.Hazard) – Hazard
uncertainty variable or Hazard for the present Hazard in climada.engine.CostBenefit.calc
• ent_input_var (climada.engine.uncertainty.input_var.InputVar or climada.entity.Entity) – Entity
uncertainty variable or Entity for the present Entity in climada.engine.CostBenefit.calc
• haz_fut_input_var (climada.engine.uncertainty.input_var.InputVar or climada.hazard.Hazard, optional) –
Hazard uncertainty variable or Hazard for the future Hazard. The default is None.
• ent_fut_input_var (climada.engine.uncertainty.input_var.InputVar or climada.entity.Entity, optional) –
Entity uncertainty variable or Entity for the future Entity in climada.engine.CostBenefit.calc
uncertainty(unc_sample, processes=1, chunksize=None, **cost_benefit_kwargs)
Computes the cost benefit for each sample in unc_output.sample_df.
By default, imp_meas_present, imp_meas_future, tot_climate_risk, benefit, cost_ben_ratio are computed.
This sets the attributes: unc_output.imp_meas_present_unc_df, unc_output.imp_meas_future_unc_df
unc_output.tot_climate_risk_unc_df unc_output.benefit_unc_df unc_output.cost_ben_ratio_unc_df
unc_output.unit unc_output.cost_benefit_kwargs
Parameters
• unc_sample (climada.engine.uncertainty.unc_output.UncOutput) – Uncertainty data object
with the input parameters samples
• processes (int, optional) – Number of CPUs to use for parallel computations. The default is
1 (not parallel)
• cost_benefit_kwargs (keyword arguments) – Keyword arguments passed on to cli-
mada.engine.CostBenefit.calc()
• chunksize (int, optional) – Size of the sample chunks for parallel processing. Default is equal
to the number of samples divided by the number of processes.
Returns
unc_output – Uncertainty data object with the cost benefit outputs for each sample and all
the sample data copied over from unc_sample.
Return type
climada.engine.uncertainty.unc_output.UncCostBenefitOutput
Raises
ValueError: – If no sampling parameters defined, the uncertainty distribution cannot be
computed.
Notes
Parallelization logic is described in the base class here Calc
µ See also
climada.engine.cost_benefit
compute risk and adaptation option cost benefits.
climada.engine.unsequa.calc_impact module
Type
InputVar or ImpactFuncSet
haz_input_var
Hazard uncertainty variable
Type
InputVar or Hazard
_input_var_names
Names of the required uncertainty input variables (‘exp_input_var’, ‘impf_input_var’, ‘haz_input_var’)
Type
tuple(str)
_metric_names
Names of the impact output metrics (‘aai_agg’, ‘freq_curve’, ‘at_event’, ‘eai_exp’)
Type
tuple(str)
__init__(exp_input_var: InputVar | Exposures, impf_input_var: InputVar | ImpactFuncSet, haz_input_var:
InputVar | Hazard)
Initialize UncCalcImpact
Sets the uncertainty input variables, the impact metric_names, and the units.
Parameters
• exp_input_var (climada.engine.uncertainty.input_var.InputVar or climada.entity.Exposure)
– Exposure uncertainty variable or Exposure
• impf_input_var (climada.engine.uncertainty.input_var.InputVar or cli-
mada.entity.ImpactFuncSet) – Impact function set uncertainty variable or Impact function
set
• haz_input_var (climada.engine.uncertainty.input_var.InputVar or climada.hazard.Hazard)
– Hazard uncertainty variable or Hazard
uncertainty(unc_sample, rp=None, calc_eai_exp=False, calc_at_event=False, processes=1, chunksize=None)
Computes the impact for each sample in unc_data.sample_df.
By default, the aggregated average impact within a period of 1/frequency_unit (impact.aai_agg) and the exceedance
impact at return periods rp (impact.calc_freq_curve(self.rp).impact) are computed. Optionally, eai_exp and
at_event are computed (this may require a larger amount of memory if the number of samples and/or the
number of centroids and/or exposure points is large).
This sets the attributes self.rp, self.calc_eai_exp, self.calc_at_event, self.metrics.
This sets the attributes: unc_output.aai_agg_unc_df, unc_output.freq_curve_unc_df
unc_output.eai_exp_unc_df unc_output.at_event_unc_df unc_output.unit
Parameters
• unc_sample (climada.engine.uncertainty.unc_output.UncOutput) – Uncertainty data object
with the input parameters samples
• rp (list(int), optional) – Return periods in years to be computed. The default is [5, 10, 20,
50, 100, 250].
• calc_eai_exp (boolean, optional) – Toggle computation of the impact at each centroid loca-
tion. The default is False.
• calc_at_event (boolean, optional) – Toggle computation of the impact for each event. The
default is False.
• processes (int, optional) – Number of CPUs to use for parallel computations. The default is
1 (not parallel)
• chunksize (int, optional) – Size of the sample chunks for parallel processing. Default is equal
to the number of samples divided by the number of processes.
Returns
unc_output – Uncertainty data object with the impact outputs for each sample and all the
sample data copied over from unc_sample.
Return type
climada.engine.uncertainty.unc_output.UncImpactOutput
Raises
ValueError: – If no sampling parameters defined, the distribution cannot be computed.
Notes
Parallelization logic is described in the base class here Calc
µ See also
climada.engine.impact
compute impact and risk.
climada.engine.unsequa.input_var module
Notes
A few default Variables are defined for Hazards, Exposures, Impact Functions, Measures and Entities.
Examples
Categorical variable function: LitPop exposures with m,n exponents in [0,5]
Returns
axes – The figure and axes handle of the plot.
Return type
matplotlib.pyplot.figure, matplotlib.pyplot.axes
static var_to_inputvar(var)
Returns an uncertainty variable with no distribution if var is not an InputVar. Else, returns var.
Parameters
var (climada.uncertainty.InputVar or any other CLIMADA object)
Returns
var if var is InputVar, else InputVar with var and no distribution.
Return type
InputVar
static haz(haz_list, n_ev=None, bounds_int=None, bounds_frac=None, bounds_freq=None)
Helper wrapper for basic hazard uncertainty input variable
The following types of uncertainties can be added:
HE: sub-sampling events from the total event set
For each sub-sample, n_ev events are sampled with replacement. HE is the value of the seed for the
uniform random number generator.
HI: scale the intensity of all events (homogeneously)
The intensity of all events is multiplied by a number sampled uniformly from a distribution with (min,
max) = bounds_int
HA: scale the fraction of all events (homogeneously)
The fraction of all events is multiplied by a number sampled uniformly from a distribution with (min,
max) = bounds_frac
HF: scale the frequency of all events (homogeneously)
The frequency of all events is multiplied by a number sampled uniformly from a distribution with (min,
max) = bounds_freq
Return type
climada.engine.unsequa.input_var.InputVar
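A short usage sketch follows; hazard is assumed to be an existing climada.hazard.Hazard object, and the chosen bounds are arbitrary illustration values:

from climada.engine.unsequa import InputVar

# hazard: assumed to be a pre-loaded climada.hazard.Hazard
haz_iv = InputVar.haz(
    [hazard],                # list with one (or several) base hazards
    n_ev=None,               # no event sub-sampling (HE) in this example
    bounds_int=(0.9, 1.1),   # HI: homogeneous intensity scaling
    bounds_freq=(0.8, 1.2),  # HF: homogeneous frequency scaling
)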
static impfset(impf_set_list, haz_id_dict=None, bounds_mdd=None, bounds_paa=None,
bounds_impfi=None)
Helper wrapper for basic impact function set uncertainty input variable.
One impact function (chosen with haz_type and fun_id) is characterized.
The following types of uncertainties can be added:
MDD: scale the mdd (homogeneously)
The value of mdd at each intensity is multiplied by a number sampled uniformly from a distribution with
(min, max) = bounds_mdd
PAA: scale the paa (homogeneously)
The value of paa at each intensity is multiplied by a number sampled uniformly from a distribution with
(min, max) = bounds_paa
IFi: shift the intensity (homogeneously)
The intensity values are all summed with a random number sampled uniformly from a distribution with
(min, max) = bounds_impfi
IL: sample uniformly from impact function set list
From the provided list of impact function sets elements are uniformly sampled. For example, impact
functions obtained from different calibration methods.
If a bounds is None, this parameter is assumed to have no uncertainty.
Parameters
• impf_set_list (list of ImpactFuncSet) – The list of base impact function sets. Can be one
or many to uniformly sample from. The impact function ids must be identical for all impact
function sets.
• bounds_mdd ((float, float), optional) – Bounds of the uniform distribution for the homoge-
neous mdd scaling. The default is None.
• bounds_paa ((float, float), optional) – Bounds of the uniform distribution for the homoge-
neous paa scaling. The default is None.
• bounds_impfi ((float, float), optional) – Bounds of the uniform distribution for the homo-
geneous shift of intensity. The default is None.
• haz_id_dict (dict(), optional) – Dictionary of the impact functions affected by uncer-
tainty. Keys are hazard types (str), values are a list of impact function id (int). Default
is impf_set.get_ids(), i.e., all impact functions in the set
Returns
Uncertainty input variable for an impact function set object.
Return type
climada.engine.unsequa.input_var.InputVar
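A corresponding sketch for an impact function set; impf_set is assumed to be an existing ImpactFuncSet for the hazard type "TC", and the bounds and ids are arbitrary illustration values:

from climada.engine.unsequa import InputVar

# impf_set: assumed to be a pre-loaded ImpactFuncSet for hazard type "TC"
impf_iv = InputVar.impfset(
    [impf_set],
    bounds_mdd=(0.8, 1.2),    # MDD: homogeneous mdd scaling
    haz_id_dict={"TC": [1]},  # impact function ids affected by the uncertainty
)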
static ent(impf_set_list, disc_rate, exp_list, meas_set, haz_id_dict, bounds_disc=None, bounds_cost=None,
bounds_totval=None, bounds_noise=None, bounds_mdd=None, bounds_paa=None,
bounds_impfi=None)
Helper wrapper for basic entity set uncertainty input variable.
Important: only the impact function defined by haz_type and fun_id will be affected by bounds_impfi,
bounds_mdd, bounds_paa.
The following types of uncertainties can be added:
• impf_set_list (list of ImpactFuncSet) – The list of base impact function sets. Can be one
or many to uniformly sample from. The impact function ids must be identical for all impact
function sets.
• disc_rate (climada.entity.disc_rates.base.DiscRates) – The base discount rates.
• exp_list ([climada.entity.exposures.base.Exposure]) – The list of base exposure. Can be one
or many to uniformly sample from.
• meas_set (climada.entity.measures.measure_set.MeasureSet) – The base measures.
• haz_id_dict (dict) – Dictionary of the impact functions affected by uncertainty. Keys are
hazard types (str), values are a list of impact function id (int).
Returns
Entity uncertainty input variable
Return type
climada.engine.unsequa.input_var.InputVar
static entfut(impf_set_list, exp_list, meas_set, haz_id_dict, bounds_cost=None, bounds_eg=None,
bounds_noise=None, bounds_impfi=None, bounds_mdd=None, bounds_paa=None)
Helper wrapper for basic future entity set uncertainty input variable.
Important: only the impact function defined by haz_type and fun_id will be affected by bounds_impfi,
bounds_mdd, bounds_paa.
The following types of uncertainties can be added:
CO: scale the cost (homogeneously)
The cost of all measures is multiplied by the same number sampled uniformly from a distribution with
(min, max) = bounds_cost
EG: scale the exposures growth (homogeneously)
The value at each exposure point is multiplied by a number sampled uniformly from a distribution with
(min, max) = bounds_eg
EN: multiplicative noise (inhomogeneous)
The value of each exposure point is independently multiplied by a random number sampled uniformly
from a distribution with (min, max) = bounds_noise. EN is the value of the seed for the uniform random
number generator.
EL: sample uniformly from exposure list
From the provided list of exposures, elements are uniformly sampled. For example, LitPop instances
with different exponents.
MDD: scale the mdd (homogeneously)
The value of mdd at each intensity is multiplied by a number sampled uniformly from a distribution with
(min, max) = bounds_mdd
PAA: scale the paa (homogeneously)
The value of paa at each intensity is multiplied by a number sampled uniformly from a distribution with
(min, max) = bounds_paa
IFi: shift the impact function intensity (homogeneously)
The intensity values are all summed with a random number sampled uniformly from a distribution with
(min, max) = bounds_impfi
IL: sample uniformly from impact function set list
From the provided list of impact function sets elements are uniformly sampled. For example, impact
functions obtained from different calibration methods.
If a bounds is None, this parameter is assumed to have no uncertainty.
Parameters
• bounds_cost ((float, float), optional) – Bounds of the uniform distribution for the homoge-
neous cost of all measures scaling. The default is None.
• bounds_eg ((float, float), optional) – Bounds of the uniform distribution for the homoge-
neous total exposure growth scaling. The default is None.
• bounds_noise ((float, float), optional) – Bounds of the uniform distribution to scale each
exposure point independently. The default is None.
• bounds_mdd ((float, float), optional) – Bounds of the uniform distribution for the homoge-
neous mdd scaling. The default is None.
• bounds_paa ((float, float), optional) – Bounds of the uniform distribution for the homoge-
neous paa scaling. The default is None.
• bounds_impfi ((float, float), optional) – Bounds of the uniform distribution for the homo-
geneous shift of intensity. The default is None.
• impf_set_list (list of ImpactFuncSet) – The list of base impact function sets. Can be one
or many to uniformly sample from. The impact function ids must be identical for all impact
function sets.
• exp_list ([climada.entity.exposures.base.Exposure]) – The list of base exposure. Can be one
or many to uniformly sample from.
• meas_set (climada.entity.measures.measure_set.MeasureSet) – The base measures.
• haz_id_dict (dict) – Dictionary of the impact functions affected by uncertainty. Keys are
hazard types (str), values are a list of impact function id (int).
Returns
Entity uncertainty input variable
Return type
climada.engine.unsequa.input_var.InputVar
climada.engine.unsequa.unc_output module
__init__(samples_df, unit=None)
Initialize Uncertainty Data object.
Parameters
• samples_df (pandas.DataFrame) – input parameters samples
• unit (str, optional) – value unit
order_samples(by_parameters)
Function to sort the samples dataframe.
Note: the unc_output.samples_df is ordered in place.
Parameters
by_parameters (list[string]) – List of the uncertainty parameters to sort by (ordering in list is
kept)
Return type
None.
get_samples_df()
get_unc_df(metric_name)
set_unc_df(metric_name, unc_df )
get_sens_df(metric_name)
set_sens_df(metric_name, sens_df )
check_salib(sensitivity_method )
Checks whether the chosen sensitivity method and the sampling method used to generate self.samples_df
respect the pairing recommendation by the SALib package.
https://salib.readthedocs.io/en/latest/api.html
Parameters
sensitivity_method (str) – Name of the sensitivity analysis method.
Returns
True if sampling and sensitivity methods respect the recommended pairing.
Return type
bool
property sampling_method
Returns the sampling method used to generate self.samples_df See: https://salib.readthedocs.io/en/latest/
api.html#
Returns
Sampling method name
Return type
str
property sampling_kwargs
Returns the kwargs of the sampling method that generated self.samples_df
Returns
Dictionary of arguments for SALib sampling method
Return type
dict
property n_samples
The effective number of samples
Returns
effective number of samples
Return type
int
property param_labels
Labels of all uncertainty input parameters.
Returns
Labels of all uncertainty input parameters.
Return type
list of str
property problem_sa
The description of the uncertainty variables and their distribution as used in SALib. https://salib.readthedocs.
io/en/latest/basics.html
Returns
Salib problem dictionary.
Return type
dict
property uncertainty_metrics
Retrieve all uncertainty output metrics names
Returns
unc_metric_list – List of names of attributes containing metrics uncertainty values, without
the trailing ‘_unc_df’
Return type
[str]
property sensitivity_metrics
Retrieve all sensitivity output metrics names
Returns
sens_metric_list – List of names of attributes containing metrics sensitivity values, without the
trailing ‘_sens_df’
Return type
[str]
get_uncertainty(metric_list=None)
Returns uncertainty dataframe with values for each sample
Parameters
metric_list ([str], optional) – List of uncertainty metrics to consider. The default returns all
uncertainty metrics at once.
Returns
Joint dataframe of all uncertainty values for all metrics in the metric_list.
Return type
pandas.DataFrame
µ See also
uncertainty_metrics
list of all available uncertainty metrics
get_sensitivity(salib_si, metric_list=None)
Returns sensitivity index
E.g. For the sensitivity analysis method ‘sobol’, the choices are [‘S1’, ‘ST’], for ‘delta’ the choices are [‘delta’,
‘S1’].
For more information see the SAlib documentation: https://salib.readthedocs.io/en/latest/basics.html
Parameters
• salib_si (str) – Sensitivity index
• metric_list ([str], optional) – List of sensitivity metrics to consider. The default returns all
sensitivity indices at once.
Returns
Joint dataframe of the sensitivity indices for all metrics in the metric_list
Return type
pandas.DataFrame
µ See also
sensitivity_metrics
list of all available sensitivity metrics
Parameters
figsize (tuple(int or float, int or float), optional) – The figsize argument of mat-
plotlib.pyplot.subplots() The default is derived from the total number of plots (nplots) as:
Raises
ValueError – If no sample was computed the plot cannot be made.
Returns
axes – The axis handle of the plot.
Return type
matplotlib.pyplot.axes
plot_uncertainty(metric_list=None, orig_list=None, figsize=None, log=False, axes=None, calc_delta=False)
Plot the uncertainty distribution
For each risk metric, a separate axes is used to plot the uncertainty distribution of the output values obtained
over the sampled input parameters.
Parameters
• metric_list (list[str], optional) – List of metrics to plot the distribution. The default is None.
• orig_list (list[float], optional) – List of the original (without uncertainty) values for each
sub-metric of the metrics in metric_list. The ordering is identical. The default is None.
• figsize (tuple(int or float, int or float), optional) – The figsize argument of mat-
plotlib.pyplot.subplots() The default is derived from the total number of plots (nplots) as:
nrows, ncols = int(np.ceil(nplots / 3)), min(nplots, 3) figsize = (ncols * FIG_W, nrows *
FIG_H)
• log (boolean, optional) – Use log10 scale for x axis. Default is False.
• axes (matplotlib.pyplot.axes, optional) – Axes handles to use for the plot. The default is None.
• calc_delta (boolean, optional) – Adapt x axis label for CalcDeltaImpact unc_output. Default
is False.
Raises
ValueError – If no metric distribution was computed the plot cannot be made.
Returns
axes – The axes handle of the plot.
Return type
matplotlib.pyplot.axes
µ See also
uncertainty_metrics
list of all available uncertainty metrics
• orig_list (list[float], optional) – List of the original (without uncertainty) values for each
sub-metric of the metrics in metric_list. The ordering is identical. The default is None.
• figsize (tuple(int or float, int or float), optional) – The figsize argument of mat-
plotlib.pyplot.subplots() The default is (16, 6)
• axes (matplotlib.pyplot.axes, optional) – Axes handles to use for the plot. The default is None.
• calc_delta (boolean, optional) – Adapt axis labels for CalcDeltaImpact unc_output. Default
is False.
Raises
ValueError – If no metric distribution was computed the plot cannot be made.
Returns
axes – The axis handle of the plot.
Return type
matplotlib.pyplot.axes
plot_sensitivity(salib_si='S1', salib_si_conf='S1_conf', metric_list=None, figsize=None, axes=None,
**kwargs)
Bar plot of a first order sensitivity index
For each metric, the sensitivity indices are plotted in a separate axes.
This requires that a sensitivity analysis was already performed.
E.g. For the sensitivity analysis method ‘sobol’, the choices are [‘S1’, ‘ST’], for ‘delta’ the choices are [‘delta’,
‘S1’].
Note that not all sensitivity indices have a confidence interval.
For more information see the SAlib documentation: https://salib.readthedocs.io/en/latest/basics.html
Parameters
• salib_si (string, optional) – The first order (one value per metric output) sensitivity index to
plot. The default is S1.
• salib_si_conf (string, optional) – The confidence value for the first order sensitivity index to
plot. The default is S1_conf.
• metric_list (list of strings, optional) – List of metrics to plot the sensitivity. If a metric is not
found it is ignored.
• figsize (tuple(int or float, int or float), optional) – The figsize argument of mat-
plotlib.pyplot.subplots() The default is derived from the total number of plots (nplots) as:
• axes (matplotlib.pyplot.axes, optional) – Axes handles to use for the plot. The default is None.
• kwargs – Keyword arguments passed on to pandas.DataFrame.plot(kind=’bar’)
Raises
ValueError : – If no sensitivity is available the plot cannot be made.
Returns
axes – The axes handle of the plot.
Return type
matplotlib.pyplot.axes
µ See also
sensitivity_metrics
list of all available sensitivity metrics
• axes (matplotlib.pyplot.axes, optional) – Axes handles to use for the plot. The default is None.
• kwargs – Keyword arguments passed on to matplotlib.pyplot.imshow()
Raises
ValueError : – If no sensitivity is available the plot cannot be made.
Returns
axes – The axes handle of the plot.
Return type
matplotlib.pyplot.axes
µ See also
sensitivity_metrics
list of all available sensitivity metrics
plot_sensitivity_map(salib_si='S1', **kwargs)
Plot a map of the largest sensitivity index in each exposure point
Requires the uncertainty distribution for eai_exp.
Parameters
• salib_si (str, optional) – The name of the sensitivity index to plot. The default is ‘S1’.
• kwargs – Keyword arguments passed on to climada.util.plot.geo_scatter_categorical
Raises
ValueError : – If no sensitivity data is found, raise error.
Returns
ax – The axis handle of the plot.
Return type
matplotlib.pyplot.axes
µ See also
climada.util.plot.geo_scatter_categorical
geographical plot for categorical variable
to_hdf5(filename=None)
Save output to .hdf5
Parameters
filename (str or pathlib.Path, optional) – The filename with absolute or relative path. The
default name is “unc_output + datetime.now() + .hdf5” and the default path is taken from cli-
mada.config
Returns
save_path – Path to the saved file
Return type
pathlib.Path
static from_hdf5(filename)
Load a uncertainty and uncertainty output data from .hdf5 file
Parameters
filename (str or pathlib.Path) – The filename with absolute or relative path.
Returns
unc_output – Uncertainty and sensitivity data loaded from .hdf5 file.
Return type
climada.engine.uncertainty.unc_output.UncOutput
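A short sketch of saving and reloading an uncertainty output; unc_output is assumed to be an UncOutput obtained from one of the Calc* classes:

from climada.engine.unsequa import UncOutput

# unc_output: assumed to be an existing UncOutput instance
save_path = unc_output.to_hdf5()           # default name and path taken from climada.config
reloaded = UncOutput.from_hdf5(save_path)  # restores uncertainty and sensitivity data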
class climada.engine.unsequa.unc_output.UncCostBenefitOutput(samples_df, unit,
imp_meas_present_unc_df,
imp_meas_future_unc_df,
tot_climate_risk_unc_df,
benefit_unc_df,
cost_ben_ratio_unc_df,
cost_benefit_kwargs)
Bases: UncOutput
Extension of UncOutput specific for CalcCostBenefit, returned by the uncertainty() method.
__init__(samples_df, unit, imp_meas_present_unc_df, imp_meas_future_unc_df, tot_climate_risk_unc_df,
benefit_unc_df, cost_ben_ratio_unc_df, cost_benefit_kwargs)
Constructor
Bases: UncOutput
Extension of UncOutput specific for CalcDeltaImpact, returned by the uncertainty() method.
__init__(samples_df, unit, aai_agg_unc_df, freq_curve_unc_df, eai_exp_unc_df, at_event_initial_unc_df,
at_event_final_unc_df, coord_df )
Constructor
Uncertainty output values from impact.calc for each sample
Parameters
• samples_df (pandas.DataFrame) – input parameters samples
• unit (str) – value unit
• aai_agg_unc_df (pandas.DataFrame) – Each row contains the value of aai_agg for one
sample (row of samples_df)
• freq_curve_unc_df (pandas.DataFrame) – Each row contains the values of the impact exceedance
frequency curve for one sample (row of samples_df)
• eai_exp_unc_df (pandas.DataFrame) – Each row contains the values of eai_exp for one
sample (row of samples_df)
• at_event_initial_unc_df (pandas.DataFrame) – Each row contains the values of at_event
for one sample (row of samples_df)
• at_event_final_unc_df (pandas.DataFrame) – Each row contains the values of at_event for
one sample (row of samples_df)
• coord_df (pandas.DataFrame) – Coordinates of the exposure
climada.engine.unsequa.calc_delta_climate module
rp
List of the chosen return periods.
Type
list(int)
calc_eai_exp
Compute eai_exp or not
Type
bool
calc_at_event
Compute at_event or not
Type
bool
value_unit
Unit of the exposures value
Type
str
exp_input_var
Exposure uncertainty variable
Type
InputVar or Exposures
impf_input_var
Impact function set uncertainty variable
Type
InputVar or ImpactFuncSet
haz_input_var
Hazard uncertainty variable
Type
InputVar or Hazard
_input_var_names
Names of the required uncertainty input variables (‘exp_initial_input_var’, ‘impf_initial_input_var’,
‘haz_initial_input_var’, ‘exp_final_input_var’, ‘impf_final_input_var’, ‘haz_final_input_var’)
Type
tuple(str)
_metric_names
Names of the impact output metrics (‘aai_agg’, ‘freq_curve’, ‘at_event’, ‘eai_exp’)
Type
tuple(str)
__init__(exp_initial_input_var: InputVar | Exposures, impf_initial_input_var: InputVar | ImpactFuncSet,
haz_initial_input_var: InputVar | Hazard, exp_final_input_var: InputVar | Exposures,
impf_final_input_var: InputVar | ImpactFuncSet, haz_final_input_var: InputVar | Hazard)
Initialize UncCalcImpact
Sets the uncertainty input variables, the impact metric_names, and the units.
Parameters
Return type
climada.engine.uncertainty.unc_output.UncImpactOutput
Raises
ValueError: – If no sampling parameters defined, the distribution cannot be computed.
Notes
Parallelization logic is described in the base class here Calc
µ See also
climada.engine.impact
compute impact and risk.
climada.engine.calibration_opt module
climada.engine.calibration_opt.calib_instance(hazard, exposure, impact_func, df_out=Empty
DataFrame Columns: [] Index: [], yearly_impact=False,
return_cost='False' )
calculate one impact instance for the calibration algorithm and write to given DataFrame
Parameters
• hazard (Hazard)
• exposure (Exposure)
• impact_func (ImpactFunc)
• df_out (Dataframe, optional) – Output DataFrame with headers of columns defined and op-
tionally with first row (index=0) defined with values. If columns “impact”, “event_id”, or
“year” are not included, they are created here. Data like reported impacts or impact function
parameters can be given here; values are preserved.
• yearly_impact (boolean, optional) – if set True, impact is returned per year, not per event
• return_cost (str, optional) – if not ‘False’ but any of ‘R2’, ‘logR2’, cost is returned instead of
df_out
Returns
df_out – DataFrame with modelled impact written to rows for each year or event.
Return type
DataFrame
climada.engine.calibration_opt.init_impf(impf_name_or_instance, param_dict, df_out=Empty
DataFrame Columns: [] Index: [0])
create an ImpactFunc based on the parameters in param_dict using the method specified in
impf_parameterisation_name and document it in df_out.
Parameters
• impf_name_or_instance (str or ImpactFunc) – method of impact function parameterisation
e.g. ‘emanuel’ or an instance of ImpactFunc
• param_dict (dict, optional) – dict of parameter_names and values e.g. {‘v_thresh’: 25.7,
‘v_half’: 70, ‘scale’: 1} or {‘mdd_shift’: 1.05, ‘mdd_scale’: 0.8, ‘paa_shift’: 1, paa_scale’: 1}
Returns
Parameters
• df_out (pd.Dataframe) – DataFrame as created in calib_instance
• cost_function (str) – chooses the cost function e.g. ‘R2’ or ‘logR2’
Returns
cost – The results of the cost function when comparing modelled and reported impact
Return type
float
Returns
param_dict_result – the parameters with the best calibration results (or a tuple with (1) the pa-
rameters and (2) the optimization output)
Return type
dict or tuple
climada.engine.cost_benefit module
class climada.engine.cost_benefit.CostBenefit(present_year: int = 2016, future_year: int = 2030,
tot_climate_risk: float = 0.0, unit: str = 'USD',
color_rgb: Dict[str, ndarray] | None = None, benefit:
Dict[str, float] | None = None, cost_ben_ratio: Dict[str,
float] | None = None, imp_meas_present: Dict[str, float |
Tuple[float, float] | Impact | ImpactFreqCurve] | None
= None, imp_meas_future: Dict[str, float | Tuple[float,
float] | Impact | ImpactFreqCurve] | None = None)
Bases: object
Impact definition. Compute from an entity (exposures and impact functions) and hazard.
present_year
present reference year
Type
int
future_year
future year
Type
int
tot_climate_risk
total climate risk without measures
Type
float
unit
unit used for impact
Type
str
color_rgb
color code RGB for each measure.
Type
dict
Key
measure name (‘no measure’ used for case without measure),
Type
str
Value
Type
np.array
benefit
benefit of each measure. Key: measure name, Value: float benefit
Type
dict
cost_ben_ratio
cost benefit ratio of each measure. Key: measure name, Value: float cost benefit ratio
Type
dict
imp_meas_future
impact of each measure at future or default. Key: measure name (‘no measure’ used for case without mea-
sure), Value: dict with: ‘cost’ (tuple): (cost measure, cost factor insurance), ‘risk’ (float): risk measurement,
‘risk_transf’ (float): annual expected risk transfer, ‘efc’ (ImpactFreqCurve): impact exceedance freq (op-
tional) ‘impact’ (Impact): impact instance
Type
dict
imp_meas_present
impact of each measure at present. Key: measure name (‘no measure’ used for case without measure), Value:
dict with: ‘cost’ (tuple): (cost measure, cost factor insurance), ‘risk’ (float): risk measurement, ‘risk_transf’
(float): annual expected risk transfer, ‘efc’ (ImpactFreqCurve): impact exceedance freq (optional) ‘impact’
(Impact): impact instance
Type
dict
__init__(present_year: int = 2016, future_year: int = 2030, tot_climate_risk: float = 0.0, unit: str = 'USD',
color_rgb: Dict[str, ndarray] | None = None, benefit: Dict[str, float] | None = None, cost_ben_ratio:
Dict[str, float] | None = None, imp_meas_present: Dict[str, float | Tuple[float, float] | Impact |
ImpactFreqCurve] | None = None, imp_meas_future: Dict[str, float | Tuple[float, float] | Impact |
ImpactFreqCurve] | None = None)
Initialization
calc(hazard, entity, haz_future=None, ent_future=None, future_year=None, risk_func=<function
risk_aai_agg>, imp_time_depen=None, save_imp=False, assign_centroids=True)
Compute cost-benefit ratio for every measure provided current and, optionally, future conditions. Present and
future measures need to have the same name. The measures costs need to be discounted by the user. If future
entity provided, only the costs of the measures of the future and the discount rates of the present will be used.
Parameters
• hazard (climada.Hazard)
• entity (climada.entity)
• haz_future (climada.Hazard, optional) – hazard in the future (future year provided at
ent_future)
• ent_future (Entity, optional) – entity in the future. Default is None
• future_year (int, optional) – future year to consider if no ent_future is provided. Default
is None. The benefits are added from the entity.exposures.ref_year until
ent_future.exposures.ref_year, or until future_year if no ent_future is given. Default: en-
tity.exposures.ref_year+1
• risk_func (func optional) – function describing risk measure to use to compute the annual
benefit from the Impact. Default: average annual impact (aggregated).
• imp_time_depen (float, optional) – parameter which represents time evolution of impact
(super- or sublinear). If None: all years count the same when there is no future hazard nor
entity and 1 (linear annual change) when there is future hazard or entity. Default is None.
• save_imp (bool, optional) – True if the Impact of each measure is saved. Default: False
• assign_centroids (bool, optional) – indicates whether centroids are assigned to the
self.exposures object. Centroids assignment is an expensive operation; set this to False
to save computation time if the exposures from ent and ent_fut already have centroids
assigned for the respective hazards. Default: True
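A minimal usage sketch; hazard, entity, haz_2050 and ent_2050 are assumed to be pre-loaded Hazard and Entity objects for present and future conditions:

from climada.engine import CostBenefit

# hazard, entity, haz_2050, ent_2050: assumed to be pre-loaded objects
cost_ben = CostBenefit()
cost_ben.calc(
    hazard,
    entity,
    haz_future=haz_2050,
    ent_future=ent_2050,
    save_imp=True,  # keep the per-measure impacts, e.g. for combine_measures
)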
combine_measures(in_meas_names, new_name, new_color, disc_rates, imp_time_depen=None,
risk_func=<function risk_aai_agg>)
Compute cost-benefit of the combination of measures previously computed by calc with save_imp=True. The
benefits of the measures per event are added. To combine with risk transfer options use apply_risk_transfer.
Parameters
• in_meas_names (list(str))
• list with names of measures to combine
• new_name (str) – name to give to the new resulting measure new_color (np.array): color
code RGB for new measure, e.g. np.array([0.1, 0.1, 0.1])
• disc_rates (DiscRates) – discount rates instance
• imp_time_depen (float, optional) – parameter which represents time evolution of impact
(super- or sublinear). If None: all years count the same when there is no future hazard nor
entity and 1 (linear annual change) when there is future hazard or entity. Default is None.
• risk_func (func, optional) – function describing risk measure given an Impact. Default:
average annual impact (aggregated).
Return type
climada.CostBenefit
apply_risk_transfer(meas_name, attachment, cover, disc_rates, cost_fix=0, cost_factor=1,
imp_time_depen=None, risk_func=<function risk_aai_agg>)
Applies risk transfer to given measure computed before with saved impact and compares it to when no measure
is applied. Appended to dictionaries of measures.
Parameters
• meas_name (str) – name of measure where to apply risk transfer
• attachment (float) – risk transfer values attachment (deductible)
• cover (float) – risk transfer cover
• cost_fix (float) – fixed cost of the implemented insurance, e.g. transaction costs
• cost_factor (float, optional) – factor to which to multiply the insurance layer to compute its
cost. Default is 1
• imp_time_depen (float, optional) – parameter which represents time evolution of impact
(super- or sublinear). If None: all years count the same when there is no future hazard nor
entity and 1 (linear annual change) when there is future hazard or entity. Default is None.
• risk_func (func, optional) – function describing risk measure given an Impact. Default:
average annual impact (aggregated).
remove_measure(meas_name)
Remove computed values of given measure
Parameters
meas_name (str) – name of measure to remove
plot_cost_benefit(cb_list=None, axis=None, **kwargs)
Plot cost-benefit graph. Call after calc().
Parameters
• cb_list (list(CostBenefit), optional) – if other CostBenefit provided, overlay them all. Used
for uncertainty visualization.
• axis (matplotlib.axes._subplots.AxesSubplot, optional) – axis to use
• kwargs (optional) – arguments for Rectangle matplotlib, e.g. alpha=0.5 (color is set by
measures color attribute)
Return type
matplotlib.axes._subplots.AxesSubplot
plot_event_view(return_per=(10, 25, 100), axis=None, **kwargs)
Plot averted damages for return periods. Call after calc().
Parameters
• return_per (list, optional) – years to visualize. Default 10, 25, 100
• axis (matplotlib.axes._subplots.AxesSubplot, optional) – axis to use
• kwargs (optional) – arguments for bar matplotlib function, e.g. alpha=0.5 (color is set by
measures color attribute)
Return type
matplotlib.axes._subplots.AxesSubplot
static plot_waterfall(hazard, entity, haz_future, ent_future, risk_func=<function risk_aai_agg>,
axis=None, **kwargs)
Plot waterfall graph at future with given risk metric. Can be called before and after calc().
Parameters
• hazard (climada.Hazard)
• entity (climada.Entity)
• haz_future (Hazard) – hazard in the future (future year provided at ent_future).
haz_future is expected to have the same centroids as hazard.
• ent_future (climada.Entity) – entity in the future
• risk_func (func, optional) – function describing risk measure given an Impact. Default:
average annual impact (aggregated).
• axis (matplotlib.axes._subplots.AxesSubplot, optional) – axis to use
• kwargs (optional) – arguments for bar matplotlib function, e.g. alpha=0.5
Return type
matplotlib.axes._subplots.AxesSubplot
climada.engine.cost_benefit.risk_rp_100(impact )
Risk measurement as exceedance impact at 100 years return period.
Parameters
impact (climada.engine.Impact) – an Impact instance
Return type
float
climada.engine.cost_benefit.risk_rp_250(impact )
Risk measurement as exceedance impact at 250 years return period.
Parameters
impact (climada.engine.Impact) – an Impact instance
Return type
float
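These risk measures can be passed to CostBenefit.calc via the risk_func argument, as in the following sketch; hazard and entity are assumed to be pre-loaded objects:

from climada.engine import CostBenefit
from climada.engine.cost_benefit import risk_rp_250

# hazard, entity: assumed to be pre-loaded objects
cost_ben = CostBenefit()
cost_ben.calc(hazard, entity, future_year=2050, risk_func=risk_rp_250)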
climada.engine.forecast module
class climada.engine.forecast.Forecast(hazard_dict: Dict[str, Hazard], exposure: Exposures,
impact_funcs: ImpactFuncSet, haz_model: str = 'NWP',
exposure_name: str | None = None)
Bases: object
Forecast definition. Compute an impact forecast with predefined hazard originating from a forecast (like numerical
weather prediction models), exposure and impact. Use the calc() method to calculate a forecasted impact. Then
use the plotting methods to illustrate the forecasted impacts. By default, plots are saved in a ‘/forecast/plots’
folder in the configurable save_dir in local_data (see climada.util.config) under a name summarizing the Hazard
type, haz model name, initialization time of the forecast run, event date, exposure name and the plot title. As the
class is relatively new, there might be future changes to the attributes, the methods, and the parameters used to call
the methods. It was discovered at some point, that there might be a memory leak in matplotlib even when figures
are closed (https://github.com/matplotlib/matplotlib/issues/8519). For this reason, the plotting functions in this
module have the flag close_fig to close figures within the function scope, which might mitigate that problem if a
script runs these plotting functions many times.
run_datetime
initialization time of the forecast model run used to create the Hazard
Type
list of datetime.datetime
event_date
Date on which the Hazard event takes place
Type
datetime.datetime
hazard
List of the hazard forecast with different lead times.
Type
list of CLIMADA Hazard
haz_model
Short string specifying the model used to create the hazard, if possible three capital letters.
Type
str
exposure
a CLIMADA Exposures containing values at risk
Type
Exposure
exposure_name
string specifying the exposure (e.g. ‘EU’), which is used to name output files.
Type
str
vulnerability
Set of impact functions used in the impact calculation.
Type
ImpactFuncSet
__init__(hazard_dict: Dict[str, Hazard], exposure: Exposures, impact_funcs: ImpactFuncSet, haz_model: str
= 'NWP', exposure_name: str | None = None)
Initialization with hazard, exposure and vulnerability.
Parameters
• hazard_dict (dict) – Dictionary of the format {run_datetime: Hazard} with run_datetime
being the initialization time of a weather forecast run and Hazard being a CLIMADA Haz-
ard derived from that forecast for one event. A probabilistic representation of that one
event is possible, as long as the attribute Hazard.date is the same for all events. Several
run_datetime:Hazard combinations for the same event can be provided.
• exposure (Exposures)
• impact_funcs (ImpactFuncSet)
• haz_model (str, optional) – Short string specifying the model used to create the hazard, if
possible three capital letters. Default is ‘NWP’ for numerical weather prediction.
• exposure_name (str, optional) – string specifying the exposure (e.g. ‘EU’), which is used
to name output files. If None, the name will be inferred from the Exposures GeoDataframe
region_id column, using the corresponding name of the region with the lowest ISO 3166-1
numeric code. If that fails, it defaults to "custom".
ei_exp(run_datetime=None)
Expected impact per exposure
Parameters
run_datetime (datetime.datetime, optional) – Select the used hazard by the run_datetime, de-
fault is first element of attribute run_datetime.
Return type
float
ai_agg(run_datetime=None)
average impact aggregated over all exposures
Parameters
run_datetime (datetime.datetime, optional) – Select the used hazard by the run_datetime, de-
fault is first element of attribute run_datetime.
Return type
float
haz_summary_str(run_datetime=None)
provide a summary string for the hazard part of the forecast
Parameters
run_datetime (datetime.datetime, optional) – Select the used hazard by the run_datetime, de-
fault is first element of attribute run_datetime.
Returns
summarizing the most important information about the hazard
Return type
str
summary_str(run_datetime=None)
provide a summary string for the impact forecast
Parameters
run_datetime (datetime.datetime, optional) – Select the used hazard by the run_datetime, de-
fault is first element of attribute run_datetime.
Returns
summarizing the most important information about the impact forecast
Return type
str
lead_time(run_datetime=None)
provide the lead time for the impact forecast
Parameters
run_datetime (datetime.datetime, optional) – Select the used hazard by the run_datetime, de-
fault is first element of attribute run_datetime.
Returns
the difference between the initialization time of the forecast model run and the date of the event,
commonly named lead time
Return type
datetime.timedelta
calc(force_reassign=False)
calculate the impacts for all lead times using exposure, all hazards of all run_datetime, and ImpactFunctionSet.
Parameters
force_reassign (bool, optional) – Reassign hazard centroids to the exposure for all hazards,
default is false.
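A minimal usage sketch; run_time is assumed to be a datetime.datetime, haz_forecast a Hazard derived from a forecast for one event, exposure an Exposures, and impf_set an ImpactFuncSet, all pre-loaded:

from climada.engine.forecast import Forecast

# run_time, haz_forecast, exposure, impf_set: assumed to exist
forecast = Forecast({run_time: haz_forecast}, exposure, impf_set, haz_model="NWP")
forecast.calc()                    # compute impacts for all available run_datetime
print(forecast.summary_str())
forecast.plot_hist(save_fig=False, close_fig=True)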
plot_imp_map(run_datetime=None, explain_str=None, save_fig=True, close_fig=False, polygon_file=None,
polygon_file_crs='epsg:4326', proj=ccrs.PlateCarree(), figsize=(9, 13), adapt_fontsize=True)
plot a map of the impacts
Parameters
• run_datetime (datetime.datetime, optional) – Select the used hazard by the run_datetime,
default is first element of attribute run_datetime.
• explain_str (str, optional) – Short str which explains type of impact, explain_str is included
in the title of the figure. default is ‘mean building damage caused by wind’
• save_fig (bool, optional) – Figure is saved if True, folder is within your configurable save_dir
and filename is derived from the method summary_str() (for more details see class docstring).
Default is True.
• close_fig (bool, optional) – Figure not drawn if True. Default is False.
• polygon_file (str, optional) – Points to a .shp-file with polygons to be drawn as outlines on
the plot; default is None, i.e. no polygons are drawn. Please also specify the crs in the parameter
polygon_file_crs.
• polygon_file_crs (str, optional) – String of pattern <provider>:<code> specifying the crs; it
has to be readable by pyproj.Proj. Default is ‘epsg:4326’.
• proj (ccrs) – coordinate reference system used for the plot coordinates. The default is
ccrs.PlateCarree().
• figsize (tuple) – figure size for plt.subplots, (width, height) in inches. The default is (9, 13).
• adapt_fontsize (bool, optional) – If set to true, the size of the fonts will be adapted to the
size of the figure. Otherwise the default matplotlib font size is used. Default is True.
Returns
axes
Return type
cartopy.mpl.geoaxes.GeoAxesSubplot
plot_hist(run_datetime=None, explain_str=None, save_fig=True, close_fig=False, figsize=(9, 8))
plot histogram of the forecasted impacts of all ensemble members
Parameters
• run_datetime (datetime.datetime, optional) – Select the used hazard by the run_datetime,
default is first element of attribute run_datetime.
• explain_str (str, optional) – Short string explaining the type of impact; it is included in the title
of the figure. Default is ‘total building damage’.
• save_fig (bool, optional) – Figure is saved if True, folder is within your configurable save_dir
and filename is derived from the method summary_str() (for more details see class docstring).
Default is True.
• close_fig (bool, optional) – Figure is not drawn if True. Default is False.
• figsize (tuple) – figure size for plt.subplots, width, height in inches The default is (9, 8)
Returns
axes
Return type
matplotlib.axes.Axes
plot_exceedence_prob(threshold, explain_str=None, run_datetime=None, save_fig=True, close_fig=False,
polygon_file=None, polygon_file_crs='epsg:4326', proj=ccrs.PlateCarree(), figsize=(9, 13),
adapt_fontsize=True)
climada.engine.impact module
class climada.engine.impact.ImpactFreqCurve(return_per: ~numpy.ndarray = <factory>, impact:
~numpy.ndarray = <factory>, unit: str = '', frequency_unit:
str = '1/year', label: str = '' )
Bases: object
Impact exceedance frequency curve.
return_per: ndarray
return period
impact: ndarray
impact exceeding frequency
crs
WKT string of the impact’s crs
Type
str
eai_exp
expected impact for each exposure within a period of 1/frequency_unit
Type
np.array
at_event
impact for each hazard event
Type
np.array
frequency
frequency of event
Type
np.array
frequency_unit
frequency unit used (given by hazard), default is ‘1/year’
Type
str
aai_agg
average impact within a period of 1/frequency_unit (aggregated)
Type
float
unit
value unit used (given by exposures unit)
Type
str
imp_mat
matrix num_events x num_exp with impacts. only filled if save_mat is True in calc()
Type
sparse.csr_matrix
haz_type
the hazard type of the hazard
Type
str
__init__(event_id=None, event_name=None, date=None, frequency=None, frequency_unit='1/year',
coord_exp=None, crs='EPSG:4326', eai_exp=None, at_event=None, tot_value=0.0, aai_agg=0.0,
unit='', imp_mat=None, haz_type='' )
Init Impact object
Parameters
• event_id (np.array, optional) – id (>0) of each hazard event
transfer_risk(attachment, cover)
Compute the risk transfer for the full portfolio. This is the risk of the full portfolio summed over all events.
For each event, the transferred risk amounts to the impact minus the attachment (but at most equal to the
cover), multiplied by the probability of the event.
Parameters
• attachment (float) – attachment per event for entire portfolio.
• cover (float) – cover per event for entire portfolio.
Returns
• transfer_at_event (np.array) – risk transferred per event
• transfer_aai_agg (float) – average risk within a period of 1/frequency_unit, transferred
residual_risk(attachment, cover)
Compute the residual risk after application of insurance attachment and cover to the entire portfolio. This is
the residual risk of the full portfolio summed over all events. For each event, the residual risk is obtained by
subtracting the transferred risk from the total risk of the event.
Parameters
• attachment (float) – attachment per event for entire portfolio.
• cover (float) – cover per event for entire portfolio.
Returns
• residual_at_event (np.array) – residual risk per event
• residual_aai_agg (float) – average residual risk within a period of 1/frequency_unit
µ See also
transfer_risk
compute the transfer risk per portfolio.
calc_risk_transfer(attachment, cover)
Compute traditional risk transfer over impact. Returns a new impact with the risk transfer applied and the
Impact metrics of the resulting insurance layer.
Parameters
• attachment (float) – (deductible)
• cover (float)
Return type
climada.engine.impact.Impact
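A short illustrative sketch of the risk-transfer methods above, assuming imp is an existing Impact object and using illustrative attachment and cover values:
>>> transfer_at_event, transfer_aai_agg = imp.transfer_risk(attachment=1e6, cover=1e8)
>>> residual_at_event, residual_aai_agg = imp.residual_risk(attachment=1e6, cover=1e8)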
impact_per_year(all_years=True, year_range=None)
Calculate yearly impact from impact data.
Note: the impact in a given year is summed over all events. Thus, the impact in a given year can be larger
than the total affected exposure value.
Parameters
• all_years (boolean, optional) – return values for all years between first and last year with
event, including years without any events. Default: True
• year_range (tuple or list with integers, optional) – start and end year
Returns
year_set – Key=year, value=Summed impact per year.
Return type
dict
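For illustration (the returned values are made up), assuming imp is an existing Impact object:
>>> yearly = imp.impact_per_year(all_years=True)
>>> yearly  # doctest: +SKIP
{2001: 1.5e8, 2002: 0.0, 2003: 4.2e7}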
impact_at_reg(agg_regions=None)
Aggregate impact on given aggregation regions. This method works only if Impact.imp_mat was stored during
the impact calculation.
Parameters
agg_regions (np.array, list (optional)) – The length of the array must equal the number of
centroids in exposures. It reports what macro-regions these centroids belong to. For example,
assuming there are three centroids and agg_regions = [‘A’, ‘A’, ‘B’], then the impact of the first and
second centroids will be assigned to region A, whereas impact from the third centroid will be
assigned to area B. If no aggregation regions are passed, the method aggregates impact at the
country (admin_0) level. Default is None.
Returns
Contains the aggregated data per event. Rows: Hazard events. Columns: Aggregation regions.
Return type
pd.DataFrame
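A hedged sketch, assuming imp was computed with the impact matrix stored (imp_mat available) and has three exposure centroids; ‘A’ and ‘B’ are illustrative region labels:
>>> df_reg = imp.impact_at_reg(agg_regions=["A", "A", "B"])
>>> df_reg.columns.tolist()  # doctest: +SKIP
['A', 'B']
>>> df_country = imp.impact_at_reg()  # default: aggregate at country (admin_0) level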
calc_impact_year_set(all_years=True, year_range=None)
This function is deprecated, use Impact.impact_per_year instead.
local_exceedance_impact(return_periods=(25, 50, 100, 250), method='interpolate', min_impact=0,
log_frequency=True, log_impact=True, bin_decimals=None)
Compute local exceedance impact for given return periods. The default method is fitting the ordered impacts
per centroid to the corresponding cumulative frequency with linear interpolation on log-log scale.
Parameters
• return_periods (array_like) – User-specified return periods for which the exceedance in-
tensity should be calculated locally (at each centroid). Defaults to (25, 50, 100, 250).
• method (str) – Method to interpolate to new return periods. Currently available are “in-
terpolate”, “extrapolate”, “extrapolate_constant” and “stepfunction”. If set to “interpolate”,
return periods outside the range of the Impact object’s observed local return periods will be
assigned NaN. If set to “extrapolate_constant” or “stepfunction”, return periods larger than
the Impact object’s observed local return periods will be assigned the largest local impact,
and return periods smaller than the Impact object’s observed local return periods will be
assigned 0. If set to “extrapolate”, local exceedance impacts will be extrapolated (and in-
terpolated). The extrapolation to large return periods uses the two highest impacts of the
centroid and their return periods and extends the interpolation between these points to the
given return period (similarly for small return periods). Defaults to “interpolate”.
• min_impact (float, optional) – Minimum threshold to filter the impact. Defaults to 0.
• log_frequency (bool, optional) – If set to True, (cumulative) frequency values are con-
verted to log scale before inter- and extrapolation. Defaults to True.
• log_impact (bool, optional) – If set to True, impact values are converted to log scale before
inter- and extrapolation. Defaults to True.
• bin_decimals (int, optional) – Number of decimals to group and bin impact values. Binning
results in smoother (and coarser) interpolation and more stable extrapolation. For more de-
tails and sensible values for bin_decimals, see Notes. If None, values are not binned. Defaults
to None.
Returns
• gdf (gpd.GeoDataFrame) – GeoDataFrame containing exceedance impacts for given return
periods. Each column corresponds to a return period, each row corresponds to a centroid.
Values in the gdf correspond to the exceedance impact for the given centroid and return
period
• label (str) – GeoDataFrame label, for reporting and plotting
• column_label (function) – Column-label-generating function, for reporting and plotting
µ See also
util.interpolation.preprocess_and_interpolate_ev
inter- and extrapolation method
Notes
If an integer bin_decimals is given, the impact values are binned according to their bin_decimals decimals,
and their corresponding frequencies are summed. This binning leads to a smoother (and coarser) interpo-
lation, and a more stable extrapolation. For instance, if bin_decimals=1, the two values 12.01 and 11.97
with corresponding frequencies 0.1 and 0.2 are combined to a value 12.0 with frequency 0.3. The default
bin_decimals=None results in not binning the values. E.g., if your impacts range from 1 to 100, you could use
bin_decimals=1; if your impacts range from 1e6 to 1e9, you could use bin_decimals=-5; if your impacts range
from 0.0001 to 0.01, you could use bin_decimals=5.
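A usage sketch combining this method with the plotting utility mentioned in the Notes of plot_rp_imp below; imp is assumed to be an existing Impact object, and the import path of plot_from_gdf is an assumption based on those Notes:
>>> gdf, title, column_label = imp.local_exceedance_impact(return_periods=(50, 100, 250))
>>> from climada.util.plot import plot_from_gdf  # assumed import path of util.plot.plot_from_gdf
>>> ax = plot_from_gdf(gdf, title, column_label)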
local_exceedance_imp(return_periods=(25, 50, 100, 250))
This function is deprecated, use Impact.local_exceedance_impact instead.
Deprecated: The use of Impact.local_exceedance_imp is deprecated. Use Impact.local_exceedance_impact
instead. Some errors in the previous calculation in Impact.local_exceedance_imp have been corrected. To
reproduce data with the previous calculation, use CLIMADA v5.0.0 or less.
local_return_period(threshold_impact=(1000.0, 10000.0), method='interpolate', min_impact=0,
log_frequency=True, log_impact=True, bin_decimals=None)
Compute local return periods for given threshold impacts. The default method is fitting the ordered impacts
per centroid to the corresponding cumulative frequency with linear interpolation on log-log scale.
Parameters
• threshold_impact (array_like) – User-specified impact values for which the return period
should be calculated locally (at each centroid). Defaults to (1000, 10000)
• method (str) – Method to interpolate to new threshold impacts. Currently available are “in-
terpolate”, “extrapolate”, “extrapolate_constant” and “stepfunction”. If set to “interpolate”,
threshold impacts outside the range of the Impact object’s local impacts will be assigned NaN.
If set to “extrapolate_constant” or “stepfunction”, threshold impacts larger than the Impact
object’s local impacts will be assigned NaN, and threshold impacts smaller than the Impact
object’s local impacts will be assigned the smallest observed local return period. If set to
“extrapolate”, local return periods will be extrapolated (and interpolated). The extrapolation
to large threshold impacts uses the two highest impacts of the centroid and their return periods
and extends the interpolation between these points to the given threshold impact (similarly
for small threshold impacts). Defaults to “interpolate”.
µ See also
util.interpolation.preprocess_and_interpolate_ev
inter- and extrapolation method
Notes
If an integer bin_decimals is given, the impact values are binned according to their bin_decimals decimals,
and their corresponding frequencies are summed. This binning leads to a smoother (and coarser) interpo-
lation, and a more stable extrapolation. For instance, if bin_decimals=1, the two values 12.01 and 11.97
with corresponding frequencies 0.1 and 0.2 are combined to a value 12.0 with frequency 0.3. The default
bin_decimals=None results in not binning the values. E.g., if your impacts range from 1 to 100, you could use
bin_decimals=1; if your impacts range from 1e6 to 1e9, you could use bin_decimals=-5; if your impacts range
from 0.0001 to 0.01, you could use bin_decimals=5.
calc_freq_curve(return_per=None)
Compute impact exceedance frequency curve.
Parameters
return_per (np.array, optional) – return periods where to compute the exceedance impact.
Use impact’s frequencies if not provided
Return type
ImpactFreqCurve
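A minimal sketch, assuming imp is an existing Impact object:
>>> import numpy as np
>>> freq_curve = imp.calc_freq_curve(np.array([10, 50, 100, 250]))
>>> freq_curve.return_per, freq_curve.impact  # return periods and corresponding exceedance impacts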
plot_scatter_eai_exposure(mask=None, ignore_zero=False, pop_name=True, buffer=0.0,
extend='neither', axis=None, adapt_fontsize=True, **kwargs)
Plot a scatter of the expected impact within a period of 1/frequency_unit for each exposure.
Parameters
• mask (np.array, optional) – mask to apply to eai_exp plotted.
• ignore_zero (bool, optional) – flag to indicate if zero and negative values are ignored in plot.
Default: False
Return type
cartopy.mpl.geoaxes.GeoAxesSubplot
plot_basemap_eai_exposure(mask=None, ignore_zero=False, pop_name=True, buffer=0.0,
extend='neither', zoom=10, url=ctx.providers.CartoDB.Positron, axis=None, **kwargs)
Plot basemap expected impact of each exposure within a period of 1/frequency_unit.
Parameters
• mask (np.array, optional) – mask to apply to eai_exp plotted.
• ignore_zero (bool, optional) – flag to indicate if zero and negative values are ignored in plot.
Default: False
• pop_name (bool, optional) – add names of the populated places
• buffer (float, optional) – border to add to coordinates. Default: 0.0.
• extend (str, optional) – extend border colorbar with arrows. [ ‘neither’ | ‘both’ | ‘min’ | ‘max’ ]
• zoom (int, optional) – zoom coefficient used in the satellite image
• url (str, optional) – image source, default: ctx.providers.CartoDB.Positron
• axis (matplotlib.axes.Axes, optional) – axis to use
• kwargs (dict, optional) – arguments for scatter matplotlib function, e.g. cmap=’Greys’. De-
fault: ‘Wistia’
Return type
cartopy.mpl.geoaxes.GeoAxesSubplot
plot_hexbin_impact_exposure(event_id=1, mask=None, ignore_zero=False, pop_name=True, buffer=0.0,
extend='neither', axis=None, adapt_fontsize=True, **kwargs)
Plot hexbin impact of an event at each exposure. Requires attribute imp_mat.
Parameters
• event_id (int, optional) – id of the event for which to plot the impact. Default: 1.
• mask (np.array, optional) – mask to apply to impact plotted.
• ignore_zero (bool, optional) – flag to indicate if zero and negative values are ignored in plot.
Default: False
• pop_name (bool, optional) – add names of the populated places
• buffer (float, optional) – border to add to coordinates. Default: 0.0.
• extend (str, optional) – extend border colorbar with arrows. [ ‘neither’ | ‘both’ | ‘min’ | ‘max’ ]
• axis (matplotlib.axes.Axes) – optional axis to use
• adapt_fontsize (bool, optional) – If set to true, the size of the fonts will be adapted to the
size of the figure. Otherwise the default matplotlib font size is used. Default is True.
• kwargs (dict, optional) – arguments for hexbin matplotlib function
Return type
cartopy.mpl.geoaxes.GeoAxesSubplot
plot_basemap_impact_exposure(event_id=1, mask=None, ignore_zero=False, pop_name=True,
buffer=0.0, extend='neither', zoom=10, url=ctx.providers.CartoDB.Positron, axis=None, **kwargs)
Plot basemap impact of an event at each exposure. Requires attribute imp_mat.
Parameters
• event_id (int, optional) – id of the event for which to plot the impact. Default: 1.
• mask (np.array, optional) – mask to apply to impact plotted.
• ignore_zero (bool, optional) – flag to indicate if zero and negative values are ignored in plot.
Default: False
• pop_name (bool, optional) – add names of the populated places
• buffer (float, optional) – border to add to coordinates. Default: 0.0.
• extend (str, optional) – extend border colorbar with arrows. [ ‘neither’ | ‘both’ | ‘min’ | ‘max’ ]
• zoom (int, optional) – zoom coefficient used in the satellite image
• url (str, optional) – image source, default: ctx.providers.CartoDB.Positron
• axis (matplotlib.axes.Axes, optional) – axis to use
• kwargs (dict, optional) – arguments for scatter matplotlib function, e.g. cmap=’Greys’. De-
fault: ‘Wistia’
Return type
cartopy.mpl.geoaxes.GeoAxesSubplot
plot_rp_imp(return_periods=(25, 50, 100, 250), log10_scale=True, axis=None, mask_distance=0.03,
kwargs_local_exceedance_impact=None, **kwargs)
Compute and plot exceedance impact maps for different return periods. Calls local_exceedance_impact. For
handling large data sets and for further options, see Notes.
Parameters
• return_periods (tuple of int, optional) – return periods to consider. Default: (25, 50, 100,
250)
• log10_scale (boolean, optional) – plot impact as log10(impact). Default: True
• smooth (bool, optional) – smooth plot to plot.RESOLUTIONxplot.RESOLUTION. Default:
True
• mask_distance (float, optional) – Only regions are plotted that are closer to any of the data
points than this distance, relative to overall plot size. For instance, to only plot values at the
centroids, use mask_distance=0.03. If None, the plot is not masked. Default is 0.03.
• kwargs_local_exceedance_impact (dict) – Dictionary of keyword arguments for the
method impact.local_exceedance_impact.
• kwargs (dict, optional) – arguments for pcolormesh matplotlib function used in event plots
Returns
• axis (matplotlib.axes.Axes)
• imp_stats (np.array) – return_periods.size x num_centroids
µ See also
engine.impact.local_exceedance_impact
inter- and extrapolation method
Notes
For handling large data, and for more flexible options in the exceedance impact computation and
in the plotting, we recommend to use gdf, title, labels = impact.local_exceedance_impact() and
util.plot.plot_from_gdf(gdf, title, labels) instead.
write_csv(file_name)
Write data into csv file. imp_mat is not saved.
Parameters
file_name (str) – absolute path of the file
write_excel(file_name)
Write data into Excel file. imp_mat is not saved.
Parameters
file_name (str) – absolute path of the file
write_hdf5(file_path: str | Path, dense_imp_mat: bool = False)
Write the data stored in this object into an H5 file.
Try to write all attributes of this class into H5 datasets or attributes. By default, any iterable will be stored in
a dataset and any string or scalar will be stored in an attribute. Dictionaries will be stored as groups, with the
previous rules being applied recursively to their values.
The impact matrix can be stored in a sparse or dense format.
Parameters
• file_path (str or Path) – File path to write data into. The enclosing directory must exist.
• dense_imp_mat (bool) – If True, write the impact matrix as dense matrix that can be more
easily interpreted by common H5 file readers but takes up (vastly) more space. Defaults to
False.
Raises
TypeError – If event_name does not contain strings exclusively.
write_sparse_csr(file_name)
Write imp_mat matrix in numpy’s npz format.
static read_sparse_csr(file_name)
Read imp_mat matrix from numpy’s npz format.
Parameters
file_name (str)
Return type
sparse.csr_matrix
classmethod from_csv(file_name)
Read csv file containing impact data generated by write_csv.
Parameters
file_name (str) – absolute path of the file
Returns
imp – Impact from csv file
Return type
climada.engine.impact.Impact
read_csv(*args, **kwargs)
This function is deprecated, use Impact.from_csv instead.
classmethod from_excel(file_name)
Read excel file containing impact data generated by write_excel.
Parameters
file_name (str) – absolute path of the file
Returns
imp – Impact from excel file
Return type
climada.engine.impact.Impact
read_excel(*args, **kwargs)
This function is deprecated, use Impact.from_excel instead.
classmethod from_hdf5(file_path: str | Path)
Create an impact object from an H5 file.
This assumes a specific layout of the file. If values are not found in the expected places, they will be set to
the default values for an Impact object.
The following H5 file structure is assumed (H5 groups are terminated with /, attributes are denoted by .
attrs/):
file.h5
├─ at_event
├─ coord_exp
├─ eai_exp
├─ event_id
├─ event_name
├─ frequency
├─ imp_mat
├─ .attrs/
│ ├─ aai_agg
│ ├─ crs
│ ├─ frequency_unit
│ ├─ haz_type
│ ├─ tot_value
│ ├─ unit
The impact matrix imp_mat can either be an H5 dataset, in which case it is interpreted as dense representation
of the matrix, or an H5 group, in which case the group is expected to contain the following data for instantiating
a scipy.sparse.csr_matrix:
imp_mat/
├─ data
├─ indices
├─ indptr
├─ .attrs/
│ ├─ shape
Parameters
file_path (str or Path) – The file path of the file to read.
Returns
imp – Impact with data from the given file
Return type
Impact
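A round-trip sketch for the HDF5 writer and reader described above; the import of Impact from climada.engine is an assumption, and the file name is illustrative:
>>> from climada.engine import Impact
>>> imp.write_hdf5("impact.h5", dense_imp_mat=False)  # sparse impact matrix (default)
>>> imp_read = Impact.from_hdf5("impact.h5")
>>> imp_read.aai_agg == imp.aai_agg  # doctest: +SKIP
True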
Notes
The frequencies are NOT adjusted. Method to adjust frequencies and obtain correct eai_exp:
1. Select the subset of impact according to your choice: imp = impact.select(…)
2. Manually adjust the frequency of the subset of impact: imp.frequency = […]
3. Use select without arguments to select all events and recompute the eai_exp with the updated
frequencies: imp = imp.select()
Parameters
• event_ids (list of int, optional) – Selection of events by their id. The default is None.
• event_names (list of str, optional) – Selection of events by their name. The default is None.
• dates (tuple, optional) – (start-date, end-date), events are selected if they are >= than start-
date and <= than end-date. Dates in same format as impact.date (ordinal format of datetime
library) The default is None.
• coord_exp (np.array, optional) – Selection of exposures coordinates [lat, lon] (in degrees)
The default is None.
• reset_frequency (bool, optional) – Change frequency of events proportional to difference
between first and last year (old and new). Assumes annual frequency values. Default: False.
Raises
ValueError – If the impact matrix is missing, the eai_exp and aai_agg cannot be updated for
a selection of events and/or exposures.
Returns
imp – A new impact object with a selection of events and/or exposures
Return type
climada.engine.impact.Impact
If event ids are not unique among the passed impact objects an error is raised. In this case, the user can set
reset_event_ids=True to create unique event ids for the concatenated impact.
If all impact matrices of the impacts in imp_list are empty, the impact matrix of the concatenated impact
is also empty.
Parameters
• imp_list (Iterable of climada.engine.impact.Impact) – Iterable of Impact objects to concate-
nate
• reset_event_ids (boolean, optional) – Reset event ids of the concatenated impact object
Returns
impact – New impact object which is a concatenation of all impacts
Return type
climada.engine.impact.Impact
Notes
• Concatenation of impacts with different exposure (e.g. different countries) could also be implemented
here in the future.
Parameters
• hazard (Hazard) – Hazard to match (with raster or vector centroids).
• distance (str, optional) – Distance to use in case of vector centroids. Possible values are
“euclidean”, “haversine” and “approx”. Default: “euclidean”
• threshold (float) – If the distance (in km) to the nearest neighbor exceeds threshold, the
index -1 is assigned. Set threshold to 0, to disable nearest neighbor matching. Default: 100
(km)
Returns
array of closest Hazard centroids, aligned with the Impact’s coord_exp array
Return type
np.array
climada.engine.impact_calc module
class climada.engine.impact_calc.ImpactCalc(exposures, impfset, hazard )
Bases: object
Class to compute impacts from exposures, impact function set and hazard
__init__(exposures, impfset, hazard )
ImpactCalc constructor
The dimension of the imp_mat variable must be compatible with the exposures and hazard objects.
This will call climada.hazard.base.Hazard.check_matrices().
Parameters
• exposures (climada.entity.Exposures) – exposures used to compute impacts
• impf_set (climada.entity.ImpactFuncSet) – impact functions set used to compute impacts
• hazard (climada.Hazard) – hazard used to compute impacts
property n_exp_pnt
Number of exposure points (rows in gdf)
property n_events
Number of hazard events (size of event_id array)
Examples
µ See also
apply_deductible_to_mat
apply deductible to impact matrix
apply_cover_to_mat
apply cover to impact matrix
• include_deductible (bool) – if set to True, the column ‘deductible’ of the exposures Geo-
DataFrame is excluded from the returned GeoDataFrame, otherwise it is included if present.
imp_mat_gen(exp_gdf, impf_col)
Generator of impact sub-matrices and corresponding exposures indices
The exposures gdf is decomposed into chunks that fit into the max defined memory size. For each chunk, the
impact matrix is computed and returned, together with the corresponding exposures points index.
Parameters
• exp_gdf (GeoDataFrame) – Geodataframe of the exposures with columns required for im-
pact computation.
• impf_col (str) – name of the desired impact column in the exposures.
Raises
ValueError – if the hazard is larger than the memory limit
Yields
scipy.sparse.crs_matrix, np.ndarray – impact matrix and corresponding exposures indices for
each chunk.
insured_mat_gen(imp_mat_gen, exp_gdf, impf_col)
Generator of insured impact sub-matrices (with applied cover and deductible) and corresponding exposures
indices
This generator takes a ‘regular’ impact matrix generator and applies cover and deductible onto the impacts. It
yields the same sub-matrices as the original generator.
Deductible and cover are taken from the dataframe stored in exposures.gdf.
Parameters
• imp_mat_gen (generator of tuples (sparse.csr_matrix, np.array)) – The generator for creat-
ing the impact matrix. It returns a part of the full matrix and the associated exposure indices.
• exp_gdf (GeoDataFrame) – Geodataframe of the exposures with columns required for im-
pact computation.
• impf_col (str) – Name of the column in ‘exp_gdf’ indicating the impact function (id)
Yields
• mat (scipy.sparse.csr_matrix) – Impact sub-matrix (with applied cover and deductible) with
size (n_events, len(exp_idx))
• exp_idx (np.array) – Exposure indices for impacts in mat
impact_matrix(exp_values, cent_idx, impf )
Compute the impact matrix for given exposure values, assigned centroids, a hazard, and one impact function.
Parameters
• exp_values (np.array) – Exposure values
• cent_idx (np.array) – Hazard centroids assigned to each exposure location
• hazard (climada.Hazard) – Hazard object
• impf (climada.entity.ImpactFunc) – one impact function common to all exposure elements
in exp_gdf
Returns
Impact per event (rows) per exposure point (columns)
Return type
scipy.sparse.csr_matrix
stitch_impact_matrix(imp_mat_gen)
Make an impact matrix from an impact sub-matrix generator
stitch_risk_metrics(imp_mat_gen)
Compute the impact metrics from an impact sub-matrix generator
This method is used to compute the risk metrics if the user decided not to store the full impact matrix.
Parameters
imp_mat_gen (generator of tuples (sparse.csr_matrix, np.array)) – The generator for creating
the impact matrix. It returns a part of the full matrix and the associated exposure indices.
Returns
• at_event (np.array) – Accumulated damage for each event
• eai_exp (np.array) – Expected impact within a period of 1/frequency_unit for each exposure
point
• aai_agg (float) – Average impact within a period of 1/frequency_unit aggregated
static apply_deductible_to_mat(mat, deductible, hazard, cent_idx, impf )
Apply a deductible per exposure point to an impact matrix at given centroid points for given impact function.
All exposure points must have the same impact function. For different impact functions apply use this method
repeatedly on the same impact matrix.
Parameters
• imp_mat (scipy.sparse.csr_matrix) – impact matrix (events x exposure points)
• deductible (np.array()) – deductible for each exposure point
• hazard (climada.Hazard) – hazard used to compute the imp_mat
• cent_idx (np.array()) – index of centroids associated with each exposure point
• impf (climada.entity.ImpactFunc) – impact function associated with the exposure points
Returns
imp_mat – impact matrix with applied deductible
Return type
scipy.sparse.csr_matrix
static apply_cover_to_mat(mat, cover)
Apply cover to impact matrix.
The impact data is clipped to the range [0, cover]. The cover is defined per exposure point.
Parameters
• imp_mat (scipy.sparse.csr_matrix) – impact matrix
• cover (np.array()) – cover per exposures point (columns of imp_mat)
Returns
imp_mat – impact matrix with applied cover
Return type
scipy.sparse.csr_matrix
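A hedged end-to-end sketch of ImpactCalc; my_exposures, my_impf_set and my_hazard are placeholder objects, and the call to ImpactCalc.impact() is an assumption (that method is not shown in this excerpt):
>>> from climada.engine import ImpactCalc
>>> impcalc = ImpactCalc(my_exposures, my_impf_set, my_hazard)
>>> impcalc.n_exp_pnt, impcalc.n_events  # number of exposure points and hazard events
>>> imp = impcalc.impact(save_mat=True)  # assumed main entry point returning an Impact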
climada.engine.impact_data module
climada.engine.impact_data.assign_hazard_to_emdat(certainty_level, intensity_path_haz,
names_path_haz, reg_id_path_haz,
date_path_haz, emdat_data, start_time, end_time,
keep_checks=False)
• emdat_file (str, Path, or DataFrame) – Either string with full path to CSV-file or pan-
das.DataFrame loaded from EM-DAT CSV
• countries (list of str) – country ISO3-codes or names, e.g. [‘JAM’, ‘CUB’]. countries=None
for all countries (default)
• hazard (list or str) – List of Disaster (sub-)types according to EM-DAT terminology, i.e.: Ani-
mal accident, Drought, Earthquake, Epidemic, Extreme temperature, Flood, Fog, Impact, In-
sect infestation, Landslide, Mass movement (dry), Storm, Volcanic activity, Wildfire; Coastal
Flooding, Convective Storm, Riverine Flood, Tropical cyclone, Tsunami, etc.; OR CLIMADA
hazard type abbreviations, e.g. TC, BF, etc.
• year_range (list or tuple) – Year range to be extracted, e.g. (2000, 2015); (only min and max
are considered)
• target_version (int) – required EM-DAT data format version (i.e. year of download), changes
naming of columns/variables, default: newest available version in VARNAMES_EMDAT that
matches the given emdat_file
Returns
df_data – DataFrame containing cleaned and filtered EM-DAT impact data
Return type
pd.DataFrame
climada.engine.impact_data.emdat_countries_by_hazard(emdat_file_csv, hazard=None,
year_range=None)
return list of all countries exposed to a chosen hazard type from EMDAT data as CSV.
Parameters
• emdat_file (str, Path, or DataFrame) – Either string with full path to CSV-file or pan-
das.DataFrame loaded from EM-DAT CSV
• hazard (list or str) – List of Disaster (sub-)types according to EM-DAT terminology, i.e.: Ani-
mal accident, Drought, Earthquake, Epidemic, Extreme temperature, Flood, Fog, Impact, In-
sect infestation, Landslide, Mass movement (dry), Storm, Volcanic activity, Wildfire; Coastal
Flooding, Convective Storm, Riverine Flood, Tropical cyclone, Tsunami, etc.; OR CLIMADA
hazard type abbreviations, e.g. TC, BF, etc.
• year_range (list or tuple) – Year range to be extracted, e.g. (2000, 2015); (only min and max
are considered)
Returns
• countries_iso3a (list) – List of ISO3-codes of countries impacted by the disaster (sub-)types
• countries_names (list) – List of names of countries impacted by the disaster (sub-)types
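An illustrative call (the CSV path is a placeholder):
>>> from climada.engine.impact_data import emdat_countries_by_hazard
>>> iso3_codes, names = emdat_countries_by_hazard("emdat_data.csv", hazard="TC",
...                                               year_range=(2000, 2015))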
climada.engine.impact_data.scale_impact2refyear(impact_values, year_values, iso3a_values,
reference_year=None)
Scale given impact values proportionally to GDP to the corresponding value in a reference year (for normalization
of monetary values)
Parameters
• impact_values (list or array) – Impact values to be scaled.
• year_values (list or array) – Year of each impact (same length as impact_values)
• iso3a_values (list or array) – ISO3alpha code of country for each impact (same length as
impact_values)
• reference_year (int, optional) – Impact is scaled proportional to GDP to the value of the
reference year. No scaling for reference_year=None (default)
climada.engine.impact_data.emdat_impact_yearlysum(emdat_file_csv, countries=None, hazard=None,
year_range=None, reference_year=None,
imp_str="Total Damages ('000 US$)",
version=None)
function to load EM-DAT data and sum impact per year
Parameters
emdat_file_csv (str or DataFrame) – Either string with full path to CSV-file or pandas.DataFrame
loaded from EM-DAT CSV
countries
[list of str] country ISO3-codes or names, e.g. [‘JAM’, ‘CUB’]. countries=None for all countries (default)
hazard
[list or str] List of Disaster (sub-)types according to EM-DAT terminology, i.e.: Animal accident, Drought,
Earthquake, Epidemic, Extreme temperature, Flood, Fog, Impact, Insect infestation, Landslide, Mass move-
ment (dry), Storm, Volcanic activity, Wildfire; Coastal Flooding, Convective Storm, Riverine Flood, Tropical
cyclone, Tsunami, etc.; OR CLIMADA hazard type abbreviations, e.g. TC, BF, etc.
year_range
[list or tuple] Year range to be extracted, e.g. (2000, 2015); (only min and max are considered)
version
[int, optional] required EM-DAT data format version (i.e. year of download), changes naming of
columns/variables, default: newest available version in VARNAMES_EMDAT
Returns
out – DataFrame with summed impact and scaled impact per year and country.
Return type
pd.DataFrame
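An illustrative call (the CSV path is a placeholder):
>>> from climada.engine.impact_data import emdat_impact_yearlysum
>>> df_yearly = emdat_impact_yearlysum("emdat_data.csv", countries=["JAM", "CUB"],
...                                    hazard="TC", year_range=(2000, 2015),
...                                    reference_year=2015)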
countries
[list of str] country ISO3-codes or names, e.g. [‘JAM’, ‘CUB’]. default: countries=None for all countries
hazard
[list or str] List of Disaster (sub-)types according to EM-DAT terminology, i.e.: Animal accident, Drought,
Earthquake, Epidemic, Extreme temperature, Flood, Fog, Impact, Insect infestation, Landslide, Mass move-
ment (dry), Storm, Volcanic activity, Wildfire; Coastal Flooding, Convective Storm, Riverine Flood, Tropical
cyclone, Tsunami, etc.; OR CLIMADA hazard type abbreviations, e.g. TC, BF, etc.
year_range
[list or tuple] Year range to be extracted, e.g. (2000, 2015); (only min and max are considered)
reference_year
[int] Reference year of exposures. Impact is scaled proportional to GDP to the value of the reference year.
Default: 0 (no scaling)
imp_str
[str] Column name of impact metric in EMDAT CSV, default = “Total Damages (‘000 US$)”
version
[int, optional] EM-DAT version to take variable/column names from, default: newest available version in
VARNAMES_EMDAT
Returns
out – EM-DAT DataFrame with new columns “year”, “region_id”, “impact”, and “impact_scaled”;
total impact per event with same unit as chosen impact, but multiplied by 1000 if impact is given
as 1000 US$ (e.g. imp_str=”Total Damages (‘000 US$) scaled”).
Return type
pd.DataFrame
select(year_range)
Select discount rates in given years.
Parameters
year_range (np.array(int)) – continuous sequence of selected years.
Returns
The selected discount rates within year_range
Return type
climada.entity.DiscRates
append(disc_rates)
Check and append discount rates to current DiscRates. Overwrite discount rate if same year.
Parameters
disc_rates (climada.entity.DiscRates) – DiscRates instance to append
Raises
ValueError –
>>> DEF_VAR_MAT = {
... 'sup_field_name': 'entity',
... 'field_name': 'discount',
... 'var_name': {
... 'year': 'year',
... 'disc': 'discount_rate',
... }
... }
Returns
The disc rates from matlab
Return type
climada.entity.DiscRates
read_mat(*args, **kwargs)
This function is deprecated, use DiscRates.from_mat instead.
>>> DEF_VAR_EXCEL = {
... 'sheet_name': 'discount',
... 'col_name': {
... 'year': 'year',
... 'disc': 'discount_rate',
... }
... }
Returns
The disc rates from excel
Return type
climada.entity.DiscRates
read_excel(*args, **kwargs)
This function is deprecated, use DiscRates.from_excel instead.
write_excel(file_name, var_names=None)
Write excel file following template.
Parameters
• file_name (str) – filename including path and extension
• var_names (dict, optional) – name of the variables in the file. The Default is
>>> DEF_VAR_EXCEL = {
... 'sheet_name': 'discount',
... 'col_name': {
... 'year': 'year',
... 'disc': 'discount_rate',
... }
... }
Returns
The disc rates from the csv file
Return type
climada.entity.DiscRates
write_csv(file_name, year_column='year', disc_column='discount_rate', **kwargs)
Write DiscRate to a csv file following template and store variables.
Parameters
• file_name (str) – filename including path and extension
• year_column (str, optional) – name of the column that contains the years, Default: “year”
• disc_column (str, optional) – name of the column that contains the discount rates, Default:
“discount_rate”
• **kwargs – any additional arguments, e.g., sep, delimiter, head, are forwarded to pandas.
read_csv
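A minimal sketch of writing discount rates to CSV; the DiscRates constructor with years and rates arrays is an assumption, as it is not shown in this excerpt, and the file name is illustrative:
>>> import numpy as np
>>> from climada.entity import DiscRates
>>> disc = DiscRates(years=np.arange(2020, 2051), rates=np.full(31, 0.02))  # assumed constructor
>>> disc.write_csv("disc_rates.csv", year_column="year", disc_column="discount_rate")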
climada.entity.exposures package
climada.entity.exposures.litpop package
climada.entity.exposures.litpop.gpw_population module
climada.entity.exposures.litpop.gpw_population.load_gpw_pop_shape(geometry, reference_year,
gpw_version,
data_dir=PosixPath('/home/docs/climada/data'),
layer=0, verbose=True)
Read gridded population data from TIFF and crop to given shape(s).
Note: A (free) NASA Earthdata login is necessary to download the data. Data can be downloaded e.g.
for gpw_version=11 and year 2015 from https://sedac.ciesin.columbia.edu/downloads/data/gpw-v4/ gpw-v4-
population-count-rev11/gpw-v4-population-count-rev11_2015_30_sec_tif.zip
Parameters
• geometry (shape(s) to crop data to in degree lon/lat.) – for example
shapely.geometry.(Multi)Polygon or shapefile.Shape from polygon(s) defined in a (country)
shapefile.
• reference_year (int) – target year for data extraction
• gpw_version (int) – Version number of GPW population data, i.e. 11 for v4.11. The default
is CONFIG.exposures.litpop.gpw_population.gpw_version.int()
• data_dir (Path, optional) – Path to data directory holding GPW data folders. The default is
SYSTEM_DIR.
• layer (int, optional) – relevant data layer in input TIFF file to return. The default is 0 and should
not be changed without understanding the different data layers in the given TIFF file.
• verbose (bool, optional) – Enable verbose logging about the used GPW version and reference
year. Default: True.
Returns
• pop_data (2D numpy array) – contains extracted population count data per grid point in shape;
first dimension is lat, second dimension is lon.
• meta (dict) – contains meta data per array, including “transform” with meta data on coordi-
nates.
• global_transform (Affine instance) – contains six numbers, providing transform info for global
GPW grid. global_transform is required for resampling on a globally consistent grid
climada.entity.exposures.litpop.gpw_population.get_gpw_file_path(gpw_version, reference_year,
data_dir=None,
verbose=True)
Check available GPW population data versions and year closest to reference_year and return full path to TIFF file.
Parameters
• gpw_version (int (optional)) – Version number of GPW population data, i.e. 11 for v4.11.
• reference_year (int (optional)) – Data year is selected as close to reference_year as possible.
The default is 2020.
• data_dir (pathlib.Path (optional)) – Absolute path where files are stored. Default: SYS-
TEM_DIR
• verbose (bool, optional) – Enable verbose logging about the used GPW version and reference
year. Default: True.
Raises
FileExistsError –
Returns
pathlib.Path
Return type
path to input file with population data
climada.entity.exposures.litpop.litpop module
climada.entity.exposures.litpop.litpop.GPW_VERSION = 11
Version of Gridded Population of the World (GPW) input data. Check for updates.
class climada.entity.exposures.litpop.litpop.LitPop(*args, meta=None, exponents=None,
fin_mode=None, gpw_version=None,
**kwargs)
Bases: Exposures
Holds geopandas GeoDataFrame with metadata and columns (pd.Series) defined in Attributes of Exposures class.
LitPop exposure values are disaggregated proportional to a combination of nightlight intensity (NASA) and Gridded
Population data (SEDAC). Total asset values can be produced capital, population count, GDP, or non-financial
wealth.
Calling sequence example:
>>> country_names = ['CHE', 'Austria']
>>> exp = LitPop.from_countries(country_names)
>>> exp.plot()
exponents
Defining powers (m, n) with which lit (nightlights) and pop (gpw) go into Lit**m * Pop**n. The default is
(1,1).
Type
tuple of two integers, optional
fin_mode
Socio-economic value to be used as an asset base that is disaggregated. The default is ‘pc’.
Type
str, optional
gpw_version
Version number of GPW population data, e.g. 11 for v4.11. The default is defined in GPW_VERSION.
Type
int, optional
__init__(*args, meta=None, exponents=None, fin_mode=None, gpw_version=None, **kwargs)
Parameters
• data (dict, iterable, DataFrame, GeoDataFrame, ndarray) – data of the initial DataFrame,
see pandas.DataFrame(). Used to initialize values for “region_id”, “category_id”,
“cover”, “deductible”, “value”, “geometry”, “impf_[hazard type]”.
• columns (Index or array, optional) – Columns of the initial DataFrame, see pandas.
DataFrame(). To be provided if data is an array
• index (Index or array, optional) – Columns of the initial DataFrame, see pandas.
DataFrame(). can optionally be provided if data is an array or for defining a specific row
index
• dtype (dtype, optional) – data type of the initial DataFrame, see pandas.DataFrame().
Can be used to assign specific data types to the columns in data
• copy (bool, optional) – Whether to make a copy of the input data, see pandas.
DataFrame(). Default is False, i.e. by default data may be altered by the Exposures
object.
• geometry (array, optional) – Geometry column, see geopandas.GeoDataFrame(). Must
be provided if lat and lon are None and data has no “geometry” column.
• crs (value, optional) – Coordinate Reference System, see geopandas.GeoDataFrame().
• meta (dict, optional) – Metadata dictionary. Default: {} (empty dictionary). May be used to
provide any of description, ref_year, value_unit and crs
• description (str, optional) – Default: None
• ref_year (int, optional) – Reference Year. Defaults to the entry of the same name in meta or
2018.
• value_unit (str, optional) – Unit of the exposed value. Defaults to the entry of the same
name in meta or ‘USD’.
• value (array, optional) – Exposed value column. Must be provided if data has no “value”
column
• lat (array, optional) – Latitude column. Can be provided together with lon, alternative to
geometry
• lon (array, optional) – Longitude column. Can be provided together with lat, alternative to
geometry
set_countries(*args, **kwargs)
This function is deprecated, use LitPop.from_countries instead.
classmethod from_countries(countries, res_arcsec=30, exponents=(1, 1), fin_mode='pc',
total_values=None, admin1_calc=False, reference_year=2018,
gpw_version=11, data_dir=PosixPath('/home/docs/climada/data'))
Init new LitPop exposure object for a list of countries (admin 0).
Sets attributes ref_year, crs, value, geometry, meta, value_unit, exponents, fin_mode, gpw_version, and
admin1_calc.
Parameters
• countries (list with str or int) – list containing country identifiers: iso3alpha (e.g. ‘JPN’),
iso3num (e.g. 92) or name (e.g. ‘Togo’)
• res_arcsec (float, optional) – Horizontal resolution in arc-sec. The default is 30 arcsec, this
corresponds to roughly 1 km.
• exponents (tuple of two integers, optional) – Defining power with which lit (nightlights) and
pop (gpw) go into LitPop. To get nightlights^3 without population count: (3, 0). To use
population count alone: (0, 1). Default: (1, 1)
• fin_mode (str, optional) – Socio-economic value to be used as an asset base that is disaggre-
gated to the grid points within the country:
– ‘pc’: produced capital (Source: World Bank), incl. manufactured or built assets such as
machinery, equipment, and physical structures pc is in constant 2014 USD.
– ‘pop’: population count (source: GPW, same as gridded population). The unit is ‘people’.
– ‘gdp’: gross-domestic product (Source: World Bank) [USD]
– ‘income_group’: gdp multiplied by country’s income group+1 [USD]. Income groups are
1 (low) to 4 (high income).
– ‘nfw’: non-financial wealth (Source: Credit Suisse, of households only) [USD]
– ‘tw’: total wealth (Source: Credit Suisse, of households only) [USD]
– ‘norm’: normalized by country (no unit)
– ‘none’: LitPop per pixel is returned unchanged (no unit)
Default: ‘pc’
• total_values (list containing numerics, same length as countries, optional) – Total values to
be disaggregated to grid in each country. The default is None. If None, the total number is
extracted from other sources depending on the value of fin_mode.
• admin1_calc (boolean, optional) – If True, distribute admin1-level GDP (if available). De-
fault: False
• reference_year (int, optional) – Reference year. Default: CONFIG.exposures.def_ref_year.
• gpw_version (int, optional) – Version number of GPW population data. The default is
GPW_VERSION
• data_dir (Path, optional) – redefines path to input data directory. The default is SYS-
TEM_DIR.
Raises
ValueError –
Returns
exp – LitPop instance with exposure for given countries
Return type
LitPop
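A hedged example using some of the options described above:
>>> from climada.entity import LitPop
>>> exp = LitPop.from_countries(["CHE", "LIE"], res_arcsec=30, exponents=(1, 1),
...                             fin_mode="pop", reference_year=2020)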
set_nightlight_intensity(*args, **kwargs)
This function is deprecated, use LitPop.from_nightlight_intensity instead.
classmethod from_nightlight_intensity(countries=None, shape=None, res_arcsec=15,
reference_year=2018,
data_dir=PosixPath('/home/docs/climada/data'))
Returns
exp – Exposure instance with values representing pure nightlight intensity from input nightlight
data (BlackMarble)
Return type
LitPop
set_population(*args, **kwargs)
This function is deprecated, use LitPop.from_population instead.
classmethod from_population(countries=None, shape=None, res_arcsec=30, reference_year=2018,
gpw_version=11, data_dir=PosixPath('/home/docs/climada/data'))
Wrapper around from_countries / from_shape.
Initiate exposures instance with value equal to GPW population count. Provide either countries or shape.
Parameters
• countries (list or str, optional) – list containing country identifiers (name or iso3)
• shape (Shape, Polygon or MultiPolygon, optional) – geographical shape of target region, al-
ternative to countries.
• res_arcsec (int, optional) – Resolution in arc seconds. The default is 30.
• reference_year (int, optional) – Reference year (closest available GPW data year is used)
The default is CONFIG.exposures.def_ref_year.
• gpw_version (int, optional) – specify GPW data version. The default is 11.
• data_dir (Path, optional) – data directory. The default is None. Either countries or shape is
required.
Raises
ValueError –
Returns
exp – Exposure instance with values representing population count according to Gridded Pop-
ulation of the World (GPW) input data set.
Return type
LitPop
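A minimal sketch using a country identifier:
>>> exp_pop = LitPop.from_population(countries=["CHE"], res_arcsec=30, reference_year=2020)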
set_custom_shape_from_countries(*args, **kwargs)
This function is deprecated, use LitPop.from_shape_and_countries instead.
classmethod from_shape_and_countries(shape, countries, res_arcsec=30, exponents=(1, 1),
fin_mode='pc', admin1_calc=False, reference_year=2018,
gpw_version=11,
data_dir=PosixPath('/home/docs/climada/data'))
create LitPop exposure for country and then crop to given shape.
Parameters
• shape (shapely.geometry.Polygon, MultiPolygon, shapereader.Shape,) – or GeoSeries or list
containg either Polygons or Multipolygons. Geographical shape for which LitPop Exposure
is to be initiated.
• countries (list with str or int) – list containing country identifiers: iso3alpha (e.g. ‘JPN’),
iso3num (e.g. 92) or name (e.g. ‘Togo’)
• res_arcsec (float, optional) – Horizontal resolution in arc-sec. The default is 30 arcsec, this
corresponds to roughly 1 km.
• exponents (tuple of two integers, optional) – Defining power with which lit (nightlights) and
pop (gpw) go into LitPop. Default: (1, 1)
• fin_mode (str, optional) – Socio-economic value to be used as an asset base that is disaggre-
gated to the grid points within the country:
– ‘pc’: produced capital (Source: World Bank), incl. manufactured or built assets such as
machinery, equipment, and physical structures (pc is in constant 2014 USD)
– ‘pop’: population count (source: GPW, same as gridded population). The unit is ‘people’.
– ‘gdp’: gross-domestic product (Source: World Bank) [USD]
– ‘income_group’: gdp multiplied by country’s income group+1 [USD] Income groups are
1 (low) to 4 (high income).
– ‘nfw’: non-financial wealth (Source: Credit Suisse, of households only) [USD]
– ‘tw’: total wealth (Source: Credit Suisse, of households only) [USD]
– ‘norm’: normalized by country
– ‘none’: LitPop per pixel is returned unchanged
Default: ‘pc’
• admin1_calc (boolean, optional) – If True, distribute admin1-level GDP (if available). De-
fault: False
• reference_year (int, optional) – Reference year for data sources. Default: 2020
• gpw_version (int, optional) – Version number of GPW population data. The default is
GPW_VERSION
• data_dir (Path, optional) – redefines path to input data directory. The default is SYS-
TEM_DIR.
Raises
NotImplementedError –
Returns
exp – The exposure LitPop within shape
Return type
LitPop
set_custom_shape(*args, **kwargs)
This function is deprecated, use LitPop.from_shape instead.
classmethod from_shape(shape, total_value, res_arcsec=30, exponents=(1, 1), value_unit='USD',
region_id=None, reference_year=2018, gpw_version=11,
data_dir=PosixPath('/home/docs/climada/data'))
init LitPop exposure object for a custom shape. Requires user input regarding the total value to be disaggre-
gated.
Sets attributes ref_year, crs, value, geometry, meta, value_unit, exponents, fin_mode, gpw_version, and
admin1_calc.
This method can be used to initiate a LitPop Exposure for sub-national regions such as states, districts, cantons,
cities, …, but shapes and total value need to be provided manually. If these required input parameters are not
known / available, it is better to initiate the Exposure for the entire country and extract the shape afterwards.
Parameters
• shape (shapely.geometry.Polygon or MultiPolygon or shapereader.Shape.) – Geographical
shape for which LitPop Exposure is to be initiated.
• total_value (int, float or None type) – Total value to be disaggregated to grid in shape. If
None, no value is disaggregated.
• res_arcsec (float, optional) – Horizontal resolution in arc-sec. The default 30 arcsec corre-
sponds to roughly 1 km.
• exponents (tuple of two integers, optional) – Defining power with which lit (nightlights) and
pop (gpw) go into LitPop.
• value_unit (str) – Unit of exposure values. The default is USD.
• region_id (int, optional) – The numeric ISO 3166 region associated with the shape. If set to a
value, this single value will be set for every coordinate in the GeoDataFrame of the resulting
LitPop instance. If None (default), the region ID for every coordinate will be determined
automatically (at a slight computational cost).
• reference_year (int, optional) – Reference year for data sources. Default: CON-
FIG.exposures.def_ref_year
• gpw_version (int, optional) – Version number of GPW population data. The default is set in
CONFIG.
• data_dir (Path, optional) – redefines path to input data directory. The default is SYS-
TEM_DIR.
Raises
• NotImplementedError –
• ValueError –
• TypeError –
Returns
exp – The exposure LitPop within shape
Return type
LitPop
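A hedged sketch for a custom shape; the polygon coordinates and the total value are illustrative:
>>> from shapely.geometry import Polygon
>>> shape = Polygon([(8.5, 47.3), (8.7, 47.3), (8.7, 47.45), (8.5, 47.45)])
>>> exp_shape = LitPop.from_shape(shape, total_value=1e9, res_arcsec=30, value_unit="USD")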
set_country(*args, **kwargs)
This function is deprecated, use LitPop.from_countries instead.
climada.entity.exposures.litpop.litpop.get_value_unit(fin_mode)
get value_unit depending on fin_mode
Parameters
fin_mode (Socio-economic value to be used as an asset base)
Returns
value_unit
Return type
str
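For illustration, based on the fin_mode descriptions above:
>>> from climada.entity.exposures.litpop.litpop import get_value_unit
>>> get_value_unit("pc")   # produced capital is given in USD
'USD'
>>> get_value_unit("pop")  # population count is given in 'people'
'people'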
climada.entity.exposures.litpop.litpop.reproject_input_data(data_array_list, meta_list, i_align=0,
target_res_arcsec=None,
global_origins=(-180.0,
89.99999999999991),
resampling=Resampling.bilinear,
conserve=None)
LitPop-specific wrapper around u_coord.align_raster_data.
Reprojects all arrays in data_arrays to a given resolution – all based on the population data grid.
Parameters
• data_array_list (list or array of numpy arrays containing numbers) – Data to be reprojected,
i.e. list containing N (min. 1) 2D-arrays. The data with the reference grid used to align the
global destination grid to should be first data_array_list[i_align], e.g., pop (GPW population
data) for LitPop.
• meta_list (list of dicts) – meta data dictionaries of data arrays in same order as data_array_list.
Required fields in each dict are ‘dtype’, ‘width’, ‘height’, ‘crs’, ‘transform’. Example:
>>> {
... 'driver': 'GTiff',
... 'dtype': 'float32',
... 'nodata': 0,
... 'width': 2702,
... 'height': 1939,
... 'count': 1,
... 'crs': CRS.from_epsg(4326),
... 'transform': Affine(0.00833333333333333, 0.0, -18.175000000000068, ...),
... }
The meta data with the reference grid used to define the global destination grid should be first
in the list, e.g., GPW population data for LitPop.
• i_align (int, optional) – Index/Position of meta in meta_list to which the global grid of the
destination is to be aligned to (c.f. u_coord.align_raster_data) The default is 0.
• target_res_arcsec (int, optional) – target resolution in arcsec. The default is None, i.e. same
resolution as reference data.
• global_origins (tuple with two numbers (lat, lon), optional) – global lon and lat origins as basis
for destination grid. The default is the same as for GPW population data: (-180.0,
89.99999999999991)
Parameters
• data_arrays (list or array of numpy arrays containing numbers) – Data to be combined, i.e.
list containing N (min. 1) arrays of same shape.
• total_val_rescale (float or int, optional) – Total value for optional rescaling of resulting array.
All values in result_array are scaled so that the sum is equal to total_val_rescale. The default
(None) implies no rescaling.
• offsets (list or array containing N numbers >= 0, optional) – One numerical offset per array
that is added (sum) to the corresponding array in data_arrays. The default (None) corresponds
to np.zeros(N).
• exponents (list or array containing N numbers >= 0, optional) – One exponent per array used
as power for the corresponding array. The default (None) corresponds to np.ones(N).
Raises
ValueError – If input lists don’t have the same number of elements. Or: If arrays in data_arrays
do not have the same shape.
Returns
Results from calculation described above.
Return type
np.array of same shape as arrays in data_arrays
climada.entity.exposures.litpop.nightlight module
climada.entity.exposures.litpop.nightlight.NOAA_RESOLUTION_DEG = 0.008333333333333333
NOAA nightlights coordinates resolution in degrees.
climada.entity.exposures.litpop.nightlight.NASA_RESOLUTION_DEG = 0.004166666666666667
NASA nightlights coordinates resolution in degrees.
climada.entity.exposures.litpop.nightlight.NASA_TILE_SIZE = (21600, 21600)
NASA nightlights tile resolution.
climada.entity.exposures.litpop.nightlight.NOAA_BORDER = (-180, -65, 180, 75)
NOAA nightlights border (min_lon, min_lat, max_lon, max_lat)
climada.entity.exposures.litpop.nightlight.BM_FILENAMES =
['BlackMarble_%i_A1_geo_gray.tif', 'BlackMarble_%i_A2_geo_gray.tif',
'BlackMarble_%i_B1_geo_gray.tif', 'BlackMarble_%i_B2_geo_gray.tif',
'BlackMarble_%i_C1_geo_gray.tif', 'BlackMarble_%i_C2_geo_gray.tif',
'BlackMarble_%i_D1_geo_gray.tif', 'BlackMarble_%i_D2_geo_gray.tif']
Nightlight NASA files which generate the whole earth when put together.
climada.entity.exposures.litpop.nightlight.load_nasa_nl_shape(geometry, year,
data_dir=PosixPath('/home/docs/climada/data'),
dtype='float32' )
Read nightlight data from NASA BlackMarble tiles cropped to given shape(s) and combine arrays from each tile.
1) check and download required blackmarble files
2) read and crop data from each file required in a bounding box around the given geometry.
3) combine data from all input files into one array. This array then contains all data in the geographic bounding
box around geometry.
4) return array with nightlight data
Parameters
• geometry (shape(s) to crop data to in degree lon/lat.) – for example
shapely.geometry.(Multi)Polygon or shapefile.Shape. from polygon defined in a shape-
file. The object should have attribute ‘bounds’ or ‘points’
• year (int) – target year for nightlight data, e.g. 2016. Closest available year is selected.
• data_dir (Path (optional)) – Path to directory with BlackMarble data. The default is SYS-
TEM_DIR.
• dtype (dtype) – data type for output default ‘float32’, required for LitPop, choose ‘int8’ for
integer.
Returns
• results_array (numpy array) – extracted and combined nightlight data for bounding box
around shape
• meta (dict) – rasterio meta data for results_array
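A hedged usage sketch (the bounding-box polygon is illustrative; missing tiles are downloaded first, and the ‘transform’ key is assumed to be part of the rasterio meta):

from shapely.geometry import Polygon
from climada.entity.exposures.litpop import nightlight

# illustrative lon/lat box
bbox = Polygon([(7.0, 46.0), (9.0, 46.0), (9.0, 48.0), (7.0, 48.0)])
data, meta = nightlight.load_nasa_nl_shape(bbox, year=2016)
print(data.shape, meta["transform"])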
climada.entity.exposures.litpop.nightlight.get_required_nl_files(bounds)
Parameters
bounds (1x4 tuple) – bounding box from shape (min_lon, min_lat, max_lon, max_lat).
Raises
ValueError – invalid bounds
Returns
req_files – Array indicating the required files for the current operation with a boolean value (1: file
is required, 0: file is not required).
Return type
numpy array
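A hedged usage sketch with an illustrative bounding box:

from climada.entity.exposures.litpop import nightlight

bounds = (5.9, 45.8, 10.5, 47.8)  # (min_lon, min_lat, max_lon, max_lat), illustrative
req_files = nightlight.get_required_nl_files(bounds)
print(req_files)  # one flag per entry in BM_FILENAMES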
climada.entity.exposures.litpop.nightlight.check_nl_local_file_exists(required_files=None,
check_path=PosixPath('/home/docs/climad
year=2016)
Checks if BM Satellite files are available and returns a vector denoting the missing files.
Parameters
• required_files (numpy array, optional) – boolean array of dimension (8,) with which some
files can be skipped. Only files with value 1 are checked; those with value zero are skipped. The
default is np.ones(len(BM_FILENAMES),)
• check_path (str or Path) – absolute path where files are stored. Default: SYSTEM_DIR
• year (int) – year of the image, e.g. 2016
Returns
files_exist – Boolean array that denotes if the required files exist.
Return type
numpy array
climada.entity.exposures.litpop.nightlight.download_nl_files(req_files=array([1., 1., 1., 1., 1., 1.,
1., 1.]), files_exist=array([0., 0., 0.,
0., 0., 0., 0., 0.]),
dwnl_path=PosixPath('/home/docs/climada/data'),
year=2016)
Attempts to download nightlight files from the NASA webpage.
Parameters
• req_files (numpy array, optional) –
Boolean array which indicates the files required (0-> skip, 1-> download).
The default is np.ones(len(BM_FILENAMES),).
• files_exist (numpy array, optional) –
Boolean array which indicates if the files already
exist locally and should not be downloaded (0-> download, 1-> skip). The default is
np.zeros(len(BM_FILENAMES),).
• dwnl_path (str or path, optional) – Download directory path. The default is SYSTEM_DIR.
• year (int, optional) – Data year to be downloaded. The default is 2016.
Raises
• ValueError –
• RuntimeError –
Returns
dwnl_path – Download directory path.
Return type
str or path
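A hedged sketch of the typical download workflow, chaining the helpers above (the bounding box is illustrative; only missing tiles are downloaded):

from climada.entity.exposures.litpop import nightlight

bounds = (5.9, 45.8, 10.5, 47.8)  # illustrative bounding box
req_files = nightlight.get_required_nl_files(bounds)
files_exist = nightlight.check_nl_local_file_exists(required_files=req_files, year=2016)
# download only the required tiles that are not yet available locally
dwnl_path = nightlight.download_nl_files(
    req_files=req_files, files_exist=files_exist, year=2016
)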
climada.entity.exposures.litpop.nightlight.load_nasa_nl_shape_single_tile(geometry, path,
layer=0)
Read nightlight data from single NASA BlackMarble tile and crop to given shape.
Parameters
• geometry (shape or geometry object) – shape(s) to crop data to in degree lon/lat. for example
shapely.geometry.Polygon object or from polygon defined in a shapefile.
• path (Path or str) – full path to BlackMarble tif (including filename)
• layer (int, optional) – TIFF-layer to be returned. The default is 0. BlackMarble usually comes
with 3 layers.
Returns
• out_image[layer, :, :] (2D numpy ndarray) – 2D array with data cropped to the bounding box of the
shape
• meta (dict) – rasterio meta
climada.entity.exposures.litpop.nightlight.load_nightlight_nasa(bounds, req_files, year)
Get nightlight data from the NASA repository for tiles that contain the input boundary.
Note: Legacy for BlackMarble, not required for litpop module
Parameters
• bounds (tuple) – min_lon, min_lat, max_lon, max_lat
• req_files (np.array) – array with flags for NASA files needed
• year (int) – nightlight year
Returns
• nightlight (sparse.csr_matrix)
• coord_nl (np.array)
climada.entity.exposures.litpop.nightlight.read_bm_file(bm_path, filename)
Reads a single NASA BlackMarble GeoTiff and returns the data. Run all required checks first.
Note: Legacy for BlackMarble, not required for litpop module
Parameters
• bm_path (str) – absolute path where files are stored.
• filename (str) – filename of the file to be read.
Returns
• arr1 (array) – Raw BM data
• curr_file (gdal GeoTiff File) – Additional info from which coordinates can be calculated.
climada.entity.exposures.litpop.nightlight.unzip_tif_to_py(file_gz)
Unzip the image file, read it, flip the x axis, save the values as pickle and remove the tif.
Parameters
file_gz (str) – file with .gz format to unzip
Returns
• fname (str) – file_name of unzipped file
• nightlight (sparse.csr_matrix)
climada.entity.exposures.litpop.nightlight.untar_noaa_stable_nightlight(f_tar_ini)
Move the input tar file to SYSTEM_DIR and extract the stable light file. Returns the absolute path of the stable light
file in tif.gz format.
Parameters
f_tar_ini (str) – absolute path of file
Returns
f_tif_gz – path of stable light file
Return type
str
climada.entity.exposures.litpop.nightlight.load_nightlight_noaa(ref_year=2013,
sat_name=None)
Get nightlight luminosities. The nightlight matrix, lat and lon are ordered such that nightlight[1][0] corresponds to the
lat[1], lon[0] point (the image has been flipped).
Parameters
• ref_year (int, optional) – reference year. The default is 2013.
• sat_name (str, optional) – satellite provider (e.g. ‘F10’, ‘F18’, …)
Returns
• nightlight (sparse.csr_matrix)
• coord_nl (np.array)
• fn_light (str)
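A hedged usage sketch, assuming the NOAA data are downloaded automatically if not cached locally:

from climada.entity.exposures.litpop import nightlight

nl_matrix, coord_nl, fn_light = nightlight.load_nightlight_noaa(ref_year=2013)
print(nl_matrix.shape, fn_light)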
climada.entity.exposures.base module
data
containing at least the columns ‘geometry’ and ‘value’ for locations and assets; optionally more, among others ‘region_id’,
‘category_id’, and columns for (hazard specific) assigned centroids and (hazard specific) impact functions.
Type
GeoDataFrame
vars_oblig = ['value', 'geometry']
Name of the variables needed to compute the impact.
vars_def = ['impf_', 'if_']
Name of variables that can be computed.
vars_opt = ['centr_', 'deductible', 'cover', 'category_id', 'region_id',
'geometry']
Name of the variables that aren’t needed to compute the impact.
property crs
Coordinate Reference System, refers to the crs attribute of the inherent GeoDataFrame
property gdf
Inherent GeoDataFrame
property latitude
Latitude array of exposures
property longitude
Longitude array of exposures
property geometry
Geometry array of exposures
property value
Value array of exposures
property region_id
Region id for each exposure
Return type
np.array of int
property category_id
Category id for each exposure
Return type
np.array
property cover
Cover value for each exposure
Return type
np.array of float
property deductible
Deductible value for each exposure
Return type
np.array of float
hazard_impf(haz_type='' )
Get impact functions for a given hazard
Parameters
haz_type (str) – hazard type, as in the hazard’s .haz_type attribute, which is the HAZ_TYPE constant of
the hazard’s module
Returns
impact functions for the given hazard
Return type
np.array of int
hazard_centroids(haz_type='' )
Get centroids for a given hazard
Parameters
haz_type (str) – hazard type, as in the hazard’s .haz_type attribute, which is the HAZ_TYPE constant of
the hazard’s module
Returns
centroids index for the given hazard
Return type
np.array of int
derive_raster()
Metadata dictionary, containing raster information, derived from the geometry
__init__(data=None, index=None, columns=None, dtype=None, copy=False, geometry=None, crs=None,
meta=None, description=None, ref_year=None, value_unit=None, value=None, lat=None,
lon=None)
Parameters
• data (dict, iterable, DataFrame, GeoDataFrame, ndarray) – data of the initial DataFrame,
see pandas.DataFrame(). Used to initialize values for “region_id”, “category_id”,
“cover”, “deductible”, “value”, “geometry”, “impf_[hazard type]”.
• columns (Index or array, optional) – Columns of the initial DataFrame, see pandas.
DataFrame(). To be provided if data is an array
• index (Index or array, optional) – Index of the initial DataFrame, see pandas.
DataFrame(). Can optionally be provided if data is an array or for defining a specific row
index
• dtype (dtype, optional) – data type of the initial DataFrame, see pandas.DataFrame().
Can be used to assign specific data types to the columns in data
• copy (bool, optional) – Whether to make a copy of the input data, see pandas.
DataFrame(). Default is False, i.e. by default data may be altered by the Exposures
object.
• geometry (array, optional) – Geometry column, see geopandas.GeoDataFrame(). Must
be provided if lat and lon are None and data has no “geometry” column.
• crs (value, optional) – Coordinate Reference System, see geopandas.GeoDataFrame().
• meta (dict, optional) – Metadata dictionary. Default: {} (empty dictionary). May be used to
provide any of description, ref_year, value_unit and crs
• description (str, optional) – Default: None
• ref_year (int, optional) – Reference Year. Defaults to the entry of the same name in meta or
2018.
• value_unit (str, optional) – Unit of the exposed value. Defaults to the entry of the same
name in meta or ‘USD’.
• value (array, optional) – Exposed value column. Must be provided if data has no “value”
column
• lat (array, optional) – Latitude column. Can be provided together with lon, alternative to
geometry
• lon (array, optional) – Longitude column. Can be provided together with lat, alternative to
geometry
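A minimal constructor sketch, assuming default settings otherwise; all numbers and the hazard type ‘TC’ are illustrative:

import pandas as pd
from climada.entity import Exposures

data = pd.DataFrame({
    "value": [1.0e6, 2.5e6, 0.8e6],   # exposed value per point (illustrative)
    "impf_TC": [1, 1, 1],             # impact function id for hazard type 'TC'
})
exp = Exposures(
    data,
    lat=[46.2, 47.4, 46.9],           # latitudes, used to build the geometry column
    lon=[6.1, 8.5, 7.4],              # longitudes
    crs="EPSG:4326",
    value_unit="USD",
    ref_year=2020,
)
exp.check()  # reports missing columns in the log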
check()
Check Exposures consistency.
Reports missing columns in log messages.
set_crs(crs='EPSG:4326' )
Set the Coordinate Reference System. If the exposures GeoDataFrame has a ‘geometry’ column, it will be
updated too.
Parameters
crs (object, optional) – anything accepted by pyproj.CRS.from_user_input.
set_gdf(gdf: GeoDataFrame, crs=None)
Set the gdf GeoDataFrame and update the CRS
Parameters
• gdf (GeoDataFrame)
• crs (object, optional) – anything accepted by pyproj.CRS.from_user_input; by default None,
in which case gdf.crs applies or, if not set, the exposures’ current crs
get_impf_column(haz_type='' )
Find the best matching column name in the exposures dataframe for a given hazard type.
Parameters
haz_type (str or None) – hazard type, as in the hazard’s .haz_type attribute, which is the HAZ_TYPE
constant of the hazard’s module
Returns
a column name, the first of the following that is present in the exposures’ dataframe:
• impf_[haz_type]
• if_[haz_type]
• impf_
• if_
Return type
str
Raises
ValueError – if none of the above is found in the dataframe.
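A hedged sketch of the column lookup, using a small Exposures object with an ‘impf_TC’ column (values are illustrative):

import pandas as pd
from climada.entity import Exposures

exp = Exposures(
    pd.DataFrame({"value": [1.0, 2.0], "impf_TC": [1, 1]}),
    lat=[46.2, 47.4],
    lon=[6.1, 8.5],
)
col = exp.get_impf_column("TC")  # first match in the lookup order above, here 'impf_TC'
print(col, exp.gdf[col].values)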
µ See also
climada.util.coordinates.match_grid_points
method to associate centroids to exposure points when centroids is a raster
climada.util.coordinates.match_coordinates
method to associate centroids to exposure points
Notes
The default order of use is:
1. if centroid raster is defined, assign exposures points to the closest raster point.
2. if no raster, assign centroids to the nearest neighbor using euclidian metric
Both cases can introduce inaccuracies for lat/lon coordinates, as distances in degrees differ from
distances in meters on the Earth’s surface, in particular at higher latitudes and for distances larger than 100 km. If
more accuracy is needed, use the ‘haversine’ distance metric. This is, however, slower for (quasi-)gridded
data and works only for non-gridded data.
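The See also and Notes above refer to the centroid assignment performed by Exposures.assign_centroids (also referenced further below). A minimal sketch, assuming exp is an existing Exposures instance, haz an existing climada.hazard.Hazard instance, and that the method accepts a distance argument forwarded to climada.util.coordinates.match_coordinates:

# `exp` (Exposures) and `haz` (Hazard) are assumed to exist already.
exp.assign_centroids(haz)                        # default: raster point or nearest-neighbour assignment
exp.assign_centroids(haz, distance="haversine")  # assumed option: slower, for non-gridded data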
set_geometry_points(scheduler=None)
obsolete and deprecated since climada 5.0
Deprecated: obsolete method call. As of climada 5.0, geometry points are set during object
initialization.
set_lat_lon()
Set latitude and longitude attributes from geometry attribute.
Deprecated: the latitude and longitude columns are no longer meaningful in Exposures Geo-
DataFrames. They can be retrieved from the Exposures.latitude and Exposures.longitude properties.
set_from_raster(*args, **kwargs)
This function is deprecated, use Exposures.from_raster instead.
Return type
matplotlib.figure.Figure, cartopy.mpl.geoaxes.GeoAxesSubplot
plot_basemap(mask=None, ignore_zero=False, pop_name=True, buffer=0.0, extend='neither', zoom=10,
url={'attribution': '(C) OpenStreetMap contributors (C) CARTO', 'html_attribution': '© <a
href="https://www.openstreetmap.org/copyright">OpenStreetMap</a> contributors © <a
href="https://carto.com/attributions">CARTO</a>', 'max_zoom': 20, 'name': 'CartoDB.Positron',
'subdomains': 'abcd', 'url': 'https://{s}.basemaps.cartocdn.com/{variant}/{z}/{x}/{y}{r}.png',
'variant': 'light_all'}, axis=None, **kwargs)
Scatter points over satellite image using contextily
Parameters
• mask (np.array, optional) – mask to apply to the plotted values. Same size as the exposures;
only the selected indexes will be plotted.
• ignore_zero (bool, optional) – flag to indicate whether zero and negative values are ignored in the plot.
Default: False
• pop_name (bool, optional) – add names of the populated places, by default True.
• buffer (float, optional) – border to add to coordinates. Default: 0.0.
• extend (str, optional) – extend border colorbar with arrows. [ ‘neither’ | ‘both’ | ‘min’ | ‘max’ ]
• zoom (int, optional) – zoom coefficient used in the satellite image
• url (Any, optional) – image source, e.g., ctx.providers.OpenStreetMap.Mapnik.
Default: ctx.providers.CartoDB.Positron
• axis (matplotlib.axes._subplots.AxesSubplot, optional) – axis to use
• kwargs (optional) – arguments for scatter matplotlib function, e.g. cmap=’Greys’. Default:
‘Wistia’
Return type
matplotlib.figure.Figure, cartopy.mpl.geoaxes.GeoAxesSubplot
write_hdf5(file_name)
Write data frame and metadata in hdf5 format
Parameters
file_name (str) – (path and) file name to write to.
read_hdf5(*args, **kwargs)
This function is deprecated, use Exposures.from_hdf5 instead.
classmethod from_hdf5(file_name)
Read data frame and metadata in hdf5 format
Parameters
• file_name (str) – (path and) file name to read from.
• additional_vars (list) – list of additional variable names, other than the attributes of the
Exposures class, whose values are to be read into the Exposures object class.
Return type
Exposures
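A hedged round-trip sketch with write_hdf5 and from_hdf5 (the file name and values are illustrative):

import pandas as pd
from climada.entity import Exposures

exp = Exposures(pd.DataFrame({"value": [1.0, 2.0]}), lat=[46.2, 47.4], lon=[6.1, 8.5])
exp.write_hdf5("exposures_demo.h5")              # illustrative file name
exp_copy = Exposures.from_hdf5("exposures_demo.h5")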
read_mat(*args, **kwargs)
This function is deprecated, use Exposures.from_mat instead.
Return type
Exposures
centroids_total_value(hazard )
Compute value of exposures close enough to be affected by hazard
Deprecated since version 3.3: This method will be removed in a future version. Use
affected_total_value() instead.
This method computes the sum of the value of all exposures points for which a Hazard centroid is assigned.
Parameters
hazard (Hazard) – Hazard affecting Exposures
Returns
Sum of value of all exposures points for which a centroids is assigned
Return type
float
affected_total_value(hazard: Hazard, threshold_affected: float = 0, overwrite_assigned_centroids: bool =
True)
Total value of the exposures that are affected by at least one hazard event (sum of value of all exposures points
for which at least one event has intensity larger than the threshold).
Parameters
• hazard (Hazard) – Hazard affecting Exposures
• threshold_affected (int or float) – Hazard intensity threshold above which an exposure is
considered affected. The default is 0.
• overwrite_assigned_centroids (boolean) – Assign centroids from the hazard to the expo-
sures and overwrite existing ones. The default is True.
Returns
Sum of value of all exposures points for which a centroids is assigned and that have at least one
event intensity above threshold.
Return type
float
µ See also
Exposures.assign_centroids
method to assign centroids.
® Note
The fraction attribute of the hazard is ignored. Thus, for hazards with a defined fraction, the affected values
will be overestimated.
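A hedged usage sketch, assuming exp is an existing Exposures instance and haz an existing Hazard instance; the threshold of 10 is illustrative (e.g. wind speed in m/s for a tropical cyclone hazard):

affected = exp.affected_total_value(haz, threshold_affected=10)
print(f"Affected value: {affected:.0f} {exp.value_unit}")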
climada.entity.impact_funcs package
climada.entity.impact_funcs.base module
Type
np.array
paa
percentage of affected assets (exposures) for each intensity (numbers in [0,1])
Type
np.array
__init__(haz_type: str = '', id: str | int = '', intensity: ndarray | None = None, mdd: ndarray | None = None,
paa: ndarray | None = None, intensity_unit: str = '', name: str = '' )
Initialization.
Parameters
• haz_type (str, optional) – Hazard type acronym (e.g. ‘TC’).
• id (int or str, optional) – id of the impact function. Exposures of the same type will refer to
the same impact function id.
• intensity (np.array, optional) – Intensity values. Defaults to empty array.
• mdd (np.array, optional) – Mean damage (impact) degree for each intensity (numbers in
[0,1]). Defaults to empty array.
• paa (np.array, optional) – Percentage of affected assets (exposures) for each intensity (num-
bers in [0,1]). Defaults to empty array.
• intensity_unit (str, optional) – Unit of the intensity.
• name (str, optional) – Name of the ImpactFunc.
calc_mdr(inten: float | ndarray) → ndarray
Interpolate impact function to a given intensity.
Parameters
inten (float or np.array) – intensity, the x-coordinate of the interpolated values.
Return type
np.array
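A minimal sketch of defining an impact function and interpolating the mean damage ratio; all numbers are illustrative:

import numpy as np
from climada.entity import ImpactFunc

impf = ImpactFunc(
    haz_type="TC",
    id=1,
    intensity=np.array([0.0, 20.0, 40.0, 60.0]),  # e.g. wind speed
    mdd=np.array([0.0, 0.1, 0.5, 1.0]),           # mean damage degree in [0, 1]
    paa=np.ones(4),                               # all assets affected
    intensity_unit="m/s",
    name="illustrative TC impact function",
)
impf.check()
print(impf.calc_mdr(np.array([10.0, 50.0])))      # interpolated MDR = MDD * PAA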
plot(axis=None, **kwargs)
Plot the impact functions MDD, MDR and PAA in one graph, where MDR = PAA * MDD.
Parameters
• axis (matplotlib.axes._subplots.AxesSubplot, optional) – axis to use
• kwargs (optional) – arguments for plot matplotlib function, e.g. marker=’x’
Return type
matplotlib.axes._subplots.AxesSubplot
check()
Check consistent instance data.
Raises
ValueError –
classmethod from_step_impf(intensity: tuple[float, float, float], haz_type: str, mdd: tuple[float, float] = (0,
1), paa: tuple[float, float] = (1, 1), impf_id: int = 1, **kwargs)
Step function type impact function.
By default, the impact is 100% above the step. Useful for high resolution modelling.