
Liability Modelling

Paul King

29 January 2023
Table of contents

1 Overview 6

2 Measures of investment risk 7


2.1 Learning objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Types of risk measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3 Variance of returns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.4 Downside semi-variance of returns . . . . . . . . . . . . . . . . . . . . . . . . 10
2.5 Value at risk (VaR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.6 Expected shortfall . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.7 Shortfall probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.8 Relationship between risk measures and utility functions . . . . . . . . . . . . 14
2.9 Practical issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.10 Past IFoA exam questions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.11 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

3 Stochastic rates of return 17


3.1 Learning objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2 Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3 Mathematical models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3.1 Types of model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3.2 Uses of models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4 Key notation and equations from Garrett . . . . . . . . . . . . . . . . . . . . 19
3.4.1 Log-normal returns (Garrett 12.3) . . . . . . . . . . . . . . . . . . . . 20
3.5 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.5.1 Behavioural economics reading . . . . . . . . . . . . . . . . . . . . . . 20
3.5.2 Heuristic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.5.3 Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.5.4 Familiarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.6 Past IFoA questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

4 Simulation and stochastic modelling 23


4.1 Generative models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.2 Turning a generative model into code . . . . . . . . . . . . . . . . . . . . . . . 24
4.3 Estimating probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.4 Timing code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.5 Discrete Event Simulation (DES) . . . . . . . . . . . . . . . . . . . . . . . . . 27

5 Valuing benefit guarantees 28


5.1 Week 4 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

5.2 Economic Scenario Generators . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.3 Actuarial uses of ESGs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.4 Stylized facts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.5 ESG summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.6 Single period mean-variance models . . . . . . . . . . . . . . . . . . . . . . . 30
5.7 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.8 Additional reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

6 Run-off triangles 31
6.1 Learning Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
6.2 Need for claims estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
6.2.1 Accurate estimation of reserves is important . . . . . . . . . . . . . . . 32
6.2.2 Example: Incremental paid claims . . . . . . . . . . . . . . . . . . . . 32
6.2.3 Cumulative paid claims . . . . . . . . . . . . . . . . . . . . . . . . . . 33
6.2.4 Mathematical (regression) model . . . . . . . . . . . . . . . . . . . . . 33
6.3 Basic chain ladder approach. . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
6.3.1 Development factors for all years . . . . . . . . . . . . . . . . . . . . . 35
6.3.2 Assumptions underlying the basic chain ladder method . . . . . . . . 35
6.4 Adjusting for past inflation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
6.4.1 Assumptions underlying the inflation-adjusted chain ladder method . 38
6.5 Average cost per claim method . . . . . . . . . . . . . . . . . . . . . . . . . . 38
6.5.1 Assumptions underlying the average cost per claim method . . . . . . 40
6.6 The Bornhuetter-Ferguson method . . . . . . . . . . . . . . . . . . . . . . . . 40
6.6.1 Loss Ratio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
6.6.2 Assumptions underlying the Bornhuetter-Ferguson method . . . . . . 41
6.7 Statistical models and simulation . . . . . . . . . . . . . . . . . . . . . . . . . 41
6.7.1 Goodness of fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
6.7.2 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
6.8 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

7 Ruin theory 45
7.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
7.1.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
7.1.2 Premiums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
7.1.3 The surplus process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
7.1.4 Simulating the surplus process . . . . . . . . . . . . . . . . . . . . . . 46
7.2 The probability of ruin in continuous time . . . . . . . . . . . . . . . . . . . . 47
7.2.1 Some relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
7.3 Probability of ruin in discrete time . . . . . . . . . . . . . . . . . . . . . . . . 48
7.4 A counting process for claims . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
7.4.1 The Poisson process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
7.4.2 Distribution of time to the first claim . . . . . . . . . . . . . . . . . . 49
7.4.3 The compound Poisson process . . . . . . . . . . . . . . . . . . . . . . 49
7.5 Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
7.5.1 Lundberg’s inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
7.5.2 How R depends on other parameters . . . . . . . . . . . . . . . . . . . 50

7.5.3 The adjustment factor - compound Poisson processes . . . . . . . . . . 50
7.5.4 General aggregate claims processes - adjustment . . . . . . . . . . . . 51
7.6 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

8 Rational expectation and utility theory 52


8.1 Meaning of utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
8.2 Expected Utility Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
8.2.1 Expected Utility Axioms . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.2.2 Non-satiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.2.3 Risk preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.3 Some utility functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
8.3.1 The quadratic utility function . . . . . . . . . . . . . . . . . . . . . . . 54
8.3.2 The log utility function . . . . . . . . . . . . . . . . . . . . . . . . . . 55
8.3.3 The power utility function . . . . . . . . . . . . . . . . . . . . . . . . . 56
8.4 Example utility curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
8.5 Problems with EUT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
8.6 Stochastic dominance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
8.6.1 First order stochastic dominance . . . . . . . . . . . . . . . . . . . . . 57
8.6.2 Second order stochastic dominance . . . . . . . . . . . . . . . . . . . . 57
8.7 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
8.8 Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

9 Behavioural economics 59
9.1 Learning Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
9.2 Criticisms of expected utility theory (EUT) . . . . . . . . . . . . . . . . . . . 59
9.3 Summary of Prospect Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
9.3.1 Decision making under Prospect Theory . . . . . . . . . . . . . . . . . 60
9.4 Heuristics and behavioural biases . . . . . . . . . . . . . . . . . . . . . . . . . 63
9.4.1 Anchoring and adjustment . . . . . . . . . . . . . . . . . . . . . . . . 64
9.4.2 Familiarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
9.4.3 Overconfidence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
9.4.4 Hindsight bias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
9.4.5 Confirmation bias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
9.4.6 Self-serving bias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
9.4.7 Status quo bias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
9.4.8 Herd behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
9.5 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
9.6 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

10 Insurance and risk markets 68


10.1 Learning objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
10.2 Finding the maximum premium . . . . . . . . . . . . . . . . . . . . . . . . . . 68
10.3 Finding the minimum premium . . . . . . . . . . . . . . . . . . . . . . . . . . 69
10.4 A little insurance backgound . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
10.4.1 Pooling resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
10.4.2 Adverse selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

10.4.3 Moral hazard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
10.4.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
10.5 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70


1 Overview

The module covers the objectives in the IFoA syllabus for CM2 that aren’t covered by
MA3471 / MA7471. The broad structure will be:

• Measures of investment risk


• Stochastic rate of return models
• Simulation and stochastic modelling
• Valuing benefit guarantees
• Run off triangles
• Ruin theory
• Rational expectation and utility theory
• Behavioural economics
• Insurance and risk markets

We’ll be using both R and Excel, so you’ll need to make sure you have access to a working
R system.
You can access R by logging on to https://rserver.mcs.le.ac.uk. If you don't have an account
from Fundamentals of Data Science last semester you will be sent one in the next few days.
(For some mysterious reason, it’s not possible to log onto the server on-campus if you are
connected to the University wireless network. You must log onto the Guest (_cloud) network
instead.)
Alternatively you can install R and RStudio on your own computer, or you can access
RStudio on the university computers via the Student Desktop.

2 Measures of investment risk

Financial markets operate under uncertainty, and hence every participant in those markets
is exposed to the possibility of losing money or achieving a return that is less than expected
(i.e. they are exposed to investment risk). Several different measures can be used to quantify
this risk, all of which aim to measure the scale of the possible loss and/or take account of
the probability of incurring that loss. These are the two main components of any risk.
In this chapter, we define several widely used measures of financial risk and use them to
compare different investment opportunities. They could be used to analyse single assets or,
more usefully, portfolios of assets. In particular, we focus on variance of return, downside
semi-variance of return, shortfall probability and the value at risk measure.

Check your understanding

ì Some risk-free investments exist; can you give examples? Why not just invest in
those, i.e. why bother taking investment risk?

Government bonds (Gilts in the UK) are practically risk free. Also bank deposits
in the UK are guaranteed by the government up to a limit (£85,000 at the time of
writing).
There’s a saying “No risk, no return”. Risk-free investments give relatively low returns,
and so many investors choose riskier assets with higher expected returns.

ì Can you think of some examples of why you would want to analyse the risks of a
portfolio of assets?

First, to make sure you are getting enough extra return for the risk taken compared
with risk-free alternatives.
Second, to make sure you are getting the highest return for the chosen level of risk you
are willing to take (your “risk appetite”).
Other reasons:

• If you measure risk it gives you a chance to control it e.g. by hedging or diversi-
fication
• You can start to see whether your assets will be a good match to your liabilities
• Allowing for the above, you can work out how much capital to hold so that, even
if the risks materialise, you are able to remain solvent.

2.1 Learning objectives

This chapter covers the following learning objectives from the CM2 syllabus.

• Define the following measures of investment risk:


– variance of return
– downside semi-variance of return
– shortfall probabilities
– Value at Risk (VaR) / Tail VaR
• Describe how the risk measures listed above are related to the form of an investor’s
utility function.
• Perform calculations using the risk measures listed above to compare investment op-
portunities.
• Explain how the distribution of returns and the thickness of tails will influence the
assessment of risk.

2.2 Types of risk measures

The risk measures listed above can be classified into two main types. The first type are
dispersion measures, such as variance and semi-variance of returns; they estimate the spread
of returns around the expected value. Measures of the second type, such as value at risk
and expected shortfall, are probabilistic and are more directly related to the probability of
loss. As a generalisation, the dispersion measures often tell you more about the central part
of the distribution, whilst the probabilistic measures can provide more information about
the tails (to the extent that they can be reliably estimated).

Check your understanding

ì What are the two components of risk that might be captured by different risk
measures?

The possibility of incurring a loss


The scale of the loss incurred

ì Why do you think variance is the simplest and most widely used risk measure?

Variance is the most basic and widely understood statistical measure of dispersion
around the mean.

2.3 Variance of returns

Consider the price 𝑃 of an asset as a random variable on the probability space that is
formed by all possible events affecting the market. Then the single period return $r = \frac{P_1 - P_0}{P_0}$,
where 𝑃0 is the initial price, is also a random variable. Investment risk stems from possible
deviations of the price from its expected value and the impact that this will have on the
return.
The simplest and most widely used risk measure in mathematical finance is variance of
return, which is defined to be:
$$E[(R - E(R))^2] = E(R^2) - (E(R))^2$$
The square root of the variance, i.e. the standard deviation, can also be used as a measure
of risk, but variance is more convenient due to its better analytical properties (by avoiding
square roots).
The variance of return measures the total spread of the return distribution giving equal
weights to positive and negative deviations from the mean (i.e. all variability is treated
as “bad”). Usually investors benefit from positive deviations and only suffer from negative
deviations, so it would appear natural to focus on negative deviations only when considering
risk. One measure that does so is called the downside semi-variance of returns and will
be described in more detail below. However, typical distributions of financial variables
are approximately symmetric around the mean, so the variance and semi-variance will be
proportional. When the distribution is close to being symmetric, the semi-variance does not
contain any additional information that the variance does not.
Since the variance is usually easier to compute and is readily available as a standard output
in most statistical software, this advantage in ease of calculation is important. Alternative
measures that are more complicated are only justified if they provide genuine additional
insight. The use of the variance of returns also allows for the development of useful theo-
retical results, as we will see under mean-variance portfolio theory, where it can be used to
find optimal portfolio allocations. Lastly, if an investor has a quadratic utility function, or
returns are known to be normally distributed, then variance of returns is by definition the
appropriate risk measure.

Check your understanding

ì If 𝑓(𝑥) is the probability density function (pdf) of the investment return, what is
the integral expression for the variance of return?

$$\int_{-\infty}^{\infty} [x - E(x)]^2 f(x)\,dx$$

ì How would you decide if it is worth using a risk measure that is more complicated
than variance of return.

You would only choose a different measure if the extra insight justified the extra
complication in the circumstances you are using the measure.

2.4 Downside semi-variance of returns

Semi-variance measures downside variability and is computed as the average of the squared
deviations below the mean return (i.e. it provides a measure of $E\big[\big((r - E(r))_-\big)^2\big]$, where $f_- \equiv \min(f, 0)$).
Semi-variance is similar to the variance, however, it only considers observed returns below
the mean/expected return. A useful tool in portfolio or asset analysis, especially in the
presence of skewed distributions, semi-variance provides a measure for downside risk. While
standard deviation and variance provide measures of overall volatility, semi-variance only
looks at the negative fluctuations of the asset. By ignoring all values above the mean (or an
investor’s target return) semi-variance estimates the average degree of loss that a portfolio
or asset could incur, given the definition of “loss” being used (i.e. in an absolute sense or
compared to a benchmark).
For risk averse investors, solving for optimal portfolio allocations by minimizing semi-
variance would limit the likelihood of a large loss. A similar result could be achieved with
the variance of returns, unless the distribution was highly skewed.

Check your understanding

ì If 𝑓(𝑥) is the probability density function (pdf) of the investment return, what is
the integral expression for the downside semi-variance of return?

$$\int_{-\infty}^{E(x)} [x - E(x)]^2 f(x)\,dx$$

Assume investment returns are distributed uniformly as 𝑈 (0%, 6%).

ì State whether the distribution is skewed or not.

The distribution is symmetrical.

ì Calculate the variance and semi-variance of returns using the integral expressions
given above.
$$\int_{0}^{6} \frac{(x-3)^2}{6}\,dx = 3$$
(Or we could just use the standard formula for the variance of a uniform distribution.)
$$\int_{0}^{3} \frac{(x-3)^2}{6}\,dx = \frac{3}{2}$$

ì Comment on the relationship between the two measures.

Since the “bottom half” of the distribution has exactly the same form as the full
distribution we would expect the downside semi-variance to be equal to half of the
variance.

ì How would the answer differ if the distribution was 𝑈 (−3%, 3%)?

Since the measures are independent of the location of the distribution, the results
would be the same.
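These results are easy to check numerically. Here is a minimal sketch in base R, simulating from the U(0%, 6%) distribution; the sample size is arbitrary:

# Check the U(0%, 6%) variance and semi-variance results by simulation
set.seed(123)
x <- runif(1e6, min = 0, max = 6)

# Variance of returns: should be close to 3
mean((x - mean(x))^2)

# Downside semi-variance: mean squared deviation below the mean.
# Should be close to 1.5, i.e. half the variance for this symmetric distribution
mean(pmin(x - mean(x), 0)^2)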

2.5 Value at risk (VaR)

The value at risk (VaR) estimates the future loss (within a given interval of time) that will
only be exceeded with a given (low) probability. It is therefore a “threshold” measure of
loss that is not the highest possible loss, but rather is only likely to be exceeded in extreme
conditions. Crucially, VaR tells you nothing about the potential scale of losses beyond the
threshold level. Having evaluated VaR as $l$ for the confidence level $\alpha$, $0 < \alpha \leq 1$, one can be
$100\alpha\%$ sure that the loss will not exceed $l$; equivalently, there is a $100(1-\alpha)\%$ probability that it will. Typically
$\alpha$ is chosen close to 1, but it may not be possible to estimate VaR accurately for very
high confidence levels (e.g. $\alpha = 99.5\%$ or higher).
Given the cumulative distribution function $F_L(l) = \Pr(L \leq l)$ of the losses $L$, the assignment
$\alpha \to VaR$ is, roughly speaking, the inverse function of $F_L$. That is exactly so if $F_L(l)$ is
continuous and strictly increasing. In general, the function is weakly increasing and might
also have (at most a countable number of) jump points. The value at risk for the confidence
level $\alpha$ is then defined to be:
$$VaR_\alpha(L) = \inf\{l \mid F_L(l) \geq \alpha\}$$
If 𝛼 belongs to the range of values of 𝐹𝐿 (𝑙) (that is always the case if 𝐹𝐿 (𝑙) is continuous) then
the weak inequality can be replaced by equality. Also, if 𝐹𝐿 (𝑙) is strictly increasing, then
we can drop the symbol inf, and 𝑉 𝑎𝑅𝛼 (𝐿) would be a unique solution to the equation:
𝐹𝐿 (𝑉 𝑎𝑅𝛼 (𝐿)) = 𝛼
A vast amount of literature is devoted to how to evaluate VaR. It is used not only by traders
and portfolio managers but also by regulatory authorities. For example, banks in the United
States are required to hold reserves equal to 3 times the 10-day 99% VaR for market risks.

Check your understanding

ì If 𝑓𝐿 (𝑥) is the probability density function (pdf) of the loss 𝐿, what integral expres-
sion defines the 𝛼% VaR?
$$\int_{-\infty}^{VaR_\alpha} f_L(x)\,dx = \alpha$$

ì Assume investment returns are distributed as 𝑈 (−3%, 3%). Calculate the 98% VaR
on an investment of 100.

The loss is distributed as $U(-3, 3)$. Therefore

$$\int_{-3}^{VaR_\alpha} \tfrac{1}{6}\,dx = 0.98,$$

giving $VaR = 2.88$.
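The same answer can be estimated by simulation, using the empirical quantile of simulated losses. A minimal sketch in base R (the sample size is arbitrary):

# Estimate the 98% VaR of a loss distributed as U(-3, 3) by simulation
set.seed(123)
loss <- runif(1e6, min = -3, max = 3)

# The 98% VaR is the 98th percentile of the loss distribution
# (analytically 2.88 on an investment of 100)
quantile(loss, probs = 0.98)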

Despite its popularity, VaR has the following disadvantages:

1. VaR does not take into account the possibility of big losses with probabilities less than
(1 − 𝛼).
2. VaR makes no distinction between tail types of loss distributions and hence underestimates risk when the loss distribution has a heavy tail.
3. VaR is not a coherent risk measure. In particular it is not sub-additive. This means
that there are examples when VaR of a portfolio is greater than the sum of the VaRs
of its components. This contradicts common sense: if we view a risk measure as the
amount of money needed for reserves to cover potential losses due to market risk:
it’s counter-intuitive that the portfolio requires more reserves than the sum of its
components (i.e. holding the assets together is more risky than a group of people
holding them separately).

Check your understanding

ì Why do you think it’s difficult to estimate VaR accurately for very high confidence
levels?

Because tails of distributions are hard to estimate accurately given that, by definition,
extreme events happen rarely.

2.6 Expected shortfall

The expected shortfall (ES) is also called conditional value at risk or the tail VaR. ES
evaluates the risk of an investment in a conservative way, focusing on the less profitable
outcomes above a given threshold. This is a similar approach to the semi-variance of returns
described above. For high values of (1 − 𝛼) it ignores the most potentially profitable but
unlikely possibilities, for small values of (1 − 𝛼) it focuses on the worst losses. On the other
hand, unlike the maximum loss, even for higher values of 𝛼 ES does not consider only the
single most catastrophic outcome. A value of 𝛼 often used in practice is 95%.
𝐸𝑆𝛼 (𝐿) = 𝐸(𝐿|𝐿 ≥ 𝑙)
where 𝑙 = 𝑉 𝑎𝑅𝛼 (𝐿) for the threshold 𝛼. Note that the possible equality in the right-hand
side is essential, since otherwise for sufficiently high 𝛼 the expected loss would be zero in
case of bounded 𝐿. Consider for example the distribution 𝐿 = 0 with probability 0.9 and
𝐿 = 100 with probability 0.1. The value at risk for 𝛼 = 0.95 coincides with the expected
shortfall and is equal to 100, while 𝐸(𝐿|𝐿 > 100) = 0.
Some properties of expected shortfall are as follows:

1. It is a (weakly) increasing function of 𝛼.


2. The 0%-quantile expected shortfall 𝐸𝑆0.0 equals the expected loss of the portfolio.
(Note that this is a very special case: expected shortfall and expected value are not
equal in general).
3. For a given portfolio the expected shortfall 𝐸𝑆𝛼 is greater than (or equal to) the value
at risk 𝑉 𝑎𝑅𝛼 at the same 𝛼-level.

The expected shortfall has advantages compared with VaR. It is a more conservative risk
measure than VaR and for the same confidence level it will suggest higher reserves to hold
against potential losses, if that is what the risk measure is used to determine (this could
potentially be true for a bank or insurance company, but not necessarily an investment
fund).
Consider a simple example illustrating relations between VaR and ES. Suppose that we have
a debt security with nominal value 100 and redemption date being tomorrow. It will be
redeemed completely with probability 0.99. With probability 0.01 the borrower will refuse
to pay 100 and we get only half of the nominal value. In this scenario our loss 𝐿 will be
0 with probability 0.99 and 50 with probability 0.01. For 𝛼 = 0.95 we find 𝑉 𝑎𝑅𝛼 (𝐿) = 0,
that is VaR recommends that we do not hold any reserves against potential losses at all.
This seems strange, because our loss might be significant and its probability 0.01 is not so
small as to be something that we can ignore. At the same time:
𝐸𝑆𝛼 (𝐿) = 𝐸(𝐿|𝐿 > 0) = 50
Thus, the expected shortfall takes into account bigger losses that might occur with low (less
than 1−𝛼) probability. It is also more informative in the often encountered real life situation
when the loss distribution has a heavy tail. It can also be useful in fund management, where
performance is often measured relative to a specified benchmark (perhaps a stock market
index).

2.7 Shortfall probabilities

A related measure to the expected shortfall is the shortfall probability, which is simply
$\Pr(L > l)$ for a specified loss threshold $l$ (i.e. the probability that the loss exceeds $l$). This is a unitless way of comparing return distributions and
can be used as a way of understanding behaviour in the tails of those distributions.

Check your understanding

ì Assume investment returns are distributed as 𝑈 (−3%, 3%). Calculate the shortfall
probability and expected shortfall beyond the 98% VaR level.

By definition of the VaR the shortfall probability is 2%.


The loss is distributed as $U(-3, 3)$. Therefore the expected shortfall is given by

$$ES = \frac{\int_{2.88}^{3} \frac{x}{6}\,dx}{\int_{2.88}^{3} \frac{1}{6}\,dx}
     = \frac{\left[x^2/12\right]_{2.88}^{3}}{\left[x/6\right]_{2.88}^{3}}
     = \frac{9 - 2.88^2}{2(3 - 2.88)}
     = 2.94$$
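A quick simulation check of both numbers, as a minimal sketch in base R (the sample size is arbitrary):

# Shortfall probability and expected shortfall beyond the 98% VaR for U(-3, 3) losses
set.seed(123)
loss <- runif(1e6, min = -3, max = 3)
var_98 <- quantile(loss, probs = 0.98)

# Shortfall probability beyond the VaR level: should be close to 2%
mean(loss > var_98)

# Expected shortfall: mean loss given the loss exceeds the VaR, should be close to 2.94
mean(loss[loss > var_98])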

2.8 Relationship between risk measures and utility functions

An investor using a particular risk measure will base their investment decisions on the
available combinations of risk and expected return. Using information about how a
specific investor makes the trade-off between these two competing features of potential
investments, it is possible, in principle, to determine the investor's underlying utility
function. On the other hand, given a specific utility function, the related risk measure can
be determined. For example, if an investor has a quadratic utility function, the variance of
return is an appropriate measure of risk.
This is because the function used to maximise expected utility is a function of the expected
return and the variance of returns only. If expected return and semi-variance below the
expected return are used as the basis of investment decisions (i.e. the semi-variance is
the risk measure), it can be shown that this implies a utility function which is quadratic
below the expected return level and linear above (since the investor is assumed to be risk
neutral in respect of positive returns, but risk averse in terms of negative returns with
utility depending on the mean and variance for these negative returns). Use of a shortfall
risk measure corresponds to a utility function that has a discontinuity at the minimum
required return (or shortfall threshold).

2.9 Practical issues

As you will have seen from the discussion above, different risk measures have different
advantages and disadvantages and so there is no one measure that is always superior to the
others. It is therefore worth investigating the properties of different investments using a
variety of different measures in order to better understand the risk profile.
An important risk assessment often used in practice that is not covered here is stress testing,
where extreme financial scenarios (e.g. combinations of large falls in asset prices, high rates
of inflation, or very low or negative interest rates) are applied to an asset or portfolio to
test how the values will perform under extreme conditions. Knowing how your investments
are likely to behave under extreme conditions can provide valuable insights to the risks that
you are exposed to. When performing stress tests, liabilities need to be consistently valued
according to the stress scenario. Often in the actuarial context, it is the surplus or shortfall
of assets compared to liabilities that is more important than the absolute value of either.
Having calculated the risk measures, the most important step is then considering what to do
in light of these results. Monitoring changes over time can be used for portfolio management:
investigating the reasons for observed changes can inform future investment strategy or risk
mitigation actions.

Check your understanding

ì Define the value at risk $VaR_\alpha$ as $VaR_\alpha(L) = \sup\{l \mid F_L(l) < \alpha\}$. Show that
this definition is equivalent to the definition used earlier: $VaR_\alpha(L) = \inf\{l \mid F_L(l) \geq \alpha\}$

This result relies on the properties of the cumulative distribution function (cdf) of
losses, $F(\ell)$. All cdfs are weakly increasing, right-continuous functions with
$\lim_{\ell \to -\infty} F(\ell) = 0$ and $\lim_{\ell \to \infty} F(\ell) = 1$, and they have at
most a countable number of jump points.
Suppose now that $\alpha \in (0, 1]$. There are two possible alternatives.
First, $\alpha$ is not included in the range of $F$, i.e. there is a jump in $F$ at some point $a$
such that $\lim_{\ell \to a^-} F(\ell) < \alpha < F(a)$. In other words, the step function $F$ "skips" $\alpha$. In this
case, it is clear that

$$\inf\{\ell \mid F_L(\ell) \geq \alpha\} = a = \sup\{\ell \mid F_L(\ell) < \alpha\}.$$

Second, if $\alpha$ does belong to the range of $F$, define $A = F^{-1}(\alpha)$. $F$ may be a
step function, in which case $A$ will be an interval. $A$ has a lower bound,
since $\alpha$ is assumed to be strictly positive. Now set $a = \min A$. By definition,
$\inf\{\ell \mid F_L(\ell) \geq \alpha\} = a$ since $F$ is an increasing function. This also means that
$\ell < a \iff F(\ell) < \alpha$. Therefore, as required,

$$a = \sup\{\ell \mid F_L(\ell) < \alpha\}.$$

Here sup designates the supremum (lowest upper bound) of a set of real numbers.
Similarly we use the notation inf for the infimum (greatest lower bound).

2.10 Past IFoA exam questions.

Past IFoA exam questions from CM2 and CT8 papers are a very valuable resource to help
you develop your understanding and exam technique. Question papers and Examiners’
Reports can be downloaded from the IFoA website free of charge.
By the way, IFoA exams have 100 marks in three hours, whereas University of Leicester
exams are two hours long. So you can multiply the IFoA marks for any question by 1.5 to
get the Leicester equivalent.

2.11 Homework

CM2A September 2019 Question 1 This combines some bookwork on definitions with
the sort of risk measure calculations you should be able to do.
CM2B September 2020 Question 3 Parts (i) and (ii) The mini-project for this module
will require some computer computations (in Excel, or R if you prefer), so here’s something
to get you started.
The calculation in Part (iii) is the sort of thing you’ll be doing in the second half of the
module, so feel free to have a go now!
There are also some additional questions on the homework sheet in the Week 1 folder.

3 Stochastic rates of return

3.1 Learning objectives

• Show an understanding of simple stochastic models for investment returns.


• Describe the concept of a stochastic investment return model and the fundamental
distinction between this and a deterministic model.
• Derive algebraically, for the model in which the annual rates of return are indepen-
dently and identically distributed and for other simple models, expressions for the
mean value and the variance of the accumulated amount of a single premium.
• Derive algebraically, for the model in which the annual rates of return are indepen-
dently and identically distributed, recursive relationships which permit the evaluation
of the mean value and the variance of the accumulated amount of an annual premium.
• Derive analytically, for the model in which each year the random variable (1 + r) has
an independent log-normal distribution, the distribution functions for the accumulated
amount of a single premium and for the present value of a sum due at a given specified
future time.
• Apply the above results to the calculation of the probability that a simple sequence
of payments will accumulate to a given amount at a specific future time.

3.2 Reading

This week’s work is covered by Chapter 12 of Stephen Garrett’s book, An introduction to the
mathematics of finance: a deterministic approach. (You don’t need to read Section 12.5.)
There’s a link to an electronic version of the book in the Blackboard Reading list and I’ve
put a PDF of Chapter 12 in the stochastic rates of return folder.
In particular, you need to know the material in 12.2 and 12.3, plus exercise 12.3.1.
(“Know” in the sense of being able to derive the equations and apply them to exam ques-
tions.)
Before it was in CM2 this material was in CT1, so that’s where you will find most past
exam papers, though we’ll also look at the modelling questions in the CM2B exam.

3.3 Mathematical models

According to Wikipedia, a mathematical model is a description of a system using mathematical concepts and language.

3.3.1 Types of model

Typically models are classified as either deterministic or stochastic.


A deterministic model is a model that provides an output based on one set of parameter
and input variables. However, deciding which set of input variables to use may be a challenge.
Thus, for example, if it is desired to determine premium rates on the basis of one fixed rate
of return, it is nearly always necessary to adopt a conservative basis for the rate to be used
in any calculations subject to the premium being competitive. [IFoA CM2 Core Reading
2020]
An alternative approach to recognising the uncertainty that in reality exists is provided by
the use of stochastic models. In such models, no single rate is used and variations are
allowed for by the application of probability theory. [IFoA CM2 Core Reading 2020]
See also IFoA CP1 Core Reading 2020, Unit 6, Section 2 and Unit 14,
and
https://en.wikipedia.org/wiki/Stochastic_modelling_(insurance)

3.3.2 Uses of models

Models can be used for different purposes (e.g. entertainment, trying to understand the
system modelled - including the effects of changes in the system) but practising actuaries
most often use them for making decisions.
Here’s an example of a big decision: how much should an global insurance company offering
many types of policy hold as capital reserves?
You’d expect such a big decision to need a big model to answer it: and you’d be right. The
model should include elements that model the behaviour of the various assets the company
holds, the behaviour of its liabilities and the behaviour of its customers.
We’ll look at all three aspects in this module but we’ll start simply with the rate of return
models described in Garrett’s book.
In the first lecture we’ll go over the construction in R of a simple random walk model and in
the second lecture for this week I’ll talk you through the reproduction of Garrett’s Example
12.5.3 using R.

3.4 Key notation and equations from Garrett

In all that follows, we are assuming that the rates of return in each period are independent.
Let 𝑆𝑛 be the accumulation of a single payment of 1 at time 0, let 𝑖𝑡 be the return obtainable
over the $t$th year. Then

$$S_n = (1 + i_1)(1 + i_2)\cdots(1 + i_n)$$
Similarly we let the accumulation of a series of annual investments of 1 at the start of each
year be 𝐴𝑛 .

Check your understanding

ì Write down an expression for 𝐴𝑛

The formula is Equation 12.2.3 in Garrett.

$$\begin{aligned}
A_n = {} & (1 + i_1)(1 + i_2)(1 + i_3)\cdots(1 + i_n) \\
& + (1 + i_2)(1 + i_3)\cdots(1 + i_n) \\
& + \cdots \\
& + (1 + i_{n-1})(1 + i_n) \\
& + (1 + i_n)
\end{aligned}$$

Suppose that the return in each period has mean $j$ and variance $s^2$.
Then (Garrett Eq. 12.2.5)

$$E[S_n] = (1 + j)^n,$$

and (Garrett Eq. 12.2.5)

$$\mathrm{var}[S_n] = (1 + 2j + j^2 + s^2)^n - (1 + j)^{2n}.$$

The equations for the moments of $A_n$ are a bit more complicated; see Garrett pp. 283-285.

Check your understanding

ì Write down an expression for 𝐸[𝐴𝑛 ]

$\ddot{s}_{\overline{n}|}$ calculated at rate $j$.

3.4.1 Log-normal returns (Garrett 12.3)

Suppose that the random variable $\ln(1 + i_t)$ is normally distributed with mean $\mu$ and variance
$\sigma^2$. In this case, the variable $(1 + i_t)$ is said to have a log-normal distribution with parameters
$\mu$ and $\sigma^2$.
In this case, $S_n$ has a log-normal distribution with parameters $n\mu$ and $n\sigma^2$.

Check your understanding

ì Write down expressions for the mean and variance of 𝑆𝑛 .

The formulae are given in Formulae and Tables P14.


$$E[S_n] = e^{n\mu + n\sigma^2/2}$$

$$\mathrm{var}[S_n] = e^{2n\mu + n\sigma^2}\left(e^{n\sigma^2} - 1\right)$$
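These formulae are easy to check by simulation. Here is a sketch in R; the parameter values (μ = 0.05, σ = 0.1, n = 10) are illustrative assumptions, not values from the text:

# Check the moments of S_n when ln(1 + i_t) ~ N(mu, sigma^2), iid across years
set.seed(123)
mu <- 0.05; sigma <- 0.1; n <- 10; n_sims <- 100000

# Each simulated S_n is the product of n independent log-normal growth factors
Sn <- replicate(n_sims, prod(rlnorm(n, meanlog = mu, sdlog = sigma)))

# Compare simulated moments with the analytical formulae
c(simulated = mean(Sn), formula = exp(n * mu + n * sigma^2 / 2))
c(simulated = var(Sn),  formula = exp(2 * n * mu + n * sigma^2) * (exp(n * sigma^2) - 1))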

3.5 Homework

The first part of your homework (which we’ll go over in the feedback session) will be to
reproduce Example 12.6.1 from Garrett in R.
The second part of the homework is the behavioural economics reading below.

3.5.1 Behavioural economics reading

Most weeks I’ll ask you to read a few chapters from Thinking Fast and Slow.
As well as giving you an understanding or the behavioural economics part of the chapter
this is a very interesting and useful book that will give you insights into human behaviour
that will be useful in many areas of your future career.
This week you should skim-read the introduction and Chapters 1 - 4 (most chapters are quite short).

You should also read Chapter 5 more carefully, bearing in mind the following definitions
from Wikipedia and the CM2 Core reading.

3.5.2 Heuristic

A heuristic technique, or a heuristic (Ancient Greek: εὑρίσκω, heurískō, 'I find, discover'), is
any approach to problem solving or self-discovery that employs a practical method that is
not guaranteed to be optimal, perfect, or rational, but is nevertheless sufficient for reaching
an immediate, short-term goal or approximation. Where finding an optimal solution is
impossible or impractical, heuristic methods can be used to speed up the process of finding
a satisfactory solution. Heuristics can be mental shortcuts that ease the cognitive load of
making a decision.
Examples that employ heuristics include using trial and error, a rule of thumb or an educated
guess.
[Wikipedia]

3.5.3 Availability

This heuristic is characterised by assessing the probability of an event occurring by the
ease with which instances of its occurrence can be brought to mind. Vivid outcomes are
more easily recalled than other (perhaps more sensible) options that may require System
II thinking. This can lead to biased judgements when examples of one event are inherently
more difficult to imagine than examples of another. For example, individuals living in areas
that are prone to extreme weather events may only be compelled to take out home insurance
after they have been directly affected by such events rather than beforehand, and may even
cease their coverage after some time as the memory of the event subsides.

3.5.4 Familiarity

This heuristic is closely-related to availability, and describes the process by which people
favour situations or options that are familiar over others that are new. This may lead to
an undiversified portfolio of investments if people simply put their money in industries or
companies that they are familiar with rather than others in alternative markets or sectors.
Home-country bias refers to people’s tendency to disproportionally invest in stocks from
their home country, rather than forming an internationally-diversified portfolio.

3.6 Past IFoA questions

The following is a selection of past exam questions that cover the material on stochastic
rates of return in this chapter:

• CT1 S2017 Q5
• CT1 A2016 Q8
• CM2A A2019, Q3(i, ii, iii)
• CT1 S2018 Q6
• CT1 A2018 Q4
• CT1 S2016 Q9

4 Simulation and stochastic modelling

This week we’ll cover some basic techniques for building simulation models in R. The aim
is to lay the foundations for future work, so there are no specific syllabus objectives.

4.1 Generative models

One of the key concepts this week is the idea of a generative model.
This is a description of the process generating the variables you want to simulate. For
example:

The return on the fund each year is an iid random variable drawn from a lognor-
mal distribution with fixed mean and variance. The accumulated value of the
fund at the end of N years is the initial value multiplied by the product of the
annual returns for each year.

Or

Claims arise as a Poisson process with a fixed rate. The value of each claim is
an iid random variable drawn from a Pareto distribution with fixed parameters.
The total claim amount at the end of the period is the sum of all the individual
claims arising in the period

The generative model allows us to simulate the variables of interest, but it doesn’t tell us
how to implement the simulation in code.
Running the simulation allows us to predict the simulated variables and comparing the
predictions with the observed values in the real world will tell us how good our generative
model is (assuming we haven’t made errors in the simulation).
We can then use the model to decide what actions to take (which will likely include actions
to improve the performance of the model).
Interestingly, there are many theories of how the brain works that would be specified in a
very similar way - it is always trying to predict the results of proposed actions and comparing
what happens with its predictions. Actions will be adjusted in the light of the difference
between the expected and observed outcome.

4.2 Turning a generative model into code

There will usually be a number of different structures that can be adopted when coding a
particular generative model. Key decisions that will influence the structure are:

• Is it necessary to store the whole simulated path of each run, or just the final value?
• Does each simulated path need to run for the same number of steps, or can it be
terminated at a defined point?
• Do we simulate by using discrete time steps, or some other type of event?
• What is the trade-off between number of steps and precision?
• Is runtime more important than development time, or vice versa?
• Do we generate simulated paths one at a time - or can we do them in parallel?

Let’s look at some examples using the first generative model above. (You should be familiar
with this from last week.)
Suppose we want to build a model where we track the growth of the fund each year, and
store these values.
Write outlines of the code you could use for this (a) generating one simulated path at a time
and (b) generating the first year value for all of the simulated funds, then the second year’s
value for all of them, and so on.
When you’ve done that translate your outlines into pseudocode, and then into code using
for loops.
Here’s some code for the first method. First the set up:

# Initialise constants
mu <- 0.02
sig <- 0.01
n_years <- 5
n_sims <- 5000
fund_one <- 1000

# Create an empty array for the results
FundVal <- array(numeric((n_years + 1) * n_sims),
                 dim = c(n_years + 1, n_sims))

# Set the first row to the starting fund value
FundVal[1,] <- fund_one

And then the actual simulation. (Notice it's a good idea to do a test run with very small
numbers of years and simulations.)

set.seed(1)
for (sim in (seq(n_sims))){
  for (year in seq(2, n_years + 1)){
    FundVal[year, sim] <-
      FundVal[year - 1, sim] * rlnorm(1, meanlog = mu, sdlog = sig)
  }
}

FundVal[1:5, 1:5]

[,1] [,2] [,3] [,4] [,5]


[1,] 1000.000 1000.000 1000.000 1000.000 1000.000
[2,] 1013.830 1011.865 1035.742 1019.743 1029.620
[3,] 1036.212 1037.350 1060.792 1040.175 1058.668
[4,] 1048.348 1066.149 1075.520 1071.251 1080.860
[5,] 1086.725 1093.967 1073.213 1101.904 1080.975
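For comparison, here is a sketch of method (b) from the exercise above: generate each year's growth factors for all simulated funds at once, so the inner loop disappears and R's vectorised arithmetic does the work. It reuses the constants defined earlier; note the results will not match the first method exactly, even with the same seed, because the random numbers are drawn in a different order.

# Approach (b): simulate all the funds together, one year at a time
set.seed(1)
FundVal_b <- array(numeric((n_years + 1) * n_sims),
                   dim = c(n_years + 1, n_sims))
FundVal_b[1, ] <- fund_one

for (year in seq(2, n_years + 1)){
  # One vectorised draw gives this year's growth factor for every simulation
  FundVal_b[year, ] <- FundVal_b[year - 1, ] *
    rlnorm(n_sims, meanlog = mu, sdlog = sig)
}

FundVal_b[1:5, 1:5]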

4.3 Estimating probabilities

Once you have simulated a set of results, estimating probabilities is usually just a matter
of counting and then calculating ratios.
For example, if we extract the final fund values from the calculations above we can calculate
the probability that the average annual return is less than two per cent, and the lower
quartile return.

fund_final <- FundVal[n_years + 1,]


target <- fund_one * 1.02^n_years
prob_undershot <- sum(fund_final < target) / n_sims
Q1 <- quantile(fund_final, probs = 0.25)

Of course, the numbers calculated above will only be meaningful if n_sims is large enough.

4.4 Timing code

The package microbenchmark (Mersmann 2021) is a useful way to time chunks of code. It’s
easiest if you put them in a function.
Here are two functions that produce the same output.

slow_loop <- function(n){
  r <- rnorm(1)
  for (i in seq(2, n)){
    r <- append(r, rnorm(1))
  }
  r
}

fast_loop <- function(n){
  r <- numeric(n)
  for (i in seq(n)){
    r[i] <- rnorm(1)
  }
  r
}

The difference between them is that the first one starts off with a vector of length one and
increases its length in each loop. The second function creates a vector of the needed length
right at the beginning.
Now let’s compare their speed, alongside the built-in vectorised function that does the same
thing.

library(microbenchmark)
n <- 1000
res <- microbenchmark(slow_loop(n),
                      fast_loop(n),
                      rnorm(n),
                      check = 'identical',
                      setup = set.seed(12345))
res

Unit: microseconds
expr min lq mean median uq max neval
slow_loop(n) 2933.486 3724.4080 5122.06028 3879.218 3978.3105 26809.652 100
fast_loop(n) 1160.875 1455.8165 1864.62738 1532.869 1572.3180 12488.060 100
rnorm(n) 48.553 53.4705 61.49053 63.441 64.5255 88.084 100

Further information about timing chunks of code can be found on the Jumping Rivers
website and many other places online.
As well as timing individual chunks of code, a profiler can be used to understand which parts
of a program are taking most time (and hence, which parts you should try to optimise). R
comes with its own profiler, which is described here. You don’t need to know the details of
how to use the profiler, but read the first paragraph of the preceding link for some useful
background.

4.5 Discrete Event Simulation (DES)

In many simulations we are attempting to model a system over a pre-determined period
of time. For example, modelling the return of an investment over a number of years. In
that case we are modelling an essentially continuous process by sampling it at discrete time
intervals. For some systems that makes sense, but there are others where it makes more
sense for the simulation to be driven by the occurrence of specified events, rather than the
ticking of a clock.
The classic example of a discrete event simulation is a model of a queuing system.
Imagine a queue for service at a bank counter. We have a stochastic model for the
arrival of new customers at the end of the queue and a stochastic model for the time taken
to deal with each customer.
In this case we can construct a model where we generate the time to the next event (customer
arriving, dealing with a customer at the counter finishing) and then calculate the change to
the state of the system at that point, before repeating the process.
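To make this concrete, here is a minimal sketch of that bank-counter queue in base R. The exponential inter-arrival and service rates and the simulation horizon are illustrative assumptions, not values from Ross:

# A minimal discrete event simulation of a single-server queue (M/M/1)
set.seed(42)
arrival_rate <- 1      # customers per minute (assumed)
service_rate <- 1.2    # customers per minute (assumed)
horizon      <- 480    # simulate an eight-hour day (minutes)

clock <- 0
n_in_system <- 0
served <- 0
next_arrival   <- rexp(1, arrival_rate)
next_departure <- Inf            # no one is being served yet

while (clock < horizon) {
  if (next_arrival < next_departure) {
    # Event: a customer joins the queue
    clock <- next_arrival
    n_in_system <- n_in_system + 1
    next_arrival <- clock + rexp(1, arrival_rate)
    if (n_in_system == 1) next_departure <- clock + rexp(1, service_rate)
  } else {
    # Event: the customer at the counter is finished with
    clock <- next_departure
    n_in_system <- n_in_system - 1
    served <- served + 1
    next_departure <- if (n_in_system > 0) clock + rexp(1, service_rate) else Inf
  }
}

served        # customers dealt with during the day
n_in_system   # customers still in the system at the end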
Discrete event simulation is described in Chapter 6 of the book Simulation (Ross 2013). You
should skim read this chapter to make sure you understand the principles, but you won’t
be expected to program your own DES model.
One of the examples in Ross is about insurance claims. I’ve written a program to perform
the simulation he describes in pseudocode - ross_ruin.Rmd. Have a look at the details if
you’re interested.
There are two or three DES packages in R. The most interesting is probably simmer and
it might be worth reading the introduction to the paper describing it (Ucar, Smeets, and
Azcorra 2019) (up to the first diagram).

5 Valuing benefit guarantees

5.1 Week 4 Objectives

This week we’ll look at ways of valuing benefit guarantees by simulation and, more generally,
at economic scenario generators (ESGs - not to be confused with Environmental, Social, and
Governance criteria in investing).
The IFoA syllabus objective is:

• Value basic benefit guarantees using simulation techniques.

Practical application of this objective will be assessed in the mini-project. Here we will
look at some of the underlying concepts and issues to be taken into consideration - which
will be tested in the exam.
The key reading is contained in Economic Scenario Generators: a practical guide (Pedersen
et al. 2016), a copy of which is available in Blackboard.
The Introduction, Executive Summary, and Chapter 1 of this paper are examinable.

5.2 Economic Scenario Generators

An economic scenario generator (ESG) is a computer-based model of an economic environment that is used to produce simulations of the joint behaviour of financial market values
and economic variables. The design and content of an ESG model can vary based on the
needs of the specific application, but a general-purpose ESG could involve joint simulation
of future bond prices, stock prices, mortgage-backed security prices, inflation, gross domes-
tic product and foreign-exchange rates. The joint simulation includes not only simulated
series of future values of each component, but their interactions as well.
While an ESG does not include the direct modelling of liabilities that financial institutions
would typically require in the management of their business, it would provide the funda-
mental economic output (interest rates, inflation, etc.) that may be necessary to project
out these liability simulations.

5.3 Actuarial uses of ESGs

Life insurance. Applications of ESGs for life insurance liabilities are primarily focused
on the interaction of interest rate changes and policyholder behaviour regarding lapses and
other optionality such as surrenders.
Pensions. In pensions work ESGs have a role in consistent valuation of assets and liabilities
and can also be valuable in understanding member behaviour, as well as areas like liability-
driven investment and pension-risk transfer.
General insurance. General insurance business is strongly cyclical and driven by many
factors that can be modelled with an ESG, both in terms of liabilities (particularly driven
by inflation), and in terms of market behaviour.

5.4 Stylized facts

Stylized facts are generalized statements about economic and market behaviour, based on
historical experience. An ESG should produce outputs consistent with the stylized facts
relevant to its intended use.
Examples of stylized facts include:

• Interest rates can be negative.


• Corporate credit spreads are wider for lower-credit-quality instruments.
• There is a tendency for corporate credit spreads to fluctuate more during recessionary
periods.
• The probabilities of default will fluctuate with general economic conditions and firm-
or industry-specific conditions.
• Equity returns exhibit both higher expected returns and higher volatility than fixed-
income returns.

5.5 ESG summary

(Pedersen et al. 2016)

1. An economic scenario generator is a computer-based model of an economic environment that is used to produce simulations of the joint behaviour of financial market
values and economic variables.
2. Interest rate modelling is at the core of an ESG because of the importance of interest
rates in discounting liability cash flows and in determining returns on fixed-income
investments.
3. Interest rates are closely associated with default-free bonds and may be further refined
to consider dynamics of the yield curve and interaction with inflation.
4. Other financial components likely considered in a robust ESG may include gross do-
mestic product, inflation, unemployment rates and foreign-exchange rates.

5. Key financial market and economic variables are modelled in simulated series, and
importantly the interaction among the variables also is modelled. Modelling in an eco-
nomic scenario generator can be performed on a market-consistent basis or real-world
basis. Each has application to the insurance and pension worlds and to understanding
financial markets
6. Some key financial market variables particularly important to insurance and pensions
are bonds (and related interest rates), including corporate and asset-backed bonds,
and equities.

5.6 Single period mean-variance models

If we are able to estimate the distribution of the variable of interest at the end of the period
of interest we may be able to apply standard probability theory in an analytical way.
For example, if we assume the return on a fund is normal at the end of a period we will
be able to calculate the probability of the return falling below a specified amount and the
mean and the variance of the shortfall if it does so. This leads directly to ruin theory, which
we’ll look at in a couple of weeks.

5.7 Homework

Research and answer the following questions in your own words (do not copy and paste!)

1. Explain the difference between the “real world” approach and the “risk neutral” ap-
proach in ESGs. Give examples when each might be more appropriate.
2. List three variables that might be linked in a “cascade approach”.
3. What are:

• effective duration analysis


• economic capital

4. State two “stylized facts” commonly accepted in actuarial science that are not listed
in the paper Economic Scenario Generators: A practical guide.

Also read, Chapters 16 - 20 of Thinking Fast and Slow

5.8 Additional reading

(Not directly examinable)


Evolution of economic scenario generators: a report by the Extreme Events Working Party
members. (Jakhria et al. 2019)
If you want to examine the area in greater depth the annotated bibliography in Pedersen
et al. (2016) would be very useful.

6 Run-off triangles

6.1 Learning Objectives

This week we look at methods for estimating the reserves that an insurance company should
hold in respect of policies written in the past.

• Define a development factor and show how a set of assumed development factors can
be used to project the future development of a delay triangle.
• Describe how a statistical model can be used to underpin a run-off triangles approach.
• Describe and apply a basic chain ladder method for completing the delay triangle
using development factors.
• Show how the basic chain ladder method can be adjusted to make explicit allowance
for inflation.
• Describe and apply the average cost per claim method for estimating outstanding
claim amounts.
• Describe and apply the Bornhuetter-Ferguson method for estimating outstanding
claim amounts.
• Discuss the assumptions underlying the application of the methods above.

There are many run-off triangle questions available in past IFoA papers. You should practice
these until you can do the calculations rapidly and accurately. You should also be able to
comment on the assumptions and other matters discussed below.

6.2 Need for claims estimation

Insurers need to estimate future claims:

• There is normally a delay between the incident leading to a claim and the insurance payout
• Insurance companies need to estimate future claims for their reserves
• It makes sense to use historical data to infer future patterns of claims

6.2.1 Accurate estimation of reserves is important

If reserves are too low, the insurer runs the risk of action from the regulator, or even insolvency. But reducing reserves frees capital to be invested in business projects or returned to shareholders. (Or paid as executive bonuses!)

6.2.2 Example: Incremental paid claims

Look at the table of data below. It shows claims incurred, grouped by the year in which
the accident leading to the claim happened - the accident year. The number of years until
a claim is recorded is called the delay, or development period.


Origin Dev0 Dev1 Dev2 Dev3 Dev4 Dev5 Dev6


1 2001 507 274 139 224 109 192 143
2 2002 560 263 208 186 105 208

3 2003 673 242 189 255 55
4 2004 776 267 184 163
5 2005 824 301 207
6 2006 911 298
7 2007 974

Grouping by accident year means that the data should include claims which have been
incurred but not yet reported (IBNR), which will have to be estimated. Those of you
who discovered the Claims Reserving Manual while doing research for a previous week’s
homework might have seen some methods for doing this - but it’s beyond the scope of this
module.
Other ways of grouping include: the year the policy was written, the year a claim was
reported.

6.2.3 Cumulative paid claims

Origin Dev0 Dev1 Dev2 Dev3 Dev4 Dev5 Dev6


1 2001 507 781 920 1144 1253 1445 1588
2 2002 560 823 1031 1217 1322 1530
3 2003 673 915 1104 1359 1414
4 2004 776 1043 1227 1390
5 2005 824 1125 1332
6 2006 911 1209
7 2007 974

We want to estimate the Dev6 column to calculate claims outstanding. There are many
possible ways we could do this.

6.2.4 Mathematical (regression) model

We can write a general model as:

𝐶𝑖,𝑗 = 𝑟𝑗 𝑥𝑖+𝑗 𝑠𝑖 + 𝜖𝑖,𝑗


Where:
𝐶𝑖,𝑗 is the entry for origin year 𝑖 and development year 𝑗.
𝑟𝑗 is a parameter that depends only on the development year (a development factor)
𝑥𝑖+𝑗 is a factor that depends on the year of payment (e.g. to take account of inflation)
𝑠𝑖 is the initial exposure in origin year 𝑖
𝜖𝑖,𝑗 is an error term

We’ll talk more about the regression model in Thursday’s lecture. For now we’ll stick to
some pragmatic calculation methods without worrying about their statistical properties.

6.3 Basic chain ladder approach.

We work with cumulative amounts and we calculate 𝑟𝑗 as a product of one-year development


factors. Here’s the triangle again.

Origin Dev0 Dev1 Dev2 Dev3 Dev4 Dev5 Dev6


1 2001 507 781 920 1144 1253 1445 1588
2 2002 560 823 1031 1217 1322 1530
3 2003 673 915 1104 1359 1414
4 2004 776 1043 1227 1390
5 2005 824 1125 1332
6 2006 911 1209
7 2007 974

We could estimate 𝑟6 as 1588/507, but that would be ignoring a lot of information. Instead,
consider development from Dev0 to Dev1: we could calculate a factor for each entry where
we have two values.

Origin Dev0 Dev1 df


1 2001 507 781 1.54
2 2002 560 823 1.47
3 2003 673 915 1.36
4 2004 776 1043 1.34
5 2005 824 1125 1.37
6 2006 911 1209 1.33
7 2007 974

And then just take the average 1.402.


In the basic chain ladder method we take a weighted average where the weights are the claim
amounts in the starting development year for each year of origin. This seems reasonable
and gives an easy calculation.
$r_{0 \to 1} = \dfrac{781+823+915+1043+1125+1209}{507+560+673+776+824+911} = \dfrac{5896}{4251} = 1.387$

6.3.1 Development factors for all years

We do the same for all development years, but let’s work with a smaller triangle.

Origin Dev0 Dev1 Dev2 Dev3


1 2004 776 1043 1227 1390
2 2005 824 1125 1332
3 2006 911 1209
4 2007 974

$r_{0 \to 1} = \dfrac{1043 + 1125 + 1209}{776 + 824 + 911} = 1.345$

$r_{1 \to 2} = \dfrac{1227 + 1332}{1043 + 1125} = 1.180$

$r_{2 \to 3} = \dfrac{1390}{1227} = 1.133$
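Here is a rough R sketch of the same calculation; it simply re-keys the smaller cumulative triangle above into a matrix, with NA for the cells not yet observed.

# The smaller cumulative triangle above, with NA for the unobserved cells
tri <- matrix(c(776, 1043, 1227, 1390,
                824, 1125, 1332,   NA,
                911, 1209,   NA,   NA,
                974,   NA,   NA,   NA),
              nrow = 4, byrow = TRUE,
              dimnames = list(as.character(2004:2007), paste0("Dev", 0:3)))

# Weighted development factor from column j to j + 1: ratio of column sums
# over the rows where the next development year has been observed
dev_factor <- function(tri, j) {
  ok <- !is.na(tri[, j + 1])
  sum(tri[ok, j + 1]) / sum(tri[ok, j])
}
r <- sapply(1:3, function(j) dev_factor(tri, j))
round(r, 3)   # 1.345 1.180 1.133, as calculated above

# Complete the triangle by rolling the factors forward
proj <- tri
for (j in 1:3) {
  missing <- is.na(proj[, j + 1])
  proj[missing, j + 1] <- proj[missing, j] * r[j]
}
round(proj)

The estimated outstanding claims for each origin year are then the projected ultimate figure (the Dev3 column) less the latest observed cumulative amount.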

6.3.2 Assumptions underlying the basic chain ladder method

The main assumptions we’ve made are:

1. Payments from each accident year will develop in the same way. That is, the same
development factors are used to project outstanding claims for each accident year.
(In fact, sometimes ad hoc adjustments are made to the development factor for a
particular year, based on extra sources of information.)
2. No additional adjustments for inflation are needed. Weighted average inflation (as
included in the development factors) will remain the same in future.
3. Claims are fully run off for the first accident year included in the data. It’s impossible
to project forward using the methods below. (In practice, further projections might be
carried out using so-called tail factors, but this is beyond the scope of this module.)

6.4 Adjusting for past inflation.

If the inflation assumption above can't be justified we can use an inflation-adjusted method.
We treat past and future inflation separately.
Consider the following table of past inflation.

Year Inflation (%) CumFac (to 2017 values)

1 2014 2.5 1.107
2 2015 3.0 1.074
3 2016 3.3 1.040
4 2017 4.0 1.000

The inflation figures are those for claims inflation, rather than general price inflation. Claims inflation is different from general price inflation, and often significantly higher. (Can you think of reasons why this might be?)
Willis Towers Watson publish an index of claims inflation.
To adjust for past inflation we need to work first of all with the incremental payments
triangle, not the cumulative claims version. The smaller incremental triangle is:

Origin 0 1 2 3
1 2014 810 236 220 193
2 2015 844 320 168
3 2016 927 294
4 2017 974

We now convert the past year amounts to 2017 values by multiplying by the appropriate
cumulative inflation factor. Note that each lower-left to upper-right diagonal holds data
from the same year.

Origin 0 1 2 3
1 2014 896 254 229 193
2 2015 907 333 168
3 2016 964 294
4 2017 974

Now we can apply the basic chain ladder method. First converting to a cumulative trian-
gle…

Origin 0 1 2 3
1 2014 896 1150 1379 1572
2 2015 907 1240 1408
3 2016 964 1258
4 2017 974

calculating the development factors…

𝑟0→1 = 1.318
𝑟1→2 = 1.166
𝑟2→3 = 1.140

and projecting forward the triangle…

Origin 0 1 2 3
1 2014 896 1150 1379 1572
2 2015 907 1240 1408 1605
3 2016 964 1258 1467 1672
4 2017 974 1284 1497 1707

We’re nearly there, but we’ve taken out the inflationary growth from the future payments.
To add it back in, we need to return to the incremental form…

Origin 0 1 2 3
1 2014 896 254 229 193
2 2015 907 333 168 197
3 2016 964 294 209 205
4 2017 974 310 213 210

and then increase the future payments by (say) 4 percent a year.

Origin 0 1 2 3
1 2014 896 254 229 193
2 2015 907 333 168 205
3 2016 964 294 217 222
4 2017 974 322 231 236

Finally, back to the cumulative form:

Origin 0 1 2 3
1 2014 896 1150 1379 1572
2 2015 907 1240 1408 1612
3 2016 964 1258 1475 1697
4 2017 974 1296 1527 1763

And we can see:

Total paid (in 2017 values) = 5211
Total estimated = 6644
Total outstanding = 1433

6.4.1 Assumptions underlying the inflation-adjusted chain ladder method

The key assumptions underlying the inflation-adjusted method are:

1. Payments from each accident year will develop in the same way in real terms. That is,
the same development factors are used to project outstanding claims for each accident
year after they have been adjusted for inflation.
2. Explicit assumptions are made for past and future rates of claims inflation.
3. Claims are fully run-off for the first accident year included in the data.

6.5 Average cost per claim method

The average cost per claim method projects numbers of claims and average size of claims
separately.
The method we’ll demonstrate uses grossing up factors. A grossing up factor tell us what
proportion of the ultimate amount emerges in each year. However, the same approach,
of modelling numbers of claims and severity separately, can be used with development
factors.
An outline of the method is:

• From cumulative claim amounts & number of claims calculate average cost per claim
• Project average cost per claim & number of claims, using grossing up factors or devel-
opment factors
• Multiply projected average cost and number for each origin year to get projected total
claims

We start with two tables.


The cumulative amounts of claims (in thousands):

Origin 0 1 2 3
1 2014 810 1046 1266 1459
2 2015 844 1164 1332
3 2016 927 1221
4 2017 974

And the cumulative number of claims:

Origin 0 1 2 3
1 2014 336 506 577 646
2 2015 397 528 646
3 2016 426 589
4 2017 481

Dividing gives us the average cost per claim (ACC).

Origin 0 1 2 3
1 2014 2411 2067 2194 2259
2 2015 2126 2205 2062
3 2016 2176 2073
4 2017 2025

The following table shows the grossing up factors for the average cost per claim.

Origin 0 1 2 3 Ai3
1 2014 1.067 0.915 0.971 1 2259
2 2015 1.002 1.039 0.971 2122
3 2016 1.026 0.977 2122
4 2017 1.032 1963

The process is:

• For the top row, divide each entry by the value for the final year.
• For each subsequent row, start by calculating the grossing-up factor on the diagonal
by taking the average of the factors above it. Then use that factor to estimate the
ACC in the final year.
• Then work backwards along the row calculating the factor by dividing the ACC for
each year by the final-year figure.

(The only real way to understand and learn run-off triangle methods is to work through
examples, first by hand and then in Excel.)
You do exactly the same for the numbers of claims.

Origin 0 1 2 3 Ni3
1 2014 0.520 0.783 0.893 1 646
2 2015 0.549 0.73 0.893 723
3 2016 0.547 0.757 778
4 2017 0.539 893

Finally we can use the projected number of claims and ACC to calculate the estimate of
the final claim amount for each year.

Origin Average cost per claim Number of Claims Projected Claims Estimate
1 2014 2259 646 1459
2 2015 2122 723 1535
3 2016 2122 778 1652
4 2017 1963 893 1753

Total paid = 4986
Total estimated = 6398
Total outstanding = 1412

6.5.1 Assumptions underlying the average cost per claim method

The key assumptions underlying this method will depend on the exact form it takes. In the
form presented above, it’s assumed:

1. Numbers of claims and ACC from each accident year will develop in the same way as
a proportion of the ultimate value. That is, the same grossing-up factors are used for
each accident year.
2. No adjustments are required for inflation. (However, inflation can be accounted for in
a similar way to the previous approach.)
3. Claims are fully run-off for the first accident year included in the data.

6.6 The Bornhuetter-Ferguson method

In this section we will look at the Bornhuetter-Ferguson method, which uses the loss ratio to
create a first approximation of the ultimate total claims, then adjusts it to reflect experience
to date.

6.6.1 Loss Ratio

The loss ratio is

$\text{loss ratio} = \dfrac{\text{total claims paid}}{\text{total premium received}}$

Loss ratios are used in the derivation of the premium basis and tend to show some stability
from year to year, in the absence of known changes to the risks insured, or the premium
basis.
So if we have the premium for each origin year and reasonable confidence in the loss ratio,
we can estimate the ultimate total claim payments.
In the basic chain ladder method we calculated development factors and used them to project
the claims incurred to date forwards, ending up with the ultimate total claim payments
arising from each accident year.
In the Bornhuetter-Ferguson method we calculate development factors in the same way but
we apply them to the ultimate total claim, calculated using the loss ratio, working backwards

to find the expected claims arising in each year. We use these estimated claims, for future
years, to calculate the reserves.
Let’s assume a loss ratio of 0.85 and start with the same cumulative claims table as in the
basic chain ladder example.

Origin 0 1 2 3
1 2014 810 1046 1266 1459
2 2015 844 1164 1332
3 2016 927 1221
4 2017 974

We calculate the development factors as usual, and also the cumulative factors.

1 2 3
Single year 1.329 1.176 1.152
Cumulative 1.801 1.355 1.152

Accident year 2014 2015 2016 2017


1 Premium 1897 2096.000 2201.000 2338.000
2 Ult. est. 1612 1781.600 1870.850 1987.300
3 Factor 1 1.152 1.355 1.801
4 Expected 1612 1545.926 1380.926 1103.474
5 Emerging 0 235.674 489.924 883.826
6 Reported 1459 1332.000 1221.000 974.000
7 Total 1459 1567.674 1710.924 1857.826
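A minimal R sketch of these steps, reproducing the table above; the premiums, reported claims, cumulative factors and the 0.85 loss ratio are taken directly from it.

premium    <- c(1897, 2096, 2201, 2338)   # accident years 2014-2017
reported   <- c(1459, 1332, 1221,  974)   # latest cumulative reported claims
cum_factor <- c(1, 1.152, 1.355, 1.801)   # cumulative development factor to ultimate
loss_ratio <- 0.85

ult_initial <- loss_ratio * premium       # initial (loss-ratio) estimate of ultimate claims
expected    <- ult_initial / cum_factor   # portion expected to have been reported by now
emerging    <- ult_initial - expected     # expected future claims: the BF reserve
bf_total    <- reported + emerging        # revised estimate of ultimate claims

round(rbind(ult_initial, expected, emerging, bf_total), 1)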

6.6.2 Assumptions underlying the Bornhuetter-Ferguson method

The assumptions are the same as for the basic chain-ladder method plus the assumption
that the loss ratio assumed is appropriate.

6.7 Statistical models and simulation

The statistical model behind the chain ladder approaches is based on the fact that they can
be expressed as linear regression models with different weights.
Consider the first triangle we saw as an example. Let’s just consider the first two develop-
ment years and plot Dev1 vs Dev0.

paid_cum_df %>%
filter(!is.na(Dev1)) %>%
ggplot(aes(x = Dev0, y = Dev1)) +

geom_point() +
stat_smooth(method = 'lm', formula = 'y ~ 0 + x', se = FALSE) +
theme_bw()

[Figure: scatter plot of Dev1 against Dev0 for each origin year, with a regression line through the origin.]

The line shown is a linear regression line forced to go through the origin, and we can see
that the development factor from Dev0 to Dev1 can be thought of as the slope of this line.

fit1 <-
paid_cum_df %>%
filter(!is.na(Dev1)) %>%
lm(formula = 'Dev1 ~ 0 + Dev0') # The zero forces the line through the origin

df_regr1 <- coef(fit1)[1]

sprintf('Slope = %.3f', df_regr1)

[1] "Slope = 1.375"

This isn’t quite the value we got before (1.387) because fit1 is an unweighted linear regres-
sion model and the basic chain ladder method is equivalent to a weighted model, where the
weights are inversely proportional to the value of claims recorded in Dev0.

fit2 <-
paid_cum_df %>%

filter(!is.na(Dev1)) %>%
lm(formula = 'Dev1 ~ 0 + Dev0',
weights = 1 / Dev0)

df_regr2 <- coef(fit2)[1]

sprintf('Slope = %.3f', df_regr2)

[1] "Slope = 1.387"

Which is the value we had before.


Identifying the statistical model underlying the method allows us to investigate the prop-
erties of the estimators we are using. That’s beyond the scope of this module, but you
can read more about it in the documentation for the ChainLadder package (Gesmann et al.
2023).

6.7.1 Goodness of fit

One way to examine the goodness of fit of the basic chain ladder method is to apply the
development factors to the claims recorded in the first development year.
This will give us fitted claim amounts for the years in which we do have actual data, as well
as predicted values for future years.
Comparing the fitted values with the actual data gives us some idea of the fit of our model - it
is equivalent to examining the residuals when using the standard linear regression model.
There is an example to work through in the homework.
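As a rough sketch of this check in R, re-using the smaller cumulative triangle and the (rounded) development factors from Section 6.3.1:

# Cumulative triangle and (rounded) development factors from Section 6.3.1
tri <- matrix(c(776, 1043, 1227, 1390,
                824, 1125, 1332,   NA,
                911, 1209,   NA,   NA,
                974,   NA,   NA,   NA),
              nrow = 4, byrow = TRUE)
r <- c(1.345, 1.180, 1.133)

# Fitted values: start each origin year from its Dev0 figure and roll forward
fit_tri <- tri
for (j in 1:3) fit_tri[, j + 1] <- fit_tri[, j] * r[j]

# Crude residual check: fitted minus actual, where the actual values exist
round(fit_tri - tri, 1)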

6.7.2 Simulation

Properties of estimators can also be investigated by simulation.


A full set of claims run-off data is simulated using a specified generative model. The chosen
method is then applied to the simulated triangle available at a particular time and the
estimated reserves compared with the simulated outcome and an error calculated.
The whole process is repeated many times to obtain an empirical distribution of the error.

6.8 Homework

The homework for this section is contained in an Excel workbook in the Week 5 Blackboard
folder.
You should also continue your behavioural economics reading, reading Chapters 11 to 15 of
Thinking Fast and Slow.
Before you do the reading, consider the following definitions (from the CM2 Core Reading)
and notice where the concepts appear in the reading.
Anchoring and Adjustment is a term used to explain how people produce estimates.
People start with an initial idea of the answer (the anchor) and then adjust away from
this initial anchor to arrive at their final judgement. Thus, people may use experience or
‘expert’ opinion as the anchor, which they amend to allow for evident differences to the
current conditions. The effects of anchoring are pervasive and robust and are extremely
difficult to ignore, even when people are aware of the effect and aware that the anchor is
ridiculous. The anchor does not have to be related to the good. Nor does the anchor have
to be consciously chosen by the consumer. If adjustments are insufficient, final judgements
will reflect the (possibly arbitrary) anchors. Anchoring can have important implications for
investment decisions, not least if individual investors rely on seemingly irrelevant yet salient
data or statistics in order to guide their portfolio choices (e.g. expected returns from one-off
investments in other unrelated industries, etc.).
Representativeness. Decision-makers often use similarity as a proxy for probabilistic
thinking. Representativeness occurs because it is easier and quicker for our brain to compare
a situation to a similar one (System I) than assess it probabilistically on its own merits
(System II).
Representativeness is one of the most commonly used heuristics and can, at times, work
reasonably well. Nonetheless, similarity does not always adequately predict true probability,
leading to irrational outcomes. This is also related to the law of small numbers, where people
assess the probability of something occurring based on its occurrence in a small, statistically-
unrepresentative sample due to a desire to make sense of the uncertain situation (the name
is an ironic play on the law of large numbers in statistics). Representativeness can lead
individuals to base their decision on whether to invest in a particular stock, or not, on
the basis of its price over a few recent periods, rather than its long-term movement or the
underlying fundamentals of the company.
Availability. This heuristic is characterised by assessing the probability of an event oc-
curring by the ease with which instances of its occurrence can be brought to mind. Vivid
outcomes are more easily recalled than other (perhaps more sensible) options that may re-
quire System II thinking. This can lead to biased judgements when examples of one event
are inherently more difficult to imagine than examples of another. For example, individuals
living in areas that are prone to extreme weather events may only be compelled to take out
home insurance after they have been directly affected by such events rather than beforehand,
and may even cease their coverage after some time as the memory of the event subsides.

7 Ruin theory

7.1 Overview

In previous modules you’ve looked at the distribution of insurance losses over one period
of time. In reality, of course, insurance claims (and premiums) occur as processes through
time.
This week we will look at the behaviour of the amount of money held by an insurer over
time: the so-called solvency process.
We will be covering the following syllabus items:

• Explain what is meant by the aggregate claim process and the cashflow process for a
risk.
• Use the Poisson process and the distribution of inter-event times to calculate proba-
bilities of the number of events in a given time interval and waiting times.
• Define a compound Poisson process and calculate probabilities using simulation.
• Define the probability of ruin in infinite/finite and continuous/discrete time, and state and explain relationships between the different probabilities of ruin.
• Describe the effect on the probability of ruin, in both finite and infinite time, of
changing parameter values by reasoning or simulation.
• Calculate probabilities of ruin by simulation.

7.1.1 Definitions

We consider the claims generated by a portfolio of policies over time.


𝑁 (𝑡) is the number of claims generated in the interval [0, 𝑡] for all 𝑡 ≥ 0.
𝑋𝑖 is the amount of the 𝑖𝑡ℎ claim, 𝑖 = 1, 2, 3, ...
𝑆(𝑡) is the aggregate of all claims in the interval [0, 𝑡] for all 𝑡 ≥ 0.
$\{X_i\}_{i=1}^{\infty}$ is a sequence of random variables.

{𝑁 (𝑡)}𝑡≥0 and {𝑆(𝑡)}𝑡≥0 are stochastic processes.


{𝑆(𝑡)}𝑡≥0 is called the aggregate claims process for the portfolio.

7.1.2 Premiums

We assume that premiums are received continuously at a constant rate 𝑐.


We will also assume the premiums received in an interval of time are greater than the
expected value of losses in the same interval.
In other words, the total premium received in [0, 𝑡] is
𝑐𝑡 = (1 + 𝜃)𝐸[𝑆(𝑡)]
where 𝜃 is the premium loading.

7.1.3 The surplus process

We assume the insurer starts at 𝑡 = 0 with an initial surplus 𝑈 .


The surplus at time 𝑡 will be:

𝑈 (𝑡) = 𝑈 + 𝑐𝑡 − 𝑆(𝑡)

Hence {𝑈 (𝑡)}𝑡≥0 is a stochastic process called the surplus process or cash flow process.

7.1.4 Simulating the surplus process

1. Choose an initial surplus 𝑈 and set the current surplus, 𝑈 (0) = 𝑈


2. Choose a small time step (𝛿𝑡), such that the probability of a claim occurring in it is
small (𝑝𝛿𝑡)
3. For the first time step, draw a single random variable 𝑖 from ℬ𝑒(𝑝𝛿𝑡):
   – if 𝑖 = 1, calculate the loss for this step as a single drawing from 𝑋 ∼ ℱ, where ℱ is the distribution of individual claim amounts;
   – otherwise, set the loss for this step to zero.
4. Increase the surplus (𝑈 ) by the premium (𝑐𝛿𝑡) less the loss just calculated.
5. Repeat steps 3 and 4 as many times as required.
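A minimal R sketch of this algorithm, used to estimate a finite-time ruin probability by simulation. The exponential claim-size distribution and all parameter values below are illustrative assumptions, not prescribed values.

set.seed(1)

simulate_surplus <- function(U0 = 2, p = 1, theta = 0.2, claim_mean = 1,
                             t_max = 10, dt = 0.01) {
  c_rate  <- (1 + theta) * p * claim_mean   # premium rate with loading theta
  n_steps <- round(t_max / dt)
  U <- numeric(n_steps + 1)
  U[1] <- U0
  for (k in 1:n_steps) {
    claim <- rbinom(1, 1, p * dt)                               # Be(p * dt)
    loss  <- if (claim == 1) rexp(1, rate = 1 / claim_mean) else 0
    U[k + 1] <- U[k] + c_rate * dt - loss
  }
  U
}

# Estimate the finite-time ruin probability psi(U0, t_max) by simulation
ruin_prob <- function(n_sims = 500, U0 = 2, t_max = 10) {
  ruined <- replicate(n_sims, any(simulate_surplus(U0 = U0, t_max = t_max) < 0))
  mean(ruined)
}

ruin_prob(n_sims = 500, U0 = 2)

Re-running with a larger initial surplus or a higher premium loading should reduce the estimated probability of ruin, which is a useful sanity check.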

[Figure: a simulated path of the surplus process U(t) over t = 0 to 10.]

7.2 The probability of ruin in continuous time

Define:

𝜓(𝑈 ) = 𝑃 [𝑈 (𝑡) < 0, for some 𝑡, 0 < 𝑡 < ∞]


𝜓(𝑈 , 𝑡) = 𝑃 [𝑈 (𝜏 ) < 0, for some 𝜏 , 0 < 𝜏 ≤ 𝑡]

𝜓(𝑈 ) is the ultimate probability of ruin.


𝜓(𝑈 , 𝑡) is the probability of ruin in time 𝑡.

7.2.1 Some relationships

For 0 < 𝑡1 ≤ 𝑡2 and 𝑈1 ≤ 𝑈2

𝜓(𝑈1 , 𝑡) ≥ 𝜓(𝑈2 , 𝑡)
𝜓(𝑈1 ) ≥ 𝜓(𝑈2 )
𝜓(𝑈 ) ≥ 𝜓(𝑈 , 𝑡2 ) ≥ 𝜓(𝑈 , 𝑡1 )
$\lim_{t \to \infty} \psi(U, t) = \psi(U)$

7.3 Probability of ruin in discrete time

If we only examine the surplus at discrete time intervals 𝑡 = 𝑛ℎ, where 𝑛 = 0, 1, 2, ..., then 𝜓ℎ (𝑈 , 𝑡) is the discrete-time probability of ruin (assuming 𝑡 is a multiple of ℎ).
[Figure: the discrete surplus process, checked daily (black) and 6-hourly (red), over t = 0 to 10.]

It’s intuitively obvious that


𝜓ℎ (𝑈 , 𝑡) ≤ 𝜓(𝑈 , 𝑡)

7.4 A counting process for claims

We must have

1. 𝑁 (0) = 0
2. 𝑁 (𝑡) is integer valued
3. 𝑁 (𝑡2 ) ≥ 𝑁 (𝑡1 ) if 𝑡2 ≥ 𝑡1
4. 𝑁 (𝑡2 ) − 𝑁 (𝑡1 ) is the number of claims in (𝑡1 , 𝑡2 )

7.4.1 The Poisson process

The claim number process {𝑁 (𝑡)}𝑡≥0 is defined to be a Poisson process with parameter 𝜆
if

1. 𝑁 (0) = 0 and 𝑁 (𝑡2 ) ≥ 𝑁 (𝑡1 ) if 𝑡2 ≥ 𝑡1

2.

𝑃 [𝑁 (𝑡 + ℎ) = 𝑟|𝑁 (𝑡) = 𝑟] = 1 − 𝜆ℎ + 𝑜(ℎ)


𝑃 [𝑁 (𝑡 + ℎ) = 𝑟 + 1|𝑁 (𝑡) = 𝑟] = 𝜆ℎ + 𝑜(ℎ)
𝑃 [𝑁 (𝑡 + ℎ) > 𝑟 + 1|𝑁 (𝑡) = 𝑟] = 𝑜(ℎ)

3. The number of claims in (𝑡1 , 𝑡2 ] is independent of 𝑁 (𝑡1 )

The number of events which occur in a time period of length 𝑡 has a Poisson distribution
with mean 𝜆𝑡.

7.4.2 Distribution of time to the first claim

If 𝑇1 is the time to the first claim

$P(T_1 > t) = P[N(t) = 0] = e^{-\lambda t}$ and $P(T_1 \le t) = 1 - e^{-\lambda t}$

So 𝑇1 ∼ ℰ𝑥𝑝(𝜆)
Also, the time between any two claims has the same distribution.

7.4.3 The compound Poisson process

We now combine the Poisson process for the number of claims with a claim amount distri-
bution to give a compound Poisson process for the aggregate claims process. We assume:

1. the random variables $\{X_i\}_{i=1}^{\infty}$ are independent and identically distributed
2. the random variables $\{X_i\}_{i=1}^{\infty}$ are independent of 𝑁 (𝑡) for all 𝑡 ≥ 0
3. the stochastic process {𝑁 (𝑡)}𝑡≥0 is a Poisson process with parameter 𝜆

Then

$P[N(t) = k] = e^{-\lambda t}\dfrac{(\lambda t)^k}{k!}, \quad k = 0, 1, 2, ...$

The aggregate claims process {𝑆(𝑡)}𝑡≥0 is said to be a compound Poisson claims process
and, for any 𝑡 ≥ 0 𝑆(𝑡) has a compound Poisson distribution with parameter 𝜆𝑡. We can
thus apply all the results you’ve previously seen for the compound Poisson distribution:

𝐸[𝑆(𝑡)] = 𝜆𝑡𝑚1
𝑉 𝑎𝑟[𝑆(𝑡)] = 𝜆𝑡𝑚2
𝑀𝑆 (𝑟) = 𝑒𝜆𝑡(𝑀𝑋 (𝑟)−1)

So we also have
𝑐 = (1 + 𝜃)𝜆𝑚1
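As a quick check of the mean and variance formulas, we can simulate S(t) directly. The exponential claim-amount distribution and the parameter values below are illustrative assumptions.

set.seed(42)
lambda <- 3        # Poisson rate of claims per unit time
t      <- 2
m      <- 1.5      # mean individual claim; for Exp(1/m), m1 = m and m2 = 2 * m^2

simulate_S <- function() {
  n <- rpois(1, lambda * t)                      # number of claims in [0, t]
  if (n == 0) 0 else sum(rexp(n, rate = 1 / m))  # total claim amount
}
S <- replicate(20000, simulate_S())

c(simulated_mean = mean(S), theoretical_mean = lambda * t * m)
c(simulated_var  = var(S),  theoretical_var  = lambda * t * 2 * m^2)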

7.5 Bounds

In this section we look at how we can place bounds on the ultimate probability of ruin.

7.5.1 Lundberg’s inequality

Lundberg’s inequality states that

𝜓(𝑈 ) ≤ 𝑒−𝑅𝑈

Where R is called the adjustment coefficient.


𝑅 is difficult to motivate in physical terms, but we can consider it as a measure of the
riskiness of the portfolio. It depends on the rate of premium income and on the distribution
of the aggregate claims.
The larger 𝑅 is, the smaller is the upper bound on the ultimate probability of ruin.

7.5.2 How R depends on other parameters

Since 𝑅 encapsulates the factors that determine the riskiness of a portfolio (apart from 𝑈 ),
by working out how we would expect these factors to affect risk we can work out how they
affect 𝑅

• increasing the rate of premium income…increases 𝑅


• increasing the expected value of aggregate claims…decreases 𝑅
• increasing the variance of the aggregate claims…decreases 𝑅

7.5.3 The adjustment factor - compound Poisson processes

𝑅 will be defined in terms of the Poisson parameter and the MGF of the individual claim
distribution, plus the rate of premium income per unit time.
It is defined to be the unique positive root of

𝜆𝑀𝑋 (𝑟) − 𝜆 − 𝑐𝑟 = 0
If we write 𝑐 as (1 + 𝜃)𝜆𝑚1 , this equation for 𝑅 becomes

𝑀𝑋 (𝑟) − 1 − (1 + 𝜃)𝑚1 𝑟 = 0
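The root can be found numerically. Here is a sketch in R which assumes, purely for illustration, exponential individual claims, for which the adjustment coefficient can also be obtained analytically as 𝛼𝜃/(1 + 𝜃), giving a check on the numerical answer.

alpha <- 1     # claims X ~ Exp(alpha): M_X(r) = alpha / (alpha - r), m1 = 1 / alpha
theta <- 0.2   # premium loading
m1    <- 1 / alpha
M_X   <- function(r) alpha / (alpha - r)

# R is the unique positive root of M_X(r) - 1 - (1 + theta) * m1 * r = 0
g <- function(r) M_X(r) - 1 - (1 + theta) * m1 * r
R <- uniroot(g, interval = c(1e-6, alpha - 1e-6))$root

c(numerical = R, analytical = alpha * theta / (1 + theta))   # both approximately 0.167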

7.5.4 General aggregate claims processes - adjustment

For the general (i.e. not just Poisson) aggregate process the adjustment coefficient is the
positive root of

𝐸[𝑒𝑅(𝑆𝑖 −𝑐) ] = 1

7.6 Homework

The homework questions are in the Week 6 homework folder. They are based on past IFoA
exam questions.

8 Rational expectation and utility theory

In this section we’ll cover the following detailed objectives:

• Explain the meaning of the term “utility function”.


• Explain the axioms underlying utility theory and the expected utility theorem.
• Explain how the following economic characteristics of investors can be expressed mathematically in a utility function:
– non-satiation
– risk aversion
– risk neutrality
– risk seeking
– declining or increasing absolute and relative risk aversion.
• Discuss the economic properties of commonly used utility functions.
• Discuss how a utility function may depend on current wealth and discuss state depen-
dent utility functions.
• Perform calculations using commonly used utility functions to compare investment
opportunities.
• State conditions for absolute dominance and for first and second-order dominance.

8.1 Meaning of utility

In economics, ‘utility’ is the satisfaction that an individual obtains from a particular course
of action.
In the application of utility theory to finance and investment choice, it is assumed that a
numerical value called the utility can be assigned to each possible value of the investor’s
wealth by what is known as a utility function.

8.2 Expected Utility Theory

The Expected Utility Theorem (EUT) states that a function, 𝑈 (𝑤), can be constructed as
representing an investor’s utility of wealth, 𝑤, at some future date. Decisions are made
in a manner to maximise the expected value of utility given a set of objectively agreed
probabilities of different outcomes.
(Compare with Subjective Utility Theory.)

8.2.1 Expected Utility Axioms

1. Comparability
An investor can state a preference between all available certain outcomes.
2. Transitivity
If 𝐴 is preferred to 𝐵 and 𝐵 is preferred to 𝐶, then 𝐴 is preferred to 𝐶.
3. Independence
If an investor is indifferent between two certain outcomes, 𝐴 and 𝐵, then they are also
indifferent between the following two gambles:

(i) 𝐴 with probability 𝑝 and 𝐶 with probability (1 − 𝑝); and


(ii) 𝐵 with probability 𝑝 and 𝐶 with probability (1 − 𝑝).

4. Certainty equivalence
Suppose that 𝐴 is preferred to 𝐵 and 𝐵 is preferred to 𝐶. Then there is a unique probability,
𝑝, such that an investor is indifferent between 𝐵 and a gamble giving 𝐴 with probability 𝑝
and 𝐶 with probability (1 − 𝑝).
𝐵 is known as the ‘certainty equivalent’ of the above gamble.
[There is more than one possible way to set out the axioms of EUT. (For example the way
in Dhami.) Any exam questions will be based on the axioms in these slides.]

8.2.2 Non-satiation

We assume that people prefer more wealth to less. This is known as the principle of non-
satiation and can be expressed as:

$U'(w) > 0$
where $U(w)$ is a utility function and $w$ is wealth.

8.2.3 Risk preferences

A risk averse investor values an incremental increase in wealth less highly than an incremental decrease and will reject a fair gamble. The utility function is concave:

$U''(w) < 0$

A risk seeking investor values an incremental increase in wealth more highly than an incremental decrease and will seek a fair gamble. The utility function is convex:

$U''(w) > 0$

A risk neutral investor is indifferent between a fair gamble and the status quo. In this case the utility function is linear:

$U''(w) = 0$
Risk preference can also be expressed in terms of the certainty equivalent.
Consider a gamble 𝐴 with certainty equivalent 𝐶𝐿 .
For a risk neutral investor 𝐶𝐿 = 𝐸[𝐴], where 𝐸[𝐴] is the expected value of 𝐴.
For a risk averse investor 𝐶𝐿 < 𝐸[𝐴],
For a risk seeking investor 𝐶𝐿 > 𝐸[𝐴].
If the absolute value of the certainty equivalent decreases with increasing wealth, the in-
vestor is said to exhibit declining absolute risk aversion. If the absolute value of the
certainty equivalent increases, the investor exhibits increasing absolute risk aversion. If
the absolute value of the certainty equivalent decreases (increases) as a proportion of total
wealth as wealth increases, the investor is said to exhibit declining (increasing) relative
risk aversion.
Two functions measure how risk preference changes as a function of wealth:

$A(w) = -\dfrac{U''(w)}{U'(w)}$

measures absolute risk aversion, and

$R(w) = -\dfrac{wU''(w)}{U'(w)}$

measures relative risk aversion.
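As a small illustration (the gamble below is invented), the certainty equivalent of a risky prospect under log utility can be computed directly, confirming that for a risk-averse investor it falls below the expected value.

w        <- 100                    # current wealth (illustrative)
outcomes <- c(-30, 0, 50)          # possible gains/losses from the gamble
probs    <- c(0.3, 0.4, 0.3)

U <- function(x) log(x)            # log utility: risk averse

expected_value   <- sum(probs * outcomes)
expected_utility <- sum(probs * U(w + outcomes))

# Certainty equivalent: the certain gain C with U(w + C) = E[U(w + A)]
certainty_equivalent <- exp(expected_utility) - w

c(expected_value = expected_value, certainty_equivalent = certainty_equivalent)

The certainty equivalent (about 1.5) is well below the expected value of 6, as we would expect for a risk-averse investor.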

8.3 Some utility functions

8.3.1 The quadratic utility function

The general form is:


$U(w) = w + dw^2$
Exercise
Show that, if the quadratic utility function is to satisfy the condition of diminishing marginal
utility of wealth (risk aversion), we must have 𝑑 < 0 and therefore this utility function can
only satisfy the law of non-satiation over a limited range.
Solution
We have:

$U'(w) = 1 + 2dw$

$U''(w) = 2d$

For $U''(w)$ to be negative (risk aversion), $d$ must be negative, and for $U'(w)$ to be positive (non-satiation) we must have $-\infty < w < -\frac{1}{2d}$.

Exercise

Show that

$A(w) = \dfrac{-2d}{1 + 2dw}$,

$R(w) = \dfrac{-2dw}{1 + 2dw}$.

And state whether the quadratic utility function shows increasing or decreasing absolute
and relative risk aversion.

8.3.2 The log utility function

The form of the log utility function is:


𝑈 (𝑤) = 𝑙𝑛 (𝑤), 𝑤 > 0.
Show that it:

• satisfies the principle of non-satiation and risk aversion


• exhibits declining absolute risk aversion
• exhibits constant relative risk aversion.

Here is a quote from Core Reading:

Utility functions exhibiting constant relative risk aversion are said to be “iso-
elastic”. The use of iso-elastic utility functions simplifies the determination of
an optimal strategy for a multi-period investment decision, because [it] allows
for a series of so-called “myopic” decisions. What this means is that the decision
at the start of each period only considers the possible outcomes at the end of
that period and ignores subsequent periods.

Read the answer to a question about iso-elasticity on Economics Stack Exchange. Can
you see how constant relative risk aversion leads to the ability to make myopic investment
decisions?
Now read this. What do you think?

8.3.3 The power utility function

The form of the power utility function is:


$U(w) = \dfrac{w^{\gamma} - 1}{\gamma}, \quad w > 0$,
where $\gamma$ is known as the risk aversion coefficient.
Show that the power utility function:

• satisfies the principle of non-satiation and risk aversion


• exhibits declining absolute risk aversion
• exhibits constant relative risk aversion.

8.4 Example utility curves

The plots here are to give you an idea of the general shape. The absolute values are irrelevant,
so they have been adjusted to fit onto the same graph.
[Figure: plots of some utility functions (log, power and quadratic), U(w) against w.]

8.5 Problems with EUT

The main problem with EUT is that its axioms (the axioms of rationality) seem reasonable,
but they can only be justified if they describe human (or organisational) behaviour in reality.
And it has been shown many times that they don’t! (See Dhami for plenty of examples.)
Also:

• we don’t know the precise form of any person’s utility function (although there have
been plenty of attempts to devise ways of measuring them);
• organisational decision making is a balance between a coalition of different interests;
• utility functions are likely to be highly state dependent, depending not just on wealth,
but on other factors (think of some factors that might affect your risk preferences).

8.6 Stochastic dominance

Absolute dominance is said to exist when one investment portfolio (or results of another
type of decision) provides a higher return than another in all possible circumstances. Clearly,
this situation will rarely occur, so we usually need to consider the relative likelihood of out-
performance, i.e. stochastic dominance.
Consider two investment portfolios, 𝐴 and 𝐵, with cumulative probability distribution func-
tions of returns 𝐹𝐴 and 𝐹𝐵 respectively.

8.6.1 First order stochastic dominance

The first order stochastic dominance theorem states that, assuming an investor prefers more
to less, 𝐴 will dominate 𝐵 (i.e. the investor will prefer portfolio 𝐴 to portfolio 𝐵) if:
𝐹𝐴 (𝑥) ≤ 𝐹𝐵 (𝑥) for all 𝑥, and
𝐹𝐴 (𝑥) < 𝐹𝐵 (𝑥) for some 𝑥.
This means that the probability of portfolio 𝐵 producing a return below a certain value is
never less than the probability of portfolio 𝐴 producing a return below the same value, and
exceeds it for at least some value of 𝑥.

8.6.2 Second order stochastic dominance

The second order stochastic dominance theorem applies when the investor is risk averse, as
well as preferring more to less.
In this case, the condition for 𝐴 to dominate 𝐵 is that:
$\int_a^x F_A(y)\,dy \le \int_a^x F_B(y)\,dy$ for all $x$,
with the strict inequality holding for some value of $x$, where $a$ is the lowest return that the portfolios can possibly provide.
The interpretation of the inequality above is that a risk averse investor will accept a lower
probability of a given extra return, at a low absolute level of return, in preference to the
same probability of extra return at a higher absolute level. In other words, a potential gain
of a certain amount is not valued as highly as a loss of the same amount.
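A minimal R sketch of checking these conditions numerically for two illustrative normal return distributions; the means and standard deviations below are assumptions chosen so that A should dominate B.

# Portfolio A has a higher mean return than B with the same spread,
# so we expect A to dominate B at first (and hence second) order
F_A <- function(x) pnorm(x, mean = 0.08, sd = 0.10)
F_B <- function(x) pnorm(x, mean = 0.05, sd = 0.10)

x  <- seq(-0.5, 0.5, by = 0.001)    # grid of possible returns
dx <- 0.001

first_order <- all(F_A(x) <= F_B(x)) && any(F_A(x) < F_B(x))

# Crude numerical approximation of the running integrals of the CDFs
int_A <- cumsum(F_A(x)) * dx
int_B <- cumsum(F_B(x)) * dx
second_order <- all(int_A <= int_B)

c(first_order = first_order, second_order = second_order)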

8.7 Homework

Have a go at the following past IFoA questions:


CM2B, September 2019, Q3 (use Excel)
CM2A, September 2020, Q1.

8.8 Reading

The topic is behavioural economics:


Describe the main features of Kahneman and Tversky’s prospect theory critique of expected
utility theory.
Read Chapters 26 – 30 of Thinking Fast and Slow.
Optionally, you could also:
Read Chapter One of Dhami (see the reading list) and read Chapter Two: 2.1 – 2.4.

9 Behavioural economics

9.1 Learning Objectives

• Describe the main features of Kahneman and Tversky’s prospect theory critique of
expected utility theory.
• Explain what is meant by “framing”, “heuristics” and “bias” in the context of financial
markets and describe the following features of behaviour in such markets:
– the herd instinct
– anchoring and adjustment
– self-serving bias
– loss aversion
– confirmation bias
– availability bias
– familiarity bias

9.2 Criticisms of expected utility theory (EUT)

Before going any further read, or reread, Chapters 25 and 26 of Thinking Fast and Slow.
The fundamental problem is EUT can’t explain many observations of real behaviour, for
example:

• Individuals have different attitudes to risk (i.e. utility functions) depending on wealth
[Friedman & Savage].
• Utility should be measured relative to a reference point [Markowitz].

Prospect Theory (PT) was based on a series of psychological experiments by Kahneman and Tversky.

9.3 Summary of Prospect Theory

Faced with (i.e. “Having the prospect of”) a risky choice leading to gains, people are risk-
averse, preferring solutions that lead to a lower expected utility but with a higher certainty
(concave value function). Faced with a risky choice leading to losses, people are risk-seeking,
preferring solutions that lead to a lower expected utility as long as it has the potential to
avoid losses (convex value function).

People tend to overweight very low probabilities and underweight very high probabilities.

Figure 9.1: Prelec weighting function

Prelec’s function is often used for weighting probabilities. For the formula see Dhami, page 121 (Dhami 2019).

9.3.1 Decision making under Prospect Theory

There are two stages to decision making under PT:

1. Editing (sometimes called framing)


• possible outcomes are appraised & ordered
2. Evaluation
• a choice is made from the options

Editing

The main editing stages are:

• Acceptance
• Segregation

Acceptance implies that people rarely change the way that the decision is presented to them - that is, they accept the framing that is presented, rather than reframing the decision for themselves.
Framing is a very powerful technique, used continually in marketing.
Segregation is the process of separating the parts of the picture that are relevant to making
the decision, from those that are (or should be) irrelevant. The behavioural biases described
below can be very involved in this process.
Other stages are:
Coding
This refers to defining the reference point and the various outcomes and probabilities in a
quantifiable way.
The location of the reference point—and the consequent coding of outcomes as gains or
losses—can be affected by the formulation of the offered prospects and by expectations of
the decision maker.
Combining
Prospects can sometimes be simplified by combining the probabilities associated with iden-
tical outcomes.
For example, the prospect (200, 0.25; 200, 0.25) will be reduced to (200, 0.50) and evaluated
in this form.
Cancellation
Some prospects contain a riskless component that is segregated from the risky component
during editing.
For example, the prospect (100, 0.70; 150, 0.30) can be decomposed into a sure gain of 100
and the risky prospect (50, 0.30).
Simplification
Prospects can be simplified by rounding either outcomes or probabilities.
For example, the prospect (99, 0.51) can be coded as an even chance of winning 100.
Outcomes that are extremely improbable are likely to be ignored, meaning the probabilities
are rounded down to 0.
[The examples above are from: http://mark-hurlstone.github.io/Week%205.%20Decision%20Making%20Unde
]

Evaluation

Reference dependence
People derive utility from gains and losses measured relative to some reference point –
giving an S-shaped utility curve with a point of inflexion at that point.
Loss aversion
People are more sensitive to losses than gains, so the curve is steeper below the point of inflexion. They are risk averse in the domain of gains and risk seeking in the domain of losses. Together with diminishing sensitivity, this gives rise to the S-shape in the diagram.

Diagram from The Trouble with Economics


Endowment effect

People’s preferences depend on what they already have. They value something more highly
just because they already own it, and will often refuse to sell something for well above its
market value, even if they could purchase something virtually identical in the market.
Mental accounting
People hold mental accounts of the sources or planned use of money and may make different
decisions depending on the mental account involved.
Probabilities are weighted
E.g. see Prelec curve.
Certainty effect
Change from certainty is weighted highly. So the move from 100% probability to 99% probability is given much more significance than the move from 50% to 49%.
Isolation effect
Common elements in the things being compared are ignored in the decision, even though
they might have different implications.
Evaluation formula

$V = \sum_{i=1}^{n} \pi(p_i)\, v(x_i)$

𝑉 is the overall utility of the outcomes to the individual making the decision; 𝑥𝑖 are the
outcomes; 𝑝𝑖 are the probabilities of the outcomes; 𝜋() is a probability weighting function;
𝑣() is the s-shaped value function passing through the origin.
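A sketch of the evaluation formula in R. The Prelec weighting function, the piecewise power value function and all parameter values below are illustrative assumptions rather than anything prescribed in the Core Reading.

# Prelec probability weighting: pi(p) = exp(-(-log(p))^a), with 0 < a <= 1
prelec <- function(p, a = 0.65) exp(-(-log(p))^a)

# S-shaped value function: concave for gains, convex and steeper for losses
value <- function(x, alpha = 0.88, lambda = 2.25) {
  ifelse(x >= 0, 1, -lambda) * abs(x)^alpha
}

# Prospect: gain 100 with probability 0.5, lose 100 with probability 0.5
outcomes <- c(100, -100)
probs    <- c(0.5, 0.5)

V <- sum(prelec(probs) * value(outcomes))
V   # negative: loss aversion makes this fair gamble unattractive

A negative value for a fair gamble reflects the loss-averse behaviour described above.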

9.4 Heuristics and behavioural biases

System I in the brain is fast and instinctive: it is used all the time and is vital for survival. System II is slower and more deliberative, akin to the rational agent of traditional economics. System I uses shortcuts, or heuristics, also known as “rules of thumb”.
9.4.1 Anchoring and adjustment

Once we have established (or been persuaded to establish) an anchor we find it very difficult
to make sufficient adjustments away from it.
This is why a common tactic in negotiating is to try and be the first to suggest a price
rather than allowing the other side to create an “anchor” favourable to themselves.
A classic example of anchoring is to have a few very expensive bottles of wine on a restaurant
list to make the others look good value.

9.4.2 Familiarity

People tend to prefer familiar choices over unfamiliar ones.


It’s been shown that English speaking people prefer to invest in companies with English
sounding names - even if the ones with the “funny” names are based in the UK.

9.4.3 Overconfidence

People systematically overestimate their own capabilities and judgement.


For example, they buy a risky investment because they think they will be able to determine
the right time to sell it.

9.4.4 Hindsight bias

Looking backwards, people believed they could predict the future much better than they
could at the time. They think they had information in the past that actually only emerged
later.
This might be the reason why so many people believe active investment works - they see
outperformance of some investments and believe it was predictable before it occurred.
It’s probably also one of the reasons for the persistent belief that universities and schools
aren’t producing graduates well equipped for employment - people older than 40 have for-
gotten how much they didn’t know when they were 22!

9.4.5 Confirmation bias

People look for, and remember, evidence that confirms their existing beliefs. There are
many examples of people being able to narrow the sources of news so they only get what
they want to hear.
As another example, if you feel ethics and sustainability are important factors in making investment choices you’ll easily find reputable investment professionals who will tell you ethical investments are a good financial choice.
If you believe that such considerations should be irrelevant to investment, plenty of people
will also confirm that view.

9.4.6 Self-serving bias

People believe luck is evidence of their own skill (and the opposite when observing other people). For example, when someone makes an investment that pays off well, they assign it to their skill. When it doesn’t do well they think it is bad luck (or perhaps others conspiring against them).

9.4.7 Status quo bias

Status quo bias is a preference given to the present state of affairs, or a natural bias towards
the current or previous decision.
You will have noticed that people don’t like to change their minds! People are also slow to
change their investment portfolio in reaction to events or a change in their circumstances.

9.4.8 Herd behaviour

There are many, many examples of this (look up the Dutch Tulip Mania if you haven’t heard
of it before).
Extraordinary Popular Delusions and the Madness of Crowds is currently available for 49p
on Kindle.

9.5 Homework

Find a relevant, and powerful, example for each of the behavioural heuristics/biases listed
above.
Make notes on the five stages of editing listed under “other stages” above.
Continue reading Thinking fast and slow.

Figure 9.2: Flock of sheep crossing a bridge

9.6 Further reading

For a readable introduction to PT see this blog post


For a slightly more in-depth example see this extract from R McDermott (2001)

10 Insurance and risk markets

10.1 Learning objectives

In this section we will cover the following objectives:

• Analyse simple insurance problems in terms of utility theory


• Describe how insurance companies help to reduce or remove risk.
• Explain what is meant by the terms ‘moral hazard’ and ‘adverse selection’.

In fact, there is no new theory to learn - we already have the tools we need to see why people
are prepared to purchase insurance even though it might lead to an expected decrease in
wealth.
Consider the following example from the CM2 Core Reading.

10.2 Finding the maximum premium

The maximum premium 𝑃 which an individual will be prepared to pay in order to insure
themselves against a random loss 𝑋 is given by the solution of the equation:
𝐸[𝑈 (𝑎 − 𝑋)] = 𝑈 (𝑎 − 𝑃 ),
where 𝑎 is the initial level of wealth.

For example, consider an individual with a utility function of $u(x) = \sqrt{x}$ and current wealth
of 15000.
Assume this individual is at risk of suffering damages that are uniformly distributed up to
15000, so their expected loss is 7500.
If they accept the risk their wealth will be uniformly distributed between 0 and 15000, so
their expected utility is:
$\int_0^{15000} \dfrac{\sqrt{x}}{15000}\,dx = \dfrac{\sqrt{15000}}{1.5}$
Setting this equal to the utility if they pay the premium, we find they would be willing to
pay up to 8333 for insurance that covers any loss. This is well above the 7500 expected loss.
(Admittedly it’s not a very realistic example.)
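A quick R sketch confirming the figure numerically; it uses only the numbers already in the example.

a <- 15000
U <- function(w) sqrt(w)

# Expected utility of retaining the risk, with the loss X ~ Uniform(0, a)
eu_uninsured <- integrate(function(x) U(a - x) / a, lower = 0, upper = a)$value

# The maximum premium P solves U(a - P) = E[U(a - X)]
P_max <- uniroot(function(P) U(a - P) - eu_uninsured, interval = c(0, a))$root
round(P_max)   # about 8333, above the expected loss of 7500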

10.3 Finding the minimum premium

The insurance premium 𝑄 which an insurer should be prepared to charge for insurance
against a risk with potential loss 𝑌 is given by the solution of the equation:
𝐸[𝑈 (𝑎 + 𝑄 − 𝑌 )] = 𝑈 (𝑎),
where 𝑎 is the initial wealth.
If the insurer uses a premium loading 𝜃 we can write this as:
𝐸[𝑈 (𝑎 + (1 + 𝜃)𝐸[𝑌 ] − 𝑌 )] = 𝑈 (𝑎).

10.4 A little insurance background

10.4.1 Pooling resources

As we have seen above, a person who is risk averse will be prepared to pay more for insurance
than the long-run average value of claims which will be made. Thus, insurance can be
worthwhile for the risk averse policyholder even if the insurer has to charge a premium in
excess of the expected value of claims in order to cover expenses and to provide a profit
margin. An insurance contract is feasible if the minimum premium that the insurer is
prepared to charge (given their own risk aversion) is less than the maximum amount that a
potential policyholder is prepared to pay.
Insurance reduces the variability of losses due to adverse outcomes by pooling risks.
In pooling risks, an insurer attempts to group insured risks within homogeneous groups so
that the premiums charged to a particular group accurately reflect the risk.

10.4.2 Adverse selection

Adverse selection describes the fact that people who know that they are particularly bad
risks are more inclined to take out insurance than those who know that they are good risks
if the premiums charged are based on the average risk for the whole group.
The key requirement for adverse selection is information asymmetry where one party (the
buyer of insurance) has information that the other party doesn’t.
To try and reduce the problems of adverse selection insurance companies may try and find
out lots of information about potential policyholders. Policyholders can then be put in
small, reasonably homogeneous pools and charged appropriate premiums. However, there
are a number of problems with this approach:

• collecting and processing data is expensive


• homogeneous groups might end up too small to have statistically credible data

• customers don’t like providing detailed information so might choose a competitor with
a less onerous proposal form
• anti-discrimination legislation may prevent the use of certain types of information

10.4.3 Moral hazard

Moral hazard describes the fact that a policyholder may, because they have insurance, act
in a way which makes the insured event more likely. Moral hazard makes insurance more
expensive. It may even push the price of insurance above the maximum premium that a
person is prepared to pay.
More generally, moral hazard occurs when someone is induced to behave in a risky way
because someone else bears the cost of the risk taken on. A well known example occurs if
banks believe they are “too big to fail”. They may indulge in risky behaviour, confident
that the government will bail them out if things go wrong.
Moral hazard is not the same as insurance fraud. Making a false claim for a loss that hasn’t
occurred, or which shouldn’t be covered by the insurance contract, is fraud, not moral
hazard.

10.4.4 Exercise

For two types of non-life insurance and two types of life insurance think of some examples
of moral hazard, adverse selection, and fraud.
How might an insurance company seek to deal with the increased risk brought by each of
your examples?

10.5 Homework

Have a go at the homework question in the Insurance folder on Blackboard. This is a past
CT7 (Economics) question, with some extensions.

Dhami, Sanjit. 2019. The Foundations of Behavioral Economic Analysis: Volume I: Behavioral Economics of Risk, Uncertainty, and Ambiguity. Vol. 1. Oxford: Oxford University Press USA - OSO.
Gesmann, Markus, Daniel Murphy, Yanwei (Wayne) Zhang, Alessandro Carrato, Mario Wuthrich, Fabio Concina, and Eric Dal Moro. 2023. ChainLadder: Statistical Methods and Models for Claims Reserving in General Insurance. https://mages.github.io/ChainLadder/.
Jakhria, P., R. Frankland, S. Sharp, A. Smith, A. Rowe, and T. Wilkins. 2019. “Evolution of Economic Scenario Generators: A Report by the Extreme Events Working Party Members.” British Actuarial Journal 24. https://doi.org/10.1017/S1357321718000181.
Mersmann, Olaf. 2021. Microbenchmark: Accurate Timing Functions. https://github.com/joshuaulrich/microbenchmark/.
Pedersen, Hal, Mary Pat Campbell, Stephan L. Christiansen, Samuel H. Cox, Daniel Finn, Ken Griffin, Nigel Hooker, Matthew Lightwood, Stephen M. Sonlin, and Chris Suchar. 2016. Economic Scenario Generators: A Practical Guide. Society of Actuaries. https://www.soa.org/globalassets/assets/Files/Research/Projects/research-2016-economic-scenario-generators.pdf.
Ross, Sheldon M. 2013. Simulation. Fifth edition. Amsterdam: Academic Press.
Ucar, Iñaki, Bart Smeets, and Arturo Azcorra. 2019. “simmer: Discrete-Event Simulation for R.” Journal of Statistical Software 90 (2): 1–30. https://doi.org/10.18637/jss.v090.i02.

