BA3202-L2 Distribution

This document provides an overview of the objectives and content covered in Lecture 2 of the course BA3202 Actuarial Statistics. The lecture will describe properties of statistical distributions used to model individual and aggregate losses. It will also cover deriving moments and moment generating functions of common loss distributions, and applying statistical inference to select appropriate loss distributions. The document outlines methods of parameter estimation like method of moments, method of percentiles, and maximum likelihood estimation, as well as goodness-of-fit testing. It provides examples of some common distributions used in actuarial applications like the normal, lognormal, exponential, gamma, Pareto, Weibull, and mixtures.

Uploaded by

Hagan

BA3202

Actuarial Statistics
Lecture 2:
- Loss distributions
Instructor:
Wenjun Zhu, PhD, FSA, CERA
Assistant Professor
Email: [email protected]
Office: S3-B1B-71
Tel: (65)6592-1859

Division of Banking & Finance


Nanyang Technological University

Objectives
1. Describe the properties of the statistical distributions which are suitable for modelling individual and aggregate losses.
2. Derive moments and moment generating functions (where defined) of loss distributions including the gamma, exponential, Pareto, generalized Pareto, normal, lognormal, Weibull and Burr distributions.
3. Apply the principles of statistical inference to select suitable loss distributions for sets of claims.

Insurance overview
• Insurance losses:
  • Frequency of loss: how often does a loss occur over the insured period?
  • Severity of loss: how much does each loss cost?
  • Expected total loss = expected frequency × expected average severity
• General procedure for modelling loss frequency and severity:
  • There is a large set of candidate “loss distribution” models.
  • Each loss distribution is fitted to the insurance company's historical losses.
  • All distributions fitted to the data are compared using goodness-of-fit (GOF) criteria.
  • The loss distribution with the best fit is chosen.
  • The procedure above is carried out separately for the frequency and the severity of losses.
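The relation expected total loss = expected frequency × expected average severity can be checked by simulation. The sketch below is only an illustration: it assumes Poisson claim counts and exponential claim sizes that are independent of the counts, and all parameter values are made up.

```r
set.seed(1)

# Hypothetical portfolio: claim counts ~ Poisson(3), claim sizes ~ Exp(rate = 0.5)
n_periods <- 20000
freq_mean <- 3        # expected number of losses per insured period
sev_mean  <- 1 / 0.5  # expected size of a single loss

# Total loss of each period = sum of that period's individual losses
n_claims   <- rpois(n_periods, lambda = freq_mean)
total_loss <- sapply(n_claims, function(k) sum(rexp(k, rate = 1 / sev_mean)))

sim_mean    <- mean(total_loss)
theory_mean <- freq_mean * sev_mean

cat("simulated mean total loss:", sim_mean, "\n")
cat("frequency x severity     :", theory_mean, "\n")
```

The two printed numbers should be close; the gap shrinks as n_periods grows.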
Statistical overview
• We assume that historical losses (claims) come from a familiar distribution.
• However, the exact distribution can never be known with absolute certainty.
• Typically we assume that losses come from a “family” of distributions (next sections).
• Since the parameters of these distributions are unknown, several techniques can be used to estimate them, such as “Maximum likelihood estimation” (next sections).
• Some complications arise with certain policy types, such as those with high deductibles, co-insurance and excess of loss. We will cover reinsurance arrangements in the next lecture.
Estimation & Goodness-of-Fit (GOF)
• Several methods to estimate distribution parameters:
  • Method of Moments (MM)
  • Method of Percentiles (MP)
  • Maximum Likelihood Estimation (MLE)
• Statistical fit is tested using a $\chi^2$ test.
• No need to memorize the formulae for densities and moment generating functions (when defined)
  • Provided in the yellow book in exams
  • Try to derive them as examples
  • Remember some common ones to save time during the exam

Method of Moments
• $\{x_1, x_2, \ldots, x_n\}$ is a random sample of size $n$ from a random variable $X$
• The $j$-th moment of this sample is defined as
  $\frac{1}{n} \sum_{i=1}^{n} x_i^j$
• Let the theoretical $j$-th moment of $X$ be $m_j(\theta)$
  • $\theta$ is the unknown parameter of $X$ that we want to estimate
  • $m_j(\theta)$ is a function of $\theta$
• $\theta$ can then be estimated by solving $m_j(\theta) = \frac{1}{n} \sum_{i=1}^{n} x_i^j$
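As an illustration of solving the moment equations, the sketch below applies MM to the gamma distribution covered later in this lecture, whose first two moments are $\alpha/\lambda$ and $\alpha(\alpha+1)/\lambda^2$. The data are simulated, and the true values $\alpha = 4$, $\lambda = 2$ are chosen only for the example.

```r
set.seed(2)

# Simulated "claims" from Gamma(alpha = 4, lambda = 2); in practice x is the observed sample
x <- rgamma(5000, shape = 4, rate = 2)

# Sample moments (1/n) * sum(x_i^j) for j = 1, 2
m1 <- mean(x)
m2 <- mean(x^2)

# Theoretical moments: m_1 = alpha/lambda, m_2 = alpha*(alpha + 1)/lambda^2.
# Equating them to m1 and m2 and solving the two equations gives:
lambda_mm <- m1 / (m2 - m1^2)
alpha_mm  <- m1^2 / (m2 - m1^2)

cat("MM estimates: alpha =", alpha_mm, ", lambda =", lambda_mm, "\n")
```

With two unknown parameters, two moment equations are needed; a one-parameter family would use only the first.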

Method of Percentiles
• Works similarly to MM
• Procedure in MM when there are 2 parameters to be estimated, using the first 2 moments:
  1. Derive the 1st and 2nd moments of the distribution as functions of the 2 unknown parameters;
  2. Estimate the 1st and 2nd moments from the data;
  3. Set the 1st theoretical moment equal to the 1st empirical (data) moment, and also set the 2nd theoretical moment equal to the 2nd empirical (data) moment
  → 2 equations and 2 unknowns → solved simultaneously
• Procedure in MP when there are 2 parameters to be estimated:
  1. Derive two different percentiles of the distribution (e.g. the 25th and 75th percentiles) as functions of the 2 unknown parameters
  2. Estimate the sample percentiles from the data (25th and 75th percentiles)
  3. Set the theoretical 25th percentile equal to the 25th empirical percentile, and also set the theoretical 75th percentile equal to the 75th empirical percentile
  → 2 equations and 2 unknowns → solved simultaneously
Maximum Likelihood Estimation
• $f(x \mid \theta)$: the conditional pdf of the random variable $X$ given the parameter set $\theta$, where $\theta$ is a set of unknown parameters
• $L(\theta)$: the likelihood function
  • $L(\theta) = \prod_{i=1}^{n} P(X_i = x_i \mid \theta)$, when $X$ is discrete
  • $L(\theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$, when $X$ is continuous
• Maximum likelihood estimate (MLE): the parameter set that maximizes $L(\theta)$
• Instead of the likelihood function, we typically use the log-likelihood function $l(\theta) = \log L(\theta)$
• To maximize, we differentiate $l(\theta)$ w.r.t. $\theta$ and set the derivative equal to 0 to find the MLE $\hat{\theta}$:
  $\frac{\partial l(\hat{\theta})}{\partial \theta} = 0$
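A minimal sketch of the MLE recipe for an exponential sample, where the log-likelihood is $l(\lambda) = n \log \lambda - \lambda \sum x_i$. The numerical maximum is compared with the closed form obtained by setting the derivative to zero. The simulated data and the search interval are assumptions made for this illustration.

```r
set.seed(3)

# Simulated sample from Exp(lambda = 0.5); in practice x is the observed claims
x <- rexp(2000, rate = 0.5)

# Log-likelihood l(lambda) = n * log(lambda) - lambda * sum(x_i)
loglik <- function(lambda) length(x) * log(lambda) - lambda * sum(x)

# Maximize l(lambda) numerically over a hypothetical search interval
fit <- optimize(loglik, interval = c(1e-6, 10), maximum = TRUE)
lambda_numeric <- fit$maximum

# Setting dl/dlambda = n/lambda - sum(x_i) = 0 gives lambda_hat = n / sum(x_i)
lambda_closed <- length(x) / sum(x)

cat("numerical MLE  :", lambda_numeric, "\n")
cat("closed-form MLE:", lambda_closed, "\n")
```

When no closed form exists, only the numerical step is available, which is the situation for the gamma distribution later in the lecture.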
Some common distributions
• The Lognormal distribution
• The Exponential distribution
• The Gamma distribution
• The Pareto distribution
• The Weibull distribution
• Mixture distributions
• Variations on Pareto distribution

The normal distribution
• If a random variable $X$ follows a normal distribution: $X \sim N(\mu, \sigma^2)$ with parameters $\mu$ and $\sigma$
• PDF: $f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
• CDF: $F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right)$, where $\Phi(\cdot)$ is the CDF of the standard normal.
• Mean: $\mu$
• Variance: $\sigma^2$
• MGF: $M_X(t) = \exp\left(\mu t + \frac{1}{2} \sigma^2 t^2\right)$

The normal distribution
• Simulation example with $\mu = 2$, $\sigma = 1$:
R code:
library(ggplot2)
# Draw 1000 values from N(2, 1)
x_norm = rnorm(1000, mean = 2, sd = 1)
# Density-scaled histogram with the N(2, 1) pdf overlaid
ggplot(data.frame(x_norm), aes(x_norm)) +
  geom_histogram(aes(y = ..density..), bins = 50, color = 'grey') +
  stat_function(fun = function(x) dnorm(x, mean = 2, sd = 1),
                color = "magenta", size = 1)

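Example:
• Calculate the MM, MP, and ML estimators of a normal distribution with parameters $\mu$, $\sigma$
• MM estimator:
  • $\hat{\mu} = \bar{x}$, $\hat{\sigma}^2 = \frac{1}{n} \sum (x_i - \bar{x})^2$.
• MP estimator ($x_{0.25}$ and $x_{0.75}$ denote the 25th and 75th sample percentiles):
  • $\hat{\mu} = \frac{x_{0.25} \cdot \Phi^{-1}(0.75) - x_{0.75} \cdot \Phi^{-1}(0.25)}{\Phi^{-1}(0.75) - \Phi^{-1}(0.25)}$
  • $\hat{\sigma} = \frac{x_{0.75} - x_{0.25}}{\Phi^{-1}(0.75) - \Phi^{-1}(0.25)}$
• ML estimator:
  • $\hat{\mu} = \bar{x}$, $\hat{\sigma}^2 = \frac{1}{n} \sum (x_i - \bar{x})^2$.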
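The three sets of estimators can be compared on simulated data. This is a sketch under the assumption that the sample really is normal; the true values $\mu = 2$, $\sigma = 1$ and the variable names are chosen for the illustration.

```r
set.seed(4)

# Simulated sample from N(mu = 2, sigma = 1); in practice x is the observed data
x <- rnorm(5000, mean = 2, sd = 1)
n <- length(x)

# MM and ML estimators coincide: sample mean and the 1/n (not 1/(n-1)) variance
mu_mm     <- mean(x)
sigma2_mm <- sum((x - mu_mm)^2) / n

# MP estimators from the 25th and 75th sample percentiles
q <- unname(quantile(x, probs = c(0.25, 0.75)))
z <- qnorm(c(0.25, 0.75))   # Phi^{-1}(0.25) and Phi^{-1}(0.75)
sigma_mp <- (q[2] - q[1]) / (z[2] - z[1])
mu_mp    <- (q[1] * z[2] - q[2] * z[1]) / (z[2] - z[1])

cat("MM/ML: mu =", mu_mm, ", sigma^2 =", sigma2_mm, "\n")
cat("MP   : mu =", mu_mp, ", sigma   =", sigma_mp, "\n")
```

Note that R's var() divides by n - 1, so it is not used for the MM/ML variance here.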

The lognormal distribution
• If a random variable $X$ satisfies $\ln X \sim N(\mu, \sigma^2)$, then $X \sim LN(\mu, \sigma^2)$, i.e. $X$ is lognormally distributed with parameters $\mu$ and $\sigma$
• PDF: $f(x) = \frac{1}{x \sigma \sqrt{2\pi}} e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}$
• Mean: $e^{\mu + \sigma^2/2}$
• Variance: $e^{2\mu + \sigma^2} (e^{\sigma^2} - 1)$
• MGF does not exist
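Although the lognormal MGF does not exist, the mean and variance can be derived from the normal MGF $M_Y(t) = \exp(\mu t + \frac{1}{2}\sigma^2 t^2)$ given earlier in this lecture. A short derivation, writing $Y = \ln X \sim N(\mu, \sigma^2)$ so that $X = e^{Y}$:

```latex
\begin{aligned}
E[X]   &= E[e^{Y}]  = M_Y(1) = \exp\left(\mu + \tfrac{1}{2}\sigma^2\right), \\
E[X^2] &= E[e^{2Y}] = M_Y(2) = \exp\left(2\mu + 2\sigma^2\right), \\
Var(X) &= E[X^2] - (E[X])^2
        = e^{2\mu + 2\sigma^2} - e^{2\mu + \sigma^2}
        = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right).
\end{aligned}
```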

The lognormal distribution
• Simulation example with $\mu = 2$, $\sigma = 1$:
R code:
library(ggplot2)
# Draw 1000 values from LN(2, 1)
x_logn = rlnorm(1000, meanlog = 2, sdlog = 1)
# Density-scaled histogram with the LN(2, 1) pdf overlaid
ggplot(data.frame(x_logn), aes(x_logn)) +
  geom_histogram(aes(y = ..density..), bins = 50, color = 'grey') +
  stat_function(fun = function(x) dlnorm(x, meanlog = 2, sdlog = 1),
                color = "magenta", size = 1)

The exponential distribution
• If $X \sim Exp(\lambda)$
  • PDF of $X$: $f(x) = \lambda e^{-\lambda x}$, $x > 0$.
  • CDF of $X$: $F(x) = 1 - e^{-\lambda x}$, $x > 0$.
  • Mean: $\frac{1}{\lambda}$
  • Variance: $\frac{1}{\lambda^2}$
  • MGF: $M(t) = E[e^{tX}] = \left(1 - \frac{t}{\lambda}\right)^{-1}$, $t < \lambda$
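As an example of deriving these results rather than memorizing them, the exponential MGF follows directly from the pdf, and the mean and variance follow from its derivatives at $t = 0$:

```latex
\begin{aligned}
M(t) &= E[e^{tX}] = \int_0^\infty e^{tx} \lambda e^{-\lambda x} \, dx
      = \lambda \int_0^\infty e^{-(\lambda - t)x} \, dx
      = \frac{\lambda}{\lambda - t}
      = \left(1 - \frac{t}{\lambda}\right)^{-1}, \quad t < \lambda, \\
E[X] &= M'(0) = \frac{1}{\lambda}, \qquad
E[X^2] = M''(0) = \frac{2}{\lambda^2}, \qquad
Var(X) = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}.
\end{aligned}
```

The condition $t < \lambda$ is what makes the integral converge.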

The exponential distribution
• Simulation example with $\lambda = 0.5$:
R code:
library(ggplot2)
# Draw 1000 values from Exp(0.5)
x_exp = rexp(1000, rate = 1/2)
# Density-scaled histogram with the Exp(0.5) pdf overlaid
ggplot(data.frame(x_exp), aes(x_exp)) +
  geom_histogram(aes(y = ..density..), bins = 50, color = 'grey') +
  stat_function(fun = function(x) dexp(x, rate = 1/2),
                color = "magenta", size = 1)

The Gamma distribution
• If $X \sim \Gamma(\alpha, \lambda)$
  • PDF of $X$: $f(x) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\lambda x}$, $x > 0$
  • $E[X] = \frac{\alpha}{\lambda}$, $Var(X) = \frac{\alpha}{\lambda^2}$
• Estimation of parameters:
  • The simple equations for the MM method make it easy to use.
  • The MLEs do not exist in closed form, but the MM estimators can be used as “initial inputs” to the MLE estimation, which can be done using optimization techniques.
• MGF: $M(t) = \left(1 - \frac{t}{\lambda}\right)^{-\alpha}$, $t < \lambda$.
• If we have $n$ independent $X_i \sim \Gamma(\alpha_i, \lambda)$, then $\sum_{i=1}^{n} X_i \sim \Gamma\left(\sum_{i=1}^{n} \alpha_i, \lambda\right)$
  • Q: What is the distribution of $\sum_{i=1}^{n} X_i$ if $X_i \sim Exp(\lambda)$?
• If $X \sim \Gamma\left(\frac{\nu}{2}, \frac{1}{2}\right)$, then it is equivalent to $\chi^2(\nu)$. MGF?
• If $X \sim \Gamma(\alpha, \lambda)$, then $2\lambda X \sim \chi^2(2\alpha)$.
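The two-step estimation described above (MM estimates as initial inputs, then numerical MLE) might look as follows. This is a sketch on simulated data; the log-scale parameterization and the use of optim() with its default settings are choices made for the illustration, not something prescribed by the lecture.

```r
set.seed(5)

# Simulated sample from Gamma(alpha = 4, lambda = 2); in practice x is the observed claims
x <- rgamma(3000, shape = 4, rate = 2)

# Step 1: MM estimates from E[X] = alpha/lambda and Var(X) = alpha/lambda^2
m <- mean(x)
v <- mean((x - m)^2)
start <- c(alpha = m^2 / v, lambda = m / v)

# Step 2: negative log-likelihood, parameterized on the log scale so that
# the optimizer can never propose a non-positive alpha or lambda
negloglik <- function(par) {
  -sum(dgamma(x, shape = exp(par[1]), rate = exp(par[2]), log = TRUE))
}

# Step 3: numerical optimization started from the MM estimates
fit <- optim(log(start), negloglik)
mle <- exp(fit$par)

cat("MM start:", start, "\n")
cat("MLE     :", mle, "\n")
```

Starting from the MM values usually puts the optimizer close to the maximum, so few iterations are needed.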
The Gamma distribution
• Simulation example with $\alpha = 4$, $\lambda = 2$:
R code:
library(ggplot2)
# Draw 1000 values from Gamma(4, 2)
x_gam = rgamma(1000, shape = 4, rate = 2)
# Density-scaled histogram with the Gamma(4, 2) pdf overlaid
ggplot(data.frame(x_gam), aes(x_gam)) +
  geom_histogram(aes(y = ..density..), bins = 50, color = 'grey') +
  stat_function(fun = function(x) dgamma(x, shape = 4, rate = 2),
                color = "magenta", size = 1)

Example: Invariance Property of MLE
• Invariance Property of MLE:
  • If $\hat{\theta}$ is the MLE of $\theta$, then for any function $g(\theta)$, the MLE of $g(\theta)$ is $g(\hat{\theta})$.
• Given a sample $x = (x_1, x_2, \ldots, x_n)$, derive the MLE of the population mean under the following two distributional assumptions:
1. Suppose that the sample $x$ is from $Exp(\lambda)$
  • $l(\lambda) = n \log \lambda - \lambda \sum x_i$
  • MLE of $\lambda$: $\hat{\lambda} = \frac{n}{\sum x_i}$
  • By the invariance property of MLE, the MLE of $\mu$: $\hat{\mu} = \frac{1}{\hat{\lambda}} = \frac{\sum x_i}{n}$
2. Suppose that the sample $x$ is from $\Gamma(\alpha, \lambda)$
  • $l(\alpha, \lambda) = n\alpha \log \lambda - n \log \Gamma(\alpha) + (\alpha - 1) \sum \log x_i - \lambda \sum x_i$
  • By the invariance property of MLE, the MLE of $\mu$: $\hat{\mu} = \frac{\hat{\alpha}}{\hat{\lambda}} = \frac{\sum x_i}{n}$
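The gamma result does not require solving for $\hat{\alpha}$ and $\hat{\lambda}$ individually. Differentiating the gamma log-likelihood above with respect to $\lambda$ alone already fixes their ratio:

```latex
\begin{aligned}
\frac{\partial l(\alpha, \lambda)}{\partial \lambda}
  &= \frac{n\alpha}{\lambda} - \sum_{i=1}^{n} x_i = 0
  \quad\Longrightarrow\quad
  \frac{\hat{\alpha}}{\hat{\lambda}} = \frac{\sum_{i=1}^{n} x_i}{n}, \\
\hat{\mu} &= g(\hat{\alpha}, \hat{\lambda}) = \frac{\hat{\alpha}}{\hat{\lambda}} = \bar{x}.
\end{aligned}
```

So under both assumptions the MLE of the population mean is the sample mean.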

The Pareto distribution
• If $X \sim Pa(\alpha, \lambda)$
  • CDF of $X$: $F(x) = 1 - \left(\frac{\lambda}{\lambda + x}\right)^{\alpha}$
  • PDF of $X$: $f(x) = \frac{\alpha \lambda^{\alpha}}{(\lambda + x)^{\alpha + 1}}$, $x > 0$
  • Mean: $\frac{\lambda}{\alpha - 1}$, $\alpha > 1$
  • Variance: $\frac{\alpha \lambda^2}{(\alpha - 1)^2 (\alpha - 2)}$, $\alpha > 2$
• The parameters can be estimated either by the MM or MLE
• However, because the variance of the Pareto distribution is relatively large, we typically use the MM to get the initial estimates of the 2 parameters and then use a more advanced method like MLE to get more precise estimates
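The MM step for the Pareto can be sketched in base R. The mean and variance formulas give Var/Mean² = $\alpha/(\alpha - 2)$, which is solved for $\alpha$ and then for $\lambda$. The sampler r_pareto is a helper written for this example (not a library function), and the true values $\alpha = 6$, $\lambda = 10$ are illustrative.

```r
set.seed(6)

# Inverse-transform sampler for Pa(alpha, lambda): solve U = (lambda/(lambda + x))^alpha for x
r_pareto <- function(n, alpha, lambda) lambda * (runif(n)^(-1 / alpha) - 1)

# Simulated sample; in practice x is the observed claims
x <- r_pareto(50000, alpha = 6, lambda = 10)

# Sample mean and (1/n) variance
m <- mean(x)
v <- mean((x - m)^2)

# Mean = lambda/(alpha - 1), Var = alpha*lambda^2 / ((alpha - 1)^2 * (alpha - 2))
# => Var/Mean^2 = alpha/(alpha - 2); solve for alpha, then lambda
alpha_mm  <- 2 * v / (v - m^2)
lambda_mm <- m * (alpha_mm - 1)

cat("MM estimates: alpha =", alpha_mm, ", lambda =", lambda_mm, "\n")
```

Because the sample variance of a heavy-tailed distribution is itself very noisy, these estimates can move noticeably from sample to sample, which is why they are treated as initial values.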

The Pareto distribution
• Simulation example with $\alpha = 3$, $\lambda = 4$ ($\mu = \frac{\lambda}{\alpha - 1} = 2$):
R code:
library(ggplot2)
library(actuar)  # provides rpareto() and dpareto()
# Draw 1000 values from Pa(3, 4)
x_pare = rpareto(1000, shape = 3, scale = 4)
# Density-scaled histogram with the Pa(3, 4) pdf overlaid
ggplot(data.frame(x_pare), aes(x_pare)) +
  geom_histogram(aes(y = ..density..), bins = 60, color = 'grey') +
  stat_function(fun = function(x) dpareto(x, shape = 3, scale = 4),
                color = "magenta", size = 1)

Variations on Pareto
• The Burr (or transformed Pareto) distribution
  • CDF: $F(x) = 1 - \left(\frac{\lambda}{\lambda + x^{\gamma}}\right)^{\alpha}$
  • Compare with Pareto CDF: $F(x) = 1 - \left(\frac{\lambda}{\lambda + x}\right)^{\alpha}$
  • The additional parameter $\gamma$ gives extra flexibility
  • Parameter estimation through MP; MLE will require non-linear optimization techniques
• The Generalized Pareto distribution $Pa(\alpha, \lambda, k)$
  • PDF: $f(x) = \frac{\Gamma(\alpha + k) \lambda^{\alpha} x^{k-1}}{\Gamma(\alpha) \Gamma(k) (\lambda + x)^{\alpha + k}}$, $x > 0$ ($k = 1$?)
  • $E[X] = \frac{k\lambda}{\alpha - 1}$, $Var(X) = \frac{\lambda^2 k (k + \alpha - 1)}{(\alpha - 1)^2 (\alpha - 2)}$
  • Parameter estimation through MM or MLE; MP cannot be used since the CDF is not in closed form

The Weibull distribution
• If $X \sim W(c, \gamma)$
  • CDF of $X$: $F(x) = 1 - e^{-c x^{\gamma}}$
  • Weibull tail: $\Pr(X > x) = e^{-c x^{\gamma}}$
  • Exponential tail: $\Pr(X > x) = e^{-\lambda x}$
  • Pareto tail: $\Pr(X > x) = \left(\frac{\lambda}{\lambda + x}\right)^{\alpha}$
  • Pareto tail is heavier than Exponential tail
    • $\gamma < 1$, Weibull tail weight is between Exponential and Pareto.
    • $\gamma = 1$, Weibull reduces to Exponential
    • $\gamma > 1$, Weibull tail weight is lighter than Exponential
  • PDF of $X$: $f(x) = c \gamma x^{\gamma - 1} e^{-c x^{\gamma}}$
• If we already know $\gamma$, then use MLE to estimate $c$
  • $\hat{c} = \frac{n}{\sum_{i=1}^{n} x_i^{\gamma}}$
• MP and MM methods could both be used
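Both estimation routes can be tried on simulated data. The sketch below uses the parameterization $F(x) = 1 - e^{-c x^{\gamma}}$ from this slide; R's rweibull() is parameterized as $F(x) = 1 - e^{-(x/\text{scale})^{\text{shape}}}$, so scale $= c^{-1/\gamma}$. The true values and the choice of the 25th and 75th percentiles are assumptions for the illustration.

```r
set.seed(7)

# True parameters in the parameterization F(x) = 1 - exp(-c * x^gamma)
c_true     <- pi / 16
gamma_true <- 2

# Convert to R's (shape, scale) parameterization: scale = c^(-1/gamma)
x <- rweibull(5000, shape = gamma_true, scale = c_true^(-1 / gamma_true))

# MLE of c when gamma is known: c_hat = n / sum(x_i^gamma)
c_mle <- length(x) / sum(x^gamma_true)

# MP for both parameters: F(q_p) = p gives log(-log(1 - p)) = log(c) + gamma * log(q_p)
q <- unname(quantile(x, probs = c(0.25, 0.75)))
gamma_mp <- (log(-log(0.25)) - log(-log(0.75))) / (log(q[2]) - log(q[1]))
c_mp     <- -log(0.75) / q[1]^gamma_mp

cat("MLE of c (gamma known):", c_mle, "\n")
cat("MP estimates: c =", c_mp, ", gamma =", gamma_mp, "\n")
```

MP works here because the Weibull CDF has a closed form, so each percentile equation can be inverted explicitly.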

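The Weibull distribution
• Simulation example with $\mu = 2$
  • $\gamma = 2$, $c = \frac{\pi}{16}$, so that $\mu = \Gamma\left(1 + \frac{1}{\gamma}\right) c^{-1/\gamma} = 2$, where $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$:
R code:
library(ggplot2)
# rweibull/dweibull use F(x) = 1 - exp(-(x/scale)^shape), so scale = c^(-1/gamma)
w_scale = (pi/16)^(-1/2)
# Draw 1000 values from W(c = pi/16, gamma = 2)
x_weib = rweibull(1000, shape = 2, scale = w_scale)
# Density-scaled histogram with the Weibull pdf overlaid
ggplot(data.frame(x_weib), aes(x_weib)) +
  geom_histogram(aes(y = ..density..), bins = 50, color = 'grey') +
  stat_function(fun = function(x) dweibull(x, shape = 2, scale = w_scale),
                color = "magenta", size = 1)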

Goodness-of-fit Test
• The $\chi^2$ test is defined for the hypotheses:
  • $H_0$: The data follow a specified distribution
  • $H_1$: The data do not follow the specified distribution
• Test statistic:
  $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$,
  where $O_i$ is the observed frequency for the $i$-th bin, and $E_i$ is the expected frequency for the $i$-th bin.
• The null hypothesis $H_0$ is rejected if
  $\chi^2 > \chi^2_{1-\alpha,\, k-p-1}$
  • $k$: number of bins; $p$: number of parameters estimated; $\alpha$: significance level
• Note: The fact that $\sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$ follows a chi-square distribution is due to Pearson's theorem. The proof is outside the scope of this course.
• Calculate $E_i$:
  • Discrete distribution: $E_i = Prob(Event_i) \cdot N$
  • Continuous distribution: $E_i = [F(b_u) - F(b_l)] \cdot N$, where $b_l$ and $b_u$ are the lower and upper boundaries of the $i$-th bin
Example: Goodness-of-fit Test
• A game involves rolling 3 identical dice. The game is repeated 100 times, with the following observed counts:
Number of Sixes ($N_6$)    Number of Rolls
0                          48
1                          35
2                          15
3                          3
Test whether the dice used in the game are fair.

Example: Goodness-of-fit Test
• Under the null hypothesis, the game follows $Bin(3, \frac{1}{6})$:
  • $P(N_6 = 0) = \left(\frac{5}{6}\right)^3 = 0.579$; $P(N_6 = 1) = 3 \cdot \left(\frac{5}{6}\right)^2 \cdot \frac{1}{6} = 0.347$
  • $P(N_6 = 2) = 3 \cdot \frac{5}{6} \cdot \left(\frac{1}{6}\right)^2 = 0.069$; $P(N_6 = 3) = \left(\frac{1}{6}\right)^3 = 0.005$
Number of Sixes ($N_6$)    Observed Counts ($O_i$)    Expected Counts ($E_i$)
0                          48                         57.9
1                          35                         34.7
2                          15                         6.9
3                          3                          0.5
• Test statistic:
  $\chi^2 = 23.367 > \chi^2_{1-\alpha,\, k-p-1} = \chi^2_{0.95,\, 4-0-1} = 7.815$
• Therefore, the null hypothesis is rejected.
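The test can be reproduced in a few lines of R. This sketch takes the observed and expected counts exactly as listed in the example; the statistic it prints may differ somewhat from the figure quoted above, for example through rounding of the expected counts, but the comparison with the critical value works the same way.

```r
# Observed counts of 0, 1, 2, 3 sixes, as listed in the example
observed <- c(48, 35, 15, 3)

# Expected counts under Bin(3, 1/6), as listed in the example's table
expected <- c(57.9, 34.7, 6.9, 0.5)

# Pearson statistic: sum over bins of (O_i - E_i)^2 / E_i
chi_sq <- sum((observed - expected)^2 / expected)

# Critical value with k = 4 bins, p = 0 estimated parameters, alpha = 0.05
k <- length(observed)
p <- 0
critical <- qchisq(0.95, df = k - p - 1)

cat("test statistic:", chi_sq, "\n")
cat("critical value:", critical, "\n")
cat("reject H0     :", chi_sq > critical, "\n")
```

Here p = 0 because the success probability 1/6 is specified by the null hypothesis rather than estimated from the data.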

Mixture distributions
• So far, we have assumed that losses are realizations of a random variable from ONE distribution
• Actual losses may be more complex
• Claims can be generated from:
  • More than one distribution: an insurance company may have losses from multiple populations, for example a high-risk population vs. a low-risk population.

Mixture distributions
• Claims can be generated from:
  • Other non-regular loss curves (e.g., piecewise linear/splines)
    • Example: Munich Re underwriting cycle & credit cycle:
      [Figure: Munich Re underwriting cycle and credit cycle. Source: deconstructingrisk.com]
  • A distribution whose parameters are characterized by other distributions (recall Bayesian statistics in L1)

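Example 1: Pareto
Example:
Assume the insurance loss $X \sim Exp(\Lambda)$, where the parameter $\Lambda$ is also a random variable with $\Lambda \sim \Gamma(\alpha, \delta)$.
• The gamma distribution for $\Lambda$ is called the mixing distribution
• The resulting distribution of $X$ is called the mixture distribution
• The PDF of $\Lambda$: $f_\Lambda(\lambda) = \frac{\delta^{\alpha}}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\delta\lambda}$, $\lambda > 0$
• The PDF of $X$:
  $f_X(x) = \int_0^\infty \lambda e^{-\lambda x} \cdot \frac{\delta^{\alpha}}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\delta\lambda} \, d\lambda$
  $= \frac{\delta^{\alpha}}{\Gamma(\alpha)} \int_0^\infty \lambda^{\alpha} e^{-(\delta + x)\lambda} \, d\lambda$
  $= \frac{\delta^{\alpha}}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha + 1)}{(x + \delta)^{\alpha + 1}} \int_0^\infty \frac{(x + \delta)^{\alpha + 1}}{\Gamma(\alpha + 1)} \lambda^{\alpha} e^{-(\delta + x)\lambda} \, d\lambda$
• $\frac{(x + \delta)^{\alpha + 1}}{\Gamma(\alpha + 1)} \lambda^{\alpha} e^{-(\delta + x)\lambda}$ is the pdf of $\Gamma(\alpha + 1, x + \delta)$, so the last integral equals 1
• $f_X(x) = \frac{\alpha \delta^{\alpha}}{(x + \delta)^{\alpha + 1}}$, $x > 0$
• This is the pdf of a Pareto distribution $Pa(\alpha, \delta)$
• When exponential losses are averaged using a Gamma mixing distribution, the resulting mixture distribution is a Pareto distribution.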
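The result can be checked by simulating the two-stage mechanism directly and comparing the empirical CDF with the Pareto CDF $F(x) = 1 - (\delta/(\delta + x))^{\alpha}$. A minimal sketch; the values $\alpha = 3$, $\delta = 4$ and the evaluation points are chosen only for the illustration.

```r
set.seed(8)

alpha <- 3   # shape of the gamma mixing distribution (illustrative value)
delta <- 4   # rate of the gamma mixing distribution (illustrative value)
n     <- 100000

# Stage 1: draw Lambda ~ Gamma(alpha, delta); Stage 2: draw X | Lambda ~ Exp(Lambda)
lambda_draws <- rgamma(n, shape = alpha, rate = delta)
losses       <- rexp(n, rate = lambda_draws)

# CDF of Pa(alpha, delta), the claimed mixture distribution
pareto_cdf <- function(x) 1 - (delta / (delta + x))^alpha

# Compare empirical and theoretical CDF at a few points
pts <- c(0.5, 1, 2, 5, 10)
comparison <- data.frame(point     = pts,
                         empirical = ecdf(losses)(pts),
                         pareto    = pareto_cdf(pts))
print(comparison)
```

The two CDF columns should agree to about two decimal places at this sample size.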
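Example 2: Generalized Pareto
Example:
Assume the insurance loss $X \sim \Gamma(k, \Lambda)$, where the parameter $\Lambda$ is also a random variable with $\Lambda \sim \Gamma(\alpha, \delta)$.
• The gamma distribution for $\Lambda$ is called the mixing distribution
• The resulting distribution of $X$ is called the mixture distribution
• The PDF of $\Lambda$: $f_\Lambda(\lambda) = \frac{\delta^{\alpha}}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\delta\lambda}$, $\lambda > 0$
• The PDF of $X$:
  $f_X(x) = \int_0^\infty \frac{\lambda^{k}}{\Gamma(k)} x^{k-1} e^{-\lambda x} \cdot \frac{\delta^{\alpha}}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\delta\lambda} \, d\lambda$
  $= \frac{\delta^{\alpha} x^{k-1}}{\Gamma(\alpha) \Gamma(k)} \int_0^\infty \lambda^{\alpha + k - 1} e^{-(\delta + x)\lambda} \, d\lambda$
  $= \frac{\delta^{\alpha} x^{k-1}}{\Gamma(\alpha) \Gamma(k)} \cdot \frac{\Gamma(\alpha + k)}{(x + \delta)^{\alpha + k}} \int_0^\infty \frac{(x + \delta)^{\alpha + k}}{\Gamma(\alpha + k)} \lambda^{\alpha + k - 1} e^{-(\delta + x)\lambda} \, d\lambda$
• $\frac{(x + \delta)^{\alpha + k}}{\Gamma(\alpha + k)} \lambda^{\alpha + k - 1} e^{-(\delta + x)\lambda}$ is the pdf of $\Gamma(\alpha + k, x + \delta)$, so the last integral equals 1
• $f_X(x) = \frac{\delta^{\alpha} x^{k-1}}{\Gamma(\alpha) \Gamma(k)} \cdot \frac{\Gamma(\alpha + k)}{(x + \delta)^{\alpha + k}}$, $x > 0$
• This is the pdf of a Generalized Pareto distribution $Pa(\alpha, \delta, k)$
• When Gamma losses are averaged using another Gamma mixing distribution, the resulting mixture distribution is a Generalized Pareto distribution.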
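The closed form can also be checked numerically by evaluating the mixing integral at a single point. This is a sketch: the parameter values and the evaluation point x0 are arbitrary choices for the illustration.

```r
alpha <- 3; delta <- 4; k <- 2   # illustrative parameter values
x0 <- 1.5                        # point at which the mixture density is evaluated

# Integrand: Gamma(k, lambda) density of X at x0 times the Gamma(alpha, delta) density of lambda
integrand <- function(lambda) {
  dgamma(x0, shape = k, rate = lambda) * dgamma(lambda, shape = alpha, rate = delta)
}
f_numeric <- integrate(integrand, lower = 0, upper = Inf)$value

# Closed-form Generalized Pareto density Pa(alpha, delta, k) at x0
f_closed <- delta^alpha * x0^(k - 1) * gamma(alpha + k) /
  (gamma(alpha) * gamma(k) * (x0 + delta)^(alpha + k))

cat("numerical integral:", f_numeric, "\n")
cat("closed form       :", f_closed, "\n")
```

Setting k <- 1 turns the conditional gamma into an exponential, and the same check then applies to the Pareto density of Example 1.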