Exam C Manual
Actuarial Models:
A Preparation for the Actuarial Exam C/4
Marcel B. Finan
Arkansas Tech University
© All Rights Reserved
Preliminary Draft
Last Updated
November 4, 2017
To my son
Amin
Preface
The flow of topics follows very closely that of Klugman et al. Loss Models:
From Data to Decisions. The lectures cover designated sections from this
book as suggested by the 2012 SOA Syllabus.
The recommended approach for using this manuscript is to read each sec-
tion, work on the embedded examples, and then try ALL the problems given
in the text. An answer key is provided by request. Email:[email protected].
This manuscript can be used for personal use or class use, but not for com-
mercial purposes. If you find any errors, I would appreciate hearing from
you: [email protected]
Marcel B. Finan
Russellville, Arkansas
February 15, 2013.
Contents
Preface
Actuarial Modeling
1 Understanding Actuarial Models
Generating New Distributions
18 Scalar Multiplication of Random Variables
19 Powers and Exponentiation of Random Variables
20 Continuous Mixing of Distributions
21 Frailty (Mixing) Models
22 Spliced Distributions
23 Limiting Distributions
24 The Linear Exponential Family of Distributions
48 Hypothesis Testing
Credibility Theory
75 Limited Fluctuation Credibility Approach: Full Credibility
76 Limited Fluctuation Credibility Approach: Partial Credibility
77 Greatest Accuracy Credibility Approach
78 Conditional Distributions and Expectation
79 Bayesian Credibility with Discrete Prior
80 Bayesian Credibility with Continuous Prior
81 Bühlmann Credibility Premium
82 The Bühlmann Model with Discrete Prior
83 The Bühlmann Model with Continuous Prior
84 The Bühlmann-Straub Credibility Model
85 Exact Credibility
86 Non-parametric Empirical Bayes Estimation for the Bühlmann Model
87 Non-parametric Empirical Bayes Estimation for the Bühlmann-Straub Model
88 Semiparametric Empirical Bayes Credibility Estimation
Bibliography
Index
Actuarial Modeling
(I) Deterministic Models. These are models that produce a unique set
of outputs for a given set of inputs such as the future value of a deposit
in a savings account. In these models, the inputs and outputs don’t have
associated probability weightings.
Reference [4] discusses in great detail the advantages and disadvantages
of stochastic (versus deterministic) modeling.
Example 1.1
Determine whether each of the models below is deterministic or stochastic.
(a) The monthly payment P on a home or a car loan.
(b) A modification of the model in (a) is P + ξ, where ξ is a random variable
introduced to account for the possibility of a missed payment.
Solution.
(a) In this model, the element of randomness is absent. This model is a
deterministic one.
(b) Because of the presence of the random variable ξ, the given model is
stochastic.
Model Calibration. Available data and existing techniques are used to cali-
brate a model.
Model Validation. Diagnostic tests are used to ensure the model meets its
objectives and adequately conforms to the data.
Selection of Models. Based on some preset criteria, the best model will
be selected among all valid models.
Practice Problems
Problem 1.1
After an actuary is hired, his or her annual salary progression is modeled
according to the formula S(t) = $45,000e^{0.06t}, where t is the number
of years of employment.
Problem 1.2
In the previous model, a random variable ξ is introduced: S(t) = $45,000e^{0.06t} + ξ.
Problem 1.3
Consider a model that depends on the movement of a stock market such as
the pricing of an option with an underlying stock.
Problem 1.4
Consider a model that involves the life expectancy of a policyholder.
Problem 1.5
Insurance companies use models to estimate their assets and liabilities.
A Review of Probability Related Results
An event is a subset of the sample space. For example, the event of rolling
an odd number with a die consists of three simple events {1, 3, 5}.
Example 2.1
Consider the random experiment of tossing a coin three times.
(a) Find the sample space of this experiment.
(b) Find the outcomes of the event of obtaining more than one head.
Solution.
We will use T for tail and H for head.
(a) The sample space is composed of eight simple events:
Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
(b) The event of obtaining more than one head is the set
{HHH, HHT, HTH, THH}.
Remark 2.1
The above definitions of intersection, union, and mutually exclusive can be
extended to any number of events.
Probability Axioms
Probability is a measure of how likely an event is to occur. It is a function Pr(·)
defined on the collection of all events (subsets) of a sample space Ω that
satisfies the Kolmogorov axioms:
Any function Pr that satisfies Axioms 1-3 will be called a probability mea-
sure.
Example 2.2
Consider the sample space Ω = {1, 2, 3}. Suppose that Pr({1, 3}) = 0.3
and Pr({2, 3}) = 0.8. Find Pr(1), Pr(2), and Pr(3). Is Pr a valid probability
measure?
Solution.
For Pr to be a probability measure we must have Pr(1) + Pr(2) + Pr(3) = 1.
But Pr({1, 3}) = Pr(1) + Pr(3) = 0.3. This implies that 0.3 + Pr(2) = 1
or Pr(2) = 0.7. Similarly, 1 = Pr({2, 3}) + Pr(1) = 0.8 + Pr(1) and so
Pr(1) = 0.2. It follows that Pr(3) = 1 − Pr(1) − Pr(2) = 1 − 0.2 − 0.7 = 0.1.
It is easily seen that Pr satisfies Axioms 1-3, and so Pr is a probability
measure.
Probability Trees
For all multistage experiments, the probability of the outcome along any
path of a tree diagram is equal to the product of all the probabilities along
the path.
Example 2.3
In a city council, 35% of the members are female, and the other 65% are
male. 70% of the males favor raising the city sales tax, while only 40% of the
females favor the increase. If a member of the council is selected at random,
what is the probability that he or she favors raising the sales tax?
Solution.
Figure 2.1 shows a tree diagram for this problem.
Figure 2.1
The first and third branches correspond to favoring the tax. We add their
probabilities.
Pr(tax) = 0.455 + 0.14 = 0.595
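The path rule used in this solution can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the dictionary names are hypothetical and not part of the text.

```python
# Branch probabilities taken from Example 2.3.
p_gender = {"female": 0.35, "male": 0.65}
p_favor_given = {"female": 0.40, "male": 0.70}

# Probability of each path = product of the probabilities along the path.
path_probs = {g: p_gender[g] * p_favor_given[g] for g in p_gender}

# The two "favor" paths are mutually exclusive, so their probabilities add.
p_tax = sum(path_probs.values())
print(path_probs, round(p_tax, 3))
```

Each entry of `path_probs` corresponds to one branch of the tree that ends in favoring the tax.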
Conditional Probability and Bayes Formula
Consider the question of finding the probability of an event A given that an-
other event B has occurred. Knowing that the event B has occurred causes
us to update the probabilities of other events in the sample space.
To illustrate, suppose you roll two dice of different colors: one red and
one green. You roll the dice one at a time. Our sample space has 36 outcomes.
The probability of getting two ones is 1/36. Now, suppose you were
told that the green die shows a one but know nothing about the red die.
What would be the probability of getting two ones? In this case, the answer
is 1/6. This shows that the probability of getting two ones changes if you have
2 A BRIEF REVIEW OF PROBABILITY 9
Example 2.4
Let A denote the event “an immigrant is male” and let B denote the event
“an immigrant is Brazilian”. In a group of 100 immigrants, suppose 60
are Brazilians, and suppose that 10 of the Brazilians are males. Find the
probability that if I pick a Brazilian immigrant, it will be a male, that is,
find Pr(A|B).
Solution.
Since 10 out of 100 in the group are both Brazilians and male, Pr(A ∩ B) =
10/100 = 0.1. Also, 60 out of the 100 are Brazilians, so Pr(B) = 60/100 = 0.6.
Hence, Pr(A|B) = 0.1/0.6 = 1/6.
It is often the case that we know the probabilities of certain events con-
ditional on other events, but what we would like to know is the “reverse”.
That is, given Pr(A|B) we would like to find Pr(B|A).
Example 2.5
A soccer match may be delayed because of bad weather. The probabilities
are 0.60 that there will be bad weather, 0.85 that the game will take place
if there is no bad weather, and 0.35 that the game will be played if there is
bad weather. What is the probability that the match will occur?
Solution.
Let A be the event that the game will be played and B the event that
there will be bad weather. We are given Pr(B) = 0.60, Pr(A|B^c) = 0.85,
and Pr(A|B) = 0.35. From Equation (2.1) we find
Solution.
We are given Pr(A) = 0.1, Pr(B) = 0.9, Pr(D|A) = 0.01, and Pr(D|B) =
0.05. We want to find Pr(A|D). Using Bayes’ formula we find
\[
\Pr(A \mid D) = \frac{\Pr(A \cap D)}{\Pr(D)}
= \frac{\Pr(D \mid A)\Pr(A)}{\Pr(D \mid A)\Pr(A) + \Pr(D \mid B)\Pr(B)}
= \frac{(0.01)(0.1)}{(0.01)(0.1) + (0.05)(0.9)} \approx 0.0217
\]
Formula 2.2 is a special case of the more general result:
where
Example 2.7
A survey is taken in Oklahoma, Kansas, and Arkansas. In Oklahoma, 50%
of surveyed support raising tax, in Kansas, 60% support a tax increase, and
in Arkansas only 35% favor the increase. Of the total population of the
three states, 40% live in Oklahoma, 25% live in Kansas, and 35% live in
Arkansas. Given that a surveyed person is in favor of raising taxes, what is
the probability that he/she lives in Kansas?
Solution.
Let L_I denote the event that a surveyed person lives in state I, where I
= OK, KS, AR. Let S denote the event that a surveyed person favors a tax
increase. We want to find Pr(L_KS | S). By Bayes’ formula we have
\[
\Pr(L_{KS} \mid S)
= \frac{\Pr(S \mid L_{KS})\Pr(L_{KS})}{\Pr(S \mid L_{OK})\Pr(L_{OK}) + \Pr(S \mid L_{KS})\Pr(L_{KS}) + \Pr(S \mid L_{AR})\Pr(L_{AR})}
= \frac{(0.6)(0.25)}{(0.5)(0.4) + (0.6)(0.25) + (0.35)(0.35)} \approx 0.3175
\]
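The same Bayes computation can be sketched for any finite partition. The function name `posterior` and the dictionary layout below are illustrative assumptions, not notation from the text.

```python
def posterior(priors, likelihoods):
    """Return Pr(L_i | S) for every state i using Bayes' formula."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(joint.values())  # Pr(S), by the law of total probability
    return {k: v / total for k, v in joint.items()}

# Numbers from Example 2.7.
priors = {"OK": 0.40, "KS": 0.25, "AR": 0.35}        # Pr(L_I)
likelihoods = {"OK": 0.50, "KS": 0.60, "AR": 0.35}   # Pr(S | L_I)

post = posterior(priors, likelihoods)
print(round(post["KS"], 4))
```

The posterior probabilities sum to 1 because the three states partition the surveyed population.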
Practice Problems
Problem 2.1
Consider the sample space of rolling a die. Let A be the event of rolling
an even number, B the event of rolling an odd number, and C the event of
rolling a 2.
Find
(a) A^c, B^c, and C^c.
(b) A ∪ B, A ∪ C, and B ∪ C.
(c) A ∩ B, A ∩ C, and B ∩ C.
(d) Which events are mutually exclusive?
Problem 2.2
If, for a given experiment, O_1, O_2, O_3, · · · is an infinite sequence of outcomes,
verify that
\[
\Pr(O_i) = \left(\frac{1}{2}\right)^{i}, \quad i = 1, 2, 3, \cdots
\]
is a probability measure.
Problem 2.3 ‡
An insurer offers a health plan to the employees of a large company. As
part of this plan, the individual employees may choose exactly two of the
supplementary coverages A, B, and C, or they may choose no supplementary
coverage. The proportions of the company’s employees that choose coverages
A, B, and C are 1/4, 1/3, and 5/12, respectively.
Problem 2.4
A toll has two crossing lanes. Let A be the event that the first lane
is busy, and let B be the event the second lane is busy. Assume that
Pr(A) = 0.2, Pr(B) = 0.3 and Pr(A ∩ B) = 0.06.
Problem 2.5
If a person visits a car service center, suppose that the probability that he
will have his oil changed is 0.44, the probability that he will have a tire
replacement is 0.24, the probability that he will have an air filter replacement
is 0.21, the probability that he will have oil changed and a tire replaced is
0.08, the probability that he will have oil changed and air filter changed is
0.11, the probability that he will have a tire and air filter replaced is 0.07,
and the probability that he will have oil changed, a tire replacement, and
an air filter changed is 0.03.
What is the probability that at least one of these things is done to the car?
Recall that
\[
\Pr(A \cup B \cup C) = \Pr(A) + \Pr(B) + \Pr(C) - \Pr(A \cap B) - \Pr(A \cap C) - \Pr(B \cap C) + \Pr(A \cap B \cap C)
\]
Problem 2.6 ‡
A survey of a group’s viewing habits over the last year revealed the following
information
(i) 28% watched gymnastics
(ii) 29% watched baseball
(iii) 19% watched soccer
(iv) 14% watched gymnastics and baseball
(v) 12% watched baseball and soccer
(vi) 10% watched gymnastics and soccer
(vii) 8% watched all three sports.
Find the probability of a viewer that watched none of the three sports during
the last year.
Problem 2.7 ‡
The probability that a visit to a primary care physician’s (PCP) office re-
sults in neither lab work nor referral to a specialist is 35% . Of those coming
to a PCP’s office, 30% are referred to specialists and 40% require lab work.
Problem 2.8 ‡
You are given Pr(A ∪ B) = 0.7 and Pr(A ∪ B^c) = 0.9.
Determine Pr(A).
Problem 2.9 ‡
Among a large group of patients recovering from shoulder injuries, it is found
that 22% visit both a physical therapist and a chiropractor, whereas 12%
visit neither of these. The probability that a patient visits a chiropractor
exceeds by 14% the probability that a patient visits a physical therapist.
Problem 2.10 ‡
In modeling the number of claims filed by an individual under an automobile
policy during a three-year period, an actuary makes the simplifying
assumption that for all integers n ≥ 0, p_{n+1} = (1/5) p_n, where p_n represents the
probability that the policyholder files n claims during the period.
Problem 2.11
An urn contains three red balls and two blue balls. You draw two balls
without replacement. Construct a probability tree diagram that represents
the various outcomes that can occur.
What is the probability that the first ball is red and the second ball is
blue?
Problem 2.12
Repeat the previous exercise but this time replace the first ball before draw-
ing the second.
Problem 2.13 ‡
A public health researcher examines the medical records of a group of 937
men who died in 1999 and discovers that 210 of the men died from causes
related to heart disease. Moreover, 312 of the 937 men had at least one par-
ent who suffered from heart disease, and, of these 312 men, 102 died from
causes related to heart disease.
Determine the probability that a man randomly selected from this group
died of causes related to heart disease, given that neither of his parents
suffered from heart disease.
Problem 2.14 ‡
An actuary is studying the prevalence of three health risk factors, denoted
by A, B, and C, within a population of women. For each of the three fac-
tors, the probability is 0.1 that a woman in the population has only this risk
factor (and no others). For any two of the three factors, the probability is
0.12 that she has exactly these two risk factors (but not the other). The
probability that a woman has all three risk factors, given that she has A
and B, is 1/3.
What is the probability that a woman has none of the three risk factors,
given that she does not have risk factor A?
Problem 2.15 ‡
An auto insurance company insures drivers of all ages. An actuary compiled
the following statistics on the company’s insured drivers:
Age of     Probability    Portion of Company’s
Driver     of Accident    Insured Drivers
16 - 20    0.06           0.08
21 - 30    0.03           0.15
31 - 65    0.02           0.49
66 - 99    0.04           0.28
A randomly selected driver that the company insures has an accident.
Problem 2.16 ‡
An insurance company issues life insurance policies in three separate cate-
gories: standard, preferred, and ultra-preferred. Of the company’s policy-
holders, 50% are standard, 40% are preferred, and 10% are ultra-preferred.
Each standard policyholder has probability 0.010 of dying in the next year,
each preferred policyholder has probability 0.005 of dying in the next year,
and each ultra-preferred policyholder has probability 0.001 of dying in the
next year.
A policyholder dies in the next year.
Problem 2.17 ‡
Upon arrival at a hospital’s emergency room, patients are categorized ac-
cording to their condition as critical, serious, or stable. In the past year:
Problem 2.18 ‡
A health study tracked a group of persons for five years. At the beginning
of the study, 20% were classified as heavy smokers, 30% as light smokers,
and 50% as nonsmokers.
Results of the study showed that light smokers were twice as likely as non-
smokers to die during the five-year study, but only half as likely as heavy
smokers.
A randomly selected participant from the study died over the five-year pe-
riod.
Problem 2.19 ‡
An actuary studied the likelihood that different types of drivers would be
involved in at least one collision during any one-year period. The results of
the study are presented below.
Type of        Percentage of    Probability of at
driver         all drivers      least one collision
Teen           8%               0.15
Young adult    16%              0.08
Midlife        45%              0.04
Senior         31%              0.05
Total          100%
Given that a driver has been involved in at least one collision in the past
year, what is the probability that the driver is a young adult driver?
Problem 2.20 ‡
A blood test indicates the presence of a particular disease 95% of the time
when the disease is actually present. The same test indicates the presence
of the disease 0.5% of the time when the disease is not present. One percent
of the population actually has the disease.
Calculate the probability that a person has the disease given that the test
indicates the presence of the disease.
Example 3.1
State whether the random variables are discrete, continuous, or mixed.
(a) A coin is tossed ten times. The random variable X is the number of
heads that are noted.
(b) A coin is tossed repeatedly. The random variable X is the number of
times needed to get the first head.
(c) X : (0, 1) −→ R defined by X(s) = 2s − 1.
3 A REVIEW OF RANDOM VARIABLES 19
(d) X : (0, 1) −→ R defined by X(s) = 2s − 1 for 0 < s < 1/2 and X(s) = 1
for 1/2 ≤ s < 1.
Solution.
(a) The support of X is {0, 1, 2, · · · , 10}. X is an example of a finite discrete
random variable.
(b) The support of X is N. X is an example of a countably infinite discrete
random variable.
(c) The support of X is the open interval (−1, 1). X is an example of a
continuous random variable.
(d) X is continuous on (0, 1/2) and discrete on [1/2, 1), so X is a mixed
random variable.
That is, a probability mass function (pmf) gives the probability that a discrete
random variable is exactly equal to some value. Note that the domain
of the pmf is the support of the corresponding random variable. The pmf can
be an equation, a table, or a graph that shows how probability is assigned
to possible values of the random variable.
Example 3.2
Suppose a variable X can take the values 1, 2, 3, or 4. The probabilities
associated with each outcome are described by the following table:
x 1 2 3 4
p(x) 0.1 0.3 0.4 0.2
Solution.
The probability histogram is shown in Figure 3.1
Figure 3.1
Example 3.3
A committee of m is to be selected from a group consisting of x men and y
women. Let X be the random variable that represents the number of men
in the committee. Find p(n) for 0 ≤ n ≤ m.
Solution.
For 0 ≤ n ≤ m, we have
\[
p(n) = \frac{\binom{x}{n}\binom{y}{m-n}}{\binom{x+y}{m}}
\]
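As a quick check of this pmf, the sketch below evaluates it with Python's `math.comb`. The function name `committee_pmf` and the group sizes (6 men, 4 women, committee of 3) are hypothetical values chosen only for illustration.

```python
from math import comb

def committee_pmf(n, m, x, y):
    """Pr(exactly n men on a committee of m chosen from x men and y women)."""
    if n < 0 or n > m:
        return 0.0
    # comb returns 0 when more items are requested than are available.
    return comb(x, n) * comb(y, m - n) / comb(x + y, m)

# Illustrative sizes, not from the text.
x_men, y_women, m_size = 6, 4, 3
pmf = [committee_pmf(n, m_size, x_men, y_women) for n in range(m_size + 1)]
print(pmf, sum(pmf))
```

The probabilities over n = 0, ..., m add up to 1, as a pmf must.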
That is, areas under the probability density function represent probabilities
as illustrated in Figure 3.2.
Figure 3.2
Now, if we let a = b in the previous formula we find
\[
\Pr(X = a) = \int_a^a f(x)\,dx = 0.
\]
and
Example 3.4
Suppose that the function f(t) defined below is the density function of some
random variable X.
\[
f(t) = \begin{cases} e^{-t}, & t \ge 0, \\ 0, & t < 0. \end{cases}
\]
Solution.
\[
\Pr(-10 \le X \le 10) = \int_{-10}^{10} f(t)\,dt = \int_{-10}^{0} f(t)\,dt + \int_{0}^{10} f(t)\,dt
= \int_{0}^{10} e^{-t}\,dt = \left[-e^{-t}\right]_{0}^{10} = 1 - e^{-10}
\]
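The closed form can be compared with a numerical integral of the density. The midpoint-rule helper below is a generic sketch, not a method from the text, and its name and step count are arbitrary choices.

```python
import math

def f(t):
    """Density from Example 3.4: e^(-t) for t >= 0, and 0 for t < 0."""
    return math.exp(-t) if t >= 0 else 0.0

def midpoint_integral(func, a, b, steps=200_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

approx = midpoint_integral(f, -10.0, 10.0)
exact = 1 - math.exp(-10)
print(approx, exact)
```

The area under f over (−10, 0) contributes nothing, which is why only the integral over (0, 10) survives in the solution.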
F(t) = Pr(X ≤ t)
that is, F(t) is the probability that the variable X takes a value
less than or equal to t.
Example 3.5
Given the following pmf
\[
p(x) = \begin{cases} 1, & \text{if } x = a \\ 0, & \text{otherwise.} \end{cases}
\]
Solution.
A formula for F(x) is given by
\[
F(x) = \begin{cases} 0, & \text{if } x < a \\ 1, & \text{otherwise} \end{cases}
\]
Its graph is given in Figure 3.3
Figure 3.3
For discrete random variables the cumulative distribution function will al-
ways be a step function with jumps at each value of x that has probability
greater than 0. Note that the value of F (x) is assigned to the top of the jump.
(b)
\[
F(x) = \int_{-\infty}^{x} \frac{e^{-y}}{(1 + e^{-y})^{2}}\,dy
= \left[\frac{1}{1 + e^{-y}}\right]_{-\infty}^{x} = \frac{1}{1 + e^{-x}}
\]
Next, we list the properties of the cumulative distribution function F (x) for
any random variable X.
Theorem 3.1
The cumulative distribution function of a random variable X satisfies the
following properties:
(a) 0 ≤ F (x) ≤ 1.
(b) F (x) is a non-decreasing function, i.e. if a < b then F (a) ≤ F (b).
(c) F (x) → 0 as x → −∞ and F (x) → 1 as x → ∞.
(d) F is right-continuous.
Example 3.7
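If the distribution function of X is given by
\[
F(x) = \begin{cases}
0 & x < 0 \\
\frac{1}{16} & 0 \le x < 1 \\
\frac{5}{16} & 1 \le x < 2 \\
\frac{11}{16} & 2 \le x < 3 \\
\frac{15}{16} & 3 \le x < 4 \\
1 & x \ge 4
\end{cases}
\]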
Solution.
Using 3.1, we get p(0) = 1/16, p(1) = 1/4, p(2) = 3/8, p(3) = 1/4, and p(4) = 1/16, and
0 otherwise.
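The pmf values above are the sizes of the jumps of F. A minimal sketch with exact fractions follows; the variable names are illustrative.

```python
from fractions import Fraction as Fr

# Jump points of F and the value F takes from each point onward (Example 3.7).
cdf_steps = [(0, Fr(1, 16)), (1, Fr(5, 16)), (2, Fr(11, 16)),
             (3, Fr(15, 16)), (4, Fr(1))]

pmf = {}
previous = Fr(0)          # F(x) = 0 to the left of the first jump
for x, F_x in cdf_steps:
    pmf[x] = F_x - previous   # p(x) = size of the jump of F at x
    previous = F_x

print(pmf)
```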
It follows from Theorem 3.1 that the survival function S(x) of any random
variable satisfies the properties: S(−∞) = 1, S(∞) = 0, S(x) is
right-continuous, and S(x) is nonincreasing.
Remark 3.1
For a discrete random variable, the survival function need not be left-
continuous, that is, it is possible for its graph to jump down. When it
jumps, the value is assigned to the bottom of the jump.
Example 3.8 ‡
For watches produced by a certain manufacturer:
(i) Lifetimes follow a single-parameter Pareto distribution with α > 1 and
θ = 4.
(ii) The expected lifetime of a watch is 8 years.
Calculate the probability that the lifetime of a watch is at least 6 years.
Solution.
From Table C, we have
\[
E(X) = \frac{\alpha\theta}{\alpha - 1} = \frac{4\alpha}{\alpha - 1} = 8 \implies \alpha = 2.
\]
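With α in hand, the requested probability is a survival-function evaluation. The sketch below assumes the standard single-parameter Pareto survival function S(x) = (θ/x)^α for x > θ; the function name is hypothetical.

```python
def single_parameter_pareto_survival(x, alpha, theta):
    """S(x) = (theta / x)^alpha for x > theta, and 1 for x <= theta (assumed form)."""
    return 1.0 if x <= theta else (theta / x) ** alpha

theta = 4
mean = 8
# Solving alpha * theta / (alpha - 1) = mean for alpha.
alpha = mean / (mean - theta)

prob_at_least_6 = single_parameter_pareto_survival(6, alpha, theta)
print(alpha, prob_at_least_6)
```

Under that assumption the probability that a watch lasts at least 6 years is (4/6)^2 = 4/9.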
Example 3.9
Show that
\[
h(x) = -\frac{S'(x)}{S(x)} = -\frac{d}{dx}[\ln S(x)]. \tag{3.2}
\]
Solution.
The equation follows from
\[
f(x) = -S'(x) \quad\text{and}\quad \frac{d}{dx}[\ln S(x)] = \frac{S'(x)}{S(x)}.
\]
Example 3.10
Find the hazard rate function of a random variable with pdf given by
f(x) = ae^{−ax} for x ≥ 0, where a > 0.
Solution.
We have
\[
h(x) = \frac{f(x)}{S(x)} = \frac{ae^{-ax}}{e^{-ax}} = a
\]
Example 3.11
Let X be a random variable with support [0, ∞). Show that
\[
S(x) = e^{-\Lambda(x)}
\quad\text{where}\quad
\Lambda(x) = \int_0^x h(s)\,ds.
\]
Solution.
Integrating equation (3.2) from 0 to x, we have
\[
\int_0^x h(s)\,ds = -\int_0^x \frac{d}{ds}[\ln S(s)]\,ds = \ln S(0) - \ln S(x) = \ln 1 - \ln S(x) = -\ln S(x).
\]
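The identity S(x) = e^{−Λ(x)} can be checked numerically. The sketch below uses the constant hazard of Example 3.10 and, as a second illustration, a linear hazard h(s) = 2s; the rate a = 0.5, the point x = 3, the linear hazard, and the helper name are all assumptions made for the example.

```python
import math

def cumulative_hazard(h, x, steps=10_000):
    """Approximate Lambda(x), the integral of h from 0 to x (midpoint rule)."""
    step = x / steps
    return step * sum(h((i + 0.5) * step) for i in range(steps))

a = 0.5   # illustrative rate, not from the text
x = 3.0   # illustrative evaluation point

# Constant hazard h(s) = a, so S(x) should equal e^(-a x).
s_constant = math.exp(-cumulative_hazard(lambda s: a, x))

# Linear hazard h(s) = 2s, so Lambda(x) = x^2 and S(x) = e^(-x^2).
s_linear = math.exp(-cumulative_hazard(lambda s: 2 * s, x))

print(s_constant, math.exp(-a * x))
print(s_linear, math.exp(-x ** 2))
```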
\[
E(X \mid Y) = \frac{E(X I_Y)}{\Pr(Y)}. \tag{3.3}
\]
Example 3.12
Let X and Y be two random variables. Find a formula for E[(X − d)^k | X > d]
in the
(a) discrete case
(b) continuous case.
Solution.
(a) We have
\[
E[(X - d)^k \mid X > d] = \frac{\sum_{x_j > d} (x_j - d)^k\, p(x_j)}{\Pr(X > d)}.
\]
(b) We have
\[
E[(X - d)^k \mid X > d] = \frac{1}{\Pr(X > d)} \int_d^\infty (x - d)^k f_X(x)\,dx
\]
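The discrete formula in part (a) translates directly into code. In the sketch below the function name is hypothetical, and the pmf is borrowed from the table of Example 3.2 purely as sample input; d = 2 and k = 1 are arbitrary choices.

```python
def conditional_excess_moment(pmf, d, k):
    """E[(X - d)^k | X > d] for a discrete pmf given as {value: probability}."""
    tail = {x: p for x, p in pmf.items() if x > d}
    pr_tail = sum(tail.values())          # Pr(X > d)
    if pr_tail == 0:
        raise ValueError("Pr(X > d) is zero, so the conditional moment is undefined")
    return sum((x - d) ** k * p for x, p in tail.items()) / pr_tail

# Sample input: the pmf tabulated in Example 3.2.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}
result = conditional_excess_moment(pmf, d=2, k=1)
print(result)
```

Here only the values 3 and 4 exceed d, so the result is (1(0.4) + 2(0.2))/0.6 = 4/3.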
Example 3.13
You are given the following information
x_i    0     1     2     3    4    5
w_i    512   307   123   41   11   6
Solution.
The weighted mean is
Practice Problems
Problem 3.1
State whether the random variables are discrete, continuous, or mixed.
(a) A coin is tossed twice. Let X be the number of heads in the two tosses.
(b) An urn contains one red ball and one green ball. Let X be the number
of picks necessary in getting the first red ball.
(c) X is a random number in the interval [4, 7].
(d) X : R −→ R such that X(s) = s if s is irrational and X(s) = 1 if s is
rational.
Problem 3.2
Toss a pair of fair dice. Let X denote the sum of the dots on the two faces.
Problem 3.3
Consider the random variable X : {S, F } −→ R defined by X(S) = 1 and
X(F ) = 0. Suppose that p = Pr(X = 1).
Problem 3.4 ‡
The loss due to a fire in a commercial building is modeled by a random
variable X with density function
\[
f(x) = \begin{cases} 0.005(20 - x) & 0 < x < 20 \\ 0 & \text{otherwise.} \end{cases}
\]
Given that a fire loss exceeds 8, what is the probability that it exceeds 16 ?
Problem 3.5 ‡
The lifetime of a machine part has a continuous distribution on the interval
(0, 40) with probability density function f, where f(x) is proportional to
(10 + x)^{−2}.
Calculate the probability that the lifetime of the machine part is less than
6.
Problem 3.6 ‡
A group insurance policy covers the medical claims of the employees of a
small company. The value, V, of the claims made in one year is described
by
V = 100000Y
where Y is a random variable with density function
where k is a constant.
Problem 3.7 ‡
An insurance policy pays for a random loss X subject to a deductible of
C, where 0 < C < 1. The loss amount is modeled as a continuous random
variable with density function
\[
f(x) = \begin{cases} 2x & 0 < x < 1 \\ 0 & \text{otherwise.} \end{cases}
\]
Given a random loss X, the probability that the insurance payment is less
than 0.5 is equal to 0.64 .
Calculate C.
Problem 3.8
Let X be a continuous random variable with pdf
\[
f(x) = \begin{cases} \alpha x e^{-x}, & x > 0 \\ 0, & x \le 0. \end{cases}
\]
Problem 3.9
Consider the following probability distribution
x 1 2 3 4
p(x) 0.25 0.5 0.125 0.125
Problem 3.10
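Find the distribution functions corresponding to the following density
functions:
\[
\text{(a)}\quad f(x) = \frac{a - 1}{(1 + x)^{a}}, \quad 0 < x < \infty, \text{ and } 0 \text{ otherwise.}
\]
\[
\text{(b)}\quad f(x) = k\alpha x^{\alpha - 1} e^{-kx^{\alpha}}, \quad k > 0,\ \alpha,\ 0 < x < \infty, \text{ and } 0 \text{ otherwise.}
\]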
Problem 3.11
Let X be a random variable with pmf
\[
p(n) = \frac{1}{3}\left(\frac{2}{3}\right)^{n}, \quad n = 0, 1, 2, \cdots.
\]
3 3
Problem 3.12
Given the pdf of a continuous random variable X.
\[
f(x) = \begin{cases} \frac{1}{5} e^{-x/5} & \text{if } x \ge 0 \\ 0 & \text{otherwise.} \end{cases}
\]
Problem 3.13
A random variable X has the cumulative distribution function
\[
F(x) = \frac{e^{x}}{e^{x} + 1}.
\]
Problem 3.14
Consider an age-at-death random variable X with survival distribution de-
fined by
\[
S(x) = \frac{1}{10}(100 - x)^{1/2}, \quad 0 \le x \le 100.
\]
Problem 3.15
Consider an age-at-death random variable X with survival distribution de-
fined by
S(x) = e^{−0.34x}, x ≥ 0.
Problem 3.16
Consider an age-at-death random variable X with survival distribution
S(x) = 1 − x^2/100 for x ≥ 0.
Find F (x).
Problem 3.17
Consider an age-at-death random variable X. The survival distribution is
given by S(x) = 1 − x/100 for 0 ≤ x ≤ 100 and 0 for x > 100.
(a) Find the probability that a person dies before reaching the age of 30.
(b) Find the probability that a person lives more than 70 years.
Problem 3.18
An age-at-death random variable has a survival function
\[
S(x) = \frac{1}{10}(100 - x)^{1/2}, \quad 0 \le x \le 100
\]
and 0 otherwise.
Problem 3.19
Consider an age-at-death random variable X with force of mortality h(x) =
µ > 0.
Problem 3.20
Let
\[
F(x) = 1 - \left(1 - \frac{x}{120}\right)^{1/6}, \quad 0 \le x \le 120.
\]
Find h(40).
Example 4.1
Let X be a continuous random variable with pdf given by f(x) = (3/8)x^2 for
0 < x < 2 and 0 otherwise. Find the second central moment of X.
Solution.
We first find the mean of X. We have
\[
E(X) = \int_0^2 x f(x)\,dx = \int_0^2 \frac{3}{8}x^{3}\,dx = \left.\frac{3}{32}x^{4}\right|_0^2 = 1.5.
\]
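The second central moment then follows from the usual identity Var(X) = E(X^2) − [E(X)]^2. The sketch below evaluates the raw moments of this density exactly, using the antiderivative of (3/8)x^{k+2}; the helper name `raw_moment` is illustrative.

```python
from fractions import Fraction as Fr

def raw_moment(k):
    """E(X^k) for f(x) = (3/8) x^2 on (0, 2): (3/8) * 2^(k+3) / (k+3)."""
    return Fr(3, 8) * Fr(2 ** (k + 3), k + 3)

mean = raw_moment(1)
second_central_moment = raw_moment(2) - mean ** 2
print(mean, second_central_moment)
```

The check `raw_moment(0) == 1` confirms that f integrates to 1 over (0, 2).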
4 RAW AND CENTRAL MOMENTS 35
The importance of moments is that they are used to define quantities that
characterize the shape of a distribution. These quantities which will be dis-
cussed below are: skewness, kurtosis and coefficient of variation.
That is, γ1 is the ratio of the third central moment to the cube of the
standard deviation. Equivalently, γ1 is the third central moment of the
standardized variable
\[
X^{*} = \frac{X - \mu}{\sigma}.
\]
If γ1 is close to zero, the distribution is approximately symmetric about its
mean, as is the normal distribution. A positively skewed distribution has a “tail”
which is pulled in the positive direction. A negatively skewed distribution
has a “tail” which is pulled in the negative direction (see Figure 4.1).
Figure 4.1
Example 4.2
A random variable X has the following pmf:
Solution.
We first find the mean of X :
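\[
\mu = E(X) = 120 \times \frac{1}{4} + 122 \times \frac{1}{12} + 124 \times \frac{1}{6} + 150 \times \frac{1}{12} + 167 \times \frac{1}{12} + 245 \times \frac{1}{3} = \frac{2027}{12}.
\]
The second raw moment is
\[
E(X^2) = 120^2 \times \frac{1}{4} + 122^2 \times \frac{1}{12} + 124^2 \times \frac{1}{6} + 150^2 \times \frac{1}{12} + 167^2 \times \frac{1}{12} + 245^2 \times \frac{1}{3} = \frac{379325}{12}.
\]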
Thus, the variance of X is
Thus,
\[
\gamma_1 = \frac{93270.81134}{55.475908183^{3}} = 0.5463016252
\]
Example 4.3
Let X be a random variable with density f(x) = e^{−x} on (0, ∞) and 0
otherwise. Find the coefficient of skewness of X.
Solution.
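Since
\[
\begin{aligned}
E(X) &= \int_0^\infty x e^{-x}\,dx = \left[-e^{-x}(1 + x)\right]_0^\infty = 1 \\
E(X^2) &= \int_0^\infty x^2 e^{-x}\,dx = \left[-e^{-x}(x^2 + 2x + 2)\right]_0^\infty = 2 \\
E(X^3) &= \int_0^\infty x^3 e^{-x}\,dx = \left[-e^{-x}(x^3 + 3x^2 + 6x + 6)\right]_0^\infty = 6
\end{aligned}
\]
we find
\[
\gamma_1 = \frac{6 - 3(1)(2) + 2(1)^3}{(2 - 1^2)^{3/2}} = 2
\]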
Coefficient of Kurtosis
The fourth central moment, µ4 , is called the kurtosis and is a measure of
peakedness/flatness of a distribution with respect to the normal distribution.
Figure 4.2
Example 4.4
A random variable X has the following pmf:
Solution.
We first find the fourth central moment.
Thus,
\[
\gamma_2 = \frac{13693826.62}{55.475908183^{4}} = 1.44579641
\]
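The skewness and kurtosis figures can be reproduced from the pmf directly. The sketch below assumes the pmf is the one whose values and probabilities appear in the mean calculation of Example 4.2 (120, 122, 124, 150, 167, 245 with probabilities 1/4, 1/12, 1/6, 1/12, 1/12, 1/3), which the shared standard deviation suggests is also used here; the helper name is illustrative.

```python
from fractions import Fraction as Fr

# Assumed pmf: values and probabilities read off the solution of Example 4.2.
values = [120, 122, 124, 150, 167, 245]
probs = [Fr(1, 4), Fr(1, 12), Fr(1, 6), Fr(1, 12), Fr(1, 12), Fr(1, 3)]

mu = sum(x * p for x, p in zip(values, probs))

def central_moment(k):
    """k-th central moment E[(X - mu)^k], computed exactly."""
    return sum((x - mu) ** k * p for x, p in zip(values, probs))

sigma = float(central_moment(2)) ** 0.5
gamma1 = float(central_moment(3)) / sigma ** 3   # coefficient of skewness
gamma2 = float(central_moment(4)) / sigma ** 4   # coefficient of kurtosis
print(mu, sigma, gamma1, gamma2)
```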
Example 4.5
Find the coefficient of kurtosis of the random variable X with density func-
tion f (x) = 1 on (0, 1) and 0 elsewhere.
Solution.
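Since
\[
E(X^k) = \int_0^1 x^k\,dx = \frac{1}{k + 1},
\]
we obtain
\[
\gamma_2 = \frac{\frac{1}{5} - 4\left(\frac{1}{4}\right)\left(\frac{1}{2}\right) + 6\left(\frac{1}{3}\right)\left(\frac{1}{2}\right)^{2} - 3\left(\frac{1}{2}\right)^{4}}{\left(\frac{1}{3} - \frac{1}{4}\right)^{2}} = \frac{9}{5}
\]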
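The expansion of the fourth central moment in terms of raw moments can be verified with exact arithmetic. The helper name `raw` is an illustrative choice.

```python
from fractions import Fraction as Fr

def raw(k):
    """E(X^k) = 1 / (k + 1) for X uniform on (0, 1)."""
    return Fr(1, k + 1)

mu = raw(1)
variance = raw(2) - mu ** 2
# Fourth central moment expanded in raw moments.
mu4 = raw(4) - 4 * raw(3) * mu + 6 * raw(2) * mu ** 2 - 3 * mu ** 4
gamma2 = mu4 / variance ** 2
print(mu4, variance, gamma2)
```

The fourth central moment is 1/80 and the variance is 1/12, which gives the ratio 9/5.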
Coefficient of Variation
Some combinations of the raw moments and central moments are also
commonly used. One such combination is the coefficient of variation,
denoted by CV(X), of a random variable X, which is defined as the ratio of
the standard deviation to the mean:
\[
CV(X) = \frac{\sigma}{\mu}, \quad \mu = \mu'_1 = E(X).
\]
Practice Problems
Problem 4.1
Consider n independent trials. Let X denote the number of successes in n
trials. We call X a binomial random variable. Its pmf is given by
\[
p(r) = C(n, r)\,p^{r}(1 - p)^{n - r}
\]
where p is the probability of a success.
(a) Show that E(X) = np and E[X(X − 1)] = n(n − 1)p^2. Hint:
\[
(a + b)^n = \sum_{k=0}^{n} C(n, k)\,a^{k} b^{n-k}.
\]
(b) Find the variance of X.
Problem 4.2
A random variable X is said to be a Poisson random variable with param-
eter λ > 0 if its probability mass function has the form
\[
p(k) = e^{-\lambda}\frac{\lambda^{k}}{k!}, \quad k = 0, 1, 2, \cdots
\]
where λ indicates the average number of successes per unit time or space.
Problem 4.5
An exponential random variable with parameter λ > 0 is a random variable
with pdf
\[
f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}
\]
(a) Show that E(X) = 1/λ and E(X^2) = 2/λ^2.
(b) Find Var(X).
Problem 4.6
A Gamma random variable with parameters α > 0 and θ > 0 has a pdf
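\[
f(x) = \begin{cases} \frac{1}{\theta^{\alpha}\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\theta} & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}
\]
where
\[
\Gamma(\alpha) = \int_0^\infty e^{-y} y^{\alpha - 1}\,dy, \qquad \Gamma(\alpha) = (\alpha - 1)\Gamma(\alpha - 1) \text{ for } \alpha > 1.
\]
Show:
(a) E(X) = αθ
(b) Var(X) = αθ^2.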
Problem 4.7
Let X be a continuous random variable with pdf given by f(x) = (3/8)x^2 for
0 ≤ x ≤ 2 and 0 otherwise.
Problem 4.8
A random variable X has the following pmf:
x       120    122    124    150    167    245
p(x)    1/4    1/12   1/6    1/12   1/12   1/3
Problem 4.9
A random variable X has the following pmf:
x       120    122    124    150    167    245
p(x)    1/4    1/12   1/6    1/12   1/12   1/3
Problem 4.10
Compute the coefficient of skewness of a uniform random variable, X, on
[0, 1].
Problem 4.11
Let X be a random variable with density f(x) = e^{−x} for x > 0 and 0 otherwise.
Problem 4.12
A random variable X has the following pmf:
x       120    122    124    150    167    245
p(x)    1/4    1/12   1/6    1/12   1/12   1/3
Problem 4.13
Let X be a continuous random variable with density function f(x) = Ax^B e^{−Cx}
for x ≥ 0 and 0 otherwise. The parameters A, B, and C satisfy
\[
A = \frac{1}{\int_0^\infty x^{B} e^{-Cx}\,dx}, \quad B \ge -\frac{1}{2}, \quad C > 0.
\]
Show that
\[
E(X^n) = \frac{B + n}{C}\,E(X^{n-1}).
\]
Problem 4.14
Let X be a continuous random variable with density function f(x) = Ax^B e^{−Cx}
for x ≥ 0 and 0 otherwise. The parameters A, B, and C satisfy
\[
A = \frac{1}{\int_0^\infty x^{B} e^{-Cx}\,dx}, \quad B \ge -\frac{1}{2}, \quad C > 0.
\]
Problem 4.15
Let X be a continuous random variable with density function f(x) = Ax^B e^{−Cx}
for x ≥ 0 and 0 otherwise. The parameters A, B, and C satisfy
\[
A = \frac{1}{\int_0^\infty x^{B} e^{-Cx}\,dx}, \quad B \ge -\frac{1}{2}, \quad C > 0.
\]
Problem 4.16
Let X be a continuous random variable with density function f(x) = Ax^B e^{−Cx}
for x ≥ 0 and 0 otherwise. The parameters A, B, and C satisfy
\[
A = \frac{1}{\int_0^\infty x^{B} e^{-Cx}\,dx}, \quad B \ge -\frac{1}{2}, \quad C > 0.
\]
Problem 4.17
You are given: E(X) = 2, CV(X) = 2, and µ′_3 = 136. Calculate γ1.
Problem 4.18
Let X be a random variable with pdf f (x) = 0.005x for 0 ≤ x ≤ 20 and 0
otherwise.
Problem 4.19
Let X be the Gamma random variable with pdf
\[
f(x) = \frac{1}{\theta^{\alpha}\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\theta}
\]
for x > 0 and 0 otherwise. Suppose E(X) = 8 and γ1 = 1.
Problem 4.20
Let X be a Pareto random variable in one parameter and with a pdf
f(x) = a/x^{a+1}, x ≥ 1 and 0 otherwise.
(a) Show that E(X^k) = a/(a − k) for 0 < k < a.
(b) Find the coefficient of variation of X.
Problem 4.21
For the random variable X you are given:
(i) E(X) = 4
(ii) Var(X) = 64
(iii) E(X^3) = 15.
Problem 4.22
Let X be a Pareto random variable with two parameters α and θ, i.e., X
has the pdf
\[ f(x) = \frac{\alpha\theta^{\alpha}}{(x+\theta)^{\alpha+1}}, \quad \alpha > 1,\ \theta > 0,\ x > 0 \]
and 0 otherwise.
Problem 4.23
Let X be a Pareto random variable with two parameters α and θ, i.e., X
has the pdf
\[ f(x) = \frac{\alpha\theta^{\alpha}}{(x+\theta)^{\alpha+1}}, \quad \alpha > 1,\ \theta > 0,\ x > 0 \]
and 0 otherwise.
Problem 4.24
Let X be the Gamma random variable with pdf
\[ f(x) = \frac{1}{\theta^{\alpha}\Gamma(\alpha)}\,x^{\alpha-1}e^{-x/\theta} \]
for $x > 0$ and 0 otherwise. Suppose $CV(X) = 1$.
Determine $\gamma_1$.
Problem 4.25
You are given the following times of first claim for five randomly selected
auto insurance policies observed from time t = 0 :
1 2 3 4 5
Example 5.1 ‡
You are given the following for a sample of five observations from a bivariate
distribution:
(i)
x y
1 4
2 2
4 3
5 6
6 4
Solution.
We have
Now, since E(X) and E(Y ) are fixed, we want to create a new bivariate
distribution from the given one with maximum E(XY ). Clearly, this occurs
if largest values of X are paired with largest values of Y. Hence, the following
bivariate distribution has the same marginal distributions as the original
bivariate distribution:
46 A REVIEW OF PROBABILITY RELATED RESULTS
x y
6 6
5 4
4 4
2 3
1 2
Example 5.2 ‡
You are given the following graph of cumulative distribution functions:
Determine the difference between the mean of the lognormal model and
the mean of the data.
Solution.
The empirical distribution is given by
Now, from the graph we see that the 20th and 60th percentiles of the log-
normal distribution are 10 and 100 respectively. That is,
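\[ 0.2 = \Phi\left(\frac{\ln 10 - \mu}{\sigma}\right) \quad\text{and}\quad 0.6 = \Phi\left(\frac{\ln 100 - \mu}{\sigma}\right). \]
Solving this system, we find µ = 4.0771 and σ = 2.1125. Thus, the mean of
the lognormal distribution is (Table C)
\[ e^{\mu + 0.5\sigma^2} = e^{4.0771 + 0.5(2.1125)^2} = 549.18. \]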
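The fit in Example 5.2 can be reproduced numerically. The following is a minimal sketch, assuming only the two percentile readings quoted in the solution (10 at the 20th percentile and 100 at the 60th). The function names are illustrative and not from the text, and the rounded quantiles −0.84 and 0.25 are an assumption about the table lookup used.

```python
import math
from statistics import NormalDist

def fit_lognormal(x1, p1, x2, p2, z1=None, z2=None):
    """Solve Phi((ln x - mu) / sigma) = p at two points for (mu, sigma)."""
    # Standard normal quantiles; callers may pass table-rounded values instead.
    z1 = NormalDist().inv_cdf(p1) if z1 is None else z1
    z2 = NormalDist().inv_cdf(p2) if z2 is None else z2
    sigma = (math.log(x2) - math.log(x1)) / (z2 - z1)
    mu = math.log(x1) - sigma * z1
    return mu, sigma

def lognormal_mean(mu, sigma):
    """Mean of a lognormal distribution: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + 0.5 * sigma ** 2)

# Exact normal quantiles.
mu, sigma = fit_lognormal(10, 0.2, 100, 0.6)

# Assumed table-rounded quantiles (-0.84 and 0.25).
mu_tab, sigma_tab = fit_lognormal(10, 0.2, 100, 0.6, z1=-0.84, z2=0.25)
mean_tab = lognormal_mean(mu_tab, sigma_tab)
```

With the rounded quantiles the sketch lands close to µ = 4.0771 and σ = 2.1125; with exact quantiles the parameters shift slightly, which is worth keeping in mind because the lognormal mean is sensitive to σ.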
Example 5.3
In a fitness club monthly new memberships are recorded in the table below.
January February March April May June
100 102 84 84 100 100
Solution.
The sample under consideration has 12 data points which are the months of
the year. For our empirical model, each data point is assigned a probability
of 1/12. The pmf of the random variable X is given by:
x      45     67     84     93     100    102
p(x)   1/3    1/12   1/6    1/12   1/4    1/12
paid by the insurance. The insurer pays the insured the amount of the loss
in excess of the deductible (see footnote 2). Losses below the deductible
are ignored by the insurer, since they do not result in an insurance payment.
Hence, the insurer considers the conditional distribution of the amount paid,
given that a payment was actually made. This is what is referred to as the
excess loss random variable.
is called the excess loss variable, the cost per payment, or the left
truncated and shifted variable. It stands for the amount paid by the
insurance which is also known as claim amount.
We can find the kth moment of the excess loss variable as follows. For a
continuous distribution with probability density function f(x) and cumulative
distribution function F(x), we have (see footnote 3)
\[ e_X^k(d) = E[(X-d)^k \mid X > d] = \frac{1}{\Pr(X > d)}\int_d^\infty (x-d)^k f(x)\,dx = \frac{1}{1-F(d)}\int_d^\infty (x-d)^k f(x)\,dx. \]
For a discrete distribution with probability mass function p(x) and
cumulative distribution function F(x), we have
\[ e_X^k(d) = \frac{1}{1-F(d)}\sum_{x_j > d} (x_j-d)^k\,p(x_j). \]
2
The deductible is referred to as ordinary deductible. Another type of deductible is
called franchise deductible and will be discussed in Section 32.
3
See (3.3)
5 EMPIRICAL MODELS, EXCESS AND LIMITED LOSS VARIABLES49
is called the mean excess loss function. Other names used have been
mean residual life function and complete expectation of life.
If X denotes payment, then eX (d) stands for the expected amount paid
given that there has been a payment in excess of the deductible d. If X de-
notes age at death, then eX (d) stands for the expected future lifetime given
that the person is alive at age d.
Example 5.4
Show that for a continuous random variable X, we have
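\[ e_X(d) = \frac{1}{1-F(d)}\int_d^\infty (1-F(x))\,dx = \frac{1}{S(d)}\int_d^\infty S(x)\,dx. \]
Solution.
Using integration by parts with $u = x-d$ and $v' = f(x)$, so that $v = -(1-F(x))$, we have
\[
\begin{aligned}
e_X(d) &= \frac{1}{1-F(d)}\int_d^\infty (x-d)f(x)\,dx \\
&= -\frac{(x-d)(1-F(x))}{1-F(d)}\Big|_d^\infty + \frac{1}{1-F(d)}\int_d^\infty (1-F(x))\,dx \\
&= \frac{1}{1-F(d)}\int_d^\infty (1-F(x))\,dx = \frac{1}{S(d)}\int_d^\infty S(x)\,dx.
\end{aligned}
\]
The boundary term vanishes because, when E(X) is finite,
\[ 0 \le xS(x) = x\int_x^\infty f(t)\,dt \le \int_x^\infty t f(t)\,dt \implies \lim_{x\to\infty} xS(x) = 0. \]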
Example 5.5
Let X be an excess loss random variable with pdf given by $f(x) = \frac{1}{3}(1+2x)e^{-x}$
for $x > 0$ and 0 otherwise. Calculate the mean excess loss function
with deductible amount x.
Solution.
The cdf of X is given by
\[ F(x) = \int_0^x \frac{1}{3}(1+2t)e^{-t}\,dt = -\frac{1}{3}e^{-t}(2t+3)\Big|_0^x = 1 - \frac{e^{-x}(2x+3)}{3}. \]
Hence,
\[ e_X(x) = \frac{1}{\frac{e^{-x}(2x+3)}{3}}\int_x^\infty \frac{e^{-t}(2t+3)}{3}\,dt = \frac{\frac{e^{-x}(2x+5)}{3}}{\frac{e^{-x}(2x+3)}{3}} = \frac{2x+5}{2x+3}. \]
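The closed form in Example 5.5 can be checked against a direct numerical evaluation of the formula from Example 5.4. This is a rough sketch, assuming a truncated upper limit of integration and a hand-written Simpson rule; the helper names are illustrative.

```python
import math

def survival(x):
    """S(x) = e^{-x} (2x + 3) / 3 for the density f(x) = (1/3)(1 + 2x) e^{-x}."""
    return math.exp(-x) * (2 * x + 3) / 3

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def mean_excess(d, width=60.0):
    """Approximate e_X(d) = (1/S(d)) * integral_d^infinity S(x) dx.

    The integral is truncated at d + width; the neglected tail is of
    order e^{-width} relative to S(d).
    """
    return simpson(survival, d, d + width) / survival(d)

def closed_form(d):
    return (2 * d + 5) / (2 * d + 3)
```

For instance, `mean_excess(1.0)` and `closed_form(1.0)` should agree to several decimal places, both being close to 7/5.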
Example 5.6 ‡
For an industry-wide study of patients admitted to hospitals for treatment
of cardiovascular illness in 1998, you are given:
(i)
(ii) Discharges from the hospital are uniformly distributed between the du-
rations shown in the table.
Calculate the mean residual time remaining hospitalized, in days, for a pa-
tient who has been hospitalized for 21 days.
Solution.
Let X denote the number of days at the hospital measured from time 0. We
are asked to find $E(X - 21 \mid X > 21)$, which by Example 5.4 can be expressed
as
\[ E(X - 21 \mid X > 21) = \int_{21}^\infty \frac{S_X(x)}{S_X(21)}\,dx. \]
the trapezoids with bases [21, 25], [25, 30], [30, 35], and [35, 40]. For instance,
the area under the first trapezoid is
\[ \frac{1}{2}(25-21)\left[\frac{S_X(21)}{S_X(21)} + \frac{S_X(25)}{S_X(21)}\right]. \]
\[ S_X(x) = \frac{\ell_x}{\ell_0} \]
where $\ell_0$ is the number of patients at the hospital at time 0 and $\ell_x$ is the
expected number of patients in the hospital at time x. Thus,
\[ \frac{S_X(x)}{S_X(21)} = \frac{\ell_x}{\ell_{21}}. \]
Repeating the same calculation with the remaining three trapezoids, we find
Example 5.7
Show that
\[ F_{Y^P}(y) = \frac{F_X(y+d) - F_X(d)}{1 - F_X(d)}. \]
Solution.
We have
\[
\begin{aligned}
F_{Y^P}(y) &= \Pr(Y^P \le y) = \Pr(X - d \le y \mid X > d) = \frac{\Pr(d < X \le y+d)}{\Pr(X > d)} \\
&= \frac{F_X(y+d) - F_X(d)}{1 - F_X(d)}.
\end{aligned}
\]
in the discrete case. Note the relationship between the moments of $Y^P$ and
$Y^L$ given by
\[ E[(X-d)_+^k] = e_X^k(d)\,[1-F(d)] = e_X^k(d)\,S(d). \]
Setting k = 1 and using the formula for $e_X(d)$ we see that
\[ E(Y^L) = \int_d^\infty S(x)\,dx. \]
Example 5.8
For a house insurance policy, the loss amount (expressed in thousands), in
the event of a fire, is being modeled by a distribution with density
\[ f(x) = \frac{3}{56}\,x(5-x), \quad 0 < x < 4. \]
4
See Section 39.
Solution.
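We first calculate the survival function:
\[ S(x) = 1 - F(x) = 1 - \int_0^x \frac{3}{56}\,t(5-t)\,dt = 1 - \frac{3}{56}\left(\frac{5}{2}x^2 - \frac{1}{3}x^3\right). \]
Thus,
\[ E(Y^L) = \int_1^4 \left[1 - \frac{3}{56}\left(\frac{5}{2}x^2 - \frac{1}{3}x^3\right)\right]dx \approx 1.325893 \]
thousand, that is, 1325.893.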
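The integral in Example 5.8 can be evaluated exactly with rational arithmetic. This is a small sketch, assuming (from the integration limits shown in the solution) a deductible of 1 and an upper support limit of 4, both in thousands; the function name is illustrative.

```python
from fractions import Fraction

def survival_antiderivative(x):
    """An antiderivative of S(x) = 1 - (3/56)(5x^2/2 - x^3/3)."""
    x = Fraction(x)
    return x - Fraction(3, 56) * (Fraction(5, 6) * x ** 3 - x ** 4 / 12)

d, upper = 1, 4  # assumed deductible and top of the support, in thousands

# E(Y^L) = integral of S(x) from d to the top of the support.
expected_cost_per_loss = survival_antiderivative(upper) - survival_antiderivative(d)
```

The result is the fraction 297/224, about 1.325893 thousand, in agreement with the value 1325.893 quoted in the solution.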
\[ f_{Y^L}(y) = f_X(y + d) \]
Notice that the distribution of X is censored on the right, which is why the
limited loss variable is also known as the right-censored random variable.
The expected value of the limited loss variable is $E(X \wedge u)$ and is called the
limited expected value.
cumulative distribution function $F(x_j)$ for all relevant index values j, the
kth moment of the limited loss random variable is given by
\[ E[(X \wedge u)^k] = \sum_{x_j \le u} x_j^k\,p(x_j) + u^k\,[1 - F(u)]. \]
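The discrete formula above translates directly into a short function. This is a sketch only: the pmf used to exercise it is hypothetical and does not come from the text.

```python
def limited_moment(pmf, u, k=1):
    """E[(X ^ u)^k] for a discrete X given as a {value: probability} mapping."""
    # Values at or below the limit contribute x^k p(x).
    below = sum(x ** k * p for x, p in pmf.items() if x <= u)
    # Values above the limit are capped at u; their total mass is 1 - F(u).
    tail = sum(p for x, p in pmf.items() if x > u)
    return below + u ** k * tail

# Hypothetical loss distribution, for illustration only.
example_pmf = {1: 0.5, 3: 0.3, 10: 0.2}
lev = limited_moment(example_pmf, u=5)        # 1(0.5) + 3(0.3) + 5(0.2)
second = limited_moment(example_pmf, u=5, k=2)
```

Here the limit u = 5 caps the loss of 10, so the limited expected value is 2.4 rather than the unlimited mean of 3.4.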
Remark 5.1
Usually the random variable X is non-negative (for instance, when X
represents a loss or a time until death), and the lower integration limit −∞ is
replaced by 0. Thus, for X ≥ 0, we have
\[ E(X \wedge u) = \int_0^u S(x)\,dx. \]
Example 5.9
A continuous random variable X has a pdf f (x) = 0.005x for 0 ≤ x ≤ 20
and 0 otherwise. Find the mean and the variance of X ∧ 10.
Solution.
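The cdf of X is
\[ F(x) = \int_{-\infty}^x 0.005t\,dt = \int_0^x 0.005t\,dt = \begin{cases} 0, & x < 0 \\ 0.0025x^2, & 0 \le x \le 20 \\ 1, & x > 20. \end{cases} \]
Thus,
\[ E(X \wedge 10) = \int_0^{10} [1 - F(x)]\,dx = \int_0^{10} [1 - 0.0025x^2]\,dx = \frac{55}{6}. \]
Now,
\[ E[(X \wedge 10)^2] = \int_0^{10} x^2(0.005x)\,dx + 10^2[1 - F(10)] = 87.5. \]
Finally,
\[ \text{Var}(X \wedge 10) = 87.5 - \left(\frac{55}{6}\right)^2 = \frac{125}{36}. \]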
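The arithmetic in Example 5.9 can be verified exactly. The sketch below simply evaluates the two integrals in closed form with rational numbers; the variable names are illustrative.

```python
from fractions import Fraction

u = 10                   # policy limit
c = Fraction(5, 1000)    # the pdf is f(x) = c * x on [0, 20]

# E(X ^ u) = integral_0^u (1 - c x^2 / 2) dx = u - c u^3 / 6
mean = u - c * Fraction(u ** 3, 6)

# E[(X ^ u)^2] = integral_0^u x^2 (c x) dx + u^2 (1 - F(u))
second_moment = c * Fraction(u ** 4, 4) + u ** 2 * (1 - c * Fraction(u ** 2, 2))

variance = second_moment - mean ** 2
```

The three quantities come out as 55/6, 175/2 (that is, 87.5), and 125/36.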
Example 5.10 ‡
The unlimited severity distribution for claim amounts under an auto liability
insurance policy is given by the cumulative distribution:
The insurance policy pays amounts up to a limit of 1000 per claim. Calculate
the expected payment under this policy for one claim.
Solution.
We are asked to find the limited expected value $E(X \wedge 1000)$. We have
\[ E(X \wedge 1000) = \int_0^{1000} S(x)\,dx = \int_0^{1000} [0.8e^{-0.02x} + 0.2e^{-0.001x}]\,dx = 166.4. \]
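The integral in Example 5.10 has a simple closed form, since each exponential term integrates separately. A minimal sketch, taking the survival function exactly as it appears in the integrand above:

```python
import math

def limited_expected_value(u):
    """integral_0^u of S(x) = 0.8 e^{-0.02 x} + 0.2 e^{-0.001 x}."""
    return ((0.8 / 0.02) * (1 - math.exp(-0.02 * u))
            + (0.2 / 0.001) * (1 - math.exp(-0.001 * u)))

expected_payment = limited_expected_value(1000)
```

The first term contributes almost exactly 40 and the second about 126.4, which gives the 166.4 quoted in the solution.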
Example 5.11 ‡
A health plan implements an incentive to physicians to control hospitaliza-
tion under which the physicians will be paid a bonus B equal to c times the
amount by which total hospital claims are under 400 (0 ≤ c ≤ 1).
The effect the incentive plan will have on underlying hospital claims is
modeled by assuming that the new total hospital claims will follow a two-
parameter Pareto distribution with α = 2 and θ = 300.
Suppose that E(B) = 100. Calculate the value of c.
Solution.
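Let X denote the total hospital claims. We are told that
\[ B = \begin{cases} c(400 - x), & x < 400 \\ 0, & x \ge 400 \end{cases} = \begin{cases} 400c - cx, & x < 400 \\ 400c - 400c, & x \ge 400 \end{cases} = 400c - c(X \wedge 400). \]
Thus,
\[ E(B) = 400c - cE(X \wedge 400). \]
For a two-parameter Pareto distribution with α ≠ 1,
\[ E(X \wedge u) = \frac{\theta}{\alpha - 1}\left[1 - \left(\frac{\theta}{u + \theta}\right)^{\alpha-1}\right]. \]
With u = 400, we find $E(X \wedge 400) = \frac{1200}{7}$. Finally,
\[ 100 = 400c - \frac{1200}{7}c \implies c \approx 0.44. \]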
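The last step of Example 5.11 can be checked numerically. This sketch assumes the standard limited expected value formula for the two-parameter Pareto distribution with α ≠ 1; the function name is illustrative.

```python
def pareto_limited_expected_value(alpha, theta, u):
    """E(X ^ u) for a two-parameter Pareto distribution, valid for alpha != 1."""
    return theta / (alpha - 1) * (1 - (theta / (u + theta)) ** (alpha - 1))

lev_400 = pareto_limited_expected_value(alpha=2, theta=300, u=400)  # 1200/7

# E(B) = 400c - c E(X ^ 400) = 100, solved for c.
c = 100 / (400 - lev_400)
```

Solving gives c = 0.4375, which rounds to the 0.44 stated in the solution.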
Example 5.12
Show that, for X ≥ 0, we have
\[ e_X(d) = \frac{E(X) - E(X \wedge d)}{S(d)}. \]
Solution.
We have
\[ E(X) = \int_0^\infty x f(x)\,dx = -xS(x)\Big|_0^\infty + \int_0^\infty S(x)\,dx = \int_0^\infty S(x)\,dx. \]
Hence, by Example 5.4 and Remark 5.1,
\[ e_X(d) = \frac{\int_d^\infty S(x)\,dx}{S(d)} = \frac{\int_0^\infty S(x)\,dx - \int_0^d S(x)\,dx}{S(d)} = \frac{E(X) - E(X \wedge d)}{S(d)}. \]
Example 5.13 ‡
The random variable for a loss, X, has the following characteristics:
x F (x) E(X ∧ x)
0 0.0 0
100 0.2 91
200 0.6 153
1000 1.0 331
Calculate the mean excess loss for a deductible of 100.
Solution.
We are asked to find
\[ e_X(100) = \frac{E(X) - E(X \wedge 100)}{1 - F_X(100)}. \]
The only unknown term in this formula is E(X). Now, Pr(X > 1000) =
1 − F(1000) = 0. This shows that X ≤ 1000 so that X ∧ 1000 = X. It
follows that E(X) = E(X ∧ 1000) = 331.
The mean excess loss is
\[ e_X(100) = \frac{E(X) - E(X \wedge 100)}{1 - F_X(100)} = \frac{331 - 91}{1 - 0.2} = 300. \]
Remark 5.2
Just as in the case of a deductible, the random variable $Y = X \wedge u$ has
a mixed distribution with continuous part $f_Y(y) = f_X(y)$ for $y < u$ and a
discrete part $p_Y(u) = 1 - F_X(u)$.
Practice Problems
Problem 5.1
Suppose that a policy has a deductible of $500. Complete the following
table.
Amount of loss 750 500 1200
Insurance payment
Problem 5.2
Referring to Example 5.3, find the cumulative distribution function of X.
Problem 5.3
Referring to Example 5.3, find the first and second raw moments of X.
Problem 5.4
Suppose you observe 8 claims with amounts
5 10 15 20 25 30 35 40
Problem 5.5
Let X be uniform on the interval [0, 100]. Find $e_X(d)$ for d > 0.
Problem 5.6
Let X be uniform on [0, 100] and Y be uniform on [0, α]. Suppose that
$e_Y(30) = e_X(30) + 4$.
Problem 5.7
Let X be the exponential random variable with mean 1/λ. Its pdf is
$f(x) = \lambda e^{-\lambda x}$ for x > 0 and 0 otherwise.
Find the expected cost per payment (i.e., mean excess loss function).
Problem 5.8
For an automobile insurance policy, the loss amount (expressed in thou-
sands), in the event of an accident, is being modeled by a distribution with
density
\[ f(x) = \frac{3}{56}\,x(5-x), \quad 0 < x < 4 \]
and 0 otherwise.
Problem 5.9
The loss random variable X has an exponential distribution with mean 1/λ
and an ordinary deductible is applied to all losses.
Problem 5.10
The loss random variable X has an exponential distribution with mean 1/λ
and an ordinary deductible is applied to all losses.
Problem 5.11
The loss random variable X has an exponential distribution with mean 1/λ
and an ordinary deductible is applied to all losses. The variance of the cost
per payment random variable (excess loss random variable) is 25,600.
Find λ.
Problem 5.12
The loss random variable X has an exponential distribution with mean 1/λ
and an ordinary deductible is applied to all losses. The variance of the cost
per payment random variable (excess loss random variable) is 25,600. The
variance of the cost per loss random variable is 20,480.
Problem 5.13
The loss random variable X has an exponential distribution with mean 1/λ
and an ordinary deductible is applied to all losses. The variance of the cost
per payment random variable (excess loss random variable) is 25,600. The
variance of the cost per loss random variable is 20,480.
Problem 5.14
For the loss random variable with cdf $F(x) = (x/\theta)^{\phi}$, $0 < x < \theta$, and 0
otherwise, determine the mean residual lifetime $e_X(x)$.
Problem 5.15
Let X be a loss random variable with pdf $f(x) = (1 + 2x^2)e^{-2x}$ for x > 0
and 0 otherwise.
Problem 5.16
Show that
\[ S_{Y^P}(y) = \frac{S_X(y+d)}{S_X(d)}. \]
Problem 5.17
Let X be a loss random variable with cdf F (x) = 1 − e−0.005x − 0.004e−0.005x
for x ≥ 0 and 0 otherwise.
(a) If an ordinary deductible of 100 is applied to each loss, find the pdf
of the per payment random variable Y P .
(b) Calculate the mean and the variance of the per payment random vari-
able.
Problem 5.18
A continuous random variable X has a pdf f (x) = 0.005x for 0 ≤ x ≤ 20
and 0 otherwise.
Problem 5.19 ‡
For a random loss X, you are given: Pr(X = 3) = Pr(X = 12) = 0.5 and
E[(X − d)+ ] = 3.
Problem 5.20 ‡
A loss, X, follows a 2-parameter Pareto distribution with α = 2 and unspecified
parameter θ. You are given:
\[ E(X - 100 \mid X > 100) = \frac{5}{3}\,E(X - 50 \mid X > 50). \]
Problem 5.21 ‡
For an insurance:
(i) Losses can be 100, 200 or 300 with respective probabilities 0.2, 0.2, and
0.6.
(ii) The insurance has an ordinary deductible of 150 per loss.
(iii) $Y^P$ is the claim payment per payment random variable.
Calculate $\text{Var}(Y^P)$.
Problem 5.22 ‡
For an insurance:
(i) Losses have density function
\[ f(x) = \begin{cases} 0.02x, & 0 < x < 10 \\ 0, & \text{otherwise.} \end{cases} \]
Calculate $E[Y^P]$.
Problem 5.23 ‡
The loss severity random variable X follows the exponential distribution
with mean 10,000.
Example 6.1
Given the pmf of a discrete random variable X.
x 0 1 2 3 4 5
p(x) 0.35 0.20 0.15 0.15 0.10 0.05
Solution.
Since Pr(X ≤ 1) = 0.55 and Pr(X ≥ 1) = 0.65, 1 is the median of X
Example 6.2
Let X be a continuous random variable with pdf $f(x) = \frac{1}{b-a}$ for a < x < b
and 0 otherwise. Find the median of X.
Solution.
We must find a number M such that $\int_a^M \frac{dx}{b-a} = 0.5$. This leads to the
equation $\frac{M-a}{b-a} = 0.5$. Solving this equation we find $M = \frac{a+b}{2}$.
Remark 6.1
A discrete random variable might have many medians. For example, let X be
the discrete random variable with pmf given by $p(x) = \left(\frac{1}{2}\right)^x$, x = 1, 2, · · ·
and 0 otherwise. Then any number 1 < M < 2 satisfies Pr(X ≤ M) =
Pr(X ≥ M) = 0.5.
6 MEDIAN, MODE, PERCENTILES, AND QUANTILES 63
Example 6.3
Let X be the discrete random variable with pmf given by $p(x) = \left(\frac{1}{2}\right)^x$, x =
1, 2, · · · and 0 otherwise. Find the mode of X.
Solution.
The value of x that maximizes p(x) is x = 1. Thus, the mode of X is 1
Example 6.4
Let X be the continuous random variable with pdf given by $f(x) = 0.75(1 - x^2)$
for −1 ≤ x ≤ 1 and 0 otherwise. Find the mode of X.
Solution.
The pdf is maximum for x = 0. Thus, the mode of X is 0
Example 6.5
A loss random variable X has the density function
\[ f(x) = \begin{cases} \dfrac{2.5(200)^{2.5}}{x^{3.5}}, & x > 200 \\ 0, & \text{otherwise.} \end{cases} \]
Solution.
First, the cdf is given by
\[ F(x) = \int_{200}^x \frac{2.5(200)^{2.5}}{t^{3.5}}\,dt. \]
Example 6.6
Let X be the random variable with pdf $f(x) = \frac{1}{b-a}$ for a < x < b and 0
otherwise. Find the pth quantile of X.
Solution.
We have
\[ p = \Pr(X \le x) = \int_a^x \frac{dt}{b-a} = \frac{x-a}{b-a}. \]
Solving this equation for x, we find $x = a + (b-a)p$.
Example 6.7
What percentile is the 0.63 quantile?
Solution.
The 0.63 quantile is the 63rd percentile.
Practice Problems
Problem 6.1
Using words, explain the meaning of F (1120) = 0.2 in terms of percentiles
and quantiles.
Problem 6.2
Let X be a discrete random variable with pmf $p(n) = (n-1)(0.4)^2(0.6)^{n-2}$, n ≥ 2,
and 0 otherwise.
Problem 6.7 ‡
An insurance company sells an auto insurance policy that covers losses in-
curred by a policyholder, subject to a deductible of 100 . Losses incurred
follow an exponential distribution with mean 300.
What is the 95th percentile of actual losses that exceed the deductible?
Problem 6.8
Let X be a random variable with density function
\[ f(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0 \\ 0, & \text{otherwise.} \end{cases} \]
Problem 6.9
People are dispersed on a linear beach with a density function $f(y) = 4y^3$,
0 < y < 1, and 0 elsewhere. An ice cream vendor wishes to locate
her cart at the median of the locations (where half of the people will be on
each side of her).
Problem 6.10 ‡
An automobile insurance company issues a one-year policy with a deductible
of 500. The probability is 0.8 that the insured automobile has no accident
and 0.0 that the automobile has more than one accident. If there is an
accident, the loss before application of the deductible is exponentially dis-
tributed with mean 3000.
Calculate the 95th percentile of the insurance company payout on this policy.
Problem 6.11
Let Y be a continuous random variable with cumulative distribution function
\[ F(y) = \begin{cases} 0, & y \le a \\ 1 - e^{-\frac{1}{2}(y-a)^2}, & \text{otherwise} \end{cases} \]
where a is a constant.
Problem 6.12
Find the pth quantile of the exponential distribution defined by the distribution
function $F(x) = 1 - e^{-x}$ for x ≥ 0 and 0 otherwise.
Problem 6.13
A continuous random variable has the pdf $f(x) = \frac{1}{2}e^{-|x|}$ for x ∈ R.
Problem 6.14
Let X be a loss random variable with cdf
\[ F(x) = \begin{cases} 1 - \left(\dfrac{\theta}{\theta + x}\right)^{\alpha}, & x \ge 0 \\ 0, & x < 0. \end{cases} \]
Problem 6.15
A random variable X follows a normal distribution with µ = 1 and σ² = 4.
Define a random variable $Y = e^X$; then Y follows a lognormal distribution.
It is known that the 95th percentile of a standard normal distribution is
1.645.
Problem 6.16
Let X be a random variable with density function $f(x) = \frac{4x}{(1+x^2)^3}$ for x > 0
and 0 otherwise.
Problem 6.17
Let X be a random variable with pdf $f(x) = \frac{3}{5000}\left(\frac{5000}{x}\right)^4$ for x > 5000 and 0
otherwise.
Problem 6.18
Let X be a random variable with cdf
\[ F(x) = \begin{cases} 0, & x < 0 \\ \dfrac{x^3}{27}, & 0 \le x \le 3 \\ 1, & x > 3. \end{cases} \]
Problem 6.19
Consider a sample of size 9 and observed data
Problem 6.20
A distribution has a pdf $f(x) = \frac{3}{x^4}$ for x > 1 and 0 otherwise.
\[ S_n = X_1 + X_2 + \cdots + X_n \]
A similar formula holds for the variance provided that the $X_i$'s are independent
(see footnote 6) random variables. In this case,
Example 7.1 ‡
The random variables $X_1, X_2, \cdots, X_n$ are independent and identically
distributed with probability density function
\[ f(x) = \frac{1}{\theta}\,e^{-x/\theta}. \]
Determine $E[\bar{X}^2]$.
Solution.
The random variable Xi has an exponential distribution with mean θ. Thus,
6
We say that X and Y are independent random variables if and only if for any two
sets of real numbers A and B we have
P (X ∈ A, Y ∈ B) = P (X ∈ A)P (Y ∈ B).
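\[
\begin{aligned}
E[\bar{X}] &= \frac{E(X_1) + \cdots + E(X_n)}{n} = \theta \\
\text{Var}[\bar{X}] &= \frac{\text{Var}(X_1) + \cdots + \text{Var}(X_n)}{n^2} = \frac{\theta^2}{n} \\
E[\bar{X}^2] &= \text{Var}[\bar{X}] + E[\bar{X}]^2 = \frac{\theta^2}{n} + \theta^2 = \frac{n+1}{n}\,\theta^2.
\end{aligned}
\]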
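The identity for the second moment of the sample mean in Example 7.1 can be sanity-checked by simulation. This is only a rough Monte Carlo sketch: the values n = 5 and θ = 2, the number of trials, and the seed are arbitrary choices for illustration.

```python
import random

def mean_square_of_sample_mean(n, theta, trials=100_000, seed=0):
    """Monte Carlo estimate of E[Xbar^2] for n iid exponentials with mean theta."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # expovariate takes the rate, i.e. 1 / mean.
        xbar = sum(rng.expovariate(1 / theta) for _ in range(n)) / n
        total += xbar ** 2
    return total / trials

n, theta = 5, 2.0                   # illustrative values
estimate = mean_square_of_sample_mean(n, theta)
exact = (n + 1) / n * theta ** 2    # (n + 1) theta^2 / n
```

The simulated value should sit close to the exact value 4.8, with the gap shrinking as the number of trials grows.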
The central limit theorem reveals a fascinating property of the sum of
independent random variables. It states that the CDF of the standardized sum
converges to the standard normal CDF as the number of terms grows without
limit. This theorem allows us to use the properties of the standard normal
distribution to obtain approximate probabilities associated with sums of random
variables.
Theorem 7.1
Let $X_1, X_2, \cdots$ be a sequence of independent and identically distributed
random variables, each with mean µ and variance σ². Then,
\[ P\left(\frac{\sqrt{n}}{\sigma}\left(\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right) \le a\right) \to \frac{1}{\sqrt{2\pi}}\int_{-\infty}^a e^{-\frac{x^2}{2}}\,dx \]
as n → ∞.
The Central Limit Theorem says that regardless of the underlying distribution
of the variables $X_i$, so long as they are independent and identically
distributed with finite variance, the distribution of
$\frac{\sqrt{n}}{\sigma}\left(\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right)$ converges to the same distribution, the standard normal.
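In practice, Theorem 7.1 is used to approximate tail probabilities of a sum by a normal probability. The following is a minimal sketch of that approximation; the function name is illustrative, and the inputs are the typewriter figures of Example 7.2 (200 items, mean 20, variance 9, threshold 4050).

```python
import math
from statistics import NormalDist

def clt_tail_probability(n, mu, sigma2, threshold):
    """Normal approximation to P(X_1 + ... + X_n > threshold) for iid terms."""
    # The sum has mean n*mu and variance n*sigma2; standardize the threshold.
    z = (threshold - n * mu) / math.sqrt(n * sigma2)
    return 1 - NormalDist().cdf(z)

p_exceeds = clt_tail_probability(n=200, mu=20, sigma2=9, threshold=4050)
```

Under these inputs the standardized threshold is about 1.18, giving a probability of roughly 0.12. No continuity correction is applied in this sketch.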
Example 7.2
The weight of a typewriter has a mean of 20 pounds and a variance of 9
pounds. Consider a train that carries 200 of these typewriters. Estimate the
probability that the total weight of typewriters carried in the train exceeds
4050 pounds.
Solution.
Label the typewriters as Typewriter 1, Typewriter 2, etc. Let Xi be the
7 SUM OF RANDOM VARIABLES AND THE CENTRAL LIMIT THEOREM71
Example 7.3 ‡
In an analysis of healthcare data, ages have been rounded to the nearest
multiple of 5 years. The difference between the true age and the rounded
age is assumed to be uniformly distributed on the interval from −2.5 years to
2.5 years. The healthcare data are based on a random sample of 48 people.
What is the approximate probability that the mean of the rounded ages is
within 0.25 years of the mean of the true ages?
Solution.
Let X denote the difference between true and reported age. We are given that X
is uniformly distributed on (−2.5, 2.5). That is, X has pdf f(x) = 1/5, −2.5 <
x < 2.5. It follows that E(X) = 0 and
\[ \sigma_X^2 = E(X^2) = \int_{-2.5}^{2.5} \frac{x^2}{5}\,dx \approx 2.083 \]
so that $SD(X) = \sqrt{2.083} \approx 1.443$.
Now $\bar{X}_{48}$, the difference between the means of the true and rounded ages,
has a distribution that is approximately normal with mean 0 and standard
deviation $\frac{1.443}{\sqrt{48}} \approx 0.2083$. Therefore,
\[
\begin{aligned}
P\left(-\frac{1}{4} \le \bar{X}_{48} \le \frac{1}{4}\right) &= P\left(\frac{-0.25}{0.2083} \le \frac{\bar{X}_{48}}{0.2083} \le \frac{0.25}{0.2083}\right) \\
&= P(-1.2 \le Z \le 1.2) = 2\Phi(1.2) - 1 \\
&\approx 2(0.8849) - 1 = 0.77.
\end{aligned}
\]
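The numbers in Example 7.3 can be reproduced with the standard library. A short sketch, using the variance formula (b − a)²/12 for a uniform distribution on (a, b); the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 48
var_single = (2.5 - (-2.5)) ** 2 / 12   # variance of one rounding error
sd_mean = math.sqrt(var_single / n)     # standard deviation of the sample mean
z = 0.25 / sd_mean                      # standardized half-width
prob_within = 2 * NormalDist().cdf(z) - 1
```

The standardized half-width comes out as 1.2, and the probability is about 0.77, as in the solution.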
Example 7.4
Let $X_1, X_2, X_3, X_4$ be a random sample of size 4 from a normal distribution
with mean 2 and variance 10, and let $\bar{X}$ be the sample mean. Determine a
such that $P(\bar{X} \le a) = 0.90$.
Solution.
The sample mean $\bar{X}$ is normal with mean µ = 2, variance $\frac{\sigma^2}{n} = \frac{10}{4} = 2.5$,
and standard deviation $\sqrt{2.5} \approx 1.58$, so
\[ 0.90 = P(\bar{X} \le a) = P\left(\frac{\bar{X} - 2}{1.58} \le \frac{a-2}{1.58}\right) = \Phi\left(\frac{a-2}{1.58}\right). \]
Using Excel, we get $\frac{a-2}{1.58} = 1.28$, so a = 4.02.
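The percentile lookup in Example 7.4 does not require a spreadsheet. As a sketch, Python's standard library provides the inverse normal CDF directly:

```python
import math
from statistics import NormalDist

# Sample mean of 4 observations from a normal with mean 2 and variance 10.
sample_mean_dist = NormalDist(mu=2, sigma=math.sqrt(10 / 4))
a = sample_mean_dist.inv_cdf(0.90)
```

Without intermediate rounding this gives a value near 4.03, slightly above the 4.02 obtained from the rounded figures 1.58 and 1.28.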
Practice Problems
Problem 7.1
A shipping agency ships boxes of booklets with each box containing 100
booklets. Suppose that the average weight of a booklet is 1 ounce and the
standard deviation is 0.05 ounces.
What is the probability that 1 box of booklets weighs more than 100.4
ounces?
Problem 7.2
In the National Hockey League, the standard deviation in the distribution
of players’ height is 2 inches. The heights of 25 players selected at random
were measured.
Estimate the probability that the average height of the players in this sample
is within 1 inch of the league average height.
Problem 7.3
A battery manufacturer claims that the lifespan of its batteries has a mean
of 54 hours and a standard deviation of 6 hours. A sample of 60 batteries
were tested.
What is the probability that the mean lifetime is less than 52 hours?
Problem 7.4
Roll a die 10 times. Estimate the probability that the sum obtained is
between 30 and 40, inclusive.
Problem 7.5
Consider 10 independent random variables each uniformly distributed over
(0,1).
Problem 7.6
The Chicago Cubs play 100 independent baseball games in a given season.
Suppose that the probability of winning a game is 0.8.
Problem 7.7
An insurance company has 10,000 home policyholders. The average annual
claim per policyholder is found to be $240 with a standard deviation of $800.
Estimate the probability that the total annual claim is at least $2.7 mil-
lion.
Problem 7.8
A certain component is critical to the operation of a laptop and must be
replaced immediately upon failure. It is known that the average life of this
type of component is 100 hours and its standard deviation is 30 hours.
Problem 7.9
An instructor found that the average student score on class exams is 74 and
the standard deviation is 14. This instructor gives two exams: One to a
class of size 25 and the other to a class of 64.
Using the Central Limit Theorem, estimate the probability that the average
test score in the class of size 25 is at least 80.
Problem 7.10
The Salvation Army received 2025 contributions. Assume the contributions
are independent and identically distributed with mean 3125 and
standard deviation 250.
Estimate the 90th percentile for the distribution of the total contributions
received.
Problem 7.11 ‡
An insurance company issues 1250 vision care insurance policies. The num-
ber of claims filed by a policyholder under a vision care insurance policy
during one year is a Poisson random variable with mean 2. Assume the
numbers of claims filed by distinct policyholders are independent of one an-
other.
Problem 7.12
A battery manufacturer finds that the lifetime of a battery, expressed in
months, follows a normal distribution with mean 3 and standard deviation
1. Suppose that you want to buy a number of these batteries with the
intention of replacing them successively into your radio as they burn out.
Assuming that the batteries’ lifetimes are independent, what is the small-
est number of batteries to be purchased so that the succession of batteries
keeps your radio working for at least 40 months with probability exceeding
0.9772?
Problem 7.13
The total claim amount for a home insurance policy has a pdf
\[ f(x) = \begin{cases} \dfrac{1}{1000}\,e^{-\frac{x}{1000}}, & x > 0 \\ 0, & \text{otherwise.} \end{cases} \]
An actuary sets the premium for the policy at 100 over the expected total
claim amount.
If 100 policies are sold, estimate the probability that the insurance com-
pany will have claims exceeding the premiums collected.
Problem 7.14 ‡
A city has just added 100 new female recruits to its police force. The city
will provide a pension to each new hire who remains with the force until
retirement. In addition, if the new hire is married at the time of her re-
tirement, a second pension will be provided for her husband. A consulting
actuary makes the following assumptions:
(i) Each new recruit has a 0.4 probability of remaining with the police force
until retirement.
(ii) Given that a new recruit reaches retirement with the police force, the
probability that she is not married at the time of retirement is 0.25.
(iii) The number of pensions that the city will provide on behalf of each new
hire is independent of the number of pensions it will provide on behalf of
any other new hire.
Determine the probability that the city will provide at most 90 pensions
to the 100 new hires and their husbands.
Problem 7.15
The amount of an individual claim has a two-parameter Pareto distribution
with θ = 8000 and α = 9. Consider a sample of 500 claims.
Estimate the probability that the total sum of the claims is at least 550,000.
Problem 7.16
Suppose that the current profit from selling a share of a stock is found to
follow a uniform distribution on [−45, 72].
Problem 7.17
The severities of individual claims have the Pareto distribution with parameters
α = 8/3 and θ = 8000.
Use the central limit theorem to approximate the probability that the sum
of 100 independent claims will exceed 600,000.
Problem 7.18 ‡
Let X and Y be the number of hours that a randomly selected person
watches movies and sporting events, respectively, during a three-month pe-
riod. The following information is known about X and Y :
E(X) = 50
E(Y) = 20
Var(X) = 50
Var(Y) = 30
Cov (X,Y) = 10
One hundred people are randomly selected and observed for these three
months. Let T be the total number of hours that these one hundred people
watch movies or sporting events during this three-month period.
Problem 7.19 ‡
Automobile losses reported to an insurance company are independent and
uniformly distributed between 0 and 20,000. The company covers each such
loss subject to a deductible of 5,000.
Calculate the probability that the total payout on 200 reported losses is
between 1,000,000 and 1,200,000.
Problem 7.20 ‡
For Company A there is a 60% chance that no claim is made during the
coming year. If one or more claims are made, the total claim amount is
normally distributed with mean 10,000 and standard deviation 2,000.
For Company B there is a 70% chance that no claim is made during the
coming year. If one or more claims are made, the total claim amount is
normally distributed with mean 9,000 and standard deviation 2,000.
Assume that the total claim amounts of the two companies are independent.
What is the probability that, in the coming year, Company B’s total claim
amount will exceed Company A’s total claim amount?
Example 8.1
Calculate the moment generating function for the exponential distribution
with parameter λ, i.e. $f(x) = \lambda e^{-\lambda x}$ for x > 0 and 0 otherwise.
Solution.
We have
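\[ M_X(t) = \int_0^\infty e^{tx}\lambda e^{-\lambda x}\,dx = \int_0^\infty \lambda e^{-x(\lambda - t)}\,dx = -\frac{\lambda}{\lambda - t}\,e^{-x(\lambda-t)}\Big|_0^\infty = \frac{\lambda}{\lambda - t}, \quad t < \lambda. \]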
Example 8.2
Let X be a discrete random variable with pmf given by the following table
x 1 2 3 4 5
p(x) 0.15 0.20 0.40 0.15 0.10
and 0 otherwise. Calculate MX (t).
Solution.
We have
\[ M_X(t) = 0.15e^{t} + 0.20e^{2t} + 0.40e^{3t} + 0.15e^{4t} + 0.10e^{5t}. \]
As the name suggests, the moment generating function can be used to gen-
erate moments E(X n ) for n = 1, 2, · · · . The next result shows how to use
the moment generating function to calculate moments.
8 MOMENT GENERATING FUNCTIONS AND PROBABILITY GENERATING FUNCTIONS79
Theorem 8.1
For any random variable X, we have
\[ E(X^n) = M_X^{n}(0) \quad\text{where}\quad M_X^{n}(0) = \frac{d^n}{dt^n}M_X(t)\Big|_{t=0}. \]
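Theorem 8.1 can be illustrated numerically with the pmf of Example 8.2. The sketch below approximates the first two derivatives of the moment generating function at t = 0 by central differences and compares them with the moments computed directly from the pmf. The step size h is an arbitrary choice, and finite differences only approximate the derivatives.

```python
import math

pmf = {1: 0.15, 2: 0.20, 3: 0.40, 4: 0.15, 5: 0.10}  # pmf of Example 8.2

def mgf(t):
    """M_X(t) = sum over x of p(x) e^{t x}."""
    return sum(p * math.exp(t * x) for x, p in pmf.items())

def first_derivative(f, t, h=1e-4):
    return (f(t + h) - f(t - h)) / (2 * h)

def second_derivative(f, t, h=1e-4):
    return (f(t + h) - 2 * f(t) + f(t - h)) / h ** 2

m1_from_mgf = first_derivative(mgf, 0.0)
m2_from_mgf = second_derivative(mgf, 0.0)

m1_direct = sum(x * p for x, p in pmf.items())        # E(X)
m2_direct = sum(x ** 2 * p for x, p in pmf.items())   # E(X^2)
```

Both routes give E(X) ≈ 2.85 and E(X²) ≈ 9.45, up to the discretization error of the finite differences.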
Example 8.3
Let X be a binomial random variable with parameters n and p. Find the
expected value and the variance of X using moment generating functions.
Solution.
We can write
\[ M_X(t) = \sum_{k=0}^n e^{tk}C(n,k)p^k(1-p)^{n-k} = \sum_{k=0}^n C(n,k)(pe^t)^k(1-p)^{n-k} = (pe^t + 1 - p)^n. \]
Differentiating yields
\[ \frac{d}{dt}M_X(t) = npe^t(pe^t + 1 - p)^{n-1} \implies E(X) = \frac{d}{dt}M_X(t)\Big|_{t=0} = np. \]
To find $E(X^2)$, we differentiate a second time to obtain
\[ \frac{d^2}{dt^2}M_X(t) = n(n-1)p^2e^{2t}(pe^t + 1 - p)^{n-2} + npe^t(pe^t + 1 - p)^{n-1}. \]
Evaluating at t = 0 we find
\[ E(X^2) = M_X''(0) = n(n-1)p^2 + np. \]
Hence, $\text{Var}(X) = E(X^2) - [E(X)]^2 = n(n-1)p^2 + np - n^2p^2 = np(1-p)$.
Example 8.4
Let X be a Poisson random variable with parameter λ. Find the expected
value and the variance of X using moment generating functions.
Solution.
We can write
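\[ M_X(t) = \sum_{n=0}^\infty \frac{e^{tn}e^{-\lambda}\lambda^n}{n!} = e^{-\lambda}\sum_{n=0}^\infty \frac{e^{tn}\lambda^n}{n!} = e^{-\lambda}\sum_{n=0}^\infty \frac{(\lambda e^t)^n}{n!} = e^{-\lambda}e^{\lambda e^t} = e^{\lambda(e^t - 1)}. \]
Differentiating, $M_X'(t) = \lambda e^t e^{\lambda(e^t-1)}$ and $M_X''(t) = (\lambda e^t + \lambda^2 e^{2t})e^{\lambda(e^t-1)}$, so that
$E(X) = M_X'(0) = \lambda$, $E(X^2) = M_X''(0) = \lambda + \lambda^2$, and $\text{Var}(X) = \lambda$.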
Example 8.5
Let X be a normal random variable with parameters µ and σ². Find the
expected value and the variance of X using moment generating functions.
Solution.
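First we find the moment generating function of a standard normal random
variable Z with parameters 0 and 1. We can write
\[
\begin{aligned}
M_Z(t) &= E(e^{tZ}) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{tz}e^{-\frac{z^2}{2}}\,dz = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty \exp\left\{-\frac{z^2 - 2tz}{2}\right\}dz \\
&= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty \exp\left\{-\frac{(z-t)^2}{2} + \frac{t^2}{2}\right\}dz = e^{\frac{t^2}{2}}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-\frac{(z-t)^2}{2}}\,dz = e^{\frac{t^2}{2}}.
\end{aligned}
\]
Now, since X = µ + σZ we have
\[ M_X(t) = E(e^{t\mu + t\sigma Z}) = e^{\mu t}M_Z(\sigma t) = \exp\left\{\frac{\sigma^2t^2}{2} + \mu t\right\}. \]
By differentiation we obtain
\[ M_X'(t) = (\mu + t\sigma^2)\exp\left\{\frac{\sigma^2t^2}{2} + \mu t\right\} \]
and
\[ M_X''(t) = (\mu + t\sigma^2)^2\exp\left\{\frac{\sigma^2t^2}{2} + \mu t\right\} + \sigma^2\exp\left\{\frac{\sigma^2t^2}{2} + \mu t\right\} \]
and thus
\[ E(X) = M_X'(0) = \mu \quad\text{and}\quad E(X^2) = M_X''(0) = \mu^2 + \sigma^2. \]
The variance of X is
\[ \text{Var}(X) = E(X^2) - [E(X)]^2 = \sigma^2. \]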
Example 8.6
If X and Y are independent binomial random variables with parameters
(n, p) and (m, p), respectively, what is the pmf of X + Y ?
Solution.
We have
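\[ M_{X+Y}(t) = M_X(t)M_Y(t) = (pe^t + 1 - p)^n(pe^t + 1 - p)^m = (pe^t + 1 - p)^{n+m}. \]
This is the moment generating function of a binomial random variable with
parameters (n + m, p), so X + Y is binomial with these parameters.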
Example 8.7
If X and Y are independent Poisson random variables with parameters λ1
and λ2 , respectively, what is the pmf of X + Y ?
Solution.
We have
\[ M_{X+Y}(t) = M_X(t)M_Y(t) = e^{\lambda_1(e^t - 1)}e^{\lambda_2(e^t - 1)} = e^{(\lambda_1 + \lambda_2)(e^t - 1)}. \]
Since $e^{(\lambda_1 + \lambda_2)(e^t - 1)}$ is the moment generating function of a Poisson random
variable having parameter $\lambda_1 + \lambda_2$, X + Y is a Poisson random variable with
parameter $\lambda_1 + \lambda_2$.
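The conclusion of Example 8.7 can also be checked without generating functions, by convolving the two pmfs directly. This is a small numerical sketch; the parameter values 1.5 and 2.5 are arbitrary illustrations.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def pmf_of_sum(k, lam1, lam2):
    """Pr(X + Y = k) for independent Poisson X and Y, by direct convolution."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2)
               for j in range(k + 1))

lam1, lam2 = 1.5, 2.5  # illustrative parameters
max_gap = max(abs(pmf_of_sum(k, lam1, lam2) - poisson_pmf(k, lam1 + lam2))
              for k in range(25))
```

The largest discrepancy over the first 25 values is at the level of floating-point rounding, consistent with X + Y being Poisson with parameter λ1 + λ2.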
Note that $P_X(t) = E(e^{X\ln t}) = M_X(\ln t)$ and $M_X(t) = P_X(e^t)$. The pgf transforms a
sum into a product and enables it to be handled much more easily: Let
$X_1, X_2, \cdots, X_n$ be independent random variables and $S_n = X_1 + X_2 + \cdots + X_n$.
It can be shown that
\[ P_{S_n}(t) = P_{X_1}(t)P_{X_2}(t)\cdots P_{X_n}(t). \]
Example 8.8
Find the pgf of the Poisson distribution of parameter λ.
Solution.
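Recall that the Poisson random variable has a pmf $p(x) = \frac{\lambda^x e^{-\lambda}}{x!}$. Hence,
\[ P_X(t) = \sum_{x=0}^\infty t^x\,\frac{\lambda^x e^{-\lambda}}{x!} = e^{-\lambda}\sum_{x=0}^\infty \frac{(\lambda t)^x}{x!} = e^{-\lambda}e^{\lambda t} = e^{\lambda(t-1)}. \]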
The probability generating function gets its name because the power series
can be expanded and differentiated to reveal the individual probabilities.
Thus, given only the pgf $P_X(t) = E(t^X)$, we can recover all probabilities
Pr(X = x).
It can be shown that
\[ p(n) = \frac{1}{n!}\,\frac{d^n}{dt^n}P_X(t)\Big|_{t=0}. \]
Example 8.9
Let X be a discrete random variable with pgf $P_X(t) = \frac{t}{5}(2 + 3t^2)$. Find the
distribution of X.
Solution.
We have $p(0) = P_X(0) = 0$; $p(1) = P_X'(0) = \frac{2}{5}$; $p(2) = \frac{P_X''(0)}{2!} = 0$;
$p(3) = \frac{P_X'''(0)}{3!} = \frac{3}{5}$; and $p(n) = 0$ for all n ≥ 4.
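The recovery of probabilities from a pgf in Example 8.9 can be mimicked for any polynomial pgf. The sketch below stores the pgf as a list of coefficients and applies the formula p(n) = P^(n)(0)/n!; the helper names are illustrative.

```python
from fractions import Fraction
from math import factorial

# Coefficients of P_X(t) = (t/5)(2 + 3t^2) = (2/5) t + (3/5) t^3, lowest degree first.
pgf = [Fraction(0), Fraction(2, 5), Fraction(0), Fraction(3, 5)]

def derivative(coeffs):
    """Coefficients of the derivative of a polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def probability(coeffs, n):
    """p(n) = (n-th derivative of the pgf at 0) / n!."""
    for _ in range(n):
        coeffs = derivative(coeffs)
    value_at_zero = coeffs[0] if coeffs else Fraction(0)
    return value_at_zero / factorial(n)

probs = [probability(pgf, n) for n in range(5)]
```

The list comes out as 0, 2/5, 0, 3/5, 0, matching the distribution found in the solution. For a polynomial pgf this is just reading off the coefficients.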
Theorem 8.2
For any discrete random variable X, we have
\[ E[X(X - 1)(X - 2) \cdots (X - k + 1)] = \frac{d^k}{dt^k} P_X(t) \Big|_{t=1}. \]
Example 8.10
Let X be a Poisson random variable with parameter λ. Find the mean and
the variance using probability generating functions.
Solution.
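We know that the pgf of X is $P_X(t) = e^{\lambda(t-1)}$. We have $P_X'(t) = \lambda e^{\lambda(t-1)}$ and $P_X''(t) = \lambda^2 e^{\lambda(t-1)}$, so that $E(X) = P_X'(1) = \lambda$ and $E[X(X-1)] = P_X''(1) = \lambda^2$. Hence, $\mathrm{Var}(X) = E[X(X-1)] + E(X) - [E(X)]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda$.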
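Theorem 8.2 identifies the derivatives of the pgf at t = 1 with the factorial moments. As a rough numerical check for the Poisson case, the sketch below computes $E(X)$ and $E[X(X-1)]$ from a truncated pmf and then forms the variance as $E[X(X-1)] + E(X) - [E(X)]^2$. The value of the parameter and the truncation point are assumptions chosen for illustration.

```python
import math

lam = 3.0  # illustrative parameter value (assumption)

# Poisson pmf truncated at 100 terms; the neglected tail is negligible for lam = 3.
pmf = [math.exp(-lam) * lam**x / math.factorial(x) for x in range(100)]

first_factorial = sum(x * p for x, p in enumerate(pmf))             # E[X]
second_factorial = sum(x * (x - 1) * p for x, p in enumerate(pmf))  # E[X(X-1)]

mean = first_factorial
variance = second_factorial + mean - mean**2
print(mean, variance)
```

Both printed values should be close to the parameter, in line with the derivatives of $e^{\lambda(t-1)}$ at t = 1.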
Example 8.11
Let X be a Poisson random variable with parameter λ and Y is Poisson
with parameter µ. Find the distribution of X + Y, assuming X and Y are
independent.
Solution.
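We have
\[ P_{X+Y}(t) = P_X(t)P_Y(t) = e^{\lambda(t-1)} e^{\mu(t-1)} = e^{(\lambda + \mu)(t-1)}. \]
This is the pgf of a Poisson random variable with parameter $\lambda + \mu$. Hence, X + Y is a Poisson random variable with parameter $\lambda + \mu$.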
Practice Problems
Problem 8.1
Let X be an exponential random variable with parameter λ.
Find the expected value and the variance of X using moment generating
functions.
Problem 8.2
Let X and Y be independent normal random variables with parameters
$(\mu_1, \sigma_1^2)$ and $(\mu_2, \sigma_2^2)$, respectively.
Problem 8.3 ‡
Let X and Y be identically distributed independent random variables such
that the moment generating function of X + Y is
\[ M(t) = 0.09e^{-2t} + 0.24e^{-t} + 0.34 + 0.24e^{t} + 0.09e^{2t}, \quad -\infty < t < \infty. \]
Problem 8.4 ‡
The value of a piece of factory equipment after three years of use is $100(0.5)^X$
where X is a random variable having moment generating function
\[ M_X(t) = \frac{1}{1 - 2t} \quad \text{for } t < \frac{1}{2}. \]
Calculate the expected value of this piece of equipment after three years of
use.
Problem 8.5
Let X and Y be two independent random variables with moment generating
functions
\[ M_X(t) = e^{t^2 + 2t} \quad \text{and} \quad M_Y(t) = e^{3t^2 + t}. \]
Determine the moment generating function of X + 2Y.
Problem 8.6
The random variable X has an exponential distribution with parameter b.
It is found that $M_X(-b^2) = 0.2$.
Find b.
Problem 8.7
If the moment generating function for the random variable X is $M_X(t) = \frac{1}{t+1}$, find $E[(X - 2)^3]$.
Problem 8.8
Suppose a random variable X has moment generating function
\[ M_X(t) = \left(\frac{2 + e^t}{3}\right)^9. \]
Find the variance of X.
Problem 8.9
A random variable X has the moment generating function
\[ M_X(t) = \frac{1}{(1 - 2500t)^4}. \]
Determine the standard deviation of X.
Problem 8.10 ‡
A company insures homes in three cities, J, K, and L . Since sufficient dis-
tance separates the cities, it is reasonable to assume that the losses occurring
in these cities are independent.
The moment generating functions for the loss distributions of the cities are:
\[ M_J(t) = (1 - 2t)^{-3} \]
\[ M_K(t) = (1 - 2t)^{-2.5} \]
\[ M_L(t) = (1 - 2t)^{-4.5} \]
Let X represent the combined losses from the three cities.
Calculate $E(X^3)$.
Problem 8.11
Let X be a binomial random variable with pmf $p(k) = C(n, k)p^k(1 - p)^{n-k}$.
Problem 8.12
Let X be a geometric random variable with pmf $p(n) = p(1 - p)^{n-1}$, $n = 1, 2, \cdots$, where 0 < p < 1.
Problem 8.13
Let X be a random variable with pgf $P_X(t) = e^{\lambda(t-1)}$. True or false: X is a
Poisson random variable with parameter λ.
Problem 8.14
Let X be a random variable and Y = a + bX. Express PY (t) in terms of
PX (t).
Problem 8.15
Let X have the distribution of a geometric random variable with parameter
p. That is, $p(x) = p(1 - p)^{x-1}$, $x = 1, 2, 3, \cdots$.
Find the mean and the variance of X using probability generating func-
tions.
Problem 8.16
You are given a sample of size 4 with observed data
2 2 3 5 8
Tail Weight of a Distribution
Example 9.1
Show that the exponential distribution with parameter λ > 0 is light-tailed
according to the above definition. Refer to Table C.
Solution.
Using Table C, for all positive integers k, we have
\[ E(X^k) = \frac{\Gamma(k + 1)}{\lambda^k}, \]
which is finite for every k.
Hence, the exponential distribution is light-tailed.
Example 9.2
Show that the Pareto distribution with parameters α and θ is heavy-tailed.
Refer to Table C.
Solution.
Using Table C, we have
\[ E(X^k) = \frac{\theta^k \Gamma(k + 1)\Gamma(\alpha - k)}{\Gamma(\alpha)} \]
provided that $-1 < k < \alpha$. Since the moments are not finite for all positive
k, the Pareto distribution is heavy-tailed.
9 TAIL WEIGHT MEASURES: MOMENTS AND THE SPEED OF DECAY OF S(X)89
Example 9.3
Let X be a continuous random variable with pdf $f_X(x)$ defined for x > 0 and
0 otherwise. Suppose that there is a constant M > 0 such that $f_X(x) = \frac{C}{x^n}$
for all $x \geq M$ and 0 otherwise, where n > 1 and $C = \frac{n-1}{M^{1-n}}$. Show that X
has a heavy-tailed distribution.
Solution.
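We have
\[ \begin{aligned} E(X^k) &= \int_0^M x^k f_X(x)dx + C \int_M^\infty x^{k-n}dx \\ &= \int_0^M x^k f_X(x)dx + C \frac{x^{k-n+1}}{k - n + 1} \Big|_M^\infty \\ &= \infty \end{aligned} \]
for all $k > n - 1$. Since not all positive moments are finite, X has a heavy-tailed distribution.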
Next, we consider comparing the tail weights of two distributions with the
same mean. This is done by comparing the survival functions of the two
distributions. Algebraically, we compute the ratio of the tail probabilities
or the survival functions which we will refer to as the relative tail weight:
\[ \lim_{x \to \infty} \frac{S_X(x)}{S_Y(x)} = \lim_{x \to \infty} \frac{-S_X'(x)}{-S_Y'(x)} = \lim_{x \to \infty} \frac{f_X(x)}{f_Y(x)} \geq 0. \]
Note that in the middle limit we used L’Hôpital’s rule since $\lim_{x \to \infty} S_X(x) = \lim_{x \to \infty} S_Y(x) = 0$.
Now, if the above limit is 0, then the tail of X decays faster than the tail of Y. In this case, we say that the distribution
of X has a lighter tail than that of Y. If the limit is a finite positive number,
then we say that the distributions have similar or proportional tails. If the
limit diverges to infinity, then we say that the distribution of X has a heavier tail than that of Y.
Example 9.4
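Compare the tail weight of the inverse Pareto distribution with pdf $f_X(x) = \frac{\tau \theta_1 x^{\tau - 1}}{(x + \theta_1)^{\tau + 1}}$
with the inverse Gamma distribution with pdf $f_Y(x) = \frac{\theta_2^\alpha e^{-\theta_2 / x}}{x^{\alpha + 1} \Gamma(\alpha)}$,
where $\theta_1, \theta_2, \tau > 0$ and $\alpha > 1$.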
Solution.
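We have
\[ \begin{aligned} \lim_{x \to \infty} \frac{f_X(x)}{f_Y(x)} &= \lim_{x \to \infty} \frac{\tau \theta_1 x^{\tau - 1}}{(x + \theta_1)^{\tau + 1}} \cdot \frac{x^{\alpha + 1} \Gamma(\alpha)}{\theta_2^\alpha e^{-\theta_2 / x}} \\ &= \lim_{x \to \infty} \frac{\tau \theta_1 \Gamma(\alpha)}{\theta_2^\alpha} e^{\theta_2 / x} \left(\frac{x}{x + \theta_1}\right)^{\tau + 1} x^{\alpha - 1} \\ &= \frac{\tau \theta_1 \Gamma(\alpha)}{\theta_2^\alpha} \cdot e^0 \cdot \infty = \infty. \end{aligned} \]
Thus, X has a heavier tail than Y.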
Example 9.5
Let X be the exponential distribution with survival function $S_X(x) = e^{-x}$ for
$x \geq 0$ and 0 otherwise, and Y be the distribution with survival function
$S_Y(x) = \frac{1}{x}$ for $x \geq 1$ and 0 otherwise. Compare the tail weight of these
distributions.
Solution.
We have
\[ \lim_{x \to \infty} \frac{S_X(x)}{S_Y(x)} = \lim_{x \to \infty} x e^{-x} = 0. \]
Hence, X has a lighter tail than Y.
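The limit in Example 9.5 can be illustrated by evaluating the ratio of the two survival functions at increasingly large arguments. This is only a numerical illustration at a few hand-picked points (an assumption of this sketch), not a proof of the limit.

```python
import math

def survival_x(x):
    """S_X(x) = e^{-x} on the support x >= 0 of the exponential distribution."""
    return math.exp(-x)

def survival_y(x):
    """S_Y(x) = 1/x on the support x >= 1."""
    return 1.0 / x

# Relative tail weight S_X(x) / S_Y(x) = x e^{-x} at a few large arguments.
points = (1, 5, 10, 50, 100)
ratios = [survival_x(x) / survival_y(x) for x in points]
for x, r in zip(points, ratios):
    print(x, r)
```

The ratios shrink rapidly toward 0, consistent with X having the lighter tail.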
Practice Problems
Problem 9.1
Let X be a random variable with pdf $f_X(x) = Cx^n e^{-bx}$ for x > 0 and 0
otherwise, where b, n > 0 and $C = \left(\int_0^\infty x^n e^{-bx}dx\right)^{-1}$.
Problem 9.9
Let X be a random variable with pdf $f_X(x) = \frac{2}{\pi(1 + x^2)}$ for x > 0 and 0
otherwise. Let Y be a random variable with pdf $f_Y(x) = \frac{1}{(1 + x)^2}$ for x > 0
and 0 otherwise.
Problem 9.10
Let X be a random variable with pdf $f_X(x) = \frac{2}{\pi(1 + x^2)}$ for x > 0 and 0
otherwise. Let Y be a random variable with pdf $f_Y(x) = \frac{\alpha}{(1 + x)^{\alpha + 1}}$ for x > 0
Problem 9.11
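The distribution of X has the survival function
\[ S_X(x) = 1 - \frac{\theta x^\gamma}{1 + \theta x^\gamma}, \quad \theta, \gamma > 0 \]
and 0 otherwise. The distribution of Y has pdf
\[ f_Y(x) = \frac{x^{\gamma - 1} e^{-x/\theta}}{\theta^\gamma \Gamma(\gamma)} \]
and 0 otherwise.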
Problem 9.12
Using the criterion of existence of moments, complete the following. Refer
to Table C.
Distribution Heavy-Tail Light-Tail
Weibull
Inverse Pareto
Normal
Loglogistic
Problem 9.13
Using the criterion of existence of moments, complete the following. Refer
to Table C.
Problem 9.14
Using the criterion of existence of moments, complete the following. Refer
to Table C.
Problem 9.15
Show that the Loglogistic distribution has a heavier tail than the Gamma
distribution.
Problem 9.16
Show that the Paralogistic distribution has a heavier tail than the Lognormal
distribution.
Problem 9.17
Show that the inverse exponential distribution has a heavier tail than the
exponential distribution.
Problem 9.18
Let X and Y have similar (proportional) right tails and $\lim_{x \to \infty} \frac{S_X(x)}{S_Y(x)} = c$.
Which of the following is a possible value of c?
Problem 9.19
Let X be a Pareto distribution with parameters α = 4 and θ = 340. Let Y
be a Pareto distribution with parameters α = 6 and θ = 340.
Problem 9.20
You are given the right-tails of the survival functions of three distributions
Example 10.1
Let X be a random variable with survival function $S(x) = \frac{1}{x^2}$ if $x \geq 1$ and
1 otherwise. Based on the hazard rate function of the distribution, decide
whether the distribution is heavy-tailed or light-tailed.
Solution.
The hazard rate function is
\[ h(x) = -\frac{S'(x)}{S(x)} = -\frac{-\frac{2}{x^3}}{\frac{1}{x^2}} = \frac{2}{x}. \]
Hence, for $x \geq 1$,
\[ h'(x) = -\frac{2}{x^2} < 0 \]
which shows that h(x) is nonincreasing. We conclude that the distribution
of X is heavy-tailed.
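The hazard rate in Example 10.1 can be approximated numerically from the survival function alone, which is a useful check when $S'(x)$ is awkward to compute by hand. The sketch below assumes $S(x) = 1/x^2$ for $x \geq 1$ and uses a central difference for $S'(x)$; the step size and the evaluation points are illustrative choices.

```python
def survival(x):
    """S(x) = 1/x^2 for x >= 1, the survival function assumed in Example 10.1."""
    return x ** -2.0

def hazard(x, eps=1e-5):
    """h(x) = -S'(x)/S(x), with S'(x) approximated by a central difference."""
    slope = (survival(x + eps) - survival(x - eps)) / (2.0 * eps)
    return -slope / survival(x)

points = [1.5, 2.0, 5.0, 10.0]
numeric = [hazard(x) for x in points]
exact = [2.0 / x for x in points]  # closed form h(x) = 2/x
for x, n, e in zip(points, numeric, exact):
    print(x, n, e)
```

The numerical values track 2/x and decrease in x, matching the heavy-tailed classification.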
Remark 10.1
Under this definition, a constant hazard function can be called both nonincreasing
and nondecreasing. We will refer to distributions with constant
hazard function as medium-tailed distributions. Thus, the exponential
random variable, which was classified as light-tailed in Example 9.1, will be
referred to as a medium-tailed distribution.
The next result provides a criterion for testing tail weight based on the
probability density function.
Theorem 10.1
If for a fixed $y \geq 0$, the function $\frac{f(x+y)}{f(x)}$ is nonincreasing (resp. nondecreasing)
in x then the hazard rate function is nondecreasing (resp. nonincreasing).
Proof.
We have
\[ [h(x)]^{-1} = \frac{\int_x^\infty f(t)dt}{f(x)} = \frac{\int_0^\infty f(x + y)dy}{f(x)} = \int_0^\infty \frac{f(x + y)}{f(x)} dy. \]
Thus, if $\frac{f(x+y)}{f(x)}$ is nondecreasing in x for a fixed y, then h(x) is nonincreasing
in x. Likewise, if $\frac{f(x+y)}{f(x)}$ is nonincreasing in x for a fixed y, then h(x) is
nondecreasing in x.
Example 10.2
Using the above theorem, show that the Gamma distribution with parame-
ters θ > 0 and 0 < α < 1 is heavy-tailed.
Solution.
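We have
\[ \frac{f(x + y)}{f(x)} = \left(1 + \frac{y}{x}\right)^{\alpha - 1} e^{-y/\theta} \]
and
\[ \frac{d}{dx}\left[\frac{f(x + y)}{f(x)}\right] = \frac{y(1 - \alpha)}{x^2} \left(1 + \frac{y}{x}\right)^{\alpha - 2} e^{-y/\theta} > 0 \]
for $0 < \alpha < 1$. Thus, the hazard rate function is nonincreasing and the
distribution is heavy-tailed.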
Next, the hazard rate function can be used to compare the tail weight of two
10 TAIL WEIGHT MEASURES: HAZARD RATE FUNCTION AND MEAN EXCESS LOSS FUNCTION97
Example 10.3
Let X be the Pareto distribution with α = 2 and θ = 150 and Y be the
Pareto distribution with α = 3 and θ = 150. Compare the tail weight of
these distributions using
(a) the relative tail weight measure;
(b) the hazard rate measure.
Compare your results in (a) and (b).
Solution.
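(a) Note that both distributions are heavy-tailed using the hazard rate analysis,
since $h_X(x) = \frac{2}{x + 150}$ and $h_Y(x) = \frac{3}{x + 150}$ satisfy
$h_Y'(x) = -\frac{3}{(x + 150)^2} < h_X'(x) = -\frac{2}{(x + 150)^2} < 0$. Moreover, $h_Y(x) > h_X(x)$ for all $x \geq 0$, so that $S_Y(x) = e^{-\int_0^x h_Y(t)dt}$ decays faster than $S_X(x)$. Thus, Y has a lighter tail than X.
(b) Using the relative tail weight, we find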
Remark 10.2
Note that the Gamma distribution is light-tailed for all α > 0 and θ > 0
by the existence of moments analysis. However, the Gamma distribution is
heavy-tailed for 0 < α < 1 by the hazard rate analysis. Thus, the concept
of light/heavy right tailed is somewhat vague in this case.
In the context of life contingency models (see [3]), if X is the random variable
representing the age at death and if T(x) is the continuous random
variable representing time until death of someone now alive at age x, then
e(x) is denoted by $\mathring{e}_x = E[T(x)] = E[X - x | X > x]$. In words, for an
individual alive at age x, $\mathring{e}_x$ is the average additional number of years until
death from age x, given that the individual has survived to age x. We call
$\mathring{e}_x$ the complete expectation of life or the residual mean lifetime.
Next, we establish a relationship between e(x) and the hazard rate function.
We have
\[ \begin{aligned} e(x) &= \frac{E(X) - E(X \wedge x)}{1 - F(x)} = \frac{\int_0^\infty S_X(y)dy - \int_0^x S_X(y)dy}{S_X(x)} \\ &= \frac{\int_x^\infty S_X(y)dy}{S_X(x)} = \int_0^\infty \frac{S_X(x + y)}{S_X(x)} dy. \end{aligned} \]
But one of the characteristics of the hazard rate function is that it can
generate the survival function:
\[ S_X(x) = e^{-\int_0^x h(t)dt}, \]
so that
\[ \frac{S_X(x + y)}{S_X(x)} = e^{-\int_x^{x+y} h(t)dt}. \]
From the above discussion, we see that for a fixed y > 0, if $\frac{S_X(x+y)}{S_X(x)}$ is an
increasing function of x (and therefore e(x) is increasing) then the hazard
rate function is decreasing and consequently the distribution is heavy-tailed.
Likewise, if $\frac{S_X(x+y)}{S_X(x)}$ is a decreasing function of x (and therefore e(x) is
decreasing) then the hazard rate function is increasing and consequently the
distribution is light-tailed.
Example 10.4
Let X be a random variable with pdf $f(x) = 2xe^{-x^2}$ for x > 0 and 0
otherwise. Show that the distribution is light-tailed by showing $\frac{S_X(x+y)}{S_X(x)}$ is
a decreasing function of x.
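Solution.
We have $S_X(x) = \int_x^\infty 2te^{-t^2}dt = e^{-x^2}$. Thus, for a fixed y > 0, we have
\[ \frac{S_X(x + y)}{S_X(x)} = e^{-2xy - y^2} \]
which is a decreasing function of x. Hence, the distribution is light-tailed.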
Practice Problems
Problem 10.1
Show that the Gamma distribution with parameters θ > 0 and α > 1 is
light-tailed by showing that $\frac{f(x+y)}{f(x)}$ is nonincreasing.
Problem 10.2
Show that the Gamma distribution with parameters θ > 0 and α = 1 is
medium-tailed.
Problem 10.3
Let X be the Weibull distribution with probability density function
\[ f(x) = \frac{\tau x^{\tau - 1} e^{-\left(\frac{x}{\theta}\right)^\tau}}{\theta^\tau}. \]
Using hazard rate analysis, show that the distribution is heavy-tailed for
0 < τ < 1 and light-tailed for τ > 1.
Problem 10.4
Let X be a random variable with pdf $f(x) = 2xe^{-x^2}$ for x > 0 and 0 otherwise.
Problem 10.5
Using Theorem 10.1, show that the Pareto distribution is heavy-tailed.
Problem 10.6
Show that the hazard rate function of the Gamma distribution approaches
$\frac{1}{\theta}$ as $x \to \infty$.
Problem 10.7
Show that $\lim_{x \to \infty} e(x) = \lim_{x \to \infty} \frac{1}{h(x)}$.
Problem 10.8
Find limx→∞ e(x) where X is the Gamma distribution.
Problem 10.9
Let X be the Gamma distribution with 0 < α < 1 and θ > 0. Show that
e(x) increases from αθ to θ.
Problem 10.10
Let X be the Gamma distribution with α > 1 and θ > 0. Show that e(x)
decreases from αθ to θ.
Problem 10.11
Find limx→∞ e(x) where X is the Pareto distribution with parameters α and
θ and conclude that the distribution is heavy-tailed.
Problem 10.12
Let X be a random variable with pdf $f(x) = \frac{1}{(1 + x)^2}$ for x > 0 and 0 otherwise.
Problem 10.13
Let X be a random variable with mean excess loss function e(x) = x + 1.
Problem 10.14
Let X be a random variable with mean excess loss function e(x) = x + 1.
Determine the tail behavior of X by using the hazard rate analysis.
Problem 10.15
Let X be a random variable with mean excess loss function e(x) = x + 1.
Determine the tail behavior of X by using the mean excess loss function
analysis.
Problem 10.16
Let X be a random variable with survival function $S(x) = e^{-\left(\frac{x}{\theta}\right)^2}$. Determine the tail
behavior of X by using the mean excess loss function analysis.
Problem 10.17
Let X be the single-Pareto distribution with pdf
\[ f(x) = \frac{\alpha \theta^\alpha}{x^{\alpha + 1}}. \]
Use Theorem 10.1 to show that X is heavy-tailed.
Example 11.1
Show that the equilibrium mean is given by
\[ E(X_e) = \frac{E(X^2)}{2E(X)}. \]
Solution.
Using integration by parts, we find
\[ \begin{aligned} E(X^2) &= \int_0^\infty x^2 f(x)dx \\ &= -x^2 S(x) \Big|_0^\infty + 2\int_0^\infty xS(x)dx \\ &= 2\int_0^\infty xS(x)dx \end{aligned} \]
since
\[ 0 \leq x^2 S(x) = x^2 \int_x^\infty f(t)dt \leq \int_x^\infty t^2 f(t)dt \]
which implies
\[ \lim_{x \to \infty} x^2 S(x) = 0. \]
Now,
\[ E(X_e) = \int_0^\infty x f_e(x)dx = \frac{1}{E(X)} \int_0^\infty xS(x)dx = \frac{E(X^2)}{2E(X)}. \]
Example 11.2
Show that
\[ S(x) = \frac{e(0)}{e(x)} e^{-\int_0^x \frac{1}{e(t)} dt}. \]
Solution.
Using e(0) = E(X), we have
Example 11.3
Show that
\[ \frac{e(x)}{e(0)} = \frac{S_e(x)}{S(x)}. \]
Solution.
Since $S_e(x) = \frac{1}{E(X)} \int_x^\infty S(t)dt$, we have $\int_x^\infty S(t)dt = e(0)S_e(x)$. Since $\frac{S(x)}{\int_x^\infty S(t)dt} = \frac{1}{e(x)}$,
we obtain $\int_x^\infty S(t)dt = e(x)S(x)$. Thus, $e(x)S(x) = e(0)S_e(x)$ or equivalently
\[ \frac{e(x)}{e(0)} = \frac{S_e(x)}{S(x)}. \]
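If the mean residual life function is increasing (implied if the hazard rate
function of X is decreasing by Section 10) then $e(x) \geq e(0)$ and
\[ S_e(x) \geq S(x). \]
Integrating both sides over $(0, \infty)$ gives $E(X_e) \geq E(X)$, which implies
\[ \frac{E(X^2)}{2E(X)} \geq E(X) \]
and this can be rewritten as
\[ E(X^2) - [E(X)]^2 \geq [E(X)]^2 \]
which gives
\[ \mathrm{Var}(X) \geq [E(X)]^2. \]
Also,
\[ [CV(X)]^2 = \frac{\mathrm{Var}(X)}{[E(X)]^2} \geq 1. \]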
Example 11.4
Let X be the random variable with pdf $f(x) = \frac{2}{(1 + x)^3}$ for $x \geq 0$ and 0
otherwise.
(a) Determine the survival function S(x).
(b) Determine the hazard rate function h(x).
(c) Determine E(X).
(d) Determine the pdf of the equilibrium distribution.
(e) Determine the survival function Se (x) of the equilibrium distribution.
(f) Determine the hazard function of the equilibrium distribution.
(g) Determine the mean residual lifetime of X.
Solution.
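(a) The survival function is
\[ S(x) = \int_x^\infty \frac{2dt}{(1 + t)^3} = -\frac{1}{(1 + t)^2} \Big|_x^\infty = \frac{1}{(1 + x)^2}. \]
(b) The hazard rate function is
\[ h(x) = \frac{f(x)}{S(x)} = \frac{2}{1 + x}. \]
(c) We have
\[ E(X) = \int_0^\infty S(x)dx = \int_0^\infty \frac{dx}{(1 + x)^2} = 1. \]
(d) We have
\[ f_e(x) = \frac{S(x)}{E(X)} = \frac{1}{(1 + x)^2}, \quad x \geq 0. \]
(e) We have
\[ S_e(x) = \int_x^\infty \frac{dt}{(1 + t)^2} = -\frac{1}{1 + t} \Big|_x^\infty = \frac{1}{1 + x}. \]
(f) We have
\[ h_e(x) = \frac{f_e(x)}{S_e(x)} = \frac{1}{1 + x}. \]
(g) We have
\[ e(x) = \frac{1}{h_e(x)} = x + 1, \quad x \geq 0. \]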
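Part (g) of Example 11.4 can be cross-checked against the integral definition $e(x) = \frac{1}{S(x)}\int_x^\infty S(t)dt$. The sketch below evaluates the improper integral with a midpoint rule after the substitution $t = x + u/(1-u)$, which maps $(0, 1)$ onto $(x, \infty)$. The substitution, the number of subintervals, and the function names are choices made for this illustration.

```python
def survival(x):
    """S(x) = 1/(1+x)^2, the survival function found in part (a)."""
    return 1.0 / (1.0 + x) ** 2

def mean_excess(x, n=20000):
    """e(x) = (1/S(x)) * integral of S over (x, infinity), by a midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        t = x + u / (1.0 - u)                      # t runs over (x, infinity)
        total += survival(t) / (1.0 - u) ** 2 * h  # dt = du / (1-u)^2
    return total / survival(x)

values = {x: mean_excess(x) for x in (0.0, 1.0, 3.0)}
print(values)
```

Each value should be close to x + 1, in agreement with $e(x) = 1/h_e(x)$.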
Practice Problems
Problem 11.1
Let X be the random variable with pdf $f(x) = 2xe^{-x^2}$ for x > 0 and 0
otherwise. Recall $\int_0^\infty e^{-x^2}dx = \frac{\sqrt{\pi}}{2}$.
(a) Determine the survival function S(x).
(b) Determine the equilibrium distribution fe (x).
Problem 11.2
Let X be a random variable with pdf $f(x) = \frac{1}{3}(1 + 2x)e^{-x}$ for x > 0
and 0 otherwise. Determine the hazard rate function of the equilibrium
distribution. Hint: Example 5.3.
Problem 11.3
Let X be a random variable with mean excess loss function
\[ e(x) = \frac{1}{1 + x}, \quad x > 0. \]
Determine the survival function of the distribution X.
Problem 11.4
Let X be a random variable with mean excess loss function
\[ e(x) = \frac{1}{1 + x}, \quad x > 0. \]
Determine the survival function of the equilibrium distribution.
Problem 11.5
Let X be a random variable with mean excess loss function
\[ e(x) = \frac{1}{1 + x}, \quad x > 0. \]
Determine the mean of the equilibrium distribution.
Problem 11.6
Let X be a random variable with pdf $f(x) = \frac{3}{8}x^2$ for 0 < x < 2 and 0
otherwise.
(a) Find E(X) and E(X 2 ).
(b) Find the equilibrium mean.
11 EQUILIBRIUM DISTRIBUTIONS AND TAIL WEIGHT 107
Problem 11.7
Let X be a loss random variable with mean excess loss function
Problem 11.8
A random variable X has an exponential distribution with parameter λ.
Calculate the equilibrium mean.
Risk Measures
Remark 12.1
If L is a loss random variable then ρ(L) may be interpreted as the riskiness
12 COHERENT RISK MEASUREMENT 111
Example 12.1
Show that the expectation function E(·) is a coherent risk measure on L.
Solution.
The expectation function E(·) satisfies the following properties:
Example 12.2
Show that the variance is not a coherent risk measure.
Solution.
Since $\mathrm{Var}(L + a) = \mathrm{Var}(L) \neq \mathrm{Var}(L) + a$ for $a \neq 0$, the variance fails translation invariance. Hence, the variance of a distribution is
not a coherent risk measure.
Example 12.3
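Show that $\rho(L) = E(L) + \beta \mathrm{Var}(L)$, where β > 0, satisfies the property of
translation invariance but not positive homogeneity. We refer to this risk
measure as the variance premium principle.
Solution.
We have
\[ \rho(L + a) = E(L + a) + \beta \mathrm{Var}(L + a) = E(L) + a + \beta \mathrm{Var}(L) = \rho(L) + a \]
and
\[ \rho(\alpha L) = E(\alpha L) + \beta \mathrm{Var}(\alpha L) = \alpha E(L) + \alpha^2 \beta \mathrm{Var}(L) \neq \alpha \rho(L) \]
where α > 0 and $\alpha \neq 1$ (assuming $\mathrm{Var}(L) > 0$).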
Example 12.4
Show that $\rho(L) = \frac{1}{\alpha} \ln[E(e^{\alpha L})]$, where α > 0, satisfies the properties of
translation invariance and monotonicity. We refer to this risk measure as the
exponential premium principle.
Solution.
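We have
\[ \begin{aligned} \rho(L + \beta) &= \frac{1}{\alpha} \ln[E(e^{\alpha(L + \beta)})] \\ &= \frac{1}{\alpha} \ln[E(e^{\alpha L} e^{\alpha \beta})] \\ &= \frac{1}{\alpha} \ln[e^{\alpha \beta} E(e^{\alpha L})] \\ &= \frac{1}{\alpha} [\alpha \beta + \ln[E(e^{\alpha L})]] \\ &= \rho(L) + \beta. \end{aligned} \]
Next, suppose that $L_1 \leq L_2$. Then
\[ \begin{aligned} e^{\alpha L_1} &\leq e^{\alpha L_2} \\ E(e^{\alpha L_1}) &\leq E(e^{\alpha L_2}) \\ \ln[E(e^{\alpha L_1})] &\leq \ln[E(e^{\alpha L_2})] \\ \rho(L_1) &\leq \rho(L_2). \end{aligned} \]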
Practice Problems
Problem 12.1
Show that ρ(0) = 0 and interpret this result.
Problem 12.2
Show that ρ(αL + β) = αρ(L) + β, where α > 0 and β ∈ R.
Problem 12.3
Show that ρ(L) = (1 + α)E(L) is a coherent risk measure, where α ≥ 0.
This risk measure is known as the expected value premium principle.
Problem 12.4
Which of the following is an implication of the subadditivity requirement
for a coherent risk measure?
Problem 12.5
Which of the following is an implication of the monotonicity requirement
for a coherent risk measure?
Problem 12.6
Which of the following is an implication of the positive homogeneity require-
ment for a coherent risk measure? More than one answer may be correct.
(a) If one assumes twice the amount of risk formerly assumed, one will
need twice the capital.
(b) As the size of a position doubles, the risk stays unchanged.
(c) The risk of the position increases in a linear way with the size of the
position.
Problem 12.7
Which of the following is an implication of the translation invariant require-
ment for a coherent risk measure? More than one answer may be correct.
(a) Adding a fixed amount to the initial financial position should increase
the risk by the same amount.
(b) Subtracting a fixed amount from a portfolio decreases the required risk
capital by the same amount.
(c) Getting additional capital, if it is from a risk-free source, cannot funda-
mentally alter the riskiness of a position.
Problem 12.8
Show that ρ(L) = E(L) + αE[L − E(L)] satisfies (P1), (P3), and (P4).
Problem 12.9
Find the numerical value of ρ(L − ρ(L)).
Problem 12.10
Show that $\rho(L) = E(L) + \sqrt{\mathrm{Var}(L)}$ satisfies the properties of translation
invariant and positive homogeneity. We refer to this risk measure as the
standard deviation principle.
13 Value-at-Risk
A standard risk measure used to evaluate exposure to risk is the value-at-risk,
abbreviated VaR. In general terms, the value-at-risk measures the
potential loss of value of an asset or a portfolio over a defined time with a
high level of certainty. For example, if the VaR is $1 million at a one-month,
99% confidence level, then there is a 1% chance that under normal market
movements the monthly loss will exceed $1 million. Bankers use VaR to
capture the potential losses in their traded portfolios from adverse market
movements over a period of time; then they compare it with their
available capital and cash reserves to ensure that the losses can be covered
without putting the firm at risk.
Example 13.1
Let L be an exponential loss random variable with mean λ > 0. Find πp .
Solution.
The pdf of L is $f(x) = \frac{1}{\lambda} e^{-x/\lambda}$ for x > 0 and 0 otherwise. Thus, $F(x) = 1 - e^{-x/\lambda}$.
Now, solving the equation $F(\pi_p) = p$, that is, $1 - e^{-\pi_p/\lambda} = p$, we
obtain $\pi_p = -\lambda \ln(1 - p)$.
Example 13.2
The loss random variable L has a Pareto distribution with parameters α
and θ. Find πp .
Solution.
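The pdf of L is $f(x) = \frac{\alpha \theta^\alpha}{(x + \theta)^{\alpha + 1}}$ for x > 0 and 0 otherwise. The cdf is
$F(x) = 1 - \left(\frac{\theta}{x + \theta}\right)^\alpha$. Solving the equation $F(\pi_p) = p$, we find
\[ \pi_p = \theta[(1 - p)^{-1/\alpha} - 1]. \]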
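The closed-form quantiles of Examples 13.1 and 13.2 are easy to sanity-check by plugging them back into the corresponding cdf. The following sketch does this for one confidence level; the parameter values and function names are illustrative assumptions rather than part of the text.

```python
import math

def var_exponential(p, mean):
    """VaR_p for an exponential loss with the given mean (Example 13.1)."""
    return -mean * math.log(1.0 - p)

def var_pareto(p, alpha, theta):
    """VaR_p for a two-parameter Pareto loss (Example 13.2)."""
    return theta * ((1.0 - p) ** (-1.0 / alpha) - 1.0)

def cdf_exponential(x, mean):
    return 1.0 - math.exp(-x / mean)

def cdf_pareto(x, alpha, theta):
    return 1.0 - (theta / (x + theta)) ** alpha

p = 0.95  # illustrative confidence level (assumption)
q_exp = var_exponential(p, mean=10.0)
q_par = var_pareto(p, alpha=3.0, theta=100.0)

# Plugging each quantile back into its cdf should return p.
print(q_exp, cdf_exponential(q_exp, 10.0))
print(q_par, cdf_pareto(q_par, 3.0, 100.0))
```

Evaluating the cdf at the computed quantile is a quick way to catch sign or exponent slips in a hand-derived formula.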
Example 13.3
The loss random variable L follows a normal distribution with mean µ and
standard deviation σ. Find πp .
Solution.
Let $Z = \frac{L - \mu}{\sigma}$. Then Z has the standard normal distribution. The p-quantile
of Z satisfies the equation $\Phi(z) = p$. Thus, $z = \Phi^{-1}(p)$. Hence,
\[ \pi_p = \mu + \sigma z = \mu + \sigma \Phi^{-1}(p). \]
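The formula $\pi_p = \mu + \sigma \Phi^{-1}(p)$ can be evaluated with Python's standard library, where `statistics.NormalDist().inv_cdf` plays the role of $\Phi^{-1}$. The numerical values of µ, σ, and p below are assumptions chosen only for illustration.

```python
from statistics import NormalDist

def var_normal(p, mu, sigma):
    """VaR_p of a normal loss: mu + sigma * (standard normal p-quantile)."""
    z = NormalDist().inv_cdf(p)  # z = Phi^{-1}(p)
    return mu + sigma * z

mu, sigma, p = 50.0, 5.0, 0.99  # illustrative values (assumption)
q = var_normal(p, mu, sigma)

# The cdf of N(mu, sigma^2) evaluated at the quantile should return p.
print(q, NormalDist(mu, sigma).cdf(q))
```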
Example 13.4
Consider a sample of size 8 in which the observed data points were 3,5,6,6,6,7,7,
and 10. Find VaR0.90 (L) for this empirical distribution.
Solution.
The pmf of L is given below.
x      3     5     6     7     10
p(x)   1/8   1/8   3/8   2/8   1/8
Since F(7) = 7/8 = 0.875 < 0.90 and F(10) = 1, we have $\pi_{0.90} = 10$.
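For an empirical distribution, the quantile used in Example 13.4 is the smallest observed value x with $F(x) \geq p$. A minimal sketch of that rule follows; the function name is hypothetical, and other quantile conventions (for example, interpolated percentiles) would give different answers.

```python
def empirical_var(data, p):
    """Smallest observation x whose empirical cdf F(x) is at least p."""
    xs = sorted(data)
    n = len(xs)
    for i, x in enumerate(xs, start=1):
        # At least i of the n observations are <= x, so F(x) >= i/n.
        if i / n >= p:
            return x
    return xs[-1]

sample = [3, 5, 6, 6, 6, 7, 7, 10]
q90 = empirical_var(sample, 0.90)
print(q90)
```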
Remark 13.1
According to [1], VaRp (L) is monotone, positive homogeneous, and trans-
lation invariant but not subadditive. Thus, VaRp (L) is not a coherent risk
measure.
Practice Problems
Problem 13.1
The loss random variable L has a uniform distribution in [a, b]. Find VaRp (L).
Problem 13.2
The cdf of a loss random variable L is given by
\[ F_L(x) = \begin{cases} \frac{x^2}{4}, & 0 < x \leq 2 \\ 1, & x > 2. \end{cases} \]
Find π0.90 .
Problem 13.3
You are given the following empirical distribution
3, 5, 6, 6, 6, 7, 7, 10.
The risk measure under the standard deviation principle is ρ(L) = E(L) +
ασ(L). Determine the value of α so that ρ(L) = π0.90 .
Problem 13.4
Losses represented by L are distributed as a Pareto distribution with pa-
rameters α = 2 and θ = 60. Find VaR0.75 (L).
Problem 13.5
Losses represented by L are distributed as a single Pareto distribution with
a pdf $f(x) = \frac{\alpha \theta^\alpha}{x^{\alpha + 1}}$, x > θ and 0 otherwise. Find $\pi_p$.
Problem 13.6
A loss random variable X has a survival function
\[ S(x) = \left(\frac{\theta}{x + \theta}\right)^2, \quad x > 0. \]
Find θ given that π0.75 = 40.
Problem 13.7
Let L be a random variable with discrete loss distribution given by
x 0 100 1000 10000 100000
p(x) 0.65 0.20 0.07 0.05 0.03
Calculate the Value-at-Risk of L at the 90% level.
Problem 13.8
A loss random variable L has a two-parameter Pareto distribution satisfying:
Problem 13.9
Let L be a loss random variable with probability generating function
Problem 13.10
A loss random variable L has a survival function
\[ S(x) = \left(\frac{100}{x + 100}\right)^2, \quad x > 0. \]
14 Tail-Value-at-Risk
The quantile risk measure discussed in the previous section provides us only
with the probability that a loss random variable L will exceed $\mathrm{VaR}_p(L)$
for a certain confidence level. It does not provide any information about
how large the losses are beyond a particular percentile. The Tail-Value-at-Risk
(TVaR) measure does consider losses above a percentile. Other
names used for TVaR are Tail Conditional Expectation and Expected
Shortfall.
\[ \mathrm{TVaR}_p(L) = E[L | L > \mathrm{VaR}_p(L)] = E[L | L > F_L^{-1}(p)] \]
where $F_L(x)$ is the distribution of L. This is the expected value of the loss,
conditional on the loss exceeding $\pi_p$. Note that TVaR is also the expected
cost per payment with a franchise deductible of $\pi_p$ (see Section 32).
Since
\[ E[L - \pi_p | L > \pi_p] = \frac{E(L) - E(L \wedge \pi_p)}{1 - p}, \]
we can write
\[ \mathrm{TVaR}_p(L) = \mathrm{VaR}_p(L) + \frac{E(L) - E(L \wedge \pi_p)}{1 - p}. \]
Remark 14.1
Unlike the VaR risk measure, TVaR risk measure is shown to be coherent.
Example 14.1
Find the Tail-Value-at-Risk of an exponential distribution with mean λ > 0.
Solution.
From Problem 5.7, we have $e(\pi_p) = \lambda$. This and Example 13.1 give
\[ \mathrm{TVaR}_p(L) = \lambda - \lambda \ln(1 - p). \]
Example 14.2
Find the Tail-Value-at-Risk of a Pareto distribution with parameters α > 1
and θ > 0.
Solution.
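The survival function of the Pareto distribution is
\[ S(x) = \left(\frac{\theta}{x + \theta}\right)^\alpha, \quad x > 0. \]
Thus,
\[ \begin{aligned} e(\pi_p) &= \frac{1}{S(\pi_p)} \int_{\pi_p}^\infty S(x)dx \\ &= (\pi_p + \theta)^\alpha \int_{\pi_p}^\infty (x + \theta)^{-\alpha}dx \\ &= -\frac{(\pi_p + \theta)^\alpha}{\alpha - 1} (x + \theta)^{1 - \alpha} \Big|_{\pi_p}^\infty \\ &= \frac{\pi_p + \theta}{\alpha - 1}. \end{aligned} \]
On the other hand, using Example 13.2, we have
\[ \pi_p = \theta[(1 - p)^{-1/\alpha} - 1]. \]
Hence,
\[ \mathrm{TVaR}_p(L) = \frac{\pi_p + \theta}{\alpha - 1} + \theta[(1 - p)^{-1/\alpha} - 1]. \]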
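The TVaR formulas of Examples 14.1 and 14.2 can be compared with a direct numerical evaluation of $E[L | L > \pi_p] = \frac{1}{1-p}\int_{\pi_p}^\infty x f(x)dx$. The sketch below uses a midpoint rule after mapping $(\pi_p, \infty)$ onto $(0, 1)$; the substitution, the `scale` argument, and all parameter values are assumptions made for this illustration.

```python
import math

def integrate_to_infinity(g, a, scale, n=100000):
    """Midpoint rule for the integral of g over (a, infinity),
    using the substitution x = a + scale * u / (1 - u), 0 < u < 1."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        x = a + scale * u / (1.0 - u)
        total += g(x) * scale / (1.0 - u) ** 2 * h
    return total

def tvar_numeric(pdf, var_p, p, scale):
    """E[L | L > VaR_p] computed from the pdf by numerical integration."""
    return integrate_to_infinity(lambda x: x * pdf(x), var_p, scale) / (1.0 - p)

p = 0.90  # illustrative confidence level (assumption)

# Exponential loss with mean lam: TVaR_p = lam - lam * ln(1 - p).
lam = 10.0
var_exp = -lam * math.log(1.0 - p)
tvar_exp_formula = lam + var_exp
tvar_exp_numeric = tvar_numeric(lambda x: math.exp(-x / lam) / lam, var_exp, p, scale=lam)

# Pareto loss: TVaR_p = pi_p + (pi_p + theta) / (alpha - 1).
alpha, theta = 3.0, 100.0
var_par = theta * ((1.0 - p) ** (-1.0 / alpha) - 1.0)
tvar_par_formula = var_par + (var_par + theta) / (alpha - 1.0)
tvar_par_numeric = tvar_numeric(
    lambda x: alpha * theta**alpha / (x + theta) ** (alpha + 1.0), var_par, p, scale=theta
)

print(tvar_exp_formula, tvar_exp_numeric)
print(tvar_par_formula, tvar_par_numeric)
```

In both cases the numerical value should agree with the closed form to several decimal places, and TVaR exceeds VaR by the mean excess loss $e(\pi_p)$.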
Example 14.3
Let Z be the standard normal distribution with pdf $f_Z(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. Find
$\mathrm{TVaR}_p(Z)$.
Solution.
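Notice first that $f_Z(x)$ satisfies the differential equation $x f_Z(x) = -f_Z'(x)$.
Using the Fundamental Theorem of Calculus, we can write
\[ \begin{aligned} \mathrm{TVaR}_p(Z) &= \frac{1}{1 - p} \int_{\Phi^{-1}(p)}^\infty x f_Z(x)dx \\ &= -\frac{1}{1 - p} \int_{\Phi^{-1}(p)}^\infty f_Z'(x)dx \\ &= -\frac{1}{1 - p} f_Z(x) \Big|_{\Phi^{-1}(p)}^\infty \\ &= \frac{1}{1 - p} f_Z[\Phi^{-1}(p)]. \end{aligned} \]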
Example 14.4
Let L be a loss random variable having a normal distribution with mean µ
and standard deviation σ. Find TVaRp (L).
Solution.
Since $\mathrm{TVaR}_p(L)$ is a coherent risk measure, it is positive homogeneous and
translation invariant. Thus, we have
\[ \mathrm{TVaR}_p(L) = \mathrm{TVaR}_p(\mu + \sigma Z) = \mu + \sigma \mathrm{TVaR}_p(Z) = \mu + \frac{\sigma}{1 - p} f_Z[\Phi^{-1}(p)]. \]
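The normal TVaR formula lends itself to a short numerical check: evaluate $\mu + \frac{\sigma}{1-p} f_Z[\Phi^{-1}(p)]$ with `statistics.NormalDist` and compare it with a midpoint-rule approximation of $\frac{1}{1-p}\int_{\pi_p}^\infty x f_L(x)dx$. The parameter values, the truncation of the integral at µ + 12σ, and the function names are assumptions of this sketch.

```python
from statistics import NormalDist

def tvar_normal(p, mu, sigma):
    """Closed form: mu + sigma * phi(z_p) / (1 - p), with z_p = Phi^{-1}(p)."""
    std = NormalDist()
    z = std.inv_cdf(p)
    return mu + sigma * std.pdf(z) / (1.0 - p)

def tvar_normal_numeric(p, mu, sigma, n=50000):
    """Midpoint-rule approximation of E[L | L > VaR_p(L)]."""
    dist = NormalDist(mu, sigma)
    lower = dist.inv_cdf(p)
    upper = mu + 12.0 * sigma  # mass beyond this point is negligible
    h = (upper - lower) / n
    total = 0.0
    for i in range(n):
        x = lower + (i + 0.5) * h
        total += x * dist.pdf(x) * h
    return total / (1.0 - p)

mu, sigma, p = 50.0, 5.0, 0.90  # illustrative values (assumption)
closed_form = tvar_normal(p, mu, sigma)
numeric = tvar_normal_numeric(p, mu, sigma)
print(closed_form, numeric)
```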
Practice Problems
Problem 14.1
Let L be a loss random variable with uniform distribution in (a, b).
Problem 14.2
The cdf of a loss random variable L is given by
\[ F_L(x) = \begin{cases} \frac{x^2}{4}, & 0 < x \leq 2 \\ 1, & x > 2. \end{cases} \]
Problem 14.3
You are given the following empirical distribution
3, 5, 6, 6, 6, 7, 7, 10.
Problem 14.4
Losses are distributed as Pareto distributions with mean of 200 and variance
of 60000.
Problem 14.5
Losses represented by the random variable L are uniformly distributed from
0 to the maximum loss. You are given that Var(L) = 62,208.
Problem 14.6
Losses represented by the random variable L are uniformly distributed in
(0, 864). Determine β so that the standard deviation principle is equal to
TVaR0.75 (L).
Problem 14.7
Diabetes claims follow an exponential distribution with parameter λ = 2.
Find TVaR0.90 (L).
Problem 14.8
You are given the following empirical distribution
3, 5, 6, 6, 6, 7, 7, 10.
Calculate β2 − β1 .
Problem 14.9
Let L1 be a Pareto random variable with parameters α = 2 and θ = 100.
Let L2 be a random variable with uniform distribution on (0, 864). Find p
such that
\[ \frac{\mathrm{TVaR}_{0.99}(L_1)}{\mathrm{VaR}_{0.99}(L_1)} = \frac{\mathrm{TVaR}_p(L_2)}{\mathrm{VaR}_p(L_2)}. \]
Problem 14.10
Let L be a random variable with discrete loss distribution given by
Problem 14.11
Find TVaR0.95 (L) when L has a normal distribution with mean of 100 and
standard deviation of 10.
Characteristics of Actuarial
Models
Example 15.1
Show that the Pareto distribution is a scale distribution.
Solution.
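The cdf of the Pareto distribution is
\[ F_X(x) = 1 - \left(\frac{\theta}{x + \theta}\right)^\alpha. \]
Let Y = cX with c > 0. Then
\[ F_Y(y) = \Pr(Y \leq y) = \Pr\left(X \leq \frac{y}{c}\right) = 1 - \left(\frac{c\theta}{y + c\theta}\right)^\alpha. \]
This is a Pareto distribution with parameters α and cθ.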
Example 15.2
Show that the Weibull distribution with parameters θ and τ is a scale dis-
tribution.
15 PARAMETRIC AND SCALE DISTRIBUTIONS 127
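Solution.
The cdf of the Weibull distribution is
\[ F_X(x) = 1 - e^{-\left(\frac{x}{\theta}\right)^\tau}. \]
Let Y = cX with c > 0. Then
\[ F_Y(y) = \Pr\left(X \leq \frac{y}{c}\right) = 1 - e^{-\left(\frac{y}{c\theta}\right)^\tau}. \]
This is a Weibull distribution with parameters cθ and τ.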
Example 15.3
Show that the parameter θ in the Pareto distribution is a scale parameter.
Solution.
This follows from Example 15.1.
Example 15.4
Find the scale parameter of the Weibull distribution with parameters θ and
τ.
Solution.
According to Example 15.2, the scale parameter is θ
Example 15.5
The amount of money in dollars that Clark received in 2010 from his invest-
ment in Simplicity futures follows a Pareto distribution with parameters
α = 3 and θ. Annual inflation in the United States from 2010 to 2011 is
i%. The 80th percentile of the earning size in 2010 equals the mean earning
size in 2011. If Clark’s investment income keeps up with inflation but is
otherwise unaffected, determine i.
Solution.
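Let X be the earning size in 2010 and Y that in 2011. Then Y is a Pareto
distribution with parameters α = 3 and (1 + i)θ. We are told that
\[ \pi_{0.80} = E(Y) = (1 + i)E(X) = \frac{(1 + i)\theta}{3 - 1} = \frac{(1 + i)\theta}{2}. \]
Thus,
\[ 0.8 = \Pr\left(X < \frac{(1 + i)\theta}{2}\right) = F_X\left(\frac{(1 + i)\theta}{2}\right) = 1 - \left(\frac{\theta}{\frac{(1 + i)\theta}{2} + \theta}\right)^3. \]
Solving the above equation for i, we find $\left(\frac{2}{3 + i}\right)^3 = 0.2$, so that $i = 2(5)^{1/3} - 3 \approx 0.42 = 42\%$.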
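The equation at the end of Example 15.5 can also be solved numerically, which is a convenient check on the algebra. The sketch below finds the root of $F_X\left(\frac{(1+i)\theta}{2}\right) - 0.8$ by bisection; θ is set to 1 because it cancels, and the bracket [0, 1] for i is an assumption of this illustration.

```python
def cdf_pareto(x, alpha, theta):
    return 1.0 - (theta / (x + theta)) ** alpha

def gap(i, alpha=3.0, theta=1.0):
    """F_X evaluated at the 2011 mean, minus the target probability 0.8."""
    mean_2011 = (1.0 + i) * theta / (alpha - 1.0)
    return cdf_pareto(mean_2011, alpha, theta) - 0.8

# Bisection on [0, 1]: gap(0) < 0 < gap(1) and gap is increasing in i.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2.0
    if gap(mid) < 0.0:
        lo = mid
    else:
        hi = mid
inflation = (lo + hi) / 2.0
print(inflation)
```

The root agrees with the closed form $2(5)^{1/3} - 3$, that is, inflation of roughly 42%.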
Example 15.6
Show that exponential distributions belong to the Weibull distribution fam-
ily with parameters θ and τ.
Solution.
Weibull distributions with parameters θ and τ have pdf $f_X(x) = \frac{\tau \left(\frac{x}{\theta}\right)^\tau e^{-\left(\frac{x}{\theta}\right)^\tau}}{x}$.
Letting τ = 1, the pdf reduces to $f_X(x) = \frac{e^{-x/\theta}}{\theta}$, which is the pdf of an exponential
distribution.
Practice Problems
Problem 15.1
Show that the exponential distribution is a scale distribution.
Problem 15.2
Let X be a random variable with pdf $f_X(x) = 2xe^{-x^2}$ for x > 0 and 0
otherwise. Let Y = cX for c > 0.
Find FY (y).
Problem 15.3
Let X be a uniform random variable on the interval (0, θ). Let Y = cX for
c > 0.
Find FY (y).
Problem 15.4
Show that the Fréchet distribution with cdf $F_X(x) = e^{-\left(\frac{x}{\theta}\right)^{-\alpha}}$ and parameters
θ and α is a scale distribution.
Problem 15.5
Show that the three-parameter Burr distribution with cdf $F_X(x) = 1 - \frac{1}{\left[1 + \left(\frac{x}{\theta}\right)^\gamma\right]^\alpha}$ is a scale distribution.
Problem 15.6
Find the scale parameter of the following distributions:
Problem 15.8
Let X be the lognormal distribution with parameters µ and σ and cdf
$F_X(x) = \Phi\left(\frac{\ln x - \mu}{\sigma}\right)$.
Problem 15.9
Show that the Gamma distribution is a scale distribution. Is there a scale
parameter?
Problem 15.10
Earnings during 2012 follow a Gamma distribution with variance 2,500. For
2013, earnings are expected to be subject to P% inflation and the expected
variance for 2013 is 10,000.
Problem 15.11
The Gamma distribution with parameters α and θ has the pdf $f_X(x) = \frac{x^{\alpha - 1} e^{-x/\theta}}{\theta^\alpha \Gamma(\alpha)}$.
Problem 15.12
Hardy Auto Insurance claims X are represented by a Weibull distribution
with parameters α = 2 and θ = 400. It is found that the claim sizes are
inflated by 30% uniformly.
The mixing weights are discrete probabilities. To see this, let Θ be the dis-
crete random variable with support {1, 2, · · · , k} and pmf Pr(Θ = i) = ai .
We can think of the distribution of Θ as a conditioning distribution where
$X = X_i$ is conditioned on Θ = i, or equivalently, $F_{X|\Theta}(x | \Theta = i) = F_{X_i}(x)$.
In this case, X is the unconditional distribution with cdf
\[ F_X(x) = a_1 F_{X_1}(x) + a_2 F_{X_2}(x) + \cdots + a_k F_{X_k}(x) = \sum_{i=1}^k F_{X|\Theta}(x | \Theta = i) \Pr(\Theta = i). \]
Example 16.1
Let Y be a 2-point mixture of two random variables X1 and X2 with mixing
weights 0.6 and 0.4 respectively. The random variable X1 is a Pareto random
variable with parameters α = 3 and θ = 900. The random variable X2 is a
Pareto random variable with parameters α = 5 and θ = 1500. Find the pdf
of Y.
Solution.
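We are given
\[ f_{X_1}(x) = \frac{3(900)^3}{(x + 900)^4} \quad \text{and} \quad f_{X_2}(x) = \frac{5(1500)^5}{(x + 1500)^6}. \]
Thus,
\[ f_Y(x) = 0.6 f_{X_1}(x) + 0.4 f_{X_2}(x) = 0.6 \frac{3(900)^3}{(x + 900)^4} + 0.4 \frac{5(1500)^5}{(x + 1500)^6}. \]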
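The mixture density of Example 16.1 can be checked against the mixture cdf $F_Y(x) = 0.6F_{X_1}(x) + 0.4F_{X_2}(x)$: a numerical derivative of the cdf should reproduce the pdf. The evaluation point, the step size, and the function names below are assumptions chosen for illustration.

```python
def pareto_pdf(x, alpha, theta):
    return alpha * theta**alpha / (x + theta) ** (alpha + 1)

def pareto_cdf(x, alpha, theta):
    return 1.0 - (theta / (x + theta)) ** alpha

def mixture_pdf(x):
    """f_Y(x) = 0.6 f_{X1}(x) + 0.4 f_{X2}(x) from Example 16.1."""
    return 0.6 * pareto_pdf(x, 3, 900) + 0.4 * pareto_pdf(x, 5, 1500)

def mixture_cdf(x):
    """F_Y(x) = 0.6 F_{X1}(x) + 0.4 F_{X2}(x)."""
    return 0.6 * pareto_cdf(x, 3, 900) + 0.4 * pareto_cdf(x, 5, 1500)

x, eps = 500.0, 1e-3  # illustrative evaluation point and step (assumption)
numeric_density = (mixture_cdf(x + eps) - mixture_cdf(x - eps)) / (2.0 * eps)
print(mixture_pdf(x), numeric_density)
```

Because mixing acts linearly on cdfs, it acts linearly on pdfs as well, which is what the agreement of the two printed numbers reflects.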
Example 16.2 ‡
The random variable N has a mixed distribution:
(i) With probability p, N has a binomial distribution with q = 0.5 and
m = 2.
(ii) With probability 1 − p, N has a binomial distribution with q = 0.5 and
m = 4.
Calculate Pr(N = 2).
Solution.
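We have
\[ \Pr(N = 2) = p \, C(2, 2)(0.5)^2 + (1 - p) \, C(4, 2)(0.5)^2 (0.5)^2 = 0.25p + 0.375(1 - p) = 0.375 - 0.125p. \]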
Example 16.3
Determine the mean and second moment of the two-point mixture distribu-
tion with the cdf
\[ F_X(x) = 1 - \alpha \left(\frac{\theta_1}{x + \theta_1}\right)^\alpha - (1 - \alpha) \left(\frac{\theta_2}{x + \theta_2}\right)^{\alpha + 2}. \]
Solution.
The first part is the distribution of a Pareto random variable X1 with pa-
rameters α1 = α and θ1 . The second part is the distribution of a Pareto
random variable X2 with parameters α2 = α + 2 and θ2 . Thus,
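\[\begin{aligned}
E(X_1) &= \frac{\theta_1}{\alpha_1-1} = \frac{\theta_1}{\alpha-1}\\
E(X_2) &= \frac{\theta_2}{\alpha_2-1} = \frac{\theta_2}{\alpha+1}\\
E(X) &= \alpha\,\frac{\theta_1}{\alpha-1}+(1-\alpha)\,\frac{\theta_2}{\alpha+1}\\
E(X_1^2) &= \frac{\theta_1^2\,2!}{(\alpha_1-1)(\alpha_1-2)} = \frac{2\theta_1^2}{(\alpha-1)(\alpha-2)}\\
E(X_2^2) &= \frac{\theta_2^2\,2!}{(\alpha_2-1)(\alpha_2-2)} = \frac{2\theta_2^2}{\alpha(\alpha+1)}\\
E(X^2) &= \alpha\,\frac{2\theta_1^2}{(\alpha-1)(\alpha-2)}+(1-\alpha)\,\frac{2\theta_2^2}{\alpha(\alpha+1)}.
\end{aligned}\]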
Next, we consider mixtures where the number of random variables in the
mixture is unknown. A variable-component mixture distribution has
a distribution function that can be written as
Example 16.4
Determine the distribution, density, and hazard rate functions for the vari-
able mixture of exponential distributions.
Solution.
The distribution function of the variable mixture is
\[F_X(x) = 1-a_1e^{-\frac{x}{\theta_1}}-a_2e^{-\frac{x}{\theta_2}}-\cdots-a_Ne^{-\frac{x}{\theta_N}}.\]
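The density and hazard rate follow from this cdf by differentiation and by dividing the density by the survival function. The sketch below is illustrative only; the helper name `mixture_functions` and the sample weights and means are assumptions, not values from the text.

```python
import math


def mixture_functions(weights, thetas):
    """Return (cdf, pdf, hazard) for a mixture of exponentials.

    weights: mixing weights a_1..a_N (must sum to 1)
    thetas:  exponential means theta_1..theta_N
    """
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1")

    def survival(x):
        return sum(a * math.exp(-x / t) for a, t in zip(weights, thetas))

    def cdf(x):
        return 1.0 - survival(x)

    def pdf(x):
        # derivative of the cdf: sum of a_j / theta_j * exp(-x / theta_j)
        return sum(a / t * math.exp(-x / t) for a, t in zip(weights, thetas))

    def hazard(x):
        # hazard rate h(x) = f(x) / S(x)
        return pdf(x) / survival(x)

    return cdf, pdf, hazard


if __name__ == "__main__":
    cdf, pdf, hazard = mixture_functions([0.5, 0.5], [1.0, 2.0])
    print(cdf(1.0), pdf(1.0), hazard(1.0))
```

With a single component the hazard rate is the constant 1/θ; with several components it decreases in x as the longer-tailed components dominate the survival function.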
Example 16.5 ‡
You are given claim count data for which the sample mean is roughly equal
to the sample variance. Thus you would like to use a claim count model
that has its mean equal to its variance. An obvious choice is the Poisson
distribution.
Determine which of the following models may also be appropriate.
(A) A mixture of two binomial distributions with different means
(B) A mixture of two Poisson distributions with different means
(C) A mixture of two negative binomial distributions with different means
(D) None of (A), (B) or (C)
(E) All of (A), (B) and (C).
Solution.
Let X be a 2-point mixture of the random variables X1 and X2 with mixing
weights α and 1−α. Let Θ be the discrete random variable such that Pr(Θ =
1) = α and Pr(Θ = 2) = 1 − α. Thus, we have
\[\begin{aligned}
E(X) &= \alpha E(X_1)+(1-\alpha)E(X_2)\\
\mathrm{Var}(X) &= E(X^2)-E(X)^2\\
&= \alpha E(X_1^2)+(1-\alpha)E(X_2^2)-[\alpha E(X_1)+(1-\alpha)E(X_2)]^2\\
&= \alpha\mathrm{Var}(X_1)+(1-\alpha)\mathrm{Var}(X_2)+\alpha(1-\alpha)[E(X_1)-E(X_2)]^2.
\end{aligned}\]
If $X_1$ and $X_2$ are Poisson with means $\lambda_1$ and $\lambda_2$ respectively with $\lambda_1\neq\lambda_2$,
then
\[\begin{aligned}
\mathrm{Var}(X) &= \alpha\lambda_1+(1-\alpha)\lambda_2+\alpha(1-\alpha)(\lambda_1-\lambda_2)^2\\
&> \alpha\lambda_1+(1-\alpha)\lambda_2 = E(X).
\end{aligned}\]
If $X_1$ and $X_2$ are negative binomial with parameters $(r_1,\beta_1)$ and $(r_2,\beta_2)$
respectively with $r_1\beta_1\neq r_2\beta_2$, then
\[\begin{aligned}
\mathrm{Var}(X) &= \alpha r_1\beta_1(1+\beta_1)+(1-\alpha)r_2\beta_2(1+\beta_2)+\alpha(1-\alpha)(r_1\beta_1-r_2\beta_2)^2\\
&> \alpha r_1\beta_1+(1-\alpha)r_2\beta_2 = E(X).
\end{aligned}\]
If $X_1$ and $X_2$ are binomial with parameters $(m_1,q_1)$ and $(m_2,q_2)$ respectively
with $m_1q_1\neq m_2q_2$, then
\[\begin{aligned}
\mathrm{Var}(X) &= \alpha m_1q_1(1-q_1)+(1-\alpha)m_2q_2(1-q_2)+\alpha(1-\alpha)(m_1q_1-m_2q_2)^2\\
&= E(X)+\alpha(1-\alpha)(m_1q_1-m_2q_2)^2-\alpha m_1q_1^2-(1-\alpha)m_2q_2^2.
\end{aligned}\]
The expression $\alpha(1-\alpha)(m_1q_1-m_2q_2)^2-\alpha m_1q_1^2-(1-\alpha)m_2q_2^2$ can be
positive, negative, or zero. Thus, a mixture of two binomial distributions
with different means may result in the variance being equal to the mean.
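The variance decomposition used in this solution is easy to check numerically. The sketch below is illustrative; the function name `mixture_mean_var` and the component parameters are assumptions chosen for the demonstration.

```python
def mixture_mean_var(alpha, mean1, var1, mean2, var2):
    """Mean and variance of a 2-point mixture with weights alpha and 1 - alpha."""
    mean = alpha * mean1 + (1 - alpha) * mean2
    var = (alpha * var1 + (1 - alpha) * var2
           + alpha * (1 - alpha) * (mean1 - mean2) ** 2)
    return mean, var


# Two Poisson components (mean = variance): mixture variance exceeds the mean.
poisson_mean, poisson_var = mixture_mean_var(0.5, 1.0, 1.0, 3.0, 3.0)

# Two binomial components with m = 1: q = 0.1 and q = 0.9 (variance m q (1 - q)).
small_mean, small_var = mixture_mean_var(0.5, 0.1, 0.09, 0.9, 0.09)

# Two binomial components with m = 10: q = 0.1 and q = 0.9.
large_mean, large_var = mixture_mean_var(0.5, 1.0, 0.9, 9.0, 0.9)

if __name__ == "__main__":
    print(poisson_mean, poisson_var)  # variance above the mean
    print(small_mean, small_var)      # variance below the mean
    print(large_mean, large_var)      # variance above the mean
```

The two binomial cases land on opposite sides of the mean, which illustrates why some choice of binomial parameters can make the mixture variance equal to the mean.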
Example 16.6 ‡
Losses come from an equally weighted mixture of an exponential distribu-
tion with mean m1 , and an exponential distribution with mean m2 .
Determine the least upper bound for the coefficient of variation of this dis-
tribution.
Solution.
Let X be the random variable with pdf
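\[f(x) = \frac12\left(\frac{1}{m_1}e^{-\frac{x}{m_1}}+\frac{1}{m_2}e^{-\frac{x}{m_2}}\right).\]
We have
\[\begin{aligned}
E(X) &= \frac12(m_1+m_2)\\
E(X^2) &= \frac12(2m_1^2+2m_2^2)\\
\mathrm{Var}(X) &= \frac12(2m_1^2+2m_2^2)-\left[\frac12(m_1+m_2)\right]^2.
\end{aligned}\]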
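From these moments the squared coefficient of variation is $4(m_1^2+m_2^2)/(m_1+m_2)^2-1$, which stays below 3 and approaches 3 as the ratio of the means grows. The sketch below evaluates the coefficient of variation for a few ratios; the function name `cv_equal_mixture` and the sample means are illustrative assumptions.

```python
import math


def cv_equal_mixture(m1, m2):
    """Coefficient of variation of an equally weighted mixture of two exponentials."""
    mean = 0.5 * (m1 + m2)
    second_moment = 0.5 * (2 * m1**2 + 2 * m2**2)
    variance = second_moment - mean**2
    return math.sqrt(variance) / mean


if __name__ == "__main__":
    for m2 in (1.0, 10.0, 1e3, 1e6):
        print(m2, cv_equal_mixture(1.0, m2))
    print("sqrt(3) =", math.sqrt(3))
```

Equal means reduce the mixture to a single exponential with coefficient of variation 1; widely separated means push the value toward $\sqrt{3}$ without reaching it.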
Practice Problems
Problem 16.1
The distribution of a loss, X, is a 2-point mixture:
(i) With probability 0.6, X1 is a Pareto distribution with parameters α = 3
and θ = 900.
(ii) With probability 0.4, X2 is a Pareto distribution with parameters α = 5
and θ = 1500.
Problem 16.6
How many parameters are there in a variable component mixture consisting
of 9 Burr distributions?
Problem 16.7
Determine the distribution, density, and hazard rate functions for the vari-
able mixture of two-parameter Pareto distribution.
Problem 16.8
A Weibull distribution has two parameters: θ and τ. An actuary is creating a
variable-component mixture distribution consisting of K Weibull distributions.
If the actuary chooses to use 17 Weibull distributions instead of 12,
how many more parameters will the variable-component mixture distribu-
tion have as a result?
Problem 16.9
Let X be a 2-point mixture with underlying random variables X1 and X2 .
The distribution of X1 is a Pareto distribution with parameters α1 = 3 and
θ. The distribution of X2 is a Gamma distribution with parameters α2 = 2
and θ2 = 2000.
Given that a1 = 0.7, a2 = 0.3, and E(X) = 1340, determine the value
of θ.
Problem 16.10
Let X be a 3-point mixture of three variables X1 , X2 , X3 . You are given the
following information:
R.V. Weight Mean Standard Deviation
X1 0.2 0.10 0.15
X2 0.5 0.25 0.45
X3 0.3 0.17 0.35
Determine Var(X).
Problem 16.11 ‡
The distribution of a loss, X, is a 2-point mixture:
(i) With probability 0.8, X1 is a Pareto distribution with parameters α = 2
and θ = 100.
(ii) With probability 0.2, X2 is a Pareto distribution with parameters α = 4
and θ = 3000.
17 Data-dependent Distributions
In Section 15, we discussed parametric distributions. In Section 16, we
introduced the k−point mixture distributions that are also known as semi-
parametric distributions. In this section, we look at non-parametric
distributions.
Example 17.1
Below are the losses suffered by policyholders of an insurance company:
49, 50, 50, 50, 60, 75, 80, 120, 230.
Let X be the random variable representing the losses incurred by the poli-
cyholders. Find the pmf and the cdf of X.
Solution.
The pmf is given by the table below.
x      49    50    60    75    80    120   230
p(x)   1/9   1/3   1/9   1/9   1/9   1/9   1/9
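The empirical pmf and cdf can be tabulated directly from the sample. The following is a minimal sketch using the nine losses listed in the statement of Example 17.1; the function names are illustrative.

```python
from collections import Counter
from fractions import Fraction

# Losses as listed in the statement of Example 17.1.
losses = [49, 50, 50, 50, 60, 75, 80, 120, 230]


def empirical_pmf(data):
    """Map each distinct value to (number of occurrences) / (sample size)."""
    n = len(data)
    return {x: Fraction(count, n) for x, count in sorted(Counter(data).items())}


def empirical_cdf(data, x):
    """Proportion of observations less than or equal to x."""
    return Fraction(sum(1 for d in data if d <= x), len(data))


if __name__ == "__main__":
    for value, prob in empirical_pmf(losses).items():
        print(value, prob, empirical_cdf(losses, value))
```

Exact fractions are used so that the output matches the hand-computed table rather than rounded decimals.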
Example 17.2
Below are the losses suffered by policyholders of an insurance company:
Solution.
For i = 1, 2, · · · , 9, we have
\[k_i(x) = \begin{cases}\frac{1}{10}, & x_i-5\le x\le x_i+5\\ 0, & \text{otherwise.}\end{cases}\]
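A kernel-smoothed density averages these uniform kernels over the sample points. The sketch below is illustrative: the data for this example are not shown above, so the sample reuses the losses of Example 17.1 as an assumption, and `uniform_kernel_density` is a hypothetical helper name.

```python
def uniform_kernel_density(data, x, bandwidth):
    """Kernel density estimate at x with a uniform kernel.

    Each observation x_i contributes 1 / (2 * bandwidth) on
    [x_i - bandwidth, x_i + bandwidth], weighted by 1 / n.
    """
    n = len(data)
    height = 1.0 / (2 * bandwidth)
    return sum(height for xi in data if xi - bandwidth <= x <= xi + bandwidth) / n


if __name__ == "__main__":
    # Assumed sample (taken from Example 17.1 for illustration only).
    sample = [49, 50, 50, 50, 60, 75, 80, 120, 230]
    for point in (50, 57, 100):
        print(point, uniform_kernel_density(sample, point, bandwidth=5))
```

With bandwidth 5 the kernel height is 1/10, matching the display above; repeated observations simply contribute once per occurrence.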
Practice Problems
Problem 17.1
You are given the following empirical distribution of losses suffered by
policyholders of Prevent Dental Insurance Company:
94, 104, 104, 104, 134, 134, 180, 180, 180, 180, 210, 350, 524.
Let X be the random variable representing the losses incurred by the poli-
cyholders.
Problem 17.2
You are given the following empirical distribution of losses suffered by
policyholders of Prevent Dental Insurance Company:
94, 104, 104, 104, 134, 134, 180, 180, 180, 180, 210, 350, 524.
Let X be the random variable representing the losses incurred by the policy-
holders. The insurance company issued a policy with an ordinary deductible
of 105.
Problem 17.3
You are given the following empirical distribution of losses suffered by
policyholders of Prevent Dental Insurance Company:
94, 104, 104, 104, 134, 134, 180, 180, 180, 180, 210, 350, 524.
Let X be the random variable representing the losses incurred by the poli-
cyholders.
Problem 17.4
You are given the following empirical distribution of losses suffered by
policyholders of Prevent Dental Insurance Company:
94, 104, 104, 104, 134, 134, 180, 180, 180, 180, 210, 350, 524.
Let X be the random variable representing the losses incurred by the poli-
cyholders.
Problem 17.5
You are given the following empirical distribution of losses suffered by
policyholders of Prevent Dental Insurance Company:
94, 104, 104, 104, 134, 134, 180, 180, 180, 180, 210, 350, 524.
Let X be the random variable representing the losses incurred by the poli-
cyholders.
Problem 17.6
You are given the following empirical distribution of losses suffered by
policyholders of Prevent Dental Insurance Company:
94, 104, 104, 104, 134, 134, 180, 180, 180, 180, 210, 350, 524.
Let X be the random variable representing the losses incurred by the poli-
cyholders.
Problem 17.7
You are given the following distribution of losses suffered by
policyholders of Prevent Dental Insurance Company:
94, 104, 104, 104, 134, 134, 180, 180, 180, 180, 210, 350, 524.
Let X be the random variable representing the losses incurred by the poli-
cyholders.
Calculate $f_X(x)$ using a kernel-smoothed distribution with a uniform kernel of
bandwidth 4.
Generating New Distributions
In this chapter a collection of continuous models that are commonly used for
most actuarial modeling situations will be developed. Processes for creating
new distributions from existing ones will be introduced. We point out here
that most of the distributions that are used in actuarial modeling have
nonnegative support so that FX (0) = 0.
Theorem 18.1
Let X be a continuous random variable and c a positive constant. Let
$Y = cX$. Then $f_Y(y) = \frac{1}{c}f_X\left(\frac{y}{c}\right)$ and $F_Y(y) = F_X\left(\frac{y}{c}\right)$. Thus, c is a scale
parameter for Y.
Proof.
We start by finding the cdf of Y and then we generate the pdf by differen-
tiation. We have
\[F_Y(y) = \Pr(Y\le y) = \Pr\left(X\le\frac{y}{c}\right) = F_X\left(\frac{y}{c}\right).\]
Now, differentiating $F_Y(y)$ and using the chain rule we find
\[f_Y(y) = \frac{1}{c}f_X\left(\frac{y}{c}\right).\]
Example 18.1
Suppose that random losses are exponentially distributed with parameter θ.
Find the pdf and the cdf of the random variable Y = cX, c > 0.
Solution.
We have
\[F_Y(y) = \Pr(Y\le y) = \Pr\left(X\le\frac{y}{c}\right) = 1-e^{-\frac{y}{c\theta}}.\]
Thus,
\[f_Y(y) = \frac{d}{dy}F_Y(y) = \frac{1}{c\theta}e^{-\frac{y}{c\theta}}.\]
Scalar multiples of random variables are useful in actuarial modeling when
annual losses are subject to future uniform inflation. For example, if X is
the random variable representing this year's losses and uniform inflation is
known to be i% for the next year, then the next year's losses can be modeled
with the random variable Y = (1 + 0.01i)X.
Example 18.2
You are given:
(i) In 2011, losses follow a Pareto distribution with parameters α = 2 and
θ = 100.
(ii) Inflation of 3.5% impacts all losses uniformly from 2011 to 2012.
What is the probability that the losses will exceed 350 in 2012?
Solution.
Let X and Y be the random variables representing the losses in 2011 and
2012 respectively. Then Y = 1.035X. We want to find Pr(Y > 350). Recall
that the cdf of the Pareto distribution is
\[F_X(x) = 1-\left(\frac{100}{x+100}\right)^2.\]
Thus,
\[\Pr(Y>350) = \Pr\left(X>\frac{350}{1.035}\right) = S_X\left(\frac{350}{1.035}\right) = \left(\frac{100}{\frac{350}{1.035}+100}\right)^2 = 0.0521.\]
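The calculation in Example 18.2 can be reproduced in a few lines. This is a minimal sketch; `pareto_survival` is an illustrative helper name. It also checks the equivalent view that Y is Pareto with the same α and scale parameter 1.035 × 100.

```python
def pareto_survival(x, alpha, theta):
    """Survival function of the two-parameter Pareto: (theta / (x + theta))^alpha."""
    return (theta / (x + theta)) ** alpha


inflation = 0.035

# Route 1: deflate the threshold and use the 2011 distribution.
prob_deflated = pareto_survival(350 / (1 + inflation), alpha=2, theta=100)

# Route 2: theta is a scale parameter, so 2012 losses are Pareto(2, 103.5).
prob_scaled = pareto_survival(350, alpha=2, theta=100 * (1 + inflation))

if __name__ == "__main__":
    print(round(prob_deflated, 4), round(prob_scaled, 4))
```

Both routes agree because multiplying a scale-family variable by a constant only rescales θ.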
Practice Problems
Problem 18.1
You are given:
(i) X has a Pareto distribution with parameters α = 3 and θ = 2000.
(ii) Y = cX, c > 0.
(iii) σY = 1500.
Problem 18.2
Losses in 2011 are represented by a random variable X with pdf $f_X(x) = \frac{3}{8}x^2$
for 0 < x < 2 and 0 otherwise. Let Y be the random variable of losses in
2012. It is expected that losses in 2012 will be 50% lower than in the current
year.
Problem 18.3
Losses from auto accidents are modeled by a Pareto distribution with pa-
rameters α = 3 and θ = 2000. The insurance policy pays only 75% of any
auto accident claim.
Find the mean and the standard deviation of the claims for this policy.
Problem 18.4
Let X have cdf $F_X(x) = 1-(1+x)^{-\alpha}$ where x > 0 and α > 0. Determine
the pdf and the cdf of Y = θX.
Problem 18.5
Let Y have the lognormal distribution with parameters µ and σ. Let Z = θY.
Problem 18.6
Losses in 1993 follow the density function $f_X(x) = 3x^{-4}$, x > 1, where x is
the loss in millions of dollars. Inflation of 10% impacts all claims uniformly
from 1993 to 1994.
Determine the cdf of losses for 1994 and use it to determine the probability
that a 1994 loss exceeds 2,200,000.
Problem 18.7
You are given:
(i) X is a loglogistic random variable with parameters γ = 2 and $\theta = 10\sqrt{10}$.
(ii) Y is a Pareto distribution with parameters α = 1 and θ = 1000.
(iii) Z is a 2-point mixture of X and Y with equal mixing weights.
(iv) W = (1 + r)Z where r > 0.
The pdf and the cdf of the new distribution are provided by the next theo-
rem.
Theorem 19.1
Let X be a continuous random variable with pdf and cdf $f_X(x)$ and $F_X(x)$
with $F_X(0) = 0$. Let τ > 0. We have
(a) In the transformed case, $Y = X^{\frac{1}{\tau}}$,
\[F_Y(y) = F_X(y^{\tau})\quad\text{and}\quad f_Y(y) = \tau y^{\tau-1}f_X(y^{\tau}).\]
(b) In the inverse transformed case, $Y = X^{-\frac{1}{\tau}}$,
\[F_Y(y) = 1-F_X(y^{-\tau})\quad\text{and}\quad f_Y(y) = \tau y^{-\tau-1}f_X(y^{-\tau}).\]
(c) In the inverse case, $Y = X^{-1}$,
\[F_Y(y) = 1-F_X(y^{-1})\quad\text{and}\quad f_Y(y) = y^{-2}f_X(y^{-1}).\]
Proof.
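(a) We have
\[F_Y(y) = \Pr(Y\le y) = \Pr(X^{\frac{1}{\tau}}\le y) = \Pr(X\le y^{\tau}) = F_X(y^{\tau}).\]
Differentiating this function with respect to y and using the chain rule, we
find
\[f_Y(y) = \tau y^{\tau-1}f_X(y^{\tau}).\]
(b) We have
\[F_Y(y) = \Pr(Y\le y) = \Pr(X^{-\frac{1}{\tau}}\le y) = \Pr(X\ge y^{-\tau}) = 1-F_X(y^{-\tau}).\]
Differentiating this function with respect to y and using the chain rule, we
find
\[f_Y(y) = \tau y^{-\tau-1}f_X(y^{-\tau}).\]
(c) We have
\[F_Y(y) = \Pr(Y\le y) = \Pr(X^{-1}\le y) = \Pr(X\ge y^{-1}) = 1-F_X(y^{-1}).\]
Differentiating this function with respect to y and using the chain rule, we
find
\[f_Y(y) = y^{-2}f_X(y^{-1}).\]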
Example 19.1
Let X be a random variable with pdf $f_X(x) = x$ for $0<x<\sqrt{2}$ and 0
otherwise. Let $Y = X^{\frac14}$. Find the pdf and the cdf of Y.
Solution.
The cdf of X is $F_X(x) = \frac{x^2}{2}$ for $0\le x\le\sqrt{2}$ and 1 for $x>\sqrt{2}$. Thus,
$F_Y(y) = F_X(y^4) = \frac{y^8}{2}$ for $0\le y\le 2^{\frac18}$ and 1 for $y\ge 2^{\frac18}$. The pdf of Y is
$f_Y(y) = 4y^3f_X(y^4) = 4y^7$ for $0<y<2^{\frac18}$ and 0 otherwise.
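Theorem 19.1(a) can be applied mechanically and the result sanity-checked by confirming that the transformed density integrates to 1. The sketch below uses the density of Example 19.1; `transformed_pdf` and `midpoint_integral` are illustrative helper names, and the midpoint rule is just one simple choice of quadrature.

```python
import math


def f_X(x):
    """Density of X in Example 19.1: f(x) = x on (0, sqrt(2))."""
    return x if 0 < x < math.sqrt(2) else 0.0


def transformed_pdf(f, tau):
    """Density of Y = X**(1/tau) per Theorem 19.1(a): tau * y^(tau-1) * f(y^tau)."""
    return lambda y: tau * y ** (tau - 1) * f(y ** tau) if y > 0 else 0.0


def midpoint_integral(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


f_Y = transformed_pdf(f_X, 4)
total_mass = midpoint_integral(f_Y, 0.0, 2 ** 0.125)

if __name__ == "__main__":
    print(f_Y(1.0), total_mass)
```

A total mass different from 1 would signal an error in either the support or the constant of the derived density.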
Example 19.2
Let X have the beta distribution with pdf
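\[f_X(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1},\quad 0<x<1\]
and 0 otherwise. Find the pdf of $Y = X^{\frac{1}{\tau}}$ where τ > 0.
Solution.
We have
\[f_Y(y) = \tau y^{\tau-1}f_X(y^{\tau}) = \tau y^{\tau-1}\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}y^{\tau(\alpha-1)}(1-y^{\tau})^{\beta-1},\quad 0<y<1\]
and 0 otherwise.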
Theorem 19.2
Let X be a continuous random variable with pdf fX (x) and cdf FX (x) such
that fX (x) > 0 for all x ∈ R. Let Y = eX . Then, for y > 0, we have
Proof.
We have
Example 19.3
Let X be a normal distribution with parameters µ = 1 and σ 2 = 4. Define
the random variable Y = eX .
(a) Find E(Y ).
(b) The 95th percentile of the standard normal distribution is 1.645. Find
the 95th percentile of Y.
Solution.
(a) We have
\[\begin{aligned}
E(Y) &= \int_{-\infty}^{\infty}e^{x}\frac{1}{2\sqrt{2\pi}}e^{-\frac{(x-1)^2}{8}}dx\\
&= e^{3}\int_{-\infty}^{\infty}\frac{1}{2\sqrt{2\pi}}e^{-\frac18(x-5)^2}dx\\
&= e^{3}\cdot 1 = 20.086.
\end{aligned}\]
Note that the last integral is the integral of the density function of the
normal distribution with parameters µ = 5 and σ² = 4.
(b) Let $\pi_{0.95}$ be the 95th percentile of Y. Then $\Pr(Y\le\pi_{0.95}) = 0.95$. Thus,
\[0.95 = \Pr(Y\le\pi_{0.95}) = \Pr(X\le\ln\pi_{0.95}) = \Pr\left(\frac{X-1}{2}\le\frac{\ln\pi_{0.95}-1}{2}\right).\]
Hence,
\[\frac{\ln\pi_{0.95}-1}{2} = 1.645\implies\pi_{0.95} = 72.97.\]
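Both parts of Example 19.3 reduce to one-line formulas: completing the square gives $E(e^X) = e^{\mu+\sigma^2/2}$, and the percentile of $Y$ is the exponential of the percentile of $X$. A minimal sketch (variable names are illustrative):

```python
import math

mu, sigma = 1.0, 2.0      # X is normal with mean 1 and variance 4
z_95 = 1.645              # 95th percentile of the standard normal (given)

# (a) Mean of the lognormal variable Y = exp(X).
mean_Y = math.exp(mu + sigma**2 / 2)

# (b) 95th percentile of Y: exponentiate the 95th percentile of X.
percentile_95_Y = math.exp(mu + sigma * z_95)

if __name__ == "__main__":
    print(round(mean_Y, 3), round(percentile_95_Y, 2))
```

The percentile step works because $e^x$ is increasing, so it preserves the ordering of outcomes.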
Practice Problems
Problem 19.1
Let X be the exponential distribution with parameter θ. Determine the pdf
and the cdf of the transformed, inverse transformed, and inverse exponential
distribution.
Problem 19.2
Find the cdf of the inverse of a Pareto distribution with parameters α and
θ. What’s the name of the new distribution and its parameter(s)?
Problem 19.3
Let X be a random variable with pdf fX (x) = 2x for 0 < x < 1 and 0
otherwise.
Problem 19.4
Find the pdf of the inverse of the Gamma distribution with parameters α
and θ = 1.
Problem 19.5
Let X have a uniform distribution in (0, b). Find the pdf of $Y = X^{\frac{1}{\tau}}$, with
τ > 0.
Problem 19.6
Let X have a Pareto distribution with parameters α and θ. Let $Y = \ln\left(1+\frac{X}{\theta}\right)$.
Determine the name of the distribution of Y and its parameter(s).
Problem 19.7
Let X have the normal distribution with parameters µ and σ². Find the pdf
and cdf of $Y = e^X$.
Problem 19.8
Let X have a uniform distribution in (0, b). Find the pdf of $Y = e^X$.
Problem 19.9
Let X have an exponential distribution with parameter θ. Find the pdf of
$Y = e^X$.
Problem 19.10 ‡
You are given:
(i) X has a Pareto distribution with parameters α = 2 and θ = 100.
(ii) $Y = \ln\left(1+\frac{X}{\theta}\right)$.
(b) We have
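\[\begin{aligned}
E(X^k) &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}x^kf_{X|\Lambda}(x|\lambda)f_{\Lambda}(\lambda)\,d\lambda\,dx\\
&= \int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty}x^kf_{X|\Lambda}(x|\lambda)\,dx\right]f_{\Lambda}(\lambda)\,d\lambda\\
&= \int_{-\infty}^{\infty}E(X^k|\Lambda)f_{\Lambda}(\lambda)\,d\lambda = E[E(X^k|\Lambda)].
\end{aligned}\]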
(c) We have
Example 20.1
The distribution of X|Λ is exponential with parameter $\frac{1}{\Lambda}$. The distribution
of Λ is Gamma with parameters α and θ. Find $f_X(x)$.
Solution.
We have
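\[\begin{aligned}
f_X(x) &= \int_0^{\infty}\lambda e^{-\lambda x}\frac{\theta^{\alpha}}{\Gamma(\alpha)}\lambda^{\alpha-1}e^{-\lambda\theta}\,d\lambda\\
&= \frac{\theta^{\alpha}}{\Gamma(\alpha)}\int_0^{\infty}\lambda^{\alpha}e^{-\lambda(x+\theta)}\,d\lambda\\
&= \frac{\theta^{\alpha}}{\Gamma(\alpha)}\frac{\Gamma(\alpha+1)}{(x+\theta)^{\alpha+1}} = \frac{\alpha\theta^{\alpha}}{(x+\theta)^{\alpha+1}}.
\end{aligned}\]
This is the distribution of a Pareto random variable and we know that this
distribution is heavy-tailed.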
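The mixing integral of Example 20.1 can be verified numerically against the Pareto density. The sketch below is illustrative: the parameter values, the truncation point `upper`, and the midpoint rule are assumptions made for the check, and the mixing density is the one used in the solution, $\theta^{\alpha}\lambda^{\alpha-1}e^{-\lambda\theta}/\Gamma(\alpha)$.

```python
import math


def mixed_density(x, alpha, theta, upper=80.0, n=200_000):
    """Midpoint-rule approximation of the integral over lambda of
    (exponential density with rate lambda at x) * (mixing density of lambda)."""
    h = upper / n
    total = 0.0
    for i in range(n):
        lam = (i + 0.5) * h
        conditional = lam * math.exp(-lam * x)
        mixing = (theta**alpha * lam ** (alpha - 1)
                  * math.exp(-lam * theta) / math.gamma(alpha))
        total += conditional * mixing
    return h * total


def pareto_pdf(x, alpha, theta):
    """Closed form obtained in the solution."""
    return alpha * theta**alpha / (x + theta) ** (alpha + 1)


if __name__ == "__main__":
    x, alpha, theta = 1.5, 3.0, 2.0
    print(mixed_density(x, alpha, theta), pareto_pdf(x, alpha, theta))
```

Agreement of the two values for several choices of x, α, and θ is a quick check on the algebra of the solution.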
Example 20.2
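Suppose that X|Λ has a normal distribution with parameters λ and σ1. That
is,
\[f_{X|\Lambda}(x|\lambda) = \frac{1}{\sigma_1\sqrt{2\pi}}e^{-\frac{(x-\lambda)^2}{2\sigma_1^2}},\quad -\infty<x<\infty.\]
Suppose that Λ has a normal distribution with parameters µ and σ2. That
is,
\[f_{\Lambda}(\lambda) = \frac{1}{\sigma_2\sqrt{2\pi}}e^{-\frac{(\lambda-\mu)^2}{2\sigma_2^2}},\quad -\infty<\lambda<\infty.\]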
Determine the unconditional pdf of X.
Solution.
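We first establish the following identity:
\[\left(\frac{x-\lambda}{\sigma_1}\right)^2+\left(\frac{\lambda-\mu}{\sigma_2}\right)^2 = \frac{\sigma_1^2+\sigma_2^2}{\sigma_1^2\sigma_2^2}\left(\lambda-\frac{\sigma_2^2x+\mu\sigma_1^2}{\sigma_1^2+\sigma_2^2}\right)^2+\frac{(x-\mu)^2}{\sigma_1^2+\sigma_2^2}.\]
By completing the square, we find
\[\begin{aligned}
\left(\frac{x-\lambda}{\sigma_1}\right)^2+\left(\frac{\lambda-\mu}{\sigma_2}\right)^2 &= \frac{x^2-2\lambda x+\lambda^2}{\sigma_1^2}+\frac{\lambda^2-2\mu\lambda+\mu^2}{\sigma_2^2}\\
&= \lambda^2\,\frac{\sigma_1^2+\sigma_2^2}{\sigma_1^2\sigma_2^2}-2\lambda\,\frac{\sigma_2^2x+\sigma_1^2\mu}{\sigma_1^2\sigma_2^2}+\frac{\sigma_2^2x^2+\sigma_1^2\mu^2}{\sigma_1^2\sigma_2^2}\\
&= \frac{\sigma_1^2+\sigma_2^2}{\sigma_1^2\sigma_2^2}\left(\lambda-\frac{\sigma_2^2x+\mu\sigma_1^2}{\sigma_1^2+\sigma_2^2}\right)^2-\frac{(\sigma_2^2x+\mu\sigma_1^2)^2}{\sigma_1^2\sigma_2^2(\sigma_1^2+\sigma_2^2)}+\frac{\sigma_2^2x^2+\sigma_1^2\mu^2}{\sigma_1^2\sigma_2^2}\\
&= \frac{\sigma_1^2+\sigma_2^2}{\sigma_1^2\sigma_2^2}\left(\lambda-\frac{\sigma_2^2x+\mu\sigma_1^2}{\sigma_1^2+\sigma_2^2}\right)^2+\frac{1}{\sigma_1^2\sigma_2^2}\cdot\frac{\sigma_1^2\sigma_2^2x^2-2\mu x\sigma_1^2\sigma_2^2+\sigma_1^2\sigma_2^2\mu^2}{\sigma_1^2+\sigma_2^2}\\
&= \frac{\sigma_1^2+\sigma_2^2}{\sigma_1^2\sigma_2^2}\left(\lambda-\frac{\sigma_2^2x+\mu\sigma_1^2}{\sigma_1^2+\sigma_2^2}\right)^2+\frac{(x-\mu)^2}{\sigma_1^2+\sigma_2^2}.
\end{aligned}\]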
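Now, the marginal density function of X is
\[\begin{aligned}
f_X(x) &= \int_{-\infty}^{\infty}\frac{1}{\sigma_1\sqrt{2\pi}}e^{-\frac{(x-\lambda)^2}{2\sigma_1^2}}\frac{1}{\sigma_2\sqrt{2\pi}}e^{-\frac{(\lambda-\mu)^2}{2\sigma_2^2}}\,d\lambda\\
&= \frac{1}{2\pi\sigma_1\sigma_2}\int_{-\infty}^{\infty}e^{-\frac12\left[\left(\frac{x-\lambda}{\sigma_1}\right)^2+\left(\frac{\lambda-\mu}{\sigma_2}\right)^2\right]}\,d\lambda\\
&= \frac{e^{-\frac{(x-\mu)^2}{2(\sigma_1^2+\sigma_2^2)}}}{\sqrt{2\pi(\sigma_1^2+\sigma_2^2)}}\int_{-\infty}^{\infty}\sqrt{\frac{\sigma_1^2+\sigma_2^2}{2\pi\sigma_1^2\sigma_2^2}}\exp\left[-\frac{\sigma_1^2+\sigma_2^2}{2\sigma_1^2\sigma_2^2}\left(\lambda-\frac{\sigma_2^2x+\mu\sigma_1^2}{\sigma_1^2+\sigma_2^2}\right)^2\right]d\lambda.
\end{aligned}\]
The integrand in the last integral is the pdf of a normal distribution with
parameters $\frac{\sigma_2^2x+\sigma_1^2\mu}{\sigma_1^2+\sigma_2^2}$ and $\frac{\sigma_1^2\sigma_2^2}{\sigma_1^2+\sigma_2^2}$ so that the integral is 1. Hence,
\[f_X(x) = \frac{e^{-\frac{(x-\mu)^2}{2(\sigma_1^2+\sigma_2^2)}}}{\sqrt{2\pi(\sigma_1^2+\sigma_2^2)}}\]
which is the pdf of the normal distribution with parameters µ and $\sigma_1^2+\sigma_2^2$.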
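The normal-normal mixing result can be checked by numerical integration. The following sketch is illustrative; the parameter values, the integration range of ten standard deviations around µ, and the midpoint rule are assumptions made for the check.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of the normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def marginal_pdf(x, mu, sigma1, sigma2, n=20_000):
    """Midpoint-rule approximation of the integral of f(x | lam) * f(lam) over lam."""
    lo, hi = mu - 10 * sigma2, mu + 10 * sigma2
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        lam = lo + (i + 0.5) * h
        total += normal_pdf(x, lam, sigma1) * normal_pdf(lam, mu, sigma2)
    return h * total


if __name__ == "__main__":
    x, mu, sigma1, sigma2 = 1.3, 0.0, 1.0, 2.0
    closed_form = normal_pdf(x, mu, math.sqrt(sigma1**2 + sigma2**2))
    print(marginal_pdf(x, mu, sigma1, sigma2), closed_form)
```

The variances add while the mean is unchanged, which is the pattern used whenever a normal mean is itself normally distributed.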
Example 20.3 ‡
The scores on the final exam in Ms. B’s Latin class have a normal distribu-
tion with mean θ and standard deviation equal to 8. θ is a random variable
Solution.
Let S denote the scores. Since S|Θ and Θ are normal, S is also normally
distributed with mean
and variance
We want
Example 20.4
Let X|Λ have a Poisson distribution with parameter λ. Let Λ have a Gamma
distribution with parameters α and β. That is,
\[f_{\Lambda}(\lambda) = \frac{\lambda^{\alpha-1}e^{-\frac{\lambda}{\beta}}}{\beta^{\alpha}\Gamma(\alpha)}.\]
Solution.
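We have$^8$
\[\begin{aligned}
\Pr(X=1) &= \int_0^{\infty}\Pr(X=1|\Lambda)f_{\Lambda}(\lambda)\,d\lambda\\
&= \int_0^{\infty}\lambda e^{-\lambda}\frac{\lambda^{\alpha-1}e^{-\frac{\lambda}{\beta}}}{\beta^{\alpha}\Gamma(\alpha)}\,d\lambda\\
&= \frac{1}{\beta^{\alpha}\Gamma(\alpha)}\int_0^{\infty}\lambda^{\alpha}e^{-\lambda\left(1+\frac{1}{\beta}\right)}\,d\lambda\\
&= \frac{\left(1+\frac{1}{\beta}\right)^{-(\alpha+1)}\Gamma(\alpha+1)}{\beta^{\alpha}\Gamma(\alpha)}\int_0^{\infty}\frac{\lambda^{(\alpha+1)-1}e^{-\lambda/\left(1+\frac{1}{\beta}\right)^{-1}}}{\left(1+\frac{1}{\beta}\right)^{-(\alpha+1)}\Gamma(\alpha+1)}\,d\lambda.
\end{aligned}\]
The integrand is the pdf of a Gamma distribution with parameters α + 1 and
$\left(1+\frac{1}{\beta}\right)^{-1}$, so the integral equals 1. Hence,
\[\Pr(X=1) = \frac{\left(1+\frac{1}{\beta}\right)^{-(\alpha+1)}\Gamma(\alpha+1)}{\beta^{\alpha}\Gamma(\alpha)} = \frac{\alpha\beta}{(1+\beta)^{\alpha+1}}.\]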
Example 20.5 ‡
Bob is a carnival operator of a game in which a player receives a prize worth
$W = 2^N$ if the player has N successes, $N = 0, 1, 2, \cdots$. Bob models the
probability of success for a player as follows:
(i) N has a Poisson distribution with mean Λ.
(ii) Λ has a uniform distribution on the interval (0, 4).
Calculate E[W ].
Solution.
We know that $P_N(z) = E(z^N) = e^{\lambda(z-1)}$. In particular, $E(W|\Lambda) = P_N(2) = e^{\lambda}$. Thus,
\[E(W) = \int_0^4E(W|\Lambda)f_{\Lambda}(\lambda)\,d\lambda = \frac14\int_0^4e^{\lambda}\,d\lambda = 13.4.\]
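The two steps of Example 20.5 can be checked separately: the conditional mean $E(2^N|\Lambda=\lambda) = e^{\lambda}$ by summing the Poisson series, and the outer integral in closed form, $(e^4-1)/4$. A minimal sketch (the function name and the number of series terms are illustrative choices):

```python
import math


def conditional_mean_prize(lam, terms=200):
    """E[2**N | Lambda = lam], summed directly from the Poisson pmf."""
    total = 0.0
    p = math.exp(-lam)          # Pr(N = 0)
    for n in range(terms):
        total += 2**n * p
        p *= lam / (n + 1)      # Pr(N = n + 1) from Pr(N = n)
    return total


# Outer expectation over Lambda ~ uniform(0, 4): (1/4) * integral of e^lam.
expected_prize = (math.exp(4) - 1) / 4

if __name__ == "__main__":
    print(conditional_mean_prize(3.0), math.exp(3.0))
    print(round(expected_prize, 1))
```

The series check confirms the use of the probability generating function at z = 2.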
$^8$ See p. 380 of [2].
Practice Problems
Problem 20.1
Let X be a loss random variable having a Pareto distribution with parame-
ters α and Θ. The parameter Θ is uniformly distributed in (0, b).
Problem 20.2
Let X|Θ be the inverse exponential random variable with parameter Θ. Its
pdf is
\[f_{X|\Theta}(x|\theta) = \frac{1}{x^2}\theta e^{-\frac{\theta}{x}},\quad x>0\]
and 0 otherwise. Let Θ have the exponential distribution with mean 4.
Problem 20.3
Let X|Λ have the pdf
and 0 otherwise.
Problem 20.4
Let X|Λ have a Poisson distribution with parameter Λ. Let Λ have a Gamma
distribution with parameters α and β. That is,
\[f_{\Lambda}(\lambda) = \frac{\lambda^{\alpha-1}e^{-\frac{\lambda}{\beta}}}{\beta^{\alpha}\Gamma(\alpha)}.\]
Problem 20.5
Suppose that X|Λ has a Weibull distribution with cdf $F_{X|\Lambda}(x|\lambda) = 1-e^{-\lambda x^{\gamma}}$
for x ≥ 0. Suppose that Λ is exponentially distributed with mean θ.
Problem 20.6
Let N have a Poisson distribution with mean Λ. Let Λ have a uniform dis-
tribution on the interval (0,5).
Problem 20.7
Let N |Λ have a negative binomial distribution with r = 1 and Λ. Let Λ have
a Gamma distribution with α = 1 and θ = 2.
Problem 20.8 ‡
A claim count distribution can be expressed as a mixed Poisson distribu-
tion. The mean of the Poisson distribution is uniformly distributed over the
interval [0, 5].
Problem 20.9 ‡
The length of time T, in years, that a person will remember an actuarial
statistic is modeled by an exponential distribution with mean $\frac{1}{Y}$. In a certain
population, Y has a gamma distribution with α = θ = 2.
Calculate the probability that a person drawn at random from this population
will remember an actuarial statistic less than $\frac12$ year.
Theorem 21.1
Let $A(x) = \int_0^xa(z)\,dz$. We have:
(a) The conditional survival function of X given Λ is
\[S_{X|\Lambda}(x|\lambda) = e^{-\lambda A(x)}.\]
(b) The unconditional survival function of X is
\[S_X(x) = M_{\Lambda}(-A(x)).\]
Proof.
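(a) Recall that a survival function can be recovered from the hazard rate
function so that we have
\[S_{X|\Lambda}(x|\lambda) = e^{-\int_0^xh_{X|\Lambda}(z|\lambda)\,dz} = e^{-\int_0^x\lambda a(z)\,dz} = e^{-\lambda A(x)}.\]
(b) The unconditional survival function is
\[S_X(x) = \int_{\lambda}S_{X|\Lambda}(x|\lambda)f_{\Lambda}(\lambda)\,d\lambda = E[S_{X|\Lambda}(x|\Lambda)] = E[e^{-\Lambda A(x)}] = M_{\Lambda}(-A(x)).\]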
Remark 21.1
If X|Λ has an exponential distribution in the frailty model, a(x) will be 1,
and A(x) will be x. When X|Λ has a Weibull distribution in the frailty model,
a(x) will be $\gamma x^{\gamma-1}$, and A(x) will be $x^{\gamma}$.
Example 21.1
Let X|Λ have a Weibull distribution with conditional survival function
$S_{X|\Lambda}(x|\lambda) = e^{-\lambda x^{\gamma}}$. Let Λ have a Gamma distribution with parameters α
and θ. Find the unconditional or marginal survival function of X.
Solution.
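We first find the moment generating function of Λ. We have
\[\begin{aligned}
M_{\Lambda}(t) = E(e^{t\Lambda}) &= \frac{1}{\theta^{\alpha}\Gamma(\alpha)}\int_0^{\infty}e^{tx}x^{\alpha-1}e^{-\frac{x}{\theta}}\,dx\\
&= \frac{1}{\theta^{\alpha}\Gamma(\alpha)}\int_0^{\infty}x^{\alpha-1}e^{-x\left(-t+\frac{1}{\theta}\right)}\,dx\\
&= \int_0^{\infty}\frac{y^{\alpha-1}\left(-t+\frac{1}{\theta}\right)^{-\alpha}e^{-y}}{\theta^{\alpha}\Gamma(\alpha)}\,dy\\
&= \frac{\left(-t+\frac{1}{\theta}\right)^{-\alpha}\Gamma(\alpha)}{\theta^{\alpha}\Gamma(\alpha)} = (1-\theta t)^{-\alpha},\quad t<\frac{1}{\theta}.
\end{aligned}\]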
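Combining this moment generating function with Theorem 21.1(b) and $A(x) = x^{\gamma}$ gives $S_X(x) = M_{\Lambda}(-x^{\gamma}) = (1+\theta x^{\gamma})^{-\alpha}$. The sketch below evaluates that expression; the function names and the sample parameter values are illustrative assumptions.

```python
import math


def gamma_mgf(t, alpha, theta):
    """Moment generating function of a Gamma(alpha, theta) variable, for t < 1/theta."""
    if t >= 1 / theta:
        raise ValueError("the mgf is finite only for t < 1/theta")
    return (1 - theta * t) ** (-alpha)


def marginal_survival(x, alpha, theta, gamma_):
    """S_X(x) = M_Lambda(-A(x)) with A(x) = x**gamma_ (Weibull baseline)."""
    A = x ** gamma_
    return gamma_mgf(-A, alpha, theta)


if __name__ == "__main__":
    for x in (0.0, 0.5, 2.0, 10.0):
        print(x, marginal_survival(x, alpha=3.0, theta=0.5, gamma_=1.5))
```

Because the argument passed to the mgf is never positive, the restriction t < 1/θ is always satisfied for x ≥ 0.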
Example 21.2
A continuous mixture is used in a frailty model with frailty random variable
Λ, such that $a(x) = \frac{1}{x+1}$, x > 0. Find the conditional survival function of
X.
Solution.
We first find A(x):
\[A(x) = \int_0^x\frac{dt}{1+t} = \ln(1+x).\]
Thus,
\[S_{X|\Lambda}(x|\lambda) = e^{-\lambda A(x)} = e^{-\lambda\ln(1+x)} = \frac{1}{(1+x)^{\lambda}}.\]
Example 21.3
The marginal survival function in a frailty model is $S_X(x) = x^{-2}$, x ≥ 1.
Let the frailty random variable have an exponential distribution.
Determine $M_{\Lambda}(x)$.
Solution.
Since Λ has an exponential distribution, A(x) = x. Thus, $x^{-2} = S_X(x) =
M_{\Lambda}(-A(x)) = M_{\Lambda}(-x)$. Hence, $M_{\Lambda}(x) = x^{-2}$, x ≥ 1.
Example 21.4
The marginal survival function in a frailty model is given to be $S_X(x) = \frac{1}{2^{\alpha}}M_{\Lambda}(x)$.
The frailty random variable Λ is a Gamma random variable with
parameters α and θ. Determine A(x).
Solution.
The moment generating function of Λ is $M_{\Lambda}(x) = (1-\theta x)^{-\alpha}$. Thus,
$\frac{1}{2^{\alpha}}(1-\theta x)^{-\alpha} = S_X(x) = M_{\Lambda}(-A(x)) = (1+\theta A(x))^{-\alpha}$. Hence, $2(1-\theta x) = 1+\theta A(x)$.
Solving for A(x), we find $A(x) = \frac{1-2\theta x}{\theta}$.
Example 21.5
Consider a frailty model where X|Θ has an exponential distribution with
conditional hazard rate function hX|Θ (x|θ) = θ. The frailty random variable
Θ has a uniform distribution in (1, 11). Find the conditional survival function
of X.
Solution.
We have
\[S_{X|\Theta}(x|\theta) = e^{-\int_0^xh_{X|\Theta}(z|\theta)\,dz} = e^{-\int_0^x\theta\,dz} = e^{-\theta x},\quad x>0.\]
Example 21.6
Consider the exponential-inverse Gaussian frailty model with
\[a(x) = \frac{\theta}{2\sqrt{1+\theta x}},\quad\theta>0.\]
Determine A(x).
Solution.
We have
\[A(x) = \int_0^xa(t)\,dt = \int_0^x\frac{\theta}{2\sqrt{1+\theta t}}\,dt = \sqrt{1+\theta t}\,\Big|_0^x = \sqrt{1+\theta x}-1.\]
Practice Problems
Problem 21.1
Let X|Λ have an exponential distribution with conditional survival function
$S_{X|\Lambda}(x|\lambda) = e^{-\lambda x}$. Let Λ have a Gamma distribution with parameters α and
θ.
Problem 21.2
A continuous mixture is used in a frailty model with frailty random variable
Λ, such that $a(x) = \frac{1}{x+1}$, x > 0. The frailty random variable has a uniform
distribution in (0, 1).
Problem 21.3
The marginal survival function in a frailty model is $S_X(x) = e^{-\frac{x}{\theta}}$, x ≥ 0.
Let the frailty random variable have an exponential distribution.
Determine MΛ (x).
Problem 21.4
The marginal survival function in a frailty model is given to be $S_X(x) = \frac{1}{2^{\alpha}}M_{\Lambda}(x)$.
The frailty random variable Λ is a Gamma random variable with
parameters α and θ.
Determine a(x).
Problem 21.5
The probability generating function of the frailty random variable is $P_{\Lambda}(x) = e^{x-1}$.
Suppose X|Λ has an exponential distribution.
Problem 21.6
Consider a frailty model where X|Θ has an exponential distribution with
conditional hazard rate function hX|Θ (x|θ) = θ. The frailty random variable
Θ has a uniform distribution in (1, 11).
Problem 21.7
Suppose that X|Λ has the Weibull distribution with conditional survival
function $S_{X|\Lambda}(x|\lambda) = e^{-\lambda x^{\gamma}}$, x ≥ 0. The frailty random variable Λ has an
exponential distribution with mean θ.
Problem 21.8
Consider the exponential-inverse Gaussian frailty model with
\[a(x) = \frac{\theta}{2\sqrt{1+\theta x}},\quad\theta>0.\]
Determine the conditional survival function SX|Λ (x|λ).
Problem 21.9
Consider the exponential-inverse Gaussian frailty model with
\[a(x) = \frac{\theta}{2\sqrt{1+\theta x}},\quad\theta>0.\]
Suppose Λ has a Gamma distribution with parameters 2α and θ = 1.
Problem 21.10
Determine the unconditional probability density function of frailty distribu-
tion.
Problem 21.11
Determine the unconditional hazard rate function of frailty distribution.
22 Spliced Distributions
A spliced distribution consists of different distributions, one for each part
of the domain of the random variable. For example, an n-component
spliced distribution has the following pdf
\[f_X(x) = \begin{cases}\alpha_1f_1(x), & c_1<x<c_2\\ \alpha_2f_2(x), & c_2<x<c_3\\ \quad\vdots & \\ \alpha_nf_n(x), & c_n<x<c_{n+1}.\end{cases}\]
Example 22.3 ‡
An actuary for a medical device manufacturer initially models the failure
time for a particular device with an exponential distribution with mean 4
years. This distribution is replaced with a spliced model whose density
function:
(i) is Uniform over [0,3]
(ii) is proportional to the initial modeled density function after 3 years
(iii) is continuous.
Calculate the probability of failure in the first 3 years under the revised
distribution.
Solution.
The two-spliced pdf is
\[f(x) = \begin{cases}\dfrac{\alpha}{3}, & 0<x<3\\[2mm] (1-\alpha)e^{0.75}\left[0.25e^{-0.25x}\right], & x\ge 3.\end{cases}\]
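The weight α follows from continuity at x = 3: the right piece equals $(1-\alpha)e^{0.75}\cdot 0.25e^{-0.75} = 0.25(1-\alpha)$ there, so $\alpha/3 = 0.25(1-\alpha)$. Since the uniform piece carries total mass α, this is also the probability of failure in the first 3 years. A minimal sketch with exact fractions (variable names are illustrative):

```python
from fractions import Fraction

rate = Fraction(1, 4)          # exponential with mean 4 has density 0.25 e^{-0.25 x}
uniform_height = Fraction(1, 3)  # uniform density on [0, 3]

# Continuity at x = 3: alpha * (1/3) = (1 - alpha) * (1/4).
# Rearranged: alpha * (1/3 + 1/4) = 1/4.
alpha = rate / (uniform_height + rate)

left_limit = alpha * uniform_height
right_limit = (1 - alpha) * rate

if __name__ == "__main__":
    print(alpha, float(alpha))       # probability of failure in the first 3 years
    print(left_limit == right_limit)
```

Working in fractions avoids rounding and makes the continuity condition an exact equality.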
Practice Problems
Problem 22.1
Find the density function of a random variable that is uniform on (0, c) and
exponential thereafter.
Problem 22.2 ‡
Suppose a loss distribution is a two-component spliced model with:
(i) a Weibull distribution having parameters θ1 = 1500 and τ = 1 for losses
up to $4,000; and
(ii) a Pareto distribution having parameters θ2 = 12000 and α = 2 for losses
$4,000 and up.
The probability that losses are less than $4,000 is 0.60.
Problem 22.3
A random variable X follows a continuous two-component spliced distribu-
tion that is uniform in (0, 3] and exponential (with mean 1) thereafter. Find
the 95th percentile of X.
Problem 22.4
Using the results of the previous problem, find E(X|X > π0.95 ).
Problem 22.5
Write the density function for a 2-component spliced model in which the
density function is proportional to a uniform density over the interval from
0 to 1000 and is proportional to an exponential density function from 1000
to ∞. Ensure that the resulting density function is continuous.
Problem 22.6
The pdf of a two-component spliced distribution is given below.
\[f(x) = \begin{cases}0.01, & 0\le x<50\\ 0.02, & 50\le x\le 75.\end{cases}\]
23 Limiting Distributions
In addition to the methods described in the previous sections, we can obtain
new distributions as limiting cases of other ones. This is accomplished by
letting the parameters go to either infinity or zero.
Example 23.1
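For a Pareto distribution with parameters α and θ, let both α and θ go
to infinity with the ratio $\frac{\alpha}{\theta}\to\xi$ held constant. Show that the result is an
exponential distribution.
Solution.
Let $\xi = \frac{\alpha}{\theta}$ so that α = ξθ. Substituting into the cdf of the Pareto distribution,
we find
\[F_X(x) = 1-\left(\frac{\theta}{x+\theta}\right)^{\xi\theta}.\]
Let $w = \left(\frac{\theta}{x+\theta}\right)^{\xi\theta}$. We have
\[\begin{aligned}
\lim_{\theta\to\infty}\ln w &= \lim_{\theta\to\infty}\xi\theta[\ln\theta-\ln(x+\theta)]\\
&= \xi\lim_{\theta\to\infty}\frac{\ln\theta-\ln(x+\theta)}{\theta^{-1}}\\
&= \xi\lim_{\theta\to\infty}\frac{\theta^{-1}-(x+\theta)^{-1}}{-\theta^{-2}}\\
&= -\xi\lim_{\theta\to\infty}\frac{x\theta}{x+\theta} = -\xi x.
\end{aligned}\]
It follows that $\lim_{\theta\to\infty}w = e^{-\xi x}$ and $\lim_{\theta\to\infty}F_X(x) = 1-e^{-\xi x}$, which is the
cdf of an exponential distribution with mean $\frac{1}{\xi}$.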
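The convergence can be observed numerically by fixing ξ, setting α = ξθ, and letting θ grow. The sketch below is illustrative; the values of ξ, x, and θ are assumptions chosen for the demonstration.

```python
import math


def pareto_cdf(x, alpha, theta):
    """Two-parameter Pareto cdf: 1 - (theta / (x + theta))^alpha."""
    return 1.0 - (theta / (x + theta)) ** alpha


def exponential_cdf(x, xi):
    """Exponential cdf with mean 1 / xi."""
    return 1.0 - math.exp(-xi * x)


xi, x = 0.5, 3.0
gaps = [
    abs(pareto_cdf(x, xi * theta, theta) - exponential_cdf(x, xi))
    for theta in (10.0, 1e3, 1e5)
]

if __name__ == "__main__":
    for theta, gap in zip((10.0, 1e3, 1e5), gaps):
        print(theta, gap)
```

The gap shrinks roughly in proportion to 1/θ, consistent with the limit computed above.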
Example 23.2
For a transformed beta distribution with parameters α, γ and θ, let both
α and θ go to infinity with the ratio $\frac{\theta}{\alpha^{1/\gamma}}\to\xi$. Show that the result is a
transformed gamma distribution.
Solution.
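For large α, Stirling's formula gives
\[\Gamma(\alpha)\approx e^{-\alpha}\alpha^{\alpha-\frac12}(2\pi)^{\frac12}.\]
Also, we let $\xi = \frac{\theta}{\alpha^{1/\gamma}}$ so that $\theta = \xi\alpha^{\frac{1}{\gamma}}$.
Using this and Stirling's formula in the pdf of a transformed beta distribution,
we find
\[\begin{aligned}
f_X(x) &= \frac{\Gamma(\alpha+\tau)\gamma x^{\gamma\tau-1}}{\Gamma(\alpha)\Gamma(\tau)\theta^{\gamma\tau}(1+x^{\gamma}\theta^{-\gamma})^{\alpha+\tau}}\\
&\approx \frac{e^{-\alpha-\tau}(\alpha+\tau)^{\alpha+\tau-\frac12}(2\pi)^{\frac12}\gamma x^{\gamma\tau-1}}{e^{-\alpha}\alpha^{\alpha-\frac12}(2\pi)^{\frac12}\Gamma(\tau)(\xi\alpha^{\frac{1}{\gamma}})^{\gamma\tau}(1+x^{\gamma}\xi^{-\gamma}\alpha^{-1})^{\alpha+\tau}}\\
&= \frac{e^{-\tau}[(\alpha+\tau)/\alpha]^{\alpha+\tau-\frac12}\gamma x^{\gamma\tau-1}}{\Gamma(\tau)\xi^{\gamma\tau}[1+(x/\xi)^{\gamma}/\alpha]^{\alpha+\tau}}.
\end{aligned}\]
Now, let
\[w_1 = \left(1+\frac{\tau}{\alpha}\right)^{\alpha+\tau-\frac12}.\]
We have
\[\begin{aligned}
\lim_{\alpha\to\infty}\ln w_1 &= \lim_{\alpha\to\infty}\left(\alpha+\tau-\frac12\right)\ln\left(1+\frac{\tau}{\alpha}\right)\\
&= \lim_{\alpha\to\infty}\frac{\ln\left(1+\frac{\tau}{\alpha}\right)}{(\alpha+\tau-\frac12)^{-1}}\\
&= \lim_{\alpha\to\infty}\frac{-\tau\alpha^{-2}\left(1+\frac{\tau}{\alpha}\right)^{-1}}{-(\alpha+\tau-\frac12)^{-2}}\\
&= \tau\lim_{\alpha\to\infty}\left(1+\frac{\tau}{\alpha}\right)^{-1}\left(1+\frac{\tau}{\alpha}-\frac{1}{2\alpha}\right)^2\\
&= \tau.
\end{aligned}\]
Thus, $\lim_{\alpha\to\infty}w_1 = e^{\tau}$. Now, let
\[w_2 = \left[1+\frac{(x/\xi)^{\gamma}}{\alpha}\right]^{\alpha+\tau}.\]
We have
\[\begin{aligned}
\lim_{\alpha\to\infty}\ln w_2 &= \lim_{\alpha\to\infty}(\alpha+\tau)\ln\left[1+\frac{(x/\xi)^{\gamma}}{\alpha}\right]\\
&= \lim_{\alpha\to\infty}\frac{\ln\left[1+\frac{(x/\xi)^{\gamma}}{\alpha}\right]}{(\alpha+\tau)^{-1}}\\
&= \lim_{\alpha\to\infty}\frac{-(x/\xi)^{\gamma}\alpha^{-2}\left[1+\frac{(x/\xi)^{\gamma}}{\alpha}\right]^{-1}}{-(\alpha+\tau)^{-2}}\\
&= (x/\xi)^{\gamma}\lim_{\alpha\to\infty}\left[1+\frac{(x/\xi)^{\gamma}}{\alpha}\right]^{-1}\left(1+\frac{\tau}{\alpha}\right)^2\\
&= (x/\xi)^{\gamma}.
\end{aligned}\]
Hence,
\[\lim_{\alpha\to\infty}w_2 = e^{(x/\xi)^{\gamma}}.\]
Finally,
\[\lim_{\alpha\to\infty}f_X(x) = \frac{\gamma x^{\gamma\tau-1}e^{-\left(\frac{x}{\xi}\right)^{\gamma}}}{\Gamma(\tau)\xi^{\gamma\tau}}\]
which is the pdf of the transformed gamma distribution.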
Practice Problems
Problem 23.1
Show:
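\[\lim_{\tau\to\infty}\left(1+\frac{\alpha}{\tau}\right)^{\alpha+\tau-\frac12} = e^{\alpha}\quad\text{and}\quad\lim_{\tau\to\infty}\left[1+\frac{(\xi/x)^{\gamma}}{\tau}\right]^{\alpha+\tau} = e^{\left(\frac{\xi}{x}\right)^{\gamma}}.\]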
Problem 23.2
For a transformed beta distribution with parameters α, γ and θ, let θ go to
infinity with the ratio $\theta\tau^{\frac{1}{\gamma}}\to\xi$.
Problem 23.3
For an inverse Pareto distribution with parameters τ and θ, let θ → 0, τ →
∞ and τ θ → ξ.
Example 24.1
Show that a normal distribution with parameters µ and σ 2 belongs to the
linear exponential family with θ = µ.
Solution.
The pdf of the normal distribution function can be written as
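\[\begin{aligned}
f(x,\mu) &= \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac12\left(\frac{x-\mu}{\sigma}\right)^2}\\
&= \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac12\,\frac{x^2-2\mu x+\mu^2}{\sigma^2}}\\
&= \frac{1}{\sigma\sqrt{2\pi}\,e^{\frac{\mu^2}{2\sigma^2}}}e^{-\frac{x^2}{2\sigma^2}}e^{\frac{\mu x}{\sigma^2}}.
\end{aligned}\]
Thus, X belongs to the linear exponential family with $p(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{x^2}{2\sigma^2}}$, $q(\mu) = e^{\frac{\mu^2}{2\sigma^2}}$,
and $r(\mu) = \frac{\mu}{\sigma^2}$.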
Example 24.2
Let X belong in the linear exponential family. Find an expression of E(X).
Solution.
Taking the logarithm of f (x, θ) we find
Solution.
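From the previous example, we have
\[\frac{\partial f}{\partial\theta}(x,\theta) = [x-\mu(\theta)]r'(\theta)f(x,\theta).\]
Differentiating with respect to θ and using the already obtained first derivative
of f(x, θ) yield
\[\begin{aligned}
\frac{\partial^2f}{\partial\theta^2}(x,\theta) &= r''(\theta)[x-\mu(\theta)]f(x,\theta)-\mu'(\theta)r'(\theta)f(x,\theta)+[x-\mu(\theta)]r'(\theta)\frac{\partial f}{\partial\theta}(x,\theta)\\
&= r''(\theta)[x-\mu(\theta)]f(x,\theta)-\mu'(\theta)r'(\theta)f(x,\theta)+[x-\mu(\theta)]^2[r'(\theta)]^2f(x,\theta).
\end{aligned}\]
Now we integrate both sides with respect to x to obtain
\[\int\frac{\partial^2f}{\partial\theta^2}(x,\theta)\,dx = [r'(\theta)]^2\mathrm{Var}(X)-r'(\theta)\mu'(\theta).\]
By the definition of the family, the support of X is independent of θ and so
is the range of x. Thus, we can write
\[\frac{\partial^2}{\partial\theta^2}\int f(x,\theta)\,dx = [r'(\theta)]^2\mathrm{Var}(X)-r'(\theta)\mu'(\theta).\]
Since $\int f(x,\theta)\,dx = 1$, the left-hand side is 0. That is,
\[[r'(\theta)]^2\mathrm{Var}(X)-r'(\theta)\mu'(\theta) = 0.\]
Solving for Var(X) yields
\[\mathrm{Var}(X) = \frac{\mu'(\theta)}{r'(\theta)}.\]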
Example 24.4
Let X be the normal distribution with θ = µ = 24 and σ = 3. Use the
formulas of this section to verify that E(X) = 24 and Var(X) = 9.
Solution.
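For the given distribution, we have $p(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{x^2}{2\sigma^2}}$, $q(\theta) = e^{\frac{\theta^2}{2\sigma^2}}$, and
$r(\theta) = \frac{\theta}{\sigma^2}$. Thus,
\[E(X) = \frac{q'(\theta)}{r'(\theta)q(\theta)} = \frac{\frac{\theta}{\sigma^2}e^{\frac{\theta^2}{2\sigma^2}}}{\frac{1}{\sigma^2}e^{\frac{\theta^2}{2\sigma^2}}} = 24.\]
Likewise, we have
\[\mathrm{Var}(X) = \frac{\mu'(\theta)}{r'(\theta)} = \frac{1}{\frac19} = 9.\]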
Practice Problems
Problem 24.1
Show that the Gamma distribution belongs to the linear exponential family.
Problem 24.2
Show that the Poisson distribution with parameter λ belongs to the linear
exponential family.
Problem 24.3
Show that the binomial distribution with m trials and parameter p belongs
to the linear exponential family.
Problem 24.4
Use the formulas of this section to find the mean and the variance of the
Poisson distribution.
Problem 24.5
Use the formulas of this section to find the mean and the variance of the
binomial distribution.
Discrete Distributions
The distributions and the families of distributions that we have been dis-
cussing so far are mainly used to describe the amount of risks. Next, we turn
our attention to distributions that describe the number of risks or claims.
In this chapter, we introduce classes of counting distributions. By a count-
ing distribution we mean a discrete distribution with support a subset of
N ∪ {0}. We will adopt the following notation: If N is the random variable
representing the number of events (or claims) then the probability mass
function or the probability function Pr(N = k) will be denoted by pk .
Using the pgf, we can find the mean and the variance:
Theorem 25.1
Let N1 , N2 , · · · , Nn be n independent Poisson random variables with parameters
λ1 , λ2 , · · · , λn respectively. Then the random variable S = N1 + N2 +
· · · + Nn is also a Poisson random variable, with parameter λ1 + λ2 + · · · + λn .
25 THE POISSON DISTRIBUTION 179
Proof.
Since N1 , N2 , · · · , Nn are independent so are $e^{tN_1}, e^{tN_2}, \cdots, e^{tN_n}$. Using the
fact that the expectation of a product of independent random variables is
the product of the individual expectations, we have
The second result is very useful when modeling insurance losses or claims
where claims have classification. This result is known as the decomposi-
tion property of Poisson distribution.
Theorem 25.2
Let the total number of events N be a Poisson random variable with mean
λ. Suppose that the events can be classified into independent types such
as Type 1, Type 2,· · · , Type m with probabilities pk , k = 1, · · · , m. Let
Nk be the random variable representing the number of events of Type
k, k = 1, · · · , m. Then N1 , N2 , · · · , Nm are mutually independent Poisson
random variables with means λp1 , λp2 , · · · , λpm respectively.
Proof.
Given n = n1 +n2 +· · ·+nm , the conditional joint distribution of N1 , N2 , · · · , Nm
is a multinomial distribution with parameters n, p1 , p2 , · · · , pm . Its condi-
tional pmf is
\[ \Pr(N_1 = n_1, N_2 = n_2, \cdots, N_m = n_m \mid N = n) = \frac{n!}{n_1!\,n_2!\cdots n_m!}\,p_1^{n_1}p_2^{n_2}\cdots p_m^{n_m}. \]
\[ \Pr(N = n) = \frac{e^{-\lambda}\lambda^n}{n!}. \]
Solution.
Let N be the number of patients with appendicitis that arrive to the emer-
gency room in the 24 hour period. Then
\[ E(N) = 6 \times 24 \times \frac{1}{50} = 2.88. \]
Thus,
\[ \Pr(N > 2.88) = 1 - p_0 - p_1 - p_2 = 1 - e^{-2.88} - 2.88e^{-2.88} - 2.88^2\,\frac{e^{-2.88}}{2} = 0.5494 \]
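The arithmetic in this solution can be checked numerically. The following is a minimal Python sketch, assuming only the Poisson probability function $p_k = e^{-\lambda}\lambda^k/k!$; the helper name `poisson_pmf` is introduced here for illustration and is not part of the text.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability p_k = exp(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# Mean used in the solution: 6 * 24 * (1/50)
lam = 6 * 24 / 50

# Pr(N > 2.88) = Pr(N >= 3) = 1 - p_0 - p_1 - p_2 because N is integer valued
tail = 1.0 - sum(poisson_pmf(k, lam) for k in range(3))

print(round(lam, 2), round(tail, 4))
```

Since N takes only integer values, the event N > 2.88 is the same as N ≥ 3, which is why only the first three probabilities are subtracted.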
Example 25.2
In a portfolio of insurance, a claim can be classified as Type A, Type B, or
Type C with probabilities 0.2, 0.3, and 0.5 respectively. Suppose that the
total number of claims is a Poisson random variable with mean 10. Each
type has a Poisson distribution and these random variables are supposed to
be independent. What is the probability that out of 5 claims, 2 are of Type
A?
Solution.
We have
\[ \Pr(N_A = 2 \mid N = 5) = \frac{\Pr(N_A = 2,\ N_B + N_C = 3)}{\Pr(N = 5)} = \frac{\dfrac{e^{-2}2^2}{2!}\cdot\dfrac{e^{-8}8^3}{3!}}{\dfrac{e^{-10}10^5}{5!}} = 0.2048 \]
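As a sanity check on the decomposition property, the conditional probability above should agree with a binomial probability with 5 trials and success probability 0.2. The sketch below is illustrative only; the helper `poisson_pmf` and the variable names are assumptions made for this example.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability p_k = exp(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)


lam, p_a = 10.0, 0.2  # total mean and Type A probability from the example

# N_A ~ Poisson(lam * p_a) and N_B + N_C ~ Poisson(lam * (1 - p_a)), independent
joint = poisson_pmf(2, lam * p_a) * poisson_pmf(3, lam * (1 - p_a))
conditional = joint / poisson_pmf(5, lam)

# Conditional on N = 5, the count of Type A claims is binomial(5, 0.2)
binomial = math.comb(5, 2) * p_a**2 * (1 - p_a) ** 3

print(round(conditional, 4), round(binomial, 4))
```

Both numbers come out to 0.2048, which matches the multinomial argument used in the proof of the decomposition theorem.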
Practice Problems
Problem 25.1
The number of students in College Algebra that go to Math Help room has
a Poisson distribution with mean 2 per hour. The number of students in
Trigonometry that go to Math Help room has a Poisson distribution with
mean 1 per hour. The number of students in Pre-calculus that go to Math
Help room has a Poisson distribution with mean 0.5 per hour.
Calculate the probability that more than 3 students (in any class) go to
Math Help room between 2:00 pm and 4:00 pm.
Problem 25.2
Let N1 and N2 be two independent Poisson random variables with mean 2
and 3 respectively.
Problem 25.3
The number of monthly car wrecks in Russellville follows a Poisson distri-
bution with mean 10. There are three possible mutually exclusive causes of
accidents. The probability of a wreck due to icy road is 0.40. The probability
of a wreck due to brake malfunction is 0.45 and the probability of a
wreck due to driver’s error is 0.15.
What is the probability that exactly 5 wrecks will occur this month due
to driver’s error?
Problem 25.4
Suppose that the number of cars arriving for service at a service facility in
one week has a Poisson distribution with a mean of 20. Suppose that each
car is classified as either domestic or foreign. Suppose also that each time
a car arrives for service there is a 75% chance that it is domestic and a 25%
chance that it is foreign.
What is the weekly expected number of foreign cars arriving at the service
facility?
26 THE NEGATIVE BINOMIAL DISTRIBUTION 183
Example 26.2
Find the mean and the variance of a negative binomial distribution N.
Solution.
We have
\begin{align*}
E(N) &= P_N'(1) = r\beta \\
E[N(N-1)] &= P_N''(1) = r(r+1)\beta^2 \\
\mathrm{Var}(N) &= E[N(N-1)] + E(N) - (E(N))^2 \\
&= r\beta(1+\beta).
\end{align*}
Example 26.3
Show that a negative binomial distribution is a mixture of a Poisson distri-
bution with a random parameter distributed Gamma.
Solution.
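Suppose that N|Λ = λ has a Poisson distribution with parameter λ. Suppose
that Λ has a Gamma distribution with parameters α and θ and density $f_\Lambda$. We wish to find
the distribution of N. We have
\begin{align*}
p_n = \Pr(N = n) &= \int_0^\infty \Pr(N = n \mid \Lambda = \lambda)\,f_\Lambda(\lambda)\,d\lambda \\
&= \int_0^\infty \frac{e^{-\lambda}\lambda^n}{n!}\cdot\frac{\lambda^{\alpha-1}e^{-\frac{\lambda}{\theta}}}{\theta^\alpha\Gamma(\alpha)}\,d\lambda
 = \frac{1}{n!\,\theta^\alpha\Gamma(\alpha)}\int_0^\infty e^{-\lambda\left(1+\frac{1}{\theta}\right)}\lambda^{n+\alpha-1}\,d\lambda \\
&= \binom{n+\alpha-1}{n}\left(\frac{1}{1+\theta}\right)^{\alpha}\left(\frac{\theta}{1+\theta}\right)^{n}\int_0^\infty \frac{e^{-\frac{\lambda}{(1+1/\theta)^{-1}}}\lambda^{\alpha+n-1}}{(1+1/\theta)^{-(\alpha+n)}\Gamma(\alpha+n)}\,d\lambda \\
&= \binom{n+\alpha-1}{n}\left(\frac{1}{1+\theta}\right)^{\alpha}\left(\frac{\theta}{1+\theta}\right)^{n}.
\end{align*}
The last integral equals 1 because its integrand is a Gamma density.
This shows that the mixed Poisson, with a Gamma mixing distribution, is
the same as the negative binomial distribution with r = α and β = θ.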
Example 26.4 ‡
Glen is practicing his simulation skills. He generates 1000 values of the
random variable X as follows:
(i) He generates the observed value λ from the gamma distribution with
α = 2 and θ = 1 (hence with mean 2 and variance 2).
(ii) He then generates x from the Poisson distribution with mean λ.
(iii) He repeats the process 999 more times: first generating a value λ, then
generating x from the Poisson distribution with mean λ.
(iv) The repetitions are mutually independent.
Calculate the expected number of times that his simulated value of X is 3.
Solution.
By the previous result, X has a negative binomial distribution with r = α = 2
and β = θ = 1. From Table C, we have
\[ p_3 = \frac{r(r+1)(r+2)\beta^3}{3!\,(1+\beta)^{r+3}} = \frac{(2)(3)(4)1^3}{3!\,(2)^5} = 0.125. \]
Thus we expect 1000p3 = 125 out of 1000 simulated values to be 3
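The value $p_3 = 0.125$ can be cross-checked in two ways: directly from the negative binomial probability function, and by numerically integrating the Poisson probability against the Gamma density. The Python sketch below does both; the function names, the midpoint rule, and the integration cutoff are choices made for this illustration, not part of the text.

```python
import math


def neg_binomial_pmf(k: int, r: float, beta: float) -> float:
    """p_k = C(k+r-1, k) (1/(1+beta))^r (beta/(1+beta))^k, via log-gamma."""
    log_coeff = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    log_prob = log_coeff - r * math.log1p(beta) + k * (math.log(beta) - math.log1p(beta))
    return math.exp(log_prob)


def gamma_poisson_mixture(k: int, alpha: float, theta: float,
                          upper: float = 60.0, steps: int = 60000) -> float:
    """Midpoint-rule approximation of the integral of Poisson(k; lam) * Gamma density."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        density = lam ** (alpha - 1) * math.exp(-lam / theta) / (theta**alpha * math.gamma(alpha))
        total += poisson * density * h
    return total


p3 = neg_binomial_pmf(3, r=2, beta=1)
mixed = gamma_poisson_mixture(3, alpha=2, theta=1)
expected_count = 1000 * p3

print(round(p3, 6), round(mixed, 6), round(expected_count, 1))
```

The two probabilities agree to several decimal places, which is consistent with the mixture result of the previous example.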
Example 26.5
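Show that a negative binomial distribution with r → ∞, β → 0, and rβ → ξ
results in a Poisson distribution.
Solution.
Replace β in the pgf of N by $\frac{\xi}{r}$. We have
\begin{align*}
\lim_{r\to\infty} P_N(z) &= \lim_{r\to\infty}\left[1 - \frac{\xi}{r}(z-1)\right]^{-r} \\
&= \exp\left\{\lim_{r\to\infty} -r\ln\left[1 - \frac{\xi}{r}(z-1)\right]\right\} \\
&= \exp\left\{\lim_{r\to\infty} -\frac{\ln\left[1 - \xi(z-1)r^{-1}\right]}{r^{-1}}\right\} \\
&= \exp\left\{\lim_{r\to\infty} \frac{[1 - \xi(z-1)r^{-1}]^{-1}\,\xi(z-1)r^{-2}}{r^{-2}}\right\} \\
&= \exp\left\{\lim_{r\to\infty} \frac{r\xi(z-1)}{r - \xi(z-1)}\right\} = e^{\xi(z-1)},
\end{align*}
which is the pgf of a Poisson distribution with parameter ξ.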
Example 26.6 ‡
Actuaries have modeled auto windshield claim frequencies and have con-
cluded that the number of windshield claims filed per year per driver follows
the Poisson distribution with parameter Λ, where Λ follows the Gamma
distribution with mean 3 and variance 3. Calculate the probability that a
driver selected at random will file no more than 1 windshield claim next
year.
Solution.
We are given that αθ = 3 and αθ² = 3. Thus, α = 3 and θ = 1. On the
other hand, the number of windshield claims has a negative binomial distribution
with r = α = 3 and β = θ = 1. Hence,
\[ \Pr(N \le 1) = p_0 + p_1 = \frac{1}{8} + \frac{3}{16} = \frac{5}{16} = 0.3125 \]
For the special case r = 1, the negative binomial random variable is called
the geometric random variable. The geometric distribution, like the expo-
nential distribution, has the memoryless property: the distribution of a
variable that is known to be in excess of some value d does not depend on
d.
Example 26.7
The number of students having the flu in a particular college follows a geo-
metric distribution with β = 0.4. What is the difference between the follow-
ing two values:
A. The expected number of students having the flu in excess of 3 if it is
known that the number of students having the flu is greater than 3.
B. The expected number of students having the flu in excess of 2 if it is
known that the number of students having the flu is greater than 2.
Solution.
They are equal by the memoryless property of the geometric distribution
Practice Problems
Problem 26.1
You are modeling the frequency of events, and you need to select a distri-
bution to use. You observe that the variance of the number of events is less
than the mean number of events.
(a) Poisson
(b) Negative binomial
(c) Geometric
(d) None of the above
Problem 26.2
Assume that a certain type of claims in one month follows a geometric dis-
tribution with β = 3.
Problem 26.3
Suppose that the number of claims N in one month follows a geometric dis-
tribution with β = 3.
Problem 26.4
Suppose that the number of claims N in one month follows a negative binomial
distribution with r = 3 and β = 2.
Problem 26.5
Let N |Λ have a negative binomial distribution with r = 2 and λ. Let Λ have
a Gamma distribution with α = 2 and θ = 3.
Find Var[N ].
Problem 26.6
Let N be a negative binomial random variable with mean 8 and variance 40.
Problem 26.7
Find the probability generating function of a geometric random variable N.
Problem 26.8
Find the coefficient of variation of a negative binomial random variable with
parameters r and β.
27 THE BERNOULLI AND BINOMIAL DISTRIBUTIONS 189
Example 27.1
Suppose that in a particular sheet of 100 postage stamps, 3 are defective.
The inspection policy is to look at 5 randomly chosen stamps on a sheet and
to release the sheet into circulation if none of those five is defective. Write
down the random variable, the corresponding probability distribution and
then determine the probability that the sheet described here will be allowed
to go into circulation.
Solution.
Let N be the number of defective stamps in the sheet. Then N is a binomial
(b) We have
\[ \Pr(N \le 2) = \sum_{k=0}^{2} C(10,k)(0.5)^k(0.5)^{10-k} \approx 0.0547 \]
Theorem 27.1
Let N have a binomial distribution with parameters (m, q). Then
(a) $P_N(z) = (zq + 1 - q)^m$.
(b) E(N) = mq and Var(N) = mq(1 − q). Note that Var(N) < E(N).
Proof.
(a) The probability generating function is found by using the binomial formula
$(a+b)^n = \sum_{k=0}^{n} C(n,k)\,a^k b^{n-k}$:
\[ P_N(z) = \sum_{k=0}^{m} C(m,k)\,z^k q^k (1-q)^{m-k} = \sum_{k=0}^{m} C(m,k)\,(zq)^k (1-q)^{m-k} = (zq + 1 - q)^m. \]
(b) We have:
Example 27.4
Let N be a binomial random variable with parameters (12, 0.5). Find the
variance and the standard deviation of N.
Solution.
We have m = 12 and q = 0.5. Thus, Var(N) = mq(1 − q) = 6(1 − 0.5) = 3.
The standard deviation is SD(N) = $\sqrt{3}$
Example 27.5
An exam consists of 25 multiple choice questions in which there are five
choices for each question. Suppose that you randomly pick an answer for
each question. Let N denote the total number of correctly answered ques-
tions. Write an expression that represents each of the following probabilities.
(a) The probability that you get exactly 16, or 17, or 18 of the questions
correct.
(b) The probability that you get at least one of the questions correct.
Solution.
(a) We have
Theorem 27.2
Let N be a binomial random variable with parameters (m, q). Then for
k = 1, 2, 3, · · · , m
\[ p(k) = \frac{q}{1-q}\cdot\frac{m-k+1}{k}\,p(k-1) \]
Proof.
We have
\begin{align*}
\frac{p(k)}{p(k-1)} &= \frac{C(m,k)\,q^k(1-q)^{m-k}}{C(m,k-1)\,q^{k-1}(1-q)^{m-k+1}} \\
&= \frac{\dfrac{m!}{k!(m-k)!}\,q^k(1-q)^{m-k}}{\dfrac{m!}{(k-1)!(m-k+1)!}\,q^{k-1}(1-q)^{m-k+1}} \\
&= \frac{(m-k+1)q}{k(1-q)} = \frac{q}{1-q}\cdot\frac{m-k+1}{k}
\end{align*}
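The recursion just proved gives a convenient way to generate all binomial probabilities from $p(0) = (1-q)^m$. The following Python sketch, with a hypothetical helper `binomial_pmf_by_recursion`, builds the probabilities for m = 12 and q = 0.5 and compares them with the direct formula; the mean and variance are then computed from the resulting list.

```python
import math


def binomial_pmf_by_recursion(m: int, q: float) -> list:
    """Build p(0..m) from p(0) = (1-q)^m and p(k) = q/(1-q) * (m-k+1)/k * p(k-1)."""
    probs = [(1 - q) ** m]
    for k in range(1, m + 1):
        probs.append(probs[-1] * q / (1 - q) * (m - k + 1) / k)
    return probs


m, q = 12, 0.5
probs = binomial_pmf_by_recursion(m, q)

# Direct evaluation of C(m, k) q^k (1-q)^(m-k) for comparison
direct = [math.comb(m, k) * q**k * (1 - q) ** (m - k) for k in range(m + 1)]

mean = sum(k * p for k, p in enumerate(probs))
variance = sum(k * k * p for k, p in enumerate(probs)) - mean**2

print(round(mean, 6), round(variance, 6))
```

The computed mean and variance agree with mq = 6 and mq(1 − q) = 3.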
Practice Problems
Problem 27.1
You are again modeling the frequency of events, and you need to select a
distribution to use. You observe that the variance of the number of events
is less than the mean number of events.
(a) Binomial
(b) Poisson
(c) Negative binomial
(d) Geometric
(e) None of the above are correct.
Problem 27.2
Let N be a random variable which follows a binomial distribution with pa-
rameters m = 20 and q = 0.2
Calculate E(2N ).
Problem 27.3
Suppose that the number of claims N has a binomial distribution with m = 2
and q = 0.7.
Problem 27.4
An insurance company insures 15 risks, each with a 2.5% probability of loss.
The probabilities of loss are independent.
Problem 27.5
Suppose that N |Λ has a binomial distribution with parameters Λ and q =
0.4. Suppose that Λ has a probability function defined by p(1) = p(2) =
p(3) = p(4) = 0.25.
Problem 27.6
An actuary has determined that the number of claims follows a binomial
distribution with mean 6 and variance 3.
Calculate the probability that the number of claims is at least 3 but less
than 5.
Problem 27.7
Let N be a binomial random variable with m = 10 and q = 0.2. Let F (m)
denote the cdf of N. Complete the following table.
m 0 1 2 3 4
pm
F (m)
Problem 27.8
Let N1 and N2 be two independent binomial random variables with respec-
tive parameters (m1 , q) and (m2 , q).
\[ \frac{p_k}{p_{k-1}} = a + \frac{b}{k} \tag{28.1} \]
for some constants a and b and for k ∈ N. We will denote the collection of
these discrete distributions by C(a, b, 0). The table below lists the parameters
a and b for each distribution together with the probability function at 0.
Distribution         a             b                    p0
Poisson              0             λ                    e^{−λ}
Negative binomial    β/(1+β)       (r − 1)β/(1+β)       (1 + β)^{−r}
Geometric            β/(1+β)       0                    (1 + β)^{−1}
Binomial             −q/(1−q)      (m + 1)q/(1−q)       (1 − q)^m
Example 28.1
Let N be a member of C(a, b, 0) satisfying the recursive probabilities
\[ k\,\frac{p_k}{p_{k-1}} = \frac{3}{4}k + 3. \]
Solution.
Since a > 0 and b > 0, N is the negative binomial distribution. We have
$\frac{\beta}{1+\beta} = \frac{3}{4}$ which implies β = 3. Also, we have $(r-1)\frac{\beta}{1+\beta} = 3$ which yields
r = 5
Example 28.2 ‡
The distribution of accidents for 84 randomly selected policies is as follows:
# of Accidents # of Policies
0 32
1 26
2 12
3 7
4 4
5 2
6 1
Solution.
We have
# of Accidents (k)    # of Policies    k p_k / p_{k−1}
0                     32               NA
1                     26               0.8125
2                     12               0.9231
3                     7                1.75
4                     4                2.2857
5                     2                2.5
6                     1                3
Plotting the points $\left(k,\ k\frac{p_k}{p_{k-1}}\right)$ we find that the negative binomial distribution
is the best model from C(a, b, 0) to use (slope of line is positive)
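The ratio column can be reproduced from the observed policy counts, since the sample size cancels in $p_k/p_{k-1}$. The Python sketch below also fits a least-squares line through the points as a rough stand-in for reading the slope off a plot; the least-squares step is an assumption of this illustration, as the text only refers to plotting.

```python
# Observed number of policies with k = 0, 1, ..., 6 accidents
counts = [32, 26, 12, 7, 4, 2, 1]

# k * p_k / p_{k-1}; the common sample size cancels, so counts can be used directly
ks = list(range(1, len(counts)))
ratios = [k * counts[k] / counts[k - 1] for k in ks]

# Least-squares slope of the ratios against k (an informal substitute for the plot)
mean_k = sum(ks) / len(ks)
mean_ratio = sum(ratios) / len(ratios)
slope = (sum((k - mean_k) * (v - mean_ratio) for k, v in zip(ks, ratios))
         / sum((k - mean_k) ** 2 for k in ks))

print([round(v, 4) for v in ratios], slope > 0)
```

Because $k\,p_k/p_{k-1} = ak + b$ in the C(a, b, 0) class, a positive slope corresponds to a > 0, which in the table of this section occurs only for the negative binomial (and geometric) case.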
Example 28.3
The number of dental claims in a year follows a Poisson distribution with
a mean of λ. The probability of exactly six claims during a year is 40% of
the probability that there will be 5 claims. Determine the probability that
there will be 4 claims.
Solution.
Let N be the number of dental claims in a year. Since N is a member of
C(a, b, 0) we can write
\[ \frac{p_k}{p_{k-1}} = a + \frac{b}{k} = \frac{\lambda}{k}, \quad k = 1, 2, \cdots. \]
28 THE (A, B, 0) CLASS OF DISCRETE DISTRIBUTIONS 197
Solution.
Let N denote the distribution under consideration. Since N is a member of
the C(a, b, 0) class, we have the recursive relation
\[ p_k = \left(a + \frac{b}{k}\right)p_{k-1}, \quad k = 1, 2, \cdots. \]
Example 28.5 ‡
A discrete probability distribution has the following properties:
(i) $p_k = c\left(1 + \frac{1}{k}\right)p_{k-1}$, k = 1, 2, · · ·
(ii) p0 = 0.5.
Determine the value of c.
Solution.
This is a class C(a, b, 0) distribution with a = b = c. We will go through each
distribution in the class and see which one fits the condition of the problem.
• If the distribution is Poisson then a = b = c = 0. In this case, p0 = 0.5
and pk = 0 for k = 1, 2, · · · . Since $\sum_{k=0}^{\infty} p_k = 0.5 \neq 1$, this distribution cannot
be the answer.
\[ \frac{a}{b} = 1 = \frac{1}{r-1} \Longrightarrow r = 2. \]
Also,
\[ p_0 = 0.5 = (1+\beta)^{-r} \Longrightarrow \beta = 0.414. \]
Finally,
\[ c = \frac{\beta}{1+\beta} = \frac{0.414}{1.414} = 0.29 \]
Practice Problems
Problem 28.1
Let N be a member of C(a, b, 0) satisfying the recursive probabilities
\[ \frac{p_k}{p_{k-1}} = \frac{4}{k} - \frac{1}{3}. \]
Identify the distribution N.
Problem 28.2
Let N be a member of C(a, b, 0) satisfying the recursive probabilities
\[ \frac{p_k}{p_{k-1}} = \frac{4}{k} - \frac{1}{3}. \]
Find E(N ) and Var(N ).
Problem 28.3 ‡
The number of claims is being modeled with a C(a, b, 0) class of distributions.
You are given:
• p0 = p1 = 0.25
• p2 = 0.1875.
Problem 28.4
Let N be a counting distribution in C(a, b, 0) satisfying:
• $p_0 = \frac{1}{4^5}$
• $\frac{p_k}{p_{k-1}} = c\left(0.25 + \frac{1}{k}\right)$, k = 1, 2, 3, · · · .
Problem 28.5
Suppose that the number of claims N has a Poisson distribution with mean
λ = 3.
Calculate $\frac{p_{255}}{p_{254}}$.
Problem 28.6
Let N be a negative binomial random variable with r = 2.5 and β = 5.
Find the smallest value of k such that $\frac{p_k}{p_{k-1}} < 1$.
Problem 28.7
Let N be a member of C(a, b, 0) such that p0 = p1 and p2 = 0.6p1 .
Problem 28.8
For N in C(a, b, 0) you are given the following:
• p0 = p1 .
• p2 = 0.6p1 .
Based on this information, which of the following are true statements?
Problem 28.9
Let N be a member of C(a, b, 0). You are given:
• p5 = 0.00144
• p4 = 0.006
• p3 = 0.02
Problem 28.10
Let N be a member of C(a, b, 0). You are given:
• p2 = 0.1536
• p1 = 0.4096
• p0 = 0.4096
Determine E(N ).
Problem 28.11 ‡
For a discrete probability distribution, you are given the recursion relation
\[ p(k) = \frac{2}{k}\,p(k-1), \quad k = 1, 2, \cdots. \]
Determine p(4).
29 THE CLASS C(A, B, 1) OF DISCRETE DISTRIBUTIONS 201
\[ p_k^* = \frac{1 - p_0^*}{1 - p_0}\,p_k, \quad k = 1, 2, \cdots. \]
Note that
\[ \sum_{k=0}^{\infty} p_k^* = p_0^* + \sum_{k=1}^{\infty} p_k^* = p_0^* + \frac{1 - p_0^*}{1 - p_0}\sum_{k=1}^{\infty} p_k = p_0^* + \frac{1 - p_0^*}{1 - p_0}(1 - p_0) = 1. \]
We will consider only the distributions of C(a, b, 0). The collection of all
members of C(a, b, 0) together with the associated zero-modified distributions
belong to a class denoted by C(a, b, 1). Note that, for k = 2, 3, · · · , we
have
\[ \frac{p_k^*}{p_{k-1}^*} = \frac{p_k}{p_{k-1}} = a + \frac{b}{k}. \]
Note that both members of C(a, b, 0) and C(a, b, 1) satisfy
\[ p_k = \left(a + \frac{b}{k}\right)p_{k-1} \]
but for the C(a, b, 0) class the k starts from 1 whereas for the C(a, b, 1) class
the k starts from 2.
\[ p_k^M = \frac{1 - p_0^M}{1 - p_0}\,p_k, \quad k = 1, 2, \cdots \]
\[ p_k^T = \frac{1}{1 - p_0}\,p_k, \quad k = 1, 2, \cdots \]
and $p_0^T = 0$.
Theorem 29.1
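Let N be in C(a, b, 0) with corresponding moment generating function $M_N(t)$.
Then the moment generating function of $N^M$ is
\[ M_N^M(t) = \frac{p_0^M - p_0}{1 - p_0} + \frac{1 - p_0^M}{1 - p_0}\,M_N(t). \]
Proof.
We have
\begin{align*}
M_N^M(t) = E\left(e^{tN^M}\right) &= \sum_{n=0}^{\infty} e^{tn}p_n^M \\
&= p_0^M + \frac{1 - p_0^M}{1 - p_0}\sum_{n=1}^{\infty} e^{tn}p_n \\
&= p_0^M + \frac{1 - p_0^M}{1 - p_0}\left[\sum_{n=0}^{\infty} e^{tn}p_n - p_0\right] \\
&= p_0^M + \frac{1 - p_0^M}{1 - p_0}\left[M_N(t) - p_0\right] \\
&= p_0^M - p_0\,\frac{1 - p_0^M}{1 - p_0} + \frac{1 - p_0^M}{1 - p_0}\,M_N(t) \\
&= \frac{p_0^M - p_0}{1 - p_0} + \frac{1 - p_0^M}{1 - p_0}\,M_N(t)
\end{align*}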
Corollary 29.1
Let N be in C(a, b, 0) with corresponding probability generating function
$P_N(t)$. Then the probability generating function of $N^M$ is
\[ P_N^M(t) = \frac{p_0^M - p_0}{1 - p_0} + \frac{1 - p_0^M}{1 - p_0}\,P_N(t). \]
Proof.
We have
\begin{align*}
P_N^M(t) = M_N^M(\ln t) &= \frac{p_0^M - p_0}{1 - p_0} + \frac{1 - p_0^M}{1 - p_0}\,M_N(\ln t) \\
&= \frac{p_0^M - p_0}{1 - p_0} + \frac{1 - p_0^M}{1 - p_0}\,P_N(t)
\end{align*}
Remark 29.1
Note that
\[ P_N^M(t) = \left(1 - \frac{1 - p_0^M}{1 - p_0}\right)\cdot 1 + \frac{1 - p_0^M}{1 - p_0}\,P_N(t). \]
Example 29.1
Let N be the Poisson distribution with parameter λ. Find the probability
functions of
(a) the zero-truncated distribution
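(b) the zero-modified distribution with preassigned $p_0^M = 0.10$.
Solution.
(a) We have $p_0 = e^{-\lambda}$ and $p_k = \frac{e^{-\lambda}\lambda^k}{k!}$. Hence,
\[ p_k^T = \frac{1}{1 - p_0}\,p_k = \frac{1}{1 - e^{-\lambda}}\cdot\frac{e^{-\lambda}\lambda^k}{k!}, \quad k = 1, 2, \cdots. \]
(b) We have $p_0^M = 0.10$ and
\[ p_k^M = \frac{1 - p_0^M}{1 - p_0}\,p_k = \frac{0.90}{1 - e^{-\lambda}}\cdot\frac{e^{-\lambda}\lambda^k}{k!}, \quad k = 1, 2, \cdots \]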
Example 29.2
Let N have the negative binomial distribution with r = 2.5 and β = 0.5.
(a) Determine pk , k = 0, 1, 2, 3.
(b) Determine $p_1^T, p_2^T, p_3^T$.
(c) Determine $p_1^M, p_2^M, p_3^M$ given that $p_0^M = 0.6$.
Solution.
(a) From the table in the previous section, we find
(b) We have
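\begin{align*}
p_0^T &= 0 \\
p_1^T &= \frac{1}{1 - p_0}\,p_1 = 0.474651 \\
p_2^T &= \frac{1}{1 - p_0}\,p_2 = 0.276880 \\
p_3^T &= \frac{1}{1 - p_0}\,p_3 = 0.138440.
\end{align*}
(c) We have
\begin{align*}
p_0^M &= 0.6 \\
p_1^M &= \frac{1 - p_0^M}{1 - p_0}\,p_1 = 0.189860 \\
p_2^M &= \frac{1 - p_0^M}{1 - p_0}\,p_2 = 0.110752 \\
p_3^M &= \frac{1 - p_0^M}{1 - p_0}\,p_3 = 0.055376
\end{align*}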
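The numbers in this example can be reproduced with the C(a, b, 0) recursion and the truncation and modification formulas of this section. The Python sketch below is a hedged illustration; the helper `ab0_probabilities` and the variable names are hypothetical, and the values of a, b, and $p_0$ are taken from the negative binomial row of the table in the previous section.

```python
def ab0_probabilities(a: float, b: float, p0: float, n: int) -> list:
    """Return [p_0, ..., p_n] using p_k = (a + b/k) * p_{k-1}."""
    probs = [p0]
    for k in range(1, n + 1):
        probs.append(probs[-1] * (a + b / k))
    return probs


r, beta = 2.5, 0.5
a = beta / (1 + beta)
b = (r - 1) * beta / (1 + beta)
p0 = (1 + beta) ** (-r)

p = ab0_probabilities(a, b, p0, 3)

# Zero-truncated: p_k^T = p_k / (1 - p_0), with p_0^T = 0
truncated = [0.0] + [pk / (1 - p0) for pk in p[1:]]

# Zero-modified with preassigned p_0^M = 0.6: p_k^M = (1 - p_0^M) * p_k^T
p0_modified = 0.6
modified = [p0_modified] + [(1 - p0_modified) * pt for pt in truncated[1:]]

print([round(v, 6) for v in truncated])
print([round(v, 6) for v in modified])
```

The printed values should match parts (b) and (c) up to rounding in the sixth decimal place.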
Practice Problems
Problem 29.1
Show that
\[ p_k^M = (1 - p_0^M)\,p_k^T, \quad k = 1, 2, \cdots \]
Problem 29.2
Show that
\[ E(N^M) = \frac{1 - p_0^M}{1 - p_0}\,E(N). \]
1 − p0
Problem 29.3
Let N M be the zero-modified distribution associated to N.
Problem 29.4
Let N have a Poisson distribution with mean 1.
Problem 29.5
Consider the zero-modified geometric distribution:
\[ p_0^M = \frac{1}{2} \]
\[ p_k^M = \frac{1}{6}\left(\frac{2}{3}\right)^{k-1}, \quad k = 1, 2, 3, \cdots. \]
6 3
Problem 29.6
You are given: $p_1^M = \frac{1}{6}$, $p_2^M = \frac{1}{9}$, and $p_3^M = \frac{2}{27}$. Find $p_0^M$.
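\begin{align*}
p_0 &= 0 \\
\frac{p_k}{p_{k-1}} &= a + \frac{b}{k}, \quad k = 2, 3, \cdots
\end{align*}
where
\[ a = \frac{\beta}{1+\beta},\ \beta > 0 \quad\text{and}\quad b = (r-1)\frac{\beta}{1+\beta},\ r > -1,\ r \neq 0. \]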
Example 30.1
Show that
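\[ p_k = p_1\left(\frac{\beta}{\beta+1}\right)^{k-1}\frac{r+1}{2}\cdot\frac{r+2}{3}\cdots\frac{r+k-1}{k}, \quad k = 2, 3, \cdots. \]
Solution.
The proof is by induction on k = 2, 3, · · · . For k = 2, we have
\[ p_2 = p_1\left(\frac{\beta}{1+\beta} + \frac{r-1}{2}\cdot\frac{\beta}{1+\beta}\right) = p_1\,\frac{\beta}{1+\beta}\cdot\frac{r+1}{2}. \]
Suppose that the formula holds for some k ≥ 2. Then,
\[ p_{k+1} = p_k\left(\frac{\beta}{1+\beta} + \frac{r-1}{k+1}\cdot\frac{\beta}{1+\beta}\right) = p_1\left(\frac{\beta}{\beta+1}\right)^{k}\frac{r+1}{2}\cdot\frac{r+2}{3}\cdots\frac{r+k-1}{k}\cdot\frac{r+k}{k+1} \]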
Example 30.2
(a) Show that if p1 > 0 then pk > 0 for all k = 2, 3, · · · .
(b) Show that $\sum_{k=1}^{\infty} p_k < \infty$.
30 THE EXTENDED TRUNCATED NEGATIVE BINOMIAL MODEL207
Solution.
(a) Since r > −1, β > 0 and k = 2, 3, · · · , from the previous example we
conclude that pk > 0 for all k = 2, 3, · · · .
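(b) We have
\[ \sum_{k=2}^{\infty} p_k = p_1\sum_{k=2}^{\infty}\left(\frac{\beta}{\beta+1}\right)^{k-1}\frac{r+1}{2}\cdot\frac{r+2}{3}\cdots\frac{r+k-1}{k}. \]
Let
\[ a_k = \left(\frac{\beta}{\beta+1}\right)^{k-1}\frac{r+1}{2}\cdot\frac{r+2}{3}\cdots\frac{r+k-1}{k}. \]
Then
\[ \lim_{k\to\infty}\frac{a_{k+1}}{a_k} = \lim_{k\to\infty}\frac{\beta}{1+\beta}\cdot\frac{r+k}{k+1} = \frac{\beta}{1+\beta} < 1. \]
Hence, by the ratio test, the series $\sum_{k=2}^{\infty} a_k$ is convergent, and so
$\sum_{k=1}^{\infty} p_k = p_1 + p_1\sum_{k=2}^{\infty} a_k < \infty$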
Example 30.3
Find the probability generating function of the logarithmic distribution.
Solution.
We have
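\begin{align*}
P(z) = \sum_{n=0}^{\infty} p_n z^n &= \frac{1}{\ln(1+\beta)}\sum_{n=1}^{\infty}\left(\frac{\beta}{1+\beta}\right)^{n}\frac{1}{n}\,z^n \\
&= \frac{1}{\ln(1+\beta)}\left[-\ln\left(1 - \frac{z\beta}{1+\beta}\right)\right] \\
&= \frac{1}{\ln(1+\beta)}\ln\left(\frac{1+\beta}{1 - \beta(z-1)}\right) \\
&= 1 - \frac{\ln[1 - \beta(z-1)]}{\ln(1+\beta)}
\end{align*}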
Example 30.4
Consider the extended zero-truncated negative binomial distribution with
r = −0.5 and β = 1. Calculate pT2 , and pT3 given that pT1 = 0.853553.
Solution.
We have
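\begin{align*}
a &= \frac{\beta}{1+\beta} = \frac{1}{1+1} = 0.5 \\
b &= (r-1)\frac{\beta}{1+\beta} = -0.75 \\
p_2^T &= p_1^T\left(0.5 - \frac{0.75}{2}\right) = 0.106694 \\
p_3^T &= p_2^T\left(0.5 - \frac{0.75}{3}\right) = 0.026674
\end{align*}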
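The recursion in this example is easy to carry further by machine. The Python sketch below starts from the given $p_1^T = 0.853553$, applies $p_k^T = (a + b/k)\,p_{k-1}^T$ for k ≥ 2, and also sums the first 200 terms as an informal check that the probabilities add up to approximately 1. The number of terms and the variable names are choices made for this illustration.

```python
r, beta = -0.5, 1.0
a = beta / (1 + beta)            # 0.5
b = (r - 1) * beta / (1 + beta)  # -0.75

p1_truncated = 0.853553          # given in the example (rounded to six decimals)

# probs[k - 1] holds p_k^T; apply the recursion for k = 2, 3, ..., 200
probs = [p1_truncated]
for k in range(2, 201):
    probs.append(probs[-1] * (a + b / k))

p2_truncated, p3_truncated = probs[1], probs[2]
total = sum(probs)

print(round(p2_truncated, 6), round(p3_truncated, 6), round(total, 5))
```

The total differs from 1 only in the seventh decimal place, which reflects the rounding of the starting value $p_1^T$.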
Remark 30.1
The C(a, b, 1) class consists of the following distributions:
• Poisson, Negative binomial, Geometric, Binomial, and Logarithmic.
• Zero-truncated: Poisson, Binomial, Negative Binomial (ETNB), and Ge-
ometric.
• Zero-modified: Poisson, Binomial, ETNB, Geometric, and Logarithmic.
Practice Problems
Problem 30.1
Consider the extended zero-modified negative binomial distribution with
r = −0.5 and β = 1.
Calculate $p_1^M$, $p_2^M$, and $p_3^M$ given that $p_1^T = 0.853553$ and $p_0^M = 0.6$.
Problem 30.2
Let N denote the logarithmic distribution introduced in this section. Find
E(N ).
Problem 30.3
Let N denote the logarithmic distribution introduced in this section. Find
E[N (N − 1)].
Problem 30.4
Let N denote the logarithmic distribution introduced in this section. Find
Var(N ).
Problem 30.5
Let $N^T$ denote the zero-truncated distribution corresponding to the distribution
N. Let $P_N^T(z)$ be the probability generating function of $N^T$ and
$P_N(z)$ be the probability generating function of N. Show that
\[ P_N^T(z) = \frac{P_N(z) - p_0}{1 - p_0}. \]
Problem 30.6
Find the probability generating function for the extended truncated negative
binomial distribution.
Problem 30.7
Find the mean of the extended truncated negative binomial.
Problem 30.8
Let N T denote the ETNB. Find E[N T (N T − 1)].
Problem 30.9
Let N T denote the ETNB. Find Var(N T ).
Modifications of the Loss
Random Variable
\[ S_{Y^L}(y) = S_X(y + d), \quad y \ge 0. \]
if X is discrete and
\[ E[(Y^L)^k] = \int_d^{\infty}(x - d)^k f_X(x)\,dx \]
Note that with the presence of deductibles, the number of payments is fewer
than the losses since losses with amount less than or equal to the deductible
will result in no payments.
Example 31.1
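Determine the pdf, cdf, and the sdf for $Y^L$ if the ground-up loss amount function
has an exponential distribution with mean $\frac{1}{\theta}$ and an ordinary deductible
of d.
Solution.
Recall that
\[ f_X(x) = \theta e^{-\theta x}, \quad x > 0. \]
Thus,
\[ f_{Y^L}(y) = \begin{cases} 1 - e^{-\theta d}, & y = 0 \\ \theta e^{-\theta(y+d)}, & y > 0. \end{cases} \]
\[ F_{Y^L}(y) = \begin{cases} 1 - e^{-\theta d}, & y = 0 \\ 1 - e^{-\theta(y+d)}, & y > 0. \end{cases} \]
\[ S_{Y^L}(y) = \begin{cases} e^{-\theta d}, & y = 0 \\ e^{-\theta(y+d)}, & y > 0 \end{cases} \]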
In the cost per loss situation, all losses below or at the deductible level are
recorded as 0. We next examine the situation where all losses below or at
the deductible level are completely ignored and not recorded in any way.
This situation is represented by the random variable
\[ Y^P = (Y^L \mid Y^L > 0) = (Y^L \mid X > d) = \begin{cases} \text{undefined}, & X \le d \\ X - d, & X > d. \end{cases} \]
Example 31.2
Determine the pdf, cdf, sdf, and the hazard rate function for $Y^P$ if the
ground-up loss amount function has an exponential distribution with mean
$\frac{1}{\theta}$ and an ordinary deductible of d.
Solution.
Recall that
\[ f_X(x) = \theta e^{-\theta x}, \quad x > 0. \]
Thus,
\begin{align*}
f_{Y^P}(y) &= \theta e^{-\theta y} \\
F_{Y^P}(y) &= 1 - e^{-\theta y} \\
S_{Y^P}(y) &= e^{-\theta y} \\
h_{Y^P}(y) &= \theta
\end{align*}
Theorem 31.1
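For an ordinary deductible d, we have
\[ E(Y^L) = E(X) - E(X \wedge d) \]
and
\[ E(Y^P) = \frac{E(X) - E(X \wedge d)}{1 - F_X(d)} \]
where X ∧ d is the limited loss variable (see Section 5) defined by
\[ X \wedge d = \min(X, d) = \begin{cases} X, & X < d \\ d, & X \ge d \end{cases} \]
and
\[ E(X \wedge d) = \int_0^d S_X(x)\,dx. \]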
Proof.
Buying one policy with a deductible d and another one with a limit d is
equivalent to purchasing full cover. That is,
\[ Y^L + X \wedge d = X. \]
Hence,
\[ E(Y^L) = E(X) - E(X \wedge d). \]
Also,
\[ E(Y^P) = \frac{E(Y^L)}{1 - F_X(d)} = \frac{E(X) - E(X \wedge d)}{1 - F_X(d)} \]
Example 31.3
Losses are distributed exponentially with parameter θ. Policies are subject
to ordinary deductible d. Find E(Y L ) and E(Y P ).
Solution.
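We have
\[ E(X \wedge d) = \int_0^d e^{-\theta x}\,dx = \frac{1}{\theta}\left(1 - e^{-\theta d}\right) \]
and
\[ E(X) = \int_0^{\infty} x\theta e^{-\theta x}\,dx = \frac{1}{\theta}. \]
Hence,
\[ E(Y^L) = \frac{1}{\theta} - \frac{1}{\theta}\left(1 - e^{-\theta d}\right) = \frac{e^{-\theta d}}{\theta} \]
and
\[ E(Y^P) = \frac{\frac{e^{-\theta d}}{\theta}}{e^{-\theta d}} = \frac{1}{\theta} \]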
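A numerical check of these two formulas is sketched below in Python. It uses $E(Y^L) = E(X) - E(X \wedge d) = \int_d^\infty S_X(x)\,dx$ with $S_X(x) = e^{-\theta x}$ and approximates the integral by the midpoint rule. The values θ = 0.002 and d = 500, the integration cutoff, and the function name are assumptions made only for this illustration.

```python
import math


def expected_cost_per_loss(rate: float, d: float,
                           upper_multiple: float = 60.0, steps: int = 200000) -> float:
    """Midpoint-rule approximation of the integral of exp(-rate * x) over [d, infinity)."""
    upper = d + upper_multiple / rate  # the tail beyond this point is negligible
    h = (upper - d) / steps
    return sum(math.exp(-rate * (d + (i + 0.5) * h)) for i in range(steps)) * h


rate, d = 0.002, 500.0  # hypothetical values: mean loss 1/rate = 500

per_loss = expected_cost_per_loss(rate, d)
per_payment = per_loss / math.exp(-rate * d)  # divide by 1 - F_X(d)

print(round(per_loss, 3), round(per_payment, 3))
```

With these inputs the cost per loss is close to $e^{-1}/0.002 \approx 183.94$ and the cost per payment is close to 1/θ = 500, independent of d, as the memoryless property suggests.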
Example 31.4 ‡
The annual number of doctor visits for each individual in a family of 4 has
a geometric distribution with mean 1.5. The annual numbers of visits for
the family members are mutually independent. An insurance pays 100 per
doctor visit beginning with the 4th visit per family.
Calculate the expected payments per year for this family.
Solution.
Let Xi be the annual number of visits by member i of the family, where
i = 1, 2, 3, 4. Let Y = X1 + X2 + X3 + X4 be the annual number of visits by
the whole family. For each Xi , the probability generating function is
Using independence,
(Y − 3)+ = Y − Y ∧ 3
so that
E[(Y − 3)+ ] = E(Y ) − E(Y ∧ 3).
From Table C, the pmf of the negative binomial with r = 4 and β = 1.5 is
r(r + 1) · · · (r + k − 1)β k
Pr(Y = k) = .
k!(1 + β)r+k
Now, we have the following
Thus,
E(Y ∧ 3) = 0.06144 + 2(0.09216) + 3(0.8208) = 2.71
and the expected number of visits resulting in insurance payments is
The insurance pays 100 per visit so that the total expected insurance pay-
ment for the year is 3.29 × 100 = 329
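The figures in this solution can be recomputed from the negative binomial probability function quoted from Table C. The Python sketch below is illustrative; the helper name `neg_binomial_pmf` is an assumption, and the mean E(Y) = rβ is the negative binomial mean from Section 26.

```python
import math


def neg_binomial_pmf(k: int, r: int, beta: float) -> float:
    """Pr(Y = k) = r(r+1)...(r+k-1) beta^k / (k! (1+beta)^(r+k)) for integer r."""
    return math.comb(k + r - 1, k) * beta**k / (1 + beta) ** (r + k)


r, beta = 4, 1.5
p = [neg_binomial_pmf(k, r, beta) for k in range(3)]

# E(Y ^ 3) = 0*p_0 + 1*p_1 + 2*p_2 + 3*Pr(Y >= 3)
limited = sum(k * p[k] for k in range(3)) + 3 * (1 - sum(p))

# E[(Y - 3)+] = E(Y) - E(Y ^ 3), with E(Y) = r * beta
excess_visits = r * beta - limited
expected_payment = 100 * excess_visits

print(round(limited, 5), round(excess_visits, 5), round(expected_payment, 2))
```

Without intermediate rounding the expected payment is about 329.18, which agrees with the 329 obtained in the solution.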
Practice Problems
Problem 31.1
The cdf of a loss amount distribution is expressed as:
\[ F_X(x) = 1 - e^{-\left(\frac{x}{100}\right)^2}, \quad x > 0. \]
The ordinary deductible for an insurance policy is 50. Find the pdf, cdf,
and the survival function of the cost per loss Y L .
Problem 31.2
The cdf of a loss amount distribution is expressed as:
\[ F_X(x) = 1 - e^{-\left(\frac{x}{100}\right)^2}, \quad x > 0. \]
The ordinary deductible for an insurance policy is 50. Find the pdf, cdf,
survival function, and the hazard rate function of the cost per payment Y P .
Problem 31.3
Loss amounts are exponentially distributed with parameter θ = 1000. An
insurance policy is subject to deductible of 500.
Find E(Y L ).
Problem 31.4
Loss amounts are exponentially distributed with parameter θ = 1000. An
insurance policy is subject to deductible of 500.
Find E(Y P ).
Problem 31.5 ‡
Losses follow an exponential distribution with parameter θ. For an ordinary
deductible of 100, the expected payment per loss is 2000.
What is the expected payment per loss in terms of θ for an ordinary de-
ductible of 500?
Problem 31.6
Loss amounts are distributed as a single Pareto with parameters α = 4 and
θ = 90. An insurance policy is subject to an ordinary deductible of 100.
Determine Var(Y L ).
31 ORDINARY POLICY DEDUCTIBLES 219
Problem 31.7
Losses are uniformly distributed in (0, b). An insurance policy is subject to
an ordinary deductible of d.
Calculate Var(Y P ).
Problem 31.8
Loss amounts have a discrete distribution with the following probabilities:
Problem 31.9 ‡
Risk 1 has a Pareto distribution with parameters α > 2 and θ. Risk 2 has
a Pareto distribution with parameters 0.8α and θ. Each risk is covered by a
separate policy, each with an ordinary deductible of k.
Problem 31.10
Loss amounts are uniformly distributed in (0, 10). An ordinary policy deductible
of d is applied.
Example 32.1
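Loss amounts are distributed exponentially with mean $\frac{1}{\theta}$. Insurance policies
are subject to a franchise deductible of d. Find the pdf, cdf, sdf, and the
hazard rate function for the cost per-loss $Y^L$.
Solution.
The pdf of X is
\[ f_X(x) = \theta e^{-\theta x}, \quad x > 0. \]
We have
\[ f_{Y^L}(y) = \begin{cases} 1 - e^{-\theta d}, & y = 0 \\ \theta e^{-\theta y}, & y > d. \end{cases} \]
\[ F_{Y^L}(y) = \begin{cases} 1 - e^{-\theta d}, & 0 \le y \le d \\ 1 - e^{-\theta y}, & y > d. \end{cases} \]
\[ S_{Y^L}(y) = \begin{cases} e^{-\theta d}, & 0 \le y \le d \\ e^{-\theta y}, & y > d. \end{cases} \]
\[ h_{Y^L}(y) = \begin{cases} 0, & 0 < y < d \\ \theta, & y > d \end{cases} \]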
Example 32.2
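Loss amounts are distributed exponentially with mean $\frac{1}{\theta}$. Insurance policies
are subject to a franchise deductible of d. Find the pdf, cdf, sdf, and the
hazard rate function for the cost per-payment $Y^P$.
Solution.
We have
\[ f_{Y^P}(y) = \theta e^{-\theta(y-d)}, \quad y > d. \]
\[ F_{Y^P}(y) = \begin{cases} 0, & 0 \le y \le d \\ 1 - e^{-\theta(y-d)}, & y > d. \end{cases} \]
\[ S_{Y^P}(y) = \begin{cases} 1, & 0 \le y \le d \\ e^{-\theta(y-d)}, & y > d. \end{cases} \]
\[ h_{Y^P}(y) = \begin{cases} 0, & 0 < y < d \\ \theta, & y > d \end{cases} \]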
Theorem 32.1
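For a franchise deductible d, we have
\[ E(Y^L) = E(X) - E(X \wedge d) + d[1 - F_X(d)] \]
and
\[ E(Y^P) = \frac{E(X) - E(X \wedge d)}{1 - F_X(d)} + d \]
where X ∧ d is the limited loss variable (see Section 5) defined by
\[ X \wedge d = \min(X, d) = \begin{cases} X, & X < d \\ d, & X \ge d \end{cases} \]
and
\[ E(X \wedge d) = \int_0^d S_X(x)\,dx. \]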
Proof.
We will prove the results for the continuous case. We have
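\begin{align*}
E(Y^L) &= \int_d^{\infty} x f_X(x)\,dx = \int_d^{\infty}(x - d)f_X(x)\,dx + d\int_d^{\infty} f_X(x)\,dx \\
&= E[(X - d)_+] + d[1 - F_X(d)] \\
&= E(X) - E(X \wedge d) + d[1 - F_X(d)] \\
E(Y^P) &= \frac{E(Y^L)}{1 - F_X(d)} = \frac{E(X) - E(X \wedge d)}{1 - F_X(d)} + d
\end{align*}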
Example 32.3
Losses are distributed exponentially with parameter θ. Policies are subject
to franchise deductible d. Find E(Y L ) and E(Y P ).
Solution.
Using Example 31.3, we have
\[ E(Y^L) = \frac{e^{-\theta d}}{\theta} + de^{-\theta d} \]
and
\[ E(Y^P) = \frac{E(Y^L)}{e^{-\theta d}} = \frac{1}{\theta} + d \]
32 FRANCHISE POLICY DEDUCTIBLES 223
Example 32.4 ‡
Insurance agent Hunt N. Quotum will receive no annual bonus if the ratio of
incurred losses to earned premiums for his book of business is 60% or more
for the year. If the ratio is less than 60%, Hunt’s bonus will be a percentage
of his earned premium equal to 15% of the difference between his ratio and
60%. Hunt’s annual earned premium is 800,000.
Incurred losses are distributed according to the Pareto distribution, with
θ = 500, 000 and α = 2.
Calculate the expected value of Hunt’s bonus.
Solution.
Let L be the incurred losses and B Hunt’s bonus. We are told that if
$\frac{L}{800{,}000} < 0.6$, that is, L < 480,000, then $B = 0.15\left(0.6 - \frac{L}{800{,}000}\right)(800{,}000) = 0.15(480{,}000 - L)$. This can be written as
\[ B = 0.15\begin{cases} 480{,}000 - L, & L < 480{,}000 \\ 0, & L \ge 480{,}000 \end{cases} = 0.15\,[480{,}000 - L \wedge 480{,}000]. \]
Hence,
\[ E(B) = 0.15(480{,}000 - 244{,}898) = 35{,}265 \]
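The limited expected value 244,898 used above can be reproduced numerically. The Python sketch below integrates the Pareto survival function $S(x) = \left(\frac{\theta}{x+\theta}\right)^{\alpha}$ over [0, 480,000] by the midpoint rule and compares the result with the standard closed form $\frac{\theta}{\alpha-1}\left[1 - \left(\frac{\theta}{u+\theta}\right)^{\alpha-1}\right]$ for the two-parameter Pareto. The function names and the step count are choices made for this illustration.

```python
def pareto_survival(x: float, alpha: float, theta: float) -> float:
    """Survival function of the two-parameter Pareto: (theta / (x + theta))^alpha."""
    return (theta / (x + theta)) ** alpha


def limited_expected_value(u: float, alpha: float, theta: float, steps: int = 100000) -> float:
    """E(X ^ u) as the integral of S(x) over [0, u], midpoint rule."""
    h = u / steps
    return sum(pareto_survival((i + 0.5) * h, alpha, theta) for i in range(steps)) * h


alpha, theta, u = 2.0, 500000.0, 480000.0

lev_numeric = limited_expected_value(u, alpha, theta)
lev_closed = theta / (alpha - 1) * (1 - (theta / (u + theta)) ** (alpha - 1))

# E(B) = 0.15 * (480,000 - E(L ^ 480,000))
expected_bonus = 0.15 * (u - lev_closed)

print(round(lev_numeric), round(lev_closed), round(expected_bonus))
```

Both evaluations give approximately 244,898, and the expected bonus comes out to approximately 35,265.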
Practice Problems
Problem 32.1
Loss amounts are uniformly distributed on (0, θ). For a franchise deductible
d < θ, find fY L (y), FY L (y), SY L (y), and hY L (y).
Problem 32.2
Loss amounts are uniformly distributed on (0, θ). For a franchise deductible
d < θ, find fY P (y), FY P (y), SY P (y), and hY P (y).
Problem 32.3
Loss amounts are uniformly distributed on (0, θ). For a franchise deductible
d < θ, find E(Y L ) and E(Y P ).
Problem 32.4
Claim amounts X have the following Pareto distribution
\[ F_X(x) = 1 - \left(\frac{800}{x + 800}\right)^3. \]
An insurance policy has a franchise deductible of 300. Find the expected
cost per-loss.
Problem 32.5 ‡
Auto liability losses for a group of insureds (Group R) follow a Pareto dis-
tribution with α = 2 and θ = 2000. Losses from a second group (Group
S) follow a Pareto distribution with α = 2 and θ = 3000. Group R has an
ordinary deductible of 500, Group S has a franchise deductible of 200.
Calculate the amount that the expected cost per payment for Group S
exceeds that for Group R.
Problem 32.6
Loss amounts are exponentially distributed with parameter θ. For a franchise
deductible d, it is given that E(Y L ) = 0.40E(Y P ). Express d in terms of θ.
Problem 32.7
Loss amounts have a discrete distribution with the following probabilities:
Loss amounts Probability
100 20%
300 70%
1000 10%
Problem 32.8
Losses in 2011 are distributed as a Pareto distribution with α = 2 and
θ = 2000. An insurance company sells a policy that covers these losses with
a franchise deductible of 500 during 2011. Losses in 2012 increase by 20%.
During 2012, the insurance company will sell a policy covering the losses.
However, instead of the franchise deductible used in 2011, the company will
implement an ordinary deductible of d. The expected value of per-loss is the
same for both years.
Problem 32.9
Losses are distributed as a Pareto distribution with α = 5 and θ = 1000.
Losses are subject to a franchise deductible of d. The expected value per
payment after the deductible is 820.
Calculate d.
Problem 32.10
Loss amounts follow a Pareto distribution with parameters α = 2 and θ. The
expected value per-payment for an ordinary deductible of 10000 is 20000.
Calculate the expected value per-loss when losses are subject to a franchise
deductible of 15000.
226 MODIFICATIONS OF THE LOSS RANDOM VARIABLE
Example 33.1
Loss amounts X follow an exponential distribution with mean θ = 1000.
Suppose that insurance policies are subject to an ordinary deductible of
500. Calculate the loss elimination ratio.
Solution.
We have
\[\text{LER} = \frac{\int_0^{d}[1-F_X(x)]\,dx}{\int_0^{\infty}[1-F_X(x)]\,dx} = \frac{\int_0^{500} e^{-x/1000}\,dx}{\int_0^{\infty} e^{-x/1000}\,dx} = 1 - e^{-0.5}\]
Example 33.2 ‡
You are given:
(i) Losses follow an exponential distribution with the same mean in all years.
(ii) The loss elimination ratio this year is 70%.
(iii) The ordinary deductible for the coming year is 4/3 of the current de-
ductible.
Compute the loss elimination ratio for the coming year.
33 THE LOSS ELIMINATION RATIO AND INFLATION EFFECTS FOR ORDINARY DEDUCTIBLES227
Solution.
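We have
\[\text{LER} = \frac{\int_0^{d} e^{-x/\theta}\,dx}{\int_0^{\infty} e^{-x/\theta}\,dx} = 1 - e^{-d/\theta} \Longrightarrow d = -\theta\ln{(1-\text{LER})}.\]
Thus,
\[d_{\text{This year}} = -\theta\ln{0.30}\]
and
\[d_{\text{Next year}} = -\theta\ln{(1-\text{LER}_{\text{Next year}})}.\]
Since the deductible for the coming year is 4/3 of the current deductible,
\[-\theta\ln{(1-\text{LER}_{\text{Next year}})} = -\frac{4}{3}\theta\ln{0.30} \Longrightarrow \text{LER}_{\text{Next year}} = 1 - 0.30^{4/3} = 0.80\]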
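The two exponential examples above can be checked numerically. The following is a minimal sketch, assuming an exponential loss with mean theta; the function names are illustrative and not part of the text.

```python
import math

def ler_exponential(d, theta):
    """Loss elimination ratio E(X ^ d) / E(X) for an exponential loss with mean theta."""
    limited_mean = theta * (1.0 - math.exp(-d / theta))
    return limited_mean / theta

def deductible_from_ler(ler, theta):
    """Invert LER = 1 - exp(-d/theta) to recover the deductible d."""
    return -theta * math.log(1.0 - ler)

# Example 33.1: mean 1000, deductible 500
ler_331 = ler_exponential(500, 1000)

# Example 33.2: LER of 70% this year, deductible scaled by 4/3 next year.
# theta cancels, so any positive value may be used.
theta = 1.0
d_this_year = deductible_from_ler(0.70, theta)
ler_next_year = ler_exponential(4.0 / 3.0 * d_this_year, theta)

print(ler_331, ler_next_year)
```

Because the mean cancels in Example 33.2, the answer depends only on the current LER and the scaling factor of the deductible.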
Example 33.3
An insurance company offers two types of policies: Type A and Type B.
The distribution of each type is presented below.
Type A Type B
Loss Amount Probability Loss Amount Probability
100 0.65 300 0.70
200 0.35 400 0.30
55% of the policies are of Type A and the rest are of Type B. For an ordinary
deductible of 125, calculate the loss elimination ratio and interpret its value.
Solution.
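The expected loss without a deductible is
\[E(X) = 0.55[100(0.65)+200(0.35)] + 0.45[300(0.70)+400(0.30)] = 222.75\]
and the limited expected value at the deductible is
\[E(X\wedge 125) = 0.55[100(0.65)+125(0.35)] + 0.45(125) = 116.0625.\]
Hence,
\[\text{LER} = \frac{116.0625}{222.75} = 0.521 = 52.1\%\]
That is, the deductible of 125 eliminates 52.1% of the expected loss.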
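For a discrete loss distribution the loss elimination ratio is a ratio of two finite sums. The sketch below reproduces Example 33.3 under the assumption that the portfolio is a 55%/45% mixture of the two policy types; the variable names are illustrative.

```python
# Unconditional loss distribution: mixture weight times within-type probability
loss_pmf = {
    100: 0.55 * 0.65,
    200: 0.55 * 0.35,
    300: 0.45 * 0.70,
    400: 0.45 * 0.30,
}

def mean(pmf):
    """E(X) for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def limited_mean(pmf, d):
    """E(X ^ d): each loss is capped at d before averaging."""
    return sum(min(x, d) * p for x, p in pmf.items())

expected_loss = mean(loss_pmf)
eliminated = limited_mean(loss_pmf, 125)
ler = eliminated / expected_loss

print(expected_loss, eliminated, ler)
```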
Theorem 33.1
Let loss amounts be X and let Y be the loss amounts after uniform inflation
of r. That is, Y = (1 + r)X. For an ordinary deductible of d, the expected
cost per-loss is
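\[E(Y^L) = (1+r)\left[E(X) - E\left(X\wedge\frac{d}{1+r}\right)\right]\]
and the expected cost per-payment is
\[E(Y^P) = \frac{E(Y)-E(Y\wedge d)}{1-F_Y(d)} = \left[1-F_X\left(\frac{d}{1+r}\right)\right]^{-1}(1+r)\left[E(X) - E\left(X\wedge\frac{d}{1+r}\right)\right].\]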
Proof.
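We have
\[\begin{aligned}
F_Y(y) &= \Pr(Y\le y) = \Pr\left(X\le\frac{y}{1+r}\right) = F_X\left(\frac{y}{1+r}\right)\\
f_Y(y) &= \frac{1}{1+r}f_X\left(\frac{y}{1+r}\right)\\
E(Y) &= (1+r)E(X)\\
E(Y\wedge d) &= \int_0^{d} yf_Y(y)\,dy + d[1-F_Y(d)]\\
&= \int_0^{d}\frac{y f_X\left(\frac{y}{1+r}\right)}{1+r}\,dy + d\left[1-F_X\left(\frac{d}{1+r}\right)\right]\\
&= \int_0^{\frac{d}{1+r}}(1+r)xf_X(x)\,dx + d\left[1-F_X\left(\frac{d}{1+r}\right)\right]\\
&= (1+r)\left\{\int_0^{\frac{d}{1+r}}xf_X(x)\,dx + \frac{d}{1+r}\left[1-F_X\left(\frac{d}{1+r}\right)\right]\right\}\\
&= (1+r)E\left(X\wedge\frac{d}{1+r}\right)
\end{aligned}\]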
Example 33.4
Determine the effect of inflation at 10% on an ordinary deductible of 500
applied to an exponential distribution with mean 1000.
Solution.
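Before the inflation, we have
\[\begin{aligned}
E(X) &= 1000\\
E(X\wedge 500) &= \int_0^{500} e^{-x/1000}\,dx = 1000(1-e^{-0.5})\\
E(Y^L_{BI}) &= 1000e^{-0.5}\\
E(Y^P_{BI}) &= \frac{1000e^{-0.5}}{e^{-0.5}} = 1000.
\end{aligned}\]
After the inflation, we have
\[\begin{aligned}
E\left(X\wedge\frac{500}{1.1}\right) &= \int_0^{\frac{500}{1.1}} e^{-x/1000}\,dx = 1000(1-e^{-\frac{1}{2.2}})\\
E(Y^L_{AI}) &= (1.1)[1000 - 1000(1-e^{-\frac{1}{2.2}})] = 1100e^{-\frac{1}{2.2}}\\
E(Y^P_{AI}) &= \frac{1100e^{-\frac{1}{2.2}}}{e^{-\frac{1}{2.2}}} = 1100.
\end{aligned}\]
Thus, the expected cost per loss increased from $1000e^{-0.5}$ to $1100e^{-\frac{1}{2.2}}$, an
increase of
\[\frac{1100e^{-\frac{1}{2.2}} - 1000e^{-0.5}}{1000e^{-0.5}} \approx 15.1\%.\]
The cost per-payment increased from 1000 to 1100, an increase of 10%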
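Theorem 33.1 translates directly into a short calculation. The sketch below applies it to the exponential loss of Example 33.4; the function names and the default argument r = 0 (no inflation) are illustrative choices, not notation from the text.

```python
import math

def exp_limited_mean(x, theta):
    """E(X ^ x) for an exponential loss with mean theta."""
    return theta * (1.0 - math.exp(-x / theta))

def cost_per_loss(d, theta, r=0.0):
    """E(Y^L) = (1 + r) [E(X) - E(X ^ d/(1+r))] for an ordinary deductible d."""
    d_star = d / (1.0 + r)
    return (1.0 + r) * (theta - exp_limited_mean(d_star, theta))

def cost_per_payment(d, theta, r=0.0):
    """E(Y^P) = E(Y^L) / [1 - F_X(d/(1+r))]."""
    d_star = d / (1.0 + r)
    return cost_per_loss(d, theta, r) / math.exp(-d_star / theta)

before = cost_per_loss(500, 1000)
after = cost_per_loss(500, 1000, r=0.10)
relative_increase = after / before - 1.0

print(before, after, relative_increase)
print(cost_per_payment(500, 1000), cost_per_payment(500, 1000, r=0.10))
```

The per-loss cost rises by more than the 10% inflation rate because the fixed deductible removes a smaller share of the inflated losses, while the per-payment cost of a memoryless exponential loss rises by exactly 10%.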
Example 33.5 ‡
The graph of the density function for losses is:
Solution.
We will use the formula
\[\text{LER} = 1 - \frac{E[(X-20)_+]}{E(X)}.\]
We have
\[\begin{aligned}
E(X) &= \int_0^{80} 0.01x\,dx + \int_{80}^{120}(0.03x - 0.00025x^2)\,dx\\
&= 50.6667\\
E[(X-20)_+] &= E(X) - \int_0^{20} xf(x)\,dx - 20\left[1 - \int_0^{20} f(x)\,dx\right]\\
&= E(X) - \int_0^{20} 0.01x\,dx - 20\left[1 - \int_0^{20} 0.01\,dx\right]\\
&= 50.6667 - 2 - 20(0.8) = 32.6667\\
\text{LER} &= 1 - \frac{32.6667}{50.6667} = 0.3553
\end{aligned}\]
Practice Problems
Problem 33.1
Loss amounts are modeled with the distribution function
\[F_X(x) = \frac{x^2}{2500},\quad 0\le x\le 50.\]
An insurance policy comes with an ordinary deductible of 30.
Problem 33.2
Loss amounts are exponentially distributed with the same mean for all
years. Suppose that with an ordinary deductible of d, LER(2011) = 0.75.
In 2012, the deductible is expected to increase by 45%.
Calculate LER(2012).
Problem 33.3 ‡
Losses have an exponential distribution with a mean of 1000. There is an
ordinary deductible of 500. The insurer wants to double the loss elimination ratio.
Problem 33.4
Losses follow an exponential distribution with mean of θ = 1000. An
insurance company applies an ordinary policy deductible d which results in a
Loss Elimination Ratio of $1 - e^{-0.5}$.
Calculate d.
Problem 33.5 ‡
Losses follow a distribution prior to the application of any deductible with
a mean of 2000. The loss elimination ratio at a deductible of 1000 is 0.3.
The probability of a loss being greater than 1000 is 0.4.
Determine the average size of a loss given it is less than or equal to the
deductible of 1000, that is, find E(X|X ≤ 1000).
Problem 33.6
Loss amounts are modeled with the distribution function
\[F_X(x) = 1 - e^{-x/100}.\]
An insurance policy comes with a deductible of 50.
Calculate the difference in the loss elimination ratio before and after a
uniform inflation of 30%.
Problem 33.7 ‡
Losses have a Pareto distribution with α = 2 and θ = k. There is an ordi-
nary deductible of 2k.
Determine the loss elimination ratio before and after 100% inflation.
Problem 33.8 ‡
Claim sizes this year are described by a 2-parameter Pareto distribution
with parameters α = 4 and θ = 1500.
What is the expected claim size per loss next year after 20% inflation and
the introduction of a $100 ordinary deductible?
Problem 33.9 ‡
Losses in 2003 follow a two-parameter Pareto distribution with α = 2 and
θ = 5. Losses in 2004 are uniformly 20% higher than in 2003. An insurance
covers each loss subject to an ordinary deductible of 10.
34 Policy Limits
If a policy has a limit u, then the insurer will pay the full loss as long as
the losses are less than or equal to u, otherwise, the insurer pays only the
amount u. Thus, the insurer is subject to pay a maximum covered loss of u.
Let Y denote the claim amount random variable for policies with limit u.
Then
\[Y = \min\{X,u\} = X\wedge u = \begin{cases} X, & X\le u\\ u, & X>u.\end{cases}\]
We call Y the limited loss random variable (see Section 5).
Example 34.1
Show that: $E(Y) = \int_0^u S_X(x)\,dx$.
Solution.
First note that
\[Y = u + Z\]
where
\[Z = \begin{cases} X-u, & X\le u\\ 0, & X>u.\end{cases}\]
Thus,
\[\begin{aligned}
E(Y) &= u + \int_0^u (x-u)f_X(x)\,dx\\
&= u + [(x-u)F_X(x)]_0^u - \int_0^u F_X(x)\,dx\\
&= \int_0^u [1-F_X(x)]\,dx
\end{aligned}\]
Theorem 34.1
Let E(Y ) be the expected cost before inflation. Suppose that the same policy
limit applies after an inflation at rate r. Then the after inflation expected
cost is given by
\[E((1+r)X\wedge u) = (1+r)E\left(X\wedge\frac{u}{1+r}\right).\]
Proof.
See the proof of Theorem 33.1
Example 34.2
Losses follow a Pareto distribution with parameters α = 2 and θ = 1000.
For a coverage with policy limit of 2000, find the pdf and the cdf of the
limited loss random variable.
Solution.
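Recall that
\[f_X(x) = \frac{\alpha\theta^{\alpha}}{(x+\theta)^{\alpha+1}} = \frac{2(1000)^2}{(1000+x)^3}\]
and
\[F_X(x) = 1 - \left(\frac{\theta}{\theta+x}\right)^{\alpha} = 1 - \left(\frac{1000}{1000+x}\right)^{2}.\]
Thus,
\[f_Y(y) = \begin{cases} \frac{2(1000)^2}{(1000+y)^3}, & y<2000\\ \frac{1}{9}, & y = 2000\\ 0, & y>2000\end{cases}\]
and
\[F_Y(y) = \begin{cases} 1 - \left(\frac{1000}{1000+y}\right)^{2}, & y<2000\\ 1, & y\ge 2000\end{cases}\]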
Example 34.3
Losses follow a Pareto distribution with parameters α = 2 and θ = 1000.
Calculate the expected cost for a coverage with policy limit of 2000.
Solution.
Recall that
\[S(x) = \left(\frac{\theta}{\theta+x}\right)^{\alpha}.\]
Thus,
\[E(X\wedge 2000) = \int_0^{2000} S(x)\,dx = \int_0^{2000}\left(\frac{1000}{1000+x}\right)^{2}dx = 666.67\]
Example 34.4
Losses follow a Pareto distribution with parameters α = 2 and θ = 1000.
For a coverage with policy limit 2000 and after an inflation rate of 30%,
calculate the after inflation expected cost.
Solution.
We have
\[E(1.3X\wedge 2000) = 1.3E\left(X\wedge\frac{2000}{1.3}\right) = 1.3\left[\frac{1000}{2-1}\left(1 - \frac{1000}{1000+\frac{2000}{1.3}}\right)\right] = 787.88\]
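The limited expected value of a two-parameter Pareto loss, with and without inflation, can be checked with a few lines. This is a sketch under the assumption α ≠ 1; the helper names are illustrative. As a cross-check it also uses the scaling property (Section 18) that 1.3X is Pareto with the same α and θ multiplied by 1.3.

```python
def pareto_limited_mean(x, alpha, theta):
    """E(X ^ x) for a two-parameter Pareto loss, valid for alpha != 1."""
    return theta / (alpha - 1.0) * (1.0 - (theta / (x + theta)) ** (alpha - 1.0))

def limited_mean_after_inflation(u, r, alpha, theta):
    """E((1 + r) X ^ u) = (1 + r) E(X ^ u / (1 + r)), as in Theorem 34.1."""
    return (1.0 + r) * pareto_limited_mean(u / (1.0 + r), alpha, theta)

# Example 34.3: alpha = 2, theta = 1000, policy limit 2000
no_inflation = pareto_limited_mean(2000, 2, 1000)

# Example 34.4: same limit after 30% uniform inflation
with_inflation = limited_mean_after_inflation(2000, 0.30, 2, 1000)

# Cross-check: 1.3 X is Pareto with alpha = 2 and theta = 1300
cross_check = pareto_limited_mean(2000, 2, 1300)

print(no_inflation, with_inflation, cross_check)
```

A quick sanity bound: the limited expected value after inflation cannot exceed E(1.3X) = 1300.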
Example 34.5 ‡
A jewelry store has obtained two separate insurance policies that together
provide full coverage. You are given:
(i) The average ground-up loss is 11,100.
(ii) Policy A has an ordinary deductible of 5,000 with no policy limit.
(iii) Under policy A, the expected amount paid per loss is 6,500.
(iv) Under policy A, the expected amount paid per payment is 10,000.
(v) Policy B has no deductible and a policy limit of 5,000.
Given that a loss has occurred, determine the probability that the payment
under policy B is 5,000.
Solution.
Let X denote the ground-up loss random variable. By (i), E(X) = 11,100.
By (ii) and (iii), we have
\[E(Y^L) = E(X) - E(X\wedge 5000) = 6{,}500.\]
By (iv), we have
\[E(Y^P) = \frac{E(X) - E(X\wedge 5000)}{1 - F_X(5000)} = 10{,}000.\]
Thus,
\[\frac{6500}{1 - F_X(5000)} = 10{,}000 \Longrightarrow F_X(5000) = 0.35.\]
For Policy B, a payment of 5000 will occur if X ≥ 5000. Hence,
\[\Pr(X\ge 5000) = 1 - F_X(5000) = 1 - 0.35 = 0.65\]
Practice Problems
Problem 34.1
Losses follow an exponential distribution with mean $\frac{1}{\theta}$. For a coverage with
policy limit u, find $f_Y(y)$ and $F_Y(y)$.
Problem 34.2
Losses follow an exponential distribution with mean $\frac{1}{\theta}$. For a coverage with
policy limit u, find $E(X\wedge u)$.
Problem 34.3
Losses follow an exponential distribution with mean $\frac{1}{\theta}$. For a coverage with
policy limit u and with an inflation at rate r, find the expected cost.
Problem 34.4
Losses are distributed uniformly between 0 and 100. An insurance policy
which covers the losses has an upper limit of 80. Find the expected cost.
Problem 34.5 ‡
An insurance company offers two types of policies: Type Q and Type R.
Type Q has no deductible, but has a policy limit of 3000. Type R has no
limit, but has an ordinary deductible of d. Losses follow a Pareto distribu-
tion with α = 3 and θ = 2000.
Calculate the deductible d such that both policies have the same expected
cost per loss.
Problem 34.6
Suppose that the ground-up losses for 2010 follow an exponential distribu-
tion with a mean of 1000. In 2011, all losses are subject to uniform inflation
of 25%. The policy in 2011 has limit u.
Determine the value of u if the expected cost in 2011 is equal to the ex-
pected loss in 2010.
Problem 34.7
Suppose that the ground-up losses for 2010 follow a Pareto distribution with
parameters α = 3 and θ = 9800. In 2011, all losses are subject to uniform
inflation of 6%. The policy limit in 2011 is 170,000.
\[E(Y^P) = \frac{E(Y^L)}{1 - F_X(200)} = \frac{650.57}{e^{-0.2}} = 794.61\]
Theorem 35.1
The expected value of the per-loss random variable is
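\[E(Y^L) = \alpha(1+r)\left[E\left(X\wedge\frac{u}{1+r}\right) - E\left(X\wedge\frac{d}{1+r}\right)\right].\]
The expected value of the per-payment random variable is
\[E(Y^P) = \frac{E(Y^L)}{1 - F_X\left(\frac{d}{1+r}\right)}.\]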
Proof.
We have
Theorem 35.2
The second moment for the per-loss random variable is
Proof.
Using
we can write
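\[\begin{aligned}
\left[\frac{Y^L}{\alpha(1+r)}\right]^2 &= [X\wedge u^* - X\wedge d^*]^2\\
&= (X\wedge u^*)^2 - 2(X\wedge u^*)(X\wedge d^*) + (X\wedge d^*)^2\\
&= (X\wedge u^*)^2 - (X\wedge d^*)^2 - 2(X\wedge d^*)[X\wedge u^* - X\wedge d^*].
\end{aligned}\]
But
\[(X\wedge d^*)[X\wedge u^* - X\wedge d^*] = \begin{cases} 0, & X\le\frac{d}{1+r}\\ d^*(X - d^*), & \frac{d}{1+r} < X\le\frac{u}{1+r}\\ d^*(u^* - d^*), & X>\frac{u}{1+r}\end{cases} = d^*[X\wedge u^* - X\wedge d^*].\]
Thus,
\[\left[\frac{Y^L}{\alpha(1+r)}\right]^2 = (X\wedge u^*)^2 - (X\wedge d^*)^2 - 2d^*[X\wedge u^* - X\wedge d^*].\]
Now, the result of the theorem follows by taking the expectation of both
sides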
Example 35.2
Determine the mean and the standard deviation per loss for an exponential
distribution with mean 1000 and with a deductible of 500 and a policy limit
of 2500.
Solution.
Recall that $E(X\wedge x) = \theta(1 - e^{-x/\theta})$. Thus,
\[E(Y^L) = E(X\wedge 3000) - E(X\wedge 500) = 1000(1 - e^{-\frac{3000}{1000}}) - 1000(1 - e^{-\frac{500}{1000}}) = 556.74.\]
The variance of Y L is
Example 35.3 ‡
A group dental policy has a negative binomial claim count distribution with
mean 300 and variance 800.
Ground-up severity is given by the following table:
Severity Probability
40 0.25
80 0.25
120 0.25
200 0.25
You expect severity to increase 50% with no change in frequency. You decide
to impose a per claim deductible of 100.
Calculate the expected total claim payment after these changes.
Solution.
After imposing the 50% increase, the severity values are
Severity (X) Probability
60 0.25
120 0.25
180 0.25
300 0.25
Let N be the claim frequency. Then the expected total claim payment is
the expected number of losses times the expected payment per loss. That
is,
E(N )[E(X) − E(X ∧ 100)].
35 COMBINATIONS OF COINSURANCE, DEDUCTIBLES, LIMITS, AND INFLATIONS241
We have
\[\begin{aligned}
E(N) &= 300\\
E(X) &= (60 + 120 + 180 + 300)(0.25) = 165\\
E(X\wedge 100) &= 60(0.25) + 100(1 - 0.25) = 90.
\end{aligned}\]
Thus, the expected total claim payment is 300(165 − 90) = 22,500
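The calculation in Example 35.3 can be reproduced directly from the severity table. This is a minimal sketch; the variable names are illustrative, and only the mean of the claim count is needed because the expected total is the product of the expected count and the expected payment per loss.

```python
expected_claim_count = 300
severity_pmf = {40: 0.25, 80: 0.25, 120: 0.25, 200: 0.25}
inflation = 0.50
deductible = 100

# Scale every severity value by 1 + inflation; probabilities are unchanged
inflated_pmf = {x * (1 + inflation): p for x, p in severity_pmf.items()}

# Expected payment per loss: E[(X - d)+] = E(X) - E(X ^ d)
payment_per_loss = sum(max(x - deductible, 0) * p for x, p in inflated_pmf.items())

expected_total_payment = expected_claim_count * payment_per_loss
print(payment_per_loss, expected_total_payment)
```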
Example 35.4 ‡
An insurer has excess-of-loss reinsurance on auto insurance. You are given:
(i) Total expected losses in the year 2001 are 10,000,000.
(ii) In the year 2001 individual losses have a Pareto distribution with
\[F(x) = 1 - \left(\frac{2000}{x+2000}\right)^{2},\quad x>0.\]
(iii) Reinsurance will pay the excess of each loss over 3000.
(iv) Each year, the reinsurer is paid a ceded premium, Cyear , equal to 110%
of the expected losses covered by the reinsurance.
(v) Individual losses increase 5% each year due to inflation.
(vi) The frequency distribution does not change.
(a) Calculate $C_{2001}$.
(b) Calculate $C_{2002}/C_{2001}$.
Solution.
(a) The reinsurance fraction per loss is given by
\[\frac{E(X) - E(X\wedge 3000)}{E(X)}\]
where
\[\begin{aligned}
E(X) &= \frac{\theta}{\alpha-1} = \frac{2000}{2-1} = 2000\\
E(X\wedge 3000) &= \frac{\theta}{\alpha-1}\left[1 - \left(\frac{\theta}{3000+\theta}\right)^{\alpha-1}\right]\\
&= \frac{2000}{2-1}\left[1 - \left(\frac{2000}{3000+2000}\right)^{2-1}\right] = 1200.
\end{aligned}\]
Thus,
\[\frac{E(X) - E(X\wedge 3000)}{E(X)} = 1 - \frac{1200}{2000} = 0.40.\]
Finally,
\[C_{2001} = 1.10(0.40)(10{,}000{,}000) = 4{,}400{,}000.\]
(b) Due to inflation, the amount per loss in 2002 is $X_{2002} = 1.05X_{2001}$.
Thus, $E(X_{2002}) = 1.05E(X_{2001}) = 1.05(2000) = 2100$. Now, $X_{2002}$ has a
Pareto distribution with parameters α = 2 and θ = 2100 (see Section 18).
Thus,
\[E(X_{2002}\wedge 3000) = \frac{2100}{2-1}\left[1 - \left(\frac{2100}{3000+2100}\right)^{2-1}\right] = 1235.\]
Hence, rounding the reinsurance fraction to 0.412,
\[C_{2002} = 1.10\left(\frac{2100 - 1235}{2100}\right)(10{,}000{,}000)(1.05) = 4{,}758{,}600\]
and
\[\frac{C_{2002}}{C_{2001}} = \frac{4{,}758{,}600}{4{,}400{,}000} = 1.08\]
Example 35.5 ‡
Annual prescription drug costs are modeled by a two-parameter Pareto dis-
tribution with θ = 2000 and α = 2.
A prescription drug plan pays annual drug costs for an insured member sub-
ject to the following provisions:
(i) The insured pays 100% of costs up to the ordinary annual deductible of
250.
(ii) The insured then pays 25% of the costs between 250 and 2250.
(iii) The insured pays 100% of the costs above 2250 until the insured has
paid 3600 in total.
(iv) The insured then pays 5% of the remaining costs.
Determine the expected annual plan payment.
Solution.
Let X denote the annual drug cost and Y the insurer’s payment. We first find
the value of X at which the insured’s total payment reaches 3600. From what
is given, if X = 2250 the insured’s payment is 250 + 0.25(2250 − 250) = 750.
After this point, the insured pays 100% of the costs above 2250 until
the insured has paid 3600 in total. This means that the insured pays
3600 − 750 = 2850 past the 2250 mark. In other words, the insured reaches
the total payment of 3600 when X = 2250 + 2850 = 5100. Hence,
Y can be expressed as follows
\[Y = \begin{cases} 0, & 0\le X\le 250\\ 0.75(X - 250), & 250 < X\le 2250\\ 0.75(2250 - 250), & 2250 < X\le 5100\\ 1500 + 0.95(X - 5100), & X>5100.\end{cases}\]
Hence,
Example 35.6 ‡
Loss amounts have the distribution function
\[F(x) = \begin{cases} \left(\frac{x}{100}\right)^{2}, & 0\le x\le 100\\ 1, & x>100\end{cases}\]
Solution.
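The maximum covered loss is
\[u = 20 + \frac{60}{0.8} = 95.\]
Thus,
\[\begin{aligned}
E(Y^L) &= 0.8[E(X\wedge 95) - E(X\wedge 20)]\\
&= 0.8\left[\int_0^{95} S(x)\,dx - \int_0^{20} S(x)\,dx\right]\\
&= 0.8\left[\int_0^{95}\left(1 - \left(\frac{x}{100}\right)^{2}\right)dx - \int_0^{20}\left(1 - \left(\frac{x}{100}\right)^{2}\right)dx\right]\\
&= 0.8\int_{20}^{95}\left(1 - \left(\frac{x}{100}\right)^{2}\right)dx = 37.35.
\end{aligned}\]
The expected cost per payment is
\[E(Y^P) = \frac{E(Y^L)}{1 - F(20)} = \frac{37.35}{1 - 0.04} = 38.91\]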
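The integral in Example 35.6 has a closed form, so the result is easy to verify. The sketch below assumes the distribution function F(x) = (x/100)^2 on [0, 100] and the values d = 20, u = 95 and coinsurance 0.8 taken from the solution; the function names are illustrative.

```python
def cdf(x):
    """F(x) = (x/100)^2 on [0, 100]."""
    return (x / 100.0) ** 2

def limited_mean(x):
    """E(X ^ x) = integral from 0 to x of [1 - (t/100)^2] dt, for 0 <= x <= 100."""
    return x - x ** 3 / 30000.0

coinsurance = 0.8
deductible = 20
max_covered_loss = 95

cost_per_loss = coinsurance * (limited_mean(max_covered_loss) - limited_mean(deductible))
cost_per_payment = cost_per_loss / (1.0 - cdf(deductible))

print(cost_per_loss, cost_per_payment)
```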
Example 35.7 ‡
For a special investment product, you are given:
(i) All deposits are credited with 75% of the annual equity index return,
subject to a minimum guaranteed crediting rate of 3%.
(ii) The annual equity index return is normally distributed with a mean of
8% and a standard deviation of 16%.
(iii) For a random variable X which has a normal distribution with mean
µ and standard deviation σ, you are given the following limited expected
values:
E(X ∧ 3%)
µ = 6% µ = 8%
σ = 12% −0.43% 0.31%
σ = 16% −1.99% −1.19%
E(X ∧ 4%)
µ = 6% µ = 8%
σ = 12% 0.15% 0.95%
σ = 16% −1.43% −0.58%
Solution.
Let Y denote the annual crediting rate and X the annual equity index return,
both in percent. Then we have
\[Y = \begin{cases} 3, & 0.75X\le 3\\ 0.75X, & 0.75X>3\end{cases} = \begin{cases} 3, & X\le 4\\ 0.75X, & X>4\end{cases} = 3 + \begin{cases} 0, & X\le 4\\ 0.75X - 3, & X>4\end{cases}\]
That is,
\[Y = 3 + (0.75X - 3)_+ = 3 + 0.75X - (0.75X\wedge 3).\]
Hence,
\[E(Y) = 3 + 0.75E(X) - 0.75E(X\wedge 4).\]
By (ii), E(X) = 8. From the second given table, we see that E(X ∧ 4) =
−0.58. Hence,
\[E(Y) = 3 + 6 - 0.75(-0.58) = 9.435\%\]
Example 35.8 ‡
A risk has a loss amount which has a Poisson distribution with mean 3.
An insurance covers the risk with an ordinary deductible of 2. An alternative
insurance replaces the deductible with coinsurance α, which is the proportion
of the loss paid by the insurance, so that the expected insurance cost remains
the same.
Calculate α.
Solution.
The expected cost per loss with a deductible of 2 is
\[\begin{aligned}
E[(X-2)_+] &= E(X) - E(X\wedge 2) = 3 - [\Pr(X=1) + 2\Pr(X=2) + 2(1 - \Pr(X\le 2))]\\
&= 3 - \Pr(X=1) - 2[1 - \Pr(X=0) - \Pr(X=1)]\\
&= 3 - 3e^{-3} - 2[1 - (e^{-3} + 3e^{-3})] = 1.249.
\end{aligned}\]
The expected cost per loss with a coinsurance of α is αE(X) = 3α. We are
told that 3α = 1.249 so that $\alpha = \frac{1.249}{3} = 0.42$
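The same quantities can be computed from the Poisson probability function. This is a sketch for an integer deductible; the helper names are illustrative.

```python
import math

def poisson_pmf(k, lam):
    """Pr(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_limited_mean(d, lam):
    """E(X ^ d) for an integer limit d: values below d count as is, the rest count as d."""
    below = [poisson_pmf(k, lam) for k in range(d)]
    return sum(k * p for k, p in enumerate(below)) + d * (1.0 - sum(below))

lam = 3.0
deductible = 2

# Expected cost per loss with the deductible: E[(X - 2)+] = E(X) - E(X ^ 2)
cost_with_deductible = lam - poisson_limited_mean(deductible, lam)

# Coinsurance factor that gives the same expected cost: alpha * E(X)
alpha = cost_with_deductible / lam

print(cost_with_deductible, alpha)
```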
Example 35.9 ‡
Michael is a professional stuntman who performs dangerous motorcycle
jumps at extreme sports events around the world.
The annual cost of repairs to his motorcycle is modeled by a two parameter
Pareto distribution with θ = 5000 and α = 2.
An insurance reimburses Michael’s motorcycle repair costs subject to the
following provisions:
(i) Michael pays an annual ordinary deductible of 1000 each year.
(ii) Michael pays 20% of repair costs between 1000 and 6000 each year.
(iii) Michael pays 100% of the annual repair costs above 6000 until Michael
has paid 10,000 in out-of-pocket repair costs each year.
(iv) Michael pays 10% of the remaining repair costs each year.
Calculate the expected annual insurance reimbursement.
Solution.
Let X denote the annual repair cost. For 1000 ≤ X ≤ 6000, the insurance
pays 0.80(X − 1000). Thus, if cost is 6000, Michael’s share is 2000 and the
insurance share is 4000. For the next 8000, Michael’s share is the whole
amount. That is, the insurance pays nothing. But in this case, Michael’s
out-of-pocket reaches 10,000. It follows that, the insurance pays 90% of cost
over 14,000 which is 0.90(X − 14000).
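The annual insurance reimbursement is
\[R = 0.8[(X\wedge 6000) - (X\wedge 1000)] + 0.9[X - (X\wedge 14000)].\]
From Table C,
\[E(X\wedge x) = \theta\left[1 - \frac{\theta}{\theta+x}\right] = \frac{5000x}{x+5000}\]
and
\[E(X) = \frac{\theta}{\alpha-1} = 5000.\]
Hence,
\[E(R) = 0.8\left[\frac{5000(6000)}{6000+5000} - \frac{5000(1000)}{1000+5000}\right] + 0.9\left[5000 - \frac{5000(14000)}{14000+5000}\right] = 2699.36\]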
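The layered reimbursement in Example 35.9 reduces to three limited expected values. The sketch below assumes the layer boundaries 1000, 6000 and 14,000 derived in the solution; the function and variable names are illustrative.

```python
def pareto_limited_mean(x, alpha, theta):
    """E(X ^ x) for a two-parameter Pareto loss, valid for alpha != 1."""
    return theta / (alpha - 1.0) * (1.0 - (theta / (x + theta)) ** (alpha - 1.0))

alpha, theta = 2.0, 5000.0
mean = theta / (alpha - 1.0)

# Insurer pays 80% of the layer (1000, 6000] and 90% of everything above 14000
layer_1000_6000 = pareto_limited_mean(6000, alpha, theta) - pareto_limited_mean(1000, alpha, theta)
excess_over_14000 = mean - pareto_limited_mean(14000, alpha, theta)

expected_reimbursement = 0.8 * layer_1000_6000 + 0.9 * excess_over_14000
print(expected_reimbursement)
```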
Practice Problems
Problem 35.1 ‡
Losses this year have a distribution such that $E(X\wedge x) = -0.025x^2 +
1.475x - 2.25$ for $x = 11, 12, \cdots, 26$. Next year, losses will be uniformly
higher by 10%. An insurance policy reimburses 100% of losses subject to a
deductible of 11 up to a maximum reimbursement of 11.
Problem 35.2 ‡
Losses have an exponential distribution with mean 1000. An insurance com-
pany will pay the amount of each claim in excess of a deductible of 100.
Calculate the variance of the amount paid by the insurance company for
one claim, including the possibility that the amount paid is 0.
Problem 35.3 ‡
Losses follow a Poisson distribution with mean λ = 3. Consider two insur-
ance contracts. One has an ordinary deductible of 2. The second has no
deductible and coinsurance in which the insurance company pays α of the
loss.
Determine α so that the expected cost of the two contracts is the same.
Problem 35.4 ‡
You are given that $e(0) = 25$ and $S(x) = 1 - \frac{x}{w}$, $0\le x\le w$, and $Y^P$ is the
excess loss variable for d = 10.
Problem 35.5 ‡
Total hospital claims for a health plan were previously modeled by a two-
parameter Pareto distribution with α = 2 and θ = 500. The health plan
begins to provide financial incentives to physicians by paying a bonus of
50% of the amount by which total hospital claims are less than 500. No
bonus is paid if total claims exceed 500. Total hospital claims for the health
plan are now modeled by a new Pareto distribution with α = 2 and θ = K.
The expected claims plus the expected bonus under the revised model equals
expected claims under the previous model.
Calculate K.
Problem 35.6
The amount of a loss has a Pareto distribution with α = 2 and θ = 5000. An
insurance policy on this loss has an ordinary deductible of 1,000, a policy
limit of 10,000, and a coinsurance of 80%.
With a uniform inflation of 2%, calculate the expected claim amount per
payment on this policy.
Problem 35.7
Claim amounts follow a Pareto distribution with parameters $\alpha_X = 3$ and
$\theta_X = 2000$. A policy is subject to a coinsurance rate of α. The standard
deviation of the claims for this policy is 1472.24.
Problem 35.8
The loss size distribution is exponential with mean 50. An insurance policy
pays the following for each loss. There is no insurance payment for the first
20. The policy has a coinsurance in which the insurance company pays 75%
of the loss. The maximum covered loss is 100.
Calculate E(Y P ).
Problem 35.9
The loss size distribution is Pareto with α = 2 and θ = 100. An insurance policy
pays the following for each loss. There is no insurance payment for the first
20. The policy has a coinsurance in which the insurance company pays 75%
of the loss. The policy has a limit of u∗ . The maximum covered loss is 100.
Given that E(Y P ) = 34.2857, find the maximum payment per loss for this
policy.
Problem 35.10 ‡
In 2005 a risk has a two-parameter Pareto distribution with α = 2 and
θ = 3000. In 2006 losses inflate by 20%.
An insurance on the risk has a deductible of 600 in each year. Pi , the
premium in year i, equals 1.2 times the expected claims.
The risk is reinsured with a deductible that stays the same in each year. Ri ,
the reinsurance premium in year i, equals 1.1 times the expected reinsured
claims.
Suppose R2005 /P2005 = 0.55. Calculate R2006 /P2006 .
Let $X_j$ denote the jth ground-up loss and assume that there are no
coverage modifications. Let $N^L$ be the total number of losses. Now, suppose
that a deductible d is imposed. Let $v = \Pr(X>d)$ be the probability that a
loss will result in a payment. We will let $I_j$ be the indicator random variable
whose value is 1 if the jth loss results in a payment and is 0
otherwise. Then $I_j$ is a Bernoulli random variable such that $\Pr(I_j = 1) = v$
and $\Pr(I_j = 0) = 1 - v$. The corresponding pgf is $P_{I_j}(z) = vz + 1 - v$.
where we used the fact that the pgf of a sum of independent random variables
is the product of pgf of the individual random variables.
Example 36.1
Suppose that losses follow an exponential distribution all with the same
mean 100. Suppose that insurance policies are subject to an ordinary de-
ductible of 20 and that N L follows a Poisson distribution with λ = 3.
36 THE IMPACT OF DEDUCTIBLES ON THE NUMBER OF PAYMENTS251
Solution.
(a) The probability of a loss that results in a payment is
\[v = \int_{20}^{\infty}\frac{1}{100}e^{-\frac{x}{100}}\,dx = e^{-0.2}.\]
Remark 36.1
In general, if K is a compound distribution with primary distribution N
and secondary distribution M then
Example 36.2 ‡
An actuary has created a compound claims frequency model with the fol-
lowing properties:
(i) The primary distribution is the negative binomial with probability
generating function
\[P(z) = [1 - 3(z-1)]^{-2}\]
(ii) The secondary distribution is the Poisson with probability generating
function
\[P(z) = e^{\lambda(z-1)}.\]
(iii) The probability of no claims equals 0.067.
Calculate λ.
Solution.
From the above remark, we have
Now, suppose that $N^L$ depends on a parameter θ and that its pgf $P_{N^L}(z;\theta)$
satisfies the equation
\[P_{N^L}(z;\theta) = B[\theta(z-1)]\]
where B(z) is independent of θ and both z and θ only appear in the pgf as
θ(z − 1). Then we have
This shows that $N^L$ and $N^P$ are both from the same parametric family and
only the parameter θ needs to be changed to vθ.
Example 36.3
Suppose that $N^L$ follows a negative binomial distribution with parameters
r and β. Find $P_{N^L}(z)$ and B(z).
Solution.
We have $P_{N^L}(z) = [1 - \beta(z-1)]^{-r}$ so that $B(z) = (1-z)^{-r}$. Note that β
takes on the role of θ in the above result
Example 36.4
Losses follow a Pareto distribution with α = 3 and θ = 1000. Assume that
$N^L$ follows a negative binomial distribution with r = 2 and β = 3. Find
$P_{N^P}(z)$ when a deductible of 250 is imposed.
Solution.
$N^P$ has a negative binomial distribution with $r^* = 2$ and $\beta^* = \beta v$. Since
\[v = 1 - F_X(250) = \left(\frac{1000}{1000+250}\right)^{3} = 0.512\]
we have
Now, suppose that N L depends on two parameters θ and α with the pgf
satisfying the equation
\[\alpha^* = P_{N^L}(1-v;\theta;\alpha) = \alpha + (1-\alpha)\frac{B(-v\theta) - B(-\theta)}{1 - B(-\theta)}\]
we have
where
\[\alpha^* = P_{N^P}(0) = \Pr(N^P = 0).\]
Hence, if N L is zero-modified then N P is zero-modified.
Example 36.5
Losses follow a Pareto distribution with α = 3 and θ = 1000. Assume that
$N^L$ follows a zero-modified negative binomial distribution with r = 2, β = 3
and $p_0^M = 0.4$. Find $P_{N^P}(z)$ when a deductible of 250 is imposed.
Solution.
The pgf of $N^L$ is (see Problem 30.6)
where $\alpha = p_0^M$ and $B(z) = (1-z)^{-r}$. Hence, $N^P$ is a zero-modified negative
binomial distribution with $r^* = r = 2$, $\beta^* = v\beta = 1.536$ and
\[\alpha^* = p_0^{M*} = 0.4 + (1 - 0.4)\frac{(1+1.536)^{-2} - (1+3)^{-2}}{1 - (1+3)^{-2}} = 0.4595.\]
Hence,
Example 36.6
Suppose payments on a policy with a deductible of 250 have the zero-modified
negative binomial distribution with $r^* = 2$, $\beta^* = 1.536$ and
$p_0^{M*} = 0.4595$. Losses have the Pareto distribution with α = 3 and θ = 1000.
Determine the distribution of payments when the deductible is removed.
Solution.
When the deductible is removed, the number of payments follows a zero-modified
negative binomial with r = 2, $\beta = \frac{\beta^*}{v} = \frac{1.536}{0.512} = 3$ and
\[p_0^M = \frac{p_0^{M*} - (1+\beta^*)^{-r} + \left(1+\frac{\beta^*}{v}\right)^{-r} - p_0^{M*}\left(1+\frac{\beta^*}{v}\right)^{-r}}{1 - (1+\beta^*)^{-r}}\]
\[p_0^M = \frac{0.4595 - 2.536^{-2} + 4^{-2} - 0.4595(4)^{-2}}{1 - 2.536^{-2}} = 0.4\]
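Examples 36.5 and 36.6 are inverse operations, which makes a round-trip check natural. The following sketch assumes a zero-modified negative binomial count with $B(z) = (1-z)^{-r}$; the function names `to_payment_basis` and `to_loss_basis` are illustrative and not from the text.

```python
def to_payment_basis(r, beta, p0m, v):
    """Loss-count parameters -> payment-count parameters when a fraction v of losses is paid."""
    b = lambda t: (1.0 + t) ** (-r)  # B(-t) with B(z) = (1 - z)^(-r)
    beta_star = v * beta
    p0m_star = p0m + (1.0 - p0m) * (b(beta_star) - b(beta)) / (1.0 - b(beta))
    return beta_star, p0m_star

def to_loss_basis(r, beta_star, p0m_star, v):
    """Payment-count parameters -> loss-count parameters (deductible removed)."""
    b = lambda t: (1.0 + t) ** (-r)
    beta = beta_star / v
    numerator = p0m_star - b(beta_star) + b(beta) - p0m_star * b(beta)
    return beta, numerator / (1.0 - b(beta_star))

# Pareto(alpha = 3, theta = 1000) losses with a deductible of 250
v = (1000.0 / (1000.0 + 250.0)) ** 3

beta_star, p0m_star = to_payment_basis(r=2, beta=3.0, p0m=0.4, v=v)
beta_back, p0m_back = to_loss_basis(r=2, beta_star=beta_star, p0m_star=p0m_star, v=v)

print(v, beta_star, p0m_star)
print(beta_back, p0m_back)
```

Applying the second function to the output of the first returns the original β = 3 and $p_0^M = 0.4$, which is the content of Example 36.6.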
Practice Problems
Problem 36.1 ‡
The frequency distribution for the number of losses when there is no de-
ductible is negative binomial with r = 2 and β = 5. Loss amounts have a
Weibull distribution with r = 0.3 and θ = 1000.
Problem 36.2
Individual losses have an exponential distribution with mean 340. With a
deductible of 200, the frequency distribution for the number of payments is
Poisson with mean λ = 0.5.
Problem 36.3
Individual losses have an exponential distribution with mean 340. With a
deductible of 200, the frequency distribution for the number of payments is
Poisson with mean λ = 0.5.
Problem 36.4
Individual losses have an exponential distribution with mean 340. With a
deductible of 200, the frequency distribution for the number of payments is
Poisson with mean λ = 0.5.
Problem 36.5
Loss amounts follow a Pareto distribution with α = 3 and θ = 1000. With
a deductible of 500, the frequency distribution for the number of payments
is geometric with β = 0.4.
Find PN P (z).
Problem 36.6
Individual losses have an exponential distribution with mean 1000. With a
Determine $p_0^{M*}$ if $p_0^M = 0.4$.
Problem 36.7
Suppose payments on a policy with deductible 200 have a zero-modified
Poisson distribution with $\lambda^* = 0.27756$ and $p_0^{M*} = 0.6304$. Losses have an
exponential distribution with mean θ = 1000.
Find $p_0^M$ if the deductible is removed.
Aggregate Loss Models
S = X1 + X2 + · · · + Xn .
The $X_i$'s usually have mixed distributions with a probability mass at zero
corresponding to the probability of no loss or payments.
This type of model is used in modeling group life or health insurance policies
for a group of n individuals where each individual can have different coverage
and a different level of loss probabilities.
If the $X_i$'s are identically distributed then the individual risk model
becomes a special case of the so-called collective risk model which we define
next.
S = X1 + X2 + · · · + XN .
There are advantages in modeling the claim frequency and claim severity
separately, and then combining them to obtain the aggregate loss
distribution. For example, expansion of insurance business may have an impact on
the claim frequency but not the claim severity. In contrast, a cost increase
may affect the claim severity with no effect on the claim frequency.
37 INDIVIDUAL RISK AND COLLECTIVE RISK MODELS 259
Example 37.1
An insurance company insures 500 individuals against accidental death.
Which of the following risk models is better suited for evaluating the risk to
the insurance company?
(a) A collective risk model.
(b) An individual risk model.
Solution.
Since the number of policies is fixed, the model that is most suited to the
insurance company is the individual risk model
Example 37.2
Which of the following statements are true?
(a) In collective risk model, the number of summands in the aggregate loss
is fixed.
(b) In individual risk model, the number of summands is a random variable.
(c) In individual risk models, the summands are independent but not nec-
essarily identically distributed.
Solution.
(a) This is false. In the collective risk model, the number of summands in the
aggregate loss S is a random variable, which we refer to as the frequency
random variable.
(b) This is false. In individual risk model, the number of summands is a
fixed number.
(c) This is true
Example 37.3 ‡
You are given:
Claim sizes are independent. Determine the variance of the aggregate loss.
Solution.
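For N = 0, we have S = 0. For N = 1 either S = 25 or S = 150. For N = 2,
S = 100, 250, or 400. We have
\[\begin{aligned}
\Pr(S=0) &= \frac{1}{5}\\
\Pr(S=25) &= \Pr(S=25|N=1)\Pr(N=1)\\
&= \Pr(X=25|N=1)\Pr(N=1) = \left(\frac{1}{3}\right)\left(\frac{3}{5}\right) = \frac{1}{5}\\
\Pr(S=100) &= \Pr(S=100|N=2)\Pr(N=2)\\
&= \Pr[(X_1=50)\text{ and }(X_2=50)]\Pr(N=2) = \left(\frac{2}{3}\right)\left(\frac{2}{3}\right)\left(\frac{1}{5}\right) = \frac{4}{45}\\
\Pr(S=150) &= \Pr(S=150|N=1)\Pr(N=1)\\
&= \Pr(X=150|N=1)\Pr(N=1) = \left(\frac{2}{3}\right)\left(\frac{3}{5}\right) = \frac{2}{5}\\
\Pr(S=250) &= \Pr(S=250|N=2)\Pr(N=2)\\
&= \{\Pr[(X_1=50)\text{ and }(X_2=200)] + \Pr[(X_1=200)\text{ and }(X_2=50)]\}\Pr(N=2)\\
&= 2\left(\frac{2}{3}\right)\left(\frac{1}{3}\right)\left(\frac{1}{5}\right) = \frac{4}{45}\\
\Pr(S=400) &= \Pr(S=400|N=2)\Pr(N=2)\\
&= \Pr[(X_1=200)\text{ and }(X_2=200)]\Pr(N=2) = \left(\frac{1}{3}\right)\left(\frac{1}{3}\right)\left(\frac{1}{5}\right) = \frac{1}{45}.
\end{aligned}\]
Thus,
\[\begin{aligned}
E(S) &= 0\left(\frac{1}{5}\right) + 25\left(\frac{1}{5}\right) + 100\left(\frac{4}{45}\right) + 150\left(\frac{2}{5}\right) + 250\left(\frac{4}{45}\right) + 400\left(\frac{1}{45}\right)\\
&= 105\\
E(S^2) &= 0^2\left(\frac{1}{5}\right) + 25^2\left(\frac{1}{5}\right) + 100^2\left(\frac{4}{45}\right) + 150^2\left(\frac{2}{5}\right) + 250^2\left(\frac{4}{45}\right) + 400^2\left(\frac{1}{45}\right)\\
&= 19125\\
\text{Var}(S) &= E(S^2) - E(S)^2 = 19125 - 105^2 = 8100
\end{aligned}\]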
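The enumeration above can be automated with exact fractions. In the sketch below the claim count and claim size probabilities are read off from the solution (the data table of the example is not reproduced here), so they should be treated as an assumption; the variable names are illustrative.

```python
from fractions import Fraction as F
from itertools import product

# Assumed inputs, inferred from the probabilities used in the solution
count_pmf = {0: F(1, 5), 1: F(3, 5), 2: F(1, 5)}
severity_given_count = {
    1: {25: F(1, 3), 150: F(2, 3)},
    2: {50: F(2, 3), 200: F(1, 3)},
}

# Build the distribution of S by enumerating every combination of claim sizes
aggregate_pmf = {0: count_pmf[0]}
for n, severity_pmf in severity_given_count.items():
    for combo in product(severity_pmf.items(), repeat=n):
        total = sum(size for size, _ in combo)
        prob = count_pmf[n]
        for _, p in combo:
            prob *= p
        aggregate_pmf[total] = aggregate_pmf.get(total, 0) + prob

mean_s = sum(s * p for s, p in aggregate_pmf.items())
second_moment_s = sum(s * s * p for s, p in aggregate_pmf.items())
var_s = second_moment_s - mean_s ** 2

print(mean_s, second_moment_s, var_s)
```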
\[P_N(z;\alpha) = Q(z)^{\alpha},\]
Example 37.4
Show that the Poisson distribution with parameter λ has a probability
generating function of the form $P(z) = Q(z)^{\alpha}$.
Solution.
The probability generating function is
\[P(z) = e^{\lambda(z-1)} = [e^{(z-1)}]^{\lambda}.\]
Thus, $Q(z) = e^{z-1}$ and α = λ
Example 37.5 ‡
In order to simplify an actuarial analysis Actuary A uses an aggregate dis-
tribution S = X1 + X2 + · · · + XN , where N has a Poisson distribution with
mean 10 and Xi = 1.5 for all i.
Actuary A’s work is criticized because the actual severity distribution is
given by
\[\Pr(Y_i = 1) = \Pr(Y_i = 2) = 0.5,\quad\text{for all } i,\]
where the $Y_i$'s are independent.
Actuary A counters this criticism by claiming that the correlation coefficient
between S and S ∗ = Y1 + Y2 + · · · + YN is high.
Calculate the correlation coefficient between S and S ∗ .
Solution.
The coefficient of correlation between S and $S^*$ is given by
\[\rho = \frac{\text{Cov}(S,S^*)}{\sqrt{\text{Var}(S)}\sqrt{\text{Var}(S^*)}}.\]
We have
\[\begin{aligned}
E(S) &= E(N)E(X) = 10(1.5) = 15\\
\text{Var}(S) &= E(N)\text{Var}(X) + \text{Var}(N)E(X)^2\\
&= 10(0) + 10(1.5)^2 = 22.5\\
E(S^*) &= E(N)E(Y) = 10(1.5) = 15\\
\text{Var}(S^*) &= E(N)\text{Var}(Y) + \text{Var}(N)E(Y)^2\\
&= 10(0.25) + 10(1.5)^2 = 25
\end{aligned}\]
Practice Problems
Problem 37.1
Which of the following statements are true?
(a) In a collective risk model one policy can be modeled using a Pareto
distribution and another can be modeled with an exponential distribution.
(b) In an individual risk model, the loss amounts can have different proba-
bilities at zero.
(c) In the collective risk model, the frequency and severity of payments are
modeled separately.
(d) In the collective risk model, the number of payments affects the size of
each individual payment.
Problem 37.2
Consider the aggregate loss sum in an individual loss model:
S = X1 + X2 + · · · + Xn .
Assume that the loss amounts are identically distributed. Find the limit of
the coefficient of variation of S as n goes to infinity.
Problem 37.3
Consider a portfolio of two policies. One policy follows a Pareto distribu-
tion with parameters α = 3 and θ = 100 and the other policy follows an
exponential distribution with parameter 0.05. Assume that the two policies
are independent.
Problem 37.4
Determine whether the following model is individual or collective: The num-
ber of claims per day N has a geometric distribution with mean 2. The size
of each claim has an exponential distribution with mean 1000. The number
of losses and loss sizes are mutually independent.
Problem 37.5
Let N, the claim count random variable, follow a negative binomial
distribution with parameters r and β. Show that $P_N(z; \alpha) = [Q(z)]^{\alpha}$.
S = X1 + X2 + · · · + XN
and
\[
F_X^{*0}(x) =
\begin{cases}
1, & x \ge 0 \\
0, & x < 0.
\end{cases}
\]
The random variable X has the common distribution of the $X_i$'s. Note that
$F_X^{*1}(x) = F_X(x)$. By differentiation, and assuming that differentiation
and integration can be interchanged, we find the pdf
\[ f_X^{*n}(x) = \int_0^x f_X^{*(n-1)}(x - y) f_X(y)\,dy, \quad n = 2, 3, \cdots. \]
\[ f_S(x) = \Pr(S = x) = \sum_{n=0}^{\infty} \Pr(N = n) f_X^{*n}(x), \quad x = 0, 1, 2, \cdots. \]
Example 38.1
An insurance portfolio produces N claims, where
  n    Pr(N = n)
  0    0.5
  1    0.2
  2    0.2
  3    0.1

  x    f_X(x)
  1    0.9
  2    0.1

  x          f_X^{*0}(x)  f_X^{*1}(x)  f_X^{*2}(x)  f_X^{*3}(x)  f_S(x)
  0
  1
  2
  3
  4
  Pr(N = n)
Solution.
  x          f_X^{*0}(x)  f_X^{*1}(x)  f_X^{*2}(x)  f_X^{*3}(x)  f_S(x)
  0          1            0            0            0            0.5
  1          0            0.9          0            0            0.18
  2          0            0.1          0.81         0            0.182
  3          0            0            0.18         0.729        0.1089
  4          0            0            0.01         0.243        0.0263
  Pr(N = n)  0.5          0.2          0.2          0.1
To find $f_X^{*n}(x)$ pick two columns whose superscripts sum to n. Then add all
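The convolution table of Example 38.1 can be checked numerically. The sketch below is a minimal brute-force implementation, not an efficient one; the function names convolve and aggregate_pmf are ours. Note that direct convolution gives $f_X^{*2}(3) = 2(0.9)(0.1) = 0.18$ and $f_X^{*3}(4) = 3(0.9)^2(0.1) = 0.243$.

```python
def convolve(f, g):
    """Probability function of the sum of two independent discrete variables."""
    out = {}
    for x, p in f.items():
        for y, q in g.items():
            out[x + y] = out.get(x + y, 0.0) + p * q
    return out

def aggregate_pmf(p_n, f_x):
    """f_S(x) = sum over n of Pr(N = n) * f_X^{*n}(x), for a finite frequency table."""
    f_s = {}
    power = {0: 1.0}                      # f_X^{*0}: point mass at 0
    for n in range(max(p_n) + 1):
        for x, p in power.items():
            f_s[x] = f_s.get(x, 0.0) + p_n.get(n, 0.0) * p
        power = convolve(power, f_x)      # f_X^{*(n+1)}
    return f_s

p_n = {0: 0.5, 1: 0.2, 2: 0.2, 3: 0.1}    # frequency table of Example 38.1
f_x = {1: 0.9, 2: 0.1}                    # severity table of Example 38.1

two_fold = convolve(f_x, f_x)             # f_X^{*2}
three_fold = convolve(two_fold, f_x)      # f_X^{*3}
f_s = aggregate_pmf(p_n, f_x)

for x in sorted(f_s):
    print(x, round(f_s[x], 4))
```

The printed probabilities cover x = 0 through 6 and sum to 1, which is a useful sanity check on any hand-built table.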
  x          f_X^{*0}(x)  f_X^{*1}(x)  f_X^{*2}(x)  f_X^{*3}(x)  f_S(x)
  0          1            0            0            0            0.2
  1          0            0.25         0            0            0.04
  2          0            0.25         0.0625       0            0.048
  3          0            0.25         0.125        0.0156       0.0576
  Pr(N = n)  0.2          0.16         0.128        0.1024
where
\[ \Pr(N = n) = \frac{4^n}{5^{n+1}}, \quad n = 0, 1, 2, \cdots. \]
Hence,
\[ F_S(3) = 0.2 + 0.04 + 0.048 + 0.0576 = 0.3456 \]
Example 38.3
Severities have a uniform distribution on [0, 100]. The frequency distribution
is given by
n Probability
0 0.60
1 0.30
2 0.10
Solution.
We know that $F_X^{*0}(x) = 1$ for $x \ge 0$. Thus,
\[ F_X^{*1}(x) = \int_0^x \frac{dt}{100} = \frac{x}{100}. \]
Example 38.4
Severities have a uniform distribution on [0, 100]. The frequency distribution
is given by
n Probability
0 0.60
1 0.30
2 0.10
Find FS (x).
Solution.
We have, for $0 \le x \le 100$,
\[ F_S(x) = 0.60 + 0.30\,\frac{x}{100} + 0.10\,\frac{x^2}{20000} \]
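The pgf of S is found as follows:
\[
\begin{aligned}
P_S(z) &= E(z^S) = E[z^0]\Pr(N = 0) + \sum_{n=1}^{\infty} E[z^{X_1 + X_2 + \cdots + X_n} \mid N = n]\Pr(N = n) \\
&= \Pr(N = 0) + \sum_{n=1}^{\infty} E\Big[\prod_{j=1}^{n} z^{X_j}\Big]\Pr(N = n) = \sum_{n=0}^{\infty} \Pr(N = n)[P_X(z)]^n \\
&= P_N[P_X(z)].
\end{aligned}
\]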
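The identity $P_S(z) = P_N[P_X(z)]$ is easy to test numerically on a small model. The sketch below uses the frequency and severity tables of Example 38.1 as illustrative inputs and evaluates $E(z^S)$ directly by enumerating every claim pattern; the helper names pgf and pgf_s_direct are ours.

```python
from itertools import product

p_n = {0: 0.5, 1: 0.2, 2: 0.2, 3: 0.1}   # Pr(N = n), from Example 38.1
f_x = {1: 0.9, 2: 0.1}                   # Pr(X = x), from Example 38.1

def pgf(pmf, z):
    """Probability generating function of a discrete distribution."""
    return sum(p * z**k for k, p in pmf.items())

def pgf_s_direct(z):
    """E(z^S) by enumerating N and every tuple (X_1, ..., X_N)."""
    total = 0.0
    for n, pn in p_n.items():
        for xs in product(f_x, repeat=n):   # n = 0 yields one empty tuple
            prob = pn
            for x in xs:
                prob *= f_x[x]
            total += prob * z**sum(xs)
    return total

z = 0.5
lhs = pgf_s_direct(z)             # E(z^S) computed from the definition
rhs = pgf(p_n, pgf(f_x, z))       # P_N[P_X(z)]
print(lhs, rhs)
```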
Example 38.5
Loss frequency N follows a Poisson distribution with parameter λ. Loss
severity X follows an exponential distribution with mean θ. Find the ex-
pected value and the variance of the aggregate loss random variable.
Solution.
We have
E(S) = E(N )E(X) = λθ
and
Thus,
\[ \mathrm{Var}(S) = \lambda^2\theta^2 + \lambda(2\theta^2) - \lambda^2\theta^2 = 2\lambda\theta^2 \]
Example 38.6
Loss frequency N follows a Poisson distribution with parameter λ. Loss
severity X follows an exponential distribution with mean θ. Find an
expression for the probability that the aggregate loss random variable will
exceed a value of n
(a) if S can be approximated with a normal distribution
(b) if S can be approximated with a lognormal distribution.
Solution.
(a) We have
\[
\begin{aligned}
\Pr(S > n) &= \Pr\left(\frac{S - E(S)}{\sqrt{\mathrm{Var}(S)}} > \frac{n - E(S)}{\sqrt{\mathrm{Var}(S)}}\right) \\
&= \Pr\left(Z > \frac{n - E(S)}{\sqrt{\mathrm{Var}(S)}}\right) \\
&= 1 - \Phi\left(\frac{n - E(S)}{\sqrt{\mathrm{Var}(S)}}\right).
\end{aligned}
\]
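Part (a) can be turned into a small calculator. The sketch below assumes the Poisson-exponential moments of Example 38.5, namely $E(S) = \lambda\theta$ and $\mathrm{Var}(S) = 2\lambda\theta^2$, and builds Φ from the error function in the standard library. The function names and the sample values λ = 100, θ = 10 are ours, chosen only for illustration.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Phi(z), the standard normal cdf, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_exceeds_normal(n, lam, theta):
    """Normal approximation to Pr(S > n) for Poisson(lam) frequency and
    exponential severity with mean theta."""
    mean_s = lam * theta
    var_s = 2.0 * lam * theta**2
    return 1.0 - std_normal_cdf((n - mean_s) / sqrt(var_s))

lam, theta = 100.0, 10.0                       # illustrative values only
sd_s = sqrt(2.0 * lam * theta**2)
at_mean = prob_exceeds_normal(lam * theta, lam, theta)
at_95th = prob_exceeds_normal(lam * theta + 1.645 * sd_s, lam, theta)
print(at_mean, round(at_95th, 4))
```

At the mean the approximation returns 0.5, and 1.645 standard deviations above the mean it returns about 0.05, as expected.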
Example 38.7 ‡
You own a fancy light bulb factory. Your workforce is a bit clumsy: they
keep dropping boxes of light bulbs. The boxes have varying numbers of
light bulbs in them, and when dropped, the entire box is destroyed.
You are given:
(i) Expected number of boxes dropped per month: 50
(ii) Variance of the number of boxes dropped per month: 100
Solution.
Let S denote the total value of boxes destroyed in a month. Then
S = X1 + X2 + · · · + XN
Example 38.8 ‡
The number of claims, N, made on an insurance portfolio follows the follow-
ing distribution:
n Pr(N = n)
0 0.7
2 0.2
3 0.1
If a claim occurs, the benefit is 0 or 10 with probability 0.8 and 0.2, respec-
tively. The number of claims and the benefit for each claim are independent.
Calculate the probability that aggregate benefits will exceed expected ben-
efits by more than 2 standard deviations.
38 AGGREGATE LOSS DISTRIBUTIONS VIA CONVOLUTIONS 271
Solution.
We have
But,
Finally,
Pr(S > 9.4) = 1 − 0.8792 = 0.1208
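The figures 9.4 and 0.8792 in Example 38.8 can be reproduced from the problem statement alone. The sketch below is one way to organize the computation; the helper mean_var and the variable names are ours. It uses the fact that S is a multiple of 10, so S exceeds a threshold between 0 and 10 exactly when at least one benefit of 10 is paid.

```python
from math import sqrt

p_n = {0: 0.7, 2: 0.2, 3: 0.1}   # Pr(N = n) from the example
f_x = {0: 0.8, 10: 0.2}          # benefit per claim

def mean_var(pmf):
    """Mean and variance of a discrete distribution given as {value: prob}."""
    m = sum(k * p for k, p in pmf.items())
    v = sum(k * k * p for k, p in pmf.items()) - m * m
    return m, v

mean_n, var_n = mean_var(p_n)
mean_x, var_x = mean_var(f_x)
mean_s = mean_n * mean_x                         # E(S) = E(N)E(X)
var_s = mean_n * var_x + var_n * mean_x**2       # compound variance formula
threshold = mean_s + 2.0 * sqrt(var_s)           # about 9.4

# Pr(S = 0): every claim pays a benefit of 0.
prob_s_zero = sum(p * f_x[0]**n for n, p in p_n.items())
answer = 1.0 - prob_s_zero                       # Pr(S >= 10) = Pr(S > threshold)
print(round(threshold, 2), round(prob_s_zero, 4), round(answer, 4))
```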
Example 38.9 ‡
For a collective risk model the number of losses, N, has a Poisson distribu-
tion with λ = 20.
The common distribution of the individual losses has the following charac-
teristics:
(i) E(X) = 70
(ii) E(X ∧ 30) = 25
(iii) Pr(X > 30) = 0.75
(iv) E(X 2 |X > 30) = 9000
An insurance covers aggregate losses subject to an ordinary deductible of 30
per loss.
Calculate the variance of the aggregate payments of the insurance.
Solution.
Let S denote the aggregate payments. Then S has a compound distribution
with primary distribution N and secondary distribution the payment per
loss $(X - 30)_+$. Since N is Poisson, we are asked to find
\[ \mathrm{Var}(S) = E(N)E[(X - 30)_+^2] = 20E[(X - 30)_+^2]. \]
We have
\[
\begin{aligned}
E[(X - 30)_+^2] &= E[(X - 30)_+^2 \mid X > 30](1 - F_X(30)) = 0.75E[(X - 30)_+^2 \mid X > 30] \\
&= 0.75E[(X - 30)^2 \mid X > 30] = 0.75E(X^2 - 60X + 900 \mid X > 30) \\
&= 0.75[E(X^2 \mid X > 30) - 60E(X \mid X > 30) + 900] \\
&= 0.75[E(X^2 \mid X > 30) - 60E(X - 30 \mid X > 30) - 1800 + 900] \\
&= 0.75[E(X^2 \mid X > 30) - 60(E(X) - E(X \wedge 30))(1 - F_X(30))^{-1} - 900] \\
&= 0.75[9000 - 60(70 - 25)(0.75)^{-1} - 900] = 3375 \\
\mathrm{Var}(S) &= 20(3375) = 67500
\end{aligned}
\]
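The chain of equalities in Example 38.9 can be followed step by step in code. This sketch only restates the given quantities (i) through (iv); the variable names are ours.

```python
lam = 20.0                 # Poisson mean of N
mean_x = 70.0              # (i)   E(X)
lim_mean_30 = 25.0         # (ii)  E(X ^ 30), the limited expected value
surv_30 = 0.75             # (iii) Pr(X > 30) = 1 - F_X(30)
second_given = 9000.0      # (iv)  E(X^2 | X > 30)
d = 30.0                   # ordinary deductible per loss

# Mean excess loss: E(X - 30 | X > 30) = [E(X) - E(X ^ 30)] / Pr(X > 30)
mean_excess = (mean_x - lim_mean_30) / surv_30
mean_given = mean_excess + d                     # E(X | X > 30)

# E[(X - 30)_+^2] = Pr(X > 30) * E[(X - 30)^2 | X > 30]
second_payment = surv_30 * (second_given - 2.0 * d * mean_given + d**2)

# Compound Poisson: Var(S) = lam * E[(X - 30)_+^2]
var_s = lam * second_payment
print(mean_excess, mean_given, second_payment, var_s)
```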
Example 38.10 ‡
The repair costs for boats in a marina have the following characteristics:
Thus,
\[ E(Y) = 9{,}000 + 30{,}000 + 150{,}000 + \sqrt{(2.19 + 39 + 360) \times 10^6} \approx 209{,}000 \]
Example 38.11 ‡
For an insurance:
(i) The number of losses per year has a Poisson distribution with λ = 10.
(ii) Loss amounts are uniformly distributed on (0, 10).
(iii) Loss amounts and the number of losses are mutually independent.
(iv) There is an ordinary deductible of 4 per loss.
Calculate the variance of aggregate payments in a year.
Solution.
Let S be the aggregate claims. Then S is a compound distribution with
primary distribution N, the number of claims, and secondary distribution
$Y = (X - 4)_+$, the amount paid per loss, where X is the loss amount. We
want
\[ \mathrm{Var}(S) = E(N)\mathrm{Var}(Y) + \mathrm{Var}(N)E(Y)^2 = \lambda E(Y^2) \]
where
\[ E(Y^2) = \int_4^{10} (x - 4)^2 f_X(x)\,dx = \int_4^{10} (x - 4)^2 (0.1)\,dx = 7.2. \]
Hence,
\[ \mathrm{Var}(S) = 10(7.2) = 72 \]
Example 38.12 ‡
For an insurance portfolio:
(i) The number of claims has the probability distribution
n pn
0 0.1
1 0.4
2 0.3
3 0.2
Solution.
Let S be the aggregate claims, N the number of claims, and X the amount
of a claim. Then S has a compound distribution with primary distribution N
and secondary distribution X. Thus,
\[ \mathrm{Var}(S) = E(N)\mathrm{Var}(X) + \mathrm{Var}(N)E(X)^2 \]
where
\[
\begin{aligned}
E(X) &= \lambda = 3 \\
\mathrm{Var}(X) &= \lambda = 3 \\
E(N) &= 0(0.1) + 1(0.4) + 2(0.3) + 3(0.2) = 1.6 \\
E(N^2) &= 0^2(0.1) + 1^2(0.4) + 2^2(0.3) + 3^2(0.2) = 3.4 \\
\mathrm{Var}(N) &= 3.4 - 1.6^2 = 0.84.
\end{aligned}
\]
Hence,
\[ \mathrm{Var}(S) = 1.6(3) + 0.84(3^2) = 12.36 \]
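The same computation in Python, as a minimal sketch. The frequency table is the one given in the example; the severity moments E(X) = Var(X) = 3 are taken from the solution above, and the variable names are ours.

```python
p_n = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}        # Pr(N = n)

mean_n = sum(n * p for n, p in p_n.items())        # E(N)
second_n = sum(n * n * p for n, p in p_n.items())  # E(N^2)
var_n = second_n - mean_n**2                       # Var(N)

mean_x = 3.0    # E(X), as used in the solution
var_x = 3.0     # Var(X), as used in the solution

var_s = mean_n * var_x + var_n * mean_x**2         # compound variance formula
print(round(mean_n, 4), round(var_n, 4), round(var_s, 4))
```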
Example 38.13 ‡
Aggregate losses are modeled as follows:
(i) The number of losses has a Poisson distribution with λ = 3.
(ii) The amount of each loss has a Burr (Burr Type XII, Singh-Maddala)
distribution with α = 3, θ = 2, and γ = 1.
(iii) The number of losses and the amounts of the losses are mutually inde-
pendent.
Calculate the variance of aggregate losses.
Solution.
Let N denote the number of losses and X be the severity random variable.
Then
\[ \mathrm{Var}(S) = E(N)\mathrm{Var}(X) + \mathrm{Var}(N)E(X)^2 = \lambda E(X^2). \]
From Table C, we find
\[ E(X^2) = \frac{2^2\,\Gamma(3)\Gamma(1)}{\Gamma(3)} = 4. \]
Hence,
\[ \mathrm{Var}(S) = 3(4) = 12 \]
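For readers without the table at hand, the Burr moment can be evaluated with the gamma function in the standard library. The sketch assumes the usual raw-moment formula for the Burr distribution, $E(X^k) = \theta^k\,\Gamma(1 + k/\gamma)\Gamma(\alpha - k/\gamma)/\Gamma(\alpha)$ for $-\gamma < k < \alpha\gamma$, which matches the expression used above; the function name is ours.

```python
from math import gamma

def burr_raw_moment(k, alpha, theta, gam):
    """E(X^k) for a Burr(alpha, theta, gamma) severity; needs -gam < k < alpha*gam."""
    if not (-gam < k < alpha * gam):
        raise ValueError("moment does not exist for this k")
    return theta**k * gamma(1.0 + k / gam) * gamma(alpha - k / gam) / gamma(alpha)

lam = 3.0
second_moment = burr_raw_moment(2, alpha=3.0, theta=2.0, gam=1.0)  # E(X^2)
var_s = lam * second_moment                                        # compound Poisson
print(second_moment, var_s)
```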
Example 38.14 ‡
You are the producer for the television show Actuarial Idol. Each year,
1000 actuarial clubs audition for the show. The probability of a club being
accepted is 0.20.
The number of members of an accepted club has a distribution with mean
20 and variance 20. Club acceptances and the numbers of club members are
mutually independent.
Your annual budget for persons appearing on the show equals 10 times the
expected number of persons plus 10 times the standard deviation of the
number of persons.
Calculate your annual budget for persons appearing on the show.
Solution.
Let S be the number of people appearing on the show. Then S is a compound
distribution with frequency N (number of clubs being accepted) and severity
X (number of members of an accepted club).
The frequency N has a binomial distribution with parameters m = 1000
and q = 0.20. Thus,
Example 38.15 ‡
For an aggregate loss distribution S :
(i) The number of claims has a negative binomial distribution with r = 16
and β = 6.
(ii) The claim amounts are uniformly distributed on the interval (0, 8).
(iii) The number of claims and claim amounts are mutually independent.
Using the normal approximation for aggregate losses, calculate the premium
such that the probability that aggregate losses will exceed the premium is
5%.
Solution.
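The aggregate loss S has a compound distribution with primary distribution
the frequency N and secondary distribution the severity X. We have
\[
\begin{aligned}
E(N) &= r\beta = 96, \qquad \mathrm{Var}(N) = r\beta(1 + \beta) = 672 \\
E(X) &= 4, \qquad \mathrm{Var}(X) = \frac{8^2}{12} = \frac{16}{3} \\
E(S) &= E(N)E(X) = 96(4) = 384 \\
\mathrm{Var}(S) &= E(N)\mathrm{Var}(X) + \mathrm{Var}(N)E(X)^2 = 96\left(\frac{16}{3}\right) + 672(16) = 11264.
\end{aligned}
\]
Let P denote the premium such that Pr(S > P) = 0.05. Using the normal
approximation, we have
\[
\begin{aligned}
0.05 = \Pr(S > P) &= \Pr\left(\frac{S - 384}{\sqrt{11264}} > \frac{P - 384}{\sqrt{11264}}\right) \\
&= \Pr\left(Z > \frac{P - 384}{\sqrt{11264}}\right) \\
&= 1 - \Pr\left(Z \le \frac{P - 384}{\sqrt{11264}}\right) \\
&= 1 - \Phi\left(\frac{P - 384}{\sqrt{11264}}\right).
\end{aligned}
\]
Hence,
\[ \Phi\left(\frac{P - 384}{\sqrt{11264}}\right) = 0.95 \implies \frac{P - 384}{\sqrt{11264}} = 1.645. \]
Solving this last equation, we find P = 558.59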
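The premium in Example 38.15 can be recomputed from the stated parameters. The sketch below uses the standard negative binomial moments $E(N) = r\beta$ and $\mathrm{Var}(N) = r\beta(1+\beta)$ and the uniform(0, 8) moments; the 95th percentile 1.645 is the table value used in the solution. Variable names are ours.

```python
from math import sqrt

r, beta = 16.0, 6.0
mean_n = r * beta                      # E(N) for the negative binomial
var_n = r * beta * (1.0 + beta)        # Var(N)

upper = 8.0                            # severity is uniform on (0, 8)
mean_x = upper / 2.0                   # E(X)
var_x = upper**2 / 12.0                # Var(X)

mean_s = mean_n * mean_x
var_s = mean_n * var_x + var_n * mean_x**2

z_95 = 1.645                           # 95th percentile of the standard normal
premium = mean_s + z_95 * sqrt(var_s)
print(mean_s, var_s, round(premium, 2))
```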
Practice Problems
Problem 38.1
Find the third raw moment of the aggregate loss random variable S: $E[S^3]$.
Problem 38.2
Find the third central moment of the aggregate loss random variable S:
$E[(S - E(S))^3]$.
Problem 38.3
Let $S = X_1 + X_2 + \cdots + X_N$ where the $X_i$'s are independent with common
distribution the lognormal distribution with parameters µ and σ. The
frequency random variable N has a Poisson distribution with parameter λ.
Using the normal approximation, calculate the probability that the aggre-
gate losses will exceed 52,250.
Problem 38.8
For an insurance company, each loss has a mean of 100 and a variance of
100. The number of losses follows a Poisson distribution with a mean of
500. Each loss and the number of losses are mutually independent.
Using the lognormal approximation, calculate the probability that the ag-
gregate losses will exceed 52,250.
Problem 38.9
An insurance company offers car insurance to a group of 1000 employees.
The claim frequency has a negative binomial distribution with r = 1 and
β = 1.5. Claim severities are exponentially distributed with a mean of 5000.
Assume that the number of claims and the size of the claim are independent
and identically distributed.
Problem 38.10 ‡
When an individual is admitted to the hospital, the hospital charges have
the following characteristics:
Determine the mean and the standard deviation of the insurer’s payout
for the policy.
Problem 38.11 ‡
Computer maintenance costs for a department are modeled as follows:
(i) The distribution of the number of maintenance calls each machine will
need in a year is Poisson with mean 3.
(ii) The cost for a maintenance call has mean 80 and standard deviation
200.
(iii) The number of maintenance calls and the costs of the maintenance calls
Using the normal approximation for the distribution of the aggregate main-
tenance costs, calculate the minimum number of computers needed to avoid
purchasing a maintenance contract.
Problem 38.12 ‡
A towing company provides all towing services to members of the City Au-
tomobile Club. You are given:
(i) The automobile owner must pay 10% of the cost and the remainder is
paid by the City Automobile Club.
(ii) The number of towings has a Poisson distribution with mean of 1000
per year.
(iii) The number of towings and the costs of individual towings are all mu-
tually independent.
Problem 38.13 ‡
The number of auto vandalism claims reported per month at Sunny Daze In-
surance Company (SDIC) has mean 110 and variance 750. Individual losses
have mean 1101 and standard deviation 70. The number of claims and the
amounts of individual losses are independent.
Using the normal approximation, calculate the probability that SDIC’s ag-
gregate auto vandalism losses reported for a month will be less than 100,000.
Problem 38.14 ‡
At the beginning of each round of a game of chance the player pays 12.5.
The player then rolls one die with outcome N. The player then rolls N dice
and wins an amount equal to the total of the numbers showing on the N
dice. All dice have 6 sides and are fair.
Problem 38.15 ‡
A dam is proposed for a river which is currently used for salmon breeding.
You have modeled:
(i) For each hour the dam is opened the number of salmon that will pass
through and reach the breeding grounds has a distribution with mean 100
and variance 900.
(ii) The number of eggs released by each salmon has a distribution with
mean of 5 and variance of 5.
(iii) The number of salmon going through the dam each hour it is open and
the numbers of eggs released by the salmon are independent.
Using the normal approximation for the aggregate number of eggs released,
determine the least number of whole hours the dam should be left open so
the probability that 10,000 eggs will be released is greater than 95%.
Problem 38.16 ‡
You are the producer of a television quiz show that gives cash prizes. The
number of prizes, N, and prize amounts, X, have the following distributions:
  n    Pr(N = n)
  1    0.8
  2    0.2

  x      Pr(X = x)
  0      0.2
  100    0.7
  1000   0.1
Your budget for prizes equals the expected prizes plus the standard devia-
tion of prizes.
Problem 38.17 ‡
The number of accidents follows a Poisson distribution with mean 12. Each
Problem 38.18 ‡
In a clinic, physicians volunteer their time on a daily basis to provide care
to those who are not eligible to obtain care otherwise. The number of physi-
cians who volunteer in any day is uniformly distributed on the integers 1
through 5. The number of patients that can be served by a given physician
has a Poisson distribution with mean 30.
Determine the probability that 120 or more patients can be served in a day
at the clinic, using the normal approximation with continuity correction.
Problem 38.19 ‡
For an individual over 65:
(i) The number of pharmacy claims is a Poisson random variable with mean
25.
(ii) The amount of each pharmacy claim is uniformly distributed between 5
and 95.
(iii) The amounts of the claims and the number of claims are mutually in-
dependent.
Determine the probability that aggregate claims for this individual will ex-
ceed 2000 using the normal approximation.
Problem 38.20 ‡
Two types of insurance claims are made to an insurance company. For each
type, the number of claims follows a Poisson distribution and the amount
of each claim is uniformly distributed as follows:
Problem 38.21 ‡
For aggregate losses, S :
(i) The number of losses has a negative binomial distribution with mean 3
and variance 3.6.
(ii) The common distribution of the independent individual loss amounts is
uniform from 0 to 20.
Problem 38.22 ‡
In a CCRC, residents start each month in one of the following three states:
Independent Living (State #1), Temporarily in a Health Center (State #2)
or Permanently in a Health Center (State #3). Transitions between states
occur at the end of the month.
If a resident receives physical therapy, the number of sessions that the res-
ident receives in a month has a geometric distribution with a mean which
depends on the state in which the resident begins the month. The num-
bers of sessions received are independent. The number in each state at the
beginning of a given month, the probability of needing physical therapy in
the month, and the mean number of sessions received for residents receiving
therapy are displayed in the following table:
  State #   Number in state   Probability of needing therapy   Mean number of visits
  1         400               0.2                              2
  2         300               0.5                              15
  3         200               0.3                              9
Using the normal approximation for the aggregate distribution, calculate the
probability that more than 3000 physical therapy sessions will be required
for the given month.
Problem 38.23 ‡
You are given:
(i) Aggregate losses follow a compound model.
(ii) The claim count random variable has mean 100 and standard deviation
25.
(iii) The single-loss random variable has mean 20,000 and standard devia-
tion 5000.
in the discrete case. Note that this is identical to the discussion of ordinary
deductible of Section 31.
Example 39.1
The distribution of aggregate losses covered under a policy of stop-loss
insurance is given by $F_S(x) = 1 - \frac{1}{x^2}$, $x > 1$. Calculate $E[(S - 3)_+]$.
Solution.
We have
\[ E[(S - 3)_+] = \int_3^{\infty} \frac{dx}{x^2} = -\frac{1}{x}\Big|_3^{\infty} = \frac{1}{3} \]
The following provides a simple calculation of the net stop-loss premium in
a special case.
Theorem 39.1
Suppose that Pr(a < S < b) = 0. Then, for a ≤ d ≤ b, we have
\[ E[(S - d)_+] = \frac{b - d}{b - a}E[(S - a)_+] + \frac{d - a}{b - a}E[(S - b)_+]. \]
Proof.
Let a ≤ x ≤ b. Since a ≤ x and $F_S(x)$ is nondecreasing, we have
$F_S(a) \le F_S(x)$. On the other hand,
\[ F_S(x) = \int_0^x f_S(y)\,dy = \int_0^a f_S(y)\,dy + \int_a^x f_S(y)\,dy \le F_S(a) + \int_a^b f_S(y)\,dy = F_S(a). \]
39 STOP LOSS INSURANCE 285
Solution.
We have
\[ E[(S - 105)_+] = \frac{120 - 105}{120 - 100}(15) + \frac{105 - 100}{120 - 100}(10) = 13.75 \]
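Theorem 39.1 is a linear interpolation between the net stop-loss premiums at a and b. A minimal sketch follows; the function name is ours, and the inputs $E[(S - 100)_+] = 15$ and $E[(S - 120)_+] = 10$ are read off from the computation above.

```python
def stop_loss_interpolate(d, a, b, premium_a, premium_b):
    """E[(S - d)_+] for a <= d <= b when Pr(a < S < b) = 0 (Theorem 39.1)."""
    if not (a <= d <= b):
        raise ValueError("d must lie between a and b")
    weight_a = (b - d) / (b - a)
    weight_b = (d - a) / (b - a)
    return weight_a * premium_a + weight_b * premium_b

value = stop_loss_interpolate(105, 100, 120, 15.0, 10.0)
print(value)
```

At the endpoints d = a and d = b the function returns the two input premiums unchanged, which is a quick way to confirm the weights are the right way round.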
A further simplification follows.
Theorem 39.2
Suppose S is discrete and Pr(S = kh) ≥ 0 for some fixed h and k =
0, 1, 2, · · · . Also, Pr(S = x) = 0 for all x ≠ kh. Then, for any
nonnegative integer j, we have
\[ E[(S - jh)_+] = h\sum_{n=0}^{\infty} \{1 - F_S[(n + j)h]\}. \]
In particular,
\[ E[(S - (j + 1)h)_+] - E[(S - jh)_+] = h[F_S(jh) - 1]. \]
Proof.
We have
\[
\begin{aligned}
E[(S - jh)_+] &= \sum_{x > jh} (x - jh)f_S(x) \\
&= \sum_{k=j}^{\infty} (kh - jh)\Pr(S = kh) = h\sum_{k=j}^{\infty} (k - j)\Pr(S = kh) \\
&= h\sum_{k=j}^{\infty}\sum_{n=0}^{k-j-1} \Pr(S = kh) = h\sum_{n=0}^{\infty}\sum_{k=n+j+1}^{\infty} \Pr(S = kh) \\
&= h\sum_{n=0}^{\infty} \{1 - F_S[(n + j)h]\}.
\end{aligned}
\]
Finally, we have
\[
\begin{aligned}
E[(S - (j + 1)h)_+] - E[(S - jh)_+] &= h\sum_{n=0}^{\infty} \{1 - F_S[(n + j + 1)h]\} - h\sum_{n=0}^{\infty} \{1 - F_S[(n + j)h]\} \\
&= h\sum_{n=1}^{\infty} \{1 - F_S[(n + j)h]\} - h\sum_{n=0}^{\infty} \{1 - F_S[(n + j)h]\} \\
&= h[F_S(jh) - 1]
\end{aligned}
\]
Example 39.3
Given the following information about the distribution of a discrete aggre-
gate loss random variable:
  x        0      25      50        75
  F_S(x)   0.05   0.065   0.08838   0.12306
Calculate E[(S − 25)+ ], E[(S − 50)+ ], E[(S − 75)+ ], and E[(S − 100)+ ] given
that E(S) = 314.50.
Solution.
We have
E[(S − 25)+ ] =314.50 − 25(1 − 0.05) = 290.75
E[(S − 50)+ ] =290.75 − 25(1 − 0.065) = 267.375
E[(S − 75)+ ] =267.375 − 25(1 − 0.08838) = 244.5845
E[(S − 100)+ ] =244.5845 − 25(1 − 0.12306) = 222.661
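The four steps of Example 39.3 apply the same recursion, $E[(S - (j+1)h)_+] = E[(S - jh)_+] - h[1 - F_S(jh)]$, starting from $E[(S - 0)_+] = E(S)$. A short sketch, with a function name of our choosing:

```python
def stop_loss_ladder(mean_s, h, cdf_values):
    """Return [E(S), E[(S-h)_+], E[(S-2h)_+], ...] given F_S(0), F_S(h), F_S(2h), ..."""
    premiums = [mean_s]
    for cdf in cdf_values:
        premiums.append(premiums[-1] - h * (1.0 - cdf))
    return premiums

cdf_values = [0.05, 0.065, 0.08838, 0.12306]     # F_S(0), F_S(25), F_S(50), F_S(75)
ladder = stop_loss_ladder(314.50, 25.0, cdf_values)
for j, value in enumerate(ladder):
    print(25 * j, round(value, 4))
```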
Example 39.4 ‡
Prescription drug losses, S, are modeled assuming the number of claims has
a geometric distribution with mean 4, and the amount of each prescription
is 40. Calculate E[(S − 100)+ ].
Solution.
Let N denote the number of prescriptions and S the aggregate losses. Then
S = 40N. We have
E[(S − 100)+ ] =E(S) − E(S ∧ 100)
=40E(N ) − 40E(N ∧ 2.5)
=40E(N ) − 40[fN (1) + 2fN (2) + 2.5(1 − FN (2))]
=40(4) − 40[0.16 + 2(0.1280) + 2.5(0.5120)]
=92.16
where
\[ f_N(n) = \frac{4^n}{5^{n+1}} \]
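The computation in Example 39.4 can be sketched as follows, assuming the geometric probability function $f_N(n) = \beta^n/(1+\beta)^{n+1}$ with β = 4 used in the solution. The function and variable names are ours.

```python
def geometric_pmf(n, beta):
    """Pr(N = n) for a geometric distribution with mean beta."""
    return beta**n / (1.0 + beta)**(n + 1)

beta = 4.0          # mean number of prescriptions
amount = 40.0       # cost of each prescription
deductible = 100.0
limit = deductible / amount              # S ^ 100 = 40 (N ^ 2.5)

probs = [geometric_pmf(n, beta) for n in range(3)]   # f_N(0), f_N(1), f_N(2)
# E(N ^ 2.5) = 1 f_N(1) + 2 f_N(2) + 2.5 Pr(N > 2)
limited_mean_n = sum(n * probs[n] for n in range(3)) + limit * (1.0 - sum(probs))

stop_loss = amount * beta - amount * limited_mean_n  # E(S) - E(S ^ 100)
print(round(limited_mean_n, 4), round(stop_loss, 2))
```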
Example 39.5 ‡
WidgetsRUs owns two factories. It buys insurance to protect itself against
major repair costs. Profit equals revenues, less the sum of insurance premi-
ums, retained major repair costs, and all other expenses. WidgetsRUs will
pay a dividend equal to the profit, if it is positive.
You are given:
(i) Combined revenue for the two factories is 3.
(ii) Major repair costs at the factories are independent.
(iii) The distribution of major repair costs (k) for each factory is
k Probability(k)
0 0.4
1 0.3
2 0.2
3 0.1
(iv) At each factory, the insurance policy pays the major repair costs in
excess of that factory's ordinary deductible of 1. The insurance premium is
110% of the expected claims.
(v) All other expenses are 15% of revenues.
Calculate the expected dividend.
Solution.
Let R denote the retained major repair cost for both factories. That is, this
is the portion of the major repair costs to be covered by the ownership of the
factory. Note that R is an aggregate random variable and R = 0, 1, 2. Let D
stand for the dividend. The expected claims per factory for the insurance
company is
Example 39.6 ‡
For a collective risk model:
(i) The number of losses has a Poisson distribution with λ = 2.
(ii) The common distribution of the individual losses is:
x f (x)
1 0.6
2 0.4
x f (x)
5 0.6
k 0.4
where k > 5. The expected cost of an aggregate stop-loss insurance subject
to a deductible of 5 is 28.03.
Calculate k.
Solution.
The stop-loss insurance with deductible 5 pays
(S − 5)+ = S − S ∧ 5.
Thus,
E[(S − 5)+ ] = E(S) − E(S ∧ 5).
We have
E(S) = E(N )E(X) = 5[5(0.6) + k(0.4)] = 15 + 2k
and
\[
\begin{aligned}
E(S \wedge 5) &= 5[1 - F_S(0)] \\
&= 5[1 - \Pr(S = 0)] \\
&= 5[1 - \Pr(N = 0)] \\
&= 5(1 - e^{-5}) = 4.9663.
\end{aligned}
\]
Thus,
\[ 28.03 = E[(S - 5)_+] = 15 + 2k - 4.9663 \implies k = 9 \]
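Solving for k is a one-line rearrangement, shown here as a sketch. The Poisson mean of 5 is inferred from the factor 5 in E(S) = 5[5(0.6) + k(0.4)] above; it is an assumption of this sketch, as are the variable names.

```python
from math import exp

lam = 5.0                      # Poisson mean, inferred from the solution above
target = 28.03                 # given expected stop-loss cost E[(S - 5)_+]

# Every loss is at least 5, so S ^ 5 equals 5 whenever N >= 1 and 0 otherwise.
limited_mean = 5.0 * (1.0 - exp(-lam))       # E(S ^ 5)

# E(S) = lam * (5 * 0.6 + 0.4 k) = 15 + 2k, and E[(S - 5)_+] = E(S) - E(S ^ 5).
k = (target + limited_mean - 15.0) / 2.0
print(round(limited_mean, 4), round(k, 2))
```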
Practice Problems
Problem 39.1
You are given: $E[(S - 15)_+] = 0.34$ and $E[(S - 30)_+] = 0.55$. Calculate
$F_S(15)$.
Problem 39.2 ‡
An aggregate claim distribution has the following characteristics:
$\Pr(S = i) = \frac{1}{6}$ for i = 1, 2, 3, 4, 5, 6. A stop-loss insurance with
deductible amount d has an expected insurance payment of 1.5.
Find d.
Problem 39.3 ‡
You are given:
(i) S takes only positive integer values.
(ii) $E(S) = \frac{5}{3}$.
(iii) $E[(S - 2)_+] = \frac{1}{6}$.
(iv) E[(S − 3)+ ] = 0.
Problem 39.4 ‡
For a stop-loss insurance on a three person group:
(i) Loss amounts are independent.
(ii) The distribution of loss amount for each person is:
Loss Amount Probability
0 0.4
1 0.3
2 0.2
3 0.1
(iii) The stop-loss insurance has a deductible of 1 for the group.
Problem 39.5
Suppose that the aggregate loss random variable is discrete, satisfying
$\Pr(S = 50k) = \frac{1}{2^{k+1}}$ for k = 0, 1, 2, · · · and Pr(S = x) = 0 for all other x.
Problem 39.6 ‡
For a certain company, losses follow a Poisson frequency distribution with
mean 2 per year, and the amount of a loss is 1, 2, or 3, each with probability
1/3. Loss amounts are independent of the number of losses, and of each
other.
An insurance policy covers all losses in a year, subject to an annual aggre-
gate deductible of 2.
Problem 39.7 ‡
For a stop-loss insurance on a three person group:
(i) Loss amounts are independent.
(ii) The distribution of loss amount for each person is:
Loss Amount (X) Probability (X)
0 0.4
1 0.3
2 0.2
3 0.1
(iii) The stop-loss insurance has a deductible of 1 for the group.
Problem 39.8 ‡
The number of annual losses has a Poisson distribution with a mean of 5.
The size of each loss has a two-parameter Pareto distribution with θ = 10
and α = 2.5. An insurance for the losses has an ordinary deductible of 5 per
loss.
Calculate the expected value of the aggregate annual payments for this in-
surance.
Problem 39.9 ‡
In a given week, the number of projects that require you to work overtime
has a geometric distribution with β = 2. For each project, the distribution
of the number of overtime hours in the week is the following:
x f (x)
5 0.2
10 0.3
20 0.5
The number of projects and number of overtime hours are independent. You
will get paid for overtime hours in excess of 15 hours in the week.
Calculate the expected number of overtime hours for which you will get
paid in the week.
Let S be an aggregate loss random variable such that the severities are
all independent and identically distributed with common distribution the
exponential distribution with mean θ. We assume that the severities and
the frequency distributions are independent. By independence, we have
and where
\[ \Gamma(\alpha) = \int_0^{\infty} t^{\alpha - 1}e^{-t}\,dt. \]
If α = n is a positive integer then we can find a closed form of Γ(n; x).
Theorem 40.1
Let n be a positive integer. Then
\[ \Gamma(n; x) = 1 - \sum_{j=0}^{n-1} \frac{x^j e^{-x}}{j!}. \]
Proof.
The proof is by induction on n. For n = 1, we have
\[ \Gamma(1; x) = \int_0^x e^{-t}\,dt = 1 - e^{-x} = 1 - \sum_{j=0}^{1-1} \frac{x^j e^{-x}}{j!}. \]
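Theorem 40.1 can be checked numerically for any particular n and x. The sketch below assumes the normalized incomplete gamma function $\Gamma(n; x) = \frac{1}{\Gamma(n)}\int_0^x t^{n-1}e^{-t}\,dt$, which agrees with the base case above, and compares the closed form with a simple midpoint-rule integral. The function names and the test point n = 3, x = 2 are ours.

```python
from math import exp, factorial

def inc_gamma_closed(n, x):
    """Closed form of Theorem 40.1: 1 - sum_{j<n} x^j e^{-x} / j!."""
    return 1.0 - sum(x**j * exp(-x) / factorial(j) for j in range(n))

def inc_gamma_numeric(n, x, steps=20000):
    """Midpoint-rule value of (1/Gamma(n)) * integral_0^x t^(n-1) e^(-t) dt."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t**(n - 1) * exp(-t)
    return total * h / factorial(n - 1)      # Gamma(n) = (n-1)! for integer n

closed = inc_gamma_closed(3, 2.0)
numeric = inc_gamma_numeric(3, 2.0)
print(round(closed, 6), round(numeric, 6))
```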
40 CLOSED FORM OF AGGREGATE DISTRIBUTIONS 295
where
\[ \bar{P}_j = \sum_{n=j+1}^{\infty} \Pr(N = n), \quad j = 0, 1, \cdots. \]
Example 40.1
Find $F_S(x)$ if the frequency N has a binomial distribution with parameters
q and m.
Solution.
We have $\bar{P}_j = 0$ for j = m, m + 1, · · · . Hence,
\[ F_S(x) = 1 - \sum_{n=1}^{m} \binom{m}{n} q^n (1 - q)^{m-n} \sum_{j=0}^{n-1} \frac{(x/\theta)^j e^{-x/\theta}}{j!} \]
Example 40.2
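Suppose that $X_1, X_2, \cdots, X_N$ are independent normal random variables
with parameters $(\mu_1, \sigma_1^2), (\mu_2, \sigma_2^2), \cdots, (\mu_N, \sigma_N^2)$.
Show that $S = X_1 + X_2 + \cdots + X_N$ is also a normal random variable.
Solution.
By independence, we have
\[
\begin{aligned}
M_S(t) &= M_{X_1}(t)M_{X_2}(t) \cdots M_{X_N}(t) \\
&= e^{\mu_1 t + \sigma_1^2 t^2/2}\, e^{\mu_2 t + \sigma_2^2 t^2/2} \cdots e^{\mu_N t + \sigma_N^2 t^2/2} \\
&= e^{(\mu_1 + \mu_2 + \cdots + \mu_N)t + (\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_N^2)t^2/2}.
\end{aligned}
\]
Hence, S has a normal distribution with parameters
$(\mu_1 + \mu_2 + \cdots + \mu_N,\ \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_N^2)$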
Theorem 40.2
Suppose that $S_1, S_2, \cdots, S_n$ are compound Poisson distributions with
parameters $\lambda_1, \lambda_2, \cdots, \lambda_n$ and severity distributions with
cdf's $F_1(x), F_2(x), \cdots, F_n(x)$.
Suppose that the $S_i$'s are independent. Then $S = S_1 + \cdots + S_n$ has a
compound Poisson distribution with parameter $\lambda = \lambda_1 + \lambda_2 + \cdots + \lambda_n$ and
severity distribution with cdf
\[ F(x) = \sum_{j=1}^{n} \frac{\lambda_j}{\lambda} F_j(x). \]
Proof.
Let $M_j(t)$ be the mgf of $F_j(x)$. Then
Example 40.3
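Let $S_1, S_2, S_3,$ and $S_4$ be independent compound Poisson with parameters
$\lambda_1 = 1, \lambda_2 = 2, \lambda_3 = 3, \lambda_4 = 4$ and severities with cdf $F_j(x) = x^2$ for
0 ≤ x ≤ 1. Let $S = S_1 + S_2 + S_3 + S_4$. Find the mgf of S.
Solution.
The moment generating function of $F_j(x)$ is
\[ M(t) = M_j(t) = \int_0^1 e^{tx}(2x)\,dx = 2\left(\frac{e^t}{t} - \frac{e^t}{t^2} + \frac{1}{t^2}\right). \]
Thus,
\[ \lambda\Big[\sum_{j=1}^{4} \frac{\lambda_j}{\lambda} M_j(t) - 1\Big] = 10[M(t) - 1]. \]
Hence,
\[ M_S(t) = e^{10[M(t) - 1]} = \exp\left\{10\left[2\left(\frac{e^t}{t} - \frac{e^t}{t^2} + \frac{1}{t^2}\right) - 1\right]\right\} \]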
Practice Problems
Problem 40.1
Suppose that the frequency distribution N of S has a negative binomial
with integer value r and parameter β. Suppose also that the severities are
identically distributed with common distribution the exponential distribu-
tion with mean θ. Suppose that the severities are independent among each
other and with N.
Find the moment generating function. Show that the given model is equiv-
alent to the binomial-exponential model.
Problem 40.2
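Show that the cdf of the model of Problem 40.1 is given by
\[ F_S(x) = 1 - \sum_{n=1}^{r} \binom{r}{n} \left(\frac{\beta}{1 + \beta}\right)^{n} \left(\frac{1}{1 + \beta}\right)^{r-n} \times P_n(x) \]
where
\[ P_n(x) = \sum_{j=0}^{n-1} \frac{[x\theta^{-1}(1 + \beta)^{-1}]^j\, e^{-x\theta^{-1}(1 + \beta)^{-1}}}{j!}. \]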
Problem 40.3
Suppose N has a geometric distribution with parameter β. The severities
are all exponential with mean θ. The severities and frequency are indepen-
dent.
Problem 40.4
Suppose N has a geometric distribution with parameter β. The severities
are all exponential with mean θ. The severities and frequency are indepen-
dent.
Problem 40.5
Show that the family of Poisson distributions is closed under convolution.
Problem 40.6
Show that the family of binomial distributions is closed under convolution.
Problem 40.7
Show that the family of negative binomial distributions with the same
parameter β but different r's is closed under convolution.
Problem 40.8
Show that the family of Gamma distributions with common parameter θ is
closed under convolution.
Problem 40.9
Let S be an aggregate loss random variable with a discrete frequency distri-
bution N defined by the table below.
n 0 1 2
Pr(N = n) 0.5 0.4 0.1
The severity claim has an exponential distribution with mean 2. Find FS (x).
Problem 40.10
Let $S_1, S_2, \cdots, S_5$ be i.i.d. compound Poisson random variables, each with
parameter $\lambda_j = 7$. The cdf of the severity distribution for each $S_j$ is
$F_j(x) = x^2$ for 0 < x < 1. For $S = S_1 + S_2 + \cdots + S_5$, find $M_S(2)$.
The recursion method is used to find the distribution of S when the fre-
quency distribution belongs to either the (a, b, 0) class or the (a, b, 1) class
and the severity is integer valued and nonnegative. So let N be in the (a, b, 1)
class and denote its distribution by $p_n = \Pr(N = n)$. Then $p_n$ satisfies the
equation
\[ p_n = \left(a + \frac{b}{n}\right)p_{n-1}, \quad n = 2, 3, \cdots. \tag{41.1} \]
We have the following theorem.
Theorem 41.1
For a frequency distribution in (a, b, 1), the pdf of S is recursively defined
by
\[ f_S(n) = \frac{[p_1 - (a + b)p_0]f_X(n) + \sum_{j=1}^{n}\left(a + \frac{b}{n}j\right)f_X(j)f_S(n - j)}{1 - af_X(0)} \tag{41.2} \]
Proof.
Rewrite (41.1) in the form
\[ np_n = a(n - 1)p_{n-1} + (a + b)p_{n-1}, \quad n = 2, 3, \cdots. \]
Multiplying each side by $[P_X(z)]^{n-1}P_X'(z)$, summing over n and reindexing
yields
\[
\begin{aligned}
\sum_{n=1}^{\infty} np_n[P_X(z)]^{n-1}P_X'(z) - p_1P_X'(z) &= a\sum_{n=2}^{\infty}(n - 1)p_{n-1}[P_X(z)]^{n-1}P_X'(z) \\
&\quad + (a + b)\sum_{n=2}^{\infty} p_{n-1}[P_X(z)]^{n-1}P_X'(z) \\
&= a\sum_{n=1}^{\infty} np_n[P_X(z)]^{n}P_X'(z) + (a + b)\sum_{n=1}^{\infty} p_n[P_X(z)]^{n}P_X'(z) \\
&= a\sum_{n=1}^{\infty} np_n[P_X(z)]^{n}P_X'(z) + (a + b)\sum_{n=0}^{\infty} p_n[P_X(z)]^{n}P_X'(z) - p_0(a + b)P_X'(z).
\end{aligned}
\]
Because $P_S(z) = \sum_{n=0}^{\infty} p_n[P_X(z)]^n$ (see similar argument in Section 36), the
previous calculation yields
\[ P_S'(z) - p_1P_X'(z) = aP_S'(z)P_X(z) + (a + b)P_S(z)P_X'(z) - p_0(a + b)P_X'(z) \]
or
\[ P_S'(z) = [p_1 - (a + b)p_0]P_X'(z) + aP_S'(z)P_X(z) + (a + b)P_S(z)P_X'(z). \]
Each side can be expanded in powers of z. The coefficient of $z^{n-1}$ in such an
expansion must be the same on both sides of the equation. Hence, we obtain
\[
\begin{aligned}
nf_S(n) &= [p_1 - (a + b)p_0]nf_X(n) + a\sum_{j=0}^{n}(n - j)f_X(j)f_S(n - j) + (a + b)\sum_{j=0}^{n} jf_X(j)f_S(n - j) \\
&= [p_1 - (a + b)p_0]nf_X(n) + anf_X(0)f_S(n) \\
&\quad + a\sum_{j=1}^{n}(n - j)f_X(j)f_S(n - j) + (a + b)\sum_{j=1}^{n} jf_X(j)f_S(n - j).
\end{aligned}
\]
Hence
\[ [1 - af_X(0)]f_S(n) = [p_1 - (a + b)p_0]f_X(n) + \sum_{j=1}^{n}\left(a + \frac{bj}{n}\right)f_X(j)f_S(n - j). \]
Finally, the result follows by dividing both sides of the last equation by
$1 - af_X(0)$
Corollary 41.2
If N is in the (a, b, 0) class then
\[ f_S(x) = \Pr(S = x) = \frac{\sum_{j=1}^{x}\left(a + \frac{b}{x}j\right)f_X(j)f_S(x - j)}{1 - af_X(0)} \tag{41.3} \]
Proof.
If N is in the (a, b, 0) class then from $\frac{p_1}{p_0} = a + b$ we find $p_1 - (a + b)p_0 = 0$
so that (41.2) reduces to (41.3)
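Recursion (41.3) is short to program. The sketch below specializes it to the compound Poisson case, where a = 0 and b = λ, and starts from $f_S(0) = P_N[f_X(0)] = e^{-\lambda(1 - f_X(0))}$, the standard starting value for a compound distribution. The function name and the sample severity $f_X(1) = 0.4$, $f_X(2) = 0.6$ with λ = 2 are illustrative choices of ours.

```python
from math import exp

def compound_poisson_pmf(lam, f_x, x_max):
    """f_S(0), ..., f_S(x_max) by recursion (41.3) with a = 0, b = lam.

    f_x maps nonnegative integers to severity probabilities.
    """
    a, b = 0.0, lam
    fx = lambda j: f_x.get(j, 0.0)
    f_s = [exp(-lam * (1.0 - fx(0)))]          # starting value f_S(0) = P_N(f_X(0))
    for x in range(1, x_max + 1):
        total = sum((a + b * j / x) * fx(j) * f_s[x - j] for j in range(1, x + 1))
        f_s.append(total / (1.0 - a * fx(0)))
    return f_s

f_s = compound_poisson_pmf(2.0, {1: 0.4, 2: 0.6}, 60)
print([round(p, 6) for p in f_s[:5]])
```

As a cross-check, $f_S(2)$ can be computed by hand: one claim of size 2 contributes $2e^{-2}(0.6)$ and two claims of size 1 contribute $2e^{-2}(0.4)^2$, for a total of $1.52e^{-2}$.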
The recursive method requires an initial starting value fS (0) which can be
found as follows:
Example 41.1
Develop the recursive formula for the case of compound Poisson distribution
with parameter λ.
Solution.
The initial value is
Example 41.2 ‡
For a tyrannosaur with a taste for scientists:
(i) The number of scientists eaten has a binomial distribution with q = 0.6
and m = 8.
(ii) The number of calories of a scientist is uniformly distributed on (7000, 9000).
(iii) The numbers of calories of scientists eaten are independent, and are in-
dependent of the number of scientists eaten.
Calculate the probability that two or more scientists are eaten and exactly
two of those eaten have at least 8000 calories each.
Solution.
Let X denote the number of calories of a scientist and N the number of
scientists eaten. Then
\[ \Pr(8000 \le X \le 9000) = F_X(9000) - F_X(8000) = \frac{9000 - 8000}{9000 - 7000} = \frac{1}{2}. \]
Thus, half the scientists are considered heavy. Let Y denote a heavy scientist
and S the total number of heavy scientists. We are asked to find Pr(S = 2) =
41 DISTRIBUTION OF S VIA THE RECURSIVE METHOD 303
Solution.
In order to have exactly 600 in aggregate claims, one of following must
happen:
• N = 2, $X_1 = 100$, and $X_2 = 500$. In this case, the probability is
\[ \frac{e^{-5}5^2}{2!}(0.8)(0.16) = 0.010781. \]
• N = 2, $X_1 = 500$, and $X_2 = 100$. In this case, the probability is
\[ \frac{e^{-5}5^2}{2!}(0.16)(0.8) = 0.010781. \]
• N = 6 and $X_1 = X_2 = \cdots = X_6 = 100$. In this case, the probability is
\[ \frac{e^{-5}5^6}{6!}(0.8)^6 = 0.038331. \]
Summing up, we find a total probability of about 0.06
Practice Problems
Problem 41.1
Let S have a Poisson frequency distribution with parameter λ = 0.04. The
individual claim amount has the following distribution:
x fX (x)
1 0.5
2 0.4
3 0.1
Problem 41.2
Let S be a compound Poisson distribution with parameter λ = 0.04 and
individual claim distribution given by
x fX (x)
1 0.5
2 0.4
3 0.1
Problem 41.3 ‡
You are given:
• S has a compound Poisson distribution with λ = 2.
• Individual claim amounts are distributed as follows: fX (1) = 0.4 and
fX (2) = 0.6.
Determine fS (4).
Problem 41.4 ‡
Aggregate claims S has a compound Poisson distribution with parameter λ
and with discrete individual claim amount distributions of $f_X(1) = \frac{1}{3}$ and
$f_X(3) = \frac{2}{3}$. Also, $f_S(4) = f_S(3) + 6f_S(1)$.
Problem 41.5 ‡
Aggregate claims S has a compound Poisson distribution with parameter λ
and with discrete individual claim amount distributions of $f_X(1) = \frac{1}{3}$ and
Determine Var(S).
Problem 41.6 ‡
For aggregate claims S, you are given:
\[ f_S(x) = \sum_{n=0}^{\infty} f_X^{*n}(x)\,\frac{e^{-50}(50^n)}{n!}. \]
Losses are distributed as follows: $f_X(1) = 0.4$, $f_X(2) = 0.5$, and $f_X(3) = 0.1$.
Calculate Var(S).
Problem 41.7
Let S have a Poisson frequency distribution with parameter λ = 0.04. The
individual claim amount has the following distribution:
x fX (x)
1 0.5
2 0.4
3 0.1
Find FS (4).
Problem 41.8
The frequency distribution of an aggregate loss S follows a binomial dis-
tribution with m = 4 and q = 0.3. Loss amount has the distribution:
fX (0) = 0.2, fX (1) = 0.7, fX (2) = 0.1
Problem 41.9
The frequency distribution of an aggregate loss S follows a binomial dis-
tribution with m = 4 and q = 0.3. Loss amount has the distribution:
fX (0) = 0.2, fX (1) = 0.7, fX (2) = 0.1
Problem 41.10
The frequency distribution of an aggregate loss S follows a binomial dis-
tribution with m = 4 and q = 0.3. Loss amount has the distribution:
Problem 41.11
Annual aggregate losses for a dental policy follow the compound Poisson
distribution with λ = 3. The distribution of individual losses is:
Loss Probability
1 0.4
2 0.3
3 0.2
4 0.1
Calculate the probability that aggregate losses in one year do not exceed 3.
Example 42.1
Let S be an aggregate random variable with a frequency distribution that
42 DISCRETIZATION OF CONTINUOUS SEVERITIES 309
Solution.
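(a) The cdf of the severity distribution is
\[ F_X(x) = 1 - \left(\frac{1000}{x + 1000}\right)^2. \]
Thus,
\begin{align*}
f_0 &= F_X(3) = 1 - \left(\frac{1000}{3 + 1000}\right)^2 = 0.006 \\
f_1 &= F_X(9) - F_X(3) = \left(\frac{1000}{3 + 1000}\right)^2 - \left(\frac{1000}{9 + 1000}\right)^2 = 0.0118 \\
f_2 &= F_X(15) - F_X(9) = \left(\frac{1000}{9 + 1000}\right)^2 - \left(\frac{1000}{15 + 1000}\right)^2 = 0.0116 \\
f_3 &= 1 - F_X(15) = \left(\frac{1000}{15 + 1000}\right)^2 = 0.9707.
\end{align*}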
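The rounding step in part (a) can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: the span h = 6 is inferred from the cut points 3, 9 and 15, the Pareto parameters (α = 2, θ = 1000) are read off the cdf above, and the function names are illustrative only.

```python
def pareto_cdf(x, alpha=2.0, theta=1000.0):
    """Cdf of the severity used in part (a): 1 - (theta / (x + theta))^alpha."""
    return 1.0 - (theta / (x + theta)) ** alpha

def round_discretize(cdf, h, n_points):
    """Method of rounding on the points 0, h, ..., (n_points - 1) * h.

    Interior point j*h receives F(j*h + h/2) - F(j*h - h/2); the first point
    receives F(h/2) and the last point receives the remaining tail.
    """
    probs = [cdf(h / 2.0)]
    for j in range(1, n_points - 1):
        probs.append(cdf(j * h + h / 2.0) - cdf(j * h - h / 2.0))
    probs.append(1.0 - cdf((n_points - 1) * h - h / 2.0))
    return probs

f = round_discretize(pareto_cdf, h=6.0, n_points=4)
print([round(p, 4) for p in f])
```

Placing the whole tail on the last point is what makes the four probabilities add up to 1.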
(b) Let A denote the aggregate random variable whose frequency distribution
is Poisson with λ = 2 and whose severity distribution is the arithmetized
distribution. We want to find $f_S(12) = f_A(2)$. Using the recursive method,
we find
\[ \sum_{j=0}^{p} (x_k + jh)^r m_j^k = \int_{x_k}^{x_k + ph} x^r\, dF_X(x), \quad r = 0, 1, 2, \cdots, p. \tag{42.1} \]
The unique solution to the system (42.1) is provided by the following theorem.
Theorem 42.1
The solution to (42.1) is
\[ m_j^k = \int_{x_k}^{x_k + ph} \prod_{i \neq j} \frac{x - x_k - ih}{(j - i)h}\, dF_X(x), \quad j = 0, 1, \cdots, p. \]
Proof.
Let f (x) be a polynomial and consider the set of data points
Integrating both sides with respect to $F_X(x)$ over the interval $[x_k, x_k + ph)$, we
find
\[ \sum_{j=0}^{p} (x_k + jh)^r m_j^k = \sum_{j=0}^{p} (x_k + jh)^r \int_{x_k}^{x_k + ph} \prod_{i \neq j} \frac{x - x_k - ih}{(j - i)h}\, dF_X(x). \]
Example 42.2
Suppose X has the exponential distribution with pdf $f_X(x) = 0.1e^{-0.1x}$. Use
the method of local moment matching with p = 1 and a span h = 2
(a) to find the equation corresponding to r = 0 in the resulting system of
equations. Assume that $x_0 = 0$;
(b) to find $m_0^k$ and $m_1^k$ using Theorem 42.1;
(c) to find $f_0$ and $f_n$.
Solution.
(a) We have
\[ \sum_{j=0}^{1} (x_k + jh)^0 m_j^k = \int_{2k}^{2k+2} f_X(x)\,dx = 0.1\int_{2k}^{2k+2} e^{-0.1x}\,dx = e^{-0.2k} - e^{-0.2k-0.2}. \]
Example 42.3
Loss amounts follow a Pareto distribution with parameters α = 3 and θ = 5.
Use the Method of Local Moment Matching with h = 3 and p = 1 to find
the system of two equations in the unknowns $m_0^0$ and $m_1^0$. Solve this system.
Assume $x_0 = 0$.
Solution.
We have
\[ m_0^0 + m_1^0 = \Pr(0 \le X < 3) = F_X(3) - F_X(0) = 1 - \left(\frac{5}{3 + 5}\right)^3 = 0.7559. \]
Similarly, the equation for r = 1 is
\[ 0 \cdot m_0^0 + 3m_1^0 = \int_0^3 x f_X(x)\,dx = 0.79102. \]
This implies
\[ 3m_1^0 = 0.79102 \implies m_1^0 = 0.2637 \]
and
\[ m_0^0 = 0.7559 - 0.2637 = 0.4922. \]
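The two-point system of this example can also be solved in code. The following is a sketch, not the text's own method: it uses the Pareto survival function and limited expected value together with the identity $\int_0^a x\,dF_X(x) = E(X \wedge a) - a\,S_X(a)$, and all function names are assumptions made for this illustration.

```python
def pareto_sf(x, alpha=3.0, theta=5.0):
    """Survival function of the Pareto(alpha, theta) distribution."""
    return (theta / (x + theta)) ** alpha

def pareto_lev(x, alpha=3.0, theta=5.0):
    """Limited expected value E[X ^ x] of the Pareto distribution (alpha != 1)."""
    return theta / (alpha - 1.0) * (1.0 - (theta / (x + theta)) ** (alpha - 1.0))

def local_moments_p1(lo, h):
    """Masses (m0, m1) at lo and lo + h matching the r = 0 and r = 1 moments."""
    hi = lo + h
    prob = pareto_sf(lo) - pareto_sf(hi)                      # integral of dF on [lo, hi)
    first = (pareto_lev(hi) - hi * pareto_sf(hi)) - (pareto_lev(lo) - lo * pareto_sf(lo))
    m1 = (first - lo * prob) / h   # from lo*m0 + (lo + h)*m1 = first and m0 + m1 = prob
    m0 = prob - m1
    return m0, m1

m0, m1 = local_moments_p1(0.0, 3.0)
print(round(m0, 4), round(m1, 4))
```

Calling `local_moments_p1(3.0, 3.0)` would give the masses for the next interval under the same assumptions.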
Practice Problems
Problem 42.1
Let S be an aggregate random variable with a frequency distribution that
has a Poisson distribution with λ = 2 and a severity distribution that has a
uniform distribution on (0, 50).
(a) Use the method of rounding to approximate $f_0$, $f_1$, $f_2$ and $f_3$ in the
arithmetized distribution such that $f_3$ is the last positive probability. Use
the span h = 10.
(b) Find $f_S(30)$.
Problem 42.2
Medical claims have an exponential distribution with mean θ = 500. Using
the method of rounding with a span h = 50 estimate the probability that
the claim amount is 500.
Problem 42.3
Loss amounts follow a Weibull distribution with parameters α = 2 and
θ = 1000. Using the method of rounding with a span h = 500 estimate the
probability that the loss amount is 2000.
Problem 42.4
Loss amounts follow a Pareto distribution with parameters α = 3 and θ = 5.
Use the Method of Local Moment Matching with h = 3 and p = 1 to find
the system of two equations in the unknowns $m_0^1$ and $m_1^1$. Solve this system.
Assume $x_0 = 0$.
Problem 42.5
Loss amounts follow a Pareto distribution with parameters α = 3 and θ = 5.
Use the Method of Local Moment Matching with h = 3 and p = 1 to find
f0 and f3 .
Problem 42.6
Loss amounts follow an exponential distribution with mean θ = 500. A discrete
distribution is created using the Method of Local Moment Matching such that
p = 1 and h = 50. Calculate the probability assigned to a loss amount of 500 using
Theorem 42.1.
Problem 42.7
Loss amounts follow a Pareto distribution with α = 4 and θ = 10000. A discrete
distribution is created using the Method of Local Moment Matching such
that p = 1 and h = 300. Calculate the probability assigned to a loss amount of 3000
using Theorem 42.1.
We start by reminding the reader of the following notation: the number of
losses is denoted by $N^L$; the number of payments by $N^P$; the probability
that a loss results in a payment by v; the amount of payment on a per-loss
basis by $Y^L$, with $Y^L = 0$ if a loss results in no payment; and the amount
of payment on a per-payment basis by $Y^P$, where only losses that result in
a nonzero payment are considered and the losses that result in no payment
are completely ignored. Note that $\Pr(Y^P = 0) = 0$ and
$Y^P = Y^L \mid Y^L > 0$.
\[ P_{N^P}(z) = P_N(1 - v + vz). \]
Now back to the aggregate payments. On a per-loss basis, the total payments
may be expressed as
\[ S = Y_1^L + Y_2^L + \cdots + Y_{N^L}^L \]
with S = 0 if $N^L = 0$ and where $Y_j^L$ is the payment amount on the jth loss.
Alternatively, ignoring all losses that do not result in a nonzero payment,
the aggregate S can be expressed as
\[ S = Y_1^P + Y_2^P + \cdots + Y_{N^P}^P \]
with S = 0 if $N^P = 0$ and where $Y_j^P$ is the payment amount on the jth loss
which results in a nonzero payment. On a per-loss basis, S is a compound
distribution with primary distribution $N^L$ and secondary distribution $Y^L$
so that
\[ M_S(t) = P_{N^L}[M_{Y^L}(t)]. \]
Note that
Example 43.1
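A ground-up model of individual losses follows a Pareto distribution with
α = 4 and θ = 10. The number of losses is a Poisson distribution with λ = 3.
There is an ordinary deductible of 6 and a policy limit of 18, with the
deductible applied before the policy limit and the coinsurance, and a
coinsurance of 75%. Find the expected value and the variance of S, the
aggregate payments on a per-loss basis.
Solution.
We have
\begin{align*}
E(N^L) &= 3 \\
E(X \wedge 24) &= \frac{\theta}{\alpha - 1}\left[1 - \left(\frac{\theta}{24 + \theta}\right)^{\alpha - 1}\right]
 = \frac{10}{3}\left[1 - \left(\frac{10}{24 + 10}\right)^{4-1}\right] = 3.2485 \\
E(X \wedge 6) &= \frac{10}{3}\left[1 - \left(\frac{10}{6 + 10}\right)^{4-1}\right] = 2.5195 \\
E(Y^L) &= \alpha[E(X \wedge u) - E(X \wedge d)] = 0.75[E(X \wedge 24) - E(X \wedge 6)] \\
&= 0.75(3.2485 - 2.5195) = 0.54675.
\end{align*}
In the formula for $E(Y^L)$, α denotes the coinsurance factor 0.75 (not the
Pareto parameter), u = 6 + 18 = 24 is the maximum covered loss, and d = 6 is
the deductible.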
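The limited expected values above are easy to recompute. The sketch below is a numerical check under the example's stated values (Pareto α = 4, θ = 10, deductible 6, maximum covered loss 24, coinsurance 75%, Poisson mean 3); the function and variable names are assumptions of this illustration, and only the mean of S is computed, using E(S) = E(N^L)E(Y^L).

```python
def pareto_lev(x, alpha, theta):
    """Limited expected value E[X ^ x] for a Pareto(alpha, theta) loss, alpha != 1."""
    return theta / (alpha - 1.0) * (1.0 - (theta / (x + theta)) ** (alpha - 1.0))

alpha, theta = 4.0, 10.0         # Pareto parameters from the example
lam = 3.0                        # Poisson mean of the number of losses
d, u, coins = 6.0, 24.0, 0.75    # deductible, maximum covered loss (6 + 18), coinsurance

# Per-loss expected payment: coinsurance * [E(X ^ u) - E(X ^ d)].
e_yl = coins * (pareto_lev(u, alpha, theta) - pareto_lev(d, alpha, theta))
e_s = lam * e_yl                 # mean of the compound sum: E(N^L) * E(Y^L)
print(round(e_yl, 4), round(e_s, 4))
```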
Example 43.2
A ground-up model of individual losses follows a Pareto distribution with
α = 4 and θ = 10. The number of losses is a Poisson distribution with
λ = 3. There is an ordinary deductible of 6 and a policy limit of 18, with
the deductible applied before the policy limit and the coinsurance, and a
coinsurance of 75%. Find the distribution of S, the aggregate payments on
a per-payment basis.
Solution.
Since we are treating S on a per-payment basis, we can look at S as an
aggregate distribution with frequency distribution $N^P$ and severity
distribution $Y^P$. The probability that a loss will result in a payment is
\[ v = \Pr(X > 6) = S_X(6) = \left(\frac{10}{10 + 6}\right)^4 = 0.15259. \]
We also have (see Section 36)
Let $Z = X - 6 \mid X > 6$ denote the individual payment random variable with
only a deductible of 6. Then
\[ \Pr(Z > z) = \frac{\Pr(X > z + 6)}{\Pr(X > 6)}. \]
Now with the 75% coinsurance, $Y^P = 0.75Z$ and the maximum payment is
0.75(24 − 6) = 13.5, so that for y < 13.5 we have
\[ F_{Y^P}(y) = 1 - \Pr(0.75Z > y) = \frac{\Pr(X > 6) - \Pr\left(X > 6 + \frac{y}{0.75}\right)}{\Pr(X > 6)}. \]
43 INDIVIDUAL POLICY MODIFICATIONS IMPACT ON AGGREGATE LOSSES317
Example 43.3 ‡
Aggregate losses for a portfolio of policies are modeled as follows:
(i) The number of losses before any coverage modifications follows a Poisson
distribution with mean λ.
(ii) The severity of each loss before any coverage modifications is uniformly
distributed between 0 and b.
The insurer would like to model the impact of imposing an ordinary de-
ductible, d (0 < d < b), on each loss and reimbursing only a percentage,
c(0 < c ≤ 1), of each loss in excess of the deductible.
It is assumed that the coverage modifications will not affect the loss distri-
bution. The insurer models its claims with modified frequency and severity
distributions. The modified claim amount is uniformly distributed on the
interval [0, c(b − d)].
Determine the mean of the modified frequency distribution.
Solution.
Imposing the deductible will limit payments to those losses that are greater
than d. In this case, the proportion of losses that result in a payment is
$\Pr(X > d) = 1 - \frac{d}{b}$ (remember that loss amounts are uniform on (0, b)).
Thus, the mean of the modified frequency distribution, that is, the expected
number of losses that will result in a payment being made, is the product of
the total expected number of losses and the probability of a loss resulting
in a payment. That is,
\[ \lambda\left(1 - \frac{d}{b}\right). \]
Example 43.4 ‡
A company insures a fleet of vehicles. Aggregate losses have a compound
Poisson distribution. The expected number of losses is 20. Loss amounts,
regardless of vehicle type, have exponential distribution with θ = 200.
In order to reduce the cost of the insurance, two modifications are to be
made:
(i) a certain type of vehicle will not be insured. It is estimated that this will
reduce loss frequency by 20%.
(ii) a deductible of 100 per loss will be imposed.
Calculate the expected aggregate amount paid by the insurer after the mod-
ifications.
Solution.
We want
From Table C,
\[ E(X \wedge 100) = 200\left(1 - e^{-\frac{100}{200}}\right). \]
Thus,
\[ E(S) = (20)(0.8)\left[200 - 200\left(1 - e^{-\frac{100}{200}}\right)\right] \approx 1941. \]
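A short numerical check of this calculation follows. It assumes, as the problem states, that severity is the same exponential distribution regardless of vehicle type, so that only the frequency is scaled by 0.8; the variable names are illustrative.

```python
import math

expected_losses = 20 * 0.8     # loss frequency reduced by 20%
theta, d = 200.0, 100.0        # exponential mean and per-loss deductible

# E[(X - d)+] = E(X) - E(X ^ d), with E(X ^ d) = theta * (1 - exp(-d / theta)).
expected_payment_per_loss = theta - theta * (1.0 - math.exp(-d / theta))
expected_aggregate = expected_losses * expected_payment_per_loss
print(round(expected_aggregate))
```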
Practice Problems
Problem 43.1
A ground-up model of individual losses follows a Pareto distribution with
α = 4 and θ = 10. The number of losses is a Poisson distribution with λ = 7.
There is an ordinary deductible of 6 and a policy limit of 18, with the
deductible applied before the policy limit and the coinsurance, and a
coinsurance of 75%.
Find $F_{Y^L}(y)$.
Problem 43.2
An insurance company has a policy where the amount of each payment for
losses follows an exponential distribution with mean θ = 100. The number
of losses follows a Poisson distribution with λ = 7. There is an ordinary
deductible of 30 and a maximum covered loss of 340, with the deductible
applied before the policy limit and the coinsurance, and a coinsurance of 53%.
Problem 43.3
An insurance company has a policy where the amount of each payment for
losses follows an exponential distribution with mean θ = 100. The number
of losses follows a Poisson distribution with λ = 7. There is an ordinary
deductible of 30 and a maximum covered loss of 340, with the deductible
applied before the policy limit and the coinsurance, and a coinsurance of 53%.
Let S be the aggregate payments on a per-payment basis.
Problem 43.4
An insurance company has a policy where the amount of each payment for
losses follows an exponential distribution with mean θ = 100. The number
of losses follows a Poisson distribution with λ = 7. There is an ordinary
deductible of 30 and a maximum covered loss of 340, with the deductible
applied before the policy limit and the coinsurance, and a coinsurance of 53%.
Problem 43.5
An insurance company has a policy where the amount of each payment for
Problem 43.6
An insurance company has a policy where the amount of each payment
for losses on a per-payment basis follows an exponential distribution with
mean θ = 100. The probability that a loss will result in a payment is 0.74082.
Find MY L (t).
Problem 43.7
A ground-up model of individual losses follows a Pareto distribution with
α = 4 and θ = 10. The number of losses is a Poisson distribution with λ = 7.
There is an ordinary deductible of 6 and a policy limit of 18, with the
deductible applied before the policy limit and the coinsurance, and a
coinsurance of 75%.
Find $P_{N^P}(z)$.
44 AGGREGATE LOSSES FOR THE INDIVIDUAL RISK MODEL 321
Example 44.1
For a portfolio of 5 life insurance policies, you are given:
i qi bi
1 0.32 500
2 0.10 1000
3 0.54 250
4 0.23 375
5 0.14 650
Calculate E(S) and Var(S) where S = X1 + X2 + · · · + X5 .
Solution.
The aggregate mean is
\[ E(S) = 0.32(500) + 0.10(1000) + 0.54(250) + 0.23(375) + 0.14(650) = 572.25. \]
The variance of S is
\begin{align*}
\mathrm{Var}(S) &= 0.32(1 - 0.32)(500)^2 + 0.10(1 - 0.10)(1000)^2 \\
&\quad + 0.54(1 - 0.54)(250)^2 + 0.23(1 - 0.23)(375)^2 + 0.14(1 - 0.14)(650)^2 \\
&= 235698.6875.
\end{align*}
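For a portfolio of independent policies with fixed benefits, the aggregate mean and variance are $\sum q_i b_i$ and $\sum q_i(1 - q_i)b_i^2$. The sketch below evaluates both sums directly from the table values $q_i$ and $b_i$ of this example; the list names are illustrative.

```python
q = [0.32, 0.10, 0.54, 0.23, 0.14]   # claim probabilities q_i from the table
b = [500, 1000, 250, 375, 650]       # fixed benefits b_i from the table

# Each X_i equals b_i with probability q_i and 0 otherwise, independently.
mean_s = sum(qi * bi for qi, bi in zip(q, b))
var_s = sum(qi * (1 - qi) * bi**2 for qi, bi in zip(q, b))
print(mean_s, var_s)
```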
Example 44.2
For a portfolio of 3 life insurance policies, you are given:
i qi bi
1 0.32 500
2 0.10 1000
3 0.54 250
Calculate PS (z).
Solution.
We have
\[ P_S(z) = (0.68 + 0.32z^{500})(0.90 + 0.10z^{1000})(0.46 + 0.54z^{250}). \]
Example 44.3
A life insurance portfolio consists of 100 policies where each policy has a
probability of a claim of 0.20. When a claim occurs, the amount of the
claim follows a Pareto distribution with parameters α = 3 and θ = 1000.
Calculate the mean and the variance of the aggregate loss.
Solution.
For i = 1, 2, · · · , 100, we have
\[ E(B_i) = \frac{\theta}{\alpha - 1} = \frac{1000}{3 - 1} = 500. \]
Hence
\[ E(S) = nqE(B) = 100(0.20)(500) = 10,000. \]
Next,
\[ \mathrm{Var}(B) = \frac{\alpha\theta^2}{(\alpha - 1)^2(\alpha - 2)} = 750,000. \]
Therefore,
\[ \mathrm{Var}(S) = n\left[q\,\mathrm{Var}(B) + q(1 - q)E(B)^2\right] = 100\left[0.20(750,000) + 0.20(0.80)(500)^2\right] = 19,000,000. \]
Example 44.4 ‡
Each life within a group medical expense policy has loss amounts which follow
a compound Poisson process with λ = 0.16. Given a loss, the probability
that it is for Disease 1 is $\frac{1}{16}$.
Loss amount distributions have the following parameters:
                  Mean per loss    Standard deviation per loss
Disease 1         5                50
Other Diseases    10               20
Premiums for a group of 100 independent lives are set at a level such that
the probability (using the normal approximation to the distribution for ag-
gregate losses) that aggregate losses for the group will exceed aggregate
premiums for the group is 0.24.
A vaccine which will eliminate Disease 1 and costs 0.15 per person has been
discovered.
Define:
A= the aggregate premium assuming that no one obtains the vaccine, and
B= the aggregate premium assuming that everyone obtains the vaccine and
the cost of the vaccine is a covered loss.
Calculate A/B.
Solution.
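Let $X_1$ denote the loss per person due to Disease 1 and $X_2$ the loss per
person due to Other Diseases. Then $X_1$ and $X_2$ are Bernoulli with parameter
$q = \frac{1}{16}$.
The expected loss per person is
\[ E(L) = \frac{1}{16}(5) + \frac{15}{16}(10) = \frac{155}{16} = 9.6875 \]
and the variance is $\mathrm{Var}(L) = E(\mathrm{Var}(X)) + \mathrm{Var}(E(X))$, where
\[ \mathrm{Var}(E(X)) = \frac{1}{16}\cdot\frac{15}{16}(5 - 10)^2 = 1.4648. \]
Also,
\[ E(\mathrm{Var}(X)) = 50^2\left(\frac{1}{16}\right) + 20^2\left(\frac{15}{16}\right) = 531.25. \]
Hence,
\[ \mathrm{Var}(L) = 531.25 + 1.4648 = 532.7148. \]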
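and
\[ \mathrm{Var}(S) = 100(0.16)\left[20^2\left(\frac{15}{16}\right) + 10^2\left(\frac{15}{16}\right)\right] = 7500. \]
In this case,
\[ \Pr(S > B) = \Pr\left(\frac{S - 165}{\sqrt{7500}} > \frac{B - 165}{\sqrt{7500}}\right) = \Pr\left(Z > \frac{B - 165}{\sqrt{7500}}\right) = 0.24. \]
Thus,
\[ \frac{B - 165}{\sqrt{7500}} = 0.7 \implies B = 225.62. \]
Finally,
\[ \frac{A}{B} = \frac{225.09}{225.62} = 0.998. \]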
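Both premiums can be recomputed from the problem data. The sketch below is an illustration under stated assumptions: aggregate losses are compound Poisson, so Var(S) = nλE(X²); the vaccine cost is treated as a fixed 0.15 per life; and z = 0.7 is the rounded normal quantile used in the solution. The helper `premium` and the variable names are hypothetical.

```python
import math

lives, lam, z = 100, 0.16, 0.7   # z = 0.7 is the rounded quantile used in the text
p1 = 1 / 16                      # probability that a loss is for Disease 1
mean1, sd1 = 5.0, 50.0           # Disease 1: mean and standard deviation per loss
mean2, sd2 = 10.0, 20.0          # Other diseases: mean and standard deviation per loss

def premium(mean, var):
    """Premium with Pr(S > premium) = 0.24 under the normal approximation."""
    return mean + z * math.sqrt(var)

# No vaccine: severity is a mixture of the two loss types.
second_moment = p1 * (sd1**2 + mean1**2) + (1 - p1) * (sd2**2 + mean2**2)
mean_a = lives * lam * (p1 * mean1 + (1 - p1) * mean2)
var_a = lives * lam * second_moment
A = premium(mean_a, var_a)

# Everyone vaccinated: Disease 1 losses vanish, vaccine cost is a fixed covered loss.
mean_b = lives * 0.15 + lives * lam * (1 - p1) * mean2
var_b = lives * lam * (1 - p1) * (sd2**2 + mean2**2)
B = premium(mean_b, var_b)
print(round(A, 2), round(B, 2), round(A / B, 3))
```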
Practice Problems
Problem 44.1
A life insurance portfolio consists of 100 policies where each policy has a
probability of a claim of 0.20. When a claim occurs, the amount of the
claim follows an exponential distribution with mean θ = 1000.
Problem 44.2
A life insurance portfolio consists of 100 policies where each policy has a
probability of a claim of 0.20. When a claim occurs, the amount of the
claim follows an exponential distribution with mean θ = 1000.
Find MS (t).
Problem 44.3
A life insurance portfolio consists of 100 policies where each policy has a
probability of a claim of 0.20. When a claim occurs, the amount of the
claim follows an exponential distribution with mean θ = 1000.
Find PS (z).
Problem 44.4 ‡
A group life insurance contract covering independent lives is rated in the
three age groupings as given in the table below.
Age      Number in    Probability of    Mean of the exponential
group    age group    claim per life    distribution of claim amounts
18-35    400          0.03              5
36-50    300          0.07              3
51-65    200          0.10              2
(a) Find the mean and the variance of the aggregate claim.
(b) Find the 95th percentile of S.
Problem 44.5 ‡
The probability model for the distribution of annual claims per member
in a health plan is shown below. Independence of costs and occurrences
among services and members is assumed. Suppose that the plan consists of
n members.
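For large n, the normal approximation uses the Central Limit Theorem
as follows:
\begin{align*}
\Pr(S \le s) &= \Pr\left(\frac{S - E(S)}{\sqrt{\mathrm{Var}(S)}} \le \frac{s - E(S)}{\sqrt{\mathrm{Var}(S)}}\right) \\
&\approx \Pr\left(Z \le \frac{s - E(S)}{\sqrt{\mathrm{Var}(S)}}\right) \\
&= \Phi\left(\frac{s - E(S)}{\sqrt{\mathrm{Var}(S)}}\right).
\end{align*}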
Example 45.1 ‡
A group life insurance contract covering independent lives is rated in the
three age groupings as given in the table below. The insurer prices the
contract so that the probability that the total claims will exceed the premium
is 0.05.
Age      Number in    Probability of    Mean of the exponential
group    age group    claim per life    distribution of claim amounts
18-35    400          0.03              5
36-50    300          0.07              3
51-65    200          0.10              2
45 APPROXIMATING PROBABILITIES IN THE INDIVIDUAL RISK MODEL329
(a) Find the mean and the variance of the aggregate claim.
(b) Using the normal approximation, determine the premium that the in-
surer will charge.
Solution.
(a) The mean is given by
\[ E(S) = 400(0.03)(5) + 300(0.07)(3) + 200(0.10)(2) = 163. \]
The variance is given by
\begin{align*}
\mathrm{Var}(S) &= 400[(0.03)(5)^2 + (0.03)(0.97)(5)^2] \\
&\quad + 300[(0.07)(3)^2 + 0.07(0.93)(3)^2] + 200[0.10(2)^2 + 0.10(0.9)(2)^2] \\
&= 1107.77.
\end{align*}
(b) We have
\begin{align*}
\Pr(S > P) &= 0.05 \\
1 - \Pr(S > P) &= 0.95 \\
\Pr(S \le P) &= 0.95 \\
\Pr\left(Z \le \frac{P - 163}{\sqrt{1107.77}}\right) &= 0.95 \\
\frac{P - 163}{\sqrt{1107.77}} &= 1.645.
\end{align*}
Solving the last equation, we find P = 217.75.
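The normal-approximation premium can be reproduced as follows. This is a minimal sketch: it uses `statistics.NormalDist` from the Python standard library for the 95th percentile (about 1.645), and the tuple layout of `groups` is an assumption of this example.

```python
import math
from statistics import NormalDist

# (number of lives, claim probability, mean of exponential claim amount)
groups = [(400, 0.03, 5.0), (300, 0.07, 3.0), (200, 0.10, 2.0)]

mean_s = sum(n * q * m for n, q, m in groups)
# For an exponential claim amount Var(B) = m^2, so each life contributes
# q * Var(B) + q * (1 - q) * E(B)^2 = q * m^2 + q * (1 - q) * m^2.
var_s = sum(n * (q * m**2 + q * (1 - q) * m**2) for n, q, m in groups)

z = NormalDist().inv_cdf(0.95)              # standard normal 95th percentile
premium = mean_s + z * math.sqrt(var_s)
print(round(mean_s, 2), round(var_s, 2), round(premium, 2))
```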
Example 45.2
Repeat the previous example by replacing the normal approximation with
the lognormal approximation.
Solution.
Solving the system $\mu + 0.5\sigma^2 = \ln 163$ and
$2\mu + 2\sigma^2 = \ln(163^2 + 1107.77)$, we find µ = 5.073 and $\sigma^2 = 0.041$. Thus,
\begin{align*}
\Pr(S > P) &= 0.05 \\
1 - \Pr(S > P) &= 0.95 \\
\Pr(S \le P) &= 0.95 \\
\Pr\left(Z \le \frac{\ln P - 5.073}{\sqrt{0.041}}\right) &= 0.95 \\
\frac{\ln P - 5.073}{\sqrt{0.041}} &= 1.645.
\end{align*}
Solving the last equation, we find P = 222.76.
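The lognormal parameters follow from matching the first two moments: subtracting twice the first equation from the second gives $\sigma^2 = \ln(1 + \mathrm{Var}(S)/E(S)^2)$. The sketch below carries this out with unrounded intermediate values, so its premium differs slightly (in the first decimal) from the figure obtained with the rounded µ and σ²; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mean_s, var_s = 163.0, 1107.77            # moments of S from the previous example

# Moment matching: mu + sigma2/2 = ln E(S), 2*mu + 2*sigma2 = ln(E(S)^2 + Var(S)).
sigma2 = math.log(1.0 + var_s / mean_s**2)
mu = math.log(mean_s) - 0.5 * sigma2

z = NormalDist().inv_cdf(0.95)            # standard normal 95th percentile
premium = math.exp(mu + z * math.sqrt(sigma2))
print(round(mu, 3), round(sigma2, 3), round(premium, 2))
```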
Practice Problems
Problem 45.1 ‡
The probability model for the distribution of annual claims per member in
a health plan is shown below. Independence of costs and occurrences among
services and members is assumed.
                  Probability    Mean of        Variance of
Service           of claim       claim dist.    claim dist.
Office visits     0.7            160            4,900
Surgery           0.2            600            20,000
Other Services    0.5            240            8,100
Problem 45.2
An insurer has a portfolio consisting of 5 one-year life insurance policies
grouped as follows:
i qi bi
1 0.32 500
2 0.10 1000
3 0.54 250
4 0.23 375
5 0.14 650
Use a normal approximation to find the probability that the company will
not meet its obligation next year.
Problem 45.3
Repeat the previous problem by replacing the normal approximation with
the lognormal approximation.
Problem 45.4 ‡
An insurance company is selling policies to individuals with independent
future lifetimes and identical mortality profiles. For each individual, the
probability of death by all causes is 0.10 and the probability of death due
Problem 45.5
Consider a portfolio of 100 independent life insurance policies. It is
determined that the death benefit of each insured follows a Poisson distribution
with parameter λ. The probability of a death is 0.1. Using the normal
approximation, it has been estimated that the probability of the aggregate
loss exceeding 25 is 0.05.
334 REVIEW OF MATHEMATICAL STATISTICS
Example 46.1
A population consists of the values 1,3,5, and 9. We want to estimate
the mean of the population µ. A random sample of two values from this
population is taken without replacement, and the mean of the sample µ̂ is
used as an estimator of the population mean µ.
(a) Find the probability distribution of µ̂.
(b) Is µ̂ an unbiased estimator?
Solution.
(a) The various samples are {1, 3}, {1, 5}, {1, 9}, {3, 5}, {3, 9}, and {5, 9}, each
occurring with probability $\frac{1}{6}$. The following table provides the sample
mean.
46 PROPERTIES OF POINT ESTIMATORS 335
(b) We have
\[ \mu = \frac{1 + 3 + 5 + 9}{4} = 4.5 \]
and
\[ E(\hat{\mu} \mid \mu) = \frac{2 + 3 + 5 + 4 + 6 + 7}{6} = 4.5 = \mu. \]
Hence, the estimator is unbiased.
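The enumeration behind this example is small enough to do by machine. The sketch below lists every sample of size two drawn without replacement and compares the average of the sample means with the population mean, using exact fractions; the variable names are assumptions of this illustration.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 3, 5, 9]

# All equally likely samples of size 2 drawn without replacement.
samples = list(combinations(population, 2))
sample_means = [Fraction(a + b, 2) for a, b in samples]

expected_estimate = sum(sample_means) / len(sample_means)
population_mean = Fraction(sum(population), len(population))
print(samples)
print(expected_estimate, population_mean)
```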
Example 46.2
Let $X_1, X_2, \cdots, X_n$ be normally distributed random variables each with
mean µ and variance $\sigma^2$. Show that the estimator of the variance
\[ \hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(X_i - \hat{\mu})^2}{n} \quad \text{where} \quad \hat{\mu} = \frac{X_1 + X_2 + \cdots + X_n}{n} \]
is biased.
Solution.
The estimator can be expressed as
\[ \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \hat{\mu}^2. \]
\[ \lim_{n \to \infty} E(\hat{\theta}_n) = \theta \]
for all θ.
Example 46.3
Let X1 , X2 , · · · , Xn be uniform random variables on (0, θ). Show that the
estimator
θ̂n = max{X1 , X2 , · · · , Xn }
is asymptotically unbiased.
Solution.
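The cdf of $\hat{\theta}_n$ is expressed as
\[ F_{\hat{\theta}_n}(y) = \Pr(X_1 \le y, X_2 \le y, \cdots, X_n \le y) = \left(\frac{y}{\theta}\right)^n. \]
Hence, the pdf is
\[ f_{\hat{\theta}_n}(y) = \frac{ny^{n-1}}{\theta^n}, \quad 0 < y < \theta \]
and the expected value of $\hat{\theta}_n$ is
\[ E(\hat{\theta}_n) = \int_0^{\theta} ny^n\theta^{-n}\,dy = \frac{n}{n + 1}\theta \to \theta \]
as $n \to \infty$.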
Theorem 46.1
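If $\lim_{n \to \infty} E(\hat{\theta}_n) = \theta$ and $\lim_{n \to \infty} \mathrm{Var}(\hat{\theta}_n) = 0$ then $\hat{\theta}_n$ is consistent.
Proof.
Let δ > 0. Then by Chebyshev's inequality (see p.405 of [2]) we can write
\[ 0 \le \Pr(|\hat{\theta}_n - E(\hat{\theta}_n)| \ge 2\delta) \le \frac{\mathrm{Var}(\hat{\theta}_n)}{4\delta^2}. \]
Letting $n \to \infty$ and using the squeeze rule of limits, the result follows.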
Example 46.4
Let X1 , X2 , · · · , Xn be uniform random variables on (0, θ). Show that the
estimator
θ̂n = max{X1 , X2 , · · · , Xn }
is weakly consistent.
Solution.
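We have already shown that $\hat{\theta}_n$ is asymptotically unbiased. It remains to
show that its variance goes to 0 as n goes to infinity. We have
\[ E(\hat{\theta}_n^2) = \int_0^{\theta} ny^{n+1}\theta^{-n}\,dy = \frac{n}{n + 2}\theta^2 \]
\begin{align*}
\mathrm{Var}(\hat{\theta}_n) &= \frac{n}{n + 2}\theta^2 - \frac{n^2}{(n + 1)^2}\theta^2 \\
&= \frac{n\theta^2}{(n + 2)(n + 1)^2} \to 0
\end{align*}
as $n \to \infty$. That is, $\hat{\theta}_n$ is consistent.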
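The algebraic simplification of the variance can be verified with exact rational arithmetic. In the sketch below θ is set to 1 (the variance scales with θ²), and the function name is an assumption made for this check.

```python
from fractions import Fraction

def var_max_uniform(n):
    """Var(max of n iid uniform(0, theta)) divided by theta^2, from the two moments."""
    second_moment = Fraction(n, n + 2)     # E(max^2) / theta^2
    first_moment = Fraction(n, n + 1)      # E(max) / theta
    return second_moment - first_moment**2

# Compare with the closed form n / ((n + 2)(n + 1)^2) for several sample sizes.
for n in (1, 2, 5, 50):
    assert var_max_uniform(n) == Fraction(n, (n + 2) * (n + 1) ** 2)

print([float(var_max_uniform(n)) for n in (1, 10, 100, 1000)])
```

The printed values shrink toward 0 as n grows, which is the behavior the consistency argument relies on.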
That is, the mean-squared error is an average of the squares of the differences
between the actual observations and those estimated. The mean-squared
error is arguably the most important criterion used to evaluate the
performance of an estimator. The MSE incorporates both the variance of the
estimator and its bias. Indeed, we have
\[ \mathrm{MSE}_{\hat{\theta}}(\theta) = E[(\hat{\theta} - \theta)^2] = \mathrm{Var}(\hat{\theta}) + [\mathrm{bias}_{\hat{\theta}}(\theta)]^2. \]
Since the MSE decomposes into the sum of the squared bias and the variance
of the estimator, both quantities are important and need to be as small as
possible to achieve good estimation performance. It is common to trade off
some increase in bias for a larger decrease in the variance, and vice versa.
Example 46.5
Let $X_1, X_2, \cdots, X_n$ be independent and identically distributed random
variables whose common distribution is normal with parameters $(\mu, \sigma^2)$.
Find $\mathrm{MSE}_{\hat{\mu}}(\mu)$.
Solution.
We have
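Given two unbiased estimators $\hat{\theta}$ and $\hat{\theta}'$ of θ, we say that $\hat{\theta}$ is more efficient
than $\hat{\theta}'$ if $\mathrm{Var}(\hat{\theta}) < \mathrm{Var}(\hat{\theta}')$. Note that for an unbiased estimator $\hat{\theta}$, we have
$\mathrm{MSE}_{\hat{\theta}}(\theta) = \mathrm{Var}(\hat{\theta})$. If the estimator $\hat{\theta}$ satisfies $\mathrm{Var}(\hat{\theta}) < \mathrm{Var}(\hat{\theta}')$
for every other unbiased estimator $\hat{\theta}'$ of θ, then we call $\hat{\theta}$ a uniformly
minimum variance unbiased estimator (UMVUE). That is, $\hat{\theta}$ is the
most efficient unbiased estimator of θ.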
Example 46.6
Let $X_1, X_2, \cdots, X_n$ be independent uniform random variables with
parameters (0, θ). Consider the two unbiased estimators
\[ \hat{\theta}_a = 2\hat{\mu} \quad \text{and} \quad \hat{\theta}_b = \frac{n + 1}{n}\max\{X_1, X_2, \cdots, X_n\}. \]
Solution.
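We have
\[ \mathrm{Var}(\hat{\theta}_a) = 4\mathrm{Var}(\hat{\mu}) = \frac{4}{n}\cdot\frac{\theta^2}{12} = \frac{\theta^2}{3n} \]
and
\[ \mathrm{Var}(\hat{\theta}_b) = \left(\frac{n + 1}{n}\right)^2\mathrm{Var}(\hat{\theta}_n) = \left(\frac{n + 1}{n}\right)^2\frac{n}{(n + 1)^2(n + 2)}\theta^2 = \frac{\theta^2}{n(n + 2)}. \]
Thus,
\[ \mathrm{Var}(\hat{\theta}_a) - \mathrm{Var}(\hat{\theta}_b) = \frac{(n - 1)\theta^2}{3n(n + 2)} > 0, \quad n > 1. \]
Thus, $\hat{\theta}_b$ is more efficient than $\hat{\theta}_a$.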
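A small simulation makes the efficiency comparison concrete. The sketch below is illustrative only: θ = 10, n = 5, the number of trials, and the random seed are arbitrary choices, and the empirical variances should land near the theoretical values θ²/(3n) ≈ 6.67 and θ²/(n(n + 2)) ≈ 2.86.

```python
import random
from statistics import pvariance

random.seed(0)                       # arbitrary seed, for repeatability only
theta, n, trials = 10.0, 5, 20000    # hypothetical settings for this illustration

est_a, est_b = [], []
for _ in range(trials):
    xs = [random.uniform(0.0, theta) for _ in range(n)]
    est_a.append(2.0 * sum(xs) / n)            # theta_a = 2 * sample mean
    est_b.append((n + 1) / n * max(xs))        # theta_b = (n + 1)/n * sample maximum

var_a, var_b = pvariance(est_a), pvariance(est_b)
print(round(var_a, 2), round(var_b, 2))
```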
Practice Problems
Problem 46.1 ‡
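You are given:
• $E(X) = \theta > 0$.
• $\mathrm{Var}(X) = \frac{\theta^2}{25}$.
• $\hat{\theta} = \frac{kX}{k + 1}$.
• $\mathrm{MSE}_{\hat{\theta}}(\theta) = 2[\mathrm{bias}_{\hat{\theta}}(\theta)]^2$.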
Problem 46.2 ‡
You are given two independent estimates of an unknown quantity θ :
a. Estimator A : E(θ̂A ) = 1000 and σ(θ̂A ) = 400.
b. Estimator B : E(θ̂B ) = 1200 and σ(θ̂B ) = 200.
Estimator C is a weighted average of Estimator A and Estimator B such
that
θ̂C = wθ̂A + (1 − w)θ̂B .
Determine the value of w that minimizes σ(θ̂C ).
Problem 46.3 ‡
Claim sizes are uniformly distributed over the interval [0, θ]. A sample of 10
claims, denoted by $X_1, X_2, \cdots, X_{10}$, was observed and an estimate of θ was
obtained using
\[ \hat{\theta} = \max\{X_1, X_2, \cdots, X_{10}\}. \]
Recall that the probability density function for $\hat{\theta}$ is
\[ f_{\hat{\theta}}(y) = \frac{10y^9}{\theta^{10}}. \]
Problem 46.4 ‡
A random sample, $X_1, X_2, \cdots, X_n$, is drawn from a distribution with a mean
of 2/3 and a variance of 1/18. An estimator of the distribution mean is given
by
\[ \hat{\mu} = \frac{X_1 + X_2 + \cdots + X_n}{n - 1}. \]
Find $\mathrm{MSE}_{\hat{\mu}}(\mu)$.
Problem 46.5
A random sample of 10 independent values was taken from a Pareto
distribution with parameters α = 3 and θ. The estimator used to estimate the
mean of the distribution is given by
\[ \hat{\mu} = \frac{X_1 + X_2 + \cdots + X_{10}}{10}. \]
It is given that $\mathrm{MSE}_{\hat{\mu}}(\mu) = 300$. Determine the value of θ.
Problem 46.6
Let $X_1, X_2, \cdots, X_n$ be uniform random variables on (0, θ). The parameter
θ is estimated using the estimator
\[ \hat{\theta}_n = \frac{2n}{n - 1}\mu \]
where
\[ \mu = \frac{X_1 + X_2 + \cdots + X_n}{n}. \]
Find the bias of $\hat{\theta}_n$.
Problem 46.7
Let θ̂ denote an estimator of a parameter θ. Suppose that E(θ̂) = 3 and
E(θ̂2 ) = 17.
Problem 46.8 ‡
Which of the following statements is true?
47 Interval Estimation
The next estimation procedure that we consider is the interval estimation.
Example 47.1
Let $X_1, X_2, \cdots, X_n$ be a sample from a normal distribution with a known
standard deviation σ but unknown mean µ. Find a 100(1 − α)% confidence
interval for µ.
Solution.
Let $\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$ be an estimator of the mean. Then the random
variable $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has the standard normal distribution. But the standard normal
distribution is symmetric about 0 so we choose L = −U so that
\[ \Pr\left(-U \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le U\right) = 1 - \alpha. \]
Solution.
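We have 1 − α = 0.95 so that α = 0.05. Thus, $z_{\alpha/2}$ satisfies
\[ \Phi(z_{\alpha/2}) = \int_{-\infty}^{z_{\alpha/2}} \phi(x)\,dx = 1 - 0.025 = 0.975. \]
Using the table for the standard normal distribution, we find $z_{0.025} = 1.96$.
Thus, we want
\[ \left|\frac{\bar{X} - \theta}{\sqrt{\theta/n}}\right| \le 1.96 \]
which is equivalent to
\[ (\bar{X} - \theta)^2 \le \frac{3.8416\theta}{n} \]
or
\[ \theta^2 - \left(2\bar{X} + \frac{3.8416}{n}\right)\theta + \bar{X}^2 \le 0. \]
Solving this quadratic inequality, we find the confidence interval with
endpoints
\[ \bar{X} + \frac{1.9208}{n} \pm \frac{1}{2}\sqrt{\frac{15.3664\bar{X} + 3.8416^2/n}{n}}. \]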
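The endpoints are the two roots of $(\bar{X} - \theta)^2 = 3.8416\theta/n$, which the sketch below computes and then checks by substitution. The statistic used in this solution has variance θ/n, which presumably corresponds to a distribution whose variance equals its mean θ (for instance Poisson); that reading, the sample values $\bar{X} = 3$ and n = 50, and the function name are all assumptions of this illustration.

```python
import math

def ci_variance_equals_mean(xbar, n, z=1.96):
    """Roots of (xbar - theta)^2 = z^2 * theta / n, i.e. the interval endpoints."""
    c = z * z
    center = xbar + c / (2.0 * n)
    half_width = 0.5 * math.sqrt(4.0 * c * xbar / n + (c / n) ** 2)
    return center - half_width, center + half_width

# Hypothetical sample summary, chosen only to exercise the formula.
low, high = ci_variance_equals_mean(xbar=3.0, n=50)
print(round(low, 4), round(high, 4))
```

Note that the interval is not symmetric about $\bar{X}$: its center is shifted upward by 1.9208/n.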
Practice Problems
Problem 47.1
For a 90% confidence interval, what is $z_{\alpha/2}$?
Problem 47.2
For a 99% confidence interval, what is $z_{\alpha/2}$?
Problem 47.3
You are given that Pr(L ≤ θ ≤ U ) ≥ 0.80. What is the probability that a
confidence interval will not include the population parameter?
Problem 47.4
You are given:
• a population has a normal distribution with mean µ and σ = 3.
• $\bar{X} = 80$
• n = 10
Problem 47.5 ‡
A sample of 2000 policies had 1600 with no claims and 400 with one or
more claims. Using the normal approximation, determine the symmetric
95% confidence interval for the probability that a single policy has one or
more claims.
48 Hypothesis Testing
Within the context of this book, a statistical hypothesis is a claim or a
statement regarding a population parameter or a probability distribution.
Example 48.1
An insurance company is reviewing its current policy rates. Originally, the
insurance company believed that the average claim amount should be $1,200.
Currently, they suspect that the true mean is actually higher than
this. What are the hypotheses for this problem?
Solution.
The null and alternative hypotheses for this problem are:
H0: µ ≤ 1,200
H1: µ > 1,200
The testing method used in making the decision to either reject or not to
reject the null hypothesis involves two concepts: a test statistic and a
rejection region.
A rejection region is the set of all values of the test statistic for which the
null hypothesis will be rejected, i.e., values that provide strong evidence in
favor of the alternative hypothesis.
The boundaries of the rejection region are called the critical values. In
the case of an alternative hypothesis with a > sign, the rejection region lies in
the right tail of the distribution of the test statistic; with a < sign, it lies in
the left tail; and with a ≠ sign, the region is two-tailed.
Example 48.3
An insurance company is reviewing its current policy rates. When originally
setting the rates they believed that the average claim amount was $1,800.
They are concerned that the true mean is actually higher than this, because
they could potentially lose a lot of money. They randomly select 40 claims,
and calculate a sample mean of $1,950. Assuming that the standard
deviation of claims is $500, and setting α = 0.025, test to see if the insurance
company should be concerned.
Solution.
The null and alternative hypotheses for this problem are:
H0: µ ≤ 1,800
H1: µ > 1,800.
Example 48.4
Consider the following hypotheses:
H0 :µ ≥ 3
H1 :µ < 3.
You throw a fair die. If the face shows a 3 or less then you reject H0 . If the
face shows a number greater than 3 you fail to reject H0 or H1 is true.
(a) Calculate the level of significance α.
(b) Calculate the probability of Type II error.
Solution.
(a) We have
By comparing the p-value to the alpha level we can easily decide to reject
or fail to reject: If p-value > α, fail to reject H0. If p-value ≤ α, reject
H0.
2. (a) If the test is one-tailed, the p-value is equal to the tail area beyond
z in the same direction as the alternative hypothesis. Thus, if the
alternative hypothesis is of the form >, the p-value is the area to the right
of, or above, the observed z value. If it is of the form <, the p-value is the
area to the left of, or below, the observed z value.
(b) If the test is two-tailed, the p-value is equal to twice the tail area
beyond the observed z value in the direction of the sign of z. That is, if z is
positive, the p-value is twice the area to the right of, or above, the observed
z value. If z is negative, the p-value is twice the area to the left of, or below,
the observed z value.
Example 48.5
Find the p−value of Example 48.3.
Solution.
Example 48.3 is an upper-tailed test, so the p-value is Pr(Z > 1.897). This is
the area under the standard normal curve to the right of z = 1.897. Looking
in a standard normal table we find that z = 1.897 ≈ 1.90 corresponds to 0.0287.
Since 0.0287 > α = 0.025, we fail to reject the null hypothesis.
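The numbers from Example 48.3 can be checked numerically. The sketch below recomputes the test statistic and the upper-tail area Pr(Z > z); doubling that area gives the corresponding two-tailed value. It uses `statistics.NormalDist` from the Python standard library, and the variable names are chosen for this illustration only.

```python
import math
from statistics import NormalDist

mu0, xbar, sigma, n = 1800.0, 1950.0, 500.0, 40   # values from Example 48.3

# z statistic for a sample mean with known standard deviation.
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Upper-tail area under the standard normal curve.
p_upper = 1.0 - NormalDist().cdf(z)
print(round(z, 3), round(p_upper, 4), round(2.0 * p_upper, 4))
```

The table lookup in the text rounds z to 1.90, so its tail area differs from the computed one in the fourth decimal place.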
Example 48.6
You are performing a hypothesis test as follows:
You pick a random value of X. This is x, your test statistic. Your test
statistic in this case is 2. What is the p−value of this test?
Solution.
Recall that the p-value is the probability that, if the null hypothesis is true,
a higher value than the test statistic is observed. The sdf of X if H0 is true
is $S(x) = e^{-10x}$. Thus, the p-value is $S(2) = e^{-10(2)} = e^{-20}$.
Practice Problems
Problem 48.1
Suppose the current mean cost to treat a cancer patient for one month is
$18,000. Consider the following scenarios.
(a) A hospital treatment plan is implemented which hospital authorities feel
will reduce the treatment costs.
(b) It is uncertain how a new treatment plan will affect costs.
(c) A treatment plan is implemented which hospital authorities feel will in-
crease treatment costs.
Let µ represent the mean cost per patient per month after the new treat-
ment plan is implemented.
Give the research hypothesis in symbolic form for each of the above cases
Problem 48.2
Classify each of the following as a lower-tailed, upper-tailed, or two-tailed
rejection region:
(i) H0 : µ = 12 and H1 : µ ≠ 12.
(ii) H0 : µ ≥ 12 and H1 : µ < 12.
(iii) H0 : µ = 12 and H1 : µ > 12.
Problem 48.3
Consider a college campus where finding a parking space is not easy. The
university claims that the average time spent in finding a parking space is at
least 30 minutes. Suppose you suspect that it takes less than that. So in a
sample of five, you found that the average time is 20 minutes. Assuming that
the time it takes to find a parking spot is normal with standard deviation
σ = 6 minutes, then perform a hypothesis test with level of significance
α = 0.01 to see if your claim is correct.
Problem 48.4
A hypothesis test has a p−value of 0.037. At which of these significance
levels would you reject the null hypothesis?
Problem 48.5
A new restaurant has opened in town. A statistician claims that the amount
spent per customer for dinner is more than $20. To verify whether his claim
is valid or not, he randomly selected a group of 49 customers and found that
the average amount spent was $22.60. Assume that the standard deviation
is known to be $2.50.
Using α = 2%, would he conclude the typical amount spent per customer is
more than $20.00?
Problem 48.6
Suppose a production line operates with a mean filling weight of 16 ounces
per container. Since over- or under-filling can be dangerous, a quality
control inspector samples 30 items to determine whether or not the filling
weight has to be adjusted. The sample revealed a mean of 16.32 ounces.
From past data, the standard deviation is known to be 0.8 ounces.
Problem 48.7
A dietician is trying to test the claim that a new diet plan will cause a person
to lose 10 lbs. over 4 weeks. To test her claim, she selects a random sample
of 49 overweight individuals and finds an average weight loss of 12.5
pounds over the four weeks, with σ = 7 lbs.
Identify the critical value suitable for conducting a two-tail test of the hy-
pothesis at the 2% level of significance.
Problem 48.8
A Type II error is committed when
(a) we reject a null hypothesis that is true.
(b) we don’t reject a null hypothesis that is true.
(c) we reject a null hypothesis that is false.
(d) we don’t reject a null hypothesis that is false.
The Empirical Distribution
for Complete Data
Let X be the random variable representing the losses incurred by the pol-
icyholders. Find the empirical distribution probability function and the
empirical distribution function of X.
Solution.
The pmf is given by the table below.
x      49    50    60    75    80    120   130
p(x)   1/9   1/3   1/9   1/9   1/9   1/9   1/9
Let $s_i$ denote the number of times the value $y_i$ appears in the sample.
Clearly, $\sum_{i=1}^{k} s_i = n$. Next, for each $1 \le j \le k$, let $r_j = \sum_{i=j}^{k} s_i$. That
is, $r_j$ is the number of observations greater than or equal to $y_j$. The set
of observations greater than or equal to $y_j$ is called the risk set (when
listing the elements of this set, repeated observations must be listed). The
convention is to also call $r_j$ the risk set. Using the notation $r_j$, we can
express the edf as follows:
\[
F_n(x) = \begin{cases}
0, & x < y_1 \\
1 - \frac{r_j}{n}, & y_{j-1} \le x < y_j,\ j = 2, \cdots, k \\
1, & x \ge y_k.
\end{cases}
\]
Example 49.2
Determine the edf of Example 49.1 using the previous paragraph.
Solution.
We have the following chart:
j yj sj rj
1 49 1 9
2 50 3 8
3 60 1 5
4 75 1 4
5 80 1 3
6 120 1 2
7 130 1 1
Using the above chart, we find
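\[
F_n(x) = \begin{cases}
0, & x < 49 \\
1 - \frac{8}{9} = \frac{1}{9}, & 49 \le x < 50 \\
1 - \frac{5}{9} = \frac{4}{9}, & 50 \le x < 60 \\
1 - \frac{4}{9} = \frac{5}{9}, & 60 \le x < 75 \\
1 - \frac{3}{9} = \frac{2}{3}, & 75 \le x < 80 \\
1 - \frac{2}{9} = \frac{7}{9}, & 80 \le x < 120 \\
1 - \frac{1}{9} = \frac{8}{9}, & 120 \le x < 130 \\
1, & x \ge 130
\end{cases}
\]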
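As a cross-check of the chart and the edf, here is a minimal sketch that rebuilds $s_j$, $r_j$ and $F_n(y_j)$ from raw observations. The sample below is an assumption: it is the one implied by the $s_j$ column of the chart (49 once, 50 three times, and so on). The function name `empirical_cdf_table` is hypothetical.

```python
from collections import Counter
from fractions import Fraction

def empirical_cdf_table(sample):
    """Return rows (y_j, s_j, r_j, F_n(y_j)) for complete individual data."""
    n = len(sample)
    counts = Counter(sample)
    rows = []
    remaining = n  # observations greater than or equal to the current y_j
    for y in sorted(counts):
        s = counts[y]
        r = remaining
        remaining -= s
        # F_n(y_j) = 1 - r_{j+1}/n, where r_{j+1} = r_j - s_j
        rows.append((y, s, r, 1 - Fraction(remaining, n)))
    return rows

losses = [49, 50, 50, 50, 60, 75, 80, 120, 130]  # assumed sample
table = empirical_cdf_table(losses)
for row in table:
    print(row)
```

Exact fractions are used so the output can be compared directly with the values in the solution.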
Since the empirical model is a discrete model, the derivative required to
create the density and hazard rate functions cannot be taken. The best one
can do is to estimate the cumulative hazard rate function defined by:
\[
H(x) = -\ln S(x) = -\ln[1 - F(x)].
\]
Note that once an estimate of $H(x)$ is found, we can find estimates for
$F(x) = 1 - e^{-H(x)}$. An estimate of the cumulative hazard rate function is
the Nelson-Åalen estimate given by:
\[
\hat{H}(x) = \begin{cases}
0, & x < y_1 \\
\sum_{i=1}^{j-1} \frac{s_i}{r_i}, & y_{j-1} \le x < y_j,\ j = 2, 3, \cdots, k \\
\sum_{i=1}^{k} \frac{s_i}{r_i}, & x \ge y_k.
\end{cases}
\]
Example 49.3
Determine the Nelson-Åalen estimate for Example 49.1.
Solution.
We have
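\[
\hat{H}(x) = \begin{cases}
0, & x < 49 \\
\frac{1}{9}, & 49 \le x < 50 \\
\frac{1}{9} + \frac{3}{8} = \frac{35}{72}, & 50 \le x < 60 \\
\frac{35}{72} + \frac{1}{5} = \frac{247}{360}, & 60 \le x < 75 \\
\frac{247}{360} + \frac{1}{4} = \frac{337}{360}, & 75 \le x < 80 \\
\frac{337}{360} + \frac{1}{3} = \frac{457}{360}, & 80 \le x < 120 \\
\frac{457}{360} + \frac{1}{2} = \frac{637}{360}, & 120 \le x < 130 \\
\frac{637}{360} + \frac{1}{1} = \frac{997}{360}, & x \ge 130
\end{cases}
\]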
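The running sums in this solution can be reproduced with exact fractions. This is a minimal sketch; the function name `nelson_aalen` is illustrative, and the inputs are the $y_j$, $s_j$, $r_j$ columns of the chart in Example 49.2.

```python
from fractions import Fraction

def nelson_aalen(ys, ss, rs):
    """Return pairs (y_j, H(y_j)) with H(y_j) = sum over i <= j of s_i / r_i."""
    total = Fraction(0)
    out = []
    for y, s, r in zip(ys, ss, rs):
        total += Fraction(s, r)
        out.append((y, total))
    return out

ys = [49, 50, 60, 75, 80, 120, 130]
ss = [1, 3, 1, 1, 1, 1, 1]
rs = [9, 8, 5, 4, 3, 2, 1]
H = nelson_aalen(ys, ss, rs)
for y, h in H:
    print(y, h)
```

Each value applies on the interval from $y_j$ up to (but not including) the next distinct observation; an estimate of the survival function then follows from $e^{-\hat{H}(x)}$.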
Example 49.4 ‡
A portfolio of policies has produced the following claims:
100 100 100 200 300 300 300 400 500 600
Solution.
We have
j yj sj rj
1 100 3 10
2 200 1 7
3 300 3 6
4 400 1 3
5 500 1 2
6 600 1 1
Practice Problems
Problem 49.1
Twelve policyholders were monitored from the starting date of the policy to
the time of first claim. The observed data are as follows:
Time of first claim 1 2 3 4 5 6 7
Number of claims 2 1 2 2 1 2 2
Calculate p12 (x) and F12 (x).
Problem 49.2
Twelve policyholders were monitored from the starting date of the policy to
the time of first claim. The observed data are as follows:
Time of first claim 1 2 3 4 5 6 7
Number of claims 2 1 2 2 1 2 2
Find the empirical mean and the empirical variance.
Problem 49.3
Twelve policyholders were monitored from the starting date of the policy to
the time of first claim. The observed data are as follows:
Time of first claim 1 2 3 4 5 6 7
Number of claims 2 1 2 2 1 2 2
(a) Find the cumulative hazard function from the Nelson-Åalen estimate.
(b) Find the survival distribution function from the Nelson-Åalen esti-
mate.
Problem 49.4
Below are the losses suffered by policyholders of an insurance company:
49, 50, 50, 50, 60, 75, 80, 120, 230.
Let X be the random variable representing the losses incurred by the poli-
cyholders. Find the empirical survival function.
Problem 49.5
Below are the losses suffered by policyholders of an insurance company:
49, 50, 50, 50, 60, 75, 80, 120, 130.
Let X be the random variable representing the losses incurred by the pol-
icyholders. For the observation 50, find the number of elements in the
associated risk set.
Problem 49.6 ‡
You are given a random sample of 10 claims consisting of two claims of
400, seven claims of 800, and one claim of 1600. Determine the empirical
skewness coefficient.
Problem 49.7 ‡
You are given the following about 100 insurance policies in a study of time
to policy surrender:
(i) The study was designed in such a way that for every policy that was
surrendered, a new policy was added, meaning that the risk set, rj , is always
equal to 100.
(ii) Policies are surrendered only at the end of a policy year.
(iii) The number of policies surrendered at the end of each policy year was
observed to be:
Example 50.1
Given the following grouped data.
Solution.
We have $c_0 = 0$, $c_1 = 2$, $c_2 = 10$, $c_3 = 100$, and $c_4 = 1000$. We first evaluate
\begin{align*}
F_{50}(0) &= 0 \\
F_{50}(2) &= \frac{25}{50} = 0.5 \\
F_{50}(10) &= \frac{35}{50} = 0.7 \\
F_{50}(100) &= \frac{45}{50} = 0.9 \\
F_{50}(1000) &= 1.
\end{align*}
Note that $F_n'(x)$ exists for all $x \ne c_j$, $j = 0, 1, \cdots, k$. Therefore, the density
function can be obtained via the formula
\[
f_n(x) = \frac{F_n(c_j) - F_n(c_{j-1})}{c_j - c_{j-1}} = \frac{n_j}{n(c_j - c_{j-1})}, \quad c_{j-1} \le x < c_j,\ j = 1, 2, \cdots, k.
\]
Example 50.2
Find the density function in Example 50.1.
Solution.
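The density function is
\[
f_{50}(x) = \begin{cases}
\frac{1}{4}, & 0 \le x < 2 \\
\frac{1}{40}, & 2 \le x < 10 \\
\frac{1}{450}, & 10 \le x < 100 \\
\frac{1}{9000}, & 100 \le x < 1000 \\
\text{undefined}, & x \ge 1000
\end{cases}
\]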
Example 50.3
Find E(X ∧ 250) in Example 50.1
50 EMPIRICAL DISTRIBUTION OF GROUPED DATA 359
Solution.
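We have
\begin{align*}
E(X \wedge 250) &= \int_0^{250} x f_{50}(x)\,dx + 250[1 - F_{50}(250)] \\
&= \int_0^{2} \frac{x}{4}\,dx + \int_2^{10} \frac{x}{40}\,dx + \int_{10}^{100} \frac{x}{450}\,dx \\
&\quad + \int_{100}^{250} \frac{x}{9000}\,dx + 250[1 - 0.9167] \\
&= 0.5 + 1.2 + 11 + 2.9167 + 20.825 \\
&= 36.4417
\end{align*}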
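The ogive and limited expected value computations for grouped data can be sketched as follows. This is an illustration under stated assumptions: the class counts 25, 10, 10, 5 are inferred from the cumulative values $F_{50}(c_j)$ above, and the function names are hypothetical. Without intermediate rounding the result is 36.45; the 36.4417 in the solution comes from rounding $F_{50}(250)$ to 0.9167.

```python
def ogive(x, bounds, counts):
    """Empirical cdf for grouped data, linear between class boundaries."""
    n = sum(counts)
    if x <= bounds[0]:
        return 0.0
    cum = 0
    for lo, hi, k in zip(bounds[:-1], bounds[1:], counts):
        if x <= hi:
            return (cum + k * (x - lo) / (hi - lo)) / n
        cum += k
    return 1.0

def limited_expected_value(u, bounds, counts):
    """E(X ^ u) under the histogram density (uniform within each class)."""
    n = sum(counts)
    total = 0.0
    for lo, hi, k in zip(bounds[:-1], bounds[1:], counts):
        top = min(hi, u)
        if top > lo:
            density = k / (n * (hi - lo))
            total += density * (top**2 - lo**2) / 2  # integral of x*f over [lo, top]
    return total + u * (1 - ogive(u, bounds, counts))

bounds = [0, 2, 10, 100, 1000]
counts = [25, 10, 10, 5]  # inferred from F_50(c_j)
lev = limited_expected_value(250, bounds, counts)
print(round(ogive(250, bounds, counts), 4), round(lev, 4))
```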
Example 50.4 ‡
You are given
Solution.
We have
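\begin{align*}
E(X^2) - E[(X \wedge 150)^2] &= \int_0^{200} x^2 f_{74}(x)\,dx - \left[\int_0^{150} x^2 f_{74}(x)\,dx + 150^2 \int_{150}^{200} f_{74}(x)\,dx\right] \\
&= \int_{150}^{200} (x^2 - 150^2) f_{74}(x)\,dx \\
&= \int_{150}^{200} (x^2 - 150^2)\,\frac{6}{7400}\,dx \\
&= \frac{6}{7400}\left[\frac{x^3}{3} - 150^2 x\right]_{150}^{200} = 337.84
\end{align*}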
Practice Problems
Problem 50.1 ‡
You are given
Claim size Number of claims
(0, 25] 30
(25, 50] 32
(50, 100] 20
(100, 200] 8
Assume a uniform distribution of claim sizes within each interval.
(a) Estimate the mean of the claim size distribution.
(b) Estimate the second raw moment of the claim size distribution.
Problem 50.2 ‡
For 500 claims, you are given the following distribution:
Claim size Number of claims
(0, 500] 200
(500, 1000] 110
(1000, 2000] x
(2000, 5000] y
(5000, 10000] ?
(10000, 25000] ?
(25000, ∞) ?
You are also given the following values taken from the ogive: F500 (1500) =
0.689 and F500 (3500) = 0.839. Determine y.
Problem 50.3 ‡
A random sample of payments from a portfolio of policies resulted in the
following:
Claim size Number of claims
(0, 50] 36
(50, 150] x
(150, 250] y
(250, 500] 84
(500, 1000] 80
(1000, ∞) 0
Suppose that Fn (90) = 0.21 and Fn (210) = 0.51. Find the value of x.
Problem 50.4 ‡
You are given the following information regarding claim sizes for 100 claims:
Use the ogive to estimate the probability that a randomly chosen claim is
between 2000 and 6000.
Problem 50.5 ‡
You are given:
(i)
363
364 ESTIMATION OF INCOMPLETE DATA
An important element of the estimation is the risk set which is the subject
of this section.
The most common modified data are the left truncated and right censored
observations. Left truncation usually occurs when a policy has an ordinary
deductible d (see Section 31). Right censoring occurs with a policy limit
(see Section 34). In what follows we will just use the term truncated to refer
to left truncated observation and we use the term censored to mean right
censored.
12
Each observation will have an assigned value of d and either (but not both) a value
of x or u.
51 THE RISK SET OF INCOMPLETE DATA 365
\begin{align*}
r_j &= \sum_i I(x_i \ge y_j) + \sum_i I(u_i \ge y_j) - \sum_i I(d_i \ge y_j) \\
&= \sum_i I(d_i < y_j) - \sum_i I(x_i < y_j) - \sum_i I(u_i < y_j)
\end{align*}
• For survival/mortality data13 , the risk set is the number of people ob-
served alive at age yj .
• For loss amount data, the risk set is the number of policies with observed
loss amounts (either the actual amount or the maximum amount due to a
policy limit) larger than or equal to yj less those with deductibles greater
than or equal to yj .
\[
r_j = r_{j-1} + \sum_i I(y_{j-1} \le d_i < y_j) - \sum_i I(x_i = y_{j-1}) - \sum_i I(y_{j-1} \le u_i < y_j)
\]
Example 51.1
You are given the following mortality table:
13
In a typical mortality study, the following notation is used for an individual i : di will
denote the time the individual joined the study; ui will denote the time of withdrawal
from the study; and xi will denote the time of death of the individual.
i di xi ui
1
2
3
4
5
6
7
8
9
10
Solution.
i di xi ui
1 0 − 4
2 0 0.5 −
3 0 − 1
4 0 − 4
5 1 − 4
6 1.2 2 −
7 1.5 2 −
8 2 − 3
9 2.5 − 4
10 3.1 3.2 −
Example 51.2
Create a table summarizing yj , sj , and rj of Example 51.1
Solution.
The table is given below.
j yj sj rj
1 0.5 1 4+6−6=4
2 2 2 3+5−3=5
3 3.2 1 1+4−0=5
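The table can be reproduced from the $(d_i, x_i, u_i)$ records of Example 51.1. The sketch below uses the second form of the risk set formula (entries before $y_j$ minus deaths and censorings before $y_j$); `None` stands for the "−" entries, and the function name `risk_set` is an illustrative assumption.

```python
def risk_set(y, records):
    """Number at risk at time y; each record is (d, x, u) with one of x, u set."""
    entered = sum(1 for d, x, u in records if d < y)
    # An observation has left if its death time x or censoring time u is before y.
    left = sum(1 for d, x, u in records if (x if x is not None else u) < y)
    return entered - left

records = [
    (0, None, 4), (0, 0.5, None), (0, None, 1), (0, None, 4), (1, None, 4),
    (1.2, 2, None), (1.5, 2, None), (2, None, 3), (2.5, None, 4), (3.1, 3.2, None),
]
death_times = sorted({x for _, x, _ in records if x is not None})
table = [
    (y, sum(1 for _, x, _ in records if x == y), risk_set(y, records))
    for y in death_times
]
for row in table:
    print(row)  # (y_j, s_j, r_j)
```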
Practice Problems
Problem 51.1
For a ground up loss of amount X, an insurer pays the loss in excess of a
deductible d, and with a policy limit of u. Which of the following statements
is true regarding censoring and truncation.
Problem 51.2
Which of the following statements is true?
Problem 51.3
You are given the following mortality table:
Life Time of Entry Time of exit Reason of exit
1 0 0.2 Lapse
2 0 0.3 Lapse
3 0 0.5 Lapse
4 0 0.5 Death
5 1 0.7 Lapse
6 1.2 1.0 Death
7 1.5 2.0 Lapse
8 2 2.5 Death
9 2.5 3.0 Lapse
10 3.1 3.5 Death
11 0 4.0 Expiry of Study
12 0 4.0 Expiry of Study
13 0 4.0 Expiry of Study
14 0 4.0 Expiry of Study
15 0 4.0 Expiry of Study
Problem 51.4
You are given the following
i di xi ui
1 1 − 6
2 0 4 −
3 2 4 −
4 6 8 −
5 0 − 5
Create a table summarizing yj , sj , and rj .
Problem 51.5
You are given the following
j dj xj uj
1 0 0.9 −
2 0 − 1.2
3 0 1.5 −
4 0 − 1.5
5 0 − 1.6
6 0 1.7 −
7 0 − 1.7
8 1.3 2.1 −
9 1.5 2.1 −
10 1.6 − 2.3
Create a table summarizing yj , sj , and rj .
Example 52.1
Find the Kaplan-Meier estimate of the survival function for the data in
Example 51.1
Solution.
We have
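\[
S_{10}(t) = \begin{cases}
1, & 0 \le t < 0.5 \\
1 - \frac{s_1}{r_1} = 0.75, & 0.5 \le t < 2 \\
0.75\left(1 - \frac{s_2}{r_2}\right) = 0.45, & 2 \le t < 3.2 \\
0.45\left(1 - \frac{s_3}{r_3}\right) = 0.36, & t \ge 3.2
\end{cases}
\]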
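The product-limit calculation is a running product of the factors $1 - s_i/r_i$. A minimal sketch, with the hypothetical function name `kaplan_meier` and the $s_j$, $r_j$ values taken from the table of Example 51.2:

```python
def kaplan_meier(ss, rs):
    """Running product of (1 - s_i / r_i); entry j applies on [y_j, y_{j+1})."""
    surv = 1.0
    out = []
    for s, r in zip(ss, rs):
        surv *= 1 - s / r
        out.append(surv)
    return out

S = kaplan_meier([1, 2, 1], [4, 5, 5])
print([round(v, 4) for v in S])
```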
Let X be the random variable representing the losses incurred by the poli-
cyholders. The observations are not truncated or censored. Use a Kaplan-
Meier product-limit estimator to approximate the survival function for this
data.
Solution.
We have
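\[
S_9(t) = \begin{cases}
1, & 0 \le t < 49 \\
1 - \frac{2}{10} = 0.8, & 49 \le t < 50 \\
0.8\left(1 - \frac{3}{8}\right) = 0.5, & 50 \le t < 60 \\
0.5\left(1 - \frac{1}{5}\right) = 0.4, & 60 \le t < 75 \\
0.4\left(1 - \frac{1}{4}\right) = 0.3, & 75 \le t < 80 \\
0.3\left(1 - \frac{1}{3}\right) = 0.2, & 80 \le t < 120 \\
0.2\left(1 - \frac{1}{2}\right) = 0.1, & 120 \le t < 130 \\
0.1\left(1 - \frac{1}{1}\right) = 0, & t \ge 130.
\end{cases}
\]
Note that this is the same as the empirical survival function for this data
set.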
Example 52.3 ‡
You are studying the length of time attorneys are involved in settling bod-
ily injury lawsuits. T represents the number of months from the time an
attorney is assigned such a case to the time the case is settled. Nine cases
were observed during the study period, two of which were not settled at
the conclusion of the study. For those two cases, the time spent up to the
conclusion of the study, 4 months and 6 months, was recorded instead. The
observed values of T for the other seven cases are as follows:
1 3 3 5 8 8 9.
Estimate Pr(3 ≤ T ≤ 5) using the Product-Limit estimator.
Solution.
We have the following charts
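j   dj   xj   uj
1   0    1    −
2   0    3    −
3   0    3    −
4   0    −    4
5   0    5    −
6   0    −    6
7   0    8    −
8   0    8    −
9   0    9    −

j   yj   sj   rj
1   1    1    9
2   3    2    8
3   5    1    5
4   8    2    3
5   9    1    1

The Kaplan-Meier estimator of S(t) is given by
\[
\hat{S}(t) = \begin{cases}
1, & 0 \le t < 1 \\
1 - \frac{1}{9} = \frac{8}{9}, & 1 \le t < 3 \\
\frac{8}{9}\left(1 - \frac{2}{8}\right) = \frac{2}{3}, & 3 \le t < 5 \\
\frac{2}{3}\left(1 - \frac{1}{5}\right) = \frac{8}{15}, & 5 \le t < 8 \\
\frac{8}{15}\left(1 - \frac{2}{3}\right) = \frac{8}{45}, & 8 \le t < 9 \\
\frac{8}{45}\left(1 - \frac{1}{1}\right) = 0, & t \ge 9.
\end{cases}
\]
Now, we have
\begin{align*}
\Pr(3 \le T \le 5) &= \Pr(T \ge 3) - \Pr(T > 5) \\
&= \hat{S}(3^-) - \hat{S}(5) \\
&= \frac{8}{9} - \frac{8}{15} = 0.356
\end{align*}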
Example 52.4 ‡
The claim payments on a sample of ten policies are:
2 3 3 5 5+ 6 7 7+ 9 10+
where the "+" indicates that the loss exceeded the policy limit.
Using the Product-Limit estimator, calculate the probability that the loss
on a policy exceeds 8.
52 THE KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS 373
Solution.
We have the following charts
j    dj   xj   uj
1    0    2    −
2    0    3    −
3    0    3    −
4    0    5    −
5    0    −    5
6    0    6    −
7    0    7    −
8    0    −    7
9    0    9    −
10   0    −    10

j   yj   sj   rj   Ŝ(yj)
1   2    1    10   1 − 1/10 = 0.9
2   3    2    9    0.9(1 − 2/9) = 0.7
3   5    1    7    0.7(1 − 1/7) = 0.6
4   6    1    5    0.6(1 − 1/5) = 0.48
5   7    1    4    0.48(1 − 1/4) = 0.36
6   9    1    2    0.36(1 − 1/2) = 0.18

Since $y_5 \le 8 < y_6$, we have $\hat{S}(8) = \hat{S}(7) = 0.36$.
Example 52.6
For a mortality study with right-censored data, you are given:
Time Number of deaths Number at risk
tj sj rj
3 1 50
5 3 49
6 5 k
10 7 21
You are also told that the Nelson-Åalen estimate of the survival function at
time 10 is 0.575. Determine k.
Solution.
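From $0.575 = \hat{S}(10) = e^{-\hat{H}(10)}$ we find $\hat{H}(10) = -\ln 0.575 = 0.5534$. Thus,
\[
0.5534 = \sum_{t_i \le 10} \frac{s_i}{r_i} = \frac{1}{50} + \frac{3}{49} + \frac{5}{k} + \frac{7}{21}.
\]
Solving for $k$ gives $k = 36$.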
Example 52.8 ‡
You are given the following times of first claim for five randomly selected
auto insurance policies observed from time t = 0 :
1 2 3 4 5
You are later told that one of the five times given is actually the time of
policy lapse (i.e., terminated), but you are not told which one.
The smallest Product-Limit estimate of S(4), the probability that the first
claim occurs after time 4, would occur at the lapse time t0 . Find t0 .
Solution.
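If the time of policy lapse is at t = 1, then the at risk group at death time
2 is 4, at death time 3 is 3 and at death time 4 is 2, so that
\[
\hat{S}(4) = \prod_{i=1}^{4}\left(1 - \frac{s_i}{r_i}\right) = \left(1 - \frac{0}{5}\right)\left(1 - \frac{1}{4}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{2}\right) = 0.250.
\]
If the time of policy lapse is at t = 2, then the at risk group at death time
1 is 5, at death time 3 is 3 and at death time 4 is 2, so that
\[
\hat{S}(4) = \prod_{i=1}^{4}\left(1 - \frac{s_i}{r_i}\right) = \left(1 - \frac{1}{5}\right)\left(1 - \frac{0}{4}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{2}\right) = 0.267.
\]
If the time of policy lapse is at t = 3, then the at risk group at death time
1 is 5, at death time 2 is 4 and at death time 4 is 2, so that
\[
\hat{S}(4) = \prod_{i=1}^{4}\left(1 - \frac{s_i}{r_i}\right) = \left(1 - \frac{1}{5}\right)\left(1 - \frac{1}{4}\right)\left(1 - \frac{0}{3}\right)\left(1 - \frac{1}{2}\right) = 0.300.
\]
If the time of policy lapse is at t = 4, then the at risk group at death time
1 is 5, at death time 2 is 4 and at death time 3 is 3, so that
\[
\hat{S}(4) = \prod_{i=1}^{4}\left(1 - \frac{s_i}{r_i}\right) = \left(1 - \frac{1}{5}\right)\left(1 - \frac{1}{4}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{0}{2}\right) = 0.400.
\]
If the time of policy lapse is at t = 5, then the at risk group at death time
1 is 5, at death time 2 is 4, at death time 3 is 3, and at death time 4 is 2 so
that
\[
\hat{S}(4) = \prod_{i=1}^{4}\left(1 - \frac{s_i}{r_i}\right) = \left(1 - \frac{1}{5}\right)\left(1 - \frac{1}{4}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{2}\right) = 0.200.
\]
Thus, the smallest estimate, 0.200, occurs when the lapse time is $t_0 = 5$.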
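The five cases can also be enumerated mechanically. The sketch below is illustrative only (the function name `product_limit_at_4` is hypothetical): for each candidate lapse time it treats that observation as censored and all others as claims, then evaluates the product-limit estimate of S(4).

```python
def product_limit_at_4(lapse_time, times=(1, 2, 3, 4, 5)):
    """Product-limit estimate of S(4) when the policy at lapse_time is censored."""
    surv = 1.0
    at_risk = len(times)
    for t in times:
        if t > 4:
            break
        if t != lapse_time:          # a first claim (event) occurs at time t
            surv *= 1 - 1 / at_risk
        at_risk -= 1                 # the policy leaves the risk set either way
    return surv

estimates = {t0: product_limit_at_4(t0) for t0 in (1, 2, 3, 4, 5)}
best = min(estimates, key=estimates.get)
print({t: round(v, 3) for t, v in estimates.items()}, "smallest at t0 =", best)
```

The smallest value arises when the lapse is at time 5, because then every factor up to time 4 reduces the estimate.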
Practice Problems
Problem 52.1 ‡
For a mortality study with right-censored data, you are given:
Problem 52.2 ‡
You are given:
(i) The following data set:
2500 2500 2500 3617 3662 4517 5000 5000 6010 6932 7500 7500
(ii) Ĥ1 (7000) is the Nelson-Åalen estimate of the cumulative hazard rate
function calculated under the assumption that all of the observations in (i)
are uncensored.
(iii) Ĥ2 (7000) is the Nelson-Åalen estimate of the cumulative hazard rate
function calculated under the assumption that all occurrences of the values
2500, 5000 and 7500 in (i) reflect right-censored observations and that the
remaining observed values are uncensored.
Problem 52.3 ‡
For a mortality study of insurance applicants in two countries, you are given:
(i)
Country A Country B
yj sj rj sj rj
1 20 200 15 100
2 54 180 20 85
3 14 126 20 65
4 22 112 10 45
(ii) rj is the number at risk over the period (yj−1 , yj ). Deaths during the
period (yj−1 , yj ) are assumed to occur at yj .
(iii) $S^T(t)$ is the Product-Limit estimate of S(t) based on the data for all
study participants.
(iv) $S^B(t)$ is the Product-Limit estimate of S(t) based on the data for
country B.
Problem 52.4 ‡
For observation j of a survival study:
• dj is the left truncation point
• xj is the observed value if not right-censored
• uj is the observed value if right-censored.
You are given:
j dj xj uj
1 0 0.9 −
2 0 − 1.2
3 0 1.5 −
4 0 − 1.5
5 0 − 1.6
6 0 1.7 −
7 0 − 1.7
8 1.3 2.1 −
9 1.5 2.1 −
10 1.6 − 2.3
Problem 52.5 ‡
You are given:
(i) All members of a mortality study are observed from birth. Some leave
the study by means other than death.
(ii) s3 = 1, s4 = 3.
(iii) The following Kaplan-Meier product-limit estimates were obtained:
Sn (y3 ) = 0.65, Sn (y4 ) = 0.50, Sn (y5 ) = 0.25.
(iv) Between times y4 and y5 , six observations were censored.
(v) Assume no observations were censored at the times of deaths.
Determine s5 .
Problem 52.6 ‡
In a study of claim payment times, you are given:
(i) The data were not truncated or censored.
(ii) At most one claim was paid at any one time.
(iii) The Nelson-Åalen estimate of the cumulative hazard function, H(t),
immediately following the second paid claim, was 23/132.
Problem 52.7 ‡
You are given:
(i) The following is a sample of 15 losses: 11, 22, 22, 22, 36, 51, 69, 69, 69,
92, 92, 120, 161, 161, 230.
(ii) Ĥ1 (x) is the Nelson-Åalen empirical estimate of the cumulative hazard
rate function.
(iii) Ĥ2 (x) is the maximum likelihood estimate of the cumulative hazard
rate function under the assumption that the sample is drawn from an expo-
nential distribution.
Problem 52.8 ‡
For the data set
200 300 100 400 X
you are given:
(i) k = 4
(ii) s2 = 1
(iii) r4 = 1
(iv) The Nelson-Åalen estimate Ĥ(410) > 2.15.
Determine X.
53 MEAN AND VARIANCE OF EMPIRICAL ESTIMATORS WITH COMPLETE DATA
Individual Data
We first consider finding the variance of the empirical survival function of
individual data. Suppose that the sample is of size n. Let $S_n(x)$ be the
empirical estimate of the survival function $S(x)$ defined by
\[
S_n(x) = \frac{Y}{n}
\]
where Y is the number of observations in the sample that are greater than
x. If we regard an observation as a trial then we define a success to be the
observation that is greater than x and which occurs with probability S(x).
Then Y is a binomial random variable with parameters n and S(x) and with
mean E(Y ) = nS(x) and variance Var(Y ) = nS(x)(1 − S(x)). Thus,
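\[
E[S_n(x)] = \frac{E(Y)}{n} = \frac{nS(x)}{n} = S(x).
\]
This shows that the empirical estimate $S_n(x)$ is unbiased (see Section 46).
Moreover,
\[
\mathrm{Var}[S_n(x)] = \frac{1}{n^2}\mathrm{Var}(Y) = \frac{S(x)(1 - S(x))}{n} \to 0
\]
as $n \to \infty$. If $S(x)$ is unknown, then we can estimate the variance of $S_n(x)$ using $S_n(x)$
itself in the formula
\[
\widehat{\mathrm{Var}}[S_n(x)] = \frac{S_n(x)(1 - S_n(x))}{n}.
\]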
Example 53.1
Let X be a discrete random variable and p = Pr(a < X ≤ b). An estimate
of p is p̂ = Sn (a) − Sn (b). Show that p̂ is unbiased and consistent.
Solution.
We first note that $\hat{p} = \frac{Y}{n}$ where $Y$ is a binomial random variable with
parameters $n$ and $S(a) - S(b)$. Thus,
\[
E(\hat{p}) = \frac{1}{n}[n(S(a) - S(b))] = S(a) - S(b) = p.
\]
This shows that $\hat{p}$ is unbiased. Next, we show that $\hat{p}$ is consistent. Indeed,
we have
\[
\mathrm{Var}(\hat{p}) = \frac{1}{n^2}[n(S(a) - S(b))(1 - S(a) + S(b))] = \frac{p(1 - p)}{n} \to 0
\]
as $n \to \infty$. This shows that $\hat{p}$ is consistent. When $p$ is unknown, we have
\[
\widehat{\mathrm{Var}}(\hat{p}) = \frac{\hat{p}(1 - \hat{p})}{n}
\]
Next, we consider survival probability estimators. Recall (see [3]) that the
probability of a life aged x to attain age x + t is the conditional probability
\[
{}_{y-x}\hat{p}_x = \frac{S_n(y)}{S_n(x)} = \frac{n_y}{n_x}.
\]
Sn (x) nx
The variance is given by
Example 53.2
The following chart provides the time of death of 15 individuals under ob-
servation from time 0.
Time of Death 1 2 3 4 5
# of Deaths 1 3 2 4 5
Solution.
(a) We have $S_{15}(3) = \frac{9}{15} = 0.6$ and
Example 53.3
The following random sample of 9 losses has been observed from the distri-
bution of loss random variable X :
(a) Find the estimated variance of the estimate of Pr(X > 60).
(b) Find the estimated variance of the estimate of Pr(75 < X ≤ 120).
(c) Find the estimated variance of the estimate of Pr(X > 60|X > 50).
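Solution.
(a) Let $p_1 = \Pr(X > 60)$. Then $\hat{p}_1 = \frac{4}{9}$ and
$\widehat{\mathrm{Var}}(\hat{p}_1) = \frac{\frac{4}{9}\left(1 - \frac{4}{9}\right)}{9} = \frac{20}{729}$.
(b) Let $p_2 = \Pr(75 < X \le 120)$. Then $\hat{p}_2 = S_9(75) - S_9(120) = \frac{3}{9} - \frac{1}{9} = \frac{2}{9}$
and $\widehat{\mathrm{Var}}(\hat{p}_2) = \frac{\frac{2}{9}\left(1 - \frac{2}{9}\right)}{9} = \frac{14}{729}$.
(c) Let $p_3 = \Pr(X > 60 \mid X > 50) = {}_{10}p_{50}$. Then $\hat{p}_3 = \frac{S_9(60)}{S_9(50)} = \frac{4}{5}$ and
$\widehat{\mathrm{Var}}(\hat{p}_3) = \frac{4}{125}$.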
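The three estimated variances can be checked with exact fractions. This is a sketch under an assumption: the nine losses below are not listed in the example statement as reproduced here; they are the sample 49, 50, 50, 50, 60, 75, 80, 120, 130 used earlier in the chapter, which is consistent with every empirical value quoted in the solution. For part (c) the binomial variance is applied conditionally on the 5 observations exceeding 50.

```python
from fractions import Fraction

losses = [49, 50, 50, 50, 60, 75, 80, 120, 130]  # assumed sample
n = len(losses)

def count_gt(x):
    """Number of observations strictly greater than x."""
    return sum(1 for v in losses if v > x)

def binomial_var(p, size):
    """Estimated variance p(1 - p)/size of a binomial proportion."""
    return p * (1 - p) / size

p1 = Fraction(count_gt(60), n)
var1 = binomial_var(p1, n)

p2 = Fraction(count_gt(75) - count_gt(120), n)
var2 = binomial_var(p2, n)

n50 = count_gt(50)                    # observations exceeding 50
p3 = Fraction(count_gt(60), n50)
var3 = binomial_var(p3, n50)          # conditional on n50

print(var1, var2, var3)
```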
Grouped Data
We first use the ogive to find the variance of the estimator of the survival
function and the density function. Let n observations be spread over the
grouped data
\[
(c_0, c_1],\ (c_1, c_2],\ \cdots,\ (c_{k-1}, c_k]
\]
\[
S_n(c_j) = 1 - \frac{n_1 + n_2 + \cdots + n_j}{n}.
\]
\[
S_n(x) = \frac{c_j - x}{c_j - c_{j-1}} S_n(c_{j-1}) + \frac{x - c_{j-1}}{c_j - c_{j-1}} S_n(c_j).
\]
where
Example 53.4
Using a histogram (see Section 50), the empirical density function can be
expressed as
\[
f_n(x) = \frac{Z}{n(c_j - c_{j-1})}
\]
Solution.
The random variable Z is a binomial random variable with parameters
(n, S(cj−1 ) − S(cj )). Thus,
Example 53.5
Find Var[fn (x)], where fn (x) is the empirical density function in the previous
example.
Solution.
We have
Example 53.6
Given the following grouped data.
Estimate the probability that a loss will be no more than 90, and find the
estimated variance of that estimate.
Solution.
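An estimate of the probability that a loss will be no more than 90 is
\begin{align*}
1 - S_{50}(90) &= 1 - \frac{c_j - x}{c_j - c_{j-1}} S_n(c_{j-1}) - \frac{x - c_{j-1}}{c_j - c_{j-1}} S_n(c_j) \\
&= 1 - \frac{100 - 90}{100 - 10}\cdot\frac{15}{50} - \frac{90 - 10}{100 - 10}\cdot\frac{5}{50} \\
&= 1 - \frac{1}{9}\cdot\frac{3}{10} - \frac{8}{9}\cdot\frac{1}{10} \\
&= 0.8778.
\end{align*}
\begin{align*}
\widehat{\mathrm{Var}}(Y) &= n S_n(c_{j-1})[1 - S_n(c_{j-1})] \\
&= 50\cdot\frac{15}{50}\left[1 - \frac{15}{50}\right] = 10.5 \\
\widehat{\mathrm{Var}}(Z) &= n[S_n(c_{j-1}) - S_n(c_j)][1 - S_n(c_{j-1}) + S_n(c_j)] \\
&= 50\left[\frac{10}{50}\right]\left[1 - \frac{10}{50}\right] = 8 \\
\widehat{\mathrm{Cov}}(Y, Z) &= -n[1 - S_n(c_{j-1})][S_n(c_{j-1}) - S_n(c_j)] \\
&= -50\left[1 - \frac{15}{50}\right]\frac{10}{50} = -7 \\
\widehat{\mathrm{Var}}[1 - S_{50}(90)] &= \widehat{\mathrm{Var}}[S_{50}(90)] \\
&= \frac{(c_j - c_{j-1})^2\,\widehat{\mathrm{Var}}(Y) + (x - c_{j-1})^2\,\widehat{\mathrm{Var}}(Z)}{[n(c_j - c_{j-1})]^2} \\
&\quad + \frac{2(c_j - c_{j-1})(x - c_{j-1})\,\widehat{\mathrm{Cov}}(Y, Z)}{[n(c_j - c_{j-1})]^2} \\
&= 0.00175
\end{align*}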
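The final value can be verified by plugging the intermediate quantities into the variance formula. A minimal sketch follows; the function name and argument names are illustrative, with `s_lo` and `s_hi` standing for $S_n(c_{j-1})$ and $S_n(c_j)$.

```python
def ogive_survival_variance(x, c_lo, c_hi, s_lo, s_hi, n):
    """Estimated variance of the ogive-based S_n(x) for c_lo <= x <= c_hi."""
    width = c_hi - c_lo
    var_y = n * s_lo * (1 - s_lo)                  # observations at or below c_lo
    var_z = n * (s_lo - s_hi) * (1 - s_lo + s_hi)  # observations in (c_lo, c_hi]
    cov_yz = -n * (1 - s_lo) * (s_lo - s_hi)
    numerator = (width**2 * var_y
                 + (x - c_lo)**2 * var_z
                 + 2 * width * (x - c_lo) * cov_yz)
    return numerator / (n * width)**2

v = ogive_survival_variance(90, 10, 100, 15 / 50, 5 / 50, 50)
print(round(v, 5))
```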
Practice Problems
Problem 53.1
Let X be a discrete random variable. In a sample of n outcomes, let Nj
denote the number of times the value xj was observed in the sample with
corresponding probability p(xj ). Nj can be regarded as a binomial random
variable with parameters (n, p(xj )).
Problem 53.2
Given the following grouped data.
Estimate the density function of the loss random variable at x = 90, and
find the estimated variance of the estimator.
Problem 53.3
Consider the following data
Estimate the probability that a driver will have two accidents and find the
estimate of the variance of that estimator.
Problem 53.4
Estimated variances can be used to create confidence intervals for the true
Construct an approximate 95% confidence interval for p(2) of the previous
exercise.
54 GREENWOOD ESTIMATE FOR THE VARIANCE OF THE KAPLAN-MEIER ESTIMATOR
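\[
S_n(t) = \hat{S}(t) = \begin{cases}
1, & 0 \le t < y_1 \\
\prod_{i=1}^{j-1}\left(1 - \frac{s_i}{r_i}\right), & y_{j-1} \le t < y_j,\ j = 2, 3, \cdots, k \\
\prod_{i=1}^{k}\left(1 - \frac{s_i}{r_i}\right) \text{ or } 0, & t \ge y_k.
\end{cases}
\]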
In what follows, y0 < y1 will denote the smallest alive or observed age in the
sample. Assume that the rj s and yj s are fixed. The number of individuals
who died at time yj is the only random quantity, which we denote by Sj .
As a random variable, Sj has a binomial distribution based on a sample of
rj and a probability of success
and
\[
\mathrm{Var}\left(1 - \frac{S_j}{r_j}\right) = \frac{1}{r_j^2}\mathrm{Var}(S_j) = \frac{[S(y_{j-1}) - S(y_j)]\,S(y_j)}{r_j\,S(y_{j-1})^2}.
\]
Example 54.1
Assume that the Sj s are independent. Show that
\[
E[S_n(y_j)] = \frac{S(y_j)}{S(y_0)} \qquad (54.2)
\]
S(y0 )
Solution.
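We have
\begin{align*}
E[S_n(y_j)] &= E\left[\prod_{i=1}^{j}\left(1 - \frac{S_i}{r_i}\right)\right] = \prod_{i=1}^{j} E\left(1 - \frac{S_i}{r_i}\right) \\
&= \prod_{i=1}^{j} \frac{S(y_i)}{S(y_{i-1})} = \frac{S(y_j)}{S(y_0)}.
\end{align*}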
Because the survival function is usually unknown, we use (54.1) and (54.2)
to write
\[
\left[\frac{S(y_j)}{S(y_0)}\right]^2 \approx [S_n(y_j)]^2 \quad \text{and} \quad \frac{S(y_i)}{S(y_{i-1})} \approx 1 - \frac{s_i}{r_i}
\]
Remark 54.1
For non-death ages, the convention is to take the sum up to the last death
age that is less than or equal to the age under consideration.
Example 54.2 ‡
For a survival study with censored and truncated data, you are given:
Solution.
Let ${}_3p_1$ be the probability that an individual who has survived past time
1 will also survive past time 4. Then ${}_3p_1 = 1 - {}_3q_1$, which implies
\[
\widehat{\mathrm{Var}}({}_3\hat{q}_1) = \widehat{\mathrm{Var}}({}_3\hat{p}_1).
\]
Solution.
We first create the following table summarizing yj , rj , and sj .
yj rj sj
4 10 2
8 5 1
12 2 1
15 1 1
Since 11 is an uncensored value (loss), the largest loss less than or equal to
11 is 8. Thus,
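\[
\widehat{\mathrm{Var}}(S_{10}(11)) = \widehat{\mathrm{Var}}(S_{10}(8)).
\]
The Kaplan-Meier estimate is
\[
S_{10}(8) = \left(1 - \frac{2}{10}\right)\left(1 - \frac{1}{5}\right) = \frac{16}{25}.
\]
Hence
\[
\widehat{\mathrm{Var}}(S_{10}(11)) = \left(\frac{16}{25}\right)^2\left[\frac{2}{10(10 - 2)} + \frac{1}{5(5 - 1)}\right] = 0.03072
\]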
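The calculation above has the form $[S_n(y_j)]^2 \sum_i \frac{s_i}{r_i(r_i - s_i)}$, summed over the death times up to 8. A minimal sketch of that computation, with the hypothetical function name `greenwood_variance` and the first two rows of the table as input:

```python
def greenwood_variance(ss, rs):
    """Return (Kaplan-Meier estimate, Greenwood variance) over the given death times."""
    surv = 1.0
    acc = 0.0
    for s, r in zip(ss, rs):
        surv *= 1 - s / r
        acc += s / (r * (r - s))
    return surv, surv**2 * acc

# Death times 4 and 8 are the only ones not exceeding 11.
s_hat, var_hat = greenwood_variance([2, 1], [10, 5])
print(round(s_hat, 4), round(var_hat, 5))
```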
Practice Problems
Problem 54.1
Let $X_1, X_2, \cdots, X_n$ be independent random variables. Show that
\[
\mathrm{Var}(X_1 X_2 \cdots X_n) = \prod_{i=1}^{n}(\mu_i^2 + \sigma_i^2) - \prod_{i=1}^{n}\mu_i^2.
\]
Problem 54.2
Show that if 0 < ai << 1 for i = 1, 2, · · · , n then
(1 + a1 )(1 + a2 ) · · · (1 + an ) ≈ 1 + a1 + a2 + · · · + an .
Problem 54.3
For a mortality study with right-censored data, you are given:
Problem 54.4
For a mortality study with right-censored data, you are given:
Problem 54.5 ‡
For 200 auto accident claims you are given:
(i) Claims are submitted t months after the accident occurs, where t =
0, 1, 2, · · · .
(ii) There are no censored observations.
Example 55.1
For a survival study with censored and truncated data, you are given:
Number at risk Failures at
Time (t) at time t time t
1 30 5
2 27 9
3 32 6
4 25 5
5 20 4
Solution.
The estimated variance is given by
\[
\widehat{\mathrm{Var}}(\hat{H}(y_4)) = \sum_{i=1}^{4}\frac{s_i}{r_i^2} = \frac{5}{30^2} + \frac{9}{27^2} + \frac{6}{32^2} + \frac{5}{25^2} = 0.0318
\]
We next look at computing the confidence intervals (see Section 47) of both
the Kaplan-Meier estimator and the Nelson-Åalen estimator. But first we
consider the following example.
Example 55.2
You are given:
• The Kaplan-Meier estimator: Ŝ(3) = 0.8667.
• The Greenwood approximation: $\widehat{\mathrm{Var}}(\hat{S}(3)) = 0.0077$.
• Ŝ(t) is approximated by a normal distribution.
Find the 95% confidence interval of Ŝ(3).
Solution.
Using normal approximation, the 95% confidence interval can be expressed
as
\[
0.8667 \pm 1.96\sqrt{0.0077}
\]
or in interval notation as
(0.6947, 1.0387).
Confidence intervals of this type are referred to as linear confidence
intervals.
Example 55.3
Obtain the log-transformed confidence interval to Ŝ(3) as in Example 55.2.
Solution.
We have
\[
U = \exp\left[\frac{1.96\sqrt{0.0077}}{0.8667\ln 0.8667}\right] \approx 0.2498.
\]
Thus, the confidence interval is
\[
\left(0.8667^{\frac{1}{0.2498}},\ 0.8667^{0.2498}\right) = (0.564, 0.9649)
\]
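Both interval types for the survival function can be computed side by side. The following is a minimal sketch using the figures of Example 55.2; the function name is illustrative. Note that the linear interval's upper limit exceeds 1, whereas the log-transformed interval stays inside (0, 1).

```python
import math

def survival_confidence_intervals(s_hat, var_hat, z=1.96):
    """Return (linear CI, log-transformed CI) for a survival probability."""
    half = z * math.sqrt(var_hat)
    linear = (s_hat - half, s_hat + half)
    u = math.exp(half / (s_hat * math.log(s_hat)))
    log_transformed = (s_hat ** (1 / u), s_hat ** u)
    return linear, log_transformed

linear, log_ci = survival_confidence_intervals(0.8667, 0.0077)
print([round(v, 4) for v in linear], [round(v, 4) for v in log_ci])
```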
Similar results are available for the Nelson-Åalen estimators. We define the
linear (1−α) confidence interval for the cumulative hazard rate function
by
\[
\hat{H}(t) \pm z_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\hat{H}(y_j))}, \quad y_j \le t < y_{j+1}.
\]
The corresponding log-transformed confidence interval is $(\hat{H}(t)/U,\ \hat{H}(t)U)$,
where
\[
U = \exp\left[z_{\alpha/2}\frac{\sqrt{\widehat{\mathrm{Var}}(\hat{H}(y_j))}}{\hat{H}(t)}\right].
\]
Example 55.4
You are given:
yj rj sj
1 50 4
2 53 5
3 32 9
4 45 11
5 20 2
Solution.
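(i) We have
\[
\hat{H}(5) = \hat{H}(y_5) = \sum_{j=1}^{5}\frac{s_j}{r_j} = 0.8
\]
and
\[
\widehat{\mathrm{Var}}(\hat{H}(5)) = \widehat{\mathrm{Var}}(\hat{H}(y_5)) = \sum_{j=1}^{5}\frac{s_j}{r_j^2} = 0.0226.
\]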
Example 55.5 ‡
A survival study gave (1.63, 2.55) as the 95% linear confidence interval for
the cumulative hazard function H(t0 ).
Calculate the 95% log-transformed confidence interval for H(t0 ).
Solution.
The interval (1.63, 2.55) has endpoints that can be written as 2.09 ± 0.46.
Thus, Ĥ(t0 ) = 2.09 and 1.96σ̂ = 0.46. Hence, σ̂ = 0.2347. For the log-
transformed confidence interval, we first find
\[
U = e^{1.96\left(\frac{0.2347}{2.09}\right)} = 1.2462.
\]
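The conversion from a linear to a log-transformed interval for $H(t_0)$ can be sketched as below. This assumes the standard log-transformed form $(\hat{H}/U,\ \hat{H}U)$; the function name is hypothetical. With the figures above it gives roughly (1.68, 2.60).

```python
import math

def hazard_log_interval_from_linear(lower, upper, z=1.96):
    """Convert a linear CI for H(t0) into the log-transformed CI."""
    h_hat = (lower + upper) / 2           # midpoint is the point estimate
    sigma = (upper - lower) / (2 * z)     # estimated standard deviation
    u = math.exp(z * sigma / h_hat)
    return h_hat / u, h_hat * u

lo, hi = hazard_log_interval_from_linear(1.63, 2.55)
print(round(lo, 3), round(hi, 3))
```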
Example 55.6 ‡
For a survival study, you are given:
(i) Deaths occurred at times y1 < y2 < · · · < y9 .
(ii) The Nelson-Åalen estimates of the cumulative hazard function at y3 and
y4 are
Ĥ(y3 ) = 0.4128 and Ĥ(y4 ) = 0.5691
(iii) The estimated variances of the estimates in (ii) are:
$\widehat{\mathrm{Var}}(\hat{H}(y_3)) = 0.009565$ and $\widehat{\mathrm{Var}}(\hat{H}(y_4)) = 0.014448$.
Example 55.7 ‡
You are given:
(i) Eight people join an exercise program on the same day. They stay in the
program until they reach their weight loss goal or switch to a diet program.
(ii) Experience for each of the eight members is shown below:
Time at which...
Member Reach Weight Loss Goal Switch to Diet Program
j xj uj
1 4
2 8
3 8
4 12
5 12
6 12
7 22
8 36
(iii) The variable of interest is time to reach weight loss goal.
Using the Nelson-Åalen estimator, calculate the upper limit of the symmetric
90% linear confidence interval for the cumulative hazard rate function H(12).
Solution.
Reaching weight loss goal is equivalent to death in mortality theory and
switching to a dieting program is considered a censored observation. We
have the following chart
j yj sj rj
1 8 1 7
2 12 2 5
3 22 1 2
4 36 1 1
By the Nelson-Åalen estimation, we have
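\[
\hat{H}(12) = \frac{s_1}{r_1} + \frac{s_2}{r_2} = \frac{1}{7} + \frac{2}{5} = 0.5429.
\]
The estimated variance of the Nelson-Åalen estimate is
\[
\widehat{\mathrm{Var}}[\hat{H}(12)] = \frac{s_1}{r_1^2} + \frac{s_2}{r_2^2} = \frac{1}{7^2} + \frac{2}{5^2} = 0.1004.
\]
The upper limit of the symmetric 90% linear confidence interval of $\hat{H}(12)$ is
\[
0.5429 + 1.645\sqrt{0.1004} = 1.06
\]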
Practice Problems
Problem 55.1 ‡
You are given the following information
yj rj sj
1 30 5
2 27 9
3 32 6
4 25 5
5 20 4
Problem 55.2 ‡
Obtain the 95% linear confidence interval and the 95% log-transformed con-
fidence interval in Problem 55.1
Problem 55.3 ‡
The interval (0.357, 0.700) is a 95% log-transformed confidence interval for
the cumulative hazard rate function at time t, where the cumulative hazard
rate function is estimated using the Nelson-Åalen estimator.
Problem 55.4 ‡
Twelve policyholders were monitored from the starting date of the policy to
the time of the first claim. The observed data are as follows.
Time of first claim 1 2 3 4 5 6 7
Number of claims 2 1 2 2 1 2 2
Using the Nelson-Åalen estimator, calculate the 95% linear confidence in-
terval for the cumulative hazard rate function H(4.5).
Problem 55.5 ‡
For a survival study, you are given:
(i) The Product-Limit estimator Sn (t0 ) is used to construct confidence in-
tervals for S(t0 ).
(ii) The 95% log-transformed confidence interval for S(t0 ) is (0.695,0.843).
Determine Sn (t0 ).
55 VARIANCE ESTIMATE OF THE NELSON-ÅALEN ESTIMATOR AND CONFIDENCE INTERVALS39
Problem 55.6 ‡
Obtain the 95% log-transformed confidence interval for H(3) in Example
55.1, based on the Nelson-Åalen estimate.
Problem 55.7 ‡
Fifteen cancer patients were observed from the time of diagnosis until the
earlier of death or 36 months from diagnosis. Deaths occurred during the
study as follows:
Next, we define the kernel density estimator, also known as the kernel
smoothed estimate, of the distribution function by
\[ \hat{F}(x) = \sum_{j=1}^{k} p(y_j) K_{y_j}(x) \]
and of the density function by
\[ \hat{f}(x) = \sum_{j=1}^{k} p(y_j) k_{y_j}(x). \]
In [1], only three types of kernels are considered: uniform, triangular, and
Gamma. The uniform kernel with bandwidth b is given by
\[ k_y(x) = \begin{cases} 0, & x < y - b, \\ \dfrac{1}{2b}, & y - b \le x \le y + b, \\ 0, & x > y + b. \end{cases} \]
Note that $k_{y_j}(x)$ is the pdf of a uniform distribution on the interval $[y_j - b, y_j + b]$.
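The definition above translates directly into code. The following is a minimal sketch under stated assumptions: the function names are illustrative, and the data are the ten ages at death used in the examples of this section, with bandwidth b = 10 and evaluation point x = 40.

```python
from collections import Counter

def uniform_kernel_pdf(x, y, b):
    """Uniform kernel k_y(x) with bandwidth b."""
    return 1.0 / (2 * b) if y - b <= x <= y + b else 0.0

def kernel_density(x, sample, b, kernel=uniform_kernel_pdf):
    """Kernel smoothed density: sum of p(y_j) * k_{y_j}(x) over distinct y_j."""
    n = len(sample)
    counts = Counter(sample)  # empirical probabilities p(y_j) = count / n
    return sum((c / n) * kernel(x, y, b) for y, c in counts.items())

ages = [25, 30, 35, 35, 37, 39, 45, 47, 49, 55]
f40 = kernel_density(40, ages, b=10)
print(f40)
```

Only the observations 25 and 55 lie more than 10 away from 40, so the remaining empirical weight of 0.8 is multiplied by 1/20.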
56 KERNEL DENSITY ESTIMATION 401
Example 56.1
You are given the following ages at time of death of 10 individuals:
25 30 35 35 37 39 45 47 49 55.
Solution.
With b = 10, we have that $k_y(40) = 0$ for y = 25 and y = 55. Thus, we have
\[ k_{30}(40) = k_{35}(40) = k_{37}(40) = k_{39}(40) = k_{45}(40) = k_{47}(40) = k_{49}(40) = \frac{1}{20}. \]
Hence,
Example 56.2
You are given the following ages at time of death of 10 individuals:
25 30 35 35 37 39 45 47 49 55.
Using a triangular kernel with bandwidth 10, find the kernel smoothed density
estimate $\hat{f}(40)$.
Solution.
The triangular kernel with bandwidth 10 is
\[ k_y(x) = \begin{cases} 0, & x < y - 10, \\ \dfrac{x - y + 10}{100}, & y - 10 \le x \le y, \\ \dfrac{y + 10 - x}{100}, & y \le x \le y + 10, \\ 0, & x > y + 10. \end{cases} \]
We first create the following chart:
y−b y y+b
15 25 35
20 30 40
25 35 45
27 37 47
29 39 49
35 45 55
37 47 57
39 49 59
45 55 65
We have
25 30 35 35 37 39 45 47 49 55.
Solution.
With α = 1, the kernel is expressed as follows:
\[ k_y(x) = \frac{e^{-x/y}}{y}. \]
Thus,
Example 56.4 ‡
You are given the kernel:
\[ k_y(x) = \begin{cases} \dfrac{2}{\pi}\sqrt{1 - (x - y)^2}, & y - 1 \le x \le y + 1 \\ 0, & \text{otherwise.} \end{cases} \]
1 3 3 5
Determine which of the following graphs shows the shape of the kernel den-
sity estimator.
Solution.
We are given that y1 = 1, y2 = 3, and y3 = 5. The empirical probabilities
are p(y1 ) = p(y3 ) = 0.25 and p(y2 ) = 0.5. The kernel densities are
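\[ k_1(x) = \begin{cases} \dfrac{2}{\pi}\sqrt{1 - (x - 1)^2}, & 0 \le x \le 2 \\ 0, & \text{otherwise.} \end{cases} \]
\[ k_3(x) = \begin{cases} \dfrac{2}{\pi}\sqrt{1 - (x - 3)^2}, & 2 \le x \le 4 \\ 0, & \text{otherwise.} \end{cases} \]
\[ k_5(x) = \begin{cases} \dfrac{2}{\pi}\sqrt{1 - (x - 5)^2}, & 4 \le x \le 6 \\ 0, & \text{otherwise.} \end{cases} \]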
The graphs of k1 (x), k3 (x), and k5 (x) are shown below. The kernel density
estimator is
\[ \hat{f}(x) = \sum_{i=1}^{3} p(y_i) k_{y_i}(x) = 0.25k_1(x) + 0.5k_3(x) + 0.25k_5(x). \]
The middle kernel carries twice the weight of the kernels on the left and
right, so the middle curve is twice as tall as the other two.
Example 56.5 ‡
You are given:
(i) The sample:
1 2 3 3 3 3 3 3 3 3
(ii) F̂1 (x) is the kernel density estimator of the distribution function using
a uniform kernel with bandwidth 1.
(iii) F̂2 (x) is the kernel density estimator of the distribution function using
a triangular kernel with bandwidth 1.
Determine the interval where F̂1 (x) = F̂2 (x).
Solution.
The empirical distribution of the data set is given by: $p(1) = \frac{1}{10} = 0.1$, $p(2) = \frac{1}{10} = 0.1$, and $p(3) = \frac{8}{10} = 0.8$.
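The kernel density estimator of the distribution function using a uniform
kernel with bandwidth 1 is
\[ \hat{F}_1(x) = 0.1K_1^u(x) + 0.1K_2^u(x) + 0.8K_3^u(x) \]
where
\[ K_1^u(x) = \begin{cases} 0, & x < 0 \\ \dfrac{x}{2}, & 0 \le x \le 2 \\ 1, & x > 2 \end{cases}, \quad K_2^u(x) = \begin{cases} 0, & x < 1 \\ \dfrac{x - 1}{2}, & 1 \le x \le 3 \\ 1, & x > 3 \end{cases}, \quad K_3^u(x) = \begin{cases} 0, & x < 2 \\ \dfrac{x - 2}{2}, & 2 \le x \le 4 \\ 1, & x > 4. \end{cases} \]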
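Thus,
\[ \hat{F}_1(x) = \begin{cases} 0, & x < 0 \\ 0.05x, & 0 \le x \le 1 \\ 0.1x - 0.05, & 1 \le x \le 2 \\ 0.45x - 0.75, & 2 \le x \le 3 \\ 0.4x - 0.6, & 3 \le x \le 4 \\ 1, & x > 4 \end{cases} \]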
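The kernel density estimator of the distribution function using a triangular
kernel with bandwidth 1 is
\[ \hat{F}_2(x) = 0.1K_1^t(x) + 0.1K_2^t(x) + 0.8K_3^t(x) \]
where
\[ K_1^t(x) = \begin{cases} 0, & x < 0 \\ \dfrac{x^2}{2}, & 0 \le x \le 1 \\ 1 - \dfrac{(2 - x)^2}{2}, & 1 \le x \le 2 \\ 1, & x > 2 \end{cases}, \quad K_2^t(x) = \begin{cases} 0, & x < 1 \\ \dfrac{(x - 1)^2}{2}, & 1 \le x \le 2 \\ 1 - \dfrac{(3 - x)^2}{2}, & 2 \le x \le 3 \\ 1, & x > 3 \end{cases} \]
and
\[ K_3^t(x) = \begin{cases} 0, & x < 2 \\ \dfrac{(x - 2)^2}{2}, & 2 \le x \le 3 \\ 1 - \dfrac{(4 - x)^2}{2}, & 3 \le x \le 4 \\ 1, & x > 4. \end{cases} \]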
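Thus,
\[ \hat{F}_2(x) = \begin{cases} 0, & x < 0 \\ 0.05x^2, & 0 \le x \le 1 \\ 0.1x - 0.05, & 1 \le x \le 2 \\ 0.35x^2 - 1.3x + 1.35, & 2 \le x \le 3 \\ -0.4x^2 + 3.2x - 5.4, & 3 \le x \le 4 \\ 1, & x > 4 \end{cases} \]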
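The two piecewise formulas can be compared numerically. The sketch below is illustrative only: the helper names are assumptions, and the kernel distribution functions are coded for a general bandwidth b, of which the formulas above are the case b = 1.

```python
def uniform_cdf(x, y, b):
    """Distribution function of the uniform kernel centered at y."""
    if x < y - b:
        return 0.0
    if x > y + b:
        return 1.0
    return (x - y + b) / (2 * b)

def triangular_cdf(x, y, b):
    """Distribution function of the triangular kernel centered at y."""
    if x < y - b:
        return 0.0
    if x > y + b:
        return 1.0
    if x <= y:
        return (x - y + b) ** 2 / (2 * b * b)
    return 1.0 - (y + b - x) ** 2 / (2 * b * b)

def smoothed_cdf(x, probs, b, kernel_cdf):
    """Kernel smoothed distribution function: sum of p(y) * K_y(x)."""
    return sum(p * kernel_cdf(x, y, b) for y, p in probs.items())

probs = {1: 0.1, 2: 0.1, 3: 0.8}  # empirical distribution of the sample

def F1(x):
    return smoothed_cdf(x, probs, 1, uniform_cdf)

def F2(x):
    return smoothed_cdf(x, probs, 1, triangular_cdf)

for x in (0.5, 1.0, 1.5, 2.0, 2.5, 3.5):
    print(x, round(F1(x), 4), round(F2(x), 4))
```

On this grid the two estimators coincide at 1.0, 1.5 and 2.0 and differ at 0.5, 2.5 and 3.5, which agrees with the middle rows of the two piecewise expressions being identical.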
Practice Problems
Problem 56.1 ‡
You are given the following ages at time of death of 10 individuals:
25 30 35 35 37 39 45 47 49 55.
Using a uniform kernel of bandwidth 10, determine the kernel density esti-
mate of the probability of survival to age 40.
Problem 56.2 ‡
From a population having distribution function F, you are given the follow-
ing sample:
2.0 3.3 3.3 4.0 4.0 4.7 4.7 4.7
Calculate the kernel density estimate of F (4) using the uniform kernel with
bandwidth 1.4.
Problem 56.3 ‡
You use a uniform kernel density estimator with b = 50 to smooth the
following workers compensation loss payments:
If F̂ (x) denotes the estimated distribution function and F5 (x) denotes the
empirical distribution function, determine |F̂ (150) − F5 (150)|.
Problem 56.4 ‡
You study five lives to estimate the time from the onset of a disease to death.
The times to death are:
2 3 3 3 7
Using a triangular kernel with bandwidth 2, estimate the density function
at 2.5.
Problem 56.5 ‡
You are given:
(i) The sample: 1 2 3 3 3 3 3 3 3 3.
(ii) F̂1 (x) is the kernel density estimator of the distribution function using
a uniform kernel with bandwidth 1.
(iii) F̂2 (x) is the kernel density estimator of the distribution function using
a triangular kernel with bandwidth 1.
Problem 56.6 ‡
You study five lives to estimate the time from the onset of a disease to death.
The times to death are:
2 3 3 3 7
Using a triangular kernel with bandwidth 2, estimate the density function
at 2.5.
57 THE KAPLAN-MEIER APPROXIMATION FOR LARGE DATA SETS409
Following the notation of [1], suppose the sample data can be split into
k intervals with boundary points c0 < c1 < · · · < ck . Let dj denote the
number of observations that are left-truncated at some value within the
interval [cj , cj+1 ). In mortality terms, dj is the number of lives that were first
observed at an age in the given range. Let uj be the number of observations
that are right-censored (individuals leaving the study for a reason other than
death) at some value within the interval (cj , cj+1 ]. Note that the intervals
for dj and uj differ in which endpoints are included and which are omitted.
This is because left-truncation is not possible at the right end of a closed
interval, while right-censoring is not possible at the left end of a closed
interval. Let xj be the number of uncensored observations (observed deaths)
within the interval (cj , cj+1 ]. With this notation, the sample size can be
expressed as
\[ n = \sum_{j=0}^{k-1} d_j = \sum_{j=0}^{k-1} (u_j + x_j). \]
(1) All truncated values occur at the left-endpoint of the intervals and all
censored values occur at the right-endpoints.
(2) None of the uncensored values fall at the endpoints of the intervals.
(3) Ŝ(c0 ) = 1.
With these assumptions, the number at risk for the first interval (recall that
the risk set is the number of observations available at a given time that
could produce an uncensored observation at that time) is $r_0 = d_0$, that is,
all the new entrants for the first interval. For this interval, $\hat{S}(c_1) = 1 - \frac{x_0}{d_0}$.
For the second interval, the number at risk is
\[ r_1 = d_0 + d_1 - x_0 - u_0. \]
This is the number of survivors from the first interval, $d_0 - x_0 - u_0$, plus the new
entrants $d_1$ in the second interval. For this interval,
\[ \hat{S}(c_2) = \left(1 - \frac{x_0}{r_0}\right)\left(1 - \frac{x_1}{r_1}\right). \]
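\begin{align*}
r_0 &= d_0 \\
r_j &= \sum_{i=0}^{j} d_i - \sum_{i=0}^{j-1} (x_i + u_i) \\
\hat{S}(c_0) &= 1 \\
\hat{S}(c_j) &= \prod_{i=0}^{j-1} \left(1 - \frac{x_i}{r_i}\right) \\
\hat{q}_j &\approx \Pr(T \le c_{j+1} \mid T > c_j) \approx \frac{\hat{S}(c_j) - \hat{S}(c_{j+1})}{\hat{S}(c_j)} \\
&= \frac{\prod_{i=0}^{j-1} \left(1 - \frac{x_i}{r_i}\right) - \prod_{i=0}^{j} \left(1 - \frac{x_i}{r_i}\right)}{\prod_{i=0}^{j-1} \left(1 - \frac{x_i}{r_i}\right)} \\
&= 1 - \left(1 - \frac{x_j}{r_j}\right) = \frac{x_j}{r_j}.
\end{align*}
That is,
\[ \hat{q}_j = \frac{\text{number of deaths in time period}}{\text{number of lives considered during that time period}}. \]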
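The recursion for the risk sets and the survival estimates can be sketched in a few lines. The function name `km_large_data` is an illustrative choice; the inputs are the columns dj, uj, xj tabulated in Example 57.1, and the zero-risk-set guard is an assumption added so the loop does not divide by zero.

```python
def km_large_data(d, u, x):
    """Return (r, S): r[j] is the risk set for interval j and S[j] estimates
    S(c_j), with S[0] = 1, under assumptions (1)-(3) of this section."""
    r, S = [], [1.0]
    at_risk = 0
    for j in range(len(d)):
        at_risk += d[j]              # new entrants join at c_j
        r.append(at_risk)
        if at_risk == 0:             # nobody left to observe; carry S forward
            S.append(S[-1])
            continue
        S.append(S[-1] * (1 - x[j] / at_risk))
        at_risk -= x[j] + u[j]       # deaths and censored lives leave by c_{j+1}
    return r, S

d = [800, 50, 65, 45, 30]
u = [85, 65, 55, 35, 25]
x = [10, 8, 6, 4, 2]
r, S = km_large_data(d, u, x)
print(r)
print([round(s, 4) for s in S])
```

The computed risk sets match the rj column of Example 57.1, and the product of the first three factors reproduces the value 0.9692 shown there.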
Remark 57.1
The reader needs to be aware of the difference of notations for xj , dj , uj , and
rj used in this section and the ones used in Section 52. See Problem 57.4.
Example 57.1
Below is the data for a 5-year mortality study.
Solution.
We first create the following table:
j cj dj uj xj rj
0 45 800 85 10 800
1 46 50 65 8 755
2 47 65 55 6 747
3 48 45 35 4 731
4 49 30 25 2 722
Thus,
\[ \hat{S}(2) = {}_2\hat{p}_{45} = \left(1 - \frac{x_0}{r_0}\right)\left(1 - \frac{x_1}{r_1}\right)\left(1 - \frac{x_2}{r_2}\right) = 0.9692 \]
Example 57.2 ‡
Loss data for 925 policies with deductibles of 300 and 500 and policy limits
of 5,000 and 10,000 were collected. The results are given below:
The ground-up loss distribution for both types of policy is assumed to be the
same. Use the Kaplan-Meier approximation for large data sets to estimate
F (5000).
Solution.
The boundaries of the intervals are: c0 = 300, c1 = 500, c2 = 1000, c3 =
5000 and c4 = 10000. Recall that dj is the number of observations that are
left-truncated at some value in [cj , cj+1 ); uj is the number of observations
that are right-censored at some value in (cj , cj+1 ]; and xj is the number of
uncensored observations in (cj , cj+1 ]. We have the following chart
j dj uj xj rj
0 400 − 50 400
1 525 − 125 875
2 0 120 300 750
3 0 30 300 330
4 0 − − 0
Hence,
F̂ (5000) = 1 − 0.45 = 0.55
Example 57.3 ‡
The following table was calculated based on loss amounts for a group of
motorcycle insurance policies:
cj       dj   uj   xj   $P_j = \sum_{i=0}^{j-1} (d_i - u_i - x_i)$
250      6    0    1    0
500      6    0    2    5
1000     7    1    4    9
2750     0    1    7    11
5500     0    1    1    3
6000     0    0    1    1
10,000   0    0    0    0
Estimate the probability that a policy with a deductible of 500 will have a
claim payment in excess of 5500.
Solution.
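First note that the risk set is $r_j = P_j + d_j$ with $P_0 = 0$. The insurance will
pay over 5500 if the insured claim is above 6000 (because of the deductible).
Thus, we are asked to estimate Pr(X > 6000|X > 500), which by the definition
of conditional probability is
\[ \Pr(X > 6000 \mid X > 500) = \frac{\Pr(X > 6000)}{\Pr(X > 500)} = \frac{S(6000)}{S(500)}. \]
We have
\begin{align*}
\hat{S}(500) &= \hat{S}(c_1) = 1 - \frac{x_0}{r_0} = 1 - \frac{1}{6} = 0.83333 \\
\hat{S}(6000) &= \hat{S}(c_5) = \prod_{j=0}^{4} \left(1 - \frac{x_j}{r_j}\right) = \frac{5}{6} \cdot \frac{9}{11} \cdot \frac{12}{16} \cdot \frac{4}{11} \cdot \frac{2}{3} = 0.12397.
\end{align*}
Hence,
\[ \Pr(X > 6000 \mid X > 500) \approx \frac{0.12397}{0.83333} = 0.14876 \]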
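The same computation can be carried out programmatically. This is a small sketch using the table of this example; the variable names (`carried` for Pj, `survival` for the estimates of S(cj)) are illustrative.

```python
c = [250, 500, 1000, 2750, 5500, 6000, 10000]
d = [6, 6, 7, 0, 0, 0, 0]
u = [0, 0, 1, 1, 1, 0, 0]
x = [1, 2, 4, 7, 1, 1, 0]

survival = {c[0]: 1.0}   # S_hat(c_0) = 1
carried = 0              # P_j: lives carried over from earlier intervals
for j in range(len(c) - 1):
    r_j = carried + d[j]                       # risk set r_j = P_j + d_j
    survival[c[j + 1]] = survival[c[j]] * (1 - x[j] / r_j)
    carried = r_j - u[j] - x[j]

prob = survival[6000] / survival[500]
print(round(survival[500], 5), round(survival[6000], 5), round(prob, 5))
```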
In life table applications, $\hat{q}_j$ is an example of a single-decrement
probability. When there are multiple causes of decrement, we can express life
table functions pertaining to all causes of decrement with a right superscript
such as $q_j^{(\tau)}$. For a single cause of decrement we will use the notation $q_j'^{(i)}$.
Suppose that there are n causes of decrement. Then the following is true:
\[ q_j^{(\tau)} = 1 - \prod_{i=1}^{n} \left(1 - q_j'^{(i)}\right). \]
Example 57.4 ‡
For a double-decrement study, you are given:
(i) The following survival data for individuals affected by both decrements
(1) and (2):
j   cj   $q_j^{(\tau)}$
0   0    0.100
1   20   0.182
2   40   0.600
3   60   1.000
(ii) $q_j'^{(2)} = 0.05$ for all j.
(iii) Group A consists of 1000 individuals observed at age 0.
(iv) Group A is affected by only decrement (1).
Determine the Kaplan-Meier multiple-decrement estimate of the expected
number of individuals in Group A that survive to be at least 40 years old.
Solution.
First, the notation $q_j^{(\tau)}$ stands for the probability that a person at age $c_j$
will depart due to some decrement by time $c_{j+1}$. Also,
\[ q_j^{(\tau)} = 1 - \left(1 - q_j'^{(1)}\right)\left(1 - q_j'^{(2)}\right). \]
Likewise,
\[ 0.182 = q_1^{(\tau)} = 1 - \left(1 - q_1'^{(1)}\right)\left(1 - q_1'^{(2)}\right) = 1 - \left(1 - q_1'^{(1)}\right)(1 - 0.05) \implies 1 - q_1'^{(1)} = 0.8611. \]
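The relation between the total and single-decrement probabilities can be inverted interval by interval. The sketch below shows one way the remaining arithmetic might be carried out, assuming (as the example states) that Group A is exposed to decrement (1) only; the variable names are illustrative.

```python
q_tau = [0.100, 0.182, 0.600, 1.000]  # total decrement probabilities q_j^(tau)
q2 = 0.05                             # single-decrement probability q_j'^(2)

# From q_tau = 1 - (1 - q1)(1 - q2): survival factor under decrement (1) alone.
p1 = [(1 - q) / (1 - q2) for q in q_tau]

# Survive the intervals [0, 20) and [20, 40) under decrement (1) only.
survivors_to_40 = 1000 * p1[0] * p1[1]
print(round(p1[1], 4), round(survivors_to_40, 1))
```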
(1)’ Truncation points and censoring points occur uniformly in each interval.
For such an approach, the risk set ri for an interval (ci , ci+1 ] is found by
the formula
Example 57.5 ‡
You are given the following information about a group of 10 claims:
Assume that claim sizes and censorship points are uniformly distributed
within each interval.
Estimate, using the life table methodology, the probability that a claim
exceeds 30,000.
Solution.
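We are asked to find $\hat{S}(30,000)$, which is given by
\[ \hat{S}(30,000) = \left(1 - \frac{x_0}{r_0}\right)\left(1 - \frac{x_1}{r_1}\right) \]
where
\begin{align*}
r_0 &= 7 + 1 + 1 = 9 \\
r_1 &= 4 + 1 + 1 = 6.
\end{align*}
Thus,
\[ \hat{S}(30,000) = \left(1 - \frac{1}{9}\right)\left(1 - \frac{1}{6}\right) = 0.741 \]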
Practice Problems
Problem 57.1
Below is the data for a 5-year mortality study.
Problem 57.2
Show that rj = rj−1 + dj − (xj−1 + uj−1 ).
Problem 57.3 ‡
Loss data for 925 policies with deductibles of 300 and 500 and policy limits
of 5,000 and 10,000 were collected. The results are given below:
Loss range 300 deductible Policy type (II)
(300,500] 50 −
(500,1000] 50 75
(1000,5000] 150 150
(5000,10000] 100 200
at 5000 40 80
at 10000 10 20
Total 400 525
The ground-up loss distribution for both types of policy is assumed to be
the same.
Use the Kaplan-Meier approximation for large data sets to estimate S(1000).
Problem 57.4 ‡
Loss data for 925 policies with deductibles of 300 and 500 and policy limits
of 5,000 and 10,000 were collected. The results are given below:
Estimate the probability that a loss will be greater than 3000 using a Kaplan-
Meier type approximation for large data sets.
Methods of Parameter Estimation
The purpose of this chapter is to discuss methods for the estimation of pa-
rameters in parametric models. Below we present an example of a parameter
estimation.
Example 58.1 ‡
For a sample of 15 losses, you are given:
(i)
Observed number
Interval of Losses
(0, 2] 5
(2, 5] 5
(5, ∞) 5
Solution.
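Since there are losses in (5, ∞), we must have θ > 5. Since there are 15 losses
and the probability of a loss falling in (0, 2] is $\frac{2 - 0}{\theta}$, the expected number of
losses in that interval is $E_1 = \frac{2}{\theta}(15) = \frac{30}{\theta}$. Likewise, $E_2 = \frac{45}{\theta}$ and
$E_3 = 15 - \frac{75}{\theta}$. Hence, the formula given in the problem reduces to
\[ f(\theta) = \frac{1}{5}\left[\left(\frac{30}{\theta} - 5\right)^2 + \left(\frac{45}{\theta} - 5\right)^2 + \left(10 - \frac{75}{\theta}\right)^2\right]. \]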
420 METHODS OF PARAMETER ESTIMATION
Thus, $f(\theta)$ has only one critical value. Moreover, $f''(\theta) = 10260\theta^{-4} - 900\theta^{-3}$ and
$f''(7.60) > 0$, so that $f(\theta)$ is minimized at θ = 7.60. That is, $\hat{\theta} = 7.60$
58 METHOD OF MOMENTS AND MATCHING PERCENTILE 421
Example 58.2
For a normal distribution, derive expressions for the method of moments
estimators for the parameters µ and σ².
Solution.
We have to solve the following system of two equations
\[ \mu = E(X) = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{X} \]
and
\[ \frac{1}{n}\sum_{i=1}^{n} x_i^2 = E(X^2) = \sigma^2 + \mu^2. \]
Thus,
\[ \tilde{\mu} = \bar{X} \quad \text{and} \quad \tilde{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{X}^2 \]
Example 58.3
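For a Gamma distribution with a shape parameter α and a scale parameter
θ, derive expressions for their method of moments estimators.
Solution.
We have to solve the following system of two equations
\[ \alpha\theta = E(X) = \frac{1}{n}\sum_{i=1}^{n} x_i \]
and
\[ \alpha(\alpha + 1)\theta^2 = E(X^2) = \frac{1}{n}\sum_{i=1}^{n} x_i^2. \]
We have
\[ \frac{\alpha(\alpha + 1)\theta^2}{\alpha\theta} = \frac{\frac{1}{n}\sum_{i=1}^{n} x_i^2}{\bar{X}} \implies \tilde{\theta} = \frac{\frac{1}{n}\sum_{i=1}^{n} x_i^2}{\bar{X}} - \bar{X} \]
and
\[ \tilde{\alpha} = \bar{X}\left[\frac{\frac{1}{n}\sum_{i=1}^{n} x_i^2}{\bar{X}} - \bar{X}\right]^{-1} \]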
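The two estimators can be coded directly from the formulas above. This is a minimal sketch; the function name and the five-point sample are hypothetical and serve only to show that the fitted parameters reproduce the first two sample moments.

```python
def gamma_method_of_moments(sample):
    """Method of moments estimates (alpha, theta) for a Gamma distribution."""
    n = len(sample)
    m1 = sum(sample) / n                   # first sample moment
    m2 = sum(v * v for v in sample) / n    # second sample moment
    theta = m2 / m1 - m1
    alpha = m1 / theta
    return alpha, theta

data = [1, 2, 3, 4, 5]  # hypothetical sample: m1 = 3, m2 = 11
alpha, theta = gamma_method_of_moments(data)
print(alpha, theta)
print(alpha * theta, alpha * (alpha + 1) * theta ** 2)  # fitted E(X), E(X^2)
```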
Example 58.4
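For a Pareto distribution, derive expressions for the method of moments
estimators for the parameters α and θ.
Solution.
We have to solve the following system of two equations
\[ \frac{\theta}{\alpha - 1} = \bar{X} \]
and
\[ \frac{2\theta^2}{(\alpha - 1)(\alpha - 2)} = \frac{1}{n}\sum_{i=1}^{n} x_i^2. \]
We have
\[ \frac{2\theta^2}{(\alpha - 1)(\alpha - 2)}\left(\frac{\theta}{\alpha - 1}\right)^{-2} = \frac{\frac{1}{n}\sum_{i=1}^{n} x_i^2}{\bar{X}^2}. \]
Solving this last equation, we find
\[ \tilde{\alpha} = \left[\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{X}^2\right]\left[\frac{1}{2n}\sum_{i=1}^{n} x_i^2 - \bar{X}^2\right]^{-1}. \]
Also,
\[ \tilde{\theta} = \bar{X}\left\{\left[\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{X}^2\right]\left[\frac{1}{2n}\sum_{i=1}^{n} x_i^2 - \bar{X}^2\right]^{-1} - 1\right\} \]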
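As an illustration, the Pareto estimators can be evaluated on a small sample. The sketch below uses the five claims listed in Problem 58.3; the function name is an assumption, and the code stops at the fitted parameters.

```python
def pareto_method_of_moments(sample):
    """Method of moments estimates (alpha, theta) for a two-parameter Pareto."""
    n = len(sample)
    m1 = sum(sample) / n
    m2 = sum(v * v for v in sample) / n
    alpha = (m2 - m1 ** 2) / (m2 / 2 - m1 ** 2)
    theta = m1 * (alpha - 1)
    return alpha, theta

claims = [4, 5, 21, 99, 421]
alpha, theta = pareto_method_of_moments(claims)
print(round(alpha, 3), round(theta, 2))
```

The estimates are meaningful only when the second sample moment exceeds twice the squared sample mean, so that the fitted α is greater than 2.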
Let the 100gth percentile be denoted by $\pi_g(\theta)$, where $F(\pi_g(\theta)|\theta) = g$. Let $\hat{\pi}_g$
denote the smoothed empirical estimate of the 100gth percentile.
or
\[ F(\hat{\pi}_{g_k}|\theta) = g_k, \quad k = 1, 2, \cdots, p \]
where $g_1, g_2, \cdots, g_p$ are arbitrarily chosen percentiles. In this book, the
possible p values are either 1 or 2.
(i) order the sample values from smallest to largest: $x_{(1)}, x_{(2)}, \cdots, x_{(n)}$;
(ii) find the integer p such that
\[ \frac{p}{n + 1} \le g \le \frac{p + 1}{n + 1}; \]
Example 58.5 ‡
A random sample of 20 observations has been ordered as follows:
12 16 20 23 26 28 30 32 33 35
36 38 39 40 41 43 45 47 50 57
Determine the 60th sample percentile using the smoothed empirical estimate.
Solution.
We want an integer p such that
p ≤ 0.6(21) ≤ p + 1 =⇒ p = 12.
Thus,
Example 58.6 ‡
You are given:
(i) Losses follow a Burr distribution with parameters γ, θ and α = 2.
(ii) A random sample of 15 losses is
195 255 270 280 350 360 365 380 415 450 490 550 575 590 615.
Use the smoothed empirical estimates of the 30th and 65th percentiles match-
ing to estimate the parameters γ and θ.
Solution.
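The cdf of the Burr distribution is
\[ F(x|\theta) = 1 - \left(\frac{1}{1 + (x/\theta)^{\gamma}}\right)^2. \]
Hence,
\[ \hat{\pi}_{0.3} = [5 - 16(0.3)](280) + [16(0.3) - 4](350) = 336 \]
and
\[ \hat{\pi}_{0.65} = [11 - 16(0.65)](450) + [16(0.65) - 10](490) = 466. \]
Hence, we have
\[ 1 - \left(\frac{1}{1 + (336/\theta)^{\gamma}}\right)^2 = 0.3 \implies \left(\frac{336}{\theta}\right)^{\gamma} = 0.1952 \]
and
\[ 1 - \left(\frac{1}{1 + (466/\theta)^{\gamma}}\right)^2 = 0.65 \implies \left(\frac{466}{\theta}\right)^{\gamma} = 0.6903. \]
Hence,
\[ \left(\frac{466}{336}\right)^{\gamma} = \left(\frac{466}{\theta}\right)^{\gamma}\left(\frac{336}{\theta}\right)^{-\gamma} = 3.5364. \]
Hence, $\tilde{\gamma} = \ln 3.5364\,[\ln 466 - \ln 336]^{-1} = 3.86$ and $\tilde{\theta} = 512.96$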
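The percentile-matching steps of this example can be reproduced as follows. This is a sketch under the assumption that the smoothed empirical estimate interpolates linearly between the order statistics x(p) and x(p+1), as in the two displayed computations; the function name is illustrative.

```python
from math import floor, log

def smoothed_percentile(sample, g):
    """Smoothed empirical estimate of the 100g-th percentile.
    Assumes 1 <= floor((n + 1) * g) <= n - 1."""
    xs = sorted(sample)
    h = (len(xs) + 1) * g
    p = floor(h)
    return (p + 1 - h) * xs[p - 1] + (h - p) * xs[p]

losses = [195, 255, 270, 280, 350, 360, 365, 380,
          415, 450, 490, 550, 575, 590, 615]
pi30 = smoothed_percentile(losses, 0.30)
pi65 = smoothed_percentile(losses, 0.65)

# Burr cdf with alpha = 2: F(x) = g  <=>  (x / theta)^gamma = (1 - g)^(-1/2) - 1
a = (1 - 0.30) ** -0.5 - 1
b = (1 - 0.65) ** -0.5 - 1
gamma = log(b / a) / log(pi65 / pi30)
theta = pi30 / a ** (1 / gamma)
print(round(pi30, 2), round(pi65, 2), round(gamma, 2), round(theta, 1))
```

Small differences from the printed value of θ̃ come from the rounding of the intermediate quantities 0.1952, 0.6903 and 3.86 in the text.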
Example 58.7 ‡
For a sample of dental claims X1 , X2 , · · · , X10 , you are given:
(i) $\sum_{i=1}^{10} X_i = 3860$ and $\sum_{i=1}^{10} X_i^2 = 4,574,802$.
(ii) Claims are assumed to follow a lognormal distribution with parameters
µ and σ.
(iii) µ and σ are estimated using the method of moments.
Calculate E(X ∧ 500) for the fitted distribution.
Solution.
We have
\[ e^{\mu + 0.5\sigma^2} = E(X) = \frac{\sum_{i=1}^{10} X_i}{10} = 386 \implies \mu + 0.5\sigma^2 = 5.9558 \]
and
\[ e^{2\mu + 2\sigma^2} = E(X^2) = \frac{\sum_{i=1}^{10} X_i^2}{10} = 457,480.2 \implies 2\mu + 2\sigma^2 = 13.0335. \]
Solving this system of equations, we find $\hat{\mu} = 5.3949$ and $\hat{\sigma}^2 = 1.1218$.
Next, using A.5.1.1 in Table C, we have
\begin{align*}
E(X \wedge 500) &= e^{5.3949 + 0.5(1.1218)}\,\Phi\left(\frac{\ln 500 - 5.3949 - 1.1218}{\sqrt{1.1218}}\right) + 500\left[1 - \Phi\left(\frac{\ln 500 - 5.3949}{\sqrt{1.1218}}\right)\right] \\
&= e^{5.3949 + 0.5(1.1218)}\,\Phi(-0.2852) + 500[1 - \Phi(0.7739)] \\
&= e^{5.3949 + 0.5(1.1218)}(0.3877) + 500(1 - 0.7805) \\
&= 259.4.
\end{align*}
Note that the values of the standard normal distribution were obtained using
Excel
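The same calculation can be done without a spreadsheet. The sketch below evaluates the standard normal distribution function through `math.erf`; the helper name `phi` is an illustrative choice.

```python
from math import erf, exp, log, sqrt

def phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, sum_x, sum_x2 = 10, 3860, 4574802
m1, m2 = sum_x / n, sum_x2 / n

# Method of moments for the lognormal: ln m1 = mu + s2/2, ln m2 = 2mu + 2s2.
sigma2 = log(m2) - 2 * log(m1)
mu = log(m1) - 0.5 * sigma2
sigma = sqrt(sigma2)

limit = 500
lev = (exp(mu + 0.5 * sigma2) * phi((log(limit) - mu - sigma2) / sigma)
       + limit * (1 - phi((log(limit) - mu) / sigma)))
print(round(mu, 4), round(sigma2, 4), round(lev, 1))
```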
Example 58.8 ‡
You are given:
(i) Losses follow an exponential distribution with mean θ.
(ii) A random sample of losses is distributed as follows:
Solution.
The empirical distribution of the grouped data (see Section 50) is expressed
as follows
c 100 200 400 750 1000 1500
F (c) 0.32 0.53 0.8 0.96 0.98 1.00
Thus,
\[ 0.8 = F(400) = 1 - e^{-400/\theta} \implies \hat{\theta} = 248.53 \]
Example 58.9 ‡
You are given the following sample of claim counts:
0 0 1 2 2
Solution.
Let N denote the binomial random variable. Then (i) yields
\[ mq = E(N) = \bar{X} = \frac{0 + 0 + 1 + 2 + 2}{5} = 1. \]
For the 33rd percentile of the sample, we seek p such that
\[ p \le 0.33(6) \le p + 1 \implies p = 1. \]
Since the first and the second terms in the sample are 0, we conclude that
the 33rd percentile of the sample is 0. By (ii) and the definition of the
m   p0
1   0
2   0.25
3   0.2963
4   0.3164
5   0.3277
6   0.3349
Hence, the smallest value of m is m = 6
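The table can be regenerated with a short loop. This sketch assumes that p0 denotes Pr(N = 0) = (1 − q)^m with q = 1/m, which follows from mq = 1, and that the matching condition is p0 ≥ 0.33; both are readings of the solution rather than statements quoted from it.

```python
# p0 = Pr(N = 0) for a binomial(m, q) with mq = 1, i.e. q = 1/m.
table = {m: (1 - 1 / m) ** m for m in range(1, 7)}
for m, p0 in table.items():
    print(m, round(p0, 4))

# Smallest m whose probability at 0 reaches the 33rd percentile level.
smallest = min(m for m, p0 in table.items() if p0 >= 0.33)
print(smallest)
```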
Example 58.10 ‡
You are given:
(i) Losses on a certain warranty product in Year i follow a lognormal distri-
bution with parameters µi and σi .
(ii) σi = σ for i = 1, 2, 3.
(iii) The parameters µi vary in such a way that there is an annual inflation
rate of 10% for losses.
(iv) The following is a sample of seven losses:
Year 1: 20 40 50
Year 2: 30 40 90 120.
We have
\[ \mu_1' = \frac{24.2 + 48.4 + 60.5 + 33 + 44 + 99 + 132}{7} = 63.014 \]
\[ \mu_2' = \frac{24.2^2 + 48.4^2 + 60.5^2 + 33^2 + 44^2 + 99^2 + 132^2}{7} = 5262.64 \]
\[ \mu_1' = e^{\mu_3 + 0.5\sigma_3^2} \]
\[ \mu_2' = e^{2\mu_3 + 2\sigma_3^2}. \]
Practice Problems
Problem 58.1 ‡
The 20th and 80th percentiles of a sample are 5 and 12. Using the percentile
matching method, estimate S(8) assuming the population has a Weibull
distribution.
Problem 58.2 ‡
You are given the following information about a sample of data:
(i) Mean = 35,000
(ii) Standard deviation = 75,000
(iii) Median = 10,000
(iv) 90th percentile = 100,000
(v) The sample is assumed to be from a Weibull distribution.
Problem 58.3 ‡
You are given the following sample of five claims:
4 5 21 99 421
You fit a Pareto distribution using the method of moments. Determine the
95th percentile of the fitted distribution.
Problem 58.4 ‡
In year 1 there are 100 claims with an average size of 10,000, and in year 2
there are 200 claims with an average size of 12,500. Inflation increases the
size of all claims by 10% per year. A Pareto distribution with α = 3 and θ
unknown is used to model the claim size distribution.
Problem 58.5 ‡
The following 20 wind losses (in millions of dollars) were recorded in one
year:
1 1 1 1 1 2 2 3 3 4
6 6 8 10 13 14 15 18 22 25
Determine the 75th sample percentile using the smoothed empirical estimate.
Problem 58.6 ‡
You are given:
(i) Losses follow a loglogistic distribution with cumulative distribution func-
tion:
(x/θ)γ
F (x) = .
1 + (x/θ)γ
(ii) The sample of losses is:
Calculate the estimate of θ by percentile matching, using the 40th and 80th
empirically smoothed percentile estimates.
Problem 58.7 ‡
You are given:
(i) A sample x1 , x2 , · · · , x10 is drawn from a distribution with probability
density function:
\[ f(x) = \frac{1}{2}\left[\frac{1}{\theta}e^{-x/\theta} + \frac{1}{\sigma}e^{-x/\sigma}\right], \quad x > 0. \]
(ii) θ > σ.
(iii) $\sum_{i=1}^{10} x_i = 150$ and $\sum_{i=1}^{10} x_i^2 = 5000$.
i=1 i=1
Estimate θ by matching the first two sample moments to the corresponding
population quantities.
Problem 58.8 ‡
You are given the following claim data for automobile policies:
200 255 295 320 360 420 440 490 500 520 1020
Problem 58.9 ‡
You are given:
x 0 1 2 3
Pr(X = x) 0.5 0.3 0.1 0.1
Problem 58.10 ‡
You are given:
(i) Claim amounts follow a shifted exponential distribution with probability
density function:
\[ f(x) = \frac{1}{\theta}e^{-\frac{x - \delta}{\theta}}, \quad x > \delta. \]
(ii) A random sample of claim amounts X1 , X2 , · · · , X10 :
5 5 5 6 8 9 11 12 16 23.
(iii) $\sum_{i=1}^{10} x_i = 100$ and $\sum_{i=1}^{10} x_i^2 = 1306$.
Estimate δ using the method of moments.
Problem 58.11 ‡
You are given the following random sample of 13 claim amounts:
99 133 175 216 250 277 651 698 735 745 791 906 947
Problem 58.12 ‡
The parameters of the inverse Pareto distribution
\[ F(x) = \left(\frac{x}{x + \theta}\right)^{\tau} \]
Problem 58.13 ‡
You are given:
(i) Losses are uniformly distributed on (0, θ) with θ > 150.
Problem 58.14 ‡
You are given the following data:
Problem 58.15 ‡
The following claim data were generated from a Pareto distribution:
130 20 350 218 1822
Using the method of moments to estimate the parameters of a Pareto dis-
tribution, calculate the limited expected value at 500.
Problem 58.16 ‡
A random sample of claims has been drawn from a Burr distribution with
known parameter α = 1 and unknown parameters θ and γ. You are given:
(i) 75% of the claim amounts in the sample exceed 100.
(ii) 25% of the claim amounts in the sample exceed 500.
Problem 58.17 ‡
A random sample of observations is taken from a shifted exponential
distribution with probability density function:
\[ f(x) = \frac{1}{\theta}e^{-\frac{x - \delta}{\theta}}, \quad \delta < x < \infty. \]
The sample mean and median are 300 and 240, respectively.
Problem 58.18 ‡
For a portfolio of policies, you are given:
(i) Losses follow a Weibull distribution with parameters θ and τ.
(ii) A sample of 16 losses is :
Problem 58.19 ‡
You are modeling a claim process as a mixture of two independent distribu-
tions A and B. You are given:
(i) Distribution A is exponential with mean 1.
(ii) Distribution B is exponential with mean 10.
(iii) Positive weight p is assigned to distribution A.
(iv) The standard deviation of the mixture is 2.
Problem 58.20 ‡
You are given the following information about a study of individual claims:
(i) 20th percentile = 18.25
(ii) 80th percentile = 35.80
Parameters µ and σ of a lognormal distribution are estimated using per-
centile matching.
Determine the probability that a claim is greater than 30 using the fitted
lognormal distribution.
We use the symbol “|” to indicate that the distribution also depends on a
parameter θ, where θ could be a real-valued unknown parameter or a vector
of parameters.
In the case of complete individual data, consider a random sample with
observed values X1 = x1 , X2 = x2 , · · · , Xn = xn . Then the likelihood function
is
\[ L(\theta) = f(x_1, x_2, \cdots, x_n|\theta) = \prod_{i=1}^{n} f(x_i|\theta) \]
Example 59.1
Consider the following discrete random variable X whose pmf is given below.
X        0     1         2      3
p(x|θ)   θ/3   (1−θ)/3   2θ/3   2(1−θ)/3
59 MAXIMUM LIKELIHOOD ESTIMATION FOR COMPLETE DATA435
Solution.
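We have
\begin{align*}
L(\theta) &= p(2|\theta)p(3|\theta)p(0|\theta)p(1|\theta)p(1|\theta)p(2|\theta)p(3|\theta) \\
&= \left(\frac{2\theta}{3}\right)^2\left(\frac{2(1 - \theta)}{3}\right)^2\left(\frac{1 - \theta}{3}\right)^2\left(\frac{\theta}{3}\right)
\end{align*}
Clearly, the likelihood function L(θ) is not easy to maximize. But maximizing
L(θ) is equivalent to maximizing ln [L(θ)] since the natural logarithm is an
increasing function. We define the loglikelihood function as
\[ \ell(\theta) = \ln[L(\theta)] = \sum_{i=1}^{n} \ln f(x_i|\theta). \]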
Example 59.2
Find the maximum likelihood estimate of θ in the previous example.
Solution.
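Let us look at the loglikelihood function
\[ \ell(\theta) = \ln[L(\theta)] = 2\ln\frac{2\theta}{3} + 2\ln\frac{2(1 - \theta)}{3} + 2\ln\frac{1 - \theta}{3} + \ln\frac{\theta}{3}. \]
Using calculus, we have
\[ \frac{d\ell}{d\theta} = 0 \implies \theta = \frac{3}{7}. \]
Also,
\[ \left.\frac{d^2\ell}{d\theta^2}\right|_{\theta = \frac{3}{7}} = -\frac{343}{12} < 0. \]
Hence, $\hat{\theta} = \frac{3}{7}$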
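The maximizer can also be located numerically as a sanity check on the calculus. The sketch below evaluates the loglikelihood on a fine grid; the grid search is an illustrative choice and is not the method used in the solution.

```python
from math import log

sample = [2, 3, 0, 1, 1, 2, 3]  # observations appearing in L(theta) above

def pmf(x, theta):
    """pmf of Example 59.1."""
    return {0: theta / 3, 1: (1 - theta) / 3,
            2: 2 * theta / 3, 3: 2 * (1 - theta) / 3}[x]

def loglik(theta):
    return sum(log(pmf(x, theta)) for x in sample)

# Search the open interval (0, 1) on a grid of step 0.0001.
grid = [k / 10000 for k in range(1, 10000)]
theta_hat = max(grid, key=loglik)
print(theta_hat)  # close to 3/7
```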
Example 59.3
A random sample of 5 claims obtained from an exponential distribution with
parameter θ is given as follows:
15 10 7 8 20.
Solution.
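The density function of the distribution is $f(x|\theta) = \frac{1}{\theta}e^{-x/\theta}$. The likelihood
function is given by
\[ L(\theta) = \frac{1}{\theta^5}e^{-\frac{1}{\theta}(x_1 + x_2 + \cdots + x_5)}. \]
The loglikelihood function is
\[ \ell(\theta) = -\frac{1}{\theta}(x_1 + x_2 + \cdots + x_5) - 5\ln\theta. \]
Let the derivative with respect to θ be zero:
\[ \ell'(\theta) = 0 \implies \theta = \frac{x_1 + x_2 + \cdots + x_5}{5} = 12. \]
Moreover,
\[ \left.\frac{d^2\ell}{d\theta^2}\right|_{\theta = 12} = -\frac{5}{144} < 0. \]
Thus, the MLE is
\[ \hat{\theta} = \frac{x_1 + x_2 + \cdots + x_5}{5} = 12 \]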
For complete grouped data, the process of finding the maximum likelihood
estimate goes as follows: Arrange the unique observation values in
increasing order
c0 < c1 < · · · < ck
where c0 is the smallest possible observation (often zero) and ck is the largest
possible observation (often infinity). For j = 1, 2, · · · , k, let nj be the number
of observations in Aj = (cj−1 , cj ]. The likelihood contribution of each value
in the jth interval is
Example 59.4 ‡
Suppose that a group of 20 losses resulted in the following
Solution.
The likelihood function is
Example 59.5 ‡
The random variable X has survival function:
\[ S_X(x) = \frac{\theta^4}{(\theta^2 + x^2)^2}. \]
Solution.
The likelihood function is
where
\[ f_X(x) = \frac{4x\theta^4}{(\theta^2 + x^2)^3}. \]
The loglikelihood function is
Example 59.6 ‡
You have observed the following three loss amounts:
186 91 66
Seven other amounts are known to be less than or equal to 60. Losses follow
an inverse exponential with distribution function
\[ F(x) = e^{-\theta/x}, \quad x > 0. \]
Solution.
The pdf of the inverse exponential distribution is given by $f(x) = \theta x^{-2}e^{-\theta/x}$.
The likelihood function is
\[ \frac{\hat{\theta}}{2} = \frac{20.25}{2} = 10.125 \]
Example 59.7 ‡
You are given:
(i) The distribution of the number of claims per policy during a one-year
period for 10,000 insurance policies is:
(ii) You fit a binomial model with parameters m and q using the method of
maximum likelihood.
Determine the maximum value of the loglikelihood function when m = 2.
Solution.
The pmf of the binomial distribution with m = 2 is
\[ f(x) = \binom{2}{x}q^x(1 - q)^{2 - x}. \]
Practice Problems
Problem 59.1
Suppose X1 , X2 , · · · , Xn are i.i.d random variables with a Gamma distribu-
tion with α = 2 and θ.
Problem 59.2
Suppose X1 , X2 , · · · , Xn are i.i.d. random variables with a uniform
distribution on (0, θ).
Problem 59.3 ‡
You are given the following three observations:
You fit a distribution with the following density function to the data:
Problem 59.4 ‡
The proportion of allotted time a student takes to complete an exam, X, is
described by the following distribution:
Problem 59.5 ‡
Let X1 , X2 , · · · , Xn be a random sample from the following distribution with
pdf
\[ f(x) = \begin{cases} e^{-x + \theta}, & \theta < x, \ -\infty < \theta < \infty \\ 0, & \text{otherwise.} \end{cases} \]
Problem 59.9 ‡
You are given:
(i) Losses follow an exponential distribution with mean θ.
(ii) A random sample of 20 losses is distributed as follows:
Problem 59.10 ‡
You fit an exponential distribution to the following data:
Problem 59.11 ‡
Losses come from a mixture of an exponential distribution with mean 100
with probability p and an exponential distribution with mean 10,000 with
probability 1 − p. Losses of 100 and 2000 are observed.
Problem 59.12 ‡
Let x1 , x2 , · · · , xn and y1 , y2 , · · · , ym denote independent random samples of
losses from Region 1 and Region 2, respectively. Single-parameter Pareto
distributions with θ = 1, but different values of α, are used to model losses
in these regions.
Past experience indicates that the expected value of losses in Region 2 is 1.5
times the expected value of losses in Region 1. You intend to calculate the
maximum likelihood estimate of α1 for Region 1, using the data from both
regions.
Find $\frac{d}{d\alpha_1}\ell(\alpha_1)$.
Problem 59.13 ‡
You have observed the following claim severities:
Problem 59.14 ‡
Phil and Sylvia are competitors in the light bulb business. Sylvia advertises
that her light bulbs burn twice as long as Phil’s. You were able to test 20
of Phil’s bulbs and 10 of Sylvia’s. You assumed that the distribution of the
lifetime (in hours) of a light bulb is exponential, and separately estimated
Phil’s parameter as θP = 1000 and Sylvia’s parameter as θS = 1500 using
maximum likelihood estimation.
Example 60.1
A ground-up loss random variable X has a policy limit of 30. The following
is a random sample of 6 insurance payment amounts:
20 25 27 28 30 30.
Solution.
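The likelihood function is
\[ L(\theta) = \theta^{-4}e^{-160/\theta}, \]
so the loglikelihood function is
\[ \ell(\theta) = -\frac{160}{\theta} - 4\ln\theta \]
and
\[ \ell'(\theta) = \frac{160}{\theta^2} - \frac{4}{\theta} = 0 \implies \theta = \frac{160}{4} = 40. \]
Furthermore,
\[ \ell''(40) = \left[-\frac{320}{\theta^3} + \frac{4}{\theta^2}\right]_{\theta = 40} = -0.0025 < 0. \]
Finally, $\hat{\theta} = 40$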
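The estimate can be reproduced with a few lines of code. This is a sketch assuming, as the form of the loglikelihood indicates, that the ground-up loss is exponential with mean θ and that a payment equal to the policy limit is a censored observation; the function name is illustrative.

```python
def exponential_mle_right_censored(payments, limit):
    """MLE of the exponential mean when payments are capped at `limit`:
    total amount paid divided by the number of uncensored observations."""
    uncensored = [y for y in payments if y < limit]
    return sum(payments) / len(uncensored)

payments = [20, 25, 27, 28, 30, 30]
theta_hat = exponential_mle_right_censored(payments, limit=30)
print(theta_hat)
```

The numerator 160 counts the two censored payments of 30 in full, while the denominator 4 counts only the uncensored ones, which matches the terms of ℓ(θ) above.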
If some of the data in the sample have been left-truncated, for example by a
policy with an ordinary deductible d, then each observation in the sample
truncated at d contributes a factor of $f(y + d|\theta)/[1 - F(d|\theta)]$, where y is the
amount recorded after the deductible d is applied. For example, suppose the jth
observation is yj , the loss amount after a deductible of d is applied. Then
\[ \Pr(X_j \in A_j) = \Pr(X_j \mid X_j > d) = \frac{f(x_j|\theta)}{1 - F(d|\theta)} = \frac{f(y_j + d|\theta)}{1 - F(d|\theta)}, \quad x_j = y_j + d. \]
Example 60.2
A ground up loss X has a deductible of 7 applied. A random sample of 6
insurance payments (after deductible is applied) is given
3 6 7 8 10 12.
Solution.
The loss amounts before the deductible is applied are:
10 13 14 15 17 19.
Finally, θ̂ = 7.6667
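The value above is consistent with an exponential ground-up loss, for which each truncated observation contributes f(x)/S(d) = (1/θ)e^{−(x−d)/θ}. The sketch below makes that assumption explicit; the function names are illustrative.

```python
from math import log

def truncated_exp_loglik(theta, losses, d):
    """Loglikelihood of exponential losses observed only above deductible d."""
    return sum(-log(theta) - (x - d) / theta for x in losses)

def truncated_exp_mle(losses, d):
    """Closed-form maximizer: the average excess over the deductible."""
    return sum(x - d for x in losses) / len(losses)

losses = [10, 13, 14, 15, 17, 19]  # ground-up losses of this example
theta_hat = truncated_exp_mle(losses, d=7)
print(round(theta_hat, 4))
```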
Instead of using the ground up losses in the above estimation process, one
can use instead the payments after the deductible is applied (i.e. cost per
payment). In this case, the problem reduces to the case of a complete indi-
vidual data. We illustrate this point in the next example.
Example 60.3
A ground up loss X has a deductible of 7 applied. A random sample of 6
insurance payments (after deductible is applied) is given
3 6 7 8 10 12.
Solution.
(a) Note that the condition $x_i < \theta$ is equivalent to $\max\{x_1, x_2, \cdots, x_n\} < \theta$.
Hence, an estimate of θ is $\hat{\theta} = \max\{10, 13, 14, 15, 17, 19\} = 19$ so that an
estimate of the mean of X is $\frac{19}{2} = 9.5$.
(b) Similar to (a), an estimate of θ is $\hat{\theta} = \max\{3, 6, 7, 8, 10, 12\} = 12$ so that
an estimate of the mean is $\frac{12}{2} = 6$
Example 60.4
A policy has a deductible of d = 3 and a maximum covered loss of u = 14.
A random sample of 6 insurance payments is given:
1 3 7 9 11 11.
Solution.
The policy limit is u − d = 14 − 3 = 11. The likelihood function is
Example 60.5 ‡
You are given:
(i) The number of claims follows a Poisson distribution with mean λ.
(ii) Observations other than 0 and 1 have been deleted from the data.
(iii) The data contain an equal number of observations of 0 and 1.
Determine the maximum likelihood estimate of λ.
Solution.
Let N be the number of claims. Notice that we are trying to estimate λ so
the data in our sample represent number of claims. We are told that N is
right truncated at 1. Thus, we have the following
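function is
\[ L(\lambda) = \left(\frac{1}{1 + \lambda}\right)^{n/2}\left(\frac{\lambda}{1 + \lambda}\right)^{n/2} = \left(\frac{\lambda}{(1 + \lambda)^2}\right)^{n/2}. \]
The loglikelihood function is
\[ \ell(\lambda) = \frac{n}{2}\ln\lambda - n\ln(1 + \lambda). \]
Differentiating and setting to 0, we find
\[ \ell'(\lambda) = \frac{n}{2\lambda} - \frac{n}{1 + \lambda} = 0 \implies \hat{\lambda} = 1 \]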
Example 60.6 ‡
You are given:
(i) At time 4 hours, there are 5 working light bulbs.
(ii) The 5 bulbs are observed for p more hours.
(iii) Three light bulbs burn out at times 5, 9, and 13 hours, while the re-
maining light bulbs are still working at time 4 + p hours.
(iv) The distribution of failure times is uniform on (0, ω).
(v) The maximum likelihood estimate of ω is 29.
Determine p.
Solution.
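Let T be the time to failure random variable. T has a uniform distribution
on (0, ω). Its pdf is $f(t) = \frac{1}{\omega}$ and its sdf is $S(t) = 1 - \frac{t}{\omega}$. The likelihood
function is
\begin{align*}
L(\omega) &= \frac{f(5)}{S(4)} \cdot \frac{f(9)}{S(4)} \cdot \frac{f(13)}{S(4)} \cdot \left(\frac{S(4 + p)}{S(4)}\right)^2 \\
&= \frac{\frac{1}{\omega} \cdot \frac{1}{\omega} \cdot \frac{1}{\omega}\left(1 - \frac{4 + p}{\omega}\right)^2}{\left(1 - \frac{4}{\omega}\right)^5} \\
&= \frac{(\omega - 4 - p)^2}{(\omega - 4)^5}.
\end{align*}
The loglikelihood function is
\[ \ell(\omega) = 2\ln(\omega - 4 - p) - 5\ln(\omega - 4). \]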
Practice Problems
Problem 60.1 ‡
You observe the following five ground-up claims from a data set that is
truncated from below at 100:
125 150 165 175 250
You fit a ground-up exponential distribution using maximum likelihood es-
timation. Determine the mean of the fitted distribution.
Problem 60.2 ‡
You are given:
(i) A sample of losses is: 600 700 900
(ii) No information is available about losses of 500 or less.
(iii) Losses are assumed to follow an exponential distribution with mean θ.
Problem 60.6 ‡
For a dental policy, you are given:
(i) Ground-up losses follow an exponential distribution with mean θ.
(ii) Losses under 50 are not reported to the insurer.
(iii) For each loss over 50, there is a deductible of 50 and a policy limit of
350.
(iv) A random sample of five claim payments for this policy is:
Problem 60.7 ‡
You are given:
(i) An insurance company records the following ground-up loss amounts,
which are generated by a policy with a deductible of 100:
(ii) Losses less than 100 are not reported to the company.
(iii) Losses are modeled using a Pareto distribution with parameters θ = 400
and α.
Problem 60.8 ‡
You are given the following information about a group of policies:
Problem 60.9 ‡
You are given the following 20 bodily injury losses (before the deductible is
applied):
350 350 500 500 500+ 1000 1000+ 1000+ 1200 1500
where the symbol + indicates that the loss exceeds the policy limit.
(iii) Ŝ1 (1250) is the product-limit estimate of S(1250).
(iv) Ŝ2 (1250) is the maximum likelihood estimate of S(1250) under the as-
sumption that the losses follow an exponential distribution.
Determine the absolute difference between Ŝ1 (1250) and Ŝ2 (1250).
454 METHODS OF PARAMETER ESTIMATION
Problem 60.12 ‡
You are given a sample of losses from an exponential distribution. However,
if a loss is 1000 or greater, it is reported as 1000. The summarized sample
is:
Reported Loss Number Total Amount
Less than 1000 62 28,140
1000 38 38,000
Total 100 66,140
Problem 60.13 ‡
You are given the following claims settlement activity for a book of auto-
mobile claims as of the end of 1999:
                 Number of Claims Settled
Year                   Year Settled
Reported      1997        1998      1999
1997          Unknown     3         1
1998                      5         2
1999                                4
Problem 60.14 ‡
You are given the following information about a random sample:
(i) The sample size equals five.
(ii) The sample is from a Weibull distribution with τ = 2 and unknown θ.
(iii) Two of the sample observations are known to exceed 50, and the re-
maining three observations are 20, 30 and 45.
Calculate the maximum likelihood estimate of θ.
61 ASYMPTOTIC VARIANCE OF MLE 455
Our method uses the following concept found in the statistics literature:
Let X be a random variable whose distribution depends on a parameter θ
and denote its pdf by f(x|θ). We define the Fisher information I(X|θ) in
X by
\[ I(X|\theta) = E[\ell'(x|\theta)^2] = \int [\ell'(x|\theta)]^2 f(x|\theta)\,dx \]
where ℓ(x|θ) = ln f(x|θ). The prime symbol stands for differentiation
with respect to θ. Assuming that we can exchange the order of differentiation
and integration, we have
\[ \int f'(x|\theta)\,dx = \frac{d}{d\theta}\int f(x|\theta)\,dx = \frac{d}{d\theta}(1) = 0. \]
Similarly,
\[ \int f''(x|\theta)\,dx = \frac{d^2}{d\theta^2}\int f(x|\theta)\,dx = \frac{d^2}{d\theta^2}(1) = 0. \]
With these properties of f, we have
\[ E[\ell'(x|\theta)] = \int \ell'(x|\theta)f(x|\theta)\,dx = \int \frac{f'(x|\theta)}{f(x|\theta)}f(x|\theta)\,dx = \int f'(x|\theta)\,dx = 0. \]
Hence,
\[ I(X|\theta) = \mathrm{Var}[\ell'(x|\theta)]. \]
Also, notice that
\[ \ell''(x|\theta) = \frac{f''(x|\theta)f(x|\theta) - f'(x|\theta)^2}{f(x|\theta)^2} = \frac{f''(x|\theta)}{f(x|\theta)} - [\ell'(x|\theta)]^2 \]
so that
\[ E[\ell''(x|\theta)] = \int f''(x|\theta)\,dx - E[\ell'(x|\theta)^2] = -I(X|\theta). \]
Example 61.1
Let X be a normal random variable with parameters µ and σ². Suppose that
σ² is known but µ is the unknown parameter. Find the Fisher information
I(X|µ) in X.
Solution.
The pdf of X is
\[ f(x|\mu) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}. \]
Thus,
\[ \ell(x|\mu) = -\frac{1}{2}\ln(2\pi\sigma^2) - \frac{(x-\mu)^2}{2\sigma^2}. \]
Hence
\[ \ell'(x|\mu) = \frac{x-\mu}{\sigma^2} \quad\text{and}\quad \ell''(x|\mu) = -\frac{1}{\sigma^2}. \]
Thus,
\[ I(X|\mu) = -E[\ell''(x|\mu)] = \frac{1}{\sigma^2}. \]
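As a sanity check on the definition, the expectation $E[\ell'(x|\mu)^2]$ can be approximated by numerical integration and compared with $1/\sigma^2$. The sketch below uses the trapezoidal rule; the values µ = 3 and σ = 2, the integration range and the step count are arbitrary illustrative choices, not values from the text.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def fisher_info_normal_mean(mu, sigma, steps=20000, width=10.0):
    """Approximate E[l'(x|mu)^2] by the trapezoidal rule on mu +/- width*sigma."""
    a, b = mu - width * sigma, mu + width * sigma
    h = (b - a) / steps
    total = 0.0
    for k in range(steps + 1):
        x = a + k * h
        score = (x - mu) / sigma ** 2              # l'(x|mu)
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * score ** 2 * normal_pdf(x, mu, sigma)
    return total * h

info = fisher_info_normal_mean(mu=3.0, sigma=2.0)
print(info)  # should be close to 1 / sigma^2 = 0.25
```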
Now, suppose we have a random sample X1, X2, · · · , Xn coming from a
distribution for which the pdf is f(x|θ) where the value of the parameter θ
is unknown. Assuming independence, the joint pdf is given by
\[ L(\theta) = f_n(x_1, x_2, \cdots, x_n|\theta) = \prod_{i=1}^{n} f(x_i|\theta) \]
and
\[ \ell'(\theta) = \frac{f_n'(x_1, x_2, \cdots, x_n|\theta)}{f_n(x_1, x_2, \cdots, x_n|\theta)}. \]
We define the Fisher information I(θ) in the random sample X1, X2, · · · , Xn
as
\[ I(\theta) = E[\ell'(\theta)^2] = \int\cdots\int [\ell'(\theta)]^2 f_n(x_1, x_2, \cdots, x_n|\theta)\,dx_1\,dx_2\cdots dx_n. \]
As in the single-observation case,
\[ \int\cdots\int f_n''(x_1, \cdots, x_n|\theta)\,dx_1\cdots dx_n = \frac{d^2}{d\theta^2}\int\cdots\int f_n(x_1, \cdots, x_n|\theta)\,dx_1\cdots dx_n = \frac{d^2}{d\theta^2}(1) = 0. \]
With these properties of $f_n$, we have
\begin{align*}
E[\ell'(\theta)] &= \int\cdots\int \ell'(\theta)f_n(x_1, \cdots, x_n|\theta)\,dx_1\cdots dx_n \\
&= \int\cdots\int \frac{f_n'(x_1, \cdots, x_n|\theta)}{f_n(x_1, \cdots, x_n|\theta)}\,f_n(x_1, \cdots, x_n|\theta)\,dx_1\cdots dx_n \\
&= \int\cdots\int f_n'(x_1, \cdots, x_n|\theta)\,dx_1\cdots dx_n = 0.
\end{align*}
Hence,
I(θ) = Var[`0 (θ)].
Also, notice that
\[ E[\ell''(\theta)] = -I(\theta) \]
and
\[ I(\theta) = -E\left[\sum_{i=1}^{n}\ell''(x_i|\theta)\right] = -\sum_{i=1}^{n}E[\ell''(x_i|\theta)] = nI(X|\theta). \]
In other words, the Fisher information in a random sample of size n is simply
n times the Fisher information in a single observation.
Example 61.2
Let X1 , X2 , · · · , Xn be a random sample from N (µ, σ 2 ) where σ 2 is known
but µ is unknown. Find the Fisher information of this random sample.
Solution.
We have
\[ I(\mu) = nI(X|\mu) = \frac{n}{\sigma^2}. \]
Now, let θ̂ denote an arbitrary estimator of θ. It is proven in statistics that
\[ \mathrm{Var}(\hat{\theta}) \ge \frac{[m'(\theta)]^2}{I(\theta)} \]
where E(θ̂) = m(θ). If θ̂ is unbiased then m(θ) = θ, so that m′(θ) = 1 and the
above inequality becomes
\[ \mathrm{Var}(\hat{\theta}) \ge \frac{1}{I(\theta)}. \]
The right-hand side of the inequality is known as the Cramér-Rao lower
bound. It is shown in Statistics that under certain conditions, no other
unbiased estimator of the parameter θ based on an i.i.d. sample of size n
can have a variance smaller than the Cramér-Rao lower bound.
Example 61.3
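Let X1, X2, · · · , Xn be a random sample from N(µ, θ) where µ is known and
θ is unknown. Calculate the Cramér-Rao lower bound of variance for any
unbiased estimator of θ.
Solution.
The pdf of N(µ, θ) is given by
\[ f(x|\theta) = \frac{1}{\sqrt{2\pi\theta}}\,e^{-\frac{(x-\mu)^2}{2\theta}}. \]
Thus,
\[ \ell(x|\theta) = -\frac{(x-\mu)^2}{2\theta} - \frac{1}{2}\ln(2\pi) - \frac{1}{2}\ln\theta \]
and
\[ \ell'(x|\theta) = \frac{(x-\mu)^2}{2\theta^2} - \frac{1}{2\theta} \quad\text{and}\quad \ell''(x|\theta) = -\frac{(x-\mu)^2}{\theta^3} + \frac{1}{2\theta^2}. \]
Hence, since $E[(X-\mu)^2] = \theta$,
\[ I(X|\theta) = -E[\ell''(x|\theta)] = \frac{1}{\theta^3}E[(X-\mu)^2] - \frac{1}{2\theta^2} = \frac{1}{2\theta^2} \]
and
\[ I(\theta) = nI(X|\theta) = \frac{n}{2\theta^2}. \]
The Cramér-Rao lower bound is $\frac{2\theta^2}{n}$.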
Theorem 61.1
Let X1, X2, · · · , Xn be a random sample from a distribution with pdf f(x|θ)
and unknown parameter θ. Let θ denote the true value of the parameter and
θ̂ the MLE of θ. Then the probability distribution of $\sqrt{I(\theta)}\,(\hat{\theta} - \theta)$
approaches the standard normal distribution as n → ∞. That is, approximately,
\[ \hat{\theta} \sim N\left(\theta, \frac{1}{I(\theta)}\right). \]
Since this is merely a limiting result, which holds as the sample size tends to
infinity, we say that the MLE is asymptotically unbiased and refer to the
variance of the limiting normal distribution as the asymptotic variance
of the MLE.
Example 61.4
Let X1 , X2 , · · · , Xn be a random sample from an exponential distribution
with mean θ. Find the asymptotic distribution of θ̂.
Solution.
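We have
\begin{align*}
\ell(x|\theta) &= -\frac{x}{\theta} - \ln\theta \\
\ell'(x|\theta) &= \frac{x}{\theta^2} - \frac{1}{\theta} \\
\ell''(x|\theta) &= -\frac{2x}{\theta^3} + \frac{1}{\theta^2} \\
I(X|\theta) &= -E[\ell''(x|\theta)] = E\left[\frac{2X}{\theta^3} - \frac{1}{\theta^2}\right] = \frac{2}{\theta^3}E(X) - \frac{1}{\theta^2} = \frac{1}{\theta^2} \\
I(\theta) &= \frac{n}{\theta^2}.
\end{align*}
Hence,
\[ \hat{\theta} \sim N\left(\theta, \frac{\theta^2}{n}\right). \]
It follows that
\[ \mathrm{Var}(\hat{\theta}) = \frac{\theta^2}{n} \]
and since we don't know the exact value of θ, we use the estimate
\[ \widehat{\mathrm{Var}}(\hat{\theta}) = \frac{\hat{\theta}^2}{n}. \]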
Example 61.5
For an exponential distribution, the parameter θ is estimated via the method
of maximum likelihood by analyzing data from the following sample:
7 12 15 19.
Estimate the asymptotic variance of the maximum likelihood estimator.
Solution.
From Example 59.3, we know that $\hat{\theta} = \bar{X}$. Hence,
\[ \widehat{\mathrm{Var}}(\hat{\theta}) = \frac{\hat{\theta}^2}{n} = \frac{\bar{X}^2}{n} = \frac{[(7 + 12 + 15 + 19)/4]^2}{4} = 43.890625 \]
Example 61.6
For an exponential distribution, the parameter θ is estimated via the method
of maximum likelihood by analyzing data from the following sample:
7 12 15 19.
Find the 95% confidence interval for θ.
Solution.
We have $\hat{\theta} = \bar{X} = \frac{7+12+15+19}{4} = 13.25$ and $\widehat{\mathrm{Var}}(\hat{\theta}) = 43.890625$. Hence, the
95% confidence interval is
\[ [13.25 - 1.96\sqrt{43.890625},\ 13.25 + 1.96\sqrt{43.890625}] = [0.265, 26.235] \]
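The calculations of Examples 61.5 and 61.6 can be reproduced with a few lines of Python. This is a minimal sketch: the helper name `exponential_mle_interval` is illustrative, and the default z = 1.96 is the usual normal quantile for a 95% interval.

```python
import math

def exponential_mle_interval(sample, z=1.96):
    """MLE of the exponential mean, its estimated asymptotic variance,
    and the normal-approximation confidence interval."""
    n = len(sample)
    theta_hat = sum(sample) / n          # MLE is the sample mean
    var_hat = theta_hat ** 2 / n         # estimated 1 / I(theta)
    half_width = z * math.sqrt(var_hat)
    return theta_hat, var_hat, (theta_hat - half_width, theta_hat + half_width)

theta_hat, var_hat, (lower, upper) = exponential_mle_interval([7, 12, 15, 19])
print(theta_hat, var_hat, round(lower, 3), round(upper, 3))
```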
Practice Problems
Problem 61.1
Let X have a Pareto distribution with parameter α and θ = 20. The pa-
rameter α is estimated via the method of maximum likelihood by analyzing
data from the following sample:
12 15 17 19.
Find $[\ell'(\alpha)]^2$, the square of the first partial derivative of the loglikelihood
function.
Problem 61.2
Let X have a Pareto distribution with parameter α and θ = 20. The pa-
rameter α is estimated via the method of maximum likelihood by analyzing
data from the following sample:
12 15 17 19.
Find $\ell''(\alpha)$, the second partial derivative of the loglikelihood function.
Problem 61.3
Let X have a Pareto distribution with parameter α and θ = 20. The pa-
rameter α is estimated via the method of maximum likelihood by analyzing
data from the following sample:
12 15 17 19.
Find the Fisher information associated with the maximum likelihood esti-
mator.
Problem 61.4
Let X have a Pareto distribution with parameter α and θ = 20. The pa-
rameter α is estimated via the method of maximum likelihood by analyzing
data from the following sample:
12 15 17 19.
Estimate the asymptotic variance of the maximum likelihood estimator.
Problem 61.5
Let X have a Pareto distribution with parameter α and θ = 20. The pa-
rameter α is estimated via the method of maximum likelihood by analyzing
data from the following sample:
12 15 17 19.
Find the 95% confidence interval for α.
Problem 61.6 ‡
The information associated with the maximum likelihood estimator of a
parameter θ is 4n, where n is the number of observations. Calculate the
asymptotic variance of the maximum likelihood estimator of 2θ.
Problem 61.7 ‡
You fit an exponential distribution to the following data:
Problem 61.8 ‡
A random sample of size n is drawn from a distribution with probability
density function:
\[ f(x) = \frac{\theta}{(\theta + x)^2}, \quad 0 < x < \infty,\ 0 < \theta < \infty. \]
Problem 61.9 ‡
You are given:
(i) The distribution of the number of claims per policy during a one-year
period for a block of 3000 insurance policies:
# of claims per policy # of policies
0 1000
1 1200
2 600
3 200
4+ 0
(ii) You fit a Poisson model to the number of claims per policy using the
method of maximum likelihood.
(iii) You construct the large-sample 90% confidence interval for the mean of
the underlying Poisson model that is symmetric around the mean.
Example 62.1
Find the Fisher information matrix of the MLE for the lognormal distribu-
tion.
Solution.
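Recall the pdf of the lognormal distribution
\[ f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\,e^{-\frac{(\ln x-\mu)^2}{2\sigma^2}}. \]
Thus, the likelihood function is
\[ L(\mu, \sigma) = \prod_{i=1}^{n}\frac{1}{x_i\sigma\sqrt{2\pi}}\,e^{-\frac{(\ln x_i-\mu)^2}{2\sigma^2}} \]
and its loglikelihood function is
\[ \ell(\mu, \sigma) = \sum_{i=1}^{n}\left[-\ln x_i - \ln\sigma - \frac{1}{2}\ln(2\pi) - \frac{1}{2}\left(\frac{\ln x_i - \mu}{\sigma}\right)^2\right]. \]
Taking first derivatives, we find
\[ \frac{\partial\ell}{\partial\mu} = \sum_{i=1}^{n}\frac{\ln x_i - \mu}{\sigma^2} \quad\text{and}\quad \frac{\partial\ell}{\partial\sigma} = -\frac{n}{\sigma} + \sum_{i=1}^{n}\frac{(\ln x_i - \mu)^2}{\sigma^3}. \]
Taking second derivatives, we find
\begin{align*}
\frac{\partial^2\ell}{\partial\mu^2} &= -\frac{n}{\sigma^2} \\
\frac{\partial^2\ell}{\partial\sigma\,\partial\mu} &= -2\sum_{i=1}^{n}\frac{\ln x_i - \mu}{\sigma^3} \\
\frac{\partial^2\ell}{\partial\sigma^2} &= \frac{n}{\sigma^2} - 3\sum_{i=1}^{n}\frac{(\ln x_i - \mu)^2}{\sigma^4}.
\end{align*}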
The inverse matrix $I(\theta)^{-1}$, referred to as the covariance matrix, has the
variances of the individual random variables on the main diagonal and
covariances in the off-diagonal positions.
Example 62.2
Find the covariance matrix of Example 62.1.
Solution.
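The covariance matrix is
\[ I(\theta)^{-1} = \frac{1}{\det[I(\theta)]}\begin{bmatrix} \frac{2n}{\sigma^2} & 0 \\ 0 & \frac{n}{\sigma^2} \end{bmatrix} = \begin{bmatrix} \frac{\sigma^2}{n} & 0 \\ 0 & \frac{\sigma^2}{2n} \end{bmatrix} \]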
In many cases, neither the derivatives of the loglikelihood function nor
the corresponding expectations are easy to compute. One way to avoid the
second difficulty is simply not to take the expected value: instead of the
expected values of the second derivatives, we evaluate the second derivatives
at the observed data points. The resulting matrix is called the observed
information. We illustrate this concept in the next example.
Example 62.3
You model a loss process using a lognormal distribution with parameters µ
and σ. You are given:
(i) The maximum likelihood estimates of µ and σ² are µ̂ = 4.4654 and σ̂² =
0.3842.
(ii) The following five observations: 27 82 115 126 155.
Estimate the covariance matrix using the observed information.
62 INFORMATION MATRIX AND THE DELTA METHOD 465
Solution.
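Substituting the observations into the second derivatives, we find
\begin{align*}
\frac{\partial^2\ell}{\partial\mu^2} &= -\frac{n}{\sigma^2} = -\frac{5}{\sigma^2} \\
\frac{\partial^2\ell}{\partial\sigma\,\partial\mu} &= -2\sum_{i=1}^{n}\frac{\ln x_i - \mu}{\sigma^3} = -2\,\frac{22.3272 - 5\mu}{\sigma^3} \\
\frac{\partial^2\ell}{\partial\sigma^2} &= \frac{n}{\sigma^2} - 3\sum_{i=1}^{n}\frac{(\ln x_i - \mu)^2}{\sigma^4} = \frac{5}{\sigma^2} - 3\,\frac{101.6219 - 44.6544\mu + 5\mu^2}{\sigma^4}
\end{align*}
and
\[ \widehat{\mathrm{Var}}(\hat{\mu}, \hat{\sigma}) = [I(\theta)]^{-1} = \begin{bmatrix} 0.0768 & 0 \\ 0 & 0.0384 \end{bmatrix} \]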
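The observed-information calculation of Example 62.3 can be sketched in Python as follows. The function name is illustrative, and the code assumes (as is standard for the lognormal model) that the MLEs are the mean and the divisor-n variance of the log observations; it then evaluates the three second derivatives above at those estimates, negates them, and inverts the resulting 2 × 2 matrix.

```python
import math

def lognormal_observed_covariance(data):
    """Invert the observed information of a lognormal sample at the MLE."""
    n = len(data)
    logs = [math.log(x) for x in data]
    mu = sum(logs) / n                               # MLE of mu
    sigma2 = sum((y - mu) ** 2 for y in logs) / n    # MLE of sigma^2
    sigma = math.sqrt(sigma2)

    s1 = sum(y - mu for y in logs)
    s2 = sum((y - mu) ** 2 for y in logs)
    h_mm = -n / sigma2                               # d2l / dmu2
    h_ms = -2 * s1 / sigma ** 3                      # d2l / dsigma dmu
    h_ss = n / sigma2 - 3 * s2 / sigma2 ** 2         # d2l / dsigma2

    # Observed information is minus the Hessian; invert the 2x2 matrix.
    a, b, d = -h_mm, -h_ms, -h_ss
    det = a * d - b * b
    cov = [[d / det, -b / det], [-b / det, a / det]]
    return mu, sigma2, cov

mu_hat, sigma2_hat, cov = lognormal_observed_covariance([27, 82, 115, 126, 155])
print(round(mu_hat, 4), round(sigma2_hat, 4))
print([[round(v, 4) for v in row] for row in cov])
```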
Example 62.4
Consider a random sample of size n from an exponential distribution with
mean θ.
(a) Find the variance of the estimated distribution variance.
(b) Construct the approximate 95% confidence interval for Pr(X > m).
Solution.
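(a) We have $\widehat{\mathrm{Var}}(X) \approx \hat{\theta}^2$. Hence, by the delta method,
\[ \mathrm{Var}(\hat{\theta}^2) \approx ([\theta^2]')^2\,\mathrm{Var}(\hat{\theta}) = 4\theta^2\cdot\frac{1}{I(\theta)} = 4\theta^2\cdot\frac{\theta^2}{n} = \frac{4\theta^4}{n}. \]
(b) The estimated probability is $e^{-m/\hat{\theta}}$. Thus,
\[ \widehat{\mathrm{Var}}\left[e^{-m/\hat{\theta}}\right] = \frac{m^2}{\hat{\theta}^4}\,e^{-2m/\hat{\theta}}\cdot\frac{\hat{\theta}^2}{n} = \frac{m^2}{n\hat{\theta}^2}\,e^{-2m/\hat{\theta}} \]
and the approximate 95% confidence interval is
\[ e^{-m/\hat{\theta}} \pm 1.96\,\frac{m}{\hat{\theta}\sqrt{n}}\,e^{-m/\hat{\theta}}. \]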
Example 62.5 ‡
You model a loss process using a lognormal distribution with parameters µ
and σ. You are given:
(i) The maximum likelihood estimates of µ and σ are µ̂ = 4.215 and σ̂ =
1.093.
(ii) The estimated covariance matrix of µ̂ and σ̂ is
0.1195 0
0 0.0597
(iii) The mean of the lognormal distribution is $e^{\mu+\frac{\sigma^2}{2}}$.
Estimate the variance of the maximum likelihood estimate of the mean of
the lognormal distribution, using the delta method.
Solution.
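Let $f(\mu, \sigma) = e^{\mu+\frac{\sigma^2}{2}}$. We have
\begin{align*}
\widehat{\mathrm{Var}}[f(\hat{\mu}, \hat{\sigma})] &= \begin{bmatrix} e^{\hat{\mu}+\frac{\hat{\sigma}^2}{2}} & \hat{\sigma}e^{\hat{\mu}+\frac{\hat{\sigma}^2}{2}} \end{bmatrix}\begin{bmatrix} \frac{\hat{\sigma}^2}{n} & 0 \\ 0 & \frac{\hat{\sigma}^2}{2n} \end{bmatrix}\begin{bmatrix} e^{\hat{\mu}+\frac{\hat{\sigma}^2}{2}} \\ \hat{\sigma}e^{\hat{\mu}+\frac{\hat{\sigma}^2}{2}} \end{bmatrix} \\
&= \begin{bmatrix} 123.02 & 134.46 \end{bmatrix}\begin{bmatrix} 0.1195 & 0 \\ 0 & 0.0597 \end{bmatrix}\begin{bmatrix} 123.02 \\ 134.46 \end{bmatrix} \\
&= \begin{bmatrix} 14.70089 & 8.027262 \end{bmatrix}\begin{bmatrix} 123.02 \\ 134.46 \end{bmatrix} \\
&= 2887.85
\end{align*}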
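The delta-method product in Example 62.5 is a quadratic form $g^T C g$, where g is the gradient of the function being estimated and C is the covariance matrix. A short Python sketch (the helper `delta_method_variance` is an illustrative name) reproduces the result up to the rounding used in the hand calculation:

```python
import math

def delta_method_variance(gradient, cov):
    """Return the quadratic form g^T C g for gradient g and covariance C."""
    k = len(gradient)
    return sum(gradient[i] * cov[i][j] * gradient[j]
               for i in range(k) for j in range(k))

mu_hat, sigma_hat = 4.215, 1.093
mean_hat = math.exp(mu_hat + sigma_hat ** 2 / 2)   # estimated lognormal mean

# Partial derivatives of exp(mu + sigma^2/2) with respect to mu and sigma.
gradient = [mean_hat, sigma_hat * mean_hat]
cov = [[0.1195, 0.0], [0.0, 0.0597]]

variance = delta_method_variance(gradient, cov)
print(round(mean_hat, 2), round(variance, 2))
```

Carrying full precision instead of the rounded gradient entries 123.02 and 134.46 changes the final figure only slightly.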
Example 62.6 ‡
You are given:
(i) Fifty claims have been observed from a lognormal distribution with un-
known parameters µ and σ.
(ii) The maximum likelihood estimates are µ̂ = 6.84 and σ̂ = 1.49.
(iii) The covariance matrix of µ̂ and σ̂ is
0.0444 0
0 0.0222
(iv) The partial derivatives of the lognormal cumulative distribution function
are:
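\[ \frac{\partial F}{\partial\mu} = -\frac{\phi(z)}{\sigma} \quad\text{and}\quad \frac{\partial F}{\partial\sigma} = -\frac{z\phi(z)}{\sigma}, \]
where $\phi(z) = \frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}}$.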
(v) An approximate 95% confidence interval for the probability that the
next claim will be less than or equal to 5000 is: [PL , PH ].
Determine PL .
Solution.
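Let
\[ F(\mu, \sigma) = \Pr(X \le 5000) = \Phi\left(\frac{\ln 5000 - \mu}{\sigma}\right). \]
The point estimate is
\[ \hat{F}(6.84, 1.49) = \Phi\left(\frac{\ln 5000 - 6.84}{1.49}\right) = \Phi(1.125) = 0.87. \]
For the delta method, we need
\begin{align*}
\frac{\partial F}{\partial\mu} &= -\frac{\phi(1.125)}{1.49} = -0.1422 \\
\frac{\partial F}{\partial\sigma} &= -\frac{1.125\,\phi(1.125)}{1.49} = -0.16.
\end{align*}
Thus,
\begin{align*}
\widehat{\mathrm{Var}}[F(\hat{\mu}, \hat{\sigma})] &= \begin{bmatrix} -0.1422 & -0.16 \end{bmatrix}\begin{bmatrix} 0.0444 & 0 \\ 0 & 0.0222 \end{bmatrix}\begin{bmatrix} -0.1422 \\ -0.16 \end{bmatrix} \\
&= 0.001466.
\end{align*}
The lower limit of the 95% confidence interval is
\[ P_L = 0.87 - 1.96\sqrt{0.001466} = 0.79496 \]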
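The same steps can be carried out in Python without normal tables, using `math.erf` for the standard normal cdf. This is a sketch under the data of Example 62.6; the helper names are illustrative, and because no intermediate rounding is applied the lower limit differs from the hand calculation in the fourth decimal place.

```python
import math

def std_normal_pdf(z):
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def std_normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu_hat, sigma_hat = 6.84, 1.49
z = (math.log(5000) - mu_hat) / sigma_hat
point = std_normal_cdf(z)                      # estimate of Pr(X <= 5000)

# Partial derivatives of F with respect to mu and sigma, as given in (iv).
d_mu = -std_normal_pdf(z) / sigma_hat
d_sigma = -z * std_normal_pdf(z) / sigma_hat

# The covariance matrix is diagonal, so the quadratic form has two terms.
variance = 0.0444 * d_mu ** 2 + 0.0222 * d_sigma ** 2
p_lower = point - 1.96 * math.sqrt(variance)
print(round(point, 4), round(variance, 6), round(p_lower, 4))
```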
Practice Problems
Problem 62.1
For a Pareto distribution, the parameters α and θ are both estimated via the
method of maximum likelihood by analyzing data from a random sample of
n observations.
Problem 62.2
For a Pareto distribution, the parameters α and θ are both estimated via the
method of maximum likelihood by analyzing data from a random sample of
n observations.
Problem 62.3
For a Pareto distribution, the parameters α and θ are both estimated via the
method of maximum likelihood by analyzing data from a random sample of
n observations.
Problem 62.4
For a Pareto distribution, the parameters α and θ are both estimated via the
method of maximum likelihood by analyzing data from a random sample of
n observations.
Problem 62.5
You model a loss process using a Pareto distribution with parameters α and
θ = 10. You are given:
(i) The maximum likelihood estimate of α : α̂ = 1.26.
(ii) The following eight observations: 3 4 8 10 12 18 22 35.
Problem 62.6
You model a loss process using a Pareto distribution with parameters α and
θ = 10. You are given:
Problem 62.7 ‡
You are given:
(i) Loss payments for a group health policy follow an exponential distribu-
tion with unknown mean.
(ii) A sample of losses is: 100 200 400 800 1400 3100.
Use the delta method to approximate the variance of the maximum like-
lihood estimator of S(1500).
Problem 62.8 ‡
The time to an accident follows an exponential distribution. A random sam-
ple of size two has a mean time of 6. Let Y denote the mean of a new sample
of size two.
(a) Using moment generating functions, show that the sum of two independent
exponential random variables is a Gamma random variable.
(b) Determine the maximum likelihood estimate of Pr(Y > 10).
(c) Use the delta method to approximate the variance of the maximum like-
lihood estimator of FY (10).
Problem 62.9 ‡
A survival study gave (0.283, 1.267) as the symmetric linear 95% confidence
interval for H(5).
Using the delta method, determine the symmetric linear 95% confidence
interval for S(5).
Problem 62.10 ‡
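You have modeled eight loss ratios as $Y_t = \alpha + \beta t + \epsilon_t$, t = 1, 2, · · · , 8, where
$Y_t$ is the loss ratio for year t and $\epsilon_t$ is an error term.
You have determined:
\[ \begin{bmatrix} \hat{\alpha} \\ \hat{\beta} \end{bmatrix} = \begin{bmatrix} 0.5 \\ 0.02 \end{bmatrix} \]
\[ \mathrm{Var}\begin{bmatrix} \hat{\alpha} \\ \hat{\beta} \end{bmatrix} = \begin{bmatrix} 0.00055 & -0.00010 \\ -0.00010 & 0.00002 \end{bmatrix} \]
Estimate the standard deviation of the forecast for year 10, $\hat{Y}_{10} = \hat{\alpha} + 10\hat{\beta}$,
using the delta method.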
Problem 62.11 ‡
A sample of ten observations comes from a parametric family f (x, y, θ1 , θ2 )
with loglikelihood function
\[ \ln[L(\theta_1, \theta_2)] = \sum_{i=1}^{10}\ln[f(x_i, y_i, \theta_1, \theta_2)] = -2.5\theta_1^2 - 3\theta_1\theta_2 - \theta_2^2 + 5\theta_1 + 2\theta_2 + k \]
where k is a constant.
Example 63.1
Write the inequality that will result in the 95% non-normal confidence region
for the mean θ of the exponential distribution.
Solution.
The loglikelihood function is
\[ \ell(\theta) = -n\ln\theta - \sum_{i=1}^{n}\frac{x_i}{\theta}. \]
Now, from the table of the chi-square distribution, $\chi^2_{0.95} = 3.841$. Hence, the
confidence region is
\[ -n\ln\theta - \frac{n\bar{x}}{\theta} \ge -n - n\ln\bar{x} - 1.9205 \]
which must be evaluated numerically.
Remark 63.1
Since numerical methods are needed to determine most non-normal
confidence intervals, and exam candidates are not provided with the tools
(such as Excel) for solving such problems, questions of this type are very
unlikely to appear on the exam.
Example 63.2
Suppose that n = 20 and $\bar{x} = 1424.40$ in the previous example. Use Excel's
Solver to find the confidence interval by solving the confidence region
\[ -\frac{28488}{\theta} - 20\ln\theta \ge -167.15. \]
Solution.
Using Excel's Solver, one solves the equation
\[ -\frac{28488}{\theta} - 20\ln\theta = -167.15. \]
The two roots are 946.85 and 2285.05. Hence, the confidence interval is
[946.85, 2285.05]
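Any root finder can play the role of Excel's Solver here. As an illustration, the sketch below uses plain bisection in Python; the brackets 100 and 10000 are arbitrary starting values chosen on either side of the MLE θ̂ = 28488/20 = 1424.4, where the loglikelihood attains its maximum.

```python
import math

def g(theta):
    """Left side minus right side of the inequality in Example 63.2."""
    return -28488 / theta - 20 * math.log(theta) + 167.15

def bisect(f, lo, hi, tol=1e-8):
    """Find a root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return (lo + hi) / 2

theta_hat = 28488 / 20                      # MLE, where g is positive
lower = bisect(g, 100.0, theta_hat)         # root below the MLE
upper = bisect(g, theta_hat, 10000.0)       # root above the MLE
print(round(lower, 2), round(upper, 2))
```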
63 NON-NORMAL CONFIDENCE INTERVALS FOR PARAMETER ESTIMATION473
Practice Problems
Problem 63.1
You are given:
• `(θ) = 5 ln θ − 2.55413(θ − 1).
• θ̂ = 1.95762.
Problem 63.2
Use Excel’s Solver in finding the confidence interval by solving the confidence
region
5 ln θ − 2.55413(θ − 1) ≥ −1.0077.
Problem 63.3
You are given:
• `(θ) = −θ2 + 11θ − 24.
• θ̂ = 3.
Example 64.1
Show that the uniform prior distribution on the real line π(θ) = 1 for −∞ <
θ < ∞ is improper.
Solution.
This follows from
\[ \int_{-\infty}^{\infty} 1\,d\theta = \infty \]
where x = (x1 , x2 , · · · , xn )T .
Example 64.2 ‡
You are given:
(i) A portfolio consists of 100 identically and independently distributed risks.
(ii) The number of claims for each risk follows a Poisson distribution with
mean λ.
(iii) The prior distribution of λ is:
\[ \pi(\lambda) = \frac{(50\lambda)^4 e^{-50\lambda}}{6\lambda}, \quad \lambda > 0. \]
During Year 1, the following loss experience is observed:
# of claims # of risks
0 90
1 7
2 2
3 1
Total 100
Solution.
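(a) The model distribution is
\[ f_{X|\Lambda}(x|\lambda) = \left(\frac{e^{-\lambda}\lambda^0}{0!}\right)^{90}\left(\frac{e^{-\lambda}\lambda^1}{1!}\right)^{7}\left(\frac{e^{-\lambda}\lambda^2}{2!}\right)^{2}\left(\frac{e^{-\lambda}\lambda^3}{3!}\right) = \frac{e^{-100\lambda}\lambda^{14}}{24}. \]
\[ \pi_{\Lambda|X}(\lambda|x) = \frac{150^{18}}{17!}\,e^{-150\lambda}\lambda^{17}. \]
Note that the posterior distribution is a Gamma distribution with α = 18
and $\theta = \frac{1}{150}$.
(e) The predictive distribution is
\begin{align*}
f_{Y|X}(y|x) &= \int_0^{\infty}\frac{e^{-\lambda}\lambda^y}{y!}\cdot\frac{150^{18}}{17!}\,e^{-150\lambda}\lambda^{17}\,d\lambda = \frac{150^{18}}{y!\,17!}\int_0^{\infty}\lambda^{17+y}e^{-151\lambda}\,d\lambda \\
&= \frac{150^{18}}{y!\,17!}\cdot\frac{(17+y)!}{151^{18+y}}
\end{align*}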
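The predictive distribution in (e) is a negative binomial distribution, since $\frac{(17+y)!}{y!\,17!} = \binom{17+y}{y}$. A brief Python sketch (function names are illustrative) checks that the closed form obtained from the integral agrees with the negative binomial probability function, that the probabilities sum to one, and that the predictive mean equals the posterior mean 18/150 = 0.12:

```python
import math

def predictive_pmf(y, alpha=18, rate=150):
    """Negative binomial form: C(alpha+y-1, y) (rate/(rate+1))^alpha (1/(rate+1))^y."""
    return (math.comb(alpha + y - 1, y)
            * (rate / (rate + 1)) ** alpha
            * (1 / (rate + 1)) ** y)

def predictive_from_integral(y):
    """Closed form 150^18 (17+y)! / (y! 17! 151^(18+y)) from the integral in (e)."""
    numerator = 150 ** 18 * math.factorial(17 + y)
    denominator = math.factorial(y) * math.factorial(17) * 151 ** (18 + y)
    return numerator / denominator

total = sum(predictive_pmf(y) for y in range(60))
mean = sum(y * predictive_pmf(y) for y in range(60))
print(total, mean)
```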
Example 64.3 ‡
You are given:
64 BASICS OF BAYESIAN INFERENCE 477
\[ f(x|\theta) = \frac{\theta}{(x+\theta)^2}, \quad x > 0. \]
(ii) For half of the company’s policies θ = 1, while for the other half θ = 3.
For a randomly selected policy, losses in Year 1 were 5. Determine the
posterior probability that losses for this policy in Year 2 will exceed 8.
Solution.
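Let $X_n$ denote the losses in Year n. We are asked to find $\Pr(X_2 > 8|X_1 = 5)$.
Recall from Section 3 that probabilities can be found by conditioning. Thus,
we can write
\begin{align*}
\Pr(\theta = 1|X_1 = 5) = \pi(\theta = 1|x = 5) &= \frac{f(5|\theta = 1)\pi(1)}{f(5|\theta = 1)\pi(1) + f(5|\theta = 3)\pi(3)} \\
&= \frac{\frac{1}{36}\cdot\frac{1}{2}}{\frac{1}{36}\cdot\frac{1}{2} + \frac{3}{64}\cdot\frac{1}{2}} = \frac{16}{43}.
\end{align*}
Thus,
\[ \Pr(\theta = 3|X_1 = 5) = 1 - \Pr(\theta = 1|X_1 = 5) = 1 - \frac{16}{43} = \frac{27}{43}. \]
On the other hand,
\[ \Pr(X_2 > 8|\theta = 1) = \int_8^{\infty}\frac{dx}{(x+1)^2} = \frac{1}{9}. \]
Likewise,
\[ \Pr(X_2 > 8|\theta = 3) = \int_8^{\infty}\frac{3}{(x+3)^2}\,dx = \frac{3}{11}. \]
Finally,
\[ \Pr(X_2 > 8|X_1 = 5) = \frac{1}{9}\cdot\frac{16}{43} + \frac{3}{11}\cdot\frac{27}{43} = 0.2126 \]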
Example 64.4 ‡
You are given:
(i) Each risk has at most one claim each year.
(ii)
One randomly chosen risk has three claims during Years 1-6. Determine the
posterior probability of a claim for this risk in Year 7.
Solution.
Let $X_n$ denote the number of claims in year n. Then $X_n|\Theta$ is a Bernoulli
random variable with probability of a claim (success) p. Let $S = \sum_{i=1}^{6} X_i$.
Then S|Θ has a binomial distribution with 6 trials (number of years) and
probability of a claim p. Note that S = 3 is the observed number of claims in
6 years. We are asked to find $\Pr(X_7 = 1|S = 3)$, which by the definition of
conditional probability is
\[ \Pr(X_7 = 1|S = 3) = \frac{\Pr[(X_7 = 1) \cap (S = 3)]}{\Pr(S = 3)}. \]
\begin{align*}
\Pr(S = 3|\theta_1) &= \binom{6}{3}(0.1)^3(0.9)^3 = 0.01458 \\
\Pr(S = 3|\theta_2) &= \binom{6}{3}(0.2)^3(0.8)^3 = 0.08192 \\
\Pr(S = 3|\theta_3) &= \binom{6}{3}(0.4)^3(0.6)^3 = 0.27648 \\
\Pr(S = 3) &= \Pr(S = 3|\theta_1)\pi(\theta_1) + \Pr(S = 3|\theta_2)\pi(\theta_2) + \Pr(S = 3|\theta_3)\pi(\theta_3) \\
&= 0.01458(0.7) + 0.08192(0.2) + 0.27648(0.1) = 0.054238.
\end{align*}
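The conditional probability can be evaluated numerically once the numerator is expanded in the same way as the denominator, using the conditional independence of $X_7$ and S given the risk class. The Python sketch below assumes the claim probabilities 0.1, 0.2, 0.4 and the prior weights 0.7, 0.2, 0.1 that appear in the computation of Pr(S = 3) above; the variable names are illustrative.

```python
from math import comb

# (claim probability, prior weight) for each class, read off the computation above.
classes = [(0.1, 0.7), (0.2, 0.2), (0.4, 0.1)]

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Denominator: Pr(S = 3).
marginal = sum(binom_pmf(3, 6, p) * w for p, w in classes)

# Numerator: Pr(X7 = 1 and S = 3), using conditional independence given the class.
joint = sum(p * binom_pmf(3, 6, p) * w for p, w in classes)

posterior_claim = joint / marginal
print(round(marginal, 6), round(joint, 7), round(posterior_claim, 4))
```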
(iii) Two claims are made in a given year. Determine the mode of the
posterior distribution of q.
Solution.
We have
The mode is the value of q that maximizes the posterior distribution. Taking
the derivative and setting it to zero, we find
Moreover,
\[ \pi''(2|0.5) = -6(0.5)^4/f(2) < 0 \]
so that the posterior distribution is maximized when q = 0.5. That is, the
mode of the posterior distribution is 0.5.
Example 64.7 ‡
The observation from a single experiment has distribution:
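\[ \Pr\left(G = \tfrac{1}{5}\right) = \tfrac{3}{5} \quad\text{and}\quad \Pr\left(G = \tfrac{1}{3}\right) = \tfrac{2}{5}. \]
Calculate $\Pr\left(G = \frac{1}{3}\,\middle|\,D = 0\right)$.
Solution.
We have
\begin{align*}
\Pr\left(G = \tfrac{1}{3}\,\middle|\,D = 0\right) &= \frac{\Pr\left(D = 0|G = \tfrac{1}{3}\right)\Pr\left(G = \tfrac{1}{3}\right)}{\Pr\left(D = 0|G = \tfrac{1}{3}\right)\Pr\left(G = \tfrac{1}{3}\right) + \Pr\left(D = 0|G = \tfrac{1}{5}\right)\Pr\left(G = \tfrac{1}{5}\right)} \\
&= \frac{(1/3)(2/5)}{(1/3)(2/5) + (1/5)(3/5)} = \frac{10}{19}
\end{align*}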
Practice Problems
Problem 64.1
Show that the prior distribution $\pi(\theta) = \frac{1}{\theta}$, 0 < θ < ∞, is improper.
Problem 64.2 ‡
You are given:
(i) In a portfolio of risks, each policyholder can have at most two claims per
year.
(ii) For each year, the distribution of the number of claims is:
\[ \pi(q) = \frac{q^2}{0.039}, \quad 0.2 < q < 0.5. \]
A randomly selected policyholder had two claims in Year 1 and two claims
in Year 2. For this insured, determine
(a) the model distribution
(b) the joint distribution
(c) the marginal distribution
(d) the posterior distribution.
Problem 64.3 ‡
You are given:
(i) The annual number of claims for a policyholder follows a Poisson distri-
bution with mean λ.
(ii) The prior distribution of Λ is Gamma with probability density function:
\[ f(\lambda) = \frac{(2\lambda)^5 e^{-2\lambda}}{24\lambda}, \quad \lambda > 0. \]
An insured is selected at random and observed to have x1 = 5 claims during
Year 1 and x2 = 3 claims during Year 2.
Problem 64.4 ‡
You are given:
(i) Annual claim frequencies follow a Poisson distribution with mean λ.
(ii) The prior distribution of Λ has probability density function:
\[ \pi(\lambda) = \frac{0.4}{6}\,e^{-\frac{\lambda}{6}} + \frac{0.6}{12}\,e^{-\frac{\lambda}{12}}, \quad \lambda > 0. \]
Ten claims are observed for an insured in Year 1.
Problem 64.5 ‡
You are given:
(i) In a portfolio of risks, each policyholder can have at most one claim per
year.
(ii) The probability of a claim for a policyholder during a year is q.
(iii) The prior density is $\pi(q) = \frac{q^3}{0.07}$, 0.6 < q < 0.8.
A randomly selected policyholder has one claim in Year 1 and zero claims
in Year 2.
For this policyholder, determine the posterior probability that 0.7 < q < 0.8.
Problem 64.6 ‡
You are given:
(i) The probability that an insured will have exactly one claim is θ.
(ii) The prior distribution of Θ has probability density function:
\[ \pi(\theta) = 1.5\sqrt{\theta}, \quad 0 < \theta < 1. \]
Problem 64.7 ‡
You are given:
(i) The prior distribution of the parameter Θ has probability density func-
tion:
\[ \pi(\theta) = \frac{1}{\theta^2}, \quad \theta > 1. \]
Given Θ = θ, claim sizes follow a Pareto distribution with parameters α = 2
and θ.
A claim of 3 is observed.
Problem 64.8 ‡
You are given:
(i) The number of claims observed in a 1-year period has a Poisson distri-
bution with mean θ.
(ii) The prior density is:
\[ \pi(\theta) = \frac{e^{-\theta}}{1 - e^{-k}}, \quad 0 < \theta < k. \]
(iii) The unconditional probability of observing zero claims in 1 year is 0.575.
Determine k.
Problem 64.9 ‡
You are given:
(i) Conditionally, given β, an individual loss X follows the
exponential distribution with probability density function:
\[ f(x|\beta) = \frac{1}{\beta}\,e^{-\frac{x}{\beta}}, \quad 0 < x < \infty. \]
(ii) The prior distribution of β is inverse gamma with probability density
function:
\[ \pi(\beta) = \frac{c^2}{\beta^3}\,e^{-\frac{c}{\beta}}, \quad 0 < \beta < \infty. \]
(iii) $\int_0^{\infty}\frac{1}{y^n}\,e^{-\frac{a}{y}}\,dy = \frac{(n-2)!}{a^{n-1}}$, n = 2, 3, 4, · · · .
Given that the observed loss is x, calculate the mean of the posterior
distribution of β.
Problem 64.10 ‡
You are given:
(i) Annual claim counts follow a Poisson distribution with mean λ.
(ii) The parameter λ has a prior distribution with probability density func-
tion:
\[ f(\lambda) = \frac{1}{3}\,e^{-\frac{\lambda}{3}}. \]
Two claims were observed during the first year.
Problem 64.11 ‡
You are given:
(i) For Q = q, the random variables X1 , X2 , · · · , Xm are independent, iden-
tically distributed Bernoulli random variables with parameter q.
(ii) Sm = X1 + X2 + · · · + Xm .
(iii) The prior distribution of Q is beta with a = 1, b = 99, and θ = 1.
Problem 64.12 ‡
You are given the following information about workers compensation cover-
age:
(i) The number of claims for an employee during the year follows a Poisson
distribution with mean (100 − p)/100, where p is the salary (in thousands)
for the employee.
(ii) The distribution of p is uniform on the interval (0, 100].
An employee is selected at random. No claims were observed for this em-
ployee during the year.
Determine the posterior probability that the selected employee has salary
greater than 50 thousand.
Problem 64.13 ‡
Prior to observing any claims, you believed that claim sizes followed a Pareto
distribution with parameters θ = 10 and α = 1, 2 or 3 , with each value
being equally likely.
You then observe one claim of 20 for a randomly selected risk.
Determine the posterior probability that the next claim for this risk will
be greater than 30.
65 BAYESIAN PARAMETER ESTIMATION 485
Theorem 65.1
For squared-error loss, the Bayes estimate is the mean of the posterior dis-
tribution; for absolute loss, it is a median; for zero-one loss, it is a mode.
Example 65.1
The posterior distribution of θ is given by $\pi_{\Theta|X}(\theta|x) = \frac{1}{x}e^{-\frac{\theta}{x}}$. Determine
the Bayes estimate of θ using
(a) the squared-error loss function;
(b) the absolute loss function;
(c) the zero-one loss function.
Solution.
(a) The mean of the posterior distribution is x. Hence, θ̂ = x.
(b) The median of the posterior distribution is the number M such that
$\int_0^M \frac{1}{x}e^{-\frac{\theta}{x}}\,d\theta = 0.5$. Solving this equation, we find M = x ln 2. Thus,
θ̂ = x ln 2.
(c) The value that maximizes the posterior distribution is θ = 0. That is,
the mode of the posterior distribution is 0. Hence, θ̂ = 0.
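The three estimates of Example 65.1 can be checked numerically for a particular value of x. In the sketch below, x = 2 is an arbitrary illustrative choice; the mean is approximated by the trapezoidal rule on a truncated range, and the mode is located by a simple grid search.

```python
import math

def posterior_pdf(theta, x):
    """Posterior density (1/x) exp(-theta/x) for theta >= 0."""
    return math.exp(-theta / x) / x

def posterior_cdf(theta, x):
    return 1.0 - math.exp(-theta / x)

x = 2.0  # illustrative observed value

# (a) Squared-error loss: posterior mean, trapezoidal rule on [0, 50x].
steps = 200000
h = 50 * x / steps
mean = h * sum(
    (0.5 if k in (0, steps) else 1.0) * (k * h) * posterior_pdf(k * h, x)
    for k in range(steps + 1)
)

# (b) Absolute loss: posterior median x ln 2.
median = x * math.log(2)

# (c) Zero-one loss: the density is decreasing, so the grid maximum is at 0.
grid = [0.01 * k for k in range(1000)]
mode = max(grid, key=lambda t: posterior_pdf(t, x))

print(round(mean, 4), round(median, 4), mode)
```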
expressed as follows:
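\begin{align*}
E(Y|x) &= \int y f_{Y|X}(y|x)\,dy = \int y\int f_{Y|\Theta}(y|\theta)\pi_{\Theta|X}(\theta|x)\,d\theta\,dy \\
&= \int \pi_{\Theta|X}(\theta|x)\int y f_{Y|\Theta}(y|\theta)\,dy\,d\theta = \int E(Y|\theta)\pi_{\Theta|X}(\theta|x)\,d\theta.
\end{align*}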
Example 65.2 ‡
You are given:
(i) The annual number of claims for a policyholder has a binomial distribu-
tion with probability function:
\[ p(x|q) = \binom{2}{x}q^x(1-q)^{2-x}, \quad x = 0, 1, 2. \]
This policyholder had one claim in each of Years 1 and 2. Determine the
Bayesian estimate of the number of claims in Year 3.
Solution.
Let X be the previously observed data and let Y be the number of claims
in Year 3. Then E(Y |q) = nq = 2q. The joint distribution is
Thus,
\[ E(Y|x) = \int_0^1 E(Y|q)\pi_{Q|X}(q|x)\,dq = \int_0^1 (2q)[168q^5(1-q)^2]\,dq = \frac{4}{3} \]
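The final integral is a beta integral, $\int_0^1 q^{a-1}(1-q)^{b-1}\,dq = \frac{(a-1)!\,(b-1)!}{(a+b-1)!}$ for positive integers a and b, so it can be evaluated exactly. A minimal Python check follows (the helper `beta_integral` is an illustrative name):

```python
from fractions import Fraction
from math import factorial

def beta_integral(a, b):
    """Integral of q^(a-1) (1-q)^(b-1) over [0, 1] for positive integers a, b."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))

# The posterior density 168 q^5 (1-q)^2 integrates to one: 168 * B(6, 3) = 1.
normalisation = 168 * beta_integral(6, 3)

# E(Y|x) = integral of 2q * 168 q^5 (1-q)^2 = 336 * B(7, 3).
estimate = 336 * beta_integral(7, 3)
print(normalisation, estimate)
```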
Example 65.3 ‡
For a group of insureds, you are given:
(i) The amount of a claim is uniformly distributed but will not exceed a
certain unknown limit θ.
(ii) The prior distribution of Θ is
\[ \pi(\theta) = \frac{500}{\theta^2}, \quad \theta > 500. \]
Two independent claims of 400 and 600 are observed.
Determine the probability that the next claim will exceed 550.
Solution.
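We are asked to find $\Pr(X_3 > 550|X_1, X_2)$. We have
\begin{align*}
\Pr(X_3 > 550|X_1, X_2) &= \int_{550}^{\infty} f_{X_3|X}(x_3|x)\,dx_3 = \int_{550}^{\infty}\int f_{X_3|\Theta}(x_3|\theta)\pi_{\Theta|X}(\theta|x)\,d\theta\,dx_3 \\
&= \int \pi_{\Theta|X}(\theta|x)\int_{550}^{\infty} f_{X_3|\Theta}(x_3|\theta)\,dx_3\,d\theta = \int \Pr(X_3 > 550|\Theta)\pi_{\Theta|X}(\theta|x)\,d\theta.
\end{align*}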
A 100(1 − α)% credibility interval for θ is an interval [a, b] such that the
posterior probability
Pr(a ≤ Θ ≤ b|x) ≥ 1 − α.
Theorem 65.2
If the posterior random variable Θ|X is continuous and unimodal, then the
100(1 − α)% credibility interval with smallest width b − a is the unique
solution to the following system of equations:
\begin{align*}
\int_a^b \pi_{\Theta|X}(\theta|x)\,d\theta &= 1 - \alpha \\
\pi_{\Theta|X}(a|x) &= \pi_{\Theta|X}(b|x).
\end{align*}
Example 65.4
You are given:
(i) The probability that an insured will have at least one loss during any
year is p.
(ii) The prior distribution for p is uniform on [0, 0.5].
(iii) An insured is observed for 8 years and has at least one loss every year.
Develop a non-zero width 95% credibility interval for the posterior proba-
bility that the insured will have at least one loss during Year 9.
Solution.
In Problem 65.2, we find
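Thus, according to Theorem 65.2, one of the equations of the system that
we will need to solve is
\[ \int_a^b \pi_{\Theta|X}(\theta|x)\,d\theta = 1 - \alpha \implies \int_a^b 4608p^8\,dp = 0.95. \]
The second equation, $4608a^8 = 4608b^8$, can hold only if a = b or
a = −b. The solution a = b would give a width of zero, and a = −b is not
admissible because the posterior distribution is supported on [0, 0.5]. Since
the posterior density $4608p^8$ is increasing on [0, 0.5], the shortest interval
carrying 95% of the posterior probability has right endpoint b = 0.5, and a
satisfies
\[ \int_a^{0.5} 4608p^8\,dp = 1 - 512a^9 = 0.95 \implies a = 0.3584. \]
Thus, the credibility interval is [0.3584, 0.5]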
Now, for large samples, there is a version of the Central Limit Theorem
which we call the Bayesian Central Limit Theorem:
If π(θ) (the prior distribution) and $f_{X|\Theta}(x|\theta)$ (the model distribution) are
both twice differentiable in the elements of θ and other commonly satisfied
assumptions hold, then the posterior distribution of Θ given X = x is
asymptotically normal.
Example 65.5
Redo Example 65.4 using the Bayesian Central Limit Theorem.
Solution.
Both the prior distribution function and the model distribution are twice
differentiable so that we can assume that the posterior distribution is
approximately normal. We have
\begin{align*}
E(p) &= 0.45 \\
\mathrm{Var}(p) &= E(p^2) - E(p)^2 \\
&= \int_0^{0.5} 4608p^{10}\,dp - 0.45^2 = 0.002.
\end{align*}
Practice Problems
Problem 65.1
The true value of a parameter is θ = 50. Suppose that the Bayes estimate of
θ using the squared-error loss function is θ̂ = 52. Find the absolute value of
the difference between the squared-error loss function and the absolute loss
function.
Problem 65.2 ‡
You are given:
(i) The probability that an insured will have at least one loss during any
year is p.
(ii) The prior distribution for p is uniform on [0, 0.5].
(iii) An insured is observed for 8 years and has at least one loss every year.
Determine the posterior probability that the insured will have at least one
loss during Year 9.
Problem 65.3
You are given:
(i) The probability that an insured will have at least one loss during any
year is p.
(ii) The prior distribution for p is uniform on [0, 0.5].
(iii) An insured is observed for 8 years and has at least one loss every year.
Using the Bayesian Central Limit Theorem, estimate the posterior prob-
ability that p > 0.6.
Problem 65.4 ‡
You are given:
(i) The amount of a claim, X, is uniformly distributed on the interval [0, θ].
(ii) The prior density of θ is $\pi(\theta) = \frac{500}{\theta^2}$, θ > 500.
Two claims, x1 = 400 and x2 = 600, are observed. You calculate the
posterior distribution as:
\[ \pi_{\Theta|X}(\theta|x) = 3\left(\frac{600^3}{\theta^4}\right), \quad \theta > 600. \]
Calculate E(X3 |X).
Problem 65.5 ‡
You are given:
(i) In a portfolio of risks, each policyholder can have at most two claims per
year.
(ii) For each year, the distribution of the number of claims is:
\[ \pi(q) = \frac{q^2}{0.039}, \quad 0.2 < q < 0.5. \]
A randomly selected policyholder had two claims in Year 1 and two claims
in Year 2.
For this insured, determine the Bayesian estimate of the expected number
of claims in Year 3.
Problem 65.6 ‡
You are given:
(i) The annual number of claims for each policyholder follows a Poisson
distribution with mean θ.
(ii) The distribution of θ across all policyholders has probability density
function:
\[ f(\theta) = \theta e^{-\theta}, \quad \theta > 0. \]
(iii) $\int_0^\infty \theta e^{-n\theta}\,d\theta = \frac{1}{n^2}$.
A randomly selected policyholder is known to have had at least one claim
last year.
Determine the posterior probability that this same policyholder will have
at least one claim this year.
66 CONJUGATE PRIOR DISTRIBUTIONS 493
Example 66.1
You are given:
• The prior distribution Θ is Gamma with parameters α and β.
• The model distribution is exponential with parameter θ.
Show that Θ has a conjugate prior distribution.
Solution.
The model distribution is
\[ f_{X|\Theta}(x|\theta) = \theta^n e^{-\theta\sum_{i=1}^n x_i}. \]
Example 66.2
You are given:
• The prior distribution Λ is Gamma with parameters α and θ.
• The model distribution X|Λ is Poisson with parameter λ.
Show that Λ has a conjugate prior distribution.
Solution.
The model distribution is
\[ f_{X|\Lambda}(x|\lambda) = \frac{e^{-n\lambda}\lambda^{\sum_{i=1}^n x_i}}{\prod_{i=1}^n x_i!}. \]
Example 66.3
You are given:
• The prior distribution Λ is normal with mean µ and variance a2 .
• The model distribution X|Λ is normal with mean λ and variance σ 2 .
Show that Λ has a conjugate prior distribution.
Solution.
The model distribution is
\[ f_{X|\Lambda}(x|\lambda) = \frac{1}{\sigma^n\sqrt{(2\pi)^n}}\,e^{-\frac{\sum_{i=1}^n (x_i-\lambda)^2}{2\sigma^2}}. \]
Practice Problems
Problem 66.1
You are given:
• The prior distribution Q is beta with parameters a, b and 1.
• The model distribution X|Q is binomial with parameters m and q.
Problem 66.2
You are given:
• The prior distribution Λ is Gamma with parameters α and θ.
• The model distribution X|Λ is inverse exponential with parameter λ.
Problem 66.3
You are given:
• The prior distribution Λ is inverse Gamma with parameters α and θ.
• The model distribution X|Λ is exponential with mean λ.
Problem 66.4
You are given:
• The prior distribution Λ is a single parameter Pareto with parameter α
and with pdf $f(\lambda) = \frac{\alpha\theta^\alpha}{\lambda^{\alpha+1}}$, λ > θ.
• The model distribution X|Λ is uniform in [0, λ].
Example 67.1
Let N be a Poisson random variable with parameter λ. Then E(N ) = λ and
Var(N ) = λ.
(a) Estimate the Poisson parameter using the method of moments.
(b) Estimate the Poisson parameter using the method of maximum likeli-
hood.
(c) Calculate E(λ̂) and Var(λ̂).
(d) Find the asymptotic variance of λ̂.
(e) Construct a 95% confidence interval of the true value of λ.
Solution.
(a) The Poisson distribution parameter estimate by the method of moments
is
\[ \hat\lambda = \bar{x} = \frac{\sum_{k=1}^\infty k n_k}{\sum_{k=0}^\infty n_k} = \frac{\sum_{k=1}^\infty k n_k}{n}. \]
\[ L(\lambda) = \prod_{k=0}^\infty \left(\frac{e^{-\lambda}\lambda^k}{k!}\right)^{n_k}. \]
(c) We have
\[ E(\hat\lambda) = E(N) = \lambda \]
\[ \mathrm{Var}(\hat\lambda) = \frac{\mathrm{Var}(N)}{n} = \frac{\lambda}{n}. \]
The following example describes the process of finding the likelihood func-
tion in the case of incomplete data.
Example 67.2
The distribution of accidents for 70 randomly selected policies is as follows:
Solution.
The likelihood function is given by
\[ L(\lambda) = p_0^{31}\,p_1^{20}\,p_2^{12}\,(1 - p_0 - p_1 - p_2)^7 \]
where
\[ p_k = \frac{e^{-\lambda}\lambda^k}{k!} \]
Example 67.3
Estimate the negative binomial parameters by the method of moments.
Solution.
By the method of moments, we have the following system of two equations
\[ r\beta = \frac{\sum_{k=1}^\infty k n_k}{n} = \bar{x} \]
and
\[ r\beta(1+\beta) = \frac{\sum_{k=1}^\infty k^2 n_k}{n} - \left(\frac{\sum_{k=1}^\infty k n_k}{n}\right)^2 = s^2. \]
Note that if $s^2 < \bar{x}$ then $\hat\beta < 0$, which is an indication that the negative
binomial model is not a good representation for the data.
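The two moment equations can be solved in closed form: dividing the second by the first gives 1 + β = s²/x̄, and then r = x̄/β. The following is a minimal sketch of that calculation, not part of the original text. The function name and the sample counts are hypothetical and chosen only for illustration.

```python
def nb_method_of_moments(counts):
    """Moment estimates (r, beta) for a negative binomial model.

    counts[k] is n_k, the number of observations equal to k.
    Returns None when s^2 <= x-bar, i.e. when beta-hat would not be positive.
    """
    n = sum(counts)
    mean = sum(k * nk for k, nk in enumerate(counts)) / n
    second = sum(k * k * nk for k, nk in enumerate(counts)) / n
    var = second - mean ** 2      # s^2 with divisor n, as in the equations above
    beta = var / mean - 1         # from r*beta*(1+beta) / (r*beta) = 1 + beta
    if beta <= 0:
        return None               # negative binomial is not a good representation
    r = mean / beta               # from r*beta = x-bar
    return r, beta


# Hypothetical data: 50 zeros, 25 ones, 15 twos, 7 threes, 3 fours.
print(nb_method_of_moments([50, 25, 15, 7, 3]))
```

Returning None instead of a negative β̂ mirrors the remark above that s² < x̄ signals a poor model choice.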
Example 67.4
Estimate the negative binomial parameters by the method of maximum
likelihood.
Solution.
Let
\[ p_k = \binom{k+r-1}{k}\left(\frac{1}{1+\beta}\right)^r\left(\frac{\beta}{1+\beta}\right)^k, \quad k = 0, 1, 2, \cdots. \]
The likelihood function is
\[ L(r, \beta) = \prod_{k=0}^\infty p_k^{n_k} \]
The above equations are usually solved using numerical methods such as the
Newton-Raphson method. Note that if r is given then $\hat\beta = \bar{x}/r$.
Example 67.5
Let N be a binomial random variable with parameters m and q. Then
E(N ) = mq and Var(N ) = mq(1 − q).
(a) Estimate q using the method of moments, assuming that m is known.
(b) Estimate q using the method of moments, assuming that m is unknown.
67 ESTIMATION OF CLASS (A, B, 0) 501
Solution.
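(a) We have
\[ m\hat{q} = \frac{\sum_{k=1}^\infty k n_k}{n} \Longrightarrow \hat{q} = \frac{\sum_{k=1}^\infty k n_k}{nm}. \]
(b) We have to solve the system of two equations
\[ \hat{m}\hat{q} = \frac{\sum_{k=1}^\infty k n_k}{n} \quad\text{and}\quad \hat{m}\hat{q}(1-\hat{q}) = \frac{\sum_{k=1}^\infty k^2 n_k}{n} - \left(\frac{\sum_{k=1}^\infty k n_k}{n}\right)^2. \]
We obtain
\[ \hat{q} = 1 - \frac{\sum_{k=1}^\infty k^2 n_k}{\sum_{k=1}^\infty k n_k} + \frac{\sum_{k=1}^\infty k n_k}{n} \quad\text{and}\quad \hat{m} = \frac{\sum_{k=1}^\infty k n_k}{n\hat{q}}. \]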
Example 67.6
Let N be a binomial random variable with parameters m and q. Then
E(N ) = mq and Var(N ) = mq(1 − q).
Estimate q using the method of maximum likelihood, assuming that m is
known.
Solution.
Let
\[ p_k = \Pr(N = k) = \binom{m}{k} q^k (1-q)^{m-k}. \]
The likelihood function is
\[ L(m, q) = \prod_{k=0}^m p_k^{n_k}. \]
The loglikelihood function is
\[ \ell(m, q) = \sum_{k=0}^m n_k \ln p_k = \sum_{k=0}^m n_k\left[\ln\binom{m}{k} + k\ln q + (m-k)\ln(1-q)\right]. \]
Example 67.7 ‡
You are given:
(i) A hospital liability policy has experienced the following numbers of claims
over a 10-year period:
10 2 4 0 6 2 4 5 4 2
Solution.
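Recall that
\[ \hat\lambda = \bar{X} = \frac{10 + 2 + 4 + 0 + 6 + 2 + 4 + 5 + 4 + 2}{10} = 3.9. \]
We have
\[ E(\hat\lambda) = E(\bar{X}) = \lambda, \qquad \mathrm{Var}(\hat\lambda) = \frac{\lambda}{n} \]
\[ CV = \frac{\sqrt{\lambda/n}}{\lambda} = \frac{1}{\sqrt{\lambda n}} = \frac{1}{\sqrt{39}} = 0.16 \]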
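As a quick numerical check of the calculation above, the following sketch recomputes the estimate and the estimated coefficient of variation from the ten observed counts. The variable names are illustrative only.

```python
from math import sqrt

# Yearly claim counts given in Example 67.7
claims = [10, 2, 4, 0, 6, 2, 4, 5, 4, 2]
n = len(claims)

lam_hat = sum(claims) / n                 # sample mean, the Poisson estimate
cv_hat = sqrt(lam_hat / n) / lam_hat      # sqrt(Var)/mean with lambda replaced by its estimate

print(lam_hat, round(cv_hat, 2))
```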
Practice Problems
Problem 67.1
The distribution of accidents for 84 randomly selected policies is as follows:
Problem 67.2 ‡
You are given the following observed claim frequency data collected over a
period of 365 days:
Fit a Poisson distribution to the above data, using the method of maximum
likelihood.
Problem 67.3
You are given the following observed claim frequency data collected over a
period of 365 days:
Problem 67.4
You are given:
(i)
k 0 1 2 3 4
nk 30 35 20 10 5
Problem 67.5
You are given:
(i)
k 0 1 2 3 4
nk 30 35 20 10 5
Problem 67.6 ‡
The number of claims follows a negative binomial distribution with param-
eters β and r, where β is unknown and r is known. You wish to estimate β
based on n observations, where x is the mean of these observations.
Problem 67.7 ‡
The distribution of accidents for 100 randomly selected policies is as follows:
Problem 67.8 ‡
You are given the following data for the number of claims during a one-year
period:
Calculate |P − Q|.
\[ p_k^M = \frac{1 - p_0^M}{1 - p_0}\,p_k = (1 - p_0^M)\,p_k^T, \quad k = 1, 2, 3, \cdots \]
where
\[ p_k = \left(a + \frac{b}{k}\right)p_{k-1}, \quad k = 2, 3, \cdots \]
and $p_0^M = \alpha$, $0 \le \alpha < 1$. The parameters to be estimated are $a$, $b$, and $p_0^M$.
where
\[ \ell_0 = n_0 \ln p_0^M + \sum_{k=1}^\infty n_k \ln(1 - p_0^M) \]
\[ \ell_1 = \sum_{k=1}^\infty n_k[\ln p_k - \ln(1 - p_0)]. \]
15
This section has not appeared in any of the C exams.
68 MLE WITH (A, B, 1) CLASS 507
resulting in
\[ \hat{p}_0^M = \frac{n_0}{n}, \]
the proportion of observations that equal 0. This result is true for any
zero-modified distribution.
Example 68.1
Let $p_0^M$ be the zero-modified geometric probability function. Find the MLE
for $p_0^M$ and β.
Solution.
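Recall that
\[ p_k = \frac{\beta^k}{(1+\beta)^{k+1}}, \qquad p_0 = \frac{1}{1+\beta}. \]
The MLE of $p_0^M$ is
\[ \hat{p}_0^M = \frac{n_0}{n}. \]
For finding the MLE of β, we first find $\ell_1$. Since $1 - p_0 = \frac{\beta}{1+\beta}$, we have
\[ \ell_1 = \sum_{k=1}^\infty n_k[\ln p_k - \ln(1 - p_0)] \]
\[ = \sum_{k=1}^\infty n_k[k\ln\beta - (k+1)\ln(1+\beta) - \ln\beta + \ln(1+\beta)] \]
\[ = \ln\beta\sum_{k=1}^\infty (k n_k - n_k) - \ln(1+\beta)\sum_{k=1}^\infty [(k+1)n_k - n_k] \]
\[ \frac{\partial\ell_1}{\partial\beta} = \frac{1}{\beta}\sum_{k=1}^\infty n_k(k-1) - \frac{1}{1+\beta}\sum_{k=1}^\infty k n_k. \]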
Example 68.2
Find `1 for the zero-modified Poisson distribution.
Solution.
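We have
\[ \ell_1 = \sum_{k=1}^\infty n_k\left[\ln\left(\frac{e^{-\lambda}\lambda^k}{k!}\right) - \ln(1 - e^{-\lambda})\right] \]
\[ = -(n - n_0)\lambda + \left(\sum_{k=1}^\infty k n_k\right)\ln\lambda - (n - n_0)\ln(1 - e^{-\lambda}) + c \]
\[ = -(n - n_0)[\lambda + \ln(1 - e^{-\lambda})] + n\bar{x}\ln\lambda + c \]
where
\[ c = -\sum_{k=1}^\infty n_k \ln k! \]
and
\[ \bar{x} = \frac{\sum_{k=1}^\infty k n_k}{n} \]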
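To find the estimate of λ in Example 68.2, we set the first derivative of $\ell_1$
with respect to λ to zero:
\[ \frac{\partial\ell_1}{\partial\lambda} = -\frac{n - n_0}{1 - e^{-\lambda}} + \frac{n\bar{x}}{\lambda} = 0 \]
resulting in
\[ \bar{x}(1 - e^{-\lambda}) = \frac{n - n_0}{n}\lambda \]
which is solved numerically for λ. Note also that this last equation can be
expressed as
\[ \bar{x} = \frac{1 - \hat{p}_0^M}{1 - p_0}\lambda. \]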
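The equation for λ has no closed-form solution, but it can be rewritten as λ/(1 − e^{−λ}) = n x̄/(n − n0), whose left side is increasing in λ, so a simple bisection works. The sketch below is one possible way to do this, not part of the original text. The function name and the sample counts are hypothetical.

```python
from math import exp

def fit_zero_modified_poisson(counts, tol=1e-12):
    """MLEs (p0M_hat, lambda_hat) for a zero-modified Poisson model.

    counts[k] is n_k, the number of observations equal to k.
    """
    n = sum(counts)
    n0 = counts[0]
    total = sum(k * nk for k, nk in enumerate(counts))
    p0m_hat = n0 / n                      # proportion of zeros
    target = total / (n - n0)             # n*x-bar/(n - n0), the mean of nonzero observations
    if target <= 1:
        raise ValueError("no positive solution: nonzero observations average at most 1")

    def g(lam):
        return lam / (1 - exp(-lam))      # increasing in lam, tends to 1 as lam -> 0

    lo, hi = 1e-9, 1.0
    while g(hi) < target:                 # expand the bracket until it contains the root
        hi *= 2
    while hi - lo > tol:                  # bisection
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return p0m_hat, (lo + hi) / 2


# Hypothetical data: 60 zeros, 25 ones, 10 twos, 4 threes, 1 four.
print(fit_zero_modified_poisson([60, 25, 10, 4, 1]))
```

Bisection is used here instead of Newton-Raphson only because it needs no derivative and cannot leave the bracket; either method solves the same equation.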
Example 68.3
Find `1 for the zero-modified binomial distribution.
Solution.
We have
\[ \ell_1 = \sum_{k=1}^m n_k\left\{\ln\left[\binom{m}{k}q^k(1-q)^{m-k}\right] - \ln[1 - (1-q)^m]\right\} \]
\[ = \left(\sum_{k=1}^m k n_k\right)\ln q + \sum_{k=1}^m (m-k)n_k\ln(1-q) - \sum_{k=1}^m n_k\ln[1 - (1-q)^m] + c \]
\[ = n\bar{x}\ln q + m(n - n_0)\ln(1-q) - n\bar{x}\ln(1-q) - (n - n_0)\ln[1 - (1-q)^m] + c \]
where
\[ c = \sum_{k=1}^m n_k\ln\binom{m}{k}. \]
The equation $\frac{\partial\ell_1}{\partial q} = 0$ results in
\[ \bar{x} = \frac{1 - \hat{p}_0^M}{1 - p_0}\,mq \]
where $p_0 = (1-q)^m$. If m is known then the MLE of q is obtained by solving
the above equation.
Practice Problems
Problem 68.1
You have the following observations of a discrete random variable.
Frequency (k) nk
0 9048
1 905
2 45
3 2
4+ 0
Problem 68.2
You have the following observations of a discrete random variable.
Frequency (k) nk
0 9048
1 905
2 45
3 2
4+ 0
You are to fit these to a zero-modified Poisson distribution using maximum
likelihood. Find the MLE for $p_0^M$ and λ.
Problem 68.3
You have the following observations of a discrete random variable.
Frequency (k) nk
0 10
1 6
2 4
Problem 68.4
You have the following observations of a discrete random variable.
Frequency (k) nk
0 10
1 6
2 4
The goal of this chapter is to evaluate how well models fit a given
data set and to compare competing models.
514 MODEL SELECTION AND EVALUATION
One of the difficulties of the previous method is that the distinction is
difficult to make when the vertical distance between the two graphs is small.
We consider two ways of magnifying small differences to better interpret the
goodness of fit. The first method consists of graphing the difference function
$D(x) = F_n(x) - F^*(x)$, known as the D(x)-plot. A fit is considered good
if the graph of D(x) is close to the horizontal axis.
Example 69.1
You are given:
(i) The following observed data: 2, 3, 3, 3, 5, 8, 10, 13, 16.
(ii) An exponential distribution is fit to the data using the maximum likeli-
hood to estimate the mean of the exponential distribution.
(a) Plot $F_9(x)$ and $F^*(x)$ in the same window.
(b) Plot D(x).
(c) Create a p − p plot.
Solution.
The empirical mass function is
x      2    3    5    8    10   13   16
p(x)   1/9  1/3  1/9  1/9  1/9  1/9  1/9
(a) The plots of both F9 (x) and F ∗ (x) are shown in Figure 69.1.
Figure 69.1
The plot indicates that the fitted model is a reasonable one.
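(b) The difference function is
\[ D(x) = F_9(x) - F^*(x) = \begin{cases}
e^{-x/7} - 1, & x < 2 \\
e^{-x/7} - \frac{8}{9}, & 2 \le x < 3 \\
e^{-x/7} - \frac{5}{9}, & 3 \le x < 5 \\
e^{-x/7} - \frac{4}{9}, & 5 \le x < 8 \\
e^{-x/7} - \frac{1}{3}, & 8 \le x < 10 \\
e^{-x/7} - \frac{2}{9}, & 10 \le x < 13 \\
e^{-x/7} - \frac{1}{9}, & 13 \le x < 16 \\
e^{-x/7}, & x \ge 16.
\end{cases} \]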
Figure 69.2
(c) We first create the points on the graph
69 ASSESSING FITTED MODELS GRAPHICALLY 517
j   x_j   F_9(x_j) = j/(n+1) = j/10   F*(x_j)
1   2     0.1                         0.249
2   3     0.2                         0.349
3   3     0.3                         0.349
4   3     0.4                         0.349
5   5     0.5                         0.510
6   8     0.6                         0.681
7   10    0.7                         0.760
8   13    0.8                         0.844
9   16    0.9                         0.898
The p − p plot is shown in Figure 69.3
Figure 69.3
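The quantities in parts (b) and (c) can be reproduced numerically. The sketch below assumes the exponential mean is estimated by the sample mean (the maximum likelihood estimate) and that the horizontal p-p coordinate is j/(n+1), which matches the tabulated values 0.1, 0.2, ..., 0.9. All names are illustrative.

```python
from math import exp

data = sorted([2, 3, 3, 3, 5, 8, 10, 13, 16])   # observed data of Example 69.1
n = len(data)
theta = sum(data) / n                            # MLE of the exponential mean

def F_star(x):
    """Fitted exponential distribution function."""
    return 1 - exp(-x / theta)

def F_n(x):
    """Empirical distribution function."""
    return sum(1 for v in data if v <= x) / n

def D(x):
    """Difference function D(x) = F_n(x) - F*(x)."""
    return F_n(x) - F_star(x)

# p-p plot points (j/(n+1), F*(x_j)) for the ordered observations
pp_points = [(j / (n + 1), F_star(x)) for j, x in enumerate(data, start=1)]

for s, t in pp_points:
    print(round(s, 1), round(t, 3))
```

Plotting D over a grid of x values, or the list pp_points against the 45-degree line, gives the two diagnostic graphs described in this section.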
Example 69.2
You are given:
(i) The following are observed claim amounts:
400 1000 1600 3000 5000 5400 6200.
(ii) An exponential distribution with θ = 3300 is hypothesized for the data.
(iii) The data are left-truncated at 500.
Write the formula for F ∗ (x) and f ∗ (x).
Solution.
The distribution function using a truncation of 500 is
\[ F^*(x) = \begin{cases}
0, & x < 500 \\
\frac{F(x) - F(500)}{1 - F(500)} = 1 - e^{-\frac{x - 500}{3300}}, & x \ge 500
\end{cases} \]
Practice Problems
Problem 69.1 ‡
You are given:
(i) The following are observed claim amounts:
Let (s, t) be the coordinates of the p − p plot for a claim amount of 3000.
Determine (s − t) − D(3000).
Problem 69.2 ‡
The graph below shows a p − p plot of a fitted distribution compared to a
sample.
(A) The tails of the fitted distribution are too thick on the left and on
the right, and the fitted distribution has less probability around the median
than the sample.
(B) The tails of the fitted distribution are too thick on the left and on the
right, and the fitted distribution has more probability around the median
Problem 69.3 ‡
You are given the following p − p plot:
1 2 3 15 30 50 51 99 100
The test statistic is usually a measure of how close the fitted distribution
function is to the empirical distribution function. When the null hypothesis
completely specifies the model (i.e. the parameters of the fitted distribution
are given), critical values are well-known. In contrast, when the parameters
of the distribution function in H0 are to be estimated from the data, the test
statistic tends to be smaller than it would have been had the parameter
values been prespecified. The estimation method tries to choose parameters
that produce a distribution that is close to the data, and this decreases the
probability that the null hypothesis is rejected.
where Fn (x0 ) = 0.
Example 70.1
You are given:
(i) The following are observed claim amounts:
200 400 1000 1600 3000 5000 5400 6200
(ii) An exponential distribution with θ = 3300 is hypothesized for the data.
Find the value of D.
Solution.
We create the following chart.
x_j    F_8(x_{j-1})  F_8(x_j)  F*(x_j)  Maximum of difference
200    0             0.125     0.0588   0.0662
400    0.125         0.25      0.1142   0.1358
1000   0.25          0.375     0.2614   0.1136
1600   0.375         0.5       0.3842   0.1158
3000   0.5           0.625     0.5971   0.0971
5000   0.625         0.75      0.7802   0.1552
5400   0.75          0.875     0.8053   0.0697
6200   0.875         1.0       0.8472   0.1528
where $F^*(x) = 1 - e^{-x/3300}$. Hence, D = 0.1552
A value of D greater than the critical value will result in rejection of the
null hypothesis.
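The chart in Example 70.1 follows a mechanical rule: at each ordered observation, compare the fitted distribution function with the empirical distribution function just before and just after the jump, and keep the largest absolute difference. A minimal sketch of that rule is given below. It assumes complete individual data with no ties, and the function name is hypothetical.

```python
from math import exp

def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov statistic for complete individual data (no ties assumed)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for j, x in enumerate(xs, start=1):
        f = cdf(x)
        # empirical cdf is (j-1)/n just before x_j and j/n at x_j
        d = max(d, abs(f - (j - 1) / n), abs(f - j / n))
    return d


claims = [200, 400, 1000, 1600, 3000, 5000, 5400, 6200]    # data of Example 70.1
D = ks_statistic(claims, lambda x: 1 - exp(-x / 3300))     # hypothesized exponential, theta = 3300
critical_10pct = 1.22 / len(claims) ** 0.5                 # approximate critical value for alpha = 0.10
print(round(D, 4), round(critical_10pct, 3), D > critical_10pct)
```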
Example 70.2
Determine whether the testing in Example 70.1 will result in rejection of the
null hypothesis at the 10% significance level.
Solution.
For α = 0.10, the critical value is $\frac{1.22}{\sqrt{8}} = 0.431$, which is larger than D.
Hence, we fail to reject the null hypothesis
Example 70.3
Suppose the data in Example 70.1 is right censored at 5100 and the estimated
exponential mean is θ̂ = 3100. Find the KS statistic.
70 KOLMOGOROV-SMIRNOV HYPOTHESIS TEST OF FITTED MODELS523
Solution.
With censoring, we have $F^*(x) = 1 - e^{-x/3100}$ for x ≤ 5100 and $F^*(x) = 1$ for
x > 5100. Hence,
x_j    F_8(x_{j-1})  F_8(x_j)  F*(x_j)  Maximum of difference
200    0             0.125     0.0625   0.0625
400    0.125         0.25      0.1211   0.1289
1000   0.25          0.375     0.2757   0.0993
1600   0.375         0.5       0.4032   0.0968
3000   0.5           0.625     0.6201   0.1201
5000   0.625         0.75      0.8007   0.1757
5100   0.75          0.75      0.8070   0.0570
Hence, D = 0.1757
Remark 70.1
According to [1], if the data is right-censored then the critical values should
be smaller because there is less opportunity for the difference to become
large.
Practice Problems
Problem 70.1 ‡
You are given:
(i) A sample of claim payments is:
29 64 90 135 182
Problem 70.2 ‡
You are given a random sample of observations:
You test the hypothesis that the probability density function is:
\[ f(x) = \frac{4}{(1+x)^5}, \quad x > 0. \]
Determine the KS test statistic.
Problem 70.3 ‡
You are given:
(i) A random sample of five observations:
(ii) You use the Kolmogorov-Smirnov test for testing the null hypothesis,
H0 , that the probability density function for the population is:
\[ f(x) = \frac{4}{(1+x)^5}, \quad x > 0. \]
Problem 70.4 ‡
The size of a claim for an individual insured follows an inverse exponential
distribution with the following probability density function:
\[ f(x|\theta) = \frac{\theta e^{-\theta/x}}{x^2}, \quad x > 0. \]
For a particular insured, the following five claims are observed:
1 2 3 5 13
Problem 70.5
You are given:
(i) The following are observed claim amounts:
Problem 70.6
You use the Kolmogorov-Smirnov goodness-of-fit test to assess the fit of
the natural logarithms of n = 200 losses to a distribution with distribution
function F ∗ .
You are given:
(i) The largest value of |F ∗ (x) − Fn (x)| occurs for some x between 4.26 and
4.42.
(ii)
x F ∗ (x) Fn (x−) Fn (x)
4.26 0.584 0.505 0.510
4.30 0.599 0.510 0.515
4.35 0.613 0.515 0.520
4.36 0.621 0.520 0.525
4.39 0.636 0.525 0.530
4.42 0.638 0.530 0.535
Example 71.1
You are given:
(i) The following are observed claim amounts:
Solution.
We have d = 0 and u = ∞. Also, $F^*(x) = 1 - e^{-x/3300}$. Let
Example 71.2
You are given:
(i) Four policy claims:
300 300 800 1500
(ii) Policy limit u = 2000.
(iii) The claims are hypothesized to follow a uniform distribution on [0, 2500].
Calculate the Anderson-Darling test statistic.
Solution.
We have: d = 0, u = 2000, and $F^*(x) = \frac{x}{2500}$. We create the following chart.
y_j     F_4(y_j)  F*(y_j)  a       b
0       0         0        0.1278  0
300     0.5       0.12     0.0644  0.2452
800     0.75      0.32     0.0332  0.3536
1500    1.0       0.6      0       0.2877
Total   -         -        0.2254  0.8865
Hence,
\[ A^2 = -4(0.8) + 4(0.2254 + 0.8865) = 1.2476 \]
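The columns a and b in the chart above can be reproduced with a short routine. The sketch below assumes the computational form of the statistic that the chart appears to use: with $y_0 = d$ and $y_{k+1} = u$, column a holds $[1 - F_n(y_j)]^2\{\ln[1 - F^*(y_j)] - \ln[1 - F^*(y_{j+1})]\}$, column b holds $F_n(y_j)^2\{\ln F^*(y_{j+1}) - \ln F^*(y_j)\}$, and $A^2 = -nF^*(u) + n\sum a + n\sum b$. It also assumes that every observation lies below u. The function name is hypothetical.

```python
from math import log

def anderson_darling(sample, cdf, u, d=0.0):
    """Anderson-Darling statistic for individual data, all observations below the limit u."""
    n = len(sample)
    pts = [d] + sorted(set(sample)) + [u]          # y_0 = d, distinct values, y_{k+1} = u
    a_sum = b_sum = 0.0
    for j in range(len(pts) - 1):
        fn = sum(1 for v in sample if v <= pts[j]) / n   # empirical cdf at y_j
        lo, hi = cdf(pts[j]), cdf(pts[j + 1])
        if fn < 1:                                  # term vanishes when F_n(y_j) = 1
            a_sum += (1 - fn) ** 2 * (log(1 - lo) - log(1 - hi))
        if j >= 1:                                  # F_n(y_0) = 0, so the b term starts at j = 1
            b_sum += fn ** 2 * (log(hi) - log(lo))
    return -n * cdf(u) + n * (a_sum + b_sum)


claims = [300, 300, 800, 1500]                      # data of Example 71.2
A2 = anderson_darling(claims, lambda x: x / 2500, u=2000)
print(round(A2, 3))
```

Small differences in the last decimal places relative to the chart come from the chart rounding each entry to four places before summing.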
71 ANDERSON-DARLING HYPOTHESIS TEST OF FITTED MODELS529
Remark 71.1
Keep in mind that the critical values for the Kolmogorov-Smirnov test and the
Anderson-Darling test are correct ONLY when the null hypothesis completely
specifies the model16 .
16
See [1], page 450.
Practice Problems
Problem 71.1
Batonic Inc. has workers’ compensation claims during a month of:
An actuary who works for the company believes that the claims are dis-
tributed exponentially with mean 500.
Problem 71.2
Using the previous problem, complete the following chart.
Problem 71.3 ‡
Which of the following statements is true?
(A) For a null hypothesis that the population follows a particular distri-
bution, using sample data to estimate the parameters of the distribution
tends to decrease the probability of a Type II error.
(B) The Kolmogorov-Smirnov test can be used on individual or grouped
data.
(C) The Anderson-Darling test tends to place more emphasis on a good fit
in the middle rather than in the tails of the distribution.
(D) None of the above.
Problem 71.4 ‡
Which of the following is false?
(A) For the Kolmogorov-Smirnov test, when the parameters of the distribu-
tion in the null hypothesis are estimated from the data, the probability of
rejecting the null hypothesis decreases.
(B) For the Kolmogorov-Smirnov test, the critical value for right censored
data should be smaller than the critical value for uncensored data.
(C) The Anderson-Darling test does not work for grouped data.
(D) None of the above is true.
Problem 71.5
If the null hypothesis completely specifies the fitted distribution, the critical
values tend to zero as the sample size goes to infinity for which test?
For the j-th category, let $\hat{p}_j$ denote the probability that an observation
falls in the interval $(c_{j-1}, c_j]$. Then $\hat{p}_j = F^*(c_j) - F^*(c_{j-1})$. Let
$p_{nj} = F_n(c_j) - F_n(c_{j-1})$ be the same probability according to the empirical
distribution. Thus, the expected number of observations based on $F^*(\cdot)$ is
$E_j = n\hat{p}_j$ and that based on $F_n(\cdot)$ is $O_j = np_{nj} = n_j$.
Example 72.1 ‡
You test the hypothesis that a given set of data comes from a known distri-
bution with distribution function F (x). The following data were collected:
72 THE CHI-SQUARE GOODNESS OF FIT TEST 533
Solution.
We have
j p̂j Ej Oj (Ej − Oj )2
1 0.035 10.5 5 30.25
2 0.095 28.5 42 182.25
3 0.5 150 137 169
4 0.2 60 66 36
5 0.17 51 50 1
(a) The Chi-square statistic is
\[ \chi^2 = \frac{30.25}{10.5} + \frac{182.25}{28.5} + \frac{169}{150} + \frac{36}{60} + \frac{1}{51} = 11.02. \]
(b) Since there is no mention of any parameter estimation, we assume r = 0. Thus,
$\chi^2_{4,0.95} = 9.488 < \chi^2$, so we reject the null hypothesis.
(c) We have $\chi^2_{4,0.975} = 11.143 > \chi^2$, so we fail to reject the null hypothesis
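The statistic in part (a) can be recomputed directly from the fitted probabilities and observed counts in the table. The sketch below takes the two critical values from the solution above instead of computing them. Variable names are illustrative.

```python
# Fitted probabilities and observed counts from the table of Example 72.1
p_hat = [0.035, 0.095, 0.5, 0.2, 0.17]
observed = [5, 42, 137, 66, 50]

n = sum(observed)                                   # total number of observations
expected = [n * p for p in p_hat]                   # E_j = n * p_hat_j
chi2 = sum((e - o) ** 2 / e for e, o in zip(expected, observed))

# Critical values for 4 degrees of freedom, as quoted in the solution
reject_at_5pct = chi2 > 9.488
reject_at_2_5pct = chi2 > 11.143
print(round(chi2, 2), reject_at_5pct, reject_at_2_5pct)
```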
Example 72.2 ‡
1000 workers insured under a workers compensation policy were observed
for one year. The number of work days missed is given below:
Number of Days Number of Workers
of Work Missed
0 818
1 153
2 25
3+ 4
Total 1000
Total Number of Days Missed 230
The Chi-square goodness-of-fit test is used to test the hypothesis that the
number of work days missed follows a Poisson distribution where:
(i) The Poisson parameter is estimated by the average number of work days
missed.
(ii) Any interval in which the expected number is less than one is combined
with the previous interval.
(a) Find the Chi-square statistic.
(b) Complete the following table.
Solution.
For the Poisson distribution, the mean is estimated as $\hat\lambda = \frac{230}{1000} = 0.23$. The
probability that a worker missed 0 days is $p_0 = e^{-0.23} = 0.7945$. We can
create the following table of information.
j    p̂_j     E_j    O_j  (E_j − O_j)^2
0    0.7945  794.5  818  552.25
1    0.1827  182.7  153  882.09
2    0.0210  21.0   25   16
3+   0.0017  1.7    4    5.29
(b) We have
Example 72.3 ‡
You are given:
(i) A computer program simulates n = 1000 pseudo−U (0, 1) variates.
(ii) The variates are grouped into k = 20 ranges of equal length.
(iii) $\sum_{j=1}^{20} O_j^2 = 51,850$.
(iv) The Chi-square goodness-of-fit test for U (0, 1) is performed.
Determine the result of the test.
Solution.
The hypothesized cdf is
F ∗ (x) = x.
The 20 ranges are of equal length, each of length 0.05. Thus, $\hat{p}_j = F^*(c_j) - F^*(c_{j-1}) = c_j - c_{j-1} = 0.05$
and $E_j = n\hat{p}_j = 1000(0.05) = 50$. The Chi-square statistic is
\[ \chi^2 = \sum_{j=1}^{20}\frac{(E_j - O_j)^2}{E_j} = 0.02\left[\sum_{j=1}^{20} O_j^2 - 100\sum_{j=1}^{20} O_j + 20(50^2)\right] \]
Example 72.4 ‡
During a one-year period, the number of accidents per day was distributed
as follows:
# of Accidents Days
0 209
1 111
2 33
3 7
4 3
5 2
You use a chi-square test to measure the fit of a Poisson distribution with
mean 0.60.
The minimum expected number of observations in any group should be 5.
The maximum possible number of groups should be used.
Determine the chi-square statistic.
Solution.
We have $E_j = 365e^{-0.6}\frac{0.6^j}{j!}$. We create the following table:
j   E_j      O_j   (E_j − O_j)^2/E_j
0   200.32   209   0.38
1   120.19   111   0.70
2   36.06    33    0.26
3   7.21     7     1.52∗
4   1.08     3
5   0.13     2
∗ We are told that the minimum expected number of observations for any
group should be 5. Therefore, we combine groups 3, 4 and 5 to obtain
Practice Problems
Problem 72.1 ‡
A particular line of business has three types of claims. The historical prob-
ability and the number of claims for each type in the current year are:
Historical Number of Claims
Type Probability in Current Year
A 0.2744 112
B 0.3512 180
C 0.3744 138
You test the null hypothesis that the probability of each type of claim in
the current year is the same as the historical probability.
Problem 72.2 ‡
You are given the following observed claim frequency data collected over a
period of 365 days:
Number of Claims per Day Observed Number of Days
0 50
1 122
2 101
3 92
4+ 0
(a) Fit a Poisson distribution to the above data, using the method of maxi-
mum likelihood.
(b) Regroup the data, by number of claims per day, into four groups:
0 1 2 3+
Apply the Chi-square goodness-of-fit test to evaluate the null hypothesis
that the claims follow a Poisson distribution. Determine the result of the
chi-square test.
Problem 72.3 ‡
You are investigating insurance fraud that manifests itself through claimants
who file claims with respect to auto accidents with which they were not
involved. Your evidence consists of a distribution of the observed number of
claimants per accident and a standard distribution for accidents on which
fraud is known to be absent. The two distributions are summarized below:
Determine the result of a Chi-square test of the null hypothesis that there
is no fraud in the observed accidents.
Problem 72.4 ‡
You are given the following random sample of 30 auto claims:
You test the hypothesis that auto claims follow a continuous distribution
F (x) with the following percentiles:
You group the data using the largest number of groups such that the ex-
pected number of claims in each group is at least 5.
Problem 72.5 ‡
Which of the following statements is true?
(A) For a null hypothesis that the population follows a particular distri-
bution, using sample data to estimate the parameters of the distribution
tends to decrease the probability of a Type II error.
(B) The Kolmogorov-Smirnov test can be used on individual or grouped
data.
(C) The Anderson-Darling test tends to place more emphasis on a good fit
in the middle rather than in the tails of the distribution.
(D) For a given number of cells, the critical value for the Chi-square goodness-
of-fit test becomes larger with increased sample size.
(E) None of (A), (B), (C) or (D) is true.
Problem 72.6 ‡
Which of statements (A), (B), (C), and (D) is false?
(A) The Chi-square goodness-of-fit test works best when the expected num-
ber of observations varies widely from interval to interval.
(B) For the Kolmogorov-Smirnov test, when the parameters of the distribu-
tion in the null hypothesis are estimated from the data, the probability of
rejecting the null hypothesis decreases.
(C) For the Kolmogorov-Smirnov test, the critical value for right censored
data should be smaller than the critical value for uncensored data.
(D) The Anderson-Darling test does not work for grouped data.
(E) None of (A), (B), (C) or (D) is false.
The likelihood ratio test is conducted as follows: Let L(θ) denote the likeli-
hood function. Let Θ0 denote the set of all possible values of θ as specified
in the null hypothesis. Suppose that the maximum of L(θ) occurs at some
value θ0 ∈ Θ0 with maximum value L0 = L(θ0 ). Likewise, let Θa denote
the set of all possible values of θ as specified in the alternative hypothesis.
Suppose that the maximum of L(θ) occurs at some value θa ∈ Θa with
maximum value La = L(θa ). Note that θa = MLE(θ).
Example 73.1
A random sample of n = 8 values from an exponential random variable X
is given:
3 3 4 6 7 8 10 25.
You are performing the following hypothesis test:
H0 : θ = 8.
H1 : θ ≠ 8.
73 THE LIKELIHOOD RATIO TEST 541
Solution.
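(a) We have $\Theta_0 = \{8\}$ so that $\theta_0 = 8$. The likelihood function is
\[ L(\theta) = \prod_{i=1}^{8}\frac{1}{\theta}e^{-x_i/\theta} = \frac{1}{\theta^8}e^{-66/\theta}. \]
Hence, $L_0 = L(8) = \frac{1}{8^8}e^{-66/8}$.
On the other hand, $\Theta_a = \{\theta > 0 : \theta \ne 8\}$ and $\theta_1 = MLE(\theta) = \bar{x} = \frac{66}{8} = 8.25$,
and therefore $L_1 = \frac{1}{8.25^8}e^{-66/8.25}$. The test statistic is
\[ T = 2\ln\left[\left(\frac{8}{8.25}\right)^8 e^{-\frac{66}{8.25} + \frac{66}{8}}\right] = 0.00765. \]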
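The test statistic is twice the difference between the loglikelihood at the unrestricted maximum and the loglikelihood at the hypothesized value. The following sketch recomputes it for the exponential sample of Example 73.1, working on the log scale to avoid very small likelihood values. Names are illustrative.

```python
from math import log

data = [3, 3, 4, 6, 7, 8, 10, 25]        # sample of Example 73.1

def loglik(theta):
    """Exponential loglikelihood: -n*ln(theta) - sum(x_i)/theta."""
    return -len(data) * log(theta) - sum(data) / theta

theta0 = 8.0                              # value specified by H0
theta_hat = sum(data) / len(data)         # unrestricted MLE, the sample mean
T = 2 * (loglik(theta_hat) - loglik(theta0))
print(theta_hat, round(T, 5))
```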
Example 73.2 ‡
You are given:
(i) A random sample of losses from a Weibull distribution is:
(ii) You use the likelihood ratio test to test the hypothesis:
H0 : τ = 2.
Ha : τ ≠ 2.
α 10% 5% 2.5% 1%
cα
Solution.
(a) The likelihood function is
\[ L(\tau, \theta) = \prod_{j=1}^{5}\frac{\tau x_j^{\tau-1}e^{-(x_j/\theta)^\tau}}{\theta^\tau}. \]
Thus,
(b) The free parameter in the null hypothesis is θ. The free parameters in
the alternative hypothesis are τ and θ. Hence, r = 2 − 1 = 1.
(c) Using the Chi-square distribution with one degree of freedom, we find
α 10% 5% 2.5% 1%
cα 2.706 3.841 5.024 6.635
Example 73.3 ‡
You are given:
(i) Twenty claim amounts are randomly selected from a Pareto distribution
with α = 2 and an unknown θ.
(ii) The maximum likelihood estimate of θ is 7.0
(iii) $\sum \ln(x_i + 7.0) = 49.01$
(iv) $\sum \ln(x_i + 3.1) = 39.30$
You use the likelihood ratio test to test the hypothesis that θ = 3.1.
Determine the result of the test at the 1% level.
Solution.
The likelihood function is
\[ L(\alpha, \theta) = \prod_{i=1}^{20}\frac{\alpha\theta^\alpha}{(x_i + \theta)^{\alpha+1}} \]
Practice Problems
Problem 73.1
Consider the following hypothesis test problem:
H0 : The data came from a Pareto distribution with α = 1.5 and θ = 7.8.
Ha : The data came from a Pareto distribution with α ≠ 1.5 and θ ≠ 7.8.
Find the degrees of freedom associated with the likelihood ratio test.
Problem 73.2
Consider the following hypothesis test problem:
H0 : The data came from a Pareto distribution with α = 1.5 and θ = 7.8.
Ha : The data came from a Pareto distribution with α ≠ 1.5 and θ ≠ 7.8.
α 5% 2.5% 1% 0.5%
cα
Problem 73.3 ‡
You fit a Pareto distribution to a sample of 200 claim amounts and use the
likelihood ratio test to test the hypothesis that α = 1.5 and θ = 7.8.
You are given:
(i) The maximum likelihood estimates are α̂ = 1.4 and θ̂ = 7.6.
(ii) The natural logarithm of the likelihood function evaluated at the maximum
likelihood estimates is −817.92.
(iii) $\sum \ln(x_i + 7.8) = 607.64$.
Determine the result of the test.
Problem 73.4 ‡
You are given:
(i) A random sample of losses from a Weibull distribution is:
H0 : τ = 2.
Ha : τ ≠ 2.
Problem 73.5 ‡
During a one-year period, the number of accidents per day was distributed
as follows:
# of Accidents Days
0 209
1 111
2 33
3 7
4 3
5 2
For these data, the maximum likelihood estimate for the Poisson distribu-
tion is λ̂ = 0.60, and for the negative binomial distribution, it is r̂ = 2.9 and
β̂ = 0.21.
The Poisson has a negative loglikelihood value of 385.9, and the negative
binomial has a negative loglikelihood value of 382.4.
Determine the likelihood ratio test statistic, treating the Poisson distribu-
tion as the null hypothesis.
In [1], two approaches for model selection are considered. The first one is
a judgment-based approach where the modeler's experience is critical. This
is a feature of the second point mentioned above. The other approach is
a score-based approach where a numerical value (a score) is assigned to each
model and the model selected is the one with the best score.
With multiple models, the SBC of each model is computed. The model
preferred is the one with highest SBC.
Example 74.1
You are given that a particular model has a maximum loglikelihood value
of −412. You also know that the model uses 2 parameters and is used to
interpret a sample of 260 data points. Find the SBC of this model.
Solution.
The SBC is given by
\[ SBC = -412 - \frac{2}{2}\ln 260 = -417.56 \]
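The calculation above generalizes to any fitted model: subtract (r/2) ln n from the maximized loglikelihood, where r is the number of estimated parameters and n the sample size. A minimal sketch follows. The function name is hypothetical.

```python
from math import log

def sbc(loglik, r, n):
    """Schwarz Bayesian Criterion: loglikelihood minus (r/2) * ln(n)."""
    return loglik - (r / 2) * log(n)

# Values from Example 74.1: loglikelihood -412, 2 parameters, 260 data points
score = sbc(-412, 2, 260)
print(round(score, 2))
```

To compare several fitted models, one would evaluate sbc for each and keep the model with the highest value.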
74 SCHWARZ BAYESIAN CRITERION 547
Example 74.2
What is the limit of the Schwarz Bayesian adjustment as the sample size
increases without bound?
Solution.
We have
\[ \lim_{n\to\infty}\frac{r}{2}\ln n = \infty \]
Example 74.3
Four models are fitted to a sample of n = 200 observations with the following
results:
Model # of Parameters Loglikelihood
I 3 −180.2
II 2 −181.4
III 2 −181.6
IV 1 −183
Solution.
We have the following
Practice Problems
Problem 74.1
Which of the following statements is false?
(i) The principle of parsimony states that a more complex model is bet-
ter because it will always match the data better.
(ii) In judgment-based approaches to determining a model, a modeler’s ex-
perience is critical.
(iii) In score-based approaches, one assigns scores to the potential models.
Problem 74.2 ‡
If the proposed model is appropriate, which of the following tends to zero
as the sample size goes to infinity?
Problem 74.3 ‡
Five models are fitted to a sample of n = 260 observations with the following
results:
Problem 74.4 ‡
You are given:
(i) Sample size = 100
(ii) The negative loglikelihoods associated with five models are:
17
Also known as classical credibility.
552 CREDIBILITY THEORY
\[ E(X_j) = \xi, \qquad \mathrm{Var}(X_j) = \sigma^2 \]
\[ E(\bar{X}) = \xi, \qquad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}. \]
An insurer’s goal is to decide on the value of ξ. The choice can be done in
one of three ways:
• Ignore past experience or data (no credibility) and charge the manual
premium21 M.
• Use only past data (full credibility) and charge X (the observed pure
premium).
\[ \bar{X} \sim N\left(\xi, \frac{\sigma^2}{n}\right) \rightarrow \frac{\bar{X} - \xi}{\sigma/\sqrt{n}} \sim N(0, 1). \]
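Now, let
\[ y_p = \inf_y\left\{\Pr\left(\left|\frac{\bar{X} - \xi}{\sigma/\sqrt{n}}\right| \le y\right) \ge p\right\}. \]
If $\bar{X}$ is continuous then $y_p$ satisfies
\[ \Pr\left(\left|\frac{\bar{X} - \xi}{\sigma/\sqrt{n}}\right| \le y_p\right) = p. \]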
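This says that full credibility is assigned if the coefficient of variation $\frac{\sigma}{\xi}$ of
$X_j$ is no larger than $\sqrt{\frac{n}{\lambda_0}}$.
Alternatively, full credibility occurs if
\[ \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} \le \frac{\xi^2}{\lambda_0}. \]
The number of exposure units required for full credibility is
\[ n \ge \lambda_0\left(\frac{\sigma}{\xi}\right)^2. \quad (75.1) \]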
We next describe how to find yp when X is continuous. Recall that the cdf
of the standard normal distribution Z is denoted by Φ. We have
\[ p = \Pr(|Z| \le y_p) = \Pr(-y_p \le Z \le y_p) = \Phi(y_p) - \Phi(-y_p) = \Phi(y_p) - 1 + \Phi(y_p) = 2\Phi(y_p) - 1. \]
Thus, $\Phi(y_p) = \frac{1+p}{2}$ so that $y_p$ is the $100\left(\frac{1+p}{2}\right)$-th percentile of the standard
normal distribution.
Example 75.1
Determine yp and λ0 if p = 0.9 and r = 0.05. Also, determine the full-
credibility standard.
Solution.
The standard normal tables give $y_{0.9} = 1.645$. Thus, $\lambda_0 = \left(\frac{y_p}{r}\right)^2 = \left(\frac{1.645}{0.05}\right)^2 = 1082.41$.
The full credibility standard is
\[ n \ge 1082.41\left(\frac{\sigma}{\xi}\right)^2 \]
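The percentile lookup and the resulting λ0 can be reproduced with the standard library. The sketch below uses Python's statistics.NormalDist for the inverse normal cdf, so it works with the unrounded percentile and returns a λ0 slightly different from the value obtained with the table entry 1.645. The function name is hypothetical.

```python
from statistics import NormalDist

def full_credibility_constants(p, r):
    """Return (y_p, lambda_0), where y_p is the 100(1+p)/2-th standard normal
    percentile and lambda_0 = (y_p / r)^2."""
    y_p = NormalDist().inv_cdf((1 + p) / 2)
    return y_p, (y_p / r) ** 2

y, lam0 = full_credibility_constants(0.9, 0.05)    # inputs of Example 75.1
print(round(y, 3), round(lam0, 2))
```

Multiplying lam0 by the squared coefficient of variation of $X_j$ then gives the number of exposure units required by (75.1).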
Example 75.2
Let $N_1, N_2, \cdots, N_n$ be the number of claims in the past n years for a policy.
That is, $N_i$ is the total number of claims in year i. Let $X_1, X_2, \cdots, X_n$ be
the losses in the past n years. That is, $X_i$ is the total losses in year i. We
assume that the $X_i$'s and the $N_i$'s are independent with the $N_i$'s being iid
with common distribution the Poisson distribution with mean λ.
Let $Y_{ij}$ denote the j-th claim in year i. We assume that the $Y_{ij}$ are iid with
mean $\theta_Y$ and variance $\sigma_Y^2$. Then we have
\[ X_i = \sum_{j=1}^{N_i} Y_{ij}. \]
Solution.
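$X_i$ has a compound Poisson distribution so that $E(X_i) = E(N_i)E(Y_{ij}) = \lambda\theta_Y$ and $\mathrm{Var}(X_i) = \mathrm{Var}(N_i)[E(Y_{ij})]^2 + E(N_i)\mathrm{Var}(Y_{ij}) = \lambda(\theta_Y^2 + \sigma_Y^2)$.
(a) The standard of full credibility based on the number of exposure units is
\[
\lambda_0 \left(\frac{\sigma_{X_i}}{E(X_i)}\right)^2 = \frac{\lambda_0}{\lambda}\left[1 + \left(\frac{\sigma_Y}{\theta_Y}\right)^2\right] \text{ exposures.}
\]
(b) The expected total number of claims is $\sum_{i=1}^n E(N_i) = nE(N_i) = n\lambda$. The standard of full credibility based on the expected total number of claims is
\[
E(N_i)\lambda_0 \left(\frac{\sigma_{X_i}}{E(X_i)}\right)^2 = \lambda_0\left[1 + \left(\frac{\sigma_Y}{\theta_Y}\right)^2\right] \text{ claims.}
\]
(c) The expected total amount of claims is $\sum_{i=1}^n E(X_i) = nE(X_i)$. The standard of full credibility based on the expected total amount of claims is
\[
E(X_i)\lambda_0 \left(\frac{\sigma_{X_i}}{E(X_i)}\right)^2 = \lambda_0 \frac{\sigma_{X_i}^2}{E(X_i)} = \lambda_0\left[\theta_Y + \frac{\sigma_Y^2}{\theta_Y}\right] \text{ dollars.}
\]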
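The three compound Poisson standards above can be evaluated together. The sketch below is illustrative only: the function name and the numerical inputs ($\lambda = 2$, $\theta_Y = 100$, $\sigma_Y = 150$) are hypothetical and do not come from the text.

```python
def compound_poisson_standards(lambda_0, lam, theta_y, sigma_y):
    """Full credibility standards in exposures, claims, and dollars."""
    cv2 = (sigma_y / theta_y) ** 2                 # squared CV of severity
    exposures = lambda_0 / lam * (1 + cv2)         # part (a)
    claims = lambda_0 * (1 + cv2)                  # part (b)
    dollars = lambda_0 * (theta_y + sigma_y ** 2 / theta_y)  # part (c)
    return exposures, claims, dollars

# Hypothetical inputs chosen only to exercise the formulas.
exposures, claims, dollars = compound_poisson_standards(
    lambda_0=1082.41, lam=2.0, theta_y=100.0, sigma_y=150.0
)
print(exposures, claims, dollars)
```

The three standards are consistent with one another: multiplying the exposure standard by $\lambda$ gives the claim standard, and multiplying the claim standard by $\theta_Y$ gives the dollar standard.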
Example 75.3
Repeat the previous example by replacing the Poisson distribution with a binomial distribution with parameters $(50, p')$.
Solution.
$X_i$ has a compound binomial distribution so that $E(X_i) = E(N_i)E(Y_{ij}) = 50p'\theta_Y$ and $\mathrm{Var}(X_i) = \mathrm{Var}(N_i)[E(Y_{ij})]^2 + E(N_i)\mathrm{Var}(Y_{ij}) = 50p'(1-p')\theta_Y^2 + 50p'\sigma_Y^2 = 50p'[(1 - p')\theta_Y^2 + \sigma_Y^2]$.
(a) The standard of full credibility based on the number of exposure units is
\[
\lambda_0 \left(\frac{\sigma_{X_i}}{E(X_i)}\right)^2 = \frac{\lambda_0}{50p'}\left[(1 - p') + \left(\frac{\sigma_Y}{\theta_Y}\right)^2\right].
\]
(b) The expected total number of claims is $\sum_{i=1}^n E(N_i) = nE(N_i) = 50np'$. The standard of full credibility based on the expected total number of claims is
\[
\lambda_0\left[(1 - p') + \left(\frac{\sigma_Y}{\theta_Y}\right)^2\right].
\]
(c) The expected total amount of claims is $\sum_{i=1}^n E(X_i) = nE(X_i)$. The standard of full credibility based on the expected total amount of claims is
\[
E(X_i)\lambda_0 \left(\frac{\sigma_{X_i}}{E(X_i)}\right)^2 = \lambda_0 \frac{\sigma_{X_i}^2}{E(X_i)} = \lambda_0\left[(1 - p')\theta_Y + \frac{\sigma_Y^2}{\theta_Y}\right].
\]
Example 75.4 ‡
You are given:
(i) The number of claims follows a negative binomial distribution with pa-
rameters r and β = 3.
(ii) Claim severity has the following distribution:
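where
\[
\lambda_0 = \left(\frac{1.96}{0.05}\right)^2 = 1536.64.
\]
We have
\begin{align*}
\xi = E(S) &= E(N)E(X) = \frac{750}{1000}(1500) = 1125 \\
\mathrm{Var}(S) &= E(N)\mathrm{Var}(X) + \mathrm{Var}(N)E(X)^2 \\
&= 0.75(6{,}750{,}000) + \mathrm{Var}(N)(1500)^2 \\
\mathrm{Var}(N) &= \frac{\sum f_i (n_i - 0.75)^2}{1000 - 1} = 0.93243 \\
\mathrm{Var}(S) &= 0.75(6{,}750{,}000) + 0.93243(1500)^2 \\
&= 7{,}160{,}468.
\end{align*}
Thus,
\[
\lambda_0 \frac{\mathrm{Var}(S)}{E(S)^2} = 1536.64 \cdot \frac{7{,}160{,}468}{1125^2} = 8693.77
\]
so that $n = 8694$.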
75 LIMITED FLUCTUATION CREDIBILITY APPROACH: FULL CREDIBILITY559
Practice Problems
Problem 75.1 ‡
You are given:
(i) The number of claims has a Poisson distribution.
(ii) Claim sizes have a Pareto distribution with parameters θ = 0.5 and
α = 6.
(iii) The number of claims and claim sizes are independent.
(iv) The observed pure premium should be within 2% of the expected pure
premium 90% of the time.
Problem 75.2 ‡
You are given the following information about a commercial auto liability
book of business:
(i) Each insured’s claim count has a Poisson distribution with mean λ, where
λ has a gamma distribution with α = 1.5 and θ = 0.2.
(ii) Individual claim size amounts are independent and exponentially dis-
tributed with mean 5000.
(iii) The full credibility standard is for aggregate losses to be within 5% of
the expected with probability 0.90.
Problem 75.3 ‡
You are given:
(i) The number of claims has probability function:
\[
p(x) = \binom{m}{x} q^x (1 - q)^{m-x}, \quad x = 0, 1, \cdots, m.
\]
(ii) The actual number of claims must be within 1% of the expected number
of claims with probability 0.95.
(iii) The expected number of claims for full credibility is 34,574.
Determine q.
Problem 75.4 ‡
You are given:
Problem 75.5 ‡
A company has determined that the limited fluctuation full credibility stan-
dard is 2000 claims if:
(i) The total number of claims is to be within 3% of the true value with
probability p.
(ii) The number of claims follows a Poisson distribution.
The standard is changed so that the total cost of claims is to be within 5%
of the true value with probability p, where claim severity has probability
density function:
\[
f(x) = \frac{1}{10{,}000}, \quad 0 \le x \le 10{,}000.
\]
\[
P_c = Z\bar{X} + (1 - Z)M, \quad 0 \le Z \le 1
\]
Example 76.1
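(a) Find $E(P_c)$ and $\mathrm{Var}(P_c)$.
(b) Find $\dfrac{P_c - E(P_c)}{\sqrt{\mathrm{Var}(P_c)}}$.
Solution.
(a) We have
\[
E(P_c) = E[Z\bar{X} + (1 - Z)M] = Z\xi + (1 - Z)M
\]
and
\[
\mathrm{Var}(P_c) = \mathrm{Var}[Z\bar{X} + (1 - Z)M] = \mathrm{Var}(Z\bar{X}) = Z^2\mathrm{Var}(\bar{X}) = Z^2\frac{\sigma^2}{n}.
\]
(b) We have
\[
\frac{P_c - E(P_c)}{\sqrt{\mathrm{Var}(P_c)}} = \frac{Z\bar{X} - Z\xi}{Z\sigma/\sqrt{n}} = \frac{\bar{X} - \xi}{\sigma/\sqrt{n}}.
\]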
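How do we find Z? There are several ways of finding Z, and all of them lead to the same result. We will follow an approach similar to Section 75: we want to choose Z such that $\left|\frac{Z\bar{X} - Z\xi}{\xi}\right|$ is small with high probability. Mathematically,
\[
\Pr\left( \left| \frac{Z\bar{X} - Z\xi}{\xi} \right| \le r \right) \ge p.
\]
As $n \to \infty$, by the Central Limit Theorem, we have
\[
Z\bar{X} \sim N\left(Z\xi, Z^2\frac{\sigma^2}{n}\right) \implies \frac{Z\bar{X} - Z\xi}{Z\sigma/\sqrt{n}} \sim N(0, 1).
\]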
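Now, let
\[
y_p = \inf_y \left\{ \Pr\left( \left| \frac{\bar{X} - \xi}{\sigma/\sqrt{n}} \right| \le y \right) \ge p \right\}.
\]
Accordingly, the condition of partial credibility is met when
\[
\frac{r\xi\sqrt{n}}{Z\sigma} \ge y_p.
\]
Letting $\lambda_0 = \left(\frac{y_p}{r}\right)^2$, we obtain
\[
\frac{r\xi\sqrt{n}}{Z\sigma} \ge y_p \iff Z \le \frac{\xi}{\sigma}\sqrt{\frac{n}{\lambda_0}}.
\]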
Based on the total number of claims, Z is the square root of the number of available claims divided by the total number of claims required for full credibility. Based on the total amount of claims, Z is the square root of the total amount of available claims divided by the total amount of claims required for full credibility.
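The square-root rule described above can be sketched as a small Python helper. The function name and the sample figures (300 observed claims against a full credibility standard of 1082.41 claims) are hypothetical, and capping Z at 1 is an assumption that reflects the constraint $0 \le Z \le 1$.

```python
import math

def partial_credibility_factor(available, full_standard):
    """Square-root rule: Z = sqrt(available / full_standard), capped at 1."""
    return min(1.0, math.sqrt(available / full_standard))

# Hypothetical figures, for illustration only.
z = partial_credibility_factor(available=300, full_standard=1082.41)
z_full = partial_credibility_factor(available=2000, full_standard=1082.41)
print(round(z, 4), z_full)
```

The same helper applies whether the basis is exposures, number of claims, or amount of claims, as long as `available` and `full_standard` are measured in the same units.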
Example 76.2
You are given:
(i) The number of claims per exposure follows a Poisson distribution with a
mean of 10.
(ii) Claim size follows a Pareto distribution with parameters α = 3 and
θ = 1.
(iii) The number of claims per exposure and claim sizes are independent.
76 LIMITED FLUCTUATION CREDIBILITY APPROACH: PARTIAL CREDIBILITY563
(iv) The method of limited fluctuation credibility is used, and the full cred-
ibility standard has been selected so that total claim dollars per exposure
will be within 10% of expected total claim dollars per exposure 95% of the
time.
Find the credibility factor
(a) based on 45 exposures
(b) based on a total claim number of 120
(c) based on a total claim amount of 600.
Solution.
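(a) For the Pareto distribution with $\alpha = 3$ and $\theta = 1$, the mean is $\theta_Y = \frac{\theta}{\alpha - 1} = 0.5$ and the variance is $\sigma_Y^2 = \frac{\alpha\theta^2}{(\alpha-1)^2(\alpha-2)} = 0.75$. The standard for full credibility based on the number of exposures is
\[
\frac{\lambda_0}{\lambda}\left(1 + \frac{\sigma_Y^2}{\theta_Y^2}\right) = \left(\frac{1.96}{0.1}\right)^2 \frac{1}{10}\left(1 + \frac{0.75}{0.5^2}\right) = 153.664.
\]
The credibility factor based on 45 exposures is
\[
Z = \sqrt{\frac{45}{153.664}} = 0.5412.
\]
(b) The standard for full credibility based on the total number of claims is
\[
\lambda_0\left(1 + \frac{\sigma_Y^2}{\theta_Y^2}\right) = \left(\frac{1.96}{0.1}\right)^2\left(1 + \frac{0.75}{0.5^2}\right) = 1536.64.
\]
The credibility factor based on 120 claims is
\[
Z = \sqrt{\frac{120}{1536.64}} = 0.2795.
\]
(c) The standard for full credibility based on the total amount of claims is
\[
\lambda_0\left(\theta_Y + \frac{\sigma_Y^2}{\theta_Y}\right) = \left(\frac{1.96}{0.1}\right)^2\left(0.5 + \frac{0.75}{0.5}\right) = 768.32.
\]
The credibility factor based on a total amount of 600 is
\[
Z = \sqrt{\frac{600}{768.32}} = 0.8837.
\]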
Example 76.3
You are given:
(i) 350 claims with a total of 300,000.
(ii) The manual premium of M = 1000.
(iii) The credibility factor Z = 0.809.
Determine the credibility premium.
Solution.
The observed pure premium is
\[
\bar{X} = \frac{300000}{350} = 857.14.
\]
The credibility premium is
\[
P_c = Z\bar{X} + (1 - Z)M = 0.809(857.14) + 0.191(1000) = 884.43.
\]
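The credibility premium $P_c = Z\bar{X} + (1 - Z)M$ for the data of Example 76.3 can be checked with a short Python sketch; the function name `credibility_premium` is hypothetical.

```python
def credibility_premium(z, observed, manual):
    """Weighted average of observed experience and the manual premium."""
    return z * observed + (1 - z) * manual

observed = 300_000 / 350          # observed average from the example
premium = credibility_premium(z=0.809, observed=observed, manual=1000)
print(round(premium, 2))
```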
Practice Problems
Problem 76.1 ‡
You are given the following information about a general liability book of business comprised of 2500 insureds:
(i) $X_i = \sum_{j=1}^{N_i} Y_{ij}$ is a random variable representing the annual loss of the ith insured.
(ii) N1 , N2 , · · · , N2500 , are independent and identically distributed random
variables following a negative binomial distribution with parameters r = 2
and β = 0.2.
(iii) Yi1 , Yi2 , · · · , YiNi are independent and identically distributed random
variables following a Pareto distribution with α = 3.0 and θ = 1000.
(iv) The full credibility standard is to be within 5% of the expected aggre-
gate losses 90% of the time.
Problem 76.2 ‡
You are given:
(i) Xpartial = pure premium calculated from partially credible data.
(ii) µ = E[Xpartial ].
(iii) Fluctuations are limited to ±kµ of the mean with probability P.
(iv) Z = credibility factor.
Problem 76.3
You are given:
(i) 50 claim amounts.
(ii) Claim size is uniform in [0, θ].
(iii) The full credibility standard is to be within 5% of the expected claim
amount 90% of the time.
Problem 76.4 ‡
You are given:
(i) Claim counts follow a Poisson distribution.
(ii) Claim sizes follow a lognormal distribution with coefficient of variation
3.
(iii) Claim sizes and claim counts are independent.
(iv) The number of claims in the first year was 1000.
(v) The aggregate loss in the first year was 6.75 million.
(vi) The manual premium for the first year was 5.00 million.
(vii) The exposure in the second year is identical to the exposure in the first
year.
(viii) The full credibility standard is to be within 5% of the expected aggre-
gate loss 95% of the time.
Determine the limited fluctuation credibility net premium (in millions) for
the second year.
Problem 76.5 ‡
You are given:
(i) Claim counts follow a Poisson distribution.
(ii) claim size follows a Pareto distribution with parameters α = 3 and θ = 1.
(iii) A full credibility standard is established so that the actual number of
claims will be within 5% of the expected number of claims 95% of the time.
Determine the number of expected claims needed for 30% partial credibility
for the distribution of number of claims.
Problem 76.6 ‡
You are given:
(i) The full credibility standard is 100 expected claims.
(ii) The square-root rule is used for partial credibility.
You approximate the partial credibility formula with a Bühlmann credibility
formula by selecting a Bühlmann k value that matches the partial credibility
formula when 25 claims are expected.
Determine the credibility factor for the Bühlmann credibility formula when
100 claims are expected.
77 GREATEST ACCURACY CREDIBILITY APPROACH 567
(1) Is the policyholder different from what was assumed in calculating the manual rate µ?
(2) Is random chance responsible for the difference between µ and $\bar{X}$?
(2) The value of θ varies amongst policyholders in the same risk class. This
assumption allows us to quantify the difference between policyholders with
respect to the risk characteristics.
(3) θ can be viewed as a random variable on the set of risk levels with
(1) The risk parameter θ is selected from the distribution π(θ) (prior distri-
bution).
(2) Claims or losses are selected from the conditional distribution $f_{X|\Theta}(x|\theta)$ (model distribution).
Example 77.1
The amount of a claim X|Θ has the exponential distribution with parameter $\frac{1}{\theta}$. The risk parameter Θ has a gamma distribution with parameters α and β. Provide a mathematical description of this model.
Solution.
For the risk parameter, we have
\[
\pi(\theta) = \frac{\theta^{\alpha-1} e^{-\theta/\beta}}{\beta^\alpha \Gamma(\alpha)}.
\]
For the claims, we have
\[
f_{X|\Theta}(x|\theta) = \frac{1}{\theta} e^{-x/\theta}.
\]
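The two-stage structure of Example 77.1 (first draw the risk parameter, then draw the claim given that parameter) can be illustrated by simulation. This is a sketch under stated assumptions: the function name, the seed, and the parameter values $\alpha = 2$, $\beta = 3$ are hypothetical; `random.gammavariate(alpha, beta)` uses β as a scale parameter, matching the density $\pi(\theta)$ above.

```python
import random

def simulate_claims(alpha, beta, n, seed=0):
    """Draw n claims from the gamma-exponential two-stage model."""
    rng = random.Random(seed)
    claims = []
    for _ in range(n):
        theta = rng.gammavariate(alpha, beta)      # risk parameter (prior)
        claims.append(rng.expovariate(1 / theta))  # claim with mean theta
    return claims

alpha, beta = 2.0, 3.0            # hypothetical prior parameters
claims = simulate_claims(alpha, beta, 100_000)
sample_mean = sum(claims) / len(claims)
print(sample_mean)
```

Since $E(X|\Theta) = \Theta$, the sample mean should be close to $E(\Theta) = \alpha\beta$.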
Example 77.2
The amount of a claim X|Λ has the inverse exponential distribution with parameter λ. The risk parameter Λ has a gamma distribution with parameters α and θ. Provide a mathematical description of this model.
Solution.
For the risk parameter, we have
\[
\pi(\lambda) = \frac{\lambda^{\alpha-1} e^{-\lambda/\theta}}{\theta^\alpha \Gamma(\alpha)}.
\]
Practice Problems
Problem 77.1
The amount of a claim X|Λ has the Poisson distribution with parameter λ.
The risk parameter Λ has a gamma distribution with parameters α and β.
Problem 77.2
The amount of a claim X|Θ has the normal distribution with parameters θ and $\sigma_1^2$. The risk parameter Θ has a normal distribution with parameters µ and $\sigma_2^2$.
Problem 77.3
The amount of a claim X|Q has the Binomial distribution with parameters
(m, q) where m is known. The risk parameter Q has a beta distribution
with parameters (a, b, 1).
Problem 77.4
The amount of a claim X|Λ has the exponential distribution with parameter $\frac{1}{\lambda}$. The risk parameter Λ has an inverse gamma distribution with parameters α and θ.
Problem 77.5
The amount of a claim X|Λ has the uniform distribution on [0, λ]. The risk parameter Λ has a single-parameter Pareto distribution with parameters α and θ.
Suppose X and Y are two continuous random variables with joint density $f_{XY}(x, y)$. Let $f_{X|Y}(x|y)$ denote the probability density function of X given that Y = y. The conditional density function of X given Y = y is
\[
f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)}.
\]
The discrete analogue is
\[
p_{X|Y}(x|y) = \frac{p_{XY}(x, y)}{p_Y(y)}.
\]
Note that
\[
\int_{-\infty}^{\infty} f_{X|Y}(x|y)\,dx = \int_{-\infty}^{\infty} \frac{f_{XY}(x, y)}{f_Y(y)}\,dx = \frac{f_Y(y)}{f_Y(y)} = 1.
\]
Example 78.1
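Suppose X and Y have the following joint density
\[
f_{XY}(x, y) = \begin{cases} \frac{1}{2} & |x| + |y| < 1 \\ 0 & \text{otherwise} \end{cases}
\]
Solution.
(a) Clearly, X only takes values in (−1, 1). So $f_X(x) = 0$ if $|x| \ge 1$. Let $-1 < x < 1$,
\[
f_X(x) = \int_{-\infty}^{\infty} \frac{1}{2}\,dy = \int_{-1+|x|}^{1-|x|} \frac{1}{2}\,dy = 1 - |x|.
\]
\[
f_{Y|X}\left(y \,\middle|\, \tfrac{1}{2}\right) = \frac{f\left(\tfrac{1}{2}, y\right)}{f_X\left(\tfrac{1}{2}\right)} = \begin{cases} 1 & -\frac{1}{2} < y < \frac{1}{2} \\ 0 & \text{otherwise} \end{cases}
\]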
Theorem 78.1
Continuous random variables X and Y with fY (y) > 0 are independent if
and only if
fX|Y (x|y) = fX (x).
Proof.
Suppose first that X and Y are independent. Then fXY (x, y) = fX (x)fY (y).
Thus,
\[
f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)} = \frac{f_X(x)f_Y(y)}{f_Y(y)} = f_X(x).
\]
Conversely, suppose that $f_{X|Y}(x|y) = f_X(x)$. Then $f_{XY}(x, y) = f_{X|Y}(x|y)f_Y(y) = f_X(x)f_Y(y)$. This shows that X and Y are independent.
Example 78.2
Let X and Y be two continuous random variables with joint density function
\[
f_{XY}(x, y) = \begin{cases} c & 0 \le y < x \le 2 \\ 0 & \text{otherwise} \end{cases}
\]
Solution.
(a) We have
\[
f_X(x) = \int_0^x c\,dy = cx, \quad 0 \le x \le 2
\]
\[
f_Y(y) = \int_y^2 c\,dx = c(2 - y), \quad 0 \le y \le 2
\]
78 CONDITIONAL DISTRIBUTIONS AND EXPECTATION 573
and
\[
f_{X|Y}(x|1) = \frac{f_{XY}(x, 1)}{f_Y(1)} = \frac{c}{c} = 1, \quad 1 \le x \le 2.
\]
Example 78.3
Show that
\[
f_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)f_X(x)}{f_Y(y)}.
\]
Solution.
We have
\[
f_{X|Y}(x|y)f_Y(y) = f_{Y|X}(y|x)f_X(x) \; (= f_{XY}(x, y))
\]
where
\[
f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)}.
\]
Example 78.4
Suppose that the joint density of X and Y is given by
\[
f_{XY}(x, y) = \frac{e^{-x/y} e^{-y}}{y}, \quad x, y > 0.
\]
Solution.
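The conditional density is found as follows:
\begin{align*}
f_{X|Y}(x|y) &= \frac{f_{XY}(x, y)}{f_Y(y)} = \frac{f_{XY}(x, y)}{\int_{-\infty}^{\infty} f_{XY}(x, y)\,dx} \\
&= \frac{(1/y)e^{-x/y}e^{-y}}{\int_0^\infty (1/y)e^{-x/y}e^{-y}\,dx} \\
&= \frac{(1/y)e^{-x/y}}{\int_0^\infty (1/y)e^{-x/y}\,dx} \\
&= \frac{1}{y}e^{-x/y}.
\end{align*}
Hence,
\begin{align*}
E(X|Y = y) &= \int_0^\infty \frac{x}{y}e^{-x/y}\,dx = -\left[ xe^{-x/y}\Big|_0^\infty - \int_0^\infty e^{-x/y}\,dx \right] \\
&= -\left[ xe^{-x/y} + ye^{-x/y} \right]_0^\infty = y.
\end{align*}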
Notice that if X and Y are independent then fX|Y (x|y) = fX (x) so that
E(X|Y = y) = E(X).
Theorem 78.2 (Double Expectation Property)
E(X) = E(E(X|Y ))
Proof.
We give a proof in the case X and Y are continuous random variables.
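\begin{align*}
E(E(X|Y)) &= \int_{-\infty}^{\infty} E(X|Y = y)f_Y(y)\,dy \\
&= \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} x f_{X|Y}(x|y)\,dx \right) f_Y(y)\,dy \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x f_{X|Y}(x|y)f_Y(y)\,dx\,dy \\
&= \int_{-\infty}^{\infty} x \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy\,dx \\
&= \int_{-\infty}^{\infty} x f_X(x)\,dx = E(X)
\end{align*}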
Example 78.5
Suppose that X|Θ has a Poisson distribution with parameter θ and Θ has
a Gamma distribution with parameters α and β. Find E(X).
Solution.
We have
E(X) = E[E(X|Θ)] = E(Θ) = αβ
Now, for any function g(x, y), the conditional expected value of g given Y = y is, in the continuous case,
\[
E(g(X, Y)|Y = y) = \int_{-\infty}^{\infty} g(x, y)f_{X|Y}(x|y)\,dx
\]
Example 78.6
Show that
E[E(g(X, Y )|Y )] = E[g(X, Y )].
Solution.
We have
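\begin{align*}
E[E(g(X, Y)|Y)] &= \int_{-\infty}^{\infty} E(g(X, Y)|Y = y)f_Y(y)\,dy \\
&= \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} g(x, y)f_{X|Y}(x|y)\,dx \right) f_Y(y)\,dy \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)f_{XY}(x, y)\,dx\,dy = E[g(X, Y)].
\end{align*}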
Proposition 78.1
Let X and Y be random variables. Then
(a) Var(X|Y ) = E(X 2 |Y ) − [E(X|Y )]2
(b) E(Var(X|Y )) = E[E(X 2 |Y ) − (E(X|Y ))2 ] = E(X 2 ) − E[(E(X|Y ))2 ].
(c) Var(E(X|Y )) = E[(E(X|Y ))2 ] − (E(X))2 .
(d) Law of Total Variance: Var(X) = E[Var(X|Y )] + Var(E(X|Y )).
Proof.
(a) We have
(d) The result follows by adding the two equations in (b) and (c)
Example 78.7
Suppose that X and Y have joint distribution
\[
f_{XY}(x, y) = \begin{cases} \frac{3y^2}{x^3} & 0 < y < x < 1 \\ 0 & \text{otherwise} \end{cases}
\]
Find E(X), E(X²), Var(X), E(Y|X), Var(Y|X), E[Var(Y|X)], Var[E(Y|X)], and Var(Y).
Solution.
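First we find the marginal density functions.
\[
f_X(x) = \int_0^x \frac{3y^2}{x^3}\,dy = 1, \quad 0 < x < 1
\]
\[
f_Y(y) = \int_y^1 \frac{3y^2}{x^3}\,dx = \frac{3}{2}(1 - y^2), \quad 0 < y < 1
\]
Now,
\[
E(X) = \int_0^1 x\,dx = \frac{1}{2}
\]
\[
E(X^2) = \int_0^1 x^2\,dx = \frac{1}{3}
\]
Thus,
\[
\mathrm{Var}(X) = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}.
\]
Next, we find the conditional density of Y given X = x:
\[
f_{Y|X}(y|x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{3y^2}{x^3}, \quad 0 < y < x < 1
\]
Hence,
\[
E(Y|X = x) = \int_0^x \frac{3y^3}{x^3}\,dy = \frac{3}{4}x
\]
and
\[
E(Y^2|X = x) = \int_0^x \frac{3y^4}{x^3}\,dy = \frac{3}{5}x^2
\]
Thus,
\[
\mathrm{Var}(Y|X = x) = E(Y^2|X = x) - [E(Y|X = x)]^2 = \frac{3}{5}x^2 - \frac{9}{16}x^2 = \frac{3}{80}x^2
\]
Also,
\[
\mathrm{Var}[E(Y|X)] = \mathrm{Var}\left(\frac{3}{4}X\right) = \frac{9}{16}\mathrm{Var}(X) = \frac{9}{16} \times \frac{1}{12} = \frac{3}{64}
\]
and
\[
E[\mathrm{Var}(Y|X)] = E\left(\frac{3}{80}X^2\right) = \frac{3}{80}E(X^2) = \frac{3}{80} \times \frac{1}{3} = \frac{1}{80}.
\]
Finally,
\[
\mathrm{Var}(Y) = \mathrm{Var}[E(Y|X)] + E[\mathrm{Var}(Y|X)] = \frac{19}{320}
\]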
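The law of total variance in Example 78.7 can be cross-checked with exact rational arithmetic by computing Var(Y) a second way, directly from the marginal density $f_Y(y) = \frac{3}{2}(1 - y^2)$. The helper `integrate_poly` below is a hypothetical name introduced only for this sketch.

```python
from fractions import Fraction as F

def integrate_poly(coeffs, a, b):
    """Integrate sum(c * t**k) over [a, b]; coeffs maps power k to c."""
    return sum(c * (F(b) ** (k + 1) - F(a) ** (k + 1)) / (k + 1)
               for k, c in coeffs.items())

# X is uniform on (0, 1), so f_X(x) = 1.
e_x = integrate_poly({1: F(1)}, 0, 1)        # E(X)
e_x2 = integrate_poly({2: F(1)}, 0, 1)       # E(X^2)
var_x = e_x2 - e_x ** 2

# Conditional moments: E(Y|X) = (3/4) X and Var(Y|X) = (3/80) X^2.
var_of_cond_mean = F(3, 4) ** 2 * var_x
mean_of_cond_var = F(3, 80) * e_x2
total = var_of_cond_mean + mean_of_cond_var

# Direct computation from f_Y(y) = (3/2)(1 - y^2) on (0, 1).
e_y = integrate_poly({1: F(3, 2), 3: F(-3, 2)}, 0, 1)
e_y2 = integrate_poly({2: F(3, 2), 4: F(-3, 2)}, 0, 1)
var_y_direct = e_y2 - e_y ** 2

print(total, var_y_direct)
```

Both routes give the same value, which is what the law of total variance asserts.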
Example 78.8 ‡
An actuary for an automobile insurance company determines that the dis-
tribution of the annual number of claims for an insured chosen at random is
modeled by the negative binomial distribution with mean 0.2 and variance
0.4.
The number of claims for each individual insured has a Poisson distribution
and the means of these Poisson distributions are gamma distributed over
the population of insureds.
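Calculate the variance of this gamma distribution.
Solution.
Let N be the annual number of claims and let Γ denote the Poisson mean of a randomly chosen insured. We are given that $E(N) = E[E(N|\Gamma)] = E(\Gamma) = 0.2$. By the law of total variance, we have
\[
0.4 = \mathrm{Var}(N) = E[\mathrm{Var}(N|\Gamma)] + \mathrm{Var}[E(N|\Gamma)] = E(\Gamma) + \mathrm{Var}(\Gamma) = 0.2 + \mathrm{Var}(\Gamma),
\]
so that $\mathrm{Var}(\Gamma) = 0.2$.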
Practice Problems
Problem 78.1
Suppose that X is uniformly distributed on the interval [0, 1] and that, given
X = x, Y is uniformly distributed on the interval [1 − x, 1].
(a) Determine the joint density fXY (x, y).
(b) Find the probability $P(Y \ge \frac{1}{2})$.
Problem 78.2
The joint density of X and Y is given by
\[
f_{XY}(x, y) = \begin{cases} \frac{15}{2}x(2 - x - y) & 0 \le x, y \le 1 \\ 0 & \text{otherwise} \end{cases}
\]
Problem 78.3
The joint density function of X and Y is given by
\[
f_{XY}(x, y) = \begin{cases} \frac{e^{-x/y} e^{-y}}{y} & x \ge 0, \ y \ge 0 \\ 0 & \text{otherwise} \end{cases}
\]
Problem 78.4
Let Y be a random variable with a density fY given by
\[
f_Y(y) = \begin{cases} \frac{\alpha - 1}{y^\alpha} & y > 1 \\ 0 & \text{otherwise} \end{cases}
\]
Problem 78.5
Suppose that X and Y have joint distribution
\[
f_{XY}(x, y) = \begin{cases} 8xy & 0 < x < y < 1 \\ 0 & \text{otherwise} \end{cases}
\]
Problem 78.6
Suppose that X and Y have joint distribution
\[
f_{XY}(x, y) = \begin{cases} \frac{21}{4}x^2 y & x^2 < y < 1 \\ 0 & \text{otherwise} \end{cases}
\]
Problem 78.7
The stock prices of two companies at the end of any given year are modeled
with random variables X and Y that follow a distribution with joint density
function
\[
f_{XY}(x, y) = \begin{cases} 2x & 0 < x < 1, \ x < y < x + 1 \\ 0 & \text{otherwise} \end{cases}
\]
Problem 78.8
Let X be a random variable with mean 3 and variance 2, and let Y be a random variable such that for every x, the conditional distribution of Y given X = x has a mean of x and a variance of $x^2$.
Problem 78.9
The number of stops X in a day for a delivery truck driver is Poisson with mean λ. Conditional on there being X = x stops, the distance Y driven by the driver is normal with a mean of αx miles and a standard deviation of βx miles.
Give the mean and variance of the number of miles she drives per day.
79 BAYESIAN CREDIBILITY WITH DISCRETE PRIOR 581
Let’s recall the credibility problem in Section 77: For a particular policy-
holder, we have the observed past losses X1 , X2 , · · · , Xn and we are inter-
ested in setting the premium to cover the loss of the next exposure unit
(next year) Xn+1 . We assume that the risk parameter θ (which is unknown)
associated with the policyholder comes from a prior distribution π(θ) and
that the losses X1 , X2 , · · · , Xn+1 are conditionally independent, that is the
Xi |Θ are independent, but not necessarily identically distributed.
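The mean of the predictive distribution, also known as the Bayesian premium or Bayesian estimate, is what we would charge to cover the loss $X_{n+1}$. It is given by
\begin{align*}
E[X_{n+1}|\mathbf{X} = \mathbf{x}] &= \int x_{n+1} f_{X_{n+1}|\mathbf{X}}(x_{n+1}|\mathbf{x})\,dx_{n+1} \\
&= \int x_{n+1} \left[ \int f_{X_{n+1}|\Theta}(x_{n+1}|\theta)\pi_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta \right] dx_{n+1} \\
&= \int \left[ \int x_{n+1} f_{X_{n+1}|\Theta}(x_{n+1}|\theta)\,dx_{n+1} \right] \pi_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta \\
&= \int \mu_{n+1}(\theta)\pi_{\Theta|\mathbf{X}}(\theta|\mathbf{x})\,d\theta
\end{align*}
where
\[
\mu_{n+1}(\theta) = E(X_{n+1}|\Theta = \theta) = \int x_{n+1} f_{X_{n+1}|\Theta}(x_{n+1}|\theta)\,dx_{n+1}
\]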
Remark 79.1
In the case Θ is discrete, the integrals above are replaced by sums.
Example 79.1 ‡
You are given the following for a dental insurer:
(i) Claim counts for individual insureds follow a Poisson distribution.
(ii) Half of the insureds are expected to have 2.0 claims per year.
(iii) The other half of the insureds are expected to have 4.0 claims per year.
A randomly selected insured has made 4 claims in each of the first two policy
years. Determine the Bayesian estimate of this insured’s claim count in the
next (third) policy year.
Solution.
Let X be the claim count for an individual. We are given that conditional
claim count X|Θ is Poisson with mean Θ. Let Xn be the number of claims
in year n. We want to find E(X3 |X1 , X2 ) where x1 = x2 = 4.
The prior distribution is: π(2) = 0.5 and π(4) = 0.5.
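The marginal distribution is
\begin{align*}
f_{\mathbf{X}}(4, 4) &= f_{X_1|\Theta}(4|\theta = 2)f_{X_2|\Theta}(4|\theta = 2)\pi(2) + f_{X_1|\Theta}(4|\theta = 4)f_{X_2|\Theta}(4|\theta = 4)\pi(4) \\
&= \left(\frac{e^{-2}2^4}{4!}\right)^2 (0.5) + \left(\frac{e^{-4}4^4}{4!}\right)^2 (0.5) \\
&= 0.02315.
\end{align*}
The posterior distribution is:
\[
\pi_{\Theta|\mathbf{X}}(2|\mathbf{x}) = \frac{f_{\mathbf{X}|\Theta}(\mathbf{x}|2)\pi(2)}{f_{\mathbf{X}}(\mathbf{x})} = \frac{\left(\frac{e^{-2}2^4}{4!}\right)^2 (0.5)}{0.02315} = 0.1758
\]
and
\[
\pi_{\Theta|\mathbf{X}}(4|\mathbf{x}) = 1 - 0.1758 = 0.8242.
\]
Finally, the Bayesian premium is
\begin{align*}
E(X_3|X_1, X_2) &= E(X_3|\Theta = 2)\pi_{\Theta|\mathbf{X}}(2|\mathbf{x}) + E(X_3|\Theta = 4)\pi_{\Theta|\mathbf{X}}(4|\mathbf{x}) \\
&= 2(0.1758) + 4(0.8242) = 3.6484
\end{align*}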
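The steps of Example 79.1 (prior, model likelihood, posterior, Bayesian premium) can be sketched for a discrete prior over Poisson means. The function names below are hypothetical and the code assumes conditionally independent Poisson claim counts, as in the example.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of k claims for the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def bayesian_premium(prior, observations):
    """prior maps each possible Poisson mean to its prior probability."""
    # Joint probability of the data and each value of the risk parameter.
    joint = {theta: w * math.prod(poisson_pmf(x, theta) for x in observations)
             for theta, w in prior.items()}
    marginal = sum(joint.values())
    posterior = {theta: j / marginal for theta, j in joint.items()}
    # Hypothetical mean of a Poisson count is the Poisson mean itself.
    premium = sum(theta * w for theta, w in posterior.items())
    return posterior, premium

posterior, premium = bayesian_premium({2.0: 0.5, 4.0: 0.5}, [4, 4])
print(posterior, premium)
```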
Example 79.2 ‡
You are given:
A randomly selected insured has one claim in Year 1. Determine the ex-
pected number of claims in Year 2 for that insured.
Solution.
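Let $X_n$ denote the number of claims in Year n. We are asked to find $E(X_2|X_1 = 1)$. The parameter θ stands for the class. The prior distribution is $\pi(1) = \frac{3000}{6000} = \frac{1}{2}$, $\pi(2) = \frac{1}{3}$, and $\pi(3) = \frac{1}{6}$. The marginal distribution evaluated at $x_1 = 1$ is
\begin{align*}
f(1) &= f(1|1)\pi(1) + f(1|2)\pi(2) + f(1|3)\pi(3) \\
&= \frac{1}{3} \cdot \frac{1}{2} + \frac{1}{6} \cdot \frac{1}{3} = \frac{2}{9}.
\end{align*}
The posterior distribution is
\begin{align*}
\pi(1|1) &= \frac{f(1|1)\pi(1)}{f(1)} = \frac{1/6}{2/9} = \frac{3}{4} \\
\pi(2|1) &= \frac{f(1|2)\pi(2)}{f(1)} = \frac{1/18}{2/9} = \frac{1}{4} \\
\pi(3|1) &= \frac{f(1|3)\pi(3)}{f(1)} = \frac{0}{2/9} = 0.
\end{align*}
Thus,
\begin{align*}
E(X_2|X_1 = 1) &= E(X_2|1)\pi(1|1) + E(X_2|2)\pi(2|1) + E(X_2|3)\pi(3|1) \\
&= \left[1\left(\frac{1}{3}\right) + 2\left(\frac{1}{3}\right)\right]\frac{3}{4} + \left[1\left(\frac{1}{6}\right) + 2\left(\frac{2}{3}\right) + 3\left(\frac{1}{6}\right)\right]\frac{1}{4} \\
&= 1.25
\end{align*}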
Example 79.3 ‡
You are given the following information about six coins:
Solution.
The prior distribution is
\[
\pi(\theta_1) = \frac{4}{6}, \quad \pi(\theta_2) = \frac{1}{6}, \quad \pi(\theta_3) = \frac{1}{6}.
\]
E(X5 |S) =E(X3 |θ1 )f (θ1 |S) + E(X3 |θ2 )f (θ2 |S) + E(X3 |θ3 )f (θ3 |S)
=0.5(0.68088) + 0.25(0.03186) + 0.75(0.28726) = 0.5639
Example 79.4 ‡
For a particular policy, the conditional probability of the annual number of
claims given Θ = θ, and the probability distribution of Θ are as follows:
Number of claims 0 1 2
Probability 2θ θ 1 − 3θ
θ 0.10 0.30
Probability 0.80 0.20
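Solution.
Let N denote the annual number of claims. The marginal probability of one claim is $\Pr(N = 1) = 0.80(0.10) + 0.20(0.30) = 0.14$, so the posterior distribution given one claim is
\[
\pi_{\Theta|N}(0.10|1) = \frac{0.80(0.10)}{0.14} = \frac{4}{7}
\]
\[
\pi_{\Theta|N}(0.30|1) = \frac{0.20(0.30)}{0.14} = \frac{3}{7}.
\]
Since $E(N|\theta) = \theta + 2(1 - 3\theta) = 2 - 5\theta$, the Bayesian credibility estimate of the number of claims in Year 2 is
\[
1.5\left(\frac{4}{7}\right) + 0.5\left(\frac{3}{7}\right) = 1.07.
\]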
Example 79.5 ‡
You are given:
(i) The claim count and claim size distributions for risks of type A are:
(ii) The claim count and claim size distributions for risks of type B are:
(iv) Claim counts and claim sizes are independent within each risk type.
A randomly selected risk is observed to have total annual losses of 500.
Determine the Bayesian premium for the next year for this same risk.
Solution.
The prior parameter represents the type of risk so that either Θ = A or
Θ = B. The prior distribution is Pr(A) = Pr(B) = 0.5. For each of the two
classes, total annual loss L has a compound distribution. We want
We have
Thus,
Example 79.6 ‡
Two eight-sided dice, A and B, are used to determine the number of claims
for an insured. The faces of each die are marked with either 0 or 1, repre-
senting the number of claims for that insured for the year.
Die Pr(Claims=0) Pr(Claims=1)
A 1/4 3/4
B 3/4 1/4
Two spinners, X and Y, are used to determine claim cost. Spinner X has
two areas marked 12 and c. Spinner Y has only one area marked 12.
Spinner Pr(Cost=12) Pr(Cost=c)
X 1/2 1/2
Y 1 0
To determine the losses for the year, a die is randomly selected from A and
B and rolled. If a claim occurs, a spinner is randomly selected from X and
Y and spun. For subsequent years, the same die and spinner are used to
determine losses.
Losses for the first year are 12. Based upon the results of the first year, you
determine that the expected losses for the second year are 10.
Calculate c.
Solution.
The prior parameter Θ can be one of AX, BX, AY, and BY. The prior
distribution is
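\[
\pi(AX) = \pi(AY) = \pi(BX) = \pi(BY) = \frac{1}{4}
\]
We have
\begin{align*}
10 = E(L_2|L_1 = 12) &= E(L_2|AX)\Pr(AX|L_1 = 12) + E(L_2|AY)\Pr(AY|L_1 = 12) \\
&\quad + E(L_2|BX)\Pr(BX|L_1 = 12) + E(L_2|BY)\Pr(BY|L_1 = 12) \\
E(L_2|AX) &= 12\left(\frac{1}{2}\right)\left(\frac{3}{4}\right) + c\left(\frac{1}{2}\right)\left(\frac{3}{4}\right) = \frac{3}{8}(12 + c) \\
E(L_2|AY) &= 12(1)\left(\frac{3}{4}\right) + c(0)\left(\frac{3}{4}\right) = 9 \\
E(L_2|BX) &= 12\left(\frac{1}{2}\right)\left(\frac{1}{4}\right) + c\left(\frac{1}{2}\right)\left(\frac{1}{4}\right) = \frac{1}{8}(12 + c) \\
E(L_2|BY) &= 12(1)\left(\frac{1}{4}\right) + c(0)\left(\frac{1}{4}\right) = 3 \\
\Pr(L_1 = 12) &= \Pr(L_1 = 12|AX)\Pr(AX) + \Pr(L_1 = 12|AY)\Pr(AY) \\
&\quad + \Pr(L_1 = 12|BX)\Pr(BX) + \Pr(L_1 = 12|BY)\Pr(BY) \\
&= \frac{3}{4} \cdot \frac{1}{2} \cdot \frac{1}{4} + \frac{3}{4}(1)\frac{1}{4} + \frac{1}{4} \cdot \frac{1}{2} \cdot \frac{1}{4} + \frac{1}{4}(1)\frac{1}{4} = \frac{3}{8} \\
\Pr(AX|L_1 = 12) &= \frac{\Pr(L_1 = 12|AX)\Pr(AX)}{\Pr(L_1 = 12)} = \frac{\frac{3}{4} \cdot \frac{1}{2} \cdot \frac{1}{4}}{\frac{3}{8}} = \frac{1}{4} \\
\Pr(AY|L_1 = 12) &= \frac{\Pr(L_1 = 12|AY)\Pr(AY)}{\Pr(L_1 = 12)} = \frac{\frac{3}{4}(1)\frac{1}{4}}{\frac{3}{8}} = \frac{1}{2} \\
\Pr(BX|L_1 = 12) &= \frac{\Pr(L_1 = 12|BX)\Pr(BX)}{\Pr(L_1 = 12)} = \frac{\frac{1}{4} \cdot \frac{1}{2} \cdot \frac{1}{4}}{\frac{3}{8}} = \frac{1}{12} \\
\Pr(BY|L_1 = 12) &= \frac{\Pr(L_1 = 12|BY)\Pr(BY)}{\Pr(L_1 = 12)} = \frac{\frac{1}{4}(1)\frac{1}{4}}{\frac{3}{8}} = \frac{1}{6}.
\end{align*}
Thus,
\[
\frac{3}{8}(12 + c)\left(\frac{1}{4}\right) + 9\left(\frac{1}{2}\right) + \frac{1}{8}(12 + c)\left(\frac{1}{12}\right) + 3\left(\frac{1}{6}\right) = 10.
\]
Solving this equation, we find c = 36.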
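The value of c in Example 79.6 can be double-checked with exact fractions. In this sketch (variable names are illustrative), the expected second-year loss is written as a linear function of c, and the equation "expected loss = 10" is solved directly.

```python
from fractions import Fraction as F

p_claim = {"A": F(3, 4), "B": F(1, 4)}    # Pr(one claim) for each die
p_cost12 = {"X": F(1, 2), "Y": F(1)}      # Pr(cost = 12) for each spinner

# Likelihood of first-year losses of 12; the uniform prior of 1/4 cancels.
likelihood = {(d, s): p_claim[d] * p_cost12[s]
              for d in p_claim for s in p_cost12}
total = sum(likelihood.values())
posterior = {k: v / total for k, v in likelihood.items()}

# E(L2 | die, spinner) = Pr(claim) * (12 * Pr(12) + c * Pr(c)),
# so the posterior expectation has the form const + slope * c.
const = sum(w * p_claim[d] * 12 * p_cost12[s]
            for (d, s), w in posterior.items())
slope = sum(w * p_claim[d] * (1 - p_cost12[s])
            for (d, s), w in posterior.items())
c = (10 - const) / slope
print(posterior, c)
```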
Example 79.7 ‡
For a risk, you are given:
(i) The number of claims during a single year follows a Bernoulli distribution
with mean p.
(ii) The prior distribution for p is uniform on the interval [0, 1].
(iii) The claims experience is observed for a number of years.
(iv) The Bayesian premium is calculated as 1/5 based on the observed claims.
Which of the following observed claims data could have yielded this calcu-
lation?
(A) 0 claims during 3 years
(B) 0 claims during 4 years
(C) 0 claims during 5 years
(D) 1 claim during 4 years
(E) 1 claim during 5 years
Solution.
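Let $x_i$ be the number of claims in year i where $i = 1, 2, \cdots, n$ and $x_i = 0, 1$. Let $x = x_1 + \cdots + x_n$ be the number of claims in n years. We have that $X_i|p$ has a Bernoulli distribution with probability function
\[
f(x_i|p) = p^{x_i}(1 - p)^{1 - x_i}, \quad x_i = 0, 1.
\]
The Bayesian premium is
\[
E(X_{n+1}|x_1, x_2, \cdots, x_n) = \int_0^1 E(X_{n+1}|p)f(p|x_1, x_2, \cdots, x_n)\,dp = \int_0^1 p f(p|x_1, x_2, \cdots, x_n)\,dp.
\]
We have
\begin{align*}
f(x_1, x_2, \cdots, x_n|p) &= \prod_{i=1}^n p^{x_i}(1 - p)^{1 - x_i} = p^x(1 - p)^{n-x} \\
f(p|x_1, x_2, \cdots, x_n) &= \frac{f(x_1, x_2, \cdots, x_n|p)\pi(p)}{f(x_1, x_2, \cdots, x_n)} = \frac{p^x(1 - p)^{n-x}}{f(x_1, x_2, \cdots, x_n)} \\
f(x_1, x_2, \cdots, x_n) &= \int_0^1 p^x(1 - p)^{n-x}\,dp \\
&= \frac{\Gamma(x + 1)\Gamma(n - x + 1)}{\Gamma(n + 2)} \int_0^1 \frac{\Gamma(n + 2)}{\Gamma(x + 1)\Gamma(n - x + 1)} p^{(x+1)-1}(1 - p)^{(n-x+1)-1}\,dp \\
&= \frac{\Gamma(x + 1)\Gamma(n - x + 1)}{\Gamma(n + 2)} \\
f(p|x_1, x_2, \cdots, x_n) &= \frac{\Gamma(n + 2)}{\Gamma(x + 1)\Gamma(n - x + 1)} p^{(x+1)-1}(1 - p)^{(n-x+1)-1}.
\end{align*}
Thus, the posterior distribution is beta with parameters $x + 1$ and $n - x + 1$, and the Bayesian premium is its mean $\frac{x+1}{n+2}$. Setting $\frac{x+1}{n+2} = \frac{1}{5}$, the only listed choice that works is $x = 0$ and $n = 3$, which is (A).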
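The posterior above is a beta distribution with parameters $x + 1$ and $n - x + 1$, whose mean is $\frac{x+1}{n+2}$. Assuming that posterior mean is the Bayesian premium, the answer choices of Example 79.7 can be screened with a short sketch (the function name is hypothetical):

```python
from fractions import Fraction

def bernoulli_uniform_premium(claims, years):
    """Posterior mean of p under a uniform prior: (x + 1) / (n + 2)."""
    return Fraction(claims + 1, years + 2)

# (claims, years) for answer choices (A) through (E).
options = {"A": (0, 3), "B": (0, 4), "C": (0, 5), "D": (1, 4), "E": (1, 5)}
matches = [label for label, (x, n) in options.items()
           if bernoulli_uniform_premium(x, n) == Fraction(1, 5)]
print(matches)
```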
Practice Problems
Problem 79.1
Drivers are classified as good (G), average (A), or bad (B).
• Good drivers make up 70% of the population and for a driver in this class,
the probability of having 0 claim in one year is 0.65, 1 claim is 0.25, and 2
claims is 0.10.
• Average drivers make up 20% of the population and for a driver in this
class, the probability of having 0 claim in one year is 0.40, 1 claim is 0.40,
and 2 claims is 0.20.
• Bad drivers make up 10% of the population and for a driver in this class,
the probability of having 0 claim in one year is 0.50, 1 claim is 0.30, and 2
claims is 0.20.
For a policyholder, the risk parameter is the classification of the individual
as G, A, or B. For a particular policyholder, it has been observed that x1 = 1
and x2 = 2.
(a) Write the prior distribution of this model.
(b) Find the model distribution of x = (1, 2)T .
Problem 79.2
In Problem 79.1, answer the following questions:
(a) Find the marginal probability of X.
(b) Find the joint distribution of X1 , X2 , X3 given x = (1, 2)T .
Problem 79.3
In Problem 79.1, answer the following questions:
(a) Find the predictive distribution given x = (1, 2)T .
(b) Find the posterior probabilities.
Problem 79.4
In Problem 79.1, answer the following questions:
(a) Determine the hypothetical means.
(b) Determine the pure, or collective, premium.
Problem 79.5
In Problem 79.1, answer the following questions:
(a) Determine the Bayesian premium without using the hypothetical means.
(b) Determine the Bayesian premium by using the hypothetical means.
Problem 79.6 ‡
In a certain town the number of common colds an individual will get in a
year follows a Poisson distribution that depends on the individual’s age and
smoking status. The distribution of the population and the mean number
of colds are as follows:
Problem 79.7 ‡
You are given:
(i) The annual number of claims on a given policy has the geometric distri-
bution with parameter β.
(ii) One-third of the policies have β = 2, and the remaining two-thirds have
β = 5.
A randomly selected policy had two claims in Year 1.
Calculate the Bayesian expected number of claims for the selected policy in
Year 2.
Problem 79.8 ‡
An insurance company sells three types of policies with the following char-
acteristics:
Type of Policy Proportion of Total Annual Claim
Policies Frequency
I 5% Poisson with λ = 0.25
II 20% Poisson with λ = 0.50
III 75% Poisson with λ = 1.00
Problem 79.9 ‡
You are given:
(i) Claim sizes follow an exponential distribution with mean θ.
(ii) For 80% of the policies, θ = 8.
Problem 79.10 ‡
You are given:
(i) Two classes of policyholders have the following severity distributions:
Problem 79.11 ‡
You are given:
(i) An individual automobile insured has annual claim frequencies that fol-
low a Poisson distribution with mean λ.
(ii) An actuary’s prior distribution for the parameter λ has probability density function:
\[
f(\lambda) = 0.5\left[5e^{-5\lambda} + \frac{1}{5}e^{-\lambda/5}\right].
\]
(iii) In the first policy year, no claims were observed for the insured.
Determine the expected number of claims in the second policy year.
Example 80.1
Claim amount is assumed to be exponential with mean 1/Θ. The prior distribution of Θ is assumed to be gamma with parameters α = 5 and β = 0.0005.
Suppose a person has claims in the amounts of $2000, $1000, and $3000.
(a) Provide a mathematical description of this model.
(b) Determine the predictive distribution of the fourth claim.
(c) Determine the posterior distribution of Θ.
(d) Determine the Bayesian premium without using the hypothetical means.
(e) Determine the Bayesian premium by using the hypothetical means.
Solution.
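(a) The claims amount distribution (model distribution) is given by
\[
f_{X|\Theta}(x|\theta) = \theta e^{-\theta x}, \quad x > 0,
\]
and the prior distribution is
\[
\pi(\theta) = \frac{2000^5 \theta^4 e^{-2000\theta}}{24}.
\]
We have
\begin{align*}
f(2000, 1000, 3000) &= \int_0^\infty (\theta e^{-2000\theta})(\theta e^{-1000\theta})(\theta e^{-3000\theta}) \frac{2000^5 \theta^4 e^{-2000\theta}}{4!}\,d\theta \\
&= \frac{2000^5}{4!} \int_0^\infty \theta^7 e^{-8000\theta}\,d\theta \\
&= \frac{2000^5\, 7!}{8000^8\, 4!} \underbrace{\int_0^\infty \frac{8000^8 \theta^7 e^{-8000\theta}}{\Gamma(8)}\,d\theta}_{1} \\
&= \frac{7!}{4!} \cdot \frac{2000^5}{8000^8}.
\end{align*}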
80 BAYESIAN CREDIBILITY WITH CONTINUOUS PRIOR 595
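Similarly,
\begin{align*}
f(2000, 1000, 3000, x_4) &= \int_0^\infty (\theta e^{-2000\theta})(\theta e^{-1000\theta})(\theta e^{-3000\theta})(\theta e^{-x_4\theta}) \frac{2000^5 \theta^4 e^{-2000\theta}}{4!}\,d\theta \\
&= \frac{2000^5}{4!} \int_0^\infty \theta^8 e^{-\theta(8000 + x_4)}\,d\theta \\
&= \frac{2000^5\, 8!}{(8000 + x_4)^9\, 4!} \underbrace{\int_0^\infty \frac{(8000 + x_4)^9 \theta^8 e^{-\theta(8000 + x_4)}}{\Gamma(9)}\,d\theta}_{1} \\
&= \frac{8!}{4!} \cdot \frac{2000^5}{(8000 + x_4)^9}.
\end{align*}
\[
f(x_4|2000, 1000, 3000) = \frac{\frac{8!}{4!} \cdot \frac{2000^5}{(8000 + x_4)^9}}{\frac{7!}{4!} \cdot \frac{2000^5}{8000^8}} = \frac{8(8000^8)}{(8000 + x_4)^9}
\]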
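(d) We have
\[
E[X_4|2000, 1000, 3000] = \underbrace{\int_0^\infty x_4 \frac{8(8000^8)}{(8000 + x_4)^9}\,dx_4}_{\text{mean of Pareto}} = \frac{8000}{8 - 1} = \frac{8000}{7}.
\]
(e) We have
\begin{align*}
E[X_4|2000, 1000, 3000] &= \int_0^\infty \mu_4(\theta)\pi(\theta|2000, 1000, 3000)\,d\theta \\
&= \int_0^\infty \frac{1}{\theta} \cdot \frac{8000^8}{7!}\theta^7 e^{-8000\theta}\,d\theta \\
&= \frac{8000^8\, 6!}{7!\, 8000^7} \underbrace{\int_0^\infty \frac{8000^7 \theta^6 e^{-8000\theta}}{\Gamma(7)}\,d\theta}_{1} \\
&= \frac{8000}{7}
\end{align*}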
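The result of Example 80.1 can be verified through the gamma-exponential conjugate update. The sketch below assumes the prior is expressed with shape 5 and rate 2000 (the reciprocal of the scale β = 0.0005); the function name is hypothetical.

```python
from fractions import Fraction

def posterior_gamma(shape, rate, claims):
    """Update a gamma(shape, rate) prior on Theta with exponential claims."""
    # Each claim contributes a factor theta * exp(-theta * x) to the likelihood.
    return shape + len(claims), rate + sum(claims)

shape, rate = posterior_gamma(5, 2000, [2000, 1000, 3000])

# Hypothetical mean is 1/Theta, and E(1/Theta) = rate / (shape - 1)
# for a gamma(shape, rate) distribution with shape > 1.
premium = Fraction(rate, shape - 1)
print(shape, rate, premium)
```

The posterior has shape 8 and rate 8000, which agrees with the density $\frac{8000^8}{7!}\theta^7 e^{-8000\theta}$ used in part (e).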
Example 80.2 ‡
You are given:
(i) The number of claims for each policyholder has a binomial distribution
with parameters m = 8 and q.
(ii) The prior distribution of q is beta with parameters a (unknown), b = 9,
and θ = 1.
(iii) A randomly selected policyholder had the following claims experience:
(iv) The Bayesian credibility estimate for the expected number of claims in
Year 2 based on the Year 1 experience is 2.54545.
(v) The Bayesian credibility estimate for the expected number of claims in
Year 3 based on the Year 1 and Year 2 experience is 3.73333.
Determine k.
Solution.
By Problem 66.1, Q|N has a beta distribution with parameters $a' = a + \sum_{i=1}^n x_i$, $b' = b + nm - \sum_{i=1}^n x_i$ and θ = 1, where $x_1, x_2, \cdots, x_n$ are past data.
By (iv), we have n = 1 and $x_1 = 2$ so that $a' = a + 2$ and $b' = 9 + 8 - 2 = 15$.
Practice Problems
Problem 80.1
Suppose an individual’s claim amounts are given by an exponential distribution with mean Λ where Λ is an inverse gamma with parameters α = 2 and θ = 15. Last year's claim was $12.
(a) Provide a mathematical description of this model.
(b) Determine the predictive distribution of next year's claim.
(c) Determine the posterior distribution of Λ.
(d) Determine the Bayesian premium.
Problem 80.2 ‡
You are given:
(i) The annual number of claims for a policyholder follows a Poisson distri-
bution with mean Λ.
(ii) The prior distribution of Λ is gamma with probability density function:
\[
\pi(\lambda) = \frac{(2\lambda)^5 e^{-2\lambda}}{24\lambda}, \quad \lambda > 0.
\]
An insured is selected at random and observed to have x1 = 5 claims during
Year 1 and x2 = 3 claims during Year 2. Determine E(Λ|x1 = 5, x2 = 3).
Problem 80.3
You are given:
(i) X|P is a binomial distribution with parameters (10, p).
(ii) The prior distribution of P is
Problem 80.4
(i) Show that the negative binomial distribution with parameters r and β
can be expressed in the form
\[
p_k = \frac{\Gamma(r+k)}{\Gamma(r)\Gamma(k+1)}\,q^r(1-q)^k, \quad k = 0, 1, 2, \cdots.
\]
(ii) Suppose that X|Q is negative binomial with parameters r and q. Suppose
also that Q is beta with (a, b, 1). Find the posterior distribution of Q.
Problem 80.5
The amount of a claim X|Θ has the normal distribution with mean θ and
known variance $\sigma_1^2$. The risk parameter Θ has a normal distribution with
mean µ and variance $\sigma_2^2$. Find the posterior distribution of Θ.
Problem 80.6 ‡
You are given:
(i) The parameter Λ has an inverse gamma distribution with probability
density function:
\[
g(\lambda) = 500\lambda^{-4} e^{-10/\lambda}, \quad \lambda > 0.
\]
(ii) The size of a claim has an exponential distribution with probability
density function:
\[
f(x|\lambda) = \lambda^{-1} e^{-x/\lambda}, \quad x > 0,\ \lambda > 0.
\]
For a single insured, two claims were observed that totaled 50.
Determine the expected value of the next claim from the same insured.
The idea is to estimate µn+1 (θ) with a linear combination of the past data:
α0 + α1 X1 + · · · + αn Xn
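The estimation is done by linear least squares: the expected squared
distance between $\mu_{n+1}(\Theta)$ and the estimator is minimized.
That is, we want to find the $\alpha_i$'s that minimize the function
\[
Q = Q(\alpha_0, \alpha_1, \cdots, \alpha_n) = E\left[\left(\mu_{n+1}(\Theta) - \alpha_0 - \sum_{i=1}^n \alpha_i X_i\right)^2\right].
\]
Setting the partial derivative with respect to $\alpha_0$ equal to zero gives
\[
\frac{\partial Q}{\partial \alpha_0} = E\left[2\left(\mu_{n+1}(\Theta) - \alpha_0 - \sum_{i=1}^n \alpha_i X_i\right)(-1)\right] = 0
\]
which leads to
\[
E[\mu_{n+1}(\Theta)] = \alpha_0 + \sum_{i=1}^n \alpha_i E(X_i). \tag{81.1}
\]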
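Next, for i = 1, 2, · · · , n, we have
\[
\frac{\partial Q}{\partial \alpha_i} = E\left[2\left(\mu_{n+1}(\Theta) - \alpha_0 - \sum_{j=1}^n \alpha_j X_j\right)(-X_i)\right] = 0
\]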
81 BÜHLMAN CREDIBILITY PREMIUM 601
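which leads to
\[
E[\mu_{n+1}(\Theta)X_i] = \alpha_0 E(X_i) + \sum_{j=1}^n \alpha_j E(X_j X_i).
\]
However, conditioning on Θ and using the conditional independence of $X_i$ and $X_{n+1}$ given Θ, we have
\[
E[\mu_{n+1}(\Theta)X_i] = E[E(X_{n+1}|\Theta)E(X_i|\Theta)] = E[E(X_{n+1}X_i|\Theta)] = E(X_{n+1}X_i).
\]
Hence,
\[
E(X_{n+1}X_i) = \alpha_0 E(X_i) + \sum_{j=1}^n \alpha_j E(X_j X_i). \tag{81.3}
\]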
Equations (81.2) and (81.4) are known as the normal equations. Solving
these n + 1 equations for $\hat{\alpha}_0, \hat{\alpha}_1, \cdots, \hat{\alpha}_n$ yields the credibility premium
\[
\hat{\alpha}_0 + \sum_{i=1}^n \hat{\alpha}_i X_i. \tag{81.5}
\]
Example 81.1
You are given:
(i) E(Xi ) = 2 and Var(Xi ) = 3 for i = 1, 2, · · · , 20.
(ii) Cov(Xi , Xj ) = 1.5 for all i ≠ j.
Determine the credibility premium.
Solution.
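The unbiasedness equation yields
\[
\hat{\alpha}_0 + 2\sum_{j=1}^{20} \hat{\alpha}_j = 2.
\]
This implies
\[
\sum_{j=1}^{20} \hat{\alpha}_j = 1 - \frac{\hat{\alpha}_0}{2}.
\]
For each i, the covariance equation (81.4) reads
\[
\sum_{j \neq i} \hat{\alpha}_j(1.5) + 3\hat{\alpha}_i = 1.5
\]
or equivalently
\[
\sum_{j=1}^{20} \hat{\alpha}_j(1.5) + 1.5\hat{\alpha}_i = 1.5.
\]
Thus $\hat{\alpha}_i = 1 - \sum_{j=1}^{20}\hat{\alpha}_j = \hat{\alpha}_0/2$ for every i, so that $\sum_{j=1}^{20}\hat{\alpha}_j = 10\hat{\alpha}_0$, or equivalently
\[
1 - \frac{\hat{\alpha}_0}{2} = 10\hat{\alpha}_0.
\]
Solving this equation for $\hat{\alpha}_0$ we find
\[
\hat{\alpha}_0 = \frac{2}{21}.
\]
Hence,
\[
\hat{\alpha}_i = \frac{1}{21}
\]
and the credibility premium is
\[
\hat{\alpha}_0 + \sum_{i=1}^{20} \hat{\alpha}_i X_i = \frac{2}{21}(1 + 10\overline{X}).
\]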
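The coefficients of Example 81.1 can be checked with exact arithmetic. The sketch below uses the closed forms stated in Problem 81.4, with ρ = Cov/Var = 1/2, and then verifies the unbiasedness equation and the covariance form of the normal equations; it assumes Cov(X_i, X_{n+1}) = 1.5 as in the example, and all variable names are illustrative.

```python
from fractions import Fraction

n = 20
mu = Fraction(2)          # E(X_i)
var = Fraction(3)         # Var(X_i)
cov = Fraction(3, 2)      # Cov(X_i, X_j), i != j
rho = cov / var           # correlation coefficient, 1/2

# Closed forms from Problem 81.4.
denom = 1 - rho + n * rho
alpha0 = (1 - rho) * mu / denom
alpha_i = rho / denom

# Unbiasedness equation: alpha0 + mu * sum(alpha_j) = mu.
unbiased_ok = alpha0 + mu * n * alpha_i == mu

# Covariance equation for a fixed i:
# alpha_i * Var(X_i) + sum_{j != i} alpha_j * Cov(X_i, X_j) = Cov(X_i, X_{n+1}).
cov_ok = alpha_i * var + (n - 1) * alpha_i * cov == cov

print(alpha0, alpha_i, unbiased_ok, cov_ok)
```

The exact fractions 2/21 and 1/21 agree with the solution above.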
Remark 81.1
It is easy to check that the values $\hat{\alpha}_0, \hat{\alpha}_1, \cdots, \hat{\alpha}_n$ also minimize
\[
Q_1 = E\left[\left(E(X_{n+1}|X) - \alpha_0 - \sum_{i=1}^n \alpha_i X_i\right)^2\right]
\]
and
\[
Q_2 = E\left[\left(X_{n+1} - \alpha_0 - \sum_{i=1}^n \alpha_i X_i\right)^2\right].
\]
That is, the credibility premium (81.5) is the best linear estimator of each of
the hypothetical mean $E(X_{n+1}|\Theta)$, the Bayesian premium $E(X_{n+1}|X)$, and
$X_{n+1}$.
Example 81.2 ‡
You are given the following information about a credibility model:
First observation (T)    Pr(X1 = T)    Bayesian estimate E(X2 |X1 = T )
1                        1/3           1.50
2                        1/3           1.50
3                        1/3           3.00
Determine the Bühlmann credibility estimate of the second observation,
given that the first observation is 1.
Solution.
Let X1 be the outcome of the first observation. By Problem 81.5, the
Bühlmann credibility estimate is of the form ZX1 + (1 − Z)µ where X1 = 1, 2,
or 3. By Remark 81.1, the Bühlmann estimate is the least squares approximation
to the Bayesian estimate. Thus, Z and µ are the minimizers of
\[
f(Z,\mu) = \frac{1}{3}[1.50 - Z - (1-Z)\mu]^2 + \frac{1}{3}[1.50 - 2Z - (1-Z)\mu]^2 + \frac{1}{3}[3.00 - 3Z - (1-Z)\mu]^2.
\]
Taking the derivative with respect to µ and setting it to zero, we find µ = 2.
Next, taking the derivative of f with respect to Z and setting it to zero, we
find
2(−Z + 0.5)(−1) + 2(0.5)(0) + 2(Z − 1)(1) = 0 =⇒ Z = 0.75.
Thus, the Bühlmann credibility estimate of the second observation, given
that the first observation is 1, is
ZX1 + (1 − Z)µ = 0.75(1) + (1 − 0.75)(2) = 1.25
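The same Z and µ can be recovered by treating the problem as a weighted least squares regression of the Bayesian estimates on the first observation, as Remark 81.1 suggests. The Python sketch below is an illustration only: the slope of the fitted line is Z and the intercept is (1 − Z)µ; the variable names are not from the text.

```python
from fractions import Fraction as F

x = [F(1), F(2), F(3)]                # possible first observations T
p = [F(1, 3), F(1, 3), F(1, 3)]       # Pr(X1 = T)
bayes = [F(3, 2), F(3, 2), F(3)]      # Bayesian estimates E(X2 | X1 = T)

mean_x = sum(pi * xi for pi, xi in zip(p, x))
mean_b = sum(pi * bi for pi, bi in zip(p, bayes))
var_x = sum(pi * (xi - mean_x) ** 2 for pi, xi in zip(p, x))
cov_xb = sum(pi * (xi - mean_x) * (bi - mean_b) for pi, xi, bi in zip(p, x, bayes))

# Least squares line: estimate = intercept + Z * X1, with intercept = (1 - Z) * mu.
Z = cov_xb / var_x
intercept = mean_b - Z * mean_x
mu = intercept / (1 - Z)

estimate_given_1 = Z * 1 + (1 - Z) * mu
print(Z, mu, estimate_given_1)
```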
Practice Problems
Problem 81.1
You are given:
(i) E(Xj ) = µ and Var(Xj ) = σ 2 for all j = 1, 2, · · · , n.
(ii) Cov(Xi , Xj ) = ρσ² for all i ≠ j, where ρ is the coefficient of correlation.
Use the unbiasedness equation to show that
\[
\sum_{j=1}^n \hat{\alpha}_j = 1 - \frac{\hat{\alpha}_0}{\mu}.
\]
Problem 81.2
With the assumptions of Problem 81.1, show that the n equations (81.4)
lead to
\[
\rho = \rho\sum_{j=1}^n \hat{\alpha}_j + \hat{\alpha}_i(1-\rho), \quad i = 1, 2, \cdots, n.
\]
Problem 81.3
With the assumptions of Problem 81.1, show that
\[
\hat{\alpha}_i = \frac{\rho\hat{\alpha}_0}{\mu(1-\rho)}.
\]
Problem 81.4
With the assumptions of Problem 81.1, show that
\[
\hat{\alpha}_0 = \frac{(1-\rho)\mu}{1-\rho+n\rho} \quad \text{and} \quad \hat{\alpha}_i = \frac{\rho}{1-\rho+n\rho}.
\]
Problem 81.5
With the assumptions of Problem 81.1, show that
\[
\hat{\alpha}_0 + \sum_{j=1}^n \hat{\alpha}_j X_j = (1-Z)\mu + Z\overline{X}
\]
where Z is to be determined.
82 THE BÜHLMANN MODEL WITH DISCRETE PRIOR 605
Example 82.1
Find the (a) mean, (b) variance and (c) covariance of Xi .
Solution.
(a) The mean is given by
where
\[
k = \frac{v}{a} = \frac{E[\text{Var}(X_j|\Theta)]}{\text{Var}[E(X_j|\Theta)]}.
\]
\[
BP = Z\overline{X} + (1-Z)\mu.
\]
Example 82.2
Drivers are classified as good (G), average (A), or bad (B).
• Good drivers make up 70% of the population and for a driver in this class,
the probability of having 0 claims in one year is 0.65, 1 claim is 0.25, and 2
claims is 0.10.
• Average drivers make up 20% of the population and for a driver in this
class, the probability of having 0 claims in one year is 0.40, 1 claim is 0.40,
and 2 claims is 0.20.
• Bad drivers make up 10% of the population and for a driver in this class,
the probability of having 0 claims in one year is 0.50, 1 claim is 0.30, and 2
claims is 0.20.
For a policyholder, the risk parameter is the classification of the individual
as G, A, or B. For a particular policyholder, it has been observed that x1 = 1
and x2 = 2.
Determine the Bühlmann premium.
Solution.
By Problem 79.4(a), we have
\[
\mu(G) = E(X_i|G) = 0.45, \quad \mu(A) = E(X_i|A) = 0.80, \quad \mu(B) = E(X_i|B) = 0.70
\]
\[
\pi(G) = 0.70, \quad \pi(A) = 0.20, \quad \pi(B) = 0.10.
\]
Hence,
\[
\begin{aligned}
\mu &= \sum_\theta \mu(\theta)\pi(\theta) = 0.45(0.70) + 0.80(0.20) + 0.70(0.10) = 0.545 \\
E[\mu^2(\Theta)] &= \sum_\theta \mu(\theta)^2\pi(\theta) = 0.45^2(0.70) + 0.80^2(0.20) + 0.70^2(0.10) = 0.31875 \\
a &= E[\mu^2(\Theta)] - \mu^2 = 0.31875 - 0.545^2 = 0.021725.
\end{aligned}
\]
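The remaining quantities for this example (the expected process variance v, then k, Z, and the premium) follow from the formulas k = v/a and BP = ZX̄ + (1 − Z)µ given earlier in this section, with Z = n/(n + k). The Python sketch below carries out that arithmetic for the stated data; it is a numerical illustration under those formulas, not a transcription of the text's own solution, and the dictionary layout is an assumption of the sketch.

```python
# Class weight and claim-count distribution (0, 1, 2 claims) for each driver class.
classes = {
    "G": (0.70, [0.65, 0.25, 0.10]),
    "A": (0.20, [0.40, 0.40, 0.20]),
    "B": (0.10, [0.50, 0.30, 0.20]),
}

def mean_and_variance(probs):
    """Conditional mean and variance of the claim count for one class."""
    mean = sum(count * p for count, p in enumerate(probs))
    second = sum(count ** 2 * p for count, p in enumerate(probs))
    return mean, second - mean ** 2

mu = a_raw = v = 0.0
for weight, probs in classes.values():
    m, var = mean_and_variance(probs)
    mu += weight * m            # overall mean
    a_raw += weight * m ** 2    # E[mu(Theta)^2]
    v += weight * var           # expected process variance
a = a_raw - mu ** 2             # variance of the hypothetical means

observations = [1, 2]           # x1 = 1, x2 = 2
n = len(observations)
xbar = sum(observations) / n
k = v / a
Z = n / (n + k)
premium = Z * xbar + (1 - Z) * mu
print(round(v, 5), round(k, 3), round(Z, 4), round(premium, 4))
```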
Example 82.3 ‡
You are given the following information on claim frequency of automobile
accidents for individual drivers:
Solution.
Let Θ be risk parameter with designations BR= Business Rural, BU= busi-
ness urban, PR = Pleasure Rural, and PU = Pleasure Urban. By double
expectation, we have
Likewise,
2.3 = 1.5Pr(R) + 2.5Pr(U ).
Solving these two equations, we find Pr(R) = 0.2 and Pr(U ) = 0.8.
The prior distribution is
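Thus,
\[
\mu = \sum_\theta \mu(\theta)\pi(\theta) = (0.10)(1.0) + (0.40)(2.0) + (0.10)(1.5) + (0.40)(2.5) = 2.05
\]
and
\[
\begin{aligned}
a &= \sum_\theta \mu(\theta)^2\pi(\theta) - \mu^2 \\
&= (0.10)(1.0)^2 + (0.40)(2.0)^2 + (0.10)(1.5)^2 + (0.40)(2.5)^2 - 2.05^2 = 0.2225.
\end{aligned}
\]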
v(BR) =0.5
v(BU ) =1.0
v(P R) =0.8
v(P U ) =1.0.
Hence,
\[
v = \sum_\theta v(\theta)\pi(\theta) = 0.5(0.10) + 1.0(0.40) + 0.8(0.10) + 1.0(0.40) = 0.93.
\]
It follows that
\[
k = \frac{v}{a} = \frac{0.93}{0.2225} = 4.18
\]
and the credibility factor is
\[
Z = \frac{n}{n+k} = \frac{1}{1+4.18} = 0.193
\]
Example 82.4 ‡
For a particular policy, the conditional probability of the annual number of
claims given Θ = θ, and the probability distribution of Θ are as follows:
Number of claims    0       1      2
Probability         2θ      θ      1 − 3θ

θ              0.05    0.30
Probability    0.80    0.20
Two claims are observed in Year 1. Calculate the Bühlmann credibility
estimate of the number of claims in Year 2.
Solution.
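We have
\[
\begin{aligned}
E(\Theta) &= 0.05(0.80) + 0.30(0.20) = 0.1 \\
E(\Theta^2) &= 0.05^2(0.80) + 0.30^2(0.20) = 0.02 \\
\mu(\theta) &= E(N|\Theta) = 0(2\theta) + 1(\theta) + 2(1-3\theta) = 2 - 5\theta \\
\mu &= E(2 - 5\Theta) = 2 - 5E(\Theta) = 2 - 5(0.1) = 1.5 \\
a &= \text{Var}(2 - 5\Theta) = 25\text{Var}(\Theta) = 25(0.02 - 0.1^2) = 0.25 \\
v(\theta) &= \text{Var}(N|\Theta) = 0^2(2\theta) + 1^2(\theta) + 2^2(1-3\theta) - (2-5\theta)^2 = 9\theta - 25\theta^2 \\
v &= E(9\Theta - 25\Theta^2) = 9(0.1) - 25(0.02) = 0.4 \\
k &= \frac{v}{a} = \frac{0.4}{0.25} = 1.6 \\
Z &= \frac{1}{1+k} = \frac{5}{13}.
\end{aligned}
\]
The required estimate is
\[
2Z + (1-Z)\mu = 2\left(\frac{5}{13}\right) + \left(1 - \frac{5}{13}\right)(1.5) = 1.6923
\]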
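The arithmetic of Example 82.4 can be reproduced exactly with rational numbers. The following Python sketch recomputes µ, a, v, k, Z, and the estimate directly from the two tables; the helper names are illustrative.

```python
from fractions import Fraction as F

# Prior distribution of Theta: value -> probability.
prior = {F(1, 20): F(4, 5), F(3, 10): F(1, 5)}

def cond_mean(t):
    """E(N | Theta = t) with Pr(0) = 2t, Pr(1) = t, Pr(2) = 1 - 3t."""
    return 0 * (2 * t) + 1 * t + 2 * (1 - 3 * t)

def cond_var(t):
    """Var(N | Theta = t)."""
    second_moment = 0 * (2 * t) + 1 * t + 4 * (1 - 3 * t)
    return second_moment - cond_mean(t) ** 2

mu = sum(p * cond_mean(t) for t, p in prior.items())
a = sum(p * cond_mean(t) ** 2 for t, p in prior.items()) - mu ** 2
v = sum(p * cond_var(t) for t, p in prior.items())

k = v / a
Z = 1 / (1 + k)                     # one year of experience
estimate = Z * 2 + (1 - Z) * mu     # two claims observed in Year 1
print(mu, a, v, k, Z, float(estimate))
```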
Example 82.5 ‡
For a group of policies, you are given:
(i) The annual loss on an individual policy follows a gamma distribution
with parameters α = 4 and θ.
(ii) The prior distribution of θ has mean 600.
(iii) A randomly selected policy had losses of 1400 in Year 1 and 1900 in
Year 2.
(iv) Loss data for Year 3 was misfiled and unavailable.
(v) Based on the data in (iii), the Bühlmann credibility estimate of the loss
on the selected policy in Year 4 is 1800.
(vi) After the estimate in (v) was calculated, the data for Year 3 was located.
The loss on the selected policy in Year 3 was 2763.
Calculate the Bühlmann credibility estimate of the loss on the selected policy
in Year 4 based on the data for Years 1, 2 and 3.
Solution.
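Let the annual loss be denoted by X. Then X|Θ has a Gamma distribution
with parameters α = 4 and θ. We have
\[
\begin{aligned}
\mu(\theta) &= E(X|\Theta) = \alpha\theta = 4\theta \\
v(\theta) &= \text{Var}(X|\Theta) = \alpha\theta^2 = 4\theta^2 \\
\mu &= E[\mu(\Theta)] = 4E(\Theta) = 4(600) = 2400 \\
v &= E[v(\Theta)] = 4E(\Theta^2) \\
a &= \text{Var}[\mu(\Theta)] = 16\text{Var}(\Theta).
\end{aligned}
\]
The sample mean of the Year 1 and Year 2 losses is
\[
\overline{X} = \frac{1400 + 1900}{2} = 1650.
\]
Based on (v), we have
\[
1650Z + (1-Z)(2400) = 1800 \Longrightarrow Z = 0.8 = \frac{2}{2+k} \Longrightarrow k = 0.5.
\]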
By (vi), we have
\[
\overline{X} = \frac{1400 + 1900 + 2763}{3} = 2021
\]
and
\[
Z = \frac{3}{3 + 0.5} = \frac{6}{7}.
\]
Thus, the Bühlmann credibility estimate of the loss on the selected policy
in Year 4 based on the data for Years 1, 2 and 3 is
\[
\frac{6}{7}(2021) + \left(1 - \frac{6}{7}\right)(2400) = 2075.14
\]
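The two-stage reasoning of Example 82.5 (back out k from the two-year estimate, then reuse it with three years of data) can be sketched in Python as follows. Exact fractions are used so that the intermediate values match the solution; variable names are illustrative.

```python
from fractions import Fraction as F

mu = F(4 * 600)                    # overall mean, 4 * E(Theta)

# Stage 1: two years of data and the known estimate of 1800 determine Z and k.
xbar2 = F(1400 + 1900, 2)
estimate2 = F(1800)
Z2 = (estimate2 - mu) / (xbar2 - mu)   # from Z*xbar + (1 - Z)*mu = estimate
k = 2 * (1 - Z2) / Z2                  # from Z = n / (n + k) with n = 2

# Stage 2: three years of data with the same k.
xbar3 = F(1400 + 1900 + 2763, 3)
Z3 = F(3) / (3 + k)
estimate3 = Z3 * xbar3 + (1 - Z3) * mu
print(Z2, k, Z3, round(float(estimate3), 2))
```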
Example 82.6 ‡
You are given:
(i) The annual number of claims for an individual risk follows a Poisson
distribution with mean λ.
(ii) For 75% of the risks, λ = 1.
(iii) For 25% of the risks, λ = 3.
A randomly selected risk had r claims in Year 1. The Bayesian estimate of
this risk's expected number of claims in Year 2 is 2.98.
Determine the Bühlmann credibility estimate of the expected number of
claims for this risk in Year 2.
Solution.
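Let X be the annual number of claims and Λ be the prior parameter. Then
X|Λ has a Poisson distribution with mean λ. The prior distribution is:
π(1) = 0.75 and π(3) = 0.25.
The posterior distribution is
\[
\begin{aligned}
\pi(1|r) &= \frac{f(r|1)\pi(1)}{f(r|1)\pi(1) + f(r|3)\pi(3)} = \frac{\frac{e^{-1}}{r!}(0.75)}{\frac{e^{-1}}{r!}(0.75) + \frac{e^{-3}3^r}{r!}(0.25)} \\
&= \frac{0.2759}{0.2759 + 3^r(0.01245)} \\
\pi(3|r) &= 1 - \frac{0.2759}{0.2759 + 3^r(0.01245)} = \frac{3^r(0.01245)}{0.2759 + 3^r(0.01245)}.
\end{aligned}
\]
Setting the Bayesian estimate $1\cdot\pi(1|r) + 3\cdot\pi(3|r)$ equal to 2.98 gives r = 7.
For the Bühlmann estimate, we have
\[
\begin{aligned}
\mu(\lambda) &= E(X|\Lambda) = \lambda \\
v(\lambda) &= \text{Var}(X|\Lambda) = \lambda \\
\mu &= E(\Lambda) = 0.75(1) + 0.25(3) = 1.5 \\
v &= E(\Lambda) = 1.5 \\
a &= \text{Var}(\Lambda) = 0.75(1^2) + 0.25(3^2) - 1.5^2 = 0.75 \\
k &= \frac{v}{a} = \frac{1.5}{0.75} = 2 \\
Z &= \frac{1}{1+2} = \frac{1}{3}.
\end{aligned}
\]
The Bühlmann credibility estimate of the expected number of claims for this
risk in Year 2 is
\[
\frac{1}{3}(7) + \frac{2}{3}(1.5) = 3.33
\]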
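The value of r in Example 82.6 is the one for which the Bayesian estimate matches 2.98. The Python sketch below searches over small integer values of r using the two-point prior, then computes the Bühlmann estimate; the search range and the matching tolerance of 0.005 are assumptions of the sketch, chosen because the target is quoted to two decimals.

```python
import math

prior = {1: 0.75, 3: 0.25}   # lambda -> prior probability

def bayes_estimate(r):
    """Posterior mean of lambda after observing r claims in one year."""
    weights = {lam: p * math.exp(-lam) * lam ** r / math.factorial(r)
               for lam, p in prior.items()}
    total = sum(weights.values())
    return sum(lam * w for lam, w in weights.items()) / total

# Find the claim count whose Bayesian estimate rounds to the quoted 2.98.
r = next(c for c in range(50) if abs(bayes_estimate(c) - 2.98) < 0.005)

mu = sum(lam * p for lam, p in prior.items())              # also equals v for Poisson
v = mu
a = sum(lam ** 2 * p for lam, p in prior.items()) - mu ** 2
k = v / a
Z = 1 / (1 + k)
buhlmann = Z * r + (1 - Z) * mu
print(r, round(bayes_estimate(r), 4), round(buhlmann, 2))
```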
Example 82.7 ‡
You are given:
(i) The claim count and claim size distributions for risks of type A are:
(ii) The claim count and claim size distributions for risks of type B are:
(iv) Claim counts and claim sizes are independent within each risk type.
(v) The variance of the total losses is 296,962.
A randomly selected risk is observed to have total annual losses of 500.
Determine the Bühlmann premium for the next year for this same risk.
Solution.
Let L denote annual losses. The prior parameter represents the type of
risk so that either Θ = A or Θ = B. The prior distribution is Pr(A) =
Pr(B) = 0.5. For each of the two classes, total annual loss L has a compound
distribution.
We want
\[
Z\overline{X} + (1-Z)\mu.
\]
Since there is a single observation (n = 1), the sample mean is $\overline{X} = 500$.
We have
Practice Problems
Problem 82.1 ‡
You are given:
(i) Two risks have the following severity distributions:
Amount of claim    Probability of claim amount for Risk 1    Probability of claim amount for Risk 2
250                0.5                                       0.7
2500               0.3                                       0.2
60000              0.2                                       0.1
(ii) Risk 1 is twice as likely to be observed as Risk 2.
A claim of 250 is observed.
Determine the Bühlmann credibility estimate of the second claim amount
from the same risk.
Problem 82.2 ‡
You are given the following joint distribution:
         Θ
X     0      1
0     0.4    0.1
1     0.1    0.2
2     0.1    0.1
For a given value of Θ and a sample of size 10 for X: $\sum_{i=1}^{10} x_i = 10$.
Determine the Bühlmann credibility premium.
Problem 82.3 ‡
An insurer writes a large book of home warranty policies. You are given
the following information regarding claims filed by insureds against these
policies:
(i) A maximum of one claim may be filed per year.
(ii) The probability of a claim varies by insured, and the claims experience
for each insured is independent of every other insured.
(iii) The probability of a claim for each insured remains constant over time.
(iv) The overall probability of a claim being filed by a randomly selected
insured in a year is 0.10.
(v) The variance of the individual insured claim probabilities is 0.01.
An insured selected at random is found to have filed 0 claims over the past
10 years.
Problem 82.4 ‡
You are given:
(i) Claim size, X, has mean µ and variance 500.
(ii) The random variable µ has a mean of 1000 and variance of 50.
(iii) The following three claims were observed: 750, 1075, 2000
Calculate the expected size of the next claim using Bühlmann credibility.
Problem 82.5 ‡
You are given:
(i) A portfolio of independent risks is divided into two classes.
(ii) Each class contains the same number of risks.
(iii) For each risk in Class 1, the number of claims per year follows a Poisson
distribution with mean 5.
(iv) For each risk in Class 2, the number of claims per year follows a binomial
distribution with m = 8 and q = 0.55.
(v) A randomly selected risk has three claims in Year 1, r claims in Year 2
and four claims in Year 3.
The Bühlmann credibility estimate for the number of claims in Year 4 for
this risk is 4.6019. Determine r.
Problem 82.6 ‡
For a portfolio of independent risks, the number of claims for each risk in a
year follows a Poisson distribution with means given in the following table:
Class    Mean Number of Claims per Risk    Number of Risks
1        1                                 900
2        10                                90
3        20                                10
Problem 82.7 ‡
An insurance company sells two types of policies with the following charac-
teristics:
Problem 82.8 ‡
You are given:
(i) Losses in a given year follow a gamma distribution with parameters α
and θ, where θ does not vary by policyholder.
(ii) The prior distribution of α has mean 50.
(iii) The Bühlmann credibility factor based on two years of experience is
0.25.
Calculate Var(α).
Problem 82.9 ‡
For a portfolio of independent risks, you are given:
(i) The risks are divided into two classes, Class A and Class B.
(ii) Equal numbers of risks are in Class A and Class B.
(iii) For each risk, the probability of having exactly 1 claim during the year
is 20% and the probability of having 0 claims is 80%.
(iv) All claims for Class A are of size 2.
(v) All claims for Class B are of size c, an unknown but fixed quantity.
One risk is chosen at random, and the total loss for one year for that risk is
observed. You wish to estimate the expected loss for that same risk in the
following year.
Determine the limit of the Bühlmann credibility factor as c goes to infinity.
Problem 82.10 ‡
An insurance company writes a book of business that contains several classes
of policyholders. You are given:
(i) The average claim frequency for a policyholder over the entire book is
0.425.
(ii) The variance of the hypothetical means is 0.370.
(iii) The expected value of the process variance is 1.793.
One class of policyholders is selected at random from the book. Nine poli-
cyholders are selected at random from this class and are observed to have
produced a total of seven claims. Five additional policyholders are selected
Solution.
Let N be the Poisson claim count variable, let X be the claim size variable,
and let S be the aggregate loss variable. Note that S|Θ has a compound Poisson
distribution with primary distribution N |Θ and secondary distribution
X|Θ.
The hypothetical mean is
Hence,
\[
k = \frac{v}{a} = \frac{500}{222.2222} = 2.25
\]
Example 83.3 ‡
You are given:
(i) The number of claims made by an individual insured in a year has a
Poisson distribution with mean λ.
(ii) The prior distribution for Λ is Gamma with parameters α = 1 and
θ = 1.2.
Three claims are observed in Year 1, and no claims are observed in Year 2.
Using Bühlmann credibility, estimate the number of claims in Year 3.
Solution.
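The hypothetical mean and the process variance are
\[
\begin{aligned}
\mu(\lambda) &= E(X|\Lambda) = \lambda \\
v(\lambda) &= \text{Var}(X|\Lambda) = \lambda
\end{aligned}
\]
Hence,
\[
k = \frac{v}{a} = \frac{1.2}{1.44} = \frac{1}{1.2}
\]
and the credibility factor is
\[
Z = \frac{n}{n+k} = \frac{2}{2 + \frac{1}{1.2}} = 0.706.
\]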
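To complete the arithmetic of Example 83.3, the sketch below plugs the Gamma prior moments (mean αθ and variance αθ²) into Z = n/(n + k) and then forms ZX̄ + (1 − Z)µ with the two observed years (3 claims and 0 claims). This is a numerical illustration under the formulas already used in this section; the variable names are illustrative.

```python
alpha, theta = 1.0, 1.2          # Gamma prior parameters

mu = alpha * theta               # E(Lambda), also the expected process variance v
v = alpha * theta
a = alpha * theta ** 2           # Var(Lambda)

claims = [3, 0]                  # Year 1 and Year 2
n = len(claims)
xbar = sum(claims) / n

k = v / a
Z = n / (n + k)
estimate = Z * xbar + (1 - Z) * mu
print(round(k, 4), round(Z, 3), round(estimate, 4))
```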
Example 83.4 ‡
You are given:
(i) The number of claims in a year for a selected risk follows a Poisson dis-
tribution with mean λ.
(ii) The severity of claims for the selected risk follows an exponential distri-
bution with mean θ.
(iii) The number of claims is independent of the severity of claims.
(iv) The prior distribution of λ is exponential with mean 1.
(v) The prior distribution of θ is Poisson with mean 1.
(vi) A priori, λ and θ are independent.
Using Bühlmann’s credibility for aggregate losses, determine k.
83 THE BÜHLMANN MODEL WITH CONTINUOUS PRIOR 621
Solution.
We have
Solution.
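Note that the prior distribution is a Pareto distribution with parameters α
and 1. We have
\[
\begin{aligned}
\mu(\beta) &= E(X|\beta) = \beta \\
\mu &= E(\beta) = \frac{1}{\alpha-1} \\
a &= \text{Var}(\beta) = \frac{2}{(\alpha-1)(\alpha-2)} - \frac{1}{(\alpha-1)^2} = \frac{\alpha}{(\alpha-1)^2(\alpha-2)} \\
v(\beta) &= \text{Var}(X|\beta) = \beta(\beta+1) \\
v &= E[\beta(\beta+1)] = \frac{1}{\alpha-1} + \frac{2}{(\alpha-1)(\alpha-2)} = \frac{\alpha}{(\alpha-1)(\alpha-2)} \\
k &= \frac{v}{a} = \alpha - 1 \\
Z &= \frac{1}{1+k} = \frac{1}{\alpha}.
\end{aligned}
\]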
Solution.
If X is the number of claims in Year 1, then the Bühlmann estimate is
ZX + (1 − Z)µ, which is a linear function of X. This implies that (E) cannot
be the answer. The Bayes estimate is given by
\[
E(X_2|X_1 = n) = \int_1^4 \lambda\,\pi(\lambda|X_1 = n)\,d\lambda
\]
which lies between 1 and 4. The graph (B) cannot be the answer since E(X2 |X1 = 8) and E(X2 |X1 = 9)
are greater than 4. Likewise, the graph (D) cannot be the answer since
E(X2 |X1 = 0) and E(X2 |X1 = 1) are less than 1. Now, by Remark 81.1, the
Bühlmann estimates are the linear least squares approximation to the Bayes
estimates. We see from graph (C) that the Bayes estimates are consistently
higher than the Bühlmann estimates and so it cannot be the answer. Hence,
(A) is the most appropriate answer for the problem.
Practice Problems
Problem 83.1
Let X1 , X2 , · · · , Xn be past claim amounts. Suppose that Xi |Θ are inde-
pendent and identically Poisson distributed with mean Θ and Θ is Gamma
distributed with parameters α and β. Determine the Bühlmann premium.
Problem 83.2 ‡
You are given:
(i) Annual claim frequency for an individual policyholder has mean λ and
variance σ 2 .
(ii) The prior distribution for λ is uniform on the interval [0.5, 1.5].
(iii) The prior distribution for σ 2 is exponential with mean 1.25.
A policyholder is selected at random and observed to have no claims in Year
1.
Using Bühlmann credibility, estimate the number of claims in Year 2 for the
selected policyholder.
Problem 83.3 ‡
You are given the following information about a book of business comprised
of 100 insureds:
(i) $X_i = \sum_{j=1}^{N_i} Y_{ij}$ is a random variable representing the annual loss of the
ith insured.
(ii) N1 , N2 , · · · , N100 are independent random variables distributed accord-
ing to a negative binomial distribution with parameters r (unknown) and
β = 0.2.
(iii) Unknown parameter r has an exponential distribution with mean 2.
(iv) Yij are independent random variables distributed according to a Pareto
distribution with α = 3.0 and θ = 1000.
Determine the Bühlmann credibility factor, Z, for the book of business.
Problem 83.4 ‡
You are given:
(i) The annual number of claims for an insured has probability function:
\[
p(x) = \binom{3}{x} q^x (1-q)^{3-x}, \quad x = 0, 1, 2, 3.
\]
(ii) The prior density is π(q) = 2q, 0 < q < 1.
A randomly chosen insured has zero claims in Year 1.
Using Bühlmann credibility, estimate the number of claims in Year 2 for the
selected insured.
Problem 83.5 ‡
You are given:
(i) Claims are conditionally independent and identically Poisson distributed
with mean Θ.
(ii) The prior distribution function of Θ is:
\[
\pi(\theta) = 1 - \left(\frac{1}{1+\theta}\right)^{2.6}, \quad \theta > 0.
\]
Five claims are observed. Determine the Bühlmann credibility factor.
Problem 83.6 ‡
You are given:
(i) Claim counts follow a Poisson distribution with mean λ.
(ii) Claim sizes follow a lognormal distribution with parameters µ and σ.
(iii) Claim counts and claim sizes are independent.
(iv) The prior distribution has joint probability density function:
Problem 83.7
For a portfolio of policies, you are given:
(i) The annual claim amount on a policy has probability density function:
\[
f(x|\theta) = \frac{2x}{\theta^2}, \quad 0 < x < \theta.
\]
(ii) The prior distribution of θ has density function:
Problem 83.8 ‡
You are given the following information about workers compensation cover-
age:
(i) The number of claims for an employee during the year follows a Poisson
distribution with mean (100 − p)/100, where p is the salary (in thousands)
for the employee.
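which leads to
\[
\hat{\alpha}_0 = \frac{v/a}{m + v/a}\,\mu
\]
and
\[
\hat{\alpha}_i = \frac{m_i}{m + v/a}.
\]
Letting $k = \frac{v}{a}$, the credibility premium can be expressed as
\[
\hat{\alpha}_0 + \sum_{j=1}^n \hat{\alpha}_j X_j = Z\overline{X} + (1-Z)\mu
\]
where
\[
Z = \frac{m}{m+k} \quad \text{and} \quad \overline{X} = \frac{m_1X_1 + m_2X_2 + \cdots + m_nX_n}{m}.
\]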
If Xi is interpreted to be the average loss/claims experienced by the mi
group members in year i, then mi Xi is the total loss/claims of the mi group
members in year i. Also, X is the overall loss/claims per group member over
the n years. The credibility premium to be charged to the group in year
n + 1 is mn+1 [ZX + (1 − Z)µ] for the mn+1 members in the next year. Keep
in mind that ZX + (1 − Z)µ is the credibility premium per exposure unit
(i.e., per occurrence of an individual Xij ).
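The Bühlmann-Straub premium per exposure unit described above can be sketched as a small Python function. The function name, argument names, and the sample figures in the usage lines are hypothetical; only the formulas Z = m/(m + k), k = v/a, and the exposure-weighted mean come from the text.

```python
def buhlmann_straub(exposures, averages, mu, v, a):
    """Return (Z, premium per exposure unit) for one group.

    exposures: m_1, ..., m_n (exposure units in each year)
    averages:  X_1, ..., X_n (average loss per exposure unit in each year)
    mu, v, a:  overall mean, expected process variance, variance of hypothetical means
    """
    m = sum(exposures)
    xbar = sum(mi * xi for mi, xi in zip(exposures, averages)) / m
    k = v / a
    z = m / (m + k)
    return z, z * xbar + (1 - z) * mu

# Hypothetical illustration: two years with 2 and 3 exposure units.
z, per_unit = buhlmann_straub([2, 3], [1.0, 2.0], mu=1.0, v=10.0, a=2.0)
next_year_exposure = 4                       # hypothetical m_{n+1}
total_premium = next_year_exposure * per_unit
print(z, per_unit, total_premium)
```

The last line multiplies the per-unit premium by next year's exposure, matching the remark that the group is charged m_{n+1}[ZX̄ + (1 − Z)µ].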
84 THE BÜHLMANN-STRAUB CREDIBILITY MODEL 629
Remark 84.1
If mi = 1 for i = 1, 2, · · · , n then the Bühlmann-Straub model coincides
with the original Bühlmann model.
Example 84.1
You are given:
(i) In year j, there are Nj claims for mj policies.
(ii) An individual policy has a Poisson distribution with mean Λ.
(iii) Λ has a Gamma distribution with parameters α and β.
(a) Determine the Bühlmann-Straub estimate of the number of claims for
one policyholder in year n + 1.
(b) Determine the Bühlmann-Straub estimate of the number of claims in
year n + 1 if there will be mn+1 policies.
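Solution.
We let $X_i = \frac{N_i}{m_i}$. Because $N_i|\Lambda$ has a Poisson distribution with mean $m_i\lambda$,
$X_i|\Lambda$ has mean λ. Moreover,
\[
\text{Var}(X_i|\Lambda) = \frac{1}{m_i^2}\text{Var}(N_i|\Lambda) = \frac{m_i\lambda}{m_i^2} = \frac{v(\lambda)}{m_i} \Longrightarrow v(\lambda) = \lambda.
\]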
We have
(a) The Bühlmann-Straub estimate of the number of claims for one policyholder
in year n + 1 is
\[
P_c = \frac{m\beta}{m\beta+1}\,\overline{X} + \frac{1}{m\beta+1}(\alpha\beta)
\]
where $\overline{X} = \frac{1}{m}\sum_{i=1}^n m_i X_i$.
(b) The Bühlmann-Straub estimate of the number of claims if there are
$m_{n+1}$ policies in year n + 1 is $m_{n+1}P_c$.
Example 84.2 ‡
You are given:
(i) The number of claims incurred in a month by any insured has a Poisson
distribution with mean λ.
(ii) The claim frequencies of different insureds are independent.
(iii) The prior distribution is Gamma with probability density function:
\[
\pi(\lambda) = \frac{(100\lambda)^6 e^{-100\lambda}}{120\lambda}.
\]
(iv)
Month    Number of Insureds    Number of Claims
1        100                   6
2        150                   8
3        200                   11
4        300                   ?
Determine the Bühlmann-Straub credibility estimate of the number of claims
in Month 4.
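Solution.
Let $X_j = \frac{N_j}{m_j}$. Note that
\[
\text{Var}(X_i|\Lambda) = \frac{1}{m_i^2}\text{Var}(N_i|\Lambda) = \frac{m_i\lambda}{m_i^2} = \frac{v(\lambda)}{m_i} \Longrightarrow v(\lambda) = \lambda.
\]
We have
\[
\begin{aligned}
\mu(\lambda) &= E(X_i|\Lambda) = \lambda \\
\mu &= E[\mu(\Lambda)] = E(\Lambda) = \alpha\beta = 0.06 \\
a &= \text{Var}[\mu(\Lambda)] = \text{Var}(\Lambda) = \alpha\beta^2 = 0.0006 \\
v &= E[v(\Lambda)] = E(\Lambda) = 0.06 \\
k &= \frac{v}{a} = \frac{0.06}{0.0006} = 100 \\
Z &= \frac{m}{m+k} = \frac{450}{450+100} = \frac{9}{11} \\
\overline{X} &= \frac{N_1 + N_2 + N_3}{m} = \frac{6 + 8 + 11}{450} = \frac{25}{450}.
\end{aligned}
\]
The credibility estimate of the expected number of claims for one insured in
month 4 is
\[
P_c = Z\overline{X} + (1-Z)\mu = \frac{9}{11}\left(\frac{25}{450}\right) + \frac{2}{11}(0.06) = 0.056364.
\]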
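Since the question asks for the number of claims in Month 4, the per-insured estimate is scaled by the 300 insureds of that month, following part (b) of Example 84.1. The Python sketch below reproduces the calculation with exact fractions; it reads the prior as Gamma with α = 6 and β = 1/100, which is what the density in (iii) implies, and the variable names are illustrative.

```python
from fractions import Fraction as F

alpha, beta = F(6), F(1, 100)      # Gamma prior read off from the density in (iii)
insureds = [100, 150, 200]         # months 1 to 3
claims = [6, 8, 11]

mu = alpha * beta                  # E(Lambda); also v, since v(lambda) = lambda
v = mu
a = alpha * beta ** 2              # Var(Lambda)

m = sum(insureds)
xbar = F(sum(claims), m)
k = v / a
Z = m / (m + k)
per_insured = Z * xbar + (1 - Z) * mu

month4_insureds = 300
month4_claims = month4_insureds * per_insured
print(k, Z, float(per_insured), float(month4_claims))
```

Under these assumptions the Month 4 estimate is about 16.9 claims.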
Example 84.3 ‡
For a portfolio of insurance risks, aggregate losses per year per exposure
follow a normal distribution with mean θ and standard deviation 1000, with
θ varying by class as follows:
A randomly selected risk has the following experience over three years:
Solution.
Let Xi be the aggregate losses per exposure in year i. That is, the average
over all exposures of the total losses in year i. Thus,
\[
\begin{aligned}
X_1 &= \frac{24,000}{24} = 1,000 \\
X_2 &= \frac{36,000}{30} = 1,200 \\
X_3 &= \frac{28,000}{26} = \frac{14,000}{13}.
\end{aligned}
\]
We are given: m1 = 24, m2 = 30, m3 = 26. The risk parameter Θ is the
mean of the normal distribution. The prior distribution is
π(2000) =0.6
π(3000) =0.3
π(4000) =0.1.
Hence, the Bühlmann-Straub credibility estimate for the loss per exposure
in Year 4 is
Practice Problems
Problem 84.1
Let X1 , X2 , · · · , Xn be losses in the past n years. Suppose that they all have
the same risk parameter θ. We assume that Xi |Θ are independent with mean
$\mu(\theta) = E(X_i|\Theta)$ and variance $\text{Var}(X_i|\Theta) = w(\theta) + \frac{v(\theta)}{m_i}$. Such a credibility
model is known as Hewitt's model.
(a) Show that
\[
\text{Var}\left(\left.\frac{m_iX_i + m_jX_j}{m_i + m_j}\,\right|\,\Theta\right) = \frac{m_i^2 + m_j^2}{(m_i + m_j)^2}\,w(\theta) + \frac{v(\theta)}{m_i + m_j}.
\]
Problem 84.2
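Consider Hewitt's model introduced above. Let $\hat{\alpha}_0 + \sum_{j=1}^n \hat{\alpha}_j X_j$ be the
credibility premium. Using the normal equations, show that
\[
\hat{\alpha}_i = \frac{a\hat{\alpha}_0/\mu}{w + v/m_i} \quad \text{and} \quad \hat{\alpha}_0 = \frac{\mu}{1 + am^*}
\]
where
\[
m^* = \sum_{j=1}^n \frac{m_j}{v + wm_j}.
\]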
Problem 84.3
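Consider Hewitt's model introduced above. Show that
\[
\hat{\alpha}_0 + \sum_{j=1}^n \hat{\alpha}_j X_j = Z\overline{X} + (1-Z)\mu
\]
where
\[
Z = \frac{am^*}{1 + am^*} \quad \text{and} \quad \overline{X} = \frac{1}{m^*}\sum_{j=1}^n \frac{m_j}{v + wm_j}\,X_j.
\]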
Problem 84.4 ‡
You are given four classes of insureds, each of whom may have zero or one
claim, with the following probabilities:
Problem 84.5 ‡
You are given the following data on large business policyholders:
(i) Losses for each employee of a given policyholder are independent and
have a common mean and variance.
(ii) The overall average loss per employee for all policyholders is 20.
(iii) The variance of the hypothetical means is 40.
(iv) The expected value of the process variance is 8000.
(v) The following experience is observed for a randomly selected policy-
holder:
Year    Average Loss per Employee    Number of Employees
1       15                           800
2       10                           600
3       5                            400
Determine the Bühlmann-Straub credibility premium per employee for this
policyholder.
Problem 84.6 ‡
Members of three classes of insureds can have 0, 1 or 2 claims, with the
following probabilities:
         Number of Claims
Class    0      1      2
I        0.9    0.0    0.1
II       0.8    0.1    0.1
III      0.7    0.2    0.1
A class is chosen at random, and varying numbers of insureds from that
class are observed over 2 years, as shown below:
Problem 84.7 ‡
You are given:
(i) The number of claims incurred in a month by any insured follows a
Poisson distribution with mean λ.
(ii) The claim frequencies of different insureds are independent.
(iii) The prior distribution of Λ is Weibull with θ = 0.1 and τ = 2.
(iv) Some values of the gamma function are:
(v)
Problem 84.8 ‡
For each policyholder, losses X1 , · · · , Xn , conditional on Θ, are indepen-
dently and identically distributed with mean,
and variance,
v(θ) = Var(Xj |Θ = θ), j = 1, 2, · · · , n.
You are given:
(i) The Bühlmann credibility assigned for estimating X5 based on X1 , · · · , X4
is Z = 0.4.
(ii) The expected value of the process variance is known to be 8.
Calculate Cov(Xi , Xj ), i ≠ j.
Problem 84.9 ‡
You are given n years of claim data originating from a large number of
policies. You are asked to use the Bühlmann-Straub credibility model to
estimate the expected number of claims in year n + 1.
Which of conditions (A), (B), or (C) are required by the model?
(A) All policies must have an equal number of exposure units.
(B) Each policy must have a Poisson claim distribution.
(C) There must be at least 1082 exposure units.
(D) Each of (A), (B), and (C) is required.
(E) None of (A), (B), or (C) is required.
Problem 84.10 ‡
You are given the following information about a single risk:
(i) The risk has m exposures in each year.
(ii) The risk is observed for n years.
(iii) The variance of the hypothetical means is a.
(iv) The expected value of the annual process variance is $w + \frac{v}{m}$.
Determine the limit of the Bühlmann-Straub credibility factor as m approaches
infinity.
85 Exact Credibility
The term exact credibility refers to the case when the credibility premium
is equal to the Bayesian premium. Exact credibility arises in Bühlmann and
Bühlmann-Straub models specifically in situations involving the linear
exponential family members and their conjugate priors (see Sections 24 and
66), which we demonstrate next.
Let Xi |Θ be a member of the linear exponential family. Then its pdf can be
expressed as
\[
f_{X_i|\Theta}(x|\theta) = \frac{p(x)e^{r(\theta)x}}{q(\theta)}.
\]
For a member of the linear exponential family, the mean is given by (see
Example 24.2)
\[
\mu(\theta) = E(X_i|\Theta) = \frac{q'(\theta)}{r'(\theta)q(\theta)}
\]
and the variance by
\[
\text{Var}(X_i|\Theta) = \frac{\mu'(\theta)}{r'(\theta)}.
\]
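We will show that the posterior distribution is of the same type as π(θ), so
that the prior distribution is a conjugate prior distribution. Indeed, we have
\[
\begin{aligned}
\int f_{X|\Theta}(x|\theta)\pi(\theta)\,d\theta &= \int \frac{\left[\prod_{j=1}^n p(x_j)\right] e^{r(\theta)\sum_{j=1}^n x_j}}{[q(\theta)]^n} \times \frac{[q(\theta)]^{-k} e^{\mu k r(\theta)} r'(\theta)}{c(\mu,k)}\,d\theta \\
&= \prod_{j=1}^n p(x_j) \int \frac{[q(\theta)]^{-(k+n)} \exp\left\{r(\theta)\,\frac{\mu k + \sum_{j=1}^n x_j}{k+n}\,(k+n)\right\} r'(\theta)}{c(\mu,k)}\,d\theta \\
&= \prod_{j=1}^n p(x_j) \int \frac{[q(\theta)]^{-k^*} e^{r(\theta)\mu^* k^*} r'(\theta)}{c(\mu,k)}\,d\theta \\
&= \frac{c(\mu^*,k^*)}{c(\mu,k)} \prod_{j=1}^n p(x_j)
\end{aligned}
\]
where
\[
\mu^* = \frac{\mu k + \sum_{j=1}^n x_j}{k+n} \quad \text{and} \quad k^* = k + n,
\]
and $c(\mu^*,k^*)$ is the normalizing constant of the density with parameters $\mu^*$ and $k^*$.
The posterior distribution is
\[
\begin{aligned}
\pi(\theta|x) &= \frac{\dfrac{\left[\prod_{j=1}^n p(x_j)\right] e^{r(\theta)\sum_{j=1}^n x_j}}{[q(\theta)]^n} \times \dfrac{[q(\theta)]^{-k} e^{\mu k r(\theta)} r'(\theta)}{c(\mu,k)}}{\dfrac{c(\mu^*,k^*)}{c(\mu,k)}\prod_{j=1}^n p(x_j)} \\
&= \frac{[q(\theta)]^{-(k+n)} \exp\left\{r(\theta)\,\frac{\mu k + \sum_{j=1}^n x_j}{k+n}\,(k+n)\right\} r'(\theta)}{c(\mu^*,k^*)} \\
&= \frac{[q(\theta)]^{-k^*} e^{r(\theta)\mu^* k^*} r'(\theta)}{c(\mu^*,k^*)}.
\end{aligned}
\]
From the expression of π(θ|x), we can write
\[
\ln\frac{\pi(\theta|x)}{r'(\theta)} = -k^*\ln[q(\theta)] + \mu^* k^* r(\theta) - \ln[c(\mu^*,k^*)].
\]
Differentiating with respect to θ yields
\[
\frac{[\pi(\theta|x)/r'(\theta)]'}{\pi(\theta|x)/r'(\theta)} = -k^*\frac{q'(\theta)}{q(\theta)} + \mu^* k^* r'(\theta)
\]
which, since $q'(\theta)/q(\theta) = \mu(\theta)r'(\theta)$, can be rearranged as
\[
\frac{d}{d\theta}\left[\frac{\pi(\theta|x)}{r'(\theta)}\right] = -k^*[\mu(\theta) - \mu^*]\pi(\theta|x).
\]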
The following table lists some models where the Bayesian premium equals
the credibility premium:
fX|Θ (x|θ)     π(θ)             π(θ|x)
Poisson        Gamma            Gamma
Normal         Normal           Normal
Bernoulli      Beta             Beta
Exponential    Inverse Gamma    Inverse Gamma
Example 85.1
You are given:
(i) The model distribution X|Λ is Poisson with parameter Λ.
(ii) The prior distribution of Λ is Gamma with parameters α and θ.
Show that this model satisfies exact credibility.
Solution.
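Example 66.2 shows that the posterior distribution is a Gamma distribution
with parameters $\alpha' = \alpha + n\overline{X}$ and $\theta' = \frac{\theta}{n\theta+1}$, so the Bayesian premium is the posterior mean $\alpha'\theta'$. For the credibility premium, we have
\[
\begin{aligned}
\mu(\lambda) &= \lambda \\
\mu &= E(\Lambda) = \alpha\theta \\
v(\lambda) &= \text{Var}(X_i|\Lambda) = \lambda \\
v &= E(\Lambda) = \alpha\theta \\
a &= \text{Var}(\Lambda) = \alpha\theta^2 \\
k &= \frac{v}{a} = \frac{1}{\theta} \\
Z &= \frac{n}{n+k} = \frac{n\theta}{n\theta+1} \\
P_c &= Z\overline{X} + (1-Z)\mu \\
&= \frac{n\theta}{n\theta+1}\,\overline{X} + \frac{1}{n\theta+1}(\alpha\theta) \\
&= (n\overline{X} + \alpha)\,\frac{\theta}{n\theta+1} = \alpha'\theta'.
\end{aligned}
\]
Thus, the credibility premium equals the Bayesian premium.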
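The equality shown in Example 85.1 can be checked on a concrete case. The Python sketch below uses hypothetical prior parameters and hypothetical claim counts (none of these numbers come from the text) and compares the posterior mean α'θ' with the credibility premium ZX̄ + (1 − Z)µ using exact fractions.

```python
from fractions import Fraction as F

# Hypothetical prior parameters and claim counts, for illustration only.
alpha, theta = F(3), F(1, 2)
data = [2, 0, 1, 4]
n = len(data)
xbar = F(sum(data), n)

# Bayesian premium: mean of the Gamma posterior with the updated parameters.
alpha_post = alpha + n * xbar
theta_post = theta / (n * theta + 1)
bayes_premium = alpha_post * theta_post

# Credibility premium with k = 1/theta.
mu = alpha * theta
k = 1 / theta
Z = F(n) / (n + k)
cred_premium = Z * xbar + (1 - Z) * mu

print(bayes_premium, cred_premium)
```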
Example 85.2 ‡
You are given:
(i) The number of claims per auto insured follows a Poisson distribution
with mean λ.
(ii) The prior distribution for Λ has the following probability density
function:
\[
f(\lambda) = \frac{(500\lambda)^{50} e^{-500\lambda}}{\lambda\,\Gamma(50)}.
\]
(iii) A company observes the following claims experience:
                           Year 1    Year 2
Number of claims           75        210
Number of autos insured    600       900
The company expects to insure 1100 autos in Year 3. Determine the ex-
pected number of claims in Year 3.
Solution.
The model distribution is Poisson with mean λ and the prior distribution is
Gamma with α = 50 and $\theta = \frac{1}{500}$. Thus, this model satisfies exact
credibility.
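Because exact credibility holds, the Bühlmann-Straub premium and the Bayesian premium should coincide for this example. The Python sketch below computes both routes with exact fractions and scales the per-auto result by the 1100 autos expected in Year 3. It assumes the usual Poisson-Gamma update with exposures (the shape increases by the total claims and the rate by the total autos insured); the variable names are illustrative.

```python
from fractions import Fraction as F

alpha, theta = F(50), F(1, 500)        # Gamma prior read off from the density
claims = [75, 210]
autos = [600, 900]
m = sum(autos)
total_claims = sum(claims)

# Route 1: Buhlmann-Straub with mu = v = alpha*theta and a = alpha*theta**2.
mu = alpha * theta
k = (alpha * theta) / (alpha * theta ** 2)   # equals 1/theta = 500
Z = F(m) / (m + k)
xbar = F(total_claims, m)
cred_per_auto = Z * xbar + (1 - Z) * mu

# Route 2: posterior mean of the Gamma posterior (shape + claims, rate + autos).
post_shape = alpha + total_claims
post_rate = 1 / theta + m
bayes_per_auto = post_shape / post_rate

expected_year3 = 1100 * cred_per_auto
print(Z, cred_per_auto, bayes_per_auto, float(expected_year3))
```

Both routes give 0.1675 claims per auto, or about 184 claims for 1100 autos.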
Practice Problems
Problem 85.1
You are given:
(i) The model distribution X|Λ is exponential with mean Λ.
(ii) The prior distribution of Λ is inverse Gamma with parameters α and β.
Show that this model satisfies exact credibility.
Problem 85.2
You are given:
• The model distribution X|Q is binomial with parameters m and q.
• The prior distribution Q is beta with parameters a, b and 1.
Show that this model satisfies exact credibility.
Problem 85.3
You are given:
• The model distribution X|Λ is normal with mean Λ and variance σ 2 .
• The prior distribution Λ is normal with mean µ and variance a2 .
Show that this model satisfies exact credibility.
Problem 85.4 ‡
You are given:
(i) The conditional distribution of the number of claims per policyholder is
Poisson with mean λ.
(ii) The variable λ has a gamma distribution with parameters α and θ.
(iii) For policyholders with 1 claim in Year 1, the credibility estimate for the
number of claims in Year 2 is 0.15.
(iv) For policyholders with an average of 2 claims per year in Year 1 and
Year 2, the credibility estimate for the number of claims in Year 3 is 0.20.
Determine θ.
86 NON-PARAMETRIC EMPIRICAL BAYES ESTIMATION FOR THE BÜHLMANN MODEL643
The data for our analysis have the following structure. Suppose we have r ≥ 1 policyholders or groups of policyholders. For policyholder i there are $n_i$ years of claim experience (observed exposure units). Let $X_{ij}$ be the average number of losses/claims for policyholder i in year j. The claim vector for policyholder i over all years is
\[ \mathbf{X}_i = (X_{i1}, \cdots, X_{in_i})^T, \quad i = 1, \cdots, r. \]
We assume that $\mathbf{X}_1, \cdots, \mathbf{X}_r$ are independent; it is reasonable to think that the claims of different groups are independent of each other.
Let $\Theta_i$ denote the risk parameter for the ith policyholder. We assume that the $\Theta_i$ are independent and identically distributed. We also assume that the $X_{ij}|\Theta_i$ are independent for $j = 1, \cdots, n_i$.
Let $m_{ij}$ denote the number of exposure units for policyholder i in year j. Then the total number of exposure units over all the years is
\[ m_i = \sum_{j=1}^{n_i} m_{ij}, \quad i = 1, \cdots, r. \]
The total exposure units for all policyholders over all the years is
\[ m = \sum_{i=1}^{r} m_i. \]
644 CREDIBILITY THEORY
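The past average loss experience of policyholder i over all the years is
\[ \bar{X}_i = \frac{1}{m_i}\sum_{j=1}^{n_i} m_{ij}X_{ij}, \quad i = 1, \cdots, r \]
and, in the Bühlmann model where every policyholder has n years of experience with one exposure unit per year ($n_i = n$, $m_{ij} = 1$), the overall average is
\[ \bar{X} = \frac{1}{r}\sum_{i=1}^{r}\bar{X}_i = \frac{1}{nr}\sum_{i=1}^{r}\sum_{j=1}^{n} X_{ij}. \]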
\begin{align*}
E\left[\sum_{i=1}^{k}(Y_i-\bar{Y})^2\right] &= \sum_{i=1}^{k}E[(Y_i-\mu)^2] - kE[(\bar{Y}-\mu)^2]\\
&= \sum_{i=1}^{k}\text{Var}(Y_i) - k\text{Var}(\bar{Y})\\
&= k\sigma^2 - k\frac{\sigma^2}{k} = (k-1)\sigma^2.
\end{align*}
Hence,
\[ E\left[\frac{1}{k-1}\sum_{i=1}^{k}(Y_i-\bar{Y})^2\right] = \sigma^2. \]
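Since
\[ E(\hat{v}_i) = E[E(\hat{v}_i|\Theta_i)] = E[v(\Theta_i)] = v \]
and
\[ E\left(\frac{1}{r}\sum_{i=1}^{r}\hat{v}_i\right) = v \]
an unbiased estimator of v is
\[ \hat{v} = \frac{1}{r}\sum_{i=1}^{r}\hat{v}_i = \frac{1}{r(n-1)}\sum_{i=1}^{r}\sum_{j=1}^{n}(X_{ij}-\bar{X}_i)^2. \]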
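Moreover, since $\text{Var}(\bar{X}_i) = a + \frac{v}{n}$ unconditionally, the sample variance of the $\bar{X}_i$ is an unbiased estimator of $a + \frac{v}{n}$. We therefore set
\[ \frac{1}{r-1}\sum_{i=1}^{r}(\bar{X}_i-\bar{X})^2 = \hat{a} + \frac{\hat{v}}{n} \]
which implies
\begin{align*}
\hat{a} &= \frac{1}{r-1}\sum_{i=1}^{r}(\bar{X}_i-\bar{X})^2 - \frac{\hat{v}}{n}\\
&= \frac{1}{r-1}\sum_{i=1}^{r}(\bar{X}_i-\bar{X})^2 - \frac{1}{rn(n-1)}\sum_{i=1}^{r}\sum_{j=1}^{n}(X_{ij}-\bar{X}_i)^2.
\end{align*}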
Remark 86.1
Due to the subtraction in the formula for â, it is possible that â ≤ 0. When
this happens, it is customary to set â = Ẑ = 0.
Example 86.1
You are given the losses of two policyholders over a period of three years:
Determine the Bayes estimate of the Bühlmann premium for each policy-
holder for Year 4.
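Solution.
We have
\begin{align*}
\bar{X}_1 &= \frac{3+5+7}{3} = 5\\
\bar{X}_2 &= \frac{6+12+9}{3} = 9\\
\hat{\mu} &= \bar{X} = \frac{5+9}{2} = 7\\
\hat{v}_1 &= \frac{1}{3-1}\sum_{j=1}^{3}(X_{1j}-\bar{X}_1)^2 = \frac{1}{3-1}[(3-5)^2+(5-5)^2+(7-5)^2] = 4\\
\hat{v}_2 &= \frac{1}{3-1}\sum_{j=1}^{3}(X_{2j}-\bar{X}_2)^2 = \frac{1}{3-1}[(6-9)^2+(12-9)^2+(9-9)^2] = 9\\
\hat{v} &= \frac{\hat{v}_1+\hat{v}_2}{2} = \frac{9+4}{2} = 6.5\\
\hat{a} &= \frac{1}{r-1}\sum_{i=1}^{r}(\bar{X}_i-\bar{X})^2 - \frac{\hat{v}}{n} = \frac{1}{2-1}[(5-7)^2+(9-7)^2] - \frac{6.5}{3} = \frac{35}{6}\\
\hat{k} &= \frac{\hat{v}}{\hat{a}} = \frac{13/2}{35/6} = \frac{39}{35}\\
\hat{Z} &= \frac{3}{3+\frac{39}{35}} = \frac{35}{48}.
\end{align*}
The estimated Bühlmann premium for policyholder 1 in year 4 is:
\[ \hat{Z}\bar{X}_1 + (1-\hat{Z})\hat{\mu} = \frac{35}{48}(5) + \left(1-\frac{35}{48}\right)(7) = \frac{133}{24}. \]
The estimated Bühlmann premium for policyholder 2 in year 4 is:
\[ \hat{Z}\bar{X}_2 + (1-\hat{Z})\hat{\mu} = \frac{35}{48}(9) + \left(1-\frac{35}{48}\right)(7) = \frac{203}{24}. \]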
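The estimators used in this solution can be checked numerically. The sketch below assumes the Bühlmann setting of this section (every policyholder has the same number of years n and one exposure unit per year); the function name `buhlmann_npeb` is illustrative, and the loss data are those appearing in the solution.

```python
def buhlmann_npeb(data):
    """Nonparametric empirical Bayes estimates for the Buhlmann model.

    data[i][j] is the loss of policyholder i in year j (equal n for all i).
    Returns (mu_hat, v_hat, a_hat, z_hat, premiums).
    """
    r, n = len(data), len(data[0])
    means = [sum(row) / n for row in data]
    mu_hat = sum(means) / r
    # Expected process variance: pooled within-policyholder sample variance.
    v_hat = sum((x - m) ** 2 for row, m in zip(data, means) for x in row) / (r * (n - 1))
    # Variance of hypothetical means: between variance minus v_hat / n.
    a_hat = sum((m - mu_hat) ** 2 for m in means) / (r - 1) - v_hat / n
    # Customary convention: no credibility when a_hat is not positive.
    z_hat = 0.0 if a_hat <= 0 else n / (n + v_hat / a_hat)
    premiums = [z_hat * m + (1 - z_hat) * mu_hat for m in means]
    return mu_hat, v_hat, a_hat, z_hat, premiums


mu_hat, v_hat, a_hat, z_hat, premiums = buhlmann_npeb([[3, 5, 7], [6, 12, 9]])
print(z_hat, premiums)
```

Returning a credibility factor of zero when the estimate of a is not positive follows the convention stated in the remark above.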
Remark 86.2
Due to the subtraction in the formula for â, it is possible that â ≤ 0. When
this happens, it is customary to set â = Ẑ = 0.
Example 86.2 ‡
Survival times are available for four insureds, two from Class A and two
from Class B. The two from Class A died at times t = 1 and t = 9. The two
from Class B died at times t = 2 and t = 4.
Nonparametric Empirical Bayes estimation is used to estimate the mean
survival time for each class. Unbiased estimators of the expected value of
the process variance and the variance of the hypothetical means are used.
Estimate Z, the Bühlmann credibility factor.
Solution.
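We have: $r = n = 2$, $X_{11} = 1$, $X_{12} = 9$, $X_{21} = 2$, $X_{22} = 4$, $\bar{X}_1 = 5$ and $\bar{X}_2 = 3$, so that $\bar{X} = 4$. Thus,
\[ \hat{v} = \frac{1}{r(n-1)}\sum_{i=1}^{r}\sum_{j=1}^{n}(X_{ij}-\bar{X}_i)^2 = \frac{1}{2(2-1)}[(1-5)^2+(9-5)^2+(2-3)^2+(4-3)^2] = 17 \]
and
\[ \hat{a} = \frac{1}{r-1}\sum_{i=1}^{r}(\bar{X}_i-\bar{X})^2 - \frac{1}{rn(n-1)}\sum_{i=1}^{r}\sum_{j=1}^{n}(X_{ij}-\bar{X}_i)^2 = \frac{1}{2-1}[(5-4)^2+(3-4)^2] - \frac{17}{2} = -6.5. \]
Since $\hat{a} < 0$, we set $\hat{Z} = 0$.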
Practice Problems
Problem 86.1 ‡
An insurer has data on losses for four policyholders for 7 years. The loss
from the ith policyholder for year j is Xij .
You are given:
\[ \sum_{i=1}^{4}\sum_{j=1}^{7}(X_{ij}-\bar{X}_i)^2 = 33.60 \]
and
\[ \sum_{i=1}^{4}(\bar{X}_i-\bar{X})^2 = 3.30. \]
Problem 86.2 ‡
You are given total claims for two policyholders:
Year
Policyholder 1 2 3 4
X 730 800 650 700
Y 655 650 625 750
Problem 86.3 ‡
Three individual policyholders have the following claim amounts over four
years:
Problem 86.4 ‡
Three policyholders have the following claims experience over three months:
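From
\[ \bar{X} = \frac{1}{m}\sum_{i=1}^{r} m_i\bar{X}_i \]
we find
\begin{align*}
E(\bar{X}) &= \frac{1}{m}\sum_{i=1}^{r} m_iE(\bar{X}_i) = \frac{1}{m}\sum_{i=1}^{r} m_iE[E(\bar{X}_i|\Theta_i)]\\
&= \frac{1}{m}\sum_{i=1}^{r} m_iE\left[\sum_{j=1}^{n_i}\frac{m_{ij}}{m_i}E(X_{ij}|\Theta_i)\right]\\
&= \frac{1}{m}\sum_{i=1}^{r} m_iE\left[\sum_{j=1}^{n_i}\frac{m_{ij}}{m_i}\mu(\theta_i)\right] = \frac{1}{m}\sum_{i=1}^{r} m_iE[\mu(\theta_i)]\\
&= \frac{1}{m}\sum_{i=1}^{r} m_i\mu = \mu.
\end{align*}
\[ E(\bar{X}_i|\Theta_i) = \frac{1}{m_i}\sum_{j=1}^{n_i} m_{ij}E(X_{ij}|\Theta_i) = \mu(\theta_i). \]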
87 NON-PARAMETRIC EMPIRICAL BAYES ESTIMATION FOR THE BÜHLMANN-STRAUB MODEL6
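Thus,
\[ E\left[\frac{1}{n_i-1}\sum_{j=1}^{n_i} m_{ij}(X_{ij}-\bar{X}_i)^2 \,\Big|\, \Theta_i\right] = v(\theta_i) \]
so that $\hat{v}_i$, the expression inside the expectation, is an unbiased estimator of v. Another unbiased estimator of v is $\hat{v} = \sum_{i=1}^{r} w_i\hat{v}_i$ where $\sum_{i=1}^{r} w_i = 1$. One choice of the $w_i$ is
\[ w_i = \frac{n_i-1}{\sum_{i=1}^{r}(n_i-1)}. \]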
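Thus,
\[ E\left\{\left(m-\frac{1}{m}\sum_{i=1}^{r} m_i^2\right)^{-1}\left[\sum_{i=1}^{r} m_i(\bar{X}_i-\bar{X})^2 - v(r-1)\right]\right\} = a \]
so that an unbiased estimator of a is
\[ \hat{a} = \left(m-\frac{1}{m}\sum_{i=1}^{r} m_i^2\right)^{-1}\left[\sum_{i=1}^{r} m_i(\bar{X}_i-\bar{X})^2 - \hat{v}(r-1)\right]. \]
The estimated credibility quantities are
\[ \hat{k} = \frac{\hat{v}}{\hat{a}} \quad\text{and}\quad \hat{Z}_i = \frac{m_i}{m_i+\hat{k}} \]
and the credibility premium to cover all $m_{i,n_i+1}$ exposure units for policyholder i in the next year would be
\[ m_{i,n_i+1}[\hat{Z}_i\bar{X}_i + (1-\hat{Z}_i)\hat{\mu}]. \]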
Remark 87.1
Note that the above equations provide unbiased estimators of µ, v, and a
respectively. They are nonparametric, requiring no distributional assump-
tions. Also, due to the subtraction in the formula for â, it is possible that
â ≤ 0. When this happens, it is customary to set â = Ẑ = 0.
Example 87.1
You are given:
Solution.
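(a) We have
\begin{align*}
\bar{X}_1 &= \frac{750+600}{3+2} = 270\\
\bar{X}_2 &= \frac{975+1200+900}{5+6+4} = 205\\
\bar{X} &= \frac{5}{20}\bar{X}_1 + \frac{15}{20}\bar{X}_2 = \frac{5}{20}(270) + \frac{15}{20}(205) = 221.25\\
\hat{\mu} &= \bar{X} = 221.25\\
\hat{v}_1 &= \frac{3\left(\frac{750}{3}-270\right)^2 + 2\left(\frac{600}{2}-270\right)^2}{2-1} = 3000\\
\hat{v}_2 &= \frac{5\left(\frac{975}{5}-205\right)^2 + 6\left(\frac{1200}{6}-205\right)^2 + 4\left(\frac{900}{4}-205\right)^2}{3-1} = 1125\\
\hat{v} &= \frac{(2-1)\hat{v}_1+(3-1)\hat{v}_2}{(2-1)+(3-1)} = \frac{3000+2(1125)}{3} = 1750\\
\hat{a} &= \frac{1}{20-\frac{5^2+15^2}{20}}[5(270-221.25)^2 + 15(205-221.25)^2 - (2-1)(1750)] = 1879.1667.
\end{align*}
(b) We have
\begin{align*}
\hat{k} &= \frac{\hat{v}}{\hat{a}} = \frac{1750}{1879.1667} = 0.9313\\
\hat{Z}_1 &= \frac{m_1}{m_1+\hat{k}} = \frac{5}{5+0.9313} = 0.843\\
\hat{Z}_2 &= \frac{15}{15+0.9313} = 0.9415.
\end{align*}
The premium in year 4 for a policyholder in Group I is
\[ 4[\hat{Z}_1\bar{X}_1 + (1-\hat{Z}_1)\hat{\mu}] = 4[0.843(270) + (1-0.843)(221.25)] = 1049.38. \]
The premium in year 4 for a policyholder in Group II is
\[ 5[\hat{Z}_2\bar{X}_2 + (1-\hat{Z}_2)\hat{\mu}] = 5[0.9415(205) + (1-0.9415)(221.25)] = 1029.75. \]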
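The Bühlmann-Straub calculation above can be sketched in code. The example statement is not reproduced in this text, so the exposures and total losses below are read off the solution; the function name `buhlmann_straub_npeb` and its argument layout are illustrative assumptions.

```python
def buhlmann_straub_npeb(exposures, totals):
    """Nonparametric empirical Bayes estimates for the Buhlmann-Straub model.

    exposures[i][j] = m_ij and totals[i][j] = m_ij * X_ij (total losses).
    Returns (xbar_i, mu_hat, v_hat, a_hat, z_hat) with one z per group.
    """
    r = len(exposures)
    m_i = [sum(row) for row in exposures]
    m = sum(m_i)
    xbar_i = [sum(t_row) / mi for t_row, mi in zip(totals, m_i)]
    mu_hat = sum(mi * xi for mi, xi in zip(m_i, xbar_i)) / m
    # Pooled estimate of v, using weights (n_i - 1) / sum(n_i - 1).
    within = sum(
        mij * (tij / mij - xi) ** 2
        for e_row, t_row, xi in zip(exposures, totals, xbar_i)
        for mij, tij in zip(e_row, t_row)
    )
    v_hat = within / sum(len(row) - 1 for row in exposures)
    between = sum(mi * (xi - mu_hat) ** 2 for mi, xi in zip(m_i, xbar_i))
    a_hat = (between - v_hat * (r - 1)) / (m - sum(mi**2 for mi in m_i) / m)
    # Customary convention: zero credibility when a_hat is not positive.
    if a_hat <= 0:
        z_hat = [0.0] * r
    else:
        k_hat = v_hat / a_hat
        z_hat = [mi / (mi + k_hat) for mi in m_i]
    return xbar_i, mu_hat, v_hat, a_hat, z_hat


xbar_i, mu_hat, v_hat, a_hat, z_hat = buhlmann_straub_npeb(
    [[3, 2], [5, 6, 4]], [[750, 600], [975, 1200, 900]]
)
# Year 4 premiums for 4 and 5 exposure units, as in part (b).
premium_1 = 4 * (z_hat[0] * xbar_i[0] + (1 - z_hat[0]) * mu_hat)
premium_2 = 5 * (z_hat[1] * xbar_i[1] + (1 - z_hat[1]) * mu_hat)
print(round(premium_1, 2), round(premium_2, 2))
```

Passing total losses rather than per-unit averages keeps the input close to how such experience is usually tabulated; the per-unit averages are recovered inside the function.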
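It is often desirable for TP to equal TL. Since $m_i(1-\hat{Z}_i) = \hat{k}\hat{Z}_i$, this leads to the following calculation:
\begin{align*}
0 &= \sum_{i=1}^{r} m_i(1-\hat{Z}_i)(\hat{\mu}-\bar{X}_i)\\
0 &= \sum_{i=1}^{r}\hat{k}\hat{Z}_i(\hat{\mu}-\bar{X}_i)\\
\hat{\mu}\sum_{i=1}^{r}\hat{Z}_i &= \sum_{i=1}^{r}\hat{Z}_i\bar{X}_i\\
\hat{\mu} &= \frac{\sum_{i=1}^{r}\hat{Z}_i\bar{X}_i}{\sum_{i=1}^{r}\hat{Z}_i}.
\end{align*}
Hence, we have another estimator of µ. We refer to this process of estimating µ as the credibility weighted average method or the method of preserving total losses/claims (See Problem 87.4).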
Example 87.2
Redo Example 87.1(b) if µ is estimated by credibility weighted average.
Solution.
We have
\[ \hat{\mu} = \frac{\hat{Z}_1\bar{X}_1+\hat{Z}_2\bar{X}_2}{\hat{Z}_1+\hat{Z}_2} = \frac{0.843(270)+0.9415(205)}{0.843+0.9415} = 235.7061. \]
The premium in year 4 for a policyholder in Group I is
\[ 4[\hat{Z}_1\bar{X}_1 + (1-\hat{Z}_1)\hat{\mu}] = 4[0.843(270) + (1-0.843)(235.7061)] = 1058.46. \]
The premium in year 4 for a policyholder in Group II is
\[ 5[\hat{Z}_2\bar{X}_2 + (1-\hat{Z}_2)\hat{\mu}] = 5[0.9415(205) + (1-0.9415)(235.7061)] = 1033.98. \]
Practice Problems
Problem 87.1 ‡
You are given the following commercial automobile policy experience:
Company                          Year 1    Year 2    Year 3
I     Total Losses               50,000    50,000       ?
      Number of Automobiles         100       200       ?
II    Total Losses                    ?   150,000   150,000
      Number of Automobiles           ?       500       300
III   Total Losses              150,000         ?   150,000
      Number of Automobiles          50         ?       150
Determine the nonparametric empirical Bayes credibility factor, Z, for Company III.
Problem 87.2 ‡
You are given:
Group Year 1 Year 2 Year 3 Total
Total Claims 1 10,000 15,000 25,000
Number in Group 50 60 110
Average 200 250 227.27
Total Claims 2 16,000 18,000 34,000
Number in Group 100 90 190
Average 160 200 178.95
Total Claims 59,000
Number in Group 300
Average 196.67
You are also given $\hat{a} = 651.03$.
Use the nonparametric empirical Bayes method to estimate the credibility
factor for Group 1.
Problem 87.3 ‡
You are given the following data:
Year 1 Year 2
Total losses 12,000 14,000
Number of Policyholders 25 30
The estimate of the variance of the hypothetical means is 254.
Determine the credibility factor for Year 3 using the nonparametric empirical
Bayes method.
Problem 87.4 ‡
You are making credibility estimates for regional rating factors. You observe
that the Bühlmann-Straub nonparametric empirical Bayes method can be
applied, with rating factor playing the role of pure premium.
Xij denotes the rating factor for region i and year j, where i = 1, 2, 3 and
j = 1, 2, 3, 4.
Corresponding to each rating factor is the number of reported claims, mij ,
measuring exposure.
You are given:
i    $m_i=\sum_{j=1}^{4}m_{ij}$    $\bar{X}_i=\frac{1}{m_i}\sum_{j=1}^{4}m_{ij}X_{ij}$    $\hat{v}_i=\frac{1}{3}\sum_{j=1}^{4}m_{ij}(X_{ij}-\bar{X}_i)^2$    $m_i(\bar{X}_i-\bar{X})^2$
1    50     1.406    0.536    0.887
2    300    1.298    0.125    0.191
3    150    1.178    0.172    1.348
Determine the credibility estimate of the rating factor for region 1 using the method that preserves $\sum_{i=1}^{3} m_i\bar{X}_i$.
Problem 87.5 ‡
You are given the following experience for two insured groups:
Group Year
1 2 3 Total
1 Number of members 8 12 5 25
Average loss per member 96 91 113 97
2 Number of members 25 30 20 75
Average loss per member 113 111 116 113
Total Number of members 100
Average loss per member 109
\[ \sum_{i=1}^{2}\sum_{j=1}^{3} m_{ij}(X_{ij}-\bar{X}_i)^2 = 2020 \quad\text{and}\quad \sum_{i=1}^{2} m_i(\bar{X}_i-\bar{X})^2 = 4800. \]
Problem 87.6 ‡
You are given:
(i) A region is comprised of three territories. Claims experience for Year 1
is as follows:
A 10 4
B 20 5
C 30 3
(ii) The number of claims for each insured each year has a Poisson distribu-
tion.
(iii) Each insured in a territory has the same expected claim frequency.
(iv) The number of insureds is constant over time for each territory.
Determine the Bühlmann-Straub empirical Bayes estimate of the credibility
factor Z for Territory A.
Problem 87.7 ‡
You are given the following information on towing losses for two classes of
insureds, adults and youths:
Exposures
Exposure
Year Adult Youth Total
1996 2000 450 2450
1997 1000 250 1250
1998 1000 175 1175
1999 1000 125 1125
Total 5000 1000 6000
Pure Premium
Year Adult Youth Total
1996 0 15 2.755
1997 5 2 4.400
1998 6 15 7.340
1999 4 1 3.667
Weighted Average 3 10 4.167
You are also given that the estimated variance of the hypothetical means is
17.125.
Determine the nonparametric empirical Bayes credibility premium for the
youth class.
88 SEMIPARAMETRIC EMPIRICAL BAYES CREDIBILITY ESTIMATION661
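Let $X_{ij}$ denote the average number of claims for policyholder i in year j where $i = 1, 2, \cdots, r$. Suppose that, given $\Theta_i$, the number of claims $m_{ij}X_{ij}$ has a Poisson distribution with parameter $m_{ij}\theta_i$. Under these assumptions, we have
\begin{align*}
\mu(\theta_i) &= E(X_{ij}|\Theta_i) = \frac{1}{m_{ij}}E(m_{ij}X_{ij}|\Theta_i) = \frac{m_{ij}\theta_i}{m_{ij}} = \theta_i\\
v(\theta_i) &= m_{ij}\text{Var}(X_{ij}|\Theta_i) = \frac{1}{m_{ij}}\text{Var}(m_{ij}X_{ij}|\Theta_i) = \frac{m_{ij}\theta_i}{m_{ij}} = \theta_i.
\end{align*}
It follows that
\[ \mu = v = E(\Theta_i) \]
and an unbiased estimator for both µ and v is $\bar{X}$. That is, $\hat{\mu} = \hat{v} = \bar{X}$.
Hence, in the case $n_i = 1$ and $m_{i1} = 1$, $\frac{1}{r-1}\sum_{i=1}^{r}(X_{i1}-\bar{X})^2$ is an unbiased estimator of $a + v$ and therefore
\[ \hat{a} = \frac{1}{r-1}\sum_{i=1}^{r}(X_{i1}-\bar{X})^2 - \hat{v}. \]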
Solution.
We have: $r = 1875$, $n_i = 1$, $m_{i1} = 1$ and $X_{i1}|\Theta_i$ is Poisson with parameter $\theta_i$. The estimators of µ, v, and a are found as follows:
\begin{align*}
\bar{X} &= \frac{1}{1875}\sum_{i=1}^{1875}X_{i1} = \frac{0(1563)+1(271)+2(32)+3(7)+4(2)}{1875} = 0.194\\
\hat{\mu} &= \hat{v} = 0.194\\
\sum_{i=1}^{1875}(X_{i1}-\bar{X})^2 &= 1563(0-0.194)^2 + 271(1-0.194)^2 + 32(2-0.194)^2\\
&\quad + 7(3-0.194)^2 + 2(4-0.194)^2 = 423.3355\\
\hat{a} &= \frac{423.3355}{1874} - 0.194 = 0.032\\
\hat{k} &= \frac{0.194}{0.032} = 6.06\\
\hat{Z} &= \frac{1}{1+6.06} = 0.14.
\end{align*}
The estimated credibility premium for the number of claims for each policyholder is
\[ 0.14X_{i1} + (0.86)(0.194). \]
For a policyholder with two claims, $X_{i1} = 2$ so that the premium is $0.14(2) + (0.86)(0.194) = 0.447$.
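The semiparametric calculation above can be reproduced from a frequency table. The sketch below assumes one year of experience and one exposure unit per policyholder, with Poisson claim counts so that the estimate of v equals the sample mean; the function name is illustrative. Because the code does not round intermediate values, its results differ slightly from the rounded figures in the solution.

```python
def semiparametric_poisson(freq):
    """freq maps a claim count to the number of policyholders with that count.

    Assumes n_i = 1 and m_i1 = 1 for every policyholder.
    Returns (xbar, a_hat, z_hat).
    """
    r = sum(freq.values())
    xbar = sum(k * count for k, count in freq.items()) / r
    v_hat = xbar  # Poisson: process variance equals the hypothetical mean
    a_hat = sum(count * (k - xbar) ** 2 for k, count in freq.items()) / (r - 1) - v_hat
    z_hat = 0.0 if a_hat <= 0 else 1 / (1 + v_hat / a_hat)
    return xbar, a_hat, z_hat


freq = {0: 1563, 1: 271, 2: 32, 3: 7, 4: 2}
xbar, a_hat, z_hat = semiparametric_poisson(freq)
premium_two_claims = z_hat * 2 + (1 - z_hat) * xbar
print(round(xbar, 3), round(a_hat, 3), round(z_hat, 2))
```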
Practice Problems
Problem 88.1
Suppose that mij Xij |Θi has a binomial distribution with parameters (mij , θi ).
Express a in terms of µ and v.
Problem 88.2 ‡
You are given:
(i) During a single 5-year period, 100 policies had the following total claims experience:
Number of claims in Year 1 through Year 5    Number of Policies
0                                            46
1                                            34
2                                            13
3                                            5
4                                            2
(ii) The number of claims per year follows a Poisson distribution.
(iii) Each policyholder was insured for the entire period.
A randomly selected policyholder had 3 claims over the period.
Using semiparametric empirical Bayes estimation, determine the Bühlmann estimate for the number of claims in Year 6 for the same policyholder.
Problem 88.3 ‡
You are given:
(i) During a 2-year period, 100 policies had the following claims experience:
Number of claims in Year 1 through Year 2    Number of Policies
0                                            50
1                                            30
2                                            15
3                                            4
4                                            1
(ii) The number of claims per year follows a Poisson distribution.
(iii) Each policyholder was insured for the entire 2-year period.
A randomly selected policyholder had one claim over the 2-year period.
Problem 88.4 ‡
You are given:
(i) Over a three-year period, the following claim experience was observed
for two insureds who own delivery vans:
Year
Insured 1 2 3
A Number of Vehicles 2 2 1
Number of claims 1 1 0
B Number of Vehicles N/A 3 2
Number of claims N/A 2 3
(ii) The number of claims for each insured each year follows a Poisson dis-
tribution.
Determine the semiparametric empirical Bayes estimate of the claim fre-
quency per vehicle for Insured A in Year 4.
Problem 88.5 ‡
For a group of auto policyholders, you are given:
(i) The number of claims for each policyholder has a conditional Poisson
distribution.
(ii) During Year 1, the following data are observed for 8000 policyholders:
Number of claims Number of Policies
0 5000
1 2100
2 750
3 100
4 50
5+ 0
A randomly selected policyholder had one claim in Year 1.
Determine the semiparametric empirical Bayes estimate of the number of
claims in Year 2 for the same policyholder.
Problem 88.6 ‡
For a portfolio of motorcycle insurance policyholders, you are given:
(i) The number of claims for each policyholder has a conditional Poisson
distribution.
(ii) For Year 1, the following data are observed:
# of claims # of insureds
0 2000
1 600
2 300
3 80
4 20
Total 3000
Problem 88.7 ‡
The following information comes from a study of robberies of convenience
stores over the course of a year:
(i) $X_i$ is the number of robberies of the ith store, with $i = 1, 2, \cdots, 500$.
(ii) $\sum_{i=1}^{500} X_i = 50$.
(iii) $\sum_{i=1}^{500} X_i^2 = 220$.
(iv) The number of robberies of a given store during the year is assumed to
be Poisson distributed with an unknown mean that varies by store.
Determine the semiparametric empirical Bayes estimate of the expected
number of robberies next year of a store that reported no robberies dur-
ing the studied year.
Basics of Stochastic Simulation
For such a procedure, two important questions arise. The first is how to find the pseudorandom values. The second is how large n should be.
One method for finding the pseudorandom values is the inversion method, which we discuss next.
Example 89.2 ‡
You wish to simulate a value, Y, from a two point mixture. With probability
0.3, Y is exponentially distributed with mean 0.5. With probability 0.7, Y
is uniformly distributed on [−3, 3]. You simulate the mixing variable where
low values correspond to the exponential distribution. Then you simulate
the value of Y, where low random numbers correspond to low values of Y.
Your uniform random numbers from [0, 1] are 0.25 and 0.69 in that order.
Calculate the simulated value of Y.
Solution.
The statement "low random numbers correspond to low values of Y" implies that the inversion method is to be used. The value 0.25 is used to simulate the mixing variable and the number 0.69 is used to simulate the value of Y. We are told that low values of the mixing variable correspond to the exponential distribution. Since 0.25 < 0.3, the exponential distribution is to be used. By the inversion method, the simulated value of Y is the number y such that Pr(Y ≤ y) = 0.69. That is, y is the solution to the equation $1 - e^{-\frac{y}{0.5}} = 0.69$. Solving this equation, we find y = 0.5855
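The two-step simulation in this solution can be written as a short function. This is a sketch only; the function name `simulate_mixture` is illustrative, and the uniform branch (not needed for the given random numbers) is included to show how the other component would be inverted.

```python
import math


def simulate_mixture(u_mix, u_val):
    """Inversion method for the two-point mixture of Example 89.2."""
    if u_mix < 0.3:
        # Exponential with mean 0.5: solve 1 - exp(-y / 0.5) = u_val.
        return -0.5 * math.log(1 - u_val)
    # Uniform on [-3, 3]: solve (y + 3) / 6 = u_val.
    return -3 + 6 * u_val


y = simulate_mixture(0.25, 0.69)
print(round(y, 4))
```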
Example 89.3
The cdf of X is given by
\[ F_X(x) = \begin{cases} 0.3x, & 0 \le x < 1\\ 0.3 + 0.35x, & 1 \le x \le 2. \end{cases} \]
Solution.
To find the simulated value for a uniform number $0 \le u_1 < 0.3$, we solve the equation $0.3x_1 = u_1$ obtaining $x_1 = \frac{u_1}{0.3}$. For any uniform number $0.3 \le u_2 < 0.65$, the simulated value is $x_2 = 1$. Note that Pr(0.3 ≤ U < 0.65) = 0.65 − 0.3 = 0.35 =
Example 89.4
Suppose that
\[ F_X(x) = \begin{cases} 0.5x, & 0 \le x < 1.2\\ 0.6, & 1.2 \le x < 2.4\\ 0.5x - 0.6, & 2.4 \le x \le 3.2. \end{cases} \]
Solution.
For 0 ≤ x < 1.2, we have $0 \le F_X(x) < 0.6$. Since 0.5 is in that range, the simulated value resulting from 0.5 is found by solving the equation $0.5x_1 = 0.5$, which implies $x_1 = 1$. Next, we see that $F_X(x) = 0.6$ for 1.2 ≤ x ≤ 2.4. For this case, $x_2 = 2.4$. Finally, 0.6 ≤ 0.8 ≤ 1, so the corresponding simulated value is found by solving the equation $0.5x_3 - 0.6 = 0.8$, resulting in $x_3 = 2.8$.
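A minimal sketch of the inverse of this particular cdf follows. The function name is illustrative, and the uniform numbers 0.5, 0.6 and 0.8 are taken from the solution since the example statement does not list them here. At u = 0.6 the second branch returns the right endpoint 2.4 of the flat stretch, matching the convention used in the solution.

```python
def inverse_cdf_example_89_4(u):
    """Inversion for the piecewise cdf of Example 89.4, for 0 <= u <= 1."""
    if u < 0.6:
        # Increasing part on [0, 1.2): solve 0.5 x = u.
        return u / 0.5
    # u = 0.6 lands on the flat stretch and maps to x = 2.4;
    # for u > 0.6 solve 0.5 x - 0.6 = u on [2.4, 3.2].
    return (u + 0.6) / 0.5


simulated = [inverse_cdf_example_89_4(u) for u in (0.5, 0.6, 0.8)]
print([round(x, 4) for x in simulated])
```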
Recall that a discrete distribution has jumps at the possible values of the
random variable and is constant in between, two features covered in the
previous two examples.
Example 89.5
Simulate values from a binomial distribution with m = 2 and q = 0.3 using
uniform numbers.
Solution.
The cdf of X is given by
\[ F_X(x) = \sum_{i=0}^{[x]}\binom{m}{i}q^i(1-q)^{m-i} \]
where [x] is the greatest integer less than or equal to x. Thus, for m = 2 and q = 0.3, we have
\[ F_X(x) = \begin{cases} 0, & x < 0\\ 0.49, & 0 \le x < 1\\ 0.91, & 1 \le x < 2\\ 1, & x \ge 2. \end{cases} \]
For 0 ≤ u < 0.49, the simulated value is x = 0. For 0.49 ≤ u < 0.91, the simulated value is x = 1. For 0.91 ≤ u < 1, the simulated value is x = 2
Remark 89.1
Note that FX (x) = 0.49 for all 0 ≤ x < 1. But Pr(0.49 ≤ U < 0.91) =
0.91 − 0.49 = Pr(X = 1). This is the motivation for choosing the largest
value in an interval where the cdf is constant.
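For a discrete distribution the inversion method amounts to walking up the cdf until it first exceeds u. A sketch for the binomial case, assuming a uniform number 0 ≤ u < 1 (the function name is illustrative):

```python
from math import comb


def binomial_inverse(u, m, q):
    """Return the smallest x with u < F(x) for a binomial(m, q) cdf F."""
    cdf = 0.0
    for x in range(m + 1):
        cdf += comb(m, x) * q**x * (1 - q) ** (m - x)
        if u < cdf:
            return x
    return m  # guards against floating-point round-off when u is close to 1


print([binomial_inverse(u, 2, 0.3) for u in (0.30, 0.50, 0.95)])
```

The strict inequality `u < cdf` implements the rule in the remark: a uniform number equal to a cdf value is mapped to the next jump point, that is, to the largest value of the interval on which the cdf is constant.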
Solution.
The empirical estimate of µ = E(X) is $\bar{x}$. The central limit theorem tells us that $\bar{X}_n$ is approximately normal with mean µ and variance $\sigma^2/n$.
Since we do not know $\sigma^2$ and $\mu^2$, we estimate them with the sample variance and mean. Thus, we cease simulation when
\[ n \ge \frac{38416s^2}{\bar{x}^2}. \]
We can apply a similar sort of idea to estimating a probability.
Solution.
Let $\frac{P_n}{n}$ be the empirical estimator of $F_X(1000)$ where $P_n$ is the number of values at or below 1000 after n simulations (see Section 49). The central limit theorem tells us that $\frac{P_n}{n}$ is approximately normal with mean $F_X(1000)$ and variance $F_X(1000)[1-F_X(1000)]/n$ (see Section 53). Using $\frac{P_n}{n}$ as an estimator of $F_X(1000)$ and arguing as in the previous example, we arrive at
\[ n \ge 38416\,\frac{n-P_n}{P_n}. \]
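Both stopping rules can be coded directly. The example statements are not reproduced here, so the sketch below assumes the constant 38416 arises as $(1.96/0.01)^2$, that is, 95% confidence of being within 1% of the true value; the function names and default arguments are illustrative.

```python
from statistics import mean, variance


def enough_for_mean(sample, z=1.96, rel_err=0.01):
    """True once n >= (z / rel_err)^2 * s^2 / xbar^2 for the simulated sample."""
    n = len(sample)
    xbar = mean(sample)
    s2 = variance(sample)  # sample variance with divisor n - 1
    return n >= (z / rel_err) ** 2 * s2 / xbar**2


def enough_for_probability(n, p_n, z=1.96, rel_err=0.01):
    """True once n >= (z / rel_err)^2 * (n - p_n) / p_n.

    p_n is the number of simulated values at or below the point of interest.
    """
    return n >= (z / rel_err) ** 2 * (n - p_n) / p_n


print((1.96 / 0.01) ** 2)
```

In practice these checks would be evaluated after each batch of simulated values, stopping the first time they return `True`.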
89 THE INVERSION METHOD FOR SIMULATING RANDOM VARIABLES673
Practice Problems
Problem 89.1 ‡
To estimate E(X), you have simulated X1 , X2 , X3 , X4 , and X5 with the
following results:
1 2 3 4 5.
You want the standard deviation of the estimator of E(X) to be less than
0.05. Estimate the total number of simulations needed.
Problem 89.2 ‡
A company insures 100 people age 65. The annual probability of death for
each person is 0.03. The deaths are independent.
Use the inversion method to simulate the number of deaths in a year. Do
this three times using:
u1 =0.20
u2 =0.03
u3 =0.09.
Problem 89.3 ‡
You simulate observations from a specific distribution F (x), such that the
number of simulations N is sufficiently large to be at least 95 percent con-
fident of estimating F (1500) correctly within 1 percent.
Let P represent the number of simulated values less than 1500. Determine
which of the following could be values of N and P.
(A) N = 2000 P = 1890
(B) N = 3000 P = 2500
(C) N = 3500 P = 3100
(D) N = 4000 P = 3630
(E) N = 4500 P = 4020
Problem 89.4 ‡
You are planning a simulation to estimate the mean of a non-negative ran-
dom variable. It is known that the population standard deviation is 20%
larger than the population mean.
Use the central limit theorem to estimate the smallest number of trials
needed so that you will be at least 95% confident that the simulated mean
is within 5% of the population mean.
Problem 89.5 ‡
Simulation is used to estimate the value of the cumulative distribution func-
tion at 300 of the exponential distribution with mean 100.
Determine the minimum number of simulations so that there is at least a
99% probability that the estimate is within ±1% of the correct value.
Problem 89.6 ‡
You are simulating a compound claims distribution:
(i) The number of claims, N, is binomial with m = 3 and mean 1.8.
(ii) Claim amounts are uniformly distributed on {1, 2, 3, 4, 5}.
(iii) Claim amounts are independent, and are independent of the number of
claims.
(iv) You simulate the number of claims, N, then the amounts of each of those
claims, X1 , X2 , · · · , XN . Then you repeat another N, its claim amounts, and
so on until you have performed the desired number of simulations.
(v) When the simulated number of claims is 0, you do not simulate any
claim amounts.
(vi) All simulations use the inverse transform method, with low random
numbers corresponding to few claims or small claim amounts.
(vii) Your random numbers from (0, 1) are
0.7 0.1 0.3 0.1 0.9 0.5 0.5 0.7 0.3 0.1
Calculate the aggregate claim amount associated with your third simulated
value of N.
90 APPLICATIONS OF SIMULATION IN ACTUARIAL MODELING675
Example 90.1 ‡
Unlimited claim severities for a warranty product follow the lognormal dis-
tribution with parameters µ = 5.6 and σ = 0.75.
You use simulation to generate severities. The following are six uniform
(0, 1) random numbers:
Using these numbers and the inversion method, calculate the average pay-
ment per claim for a contract with a policy limit of 400.
Solution.
Let X be the lognormal random variable with µ = 5.6 and σ = 0.75. Its cdf is given by
\[ F(x) = \Phi\left(\frac{\ln x-5.6}{0.75}\right) \]
where Φ is the cdf of the standard normal distribution. Using the table of the standard normal distribution, we find
\[ 1-\Phi\left(\frac{\ln x_1-5.6}{0.75}\right) = 1-0.6179 = 0.3821 \implies \frac{\ln x_1-5.6}{0.75} = 0.3 \implies x_1 = 338.66. \]
In a similar manner, we find
\begin{align*}
1-\Phi\left(\frac{\ln x_2-5.6}{0.75}\right) &= 1-0.4602 = 0.5398 \implies \frac{\ln x_2-5.6}{0.75} = -0.1 \implies x_2 = 250.89\\
1-\Phi\left(\frac{\ln x_3-5.6}{0.75}\right) &= 1-0.9452 = 0.0548 \implies \frac{\ln x_3-5.6}{0.75} = 1.6 \implies x_3 = 897.85\\
1-\Phi\left(\frac{\ln x_4-5.6}{0.75}\right) &= 1-0.0808 = 0.9192 \implies \frac{\ln x_4-5.6}{0.75} = -1.4 \implies x_4 = 94.63\\
1-\Phi\left(\frac{\ln x_5-5.6}{0.75}\right) &= 1-0.7881 = 0.2119 \implies \frac{\ln x_5-5.6}{0.75} = 0.8 \implies x_5 = 492.75\\
1-\Phi\left(\frac{\ln x_6-5.6}{0.75}\right) &= 1-0.4207 = 0.5793 \implies \frac{\ln x_6-5.6}{0.75} = -0.2 \implies x_6 = 232.76.
\end{align*}
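The remaining step, applying the policy limit and averaging, can be sketched as follows. The six uniform numbers are read off the solution above, since the list in the example statement is not reproduced here. The code uses the exact normal quantile from Python's `statistics.NormalDist` rather than a printed table, so the simulated losses differ from the tabulated ones in the second decimal place.

```python
from math import exp
from statistics import NormalDist

mu, sigma, limit = 5.6, 0.75, 400
uniforms = [0.6179, 0.4602, 0.9452, 0.0808, 0.7881, 0.4207]  # from the solution

std_normal = NormalDist()  # standard normal, mean 0 and sd 1
# Inversion: solve Phi((ln x - mu) / sigma) = u for x.
losses = [exp(mu + sigma * std_normal.inv_cdf(u)) for u in uniforms]
# A policy limit of 400 caps each payment.
payments = [min(x, limit) for x in losses]
average_payment = sum(payments) / len(payments)
print(round(average_payment, 2))
```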
Solution.
Finding the first three probabilities of the Poisson distribution, we obtain
\begin{align*}
p_0 &= \frac{e^{-4}(4^0)}{0!} = 0.0183\\
p_1 &= \frac{e^{-4}(4^1)}{1!} = 0.0733\\
p_2 &= \frac{e^{-4}(4^2)}{2!} = 0.1465.
\end{align*}
Thus, $F_X(x) = 0.0183$ for 0 ≤ x < 1, $F_X(x) = 0.0916$ for 1 ≤ x < 2 and $F_X(x) = 0.2381$ for 2 ≤ x < 3. Since u = 0.13 falls in the interval (0.0916, 0.2381), the simulated number of claims is 2.
For the simulated amount of claim corresponding to $u_1 = 0.05$, we have
\[ 1-e^{-\frac{x_1}{1000}} = 0.05 \implies x_1 = 51.29. \]
Likewise,
\[ 1-e^{-\frac{x_2}{1000}} = 0.95 \implies x_2 = 2995.73. \]
Since the simulated number of claims is 2, there is no need to consider $u_3$. In conclusion, the total losses are 51.29 + 2995.73 = 3047.02
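The aggregate simulation above combines a discrete inversion for the claim count with a continuous inversion for each claim amount. The sketch below assumes, as the solution indicates, a Poisson mean of 4 and exponential claim amounts with mean 1000; the function names are illustrative.

```python
import math


def poisson_inverse(u, lam):
    """Smallest n with u < F(n) for a Poisson(lam) cdf F."""
    n = 0
    p = math.exp(-lam)  # Pr(N = 0)
    cdf = p
    while u >= cdf:
        n += 1
        p *= lam / n    # recursion: p_n = p_{n-1} * lam / n
        cdf += p
    return n


def exponential_inverse(u, mean):
    """Solve 1 - exp(-x / mean) = u for x."""
    return -mean * math.log(1 - u)


n_claims = poisson_inverse(0.13, 4)
claim_uniforms = [0.05, 0.95]  # only as many as the simulated claim count
total = sum(exponential_inverse(u, 1000) for u in claim_uniforms[:n_claims])
print(n_claims, round(total, 2))
```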
Example 90.3 ‡
Losses for a warranty product follow the lognormal distribution with under-
lying normal mean and standard deviation of 5.6 and 0.75 respectively.
Using these numbers and the inversion method, calculate the average pay-
ment per loss for a contract with a deductible of 100.
Solution.
Let X be the lognormal random variable with µ = 5.6 and σ = 0.75. Its cdf is given by
\[ F(x) = \Phi\left(\frac{\ln x-5.6}{0.75}\right) \]
where Φ is the cdf of the standard normal distribution. Using the table of the standard normal distribution, we find
\[ 1-\Phi\left(\frac{\ln x_1-5.6}{0.75}\right) = 1-0.6217 = 0.3783 \implies \frac{\ln x_1-5.6}{0.75} = 0.31 \implies x_1 = 341.21. \]
Solution.
Let X denote the outcome of the hunt. We first find the cdf of X. We are told that F(0) = 0.80. Also, for 1000 ≤ x ≤ 5000, we have
\[ F(x) = 0.8 + \frac{0.2}{5000-1000}(x-1000). \]
Note the presence of 0.2 in the second term of F(x). Without it, F(5000) = 1.8 > 1, which contradicts the definition of F(x) (i.e. 0 ≤ F(x) ≤ 1). So, the cdf of X can be expressed as
\[ F(x) = \begin{cases} 0, & x < 0\\ 0.8, & 0 \le x < 1000\\ 0.75 + 0.00005x, & 1000 \le x \le 5000\\ 1, & x > 5000. \end{cases} \]
Example 90.5 ‡
You are simulating the gain/loss from insurance where:
(i) Claim occurrences follow a Poisson process with $\lambda = \frac{2}{3}$ per year.
(ii) Times between successive claims follow an exponential distribution with
mean 1.5.
(iii) Each claim amount is 1, 2 or 3 with p(1) = 0.25, p(2) = 0.25, and
p(3) = 0.50.
(iv) Claim occurrences and amounts are independent. Successive time claims
are independent.
(v) The annual premium equals expected annual claims plus 1.8 times the
standard deviation of annual claims.
(vi) i = 0.
You use 0.25, 0.40, 0.60, and 0.80 from the unit interval and the inversion
Solution.
Let N be the number of claims, X the size of a claim, and S the total annual
gain/loss. We have
Let u1 = 0.30. Since 0.25 < 0.30 ≤ 0.5, we have x1 = 2. Let u2 = 0.60.
Since 0.5 < 0.6 ≤ 1, we find x2 = 3. Finally, the gain to the insurer is
10.05 − (2 + 3) = 4.95
Practice Problems
Problem 90.1 ‡
A dental benefit is designed so that a deductible of 100 is applied to annual
dental charges. The reimbursement to the insured is 80% of the remaining
dental charges subject to an annual maximum reimbursement of 1000.
You are given:
(i) The annual dental charges for each insured are exponentially distributed
with mean 1000.
(ii) Use the following uniform (0, 1) random numbers and the inversion
method to generate four values of annual dental charges:
Problem 90.2 ‡
For a warranty product you are given:
(i) Paid losses follow the lognormal distribution with µ = 13.294 and σ =
0.494.
(ii) The ratio of estimated unpaid losses to paid losses, y, is modeled by
\[ y = 0.801x^{0.851}e^{-0.747x} \]
where
x = 2006 − contract purchase year.
The inversion method is used to simulate four paid losses with the following
four uniform (0,1) random numbers:
Using the simulated values, calculate the empirical estimate of the average
unpaid losses for purchase year 2005.
Problem 90.3 ‡
You are given:
(i) The cumulative distribution for the annual number of losses for a poli-
cyholder is:
n FN (n)
0 0.125
1 0.312
2 0.500
3 0.656
4 0.773
5 0.855
.. ..
. .
(ii) The loss amounts follow the Weibull distribution with θ = 200 and τ = 2.
(iii) There is a deductible of 150 for each claim subject to an annual maxi-
mum out-of-pocket of 500 per policy.
The inversion method is used to simulate the number of losses and loss
amounts for a policyholder:
(a) For the number of losses use the random number 0.7654.
(b) For loss amounts use the random numbers: 0.2738 0.5152 0.7537 0.6481 0.3153.
Use the random numbers in order and only as needed.
Based on the simulation, calculate the insurer’s aggregate payments for this
policyholder.
Problem 90.4 ‡
The price of a non-dividend-paying stock is to be estimated using simulation. It is known that:
(i) The price $S_t$ follows the lognormal distribution: $\ln\left(\frac{S_t}{S_0}\right) \sim N\left[\left(\alpha-\frac{\sigma^2}{2}\right)t, \sigma^2t\right]$.
(ii) $S_0 = 50$, α = 0.15, and σ = 0.30.
Using the following uniform (0, 1) random numbers and the inversion method,
three prices for two years from the current date are simulated.
Problem 90.5 ‡
You are given:
(i) For a company, the workers compensation lost time claim amounts follow
the Pareto distribution with α = 2.8 and θ = 36.
(ii) The cumulative distribution of the frequency of these claims is:
n FN (n)
0 0.5556
1 0.8025
2 0.9122
3 0.9610
4 0.9827
5 0.9923
.. ..
. .
Problem 90.6 ‡
N is the random variable for the number of accidents in a single year. N
follows the distribution:
\[ \Pr(N = n) = 0.9(0.1)^{n-1}, \quad n = 1, 2, \cdots. \]
Xi is the random variable for the claim amount of the ith accident. Xi
follows the distribution:
u v1 v2 v3 v4
0.05 0.30 0.22 0.52 0.46
Calculate the total amount of claims during the year for the first simulation.
Problem 90.7 ‡
Annual dental claims are modeled as a compound Poisson process where the
number of claims has mean 2 and the loss amounts have a two-parameter
Pareto distribution with θ = 500 and α = 2.
An insurance pays 80% of the first 750 of annual losses and 100% of annual
losses in excess of 750.
You simulate the number of claims and loss amounts using the inverse trans-
form method with small random numbers corresponding to small numbers
of claims or small loss amounts.
The random number to simulate the number of claims is 0.8. The random
numbers to simulate loss amounts are 0.60, 0.25, 0.70, 0.10 and 0.80.
Calculate the total simulated insurance claims for one year.
In this section, we use simulation to estimate the risk measures VaR and TVaR. Let $y_1 \le y_2 \le \cdots \le y_n$ be a simulated sample of size n of a random variable. For a percentile p, let k = [pn] + 1, where [x] is the largest integer less than or equal to x. The estimators of VaR and TVaR are
\[ \widehat{\text{VaR}}_p(X) = y_k \quad\text{and}\quad \widehat{\text{TVaR}}_p(X) = \frac{1}{n-k+1}\sum_{i=k}^{n} y_i. \]
\[ s_p^2 = \frac{1}{n-k}\sum_{i=k}^{n}(y_i-\widehat{\text{TVaR}}_p(X))^2. \]
Example 91.1
Consider the following sample of simulated values
Find $\widehat{\text{VaR}}_p(X)$, $\widehat{\text{TVaR}}_p(X)$, and $\widehat{\text{Var}}(\widehat{\text{TVaR}}_p(X))$ for p = 0.6.
Solution.
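Rearranging the values in increasing order, we obtain
\begin{align*}
k &= [pn] + 1 = [3.6] + 1 = 4\\
\widehat{\text{VaR}}_p(X) &= 153\\
\widehat{\text{TVaR}}_p(X) &= \frac{1}{n-k+1}\sum_{i=k}^{n} y_i = \frac{1}{6-4+1}(153+189+210) = 184\\
s_p^2 &= \frac{1}{n-k}\sum_{i=k}^{n}(y_i-\widehat{\text{TVaR}}_p(X))^2\\
&= \frac{1}{6-4}[(153-184)^2+(189-184)^2+(210-184)^2] = 831\\
\widehat{\text{Var}}(\widehat{\text{TVaR}}_p(X)) &= \frac{s_p^2 + p[\widehat{\text{TVaR}}_p(X)-\widehat{\text{VaR}}_p(X)]^2}{n-k+1}\\
&= \frac{831+0.6(184-153)^2}{6-4+1} = 469.20
\end{align*}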
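The three estimators used in this solution can be sketched as one function. The sample listed in the example statement is not reproduced here, so the data below keep the three largest values from the solution (153, 189, 210) and fill in the three smallest with hypothetical values (60, 95, 120), which do not affect the results for p = 0.6. The function name is illustrative.

```python
import math


def var_tvar_estimates(sample, p):
    """Empirical VaR, TVaR, s_p^2 and the variance estimate of TVaR.

    Requires k = floor(p * n) + 1 < n so that s_p^2 is defined.
    """
    y = sorted(sample)
    n = len(y)
    k = math.floor(p * n) + 1          # 1-based order-statistic index
    tail = y[k - 1:]                   # y_k, ..., y_n
    var_hat = y[k - 1]
    tvar_hat = sum(tail) / (n - k + 1)
    s2 = sum((v - tvar_hat) ** 2 for v in tail) / (n - k)
    var_of_tvar = (s2 + p * (tvar_hat - var_hat) ** 2) / (n - k + 1)
    return var_hat, tvar_hat, s2, var_of_tvar


sample = [210, 95, 153, 120, 189, 60]  # 60, 95, 120 are hypothetical fillers
var_hat, tvar_hat, s2, var_of_tvar = var_tvar_estimates(sample, 0.6)
print(var_hat, tvar_hat, s2, round(var_of_tvar, 2))
```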
Practice Problems
Problem 91.1
Consider the following sample of simulated values
Find $\widehat{\text{VaR}}_p(X)$, $\widehat{\text{TVaR}}_p(X)$, and $\widehat{\text{Var}}(\widehat{\text{TVaR}}_p(X))$ for p = 0.3.
Example 92.1
A sample of size 2 contains the values x1 = 2 and x2 = 4. Calculate the
MSE of the unbiased estimator of the population mean using the bootstrap
method.
Solution.
The original sample mean is $\bar{x} = \frac{2+4}{2} = 3$. Since the original sample is of size 2, there are $2^2 = 4$ bootstrap samples. The table below provides the various samples along with their mean and square deviation.
Sample    $\bar{X}_i$    $(\bar{X}_i-\bar{X})^2$
2,2       2              $(2-3)^2 = 1$
2,4       3              $(3-3)^2 = 0$
4,2       3              $(3-3)^2 = 0$
4,4       4              $(4-3)^2 = 1$
Total                    2
Hence, the bootstrap estimate is given by
\[ \widehat{MSE}(\hat{\mu}) = \frac{1+0+0+1}{4} = 0.5 \]
92 THE BOOTSTRAP METHOD FOR ESTIMATING MEAN SQUARE ERROR687
Example 92.2 ‡
You are given a random sample of two values from a distribution function
F : x1 = 1 and x2 = 3.
You estimate θ(F) = Var(X) using the estimator $g(X_1, X_2) = \frac{1}{2}\sum_{i=1}^{2}(X_i-\bar{X})^2$.
Solution.
The estimator for the original sample is $g_0 = g(1, 3) = 1$. We have the following table.
Sample    $X_1$    $X_2$    $\bar{X}_i$    $g_i = g(X_1, X_2)$    $(g_i-g_0)^2$
1         1        3        2              1                      0
2         1        1        1              0                      1
3         3        1        2              1                      0
4         3        3        3              0                      1
Total                                                             2
Hence, the bootstrap estimate is given by
\[ \widehat{MSE}(g) = \frac{2}{4} = 0.5 \]
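For small samples the bootstrap estimate can be computed exactly by enumerating all $n^n$ equally likely resamples, as in the two tables above. The sketch below does this with `itertools.product`; the function names and the `target` argument (the value of the estimated quantity under the empirical distribution) are illustrative.

```python
from itertools import product


def bootstrap_mse(sample, estimator, target):
    """Exact bootstrap MSE: average squared error over all n^n resamples."""
    resamples = list(product(sample, repeat=len(sample)))
    return sum((estimator(s) - target) ** 2 for s in resamples) / len(resamples)


def sample_mean(s):
    return sum(s) / len(s)


def biased_variance(s):
    """Variance estimator with divisor n, as in Example 92.2."""
    m = sample_mean(s)
    return sum((x - m) ** 2 for x in s) / len(s)


mse_mean = bootstrap_mse((2, 4), sample_mean, target=3)          # Example 92.1
mse_variance = bootstrap_mse((1, 3), biased_variance, target=1)  # Example 92.2
print(mse_mean, mse_variance)
```

Full enumeration grows as $n^n$, so for larger samples one would draw a random subset of resamples instead, as is done in Example 92.3.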
Example 92.3 ‡
A sample of claim amounts is {300, 600, 1500}. By applying the deductible
to this sample, the loss elimination ratio for a deductible of 100 per claim
is estimated to be 0.125. You are given the following simulations from the
sample:
Simulation Claim
1 600 600 1500
2 1500 300 1500
3 1500 300 600
4 600 600 300
5 600 300 1500
6 600 600 1500
7 1500 1500 1500
8 1500 300 1500
9 300 600 300
10 600 600 600
Determine the bootstrap approximation to the mean square error of the
estimate.
Solution.
We have
Simulation X1 X2 X3 LER (LER − 0.125)2
1 600 600 1500 0.111111 0.000193
2 1500 300 1500 0.090909 0.001162
3 1500 300 600 0.125000 0.000000
4 600 600 300 0.200000 0.005625
5 600 300 1500 0.125000 0.000000
6 600 600 1500 0.111111 0.000193
7 1500 1500 1500 0.066667 0.003403
8 1500 300 1500 0.090909 0.001162
9 300 600 300 0.250000 0.015625
10 600 600 600 0.166667 0.001736
Total 0.029099
The bootstrap estimate of the mean square error is $\frac{0.029099}{10} = 0.0029099$
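The table can be reproduced with a few lines of Python. This is a sketch that uses the ten simulated samples exactly as listed and the empirical loss elimination ratio $E(X \wedge d)/E(X)$; the function name `ler` is illustrative.

```python
def ler(claims, deductible):
    """Empirical loss elimination ratio E[X ^ d] / E[X]."""
    return sum(min(x, deductible) for x in claims) / sum(claims)

original = [300, 600, 1500]
simulations = [
    [600, 600, 1500], [1500, 300, 1500], [1500, 300, 600], [600, 600, 300],
    [600, 300, 1500], [600, 600, 1500], [1500, 1500, 1500], [1500, 300, 1500],
    [300, 600, 300], [600, 600, 600],
]
d = 100
ler0 = ler(original, d)                       # estimate from the original sample
mse = sum((ler(s, d) - ler0) ** 2 for s in simulations) / len(simulations)
print(ler0, round(mse, 7))
```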
Practice Problems
Problem 92.1 ‡
You are given a random sample of two values from a distribution function
F : x1 = 1 and x2 = 3.
You estimate θ(F) = Var(X) using the estimator $g(X_1, X_2) = \sum_{i=1}^{2}(X_i - \bar{X})^2$ where $\bar{X} = \frac{X_1+X_2}{2}$.
Determine the bootstrap approximation to the mean square error.
Problem 92.2 ‡
With the bootstrapping technique, the underlying distribution function is
estimated by which of the following?
(A) The empirical distribution function
(B) A normal distribution function
(C) A parametric distribution function selected by the modeler
(D) Any of (A), (B) or (C)
(E) None of (A), (B) or (C).
Problem 92.3 ‡
Three observed values of the random variable X are:
1 1 4
You estimate the third central moment of X using the estimator:
\[ g(X_1, X_2, X_3) = \frac{1}{3}\sum_{i=1}^{3}(X_i - \bar{X})^3. \]
Determine the bootstrap estimate of the mean-squared error of g.
Problem 92.4 ‡
For a policy that covers both fire and wind losses, you are given:
(i) A sample of fire losses was 3 and 4.
(ii) Wind losses for the same period were 0 and 3.
(iii) Fire and wind losses are independent, but do not have identical distri-
butions.
Based on the sample, you estimate that adding a policy deductible of 2 per
wind claim will eliminate 20% of the insured loss.
Determine the bootstrap approximation to the mean square error of the
estimate.
Problem 92.5 ‡
The random variable X has the exponential distribution with mean θ. Calculate the mean-squared error of $X^2$ as an estimator of $\theta^2$.
Answer Key
Section 1
1.1 Deterministic
1.2 Stochastic
1.3 Stochastic
1.4 Stochastic
Section 2
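2.2 Note that Pr(E) > 0 for any event E. Moreover, if S is the sample space then
\[ \Pr(S) = \sum_{i=1}^{\infty}\Pr(O_i) = \frac{1}{2}\sum_{i=0}^{\infty}\left(\frac{1}{2}\right)^i = \frac{1}{2}\cdot\frac{1}{1-\frac{1}{2}} = 1 \]
\[ \Pr\left(\cup_{n=1}^{\infty}E_n\right) = \sum_{n=1}^{\infty}\sum_{j=1}^{\infty}\Pr(O_{nj}) = \sum_{n=1}^{\infty}\Pr(E_n) \]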
2.3 0.5
2.4 0.56
2.5 0.66
2.6 0.52
2.7 0.05
2.8 0.6
2.9 0.48
2.10 0.04
2.11 The probability that the first ball is red and the second ball is blue is Pr(RB) = 0.3.
2.12 The probability that the first ball is red and the second ball is blue is Pr(RB) = 6/25.
2.13 0.173
2.14 0.467
2.15 0.1584
2.16 0.0141
2.17 0.29
2.18 0.42
2.19 0.22
2.20 0.657
Section 3
3.2
x       2      3      4      5      6      7      8      9      10     11     12
p(x)    1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
3.3
\[ p(x) = \begin{cases} p, & x = 1 \\ 1-p, & x = 0 \\ 0, & x \ne 0, 1. \end{cases} \]
3.4 1/9
3.5 0.469
3.6 0.132
3.7 0.3
3.8 α = 1.
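3.9
\[ F(x) = \begin{cases} 0, & x < 1 \\ 0.25, & 1 \le x < 2 \\ 0.75, & 2 \le x < 3 \\ 0.875, & 3 \le x < 4 \\ 1, & 4 \le x \end{cases} \]
3.10 (a)
\[ F(x) = \begin{cases} 0, & x < 0 \\ 1 - \frac{1}{(1+x)^{a-1}}, & x \ge 0. \end{cases} \]
(b)
\[ F(x) = \begin{cases} 0, & x < 0 \\ 1 - e^{-kx^{\alpha}}, & x \ge 0. \end{cases} \]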
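3.11
\[ F(n) = P(X \le n) = \sum_{k=0}^{n}P(X = k) = \sum_{k=0}^{n}\frac{1}{3}\left(\frac{2}{3}\right)^k = \frac{1}{3}\cdot\frac{1-\left(\frac{2}{3}\right)^{n+1}}{1-\frac{2}{3}} = 1 - \left(\frac{2}{3}\right)^{n+1} \]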
3.13 $f(x) = F'(x) = \frac{e^x}{(1+e^x)^2}$
3.14 (a) We have that S(0) = 1, $S'(x) = -\frac{1}{20}(100-x)^{-\frac{1}{2}} \le 0$, S(x) is right continuous, and S(100) = 0. Thus, S satisfies the properties of a survival function.
(b) $F(x) = 1 - S(x) = 1 - \frac{1}{10}(100-x)^{\frac{1}{2}}$.
(c) 0.092
3.15 0.149
3.16 $F(x) = 1 - S(x) = \frac{x^2}{100}$, x ≥ 0
3.20 1/480
Section 4
4.2 (b) λ
4.5 (c) $\frac{1}{\lambda^2}$
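4.6 (a)
\[ E(X) = \frac{1}{\theta^{\alpha}\Gamma(\alpha)}\int_0^{\infty} x\,e^{-\frac{x}{\theta}}x^{\alpha-1}\,dx = \frac{\theta}{\Gamma(\alpha)}\int_0^{\infty}\frac{1}{\theta}e^{-\frac{x}{\theta}}\left(\frac{x}{\theta}\right)^{\alpha}dx = \frac{\theta\,\Gamma(\alpha+1)}{\Gamma(\alpha)} = \alpha\theta \]
(b)
\[ E(X^2) = \frac{1}{\Gamma(\alpha)}\int_0^{\infty} x^2\,\frac{x^{\alpha-1}}{\theta^{\alpha}}e^{-\frac{x}{\theta}}\,dx = \frac{1}{\Gamma(\alpha)}\int_0^{\infty} x^{\alpha+1}\frac{1}{\theta^{\alpha}}e^{-\frac{x}{\theta}}\,dx = \frac{\theta^2\Gamma(\alpha+2)}{\Gamma(\alpha)}\int_0^{\infty}\frac{x^{\alpha+1}e^{-\frac{x}{\theta}}}{\theta^{\alpha+2}\Gamma(\alpha+2)}\,dx = \frac{\theta^2\Gamma(\alpha+2)}{\Gamma(\alpha)} \]
where the last integral is the integral of the pdf of a Gamma random variable with parameters (α + 2, θ). Thus,
\[ E(X^2) = \frac{\theta^2\Gamma(\alpha+2)}{\Gamma(\alpha)} = \frac{\theta^2(\alpha+1)\Gamma(\alpha+1)}{\Gamma(\alpha)} = \theta^2\alpha(\alpha+1). \]
Finally,
\[ \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \theta^2\alpha(\alpha+1) - \alpha^2\theta^2 = \alpha\theta^2. \]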
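As a numerical sanity check of the Gamma moments αθ and θ²α(α + 1), the following Python sketch integrates $x^k f(x)$ with a midpoint rule. The parameter values α = 3 and θ = 2 are illustrative assumptions, not taken from the problem.

```python
import math

def gamma_raw_moment(k, alpha, theta, upper_mult=100.0, steps=200_000):
    """Midpoint-rule approximation of E(X^k) for the Gamma(alpha, theta) pdf."""
    upper = upper_mult * theta           # the tail beyond this point is negligible
    h = upper / steps
    norm = theta ** alpha * math.gamma(alpha)
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** k * x ** (alpha - 1) * math.exp(-x / theta)
    return total * h / norm

alpha, theta = 3.0, 2.0                  # illustrative values only
m1 = gamma_raw_moment(1, alpha, theta)
m2 = gamma_raw_moment(2, alpha, theta)
print(m1, alpha * theta)
print(m2, theta ** 2 * alpha * (alpha + 1))
print(m2 - m1 ** 2, alpha * theta ** 2)
```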
4.7 4
4.8 1,417,708,752
4.9 730,182,499.20
4.10 0
4.11 9
4.12 0.3284
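4.13 We have
\[ \mu'_n = \int_0^{\infty}x^n f(x)\,dx = A\int_0^{\infty}x^{B+n}e^{-Cx}\,dx = A\left[\left.-\frac{x^{B+n}e^{-Cx}}{C}\right|_0^{\infty} + \frac{B+n}{C}\int_0^{\infty}x^{B+n-1}e^{-Cx}\,dx\right] = \frac{B+n}{C}\int_0^{\infty}Ax^{B+n-1}e^{-Cx}\,dx = \frac{B+n}{C}E(X^{n-1}). \]
4.14 $\mu = \frac{B+1}{C}$ and $\mu'_2 = \frac{(B+1)(B+2)}{C^2}$
4.15 $\frac{2}{\sqrt{B+1}}$
4.16 $\frac{3(B+3)}{B+1}$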
4.17 0.5
4.18 (a)
\[ F(x) = \begin{cases} 0, & x < 0 \\ \int_{-\infty}^{x}0.005t\,dt = \int_0^{x}0.005t\,dt = 0.0025x^2, & 0 \le x \le 20 \\ 1, & x > 20 \end{cases} \]
(b) The mean is $\frac{40}{3}$ and the variance is $\frac{200}{9}$
(c) 0.354
4.19 16
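4.20 (a) $E(X^k) = \int_1^{\infty}\frac{ax^k}{x^{a+1}}\,dx = \left.\frac{a}{k-a}x^{k-a}\right|_1^{\infty} = \frac{a}{a-k}$, 0 < k < a.
(b) $\frac{1}{\sqrt{a(a-2)}}$
4.21 −1.596
4.22 The mean is $\frac{\theta}{\alpha-1}$ and the variance is $\frac{\alpha\theta^2}{(\alpha-1)^2(\alpha-2)}$
4.23 $\sqrt{\frac{\alpha}{\alpha-2}}$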
4.24 2
4.25 1.7
Section 5
5.1
5.2
Pr(X ≤ 45) = 4/12 = 1/3
Pr(X ≤ 67) = 5/12
Pr(X ≤ 84) = 7/12
Pr(X ≤ 93) = 8/12
Pr(X ≤ 100) = 11/12
Pr(X ≤ 102) = 1.
5.4 0.509175
5.6 108
5.7 $\frac{1}{\lambda}$
5.8 308,8728
5.9 $\frac{e^{-\lambda d}}{\lambda}$
5.10 $\frac{1}{\lambda^2}e^{-\lambda d}(2 - e^{-\lambda d})$
5.11 $\frac{1}{160}$
5.12 94.84
5.13 88.4
θ xφ+1
5.14 θ − φ+1 −x− θφ (φ+1)
5.15
\[ S(x) = \Pr(X > x) = \int_x^{\infty}(1+2t^2)e^{-2t}\,dt = \left.-(1+t+t^2)e^{-2t}\right|_x^{\infty} = (1+x+x^2)e^{-2x} \]
5.16 We have
\[ S_{Y^P}(y) = 1 - F_{Y^P}(y) = 1 - \frac{1 - S_X(y+d) - [1 - S_X(d)]}{S_X(d)} = 1 - \frac{S_X(d) - S_X(y+d)}{S_X(d)} = \frac{S_X(y+d)}{S_X(d)}. \]
5.17 (a)
425
5.18 E[(X − 10)+ ] = and Var[(X − 10)+ ] = 36
5.19 d = 6.
5.20 175
5.21 1875
5.22 3.43
5.23 6.259
Section 6
6.1 We can say either that 1120 is the twentieth percentile or that 1120 is the one-fifth quantile
6.2 3
6.3 2
6.5 The median is M = 0.3466. This means that half the people get in
line less than 0.3466 minutes (about 21 seconds) after the previous person,
while half arrive more than 0.3466 minutes later
6.6 0.693
6.7 998.72
6.8 3 ln 2
6.10 3659
6.11 $a + \sqrt{2\ln 2}$
6.12 − ln (1 − p)
6.14 2
6.15 72.97
6.16 0.4472
6.17 6299.61
6.18 2.3811
6.19 50
6.20 2.71
Section 7
7.1 0.2119
7.2 0.9876
7.3 0.0094
7.4 0.692
7.5 0.1367
7.6 0.0088
7.7 0
7.8 23
7.9 0.0162
7.10 6,342,637.5
7.11 0.8185
7.12 16
7.13 0.1587
7.14 0.9887
7.15 0.0244
7.16 0.9985
7.17 0.1056
7.18 0.8413
7.19 0.8201
7.20 0.224
Section 8
2 1
8.1 E(X) = λ2
and V ar(X) = λ2
8.2 A normal random variable with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$
8.3 0.70
8.4 41.9
8.5 $e^{13t^2+4t}$
8.6 4
8.7 −28
8.8 2
8.9 5000
8.10 10560
8.12 $\frac{tp}{1-t(1-p)}$ provided that $|t| < (1-p)^{-1}$
8.13 True
8.14 $t^a P_X(t^b)$
8.15 $E(X) = \frac{1}{p}$ and $\mathrm{Var}(X) = \frac{1-p}{p^2}$
8.18 $\frac{t^3}{2-t}$
8.19
x       −1      0       1       2      3
p(x)    16/81   32/81   24/81   8/81   1/81
Section 9
9.1 Let m > 0. Then there is M > 0 such that $e^{bx} \ge x^{m+2}$ for all x ≥ M, which is equivalent to saying that $x^m e^{-bx} \le \frac{1}{x^2}$ for x ≥ M. Since $\int_M^{\infty}x^{-2}\,dx < \infty$, the comparison test for improper integrals gives
\[ \int_0^{\infty}x^m e^{-bx}\,dx < \infty. \]
Since $E(X^k)$ is an integral of the above form, we conclude that $E(X^k) < \infty$ for all k > 0. That is, the distribution of X is light-tailed.
9.2 Since X is heavy-tailed, we have $E(X^k) = \int_0^{\infty}x^k f_X(x)\,dx = \infty$ for some k > 0. Now, let t > 0. Let N be large enough so that $e^{tx} \ge x^k$ for all x ≥ N. Hence,
\[ \int_0^{N}x^k f_X(x)\,dx + \int_N^{\infty}e^{tx}f_X(x)\,dx \ge \int_0^{N}x^k f_X(x)\,dx + \int_N^{\infty}x^k f_X(x)\,dx = \int_0^{\infty}x^k f_X(x)\,dx = \infty. \]
Since $\int_0^{N}x^k f_X(x)\,dx < \infty$, we conclude that $\int_N^{\infty}e^{tx}f_X(x)\,dx = \infty$
9.3 We have
\[ M_X(t) = \int_0^{\infty}e^{tx}f_X(x)\,dx \ge \int_N^{\infty}e^{tx}f_X(x)\,dx = \infty \]
θk Γ(α + k)
E(X k ) =
Γ(α)
provided that k < τ. Since E(X k ) is only valid for k < τ, the distribution is
heavy-tailed.
9.6 The Pareto distribution has a heavier tail than the Gamma distribution.
9.7 The Weibull distribution has a lighter tail than the inverse Weibull
distribution.
9.12
Distribution Heavy-Tail Light-Tail
Weibull X
Inverse Pareto X
Normal X
Loglogistic X
9.13
Distribution Heavy-Tail Light-Tail
Paralogistic X
Lognormal X
Inverse Gamma X
Inverse Gaussian X
9.14
Distribution Heavy-Tail Light-Tail
Inverse Paralogistic X
Inverse Exponential X
9.15 $\lim_{x\to\infty}\frac{S_X(x)}{S_Y(x)} = \infty$
9.16 $\lim_{x\to\infty}\frac{S_X(x)}{S_Y(x)} = \infty$
9.17 $\lim_{x\to\infty}\frac{S_X(x)}{S_Y(x)} = \infty$
9.18 c > 0
9.20 The tail of X is heavier than that of Y which in turn is heavier than
the tail of Z
Section 10
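10.1 $\frac{d}{dx}\left[\frac{f(x+y)}{f(x)}\right] = \frac{y(1-\alpha)}{x^2}\left(1+\frac{y}{x}\right)^{\alpha-2}e^{-\frac{y}{\theta}} < 0$ for α > 1.
10.3 We have
\[ h(x) = \frac{f(x)}{S(x)} = \frac{\tau x^{\tau-1}}{\theta^{\tau}}. \]
Hence,
\[ h'(x) = \frac{\tau(\tau-1)x^{\tau-2}}{\theta^{\tau}}. \]
Thus, h(x) is increasing (light-tailed distribution) for τ > 1 and decreasing (heavy-tailed distribution) for 0 < τ < 1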
10.4 X is light-tailed
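10.5 We have
\[ H(x) = \frac{\alpha\theta^{\alpha}}{(x+y+\theta)^{\alpha+1}}\cdot\frac{(x+\theta)^{\alpha+1}}{\alpha\theta^{\alpha}} = \left(\frac{x+\theta}{x+y+\theta}\right)^{\alpha+1}. \]
Hence,
\[ H'(x) = (\alpha+1)\left(\frac{x+\theta}{x+y+\theta}\right)^{\alpha}\frac{y}{(x+y+\theta)^2} > 0. \]
Thus, H(x) is increasing and by Theorem 10.1, h(x) is decreasing which shows that the Pareto distribution is heavy-tailed
10.6 We have
\[ \lim_{x\to\infty}h(x) = \lim_{x\to\infty}\frac{f(x)}{S(x)} = \lim_{x\to\infty}\frac{f'(x)}{-f(x)} = -\lim_{x\to\infty}\frac{d}{dx}[\ln(f(x))] = -\lim_{x\to\infty}\frac{d}{dx}\left[(\alpha-1)\ln x - \frac{x}{\theta}\right] = \lim_{x\to\infty}\left(\frac{1}{\theta} - \frac{\alpha-1}{x}\right) = \frac{1}{\theta}. \]
10.7 We have
\[ \lim_{x\to\infty}e(x) = \lim_{x\to\infty}\frac{\int_x^{\infty}S_X(t)\,dt}{S_X(x)} = \lim_{x\to\infty}\frac{-S_X(x)}{-f_X(x)} = \lim_{x\to\infty}\frac{1}{h(x)}. \]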
10.8 θ
10.9 Since 0 < α < 1, the hazard rate function is decreasing and hence
e(x) is increasing. The result follows from the fact that e(0) = αθ and
e(∞) = θ.
10.10 Since α > 1, the hazard rate function is increasing and hence e(x) is decreasing. The result follows from the fact that e(0) = αθ and e(∞) = θ
10.11 For this distribution we have $f_X(x) = \frac{\alpha\theta^{\alpha}}{(x+\theta)^{\alpha+1}}$ and $S_X(x) = \frac{\theta^{\alpha}}{(x+\theta)^{\alpha}}$. Hence, $h(x) = \frac{f_X(x)}{S_X(x)} = \frac{\alpha}{x+\theta}$. Thus,
\[ \lim_{x\to\infty}e(x) = \frac{1}{\lim_{x\to\infty}h(x)} = \infty. \]
This shows that e(x) is increasing and hence the distribution is heavy-tailed
10.13 (a) $S(x) = \frac{1}{(x+1)^2}$, $f(x) = \frac{2}{(x+1)^3}$, and $h(x) = \frac{2}{x+1}$.
10.16 $\frac{S(x+y)}{S(x)}$ is nonincreasing so that e(x) is nonincreasing and therefore X is light-tailed
10.17 $\frac{f(x+y)}{f(x)}$ is nondecreasing so that h(x) is nonincreasing. Thus, X is heavy-tailed
Section 11
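11.1 (a) $S(x) = \int_x^{\infty}2te^{-t^2}\,dt = \left.-e^{-t^2}\right|_x^{\infty} = e^{-x^2}$.
(b) $f_e(x) = \frac{S(x)}{E(X)} = \frac{2}{\sqrt{\pi}}e^{-x^2}$ for x > 0 and 0 otherwise.
11.2 $h_e(x) = \frac{1}{e(x)} = \frac{2x+3}{2x+5}$
11.3 $S(x) = \frac{e(0)}{e(x)}e^{-\int_0^x\frac{1}{e(t)}\,dt} = (1+x)e^{-x-\frac{x^2}{2}}$, x > 0
11.4 $S_e(x) = \frac{e(x)}{e(0)}S(x) = e^{-x-\frac{x^2}{2}}$, x > 0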
11.5 0.6559
24 4
11.6 (a) E(X) = 3 and E(X 2 ) = 5 (b) 5
11.7 $S(x) = \left(\frac{10}{10+9x}\right)^{\frac{10}{9}}$
11.8 λ
Section 12
12.4 (b)
12.5 (a)
12.9 0
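12.10 We have
\[ \rho(L+\alpha) = E(L+\alpha) + \beta\sqrt{\mathrm{Var}(L+\alpha)} = E(L) + \alpha + \beta\sqrt{\mathrm{Var}(L)} = \rho(L) + \alpha \]
\[ \rho(\alpha L) = E(\alpha L) + \beta\sqrt{\mathrm{Var}(\alpha L)} = \alpha E(L) + \alpha\beta\sqrt{\mathrm{Var}(L)} = \alpha\rho(L) \]
where α > 0.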
Section 13
13.1 a(1 − p) + pb
13.2 1.8974
13.3 2.0227
13.4 60
13.5 $\theta(1-p)^{-\frac{1}{\alpha}}$
13.6 40
13.7 10,000
13.8 347.21
13.9 5
13.10 $\mathrm{VaR}_{0.96} = 400$. This says that there is a 4% chance that the losses will exceed 400
Section 14
14.2 (a) $\pi_{0.90} \approx 1.8974$ and e(1.8974) = 0.0051 (b) $\mathrm{TVaR}_{0.90}(L) = 1.9025$
14.4 (a) θ = 1000 (b) e(100) = 220 (c) $\pi_{0.95} = 647.55$ (d) $\mathrm{TVaR}_{0.95}(L) = 867.55$
14.6 1.3
14.7 1.651
14.8 2.02
14.9 0.82
14.10 100,000
14.11 120.62
Section 15
15.1 We have
\[ F_{cX}(x) = \Pr(cX \le x) = \Pr\left(X \le \frac{x}{c}\right) = F_X\left(\frac{x}{c}\right) = 1 - e^{-\frac{x}{c\theta}}. \]
This is an exponential distribution with parameter cθ
15.2 $F_Y(y) = 1 - e^{-\left(\frac{y}{c}\right)^2}$
15.3 $F_Y(y) = \frac{y-c\theta}{c\theta}$
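15.9 The Gamma distribution with parameters α and θ has a cdf $F_X(x) = \frac{1}{\Gamma(\alpha)}\int_0^{\frac{x}{\theta}}t^{\alpha-1}e^{-t}\,dt$. Let Y = cX. Then
\[ F_Y(y) = \Pr(Y \le y) = \Pr\left(X \le \frac{y}{c}\right) = \frac{1}{\Gamma(\alpha)}\int_0^{\frac{y}{c\theta}}t^{\alpha-1}e^{-t}\,dt. \]
This is a Gamma distribution with parameters α and cθ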
15.10 100
15.11 Letting α = 1, we obtain $f_X(x) = \frac{e^{-\frac{x}{\theta}}}{\theta}$ which is the pdf of an exponential distribution
15.12 0.0295
Section 16
16.1 0.0949
16.2 100
16.5 0.0568
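16.6 35
16.7
\[ F_X(x) = 1 - a_1\left(\frac{\theta_1}{x+\theta_1}\right)^{\alpha_1} - a_2\left(\frac{\theta_2}{x+\theta_2}\right)^{\alpha_2} - \cdots - a_N\left(\frac{\theta_N}{x+\theta_N}\right)^{\alpha_N} \]
where $a_j > 0$ and $\sum_{i=1}^{N}a_i = 1$.
\[ f_X(x) = a_1\left[\frac{\alpha_1\theta_1^{\alpha_1}}{(x+\theta_1)^{\alpha_1+1}}\right] + a_2\left[\frac{\alpha_2\theta_2^{\alpha_2}}{(x+\theta_2)^{\alpha_2+1}}\right] + \cdots + a_N\left[\frac{\alpha_N\theta_N^{\alpha_N}}{(x+\theta_N)^{\alpha_N+1}}\right] \]
\[ h_X(x) = \frac{a_1\frac{\alpha_1\theta_1^{\alpha_1}}{(x+\theta_1)^{\alpha_1+1}} + a_2\frac{\alpha_2\theta_2^{\alpha_2}}{(x+\theta_2)^{\alpha_2+1}} + \cdots + a_N\frac{\alpha_N\theta_N^{\alpha_N}}{(x+\theta_N)^{\alpha_N+1}}}{a_1\left(\frac{\theta_1}{x+\theta_1}\right)^{\alpha_1} + a_2\left(\frac{\theta_2}{x+\theta_2}\right)^{\alpha_2} + \cdots + a_N\left(\frac{\theta_N}{x+\theta_N}\right)^{\alpha_N}} \]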
16.8 15
16.9 400
16.10 0.146
16.11 0.7566
Section 17
17.1 $E(X) = \frac{2478}{13}$, $\mathrm{Var}(X) = \frac{2283152}{169}$ and the mode is 10
17.2 $E(X \wedge 105) = \frac{1351}{13}$ and $e_X(x) = \frac{1127}{9}$
17.3 0.1659
17.4 $F_X(x) = \frac{1}{13}\cdot$(number of elements in the sample that are ≤ x)
17.5 0.61
17.6 17,566,092.92
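17.7
\[ f_X(x) = \begin{cases} \frac{1}{8}\cdot\frac{1}{13} = \frac{1}{104}, & 90 \le x \le 98 \\ \frac{1}{8}\cdot\frac{3}{13} = \frac{3}{104}, & 100 \le x \le 108 \\ \frac{1}{8}\cdot\frac{2}{13} = \frac{2}{104}, & 130 \le x \le 138 \\ \frac{1}{8}\cdot\frac{4}{13} = \frac{4}{104}, & 176 \le x \le 184 \\ \frac{1}{8}\cdot\frac{1}{13} = \frac{1}{104}, & 206 \le x \le 214 \\ \frac{1}{8}\cdot\frac{1}{13} = \frac{1}{104}, & 346 \le x \le 354 \\ \frac{1}{8}\cdot\frac{1}{13} = \frac{1}{104}, & 520 \le x \le 528 \end{cases} \]
and 0 otherwise.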
Section 18
18.1 0.75
SY (y) = 1 − y 3 , 0 ≤ y < 1
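18.4 We have
\[ F_Y(y) = F_X\left(\frac{y}{\theta}\right) = 1 - \left(1+\frac{y}{\theta}\right)^{-\alpha} = 1 - \left(\frac{\theta}{y+\theta}\right)^{\alpha}. \]
This is the cdf of a Pareto distribution with parameters α and θ. The pdf is
\[ f_Y(y) = \frac{\alpha\theta^{\alpha}}{(y+\theta)^{\alpha+1}}. \]
18.5 We have
\[ F_Y(y) = \Phi\left(\frac{\ln y - \mu}{\sigma}\right). \]
Thus,
\[ F_Z(z) = F_Y\left(\frac{z}{\theta}\right) = \Phi\left(\frac{\ln\frac{z}{\theta} - \mu}{\sigma}\right) = \Phi\left(\frac{\ln z - (\mu + \ln\theta)}{\sigma}\right). \]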
18.7 We have
\[ F_W(w) = F_Z\left(\frac{w}{1+r}\right) = \frac{1}{2}\cdot\frac{w^2}{w^2 + 1000(1+r)^2} + \frac{1}{2}\cdot\frac{w}{w + 1000(1+r)}. \]
Thus, W is an equal mixture of a loglogistic distribution with parameters γ = 2 and $\theta = 10\sqrt{10}(1+r)$ and a Pareto distribution with parameters α = 1 and θ = 1000(1 + r)
Section 19
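19.2 We have
\[ F_Y(y) = 1 - F_X(y^{-1}) = \left(\frac{\theta}{\frac{1}{y}+\theta}\right)^{\alpha} = \left(\frac{y}{y+\frac{1}{\theta}}\right)^{\alpha}. \]
Y has the inverse Pareto distribution with parameters α and $\frac{1}{\theta}$
19.3 $f_Y(y) = y^{-2}f_X(y^{-1}) = \frac{1}{y^2}\cdot\frac{2}{y} = \frac{2}{y^3}$ for y > 1 and 0 otherwise
19.4 $f_Y(y) = \frac{1}{y^2}f_X(y^{-1}) = \frac{y^{1-\alpha}e^{-\frac{1}{y}}}{y^2\,\Gamma(\alpha)}$ and 0 otherwise
19.5 $f_Y(y) = \tau y^{\tau-1}f_X(y^{\tau}) = \frac{\tau y^{\tau-1}}{b}$ for $0 \le y \le b^{\frac{1}{\tau}}$ and 0 otherwise
19.7 We have
\[ F_Y(y) = \Phi\left(\frac{\ln y - \mu}{\sigma}\right) \]
and
\[ f_Y(y) = \frac{1}{y\sigma}f_Z\left(\frac{\ln y - \mu}{\sigma}\right) = \frac{1}{y\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{\ln y - \mu}{\sigma}\right)^2} \]
where Z is a standard normal random variable
19.9 We have
\[ f_Y(y) = \frac{1}{y}f_X(\ln y) = \frac{1}{y\theta}e^{-\frac{\ln y}{\theta}} = \frac{y^{-\left(\frac{1}{\theta}+1\right)}}{\theta} \]
for y > 1 and 0 otherwise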
19.10 0.25
Section 20
5α−2
20.1 Var(X) = 12(α−1)2 (α−2)
b2
20.2 $f_X(x) = \frac{4}{(4+x)^2}$ for x > 0 and 0 otherwise
20.3 1.7975
20.5 $F_X(x) = \frac{\theta x^{\gamma}}{1+\theta x^{\gamma}}$
20.6 0.6094
20.7 14
20.8 0.61
20.9 0.75
Section 21
21.2 $S_X(x) = \frac{x}{(1+x)\ln(1+x)}$, x > 0
21.3 $M_{\Lambda}(x) = e^{\frac{x}{\theta}}$, x ≥ 0
21.4 a(x) = −1
21.5 $S_X(x) = M_{\Lambda}(-x) = e^{e^{-x}-1}$
21.11 $h_X(x) = \frac{f_X(x)}{S_X(x)} = \frac{a(x)M'_{\Lambda}[-A(x)]}{M_{\Lambda}(-A(x))}$
Section 22
22.1
1
f (x) = α, 0<x<c
(1 − α)θe−θx , x > c.
22.2 0.9252
22.3 3 + ln 5
22.4 5.61
22.5
\[ f(x) = \begin{cases} \frac{1}{1000+\theta}, & 0 < x \le 1000 \\ \frac{1}{1000+\theta}e^{\frac{1000}{\theta}-\frac{x}{\theta}}, & x > 1000 \end{cases} \]
and 0 otherwise.
22.6 461.78
Section 23
23.1 We have
ln 1 + ατ
lim ln w1 = lim
τ →∞ τ →∞ (α + τ − 1 )−1
2
−1
1 + ατ (− τα2
= lim
τ →∞ −(α + τ − 1/2)−2
1 2
α −1 α
= lim 1 + α +1−
τ →∞ τ τ 2τ
=α.
Let
(ξ/x)γ α+τ
w2 = 1 + .
τ
We have
(ξ/x)γ
lim ln w1 = lim (α + τ ) ln 1 +
τ →∞ τ →∞ τ
h γ
i−1
1 + (ξ/x)
τ (−τ −2 )(ξ/x)γ
= lim
τ →∞ −(α + τ )−2
(ξ/x)γ −1 ξ γ
α 2
= lim 1 + 1+
τ →∞ τ x τ
γ
ξ
= .
x
1
−1
Also, we let ξ = θτ γ so that θ = ξτ γ .
Using this and Stirling’s formula in the pdf of a transformed beta distribu-
tion, we find
Γ(α + τ )γxγτ −1
fX (x) =
Γ(α)Γ(τ )θγτ (1 + xγ θ−γ )γ+τ
1 1
e−α−τ (α + τ )α+τ − 2 (2π) 2 γxγτ −1
≈ 1 1
Γ(α)e−τ (τ )τ − 2 (2π) 2 ξ γτ τ −τ (1 + xγ ξ −γ τ )γ+τ
α+τ − 1 −γα−1
e−α 1 + ατ 2
γx
= −α−τ −γα −γ(τ
Γ(α)τ ξ γ(τ +α) ξ x +α) (1 + xγ ξ −γ τ )γ+τ
α+τ − 1 −γα−1
e−α 1 + ατ 2
γx
= h γ α+τ
i
Γ(α)ξ −γα 1 + (ξ/x) τ
Let
α α+τ − 21
w1 = 1 + .
τ
From the previous problem, limτ ∞ w1 = eα . Now, let
(ξ/x)γ α+τ
w2 = 1 + .
τ
Then ξ γ
lim w2 = e( x ) .
τ →∞
Hence,
γξ γα
lim fX (x) = ξ γ
τ →∞
Γ(α)xγα+1 e( x )
which is the pdf of an inverse transformed Gamma distribution
Section 24
\[ f(x, \lambda) = \frac{e^{-\lambda}\lambda^x}{x!} = \frac{e^{-\lambda}e^{x\ln\lambda}}{x!}. \]
Thus, $p(x) = \frac{1}{x!}$, $q(\lambda) = e^{\lambda}$, and $r(\lambda) = \ln\lambda$
Section 25
25.1 0.91873
25.2 5e−5
25.3 0.1412
Section 26
26.1 For the Poisson distribution the variance is equal to the mean. For
the negative binomial and geometric the variance exceeds the mean. So the
answer is (d)
26.2 0.75
26.5 192
26.6 r = 2 and β = 4
Section 27
27.1 The Poisson distribution has a variance equal to the mean. The negative binomial and geometric distributions have a variance exceeding the mean. The binomial distribution has a variance less than the mean. Thus, the answer is (a)
27.2 38.34
27.4 0.0057
27.5 6.2784
27.6 0.172
27.7
m       0        1        2        3        4
p_m     0.1074   0.2684   0.3020   0.2013   0.0881
F(m)    0.1074   0.3758   0.6778   0.8791   0.9672
Section 28
28.3 0.125
28.4 3
28.5 0.0118
28.6 8
28.8 (III)
28.9 0.3012
28.10 0.8
28.11 0.09
Section 29
29.1 We have
\[ \frac{p_k^M}{p_k^T} = \frac{\frac{1-p_0^M}{1-p_0}\,p_k}{\frac{1}{1-p_0}\,p_k} = 1 - p_0^M \]
29.2 We have
\[ E(N^M) = [P_N^M]'(1) = \frac{1-p_0^M}{1-p_0}P_N'(1) = \frac{1-p_0^M}{1-p_0}E(N) \]
29.3 $E[N^M(N^M-1)] = \frac{1-p_0^M}{1-p_0}E[N(N-1)]$ and
\[ \mathrm{Var}(N^M) = \frac{1-p_0^M}{1-p_0}E[N(N-1)] + \frac{1-p_0^M}{1-p_0}E(N) - \left[\frac{1-p_0^M}{1-p_0}E(N)\right]^2 \]
29.4
29.5 (a) $P_N^M(z) = \frac{1}{2}\left(1 + \frac{z}{3-2z}\right)$ (b) $E(N^M) = 1.5$ and $\mathrm{Var}(N^M) = 2.25$
29.6 0.5
Section 30
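30.1
\[ a = \frac{\beta}{1+\beta} = \frac{1}{1+1} = 0.5 \]
\[ b = (r-1)\frac{\beta}{1+\beta} = -0.75 \]
\[ p_2^T = p_1^T\left(0.5 - \frac{0.75}{2}\right) = 0.106694 \]
\[ p_3^T = p_2^T\left(0.5 - \frac{0.75}{3}\right) = 0.026674 \]
\[ p_1^M = (1-p_0^M)p_1^T = 0.341421 \]
\[ p_2^M = (1-p_0^M)p_2^T = 0.042678 \]
\[ p_3^M = (1-p_0^M)p_3^T = 0.010670 \]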
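The recursion $p_k^T = p_{k-1}^T(a + b/k)$ used above can be sketched in Python. The inputs r = −0.5, β = 1 and $p_0^M = 0.6$ are not stated in the answer; they are inferred here from a = 0.5, b = −0.75 and the printed probabilities, and should be treated as assumptions. The starting value uses the standard extended truncated negative binomial formula $p_1^T = r\beta/[(1+\beta)^{r+1} - (1+\beta)]$.

```python
def etnb_probabilities(r, beta, p0_modified, kmax):
    """Zero-truncated and zero-modified probabilities via the (a, b, 1) recursion."""
    a = beta / (1 + beta)
    b = (r - 1) * beta / (1 + beta)
    # Starting value p_1^T of the (extended) truncated negative binomial
    p1 = r * beta / ((1 + beta) ** (r + 1) - (1 + beta))
    truncated = [p1]
    for k in range(2, kmax + 1):
        truncated.append(truncated[-1] * (a + b / k))
    modified = [(1 - p0_modified) * p for p in truncated]
    return a, b, truncated, modified

# r, beta and p0_modified are inferred from the printed answer (assumptions).
a, b, pT, pM = etnb_probabilities(r=-0.5, beta=1.0, p0_modified=0.6, kmax=3)
print(a, b)
print([round(p, 6) for p in pT])
print([round(p, 6) for p in pM])
```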
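30.2 $E(N) = \frac{\beta}{\ln(1+\beta)}$
30.3 $E[N(N-1)] = P_N''(1) = \frac{\beta^2}{\ln(1+\beta)}$
30.4 $\mathrm{Var}(N) = \frac{\beta}{\ln(1+\beta)}\left[1 + \beta - \frac{\beta}{\ln(1+\beta)}\right]$
30.5 We have
\[ P_N^T(z) = \sum_{n=0}^{\infty}p_n^T z^n = \sum_{n=1}^{\infty}p_n^T z^n = \frac{1}{1-p_0}\sum_{n=1}^{\infty}p_n z^n = \frac{1}{1-p_0}\sum_{n=0}^{\infty}p_n z^n - \frac{p_0}{1-p_0} = \frac{P_N(z) - p_0}{1-p_0} \]
30.6 $P_N^T(z) = \frac{[1-\beta(z-1)]^{-r} - (1+\beta)^{-r}}{1-(1+\beta)^{-r}}$
30.7 $E(N^T) = [P_N^T]'(1) = \frac{P_N'(1)}{1-p_0} = \frac{r\beta}{1-(1+\beta)^{-r}}$
30.8 $E[N^T(N^T-1)] = [P_N^T]''(1) = \frac{P_N''(1)}{1-p_0} = \frac{r(r+1)\beta^2}{1-(1+\beta)^{-r}}$
30.9 $\mathrm{Var}(N^T) = \frac{r(r+1)\beta^2}{1-(1+\beta)^{-r}} + \frac{r\beta}{1-(1+\beta)^{-r}} - \left[\frac{r\beta}{1-(1+\beta)^{-r}}\right]^2$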
Section 31
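31.1
\[ f_{Y^L}(y) = \begin{cases} 1 - e^{-0.25}, & y = 0 \\ 0.0002(y+50)e^{-\left(\frac{y+50}{100}\right)^2}, & y > 0. \end{cases} \]
\[ F_{Y^L}(y) = \begin{cases} 1 - e^{-0.25}, & y = 0 \\ 1 - e^{-\left(\frac{y+50}{100}\right)^2}, & y > 0. \end{cases} \]
\[ S_{Y^L}(y) = \begin{cases} e^{-0.25}, & y = 0 \\ e^{-\left(\frac{y+50}{100}\right)^2}, & y > 0. \end{cases} \]
31.2
\[ f_{Y^P}(y) = 0.0002(y+50)e^{-0.0001y^2-0.01y} \]
\[ F_{Y^P}(y) = 1 - e^{-0.0001y^2-0.01y} \]
\[ S_{Y^P}(y) = e^{-0.0001y^2-0.01y} \]
\[ h_{Y^P}(y) = 0.0002(y+50). \]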
31.5 2000e−400θ
31.6 1708.70
31.7 $\frac{(b-d)^2}{12}$
31.9 (a) $\frac{\theta^{\alpha}}{(\alpha-1)(\theta+\alpha)^{\alpha-1}}$
(b) ∞
31.10 30
Section 32
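32.1
\[ f_{Y^L}(y) = \begin{cases} \frac{d}{\theta}, & y = 0 \\ \frac{1}{\theta}, & y > d. \end{cases} \]
\[ F_{Y^L}(y) = \begin{cases} \frac{d}{\theta}, & 0 \le y \le d \\ \frac{y}{\theta}, & y > d. \end{cases} \]
\[ S_{Y^L}(y) = \begin{cases} 1 - \frac{d}{\theta}, & 0 \le y \le d \\ 1 - \frac{y}{\theta}, & y > d. \end{cases} \]
\[ h_{Y^L}(y) = \begin{cases} 0, & 0 < y < d \\ \frac{1}{\theta-y}, & y > d. \end{cases} \]
32.2
\[ f_{Y^P}(y) = \frac{1}{\theta-d}, \quad y > d. \]
\[ F_{Y^P}(y) = \begin{cases} 0, & 0 \le y \le d \\ \frac{y-d}{\theta-d}, & y > d. \end{cases} \]
\[ S_{Y^P}(y) = \begin{cases} 1, & 0 \le y \le d \\ \frac{\theta-y}{\theta-d}, & y > d. \end{cases} \]
\[ h_{Y^P}(y) = \begin{cases} 0, & 0 < y < d \\ \frac{1}{\theta-y}, & y > d. \end{cases} \]
32.3 $E(Y^L) = \frac{\theta^2-d^2}{2\theta}$ and $E(Y^P) = \frac{\theta+d}{2}$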
32.4 340.83
32.5 900
ln 0.40
32.6 d = −θ
32.8 320.83
32.9 456
32.10 6400
Section 33
33.2 86.6%
33.3 1546
33.4 500
33.5 333.33
33.6 0.07418
33.7 0.5
33.8 510.16
33.9 0.625
Section 34
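and
\[ F_Y(y) = \begin{cases} 1 - e^{-\theta y}, & y < u \\ 1, & y \ge u \end{cases} \]
34.2 $E(X \wedge u) = \int_0^u e^{-\theta x}\,dx = \frac{1}{\theta}\left(1 - e^{-\theta u}\right)$
34.3 $E((1+r)X \wedge u) = \frac{(1+r)}{\theta}\left(1 - e^{-\frac{\theta u}{1+r}}\right)$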
34.4 48
34.5 182.18
34.6 2011.80
34.7 5176.78
Section 35
35.1 1.115
35.2 990,938.89
35.3 0.4163
35.4 133
35.5 353.55
35.6 3031.06
35.7 85%
35.8 29.93
35.9 109.4
35.10 0.583
Section 36
36.1 8.0925
36.2 0.5553
36.4 0.242444
36.6 0.6304
36.7 0.4424
Section 37
37.1 (a) This is false. In a collective risk model, all the loss amounts are identically distributed.
(b) This is true since the loss amounts need not all have the same distribu-
tion.
(c) This is true. In the collective risk model, N (the frequency random
variable) is determined independently of each of the subscripted Xs, the
severity random variables.
(d) This is false. If frequency is independent of severity, as it is in the col-
lective risk model, then this implies that the number of payments does not
affect the size of each individual payment
37.2 0
PN (z; α) = Q(z)α
Section 38
38.1 $E(S^3) = E(N^3)E(X)^3 - 3E(N^2)E(X)^3 + 2E(N)E(X)^3 + 3E(N^2)E(X)E(X^2) - 3E(N)E(X)E(X^2) + E(N)E(X^3)$
38.3 $F_X^{*2}(x) = \int_0^x \Phi\left(\frac{\ln(x-y)-\mu}{\sigma}\right)\frac{\phi\left(\frac{\ln y-\mu}{\sigma}\right)}{\sigma y}\,dy$
38.4 We have
38.5 1.226
38.7 0.1587
38.8 0.1637
38.9 0.0681
38.11 100
38.12 0.1003
38.13 0.242
38.14 0.1230
38.15 24
38.16 518
38.17 40
38.18 0.2483
38.19 0.0039
38.20 0.37
38.21 65.3
38.22 0.4207
38.23 0.0233
Section 39
39.1 1.014
39.2 2.25
39.3 1/3
39.4 2.064
39.5 25/16
39.6 2.3608
39.7 2.064
39.8 18.15
39.9 18.81
Section 40
with the discrete part Pr(S = 0) = FS (0) = (1 + β)−1 and the continuous
part is the exponential distribution with mean θ(1 + β)
40.9 We have
\[ F_S(x) = 1 - e^{-\frac{x}{2}}\sum_{j=0}^{\infty}\frac{(x/2)^j}{j!}\sum_{n=j+1}^{\infty}\Pr(N = n) = 1 - \frac{1}{2}e^{-\frac{x}{2}}\left(1 + \frac{x}{10}\right). \]
Section 41
41.2 We have
\[ f_S(n) = \frac{0.04}{n}[f_X(1)f_S(n-1) + f_X(2)f_S(n-2) + f_X(3)f_S(n-3)] \]
\[ = \frac{0.04}{n}[0.5f_S(n-1) + 0.4f_S(n-2) + 0.1f_S(n-3)] \]
\[ = \frac{1}{n}[0.02f_S(n-1) + 0.016f_S(n-2) + 0.004f_S(n-3)]. \]
41.3 0.15172
41.4 12
41.5 76
41.6 165
41.7 1.0001
41.8 0.3336
41.10 0.0921
41.11 0.2883
Section 42
42.1 (a)
\[ f_0 = F_X(5) = \frac{5}{50} = 0.1 \]
\[ f_1 = F_X(15) - F_X(5) = \frac{15}{50} - \frac{5}{50} = 0.2 \]
\[ f_2 = F_X(25) - F_X(15) = \frac{25}{50} - \frac{15}{50} = 0.2 \]
\[ f_3 = 1 - F_X(25) = 1 - \frac{25}{50} = 0.5. \]
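The probabilities in part (a) follow the method of rounding. A minimal Python sketch is given below; it assumes X is uniform on (0, 50) and a span of 10 with all remaining mass placed at the last point, both inferred from the values $F_X(5) = 5/50$, $F_X(15) = 15/50$ and $f_3 = 1 - F_X(25)$ shown above.

```python
def uniform_cdf(x, upper=50.0):
    """cdf of a uniform(0, upper) random variable (assumed severity model)."""
    return min(max(x / upper, 0.0), 1.0)

def round_discretize(cdf, span, m):
    """Method of rounding: mass f_j at j*span for j < m, remaining mass at m*span."""
    probs = [cdf(span / 2)]
    for j in range(1, m):
        probs.append(cdf(j * span + span / 2) - cdf(j * span - span / 2))
    probs.append(1.0 - cdf(m * span - span / 2))
    return probs

f = round_discretize(uniform_cdf, span=10.0, m=3)
print([round(p, 4) for p in f])
```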
(b) 0.1935
42.2 0.0368
42.3 0.0404
42.4 We have
\[ m_0^1 + m_1^1 = F_X(6) - F_X(3) = \left(\frac{5}{8}\right)^3 - \left(\frac{5}{11}\right)^3 = 0.150226 \]
and
\[ 3m_0^1 + 6m_1^1 = \int_3^6\frac{3(5)^3x}{(x+5)^4}\,dx = 0.62897. \]
42.6 0.03682
42.7 0.03236
Section 43
43.1
2 − v, y ≥ 13.5
where v = 0.15259
43.4
43.5
Section 44
44.4 (a) The mean is E(S) = 163 and the variance is Var(S) = 1107.77. (b)
217.75
Section 45
45.1 53
45.2 0.29
45.3 0.18
45.4 1975
Section 46
46.1 5
46.2 0.2
46.3 151.52
46.4 $\frac{n+8}{18(n-1)^2}$
46.5 $20\sqrt{10}$
46.6 $\frac{1}{n-1}\theta$
46.7 12
46.8 (D)
Section 47
47.1 1.64
47.2 2.58
47.3 0.2
Section 48
$H_0: \mu \ge 30$
$H_1: \mu < 30$.
\[ z = \frac{20-30}{6/\sqrt{5}} = -3.727. \]
The rejection region is Z < −1.28. Since −3.727 < −1.28, we reject the null hypothesis in favor of the alternative. Thus, the mean time to find a parking space is less than 30 minutes.
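The test statistic and decision can be reproduced with the Python standard library. In this sketch the sample mean 20, σ = 6 and n = 5 are read off the displayed z computation, and the significance level α = 0.10 is an assumption inferred from the critical value −1.28.

```python
import math
from statistics import NormalDist

def lower_tail_z_test(xbar, mu0, sigma, n, alpha):
    """One-sided (lower-tail) z test: reject H0 when z falls below the critical value."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    critical = NormalDist().inv_cdf(alpha)       # about -1.2816 for alpha = 0.10
    return z, critical, z < critical

# alpha = 0.10 is inferred from the rejection region Z < -1.28 (assumption).
z, critical, reject = lower_tail_z_test(xbar=20, mu0=30, sigma=6, n=5, alpha=0.10)
print(round(z, 3), round(critical, 2), reject)
```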
48.4 We reject the null hypothesis when the level of confidence is greater
than or equal to the p−value. Thus, the answer to the question is:(ii), (iii),
and (iv).
$H_0: \mu = 20$
$H_1: \mu > 20$.
$H_0: \mu = 16$
$H_1: \mu \ne 16$.
The rejection region corresponding to α = 0.10 is |Z| > 1.645. Since 2.19 > 1.645, we reject $H_0$ and conclude that the process is out of control.
$H_0: \mu = 10$
$H_1: \mu \ne 10$.
We have $z_{\frac{\alpha}{2}} = z_{0.01} = 2.33$. Thus, the critical values are −2.33 and 2.33.
48.8 (d)
Section 49
49.1
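x           1     2      3     4     5      6     7
p_{12}(x)   1/6   1/12   1/6   1/6   1/12   1/6   1/6
\[ F_{12}(x) = \begin{cases} 0, & x < 1 \\ \frac{1}{6}, & 1 \le x < 2 \\ \frac{1}{4}, & 2 \le x < 3 \\ \frac{5}{12}, & 3 \le x < 4 \\ \frac{7}{12}, & 4 \le x < 5 \\ \frac{2}{3}, & 5 \le x < 6 \\ \frac{5}{6}, & 6 \le x < 7 \\ 1, & x \ge 7. \end{cases} \]
49.2 The empirical mean is $\bar{X} = \frac{41}{12}$ and the empirical variance is $\frac{1331}{144}$.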
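49.4
\[ S_n(x) = \begin{cases} 1, & x < 49 \\ \frac{8}{9}, & 49 \le x < 50 \\ \frac{5}{9}, & 50 \le x < 60 \\ \frac{4}{9}, & 60 \le x < 75 \\ \frac{3}{9} = \frac{1}{3}, & 75 \le x < 80 \\ \frac{2}{9}, & 80 \le x < 120 \\ \frac{1}{9}, & 120 \le x < 130 \\ 0, & x \ge 130. \end{cases} \]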
49.5 6
49.6 1.291
49.7 12
Section 50
50.2 81
50.3 120
50.4 0.396
50.5 20,750
Section 51
51.1 (A) and (D) are false. (B) and (C) are true
51.2 Losses above a policy limit are right-censored and losses below a policy
deductible are left-truncated. The answer is (D)
51.3
Life di xi ui
1 0 − 0.2
2 0 − 0.3
3 0 − 0.5
4 0 0.5 −
5 1 − 0.7
6 1.2 1.0 −
7 1.5 − 2.0
8 2 2.5 −
9 2.5 − 3.0
10 3.1 3.5 −
11 0 − 4.0
12 0 − 4.0
13 0 − 4.0
14 0 − 4.0
15 0 − 4.0
16 0 − 4.0
17 0 − 4.0
18 0 − 4.0
19 0.5 − 4.0
20 0.7 1.0 −
21 1.0 3.0 −
22 1.0 − 4.0
23 2.0 2.5 −
24 2.0 − 2.5
25 3.0 3.5 −
51.4
j yj sj rj
1 4 1 2+2−1=3
2 8 1 1+0−0=1
51.5
j yj sj rj
1 0.9 1 5+5−3=7
2 1.5 1 4+4−2=6
3 1.7 1 3+2−0=5
4 2.1 2 2+1−0=3
Section 52
52.1 0.52
52.2 0.583
52.3 0.067
52.4 0.7143
52.5 2
52.6 0.385
52.7 0.112
52.8 100
Section 53
53.1 We have
\[ E[p_n(x_j)] = E\left(\frac{N_j}{n}\right) = \frac{E(N_j)}{n} = \frac{np(x_j)}{n} = p(x_j). \]
This shows that the estimator is unbiased. Finding the variance of $p_n(x_j)$ we have
\[ \mathrm{Var}[p_n(x_j)] = \frac{np(x_j)[1-p(x_j)]}{n^2} = \frac{p(x_j)[1-p(x_j)]}{n} \to 0 \]
as n → ∞. This shows that the estimator is consistent
53.3 We have
\[ p_{386}(2) = \frac{64}{386} = 0.1658 \]
\[ \widehat{\mathrm{Var}}[p_{386}(2)] = \frac{p_{386}(2)[1-p_{386}(2)]}{n} = \frac{\frac{64}{386}\cdot\frac{322}{386}}{386} = 3.58\times 10^{-4} \]
Section 54
54.2 We have
\[ (1+a_1)(1+a_2)\cdots(1+a_n) = 1 + a_1 + a_2 + \cdots + a_n + \text{products of the } a_i\text{'s}. \]
But the $a_i$'s are assumed to be small, so that a product of $a_i$'s is even smaller. Ignoring all the product terms, we obtain the desired result
54.3 0.0148
54.4 0.03086
54.5 10
Section 55
55.3 0.607
55.5 0.779
55.7 0.2341
Section 56
56.1 0.485
56.2 0.53125
56.3 0.026
56.4 0.3
56.5 1 ≤ x ≤ 2
56.6 0.3
Section 57
57.2 We have
\[ r_j = \sum_{i=0}^{j}d_i - \sum_{i=1}^{j-1}(x_i + u_i) = \left[\sum_{i=0}^{j-1}d_i - \sum_{i=1}^{j-2}(x_i + u_i)\right] + d_j - (x_{j-1} + u_{j-1}) = r_{j-1} + d_j - (x_{j-1} + u_{j-1}) \]
57.3 0.75
57.4 0.6
Section 58
58.1 0.52490
58.2 0.52
58.3 369
58.4 26,400
58.5 13.75
58.6 107.8
58.7 20
58.8 384
58.9 −0.24
58.10 4.468
58.11 246.6
58.12 17.55
58.13 208.3
58.14 1.614
58.15 296.21
58.16 224
58.17 104.4
58.18 118.32
58.19 0.983
58.20 0.345
Section 59
59.1 $\tilde{\theta} = \frac{x_1+x_2+\cdots+x_n}{2n}$
59.2 $\hat{\theta} = \max\{x_1, x_2, \cdots, x_n\}$
59.3 4.3275
59.4 3.97
59.5 $\hat{\theta} = \min\{x_1, x_2, \cdots, x_n\}$
59.6 0.2507
59.7 2
59.8 0.6798
59.9 1996.90
59.10 0.447
pe−1 1−p −0.01 p −2 1−p −0.2
59.11 L(p) = 100 + 10,000 e 100 e + 10,000 e
59.12
n m
X 2m 6 X
`0 (α1 ) = n
α1 − ln xi + − ln yi
α1 (2 + α1 ) (2 + α1 )2
i=1 i=1
59.13 16.74
59.14 916.7
Section 60
60.1 73
60.2 233.333
60.3 703
60.4 3000
60.5 2.41877
60.6 $L(\theta) = \frac{e^{-\frac{1100}{\theta}}}{\theta^3}$
60.7 471
60.9 3.089
60.10 3,325.67
60.11 0.09
60.12 1067
60.13 3/8
60.14 52.68
Section 61
61.1 $[\ell'(\alpha)]^2 = \frac{16}{\alpha^2} - \frac{18.501}{\alpha} + 5.3481$
61.3 $I(\alpha) = \frac{16}{1.73^2} = 5.346$
61.4 0.1871
61.6 $\frac{1}{n}$
61.7 0.447
61.8 $\frac{3\theta^2}{n}$
61.9 0.97
Section 62
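62.1
\[ \ell(\alpha, \theta) = \sum_{i=1}^{n}[\ln\alpha + \alpha\ln\theta - (\alpha+1)\ln(x_i+\theta)] \]
62.2
\[ \frac{\partial^2\ell}{\partial\alpha^2} = -\frac{n}{\alpha^2} \]
\[ \frac{\partial^2\ell}{\partial\theta\,\partial\alpha} = \frac{n}{\theta} - \sum_{i=1}^{n}\frac{1}{x_i+\theta} \]
\[ \frac{\partial^2\ell}{\partial\theta^2} = -\frac{n\alpha}{\theta^2} + \sum_{i=1}^{n}\frac{(\alpha+1)}{(x_i+\theta)^2} \]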
62.3
− αn2 n nα
" #
θ − (α+1)θ
I(θ) = n
θ − nα
(α+1)θ − nα
θ2
+ nα(α+1)
(α+2)θ2
62.4 " #
−1 1 − nα
θ2
+ nα(α+1)
(α+2)θ2
− nθ + (α+1)θ
nα
I(θ) = n nα n
det[I(θ)] − θ + (α+1)θ − α2
62.5
5.0391 −0.4115
I(α, θ) =
−0.4115 −0.0524
62.6
1 0.0524 −0.4115 −0.1209 0.9495
Var(α̂,
d θ) = =
(5.0391)(−0.0524) − 0.41152 −0.4115 −5.0391 0.9495 11.6274
62.7 0.0187
62.10 0.02345
62.11
2 −3
−3 5
Section 63
63.1 −1.00774
Section 64
\[ f_{X,Q}(x, q) = q^2\cdot\frac{q^2}{0.039} = \frac{q^4}{0.039}. \]
(c) The marginal distribution in X is
\[ f_X(x) = \int_{0.2}^{0.5}\frac{q^4}{0.039}\,dq = 0.15862. \]
(d) The posterior distribution is
\[ \pi_{Q|X}(q|x) = \frac{q^4}{0.006186} \]
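The two constants in parts (c) and (d) can be checked numerically. The sketch below integrates the joint density $q^4/0.039$ over 0.2 < q < 0.5 with a midpoint rule; the function names are illustrative only.

```python
def joint_density(q):
    """Joint density from the solution: q^2 * q^2 / 0.039 on 0.2 < q < 0.5."""
    return q ** 4 / 0.039

def midpoint_integral(f, a, b, steps=100_000):
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

marginal = midpoint_integral(joint_density, 0.2, 0.5)
# Posterior = joint / marginal = q^4 / (0.039 * marginal)
posterior_constant = 0.039 * marginal
print(round(marginal, 5), round(posterior_constant, 6))
```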
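64.3 $\pi_{\Lambda|X}(\lambda|x) = \frac{4^{13}}{12!}e^{-4\lambda}\lambda^{12}$
64.4 $\pi_{\Lambda|X}(\lambda|x) = \frac{\lambda^{10}\left(0.8e^{-\frac{7\lambda}{6}} + 0.6e^{-\frac{13\lambda}{12}}\right)}{0.395536(10!)}$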
64.5 0.5572
64.6 0.721
64.7 0.64
64.8 1.90
64.9 $\frac{x+c}{2}$
64.10 $\frac{27}{16}$
64.11 1.9899
64.12 0.622
64.13 0.148
Section 65
65.1 2
65.2 0.45
65.3 0.000398
65.4 450
65.5 1.319
65.6 0.8148
Section 66
Hence, Q|X has a beta distribution with parameters $a + \sum_{i=1}^{n}x_i$, $b + nm - \sum_{i=1}^{n}x_i$, and 1.
\[ \pi_{\Lambda|X}(\lambda|x) = \frac{(n+\alpha)M^{n+\alpha}}{\lambda^{n+\alpha+1}}. \]
Hence, Λ|X has a single-parameter Pareto distribution with α' = n + α and θ' = M
Section 67
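(c) We have
\[ E(\hat{\lambda}) = E(N) = \lambda \]
\[ \mathrm{Var}(\hat{\lambda}) = \frac{\mathrm{Var}(N)}{n} = \frac{\lambda}{n}, \]
67.2 1.6438
67.3 $p_k = \frac{e^{-\lambda}\lambda^k}{k!}$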
67.5
X 4 3
1.25 X
H(r̂) =100 ln 1 + − nk (r̂ + m)−1
r̂
k=1 m=0
1.25 70 35 15 5
=100 ln 1 + − + + +
r̂ r̂ r̂ + 1 r̂ + 2 r̂ + 3
67.6 $\hat{\beta} = \frac{\bar{x}}{r}$
67.8 0.06
Section 68
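68.1 We have
\[ \hat{p}_0^M = \frac{n_0}{n} = \frac{9048}{10000} = 0.9048 \]
and
\[ \hat{\beta} = \frac{\sum_{k=0}^{\infty}kn_k}{n-n_0} - 1 = \frac{n\bar{x}}{n-n_0} - 1 = 0.05147 \]
68.2 We have
\[ \hat{p}_0^M = \frac{n_0}{n} = \frac{9048}{10000} = 0.9048 \]
and
\[ \bar{x}(1-e^{-\lambda}) = \frac{n-n_0}{n}\lambda \implies 0.1001(1-e^{-\lambda}) = 0.0952\lambda \implies \hat{\lambda} = 0.1012 \]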
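The equation $0.1001(1-e^{-\lambda}) = 0.0952\lambda$ has no closed-form solution, so the value of λ must be found numerically. The Python sketch below uses bisection on a bracket chosen here for illustration (it excludes the trivial root λ = 0); it is one possible approach, not necessarily the one used to obtain the printed answer.

```python
import math

def solve_lambda(xbar, positive_fraction, lo=1e-3, hi=5.0, tol=1e-10):
    """Bisection for the positive root of xbar*(1 - exp(-lam)) = positive_fraction*lam."""
    def g(lam):
        return xbar * (1 - math.exp(-lam)) - positive_fraction * lam
    if not (g(lo) > 0 > g(hi)):
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

lam_hat = solve_lambda(xbar=0.1001, positive_fraction=0.0952)
print(round(lam_hat, 4))
```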
68.3 We have
\[ \hat{p}_0^M = \frac{n_0}{n} = \frac{10}{20} = 0.5 \]
and
\[ \bar{x} = \frac{1-\hat{p}_0^M}{1-p_0}mq \implies 0.7 = 3q \implies \hat{q} = 0.2333 \]
68.4 −22.5547
Section 69
69.1 −0.0714
69.2 For x < 0.3, we have F ∗ (x) > Fn (x) so that the fitted distribu-
tion is thicker on the left than the empirical distribution. For x > 0.85,
F ∗ (x) > Fn (x) which implies S ∗ (x) < Sn (x). That is, the fitted distribution
is thinner on the right than the empirical distribution. Also, note that near
the median x = 0.5, the slope is less than 1. That is, less probability on the
69.3 Let's choose $x_5 = 30$. Then F(30) is approximately 0.6, but slightly smaller than 0.6. If X is uniform in [1, 100] then its cdf is $F_u(x) = \frac{x-1}{99}$ and $F_u(30) \approx 0.29$ so that (C) is eliminated. If X is exponential with mean 10 then $F_e(x) = 1 - e^{-0.1x}$ and $F_e(30) \approx 0.95$. Thus, (D) is eliminated. If $F(x) = \frac{x}{x+1}$ then $F(30) = \frac{30}{31} \approx 0.97$ so that (B) is eliminated. If X is normal with mean 40 and standard deviation 40 then $F_n(30) = \Phi\left(\frac{30-40}{40}\right) = \Phi(-0.25) = 0.40$ so that (E) is eliminated. Note that with the function in (A) we have $F(30) = 1 - 30^{-0.25} \approx 0.57$. Hence, the answer is (A)
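The elimination argument amounts to evaluating each candidate cdf at x = 30 and comparing with the empirical value of about 0.6. A short Python sketch, using the five candidate distributions as described in the solution:

```python
import math
from statistics import NormalDist

x = 30
target = 0.6          # approximate empirical value F(30) quoted in the solution
candidates = {
    "A": 1 - x ** -0.25,                       # F(x) = 1 - x^(-0.25)
    "B": x / (x + 1),                          # F(x) = x / (x + 1)
    "C": (x - 1) / 99,                         # uniform on [1, 100]
    "D": 1 - math.exp(-0.1 * x),               # exponential with mean 10
    "E": NormalDist(mu=40, sigma=40).cdf(x),   # normal(40, 40)
}
best = min(candidates, key=lambda name: abs(candidates[name] - target))
for name, value in candidates.items():
    print(name, round(value, 4))
print("closest:", best)
```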
Section 70
70.1 0.2727
70.2 0.4025
70.3
70.4 0.1679
70.6
Section 71
71.1 0.252
71.2
71.3 (A) Using sample data gives a better than expected fit and there-
fore a test statistic that favors the null hypothesis, thus increasing the Type
II error probability.
(B) The K-S test works only on individual data and so B is false.
(C) The Anderson-Darling test emphasizes the tails so that (C) is false.
Hence, the answer is (D)
71.4 (A), (B), and (C) are all correct. Thus, the answer is (D)
71.5 (A)
Section 72
72.1 9.151
72.4 χ2 = 6.659
72.5 A is false. Using sample data gives a better than expected fit and
therefore a test statistic that favors the null hypothesis, thus increasing the
Type II error probability. The K-S test works only on individual data and
so B is false. The A-D test emphasizes the tails, thus C is false. D is false
because the critical value depends on the degrees of freedom which in turn
depends on the number of cells, not the sample size. So the answer is (E)
72.6 (A)
Section 73
73.1 We have 0 degrees of freedom in the null hypothesis, since both param-
eters are specified, and 2 degrees of freedom in the alternative hypothesis,
since both parameters are freely chosen in maximizing L(α, θ). We thus have
2 degrees of freedom overall
73.2
α 5% 2.5% 1% 0.5%
cα 5.991 7.378 9.210 10.597
73.3
α 5% 2.5% 1% 0.5%
cα 5.991 7.378 9.210 10.597
Test Result Reject Reject Do not reject Do not reject
73.4
α 10% 5% 2.5% 1%
cα 2.706 3.841 5.024 6.635
Test Result Reject Reject Do not reject Do not reject
73.5 7
Section 74
74.1 (i)
74.2 (A)
74.3 (I)
Section 75
75.1 16,913
75.2 2,381
75.3 0.10
α+1
75.4 384.16 α
75.5 960
Section 76
76.1 0.47
76.2 (E)
76.3 0.3723
76.4 5,446,250
76.5 138
76.6 0.8
Section 77
\[ f_{X|\Lambda}(x|\lambda) = \frac{e^{-\lambda}\lambda^x}{x!} \]
σ1 2π
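\[ \pi(\lambda) = \frac{\alpha\theta^{\alpha}}{\lambda^{\alpha+1}}, \quad \lambda > \theta. \]
For the claims, we have
\[ f_{X|\Lambda}(x|\lambda) = \frac{1}{\lambda}, \quad 0 \le x \le \lambda \]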
Section 78
78.1 (a) $f_{XY}(x, y) = f_X(x)f_{Y|X}(y|x) = \frac{1}{x}$, 0 < x < 1, 1 − x < y < 1 (b) $\frac{1+\ln 2}{2}$
78.2 $f_{X|Y}(x|y) = \frac{6x(2-x-y)}{4-3y}$
78.3 $e^{-\frac{1}{y}}$
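78.4 (a) Observe that X only takes positive values, thus $f_X(x) = 0$, x ≤ 0. For 0 < x < 1 we have
\[ f_X(x) = \int_{-\infty}^{\infty}f_{XY}(x, y)\,dy = \int_1^{\infty}f_{XY}(x, y)\,dy = \frac{\alpha-1}{\alpha} \]
For x ≥ 1 we have
\[ f_X(x) = \int_{-\infty}^{\infty}f_{XY}(x, y)\,dy = \int_x^{\infty}f_{XY}(x, y)\,dy = \frac{\alpha-1}{\alpha x^{\alpha}} \]
If 0 < x < 1 then
\[ f_{Y|X}(y|x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{\alpha}{y^{\alpha+1}}, \quad y > 1. \]
Hence,
\[ E(Y|X = x) = \int_1^{\infty}\frac{y\alpha}{y^{\alpha+1}}\,dy = \alpha\int_1^{\infty}\frac{dy}{y^{\alpha}} = \frac{\alpha}{\alpha-1}. \]
If x ≥ 1 then
\[ f_{Y|X}(y|x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{\alpha x^{\alpha}}{y^{\alpha+1}}, \quad y > x. \]
Hence,
\[ E(Y|X = x) = \int_x^{\infty}y\,\frac{\alpha x^{\alpha}}{y^{\alpha+1}}\,dy = \frac{\alpha x}{\alpha-1} \]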
3
78.5 E(X|Y = y) = 23 y and E(Y |X = x) = 23 1−y
1−y 2
1
78.7 12
78.8 Var(Y ) = 13
and
Section 79
π(G) =0.70
π(A) =0.20
π(B) = 0.10.
79.6 0.158
79.7 3.83
79.8 0.6794
79.9 7.202
79.10 10,322
79.11 0.278
Section 80
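\[ f(x|\lambda) = \frac{1}{\lambda}e^{-\frac{x}{\lambda}}. \]
\[ \pi(\lambda) = \frac{225e^{-\frac{15}{\lambda}}}{\lambda^3}, \quad \lambda > 0. \]
\[ f(x, \lambda) = f(x|\lambda)\pi(\lambda) = \frac{225e^{-\frac{(x+15)}{\lambda}}}{\lambda^4}, \quad \lambda > 0. \]
\[ f(12, \lambda) = \frac{225e^{-\frac{27}{\lambda}}}{\lambda^4}, \quad \lambda > 0. \]
\[ f(x) = \int_0^{\infty}\frac{225e^{-\frac{(x+15)}{\lambda}}}{\lambda^4}\,d\lambda \]
and
\[ f(12) = \int_0^{\infty}\frac{225e^{-\frac{27}{\lambda}}}{\lambda^4}\,d\lambda = \frac{225}{27^3}\Gamma(3)\underbrace{\int_0^{\infty}\frac{27^3e^{-\frac{27}{\lambda}}}{\lambda^4\Gamma(3)}\,d\lambda}_{1} = \frac{450}{27^3}. \]
Similarly,
\[ f(12, x_2) = \int_0^{\infty}\frac{1}{\lambda^2}e^{-\frac{(12+x_2)}{\lambda}}\cdot\frac{225e^{-\frac{15}{\lambda}}}{\lambda^3}\,d\lambda = 225\int_0^{\infty}\frac{1}{\lambda^5}e^{-\frac{(27+x_2)}{\lambda}}\,d\lambda = \frac{225}{(27+x_2)^4}\Gamma(4)\underbrace{\int_0^{\infty}\frac{(27+x_2)^4e^{-\frac{(27+x_2)}{\lambda}}}{\lambda^5\Gamma(4)}\,d\lambda}_{1} = \frac{1350}{(27+x_2)^4}. \]
The predictive distribution is
\[ f(x_2|12) = \frac{\frac{1350}{(27+x_2)^4}}{\frac{450}{27^3}} = \frac{3(27^3)}{(27+x_2)^4} \]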
80.2 3.25
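80.4 (i) Letting $q = \frac{1}{1+\beta}$ we can write
\[ p_k = \frac{r(r+1)\cdots(r+k-1)}{k!}\left(\frac{1}{1+\beta}\right)^r\left(\frac{\beta}{1+\beta}\right)^k = \frac{(r-1)!\,r(r+1)\cdots(r+k-1)}{k!(r-1)!}\left(\frac{1}{1+\beta}\right)^r\left(1-\frac{1}{1+\beta}\right)^k = \frac{\Gamma(r+k)}{\Gamma(r)\Gamma(k+1)}q^r(1-q)^k. \]
(ii) The model distribution is
\[ f(x|q) = \frac{\Gamma(r+x)}{\Gamma(r)\Gamma(x+1)}q^r(1-q)^x. \]
The prior distribution is
\[ \pi(q) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}q^{a-1}(1-q)^{b-1}. \]
The joint distribution of X and Q is
\[ f(x, q) = f(x|q)\pi(q) = \frac{\Gamma(r+x)}{\Gamma(r)\Gamma(x+1)}q^r(1-q)^x\,\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}q^{a-1}(1-q)^{b-1} = \frac{\Gamma(r+x)}{\Gamma(r)\Gamma(x+1)}\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}q^{a+r-1}(1-q)^{b+x-1}. \]
The marginal distribution is
\[ f(x) = \frac{\Gamma(r+x)}{\Gamma(r)\Gamma(x+1)}\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\int_0^1 q^{a+r-1}(1-q)^{b+x-1}\,dq = \frac{\Gamma(r+x)}{\Gamma(r)\Gamma(x+1)}\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\frac{\Gamma(a+r)\Gamma(b+x)}{\Gamma(a+b+x+r)}\underbrace{\int_0^1\frac{\Gamma(a+b+x+r)}{\Gamma(a+r)\Gamma(b+x)}q^{a+r-1}(1-q)^{b+x-1}\,dq}_{1} = \frac{\Gamma(r+x)}{\Gamma(r)\Gamma(x+1)}\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\frac{\Gamma(a+r)\Gamma(b+x)}{\Gamma(a+b+x+r)}. \]
The posterior distribution is
\[ \pi(q|x) = \frac{\frac{\Gamma(r+x)}{\Gamma(r)\Gamma(x+1)}\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}q^{a+r-1}(1-q)^{b+x-1}}{\frac{\Gamma(r+x)}{\Gamma(r)\Gamma(x+1)}\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\frac{\Gamma(a+r)\Gamma(b+x)}{\Gamma(a+b+x+r)}} = \frac{\Gamma(a+b+x+r)}{\Gamma(a+r)\Gamma(b+x)}q^{a+r-1}(1-q)^{b+x-1} \]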
80.6 15
Section 81
which implies
n
X α̂0
α̂i = 1 −
2
j=1
or equivalently
n
X
α̂j + α̂i (1 − ρ) = ρ
j=1
α̂0 nρα̂0
1− = .
µ µ(1 − ρ)
(1 − ρ)µ
α̂0 = .
1 − ρ + nρ
Plugging this into Problem 81.3, we find
ρ
α̂i =
1 − ρ + nρ
=(1 − Z)µ + ZX
where
nρ
Z=
1 − ρ + nρ
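As a sanity check on the weights above, the sketch below plugs them back into the two normal equations (the unbiasedness condition and the covariance condition). It is illustrative only: the function name `equicorrelated_weights` and the numerical values of n, ρ, µ, and the sample are hypothetical, not taken from the problems.

```python
def equicorrelated_weights(n, rho, mu):
    """Return (alpha0, alpha_i, Z) for n equally correlated observations."""
    denom = 1.0 - rho + n * rho
    alpha0 = (1.0 - rho) * mu / denom
    alpha_i = rho / denom
    z = n * rho / denom
    return alpha0, alpha_i, z

# Hypothetical inputs chosen only to exercise the formulas.
n, rho, mu = 5, 0.3, 100.0
alpha0, alpha_i, z = equicorrelated_weights(n, rho, mu)

# Unbiasedness condition: sum of the alpha_j equals 1 - alpha0 / mu.
lhs_unbiased = n * alpha_i
rhs_unbiased = 1.0 - alpha0 / mu

# Covariance condition: rho * sum(alpha_j) + alpha_i * (1 - rho) = rho.
lhs_cov = rho * n * alpha_i + alpha_i * (1.0 - rho)

# Credibility premium written both ways for a hypothetical sample.
xs = [90.0, 120.0, 105.0, 80.0, 110.0]
xbar = sum(xs) / n
premium_weights = alpha0 + alpha_i * sum(xs)
premium_z = (1.0 - z) * mu + z * xbar
print(lhs_unbiased, rhs_unbiased, lhs_cov, premium_weights, premium_z)
```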
Section 82
82.1 10,622
82.2 0.85651
82.3 0.22
82.4 1063.47
82.5 3
82.6 14
82.7 $\frac{\theta - \theta^2}{1.5 - \theta^2}$
82.8 8.33
82.9 $\frac{1}{9}$
82.10 3.27
Section 83
83.1 $\frac{n\beta}{n\beta+1}\bar{X} + \frac{1}{n\beta+1}(\alpha\beta)$
83.2 0.9375
83.3 0.905
83.4 1
83.5 0.93
83.6 8.69
83.7 0.428
83.8 0.8
Section 84
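\begin{align*}
\mathrm{Var}\left(\frac{m_i X_i + m_j X_j}{m_i + m_j}\,\Big|\,\Theta\right) &= \left(\frac{m_i}{m_i+m_j}\right)^{2} \mathrm{Var}(X_i\mid\Theta) + \left(\frac{m_j}{m_i+m_j}\right)^{2} \mathrm{Var}(X_j\mid\Theta) \\
&= \left(\frac{m_i}{m_i+m_j}\right)^{2} \left[w(\theta) + \frac{v(\theta)}{m_i}\right] + \left(\frac{m_j}{m_i+m_j}\right)^{2} \left[w(\theta) + \frac{v(\theta)}{m_j}\right] \\
&= \frac{m_i^{2} + m_j^{2}}{(m_i+m_j)^{2}}\, w(\theta) + \frac{v(\theta)}{m_i+m_j}.
\end{align*}
(b) We have
\begin{align*}
E(X_i) &= E[E(X_i\mid\Theta)] = E[\mu(\Theta)] = \mu \\
\mathrm{Cov}(X_i, X_j) &= E(X_i X_j) - E(X_i)E(X_j) \\
&= E[E(X_i X_j\mid\Theta)] - E[\mu(\Theta)]^{2} \\
&= E[E(X_i\mid\Theta)E(X_j\mid\Theta)] - E[\mu(\Theta)]^{2} \quad \text{(by independence)} \\
&= E[\mu^{2}(\Theta)] - E[\mu(\Theta)]^{2} \\
&= \mathrm{Var}[\mu(\Theta)] = a \\
\mathrm{Var}(X_i) &= E[\mathrm{Var}(X_i\mid\Theta)] + \mathrm{Var}[E(X_i\mid\Theta)] \\
&= E\left[w(\Theta) + \frac{v(\Theta)}{m_i}\right] + \mathrm{Var}[\mu(\Theta)] \\
&= w + \frac{v}{m_i} + a
\end{align*}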
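84.3 We have
\begin{align*}
\hat{\alpha}_0 + \sum_{j=1}^{n} \hat{\alpha}_j X_j &= \frac{\mu}{1 + am^{*}} + \sum_{j=1}^{n} \frac{a m_j}{v + w m_j}\, \frac{1}{1 + am^{*}}\, X_j \\
&= \frac{\mu}{1 + am^{*}} + \frac{a}{1 + am^{*}} \sum_{j=1}^{n} \frac{m_j}{v + w m_j}\, X_j \\
&= Z\bar{X} + (1 - Z)\mu.
\end{align*}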
84.4 2.4
84.5 12
84.6 11.13
84.7 257.11
84.8 $\frac{4}{3}$
84.9 (A) is false. This is true for Bühlmann. The Bühlmann-Straub model
allows for variation in size and exposure.
(B) is false. The model is valid for any type of distribution.
(C) is false. There is no cap on the number of exposures.
Thus, the answer to the problem is (E).
84.10 $\frac{n}{n + \frac{w}{a}}$
Section 85
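\begin{align*}
E(X_{n+1}\mid \mathbf{X}) &= \int_0^\infty \lambda\, \frac{\left(\theta + \sum_{i=1}^{n} x_i\right)^{\alpha+n} e^{-\frac{1}{\lambda}\left(\sum_{i=1}^{n} x_i + \theta\right)}}{\lambda^{\alpha+n+1}\,\Gamma(\alpha+n)}\, d\lambda \\
&= \frac{n\bar{X} + \theta}{\alpha+n-1} \int_0^\infty \frac{\left(\theta + \sum_{i=1}^{n} x_i\right)^{\alpha+n-1} e^{-\frac{1}{\lambda}\left(\sum_{i=1}^{n} x_i + \theta\right)}}{\lambda^{\alpha+n-1+1}\,\Gamma(\alpha+n-1)}\, d\lambda \\
&= \frac{n\bar{X} + \theta}{\alpha+n-1}.
\end{align*}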
85.2 Problem 66.1 shows that the posterior distribution is a beta distribution with parameters $a' = a + \sum_{i=1}^{n} x_i$ and $b' = b + nm - \sum_{i=1}^{n} x_i$. The
hypothetical mean is
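85.3 By Example 66.3 the posterior distribution is a normal distribution
with mean $\left(\frac{\sum x_i}{\sigma^2} + \frac{\mu}{a^2}\right)\left(\frac{n}{\sigma^2} + \frac{1}{a^2}\right)^{-1}$ and variance $\left(\frac{n}{\sigma^2} + \frac{1}{a^2}\right)^{-1}$. The hypothetical mean is
\begin{align*}
\mu(\lambda) &= E(X_i\mid\Lambda) = \lambda \\
\mu &= E(\Lambda) \\
v(\lambda) &= \mathrm{Var}(X_i\mid\Lambda) = \sigma^2 \\
v &= E(\sigma^2) = \sigma^2 \\
a &= \mathrm{Var}(\Lambda) = a^2 \\
k &= \frac{v}{a} = \frac{\sigma^2}{a^2} \\
Z &= \frac{n}{n+k} = \frac{na^2}{na^2 + \sigma^2} \\
P_c &= Z\bar{X} + (1 - Z)\mu \\
&= \frac{na^2\,\bar{x}}{na^2 + \sigma^2} + \frac{\mu\sigma^2}{na^2 + \sigma^2} \\
&= \left(\frac{\sum x_i}{\sigma^2} + \frac{\mu}{a^2}\right)\left(\frac{n}{\sigma^2} + \frac{1}{a^2}\right)^{-1}.
\end{align*}
Here the symbol $a$ on the left of the fifth line is the Bühlmann variance of the hypothetical means, while $a^2$ is the prior variance.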
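The equality between the Bühlmann premium and the posterior mean in 85.3 can be illustrated numerically. This is a minimal sketch under assumed inputs: the function names and the sample values (observations, prior mean, prior variance a², process variance σ²) are hypothetical.

```python
def buhlmann_premium(xs, mu, a2, sigma2):
    """Credibility premium Z * xbar + (1 - Z) * mu with k = sigma2 / a2."""
    n = len(xs)
    xbar = sum(xs) / n
    k = sigma2 / a2
    z = n / (n + k)
    return z * xbar + (1.0 - z) * mu

def normal_posterior_mean(xs, mu, a2, sigma2):
    """Posterior mean of the normal/normal model, written in precision form."""
    n = len(xs)
    precision = n / sigma2 + 1.0 / a2
    return (sum(xs) / sigma2 + mu / a2) / precision

# Hypothetical data: three observations, prior N(2, 4), process variance 9.
xs, mu, a2, sigma2 = [3.0, 5.0, 4.0], 2.0, 4.0, 9.0
print(buhlmann_premium(xs, mu, a2, sigma2))
print(normal_posterior_mean(xs, mu, a2, sigma2))
```

Both functions should print the same value, since in this model the Bayesian premium is linear in the sample mean.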
85.4 0.0182
Section 86
86.1 0.818
86.2 687.375
86.3 0.78
86.4 0.8718
Section 87
87.1 0.3682
87.2 0.4987
87.3 0.852
87.4 1.351
87.5 98.26
87.6 0.323
87.7 7.56
Section 88
88.1 $a = \mu - v - \mu^2$
88.2 0.221
88.3 0.3928
88.4 0.6333
88.5 0.5747
88.6 0.2659
88.7 0.023209
Section 89
89.1 1000
89.2 1
89.3 (D)
89.4 2212.76
89.5 3477.81
89.6 7
Section 90
90.1 522.13
90.2 228,503
90.3 224.44
90.4 88.75
90.5 41.897
90.6 35.7
90.7 630.79
Section 91
Section 92
92.1 1
92.2 (A)
92.3 $\frac{44}{9}$
92.4 0.0131
92.5 21θ4
Exam C Tables
Appendix A
An Inventory of Continuous
Distributions
A.1 Introduction
The incomplete gamma function is given by
\[ \Gamma(\alpha; x) = \frac{1}{\Gamma(\alpha)} \int_0^x t^{\alpha-1} e^{-t}\, dt, \quad \alpha > 0,\ x > 0 \]
with
\[ \Gamma(\alpha) = \int_0^\infty t^{\alpha-1} e^{-t}\, dt, \quad \alpha > 0. \]
Also, define
\[ G(\alpha; x) = \int_x^\infty t^{\alpha-1} e^{-t}\, dt, \quad x > 0. \]
At times we will need this integral for nonpositive values of α. Integration by parts produces the relationship
\[ G(\alpha; x) = -\frac{x^{\alpha} e^{-x}}{\alpha} + \frac{1}{\alpha}\, G(\alpha + 1; x). \]
This can be repeated until the first argument of G is α + k, a positive number. Then it can be evaluated
from
\[ G(\alpha + k; x) = \Gamma(\alpha + k)\,[1 - \Gamma(\alpha + k; x)]. \]
The incomplete beta function is given by
\[ \beta(a, b; x) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \int_0^x t^{a-1}(1-t)^{b-1}\, dt, \quad a > 0,\ b > 0,\ 0 < x < 1. \]
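The recursion for G can be sketched in a few lines. The code below is illustrative only and rests on several assumptions: the function names are hypothetical, the regularized incomplete gamma function is evaluated with a plain power series (adequate for moderate x, not a production routine), and the recursion is applied only to non-integer negative α, because dividing by α fails when the recursion reaches α = 0.

```python
import math

def reg_lower_gamma(alpha, x, terms=200):
    """Gamma(alpha; x) for alpha > 0 via the series
    x**alpha * exp(-x) * sum_{n>=0} x**n / Gamma(alpha + n + 1)."""
    total = 0.0
    term = 1.0 / math.gamma(alpha + 1.0)
    for n in range(terms):
        total += term
        term *= x / (alpha + n + 1.0)
    return x ** alpha * math.exp(-x) * total

def upper_G(alpha, x):
    """G(alpha; x): integral from x to infinity of t**(alpha-1) * exp(-t)."""
    if alpha > 0:
        return math.gamma(alpha) * (1.0 - reg_lower_gamma(alpha, x))
    if alpha == int(alpha):
        # The integration-by-parts step divides by alpha, so it cannot pass through 0.
        raise ValueError("recursion is not usable for nonpositive integer alpha")
    # One integration-by-parts step, then recurse towards a positive first argument.
    return -x ** alpha * math.exp(-x) / alpha + upper_G(alpha + 1.0, x) / alpha

# Check against a closed form: G(-1/2; x) = 2 x^(-1/2) e^(-x) - 2 sqrt(pi) erfc(sqrt(x)).
x = 2.0
closed_form = 2.0 * x ** -0.5 * math.exp(-x) - 2.0 * math.sqrt(math.pi) * math.erfc(math.sqrt(x))
print(upper_G(-0.5, x), closed_form)
```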
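\[ f(x) = \frac{\Gamma(\alpha+\tau)}{\Gamma(\alpha)\Gamma(\tau)}\, \frac{\theta^{\alpha} x^{\tau-1}}{(x+\theta)^{\alpha+\tau}}, \qquad F(x) = \beta(\tau, \alpha; u), \quad u = \frac{x}{x+\theta} \]
\[ E[X^k] = \frac{\theta^k\, \Gamma(\tau+k)\Gamma(\alpha-k)}{\Gamma(\alpha)\Gamma(\tau)}, \quad -\tau < k < \alpha \]
\[ E[X^k] = \frac{\theta^k\, \tau(\tau+1)\cdots(\tau+k-1)}{(\alpha-1)\cdots(\alpha-k)}, \quad \text{if } k \text{ is an integer} \]
\[ E[(X \wedge x)^k] = \frac{\theta^k\, \Gamma(\tau+k)\Gamma(\alpha-k)}{\Gamma(\alpha)\Gamma(\tau)}\, \beta(\tau+k, \alpha-k; u) + x^k[1 - F(x)], \quad k > -\tau \]
\[ \text{mode} = \theta\, \frac{\tau-1}{\alpha+1}, \quad \tau > 1, \text{ else } 0 \]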
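\[ f(x) = \frac{\alpha\gamma (x/\theta)^{\gamma}}{x[1 + (x/\theta)^{\gamma}]^{\alpha+1}}, \qquad F(x) = 1 - u^{\alpha}, \quad u = \frac{1}{1 + (x/\theta)^{\gamma}} \]
\[ E[X^k] = \frac{\theta^k\, \Gamma(1 + k/\gamma)\Gamma(\alpha - k/\gamma)}{\Gamma(\alpha)}, \quad -\gamma < k < \alpha\gamma \]
\[ \mathrm{VaR}_p(X) = \theta[(1-p)^{-1/\alpha} - 1]^{1/\gamma} \]
\[ E[(X \wedge x)^k] = \frac{\theta^k\, \Gamma(1 + k/\gamma)\Gamma(\alpha - k/\gamma)}{\Gamma(\alpha)}\, \beta(1 + k/\gamma, \alpha - k/\gamma; 1-u) + x^k u^{\alpha}, \quad k > -\gamma \]
\[ \text{mode} = \theta \left(\frac{\gamma-1}{\alpha\gamma+1}\right)^{1/\gamma}, \quad \gamma > 1, \text{ else } 0 \]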
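\[ f(x) = \frac{\tau\gamma (x/\theta)^{\gamma\tau}}{x[1 + (x/\theta)^{\gamma}]^{\tau+1}}, \qquad F(x) = u^{\tau}, \quad u = \frac{(x/\theta)^{\gamma}}{1 + (x/\theta)^{\gamma}} \]
\[ E[X^k] = \frac{\theta^k\, \Gamma(\tau + k/\gamma)\Gamma(1 - k/\gamma)}{\Gamma(\tau)}, \quad -\tau\gamma < k < \gamma \]
\[ \mathrm{VaR}_p(X) = \theta(p^{-1/\tau} - 1)^{-1/\gamma} \]
\[ E[(X \wedge x)^k] = \frac{\theta^k\, \Gamma(\tau + k/\gamma)\Gamma(1 - k/\gamma)}{\Gamma(\tau)}\, \beta(\tau + k/\gamma, 1 - k/\gamma; u) + x^k[1 - u^{\tau}], \quad k > -\tau\gamma \]
\[ \text{mode} = \theta \left(\frac{\tau\gamma-1}{\gamma+1}\right)^{1/\gamma}, \quad \tau\gamma > 1, \text{ else } 0 \]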
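\[ f(x) = \frac{\gamma (x/\theta)^{\gamma}}{x[1 + (x/\theta)^{\gamma}]^{2}}, \qquad F(x) = u, \quad u = \frac{(x/\theta)^{\gamma}}{1 + (x/\theta)^{\gamma}} \]
\[ E[X^k] = \theta^k\, \Gamma(1 + k/\gamma)\Gamma(1 - k/\gamma), \quad -\gamma < k < \gamma \]
\[ \mathrm{VaR}_p(X) = \theta(p^{-1} - 1)^{-1/\gamma} \]
\[ E[(X \wedge x)^k] = \theta^k\, \Gamma(1 + k/\gamma)\Gamma(1 - k/\gamma)\, \beta(1 + k/\gamma, 1 - k/\gamma; u) + x^k(1 - u), \quad k > -\gamma \]
\[ \text{mode} = \theta \left(\frac{\gamma-1}{\gamma+1}\right)^{1/\gamma}, \quad \gamma > 1, \text{ else } 0 \]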
A.2.3.4 Paralogistic–α, θ
This is a Burr distribution with γ = α.
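\[ f(x) = \frac{\alpha^{2} (x/\theta)^{\alpha}}{x[1 + (x/\theta)^{\alpha}]^{\alpha+1}}, \qquad F(x) = 1 - u^{\alpha}, \quad u = \frac{1}{1 + (x/\theta)^{\alpha}} \]
\[ E[X^k] = \frac{\theta^k\, \Gamma(1 + k/\alpha)\Gamma(\alpha - k/\alpha)}{\Gamma(\alpha)}, \quad -\alpha < k < \alpha^{2} \]
\[ \mathrm{VaR}_p(X) = \theta[(1-p)^{-1/\alpha} - 1]^{1/\alpha} \]
\[ E[(X \wedge x)^k] = \frac{\theta^k\, \Gamma(1 + k/\alpha)\Gamma(\alpha - k/\alpha)}{\Gamma(\alpha)}\, \beta(1 + k/\alpha, \alpha - k/\alpha; 1-u) + x^k u^{\alpha}, \quad k > -\alpha \]
\[ \text{mode} = \theta \left(\frac{\alpha-1}{\alpha^{2}+1}\right)^{1/\alpha}, \quad \alpha > 1, \text{ else } 0 \]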
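\[ f(x) = \frac{\tau^{2} (x/\theta)^{\tau^{2}}}{x[1 + (x/\theta)^{\tau}]^{\tau+1}}, \qquad F(x) = u^{\tau}, \quad u = \frac{(x/\theta)^{\tau}}{1 + (x/\theta)^{\tau}} \]
\[ E[X^k] = \frac{\theta^k\, \Gamma(\tau + k/\tau)\Gamma(1 - k/\tau)}{\Gamma(\tau)}, \quad -\tau^{2} < k < \tau \]
\[ \mathrm{VaR}_p(X) = \theta(p^{-1/\tau} - 1)^{-1/\tau} \]
\[ E[(X \wedge x)^k] = \frac{\theta^k\, \Gamma(\tau + k/\tau)\Gamma(1 - k/\tau)}{\Gamma(\tau)}\, \beta(\tau + k/\tau, 1 - k/\tau; u) + x^k[1 - u^{\tau}], \quad k > -\tau^{2} \]
\[ \text{mode} = \theta\,(\tau - 1)^{1/\tau}, \quad \tau > 1, \text{ else } 0 \]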
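\[ f(x) = \frac{(x/\theta)^{\alpha} e^{-x/\theta}}{x\,\Gamma(\alpha)}, \qquad F(x) = \Gamma(\alpha; x/\theta) \]
\[ M(t) = (1 - \theta t)^{-\alpha}, \quad t < 1/\theta, \qquad E[X^k] = \frac{\theta^k\, \Gamma(\alpha+k)}{\Gamma(\alpha)}, \quad k > -\alpha \]
\[ E[X^k] = \theta^k (\alpha+k-1)\cdots\alpha, \quad \text{if } k \text{ is an integer} \]
\[ E[(X \wedge x)^k] = \frac{\theta^k\, \Gamma(\alpha+k)}{\Gamma(\alpha)}\, \Gamma(\alpha+k; x/\theta) + x^k[1 - \Gamma(\alpha; x/\theta)], \quad k > -\alpha \]
\[ = \alpha(\alpha+1)\cdots(\alpha+k-1)\,\theta^k\, \Gamma(\alpha+k; x/\theta) + x^k[1 - \Gamma(\alpha; x/\theta)], \quad k \text{ an integer} \]
\[ \text{mode} = \theta(\alpha - 1), \quad \alpha > 1, \text{ else } 0 \]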
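\[ f(x) = \frac{(\theta/x)^{\alpha} e^{-\theta/x}}{x\,\Gamma(\alpha)}, \qquad F(x) = 1 - \Gamma(\alpha; \theta/x) \]
\[ E[X^k] = \frac{\theta^k\, \Gamma(\alpha-k)}{\Gamma(\alpha)}, \quad k < \alpha, \qquad E[X^k] = \frac{\theta^k}{(\alpha-1)\cdots(\alpha-k)}, \quad \text{if } k \text{ is an integer} \]
\[ E[(X \wedge x)^k] = \frac{\theta^k\, \Gamma(\alpha-k)}{\Gamma(\alpha)}\,[1 - \Gamma(\alpha-k; \theta/x)] + x^k\, \Gamma(\alpha; \theta/x) \]
\[ = \frac{\theta^k\, G(\alpha-k; \theta/x)}{\Gamma(\alpha)} + x^k\, \Gamma(\alpha; \theta/x), \quad \text{all } k \]
\[ \text{mode} = \theta/(\alpha+1) \]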
A.3.2.3 Weibull–θ, τ
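\[ f(x) = \frac{\tau (x/\theta)^{\tau} e^{-(x/\theta)^{\tau}}}{x}, \qquad F(x) = 1 - e^{-(x/\theta)^{\tau}} \]
\[ E[X^k] = \theta^k\, \Gamma(1 + k/\tau), \quad k > -\tau \]
\[ \mathrm{VaR}_p(X) = \theta[-\ln(1-p)]^{1/\tau} \]
\[ E[(X \wedge x)^k] = \theta^k\, \Gamma(1 + k/\tau)\, \Gamma[1 + k/\tau; (x/\theta)^{\tau}] + x^k e^{-(x/\theta)^{\tau}}, \quad k > -\tau \]
\[ \text{mode} = \theta \left(\frac{\tau-1}{\tau}\right)^{1/\tau}, \quad \tau > 1, \text{ else } 0 \]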
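\[ f(x) = \frac{\tau (\theta/x)^{\tau} e^{-(\theta/x)^{\tau}}}{x}, \qquad F(x) = e^{-(\theta/x)^{\tau}} \]
\[ E[X^k] = \theta^k\, \Gamma(1 - k/\tau), \quad k < \tau \]
\[ \mathrm{VaR}_p(X) = \theta(-\ln p)^{-1/\tau} \]
\[ E[(X \wedge x)^k] = \theta^k\, \Gamma(1 - k/\tau)\,\{1 - \Gamma[1 - k/\tau; (\theta/x)^{\tau}]\} + x^k \left[1 - e^{-(\theta/x)^{\tau}}\right], \quad \text{all } k \]
\[ = \theta^k\, G[1 - k/\tau; (\theta/x)^{\tau}] + x^k \left[1 - e^{-(\theta/x)^{\tau}}\right] \]
\[ \text{mode} = \theta \left(\frac{\tau}{\tau+1}\right)^{1/\tau} \]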
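\[ f(x) = \frac{e^{-x/\theta}}{\theta}, \qquad F(x) = 1 - e^{-x/\theta} \]
\[ M(t) = (1 - \theta t)^{-1}, \qquad E[X^k] = \theta^k\, \Gamma(k+1), \quad k > -1 \]
\[ E[X^k] = \theta^k k!, \quad \text{if } k \text{ is an integer} \]
\[ \mathrm{VaR}_p(X) = -\theta \ln(1-p) \]
\[ \mathrm{TVaR}_p(X) = -\theta \ln(1-p) + \theta \]
\[ E[X \wedge x] = \theta(1 - e^{-x/\theta}) \]
\[ E[(X \wedge x)^k] = \theta^k\, \Gamma(k+1)\, \Gamma(k+1; x/\theta) + x^k e^{-x/\theta}, \quad k > -1 \]
\[ = \theta^k k!\, \Gamma(k+1; x/\theta) + x^k e^{-x/\theta}, \quad k \text{ an integer} \]
\[ \text{mode} = 0 \]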
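The exponential entries above translate directly into code. The sketch below is illustrative: the function names and the values θ = 1000, p = 0.95, and limit x = 500 are hypothetical, and the numerical check integrates the survival function with a simple midpoint rule rather than any library routine.

```python
import math

def exp_var(p, theta):
    """Value-at-Risk of an exponential with mean theta."""
    return -theta * math.log(1.0 - p)

def exp_tvar(p, theta):
    """Tail-Value-at-Risk: VaR plus theta, by the memoryless property."""
    return exp_var(p, theta) + theta

def exp_limited_mean(x, theta):
    """Limited expected value E[X ^ x]."""
    return theta * (1.0 - math.exp(-x / theta))

def limited_mean_numeric(x, theta, n=100000):
    """Midpoint-rule integral of the survival function exp(-t/theta) on [0, x]."""
    h = x / n
    return h * sum(math.exp(-(i + 0.5) * h / theta) for i in range(n))

theta, p = 1000.0, 0.95  # hypothetical parameter and security level
q = exp_var(p, theta)
print(q, exp_tvar(p, theta))
print(exp_limited_mean(500.0, theta), limited_mean_numeric(500.0, theta))
```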
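\[ f(x) = \frac{\theta e^{-\theta/x}}{x^{2}}, \qquad F(x) = e^{-\theta/x} \]
\[ E[X^k] = \theta^k\, \Gamma(1-k), \quad k < 1 \]
\[ \mathrm{VaR}_p(X) = \theta(-\ln p)^{-1} \]
\[ E[(X \wedge x)^k] = \theta^k\, G(1-k; \theta/x) + x^k (1 - e^{-\theta/x}), \quad \text{all } k \]
\[ \text{mode} = \theta/2 \]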
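\[ f(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp(-z^{2}/2) = \phi(z)/(\sigma x), \quad z = \frac{\ln x - \mu}{\sigma}, \qquad F(x) = \Phi(z) \]
\[ E[X^k] = \exp(k\mu + k^{2}\sigma^{2}/2) \]
\[ E[(X \wedge x)^k] = \exp(k\mu + k^{2}\sigma^{2}/2)\, \Phi\left(\frac{\ln x - \mu - k\sigma^{2}}{\sigma}\right) + x^k[1 - F(x)] \]
\[ \text{mode} = \exp(\mu - \sigma^{2}) \]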
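\[ f(x) = \left(\frac{\theta}{2\pi x^{3}}\right)^{1/2} \exp\left(-\frac{\theta z^{2}}{2x}\right), \quad z = \frac{x - \mu}{\mu} \]
\[ F(x) = \Phi\left[z \left(\frac{\theta}{x}\right)^{1/2}\right] + \exp\left(\frac{2\theta}{\mu}\right) \Phi\left[-y \left(\frac{\theta}{x}\right)^{1/2}\right], \quad y = \frac{x + \mu}{\mu} \]
\[ M(t) = \exp\left[\frac{\theta}{\mu}\left(1 - \sqrt{1 - \frac{2t\mu^{2}}{\theta}}\right)\right], \quad t < \frac{\theta}{2\mu^{2}}, \qquad E[X] = \mu, \quad \mathrm{Var}[X] = \mu^{3}/\theta \]
\[ E[X \wedge x] = x - \mu z\, \Phi\left[z \left(\frac{\theta}{x}\right)^{1/2}\right] - \mu y \exp\left(\frac{2\theta}{\mu}\right) \Phi\left[-y \left(\frac{\theta}{x}\right)^{1/2}\right] \]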
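\[ f(x) = \frac{\alpha\theta^{\alpha}}{x^{\alpha+1}}, \quad x > \theta, \qquad F(x) = 1 - (\theta/x)^{\alpha}, \quad x > \theta \]
\[ \mathrm{VaR}_p(X) = \theta(1-p)^{-1/\alpha}, \qquad \mathrm{TVaR}_p(X) = \frac{\alpha\theta(1-p)^{-1/\alpha}}{\alpha - 1}, \quad \alpha > 1 \]
\[ E[X^k] = \frac{\alpha\theta^k}{\alpha - k}, \quad k < \alpha, \qquad E[(X \wedge x)^k] = \frac{\alpha\theta^k}{\alpha - k} - \frac{k\theta^{\alpha}}{(\alpha - k)x^{\alpha-k}}, \quad x \ge \theta \]
\[ \text{mode} = \theta \]
Note: Although there appear to be two parameters, only α is a true parameter. The value of θ must be
set in advance.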
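\[ f(x) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, u^{a}(1-u)^{b-1}\, \frac{\tau}{x}, \quad 0 < x < \theta, \quad u = (x/\theta)^{\tau} \]
\[ F(x) = \beta(a, b; u) \]
\[ E[X^k] = \frac{\theta^k\, \Gamma(a+b)\Gamma(a + k/\tau)}{\Gamma(a)\Gamma(a + b + k/\tau)}, \quad k > -a\tau \]
\[ E[(X \wedge x)^k] = \frac{\theta^k\, \Gamma(a+b)\Gamma(a + k/\tau)}{\Gamma(a)\Gamma(a + b + k/\tau)}\, \beta(a + k/\tau, b; u) + x^k[1 - \beta(a, b; u)] \]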
A.6.1.2 beta–a, b, θ
\[ f(x) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, u^{a}(1-u)^{b-1}\, \frac{1}{x}, \quad 0 < x < \theta, \quad u = x/\theta \]
\[ F(x) = \beta(a, b; u) \]
\[ E[X^k] = \frac{\theta^k\, \Gamma(a+b)\Gamma(a+k)}{\Gamma(a)\Gamma(a+b+k)}, \quad k > -a \]
\[ E[X^k] = \frac{\theta^k\, a(a+1)\cdots(a+k-1)}{(a+b)(a+b+1)\cdots(a+b+k-1)}, \quad \text{if } k \text{ is an integer} \]
\[ E[(X \wedge x)^k] = \frac{\theta^k\, a(a+1)\cdots(a+k-1)}{(a+b)(a+b+1)\cdots(a+b+k-1)}\, \beta(a+k, b; u) + x^k[1 - \beta(a, b; u)] \]
Appendix B
An Inventory of Discrete
Distributions
B.1 Introduction
The 16 models fall into three classes. The divisions are based on the algorithm by which the probabilities are
computed. For some of the more familiar distributions these formulas will look different from the ones you
may have learned, but they produce the same probabilities. After each name, the parameters are given. All
parameters are positive unless otherwise indicated. In all cases, pk is the probability of observing k losses.
For finding moments, the most convenient form is to give the factorial moments. The jth factorial
moment is $\mu_{(j)} = E[N(N-1)\cdots(N-j+1)]$. We have $E[N] = \mu_{(1)}$ and $\mathrm{Var}(N) = \mu_{(2)} + \mu_{(1)} - \mu_{(1)}^{2}$.
The estimators which are presented are not intended to be useful estimators but rather for providing
starting values for maximizing the likelihood (or other) function. For determining starting values, the
following quantities are used [where nk is the observed frequency at k (if, for the last entry, nk represents
the number of observations at k or more, assume it was at exactly k) and n is the sample size]:
\[ \hat{\mu} = \frac{1}{n}\sum_{k=1}^{\infty} k\, n_k, \qquad \hat{\sigma}^{2} = \frac{1}{n}\sum_{k=1}^{\infty} k^{2} n_k - \hat{\mu}^{2}. \]
When the method of moments is used to determine the starting value, a circumflex (e.g., λ̂) is used. For
any other method, a tilde (e.g., λ̃) is used. When the starting value formulas do not provide admissible
parameter values, a truly crude guess is to set the product of all λ and β parameters equal to the sample
mean and set all other parameters equal to 1. If there are two λ and/or β parameters, an easy choice is to
set each to the square root of the sample mean.
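The starting-value quantities defined above are straightforward to compute from a frequency table. The following sketch is illustrative only: the function name `sample_moments` and the frequency counts are hypothetical, and the Poisson starting value shown simply matches E[N] = λ to the sample mean.

```python
def sample_moments(counts):
    """Return (n, mu_hat, sigma2_hat) where counts[k] is the observed frequency n_k at k."""
    n = sum(counts)
    mu_hat = sum(k * nk for k, nk in enumerate(counts)) / n
    sigma2_hat = sum(k * k * nk for k, nk in enumerate(counts)) / n - mu_hat ** 2
    return n, mu_hat, sigma2_hat

# Hypothetical data: 50 policies with 0 losses, 30 with 1, 15 with 2, 5 with 3.
counts = [50, 30, 15, 5]
n, mu_hat, sigma2_hat = sample_moments(counts)

# Method-of-moments starting value for a Poisson model (E[N] = lambda).
lambda_hat = mu_hat
print(n, mu_hat, sigma2_hat, lambda_hat)
```

The term for k = 0 contributes nothing to either sum, so including the zero count in the list only affects the sample size n.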
The last item presented is the probability generating function,
\[ P(z) = E[z^{N}]. \]
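\[ p_0 = e^{-\lambda}, \quad a = 0, \quad b = \lambda, \qquad p_k = \frac{e^{-\lambda}\lambda^{k}}{k!} \]
\[ E[N] = \lambda, \quad \mathrm{Var}[N] = \lambda, \qquad P(z) = e^{\lambda(z-1)} \]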
B.2.1.2 Geometric–β
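\[ p_0 = \frac{1}{1+\beta}, \quad a = \frac{\beta}{1+\beta}, \quad b = 0, \qquad p_k = \frac{\beta^{k}}{(1+\beta)^{k+1}} \]
\[ E[N] = \beta, \quad \mathrm{Var}[N] = \beta(1+\beta), \qquad P(z) = [1 - \beta(z-1)]^{-1}. \]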
\[ p_0 = (1-q)^{m}, \quad a = -\frac{q}{1-q}, \quad b = \frac{(m+1)q}{1-q} \]
\[ p_k = \binom{m}{k} q^{k}(1-q)^{m-k}, \quad k = 0, 1, \ldots, m \]
\[ E[N] = mq, \quad \mathrm{Var}[N] = mq(1-q), \qquad P(z) = [1 + q(z-1)]^{m}. \]
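\[ p_0 = (1+\beta)^{-r}, \quad a = \frac{\beta}{1+\beta}, \quad b = \frac{(r-1)\beta}{1+\beta} \]
\[ p_k = \frac{r(r+1)\cdots(r+k-1)\,\beta^{k}}{k!\,(1+\beta)^{r+k}} \]
\[ E[N] = r\beta, \quad \mathrm{Var}[N] = r\beta(1+\beta), \qquad P(z) = [1 - \beta(z-1)]^{-r}. \]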
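The values of a and b listed for each distribution feed the standard (a, b, 0) recursion p_k = (a + b/k) p_{k-1}. The sketch below is an illustration, not part of the tables: the function name `ab0_probabilities` and the parameter values are hypothetical. It checks the recursion against the closed-form Poisson and negative binomial probabilities given above.

```python
import math

def ab0_probabilities(p0, a, b, kmax):
    """Probabilities p_0..p_kmax from the recursion p_k = (a + b / k) * p_{k-1}."""
    probs = [p0]
    for k in range(1, kmax + 1):
        probs.append((a + b / k) * probs[-1])
    return probs

# Poisson with a hypothetical lambda = 2.5: p0 = exp(-lambda), a = 0, b = lambda.
lam = 2.5
poisson_rec = ab0_probabilities(math.exp(-lam), 0.0, lam, 10)
poisson_direct = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(11)]

# Negative binomial with hypothetical r = 2.5, beta = 0.8.
r, beta = 2.5, 0.8
nb_rec = ab0_probabilities((1 + beta) ** -r, beta / (1 + beta), (r - 1) * beta / (1 + beta), 10)
nb_direct = [
    math.gamma(r + k) / (math.gamma(r) * math.factorial(k)) * beta ** k / (1 + beta) ** (r + k)
    for k in range(11)
]
print(poisson_rec[:3], nb_rec[:3])
```

Here the rising product r(r+1)...(r+k-1) is written as Γ(r+k)/Γ(r), which also covers non-integer r.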
There are two sub-classes of this class. When discussing their members, we often refer to the “corresponding”
member of the (a, b, 0) class. This refers to the member of that class with the same values for a and b. The
notation pk will continue to be used for probabilities for the corresponding (a, b, 0) distribution.
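\[ p_1^{T} = \frac{\lambda}{e^{\lambda} - 1}, \quad a = 0, \quad b = \lambda, \]
\[ p_k^{T} = \frac{\lambda^{k}}{k!\,(e^{\lambda} - 1)}, \]
\[ E[N] = \lambda/(1 - e^{-\lambda}), \quad \mathrm{Var}[N] = \lambda[1 - (\lambda+1)e^{-\lambda}]/(1 - e^{-\lambda})^{2}, \]
\[ \tilde{\lambda} = \ln(n\hat{\mu}/n_1), \]
\[ P(z) = \frac{e^{\lambda z} - 1}{e^{\lambda} - 1}. \]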
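\[ p_1^{T} = \frac{1}{1+\beta}, \quad a = \frac{\beta}{1+\beta}, \quad b = 0, \]
\[ p_k^{T} = \frac{\beta^{k-1}}{(1+\beta)^{k}}, \]
\[ E[N] = 1 + \beta, \quad \mathrm{Var}[N] = \beta(1+\beta), \]
\[ \hat{\beta} = \hat{\mu} - 1, \]
\[ P(z) = \frac{[1 - \beta(z-1)]^{-1} - (1+\beta)^{-1}}{1 - (1+\beta)^{-1}}. \]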
B.3.1.3 Logarithmic–β
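\[ p_1^{T} = \frac{\beta}{(1+\beta)\ln(1+\beta)}, \quad a = \frac{\beta}{1+\beta}, \quad b = -\frac{\beta}{1+\beta}, \]
\[ p_k^{T} = \frac{\beta^{k}}{k(1+\beta)^{k}\ln(1+\beta)}, \]
\[ E[N] = \beta/\ln(1+\beta), \quad \mathrm{Var}[N] = \frac{\beta[1 + \beta - \beta/\ln(1+\beta)]}{\ln(1+\beta)}, \]
\[ \tilde{\beta} = \frac{n\hat{\mu}}{n_1} - 1 \quad \text{or} \quad \frac{2(\hat{\mu} - 1)}{\hat{\mu}}, \]
\[ P(z) = 1 - \frac{\ln[1 - \beta(z-1)]}{\ln(1+\beta)}. \]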
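\[ p_1^{T} = \frac{r\beta}{(1+\beta)^{r+1} - (1+\beta)}, \quad a = \frac{\beta}{1+\beta}, \quad b = \frac{(r-1)\beta}{1+\beta}, \]
\[ p_k^{T} = \frac{r(r+1)\cdots(r+k-1)}{k!\,[(1+\beta)^{r} - 1]} \left(\frac{\beta}{1+\beta}\right)^{k}, \]
\[ E[N] = \frac{r\beta}{1 - (1+\beta)^{-r}}, \]
\[ \mathrm{Var}[N] = \frac{r\beta[(1+\beta) - (1 + \beta + r\beta)(1+\beta)^{-r}]}{[1 - (1+\beta)^{-r}]^{2}}, \]
\[ \tilde{\beta} = \frac{\hat{\sigma}^{2}}{\hat{\mu}} - 1, \quad \tilde{r} = \frac{\hat{\mu}^{2}}{\hat{\sigma}^{2} - \hat{\mu}}, \]
\[ P(z) = \frac{[1 - \beta(z-1)]^{-r} - (1+\beta)^{-r}}{1 - (1+\beta)^{-r}}. \]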
This distribution is sometimes called the extended truncated negative binomial distribution because the
parameter r can extend below 0.
[1] S.A. Klugman , H.H. Panjer, and G.E. Willmot Loss Models from Data
to Decisions, 3rd Edition (2008), Wiley.
[2] M.B. Finan, A Probability Course for the Actuaries: A Preparation for
Exam P/1, 2007, Arkansas Tech University.
Index
Unbiased, 334
Unbiased equation, 600
Uniform kernel, 139, 400
Uniformly minimum variance unbiased estimator, 338
Union, 6
Zero-modified, 201
Zero-one loss, 485
Zero-truncated, 201