w6 - Statistical Modelling

Generalized Linear Models

120
Agenda

• Motivation

• Exponential family

• Generalized linear models

• MLE of a generalized linear model

• Goodness of fit measures

• Comparison of models

• Extensions

121
. . . Agenda

• Two practitioners’ guides

▶ A Practitioner’s Guide to Generalized Linear Models,


https://www.casact.org/pubs/dpp/dpp04/04dpp1.pdf

▶ Generalized Linear Models for Insurance Rating,


https://www.casact.org/sites/default/files/2021-01/05-Goldburd-Khare-Tevet.pdf

122
The Problem

• Task: build up a statistical procedure relating

▶ a response variable Y

▶ to explanatory variables or covariates X = (X1 , . . . , Xp ),

▶ n (independent) observations for each variable, yi , xi = (xi1 , . . . , xip ), i = 1, . . . , n (Yi the corresponding rvs)

• Use the statistical model for

▶ predictions

▶ inference

• Linear model: its assumptions are too rigid


Linear Models

• Multivariate linear regression model:

▶ given Xi = xi ,

Yi = β0 + β1 xi1 + . . . + βp xip + εi , i = 1, . . . , n

where the error terms (εi ) are iid with εi ∼ N(0, σε2 )

▶ equivalently, given Xi = xi ,

Yi ∼ N(β0 + β1 xi1 + . . . + βp xip , σε2 )

where
E [Yi |Xi = xi ] = β0 + β1 xi1 + . . . + βp xip
and the Yi independent

124
. . . Linear Models

• Multivariate linear regression model: defined by

▶ [LM1] random component: conditionally on Xi = xi ,

Yi ∼ N(µi , σε2 )

and independent

▶ [LM2] systematic component: the p covariates are combined to form a


linear predictor
ηi = β0 + β1 xi1 + . . . + βp xip

▶ [LM3] link function connecting the random and the systematic


components: linear model ≡ identity function:

E [Yi ] = µi = g −1 (ηi ) = ηi

where g =Id

125
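As a numerical illustration of [LM1]–[LM3] (our sketch, not part of the slides): simulate data from the linear model with one covariate and recover the β's by ordinary least squares. The data and parameter values below are invented.

```python
import random
import math

# Simulate Y_i = beta0 + beta1 * x_i + eps_i with eps_i ~ N(0, sigma^2)  [LM1],
# linear predictor eta_i = beta0 + beta1 * x_i  [LM2], identity link  [LM3];
# then recover (beta0, beta1) by closed-form OLS for a single covariate.

random.seed(1)
beta0, beta1, sigma = 2.0, 0.5, 0.1
x = [i / 10 for i in range(100)]
y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]

xbar = sum(x) / len(x)
ybar = sum(y) / len(y)
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1_hat = sxy / sxx              # slope estimate
b0_hat = ybar - b1_hat * xbar   # intercept estimate

print(round(b0_hat, 1), round(b1_hat, 1))
```

With a small error variance, the estimates land close to the true (2.0, 0.5).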
Shortcomings of the Linear Model

• Normality assumption (1) ⇝ often not justified in practice

▶ errors may present skewness and/or kurtosis

• Normality assumption (2)

▶ response variables can take any real value, inconsistent with

▶ count data (number of claims, number of deaths, . . . )

▶ positive variables (claim sizes, lifetimes, . . . )

▶ bounded variables (relative frequencies, death or claim probabilities, . . . )

126
. . . Shortcomings of the Linear Model

• Some models can be linearized

▶ Example.

Yi = β0 e^(β1 Xi) εi ⇝ log Yi = log β0 + β1 Xi + log εi

▶ linearization may not be possible or meaningful

• The variance of the response variable is constant (homoscedasticity)

▶ counter-intuitive if the variable is positive and the mean goes to 0

▶ the variance should be related to the mean

127
Generalized Linear Models

• GLM: a convenient framework where [LM1] and [LM3] are relaxed

▶ [GLM1] random component: conditionally on Xi = xi , Yi has a distribution in the exponential family, and the Yi are independent

▶ [GLM2] systematic component: the p covariates are combined to form


the linear predictor

ηi = β0 + β1 xi1 + . . . + βp xip

▶ [GLM3] link function connecting the random and systematic


components: any differentiable and monotonic function such that

E [Yi ] = µi = g −1 (ηi ) or g (µi ) = ηi

128
Exponential Family

• A parametric family of distributions is a set

{fθ (y )| θ ∈ Θ}

where for each θ ∈ Θ, fθ is a pdf or pmf

• The rv Y belongs to the exponential family if its pdf (continuous case) or pmf (discrete case) can be written as

fY (y ; θ, ϕ) = exp( (yθ − b(θ)) / a(ϕ) + c(y , ϕ) )

where a, b and c are known functions and θ, ϕ are parameters

129
Exponential Family

• Two parameters

▶ θ: natural (or canonical) parameter, related to the location of the


distribution ⇝ the covariates will enter into θ

▶ ϕ: related to the scale parameter ⇝ ϕ enters in the variance only

• Further

▶ the distribution may depend only on the canonical parameter (scale


known)

▶ the variance depends on both θ and ϕ ⇝ it depends on the mean

▶ the distribution is determined by mean and (possibly) variance


. . . Exponential Family
• Exercise. Check that the N(µ, σ²), µ ∈ R, σ² > 0 is an exponential family; identify θ, ϕ, a, b, c.

(2πσ²)^(−1/2) e^(−(y−µ)²/(2σ²)) = exp( (yµ − µ²/2)/σ² − (1/2)(y²/σ² + log(2πσ²)) )

so that
θ = µ, ϕ = σ², a(ϕ) = ϕ = σ²
b(θ) = θ²/2, c(y , ϕ) = −(1/2)(y²/ϕ + log(2πϕ))

▶ another possible choice is θ = 2µ, b(θ) = θ²/4, ϕ = σ², a(ϕ) = 2ϕ ⇝ parametrization is not unique
▶ note that E [Y ] = µ = b′(θ), VAR[Y ] = σ² = b′′(θ)a(ϕ)
131
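The algebra above can be checked numerically; a small sketch (not from the slides, function names are ours) comparing the N(µ, σ²) density with its exponential-family form, using θ = µ, ϕ = σ², a(ϕ) = ϕ, b(θ) = θ²/2, c(y, ϕ) = −(y²/ϕ + log(2πϕ))/2:

```python
import math

def normal_pdf(y, mu, sigma2):
    # standard N(mu, sigma^2) density
    return math.exp(-(y - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def exp_family_pdf(y, theta, phi):
    # exponential-family form exp((y*theta - b(theta))/a(phi) + c(y, phi))
    a = phi
    b = theta ** 2 / 2
    c = -(y ** 2 / phi + math.log(2 * math.pi * phi)) / 2
    return math.exp((y * theta - b) / a + c)

mu, sigma2 = 1.3, 0.7
for y in (-2.0, 0.0, 0.5, 3.1):
    assert abs(normal_pdf(y, mu, sigma2) - exp_family_pdf(y, mu, sigma2)) < 1e-12
print("densities match")
```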
. . . Exponential Family

• In general for any distribution in the exponential family

E [Y ] = µ = b ′ (θ), VAR[Y ] = b ′′ (θ)a(ϕ)

▶ the mean only depends on θ:

θ = (b ′ )−1 (E [Y ]) = (b ′ )−1 (µ)

⇝ prediction essentially depends on θ

▶ the variance depends on the mean, through the variance function


V (µ) = b ′′ (θ), and on the scale parameter ϕ:

VAR[Y ] = V (µ)a(ϕ)

132
. . . Exponential Family

• Exercise. Check that Poisson(µ), µ > 0 is an exponential family; identify θ, ϕ, a, b, c; for non-negative integer y

µ^y e^(−µ) / y! = exp (y log µ − µ − log(y !))

so that
θ = log µ, b(θ) = µ = e^θ , ϕ = 1, a(ϕ) = 1, c(y , ϕ) = − log(y !)

▶ again b′(θ) = e^θ = µ = E [Y ] and the variance function is V (µ) = b′′(θ) = e^θ = µ
▶ the variance only depends on the variance function; the scale parameter is ϕ = 1 ⇝ no need to estimate ϕ

133
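The identities E[Y] = b′(θ) and V(µ) = b′′(θ) for the Poisson can be verified numerically; a sketch (our illustration, with invented values), differentiating b(θ) = e^θ and comparing with the mean computed directly from the pmf:

```python
import math

def b(theta):
    # cumulant function of the Poisson in exponential-family form
    return math.exp(theta)

def pmf(y, mu):
    # Poisson pmf: mu^y * e^(-mu) / y!
    return mu ** y * math.exp(-mu) / math.factorial(y)

mu = 2.5
theta = math.log(mu)   # canonical parameter

# numerical derivative of b at theta: should equal mu = E[Y]
h = 1e-6
b_prime = (b(theta + h) - b(theta - h)) / (2 * h)

# mean computed directly from the pmf (truncated sum; the tail is negligible)
mean = sum(y * pmf(y, mu) for y in range(60))

assert abs(b_prime - mu) < 1e-6
assert abs(mean - mu) < 1e-9
print("E[Y] = b'(theta) =", round(b_prime, 4))
```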
. . . Exponential Family

• Some examples
model          E [Y ]      θ                b(θ)             ϕ      a(ϕ)
N(µ, σ²)       µ           µ                θ²/2             σ²     ϕ
Gamma(α, λ)    µ = α/λ     −1/µ             − log(−θ)        1/α    ϕ
Poisson(µ)     µ           log µ            e^θ              1      1
Bernoulli(p)   µ = p       log(µ/(1−µ))     log(1 + e^θ)     1      1

• Many other distributions belong to the exponential family:


▶ binomial with fixed (known) number of trials
▶ binomial proportion
▶ inverse normal
▶ ...

134
. . . Exponential Family

• Not every family of distributions is an exponential family

• Distributions for which the support changes with the parameters

▶ Exercise. uniform over an interval (a, b)

▶ binomial where the number of trials is a parameter

• Weibull: fY (y ; c, γ) = cγ y^(γ−1) e^(−c y^γ) , y > 0, with c > 0, γ > 0

135
The Link Function

• Example. Exam pass rate; response variable

▶ Y = student pass/fail in an actuarial exam (categorical)

▶ map it to 0/1 ⇝ Bernoulli

▶ E [Y ] = µ = pass prob.

• Example. Exam pass rate; 3 covariates (p = 3)

▶ N: number of assignments submitted by the student (0, 1, 2, 3, 4)

▶ S: student’s mark on the mock exam (0 ≤ S ≤ 100)

▶ T : whether the student has attended tutorials or not (Yes or No, categorical)
. . . The Link Function

• Example. Exam pass rate; linear predictor is

η = αT + β1 N + β2 S

where αT = αYes or αT = αNo ⇝ not using baseline here!

• Need a link function transforming the value η into the mean µ = E [Y ]

g (µ) = η ⇔ µ = g −1 (η)

▶ in this example µ is a probability ⇝ 0 < g −1 (η) < 1 for all η


▶ for instance, the logit or log-odds function

η = g (µ) = log( µ/(1 − µ) ),   µ = g⁻¹(η) = e^η /(1 + e^η )

137
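The logit link and its inverse are easy to code; a minimal sketch (function names are ours) showing the round trip g⁻¹(g(µ)) = µ and that g⁻¹ always lands in (0, 1):

```python
import math

def logit(mu):
    # link: maps a probability mu in (0, 1) to the whole real line
    return math.log(mu / (1 - mu))

def inv_logit(eta):
    # inverse link: maps any real eta back into (0, 1)
    return math.exp(eta) / (1 + math.exp(eta))

# round trip and range checks
for mu in (0.05, 0.5, 0.95):
    assert abs(inv_logit(logit(mu)) - mu) < 1e-12
for eta in (-10.0, 0.0, 10.0):
    assert 0 < inv_logit(eta) < 1

print(inv_logit(0.0))  # eta = 0 corresponds to mu = 0.5
```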
. . . The Link Function

• Example. Pass rate example; to estimate the model, individual data (whether each student passed or not, and the corresponding covariates) are needed

▶ suppose ML (see later) gives the following estimates

αYes = −1.501, αNo = −3.196, β1 = 0.5459, β2 = 0.0251

▶ note αYes > αNo , β1 , β2 > 0, but β2 close to 0 ⇝ comments?

• equivalently, grouped data – pass frequency for groups of students


sharing the same value of covariates – could be used ⇝ Binomial
proportion distribution

138
. . . The Link Function

• Exercise. Given the above estimates,

1 what is the pass probability for a student who attends tutorials,


submits 3 assignments, and scores 60% on the mock exam?

2 how much would the probability change if the fourth assignment were submitted?

3 what is the highest pass probability for someone who does not attend
tutorials?

4 can anyone get a pass probability of 0 or 1? If not, what are the maximum and minimum probabilities?

• Prediction of Y : for instance, if µ̂ > 0.5 then ŷ = 1

139
. . . The Link Function
• Exercise.
1 use
η̂ = −1.501 + 0.5459 · 3 + 0.0251 · 60 = 1.6427
and so µ̂ = e^1.6427 /(1 + e^1.6427 ) = 84%
2 using N = 4, the same calculation gives η̂ = 2.1886 and µ̂ = 90%, so the pass probability increases by 6 percentage points
3 use
η̂ = −3.196 + 0.5459 · 4 + 0.0251 · 100 = 1.4976
and so µ̂ = 82%
4 min prob.:
η̂ = −3.196 + 0.5459 · 0 + 0.0251 · 0 = −3.196
and µ̂ = 4%; max prob.:
η̂ = −1.501 + 0.5459 · 4 + 0.0251 · 100 = 3.1926
and µ̂ = 96%
140
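The four calculations above can be reproduced mechanically; a small sketch (the function and variable names are ours, not the slides'):

```python
import math

# Fitted coefficients from the slide's pass-rate example
alpha = {"Yes": -1.501, "No": -3.196}   # tutorial-attendance intercepts
beta1, beta2 = 0.5459, 0.0251           # assignments (N) and mock-exam mark (S)

def pass_prob(tutorials, n_assignments, mock_score):
    # linear predictor, then inverse logit
    eta = alpha[tutorials] + beta1 * n_assignments + beta2 * mock_score
    return math.exp(eta) / (1 + math.exp(eta))

print(round(100 * pass_prob("Yes", 3, 60)))    # 1: 84
print(round(100 * pass_prob("Yes", 4, 60)))    # 2: 90
print(round(100 * pass_prob("No", 4, 100)))    # 3: 82
print(round(100 * pass_prob("No", 0, 0)))      # 4 min: 4
print(round(100 * pass_prob("Yes", 4, 100)))   # 4 max: 96
```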
. . . The Link Function

• In general, the link function relates the response mean µ = E [Y ] to


the linear predictor

η = β0 + β1 X1 + . . . + βp Xp = g (µ)

where g is the link function, assumed to be

▶ differentiable ⇝ smooth, for ML

▶ monotonic (increasing or decreasing) ⇝ parameter interpretation

• Inverting, g −1 inverse link function

µ = g −1 (η) = g −1 (β0 + β1 X1 + . . . + βp Xp )

141
. . . Link Function

• When linearizing a model (as in p. 127) the response is transformed; here the mean of the response is transformed

• The choice of the link function g is not unique, but g must be consistent with the range of µ = E [Y ]

• Canonical link function: guarantees that

θ = η = g (µ)

recalling that
µ = b′(θ) ⇝ θ = (b′)⁻¹(µ)
the canonical link function is

g (µ) = θ = (b ′ )−1 (µ)


. . . Link Function

• Some canonical link functions

model                 link function g(µ)             name
normal                g(µ) = θ = µ                   identity
Poisson               g(µ) = θ = log µ               log(arithm)
Binomial (fixed n)    g(µ) = θ = log( µ/(n − µ) )    logit (log-odds)
Gamma                 g(µ) = θ = 1/µ                 reciprocal∗

∗ actually in the Gamma case g(µ) = −1/µ, but the minus sign is dropped as it can be absorbed into the parameters

143
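Under the canonical link, θi = ηi, so the ML score equations take the clean form Xᵀ(y − µ) = 0. A minimal Newton-Raphson sketch (our illustration, with invented data) for a Poisson regression with the canonical log link and one covariate:

```python
import math

# With the canonical log link, theta_i = eta_i = b0 + b1 * x_i and the score
# equations reduce to sum(y_i - mu_i) = 0 and sum(x_i (y_i - mu_i)) = 0,
# where mu_i = exp(eta_i). Solve by Newton-Raphson (equivalent to IRLS).

x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y = [1, 1, 2, 4, 6, 11, 19]                 # hypothetical count responses

b0 = math.log(sum(y) / len(y))              # standard GLM starting value
b1 = 0.0
for _ in range(25):
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    # score vector X'(y - mu)
    s0 = sum(yi - mi for yi, mi in zip(y, mu))
    s1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
    # information matrix X' diag(mu) X (2x2)
    i00 = sum(mu)
    i01 = sum(xi * mi for xi, mi in zip(x, mu))
    i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
    det = i00 * i11 - i01 * i01
    # Newton step: beta <- beta + I^{-1} * score
    b0 += (i11 * s0 - i01 * s1) / det
    b1 += (i00 * s1 - i01 * s0) / det

mu = [math.exp(b0 + b1 * xi) for xi in x]
score = (sum(yi - mi for yi, mi in zip(y, mu)),
         sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu)))
print(round(b0, 3), round(b1, 3))           # score is ~0 at the MLE
```

The same iteration, with a weight matrix W depending on V(µ) and g′(µ), fits any GLM; the canonical link just makes W = diag(µ) here.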
