MAS-II Formula Sheet
Updated 12/07/23
INTRODUCTION TO CREDIBILITY

Classical Credibility
a.k.a. Limited Fluctuation Credibility

Standard for Full Credibility
# of exposures needed, $n_0$, for aggregate loss/pure premium:
$n_0 = \left(\dfrac{z_{1-(\alpha/2)}}{k}\right)^2 C_S^2$

# of claims needed, $n_C$, for aggregate loss/pure premium:
$n_C = \left(\dfrac{z_{1-(\alpha/2)}}{k}\right)^2 \left(\dfrac{\sigma_f^2}{\mu_f} + C_X^2\right)$

where $k$ is the acceptable margin around the mean, $\mu_f$ and $\sigma_f^2$ are the mean and variance of claim frequency, $C_X$ is the coefficient of variation of claim severity, and $C_S$ is the coefficient of variation of aggregate losses.
• For claim frequency only: set $C_X^2 = 0$.
• For claim severity only: set $\sigma_f^2/\mu_f = 0$.

$n_C = n_0 \cdot \mu_f \iff n_0 = \dfrac{n_C}{\mu_f}$

Partial Credibility
$U = ZD + (1 - Z)M = M + Z(D - M)$
where
• $U$: Updated prediction
• $D$: Observed value
• $M$: Manual rate
• $Z$: Credibility factor

Square Root Rule
$Z = \sqrt{\dfrac{n}{n_0}} = \sqrt{\dfrac{n \cdot \mu_f}{n_C}}$
where $n$ is the actual # of exposures.

Bayesian Credibility

Model Distribution
Distribution of the model conditioned on a parameter.
Model density function: $f(x \mid \theta)$

Prior Distribution
Initial distribution of the parameter.
Prior density function: $\pi(\theta)$

Posterior Distribution
Revised distribution of the parameter.
Posterior density function: $\pi(\theta \mid \text{data})$
$\pi(\theta \mid \text{data}) = \dfrac{f(\text{data} \mid \theta)\,\pi(\theta)}{\int_{-\infty}^{\infty} f(\text{data} \mid \theta)\,\pi(\theta)\,d\theta}$
Note: Use the numerator and the domain to check whether $(\theta \mid \text{data})$ follows a distribution in the exam tables, to skip integration.

Predictive Distribution
Revised unconditional distribution of the model.
Predictive density function: $f(x \mid \text{data})$
Predictive Mean = Bayesian Premium

Loss Functions
Loss Function to Minimize | Bayesian Estimation
Squared-error loss | Posterior mean
Absolute loss | Posterior median
Zero-one loss | Posterior mode

Conjugate Priors

Poisson/Gamma
• Model: Poisson with mean $\lambda$
• Prior: $\lambda \sim \text{Gamma}(\alpha, \theta)$
• Posterior: $(\lambda \mid \text{data}) \sim \text{Gamma}(\alpha^*, \theta^*)$
o $\alpha^* = \alpha + \sum_{i=1}^{n} x_i$
o $\theta^* = \left(\dfrac{1}{\theta} + n\right)^{-1}$

Exponential/Gamma
• Model: Exponential with rate $\lambda$, or mean $\lambda^{-1}$
• Prior: $\lambda \sim \text{Gamma}(\alpha, \theta)$
• Posterior: $(\lambda \mid \text{data}) \sim \text{Gamma}(\alpha^*, \theta^*)$
o $\alpha^* = \alpha + n$
o $\theta^* = \left(\dfrac{1}{\theta} + \sum_{i=1}^{n} x_i\right)^{-1}$

Binomial/Beta
• Model: Binomial with fixed $m$ and probability of success $q$
• Prior: $q \sim \text{Beta}(a, b, 1)$
• Posterior: $(q \mid \text{data}) \sim \text{Beta}(a^*, b^*, 1)$
o $a^* = a + \sum_{i=1}^{n} x_i$
o $b^* = b + \left[nm - \sum_{i=1}^{n} x_i\right]$

Geometric/Beta
• Model: Geometric with probability of success $q$, or mean $\dfrac{1-q}{q}$
• Prior: $q \sim \text{Beta}(a, b, 1)$
• Posterior: $(q \mid \text{data}) \sim \text{Beta}(a^*, b^*, 1)$
o $a^* = a + n$
o $b^* = b + \sum_{i=1}^{n} x_i$

Bühlmann Credibility
Expected Hypothetical Mean (EHM): $\mu = \mathrm{E}\big[\mathrm{E}[X \mid \theta]\big]$
Expected Process Variance (EPV): $\mu_{PV} = \mathrm{E}\big[\mathrm{Var}[X \mid \theta]\big]$
Variance of Hypothetical Mean (VHM): $\sigma_{HM}^2 = \mathrm{Var}\big[\mathrm{E}[X \mid \theta]\big]$
Bühlmann $k$: $k = \dfrac{\mu_{PV}}{\sigma_{HM}^2}$
Bühlmann Credibility Factor: $Z = \dfrac{n}{n + k}$
Bühlmann Credibility Premium:
$U = Z\bar{x} + (1 - Z)\mu = \mu + Z(\bar{x} - \mu)$
Note: $Z$ and $\bar{x}$ have the same $n$.

Empirical Bayes Method

Uniform Exposures
$\hat{\mu} = \dfrac{\sum_{i=1}^{r}\sum_{j=1}^{n} x_{ij}}{r \cdot n} = \bar{x}$
$\hat{\mu}_{PV} = \dfrac{\sum_{i=1}^{r}\sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2}{r(n - 1)}$
$\hat{\sigma}_{HM}^2 = \dfrac{\sum_{i=1}^{r} (\bar{x}_i - \bar{x})^2}{r - 1} - \dfrac{\hat{\mu}_{PV}}{n}$

Non-uniform Exposures
$\hat{\mu} = \dfrac{\sum_{i=1}^{r}\sum_{j=1}^{n_i} m_{ij}\, x_{ij}}{m} = \bar{x}$
$\hat{\mu}_{PV} = \dfrac{\sum_{i=1}^{r}\sum_{j=1}^{n_i} m_{ij}\,(x_{ij} - \bar{x}_i)^2}{\sum_{i=1}^{r} (n_i - 1)}$
$\hat{\sigma}_{HM}^2 = \dfrac{\sum_{i=1}^{r} m_i\,(\bar{x}_i - \bar{x})^2 - \hat{\mu}_{PV}(r - 1)}{m - m^{-1}\sum_{i=1}^{r} m_i^2}$

Here $r$ is the number of risks/classes, $n$ (or $n_i$) is the number of observations for a risk, $m_{ij}$ is the exposure for observation $j$ of risk $i$, $m_i = \sum_j m_{ij}$, and $m = \sum_i m_i$.

Credibility premium for a risk in Class $i$:
$U = \hat{Z}_i \bar{x}_i + \big(1 - \hat{Z}_i\big)\hat{\mu}$
Note: $\hat{Z}_i$ and $\bar{x}_i$ have the same $n$.

Balancing the Estimators
Estimate EHM as: $\hat{\mu}_{\text{CRED}} = \dfrac{\sum_{i=1}^{r} Z_i \bar{x}_i}{\sum_{i=1}^{r} Z_i}$
LINEAR MIXED MODELS (LMM)

Basics of Linear Mixed Modeling

Types and Structures of Data Sets
• Clustered data – the dependent variable is measured once for each subject (or unit of analysis), and subjects are grouped into, or nested within, clusters of subjects that share some commonality.
• Repeated measures – the dependent variable is measured more than once on the same subject across levels of one or more categorical explanatory variables called repeated-measures factors.
• Longitudinal data – the dependent variable is measured at multiple points in time for each subject.
• Clustered longitudinal data – the dependent variable is measured at multiple points in time for each subject, and subjects are grouped within clusters.

Hierarchical/Multilevel Data
• Level 1 – Observations at the most detailed level of data.
o For clustered data, Level 1 is the subject.
o For repeated measures/longitudinal data, Level 1 is the repeated measures made on a subject.
• Level 2 – The next most detailed level of data.
o For clustered data, Level 2 is a cluster of subjects.
o For repeated measures, it is the subject.
• Level 3 – The next level of data.
o Clusters of Level 2 units (clusters of clusters).

Notation
$Y$ – The dependent variable
$i$ – Identifies a subject
$t$ – Indexes time
$X^{(1)}, \dots, X^{(p)}$ – The $p$ covariates associated with fixed effects
$x_{t,i}^{(1)}$ – The $t$th observed value of $X^{(1)}$ for subject $i$
$Z^{(1)}, \dots, Z^{(q)}$ – The $q$ covariates associated with random effects
$z_{t,i}^{(1)}$ – The $t$th observed value of $Z^{(1)}$ for subject $i$
$\beta_1, \dots, \beta_p$ – The $p$ fixed effects
$u_{1,i}, \dots, u_{q,i}$ – The $q$ random effects associated with subject $i$
$\epsilon_{t,i}$ – The random residual

Factors and Effects
• Covariate – predictor variable.
• Fixed factor – categorical variable that includes all possible levels.
• Random factor – categorical variable whose levels are randomly sampled from a larger population of levels.
• Fixed effect – describes the relationship between the dependent variable and a fixed factor or continuous covariate.
• Random effect – random values associated with specific levels of a random factor.
• Nested factors – each level of a factor can only be measured within a single level of another factor.
• Crossed factor – a factor can be measured across multiple levels of another factor.

Matrix Specification for a Single Observation
$Y_{t,i} = \underbrace{\beta_1 x_{t,i}^{(1)} + \beta_2 x_{t,i}^{(2)} + \cdots + \beta_p x_{t,i}^{(p)}}_{\text{fixed}} + \underbrace{u_{1,i} z_{t,i}^{(1)} + u_{2,i} z_{t,i}^{(2)} + \cdots + u_{q,i} z_{t,i}^{(q)} + \epsilon_{t,i}}_{\text{random}}$

Matrix Specification
$\mathbf{Y}_i = \underbrace{\mathbf{X}_i\boldsymbol{\beta}}_{\text{fixed}} + \underbrace{\mathbf{Z}_i\mathbf{u}_i + \boldsymbol{\epsilon}_i}_{\text{random}}$
$\mathbf{u}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{D}), \qquad \boldsymbol{\epsilon}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{R}_i)$

$\mathbf{Y}_i = \begin{bmatrix} Y_{1,i} \\ Y_{2,i} \\ \vdots \\ Y_{n_i,i} \end{bmatrix}, \quad
\mathbf{X}_i = \begin{bmatrix} x_{1,i}^{(1)} & x_{1,i}^{(2)} & \cdots & x_{1,i}^{(p)} \\ x_{2,i}^{(1)} & x_{2,i}^{(2)} & \cdots & x_{2,i}^{(p)} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n_i,i}^{(1)} & x_{n_i,i}^{(2)} & \cdots & x_{n_i,i}^{(p)} \end{bmatrix}, \quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix}$

$\mathbf{Z}_i = \begin{bmatrix} z_{1,i}^{(1)} & z_{1,i}^{(2)} & \cdots & z_{1,i}^{(q)} \\ z_{2,i}^{(1)} & z_{2,i}^{(2)} & \cdots & z_{2,i}^{(q)} \\ \vdots & \vdots & \ddots & \vdots \\ z_{n_i,i}^{(1)} & z_{n_i,i}^{(2)} & \cdots & z_{n_i,i}^{(q)} \end{bmatrix}, \quad
\mathbf{u}_i = \begin{bmatrix} u_{1,i} \\ u_{2,i} \\ \vdots \\ u_{q,i} \end{bmatrix}$

$\mathbf{D} = \mathrm{Var}[\mathbf{u}_i] = \begin{bmatrix} \mathrm{Var}[u_{1,i}] & \mathrm{Cov}[u_{1,i}, u_{2,i}] & \cdots & \mathrm{Cov}[u_{1,i}, u_{q,i}] \\ \mathrm{Cov}[u_{1,i}, u_{2,i}] & \mathrm{Var}[u_{2,i}] & \cdots & \mathrm{Cov}[u_{2,i}, u_{q,i}] \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}[u_{1,i}, u_{q,i}] & \mathrm{Cov}[u_{2,i}, u_{q,i}] & \cdots & \mathrm{Var}[u_{q,i}] \end{bmatrix}$

$\mathbf{R}_i = \mathrm{Var}[\boldsymbol{\epsilon}_i] = \begin{bmatrix} \mathrm{Var}[\epsilon_{1,i}] & \mathrm{Cov}[\epsilon_{1,i}, \epsilon_{2,i}] & \cdots & \mathrm{Cov}[\epsilon_{1,i}, \epsilon_{n_i,i}] \\ \mathrm{Cov}[\epsilon_{1,i}, \epsilon_{2,i}] & \mathrm{Var}[\epsilon_{2,i}] & \cdots & \mathrm{Cov}[\epsilon_{2,i}, \epsilon_{n_i,i}] \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}[\epsilon_{1,i}, \epsilon_{n_i,i}] & \mathrm{Cov}[\epsilon_{2,i}, \epsilon_{n_i,i}] & \cdots & \mathrm{Var}[\epsilon_{n_i,i}] \end{bmatrix}$

Covariance Structures
• Unstructured
$\mathbf{D} = \begin{bmatrix} \sigma_{u_0}^2 & \sigma_{u_0,u_1} \\ \sigma_{u_0,u_1} & \sigma_{u_1}^2 \end{bmatrix}, \quad \boldsymbol{\theta}_\mathbf{D} = \begin{bmatrix} \sigma_{u_0}^2 \\ \sigma_{u_0,u_1} \\ \sigma_{u_1}^2 \end{bmatrix}$
• Diagonal/Variance components
$\mathbf{D} = \begin{bmatrix} \sigma_{u_0}^2 & 0 \\ 0 & \sigma_{u_1}^2 \end{bmatrix}, \quad \boldsymbol{\theta}_\mathbf{D} = \begin{bmatrix} \sigma_{u_0}^2 \\ \sigma_{u_1}^2 \end{bmatrix}$
• Compound symmetric
$\mathbf{R}_i = \begin{bmatrix} \sigma^2 + \sigma_1 & \sigma_1 & \cdots & \sigma_1 \\ \sigma_1 & \sigma^2 + \sigma_1 & \cdots & \sigma_1 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_1 & \sigma_1 & \cdots & \sigma^2 + \sigma_1 \end{bmatrix}, \quad \boldsymbol{\theta}_\mathbf{R} = \begin{bmatrix} \sigma^2 \\ \sigma_1 \end{bmatrix}$
• First-order autoregressive
$\mathbf{R}_i = \begin{bmatrix} \sigma^2 & \sigma^2\rho & \cdots & \sigma^2\rho^{n_i-1} \\ \sigma^2\rho & \sigma^2 & \cdots & \sigma^2\rho^{n_i-2} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma^2\rho^{n_i-1} & \sigma^2\rho^{n_i-2} & \cdots & \sigma^2 \end{bmatrix}, \quad \boldsymbol{\theta}_\mathbf{R} = \begin{bmatrix} \sigma^2 \\ \rho \end{bmatrix}$
• Both $\mathbf{D}$ and $\mathbf{R}_i$ must be positive definite.
• Heterogeneous variances are also possible: the same structure for each group, but different parameters in $\boldsymbol{\theta}_\mathbf{D}$ or $\boldsymbol{\theta}_\mathbf{R}$.
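Below is a hedged sketch of how a model of this form might be fit in Python with statsmodels; the data frame, the column names (y, x, cluster), and the simulated values are made up. `re_formula="~x"` requests a random intercept and a random slope for x, giving an unstructured 2x2 D by default.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical clustered data: 20 clusters, 5 observations each.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "cluster": np.repeat(np.arange(20), 5),
    "x": rng.normal(size=100),
})
u0 = rng.normal(scale=0.8, size=20)        # random intercepts by cluster
df["y"] = 1.0 + 0.5 * df["x"] + u0[df["cluster"]] + rng.normal(scale=0.3, size=100)

# groups= defines the clustering; re_formula adds a random slope for x.
model = smf.mixedlm("y ~ x", df, groups=df["cluster"], re_formula="~x")
fit = model.fit(reml=True)                 # REML is the default; shown explicitly
print(fit.summary())
```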
Hierarchical Models
Break the LMM into levels and write an equation for each level.

Consider the model below:
$Y_{i,j,k} = \beta_0 + \beta_1 x_{i,j,k}^{(1)} + \beta_2 x_{j,k}^{(2)} + \beta_3 x_k^{(3)} + u_{0,j|k} + u_{1,j|k}\, x_{i,j,k}^{(1)} + u_{0,k} + \epsilon_{i,j,k}$
$\mathbf{u}_{j|k} = \begin{bmatrix} u_{0,j|k} \\ u_{1,j|k} \end{bmatrix} \sim \mathcal{N}\big(\mathbf{0}, \mathbf{D}^{(1)}\big)$
$u_{0,k} \sim \text{Normal}\big(0, \sigma_{\text{int:group}}^2\big)$
$\epsilon_{i,j,k} \sim \text{Normal}(0, \sigma^2)$

In hierarchical form:
• Level 1 –
$Y_{i,j,k} = b_{0,j|k} + b_{1,j|k}\, x_{i,j,k}^{(1)} + \epsilon_{i,j,k}$
$\epsilon_{i,j,k} \sim \text{Normal}(0, \sigma^2)$
• Level 2 –
$b_{0,j|k} = b_{0,k} + \beta_2 x_{j,k}^{(2)} + u_{0,j|k}$
$b_{1,j|k} = \beta_1 + u_{1,j|k}$
$\mathbf{u}_{j|k} = \begin{bmatrix} u_{0,j|k} \\ u_{1,j|k} \end{bmatrix} \sim \mathcal{N}\big(\mathbf{0}, \mathbf{D}^{(1)}\big)$
• Level 3 –
$b_{0,k} = \beta_0 + \beta_3 x_k^{(3)} + u_{0,k}$
$u_{0,k} \sim \text{Normal}\big(0, \sigma_{\text{int:group}}^2\big)$

Marginal Linear Models
A population-averaged model: no random effects.
$\mathbf{Y}_i = \mathbf{X}_i\boldsymbol{\beta} + \boldsymbol{\epsilon}_i^*$
$\boldsymbol{\epsilon}_i^* \sim \mathcal{N}(\mathbf{0}, \mathbf{V}_i^*)$

Implied Marginal Model
$\mathbf{Y}_i = \mathbf{X}_i\boldsymbol{\beta} + \boldsymbol{\epsilon}_i^*$
$\boldsymbol{\epsilon}_i^* \sim \mathcal{N}(\mathbf{0}, \mathbf{V}_i)$
$\mathbf{V}_i = \mathbf{Z}_i\mathbf{D}\mathbf{Z}_i' + \mathbf{R}_i$
• Easier to fit than an LMM. The only restriction is that $\mathbf{V}_i$ must be positive definite.
• $\mathrm{E}[\mathbf{Y}_i] = \mathbf{X}_i\boldsymbol{\beta}$
• $\mathrm{Var}[\mathbf{Y}_i] = \mathbf{Z}_i\mathbf{D}\mathbf{Z}_i' + \mathbf{R}_i$
• $\mathbf{Y}_i \sim \mathcal{N}\big(\mathbf{X}_i\boldsymbol{\beta},\ \mathbf{Z}_i\mathbf{D}\mathbf{Z}_i' + \mathbf{R}_i\big)$

Model Estimation and Inference

Maximum Likelihood and Restricted Maximum Likelihood
• Maximum Likelihood (ML)
o If $\boldsymbol{\theta}$ is known, $\hat{\boldsymbol{\beta}}$ is the best linear unbiased estimator (BLUE) for $\boldsymbol{\beta}$.
o If $\boldsymbol{\theta}$ is not known, $\hat{\boldsymbol{\beta}}$ is the empirical best linear unbiased estimator (EBLUE) for $\boldsymbol{\beta}$.
• Restricted Maximum Likelihood (REML)
o Adjusts for the loss of degrees of freedom from estimating the fixed effects to produce an unbiased estimate for $\boldsymbol{\theta}$.

Estimate | REML | ML
$\hat{\boldsymbol{\beta}}$ | Unbiased | Unbiased
$\hat{\boldsymbol{\theta}}$ | Unbiased | Biased downward
$\mathrm{Var}[\hat{\boldsymbol{\beta}}]$ | Biased downward | Biased downward

Computational Algorithms
Used for estimating parameters in an LMM.
• Expectation Maximization (EM)
o Pros: good at finding starting values for other algorithms.
o Cons: converges slowly and produces "optimistic" estimators.
• Newton-Raphson (N-R)
o Pros: converges in a small number of iterations and can be used to obtain an asymptotic covariance matrix for the covariance parameters in $\boldsymbol{\theta}$.
o Cons: each iteration takes a while.
• Fisher Scoring
o Pros: less computationally intensive and more likely to converge.
o Cons: difficult to obtain the expected Hessian matrix, which is needed in order for the estimates to be accurate.
In general, start with a few iterations of EM to generate starting values, finish with N-R, and avoid Fisher scoring.

Intraclass Correlation Coefficient (ICC)
In general, the ICC for a given level of clustering can be thought of as the proportion of the total observed variation due to the random effects at that level and higher levels. It must be positive.

The ICC for level $j$ of a two-level variance components model is:
$\text{ICC}_j = \dfrac{\text{Variance in common}}{\text{Total variance}} = \dfrac{\sigma_{\text{int}}^2}{\sigma_{\text{int}}^2 + \sigma^2}$

For a subject $i$ in cluster $j$ nested within group of clusters $k$, the Level 2 ICC is:
$\text{ICC}_{j|k} = \dfrac{\sigma_{\text{int}:j(k)}^2 + \sigma_{\text{int}:k}^2}{\sigma_{\text{int}:j(k)}^2 + \sigma_{\text{int}:k}^2 + \sigma^2}$
and the Level 3 ICC is:
$\text{ICC}_k = \dfrac{\sigma_{\text{int}:k}^2}{\sigma_{\text{int}:j(k)}^2 + \sigma_{\text{int}:k}^2 + \sigma^2}$

Marginal ICC
For the implied marginal model arising from a variance components model,
$\text{Corr}\big[Y_{i,j}, Y_{i',j}\big] = \dfrac{\text{Cov}\big[Y_{i,j}, Y_{i',j}\big]}{\sqrt{\text{Var}\big[Y_{i,j}\big]\cdot\text{Var}\big[Y_{i',j}\big]}} = \dfrac{\sigma_{\text{int}}^2}{\sigma_{\text{int}}^2 + \sigma^2} = \text{ICC}_j$
is referred to as the marginal ICC and can be viewed as the marginal correlation between two different observations within the same group.
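As a small numerical sketch of the implied marginal covariance and the marginal ICC above (all variance values are made up), the off-diagonal-to-diagonal ratio of $\mathbf{V}_i$ reproduces the ICC for a random-intercept model:

```python
import numpy as np

n_i = 4                        # observations for subject/cluster i
sigma2_int = 2.0               # random-intercept variance
sigma2 = 3.0                   # residual variance

Z_i = np.ones((n_i, 1))        # random-intercept design
D = np.array([[sigma2_int]])   # 1x1 D matrix
R_i = sigma2 * np.eye(n_i)     # diagonal residual covariance

V_i = Z_i @ D @ Z_i.T + R_i    # implied marginal covariance V_i = Z D Z' + R
icc = sigma2_int / (sigma2_int + sigma2)

print(V_i)
print(V_i[0, 1] / V_i[0, 0], icc)   # both equal 0.4
```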
EBLUPs
$\hat{\mathbf{u}}_j = \mathrm{E}\big[\mathbf{u}_j \mid \mathbf{Y}_j = \mathbf{y}_j\big]$
• Empirical – based on the estimates of $\boldsymbol{\beta}$ and $\boldsymbol{\theta}$
• Best – minimum variance among all unbiased predictors
• Linear – linear functions of the observed data
• Unbiased
• Predictors

For a variance components model,
$\hat{\mu} + \hat{u}_{0,j} = Z_j \times \bar{y}_j + (1 - Z_j) \times \hat{\mu}$
$k = \dfrac{v}{a} = \dfrac{\sigma^2}{\sigma_{\text{int}}^2}$
$Z_j = \dfrac{n_j}{n_j + k} = \dfrac{n_j}{n_j + \sigma^2/\sigma_{\text{int}}^2}$
• If $n_j = 1$, then $Z_j = \text{ICC}_j$.
• $\hat{u}_{0,j}$ is the EBLUP for level $j$ of the random factor.
• $Z_j$ is the Bühlmann credibility factor for level $j$.
• $\hat{\mu}$ is the unconditional predicted value using the marginal model.
• $\bar{y}_j$ is the observed mean response for level $j$.
• $\hat{\mu} + \hat{u}_{0,j}$ is the shrinkage mean for level $j$.

Likelihood Ratio Tests
$t.s. = 2(l_1 - l_0)$
• Used for comparing two nested models.
• $H_0$: The null model is adequate.
• $H_1$: The reference model is better.
• Reject $H_0$ in favor of $H_1$ at the $\alpha$ significance level if $t.s.$ exceeds the value from the test distribution.

Testing | REML/ML | Test Distribution
$d$ fixed effects | ML | $\chi_d^2$
$d$ residual covariance parameters | REML | $\chi_d^2$
A lone random effect | REML | $0.5\chi_1^2$
One of multiple random effects | REML | 50-50 mixture of $\chi_{d-1}^2$ and $\chi_d^2$

Alternative Tests
• $t$-test
$t.s. = \dfrac{\hat{\beta}_j}{se\big(\hat{\beta}_j\big)}$
o Used to test the hypothesis $H_0{:}\ \beta_j = 0$ versus $H_1{:}\ \beta_j \neq 0$.
o Does not follow an exact $t$-distribution. Degrees of freedom are calculated using a computer.
• $F$-test
o Used to test the hypothesis $H_0{:}\ \mathbf{L}\boldsymbol{\beta} = \mathbf{0}$ versus $H_1{:}\ \mathbf{L}\boldsymbol{\beta} \neq \mathbf{0}$, where $\mathbf{L}$ is some matrix that encodes the fixed effects being tested.
o Follows an approximate $F$-distribution with $d_1$ numerator degrees of freedom and $d_2$ denominator degrees of freedom.
§ $d_1$ = number of parameters being tested.
§ $d_2$ is usually approximated with a computer.
o Type I $F$-test: conditional on only the effects listed before the one being tested (sequential).
o Type III $F$-test: conditional on all other fixed effects.
o Kenward-Roger Method: corrects for bias in the estimation of $d_2$ by inflating the marginal covariance matrix. This correction matters more the smaller the sample size.
• Omnibus Wald Test
o Used to test the hypothesis $H_0{:}\ \mathbf{L}\boldsymbol{\beta} = \mathbf{0}$ versus $H_1{:}\ \mathbf{L}\boldsymbol{\beta} \neq \mathbf{0}$.
o Difficult to calculate the test statistic, but it follows a chi-square distribution with degrees of freedom equal to the number of parameters being tested.
• Wald $z$-test
o Can be used to test the significance of a covariance parameter.
o Not well suited for LMMs.

Information Criterion
$\text{AIC} = -2\,l\big(\hat{\boldsymbol{\theta}};\,\mathbf{y}\big) + 2p$
$\text{BIC} = -2\,l\big(\hat{\boldsymbol{\theta}};\,\mathbf{y}\big) + (\ln n)\,p$
• $p$ is the number of parameters in the model (fixed effects and covariance parameters).
• $n$ is the number of observations.
• $l\big(\hat{\boldsymbol{\theta}};\,\mathbf{y}\big)$ is the log-likelihood of the observed data in $\mathbf{y}$ under the REML estimates of the parameters, $\hat{\boldsymbol{\theta}}$.

The Top-Down Strategy
Start with a complex model and reduce it to a simpler model.
1. Build a model with a "loaded" mean structure.
2. Select a structure for the random effects.
3. Select a residual error covariance structure.
4. Reduce the model by removing non-significant fixed effects.

The Step-Up Strategy
More commonly utilized for constructing models in hierarchical form. Start with a simple model and gradually add terms.
1. Build a "means-only" model.
2. Check the random intercepts.
3. Add the Level 1 covariates and related Level 2 random coefficients.
4. Add the Level 2 covariates and related Level 3 random coefficients.
5. Repeat Step 4 if necessary for any Level 3 covariates.
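A minimal sketch, with made-up variance estimates, of the EBLUP shrinkage mean and the AIC/BIC definitions given above:

```python
import numpy as np

# Hypothetical variance components and group data for level j of a random factor.
sigma2 = 4.0           # estimated residual variance
sigma2_int = 1.0       # estimated random-intercept variance
mu_hat = 10.0          # estimated marginal (unconditional) mean

y_j = np.array([12.0, 14.0, 13.0])    # observations for level j
n_j = len(y_j)

k = sigma2 / sigma2_int                # Buhlmann-style k
Z_j = n_j / (n_j + k)                  # credibility factor for level j
shrinkage_mean = Z_j * y_j.mean() + (1 - Z_j) * mu_hat
print(Z_j, shrinkage_mean)             # 3/7 and roughly 11.29

def aic_bic(loglik, p, n):
    """AIC and BIC from a log-likelihood, p parameters, n observations."""
    return -2 * loglik + 2 * p, -2 * loglik + np.log(n) * p
```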
LMM Diagnostics and Other Issues

Residual Diagnostics
• Conditional Residual – The difference between the observed response and the conditional predicted value.
$\hat{\boldsymbol{\epsilon}}_i = \mathbf{y}_i - \mathbf{X}_i\hat{\boldsymbol{\beta}} - \mathbf{Z}_i\hat{\mathbf{u}}_i$
• Marginal Residual – The difference between the observed response and the unconditional predicted value.
$\hat{\boldsymbol{\epsilon}}_i^* = \mathbf{y}_i - \mathbf{X}_i\hat{\boldsymbol{\beta}}$
• Standardized Residual – Scaling a residual by dividing by its true standard deviation. Denoted as $\hat{\epsilon}_{i,j}^{\,\text{std}}$.
• Studentized Residual – Scaling a residual by dividing by its estimated standard deviation. Denoted as $\hat{\epsilon}_{i,j}^{\,\text{stu}}$.
o Internal Studentization – The estimate of the standard deviation includes the observation the residual corresponds to.
o External Studentization – The estimate of the standard deviation does not include the observation the residual corresponds to.
• Pearson Residual – Scaling a residual by dividing by the estimated standard deviation of the response. Denoted as $\hat{\epsilon}_{i,j}^{\,p}$.

Potential Issues with Residuals
• Residuals with non-zero averages
• Heteroscedasticity
• Non-normal errors
• Outliers

Other Diagnostics
• *Influence Diagnostics – Techniques used to identify the influence an observation or set of observations have on the response, or on the parameter estimates in $\boldsymbol{\beta}$ and $\boldsymbol{\theta}$.
• Random Effect Diagnostics – Diagnose random effects by looking at the EBLUPs.
o EBLUPs do not have to follow the true distribution of the random effects, so checking them for normality is not needed.
o Focus on identifying potential outliers, as an unusually small or large EBLUP could point toward an abnormality within the corresponding group.
• Observed vs. Predicted Values – Plot the observed response values against the conditional predicted values to verify a model's accuracy.
o We hope to see a roughly linear relationship between observed and predicted values. If these values are not similar, our model may not be adequate.

Aliasing
When there is ambiguity in the specification of a parametric model that would lead to multiple possible sets of parameters that each imply identical or indistinguishable models.
• Nonestimability, a result of aliasing, implies that infinitely many sets of parameters would lead to the same predicted values.
• Intrinsic Aliasing – Aliasing due to a model's formula specification. Sometimes referred to as "nonidentifiability" or "overparameterization".
• Extrinsic Aliasing – Aliasing due to characteristics of the dataset.

Missing Data
• LMMs are better at handling datasets that have different-sized groups or missing observations than alternatives such as repeated-measures ANOVA.
• LMMs assume that any unobserved data is missing at random, meaning that the probability of having missing data on a given variable may depend on other observed data, but cannot depend on the data that would have been observed.

Centering Covariates
• Grand Mean Centering – The overall mean of a covariate is subtracted from each observation.
o Changes the interpretation of the intercept, but not the corresponding coefficient.
• Group Mean Centering – The mean covariate value for a higher-level cluster or group is subtracted from each observation.
o Changes the interpretation of the intercept and the corresponding coefficient.

Crossed Random Factors
A model with crossed random factors has multiple random factors whose levels do not have a specific nesting structure. This slightly changes the way the model is specified, the way parameters are estimated, and the form of the implied marginal covariance matrix.
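A short sketch of the conditional and marginal residual definitions above; the design matrices and estimates for a single subject are illustrative only:

```python
import numpy as np

# Hypothetical subject i: 3 observations, intercept + one covariate, random intercept.
X_i = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5]])   # fixed-effects design
Z_i = np.ones((3, 1))                                   # random-intercept design
beta_hat = np.array([2.0, 0.8])                         # estimated fixed effects
u_hat_i = np.array([0.6])                               # EBLUP for this subject
y_i = np.array([3.1, 3.9, 4.8])                         # observed responses

marginal_resid = y_i - X_i @ beta_hat                   # y_i - X_i beta_hat
conditional_resid = marginal_resid - Z_i @ u_hat_i      # also subtract Z_i u_hat_i
print(marginal_resid, conditional_resid)
```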
Name | Parameter(s) | Description | Diagnostic Type
Likelihood Distance | $\boldsymbol{\psi}$ | Change in ML log-likelihood for all data with $\boldsymbol{\psi}$ estimated using all data vs. reduced data | Overall influence
Restricted Likelihood Distance | $\boldsymbol{\psi}$ | Change in REML log-likelihood for all data with $\boldsymbol{\psi}$ estimated using all data vs. reduced data | Overall influence
Cook's Distance | $\boldsymbol{\beta}$ | Scaled change in estimated $\boldsymbol{\beta}$ vector | Change in parameter estimates
Cook's Distance | $\boldsymbol{\theta}$ | Scaled change in estimated $\boldsymbol{\theta}$ vector | Change in parameter estimates
Multivariate DFITS Statistic | $\boldsymbol{\beta}$ | Scaled change in estimated $\boldsymbol{\beta}$ vector using the "externalized" $\mathrm{Var}[\hat{\boldsymbol{\beta}}]$ | Change in parameter estimates
Multivariate DFITS Statistic | $\boldsymbol{\theta}$ | Scaled change in estimated $\boldsymbol{\theta}$ vector using the "externalized" $\mathrm{Var}[\hat{\boldsymbol{\theta}}]$ | Change in parameter estimates
Covariance Ratio | $\boldsymbol{\beta}$ | Change in precision of estimated $\boldsymbol{\beta}$ vector based on the determinant of $\mathrm{Var}[\hat{\boldsymbol{\beta}}]$ | Change in precision of parameter estimates
Covariance Ratio | $\boldsymbol{\theta}$ | Change in precision of estimated $\boldsymbol{\theta}$ vector based on the determinant of $\mathrm{Var}[\hat{\boldsymbol{\theta}}]$ | Change in precision of parameter estimates
Predicted Residual Error Sum of Squares (PRESS) Statistic | N/A | PRESS residuals calculated by deleting observations in $u$ | Sum of squared PRESS residuals
*More information on Influence Diagnostics
STATISTICAL LEARNING

Overview and Prerequisites

Types of Variables
Response – A variable of primary interest
Explanatory – A variable used to study the response variable
Count – A quantitative variable usually valid on non-negative integers
Continuous – A real-valued quantitative variable
Nominal – A categorical/qualitative variable having categories without a meaningful or logical order
Ordinal – A categorical/qualitative variable having categories with a meaningful or logical order

Notation
$y$, $Y$ – Response variable
$x$, $X$ – Explanatory variable
Subscript $i$ – Index for observations
$n$ – No. of observations
Subscript $j$ – Index for variables except the response
$p$ – No. of variables except the response
$\mathbf{A}'$ – Transpose of matrix $\mathbf{A}$
$\mathbf{A}^{-1}$ – Inverse of matrix $\mathbf{A}$
$\varepsilon$ – Error term
$\hat{y}$, $\hat{Y}$, $\hat{f}(x)$ – Estimate/Estimator of $f(x)$

Regression Problems
$Y = f(x_1, \dots, x_p) + \varepsilon$ where $\mathrm{E}[\varepsilon] = 0$, so $\mathrm{E}[Y] = f(x_1, \dots, x_p)$
Test MSE $= \mathrm{E}\big[(Y - \hat{Y})^2\big]$, which can be estimated using $\dfrac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n}$
For fixed inputs $x_1, \dots, x_p$, the test MSE is
$\underbrace{\mathrm{Var}\big[\hat{f}(x_1, \dots, x_p)\big] + \big(\mathrm{Bias}\big[\hat{f}(x_1, \dots, x_p)\big]\big)^2}_{\text{reducible error}} + \underbrace{\mathrm{Var}[\varepsilon]}_{\text{irreducible error}}$

Classification Problems
Test Error Rate $= \mathrm{E}\big[I\big(Y \neq \hat{Y}\big)\big]$, which can be estimated using $\dfrac{\sum_{i=1}^{n} I(y_i \neq \hat{y}_i)}{n}$
Bayes Classifier:
$f(x_1, \dots, x_p) = \arg\max_{c} \Pr\big(Y = c \mid X_1 = x_1, \dots, X_p = x_p\big)$

Key Ideas on Model Accuracy
• As flexibility increases, the training MSE (or error rate) decreases, but the test MSE (or error rate) follows a U-shaped pattern.
• Low flexibility leads to a method with low variance and high bias; high flexibility leads to a method with high variance and low bias.

Validation Set
• Randomly splits all available observations into two groups: the training set and the validation set.
• Only the observations in the training set are used to attain the fitted model, and those in the validation set are used to estimate the test MSE.

$k$-fold Cross-Validation
1. Randomly divide all available observations into $k$ folds.
2. For $v = 1, \dots, k$, obtain the $v$th fit by training with all observations except those in the $v$th fold.
3. For $v = 1, \dots, k$, use $\hat{y}$ from the $v$th fit to calculate a test MSE estimate with the observations in the $v$th fold.
4. To calculate the CV error, average the $k$ test MSE estimates from the previous step.

Leave-one-out Cross-Validation (LOOCV)
• Calculate the LOOCV error as a special case of $k$-fold cross-validation where $k = n$.

Key Ideas on Cross-Validation
• With respect to bias, LOOCV < $k$-fold CV < Validation Set.
• With respect to variance, LOOCV > $k$-fold CV > Validation Set.

Standardizing Variables
• A centered variable is the result of subtracting the sample mean from a variable.
• A scaled variable is the result of dividing a variable by its sample standard deviation.
• A standardized variable is the result of first centering a variable, then scaling it.
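A minimal sketch of the $k$-fold cross-validation steps above, using a simple least-squares fit as the model and simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=60)
y = 2.0 + 0.7 * x + rng.normal(scale=1.0, size=60)

k = 5
folds = np.array_split(rng.permutation(60), k)   # step 1: random fold assignment

mse_estimates = []
for v in range(k):
    test_idx = folds[v]
    train_idx = np.concatenate([folds[u] for u in range(k) if u != v])

    # step 2: train on everything except fold v (here, simple linear regression)
    b1, b0 = np.polyfit(x[train_idx], y[train_idx], deg=1)

    # step 3: test MSE estimate from fold v
    y_hat = b0 + b1 * x[test_idx]
    mse_estimates.append(np.mean((y[test_idx] - y_hat) ** 2))

cv_error = np.mean(mse_estimates)                # step 4: average the k estimates
print(cv_error)
```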
Contrasting Statistical Learning Elements
$k$-Nearest Neighbors (KNN)

Algorithm
1. Let the observation having inputs $x_1, \dots, x_p$ be the center of the neighborhood.
2. Identify the $k$ training observations closest to the center.
3. Predict the response as the average response (regression) or the most frequent category (classification) among those $k$ neighbors.

Decision Trees

Notation
$R$ – Region of predictor space
$n_m$ – No. of observations in node $m$
$n_{m,c}$ – No. of category $c$ observations in node $m$

Cost Complexity Pruning
Regression: Minimize
$\sum_{m=1}^{|T|}\sum_{i:\,\mathbf{x}_i \in R_m}\big(y_i - \bar{y}_{R_m}\big)^2 + \lambda|T|$
where $|T|$ is the number of terminal nodes of tree $T$ and $\bar{y}_{R_m}$ is the mean response in region $R_m$.
Ensemble Methods

Bagging
1. Create $b$ bootstrap samples from the original training dataset.
2. Construct a decision tree for each bootstrap sample using recursive binary splitting.
3. Predict the response of a new observation by averaging the predictions (regression trees) or by using the most frequent category (classification trees) across all $b$ trees.

Properties
• Increasing $b$ does not cause overfitting.
• Bagging reduces variance.
• Out-of-bag error is a valid estimate of test error.

Random Forests
1. Create $b$ bootstrap samples from the original training dataset.
2. Construct a decision tree for each bootstrap sample using recursive binary splitting. At each split, a random subset of $k$ variables is considered.
3. Predict the response of a new observation by averaging the predictions (regression trees) or by using the most frequent category (classification trees) across all $b$ trees.

Properties
• Bagging is a special case of random forests.
• Increasing $b$ does not cause overfitting.
• Decreasing $k$ reduces the correlation between predictions.

Boosting
Let $z_1$ be the actual response variable, $y$.
1. For $k = 1, 2, \dots, b$:
• Use recursive binary splitting to fit a tree with $d$ splits to the data with $z_k$ as the response.
• Update $z_k$ by subtracting $\lambda \cdot \hat{f}_k(\mathbf{x})$, i.e., let $z_{k+1} = z_k - \lambda \cdot \hat{f}_k(\mathbf{x})$.
2. Calculate the boosted model prediction as $\hat{f}(\mathbf{x}) = \sum_{k=1}^{b} \lambda \cdot \hat{f}_k(\mathbf{x})$.

Properties
• Increasing $b$ can cause overfitting.
• Boosting reduces bias.
• $d$ controls the complexity of the boosted model.
• $\lambda$ controls the rate at which boosting learns.

Bayesian Additive Regression Trees (BART)
Let $\hat{f}_k^{(u)}(\mathbf{x})$ be the prediction at $\mathbf{x}$ for the $k$th tree in the $u$th iteration, for $k = 1, 2, \dots, b$ and $u = 1, 2, \dots, v$.
1. Initiate by letting $\hat{f}_k^{(1)}(\mathbf{x}) = \dfrac{\sum_{i=1}^{n} y_i}{nb}$ for $k = 1, 2, \dots, b$.
2. Calculate $\hat{f}^{(1)}(\mathbf{x}) = \sum_{k=1}^{b} \hat{f}_k^{(1)}(\mathbf{x}) = \bar{y}$.
3. For $u = 2, 3, \dots, v$:
a) For $k = 1, 2, \dots, b$:
i. For $i = 1, 2, \dots, n$, calculate the current partial residual, $r_i = y_i - \sum_{k' < k} \hat{f}_{k'}^{(u)}(\mathbf{x}_i) - \sum_{k' > k} \hat{f}_{k'}^{(u-1)}(\mathbf{x}_i)$.
ii. Fit a new tree, $\hat{f}_k^{(u)}(\mathbf{x})$, to the partial residuals by randomly perturbing the $k$th tree from the previous iteration, $\hat{f}_k^{(u-1)}(\mathbf{x})$.
b) Calculate $\hat{f}^{(u)}(\mathbf{x}) = \sum_{k=1}^{b} \hat{f}_k^{(u)}(\mathbf{x})$.
4. Calculate the mean after $t$ burn-in samples, $\hat{f}(\mathbf{x}) = \dfrac{\sum_{u=t+1}^{v} \hat{f}^{(u)}(\mathbf{x})}{v - t}$.

Properties
• Like bagging and random forests, BART incorporates randomness.
• Like boosting, BART sequentially builds trees to capture information not captured by previous trees.

Principal Components Analysis (PCA)

Notation
$z$, $Z$ – Principal component (score)
$m$ – Index for principal components
$\phi$ – Principal component loading
$x$, $X$ – Centered explanatory variable

Principal Components
$z_m = \sum_{j=1}^{p} \phi_{j,m}\, x_j, \qquad z_{i,m} = \sum_{j=1}^{p} \phi_{j,m}\, x_{i,j}$
• $\sum_{j=1}^{p} \phi_{j,m}^2 = 1$
• $\sum_{j=1}^{p} \phi_{j,m}\,\phi_{j,u} = 0,\ m \neq u$

Proportion of Variance Explained (PVE)
$\sum_{j=1}^{p} s_{x_j}^2 = \dfrac{1}{n}\sum_{j=1}^{p}\sum_{i=1}^{n} x_{i,j}^2$
$s_{z_m}^2 = \dfrac{1}{n}\sum_{i=1}^{n} z_{i,m}^2$
$\text{PVE}_m = \dfrac{s_{z_m}^2}{\sum_{j=1}^{p} s_{x_j}^2}$
• The variance explained by each subsequent principal component is always less than the variance explained by the previous principal component.
• The total variance is the sum of the variance explained by the first $k$ principal components and the MSE of the $k$-dimensional approximation.

Key Ideas
• All principal components are uncorrelated with one another.
• A dataset has $\min(n - 1, p)$ distinct principal components.
• The first $k$ principal component scores and loadings approximate the original dataset, $x_{i,j} \approx \sum_{m=1}^{k} z_{i,m}\,\phi_{j,m}$.
• Principal components are low-dimensional surfaces in $p$-dimensional space that are closest to the observations.
• Scaling has a significant effect on the result of PCA.
• A scree plot can be used to determine the number of principal components.
• Each principal component loading vector is unique up to a sign flip.
• PCA is most useful when multicollinearity is present in the features.
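A minimal sketch of principal component scores, loadings, and PVE computed from scratch on a made-up centered data matrix (via the singular value decomposition):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))
X[:, 2] = 0.8 * X[:, 0] + 0.2 * rng.normal(size=50)   # induce correlation

Xc = X - X.mean(axis=0)                  # centered explanatory variables
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

loadings = Vt.T                          # columns are phi_1, ..., phi_p
scores = Xc @ loadings                   # z_{i,m} = sum_j phi_{j,m} x_{i,j}

var_per_pc = (scores ** 2).sum(axis=0) / len(Xc)   # s^2_{z_m}
total_var = (Xc ** 2).sum() / len(Xc)              # sum of s^2_{x_j}
pve = var_per_pc / total_var
print(pve, pve.sum())                    # PVEs are decreasing and sum to 1
```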
Cluster Analysis

Notation
$C$ – Cluster containing indices
$W(C)$ – Within-cluster variation of cluster
$|C|$ – No. of observations in cluster

$k$-Means Clustering
1. Randomly assign a cluster to each observation. This serves as the initial cluster assignments.
2. Calculate the centroid of each cluster.
3. For each observation, identify the closest centroid and reassign to that cluster.
4. Repeat steps 2 and 3 until the cluster assignments stop changing.

$W(C_u) = \dfrac{1}{|C_u|}\sum_{i,i' \in C_u}\sum_{j=1}^{p}\big(x_{i,j} - x_{i',j}\big)^2 = 2\sum_{i \in C_u}\sum_{j=1}^{p}\big(x_{i,j} - \bar{x}_{u,j}\big)^2$

Hierarchical Clustering
1. Select the dissimilarity measure and linkage to be used. Treat each observation as its own cluster.
2. For $k = n, n - 1, \dots, 2$:
• Compute the inter-cluster dissimilarity between all $k$ clusters.
• Examine all $\binom{k}{2}$ pairwise dissimilarities. The two clusters with the lowest inter-cluster dissimilarity are fused. The dissimilarity indicates the height in the dendrogram at which these two clusters join.

Linkage | Inter-cluster Dissimilarity
Complete | The largest dissimilarity
Single | The smallest dissimilarity
Average | The arithmetic mean
Centroid | The dissimilarity between the cluster centroids

Key Ideas
• For $k$-means clustering, the algorithm needs to be repeated for each $k$.
• For hierarchical clustering, the algorithm only needs to be performed once for any number of clusters.
• The result of clustering depends on many parameters, such as:
o Choice of $k$ in $k$-means clustering.
o Choice of number of clusters, linkage, and dissimilarity measure in hierarchical clustering.
o Choice to standardize variables.
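A minimal sketch of the $k$-means algorithm listed above, run on simulated two-dimensional data:

```python
import numpy as np

def k_means(X, k, seed=0):
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=len(X))        # step 1: random clusters
    while True:
        # step 2: centroid of each cluster (reseed an empty cluster randomly)
        centroids = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c)
            else X[rng.integers(len(X))]
            for c in range(k)
        ])
        # step 3: reassign each observation to the closest centroid
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):       # step 4: stop when stable
            return labels, centroids
        labels = new_labels

X = np.vstack([np.random.default_rng(1).normal(0, 1, (20, 2)),
               np.random.default_rng(2).normal(5, 1, (20, 2))])
labels, centroids = k_means(X, k=2)
print(centroids)
```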
Neural Network
Activation Functions
Sigmoid:
$g(z) = \dfrac{e^z}{1 + e^z}$
Rectified linear unit (ReLU):
$g(z) = z_+ = \begin{cases} 0, & z < 0 \\ z, & \text{otherwise} \end{cases}$
Softmax:
$g(z_c) = \dfrac{e^{z_c}}{\sum_{m=1}^{K} e^{z_m}}$

Estimating Parameters
• Coefficients are also called weights. Intercepts are also called biases.
• Estimated to minimize squared-error loss for a regression problem and to minimize cross entropy for a classification problem.
o Squared-error loss: $\sum_{i=1}^{n}\big[y_i - f(\mathbf{x}_i)\big]^2$
o Cross entropy: $-\sum_{i=1}^{n}\sum_{c=1}^{K} y_{i,c}\,\ln\big[f_c(\mathbf{x}_i)\big]$
• Slow learning, using gradient descent:
1. Start with an initial estimate $\hat{\boldsymbol{\theta}}^{(0)}$ for $\boldsymbol{\theta}$, and set $t = 0$.
2. Iterate until the objective $R(\boldsymbol{\theta})$ fails to decrease:
a. Set $\hat{\boldsymbol{\theta}}^{(t+1)} \leftarrow \hat{\boldsymbol{\theta}}^{(t)} - \rho \cdot R'\big(\hat{\boldsymbol{\theta}}^{(t)}\big)$.
b. Set $t \leftarrow t + 1$.
• Stochastic gradient descent is gradient descent, but instead of all $n$ observations contributing to the calculation of the gradient, only a sampled minibatch does.
• Regularization such as lasso, ridge, early stopping, and dropout learning mitigates the risk of overfitting.

Performance Measures
Lift measures a model's ability to avoid adverse selection by accurately determining an actuarially fair premium rate for each insured.

Actual vs. Predicted Plots
• Plots the actual response variable against the predicted response variable for each model.
• The better model is closer to the diagonal line.

Simple Quantile Plots
• Plots the average actual response and the average predicted response for each quantile for each model.
• The better model is better at predicting the actual response in each quantile, has fewer reversals, and has a larger vertical distance between the first and last quantiles.

Double Lift Charts
• Plots the average actual response and the average predicted response for each quantile for each model in one chart.
• The better model is better at predicting the actual response in each quantile.

Loss Ratio Charts
• Plots the actual loss ratio for each quantile.
• Used to examine the efficacy of the current rating plan.
• If the new rating plan can distinguish between policies with low loss ratios and those with high loss ratios, the current rating plan is poor.

Lorenz Curves
• Plots the cumulative percentage of actual response against the cumulative percentage of exposures.
• The Gini index is twice the area between the Lorenz curve and the line of equality.
• The better model has a larger Gini index.

Receiver Operating Characteristic (ROC) Curves
• Plots the true positive rates (sensitivity) against the false positive rates (1 minus specificity) for different values of the discrimination threshold.
• Sensitivity, true positive rate, or hit rate is the percentage of positive observations with correct predictions.
• Specificity is the percentage of negative observations with correct predictions.
• Observations belong to exactly one of the following four groups: true positive, true negative, false positive, false negative.
• AUROC is the area under the ROC curve.
• The better model has a larger AUROC.
TIME SERIES WITH CONSTANT VARIANCE

Notation
$M_t$ – Trend
$S_t$ – Seasonal effect
$Z_t$ – Error term
$\gamma_k$ – Lag $k$ autocovariance function
$\rho_k$ – Lag $k$ autocorrelation function
$c_k$ – Lag $k$ sample autocovariance function

Time Series with Additive Seasonality
1. Estimate the seasonality component for each observation as $\hat{s}_t = x_t - \hat{m}_t$.
2. Calculate the average seasonality for each season, $\bar{s}_i$.
3. Adjust the averages so that they sum to 0, i.e., calculate each $\bar{s}_i^* = \bar{s}_i - \sum_{i=1}^{g} \bar{s}_i / g$, where $g$ is the number of seasons in a cycle.
4. Calculate the seasonally adjusted data as $x_t - \bar{s}_i^*$.

Time Series Models

White Noise
$\mathrm{E}[W_t] = 0$
$\mathrm{Var}[W_t] = \sigma_W^2$
$\gamma_k = 0,\ k \neq 0$

Random Walk
$X_t = X_{t-1} + W_t = \sum_{i=1}^{t} W_i$
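A minimal sketch simulating white noise and a random walk, and computing the lag-$k$ sample autocovariance $c_k$ for the white noise series:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
w = rng.normal(scale=1.0, size=n)      # white noise, Var[W_t] = 1
x = np.cumsum(w)                       # random walk: X_t = X_{t-1} + W_t

def sample_autocov(series, k):
    """Lag-k sample autocovariance c_k."""
    xbar = series.mean()
    return np.sum((series[:n - k] - xbar) * (series[k:] - xbar)) / n

print(sample_autocov(w, 0))   # close to sigma_W^2 = 1
print(sample_autocov(w, 5))   # close to 0 for white noise
```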
Autoregressive Models, AR($p$)
$X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \cdots + \alpha_p X_{t-p} + W_t$
$\theta_p(\mathbf{B}) \cdot X_t = W_t$

Stationary AR(1)
$X_t = \alpha X_{t-1} + W_t$
$\mathrm{E}[X_t] = 0$
$\mathrm{Var}[X_t] = \dfrac{\sigma_W^2}{1 - \alpha^2}$
$\gamma_k = \dfrac{\alpha^k \sigma_W^2}{1 - \alpha^2}$
$\rho_k = \alpha^k$
$|\alpha| < 1$

Stationary AR(2)
$X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + W_t$
$\alpha_2 - \alpha_1 < 1$
$\alpha_2 + \alpha_1 < 1$
$|\alpha_2| < 1$

Moving Average Models, MA($q$)
$X_t = W_t + \beta_1 W_{t-1} + \cdots + \beta_q W_{t-q}$
$X_t = \phi_q(\mathbf{B}) \cdot W_t$
$\mathrm{E}[X_t] = 0$
$\mathrm{Var}[X_t] = \sigma_W^2 \sum_{i=0}^{q} \beta_i^2$ (with $\beta_0 = 1$)
$\gamma_k = \sigma_W^2 \sum_{i=0}^{q-k} \beta_i \beta_{i+k},\quad 0 \le k \le q$
$\rho_k = \begin{cases} 1, & k = 0 \\ \dfrac{\sum_{i=0}^{q-k} \beta_i \beta_{i+k}}{\sum_{i=0}^{q} \beta_i^2}, & 1 \le k \le q \\ 0, & k > q \end{cases}$
• An invertible MA($q$) model can be expressed as a stationary AR($\infty$) model.
• A stationary AR($p$) model can be expressed as an invertible MA($\infty$) model.

ARMA Models, ARMA($p$, $q$)
$X_t = \alpha_1 X_{t-1} + \cdots + \alpha_p X_{t-p} + W_t + \beta_1 W_{t-1} + \cdots + \beta_q W_{t-q}$
$\theta_p(\mathbf{B}) \cdot X_t = \phi_q(\mathbf{B}) \cdot W_t$
Here $\theta_p(\mathbf{B}) = 1 - \alpha_1\mathbf{B} - \cdots - \alpha_p\mathbf{B}^p$ and $\phi_q(\mathbf{B}) = 1 + \beta_1\mathbf{B} + \cdots + \beta_q\mathbf{B}^q$ are polynomials in the backward shift operator $\mathbf{B}$.

Stationary ARMA(1, 1)
$X_t = \alpha X_{t-1} + W_t + \beta W_{t-1}$
$\mathrm{E}[X_t] = 0$
$\mathrm{Var}[X_t] = \sigma_W^2\,\dfrac{1 + 2\alpha\beta + \beta^2}{1 - \alpha^2}$
$\gamma_k = \sigma_W^2\,\dfrac{(\alpha + \beta)(1 + \alpha\beta)\,\alpha^{k-1}}{1 - \alpha^2},\quad k > 0$
$\rho_k = \dfrac{\alpha^{k-1}(\alpha + \beta)(1 + \alpha\beta)}{1 + 2\alpha\beta + \beta^2},\quad k > 0$
$\rho_k = \alpha\,\rho_{k-1},\quad k \ge 2$

ARIMA Models, ARIMA($p$, $d$, $q$)
$\theta_p(\mathbf{B})\,(1 - \mathbf{B})^d \cdot X_t = \phi_q(\mathbf{B}) \cdot W_t$
• If $\nabla^d X_t = W_t$, then $X_t$ is I($d$).
• If $\nabla^d X_t$ is ARMA($p$, $q$), then $X_t$ is ARIMA($p$, $d$, $q$).
• ARIMA(0, $d$, $q$) = IMA($d$, $q$)
• ARIMA($p$, $d$, 0) = ARI($p$, $d$)

Seasonal ARIMA($p$, $d$, $q$)($P$, $D$, $Q$)$_s$
$\Theta_P(\mathbf{B}^s) \cdot \theta_p(\mathbf{B}) \cdot (1 - \mathbf{B}^s)^D \cdot (1 - \mathbf{B})^d \cdot X_t = \Phi_Q(\mathbf{B}^s) \cdot \phi_q(\mathbf{B}) \cdot W_t$

Time Series with Regression

Variance of Sample Mean
$\mathrm{Var}[\bar{X}] = \dfrac{\sigma^2}{n}\left[1 + 2\sum_{k=1}^{n-1}\left(1 - \dfrac{k}{n}\right)\rho_k\right]$

Harmonic Seasonal Model
$X_t = M_t + \sum_{j=1}^{\lfloor g/2 \rfloor}\left[\beta_{1,j}\sin\left(\dfrac{2\pi j t}{g}\right) + \beta_{2,j}\cos\left(\dfrac{2\pi j t}{g}\right)\right] + Z_t$

Correction Factors for Logged Models
• If $Z_t$ follows a Gaussian white noise process, use the lognormal correction factor:
$\mathrm{E}\big[e^{z_t}\big] = e^{\sigma^2/2}$
• For any $Z_t$, use the empirical correction factor:
$\mathrm{E}\big[e^{z_t}\big] = \dfrac{1}{n}\sum_{t=1}^{n} e^{z_t}$
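A minimal sketch simulating a stationary AR(1) series and comparing its sample autocorrelations with the theoretical $\rho_k = \alpha^k$ above:

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, n = 0.7, 5000
w = rng.normal(size=n)

x = np.zeros(n)
for t in range(1, n):
    x[t] = alpha * x[t - 1] + w[t]      # X_t = alpha * X_{t-1} + W_t

def sample_acf(series, k):
    """Lag-k sample autocorrelation r_k = c_k / c_0."""
    xbar = series.mean()
    c0 = np.sum((series - xbar) ** 2) / len(series)
    ck = np.sum((series[:-k] - xbar) * (series[k:] - xbar)) / len(series)
    return ck / c0

for k in (1, 2, 3):
    print(k, sample_acf(x, k), alpha ** k)   # sample vs. theoretical rho_k
```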
© 2023 Coaching Actuaries. All Rights Reserved. www.coachingactuaries.com