Reference Card (PDQ Sheet): Week 1: Statistical Inference for One and Two Population Variances
Test of One Population Variance
• Hypotheses: H0 : σ² = σ0² versus HA : σ² ≠ σ0² (two-tailed), HA : σ² > σ0² (right-tailed), or HA : σ² < σ0² (left-tailed)
• Test Statistic: $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$, which follows a chi-square distribution with (n − 1) degrees of freedom
• Rejection Rule (p-value Approach): If p-value < α, reject H0 ; otherwise, if p-value ≥ α, do not reject H0 (Note: In most cases, you need to approximate the p-value from the chi-square table, as the table does not have enough detail)
• Rejection Rule (Confidence Interval Approach, for two-tailed test only): If the confidence interval $\left(\frac{(n-1)s^2}{\chi^2_U}, \frac{(n-1)s^2}{\chi^2_L}\right)$ contains σ0² , do not reject H0 ; otherwise, reject H0
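A minimal Python sketch of this test (the data, σ0², and α below are hypothetical; scipy is assumed to be available):

import numpy as np
from scipy import stats

# Hypothetical two-tailed test of H0: sigma^2 = 4 at alpha = 0.05
sample = np.array([10.2, 11.5, 9.8, 12.1, 10.9, 11.7, 9.5, 10.4])
n = len(sample)
s2 = sample.var(ddof=1)               # sample variance s^2
sigma0_sq = 4.0                       # hypothesized variance under H0
alpha = 0.05

chi2_stat = (n - 1) * s2 / sigma0_sq  # chi-square statistic, df = n - 1
# Two-tailed p-value: twice the smaller tail area
p_value = 2 * min(stats.chi2.sf(chi2_stat, df=n - 1),
                  stats.chi2.cdf(chi2_stat, df=n - 1))

# Confidence interval approach: ((n-1)s^2/chi2_U, (n-1)s^2/chi2_L)
chi2_U = stats.chi2.ppf(1 - alpha / 2, df=n - 1)
chi2_L = stats.chi2.ppf(alpha / 2, df=n - 1)
ci = ((n - 1) * s2 / chi2_U, (n - 1) * s2 / chi2_L)
print(chi2_stat, p_value, ci)         # reject H0 if p_value < alpha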
Chapter 11: 11.4 - 11.5 (Test of Two Population Variances)
• The F distribution is used to test two population variances σ1² and σ2² .
• The test always uses the upper critical region. If the hypothesis is two-tailed, the upper critical region is α/2, while if the test is one-tailed (left-tailed or right-tailed), the upper critical region is α
• Hypotheses: H0 : σ1² = σ2² versus HA : σ1² ≠ σ2² (two-tailed), HA : σ1² > σ2² (right-tailed), or HA : σ1² < σ2² (left-tailed)
• Test Statistic: $F = \frac{s_1^2}{s_2^2}$, with the larger sample variance placed in the numerator; F follows an F distribution with (n1 − 1) numerator and (n2 − 1) denominator degrees of freedom
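A minimal Python sketch (hypothetical samples; scipy assumed available):

import numpy as np
from scipy import stats

# Hypothetical two-tailed test of H0: sigma1^2 = sigma2^2 at alpha = 0.05
x1 = np.array([4.1, 5.3, 3.8, 4.9, 5.6, 4.4])
x2 = np.array([3.9, 4.0, 4.2, 3.7, 4.1, 3.8, 4.0])
s1_sq, s2_sq = x1.var(ddof=1), x2.var(ddof=1)

# Put the larger sample variance in the numerator so the test is
# always carried out in the upper tail, as the card prescribes.
if s1_sq >= s2_sq:
    F, dfn, dfd = s1_sq / s2_sq, len(x1) - 1, len(x2) - 1
else:
    F, dfn, dfd = s2_sq / s1_sq, len(x2) - 1, len(x1) - 1

alpha = 0.05
F_crit = stats.f.ppf(1 - alpha / 2, dfn, dfd)    # alpha/2 upper tail (two-tailed)
p_value = min(1.0, 2 * stats.f.sf(F, dfn, dfd))  # two-tailed p-value
print(F, F_crit, p_value)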
Week 1: Experimental Design and One-way ANOVA
Chapter 12: 12.1 - 12.2 (One-way ANOVA or Completely Randomized Design (Parametric Test))
• Preliminaries:
(i) Observational Study: Researcher simply observes the subjects without interfering
and the data are collected without particular design. Cause and effect relationship
cannot be established.
(ii) Experimental Study: Experimenter manipulates the attributes of study partici-
pants and observes the consequences. Cause and effect relationship can be estab-
lished.
(iii) Experimental unit: Object on which measurement is taken.
(iv) Factor: The variable whose effect is to be studied.
(v) Level: Values (or intensities) of the factor.
(vi) Treatment: A combination of factor levels.
(vii) Response: Outcome of the treatment being measured by the experimenter.
• Notations:
(i) $SSTO = \sum_{i=1}^{p}\sum_{j=1}^{n_i}(x_{ij} - \bar{x})^2 = (n-1)s^2$ (Note: Obtain s² from calculator)
(ii) $SST = \sum_{i=1}^{p} n_i(\bar{x}_i - \bar{x})^2$ (Note: Obtain x̄ from calculator and calculate x̄i manually)
(iii) $SSE = SSTO - SST$
• ANOVA Table:
Source       df      SS     MS                  F
Treatments   p − 1   SST    MST = SST/(p − 1)   MST/MSE
Error        n − p   SSE    MSE = SSE/(n − p)
Total        n − 1   SSTO
• Hypotheses:
H0 : µ1 = µ2 = · · · = µp
HA : At least two of µ1 , µ2 · · · µp differ
• Test Statistic: $F = \frac{MST}{MSE}$
• Rejection Rule (Critical Value Approach): Reject H0 if F > Fα , where Fα is the critical value from the F distribution with (p − 1) numerator and (n − p) denominator degrees of freedom (right-tailed test)
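A minimal Python sketch of the one-way ANOVA (hypothetical data; scipy assumed available):

from scipy import stats

# Hypothetical data: p = 3 treatments with unequal sample sizes
g1 = [24, 28, 31, 27]
g2 = [30, 33, 29, 35, 32]
g3 = [22, 25, 24, 23]

F, p_value = stats.f_oneway(g1, g2, g3)  # F = MST / MSE
print(F, p_value)                        # reject H0 if p_value < alpha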
• Post Hoc test: Tukey’s Simultaneous Confidence Interval (If H0 is rejected only)
– Number of pairs to be compared: $\binom{p}{2} = \frac{p!}{2!(p-2)!}$
– Point estimate for the mean difference: x̄i − x̄h , where i and h designate two different treatments
(i) When sample sizes are unequal: $(\bar{x}_i - \bar{x}_h) \pm \frac{q_\alpha}{\sqrt{2}}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_h}\right)}$
(ii) When sample sizes are equal ($n_i = n_h = m$): $(\bar{x}_i - \bar{x}_h) \pm q_\alpha\sqrt{\frac{MSE}{m}}$
(iii) Critical value qα : qα is the upper α percentage point of the studentized range table for p (which is r) and (n − p) (which is v)
(iv) Decision: If the confidence interval contains 0, the difference between the
population average treatment means is NOT significant. If the confidence
interval does NOT contain 0, the difference is significant.
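A minimal Python sketch of Tukey's pairwise comparisons (same hypothetical data as above, stacked with treatment labels; statsmodels assumed available):

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

values = np.array([24, 28, 31, 27, 30, 33, 29, 35, 32, 22, 25, 24, 23])
groups = np.array(["A"] * 4 + ["B"] * 5 + ["C"] * 4)

# Each simultaneous interval that excludes 0 marks a significant pair
result = pairwise_tukeyhsd(values, groups, alpha=0.05)
print(result.summary())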
Week 2: One-way ANOVA (Nonparametric Test) + Randomized Block Design (Parametric + Nonparametric)
Chapter 18: 18.4 (Kruskal-Wallis H Test: A Nonparametric Test
for Comparing Several Population Means)
• When to use nonparametric test? Nonparametric tests (or distribution-free methods)
are used when the assumptions (same variance, normal distribution) of parametric
tests are not fulfilled
(i) When the usual assumptions are fulfilled, nonparametric tests are about 95% as efficient as their parametric equivalents
(ii) When normality and common variance assumptions are not satisfied, nonpara-
metric procedures are more efficient than parametric equivalents
• Step 1: Combine all p samples into a single sample of size n = n1 + n2 + · · · + np
• Step 2: Rank the combined sample from smallest to largest (average ranks for ties) (Sort −→ Potential (or Initial) Rank −→ Final Rank: Use EXCEL for ranking)
• Step 3: Find T1 : The sum of ranks for sample 1, T2 : The sum of ranks for sample 2,
· · · , Tp : The sum of ranks for sample p where p is the number of populations to be
compared
• Step 4: Hypotheses
H0 : The p populations are identical
HA : At least one of the populations tends to yield larger observations than the others
• Test Statistic: $H = \frac{12}{n(n+1)}\sum_{i=1}^{p}\frac{T_i^2}{n_i} - 3(n+1)$, where n = n1 + n2 + · · · + np
• Rejection Rule (Critical Value Approach): Reject H0 if H > χ2α where χ2α follows
chi-square distribution with (p − 1) degrees of freedom.
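A minimal Python sketch (hypothetical samples; scipy assumed available):

from scipy import stats

# Hypothetical samples from p = 3 populations
s1 = [68, 72, 77, 42, 53]
s2 = [60, 65, 59, 74, 82]
s3 = [44, 70, 54, 61, 48]

# kruskal ranks the combined sample (averaging ties) and computes H
H, p_value = stats.kruskal(s1, s2, s3)
print(H, p_value)  # compare p_value with alpha, or H with the chi-square critical value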
Chapter 12: 12.3 (Randomized Block Design (RBD)) (Parametric
Test)
• When to use RBD? There are situations where another, nuisance factor (not the primary factor) may affect the observed response in a one-way design
• Notion: The blocking technique can be used to eliminate the additional factor's effect on the statistical analysis of the main (or primary) factor of interest
• Factors: Treatment (primary factor), Block (nuisance factor) (In the data layout, treat-
ment and block can swap the positions of row and column)
• Notations (p treatments, b blocks):
(i) $SSTO = \sum_{i=1}^{p}\sum_{j=1}^{b}(x_{ij} - \bar{x})^2$
(ii) $SST = b\sum_{i=1}^{p}(\bar{x}_{i\cdot} - \bar{x})^2$ (treatments)
(iii) $SSB = p\sum_{j=1}^{b}(\bar{x}_{\cdot j} - \bar{x})^2$ (blocks)
(iv) $SSE = SSTO - SST - SSB$
• ANOVA Table:
Source       df               SS     MS                           F
Treatments   p − 1            SST    MST = SST/(p − 1)            MST/MSE
Blocks       b − 1            SSB    MSB = SSB/(b − 1)            MSB/MSE
Error        (p − 1)(b − 1)   SSE    MSE = SSE/[(p − 1)(b − 1)]
Total        pb − 1           SSTO
• Test of Treatment (Primary Factor) Means:
– Hypotheses
H0 : µ1 = µ2 = · · · = µp
HA : At least two of µ1 , µ2 · · · µp differ
– Test Statistic: $F = \frac{MST}{MSE}$
– Rejection Rule (Critical Value Approach): Reject H0 if F > Fα , where Fα is the critical value from the F distribution with (p − 1) numerator and (p − 1)(b − 1) denominator degrees of freedom
• Test of Block (Nuisance Factor) Means:
– Hypotheses:
H0 : µb1 = µb2 = · · · = µbb (all b block means are equal)
HA : At least two block effects (i.e. block means) differ
– Test Statistic: $F = \frac{MSB}{MSE}$
– Rejection Rule (Critical Value Approach): Reject H0 if F > Fα , where Fα is the critical value from the F distribution with (b − 1) numerator and (p − 1)(b − 1) denominator degrees of freedom
• Post Hoc test: Tukey’s Simultaneous Confidence Interval (If H0 is rejected only)
– Number of pairs to be compared: $\binom{p}{2} = \frac{p!}{2!(p-2)!}$
– Point estimate for the mean difference: $\bar{x}_{i\cdot} - \bar{x}_{h\cdot}$, where i and h designate two different treatments
– Confidence Interval: $(\bar{x}_{i\cdot} - \bar{x}_{h\cdot}) \pm q_\alpha\sqrt{\frac{MSE}{b}}$
(iii) Critical value qα : qα is the upper α percentage point of the studentized range table for p (which is r) and (p − 1)(b − 1) (which is v)
(iv) Decision: If the confidence interval contains 0, the difference between the popula-
tion average treatment means is NOT significant. If the confidence interval does
NOT contain 0, the difference is significant.
Friedman Fr Test: A Nonparametric Test for RBD
• Step 1: Rank the p measurements within each block from 1, . . . , p after sorting from
smallest to largest (average ranks for ties) (Sort −→ Potential (or Initial) Rank −→
Final Rank: Use EXCEL for ranking)
• Step 2: Find T1 : The sum of ranks for treatment 1, T2 : The sum of ranks for treatment 2, · · · , Tp : The sum of ranks for treatment p, where p is the number of treatments
• Step 3: Hypotheses
H0 : The p treatments have identical probability distributions
HA : At least two of the treatment distributions differ in location
• Step 4: Test Statistic: $F_r = \frac{12}{bp(p+1)}\sum_{i=1}^{p}T_i^2 - 3b(p+1)$, where b is the number of blocks
• Step 5: Rejection Rule (Critical Value Approach): Reject H0 if Fr > χ2α , where χ2α follows a chi-square distribution with (p − 1) degrees of freedom (use upper critical region).
• Step 5: Rejection Rule (p-value Approach): If p-value (exact p-value: MINITAB; approximate p-value based on chi-square table) < α, reject H0 ; otherwise, if p-value ≥ α, do not reject H0
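A minimal Python sketch (hypothetical RBD data; scipy assumed available):

from scipy import stats

# Each list holds one treatment's measurements, aligned so that
# position j corresponds to block j (b = 5 blocks, p = 3 treatments)
t1 = [7.0, 9.9, 8.5, 5.1, 10.3]
t2 = [5.3, 5.7, 4.7, 3.5, 7.7]
t3 = [4.9, 7.6, 5.5, 2.8, 8.4]

Fr, p_value = stats.friedmanchisquare(t1, t2, t3)
print(Fr, p_value)  # reject H0 if p_value < alpha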
Week 3: Chi-square Goodness-of-Fit Test and Test of
Independence
Chapter 13: 13.1 (Chi-square Goodness-of-Fit Test)
• Inference for count data
• Research Question: Does the actual distribution differ from the hypothesized model (no preference or specific preference)? How good is the fit?
• Hypotheses:
H0 : The population follows the hypothesized distribution (p1 = p1,0 , p2 = p2,0 , . . . , pk = pk,0 )
HA : The population does not follow the hypothesized distribution (at least one pi differs)
• Test Statistic: $\chi^2 = \sum_{i=1}^{k}\frac{(O_i - E_i)^2}{E_i}$, where Oi is the observed count and $E_i = n\,p_{i,0}$ is the expected count for category i
• Rejection Rule (Critical Value Approach):
Reject H0 if χ2 > χ2α , where χ2α follows a chi-square distribution with (k − 1) degrees of freedom, or with (k − 1 − m) degrees of freedom in the case of a goodness-of-fit test for a normal distribution, where m is the number of parameters to be estimated
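A minimal Python sketch of a "no preference" goodness-of-fit test (hypothetical counts; scipy assumed available):

from scipy import stats

observed = [48, 35, 15, 22]          # hypothetical observed counts, k = 4
n, k = sum(observed), len(observed)
expected = [n / k] * k               # equal probabilities under H0

chi2, p_value = stats.chisquare(observed, f_exp=expected)  # df = k - 1
print(chi2, p_value)                 # reject H0 if p_value < alpha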
Chapter 13: 13.2 (Chi-square Test of Independence or Contingency
Table Analysis)
• Inference for count data
• Hypotheses:
H0 : The two classifications are statistically independent (or not associated)
HA : The two classifications are statistically dependent (or associated)
• Test Statistic: $\chi^2 = \sum_i\sum_j\frac{(f_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}}$, where $\hat{e}_{ij} = \frac{(\text{row } i \text{ total})(\text{column } j \text{ total})}{n}$
• Rejection Rule (Critical Value Approach): Reject H0 if χ2 > χ2α , where χ2α follows a chi-square distribution with (r − 1)(c − 1) degrees of freedom.
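A minimal Python sketch (hypothetical contingency table; scipy assumed available):

import numpy as np
from scipy import stats

# Hypothetical r x c table of observed counts
table = np.array([[20, 30, 25],
                  [30, 40, 55]])

# chi2_contingency computes expected counts (row total x column total / n)
# and uses (r - 1)(c - 1) degrees of freedom
chi2, p_value, dof, expected = stats.chi2_contingency(table)
print(chi2, p_value, dof)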
Week 4: Correlation and Simple Linear Regression Anal-
ysis
Chapter 14.6: Correlation
• Pearson Correlation Coefficient: Measures and describes the direction, strength, and form of the relationship between two quantitative variables.
$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}$, where
$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$, $SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$, $SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$
• Properties: −1 ≤ r ≤ 1; r is unit-free and symmetric in x and y; r measures only the strength of the linear relationship
• Caution:
(i) Correlation and Causation: Correlation does NOT imply causation (because causal
path is asymmetric, while correlation is symmetric).
(ii) Spurious Correlation: Effect of lurking variable.
(iii) Correlation and Restricted Range of Scores: Never generalize a correlation be-
yond the sample range of data.
(iv) Correlation and Outliers: Outliers produce a disproportionately large impact on
the correlation coefficient
• Assumptions:
(i) The data are measured on interval or ratio level
(ii) The two variables (x and y) are distributed as a Bivariate Normal distribution
• Hypotheses: H0 : ρ = 0 (no linear correlation) versus HA : ρ ≠ 0 (two-tailed)
• Test Statistic: $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$, which follows a t distribution with (n − 2) degrees of freedom
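A minimal Python sketch (hypothetical paired data; scipy assumed available):

import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 3.4, 4.8, 5.1, 6.5])

r, p_value = stats.pearsonr(x, y)                 # tests H0: rho = 0 (two-tailed)
n = len(x)
t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)   # equivalent t statistic, df = n - 2
print(r, t_stat, p_value)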
Chapter 14.1 - 14.2: Simple Linear Regression Analysis
• Origin of the word "Regression": "Regress" means "step back". The goal is to find the average regression line (stepping back from all the data points towards the 'average') that passes through the middle of all the points; hence the name.
• Why Simple? Because it involves one independent (or explanatory or predictor) vari-
able X and one dependent (or response) variable Y .
• Objective:
(i) An objective method of precisely determining the best-fitting line that describes the linear relationship between X and Y .
(ii) Prediction of Y based on knowledge of X
• Least Squares Point Estimates:
(a) $b_1 = \frac{SS_{xy}}{SS_{xx}}$, where $SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$ and $SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$.
(b) $b_0 = \bar{y} - b_1\bar{x}$, where $\bar{y} = \frac{\sum y}{n}$ and $\bar{x} = \frac{\sum x}{n}$.
– Interpretation of b0 , b1 : b0 is the estimated mean of Y when X = 0 (meaningful only if X = 0 lies within the range of the data); b1 is the estimated change in the mean of Y for a one-unit increase in X.
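A minimal Python sketch of the least squares fit (hypothetical data; scipy assumed available):

import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 3.4, 4.8, 5.1, 6.5])

res = stats.linregress(x, y)
print(res.slope, res.intercept)   # b1 = SSxy/SSxx and b0 = ybar - b1*xbar
print(res.rvalue**2)              # r^2, coefficient of determination
print(res.pvalue, res.stderr)     # two-tailed test of beta1 = 0 and s_b1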
(iii) Interpretation of r2 : r2 = 52% means that 52% of the variation of Y is explained
by the model (i.e., X) and 48% of the variation of Y is still unexplained.
(iv) Standard Error of Estimate: $s = \sqrt{\frac{SSE}{n-k-1}} = \sqrt{\frac{SSE}{n-2}} = \sqrt{MSE}$. Here, k = 1, the number of independent variables in the simple linear regression model.
(v) Partition of Sum of Squares:
$SST = \sum y^2 - \frac{(\sum y)^2}{n}$, $SSE = (1 - r^2)\,SST$, and $SSR = SST - SSE$.
(vi) Relationship between r and r2 : $r = \pm\sqrt{r^2}$, with the sign of r matching the sign of the slope b1 .
Week 5: Correlation and Simple Linear Regression (cont’d)
Chapter 14.3 - 14.5: Simple Linear Regression Analysis
• Test of Significance of Simple Regression Model: Three equivalent tests.
(i) Test 1: Test for significance of the correlation between X and Y (Please refer to
the section of Correlation).
(ii) Test 2: Test for significance of the coefficient of determination
∗ Hypotheses: H0 : ρ2 = 0 versus HA : ρ2 > 0; equivalently, the overall F test with $F = \frac{MSR}{MSE} = \frac{(n-2)r^2}{1-r^2}$ on 1 and (n − 2) degrees of freedom (right-tailed)
(iii) Test 3: Test for Regression Slope, β1
∗ Hypotheses: H0 : β1 = 0 versus HA : β1 ≠ 0 (two-tailed)
∗ Test Statistic: $t = \frac{b_1 - \beta_1}{s_{b_1}}$, where $s_{b_1} = \frac{s}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}}$ and s = standard error of estimate.
(a) Reject H0 if t > tα/2 or t < −tα/2 , where tα/2 is the critical value from the t distribution with (n − k − 1) = (n − 2) degrees of freedom. (two-tailed test)
• Test of Significance of y-intercept, β0 :
– Hypotheses: H0 : β0 = 0 versus HA : β0 ≠ 0 (two-tailed)
(a) Reject H0 if t > tα/2 or t < −tα/2 , where tα/2 is the critical value from the t distribution with (n − k − 1) = (n − 2) degrees of freedom. (two-tailed test)
Week 6: Correlation and Simple Linear Regression (cont’d)
Chapter 14.7: Simple Linear Regression Analysis (Residual Anal-
ysis)
• Model Assumptions:
(a) Linearity: Plot of (Residual (e) vs Independent Variable (X)) or (Residual (e) vs Predicted Y (Ŷ )).
Linear Pattern: No pattern can be discerned; residuals are fairly randomly distributed around the reference line (0).
(b) Constant Variance: Homoskedasticity
Plot of (Residual(e) vs Independent Variable) (X) or
(Residual (e) vs Predicted Y (Ŷ )).
Nonconstant Variance Pattern: Residuals "fan out" or "funnel in"; residuals do not spread equally over all values of X.
(c) Normality:
Histogram of Residuals: Plot of Residual(e)
(d) Independence: If the observations of the error terms are correlated with each other in time series data, the assumption is violated and the problem of autocorrelation or serial correlation arises.
No Autocorrelation: Different observations of the error term are completely un-
correlated with each other.
Step 1: Hypotheses
H0 : The error terms are not autocorrelated
HA : The error terms are positively autocorrelated (the test can also be set up for negative or two-sided autocorrelation)
Step 2: Test Statistic
$d = \frac{\sum_{t=2}^{n}(e_t - e_{t-1})^2}{\sum_{t=1}^{n}e_t^2}$
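A minimal Python sketch of the Durbin-Watson statistic (simulated residuals for illustration; statsmodels assumed available):

import numpy as np
from statsmodels.stats.stattools import durbin_watson

# Simulated, mildly autocorrelated regression residuals
rng = np.random.default_rng(1)
e = rng.normal(size=30).cumsum() * 0.3 + rng.normal(size=30)

d = durbin_watson(e)  # d = sum((e_t - e_{t-1})^2) / sum(e_t^2)
print(d)              # d near 2: no autocorrelation; d well below 2: positive autocorrelation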
Week 7: Multiple Regression and Model Building
Chapter 15: Multiple Regression and Model Building (15.1 - 15.5)
• Multiple regression is an extension of simple regression that allows more than one
predictor (or independent or explanatory) variable.
• Quality of Fit:
(a) R2 : Fraction of the total variation in y accounted for by the model (all the pre-
dictor variables included).
$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
(b) Adjusted R2 : Adjusted R2 increases only if the gain in SSR outweighs the loss of
degrees of freedom due to addition of new independent variables. Adjusted R2 is
always less than R2 .
$\bar{R}^2 = \left(R^2 - \frac{k}{n-1}\right)\left(\frac{n-1}{n-k-1}\right)$
(c) Standard Error of Estimate, s: A point estimate of the standard deviation of the
error term, σ.
$s = \sqrt{MSE} = \sqrt{\frac{SSE}{n-k-1}}$
Interpretation of s: Is this value large or small? Take into account the mean size of y for comparison (using the 68-95-99.7% rule). If the interval ȳ ± 2s appears too large to be acceptable from the standpoint of prediction, steps should be taken to reduce the standard error of estimate, s, by improving the regression model.
• ANOVA Table:
Source       df          SS    MS                      F
Regression   k           SSR   MSR = SSR/k             MSR/MSE
Error        n − k − 1   SSE   MSE = SSE/(n − k − 1)
Total        n − 1       SST
∗ Hypotheses:
H0 : β1 = β2 = · · · = βk = 0
HA : At least one βi ≠ 0
(The F test always uses the upper critical region and is right-tailed)
∗ Test Statistic: $F = \frac{SSR/k}{SSE/(n-k-1)} = \frac{MSR}{MSE}$, where k is the number of independent variables in the multiple linear regression model and n is the sample size.
∗ Rejection Rule (Critical Value Approach): Reject H0 if F > Fα , where Fα is the critical value from the F distribution with k and (n − k − 1) degrees of freedom.
∗ Test of an Individual Coefficient:
Hypotheses: H0 : βi = 0 versus HA : βi ≠ 0
Test Statistic: $t = \frac{b_i - \beta_i}{s_{b_i}}$, where $s_{b_i}$ is the standard error of the coefficient bi (from MINITAB output).
(a) Reject H0 if t > tα/2 or t < −tα/2 , where tα/2 is the critical value from the t distribution with (n − k − 1) degrees of freedom. (two-tailed test)
∗ Rejection Rule (p-value Approach): If p-value < α, reject H0 ; Otherwise, if
p-value ≥ α, do not reject H0 (Note: Exact p-value can be obtained from
MINITAB, while approximate p-value can be obtained from t table as the t
table does not have enough details)
∗ Rejection Rule (Confidence Interval Approach): If the confidence interval $b_i \pm t_{\alpha/2}\,s_{b_i}$ contains 0, do not reject H0 ; if the confidence interval does not contain 0, reject H0 .
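A minimal Python sketch of a multiple regression fit covering these quantities (hypothetical data; statsmodels assumed available):

import numpy as np
import statsmodels.api as sm

# Hypothetical data: n = 8 observations, k = 2 predictors
X = np.array([[2, 50], [3, 45], [4, 60], [5, 55],
              [6, 65], [7, 62], [8, 70], [9, 68]], dtype=float)
y = np.array([20, 24, 28, 30, 35, 36, 41, 44], dtype=float)

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.rsquared, model.rsquared_adj)  # R^2 and adjusted R^2
print(np.sqrt(model.mse_resid))            # s, the standard error of estimate
print(model.fvalue, model.f_pvalue)        # overall F test of H0: all beta_i = 0
print(model.tvalues, model.pvalues)        # individual t tests of beta_i = 0
print(model.conf_int(alpha=0.05))          # b_i +/- t_{alpha/2} * s_{b_i}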
Week 8: Multiple Regression and Model Building
Chapter 15: Multiple Regression and Model Building (15.6 - 15.9)
• Confidence and Prediction Intervals:
(i) Distance Value: Not directly available in multiple regression output. Workaround formula: distance value $= \left(\frac{s_{\hat{y}}}{s}\right)^2$, where $s_{\hat{y}}$ is the standard error of the fit and s is the standard error of estimate, both available in MINITAB output.
(ii) Confidence Interval for the Mean of Y given X1 = x10 , X2 = x20 , · · · , Xk = xk0 :
$\hat{y} \pm t_{\alpha/2}\,s\,\sqrt{\text{distance value}}$
(iii) Prediction Interval for an individual Y given X1 = x10 , X2 = x20 , · · · , Xk = xk0 :
$\hat{y} \pm t_{\alpha/2}\,s\,\sqrt{1 + \text{distance value}}$
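A minimal Python sketch of both intervals (same hypothetical data as the earlier multiple regression sketch; statsmodels assumed available):

import numpy as np
import statsmodels.api as sm

X = np.array([[2, 50], [3, 45], [4, 60], [5, 55],
              [6, 65], [7, 62], [8, 70], [9, 68]], dtype=float)
y = np.array([20, 24, 28, 30, 35, 36, 41, 44], dtype=float)
model = sm.OLS(y, sm.add_constant(X)).fit()

# Interval estimates at a hypothetical new point X1 = 5.5, X2 = 58
x_new = np.array([[1.0, 5.5, 58.0]])     # leading 1 for the intercept
frame = model.get_prediction(x_new).summary_frame(alpha=0.05)
print(frame[["mean_ci_lower", "mean_ci_upper"]])  # CI for the mean of Y
print(frame[["obs_ci_lower", "obs_ci_upper"]])    # PI for an individual Y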
• Dummy Variable
(i) Some variables are inherently “qualitative” (rather than “quantitative”) and can’t
be expressed as a number.
(ii) Some qualitative variables can be quantified by using a dummy variable or indi-
cator variable. A dummy variable takes on value of 0 (if condition is not met) or
1 (if condition is met).
(iii) Rule: Create one fewer dummy than there are conditions; specifically, k categories require k − 1 dummy variables. The omitted condition serves as the "reference" or "benchmark" category; every other category is compared with the reference category when interpreting the dummy variable coefficients.
(iv) Intercept Dummy (Dummy variables without interaction effect in the model): In-
tercept changes but slope remains constant (i.e., parallel).
Model for Di = 1 : Yi = (β0 + β2 ) + β1 Xi (intercept β0 + β2 , slope β1 )
Model for Di = 0 : Yi = β0 + β1 Xi (intercept β0 , slope β1 )
Method 2: On average, Y is b2 more (if the sign is positive) or less (if the sign is negative) for the Di = 1 category compared to the Di = 0 category.
(vi) Slope Dummy (Dummy variables with interaction term in the model): A quanti-
tative variable and dummy variable interaction is typically called a slope dummy.
If a slope dummy and intercept dummy are added to an equation, both intercept
and slope change (i.e., intercepts shift and lines are NOT parallel).
• Using Squared and Interaction Variables
(i) Squared (Quadratic) Term:
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
(ii) Interaction Term: Regression models often contain interaction variables. In the following regression model, x3 and x4 interact, and x3 is also used as a quadratic:
$y = \beta_0 + \beta_1 x_4 + \beta_2 x_3 + \beta_3 x_3^2 + \beta_4 x_3 x_4 + \varepsilon$
The Partial F-test is a formal hypothesis test designed to deal with a single
hypothesis about a group (or subset) of coefficients.
∗ Hypotheses:
H0 : β1 = β2 = · · · = βM = 0
HA : At least one βi ≠ 0
(The F test always uses the upper critical region and is right-tailed.) Here, M is the number of coefficients to be tested, i.e., the number of constraints.
∗ Test Statistic: $F = \frac{(SSE_M - SSE)/M}{SSE/(n-k-1)}$, where
· k is number of independent variables in multiple linear regression model.
· n is sample size.
· SSE is the residual sum of squares from the full (unconstrained) model.
· SSEM is the residual sum of squares from the reduced (constrained)
model.
· M is the number of constraints or coefficients in the null hypothesis to
be tested
∗ Rejection Rule (Critical Value Approach): Reject H0 if F > Fα where Fα
is the critical value from F distribution with M and (n − k − 1) degrees of
freedom (Right-tailed Test).
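A minimal Python sketch of the partial F test (simulated data; statsmodels and scipy assumed available):

import numpy as np
import statsmodels.api as sm
from scipy import stats

# Simulated data; test H0: beta2 = beta3 = 0 (M = 2 constraints)
rng = np.random.default_rng(0)
n = 40
x1, x2, x3 = rng.normal(size=(3, n))
y = 1 + 2 * x1 + 0.5 * x2 + rng.normal(size=n)

full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2, x3]))).fit()
reduced = sm.OLS(y, sm.add_constant(x1)).fit()   # constrained model

M, k = 2, 3
F = ((reduced.ssr - full.ssr) / M) / (full.ssr / (n - k - 1))
p_value = stats.f.sf(F, M, n - k - 1)            # right-tailed
print(F, p_value)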
Week 9: Multiple Regression and Model Building
Chapter 15: Multiple Regression and Model Building (15.10 -
15.11)
• Multicollinearity:
Perfect Multicollinearity: When two independent variables are perfectly linearly correlated (i.e., r = +1 or −1). Perfect multicollinearity is a violation of the Classical Regression Assumptions.
Consequences: Coefficient estimates remain unbiased, but their variances and standard errors are inflated, so t-statistics fall and it becomes harder to establish the significance of individual predictors.
Detection:
(i) Simple Correlation Coefficient (e.g., correlation matrix) (for two independent variables)
(ii) Variance Inflation Factor (VIF) (for two or more independent variables): $VIF_j = \frac{1}{1 - R_j^2}$, where $(1 - R_j^2)$ is called the tolerance.
Criteria for Assessing Multicollinearity: A VIFj greater than 10 (equivalently, Rj2 > 0.90) indicates severe multicollinearity; a mean VIF substantially greater than 1 also signals a problem.
(i) Step 1: Run an OLS regression that has Xj as a function of all the other independent variables in the equation.
(ii) Step 2: Compute $VIF_j = \frac{1}{1 - R_j^2}$ from that regression's Rj2 , and repeat for each independent variable (a sketch follows below).
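A minimal Python sketch of VIF computation (simulated, deliberately correlated predictors; statsmodels assumed available):

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
x1 = rng.normal(size=50)
x2 = x1 * 0.9 + rng.normal(scale=0.3, size=50)   # strongly related to x1
x3 = rng.normal(size=50)

exog = sm.add_constant(np.column_stack([x1, x2, x3]))
# VIF_j = 1 / (1 - R_j^2), from regressing X_j on the other predictors
vifs = [variance_inflation_factor(exog, j) for j in range(1, exog.shape[1])]
print(vifs)   # values above 10 flag severe multicollinearity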
• Comparing Models: A model is considered desirable if the following criteria are fulfilled:
– Mallows' Cp (a.k.a. C): $C = \frac{SSE}{s_p^2} - [n - 2(k+1)] \le (k+1)$, where $s_p^2$ is the mean square error for the model containing all p potential independent variables and SSE is the error sum of squares for a reduced model with k independent variables.
(ii) Constant Variance: Plot of residuals vs predicted value; plot of residuals vs each
independent variable
(iii) Independence: Plot of residuals vs time (if data is time series)
• Detection of Unusual Observations:
(i) Extrapolation and Prediction: Linear models ought not to be trusted beyond the span of the x-values of the data used to fit the model. If you extrapolate far into the future, be prepared for the actual values to be (possibly quite) different from your predictions.
(ii) Outlier: An outlying y value (a large distance from ȳ).
(iii) Leverage: An outlying x value (a large distance from x̄). A leverage value is considered to be large if it is greater than $\frac{2(k+1)}{n}$.
(iv) Influential: A point is called influential if omitting it changes the slope dramatically.
(v) Standardized or Studentized Residuals: A residual divided by an estimate of its standard deviation. If a studentized residual is greater than 2 in absolute value, the observation is deemed an outlier.
(vi) Deleted Residuals: $e_{(i)} = y_i - \hat{y}_{(i)}$, where the prediction $\hat{y}_{(i)}$ is obtained using least squares point estimates based on all n observations except observation i. An observation is an outlier with respect to y if its deleted residual is unusually large in absolute value.
Week 10: Time Series Forecasting
Chapter 17: Time Series Forecasting (17.1 - 17.2)
• Distinction between Cross-sectional and Time Series Data:
Time Series Data: Variables that are measured at regular intervals for the same entity
over time are called time series.
Cross-sectional Data: When several variables are all measured at the same point in time for different entities, the data are called cross-sectional data.
(i) A trend is the long-term increase or decrease in a variable being measured over
time (linear or nonlinear).
(ii) A seasonal component is a wave-like pattern that repeats throughout a time series with a recurrence period of at most one year.
(iii) A cyclical component is a long-run up-and-down fluctuation around the trend level with a recurrence period of more than one year (2 to 10 years or even longer, measured from peak to peak or trough to trough).
(iv) A random component depicts changes in time-series data that are unpredictable
(follow no regular or recognizable pattern) and cannot be associated with a trend,
seasonal, or cyclical component.
• Regression Model with No Trend:
(i) Model: $y_t = \beta_0 + \varepsilon_t$
(ii) Point Estimate: $\hat{y}_t = b_0 = \bar{y}$
(iii) Interval Estimate: $\hat{y}_t \pm t_{\alpha/2}\,s\,\sqrt{1 + \frac{1}{n}}$, where tα/2 is the critical value from the t distribution with (n − 1) degrees of freedom.
• Regression Model with Linear Trend:
(i) Model: $y_t = \beta_0 + \beta_1 t + \varepsilon_t$
(ii) Point Estimates:
$b_1 = \frac{\sum t\,y_t - \frac{(\sum t)(\sum y_t)}{n}}{\sum t^2 - \frac{(\sum t)^2}{n}}$, $b_0 = \frac{\sum y_t}{n} - b_1\frac{\sum t}{n}$
• Modelling Seasonal Components:
Within regression, seasonality can be modeled using dummy variables. If the time series data is quarterly, the following model can be considered (see the sketch below):
$y_t = \beta_0 + \beta_1 t + \beta_{Q1} Q_1 + \beta_{Q2} Q_2 + \beta_{Q3} Q_3 + \varepsilon_t$, where Q1 , Q2 , Q3 are dummy variables for quarters 1-3 and quarter 4 serves as the reference category.
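A minimal Python sketch of a trend-plus-quarterly-dummies regression (simulated series; statsmodels assumed available):

import numpy as np
import statsmodels.api as sm

# Simulated quarterly series: linear trend + quarterly swings + noise
rng = np.random.default_rng(3)
n = 24                              # six years of quarterly data
t = np.arange(1, n + 1)
quarter = (t - 1) % 4 + 1           # 1, 2, 3, 4, 1, 2, ...
y = (100 + 2 * t
     + np.select([quarter == 1, quarter == 2, quarter == 3], [10, -5, 8], 0)
     + rng.normal(scale=2, size=n))

# Dummies for quarters 1-3; quarter 4 is the reference category
Q1 = (quarter == 1).astype(float)
Q2 = (quarter == 2).astype(float)
Q3 = (quarter == 3).astype(float)
X = sm.add_constant(np.column_stack([t, Q1, Q2, Q3]))
fit = sm.OLS(y, X).fit()
print(fit.params)   # b0, b1 (trend), and the three seasonal shifts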
Using Transformation
Sometimes transforming the time series data (with nonconstant seasonality) makes it easier to forecast. Typical transformation techniques include the square root ($y_t^{1/2}$), the quartic root ($y_t^{1/4}$), and the natural logarithm ($\ln y_t$).
Constant Seasonality: The magnitude of the swing does not depend on the level of the
time series.
Nonconstant Seasonality: The magnitude of the swing increases as the level of the time
series increases.
Steps of Forecasting using Transformation: transform the series, forecast on the transformed scale, then transform the interval endpoints back to the original scale. Example (quartic-root transformation): 95% prediction interval estimate for y169 : $[(5.2913)^4, (5.4065)^4] = [783.88, 854.41]$
Chapter 17: Adjusting for Seasonality: Multiplicative Decomposi-
tion (17.3)
• Multiplicative Decomposition Model: $y_t = tr_t \times sn_t \times cl_t \times ir_t$ (trend, seasonal, cyclical, and irregular components)
When to use: When the magnitude of the seasonal swing changes with the level of the time series (nonconstant seasonality).
Moving Average Steps:
(i) Compute each moving average from the k appropriate consecutive data values,
where k is the number of values in one period of the time series
(ii) Compute the centered moving averages ($tr_t \times cl_t$) (if the number of values in a period is odd, the centering procedure is unnecessary)
(iii) Isolate the seasonal component by computing the ratio-to-moving-average values ($\frac{y_t}{tr_t \times cl_t} = sn_t \times ir_t$)
(iv) Compute the seasonal indexes by averaging the ratio-to-moving-average values for comparable periods ($\overline{sn}_t$)
(v) Normalize the seasonal indexes (if necessary): normalizing coefficient $= \frac{L}{\sum \overline{sn}_t}$ ; $sn_t = \overline{sn}_t \times \frac{L}{\sum \overline{sn}_t}$
(vi) Deseasonalize the time series by dividing the actual data by the appropriate seasonal index ($\frac{y_t}{sn_t}$)
(vii) Use least squares regression to develop the trend line using the deseasonalized
data (trt = b0 + b1 t)
(viii) Develop the unadjusted forecasts using trend projection
(ix) Seasonally adjust the forecasts by multiplying the unadjusted forecasts by the
appropriate seasonal index (This returns seasonality to forecast) (trt × snt )
(x) We estimate the period-by-period cyclical and irregular component by dividing the deseasonalized observation from Step (vi) by the deseasonalized forecast from Step (viii) ($\frac{y_t}{tr_t \times sn_t} = cl_t \times ir_t$)
(xi) We use a three-period moving average to average out the irregular component ($cl_t$)
(xii) The value from Step (x) divided by the value from Step (xi) gives us the irregular component ($ir_t = \frac{cl_t \times ir_t}{cl_t}$)
Week 11: Time Series Forecasting
Chapter 17: Simple Exponential Smoothing (17.4)
• Uses: Forecast a series with no trend and no seasonality
• Advantage: Simple, popular, adaptive (learn continuously from recent data)
• Key Concept: Smoothing constant (α)
• Model (No Trend, No Seasonality): $y_t = \beta_0 + \varepsilon_t$
• Level Updating Equation:
ST = αyT + (1 − α)ST −1
• Smoothing Constant, α: Controls the speed of learning (0 ≤ α ≤ 1).
$S_T = \alpha y_T + (1-\alpha)S_{T-1}$
$= \alpha y_T + (1-\alpha)[\alpha y_{T-1} + (1-\alpha)S_{T-2}]$
$= \alpha y_T + (1-\alpha)\alpha y_{T-1} + (1-\alpha)^2[\alpha y_{T-2} + (1-\alpha)S_{T-3}]$
$= \alpha y_T + \alpha(1-\alpha)y_{T-1} + \alpha(1-\alpha)^2 y_{T-2} + \cdots$
Because the weights α, α(1 − α), α(1 − α)2 , . . . decay exponentially into the past, the method is called exponential smoothing (see the sketch below).
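A minimal Python sketch of the level-updating recursion (hypothetical series; initializing the level at the first observation is one common choice, not the card's prescription):

import numpy as np

def simple_exp_smoothing(y, alpha, s0=None):
    """Level updating: S_T = alpha*y_T + (1 - alpha)*S_{T-1}."""
    s = y[0] if s0 is None else s0
    levels = []
    for obs in y:
        s = alpha * obs + (1 - alpha) * s
        levels.append(s)
    return np.array(levels)

# Hypothetical series with no trend or seasonality
y = np.array([102, 99, 101, 98, 103, 100, 97, 102], dtype=float)
levels = simple_exp_smoothing(y, alpha=0.3)
print(levels[-1])   # S_T: the forecast for every future period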
Week 12: Time Series Forecasting
Chapter 17: Double Exponential Smoothing - Holt-Winter’s Model
(17.5)
• Uses: Forecast a series with trend and/or seasonality
lT = αyT + (1 − α)[lT −1 + bT −1 ]
The equation says that lT equals a fraction α of the newly observed time series value
yT plus a fraction (1 − α) of the level and trend in time period (T − 1).
bT = γ[lT − lT −1 ] + (1 − γ)bT −1
The equation says that bT equals a fraction γ of [lT − lT −1 ], an estimate of the difference between the levels of the time series in periods T and T − 1, plus a fraction (1 − γ) of bT −1 , the estimate of the trend in time period (T − 1).
(a) ŷ25 (24) = l24 + b24 (One-step ahead forecast i.e., forecast made in time 24 (origin)
for future time 25).
(b) ŷ26 (24) = l24 + 2b24 (Two-step ahead forecast i.e., forecast made in time 24 (ori-
gin) for future time 26).
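A minimal Python sketch of Holt's updating equations (hypothetical series; the simple initialization of level and trend is an assumption, not the card's prescription):

import numpy as np

def holt(y, alpha, gamma):
    """Holt's two-equation updating for level l_T and trend b_T."""
    l, b = y[0], y[1] - y[0]   # simple initial level and trend
    for obs in y[2:]:
        l_prev = l
        l = alpha * obs + (1 - alpha) * (l + b)      # level update
        b = gamma * (l - l_prev) + (1 - gamma) * b   # trend update
    return l, b

y = np.array([12, 15, 16, 19, 22, 24, 27, 29], dtype=float)  # trending series
l_T, b_T = holt(y, alpha=0.2, gamma=0.1)
print(l_T + b_T)       # one-step-ahead forecast
print(l_T + 2 * b_T)   # two-step-ahead forecast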
Chapter 17: Multiplicative Winter’s Model (17.5)
• Uses: Forecast a series with trend and seasonality
• Level Updating Equation: $l_T = \alpha\frac{y_T}{sn_{T-L}} + (1-\alpha)[l_{T-1} + b_{T-1}]$
The equation says that lT equals a fraction α of the newly observed deseasonalized time series value $\frac{y_T}{sn_{T-L}}$ plus a fraction (1 − α) of the level and trend in time period (T − 1).
bT = γ[lT − lT −1 ] + (1 − γ)bT −1
The equation says that bT equals a fraction γ of [lT − lT −1 ], an estimate of the difference between the levels of the time series in periods T and T − 1, plus a fraction (1 − γ) of bT −1 , the estimate of the trend in time period (T − 1).
(a) ŷ37 (36) = (l36 + b36 )sn25 (One-step-ahead forecast, i.e., forecast made at time 36 (origin) for future time 37; sn25 is the most recent seasonal factor for that season, estimated at time 37 − L = 25 with L = 12).
(b) ŷ43 (36) = (l36 + 7b36 )sn31 (Seven-step-ahead forecast, i.e., forecast made at time 36 (origin) for future time 43; sn31 is the most recent seasonal factor for that season, from time 43 − L = 31).
Chapter 17: Additive Winter’s Model (17.5)
• Uses: Forecast a series with trend and seasonality
• Level Updating Equation: $l_T = \alpha(y_T - sn_{T-L}) + (1-\alpha)[l_{T-1} + b_{T-1}]$
The equation says that lT equals a fraction α of the newly observed deseasonalized time series value (yT − snT −L ) plus a fraction (1 − α) of the level and trend in time period (T − 1).
bT = γ[lT − lT −1 ] + (1 − γ)bT −1
The equation says that bT equals a fraction γ of [lT − lT −1 ], an estimate of the difference between the levels of the time series in periods T and T − 1, plus a fraction (1 − γ) of bT −1 , the estimate of the trend in time period (T − 1).
Chapter 17: Forecast Error Comparisons (17.7)
• Forecast Error: et = yt − ŷt . In general, we want the forecast model with the smallest forecast error.
– Mean Absolute Deviation (MAD): $\frac{\sum |e_t|}{n} = \frac{\sum |y_t - \hat{y}_t|}{n}$
– Mean Squared Deviation (MSD): $\frac{\sum e_t^2}{n} = \frac{\sum (y_t - \hat{y}_t)^2}{n}$
– Mean Absolute Percentage Error (MAPE): $\frac{\sum\left|\frac{y_t - \hat{y}_t}{y_t}\right|}{n} \times 100\%$
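A minimal Python sketch of the three error measures (hypothetical actuals and forecasts):

import numpy as np

def forecast_errors(y, y_hat):
    """MAD, MSD, and MAPE for actuals y and forecasts y_hat."""
    e = y - y_hat
    mad = np.mean(np.abs(e))
    msd = np.mean(e**2)
    mape = np.mean(np.abs(e / y)) * 100   # assumes no y_t equals zero
    return mad, msd, mape

y = np.array([100, 105, 98, 110], dtype=float)       # hypothetical actuals
y_hat = np.array([102, 103, 101, 108], dtype=float)  # hypothetical forecasts
print(forecast_errors(y, y_hat))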