T2.Statistics Review (Stock & Watson)
Quantitative Analysis:
Statistics review,
Chapter 3 of Stock and Watson
FRM 2012 Practice Questions
Table of Contents
Key Ideas from Stock & Watson Chapter 3
Question 208: Sample mean estimators (Stock & Watson)
Question 209: T-statistic and confidence interval
Question 210: Hypothesis testing (Stock & Watson)
Question 211: Type I and II errors and p-value (Stock & Watson)
Question 212: Difference between two means
Question 213: Sample variance, covariance and correlation (Stock & Watson)
www.bionicturtle.com
We mostly do not observe population parameters; instead we infer them from sample
estimates, which are the values produced by estimators such as the sample mean and
sample variance. An estimator is a recipe for obtaining an estimate of a population parameter.
The t-statistic tests the null hypothesis that the population mean equals a certain value.
If the sample size (n) is large (e.g., greater than 30), the t-statistic has an approximately
standard normal sampling distribution when the null hypothesis is true.
The p-value is the smallest significance level at which the null can be rejected.
If the p-value is very small (e.g., 0.00x), reject the null. If the p-value is large
(e.g., 0.19 or 19%), accept (fail to reject) the null.
You will NOT be asked, on the FRM, to calculate a p-value (e.g., you cannot derive
it on the TI BA II+ or HP 12c). You may be asked to interpret a given p-value.
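A 95% confidence interval for $\mu_Y$ is an interval constructed so that it contains the true value
of $\mu_Y$ in 95% of all possible samples:
95% confidence interval for $\mu_Y$: $\bar{Y} \pm 1.96 \, SE(\bar{Y})$
99% confidence interval for $\mu_Y$: $\bar{Y} \pm 2.58 \, SE(\bar{Y})$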
Sample covariance:
$s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$
Sample correlation:
$r_{XY} = \frac{s_{XY}}{s_X s_Y}$
Correlation (X,Y)
= covariance (X,Y) / ([Std Deviation(X)] * [Std Deviation(Y)])
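As a rough illustration of where the 1.96 and 2.58 confidence-interval multipliers come from, the sketch below recovers them from the standard normal distribution using only Python's standard library. The helper name two_sided_critical_z is an assumption made for this example; it is not part of the source material.

```python
from statistics import NormalDist


def two_sided_critical_z(confidence: float) -> float:
    """Return z such that P(-z <= Z <= z) = confidence for a standard normal Z."""
    tail = (1.0 - confidence) / 2.0          # probability left in EACH tail
    return NormalDist().inv_cdf(1.0 - tail)  # quantile that leaves `tail` above it


z90 = two_sided_critical_z(0.90)  # roughly 1.645
z95 = two_sided_critical_z(0.95)  # roughly 1.96
z99 = two_sided_critical_z(0.99)  # roughly 2.58
```

The same deviates reappear throughout the answers: 1.645 for a 90% two-sided interval (or a 5% one-sided test), 1.96 for 95% two-sided, and 2.58 for 99% two-sided.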
Answers:
208.1. B. Standard error of the sample mean is 1.033
Sample mean = 7.
Sample variance = [(3-7)^2 + (6-7)^2 + (6-7)^2 + (8-7)^2 + (9-7)^2 + (10-7)^2] / (6-1)
= 32/5 = 6.40
Sample standard deviation = SQRT[sample variance] = 2.5298
Standard error = SQRT[sample variance/n] = SQRT(6.4/6) = sample standard deviation/SQRT(n)
= 2.5298/SQRT(6) = 1.033
In regard to (A), (C), and (D), each is TRUE.
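The 208.1 arithmetic can be re-checked with a few lines of Python. This is only a minimal sketch using the six observations from the answer above; the variable names are illustrative assumptions.

```python
import math
import statistics

sample = [3, 6, 6, 8, 9, 10]
n = len(sample)

sample_mean = statistics.mean(sample)          # 42/6 = 7
sample_variance = statistics.variance(sample)  # divides by (n - 1): 32/5 = 6.4
sample_std = math.sqrt(sample_variance)        # about 2.53
standard_error = sample_std / math.sqrt(n)     # same as sqrt(sample_variance / n)
```

Note that statistics.variance uses the (n - 1) divisor, which matches the sample variance used in the answer; statistics.pvariance would divide by n instead.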
208.2. D. Consistency says that, as the sample size increases, the estimator converges
toward the population mean; an estimator with a smaller variance is more EFFICIENT.
Unbiased says the E[estimator] = population parameter.
In regard to (A), (B) and (C), each is TRUE.
208.3. C. 0.56% (s.e.) and 1.07 t-statistic
The standard error of the sample mean = SQRT[1.6%*(1-1.6%)/500] = 0.56114%
The t-statistic or t-ratio = (1.6% - 1.0%)/0.56114% = 1.0692; i.e., the null hypothesis is that the
VaR model is accurate such that we expect p = 1.0%.
The p-value = 2*[1 - normal CDF (z = 1.0692)] = 2*[1-85.75%] = 28.5%; i.e., we would fail to
reject the null hypothesis that the VaR model is accurate. Put another way, an observed exception
rate of 1.6% is consistent with an accurate VaR model (i.e., 1%) due merely to random sampling
variation, as we could expect a deviation at least this large fully 28.5% of the time.
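The 208.3 steps (standard error of a Bernoulli sample proportion, t-statistic, two-sided p-value) can be sketched as follows. The large-sample normal approximation is assumed, as in the answer, and the variable names are hypothetical.

```python
import math
from statistics import NormalDist

n = 500          # number of observations
p_hat = 0.016    # observed exception rate (1.6%)
p_null = 0.010   # rate expected under the null of an accurate VaR model (1.0%)

# Standard error of a sample proportion, using the observed rate
standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)

# How many standard errors the observed rate lies from the hypothesized rate
t_stat = (p_hat - p_null) / standard_error

# Two-sided p-value under the standard normal approximation
p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
```

With these inputs the p-value comes out near 28.5%, well above any conventional significance level, so the null is not rejected.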
209.2. Over the last two years, a fund produced an average monthly return of +3.0% but with
monthly volatility of 10.0%. That is, assume the random sample size (n) is 24, with mean of 3.0%
and sigma of 10.0%. Are the returns statistically significant; in other words, can we decide the
true mean return is greater than zero with 95% confidence?
a)
b)
c)
d)
209.3. Assume the frequency of internal fraud (an operational risk event type) occurrences per
year is characterized by a Poisson distribution. Among a sample of 43 companies, the mean
frequency is 11.0 with a sample standard deviation of 4.0. What is the 90% confidence interval
of the population's mean frequency?
a) 10.0 to 12.0
b) 8.8 to 13.2
c) 7.5 to 14.5
d) Need more information (Poisson parameter)
Answers:
209.1. B. No, the t-statistic is only 1.08. For a large sample, the distribution is normally
approximated, such that at 5.0% two-tailed significance, we reject if the abs(t-statistic)
exceeds 1.96.
The standard error = SQRT(15%*85%/60) = 0.046098; please note: if you used
SQRT(10%*90%/60) for the standard error, that is not wrong, but also would not change the
conclusion as the t-statistic would be 1.29
The t statistic = (15%-10%)/0.046098 = 1.08;
The two-sided p value is 27.8%, but as the t statistic is well below 2.0, we cannot confidently
reject.
We don't really need the lookup table or a calculator: the t-statistic tells us that the observed
sample mean is only 1.08 standard deviations (standard errors) away from the hypothesized
population mean.
A two-tailed 90% confidence interval implies 1.64 standard errors, so this (72.2%; i.e., 100% less
the 27.8% p-value) is much less confident than even 90%.
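The 209.1 answer notes that the standard error can be built from either the sample proportion (15%) or the null proportion (10%). A small sketch comparing the two choices is below; the function name and the use_null_for_se flag are assumptions introduced for illustration only.

```python
import math


def proportion_t_stat(p_hat, p_null, n, use_null_for_se=False):
    """t-statistic for a sample proportion against a hypothesized proportion."""
    # Choose which proportion feeds the Bernoulli variance p * (1 - p)
    p_for_se = p_null if use_null_for_se else p_hat
    standard_error = math.sqrt(p_for_se * (1.0 - p_for_se) / n)
    return (p_hat - p_null) / standard_error


t_sample_se = proportion_t_stat(0.15, 0.10, 60)                      # about 1.08
t_null_se = proportion_t_stat(0.15, 0.10, 60, use_null_for_se=True)  # about 1.29

# Either way the statistic is below the 5% two-tailed critical value of 1.96
cannot_reject = abs(t_sample_se) < 1.96 and abs(t_null_se) < 1.96
```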
209.2. B. No, the t-statistic is 1.47
The standard error = 10%/SQRT(24) = 0.020412
The t statistic = (3.0% - 0%)/0.020412 = 1.47.
The one-tailed critical t, at 95% with 23 df, is 1.71; two-tailed is 2.07.
(even if we assume normal one-sided, the 95% critical Z is 1.645, of course.)
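A minimal sketch of the 209.2 test follows. Python's standard library has no Student's t distribution, so the one-tailed critical value (1.71 at 23 degrees of freedom) is hard-coded from the answer above rather than computed; the constant and variable names are illustrative assumptions.

```python
import math

n = 24                # monthly observations over two years
sample_mean = 0.03    # average monthly return
sample_std = 0.10     # monthly volatility
null_mean = 0.0       # null hypothesis: true mean return is zero

standard_error = sample_std / math.sqrt(n)
t_stat = (sample_mean - null_mean) / standard_error

# One-tailed 95% critical t with n - 1 = 23 df, taken from the text (not computed here)
CRITICAL_T_ONE_TAILED_23_DF = 1.71

reject_null = t_stat > CRITICAL_T_ONE_TAILED_23_DF
```

Because the t-statistic (about 1.47) falls short of the critical value, reject_null is False: the +3.0% average is not statistically distinguishable from zero at this confidence level.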
209.3. A. 10.0 to 12.0
The central limit theorem (CLT) says, if the sample is random (i.i.d.), the sampling distribution of
the sample mean tends toward the normal REGARDLESS of the underlying distribution!
The standard error = SQRT(4^2/43) = 4/SQRT(43) = 0.609994.
The 90% confidence interval = 11.0 +/- 1.645*0.609994 = 11.0 +/- 1.0 = 10.0 to 12.0
... did you realize that a 90% two-sided confidence INTERVAL implies the same deviate (1.645) as
a 95% one-sided deviate?
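The large-sample confidence interval in 209.3 can be sketched as a small helper. The function name mean_confidence_interval is an assumption for this example, and the normal approximation is justified by the CLT argument above.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sample_std, n, confidence):
    """Two-sided large-sample confidence interval for a population mean."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    standard_error = sample_std / math.sqrt(n)
    return sample_mean - z * standard_error, sample_mean + z * standard_error


# 209.3 inputs: 43 companies, mean frequency 11.0, sample standard deviation 4.0
lower, upper = mean_confidence_interval(11.0, 4.0, 43, 0.90)

# The 90% two-sided deviate equals the 95% one-sided deviate
two_sided_90 = NormalDist().inv_cdf(1.0 - 0.10 / 2.0)
one_sided_95 = NormalDist().inv_cdf(0.95)
```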
210.2. Analyst Jane is concerned that the average days sales outstanding (DSO) in her coverage
sector has increased above its historical average of 27.0 days (a lower number is better). From a
large sample of 36 companies, she computes a sample mean DSO of 29.0 days with sample
standard deviation of 7.0. Her one-sided alternative hypothesis is that DSO is greater than 27.0.
Does she reject the null?
a) No, do not reject one-sided null as the t-statistic is less than 1.65
b) No, do not reject one-sided null as the t-statistic is less than 1.96
c) Yes, do reject one-sided null as the t-statistic is greater than 1.65
d) Yes, do reject one-sided null as the t-statistic is greater than 1.96
210.3. The average capital ratio of a sample of 49 banks is 7.4% with a sample standard
deviation of 5.0%. What is the two-sided 95% confidence interval for the population's true
average capital ratio; i.e., the random interval that has a 95% probability of containing the
population mean?
a) 5.5% to 9.3%
b) 6.0% to 8.8%
c) 6.7% to 8.1%
d) 7.1% to 7.7%
Answers:
210.1. D. A reduction of 5% to 1% significance (becoming "more conservative") offers a
trade-off: the probability of erroneously rejecting a true null (Type I error) decreases but
the probability of erroneously accepting a false null (Type II error) increases.
In regard to (A), (B), and (C), each is true about a 5.0% significance test.
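To make the 210.1 trade-off concrete, the sketch below computes the Type II error probability of a one-sided (upper-tail) z-test at 5% and at 1% significance. It assumes a hypothetical alternative in which the true mean sits 2.0 standard errors above the null value and the standard error is known; both the shift and the function name are assumptions, not figures from the source.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def type_ii_probability(alpha, shift_in_standard_errors):
    """P(fail to reject | null is false) for a one-sided, upper-tail z-test."""
    critical_z = STANDARD_NORMAL.inv_cdf(1.0 - alpha)
    # Under the alternative the test statistic is centered on the shift,
    # so we fail to reject when it lands below the critical value.
    return STANDARD_NORMAL.cdf(critical_z - shift_in_standard_errors)


HYPOTHETICAL_SHIFT = 2.0  # assumed distance of the true mean from the null, in SEs

beta_at_5pct = type_ii_probability(0.05, HYPOTHETICAL_SHIFT)
beta_at_1pct = type_ii_probability(0.01, HYPOTHETICAL_SHIFT)
```

Under this assumed alternative, tightening the significance level from 5% to 1% lowers the Type I error probability but raises the Type II error probability (beta_at_1pct exceeds beta_at_5pct).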
210.2. C. Yes, do reject one-sided null as the t-statistic is greater than 1.65
Standard error = 7/SQRT(36) = 1.1667; t statistic = (29-27)/1.1667 = 1.71.
As 1.71 is greater than 1.645 (i.e., the critical value for a one-tailed 5% significance test of the
large sample MEAN), she rejects the null in favor of the alternative but accepts, conditional on a
true null, a 5.0% probability of making a Type I error.
210.3. B. 6.0% to 8.8%
Standard error = 5.0%/SQRT(49) = 0.7143%
Confidence interval = 7.4% +/- (0.7143% * 1.96) = 6.0% to 8.8%
Question 211: Type I and II errors and p-value (Stock & Watson)
AIM: Define, calculate and interpret type I and type II errors. Define and
interpret the p-value.
211.1. The Basel III market risk backtest requires a bank to observe the number of exceptions
(aka, exceedances; the number of days on which the VaR loss was exceeded) in order to infer
whether the bank's 99.0% 10-day value at risk (VaR) model is accurate. Even before the
impossible task of prediction, the observed sample is making an inference about a historical
population. The null hypothesis implicit in the Basel backtest is: H(0) = the VaR model is
accurate with 99.0% confidence. Therefore, the alternative H(A) = VaR model is accurate with
less than 99.0% confidence. Basel designed three "stoplight" zones to acknowledge the reality
that sampling is a statistical test which is necessarily error-prone: A "Green Zone" outcome is
when sufficiently few exceptions occur such that the decision should be to accept the model; a
"Red Zone" outcome is when too many exceptions occur such that the decision is to reject the
model as bad. Which of the following constitutes a Type II error?
a) A Red Zone outcome when the VaR model is 99.0% confident (i.e., gives 99.0% coverage)
b) A Green Zone outcome when the VaR model is, for example, only 97.0% confident
c) A Green Zone outcome when the VaR model is 99.0% confident
d) The Type II error cannot occur in the backtest
211.2. A sample of 144 has a sample mean of 112.9 with sample standard deviation of 60. The
null hypothesis is that the true population mean is 100.0; the two-sided alternative hypothesis is
that the true population mean is different than 100.0. What is the two-sided p-value?
a) 0.5%
b) 1.0%
c) 2.5%
d) 5.0%
211.3. Your colleague Mary believes that FRM scores are correlated with work experience.
Somehow she got hold of data and produced a linear regression, FRMScore = intercept +
slope*YearExperience, such that the p-value of the slope coefficient is 1.7% (0.017). Per common
practice, the significance test starts with a null hypothesis that the slope is zero, and the
two-sided alternative hypothesis is that the slope is non-zero. Which of the following is a correct
interpretation of the p-value?
a) If her prespecified significance level is 1.0%, she rejects the null and deems the true
slope to be non-zero
b) If her prespecified significance level is 5.0%, she rejects the null and deems the true
slope to be non-zero
c) The probability that the slope is zero is 1.7%
d) The probability that the slope is non-zero is 98.3%
Answers:
211.1. B. A Green Zone outcome when the VaR model is, for example, only 97.0%
confident
The null is: the 99.0% VaR model is accurate.
A Type I error is to mistakenly reject a true null.
A Type II error is to mistakenly accept a false null; in this case, to mistakenly decide the model is
good (Green Zone) when the model is less than 99.0% confident.
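The two error types in 211.1 can be illustrated with binomial probabilities. The sketch below ASSUMES a 250-day backtest window, a Green Zone of 0 to 4 exceptions, and a Red Zone of 10 or more exceptions; these thresholds are not stated in the question and are supplied here only for illustration, as is the helper name binomial_cdf.

```python
import math


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


TRADING_DAYS = 250   # assumed backtest window
GREEN_ZONE_MAX = 4   # assumed: at most 4 exceptions is a Green Zone outcome
RED_ZONE_MIN = 10    # assumed: 10 or more exceptions is a Red Zone outcome

# Type I error: Red Zone even though the model is accurate (1% exception rate)
type_i = 1.0 - binomial_cdf(RED_ZONE_MIN - 1, TRADING_DAYS, 0.01)

# Type II error: Green Zone even though true coverage is only 97% (3% exception rate)
type_ii = binomial_cdf(GREEN_ZONE_MAX, TRADING_DAYS, 0.03)
```

Under these assumed thresholds, the Type I probability is very small while the Type II probability is material, which is the sense in which the backtest is "necessarily error-prone."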
211.2. B. 1.0%
The standard error = 60/SQRT(144) = 60/12 = 5.0.
The t-statistic = (112.9 - 100) / 5.0 = 2.58;
This is the deviate that corresponds to a 1.0% probability under a two-tailed test; put another
way, the area under the normal curve, to the left of +2.58 sigmas is 99.5%, such that the area
under both left & right tails is 1.0%.
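The 211.2 calculation can be sketched in a few lines, again relying on the large-sample normal approximation. Variable names are illustrative assumptions.

```python
import math
from statistics import NormalDist

n = 144
sample_mean = 112.9
sample_std = 60.0
null_mean = 100.0

standard_error = sample_std / math.sqrt(n)          # 60 / 12 = 5.0
t_stat = (sample_mean - null_mean) / standard_error  # 12.9 / 5.0 = 2.58

area_to_left = NormalDist().cdf(t_stat)   # about 99.5%
p_value = 2.0 * (1.0 - area_to_left)      # both tails combined, about 1.0%
```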
211.3. B. If her prespecified significance level is 5.0%, she rejects the null and deems the
true slope to be non-zero
The prespecified significance level is 5.0%, which implies a 5.0% rejection region (2.5% in each
tail); i.e., if the null is true, 5.0% chance of mistakenly rejecting (Type I error).
A p-value of 1.7% is the "exact significance level:" we can reject at higher significance levels (e.g.,
5%) but we cannot reject at any lower significance levels (e.g., 1.0%)
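The decision rule in 211.3 reduces to a single comparison between the p-value and the prespecified significance level. A minimal sketch, with the function name reject_null as an illustrative assumption:

```python
P_VALUE = 0.017  # p-value of the slope coefficient, from the question


def reject_null(p_value, significance_level):
    """Reject the null only if the p-value is below the prespecified level."""
    return p_value < significance_level


# Evaluate the rule at the two significance levels offered in the choices
decisions = {alpha: reject_null(P_VALUE, alpha) for alpha in (0.01, 0.05)}
```

Here decisions maps 0.01 to False and 0.05 to True, matching choices (a) and (b): the null is rejected at 5.0% but not at 1.0%.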
212.2. The average hourly earnings among a sample of 1,500 men is $22.00 with a sample
standard deviation of $9.00. The average hourly earnings among a sample of 1,000 women is
$20.00 with a sample standard deviation of $6.00. What is the 95% confidence interval for the
(two-sided) difference in average earnings between men and women?
a) $0.04 to $3.96
b) $1.41 to $2.59
c) $1.70 to $2.30
d) $1.83 to $2.17
212.3. A credit rating agency wants to compare the difference in default rates between
structured notes in two speculative rating categories: SF B versus SF CCC. The default rate
among a sample of 1,800 SF B-rated obligors was 5.0%, compared to a default rate of 8.0% among a
sample of 1,000 SF CCC-rated obligors. Default is characterized by a Bernoulli random
variable. What is the 95% confidence interval for the difference in default rates?
a) 2.97% to 3.04%
b) 2.11% to 3.89%
c) 1.75% to 4.25%
d) 1.04% to 4.96%
Answers:
212.1. C. Yes, the t-statistic is 2.148; i.e., reject the null.
SE [avg(A) - avg(B)] = SQRT(3%^2/60 + 2%^2/60) = 0.4655%
t-statistic = [(3% - 2%) - 0]/0.4655% = 2.148
The one-sided critical value is 1.64 (note we would also reject at the two-sided critical value of
1.96)
212.2. B. $1.41 to $2.59
SE [avg(men) - avg(women)] = SQRT(9^2/1,500 + 6^2/1,000) = 0.300;
95% CI = $2.00 +/- 1.96*0.30 = $1.41 to $2.59
212.3. D. 1.04% to 4.96%
SE (difference in default rate) = SQRT(8%*92%/1,000 + 5%*95%/1,800) = 1.0%
95% CI = 3.0% +/- 1.96*1.0% = 1.04% to 4.96%
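Both 212.2 and 212.3 use the same recipe: the variance of a difference between two independent sample means is the sum of the two squared standard errors. A minimal sketch follows; the helper name diff_confidence_interval and its argument names are assumptions for illustration.

```python
import math


def diff_confidence_interval(mean_a, var_a, n_a, mean_b, var_b, n_b, z=1.96):
    """Large-sample CI for (mean_a - mean_b), assuming independent samples."""
    difference = mean_a - mean_b
    standard_error = math.sqrt(var_a / n_a + var_b / n_b)
    return difference - z * standard_error, difference + z * standard_error


# 212.2: hourly earnings, men versus women (variance = standard deviation squared)
earnings_ci = diff_confidence_interval(22.0, 9.0**2, 1500, 20.0, 6.0**2, 1000)

# 212.3: default rates, SF CCC versus SF B (Bernoulli variance = p * (1 - p))
default_ci = diff_confidence_interval(0.08, 0.08 * 0.92, 1000, 0.05, 0.05 * 0.95, 1800)
```

For the Bernoulli case the per-observation variance is p*(1-p), which is why no separate standard deviation is given in 212.3.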
Question 213: Sample variance, covariance and correlation (Stock & Watson)
AIMs: Define, calculate, and interpret the sample variance, sample
standard deviation, and standard error. Define, describe, and interpret
the sample covariance and correlation.
213.1. Consider the following five (X,Y) data points: (1,5), (2,4), (3,3), (4,2), (5,1). What is the
sample standard deviation of (X)?
a) 0.97
b) 1.58
c) 2.00
d) 2.50
213.2. What is the sample covariance of the following five (X,Y) data points: (1,5), (2,4), (3,3),
(4,2), (5,1).
a) -4.00
b) -2.50
c) -1.50
d) -1.00
213.3. Let Y(i) be the sample set of seven (n = 7) observations: 2, 3, 5, 6, 9, 11 and 13. What is the
standard error of the sample average, SE(sample average Y)?
a) 1.56
b) 1.68
c) 4.12
d) 7.00
213.4. What is the sample correlation of the following five (X,Y) data points: (2,4), (3,1), (5,3),
(7,7), (13,9)?
a) 0.511
b) 0.667
c) 0.744
d) 0.862
Answers:
213.1. B. 1.58
The average of each (X) and (Y) is 3, such that:
Standard Deviation (X) = Standard Deviation (Y) = SQRT([(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2
+ (5-3)^2]/4) = SQRT([(-2)^2+(-1)^2+0^2+1^2+2^2]/4) = SQRT(10/4) = 1.58114
... note per a SAMPLE variance we divide by (n-1) or 4
213.2. B. -2.50
Sample covariance = [(1-3)(5-3) + (2-3)(4-3) + (3-3)(3-3) + (4-3)(2-3) + (5-3)(1-3)]/(5-1);
Sample covariance = [-2*2 + (-1)*1 + 0*0 + 1*-1 + 2*-2]/4 = [-4 - 1 + 0 - 1 - 4]/4 = -10/4 = -2.50
213.3. A. 1.56
As the average = 49/7 = 7.0, the sample variance = [(2-7)^2 + (3-7)^2 + (5-7)^2 + (6-7)^2 +
(9-7)^2 + (11-7)^2 + (13-7)^2]/(7-1) = 102/6 = 17.0;
Sample standard deviation = SQRT(17) = 4.1231;
Standard error (sample average Y) = SQRT(17/7) = 1.558387
213.4. D. 0.862
Sample Variance (X) = 19.0;
Sample Variance (Y) = 10.2;
Sample Covariance (X,Y) = 12.0;
Sample correlation (X,Y) = Sample Covariance (X,Y) / (SQRT[Sample Variance (X)] *
SQRT[Sample Variance (Y)]) = 12.0 / [SQRT(19)*SQRT(10.2)] = 0.86199
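The sample covariance and correlation used in 213.2 and 213.4 can be sketched directly from their definitions, with the (n - 1) divisor throughout. The function names below are illustrative assumptions; the variance of a series is obtained as its covariance with itself.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance with the (n - 1) divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Sample covariance divided by the product of sample standard deviations."""
    std_x = math.sqrt(sample_covariance(xs, xs))  # variance is covariance with itself
    std_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (std_x * std_y)


# 213.4 data points: (2,4), (3,1), (5,3), (7,7), (13,9)
xs = [2, 3, 5, 7, 13]
ys = [4, 1, 3, 7, 9]
correlation = sample_correlation(xs, ys)
```

Applied to the 213.2 data, (1,5) through (5,1), the same functions give a covariance of -2.50 and a correlation of -1.0, since those points lie exactly on a downward-sloping line.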