T2.Statistics Review (Stock & Watson)

P1.T2.
Quantitative Analysis:
Statistics review,
Chapter 3 of Stock and Watson
FRM 2012 Practice Questions
By David Harper, CFA FRM CIPM

www.bionicturtle.com
Table of Contents
Key Ideas from Stock & Watson Chapter 3
Question 208: Sample mean estimators (Stock & Watson)
Question 209: T-statistic and confidence interval
Question 210: Hypothesis testing (Stock & Watson)
Question 211: Type I and II errors and p-value (Stock & Watson)
Question 212: Difference between two means
Question 213: Sample variance, covariance and correlation (Stock & Watson)
FRM 2012 QUANT: STATISTICS REVIEW 1
Key Ideas from Stock & Watson Chapter 3
We mostly do not observe population parameters but instead infer them from sample
estimates, which are values given by estimators such as the sample mean and sample
variance. An estimator is a recipe for obtaining an estimate of a population parameter.
o The sample mean is BLUE: Best Linear Unbiased Estimator
The t-statistic tests the null hypothesis that the population mean equals a certain value.
o If the sample (n) is large (e.g., greater than 30), the t-statistic has an approximately
standard normal sampling distribution when the null hypothesis is true.
o A common test is to test the significance of a regression coefficient. While the
specifics vary, in many cases here the null is that the slope coefficient is zero.
The p-value is an exact (aka, marginal) significance level: it is the probability of
drawing a statistic at least as adverse to the null hypothesis as the one actually computed
(observed), assuming the null hypothesis is correct.
o The p-value is the smallest significance level at which the null can be rejected
o If the p-value is very small (e.g., 0.00x), reject the null. If the p-value is large
(e.g., 0.19 or 19%), accept (fail to reject) the null.
o You will NOT be asked, on the FRM, to calculate a p-value (e.g., you cannot derive
it on the TI BA II+ or HP 12c). You may be asked to interpret a given p-value.
A 95% confidence interval for $\mu_Y$ is an interval constructed so that it contains the true value
of $\mu_Y$ in 95% of all possible samples:
90% CI for $\mu_Y$ = $\bar{Y} \pm 1.64\,SE(\bar{Y})$
95% CI for $\mu_Y$ = $\bar{Y} \pm 1.96\,SE(\bar{Y})$
99% CI for $\mu_Y$ = $\bar{Y} \pm 2.58\,SE(\bar{Y})$
Where SE is the standard error = sample standard deviation / SQRT(n) =
SQRT(sample variance / n)
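The multipliers 1.64, 1.96 and 2.58 are rounded quantiles of the standard normal distribution. As a minimal sketch (the helper name `two_sided_z` is an illustrative assumption, not something from the notes), they can be recovered with Python's standard library:

```python
from statistics import NormalDist


def two_sided_z(confidence: float) -> float:
    """Standard normal multiplier for a two-sided confidence level."""
    tail = (1.0 - confidence) / 2.0           # probability left in each tail
    return NormalDist().inv_cdf(1.0 - tail)   # quantile of the standard normal


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CI multiplier: {two_sided_z(level):.3f}")
```

The unrounded values are slightly different from the table values (e.g., 1.645 rather than 1.64), which is why some answers in this set use 1.645.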
Sample covariance: $s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$
Sample correlation: $r_{XY} = \frac{s_{XY}}{s_X s_Y}$
Correlation (X,Y)
= covariance (X,Y) / ([Std Deviation(X)] * [Std Deviation(Y)])
Question 208: Sample mean estimators (Stock & Watson)

AIMs: Describe and interpret estimators of the sample mean and their
properties. Describe and interpret the least squares estimator. Describe
the properties of point estimators: Distinguish between unbiased and
biased estimators. Define an efficient estimator and consistent estimator.
208.1. A random sample, drawn from a population with unknown mean and variance, includes
the following six outcomes: 3, 6, 6, 8, 9, 10. Please note: "random sample" implies independent
and identically distributed (i.i.d.). Each of the following is TRUE except:
a) The sample variance is 6.40
b) The standard error of the sample mean is 2.61
c) The standard error of the sample mean is an estimator of the standard deviation of the
sample mean
d) The sample variance employs a degrees of freedom correction (n-1). However even for
this small sample, the standard error of the sample mean uses (n) or SQRT(n) in the
denominator and therefore does not itself employ a degrees of freedom correction.
208.2. We assume there is a population mean for the monthly return of hedge funds that employ
a certain strategy (e.g., market neutral funds in 2011). A sample of hedge fund returns is
collected and the sample mean return is +1.0% per month. Each of the following is TRUE except:
a) If the returns are not a random sample (i.e., are not i.i.d.), the sample mean may be a
biased estimator of the population mean
b) If the returns are a random sample (i.i.d.), the sample mean is the Best Linear Unbiased
Estimator (BLUE)
c) If the returns are a random sample (i.i.d.), the sample mean is the least squares estimator
d) If the returns are a random sample (i.i.d.), the property of consistency implies that the
variance of the sample mean is smaller than the variance of alternative estimators of the
population mean
208.3. A backtest of a 99.0% value at risk (VaR) model over two years observes 8 exceptions in
500 trading days; i.e., the VaR loss threshold was exceeded on 1.6% of the days but the model
was calibrated to expect losses in excess of the VaR for only 5 days (1.0%). Please note that we
assume exceptions (exceedences) are i.i.d. with a Bernoulli distribution. What is, respectively,
the standard error of the sample mean and the t-statistic? (Bonus for finding the p-value, which
cannot be done with most calculators)
a)
b)
c)
d)
0.39% (s.e.) and 0.88 t-statistic

Answers:
208.1. B. Standard error of the sample mean is 1.033
Sample mean = 42/6 = 7.
Sample variance = [(3-7)^2 + (6-7)^2 + (6-7)^2 + (8-7)^2 + (9-7)^2 + (10-7)^2] / (6-1)
= 32/5 = 6.40
Sample standard deviation = SQRT[sample variance] = 2.5298
Standard error = sample standard deviation/SQRT(n) = SQRT[sample variance/n] = SQRT(6.4/6)
= 1.033
In regard to (A), (C), and (D), each is TRUE.
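The arithmetic for 208.1 can be checked with a short script. This is only a sketch using Python's `statistics` module; the variable names are illustrative.

```python
import math
import statistics

sample = [3, 6, 6, 8, 9, 10]
n = len(sample)

mean = statistics.mean(sample)
variance = statistics.variance(sample)   # sample variance: divides by (n - 1)
std_dev = math.sqrt(variance)
std_error = std_dev / math.sqrt(n)       # divides by SQRT(n), no further df correction

print(f"mean = {mean}, variance = {variance:.2f}")
print(f"std dev = {std_dev:.4f}, standard error = {std_error:.4f}")
```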
208.2. D. Consistency says that, as the sample size increases, the estimator converges
toward the population mean; an estimator with a smaller variance is more EFFICIENT.
Unbiased says the E[estimator] = population parameter.
In regard to (A), (B) and (C), each is TRUE.
208.3. C. 0.56% (s.e.) and 1.07 t-statistic
The standard error of the sample mean = SQRT[1.6%*(1-1.6%)/500] = 0.56114%
The t-statistic or t-ratio = (1.6% - 1.0%)/0.56114% = 1.0692; i.e., the null hypothesis is that the
VaR model is accurate such that we expect p = 1.0%.
The p-value = 2*[1 - normal CDF (z = 1.0692)] = 2*[1-85.75%] = 28.5%; i.e., we would fail to
reject the null hypothesis that the VaR model is accurate. Put another way, 1.6% could exceed an
accurate VaR (i.e., 1%) due merely to random sampling variation, as we could expect this
outcome fully 28.5% of the time.
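The 208.3 calculation, including the bonus p-value, can be sketched as follows. The function name `backtest_stats` is an assumption made for illustration, and the p-value relies on the large-sample normal approximation used in the answer.

```python
import math
from statistics import NormalDist


def backtest_stats(exceptions: int, days: int, expected_rate: float):
    """Standard error, t-statistic and two-sided p-value for a VaR backtest."""
    p_hat = exceptions / days                         # observed exception rate
    se = math.sqrt(p_hat * (1.0 - p_hat) / days)      # Bernoulli standard error
    t_stat = (p_hat - expected_rate) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return se, t_stat, p_value


se, t_stat, p_value = backtest_stats(exceptions=8, days=500, expected_rate=0.01)
print(f"s.e. = {se:.4%}, t = {t_stat:.4f}, p-value = {p_value:.1%}")
```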
Question 209: T-statistic and confidence interval

AIMs: Define, interpret, and calculate the t-statistic. Define, calculate,
and interpret a confidence interval.
209.1. Nine (9) companies among a random sample of 60 companies defaulted. The companies
were each in the same highly speculative credit rating category: statistically, they represent a
random sample from the population of CCC-rated companies. The rating agency contends that
the historical (population) default rate for this category is 10.0%, in contrast to the 15.0%
default rate observed in the sample. Is there statistical evidence, with any high confidence, that
the true default rate is different than 10.0%; i.e., if the null hypothesis is that the true default rate
is 10.0%, can we reject the null?
a)
b)
c)
d)
No, the t-statistic is 0.39

Yes, the t-statistic is 1.74
209.2. Over the last two years, a fund produced an average monthly return of +3.0% but with
monthly volatility of 10.0%. That is, assume the random sample size (n) is 24, with mean of 3.0%
and sigma of 10.0%. Are the returns statistically significant; in other words, can we decide the
true mean return is greater than zero with 95% confidence?
a)
b)
c)
d)
No, the t-statistic is 0.85

209.3. Assume the frequency of internal fraud (an operational risk event type) occurrences per
year is characterized by a Poisson distribution. Among a sample of 43 companies, the mean
frequency is 11.0 with a sample standard deviation of 4.0. What is the 90% confidence interval
of the population's mean frequency?
a) 10.0 to 12.0
b) 8.8 to 13.2
c) 7.5 to 14.5
d) Need more information (Poisson parameter)
Answers:
209.1. B. No, the t-statistic is only 1.08. For a large sample, the distribution is normally
approximated, such that at 5.0% two-tailed significance, we reject if the abs(t-statistic)
exceeds 1.96.
The standard error = SQRT(15%*85%/60) = 0.046098; please note: if you used
SQRT(10%*90%/60) for the standard error, that is not wrong, but also would not change the
conclusion as the t-statistic would be 1.29
The t statistic = (15%-10%)/0.046098 = 1.08;
The two-sided p value is 27.8%, but as the t statistic is well below 2.0, we cannot confidently
reject.
We don't really need the lookup table or a calculator: the t-statistic tells us that the observed
sample mean is only 1.08 standard deviations (standard errors) away from the hypothesized
population mean.
A two-tailed 90% confidence interval implies 1.64 standard errors, so this (72.2%, i.e., 100% less
the 27.8% p-value) is much less confident than even 90%.
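The two standard-error choices discussed for 209.1 can be compared side by side. This is a minimal sketch; `proportion_t` and its `use_null_se` flag are hypothetical names introduced here.

```python
import math


def proportion_t(p_hat: float, p_null: float, n: int, use_null_se: bool = False) -> float:
    """t-statistic for a sample proportion against a hypothesized proportion."""
    p_for_se = p_null if use_null_se else p_hat       # which rate feeds the s.e.
    se = math.sqrt(p_for_se * (1.0 - p_for_se) / n)
    return (p_hat - p_null) / se


t_sample_se = proportion_t(9 / 60, 0.10, 60)                   # s.e. from the observed 15%
t_null_se = proportion_t(9 / 60, 0.10, 60, use_null_se=True)   # s.e. from the null 10%
print(f"t (sample s.e.) = {t_sample_se:.2f}, t (null s.e.) = {t_null_se:.2f}")
```

Either way the statistic stays below 1.96, so the decision not to reject is unchanged.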
209.2. B. No, the t-statistic is 1.47
The standard error = 10%/SQRT(24) = 0.020412
The t statistic = (3.0% - 0%)/0.020412 = 1.47.
The one-tailed critical t, at 95% with 23 df, is 1.71; two-tailed is 2.07.
(even if we assume normal one-sided, the 95% critical Z is 1.645, of course.)
209.3. A. 10.0 to 12.0
The central limit theorem (CLT) says, if the sample is random (i.i.d.), the sampling distribution of
the sample mean tends toward the normal REGARDLESS of the underlying distribution!
The standard error = SQRT(4^2/43) = 4/SQRT(43) = 0.609994.
The 90% confidence interval = 11.0 +/- 1.645*0.609994 = 11.0 +/- 1.0 = 10.0 to 12.0
... did you realize that a 90% two-sided confidence INTERVAL implies the same deviate (1.645) as
a 95% one-sided deviate?
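The 209.3 interval, and the equivalence between the 90% two-sided and 95% one-sided deviates, can be sketched with the standard library. The helper name `mean_confidence_interval` is an illustrative assumption, and the normal approximation is justified here by the CLT as stated above.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(mean: float, sd: float, n: int, confidence: float):
    """Two-sided large-sample confidence interval for a population mean."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    se = sd / math.sqrt(n)
    return mean - z * se, mean + z * se


low, high = mean_confidence_interval(mean=11.0, sd=4.0, n=43, confidence=0.90)
print(f"90% CI: {low:.1f} to {high:.1f}")

# The 90% two-sided deviate equals the 95% one-sided deviate.
z_two_sided_90 = NormalDist().inv_cdf(1.0 - 0.10 / 2.0)
z_one_sided_95 = NormalDist().inv_cdf(0.95)
print(f"{z_two_sided_90:.3f} vs {z_one_sided_95:.3f}")
```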
Question 210: Hypothesis testing (Stock & Watson)

AIMs: Explain and apply the process of hypothesis testing: Define and
interpret the null hypothesis and the alternative hypothesis; Distinguish
between one-sided and two-sided hypotheses; Describe the confidence
interval approach to hypothesis testing; Describe the test of significance
approach to hypothesis testing.
210.1. Your colleague Robert wants to conduct a statistical test to determine whether hedge
funds create alpha (i.e., excess return after attribution to all common factor exposure), on
average. His test collects a large sample (n>30) and he is computing the mean (average) excess
return such that both the central limit theorem (CLT) and the law of large numbers apply.
His null hypothesis is that, based on a sample of returns, the true (population) ex post realized
alpha is approximately zero; therefore, the alternative hypothesis is that the true mean is
non-zero and his two-tailed test allows for the possibility that funds destroy alpha via fees. He is
going to test his hypothesis with a prespecified significance level of 5.0%, per convention. In this
case, each of the following is true EXCEPT for:
a) He can conduct the test without computing a p-value
b) The probability of erroneously rejecting a true null hypothesis is 5.0%
c) He can reject the null if the t-statistic is greater than 1.96
d) If he reduces the significance level to 1.0%, he reduces the probability of erroneously
rejecting a false null
210.2. Analyst Jane is concerned that the average days sales outstanding (DSO) in her coverage
sector has increased above its historical average of 27.0 days (a lower number is better). From a
large sample of 36 companies, she computes a sample mean DSO of 29.0 days with sample
standard deviation of 7.0. Her one-sided alternative hypothesis is that DSO is greater than 27.0.
Does she reject the null?
a) No, do not reject one-sided null as the t-statistic is less than 1.65
b) No, do not reject one-sided null as the t-statistic is less than 1.96
c) Yes, do reject one-sided null as the t-statistic is greater than 1.65
d) Yes, do reject one-sided null as the t-statistic is greater than 1.96
210.3. The average capital ratio of a sample of 49 banks is 7.4% with a sample standard
deviation of 5.0%. What is the two-sided 95% confidence interval for the population's true
average capital ratio; i.e., the random interval that has a 95% probability of containing the
population mean?
a) 5.5% to 9.3%
b) 6.0% to 8.8%
c) 6.7% to 8.1%
d) 7.1% to 7.7%
Answers:
210.1. D. A reduction of 5% to 1% significance (becoming "more conservative") offers a
trade-off: the probability of erroneously rejecting a true null (Type I error) decreases but
the probability of erroneously accepting a false null (Type II error) increases.
In regard to (A), (B), and (C), each is true about a 5.0% significance test.
210.2. C. Yes, do reject one-sided null as the t-statistic is greater than 1.65
Standard error = 7/SQRT(36) = 1.1667; t statistic = (29-27)/1.1667 = 1.71.
As 1.71 is greater than 1.645 (i.e., the critical value for a one-tailed 5% significance test of the
large sample MEAN), she rejects the null in favor of the alternative but accepts, conditional on a
true null, a 5.0% probability of making a Type I error.
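The one-sided decision in 210.2 can be sketched as a small function. The name `one_sided_upper_test` is hypothetical, and the critical value comes from the standard normal because the sample is treated as large.

```python
import math
from statistics import NormalDist


def one_sided_upper_test(sample_mean: float, null_mean: float, sd: float,
                         n: int, alpha: float = 0.05):
    """Large-sample test of H0: mean = null_mean against H1: mean > null_mean."""
    se = sd / math.sqrt(n)
    t_stat = (sample_mean - null_mean) / se
    critical = NormalDist().inv_cdf(1.0 - alpha)   # entire alpha sits in the upper tail
    return t_stat, critical, t_stat > critical


t_stat, critical, reject = one_sided_upper_test(29.0, 27.0, 7.0, 36)
print(f"t = {t_stat:.2f}, critical = {critical:.3f}, reject null: {reject}")
```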
210.3. B. 6.0% to 8.8%
Standard error = 5.0%/SQRT(49) = 0.7143%
Confidence interval = 7.4% +/- (0.7143% * 1.96) = 6.0% to 8.8%
Question 211: Type I and II errors and p-value (Stock & Watson)
AIM: Define, calculate and interpret type I and type II errors. Define and
interpret the p-value.
211.1. The Basel III market risk backtest requires a bank to observe the number of exceptions
(aka, exceedences; the number of days on which the VaR loss was exceeded) in order to infer
whether the bank's 99.0% 10-day value at risk (VaR) model is accurate. Even before the
impossible task of prediction, the observed sample is making an inference about a historical
population. The null hypothesis implicit in the Basel backtest is: H(0) = the VaR model is
accurate with 99.0% confidence. Therefore, the alternative H(A) = VaR model is accurate with
less than 99.0% confidence. Basel designed three "stoplight" zones to acknowledge the reality
that sampling is a statistical test which is necessarily error-prone: A "Green Zone" outcome is
when sufficiently few exceptions occur such that the decision should be to accept the model; a
"Red Zone" outcome is when too many exceptions occur such that the decision is to reject the
model as bad. Which of the following constitutes a Type II error?
a) A Red Zone outcome when the VaR model is 99.0% confident (i.e., gives 99.0% coverage)
b) A Green Zone outcome when the VaR model is, for example, only 97.0% confident
c) A Green Zone outcome when the VaR model is 99.0% confident
d) The Type II error cannot occur in the backtest
211.2. A sample of 144 has a sample mean of 112.9 with sample standard deviation of 60. The
null hypothesis is that the true population mean is 100.0; the two-sided alternative hypothesis is
that the true population mean is different than 100.0. What is the two-sided p-value?
a) 0.5%
b) 1.0%
c) 2.5%
d) 5.0%
211.3. Your colleague Mary believes that FRM scores are correlated with work experience.
Somehow she got hold of data and produced a linear regression, FRMScore = intercept +
slope*YearExperience, such that the p-value of the slope coefficient is 1.7% (0.017). Per common
practice, the significance test starts with a null hypothesis that the slope is zero, and the
two-sided alternative hypothesis is that the slope is non-zero. Which of the following is a correct
interpretation of the p-value?
a) If her prespecified significance level is 1.0%, she rejects the null and deems the true
slope to be non-zero
b) If her prespecified significance level is 5.0%, she rejects the null and deems the true
slope to be non-zero
c) The probability that the slope is zero is 1.7%
d) The probability that the slope is non-zero is 98.3%
Answers:
211.1. B. A Green Zone outcome when the VaR model is, for example, only 97.0%
confident
The null is: the 99.0% VaR model is accurate.
A Type I error is to mistakenly reject a true null.
A Type II error is to mistakenly accept a false null; in this case, to mistakenly decide the model is
good (Green Zone) when the model is less than 99.0% confident.
211.2. B. 1.0%
The standard error = 60/SQRT(144) = 60/12 = 5.0.
The t-statistic = (112.9 - 100) / 5.0 = 2.58;
This is the deviate that corresponds to a 1.0% probability under a two-tailed test; put another
way, the area under the normal curve, to the left of +2.58 sigmas is 99.5%, such that the area
under both left & right tails is 1.0%.
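Although a calculator cannot produce it, the two-sided p-value in 211.2 is easy to verify in code. This is a sketch under the large-sample normal approximation; `two_sided_p_value` is an illustrative name.

```python
import math
from statistics import NormalDist


def two_sided_p_value(sample_mean: float, null_mean: float, sd: float, n: int) -> float:
    """Two-sided p-value for a large-sample test of the mean."""
    se = sd / math.sqrt(n)
    t_stat = (sample_mean - null_mean) / se
    # Area in both tails beyond |t| under the standard normal curve.
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


p_value = two_sided_p_value(sample_mean=112.9, null_mean=100.0, sd=60.0, n=144)
print(f"two-sided p-value = {p_value:.2%}")
```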
211.3. B. If her prespecified significance level is 5.0%, she rejects the null and deems the
true slope to be non-zero
The prespecified significance level is 5.0%, which implies a 5.0% rejection region (2.5% in each
tail); i.e., if the null is true, 5.0% chance of mistakenly rejecting (Type I error).
A p-value of 1.7% is the "exact significance level:" we can reject at higher significance levels (e.g.,
5%) but we cannot reject at any lower significance levels (e.g., 1.0%)
Question 212: Difference between two means

AIM: Perform and interpret hypothesis tests for the difference between
two means.
212.1. We want to decide whether the average arithmetic return of Fund A is better than the
average return of Fund B: the null hypothesis is that the true average difference is zero. For both
funds, our sample is 60 months. Over this sample, the average return of Fund A was 2.0% with a
sample standard deviation of 3.0%; the average return of Fund B was only 1.0% with sample
standard deviation of 2.0%. With 95% confidence, do we reject the null hypothesis (i.e., fail to
accept) and decide that the average return of Fund A was truly better?
a)
b)
c)
d)
No, the t-statistic is only 0.465

No, the t-statistic is only 1.465
212.2. The average hourly earnings among a sample of 1,500 men is $22.00 with a sample
standard deviation of $9.00. The average hourly earnings among a sample of 1,000 women is
$20.00 with a sample standard deviation of $6.00. What is the 95% confidence interval for the
(two-sided) difference in average earnings between men and women?
a) $0.04 to $3.96
b) $1.41 to $2.59
c) $1.70 to $2.30
d) $1.83 to $2.17
212.3. A credit rating agency wants to compare the difference in default rates between
structured notes in two speculative rating categories: SF B versus SF CCC. The default rate
among a sample of 1,800 SF B-rated obligors was 5.0%, while the default rate among a
sample of 1,000 SF CCC-rated obligors was 8.0%. Default is characterized by a Bernoulli random
variable. What is the 95% confidence interval for the difference in default rates?
a) 2.97% to 3.04%
b) 2.11% to 3.89%
c) 1.75% to 4.25%
d) 1.04% to 4.96%
Answers:
212.1. C. Yes, the t-statistic is 2.148; i.e., reject the null.
SE [avg(A) - avg(B)] = SQRT(3%^2/60 + 2%^2/60) = 0.4655%
t-statistic = [(2.0% - 1.0%) - 0]/0.4655% = 2.148
The one-sided critical value is 1.64 (note we would also reject at the two-sided critical value of
1.96)
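The 212.1 test statistic uses the sample means in the numerator and the sample standard deviations in the standard error. A minimal sketch, with `diff_means_t` as a hypothetical helper name:

```python
import math


def diff_means_t(mean_a: float, sd_a: float, n_a: int,
                 mean_b: float, sd_b: float, n_b: int,
                 null_diff: float = 0.0):
    """Standard error and t-statistic for the difference between two sample means."""
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    t_stat = ((mean_a - mean_b) - null_diff) / se
    return se, t_stat


# Fund A: mean 2.0%, s.d. 3.0%; Fund B: mean 1.0%, s.d. 2.0%; 60 months each.
se, t_stat = diff_means_t(0.02, 0.03, 60, 0.01, 0.02, 60)
print(f"s.e. = {se:.4%}, t = {t_stat:.3f}")
```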
212.2. B. $1.41 to $2.59
SE [avg(men) - avg(women)] = SQRT(9^2/1,500 + 6^2/1,000) = 0.300;
95% CI = $2.00 +/- 1.96*0.30 = $1.41 to $2.59
212.3. D. 1.04% to 4.96%
SE (difference in default rate) = SQRT(8%*92%/1,000 + 5%*95%/1,800) = 1.0%
95% CI = 3.0% +/- 1.96*1.0% = 1.04% to 4.96%
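The same standard-error logic applies to a difference in Bernoulli proportions, as in 212.3. The sketch below uses a hypothetical helper `diff_proportions_ci` and the large-sample normal approximation.

```python
import math
from statistics import NormalDist


def diff_proportions_ci(p1: float, n1: int, p2: float, n2: int,
                        confidence: float = 0.95):
    """Large-sample confidence interval for p1 - p2 (Bernoulli variables)."""
    se = math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# SF CCC: 8.0% of 1,000 obligors; SF B: 5.0% of 1,800 obligors.
low, high = diff_proportions_ci(0.08, 1000, 0.05, 1800)
print(f"95% CI: {low:.2%} to {high:.2%}")
```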
Question 213: Sample variance, covariance and correlation (Stock & Watson)
AIMs: Define, calculate, and interpret the sample variance, sample
standard deviation, and standard error. Define, describe, and interpret
the sample covariance and correlation.
213.1. Consider the following five (X,Y) data points: (1,5), (2,4), (3,3), (4,2), (5,1). What is the
sample standard deviation of (X)?
a) 0.97
b) 1.58
c) 2.00
d) 2.50
213.2. What is the sample covariance of the following five (X,Y) data points: (1,5), (2,4), (3,3),
(4,2), (5,1)?
a) -4.00
b) -2.50
c) -1.50
d) -1.00
213.3. Let Y(i) be the sample set of seven (n = 7) observations: 2, 3, 5, 6, 9, 11 and 13. What is the
standard error of the sample average, SE(sample average Y)?
a) 1.56
b) 1.68
c) 4.12
d) 7.00
213.4. What is the sample correlation of the following five (X,Y) data points: (2,4), (3,1), (5,3),
(7,7), (13,9)?
a) 0.511
b) 0.667
c) 0.744
d) 0.862
Answers:
213.1. B. 1.58
The average of each (X) and (Y) is 3, such that:
Standard Deviation (X) = Standard Deviation (Y) = SQRT([(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2
+ (5-3)^2]/4) = SQRT([(-2)^2+(-1)^2+0^2+1^2+2^2]/4) = SQRT(10/4) = 1.58114
... note per a SAMPLE variance we divide by (n-1) or 4
213.2. B. -2.50
Sample covariance = [(1-3)(5-3) + (2-3)(4-3) + (3-3)(3-3) + (4-3)(2-3) + (5-3)(1-3)]/(5-1);
Sample covariance = [-2*2 + (-1)*1 + 0*0 + 1*-1 + 2*-2]/4 = [-4 - 1 + 0 - 1 - 4]/4 = -10/4 = -2.50
213.3. A. 1.56
As the average = 49/7 = 7.0, the sample variance = [(2-7)^2 + (3-7)^2 + (5-7)^2 + (6-7)^2 +
(9-7)^2 + (11-7)^2 + (13-7)^2]/(7-1) = 102/6 = 17.0;
Sample standard deviation = SQRT(17) = 4.1231;
Standard error (sample average Y) = SQRT(17/7) = 1.558387
213.4. D. 0.862
Sample Variance (X) = 19.0;
Sample Variance (Y) = 10.2;
Sample Covariance (X,Y) = 12.0;
Sample correlation (X,Y) = Sample Covariance (X,Y) / (SQRT[Sample Variance (X)] *
SQRT[Sample Variance (Y)]) = 12.0 / [SQRT(19)*SQRT(10.2)] = 0.86199
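The sample covariance and correlation used in 213.1, 213.2 and 213.4 can be written out directly from their definitions. This is a sketch; the function names `sample_cov` and `sample_corr` are illustrative, and both divide by (n-1) as the answers do.

```python
import math


def sample_cov(xs, ys):
    """Sample covariance with the (n - 1) degrees-of-freedom correction."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_corr(xs, ys):
    """Sample correlation: covariance divided by the product of standard deviations."""
    # sample_cov(xs, xs) is the sample variance of xs.
    return sample_cov(xs, ys) / math.sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))


# 213.1 and 213.2: (1,5), (2,4), (3,3), (4,2), (5,1)
x1, y1 = [1, 2, 3, 4, 5], [5, 4, 3, 2, 1]
print(f"std dev X = {math.sqrt(sample_cov(x1, x1)):.5f}, cov = {sample_cov(x1, y1):.2f}")

# 213.4: (2,4), (3,1), (5,3), (7,7), (13,9)
x2, y2 = [2, 3, 5, 7, 13], [4, 1, 3, 7, 9]
print(f"correlation = {sample_corr(x2, y2):.5f}")
```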
