Week 3 - The SLRM (2) - Updated PDF
An introduction to statistical inference
• The population values that describe the true relationship between the variables
are of more interest than the sample estimates, but they are never available.
• We want to make inferences about the likely population values from the
regression parameters.
Example: Suppose we have the following regression results:
ŷt = 20.3 + 0.5091xt
     (14.38)  (0.2561)
(standard errors in parentheses)
• β̂ = 0.5091 is a single (point) estimate of the unknown population parameter, β.
How “reliable” is this estimate?
• The reliability of the point estimate is measured by the coefficient’s standard
error.
• The information from one or more of the sample coefficients and their standard
errors can be used to make inferences about the population parameters.
Hypothesis testing: Some concepts
• We will always have two hypotheses that go together, the null hypothesis
(denoted H0) and the alternative hypothesis (denoted H1).
• The null hypothesis is the statement or the statistical hypothesis that is actually
being tested. The alternative hypothesis represents the remaining outcomes of
interest.
• For example, suppose given the regression results above, we are interested in
the hypothesis that the true value of β is in fact 0.5. We would use the notation
H0 : β = 0.5
H1 : β ≠ 0.5
– This states that the hypothesis that the true but unknown value of β could be 0.5 is
being tested against an alternative hypothesis where β is not 0.5.
– This would be known as a two-sided test, since the outcomes of both β < 0.5 and
β > 0.5 are subsumed under the alternative hypothesis.
One-sided hypothesis tests
• In very general terms, if the estimated value is a long way away from the
hypothesised value, the null hypothesis is likely to be rejected.
• If the value under the null hypothesis and the estimated value are close to
one another, the null hypothesis is less likely to be rejected.
• What is required now is a statistical decision rule that will permit the
formal testing of such hypotheses.
The probability distribution of the least squares estimators
Probability Distribution of yt
• Mean
E(yt) = E(α + βxt + ut)
      = E(α + βxt) + E(ut)
      = α + βxt   [∵ E(ut) = 0, and α, β and xt are non-stochastic, so the expectation operator leaves them unchanged]
• Variance
Var(yt) = E[yt − E(yt)]²
        = E(ut²)   [∵ yt − E(yt) = ut]
        = σ²
∴ Probability distribution of yt:
yt ~ N(α + βxt, σ²)
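The result yt ~ N(α + βxt, σ²) can be checked by simulation; a minimal sketch, where all the parameter values below (α = 2, β = 0.5, σ = 1, xt = 3) are assumed purely for illustration:

```python
import numpy as np

# Assumed illustrative values: alpha = 2, beta = 0.5, sigma = 1, x_t = 3.
rng = np.random.default_rng(42)
alpha, beta, sigma = 2.0, 0.5, 1.0
x_fixed = 3.0                    # x_t is treated as non-stochastic in the CLRM
reps = 200_000
y = alpha + beta * x_fixed + rng.normal(0.0, sigma, size=reps)

mean_y = y.mean()                # should be close to alpha + beta*x_fixed = 3.5
var_y = y.var()                  # should be close to sigma**2 = 1.0
```

The simulated mean and variance of yt match α + βxt and σ², as the derivation above predicts.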
The probability distribution of the least squares estimators
α̂ ~ N(α, σ²(1/n + X̄²/Σxt²))   and   β̂ ~ N(β, σ²/Σxt²)
where xt = Xt − X̄ denotes the deviation of Xt from its sample mean.
• To simplify, these become:
α̂ ~ N(α, Var(α̂))   and   β̂ ~ N(β, Var(β̂))
• However, these results are only directly usable when the variance of the disturbance
term (σ²) is known; in practice it is unknown and must be estimated as:
σ̂² = Σût² / (T − 2)
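A sketch of the estimator σ̂² = Σût² / (T − 2) on simulated data; the data-generating values (α = 1, β = 0.5, σ = 2, T = 100) are assumed for illustration:

```python
import numpy as np

# Simulate a SLRM with assumed true values alpha = 1, beta = 0.5, sigma = 2.
rng = np.random.default_rng(0)
T = 100
x = rng.uniform(0, 10, size=T)
y = 1.0 + 0.5 * x + rng.normal(0, 2.0, size=T)

# OLS estimates of slope and intercept
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

u_hat = y - (alpha_hat + beta_hat * x)     # residuals (sum to zero by construction)
sigma2_hat = np.sum(u_hat ** 2) / (T - 2)  # divide by T - 2, not T
```

Dividing by T − 2 rather than T corrects for the two parameters (α and β) estimated from the same sample.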
• What if the errors are not normally distributed? Will the parameter estimates still be
normally distributed?
• Yes, if the other assumptions of the CLRM hold, and the sample size is sufficiently
large.
The probability distribution of the least squares estimators
(α̂ − α)/√Var(α̂) ~ N(0, 1)   and   (β̂ − β)/√Var(β̂) ~ N(0, 1)
• Replacing the true variances with their estimates:
(α̂ − α)/SE(α̂) ~ t(T−2)   and   (β̂ − β)/SE(β̂) ~ t(T−2)
• The standardised statistics follow a t-distribution with T − 2 degrees
of freedom rather than a normal distribution.
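A sketch of the standardised statistic (β̂ − β)/SE(β̂) on simulated data; evaluated at the true β it behaves like a draw from a t-distribution with T − 2 degrees of freedom (the data-generating values below are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 50
beta_true = 0.5
x = rng.uniform(0, 10, size=T)
y = 1.0 + beta_true * x + rng.normal(0, 1.0, size=T)

sxx = np.sum((x - x.mean()) ** 2)
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / sxx
alpha_hat = y.mean() - beta_hat * x.mean()
u_hat = y - alpha_hat - beta_hat * x
sigma2_hat = np.sum(u_hat ** 2) / (T - 2)

se_beta = np.sqrt(sigma2_hat / sxx)        # SE(beta_hat)
t_stat = (beta_hat - beta_true) / se_beta  # ~ t with T - 2 df at the true beta
```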
A note on the t and the normal distribution
• You should all be familiar with the normal distribution and its characteristic
“bell” shape, and its symmetry around the mean (zero for a standard normal
distribution).
• We can scale a normal variate to have zero mean and unit variance by
subtracting its mean and dividing by its standard deviation.
• There is, however, a specific relationship between the t- and the standard
normal distribution. Both are symmetrical and centred on zero. The t-
distribution has another parameter, its degrees of freedom. We will always
know this (for the time being, it is the number of observations minus 2).
What does the t-distribution look like?
• t-distribution: looks similar to a normal distribution, but with fatter tails, and a smaller peak at the
mean.
• As the number of degrees of freedom for the t-distribution increases from 4 to 40, the critical
values fall substantially.
• This is represented by a gradual increase in the height of the distribution at the centre and a
reduction in the fatness of the tails as the number of degrees of freedom increases.
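The fall in critical values as the degrees of freedom increase can be checked numerically; a minimal sketch using scipy:

```python
from scipy.stats import norm, t

# 5% two-sided critical values: the upper 2.5% point of each distribution
crit_t4 = t.ppf(0.975, df=4)    # roughly 2.78
crit_t40 = t.ppf(0.975, df=40)  # roughly 2.02
crit_norm = norm.ppf(0.975)     # roughly 1.96

# As df grows, the t critical value falls towards the normal one.
```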
Comparing the t and the normal distribution
[Figure: the t-distribution plotted against the normal distribution]
The rejection region for a 1-sided test (upper tail)
H0 : β = β∗
H1 : β > β∗
[Figure: density f(x) with the 95% non-rejection region and the 5% rejection region in the upper tail]
The rejection region for a 1-sided test (lower tail)
H0 : β = β∗
H1 : β < β∗
[Figure: density f(x) with the 5% rejection region in the lower tail]
7. Finally perform the test. If the test statistic lies in the rejection
region then reject the null hypothesis (H0), else do not reject H0.
The test of significance approach: Drawing conclusions
• One potential problem with the use of a fixed (e.g., 5%) size of test is that if
the sample size is sufficiently large, any null hypothesis can be rejected.
– What happens is that the standard errors reduce as the sample size increases,
thus leading to an increase in the value of all t-test statistics.
– Some econometricians have suggested that a lower size of test (e.g. 1%) should
be used for large samples.
3. Use the t-tables to find the appropriate critical value, which will again have T-2
degrees of freedom.
4. Form the confidence interval: (β̂ − tcrit·SE(β̂), β̂ + tcrit·SE(β̂)).
5. Perform the test: If the hypothesised value of β (i.e., β*) lies outside the
confidence interval, then reject the null hypothesis that β = β*, otherwise do not
reject the null.
Confidence intervals vs tests of significance
• The first step is to obtain the critical value. We want tcrit = t20;5%
Determining the Rejection Region
[Figure: density f(x) with two-sided 5% rejection regions beyond the critical values −2.086 and +2.086]
Performing the test
• Note that we can test these with the confidence interval approach.
For interest (!), test
H0 : β = 0
vs. H1 : β ≠ 0
H0 : β = 2
vs. H1 : β ≠ 2
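A sketch of the confidence interval approach using the figures from the example regression (β̂ = 0.5091, SE(β̂) = 0.2561, tcrit = 2.086):

```python
# Example regression figures from the slides
beta_hat, se_beta, t_crit = 0.5091, 0.2561, 2.086

lower = beta_hat - t_crit * se_beta   # about -0.025
upper = beta_hat + t_crit * se_beta   # about  1.043

def reject(beta_star):
    """Reject H0: beta = beta_star iff beta_star lies outside the interval."""
    return not (lower <= beta_star <= upper)

# H0: beta = 1 and H0: beta = 0 cannot be rejected at 5%; H0: beta = 2 can.
```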
• For example, say we wanted to use a 10% size of test. Using the test of
significance approach,
test stat = (β̂ − β*) / SE(β̂) = (0.5091 − 1) / 0.2561 = −1.917
as above. The only thing that changes is the critical t-value.
Changing the size of the test: The new rejection regions
[Figure: density f(x) with two-sided 10% rejection regions beyond the critical values −1.725 and +1.725]
Changing the size of the test: The conclusion
t20;10% = 1.725. So now, as the test statistic lies in the rejection region, we would
reject H0.
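The calculation above can be sketched directly; the test statistic stays at −1.917 and only the critical value moves with the size of the test:

```python
# Figures from the example: beta_hat = 0.5091, H0: beta = 1, SE = 0.2561
beta_hat, beta_star, se_beta = 0.5091, 1.0, 0.2561
test_stat = (beta_hat - beta_star) / se_beta   # about -1.917

crit_5pct = 2.086    # t_{20;5%}, two-sided
crit_10pct = 1.725   # t_{20;10%}, two-sided

reject_at_5pct = abs(test_stat) > crit_5pct    # cannot reject at 5%
reject_at_10pct = abs(test_stat) > crit_10pct  # reject at 10%
```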
• Testing a number of different hypotheses is easier under the confidence interval
approach.
• Considering the effect of the size of the test on the conclusion is easier under the
test of significance approach.
• If we reject the null hypothesis at the 5% level, we say that the result of the
test is statistically significant.
• If the null hypothesis is not rejected, the result of the test is ‘not significant’,
or that it is ‘insignificant’.
• If the null hypothesis is rejected at the 1% level, the result is termed ‘highly
statistically significant’.
• Note that a statistically significant result may be of no practical significance.
– E.g., if a shipment of cans of beans is expected to weigh 450g per tin, but the
actual mean weight of some tins is 449g, the result may be highly statistically
significant but presumably nobody would care about 1g of beans.
– E.g., if the estimated beta for a stock under a CAPM regression is 1.05, and a null
hypothesis that β = 1 is rejected, the result will be statistically significant. But it
may be the case that a slightly higher beta will make no difference to an investor’s
choice as to whether to buy the stock or not. In that case, the result of the test was
statistically significant, but financially or practically insignificant.
The errors that we can make using hypothesis tests
• The probability of a type I error is just α, the significance level or size of test we chose. To
see this, recall what we said significance at the 5% level meant: it is only 5% likely that a
result as or more extreme than this could have occurred purely by chance.
• What happens if we reduce the size of the test (e.g., from a 5% test to a 1% test)? We
reduce the chances of making a type I error ... but we also reduce the probability that we
will reject the null hypothesis at all, so we increase the probability of a type II error:
Reduce the size of the test → a more strict criterion for rejection → the null hypothesis is rejected less often → less likely to falsely reject → lower chance of a type I error, but more likely to incorrectly fail to reject → higher chance of a type II error.
• So there is always a trade-off between type I and type II errors when choosing a
significance level. The only way we can reduce the chances of both is to increase the
sample size.
• In practice, type I errors are usually considered more serious and hence a small size of
test is usually chosen (5% or 1% are the most common).
A special type of hypothesis test: The t-ratio
Since the hypothesised value βi* is 0, the test statistic reduces to the t-ratio:
test stat = β̂i / SE(β̂i)
• If the variable is not ‘significant’, it means that while the estimated value of the
coefficient is not exactly zero (e.g. 1.10 in the example), the coefficient is
indistinguishable statistically from zero.
• If a zero were placed in the fitted equation instead of the estimated value, this
would mean that whatever happened to the value of that explanatory variable, the
dependent variable would be unaffected.
– The variable is not helping to explain variations in y, and it could therefore be removed from the
regression equation.
• It is worth noting that, for degrees of freedom greater than around 25, the 5%
two-sided critical value is approximately ±2.
• So, as a rule of thumb (i.e., a rough guide), the null hypothesis would be rejected if
the t-statistic exceeds 2 in absolute value.
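A sketch of the t-ratio and the rule of thumb, using the slope from the example regression (20 degrees of freedom assumed):

```python
from scipy.stats import t

beta_hat, se_beta, df = 0.5091, 0.2561, 20
t_ratio = beta_hat / se_beta      # about 1.988

crit_5pct = t.ppf(0.975, df)      # about 2.086, close to the rule-of-thumb 2
significant = abs(t_ratio) > crit_5pct   # not significant at the 5% level
```

Here the t-ratio (1.988) is just below both the exact critical value and the rule-of-thumb value of 2, so the slope is (marginally) insignificant at 5%.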
What does the t-ratio tell us?
• If we reject H0, we say that the result is significant. If the coefficient is not
“significant” (e.g., the intercept coefficient in the last regression), then it
means that the variable is not helping to explain variations in y. Variables that
are not significant are usually removed from the regression model.
• In practice there are good statistical reasons for always having a constant
even if it is not significant. Look at what happens if no intercept is included:
[Figure: yt plotted against xt, showing the fitted line forced through the origin when no intercept is included]
The exact significance Level or p-value
• If the test statistic is large in absolute value, the p-value will be small, and
vice versa. The p-value is the marginal significance level: the lowest size of test
at which the null hypothesis would (just) be rejected.
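A sketch of the p-value for the example test statistic (−1.917, with 20 degrees of freedom assumed):

```python
from scipy.stats import t

test_stat, df = -1.917, 20
p_value = 2 * t.sf(abs(test_stat), df)   # two-sided p-value

# The statistic lies between the 10% and 5% critical values (1.725 and 2.086),
# so the p-value lies between 0.05 and 0.10.
reject_at_5pct = p_value < 0.05    # no
reject_at_10pct = p_value < 0.10   # yes
```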
– If the p-value is smaller than the chosen significance level, reject H0; otherwise, do not reject H0.
Estimation of SLRM using Eviews