
MFIN 305: Quantitative Methods of Finance
A Brief Overview of the Classical Linear Regression Model
1. What is a Regression Model?
A regression model describes the relationship between a given variable (y) and one or more other variables (the x's)

Names for y          | Names for x's
Dependent variable   | Independent variables
Regressand           | Regressors
Effect variable      | Causal variables
Explained variable   | Explanatory variables
2. Regression Versus Correlation
• Correlation is the degree of linear association between 2 variables
• If we say y and x are correlated, it means that we are treating y and x in a
completely symmetrical way
• In regression, we treat the dependent variable (y) and the independent
variable(s) (the x's) very differently
• The 𝑦 variable is assumed to be random or “stochastic” (i.e., to have a
probability distribution). The x variables are, however, assumed to
have fixed (“non-stochastic”) values in repeated samples
3. First Step: Scatter Plot
[Figure: scatter plot of y against x]
4. Line of Best Fit
• The equation of a straight line is used to get the line of best fit:
y = α + βx   (a perfect fit)
• α represents the value of y when x is zero
• Interpretation of β: if x increases by 1, y will be expected, all else
being equal, to increase by β
• However, it is unrealistic that x alone explains y
4. Line of Best Fit
• "Line of best fit" on its own is:
  - too broad as a definition
  - open to disagreement about what "best" means
  ⇒ Imprecise
• We therefore choose the values of α and β that place the line as close as
possible to all the data points
4. Line of Best Fit
y_t = α + βx_t + u_t,   t = 1, 2, 3, …, T

where y_t is the dependent variable, x_t is the regressor, α and β are the
parameters, and u_t is the disturbance term

• α and β should minimize the (vertical) distance from the data points to
the fitted line
5. Ordinary Least Squares
• The most common method of fitting a line to the data is Ordinary
Least Squares (OLS)
• OLS is the workhorse of econometrics/data science!
• OLS entails measuring the vertical distance from a point to the line,
squaring it and then minimizing the sum of squares (hence “least
squares”)
5. Ordinary Least Squares
Let:
• y_t denote the actual data point
• ŷ_t denote the fitted value from the regression line: the value of y
predicted by the model for a given value of x
• ^ (a hat) denote an estimated or fitted value (any value that we calculate)
• û_t denote the residual: û_t = y_t − ŷ_t

Note: x_t is assumed to be non-stochastic (fixed) in repeated samples


5. Ordinary Least Squares
• OLS finds the α̂ and β̂ that minimize the sum of the û_t² (an optimization
problem)
• For five observations, minimizing the sum of squared distances means
minimizing:
û_1² + û_2² + û_3² + û_4² + û_5², i.e. Σ_{t=1}^5 û_t²
• Σ_{t=1}^5 û_t² is known as the residual sum of squares (RSS) or the sum of
squared residuals
6. Deriving the OLS Estimator
More generally, we solve:

min_{α̂, β̂} Σ_{t=1}^T û_t² = min_{α̂, β̂} Σ_{t=1}^T (y_t − ŷ_t)²

• Let L denote the RSS, also known as a loss function:

L = Σ_{t=1}^T (y_t − ŷ_t)² = Σ_{t=1}^T (y_t − α̂ − β̂x_t)²        (1)

• We minimize L with respect to α̂ and β̂ to find the values that give the
line that is closest to the data

Note (chain rule): f(x) = f(g(x)) → f′(x) = f′(g(x)) g′(x)
6. Deriving the OLS Estimator
• ∂L/∂α̂ = −2 Σ_{t=1}^T (y_t − α̂ − β̂x_t) = 0        (2)
• ∂L/∂β̂ = −2 Σ_{t=1}^T x_t (y_t − α̂ − β̂x_t) = 0    (3)

• From (2):  Σ (y_t − α̂ − β̂x_t) = 0                 (4)
             Σ y_t − Tα̂ − β̂ Σ x_t = 0               (5)
Use Σ y_t = Tȳ and Σ x_t = Tx̄ to write:
             Tȳ − Tα̂ − Tβ̂x̄ = 0                     (6)
Or           ȳ − α̂ − β̂x̄ = 0                        (7)
6. Deriving the OLS Estimator
• From (3):  Σ x_t (y_t − α̂ − β̂x_t) = 0             (8)
• From (7):  α̂ = ȳ − β̂x̄                            (9)
• Substituting (9) into (8) for α̂ yields:
Σ x_t (y_t − ȳ + β̂x̄ − β̂x_t) = 0                    (10)
Σ x_t y_t − ȳ Σ x_t + β̂x̄ Σ x_t − β̂ Σ x_t² = 0      (11)
Σ x_t y_t − Tx̄ȳ + β̂Tx̄² − β̂ Σ x_t² = 0             (12)
6. Deriving the OLS Estimator
• Rearrange for β̂:

β̂ (Tx̄² − Σ x_t²) = Tx̄ȳ − Σ x_t y_t                 (13)

• Line of best fit:

β̂ = (Σ x_t y_t − Tx̄ȳ) / (Σ x_t² − Tx̄²)   and   α̂ = ȳ − β̂x̄

• More intuitively:

β̂ = Σ (x_t − x̄)(y_t − ȳ) / Σ (x_t − x̄)²  =  sample cov(x, y) / sample variance of x
6. Deriving the OLS Estimator
• Note: the sample covariance is  σ_xy = Σ (x_t − x̄)(y_t − ȳ) / (T − 1)

and the sample correlation is  ρ_xy = Σ (x_t − x̄)(y_t − ȳ) / ((T − 1) σ_x σ_y) = σ_xy / (σ_x σ_y)

• From the equation for α̂, it is clear that the regression line goes through
the mean of the observations (i.e. the point (x̄, ȳ) lies on the regression
line)
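As a quick numerical check of the covariance/variance form, here is a minimal
Python sketch (the data are made up purely for illustration; statistics.covariance
needs Python 3.10+). The (T − 1) denominators of the sample covariance and
variance cancel in the ratio, so both routes give the same β̂:

```python
import statistics

# Hypothetical sample, purely for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

T = len(x)
x_bar, y_bar = statistics.mean(x), statistics.mean(y)

# beta-hat from the closed-form OLS solution
beta_hat = (sum(a * b for a, b in zip(x, y)) - T * x_bar * y_bar) / \
           (sum(a * a for a in x) - T * x_bar**2)

# The same number as sample cov(x, y) / sample var(x)
beta_cov = statistics.covariance(x, y) / statistics.variance(x)

alpha_hat = y_bar - beta_hat * x_bar
print(beta_hat, beta_cov, alpha_hat)   # both slope routes print 1.96
```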
7. What Do We Use α̂ and β̂ For?
Suppose that we have the following data on the excess returns on a fund
manager's portfolio ("fund XXX") together with the excess returns on a
market index:

Year t | Excess return = r_XXX,t − r_f,t | Excess return on market index = r_m,t − r_f,t
1      | 17.8                            | 13.7
2      | 39.0                            | 23.2
3      | 12.8                            | 6.9
4      | 24.2                            | 16.8
5      | 17.2                            | 12.3
7. What Do We Use α̂ and β̂ For?
[Figure: scatter plot of the excess return on fund XXX against the excess
return on the market portfolio]
7. What Do We Use α̂ and β̂ For?
• Using the data, calculate α̂ and β̂
• We get α̂ = −1.74 and β̂ = 1.64

• The fitted line is:

ŷ_t = −1.74 + 1.64 x_t

where ŷ_t is the (fitted) excess return on the fund and x_t is the excess
return on the market portfolio (r_m − r_f), or market risk premium
7. What Do We Use α̂ and β̂ For?
Suppose r_m − r_f (i.e. x_t) next year (year 6) is 20%. What would the
expected excess return on the fund be?

ŷ_t = −1.74 + 1.64 × 20 = 31.06

• β̂ = 1.64 is the fund's beta in a CAPM sense: it is greater than 1, so this
is a risky fund; the predicted excess return over the risk-free rate is 31.06%
• Interpretation of β̂: if x increases by 1 unit, y will be expected,
everything else being equal, to increase by 1.64 units
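To make the calculation concrete, here is a minimal Python sketch that
reproduces α̂, β̂ and the year-6 prediction from the table above (plain
Python, using only the slide data):

```python
# Data from the fund XXX example above
x = [13.7, 23.2, 6.9, 16.8, 12.3]   # excess return on market index
y = [17.8, 39.0, 12.8, 24.2, 17.2]  # excess return on fund XXX

T = len(x)
x_bar, y_bar = sum(x) / T, sum(y) / T

# Closed-form OLS estimates
beta_hat = (sum(a * b for a, b in zip(x, y)) - T * x_bar * y_bar) / \
           (sum(a * a for a in x) - T * x_bar**2)
alpha_hat = y_bar - beta_hat * x_bar

print(round(alpha_hat, 2), round(beta_hat, 2))   # -1.74 1.64
pred = alpha_hat + beta_hat * 20
print(round(pred, 2))   # about 31.1; the slide's 31.06 uses the rounded
                        # coefficients: -1.74 + 1.64 * 20
```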
8. The Population and the Sample
Population:
• The total collection of all objects or people to be studied
• May be either finite or infinite
• Example: all stocks on the LSE

Sample:
• Some items selected from the population
• Usually random and representative of the population of interest
• Each individual in the population is equally likely to be drawn
• Size = number of available observations
9. The DGP, PRF and SRF
Data Generating Process (DGP): the process describing the way the actual
observations on y came about

Population Regression Function (PRF):
• A description of the model thought to be generating the actual data
• Represents the true relationship between the variables
• Embodies the true values of α and β:
y_t = α + βx_t + u_t

Sample Regression Function (SRF): the relationship estimated using the sample
observations:
ŷ_t = α̂ + β̂x_t
y_t = α̂ + β̂x_t + û_t = ŷ_t + û_t
where ŷ_t is the fitted value and û_t is the residual
10. Linearity
• Linearity assumption: in order to use OLS, a linear model is required
(i.e. the relationship between y and x is a straight line)
• The model must be linear in the parameters (𝛼 and 𝛽), but it does not
necessarily have to be linear in the variables (y and x)
• Some nonlinear models can be made to be linear
Y_t = A X_t^β e^(u_t)   (exponential regression model)
ln Y_t = ln A + β ln X_t + u_t  ⇒  y_t = α + βx_t + u_t
where y_t = ln Y_t, α = ln A and x_t = ln X_t
10. Linearity
• β is an elasticity (strictly, the effect of a unit change on a logarithmic
scale)
• Say β̂ = 1.2; then a rise in x of 1% will lead, on average, everything else
being equal, to a rise in y of 1.2%
• If the theory suggests that y and x should be inversely related, a model of
the form
y_t = α + β/x_t + u_t
can still be estimated by OLS by setting
z_t = 1/x_t
and regressing y_t on a constant and z_t
• Some models are intrinsically non-linear, e.g. y_t = α + β x_t^δ + u_t
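A minimal sketch of the transform-then-estimate idea for the exponential
model above (the data are hypothetical, invented for illustration only;
standard library only):

```python
import math

# Hypothetical data assumed to follow Y = A * X^beta * e^u (illustration only)
X = [1.2, 2.5, 3.1, 4.8, 6.0, 7.3]
Y = [2.9, 7.1, 9.2, 15.8, 20.4, 25.9]

# Transform: the model is linear in the parameters after taking logs
x = [math.log(v) for v in X]   # x_t = ln X_t
y = [math.log(v) for v in Y]   # y_t = ln Y_t

T = len(x)
x_bar, y_bar = sum(x) / T, sum(y) / T
beta_hat = (sum(a * b for a, b in zip(x, y)) - T * x_bar * y_bar) / \
           (sum(a * a for a in x) - T * x_bar**2)
alpha_hat = y_bar - beta_hat * x_bar     # estimate of ln A

print(beta_hat, math.exp(alpha_hat))     # elasticity estimate and estimate of A
```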
11. Estimator or Estimate?
• Estimator: the formula used to calculate the coefficients
• Estimate: the actual numerical values for the coefficients, obtained from
the sample
12. The Classical Linear Regression Model

𝑦𝑡 = 𝛼 + 𝛽𝑥𝑡 + 𝑢𝑡

• After assumptions are made about 𝑢𝑡 , this becomes the classical linear
regression model (CLRM)

• Assumptions are made about the disturbances u_t, not the residuals û_t
12. The Classical Linear Regression Model

Assumptions and their interpretation:
(1) E(u_t) = 0            The errors have zero mean
(2) var(u_t) = σ² < ∞     The variance of the errors is constant and finite
                          over all values of x_t
(3) cov(u_i, u_j) = 0     The errors are linearly independent of one another
(4) cov(u_t, x_t) = 0     There is no relationship between the error and the
                          corresponding x variate
12. The Classical Linear Regression Model
• If assumption (1) holds, assumption (4) can be written as E(x_t u_t) = 0
(i) This implies that the regressor is orthogonal to (i.e. unrelated to) the
error term
(ii) An alternative assumption to (4), which is slightly stronger, is that the
x_t are non-stochastic or fixed in repeated samples (i.e. there is no sampling
variation in x_t and its value is determined outside the model)

• A fifth assumption is required to make valid inferences about the population
parameters (α and β) from the sample parameters (α̂ and β̂):

(5) u_t ~ N(0, σ²) ⇒ u_t is normally distributed


13. Properties of the OLS Estimator
If assumptions (1)-(4) hold, the estimators α̂ and β̂ determined by OLS have
a number of desirable properties
Under assumptions (1)-(4), the OLS estimator is the best linear unbiased
estimator (BLUE)
• Best: the OLS estimator has minimum variance among the class of linear
unbiased estimators. The Gauss-Markov theorem proves that the OLS estimator
is best by examining an arbitrary alternative linear unbiased estimator and
showing that in all cases it must have a variance no smaller than that of the
OLS estimator
• Linear: α̂ and β̂ are linear estimators (linear combinations) of y
13. Properties of the OLS Estimator
• Unbiased: on average, the values of α̂ and β̂ will be equal to their true
values
• Estimator: α̂ and β̂ are estimators of the true values of α and β


13. Properties of the OLS Estimator
The OLS estimator is consistent, unbiased and efficient
Consistency
• β̂ is consistent if: lim_{T→∞} Pr(|β̂ − β| > δ) = 0  for all δ > 0
• As the sample size increases (tends to infinity), the estimates converge to
their true values
• Consistency is a large-sample, or asymptotic, property
• The assumptions E(x_t u_t) = 0 and E(u_t) = 0 are sufficient to derive
consistency
13. Properties of the OLS Estimator
[Figure: sampling distributions of β̂ tightening around the true value as T
grows from 100 to 200 to 300]
13. Properties of the OLS Estimator
Unbiasedness
OLS estimates are unbiased:
E(α̂) = α and E(β̂) = β
On average, the estimated values of the coefficients will be equal to their
true values
• There is no systematic over- or under-estimation of the true coefficients
• Proving unbiasedness requires that cov(x_t, u_t) = 0
• It is a stronger condition than consistency, since it holds for small as
well as large samples
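A small Monte Carlo sketch of what unbiasedness means in practice (the true
values α = 1 and β = 2, the error variance and the sample size are all made
up for illustration; standard library only): averaging β̂ across many
simulated samples should come out close to β.

```python
import random

random.seed(42)
ALPHA, BETA = 1.0, 2.0          # hypothetical true parameters
T, REPS = 50, 5000              # sample size and number of repeated samples

x = [random.uniform(0, 10) for _ in range(T)]   # fixed in repeated samples
x_bar = sum(x) / T
sxx = sum((a - x_bar) ** 2 for a in x)

beta_hats = []
for _ in range(REPS):
    # New draw of the disturbances each repetition: u_t ~ N(0, sigma^2)
    y = [ALPHA + BETA * a + random.gauss(0, 2) for a in x]
    y_bar = sum(y) / T
    b = sum((a - x_bar) * (v - y_bar) for a, v in zip(x, y)) / sxx
    beta_hats.append(b)

print(sum(beta_hats) / REPS)    # close to 2.0: no systematic bias
```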
13. Properties of the OLS Estimator
Efficiency
• An estimator of a parameter β is said to be efficient if no other unbiased
estimator has a smaller variance
• Efficiency minimizes the probability that the estimate will be far away
from the true value
• An efficient ("best") estimator's probability distribution is narrowly
dispersed around the true value
14. Precision and Standard Errors
• Any set of estimates α̂ and β̂ is specific to the sample used in its
estimation
• A different sample → different (x_t, y_t) → different α̂ and β̂
• We measure the reliability of α̂ and β̂: how good are these estimates?
• Precision (confidence in the estimates) → standard errors
• Would the estimates vary much from one sample to another (drawn from the
same population)?
• Precision can be calculated using the available data
14. Precision and Standard Errors
Given assumptions (1)-(4), valid estimators of the standard errors are given
by:

se(α̂) = s √( Σ x_t² / (T Σ (x_t − x̄)²) ) = s √( Σ x_t² / (T Σ x_t² − T²x̄²) )

se(β̂) = s √( 1 / Σ (x_t − x̄)² ) = s √( 1 / (Σ x_t² − Tx̄²) )

where s is the estimated standard deviation of the residuals


14. Precision and Standard Errors
• Standard errors give a measure of the degree of uncertainty in the
estimated values for the coefficients
• 𝑠 2 : estimate of the variance of the disturbance term
• 𝜎 2 : actual variance of the disturbance term
15. Estimating the Variance of u_t: σ²
σ² = var(u_t) = E[(u_t − E(u_t))²]
σ² = var(u_t) = E(u_t²)   (since E(u_t) = 0)
• u_t is unobservable (the population disturbances), thus:
s² = (1/T) Σ û_t²  would be a biased (though consistent) estimator of σ²
• Instead we use:
s² = Σ û_t² / (T − 2)  ⇒  s = √( Σ û_t² / (T − 2) )
s is known as the standard error of the regression or the standard error of
the estimate
• s is a broad measure of the fit of the regression equation
• The smaller s is, the closer is the fit of the line to the actual data
16. Comments on the Standard Error Estimators

se(α̂) = s √( Σ x_t² / (T Σ (x_t − x̄)²) ) = s √( Σ x_t² / (T(Σ x_t² − Tx̄²)) )

1. As the sample size T rises, se(β̂) and se(α̂) fall
2. As s² rises, se(β̂) and se(α̂) rise → the larger the residuals are, the
worse is the fit of the line
3. The larger Σ (x_t − x̄)² is, the smaller the coefficient variances
→ more variation in the x's around their mean is better
The Importance of the Deviation of x From Its Mean
[Figure: scatter plots contrasting low and high variation of x around its
mean]
Aside: Accuracy of the Intercept Estimate
[Figure: scatter plot illustrating the accuracy of the intercept estimate]
17. How to Calculate the Parameters and Standard Errors
Assume the following data have been calculated from a regression of y on a
single variable x and a constant over 22 observations:
Σ x_t y_t = 830102,  T = 22,  x̄ = 416.5,  ȳ = 86.65
Σ x_t² = 3919654,  RSS = 130.6

• β̂ = (Σ x_t y_t − Tx̄ȳ) / (Σ x_t² − Tx̄²)
    = (830102 − 22 × 416.5 × 86.65) / (3919654 − 22 × 416.5²) = 0.35

• α̂ = ȳ − β̂x̄ = 86.65 − 0.35 × 416.5 = −59.12

• Sample regression function: ŷ_t = α̂ + β̂x_t = −59.12 + 0.35 x_t
17. How to Calculate the Parameters and Standard Errors

• se(regression): s = √( Σ û_t² / (T − 2) ) = √(130.6 / 20) = 2.55

• se(α̂) = s √( Σ x_t² / (T(Σ x_t² − Tx̄²)) )
        = 2.55 × √( 3919654 / (22 × (3919654 − 22 × 416.5²)) ) = 3.35

• se(β̂) = s √( 1 / (Σ x_t² − Tx̄²) )
        = 2.55 × √( 1 / (3919654 − 22 × 416.5²) ) = 0.0079

• We write: ŷ_t = −59.12 + 0.35 x_t
                  (3.35)   (0.0079)
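A minimal Python sketch reproducing this worked example directly from the
summary statistics on the slide:

```python
from math import sqrt

# Summary statistics from the slide
sum_xy, T, x_bar, y_bar = 830102.0, 22, 416.5, 86.65
sum_x2, rss = 3919654.0, 130.6

sxx = sum_x2 - T * x_bar**2                  # sum of squared deviations of x
beta_hat = (sum_xy - T * x_bar * y_bar) / sxx
alpha_hat = y_bar - beta_hat * x_bar

s = sqrt(rss / (T - 2))                      # standard error of the regression
se_alpha = s * sqrt(sum_x2 / (T * sxx))
se_beta = s * sqrt(1 / sxx)

print(beta_hat, alpha_hat)   # about 0.35 and -59.1 (the slide rounds beta first)
print(s, se_alpha, se_beta)  # about 2.55, 3.35 and 0.0079
```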
18. An Introduction to Statistical Inference
• Financial theory will often suggest that certain coefficients should take on
particular values, or values within a given range
• α̂ and β̂ are sample point estimates of the true but unknown population
parameters α and β
• How reliable are these estimates? → se(α̂), se(β̂)
• α and β describe the true relationships between the variables, but they are
never available
• Inferences are made concerning the likely population values using the
regression parameters that have been estimated from the sample of data to hand
• The aim is to determine whether the differences between the coefficient
estimates actually obtained and the expectations arising from financial theory
are significant
19. Hypothesis Testing
• Null hypothesis (H0): the statement or statistical hypothesis being tested;
it is always an equality
• Alternative hypothesis (H1): the remaining outcomes of interest
• Hypothesis testing is always conducted on population values
• Two-sided test:  H0: β = 0.5  vs  H1: β ≠ 0.5
• One-sided test:  H0: β = 0.5  vs  H1: β > 0.5  (or  H1: β < 0.5)

Example motivation: a bank requires traders to keep their portfolio's beta
below 0.5 (financial theory)
19. Hypothesis Testing
There are two approaches to hypothesis testing at a given level of
significance: the test of significance approach and the confidence interval
approach
• Intuition: if β̂ = 0.5091 and the hypothesized value is 5, the null
hypothesis is likely to be rejected
20. Probability Distribution of Least Squares Estimators
• CLRM assumption: u_t ~ N(0, σ²)
• Since u_t is normally distributed, y_t will be normally distributed
• The OLS estimators are linear combinations of the random variables y_t:
β̂ = Σ w_t y_t, where the w_t are weights
Since a weighted sum of normal random variables is also normally distributed,
the OLS estimators will be normally distributed:
α̂ ~ N(α, var(α̂))  and  β̂ ~ N(β, var(β̂))
• Even if the errors do not follow a normal distribution, the coefficient
estimates will still follow a normal distribution approximately, in large
samples, by the central limit theorem (CLT)
20. Probability Distribution of Least Squares Estimators
• Standard normal variables can be constructed from α̂ and β̂ by subtracting
the mean and dividing by the square root of the variance:

(α̂ − α) / √var(α̂) ~ N(0,1)  and  (β̂ − β) / √var(β̂) ~ N(0,1)

• var(α̂) and var(β̂) are unknown, so their counterparts, the calculated
standard errors se(α̂) and se(β̂), are used instead:

(α̂ − α) / se(α̂) ~ t_{T−2}  and  (β̂ − β) / se(β̂) ~ t_{T−2}
21. A Note on the t and the Normal Distribution
• The normal distribution is bell-shaped and symmetric around its mean
• We can scale a normal variate to have zero mean and unit variance by
subtracting its mean and dividing by its standard deviation
• There is a specific relationship between the t-distribution and the standard
normal distribution, and the t-distribution has another parameter, its degrees
of freedom
[Figure: normal distribution and t-distribution plotted around the mean μ]
21. A Note on the t and the Normal Distribution
Normal distribution:
• Fully characterized by its first two moments (mean and variance)
• Symmetric and bell-shaped
• Can be scaled to have zero mean and unit variance

t-distribution:
• Depends on its degrees of freedom parameter
• Looks like the normal distribution (symmetric)
• Has fatter tails than the normal distribution and a smaller peak at the mean
21. A Note on the t and the Normal Distribution
In the limit, a t-distribution with an infinite number of degrees of freedom
is a standard normal, i.e. t_∞ = N(0,1)

Critical values (one tail):
Significance level | N(0,1) | t(40) | t(4)
50%                | 0      | 0     | 0
5%                 | 1.64   | 1.68  | 2.13
2.5%               | 1.96   | 2.02  | 2.78
0.5%               | 2.57   | 2.70  | 4.60
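The critical values in the table can be reproduced with scipy's inverse CDFs
(assuming scipy is installed):

```python
from scipy import stats

# One-tail critical values for the significance levels in the table
for level in [0.50, 0.05, 0.025, 0.005]:
    z = stats.norm.ppf(1 - level)
    t40 = stats.t.ppf(1 - level, df=40)
    t4 = stats.t.ppf(1 - level, df=4)
    # Note: the 0.5% normal value prints as 2.58 here (2.5758...);
    # the table's 2.57 is a rounding difference
    print(f"{level:>6.1%}  N(0,1)={z:5.2f}  t(40)={t40:5.2f}  t(4)={t4:5.2f}")
```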
22. The Test of Significance Approach
Assume the regression equation is given by: y_t = α + βx_t + u_t

(1) Estimate α̂, β̂ and se(α̂), se(β̂)
(2) State the null and alternative hypotheses: H0: β = β*  vs  H1: β ≠ β*
and compute the test statistic:

test statistic = (β̂ − β*) / se(β̂)

where β* is the value of β under the null hypothesis


22. The Test of Significance Approach
(3) Choose a significance level α, usually α = 5%
α is the probability of rejecting H0 when H0 is in fact true (i.e.
rejecting by chance)
(4) Obtain from the tabulated distribution a critical value. Compare the
estimated test statistic to the critical value
(5) Rejection and non-rejection regions can be determined
Decision rule: if the t-statistic lies in the rejection region, then reject
𝐻0 . Otherwise, do not reject 𝐻0 .
22. The Test of Significance Approach
Rejection region for a two-sided 5% (level of significance) test of
H0: β = β* vs H1: β ≠ β*:
[Figure: distribution of β̂ with 2.5% rejection regions in each tail beyond
the ± t-critical values and a 95% non-rejection region between them]
22. The Test of Significance Approach
Rejection region for a one-sided test of H0: β = β* vs H1: β < β*:
[Figure: 5% rejection region in the lower tail below the −t-critical value,
95% non-rejection region above]
22. The Test of Significance Approach
Rejection region for a one-sided test of H0: β = β* vs H1: β > β*:
[Figure: 5% rejection region in the upper tail above the +t-critical value,
95% non-rejection region below]
22. The Test of Significance Approach
test statistic = (β̂ − β*) / se(β̂)
• (1) The smaller se(β̂) is, the larger the test statistic (we are more
confident in the estimate)
• (2) The further β̂ is from β*, the larger the test statistic
• Degrees of freedom: the number of pieces of information beyond the minimum
required
• If two parameters are estimated, a minimum of two data points is needed
• As the degrees of freedom rise, the critical value falls, and one can be
more confident in the results of the hypothesis test
• α = significance level = size of the test
23. The Confidence Interval Approach to Hypothesis Testing
• Say β̂ = 0.93 and the 95% confidence interval for β is (0.77, 1.09)
• This means that in many repeated samples, 95% of the time an interval
constructed in this way will contain the true but unknown value of β
• Confidence intervals are always for true but unknown population parameters.
Given a significance level α, we have a (1 − α) × 100% confidence interval;
1 − α is the level of confidence:
α = 1%  → 99% CI
α = 5%  → 95% CI
α = 10% → 90% CI
23. The Confidence Interval Approach to Hypothesis Testing
• The confidence interval and test of significance approaches always give the
same conclusion for a given α
• For H0: β = β* vs H1: β ≠ β*: CIs are always used for two-sided hypotheses
• The null hypothesis will not be rejected if the test statistic lies in the
non-rejection region
23. The Confidence Interval Approach to Hypothesis Testing
• The confidence interval contains the same information (but rearranged) as
the test of significance approach:

−t_crit ≤ (β̂ − β*) / se(β̂) ≤ +t_crit

β̂ − t_crit · se(β̂) ≤ β* ≤ β̂ + t_crit · se(β̂)

• If the hypothesized value lies within the confidence interval, do not
reject H0
• If the hypothesized value lies outside the confidence interval, reject H0
24. Carrying Out a Hypothesis Test Using Confidence Intervals
(1) Compute α̂, β̂, se(α̂) and se(β̂)
(2) Choose a significance level α; this is equivalent to choosing a
(1 − α) × 100% CI
(3) Find the appropriate critical value, which will again have T − 2 degrees
of freedom
(4) Construct the confidence interval:
β̂ − t_crit · se(β̂) ≤ β* ≤ β̂ + t_crit · se(β̂)
(5) Perform the test: reject H0 if β* lies outside the interval (a sketch of
this construction in code follows)
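A minimal helper implementing steps (1)-(5), assuming scipy for the t critical
value (the function name and interface are illustrative, not from the slides):

```python
from scipy import stats

def beta_confidence_interval(beta_hat, se_beta, T, alpha=0.05):
    """(1 - alpha) * 100% confidence interval for beta, with T - 2 df."""
    t_crit = stats.t.ppf(1 - alpha / 2, df=T - 2)
    return beta_hat - t_crit * se_beta, beta_hat + t_crit * se_beta

# Usage, with the numbers from the example in the next section
low, high = beta_confidence_interval(0.5091, 0.2561, T=22)
print(low, high)   # about (-0.025, 1.043): 1 lies inside, so do not reject beta = 1
```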


26. Hypothesis Testing: Example
Using the regression:
ŷ_t = 20.3 + 0.5091 x_t
      (14.38) (0.2561)

and T = 22, test the hypothesis that β = 1 against a two-sided alternative
using both the test of significance and confidence interval approaches

H0: β = 1
H1: β ≠ 1
26. Hypothesis Testing: Example
Test of significance approach

• Test statistic = (β̂ − β*) / se(β̂) = (0.5091 − 1) / 0.2561 = −1.917

• t-critical = t_{20, 5%} = ±2.086

Since |test statistic| < 2.086 (i.e. 1.917 < 2.086)
⇒ Do not reject H0
26. Hypothesis Testing: Example
Confidence interval approach

β̂ ± t_crit × se(β̂) = 0.5091 ± 2.086 × 0.2561

95% CI = (−0.0251, 1.0433)

⇒ Do not reject H0, since the hypothesized value 1 lies within the interval
26. Hypothesis Testing: Example
[Figure: t-distribution with 2.5% rejection regions beyond ±2.086 and a 95%
non-rejection region between them]
26. Hypothesis Testing: Example
What if, in the example above, α = 10%?
• t_{20, 10%} = ±1.725
• Given that the test statistic is −1.917 (and 1.917 > 1.725), the null is
rejected

Note:
• Test of significance approach → better if the size of the test α changes
• CI approach → useful for testing many hypotheses at a given α
(a code sketch of both approaches follows)
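Putting the whole example together in Python (scipy assumed available for the
t critical values):

```python
from scipy import stats

beta_hat, se_beta, beta_star, T = 0.5091, 0.2561, 1.0, 22
df = T - 2

t_stat = (beta_hat - beta_star) / se_beta     # -1.917

for alpha in (0.05, 0.10):
    t_crit = stats.t.ppf(1 - alpha / 2, df)   # 2.086 at 5%, 1.725 at 10%
    reject = abs(t_stat) > t_crit
    ci = (beta_hat - t_crit * se_beta, beta_hat + t_crit * se_beta)
    print(f"alpha={alpha:.0%}: t={t_stat:.3f}, crit={t_crit:.3f}, "
          f"reject={reject}, CI=({ci[0]:.4f}, {ci[1]:.4f})")
# alpha=5%: do not reject (CI contains 1); alpha=10%: reject (CI excludes 1)
```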
26. The Exact Level of Significance Approach
• Commonly known as the p-value approach
• Gives the marginal significance level at which one would be indifferent
between rejecting and not rejecting the null hypothesis
• If the test statistic is large in absolute value, the p-value will be small,
and vice versa
26. The Exact Level of Significance Approach
• Informally, the p-value is often referred to as the probability of being
wrong when the null hypothesis is rejected
• A p-value of 0.05 or less leads the researcher to reject the null hypothesis
at the 5% level
• The p-value is also termed the plausibility of the null hypothesis
• Decision rule: if p-value < α, reject H0
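For the earlier example, the two-sided p-value can be computed with scipy's t
survival function:

```python
from scipy import stats

t_stat, df = -1.917, 20
p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-sided p-value
print(p_value)  # roughly 0.07: reject at alpha = 10%, do not reject at 5%
```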
27. Terminology: Significance and the t-ratio
• Rejecting H0 at α = 5% → the result of the test, or the estimate of the
parameter, is statistically significant
• Rejecting H0 at α = 1% → highly statistically significant
• The t-ratio is the test statistic designed to test the null:
H0: β = 0
H1: β ≠ 0

• The t-ratio is β̂ / se(β̂): the ratio of the coefficient to its standard
error
28. The Errors That We Can Make Using Hypothesis Tests
• We reject H0 if the test statistic is statistically significant at the
chosen significance level
• There are two possible errors we could make:
1. Rejecting H0 when it is really true: type I error
2. Not rejecting H0 when it is in fact false: type II error
28. The Errors That We Can Make Using Hypothesis Tests

Result of the test vs reality:
                               H0 is true         H0 is false
Reject H0 (significant)        Type I error = α   ✓ (correct)
Do not reject H0               ✓ (correct)        Type II error = β
(insignificant)
28. The Errors That We Can Make Using Hypothesis Tests
• There is always a trade-off between type I and type II errors when choosing
a significance level:
• Reducing α → lower chance of a type I error, but a higher chance of a
type II error
• One way to reduce both α and β is to increase the sample size
• Power of a test = 1 − β (illustrated in the sketch below)
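A small simulation sketch of the sample-size point (the true slope of 0.3, the
error standard deviation and the sample sizes are all made-up illustrations;
scipy assumed for the critical value): under H0: β = 0, the rejection rate of
the t-ratio test, i.e. the power, rises with T.

```python
import random
from scipy import stats

random.seed(0)
BETA_TRUE, REPS = 0.3, 2000      # hypothetical true slope; H0: beta = 0

for T in (20, 50, 100):
    t_crit = stats.t.ppf(0.975, df=T - 2)   # two-sided 5% critical value
    rejections = 0
    for _ in range(REPS):
        x = [random.uniform(0, 10) for _ in range(T)]
        y = [BETA_TRUE * a + random.gauss(0, 2) for a in x]
        x_bar, y_bar = sum(x) / T, sum(y) / T
        sxx = sum((a - x_bar) ** 2 for a in x)
        b = sum((a - x_bar) * (v - y_bar) for a, v in zip(x, y)) / sxx
        # Residuals and standard error of the regression
        resid = [v - (y_bar - b * x_bar) - b * a for a, v in zip(x, y)]
        s = (sum(e * e for e in resid) / (T - 2)) ** 0.5
        if abs(b / (s * (1 / sxx) ** 0.5)) > t_crit:   # t-ratio test
            rejections += 1
    print(T, rejections / REPS)   # power grows with the sample size
```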
29. Do Actively Managed Mutual Funds "Beat the Market"? Jensen's Alpha
• Testing for the presence and significance of abnormal returns ("Jensen's
alpha" - Jensen, 1968)
• The data: annual returns on the portfolios of 115 mutual funds from
1945-1964 (T = 20, df = T − 2 = 18)
• The model is, for j = 1, …, 115:
R_jt − r_ft = α_j + β_j (R_mt − r_ft) + u_jt
• We are interested in the significance of α_j
• The null hypothesis is H0: α_j = 0 (two-sided: t_{0.025, 18} = 2.10;
one-sided: t_{0.05, 18} = 1.7341)
29. Do Actively Managed Mutual Funds "Beat the Market"? Jensen's Alpha
• Using α = 5%, two-sided: 115 × 0.025 = 2.875, so 2 to 3 funds would appear
to outperform or "beat the market" by pure chance, that is, because of type I
error and a wrong decision we make about H0
• Using α = 5%, one-sided: 115 × 0.05 = 5.75, so 5 to 6 funds would appear to
"beat the market" by pure chance
29. Jensen (1968)’s Findings


̂
29. Frequency Distribution of t-ratios of Mutual
Fund Alphas (gross of transactions costs)
29. Frequency Distribution of t-ratios of Mutual Fund
Alphas (net of transactions costs)
30. Workspace Application
• Obtaining alpha and beta for a mutual fund/ETF
• Obtaining other quantitative measures for a mutual fund/ETF
