
Ordinary Least Squares

Introduction
• Describe the nature of financial data.
• Assess the concepts underlying regression analysis.
• Describe some examples of financial models.
• Examine the Ordinary Least Squares (OLS) technique and hypothesis testing.
Time Series data
• Examples of Problems that Could be Tackled Using a
Time Series Regression
- How the value of a country’s stock index has varied
with that country’s macroeconomic fundamentals.
- How the value of a company’s stock price has
varied when it announced the value of its dividend
payment.
- The effect on a country’s currency of an increase in
its interest rate
Cross-Sectional data
• Cross-sectional data are data on one or more
variables collected at a single point in time, e.g.
- A poll of usage of internet stock broking
services
- Cross-section of stock returns on the New
York Stock Exchange
- A sample of bond credit ratings for UK banks
Model Estimation
Economic or Financial Theory (Previous Studies)
→ Formulation of an Estimable Theoretical Model
→ Collection of Data
→ Model Estimation
→ Is the Model Statistically Adequate?
   - No: Reformulate the Model (and re-estimate)
   - Yes: Interpret the Model → Use for Analysis

Financial Data
Example data: interest rate (i) and market index (MI)

i    MI
0    18
3    21
4    20
3    23
4    25
6    27
4    26
6    28
5    30
6    32
Regression
[Scatter plot of the market index (y-axis) against the interest rate (x-axis), with a fitted linear trend line: y = 2.4024x + 15.606.]
Econometric Model

$y_t = \alpha + \beta x_t + u_t$

where:
- $y_t$ = dependent variable
- $\alpha$ = constant (intercept)
- $\beta$ = slope parameter
- $x_t$ = explanatory variable
- $u_t$ = error term
Estimates

$\hat{y}_t = 0.7 + 0.8 x_t$

A 1-unit rise in $x_t$ gives a 0.8-unit rise in $\hat{y}_t$.
The Residual Term
• (Also called the error term or disturbance term)
• It describes the random component of the
regression. It is caused by:
- Omission of explanatory variables
- The aggregation of the variables
- Mis-specification of the model
- Incorrect functional form of the model
- Measurement error
Least Squares Approach
• The aim of this approach is to minimise the residuals across all observations
• We square each residual before minimising, so that positive and negative residuals do not cancel out
• We can then derive our intercept and slope parameters using basic calculus, as sketched below
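A minimal sketch of the calculus result in code, using made-up illustration data (not the lecture's example): the closed-form intercept and slope are computed, then checked against small perturbations to confirm they minimise the sum of squared residuals.

```python
import numpy as np

# Hypothetical illustration data (not taken from the lecture).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 3.8, 5.2, 5.9, 7.1])

# Setting the derivatives of the sum of squared residuals to zero
# yields the familiar closed-form OLS solutions:
x_bar, y_bar = x.mean(), y.mean()
beta_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
alpha_hat = y_bar - beta_hat * x_bar

# Sanity check: any small perturbation of the slope must give a larger
# sum of squared residuals than the fitted values do.
def ssr(a, b):
    return np.sum((y - a - b * x) ** 2)

assert ssr(alpha_hat, beta_hat) < ssr(alpha_hat, beta_hat + 0.01)
assert ssr(alpha_hat, beta_hat) < ssr(alpha_hat, beta_hat - 0.01)

print(f"alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}")
```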
Regression
• Regression measures the dependence of the dependent variable on the explanatory variables
• Correlation measures the strength of a linear association between two variables
• Causation suggests that the dependent variable depends on previous values of the explanatory variable
• Regression does not imply causation
R-Squared Statistic
• This statistic gives the proportion of the total variation in the dependent variable that is explained by the regression
• It summarises the explanatory power of the regression and measures how well the model fits the data
• The value of this statistic lies between 0 and 1; it equals 1 when all the data points lie exactly on the regression line
Interpretation of Results
• Consider the type of model being estimated
• What are the units of measurement of the variables (unless all the variables are in logarithmic form)?
• Consider the range of the observations
• Do the signs of the coefficients accord with the theoretical model?
• Are the magnitudes of the parameters plausible?
• We need to remember it is only a model: the parameters are estimates, so it describes average values, and individual cases may vary
Significance Testing

$\hat{y}_t = \underset{(0.7)}{0.7} + \underset{(0.4)}{1.2}\, x_t$

(Standard errors in parentheses)
Hypothesis Test

$H_0: \beta = 0$
$H_1: \beta \neq 0$

$T = \frac{\hat{\beta} - 0}{SE(\hat{\beta})} = \frac{1.2 - 0}{0.4} = 3$

The critical value is 1.98.
Since $3 > 1.98$, reject $H_0$.
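A small check of this arithmetic in code. The 1.98 critical value corresponds to a 5% two-tailed test with roughly 100 degrees of freedom; that figure is an assumption here, since the slide does not state the sample size.

```python
from scipy import stats

beta_hat, beta_null, se = 1.2, 0.0, 0.4
t_stat = (beta_hat - beta_null) / se        # (1.2 - 0) / 0.4 = 3.0

df = 100                                    # assumed; not given on the slide
crit = stats.t.ppf(0.975, df)               # two-tailed 5% critical value, ~1.98

print(f"t = {t_stat:.2f}, critical value = {crit:.2f}")
print("reject H0" if abs(t_stat) > crit else "fail to reject H0")
```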
Hypothesis Testing
• Test the significance of the constant term in the same way as the slope parameter
• Although the conventional test for significance is at the 5% level, we also test at the 1% and 10% levels
• Use of the t-distribution tells us what to expect 'by chance'
• For finite samples, when applying the t-test, we need to allow for degrees of freedom
• The t-test can be applied as either a one- or two-tailed test
• The t-statistic is compared with the critical value in absolute terms, so we can ignore its sign
The t-Test
• If the t-statistic exceeds the critical value, reject the null hypothesis; if we are testing whether the coefficient equals zero, this means the coefficient is significant
• If the test statistic is below the critical value, we fail to reject the null hypothesis
• To find the critical value, you need to know the degrees of freedom, which equal n - k - 1 (where n is the number of observations and k the number of explanatory variables)
Interpretation of Coefficients, b1 and b2
• b2 represents an estimate of the mean change in y corresponding to a one-unit change in x.
• b1 is an estimate of the mean of y when x = 0. We must be very careful in interpreting the estimated intercept, since we usually do not have any data points near x = 0.
• Note that regression analysis cannot be interpreted as a procedure for establishing a cause-and-effect relationship between variables.
Simple Linear Regression Model

$y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$

- $y_t$ = demand for cars
- $x_t$ = prices

For a given level of $x_t$, the expected level of demand for cars will be:

$E(y_t \mid x_t) = \beta_1 + \beta_2 x_t$
Assumptions of the Simple Linear Regression Model

1. $y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$
2. $E(\varepsilon_t) = 0 \iff E(y_t \mid x_t) = \beta_1 + \beta_2 x_t$
3. $\text{var}(\varepsilon_t) = \sigma^2 = \text{var}(y_t)$
4. $\text{cov}(\varepsilon_i, \varepsilon_j) = \text{cov}(y_i, y_j) = 0$ for $i \neq j$
5. $x_t$ is not constant (no perfect collinearity)
6. $\varepsilon_t \sim N(0, \sigma^2) \iff y_t \sim N(\beta_1 + \beta_2 x_t, \sigma^2)$
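A minimal simulation sketch of a data-generating process satisfying all six assumptions; the parameter values are arbitrary choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

T = 50
beta1, beta2, sigma = 1.0, 0.5, 2.0   # arbitrary illustrative values

x = rng.uniform(1, 10, size=T)        # x_t varies (assumption 5)
eps = rng.normal(0, sigma, size=T)    # iid N(0, sigma^2): assumptions 2, 3, 4, 6
y = beta1 + beta2 * x + eps           # assumption 1: the linear model

# E(y_t | x_t) = beta1 + beta2 * x_t is the systematic part of the model.
print(y[:5])
```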
The population parameters $\beta_1$ and $\beta_2$ are unknown population constants.

The formulas that produce the sample estimates b1 and b2 are called the estimators of $\beta_1$ and $\beta_2$.

When b1 and b2 are used to represent the formulas rather than specific values, they are called estimators of $\beta_1$ and $\beta_2$, which are random variables because they differ from sample to sample.
Estimators are Random Variables (estimates are not)

Since the least squares estimators b1 and b2 are random variables, what are their means, variances, covariances and probability distributions?

We can compare the properties of alternative estimators to the properties of the least squares estimators.
The Expected Values of b1 and b2

The least squares formulas (estimators) in the simple regression case:

$b_2 = \frac{T \sum x_t y_t - \sum x_t \sum y_t}{T \sum x_t^2 - (\sum x_t)^2}$

$b_1 = \bar{y} - b_2 \bar{x}$

where $\bar{y} = \sum y_t / T$ and $\bar{x} = \sum x_t / T$.
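A short sketch of these two formulas in code, checked against numpy's built-in least squares fit (reusing the simulated data from the sketch above):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
x = rng.uniform(1, 10, size=50)
y = 1.0 + 0.5 * x + rng.normal(0, 2.0, size=50)

T = len(y)
b2 = (T * np.sum(x * y) - np.sum(x) * np.sum(y)) / (T * np.sum(x**2) - np.sum(x)**2)
b1 = y.mean() - b2 * x.mean()

# Cross-check against numpy's least squares polynomial fit (degree 1).
slope, intercept = np.polyfit(x, y, deg=1)
assert np.isclose(b2, slope) and np.isclose(b1, intercept)
print(f"b1 = {b1:.3f}, b2 = {b2:.3f}")
```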


Substituting $y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$ into the formula for $b_2$ gives:

$b_2 = \beta_2 + \frac{T \sum x_t \varepsilon_t - \sum x_t \sum \varepsilon_t}{T \sum x_t^2 - (\sum x_t)^2}$

The mean of $b_2$ is:

$E(b_2) = \beta_2 + \frac{T \sum x_t E(\varepsilon_t) - \sum x_t \sum E(\varepsilon_t)}{T \sum x_t^2 - (\sum x_t)^2}$

Since $E(\varepsilon_t) = 0$, it follows that $E(b_2) = \beta_2$.
An Unbiased Estimator

The result $E(b_2) = \beta_2$ means that the distribution of $b_2$ is centered at $\beta_2$. Since the distribution of $b_2$ is centered at $\beta_2$, we say that $b_2$ is an unbiased estimator of $\beta_2$.
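A Monte Carlo sketch of unbiasedness: across many simulated samples from the model above, the average of the b2 estimates should be close to the true $\beta_2$ (parameter values are again arbitrary).

```python
import numpy as np

rng = np.random.default_rng(seed=1)
beta1, beta2, sigma, T = 1.0, 0.5, 2.0, 50
x = rng.uniform(1, 10, size=T)        # keep x fixed across replications

b2_draws = []
for _ in range(10_000):
    y = beta1 + beta2 * x + rng.normal(0, sigma, size=T)
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
    b2_draws.append(b2)

# The sampling distribution of b2 is centered at the true beta2.
print(f"mean of b2 over replications: {np.mean(b2_draws):.4f} (true beta2 = {beta2})")
```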
Wrong Model Specification

The unbiasedness result on the previous slide assumes that we are using the correct model. If the model is of the wrong form or is missing important variables, then $E(\varepsilon_t) \neq 0$ and, in general, $E(b_2) \neq \beta_2$.
Unbiased Estimator of the Intercept

In a similar manner, the estimator $b_1$ of the intercept (or constant) can be shown to be an unbiased estimator of $\beta_1$ when the model is correctly specified:

$E(b_1) = \beta_1$
Equivalent expressions for $b_2$:

$b_2 = \frac{\sum (x_t - \bar{x})(y_t - \bar{y})}{\sum (x_t - \bar{x})^2}$

Expanding, and multiplying top and bottom by $T$:

$b_2 = \frac{T \sum x_t y_t - \sum x_t \sum y_t}{T \sum x_t^2 - (\sum x_t)^2} = \frac{\sum x_t y_t - T \bar{x} \bar{y}}{\sum x_t^2 - T \bar{x}^2}$
Variance of b2

Given that both $y_t$ and $\varepsilon_t$ have variance $\sigma^2$, the variance of the estimator $b_2$ is:

$\text{var}(b_2) = \frac{\sigma^2}{\sum (x_t - \bar{x})^2}$

$b_2$ is a function of the $y_t$ values, but $\text{var}(b_2)$ does not involve $y_t$ directly.
Variance of b1

Given $b_1 = \bar{y} - b_2 \bar{x}$, the variance of the estimator $b_1$ is:

$\text{var}(b_1) = \sigma^2 \, \frac{\sum x_t^2}{T \sum (x_t - \bar{x})^2}$
Covariance of b1 and b2

$\text{cov}(b_1, b_2) = -\sigma^2 \, \frac{\bar{x}}{\sum (x_t - \bar{x})^2}$
What factors determine the variances and covariance of b1 and b2?
1. The larger $\sigma^2$ is, the greater the uncertainty about b1, b2 and their relationship.
2. The more spread out the $x_t$ values are, the more confidence we have in b1, b2, etc.
3. The larger the sample size $T$, the smaller the variances and covariance.
4. The variance of b1 is large when the (squared) $x_t$ values are far from zero (in either direction).
5. Changing the slope, b2, has no effect on the intercept, b1, when the sample mean is zero. But if the sample mean is positive, the covariance between b1 and b2 will be negative, and vice versa. (See the sketch below.)
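A sketch computing these formulas directly, continuing the simulated example and assuming a known $\sigma^2$ for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
sigma, T = 2.0, 50
x = rng.uniform(1, 10, size=T)

sxx = np.sum((x - x.mean()) ** 2)

var_b2 = sigma**2 / sxx
var_b1 = sigma**2 * np.sum(x**2) / (T * sxx)
cov_b1_b2 = -sigma**2 * x.mean() / sxx

# A more spread-out x (larger sxx) shrinks all three quantities;
# a positive sample mean of x makes cov(b1, b2) negative, as stated above.
print(f"var(b1) = {var_b1:.4f}, var(b2) = {var_b2:.4f}, cov(b1,b2) = {cov_b1_b2:.4f}")
```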
Gauss-Markov Theorem

Under the first five assumptions of the simple linear regression model, the ordinary least squares estimators b1 and b2 have the smallest variance of all linear and unbiased estimators of $\beta_1$ and $\beta_2$. This means that b1 and b2 are the Best Linear Unbiased Estimators (BLUE) of $\beta_1$ and $\beta_2$.
Implications of Gauss-Markov
1. b1 and b2 are best within the class of linear and unbiased estimators.
2. "Best" means smallest variance within the class of linear, unbiased estimators.
3. All of the first five assumptions must hold for the Gauss-Markov theorem to apply.
4. Gauss-Markov does not require assumption six: normality.
5. Gauss-Markov is not based on the least squares principle as such, but on the estimation rules for b1 and b2.
Gauss-Markov Implications (continued)
6. If we are not satisfied with restricting
our estimation to the class of linear and
unbiased estimators, we should ignore the
Gauss-Markov Theorem and use some
nonlinear and/or biased estimator
instead. (Note: a biased or nonlinear
estimator could have smaller variance
than those satisfying Gauss-Markov.)
7. Gauss-Markov applies to the b1 and b2
estimators and not to particular sample
values (estimates) of b1 and b2.
yt and εt normally distributed

The least squares estimators of $\beta_2$ and $\beta_1$ can be expressed as linear combinations of the $y_t$:

$b_2 = \sum w_t y_t, \quad \text{where } w_t = \frac{x_t - \bar{x}}{\sum (x_t - \bar{x})^2}$

$b_1 = \bar{y} - b_2 \bar{x}$

This means that b1 and b2 are normal, since linear combinations of normals are normal.
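A quick numeric sketch confirming that the weighted-sum form $\sum w_t y_t$ reproduces the usual $b_2$, on the same simulated data as before:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
x = rng.uniform(1, 10, size=50)
y = 1.0 + 0.5 * x + rng.normal(0, 2.0, size=50)

w = (x - x.mean()) / np.sum((x - x.mean()) ** 2)   # the weights w_t
b2_weighted = np.sum(w * y)                        # b2 as a linear combination of y_t

b2_direct = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
assert np.isclose(b2_weighted, b2_direct)
print(f"b2 = {b2_weighted:.4f}")
```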
b1 and b2 normally distributed under the Central Limit Theorem

If the first five Gauss-Markov assumptions hold, and the sample size $T$ is sufficiently large, then the least squares estimators b1 and b2 have a distribution that approximates the normal distribution, with greater accuracy the larger the value of the sample size $T$.
Probability Distribution of the Least Squares Estimators

If either of the above two conditions is satisfied, then the distributions of b1 and b2 are:

$b_1 \sim N\!\left(\beta_1,\ \frac{\sigma^2 \sum x_t^2}{T \sum (x_t - \bar{x})^2}\right)$

$b_2 \sim N\!\left(\beta_2,\ \frac{\sigma^2}{\sum (x_t - \bar{x})^2}\right)$
Consistency

We would like our estimators, b1 and b2, to collapse onto the true population values, $\beta_1$ and $\beta_2$, as the sample size $T$ goes to infinity.

One way to achieve this consistency property is for the variances of b1 and b2 to go to zero as $T$ goes to infinity.

Since the formulas for the variances of the least squares estimators b1 and b2 show that their variances do, in fact, go to zero, b1 and b2 are consistent estimators of $\beta_1$ and $\beta_2$.
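A simulation sketch of consistency: the spread of b2 around the true $\beta_2$ shrinks as $T$ grows (arbitrary parameter values again).

```python
import numpy as np

rng = np.random.default_rng(seed=2)
beta1, beta2, sigma = 1.0, 0.5, 2.0

for T in (25, 100, 400, 1600):
    b2_draws = []
    for _ in range(2000):
        x = rng.uniform(1, 10, size=T)
        y = beta1 + beta2 * x + rng.normal(0, sigma, size=T)
        b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
        b2_draws.append(b2)
    # The sampling standard deviation of b2 falls as T increases.
    print(f"T = {T:5d}: sd(b2) = {np.std(b2_draws):.4f}")
```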
Estimating the Variance of the Error Term, σ²

$\hat{\varepsilon}_t = y_t - b_1 - b_2 x_t$

$\hat{\sigma}^2 = \frac{\sum_{t=1}^{T} \hat{\varepsilon}_t^2}{T - 2}$

$\hat{\sigma}^2$ is an unbiased estimator of $\sigma^2$.
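A sketch putting the pieces together: residuals, $\hat{\sigma}^2$, and the estimated standard errors obtained by substituting $\hat{\sigma}^2$ into the variance formulas above (still the simulated data):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
x = rng.uniform(1, 10, size=50)
y = 1.0 + 0.5 * x + rng.normal(0, 2.0, size=50)

T = len(y)
sxx = np.sum((x - x.mean()) ** 2)
b2 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b1 = y.mean() - b2 * x.mean()

resid = y - b1 - b2 * x                      # residuals, epsilon-hat_t
sigma2_hat = np.sum(resid**2) / (T - 2)      # unbiased estimator of sigma^2

se_b2 = np.sqrt(sigma2_hat / sxx)
se_b1 = np.sqrt(sigma2_hat * np.sum(x**2) / (T * sxx))
print(f"sigma2_hat = {sigma2_hat:.3f}, se(b1) = {se_b1:.3f}, se(b2) = {se_b2:.3f}")
```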
The Least Squares Predictor, ŷ₀

Given a value of the explanatory variable, $x_0$, we would like to predict a value of the dependent variable, $y_0$. The least squares predictor is:

$\hat{y}_0 = b_1 + b_2 x_0$
Probability Distribution of the Least Squares Estimators

$b_1 \sim N\!\left(\beta_1,\ \frac{\sigma^2 \sum x_t^2}{T \sum (x_t - \bar{x})^2}\right), \qquad b_2 \sim N\!\left(\beta_2,\ \frac{\sigma^2}{\sum (x_t - \bar{x})^2}\right)$

Create a standardized normal random variable, Z, by subtracting the mean of b2 and dividing by its standard deviation:

$Z = \frac{b_2 - \beta_2}{\sqrt{\text{var}(b_2)}} \sim N(0, 1)$
Coefficient of Determination

What proportion of the variation in $y_t$ is explained?

$0 \le R^2 \le 1$

$R^2 = \frac{SSR}{SST}$
Coefficient of Determination

$SST = SSR + SSE$

Dividing by SST:

$1 = \frac{SSR}{SST} + \frac{SSE}{SST}$

$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
Coefficient of Determination

• R² is only a descriptive measure.
• R² does not measure the quality of the regression model.
• Focusing solely on maximizing R² is not a good idea.
In simple linear regression models, there are two ways to test

$H_0: \beta_2 = 0$ vs $H_A: \beta_2 \neq 0$

1. Under $H_0$: $t = b_2 / se(b_2) \sim t_{(T-2)}$
2. Under $H_0$: $F = MSR / MSE \sim F_{1, T-2}$

Note that:
1. It can be shown that $t_{(T-2)}^2 = F_{1, T-2}$
2. $F = MSR / MSE = \frac{R^2}{(1 - R^2)/(T - 2)}$

A numeric check of the $t^2 = F$ identity is sketched below.
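A sketch verifying both the $t^2 = F$ identity and the $R^2$ form of the F statistic on simulated data:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
x = rng.uniform(1, 10, size=50)
y = 1.0 + 0.5 * x + rng.normal(0, 2.0, size=50)

T = len(y)
sxx = np.sum((x - x.mean()) ** 2)
b2 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b1 = y.mean() - b2 * x.mean()
y_hat = b1 + b2 * x

sst = np.sum((y - y.mean()) ** 2)
ssr = np.sum((y_hat - y.mean()) ** 2)
sse = np.sum((y - y_hat) ** 2)

t_stat = b2 / np.sqrt((sse / (T - 2)) / sxx)   # t = b2 / se(b2)
f_stat = (ssr / 1) / (sse / (T - 2))           # F = MSR / MSE
r2 = ssr / sst

assert np.isclose(t_stat**2, f_stat)
assert np.isclose(f_stat, r2 / ((1 - r2) / (T - 2)))
print(f"t = {t_stat:.3f}, F = {f_stat:.3f}, R^2 = {r2:.3f}")
```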
Regression Computer Output

Typical computer output of regression estimates:

Table 6.2 Computer Generated Least Squares Results

Variable     Estimate   Standard Error   T for H0: Parameter=0   Prob>|T|
INTERCEPT    40.7676    22.1387          1.841                   0.0734
X             0.1283     0.0305          4.201                   0.0002
Regression Computer Output

$b_1 = 40.7676, \quad b_2 = 0.1283$

$se(b_1) = \sqrt{\widehat{\text{var}}(b_1)} = \sqrt{490.12} = 22.1387$

$se(b_2) = \sqrt{\widehat{\text{var}}(b_2)} = \sqrt{0.0009326} = 0.0305$

$t = \frac{b_1}{se(b_1)} = \frac{40.7676}{22.1387} = 1.841$

$t = \frac{b_2}{se(b_2)} = \frac{0.1283}{0.0305} = 4.201$
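A quick arithmetic check of the reported output; small differences in the last t statistic come from rounding se(b2) to 0.0305 before dividing.

```python
import math

# Values reported in the computer output above.
b1, var_b1 = 40.7676, 490.12
b2, var_b2 = 0.1283, 0.0009326

se_b1 = math.sqrt(var_b1)   # ~22.139
se_b2 = math.sqrt(var_b2)   # ~0.0305

print(f"se(b1) = {se_b1:.4f}, t = {b1 / se_b1:.3f}")   # ~1.841
print(f"se(b2) = {se_b2:.4f}, t = {b2 / se_b2:.3f}")   # ~4.201
```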
Regression Computer Output

Sources of variation in the dependent variable (analysis of variance):

Source              Sum of Squares
Explained (SSR)         25221
Unexplained (SSE)       54311
Total (SST)             79532
Regression Computer Output

$SST = \sum (y_t - \bar{y})^2 = 79532$

$SSR = \sum (\hat{y}_t - \bar{y})^2 = 25221$

$SSE = \sum \hat{\varepsilon}_t^2 = 54311$

$\hat{\sigma}^2 = SSE / (T - 2)$

$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} = 0.317$
Reporting Regression Results

$R^2 = 0.317$

This R² value may seem low, but it is typical in studies involving data analyzed at the individual (micro) level.

A considerably higher R² value would be expected in studies involving time-series data analyzed at an aggregate or macro level.