SIMPLE REGRESSION
WEEK 2, FALL 2024

y = β0 + β1x + u
Types of Data – Cross Sectional
➢ Cross-sectional data is a random sample
➢ For each individual i we observe a pair (xi, yi) satisfying
𝑦𝑖 = 𝛽0 + 𝛽1𝑥𝑖 + 𝑢𝑖
➢ Each observation is a new individual, firm,
etc., with information at a point in time
Some Terminology, cont.
➢In the simple linear regression of y on x, we
typically refer to x as the
▪ Independent Variable, or
▪ Right-Hand Side Variable, or
▪ Explanatory Variable, or
▪ Regressor, or
▪ Covariate, or
▪ Control Variable
EXAMPLE 1: Weight vs. Height
[Scatter plot: KILO (weight, kg) against BOY (height, cm)]
EXAMPLE 1: Weight vs. Height
[Scatter plot of KILO against BOY with the fitted regression line; a second panel separates ▲ male and ● female observations]
Variance–Covariance Matrices
[ 25.464  19.953 ]        [ 79.524  35.147 ]
[ 19.953  42.437 ]        [ 35.147  27.061 ]
Correlation = 0.61        Correlation = 0.76
EXAMPLE1 : Regression Lines
Weight vs. Height vs. Gender
Weight = β0 + β1·Height + u
[Scatter plot of KILO against BOY with separate regression lines by gender]
EXAMPLE2: Capital Asset Pricing Model
CAPM
(ri − rf) = α + β(rm − rf) + ε
ri = return of asset i,  rm = return of the market,  rf = risk-free interest rate

Var(ri − rf) = Var[α + β(rm − rf) + ε]
Var(ri − rf) = Var[α] + Var[β(rm − rf)] + Var[ε]
Var(ri − rf) = Var[β(rm − rf)] + Var[ε]        (Var[α] = 0 because α is a constant)
Var(ri − rf) = β² Var[(rm − rf)] + Var[ε]
Total Risk = Risk originating from the market + Risk originating from the firm
EXAMPLE2: Capital Asset Pricing Model
CAPM
[Scatter plot: RPET−RF against RM−RF; the slope of the fitted line is β]
EXAMPLE2: Capital Asset Pricing Model CAPM
Turkish Stock Exchange: BIST100 index, Akbank and Garanti stocks, 19/10/2020–18/10/2021
[Time-series plot of the price levels, October 2020 – October 2021]
EXAMPLE2: Capital Asset Pricing Model CAPM
Turkish Stock Exchange: BIST100 index, Akbank and Garanti stocks, 19/10/2020–18/10/2021
[Scatter plots: DLOG(AKBANK) and DLOG(GARANTI) against DLOG(BIST100), with fitted lines]
GARANTİ: total risk 5.75 = (1.252)²·1.97 + 2.67 ≈ 3.08 + 2.67, so the firm-specific share is about 46%.
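As a rough illustration of this calculation, the sketch below estimates β by OLS on simulated excess returns and splits total risk into its market and firm components. The series, sample size, and all numbers are made up for the example; the actual BIST100/Akbank/Garanti data are not reproduced here.

# Sketch: estimating a CAPM beta and decomposing total risk (simulated data,
# not the actual BIST100/Garanti series shown on the slide).
import numpy as np

rng = np.random.default_rng(0)
n = 250                                  # roughly one year of daily returns (hypothetical)
rm_rf = rng.normal(0.0005, 0.014, n)     # market excess returns
eps   = rng.normal(0.0,    0.016, n)     # firm-specific disturbance
beta_true = 1.25
ri_rf = 0.0001 + beta_true * rm_rf + eps # asset excess returns

# OLS slope: b = Cov(x, y) / Var(x)
b = np.cov(rm_rf, ri_rf, ddof=1)[0, 1] / np.var(rm_rf, ddof=1)
a = ri_rf.mean() - b * rm_rf.mean()

# Risk decomposition: Var(ri - rf) = b^2 Var(rm - rf) + Var(e)
total_risk  = np.var(ri_rf, ddof=1)
market_risk = b**2 * np.var(rm_rf, ddof=1)
firm_risk   = total_risk - market_risk
print(f"beta = {b:.3f}, market share = {market_risk/total_risk:.0%}, "
      f"firm share = {firm_risk/total_risk:.0%}")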
𝑦𝑖 = 𝛽0 + 𝛽1 𝑥𝑖 + 𝑢𝑖
➢The average value of u, the error term, in
the population is 0. That is,
◼ E(u) = 0
A Simple Regression Assumptions
➢ Zero Conditional Mean
➢ We need to make a crucial assumption about
how u and x are related
➢ We want it to be the case that knowing
something about x does not give us any
information about u, so that they are
completely unrelated. That is, that
➢ E(u|x) = E(u) = 0, which implies
➢ E(y|x) = β0 + β1x
A Simple Regression Assumptions
➢ E(y|x) as a linear function of x, where for any x
the distribution of y is centered about E(y|x)
Parameter Estimation:
Ordinary Least Squares
➢ Basic idea of regression is to estimate the
population parameters from a sample
➢ Let {(xi,yi): i=1, …,n} denote a random
sample of size n from the population
➢ For each observation in this sample, it will
be the case that
yi = β0 + β1xi + ui
SIMPLE REGRESSION MODEL
[Diagram: the nonstochastic component y = α + βx evaluated at x1, …, x4 gives the points Q1, …, Q4 on the line]
[Scatter: age of first child (y) against age of mother (x), with the line y = α + βx]
SIMPLE REGRESSION MODEL
Population
[Diagram: the observed points P1, …, P4 deviate from the points Q1, …, Q4 on the line y = α + βx by the disturbances u1, …, u4]
DERIVING LINEAR REGRESSION
COEFFICIENTS a and b
Ordinary Least Squares OLS
yi = a + bxi + ei, with fitted values ŷi = a + bxi
Least squares criterion: minimize S = Σei², the sum of squared residuals.
(The method originates in orbit estimation: fitting the coefficients of an ellipse to astronomical observations.)
DERIVING ORDINARY LEAST SQUARES
OLS
Least squares criterion:
Minimize S, where
S = Σei² = e1² + … + en²
This sequence shows how the regression coefficients for a simple regression model are
derived, using the least squares criterion (OLS, for ordinary least squares).
We will start with a numerical example with just three observations: (1,3), (2,5), and (3,6).
DERIVING LINEAR REGRESSION
COEFFICIENTS
[Diagram: the three observations y1, y2, y3 at x = 1, 2, 3, with fitted values ŷ1 = a + b, ŷ2 = a + 2b, ŷ3 = a + 3b]
True model: y = α + βx + u        Fitted line: ŷ = a + bx
Writing the fitted regression as ŷ = a + bx, we will determine the values of a
and b that minimize the sum of the squares of the residuals.
DERIVING LINEAR REGRESSION
COEFFICIENTS
[Diagram: the residuals e1, e2, e3 between the observations and the fitted line]
y = ŷ + e
y = (a + bx) + e
e = y − a − bx

e1 = y1 − ŷ1 = 3 − a − b
e2 = y2 − ŷ2 = 5 − a − 2b
e3 = y3 − ŷ3 = 6 − a − 3b

∂S/∂a = 0  ⇒  6a + 12b − 28 = 0
∂S/∂b = 0  ⇒  12a + 28b − 62 = 0
The first-order conditions give us two equations in two unknowns.
DERIVING LINEAR REGRESSION
COEFFICIENTS
S = e1² + e2² + e3² = (3 − a − b)² + (5 − a − 2b)² + (6 − a − 3b)²
  = 9 + a² + b² − 6a − 6b + 2ab
  + 25 + a² + 4b² − 10a − 20b + 4ab
  + 36 + a² + 9b² − 12a − 36b + 6ab
  = 70 + 3a² + 14b² − 28a − 62b + 12ab

∂S/∂a = 0  ⇒  6a + 12b − 28 = 0  ⇒  6a + 12b = 28
∂S/∂b = 0  ⇒  12a + 28b − 62 = 0  ⇒  12a + 28b = 62
[Diagram: fitted line ŷ = 1.67 + 1.50x through the three observations, with fitted values ŷ1 = 3.17, ŷ2 = 4.67, ŷ3 = 6.17]
True model: y = α + βx + u        Fitted line: ŷ = 1.67 + 1.50x
DERIVING LINEAR REGRESSION
COEFFICIENTS
True model: y = α + βx + u        Fitted line: ŷ = a + bx
[Diagram: for a general sample (x1, y1), …, (xn, yn), the fitted line gives ŷ1 = a + bx1, …, ŷn = a + bxn]
Given our choice of a and b, we will obtain a fitted line as shown.
DERIVING LINEAR REGRESSION
COEFFICIENTS
True model: y = α + βx + u        Fitted line: ŷ = a + bx
[Diagram: the residual e1 is the vertical distance between y1 and ŷ1]
e1 = y1 − ŷ1 = y1 − a − bx1
…
en = yn − ŷn = yn − a − bxn
The residual for the first observation is defined.
DERIVING LINEAR REGRESSION
COEFFICIENTS
True model: y = α + βx + u        Fitted line: ŷ = a + bx
[Diagram: the residuals e1, …, en; the residual for the last observation, en, is marked]
e1 = y1 − ŷ1 = y1 − a − bx1
…
en = yn − ŷn = yn − a − bxn
Similarly we define the residuals for the remaining observations. That for
the last one is marked.
DERIVING LINEAR REGRESSION COEFFICIENTS
Numerical example (for comparison):
S = e1² + e2² + e3² = (3 − a − b)² + (5 − a − 2b)² + (6 − a − 3b)² = 70 + 3a² + 14b² − 28a − 62b + 12ab

General case:
S = e1² + … + en² = (y1 − a − bx1)² + … + (yn − a − bxn)²
  = y1² + a² + b²x1² − 2ay1 − 2bx1y1 + 2abx1
  + …
  + yn² + a² + b²xn² − 2ayn − 2bxnyn + 2abxn
  = Σyi² + na² + b²Σxi² − 2aΣyi − 2bΣxiyi + 2abΣxi

The sum of the squares of the residuals is defined for the general case. The data for the
numerical example are shown for comparison.
The quadratics are expanded. Like terms are added together.
DERIVING LINEAR REGRESSION COEFFICIENTS
S = 70 + 3a² + 14b² − 28a − 62b + 12ab
∂S/∂a = 0  ⇒  6a + 12b − 28 = 0
∂S/∂b = 0  ⇒  12a + 28b − 62 = 0        ⇒  a = 1.67, b = 1.50

S = Σyi² + na² + b²Σxi² − 2aΣyi − 2bΣxiyi + 2abΣxi

The first derivatives of S with respect to a and b provide us with two equations
that can be used to determine a and b.
Note that in this situation the observations on x and y are just data that
determine the coefficients in the expression for S.
The choice variables in the expression are a and b. This may seem a bit
strange because in elementary calculus courses a and b are always constants
and x and y are variables.
DERIVING LINEAR REGRESSION COEFFICIENTS
∂S/∂b = 0  ⇒  2bΣxi² − 2Σxiyi + 2aΣxi = 0
Divide through by 2:
bΣxi² − Σxiyi + aΣxi = 0
DERIVING LINEAR REGRESSION COEFFICIENTS
S = Σyi² + na² + b²Σxi² − 2aΣyi − 2bΣxiyi + 2abΣxi

∂S/∂a = 0  ⇒  2na − 2Σyi + 2bΣxi = 0
na = Σyi − bΣxi  ⇒  a = ȳ − bx̄

∂S/∂b = 0  ⇒  2bΣxi² − 2Σxiyi + 2aΣxi = 0
bΣxi² − Σxiyi + aΣxi = 0

We now substitute for a using the expression obtained for it, and thus obtain an
equation that contains b only:
bΣxi² − Σxiyi + (ȳ − bx̄)Σxi = 0
DERIVING LINEAR REGRESSION COEFFICIENTS
bΣxi² − Σxiyi + (ȳ − bx̄)Σxi = 0,        where x̄ = Σxi / n
bΣxi² − Σxiyi + (ȳ − bx̄)nx̄ = 0
Terms not involving b have been transferred to the right side and the
equation has been divided through by n:
b[(1/n)Σxi² − x̄²] = (1/n)Σxiyi − x̄ȳ
DERIVING LINEAR REGRESSION COEFFICIENTS
∂S/∂b = 0  ⇒  2bΣxi² − 2Σxiyi + 2aΣxi = 0
bΣxi² − Σxiyi + aΣxi = 0
bΣxi² − Σxiyi + (ȳ − bx̄)Σxi = 0
bΣxi² − Σxiyi + (ȳ − bx̄)nx̄ = 0
b[(1/n)Σxi² − x̄²] = (1/n)Σxiyi − x̄ȳ
b·Var(x) = Cov(x, y)
Hence we obtain a tidy expression for b:
b = Cov(x, y) / Var(x)
DERIVING LINEAR REGRESSION COEFFICIENTS
True model: y = α + βx + u        Fitted line: ŷ = a + bx
[Diagram: the fitted line through the sample scatter]
a = ȳ − bx̄
b = Cov(x, y) / Var(x)
a = ȳ − bx̄  ⇒  ȳ = a + bx̄ : the fitted line passes through the sample means (x̄, ȳ).
The expression for a is standard, and we will soon see that it generalizes easily.
There are various ways of writing the expression for b.
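A quick way to confirm these formulas is to apply them to the three-observation example used above; a minimal Python sketch (NumPy only):

# Sketch: the slope and intercept formulas applied to the three observations
# (1,3), (2,5), (3,6) used in the derivation above.
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 6.0])

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # b = Cov(x, y) / Var(x)
a = y.mean() - b * x.mean()                          # a = y-bar - b*x-bar
print(b, a)                                          # 1.5 and 1.666..., matching b = 1.50, a = 1.67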
Summary of OLS Slope Estimate
[Scatter plot: hourly earnings ($) against highest grade completed]
The scatter diagram shows hourly earnings in 1994 plotted against highest grade completed
for a sample of 570 respondents from the National Longitudinal Survey of Youth.
Highest grade completed means just that for elementary and high school. Grades 13, 14,
and 15 mean completion of one, two, and three years of college.
Grade 16 means completion of a four-year college degree. Higher grades indicate years of
postgraduate education.
INTERPRETATION OF A REGRESSION
EQUATION
. reg earnings hgc
------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | 1.073055 .1324501 8.102 0.000 .8129028 1.333206
_cons | -1.391004 1.820305 -0.764 0.445 -4.966354 2.184347
------------------------------------------------------------------------------
For the time being, we will be concerned only with the estimates of the
parameters. The variables in the regression are listed in the first
column and the second column gives the estimates of their coefficients.
In this case there is only one variable, HGC, and its coefficient is 1.073.
The estimate of the intercept is -1.391. _cons, in Stata, refers to the
constant.
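For readers working outside Stata, a comparable table can be produced in Python with statsmodels. Since the NLSY extract is not distributed with these slides, the sketch below uses simulated earnings and schooling data purely as a stand-in; the variable names and resulting estimates are illustrative, not the actual figures.

# Sketch: reproducing a Stata-style "reg earnings hgc" table in Python.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 570
hgc = rng.integers(8, 21, n)                          # highest grade completed (hypothetical)
earnings = -1.4 + 1.07 * hgc + rng.normal(0, 8, n)    # rough stand-in for hourly earnings

X = sm.add_constant(hgc)                              # adds the intercept (_cons in Stata)
fit = sm.OLS(earnings, X).fit()
print(fit.summary())                                  # coefficients, std. errors, t, p, 95% CI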
INTERPRETATION OF A REGRESSION EQUATION
[Scatter plot with fitted line: EARNINGS-hat = −1.391 + 1.073 HGC]
[Zoomed-in view of the regression line between HGC = 11 and HGC = 12: one extra year raises predicted earnings by $1.07, from $10.41 to $11.49]
The regression line indicates that completing 12th grade instead of 11th grade
would increase earnings by $1.073, from $10.413 to $11.486, as a general tendency.
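The figures quoted above follow directly from the fitted equation; a two-line check:

# Quick check of the slide's arithmetic with EARNINGS-hat = -1.391 + 1.073*HGC.
for hgc in (11, 12):
    print(f"{hgc}: {-1.391 + 1.073 * hgc:.3f}")   # 11 -> 10.412, 12 -> 11.485 (difference = 1.073)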
INTERPRETATION OF A REGRESSION EQUATION
[Scatter plot with the fitted line EARNINGS-hat = −1.391 + 1.073 HGC, extended back to HGC = 0]
What about the constant term? (Try to answer this question yourself before
continuing with this sequence.)
Literally, the constant indicates that an individual with no years of education
would have to pay $1.39 per hour to be allowed to work.
INTERPRETATION OF A REGRESSION EQUATION
[Scatter plot with the fitted line EARNINGS-hat = −1.391 + 1.073 HGC over the observed range of HGC]
A safe solution to the problem is to limit the interpretation to the range of the
sample data, and to refuse to extrapolate on the ground that we have no evidence
outside the data range.
With this explanation, the only function of the constant term is to enable you to
draw the regression line at the correct height on the scatter diagram. It has no
meaning of its own.
INTERPRETATION OF A REGRESSION EQUATION
[Scatter plot: hourly earnings ($) against highest grade completed, suggesting a possibly nonlinear relationship]
Another solution is to explore the possibility that the true relationship is nonlinear.
We will soon extend the regression technique to fit nonlinear models.
Algebraic Properties of OLS
➢ The sum of the OLS residuals is zero.
Thus, the sample average of the OLS
residuals is zero as well
➢ The sample covariance between the
regressors and the OLS residuals is zero
➢ The OLS regression line always goes
through the mean of the sample
(all three properties are checked numerically in the sketch below)
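A minimal sketch, on simulated data with arbitrary parameter values, verifying that these properties hold exactly for any sample by construction:

# Sketch: verifying the three algebraic properties of OLS on a small sample.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 30)
y = 1.0 + 0.5 * x + rng.normal(0, 1, 30)

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()
e = y - (a + b * x)                            # OLS residuals

print(np.isclose(e.sum(), 0))                  # residuals sum to zero
print(np.isclose(np.sum(x * e), 0))            # sample covariance of x and residuals is zero
print(np.isclose(y.mean(), a + b * x.mean()))  # line passes through (x-bar, y-bar)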
More Terminology
We can think of each observation as being made
up of an explained part, and an unexplained part,
yi = ŷi + ûi. We then define the following:
Σ(yi − ȳ)² is the total sum of squares (TSS, sometimes written SST)
Σ(ŷi − ȳ)² is the explained sum of squares (ESS)
Σei² is the residual sum of squares (RSS)

Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σei²

R² = ESS/TSS = Σ(ŷi − ȳ)² / Σ(yi − ȳ)² = 1 − Σei² / Σ(yi − ȳ)²
Thus, in simple regression, the correlation coefficient is the square root of R². It follows that
it is maximized by the use of the least squares principle to determine
the regression coefficients.
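A minimal sketch of the decomposition and both forms of R², computed by hand on simulated data (all values arbitrary); it also checks that, in simple regression, R² equals the squared correlation between x and y:

# Sketch: computing TSS, ESS, RSS and R^2 by hand for a simple regression.
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 50)
y = 2.0 + 0.8 * x + rng.normal(0, 2, 50)

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()
y_hat = a + b * x
e = y - y_hat

TSS = np.sum((y - y.mean())**2)
ESS = np.sum((y_hat - y.mean())**2)
RSS = np.sum(e**2)

print(np.isclose(TSS, ESS + RSS))          # decomposition TSS = ESS + RSS
print(ESS / TSS, 1 - RSS / TSS)            # the two equivalent forms of R^2
print(np.corrcoef(x, y)[0, 1]**2)          # equals R^2 in simple regression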
Goodness-of-Fit R2
. reg earnings hgc
------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | 1.073055 .1324501 8.102 0.000 .8129028 1.333206
_cons | -1.391004 1.820305 -0.764 0.445 -4.966354 2.184347
------------------------------------------------------------------------------
In this case there is only one variable, HGC, and its coefficient is 1.073. _cons,
in Stata, refers to the constant. The estimate of the intercept is -1.391.
UNBIASEDNESS AND EFFICIENCY
[Four sampling-distribution diagrams: an estimator can be unbiased and efficient, unbiased but inefficient, biased and efficient, or biased and inefficient]
OLS ASSUMPTIONS
1. Assumptions on the disturbances
a. Random disturbances have zero mean: E[ui] = 0
b. Homoskedasticity: Var(ui) = σ²
c. No serial correlation: Cov(ui, uj) = 0 for i ≠ j
2. Assumptions on the model and its parameters
a. Constant parameters
b. Linear model
3. Assumption on the probability distribution
a. Normal distribution: ui ~ N(0, σ²)
4. Assumptions on the regressors
a. Fixed (nonstochastic) regressors
REGRESSION
COEFFICIENTS
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
y = α + βx + u
ŷ = a + bx
The estimated coefficients depend on the values of the disturbance term in every observation
in the sample, and thus they are a special type of random variable.
We will investigate the effect of the disturbance term on b in two ways: first, directly, using a Monte Carlo
experiment, and, second, analytically.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
Choose a model in which y is determined by x, the parameter values, and u:  y = α + βx + u
Choose data for x: x = 1, 2, …, 20
Choose parameter values: α = 2.0, β = 0.5
Choose a distribution for u: u is independent N(0,1)
We will then regress y on x using the OLS estimation technique and see how
well our estimates a and b correspond to the true values α and β.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
y = 2.0 + 0.5x + u
x:          1    2    3    …   20
2.0 + 0.5x: 2.5  3.0  3.5  …   12.0   (the u and y columns are filled in once the disturbances are drawn)
[Plot of the nonstochastic component y = 2.0 + 0.5x for x = 1, …, 20]
Next, we generate randomly a value of the disturbance term for each observation
using a N(0,1) distribution (normal with zero mean and unit variance).
We will generate values of y for all the 20 observations.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
[Scatter of the 20 generated observations; below, the same scatter with the fitted line ŷ = 2.52 + 0.48x from one replication]
This time the slope coefficient has been underestimated and the
intercept overestimated.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
[Scatter with the fitted line ŷ = 2.13 + 0.45x from another replication]
As last time, the slope coefficient has been underestimated and the
intercept overestimated.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
replication    a      b
     1        1.63   0.54
     2        2.52   0.48
     3        2.13   0.45
     4        2.14   0.50
     5        1.71   0.56
     6        1.81   0.51
     7        1.72   0.56
     8        3.18   0.41
     9        1.26   0.58
    10        1.94   0.52
[Histogram of the b estimates]
The table summarizes the results of the three regressions and adds
those obtained repeating the process a further seven times.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
[Histograms of the b estimates with 50 replications and with 100 replications]
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
[Histogram of the b estimates with 100 replications, with the limiting distribution superimposed]
This is the histogram with 100 replications. We can see that the distribution
appears to be symmetrical around the true value, implying that the estimator
is unbiased.
The red curve shows the limiting shape of the distribution. It is symmetrical
around the true value, confirming that the estimator is unbiased.
The distribution is normal because the disturbance term was drawn from a
normal distribution.
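The Monte Carlo experiment is easy to reproduce; a minimal sketch using the same design (x = 1, …, 20, α = 2.0, β = 0.5, u ~ N(0,1)), but with 1,000 replications rather than the 10–100 shown on the slides:

# Sketch of the Monte Carlo experiment on the slides: y = 2.0 + 0.5x + u,
# x = 1,...,20, u ~ N(0,1), re-estimated over many replications.
import numpy as np

rng = np.random.default_rng(4)
x = np.arange(1, 21)
slopes = []
for _ in range(1000):
    u = rng.normal(0, 1, x.size)
    y = 2.0 + 0.5 * x + u
    b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    slopes.append(b)

slopes = np.array(slopes)
print(slopes.mean())          # close to the true value 0.5 (unbiasedness)
print(slopes.std(ddof=1))     # sampling variability of the estimator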
OLS ASSUMPTIONS
1. Assumptions on regressors
a. Fixed - nonstochastic regressors
2. Assumptions on the disturbances
a. Random disturbances have zero mean: E[ui] = 0
b. Homoskedasticity: Var(ui) = σ²
c. No serial correlation: Cov(ui, uj) = 0 for i ≠ j
3. Assumptions on the model and its parameters
a. Constant parameters
b. Linear model
4. Assumption on the probability distribution
a. Normal distribution: u ~ N(0, σ²)
Variance of the OLS Estimators
➢ Now we know that the sampling
distribution of our estimate is centered
around the true parameter
➢ Want to think about how spread out this
distribution is
➢ Much easier to think about this variance
under an additional assumption, so
➢ Assume Var(u|x) = σ² (Homoskedasticity)
Variance of OLS (cont)
➢ Var(u|x) = E(u²|x) − [E(u|x)]²
➢ E(u|x) = 0, so σ² = E(u²|x) = E(u²) = Var(u)
➢ Thus σ² is also the unconditional variance,
called the error variance
➢ σ, the square root of the error variance, is
called the standard deviation of the error
➢ Can say: E(y|x) = β0 + β1x and Var(y|x) = σ²
Homoskedastic Case
[Figure: conditional densities f(y|x) at x1 and x2 with equal spread around E(y|x) = β0 + β1x]
Heteroskedastic Case
[Figure: conditional densities f(y|x) at x1, x2, x3 with increasing spread around E(y|x) = β0 + β1x]
PRECISION OF THE REGRESSION
COEFFICIENTS
Simple regression model: y = α + βx + u
Variances of the regression coefficients:
pop.var(a) = (σu²/n)·[1 + x̄²/Var(x)]        pop.var(b) = σu² / [n·Var(x)]
PRECISION OF THE REGRESSION COEFFICIENTS
Simple regression model: y = α + βx + u
Variances of the regression coefficients:  pop.var(b) = σu² / [n·Var(x)]
[Two simulated scatter diagrams with the same x values and the same underlying line]
This is illustrated by the diagrams above. The nonstochastic component of the
relationship, y = 3.0 + 0.8x, represented by the dotted line, is the same in both
diagrams.
The values of x are the same, and the same random numbers have been used to
generate the values of the disturbance term in the 20 observations.
However, in the right-hand diagram the random numbers have been multiplied by
a factor of 5. As a consequence, the regression line, the solid line, is a much
poorer approximation to the nonstochastic relationship.
PRECISION OF THE REGRESSION COEFFICIENTS
Simple regression model: y = α + βx + u
Variances of the regression coefficients:  pop.var(b) = σu² / [n·Var(x)]
[The same pair of scatter diagrams as above]
------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | 1.073055 .1324501 8.102 0.000 .8129028 1.333206
_cons | -1.391004 1.820305 -0.764 0.445 -4.966354 2.184347
------------------------------------------------------------------------------
Estimating the Error Variance
➢ We don’t know what the error variance, σ²,
is, because we don’t observe the errors, ui
Error Variance Estimate (cont)
σ̂² = (1/(n − 2)) Σûi² = SSR / (n − 2)
Error Variance Estimate (cont)
σ̂ = √σ̂² = standard error of the regression
Recall that sd(β̂1) = σ / sx, where sx = [Σ(xi − x̄)²]^(1/2).
If we substitute σ̂ for σ, then we have
the standard error of β̂1:
se(β̂1) = σ̂ / [Σ(xi − x̄)²]^(1/2)
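Putting the last two slides together, a minimal sketch of σ̂², the standard error of the regression, and se(b), computed on simulated data (all values are arbitrary):

# Sketch: estimating the error variance and the standard error of the slope,
# sigma-hat^2 = RSS/(n-2), se(b) = sigma-hat / sqrt(sum((x_i - x_bar)^2)).
import numpy as np

rng = np.random.default_rng(5)
n = 40
x = rng.uniform(0, 10, n)
y = 3.0 + 0.8 * x + rng.normal(0, 2, n)

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()
e = y - a - b * x

sigma2_hat = np.sum(e**2) / (n - 2)                  # unbiased estimate of sigma_u^2
se_b = np.sqrt(sigma2_hat / np.sum((x - x.mean())**2))
print(np.sqrt(sigma2_hat), se_b)                     # standard error of regression, se of slope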
TESTING A HYPOTHESIS
RELATING TO A
REGRESSION COEFFICIENT
TESTING A HYPOTHESIS RELATING TO A
REGRESSION COEFFICIENT
Model: y = α + βx + u
Null hypothesis: H0: β = β0
Alternative hypothesis: H1: β ≠ β0

Z = (b − β0) / s.d.(b)        Reject H0 if |Z| > 1.96 (5% significance level)

[Figure: rejection regions of 2.5% in each tail for a 5% test, and 0.5% in each tail for a 1% test]
In practice s.d.(b) is unknown and is replaced by the estimated standard error, so the test statistic follows a t distribution (next slides).
We look up the critical value of t and, if the t statistic is greater than it,
positive or negative, we reject the null hypothesis. If it is not, we do not.
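A sketch of the two-tailed test as it is carried out in practice, replacing the unknown s.d.(b) with the estimated standard error and using the t distribution with n − 2 degrees of freedom (introduced on the next slide); the data are simulated and the hypothesized value is β0 = 0:

# Sketch: two-tailed t test of H0: beta = 0 against H1: beta != 0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n = 30
x = rng.uniform(0, 10, n)
y = 1.0 + 0.4 * x + rng.normal(0, 2, n)

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()
e = y - a - b * x
se_b = np.sqrt(np.sum(e**2) / (n - 2) / np.sum((x - x.mean())**2))

t_stat = (b - 0) / se_b                        # test statistic under H0: beta = 0
t_crit = stats.t.ppf(0.975, df=n - 2)          # 5% two-tailed critical value
print(t_stat, t_crit, abs(t_stat) > t_crit)    # reject H0 if |t| exceeds the critical value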
t TEST OF A HYPOTHESIS RELATING TO A
REGRESSION COEFFICIENT
[Figure: density of the standard normal compared with t distributions for n = 10 and n = 5; the t distributions have fatter tails]

model: y = α + βx + u
null hypothesis: H0: β ≤ 0
alternative hypothesis: H1: β > 0
ONE-TAILED TESTS
[Figure: probability density function of b under H0: β ≤ 0 against H1: β > 0; the upper 5% tail begins 1.65 standard deviations above 0]
However, if you can justify the use of a one-tailed test, for example with H1:
β > 0, your estimate only has to be 1.65 standard deviations above 0.
This makes it easier to reject H0 and thereby demonstrate that y really is
influenced by x (assuming that your model is correctly specified).
CONFIDENCE
INTERVALS
CONFIDENCE INTERVALS
[Figure: a 95% confidence interval leaves 2.5% in each tail]

R² = ESS/TSS = Σ(ŷi − ȳ)² / Σ(yi − ȳ)²

y = α + βx + u
H0: β = 0,  H1: β ≠ 0
Since x is the only explanatory variable at the moment, the null hypothesis is that y is not
determined by x. Mathematically, we have H0: β = 0.
F TEST OF GOODNESS OF FIT
Var(y) = Var(ŷ) + Var(e)
Σ(y − ȳ)² = Σ(ŷ − ȳ)² + Σe²
TSS = ESS + RSS

y = α + βx + u
H0: β = 0,  H1: β ≠ 0
H0: R² = 0,  H1: R² ≠ 0        R² = ESS/TSS = Σ(ŷi − ȳ)² / Σ(yi − ȳ)²

F(k, n − k − 1) = [ESS/k] / [RSS/(n − k − 1)] = [(ESS/TSS)/k] / [(RSS/TSS)/(n − k − 1)]
               = (R²/k) / [(1 − R²)/(n − k − 1)]

If the calculated F statistic is greater than the F table value, reject the null
hypothesis concerning goodness of fit.
The F statistic is defined as shown. k is the number of explanatory
variables, which at present is just 1.
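A sketch of the F test computed from R², again on simulated data with k = 1; the critical value comes from the F distribution rather than a printed table:

# Sketch: F test of overall goodness of fit computed from R^2,
# F(k, n-k-1) = (R^2/k) / ((1-R^2)/(n-k-1)), with k = 1 here.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, k = 60, 1
x = rng.uniform(0, 10, n)
y = 2.0 + 0.6 * x + rng.normal(0, 3, n)

r2 = np.corrcoef(x, y)[0, 1]**2               # R^2 of the simple regression
F = (r2 / k) / ((1 - r2) / (n - k - 1))
F_crit = stats.f.ppf(0.95, k, n - k - 1)      # 5% critical value
print(F, F_crit, F > F_crit)                  # reject H0: R^2 = 0 if F exceeds the critical value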
F TEST OF GOODNESS OF FIT and R2
[Plot: F(k, n − k − 1) = (R²/k) / ((1 − R²)/(n − k − 1)) as a function of R², rising steeply as R² approaches 1]
[Scatter plot: KILO against BOY, correlation = 0.85]
Example1: Height vs. Weight
[Scatter plot: HEIGHT against WEIGHT]
Example1: Height vs. Weight
Dependent Variable: KILO
Method: Least Squares
Sample (adjusted): 1 44
Included observations: 44 after adjustments
Variable   Coefficient   Std. Error   t-Statistic   Prob.

Fitted equation: KILO = −161.63 + 1.307·BOY
Generally the constant term is not interpreted; mostly it is meaningless.
Applications
Applicationi = β0 + β1·Ranki + ui
[Plot: APPLICATION against RANK]