
ECONOMETRICS

SIMPLE REGRESSION
WEEK2

FALL 2024

Prof. Dr. Burç Ülengin


The Simple Regression
Model

y = β0 + β1x + u
Types of Data – Cross Sectional
➢ Cross-sectional data is a random sample
𝑦𝑖 𝑥𝑖
𝑦𝑖 = 𝛽0 + 𝛽1 𝑥𝑖 + 𝑢𝑖
➢ Each observation is a new individual, firm,
etc., with information at a point in time

➢ If the data is not a random sample, we have


a sample-selection problem
3
Types of Data – Time Series
➢ Time series data has a separate observation for
each time period – e.g., stock prices, GDP
𝑦𝑡 𝑥𝑡
𝑦𝑡 = 𝛽0 + 𝛽1 𝑥𝑡 + 𝑢𝑡
➢ Since it is not a random sample, different
problems to consider

➢ Trends, seasonality, and dynamic behavior


will be important 4
Types of Data – Panel
➢ Can pool random cross-sections and treat
them similarly to a typical cross-section. We
will need to account for time differences.
𝑦𝑖𝑡 𝑥𝑖𝑡
𝑦𝑖𝑡 = 𝛽0 + 𝛽1 𝑥𝑖𝑡 + 𝑢𝑖𝑡
➢ Can follow the same random individual
observations over time – known as panel data
or longitudinal data
5
Some Terminology
➢In the simple linear regression model,
where y = β0 + β1x + u, we typically refer
to y as the
▪ Dependent Variable, or
▪ Left-Hand Side Variable, or
▪ Explained Variable, or
▪ Regressand

6
Some Terminology, cont.
➢In the simple linear regression of y on x, we
typically refer to x as the
▪ Independent Variable, or
▪ Right-Hand Side Variable, or
▪ Explanatory Variable, or
▪ Regressor, or
▪ Covariate, or
▪ Control Variable
7
EXAMPLE 1: Weight vs. Height
[Scatter plot: KILO (weight, kg) against BOY (height, cm)]
EXAMPLE 1: Weight vs. Height
[Scatter plot: KILO (weight, kg) against BOY (height, cm)]
Variance-covariance matrix (BOY, KILO):
   66.173    86.493
   86.493   156.353
Correlation = 0.85
EXAMPLE 1: Weight vs. Height
[Scatter plot of KILO against BOY with the fitted regression line]
EXAMPLE 1: Weight vs. Height vs. Gender
[Scatter plot of KILO against BOY, with males (▲) and females (●) marked separately]
Variance-covariance matrices by gender:
   25.464    19.953          79.524    35.147
   19.953    42.437          35.147    27.061
   Correlation = 0.61        Correlation = 0.76
EXAMPLE 1: Regression Lines – Weight vs. Height vs. Gender
Weight = β0 + β1 Height + u
[Scatter plot of KILO against BOY with a separate regression line fitted for each gender]
EXAMPLE 2: Capital Asset Pricing Model (CAPM)
ri = return of asset i, rm = return of the market, rf = risk-free interest rate
(ri − rf) = α + β(rm − rf) + ε
Var(ri − rf) = Var[α + β(rm − rf) + ε]
             = Var[α] + Var[β(rm − rf)] + Var[ε]
             = β² Var[(rm − rf)] + Var[ε]
Total Risk = Risk Originating from the Market + Risk Originating from the Firm
EXAMPLE 2: Capital Asset Pricing Model (CAPM)
[Scatter plot of the excess return RPET−RF against the market excess return RM−RF, with a fitted line; the slope of the line is β]
EXAMPLE 2: CAPM – Turkish Stock Exchange: BIST100, Akbank and Garanti stocks, 19/10/2020–18/10/2021
[Time-series plot of the BIST100 index and the Akbank and Garanti stock prices over the sample period]
EXAMPLE 2: CAPM – Turkish Stock Exchange: BIST100, Akbank and Garanti stocks, 19/10/2020–18/10/2021
[Scatter plots of DLOG(AKBANK) and DLOG(GARANTI) against DLOG(BIST100), each with a fitted regression line]
Return(AKBANK) = -0.00089 + 1.186*Return(BIST100)
Return(GARANTI) = -0.000055 + 1.253*Return(BIST100)


EXAMPLE 2: Risk Decomposition – Akbank and Garanti stocks, 19/10/2020–18/10/2021
Var(ri − rf) = β² Var[(rm − rf)] + Var[ε]
Total Risk = Risk Originating from the Market + Risk Originating from the Firm
AKBANK:  4.41 = 1.19² × 1.97 = 2.79  +  1.62 (37%)
GARANTI: 5.75 = 1.25² × 1.97 = 3.08  +  2.67 (46%)
Garanti is riskier than Akbank.
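The decomposition above can be reproduced mechanically. Below is a minimal Python/NumPy sketch, using simulated excess returns rather than the actual BIST100/Akbank series (which are not reproduced here), so the numbers are placeholders; β is estimated with the OLS formulas derived later in these slides.

import numpy as np

rng = np.random.default_rng(0)

# Simulated daily excess returns standing in for the real data
market_excess = rng.normal(0.0005, 0.014, 250)                    # r_m - r_f
stock_excess = 1.2 * market_excess + rng.normal(0, 0.012, 250)    # r_i - r_f

# OLS estimates: beta = Cov(x, y) / Var(x), alpha = ybar - beta * xbar
beta = np.cov(market_excess, stock_excess, ddof=1)[0, 1] / np.var(market_excess, ddof=1)
alpha = stock_excess.mean() - beta * market_excess.mean()

# Risk decomposition: Var(r_i - r_f) = beta^2 Var(r_m - r_f) + Var(eps)
total_risk = np.var(stock_excess, ddof=1)
market_risk = beta ** 2 * np.var(market_excess, ddof=1)
firm_risk = total_risk - market_risk

print(f"beta = {beta:.3f}, alpha = {alpha:.5f}")
print(f"total = {total_risk:.6f}, market = {market_risk:.6f}, firm = {firm_risk:.6f}")
print(f"firm-specific share = {firm_risk / total_risk:.0%}")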
Simple Regression Assumptions

𝑦𝑖 = 𝛽0 + 𝛽1 𝑥𝑖 + 𝑢𝑖
➢The average value of u, the error term, in
the population is 0. That is,

◼ E(u) = 0

➢This is not a restrictive assumption, since we can always use β0 to normalize E(u) to 0

18
Simple Regression Assumptions
➢ Zero Conditional Mean
➢ We need to make a crucial assumption about
how u and x are related
➢ We want it to be the case that knowing
something about x does not give us any
information about u, so that they are
completely unrelated. That is, that
➢ E(u|x) = E(u) = 0, which implies
➢ E(y|x) = β0 + β1x
Simple Regression Assumptions
➢ E(y|x) as a linear function of x, where for any x
the distribution of y is centered about E(y|x)

20
Parameter Estimation:
Ordinary Least Squares
➢ Basic idea of regression is to estimate the
population parameters from a sample
➢ Let {(xi,yi): i=1, …,n} denote a random
sample of size n from the population
➢ For each observation in this sample, it will
be the case that
yi = β0 + β1xi + ui

21
SIMPLE REGRESSION MODEL
y = α + βx
[Diagram: the points Q1, Q2, Q3, Q4 at x1, x2, x3, x4 lie exactly on the line y = α + βx; α is the intercept]
If the relationship were an exact, deterministic one, the observations would lie on a straight line and we would have no trouble obtaining accurate estimates of α and β.
SIMPLE REGRESSION MODEL – Random Disturbances
[Scatter diagram: age of first child (y) against age of mother (x), with the line y = α + βx; the observations scatter around the line]
SIMPLE REGRESSION MODEL – Population
[Diagram: actual values P1–P4 scattered around the line y = α + βx; Q1–Q4 are the corresponding points on the line and u1–u4 are the vertical distances between them]
Actual value P = Deterministic part Q + Stochastic part u
To allow for such divergences, we will write the model as y = α + βx + u, where u is a disturbance term.
SIMPLE REGRESSION MODEL – Sample
y (actual value), ŷ (fitted value), y − ŷ = e (residual)
[Diagram: actual values y1–y4 and fitted values ŷ1–ŷ4 on the fitted line ŷ = a + bx; the residuals e1–e4 are the vertical distances between them; the intercept a and slope b are to be determined]
The discrepancies between the actual and fitted values of y are known as the residuals.
Deriving Linear
Regression Coefficients

27
DERIVING THE LINEAR REGRESSION COEFFICIENTS a AND b
Ordinary Least Squares (OLS)
yi = ŷi + ei = a + bxi + ei
Least squares criterion: minimize S, where
S = Σ ei² = e1² + ... + en²
To begin with, we will draw the fitted line so as to minimize the sum of the squares of the residuals. This is described as the least squares criterion.
HISTORICAL BACKGROUND
➢ GAUSS 1795 – early user, not published
➢ LEGENDRE 1805 – an explicit mathematical representation
➢ GAUSS 1809 – integrated with probability theory
➢ GALTON 1900 – first used the word "regression."

[Illustration: orbit estimation – estimation of the coefficients of the ellipse function]
DERIVING ORDINARY LEAST SQUARES (OLS)
Least squares criterion: minimize S, where
S = Σ ei² = e1² + ... + en²
Why not minimize Σ ei = e1 + ... + en?
➢ Why the squares of the residuals? Why not just minimize the sum of the residuals itself?
1. To avoid positive and negative residuals canceling each other out, and
2. To penalize large residuals more heavily than small ones.
DERIVING THE LINEAR REGRESSION COEFFICIENTS
True model: y = α + βx + u
Fitted line: ŷ = a + bx
[Diagram: the three observations plotted in the (x, y) plane]
This sequence shows how the regression coefficients for a simple regression model are derived, using the least squares criterion (OLS, for ordinary least squares).
We will start with a numerical example with just three observations: (1,3), (2,5), and (3,6).
DERIVING THE LINEAR REGRESSION COEFFICIENTS
True model: y = α + βx + u
Fitted line: ŷ = a + bx
[Diagram: for a candidate line with intercept a and slope b, the fitted values are ŷ1 = a + b, ŷ2 = a + 2b, ŷ3 = a + 3b]
Writing the fitted regression as ŷ = a + bx, we will determine the values of a and b that minimize the sum of the squares of the residuals.
DERIVING THE LINEAR REGRESSION COEFFICIENTS
y = ŷ + e,  y = (a + bx) + e,  e = y − a − bx
e1 = y1 − ŷ1 = 3 − a − b
e2 = y2 − ŷ2 = 5 − a − 2b
e3 = y3 − ŷ3 = 6 − a − 3b
Given our choice of a and b, the residuals are as shown.
DERIVING THE LINEAR REGRESSION COEFFICIENTS
S = e1² + e2² + e3² = (3 − a − b)² + (5 − a − 2b)² + (6 − a − 3b)²
  = 9 + a² + b² − 6a − 6b + 2ab
  + 25 + a² + 4b² − 10a − 20b + 4ab
  + 36 + a² + 9b² − 12a − 36b + 6ab
  = 70 + 3a² + 14b² − 28a − 62b + 12ab
The sum of the squares of the residuals is thus as shown above. The quadratics have been expanded and like terms added together.
DERIVING THE LINEAR REGRESSION COEFFICIENTS
S = 70 + 3a² + 14b² − 28a − 62b + 12ab
∂S/∂a = 0  ⇒  6a + 12b − 28 = 0
For a minimum, the partial derivatives of S with respect to a and b should be zero. (We should also check a second-order condition.)
DERIVING THE LINEAR REGRESSION COEFFICIENTS
S = 70 + 3a² + 14b² − 28a − 62b + 12ab
∂S/∂a = 0  ⇒  6a + 12b − 28 = 0
∂S/∂b = 0  ⇒  12a + 28b − 62 = 0
The first-order conditions give us two equations in two unknowns.
DERIVING THE LINEAR REGRESSION COEFFICIENTS
∂S/∂a = 0  ⇒  6a + 12b = 28
∂S/∂b = 0  ⇒  12a + 28b = 62
Solving them, we find that S is minimized when a = 1.67 and b = 1.50.
DERIVING THE LINEAR REGRESSION COEFFICIENTS
True model: y = α + βx + u
Fitted line: ŷ = 1.67 + 1.50x
Fitted values: ŷ1 = 3.17, ŷ2 = 4.67, ŷ3 = 6.17
The fitted line and the fitted values of y are as shown.
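The arithmetic of this three-observation example can be checked with a few lines of Python/NumPy; this is only a verification sketch, not part of the original slides.

import numpy as np

# Normal equations from the slides: 6a + 12b = 28 and 12a + 28b = 62
A = np.array([[6.0, 12.0],
              [12.0, 28.0]])
rhs = np.array([28.0, 62.0])
a, b = np.linalg.solve(A, rhs)
print(f"a = {a:.2f}, b = {b:.2f}")        # a = 1.67, b = 1.50

# Fitted values and residuals for the three observations (1,3), (2,5), (3,6)
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 6.0])
y_hat = a + b * x                          # 3.17, 4.67, 6.17
print(y_hat, ((y - y_hat) ** 2).sum())     # minimized sum of squared residuals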
DERIVING THE LINEAR REGRESSION COEFFICIENTS
Now we will do the same thing for the general case with n observations.
True model: y = α + βx + u
Fitted line: ŷ = a + bx
[Diagram: n observations (x1, y1), ..., (xn, yn) and a candidate fitted line with intercept a and slope b, so that ŷ1 = a + bx1 and ŷn = a + bxn]
Given our choice of a and b, we will obtain a fitted line as shown.
DERIVING THE LINEAR REGRESSION COEFFICIENTS
e1 = y1 − ŷ1 = y1 − a − bx1
.....
en = yn − ŷn = yn − a − bxn
The residual for the first observation is defined. Similarly, we define the residuals for the remaining observations.
DERIVING THE LINEAR REGRESSION COEFFICIENTS
Numerical example:
S = e1² + e2² + e3² = (3 − a − b)² + (5 − a − 2b)² + (6 − a − 3b)² = 70 + 3a² + 14b² − 28a − 62b + 12ab
General case:
S = e1² + ... + en² = (y1 − a − bx1)² + ... + (yn − a − bxn)²
  = y1² + a² + b²x1² − 2ay1 − 2bx1y1 + 2abx1
  + ...
  + yn² + a² + b²xn² − 2ayn − 2bxnyn + 2abxn
  = Σyi² + na² + b²Σxi² − 2aΣyi − 2bΣxiyi + 2abΣxi
The sum of the squares of the residuals is defined for the general case. The data for the numerical example are shown for comparison. The quadratics are expanded and like terms added together.
DERIVING THE LINEAR REGRESSION COEFFICIENTS
Numerical example: S = 70 + 3a² + 14b² − 28a − 62b + 12ab
∂S/∂a = 0 ⇒ 6a + 12b − 28 = 0;  ∂S/∂b = 0 ⇒ 12a + 28b − 62 = 0  ⇒  a = 1.67, b = 1.50
General case: S = Σyi² + na² + b²Σxi² − 2aΣyi − 2bΣxiyi + 2abΣxi
The first derivatives of S with respect to a and b provide us with two equations that can be used to determine a and b.
Note that in this situation the observations on x and y are just data which determine the coefficients in the expression for S.
The choice variables in the expression are a and b. This may seem a bit strange because in elementary calculus courses a and b are always constants and x and y are variables.
DERIVING THE LINEAR REGRESSION COEFFICIENTS
S = Σyi² + na² + b²Σxi² − 2aΣyi − 2bΣxiyi + 2abΣxi
∂S/∂a = 0  ⇒  2na − 2Σyi + 2bΣxi = 0
            ⇒  na = Σyi − bΣxi  ⇒  a = ȳ − bx̄
The first derivative with respect to a. With some simple manipulation we obtain a tidy expression for a.
DERIVING THE LINEAR REGRESSION COEFFICIENTS
The first derivative with respect to b:
S = Σyi² + na² + b²Σxi² − 2aΣyi − 2bΣxiyi + 2abΣxi
∂S/∂b = 0  ⇒  2bΣxi² − 2Σxiyi + 2aΣxi = 0
Divide through by 2:
bΣxi² − Σxiyi + aΣxi = 0
DERIVING THE LINEAR REGRESSION COEFFICIENTS
We now substitute for a using the expression obtained for it, and we thus obtain an equation that contains b only.
a = ȳ − bx̄
bΣxi² − Σxiyi + aΣxi = 0
bΣxi² − Σxiyi + (ȳ − bx̄)Σxi = 0
DERIVING THE LINEAR REGRESSION COEFFICIENTS
The definition of the sample mean has been used: x̄ = Σxi / n, so Σxi = nx̄.
bΣxi² − Σxiyi + (ȳ − bx̄)nx̄ = 0
Terms not involving b have been transferred to the right side and the equation has been divided through by n:
b(Σxi² − nx̄²) = Σxiyi − nx̄ȳ
b[(1/n)Σxi² − x̄²] = (1/n)Σxiyi − x̄ȳ
DERIVING THE LINEAR REGRESSION COEFFICIENTS
b[(1/n)Σxi² − x̄²] = (1/n)Σxiyi − x̄ȳ
b Var(x) = Cov(x, y)
Hence we obtain a tidy expression for b:
b = Cov(x, y) / Var(x)
DERIVING THE LINEAR REGRESSION COEFFICIENTS
True model: y = α + βx + u
Fitted line: ŷ = a + bx
a = ȳ − bx̄
b = Cov(x, y) / Var(x)
a = ȳ − bx̄  ⇒  ȳ = a + bx̄, so the fitted line passes through the point of sample means (x̄, ȳ).
The expression for a is standard, and we will soon see that it generalizes easily. There are various ways of writing the expression for b.
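For concreteness, here is a minimal Python/NumPy sketch applying these formulas to made-up data; the data and the cross-check against np.polyfit are illustrative assumptions, not part of the slides.

import numpy as np

def ols_simple(x: np.ndarray, y: np.ndarray) -> tuple[float, float]:
    """Intercept and slope from a = ybar - b*xbar and b = Cov(x, y) / Var(x)."""
    b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    a = y.mean() - b * x.mean()
    return a, b

# Made-up data: any x with nonzero sample variance will do
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 100)

a, b = ols_simple(x, y)
print(f"a = {a:.3f}, b = {b:.3f}")

# Cross-check against NumPy's built-in least-squares line fit
b_np, a_np = np.polyfit(x, y, deg=1)
print(np.isclose(a, a_np), np.isclose(b, b_np))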
Summary of OLS Slope Estimate

➢The slope estimate is the sample covariance


between x and y divided by the sample
variance of x
◼ If x and y are positively correlated, the

slope will be positive


◼ If x and y are negatively correlated, the

slope will be negative


➢ Only need x to vary in our sample
58
Interpretation of Regression Equation

INTERPRETATION OF A LINEAR REGRESSION EQUATION
ŷ = β0 + β1x
x* = x + 1
ŷ* = β0 + β1x* = β0 + β1(x + 1) = (β0 + β1x) + β1 = ŷ + β1
If x increases by 1 unit, ŷ changes by β1 units.
INTERPRETATION OF A REGRESSION EQUATION
[Scatter diagram: hourly earnings ($) against highest grade completed]
The scatter diagram shows hourly earnings in 1994 plotted against highest grade completed for a sample of 570 respondents from the National Longitudinal Survey of Youth.
Highest grade completed means just that for elementary and high school. Grades 13, 14, and 15 mean completion of one, two, and three years of college.
Grade 16 means completion of four-year college. Higher grades indicate years of postgraduate education.
INTERPRETATION OF A REGRESSION
EQUATION
. reg earnings hgc

Source | SS df MS Number of obs = 570


---------+------------------------------ F( 1, 568) = 65.64
Model | 3977.38016 1 3977.38016 Prob > F = 0.0000
Residual | 34419.6569 568 60.5979875 R-squared = 0.1036
---------+------------------------------ Adj R-squared = 0.1020
Total | 38397.0371 569 67.4816117 Root MSE = 7.7845

------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | 1.073055 .1324501 8.102 0.000 .8129028 1.333206
_cons | -1.391004 1.820305 -0.764 0.445 -4.966354 2.184347
------------------------------------------------------------------------------

This is the output from a regression of earnings on highest grade completed, using Stata.
Units: HGC = years of schooling (highest grade completed); EARNINGS = hourly earnings in $.
INTERPRETATION OF A REGRESSION EQUATION
. reg earnings hgc

Source | SS df MS Number of obs = 570


---------+------------------------------ F( 1, 568) = 65.64
Model | 3977.38016 1 3977.38016 Prob > F = 0.0000
Residual | 34419.6569 568 60.5979875 R-squared = 0.1036
---------+------------------------------ Adj R-squared = 0.1020
Total | 38397.0371 569 67.4816117 Root MSE = 7.7845

------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | 1.073055 .1324501 8.102 0.000 .8129028 1.333206
_cons | -1.391004 1.820305 -0.764 0.445 -4.966354 2.184347
------------------------------------------------------------------------------

For the time being, we will be concerned only with the estimates of the
parameters. The variables in the regression are listed in the first
column and the second column gives the estimates of their coefficients.
In this case there is only one variable, HGC, and its coefficient is 1.073.
The estimate of the intercept is -1.391. _cons, in Stata, refers to the
constant. 63
INTERPRETATION OF A REGRESSION EQUATION
EARNINGS^ = -1.391 + 1.073 HGC
[Scatter diagram of hourly earnings against highest grade completed, with the fitted regression line]
Here is the scatter diagram again, with the regression line shown. What do the coefficients actually mean?
INTERPRETATION OF A REGRESSION EQUATION
EARNINGS^ = -1.391 + 1.073 HGC
[Scatter diagram with the fitted regression line]
To answer this question, you must refer to the units in which the variables are measured.
HGC is measured in years (strictly speaking, grades completed), EARNINGS in dollars per hour. So the slope coefficient implies that hourly earnings increase by $1.07 for each extra year of schooling.
INTERPRETATION OF A REGRESSION EQUATION
EARNINGS^ = -1.391 + 1.073 HGC
[Scatter diagram with the fitted regression line; a section of the diagram is marked for enlargement]
We will look at a geometrical representation of this interpretation. To do this, we will enlarge the marked section of the scatter diagram.
INTERPRETATION OF A REGRESSION EQUATION
[Enlarged section of the diagram: moving from grade 11 to grade 12 raises the fitted value of hourly earnings from $10.41 to $11.49, an increase of $1.07]
The regression line indicates that completing 12th grade instead of 11th grade would increase earnings by $1.073, from $10.413 to $11.486, as a general tendency.
INTERPRETATION OF A REGRESSION EQUATION
EARNINGS^ = -1.391 + 1.073 HGC
[Scatter diagram with the fitted regression line]
You should ask yourself whether this is a plausible figure. If it is implausible, this could be a sign that your model is misspecified in some way.
For low levels of education it might be plausible, but for high levels it would seem to be an underestimate.
INTERPRETATION OF A REGRESSION EQUATION
EARNINGS^ = -1.391 + 1.073 HGC
[Scatter diagram with the fitted regression line]
What about the constant term? (Try to answer this question yourself before continuing with this sequence.)
Literally, the constant indicates that an individual with no years of education would have to pay $1.39 per hour to be allowed to work.
INTERPRETATION OF A REGRESSION EQUATION
EARNINGS^ = -1.391 + 1.073 HGC
[Scatter diagram with the fitted regression line]
This does not make any sense at all. In former times craftsmen might require an initial payment when taking on an apprentice, and might pay the apprentice little or nothing for quite a while, but an interpretation of negative payment is impossible to sustain.
INTERPRETATION OF A REGRESSION EQUATION
EARNINGS^ = -1.391 + 1.073 HGC
[Scatter diagram with the fitted regression line]
A safe solution to the problem is to limit the interpretation to the range of the sample data, and to refuse to extrapolate on the ground that we have no evidence outside the data range.
With this explanation, the only function of the constant term is to enable you to draw the regression line at the correct height on the scatter diagram. It has no meaning of its own.
INTERPRETATION OF A REGRESSION EQUATION
[Scatter diagram of hourly earnings against highest grade completed]
Another solution is to explore the possibility that the true relationship is nonlinear. We will soon extend the regression technique to fit nonlinear models.
Algebraic Properties of OLS
➢ The sum of the OLS residuals is zero.
Thus, the sample average of the OLS
residuals is zero as well
➢ The sample covariance between the
regressors and the OLS residuals is zero
➢ The OLS regression line always goes
through the mean of the sample

77
More Terminology
We can think of each observation as being made up of an explained part and an unexplained part,
yi = ŷi + ûi. We then define the following:
Σ(yi − ȳ)² is the total sum of squares (SST)
Σ(ŷi − ȳ)² is the explained sum of squares (SSE)
Σûi² is the residual sum of squares (SSR)
Then SST = SSE + SSR
Goodness-of-Fit R²

• How do we think about how well our sample regression line fits our sample data?

• We can compute the fraction of the total sum of squares (SST) that is explained by the model; call this the R-squared of the regression:

R² = SSE / SST = 1 − SSR / SST
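As an illustration of the decomposition and of the two equivalent ways of computing R², here is a minimal Python/NumPy sketch on made-up data (the data-generating line and sample size are arbitrary assumptions):

import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 20, 50)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 50)

# OLS fit using the formulas from the slides
b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()
y_hat = a + b * x
resid = y - y_hat

# Sum-of-squares decomposition
sst = ((y - y.mean()) ** 2).sum()       # total sum of squares
sse = ((y_hat - y.mean()) ** 2).sum()   # explained sum of squares
ssr = (resid ** 2).sum()                # residual sum of squares

print(np.isclose(sst, sse + ssr))       # SST = SSE + SSR
print(sse / sst, 1 - ssr / sst)         # two equivalent expressions for R^2
print(np.corrcoef(y, y_hat)[0, 1] ** 2) # equals the squared correlation of y and y-hat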
Goodness-of-Fit R²
ei = yi − ŷi  ⇒  yi = ŷi + ei
Var(y) = Var(ŷ + e) = Var(ŷ) + Var(e) + 2Cov(ŷ, e) = Var(ŷ) + Var(e)
(1/n)Σ(y − ȳ)² = (1/n)Σ(ŷ − ȳ)² + (1/n)Σ(e − ē)²
Σ(y − ȳ)² = Σ(ŷ − ȳ)² + Σe²
TSS = ESS + RSS
R² = ESS/TSS = Σ(ŷi − ȳ)² / Σ(yi − ȳ)² = 1 − Σei² / Σ(yi − ȳ)²
The main criterion of goodness of fit, formally described as the coefficient of determination but usually referred to as R², is defined as the ratio of ESS to TSS, that is, the proportion of the variance of y explained by the regression equation.
Goodness-of-Fit R²
r(y, ŷ) = Cov(y, ŷ) / √[Var(y) Var(ŷ)]
        = Cov([ŷ + e], ŷ) / √[Var(y) Var(ŷ)]
        = [Cov(ŷ, ŷ) + Cov(e, ŷ)] / √[Var(y) Var(ŷ)]
        = Var(ŷ) / √[Var(y) Var(ŷ)]
        = √[Var(ŷ) / Var(y)]
R² = ESS/TSS = Σ(ŷi − ȳ)² / Σ(yi − ȳ)² = [(1/n)Σ(ŷi − ȳ)²] / [(1/n)Σ(yi − ȳ)²] = Var(ŷ) / Var(y)
Thus the correlation coefficient between y and ŷ is the square root of R². It follows that R² is maximized by the use of the least squares principle to determine the regression coefficients.
Goodness-of-Fit R2
. reg earnings hgc

Source | SS df MS Number of obs = 570


---------+------------------------------ F( 1, 568) = 65.64
Model | 3977.38016 1 3977.38016 Prob > F = 0.0000
Residual | 34419.6569 568 60.5979875 R-squared = 0.1036
---------+------------------------------ Adj R-squared = 0.1020
Total | 38397.0371 569 67.4816117 Root MSE = 7.7845

------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | 1.073055 .1324501 8.102 0.000 .8129028 1.333206
_cons | -1.391004 1.820305 -0.764 0.445 -4.966354 2.184347
------------------------------------------------------------------------------

In this case there is only one variable, HGC, and its coefficient is 1.073. _cons, in Stata, refers to the constant. The estimate of the intercept is -1.391.
Variation in HGC explains 10.4% of the variation in EARNINGS (R-squared = 0.1036).
Properties of OLS
yi = β0 + β1xi + ui
If the assumptions hold, the OLS estimator is
◼ Unbiased
◼ Efficient
➢ BLUE – Best Linear Unbiased Estimator
UNBIASEDNESS AND EFFICIENCY
[Diagram: four panels of estimates scattered around the true value.
 Top left – unbiased and efficient (OLS): estimates tightly clustered around the true value.
 Top right – unbiased but inefficient: centered on the true value but widely scattered.
 Bottom left – biased but efficient: tightly clustered away from the true value.
 Bottom right – biased and inefficient: widely scattered away from the true value.]
OLS ASSUMPTIONS
1. Assumptions on the disturbances
   a. Random disturbances have zero mean: E[ui] = 0
   b. Homoskedasticity: Var(ui) = σ²
   c. No serial correlation: Cov(ui, uj) = 0 for i ≠ j
2. Assumptions on the model and its parameters
   a. Constant parameters
   b. Linear model
3. Assumption on the probability distribution
   a. Normal distribution: ui ~ N(0, σ²)
4. Assumptions on the regressors
   a. Fixed (nonstochastic) regressors
REGRESSION
COEFFICIENTS

93
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
y = α + βx + u
ŷ = a + bx
b = Cov(x, y) / Var(x) = Cov(x, [α + βx + u]) / Var(x)
  = [Cov(x, α) + Cov(x, βx) + Cov(x, u)] / Var(x)
  = [0 + βCov(x, x) + Cov(x, u)] / Var(x)
b = β + Cov(x, u) / Var(x)
The error component Cov(x, u)/Var(x) depends on the value of the disturbance term in every observation in the sample, and thus b is a special type of random variable.
We will investigate its effect on b in two ways: first, directly, using a Monte Carlo experiment, and, second, analytically.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
Choose a model in which y is determined by x, the parameter values, and u: y = α + βx + u
Choose the data for x: x = 1, 2, ..., 20
Choose the parameter values: α = 2.0, β = 0.5
Choose a distribution for u: independent N(0, 1)
Model: y = 2.0 + 0.5x + u
Generate the values of y.
Estimators: b = Cov(x, y)/Var(x);  a = ȳ − bx̄
Estimate the values of the parameters.
We will then regress y on x using the OLS estimation technique and see how well our estimates a and b correspond to the true values α and β.
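A minimal Python/NumPy sketch of this Monte Carlo experiment, assuming the same design as the slides (x = 1, ..., 20, α = 2.0, β = 0.5, u ~ N(0, 1)); the number of replications is an arbitrary choice:

import numpy as np

rng = np.random.default_rng(42)
x = np.arange(1, 21, dtype=float)     # x = 1, 2, ..., 20
alpha, beta = 2.0, 0.5                # true parameter values

def one_replication() -> tuple[float, float]:
    """Generate y = 2.0 + 0.5x + u with u ~ N(0, 1) and return the OLS estimates (a, b)."""
    u = rng.normal(0.0, 1.0, x.size)
    y = alpha + beta * x + u
    b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    a = y.mean() - b * x.mean()
    return a, b

estimates = np.array([one_replication() for _ in range(10_000)])
print(estimates.mean(axis=0))   # close to (2.0, 0.5): the estimator is unbiased
print(estimates.std(axis=0))    # sampling standard deviations of a and b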
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
y = 2.0 + 0.5x + u
x    2.0+0.5x   u   y        x    2.0+0.5x   u   y
1    2.5                     11   7.5
2    3.0                     12   8.0
3    3.5                     13   8.5
4    4.0                     14   9.0
5    4.5                     15   9.5
6    5.0                     16   10.0
7    5.5                     17   10.5
8    6.0                     18   11.0
9    6.5                     19   11.5
10   7.0                     20   12.0
Given our choice of numbers for α and β, we can derive the nonstochastic component of y. (The u and y columns are filled in once the disturbances are generated.)
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
y = 2.0 + 0.5x
[Plot of the nonstochastic component y = 2.0 + 0.5x against x]
The nonstochastic component is displayed graphically.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
y = 2.0 + 0.5x + u
x    2.0+0.5x   u      y         x    2.0+0.5x   u      y
1    2.5       -0.59   1.91      11   7.5        1.59   9.09
2    3.0       -0.24   2.76      12   8.0       -0.92   7.08
3    3.5       -0.83   2.67      13   8.5       -0.71   7.79
4    4.0        0.03   4.03      14   9.0       -0.25   8.75
5    4.5       -0.38   4.12      15   9.5        1.69  11.19
6    5.0       -2.19   2.81      16   10.0       0.15  10.15
7    5.5        1.03   6.53      17   10.5       0.02  10.52
8    6.0        0.24   6.24      18   11.0      -0.11  10.89
9    6.5        2.53   9.03      19   11.5      -0.91  10.59
10   7.0       -0.13   6.87      20   12.0       1.42  13.42
Next, we generate randomly a value of the disturbance term for each observation using an N(0, 1) distribution (normal with zero mean and unit variance). We generate values of y for all 20 observations.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
[Scatter plot of the 20 generated observations (x, y)]
The 20 observations are displayed graphically.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
ŷ = 2.52 + 0.48x
[Scatter plot of the 20 observations with the fitted line ŷ = 2.52 + 0.48x]
This time the slope coefficient has been underestimated and the intercept overestimated.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
ŷ = 2.13 + 0.45x
[Scatter plot of the 20 observations with the fitted line ŷ = 2.13 + 0.45x]
As last time, the slope coefficient has been underestimated and the intercept overestimated.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
replication    a      b
1              1.63   0.54
2              2.52   0.48
3              2.13   0.45
4              2.14   0.50
5              1.71   0.56
6              1.81   0.51
7              1.72   0.56
8              3.18   0.41
9              1.26   0.58
10             1.94   0.52
[Histogram of the slope estimates b from these replications]
The table summarizes the results of the three regressions and adds those obtained by repeating the process a further seven times.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES

1-10 11-20 21-30 31-40 41-50

0.54 0.49 0.54 0.52 0.49


0.48 0.54 0.46 0.47 0.50
0.45 0.49 0.45 0.54 0.48
0.50 0.54 0.50 0.53 0.44
0.56 0.54 0.41 0.51 0.53
0.51 0.52 0.53 0.51 0.48
0.56 0.49 0.53 0.47 0.47
0.41 0.53 0.47 0.55 0.50
0.58 0.60 0.51 0.51 0.53
0.52 0.48 0.47 0.58 0.51

Here are the estimates of β obtained with 40 further replications of the process. (Each column lists the 10 slope estimates for the replications indicated in its header.)
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
[Histogram of the slope estimates from 50 replications]
The histogram is beginning to display a central tendency.
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
[Histogram of the slope estimates from 100 replications]
REGRESSION COEFFICIENTS AS RANDOM VARIABLES
[Histogram of the slope estimates from 100 replications, with the limiting distribution drawn as a red curve]
This is the histogram with 100 replications. We can see that the distribution appears to be symmetrical around the true value, implying that the estimator is unbiased.
The red curve shows the limiting shape of the distribution. It is symmetrical around the true value, confirming that the estimator is unbiased.
The distribution is normal because the disturbance term was drawn from a normal distribution.
OLS ASSUMPTIONS
1. Assumptions on the regressors
   a. Fixed (nonstochastic) regressors
2. Assumptions on the disturbances
   a. Random disturbances have zero mean: E[ui] = 0
   b. Homoskedasticity: Var(ui) = σ²
   c. No serial correlation: Cov(ui, uj) = 0 for i ≠ j
3. Assumptions on the model and its parameters
   a. Constant parameters
   b. Linear model
4. Assumption on the probability distribution
   a. Normal distribution: u ~ N(0, σ²)
Variance of the OLS Estimators
➢ Now we know that the sampling distribution of our estimate is centered around the true parameter
➢ We want to think about how spread out this distribution is
➢ It is much easier to think about this variance under an additional assumption, so
➢ Assume Var(u|x) = σ² (homoskedasticity)

Variance of OLS (cont.)
➢ Var(u|x) = E(u²|x) − [E(u|x)]²
➢ E(u|x) = 0, so σ² = E(u²|x) = E(u²) = Var(u)
➢ Thus σ² is also the unconditional variance, called the error variance
➢ σ, the square root of the error variance, is called the standard deviation of the error
➢ We can say: E(y|x) = β0 + β1x and Var(y|x) = σ²
Homoskedastic Case
[Diagram: conditional densities f(y|x) at x1 and x2, centered on the line E(y|x) = β0 + β1x, all with the same spread]
Heteroskedastic Case
[Diagram: conditional densities f(y|x) at x1, x2, x3, centered on the line E(y|x) = β0 + β1x, with the spread increasing in x]
PRECISION OF THE REGRESSION COEFFICIENTS
Simple regression model: y = α + βx + u
Variances of the regression coefficients:
pop.var(a) = (σu²/n) [1 + x̄²/Var(x)]
pop.var(b) = σu² / (n Var(x))
The variances are inversely proportional to n, the number of observations in the sample. The more information you have, the more accurate your estimates are likely to be.
The variances are proportional to σu², the variance of the disturbance term. The bigger the luck factor, the worse the estimates are likely to be, other things being equal.
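The formula pop.var(b) = σu²/(n Var(x)) can be checked by simulation. A minimal Python/NumPy sketch, reusing the y = 3.0 + 0.8x design mentioned on the next slide (the number of replications and σu = 1 are arbitrary assumptions):

import numpy as np

rng = np.random.default_rng(3)
x = np.arange(1, 21, dtype=float)
sigma_u = 1.0
n = x.size

def slope_estimate() -> float:
    y = 3.0 + 0.8 * x + rng.normal(0.0, sigma_u, n)
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

b_draws = np.array([slope_estimate() for _ in range(20_000)])

# Theoretical variance: sigma_u^2 / (n * Var(x)), with Var(x) = (1/n) * sum((x - xbar)^2)
var_b_theory = sigma_u ** 2 / (n * np.var(x))   # np.var uses ddof=0, i.e. divides by n
print(b_draws.var(), var_b_theory)              # the two should be close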
PRECISION OF THE REGRESSION COEFFICIENTS
Simple regression model: y = α + βx + u;  pop.var(b) = σu² / (n Var(x))
[Two scatter diagrams with the same x values and the same underlying line y = 3.0 + 0.8x (dotted), each with a fitted regression line (solid)]
This is illustrated by the diagrams above. The nonstochastic component of the relationship, y = 3.0 + 0.8x, represented by the dotted line, is the same in both diagrams.
The values of x are the same, and the same random numbers have been used to generate the values of the disturbance term in the 20 observations.
However, in the right-hand diagram the random numbers have been multiplied by a factor of 5. As a consequence, the regression line, the solid line, is a much poorer approximation to the nonstochastic relationship.
PRECISION OF THE REGRESSION COEFFICIENTS
Simple regression model: y = α + βx + u;  pop.var(b) = σu² / (n Var(x))
[Two scatter diagrams with the same disturbances but with the x values much more spread out in the left-hand diagram than in the right-hand one]
In the diagrams above, the nonstochastic component of the relationship is the same and the same random numbers have been used for the 20 values of the disturbance term.
However, Var(x) is much smaller in the right-hand diagram because the values of x are much closer together.
Hence in that diagram the position of the regression line is more sensitive to the values of the disturbance term, and as a consequence the regression line is likely to be relatively inaccurate.
PRECISION OF THE REGRESSION COEFFICIENTS
Simple regression model: y = α + βx + u
Variances of the regression coefficients
. reg earnings hgc

Source | SS df MS Number of obs = 570


---------+------------------------------ F( 1, 568) = 65.64
Model | 3977.38016 1 3977.38016 Prob > F = 0.0000
Residual | 34419.6569 568 60.5979875 R-squared = 0.1036
---------+------------------------------ Adj R-squared = 0.1020
Total | 38397.0371 569 67.4816117 Root MSE = 7.7845

------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | 1.073055 .1324501 8.102 0.000 .8129028 1.333206
_cons | -1.391004 1.820305 -0.764 0.445 -4.966354 2.184347
------------------------------------------------------------------------------

The standard errors of the coefficients always appear as part of the


output of a regression. Here is the regression of hourly earnings on
years of schooling discussed in a previous sequence. The standard
errors appear in a column to the right of the coefficients.
116
Variance of OLS: Summary
➢ The larger the error variance, σ², the larger the variance of the slope estimate
➢ The larger the variability in the xi, the smaller the variance of the slope estimate
➢ As a result, a larger sample size should decrease the variance of the slope estimate
➢ Problem: the error variance is unknown
Estimating the Error Variance
➢ We don't know what the error variance, σ², is, because we don't observe the errors, ui
➢ What we observe are the residuals, ûi
➢ We can use the residuals to form an estimate of the error variance
Error Variance Estimate (cont.)
ûi = yi − β̂0 − β̂1xi
   = (β0 + β1xi + ui) − β̂0 − β̂1xi
   = ui − (β̂0 − β0) − (β̂1 − β1)xi
Then an unbiased estimator of σ² is
σ̂² = [1/(n − 2)] Σûi² = SSR / (n − 2)
Error Variance Estimate (cont.)
σ̂ = √σ̂² is the standard error of the regression.
Recall that sd(β̂1) = σ / sx, where sx = [Σ(xi − x̄)²]^(1/2).
If we substitute σ̂ for σ, then we have the standard error of β̂1:
se(β̂1) = σ̂ / [Σ(xi − x̄)²]^(1/2)
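A minimal Python/NumPy sketch of these estimates on made-up data (the data-generating process is an arbitrary assumption for illustration):

import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, 80)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 80)

n = x.size
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)

ssr = (resid ** 2).sum()
sigma2_hat = ssr / (n - 2)                               # unbiased estimator of the error variance
se_reg = np.sqrt(sigma2_hat)                             # standard error of the regression
se_b1 = se_reg / np.sqrt(((x - x.mean()) ** 2).sum())    # se(beta1-hat)

print(f"sigma-hat = {se_reg:.3f}, se(b1) = {se_b1:.4f}")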
TESTING A HYPOTHESIS
RELATING TO A
REGRESSION COEFFICIENT
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT
Model:                    y = α + βx + u
Null hypothesis:          H0: β = β0
Alternative hypothesis:   H1: β ≠ β0
This sequence describes the testing of a hypothesis at the 5% and 1% significance levels. It also defines what is meant by a Type I error.
We will suppose that we have the standard simple regression model and that we wish to test the hypothesis H0 that the slope coefficient is equal to some value β0.
The hypothesis being tested is described as the null hypothesis. We test it against the alternative hypothesis H1, which is simply that β is not equal to β0.
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT
Model:                    y = α + βx + u
Null hypothesis:          H0: β = β0
Alternative hypothesis:   H1: β ≠ β0
Example model:            p = α + βw + u
Null hypothesis:          H0: β = 1.0
Alternative hypothesis:   H1: β ≠ 1.0
As an illustration, we will consider a model relating price inflation to wage inflation. p is the rate of growth of prices and w is the rate of growth of wages.
We will test the hypothesis that the rate of price inflation is equal to the rate of wage inflation. The null hypothesis is therefore H0: β = 1.0. (We should also test α = 0.)
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT
Decision rule (5% significance level): Reject H0: β = β0
• if b > β0 + 1.96 s.d.
• if b < β0 − 1.96 s.d.
Equivalently, with Z = (b − β0)/s.d., reject H0 if |Z| > 1.96.
[Diagram: probability density function of b under H0, with 2.5% rejection regions in each tail beyond β0 ± 1.96 s.d.]
Thus we would reject H0 if the estimate were 1.96 standard deviations (or more) above or below the hypothetical mean.
With the present test, if the null hypothesis is true, a Type I error will occur 5% of the time because 5% of the time we will get estimates in the upper or lower 2.5% tails.
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT
Type I error: rejection of H0 when it is in fact true.
Probability of a Type I error: in this case, 5%. The significance level of the test is 5%.
[Diagram: density of b under H0, with the acceptance region β0 ± 1.96 s.d. and 2.5% rejection regions in each tail]
The significance level of a test is defined to be the probability of making a Type I error if the null hypothesis is true.
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT
Decision rule (1% significance level): Reject H0: β = β0
(1) if b > β0 + 2.58 s.d.   (2) if b < β0 − 2.58 s.d.
(1) if Z > 2.58             (2) if Z < −2.58
Acceptance region for b: β0 − 2.58 s.d. < b < β0 + 2.58 s.d., i.e. −2.58 < Z < 2.58, where Z = (b − β0)/s.d.
[Diagram: density of b under H0, with 0.5% rejection regions in each tail beyond β0 ± 2.58 s.d.]
The 0.5% tails of a normal distribution start 2.58 standard deviations from the mean, so we now reject the null hypothesis if Z is greater than 2.58 in absolute terms.
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT
Type I error: rejection of H0 when it is in fact true.
Probability of a Type I error: in this case, 1%. The significance level of the test is 1%.
[Diagram: density of b under H0, with the acceptance region β0 ± 2.58 s.d. and 0.5% rejection regions in each tail]
Since the probability of making a Type I error, if the null hypothesis is true, is now only 1%, the test is said to be a 1% significance test.
t TEST OF A HYPOTHESIS
RELATING TO A REGRESSION
COEFFICIENT
t TEST OF A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT
s.d. of b known: the discrepancy between the hypothetical value and the sample estimate, in terms of s.d., is
  Z = (b − β0) / s.d.
  5% significance test: reject H0: β = β0 if Z > 1.96 or Z < −1.96
s.d. of b not known: the discrepancy between the hypothetical value and the sample estimate, in terms of s.e., is
  t = (b − β0) / s.e.
  5% significance test: reject H0: β = β0 if t > tcrit or t < −tcrit
We look up the critical value of t, and if the t statistic is greater than it, positive or negative, we reject the null hypothesis. If it is not, we do not.
t TEST OF A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT
[Diagram: the standard normal density together with t densities for n = 10 and n = 5]
So why do we make such a fuss about referring to the t distribution rather than the normal distribution? Would it really matter if we always used 1.96 for the 5% test and 2.58 for the 1% test?
The answer is that it does make a difference. Although the distributions are generally quite similar, the t distribution has longer tails than the normal distribution, the difference being the greater, the smaller the number of degrees of freedom.
t TEST OF A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT
[Diagram: close-up of the tails of the normal density and the t densities for n = 10 and n = 5]
As a consequence, the probability of obtaining a high test statistic on a pure chance basis is greater with a t distribution than with a normal distribution.
This means that the rejection regions have to start more standard deviations away from zero for a t distribution than for a normal distribution.
t TEST OF A HYPOTHESIS RELATING TO A
REGRESSION COEFFICIENT
t Distribution: Critical values of t
Degrees of Two-tailed test 10% 5% 2% 1% 0.2% 0.1%
freedom One-tailed test 5% 2.5% 1% 0.5% 0.1% 0.05%
1 6.314 12.706 31.821 63.657 318.31 636.62
2 2.920 4.303 6.965 9.925 22.327 31.598
3 2.353 3.182 4.541 5.841 10.214 12.924
4 2.132 2.776 3.747 4.604 7.173 8.610
5 2.015 2.571 3.365 4.032 5.893 6.869
… … … … … … …
… … … … … … …
18 1.734 2.101 2.552 2.878 3.610 3.922
19 1.729 2.093 2.539 2.861 3.579 3.883
20 1.725 2.086 2.528 2.845 3.552 3.850
… … … … … … …
… … … … … … …
120 1.658 1.980 2.358 2.617 3.160 3.373
 1.645 1.960 2.326 2.576 3.090 3.291
If we were performing a regression with 20 observations, as in the price
inflation/wage inflation example, the number of degrees of freedom
would be 18 and the critical value of t for a 5% test would be 2.101. 132
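Both the t statistic and its critical value can be computed directly. Here is a minimal Python sketch using scipy.stats, with placeholder values for the estimate and its standard error in the spirit of the price/wage inflation example (20 observations, 18 degrees of freedom):

from scipy import stats

b, se_b = 0.82, 0.10      # placeholder slope estimate and standard error
beta_0 = 1.0              # hypothetical value under H0: beta = 1.0
df = 18                   # n - 2 degrees of freedom for n = 20

t_stat = (b - beta_0) / se_b
t_crit = stats.t.ppf(0.975, df)           # two-tailed 5% critical value (2.101 for 18 df)
p_value = 2 * stats.t.sf(abs(t_stat), df)

print(f"t = {t_stat:.2f}, critical value = {t_crit:.3f}, p-value = {p_value:.3f}")
print("reject H0" if abs(t_stat) > t_crit else "do not reject H0")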
ONE-TAILED TESTS
Model:                    y = α + βx + u
Null hypothesis:          H0: β ≤ 0
Alternative hypothesis:   H1: β > 0
This occurs when you wish to demonstrate that a variable x influences another variable y. You set up the null hypothesis of no effect and try to reject H0.
ONE-TAILED TESTS
Null hypothesis:          H0: β ≤ 0
Alternative hypothesis:   H1: β > 0
[Diagram: density of b under H0, with a single 5% rejection region starting 1.65 s.d. above 0]
However, if you can justify the use of a one-tailed test, for example with H1: β > 0, your estimate only has to be 1.65 standard deviations above β0.
This makes it easier to reject H0 and thereby demonstrate that y really is influenced by x (assuming that your model is correctly specified).
CONFIDENCE INTERVALS

CONFIDENCE INTERVALS
Null hypothesis: H0: β = β0
[Diagram: probability density function of b conditional on β = β0 being true, with the acceptance region β0 ± 1.96 s.d. and 2.5% tails]
We ended by deriving the range of estimates that are compatible with H0 and called it the acceptance region.
CONFIDENCE INTERVALS
95% confidence interval:  b − tcrit(5%) × s.e. < β < b + tcrit(5%) × s.e.
99% confidence interval:  b − tcrit(1%) × s.e. < β < b + tcrit(1%) × s.e.
This implies that the standard error should be multiplied by the critical value of t, given the significance level and the number of degrees of freedom, when determining the limits of the interval.
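A minimal Python sketch of the 95% interval, using the coefficient and standard error from the earnings regression reported later in these slides (scipy.stats supplies the exact t critical value for 568 degrees of freedom):

from scipy import stats

b, se_b, df = 1.073055, 0.1324501, 568

t_crit = stats.t.ppf(0.975, df)                 # about 1.96 for 568 degrees of freedom
low, high = b - t_crit * se_b, b + t_crit * se_b
print(f"95% confidence interval for beta: [{low:.3f}, {high:.3f}]")   # roughly [0.813, 1.333]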
CONFIDENCE INTERVALS
[Diagram: the probability density functions of b (1) conditional on β = βmax being true and (2) conditional on β = βmin being true]
The diagram shows the limiting values of the hypothetical values of β, together with their associated probability distributions for b.
EXAMPLE: HYPOTHESIS TESTING
EARNINGS = α + β HGC + u
H0: β = 0,  H1: β ≠ 0

. reg earnings hgc

Source | SS df MS Number of obs = 570
---------+------------------------------ F( 1, 568) = 65.64
Model | 3977.38016 1 3977.38016 Prob > F = 0.0000
Residual | 34419.6569 568 60.5979875 R-squared = 0.1036
---------+------------------------------ Adj R-squared = 0.1020
Total | 38397.0371 569 67.4816117 Root MSE = 7.7845
------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | 1.073055 .1324501 8.102 0.000 .8129028 1.333206
_cons | -1.391004 1.820305 -0.764 0.445 -4.966354 2.184347
------------------------------------------------------------------------------

5% significance test: reject H0: β = 0 if t > tcrit or t < −tcrit.
t = (b − β0) / s.e. = (1.073 − 0) / 0.132 = 8.102
Since t = 8.102 > 1.96, reject H0: β = 0.
HGC has a significant effect on EARNINGS at the 5% significance level.
EXAMPLE: CONFIDENCE INTERVALS
𝐄𝐀𝐑𝐍𝐈𝐍𝐆𝐒 = 𝛂 + 𝛃 𝐇𝐆𝐂 + 𝐮
. reg earnings hgc

Source | SS df MS Number of obs = 570


---------+------------------------------ F( 1, 568) = 65.64
Model | 3977.38016 1 3977.38016 Prob > F = 0.0000
Residual | 34419.6569 568 60.5979875 R-squared = 0.1036
---------+------------------------------ Adj R-squared = 0.1020
Total | 38397.0371 569 67.4816117 Root MSE = 7.7845
------------------------------------------------------------------------------
earnings | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hgc | 1.073055 .1324501 8.102 0.000 .8129028 1.333206
_cons | -1.391004 1.820305 -0.764 0.445 -4.966354 2.184347
------------------------------------------------------------------------------
95% confidence interval for β:
b − tcrit(5%) × s.e. < β < b + tcrit(5%) × s.e.
1.073 − 1.96 × 0.132 < β < 1.073 + 1.96 × 0.132
0.813 < β < 1.333


F TEST
OF GOODNESS OF FIT
F TEST OF GOODNESS OF FIT
Var(y) = Var(ŷ) + Var(e)
Σ(y − ȳ)² = Σ(ŷ − ȳ)² + Σe²
TSS = ESS + RSS
R² = ESS/TSS = Σ(ŷi − ȳ)² / Σ(yi − ȳ)²
y = α + βx + u
H0: β = 0,  H1: β ≠ 0
Since x is the only explanatory variable at the moment, the null hypothesis is that y is not determined by x. Mathematically, we have H0: β = 0.
F TEST OF GOODNESS OF FIT
y = α + βx + u
Var(y) = Var(ŷ) + Var(e)
Σ(y − ȳ)² = Σ(ŷ − ȳ)² + Σe²
TSS = ESS + RSS
H0: β = 0, H1: β ≠ 0   (equivalently H0: R² = 0, H1: R² ≠ 0)
R² = ESS/TSS = Σ(ŷi − ȳ)² / Σ(yi − ȳ)²
F(k, n − k − 1) = [ESS/k] / [RSS/(n − k − 1)]
               = [(ESS/TSS)/k] / [(RSS/TSS)/(n − k − 1)]
               = [R²/k] / [(1 − R²)/(n − k − 1)]
If the calculated F test value is greater than the F table value, reject the null hypothesis concerning goodness of fit.
The F statistic is defined as shown. k is the number of explanatory variables, which at present is just 1.
F TEST OF GOODNESS OF FIT AND R²
F(k, n − k − 1) = [R²/k] / [(1 − R²)/(n − k − 1)]
[Plot of F against R²: F starts at 0 and increases without bound as R² approaches 1]
F is a monotonically increasing function of R². As R² increases, the numerator increases and the denominator decreases, so for both of these reasons F increases.
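A minimal Python sketch computing the F statistic from R² for the earnings regression (R² = 0.1036, n = 570, k = 1), with scipy.stats supplying the critical value and p-value:

from scipy import stats

r2, n, k = 0.1036, 570, 1

f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
f_crit = stats.f.ppf(0.95, k, n - k - 1)      # 5% critical value of F(1, 568)
p_value = stats.f.sf(f_stat, k, n - k - 1)

print(f"F = {f_stat:.2f} (Stata reports 65.64), critical value = {f_crit:.2f}, p = {p_value:.4f}")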
Evaluation of the Regression Results
1. Is the equation supported by sound theory?
2. How well does the estimated regression fit the data?
3. Is the data set reasonably large and accurate?
4. Is OLS the best estimator to be used for this equation?
5. How well do the estimated coefficients correspond to the expectations developed by the researcher before the data were collected?
6. Are all the obviously important variables included in the equation?
7. Has the most theoretically logical functional form been used?
8. Does the regression appear to be free of major econometric problems?
Example 1: Height vs. Weight
[Scatter plot of KILO (weight) against BOY (height); correlation = 0.85]
Example 1: Height vs. Weight
WEIGHTi = a + b·HEIGHTi + ei
[Scatter plot of HEIGHT against WEIGHT]
Example 1: Height vs. Weight
Dependent Variable: KILO
Method: Least Squares
Sample (adjusted): 1 44
Included observations: 44 after adjustments

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -161.6304     21.73056     -7.437930     0.0000
BOY         1.307075     0.124817     10.47192      0.0000

R-squared           0.723067    Mean dependent var       65.68182
Adjusted R-squared  0.716473    S.D. dependent var       12.64869
S.E. of regression  6.735079    Akaike info criterion    6.696925
Sum squared resid   1905.174    Schwarz criterion        6.778025
Log likelihood      -145.3324   Hannan-Quinn criter.     6.727001
F-statistic         109.6611    Durbin-Watson stat       2.648693
Prob(F-statistic)   0.000000

KILO = -161.63 + 1.307·BOY
Interpretation:
• Generally the constant term is not interpreted; mostly it is meaningless.
• If height increases by 1 cm, weight increases by about 1.3 kg.
• Height explains 72% of the variation of weight (R² = 0.72).
• The absolute value of the t statistic is greater than the table critical value (1.96), and the p-value is less than 5%, so reject H0: β1 = 0. The coefficient is significant: height has a significant effect on weight.
Example 2: College Applications
➢ Suppose that you work in the admissions office of a college that doesn't allow prospective students to apply by using the Common Application. How might you go about estimating the number of extra applications that your college would receive if it allowed the use of the Common Application?
◼ APPLICATIONi = the number of applications received by the ith college
◼ RANKi = the U.S. News rank of the ith college (1 = best)

COLLEGE                  APPLICATION   RANK   SIZE
Amherst College          6680          2      1648
Bard College             4980          36     1641
Bates College            4434          23     1744
Bowdoin College          5961          7      1726
Bucknell University      8934          29     3529
Carleton College         4840          6      1966
Centre College           2159          44     1144
Colby College            4679          20     1865
Colgate University       8759          16     2754
Colorado College         4826          26     1939
Connecticut College      4742          39     1802
Davidson College         3992          10     1667
Denison University       5196          48     2234
DePauw University        3624          48     2294
Dickinson College        5844          41     2372
Furman University        3879          41     2648
Gettysburg College       6126          45     2511
Grinnell College         3077          14     1556
Hamilton College         4962          17     1802
Haverford College        3492          9      1168

1. Is there any relationship between ranking and the number of applications?
2. Interpret the coefficient of RANK.
3. Is it significant?
4. Interpret the R² and F statistics.
Example 2: College Applications
[Scatter plot of APPLICATION against RANK]
➢ It looks like there is a negative relationship between the number of applications and the ranking.
➢ If a college is near the top of the ranking, the number of applications is higher.
➢ Both variables are approximately normally distributed.
➢ There is no obvious outlier.
Example 2: College Applications
Applicationi = β0 + β1·Ranki + ui
[Scatter plot of APPLICATION against RANK]

Dependent Variable: APPLICATION
Method: Least Squares
Sample: 1 49
Included observations: 49

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          5956.410      444.8246     13.39047      0.0000
RANK       -33.52880     13.39669     -2.502768     0.0159

R-squared           0.117600    Mean dependent var       4992.286
Adjusted R-squared  0.098826    S.D. dependent var       1640.112
S.E. of regression  1556.962    Akaike info criterion    17.57882
Sum squared resid   1.14E+08    Schwarz criterion        17.65604
Log likelihood      -428.6811   Hannan-Quinn criter.     17.60812
F-statistic         6.263847    Durbin-Watson stat       1.943843
Prob(F-statistic)   0.015857
Example 2: College Applications
Applicationi = β0 + β1·Ranki + ui
[Same regression output as on the previous slide]
Interpretation:
• Generally the constant term is not interpreted; mostly it is meaningless.
• If the ranking drops one place, the number of applications decreases by about 34.
• The absolute value of the t statistic is greater than the table critical value (1.96), and the p-value is less than 5%, so reject H0: β1 = 0. The coefficient is significant: RANK has a significant effect on the number of applications.
Example 2: College Applications
[Same regression output as above]
• Variation in ranking explains about 12% of the variation in the number of applications (R² = 0.1176).
• The whole equation is significant at the 95% confidence level (F = 6.26, Prob(F-statistic) = 0.0159).
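As a closing illustration of how the estimated equation would be used, here is a tiny Python sketch that plugs hypothetical rank values into APPLICATION-hat = 5956.41 − 33.53·RANK (the chosen ranks are arbitrary, and extrapolation beyond the sample range of ranks should be avoided, as discussed earlier):

def predicted_applications(rank: float) -> float:
    """Fitted value from the estimated equation reported in the slides."""
    return 5956.410 - 33.52880 * rank

for rank in (5, 25, 45):
    print(rank, round(predicted_applications(rank)))

# Moving one place down the ranking lowers predicted applications by about 34
print(round(predicted_applications(11) - predicted_applications(10), 2))   # -33.53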
