Simple Linear Regression
Simple Regression
"Regression of y on x":

We refer to y as:
• Dependent variable
• Explained variable
• Regressand

We refer to x as:
• Independent variable
• Explanatory variable
• Regressor
• Covariate
• Control variable

$\beta_0$ = intercept or constant
$\beta_1$ = slope parameter
$u$ = error term or disturbance ($\varepsilon$, $e$, …); it contains all other factors relevant for y that are not explicitly included as separate variables in the regression equation. That is, the effects of all unobserved factors important for y are included in the error term.
Simple Regression
$y = \beta_0 + \beta_1 x + u$

$y_i = \beta_0 + \beta_1 x_i + u_i$ (i identifies individual observations)

[Figure: scatter plot of observations $(x_i, y_i)$ around the fitted regression line, with intercept $\hat{\beta}_0$, slope $\hat{\beta}_1$, and the residual $\hat{u}_3$ marked; x-axis: x, e.g., fertilizer]
Simple Regression
What do we hope to learn from the regression?

Essentially, the effect of x on y, which we extract by looking at the change in y associated with a change in x:

$\Delta y = \beta_1 \Delta x$ if $\Delta u = 0$

Note that we assume a linear relationship between x and y by the functional form we chose; this assumption can be relaxed (varying effect of x on y with differing x).

And $\beta_0$? It is merely the value of y when $x = 0$ and $u = 0$. Rarely of importance for the analysis.
First assumptions
For the estimation of the unknown/unobserved parameters $\beta_0$ and $\beta_1$, we need to make some assumptions.

First assumption: $E(u) = 0$

The assumption is not restrictive, because we can use $\beta_0$ to normalize $E(u)$ to 0 (shifting the regression line until $E(u) = 0$).
First assumptions
One crucial assumption concerns the relationship between u and x:

$E(u \mid x) = E(u)$

The average value of u does not depend on x; u is mean independent of x. At the same time, combined with $E(u) = 0$, this yields the zero conditional mean assumption $E(u \mid x) = 0$.

Is the assumption plausible? Consider soil quality contained in u in the fertilizer example: is average soil quality the same on plots that receive little fertilizer as on plots that receive a lot, i.e., $E(u \mid x = \text{low})$ vs. $E(u \mid x = \text{high})$?
Estimation of the parameters
Ordinary Least Squares (OLS)
The key idea of a regression is the estimation of the population parameters using a sample from the population.
Regression line and sample points
[Figure: population regression line $E(y \mid x) = \beta_0 + \beta_1 x$ with sample points $(x_1, y_1), \dots, (x_4, y_4)$ scattered around it]
Ordinary Least Squares
Our assumption also implies $E(u) = 0$ and $E(xu) = 0$ (x and u are uncorrelated).

By rearranging the regression equation, we can express the error term as:

$u = y - \beta_0 - \beta_1 x$
Ordinary Least Squares
To estimate the model parameters, we form the sample counterparts of the population expectations:

$\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right) = 0$ and $\frac{1}{n} \sum_{i=1}^{n} x_i \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right) = 0$

All we need to do now is to choose values for $\hat{\beta}_0$ and $\hat{\beta}_1$ that make the conditions above true. Both parameters can be identified because we have two unknowns and two moment conditions.
Ordinary Least Squares – Finding $\hat{\beta}_0$

Rewrite the first moment condition, using $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$ as the shorthand expression for the sample mean of y:

$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}$, or $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
Ordinary Least Squares – Finding $\hat{\beta}_1$

Using the second moment condition:

$\sum_{i=1}^{n} x_i \left( y_i - (\bar{y} - \hat{\beta}_1 \bar{x}) - \hat{\beta}_1 x_i \right) = 0$ [because $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$]

$\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat{\beta}_1 \sum_{i=1}^{n} x_i (x_i - \bar{x})$ [by rearranging]
Ordinary Least Squares – Finding $\hat{\beta}_1$

$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$

The slope parameter is the sample covariance between x and y divided by the sample variance of x.
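A minimal R sketch of these closed-form formulas, using simulated data (all names and numbers below are illustrative, not from the slides):

```r
# Simulated sample; true beta0 = 2, beta1 = 0.5 (illustrative values)
set.seed(1)
n <- 100
x <- rnorm(n, mean = 10, sd = 2)   # e.g., fertilizer
y <- 2 + 0.5 * x + rnorm(n)

# Slope: sample covariance divided by sample variance of x
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
# Intercept: from the first moment condition
b0 <- mean(y) - b1 * mean(x)

c(b0, b1)
coef(lm(y ~ x))   # matches the closed-form solution
```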
Ordinary Least Squares – Why the name?

Define the fitted value for y when $x = x_i$, $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$, and the residual $\hat{u}_i = y_i - \hat{y}_i$.

An intuitive way of optimizing the regression line is to choose $\hat{\beta}_0$ and $\hat{\beta}_1$ such that the sum of squared residuals, $SSR = \sum_{i=1}^{n} \hat{u}_i^2$ (a measure of the "errors" of the regression line), is minimized.

It can be shown that minimizing SSR relies on the same conditions and leads to the same estimators.
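To illustrate the equivalence, a short sketch that minimizes SSR numerically (reusing the simulated x and y from the previous sketch); it should reproduce the closed-form OLS estimates:

```r
# Numerically minimize SSR over (b0, b1)
ssr <- function(b) sum((y - b[1] - b[2] * x)^2)
optim(c(0, 0), ssr)$par   # numerically close to coef(lm(y ~ x))
```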
Algebraic Properties of OLS

• The sum of the OLS residuals is zero: $\sum_{i=1}^{n} \hat{u}_i = 0$
• The sample covariance between the regressor and the OLS residuals is zero: $\sum_{i=1}^{n} x_i \hat{u}_i = 0$
• The OLS regression line always goes through the mean of the sample: $\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}$
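These properties can be verified numerically (continuing the simulated example from above):

```r
fit  <- lm(y ~ x)
uhat <- resid(fit)
sum(uhat)                                    # ~ 0: residuals sum to zero
sum(x * uhat)                                # ~ 0: zero sample covariance with x
coef(fit)[1] + coef(fit)[2] * mean(x) - mean(y)  # ~ 0: line passes through (x-bar, y-bar)
```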
More terminology
We can regard each observation as being made up of:
• An explained part, $\hat{y}_i$
• An unexplained part, $\hat{u}_i$

$y_i = \hat{y}_i + \hat{u}_i$

Further:
• the Total Sum of Squares: $SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$ (measures the variation in y)
• the Explained Sum of Squares: $SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$
• the Residual Sum of Squares: $SSR = \sum_{i=1}^{n} \hat{u}_i^2$
• and it holds that SST = SSE + SSR.
Goodness-of-Fit
We can calculate the share of the total sum of squares explained by the model:

$R^2 = \frac{SSE}{SST} = \frac{\text{explained variation}}{\text{total variation}} = 1 - \frac{SSR}{SST} = 1 - \frac{\text{unexplained variation}}{\text{total variation}}$
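A quick numerical check of these identities (continuing the simulated example):

```r
yhat <- fitted(fit)
SST <- sum((y - mean(y))^2)
SSE <- sum((yhat - mean(y))^2)
SSR <- sum((y - yhat)^2)
c(SSE / SST, 1 - SSR / SST, summary(fit)$r.squared)   # all three are identical
```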
Goodness-of-Fit
Interpretation of a low R²? A low R² does not by itself invalidate the regression: the slope estimate can still be unbiased.

However, a low R² suggests that other (unobserved) factors are of far greater importance for explaining the variation in y, compared to the variable(s) included in the regression.
OLS Example
Scatter plot and R output of a regression of test scores on the student-teacher ratio (stratio):

[Figure: scatter plot of test scores against the student-teacher ratio, with the fitted regression line and the corresponding R output]
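A sketch of how such a regression could be run in R. The slides do not name the data source; the code below assumes the CASchools data shipped with the AER package, the standard California test score data, whose variable names differ from the slides:

```r
# Assumes the AER package; CASchools is the usual California school district data
library(AER)
data("CASchools")
CASchools$stratio <- CASchools$students / CASchools$teachers  # student-teacher ratio
CASchools$score   <- (CASchools$read + CASchools$math) / 2    # average test score

fit_str <- lm(score ~ stratio, data = CASchools)
plot(score ~ stratio, data = CASchools)
abline(fit_str)
summary(fit_str)   # slope approx. -2.28, intercept approx. 698.9
```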
OLS Example
Interpretation
Coefficient of STR ($\hat{\beta}_1 = -2.28$):
School districts that have one additional student per teacher show, on average, 2.28 points lower test scores.

Intercept ($\hat{\beta}_0 = 698.9$):
Would mean that school districts with a student-teacher ratio of zero have average test scores of 698.9. Is this meaningful? Hardly: a ratio of zero lies far outside the range of the data.
Units of Measurement

What happens to the regression parameters if the unit of x or y is changed?

Suppose we estimate the salary of CEOs as a function of the company's return on equity (roe):

$\widehat{salary} = \hat{\beta}_0 + \hat{\beta}_1 \, roe$

Rescaling y rescales both $\hat{\beta}_0$ and $\hat{\beta}_1$ by the same factor; rescaling x changes only $\hat{\beta}_1$ (by the inverse factor). The fit of the regression is unaffected.
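A small R sketch of the rescaling rule, using simulated data (not the actual CEO sample; all numbers are illustrative):

```r
# Rescaling y multiplies both coefficients; rescaling x changes only the slope
set.seed(2)
roe    <- runif(50, 0, 30)                     # return on equity, in percent
salary <- 900 + 20 * roe + rnorm(50, sd = 50)  # salary in thousands of dollars

coef(lm(salary ~ roe))
coef(lm(I(salary * 1000) ~ roe))   # salary in dollars: both coefficients x 1000
coef(lm(salary ~ I(roe / 100)))    # roe as a decimal: slope x 100, intercept unchanged
```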
Functional Form

The regression model can be adjusted by incorporating nonlinear terms of the explanatory variable. These additional terms would be additional regressors, making our model a multiple regression (not "simple" anymore). We set this option aside until the next chapter.

Incorporating nonlinearities changes the interpretation of the slope parameter (no longer linear).

One popular transformation: the log transform, taking the natural logarithm of the dependent variable, the independent variable, or both.

Caution: natural logarithm = $\ln(\cdot)$, oftentimes just called "log"
Functional form
The relationship between test scores and the student-teacher ratio looks (somewhat) linear:

[Figure: scatter plot of test scores against the student-teacher ratio]
Functional form
… but how about the relationship between test scores and income?

[Figure: scatter plot of test scores against district income]
Log-level model of the wage equation
$wage = e^{\beta_0 + \beta_1 educ}$
Log-level model of the wage equation
[Figure: two panels over educ = 0 to 10. Left: $wage = e^{\beta_0 + \beta_1 educ}$, an exponential curve (wage from 0 to 70,000). Right: $\ln(wage) = \beta_0 + \beta_1 educ$, a straight line (ln(wage) from 8 to 11.5).]
Functional Form
Taking logs of the dependent and/or independent variables.

"Log differences" can be used to calculate the approximate percentage change:

$\ln(x_1) - \ln(x_0) \approx \frac{x_1 - x_0}{x_0}$ for small changes.

E.g., $\ln(105) - \ln(100) \approx 0.049$, i.e., approximately a 5% change.
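A two-line check of the approximation (the numbers are illustrative):

```r
# Log differences approximate percentage changes only for small changes
log(105) - log(100)   # 0.0488, close to the exact 5% change
log(150) - log(100)   # 0.405, a poor approximation of the exact 50% change
```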
Unbiasedness of OLS estimators

1. Assume the population model is linear in parameters: $y = \beta_0 + \beta_1 x + u$
2. Assume we can use a random sample of size n, $\{(x_i, y_i): i = 1, \dots, n\}$, from the population model. Thus, we can write the sample model $y_i = \beta_0 + \beta_1 x_i + u_i$.
Unbiasedness of OLS estimators

1. Linear in parameters

The population model is linear in the parameters $\beta_0$ and $\beta_1$: $y = \beta_0 + \beta_1 x + u$
Unbiasedness of OLS estimators
2. Random sampling
The condition is met if the individuals in our sample are chosen randomly.

For example, this assumption is oftentimes not fulfilled for time series analysis because of autocorrelation (however, there are ways to correct for it).

[Figure: autocorrelated time series y plotted over time t]
Unbiasedness of OLS estimators
3. There is variation in x: $\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$

Intuitively, a regression line cannot be estimated if all observations show the same value for x (we cannot explore the change in y if x does not change in our sample).
Unbiasedness of OLS estimators
4. Zero conditional mean: $E(u \mid x) = 0$

Remember: all other factors not explicitly considered/measured as variables in our regression equation are contained in u.
Unbiasedness of OLS estimators
In the following, we want to show that our OLS estimators are indeed unbiased under the assumptions made.

In order to think about unbiasedness, we need to rewrite our estimator in terms of the population parameters:

$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) \, y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$

Numerator: $\sum_{i=1}^{n} (x_i - \bar{x}) \, y_i = \sum_{i=1}^{n} (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i) = \beta_1 \sum_{i=1}^{n} (x_i - \bar{x})^2 + \sum_{i=1}^{n} (x_i - \bar{x}) \, u_i$

(using $\sum_{i} (x_i - \bar{x}) = 0$ and $\sum_{i} (x_i - \bar{x}) x_i = \sum_{i} (x_i - \bar{x})^2$)
Unbiasedness of OLS estimators

Hence:

$\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x}) \, u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$

Taking expectations conditional on the sample values of x, the second term vanishes because $E(u_i \mid x) = 0$, so $E(\hat{\beta}_1) = \beta_1$.
Unbiasedness of OLS estimators
The OLS estimates of $\beta_0$ and $\beta_1$ are unbiased: $E(\hat{\beta}_0) = \beta_0$ and $E(\hat{\beta}_1) = \beta_1$.

The proof of unbiasedness depends on our 4 assumptions; if any assumption fails, then OLS is not necessarily unbiased.
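A small Monte Carlo sketch illustrating unbiasedness, with simulated data in which the true $\beta_1$ is set to 0.5 (the assumptions hold by construction):

```r
# Sampling distribution of beta1-hat is centered on the true value
set.seed(3)
beta1_hat <- replicate(5000, {
  x <- rnorm(50)
  y <- 2 + 0.5 * x + rnorm(50)   # assumptions 1-4 hold by construction
  coef(lm(y ~ x))[2]
})
mean(beta1_hat)   # ~ 0.5
```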
Variance of the OLS estimators
Unbiasedness: we now know that the sampling distribution of our estimator is centered around the true parameter.

The second important piece of information is how reliable our estimators are, that is, how spread out their distribution is.

Like their expected values, we can estimate the standard errors of our estimators from information in the sample.
Standard errors
Deriving $Var(\hat{\beta}_1)$ under the assumption of homoscedasticity ($Var(u \mid x) = \sigma^2$):

$Var(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$

Estimating $\sigma^2$ by $\hat{\sigma}^2 = \frac{SSR}{n - 2}$ gives the standard error $se(\hat{\beta}_1) = \sqrt{\hat{\sigma}^2 / \sum_{i=1}^{n} (x_i - \bar{x})^2}$.
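Computing this standard error by hand and comparing it with the lm output (reusing fit, x, and y from the earlier simulated example):

```r
# Standard error of beta1-hat under homoscedasticity
n      <- length(y)
sigma2 <- sum(resid(fit)^2) / (n - 2)          # estimate of the error variance
se_b1  <- sqrt(sigma2 / sum((x - mean(x))^2))
c(se_b1, summary(fit)$coefficients["x", "Std. Error"])   # identical
```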
Standard errors
The greater the error variance $\sigma^2$ (unexplained factors more important), the greater the standard error.

The greater the variation in x (a more heterogeneous or larger sample), the smaller the standard error.
Heteroscedasticity

Consequences of heteroscedasticity ($Var(u \mid x)$ varies with x):

• $\hat{\beta}_0$ and $\hat{\beta}_1$ remain unbiased
• Standard errors need to be adjusted: "robust" standard errors

[Figure: scatter plot in which the spread of the observations around the regression line increases with x]
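One common way to obtain robust standard errors in R, assuming the sandwich and lmtest packages (a sketch, not necessarily the course's exact workflow; reusing fit from the simulated example, where the adjustment is small because the errors are homoscedastic by construction):

```r
# Heteroscedasticity-robust (HC) standard errors for an lm fit
library(sandwich)
library(lmtest)
coeftest(fit, vcov = vcovHC(fit, type = "HC1"))  # same coefficients, adjusted SEs
```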
Summary
We saw how to obtain unbiased OLS estimates of $\beta_0$ and $\beta_1$.

Their unbiasedness depends on 4 critical assumptions; if any assumption fails, then OLS is not necessarily unbiased.
During the course, we will gradually get to know methods to make OLS results more robust.