CH 02 PPT Simple Linear Regression
Econometrics (Hong Kong Baptist University)
Chapter 2: The Simple Regression Model
Definition
Definition of the simple linear regression model:
$y = \beta_0 + \beta_1 x + u$
y: dependent variable, explained variable, response variable, regressand, …
x: independent variable, explanatory variable, regressor, …
u: error term, disturbance, unobservables, …
Interpretation
Interpretation of the simple linear regression model:
$\Delta y = \beta_1 \Delta x$ as long as $\Delta u = 0$
By how much does the dependent variable change if the independent variable is increased by one unit? The interpretation is only correct if all other things remain equal when the independent variable is increased by one unit.
Examples
Example: Soybean yield and fertilizer
$yield = \beta_0 + \beta_1\, fertilizer + u$
The error term u contains unobserved factors such as rainfall, land quality, presence of parasites, …
$\beta_1$ measures the effect of fertilizer on yield, holding all other factors fixed.
Conditional Mean Independence
When is there a causal interpretation?
Conditional mean independence assumption: $E(u|x) = E(u) = 0$
The explanatory variable must not contain information about the mean of the unobserved factors; e.g. intelligence in a wage equation: the assumption is questionable if individuals with more education are also more intelligent on average.
Population Regression Function
Population regression function (PRF):
The conditional mean independence assumption implies that
$E(y|x) = \beta_0 + \beta_1 x$
i.e. the average value of the dependent variable can be expressed as a linear function of the explanatory variable.
[Figure: example of a population regression function]
Random Sample
In order to estimate the regression model one needs data: a random sample of n observations $\{(x_i, y_i): i = 1, \dots, n\}$
First observation: $(x_1, y_1)$
Second observation: $(x_2, y_2)$
…
n-th observation: $(x_n, y_n)$
OLS Estimates
Fit a regression line through the data points "as good as possible". What does "as good as possible" mean?
Regression residuals: $\hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$
Ordinary least squares (OLS) chooses the estimates that minimize the sum of squared residuals:
$\min_{\hat{\beta}_0, \hat{\beta}_1} \sum_{i=1}^{n} \hat{u}_i^2$
which yields $\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$ and $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$.
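A minimal Python sketch of these closed-form OLS formulas, using simulated data rather than one of the textbook datasets (the true parameter values are illustrative assumptions):

```python
import numpy as np

def ols_simple(x, y):
    """Closed-form OLS estimates for the simple regression y = b0 + b1*x + u."""
    x_bar, y_bar = x.mean(), y.mean()
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Simulated data; true beta0 = 1.0 and beta1 = 0.5 are chosen purely for illustration.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
u = rng.normal(0, 2, size=200)        # unobserved factors
y = 1.0 + 0.5 * x + u

b0_hat, b1_hat = ols_simple(x, y)
residuals = y - b0_hat - b1_hat * x   # regression residuals u_hat
print(b0_hat, b1_hat)                 # estimates close to 1.0 and 0.5
print(residuals.sum())                # numerically zero by construction of OLS
```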
Example 1 (ceosal1)
CEO salary and return on equity: $salary = \beta_0 + \beta_1\, roe + u$, with salary measured in thousands of dollars and roe in percent.
Fitted regression: $\widehat{salary} = 963.191 + 18.501\, roe$
Intercept: 963.191, i.e. the predicted salary for roe = 0 is $963,191.
If the return on equity increases by one percentage point, salary is predicted to change by 18.501, i.e. by $18,501.
Causal interpretation?
Example 1 (ceosal1), continued
Estimation command (EViews): salary c roe
[Figure: regression output and fitted regression line]
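As a cross-check, the same regression can be run in Python; this sketch assumes the ceosal1 data have been exported to a local CSV file with the textbook column names salary and roe (file name and location are assumptions):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Assumption: ceosal1 data saved locally as "ceosal1.csv" with the textbook variable names.
ceosal1 = pd.read_csv("ceosal1.csv")

fit = smf.ols("salary ~ roe", data=ceosal1).fit()
print(fit.params)     # intercept approx. 963.191, slope approx. 18.501
print(fit.summary())  # full regression output
```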
Example 2 (wage1)
Wage and education: $wage = \beta_0 + \beta_1\, educ + u$, with wage measured in dollars per hour and educ in years of education.
Fitted regression: $\widehat{wage} = -0.90 + 0.54\, educ$
Intercept: -0.90 (the predicted wage at educ = 0, which has no meaningful interpretation here)
In the sample, one more year of education was associated with an increase in the hourly wage of $0.54.
Causal interpretation?
Example 2 (wage1), continued
Estimation command (EViews): wage c educ
[Figure: regression output]
Example 3 (vote1)
Voting outcomes and campaign expenditures (two parties): $voteA = \beta_0 + \beta_1\, shareA + u$, where voteA is candidate A's share of the vote (in percent) and shareA is candidate A's share of total campaign spending (in percent).
Fitted regression: $\widehat{voteA} = 26.81 + 0.464\, shareA$
Intercept: 26.81
If candidate A's share of spending increases by one percentage point, he or she receives 0.464 percentage points more of the total vote.
Causal interpretation?
Example 3 (vote1), continued
Estimation command (EViews): votea c sharea
[Figure: regression output]
Properties of OLS
Properties of OLS on any sample of data:
Fitted values and residuals: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$, $\hat{u}_i = y_i - \hat{y}_i$
Algebraic properties of the OLS regression: the residuals sum to zero, $\sum_{i=1}^{n} \hat{u}_i = 0$; the sample covariance between the regressor and the residuals is zero, $\sum_{i=1}^{n} x_i \hat{u}_i = 0$; and the sample averages lie on the regression line, $\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}$. These properties are checked numerically in the sketch below.
[Figure: illustration of fitted values and residuals]
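A minimal sketch verifying the three algebraic properties on simulated data (the data-generating process is illustrative, not one of the textbook datasets):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=50)
y = 2.0 - 0.3 * x + rng.normal(0, 1, size=50)   # illustrative data-generating process

x_bar, y_bar = x.mean(), y.mean()
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b0 = y_bar - b1 * x_bar
y_hat = b0 + b1 * x
resid = y - y_hat

print(np.isclose(resid.sum(), 0.0))          # residuals sum to zero
print(np.isclose(np.sum(x * resid), 0.0))    # zero sample covariance between x and residuals
print(np.isclose(y_bar, b0 + b1 * x_bar))    # (x_bar, y_bar) lies on the regression line
```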
Goodness of Fit
Goodness-of-fit: "How well does the explanatory variable explain the dependent variable?"
Measures of variation:
Total sum of squares: $SST = \sum_{i=1}^{n}(y_i - \bar{y})^2$ (total variation in the dependent variable)
Explained sum of squares: $SSE = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2$ (variation explained by the regression)
Residual sum of squares: $SSR = \sum_{i=1}^{n} \hat{u}_i^2$ (variation not explained by the regression)
Decomposition of total variation: $SST = SSE + SSR$
R-squared (coefficient of determination): $R^2 = SSE/SST = 1 - SSR/SST$, the fraction of the total variation that is explained by the regression.
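The decomposition and the resulting R-squared can be computed directly from the fitted values; a minimal sketch:

```python
import numpy as np

def r_squared(y, y_hat):
    """Return R^2 = SSE/SST = 1 - SSR/SST for a fitted simple regression."""
    sst = np.sum((y - y.mean()) ** 2)        # total sum of squares
    sse = np.sum((y_hat - y.mean()) ** 2)    # explained sum of squares
    ssr = np.sum((y - y_hat) ** 2)           # residual sum of squares
    # The decomposition holds when y_hat comes from an OLS regression with an intercept.
    assert np.isclose(sst, sse + ssr)
    return sse / sst
```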
Examples (Reporting Regression Results)
CEO salary and return on equity: $R^2 = 0.0132$, i.e. the regression explains only about 1.3% of the total variation in salaries.
Unit of Measurement
Changing the units of measurement rescales the estimated coefficients but does not change the substance of the regression:
If salary is measured in dollars, salardol = 1000 · salary: $\widehat{salardol} = 963{,}191 + 18{,}501\, roe$
If the return on equity is measured as a decimal, roedec = roe/100: $\widehat{salary} = 963.191 + 1850.1\, roedec$
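The rescaling can be verified by re-running the regression on transformed variables; again assuming the ceosal1 data are available as a local CSV (an assumption):

```python
import pandas as pd
import statsmodels.formula.api as smf

ceosal1 = pd.read_csv("ceosal1.csv")              # assumed local copy of the dataset
ceosal1["salardol"] = 1000 * ceosal1["salary"]    # salary in dollars instead of $1000s
ceosal1["roedec"] = ceosal1["roe"] / 100          # return on equity as a decimal

print(smf.ols("salary ~ roe", data=ceosal1).fit().params)     # approx. 963.191, 18.501
print(smf.ols("salardol ~ roe", data=ceosal1).fit().params)   # approx. 963191, 18501
print(smf.ols("salary ~ roedec", data=ceosal1).fit().params)  # approx. 963.191, 1850.1
```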
Semi-logarithmic Form
Incorporating nonlinearities: semi-logarithmic form
Regression of log wages on years of education: $\log(wage) = \beta_0 + \beta_1\, educ + u$
This changes the interpretation of the regression coefficient: $\beta_1$ is (approximately) the percentage change in the wage if years of education are increased by one year.
Example (wage1)
Estimation command (EViews): log(wage) c educ
Fitted regression: $\widehat{\log(wage)} = 0.584 + 0.083\, educ$
For example: the hourly wage increases by roughly 8.3% for every additional year of education (the return to education).
[Figure: regression output]
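A minimal sketch of this regression in Python, again assuming a local copy of the wage1 data (an assumption); it also reports the exact percentage change implied by the log-level coefficient:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

wage1 = pd.read_csv("wage1.csv")    # assumed local copy with columns 'wage' and 'educ'

fit = smf.ols("np.log(wage) ~ educ", data=wage1).fit()
b1 = fit.params["educ"]
print(b1)                           # approx. 0.083 log points per year of education
print(100 * (np.exp(b1) - 1))       # exact percentage change, approx. 8.6% per year
```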
Log-logarithmic Form
Incorporating nonlinearities: log-logarithmic form
CEO salary and firm sales: $\log(salary) = \beta_0 + \beta_1 \log(sales) + u$
This changes the interpretation of the regression coefficient: $\beta_1$ is the elasticity of salary with respect to sales, i.e. the percentage change in salary for a 1% increase in sales.
Example (ceosal1)
Estimation command (EViews): log(salary) c log(sales)
CEO salary and firm sales, fitted regression: $\widehat{\log(salary)} = 4.822 + 0.257\, \log(sales)$, i.e. salary increases by about 0.257% for every 1% increase in firm sales.
[Figure: regression output]
The data are random and depend on the particular sample that has been drawn; as a consequence, the estimated regression coefficients are themselves random variables.
Discussion of Random Sampling
Discussion of random sampling: Wage and education
The population consists, for example, of all workers of country A
In the population, a linear relationship between wages (or log wages)
and years of education holds
Randomly draw a worker from the population
The wage and the years of education of the worker drawn are random
because one does not know beforehand which worker is drawn
Put the worker back into the population and repeat the random draw n times
The wages and years of education of the sampled workers are used to
estimate the linear relationship between wages and education
Illustration
Unbiasedness of OLS
Theorem 2.1 (Unbiasedness of OLS): under the assumptions made above (linear model, random sampling, sample variation in x, zero conditional mean of the error), $E(\hat{\beta}_0) = \beta_0$ and $E(\hat{\beta}_1) = \beta_1$.
Interpretation of unbiasedness:
The estimated coefficients may be smaller or larger, depending on
the sample that is the result of a random draw
However, on average, they will be equal to the values that characterize the true relationship between y and x in the population.
"On average" means: if sampling were repeated, i.e. if drawing the random sample and estimating the model were repeated many times.
In a given sample, estimates may differ considerably from true values
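A small Monte Carlo sketch of this repeated-sampling idea (the true parameters and data-generating process are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)
beta0, beta1 = 1.0, 0.5              # assumed true population parameters
n, n_reps = 100, 10_000              # sample size and number of repeated samples

slope_estimates = np.empty(n_reps)
for r in range(n_reps):
    x = rng.uniform(0, 10, size=n)
    u = rng.normal(0, 2, size=n)     # zero conditional mean holds by construction
    y = beta0 + beta1 * x + u
    x_bar = x.mean()
    slope_estimates[r] = np.sum((x - x_bar) * (y - y.mean())) / np.sum((x - x_bar) ** 2)

print(slope_estimates.mean())        # close to 0.5: the OLS slope is unbiased
print(slope_estimates.std())         # sampling variability across repeated samples
```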
Homoskedasticity
Homoskedasticity assumption: $Var(u|x) = \sigma^2$, i.e. the variance of the unobserved factors does not depend on the value of the explanatory variable. [Figure: graphical illustration of homoskedasticity]
Heteroskedasticity
An example of heteroskedasticity: wage and education, where the variability of the unobserved wage determinants increases with the level of education. [Figure: graphical illustration of heteroskedasticity]
Conclusion:
The sampling variability of the estimated regression coefficients is larger, the larger the variability of the unobserved factors, and smaller, the larger the variation in the explanatory variable. Under homoskedasticity, $Var(\hat{\beta}_1) = \dfrac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$.
Error Variance
Estimating the error variance:
The error variance $\sigma^2 = Var(u)$ is unknown. An unbiased estimator is obtained by plugging in the residuals for the unknown errors and correcting for the two estimated parameters:
$\hat{\sigma}^2 = \dfrac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2 = \dfrac{SSR}{n-2}$
The estimated standard deviations of the regression coefficients are called "standard errors". They measure how precisely the regression coefficients are estimated, e.g. $se(\hat{\beta}_1) = \dfrac{\hat{\sigma}}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$.