
Simple Linear Regression Model

Terminology

By convention, we distinguish between variables y and x


y                        x_1, x_2, ..., x_K
Dependent variable       Independent variables
Outcome                  Covariates
Response variable        Control variables
Target variable          Features
Regressand               Regressors
                         Explanatory variables
In this course, we will only deal with a single dependent variable y,
but will discuss cases where there is a single x, two x's, and
generally multiple (K) x's.
We consider models of the form:
With a single explanatory variable x (referred to as the Simple
Linear Regression Model):

  y_i = β_0 + β_1 x_i + u_i

For two explanatory variables:

  y_i = β_0 + β_1 x_{1i} + β_2 x_{2i} + u_i

In the general K variable case:

  y_i = β_0 + β_1 x_{1i} + β_2 x_{2i} + ... + β_K x_{Ki} + u_i

  ≡ y = Xβ + u,
where y and u are (n × 1), X is (n × K), and β is (K × 1).
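As a small illustration of the matrix form, the sketch below stacks observations into a design matrix, following the common convention (an assumption here, not stated in the notes) of absorbing the intercept β_0 into X via a leading column of ones, so that β stacks (β_0, β_1, β_2):

```python
import numpy as np

# Illustrative data: n = 4 observations on two explanatory variables
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = np.array([0.5, 1.5, 2.5, 3.5])
u = np.array([0.1, -0.2, 0.0, 0.1])           # assumed error terms, purely for illustration

# Design matrix: the leading column of ones absorbs the intercept
X = np.column_stack([np.ones_like(x1), x1, x2])
beta = np.array([1.0, 0.5, -0.2])             # stacked (beta_0, beta_1, beta_2), assumed values

y = X @ beta + u                              # y = X beta + u, one row per observation
print(X.shape, beta.shape, y.shape)           # (4, 3) (3,) (4,)
```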
The population parameters β describe how the x's are related to y.
(In the simple linear regression case, with one x variable, if u were
held unchanged, the slope parameter β_1 captures how y changes
with a unit change in x.)
These population parameters are generally unknown, and must be
estimated from a random sample of observations.
u captures all factors other than the x's that explain y. u is
referred to as the error term or disturbance term, and is unobserved.
y is thus composed of a systematic part (involving x and β) plus an
unsystematic, unobserved part (u).
The Simple Linear Regression Model

Let's start with the simple linear regression model:

  y_i = β_0 + β_1 x_i + u_i

β_0 is termed the constant, or intercept, and β_1 is the slope
parameter. As noted earlier, these are unknown and must be
estimated using a random sample of observations (data on y and
x of size n, with i denoting the observation).
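To make the notation concrete, here is a minimal simulation sketch of such a sample; the parameter values, sample size, and normal error distribution are assumptions chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed (illustrative) population parameters and sample size
beta0, beta1 = 1.0, 0.5
n = 100

x = rng.uniform(0, 10, size=n)     # observed explanatory variable
u = rng.normal(0, 1, size=n)       # unobserved error / disturbance term
y = beta0 + beta1 * x + u          # systematic part + unsystematic part
```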
Assumptions in the SLR model

Assume, for now:

1. The model y_i = β_0 + β_1 x_i + u_i is linear in parameters.
2. There is a random sample of n observations on y and x.
3. Not all the x's have the same value.
The OLS estimator

There are many ways to estimate the β's.

One of the most commonly used is the Ordinary Least Squares
(OLS) estimator.
Denote the estimated parameters by β̂_0 and β̂_1, and denote the
fitted or predicted value of y as ŷ_i = β̂_0 + β̂_1 x_i.
The idea is to get the fitted or predicted ŷ as close as possible to the
actual or observed y. Denote the estimated residual û as the
difference û_i = y_i − ŷ_i. We want to choose β̂_0 and β̂_1 so that this
distance û is as small as possible across all the observations i. A
common approach is to minimize the sum of the squared û's
(although it is also possible to minimize Σ_i |û_i|).
The OLS estimator

OLS involves minimizing the sum of squared residuals (SSR), Σ_i û_i²:

  min_{β̂_0, β̂_1} S = Σ_{i=1}^{n} û_i² = Σ_{i=1}^{n} [y_i − β̂_0 − β̂_1 x_i]²

First order conditions:

  ∂S/∂β̂_0 = −2 Σ_{i=1}^{n} [y_i − β̂_0 − β̂_1 x_i] = 0  ⇒  Σ_{i=1}^{n} y_i = n β̂_0 + β̂_1 Σ_{i=1}^{n} x_i      (1)

  ∂S/∂β̂_1 = −2 Σ_{i=1}^{n} [y_i − β̂_0 − β̂_1 x_i] x_i = 0  ⇒  Σ_{i=1}^{n} x_i y_i = β̂_0 Σ_{i=1}^{n} x_i + β̂_1 Σ_{i=1}^{n} x_i²      (2)
(1) and (2) are referred to as the normal equations.


Note that

  (1) ⇒ ȳ = β̂_0 + β̂_1 x̄ ⇒ β̂_0 = ȳ − β̂_1 x̄.

Plugging this into (2) and recognizing that

  Σ_i x_i = n x̄, where x̄ is the mean of x, and

  Σ_i (x_i − x̄)(y_i − ȳ) = Σ_i x_i y_i − n x̄ ȳ (a special case of which is Σ_i (x_i − x̄)² = Σ_i x_i² − n x̄²),

it is easy to show that

  β̂_1 = Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ) / Σ_{i=1}^{n} (x_i − x̄)²

These are the OLS estimators of β_0 and β_1.
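As a sanity check of these formulas, here is a minimal sketch that computes β̂_0 and β̂_1 on the simulated data from the earlier sketch (the data-generating values 1.0 and 0.5 are, again, assumptions for illustration):

```python
import numpy as np

# Same illustrative simulated data as in the earlier sketch
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=100)

# Closed-form OLS estimators for the simple linear regression model
x_bar, y_bar = x.mean(), y.mean()
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

y_hat = beta0_hat + beta1_hat * x   # fitted / predicted values
u_hat = y - y_hat                   # estimated residuals

print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
# The normal equations imply sum(u_hat) = 0 and sum(u_hat * x) = 0 (up to rounding error)
print(u_hat.sum(), (u_hat * x).sum())
```

With a sample of this size, the estimates should land reasonably close to the assumed values 1.0 and 0.5.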
More on terminology

It is important to distinguish between population parameters and
parameters estimated using OLS:

The β's estimated using OLS are denoted by β̂_0 and β̂_1.
Furthermore, û_i = y_i − ŷ_i is an estimated residual, and is in sharp
contrast to the unobserved error u.
Similarly, to reiterate, ŷ is the fitted or predicted value of (and
distinct from) y.
Assumptions again

• Linearity in parameters ≠ linearity of functional form. We
usually estimate nonlinear functions (quadratic, semi-log, etc.),
but these are nonlinear in x. Forms that are nonlinear in β cannot
be accommodated.
• What does a random sample mean?
• Why was assumption 3 necessary? Ask yourself how you
would estimate β̂_0 and β̂_1 without this assumption (see the
sketch below).
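As a hint for the last question: the formula for β̂_1 derived above has Σ(x_i − x̄)² in its denominator. A minimal sketch (illustrative values only) of what happens to that denominator when assumption 3 fails:

```python
import numpy as np

x_const = np.full(10, 3.0)                           # every observation has the same x value
denominator = np.sum((x_const - x_const.mean())**2)  # Σ (x_i - x̄)²
print(denominator)                                   # 0.0 -> the OLS slope estimate is undefined
```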
