
2. Theory of linear regression

In statistics, linear regression is a linear approach to modeling the relationship between a scalar response and one or more explanatory variables (also known as the dependent and independent variables). The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. In this report, we consider only simple linear regression.
2.1 Simple linear regression
As mentioned, simple linear regression is a linear regression with a single explanatory variable. That is, it concerns two-dimensional sample points (conventionally, the x and y coordinates in a Cartesian coordinate system) and finds a linear function (a non-vertical straight line) that, as accurately as possible, predicts the dependent variable as a function of the independent variable. The adjective "simple" refers to the fact that the outcome variable is related to a single predictor.
It is common to make the additional stipulation that the ordinary least squares (OLS) method should be used: the accuracy of each predicted value is measured by its squared residual (the vertical distance between a data point and the fitted line), and the goal is to make the sum of these squared deviations as small as possible.
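To make the OLS objective concrete, here is a minimal Python sketch (assuming NumPy is available; the data values are made up for illustration, not taken from this report) that evaluates the sum of squared residuals for two candidate lines:

```python
import numpy as np

# Hypothetical sample data; the numbers are illustrative only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

def sum_of_squared_residuals(a: float, b: float) -> float:
    """Sum of squared vertical distances between the data and the line y = a + b*x."""
    residuals = y - (a + b * x)
    return float(np.sum(residuals ** 2))

# A line close to the trend of the data gives a much smaller sum than a flat line.
print(sum_of_squared_residuals(a=0.0, b=2.0))
print(sum_of_squared_residuals(a=5.0, b=0.0))
```

OLS picks the intercept and slope that minimize this quantity over all possible lines.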
The expected value of Y at each level of x is given by the linear model
$$E(Y \mid x) = \alpha + \beta x .$$
We assume that each observation $Y_i$ can be described by the model
$$Y_i = \alpha + \beta x_i + \epsilon_i, \quad i = 1, 2, \dots, n .$$
The fitted or estimated regression line is therefore
$$\hat{y}_i = a + b x_i .$$
Note that each pair of observations satisfies the relationship
$$y_i = a + b x_i + e_i, \quad i = 1, 2, \dots, n ,$$
where $e_i = y_i - \hat{y}_i$ is called the residual. The residual describes the error in the fit of the model to the $i$-th observation $y_i$.
The mean response at a new level $x_0$, $E(Y \mid X = x_0)$, is estimated by
$$\hat{y}_0 = a + b x_0 .$$
The least-squares estimates of the slope and intercept in the simple linear regression model are
$$b = \frac{S_{xy}}{S_{xx}}$$
and
$$a = \bar{y} - b \bar{x} ,$$
where
$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2 ,$$
$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right) ,$$
$$S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} y_i \right)^2 .$$
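As an illustration, the following sketch computes $S_{xx}$, $S_{xy}$, and the least-squares estimates $a$ and $b$ from the computational formulas above; the data and the prediction point $x_0$ are hypothetical:

```python
import numpy as np

# Same hypothetical data as in the earlier sketch.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

# Computational formulas for S_xx and S_xy.
S_xx = np.sum(x ** 2) - np.sum(x) ** 2 / n
S_xy = np.sum(x * y) - np.sum(x) * np.sum(y) / n

b = S_xy / S_xx              # least-squares slope
a = y.mean() - b * x.mean()  # least-squares intercept

# Estimated mean response at a hypothetical new level x0.
x0 = 2.5
y0_hat = a + b * x0
print(f"a = {a:.4f}, b = {b:.4f}, y_hat({x0}) = {y0_hat:.4f}")
```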

2.2 Sums of squares

- The sum of squares of the residuals (error sum of squares):
$$SSE = \sum (y_i - \hat{y}_i)^2$$
- The total sum of squares of the response variable:
$$SST = \sum (y_i - \bar{y})^2$$
- The sum of squares for regression:
$$SSR = \sum (\hat{y}_i - \bar{y})^2$$
- Fundamental identity:
$$SST = SSE + SSR$$
- Computational formulas:
$$SST = S_{yy}, \qquad SSR = b\,S_{xy}, \qquad SSE = S_{yy} - b\,S_{xy}$$
- It can be shown that:
$$E(SSE) = (n-2)\sigma^2$$
- An unbiased estimator of $\sigma^2$:
$$S^2 = \frac{SSE}{n-2}$$
- Coefficient of determination:
$$r^2 = 1 - \frac{SSE}{SST}$$
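A short sketch, again on made-up data with NumPy assumed, that computes these quantities and numerically checks the fundamental identity and the computational formula for SSR:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

S_xx = np.sum(x ** 2) - np.sum(x) ** 2 / n
S_xy = np.sum(x * y) - np.sum(x) * np.sum(y) / n
b = S_xy / S_xx
a = y.mean() - b * x.mean()

y_hat = a + b * x
SSE = np.sum((y - y_hat) ** 2)          # residual sum of squares
SST = np.sum((y - y.mean()) ** 2)       # total sum of squares (equals S_yy)
SSR = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares

# Fundamental identity and computational formula, up to floating-point error.
assert np.isclose(SST, SSE + SSR)
assert np.isclose(SSR, b * S_xy)

S2 = SSE / (n - 2)   # unbiased estimator of sigma^2
r2 = 1 - SSE / SST   # coefficient of determination
print(f"SSE = {SSE:.4f}, SST = {SST:.4f}, SSR = {SSR:.4f}, S^2 = {S2:.4f}, r^2 = {r2:.4f}")
```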
2.3 Properties of the Least Squares Estimators I
*Slope properties:
$$E(b) = \beta, \qquad V(b) = \frac{\sigma^2}{S_{xx}} = \sigma_b^2$$
- The estimated standard error of the slope:
$$S_b = \sqrt{\frac{S^2}{S_{xx}}}$$
- Moreover:
$$T = \frac{b - \beta}{S_b} \sim t(n-2)$$
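The sketch below (hypothetical data, NumPy assumed) computes $S_b$ and the $T$ statistic for the illustrative special case $\beta = 0$:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

S_xx = np.sum(x ** 2) - np.sum(x) ** 2 / n
S_xy = np.sum(x * y) - np.sum(x) * np.sum(y) / n
S_yy = np.sum(y ** 2) - np.sum(y) ** 2 / n
b = S_xy / S_xx
S2 = (S_yy - b * S_xy) / (n - 2)   # SSE / (n - 2)

S_b = np.sqrt(S2 / S_xx)   # estimated standard error of the slope
T = b / S_b                # T statistic for the hypothetical value beta = 0
print(f"S_b = {S_b:.4f}, T = {T:.4f} on {n - 2} degrees of freedom")
```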
2.4 Properties of the Least Squares Estimators II
*Intercept properties:
$$E(a) = \alpha, \qquad V(a) = \frac{\sigma^2 \mu_{xx}}{S_{xx}} = \sigma_a^2$$
- with $\mu_{xx} = \frac{1}{n} \sum x_i^2$. The estimated standard error of the intercept:
$$S_a = \sqrt{\frac{S^2 \mu_{xx}}{S_{xx}}} = S_b \sqrt{\mu_{xx}}$$
- Moreover:
$$T = \frac{a - \alpha}{S_a} \sim t(n-2)$$
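A companion sketch (same hypothetical data) computes $S_a$ and verifies that the two expressions for it given above agree:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

S_xx = np.sum(x ** 2) - np.sum(x) ** 2 / n
S_xy = np.sum(x * y) - np.sum(x) * np.sum(y) / n
S_yy = np.sum(y ** 2) - np.sum(y) ** 2 / n
b = S_xy / S_xx
S2 = (S_yy - b * S_xy) / (n - 2)

mu_xx = np.mean(x ** 2)            # (1/n) * sum of x_i^2
S_a = np.sqrt(S2 * mu_xx / S_xx)   # estimated standard error of the intercept
S_b = np.sqrt(S2 / S_xx)

# The two expressions for S_a agree.
assert np.isclose(S_a, S_b * np.sqrt(mu_xx))
print(f"S_a = {S_a:.4f}")
```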

2.5 Confidence Intervals

- Confidence interval for the slope:
$$b \pm t_{\alpha/2,\,n-2} \cdot S_b$$
- Confidence interval for the intercept:
$$a \pm t_{\alpha/2,\,n-2} \cdot S_a$$
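As an illustration, the sketch below builds both intervals at the 95% level using SciPy's t-distribution quantile function (hypothetical data; SciPy assumed available):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

S_xx = np.sum(x ** 2) - np.sum(x) ** 2 / n
S_xy = np.sum(x * y) - np.sum(x) * np.sum(y) / n
S_yy = np.sum(y ** 2) - np.sum(y) ** 2 / n
b = S_xy / S_xx
a = y.mean() - b * x.mean()
S2 = (S_yy - b * S_xy) / (n - 2)
S_b = np.sqrt(S2 / S_xx)
S_a = np.sqrt(S2 * np.mean(x ** 2) / S_xx)

alpha = 0.05                                   # 95% confidence level
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)  # two-sided critical value t_{alpha/2, n-2}
print(f"slope:     {b:.4f} +/- {t_crit * S_b:.4f}")
print(f"intercept: {a:.4f} +/- {t_crit * S_a:.4f}")
```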
2.6 Hypothesis Testing
- Slope: $T = \dfrac{b - \beta_0}{S_b}$ for testing $H_0: \beta = \beta_0$ against one of the alternatives
$$H_1: \beta \neq \beta_0, \qquad H_1: \beta < \beta_0, \qquad H_1: \beta > \beta_0$$
- Intercept: $T = \dfrac{a - \alpha_0}{S_a}$ for testing $H_0: \alpha = \alpha_0$ against one of the alternatives
$$H_1: \alpha \neq \alpha_0, \qquad H_1: \alpha < \alpha_0, \qquad H_1: \alpha > \alpha_0$$
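To show how the test is carried out in practice, here is a sketch of the slope test for the hypothetical null value $\beta_0 = 0$ (made-up data; SciPy assumed), computing the p-value for each of the three alternatives:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

S_xx = np.sum(x ** 2) - np.sum(x) ** 2 / n
S_xy = np.sum(x * y) - np.sum(x) * np.sum(y) / n
S_yy = np.sum(y ** 2) - np.sum(y) ** 2 / n
b = S_xy / S_xx
S2 = (S_yy - b * S_xy) / (n - 2)
S_b = np.sqrt(S2 / S_xx)

beta0 = 0.0                                # hypothesized slope under H0 (illustrative)
T = (b - beta0) / S_b
df = n - 2
p_two_sided = 2 * stats.t.sf(abs(T), df)   # H1: beta != beta0
p_less = stats.t.cdf(T, df)                # H1: beta <  beta0
p_greater = stats.t.sf(T, df)              # H1: beta >  beta0
print(f"T = {T:.4f}, two-sided p-value = {p_two_sided:.4g}")
```

The intercept test follows the same pattern with $a$, $\alpha_0$, and $S_a$ in place of $b$, $\beta_0$, and $S_b$.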
