Simple Linear Regression: From Wikipedia, The Free Encyclopedia
Figure: Okun’s law in macroeconomics is an example of simple linear regression. Here the dependent variable (GDP growth) is presumed to be in a linear relationship with the changes in the unemployment rate.
In statistics, simple linear regression is the least squares estimator of a linear regression
model with a single predictor variable. In other words, simple linear regression fits a straight line
through the set of n points in such a way that makes the sum of squared residuals of the model
(that is, vertical distances between the points of the data set and the fitted line) as small as
possible.
The adjective simple refers to the fact that this regression is one of the simplest in statistics. The fitted line has slope equal to the correlation between y and x corrected by the ratio of standard deviations of these variables. The intercept of the fitted line is such that it passes through the center of mass (x̄, ȳ) of the data points.
Other regression methods besides simple ordinary least squares (OLS) also exist (see Linear model). In particular, when one wants to do regression “by eye”, people usually tend to draw a slightly steeper line, closer to the one produced by the total least squares method. This occurs because it is more natural for one’s mind to consider the orthogonal distances from the observations to the regression line, rather than the vertical ones as the OLS method does.
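This contrast between vertical and orthogonal distances can be checked numerically. Below is a minimal sketch (the data and function names are mine, invented for illustration) that compares the OLS slope with the total-least-squares slope, the latter computed via the Deming-regression formula for equal error variances:

```python
def ols_slope(x, y):
    # Ordinary least squares: minimizes the vertical distances to the line.
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy / sxx

def tls_slope(x, y):
    # Total least squares (Deming regression with equal error variances):
    # minimizes the orthogonal distances to the line.
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return (syy - sxx + ((syy - sxx) ** 2 + 4 * sxy ** 2) ** 0.5) / (2 * sxy)

# Made-up noisy points: the orthogonal fit comes out steeper than the vertical one.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 1.0, 3.0]
print(ols_slope(x, y), tls_slope(x, y))  # the TLS slope is the larger of the two
```

For perfectly collinear data the two slopes coincide; any scatter pulls the OLS slope below the orthogonal one.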
Contents
• 1 Fitting the regression line
  ○ 1.1 Properties
• 2 Confidence intervals
  ○ 2.1 Normality assumption
  ○ 2.2 Asymptotic assumption
• 3 Numerical example
• 4 See also
Fitting the regression line
Suppose there are n data points {(xi, yi), i = 1, …, n}. The goal is to find the equation of the straight line

  y = α + βx

which would provide a “best” fit for the data points. Here the “best” will be understood as in the least-squares approach: such a line that minimizes the sum of squared residuals of the linear regression model. In other words, the numbers α and β solve the following minimization problem:

  Find min(α, β) Q(α, β), where Q(α, β) = Σi=1..n (yi − α − βxi)².
Using simple calculus it can be shown that the values of α and β that minimize the objective function Q are

  β̂ = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)² = rxy · sy / sx,
  α̂ = ȳ − β̂ x̄,

where rxy is the sample correlation coefficient between x and y, sx is the standard deviation of x, and sy is correspondingly the standard deviation of y. A horizontal bar over a variable means the sample average of that variable. For example: x̄ = (1/n) Σi xi.

Substituting the expressions for α̂ and β̂ into the fitted equation ŷ = α̂ + β̂x yields

  (ŷ − ȳ) / sy = rxy · (x − x̄) / sx.

This shows the role rxy plays in the regression line of standardized data points.
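These formulas translate directly into code. The following is a minimal sketch (the function and variable names are mine, not the article’s) that computes α̂ and β̂ from the sums of squares:

```python
def fit_line(x, y):
    """Simple linear regression: return (alpha, beta) for y ≈ alpha + beta·x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n            # sample means
    sxx = sum((xi - xbar) ** 2 for xi in x)        # Σ (xi − x̄)²
    sxy = sum((xi - xbar) * (yi - ybar)            # Σ (xi − x̄)(yi − ȳ)
              for xi, yi in zip(x, y))
    beta = sxy / sxx                               # slope
    alpha = ybar - beta * xbar                     # intercept: line passes through (x̄, ȳ)
    return alpha, beta

# Points lying exactly on y = 1 + 2x are recovered exactly:
print(fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0]))  # (1.0, 2.0)
```

Note that β̂ is written here as Sxy/Sxx, which is algebraically identical to the correlation form rxy·sy/sx above.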
Properties
1. The line goes through the “center of mass” point (x̄, ȳ).
2. The sum of the residuals is equal to zero if the model includes a constant: Σi ε̂i = 0.
3. The linear combination of the residuals in which the coefficients are the x-values is equal to zero: Σi xi ε̂i = 0.

Confidence intervals
Normality assumption
Under the assumption that the error terms are normally distributed, the standardized estimator of the slope

  (β̂ − β) / sβ̂

has Student’s t-distribution with n − 2 degrees of freedom, where sβ̂ is the standard error of the estimator:

  sβ̂ = sqrt( (1/(n−2)) · Σi ε̂i² / Σi (xi − x̄)² ).

Using this t-statistic we can construct a confidence interval for β:

  β ∈ [ β̂ − sβ̂ t*n−2, β̂ + sβ̂ t*n−2 ],

where t*n−2 is the (1 − γ/2)-th quantile of the tn−2 distribution, giving confidence level 1 − γ. An analogous interval for α uses the standard error sα̂ = sβ̂ · sqrt( (1/n) Σi xi² ).
Asymptotic assumption
The alternative second assumption states that when the number of points in the dataset is “large enough”, the law of large numbers and the central limit theorem become applicable, and then the distribution of the estimators is approximately normal. Under this assumption all formulas derived in the previous section remain valid, with the only exception that the quantile t*n−2 of Student’s t-distribution is replaced with the quantile q* of the standard normal distribution. Occasionally the fraction 1⁄(n−2) is replaced with 1⁄n. When n is large such a change does not alter the results considerably.
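The shrinking gap between the two quantiles can be checked numerically. The 0.975 normal quantile below comes from Python’s standard library; the t-quantiles are the standard table values for a few sample sizes (hard-coded here, since the standard library has no t-distribution):

```python
from statistics import NormalDist

# 0.975 quantile of the standard normal distribution (two-sided 95% level).
q = NormalDist().inv_cdf(0.975)
print(round(q, 4))  # 1.96

# Table values of the 0.975 t-quantile: the gap to the normal quantile
# shrinks as the degrees of freedom grow.
t_quantiles = {13: 2.1604, 30: 2.0423, 100: 1.9840}
for dof, t in t_quantiles.items():
    print(dof, round(t - q, 4))
```

For the 15-point example below (13 degrees of freedom) the difference is still about 0.2, so the small-sample t-quantile matters there.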
Numerical example
As an example we shall consider the data set from the ordinary least squares article. This data set gives average weights for humans as a function of their height in the population of American women of age 30–39. Although the OLS article argues that it would be more appropriate to run a quadratic regression for this data, we will not do so and fit the simple linear regression instead.
xi, Height (m):  1.47  1.50  1.52  1.55  1.57  1.60  1.63  1.65  1.68  1.70  1.73  1.75  1.78  1.80  1.83
yi, Weight (kg): 52.21 53.12 54.48 55.84 57.20 58.57 59.93 61.29 63.11 64.47 66.28 68.10 69.92 72.19 74.46
The 0.975 quantile of Student’s t-distribution with 13 degrees of freedom is t*13 = 2.1604, and thus the confidence intervals for α and β are

  α ∈ [−45.4, −32.7],
  β ∈ [57.4, 65.1].
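The whole numerical example can be reproduced with a short script. The t-quantile 2.1604 is taken from the text above; everything else is computed from the height–weight table:

```python
# Height (m) and weight (kg) data from the numerical example.
x = [1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65, 1.68, 1.70,
     1.73, 1.75, 1.78, 1.80, 1.83]
y = [52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29, 63.11,
     64.47, 66.28, 68.10, 69.92, 72.19, 74.46]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
syy = sum((yi - ybar) ** 2 for yi in y)

beta = sxy / sxx                         # slope estimate
alpha = ybar - beta * xbar               # intercept estimate
sse = syy - beta * sxy                   # residual sum of squares Σ ε̂i²
s2 = sse / (n - 2)                       # error-variance estimate
se_beta = (s2 / sxx) ** 0.5              # standard error of the slope
se_alpha = se_beta * (sum(xi ** 2 for xi in x) / n) ** 0.5

t = 2.1604  # 0.975 quantile of Student's t with n − 2 = 13 degrees of freedom
print(f"alpha = {alpha:.3f} +/- {t * se_alpha:.3f}")  # alpha = -39.062 +/- 6.347
print(f"beta  = {beta:.3f} +/- {t * se_beta:.3f}")    # beta  = 61.272 +/- 3.837
```

The half-widths 6.35 and 3.84 reproduce the intervals α ∈ [−45.4, −32.7] and β ∈ [57.4, 65.1].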
See also
• OLS/Proofs — derivation of all formulas used in this article in the general multidimensional case
• Linear model — alternative regression methods which can be applied in this context
• Deming regression — orthogonal linear regression