Simple Linear Regression Analysis - Final

The document summarizes key concepts in simple linear regression, including:
• The simple linear regression model relates a response variable Y to a single predictor X using a straight line.
• The least squares method is used to estimate the intercept β0 and slope β1 parameters by minimizing the sum of squared residuals.
• Hypothesis tests can be conducted on the slope and intercept parameters using t-tests, assuming the errors are normally distributed.
• The coefficient of determination R² indicates how well the model fits the data.


Lecture 03 - 07

Isuru Kumarasiri
Content
• Simple Linear Regression Model
• Least squares estimation of the parameters
• Hypothesis Testing on the slope and intercept
• Interval Estimation in Simple Linear Regression
• Prediction of new observations
• Coefficient of Determination
• Estimation by maximum likelihood
Simple Linear Regression Model
• The simple linear regression model is a model with a single regressor x whose relationship with a response y is a straight line.
• This simple linear regression model is

$$y = \beta_0 + \beta_1 x + \varepsilon$$

• where the intercept β0 and the slope β1 are unknown constants and ε is a random error component.
• The errors are assumed to have mean zero and unknown variance σ2.
Simple Linear Regression Model Contd..
• Additionally, we usually assume that the errors are uncorrelated. This means that the value of one error does not depend on the value of any other error.

• The parameters β0 and β1 are usually called regression coefficients.


Least squares estimation of the parameters
• The parameters β0 and β1 are unknown and must be estimated using
sample data.
• That is, we estimate β0 and β1 so that the sum of the squares of the
differences between the observations yi and the straight line is a
minimum.

• Thus, the least-squares criterion is

$$S(\beta_0, \beta_1) = \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2$$

Least squares estimation of the parameters Contd…
• Differentiating S(β0, β1) with respect to β0 and β1 and setting the results equal to zero gives two equations. Simplifying these two equations yields the least-squares normal equations

$$n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$$

and

$$\hat{\beta}_0 \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$$
Least squares estimation of the parameters Contd…
• The fitted simple linear regression model is then

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

• Question: Obtain the regression equation of Y on X, and estimate Y when X = 55, from the following data, by hand and using R software.

Ans: Y = 0.942X + 6.08
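
A minimal R sketch for the software part of this question (the slide's data table is not reproduced here, so the vectors x and y below are assumed to hold the X and Y values from that table):

# Assumed: x and y are numeric vectors holding the data from the table
fit <- lm(y ~ x)                              # least-squares fit of Y on X
coef(fit)                                     # intercept and slope estimates
predict(fit, newdata = data.frame(x = 55))    # estimated Y at X = 55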
Least squares estimation of the parameters Contd…
• Solving the normal equations gives

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}, \qquad \hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}$$

• Therefore $\hat{\beta}_0$ and $\hat{\beta}_1$ are the least-squares estimators of the intercept and slope, respectively. The fitted simple linear regression model is then

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

• The above equation gives a point estimate of the mean of y for a particular x.
New Notations
• Define

$$s_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}, \qquad s_{xy} = \sum_{i=1}^{n} y_i (x_i - \bar{x}) = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$$

• Thus, a convenient way to write $\hat{\beta}_1$ is

$$\hat{\beta}_1 = \frac{s_{xy}}{s_{xx}}$$
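
As a sketch, these quantities can be computed directly in R; the vectors x and y are again assumed to hold the sample data:

Sxx <- sum((x - mean(x))^2)       # corrected sum of squares of x
Sxy <- sum((x - mean(x)) * y)     # corrected sum of cross products
b1  <- Sxy / Sxx                  # slope estimate
b0  <- mean(y) - b1 * mean(x)     # intercept estimate
c(b0, b1)                         # should agree with coef(lm(y ~ x))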
Residuals
• The difference between the observed value yi and the corresponding fitted value ŷi is called a residual. Mathematically, the ith residual is

$$e_i = y_i - \hat{y}_i = y_i - \left(\hat{\beta}_0 + \hat{\beta}_1 x_i\right), \qquad i = 1, 2, \ldots, n$$

• Residuals play an important role in investigating model adequacy and in detecting departures from the underlying assumptions. This topic is discussed later.
Residuals Contd…
• After obtaining the least-squares fit, several interesting questions
come to mind:
1. How well does this equation fit the data?
2. Is the model likely to be useful as a predictor?
3. Are any of the basic assumptions (such as constant variance and uncorrelated
errors) violated, and if so, how serious is this?
• All of these issues must be investigated before the model is finally
adopted for use. As noted previously, the residuals play a key role in
evaluating model adequacy.
Hypothesis Testing on the slope and intercept
• We are often interested in testing hypotheses and constructing
confidence intervals about the model parameters.
• These procedures require that we make the additional assumption
that the model errors εi are normally distributed.
• Thus, the complete assumptions are that the errors are normally and
independently distributed with mean 0 and variance σ2, abbreviated
NID(0, σ2 ).
Testing using t Tests
• Suppose that we wish to test the hypothesis that the slope equals a constant, say β10. The appropriate hypotheses are

$$H_0: \beta_1 = \beta_{10}, \qquad H_1: \beta_1 \neq \beta_{10}$$

• where we have specified a two-sided alternative.
• If σ² is known, the test statistic is

$$Z_0 = \frac{\hat{\beta}_1 - \beta_{10}}{\sqrt{\sigma^2 / s_{xx}}}$$

• which follows the N(0, 1) (standard normal) distribution when H0 is true.
When σ is unknown
• When σ² is unknown, we can show that MSRes is an unbiased estimator of σ².
• The test statistic is then

$$t_0 = \frac{\hat{\beta}_1 - \beta_{10}}{\operatorname{se}(\hat{\beta}_1)}$$

where

$$\operatorname{se}(\hat{\beta}_1) = \sqrt{\frac{MS_{Res}}{s_{xx}}}$$

and

$$MS_{Res} = \frac{SS_{Res}}{n - 2} = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}$$

• t0 follows a t distribution with n − 2 degrees of freedom if the null hypothesis H0: β1 = β10 is true.
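
A sketch of this t test in R, computed from first principles rather than read off summary() (x and y are assumed to hold the data; the hypothesized slope β10 is taken as 0):

fit   <- lm(y ~ x)
n     <- length(y)
MSRes <- sum(resid(fit)^2) / (n - 2)       # unbiased estimator of sigma^2
Sxx   <- sum((x - mean(x))^2)
se_b1 <- sqrt(MSRes / Sxx)                 # standard error of the slope
t0    <- (coef(fit)["x"] - 0) / se_b1      # test statistic for H0: beta1 = 0
2 * pt(-abs(t0), df = n - 2)               # two-sided p-value, matches summary(fit)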


Test Hypothesis about Intercept
• A similar procedure can be used to test hypotheses about the intercept. To test

$$H_0: \beta_0 = \beta_{00}, \qquad H_1: \beta_0 \neq \beta_{00}$$

• we would use the test statistic

$$t_0 = \frac{\hat{\beta}_0 - \beta_{00}}{\sqrt{MS_{Res}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}} = \frac{\hat{\beta}_0 - \beta_{00}}{\operatorname{se}(\hat{\beta}_0)}$$

• We reject the null hypothesis if $|t_0| > t_{\alpha/2,\, n-2}$.

Testing the significance of Regression line
• A very important special case of the hypotheses is

$$H_0: \beta_1 = 0, \qquad H_1: \beta_1 \neq 0$$
• These hypotheses relate to the significance of a regression.


• Failing to reject H0: β1 = 0 implies that there is no linear relationship
between x and y
• Alternatively, if H0: β1 = 0 is rejected, this implies that x is of value in
explaining the variability in y.
Testing the significance of Regression line Contd…

• However, rejecting H0: β1 = 0 could mean either that the straight-line model is adequate or that, even though there is a linear effect of x, better results could be obtained with the addition of higher-order polynomial terms in x.
Testing the significance of Regression line Contd…
• The test procedure for H0: β1 = 0 may be developed by simply making use of the t statistic with β10 = 0.
• The null hypothesis of significance of regression would be rejected if $|t_0| > t_{\alpha/2,\, n-2}$.
Example 1

• Consider the dataset (rocket propellant) and answer the following questions.
1. Estimate the model parameters.
2. Estimate σ².
3. Test for significance of regression.
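
A minimal R sketch for this example; the data frame propellant with columns age (weeks) and strength (psi) is an assumed layout for the rocket propellant data, not something fixed by the slides:

fit <- lm(strength ~ age, data = propellant)
summary(fit)    # (1) and (3): parameter estimates and the t test for significance
sigma(fit)^2    # (2): estimate of sigma^2, i.e. MSRes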
Example 2

• The purity of oxygen produced by a fractional distillation process is thought to be related to the percentage of hydrocarbons in the main condenser of the processing unit. Twenty samples are shown.
• Fit a simple linear regression model to the data.
• Test the hypothesis H0: β1 = 0.
Several other Properties of the Least-Squares Fit
• The sum of the residuals is always zero: $\sum_{i=1}^{n} e_i = 0$, and consequently $\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \hat{y}_i$.
• The least-squares line always passes through the centroid $(\bar{x}, \bar{y})$ of the data.
• The residuals are orthogonal to both the regressor and the fitted values: $\sum_{i=1}^{n} x_i e_i = 0$ and $\sum_{i=1}^{n} \hat{y}_i e_i = 0$.
Analysis of Variance
• We may also use an analysis-of-variance approach to test the
significance of a regression.
• The analysis of variance is based on a partitioning of the total
variability in the response variable y.
• To obtain this partitioning, begin with the identity

$$y_i - \bar{y} = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i)$$
Analysis of Variance Contd..
• Squaring both sides of the identity and summing over all n observations gives (the cross-product term vanishes by the properties of the least-squares fit)

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

• that is, $SS_T = SS_R + SS_{Res}$, with n − 1, 1, and n − 2 degrees of freedom, respectively.
ANOVA Contd…

• We can use the usual analysis-of-variance F test to test the hypothesis H0: β1 = 0.
• The test statistic is

$$F_0 = \frac{SS_R / 1}{SS_{Res} / (n - 2)} = \frac{MS_R}{MS_{Res}}$$

• which follows the F(1, n − 2) distribution when H0 is true.
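
In R, the same F test can be read from the ANOVA table of a fitted model (a sketch, with fit as in the earlier examples):

anova(fit)    # SSR, SSRes, mean squares, F0 and its p-value
# For simple linear regression, F0 equals the square of the slope's t statistic.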


Standard Error
• The denominator of the test statistic t0 is often called the estimated standard error, or more simply, the standard error of the slope. That is,

$$\operatorname{se}(\hat{\beta}_1) = \sqrt{\frac{MS_{Res}}{s_{xx}}}$$

• Similarly, for the intercept,

$$\operatorname{se}(\hat{\beta}_0) = \sqrt{MS_{Res}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}$$

Interval Estimation in Simple Linear Regression
• Here, we consider confidence interval estimation of the regression
model parameters β0, β1
• If the errors are normally and independently distributed, then a 100(1 − α) percent confidence interval (CI) on the slope β1 is given by

$$\hat{\beta}_1 - t_{\alpha/2,\,n-2}\,\operatorname{se}(\hat{\beta}_1) \;\le\; \beta_1 \;\le\; \hat{\beta}_1 + t_{\alpha/2,\,n-2}\,\operatorname{se}(\hat{\beta}_1)$$

• and a 100(1 − α) percent CI on the intercept β0 is

$$\hat{\beta}_0 - t_{\alpha/2,\,n-2}\,\operatorname{se}(\hat{\beta}_0) \;\le\; \beta_0 \;\le\; \hat{\beta}_0 + t_{\alpha/2,\,n-2}\,\operatorname{se}(\hat{\beta}_0)$$


Question

• Construct 95% CIs on β1 using the rocket propellant data from the Example.
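
A sketch of this in R (fit assumed to be the lm object fitted to the rocket propellant data):

confint(fit, level = 0.95)    # 95% CIs on the intercept and the slope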
Covariance
• Covariance measures the direction of the linear relationship between two variables.
• A positive covariance means that both variables tend to be high or
low at the same time.
• A negative covariance means that when one variable is high, the
other tends to be low.
Properties of Covariance
• Let X1 and X2 denote random variables and let a, b, c, d denote some constants. Then, the following properties hold:

$$\operatorname{Cov}(X_1, X_2) = E(X_1 X_2) - E(X_1)E(X_2)$$

$$\operatorname{Cov}(X_1, X_1) = \operatorname{Var}(X_1), \qquad \operatorname{Cov}(X_1, X_2) = \operatorname{Cov}(X_2, X_1)$$

$$\operatorname{Cov}(a X_1 + b,\; c X_2 + d) = a c \operatorname{Cov}(X_1, X_2)$$
Questions
1. Show that $s_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$ and $s_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$.
2. Assuming that ___ = 0, show that
a.
b.
c.
Interval Estimation of the Mean Response
• A major use of a regression model is to estimate the mean response
E(y) for a particular value of the regressor variable x.
• Let x0 be the level of the regressor variable for which we wish to
estimate the mean response, say E(y|x0).
• An unbiased point estimator of E(y|x0) is found from the fitted model as

$$\widehat{E(y \mid x_0)} = \hat{\mu}_{y|x_0} = \hat{\beta}_0 + \hat{\beta}_1 x_0$$
Interval Estimation of the Mean Response Contd…
• To obtain a 100(1 − α) percent CI on E(y|x0), use

$$\hat{\mu}_{y|x_0} - t_{\alpha/2,\,n-2}\sqrt{MS_{Res}\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right)} \;\le\; E(y \mid x_0) \;\le\; \hat{\mu}_{y|x_0} + t_{\alpha/2,\,n-2}\sqrt{MS_{Res}\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right)}$$
Question
• Find a 95% CI on E(y|x0) for the rocket propellant data in the Example (i.e., obtain the 95% CI on the mean response at x = x0).
• For x0 = x̄, show that the CI reduces to

$$\bar{y} \pm t_{\alpha/2,\,n-2}\sqrt{\frac{MS_{Res}}{n}}$$
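
A sketch of this interval in R (fit and the predictor name age are the assumed objects from the rocket propellant example; x0 is the regressor value of interest):

predict(fit, newdata = data.frame(age = x0),
        interval = "confidence", level = 0.95)    # fit, lwr, upr for E(y | x0)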


Prediction of New Observation
• An important application of the regression model is the prediction of
new observations y corresponding to a specified level of the regressor
variable x.
• If x0 is the value of the regressor variable of interest, then the point estimate of the new observation y0 is

$$\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0$$

• We now develop an interval estimate, a prediction interval, for this future observation y0.
Prediction of New Observation Contd…
• Note that the random variable

$$\psi = y_0 - \hat{y}_0$$

• is normally distributed with mean zero and variance

$$\operatorname{Var}(\psi) = \sigma^2\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right)$$

• since the future observation y0 is independent of $\hat{y}_0$. Thus, the 100(1 − α) percent prediction interval on a future observation at x0 is

$$\hat{y}_0 - t_{\alpha/2,\,n-2}\sqrt{MS_{Res}\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right)} \;\le\; y_0 \;\le\; \hat{y}_0 + t_{\alpha/2,\,n-2}\sqrt{MS_{Res}\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right)}$$
Question
• Find a 95% prediction interval on a future value of propellant shear
strength in a motor made from a batch of sustainer propellant that is
10 weeks old.
Ans:
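
A sketch of this computation in R (fit and the column name age as assumed before; a batch that is 10 weeks old corresponds to age = 10):

predict(fit, newdata = data.frame(age = 10),
        interval = "prediction", level = 0.95)    # 95% prediction interval for y0
# Note the wider limits than the confidence interval on the mean response.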
Question
• Try page 64 Question (2.18) in the Reference book
Coefficient of Determination
• The quantity

$$R^2 = \frac{SS_R}{SS_T} = 1 - \frac{SS_{Res}}{SS_T}$$

is called the coefficient of determination, and 0 ≤ R² ≤ 1.
• R² is often called the proportion of variation explained by the regressor x.
• Show that R² for the previous example is 0.9018; that is, 90.18% of the variability in strength is accounted for by the regression model.
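
In R (a sketch, with fit as before), R² can be read directly from the fitted model:

summary(fit)$r.squared    # proportion of the variability in y explained by x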
Question
Maximum Likelihood Estimates
• The MLE is an example of a point estimate because it gives a single
value for the unknown parameter.
• Given data, the maximum likelihood estimate (MLE) for a parameter p is the value of p that maximizes the likelihood P(data | p). That is, the MLE is the value of p for which the observed data are most likely.
• It is often easier to work with the natural log of the likelihood function; for short, this is simply called the log-likelihood. Since ln(x) is an increasing function, the maxima of the likelihood and log-likelihood coincide.
Example
Example Contd…
Estimation (β0 and β1) by Maximum Likelihood
• Consider the data ( yi , xi ), i = 1, 2, . . . , n . If we assume that the
errors in the regression model are NID(0, σ2 ), then the observations yi
in this sample are normally and independently distributed random
variables with mean β0 + β1xi and variance σ2.
• The likelihood function is

$$L(\beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}\right] = (2\pi\sigma^2)^{-n/2} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2\right]$$
Estimation (β0 and β1) by Maximum Likelihood Contd…
• Show that the maximum-likelihood estimators are

$$\tilde{\beta}_0 = \bar{y} - \tilde{\beta}_1 \bar{x}, \qquad \tilde{\beta}_1 = \frac{\sum_{i=1}^{n} y_i (x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \tilde{\sigma}^2 = \frac{\sum_{i=1}^{n} \left(y_i - \tilde{\beta}_0 - \tilde{\beta}_1 x_i\right)^2}{n}$$
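
As a sketch, the MLEs can also be obtained numerically in R and compared with the least-squares fit (x and y are assumed to hold the data; the starting values passed to optim are arbitrary):

negloglik <- function(par) {
  b0 <- par[1]; b1 <- par[2]; s2 <- par[3]
  # negative log-likelihood of the normal-errors regression model
  0.5 * length(y) * log(2 * pi * s2) + sum((y - b0 - b1 * x)^2) / (2 * s2)
}
mle <- optim(c(0, 1, 1), negloglik, method = "L-BFGS-B",
             lower = c(-Inf, -Inf, 1e-8))$par
mle[1:2]    # agree with coef(lm(y ~ x))
mle[3]      # the sigma^2 MLE divides SSRes by n, not n - 2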
End of Simple Linear Regression Part !!!

Questions?
