Sessions 18 & 19
Simple Linear Regression & Multiple Linear Regression
Learning Objectives
In this session, we will learn:
• How to use regression analysis to predict the value of a
dependent variable based on an independent variable
• The meaning of the regression coefficients b0 and b1
• How to evaluate the assumptions of regression analysis and
know what to do if the assumptions are violated
• To make inferences about the slope and correlation
coefficient
• To estimate mean values and predict individual values
Correlation vs. Regression
• A scatter plot can be used to show the relationship
between two variables
• Correlation analysis is used to measure the strength
of the association (linear relationship) between two
variables [refer to Session 3 material]
• Correlation is only concerned with strength of the
relationship
• No causal effect is implied with correlation
Introduction to
Regression Analysis
• Regression analysis is used to:
• Predict the value of a dependent variable based on the
value of at least one independent variable
• Explain the impact of changes in an independent variable
on the dependent variable
Dependent variable: the variable we wish to predict or explain
Independent variable: the variable used to predict or explain the dependent variable
Simple Linear Regression Model
• Only one independent variable, X
• Relationship between X and Y is described
by a linear function
• Changes in Y are assumed to be related to
changes in X
Types of Relationships
Linear relationships vs. curvilinear relationships
[Figure: scatter plots illustrating linear and curvilinear X-Y relationships]
Types of Relationships: linear relationships
(continued)
Strong relationships vs. weak relationships
[Figure: scatter plots illustrating strong and weak linear relationships]
Types of Relationships
(continued)
No relationship
[Figure: scatter plot showing no X-Y relationship]
Simple Linear Regression Model
Yi = β0 + β1Xi + εi
where:
Yi = dependent variable
β0 = population Y intercept
β1 = population slope coefficient
Xi = independent variable
εi = random error term
β0 + β1Xi is the linear component; εi is the random error component.
Simple Linear Regression Model
(continued)
Yi = β0 + β1Xi + εi
[Figure: regression line in the (X, Y) plane showing, at Xi, the observed value of Y, the predicted value of Y, the random error εi for this Xi value, slope β1, and intercept β0]
Simple Linear Regression Equation
(Prediction Line)
The simple linear regression equation provides an
estimate of the population regression line
Ŷi = b0 + b1Xi
where:
Ŷi = estimated (or predicted) Y value for observation i
b0 = estimate of the regression intercept
b1 = estimate of the regression slope
Xi = value of X for observation i
The Least Squares Method
b0 and b1 are obtained by finding the values that
minimize the sum of the squared differences between
Y and Ŷ:
min Σ(Yi − Ŷi)² = min Σ(Yi − (b0 + b1Xi))²
Least Squares Line…
This line minimizes the sum of the squared differences between the points and the line; these differences are called residuals.
The Mathematics Behind…
Least Squares Method
• Slope for the estimated regression equation:
b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²
where:
xi = value of the independent variable for the ith observation
yi = value of the dependent variable for the ith observation
x̄ = mean value of the independent variable
ȳ = mean value of the dependent variable
Least Squares Method
y-Intercept for the Estimated Regression Equation
b0 = ȳ − b1x̄
Interpretation of the
Slope and the Intercept
• b0 is the estimated average value of Y
when the value of X is zero
• b1 is the estimated change in the average
value of Y as a result of a one-unit increase
in X
Simple Linear Regression Example
• A real estate agent wishes to examine the relationship
between the selling price of a home and its size
(measured in square feet)
• A random sample of 10 houses is selected
• Dependent variable (Y) = house price in $1000s
• Independent variable (X) = square feet
Simple Linear Regression
Example: Data
House Price in $1000s Square Feet
(Y) (X)
245 1400
312 1600
279 1700
308 1875
199 1100
219 1550
405 2350
324 2450
319 1425
255 1700
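The least-squares formulas from the previous slides can be checked against this data with a short Python sketch (variable names are my own; the expected values match the Excel output shown later):

```python
# House data from the slides: price in $1000s (y) vs. square feet (x)
x = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# b1 = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_x = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / ss_x           # slope
b0 = y_bar - b1 * x_bar    # intercept: b0 = y_bar - b1 * x_bar

print(round(b1, 5), round(b0, 5))  # 0.10977 98.24833, matching the Excel output
```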
Simple Linear Regression Example: Scatter Plot
House price model: Scatter Plot
[Figure: scatter plot of house price ($1000s, 0 to 450) vs. square feet (0 to 3000) for the 10 sampled houses]
Simple Linear Regression Example: Excel Output

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

The regression equation is:
house price = 98.24833 + 0.10977 (square feet)

ANOVA
             df    SS          MS          F        Significance F
Regression    1    18934.9348  18934.9348  11.0848  0.01039
Residual      8    13665.5652  1708.1957
Total         9    32600.5000

             Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept    98.24833      58.03348        1.69296  0.12892  -35.57720  232.07386
Square Feet  0.10977       0.03297         3.32938  0.01039  0.03374    0.18580
Simple Linear Regression Example:
Coefficient of Determination, r2 in Excel
r² = SSR / SST = 18934.9348 / 32600.5000 = 0.58082
(the "R Square" value in the Excel Regression Statistics output)

58.08% of the variation in house prices is explained by
variation in square feet.
Simple Linear Regression Example:
Standard Error of Estimate in Excel
SYX = 41.33032
(the "Standard Error" value in the Excel Regression Statistics output)
Simple Linear Regression Example:
Graphical Representation
House price model: Scatter Plot and Prediction Line
[Figure: scatter plot with fitted prediction line; slope = 0.10977, intercept = 98.248]

house price = 98.24833 + 0.10977 (square feet)
Simple Linear Regression Example:
Interpretation of b0
house price = 98.24833 + 0.10977 (square feet)
• b0 is the estimated average value of Y when the value
of X is zero (if X = 0 is in the range of observed X
values)
• Because a house cannot have a square footage of 0,
b0 has no practical application
Simple Linear Regression Example:
Interpreting b1
house price = 98.24833 + 0.10977 (square feet)
• b1 estimates the change in the average value of Y as
a result of a one-unit increase in X
• Here, b1 = 0.10977 tells us that the mean value of a house
increases by .10977($1000) = $109.77, on average, for
each additional one square foot of size
Simple Linear Regression
Example: Making Predictions
Predict the price for a house
with 2000 square feet:
house price = 98.25 + 0.1098 (sq. ft.)
            = 98.25 + 0.1098 (2000)
            = 317.85
The predicted price for a house with 2000
square feet is 317.85($1,000s) = $317,850
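The same prediction, as a minimal Python sketch using the rounded coefficients from the slide:

```python
# Predict the price of a 2000-square-foot house with the fitted line
b0, b1 = 98.25, 0.1098          # rounded coefficients from the slide
price = b0 + b1 * 2000
print(round(price, 2))  # 317.85, i.e. $317,850
```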
Simple Linear Regression
Example: Making Predictions
• When using a regression model for prediction, only
predict within the relevant range of data
[Figure: scatter plot with prediction line, marking the relevant range for interpolation]

Do not try to extrapolate beyond the range of observed X's.
Comparing Standard Errors
SYX is a measure of the variation of observed
Y values from the regression line
[Figure: two regression plots, one with small SYX and one with large SYX]
The magnitude of SYX should always be judged relative to the
size of the Y values in the sample data
i.e., SYX = $41.33K is moderately small relative to house prices in
the $200K - $400K range
Performing Different Hypothesis Tests
• t-test for the correlation coefficient
• F-test for overall model fit
• t-test for the slope coefficient
Hypothesis Test for Correlation Coefficient
• Hypotheses
H0 : ρ = 0 (no correlation between X and Y)
H1 : ρ ≠ 0 (correlation exists)
• Test statistic (with n − 2 degrees of freedom):
tSTAT = (r − ρ) / √((1 − r²) / (n − 2))
where
r = +√r² if b1 > 0
r = −√r² if b1 < 0
Hypothesis Test for Correlation Coefficient
(continued)
Is there evidence of a linear relationship
between square feet and house price at
the .05 level of significance?
H0: ρ = 0 (no correlation)
H1: ρ ≠ 0 (correlation exists)
α = .05, df = 10 − 2 = 8
tSTAT = (r − ρ) / √((1 − r²)/(n − 2)) = (.762 − 0) / √((1 − .762²)/(10 − 2)) = 3.329
Hypothesis Test for Correlation Coefficient
(continued)
tSTAT = (.762 − 0) / √((1 − .762²)/(10 − 2)) = 3.329
d.f. = 10 − 2 = 8, α/2 = .025, critical values −tα/2 = −2.3060 and tα/2 = 2.3060

Decision: Since tSTAT = 3.329 > 2.3060, reject H0.
Conclusion: There is evidence of a linear association at the 5% level of significance.
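The calculation above can be reproduced in Python; this sketch uses the unrounded Multiple R from the Excel output (with r rounded to .762 the statistic comes out marginally smaller):

```python
import math

# t statistic for H0: rho = 0, with n - 2 = 8 degrees of freedom
r, n = 0.76211, 10               # Multiple R from the Excel output
t_stat = (r - 0) / math.sqrt((1 - r**2) / (n - 2))
print(round(t_stat, 3))  # 3.329, which exceeds the critical value 2.3060
```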
F Test for Significance- Overall Model Fit
• F test statistic:
FSTAT = MSR / MSE
where
MSR = SSR / k
MSE = SSE / (n − k − 1)
FSTAT follows an F distribution with k numerator and (n − k − 1) denominator degrees of freedom
(k = the number of independent variables in the regression model)
Measures of Variation
• Total variation is made up of two parts:
SST = SSR + SSE
(Total Sum of Squares = Regression Sum of Squares + Error Sum of Squares)
SST = Σ(Yi − Ȳ)²    SSR = Σ(Ŷi − Ȳ)²    SSE = Σ(Yi − Ŷi)²
where:
Ȳ = mean value of the dependent variable
Yi = observed value of the dependent variable
Ŷi = predicted value of Y for the given Xi value
Measures of Variation
• SST = total sum of squares (Total Variation)
• Measures the variation of the Yi values around their mean Ȳ
• SSR = regression sum of squares (Explained Variation)
• Variation attributable to the relationship between X and Y
• SSE = error sum of squares (Unexplained Variation)
• Variation in Y attributable to factors other than X
Measures of Variation
[Figure: for an observation (Xi, Yi), the deviation from the mean decomposes as SST = Σ(Yi − Ȳ)² into SSR = Σ(Ŷi − Ȳ)² plus SSE = Σ(Yi − Ŷi)²]
Coefficient of Determination, r2
• The coefficient of determination is the portion of
the total variation in the dependent variable that
is explained by variation in the independent
variable
• The coefficient of determination is also called r-squared and is denoted as r²
r² = SSR / SST = regression sum of squares / total sum of squares
note: 0 ≤ r² ≤ 1
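For the house-price model, r² follows directly from the ANOVA sums of squares; a quick sketch using the values from the Excel output:

```python
# Coefficient of determination from the ANOVA table
ssr = 18934.9348    # regression sum of squares
sst = 32600.5000    # total sum of squares
r2 = ssr / sst
print(round(r2, 5))  # 0.58082 -- 58.08% of price variation explained
```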
F Test for Significance
(continued)
H0: β1 = 0    H1: β1 ≠ 0
α = .05, df1 = 1, df2 = 8
Critical value: F.05 = 5.32
Test statistic: FSTAT = MSR / MSE = 11.08
Decision: Since FSTAT = 11.08 > 5.32, reject H0 at α = .05.
Conclusion: There is sufficient evidence that house size affects selling price.
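The F statistic above comes straight from the ANOVA table; a sketch using the slide values:

```python
# F test for overall model fit: k = 1 predictor, n = 10 observations
ssr, sse = 18934.9348, 13665.5652
k, n = 1, 10
msr = ssr / k             # mean square regression
mse = sse / (n - k - 1)   # mean square error
f_stat = msr / mse
print(round(f_stat, 2))   # 11.08, which exceeds F.05 = 5.32
```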
Inferences About the Slope:
t Test Example
Estimated regression equation:
house price = 98.25 + 0.1098 (sq. ft.)

Data: house price in $1000s (Y) and square feet (X) for the same 10-house sample shown earlier.

The slope of this model is 0.1098.
Is there a relationship between the square footage of the house and its sales price?
Inferences About the Slope: t Test
• t test for a population slope
• Is there a linear relationship between X and Y?
• Null and alternative hypotheses
• H0: β1 = 0 (no linear relationship)
• H1: β1 ≠ 0 (linear relationship does exist)
• Test statistic:
tSTAT = (b1 − β1) / Sb1,    d.f. = n − k − 1 (= n − 2 for simple regression)
where:
b1 = regression slope coefficient
β1 = hypothesized slope
Sb1 = standard error of the slope
k = number of independent variables
Inferences About the Slope
• The standard error of the regression slope coefficient (b1) is estimated by
Sb1 = SYX / √SSX = SYX / √(Σ(Xi − X̄)²)
where:
Sb1 = estimate of the standard error of the slope
SYX = √(SSE / (n − 2)) = standard error of the estimate
Standard Error of Estimate
• The standard deviation of the variation of observations around the regression line is estimated by
SYX = √(SSE / (n − 2)) = √( Σ(Yi − Ŷi)² / (n − 2) )
where:
SSE = error sum of squares
n = sample size
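For the house-price model, SSE = 13665.5652 and n = 10, so SYX can be verified directly (a sketch):

```python
import math

# Standard error of the estimate: S_YX = sqrt(SSE / (n - 2))
sse, n = 13665.5652, 10
s_yx = math.sqrt(sse / (n - 2))
print(round(s_yx, 5))  # 41.33032, the "Standard Error" in the Excel output
```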
Inferences About the Slope: t Test Example
H0: β1 = 0    H1: β1 ≠ 0

From the Excel output:
             Coefficients  Standard Error  t Stat   P-value
Intercept    98.24833      58.03348        1.69296  0.12892
Square Feet  0.10977       0.03297         3.32938  0.01039

tSTAT = (b1 − β1) / Sb1 = (0.10977 − 0) / 0.03297 = 3.32938
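The same statistic from the rounded table values (Excel's 3.32938 uses unrounded inputs, so the last digit can differ slightly):

```python
# t statistic for the slope: t = (b1 - hypothesized slope) / Sb1
b1, sb1 = 0.10977, 0.03297       # from the Excel coefficient table
t_stat = (b1 - 0) / sb1
print(round(t_stat, 3))  # 3.329
```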
Inferences About the Slope: t Test Example
H0: β1 = 0    H1: β1 ≠ 0
Test statistic: tSTAT = 3.329
d.f. = 10 − 2 = 8, α/2 = .025, critical values −tα/2 = −2.3060 and tα/2 = 2.3060
Decision: Since tSTAT = 3.329 > 2.3060, reject H0.
There is sufficient evidence that square footage affects house price.
Inferences About the Slope: t Test Example
H0: β1 = 0
H1: β1 ≠ 0
From the Excel output:
             Coefficients  Standard Error  t Stat   P-value
Intercept    98.24833      58.03348        1.69296  0.12892
Square Feet  0.10977       0.03297         3.32938  0.01039

The p-value for Square Feet is 0.01039.
Decision: Reject H0, since p-value < α.
There is sufficient evidence that square footage affects house price.
Confidence Interval Estimate for the
Slope
Confidence interval estimate of the slope:
b1 ± tα/2 · Sb1,    d.f. = n − 2
Excel Printout for House Prices:
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
At the 95% confidence level, the confidence interval for
the slope is (0.0337, 0.1858)
Confidence Interval Estimate for the
Slope
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
Since the units of the house price variable are $1000s,
we are 95% confident that the mean impact on sales
price is between $33.74 and $185.80 per additional
square foot of house size
This 95% confidence interval does not include 0.
Conclusion: There is a significant relationship between
house price and square feet at the .05 level of significance
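The interval endpoints can be reproduced from the coefficient table; this sketch uses t(0.025, 8) = 2.3060 as given in the slides:

```python
# 95% confidence interval for the slope: b1 +/- t(alpha/2) * Sb1
b1, sb1 = 0.10977, 0.03297
t_crit = 2.3060                  # t(0.025, 8)
low, high = b1 - t_crit * sb1, b1 + t_crit * sb1
print(f"{low:.5f} {high:.5f}")   # 0.03374 0.18580, matching the Excel output
```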
Assumptions of Regression
L.I.N.E
• Linearity
• The relationship between X and Y is linear
• Independence of Errors
• Error values are statistically independent
• Normality of Error
• Error values are normally distributed for any given value of
X
• Equal Variance (also called homoscedasticity)
• The probability distribution of the errors has constant
variance
Residual Analysis
ei Yi Ŷi
• The residual for observation i, ei, is the difference between
its observed and predicted value
• Check the assumptions of regression by examining the
residuals
• Examine for linearity assumption
• Evaluate independence assumption
• Evaluate normal distribution assumption
• Examine for constant variance for all levels of X (homoscedasticity)
• Graphical Analysis of Residuals
• Can plot residuals vs. X
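Residuals are simple to compute once the line is fitted; one quick sanity check is that least-squares residuals always sum to zero. A sketch on the house data:

```python
# Fit the line, then compute residuals e_i = y_i - (b0 + b1 * x_i)
x = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
     / sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print([round(e, 2) for e in residuals])   # first residual is about -6.92
print(abs(sum(residuals)) < 1e-6)         # True: residuals sum to zero
```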
Residual Analysis for Linearity
[Figure: scatter plots and residual plots; a curved residual pattern indicates a non-linear relationship, while a patternless residual plot is consistent with linearity]
Residual Analysis for Independence
[Figure: residual plots; a systematic pattern over X indicates errors that are not independent, while a random scatter is consistent with independence]
Checking for Normality
• Examine the Stem-and-Leaf Display of the Residuals
• Examine the Boxplot of the Residuals
• Examine the Histogram of the Residuals
• Construct a Normal Probability Plot of the Residuals
Residual Analysis for Normality
When using a normal probability plot, normal errors will plot approximately in a straight line.
[Figure: normal probability plot of the residuals (percent vs. residual)]
Residual Analysis for Equal Variance
[Figure: residual plots; a fanning pattern indicates non-constant variance, while a uniform band indicates constant variance]
Simple Linear Regression Example: Excel Residual Output
RESIDUAL OUTPUT
Observation  Predicted House Price  Residuals
 1           251.92316               -6.92316
 2           273.87671               38.12329
 3           284.85348               -5.85348
 4           304.06284                3.93716
 5           218.99284              -19.99284
 6           268.38832              -49.38832
 7           356.20251               48.79749
 8           367.17929              -43.17929
 9           254.66736               64.33264
10           284.85348              -29.85348

[Figure: House Price Model residual plot, residuals vs. square feet]

Does not appear to violate any regression assumptions
Estimating Mean Values and
Predicting Individual Values
Goal: Form intervals around Ŷ to express uncertainty about the value of Y for a given Xi
[Figure: prediction line Ŷ = b0 + b1Xi with a narrower confidence interval for the mean of Y, given Xi, and a wider prediction interval for an individual Y, given Xi]
Confidence Interval for the Average Y, Given X
Confidence interval estimate for the mean value of Y given a particular Xi:
Ŷ ± tα/2 · SYX · √hi
where
hi = 1/n + (Xi − X̄)²/SSX = 1/n + (Xi − X̄)²/Σ(Xi − X̄)²
The size of the interval varies according to the distance of Xi from the mean X̄.
Prediction Interval for an Individual Y, Given X
Confidence interval estimate for an individual value of Y given a particular Xi:
Ŷ ± tα/2 · SYX · √(1 + hi)
The extra term adds to the interval width to reflect the added uncertainty for an individual case.
Estimation of Mean Values:
Example
Confidence interval estimate for μY|X=Xi

Find the 95% confidence interval for the mean price of 2,000 square-foot houses.
Predicted price: Ŷ = 317.85 ($1000s)
Ŷ ± t0.025 · SYX · √(1/n + (Xi − X̄)²/Σ(Xi − X̄)²) = 317.85 ± 37.12
The confidence interval endpoints (from Excel) are 280.66 and 354.90, or from $280,660 to $354,900.
Estimation of Individual Values:
Example
Prediction interval estimate for Y|X=Xi

Find the 95% prediction interval for an individual house with 2,000 square feet.
Predicted price: Ŷ = 317.85 ($1000s)
Ŷ ± t0.025 · SYX · √(1 + 1/n + (Xi − X̄)²/Σ(Xi − X̄)²) = 317.85 ± 102.28
The prediction interval endpoints (from Excel) are 215.50 and 420.07, or from $215,500 to $420,070.
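Both intervals can be reproduced from the raw data; this sketch uses t(0.025, 8) = 2.3060 as in the slides, so the endpoints can differ from Excel's in the last digit:

```python
import math

# 95% CI for the mean response and PI for an individual Y at Xi = 2000 sq. ft.
x = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_x = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_yx = math.sqrt(sse / (n - 2))   # standard error of the estimate
t_crit = 2.3060                   # t(0.025, 8)

xi = 2000
y_hat = b0 + b1 * xi
hi = 1 / n + (xi - x_bar) ** 2 / ss_x          # distance-from-mean factor

ci_half = t_crit * s_yx * math.sqrt(hi)        # about 37.12
pi_half = t_crit * s_yx * math.sqrt(1 + hi)    # about 102.28
print(round(y_hat - ci_half, 2), round(y_hat + ci_half, 2))  # near 280.66, 354.90
print(round(y_hat - pi_half, 2), round(y_hat + pi_half, 2))  # near 215.50, 420.07
```

Note that the prediction interval is wider than the confidence interval by exactly the extra 1 under the square root.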
What’s the Difference?
• Prediction interval: used to estimate one value of y (at a given x)
• Confidence interval: used to estimate the mean value of y (at a given x)
The confidence interval estimate of the expected value of y will be narrower than
the prediction interval for the same given value of x and confidence level. This is
because there is less error in estimating a mean value as opposed to predicting an
individual value.
Multiple Linear Regression
• Concept and analogy to Simple Linear Regression
• Equation type and interpretation
• Model fit [DoF=n-k-1] and explanatory power of each independent
variable
• Comparing different models: Adjusted R squared
Adjusted r2
• r2 never decreases when a new X variable is added
to the model
• This can be a disadvantage when comparing models
• What is the net effect of adding a new variable?
• We lose a degree of freedom when a new X variable is
added
• Did the new X variable add enough explanatory power to
offset the loss of one degree of freedom?
Adjusted r2
(continued)
• Shows the proportion of variation in Y explained by all
X variables adjusted for the number of X variables used
r²adj = 1 − (1 − r²) · (n − 1) / (n − k − 1)
(where n = sample size, k = number of independent variables)
• Penalizes excessive use of unimportant independent
variables
• Smaller than r2
• Useful in comparing among models
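For the house-price model (n = 10, k = 1), the adjusted value in the Excel output can be reproduced with a short sketch:

```python
# Adjusted r^2 = 1 - (1 - r^2)(n - 1)/(n - k - 1)
r2, n, k = 0.58082, 10, 1
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(round(r2_adj, 5))  # 0.52842, matching "Adjusted R Square" in Excel
```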
Pitfalls of Regression Analysis
• Lacking an awareness of the assumptions underlying
least-squares regression
• Not knowing how to evaluate the assumptions
• Not knowing the alternatives to least-squares
regression if a particular assumption is violated
• Using a regression model without knowledge of the
subject matter
• Extrapolating outside the relevant range
Strategies for Avoiding the Pitfalls of Regression
• Start with a scatter plot of X vs. Y to observe possible
relationship
• Perform residual analysis to check the assumptions
• Plot the residuals vs. X to check for violations of assumptions
such as homoscedasticity
• Use a histogram, stem-and-leaf display, boxplot, or normal
probability plot of the residuals to uncover possible non-
normality
• If there is violation of any assumption, use alternative
methods or models
• If there is no evidence of assumption violation, then test
for the significance of the regression coefficients and
construct confidence intervals and prediction intervals
• Avoid making predictions or forecasts outside the relevant
range
Chapter Summary
• Making inferences about the slope
• Correlation -- measuring the strength of the association
• The estimation of mean values and prediction of individual values
• Possible pitfalls in regression and recommended strategies to avoid
them
APPENDIX - I
APPENDIX - II