Linear Regression
Correlation vs. Regression
A scatter plot can be used to show the
relationship between two variables
Correlation analysis is used to measure the
strength of the association (linear relationship)
between two variables
Correlation is only concerned with strength of the
relationship
No causal effect is implied with correlation
Introduction to
Regression Analysis
Regression analysis is used to:
Predict the value of a dependent variable based on
the value of at least one independent variable
Explain the impact of changes in an independent
variable on the dependent variable
Dependent variable: the variable we wish to
predict or explain
Independent variable: the variable used to predict
or explain the
dependent variable
Simple Linear Regression
Model
Only one independent variable, X
Relationship between X and Y is
described by a linear function
Changes in Y are assumed to be related
to changes in X
Types of Relationships
[Scatter plots of Y versus X illustrating different forms of relationship]
Types of Relationships
(continued)
Strong relationships vs. weak relationships
[Scatter plots of Y versus X illustrating strong and weak relationships]
Types of Relationships
(continued)
No relationship
[Scatter plot of Y versus X showing no relationship]
Simple Linear Regression
Model
Y_i = β0 + β1 X_i + ε_i

where
  Y_i = dependent variable
  X_i = independent variable
  β0 = population Y intercept
  β1 = population slope coefficient
  ε_i = random error term

β0 + β1 X_i is the linear component; ε_i is the random error component.
Simple Linear Regression
Model
(continued)
Y_i = β0 + β1 X_i + ε_i

[Graph of the population regression line against X, showing the intercept β0, the slope β1, the predicted value of Y for X_i on the line, the observed value of Y for X_i, and the random error ε_i as the vertical distance between the observed and predicted values]
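As an illustration of the model Y_i = β0 + β1 X_i + ε_i (not part of the original slides), the following Python sketch simulates data from it; the parameter values, sample size, and error standard deviation are all hypothetical.

```python
# Minimal sketch: simulate data from Y_i = beta0 + beta1*X_i + eps_i.
# All numbers below are illustrative assumptions, not values from the slides.
import numpy as np

rng = np.random.default_rng(0)

beta0, beta1 = 100.0, 0.11            # hypothetical population intercept and slope
X = rng.uniform(1000, 2500, size=50)  # hypothetical independent variable values
eps = rng.normal(0, 40, size=50)      # random error term (mean 0, constant variance)
Y = beta0 + beta1 * X + eps           # linear component + random error component
```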
Simple Linear Regression
Equation (Prediction Line)
The simple linear regression equation provides an
estimate of the population regression line
Ŷ_i = b0 + b1 X_i

where
  Ŷ_i = estimated (or predicted) Y value for observation i
  X_i = value of X for observation i
  b0 = estimate of the regression intercept
  b1 = estimate of the regression slope
The Least Squares Method
b0 and b1 are obtained by finding the values that minimize the sum of the squared differences between the observed values Y and the predicted values Ŷ.

[Scatter plot of house price ($1000s) versus square feet with the fitted least-squares line]
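The slides do not show the least-squares computation itself; as a sketch, the b0 and b1 that minimize the sum of squared residuals can be computed with the standard closed-form formulas (the function name here is illustrative).

```python
import numpy as np

def least_squares_fit(x, y):
    """Return (b0, b1) that minimize sum((y - (b0 + b1*x))**2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # b1 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2), b0 = ybar - b1*xbar
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    return b0, b1
```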
Simple Linear Regression Example:
Using Excel Data Analysis Function
ANOVA
            df   SS           MS           F         Significance F
Regression   1   18934.9348   18934.9348   11.0848   0.01039
Residual     8   13665.5652    1708.1957
Total        9   32600.5000
[Scatter plot of house price ($1000s) versus square feet with the fitted regression line: intercept = 98.248, slope = 0.10977]

Predict the price for a house with 2000 square feet:

house price = 98.25 + 0.1098 (2000) = 317.85

The predicted price for a house with 2000 square feet is 317.85 ($1,000s) = $317,850
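The prediction above can be reproduced with a few lines of Python (a sketch using the rounded coefficients from the slide):

```python
b0, b1 = 98.25, 0.1098           # rounded intercept and slope from the fitted model
sqft = 2000
predicted_price = b0 + b1 * sqft  # price in $1,000s
print(predicted_price)            # 317.85, i.e. $317,850
```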
Simple Linear Regression
Example: Making Predictions
When using a regression model for prediction,
only predict within the relevant range of data
The relevant range for interpolation is the range of the observed X values. Do not try to extrapolate beyond the range of observed X's.

[Scatter plot of house price ($1000s) versus square feet, with the relevant range of the observed data indicated]
Measures of Variation
Total variation is made up of two parts: SST = SSR + SSE

SST = Σ(Y_i − Ȳ)²    (total sum of squares)
SSR = Σ(Ŷ_i − Ȳ)²    (regression sum of squares)
SSE = Σ(Y_i − Ŷ_i)²   (error sum of squares)

[Graph showing, for a single observation (X_i, Y_i), how the deviations from Ȳ and from the regression line make up SST, SSR, and SSE]
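A small sketch (not from the slides) computing the three sums of squares from observed values y and fitted values y_hat; the function name is illustrative.

```python
import numpy as np

def variation_measures(y, y_hat):
    """Return (SST, SSR, SSE) for observed y and predicted y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
    ssr = np.sum((y_hat - y.mean()) ** 2)  # regression sum of squares
    sse = np.sum((y - y_hat) ** 2)         # error sum of squares
    return sst, ssr, sse                   # SST = SSR + SSE
```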
Coefficient of Determination, r2
The coefficient of determination is the portion
of the total variation in the dependent variable
that is explained by variation in the
independent variable
The coefficient of determination is also called
r-squared and is denoted as r2
r² = SSR / SST = regression sum of squares / total sum of squares

note: 0 ≤ r² ≤ 1
Examples of Approximate
r2 Values
r² = 1: perfect linear relationship between X and Y; 100% of the variation in Y is explained by variation in X.

[Scatter plots in which every point falls exactly on the regression line]
Examples of Approximate
r2 Values
0 < r² < 1: weaker linear relationship between X and Y; some, but not all, of the variation in Y is explained by variation in X.

[Scatter plots in which the points are scattered around the regression line]
Examples of Approximate
r2 Values
r² = 0: no linear relationship between X and Y; none of the variation in Y is explained by variation in X.

[Scatter plot with a horizontal regression line]
Simple Linear Regression Example:
Coefficient of Determination in Excel

r² = SSR / SST = 18934.9348 / 32600.5000 = 0.58082

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error      41.33032
Observations        10

58.08% of the variation in house prices is explained by variation in square feet.
ANOVA
            df   SS           MS           F         Significance F
Regression   1   18934.9348   18934.9348   11.0848   0.01039
Residual     8   13665.5652    1708.1957
Total        9   32600.5000
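The r² value reported by Excel can be checked directly from the ANOVA sums of squares (a one-line sketch):

```python
ssr, sst = 18934.9348, 32600.5000  # from the ANOVA table above
r_squared = ssr / sst
print(round(r_squared, 5))         # 0.58082
```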
Standard Error of Estimate

The standard deviation of the variation of observations around the regression line is estimated by

S_YX = sqrt( SSE / (n − 2) ) = sqrt( Σ(Y_i − Ŷ_i)² / (n − 2) )

where
  SSE = error sum of squares
  n = sample size
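Using the SSE and sample size from the house-price example, the standard error of the estimate can be verified with a quick sketch:

```python
import math

sse, n = 13665.5652, 10    # SSE and sample size from the example
s_yx = math.sqrt(sse / (n - 2))
print(round(s_yx, 5))      # 41.33032, matching the Excel output
```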
Simple Linear Regression Example:
Standard Error of Estimate in Excel
Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error      41.33032
Observations        10

S_YX = 41.33032
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Assumptions of Regression
Linearity
The relationship between X and Y is linear
Independence of Errors
Error values are statistically independent
Normality of Error
Error values are normally distributed for any given
value of X
Equal Variance (also called homoscedasticity)
The probability distribution of the errors has constant
variance
Residual Analysis
e_i = Y_i − Ŷ_i
The residual for observation i, ei, is the difference
between its observed and predicted value
Check the assumptions of regression by examining the
residuals
Examine for linearity assumption
Evaluate independence assumption
Evaluate normal distribution assumption
Examine for constant variance for all levels of X
(homoscedasticity)
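As a sketch of the graphical check described above (assuming matplotlib is available; the function name is illustrative), the residuals e_i = Y_i − Ŷ_i can be plotted against X:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_residuals_vs_x(x, y, b0, b1):
    """Plot residuals e_i = y_i - (b0 + b1*x_i) against x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    residuals = y - (b0 + b1 * x)
    plt.scatter(x, residuals)
    plt.axhline(0, linestyle="--")  # residuals should scatter randomly around 0
    plt.xlabel("X")
    plt.ylabel("Residual")
    plt.show()
```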
Graphical Analysis of Residuals
Can plot residuals vs. X
Residual Analysis for Linearity
[Two pairs of plots: a curved Y-versus-X pattern whose residuals show a systematic curved pattern (Not Linear), and a straight-line Y-versus-X pattern whose residuals scatter randomly around zero (Linear)]
Residual Analysis for
Independence
[Residual-versus-X plots: residuals that follow a systematic pattern indicate errors that are Not Independent; residuals that scatter randomly indicate Independent errors]
Checking for Normality
[Normal probability plot of the residuals: percent versus residual]
Checking Homoscedasticity:
Residual Analysis for
Equal (Constant) Variance
[Residual-versus-X plots: residual spread that changes with X indicates non-constant variance; uniform spread indicates constant variance]
Simple Linear Regression
Example: Excel Residual Output
[Plot of the residuals versus time (t)]

Here, the residuals show a cyclic pattern, not random. Cyclical patterns are a sign of positive autocorrelation.
The Durbin-Watson Statistic

D = Σ_{i=2}^{n} (e_i − e_{i−1})² / Σ_{i=1}^{n} e_i²

D should be close to 2 if no autocorrelation is present; a value far from this could be a cause for concern:
D less than 2 may signal positive autocorrelation,
D greater than 2 may signal negative autocorrelation.
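The slides compute D with Excel/PHStat or SPSS; a minimal Python sketch of the same formula (function name illustrative) would be:

```python
import numpy as np

def durbin_watson(residuals):
    """D = sum_{i=2..n} (e_i - e_{i-1})^2 / sum_{i=1..n} e_i^2."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
```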
Testing for Positive
Autocorrelation
H0: positive autocorrelation does not exist
H1: positive autocorrelation is present
Calculate the Durbin-Watson test statistic = D
(The Durbin-Watson Statistic can be found using Excel or SPSS)
Decision rule, using the lower and upper critical values d_L and d_U:
  If D < d_L, reject H0 (positive autocorrelation is present).
  If d_L ≤ D ≤ d_U, the test is inconclusive.
  If D > d_U, do not reject H0.
Testing for Positive
Autocorrelation (continued)
[Scatter plot of Sales versus Time with the fitted line y = 30.65 + 4.7038x, R² = 0.8976]

Is there autocorrelation?
Testing for Positive
Autocorrelation (continued)

Example with n = 25. Excel/PHStat output:

Durbin-Watson Calculations
Sum of Squared Difference of Residuals   3296.18
Sum of Squared Residuals                 3279.98
Durbin-Watson Statistic                  1.00494

[Scatter plot of Sales versus Time with the fitted line y = 30.65 + 4.7038x, R² = 0.8976]

D = Σ_{i=2}^{n} (e_i − e_{i−1})² / Σ_{i=1}^{n} e_i² = 3296.18 / 3279.98 = 1.00494
Inferences About the Slope:
Standard Error of the Slope

S_b1 = S_YX / sqrt(SSX) = S_YX / sqrt( Σ(X_i − X̄)² )

where:
  S_b1 = estimate of the standard error of the slope
  S_YX = sqrt( SSE / (n − 2) ) = standard error of the estimate
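A short sketch of the same formula (function name illustrative), taking the X values and the standard error of the estimate S_YX:

```python
import numpy as np

def slope_standard_error(x, s_yx):
    """S_b1 = S_YX / sqrt(sum((x_i - xbar)^2))."""
    x = np.asarray(x, dtype=float)
    ssx = np.sum((x - x.mean()) ** 2)
    return s_yx / np.sqrt(ssx)
```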
Inferences About the Slope:
t Test
Estimated Regression Equation:

house price = 98.25 + 0.1098 (sq. ft.)

The slope of this model is 0.1098.

Is there a relationship between the square footage of the house and its sales price?

House Price in $1000s (Y)    Square Feet (X)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700
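As a sketch (assuming SciPy is installed), fitting the ten observations above should reproduce the slide's coefficients and the p-value used later in the t test:

```python
from scipy import stats

price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]       # $1000s
sqft  = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

result = stats.linregress(sqft, price)
print(result.intercept, result.slope)  # approximately 98.248 and 0.10977
print(result.pvalue)                   # approximately 0.01039
```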
Inferences About the Slope:
t Test Example
H0: β1 = 0
H1: β1 ≠ 0

From Excel output:

              Coefficients   Standard Error   t Stat    P-value
Intercept     98.24833       58.03348         1.69296   0.12892
Square Feet    0.10977        0.03297         3.32938   0.01039

t_STAT = (b1 − β1) / S_b1 = (0.10977 − 0) / 0.03297 = 3.32938
Inferences About the Slope:
t Test Example
H0: β1 = 0
H1: β1 ≠ 0

Test statistic: t_STAT = 3.329
d.f. = 10 − 2 = 8
α/2 = 0.025 in each tail
p-value = 0.01039

Decision: Reject H0, since p-value < α.

There is sufficient evidence that square footage affects house price.
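The p-value quoted above can be reproduced from the t statistic (a sketch assuming SciPy is available):

```python
from scipy import stats

b1, s_b1, n = 0.10977, 0.03297, 10
t_stat = (b1 - 0) / s_b1                         # 3.329
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)  # two-tailed, d.f. = 8
print(round(t_stat, 3), round(p_value, 5))       # approximately 3.329 and 0.01039
```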
F Test for Significance
F_STAT = MSR / MSE

where
  MSR = SSR / k
  MSE = SSE / (n − k − 1)
  k = number of independent variables in the regression model

F_STAT follows an F distribution with k numerator and (n − k − 1) denominator degrees of freedom.
Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error      41.33032
Observations        10

ANOVA
            df   SS           MS           F         Significance F
Regression   1   18934.9348   18934.9348   11.0848   0.01039
Residual     8   13665.5652    1708.1957
Total        9   32600.5000

F_STAT = MSR / MSE = 18934.9348 / 1708.1957 = 11.0848

With 1 and 8 degrees of freedom, the p-value for the F test is 0.01039 (Significance F).
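The F statistic and its p-value (Significance F) can be checked from MSR and MSE (a sketch assuming SciPy is available):

```python
from scipy import stats

msr, mse = 18934.9348, 1708.1957
f_stat = msr / mse                          # approximately 11.0848
p_value = stats.f.sf(f_stat, 1, 8)          # 1 and 8 degrees of freedom
print(round(f_stat, 4), round(p_value, 5))  # approximately 11.0848 and 0.01039
```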
Variable Selection Methods in SPSS
1. Enter (Regression):
Enters all variables in a single step
2. Stepwise:
At each step, the independent variable not in the equation that has the
smallest probability of F is entered, if that probability is sufficiently small.
Variables already in the regression equation are removed if their
probability of F becomes sufficiently large. The method terminates when
no more variables are eligible for inclusion or removal
(continued)
3. Remove
A procedure for variable selection in which all variables in a block are
removed in a single step
4. Backward Elimination
o All variables are entered into the equation and then sequentially
removed.
o The procedure stops when there are no variables that meet the removal
criterion
5. Forward Selection
o A stepwise variable selection procedure in which variables are
   sequentially entered into the model
o The procedure stops when there are no variables that meet the entry
   criterion
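As an illustration of forward selection, here is a simplified p-value-based sketch; it is not the exact SPSS procedure, it assumes statsmodels and pandas are installed, and the 0.05 entry threshold is an illustrative default.

```python
import pandas as pd
import statsmodels.api as sm

def forward_selection(X: pd.DataFrame, y, alpha_in=0.05):
    """Sequentially enter the candidate variable with the smallest p-value,
    stopping when no remaining variable meets the entry criterion."""
    selected, remaining = [], list(X.columns)
    while remaining:
        pvals = {}
        for var in remaining:
            model = sm.OLS(y, sm.add_constant(X[selected + [var]])).fit()
            pvals[var] = model.pvalues[var]
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha_in:   # no variable meets the entry criterion
            break
        selected.append(best)
        remaining.remove(best)
    return selected
```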