Ch08 - Linear Regression
Slide 2
What is Regression?
• A way of predicting the value of one
variable from another.
– It is a hypothetical model of the relationship
between two variables.
– The model used is a linear one.
– Therefore, we describe the relationship using
the equation of a straight line.
Slide 3
Describing a Straight Line
Yi = b0 + b1Xi + εi
• b1
– Regression coefficient for the predictor
– Gradient (slope) of the regression line
– Direction/Strength of Relationship
• b0
– Intercept (value of Y when X = 0)
– Point at which the regression line crosses the Y-
axis (ordinate)
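As a rough illustration of this equation (not part of the original slides), the sketch below generates data from a straight-line model; the values chosen for b0, b1 and the error term are made up.

```python
import numpy as np

# Minimal sketch of Yi = b0 + b1*Xi + ei with made-up values.
rng = np.random.default_rng(42)

b0, b1 = 50.0, 0.1               # assumed intercept and gradient
X = rng.uniform(0, 1000, 200)    # predictor values
e = rng.normal(0, 65, 200)       # error term
Y = b0 + b1 * X + e              # outcome generated from the linear model
```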
Slide 4
Intercepts and Gradients
The Method of Least Squares
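A minimal sketch of the least-squares estimates for a single predictor, using the standard closed-form formulas; the example data are made up.

```python
import numpy as np

def least_squares_line(x, y):
    """OLS estimates of b0 and b1 for the model y = b0 + b1*x."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    # Slope: covariance of x and y divided by the variance of x
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    # Intercept: the fitted line passes through (mean of x, mean of y)
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

print(least_squares_line([1, 2, 3, 4, 5], [2.1, 4.3, 5.9, 8.2, 10.1]))
```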
Slide 6
How Good is the Model?
• The regression line is only a model
based on the data.
• This model might not reflect reality.
– We need some way of testing how well
the model fits the observed data.
– How?
Slide 7
Sums of Squares
Slide 8
Summary
• SST
– Total variability (variability between scores and the
mean).
• SSR
– Residual/Error variability (variability between the
regression model and the actual data).
• SSM
– Model variability (difference in variability between
the model and the mean).
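A short sketch (not from the slides) of how the three sums of squares could be computed for any outcome y and model predictions y_hat; the data are made up.

```python
import numpy as np

def sums_of_squares(y, y_hat):
    """SST, SSM and SSR for observed values y and model predictions y_hat."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    sst = np.sum((y - y.mean()) ** 2)      # total variability around the mean
    ssm = np.sum((y_hat - y.mean()) ** 2)  # improvement due to the model
    ssr = np.sum((y - y_hat) ** 2)         # residual/error variability
    return sst, ssm, ssr

# Made-up data and a least-squares line fitted to it
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 4.3, 5.9, 8.2, 10.1])
b1, b0 = np.polyfit(x, y, 1)               # slope, intercept
sst, ssm, ssr = sums_of_squares(y, b0 + b1 * x)
print(sst, ssm, ssr)                       # for an OLS fit, SST = SSM + SSR
```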
Slide 9
Testing the Model: ANOVA
SST = SSM + SSR
• SST: total variance in the data
• SSM: improvement due to the model
• SSR: error in the model
Slide 10
Testing the Model: ANOVA
• Mean Squared Error
– Sums of Squares are total values.
– They can be expressed as averages.
– These are called Mean Squares, MS
F = MSM / MSR (the model mean square divided by the residual mean square)
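A sketch of the mean squares and F-ratio, assuming n cases and k predictors (so the residual degrees of freedom are n − k − 1); the numbers passed in are illustrative only.

```python
def f_ratio(ssm, ssr, n, k):
    """F = MSM / MSR for a model with k predictors fitted to n cases."""
    ms_m = ssm / k              # model mean square (df = k)
    ms_r = ssr / (n - k - 1)    # residual mean square (df = n - k - 1)
    return ms_m / ms_r

# Illustrative values only
print(f_ratio(ssm=120.0, ssr=80.0, n=50, k=1))
```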
Slide 11
Testing the Model: R2
• R2
– The proportion of variance accounted for by
the regression model.
– The Pearson Correlation Coefficient Squared
R2 = SSM / SST
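A small check, with made-up data, that R2 = SSM / SST equals the squared Pearson correlation for a single-predictor model.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([2.0, 4.1, 5.8, 8.3, 9.9, 12.2])

b1, b0 = np.polyfit(x, y, 1)     # least-squares slope and intercept
y_hat = b0 + b1 * x

ss_t = np.sum((y - y.mean()) ** 2)
ss_m = np.sum((y_hat - y.mean()) ** 2)

print(ss_m / ss_t)                    # R^2 = SSM / SST
print(np.corrcoef(x, y)[0, 1] ** 2)   # squared Pearson r -- the same value
```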
Slide 12
Outliers and residuals
• An outlier is a case that differs
substantially from the main trend of the
data
• The green line shows the original model,
and the red line shows the model with the
outlier included. The outlier has a
dramatic effect on the regression model:
the line becomes flatter (i.e., b1 is smaller)
and the intercept increases (i.e., b0 is
larger)
• Examine residuals to look for outliers
• These residuals represent the error present
in the model. If a model fits the sample
data well then all residuals will be small.
Also, if any cases stand out as having a
large residual, then they could be outliers.
• The unstandardized residuals described above are measured in the same units as the outcome variable, so they are difficult to interpret across different models.
• We cannot define a universal cut-off point for what constitutes a large residual.
• The solution is to use standardized residuals: the residuals converted to z-scores.
• 1.96 cuts off the top 2.5% of the distribution.
• −1.96 cuts off the bottom 2.5% of the
distribution.
• As such, 95% of z-scores lie between −1.96 and
1.96.
• 99% of z-scores lie between −2.58 and 2.58,
• 99.9% of them lie between −3.29 and 3.29.
Standardized Residuals
• In an average sample, 95% of standardized residuals should lie between ±2.
• 99% of standardized residuals should lie between ±2.5.
• Outliers
– Any case for which the absolute value of
the standardized residual is 3 or more, is
likely to be an outlier.
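A rough sketch of these rules of thumb. SPSS computes standardized residuals by dividing each residual by the residual standard error; dividing by the residuals' standard deviation, as below, is only an approximation, and the residual values are made up.

```python
import numpy as np

def standardized_residuals(residuals):
    """Approximate standardized residuals: raw residuals divided by their SD."""
    residuals = np.asarray(residuals, dtype=float)
    return residuals / residuals.std(ddof=1)

res = np.array([1.2, -0.8, 2.5, -1.6, 0.3, -1.1, 0.9, -1.4])   # made-up residuals
z = standardized_residuals(res)

print(np.where(np.abs(z) >= 3)[0])   # indices of cases likely to be outliers
print(np.mean(np.abs(z) < 2))        # proportion within +/-2 (about .95 is expected)
```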
Slide 18
Influential cases
• Look at whether certain cases exert undue influence over the parameters of the model.
• If we were to delete a certain case, would we obtain different regression coefficients?
• This type of analysis can help to determine
whether the regression model is stable across the
sample, or whether it is biased by a few
influential cases. Again, this process will unveil
outliers.
• The adjusted predicted value is the predicted value for a case when that case is excluded from the analysis. In effect, SPSS calculates a new model without that case and then uses this new model to predict the value of the outcome variable for the excluded case.
• We can also look at the residual based on the
adjusted predicted value: that is, the difference
between the adjusted predicted value and the
original observed value. This is the
deleted residual. The deleted residual can be
divided by the standard error to give a
standardized value known as the Studentized
deleted residual. This residual can be compared
across different regression analyses because it is
measured in standard units.
• The deleted residuals are very useful to assess
the influence of a case on the ability of the
model to predict that case.
• However, they do not provide any information
about how a case influences the model as a
whole
• Use Cook's distance: a measure of the overall influence of a case on the model; values greater than 1 may be cause for concern.
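A hedged sketch of how studentized deleted residuals and Cook's distance could be obtained outside SPSS, using statsmodels; the data and the planted influential case are illustrative only.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(0, 100, 30)
y = 5 + 0.5 * x + rng.normal(0, 3, 30)
y[0] += 40                                   # plant one potentially influential case

fit = sm.OLS(y, sm.add_constant(x)).fit()
influence = fit.get_influence()

student_del = influence.resid_studentized_external   # Studentized deleted residuals
cooks_d, _ = influence.cooks_distance                # Cook's distance for each case

print(np.where(np.abs(student_del) >= 3)[0])   # cases with large deleted residuals
print(np.where(cooks_d > 1)[0])                # Cook's distance above the cut-off of 1
```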
• Run the regression analysis with a case
included and then rerun the analysis with
that same case excluded. If we did this,
undoubtedly there would be some
difference between the b coefficients in
the two regression equations. This
difference would tell us how much
influence a particular case has on the
parameters of the regression model.
• The difference between a parameter estimated using all cases and the same parameter estimated when one case is excluded is known as the DFBeta.
• Again, the units of measurement will affect these values, so SPSS also produces a standardized DFBeta.
• A related statistic is the DFFit, which is
the difference between the predicted value
for a case when the model is calculated
including that case and when the model is
calculated excluding that case: in this
example the value is −1.90
• We have the same problem with units. Therefore, SPSS also produces standardized versions of the DFFit values (Standardized DFFit).
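Similarly, standardized DFBeta and DFFit values can be sketched with statsmodels; the data here are made up and the code is only an illustration of the idea.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(0, 100, 30)
y = 5 + 0.5 * x + rng.normal(0, 3, 30)

fit = sm.OLS(y, sm.add_constant(x)).fit()
influence = fit.get_influence()

dfbetas = influence.dfbetas          # standardized DFBeta: one column per parameter (b0, b1)
dffits, cutoff = influence.dffits    # standardized DFFit and a suggested cut-off

print(np.abs(dfbetas).max(axis=0))          # largest standardized shift in each coefficient
print(np.where(np.abs(dffits) > cutoff)[0]) # cases exceeding the DFFit cut-off
```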
Regression: An Example
• A record company boss was interested in
predicting album sales from advertising.
• Data
– 200 different album releases
• Outcome variable:
– Sales (CDs and Downloads) in the week after
release
• Predictor variable:
– The amount (in £s) spent promoting the album
before release.
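A sketch of how this regression could be run in Python rather than SPSS. The file name 'album_sales.csv' and the column names 'adverts' and 'sales' are assumptions for illustration, not the original dataset.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical file with one row per album release
album = pd.read_csv("album_sales.csv")

# Predict sales in the week after release from the promotion budget
fit = smf.ols("sales ~ adverts", data=album).fit()
print(fit.summary())   # reports R-squared, the ANOVA F-test and the b coefficients
```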
Step One: Graph the Data
Slide 28
Regression Using IBM SPSS
Slide 29
Output: Model Summary
Model Summary
Model 1: R = .578, R Square = .335, Adjusted R Square = .331, Std. Error of the Estimate = 65.9914
Predictors: (Constant), Advertising Budget (thousands of pounds)
Slide 30
Output: ANOVA
Slide 31
SPSS Output: Model Parameters
Slide 32
t = (b_observed − b_expected) / SE_b, where b_expected is the value of b expected under the null hypothesis (i.e., zero).
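A tiny worked example of this t-test with purely illustrative numbers (not the values from the album-sales output).

```python
# t = (b_observed - b_expected) / SE_b, with b_expected = 0 under the null hypothesis
b_observed = 0.40    # illustrative slope estimate
se_b = 0.08          # illustrative standard error
t = (b_observed - 0) / se_b
print(t)             # compare with a t distribution on n - k - 1 degrees of freedom
```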
Using The Model
Slide 34
• The sample size required to test the overall regression model depends on the number of predictors and the size of the expected effect: R2 = .02 (small), .13 (medium) and .26 (large). (The equivalent Cohen's f2 values are sketched below.)
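These R2 benchmarks correspond to Cohen's f2 effect sizes via f2 = R2 / (1 − R2); a quick check:

```python
# Convert the quoted R^2 benchmarks to Cohen's f^2 effect sizes
for label, r2 in [("small", 0.02), ("medium", 0.13), ("large", 0.26)]:
    f2 = r2 / (1 - r2)
    print(f"{label}: f2 = {f2:.3f}")
```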