Bivariate Regression Analysis

The beginning of many types of regression
TOPICS
• Beyond Correlation
• Forecasting
• Two points to estimate the slope
• Meeting the BLUE criterion
• The OLS method
Purpose of Regression Analysis
• Test causal hypotheses

• Make predictions from samples of data

• Derive a rate of change between variables

• Allow for multivariate analysis


Goal of Regression
• Draw a regression line through a sample of data so that it best fits the data.

• This regression line provides a value for how much a given X variable, on average, affects changes in the Y variable.

• The value of this relationship can be used for prediction and to test hypotheses, and it provides some support for causality.
Perfect relationship between Y and X: X causes all change in Y

Y = a + bX

Where a = the constant, alpha, or intercept (the value of Y when X = 0), and b = the slope, or beta, the coefficient on X.

Imperfect relationship between Y and X

Y = a + bX + e

Where e = the stochastic term, or error of estimation, which captures everything else that affects change in Y not captured by X.
The Intercept
• The intercept estimate (constant) is where the regression line intercepts the Y axis, which is where X equals zero.

• In a multivariate equation (2+ X variables) the intercept is where all X variables equal zero.
The Intercept

a = Ȳ − bX̄
The intercept operates as a baseline for the
estimation of the equation.
The Slope
• The slope estimate equals the average
change in Y associated with a unit change
in X.

• This slope will not be a perfect estimate unless Y is a perfect function of X. If it were perfect, we would always know the exact value of Y if we knew X.
The Least Squares Concept
• We draw our regression lines so that the errors of our estimates are minimized. When the OLS assumptions are met, we say the estimates are BLUE.

• BLUE stands for Best Linear Unbiased Estimate. So an important assumption of the Ordinary Least Squares model (basic regression) is that the relationship between the X variables and Y is linear.
Do you have the BLUES?
The BLUE criterion
• B for Best (Minimum error)
• L for Linear (The form of the relationship)
• U for Unbiased (does the parameter truly reflect the effect?)
• E for Estimator
The Least Squares Concept
• Accuracy of estimation is gained by reducing prediction error, which occurs when observed values of Y do not fall directly on the regression line.

• Prediction error = observed − predicted, or

eᵢ = Yᵢ − Ŷᵢ
[Two scatter plots: one fitted line that is NOT BLUE, and one that is BLUE]
Ordinary Least Squares (OLS)
• OLS is the technique used to estimate a line that minimizes the error: the difference between the predicted and the actual values of Y.

e = Y − Ŷ
OLS
• Equation for a population

Y = α + βX + ε

• Equation for a sample

Y = a + bX + e
The Least Squares Concept
• The goal is to minimize the error in the prediction of Y. This means summing the errors of each prediction, or more appropriately the Sum of the Squares of the Errors.

SSE = Σ(Yᵢ − Ŷᵢ)²
The Least Squares and b coefficient
• The sum of the squares is “least” when

b = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)²

• and

a = Ȳ − bX̄

Knowing the intercept and the slope, we can predict values of Y given X.
Calculating the slope & intercept

b = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)²

a = Ȳ − bX̄
Step by step
1. Calculate the means of Y and X: Ȳ, X̄

2. Calculate the deviations of X and Y: Xᵢ − X̄, Yᵢ − Ȳ

3. Get the product (multiply): (Xᵢ − X̄)(Yᵢ − Ȳ)

4. Sum the products: Σ(Xᵢ − X̄)(Yᵢ − Ȳ)

Step by step
5. Square the differences of X: (Xᵢ − X̄)²

6. Sum the squared differences: Σ(Xᵢ − X̄)²

7. Divide (step 4 / step 6): b = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)²

8. Calculate a: a = Ȳ − bX̄ (see the code sketch below)
An Example: Choosing two points
Y (log value)    X (log sqft)
5.13             4.02
5.20             4.54
4.53             3.53
4.79             3.80
4.78             3.86
4.72             4.17
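As a check, running the sketch above on these six observations reproduces the coefficients in the SPSS output shown later in the deck (constant ≈ 2.565, B ≈ .575):

log_sqft = [4.02, 4.54, 3.53, 3.80, 3.86, 4.17]
log_value = [5.13, 5.20, 4.53, 4.79, 4.78, 4.72]
a, b = ols_slope_intercept(log_sqft, log_value)
print(round(a, 3), round(b, 3))  # 2.565 0.575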
Forecasting Home Values
[Scatter plot of LOG_VALU (4.5 to 5.2) against LOG_SQFT (3.4 to 4.6), with a linear fit line and two points labeled 1 and 2]
Forecasting Home Values
[Same scatter plot; using the two labeled points, the slope is estimated as]

slope = (Y₂ − Y₁) / (X₂ − X₁) = (5.2 − 4.5) / (4.54 − 3.53) ≈ .69
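For comparison, a quick sketch of the same two-point calculation (values read off the plot, as above). Note that this rough estimate (≈ .69) differs from the OLS slope (≈ .575), which uses all six observations rather than just two:

y1, y2 = 4.5, 5.2    # Y values of the two chosen points
x1, x2 = 3.53, 4.54  # X values of the two chosen points
print(round((y2 - y1) / (x2 - x1), 2))  # 0.69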
SPSS OUTPUT
Coefficients(a)

Model          B       Std. Error   Beta    t       Sig.
1  (Constant)  2.565   .929                 2.761   .051
   X           .575    .232         .778    2.476   .068

a. Dependent Variable: Y

• The coefficient B is the marginal impact of X on Y (the derivative).

• In other words, for a one-unit change in X, Y changes by .575.
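To use these estimates for prediction (an illustrative calculation, not from the slides): for a home with log sqft = 4.0, the fitted line gives Ŷ = 2.565 + .575 × 4.0 = 4.865, a predicted log value of about 4.87.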
Stochastic Term
• The stochastic error term measures the residual variance in Y not covered by X.

• This is akin to saying there is measurement error and our predictions/models will not be perfect.

• The more X variables we add to a model, the lower the error of estimation.
Interpreting a Regression
Coefficients(a)

Model          B         Std. Error   Beta    t         Sig.    95% Confidence Interval for B
1  (Constant)  797.952   45.360               17.592    .000    [708.478, 887.425]
   UNEMP       -69.856   6.500        -.615   -10.747   .000    [-82.678, -57.034]

a. Dependent Variable: STOCKS

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .615    .378       .375                122.85545

a. Predictors: (Constant), UNEMP
Interpreting a Regression

• The prior table shows that with an increase in unemployment of one unit (probably measured as a percent), the S&P 500 stock market index goes down about 70 points (B = −69.856), and this is statistically significant.

• Model fit: 37.8% of the variability in Stocks is predicted by changes in unemployment figures.
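As a worked illustration (the unemployment figure here is hypothetical): at unemployment = 5, the fitted line predicts STOCKS = 797.952 − 69.856 × 5 ≈ 448.7.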
Interpreting a Regression 2

• What can we say about this relationship regarding the effect of X on Y?

• How strongly is X related to Y?

• How good is the model fit?

Model Fit: Coefficient of
Determination
• R squared (R²) is a measure of model fit.

• What amount of variance in Y is explained by the X variable?

• What amount of variability in Y is not explained by the X variable(s)?

• In the bivariate case, R² = r².

This measure is based on the degree to which the observed points fall on the regression line. The higher the error from the line, the lower the R square (scale between 0 and 1).

Σ(Yᵢ − Ȳ)² = total sum of squared deviations (TSS)

Σ(Ŷᵢ − Ȳ)² = regression (explained) sum of squared deviations (RSS)

Σ(Yᵢ − Ŷᵢ)² = error (unexplained) sum of squared deviations (ESS)

TSS = RSS + ESS

Where R² = RSS/TSS
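Continuing the Python sketch with the home-value data (variable names are illustrative), the decomposition can be computed directly. On the six observations it gives R² ≈ .605, consistent with the standardized Beta of .778 reported earlier (.778² ≈ .605):

a, b = ols_slope_intercept(log_sqft, log_value)
y_bar = sum(log_value) / len(log_value)
y_hat = [a + b * x for x in log_sqft]                          # fitted values
tss = sum((y - y_bar) ** 2 for y in log_value)                 # total
rss = sum((yh - y_bar) ** 2 for yh in y_hat)                   # explained (regression)
ess = sum((y - yh) ** 2 for y, yh in zip(log_value, y_hat))    # unexplained (error)
print(round(rss / tss, 3))  # R squared: 0.605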
Interpreting a Regression 2
Coefficients(a)

Model          B           Std. Error   Beta    t        Sig.
1  (Constant)  3.057       .041                 74.071   .000
   UPOP        4.176E-05   .000         .133    13.193   .000

a. Dependent Variable: DEMOC

Correlations (N = 9622; Sig. 1-tailed = .000)

Pearson Correlation   DEMOC   UPOP
DEMOC                 1.000   .133
UPOP                  .133    1.000

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .133    .018       .018                3.86927

a. Predictors: (Constant), UPOP
Interpreting a Regression 2
• The correlation between X and Y is weak (.133). This is reflected in the bivariate correlation coefficient but also picked up in the model fit of .018. What does this mean?

• However, there appears to be a causal relationship where urban population increases democracy, and this is a highly significant statistical relationship (sig. = .000 at the .05 level).
Interpreting a Regression 2
• Yet the coefficient 4.176E-05 means that a unit increase in urban population increases democracy by .00004176, which is tiny.

• This model teaches us a lesson: we need to pay attention both to statistical significance and to matters of substance. In the broader picture, urban population has a rather minimal effect on democracy.
The Inference Made
• As with some of our earlier models, when
we interpret the results regarding the
relationship between X and Y, we are often
making an inference based on a sample
drawn from a population. The regression
equation for the population uses different
notation:

Yi = α + βXi + εi
OLS Assumptions
1. No specification error
a) Linear relationship between X and Y
b) No relevant X variables excluded
c) No irrelevant X variables included

2. No Measurement Error
• (self-evident I hope, otherwise what would we
be modeling?)
OLS Assumptions
3. On the error term:
a. Zero mean: E(εᵢ) = 0, meaning we expect that for each observation the error equals zero (illustrated in the sketch below).
b. Homoskedasticity: the variance of the error term is constant for all values of Xᵢ.
c. No autocorrelation: the error terms are uncorrelated with each other.
d. The X variable is uncorrelated with the error term.
e. The error term is normally distributed.
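Assumption 3a is easy to illustrate: whenever an intercept is included, OLS residuals average to zero by construction. Continuing the home-value sketch:

a, b = ols_slope_intercept(log_sqft, log_value)
residuals = [y - (a + b * x) for x, y in zip(log_sqft, log_value)]
print(sum(residuals) / len(residuals))  # ~0, up to floating-point error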
OLS Assumptions
• Some of these assumptions are complex and are issues for a second-level course (autocorrelation, heteroskedasticity).

• What matters here is that when assumptions 1 and 3 are met, our regression model is BLUE. The first assumption relates to proper model specification. When aspects of assumption 3 are violated, we will likely need a method of estimation other than OLS.
