Chapter-23 Bivariate Statistical Analysis: Measurement of Association

LEARNING OUTCOMES
After studying this chapter, you should be able to:
1. Apply and interpret simple bivariate correlations
2. Interpret a correlation matrix
3. Understand simple (bivariate) regression
4. Understand the least-squares estimation technique
5. Interpret regression output including the tests of
hypotheses tied to specific parameter coefficients

© 2010 Cengage Learning. All rights reserved. May not be scanned,
copied or duplicated, or posted to a publicly accessible website, in
whole or in part.
The Basics
• Measures of Association
 Refers to a number of bivariate statistical techniques
used to measure the strength of a relationship
between two variables.
 The chi-square (χ²) test provides information about
whether two or more less-than interval variables are
interrelated.
 Correlation analysis is most appropriate for interval or
ratio variables.
 Regression can accommodate either less-than
interval or interval independent variables, but the
dependent variable must be continuous.
EXHIBIT 23.1 Bivariate Analysis: Common Procedures for Testing Association

Simple Correlation Coefficient (continued)
• Correlation coefficient
 A statistical measure of the covariation, or
association, between two at-least interval variables.
• Covariance
 Extent to which two variables are associated
systematically with each other.

$$
r_{xy} = r_{yx} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}
$$
Simple Correlation Coefficient
• Correlation coefficient (r)
 Ranges from +1 to -1
 Perfect positive linear relationship = +1
 Perfect negative (inverse) linear relationship = -1
 No correlation = 0

• Correlation coefficient for two variables (X,Y)
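The coefficient for two variables can be computed directly from the deviation-score formula for r given earlier in the chapter. The following is a minimal Python sketch, not part of the original slides; the function name `pearson_r` and the sample values are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator terms: sum of squared deviations for each variable.
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))  # 0.7746
```

Swapping the arguments gives the same value, which is why the slides write r_xy = r_yx.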

EXHIBIT 23.2 Scatter Diagram to Illustrate Correlation Patterns

Correlation, Covariance, and Causation
• When two variables covary, they display
concomitant variation.
• This systematic covariation does not in and of
itself establish causality.
• e.g., Rooster’s crow and the rising of the sun
 Rooster does not cause the sun to rise.

EXHIBIT 23.3 Correlation Analysis of Number of Hours Worked in Manufacturing Industries with Unemployment Rate

Coefficient of Determination
• Coefficient of Determination (R²)
 A measure obtained by squaring the correlation
coefficient; the proportion of the total variance of one
variable accounted for by knowing the value of another
variable.
 Measures that part of the total variance of Y that is
accounted for by knowing the value of X.

$$
R^2 = \frac{\text{Explained variance}}{\text{Total variance}}
$$
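As an illustration of this ratio, the sketch below fits a least-squares line to a small made-up data set and divides the explained variation in Y by the total variation. The helper name `r_squared` and the data are assumptions for illustration, not taken from the chapter.

```python
def r_squared(x, y):
    """Share of the variation in y accounted for by a straight-line fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * xi for xi in x]
    # Explained variation: how far the predictions move away from the mean of y.
    explained = sum((p - mean_y) ** 2 for p in predicted)
    # Total variation: how far the actual values move away from the mean of y.
    total = sum((yi - mean_y) ** 2 for yi in y)
    # Dividing both sums by the same n - 1 would give variances; the ratio is unchanged.
    return explained / total

# Hypothetical data, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(r_squared(x, y), 3))  # 0.6
```

For these values the correlation is about 0.775, and 0.775 squared is about 0.6, which matches the "squaring the correlation coefficient" definition.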
Correlation Matrix
• Correlation matrix
 The standard form for reporting correlation
coefficients for more than two variables.
• Statistical Significance
 The procedure for determining statistical significance
is the t-test of the significance of a correlation
coefficient.
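The slide names the t-test but does not show its formula. The sketch below assumes the standard form t = r·sqrt((n − 2) / (1 − r²)) with n − 2 degrees of freedom, and builds a small correlation matrix first. The variable names (`calls`, `sales`, `absences`) and their values are hypothetical and are not the salesperson data from the exhibit.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def correlation_matrix(data):
    """data maps a variable name to its list of values; returns nested dicts of r."""
    names = list(data)
    return {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

def t_for_r(r, n):
    """t statistic for H0: the population correlation is zero (n - 2 df)."""
    if abs(r) >= 1:
        return math.copysign(math.inf, r)  # perfect correlation: t is unbounded
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical data, invented for this example only.
data = {
    "calls": [1, 2, 3, 4, 5],
    "sales": [2, 4, 5, 4, 5],
    "absences": [5, 3, 4, 2, 1],
}
matrix = correlation_matrix(data)
n = len(data["calls"])

# Print the matrix with each off-diagonal t statistic alongside.
for row in matrix:
    for col in matrix[row]:
        r = matrix[row][col]
        note = "" if row == col else f"  t = {t_for_r(r, n):.3f}"
        print(f"{row:>9} x {col:<9} r = {r:+.3f}{note}")
```

Each t value would then be compared with a critical value from a t table for n − 2 degrees of freedom at the chosen significance level.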

EXHIBIT 23.4 Pearson Product-Moment Correlation Matrix for Salesperson Example

Regression Analysis
• Simple (Bivariate) Linear Regression
 A measure of linear association that investigates
straight-line relationships between a continuous
dependent variable and an independent variable that
is usually continuous, but can be a categorical
dummy variable.
• The Regression Equation (Y = α + βX)
 Y = the continuous dependent variable
 X = the independent variable
 α = the Y intercept (where the regression line crosses the Y axis)
 β = the slope coefficient (rise over run)

Regression Line and Slope

[Figure: regression line Ŷ = α̂ + β̂X plotted against X, with the slope shown as the change in Ŷ (rise) over the change in X (run).]
The Regression Equation
• Parameter Estimate Choices
 β is indicative of the strength and direction of the
relationship between the independent and dependent
variables.
 α (Y intercept) is a fixed point that is considered a
constant (the predicted value of Y when X is zero).
• Standardized Regression Coefficient (β)
 Estimated coefficient of the strength of relationship
between the independent and dependent variables.
 Expressed on a standardized scale where higher
absolute values indicate stronger relationships (range
is from -1 to 1).
The Regression Equation (cont’d)
• Parameter Estimate Choices
 Raw regression estimates (b₁)
 Raw regression weights have the advantage of retaining the
scale metric, which is also their key disadvantage.
 If the purpose of the regression analysis is forecasting, then
raw parameter estimates must be used.
 In other words, raw estimates are the choice when the
researcher is interested only in prediction.
 Standardized regression estimates (β)
 Standardized regression estimates have the advantage of a
constant scale.
 Standardized regression estimates should be used when the
researcher is testing explanatory hypotheses.
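The difference between the two kinds of estimate shows up when the same predictor is recorded in different units. The sketch below is a hedged illustration: it assumes the usual conversion β = b₁ · (s_X / s_Y) for simple regression, and the function name, variable names, and data are all hypothetical.

```python
import math

def raw_and_standardized_slope(x, y):
    """Return (b1, beta): the raw slope and its standardized counterpart."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    b1 = s_xy / s_xx                      # raw slope, in units of Y per unit of X
    beta = b1 * math.sqrt(s_xx / s_yy)    # b1 * (sd of X / sd of Y), unit-free
    return b1, beta

# Hypothetical data: the same predictor recorded in dollars and in thousands.
ad_spend_dollars = [1000, 2000, 3000, 4000, 5000]
ad_spend_thousands = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

b1_dollars, beta_dollars = raw_and_standardized_slope(ad_spend_dollars, sales)
b1_thousands, beta_thousands = raw_and_standardized_slope(ad_spend_thousands, sales)

print(b1_dollars, b1_thousands)                         # raw slopes differ by a factor of 1000
print(round(beta_dollars, 4), round(beta_thousands, 4)) # standardized slopes agree
```

The raw slope is what a forecast needs, because it maps actual units of X to actual units of Y; the standardized slope stays the same whatever the units, which makes it easier to compare across variables.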
EXHIBIT 23.5 The Advantage of Standardized Regression Weights

EXHIBIT 23.6 Relationship of Sales Potential to Building Permits Issued

EXHIBIT 23.7 The Best Fit Line or Knocking Out the Pins

Ordinary Least-Squares (OLS) Method of
Regression Analysis
• OLS
 Guarantees that the resulting straight line will produce the least
possible total error in using X to predict Y.
 Generates a straight line that minimizes the sum of squared
deviations of the actual values from this predicted regression
line.
 No straight line can completely represent every dot in the scatter
diagram.
 There will be a discrepancy between most of the actual scores
(each dot) and the predicted score (Ŷ).
 Uses the criterion of attempting to make the least amount of total
error in prediction of Y from X.
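The least-squares criterion can be tried out on a few points. The sketch below assumes the usual closed-form estimates for simple regression (slope = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)², intercept = Ȳ − slope · X̄); the function names and the data are hypothetical. It then nudges the fitted line to show that the sum of squared errors only gets larger.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_errors(x, y, intercept, slope):
    """Sum of squared gaps between each actual y and the line's prediction."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Hypothetical data, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a_hat, b_hat = ols_fit(x, y)
best = sum_squared_errors(x, y, a_hat, b_hat)
print(round(a_hat, 3), round(b_hat, 3), round(best, 3))  # 2.2 0.6 2.4

# Any other straight line does worse on the squared-error criterion.
print(sum_squared_errors(x, y, a_hat + 0.1, b_hat) > best)  # True
print(sum_squared_errors(x, y, a_hat, b_hat - 0.1) > best)  # True
```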
Ordinary Least-Squares Method of
Regression Analysis (OLS) (cont’d)

The regression equation, $Y_i = \hat{\alpha} + \hat{\beta} X_i + e_i$, means that the value
of Y for any value of X (X_i) is the estimated intercept
coefficient, plus the estimated slope coefficient times X_i,
plus some error.

Ordinary Least-Squares Method of
Regression Analysis (OLS) (cont’d)
• Statistical Significance Of Regression Model
• F-test (regression)
 Determines whether more variability is explained by
the regression or unexplained by the regression.
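A rough sketch of how the F statistic is assembled for simple regression follows. It assumes the standard definition F = (SS regression / 1) / (SS error / (n − 2)); the function name `regression_anova` and the data are hypothetical and are not the chapter's building permit example.

```python
def regression_anova(x, y):
    """Split the variation in y into regression and error parts and form F."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * a for a in x]

    ss_regression = sum((p - mean_y) ** 2 for p in predicted)    # explained
    ss_error = sum((b - p) ** 2 for b, p in zip(y, predicted))   # unexplained
    ss_total = sum((b - mean_y) ** 2 for b in y)

    df_regression, df_error = 1, n - 2   # one predictor in simple regression
    ms_regression = ss_regression / df_regression
    ms_error = ss_error / df_error
    f_value = ms_regression / ms_error if ms_error > 0 else float("inf")
    return {
        "ss_regression": ss_regression, "ss_error": ss_error, "ss_total": ss_total,
        "df_regression": df_regression, "df_error": df_error,
        "ms_regression": ms_regression, "ms_error": ms_error, "f": f_value,
    }

# Hypothetical data, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
table = regression_anova(x, y)
print(round(table["ss_regression"], 2), round(table["ss_error"], 2))  # 3.6 2.4
print(round(table["f"], 2))                                           # 4.5
```

The F value is then compared with a critical F for 1 and n − 2 degrees of freedom; a larger F means the regression explains more variability relative to what it leaves unexplained.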

Ordinary Least-Squares Method of
Regression Analysis (OLS) (cont’d)
• Statistical Significance Of Regression Model
 ANOVA Table:

Ordinary Least-Squares Method of
Regression Analysis (OLS) (cont’d)
• R²
 The proportion of variance in Y that is explained by X
(or vice versa)
 A measure obtained by squaring the correlation
coefficient; that proportion of the total variance of a
variable that is accounted for by knowing the value of
another variable.

$$
R^2 = \frac{3{,}398.49}{3{,}882.40} = 0.875
$$
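The arithmetic on this slide can be reproduced in a few lines. The two figures come from the slide; reading them as explained and total variation follows the chapter's earlier definition of R², and the variable names are illustrative.

```python
# Figures taken from the slide; names are illustrative.
explained = 3398.49   # variation in Y accounted for by the regression
total = 3882.40       # total variation in Y

r_squared = explained / total
unexplained = total - explained

print(round(r_squared, 3))      # 0.875
print(round(unexplained, 2))    # 483.91
print(round(1 - r_squared, 3))  # 0.125
```

Put differently, about 87.5 percent of the variation in Y is accounted for by X, and about 12.5 percent is left unexplained.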
EXHIBIT 23.8 Simple Regression Results for Building Permit Example

EXHIBIT 23.9 OLS Regression Line

Simple Regression and Hypothesis Testing
• The explanatory power of regression lies in
hypothesis testing. Regression is often used to
test relational hypotheses.
 The outcome of the hypothesis test involves two
conditions that must both be satisfied:
 The regression weight must be in the hypothesized direction.
Positive relationships require a positive coefficient and
negative relationships require a negative coefficient.
 The t-test associated with the regression weight must be
significant.
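The two conditions can be checked together, as in the sketch below. It assumes the standard t statistic for a slope, t = b₁ / SE(b₁) with SE(b₁) = sqrt(MS error / Σ(X − X̄)²) and n − 2 degrees of freedom. The function name, the data, and the choice of a two-tailed .05 critical value (3.182 for 3 degrees of freedom, from a t table) are assumptions for illustration.

```python
import math

def check_slope_hypothesis(x, y, expected_sign, t_critical):
    """Check both conditions: slope in the expected direction and a significant t."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x

    ss_error = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ms_error = ss_error / (n - 2)
    se_slope = math.sqrt(ms_error / s_xx)   # standard error of the slope
    t_value = slope / se_slope

    direction_ok = slope * expected_sign > 0   # condition 1: hypothesized direction
    significant = abs(t_value) > t_critical    # condition 2: significant t-test
    return {
        "slope": slope, "t": t_value,
        "direction_ok": direction_ok, "significant": significant,
        "supported": direction_ok and significant,
    }

# Hypothetical data; the hypothesis is that X is positively related to Y.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = check_slope_hypothesis(x, y, expected_sign=+1, t_critical=3.182)
print(round(result["slope"], 2), round(result["t"], 3))  # 0.6 2.121
print(result["direction_ok"], result["significant"], result["supported"])
# True False False
```

Here the slope has the hypothesized sign, but with only five observations the t value falls short of the critical value, so the hypothesis would not be supported.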

EXHIBIT 23A.1 Least-Squares Computation

