Correlation and Regression Handout 1

The document discusses correlation and regression analysis, explaining how to measure the strength of association between two numerical variables using correlation coefficients. It details the interpretation of correlation values, the significance of linear relationships, and the process of regression analysis to predict dependent variables based on independent variables. The conclusion emphasizes a strong positive relationship between store area and sales, with statistical significance indicating that area significantly affects sales.

Uploaded by

keithceoal

Correlation Analysis

◼ Used to measure and interpret the strength of association (linear relationship) between two numerical
variables

- Only concerned with strength of the relationship

- No causal effect is implied

Example: if cigarette smoking and lung cancer are highly correlated, that alone is not sufficient proof of
causation. One variable may cause the other or vice versa, a third factor may be involved, or the
correlation may have arisen by chance.

◼ Population correlation coefficient 𝜌 (rho) measures the strength of the linear relationship
between two variables, X and Y, independent of their respective scales of measurement

◼ Sample correlation coefficient 𝑟 is a point estimate of 𝜌 and measures the strength of the linear
relationship between the two variables in the sample observations

◼ r is the Pearson product moment coefficient of correlation between X and Y

Features of 𝜌 and 𝑟

◼ Unit free

◼ Range between -1 and 1, inclusive of the endpoints

◼ The closer to -1, the stronger the negative linear relationship

◼ The closer to +1, the stronger the positive linear relationship

◼ The closer to 0, the weaker the linear relationship
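
The features above can be checked numerically. The following is a minimal sketch, assuming a small made-up data set (the hours and scores below are hypothetical, not from this handout) and a helper named pearson_r written only for illustration. It shows that rescaling the variables leaves r unchanged (unit free) and that reversing one variable flips only the sign.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data (assumption for illustration only).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 70, 72]

r_original = pearson_r(hours, score)

# Change the units: hours -> minutes, score -> fraction of 100.
r_rescaled = pearson_r([h * 60 for h in hours], [s / 100 for s in score])

# Reverse the direction of one variable: same strength, opposite sign.
r_flipped = pearson_r(hours, [-s for s in score])

print(r_original, r_rescaled, r_flipped)
```

The first two printed values should agree up to floating-point rounding, and the third should be their negative.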

Correlation Coefficient (r) | Strength of Relationship | Interpretation
r = 1 | Perfect Positive Relationship | A perfect positive linear relationship: as one variable increases, the other increases proportionally.
r = 0.9 to 1 | Very Strong Positive Relationship | A very strong positive linear relationship, with only small deviations from the ideal straight line.
r = 0.7 to 0.9 | Strong Positive Relationship | A strong positive relationship, but with some fluctuations or variance in the data.
r = 0.5 to 0.7 | Moderate Positive Relationship | A moderate positive relationship, but there is some scatter or noise in the data.
r = 0.3 to 0.5 | Weak Positive Relationship | A weak positive relationship, with considerable variability in the data.
r = 0 to 0.3 | Very Weak Positive or No Relationship | A very weak or nearly nonexistent positive relationship.
r = -0.3 to 0 | Very Weak Negative Relationship | A very weak negative relationship, with little or no predictable inverse correlation.
r = -0.5 to -0.3 | Weak Negative Relationship | A weak negative relationship, but with some scatter or noise in the data.
r = -0.7 to -0.5 | Moderate Negative Relationship | A moderate negative relationship, where one variable tends to decrease as the other increases, but not perfectly.
r = -0.9 to -0.7 | Strong Negative Relationship | A strong negative relationship, with only small deviations from the ideal straight line.
r = -1 | Perfect Negative Relationship | A perfect negative linear relationship: as one variable increases, the other decreases proportionally.

Key Points

◼ Positive Correlation (r > 0): As one variable increases, the other also increases.

- Hours Studied vs. Test Scores

- Temperature vs. Ice Cream Sales
- Number of Hours Exercised vs. Calories Burned

◼ Negative Correlation (r < 0): As one variable increases, the other decreases.

- Age of a Vehicle vs. Resale Value

- Time Spent Watching TV vs. Academic Performance

◼ No or Weak Correlation (r ≈ 0): There is little to no linear relationship between the two variables.

- Shoe Size vs. Test Scores

- Hair Color vs. Intelligence
- Favorite Movie Genre vs. Monthly Income

Example 1 Correlation Analysis

You want to examine the correlation between the annual sales of produce stores and their
size in square footage. Sample data for seven stores were obtained.

Store | Area (in Sq Ft) | Annual Sales ($1000)
1 | 1,726 | 3,681
2 | 1,542 | 3,395
3 | 2,816 | 6,653
4 | 5,555 | 9,543
5 | 1,292 | 3,318
6 | 2,208 | 5,563
7 | 1,313 | 3,760

[Scatter Diagram: Annual Sales ($000) on the vertical axis against Square Feet on the horizontal axis]

Multiple R – correlation coefficient (Pearson 𝑟) representing the strength and direction of the relationship between
the independent and the dependent variable

Interpretation: There is a very strong positive relationship between Area (in square feet) and Sales.
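
As a cross-check, r can be computed directly from the seven data points of Example 1. This is a sketch in plain Python, not the software output the handout reads Multiple R from; the helper name pearson_r is an assumption made for illustration. The result should agree with the value 𝑟 = 0.97056 quoted in the conclusion.

```python
import math

# Store data from Example 1: area in sq ft, annual sales in $1000.
area = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(area, sales)
print(f"r = {r:.5f}")
```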

Question: Is there any evidence of a linear relationship between the annual sales of a store and its square footage
at .05 level of significance?

𝐻0 : 𝜌 = 0 (No association)

𝐻𝐴 : 𝜌 ≠ 0 (Association)

Since 𝑝 = 0.00028 < 0.05, we reject the null hypothesis and conclude that there is a statistically significant linear
association between the store's square footage and its annual sales. (The result also holds at the stricter 0.01 level.)
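
The handout reads the p-value from software output. One common way to obtain it, sketched here as an assumption about the method rather than a record of what the software did, is the t test for a correlation: t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom. The two-sided p-value below is found by numerically integrating the t density with Simpson's rule, so only the standard library is needed; the function names are illustrative.

```python
import math

area = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=20000):
    """Two-sided p-value: 1 - P(-|t| <= T <= |t|), using Simpson's rule on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_half = total * h / 3  # P(0 <= T <= |t|)
    return 1 - 2 * central_half

n = len(area)
r = pearson_r(area, sales)
df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r * r)
p_value = two_sided_p(t_stat, df)

print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.6f}")
print("Reject H0 at the 0.05 level" if p_value < 0.05 else "Fail to reject H0")
```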

Remark: If the relationship/ association is significant, you can proceed to Regression Analysis.

Regression Analysis

◼ Regression analysis is used primarily to establish a linear relationship between variables and to provide
predictions

- Predicts the value of a dependent (response) variable based on the value of at least one
independent (explanatory) variable

- Explains the effect of the independent variables on the dependent variable

◼ Relationship between variables is described by a linear function

◼ This function describes how much change in the dependent variable is associated with a unit increase (or
decrease) in the independent variable.

◼ Sample regression line provides an estimate of the population regression line as well as a predicted
value of Y

Slope and Intercept

◼ 𝑏0 is the estimated average value of Y when the value of X is zero.

◼ 𝑏1 is the estimated change in the average value of Y as a result of a one-unit change in X.

◼ R-squared: Measure of how well the regression line fits the data. A higher value (close to 1) indicates a
good fit. It also tells us the percentage of variability in the dependent variable that is explained by the
independent variable.

Example 1 Regression Analysis

Since Area (in sq ft) and Sales (in $1000) are significantly associated, we can proceed to regression analysis to

i) fit a linear regression model to the data;

ii) use the regression line to predict values; and
iii) check whether or not the independent variable significantly affects the dependent variable.

Solutions:

i) The coefficients of the regression line are 𝑏0 = 1636.41 and 𝑏1 = 1.49. Thus, the regression line is

𝑦 = 1636.41 + 1.49𝑥
where 𝑥 is the area (in sq ft) and 𝑦 is the sales (in $1000).
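
The coefficients can be reproduced with the usual least-squares formulas, 𝑏1 = Sxy / Sxx and 𝑏0 = ȳ − 𝑏1·x̄. The handout does not show how its software obtained them, so the following is only a sketch under that assumption, with fit_line as an illustrative helper name.

```python
# Store data from Example 1: area in sq ft, annual sales in $1000.
area = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: the line passes through (mean_x, mean_y)
    return b0, b1

b0, b1 = fit_line(area, sales)
print(f"b0 = {b0:.2f}, b1 = {b1:.5f}")
```

Rounded to two decimals, the slope matches the 1.49 used in the regression line above.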

ii) Suppose that we want to predict the sales (𝑦) when the area (𝑥) is 3000 sq ft. We substitute 𝑥 = 3000
into the regression line. Hence,
𝑦 = 1636.41 + 1.49(3000) = 6106.41 (in $1000)
Thus, if the store area is 3000 sq ft, the predicted sales are $6,106,410.
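
The substitution can be written as a small function. This sketch uses the rounded coefficients from part i); the names B0, B1 and predict_sales are illustrative only. It also shows how much the answer moves if the less-rounded slope 1.48663 quoted in the conclusion is used instead.

```python
B0 = 1636.41   # intercept from part i), in $1000
B1 = 1.49      # slope from part i), rounded to two decimals

def predict_sales(area_sq_ft, b0=B0, b1=B1):
    """Predicted annual sales (in $1000) for a store of the given area."""
    return b0 + b1 * area_sq_ft

predicted = predict_sales(3000)          # in $1000
predicted_dollars = predicted * 1000     # in dollars

# Same prediction with the less-rounded slope from the conclusion.
predicted_unrounded = predict_sales(3000, b1=1.48663)

print(f"{predicted:.2f} (in $1000), i.e. ${predicted_dollars:,.0f}")
print(f"{predicted_unrounded:.2f} (in $1000) with b1 = 1.48663")
```

Note that 3000 sq ft lies inside the observed range of areas (1,292 to 5,555 sq ft), so this is an interpolation rather than an extrapolation.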

R-square = 0.94198: It means that 94.198% of the variability in the dependent variable is explained by the
independent variable(s) in your regression model. The remaining 5.802% of the variability is due to other factors not
captured by the model, such as random error, other variables not included in the model, or inherent variability in
the data.
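
R-square can be recomputed from the data as 1 − SSE/SST, where SST is the total variation of sales around its mean and SSE is the variation left around the fitted line. The following is a minimal sketch under that standard definition; the variable names are illustrative.

```python
area = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(area)
mean_x = sum(area) / n
mean_y = sum(sales) / n

# Least-squares fit.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(area, sales))
sxx = sum((x - mean_x) ** 2 for x in area)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

fitted = [b0 + b1 * x for x in area]

sst = sum((y - mean_y) ** 2 for y in sales)               # total sum of squares
sse = sum((y - f) ** 2 for y, f in zip(sales, fitted))    # residual sum of squares
r_squared = 1 - sse / sst

print(f"R-square = {r_squared:.5f}")
print(f"Unexplained share = {100 * (1 - r_squared):.3f}%")
```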


iii) To check whether or not the area (independent variable) significantly affects sales (dependent
variable), we set up the following hypotheses:
𝐻0 : 𝛽1 = 0 (no effect)
𝐻𝐴 : 𝛽1 ≠ 0 (there is an effect)

Since 𝑝 = 0.00028 < 0.01, we reject 𝐻0 and conclude that there is evidence that square footage affects annual sales.

ANOVA

In regression analysis, the ANOVA output is used to assess the overall significance of the regression model.
It tells you whether the independent variable(s) in the regression model collectively explain a significant portion of
the variation in the dependent variable.

Components:

◼ F (F-statistic): The F-statistic is used to test if the regression model is a good fit for the data. It is calculated
as the ratio of the regression mean square, MS (regression), to the residual (error) mean square, MS (residual).
A larger F-value indicates that the model explains a significant portion of the variability in the dependent variable.

◼ Significance F (p-value for F-statistic): This is the p-value associated with the F-statistic. It tells you whether
the overall regression model is statistically significant. If Significance F is less than your alpha level (usually
0.05), you reject the null hypothesis and conclude that the regression model is statistically significant.

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 30380456.12 | 30380456.12 | 81.17909015 | 0.000281201
Residual | 5 | 1871199.595 | 374239.919 | |
Total | 6 | 32251655.71 | | |

◼ The F-statistic (81.18) is very large, suggesting that the regression model fits the data well.
◼ The p-value (0.000281) is much less than the significance level of 0.05, so we reject the null hypothesis
that the regression model does not explain a significant portion of the variability in the dependent
variable.

◼ This means the independent variable in the regression model is statistically significant and has a
meaningful relationship with the dependent variable.
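
The df, SS, MS and F columns of the ANOVA table can be rebuilt from the data. This sketch assumes the standard decomposition SS (total) = SS (regression) + SS (residual) for a simple linear regression with one independent variable; it does not compute Significance F, and the variable names are illustrative.

```python
area = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(area)
mean_x = sum(area) / n
mean_y = sum(sales) / n

# Least-squares fit.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(area, sales))
sxx = sum((x - mean_x) ** 2 for x in area)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
fitted = [b0 + b1 * x for x in area]

# Sums of squares.
ss_total = sum((y - mean_y) ** 2 for y in sales)
ss_residual = sum((y - f) ** 2 for y, f in zip(sales, fitted))
ss_regression = ss_total - ss_residual

# Degrees of freedom: 1 predictor, n - 2 for residuals, n - 1 in total.
df_regression = 1
df_residual = n - 2
df_total = n - 1

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

print(f"Regression  df={df_regression}  SS={ss_regression:.2f}  MS={ms_regression:.2f}  F={f_stat:.4f}")
print(f"Residual    df={df_residual}  SS={ss_residual:.3f}  MS={ms_residual:.3f}")
print(f"Total       df={df_total}  SS={ss_total:.2f}")
```

With a single independent variable, F equals the square of the t statistic for the slope, which is why Significance F matches the p-value used in part iii).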

CONCLUSION:
The regression analysis demonstrates a very strong positive linear relationship (𝑟 = 0.97056) between area (in sq
ft) and sales (in $1000). The model is statistically significant at 𝛼 = 0.01, with an 𝑅² of 0.94198, indicating that
94.198% of the variability in sales can be explained by the area.

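The slope of the regression equation (𝑏1 = 1.48663) suggests that for every additional square foot in area, sales are
expected to increase by approximately 1.48663 (in $1000), or about $1,486.63. The intercept (𝑏0 = 1636.41) indicates
that when the area is 0 square feet, the baseline sales are 1,636.41 (in $1000), or about $1,636,410. Since the smallest
store in the sample has 1,292 sq ft, the intercept is an extrapolation rather than an observed value.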
