Correlation

The document explains correlation as a statistical technique that measures the relationship between pairs of variables, such as height and weight, and introduces regression analysis as a method to predict outcomes based on independent variables. It outlines the assumptions of linear regression and provides a formula for calculating the regression equation, alongside examples of its application. Additionally, it discusses the correlation coefficient and the use of scatter diagrams to visualize relationships between variables.

Uploaded by

kylajayne205


CORRELATION:

• Correlation is a statistical technique that can show whether and how
strongly pairs of variables are related. For example, height and
weight are related; taller people tend to be heavier than shorter
people. The relationship isn't perfect. People of the same height vary
in weight, and you can easily think of two people you know where the
shorter one is heavier than the taller one. Nonetheless, the average
weight of people 5'5'' is less than the average weight of people 5'6'',
and their average weight is less than that of people 5'7'', etc.
• Correlation can tell you just how much of the variation in people's
weights is related to their heights. Although this correlation is fairly
obvious, your data may contain unsuspected correlations. You may
also suspect there are correlations but not know which are the
strongest. An intelligent correlation analysis can lead to a greater
understanding of your data.
LEARNING OUTCOMES:
• After successful completion of this unit, you should be able to:
• explain the concept of regression;
• apply the concept of regression in solving problems;
• identify the advantages and disadvantages of regression;
• explain the concept of Chi-square;
• apply the concept of Chi-square in solving problems; and
• identify the advantages and disadvantages of Chi-square.
Regression Analysis – Linear model assumptions
Linear regression analysis is based on six fundamental assumptions:
1. The dependent and independent variables show a linear
relationship, described by a slope and an intercept.
2. The independent variable is not random.
3. The expected (mean) value of the residual (error) is zero.
4. The variance of the residual (error) is constant across all observations.
5. The residual (error) values are not correlated across observations.
6. The residual (error) values follow the normal distribution.
Regression Analysis – Simple linear regression
Simple linear regression is a model that assesses the relationship
between a dependent variable and an independent variable. The simple
linear model is expressed using the following equation:
Y = a + bX + ϵ
Where:
• Y – Dependent variable
• X – Independent (explanatory) variable
• a – Intercept
• b – Slope
• ϵ – Residual (error)
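To make the role of the residual concrete, here is a minimal sketch that rearranges the model as ϵ = Y - (a + bX) and evaluates it for each observation. The data points and the values of a and b are hypothetical, invented only for illustration; the function name residuals is likewise an assumption.

```python
def residuals(xs, ys, a, b):
    """Return the residual y - (a + b * x) for each observation."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical observations, invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

# Hypothetical intercept and slope (not estimated from any data in the text).
a, b = 1.0, 2.0

errors = residuals(xs, ys, a, b)
print([round(e, 2) for e in errors])
```

Each residual is the vertical distance between an observed Y and the value the line gives for the same X.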
Example:
Last year, five randomly selected students took a math aptitude test before
they began their statistics course. The Statistics Department has three
questions.
▪ What linear regression equation best predicts statistics performance, based
on math aptitude scores?
▪ If a student made an 80 on the aptitude test, what grade would we expect
her to make in statistics?
▪ How well does the regression equation fit the data?
How to Find the Regression Equation
In the table below, the xi column shows scores on the aptitude test.
Similarly, the yi column shows statistics grades. The last two columns
show deviation scores, that is, the difference between the student's
score and the average score on each test. The last two rows show sums
and mean scores that we will use to conduct the regression analysis.
• The regression equation is a linear equation of the form:
ŷ = b0 + b1x
To conduct a regression analysis, we need to solve for b0 and b1.
Computations are shown below. Notice that all of our inputs for the
regression analysis come from the above three tables.

First, we solve for the regression coefficient (b1):

b1 = Σ [ (xi - x̄)(yi - ȳ) ] / Σ [ (xi - x̄)² ]
b1 = 470/730
b1 = 0.644
• Once we know the value of the regression coefficient (b1), we can
solve for the regression intercept (b0):

b0 = ȳ - b1 * x̄
b0 = 77 - (0.644)(78)
b0 = 26.768

Therefore, the regression equation is: ŷ = 26.768 + 0.644x
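
The computation above can be sketched in Python. The function fit_line is a hypothetical helper, not part of the original material, that applies the same two formulas to raw data. Because the students' individual scores are not listed here, the second half works only from the sums (470 and 730) and means (78 and 77) quoted in the text.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for the line y-hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products of deviations and sum of squared x deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Reproduce the worked example from the sums and means quoted in the text.
sum_xy_dev = 470  # sum of (xi - x_bar) * (yi - y_bar)
sum_xx_dev = 730  # sum of (xi - x_bar) squared
x_bar, y_bar = 78, 77

b1 = round(sum_xy_dev / sum_xx_dev, 3)  # rounded to 3 places, as in the text
b0 = round(y_bar - b1 * x_bar, 3)
print(b1, b0)
```

Note that the text rounds b1 to 0.644 before computing b0. Carrying the unrounded slope through instead gives an intercept of about 26.781 rather than 26.768.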

How to Use the Regression Equation:
Once you have the regression equation, choose a value for the
independent variable (x), perform the computation, and you have an
estimated value (ŷ) for the dependent variable.
In our example, the independent variable is the student's score on the
aptitude test. The dependent variable is the student's statistics grade. If
a student made an 80 on the aptitude test, the estimated statistics
grade (ŷ) would be:
ŷ = b0 + b1x
ŷ = 26.768 + 0.644x
ŷ = 26.768 + 0.644 * 80
ŷ = 26.768 + 51.52 = 78.288
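
The same prediction can be written as a small function. This is only a sketch: the name predict_grade and the constants B0 and B1 are assumptions introduced here, with values copied from the worked example.

```python
B0 = 26.768  # intercept from the worked example
B1 = 0.644   # slope from the worked example


def predict_grade(aptitude_score, b0=B0, b1=B1):
    """Estimated statistics grade for a given aptitude test score."""
    return b0 + b1 * aptitude_score


print(round(predict_grade(80), 3))  # 78.288
```

Predictions are most trustworthy for x values inside the range of the scores used to fit the line.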
Correlation coefficient

The degree of association is measured by a correlation coefficient,
denoted by r. It is sometimes called Pearson's correlation coefficient
after its originator and is a measure of linear association.
If a curved line is needed to express the relationship, other and more
complicated measures of the correlation must be used.
The correlation coefficient is measured on a scale that varies from
+1 through 0 to -1.
Complete correlation between two variables is expressed by either +1
or -1. When one variable increases as the other increases the
correlation is positive; when one decreases as the other increases it is
negative. Complete absence of correlation is represented by 0. The figure
below gives some graphical representations of correlation.
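
As a rough illustration of the scale, the sketch below computes Pearson's r from its usual definition, the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The helper name pearson_r and the data are hypothetical, chosen so that the two series hit the ends of the scale.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data chosen to reach the two ends of the scale.
x = [1, 2, 3, 4]
rising = [2, 4, 6, 8]   # increases as x increases
falling = [8, 6, 4, 2]  # decreases as x increases

print(pearson_r(x, rising), pearson_r(x, falling))
```

Real data rarely fall exactly on a line, so r usually lies strictly between -1 and +1.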
Looking at data: scatter diagrams
When an investigator has collected two series of observations and wishes to
see whether there is a relationship between them, he or she should first
construct a scatter diagram. The vertical scale represents one set of
measurements and the horizontal scale the other. If one set of observations
consists of experimental results and the other consists of a time scale or
observed classification of some kind, it is usual to put the experimental
results on the vertical axis. These represent what is called the "dependent
variable". The "independent variable", such as time or height or some other
observed classification, is measured along the horizontal axis, or baseline.
The words "independent" and "dependent" could puzzle the beginner
because it is sometimes not clear what is dependent on what. This confusion
is a triumph of common sense over misleading terminology, because often
each variable is dependent on some third variable, which may or may not be
mentioned. It is reasonable, for instance, to think of the height of children as
dependent on age rather than the converse, but consider a positive
correlation between mean tar yield and nicotine yield of certain brands of
cigarette.
The nicotine liberated is unlikely to have its origin in the tar: both vary in
parallel with some other factor or factors in the composition of the
cigarettes. The yield of the one does not seem to be "dependent" on the
other in the sense that, on average, the height of a child depends on his
age. In such cases it often does not matter which scale is put on which axis
of the scatter diagram. However, if the intention is to make inferences
about one variable from the other, the observations from which the
inferences are to be made are usually put on the baseline. As a further
example, a plot of monthly deaths from heart disease against monthly
sales of ice cream would show a negative association. However, it is hardly
likely that eating ice cream protects from heart disease! It is simply that the
mortality rate from heart disease is inversely related - and ice cream
consumption positively related - to a third factor, namely environmental
temperature.
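
The ice cream example can be mimicked with a short sketch. All figures below are hypothetical and invented for illustration: both series are built from temperature plus a small fixed wobble, and neither is built from the other. The helper pearson_r is an assumed name for the usual Pearson formula.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical monthly figures, invented for illustration only.
temperature = [2, 4, 8, 12, 16, 20, 22, 21, 17, 12, 7, 3]
wobble_a = [1, -2, 0, 2, -1, 1, 0, -1, 2, -2, 1, 0]
wobble_b = [0, 1, -1, 2, -2, 1, -1, 0, 2, -1, 0, 1]

# Each series depends on temperature only, never on the other series.
ice_cream_sales = [10 + 3 * t + w for t, w in zip(temperature, wobble_a)]
heart_deaths = [200 - 2 * t + w for t, w in zip(temperature, wobble_b)]

r_sales_deaths = pearson_r(ice_cream_sales, heart_deaths)
print(round(r_sales_deaths, 2))
```

The strongly negative r arises even though sales never enter the formula for deaths, which is the point of the example: the association comes entirely from the shared third factor.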
Example: Calculation of the correlation coefficient

https://www.youtube.com/watch?v=nUD04ka4goA
ASSESSMENT:
• Determine if there is a correlation between size of pulmonary
anatomical dead space and height of child. Make a scatter diagram to
show the heights and pulmonary anatomical dead spaces in the 15
children. Use a 5% level of significance.
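
One common way to judge significance, assumed here because the material does not name a method, is to convert r to a t statistic with n - 2 degrees of freedom and compare it with the tabulated two-sided 5% critical value (2.160 for 13 degrees of freedom). The sketch below uses a placeholder r, not the answer to the assessment; the names t_statistic and r_hypothetical are illustrative.

```python
import math


def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


N_CHILDREN = 15          # sample size stated in the assessment
T_CRITICAL_DF13 = 2.160  # two-sided 5% critical value for 13 df, from a t table

r_hypothetical = 0.5     # placeholder only; replace with the r from the data
t = t_statistic(r_hypothetical, N_CHILDREN)
print(round(t, 3), abs(t) > T_CRITICAL_DF13)
```

If the absolute value of t exceeds the critical value, the correlation is declared significant at the 5% level.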
