
6) CorrelationAndRegression - 27


17.10.2023

Prof.Dr. Ahmet DİRİCAN
İÜ-C. CERRAHPAŞA MEDICAL FACULTY
DEPARTMENT OF BIOSTATISTICS

[Figure: scatter diagram with axes x and y and the fitted line Y=a+bX]

Correlation Analysis

The term correlation refers to any type of relationship between events and objects.
If we are interested only in determining whether a relationship exists, we employ correlation analysis.
Correlation analysis refers exclusively to a quantifiable relationship between two variables.
For the correlation calculation, there must be two measures (variables) for each subject.
If this condition is satisfied, the data can be inserted into a statistical formulation that will reveal the type and strength of the relationship under study.

The most widely employed measure of statistical correlation is the product-moment correlation coefficient devised by Pearson.
Many other techniques used to describe relationships are analogous to the Pearson approach.

In regression, we consider the problem of predicting one variable (the dependent variable "Y") from one or more related variables (the independent variables "X").

Correlation-two variables (Univariate & Bivariate Statistics)

• linear pattern of relationship between one variable (x) and another variable (y)
• an association between two variables
• relative position of one variable correlates with relative distribution of another variable
• graphical representation of the relationship between two variables
• Warning:
  – No proof of causality
  – Cannot assume x causes y

Sample vs. Population

• Sample statistics estimate Population parameters
  – x̄ tries to estimate μ ("x bar": sample mean; "mu": population mean)
  – r tries to estimate ρ ("rho", the Greek symbol, not "p")
• r: correlation for a sample (based on the limited observations we have)
• ρ: actual correlation in the population (the true correlation)
• Beware Sampling Error!!
  – even if ρ=0 (there's no actual correlation), you might get r=0.08 or r=-0.26 just by chance.
  – We look at r, but we want to know about ρ

Hypothesis testing with Correlations

• Two possibilities
  – Ho: ρ = 0 (no actual correlation; The Null Hypothesis)
  – Ha: ρ ≠ 0 (there is some correlation; The Alternative Hyp.)
• Case #1 (see correlation worksheet)
  – Correlation between distance and points, r = -0.904
  – Sample small (n=6), but r is very large
  – We guess ρ < 0 (we guess there is some correlation in the pop.)
• Case #2
  – Correlation between aiming and points, r = 0.628
  – Sample small (n=6), and r is only moderate in size
  – We guess ρ = 0 (we guess there is NO correlation in pop.)
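The two guesses above can be checked with the t statistic for a correlation coefficient, t = r·sqrt((n-2)/(1-r²)) with n-2 degrees of freedom. The sketch below is only an illustration: the function name and the dictionary keys are made up here, and the critical value 2.776 is the standard two-tailed table value for 4 degrees of freedom at α = 0.05.

```python
import math

def t_statistic(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Two-tailed critical value of t for df = 4 at alpha = 0.05 (standard t table).
T_CRIT_DF4 = 2.776

# The two sample correlations quoted in Case #1 and Case #2 (n = 6 each).
cases = {"distance vs. points": -0.904, "aiming vs. points": 0.628}

results = {}
for name, r in cases.items():
    t = t_statistic(r, n=6)
    reject = abs(t) > T_CRIT_DF4
    results[name] = (t, reject)
    print(f"{name}: r = {r:+.3f}, t = {t:+.2f}, reject H0: {reject}")
```

With n = 6, the large r of Case #1 exceeds the critical value while the moderate r of Case #2 does not, which matches the guesses on the slide.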


THE MEASURES OF RELATIONSHIP BETWEEN CONTINUOUS VARIABLES

A scatterplot can reveal various types of associations between two variables.
Although a scatterplot is an essential first step in studying the association between variables, it is often useful to quantify the strength of association by calculating a summary index.

Scatter Diagrams: The most accurate information about the relationship model between two variables is obtained from the scatter diagram of individuals.

[Figure: four scatter diagrams of y against x]
• r ≈ 0: There appears to be no discernible relationship between two variables.
• r ≈ 1 and r ≈ -1: The y variable can respond to the increase of the x variable with an increase or decrease. The variables are related linearly.
• r ≈ 0: Small and large values of x variable are associated with large values of y variable. The relationship is U-shaped.

TYPES OF THE RELATIONSHIPS;
1-Deterministic (exact) relationships. The observed (x, y) data points fall directly on a line, e.g. y=1.5x. The relationship between degrees of Fahrenheit and Celsius is known to be: F=(9/5)*C+32
2-Empirical (experimental), stochastic (probabilistic) relationships, e.g. y=1.5x + Random Error. A stochastic model is a mathematical description (of the relevant properties) of an entropy source using random variables.
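A small sketch of the difference between the two types of relationship, using the Fahrenheit/Celsius line and y=1.5x plus random error. The temperature values, the noise level (standard deviation 3) and the seed are arbitrary choices made for this illustration; `pearson_r` is a helper written here with the usual sums formula for the Pearson coefficient.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)

# Deterministic: every point lies exactly on the line F = (9/5)*C + 32.
celsius = [0, 10, 20, 30, 40, 100]
fahrenheit = [9 / 5 * c + 32 for c in celsius]
r_deterministic = pearson_r(celsius, fahrenheit)

# Stochastic: y = 1.5x + random error (seeded so the run is repeatable).
rng = random.Random(0)
xs = list(range(1, 21))
ys = [1.5 * x + rng.gauss(0, 3) for x in xs]
r_stochastic = pearson_r(xs, ys)

print(f"deterministic r = {r_deterministic:.6f}")
print(f"stochastic    r = {r_stochastic:.3f}")
```

The deterministic data give r = 1 up to floating-point rounding, while the noisy data give a positive r below 1.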

As the scatter in the sample space widens, the strength of the relationship decreases.

[Figure: scatter diagrams]
1) Positive relationship: r = 1, r = 0.85, r = 0.55
2) Negative relationship: r = -1, r = -0.85, r = -0.55
3) No relation: r = 0.0

Interpretation of the Correlation Coefficients

"r" conveys two pieces of information about the relationship: strength (1) and direction (2).

1) Strength of relationship: the strength is judged from the absolute value of "r", as follows.

"r="           Relation
0.00 – 0.19    None (Chance Effect)
0.20 – 0.34    Weak
0.35 – 0.49    Low
0.50 – 0.64    Moderate
0.65 – 0.79    Strong
0.80 – 0.95    Very Strong
0.96 – 1.00    Perfect

2) Direction of relationship: r ranges in value from "–1" to "+1"
• positive (direct, parallel) – variables move in same direction
• negative (inverse) – variables move in opposite directions

-1 (Strong Negative)    0 (No Relation)    +1 (Strong Positive)
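The strength table and the direction rule can be written as a small lookup. This is a sketch only: the function name is made up, and because the table leaves small gaps between bands (for example between 0.19 and 0.20), the code assumes each band starts at its lower bound and runs up to the next one.

```python
def interpret_r(r: float):
    """Return (strength label, direction) for a correlation coefficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    # Lower bounds of each band, checked from strongest to weakest.
    bands = [
        (0.96, "Perfect"),
        (0.80, "Very Strong"),
        (0.65, "Strong"),
        (0.50, "Moderate"),
        (0.35, "Low"),
        (0.20, "Weak"),
    ]
    strength = "None (Chance Effect)"
    for lower_bound, label in bands:
        if abs(r) >= lower_bound:
            strength = label
            break

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction

print(interpret_r(0.54))   # a moderate, positive relationship
print(interpret_r(-0.74))  # a strong, negative relationship
```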

One commonly used measure is the Pearson correlation coefficient, denoted by r. It is defined by the following formulas.

$$r = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sqrt{\left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right]\left[\sum y_i^2 - \frac{(\sum y_i)^2}{n}\right]}}$$

Testing the validity of the correlation coefficient

H0: ρ = 0    HA: ρ ≠ 0

$$t = \frac{r}{\sqrt{\frac{1-r^2}{n-2}}}$$

Equivalently, in terms of the standard deviations:

$$r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{SD_x^2 \, SD_y^2 \, n^2}} = \frac{\sum xy - (\sum x \sum y / n)}{SD_x \, SD_y \, n}$$

If the number of cases is less than 30, (n-1)² is used instead of n², and n-1 is used instead of n.

Example: A random sample of 25 nurses selected from a state registry of nurses yielded the following information on each nurse's score on the state board examination and his or her score in school. Both scores are related to the nurse's area of affiliation.

Final score (x)    State board score (y)
87                 440
87                 480
87                 535
88                 460
88                 525
89                 480
89                 510
89                 530
89                 545
89                 600
90                 495
90                 545
90                 575
91                 525
91                 575
91                 600
92                 490
92                 510
92                 575
93                 540
93                 595
94                 525
94                 545
94                 600
94                 625

[Figure: scatter diagram of State Board Score (400 to 700) against Final Score (86 to 96)]

∑x=2263, ∑y=13425, ∑xy=1216685, ∑x²=204977 and ∑y²=7264525
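As a check on the nurse example, the sketch below recomputes the sums, r and t directly from the 25 (final score, state board score) pairs. Variable names are chosen here for illustration; the sums are computed from the data rather than typed in, so they can be compared with the values quoted on the slide.

```python
import math

# (final score x, state board score y) for the 25 nurses in the example.
data = [
    (87, 440), (87, 480), (87, 535), (88, 460), (88, 525),
    (89, 480), (89, 510), (89, 530), (89, 545), (89, 600),
    (90, 495), (90, 545), (90, 575), (91, 525), (91, 575),
    (91, 600), (92, 490), (92, 510), (92, 575), (93, 540),
    (93, 595), (94, 525), (94, 545), (94, 600), (94, 625),
]

n = len(data)
sum_x = sum(x for x, _ in data)
sum_y = sum(y for _, y in data)
sum_xy = sum(x * y for x, y in data)
sum_x2 = sum(x * x for x, _ in data)
sum_y2 = sum(y * y for _, y in data)

# Pearson r from the sums formula.
numerator = sum_xy - sum_x * sum_y / n
denominator = math.sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
r = numerator / denominator

# t statistic for H0: rho = 0, with n - 2 = 23 degrees of freedom.
t = r / math.sqrt((1 - r ** 2) / (n - 2))

print(f"n={n}, sum_x={sum_x}, sum_y={sum_y}, sum_xy={sum_xy}")
print(f"sum_x2={sum_x2}, sum_y2={sum_y2}")
print(f"r={r:.6f}, t={t:.2f}")
```

The data give ∑x² = 204977 and ∑y² = 7264525, and the resulting r and t agree with the worked values 0.541788 and 3.09.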


Is there any correlation between final score and state board score?

H0: ρ = 0    HA: ρ ≠ 0

$$r = \frac{\sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sqrt{\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]\left[\sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}\right]}}$$

∑x=2263, ∑y=13425, ∑xy=1216685, ∑x²=204977, ∑y²=7264525

$$r = \frac{1216685 - \frac{(2263)(13425)}{25}}{\sqrt{\left[204977 - \frac{5121169}{25}\right]\left[7264525 - \frac{180230625}{25}\right]}} = 0.541788$$

$$t = \frac{r}{\sqrt{\frac{1-r^2}{n-2}}} = 3.09 > t_{table} = 2.0687 \Rightarrow P<0.05$$

Reject H0.

The strength of the relationship between final score and state board score was moderate (r=0.54), parallel (positive) and valid (statistically significant).

Example: The findings of the study of 20 women are presented in the table, to investigate whether there is a relationship between the number of pregnancies (x) and hemoglobin (y) values.

$$r = \frac{1075 - \frac{96 \cdot 236}{20}}{\sqrt{\left(616 - \frac{96^2}{20}\right)\left(2824 - \frac{236^2}{20}\right)}} = -0.74$$

$$t = \frac{r}{\sqrt{\frac{1-r^2}{n-2}}}$$

[|t| = 4.67] > [t(18; 0.05) = 2.1] → p<0.05

H0 rejected, H1 accepted: "r is strong, inverse and valid".
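When only the sums are available, as in the pregnancy and hemoglobin example (n=20, ∑x=96, ∑y=236, ∑xy=1075, ∑x²=616, ∑y²=2824), r and t follow from the same formulas. The function names below are made up for this sketch, and 2.101 is the standard two-tailed table value of t for 18 degrees of freedom at α = 0.05 (quoted as 2.1 on the slide).

```python
import math

def r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Pearson r computed from the five sums and the sample size."""
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt(
        (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
    )
    return numerator / denominator

def t_from_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))

T_CRIT_DF18 = 2.101  # two-tailed, alpha = 0.05, df = 18 (standard t table)

r = r_from_sums(n=20, sum_x=96, sum_y=236, sum_xy=1075, sum_x2=616, sum_y2=2824)
t = t_from_r(r, n=20)

print(f"r = {r:.3f}, |t| = {abs(t):.2f}, reject H0: {abs(t) > T_CRIT_DF18}")
```

Carrying full precision gives |t| of about 4.68; the slide's 4.67 comes from rounding r before computing t. The conclusion is the same either way.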

Correlation Coefficient = r = -0.74, p<0.05
Interpretation: A strong, inverse and valid correlation between the number of pregnancies and hemoglobin values was found.

Coefficient of Determination = r² = 0.55
Interpretation: 55% of the variation in hemoglobin values is explained by the number of pregnancies.
…….explained on the next slide

Coefficient of Determination…

Tests thus far have shown whether a linear relationship exists; it is also useful to measure the strength of the relationship. This is done by calculating the coefficient of determination.
The coefficient of determination is the square of the coefficient of correlation (r), hence R² = r².
"R²" is interpreted as the proportion of the variation in the dependent variable that is predictable from the independent variable(s).
Unlike the value of a test statistic, the coefficient of determination does not have a critical value that enables us to draw conclusions.
In general, the higher the value of R², the better the model fits the data.
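A short worked check of the 0.55 figure, using the sums of the pregnancy and hemoglobin example (n=20, ∑x=96, ∑y=236, ∑xy=1075, ∑x²=616, ∑y²=2824). The shorthand S_xy, S_xx and S_yy is notation introduced here for the corrected sums, not something taken from the slides.

```latex
\begin{aligned}
S_{xy} &= \sum xy - \frac{\sum x \sum y}{n} = 1075 - \frac{96 \cdot 236}{20} = -57.8 \\
S_{xx} &= \sum x^2 - \frac{(\sum x)^2}{n} = 616 - \frac{96^2}{20} = 155.2 \\
S_{yy} &= \sum y^2 - \frac{(\sum y)^2}{n} = 2824 - \frac{236^2}{20} = 39.2 \\
R^2 = r^2 &= \frac{S_{xy}^2}{S_{xx}\, S_{yy}} = \frac{3340.84}{6083.84} \approx 0.55
\end{aligned}
```

Squaring removes the sign, so R² says nothing about direction; that information stays with r = -0.74.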

Regression Analysis

The nature and strength of the relationships between variables may be examined by regression and correlation analysis.
The linear correlation coefficient was presented as a quantity that measures the strength of a linear relationship (dependency).
Regression analysis is helpful in ascertaining the probable form of the relationship between variables, and the ultimate objective when this method of analysis is employed usually is to predict or estimate the value of one variable corresponding to a given value of another variable.
The general purpose of regression is to learn more about the relationship between one or several independent or predictor variables and a dependent or criterion variable.

The problem is to fit a straight line to the data that in some sense gives the best prediction of y for any value of x.
Intuitively this will be a line that minimizes the distance between the data and the fitted line.
There are several approaches to this problem, but the standard method is called least squares regression.
When we use this method to fit a regression line we minimize the sum of squares of the vertical distances of the observations from the line.
Each distance is the difference for an individual between the observed value and the value given by the line, known as the fitted value. The technical term for this distance is a residual.
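The least squares idea can be illustrated on a tiny data set. The five (x, y) points below are hypothetical values chosen for this sketch, and the function names are made up; the code fits the line, lists the residuals, and confirms that nudging the intercept or slope away from the fitted values increases the sum of squared residuals.

```python
def fit_line(xs, ys):
    """Least squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared vertical distances between observations and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]  # observed minus fitted
best = sum_squared_residuals(a, b, xs, ys)

print(f"a = {a:.2f}, b = {b:.2f}, sum of squared residuals = {best:.4f}")
print("shifted intercept:", sum_squared_residuals(a + 0.1, b, xs, ys) > best)
print("shifted slope:    ", sum_squared_residuals(a, b + 0.1, xs, ys) > best)
```

The residuals of the fitted line also sum to zero (up to rounding), which is a property of least squares with an intercept.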


Bivariate Linear Regression Analysis

Mathematical model of bivariate linear relation: A line in a two-variable space is defined by the following equation. The Y variable can be expressed in terms of a constant (a) and a slope (b) times the X variable.

Y = a + bX ± Random Error

Y: Dependent Variable
X: Independent Variable
a: Constant
b: Coefficient of Slope

The regression coefficients "a" and "b" are calculated with the "LEAST SQUARES METHOD".

[Figure: Scatter Diagram and Regression Equation Y=a+bX. (*) The distribution of individuals in the sample space according to x, y values]

Calculation of Regression Equation

$$b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} = \frac{\sum xy - (\sum x \sum y / n)}{SD_x^2 \cdot n}$$

If the number of cases is less than 30, n-1 is used instead of n.

The line y=a+bx must pass through the intersection of x̄ and ȳ. Hence ȳ = a + b x̄, and the constant is calculated as a = ȳ - b x̄.
Example: The findings of the study of 20 women are presented in the table, to investigate whether there is a relationship between the number of pregnancies (x) and hemoglobin (y) values.

$$b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$$

b = [1075 - ((96*236)/20)] / [616 - (96²/20)] = -0.37

y = a + bx, with a = ȳ - b x̄

a = (236/20) - [-0.37 * (96/20)] = 13.57

y = 13.57 – 0.37 x

[Figure: fitted regression line of Hemoglobin Values (Hemoglobin Değerleri) against Number of Pregnancies (Gebelik Sayısı, 0 to 12), with fitted values marked at x: 0, x: 5 and x: 10]

To interpret the direction of the relationship between variables, one looks at the signs (plus or minus) of the regression or β coefficients.
If a β coefficient is positive, then the relationship of this variable with the dependent variable is positive; if the β coefficient is negative then the relationship is negative.
Of course, if the β coefficient is equal to 0 then there is no linear relationship between the variables.
β, the slope, gives the amount of change in the dependent variable when the independent variable changes by one unit.
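The slope, intercept and fitted values for the pregnancy and hemoglobin example can be reproduced from the sums (n=20, ∑x=96, ∑y=236, ∑xy=1075, ∑x²=616). This is a sketch with variable and function names chosen here; the fitted values at x = 0, 5 and 10 are computed by the code, since the figure's values did not survive extraction.

```python
# Sums for the pregnancy (x) and hemoglobin (y) example.
n, sum_x, sum_y, sum_xy, sum_x2 = 20, 96, 236, 1075, 616

x_bar = sum_x / n
y_bar = sum_y / n

# Least squares slope and intercept.
b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
a = y_bar - b * x_bar

def predict(x):
    """Fitted hemoglobin value for a given number of pregnancies."""
    return a + b * x

print(f"b = {b:.4f}, a = {a:.4f}")
for x in (0, 5, 10):
    print(f"x = {x:2d} -> fitted y = {predict(x):.2f}")
```

With the unrounded slope the intercept comes out near 13.59; the slide's 13.57 results from rounding b to -0.37 first. Each additional pregnancy is associated with a drop of about 0.37 units in the fitted hemoglobin value, and the line passes through (x̄, ȳ) = (4.8, 11.8).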

Classical assumptions for regression analysis include:

1. The sample must be representative of the population for the inference prediction.
2. The error is assumed to be a random variable with a mean of zero conditional on the explanatory variables.
3. The independent variables are error-free. If this is not so, modeling may be done using errors-in-variables model techniques.
4. The predictors must be linearly independent, i.e. it must not be possible to express any predictor as a linear combination of the others.
5. The errors are uncorrelated, that is, the variance-covariance matrix of the errors is diagonal and each non-zero element is the variance of the error.
6. The variance of the error is constant across observations (homoscedasticity).

Correlation Analysis in SPSS

(SPSS: Statistical Package for the Social Sciences)
• Regression and Correlation analyses are two complementary methods.
• If only correlation analysis is to be performed, enter the data into the data page and run the analysis from this menu: Analyze>Correlation>Bivariate
• Correlation findings are also included in the final table of regression analysis.


Dialog box of Correlation Analysis

Regression Analysis in SPSS

Analyze>Regression>Linear

Correlation and Regression Outputs of "Math. and Intelligence (IQ)_Points"

CORRELATION ANALYSIS

Reading r and p from the output: a; sign (+), b; strength level (0.93), c; significance (P<0.001***)

REGRESSION ANALYSIS

Interpretation: The mathematical model of the relationship between Math and IQ scores is shown by the linear regression equation.

Math Point = -11.925 + (1.111 * IQ Point)
