Ppt. Correlation and Regression

Uploaded by

jaredsioson456

Prof. Lhas Malana Barsabal
Biostatistics with Epidemiology
Cagayan State University

 Definition:
 Noun: the mutual relation of two or more things, parts, etc.
 Correlation is a statistical tool designed to measure the extent of the relationship or association between two variables (the dependent and the independent variable).
 Statistics: the degree to which two or more attributes or measurements on the same group of elements show a tendency to vary together.
 Pearson Product Moment Correlation (r)
 Spearman Rank Correlation
 In correlation analysis, we estimate a sample correlation coefficient, more specifically the Pearson Product Moment correlation coefficient.
 The sample correlation coefficient, denoted r, ranges between -1 and +1 and quantifies the direction and strength of the linear association between the two variables.
 The correlation between two variables can be positive or negative.
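The slides do not show how r is computed, so the following is only a minimal sketch using the standard definition of the Pearson coefficient (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the two small data sets are hypothetical and are not taken from the slides.

```python
import math

def pearson_r(x, y):
    """Sample Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration data (not from the slides).
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])    # y rises with x
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # y falls as x rises
print(r_up, r_down)
```

The first pair lies exactly on an increasing line and the second on a decreasing line, so they sit at the two ends of the -1 to +1 range.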
Spearman rank correlation: r = 1 - 6 ΣDi² / (n(n² - 1))
n = number of data points of the two variables
Di = difference in rank of the ith element
 A study is conducted involving 17
infants to investigate the association
between gestational age at birth,
measured in weeks, and birth weight,
measured in grams.
r = 0.82
 As we noted, sample correlation coefficients
range from -1 to +1. In practice, meaningful
correlations (i.e., correlations that are
clinically or practically important)
 An investigator wants to arrange the 15 items on her scale of language impairment on the basis of the order in which language skills appear in development. Not being entirely confident that she has selected the correct ordering of skills, she asks another professional to rank the items from 1 to 15 in terms of the order in which he thinks they should appear. The data are given below:
 Investigator: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
 Consultant: 1 3 2 4 7 5 6 8 10 9 11 12 15 13 14
Rank of I   Rank of C   Difference in rank (d)   d squared
1           1            0                       0
2           3           -1                       1
3           2            1                       1
4           4            0                       0
5           7           -2                       4
6           5            1                       1
7           6            1                       1
8           8            0                       0
9           10          -1                       1
10          9            1                       1
11          11           0                       0
12          12           0                       0
13          15          -2                       4
14          13           1                       1
15          14           1                       1
Sum of d squared = 16
 Using the Spearman formula with n = 15:
 r = 1 - 6(16) / (15(15² - 1)) = 1 - 96/3360 ≈ .97
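As a check on the worked example, here is a small sketch that applies the Spearman formula r = 1 - 6 ΣDi² / (n(n² - 1)) to the investigator and consultant rankings listed above. The function name `spearman_r` is hypothetical, and the shortcut formula assumes there are no tied ranks, which holds for these data.

```python
def spearman_r(rank_a, rank_b):
    """Spearman rank correlation via the shortcut formula (assumes no tied ranks)."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("both rankings must have the same length (at least 2)")
    # Sum of squared rank differences, one difference per item.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Rankings of the 15 items as given in the example.
investigator = list(range(1, 16))
consultant = [1, 3, 2, 4, 7, 5, 6, 8, 10, 9, 11, 12, 15, 13, 14]

r = spearman_r(investigator, consultant)
print(round(r, 2))  # 0.97
```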
 1. In a study of diagnostic processes, entering clinical graduate students are shown a 20-minute videotape of children’s behavior and asked to rank-order 10 behavioral events on the tape in the order of the importance each has for a behavioral assessment (1 = most important). The data are then averaged to produce an average rank ordering for the entire class. The same thing was then done using experienced clinicians. The data follow:
Events:                  1 2 3 4 5 6 7  8 9 10
Experienced Clinicians:  1 3 2 7 5 4 8  6 9 10
New Students:            2 4 1 6 5 3 10 8 7 9
 2. The data here represent the approximate time in hours spent training for a marathon (x) and the approximate time in hours to complete the marathon (y).
(a) Compute a Pearson Product Moment correlation on the following data; (b) determine if the relationship is significant; (c) sketch a scatterplot of the data; (d) what is the standard error of estimate?
x   y
8   2
4   2
2   4
1   5
5   2
 3. The raw data in the table below are used to calculate the correlation between a person's IQ and the number of hours spent in front of the TV per week.
IQ    Hours of TV per week
106   7
100   27
86    2
101   50
99    28
103   29
97    20
113   12
112   6
110   17
 The simple linear regression model is represented by:
y = β0 + β1x + ε
 The two variables involved in simple linear regression analysis are designated x and y. The equation that describes how y is related to x is known as the regression model.

 The linear regression model contains an error
term that is represented by ε. The error term
is used to account for the variability in y that
cannot be explained by the linear
relationship between x and y. If ε were not
present, that would mean that
knowing x would provide enough information
to determine the value of y.
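To make the role of the error term concrete, the sketch below generates y values from y = β0 + β1x + ε. Everything in it is an assumption for illustration: the function name `simulate_y`, the parameter values, and the choice of normally distributed errors (the slides do not specify a distribution for ε).

```python
import random

def simulate_y(x_values, beta0, beta1, sigma, seed=0):
    """Generate y = beta0 + beta1 * x + error, with error ~ Normal(0, sigma).

    The normal error distribution is an assumption made for this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in x_values]

x_values = [0, 1, 2, 3, 4]

# With no error (sigma = 0), knowing x determines y exactly.
exact = simulate_y(x_values, beta0=1.0, beta1=2.0, sigma=0.0)
print(exact)

# With error present, the same x values give y values scattered around the line.
noisy = simulate_y(x_values, beta0=1.0, beta1=2.0, sigma=0.5)
print(noisy)
```

Setting sigma to 0 corresponds to the case described above in which ε is absent and x alone determines y.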

 The simple linear regression equation, E(y) = β0 + β1x, is graphed as a straight line, where:
 β0 is the y-intercept of the regression line.
 β1 is the slope.
 E(y) is the mean or expected value of y for a given value of x.
 A regression line can show a positive linear relationship, a negative linear relationship, or no relationship.
 No relationship: The graphed line in a simple linear regression is flat (not sloped). There is no linear relationship between the two variables.
 Positive relationship: The regression line slopes upward, with its lower end at the y-intercept on the y-axis and its upper end extending upward into the graph field, away from the x-axis. There is a positive linear relationship between the two variables: as the value of one increases, the value of the other also increases.
 Negative relationship: The regression line slopes downward, with its upper end at the y-intercept on the y-axis and its lower end extending downward into the graph field, toward the x-axis. There is a negative linear relationship between the two variables: as the value of one increases, the value of the other decreases.
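 The Estimated Linear Regression Equation
 If the parameters of the population were known, the simple linear regression equation (shown below) could be used to compute the mean value of y for a known value of x.
 E(y) = β0 + β1x
 Unlike the regression model, this equation has no error term ε, because it describes the mean of y rather than an individual value of y.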
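In practice β0 and β1 are unknown and have to be estimated from sample data. The slides do not say which estimation method is used, so the sketch below assumes ordinary least squares, the usual choice for simple linear regression. The function name `fit_line`, the estimate names `b0` and `b1`, and the data are all hypothetical.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("the slope is undefined when x is constant")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx               # estimated slope
    b0 = mean_y - b1 * mean_x    # estimated y-intercept
    return b0, b1

# Hypothetical data lying exactly on the line y = 1 + 2x.
b0, b1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)

# Estimated mean of y at a chosen x value (here x = 10).
print(b0 + b1 * 10)
```

Because the illustrative points fall exactly on a line, the estimates recover that line's intercept and slope; with real data the fitted line only approximates the points.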
Comparison Between Correlation and Regression

Basis: Meaning
Correlation: A statistical measure that defines the co-relationship or association of two variables.
Regression: Describes how an independent variable is associated with the dependent variable.

Basis: Dependent and independent variables
Correlation: No difference.
Regression: Both variables are different.

Basis: Usage
Correlation: To describe a linear relationship between two variables.
Regression: To fit the best line and estimate one variable based on another variable.

Basis: Objective
Correlation: To find a value expressing the relationship between variables.
Regression: To estimate values of a random variable based on the values of a fixed variable.
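The two techniques are closely linked numerically. As a sketch under the assumption of a least-squares fit (which the slides do not state explicitly), the slope of the fitted line equals r multiplied by the ratio of the standard deviation of y to that of x, so the slope and r always share the same sign. The function name `r_and_slope` and the data are hypothetical.

```python
import math

def r_and_slope(x, y):
    """Return (r, least-squares slope, ratio of std. dev. of y to std. dev. of x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)   # correlation: unit-free, between -1 and +1
    slope = sxy / sxx                # regression slope: in units of y per unit of x
    return r, slope, math.sqrt(syy / sxx)

# Hypothetical illustration data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [20, 10, 40, 30, 50]
r, slope, sd_ratio = r_and_slope(x, y)
print(r, slope, r * sd_ratio)
```

Correlation summarizes the strength of the association in a single unit-free number, while the regression slope expresses it in the units of y per unit of x, which is what makes prediction possible.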
