
CHAPTER 9

SIMPLE LINEAR REGRESSION AND CORRELATION

9.1 INTRODUCTION
9.2 THE REGRESSION MODEL
9.3 THE SAMPLE REGRESSION EQUATION
9.4 EVALUATING THE REGRESSION EQUATION
9.5 USING THE REGRESSION EQUATION
9.6 THE CORRELATION MODEL
9.7 THE CORRELATION COEFFICIENT
9.8 SOME PRECAUTIONS
9.9 SUMMARY

9.1 INTRODUCTION

In analyzing data for the health sciences disciplines, we find that it is frequently desirable to learn something about the relationship between two numeric variables. We may, for example, be interested in studying the relationship between blood pressure and age, height and weight, the concentration of an injected drug and heart rate, the consumption level of some nutrient and weight gain, the intensity of a stimulus and reaction time, or total family income and medical care expenditures. The nature and strength of the relationships between variables such as these may be examined by regression and correlation analysis, two statistical techniques that, although related, serve different purposes.

Regression  Regression analysis is helpful in assessing specific forms of the relationship between variables, and the ultimate objective when this method of analysis is employed usually is to predict or estimate the value of one variable corresponding to a given value of another variable. The ideas of regression were first elucidated by the English scientist Sir Francis Galton (1822-1911) in reports of his research on heredity, first in sweet peas and later in human stature. He described a tendency of adult offspring, having either short or tall parents, to revert back toward the average height of the general population. He first used the word reversion, and later regression, to refer to this phenomenon.

Correlation  Correlation analysis, on the other hand, is concerned with measuring the strength of the relationship between variables. When we compute measures of correlation from a set of data, we are interested in the degree of the correlation between variables. Again, the concepts and terminology of correlation analysis originated with Galton, who first used the word correlation in 1888.

In this chapter our discussion is limited to the exploration of the linear relationship between two variables. The concepts and methods of regression are covered first, beginning in the next section. In Section 9.6 the ideas and techniques of correlation are introduced. In the next chapter we consider the case where there is an interest in the relationships among three or more variables.

Regression and correlation analysis are areas in which the speed and accuracy of a computer are most appreciated. The data for the exercises of this chapter, therefore, are presented in a way that makes them suitable for computer processing. As is always the case, the input requirements and output features of the particular programs and software packages to be used should be studied carefully.

9.2 THE REGRESSION MODEL
In the typical regression problem, as in most problems in applied statistics, researchers have available for analysis a sample of observations from some real or hypothetical population. Based on the results of their analysis of the sample data, they are interested in reaching decisions about the population from which the sample is presumed to have been drawn. It is important, therefore, that the researchers understand the nature of the population in which they are interested. They should know enough about the population to be able either to construct a mathematical model for its representation or to determine if it reasonably fits some established model. A researcher about to analyze a set of data by the methods of simple linear regression, for example, should be secure in the knowledge that the simple linear regression model is, at least, an approximate representation of the population. It is unlikely that the model will be a perfect portrait of the real situation, since this characteristic is seldom found in models of practical value. A model constructed so that it corresponds precisely with the details of the situation is usually too complicated to yield any information of value. On the other hand, the results obtained from the analysis of data that have been forced into a model that does not fit are also worthless. Fortunately, however, a perfectly fitting model is not a requirement for obtaining useful results. Researchers, then, should be able to distinguish between the occasion when their chosen models and the data are sufficiently compatible for them to proceed and the case where their chosen model must be abandoned.

Assumptions Underlying Simple Linear Regression  In the simple linear regression model two variables, usually labeled X and Y, are of interest. The letter X is usually used to designate a variable referred to as the independent variable, since frequently it is controlled by the investigator; that is, values of X may be selected by the investigator and, corresponding to each preselected value of X, one or more values of another variable, labeled Y, are obtained. The variable Y, accordingly, is called the dependent variable, and we speak of the regression of Y on X. The following are the assumptions underlying the simple linear regression model.

1. Values of the independent variable X are said to be "fixed." This means that the values of X are preselected by the investigator so that in the collection of the data they are not allowed to vary from these preselected values. In this model, X is referred to by some writers as a nonrandom variable and by others as a mathematical variable. It should be pointed out at this time that the statement of this assumption classifies our model as the classical regression model. Regression analysis also can be carried out on data in which X is a random variable.
2. The variable X is measured without error. Since no measuring procedure is perfect, this means that the magnitude of the measurement error in X is negligible.
3. For each value of X there is a subpopulation of Y values. For the usual inferential procedures of estimation and hypothesis testing to be valid, these subpopulations must be normally distributed. In order that these procedures may be presented it will be assumed that the Y values are normally distributed in the examples and exercises that follow.
4. The variances of the subpopulations of Y are all equal.
5. The means of the subpopulations of Y all lie on the same straight line. This is known as the assumption of linearity. This assumption may be expressed symbolically as

   μy|x = α + βx    (9.2.1)

   where μy|x is the mean of the subpopulation of Y values for a particular value of X, and α and β are called population regression coefficients. Geometrically, α and β represent the y-intercept and slope, respectively, of the line on which all the means are assumed to lie.
6. The Y values are statistically independent. In other words, in drawing the sample, it is assumed that the values of Y chosen at one value of X in no way depend on the values of Y chosen at another value of X.

These assumptions may be summarized by means of the following equation, which is called the regression model:

   y = α + βx + e    (9.2.2)

where y is a typical value from one of the subpopulations of Y, α and β are as defined for Equation 9.2.1, and e is called the error term. If we solve 9.2.2 for e, we have

   e = y - (α + βx)
     = y - μy|x    (9.2.3)

and we see that e shows the amount by which y deviates from the mean of the subpopulation of Y values from which it is drawn. As a consequence of the assumption that the subpopulations of Y values are normally distributed with equal variances, the e's for each subpopulation are normally distributed with a variance equal to the common variance of the subpopulations of Y values.
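To make Equation 9.2.2 concrete, the following Python sketch simulates draws from such a population. The parameter values (α = 10, β = 2, σ = 3) are hypothetical, chosen only for illustration and not taken from any data in this chapter. At each fixed X, the simulated Y values scatter around the subpopulation mean μy|x = α + βx, as assumptions 3 through 5 require.

```python
import random

# Hypothetical population parameters, chosen only for illustration
# (they are not estimates from any data in this chapter).
alpha, beta, sigma = 10.0, 2.0, 3.0

random.seed(1)

def draw_y(x):
    """Draw one Y from the subpopulation at a fixed X, per Equation 9.2.2:
    y = alpha + beta*x + e, with e normal, mean 0, common variance sigma**2."""
    e = random.gauss(0.0, sigma)
    return alpha + beta * x + e

# At each preselected ("fixed") X, the sample mean of many Y draws should be
# close to the subpopulation mean mu_{y|x} = alpha + beta*x (Equation 9.2.1).
for x in (5, 10, 15):
    ys = [draw_y(x) for _ in range(20000)]
    print(x, round(sum(ys) / len(ys), 2), alpha + beta * x)
```

Running the sketch at several X values shows each subpopulation mean landing on the assumed straight line, which is exactly the linearity assumption.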
The following acronym will help the reader remember most of the assumptions necessary for inference in linear regression analysis:

LINE [Linear (assumption 5), Independent (assumption 6), Normal (assumption 3), Equal variances (assumption 4)]

A graphical representation of the regression model is given in Figure 9.2.1.

FIGURE 9.2.1  Representation of the simple linear regression model.



9.3 THE SAMPLE REGRESSION EQUATION


In simple linear regression the object of the researcher's interest is the population regression equation, the equation that describes the true relationship between the dependent variable Y and the independent variable X.

In an effort to reach a decision regarding the likely form of this relationship, the researcher draws a sample from the population of interest and, using the resulting data, computes a sample regression equation that forms the basis for reaching conclusions regarding the unknown population regression equation.

Steps in Regression Analysis  In the absence of extensive information regarding the nature of the variables of interest, a frequently employed strategy is to assume initially that they are linearly related. Subsequent analysis, then, involves the following steps.

1. Determine whether or not the assumptions underlying a linear relationship are met in the data available for analysis.
2. Obtain the equation for the line that best fits the sample data.
3. Evaluate the equation to obtain some idea of the strength of the relationship and the usefulness of the equation for predicting and estimating.
4. If the data appear to conform satisfactorily to the linear model, use the equation obtained from the sample data to predict and to estimate.

When we use the regression equation to predict, we will be predicting the value Y is likely to have when X has a given value. When we use the equation to estimate, we will be estimating the mean of the subpopulation of Y values assumed to exist at a given value of X. Note that the sample data used to obtain the regression equation consist of known values of both X and Y. When the equation is used to predict and to estimate Y, only the corresponding values of X will be known. We illustrate the steps involved in simple linear regression analysis by means of the following example.

EXAMPLE 9.3.1

Després et al. (A-1) point out that the topography of adipose tissue (AT) is associated with metabolic complications considered as risk factors for cardiovascular disease. It is important, they state, to measure the amount of intra-abdominal AT as part of the evaluation of the cardiovascular disease risk of an individual. Computed tomography (CT), the only available technique that precisely and reliably measures the amount of deep abdominal AT, however, is costly and requires irradiation of the subject. In addition, the technique is not available to many physicians. Després and his colleagues conducted a study to develop equations to predict the amount of deep abdominal AT from simple anthropometric measurements. Their subjects were men between the ages of 18 and 42 years who were free from metabolic disease that would require treatment. Among the measurements taken on each subject were deep abdominal AT obtained by CT and waist circumference as shown in Table 9.3.1. A question of interest is how well one can predict and estimate deep abdominal AT from a knowledge of waist circumference. This question is typical of those that can be answered by means of regression analysis. Since deep abdominal AT is the variable about which we wish to make predictions and estimations, it is the dependent variable. The variable waist measurement, knowledge of which will be used to make the predictions and estimations, is the independent variable.

TABLE 9.3.1  Waist Circumference (cm), X, and Deep Abdominal AT, Y, of 109 Men

Subject      X       Y    Subject      X       Y    Subject      X       Y
   1     74.75   25.72      38     103.00  129.00     75     108.00  217.00
   2     72.60   25.89      39      80.00   74.02     76     100.00  140.00
   3     81.80   42.60      40      79.00   55.48     77     103.00  109.00
   4     83.95   42.80      41      83.50   73.13     78     104.00  127.00
   5     74.65   29.84      42      76.00   50.50     79     106.00  112.00
   6     71.85   21.68      43      80.50   50.88     80     109.00  192.00
   7     80.90   29.08      44      86.50  140.00     81     103.50  132.00
   8     83.40   32.98      45      83.00   96.54     82     110.00  126.00
   9     63.50   11.44      46     107.10  118.00     83     110.00   r$.00
  10     73.20   32.22      47      94.30  107.00     84     112.00  158.00
  11     71.90   28.32      48      94.50  123.00     85     108.50  183.00
  12     75.00   43.86      49      79.70   65.92     86     104.00  184.00
  13     73.10   38.21      50      79.30   81.29     87     111.00  121.00
  14     79.00   42.48      51      89.80  111.00     88     108.50  159.00
  15     77.00   30.96      52      83.80   90.73     89     121.00  245.00
  16     68.85   55.78      53      85.20  133.00     90     109.00  137.00
  17     75.95   43.78      54      75.50   41.90     91      97.50  165.00
  18     74.15   33.41      55      78.40   41.71     92     105.50  152.00
  19     73.80   43.35      56      78.60   58.16     93      98.00  181.00
  20     75.90   29.31      57      87.80   88.85     94      94.50   80.95
  21     76.85   36.60      58      86.30  155.00     95      97.00  137.00
  22     80.90   40.25      59      85.50   70.77     96     105.00  125.00
  23     79.90   35.43      60      83.70   75.08     97     106.00  241.00
  24     89.20   60.09      61      77.60   57.05     98      99.00  134.00
  25     82.00   45.84      62      84.90   99.73     99      91.00  150.00
  26     92.00   70.40      63      79.80   27.96    100     102.50  198.00
  27     86.60   83.45      64     108.30  123.00    101     106.00  151.00
  28     80.50   84.30      65     119.60   90.41    102     109.10  229.00
  29     86.00   78.89      66     119.90  106.00    103     115.00  253.00
  30     82.50   64.75      67      96.50  144.00    104     101.00  188.00
  31     83.50   72.56      68     105.50  121.00    105     100.10  124.00
  32     88.10   89.31      69     105.00   97.13    106      93.30   62.20
  33     90.80   78.94      70     107.00  166.00    107     101.80  133.00
  34     89.40   83.55      71     107.00   87.99    108     107.90  208.00
  35    102.00  127.00      72     101.00  154.00    109     108.50  208.00
  36     94.50  121.00      73      97.00  100.00
  37     91.00  107.00      74     100.00  123.00

SOURCE: Jean-Pierre Després, Ph.D. Used with permission.



The Scatter Diagram

A first step that is usually useful in studying the relationship between two variables is to prepare a scatter diagram of the data such as is shown in Figure 9.3.1. The points are plotted by assigning values of the independent variable X to the horizontal axis and values of the dependent variable Y to the vertical axis.

The pattern made by the points plotted on the scatter diagram usually suggests the basic nature and strength of the relationship between two variables. As we look at Figure 9.3.1, for example, the points seem to be scattered around an invisible straight line. The scatter diagram also shows that, in general, subjects with larger waist circumferences also have larger amounts of deep abdominal AT. These impressions suggest that the relationship between the two variables may be described by a straight line crossing the Y-axis below the origin and making approximately a 45-degree angle with the X-axis. It looks as if it would be simple to draw, freehand, through the data points the line that describes the relationship between X and Y. It is highly unlikely, however, that the lines drawn by any two people would be exactly the same. In other words, for every person drawing such a line by eye, or freehand, we would expect a
FIGURE 9.3.1  Scatter diagram of data shown in Table 9.3.1 (deep abdominal AT, Y, versus waist circumference (cm), X).

slightly different line. The question then arises as to which line best describes the relationship between the two variables. We cannot obtain an answer to this question by inspecting the lines. In fact, it is not likely that any freehand line drawn through the data will be the line that best describes the relationship between X and Y, since freehand lines will reflect any defects of vision or judgment of the person drawing the line. Similarly, when judging which of two lines best describes the relationship, subjective evaluation is liable to the same deficiencies.

What is needed for obtaining the desired line is some method that is not fraught with these difficulties.
The Least-Squares Line

The method usually employed for obtaining the desired line is known as the method of least squares, and the resulting line is called the least-squares line. The reason for calling the method by this name will be explained in the discussion that follows.

We recall from algebra that the general equation for a straight line may be written as

   y = a + bx    (9.3.1)

where y is a value on the vertical axis, x is a value on the horizontal axis, a is the point where the line crosses the vertical axis, and b shows the amount by which y changes for each unit change in x. We refer to a as the y-intercept and b as the slope of the line. To draw a line based on Equation 9.3.1, we need the numerical values of the constants a and b. Given these constants, we may substitute various values of x into the equation to obtain corresponding values of y. The resulting points may be plotted. Since any two such points determine a straight line, we may select any two, locate them on a graph, and connect them to obtain the line corresponding to the equation.
Obtaining the Least-Squares Line

The least-squares regression line equation may be obtained from sample data by simple arithmetic calculations that may be carried out by hand. Since the necessary hand calculations are time consuming, tedious, and subject to error, the regression line equation is best obtained through the use of a computer software package. Although the typical researcher need not be concerned with the arithmetic involved, the interested reader will find it discussed in references listed at the end of this chapter.

For the data in Table 9.3.1 we obtain the least-squares regression equation by means of MINITAB. After entering the X values in Column 1 and the Y values in Column 2 we proceed as shown in Figure 9.3.2.

For now, the only information from the output in Figure 9.3.2 that we are interested in is the regression equation. Other information in the output will be discussed later.

From Figure 9.3.2 we see that the linear equation for the least-squares line that describes the relationship between waist circumference and deep abdominal AT may be written, then, as

   ŷ = -216 + 3.46x    (9.3.2)

Dialog box:                                    Session command:

Stat > Regression > Regression                 MTB > Name C3 = 'FITS1' C4 = 'RESI1'
Type y in Response and x in Predictors.        MTB > Regress 'y' 1 'x';
Click Storage. Check Residuals and Fits.       SUBC> Fits 'FITS1';
Click OK.                                      SUBC> Constant;
                                               SUBC> Residuals 'RESI1'.

Output:

Regression Analysis: y versus x

The regression equation is
y = -216 + 3.46 x

Predictor       Coef      Stdev    t-ratio        p
Constant     -215.98      21.80      -9.91    0.000
x             3.4589     0.2347      14.74    0.000

s = 33.06     R-sq = 67.0%     R-sq(adj) = 66.7%

Analysis of Variance

SOURCE           DF          SS          MS          F        p
Regression        1      237549      237549     217.28    0.000
Error           107      116982        1093
Total           108      354531

Unusual Observations
Obs.      x         y       Fit   Stdev.Fit   Residual   St.Resid
  58     86    155.00     82.52        3.43      72.48      2.20R
  65    120     90.41    197.70        7.21    -107.29     -3.33R
  66    120    106.00    198.74        7.29     -92.74     -2.88R
  71    107     87.99    154.12        4.75     -66.13     -2.02R
  97    106    241.00    150.66        4.58      90.34      2.16R
 102    109    229.00    161.38        5.13      67.62      2.01R
 103    115    253.00    181.79        6.28      71.21      2.19R

R denotes an obs. with a large st. resid.

FIGURE 9.3.2  MINITAB procedure and output for obtaining the least-squares regression equation from the data in Table 9.3.1.
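The hand calculations that MINITAB replaces can be sketched in a few lines of Python using the usual least-squares formulas, b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b·x̄ (these formulas are standard, though not derived in this section). For brevity the sketch uses only the first six subjects of Table 9.3.1, so its coefficients will not match the full-sample values a = -216 and b = 3.46 of Figure 9.3.2.

```python
def least_squares(xs, ys):
    """Least-squares coefficients for y-hat = a + b*x:
    b = sum((x - xbar)*(y - ybar)) / sum((x - xbar)**2),  a = ybar - b*xbar."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
        sum((x - xbar) ** 2 for x in xs)
    a = ybar - b * xbar
    return a, b

# Waist circumference X and deep abdominal AT Y for the first six subjects
# of Table 9.3.1 only; the full 109-subject sample gives a = -216, b = 3.46.
xs = [74.75, 72.60, 81.80, 83.95, 74.65, 71.85]
ys = [25.72, 25.89, 42.60, 42.80, 29.84, 21.68]
a, b = least_squares(xs, ys)
print(round(a, 2), round(b, 2))
```

Running the same function on all 109 pairs from Table 9.3.1 would reproduce the MINITAB coefficients.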

This equation tells us that since a is negative, the line crosses the Y-axis below the origin, and that since b, the slope, is positive, the line extends from the lower left-hand corner of the graph to the upper right-hand corner. We see further that for each unit increase in x, ŷ increases by an amount equal to 3.46. The symbol ŷ denotes a value of y computed from the equation, rather than an observed value of Y.

By substituting two convenient values of X into Equation 9.3.2, we may obtain the necessary coordinates for drawing the line. Suppose, first, we let X = 70 and obtain

   ŷ = -216 + 3.46(70) = 26.2

If we let X = 110 we obtain

   ŷ = -216 + 3.46(110) = 164.6

The line, along with the original data, is shown in Figure 9.3.3.

FIGURE 9.3.3  Original data and least-squares line for Example 9.3.1.
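As a quick check on the arithmetic, the following sketch evaluates Equation 9.3.2, with its rounded coefficients, at the two chosen values of X (the unrounded MINITAB coefficients would give slightly different numbers).

```python
# Fitted line of Equation 9.3.2 (rounded coefficients -216 and 3.46).
def predict(x):
    """Predicted deep abdominal AT (y-hat) for waist circumference x (cm)."""
    return -216 + 3.46 * x

print(round(predict(70), 1))    # first coordinate for drawing the line
print(round(predict(110), 1))   # second coordinate
```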

The Least-Squares Criterion  Now that we have obtained what we call the "best" line for describing the relationship between our two variables, we need to determine by what criterion it is considered best. Before the criterion is stated, let us examine Figure 9.3.3. We note that generally the least-squares line does not pass through the observed points that are plotted on the scatter diagram. In other words, most of the observed points deviate from the line by varying amounts. The line we have drawn is best in this sense:

The sum of the squared vertical deviations of the observed data points (y) from the least-squares line is smaller than the sum of the squared vertical deviations of the data points from any other line.

In other words, if we square the vertical distance from each observed point (y) to the least-squares line and add these squared values for all points, the resulting total will be smaller than the similarly computed total for any other line that can be drawn through the points. For this reason the line we have drawn is called the least-squares line.
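The criterion can be verified numerically: for the least-squares a and b, no other line yields a smaller sum of squared vertical deviations. The sketch below uses a small hypothetical data set (not taken from Table 9.3.1) and the standard least-squares formulas, then compares the least-squares total against several perturbed lines.

```python
def sse(a, b, xs, ys):
    """Sum of squared vertical deviations of the points from the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# A small hypothetical data set, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

# Least-squares coefficients from the usual formulas.
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sum((x - xbar) ** 2 for x in xs)
a = ybar - b * xbar

best = sse(a, b, xs, ys)
# Perturbing a and/or b in any direction can only increase the total.
for da, db in [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.1), (0.3, -0.1)]:
    assert sse(a + da, b + db, xs, ys) > best
print(round(a, 2), round(b, 2), round(best, 3))
```

Each perturbed line fails the criterion, which is exactly what makes the least-squares line "best" in the stated sense.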

EXERCISES

9.3.1  Plot each of the following regression equations on graph paper and state whether X and Y are directly or inversely related.
(a) ŷ = -3 + 2x
(b) ŷ = 3 + 0.5x
(c) ŷ = 70 - 0.75x

9.3.2  The following scores represent a nurse's assessment (X) and a physician's assessment (Y) of the condition of 10 patients at time of admission to a trauma center.

X: 18  13  18  15  10  12   8   4   7   3
Y: 23  20  18  16  14  11  10   7   6   4

(a) Construct a scatter diagram for these data.
(b) Plot the following regression equations on the scatter diagram and indicate which one you think best fits the data. State the reason for your choice.
(1) ŷ = 8 + 0.5x
(2) ŷ = -70 + 2x
(3) ŷ = 1 + 1x

For each of the following exercises (a) draw a scatter diagram and (b) obtain the regression equation and plot it on the scatter diagram.

9.3.3  Methadone is often prescribed in the treatment of opioid addiction and chronic pain. Krantz et al. (A-2) studied the relationship between dose of methadone and the corrected QT (QTc) interval for 17 subjects who developed torsade de pointes (ventricular tachycardia nearly always due to medications). QTc is calculated from an electrocardiogram and is measured in mm/sec. A higher QTc value indicates a higher risk of cardiovascular mortality. A question of interest is how well one can predict and estimate the QTc value from a knowledge of methadone dose. This question is typical of those that can be answered by means of regression analysis. Since QTc is the variable about which we wish to make predictions and estimations, it is the dependent variable. The variable methadone dose, knowledge of which will be used to make the predictions and estimations, is the independent variable.

Methadone Dose (mg/day)   QTc (mm/sec)   Methadone Dose (mg/day)   QTc (mm/sec)
1000                      600            650                       785
 550                      625            600                       765
  97                      560            660                       611
  90                      585            270                       600
 .'J                      590            680                       625
 126                      500            540                       650
 300                      700            600                       635
 110                      570            330                       522
  65                      540

SOURCE: Mori J. Krantz, Ilana B. Kutinsky, Alastair D. Robertson, and Philip S. Mehler, "Dose-Related Effects of Methadone on QT Prolongation in a Series of Patients with Torsade de Pointes," Pharmacotherapy, 22 (2002), 802-805.

9.3.4  Reiss et al. (A-3) compared point-of-care and standard hospital laboratory assays for monitoring patients receiving a single anticoagulant or a regimen consisting of a combination of anticoagulants. It is quite common when comparing two measuring techniques, to use regression analysis in which one variable is used to predict another. In the present study, the researchers obtained measures of international normalized ratio (INR) by assay of capillary and venous blood samples collected from 90 subjects taking warfarin. INR, used especially when patients are receiving warfarin, measures the clotting ability of the blood. Point-of-care testing for INR was conducted with the CoaguChek assay product. Hospital testing was done with standard hospital laboratory assays. The authors used the hospital assay INR level to predict the CoaguChek INR level. The measurements are given in the following table.

CoaguChek   Hospital   CoaguChek   Hospital   CoaguChek   Hospital
(Y)         (X)        (Y)         (X)        (Y)         (X)
1.8 1.6 2.4 t.2 3.1 2.4
1.6 1.9 z.J z-J I.7 1.8
2.8 2.0 I.6 1.8 1.6
1.9 2.4 3.8
J.J
1.9 1.7
1.3 i.5 1,9 1.6 5.3 A '
a.J l.B t.B 1.5 1.6 1.6
1.2 1.3 2.8 l.B 1.6 1.4
t ? 2.4 2.5 i.5 3.3
2.0 2.I 0.B 1.0 1.5 1.5
1.5 1.5 1.3 1.2 2.2 t o
2.I 2.4 1.4
J. {
1.1 1.6
1.5 1.5 2.4 r.6 2.6 2.6
(Continued)

CoaguChek   Hospital   CoaguChek   Hospital   CoaguChek   Hospital
(Y)         (X)        (Y)         (X)        (Y)         (X)
1.5 1.7 4.r 2 D
6.4 5.0
1.8 2.I 2.4 r.2 1.5 1.4
1.0 r.2 2.3 t e 3.0 2.8
2.L 1.9 3.1 1.6 2.6 2.3
1.6 1.6 1.5 r.4 t.2 r.2
r.7 1.6 3.6 2.1 2.I 1.9
2.0 1.9 2.5 t.7 1.1 1.1
lB 1.6 2.r r.7 1.0 1.0
1.3 4,1 l.B r.2 t.4 1.5
1.5 t.9 1.5 1.3 I.7 1.3
J.O 2.I 2.5 1.1 1,.2 t.l
2.4 2.2 1.5 t.2 2.5 2.4
2.2 t 1
t.5 t.l t.2 1.3
2.7 2.2 r.6 t.2 2.5 2.9
2.9 J.l r.4 t.4 1.9 r.7
2.0 2.2 4.0 z.J t.B 1,.7
1.0 I,2 2.0 r.2 t.2 1.1
2.4 2.6 2.5 t.5 1.3 l.l
SOURCE: Curtis E. Haas, Pharm.D. Used with permission.

9.3.5  Digoxin is a drug often prescribed to treat heart ailments. The purpose of a study by Parker et al. (A-4) was to examine the interactions of digoxin with common grapefruit juice. In one experiment, subjects took digoxin with water for 2 weeks, followed by a 2-week period during which digoxin was withheld. During the next 2 weeks subjects took digoxin with grapefruit juice. For seven subjects, the average peak plasma digoxin concentration (Cmax) when taking water is given in the first column of the following table. The second column contains the percent change in Cmax concentration when subjects were taking the digoxin with grapefruit juice [GFJ (%) change]. Use the Cmax level when taking digoxin with water to predict the percent change in Cmax concentration when taking digoxin with grapefruit juice.

Cmax (ng/ml) with Water   Change in Cmax with GFJ (%)
2.34                       29.5
2.46                       40.7
1.87                        5.3
3.09                       23.7
5.59                      -45.1
4.05                      -35.3
6.21                      -44.6

SOURCE: Robert B. Parker, Pharm.D. Used with permission.

9.3.6  Evans et al. (A-5) examined the effect of velocity on ground reaction forces (GRF) in dogs with lameness from a torn cranial cruciate ligament. The dogs were walked and trotted over a force platform and the GRF recorded (in Newtons) during the stance phase. The following table contains 22 measurements of force expressed as the mean of five force measurements per dog when walking and the mean of five force measurements per dog when trotting. Use the GRF value when walking to predict the GRF value when trotting.

GRF-Walk   GRF-Trot   GRF-Walk   GRF-Trot
31.5       50.8       24.9       30.2
33.3       33.0                  46.3
32.3       44.8       30.7       41.8
28.8       39.5       27.2       32.4
38.3       44.0       44.0       65.8
36.9       60.1       28.2       32.2
qA )       14.6       11.1       29.5
27.0       32.3       31.6       JO. {
32.8       41.3       29.9       42.0
27.4       38.2       34.3       37.6
31.5       50.8       24.9       30.2

SOURCE: Richard Evans, Ph.D. Used with permission.

9.3.7  Glomerular filtration rate (GFR) is the most important parameter of renal function assessed in renal transplant recipients. Although inulin clearance is regarded as the gold standard measure of GFR, its use in clinical practice is limited. Krieser et al. (A-6) examined the relationship between the inverse of Cystatin C (a cationic basic protein measured in mg/L) and inulin GFR as measured by technetium radionuclide labeled diethylenetriamine penta-acetic acid (DTPA GFR) clearance (mL/min/1.73 m²). The results of 27 tests are shown in the following table. Use DTPA GFR as the predictor of inverse Cystatin C.

DTPA GFR   1/Cystatin C   DTPA GFR   1/Cystatin C
18         0.213          42         0.485
21         0.265          42         0.427
21         0.446          43         0.562
23         0.203          43         0.463
27         0.369          48         0.549
27         0.568          48         0.538
30         0.382          51         0.571
32         0.383          55         0.546
33         0.274          58         0.402
35         0.424          60         0.592
36         0.308          62         0.541
37         0.498          67         0.568
41         0.398          68         0.800
                          88         0.667

SOURCE: David Krieser, M.D. Used with permission.


9.4 EVALUATING THE REGRESSION EQUATION

Once the regression equation has been obtained it must be evaluated to determine whether it adequately describes the relationship between the two variables and whether it can be used effectively for prediction and estimation purposes.

When H0: β = 0 Is Not Rejected  If in the population the relationship between X and Y is linear, β, the slope of the line that describes this relationship, will be either positive, negative, or zero. If β is zero, sample data drawn from the population will, in the long run, yield regression equations that are of little or no value for prediction and estimation purposes. Furthermore, even though we assume that the relationship between X and Y is linear, it may be that the relationship could be described better by some nonlinear model. When this is the case, sample data when fitted to a linear model will tend to yield results compatible with a population slope of zero. Thus, following a test in which the null
FIGURE 9.4.1  Conditions in a population that may prevent rejection of the null hypothesis that β = 0. (a) The relationship between X and Y is linear, but β is so close to zero that sample data are not likely to yield equations that are useful for predicting Y when X is given. (b) The relationship between X and Y is not linear; a curvilinear model provides a better fit to the data; sample data are not likely to yield equations that are useful for predicting Y when X is given.

9.4 EVALUATTNGTHE REGRESSION EQUATTON 425
hypothesis that β equals zero is not rejected, we may conclude (assuming that we have not made a type II error by accepting a false null hypothesis) either (1) that although the relationship between X and Y may be linear it is not strong enough for X to be of much value in predicting and estimating Y, or (2) that the relationship between X and Y is not linear; that is, some curvilinear model provides a better fit to the data. Figure 9.4.1 shows the kinds of relationships between X and Y in a population that may prevent rejection of the null hypothesis that β = 0.
When H₀: β = 0 Is Rejected Now let us consider the situations in a population that may lead to rejection of the null hypothesis that β = 0. Assuming that we do not commit a type I error, rejection of the null hypothesis that β = 0 may be attributed to one of the following conditions in the population: (1) the relationship is linear and of sufficient strength to justify the use of sample regression equations to predict and estimate Y for given values of X; and (2) there is a good fit of the data to a linear model, but some curvilinear model might provide an even better fit. Figure 9.4.2 illustrates the two population conditions that may lead to rejection of H₀: β = 0.

FIGURE 9.4.2 Population conditions relative to X and Y that may cause rejection of the null hypothesis that β = 0. (a) The relationship between X and Y is linear and of sufficient strength to justify the use of a sample regression equation to predict and estimate Y for given values of X. (b) A linear model provides a good fit to the data, but some curvilinear model would provide an even better fit.


Thus, we see that before using a sample regression equation to predict and estimate, it is desirable to test H₀: β = 0. We may do this either by using analysis of variance and the F statistic or by using the t statistic. We will illustrate both methods. Before we do this, however, let us see how we may investigate the strength of the relationship between X and Y.

The Coefficient of Determination One way to evaluate the strength of the regression equation is to compare the scatter of the points about the regression line with the scatter about ȳ, the mean of the sample values of Y. If we take the scatter diagram for Example 9.3.1 and draw through the points a line that intersects the Y-axis at ȳ and is parallel to the X-axis, we may obtain a visual impression of the relative magnitudes of the scatter of the points about this line and the regression line. This has been done in Figure 9.4.3.
It appears rather obvious from Figure 9.4.3 that the scatter of the points about the regression line is much less than the scatter about the ȳ line. We would not wish, however, to decide on this basis alone that the equation is a useful one.

[Figure: scatter of deep abdominal AT (cm²), Y, against waist circumference (cm), X, showing the fitted line ŷ = −216 + 3.46x and the horizontal line ȳ = 101.89]
FIGURE 9.4.3 Scatter diagram, sample regression line, and ȳ line for Example 9.3.1.
9.4 EVALUATINGTHE REGRESSIONEQUATION 427

The situation may not always be this clear-cut, so that an objective measure of some sort would be much more desirable. Such an objective measure, called the coefficient of determination, is available.

The Total Deviation Before defining the coefficient of determination, let us justify its use by examining the logic behind its computation. We begin by considering the point corresponding to any observed value, yi, and by measuring its vertical distance from the ȳ line. We call this the total deviation and designate it (yi − ȳ).

The Explained Deviation If we measure the vertical distance from the regression line to the ȳ line, we obtain (ŷi − ȳ), which is called the explained deviation, since it shows by how much the total deviation is reduced when the regression line is fitted to the points.

The Unexplained Deviation Finally, we measure the vertical distance of the observed point from the regression line to obtain (yi − ŷi), which is called the unexplained deviation, since it represents the portion of the total deviation not "explained" or accounted for by the introduction of the regression line. These three quantities are shown for a typical value of Y in Figure 9.4.4.
It is seen, then, that the total deviation for a particular yi is equal to the sum of the explained and unexplained deviations. We may write this symbolically as

(yi − ȳ) = (ŷi − ȳ) + (yi − ŷi)    (9.4.1)
total deviation = explained deviation + unexplained deviation

If we measure these deviations for each value of yi and ŷi, square each deviation, and add up the squared deviations, we have

Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σ(yi − ŷi)²    (9.4.2)
total sum of squares = explained sum of squares + unexplained sum of squares

These quantities may be considered measures of dispersion or variability.

Total Sum of Squares The total sum of squares (SST), for example, is a measure of the dispersion of the observed values of Y about their mean ȳ; that is, this term is a measure of the total variation in the observed values of Y. The reader will recognize this term as the numerator of the familiar formula for the sample variance.

Explained Sum of Squares The explained sum of squares measures the amount of the total variability in the observed values of Y that is accounted for by the linear relationship between the observed values of X and Y. This quantity is referred to also as the sum of squares due to linear regression (SSR).


[Figure: scatter diagram with the regression line ŷ = −216 + 3.46x and the line ȳ = 101.89, marking for one observation the total deviation (yi − ȳ), the explained deviation (ŷi − ȳ), and the unexplained deviation (yi − ŷi); X is waist circumference (cm)]
FIGURE 9.4.4 Scatter diagram showing the total, explained, and unexplained deviations for a selected value of Y, Example 9.3.1.

Unexplained Sum of Squares The unexplained sum of squares is a measure of the dispersion of the observed Y values about the regression line and is sometimes called the error sum of squares, or the residual sum of squares (SSE). It is this quantity that is minimized when the least-squares line is obtained.
We may express the relationship among the three sums of squares values as

SST = SSR + SSE

The numerical values of these sums of squares for our illustrative example appear in the analysis of variance table in Figure 9.3.2. Thus, we see that SST = 354531, SSR = 237549, SSE = 116982, and

354531 = 237549 + 116982
354531 = 354531

Calculating r² It is intuitively appealing to speculate that if a regression equation does a good job of describing the relationship between two variables, the explained or regression sum of squares should constitute a large proportion of the total sum of squares. It would be of interest, then, to determine the magnitude of this proportion by computing the ratio of the explained sum of squares to the total sum of squares. This is exactly what is done in evaluating a regression equation based on sample data, and the result is called the sample coefficient of determination, r². That is,

r² = Σ(ŷi − ȳ)² / Σ(yi − ȳ)² = SSR/SST

In our present example we have, using the sums of squares values from Figure 9.3.2,

r² = 237549/354531 = .67
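The decomposition of Equation 9.4.2 and the ratio SSR/SST can be verified numerically. The following sketch uses a small made-up (x, y) sample, not the data of Example 9.3.1; it illustrates only the arithmetic.

```python
# Sketch (made-up data, not Example 9.3.1): the sums-of-squares
# decomposition of Equation 9.4.2 and r^2 = SSR/SST.
x = [5.0, 6.0, 8.0, 9.0, 12.0]
y = [11.0, 14.0, 17.0, 20.0, 26.0]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares slope b and intercept a
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x)
a = y_bar - b * x_bar
y_hat = [a + b * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained

r_sq = ssr / sst
print(round(sst, 4), round(ssr + sse, 4), round(r_sq, 3))  # 133.2 133.2 0.993
```

For this sample SST = 133.2 splits into SSR = 132.3 and SSE = 0.9, so almost all of the variation in y is explained by the line.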
The sample coefficient of determination measures the closeness of fit of the sample regression equation to the observed values of Y. When the quantities (yi − ŷi), the vertical distances of the observed values of Y from the equation, are small, the unexplained sum of squares is small. This leads to a large explained sum of squares that leads, in turn, to a large value of r². This is illustrated in Figure 9.4.5.
In Figure 9.4.5(a) we see that the observations all lie close to the regression line, and we would expect r² to be large. In fact, the computed r² for these data is .986, indicating that about 99 percent of the total variation in the yi is explained by the regression.
In Figure 9.4.5(b) we illustrate a case in which the yi are widely scattered about the regression line, and there we suspect that r² is small. The computed r² for the data is .403; that is, less than 50 percent of the total variation in the yi is explained by the regression.
The largest value that r² can assume is 1, a result that occurs when all the variation in the yi is explained by the regression. When r² = 1 all the observations fall on the regression line. This situation is shown in Figure 9.4.5(c).
The lower limit of r² is 0. This result is obtained when the regression line and the line drawn through ȳ coincide. In this situation none of the variation in the yi is explained by the regression. Figure 9.4.5(d) illustrates a situation in which r² is close to zero.
When r² is large, then, the regression has accounted for a large proportion of the total variability in the observed values of Y, and we look with favor on the regression equation. On the other hand, a small r², which indicates a failure of the regression to account for a large proportion of the total variation in the observed values of Y, tends to cast doubt on the usefulness of the regression equation for predicting and estimating purposes. We do not, however, pass


[Figure panels: (a) close fit, large r²; (b) poor fit, small r²; (c) r² = 1; (d) r² close to 0]
FIGURE 9.4.5 r² as a measure of closeness-of-fit of the sample regression line to the sample observations.

final judgment on the equation until it has been subjected to an objective statistical test.

Testing H₀: β = 0 with the F Statistic The following example illustrates one method for reaching a conclusion regarding the relationship between X and Y.

EXAMPLE 9.4.1

Refer to Example 9.3.1. We wish to know if we can conclude that, in the popula-
tion from which our sample was drawn, X and Y are linearly related.

Solution: The steps in the hypothesis testing procedure are as follows:

1. Data. The data were described in the opening statement of Example 9.3.1.

2. Assumptions. We presume that the simple linear regression model and its underlying assumptions as given in Section 9.2 are applicable.

3. Hypotheses.
H₀: β = 0
Hₐ: β ≠ 0
α = .05

4. Test statistic. The test statistic is V.R. as explained in the discussion that follows.
From the three sums-of-squares terms and their associated degrees of freedom the analysis of variance table of Table 9.4.1 may be constructed.
In general, the degrees of freedom associated with the sum of squares due to regression is equal to the number of constants in the regression equation minus 1. In the simple linear case we have two constants, a and b; hence the degrees of freedom for regression are 2 − 1 = 1.

5. Distribution of test statistic. It can be shown that when the hypothesis of no linear relationship between X and Y is true, and when the assumptions underlying regression are met, the ratio obtained by dividing the regression mean square by the residual mean square is distributed as F with 1 and n − 2 degrees of freedom.

6. Decision rule. Reject H₀ if the computed value of V.R. is equal to or greater than the critical value of F.

7. Calculation of test statistic. As shown in Figure 9.3.2, the computed value of F is 217.28.

TABLE 9.4.1 ANOVA Table for Simple Linear Regression

Source of Variation    SS     d.f.     MS                   V.R.
Linear regression      SSR    1        MSR = SSR/1          MSR/MSE
Residual               SSE    n − 2    MSE = SSE/(n − 2)
Total                  SST    n − 1


8. Statistical decision. Since 217.28 is greater than 3.94, the critical value of F (obtained by interpolation) for 1 and 107 degrees of freedom, the null hypothesis is rejected.

9. Conclusion. We conclude that the linear model provides a good fit to the data.

10. p value. For this test, since 217.28 > 8.25, we have p < .005.
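The arithmetic of Table 9.4.1 and the variance ratio can be reproduced directly from the sums of squares reported for Example 9.3.1. A minimal sketch; the critical value 3.94 is taken from the text rather than computed here:

```python
# Sketch of the ANOVA arithmetic of Table 9.4.1 using the sums of squares
# reported for Example 9.3.1 (SSR = 237549, SSE = 116982, n = 109).
ssr, sse, n = 237549.0, 116982.0, 109

msr = ssr / 1          # regression mean square, 1 degree of freedom
mse = sse / (n - 2)    # residual mean square, n - 2 degrees of freedom
vr = msr / mse         # variance ratio, F(1, n - 2) under H0

f_crit = 3.94          # approximate F(.95; 1, 107), from the text
print(round(vr, 2), vr >= f_crit)  # 217.28 True
```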

Estimating the Population Coefficient of Determination The sample coefficient of determination provides a point estimate of ρ², the population coefficient of determination. The population coefficient of determination, ρ², has the same function relative to the population as r² has to the sample. It shows what proportion of the total population variation in Y is explained by the regression of Y on X. When the number of degrees of freedom is small, r² is positively biased. That is, r² tends to be large. An unbiased estimator of ρ² is provided by

r̃² = 1 − [Σ(yi − ŷi)²/(n − 2)] / [Σ(yi − ȳ)²/(n − 1)]    (9.4.3)

Observe that the numerator of the fraction in Equation 9.4.3 is the unexplained mean square and the denominator is the total mean square. These quantities appear in the analysis of variance table. For our illustrative example we have, using the data from Figure 9.3.2,

r̃² = 1 − (116982/107)/(354531/108) = .66695

This quantity is labeled R-sq(adj) in Figure 9.3.2 and is reported as 66.7 percent. We see that this value is less than

r² = 1 − 116982/354531 = .67004

We see that the difference in r² and r̃² is due to the factor (n − 1)/(n − 2). When n is large, this factor will approach 1 and the difference between r² and r̃² will approach zero.
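Equation 9.4.3 can be checked against the printout's R-sq(adj) value using only the quantities in the ANOVA table. A small sketch:

```python
# Sketch: Equation 9.4.3 computed from the example's ANOVA-table
# quantities (SSE = 116982, SST = 354531, n = 109).
sse, sst, n = 116982.0, 354531.0, 109

r_sq = 1 - sse / sst                              # ordinary r^2
r_sq_adj = 1 - (sse / (n - 2)) / (sst / (n - 1))  # unbiased estimator of rho^2

print(round(r_sq, 5), round(r_sq_adj, 5))  # 0.67004 0.66695
```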

Testing H₀: β = 0 with the t Statistic When the assumptions stated in Section 9.2 are met, a and b are unbiased point estimators of the corresponding parameters α and β. Since, under these assumptions, the subpopulations of Y values are normally distributed, we may construct confidence intervals for and test hypotheses about α and β. When the assumptions of Section 9.2 hold true, the
sampling distributions of a and b are each normally distributed with means and variances as follows:

μ_a = α    (9.4.4)

σ_a² = σ²y|x Σxi² / [n Σ(xi − x̄)²]    (9.4.5)

μ_b = β    (9.4.6)

and

σ_b² = σ²y|x / Σ(xi − x̄)²    (9.4.7)

In Equations 9.4.5 and 9.4.7, σ²y|x is the unexplained variance of the subpopulations of Y values.
With knowledge of the sampling distributions of a and b we may construct confidence intervals and test hypotheses relative to α and β in the usual manner. Inferences regarding α are usually not of interest. On the other hand, as we have seen, a great deal of interest centers on inferential procedures with respect to β. The reason for this is the fact that β tells us so much about the form of the relationship between X and Y. When X and Y are linearly related a positive β indicates that, in general, Y increases as X increases, and we say that there is a direct linear relationship between X and Y. A negative β indicates that values of Y tend to decrease as values of X increase, and we say that there is an inverse linear relationship between X and Y. When there is no linear relationship between X and Y, β is equal to zero. These three situations are illustrated in Figure 9.4.6.

The Test Statistic For testing hypotheses about β the test statistic when σ²y|x is known is

z = (b − β₀) / σ_b    (9.4.8)

where β₀ is the hypothesized value of β. The hypothesized value of β does not have to be zero, but in practice, more often than not, the null hypothesis of interest is that β = 0.
As a rule σ²y|x is unknown. When this is the case, the test statistic is

t = (b − β₀) / s_b    (9.4.9)

where s_b is an estimate of σ_b, and t is distributed as Student's t with n − 2 degrees of freedom.



FIGURE 9.4.6 Scatter diagrams showing (a) direct linear relationship, (b) inverse linear relationship, and (c) no linear relationship between X and Y.

If the probability of observing a value as extreme as the value of the test statistic computed by Equation 9.4.9 when the null hypothesis is true is less than α/2 (since we have a two-sided test), the null hypothesis is rejected.

EXAMPLE 9.4.2
Refer to Example 9.3.1. We wish to know if we can conclude that the slope of the population regression line describing the relationship between X and Y is zero.

Solution:
1. Data. See Example 9.3.1.

2. Assumptions. We presume that the simple linear regression model and its underlying assumptions are applicable.

3. Hypotheses.
H₀: β = 0
Hₐ: β ≠ 0
α = .05

4. Test statistic. The test statistic is given by Equation 9.4.9.

5. Distribution of test statistic. When the assumptions are met and H₀ is true, the test statistic is distributed as Student's t with n − 2 degrees of freedom.

6. Decision rule. Reject H₀ if the computed value of t is either greater than or equal to 1.9826 or less than or equal to −1.9826.

7. Calculation of statistic. The output in Figure 9.3.2 shows that b = 3.4589, s_b = .2347, and

t = (3.4589 − 0)/.2347 = 14.74


8. Statistical decision. Reject H₀ because 14.74 > 1.9826.

9. Conclusion. We conclude that the slope of the true regression line is not zero.

10. p value. The p value for this test is less than .01, since, when H₀ is true, the probability of getting a value of t as large as or larger than 2.6230 (obtained by interpolation) is .005, and the probability of getting a value of t as small as or smaller than −2.6230 is also .005. Since 14.74 is greater than 2.6230, the probability of observing a value of t as large as or larger than 14.74 (when the null hypothesis is true) is less than .005. We double this value to obtain 2(.005) = .01.
The practical implication of our results is that we can expect to get better predictions and estimates of Y if we use the sample regression equation than we would get if we ignore the relationship between X and Y. The fact that b is positive leads us to believe that β is positive and that the relationship between X and Y is a direct linear relationship.
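The computation in step 7 and the decision in step 6 can be sketched as follows, using the printout values b = 3.4589 and s_b = .2347 and the tabled critical value 1.9826:

```python
# Sketch of steps 6 and 7: the t statistic of Equation 9.4.9 from the
# printout values of Example 9.3.1 (b = 3.4589, s_b = .2347, beta_0 = 0).
b, s_b, beta_0 = 3.4589, 0.2347, 0.0

t = (b - beta_0) / s_b
t_crit = 1.9826            # t(.975) with 107 degrees of freedom

reject_h0 = abs(t) >= t_crit
print(round(t, 2), reject_h0)  # 14.74 True
```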

As has already been pointed out, Equation 9.4.9 may be used to test the null hypothesis that β is equal to some value other than 0. The hypothesized value for β, β₀, is substituted into Equation 9.4.9. All other quantities, as well as the computations, are the same as in the illustrative example. The degrees of freedom and the method of determining significance are also the same.

A Confidence Interval for β Once we determine that it is unlikely, in light of sample evidence, that β is zero, we may be interested in obtaining an interval estimate of β. The general formula for a confidence interval,

estimator ± (reliability factor)(standard error of the estimate)

may be used. When obtaining a confidence interval for β, the estimator is b, the reliability factor is some value of z or t (depending on whether or not σ²y|x is known), and the standard error of the estimator is

σ_b = √[σ²y|x / Σ(xi − x̄)²]

When σ²y|x is unknown, σ_b is estimated by

s_b = √[s²y|x / Σ(xi − x̄)²]

where s²y|x = MSE


In most practical situations our 100(1 − α) percent confidence interval for β is

b ± t(1 − α/2) s_b    (9.4.10)

For our illustrative example we construct the following 95 percent confidence interval for β:

3.4589 ± 1.9826(.2347)
2.99, 3.92

We interpret this interval in the usual manner. From the probabilistic point of view we say that in repeated sampling 95 percent of the intervals constructed in this way will include β. The practical interpretation is that we are 95 percent confident that the single interval constructed includes β.
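The interval arithmetic of Equation 9.4.10 can be sketched as:

```python
# Sketch of the 95 percent interval b +/- t(1 - alpha/2) * s_b from
# Equation 9.4.10, with the example's values.
b, s_b, t_crit = 3.4589, 0.2347, 1.9826

half_width = t_crit * s_b
lower, upper = b - half_width, b + half_width
print(round(lower, 2), round(upper, 2))  # 2.99 3.92
```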

Using the Confidence Interval to Test H₀: β = 0 It is instructive to note that the confidence interval we constructed does not include zero, so that zero is not a candidate for the parameter being estimated. We feel, then, that it is unlikely that β = 0. This is compatible with the results of our hypothesis test in which we rejected the null hypothesis that β = 0. Actually, we can always test H₀: β = 0 at the α significance level by constructing the 100(1 − α) percent confidence interval for β, and we can reject or fail to reject the hypothesis on the basis of whether or not the interval includes zero. If the interval contains zero, the null hypothesis is not rejected; and if zero is not contained in the interval, we reject the null hypothesis.

Interpreting the Results It must be emphasized that failure to reject the null hypothesis that β = 0 does not mean that X and Y are not related. Not only is it possible that a type II error may have been committed but it may be true that X and Y are related in some nonlinear manner. On the other hand, when we reject the null hypothesis that β = 0, we cannot conclude that the true relationship between X and Y is linear. Again, it may be that although the data fit the linear regression model fairly well (as evidenced by the fact that the null hypothesis that β = 0 is rejected), some nonlinear model would provide an even better fit. Consequently, when we reject H₀ that β = 0, the best we can say is that more useful results (discussed below) may be obtained by taking into account the regression of Y on X than in ignoring it.

EXERCISES

9.4.1 to 9.4.5 Refer to Exercises 9.3.3 to 9.3.7, and for each one do the following:
(a) Compute the coefficient of determination.
(b) Prepare an ANOVA table and use the F statistic to test the null hypothesis that β = 0. Let α = .05.


(c) Use the t statistic to test the null hypothesis that β = 0 at the .05 level of significance.
(d) Determine the p value for each hypothesis test.
(e) State your conclusions in terms of the problem.
(f) Construct the 95 percent confidence interval for β.

9.5 USING THE REGRESSION EQUATION


If the results of the evaluation of the sample regression equation indicate that there is a relationship between the two variables of interest, we can put the regression equation to practical use. There are two ways in which the equation can be used. It can be used to predict what value Y is likely to assume given a particular value of X. When the normality assumption of Section 9.2 is met, a prediction interval for this predicted value of Y may be constructed.
We may also use the regression equation to estimate the mean of the subpopulation of Y values assumed to exist at any particular value of X. Again, if the assumption of normally distributed populations holds, a confidence interval for this parameter may be constructed. The predicted value of Y and the point estimate of the mean of the subpopulation of Y will be numerically equivalent for any particular value of X but, as we will see, the prediction interval will be wider than the confidence interval.

Predicting Y for a Given X If it is known, or if we are willing to assume, that the assumptions of Section 9.2 are met, and when σ²y|x is unknown, then the 100(1 − α) percent prediction interval for Y is given by

ŷ ± t(1 − α/2) s_y|x √[1 + 1/n + (x_p − x̄)²/Σ(xi − x̄)²]    (9.5.1)

where x_p is the particular value of x at which we wish to obtain a prediction interval for Y and the degrees of freedom used in selecting t are n − 2.

Estimating the Mean of Y for a Given X The 100(1 − α) percent confidence interval for μy|x, when σ²y|x is unknown, is given by

ŷ ± t(1 − α/2) s_y|x √[1/n + (x_p − x̄)²/Σ(xi − x̄)²]    (9.5.2)

We use MINITAB to illustrate, for a specified value of X, the calculation of a 95 percent confidence interval for the mean of Y and a 95 percent prediction interval for an individual Y measurement.
Suppose, for our present example, we wish to make predictions and estimates about AT for a waist circumference of 100 cm. In the regression dialog box click on "Options." Enter 100 in the "Prediction interval for new observations" box. Click on "Confidence limits," and click on "Prediction limits."
We obtain the following output:

Fit       StDev Fit    95% C.I.            95% P.I.
129.90    3.69         (122.58, 137.23)    (63.93, 195.87)
We interpret the 95 percent confidence interval (C.I.) as follows.
If we repeatedly drew samples from our population of men, performed a regression analysis, and estimated μy|100 with a similarly constructed confidence interval, about 95 percent of such intervals would include the mean amount of deep abdominal AT for the population. For this reason we are 95 percent confident that the single interval constructed contains the population mean and that it is somewhere between 122.58 and 137.23.
Our interpretation of a prediction interval (P.I.) is similar to the interpretation of a confidence interval. If we repeatedly draw samples, do a regression analysis, and construct prediction intervals for men who have a waist circumference of 100 cm, about 95 percent of them will include the man's deep abdominal AT value. This is the probabilistic interpretation. The practical interpretation is that we are 95 percent confident that a man who has a waist circumference of 100 cm will have a deep abdominal AT area of somewhere between 63.93 and 195.87 square centimeters.
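The two intervals can be reconstructed, approximately, from the reported summary quantities (the fit and its standard error from the output above, the root MSE of 33.06493 from the printout, and t = 1.9826 used earlier), since the prediction standard error in Equation 9.5.1 adds s²y|x to the squared standard error of the fit. Small rounding drift from the printout is expected:

```python
import math

# Sketch: rebuilding MINITAB's intervals at x_p = 100 from reported
# summary numbers; rounding drift from the printout is expected.
fit, se_fit = 129.90, 3.69          # point estimate and SE of the fit
s_yx = 33.06493                     # root MSE, the estimate of sigma_y|x
t_crit = 1.9826                     # t(.975) with 107 degrees of freedom

# 95% confidence interval for mu_y|100 (Equation 9.5.2)
ci = (fit - t_crit * se_fit, fit + t_crit * se_fit)

# 95% prediction interval for an individual Y (Equation 9.5.1): the
# prediction standard error adds s_yx^2 to the variance of the fit
se_pred = math.sqrt(s_yx ** 2 + se_fit ** 2)
pi = (fit - t_crit * se_pred, fit + t_crit * se_pred)

print([round(v, 2) for v in ci], [round(v, 2) for v in pi])
```

The results agree with the printout to within about 0.02, and the prediction interval is, as promised, much wider than the confidence interval.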
Figure 9.5.1 contains a partial printout of the SAS® simple linear regression analysis of the data of Example 9.3.1.

Resistant Line Frequently, data sets available for analysis by linear regression techniques contain one or more "unusual" observations; that is, values of x or y, or both, may be either considerably larger or considerably smaller than most of the other measurements. In the output of Figure 9.3.2, we see that the computer detected seven unusual observations in the waist circumference and deep abdominal AT data shown in Table 9.3.1.
The least-squares method of fitting a straight line to data is sensitive to unusual observations, and the location of the fitted line can be affected substantially by them. Because of this characteristic of the least-squares method, the resulting least-squares line is said to lack resistance to the influence of unusual observations. Several methods have been devised for dealing with this problem, including one developed by John W. Tukey. The resulting line is variously referred to as Tukey's line and the resistant line.
Based on medians, which, as we have seen, are descriptive measures that are themselves resistant to extreme values, the resistant line methodology is an exploratory data analysis tool that enables the researcher to quickly fit a straight line to a set of data consisting of paired x, y measurements. The technique involves partitioning, on the basis of the independent variable, the sample measurements into three groups of as near equal size as possible: the smallest measurements, the


The SAS System

Model: MODEL1
Dependent Variable: Y

Analysis of Variance

                     Sum of          Mean
Source      DF       Squares         Square          F Value    Prob > F
Model        1       237548.51620    237548.51620    217.279    0.0001
Error      107       116981.98602      1093.28959
C Total    108       354530.50222

Root MSE     33.06493    R-square    0.6700
Dep Mean    101.89404    Adj R-sq    0.6670
C.V.         32.45031

Parameter Estimates

                 Parameter      Standard       T for H0:
Variable   DF    Estimate       Error          Parameter = 0    Prob > |T|
INTERCEP    1    -215.981488    21.79621076    -9.909           0.0001
X           1       3.458859     0.23465205    14.740           0.0001

FIGURE 9.5.1 Partial printout of the computer analysis of the data given in Example 9.3.1, using the SAS® software package.

largest measurements, and those in between. The resistant line is the line fitted in such a way that there are an equal number of values above and below it in both the smaller group and the larger group. The resulting slope and y-intercept estimates are resistant to the effects of either extreme y values, extreme x values, or both. To illustrate the fitting of a resistant line, we use the data of Table 9.3.1 and MINITAB. The procedure and output are shown in Figure 9.5.2.
We see from the output in Figure 9.5.2 that the resistant line has a slope of 3.2869 and a y-intercept of −203.7868. The half-slope ratio, shown in the output as equal to .690, is an indicator of the degree of linearity between x and y. A slope, called a half-slope, is computed for each half of the sample data. The ratio of the right half-slope, b_R, and the left half-slope, b_L, is equal to b_R/b_L. If the relationship between x and y is straight, the half-slopes will be equal, and their ratio will be 1. A half-slope ratio that is not close to 1 indicates a lack of linearity between x and y.

Dialog box:                                   Session command:

Stat > EDA > Resistant Line                   MTB > Name C3 = 'RESI1' C4 = 'FITS1'
                                              MTB > Rline C2 C1 'RESI1' 'FITS1';
Type C2 in Response and C1 in Predictors.     SUBC> MaxIterations 10.
Check Residuals and Fits. Click OK.

Output:

Resistant Line Fit: C2 versus C1

Slope = 3.2869    Level = -203.7868    Half-slope ratio = 0.690

FIGURE 9.5.2 MINITAB resistant line procedure and output for the data of Table 9.3.1.

The resistant line methodology is discussed in more detail by Hartwig and Dearing (1), Johnstone and Velleman (2), McNeil (3), and Velleman and Hoaglin (4).
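The three-group median idea can be sketched in a few lines. What follows is a simplified, non-iterative version on made-up data; MINITAB's Rline iterates the fit (note the MaxIterations subcommand), and Tukey's full procedure chooses the level from residual medians, so its results will differ from this sketch:

```python
import statistics

# A simplified, non-iterative sketch of the three-group resistant-line
# idea, on made-up data with one wild y value. Function and data names
# are illustrative, not MINITAB's.
def resistant_line(pairs):
    pairs = sorted(pairs)                # order by x
    k = len(pairs) // 3
    left, right = pairs[:k], pairs[-k:]  # outer thirds

    xl = statistics.median(p[0] for p in left)
    yl = statistics.median(p[1] for p in left)
    xr = statistics.median(p[0] for p in right)
    yr = statistics.median(p[1] for p in right)

    slope = (yr - yl) / (xr - xl)        # median-based slope
    # Simplification: pass the line through the overall median point
    # (Tukey's procedure sets the level from residual medians instead).
    xm = statistics.median(p[0] for p in pairs)
    ym = statistics.median(p[1] for p in pairs)
    return slope, ym - slope * xm

# y is roughly 2x, except for the wild observation at x = 9
data = [(1, 2.0), (2, 4.1), (3, 5.9), (4, 8.2), (5, 9.8),
        (6, 12.1), (7, 14.0), (8, 16.2), (9, 60.0)]
slope, level = resistant_line(data)
print(round(slope, 3), round(level, 3))  # slope stays near 2 despite the outlier
```

Because the wild value at x = 9 is outvoted by the medians of its third, the fitted slope stays close to 2; a least-squares fit to the same data would be pulled up sharply.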

EXERCISES

In each exercise refer to the appropriate previous exercise and, for the value of X indicated, (a) construct the 95 percent confidence interval for μy|x and (b) construct the 95 percent prediction interval for Y.

9.5.1 Refer to Exercise 9.3.3 and let X = 400.
9.5.2 Refer to Exercise 9.3.4 and let X = 1.6.
9.5.3 Refer to Exercise 9.3.5 and let X = 4.16.
9.5.4 Refer to Exercise 9.3.6 and let X = 29.4.
9.5.5 Refer to Exercise 9.3.7 and let X = 35.

9.6 THE CORRELATION MODEL


In the classic regression model, which has been the underlying model in our discussion up to this point, only Y, which has been called the dependent variable, is required to be random. The variable X is defined as a fixed (nonrandom or mathematical) variable and is referred to as the independent variable. Recall, also, that


under this model observations are frequently obtained by preselecting values of X and determining corresponding values of Y.
When both Y and X are random variables, we have what is called the correlation model. Typically, under the correlation model, sample observations are obtained by selecting a random sample of the units of association (which may be persons, places, animals, points in time, or any other element on which the two measurements are taken) and taking on each a measurement of X and a measurement of Y. In this procedure, values of X are not preselected but occur at random, depending on the unit of association selected in the sample.
Although correlation analysis cannot be carried out meaningfully under the classic regression model, regression analysis can be carried out under the correlation model. Correlation involving two variables implies a co-relationship between variables that puts them on an equal footing and does not distinguish between them by referring to one as the dependent and the other as the independent variable. In fact, in the basic computational procedures, which are the same as for the regression model, we may fit a straight line to the data either by minimizing Σ(yi − ŷi)² or by minimizing Σ(xi − x̂i)². In other words, we may do a regression of X on Y as well as a regression of Y on X. The fitted line in the two cases in general will be different, and a logical question arises as to which line to fit.
If the objective is solely to obtain a measure of the strength of the relationship between the two variables, it does not matter which line is fitted, since the measure usually computed will be the same in either case. If, however, it is desired to use the equation describing the relationship between the two variables for the purposes discussed in the preceding sections, it does matter which line is fitted. The variable for which we wish to estimate means or to make predictions should be treated as the dependent variable; that is, this variable should be regressed on the other variable.
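The point that the two fitted lines generally differ can be illustrated numerically. The sketch below, on made-up data, expresses both least-squares lines in the same (x, y) axes; the strength measure is the same under either choice, since the product of the two regression slopes equals r².

```python
# Sketch (made-up data): fitting Y on X and X on Y gives two
# different lines, but the same measure of strength.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 5.0, 4.0, 6.0]

n = len(x)
xb, yb = sum(x) / n, sum(y) / n
sxx = sum((xi - xb) ** 2 for xi in x)
syy = sum((yi - yb) ** 2 for yi in y)
sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))

slope_y_on_x = sxy / sxx    # minimizes vertical deviations, sum (yi - yhat)^2
slope_x_on_y = sxy / syy    # minimizes horizontal deviations, sum (xi - xhat)^2
# The X-on-Y line, redrawn in the same axes as y against x, has slope:
slope_x_on_y_in_xy = 1 / slope_x_on_y

# The product of the two regression slopes is the coefficient of
# determination, the same whichever variable is regressed on the other.
r_sq = slope_y_on_x * slope_x_on_y
print(round(slope_y_on_x, 3), round(slope_x_on_y_in_xy, 3), round(r_sq, 2))
```

For these data the Y-on-X line has slope 0.9 while the X-on-Y line, seen in the same axes, has slope about 1.11; the lines coincide only when every point falls exactly on one straight line.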

The Bivariate Normal Distribution    Under the correlation model, X and Y are assumed to vary together in what is called a joint distribution. If this joint distribution is a normal distribution, it is referred to as a bivariate normal distribution. Inferences regarding this population may be made based on the results of samples properly drawn from it. If, on the other hand, the form of the joint distribution is known to be nonnormal, or if the form is unknown and there is no justification for assuming normality, inferential procedures are invalid, although descriptive measures may be computed.

Correlation Assumptions    The following assumptions must hold for inferences about the population to be valid when sampling is from a bivariate distribution.

1. For each value of X there is a normally distributed subpopulation of Y values.

2. For each value of Y there is a normally distributed subpopulation of X values.
442 CHAPTER 9 SIMPLE LINEAR REGRESSION AND CORRELATION

FIGURE 9.6.1 A bivariate normal distribution. (a) A bivariate normal distribution. (b) A cutaway showing the normally distributed subpopulation of Y for a given X. (c) A cutaway showing the normally distributed subpopulation of X for a given Y.

3. The joint distribution of X and Y is a normal distribution called the bivariate normal distribution.

4. The subpopulations of Y values all have the same variance.

5. The subpopulations of X values all have the same variance.

The bivariate normal distribution is represented graphically in Figure 9.6.1. In this illustration we see that if we slice the mound parallel to Y at some value of X, the cutaway reveals the corresponding normal distribution of Y. Similarly, a slice through the mound parallel to X at some value of Y reveals the corresponding normally distributed subpopulation of X.
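A short simulation can make these assumptions concrete. The sketch below uses assumed parameter values (illustrative only, loosely echoing the height/Cv setting of the next section) and checks that every "slice" of the bivariate normal mound has the same conditional variance:

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed illustrative parameters: means, standard deviations, and rho.
mu_x, mu_y, sd_x, sd_y, rho = 170.0, 17.0, 10.0, 1.5, 0.85
cov = [[sd_x**2, rho * sd_x * sd_y],
       [rho * sd_x * sd_y, sd_y**2]]

x, y = rng.multivariate_normal([mu_x, mu_y], cov, size=200_000).T

# Slicing the mound parallel to Y at a fixed X gives a normal subpopulation
# of Y; for a bivariate normal every such slice has the SAME variance,
# sd_y**2 * (1 - rho**2), whatever X value is chosen (assumptions 1 and 4).
slice_vars = [y[np.abs(x - x0) < 0.5].var(ddof=1) for x0 in (160.0, 180.0)]
print(slice_vars, sd_y**2 * (1 - rho**2))  # the slice variances are close to the theoretical value
```

The analogous check with the roles of X and Y exchanged illustrates assumptions 2 and 5.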

9.7 THE CORRELATION COEFFICIENT


The bivariate normal distribution discussed in Section 9.6 has five parameters, σx, σy, μx, μy, and ρ. The first four are, respectively, the standard deviations and means associated with the individual distributions. The other parameter, ρ, is called the population correlation coefficient and measures the strength of the linear relationship between X and Y.

The population correlation coefficient is the positive or negative square root of ρ², the population coefficient of determination previously discussed, and since the coefficient of determination takes on values between 0 and 1 inclusive, ρ may assume any value between -1 and +1. If ρ = 1 there is a perfect direct linear correlation between the two variables, while ρ = -1 indicates perfect inverse linear correlation. If ρ = 0 the two variables are not linearly correlated. The sign of ρ will always be the same as the sign of β, the slope of the population regression line for X and Y.
The sample correlation coefficient, r, describes the linear relationship between the sample observations on two variables in the same way that ρ describes the relationship in a population. The sample correlation coefficient is the square root of the sample coefficient of determination that was defined earlier.

Figures 9.4.5(d) and 9.4.5(c), respectively, show typical scatter diagrams where r → 0 (r² → 0) and r = +1 (r² = 1). Figure 9.7.1 shows a typical scatter diagram where r = -1.

We are usually interested in knowing if we may conclude that ρ ≠ 0, that is, that X and Y are linearly correlated. Since ρ is usually unknown, we draw a random sample from the population of interest, compute r, the estimate of ρ, and test H0: ρ = 0 against the alternative HA: ρ ≠ 0. The procedure will be illustrated in the following example.

FIGURE 9.7.1 Scatter diagram for r = -1.

EXAMPLE 9.7.1

The purpose of a study by Kwast-Rabban et al. (A-7) was to analyze somatosensory evoked potentials (SEPs) and their interrelations following stimulation of digits I, III, and V in the hand. The researchers wanted to establish reference criteria in a control population. Thus, healthy volunteers were recruited for the study. In the future this information could be quite valuable as SEPs may provide a method to demonstrate functional disturbances in patients with suspected cervical root lesion who have pain and sensory symptoms. In the study, stimulation below pain-level intensity was applied to the fingers. Recordings of spinal responses were made with electrodes fixed by adhesive electrode cream to the subject's skin. One of the relationships of interest was the correlation between a subject's height (cm) and the peak spinal latency (Cv) of the SEP. The data for 155 measurements are shown in Table 9.7.1.

TABLE 9.7.1 Height and Spine SEP Measurements (Cv) from Stimulation of Digit I for 155 Subjects Described in Example 9.7.1

Height   Cv      Height   Cv      Height   Cv
149     14.4     168     16.3     181     15.8
149     13.4     168     15.3     181     18.8
155     13.5     168     16.0     181     18.6
155     13.5     168     16.6     182     18.0
156     13.0     168     15.7     182     17.9
156     13.6     168     16.3     182     17.5
157     14.3     168     16.6     182     17.4
157     14.9     168     15.4     182     17.0
158     14.0     170     16.6     182     17.5
158     14.0     170     16.0     182     17.8
160     15.4     170     17.0     184     18.4
160     14.7     170     16.4     184     18.5
161     15.5     171     16.5     184     17.7
161     15.7     171     16.3     184     17.7
161     15.8     171     16.4     184     17.4
161     16.0     171     16.5     184     18.4
161     14.6     172     17.6     185     19.0
161     15.2     172     16.8     185     19.6
162     15.2     172     17.0     187     19.1
162     16.5     172     17.6     187     19.2
162     17.0     173     17.3     187     17.8
162     14.7     173     16.8     187     19.3
163     16.0     174     15.5     188     17.5
163     15.8     174     15.5     188     18.0
163     17.0     175     17.0     189     18.0
163     15.1     175     15.6     189     18.8
163     14.6     175     16.8     190     18.3
163     15.6     175     17.4     190     18.6
163     14.6     175     17.6     190     18.8
164     17.0     175     16.5     190     19.2
164     16.3     175     16.6     191     18.5
164     16.0     175     17.0     191     18.5
164     16.0     176     18.0     191     19.0
165     15.7     176     17.0     191     18.5
165     16.3     176     17.4     194     19.8
165     17.4     176     18.2     194     18.8
165     17.0     176     17.3     194     18.4
165     16.3     177     17.2     194     19.0
166     14.1     177     18.3     195     18.0
166     14.2     179     16.4     195     18.2
166     14.7     179     16.1     196     17.6
166     13.9     179     17.6     196     18.3
166     17.2     179     17.8     197     18.9
167     16.7     179     16.1     197     19.2
167     16.5     179     16.0     200     21.0
167     14.7     179     16.0     200     19.2
167     14.3     179     17.5     202     18.6
167     14.8     179     17.5     202     18.6
167     15.0     180     18.0     182     20.0
167     15.5     180     17.9     190     20.0
167     15.4     181     18.4     190     19.5
168     17.3     181     16.4

SOURCE: Olga Kwast-Rabben, Ph.D. Used with permission.

Solution:  The scatter diagram and least-squares regression line are shown in Figure 9.7.2.

Let us assume that the investigator wishes to obtain a regression equation to use for estimating and predicting purposes. In that case the sample correlation coefficient will be obtained by the methods discussed under the regression model.

FIGURE 9.7.2 Height and cervical (spine) potentials in Digit I stimulation for the data described in Example 9.7.1.

The Regression Equation    Let us assume that we wish to predict Cv levels from a knowledge of heights. In that case we treat height as the independent variable and Cv level as the dependent variable and obtain the regression equation and correlation coefficient with MINITAB as shown in Figure 9.7.3. For this example r = √.719 = .848. We know that r is positive because the slope of the regression line is positive. We may also use the MINITAB correlation procedure to obtain r as shown in Figure 9.7.4.

The printout from the SAS® correlation procedure is shown in Figure 9.7.5. Note that the SAS® procedure gives descriptive measures for each variable as well as the p value for the correlation coefficient.
When a computer is not available for performing the calculations, r may be obtained by means of the following formula:

        r = b̂ √{[Σx² - (Σx)²/n] / [Σy² - (Σy)²/n]}        (9.7.1)

An alternative formula for computing r is given by

        r = [nΣxy - (Σx)(Σy)] / √{[nΣx² - (Σx)²][nΣy² - (Σy)²]}        (9.7.2)

The regression equation is

Cv = -3.20 + 0.115 Height

Predictor        Coef     SE Coef        T        P
Constant       -3.199       1.015    -3.15    0.002
Height        0.11457    0.005792    19.78    0.000

S = 0.8573     R-Sq = 71.9%     R-Sq(adj) = 71.7%

Analysis of Variance

Source            DF        SS        MS         F       P
Regression         1    287.56    287.56    391.30   0.000
Residual Error   153    112.44      0.73
Total            154    400.00

Unusual Observations
Obs   Height        Cv       Fit   SE Fit   Residual   St Resid
 39      166   14.1000   15.8199   0.0865    -1.7199     -2.02R
 42      166   13.9000   15.8199   0.0865    -1.9199     -2.25R
105      181   15.8000   17.5384   0.0710    -1.7384     -2.04R
151      202   18.6000   19.9443   0.1705    -1.3443     -1.60 X
152      202   18.6000   19.9443   0.1705    -1.3443     -1.60 X
153      182   20.0000   17.6529   0.0798     2.3471      2.75R

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.

FIGURE 9.7.3 MINITAB output for Example 9.7.1 using the simple regression procedure.

An advantage of this formula is that r may be computed without first computing b̂. This is the desirable procedure when it is not anticipated that the regression equation will be used.

Remember that the sample correlation coefficient, r, will always have the same sign as the sample slope, b̂.
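As a check on the two computational formulas, the following sketch evaluates both on a small illustrative subset of (height, Cv) pairs; the variable names follow the raw sums that appear in Equations 9.7.1 and 9.7.2:

```python
import math

# A small illustrative subset of (x, y) pairs.
x = [149, 155, 160, 168, 175, 182, 190, 200]
y = [14.4, 13.5, 15.4, 16.3, 17.0, 17.9, 18.6, 21.0]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(v * v for v in x)
sum_yy = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Least-squares slope b-hat, needed for Equation 9.7.1.
b_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x**2)
r_971 = b_hat * math.sqrt((sum_xx - sum_x**2 / n) / (sum_yy - sum_y**2 / n))

# Equation 9.7.2: r directly from the raw sums, without computing b-hat.
r_972 = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_xx - sum_x**2) * (n * sum_yy - sum_y**2))

print(r_971, r_972)  # the two formulas agree up to rounding error
```

Equation 9.7.2 is the convenient choice when the regression equation itself will not be needed.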
EXAMPLE 9.7.2

Refer to Example 9.7.1. We wish to see if the sample value of r = .848 is of sufficient magnitude to indicate that, in the population, height and Cv SEP levels are correlated.

Data:

C1: Height
C2: Cv

Dialog Box:                                 Session command:

Stat > Basic Statistics > Correlation       MTB > correlation c1 c2.

Type C1 C2 in Variables. Click OK.

Output:

Correlations: Height, Cv

Pearson correlation of Height and Cv = 0.848

P-Value = 0.000

FIGURE 9.7.4 MINITAB procedure for Example 9.7.1 using the correlation command.

The CORR Procedure

2 Variables:  HEIGHT  CV

Simple Statistics

Variable     N        Mean    Std Dev      Sum     Minimum     Maximum

HEIGHT     155   175.04516   11.92745    27132   149.00000   202.00000
CV         155    16.85613    1.61165     2613    13.00000    21.00000

Pearson Correlation Coefficients, N = 155

Prob > |r| under H0: Rho=0

          HEIGHT        CV
HEIGHT   1.00000   0.84788
                    <.0001
CV       0.84788   1.00000
          <.0001

FIGURE 9.7.5 SAS® printout for Example 9.7.1.



Solution:  We conduct a hypothesis test as follows.

1. Data. See the initial discussion of Example 9.7.1.

2. Assumptions. We presume that the assumptions given in Section 9.6 are applicable.

3. Hypotheses.

        H0: ρ = 0
        HA: ρ ≠ 0

4. Test statistic. When ρ = 0, it can be shown that the appropriate test statistic is

        t = r √[(n - 2)/(1 - r²)]        (9.7.3)

5. Distribution of test statistic. When H0 is true and the assumptions are met, the test statistic is distributed as Student's t distribution with n - 2 degrees of freedom.

6. Decision rule. If we let α = .05, the critical values of t in the present example are ±1.9754 (by interpolation). If, from our data, we compute a value of t that is either greater than or equal to +1.9754 or less than or equal to -1.9754, we will reject the null hypothesis.

7. Calculation of test statistic. Our calculated value of t is

        t = .848 √[153/(1 - .719)] = 19.787

8. Statistical decision. Since the computed value of the test statistic does exceed the critical value of t, we reject the null hypothesis.

9. Conclusion. We conclude that, in the population, height and SEP levels in the spine are linearly correlated.

10. p value. Since t = 19.787 > 2.6085 (interpolated value of t for 153 degrees of freedom, .995), we have for this test, p < .005.
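The arithmetic of steps 4 through 7 can be verified with a few lines of Python (the values of r and n are those of Example 9.7.2):

```python
import math

# Figures from Example 9.7.2: r = .848 computed from n = 155 pairs.
r, n = 0.848, 155

# Equation 9.7.3: t = r * sqrt((n - 2) / (1 - r**2)), referred to
# Student's t with n - 2 degrees of freedom.
t = r * math.sqrt((n - 2) / (1 - r**2))
print(round(t, 2))  # about 19.79; the text's 19.787 uses the rounded r-squared of .719
```

Either way the statistic far exceeds the .05-level critical value of 1.9754, so the null hypothesis of no linear correlation is rejected.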

A Test for Use When the Hypothesized ρ Is a Nonzero Value    The use of the t statistic computed in the above test is appropriate only for testing H0: ρ = 0. If it is desired to test H0: ρ = ρ0, where ρ0 is some value other than zero, we must use another approach. Fisher (5) suggests that r be transformed to z_r as follows:

        z_r = (1/2) ln[(1 + r)/(1 - r)]        (9.7.4)

where ln is a natural logarithm. It can be shown that z_r is approximately normally distributed with a mean of z_ρ = (1/2) ln[(1 + ρ)/(1 - ρ)] and estimated standard deviation of

        σ_z_r = 1/√(n - 3)        (9.7.5)

To test the null hypothesis that ρ is equal to some value other than zero, the test statistic is

        Z = (z_r - z_ρ)/(1/√(n - 3))        (9.7.6)

which follows approximately the standard normal distribution.

To determine z_r for an observed r and z_ρ for a hypothesized ρ, we consult Table I, thereby avoiding the direct use of natural logarithms.
Suppose in our present example we wish to test

        H0: ρ = .80

against the alternative

        HA: ρ ≠ .80

at the .05 level of significance. By consulting Table I (and interpolating), we find that for

        r = .848,  z_r = 1.24726

and for

        ρ = .80,  z_ρ = 1.09861

Our test statistic, then, is

        Z = (1.24726 - 1.09861)/(1/√(155 - 3)) = 1.83

Since 1.83 is less than the critical value of z = 1.96, we are unable to reject H0. We conclude that the population correlation coefficient may be .80.
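The same Fisher-z computation can be sketched directly with natural logarithms. Python's math.atanh is exactly Equation 9.7.4, while Table I requires interpolation, so the resulting Z differs slightly from the 1.83 obtained above; the decision at the 1.96 critical value is unchanged:

```python
import math

r, n, rho0 = 0.848, 155, 0.80

# math.atanh(v) equals 0.5 * log((1 + v) / (1 - v)), i.e., Equation 9.7.4.
z_r = math.atanh(r)       # about 1.2490 (the table interpolation gave 1.24726)
z_rho = math.atanh(rho0)  # about 1.0986

# Equation 9.7.6, approximately standard normal when H0: rho = .80 is true.
Z = (z_r - z_rho) * math.sqrt(n - 3)
print(round(Z, 2))  # about 1.85, versus 1.83 from the table; still below 1.96
```

With either version of the arithmetic, H0: ρ = .80 is not rejected at the .05 level.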
For sample sizes less than 25, Fisher's Z transformation should be used with caution, if at all. An alternative procedure from Hotelling (6) may be used for sample sizes equal to or greater than 10. In this procedure the following transformation of r is employed:

        z* = z - (3z + r)/(4n)        (9.7.7)

where z is the Fisher transformation of r given in Equation 9.7.4.
The standard deviation of z* is

        σ_z* = 1/√(n - 1)        (9.7.8)

The test statistic is

        Z* = (z* - ζ*)√(n - 1)        (9.7.9)

where

        ζ* = ζ - (3ζ + ρ)/(4n)

and ζ (pronounced zeta) is the transformation of Equation 9.7.4 applied to the hypothesized ρ. Critical values for comparison purposes are obtained from the standard normal distribution.

In our present example, to test H0: ρ = .80 against HA: ρ ≠ .80 using the Hotelling transformation and α = .05, we have

        z* = 1.24726 - [3(1.24726) + .848]/[4(155)] = 1.2399

        ζ* = 1.09861 - [3(1.09861) + .80]/[4(155)] = 1.0920

        Z* = (1.2399 - 1.0920)√(155 - 1) = 1.84

Since 1.84 is less than 1.96, the null hypothesis is not rejected, and the same conclusion is reached as when the Fisher transformation is used.
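The Hotelling computation can be sketched the same way; exact atanh values are used in place of the Table I interpolation, so the figures differ slightly in the third decimal, but the conclusion is the same:

```python
import math

r, n, rho0 = 0.848, 155, 0.80

z = math.atanh(r)       # Fisher's z for the observed r (Equation 9.7.4)
zeta = math.atanh(rho0) # Fisher's z for the hypothesized rho

# Hotelling's adjusted quantities (Equation 9.7.7 and its counterpart for rho).
z_star = z - (3 * z + r) / (4 * n)
zeta_star = zeta - (3 * zeta + rho0) / (4 * n)

# Equation 9.7.9, compared against the standard normal distribution.
Z_star = (z_star - zeta_star) * math.sqrt(n - 1)
print(round(zeta_star, 4), round(Z_star, 2))  # zeta_star is about 1.0920; Z_star stays below 1.96
```

For this sample size the Fisher and Hotelling procedures lead to the same decision.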
Alternatives    In some situations the data available for analysis do not meet the assumptions necessary for the valid use of the procedures discussed here for testing hypotheses about a population correlation coefficient. In such cases it may be more appropriate to use the Spearman rank correlation technique discussed in Chapter 13.
Confidence Interval for ρ    Fisher's transformation may be used to construct 100(1 - α) percent confidence intervals for ρ. The general formula for a confidence interval

        estimator ± (reliability factor)(standard error)

is employed. We first convert our estimator, r, to z_r, construct a confidence interval about z_ρ, and then reconvert the limits to obtain a 100(1 - α) percent confidence interval about ρ. The general formula then becomes

        z_r ± z(1/√(n - 3))        (9.7.10)



For our present example the 95 percent confidence interval for z_ρ is given by

        1.24726 ± 1.96(1/√(155 - 3))
        1.08828, 1.40624

Converting these limits (by interpolation in Table I), which are values of z_r, into values of r gives

        1.08828    .7962
        1.40624    .8866

We are 95 percent confident, then, that ρ is contained in the interval .7962 to .8866. Because of the limited entries in the table, these limits must be considered as only approximate.
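The reconversion step can be done without Table I, since the hyperbolic tangent inverts Fisher's transformation exactly; a sketch for the present example:

```python
import math

r, n, z_crit = 0.848, 155, 1.96  # 95 percent reliability factor

z_r = math.atanh(r)                      # Equation 9.7.4, computed exactly
half_width = z_crit / math.sqrt(n - 3)   # Equation 9.7.10's margin

# math.tanh inverts Equation 9.7.4, so no table interpolation is needed
# when reconverting the z limits back to the r scale.
lo = math.tanh(z_r - half_width)
hi = math.tanh(z_r + half_width)
print(round(lo, 4), round(hi, 4))  # close to the table-based limits .7962 and .8866
```

The small differences from the table-based limits come from interpolating z_r as 1.24726 rather than computing it exactly.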

EXERCISES

In each of the following exercises:
(a) Prepare a scatter diagram.
(b) Compute the sample correlation coefficient.
(c) Test H0: ρ = 0 at the .05 level of significance and state your conclusions.
(d) Determine the p value for the test.
(e) Construct the 95 percent confidence interval for ρ.
9.7.1 The purpose of a study by Brown and Persley (A-8) was to characterize acute hepatitis A in patients more than 40 years old. They performed a retrospective chart review of 20 subjects who were diagnosed with acute hepatitis A, but were not hospitalized. Of interest was the use of age (years) to predict bilirubin levels (mg/dL). The following data were collected.

Age (Years)   Bilirubin (mg/dL)   Age (Years)   Bilirubin (mg/dL)
78                                44             7.0
72            12.9                42             1.8
81            14.3                45              .8
<o             8.0                78             3.8
64            14.1                44             3.5
48            10.9                50             5.1
46            12.3                57            16.5
42             1.0                52             3.5
58             5.2                58             5.6
52             5.1                45             1.9

SOURCE: Geri R. Brown, M.D. Used with permission.


9.7.2 Another variable of interest in the study by Reiss et al. (A-3) (see Exercise 9.3.4) was partial thromboplastin time (aPTT), the standard test used to monitor heparin anticoagulation. Use the data in the following table to examine the correlation between aPTT levels as measured by the CoaguChek point-of-care assay and standard laboratory hospital assay in 90 subjects receiving heparin alone, heparin with warfarin, and warfarin and exoenoxaparin.
                                                 Warfarin and
       Heparin                 Warfarin          Exoenoxaparin

CoaguChek  Hospital    CoaguChek  Hospital    CoaguChek  Hospital
aPTT       aPTT        aPTT       aPTT        aPTT       aPTT

 49.3    71.4     18.0    77.0     56.5    46.5
 57.9    86.4     31.2    62.2     50.7    34.9
 59.0    75.6     58.7             53.2    28.0
 17.3    54.5     7<'     53.0     64.8    52.3
 nz.J    57.7     18.0    45.7     41.2    37.5
 44.3    59.5     82.6    81.1     90.1    47.1
 90.0    77.2     29.6    40.9     23.1    27.1
 55.4    63.3     82.9    75.4     53.2    40.6
 20.3    27.6     58.7    55.7     zl-a    37.8
 28.7    52.6     64.8    54.0     67.5    50.4
 64.3   101.6     37.9    79.4     33.6    34.2
 90.4    89.4     81.2    62.5     45.1    34.8
 62.3    66.2     18.0    36.5     56.2    nA q
 89.8    69.8     38.8    32.8     26.0    28.2
 74.7    91.3     95.4    68.9     67.8    216.5
150.0   118.8     53.7    71.3     40.7    41.0
 32.4    30.9    128.3   111.1     36.2    35.7
 20.9    65.2     60.5    80.5     60.8    47.2
 89.5    77.9    150.0   150.0     30.2    39.7
 44.7    91.5     38.5    46.5     18.0    31.3
 61.0    90.5     58.9    89.1     55.6    53.0
 30.4    33.6    112.8    66.7     18.0    27.4
 52.9    88.0     26.7    29.5     18.0    35.7
 57.5    69.9     49.7    47.8     16.3    62.0
 39.1    41.0     85.6    63.3     75.3    36.7
 74.8    81.7     68.8    43.5     12'     85.3
 32.3    33.3     18.0    54.0     42.0    38.3
125.7   142.9     92.6   100.5     49.3    39.8
 77.1    98.2     46.2    52.4     22.8    42.3
143.8   108.3     60.5    93.7     35.8    36.0

SOURCE: Curtis E. Haas, Pharm.D. Used with permission.

9.7.3 In the study by Parker et al. (A-4) (see Exercise 9.3.5), the authors also looked at the change in AUC (area under the curve of plasma concentration of digoxin) when comparing digoxin levels taken with and without grapefruit juice. The following table gives the AUC when digoxin was consumed with water (ng·hr/ml) and the change in AUC when digoxin is taken with grapefruit juice (GFJ, %).

Water AUC Level (ng·hr/ml)   Change in AUC with GFJ (%)
  6.96      17.4
  5.59      24.5
  5.31       8.5
  8.22      20.8
 11.91     -26.7
  9.50     -29.3
 11.28     -16.8

SOURCE: Robert B. Parker, Pharm.D. Used with permission.

9.7.4 An article by Tuzson et al. (A-9) in Archives of Physical Medicine and Rehabilitation reported the following data on peak knee velocity in walking (measured in degrees per second) at flexion and extension for 18 subjects with cerebral palsy.

Flexion (°/s)   Extension (°/s)
100    100
150    150
210    180
255    165
200    210
185    155
440    440
u0     180
400    400
160    140
150    250
425    275
375    340
400    400
400    450
300    300
300    300
320    275

SOURCE: Ann E. Tuzson, Kevin P. Granata, and Mark F. Abel, "Spastic Velocity Threshold Constrains Functional Performance in Cerebral Palsy," Archives of Physical Medicine and Rehabilitation, 84 (2003), 1363-1368.

9.7.5 Amyotrophic lateral sclerosis (ALS) is characterized by a progressive decline of motor function. The degenerative process affects the respiratory system. Butz et al. (A-10) investigated the longitudinal impact of nocturnal noninvasive positive-pressure ventilation on patients with ALS. Prior to treatment, they measured partial pressure of arterial oxygen (Pao2) and partial pressure of arterial carbon dioxide (Paco2) in patients with the disease. The results were as follows:

Paco2    Pao2
40.0    101.0
47.0     69.0
34.0    132.0
42.0     65.0
54.0     72.0
48.0     76.0
53.6     67.2
56.9     70.9
58.0     73.0
45.0     66.0
54.5     80.0
54.0     72.0
43.0    105.0
44.3    113.0
53.9     69.2
41.8     66.7
33.0     67.0
43.1     77.5
52.4     65.1
37.9     71.0
JZ+.)    86.5
40.1     74.7
33.0     94.0
59.9     60.4
62.6     52.5
54.1     76.9
45.7     65.3
40.6     80.3
56.6     >5.2
59.0     71.9

SOURCE: M. Butz, K. H. Wollinsky, U. Widemuth-Catrinescu, A. Sperfeld, S. Winter, H. H. Mehrkens, A. C. Ludolph, and H. Schreiber, "Longitudinal Effects of Noninvasive Positive-Pressure Ventilation in Patients with Amyotrophic Lateral Sclerosis," American Journal of Medical Rehabilitation, 82 (2003), 597-604.

9.7.6 A simple random sample of 15 apparently healthy children between the ages of 6 months and 18 years yielded the following data on age, X, and liver volume per unit of body weight (ml/kg), Y:

  X      Y       X      Y
  .5    41     10.0    26
  .7    55     10.1    35
 2.5    41     10.9    25
 4.1    39     11.5    31
 5.9    50     12.1    31
 6.1    32     14.1    29
 7.0    41     15.0    23
 8.2    42

9.8 SOME PRECAUTIONS

Regression and correlation analysis are powerful statistical tools when properly employed. Their inappropriate use, however, can lead only to meaningless results. To aid in the proper use of these techniques, we make the following suggestions:

1. The assumptions underlying regression and correlation analysis should be reviewed carefully before the data are collected. Although it is rare to find that assumptions are met to perfection, practitioners should have some idea about the magnitude of the gap that exists between the data to be analyzed and the assumptions of the proposed model, so that they may decide whether they should choose another model; proceed with the analysis, but use caution in the interpretation of the results; or use the chosen model with confidence.

2. In simple linear regression and correlation analysis, the two variables of interest are measured on the same entity, called the unit of association. If we are interested in the relationship between height and weight, for example, these two measurements are taken on the same individual. It usually does not make sense to speak of the correlation, say, between the heights of one group of individuals and the weights of another group.

3. No matter how strong is the indication of a relationship between two variables, it should not be interpreted as one of cause and effect. If, for example, a significant sample correlation coefficient between two variables X and Y is observed, it can mean one of several things:
   a. X causes Y.
   b. Y causes X.
   c. Some third factor, either directly or indirectly, causes both X and Y.
   d. An unlikely event has occurred and a large sample correlation coefficient has been generated by chance from a population in which X and Y are, in fact, not correlated.
   e. The correlation is purely nonsensical, a situation that may arise when measurements of X and Y are not taken on a common unit of association.
4. The sample regression equation should not be used to predict or estimate outside the range of values of the independent variable represented in the sample. This practice, called extrapolation, is risky. The true relationship between two variables, although linear over an interval of the independent variable, sometimes may be described at best as a curve outside this interval. If our sample by chance is drawn only from the interval where the relationship is linear, we have only a limited representation of the population, and to project the sample results beyond the interval represented by the sample may lead to false conclusions. Figure 9.8.1 illustrates the possible pitfalls of extrapolation.

FIGURE 9.8.1 Example of extrapolation.

i
I
I
I
L
46s. CIIAPTER 9 SIMPLE LINEAR REGRESSIONAND CORRELATION

9.9 SUMMARY
In this chapter, two important tools of statistical analysis, simple linear regression
and correlation, are examined. The following outline for the application of these
techniques has been suggested.

1. Identify the model. Practitioners must know whether the regression model or the correlation model is the appropriate one for answering their questions.
2. Review assumptions. It has been pointed out several times that the valid-
ity of the conclusions depends on how well the analyzed data fit the chosen
model.
3. Obtain the regression equation. We have seen how the regression equation
is obtained by the method of least squares. Although the computations,
when done by hand, are rather lengthy, involved, and subject to error, this
is not the problem today that it has been in the past. Computers are now in
such widespread use that the researcher or statistician without access to one
is the exception rather than the rule. No apology for lengthy computations
is necessary to the researcher who has a computer available'
4. Evaluate the equation. We have seen that the usefulness of the regression equation for estimating and predicting purposes is determined by means of the analysis of variance, which tests the significance of the regression mean square. The strength of the relationship between two variables under the correlation model is assessed by testing the null hypothesis that there is no correlation in the population. If this hypothesis can be rejected we may conclude, at the chosen level of significance, that the two variables are correlated.
5. Use the equation. Once it has been determined that it is likely that the regression equation provides a good description of the relationship between two variables, X and Y, it may be used for one of two purposes:
   a. To predict what value Y is likely to assume, given a particular value of X, or
   b. To estimate the mean of the subpopulation of Y values for a particular value of X.

This necessarily abridged treatment of simple linear regression and correlation may have raised more questions than it has answered. It may have occurred to the reader, for example, that a dependent variable can be more precisely predicted using two or more independent variables rather than one. Or, perhaps, he or she may feel that knowledge of the strength of the relationship among several variables might be of more interest than knowledge of the relationship between only two variables. The exploration of these possibilities is the subject of the next chapter, and the reader's curiosity along these lines should be at least partially relieved.

For those who would like to pursue further the topic of regression analysis, a number of excellent references are available, including those by Dielman (7), Hocking (8), Mendenhall and Sincich (9), and Neter et al. (10).

REVIEW QUESTIONS AND EXERCISES

1. What are the assumptions underlying simple linear regression analysis when one of the objectives is to make inferences about the population from which the sample data were drawn?

2. Why is the regression equation called the least-squares equation?

3. Explain the meaning of a in the sample regression equation.

4. Explain the meaning of b in the sample regression equation.

5. Explain the following terms:
(a) Total sum of squares
(b) Explained sum of squares
(c) Unexplained sum of squares

6. Explain the meaning of and the method of computing the coefficient of determination.

7. What is the function of the analysis of variance in regression analysis?

8. Describe three ways in which one may test the null hypothesis that β = 0.

9. For what two purposes can a regression equation be used?

10. What are the assumptions underlying simple correlation analysis when inference is an objective?

11. What is meant by the unit of association in regression and correlation analysis?

12. What are the possible explanations for a significant sample correlation coefficient?

13. Explain why it is risky to use a sample regression equation to predict or to estimate outside the range of values of the independent variable represented in the sample.

14. Describe a situation in your particular area of interest where simple regression analysis would be useful. Use real or realistic data and do a complete regression analysis.

15. Describe a situation in your particular area of interest where simple correlation analysis would be useful. Use real or realistic data and do a complete correlation analysis.

In each of the following exercises, carry out the required analysis and test hypotheses at the indicated significance levels. Compute the p value for each test.
16. A study by Scrogin et al. (A-11) was designed to assess the effects of concurrent manipulations of dietary NaCl and calcium on blood pressure as well as blood pressure and catecholamine responses to stress. Subjects were salt-sensitive, spontaneously hypertensive male rats. Among the analyses performed by the investigators was a correlation between baseline blood pressure and plasma epinephrine concentration (E). The following data on these two variables were collected. Let α = .01.

   BP     Plasma E      BP     Plasma E

163.90    248.00     143.20    179.00
195.15    339.20     166.00    160.40
170.20    193.20     160.40    263.50
171.10    307.20     170.90    184.70
148.60     80.80     150.90    227.50
195.70    550.00     159.60     92.35
151.00     70.00     141.60    139.35
166.20     66.00     160.10    173.80
177.80    120.00     166.40    224.80
165.10    281.60     162.00    183.60
174.70    296.70     214.20    441.60
164.30    217.30     179.70    612.80
152.50     88.00     178.10    401.60
202.30    268.00     198.30    132.00
171.70    265.50

SOURCE: Karie E. Scrogin. Used with permission.

17. Dean Parmalee (A-12) wished to know if the year-end grades assigned to Wright State University Medical School students are predictive of their second-year board scores. The following table shows, for 89 students, the year-end score (AVG, in percent of 100) and the score on the second-year medical board examination (BOARD).

 AVG    BOARD     AVG    BOARD     AVG    BOARD

95.73    257     85.91    208     82.01    196
94.03    256     85.81    210     81.86    179
91.51    242     85.35    212     81.70    207
91.49    223     85.30    225     81.65    202
91.13    241     85.27    203     81.51    230
90.88    234     85.05    214     81.07    200
90.83    226     84.58    176     80.95    200
90.60    236     84.51    196     80.92    160
90.30    250     84.51    207     80.84    205
90.29    226     84.42    207     80.77    194
89.93    233     84.34    211     80.72    196
89.83    241     84.34    202     80.69    171
89.65    234     84.13    229     80.58    201
89.47    231     84.13    202     80.57    177
88.87    228     84.09    184     80.10    192
88.80    229     83.98    206     79.38    187
88.66    233     83.93    202     78.75    161
88.55    216     83.92    176     78.32    172
88.43    207     83.73    204     78.17    163
88.34    ,,4     83.47    208     77.39    166
87.95    237     83.27    211     76.30    170
87.79    ttI     83.13    196     75.85    159
87.01    215     83.05    203     75.60    154
86.86    187     83.02    188     75.16    169
86.85    204     82.82    169     74.85    159
86.84    219     82.78    205     74.66    167
86.30    228     82.57    183     74.58    154
86.13    210     82.56    181     74.16    148
86.10    216     82.45    173     70.34    159
85.92    212     82.24    185

SOURCE: Dean Parmalee, M.D. and the Wright State University Statistical Consulting Center. Used with permission.

Perform a complete regression analysis with AVG as the independent variable. Let α = .05 for all tests.

18. Maria Mathias (A-13) conducted a study of hyperactive children. She measured the children's attitude, hyperactivity, and social behavior before and after treatment. The following table shows for 31 subjects the age and improvement scores from pre-treatment to post-treatment for attitude (ATT), social behavior (SOC), and hyperactivity (HYP). A negative score for HYP indicates an improvement in hyperactivity; a positive score in ATT or SOC indicates improvement. Perform an analysis to determine if there is evidence to indicate that age (years) is correlated with any of the three outcome variables. Let α = .05 for all tests.

Subject
No.   AGE   ATT   HYP   SOC

1     o     -1.2  -1.2   0.0
2     o      0.0   0.0   1.0
3     13    -0.4   0.0   0.2
4     6     -0.4  -0.2   1.2
5     9      1.0  -0.8   0.2
6     8      0.8   0.2   0.4
7     8     -0.6  -0.2   0.6
8     I     -1.2  -0.8  -0.6
(Continued)
462 CHAPTER 9 SIMPLE LINEAR REGRESSION AND CORRELATION

Subject
No.   AGE   ATT   HYP   SOC

9     7      0.0   0.2   0.8
10    12     0.4  -0.8   0.4
11    9     -0.8   0.8  -0.2
12    10     1.0  -0.8   1.2
13    12     1.4  -1.6   0.6
14    I      1.0  -0.2  -0.2
15    12     0.8  -0.8   1.0
16    9      1.0   0.4   0.4
17    10     0.4  -0.2   0.6
18    7      0.0  -0.4   0.6
19    12     1.1  -0.6   0.8
20    9      0.2  -0.4   0.2
21    7      0.4  -0.2   0.6
22    6      0.0  -3.2   1.0
23    11     0.6  -0.4   0.0
24    11     0.4  -0.4   0.0
25    11     1.0  -0.7  -0.6
26    11     0.8  -0.8   0.0
27    11     1.2   0.6   1.0
28    11     0.2   0.0  -0.2
29    11     0.8  -1.2   0.3
30    8      0.0   0.0  -0.4
31    9      0.4  -0.2   0.2

SOURCE: Maria Mathias, M.D., and the Wright State University Statistical Consulting Center. Used with permission.
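The correlation screen this exercise asks for reduces to Pearson's r and the test statistic t = r·sqrt(n − 2)/sqrt(1 − r²), with n − 2 degrees of freedom. A hedged sketch on hypothetical data (not the ATT/HYP/SOC scores); the computed t would still be compared with the t table at α = .05:

```python
import math

# Pearson's r and the t statistic for H0: rho = 0 (df = n - 2).
# The data are hypothetical stand-ins, not values from the exercise.

def corr_t(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)  # df = n - 2
    return r, t

r, t = corr_t([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

For the exercise, the same function would be run once per outcome variable (ATT, HYP, SOC) against age.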

19. A study by Triller et al. (A-14) examined the length of time required for home health-care
nurses to repackage a patient's medications into various medication organizers (i.e., pill
boxes). For the 19 patients in the study, researchers recorded the time required for repack-
aging of medications. They also recorded the number of problems encountered in the
repackaging session.

Patient No.  No. of Problems  Repackaging Time (Minutes)   Patient No.  No. of Problems  Repackaging Time (Minutes)

1     9    38      11    1    10
2     2    25      12    2    15
3     0    5       13    1    17
(Continued)
REVIEW QUESTIONS AND EXERCISES 463

4     6    18      14    0    18
5     3    15      15    0
6     3    25      16    10   29
7     10           17    0    5
8     1    5       18    1    22
9     1    10      19    1    20
10    0    15

SOURCE: Darren M. Triller, Pharm.D. Used with permission.

Perform a complete regression analysis of these data, using the number of problems to predict the time it took to complete a repackaging session. Let α = .05 for all tests. What conclusions can be drawn from your analysis? How might your results be used by health-care providers?
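A complete regression analysis also reports the coefficient of determination. One way to compute r² without fitting residuals is Sxy²/(Sxx·Syy), which equals the squared Pearson correlation; the values below are hypothetical, not the repackaging data:

```python
# r^2 as the squared correlation: Sxy^2 / (Sxx * Syy).
# Hypothetical illustration data, not the repackaging times.

def r_squared(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

r2 = r_squared([0, 1, 2, 3, 6, 9, 10], [5, 10, 15, 18, 25, 38, 30])
```

Here r2 comes out to about .91, meaning roughly 91 percent of the variation in y is accounted for by the fitted line.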

20. The following are the pulmonary blood flow (PBF) and pulmonary blood volume (PBV) values recorded for 16 infants and children with congenital heart disease:

Y                 X
PBV (ml/sqM)      PBF (L/min/sqM)

168    4.31
280    3.40
391    6.20
420    17.30
303    12.30
429    13.99
605    8.73
522    8.90
224    5.87
291    5.00
233    3.51
370    4.24
531    19.41
516    16.61
211    7.21
439    11.60

Find the regression equation describing the linear relationship between the two variables, compute r², and test H0: β = 0 by both the F test and the t test. Let α = .05.
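The F test and the t test asked for here are equivalent in simple linear regression: the ANOVA F statistic (1 and n − 2 degrees of freedom) equals the square of the slope's t statistic. The sketch below checks this numerically on hypothetical data, not the PBV/PBF values:

```python
import math

# ANOVA F test for H0: beta = 0 in simple linear regression, plus the
# slope's t statistic, on hypothetical data. F should equal t squared.

def regression_anova(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))  # error SS
    sst = sum((b - ybar) ** 2 for b in y)                      # total SS
    ssr = sst - sse                                            # regression SS
    mse = sse / (n - 2)        # error mean square
    f = (ssr / 1) / mse        # regression has 1 df
    t = b1 / math.sqrt(mse / sxx)
    return f, t

f, t = regression_anova([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

Either statistic is then compared with its table value at α = .05; rejecting with one implies rejecting with the other.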

21. Fifteen specimens of human sera were tested comparatively for tuberculin antibody by two methods. The logarithms of the titers obtained by the two methods were as follows:

Method

A (X)   B (Y)

3.31    4.09
2.41    3.84
2.72    3.65
2.41    3.20
2.11    2.97
2.11    3.22
3.01    3.96
2.13    2.76
2.41    3.42
2.10    3.38
2.41    3.28
2.09    2.93
3.00    3.54
2.08    3.14
2.11    2.76

Find the regression equation describing the relationship between the two variables, compute r², and test H0: β = 0 by both the F test and the t test.

22. The following table shows the methyl mercury intake and whole blood mercury values in
12 subjects exposed to methyl mercury through consumption of contaminated fish:

X                        Y
Methyl Mercury           Mercury in
Intake (μg Hg/day)       Whole Blood ("ds)

180    90
200    120
230    125
410    290
600    310
550    290
275    170
580    375
105    70
250    105
460    205
650    480
Find the regression equation describing the linear relationship between the two variables, compute r², and test H0: β = 0 by both the F and t tests.
23. The following are the weights (kg) and blood glucose levels (mg/100 ml) of 16 apparently healthy adult males:

Weight (X)   Glucose (Y)

64.0    108
75.3    109
73.0    104
82.1    102
76.2    105
95.7    121
59.4    79
93.4    107
82.1    101
78.9    85
76.7    99
82.1    100
83.9    108
73.0    104
64.4    102
77.6    a'7

Find the simple linear regression equation and test H0: β = 0 using both ANOVA and the t test. Test H0: ρ = 0 and construct a 95 percent confidence interval for ρ. What is the predicted glucose level for a man who weighs 95 kg? Construct the 95 percent prediction interval for his glucose level. Let α = .05 for all tests.
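The prediction interval requested here widens the interval for the mean response by an extra MSE term: ŷ0 ± t(1 − α/2, n − 2)·sqrt(MSE·(1 + 1/n + (x0 − x̄)²/Sxx)). A sketch on hypothetical data; the hard-coded 3.182 is the two-sided .05 critical value of t for 3 degrees of freedom and must be replaced for other sample sizes:

```python
import math

# 95 percent prediction interval for a new Y at x0, simple linear
# regression. Data are hypothetical; t_crit is t(.025, n-2) for n = 5.

def prediction_interval(x, y, x0, t_crit):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    mse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y)) / (n - 2)
    yhat = b0 + b1 * x0
    half = t_crit * math.sqrt(mse * (1 + 1 / n + (x0 - xbar) ** 2 / sxx))
    return yhat - half, yhat + half

lo, hi = prediction_interval([1, 2, 3, 4, 5], [2, 1, 4, 3, 5], 3.0, 3.182)
```

Dropping the leading 1 inside the square root gives the narrower confidence interval for the mean response at x0 instead.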
24. The following are the ages (years) and systolic blood pressures
of 20 apparently healthy adults:

Age (X)   BP (Y)   Age (X)   BP (Y)

20    120    46    128
42    128    53    136
63    141    70    146
26    126    20    124
53    134    63    143
31    128    43    130
58    136    26    124
46    132    19    121
58    140    31    126
70    144    22    123

Find the simple linear regression equation and test H0: β = 0 using both ANOVA and the t test. Test H0: ρ = 0 and construct a 95 percent confidence interval for ρ. Find the 95 percent prediction interval for the systolic blood pressure of a person who is 25 years old. Let α = .05 for all tests.
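For the confidence interval for ρ, the usual route is Fisher's z transformation: arctanh(r) is approximately normal with standard error 1/sqrt(n − 3), so the interval is built on the z scale and transformed back. A sketch with a hypothetical r and n, not values computed from the blood-pressure data:

```python
import math

# 95 percent confidence interval for rho via Fisher's z transformation.
# r and n below are hypothetical, not from the exercise data.

def rho_ci(r, n, z_crit=1.96):
    z = math.atanh(r)               # Fisher transform of r
    se = 1 / math.sqrt(n - 3)       # approximate standard error on z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

lo, hi = rho_ci(0.8, 20)
```

Note the resulting interval (about .55 to .92) is not symmetric about r = .8; that asymmetry is a consequence of the back-transformation.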

25. The following data were collected during an experiment in which laboratory animals
were
inoculated with a pathogen. The variables are time in hours after inoculation and
tem-
perature in degrees Celsius.

Time   Temperature   Time   Temperature

24    38.8    44    41.1
28    39.5    48    41.4
32    40.3    52    41.6
36    40.7    56    41.8
40    41.0    60    41.9

Find the simple linear regression equation and test H0: β = 0 using both ANOVA and the t test. Test H0: ρ = 0 and construct a 95 percent confidence interval for ρ. Construct the 95 percent prediction interval for the temperature at 50 hours after inoculation. Let α = .05 for all tests.

For each of the studies described in Exercises 26 through 28, answer as many of
the fol-
lowing questions as possible.
(a) Which is more relevant, regression analysis or correlation analysis, or are both tech-
niques equally relevant?
(b) Which is the independent variable?
(c) Which is the dependent variable?
(d) What are the appropriate null and alternative hypotheses?
(e) Do you think the null hypothesis was rejected? Explain why or why not.
(f) Which is the more relevant objective, prediction or estimation, or are the two equally relevant?
(g) What is the sampled population?
(h) What is the target population?
(i) Are the variables directly or inversely related?

26. Lamarre-Cliche et al. (A-15) state that "The QT interval corrected for heart rate (QTc) is believed to reflect sympathovagal balance. It has also been established that β-blockers influence the autonomic nervous system." The researchers performed correlation analysis to measure the association between QTc interval, heart rate, heart rate change, and therapeutic blood pressure response for 73 hypertensive subjects taking β-blockers. The researchers found that QTc interval length, pretreatment heart rate, and heart rate change with therapy were not good predictors of blood pressure response to β1-selective β-blockers in hypertensive subjects.
27. Skinner et al. (A-16) conducted a cross-sectional telephone survey to obtain 24-hour dietary recall of infants' and toddlers' food intakes, as reported by mothers or other primary caregivers. One finding of interest was that among 561 toddlers ages 15-24 months, the age in weeks of the child was negatively related to vitamin C density (b = -.a\, p = .01). When predicting calcium density, age in weeks of the child produced a slope coefficient of -1.41 with a p of .09.


28. Park et al. (A-17) studied 29 male subjects with clinically confirmed cirrhosis. Among other variables, they measured whole blood manganese levels (MnB), plasma manganese (MnP), urinary manganese (MnU), and pallidal index (PI), a measure of signal intensity in T1-weighted magnetic resonance imaging (MRI). They found a correlation coefficient of .559, p < .01, between MnB and PI. However, there were no significant correlations between MnP and PI or MnU and PI (r = .353, p > .05 and r = .252, p > .05, respectively).

For the studies described in Exercises 29 through 46, do the following:


(a) Perform a statistical analysis of the data (including hypothesis testing and confidence
interval construction) that you think would yield useful information for the researchers.
(b) Construct graphs that you think would be helpful in illustrating the relationships
among variables.
(c) Where you think appropriate, use techniques learned in other chapters, such as analysis
of variance and hypothesis testing and interval estimation regarding means and proportions.
(d) Determine p values for each computed test statistic.
(e) State all assumptions that are necessary to validate your analysis.
(f) Describe the population(s) about which you think inferences based on your analysis
would be applicable.
(g) If available, consult the cited reference and compare your analyses and results with
those of the authors.

29. Moerloose et al. (A-18) conducted a study to evaluate the clinical usefulness of a new laboratory technique (method A) for use in the diagnosis of pulmonary embolism (PE). The
performance of the new technique was compared with that of a standard technique
(method B). Subjects consisted of patients with clinically suspected PE who were admitted
to the emergency ward of a European university hospital. The following are the measure-
ments obtained by the two techniques for 85 patients. The researchers performed two
analyses: (1) on all 85 pairs of measurements and (2) on those pairs of measurements for
which the value for method B was less than 1000.

9 1I9 703 599 2526 1830


84 t15 725 610 2600 1BB0
86 lOB 727 3900 2770 2100
190 tgz 745 4050 3100 1780
208 294 752 785 3270 IB70
2IB 226 BB4 9T4 32BO 2480
251 311 920 1520 3410 1440
252 250 966 972 3530 2L90
256 3I2 985 913 3900 2340
264 403 994 556 4260 3490
282 296 1050 1330 4300 4960
294 296 l1l0 t4l0 4560 7180
296 303 ll70 484 4610 1390
(Continued)

i

3tt JJO lt90 867 4Br0 1600


344 1250 1350 5070 3770
J l t 257 l2B0 1560 5470 27BO
407 424 1330 1290 5576 2730
4tB 265 1340 1540 6230 1260
422 1400 1710 6260 2870
459 4r2 1530 1333 6370 2210
468 389 1560 t250 6430 2210
48t 4t4 lB40 764 6500 2380
529 667 lB70 t6B0 7t20 5220
540 486 2070 1310 7430 2650
562 720 2120 1360 7800 4910
574 343 2170 t770 BB90 4080
646 5tB 2270 2240 9930 3840
664 B0r 2490 t9l0
670 760 2520 2It0

SOURCE: Dr. Philippe de Moerloose. Used with permission.

30. Research by Huhtaniemi et al. (A-19) focused on the quality of serum luteinizing hormone
(LH) during pubertal maturation in boys. Subjects, consisting of healthy boys entering
puberty (ages 11 years 5 months to 12 years), were studied over a period of 18 months.
The following are the concentrations (IU/L) of bioactive LH (B-LH) and immunoreactive
LH (I-LH) in serum samples taken from the subjects. Only observations in which the subjects' B/I ratio was greater than 3.5 are reported here.

I-LH B-LH I-LH B-LH

.104 .97 3.63


.041 .28 .49 2.26
.t24 .64 t 4.55
.BOB 2.32 t.t7 5.06
.403 l.28 1,46 4.BI
.27 .9 t.97 B.lB
.49 2.45 .BB 2.48
.66 2.8 t.24 4.8
.82 2.6 r.54 3.t2
1.09 4.5 1.71 8.4
r.05 3.2 l.tl 6
.83 3.65 1.35 7.2
.89 5.25 1.59 7.6
.75 2.9

SOURCE: Dr. Ilpo T. Huhtaniemi. Used with permission.


31. Tsau et al. (A-20) studied urinary epidermal growth factor (EGF) excretion in normal children and those with acute renal failure (ARF). Random urine samples followed by 24-hour urine collection were obtained from 25 children. Subjects ranged in age from 1 month to 15 years. Urinary EGF excretion was expressed as a ratio of urinary EGF to urinary creatinine concentration (EGF/Cr). The authors conclude from their research results that it is reasonable to use random urine tests for monitoring EGF excretion. Following are the random (spot) and 24-hour urinary EGF/Cr concentrations (pmol/mmol) for the 25 subjects:

24-h Urine Spot Urine 24-h Urine Spot Urine


Sulrject EGF/Cr (r) EGF/Cr (y) Subjecr EGF/Cr (r) EGF/Cr (y)
I 772 720 t4 254 JJJ

2 223 271 l5 93 84
2
494 314 l6 303 5r2
432 350 T7 408 277
70
IB 7tL 443
155 llB T9 209 309
7 305 J6t 20 l3t 280 rtq\
B 318 432 2l 165 r89
174 97 22 r51 101
10 t31B r309 z) I65 22I
ll 482 406 I25 228
t2 436 426 25 232
13   527   595
* Subjects with ARF.
SOURCE: Dr. Yong-Kwei Tsau. Used with permission.

32. One of the reasons for a study by Usaj and Starc (A-21) was an interest in the behavior of pH kinetics during conditions of long-term endurance and short-term endurance among healthy runners. The nine subjects participating in the study were marathon runners aged 26 ± 5 years. The authors report that they obtained a good correlation between pH kinetics and both short-term and long-term endurance. The following are the short-term (VSB) and long-term (VLB) speeds and blood pH measurements for the participating subjects.

Yr.r Vsn pH Range

5.4 5.6 .083


4.75 5.1 .l
4.6 4.6 .o2I
4.6 .065
4.55 4.9 .056
4.4 4.6 .0t
4.4 4.9 .058
4.2 4.4 .013
4.2 4.5 .03 Sounco: Anton Usaj, Ph.D
Used with permission.

33. Bean et al. (A-22) conducted a study to assess the performance of the isoelectric focusing/immunoblotting/laser densitometry (IEF/IB/LD) procedure to evaluate carbohydrate-deficient transferrin (CDT) derived from dry blood spots. The investigators evaluated paired serum (S) and dry blood spot (DBS) specimens simultaneously for CDT. Assessment of CDT serves as a marker for alcohol abuse. The use of dry blood spots as a source of CDT for analysis by IEF/IB/LD results in simplified sampling, storage, and transportation of specimens. The following are the IEF/IB/LD values in densitometry units (DU) of CDT from 25 serum and dry blood spot specimens:

Specimen No.   S   DBS   Specimen No.   S   DBS

I 64 zt t4 9 1 3
74 3B l5 l 0 B
75 J J l6 1 7 7
4 t03 J' 17 38 L4
t0 9 IB 9 9
6 22 1B l9 t 5 9
7 JJ 20 20 70 31
B t0 5 2T 6L 26
9 3l l4 22 42 t4
10 30 15 ZJ 20 l0
ll 28 12 ,A
58 26
t2 16 9 25 31 12
13 t ) 7

SOURCE: Dr. Pamela Bean. Used with permission.

34. Kato et al. (A-23) measured the plasma concentration of adrenomedullin (AM) in patients
with chronic congestive heart failure due to various cardiac diseases. AM is a hypotensive
peptide, which, on the basis of other studies, the authors say, has an implied role as a cir-
culating hormone in regulation of the cardiovascular system. Other data collected from
the subjects included plasma concentrations of hormones known to affect the cardiovascular system. Following are the plasma AM (fmol/mL) and plasma renin activity (PRA) (ng/L·s) values for 19 heart failure patients:

Patient Sex Ag. AM PRA


No. ( 1= M , 2 = F ) (Years) (fmoVml) (ng/L .s)

I 1 70 t2.lt .480594
z I 44 7.306 .63894
I 72 6.906 t.2t9542
4 I 62 7.056 .450036
5 z 52 9.026 .r9446
6 z 65 10.864 r.966824
(Continued)

Patient Sex fue AM PRA


No. (1 = M,2 = F) (Years) (fmoUml) (ng/L. s)
7 2 64 7.324 .29r69
B t 7l 9.316 1.775142
9 z 6l t7.144 9.33408
t0 I 6B 6.954 .3r947
ll I 63 7.488 1.594572
t2 2 59 10.366 .963966
t3 2 55 r0.334 2.r9t842
t4 2 57 l3 3.97254
t5 z 6B 6.66 .52782
16 5l 8.906 .350028
T7 I 69 8.952 r.7362s
IB t 7l 8.034 .LO2786
l9 I 46 r3.4I l.13B9B
SOURCE: Dr. Johji Kato. Used with permission.

35. In a study reported on in Archives of Disease in Childhood, Golden et al. (A-24) tested the hypothesis that plasma calprotectin (PCal) (a neutrophil cytosolic protein released during neutrophil activation or death) concentration is an early and sensitive indicator of inflammation associated with bacterial infection in cystic fibrosis (CF). Subjects were children with confirmed CF and a control group of age- and sex-matched children without the disease. Among the data collected were the following plasma calprotectin (μg/L) and plasma copper (PCu) (μmol/L) measurements. Plasma copper is an index of acute phase response in cystic fibrosis. The authors reported a correlation coefficient of .48 between plasma calprotectin (log10) and plasma copper.

CF CF
Subject Subject
No. PCal PCu No. PCaI PCu

I 452 17.46 l z l54B 15.3r 22 674 l8.u


2 590 14.84 l3 708 17.00 23 3529 t7.42
r95B ,'7 At
t4 8050 20.00 24 L467 t7.42
4 20t5 lB.5l 15 9942 2s.00 25 nl6 16.73
5 4r7 15.89 16 79L 13.10 26 611 l8.lr
6 2BB4 17.99 17 6227 23.00 27 1083 21.56
7 1862 2t.66 IB 1473 16.70 28 t432 2I.56
B TO47I 19.03 i9 8697 IB.ll 29 4422 22.60
9 25850 16.4I 20 62t 18.80 30 3l9B r8.9l
l0 50lt lB.5l 2l tB32 17.08 31 544 14.37
ll 5I2B 22.70
(Continued)

Control Control
Subject Subjecr
No. PCal PCu No. PCal PCu

I 674 16.73 17 368 16.73


2 368 16.73 1B 674 16.73
J 32r 16.39 l9 Bl5 19.82
4 t592 14.32 20 598 16.1
5 5rB 16.39 2I 684 13.63
6 Bt5 t9.82 22 684 13.63
7 684 t7.96 23 674 t6.73
B 870 t9.82 24 368 16.73
9 781 IB.II 25 tr48 24.t5
l0 727 lB.tl 26 1077 22.30
II 727 l B . lt 27 518 9.49
12 7BI t B . tI 28 1657 16.10
I3 674 16.73 29 Br5 19.82
l4 It73 20.53 30 368 L6.73
l5 Bl5 t9.82 31 t077 22.30
l6 727 l B . lt

SOURCE: Dr. Barbara E. Golden. Used with permission.

36. Gelb et al. (A-25) conducted a study in which they explored the relationship between mod-
erate to severe expiratory airflow limitation and the presence and extent of morphologic
and CT scored emphysema in consecutively seen outpatients with chronic obstructive pul-
monary disease. Among the data collected were the following measures of lung CT and
pathology (PATH) for emphysema scoring:

CT Score PATH CT Score PATII

5 t5 45 50
90 70 45 40
50 20 B5 75
t0 25 7 0
T2 25 BO B5
35 t0 l5 5
40 35 45 40
45 30 35
5 5 75 45
25 50 5 5
60 60 5 20
70 60

SOURCE: Dr. Arthur F. Gelb. Used with permission.

37. The objective of a study by Witteman et al. (A-26) was to investigate skin reactivity with purified major allergens and to assess the relation with serum levels of immunoglobulin E (IgE) antibodies and to determine which additional factors contribute to the skin test result. Subjects consisted of patients with allergic rhinitis, allergic asthma, or both, who were seen in a European medical center. As part of their study, the researchers collected, from 23 subjects, the following measurements on specific IgE (IU/ml) and skin test (ng/ml) in the presence of Lol p 5, a purified allergen from rye grass pollen. We wish to know the nature and strength of the relationship between the two variables. (Note: The authors converted the measurements to natural logarithms before investigating this relationship.)

IsE Skin Test

24.87 .055
12.90 .041034
9.87 .050909
8.74 .046
6.88 .039032
5.90 .050909
4.85 .042142
3.53 .055
2.25 4.333333
2.14 .55
L.94 .050909
r.29 .446t53
.94 .4
.91 .475
.s5 4.461538
.30 4.103448
.14 7.428571
.11 4.461538
.i0 6.625
.r0 49.13043
.t0 36.47058
.r0 52.8s7r4
.r0 47.5
SOURCE: Dr. Jaring S. van der Zee. Used with permission.

38. Garland et al. (A-27) conducted a series of experiments to delineate the complex maternal-fetal pharmacokinetics and the effects of zidovudine (AZT) in the chronically instrumented maternal and fetal baboon (Papio species) during both steady-state intravenous infusion and oral bolus dosage regimens. Among the data collected were the following

measurements on dosage (mg/kg/h) and steady-state maternal plasma AZT concentration


(nglml):

AZT AZT
Dosage Concentration Dosage Concentration

2.5 832 2.0 77r


2.5 672 l.B 757
2.5 904 0.9 2r3
2.5 554 0.6 394
2.5 996 0.9 391
r.9 B7B 430
2.r 815 Ll 440
1.9 805 t.4 352
L.9 592 l.I JJl

0.9 39t 0.8 lBl


1.5 710 0.7 174
r.4 591 1.0 470
r.4 660 l.t 426
t.5 694 0.8 170
1.8 668 1.0 360
1.8 601 0.9 320

SOURCE: Dr. Marianne Garland. Used with permission.

39. The purpose of a study by Halligan et al. (A-28) was to evaluate diurnal variation in blood
pressure (BP) in women who were normotensive and those with preeclampsia. The sub-
jects were similar in age, weight, and mean duration of gestation (35 weeks) . The
researchers collected the following BP readings. As part of their analysis they studied the
relationship between mean day and night measurements and day/night differences for both
diastolic and systolic BP in each group.

C1 C2 c3 C4 C5 CI C2 C3 C4 c5
'74
0 75 56 r27 l0t 94 r37 tI9
0 6B 57 It3 104 90 B6 r39 138
0 72 5B ll5 105 B5 69 138 117
0 7l 5t tlt 94 BO 75 r33 126
0 BI 6t i30 lt0 BI 60 I27 LT2
0 6B 56 ltt l0l B9 79 t37 126
0 7B 60 113 L02 107 lt0 t6I l6t
0 7T 55 t20 99 9B BB I52 l4l
0 65 5l t06 96 7B 74 r34 r32
0 Itt 6l t20 109 BO BO t2r t2r
(Continued.)

CI C2 C3 C4 C5 CI C2 C3 c4 C5
0 74 60 I2t r04 I 96 83 143 I29
0 75 52 r2t 702 I 85 76 137 r31
0 68 50 109 9t I 79 74 135 I20
0 63 49 iOB 99 I 9l 95 139 t35
0 77 47 132 1r5 I 87 67 137 1r5
0 73 5l II2 90 83 64 143
I
1t9
0 73 52 lr8 97 I 94 85 I27 L23
0 64 62 I22 It4 I 85 70 I42 I24
0 64 54 108 94 I 78 6r tt9 tlo
0 66 54 106 BB I B0 59 I29 rt{
0 72 49 u6 10t t 98 t02 156 163
0 83 60 I27 103 100 100 I49
I r49
0 69 50 L2t 104 89 84 I41
t i35
0 72 52 IOB 95 I 98 9I r4B 139
C1 = group (0 = normotensive, 1 = preeclamptic); C2 = day diastolic; C3 = night diastolic; C4 = day systolic; C5 = night systolic.
SOURCE: Dr. Aidan Halligan. Used with permission.

40. Marks et al. (A-29) conducted a study to determine the effects of rapid weight loss on contraction of the gallbladder and to evaluate the effects of ursodiol and ibuprofen on saturation, nucleation and growth, and contraction. Subjects were obese patients randomly assigned to receive ursodiol, ibuprofen, or placebo. Among the data collected were the following cholesterol saturation index (CSI) values and nucleation times (NT) in days of 13 (six male, seven female) placebo-treated subjects at the end of 6 weeks:

CSI    NT

L.20 4.00
r.42 6.00
l.r8 14.00
.BB 21.00
r.05 21.00
1.00 18.00
1.39 6.00
r.3r 10.00
l l7 9.00
r.36 14.00
1.06 2r.00
1.30 8.00
L.7I 2.00
SOURCE: Dr. Jay W. Marks. Used with permission.

4I. The objective of a study by Peacock et al. (A-30) was to investigate whether spinal
osteoarthritis is responsible for the fact that lumbar spine bone mineral density (BMD) is
greater when measured in the anteroposterior plane than when measured in the lateral
plane. Lateral spine radiographs were studied from women (age range 34 to 87 years) who
attended a hospital outpatient department for bone density measurement and underwent
lumbar spine radiography. Among the data collected were the following measurements on
anteroposterior (A) and lateral (L) BMD (g/cm2):

ABMD LBMD ABMD LBMD

.879 .577 1.098 .534 r.09r .836


.824 .622 .BB2 .570 .746 .433
.974 .643 .Bt6 .558 I.t27 .732
.909 .664 I.017 .675 I.4n .766
.872 .559 .669 .590 .75t .397
.930 .663 .857 .666 .786 .515
.9t2 .710 .571 .474 r.031 .574
.758 .592 1.134 .7tr .622 .506
t.072 .702 .705 .492 .B4B .657
.847 .655 .775 .348 .778 .537
1.000 .5rB .968 .579 .784 .4I9
.565 .354 .963 .665 .659 .429
1.036 .839 .933 .626 .948 .485
.Btt .572 .704 .r94 .634 .544
.901 .6L2 .624 .429 .946 .550
1.052 .663 t .l 1 9 .707 r.107 .458
.731 .376 .686 .508 1.583 .975
.637 .4BB .741 .484 r.026 .550
.951 .747 1.028 .787
.822 .610 .649 .469
.951 .7r0 t.166 .796
t.026 .694 .954 .548
r.022 .580 .666 .545
r.047 .706
.737 .526

SOURCE: Dr. Cyrus Cooper. Used with permission.

42. Sloan et al. (A-31) note that cardiac sympathetic activation and parasympathetic withdrawal result in heart rate increases during psychological stress. As indicators of cardiac adrenergic activity, plasma epinephrine (E) and norepinephrine (NE) generally increase in response to psychological challenge. Power spectral analysis of heart period variability also provides estimates of cardiac autonomic nervous system activity. The authors conducted a study to determine the relationship between neurohumoral and two different spectral estimates of cardiac sympathetic nervous system activity during a quiet resting baseline and in response to a psychologically challenging arithmetic task. Subjects were healthy, medication-free male and female volunteers with a mean age of 37.8 years. None had a history of cardiac, respiratory, or vascular disease. Among the data collected were the following measurements on E, NE, low-frequency (LF) and very-low-frequency (VLF) power spectral indices, and low-frequency/high-frequency ratios (LF/HF). Measurements are given for three periods: baseline (B), a mental arithmetic task (MA), and change from baseline to task (DELTA).

Patient No.   E   NE   LF/HF   LF   Period   VLF


5 3.55535 6.28040 0.66706 7.71886 B 7.74600
5 0.05557 0.13960 -0.48115 -0.99826 -2.23823
DELTA
5 3.61092 6.41999 0.18591 6.72059 MA 5.50777
6 3.55535 6.2467I 2.48308 7.33729 B 6.64353
o 0.l082l -0.05374 -2.03738 -0.77I09 -t.27196
DELfA
o 3.66356 6.19236 0.44569 6.56620 MA 5.37I57
7 3.29584 4.91998 -0.15473 7.86663 B 7.99450
7 0.59598 0.53106 0.r4086 -0.81345 -2.8640I
DELTA
7 3.89182 5.45104 -0.01387 7.053i9 MA 5.13049
B 4.00733 5.97635 l.5895t B.l8005 B s.97126
o 0.29673 0.17947 -o.II77I - t.t65B4 -0.39078
o
DELTA
4.30407 6.09582 t.471.80 7.0t421 MA 5.58048
12 3.87120 5.35659 0.47942 6.56488 B 5.94960
12 * * 0.19379 0.03415 DELTA 0.50134
T2 * 0.6732I 6.59903 MA 6.45094
13 3.97029 5.85507 0.13687 6.27444 B 5.58500
t3 -0.20909 0.10851 -0.49619
1.05965 DELTA - t.6B9tI
13 3.76120 s.96358 1.19652 5.77825 MA 3.89589
I4 3.63759 5.62040 O.BB3B9 6.08877 B 6.12490
I4 0.31366 0.07333 1.06100 1.37098 - 1.07633
DEUIA
I4 3.95124 5.69373 I.94489 7.45975 MA 5.04857
1B 4.44265 5.88053 0.99200 7.52268 B 7.19376
IB 0.35314 0.62824 -0.10297 -0.57t42 DELTA -2.06150
IB 4.79579 6.50877 0.88903 6.95126 MA 5.t3226
I9 5.03044 0.62446 6.90677 B 7.39854
to t<
0.69966 -0.88309
0.09578 0.944t3 DEUTA
l9 2.94444 5.73010 0.72024 7.85090 MA 6.57545
20 3.9t202 5.86363 1.11825 8.2634r B 6.89497
20 -0.02020 o.2t40I -0.60117 - 1.13100 -I.12073
DEUTA
20 3.89182 6.07764 0.51708 7.I324I MA 5.77424
2I 3.55535 6.21860 0.78632 8.74397 B 8.26111
2T 0.31585 -0.52487 -t.92II4 -2.38726 -2.08151
DELTA
2l 3.87120 5.69373 - 1.t3483 6.35671 MA 6.17960
22 4.18965 s.76832 -0.02785 8.66907 D
-0.05459 7.51529
22 0.16705 0.93349 -0.89157 DELTA - 1.00414
22 4.3567I 5.7t373 0.90563 7.7775I MA 6.51115
t? 3.95124 5.52545 -0.24196 6.75330 B 6.93020
0.26826 0.16491 -0.00661 0.18354 DELTA - l.1B9l2
zt 4.2I95I 5.69036 -0.24856 6.93684 MA 5.74108
(Continued)


Patient No.   E   NE   LF/HF   LF   Period   VLF

24 3.784t9 5.59842 -0.67478 6.26453 B 6.45268


24 0.32668 -0.L7347 L.44970 0.52t69 DELTA 0.39277
24 4.tt0B7 5.42495 0.77493 6.78622 MA 6.84545
I 3.36730 6.13r23 0.19077 6.75395 B 6.13708
I 0.54473 0.08538 0.79284 0.34637 DELTA -0.56569
I 3.9t202 6.2166r 0.98361 7.r0031 MA 5.57139
J 2.83321 5.92158 r.89472 7.92524 B 6.30664
t.15577 0.64930 -0.75686 - l.5B4Bt DEUIA -1.95636
J 3.9BB9B 6.57088 1.13786 6.34042 MA 4.35028
4 4.29046 5.73657 1.BlBl6 7.02734 B 7.02882
4 0.14036 0.47000 *0.26089 - 1.08028 DELTA _I.43858
4 4.43082 6.20658 t.55727 5.94705 MA 5.59024
5 3.93183 5.62762 r.70262 6.76859 B 6.tll02
5 0.80437 0.67865 -0.2653r -0.29394 DELTA -0.94910
5 4.73620 6.30628 1.4373t 6.47465 MA 5.t6t92
6 3.29584 5.47227 0.tBB52 6.49054 B 6.84279
6 -0.16034 0.27073 -0.16485 - 1.12558 DELTA _ }.B42BB
6 3.13549 5.74300 0.02367 5.36496 MA 4.9999r
B 3.25810 5.37064 -0.0963r 7.23131 B 7.t6371
B 0.40547 -0.13953 0.97906 -0.62894 DELTA -2.I5IOB
B 3.66356 5.23111 0.88274 6.60237 MA 5.01263
9 3.784t9 5.94542 0.77839 5.86t26 B 6.22910
9 0.64663 0.05847 -0.42774 -0.53530 DELTA -2.18430
9 4.43082 6.00389 0.35066 5.32595 MA 4.04480
IO 4.07754 5.87493 2.32137 6.71736 B 6.59769
l0 0.23995 -0.00563 -0.25309 -0.00873 DEUIA -0.75357
t0 4.3t749 5.86930 2.06827 6.70863 MA 5.84412
11 4.33073 5.84064 2.89058 7.22570 B 5.76079
1I -3.63759 -0.01464 -t.22533 - 1.335I4 DELIA -0.55240
n 0.69315 5.82600 1.66525 5.89056 MA 5.20839
12 3.55535 6.04501 r.92977 8.50684 B 7.L5797
12 0.13353 0.t2041 -0.t5464 *0.84735 DELTA 0.13525
T2 3.68888 6.L6542 I.775t3 7.65949 MA 7.29322
t3 3.33220 4.63473 -0.11940 6.35464 B 6.76285
l3 r.t676r 1.05563 0.8562r 0.63251 DELTA _052I2I
l3 4.4998I 5.69036 0.73681 6.987t6 MA 6.24t64
I4 3.25810 5.96358 1.10456 7.01270 B 7.49426
t4 {<
0.26353 -r.20066 DEUIA -3.15046
I4 * 1.36809 5.8t204 MA 4.34381
15 5.42935 6.34564 2.7636r 9.48594 B 7.05730
t5 {< * -t.t4662 - r.58468 DELIA -0.08901
(Continued)

Patient No.   E   NE   LF/HF   LF   Period   VLF


t5 1.6t699 7.90126 MA 6.96829
L6 4.11087 65944I -0.23319 6.68269 B 6.76872
l6 -0.06782 -0.54941 0.34755 -0.29398 - l.80868
DEUTA
l6 4.04305 6.04501 0 . 11 4 3 7 6.38871 MA 4.96004
17 * 6.28040 L40992 6.0967r B 4.8267I
17 * -0.r2766 -0.17490 -0.05945 DELIA 0.69993
t7 * 6.15273 1.23501 6.03726 MA 5.52665
IB 2.39790 6.03548 0.23183 6.39707 B 6.60421
IB 1.06784 0.II299 0.27977 -0.38297 DELTA -L92672
IB 3.46574 6.14847 0.51160 6.01410 MA 4.67749
l9 4.2r95I 6.35784 t.0Br83 5.54214 B 5.69070
19 0.21l3l -0.00347 0.12485 -0.54440 DEUTA -L49802
l9 4.43082 6.35437 1.20669 4.99774 MA 4.t9268
20 4.14313 5.73334 0.89483 7.35045 B 6.93974
20 -0.t7778 0.00000 o.I7t29 -0.58013 DELTA -I.72916
20 4.02535 5.73334 LO66t2 6.77032 MA 5.21058
2I 3.66356 6.06843 -0.873r5 5.09848 B 6.02972
2T 0.20764 -0.10485 0 . 4 1 1 7 8 -0.33378 DEUTA -2.00974
2I 3.87120 5.96358 -0.46t37 4.76470 MA 4.01998
22 3.29584 5.95324 2.38399 7.62877 B 7.54359
22 0.36772 0.68139 -0.75014 -0.89992 DEUTA - 1.25555
22 3.66356 6.63463 1.63384 6.72884 MA 6.28804
* = missing data.
SOURCE: Dr. Richard P. Sloan. Used with permission.

43. The purpose of a study by Chati et al. (A-32) was to ascertain the role of physical deconditioning in skeletal muscle metabolic abnormalities in patients with chronic heart failure (CHF). Subjects included ambulatory CHF patients (12 males, two females) ages 35 to 74 years. Among the data collected were the following measurements, during exercise, of workload (WL) under controlled conditions, peak oxygen consumption (Vo2), anaerobic ventilatory threshold (AT), both measured in ml/kg/min, and exercise total time (ET) in seconds.

WL   Vo2   AT   ET   WL   Vo2   AT   ET
7.557 32.800 13.280 933.000 3.930 22500 18.500 720.000
3.973 8.170 6.770 255.000 3.195 r7.020 8.520 375.000
5.31t 16.530 11.200 480.000 2.4t8 15.040 t2.250 480.000
5.355 15.s00 10.000 420.000 0.864 7.800 4.200 240.000
6.909 24.470 11.550 960.000 2.703 t2.I70 8.900 513.000
t.382 7.390 5.240 346.000 1.727 l5.lt0 6.300 540.000
8.636 19.000 10.400 600.000 7.773 21.100 12.500 1200.000
SOURCE: Dr. Zukai Chati. Used with permission.

44. Czader et al. (A-33) investigated certain prognostic factors in patients with centroblastic-centrocytic non-Hodgkin's lymphomas (CB/CC NHL). Subjects consisted of men and women between the ages of 20 and 84 years at time of diagnosis. Among the data collected were the following measurements on two relevant factors, A and B. The authors reported a significant correlation between the two.

20.00 .154 22.34 .L47 48.66 .569


36.00 .221 18.00 .t32 20.00 .227
6.97 .t29 18.00 .085 17.66 .I25
t3.67 .064 22.66 .577 14.34 .089
36.34 .402 45.34 .r34 16.33 .051
39.66 .256 20.33 .246 18.34 .100
t4.66 .1BB 16.00 .r75 26.49 .202
27.00 .l38 15.66 .r05 13.33 .077
2.66 .078 23.00 .145 6.00 .206
22.00 .r42 27.33 .t29 rs.67 .153
11.00 .086 6.27 .062 32.33 .549
20.00 .r70 24.34 .I47
22.66 .198 22.33 .769
7.34 .092 1r.33 .130
29.67 .227 6.67 .099
tr.66 .159
8.05 .223
22.66 .065

SOURCE: Dr. Magdalena Czader and Dr. Anna Porwit-MacDonald. Used with permission.
45. Fleroxacin, a fluoroquinolone derivative with a broad antibacterial spectrum and potent activity in vitro against gram-negative and many gram-positive bacteria, was the subject of a study by Reigner and Welker (A-34). The objectives of their study were to estimate the typical values of clearance over systemic availability (CL/F) and the volume of distribution over systemic availability (V/F) after the administration of therapeutic doses of fleroxacin, to identify factors that influence the disposition of fleroxacin, and to quantify the degree to which they do so. Subjects were 172 healthy male and female volunteers and uninfected patients representing a wide age range. Among the data analyzed were the following measurements (ml/min) of CL/F and creatinine clearance (CLcr). According to the authors, previous studies have shown that there is a correlation between the two variables.

CL/F   CLcr   CL/F   CLcr   CL/F   CLcr   CL/F   CLcr

137.000 96.000 77.000 67.700 152.000 109.000 I32.000 lIl.000


106.000 83.000 57.000 51.500 100.000 82.000 94.000 118.000
165.000 r00.000 69.000 52.400 86.000 88.000 90.000 111.000
127.000 r01.000 69.000 65.900 69.000 67.000 87.000 124.000
(Continued,)
REYTEW QUESTTONSAND EXERCTSES 4Al

CUF CLer CLIF CLer CL/F CLer CL/F CLer

139.000 116.000 76.000 60.900 108.000 68.700 48.000 10.600


I02.000 78.000 77.000 93.800 77.000 83.200 26.000 9.280
72.OOO 84.000 66.000 73.800 85.000 72.800 54.000 12.500
86.000 81.000 53.000 99.100 89.000 82.300 36.000 9.860
85.000 77.000 26.000 110.000 105.000 71.100 26.000 4.740
122.000 I02.000 89.000 99.900 66.000 56.000 39.000 7.020
76.000 80.000 44.000 73.800 ?3.000 6r.000 27.000 6.570
57.000 67.000 27.OOO 65.800 64.000 79.500 36.000 13.600
62.000 41.000 96.000 r09.000 26.000 9.t20 15.000 7.600
90.000 93.000 102.000 76.800 29.000 8.540 138.000 100.000
16s.000 88.000 159.000 125.000 39.r00 93.700 127.000 r08.000
132.000 64.000 r15.000 112.000 75.500 65.600 203.000 121.000
159.000 92.000 82.000 91.600 86.000 102.000 198.000 143.000
I48.000 I14.000 96.000 83.100 106.000 105.000 15I.000 126.000
116.000 59.000 121.000 BB.B00 77.500 67.300 113.000 llt.000
124.000 67.000 99.000 94.000 87.800 96.200 139.000 109.000
76.000 56.000 120.000 91.500 25.700 6.830 135.000 102.000
40.000 6r.000 r01.000 83.800 89.700 74.800 I16.000 110.000
23.000 35.000 118.000 97.800 108.000 84.000 148.000 94.000
27.000 38.000 I16.000 r00.000 58.600 79.000 22t.000 110.000
64.000 79.000 il6.000 67.500 9r.700 68.500 I15.000 101.000
44.000 64.000 87.000 97.500 48.900 20.600 150.000 1t0.000
59.000 94.000 59.000 45.000 53.500 i0.300 135.000 143.000
47.000 96.000 96.000 53.500 4L400 lI.B00 201.000 r15.000
17.000 25.000 163.000 84.800 24.400 7.940 164.000 103.000
67.000 122.000 39.000 73.700 42.300 3.960 130.000 103.000
25.000 43.000 73.000 87.300 34.100 t2.700 162.000 169.000
24.000 22.OOO 45.000 74.800 28.300 7.r70 107.000 140.000
65.000 55.000 94.000 100.000 47.OOO 6.180 78.000 87.100
69.000 42.500 74.000 73.700 30.500 9.470 87.500 134.000
55.000 71.000 70.000 64.800 38.700 13.700 108.000 r08.000
39.000 34.800 129.000 119.000 60.900 17.000 126.000 118.000
58.000 50.300 34.000 30.000 5r.300 6.8t0 131.000 109.000
37.000 38.000 42.000 65.900 46.100 24.800 94.400 60.000
32.000 32.000 48.000 34.900 2s.000 7.200 87.700 82.900
66.000 53.500 58.000 55.900 29.000 7.900 94.000 99.600
49.000 60.700 30.000 40.100 25.000 6.600 rs7.000 123.000
40.000 66.500 47.000 48.200 40.000 8.600
34.000 22.600 35.000 14.800 28.000 5.500
87.000 61.800 20.000 14.400

Source: Dr. Bruno Reigner. Used with permission.
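The reported correlation between CL/F and CLcr can be examined with the test statistic t = r sqrt(n - 2) / sqrt(1 - r^2), which has n - 2 degrees of freedom under the hypothesis that the population correlation is zero. A minimal Python sketch, using only the first eight (CL/F, CLcr) pairs from the table as an illustration rather than all 172 subjects:

```python
# Test of H0: rho = 0 via t = r * sqrt(n - 2) / sqrt(1 - r^2),
# illustrated with eight (CL/F, CLcr) pairs from the table above.
from math import sqrt

def t_statistic(r, n):
    """t statistic with n - 2 degrees of freedom for H0: rho = 0."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

clf  = [137.0, 106.0, 165.0, 127.0, 139.0, 102.0, 72.0, 86.0]
clcr = [96.0, 83.0, 100.0, 101.0, 116.0, 78.0, 84.0, 81.0]
n = len(clf)
mx, my = sum(clf) / n, sum(clcr) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(clf, clcr))
sxx = sum((x - mx) ** 2 for x in clf)
syy = sum((y - my) ** 2 for y in clcr)
r = sxy / sqrt(sxx * syy)        # sample correlation coefficient
t = t_statistic(r, n)            # compare against t with n - 2 df
print(round(r, 3), round(t, 3))
```

The computed t would be compared with the critical value of the t distribution with n - 2 degrees of freedom at the chosen significance level.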


482 CHAPTER 9 SIMPLE LINEAR REGRESSION AND CORRELATION

46. Yasu et al. (A-35) used noninvasive magnetic resonance spectroscopy to determine the short-
and long-term effects of percutaneous transvenous mitral commissurotomy (PTMC) on exer-
cise capacity and metabolic responses of skeletal muscles during exercise. Data were col-
lected on 11 patients (2 males, 9 females) with symptomatic mitral stenosis. Their mean age
was 52 years with a standard deviation of 11. Among the data collected were the following
measurements on changes in mitral valve area (d-MVA) and peak oxygen consumption
(d-VO2) 3, 30, and 90 days post-PTMC:

              Days                       d-VO2
Subject    Post-PTMC    d-MVA (cm2)   (ml/kg/min)

  1            3           0.64           0.3
  2            3           0.76          -0.9
  3            3           0.3            1.9
  4            3           0.6           -3.1
  5            3           0.3           -0.5
  6            3           0.4           -2.7
  7            3           0.7            1.5
  8            3           0.9            1.1
  9            3           0.6           -7.4
 10            3           0.4           -0.4
 11            3           0.65           3.8

  1           30           0.53           1.6
  2           30           0.6            3.3
  3           30           0.4            2.6
  4           30           0.5            *
  5           30           0.3            3.0
  6           30           0.3            0.2
  7           30           0.67           4.2
  8           30           0.75           3
  9           30           0.7            2
 10           30           0.4            0.8
 11           30           0.55           4.0

  1           90           0.6            1.9
  2           90           0.6            5.9
  3           90           0.4            3.3
  4           90           0.6            5
  5           90           0.25           0.6
  6           90           0.3            2.5
  7           90           0.7            4.6
  8           90           0.8            4
  9           90           0.7            1
 10           90           0.38           1.1
 11           90           0.53           *

* = Missing data.
Source: Dr. Takanori Yasu. Used with permission.
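For data such as these, the relationship at a single follow-up time can be summarized by the least-squares line regressing d-VO2 on d-MVA. A minimal Python sketch, using eight of the 90-day pairs from the table as an illustration (not the authors' analysis):

```python
# Least-squares line y = b0 + b1 * x, fitted to eight 90-day
# (d-MVA, d-VO2) pairs transcribed from the table above.
def least_squares(x, y):
    """Return (intercept b0, slope b1) of the least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    return my - b1 * mx, b1

d_mva = [0.6, 0.6, 0.4, 0.6, 0.25, 0.3, 0.7, 0.8]
d_vo2 = [1.9, 5.9, 3.3, 5.0, 0.6, 2.5, 4.6, 4.0]
b0, b1 = least_squares(d_mva, d_vo2)
print(round(b0, 3), round(b1, 3))
```

A useful check on any least-squares fit is that the line passes through the point of means (x-bar, y-bar).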
REFERENCES 483

Exercises for Use with Large Data Sets Available on the Following Website:
www.wiley.com/college/daniel
1. Refer to the data for 1050 subjects with cerebral edema (CEREBRAL). Cerebral edema with
   consequent increased intracranial pressure frequently accompanies lesions resulting from
   head injury and other conditions that adversely affect the integrity of the brain. Available
   for each subject are measurements of intracranial pressure and plasma glycerol concentration;
   of interest is the relationship between intracranial pressure and glycerol plasma concentration.
   Suppose you are a statistical consultant with a research team investigating the relationship
   between these two variables. Select a simple random sample from the population and per-
   form the analysis that you think would be useful to the researchers. Present your findings
   and conclusions in narrative form and illustrate with graphs where appropriate. Compare
   your results with those of your classmates.
2. Refer to the data for 1050 subjects with essential hypertension (HYPERTEN). Suppose you
   are a statistical consultant to a medical research team interested in essential hypertension.
   Select a simple random sample from the population and perform the analyses that you
   think would be useful to the researchers. Present your findings and conclusions in narra-
   tive form and illustrate with graphs where appropriate. Compare your results with those of
   your classmates. Consult with your instructor regarding the size of sample you should select.
3. Refer to the data for 1200 patients with rheumatoid arthritis (CALCIUM). One hundred
   patients received the medicine at each dose level. Suppose you are a medical researcher
   wishing to gain insight into the nature of the relationship between dose level of prednisolone
   and total body calcium. Select a simple random sample of three patients from each dose
   level group and do the following.
   (a) Use the total number of pairs of observations to obtain the least-squares equation describing
       the relationship between dose level (the independent variable) and total body calcium.
   (b) Draw a scatter diagram of the data and plot the equation.
   (c) Compute r and test for significance at the .05 level. Find the p value.
(d) Compare your results with those of your classmates.
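Steps (a) and (c) can be sketched in Python as follows. The CALCIUM data set itself is available only on the book's website, so the dose levels and total body calcium values below are invented placeholders; only the procedure (sample three patients per dose group, fit the least-squares line, compute r and its t statistic) mirrors the exercise:

```python
# Illustrative sketch of steps (a) and (c): the dose and calcium values
# are hypothetical stand-ins for the CALCIUM data set on the website.
import random
from math import sqrt

random.seed(1)  # so the "random sample" is reproducible here

groups = {           # dose level -> hypothetical total body calcium values
    0.0:  [1100, 1120, 1080, 1150, 1090],
    5.0:  [1010, 1050, 990, 1030, 1000],
    10.0: [930, 960, 910, 950, 940],
}

# Simple random sample of three patients from each dose-level group.
pairs = [(dose, cal)
         for dose, values in groups.items()
         for cal in random.sample(values, 3)]

x = [p[0] for p in pairs]
y = [p[1] for p in pairs]
n = len(pairs)
mx, my = sum(x) / n, sum(y) / n
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
b1 = sxy / sxx                        # (a) slope of least-squares line
b0 = my - b1 * mx                     #     intercept
r = sxy / sqrt(sxx * syy)             # (c) sample correlation coefficient
t = r * sqrt(n - 2) / sqrt(1 - r * r) #     test statistic, n - 2 df
print(round(b1, 2), round(r, 3), round(t, 2))
```

The p value in step (c) is then obtained from the t distribution with n - 2 degrees of freedom.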

REFERENCES

Methodology References
1.  Frederick Hartwig with Brian E. Dearing, Exploratory Data Analysis, Sage Publications, Beverly
    Hills, 1979.
2.  Iain M. Johnstone and Paul F. Velleman, "The Resistant Line and Related Regression Meth-
    ods," Journal of the American Statistical Association, 80 (1985), 1041-1054.
3.  Donald R. McNeil, Interactive Data Analysis: A Practical Primer, Wiley, New York, 1977.
4.  Paul F. Velleman and David C. Hoaglin, Applications, Basics, and Computing of Exploratory
    Data Analysis, Duxbury, Belmont, CA, 1981.
5.  R. A. Fisher, "On the Probable Error of a Coefficient of Correlation Deduced from a Small
    Sample," Metron, 1 (1921), 3-21.
6.  H. Hotelling, "New Light on the Correlation Coefficient and Its Transforms," Journal of the
    Royal Statistical Society, Series B, 15 (1953), 193-232.


7.  Terry E. Dielman, Applied Regression Analysis for Business and Economics, Second Edition,
    Duxbury, Belmont, CA, 1996.
8.  Ronald R. Hocking, Methods and Applications of Linear Models: Regression and the Analysis of
    Variance, Wiley, New York, 1996.
9.  William Mendenhall and Terry Sincich, A Second Course in Statistics: Regression Analysis, Fifth
    Edition, Prentice Hall, Upper Saddle River, NJ, 1996.
10. John Neter, Michael H. Kutner, Christopher J. Nachtsheim, and William Wasserman,
    Applied Linear Regression Models, Third Edition, Irwin, Chicago, 1996.

Applications References
A-1. Jean-Pierre Després, Denis Prud'homme, Marie-Christine Pouliot, Angelo Tremblay, and
     Claude Bouchard, "Estimation of Deep Abdominal Adipose-Tissue Accumulation from Sim-
     ple Anthropometric Measurements in Men," American Journal of Clinical Nutrition, 54 (1991),
     471-477.
A-2. Mori J. Krantz, Ilana B. Kutinsky, Alastair D. Robertson, and Philip S. Mehler, "Dose-Related
     Effects of Methadone on QT Prolongation in a Series of Patients with Torsade de Pointes,"
     Pharmacotherapy, 23 (2003), 802-805.
A-3. Robert A. Reiss, Curtis E. Haas, Deborah L. Griffis, Bernadette Porter, and Mary Ann Tara,
     "Point-of-Care versus Laboratory Monitoring of Patients Receiving Different Anticoagulant
     Therapies," Pharmacotherapy, 22 (2002), 677-685.
A-4. Robert B. Parker, Ryan Yates, Judith E. Soberman, and Casey Laizure, "Effects of Grape-
     fruit Juice on Intestinal P-glycoprotein: Evaluation Using Digoxin in Humans," Pharma-
     cotherapy, 23 (2003), 979-987.
A-5.  R. B. Evans, W. Gordon, and M. Conzemius, "Effect of Velocity on Ground Reaction Forces
      in Dogs with Lameness Attributable to Tearing of the Cranial Cruciate Ligament," American
      Journal of Veterinary Research, 64 (2003), 1479-1481.
A-6.  David Krieser, Andrew Rosenberg, and Gad Kainer, "The Relationship Between Serum Cre-
      atinine, Serum Cystatin C and Glomerular Filtration Rate in Pediatric Renal Transplant
      Recipients: A Pilot Study," Pediatric Transplantation, 6 (2002), 392-395.
A-7.  Olga Kwast-Rabben, Rolf Libelius, and Hannu Heikkilä, "Somatosensory Evoked Potentials
      Following Stimulation of Digital Nerves," Muscle and Nerve, 26 (2002), 533-538.
A-8.  Geri R. Brown and Kim Persley, "Hepatitis A Epidemic in the Elderly," Southern Medical
      Journal, 95 (2002), 826-833.
A-9.  Ann E. Tuzson, Kevin P. Granata, and Mark F. Abel, "Spastic Velocity Threshold Constrains
      Functional Performance in Cerebral Palsy," Archives of Physical Medicine and Rehabilitation,
      84 (2003), 1363-1368.
A-10. M. Butz, K. H. Wollinsky, U. Widemuth-Catrinescu, A. Sperfeld, S. Winter, H. H. Mehrkens,
      A. C. Ludolph, and H. Schreiber, "Longitudinal Effects of Noninvasive Positive-Pressure
      Ventilation in Patients with Amyotrophic Lateral Sclerosis," American Journal of Medical
      Rehabilitation, 82 (2003), 597-604.
A-11. Karie E. Scrogin, Daniel C. Hatton, and David A. McCarron, "The Interactive Effects of
      Dietary Sodium Chloride and Calcium on Cardiovascular Stress Responses," American Jour-
      nal of Physiology (Regulatory Integrative Comp. Physiol. 30), 261 (1991), R945-R949.
A-12. Dean Parmalee, Data analyzed by the Wright State University Statistical Consulting Center,
      Wright State University, Dayton, OH (2003).
A-13. Maria Mathias, Data analyzed by the Wright State University Statistical Consulting Center,
      Wright State University, Dayton, OH (2001).
A-14. Darren M. Triller, Steven L. Clause, and Christopher Domarew, "Analysis of Medication
      Management Activities for Home-Dwelling Patients," American Journal of Health-System
      Pharmacy, 59 (2002), 2356-2359.
A-15. Maxime Lamarre-Cliche, Yves Lacourcière, Jacques de Champlain, Luc Poirier, and Pierre
      Larochelle, "Does QTc Interval Predict the Response to Beta-Blockers and Calcium Chan-
      nel Blockers in Hypertensives?" Heart Disease, 5 (2003), 244-252.
A-16. Jean D. Skinner, Paula Ziegler, and Michael Ponza, "Transitions in Infants' and Toddlers'
      Beverage Patterns," Journal of the American Dietetic Association, Supplement, 104 (2004), 45-50.
A-17. Neung Hwa Park, Ji Kang Park, Younghee Choi, Cheol-In Yoo, Choong Ryeol Lee, Hun
      Lee, Hyo Kyung Kim, Sung-Ryul Kim, Tae-Heum Jeong, Chung Sik Yoon, Jungsun Park,
      and Yangho Kim, "Whole Blood Manganese Correlates with High Signal Intensities on
      T1-weighted MRI in Patients with Liver Cirrhosis," NeuroToxicology, 24 (2003), 909-915.
A-18. Philippe de Moerloose, Sylvie Desmarais, Henri Bounameaux, Guido Reber, Arnaud Perrier,
      Georges Dupuy, and Jean-Louis Pittet, "Contribution of a New, Rapid, Individual and Quan-
      titative Automated D-Dimer ELISA to Exclude Pulmonary Embolism," Thrombosis and
      Haemostasis, 75 (1996), 11-13.
A-19. Ilpo T. Huhtaniemi, Anne-Maarit Haavisto, Raija Anttila, Martti A. Siimes, and Leo Dunkel,
      "Sensitive Immunoassay and in Vitro Bioassay Demonstrate Constant Bioactive/Immunore-
      active Ratio of Luteinizing Hormone in Healthy Boys During the Pubertal Maturation,"
      Pediatric Research, 39 (1996), 190-194.
A-20. Yong-Kwei Tsau, Ji-Nan Sheu, Chiung-Hui Chen, Ru-Jeng Teng, and Hui-Chi Chen,
      "Decreased Urinary Epidermal Growth Factor in Children with Acute Renal Failure: Epi-
      dermal Growth Factor/Creatinine Ratio Not a Reliable Parameter for Urinary Epidermal
      Growth Factor Excretion," Pediatric Research, 39 (1996), 20-24.
A-21. A. Usaj and V. Starc, "Blood pH and Lactate Kinetics in the Assessment of Running
      Endurance," International Journal of Sports Medicine, 17 (1996), 34-40.
A-22. Pamela Bean, Mary Susan Sutphin, Patricia Necessary, Melkon S. Agopian, Karsten Lieg-
      mann, Carl Ludvigsen, and James B. Peter, "Carbohydrate-Deficient Transferrin Evaluation
      in Dry Blood Spots," Alcoholism: Clinical and Experimental Research, 20 (1996), 56-60.
A-23. Johji Kato, Kohji Kobayashi, Takuma Etoh, Miho Tanaka, Kazuo Kitamura, Takuroh
      Imamura, Yasushi Koiwaya, Kenji Kangawa, and Tanenao Eto, "Plasma Adrenomedullin
      Concentration in Patients with Heart Failure," Journal of Clinical Endocrinology and Metabo-
      lism, 81 (1996), 180-183.
A-24. B. E. Golden, P. A. Clohessy, G. Russell, and M. K. Fagerhol, "Calprotectin as a Marker of
      Inflammation in Cystic Fibrosis," Archives of Disease in Childhood, 74 (1996), 136-139.
A-25. Arthur F. Gelb, James C. Hogg, Nestor L. Muller, Mark J. Schein, Joseph Kuei, Donald P.
      Tashkin, Joel D. Epstein, Jozef Kollin, Robert H. Green, Noe Zamel, W. Mark Elliott, and
      Lida Hadjiaghai, "Contribution of Emphysema and Small Airways in COPD," Chest, 109
      (1996), 353-359.

A-26. Agnes M. Witteman, Steven O. Stapel, Gerrard J. Perdok, Deman H. S. Sjamsoedin, Henk
      M. Jansen, Rob C. Aalberse, and Jaring S. van der Zee, "The Relationship Between RAST
      and Skin Test Results in Patients with Asthma or Rhinitis: A Quantitative Study with Purified
      Major Allergens," Journal of Allergy and Clinical Immunology, 97 (1996), 16-25.
A-27. Marianne Garland, Hazel H. Szeto, Salha S. Daniel, Pamela J. Tropper, Michael M. Myers,
      and Raymond I. Stark, "Zidovudine Kinetics in the Pregnant Baboon," Journal of Acquired
      Immune Deficiency Syndromes and Human Retrovirology, 11 (1996), 117-127.
A-28. A. Halligan, A. Shennan, P. C. Lambert, M. de Swiet, and D. J. Taylor, "Diurnal Blood Pressure
      Difference in the Assessment of Preeclampsia," Obstetrics & Gynecology, 87 (1996), 205-208.
A-29. Jay W. Marks, George G. Bonorris, and Leslie J. Schoenfield, "Effects of Ursodiol or Ibupro-
      fen on Contraction of Gallbladder and Bile among Obese Patients during Weight Loss,"
      Digestive Diseases and Sciences, 41 (1996), 242-249.
A-30. D. J. Peacock, P. Egger, P. Taylor, M. I. D. Cawley, and C. Cooper, "Lateral Bone Density
      Measurements in Osteoarthritis of the Lumbar Spine," Annals of the Rheumatic Diseases, 55
      (1996), 196-198.
A-31. R. P. Sloan, P. A. Shapiro, E. Bagiella, J. T. Bigger, E. S. Lo, and J. M. Gorman, "Relationships
      Between Circulating Catecholamines and Low Frequency Heart Period Variability as Indices
      of Cardiac Sympathetic Activity During Mental Stress," Psychosomatic Medicine, 58 (1996), 25-31.
A-32. Zukai Chati, Faiez Zannad, Claude Jeandel, Brigitte Lherbier, Jean-Marie Escanye, Jacques
      Robert, and Etienne Aliot, "Physical Deconditioning May Be a Mechanism for the Skeletal
      Muscle Energy Phosphate Metabolism Abnormalities in Chronic Heart Failure," American
      Heart Journal, 131 (1996), 560-566.
A-33. Magdalena Czader, Joanna Mazur, Mikael Pettersson, Jan Liliemark, Mats Stromberg, Birger
      Christensson, Bernard Tribukait, Gert Auer, Åke Öst, and Anna Porwit, "Prognostic Sig-
      nificance of Proliferative and Apoptotic Fractions in Low Grade Follicle Center Cell-Derived
      Non-Hodgkin's Lymphomas," Cancer, 77 (1996), 1180-1188.
A-34. B. G. Reigner and H. A. Welker, "Factors Influencing Elimination and Distribution of
      Fleroxacin: Metaanalysis of Individual Data from 10 Pharmacokinetic Studies," Antimicrobial
      Agents and Chemotherapy, 40 (1996), 575-580.
A-35. Takanori Yasu, Taka'aki Katsuki, Nobuhiro Ohmura, Ikuko Nakada, Mafumi Owa, Mikihisa
      Fujii, Akira Sakaguchi, and Muneyasu Saito, "Delayed Improvement in Skeletal Muscle
      Metabolism and Exercise Capacity in Patients with Mitral Stenosis Following Immediate
      Hemodynamic Amelioration by Percutaneous Transvenous Mitral Commissurotomy,"
      American Journal of Cardiology, 77 (1996), 492-497.
