Simple Linear Regression
2.1 INTRODUCTION
We start with the simple case of studying the relationship between a response vari-
able Y and a predictor variable $X_1$. Since we have only one predictor variable,
we shall drop the subscript in $X_1$ and use X for simplicity. We discuss the covariance
and the correlation coefficient as measures of the direction and strength of the linear
relationship between the two variables. The simple linear regression model is then
formulated and the key theoretical results are given without mathematical deriva-
tions, but illustrated by numerical examples. Readers interested in mathematical
derivations are referred to the bibliographic notes at the end of the chapter, where
books that contain a formal development of regression analysis are listed.
Table 2.1 Notation for the Data Used in Simple Regression and Correlation

Observation Number     Response Y     Predictor X
1                      y1             x1
2                      y2             x2
...                    ...            ...
n                      yn             xn
On the scatter plot of Y versus X, let us draw a vertical line at $\bar{x}$ and a horizontal
line at $\bar{y}$, as shown in Figure 2.1, where
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \quad \text{and} \quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (2.1)$$
are the sample means of Y and X, respectively. The two lines divide the graph into
four quadrants. For each point i in the graph, compute the following quantities:
• $y_i - \bar{y}$, the deviation of each observation $y_i$ from the mean of the response
variable,
• $x_i - \bar{x}$, the deviation of each observation $x_i$ from the mean of the predictor
variable, and
• the product of the above two quantities, $(y_i - \bar{y})(x_i - \bar{x})$.
It is clear from the graph that the quantity $(y_i - \bar{y})$ is positive for every point in the
first and second quadrants, and is negative for every point in the third and fourth
quadrants. Similarly, the quantity $(x_i - \bar{x})$ is positive for every point in the first and
fourth quadrants, and is negative for every point in the second and third quadrants.
These facts are summarized in Table 2.2.
Table 2.2 Algebraic Signs of $(y_i - \bar{y})$, $(x_i - \bar{x})$, and Their Product in Each of the Four Quadrants

Quadrant     $y_i - \bar{y}$     $x_i - \bar{x}$     $(y_i - \bar{y})(x_i - \bar{x})$
1            +                   +                   +
2            +                   -                   -
3            -                   -                   +
4            -                   +                   -
If the relationship between Y and X is positive, most of the points lie in the first and
third quadrants, so most of the products $(y_i - \bar{y})(x_i - \bar{x})$ are positive; if the
relationship is negative, most of the points lie in the second and fourth quadrants and
most of the products are negative. Averaging these products gives the covariance
between Y and X,
$$\mathrm{Cov}(Y, X) = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{n-1}. \qquad (2.2)$$
The sign of the covariance indicates the direction of the linear relationship, but its
magnitude depends on the units in which Y and X are measured, so it does not by
itself measure the strength of the relationship. A unit-free measure is obtained by
first standardizing the variables. The standardized version of Y is
$$z_i = \frac{y_i - \bar{y}}{s_y}, \quad i = 1, 2, \ldots, n, \qquad (2.3)$$
where
$$s_y = \sqrt{\frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n-1}} \qquad (2.4)$$
is the sample standard deviation of Y. It can be shown that the standardized
variable $z$ in (2.3) has mean zero and standard deviation one. We standardize X in
a similar way by subtracting the mean $\bar{x}$ from each observation $x_i$ and then dividing
by the standard deviation $s_x$. The covariance between the standardized X and Y data
is known as the correlation coefficient between Y and X and is given by
$$\mathrm{Cor}(Y, X) = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{y_i - \bar{y}}{s_y}\right)\left(\frac{x_i - \bar{x}}{s_x}\right). \qquad (2.5)$$
Equivalently,
$$\mathrm{Cor}(Y, X) = \frac{\mathrm{Cov}(Y, X)}{s_y\, s_x}, \qquad (2.6)$$
or, substituting (2.2) and (2.4),
$$\mathrm{Cor}(Y, X) = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2 \sum_{i=1}^{n}(x_i - \bar{x})^2}}. \qquad (2.7)$$
Thus, Cor(Y, X) can be interpreted either as the covariance between the standard-
ized variables or as the ratio of the covariance to the standard deviations of the two
variables. From (2.5), it can be seen that the correlation coefficient is symmetric,
that is, Cor(Y, X) = Cor(X, Y).
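As a quick numerical check of the standardization step, the following short Python sketch (a hypothetical illustration, not part of the original text; the data values are made up) standardizes a variable using (2.3) and (2.4) and verifies that the result has mean zero and standard deviation one.

```python
# Standardizing a variable: subtract the mean, divide by the standard deviation.
from math import sqrt

y = [23, 29, 49, 64, 74]          # a small hypothetical sample

n = len(y)
y_bar = sum(y) / n
s_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))   # formula (2.4)

z = [(yi - y_bar) / s_y for yi in y]                       # formula (2.3)

z_bar = sum(z) / n
s_z = sqrt(sum((zi - z_bar) ** 2 for zi in z) / (n - 1))
print(round(z_bar, 10), round(s_z, 10))                    # 0.0 and 1.0
```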
Unlike Cov(Y,X), Cor(Y, X) is scale invariant, that is, it does not change if we
change the units of measurements. Furthermore, Cor(Y, X) satisfies
$$-1 \le \mathrm{Cor}(Y, X) \le 1. \qquad (2.8)$$
These properties make Cor(Y, X) a useful quantity for measuring both the
direction and the strength of the relationship between Y and X . The magnitude of
Cor(Y, X) measures the strength of the linear relationship between Y and X . The
closer Cor(Y, X) is to 1 or -1, the stronger is the relationship between Y and X .
The sign of Cor(Y, X ) indicates the direction of the relationship between Y and X .
That is, Cor(Y, X) > 0 implies that Y and X are positively related. Conversely,
Cor(Y, X) < 0, implies that Y and X are negatively related.
Note, however, that Cor(Y, X) = 0 does not necessarily mean that Y and X are
not related. It only implies that they are not linearly related because the correlation
coefficient measures only linear relationships. In other words, the Cor(Y, X ) can
still be zero when Y and X are nonlinearly related. For example, Y and X in Table
2.3 have the perfect nonlinear relationship Y = 50 - X2 (graphed in Figure 2.2),
yet Cor(Y, X ) = 0.
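This point can be verified numerically. The short Python sketch below (a hypothetical illustration, not part of the original text) generates the data of Table 2.3 from the relation Y = 50 − X² and evaluates the correlation coefficient using formula (2.7); the sum of cross products, and hence the correlation, is zero even though the relationship is perfect.

```python
# Correlation for the data of Table 2.3, where Y = 50 - X^2.
from math import sqrt

x = list(range(-7, 8))              # -7, -6, ..., 7
y = [50 - xi**2 for xi in x]        # perfect nonlinear relationship

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

sxy = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

# Essentially 0 (up to floating-point rounding): no *linear* relationship.
print(sxy / sqrt(sxx * syy))
```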
Furthermore, like many other summary statistics, Cor(Y, X) can be sub-
stantially influenced by one or a few outliers in the data. To emphasize this point,
Anscombe (1973) has constructed four data sets, known as Anscombe’s quartet,
each with a distinct pattern, but each having the same set of summary statistics (e.g.,
the same value of the correlation coefficient). The data and graphs are reproduced
in Table 2.4 and Figure 2.3. The data can be found on the book’s Web site.¹ An
analysis based exclusively on an examination of summary statistics, such as the
correlation coefficient, would have been unable to detect the differences in patterns.
¹ http://www.ilr.cornell.edu/~hadi/RABE4
Table 2.3 A Set of Data Where Y and X Are Related by Y = 50 − X²

 Y    X      Y    X      Y    X
 1   -7     46   -2     41    3
14   -6     49   -1     34    4
25   -5     50    0     25    5
34   -4     49    1     14    6
41   -3     46    2      1    7

Figure 2.2 Scatter plot of Y versus X for the data in Table 2.3.
Table 2.4 Anscombe’s Quartet: Four Data Sets Having the Same Values of Summary
Statistics

Y1     X1     Y2     X2     Y3     X3     Y4     X4
8.04 10 9.14 10 7.46 10 6.58 8
6.95 8 8.14 8 6.77 8 5.76 8
7.58 13 8.74 13 12.74 13 7.71 8
8.81 9 8.77 9 7.11 9 8.84 8
8.33 11 9.26 11 7.81 11 8.47 8
9.96 14 8.10 14 8.84 14 7.04 8
7.24 6 6.13 6 6.08 6 5.25 8
4.26 4 3.10 4 5.39 4 12.50 19
10.84 12 9.13 12 8.15 12 5.56 8
4.82 7 7.26 7 6.42 7 7.91 8
5.68 5 4.74 5 5.73 5 6.89 8
Source: Anscombe (1973).
Figure 2.3 Scatter plots of the data in Table 2.4 with the fitted lines.
An examination of Figure 2.3 shows that only the first set, whose plot is given
in (a), can be described by a linear model. The plot in (b) shows the second
data set is distinctly nonlinear and would be better fitted by a quadratic function.
The plot in (c) shows that the third data set has one point that distorts the slope
and the intercept of the fitted line. The plot in (d) shows that the fourth data set
is unsuitable for linear fitting, the fitted line being determined essentially by one
extreme observation. Therefore, it is important to examine the scatter plot of Y
versus X before interpreting the numerical value of Cor(Y, X ) .
Figure 2.4 Computer Repair data: Scatter plot of Minutes versus Units.
n
c (Yi - 9(.i - 5)
- 1768 - 136,
Cov(Y,X) = Z=l - --
n-1 13
and
Cor(Y, X ) =
C(Yi - i.()y - 2) -
1768
= 0.996.
JC(Yi - Y)2 C(.i - $ 2 J27768.36 x 114
Before drawing conclusions from this value of Cor(Y, X ) , we should examine the
corresponding scatter plot of Y versus X . This plot is given in Figure 2.4. The
high value of Cor(Y, X ) = 0.996 is consistent with the strong linear relationship
between Y and X exhibited in Figure 2.4. We therefore conclude that there is a
strong positive relationship between repair time and units repaired.
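The computation above is easy to reproduce. The Python sketch below is a hypothetical illustration (not part of the original text); it uses the 14 Computer Repair observations listed in Table 2.7 and evaluates Cov(Y, X) from (2.2) and Cor(Y, X) from (2.7), giving a covariance of 136 and a correlation of about 0.99.

```python
# Covariance and correlation for the Computer Repair data (Table 2.7).
from math import sqrt

units   = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]      # X
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119,
           149, 145, 154, 166]                               # Y

n = len(units)
x_bar = sum(units) / n                 # 6.0
y_bar = sum(minutes) / n               # 97.21...

sxy = sum((y - y_bar) * (x - x_bar) for x, y in zip(units, minutes))  # 1768
sxx = sum((x - x_bar) ** 2 for x in units)                            # 114
syy = sum((y - y_bar) ** 2 for y in minutes)                          # 27768.36

cov = sxy / (n - 1)                    # 136.0
cor = sxy / sqrt(sxx * syy)            # about 0.99
print(cov, cor)
```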
Although Cor(Y, X) is a useful quantity for measuring the direction and the
strength of linear relationships, it cannot be used for prediction purposes, that is,
we cannot use Cor(Y, X) to predict the value of one variable given the value of the
other. Furthermore, Cor(Y, X) measures only pairwise relationships. Regression
analysis, however, can be used to relate one or more response variables to one or
more predictor variables, and it can also be used for prediction. In simple linear
regression, the relationship between the response Y and the predictor X is
formulated as the linear model
Table 2.6 Quantities Needed for the Computation of the Correlation Coefficient
Between the Length of Service Calls, Y , and Number of Units Repaired, X
$$Y = \beta_0 + \beta_1 X + \varepsilon, \qquad (2.9)$$
where $\beta_0$ and $\beta_1$ are constants called the model regression coefficients or param-
eters, and $\varepsilon$ is a random disturbance or error. It is assumed that in the range of
the observations studied, the linear equation (2.9) provides an acceptable approxi-
mation to the true relation between Y and X. In other words, Y is approximately
a linear function of X, and $\varepsilon$ measures the discrepancy in that approximation.
² The adjective linear has a dual role here. It may be taken to describe the fact that the relationship
between Y and X is linear. More generally, the word linear refers to the fact that the regression
parameters, $\beta_0$ and $\beta_1$, enter (2.9) in a linear fashion. Thus, for example, $Y = \beta_0 + \beta_1 X^2 + \varepsilon$ is
also a linear model even though the relationship between Y and X is quadratic.
Based on the available data, we wish to estimate the parameters $\beta_0$ and $\beta_1$. This is
equivalent to finding the straight line that gives the best fit (representation) of the
points in the scatter plot of the response versus the predictor variable (see Figure
2.4). In terms of the observed data, the model (2.9) can be written as
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, 2, \ldots, n. \qquad (2.10)$$
We estimate the parameters using the popular least squares method, which
gives the line that minimizes the sum of squares of the vertical distances³ from
each point to the line. The vertical distances represent the errors in the response
variable. These errors can be obtained by rewriting (2.10) as
$$\varepsilon_i = y_i - \beta_0 - \beta_1 x_i, \quad i = 1, 2, \ldots, n. \qquad (2.12)$$
The sum of squares of these distances can then be written as
$$S(\beta_0, \beta_1) = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2. \qquad (2.13)$$
³ An alternative to the vertical distance is the perpendicular (shortest) distance from each point to the
line. The resultant line is called the orthogonal regression line.
The values of $\hat\beta_0$ and $\hat\beta_1$ that minimize (2.13) are given by
$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \qquad (2.14)$$
and
$$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}. \qquad (2.15)$$
Note that we give the formula for $\hat\beta_1$ before the formula for $\hat\beta_0$ because $\hat\beta_0$ uses $\hat\beta_1$.
The estimates $\hat\beta_0$ and $\hat\beta_1$ are called the least squares estimates of $\beta_0$ and $\beta_1$ because
they are the solution to the least squares method, the intercept and the slope of the
line that has the smallest possible sum of squares of the vertical distances from each
point to the line. For this reason, the line is called the least squares regression line.
The least squares regression line is given by
$$\hat{Y} = \hat\beta_0 + \hat\beta_1 X. \qquad (2.16)$$
Note that a least squares line always exists because we can always find a line that
gives the minimum sum of squares of the vertical distances. In fact, as we shall
see later, in some cases a least squares line may not be unique. These cases are not
common in practice.
For each observation in our data we can compute the fitted value
$$\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i, \quad i = 1, 2, \ldots, n. \qquad (2.17)$$
Thus, the ith fitted value, $\hat{y}_i$, is the point on
the least squares regression line (2.16) corresponding to $x_i$. The vertical distance
corresponding to the ith observation is the residual
$$e_i = y_i - \hat{y}_i, \quad i = 1, 2, \ldots, n. \qquad (2.18)$$
These vertical distances are called the ordinary⁴ least squares residuals. One
property of the residuals in (2.18) is that their sum is zero (see Exercise 2.5(a)).
This means that the sum of the distances above the line is equal to the sum of the
distances below the line.
Using the Computer Repair data and the quantities in Table 2.6, we have
$$\hat\beta_1 = \frac{\sum(y_i - \bar{y})(x_i - \bar{x})}{\sum(x_i - \bar{x})^2} = \frac{1768}{114} = 15.509,$$
and
$$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} = 97.21 - 15.509 \times 6 = 4.162.$$
Hence, the equation of the least squares regression line is
$$\widehat{\mathrm{Minutes}} = 4.162 + 15.509 \times \mathrm{Units}. \qquad (2.19)$$
Figure 2.5 Plot of Minutes versus Units with the fitted least squares regression line.
This least squares line is shown together with the scatter plot of Minutes versus
Units in Figure 2.5. The fitted values in (2.17) and the residuals in (2.18) are shown
in Table 2.7.
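For readers who wish to reproduce Table 2.7, the following Python sketch (a hypothetical illustration, not part of the original text) computes the least squares estimates in (2.14) and (2.15) for the Computer Repair data and then the fitted values (2.17) and residuals (2.18).

```python
# Least squares fit for the Computer Repair data (Minutes on Units).
units   = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]

n = len(units)
x_bar = sum(units) / n
y_bar = sum(minutes) / n

# Slope (2.14) and intercept (2.15)
b1 = (sum((y - y_bar) * (x - x_bar) for x, y in zip(units, minutes))
      / sum((x - x_bar) ** 2 for x in units))          # 15.509
b0 = y_bar - b1 * x_bar                                 # 4.162

# Fitted values (2.17) and residuals (2.18); the residuals sum to zero.
fitted    = [b0 + b1 * x for x in units]
residuals = [y - f for y, f in zip(minutes, fitted)]

print(round(b0, 3), round(b1, 3))
print(round(sum(residuals), 10))                        # 0.0 (up to rounding)
```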
The coefficients in (2.19) can be interpreted in physical terms. The constant
term represents the setup or startup time for each repair and is approximately 4
minutes. The coefficient of Units represents the increase in the length of a service
call for each additional component that has to be repaired. From the data given,
we estimate that it takes about 16 minutes (15.509) for each additional component
that has to be repaired. For example, the length of a service call in which four
components had to be repaired is obtained by substituting Units = 4 in the equation
of the regression line (2.19) and obtaining $\hat{y} = 4.162 + 15.509 \times 4 = 66.20$. Since
Units = 4 corresponds to two observations in our data set (observations 4 and 5),
the value 66.198 is the fitted value for both observations 4 and 5, as can be seen
from Table 2.7. Note, however, that since observations 4 and 5 have different values
for the response variable Minutes, they have different residuals.
We should note here that by comparing (2.2), (2.7), and (2.14), an alternative
formula for $\hat\beta_1$ can be expressed as
$$\hat\beta_1 = \frac{\mathrm{Cov}(Y, X)}{s_x^2} = \mathrm{Cor}(Y, X)\,\frac{s_y}{s_x}, \qquad (2.20)$$
from which it can be seen that $\hat\beta_1$, Cov(Y, X), and Cor(Y, X) have the same
sign. This makes intuitive sense because positive (negative) slope means positive
(negative) correlation.
So far in our analysis we have made only one assumption, namely, that Y and
X are linearly related. This assumption is referred to as the linearity assumption.
This is merely an assumption or a hypothesis about the relationship between the
response and predictor variables. An early step in the analysis should always be
the validation of this assumption. We wish to determine if the data at hand support
Table 2.7 The Fitted Values, $\hat{y}_i$, and the Ordinary Least Squares Residuals, $e_i$, for
the Computer Repair Data
i    xi    yi    $\hat{y}_i$    ei         i    xi    yi    $\hat{y}_i$    ei
1 1 23 19.67 3.33 8 6 97 97.21 -0.21
2 2 29 35.18 -6.18 9 7 109 112.72 -3.72
3 3 49 50.69 -1.69 10 8 119 128.23 -9.23
4 4 64 66.20 -2.20 11 9 149 143.74 5.26
5 4 74 66.20 7.80 12 9 145 143.74 1.26
6 5 87 81.71 5.29 13 10 154 159.25 -5.25
7 6 96 97.21 -1.21 14 10 166 159.25 6.75
the assumption that Y and X are linearly related. An informal way to check
this assumption is to examine the scatter plot of the response versus the predictor
variable, preferably drawn with the least squares line superimposed on the graph
(see Figure 2.5). If we observe a nonlinear pattern, we will have to take corrective
action. For example, we may re-express or transform the data before we continue
the analysis. Data transformation is discussed in Chapter 6.
If the scatter of points resembles a straight line, then we conclude that the linearity
assumption is reasonable and continue with our analysis. The least squares estima-
tors have several desirable properties when some additional assumptions hold. The
required assumptions are stated in Chapter 4. The validity of these assumptions
must be checked before meaningful conclusions can be reached from the analysis.
Chapter 4 also presents methods for the validation of these assumptions. Using the
properties of least squares estimators, one can develop statistical inference proce-
dures (e.g., confidence interval estimation, tests of hypothesis, and goodness-of-fit
tests). These are presented in Sections 2.6 to 2.9.
$\hat\beta_0$ and $\hat\beta_1$ are unbiased⁵ estimates of $\beta_0$ and $\beta_1$, respectively. Their variances are
$$\mathrm{Var}(\hat\beta_0) = \sigma^2\left[\frac{1}{n} + \frac{\bar{x}^2}{\sum(x_i - \bar{x})^2}\right] \qquad (2.21)$$
and
$$\mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum(x_i - \bar{x})^2}. \qquad (2.22)$$
Furthermore, the sampling distributions of the least squares estimates $\hat\beta_0$ and $\hat\beta_1$
are normal with means $\beta_0$ and $\beta_1$ and variances as given in (2.21) and (2.22),
respectively.
The variances of $\hat\beta_0$ and $\hat\beta_1$ depend on the unknown parameter $\sigma^2$, so we need
to estimate $\sigma^2$ from the data. An unbiased estimate of $\sigma^2$ is given by
$$\hat\sigma^2 = \frac{\mathrm{SSE}}{n-2} = \frac{\sum_{i=1}^{n} e_i^2}{n-2}, \qquad (2.23)$$
where SSE is the sum of squares of the residuals (errors). The number $n - 2$ in
the denominator of (2.23) is called the degrees of freedom (df). It is equal to the
number of observations minus the number of estimated regression coefficients.
Replacing $\sigma^2$ in (2.21) and (2.22) by $\hat\sigma^2$ from (2.23), we get unbiased estimates
of the variances of $\hat\beta_0$ and $\hat\beta_1$. An estimate of the standard deviation is called the
standard error (s.e.) of the estimate. Thus, the standard errors of $\hat\beta_0$ and $\hat\beta_1$ are
$$\mathrm{s.e.}(\hat\beta_0) = \hat\sigma\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum(x_i - \bar{x})^2}} \qquad (2.24)$$
and
$$\mathrm{s.e.}(\hat\beta_1) = \frac{\hat\sigma}{\sqrt{\sum(x_i - \bar{x})^2}}, \qquad (2.25)$$
respectively, where $\hat\sigma$ is the square root of $\hat\sigma^2$ in (2.23). The standard error of $\hat\beta_1$ is
a measure of how precisely the slope has been estimated. The smaller the standard
error, the more precise the estimator.
With the sampling distributions of $\hat\beta_0$ and $\hat\beta_1$, we are now in a position to perform
statistical analysis concerning the usefulness of X as a predictor of Y. Under the
normality assumption, an appropriate test statistic for testing the null hypothesis
$H_0: \beta_1 = 0$ against the alternative $H_1: \beta_1 \neq 0$ is the t-test,
$$t_1 = \frac{\hat\beta_1}{\mathrm{s.e.}(\hat\beta_1)}. \qquad (2.26)$$
⁵ An estimate $\hat\theta$ is said to be an unbiased estimate of a parameter $\theta$ if the expected value of $\hat\theta$ is equal
to $\theta$.
Figure 2.6 A graph of the probability density function of a t-distribution. The p-value
for the t-test is the shaded areas under the curve.
To carry out the test, the observed value of $t_1$ is compared with the critical
value obtained from the t-table given in the Appendix to this book (see Table A.2),
which is $t_{(n-2,\,\alpha/2)}$, where $\alpha$ is a specified significance level. Note that we divide
$\alpha$ by 2 because we have a two-sided alternative hypothesis. Accordingly, $H_0$ is
rejected at the significance level $\alpha$ if $|t_1| \ge t_{(n-2,\,\alpha/2)}$.
Testing $H_0: \beta_1 = \beta_1^0$
The above t-test can be generalized to test the more general hypothesis $H_0:
\beta_1 = \beta_1^0$, where $\beta_1^0$ is a constant chosen by the investigator, against the two-sided
alternative $H_1: \beta_1 \neq \beta_1^0$. The appropriate test statistic in this case is the t-test,
$$t_1 = \frac{\hat\beta_1 - \beta_1^0}{\mathrm{s.e.}(\hat\beta_1)}. \qquad (2.29)$$
Note that when $\beta_1^0 = 0$, the t-test in (2.29) reduces to the t-test in (2.26). The
statistic $t_1$ in (2.29) is also distributed as a Student's t with $n - 2$ degrees of
freedom. As an illustration, suppose that the management of the computer repair
company expects the service time to increase by 12 minutes for each additional unit
to be repaired, that is, $\beta_1^0 = 12$. The corresponding test statistic is
$$t_1 = \frac{\hat\beta_1 - 12}{\mathrm{s.e.}(\hat\beta_1)} = \frac{15.509 - 12}{0.505} = 6.948,$$
with 12 degrees of freedom. The critical value for this test is $t_{(n-2,\,\alpha/2)} =
t_{(12,\,0.025)} = 2.18$. Since $t_1 = 6.948 > 2.18$, the result is highly significant,
leading to the rejection of the null hypothesis. The management’s estimate of the
increase in time for each additional component to be repaired is not supported by
the data. Their estimate is too low.
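The two t-tests above can be reproduced with a short computation. The sketch below is a hypothetical illustration (not part of the original text); the use of scipy for the t-distribution tail areas is an assumption about the available software, and the rest follows the formulas (2.23), (2.25), (2.26), and (2.29).

```python
# t-tests for the slope in the Computer Repair regression.
from math import sqrt
from scipy import stats          # assumed available for t-distribution tail areas

units   = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]

n = len(units)
x_bar, y_bar = sum(units) / n, sum(minutes) / n
sxx = sum((x - x_bar) ** 2 for x in units)
b1 = sum((y - y_bar) * (x - x_bar) for x, y in zip(units, minutes)) / sxx
b0 = y_bar - b1 * x_bar

# Unbiased estimate of sigma^2 from (2.23) and the standard error (2.25)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(units, minutes))
sigma_hat = sqrt(sse / (n - 2))
se_b1 = sigma_hat / sqrt(sxx)                    # about 0.505

# H0: beta1 = 0, the t-test in (2.26)
t0 = b1 / se_b1                                  # about 30.7
p0 = 2 * stats.t.sf(abs(t0), df=n - 2)

# H0: beta1 = 12, the t-test in (2.29) with beta1^0 = 12
t12 = (b1 - 12) / se_b1                          # about 6.95
p12 = 2 * stats.t.sf(abs(t12), df=n - 2)

print(round(t0, 2), p0, round(t12, 2), p12)
```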
The need for testing hypotheses regarding the regression parameter $\beta_0$ may also
arise in practice. More specifically, suppose we wish to test $H_0: \beta_0 = \beta_0^0$ against
the alternative $H_1: \beta_0 \neq \beta_0^0$, where $\beta_0^0$ is a constant chosen by the investigator.
The appropriate test in this case is given by
$$t_0 = \frac{\hat\beta_0 - \beta_0^0}{\mathrm{s.e.}(\hat\beta_0)}, \qquad (2.30)$$
which, for the special case of testing $H_0: \beta_0 = 0$, reduces to
$$t_0 = \frac{\hat\beta_0}{\mathrm{s.e.}(\hat\beta_0)}. \qquad (2.31)$$
For the Computer Repair data, the test of $H_0: \beta_1 = 0$ gives $t_1 = 15.509/0.505 =
30.71$, which far exceeds the critical value $t_{(12,\,0.025)} = 2.18$. The null hypothesis is
therefore rejected, which means that the predictor variable Units is a statistically significant
predictor of the response variable Minutes. This conclusion can also be reached
using (2.28) by observing that the p-value ($p_1 < 0.0001$) is much less than $\alpha = 0.05$,
indicating very high significance.
The standard errors can also be used to construct confidence intervals for the
regression parameters. The limits of the $(1 - \alpha) \times 100\%$ confidence interval for $\beta_0$ are
$$\hat\beta_0 \pm t_{(n-2,\,\alpha/2)} \times \mathrm{s.e.}(\hat\beta_0), \qquad (2.33)$$
where $t_{(n-2,\,\alpha/2)}$ is the $(1 - \alpha/2)$ percentile of a t-distribution with $n - 2$ degrees
of freedom. Similarly, the limits of the $(1 - \alpha) \times 100\%$ confidence interval for $\beta_1$ are
given by
$$\hat\beta_1 \pm t_{(n-2,\,\alpha/2)} \times \mathrm{s.e.}(\hat\beta_1). \qquad (2.34)$$
The confidence interval in (2.34) has the usual interpretation, namely, if we were
to take repeated samples of the same size at the same values of X and construct for
example 95% confidence intervals for the slope parameter for each sample, then
95% of these intervals would be expected to contain the true value of the slope.
From Table 2.9 we see that a 95% confidence interval for $\beta_1$ is
$$15.509 \pm 2.18 \times 0.505 = (14.41,\ 16.61).$$
That is, the incremental time required for each broken unit is between 14 and 17
minutes. The calculation of the confidence interval for $\beta_0$ in this example is left as an
exercise for the reader.
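A small sketch of the interval computation in (2.34) (hypothetical, not part of the original text; the slope estimate and standard error are taken from the Computer Repair fit, and scipy is assumed to be available for the t percentile):

```python
# 95% confidence interval for the slope, following (2.34).
from scipy import stats       # assumed available for the t percentile

b1, se_b1, df = 15.509, 0.505, 12           # estimates from the Computer Repair fit
t_crit = stats.t.ppf(1 - 0.05 / 2, df)      # about 2.18

lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1
print(round(lower, 2), round(upper, 2))     # roughly 14.41 and 16.61
```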
Note that the confidence limits in (2.33) and (2.34) are constructed for each of
the parameters $\beta_0$ and $\beta_1$ separately. This does not mean that a simultaneous (joint)
confidence region for the two parameters is rectangular. Actually, the simultaneous
confidence region is elliptical. This region is given for the general case of multiple
regression in the Appendix to Chapter 3 in (A.15), of which the simultaneous
confidence region for $\beta_0$ and $\beta_1$ is a special case.
2.8 PREDICTIONS
The fitted regression equation can be used for prediction. We distinguish between
two types of predictions:
1. The prediction of the value of the response variable Y which corresponds to any
chosen value, $x_0$, of the predictor variable; and
2. The estimation of the mean response $\mu_0$ when $X = x_0$.
For the first case, the predicted value $\hat{y}_0$ is
$$\hat{y}_0 = \hat\beta_0 + \hat\beta_1 x_0. \qquad (2.36)$$
Its standard error is
$$\mathrm{s.e.}(\hat{y}_0) = \hat\sigma\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum(x_i - \bar{x})^2}}. \qquad (2.37)$$
Hence, the confidence limits for the predicted value with confidence coefficient
$(1 - \alpha)$ are given by
$$\hat{y}_0 \pm t_{(n-2,\,\alpha/2)}\,\mathrm{s.e.}(\hat{y}_0). \qquad (2.38)$$
For the second case, the mean response $\mu_0$ is estimated by
$$\hat\mu_0 = \hat\beta_0 + \hat\beta_1 x_0. \qquad (2.39)$$
The standard error of this estimate is
$$\mathrm{s.e.}(\hat\mu_0) = \hat\sigma\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum(x_i - \bar{x})^2}}, \qquad (2.40)$$
from which it follows that the confidence limits for $\mu_0$ with confidence coefficient
$(1 - \alpha)$ are given by
$$\hat\mu_0 \pm t_{(n-2,\,\alpha/2)}\,\mathrm{s.e.}(\hat\mu_0). \qquad (2.41)$$
Note that the point estimate of $\mu_0$ is identical to the predicted response $\hat{y}_0$. This
can be seen by comparing (2.36) with (2.39). The standard error of $\hat\mu_0$ is, however,
smaller than the standard error of $\hat{y}_0$, as can be seen by comparing (2.37) with
(2.40). Intuitively, this makes sense. There is greater uncertainty (variability)
in predicting one observation (the next observation) than in estimating the mean
response when $X = x_0$. The averaging that is implied in the mean response reduces
the variability and uncertainty associated with the estimate.
To distinguish between the limits in (2.38) and (2.41), the limits in (2.38) are
sometimes referred to as the prediction or forecast limits, whereas the limits given
in (2.41) are called the confidence limits.
Suppose that we wish to predict the length of a service call in which four
components had to be repaired. If $\hat{y}_4$ denotes the predicted value, then from (2.36)
we get
$$\hat{y}_4 = 4.162 + 15.509 \times 4 = 66.20,$$
with a standard error that is obtained from (2.37) as
$$\mathrm{s.e.}(\hat{y}_4) = \hat\sigma\sqrt{1 + \frac{1}{14} + \frac{(4 - 6)^2}{114}} \approx 5.67,$$
where $\hat\sigma \approx 5.39$ is the estimate of $\sigma$ obtained from (2.23).
On the other hand, if the service department wishes to estimate the expected (mean)
service time for a call that needed four components repaired, we would use (2.39)
and (2.40), respectively. Denoting by $\hat\mu_4$ the expected service time for a call that
needed four components to be repaired, we have
$$\hat\mu_4 = 4.162 + 15.509 \times 4 = 66.20, \qquad
\mathrm{s.e.}(\hat\mu_4) = \hat\sigma\sqrt{\frac{1}{14} + \frac{(4 - 6)^2}{114}} \approx 1.76.$$
With these standard errors we can construct confidence intervals using (2.38) and
(2.41), as appropriate.
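The predicted value and the two standard errors above can be computed as follows. This is a hypothetical sketch (not part of the original text) that applies (2.36), (2.37), and (2.40) to the Computer Repair data at Units = 4.

```python
# Prediction at Units = 4: predicted value and the two standard errors
# in (2.37) and (2.40).
from math import sqrt

units   = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]

n = len(units)
x_bar, y_bar = sum(units) / n, sum(minutes) / n
sxx = sum((x - x_bar) ** 2 for x in units)
b1 = sum((y - y_bar) * (x - x_bar) for x, y in zip(units, minutes)) / sxx
b0 = y_bar - b1 * x_bar
sigma_hat = sqrt(sum((y - (b0 + b1 * x)) ** 2
                     for x, y in zip(units, minutes)) / (n - 2))

x0 = 4
y0_hat = b0 + b1 * x0                                   # about 66.20

# Standard error for predicting a single new observation, as in (2.37)
se_pred = sigma_hat * sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
# Standard error for the mean response at x0, as in (2.40)
se_mean = sigma_hat * sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)

print(round(y0_hat, 2), round(se_pred, 2), round(se_mean, 2))
```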
As can be seen from (2.37), the standard error of prediction increases the farther
the value of the predictor variable is from the center of the actual observations.
Care should be taken when predicting the value of Minutes corresponding to a
value for Units that does not lie close to the observed data. There are two dangers
in such predictions. First, there is substantial uncertainty due to the large standard
error. More important, the linear relationship that has been estimated may not hold
outside the range of observations. Therefore, care should be taken in employing
fitted regression lines for prediction far outside the range of observations. In our
example we would not use the fitted equation to predict the service time for a service
call which requires that 25 components be replaced or repaired. This value lies too
far outside the existing range of observations.
After fitting a linear model relating Y to X , we are interested not only in knowing
whether a linear relationship exists, but also in measuring the quality of the fit of the
model to the data. The quality of the fit can be assessed by one of the following
highly related (hence, somewhat redundant) ways:
1. When using the tests in (2.26) or (2.32), if $H_0$ is rejected, the magnitude of
the values of the test (or the corresponding p-values) gives us information
about the strength (not just the existence) of the linear relationship between
Y and X . Basically, the larger the t (in absolute value) or the smaller the
corresponding p-value, the stronger the linear relationship between Y and X .
These tests are objective but they require all the assumptions stated earlier,
especially the assumption of normality of the $\varepsilon$'s.
2. The strength of the linear relationship between Y and X can also be assessed
directly from the examination of the scatter plot of Y versus X together with
the corresponding value of the correlation coefficient Cor(Y, X ) in (2.6).
The closer the set of points to a straight line (the closer Cor(Y, X) is to 1 or
−1), the stronger the linear relationship between Y and X. This approach is
informal and subjective but it requires only the linearity assumption.
3. Examine the scatter plot of $\hat{Y}$ versus Y. The closer the set of points to a
straight line, the stronger the linear relationship between Y and X. One can
measure the strength of the linear relationship in this graph by computing the
correlation coefficient between the fitted and observed values, $\mathrm{Cor}(Y, \hat{Y})$.
Note that $\mathrm{Cor}(Y, \hat{Y})$ cannot be negative (why?), but Cor(Y, X) can be positive
or negative ($-1 \le \mathrm{Cor}(Y, X) \le 1$). Therefore, in simple linear regression,
the scatter plot of $\hat{Y}$ versus Y is redundant. However, in multiple regression,
the scatter plot of $\hat{Y}$ versus Y is not redundant. The graph is very useful
because, as we shall see in Chapter 3, it is used to assess the strength of the
relationship between Y and the set of predictor variables $X_1, X_2, \ldots, X_p$.
4. Although scatter plots of $\hat{Y}$ versus Y and $\mathrm{Cor}(Y, \hat{Y})$ are redundant in simple
linear regression, they give us an indication of the quality of the fit in both
simple and multiple regression. Furthermore, in both simple and multiple
regression, $\mathrm{Cor}(Y, \hat{Y})$ is related to another useful measure of the quality of
fit of the linear model to the observed data. This measure is developed as
follows. After we compute the least squares estimates of the parameters of a
linear model, let us compute the following quantities:
$$\mathrm{SST} = \sum_{i=1}^{n}(y_i - \bar{y})^2, \qquad \mathrm{SSR} = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2, \qquad \mathrm{SSE} = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2,$$
where SST stands for the total sum of squared deviations in Y from its mean
$\bar{y}$, SSR denotes the sum of squares due to regression, and SSE represents
the sum of squared residuals (errors). The quantities $(y_i - \bar{y})$, $(\hat{y}_i - \bar{y})$, and
$(y_i - \hat{y}_i)$ are depicted in Figure 2.7 for a typical point $(x_i, y_i)$. The line
$\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$ is the fitted regression line based on all data points (not
shown on the graph) and the horizontal line is drawn at $Y = \bar{y}$. Note that
for every point $(x_i, y_i)$, there are two points, $(x_i, \hat{y}_i)$, which lies on the fitted
line, and $(x_i, \bar{y})$, which lies on the line $Y = \bar{y}$.
A fundamental equality, in both simple and multiple regression, is given by
$$y_i = \hat{y}_i + (y_i - \hat{y}_i),$$
Observed = Fit + Deviation from fit,
or, subtracting $\bar{y}$ from both sides,
$$y_i - \bar{y} = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i), \qquad (2.44)$$
Deviation from mean = Deviation due to fit + Residual.
Squaring both sides of (2.44) and summing over all $n$ observations (the cross-
product term sums to zero), we obtain the fundamental identity
$$\mathrm{SST} = \mathrm{SSR} + \mathrm{SSE}. \qquad (2.45)$$
A natural measure of the quality of fit is then the proportion of the total variation
accounted for by the regression,
$$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}, \qquad (2.46)$$
which in simple linear regression satisfies
$$[\mathrm{Cor}(Y, X)]^2 = [\mathrm{Cor}(Y, \hat{Y})]^2 = R^2. \qquad (2.47)$$
These quantities are computed for the Computer Repair data in the sketch following
this list.
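The following hypothetical Python sketch (not part of the original text) computes SST, SSR, SSE, and R² for the Computer Repair data via (2.46); it verifies numerically that SST = SSR + SSE and gives an R² of about 0.987.

```python
# Decomposition SST = SSR + SSE and the coefficient of determination R^2.
units   = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]

n = len(units)
x_bar, y_bar = sum(units) / n, sum(minutes) / n
sxx = sum((x - x_bar) ** 2 for x in units)
b1 = sum((y - y_bar) * (x - x_bar) for x, y in zip(units, minutes)) / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in units]

sst = sum((y - y_bar) ** 2 for y in minutes)               # total
ssr = sum((f - y_bar) ** 2 for f in fitted)                # due to regression
sse = sum((y - f) ** 2 for y, f in zip(minutes, fitted))   # residual

r_squared = ssr / sst            # equivalently 1 - sse / sst; about 0.987
print(round(sst, 2), round(ssr + sse, 2), round(r_squared, 3))
```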
In some applications, subject matter theory or other physical considerations imply
that the regression line must pass through the origin. In that case, instead of the
model with an intercept,
$$Y = \beta_0 + \beta_1 X + \varepsilon, \qquad (2.48)$$
we fit the no-intercept model (the regression line through the origin),
$$Y = \beta_1 X + \varepsilon. \qquad (2.49)$$
The least squares estimate of $\beta_1$ in (2.49) is
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} y_i x_i}{\sum_{i=1}^{n} x_i^2}. \qquad (2.50)$$
The corresponding residuals are
$$e_i = y_i - \hat\beta_1 x_i, \quad i = 1, 2, \ldots, n, \qquad (2.52)$$
and the standard error of $\hat\beta_1$ is
$$\mathrm{s.e.}(\hat\beta_1) = \frac{\hat\sigma}{\sqrt{\sum x_i^2}}, \qquad (2.53)$$
where
$$\hat\sigma^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - 1}. \qquad (2.54)$$
Note that the degrees of freedom for SSE is n - 1, not n - 2, as is the case for a
model with an intercept.
Note that the residuals in (2.52) do not necessarily add up to zero as is the case
for a model with an intercept (see Exercise 2.11(c)). Also, the fundamental identity
in (2.45) is no longer true in general. For this reason, some quality measures for
models with an intercept, such as $R^2$ in (2.46), are no longer appropriate for models
with no intercept. The appropriate identity for the case of models with no intercept
is obtained by replacing $\bar{y}$ in (2.44) by zero. Hence, the fundamental identity
becomes
$$\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} \hat{y}_i^2 + \sum_{i=1}^{n} e_i^2, \qquad (2.55)$$
and $R^2$ is correspondingly computed as
$$R^2 = \frac{\sum_{i=1}^{n} \hat{y}_i^2}{\sum_{i=1}^{n} y_i^2} = 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} y_i^2}. \qquad (2.56)$$
This is the appropriate form of R2 for models with no intercept. Note, however,
that the interpretations for the two formulas of R2 are different. In the case of
models with an intercept, R2 can be interpreted as the proportion of the variation in
Y that is accounted for by the predictor variable X after adjusting Y by its mean.
For models without an intercept, no adjustment of Y is made. For example, if we
fit (2.49) but use the formula for R2 in (2.46), it is possible for R2 to be negative in
some cases (see Exercise 2.11(d)). Therefore, the correct formula and the correct
interpretation should be used.
The formula for the t-test in (2.29) for testing $H_0: \beta_1 = \beta_1^0$ against the two-
sided alternative $H_1: \beta_1 \neq \beta_1^0$ continues to hold, but with the new definitions of
$\hat\beta_1$ and $\mathrm{s.e.}(\hat\beta_1)$ in (2.50) and (2.53), respectively.
As we mentioned earlier, models with no intercept should be used whenever
they are consistent with the subject matter (domain) theory or other physical and
material considerations. In some applications, however, one may not be certain as
to which model should be used. In these cases, the choice between the models given
in (2.48) and (2.49) has to be made with care. First, the goodness of fit should be
judged by comparing the residual mean squares ($\hat\sigma^2$) produced by the two models
because it measures the closeness of the observed and predicted values for the two
models. Second, one can fit model (2.48) to the data and use the t-test in (2.31)
to test the significance of the intercept. If the test is significant, then use (2.48),
otherwise use (2.49).
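The comparison just described can be sketched as follows. The Python code below (a hypothetical illustration, not part of the original text) fits both the intercept model (2.48) and the no-intercept model (2.49) to the Computer Repair data and reports the two residual mean squares; which model is preferred would rest on this comparison together with the subject matter considerations discussed above.

```python
# Comparing the intercept model (2.48) and the no-intercept model (2.49)
# by their residual mean squares.
units   = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]
n = len(units)
x_bar, y_bar = sum(units) / n, sum(minutes) / n

# Model with intercept: estimates from (2.14) and (2.15), df = n - 2
b1 = (sum((y - y_bar) * (x - x_bar) for x, y in zip(units, minutes))
      / sum((x - x_bar) ** 2 for x in units))
b0 = y_bar - b1 * x_bar
ms_intercept = sum((y - (b0 + b1 * x)) ** 2
                   for x, y in zip(units, minutes)) / (n - 2)

# Model through the origin: estimate from (2.50), df = n - 1
b1_0 = sum(x * y for x, y in zip(units, minutes)) / sum(x * x for x in units)
ms_origin = sum((y - b1_0 * x) ** 2 for x, y in zip(units, minutes)) / (n - 1)

print(round(ms_intercept, 2), round(ms_origin, 2))
```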
An excellent exposition of regression models through the origin is provided by
Eisenhauer (2003) who also alerts the users of regression models through the origin
to be careful when fitting these models using computer software programs because
some of them give incorrect and confusing results for the case of regression models
through the origin.
In this section we give two examples of trivial regression models, that is, regression
equations that have no regression coefficients. The first example arises when we
wish to test for the mean $\mu$ of a single variable Y based on a random sample of
n observations $y_1, y_2, \ldots, y_n$. Here we have $H_0: \mu = 0$ against $H_1: \mu \neq 0$.
Assuming that Y is normally distributed with mean $\mu$ and variance $\sigma^2$, the well-
known one-sample t-test
$$t = \frac{\bar{y} - 0}{\mathrm{s.e.}(\bar{y})} = \frac{\bar{y}}{s_y/\sqrt{n}} \qquad (2.57)$$
can be used to test $H_0$, where $s_y$ is the sample standard deviation of Y. Alternatively,
the above hypotheses can be formulated as
$$H_0\ (\text{Model 1}): Y = \varepsilon \quad \text{against} \quad H_1\ (\text{Model 2}): Y = \beta_0 + \varepsilon, \qquad (2.58)$$
where $\beta_0 = \mu$. Thus, Model 1 indicates that $\mu = 0$ and Model 2 indicates that
$\mu \neq 0$. The least squares estimate of $\beta_0$ in Model 2 is $\bar{y}$, the ith fitted value is
$\hat{y}_i = \bar{y}$, and the ith residual is $e_i = y_i - \bar{y}$. It follows then that an estimate of $\sigma^2$ is
$$\hat\sigma^2 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n - 1} = s_y^2, \qquad (2.59)$$
and the t-test for $H_0: \beta_0 = 0$ in Model 2,
$$t_0 = \frac{\hat\beta_0}{\mathrm{s.e.}(\hat\beta_0)} = \frac{\bar{y}}{s_y/\sqrt{n}}, \qquad (2.60)$$
is identical to the one-sample t-test in (2.57).
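As a hypothetical illustration (not part of the original text; the data values are made up), the sketch below computes the one-sample t-statistic in (2.57), which by the argument above is the same quantity obtained from fitting Model 2 and testing whether $\beta_0 = 0$.

```python
# One-sample t-test for H0: mu = 0, equivalently a test of beta0 = 0
# in the trivial regression model Y = beta0 + epsilon.
from math import sqrt

y = [2.1, -0.4, 1.3, 0.8, 1.7, -0.2, 0.9, 1.1]   # hypothetical sample

n = len(y)
y_bar = sum(y) / n                                # least squares estimate of beta0
s_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

t = y_bar / (s_y / sqrt(n))                       # statistic (2.57) = (2.60)
print(round(t, 3), "with", n - 1, "degrees of freedom")
```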
EXERCISES
2.1 Using the data in Table 2.6:
(a) Compute Var(Y) and Var(X).
(b) Prove or verify that $\sum_{i=1}^{n} (y_i - \bar{y}) = 0$.
(c) Prove or verify that any standardized variable has a mean of 0 and a
standard deviation of 1.
(d) Prove or verify that the three formulas for Cor(Y, X) in (2.5), (2.6), and
(2.7) are identical.
(e) Prove or verify that the two formulas for $\hat\beta_1$ in (2.14) and (2.20) are
identical.