Examples: Correlation and Regression
[Figure: scatter plot of Hours (vertical axis) against Months (horizontal axis) with the fitted regression line y = -0.6368x + 9.9393]
Obs          x            y          x - x̄       y - ȳ     (x - x̄)(y - ȳ)
1            2.8333       31.5        0.36147    -0.54      -0.1951938
2            1.233        30.5       -1.23883    -1.54       1.9077982
3            2.144        30.9       -0.32783    -1.14       0.3737262
4            3.849        31.6        1.37717    -0.44      -0.6059548
5            8.214        34.2        5.74217     2.16      12.4030872
6            1.448        34.2       -1.02383     2.16      -2.2114728
7            1.513        30.7       -0.95883    -1.34       1.2848322
8            1.297        31.7       -1.17483    -0.34       0.3994422
9            1.257        32.5       -1.21483     0.46      -0.5588218
10           0.93         32.6       -1.54183     0.56      -0.8634248
Sample mean  2.47183      32.04
Total                                                       11.934018
St Dev       2.20713002   1.3301629
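The table's columns can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not from the original slides. It rebuilds the cross products, their total, and the sample standard deviations, then combines them into the correlation coefficient with the formula r = Σ(x - x̄)(y - ȳ) / ((n - 1) s_x s_y). For these ten observations r comes out at roughly 0.45.

```python
import statistics

# The ten (x, y) observations from the table above.
x = [2.8333, 1.233, 2.144, 3.849, 8.214, 1.448, 1.513, 1.297, 1.257, 0.93]
y = [31.5, 30.5, 30.9, 31.6, 34.2, 34.2, 30.7, 31.7, 32.5, 32.6]

n = len(x)
x_bar = statistics.mean(x)
y_bar = statistics.mean(y)

# Last column of the table: (x - x_bar)(y - y_bar) for each observation.
cross_products = [(xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)]
total = sum(cross_products)

# Sample standard deviations (divisor n - 1), matching the "St Dev" row.
s_x = statistics.stdev(x)
s_y = statistics.stdev(y)

# Correlation coefficient from the table totals.
r = total / ((n - 1) * s_x * s_y)

print(f"means: {x_bar:.5f}, {y_bar:.2f}")
print(f"total of cross products: {total:.6f}")
print(f"s_x = {s_x:.8f}, s_y = {s_y:.7f}")
print(f"r = {r:.4f}")
```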
$s_x = 5.8737$, $s_y = 6.4462$

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1)\, s_x s_y} = \frac{-231.75}{(8 - 1)(5.8737)(6.4462)} = -0.8744$$
• d. Strong inverse relationship. As the number of police increases, the number of crimes decreases; equivalently, as the number of crimes increases, the number of police decreases.
Questions
• Assume the dependent variable is number of crimes.
a. Determine the regression equation.
b. Estimate the number of crimes for a city with 20 police
officers.
c. Interpret the regression equation.
Solution
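a. The slope and intercept are

$$b = r\,\frac{s_y}{s_x} = -0.8744\left(\frac{6.4462}{5.8737}\right) = -0.9596$$

$$a = \bar{y} - b\,\bar{x} = \frac{95}{8} - (-0.9596)\,\frac{146}{8} = 29.3877$$

so the regression equation is $\hat{y} = 29.3877 - 0.9596x$.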
b. 10.1957, found by 29.3877 − 0.9596(20)
c. For each additional police officer, the estimated number of crimes decreases by almost one (0.9596).
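The arithmetic in parts a and b can be checked with a short script. This is a sketch built only from the summary figures quoted in the solution (r, the two standard deviations, and the sums 146 and 95 over n = 8 cities); the names `predict_crimes`, `sum_police` and `sum_crimes` are illustrative. Because the script carries the unrounded slope forward, its results can differ from the slide values in the fourth decimal place.

```python
# Summary statistics quoted in the police/crime solution above.
r = -0.8744        # correlation coefficient
s_x = 5.8737       # standard deviation of police (independent variable)
s_y = 6.4462       # standard deviation of crimes (dependent variable)
n = 8              # number of cities
sum_police = 146   # sum of x
sum_crimes = 95    # sum of y

# Slope: b = r * (s_y / s_x)
b = r * (s_y / s_x)

# Intercept: a = y_bar - b * x_bar
a = sum_crimes / n - b * (sum_police / n)


def predict_crimes(police: float) -> float:
    """Estimated number of crimes for a given number of police officers."""
    return a + b * police


print(f"b = {b:.4f}")
print(f"a = {a:.4f}")
print(f"estimate for 20 officers = {predict_crimes(20):.4f}")
```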
Testing the Significance of the Slope of the Regression Line
• We already showed how to find the equation of the regression
line that best fits the data, based on the least squares principle.
• The purpose of the regression equation is to quantify a linear
relationship between two variables.
• The next step is to analyze the regression equation by conducting
a test of hypothesis to see if the slope of the regression line is
different from zero.
• If we cannot demonstrate that this slope is different from zero,
then we conclude there is no merit to using the independent
variable as a predictor.
Two-Tailed Hypothesis Testing
• The null and alternative hypotheses are:
𝐻0 ∶ 𝛽 = 0 (the slope of the regression equation in the
population is zero.)
𝐻1 ∶ 𝛽 ≠ 0 (the slope of the regression equation in the
population is other than zero.)
𝛽 (the Greek letter beta) represents the population slope for the
regression equation.
• In regression analysis, 𝑏 is our computed slope based on a
sample and is an estimate of the population’s slope, identified
as 𝛽.
Conclusions and T test
• If 𝐻0 is not rejected, we cannot conclude that the population regression line differs from a horizontal line; the sample shows no significant linear relationship between the independent variable, X, and the dependent variable, Y.
• If 𝐻0 is rejected and the alternative statement is accepted, a
significant relationship exists between the two variables.
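• The t test for the slope is $t = \dfrac{b - 0}{s_b}$ with $n - 2$ degrees of freedom, where $b$ is the sample slope and $s_b$ is the standard error of the slope.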
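The slope test divides the sample slope b by its standard error s_b and compares the result with a t distribution on n - 2 degrees of freedom. The examples in this section take s_b from statistical software; the sketch below shows one way it can be computed by hand, assuming the standard textbook formula s_b = s_yx / sqrt(Σ(x - x̄)²), where s_yx is the standard error of estimate. That formula and the function name `slope_t_test` are assumptions, not taken from the slides. The data are the ten (x, y) pairs from the deviation table earlier in this section.

```python
import math


def slope_t_test(x, y):
    """Return (b, s_b, t, df) for testing H0: beta = 0 in simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least squares slope and intercept.
    b = sxy / sxx
    a = y_bar - b * x_bar

    # Standard error of estimate: sqrt(SSE / (n - 2)).
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))

    # Standard error of the slope, then the t statistic (b - 0) / s_b.
    s_b = s_yx / math.sqrt(sxx)
    t = (b - 0) / s_b
    return b, s_b, t, n - 2


x = [2.8333, 1.233, 2.144, 3.849, 8.214, 1.448, 1.513, 1.297, 1.257, 0.93]
y = [31.5, 30.5, 30.9, 31.6, 34.2, 34.2, 30.7, 31.7, 32.5, 32.6]

b, s_b, t, df = slope_t_test(x, y)
print(f"b = {b:.4f}, s_b = {s_b:.4f}, t = {t:.3f}, df = {df}")
```

For a two-tailed test at the .05 level with 8 degrees of freedom, a standard t table gives a critical value of 2.306, so a computed t whose absolute value is below that would not lead to rejecting 𝐻0.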
One-tailed Test
• Instead of a two-tailed test, we often prefer a one-tailed test of the form:
𝐻0 ∶ 𝛽 ≤ 0
𝐻1 ∶ 𝛽 > 0
• If we do not reject the null hypothesis, we conclude that the
slope of the regression line in the population could be zero. This
means the independent variable is of no value in improving our
estimate of the dependent variable.
• If we reject the null hypothesis and accept the alternative, we
conclude the slope of the line is greater than zero. Hence, the
independent variable is an aid in predicting the dependent
variable.
Example I
• We take the same example of salespersons
and copiers sold.
• The test statistic follows the t distribution.
• 𝑑𝑓 = 𝑛 − 2 = 15 − 2 = 13.
• We use the .05 significance level. From Appendix B.5, the
critical value is 1.771.
• Our decision rule is to reject the null hypothesis if the value of t
computed from the formula is greater than 1.771.
Decision
• The computed value of 6.205 exceeds our critical value of
1.771, so we reject the null hypothesis and accept the
alternative hypothesis.
• We conclude that the slope of the line is greater than zero.
• The independent variable, number of sales calls, is useful in
estimating copier sales.
Example II
• The owner of Haverty’s Furniture Company studied the
relationship between the amount spent on advertising in a
month and sales revenue for that month.
• The amount of sales is the dependent variable and advertising
expense is the independent variable. The regression equation in
that study was ŷ = 1.5 + 2.2𝑥 for a sample of 5 months.
• Conduct a test of hypothesis to show there is a positive
relationship between advertising and sales. From statistical
software, the standard error of the regression coefficient is 0.42.
Use the .05 significance level.
Solution
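The worked solution for this example did not survive in the text. A sketch that follows the same steps as the other examples is given below. It assumes a one-tailed test at the .05 level, as the question implies, and takes the critical value 2.353 for 3 degrees of freedom from a standard t table; neither the critical value nor the computed t appears in the original.

```latex
\begin{aligned}
&H_0: \beta \le 0 \qquad H_1: \beta > 0 \\
&df = n - 2 = 5 - 2 = 3 \\
&\text{Reject } H_0 \text{ if } t > 2.353 \\
&t = \frac{b - 0}{s_b} = \frac{2.2 - 0}{0.42} = 5.238
\end{aligned}
```

Since 5.238 is greater than 2.353, this sketch would reject 𝐻0 and conclude that there is a positive relationship between advertising expense and sales revenue.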
Example III
• The regression equation is ŷ = 29.29 − 0.96x, the sample size
is 8, and the standard error of the slope is 0.22. Use the .05
significance level.
• Can we conclude that the slope of the regression line is less
than zero?
Solution
• 𝐻0 : 𝛽 ≥ 0 𝐻1 : 𝛽 < 0
• 𝑑𝑓 = 𝑛 − 2 = 8 − 2 = 6
• Reject H0 if 𝑡 < −1.943.
• $t = \dfrac{b - 0}{s_b} = \dfrac{-0.96}{0.22} = -4.364$
• Reject 𝐻0 and conclude the slope is less than zero.
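The one-tailed decision rule used in these examples can be written as a small helper. This is a sketch, not part of the original material: the function name `one_tailed_slope_test` and its `direction` argument are hypothetical, and the critical value is passed in from a t table rather than computed. It is applied here to the Example III figures (b = -0.96, s_b = 0.22, critical value 1.943 for 6 degrees of freedom).

```python
def one_tailed_slope_test(b, s_b, critical_value, direction):
    """Apply a one-tailed decision rule to the slope t statistic.

    critical_value is the positive table value for df = n - 2.
    direction "greater" tests H1: beta > 0 (reject if t > critical_value).
    direction "less" tests H1: beta < 0 (reject if t < -critical_value).
    Returns (t, reject_h0).
    """
    t = (b - 0) / s_b
    if direction == "greater":
        reject_h0 = t > critical_value
    elif direction == "less":
        reject_h0 = t < -critical_value
    else:
        raise ValueError("direction must be 'greater' or 'less'")
    return t, reject_h0


# Example III: slope -0.96, standard error 0.22, n = 8, so df = 6.
t3, reject3 = one_tailed_slope_test(-0.96, 0.22, 1.943, "less")
print(f"t = {t3:.3f}, reject H0: {reject3}")
```

Passing the table value as an argument keeps the sketch free of external dependencies; a statistics library could supply the critical value instead.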