
Correlation Regression

The document discusses correlation and linear regression analysis. It provides sample data on number of weekend TV commercials (x) and subsequent sales (y) for a stereo store. It defines covariance and correlation coefficient, and shows how to calculate a regression line using the least squares method. A scatter plot and regression equation are presented for another sample data set relating student population (x) to pizza parlor sales (y). Formulas are provided for calculating sums of squared errors (SSE), total sum of squares (SST), and sum of squared regression (SSR). The coefficient of determination (r^2) is defined and a hypothesis test is described to evaluate if the regression coefficient (β1) is significantly different from zero.

Uploaded by

Aparna JR
Copyright
© © All Rights Reserved

Correlation and regression

Correlation
A stereo and sound equipment store’s manager wants to
determine the relationship between the number of weekend
television commercials shown and the sales at the store
during the following week. The sample data is as shown:

Week   No. of commercials (x)   Sales volume (y, Rs. '000s)
1      2                        50
2      5                        57
3      1                        41
4      3                        54
5      4                        54
6      1                        38
7      5                        63
8      3                        48
9      4                        59
10     2                        46
Covariance

Covariance is a descriptive measure of the linear relationship between two variables.

Covariance is given by sxy = [∑(xi - xbar)(yi - ybar)] / (n - 1)
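
As a small illustration, the covariance formula can be applied to the commercials and sales data from the table above. This is only a sketch in plain Python; the function name sample_covariance and the variable names are assumptions made for this example and are not part of the original slides.

```python
# Sample data from the stereo store example.
commercials = [2, 5, 1, 3, 4, 1, 5, 3, 4, 2]        # x
sales = [50, 57, 41, 54, 54, 38, 63, 48, 59, 46]    # y, Rs. '000s


def sample_covariance(x, y):
    """Return s_xy = sum((xi - xbar) * (yi - ybar)) / (n - 1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return cross / (n - 1)


s_xy = sample_covariance(commercials, sales)
print(s_xy)  # 11.0
```

The positive value suggests that weeks with more commercials tend to be followed by higher sales.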


Scatter diagram method

[Figure: scatter diagram. Source: Wikipedia]
Simple linear regression
Regression model: y = β0 + β1 x + ε

Regression equation: E(y) = β0 + β1 x

Sample data: (x1, y1), (x2, y2), ..., (xn, yn)

Estimated regression equation: ŷ = b0 + b1 x

The sample statistics b0 and b1 provide estimates of β0 and β1.
Least squares method

Sample data from 10 pizza parlor restaurants situated near college campuses.

Restaurant   Student population (x, '000s)   Sales (y, Rs. '000s)
1            2                               58
2            6                               105
3            8                               88
4            8                               118
5            12                              117
6            16                              137
7            20                              157
8            20                              169
9            22                              149
10           26                              202
Least squares method

Least squares criterion

• Min ∑(yi - ŷi)²

Estimated regression equation is

ŷ = b0 + b1 x

Where b1 = [∑(xi - xbar)(yi - ybar)] / ∑(xi - xbar)²

And where b0 = ybar - b1 xbar
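
The formulas for b1 and b0 can be checked against the pizza parlor data. The following is a minimal sketch in plain Python; the function name least_squares and the variable names are assumptions chosen for this example.

```python
# Pizza parlor sample data.
students = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]             # x, '000s
sales = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]    # y, Rs. '000s


def least_squares(x, y):
    """Return (b0, b1) for the estimated regression equation y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # slope
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1


b0, b1 = least_squares(students, sales)
print(b0, b1)  # 60.0 5.0
```

With xbar = 14 and ybar = 130, the slope is 2840 / 568 = 5 and the intercept is 130 - 5(14) = 60.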


[Figure: scatter plot with fitted line y = 5x + 60; x-axis: Student Population ('000s)]
SSE

xi = student population ('000s), yi = sales (Rs. '000s), ŷi = predicted sales = 5xi + 60

Restaurant   xi    yi    ŷi    Error (yi - ŷi)   Squared error (yi - ŷi)²
1            2     58    70    -12               144
2            6     105   90    15                225
3            8     88    100   -12               144
4            8     118   100   18                324
5            12    117   120   -3                9
6            16    137   140   -3                9
7            20    157   160   -3                9
8            20    169   160   9                 81
9            22    149   170   -21               441
10           26    202   190   12                144
                                                 SSE = 1530
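
The SSE table can be reproduced with a few lines of Python. This is a sketch only; the constants B0 and B1 hold the coefficients of the fitted line ŷ = 5x + 60, and all names are assumptions made for this example.

```python
# Pizza parlor sample data.
students = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]             # xi, '000s
sales = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]    # yi, Rs. '000s

B0, B1 = 60, 5  # coefficients of the fitted line y_hat = 5x + 60

predicted = [B0 + B1 * x for x in students]                  # y_hat_i
errors = [y - y_hat for y, y_hat in zip(sales, predicted)]   # yi - y_hat_i
squared_errors = [e ** 2 for e in errors]

# Print one row per restaurant, in the same order as the table.
for i, (x, y, y_hat, e, e2) in enumerate(
        zip(students, sales, predicted, errors, squared_errors), start=1):
    print(i, x, y, y_hat, e, e2)

sse = sum(squared_errors)
print("SSE =", sse)  # SSE = 1530
```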
SST

xi = student population ('000s), yi = sales (Rs. '000s), ȳ = 130

Restaurant   xi    yi    Error (yi - ȳ)   Squared error (yi - ȳ)²
1            2     58    -72              5184
2            6     105   -25              625
3            8     88    -42              1764
4            8     118   -12              144
5            12    117   -13              169
6            16    137   7                49
7            20    157   27               729
8            20    169   39               1521
9            22    149   19               361
10           26    202   72               5184
                                          SST = 15730
Coefficient of determination

SSE = ∑(yi - ŷi)²

SST = ∑(yi - ȳ)²

SSR = ∑(ŷi - ȳ)² = SST - SSE

r² = SSR / SST

Correlation coefficient r = (sign of b1) √r²
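
For the pizza parlor data, these quantities can be computed as follows. The snippet is an illustrative sketch in plain Python; the variable names are assumptions, and the fitted line ŷ = 5x + 60 is taken as given.

```python
import math

# Pizza parlor sample data.
students = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]             # xi, '000s
sales = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]    # yi, Rs. '000s
B0, B1 = 60, 5  # coefficients of the fitted line y_hat = 5x + 60

y_bar = sum(sales) / len(sales)                              # 130.0
sse = sum((y - (B0 + B1 * x)) ** 2 for x, y in zip(students, sales))
sst = sum((y - y_bar) ** 2 for y in sales)
ssr = sst - sse

r_squared = ssr / sst
# The correlation coefficient takes the sign of the slope b1.
r = math.copysign(math.sqrt(r_squared), B1)

print(sse, sst, ssr)                    # 1530 15730.0 14200.0
print(round(r_squared, 4), round(r, 4)) # 0.9027 0.9501
```

About 90% of the variability in sales is explained by the linear relationship with student population.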


Coefficient of determination

Xi   1   2   3   4    5
Yi   3   7   5   11   14

The estimated regression equation for these data is ŷ = 0.20 + 2.60x

A) Compute SSE, SST and SSR

B) Compute r²

C) Compute r
Testing for significance

E(y) = β0 + β1 x. If the value of β1 is 0, then E(y) = β0 for every x, so the mean of y does not depend on x and x and y are not linearly related. If β1 is not 0, x and y are linearly related.

To test the significance of the relationship, conduct a hypothesis test to determine whether the value of β1 is zero.

MSE, the mean square error, estimates the value of σ².

s² is an unbiased estimator of σ²

So s² = MSE = SSE / (n - 2)

There are n - 2 degrees of freedom because two parameters, β0 and β1, are estimated (by b0 and b1) to compute SSE.
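
As a worked example, the MSE for the pizza parlor data follows from SSE = 1530 and n = 10. This is a minimal sketch; the variable names are assumptions made for this example.

```python
import math

SSE = 1530   # sum of squared errors for the pizza parlor data
n = 10       # number of restaurants in the sample

mse = SSE / (n - 2)   # s^2, the estimate of sigma^2, with n - 2 degrees of freedom
s = math.sqrt(mse)    # standard error of the estimate

print(mse)            # 191.25
print(round(s, 3))    # 13.829
```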
Testing for significance

• H0: β1 = 0

• Ha: β1 ≠ 0

Sampling distribution of b1

• E(b1) = β1

• Standard deviation σb1 = σ / √∑(xi - xbar)²

• Estimated standard deviation sb1 = s / √∑(xi - xbar)²
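
The sketch below applies these formulas to the pizza parlor data. Two things in it are not stated on the slides and should be read as assumptions: the test statistic t = b1 / sb1, which is the usual statistic for this test, and the critical value 2.306, which is the t-table value for a two-tailed test at alpha = .05 with n - 2 = 8 degrees of freedom. The variable names are also chosen only for this example.

```python
import math

# Pizza parlor sample data.
students = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]             # xi, '000s
sales = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]    # yi, Rs. '000s

n = len(students)
x_bar = sum(students) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in students)               # sum of (xi - xbar)^2
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(students, sales)) / s_xx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(students, sales))
s = math.sqrt(sse / (n - 2))     # square root of MSE
s_b1 = s / math.sqrt(s_xx)       # estimated standard deviation of b1
t_stat = b1 / s_b1               # assumption: usual t statistic for H0: beta1 = 0

T_CRITICAL = 2.306               # assumption: t-table value, alpha = .05, two-tailed, 8 df
reject_h0 = abs(t_stat) > T_CRITICAL

print(round(s_b1, 4), round(t_stat, 2), reject_h0)  # 0.5803 8.62 True
```

Because the statistic is well beyond the critical value, H0: β1 = 0 would be rejected for these data under the stated assumptions.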


Exercise

Xi   2    3    5    1    8
Yi   25   25   20   30   16

• A) Compute the mean square error

• B) Use the t-test to test for significance at alpha = .05

• C) Use the F-test to test the hypothesis at the .05 level of significance. Present the results in ANOVA table format.
