
Regression Modelling

Week 2

Week 2 1 / 66
1 Estimation of Regression Function (Ch 1.6)

2 Properties of Fitted Regression Line (Ch 1.6)

3 Estimation of Error Terms Variance σ² (Ch 1.7)

4 Normal Error Regression Model (Ch 1.8)

5 Analysis of Variance (ANOVA) Approach to Regression Analysis (Ch 2.7)

6 Coefficient of Determination (Ch 2.9)

Week 2 2 / 66
Estimation of Regression Function (Ch 1.6)

Week 2 3 / 66
Estimated Regression Function

Regression Model: Y = β0 + β1 X + ε.
Regression Function: E (Y ) = β0 + β1 X .
Use least squares estimation to estimate β0 and β1 .
Estimated regression function:

Ŷ = b0 + b1 X ,

where
b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{S_{xy}}{S_{xx}}, \qquad b_0 = \bar{Y} - b_1 \bar{X},

and we call Ŷ the value of the estimated regression function at the level X
of the predictor variable.

Week 2 4 / 66
Estimated Regression Function

We call the value of Y a response and E (Y ) the mean response. So


Ŷ is a point estimator of the mean response E (Y ) at the level X of
the predictor variable.
It can be shown that Ŷ is an unbiased point estimator of the mean
response E (Y ).
Given a new level X , we can use Ŷ as a (point) prediction of the
response.
For the i-th observation in the data, Yi is the observed value, and we
call
Ŷi = b0 + b1 Xi ,
the fitted value for the i-th observation.

Week 2 5 / 66
Residuals

The i-th residual is the difference between the observed value Yi and
the corresponding fitted value Ŷi , denoted as ei :

ei = Yi − Ŷi .

For our model, the i-th residual becomes

ei = Yi − (b0 + b1 Xi ).

Week 2 6 / 66
Residuals

Do not confuse
εi = Yi − E (Yi ) "Model error"
ei = Yi − Ŷi "Residual"

εi : deviation from the unknown true regression line, unobservable.


ei : deviation from the estimated regression line, known.
Residuals are highly useful for studying whether a given regression
model is appropriate for the data at hand.

Week 2 7 / 66
Residuals
(Figure: illustrative scatter plot of Y against a predictor with a fitted regression line, used to visualise residuals as the deviations of the observations from the line.)
Week 2 8 / 66
Toluca Company Example

The Toluca Company manufactures refrigeration equipment as well as


many replacement parts. In the past, one of the replacement parts has
been produced periodically in lots of varying sizes. When a cost
improvement program was undertaken, company officials wished to
determine the optimum lot size for producing this part. The production
of this part involves setting up the production process (which must be
done regardless of the lot size) and machining and assembly
operations. One key input for the model to ascertain the optimum lot
size was the relationship between lot size and labor hours required to
produce the lot. To determine this relationship, data on lot size and
work hours for 25 recent production runs were utilized.
Page 19, Chapter 1 of the textbook.

Week 2 9 / 66
Load Data

Use dataset from the R package “ALSM”.


# install.packages("ALSM")
library("ALSM")
mydata <- TolucaCompany
dim(mydata)

> [1] 25 2
head(mydata, 2)

> x y
> 1 80 399
> 2 30 121
X <- mydata[,1] # or X <- mydata$x
Y <- mydata[,2] # or Y <- mydata$y

X = “Lot Size” and Y = “Work Hours”

Week 2 10 / 66
Load Data

Or download “Kutner Textbook Datasets” from Wattle, file named


“CH01TA01.txt”.
mydata <- read.table("CH01TA01.txt")
head(mydata, 2)

> V1 V2
> 1 80 399
> 2 30 121
names(mydata) <- c("x","y")
head(mydata, 2)

> x y
> 1 80 399
> 2 30 121

Week 2 11 / 66
Summary Statistics

summary(mydata)

> x y
> Min. : 20 Min. :113.0
> 1st Qu.: 50 1st Qu.:224.0
> Median : 70 Median :342.0
> Mean : 70 Mean :312.3
> 3rd Qu.: 90 3rd Qu.:389.0
> Max. :120 Max. :546.0

Week 2 12 / 66
Summary Statistics
boxplot(mydata) # or boxplot(X); boxplot(Y)
(Figure: side-by-side boxplots of x and y.)
Week 2 13 / 66
Scatter Plot
plot(X, Y, pch = 16, xlab = "Lot size", ylab = "Work hours",
main = "Toluca Company")
(Figure: scatter plot of Work hours against Lot size, titled "Toluca Company".)

Week 2 14 / 66
Fitting Model Manually
Recall we have
b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \qquad b_0 = \bar{Y} - b_1 \bar{X}.
Xbar <- mean(X)
Ybar <- mean(Y)
Xcenter <- X - Xbar
Ycenter <- Y - Ybar
Sxy <- sum(Xcenter*Ycenter) # or sum(X*Y)-length(X)*mean(X)*mean(Y)
Sxx <- sum(Xcenter^2) # or sum(X^2)-length(X)*mean(X)*mean(X)
Sxy

> [1] 70690


Sxx

> [1] 19800


Week 2 15 / 66
Fitting Model Manually

b1 <- Sxy/Sxx
b1

> [1] 3.570202


b0 <- Ybar - b1*Xbar
b0

> [1] 62.36586

Week 2 16 / 66
Fitting Model Manually

Another way to calculate b1 ,


b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{r_{xy} s_y s_x}{s_x^2} = \frac{r_{xy} s_y}{s_x},

where rxy is the sample correlation between X and Y, and sx and sy are
the sample standard deviations of X and Y, respectively.

Week 2 17 / 66
Fitting Model Manually

b1 <- cov(X, Y)/var(X)


b1

> [1] 3.570202


# or
b1 <- cor(X, Y)*sd(Y)/sd(X)
b1

> [1] 3.570202

Week 2 18 / 66
Fitting Model with “lm” Function

mymodel <- lm(Y ~ X)


class(mymodel)

> [1] "lm"


# or
# mymodel <- lm(y ~ x, data = mydata)

View(mymodel)

Week 2 19 / 66
Fitting Model with “lm” Function

mymodel$coefficients # or use coef(mymodel)

> (Intercept) X
> 62.365859 3.570202

Week 2 20 / 66
Fitting Model with “lm” Function

summary(mymodel)

>
> Call:
> lm(formula = Y ~ X)
>
> Residuals:
> Min 1Q Median 3Q Max
> -83.876 -34.088 -5.982 38.826 103.528
>
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept) 62.366 26.177 2.382 0.0259 *
> X 3.570 0.347 10.290 4.45e-10 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 48.82 on 23 degrees of freedom
> Multiple R-squared: 0.8215, Adjusted R-squared: 0.8138
> F-statistic: 105.9 on 1 and 23 DF, p-value: 4.449e-10

Week 2 21 / 66
Estimated Regression Line
Ŷ = 62.366 + 3.570X
plot(X, Y, pch = 16)
abline(mymodel, col = "purple", lty = 2, lwd = 2)
(Figure: scatter plot of the Toluca data with the fitted regression line overlaid.)

Week 2 22 / 66
Fitted Values

Yhat <- b0 + b1*X # manually compute


Yfit <- mymodel$fitted.values # get it from model output
# or Yfit <- fitted(mymodel)
Yhat

> [1] 347.9820 169.4719 240.8760 383.6840 312.2800 276.5780 490.7901 347.9820
> [9] 419.3861 240.8760 205.1739 312.2800 383.6840 133.7699 455.0881 419.3861
> [17] 169.4719 240.8760 383.6840 455.0881 169.4719 383.6840 205.1739 347.9820
> [25] 312.2800
Yfit

> 1 2 3 4 5 6 7 8
> 347.9820 169.4719 240.8760 383.6840 312.2800 276.5780 490.7901 347.9820
> 9 10 11 12 13 14 15 16
> 419.3861 240.8760 205.1739 312.2800 383.6840 133.7699 455.0881 419.3861
> 17 18 19 20 21 22 23 24
> 169.4719 240.8760 383.6840 455.0881 169.4719 383.6840 205.1739 347.9820
> 25
> 312.2800

Week 2 23 / 66
Predict Y for a New Observation

# Predict Y at the new level X = 85


Xnew <- 85
Ypredict <- b0 + b1*Xnew
# or Ypredict <- mymodel$coefficients[1] + mymodel$coefficients[2]*Xnew
Ypredict

> [1] 365.833


# Or use matrix multiplication
Xnew <- c(1, 85)
Ypredict <- Xnew %*% mymodel$coefficients
Ypredict

> [,1]
> [1,] 365.833
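The same prediction can also be obtained with R's built-in predict() function (not shown on the slide); a minimal sketch, assuming the model was fitted as mymodel <- lm(Y ~ X) above:

# newdata must be a data frame whose column name matches the predictor in the formula
predict(mymodel, newdata = data.frame(X = 85))
# returns approximately 365.833, matching the manual calculation above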

Week 2 24 / 66
Residuals
Res <- Y - Yhat # manually compute
Res

> [1] 51.0179798 -48.4719192 -19.8759596 -7.6840404 48.7200000 -52.5779798


> [7] 55.2098990 4.0179798 -66.3860606 -83.8759596 -45.1739394 -60.2800000
> [13] 5.3159596 -20.7698990 -20.0880808 0.6139394 42.5280808 27.1240404
> [19] -6.6840404 -34.0880808 103.5280808 84.3159596 38.8260606 -5.9820202
> [25] 10.7200000
Res <- mymodel$residuals # get it from model output
# or Res <- residuals(mymodel)
Res

> 1 2 3 4 5 6
> 51.0179798 -48.4719192 -19.8759596 -7.6840404 48.7200000 -52.5779798
> 7 8 9 10 11 12
> 55.2098990 4.0179798 -66.3860606 -83.8759596 -45.1739394 -60.2800000
> 13 14 15 16 17 18
> 5.3159596 -20.7698990 -20.0880808 0.6139394 42.5280808 27.1240404
> 19 20 21 22 23 24
> -6.6840404 -34.0880808 103.5280808 84.3159596 38.8260606 -5.9820202
> 25
> 10.7200000

Week 2 25 / 66
Properties of Fitted Regression Line (Ch 1.6)

Week 2 26 / 66
Properties of Fitted Regression Line

1. The sum of residuals is zero: \sum_{i=1}^{n} e_i = 0.

sum(Res) # Not exactly equal zero due to some rounding errors

> [1] -2.930989e-14

Week 2 27 / 66
Properties of Fitted Regression Line

2. The sum of the squared residuals, \sum_{i=1}^{n} e_i^2, is a minimum.
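A quick numerical check of this property (not on the slides), reusing the objects b0, b1, X and Y created earlier: any perturbation of the least squares estimates should increase the sum of squared residuals.

sum((Y - (b0 + b1*X))^2)          # SSE at the least squares estimates (about 54825)
sum((Y - (b0 + (b1 + 0.1)*X))^2)  # perturbing the slope gives a larger sum of squares
sum((Y - ((b0 + 5) + b1*X))^2)    # perturbing the intercept also gives a larger value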

Week 2 28 / 66
Properties of Fitted Regression Line

3. The sum of the observed values Yi equals the sum of the fitted values Ŷi :

\sum_{i=1}^{n} Y_i = \sum_{i=1}^{n} \hat{Y}_i .

It follows that the mean of the fitted values Ŷi is the same as the
mean of the observed values Yi , namely, Ȳ .
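A quick check with the Toluca objects computed earlier (not on the slides):

sum(Y)       # sum of the observed values
sum(Yhat)    # sum of the fitted values; agrees with sum(Y) up to rounding error
mean(Yhat)   # equals Ybar, the mean of the observed values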

Week 2 29 / 66
Properties of Fitted Regression Line

4. The sum of the weighted residuals is zero when the residual in the ith
trial is weighted by the level of the predictor variable in the ith trial:

\sum_{i=1}^{n} X_i e_i = 0.

sum(X*Res) # Not exactly equal zero due to some rounding errors

> [1] 7.105427e-15

Week 2 30 / 66
Properties of Fitted Regression Line

5. The sum of the weighted residuals is zero when the residual in the ith
trial is weighted by the fitted value of the response variable for the ith
trial:

\sum_{i=1}^{n} \hat{Y}_i e_i = 0.

sum(Yhat*Res)

> [1] -4.774847e-12

Week 2 31 / 66
Properties of Fitted Regression Line

6. The regression line always goes through the point (X̄ , Ȳ ).
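This can be verified numerically with the quantities computed earlier (a small check, not on the slides):

b0 + b1*Xbar   # the fitted value at X = Xbar
Ybar           # equals the value above, so (Xbar, Ybar) lies on the fitted line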

Week 2 32 / 66
Estimation of Error Terms Variance σ² (Ch 1.7)

Week 2 33 / 66
Estimation of Error Terms Variance σ²

The variance σ² of the error terms εi needs to be estimated as it
measures the variability of the probability distribution of Y.

In addition, many statistical inferences about the regression model, as well
as prediction intervals for Y, require an estimator of σ².

Week 2 34 / 66
Estimation of Error Terms Variance σ²

Recall: Estimate σ² for a single population

s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}.

(Degrees of freedom is the number of values in the final calculation of a
statistic that are free to vary.)

\sum_{i=1}^{n} (Y_i - \bar{Y})^2 is called a sum of squares, where Yi − Ȳ is the
deviation of Yi from the estimated mean Ȳ.
The degrees of freedom (DF) is n − 1 since we lost one degree of
freedom by using Ȳ as an estimator of the unknown population mean.
The estimator s² can be regarded as a mean square (sum of squares/DF).
It is unbiased: E(s²) = σ².
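As a small check in R (not on the slides), this mean square is exactly what the built-in var() function computes:

n <- length(Y)
sum((Y - mean(Y))^2)/(n - 1)   # sum of squares divided by its n - 1 degrees of freedom
var(Y)                         # built-in sample variance gives the same value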

Week 2 35 / 66
Estimation of Error Terms Variance σ²

For the regression model, to estimate σ² (i.e., Var(Y)) we also need to
calculate a sum of squared deviations, but the deviation of an observation
Yi must be calculated around its own estimated mean Ŷi, i.e., ei = Yi − Ŷi.
The appropriate sum of squares, denoted by SSE, is

SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 .

SSE stands for error sum of squares or residual sum of squares. It


has n − 2 degrees of freedom as two degrees of freedom are lost
because we need to estimate two parameters to obtain Ŷi .

Week 2 36 / 66
Estimation of Error Terms Variance σ²

Our estimator of σ² is a mean square, i.e.,

s_e^2 = MSE = \frac{SSE}{DF} = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - 2} .

MSE stands for error mean square or residual mean square.

An estimator of the standard deviation σ is s_e = \sqrt{MSE}.
It can be shown that MSE is an unbiased point estimator of σ², i.e.,

E(MSE) = σ².

Week 2 37 / 66
Estimation of Error Terms Variance σ²

SSE <- sum(Res^2)


SSE

> [1] 54825.46


n <- length(Y) # or n <- dim(mydata)[1]
MSE <- SSE/(n-2)
MSE

> [1] 2383.716


# Estimator of the standard deviation sigma
sigma_hat = sqrt(MSE)
sigma_hat

> [1] 48.82331


# This is also called "residual standard error"

Week 2 38 / 66
Estimation of Error Terms Variance σ²

>
> Call:
> lm(formula = Y ~ X)
>
> Residuals:
> Min 1Q Median 3Q Max
> -83.876 -34.088 -5.982 38.826 103.528
>
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept) 62.366 26.177 2.382 0.0259 *
> X 3.570 0.347 10.290 4.45e-10 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 48.82 on 23 degrees of freedom
> Multiple R-squared: 0.8215, Adjusted R-squared: 0.8138
> F-statistic: 105.9 on 1 and 23 DF, p-value: 4.449e-10

Week 2 39 / 66
Normal Error Regression Model (Ch 1.8)

Week 2 40 / 66
Normal Error Regression Model

The normal error regression model is

Yi = β0 + β1 Xi + εi ,

where the errors εi are iid N(0, σ²).

This model implies that Yi ∼ N(β0 + β1 Xi, σ²), independently.
Throughout this course, unless otherwise stated, we assume the normal
error regression model is applicable. This assumption simplifies the theory
and helps us set up the procedures of statistical inference.

Regardless of the form of the distribution of εi, the least squares method
provides unbiased estimators of β0 and β1 that have minimum variance
among all unbiased linear estimators (the Gauss–Markov theorem).
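An illustrative simulation of the normal error model (not from the slides; the true values β0 = 60, β1 = 3.5 and σ = 50 are hypothetical, chosen to loosely mimic the Toluca fit):

set.seed(123)
beta0 <- 60; beta1 <- 3.5; sigma <- 50                         # hypothetical true values
Xsim <- runif(200, min = 20, max = 120)                        # predictor levels
Ysim <- beta0 + beta1*Xsim + rnorm(200, mean = 0, sd = sigma)  # add normal errors
coef(lm(Ysim ~ Xsim))                                          # estimates close to 60 and 3.5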

Week 2 41 / 66
Analysis of Variance (ANOVA) Approach to
Regression Analysis (Ch 2.7)

Week 2 42 / 66
ANOVA Approach to Regression Analysis

We consider the regression analysis from the perspective of analysis of


variance.
Useful for checking if some predictors are relevant to the regression
model.
The analysis of variance is based on the partitioning of sums of squares
and degrees of freedom associated with the response variable Y .

Week 2 43 / 66
Partitioning of Total Sum of Squares

The total variation of Yi :

SSTO = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 .

SSTO stands for total sum of squares.


If all Yi observations are the same, SSTO = 0.
The greater the variation among Yi , the larger is SSTO.
SSTO is a measure of the uncertainty of Yi when the information of Xi
is not taken into account.

Week 2 44 / 66
Partitioning of Total Sum of Squares

If we take Xi into account, the variation of Yi is

SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 .

If all Yi observations fall on the fitted regression line, SSE = 0,


meaning "perfect model".
The greater the variation of the Yi observations around the fitted
regression line, the larger is SSE .

Week 2 45 / 66
Partitioning of Total Sum of Squares

SSTO <- sum(Ycenter^2)


SSTO

> [1] 307203


SSE

> [1] 54825.46


SSTO - SSE

> [1] 252377.6

Why is there a large difference between these two sums of squares?

Week 2 46 / 66
Partitioning of Total Sum of Squares
(Figure: the same illustrative scatter plot of Y with its fitted regression line, shown here to motivate the partitioning of the total variation in Y.)
Week 2 47 / 66
Partitioning of Total Sum of Squares

The difference, as we show shortly, is another sum of squares:

SSR = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 .

SSR stands for regression sum of squares.


SSR represents the variability in Yi which is associated with the
regression line.
The larger SSR is in relation to SSTO, the greater is the effect of the
regression relation in accounting for the total variation in Yi .
If the regression line is horizontal, then SSR = 0. Otherwise, SSR is
positive.

Week 2 48 / 66
Partitioning of Total Sum of Squares

SSR <- sum((Yhat - Ybar)^2)


SSR # equal to SSTO-SSE

> [1] 252377.6


SSR/SSTO

> [1] 0.8215335

Week 2 49 / 66
Formal Development of Partitioning

SSTO = SSR + SSE

Proof: We can easily see that Yi − Ȳ = (Yi − Ŷi) + (Ŷi − Ȳ).
Also, it can be shown that \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i) = 0. Then

SSTO = \sum_{i=1}^{n} (Y_i - \bar{Y})^2
     = \sum_{i=1}^{n} (Y_i - \hat{Y}_i + \hat{Y}_i - \bar{Y})^2
     = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 + 2 \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i)
     = SSR + SSE.

Week 2 50 / 66
Breakdown of Degrees of Freedom
SSTO has n − 1 degrees of freedom

One degree of freedom is lost by using Ȳ as an estimator of the unknown
population mean. Equivalently, the deviations Yi − Ȳ are subject to one
constraint: \sum_{i=1}^{n} (Y_i - \bar{Y}) = 0.

SSE has n − 2 degrees of freedom

Two degrees of freedom are lost due to estimating two parameters to obtain
Ŷi .

SSR has 1 degree of freedom

Only two degrees of freedom are associated with a regression line,
corresponding to the intercept and the slope. But one is lost because of the
constraint \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y}) = 0.

df_{SSTO} = df_{SSR} + df_{SSE}.


Week 2 51 / 66
Mean Squares
A sum of squares divided by its associated degrees of freedom is called a
mean square.
The sample variance of Y is a mean square.

Regression mean square:

MSR = \frac{SSR}{1} = SSR

Error (residual) mean square:

MSE = \frac{SSE}{n - 2}

Mean squares are not additive.


Week 2 52 / 66
ANOVA Table

Source of Variation   Sum of Squares        df      Mean Square
Regression            SSR = Σ(Ŷi − Ȳ)²      1       MSR = SSR/1
Error                 SSE = Σ(Yi − Ŷi)²     n − 2   MSE = SSE/(n − 2)
Total                 SSTO = Σ(Yi − Ȳ)²     n − 1

Week 2 53 / 66
Expected Mean Squares

MSE and MSR are random variables and we have

E(MSE) = \sigma^2

E(MSR) = \sigma^2 + \beta_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2

When β1 = 0, the means of the sampling distributions of MSE and
MSR are the same.
When β1 ≠ 0, the mean of the sampling distribution of MSR is larger
than that of MSE.
Comparing MSR and MSE should therefore be useful for testing whether β1 = 0.
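A small simulation sketch illustrating these expectations (not from the slides; the design points and parameter values below are hypothetical):

set.seed(1)
sigma2 <- 2500; beta1sim <- 2
xx <- seq(20, 120, length.out = 25)              # fixed design points
msr <- mse <- numeric(2000)
for (r in 1:2000) {
  yy <- 60 + beta1sim*xx + rnorm(25, sd = sqrt(sigma2))
  a <- anova(lm(yy ~ xx))                        # ANOVA table for this simulated dataset
  msr[r] <- a$"Mean Sq"[1]
  mse[r] <- a$"Mean Sq"[2]
}
mean(mse)                                        # close to sigma2
mean(msr)                                        # close to the theoretical value below
sigma2 + beta1sim^2 * sum((xx - mean(xx))^2)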

Week 2 54 / 66
F Test

To test

H_0: \beta_1 = 0 \quad \text{vs.} \quad H_a: \beta_1 \neq 0,

or, equivalently,

H_0: \sigma^2_{\text{model}} / \sigma^2_{\text{error}} = 1 \quad \text{vs.} \quad H_a: \sigma^2_{\text{model}} / \sigma^2_{\text{error}} > 1,

where \sigma^2_{\text{error}} = E(MSE) and \sigma^2_{\text{model}} = E(MSR), we can use the test statistic

F^* = \frac{MSR}{MSE}.

What’s the distribution of F ∗ under the null hypothesis?

Week 2 55 / 66
F Test

Recall:
If Z1 and Z2 are two independent χ²-distributed random variables with
degrees of freedom df1 and df2 respectively, then the ratio

\frac{Z_1 / df_1}{Z_2 / df_2}

follows an F distribution with (df1, df2) degrees of freedom.
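A quick simulation check of this fact (not on the slides), using the degrees of freedom 1 and 23 that apply to the Toluca data:

set.seed(2)
z1 <- rchisq(1e5, df = 1)
z2 <- rchisq(1e5, df = 23)
ratio <- (z1/1)/(z2/23)
quantile(ratio, 0.95)   # simulated 0.95 quantile of the ratio
qf(0.95, 1, 23)         # matches the F(1, 23) quantile (about 4.28)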

Week 2 56 / 66
F Test

Under the null hypothesis (β1 = 0), it can be shown that

SSR/σ² is distributed as χ² with 1 degree of freedom,
SSE/σ² is distributed as χ² with n − 2 degrees of freedom,
SSE and SSR are independent.

So

F^* \sim F(1, n - 2) \quad \text{under } H_0.

Week 2 57 / 66
F test

This is an upper-tail test.


Large values of F* support Ha and values of F* near 1 support H0,
so we only reject H0 for large values of F*.

With a significance level α, we reject H0 when

F^* > F_{\alpha}(1, n - 2),

where F_{\alpha}(1, n - 2) is the upper α quantile of the F(1, n − 2) distribution.

Week 2 58 / 66
ANOVA (manually)

# Regression mean square


MSR <- SSR/1
Fstat <- MSR/MSE
Fstat

> [1] 105.8757


critical_value <- qf(0.95, 1, n-2)
# or qf(0.05, 1, n-2, lower.tail = FALSE)
critical_value

> [1] 4.279344


pvalue <- 1 - pf(Fstat, 1, n-2)
pvalue

> [1] 4.448828e-10

Week 2 59 / 66
ANOVA (by R function)

anova(mymodel)

> Analysis of Variance Table


>
> Response: Y
> Df Sum Sq Mean Sq F value Pr(>F)
> X 1 252378 252378 105.88 4.449e-10 ***
> Residuals 23 54825 2384
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Week 2 60 / 66
Coefficient of Determination (Ch 2.9)

Week 2 61 / 66
Coefficient of Determination (R²)

The coefficient of determination is

R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO}.

It measures the proportion of total variation in Y that can be explained
by the fitted linear regression model.
0 ≤ R² ≤ 1.
When all observations fall on the fitted regression line, then SSE = 0
and R² = 1. It means that the covariate X accounts for all variation in
Y.
When the fitted regression line is horizontal, then SSE = SSTO and
R² = 0. It indicates the covariate X is of no help in reducing the
variation in Y with linear regression.

Week 2 62 / 66
R²

Common misunderstandings:
A high coefficient of determination indicates that useful predictions can
be made.
A high coefficient of determination indicates that the estimated
regression line is a good fit.
A coefficient of determination near zero indicates that X and Y are
not related.
To assess the goodness of fit and adequacy of a regression model, conclusions
cannot be made just based on the value of R². You also need to look at
residual plots, test the significance of the model, etc.
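An illustrative sketch of the second misunderstanding (not from the slides; the simulated data are hypothetical): a clearly curved relationship can still produce a high R² under a straight-line fit.

set.seed(3)
xx <- seq(1, 10, length.out = 50)
yy <- (xx - 2)^2 + rnorm(50, sd = 2)   # the true relationship is quadratic, not linear
curvefit <- lm(yy ~ xx)
summary(curvefit)$r.squared            # high (around 0.9) even though a line is a poor model
plot(xx, residuals(curvefit))          # residuals show a clear curved pattern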

Week 2 63 / 66
R2

In SLR, R² = r², where r is the sample correlation coefficient between X
and Y.
The closer R² is to 1, the greater is said to be the degree of linear
association between X and Y.
R2 <- SSR/SSTO
R2

> [1] 0.8215335


# Check with the coefficient of correlation
cor(X, Y)^2

> [1] 0.8215335

Week 2 64 / 66
R² in R

summary(mymodel)

>
> Call:
> lm(formula = Y ~ X)
>
> Residuals:
> Min 1Q Median 3Q Max
> -83.876 -34.088 -5.982 38.826 103.528
>
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept) 62.366 26.177 2.382 0.0259 *
> X 3.570 0.347 10.290 4.45e-10 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 48.82 on 23 degrees of freedom
> Multiple R-squared: 0.8215, Adjusted R-squared: 0.8138
> F-statistic: 105.9 on 1 and 23 DF, p-value: 4.449e-10
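The R² values can also be extracted directly from the summary object (not shown on the slide):

summary(mymodel)$r.squared        # multiple R-squared, 0.8215
summary(mymodel)$adj.r.squared    # adjusted R-squared, 0.8138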

Week 2 65 / 66
Read Ch 1.6-1.8, 2.7, 2.9 of the textbook.

Week 2 66 / 66
