
HW5

2024-10-16

Problem 6.11
a)
H0: b1 = b2 = b3 = 0
Ha: not all bk (k = 1, 2, 3) equal 0
Decision rule: reject H0 if the test statistic F* exceeds the critical value F(0.95; 3, 48).
F(0.95; 3, 48) = 2.798061
F* = 35.33703
Since F* is much larger than the critical value, we reject the null hypothesis.
The p-value (3.315708e-12) is also far below .05, so we reject the null hypothesis and conclude that there is a regression relation.
This implies that at least one of B1, B2, B3 is relevant.
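
For reference, a minimal sketch of this overall F test using `anova()`, assuming the data frame `data` with Y, X1, X2, X3 built as in the appendix code:

```r
# Overall F test of H0: b1 = b2 = b3 = 0 (sketch; assumes 'data' from the appendix code)
mlr      <- lm(Y ~ X1 + X2 + X3, data = data)
null_fit <- lm(Y ~ 1, data = data)          # intercept-only (reduced) model
anova(null_fit, mlr)                        # F* and its p-value for the overall test
qf(0.95, df1 = 3, df2 = nrow(data) - 4)     # critical value F(0.95; 3, n - 4)
```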
b)
B1 CI = ( -5.64608e-05 , 0.001630622 )
B3 CI = ( 478.6096 , 768.4993 )
These are Bonferroni joint intervals with a 95% family confidence coefficient.
The CI for B1 includes 0, so we cannot reject H0: b1 = 0.
The CI for B3 does not include 0, so we reject H0: b3 = 0.
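
The same joint intervals can be obtained with `confint()` by raising each interval's level to 1 - 0.05/2 (two intervals at a 95% family coefficient); a minimal sketch assuming the fitted model object `mlr` from the appendix code:

```r
# Bonferroni joint 95% CIs for b1 and b3: each interval at individual level 1 - 0.05/2
confint(mlr, parm = c("X1", "X3"), level = 1 - 0.05/2)
```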
c)
R squared = 0.6883342
This is the proportion of the total variation in Y (SSR/SSTO) that is explained by X1, X2, and X3 in the model.
Problem 6.16
a)
H0: b1 = b2 = b3 = 0
Ha: not all bk (k = 1, 2, 3) equal 0
Decision rule: reject H0 if the test statistic F* exceeds the critical value F(0.90; 3, 42).
F(0.90; 3, 42) = 2.219059
F* = 35.33703
Since F* is much larger than the critical value, we reject the null hypothesis.
The p-value (1.541973e-10) is also far below the .10 significance level, so we reject the null hypothesis and conclude that there is a regression relation.
This implies that at least one of B1, B2, B3 is relevant.
b)

B1 CI = ( -1.614248 , -0.6689755 )
B2 CI = ( -1.52451 , 0.6405013 )
B3 CI = ( -29.09203 , 2.151701 )
I used Bonferroni joint CIs because we were given a family confidence coefficient.
The CI for B1 does not include 0, so we can reject H0: b1 = 0 at the 90% family confidence level.
The CI for B2 includes 0, so we cannot reject H0: b2 = 0 at the 90% family confidence level.
The CI for B3 includes 0, so we cannot reject H0: b3 = 0 at the 90% family confidence level.
R squared = 0.6821943
This is the proportion of the total variation in Y (SSR/SSTO) that is explained by X1, X2, and X3 in the model.
Problem 6.17
a)
90% CI for Yh hat = ( 64.52854 , 73.49204 )
This interval estimates the mean response E{Yh} at the given X values: we are 90% confident that the mean response lies in this range.
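
The same interval can be obtained directly with `predict()`; a sketch assuming the fitted model `mlr2` and the Xh values (35, 45, 2.2) used in the appendix code:

```r
# 90% confidence interval for the mean response E{Yh} at Xh1 = 35, Xh2 = 45, Xh3 = 2.2
newXh <- data.frame(X1.2 = 35, X2.2 = 45, X3.2 = 2.2)
predict(mlr2, newdata = newXh, interval = "confidence", level = 0.90)
```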
b)
The 90% prediction interval = ( 52.09065 , 85.92992 )
For a future observation with the given X parameter values, we are 90% confident it will fall within our prediction interval.
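
The prediction interval comes from the same `predict()` call with `interval = "prediction"`, reusing `newXh` and `mlr2` from the sketch above:

```r
# 90% prediction interval for a single new observation at the same Xh values
predict(mlr2, newdata = newXh, interval = "prediction", level = 0.90)
```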
Problem 7.4
a)
Extra sum of squares for X1 = 136366.2
Extra sum of squares for X3 | X1 = 2033565
Extra sum of squares for X2 | X1,X3 = 6674.588
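
These values can also be read directly from a sequential ANOVA table when the predictors are entered in the order X1, X3, X2; a sketch assuming the same `data` object as in the appendix code:

```r
# Sequential (Type I) sums of squares: SSR(X1), SSR(X3 | X1), SSR(X2 | X1, X3)
anova(lm(Y ~ X1 + X3 + X2, data = data))
```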
b)
H0: B2 = 0 (reduced model)
Ha: B2 != 0 (full model)
F* = 0.3250843
F(0.95; 1, 48) = 4.042652
Since F* < F(0.95; 1, 48), we cannot say with 95% confidence that B2 belongs in the model. We fail to reject H0, so we should probably drop X2.
P-value = 0.5712274
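
Equivalently, this partial F test can be run by comparing the reduced and full models with `anova()`; a sketch under the same assumptions about `data`:

```r
# Partial F test of H0: b2 = 0, given that X1 and X3 are in the model
reduced <- lm(Y ~ X1 + X3, data = data)
full    <- lm(Y ~ X1 + X2 + X3, data = data)
anova(reduced, full)   # reports F* and its p-value on 1 and n - 4 df
```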
c)
Expression 1 = 142092.2
Expression 2 = 142092.2
They are equal here, and they will always be equal: SSR(X1) + SSR(X2|X1) and SSR(X2) + SSR(X1|X2) are both decompositions of SSR(X1, X2), the total variation in Y explained jointly by X1 and X2.
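
A quick numerical check of this (a sketch, same assumed `data` object): both orderings of the sequential sums of squares add up to SSR(X1, X2).

```r
# SSR(X1) + SSR(X2 | X1) and SSR(X2) + SSR(X1 | X2) both equal SSR(X1, X2)
sum(anova(lm(Y ~ X1 + X2, data = data))[c("X1", "X2"), "Sum Sq"])  # SSR(X1) + SSR(X2 | X1)
sum(anova(lm(Y ~ X2 + X1, data = data))[c("X2", "X1"), "Sum Sq"])  # SSR(X2) + SSR(X1 | X2)
```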
Problem 7.9
H0: B1 = -1, B2 = 0 (reduced model): Y = B0 - X1 + B3*X3
Ha: not both equalities hold (full model): Y = B0 + B1*X1 + B2*X2 + B3*X3
SSR(X1) from anova(mlr2) = 8275.389; critical value F(0.975; 2, 42) = 4.03271

summary(mlrX3)

##
## Call:
## lm(formula = Y2.0 ~ X3.2 - X1.2, data = dataX3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -20.369 -9.606 -1.946 9.212 31.631
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 146.449 15.304 9.569 2.55e-12 ***
## X3.2 -37.117 6.637 -5.593 1.33e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.33 on 44 degrees of freedom
## Multiple R-squared: 0.4155, Adjusted R-squared: 0.4022
## F-statistic: 31.28 on 1 and 44 DF, p-value: 1.335e-06
anova(mlr2)

## Analysis of Variance Table


##
## Response: Y2.0
## Df Sum Sq Mean Sq F value Pr(>F)
## X1.2 1 8275.4 8275.4 81.8026 2.059e-11 ***
## X2.2 1 480.9 480.9 4.7539 0.03489 *
## X3.2 1 364.2 364.2 3.5997 0.06468 .
## Residuals 42 4248.8 101.2
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
At the .025 significance level, since the test statistic exceeds the critical value F(0.975; 2, 42), we reject the null hypothesis.
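
For completeness, one way to carry this general linear test through end to end is sketched below. It assumes `data2` and `mlr2` from the appendix code, and it builds the reduced model by moving the hypothesized coefficient B1 = -1 to the left-hand side; this is not the approach used in the code above, so treat it as illustrative only.

```r
# General linear test of H0: b1 = -1, b2 = 0 against the full model (sketch)
reduced <- lm(I(Y2.0 + X1.2) ~ X3.2, data = data2)   # reduced model: Y + X1 = b0 + b3*X3
full    <- mlr2                                      # full model: Y ~ X1 + X2 + X3
sse_r  <- sum(resid(reduced)^2)
sse_f  <- sum(resid(full)^2)
f_star <- ((sse_r - sse_f) / 2) / (sse_f / full$df.residual)  # 2 constraints; denominator df = n - 4
f_star
qf(0.975, 2, full$df.residual)                       # critical value at alpha = 0.025
pf(f_star, 2, full$df.residual, lower.tail = FALSE)  # p-value
```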
#
# ---
# title: "HW5"
# output: pdf_document
# date: "2024-10-16"
# ---
#
# Problem 6.11
#
# a)
#
# ```{r, echo=FALSE, results='asis'}
# Y<-c(4264,4496,4317,4292,4945,4325,4110,4111,4161,4560,4401,4251,4222,4063,4343,4833,4453,4195,4394,40
# X1<-c(305657,328476,317164,366745,265518,301995,269334,267631,296350,277223,269189,277133,282892,30663
# X2<-c(7.17,6.2,4.61,7.02,8.61,6.88,7.23,6.27,6.49,6.37,7.05,6.34,6.94,8.56,6.71,5.82,6.82,8.38,7.72,7.
# X3<-c(0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,
# library(broom)
# data <- data.frame(Y, X1, X2, X3)
# mlr<-lm(Y ~ X1+X2+X3, data = data)

# sm = summary(mlr)
# pVal = glance(sm)$p.value[[1]]
# ```
# H0: b1=b2=b3=0
#
# Ha: not all b# = 0
#
# Decision rule: reject H0 if the test statistic F* exceeds the critical value F(0.95; 3, 48).
#
# ```{r, echo=FALSE, results='asis'}
# cat("F = ",qf(.95,3,48),"\n\n")
# cat("F* = ",glance(sm)$statistic, "\n\n")
# ```
#
# as can be seen, the F* is way bigger so we reject the null hypothesis
#
# The p-value is also very small (less than .05) so we reject the null hypothesis
# and state that there is a regression relation.
#
# This implies that at least one of B1 B2 B3 are relevant
# ```{r, echo=FALSE, results='asis'}
# cat("p-value = ", pVal)
# ```
#
# b)
#
# ```{r, echo=FALSE, results='asis'}
# B0 = mlr$coefficients[[1]]
# B1 = mlr$coefficients[[2]]
# B2 = mlr$coefficients[[3]]
# B3 = mlr$coefficients[[4]]
# B = qt(1-.05/4,48)
# SEB1 = sm$coef[[2,2]]
# SEB3<-sm$coef[[4,2]]
# cat("B1 CI = (", B1 - B*SEB1, ", ", B1 + B*SEB1,")")
# cat("B3 CI = (", B3 - B*SEB3, ", ", B3 + B*SEB3,")")
# ```
# The CI for B1 includes 0, so we can't reject H0: b1 = 0 with 95% confidence.
#
# The CI for B3 does not include 0, so we can reject H0: b3 = 0 with 95% confidence.
#
# c)
#
# ```{r, echo=FALSE, results='asis'}
# cat("R squared = ", sm$r.squared)
# ```
#
# This is the proportion of the total variation in Y (SSR/SSTO) explained by X1, X2, and X3 in the model
#
# Problem 6.16
#
# a)
#

# ```{r, echo=FALSE, results='asis'}
# Y2.0<-c(48,57,66,70,89,36,46,54,26,77,89,67,47,51,57,66,79,88,60,49,77,52,60,86,43,34,63,72,57,55,59,8
# X1.2<-c(50,36,40,41,28,49,42,45,52,29,29,43,38,34,53,36,33,29,33,55,29,44,43,23,47,55,25,32,32,42,33,3
# X2.2<-c(51,46,48,44,43,54,50,48,62,50,48,53,55,51,54,49,56,46,49,51,52,58,50,41,53,54,49,46,52,51,42,4
# X3.2<-c(2.3,2.3,2.2,1.8,1.8,2.9,2.2,2.4,2.9,2.1,2.4,2.4,2.2,2.3,2.2,2,2.5,1.9,2.1,2.4,2.3,2.9,2.3,1.8,
# data2 <- data.frame(Y2.0, X1.2, X2.2, X3.2)
# mlr2<-lm(Y2.0 ~ X1.2+X2.2+X3.2, data = data2)
# sm2 = summary(mlr2)
# pVal2 = glance(sm2)$p.value[[1]]
# ```
#
# H0: b1=b2=b3=0
#
# Ha: not all b# = 0
#
# Decision rule: reject H0 if the test statistic F* exceeds the critical value F(0.90; 3, 42).
#
# ```{r, echo=FALSE, results='asis'}
# cat("F = ",qf(.90,3,42),"\n\n")
# cat("F* = ",glance(sm)$statistic, "\n\n")
# ```
#
# as can be seen, the F* is way bigger so we reject the null hypothesis
#
# The p-value is also very small (less than .05) so we reject the null hypothesis
# and state that there is a regression relation.
#
# This implies that at least one of B1 B2 B3 are relevant
# ```{r, echo=FALSE, results='asis'}
# cat("p-value = ", pVal2)
# ```
#
# b)
#
# ```{r, echo=FALSE, results='asis'}
# B0.2 = mlr2$coefficients[[1]]
# B1.2 = mlr2$coefficients[[2]]
# B2.2 = mlr2$coefficients[[3]]
# B3.2 = mlr2$coefficients[[4]]
# B2.0 = qt(1-.1/6,42)
# SEB1.2 = sm2$coef[[2,2]]
# SEB2.2 = sm2$coef[[3,2]]
# SEB3.2 = sm2$coef[[4,2]]
# cat("B1 CI = (", B1.2 - B2.0*SEB1.2, ", ", B1.2 + B2.0*SEB1.2,")\n\n")
# cat("B2 CI = (", B2.2 - B2.0*SEB2.2, ", ", B2.2 + B2.0*SEB2.2,")\n\n")
# cat("B3 CI = (", B3.2 - B2.0*SEB3.2, ", ", B3.2 + B2.0*SEB3.2,")\n\n")
# ```
#
# I used Bonferroni joint CIs because we were given a family confidence coefficient
#
# The CI for B1 does not include 0, so we can reject H0: b1 = 0 with 90% confidence.
#
# The CI for B2 includes 0, so we can't reject H0: b2 = 0 with 90% confidence.

#
# The CI for B3 includes 0, so we can't reject H0: b3 = 0 with 90% confidence.
#
# ```{r, echo=FALSE, results='asis'}
# cat("R squared = ", sm2$r.squared)
# ```
#
# This is the proportion of the total variation in Y (SSR/SSTO) explained by X1, X2, and X3 in the model
#
# Problem 6.17
#
# a)
#
# ```{r, echo=FALSE, results='asis'}
# bVec = c(B0.2,B1.2,B2.2,B3.2)
# xHVec = c(1,35,45,2.2)
# Temp = rep(1, length(Y2.0))  # column of ones for the intercept
# Xmatrix = matrix(c(Temp,X1.2,X2.2,X3.2),ncol=4)
#
# MSE = (t(Y2.0) %*% Y2.0 - t(bVec) %*% t(Xmatrix) %*% Y2.0)/42
# MSE1 = sum((B0.2+X1.2*B1.2+X2.2*B2.2+X3.2*B3.2 - Y2.0)^2)/42
# XTX = t(Xmatrix)%*%Xmatrix
#
# library(matlib)
# seBsquared = MSE1*(solve(XTX))
#
# YHhat = t(xHVec) %*% bVec
# SEYHhat = sqrt(t(xHVec) %*% seBsquared %*% xHVec)
# tVal7 = qt(.95,42)
# cat("90% CI for Yh hat = (", YHhat - tVal7*SEYHhat, ",", YHhat + tVal7*SEYHhat, ")")
# ```
#
# This interval estimates the mean response E{Yh} at the given X values with 90% confidence
#
# b)
#
# ```{r, echo=FALSE, results='asis'}
# # prediction variance = MSE plus the variance of the estimated mean response
# SEpred = sqrt(MSE1 + SEYHhat^2)
# cat("The 90% prediction interval = (", YHhat-tVal7*SEpred, ",", YHhat+tVal7*SEpred,")")
# ```
# For a future observation with the given X parameter values, we are 90% confident it will fall within our prediction interval
#
# Problem 7.4
#
# a)
#
# ```{r, echo=FALSE, results='asis'}
# #Extra sum of squares is SSE model+new var(s) - SSE model
# dataX1 <- data.frame(Y, X1)
# mlrX1<-lm(Y ~ X1, data = dataX1)
# B0X1 = mlrX1$coefficients[[1]]
# B1X1 = mlrX1$coefficients[[2]]

# SSEX1 = sum((B0X1+X1*B1X1 - Y)^2)
#
# dataX13 <- data.frame(Y, X1, X3)
# mlrX13<-lm(Y ~ X1+X3, data = dataX13)
# B0X13 = mlrX13$coefficients[[1]]
# B1X13 = mlrX13$coefficients[[2]]
# B3X13 = mlrX13$coefficients[[3]]
# SSEX13 = sum((B0X13+X1*B1X13+X3*B3X13 - Y)^2)
#
# data <- data.frame(Y, X1, X2, X3)
# mlr<-lm(Y ~ X1+X2+X3, data = data)
# SSEfull = sum((B0+X1*B1+B2*X2+X3*B3 - Y)^2)
#
#
# dataX2 <- data.frame(Y, X2)
# mlrX2<-lm(Y ~ X2, data = dataX2)
# B0X2 = mlrX2$coefficients[[1]]
# B2X2 = mlrX2$coefficients[[2]]
# SSEX2 = sum((B0X2+B2X2*X2 - Y)^2)
#
# cat("Extra sum of squares for X1 = ", anova(mlr)[[1,2]])
# cat("\n\nExtra sum of squares for X3 | X1 = ", SSEX1-SSEX13)
# cat("\n\nExtra sum of squares for X2 | X1,X3 = ", SSEX13-SSEfull)
# ```
#
# b)
#
# ```{r, echo=FALSE, results='asis'}
# #F* = MSR(X2|X1,X3)/MSE(full)
# Fdrop = (SSEX13-SSEfull)*48/SSEfull
# ```
#
# H0 : B2 = 0 (reduced model)
# Ha : B2 != 0 (full model)
#
# ```{r, echo=FALSE, results='asis'}
# cat("F* = ", Fdrop)
# cat("\n\nF =",qf(.95,1,48))
# ```
#
# Since F* < F we cannot say at with 95% confidence that B2 should be in the model. We fail to reject H0
#
# ```{r, echo=FALSE, results='asis'}
# cat("P-value = ", pf(Fdrop,1,48,lower.tail=FALSE))
# ```
#
# c)
#
# ```{r, echo=FALSE, results='asis'}
# dataX12 <- data.frame(Y, X1, X2)
# mlrX12<-lm(Y ~ X1+X2, data = dataX12)
# B0X12 = mlrX12$coefficients[[1]]
# B1X12 = mlrX12$coefficients[[2]]

# B2X12 = mlrX12$coefficients[[3]]
# SSEX12 = sum((B0X12+X1*B1X12+X2*B2X12 - Y)^2)
#
# cat("Expression 1 =", anova(mlr)[[1,2]] + (SSEX1-SSEX12))
#
# cat("\n\nExpression 2 = ", anova(mlrX2)[[1,2]] + (SSEX2-SSEX12))
# ```
#
# They are equal here, and they will always be equal: both expressions are decompositions of SSR(X1, X2), the total variation in Y explained jointly by X1 and X2.
#
# Problem 7.9
#
# H0: B1 = -1, B2 = 0 (reduced model): Y = B0 - X1 + B3*X3
#
# Ha: not both equalities hold (full model): Y = B0 + B1*X1 + B2*X2 + B3*X3
#
# ```{r, echo=FALSE, results='asis'}
# #SSR(X1,X2|X3)*42/SSE(full)/2
# #SSE(b1 = -1, X3)-SSE(X1,X2,X3) / SSE full * 21
# dataX3 <- data.frame(Y2.0, X3.2)
# mlrX3<-lm(Y2.0 ~ X3.2-X1.2, data = dataX3)
# B0X3 = mlrX3$coefficients[[1]]
# B3X3 = mlrX3$coefficients[[2]]
# SSEX3 = sum((B0X3+B3X3*X3.2 - Y2.0)^2)
# cat("SSR(X1) from anova(mlr2) = ", anova(mlr2)[1,2], "\n\n")
# cat("F(0.975; 2, 42) = ", qf(.975,2,42))
# ```
# ```{r}
# summary(mlrX3)
# anova(mlr2)
# ```
# At the .025 significance level, since the test statistic exceeds the critical value F(0.975; 2, 42), we reject the null hypothesis.
