Lab-9 RMD

This lab analyzes test score and class size data for 10 schools. It finds a strong negative correlation (r ≈ -0.95) between class size and test scores. A fitted linear regression shows that class size is a statistically significant predictor of test scores, explaining about 91% of the variation in scores. A similar analysis of income and test score data finds a moderate positive correlation (r ≈ 0.61), but the regression slope is significant only at the 10% level (p ≈ 0.062).

Lab 9

Maira Sulaimanova

2023-11-14
Class_Size <- c(23,19,30,22,23,29,35,36,33,25)
Test_Score <- c(430,430,333,410,390,377,325,310,328,385)
plot(Class_Size,Test_Score,xlab="class size",ylab="test score")

mean(Test_Score)

## [1] 371.8

var(Test_Score)

## [1] 2022.178

sd(Test_Score)

## [1] 44.96863

cov(Class_Size,Test_Score)

## [1] -254.2222
cor(Class_Size,Test_Score)

## [1] -0.953319
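
As a check on the output above, the correlation coefficient is the covariance divided by the product of the two standard deviations. The following is a minimal sketch that reuses the lab's vectors; the name `r_manual` is introduced here for illustration only.

```r
# Same data as in the lab
Class_Size <- c(23,19,30,22,23,29,35,36,33,25)
Test_Score <- c(430,430,333,410,390,377,325,310,328,385)

# Correlation = covariance / (sd of x * sd of y)
r_manual <- cov(Class_Size, Test_Score) / (sd(Class_Size) * sd(Test_Score))

# Should agree with the built-in cor() up to floating-point error
all.equal(r_manual, cor(Class_Size, Test_Score))
```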

mod <- lm(Test_Score ~ Class_Size)


summary(mod)

##
## Call:
## lm(formula = Test_Score ~ Class_Size)
##
## Residuals:
## Min 1Q Median 3Q Max
## -20.727 -4.665 -2.404 5.475 25.669
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 570.5994 22.7243 25.110 6.77e-09 ***
## Class_Size -7.2291 0.8096 -8.929 1.96e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.4 on 8 degrees of freedom
## Multiple R-squared: 0.9088, Adjusted R-squared: 0.8974
## F-statistic: 79.74 on 1 and 8 DF, p-value: 1.963e-05
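
With a single regressor, the coefficients reported above can be reproduced from the summary statistics computed earlier: the slope is cov(x, y) / var(x), the intercept is mean(y) minus the slope times mean(x), and R-squared is the squared correlation. The sketch below assumes nothing beyond the lab's data; `slope_manual`, `intercept_manual`, `r2_manual` and `mod_check` are hypothetical names used only for this check.

```r
# Same data as in the lab
Class_Size <- c(23,19,30,22,23,29,35,36,33,25)
Test_Score <- c(430,430,333,410,390,377,325,310,328,385)

# OLS slope and intercept for a simple (one-regressor) regression
slope_manual     <- cov(Class_Size, Test_Score) / var(Class_Size)
intercept_manual <- mean(Test_Score) - slope_manual * mean(Class_Size)

# With one regressor, R-squared equals the squared correlation
r2_manual <- cor(Class_Size, Test_Score)^2

# Compare against lm(); both comparisons should return TRUE
mod_check <- lm(Test_Score ~ Class_Size)
all.equal(unname(coef(mod_check)), c(intercept_manual, slope_manual))
all.equal(summary(mod_check)$r.squared, r2_manual)
```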

plot(Test_Score ~ Class_Size)
abline(mod, col = "red")
library(AER)

## Loading required package: car

## Loading required package: carData

## Loading required package: lmtest

## Loading required package: zoo

##
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':


##
## as.Date, as.Date.numeric

## Loading required package: sandwich

## Loading required package: survival

data("CASchools")
income <- CASchools$income[1:10]
data_df <- data.frame(income, Test_Score)

mod1 <- lm(income ~ Test_Score, data = data_df)

summary(mod1)

##
## Call:
## lm(formula = income ~ Test_Score, data = data_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.1394 -2.0373 -0.2803 0.8577 8.7266
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -12.57491 10.65244 -1.180 0.272
## Test_Score 0.06172 0.02846 2.168 0.062 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.84 on 8 degrees of freedom
## Multiple R-squared: 0.3701, Adjusted R-squared: 0.2914
## F-statistic: 4.701 on 1 and 8 DF, p-value: 0.06199

cor(data_df$income, data_df$Test_Score)

## [1] 0.6083904

plot(data_df$Test_Score, data_df$income, xlab = "test Scores", ylab = "income",
     main = "scatterplot: income vs test Scores")
abline(mod1, col = "red")
inc_mod <- summary(mod1)
print(inc_mod)

##
## Call:
## lm(formula = income ~ Test_Score, data = data_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.1394 -2.0373 -0.2803 0.8577 8.7266
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -12.57491 10.65244 -1.180 0.272
## Test_Score 0.06172 0.02846 2.168 0.062 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.84 on 8 degrees of freedom
## Multiple R-squared: 0.3701, Adjusted R-squared: 0.2914
## F-statistic: 4.701 on 1 and 8 DF, p-value: 0.06199
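
The t value and p-value in the `Test_Score` row can be reproduced from the printed estimate and standard error. The sketch below copies the rounded figures from the table above, so it matches the output only to about three decimals; the variable names are chosen here for illustration. The result shows why the slope carries the `.` code: it is significant at the 10% level but not at the 5% level.

```r
# Values copied (rounded) from the Test_Score row of the coefficient table
estimate  <- 0.06172
std_error <- 0.02846
df_resid  <- 8   # residual degrees of freedom reported by summary()

# t statistic for H0: slope = 0
t_value <- estimate / std_error

# Two-sided p-value from the t distribution with 8 degrees of freedom
p_value <- 2 * pt(-abs(t_value), df = df_resid)

round(c(t = t_value, p = p_value), 3)
```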

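# SSR must come from mod1 (the income model) so that it is on the same
# scale as the TSS of income; using mod here would mix two different models.
SSR <- sum(mod1$residuals^2)

TSS <- sum((data_df$income - mean(data_df$income))^2)
R_squared <- 1 - SSR / TSS

# Expected to be about 117.96, i.e. 8 * 3.84^2 from the residual standard error
cat("SSR (sum of squared residuals):", SSR, "\n")

cat("TSS (total sum of squares):", TSS, "\n")

## TSS (total sum of squares): 187.2865

# Expected to agree with "Multiple R-squared: 0.3701" in the summary of mod1
cat("R-squared:", R_squared, "\n")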

residuals <- resid(mod1)

SER <- sqrt(sum(residuals^2) / (length(residuals) - 2))


cat("SER (standard error of the regression):", SER, "\n")

## SER (standard error of the regression): 3.839994
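
The identities R-squared = 1 - SSR/TSS and SER = sqrt(SSR / (n - 2)) hold only when SSR and TSS come from the same model and the same response variable. The sketch below checks both on the class-size regression, whose data are listed in full in this lab; all names ending in `_check` are introduced here for illustration.

```r
# Same data as in the lab
Class_Size <- c(23,19,30,22,23,29,35,36,33,25)
Test_Score <- c(430,430,333,410,390,377,325,310,328,385)
mod_check <- lm(Test_Score ~ Class_Size)

# SSR and TSS both refer to Test_Score, the response of mod_check
SSR_check <- sum(resid(mod_check)^2)
TSS_check <- sum((Test_Score - mean(Test_Score))^2)

R2_check  <- 1 - SSR_check / TSS_check
SER_check <- sqrt(SSR_check / df.residual(mod_check))  # n - 2 = 8

# Both comparisons should return TRUE
all.equal(R2_check, summary(mod_check)$r.squared)
all.equal(SER_check, summary(mod_check)$sigma)
```

Because SSR can never exceed TSS when both come from the same OLS fit with an intercept, R-squared computed this way always lies between 0 and 1.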
