
Name: KIURA CLARE WATHIMU

Reg No: SCM222-1261/2022

REGRESSION ANALYSIS ASSIGNMENT

a) Scatter Plot Matrix and Correlation Matrix

To obtain the scatter plot matrix and correlation matrix:
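(The data frame data used throughout is assumed to already be loaded. A minimal loading sketch, assuming the 52 weekly records sit in a whitespace-delimited file grocery.txt with columns Y, X1, X2 and X3 in that order; both the file name and the column order are assumptions, not part of the assignment:)

# Read the weekly records; file name and column order are assumed
data <- read.table("grocery.txt", header = FALSE,
                   col.names = c("Y", "X1", "X2", "X3"))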

pairs(data[, c("X1", "X2", "X3", "Y")], main="Scatter Plot Matrix")

The scatter plot matrix allows us to visually inspect:

Whether the relationships between any pair of variables are linear or non-linear.

Potential outliers or clusters of points that might affect the model.

cor_matrix <- cor(data[, c("X1", "X2", "X3", "Y")])

print(cor_matrix)

The correlation matrix quantifies the strength and direction of the linear relationship between each pair of variables.

b) Fit a Multiple Regression Model

model <- lm(Y ~ X1 + X2 + X3, data=data)

summary(model)

β̂1: the estimated change in total labor hours (Y) for each additional case shipped (X1), holding the other predictors constant.

β̂2: the estimated change in total labor hours (Y) for each one percentage-point increase in the indirect labor cost percentage (X2), holding the other predictors constant.

β̂3: the estimated difference in total labor hours (Y) between weeks with a holiday (X3 = 1) and weeks without a holiday (X3 = 0), holding the other predictors constant.
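The estimated coefficients referred to above, together with their standard errors and t-tests, can be read off the fitted model; a short sketch:

# Estimated regression coefficients (intercept, beta1-hat, beta2-hat, beta3-hat)
coef(model)

# Full coefficient table: estimates, standard errors, t statistics and p-values
coef(summary(model))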

c) Obtain Residuals and Box Plot

residuals <- model$residuals

boxplot(residuals, main="Box Plot of Residuals")

The plot helps us see whether the residuals are roughly symmetric about zero and free of extreme values, in which case the regression model can be considered appropriate. If there are many outliers, or the residuals show a systematic pattern, the model might need adjustment.
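To complement the visual check, the observations the box plot flags as outliers can also be listed; a small sketch using base R's boxplot statistics:

# Residuals flagged as outliers by the usual box plot rule (beyond 1.5 * IQR from the quartiles)
boxplot.stats(residuals)$out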
d) Residuals Against Y, X1, X2, X3, and Normal Probability Plot

# Plot residuals against Y

plot(data$Y, residuals, main="Residuals vs Y", xlab="Y", ylab="Residuals")
abline(h=0, col="red")

# Plot residuals against X1

plot(data$X1, residuals, main="Residuals vs X1", xlab="X1", ylab="Residuals")
abline(h=0, col="red")

# Plot residuals against X2

plot(data$X2, residuals, main="Residuals vs X2", xlab="X2", ylab="Residuals")
abline(h=0, col="red")

# Plot residuals against X3

plot(data$X3, residuals, main="Residuals vs X3", xlab="X3", ylab="Residuals")
abline(h=0, col="red")

# Normal probability plot (Q-Q plot)

qqnorm(residuals)

qqline(residuals, col="red")

e) Brown-Forsythe Test for Constancy of Error Variance

fitted_values <- model$fitted.values

# Split the 52 observations into two groups of 26 according to the size of the fitted values
group <- factor(ifelse(rank(fitted_values) <= 26, "low", "high"))

# Brown-Forsythe test: Levene's test on the residuals from part (c), using the group medians as centers
library(car)
bf_test <- leveneTest(residuals, group, center = median)
print(bf_test)

f) Test for a Regression Relation

# F-test for regression relation

anova(model)

# Interpretation: if p-value < 0.05, reject the null hypothesis of no regression relation, i.e. the regression relation is significant

The p-value of the overall F-test indicates whether the regression model explains a significant amount of the variation in total labor hours.
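Note that anova(model) reports sequential F-tests for the individual terms; the single overall F statistic for H0: β1 = β2 = β3 = 0 can also be extracted from the model summary, as in this short sketch:

# Overall F statistic with its numerator and denominator degrees of freedom
f <- summary(model)$fstatistic
f

# Corresponding p-value of the overall F-test
pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)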

g) Bonferroni Procedure for Estimating β1 and β3

# Bonferroni joint confidence intervals for beta1 and beta3
# For a 95% family confidence level with g = 2 intervals, each interval is computed at level 1 - 0.05/2 = 0.975

confint(model, parm = c("X1", "X3"), level = 1 - 0.05/2)

After computing the joint confidence intervals for β1 and β3, the results are interpreted according to whether each interval contains zero: an interval that includes zero suggests the corresponding predictor has no significant effect.
h) Coefficient of Multiple Determination (R²)
# Coefficient of multiple determination R²

summary(model)$r.squared

R² Interpretation:

R² is the proportion of the variance in the dependent variable (Y, total labor hours) that is explained by the independent variables (X1, X2, and X3).

For example, if R² = 0.65, then 65% of the variation in total labor hours is explained by the model, leaving 35% unexplained. The higher the R², the better the model explains the variation in the dependent variable.
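As a check on this interpretation, R² can also be computed directly from the residual and total sums of squares, reusing the residuals object from part (c):

# R² = 1 - SSE/SSTO, the proportion of the total variation in Y explained by the model
SSE  <- sum(residuals^2)
SSTO <- sum((data$Y - mean(data$Y))^2)
1 - SSE / SSTO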
