Make Up Cat
Uploaded by Gershon Ayieko

MakeUpCat

BDATC01/0843/2022
2025-01-11

R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring
HTML, PDF, and MS Word documents. For more details on using R Markdown see
http://rmarkdown.rstudio.com.
When you click the Knit button, a document will be generated that includes both the
content and the output of any embedded R code chunks within the document. You can
embed an R code chunk like this:
# Part i) Simulate the scenario
# Set seed for reproducibility
set.seed(123)
n_questions <- 20
prob_correct <- 0.6
simulated_results <- rbinom(1, n_questions, prob_correct)
cat("Simulated number of correct guesses:", simulated_results, "\n")

## Simulated number of correct guesses: 13

# Part ii) Find the probability of at least 11 correct guesses


prob_at_least_11 <- sum(dbinom(11:n_questions, n_questions, prob_correct))
cat("Probability of at least 11 correct guesses:", prob_at_least_11, "\n")

## Probability of at least 11 correct guesses: 0.7553372
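
Equivalently, this upper-tail probability can be cross-checked with the binomial CDF; a minimal sketch, assuming the same n = 20 questions and p = 0.6 as above:

```r
# P(X >= 11) = 1 - P(X <= 10), using the cumulative distribution function
prob_check <- 1 - pbinom(10, size = 20, prob = 0.6)
cat("Cross-check via pbinom:", prob_check, "\n")  # matches the dbinom sum above
```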

# Part iii) Estimate the number of correct guesses with 99% probability of occurrence
lower_bound <- qbinom(0.005, n_questions, prob_correct)
upper_bound <- qbinom(0.995, n_questions, prob_correct)
cat("Number of correct guesses with 99% probability of occurrence:",
    lower_bound, "to", upper_bound, "\n")

## Number of correct guesses with 99% probability of occurrence: 6 to 17

# Function to generate Fourier series


generate_fourier_series <- function(x, n_terms) {
  a0 <- 0.5
  series <- a0
  for (n in 1:n_terms) {
    # Harmonic coefficients (illustrative choice: an = bn = 1/n)
    an <- 1 / n
    bn <- 1 / n
    series <- series + an * cos(n * x) + bn * sin(n * x)
  }
  return(series)
}

# Example usage
x <- seq(0, 2 * pi, length.out = 100)
n_terms <- 10
fourier_series <- generate_fourier_series(x, n_terms)

# Plotting the Fourier series


plot(x, fourier_series, type = "l", col = "blue", lwd = 2,
     main = "Fourier Series", xlab = "x", ylab = "f(x)")

# i.) Interpret Parameter Estimates


# In a multiple linear regression model, the parameter estimates
# (coefficients) represent the relationship between each predictor variable
# and the response variable, holding all other predictors constant.
# Intercept (β0): This is the expected value of the response variable when
# all predictor variables are zero.
# Slope Coefficients (β1, β2, ...): Each slope coefficient represents the
# change in the response variable for a one-unit change in the corresponding
# predictor variable, holding all other predictors constant.
# To comment on parameter and variable significance, we look at the p-values
# associated with each coefficient:
# Significant Variables: If the p-value is less than the chosen significance
# level (e.g., 0.05), the corresponding predictor variable is considered
# statistically significant.
# Non-significant Variables: If the p-value is greater than the significance
# level, the predictor variable is not statistically significant.
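
As an illustrative sketch of reading these estimates and p-values off a fitted model, using the built-in trees data (the same data modelled later in this document):

```r
# Fit a multiple linear regression on the built-in trees data
fit <- lm(Volume ~ Girth + Height, data = trees)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coefs <- summary(fit)$coefficients
print(coefs)

# Flag predictors significant at the 5% level
significant <- rownames(coefs)[coefs[, "Pr(>|t|)"] < 0.05]
cat("Significant at 0.05:", significant, "\n")
```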

# ii.) Comment on Overall Model Fit


# The overall fit of the model can be assessed using several metrics:
# R-squared (R²): This measures the proportion of the variance in the
# response variable that is explained by the predictor variables. A higher
# R² indicates a better fit.
# Adjusted R-squared: This adjusts the R² value for the number of predictors
# in the model, providing a more accurate measure of model fit.
# F-statistic: This tests the overall significance of the model. A
# significant F-statistic (p-value < 0.05) indicates that the model provides
# a better fit than a model with no predictors.
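
These fit metrics can be extracted directly from the model summary; a sketch using the same trees model fitted later in this document:

```r
fit <- lm(Volume ~ Girth + Height, data = trees)
s <- summary(fit)

cat("R-squared:", s$r.squared, "\n")
cat("Adjusted R-squared:", s$adj.r.squared, "\n")
cat("F-statistic:", s$fstatistic["value"], "on",
    s$fstatistic["numdf"], "and", s$fstatistic["dendf"], "DF\n")

# Overall-model p-value from the F distribution
p_overall <- pf(s$fstatistic["value"], s$fstatistic["numdf"],
                s$fstatistic["dendf"], lower.tail = FALSE)
cat("Model p-value:", p_overall, "\n")
```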

# iii.) Write the Estimated Model Equation


# The estimated model equation for a multiple linear regression model is
# typically written as:
# Y_hat = β0 + β1*X1 + β2*X2 + ... + βn*Xn
# where:
# Y_hat is the predicted value of the response variable.
# β0 is the intercept.
# β1, β2, ..., βn are the slope coefficients for the predictor variables
# X1, X2, ..., Xn.
# For example, if the model includes predictors for tree height (Height) and
# diameter (Diameter), the estimated model equation might look like:
# Volume_hat = β0 + β1*Height + β2*Diameter
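
A sketch of assembling such an equation from fitted coefficients, using the trees model fitted later in this document (rounding to three decimals is an illustrative choice):

```r
fit <- lm(Volume ~ Girth + Height, data = trees)
b <- round(coef(fit), 3)

# Paste the fitted coefficients into the equation template
eq <- paste0("Volume_hat = ", b[1], " + ", b[2], "*Girth + ", b[3], "*Height")
cat(eq, "\n")
```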

# Load necessary libraries


library(car)

## Warning: package 'car' was built under R version 4.4.2

## Loading required package: carData

library(regclass)

## Warning: package 'regclass' was built under R version 4.4.2

## Loading required package: bestglm

## Warning: package 'bestglm' was built under R version 4.4.2

## Loading required package: leaps

## Warning: package 'leaps' was built under R version 4.4.2


## Loading required package: VGAM

## Warning: package 'VGAM' was built under R version 4.4.2

## Loading required package: stats4

## Loading required package: splines

##
## Attaching package: 'VGAM'

## The following object is masked from 'package:car':


##
## logit

## Loading required package: rpart

## Loading required package: randomForest

## Warning: package 'randomForest' was built under R version 4.4.2

## randomForest 4.7-1.2

## Type rfNews() to see new features/changes/bug fixes.

## Important regclass change from 1.3:


## All functions that had a . in the name now have an _
## all.correlations -> all_correlations, cor.demo -> cor_demo, etc.

# Fit the multiple linear regression model


model <- lm(Volume ~ Girth + Height, data = trees)

# Assumption 1: Linearity
# Plot residuals vs fitted values
plot(model$fitted.values, model$residuals,
     main = "Residuals vs Fitted Values",
     xlab = "Fitted Values", ylab = "Residuals")
abline(h = 0, col = "red")
# Assumption 2: Independence
# Durbin-Watson test for autocorrelation
dw_test <- durbinWatsonTest(model)
print(dw_test)

# Shapiro-Wilk test for normality


shapiro_test <- shapiro.test(model$residuals)
cat("Shapiro-Wilk test p-value:", shapiro_test$p.value, "\n")

## Shapiro-Wilk test p-value: 0.6439824

# Assumption 3: No multicollinearity
# Variance Inflation Factor (VIF)
vif_values <- vif(model)
cat("VIF values:\n")

## VIF values:

print(vif_values)

## Girth Height
## 1.36921 1.36921

# Function that accepts two vectors Y and X1, fits a simple linear
# regression model, and computes the statistic L
compute_statistic_L <- function(Y, X1) {
  # Fit the linear regression model
  model <- lm(Y ~ X1)

  # Extract coefficients
  beta0 <- coef(model)[1]
  beta1 <- coef(model)[2]

  # Compute residuals
  residuals <- Y - (beta0 + beta1 * X1)

  # Compute the sample variance S^2 of Y
  S2 <- sum((Y - mean(Y))^2) / (length(Y) - 1)

  # Compute L = RSS / S^2
  L <- sum(residuals^2) / S2

  return(L)
}

# Example usage
Y <- c(10, 20, 30, 40, 50)
X1 <- c(1, 2, 3, 4, 5)
L <- compute_statistic_L(Y, X1)
cat("Computed statistic L:", L, "\n")

## Computed statistic L: 4.139942e-30
