Assignment2 Stats
1. Consider the following dataset representing the relationship between the number of
hours spent on studying and the exam scores for a group of students: (5) (CO2)
Hours 4 6 8 10 12 14 16 18 20
Scores 60 65 70 75 80 85 90 95 98
2. Suppose you have a dataset with three independent variables X1, X2, X3, and a
dependent variable Y: (5) (CO2)
X1 X2 X3 Y
3 4 2 25
5 6 3 35
7 8 4 45
a. Fit a multiple linear regression model using this data and write down the estimated
regression equation.
b. Calculate the coefficient of determination R^2 for the model.
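A minimal sketch of how question 2 could be checked numerically, assuming NumPy is available. Note that in this data X2 = X1 + 1 and X3 = X2 / 2, and there are only three observations for four parameters, so the least-squares coefficients are not unique. The sketch reports the minimum-norm solution returned by `np.linalg.lstsq`; the explicit `rcond` cutoff is an assumption chosen here to make the rank detection robust. Any solution that reproduces Y exactly (for example Y = 10 + 5 X1) gives the same fitted values and the same R^2.

```python
import numpy as np

# Data from question 2.
X = np.array([[3, 4, 2],
              [5, 6, 3],
              [7, 8, 4]], dtype=float)
y = np.array([25, 35, 45], dtype=float)

# Design matrix with a leading column of ones for the intercept.
A = np.column_stack([np.ones(len(y)), X])

# Minimum-norm least-squares solution. `rank` exposes the collinearity:
# it is smaller than the number of columns of A.
coef, _, rank, _ = np.linalg.lstsq(A, y, rcond=1e-10)

# Fitted values and coefficient of determination.
y_hat = A @ coef
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print("coefficients (b0, b1, b2, b3):", coef)
print("rank of design matrix:", rank)
print("R^2:", r2)
```

Because the fit is exact, R^2 comes out as 1 up to rounding, which here reflects the degenerate data rather than a genuinely well-supported model.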
3. Three experimenters determine the moisture content of samples of body lotion. For
this purpose, each experimenter has taken 4 consignments. The results are given
below: (5) (CO2)
Experimenter    Consignment
                I     II    III   IV
E1              9     10    9     11
E2              10    11    9     10
E3              10    12    10    11
Test whether there is any significant difference among consignments and among
experimenters at 5% significance level.
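One way to sketch the computation for question 3 is a two-way ANOVA without replication (one observation per cell), assuming NumPy and SciPy are available. Rows are experimenters and columns are consignments; the variable names are illustrative only.

```python
import numpy as np
from scipy import stats

# Moisture content: rows = experimenters E1..E3, columns = consignments I..IV.
data = np.array([[9, 10, 9, 11],
                 [10, 11, 9, 10],
                 [10, 12, 10, 11]], dtype=float)
r, c = data.shape
grand = data.mean()

# Sums of squares for the two-way layout with one observation per cell.
ss_total = ((data - grand) ** 2).sum()
ss_rows = c * ((data.mean(axis=1) - grand) ** 2).sum()  # experimenters
ss_cols = r * ((data.mean(axis=0) - grand) ** 2).sum()  # consignments
ss_err = ss_total - ss_rows - ss_cols

# Degrees of freedom and mean squares.
df_rows, df_cols = r - 1, c - 1
df_err = df_rows * df_cols
ms_err = ss_err / df_err

# F ratios and 5% critical values.
f_rows = (ss_rows / df_rows) / ms_err
f_cols = (ss_cols / df_cols) / ms_err
crit_rows = stats.f.ppf(0.95, df_rows, df_err)
crit_cols = stats.f.ppf(0.95, df_cols, df_err)

print(f"Experimenters: F = {f_rows:.3f}, critical = {crit_rows:.3f}")
print(f"Consignments:  F = {f_cols:.3f}, critical = {crit_cols:.3f}")
```

A factor is judged significant at the 5% level when its F ratio exceeds the corresponding critical value.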
4. Consider three different teaching methods (A, B and C) and their corresponding test
scores for a sample of students:
Method A: [75,82,78,88,92]
Method B: [68,75,80,85,88]
Method C: [72,78,84,90,94]
Perform a one-way ANOVA to test whether there are any significant differences in
the mean test scores among the three teaching methods. Provide the ANOVA table
and state your conclusion. (5) (CO2)
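The ANOVA table for question 4 can be sketched as follows, assuming NumPy and SciPy are available. The sums of squares are computed by hand and the F statistic is cross-checked against `scipy.stats.f_oneway`.

```python
import numpy as np
from scipy import stats

# Test scores from question 4.
groups = {
    "A": np.array([75, 82, 78, 88, 92], dtype=float),
    "B": np.array([68, 75, 80, 85, 88], dtype=float),
    "C": np.array([72, 78, 84, 90, 94], dtype=float),
}
all_scores = np.concatenate(list(groups.values()))
grand = all_scores.mean()

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups.values())
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups.values())

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_manual = ms_between / ms_within

# Cross-check with SciPy's one-way ANOVA.
f_stat, p_value = stats.f_oneway(*groups.values())

print(f"Between: SS = {ss_between:.3f}, df = {df_between}, MS = {ms_between:.3f}")
print(f"Within:  SS = {ss_within:.3f}, df = {df_within}, MS = {ms_within:.3f}")
print(f"F = {f_stat:.4f}, p = {p_value:.4f}")
```

If the p-value exceeds 0.05, the null hypothesis of equal mean scores across the three methods is not rejected at the 5% level.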
5. Conduct a two-way ANOVA for a study examining the effects of two factors,
temperature (levels: low, medium, high) and time (levels: 1 hour, 2 hours, 3 hours),
on the growth of plants. The data is as follows:
Perform the ANOVA and interpret the results, including any interaction effect
between temperature and time. (5) (CO2)
6. Explain how model complexity is related to the number of variables in the
model. (3) (CO2)
7. What are the various computational techniques for variable selection? (3) (CO2)
8. Define residual analysis and describe how it is used for model diagnostics. Provide
examples of situations where residuals might indicate problems with the model.
(3) (CO2)
9. Suppose you have a logistic regression model with the following coefficients: b0 =
1.5, b1 = -0.2, and b2 = 0.03. If the feature values for a specific data point are X1 = 8
and X2 = 12, calculate the probability of the event Y = 1 using the logistic regression
equation
P(Y = 1) = \frac{1}{1 + e^{-(b_0 + b_1 X_1 + b_2 X_2)}}
(3) (CO2)
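The arithmetic for question 9 can be checked with a few lines of Python using only the standard library. The variable names mirror the symbols in the question.

```python
import math

# Coefficients and feature values from question 9.
b0, b1, b2 = 1.5, -0.2, 0.03
x1, x2 = 8, 12

# Linear predictor: 1.5 - 1.6 + 0.36 = 0.26.
z = b0 + b1 * x1 + b2 * x2

# Logistic (sigmoid) transform gives P(Y = 1).
p = 1.0 / (1.0 + math.exp(-z))

print(f"z = {z:.2f}, P(Y = 1) = {p:.4f}")
```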
10. Given a Poisson regression model with the equation log(µ) = -1 + 0.2X1 - 0.1X2,
where X1 represents the number of advertisements and X2 represents the time spent on
a website, calculate the predicted mean number of customer purchases for a scenario
where X1 = 8 and X2 = 20. (3) (CO2)
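For question 10, the predicted mean follows from exponentiating the linear predictor, since the model uses a log link. A short check in Python using only the standard library:

```python
import math

# Feature values from question 10.
x1, x2 = 8, 20

# Linear predictor on the log scale: -1 + 1.6 - 2.0 = -1.4.
log_mu = -1 + 0.2 * x1 - 0.1 * x2

# Invert the log link to get the predicted mean count.
mu = math.exp(log_mu)

print(f"log(mu) = {log_mu:.2f}, mu = {mu:.4f}")
```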