Exercise 2.3 Logistic Regression
1. Run the R script passexam.R on (a) the passexam.csv dataset and then on (b) the passexam2.csv
dataset. What is the cause of the error? Explain.
d. Using set.seed(2) with a 70-30 train-test split, and keeping only statistically significant
variables, show the trainset confusion matrix and the testset confusion matrix.
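The split-and-tabulate workflow in part d can be sketched as follows. This is a minimal illustration on simulated data, not the course solution: the columns Hours, Noise and Outcome are hypothetical stand-ins for whatever passexam.csv contains, the split uses base R row sampling (the course script may use a different helper), and the 0.5 classification threshold is an assumption.

```r
# Simulated stand-in for the exam data; all column names are hypothetical.
set.seed(1)
n <- 200
dat <- data.frame(
  Hours = round(runif(n, 0, 10), 1),
  Noise = rnorm(n)                       # unrelated to the outcome by construction
)
dat$Outcome <- factor(
  ifelse(runif(n) < plogis(-3 + 0.6 * dat$Hours), "Pass", "Fail"),
  levels = c("Fail", "Pass")
)

# 70-30 split via base R row sampling (an assumption; other split helpers exist).
set.seed(2)
train_idx <- sample(seq_len(n), size = round(0.7 * n))
trainset <- dat[train_idx, ]
testset  <- dat[-train_idx, ]

# Fit on the trainset only, here with the single predictor assumed to survive screening.
m <- glm(Outcome ~ Hours, family = binomial, data = trainset)

# Classify with a probability threshold, keeping both factor levels so tables stay 2x2.
classify <- function(model, data, threshold = 0.5) {
  prob <- predict(model, newdata = data, type = "response")  # P(Outcome = "Pass")
  factor(ifelse(prob > threshold, "Pass", "Fail"), levels = c("Fail", "Pass"))
}

train_cm <- table(Actual = trainset$Outcome, Predicted = classify(m, trainset))
test_cm  <- table(Actual = testset$Outcome,  Predicted = classify(m, testset))

print(train_cm)
print(test_cm)
```

Comparing the two tables shows whether accuracy on unseen cases is close to accuracy on the cases used for fitting.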
Part B: Questions for Research Paper [Freitas et al. (2012)] Reading:
2. What is the difference between adjusted Odds Ratio and unadjusted Odds Ratio?
3. How did Freitas et al. (2012) identify high risk factors? Hint: See Table 1.
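As background for question 2, an unadjusted odds ratio comes from a model with a single predictor, while an adjusted odds ratio comes from a model that also includes other covariates. The sketch below illustrates the mechanics on simulated data only; the variables exposure, age and outcome are hypothetical and have no connection to the variables or results in the paper.

```r
# Simulated data: age influences both the exposure and the outcome (hypothetical setup).
set.seed(1)
n <- 2000
age      <- rnorm(n, mean = 60, sd = 10)
exposure <- rbinom(n, 1, plogis((age - 60) / 10))
outcome  <- rbinom(n, 1, plogis(-2 + 0.5 * exposure + 0.05 * (age - 60)))
d <- data.frame(outcome, exposure, age)

# Unadjusted: exposure is the only predictor in the model.
m_unadj <- glm(outcome ~ exposure, family = binomial, data = d)

# Adjusted: the exposure coefficient is estimated with age held fixed.
m_adj <- glm(outcome ~ exposure + age, family = binomial, data = d)

# Odds ratios are the exponentiated logistic regression coefficients.
or_unadj <- exp(coef(m_unadj)["exposure"])
or_adj   <- exp(coef(m_adj)["exposure"])
print(c(unadjusted = unname(or_unadj), adjusted = unname(or_adj)))

# Wald 95% confidence interval for the adjusted odds ratio.
print(exp(confint.default(m_adj)["exposure", ]))
```

The two estimates differ whenever the added covariate is related to both the predictor of interest and the outcome, which is why papers often report both.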
1. Set Service Rating = Neutral as the baseline reference level for Rating in the rating.csv dataset.
2. Develop a logistic regression model to explain Rating using the multinom() function from the R package
nnet. Which variables are statistically significant? [Note: The glm() function cannot be used here.]
3. What is the model-predicted service rating for each case in the dataset?
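The three steps above can be sketched on simulated data as follows. This is an illustration under stated assumptions rather than the exercise solution: the predictor Wait and the rating levels Good and Poor are hypothetical, since the actual columns of rating.csv are not shown here. The Wald z-test is one common way to obtain p-values, because summary() of a multinom() fit reports coefficients and standard errors but no p-values.

```r
library(nnet)  # provides multinom()

# Simulated three-level rating driven by one hypothetical predictor, Wait (minutes).
set.seed(1)
n <- 300
wait <- runif(n, 0, 60)
p <- cbind(Neutral = 1,
           Good = exp(1.5 - 0.06 * wait),
           Poor = exp(-1.5 + 0.06 * wait))
p <- p / rowSums(p)                      # convert to row-wise class probabilities
rating <- apply(p, 1, function(pr) sample(colnames(p), 1, prob = pr))
d <- data.frame(Rating = factor(rating, levels = c("Good", "Neutral", "Poor")),
                Wait = wait)

# Step 1: make Neutral the baseline reference level.
d$Rating <- relevel(d$Rating, ref = "Neutral")

# Step 2: fit the multinomial logistic regression (trace = FALSE silences the fitting log).
m <- multinom(Rating ~ Wait, data = d, trace = FALSE)
s <- summary(m)

# Two-sided Wald tests: z = coefficient / standard error, one row per non-baseline level.
z <- s$coefficients / s$standard.errors
pval <- 2 * (1 - pnorm(abs(z)))
print(round(pval, 4))

# Step 3: predicted rating for every case, plus the class probabilities behind it.
d$Predicted <- predict(m, newdata = d, type = "class")
print(head(predict(m, newdata = d, type = "probs")))
print(table(Actual = d$Rating, Predicted = d$Predicted))
```

Each row of the coefficient and p-value matrices compares one non-baseline rating against Neutral, so a variable may be significant for one comparison and not the other.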