PREDICTING STUDENT SUCCESS: A LOGISTIC REGRESSION ANALYSIS OF DATA FROM MULTIPLE SIU-C COURSES
by
Patrick B. Soule
A Research Paper
Submitted in Partial Fulfillment of the Requirements for the
Master of Science
Department of Mathematics
in the Graduate School
Southern Illinois University Carbondale
December, 2017
RESEARCH PAPER APPROVAL
By
Patrick B. Soule
Master of Science
Approved by:
Dr. M. Wright
Dr. R. Habib
Graduate School
Southern Illinois University Carbondale
July 18, 2017
AN ABSTRACT OF THE RESEARCH PAPER OF Patrick B. Soule, for the Master of Science degree in Mathematics, presented at Southern Illinois University Carbondale.
The objective of this report is to improve prediction of the future performance of students in select university courses through the use of multiple logistic regression models. This is achieved with the aid of statistical computing software, applying forward stepwise variable selection to identify influential variables sufficient to accurately predict student success. Once a logit model is constructed with the required parameters and predictors, the inverse logit function outputs a probability of student success. In all cases, the logistic prediction models matched or exceeded the performance of the current prediction methods while using an equal or smaller number of explanatory variables. These findings show that the current prediction methods can be improved by using a statistically justified procedure. They also suggest that some predictors currently used to estimate student performance are ineffective.
TABLE OF CONTENTS

Abstract
Chapter 1. Introduction
Chapter 2. The Data
Chapter 3. The Model
Chapter 4. Results
References
Vita
CHAPTER 1
INTRODUCTION
Southern Illinois University Carbondale has recently begun using an early warning
intervention program (EWIP). This program aims to detect struggling students in general
education core courses early in the semester. Once a student is classified as struggling, or at risk of not earning a grade of C or higher, the university initiates intervention in one of several ways. The student may be contacted by the instructor, an academic advisor, or a residence hall academic peer associate who, after consultation, directs them to appropriate additional resources. This research paper investigates how to detect such students; it does not address the subsequent intervention.
Each course participating in EWIP has its own predictive functions which are employed
early in the semester, usually at the end of week three. The output of each function is a score between zero and one hundred percent, which is then classified into one of four color codes. The precise cutoffs are tailored to each course but signify the same warnings: the coding scheme labels students from 'least at risk' to 'most at risk' by assigning the colors green, yellow, orange, and red respectively. The goal of the university is to maximize the number of green-coded students. Concern arises when students are misclassified, as is possible with any estimating function. The current models that determine academic risk are working well; however, they were created ad hoc and were probably not generated statistically. They also do not take the final course grade into consideration.
The current EWIP functions are linear in their predictors with weighted coefficients
derived from specific knowledge of each course. It is not known if the individual predictors
being used are best or if the coefficients are optimal. Using data to determine the weights
of influential predictors provides a way to maximize the chance of correct predictions and
therefore minimize misclassifications.
Optimization means that the university's limited available resources can be efficiently applied to the correct population with the greatest need. Successful interventions lead to students passing the course or avoiding a D, F, or WF grade, and therefore to fewer such grades overall. These considerations lend themselves to partitioning the response into two categories rather than four. Whereas it is possible to assign color codes, particularly in an ordinal regression model, the end goal is to determine whether the student requires intervention. It is with this end goal in mind that we treat student success as a binary response in what follows.
CHAPTER 2
THE DATA
CLEANING
In this report the records of four Southern Illinois University Carbondale courses from the fall 2015 semester are analyzed: BIO200A, MATH101, MATH106, and MATH108. We combine all sections of a course for one semester and treat them as one sample. Records are collected in the third and eighth weeks.¹ The early week class sizes rarely equal the size of the samples at week eight: invariably, students drop, withdraw, or transfer out of the course, leaving later records incomplete; likewise, eighth week records occasionally contain students who transfer in after week three. In these events we use case-wise deletion.
We use the terms variable, covariate, predictor, response variable, and explanatory or independent variable as appropriate throughout. The BIO200A records are covered in detail; the other courses follow similarly, with figures to summarize results.
DESCRIPTIVE STATISTICS
BIO200A has records for students regarding Pretest, Homework, Quiz, and Test scores as well as Attendance. Week three scores for Quiz are unfortunately incomplete until week eight.² Each of these explanatory variables is converted to a percent so they can be directly compared. The descriptive statistics for the potential predictors act as an additional check for errors incurred during data entry by illuminating any extreme outliers. Histograms, box-and-whisker plots, and scatterplots help to show distributions and trends. From Figure 2.1 it is clearly observable that not all covariates are normally distributed.
¹ MATH101 collects in the fifth and eighth weeks due to course structure.
² Not all sections of BIO200A have a Quiz score recorded by week three.
Figure 2.1. Box-and-whisker plots of the covariates (0.0 to 1.0 scale).
Homework and Attendance are strongly skewed left. However, since logistic regression is used as opposed to linear regression, these variables don't require a transformation.
CORRELATIONS
Figure 2.2 presents the covariates in a scatterplot matrix. Displaying the data in this way helps to visualize how the explanatory variables relate to one another. In the lower triangular portion of the matrix the scatterplots have overlaid trend lines that attempt to describe the relationships. Note the effect caused by zeros in the data. For example, between Pretest and Test the trend line is strongly affected, creating the perception of a quadratic effect. This is considered an artifact of the smoothing technique; if the relationship were true it would imply that, past a point, higher pretest scores lead to lower test scores.
Figure 2.2. Scatterplot matrix of Pretest, Homework, Attendance, and Test: density plots on the diagonal, scatterplots with trend lines in the lower triangle, and Pearson correlation coefficients in the upper triangle (Pretest-Homework 0.24***, Pretest-Attendance 0.13*, Pretest-Test 0.53***, Homework-Attendance 0.23***, Homework-Test 0.40***, Attendance-Test 0.32***).
Pearson’s R
Bivariate Pearson correlation coefficients are reported in the upper triangular portion of the matrix in Figure 2.2. Pearson's r is a measure of the linear relation between two variables and is determined by calculating
Σ_{i=1}^{n} (x1i − x̄1)(x2i − x̄2) / [ Σ_{i=1}^{n} (x1i − x̄1)² · Σ_{i=1}^{n} (x2i − x̄2)² ]^{1/2},    (2.1)
where x1 and x2 are distinct covariates and x̄1, x̄2 are their arithmetic means. Pearson's r is bounded between −1 and 1. A value of ±1 indicates a perfect positive or negative linear relationship, whereas 0 signifies no linear relationship between the variables.
Test and Pretest exhibit a moderate correlation with r = 0.53. Using Fisher's z transformation, we find the 95% confidence interval about r to be (0.425, 0.612). The large sample size of 234 helps to keep the range relatively small. A Student's t test is also available, calculated as

t = r√(n − 2) / √(1 − r²),    (2.2)

resulting in a p-value less than 0.0001. It should be noted that using both methods is redundant, since the above confidence interval does not contain zero.
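As a sketch of how these quantities might be computed in R (the software environment used throughout; see [9]), assuming a hypothetical data frame bio holding the BIO200A records:

    # Pearson's r between Pretest and Test, with the Fisher-z confidence
    # interval and the t statistic of (2.2); `bio` is a hypothetical data frame.
    ct <- cor.test(bio$Pretest, bio$Test, method = "pearson")
    ct$estimate   # r, about 0.53
    ct$conf.int   # 95% CI computed via Fisher's z transformation
    ct$statistic  # t = r * sqrt(n - 2) / sqrt(1 - r^2)
    ct$p.value    # p < 0.0001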
Recall that not all covariates are normally distributed; in fact, any other bivariate combination will require nonparametric methods. Spearman's rho and Kendall's tau are rank-based measures of association suited to this situation. We report both Spearman and Kendall coefficients together in Table 2.1. The Spearman rho values appear in the upper triangular portion of the matrix and Kendall's tau in the lower triangular portion.
Table 2.1. Matrix of Nonparametric Correlation Coefficients

Covariates    Pretest    Homework    Attendance    Test
Pretest       -          0.28***     0.05          0.62***
Homework      0.20***    -           0.21**        0.46***
Attendance    0.04       0.17**      -             0.25***
Test          0.46***    0.33***     0.20***       -

Asterisks denote level of statistical significance: *, **, and *** correspond to p < 0.05, p < 0.01, and p < 0.001 respectively.
Most values are low except when measuring Test versus Pretest and Test versus Homework. Spearman's rho suggests a slightly increased correlation for each; this could be due to Spearman's rank being a less robust statistic [3]. At any rate, they are still within the same general range as the parametric coefficients.
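A matrix like Table 2.1 can be produced directly in R; a minimal sketch, again assuming the hypothetical bio data frame:

    # Rank-based correlation matrices; cor.test() gives the per-pair p-values
    # that determine the significance stars.
    covs <- bio[, c("Pretest", "Homework", "Attendance", "Test")]
    round(cor(covs, method = "spearman"), 2)  # Spearman's rho
    round(cor(covs, method = "kendall"), 2)   # Kendall's tau
    cor.test(covs$Pretest, covs$Test, method = "spearman")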
MULTICOLLINEARITY
To build an informative and predictive logistic regression model, the explanatory variables must be independent of each other. Measuring correlations is one way to evaluate how much one variable explains another. If two variables essentially behave the same, including both for modeling purposes introduces redundancy as well as increases the size of parameter errors. Unfortunately, even if pairwise comparisons show low correlation coefficients, there is still the possibility of collinearity. If this occurs, it becomes difficult to separate the individual effects of the variables involved. One measure of collinearity involves regressing each explanatory variable on all the other covariates. If the coefficient of determination (R²) is high, multicollinearity may be present. Of course, 'high' is a relative term; some [4] argue that values above 0.8 are concerning. The coefficient of determination is the portion of variation in the regressed variable that is explained by the other covariates and is calculated as follows:
R² = Σ_{i=1}^{n} (ŷi − ȳ)² / Σ_{i=1}^{n} (yi − ȳ)²,    (2.3)
where ŷi is the fitted value that the model predicts. This is a ratio of variations. The numerator of (2.3) is the regression sum of squares, the squared difference between the model's predictions and the overall mean; the denominator represents the total variation about the mean. As the variation explained by the regression increases, R² approaches 1. For our purposes we want R² to be low; otherwise the other covariates are explaining much of what is assumed to be an independent variable. In Table 2.2 we include two other measures that are related to the coefficient of determination: Tolerance = 1 − R², and Variance Inflation Factor (VIF) = 1/(1 − R²). Following the rule of thumb of R² < 0.8, this corresponds to Tolerance > 0.2 and VIF < 5.
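These diagnostics are straightforward to compute; a sketch in R with the same hypothetical covs data frame (the car package of [6] offers vif() as a shortcut from a single fitted model):

    # Regress each covariate on the others; report R^2, tolerance, and VIF.
    r2 <- sapply(names(covs), function(v) {
      others <- setdiff(names(covs), v)
      summary(lm(reformulate(others, response = v), data = covs))$r.squared
    })
    data.frame(R2 = r2, Tolerance = 1 - r2, VIF = 1 / (1 - r2))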
VIF is interpreted [4] as the percentage of inflation in the variance of the coefficient due to collinearity. For Test, VIF = 1.647, which suggests that its variance is inflated by 65%. This may sound worse than it is: consider the R² value of 0.393, which indicates that almost 40% of the variability of Test is explained by the other three covariates in the linear regression model. This is not enough to cause alarm. However, these results, as well as those from the bivariate correlation coefficients, suggest that Pretest and Test are likely the most strongly related covariates.
CHAPTER 3
THE MODEL
BINOMIAL DISTRIBUTION
To make accurate statistical inferences we need to make certain assumptions about the distribution of the binary response variable. If we assume that each student is independent¹, that is, one student's success does not affect another's, then we can assign a probability distribution to our response variable. In terms of our data on BIO200A, there were 234 students and 165 of those earned a grade of C or above. We let this proportion, 0.705, be a point estimate for the probability of success in this class. Assume next semester the course enrolls roughly 250 students; then we can estimate the mean of the binomial distribution² as µ = πn ≈ 176. This gives an estimate for the number of students who will succeed, but which ones? To address this we consider the use of the given covariates to construct a model.
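A quick sketch of this estimate in R, with the enrollment of 250 as an assumed figure:

    # Point estimate and expected number of successes.
    pi_hat <- 165 / 234    # observed proportion earning C or better, ~0.705
    n_next <- 250          # assumed enrollment next semester
    pi_hat * n_next        # expected successes, ~176
    # Probability mass over plausible counts of successful students:
    plot(140:215, dbinom(140:215, size = n_next, prob = pi_hat), type = "h",
         xlab = "Number of Students", ylab = "Probability")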
LOGISTIC REGRESSION
Whether or not a student succeeds is the binary response variable we are trying to
predict. Mathematically, we let Y represent the response as
Y = 1 if the student earns a grade of A, B or C, and Y = 0 otherwise.
¹ ...and identically distributed (a strong assumption for a student body)...
² If π is left fixed, as n becomes large the binomial distribution approximates the normal curve.
Figure 3.1. Binomial probability distribution of the number of successful students.
The simple logistic regression model relates the probability of success to an explanatory variable x through the logit link:

logit[P(Y = 1)] = log[ π(x) / (1 − π(x)) ] = α + βx,    (3.1)
where π(x) is equivalent to the probability of success, P(Y=1). Note that the right hand
side (RHS) of (3.1) is linear in x and the LHS is a logarithmic scale of the odds of success.
Although odds are commonly used and interpreted, the natural log of the odds is not. With exponentiation and algebraic manipulation we rewrite (3.1) as
π(x) = exp(α + βx) / (1 + exp(α + βx))    (3.2)
to obtain a form that outputs only values between 0 and 1, which are then interpreted as probabilities. Equations (3.1) and (3.2) are simple in the sense that they use only one
explanatory variable x. Using the method of maximum likelihood estimation we will build
several different models of increasing complexity and choose one based on best fit criterion.
Figure 3.2. Logistic Regression Curve
Model Interpretation
Consider, for illustration, the model logit[π(x)] = −5 + 0.1x (the intercept −5 follows from requiring π(50) = 0.5). The coefficient of 0.1 is sometimes referred to as the weight of the predictor. Avoiding log interpretations, e^{0.1} ≈ 1.11 translates to an 11% increase in the odds of success given a one unit increase in x. We plot the function in Figure 3.2 and mark the point π(50) = 0.5. Note
the nonlinear S-shaped curve that is monotonically increasing, which is typical in practice [1]. The slope of the curve is steepest at this point, meaning that the predicted outcome is most volatile there: a fifty percent chance of success is also a fifty percent chance of failure.
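A sketch of this illustrative model in R (plogis is the inverse logit of (3.2)):

    # Illustrative model logit[pi(x)] = -5 + 0.1 x.
    alpha <- -5; beta <- 0.1
    pi_x <- function(x) plogis(alpha + beta * x)  # inverse logit
    pi_x(50)    # 0.5, the steepest point of the S-curve
    exp(beta)   # ~1.11, the multiplicative change in odds per unit of x
    curve(pi_x(x), from = 0, to = 100,
          xlab = "Predictor", ylab = "Probability of Success")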
The likelihood equations are solved iteratively: the algorithm begins with an initial guess, followed by successive iterates that rapidly converge to the maximum likelihood estimates [1]. This process determines the MLEs for the weights in the model, but we need a way to determine what predictors should be in the model. We
want a model that sufficiently describes the real-life phenomenon. Any model is a simplification of reality, but we must balance between too simple and overly complex. Simple models
are easier to interpret, but a more complex model may fit the data better. To assess this,
we require that a model have statistical significance as well as conform to several goodness
of fit tests.
We now build models with a single covariate first and check which produces the best
fit. The covariate that provides the best fit to the data is selected and used in the next
step. We then increase the complexity of the selected model by choosing one more predictor
from the remaining and comparing fit. This process continues until increasingly complex
models fail to produce a better fit. This is known as forward variable selection.
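R's step() function automates the AIC-guided portion of this procedure; a minimal sketch, assuming the hypothetical bio data frame with a binary success column:

    # Forward selection from the intercept-only model, guided by AIC.
    null_fit <- glm(success ~ 1, family = binomial, data = bio)
    step(null_fit, scope = ~ Pretest + Homework + Attendance + Test,
         direction = "forward")

The remaining criteria described below (deviance, concordance, pseudo R², and parameter significance) would still be checked by hand at each step.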
There are five measures of fitness that we will consider at each step: Akaike Information Criterion (AIC), deviance, concordance, McFadden's pseudo R², and the significance of model parameters. AIC provides a measure of the information that a model provides; the calculation involves a penalty term which acts to preserve parsimony of parameters. We want to minimize

AIC = −2(log likelihood) + 2k,    (3.3)

where k is the number of parameters in the model. A similar statistic is deviance. Whereas
AIC is measuring the tested model against the theoretically true model, deviance measures
against the most complex model possible, a saturated model with an individual parameter
for each observation. If we let LT be the log likelihood of the tested model and LS be the
log likelihood of the saturated model, deviance is calculated as

deviance = −2(LT − LS).    (3.4)

The deviance likelihood ratio statistic tests the hypothesis that all parameters not used
in the tested model are zero. Deviance follows an approximately chi-squared distribution
where the degrees of freedom are the number of observations minus the number of parame-
ters used. Models can also be compared using the deviance statistic. If the parameters in a
model are a subset of those in a more complex model, the difference in their deviances can
be calculated and interpreted as a chi-square test for the hypothesis that the more complex
model provides a better fit. A sufficiently high statistic would be evidence to suggest that
the complex model is necessary.
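In R this nested comparison is a single call; a sketch with two hypothetical candidate models:

    # Chi-square test on the deviance difference of nested models.
    m1 <- glm(success ~ Test, family = binomial, data = bio)
    m2 <- glm(success ~ Test + Homework, family = binomial, data = bio)
    anova(m1, m2, test = "Chisq")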
Our third statistic measures predictive power. A receiver operating characteristic (ROC) curve plots sensitivity against one minus specificity across all possible classification cutoffs, where sensitivity is the proportion of observed successes the model correctly predicts and specificity is the proportion correctly predicted among those who failed.³ Figure 3.3 displays the ROC curve for the model with 'Test' as the only explanatory variable.
The diagonal line represents the intercept model which is equivalent to guessing, essen-
tially flipping a coin to decide. The desired shape is a high parabolic arch filling the upper
left portion of the plot. Total area under the curve is equivalent to concordance. Pairing
up each observed success with each observed failure, concordance checks that the model assigns a higher probability to the success than to the failure. A value of c = 0.5 is equivalent to guessing.⁴
³ Failure here is defined as not succeeding in earning a grade of C or higher.
⁴ It is possible to build a model that predicts worse than c = 0.5.
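Concordance can be computed directly from this pairing definition; a sketch in R using the m1 model from the earlier sketch:

    # c-statistic: compare every observed success with every observed failure,
    # counting ties as one half.
    p <- fitted(m1)                 # fitted success probabilities
    succ <- p[bio$success == 1]
    fail <- p[bio$success == 0]
    mean(outer(succ, fail, ">") + 0.5 * outer(succ, fail, "=="))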
Figure 3.3. ROC curve (sensitivity vs. 1 − specificity) for the 'Test'-only model, with the diagonal guessing line for reference.
Our fourth measure, McFadden's pseudo R², compares the log likelihood of the fitted model with that of the intercept-only model and falls between 0 and 1. The general impression is that a higher value is more desirable, with 'good' fitting models in the 0.2 to 0.4 range [7].
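A sketch of the calculation in R:

    # McFadden's pseudo R^2: one minus the ratio of the model's log likelihood
    # to that of the intercept-only model.
    m0 <- glm(success ~ 1, family = binomial, data = bio)
    1 - as.numeric(logLik(m1)) / as.numeric(logLik(m0))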
Lastly, one can always test the statistical significance of the individual parameters
used in the model. Along with the MLE for the parameters are standard errors (SE)
which can be used to calculate confidence intervals about the estimates as well as test their
significance, similar to what we saw in Section 2. The null hypothesis for logistic regression
parameter significance states that the response is independent of the explanatory variable.
To test this we calculate
z² = (β̂ / SEβ̂)²    (3.5)
which has a chi-squared null distribution. Large values provide evidence that the predictor
is affecting the response.
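In R these Wald statistics appear in the standard model summary:

    # Estimates, standard errors, z values (estimate/SE), and p-values;
    # squaring a z value gives the chi-squared statistic of (3.5).
    summary(m1)$coefficients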
With these measures of model fit as a guide we proceed through a stepwise forward
variable selection. We exhibit an instance of the process for week three BIO200A.
An Example
Statistical software calculates the estimates and measures of fit for BIO200A for week three, which we summarize in Table 3.1. We omit the intercepts from one-predictor models for the sake of clarity. The 'Test' predictor shows the best fit across multiple measures, so 'Test' is added to the logistic regression model and we proceed to the next step in variable selection. Table 4.5 shows all models considered. Adding 'Homework' substantially improves model fit compared to 'Pretest' and 'Attendance'. In the third step we see that 'Pretest' should be chosen over 'Attendance'; however, neither variable is statistically significant. This could be a sign of multicollinearity. Notice that in the previous step those predictors were also found to be not statistically significant; perhaps the 'Test' predictor already explains the contribution they could provide. With this and the principle of parsimony in mind, we choose 'Test' and 'Homework' as the predictors in the final week three model.
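The chosen model is then a single glm() fit; a sketch:

    # Final week-three BIO200A model.
    final_fit <- glm(success ~ Test + Homework, family = binomial, data = bio)
    summary(final_fit)                     # weights and Wald tests
    predict(final_fit, type = "response") # inverse-logit success probabilities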
CHAPTER 4
RESULTS
MODEL CHOICE
Tables 4.5 through 4.12 contain model candidates for each course and are subdivided by week. Unsurprisingly, 'Test' is by far the most influential predictor in all instances. Models with just 'Test' as a predictor fit well, but adding either the 'Homework' or 'Quiz' predictor, when applicable, always increases the fit. Any models that have three terms comprise only the aforementioned predictors. 'Pretest' or 'Attendance' models only outperform an intercept model. As other terms are added, the model fit marginally improves along with an undesired increase in parameter errors, which leads to insignificant predictors. We conclude that 'Pretest' and 'Attendance' have little predictive power compared to the other covariates.
The final model choice for BIO200A at week three includes 'Test' and 'Homework', whereas the week eight model differs by the addition of 'Quiz'. This suggests that if Quiz data were available in week three they could potentially improve model fit. The MATH101 model uses all three predictors, whereas MATH106 and MATH108 use 'Test' and 'Homework' only. In Table 4.1 we present the final models complete with parameter estimates; predictors are abbreviated to their first letter. The weight of each coefficient alone is not telling; rather, the weight relative to the others is what lends insight. Notice for example that 'Test' in week three for BIO200A is weighted over twice as much as 'Homework'. As the semester progresses, 'Test' in week eight is weighted over three times as much as 'Homework'.

In Figure 4.1 we illustrate the influence of each predictor by plotting the logistic regression curve for the final models of week three and week eight. Each curve holds the other covariates fixed at their medians. We see that in week eight the probability of success is very volatile inside the 40 to 50 percent range. Students earning above 50% as their average test percent have a very high probability of earning a C grade or higher, assuming they have
Figure 4.1. Probability-of-success curves for the final BIO200A models at week three and week eight, with the other covariates held at their medians.
median homework and quiz scores. 'Homework', on the other hand, has an almost linear relationship with success probability early in the semester, whereas the mid-semester curve rapidly approaches a success probability of 1. This is due in part to the fixing of 'Quiz' and 'Test' at their median values of 0.76 and 0.66 respectively.
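Such curves come from predicting over a grid; a sketch for a week eight style model, with assumed column names:

    # Success probability as Test varies, holding Homework and Quiz at medians.
    fit8 <- glm(success ~ Test + Homework + Quiz, family = binomial, data = bio)
    grid <- data.frame(Test = seq(0, 1, 0.01),
                       Homework = median(bio$Homework),
                       Quiz = median(bio$Quiz))
    plot(grid$Test, predict(fit8, newdata = grid, type = "response"),
         type = "l", xlab = "Test", ylab = "Probability of Success")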
COMPARISON
Having selected the 'best' fitting model for each course, we measure how well each performs compared to the current predicting functions. Table 4.4 displays the current functions used for each course. Current functions predict a final course score out of one hundred. We
Table 4.2. Classification results at week three. Each 2x2 block cross-tabulates the Success/Failure classifications against observed outcomes; the first entry of the Success row and the second entry of the Failure row are correct classifications.

Course     Model      Success      Failure      Correct
BIO200A    Linear     139, 15      26, 54       82.48%
BIO200A    Logistic   146, 16      19, 53       85.04%
MATH101    Logistic   371, 76      115, 224     75.60%
MATH106    Linear     60, 32       5, 29        70.63%
MATH106    Logistic   74, 18       11, 23       76.98%
MATH108    Linear     236, 95      23, 132      75.72%
MATH108    Logistic   248, 83      29, 126      76.95%
Table 4.3. Classification results at week eight, in the same layout as Table 4.2.

Course     Model      Success      Failure      Correct
BIO200A    Linear     144, 3       21, 66       89.74%
BIO200A    Logistic   162, 3       3, 66        97.44%
MATH101    Linear     440, 91      46, 209      82.47%
MATH101    Logistic   390, 48      96, 252      81.58%
MATH106    Linear     74, 18       6, 28        80.95%
MATH106    Logistic   80, 12       6, 28        85.71%
MATH108    Linear     262, 69      15, 140      82.72%
MATH108    Logistic   270, 61      20, 135      83.33%
change this to a binary outcome by consulting the corresponding syllabus, which determines the cutoff point for earning a C or higher. The predicted outcomes are then compared with the actual observed data. There are four potential cases when predicting behavior; correct classification occurs when the observed success or failure matches the prediction. Tables 4.2 and 4.3 detail the results for all courses.
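A sketch of how such a table is tallied in R, with a 0.5 probability cutoff as an assumed classification rule:

    # 2x2 table of predicted vs. observed outcomes and the correct rate.
    pred <- ifelse(fitted(final_fit) >= 0.5, "Success", "Failure")
    obs  <- ifelse(bio$success == 1, "Success", "Failure")
    table(Predicted = pred, Observed = obs)
    mean(pred == obs)    # proportion correctly classified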
Overall, with logistic regression the correct prediction rate improves for early week and mid-semester predictions, except for MATH101. Improvements are more pronounced in mid-semester than in early week for BIO200A and MATH106. In MATH101, the current function and the logistic model use the same predictors but with different weights. Since MATH101 is a modular course composed of four disjoint topics and a cumulative final, trends seen early in the semester may not be indicative of late semester performance. This is a possible explanation for the lack of improvement for the logistic model in week eight.
REFERENCES

[1] Agresti, Alan (2007). An Introduction to Categorical Data Analysis. Wiley, Inc., New Jersey.

[2] Peterson, Brian G. and Carl, Peter (2014). PerformanceAnalytics: Econometric tools for performance and risk analysis. R package.

[5] Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag, New York.

[6] Fox, John and Weisberg, Sanford (2011). An R Companion to Applied Regression, Second Edition. Sage, Thousand Oaks, CA.

[9] R Core Team (2016). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.

[10] Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S, Fourth Edition. Springer, New York.
VITA
Graduate School
Southern Illinois University
Patrick B. Soule