Logistic Regression Using SPSS
Presented by Nasser Hasan - Statistical Supporting Unit, 7/8/2020
[email protected]
Overview
• Brief introduction of Logistic Regression
• Logistic Regression - Examples
• Logistic Regression - Assumptions
• Box-Tidwell Test

Box-Tidwell Test
- Don't worry about a significant Box-Tidwell interaction when the sample size is large; with many cases, even trivial departures from linearity can reach significance.
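The Box-Tidwell term can also be constructed by hand to see what SPSS is testing; a minimal sketch in Python (the variable name and values below are hypothetical):

```python
import math

# Hypothetical continuous predictor (e.g., age); values must be > 0,
# because the natural log is undefined at zero and below.
age = [34.0, 45.0, 52.0, 61.0, 28.0]

# Box-Tidwell term: each continuous IV multiplied by its own natural log.
# This is the interaction you add to the Covariates box in SPSS.
age_ln_age = [x * math.log(x) for x in age]
```

A significant coefficient on this term in the logistic model signals that the IV is not linearly related to the logit of the outcome.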
Logistic Regression Using SPSS
https://miami.box.com/s/cb1tytyzogqe1vs7eu4fdqj7m9ewtwzo
- Move your DV into the DV box, and all of your IVs into the Covariates box.
- Add the interaction term between each continuous IV and its natural log.
- Check the appropriate statistics and plots needed for the analysis (see the Options dialog).
- If none of the interaction terms is significant, the linearity-of-the-logit assumption holds; redo the analysis without the interaction terms.
This table contains the Cox & Snell R² and Nagelkerke R² values, which are both methods of calculating the explained variation. These values are sometimes referred to as pseudo R² values (and are typically lower than R² values in multiple regression). They are interpreted in a similar manner, but with more caution. Here, the explained variation in the dependent variable based on our model ranges from 24.0% to 33.0%, depending on whether you reference the Cox & Snell R² or the Nagelkerke R² method, respectively.
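Both pseudo R² values can be reproduced from the model log-likelihoods. A sketch using hypothetical log-likelihood values chosen so the results match the 24.0% and 33.0% figures above (the actual log-likelihoods appear in the SPSS output):

```python
import math

n = 100            # sample size (assumed from the classification table)
ll_null = -64.744  # log-likelihood of the intercept-only model (hypothetical)
ll_model = -51.043 # log-likelihood of the fitted model (hypothetical);
                   # note -2*(ll_null - ll_model) ~= 27.402, the model chi-square

# Cox & Snell: 1 - (L0/L1)^(2/n), written with log-likelihoods
cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

# Nagelkerke rescales Cox & Snell so its maximum is 1
nagelkerke = cox_snell / (1.0 - math.exp(2.0 * ll_null / n))

print(round(cox_snell, 3))   # ~ 0.240
print(round(nagelkerke, 3))  # ~ 0.330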
The Hosmer-Lemeshow test evaluates the null hypothesis that the model's predictions match the observed group memberships. A chi-square statistic is computed comparing the observed frequencies with those expected under the fitted logistic model. A nonsignificant chi-square indicates that the data fit the model well.
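The Hosmer-Lemeshow statistic is straightforward to compute once cases are binned into groups (usually deciles of predicted risk); a sketch with hypothetical group data:

```python
# Hypothetical decile groups: (group size, observed events, mean predicted probability)
groups = [
    (10, 0, 0.05), (10, 1, 0.09), (10, 1, 0.14), (10, 2, 0.20),
    (10, 3, 0.27), (10, 3, 0.35), (10, 4, 0.44), (10, 5, 0.52),
    (10, 7, 0.66), (10, 9, 0.83),
]

hl = 0.0
for n_g, observed, p_bar in groups:
    expected = n_g * p_bar  # expected events in this group
    hl += (observed - expected) ** 2 / (n_g * p_bar * (1.0 - p_bar))

# Compare hl to a chi-square distribution with (number of groups - 2) df;
# a small statistic (nonsignificant p) indicates adequate fit.
print(round(hl, 3))
```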
Logistic regression estimates the probability of an event (in this case, having heart disease)
occurring. If the estimated probability of the event occurring is greater than or equal to 0.5 (better
than even chance), SPSS Statistics classifies the event as occurring (e.g., heart disease being
present). If the probability is less than 0.5, SPSS Statistics classifies the event as not occurring
(e.g., no heart disease). It is very common to use binomial logistic regression to predict whether
cases can be correctly classified (i.e., predicted) from the independent variables. Therefore, it
becomes necessary to have a method to assess the effectiveness of the predicted classification
against the actual classification.
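The default 0.5 cutoff amounts to a one-line classification rule; a sketch with hypothetical predicted probabilities:

```python
# Hypothetical predicted probabilities produced by the fitted model
probabilities = [0.12, 0.48, 0.50, 0.73, 0.91]

# SPSS's default cutoff: classify the event as occurring when p >= 0.5
predicted = [1 if p >= 0.5 else 0 for p in probabilities]

print(predicted)  # [0, 0, 1, 1, 1]
```

Note that SPSS lets you change this classification cutoff in the Options dialog.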
- With the independent variables added, the model now correctly classifies 71.0% of cases overall (see "Overall Percentage" row) → percentage accuracy in classification.
- 45.7% of participants who had heart disease were also predicted by the model to have heart disease (see the "Percentage Correct" column in the "Yes" row of the observed categories) → sensitivity.
- 84.6% of participants who did not have heart disease were correctly predicted by the model not to have heart disease (see the "Percentage Correct" column in the "No" row of the observed categories) → specificity.
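These percentages can be reproduced from the classification-table cell counts implied by the slides (16 true positives, 19 false negatives, 55 true negatives, 10 false positives):

```python
# Cell counts implied by the classification table on the slides
tp, fn, tn, fp = 16, 19, 55, 10

overall = 100.0 * (tp + tn) / (tp + fn + tn + fp)  # percentage accuracy
sensitivity = 100.0 * tp / (tp + fn)               # correct among observed "Yes"
specificity = 100.0 * tn / (tn + fp)               # correct among observed "No"

print(round(overall, 1), round(sensitivity, 1), round(specificity, 1))
```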
- The positive predictive value is the percentage of correctly predicted cases with the
observed characteristic compared to the total number of cases predicted as having the
characteristic. In our case, this is 100 x (16 ÷ (10 + 16)) which is 61.5%. That is, of all cases
predicted as having heart disease, 61.5% were correctly predicted.
- The negative predictive value is the percentage of correctly predicted cases without the
observed characteristic compared to the total number of cases predicted as not having the
characteristic. In our case, this is 100 x (55 ÷ (55 + 19)) which is 74.3%. That is, of all cases
predicted as not having heart disease, 74.3% were correctly predicted.
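Both predictive values follow directly from the counts quoted above (16, 10, 55, 19):

```python
# Cell counts from the classification table quoted in the text
tp, fp, tn, fn = 16, 10, 55, 19

ppv = 100.0 * tp / (tp + fp)  # of those predicted "Yes", % actually "Yes"
npv = 100.0 * tn / (tn + fn)  # of those predicted "No", % actually "No"

print(round(ppv, 1), round(npv, 1))  # 61.5 74.3
```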
- The Wald test ("Wald" column) is used to determine statistical significance for each of the
independent variables. The statistical significance of the test is found in the "Sig." column.
From these results you can see that age (p = .003), gender (p = .021) and VO2max (p = .039)
added significantly to the model/prediction, but weight (p = .799) did not add significantly to
the model.
- You can use the information in the "Variables in the Equation" table to predict the change in the odds of an event occurring for a one-unit change in an independent variable when all other independent variables are kept constant. For example, the table shows that the odds of having heart disease ("yes" category) are 7.026 times greater for males than for females.
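SPSS reports this odds ratio in the Exp(B) column: it is simply e raised to the coefficient B. A sketch with a hypothetical coefficient chosen so that Exp(B) ≈ 7.026, plus the logistic link that turns a linear predictor into a probability:

```python
import math

b_gender = 1.9496  # hypothetical B for gender, chosen so exp(B) ~ 7.026

odds_ratio = math.exp(b_gender)  # the "Exp(B)" column in SPSS
print(round(odds_ratio, 3))      # ~ 7.026

# The table's coefficients also give each case's predicted probability:
# p = 1 / (1 + exp(-(B0 + B1*x1 + ... + Bk*xk)))
z = -1.2 + b_gender * 1  # hypothetical intercept, male case (gender = 1)
p = 1.0 / (1.0 + math.exp(-z))
```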
- A logistic regression was performed to ascertain the effects of age, weight, gender and VO2max on the likelihood that participants have heart disease. The logistic regression model was statistically significant, χ²(4) = 27.402, p < .0005. The model explained 33.0% (Nagelkerke R²) of the variance in heart disease and correctly classified 71.0% of cases. Males were 7.02 times more likely to exhibit heart disease than females. Increasing age was associated with an increased likelihood of exhibiting heart disease; however, increasing VO2max was associated with a reduction in the likelihood of exhibiting heart disease.
Thanks for Listening and Attending!
Any Questions?
Presented by Nasser Hasan - Statistical Supporting Unit
Can you please give us a minute to fill out this survey? It will help us to evaluate our performance and take your feedback into consideration for future webinars:
https://umiami.qualtrics.com/jfe/form/SV_a9N5Xta6OlybEeV