Logistic Regression 1

The document explains how to interpret the results of a logistic regression analysis using a fictitious dataset that includes variables such as age, gender, and smoking status to predict disease. It details the steps to calculate the regression, including the results table, classification table, and Chi2 test, highlighting the significance of the model based on p-values. The analysis shows that the model correctly classifies 72.22% of individuals and indicates a significant difference between models with and without independent variables.

Uploaded by

Vince Tejada
Copyright
© © All Rights Reserved

But how do you interpret the results of a logistic regression?

Let's take a look at this fictitious example.

Age   Gender   Smoker status   Disease
22    female   Non-smoker      1
25    female   Smoker          1
18    male     Smoker          0
45    male     Non-smoker      0
12    female   Smoker          0
43    male     Smoker          1
23    male     Smoker          0
33    male     Smoker          1
…     …        …               …

If you like, you can download the example dataset for free and follow the steps in parallel. Just use this link, or load it from the logistic regression tutorial.

When you use the link, the data is loaded automatically.

We want to calculate a logistic regression, so we just click on Regression. When we copy our data in here, the variables show up down here.

Depending on how your dependent variable is scaled, DATAtab will calculate either a logistic or a linear regression under the tab Regression.
We choose disease as the dependent variable and age, gender, and smoking status as the independent variables. DATAtab now calculates a logistic regression for us.
If you don't know how to interpret
the results, you can click on

We will now go through all the tables step by step. Let's start at the top.

The first thing that is displayed is the results table. In the results
table you can see that a total of 36 people were examined.

With the help of the calculated regression model, 26 of 36 persons could be correctly assigned. That is 72.22%!

Then comes the classification table.

Here you can see how often the categories "not diseased" and "diseased" were observed and how often they were predicted.
In total, "not diseased" was observed 16 times.

Of these 16 individuals, the regression model correctly scored 11 as not diseased and incorrectly scored 5 as diseased.
Of the 20 diseased individuals, 15 were correctly scored as diseased and 5 were incorrectly scored as not diseased.
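
The arithmetic behind the classification table can be checked with a few lines of Python. This is only a sketch: the counts are taken from the text (with the 5 misclassified diseased individuals counted as predicted "not diseased"), while the dictionary layout and the variable names are assumptions made for illustration. The two per-class rates are usually called sensitivity and specificity, terms the text itself does not use.

```python
# Counts from the classification table, keyed by (observed, predicted).
counts = {
    ("not diseased", "not diseased"): 11,
    ("not diseased", "diseased"): 5,
    ("diseased", "diseased"): 15,
    ("diseased", "not diseased"): 5,
}

total = sum(counts.values())
correct = sum(n for (observed, predicted), n in counts.items() if observed == predicted)
accuracy = correct / total

# Row totals: how often each category was actually observed.
observed_not_diseased = sum(n for (obs, _), n in counts.items() if obs == "not diseased")
observed_diseased = sum(n for (obs, _), n in counts.items() if obs == "diseased")

# Share of each observed category that the model predicted correctly.
specificity = counts[("not diseased", "not diseased")] / observed_not_diseased
sensitivity = counts[("diseased", "diseased")] / observed_diseased

print(f"{correct} of {total} correctly assigned = {accuracy:.2%}")
print(f"not diseased correctly predicted: {specificity:.2%}")
print(f"diseased correctly predicted: {sensitivity:.2%}")
```

The first printed line reproduces the 26 of 36 (72.22%) reported in the results table.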

Note:
A threshold of 50% is used to decide whether a person is classified as diseased or not.

If the regression model estimates a probability greater than 50%, this person is assigned “diseased”, otherwise “not diseased”.
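
The threshold rule can be sketched as follows. The function names `logistic` and `classify` and the example values of `z` are hypothetical; `z` stands for the linear combination of the independent variables that a fitted model would produce, which the logistic function maps to a probability between 0 and 1.

```python
import math


def logistic(z: float) -> float:
    """Map a linear predictor z (any real number) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability: float, threshold: float = 0.5) -> str:
    """Assign 'diseased' only if the probability is strictly above the threshold."""
    return "diseased" if probability > threshold else "not diseased"


# Illustrative values of z, not taken from the example dataset.
for z in (-2.0, 0.0, 2.0):
    p = logistic(z)
    print(f"z = {z:+.1f}  ->  p = {p:.3f}  ->  {classify(p)}")
```

A value of z = 0 corresponds to exactly 50%, which under the "greater than 50%" rule is still assigned "not diseased".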
Now comes the Chi2 test.

Here we can read whether the model as a whole is significant or not.

Two models are compared for this purpose: in one model all independent variables are used, and in the other model the independent variables are not used.

With the help of the Chi2 test we compare how good the prediction is when the independent variables are used and how good it is when they are not used. The Chi2 test “tells us” whether there is a significant difference between these two results.

The null hypothesis is that both models are the same.

If the p-value is less than 0.05, this null hypothesis is rejected.

In our example, the p-value is less than 0.05 and we assume that there is a significant difference between the models. Thus, the model as a whole is significant.
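
A comparison of this kind is commonly carried out as a likelihood-ratio test, and the sketch below assumes that form; the text does not state which statistic DATAtab computes. The log-likelihood of the model without independent variables follows directly from the observed counts (16 not diseased, 20 diseased). The log-likelihood of the full model is not given in the text, so the value used here is purely hypothetical. Three degrees of freedom are assumed, one coefficient each for age, gender, and smoking status.

```python
import math


def null_log_likelihood(n_zero: int, n_one: int) -> float:
    """Log-likelihood of a model that predicts the overall disease rate for everyone."""
    p = n_one / (n_zero + n_one)
    return n_one * math.log(p) + n_zero * math.log(1.0 - p)


def chi2_sf_df3(x: float) -> float:
    """P(X > x) for a chi-square distribution with 3 degrees of freedom (closed form)."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def likelihood_ratio_test(ll_null: float, ll_full: float) -> tuple[float, float]:
    """Return the chi-square statistic and its p-value (3 degrees of freedom assumed)."""
    chi2 = 2.0 * (ll_full - ll_null)
    return chi2, chi2_sf_df3(chi2)


ll_null = null_log_likelihood(n_zero=16, n_one=20)  # counts from the text
ll_full_hypothetical = -20.0  # NOT from the text; illustrative value only

chi2, p_value = likelihood_ratio_test(ll_null, ll_full_hypothetical)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
print("model significant" if p_value < 0.05 else "model not significant")
```

With three degrees of freedom, the p-value falls below 0.05 once the statistic exceeds roughly 7.81, so the better the full model fits relative to the model without independent variables, the smaller the p-value becomes.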
