0% found this document useful (0 votes)
4 views

Logistic_Regression

Regression analysis is a method for modeling relationships between variables, with linear regression predicting a metric dependent variable and logistic regression predicting a dichotomous variable. Logistic regression helps determine influences on binary outcomes, such as disease presence, using independent variables like age and smoking status. The logistic function is used to estimate probabilities between 0 and 1, which is essential for predicting dichotomous outcomes.

Uploaded by

Vince Tejada
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views

Logistic_Regression

Regression analysis is a method for modeling relationships between variables, with linear regression predicting a metric dependent variable and logistic regression predicting a dichotomous variable. Logistic regression helps determine influences on binary outcomes, such as disease presence, using independent variables like age and smoking status. The logistic function is used to estimate probabilities between 0 and 1, which is essential for predicting dichotomous outcomes.

Uploaded by

Vince Tejada
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

What is the difference between a

What is a regression
linear regression and a logistic
A regression analysis is a method for
regression?
modeling relationships between
variables.
In a linear regression, the dependent
It makes it possible to infer variable is a metric variable, e.g. salary
or predict a variable or electricity consumption.

based on one or more


other variables.
In a logistic regression, the
dependent variable is a
The variable we want to infer or
predict is called the dependent
dichotomous variable.
variable or criterion.
What is a dichotomous variable?

Dichotomous variables are variables


with only two values.

For example:
Whether a person buys or does
not buy a particular product
or
whether a disease is
present or not

The variables we use for prediction


are called independent
variables or predictors.
How can logistic Our data set might look like this:
regression be used
Here we have the
independent variables
With the help of logistic
regression, we can determine what
has an influence on whether a certain
disease is present or not. Age Gender Smoker status Disease
22 female Non-smoker 1
25 female Smoker 1
18 male Smoker 0
45 male Non-smoker 0
12 female Smoker 0
43 male Smoker 1
23 male Smoker 0
33 male Smoker 1
… … … …

and here the dependent


variable with 0 and 1.

We could study the influence of We could now investigate what influence the
age, gender and smoking status independent variables have on the disease.
on that particular disease. If there is an influence, then we can predict
how likely a person is to have a certain disease.

In this case 0 stands for not diseased


and 1 for diseased
Now, of course, the
question arises:
Why do we need logistic
regression in this case?
Why can't we just use linear
and the probability for the occurrence
of the characteristic 1 (=characteristic regression?
present) is estimated.
A quick recap: A linear regression would now simply
In linear regression, this is put a straight line through the points.
our regression equation:

We have the the 1


dependent variable independent variables

and the regression coefficients. x

We can now see, that in the case of


linear regression, values between
However, we now have a
dependent variable that is either
plus and
0 or 1.

y
y
1
1

0
0
x
No matter which value we have for the x
independent variables, only 0
or 1 results. minus infinity can occur.
However, the goal of logistic No matter where we are on the x-axis,
regression is to estimate the
probability of occurrence.
1
The value range for the prediction
should therefore be between 0 and 1.
1/2

y
-∞ 0 +∞
1
between minus and plus infinity only
values between 0 and 1 result.

0
And that is exactly
x
what we want!

So we need a function that only


takes values between 0 and 1!

The equation for the logistic

And that is exactly what the function looks like this:


logistic function does.

1
1/2 The logistic function is now
used by the logistic regression.

-∞ 0 +∞
For z, the equation of the linear
regression is now simply inserted.

This gives us this equation:

Thus, the probability that the


dependent variable is 1 is given by:

What does this look


like for our example

In our example,
the probability of having a certain disease

is a function of age, gender and smoking status.


For z, the equation of the linear regression
is now simply inserted.

This gives us this equation:

Thus, the probability that the dependent


variable is 1 is given by:

What does this look like for our example

In our example,
the probability of having a certain disease

is a function of age, gender and smoking status.

Now we need to determine the coefficients


so that our model best represents the given data.

To solve this problem, the so-called


maximum likelihood method is used.
For this purpose, there are good numerical
methods that can solve the problem efficiently.

You might also like