
Materi MT

Uploaded by

Putrijunho
Copyright
© © All Rights Reserved

LOGISTIC BINARY REGRESSION

WEEK 5 – CATEGORICAL DATA ANALYSIS

Putri Azizatun Hidayati, M. Si.


REVIEW LAST WEEK

- Risk ratio (RR): divide the cumulative incidence in the exposed group by the cumulative incidence in the unexposed group.
- Odds ratio (OR): divide the odds of disease in the exposed group by the odds of disease in the non-exposed group.
- The odds ratio is analogous to the risk ratio (RR) of cohort studies.
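
As a quick numerical check of the two measures reviewed above, the following sketch computes both from a 2x2 table. The counts and the function names `risk_ratio` and `odds_ratio` are hypothetical, chosen only for illustration.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2x2 table.

    a: exposed with disease,   b: exposed without disease
    c: unexposed with disease, d: unexposed without disease
    """
    risk_exposed = a / (a + b)      # cumulative incidence, exposed group
    risk_unexposed = c / (c + d)    # cumulative incidence, unexposed group
    return risk_exposed / risk_unexposed


def odds_ratio(a, b, c, d):
    """Odds ratio from the same 2x2 table."""
    odds_exposed = a / b            # odds of disease, exposed group
    odds_unexposed = c / d          # odds of disease, unexposed group
    return odds_exposed / odds_unexposed


# Hypothetical counts (not from any real study)
a, b, c, d = 30, 70, 10, 90
rr = risk_ratio(a, b, c, d)     # (30/100) / (10/100)
or_ = odds_ratio(a, b, c, d)    # (30/70) / (10/90)
print(f"RR = {rr:.2f}, OR = {or_:.2f}")
```

With these counts the odds ratio is larger than the risk ratio, which is typical when the disease is not rare.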
OUTLINE
TODAY
01 Introduction to Logistic Regression
Binary data, the difference between ordinary linear regression and logistic regression, implementation in real life
02 Model and Parameter Estimation
The form of the model and how to estimate the parameters
03 Parameter Interpretation
Explain the meaning of the parameters
04 Quiz and Discussion
Discussion time

Outcome:
1. Able to explain binary data
2. Understand binary logistic regression
3. Able to estimate parameters in logistic regression
4. Able to interpret the analysis results
Categorical Data

Categorical data has a measurement scale that consists of several categories. The categories describe the characteristics of the measurement results.

Data Types
- Categorical Data: Nominal, Ordinal
- Numerical Data: Interval, Ratio

Nominal Data
- There is no order between categories
- Categories serve only as labels
- Ex: Gender, race, religion, color

Ordinal Data
- There is an order between categories
- Ex: Level of education, level of satisfaction with a service (good, average, bad)
Binary Categorical Data

Binary categorical data groups the data into 2 categories, for example "Success" and "Failure".

Data Types
- Categorical Data: Nominal, Ordinal

Often, binary data is used to represent one of two conceptually opposed values, e.g.:
- the outcome of an experiment ("success" or "failure")
- the response to a yes–no question ("yes" or "no")
- presence or absence of some feature ("is present" or "is not present")
Binary Logistic Regression

Ordinary linear regression models a relationship between predictor variables and a numerical response variable. Ex: Predicting house prices based on certain variables.

Logistic regression models a relationship between predictor variables and a categorical response variable. Ex: Predicting the result of a test (pass or fail), or predicting whether it will rain on a given day based on several variables.

Binary Logistic Regression:
Used when the response is binary (i.e., it has two possible outcomes)
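Binary Logistic Regression Model
For example, let Y be a binary variable with outcomes "Success" and "Fail", and let X be a predictor of Y. Then the logistic regression model can be written as

$$\operatorname{logit}(p) = \log\left(\frac{p}{1-p}\right) = \alpha + \beta x$$

$p$: probability of "Success"
$x$: the value of the predictor variable
$\alpha$: intercept of the logit
$\beta$: slope of the logit

Unlike ordinary linear regression, the model has no additive error term: the randomness lies in Y itself, which follows a Bernoulli distribution with success probability $p$.

Ex: Predicting the probability of getting promoted from work experience.

Estimation

$$\hat{p} = \frac{\exp(\alpha + \beta x)}{1 + \exp(\alpha + \beta x)}$$

with $\alpha$ and $\beta$ replaced by their estimates.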
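
The link between the logit and the probability can be checked numerically. The sketch below is a minimal illustration: the function names and the coefficient values are hypothetical, not estimates from any data in these slides.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def predicted_probability(alpha, beta, x):
    """p = exp(alpha + beta*x) / (1 + exp(alpha + beta*x))."""
    eta = alpha + beta * x          # linear predictor on the logit scale
    return math.exp(eta) / (1.0 + math.exp(eta))


# Hypothetical coefficients for the promotion example (x = years of experience)
alpha, beta = -3.0, 0.5
for years in (2, 6, 10):
    p = predicted_probability(alpha, beta, years)
    # Applying the logit to p recovers the linear predictor alpha + beta*x
    print(years, round(p, 3), round(logit(p), 3))
```

The printed logit values equal alpha + beta*x, which shows that the two formulas are inverses of each other.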
Logistic Regression
Example:
In a company, a supervisor wants to know the probability of an employee getting promoted based on working experience.
With logistic regression, we can define the dependent and independent variables as
Y: Promotion status ("Yes" or "No")
X: Working experience (years)

Then
$p$: the probability of getting promoted
$p = 0$: the person will not get promoted
$p = 1$: the person will get promoted

With a cutoff of 0.5, logistic regression returns an outcome of 0 (Promoted = No) for $p < 0.5$ and a prediction of 1 (Promoted = Yes) for $p \geq 0.5$.
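
The 0.5 rule for the promotion example can be sketched as follows. The coefficients and the helper name `classify` are hypothetical, and 0.5 is treated as an adjustable cutoff rather than a fixed part of the model.

```python
import math


def classify(alpha, beta, x, cutoff=0.5):
    """Return (p, label): label is 1 (Promoted = Yes) when p >= cutoff, else 0."""
    p = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
    return p, int(p >= cutoff)


# Hypothetical coefficients; x is working experience in years
alpha, beta = -3.0, 0.5
results = {years: classify(alpha, beta, years) for years in (2, 6, 10)}
for years, (p, label) in results.items():
    print(f"{years} years: p = {p:.3f} -> predicted class {label}")
```

The model itself produces the probability; the 0/1 label only appears once a cutoff is applied to it.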
What if we fit the probability with linear regression?
Example:
In a company, a supervisor wants to know the probability of an employee getting promoted based on working experience.

Of course, there would be a problem: the outcome of linear regression is unbounded. Its predictions can exceed 1 or fall below 0, which does not make sense for a probability.

Since we want to predict a binary outcome (Yes/No), the predictions need to range from 0 to 1. A probability cannot be negative or exceed 1.
To ensure that the outcome is always a probability, logistic regression uses the sigmoid function to squash the output of linear regression between 0 and 1.
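
To see the difference concretely, the sketch below compares a straight-line prediction with a sigmoid-transformed one. All coefficients are hypothetical and chosen only so that the line leaves the interval from 0 to 1.

```python
import math


def linear_prediction(a, b, x):
    """Straight-line prediction: unbounded as x grows or shrinks."""
    return a + b * x


def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, for illustration only
a_lin, b_lin = -0.1, 0.1     # line fitted directly to the 0/1 outcome
a_log, b_log = -3.0, 0.5     # coefficients on the logit scale

for years in (0, 5, 15, 30):
    linear = linear_prediction(a_lin, b_lin, years)
    logistic = sigmoid(a_log + b_log * years)
    print(f"{years:>2} years: linear = {linear:5.2f}, logistic = {logistic:.3f}")
```

The linear column drops below 0 at 0 years and exceeds 1 at 15 and 30 years, while the logistic column stays strictly between 0 and 1.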
Parameter Estimation
The Horseshoe Crab Data

Each female horseshoe crab in the study had a male crab attached to her in her nest. The study investigated whether the width of the female crab affects the presence of other male crabs around her, which are commonly called satellites.

$Y = 1$ means there is at least one satellite around her
$Y = 0$ means there is no satellite around her
$X$: Width of the female crab

The estimated probability that the female crab has at least one satellite
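
The slides do not show how the estimates are computed. One common approach is maximum likelihood solved by Newton-Raphson iterations, and the sketch below assumes that approach for a single predictor. The data are made up for illustration (they are not the horseshoe crab data), and `fit_logistic` is a hypothetical helper name.

```python
import math


def fit_logistic(xs, ys, max_iter=25, tol=1e-10):
    """Maximum likelihood fit of logit(p) = alpha + beta*x by Newton-Raphson."""
    alpha, beta = 0.0, 0.0
    for _ in range(max_iter):
        ps = [1.0 / (1.0 + math.exp(-(alpha + beta * x))) for x in xs]
        # Gradient of the log-likelihood
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Information matrix (negative Hessian), weights w = p(1 - p)
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * step = g
        step_a = (h11 * g0 - h01 * g1) / det
        step_b = (h00 * g1 - h01 * g0) / det
        alpha, beta = alpha + step_a, beta + step_b
        if abs(step_a) < tol and abs(step_b) < tol:
            break
    return alpha, beta


# Made-up data: x = predictor value, y = 1 ("success") or 0 ("failure")
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
alpha_hat, beta_hat = fit_logistic(xs, ys)
fitted = [1.0 / (1.0 + math.exp(-(alpha_hat + beta_hat * x))) for x in xs]
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

The made-up outcomes overlap on purpose: if the 0s and 1s were perfectly separated by x, the maximum likelihood estimates would not be finite and the iterations would not settle.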
Odds Interpretation

$$\operatorname{odds}(Y = 1) = \frac{p}{1-p} = \exp(\alpha + \beta x) = e^{\alpha}\left(e^{\beta}\right)^{x}$$

The odds multiply by $e^{\beta}$ for every 1-unit increase in $x$.

For the horseshoe crabs, the estimated odds of a satellite multiply by $e^{\hat{\beta}} = \exp(0.497) = 1.64$ for each centimeter increase in width. That is, there is a 64% increase in the estimated odds.
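
The arithmetic behind this interpretation can be reproduced in a few lines. The slope 0.497 is the value quoted on the slide; the 2 cm case is an added illustration of the same rule.

```python
import math

beta_hat = 0.497                      # estimated slope quoted above (per cm of width)
odds_multiplier = math.exp(beta_hat)  # factor applied to the odds per 1 cm increase
percent_increase = (odds_multiplier - 1.0) * 100.0

print(f"odds multiplier per cm: {odds_multiplier:.2f}")
print(f"percent increase in odds: {percent_increase:.0f}%")

# A k-cm increase multiplies the odds by exp(k * beta_hat) = odds_multiplier ** k
k = 2
print(f"odds multiplier for {k} cm: {math.exp(k * beta_hat):.2f}")
```

Note that the multiplier applies to the odds, not to the probability itself: the change in probability per centimeter depends on the starting width.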
Study Case
Consider the effect of snoring on heart disease. With scores {0, 2, 4, 5} for snoring levels, the logistic regression ML fit is

1. Describe the independent and dependent variables
2. Describe 𝑝 as the probability of existence of heart disease
3. Interpret the sign of the estimated effect of x
4. Estimate the probabilities of heart disease at snoring levels 0 and 5
5. Describe the estimated effect of snoring on the odds of heart disease
Source

Agresti, A., 2007, An Introduction to Categorical Data Analysis. John Wiley & Sons Inc., New Jersey

Hosmer, D.W., and Lemeshow, S., 2000, Applied Logistic Regression. John Wiley & Sons Inc., New York

https://towardsdatascience.com/logistic-regression-explained-in-7-minutes-f648bf44d53e
