
MDS5202/ Segment 3

BINARY LOGISTIC REGRESSION MODEL – ESTIMATION

Table of Contents

1. Simple Logistic Regression
1.1 Logistic Regression Analysis
1.2 Logistic Regression Estimation
1.3 Maximum Likelihood Estimation
1.4 Interpretation of Parameters
1.5 Categorical Independent Variable
1.6 Reference Cell Coding
1.7 Challenger Dataset
2. Multiple Logistic Regression
2.1 Estimation
3. Summary


Introduction
Linear regression can be used to predict a value that is continuous in nature. However, if the
values to be predicted are dichotomous, that is, they take only two values, linear regression
is unsuitable in its original form; the values must be transformed before the concept can be
applied. Binary logistic regression is a method to predict dichotomous variables.

Learning Objectives

At the end of this topic, you will be able to:

• Explain the use of the logistic regression model for dichotomous outcome variables
• Describe the maximum likelihood estimation of the parameters of the logistic
regression
• Interpret the regression coefficient, odds ratio, and confidence interval


1. Simple Logistic Regression


Regression analysis is a method to model the relationship between variables. It helps to infer
or predict one variable based on one or more other variables. The dependent variable is the
variable one wants to infer or predict, and an independent variable is a variable used for
predicting the dependent variable. In linear regression models, the dependent variable is
continuous, for example, the salary of employees, consumption of electricity, or weight of
children. Logistic regression models have a dichotomous dependent variable, that is, a
categorical variable with two categories, for example, whether a disease is present or absent,
or whether a customer buys a product or not. There are only two possible outcomes that the
outcome variable, or dependent variable, can take.

In the visualisation of data with continuous values, the dependent variable is plotted on
the Y-axis, whereas the independent variable is plotted on the X-axis, giving a scatter of
points as depicted in Figure 1. To fit the regression line, the least squares method is used:
the line is placed in the scatter so that the sum of squared distances from the points to the
line is as small as possible. This process fits a linear regression to the relationship between
two continuous variables.

Figure 1: Visualisation of Data


1.1 Logistic Regression Analysis


Logistic regression is similar to linear regression except that the Y-axis has only the values 0
and 1, for example, representing obesity status: 'is obese' and 'is not obese'. In the scatter
plot, there is a row of points against the value 0, which represents 'is not obese', and another
row of points against 'is obese', coded as 1.

Figure 2: Logistic Regression Analysis

The logistic function gives the probability of an event of interest, say 𝑝. The probability
ranges from 0 to 1, and we look at the odds of the event. The odds of an event are defined
as the ratio of the probability of occurrence to the probability of non-occurrence of the
event:

Odds = Probability of occurrence / (1 − Probability of occurrence) = 𝑝 / (1 − 𝑝)

When a probability ranging from 0 to 1 is converted to odds, the odds are 0 when 𝑝 is 0, and
as 𝑝 approaches 1, the odds go to positive infinity. So, the odds can take any value between
0 and infinity.

The log of odds brings the outcome variable from the '0 to 1' scale to a continuous scale,
like in linear regression. The log of 0 is negative infinity, so the log of odds translates the
scale of the 0-to-1 observed variable to one running from negative infinity to positive infinity.
The 'log of odds' link function thus converts the dichotomous '0 to 1' observation into a
continuous scale. Once this continuous scale is achieved, the log odds play the role of
the 𝑌 in the linear regression model.
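
As a minimal sketch of these transformations in R (assuming only base R), using an
illustrative probability:

# Probability of the event of interest (illustrative value)
p <- 0.8

# Odds: probability of occurrence over probability of non-occurrence
odds <- p / (1 - p)        # 4

# Log of odds (the logit); ranges over the whole real line
log_odds <- log(odds)      # about 1.39
qlogis(p)                  # same value: qlogis() is the logit function

# The inverse link maps any real number back into (0, 1)
plogis(log_odds)           # 0.8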


When the regression line is drawn with the probability of obesity on the Y-axis, we get a
curved line that best fits the scatter plot. Here, 𝑌 ranges from 0 to 1.

Figure 3: Regression Line

Suppose the probability of obesity is converted to the log of odds; the axis then moves
from the 0-to-1 probability scale to a scale running from negative to positive infinity. It now
becomes similar to 𝑌 as seen in the linear regression model.

Figure 4: Regression line

Figure 5 shows what the logistic regression looks like. Note that logistic regression is a
non-linear model; the link function converts the non-linear model into a linear one.


Figure 5: Logistic Regression

Suppose, in the logit transformation, the logit of the proportion is placed on the Y-axis and
the predictor on the X-axis; the sigmoid curve in Figure 6 is then converted into a straight
line. This is the whole process by which the link function, also called the logit
transformation, converts a sigmoid curve into a linear one.

Figure 6: Logit Transformation


1.2 Logistic Regression Estimation


Consider Figure 7, where weight is plotted against obesity. The point in Figure 7 indicates
the probability of obesity for a lightweight individual (the left side of the axis corresponds
to smaller weights). The probability that such a light individual is obese is very small
because the weight is low.

Figure 7: Probability at a Point

If the red point is shifted towards the middle, to an individual of intermediate weight, the
probability is around 0.5.

Figure 8: Probability at Mid-Point

If the red dot moves to the extreme right, the probability that the individual measured is
obese is high, which is logical because the weight is high.


Figure 9: Probability at Higher Values

So, the curve in a logistic regression gives the probability of the outcome at each value on
the X-axis: the red circle on the line is simply the probability for the observation at that point
on the X-axis. Using the predictors, logistic regression predicts the probability of the
outcome of interest. It thus classifies a subject into one of the two groups by giving the
probability of belonging to each category, based on the predictor values.
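
The following sketch reads such probabilities off a sigmoid curve, using made-up
coefficient values b0 and b1 (illustrative assumptions, not estimates from any real dataset):

# Hypothetical logistic curve for obesity versus weight (illustrative numbers)
b0 <- -10                          # assumed intercept
b1 <- 0.15                         # assumed slope per unit of weight
weights <- c(40, 67, 100)          # light, intermediate, heavy
p_obese <- plogis(b0 + b1 * weights)
round(p_obese, 3)                  # roughly 0.018, 0.512, 0.993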

Figure 10: Probability at Multiple Points

Linear regression models estimate the regression coefficients using the least squares
method, which makes the distance of each point from the line as small as possible.
Logistic regression instead uses the "maximum likelihood method".

To calculate the likelihood in logistic regression, take the first observation. Here, 𝑦 equals 0
when an individual 'is not obese' and 1 when an individual 'is obese'. If the probability of
being obese is denoted as 𝑝, then the likelihood contribution of a non-obese observation is
1 − 𝑝. In the same way, one can obtain the likelihood of every other point. Since it is a binary
outcome variable, the likelihood of each point is its respective probability, and the
likelihood of all points put together is the product of these probabilities.

Figure 13: Other Likelihood Observation

Next, shift the candidate curve and recalculate the likelihood of all the points. Compute
such likelihoods for each sigmoid curve that can be drawn through the scatter of points.
The curve that gives the maximum likelihood yields the estimated regression coefficients.

Figure 14: Likelihood Observation


In linear regression models, 𝑌 can take any value from negative to positive infinity, but in
logistic regression models, 𝑌 takes values between 0 and 1. Mathematically, it can be written
as,

𝑙𝑜𝑔𝑖𝑡(𝑝𝑖) = 𝛽₀ + 𝛽₁𝑋₁

Where,
𝑙𝑜𝑔𝑖𝑡(𝑝𝑖): Logit transformation of the probability of the event = log of the odds
𝛽₀: Intercept of the regression line
𝛽₁: Slope of the regression line
𝑋₁: The predictor variable

The distribution of each observation 𝑦𝑖 is given by the Bernoulli mass function,


𝑓𝑖(𝑦𝑖) = 𝜋𝑖^𝑦𝑖 (1 − 𝜋𝑖)^(1−𝑦𝑖),  𝑖 = 1, 2, …, 𝑛

The likelihood function is denoted as,


𝐿(𝒚, 𝛽) = ∏_{𝑖=1}^{𝑛} 𝑓𝑖(𝑦𝑖) = ∏_{𝑖=1}^{𝑛} 𝜋𝑖^𝑦𝑖 (1 − 𝜋𝑖)^(1−𝑦𝑖)

The log-likelihood is,

ln 𝐿(𝒚, 𝛽) = ln ∏_{𝑖=1}^{𝑛} 𝑓𝑖(𝑦𝑖) = ∑_{𝑖=1}^{𝑛} 𝑦𝑖 ln(𝜋𝑖 / (1 − 𝜋𝑖)) + ∑_{𝑖=1}^{𝑛} ln(1 − 𝜋𝑖)
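
A minimal sketch of evaluating this log-likelihood at a candidate (𝛽₀, 𝛽₁) in R, with made-up
data; maximising it numerically gives the MLE discussed in the next section:

# Bernoulli log-likelihood of a simple logistic model at candidate coefficients
log_lik <- function(beta, x, y) {
  pi_i <- plogis(beta[1] + beta[2] * x)   # pi_i = P(y_i = 1)
  sum(y * log(pi_i) + (1 - y) * log(1 - pi_i))
}

x <- c(40, 55, 67, 80, 100)   # made-up predictor values
y <- c(0, 1, 0, 1, 1)         # made-up binary outcomes
log_lik(c(0, 0), x, y)        # log-likelihood at beta = (0, 0)

# optim() minimises, so negate; the result matches coef(glm(y ~ x, family = binomial))
optim(c(0, 0), function(b) -log_lik(b, x, y))$par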

1.3 Maximum Likelihood Estimation


The maximum likelihood estimators (MLEs) of the model parameters are those values that
maximise the likelihood (or log-likelihood) function. ML often gives estimators that are
intuitively pleasing. Some properties of MLEs are:
• Are unbiased for large samples
• Have nearly minimum variance
• Have an approximate normal distribution when 𝑛 is large.

If we have 𝑛𝑖 trials at each observation, we can write the log-likelihood as,


ln 𝐿(𝒚, 𝛽) = 𝛽′𝑿′𝒚 − ∑_{𝑖=1}^{𝑛} 𝑛𝑖 ln[1 + exp(𝒙𝑖′𝛽)]


The derivative of the log-likelihood is,


∂ ln 𝐿(𝒚, 𝜷) / ∂𝜷 = 𝑿′𝒚 − ∑_{𝑖=1}^{𝑛} [𝑛𝑖 / (1 + exp(𝒙𝑖′𝜷))] exp(𝒙𝑖′𝜷) 𝒙𝑖

= 𝑿′𝒚 − ∑_{𝑖=1}^{𝑛} 𝑛𝑖𝜋𝑖𝒙𝑖

= 𝑿′𝒚 − 𝑿′𝝁  (because 𝜇𝑖 = 𝑛𝑖𝜋𝑖)

Setting this last result to zero gives the maximum likelihood score equations,

𝑿′(𝒚 − 𝝁) = 𝟎

These score equations also arise in the linear regression model,

𝒚 = 𝑿𝜷 + 𝜺 = 𝝁 + 𝜺

where ordinary least squares (OLS), or ML with normal errors, likewise gives 𝑿′(𝒚 − 𝝁) = 𝟎.

Since 𝝁 = 𝑿𝜷,

𝑿′(𝒚 − 𝝁) = 𝑿′(𝒚 − 𝑿𝜷̂) = 𝟎, so 𝑿′𝑿𝜷̂ = 𝑿′𝒚, and

𝜷̂ = (𝑿′𝑿)⁻¹𝑿′𝒚

the OLS or normal-theory MLE.

Solving the ML score equations in logistic regression is not easy, as,


𝜇𝑖 = 𝑛𝑖 / (1 + exp(−𝒙𝑖′𝛽)),  𝑖 = 1, 2, …, 𝑛

Logistic regression is a non-linear model, so the solution is computed by iteratively
reweighted least squares, or IRLS. This is an iterative procedure: the parameter estimates
are updated over several steps from an initial "guess". Because the variance of the
observations is not constant, weights are used, and these weights are functions of the
unknown parameters.
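
Here is a bare-bones sketch of the IRLS iteration for Bernoulli observations (𝑛𝑖 = 1),
assuming well-behaved made-up data; production software adds step-halving and
convergence checks:

# Iteratively reweighted least squares for logistic regression (n_i = 1)
irls_logit <- function(X, y, iter = 25) {
  beta <- rep(0, ncol(X))                   # initial "guess"
  for (k in 1:iter) {
    eta <- X %*% beta                       # linear predictor
    mu  <- plogis(eta)                      # current fitted probabilities
    w   <- as.vector(mu * (1 - mu))         # weights depend on the unknown parameters
    z   <- eta + (y - mu) / w               # working response
    beta <- solve(t(X) %*% (w * X), t(X) %*% (w * z))   # weighted least squares step
  }
  beta
}

x <- c(40, 55, 67, 80, 100); y <- c(0, 1, 0, 1, 1)
X <- cbind(1, x)
irls_logit(X, y)     # agrees with coef(glm(y ~ x, family = binomial))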


1.4 Interpretation of Parameters


The log odds at 𝑥 is,

𝜂̂(𝑥) = ln[𝜋̂(𝑥) / (1 − 𝜋̂(𝑥))] = 𝛽̂₀ + 𝛽̂₁𝑥

The log odds at 𝑥 + 1 is,


𝜂̂(𝑥 + 1) = ln[𝜋̂(𝑥 + 1) / (1 − 𝜋̂(𝑥 + 1))] = 𝛽̂₀ + 𝛽̂₁(𝑥 + 1)

The difference in the log odds is,


𝜂̂ (𝑥 + 1) − 𝜂̂ (𝑥) = 𝛽̂1

The odds ratio is found by taking antilogs,


𝑂̂𝑅 = 𝑂𝑑𝑑𝑠_{𝑥+1} / 𝑂𝑑𝑑𝑠_𝑥 = 𝑒^𝛽̂₁

The odds ratio is interpreted as the estimated change in the odds of "success" associated
with a one-unit increase in the value of the predictor variable.
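
For instance, as a quick numerical sketch with an assumed slope estimate:

b1_hat <- 0.4        # assumed estimated slope (illustrative, not from any dataset)
exp(b1_hat)          # about 1.49: the odds multiply by ~1.49 per one-unit increase in x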

1.5 Categorical Independent Variable


Here the predictor variable is categorical, and the dependent variable is the binary outcome
coded as 0 and 1 (the predictor can also be multinomial). A reference category must be
chosen, because comparisons are made across categories to look at the change in the
probability of occurrence of the event. Generally, the reference is the highest- or
lowest-coded category (depending on the software used). This uses the concept of dummy
variables: an indicator variable is created for each non-reference category.


1.6 Reference Cell Coding


Design Variables
Here is an example of identifying the dummy variables and the reference cell coding when
a categorical independent or predictor variable must be included.

Table 1: Design variables

The design variable income has three categories, low, medium, and high, numerically coded
as 1, 2, and 3. To represent it, two dummy (indicator) variables are required: the number of
dummy variables equals one less than the number of categories. Here, the income variable
has three categories, so two dummy variables are needed. Table 1 shows that for the
low-income category, the first dummy variable is 1 and the second is 0, signifying
membership of the low category.

Similarly, for the medium category, the first dummy variable is 0 and the second is 1. For the
high category, both dummy variables are 0, making it the reference category, which is
indicated by 0 across all the dummy variables. When this variable is entered into a
regression model, the software gives the coefficients for low and medium compared to high.
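
A small sketch of reference cell coding in R, using a hypothetical income factor;
model.matrix() displays the dummy variables that R would create:

# Hypothetical income variable with three categories
income <- factor(c("low", "medium", "high", "low", "high"),
                 levels = c("low", "medium", "high"))

# By default R takes the first level as the reference; relevel() makes
# "high" the reference, matching Table 1
income <- relevel(income, ref = "high")

# Three categories produce two dummy variables (plus the intercept column)
model.matrix(~ income)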


1.7 Challenger Dataset


Here is an example of fitting a simple logistic regression and interpreting the output
generated from the analysis.

Figure 15: Dataset of Rockets

This is the Challenger dataset on solid rocket launches. Figure 15 lists the variables: the
number of O-ring failures at the field joints, the number of failures at the nozzles, and
indicator variables for whether a failure occurred, where 0 means no and 1 means yes. The
recorded values also include the temperature at launch and the pressures at the field joint
and the nozzle. A failure occurred when an O-ring in the field joints or the nozzles of the
solid rocket was damaged; so, 1 codes the incidence of a failure and 0 its absence, and the
objective is to predict what determines the failure of the rocket. The focus is mostly on the
O-rings of the field joint as the main determinant of the accident. The temperature on the
day of launch is measured in degrees Fahrenheit, and the leak-check pressure tests of the
O-rings ensure that the rings would seal the joints of the field and the nozzle.

Running the analysis in R, the output reports 24 records, of which 7 were failures and 17
were successes; the predictor considered is temperature. From the standard errors of the
coefficients we get the Z test, which is the ratio of each coefficient to its standard error,
along with the corresponding P-values. The odds-ratio column shows the effect of a unit
increase in X: the difference in the log odds is 𝛽̂₁, and exponentiating it gives 𝑒^𝛽̂₁, the
odds ratio. Written out, the fitted regression model is,

𝑦̂ = exp(10.875 − 0.17132𝑥) / (1 + exp(10.875 − 0.17132𝑥))

Figure 16: Challenger Dataset

The regression coefficient 𝛽̂ for temperature is −0.17132, and the odds ratio is,

𝑂̂𝑅 = 𝑒^(−0.17132) = 0.84

This implies that every decrease of one degree in temperature increases the odds of O-ring
failure by a factor of about 1/0.84 = 1.19, or 19 per cent.

The odds ratio corresponding to a 22-degree increase in temperature is,

𝑂̂𝑅 = 𝑒^(22 × (−0.17132)) = 0.0231

The temperature at the Challenger launch was 22 degrees below the lowest observed launch
temperature, which increases the odds of failure by a factor of 1/0.0231 = 43.34, or about
4200 per cent! This extrapolation implied a serious risk for the launch.
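
As a sketch, the fit above could be reproduced in R along these lines, assuming a data
frame named challenger with columns fail (0/1) and temp (these names are placeholders,
not part of the original output):

# Simple logistic regression of O-ring failure on launch temperature
fit <- glm(fail ~ temp, data = challenger, family = binomial)
summary(fit)                   # coefficients, standard errors, z tests, p-values

exp(coef(fit)["temp"])         # odds ratio per one-degree increase, ~0.84
1 / exp(coef(fit)["temp"])     # odds multiplier per one-degree decrease, ~1.19
exp(22 * coef(fit)["temp"])    # odds ratio for a 22-degree increase, ~0.0231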

To generate the 95% CI for 𝛽, the general rule for a confidence interval is,

Statistic ± Confidence coefficient × SE(Statistic)

It is assumed that the statistic has a normal distribution.


95% CI for 𝛽: 𝛽̂ ± 𝑍_{1−𝛼/2} × SE(𝛽̂)

95% CI for the odds ratio exp(𝛽): exp{𝛽̂ ± 𝑍_{1−𝛼/2} × SE(𝛽̂)}

In the Challenger data, 𝛽̂ = −0.17132 and SE(𝛽̂) = 0.08344, so the

95% CI for exp(𝛽) is: exp{−0.17132 ± 1.96 × 0.08344} = (0.72, 0.99)

For an odds ratio, the null value is 1. If the confidence interval contains 1, the null
hypothesis cannot be rejected; if the interval lies entirely below or above 1, the odds of
failure significantly decrease or increase, respectively. Here the interval (0.72, 0.99)
excludes 1, so temperature is a significant predictor of failure.
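
A short sketch of the Wald computation, using the estimates quoted above:

# Wald 95% CI for the Challenger slope and for the corresponding odds ratio
beta <- -0.17132
se   <- 0.08344
ci_beta <- beta + c(-1, 1) * qnorm(0.975) * se
ci_beta          # about (-0.335, -0.008)
exp(ci_beta)     # about (0.72, 0.99), matching the interval above

# For a fitted glm object, confint.default(fit) gives the same Wald intervals
# (confint(fit) gives profile-likelihood intervals instead)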

2. Multiple Logistic Regression


Multiple logistic regression is a statistical method used to predict a single binary variable
using two or more other variables. It is used when the variable you want to predict (the
dependent variable) is binary and there are multiple independent variables.

For example, whether a purchase is made can be answered with yes or no; this can be the
dependent variable, with consumer income and age as independent variables.

In another example, the dependent variable is coded 0 for no depression and 1 for
depression, and the independent variables are: smoking (smoker = 1, non-smoker = 0), age
(continuous), and gender (female = 0, male = 1).

The equation for it is,


𝑙𝑜𝑔𝑖𝑡(𝑝𝑖) = 𝑎 + 𝑏₁𝑥₁ + 𝑏₂𝑥₂ + … + 𝑏𝑘𝑥𝑘

ln[𝑝(depression) / (1 − 𝑝(depression))] = 𝑎 + 𝑏₁(smoking) + 𝑏₂(age) + 𝑏₃(gender)
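
A sketch of fitting this model in R, assuming a hypothetical data frame df containing the
coded variables:

# df is assumed to have depression (0/1), smoking (0/1), age, gender (0/1)
fit <- glm(depression ~ smoking + age + gender,
           data = df, family = binomial)
summary(fit)             # coefficients and standard errors
exp(coef(fit))           # odds ratios for each predictor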


2.1 Estimation
Table 2: Estimation of Output

Table 2 shows the output for a particular study in which a multiple logistic regression was
run. As in linear regression, the output gives the 𝛽 coefficients and their standard errors.
The exponent of each coefficient is the odds ratio, and the 95% confidence interval for the
odds ratio is also given in Table 2. The odds ratio for gender is 1.132; this is called an
adjusted odds ratio because other predictor variables, or other independent variables, are
present in the model. It says that males have about 13% higher odds of being depressed
than females; female, coded as 0, is the reference category. The confidence interval for this
odds ratio includes 1, so gender is not considered a significant predictor of depression.
With 95% confidence, the odds of being depressed for males relative to females range from
0.823 to 1.556 when adjusted for all the other variables in the model.

Similarly, from the table, for a one-unit increase in age the odds of being depressed remain
essentially unchanged; hence age also does not help predict the odds of depression. For
smokers, the odds ratio is 2.38, so a smoker is estimated to have 2.38 times the odds of
being depressed compared with a non-smoker, adjusted for age and gender. The odds ratio
generated from a simple logistic regression is referred to as the crude odds ratio, and the
odds ratios generated for the predictors from a multiple logistic regression are referred to
as adjusted odds ratios.
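
As a sketch on the same hypothetical data frame, the two kinds of odds ratio come from
two different fits:

# Crude odds ratio: simple logistic regression with the single predictor
crude <- exp(coef(glm(depression ~ smoking, data = df, family = binomial)))

# Adjusted odds ratio: the same predictor alongside the other covariates
adj_fit <- glm(depression ~ smoking + age + gender, data = df, family = binomial)
adjusted <- exp(coef(adj_fit))

crude["smoking"]; adjusted["smoking"]       # crude vs adjusted OR for smoking
exp(confint.default(adj_fit))               # Wald 95% CIs for the adjusted ORs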


3. Summary
In this topic, we discussed:
• Need for the use of the logistic regression model for dichotomous outcome variables
• Maximum likelihood estimation of the parameters of the logistic regression
• Interpretation of the regression coefficient, odds ratio and confidence interval
