Chapter 2

Short History of the Logistic Regression Model

Abstract The logistic regression model, as compared to the probit, Tobit, and
complementary log–log models, is worth revisiting based upon the work of Cramer
(2002, 2003). The ability to model the odds has made the logistic regression model
a popular method of statistical analysis. The logistic regression model can be used
for prospective, retrospective, or cross-sectional data, while the probit, Tobit, and
complementary log–log models can only be used with prospective data because they
model the probability of the event. This chapter provides a summary (Cramer, 2002,
2003).

2.1 Motivating Example

More than 175 years after the advent of the growth curve, we have fully embraced
the logistic regression model as a viable tool for binary data. Today, the logistic
regression model is one of the most widely used binary models in the analysis of
categorical data. The logistic regression model is based on modeling the odds of an
outcome, and the idea of odds (as used commonly by the average person) has lots of
appeal. Many seem to be familiar with the odds of certain outcomes, whether their
discussions are in sports, illness, or almost anything else. Additionally, it is quite
interesting from a statistical point of view that whether the data were obtained from
prospective, retrospective, or cross-sectional sampling, the covariate’s impact on
the binary outcome will be the same.
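The invariance claimed above can be illustrated with a small numerical sketch. The counts below are hypothetical and not taken from the text; the retrospective table assumes that all events are kept and that 1% of the nonevents are sampled within each covariate group. Under those assumptions the odds ratio is unchanged, while the relative risk is not.

```python
from fractions import Fraction


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b: events and nonevents among units with the covariate present.
    c, d: events and nonevents among units with the covariate absent.
    """
    return Fraction(a * d, b * c)


def relative_risk(a, b, c, d):
    """Ratio of event probabilities (covariate present vs. absent)."""
    return Fraction(a, a + b) / Fraction(c, c + d)


# Hypothetical full population (prospective view).
population = (200, 9800, 100, 19900)

# Hypothetical retrospective sample: keep every event, keep 1% of nonevents.
case_control = (200, 98, 100, 199)

print("odds ratio, population:    ", odds_ratio(*population))
print("odds ratio, case-control:  ", odds_ratio(*case_control))
print("relative risk, population: ", relative_risk(*population))
print("relative risk, case-control:", relative_risk(*case_control))
```

Exact fractions are used so that the equality of the two odds ratios is not blurred by floating-point rounding.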
Since this book concentrates on fitting logistic regression models, it is reasonable
to spend time elaborating on the history and origin of those models. The advent of
the logistic regression model, as compared to the probit, Tobit, log–log, and
complementary log–log models, is worth revisiting (Cramer, 2002, 2003). The ability
to model the odds has made logistic regression very attractive, since the odds can
always be computed whether the data are prospective, retrospective, or
cross-sectional. However, since the probit, Tobit, log–log, and complementary
log–log models rely on probabilities, they are only applicable to prospective data.
Logistic regression models describe the probability (nonlinear) or, equivalently,
the odds (nonlinear) or the logit (linear) of the outcome of an event. Logistic
regression models have been used in countless ways, analyzing anything from election
data to credit card data to healthcare data. Logistic regression analysis is a
useful tool for all of these disciplines because it is ideal for identifying,
discriminating, and profiling different types of subpopulations.

© Springer International Publishing Switzerland 2015
J.R. Wilson, K.A. Lorenz, Modeling Binary Correlated Responses using SAS,
SPSS and R, ICSA Book Series in Statistics 9, DOI 10.1007/978-3-319-23805-0_2

2.2 Definition and Notation

2.2.1 Notation

In this discussion, we use the following symbols:

$P_t$ is the probability of the outcome at time $t$ being one.
$1 - P_t$ is the probability of the outcome being zero at time $t$.
log is the natural logarithm.
logit denotes the log of the odds, i.e., $\log[P_t/(1 - P_t)]$.
β0 represents the value of the logit when the covariate is zero.
β1 represents the increase in the logit for a unit increase in the covariate (when
continuous) or the difference from one category to the next if the covariate is
binary.
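
The notation above can be made concrete with a minimal sketch. The function names and the coefficient values are illustrative assumptions, not part of the text; the code only shows how the logit and its inverse relate the probability to β0 and β1.

```python
import math


def logit(p: float) -> float:
    """Log of the odds, log[p / (1 - p)], for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(eta: float) -> float:
    """Probability whose logit equals eta (written to avoid overflow)."""
    if eta >= 0.0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def probability_at(t: float, beta0: float, beta1: float) -> float:
    """P_t when logit[P_t] = beta0 + beta1 * t."""
    return inverse_logit(beta0 + beta1 * t)


# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -2.0, 0.5

for t in (0, 1, 4):
    p = probability_at(t, beta0, beta1)
    print(f"t={t}: P_t={p:.4f}, logit={logit(p):.4f}")
```

With these values the logit at t = 0 is β0, and each unit increase in t raises the logit by β1, which matches the verbal definitions in the list above.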

2.2.2 Definition

A monotonic function is a function that is either entirely nonincreasing or entirely
nondecreasing; that is, its direction of change never reverses. For example,
$f(x) = x^2$ is monotonically increasing for $x > 0$, since $f(x_2) > f(x_1)$
whenever $x_2 > x_1 > 0$, but it is not monotonic over all real $x$.
A probit model is a type of regression for binary data in which the probability of
the event is modeled through the cumulative distribution function of the standard
normal distribution.
A prospective study is a study designed to determine the relationship between an
outcome and a certain characteristic of the units involved. The researcher follows
the population group over a period of time, noting when or how often the event or
nonevent (e.g., lung cancer) occurs, for example in smokers and in nonsmokers.
Prospective studies offer an opportunity to determine the probability of the event
in each group and, as such, provide the relative risk.
A retrospective study is a study in which the event or nonevent has already
occurred, and the information gathered depends on what occurred in the past. One
example is a study of patients with AIDS that asks whether or not they had used
dirty needles or other common practices.
A case–control study is a non-experimental research design in which researchers
collect information on persons who have experienced the event (the cases) and
compare that information with a group of persons who have not (the controls). The
two groups are matched for age, sex, and other personal data and are then examined
to determine which possible factor (e.g., cigarette smoking, watching television)
may account for the increase or decrease in the case group.
A Tobit model is also referred to as a censored regression model. The Tobit model
is best suited to cases in which the response variable is left- or right-censored
and we are interested in the linear relationships between variables. For example,
in the 1980s there was a time when the law restricted speedometer readings to at
most 85 mph. In an experiment that predicts a vehicle's top speed from a
combination of horsepower and engine size, the largest recorded speed would
therefore be 85 mph, regardless of how fast the vehicle was actually traveling.
This is a perfect example of right-censoring (censoring from above) the data. The
one thing we are certain about is that vehicles recorded at 85 mph were traveling
at least 85 mph (Introduction to SAS. UCLA: Statistical Consulting Group.
http://www.ats.ucla.edu/stat/sas/notes2/, accessed November 24, 2007).
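
The speedometer example can be sketched in a few lines. The speeds below are made-up values, and the code illustrates only the right-censoring mechanism, not the fitting of a Tobit model.

```python
SPEEDOMETER_LIMIT_MPH = 85.0


def record_speed(true_speed_mph: float) -> tuple[float, bool]:
    """Return (recorded speed, censored flag) for a capped speedometer.

    A reading equal to the cap only tells us the true speed was at least
    the cap, so such readings are flagged as censored.
    """
    censored = true_speed_mph >= SPEEDOMETER_LIMIT_MPH
    return min(true_speed_mph, SPEEDOMETER_LIMIT_MPH), censored


# Hypothetical true top speeds (mph), for illustration only.
true_speeds = [62.0, 78.5, 85.0, 97.0, 110.0]
recorded = [record_speed(s) for s in true_speeds]

true_mean = sum(true_speeds) / len(true_speeds)
naive_mean = sum(value for value, _ in recorded) / len(recorded)

for true, (value, censored) in zip(true_speeds, recorded):
    print(f"true={true:6.1f}  recorded={value:5.1f}  censored={censored}")
print(f"mean of true speeds:     {true_mean:.1f}")
print(f"mean of recorded speeds: {naive_mean:.1f}")
```

Treating the capped readings as if they were exact pulls the average downward, which is the kind of distortion a censored regression model is meant to account for.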

2.3 Exploratory Analyses

The logistic regression model is a tool for presenting the relation between a binary
response or a multinomial response and several predictors. Its use is very familiar
and common in the fields of health and education, as well as with elections, credit
card companies, mortgages, and other cases, where there is a need to profile the
sampling unit (Fig. 2.1).
Some example questions to guide a study might be as follows:
1. How do education, ideology, race, and gender predict a vote in favor or not in
favor of a US Senator?
2. What factors predict the type of registered voters who would support the
reelection of a President or a Governor?
3. What are the characteristics of the consumer who should be offered a credit
card?
4. What are the characteristics of a traveler that will make him or her choose one
mode of transportation over another (rail, bus, car, or plane)?

Fig. 2.1 A schematic diagram as X impacts Y (input: continuous and categorical
predictors; output: binary (0,1) response produced by the model; one observation
per sampling unit)

2.4 Statistical Model

The logistic regression model has its origins in bioassay and some other
disciplines. The logistic function itself was invented for the purpose of
describing population growth and was given its name by a Belgian mathematician,
Verhulst. Figure 2.2 provides a description of the function:

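\[ P_t = \frac{e^{\beta_0 + \beta_1 t}}{1 + e^{\beta_0 + \beta_1 t}} \]

This figure shows the relation of the proportion $P_t$ as time increases. Let the
linear relation be

\[ \operatorname{logit}[P_t] = \beta_0 + \beta_1 t, \]

where $\beta_0$ denotes the value of the logit at time equal to zero, $\beta_1$
denotes the rate of change of $\operatorname{logit}[P_t]$ with regard to time, and

\[ \operatorname{logit}[P_t] = \log\left[\frac{P_t}{1 - P_t}\right]. \]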
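
As a brief check that the logistic form of $P_t$ and the linear logit relation are the same statement, the following worked steps use only the symbols defined in Sect. 2.2.1:

```latex
\begin{align*}
1 - P_t &= 1 - \frac{e^{\beta_0 + \beta_1 t}}{1 + e^{\beta_0 + \beta_1 t}}
         = \frac{1}{1 + e^{\beta_0 + \beta_1 t}}, \\
\frac{P_t}{1 - P_t} &= e^{\beta_0 + \beta_1 t}, \\
\operatorname{logit}[P_t] &= \log\left[\frac{P_t}{1 - P_t}\right]
         = \beta_0 + \beta_1 t.
\end{align*}
```

The middle line also shows that the odds are multiplied by $e^{\beta_1}$ for each unit increase in $t$.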

The logistic function rises monotonically as t increases (for β1 > 0). We concur
with authors who have noted that, for Pt from 0.3 to 0.7, the shape of the logistic
curve closely resembles that of the normal cumulative distribution function
(Fig. 2.3).
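The resemblance between the two curves can be checked numerically. The sketch below is not from the text: it rescales the logistic curve by 1.702, a commonly quoted constant that is introduced here as an assumption, and reports the largest vertical gap to the standard normal cumulative distribution function on a grid.

```python
import math


def logistic_cdf(x: float) -> float:
    """Logistic curve 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function via erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Assumption: a commonly quoted scaling constant, not given in the chapter.
SCALE = 1.702

# Grid from -4 to 4 in steps of 0.01.
grid = [i / 100.0 for i in range(-400, 401)]
max_gap = max(abs(logistic_cdf(SCALE * x) - normal_cdf(x)) for x in grid)

print(f"largest gap between the two curves on the grid: {max_gap:.4f}")
```

A different scaling would give a different gap; the point of the sketch is only that, once the horizontal scale is matched, the two S-shaped curves are hard to tell apart.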
One account of the emergence of the logistic function from the growth curve dates
as far back as 1838, when it became a popular formula for certain places in
North Africa (Cramer, 2002). In more recent times, Dr. Pearl of the U.S. Food
Administration was preoccupied with the food needs of a growing population
during World War I and decided to use logistic functions to address them.
Additionally, Dr. Lowell Reed, President of Johns Hopkins, applied the logistic
curve to a catalytic agent formed during a reaction (Reed & Berkson, 1929). The
logistic function was also used in chemistry at the same time, but it appears that
the basic idea was logistic growth. Our research supports the fact that the
function is still used to model population growth as well as the market
penetration of new products and technologies.
Fig. 2.2 A logistic curve Pt versus time

Fig. 2.3 Cumulative distribution function of normal distribution

There is a close resemblance of the logistic to the normal distribution function
(Wilson, 1925; Winsor, 1932). In 1944, Berkson turned his attention to the
statistical methodology of bioassay and proposed the use of the logistic function
instead of the normal probability function of Pt, coining the term “logit” by
analogy with the “probit” presented by Bliss (1934a, 1934b). The logit model of
bioassay can easily be generalized to logistic regression, where binary outcomes
are related to a number of determinants without a specific theoretical background.
We learned that the earliest developments in statistics and epidemiology took
place in the late 1950s and the 1960s. In the discipline of statistics, the
analytical advantages of the logit transformation as a means of dealing with
discrete binary outcomes were put at the forefront of the discussion. Dr. Cox
pioneered this work by publishing a series of papers on the topic in the 1960s and
then following them up with the outstanding textbook Analysis of Binary Data
(Cox, 1969). Later, the close proximity of the logistic model to discriminant
analysis was recognized, as well as its unique relationship to log-linear models
(Bishop, Fienberg, & Holland, 1975). We further learned that epidemiologists were
busy developing case–control studies even earlier, since the discipline of
epidemiology is more directly concerned with odds, odds ratios, log-odds, or the
logit transformation. It appears that researchers were already debating the
theoretical justification (Cornfield, 1951, 1956), and we must mention the works of
Berkson (1944, 1951).
Our research led us to believe that the first comprehensive textbook with medical
applications was published by Hosmer and Lemeshow (1989). I remember using
their first edition in my graduate categorical data class in Statistics at Arizona State
University shortly after I arrived in Tempe. Until recently, I was unaware that I was
touching part of history. I remember back then talking to some researchers from the
marketing department and being told that logistic regression was brought to their
discipline by certain researchers. The presence of logistic regression models in the
behavioral sciences is believed to be due to the works of McKelvey and Zavoina
(Cramer, 2003). They adopted the approach based on an ordered probit analysis of
the voting behavior of US Congressmen. However, the generalization of logistic
regression to the multinomial or polychotomous case is due to Gurland, Lee, and
Dahm (1960), Mantel (1966), and Theil (1969).

2.5 Analysis of Data

Our analyses of binary data with logistic regression models will be done mostly
with SAS, SPSS, and R. There are several procedures in SAS, SPSS, and R for
modeling binary responses under varying conditions and certain assumptions. We
attempt to use the most common procedures as we demonstrate the fit of logistic
regression models to correlated data with and without time-dependent covariates
and with fixed and random effects. There are a few chapters in which we were unable
to duplicate the fit of the model in all three statistical packages.

2.6 Conclusions

Logistic regression is often preferred as a model for binary responses, as it is
appropriate for any kind of data: cross-sectional, prospective, or retrospective.
Its reliance on the odds makes it an excellent candidate for interpretation, as
society can easily relate to such findings. In contrast, the probit and
complementary log–log models are only appropriate for modeling prospective data,
as they rely on probabilities.

References

Berkson, J. (1944). Application of the logistic function to bio-assay. Journal of the American
Statistical Association, 39, 357–365.
Berkson, J. (1951). Why I prefer logits to probits. Biometrics, 7(4), 327–339.
Bishop, Y. M., Fienberg, S. E., & Holland, P. W. (1975). Discrete multivariate analysis: Theory
and practice. Cambridge, MA: MIT Press.
Bliss, C. I. (1934a). The method of probits. Science, 79, 38–39.
Bliss, C. I. (1934b). The method of probits. Science, 79, 409–410.
Cornfield, J. (1951). A method of estimating comparative rates from clinical data. Journal of the
National Cancer Institute, 11, 1269–1275.
Cornfield, J. (1956). A statistical problem arising from retrospective studies. In J. Neyman (Ed.),
Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability
(pp. 135–148). Berkeley, CA: University of California Press.
Cox, D. R. (1969). Analysis of binary data. London: Chapman and Hall.
Cramer, J. S. (2002). The origins of logistic regression (Tinbergen Institute Working Paper
No. 2002-119/4). Retrieved from SSRN: http://ssrn.com/abstract=360300 or
http://dx.doi.org/10.2139/ssrn.360300
Cramer, J. S. (2003). The origins and development of the logit model. In J. S. Cramer (Ed.), Logit
models from economics and other fields (pp. 149–158). Cambridge, England: Cambridge
University Press.
Gurland, J., Lee, I., & Dahm, P. A. (1960). Polychotomous quantal response in biological assay.
Biometrics, 16, 382–398.
Hosmer, D. W., & Lemeshow, S. (1989). Applied logistic regression. New York: Wiley.
Mantel, N. (1966). Models for complex contingency tables and polychotomous response curves.
Biometrics, 22, 83–110.
Reed, L. J., & Berkson, J. (1929). The application of the logistic function to experimental data.
Journal of Physical Chemistry, 33(5), 760–779.
Theil, H. (1969). A multinomial extension of the linear logit model. International Economic
Review, 10(3), 251–259.
Wilson, E. B. (1925). The logistic or autocatalytic grid. Proceedings of the National Academy of
Sciences, 11, 431–456.
Winsor, C. P. (1932). A comparison of certain symmetrical growth curves. Journal of the
Washington Academy of Sciences, 22, 73–84.
http://www.springer.com/978-3-319-23804-3
