
Logistic Regression

Reading: Applied Categorical and Count Data Analysis, Tang, He & Tu, CRC Press, Chapter 4

References:
1. Categorical Data Analysis; Alan Agresti; Wiley Series in Probability; 1990.
3. Multivariate Data Analysis; Hair et al; Pearson Education, 6e; pp. 359-401.

1
Logistic Regression/ Binary Logit Model

• The logistic regression ("binary logit") model extends the ideas of multiple linear regression to the situation where the response variable, Y, is binary (coded 0 and 1).
• Explanatory variables X1, X2, ..., Xk may be categorical or continuous.
• It estimates the probability of an observation belonging to each group.

2
Problems with Usual Regression

For X = 100, the linear-regression prediction of 3.2121 has no meaning as a probability!

3
Solution with Logistic Regression

Logistic function: Prob = 1/[1 + exp(-(-6.39 + 0.44*x))],
i.e., ln(Prob/(1-Prob)) = -6.39 + 0.44*x [= logit].

For X = 100, predicted Prob ≈ 1;
For X = 18, predicted Prob = 0.82.
4
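A minimal Python sketch of this calculation (a0 = -6.39 and a1 = 0.44 taken from the fitted model above):

    import math

    def logistic_prob(x, a0=-6.39, a1=0.44):
        """Probability from the binary logit model: P = 1 / (1 + exp(-(a0 + a1*x)))."""
        z = a0 + a1 * x          # the logit (log odds)
        return 1.0 / (1.0 + math.exp(-z))

    print(round(logistic_prob(100), 4))   # ~1.0
    print(round(logistic_prob(18), 2))    # ~0.82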
Prescribe “Buy/Sell/Hold” for a Security

Daily data on Price, Volume, MACD, RSI, Put-Call Ratio, Open Interest, etc., together with a “Buy/Sell/Hold” label (attached in hindsight on the basis of the next trading day’s closing price).

Using a logistic regression model fitted to the above data over a period, we can make future decisions.
5
6
7
Binary Logit Model
Probability of success may be modeled as:

log[P/(1-P)] = a0 + a1*X,   i.e.,   P = exp(a0 + a1*X) / [1 + exp(a0 + a1*X)]

Logistic function (a0 = 0, a1 = 1):

p = f(x) = exp(x)/[1 + exp(x)]

8
Relation between Probability, Odds & Logit
Probability   Odds    Log(Odds) = Logit
0             0       NC
0.1           0.11    -2.20
0.2           0.25    -1.39
0.3           0.43    -0.85
0.4           0.67    -0.41
0.5           1.00     0.00
0.6           1.50     0.41
0.7           2.33     0.85
0.8           4.00     1.39
0.9           9.00     2.20
1             NC      NC
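A short Python sketch (for illustration only) that reproduces this table from the probability values:

    import math

    print(f"{'Probability':>12} {'Odds':>8} {'Logit':>8}")
    for p in [round(0.1 * i, 1) for i in range(11)]:
        if p in (0.0, 1.0):
            # odds = p/(1-p) is 0 or undefined; the logit is undefined at both ends
            odds = "0" if p == 0.0 else "NC"
            logit = "NC"
            print(f"{p:>12} {odds:>8} {logit:>8}")
        else:
            odds = p / (1 - p)
            logit = math.log(odds)
            print(f"{p:>12} {odds:>8.2f} {logit:>8.2f}")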
9
Binary Logit Model: Multiple Regressors

Linear regression model:
P = a0 + a1*x1 + ... + ak*xk;   P can be beyond the [0,1] range.

Binary logit model:
log[P/(1-P)] = a0 + a1*X1 + ... + ak*Xk,

i.e., P = exp(a0 + a1*X1 + ... + ak*Xk) / [1 + exp(a0 + a1*X1 + ... + ak*Xk)]

10
Death Risk from Heart Disease
• Outcome: death from heart disease in the next 5 years
• 3 risk factors (age, sex, and blood cholesterol level) are used to predict the risk of death from heart disease:
x1 = age in years, less 50
x2 = sex, where 0 is male and 1 is female
x3 = cholesterol level in mmol/L, less 5.0

11
Dataset (100 data points)
Sex = “1” means female

12
SPSS Windows: Logit Analysis
1. Select ANALYZE from the SPSS menu bar.
2. Click REGRESSION and then BINARY LOGISTIC.
3. Move “Death” into the DEPENDENT VARIABLE box.
4. Move “Sex,” “Agemin50,” and “Cholmin5” into the COVARIATES box.
5. Select ENTER for METHOD (default option).
6. Click OK.

13
Output (3 explanatory variables)

Given the age, sex & cholesterol level of a person, compute:

logit = z = -9.826 - 1.484*(if female) + 0.908*(age - 50) + 1.164*(cholesterol - 5), &
risk of death = 1/[1 + exp(-z)]
14
15
Death Risk from Heart Disease
(in next 5 years)

• Logit = z = -9.826 - 1.484*(if female) + 0.908*(age - 50) + 1.164*(cholesterol - 5); here a0 = -9.826, a1 = 0.908, a2 = -1.484, a3 = 1.164
• (x1 = age in years, less 50; x2 = sex, where 0 is male and 1 is female; x3 = cholesterol level in mmol/L, less 5.0)
• Risk of death p = 1/[1 + exp(-z)]
• Prediction: risk of death from heart disease for Mr Nagarwal (male, 61.5 years, cholesterol level 6.9 mmol/L) = 0.94; (z = -9.826 - 1.484*0 + 0.908*11.5 + 1.164*1.9 = 2.824)
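The same prediction as a short Python sketch, using the coefficients from the SPSS output above:

    import math

    def death_risk(age, cholesterol, female):
        """5-year death risk from the fitted logit:
        z = -9.826 - 1.484*female + 0.908*(age - 50) + 1.164*(cholesterol - 5)."""
        z = -9.826 - 1.484 * (1 if female else 0) + 0.908 * (age - 50) + 1.164 * (cholesterol - 5)
        return 1.0 / (1.0 + math.exp(-z))

    print(round(death_risk(age=61.5, cholesterol=6.9, female=False), 2))   # Mr Nagarwal: ~0.94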
16
Output (3 explanatory variables)

17
Output (2 explanatory variables)

Given the age & cholesterol level of a person, compute:

Logit = z = -9.645 + 0.845*(age - 50) + 1.104*(cholesterol - 5), &
risk of death = 1/[1 + exp(-z)]
18
Output (2 explanatory variables)

19
Properties of the Logit Model

• Although Xi may vary from -∞ to +∞, P is constrained to lie between 0 and 1.

• When Xi approaches -∞, P approaches 0.

• When Xi approaches +∞, P approaches 1.

• When a linear regression model is used, P is not constrained to lie between 0 and 1.
20
Estimation and Model Fit
• Estimation method: maximum likelihood.
• Fit: Cox & Snell R Square and Nagelkerke R Square (similar to R² in multiple regression).
• The Cox & Snell R Square cannot equal 1.0 even if the fit is perfect; this limitation is overcome by the Nagelkerke R Square.
• The Hosmer–Lemeshow test is used for assessing the goodness of fit of a model (a large p-value indicates the overall model fit is good).
• Compare predicted and actual values of Y to determine the percentage of correct predictions, as in the sketch below.
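A minimal sketch of that last point (assuming predicted probabilities and actual 0/1 outcomes are already available), computing the percentage of correct predictions at a 0.5 cut value:

    def percent_correct(probs, actuals, cut=0.5):
        """Classify each case as 1 if its predicted probability exceeds the cut value,
        then report the percentage of cases classified correctly."""
        predicted = [1 if p > cut else 0 for p in probs]
        hits = sum(1 for pred, y in zip(predicted, actuals) if pred == y)
        return 100.0 * hits / len(actuals)

    # hypothetical example values
    print(percent_correct([0.9, 0.2, 0.7, 0.4], [1, 0, 0, 0]))   # 75.0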
21
Estimating Model Parameters
• The linear regression model uses the OLS method to minimize the sum of squared errors of prediction.
• The logistic regression model maximizes the “likelihood” of observing y1, ..., yn, defined by

L = Π (i = 1..n) pi^yi * (1 - pi)^(1 - yi),

where pi = exp(β0 + β1*x1i + ... + βk*xki) / [1 + exp(β0 + β1*x1i + ... + βk*xki)].

Thus, L is a function of β0, β1, ..., βk.
22
Estimation (contd.)
L = Π (i = 1..n) pi^yi * (1 - pi)^(1 - yi),

where pi = exp(β0 + β1*x1i + ... + βk*xki) / [1 + exp(β0 + β1*x1i + ... + βk*xki)].

To maximize L, or equivalently ln(L), we can use Excel Solver.

We require initial values of β0, β1, ..., βk, which can be obtained by multiple linear regression of the “empirical log odds” on X1, ..., Xk:

ln[(yi + 1/2) / (1 - yi + 1/2)] = β0 + β1*X1i + ... + βk*Xki
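As an alternative to Excel Solver, a minimal Python sketch that maximizes ln(L) numerically with scipy.optimize; the small one-predictor dataset here is hypothetical:

    import numpy as np
    from scipy.optimize import minimize

    # hypothetical data: one predictor x and a binary response y
    x = np.array([18, 25, 30, 35, 40, 45, 50, 60], dtype=float)
    y = np.array([0, 0, 0, 1, 0, 1, 1, 1], dtype=float)
    X = np.column_stack([np.ones_like(x), x])     # add the intercept column

    def neg_log_likelihood(beta):
        """-ln(L) for the binary logit model; minimizing this maximizes L."""
        z = X @ beta
        p = 1.0 / (1.0 + np.exp(-z))
        return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

    result = minimize(neg_log_likelihood, x0=np.zeros(2))   # crude starting values
    print(result.x)    # estimated (beta0, beta1)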

23
Model Fit Measures
Cox & Snell R square: R² = 1 - (L0/L)^(2/n),
where L0 = maximum likelihood of the intercept-only model,
L = maximum likelihood under the specified model.

Nagelkerke adjusted R square:

Adjusted R² = R²/R²max,   where R²max = 1 - (L0)^(2/n)

24
Other Measures
• Akaike’s Information Criterion:
AIC = -2*ln(L) + 2*k,
where k = # of model parameters
• Corrected AIC (for small samples):
AICc = AIC + 2*k*(k+1)/(n - k - 1)
• Bayesian Information Criterion:
BIC = -2*ln(L) + ln(n)*k

25
Interpretation of Coefficients

• The log odds, ln(p/(1-p)), is a linear function of the X-variables with coefficients a0, a1, ..., ak.
• If Xi increases by one unit, the log odds change by ai units, holding the other X-variables constant.
• a0 = the log odds when all X-variables equal zero.

ln[p/(1-p)] = a0 + a1*x1 + ... + ak*xk,   i.e.,   p = exp(a0 + a1*x1 + ... + ak*xk) / [1 + exp(a0 + a1*x1 + ... + ak*xk)]
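Equivalently, a one-unit increase in Xi multiplies the odds p/(1-p) by exp(ai), the odds ratio (reported by SPSS as Exp(B)). A quick numeric check using the Brand coefficient 1.274 from the brand-loyalty output later in the deck:

    import math

    a_brand = 1.274                      # coefficient of Brand (B column)
    print(math.exp(a_brand))             # ~3.575, the Exp(B) reported by SPSS:
                                         # each extra point of Brand multiplies the odds of loyalty by ~3.6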
26
Interpretation of Coefficients
If Xi is increased by one unit, the log odds will change by ai units, when the values of the other independent variables are held constant.

The sign of ai determines whether the probability increases (if the sign is positive) or decreases (if the sign is negative) by some amount.

Probability   Odds    Log(Odds) = Logit
0             0       NC
0.1           0.11    -2.20
0.2           0.25    -1.39
0.3           0.43    -0.85
0.4           0.67    -0.41
0.5           1.00     0.00
0.6           1.50     0.41
0.7           2.33     0.85
0.8           4.00     1.39
0.9           9.00     2.20
1             NC      NC
27
Example: Insurance Requirement

• 100 respondents, [2-level response]: 63 yes, 37 no
• Predictors are Age, gender, and dependent (whether the respondent has dependents or not).
• We want to estimate the chance of a prospective customer purchasing insurance.

28
Data (2-level Response)

29
SPSS Windows: Logit Analysis
1. Select ANALYZE from the SPSS menu bar.
2. Click REGRESSION and then BINARY LOGISTIC.
3. Move “Willing” into the DEPENDENT VARIABLE box.
4. Move “Age,” “Dependent,” and “Income” into the COVARIATES box.
5. Select ENTER for METHOD (default option).
6. Click OK.

30
Output (2-level response)

Given the age, dependents & income of a person, compute:

logit = z = -50.326 + 1.077*age + 13.778*(if has dependents) + 0.000*income;
& chance of insurance purchase = 1/[1 + exp(-z)]
31
Output (2-level response)

32
Output (2-level response)

33
Output (2-level response; w/o Income)

Given the age & dependents of a person, compute:

logit = z = -50.190 + 1.083*age + 13.782*(if has dependents);
& chance of insurance purchase = 1/[1 + exp(-z)]

34
Output (2-level response; w/o Income)

35
Output (2-level response; w/o Income)

36
Example: Brand Loyalty
Malhotra & Dash, Table 18.6, p.598

• 30 respondents: 15 are brand loyal (value = 1), 15 are not (value = 0).
• Also measured are attitude toward the brand (Brand), attitude toward the product category (Product), and attitude toward shopping (Shopping), all on a 1 (unfavorable) to 7 (favorable) scale.
• We want to estimate the chance of a consumer being brand loyal as a function of attitude toward the brand, the product category, and shopping.

37
Seq#  Brand  Product  Shopping  Loyalty      Seq#  Brand  Product  Shopping  Loyalty
1     4      3        5         1            16    3      1        3         0
2     6      4        4         1            17    4      6        2         0
3     5      2        4         1            18    2      5        2         0
4     7      5        5         1            19    5      2        4         0
5     6      3        4         1            20    4      1        3         0
6     3      4        5         1            21    3      3        4         0
7     5      5        5         1            22    3      4        5         0
8     5      4        2         1            23    3      6        3         0
9     7      5        4         1            24    4      4        2         0
10    7      6        4         1            25    6      3        6         0
11    6      7        2         1            26    3      6        3         0
12    5      6        4         1            27    4      3        2         0
13    7      3        3         1            28    3      5        2         0
14    5      1        4         1            29    5      5        3         0
15    7      5        5         1            30    1      3        2         0
38
Explaining Brand Loyalty
No. Loyalty Brand Product Shopping
1 1 4 3 5
2 1 6 4 4
3 1 5 2 4
4 1 7 5 5
5 1 6 3 4
6 1 3 4 5
7 1 5 5 5
8 1 5 4 2
9 1 7 5 4
10 1 7 6 4
11 1 6 7 2
12 1 5 6 4
13 1 7 3 3
14 1 5 1 4
15 1 7 5 5
16 0 3 1 3
17 0 4 6 2
18 0 2 5 2
19 0 5 2 4
20 0 4 1 3
21 0 3 3 4
22 0 3 4 5
23 0 3 6 3
24 0 4 4 2
25 0 6 3 6
26 0 3 6 3
27 0 4 3 2
28 0 3 5 2
29 0 5 5 3
30 0 1 3 2
39
SPSS Windows

To run logit analysis or logistic regression using SPSS for Windows, click:

• Analyze > Regression > Binary Logistic

40
SPSS Windows: Logit Analysis
1. Select ANALYZE from the SPSS menu bar.
2. Click REGRESSION and then BINARY LOGISTIC.
3. Move “Loyalty to the Brand [Loyalty]” into the DEPENDENT VARIABLE box.
4. Move “Attitude toward the Brand [Brand],” “Attitude toward the Product category [Product],” and “Attitude toward Shopping [Shopping]” into the COVARIATES box.
5. Select ENTER for METHOD (default option).
6. Click OK.
41
Results of Logistic Regression

Dependent Variable Encoding
Original Value    Internal Value
Not Loyal         0
Loyal             1

Model Summary
Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      23.471(a)           .453                   .604
a. Estimation terminated at iteration number 6 because parameter estimates changed by less than .001.

42
Results of Logistic Regression

Classification Table (a)

                                           Predicted
                                           Loyalty to the Brand      Percentage
Observed                                   Not Loyal     Loyal       Correct
Step 1   Loyalty to the Brand   Not Loyal  12            3           80.0
                                Loyal      3             12          80.0
         Overall Percentage                                          80.0
a. The cut value is .500

Variables in the Equation (a)

          B        S.E.    Wald     df   Sig.   Exp(B)
Brand      1.274    .479    7.075   1    .008   3.575
Product     .186    .322     .335   1    .563   1.205
Shopping    .590    .491    1.442   1    .230   1.804
Constant  -8.642   3.346    6.672   1    .010    .000
a. Variable(s) entered on step 1: Brand, Product, Shopping.

43
Regressor: Brand

Given the “brand” rating by a person, compute:

logit = z = -6.215 + 1.351*Brand;
& chance of brand loyalty = 1/[1 + exp(-z)]

44
Regressor: Brand

45
Regressors: Brand, Shopping

Given the “brand” & “shopping” ratings by a person, compute:

logit = z = -7.727 + 1.288*Brand + 0.513*Shopping;
& chance of brand loyalty = 1/[1 + exp(-z)]

46
Regressors: Brand, Shopping

47
48
Bankruptcy Example
(Applied Multivariate Statistical Analysis by Johnson & Wichern)

• Annual financial data collected for 16 (currently) bankrupt firms about 2 years prior to their bankruptcy, and for 20 (currently) financially sound firms at about the same time. [0: Bankrupt Firms; 1: Non-Bankrupt Firms]
• Four explanatory variables:
X1 (CFTD) = cash flow / total debt
X2 (NITA) = net income / total assets
X3 (CACL) = current assets / current liabilities
X4 (CANS) = current assets / net sales

49
Seq#   CF/TD   NI/TD   CA/CL   CA/NS   Bankrupt      (0 = bankrupt, 1 = sound)
1      -0.45   -0.41   1.09    0.45    0
2      -0.56   -0.31   1.51    0.16    0
3       0.06    0.02   1.01    0.4     0
4      -0.07   -0.09   1.45    0.26    0
5      -0.1    -0.09   1.56    0.67    0
6      -0.14   -0.07   0.71    0.28    0
7       0.04    0.01   1.5     0.71    0
8      -0.06   -0.06   1.37    0.4     0
9       0.07   -0.01   1.37    0.34    0
10     -0.13   -0.14   1.42    0.44    0
11     -0.23   -0.3    0.33    0.18    0
12      0.07    0.02   1.31    0.25    0
13      0.01    0      2.15    0.7     0
14     -0.28   -0.23   1.19    0.66    0
15      0.15    0.05   1.88    0.27    0
16      0.37    0.11   1.99    0.38    0
50
Seq#   CF/TD   NI/TD   CA/CL   CA/NS   Bankrupt
17      0.51    0.1    2.49    0.54    1
18      0.08    0.02   2.01    0.53    1
19      0.38    0.11   3.27    0.35    1
20      0.19    0.05   2.25    0.33    1
21      0.32    0.07   4.24    0.63    1
22      0.31    0.05   4.45    0.69    1
23      0.12    0.05   2.52    0.69    1
24     -0.02    0.02   2.05    0.35    1
25      0.22    0.08   2.35    0.4     1
26      0.17    0.07   1.8     0.52    1
27      0.15    0.05   2.17    0.55    1
28     -0.1    -0.01   2.5     0.58    1
29      0.14   -0.03   0.46    0.26    1
30      0.14    0.07   2.61    0.52    1
31      0.15    0.06   2.23    0.56    1
32      0.16    0.05   2.31    0.2     1
33      0.29    0.06   1.84    0.38    1
34      0.54    0.11   2.33    0.48    1
35     -0.33   -0.09   3.01    0.47    1
36      0.48    0.09   1.24    0.18    1
51
SPSS Windows: Logit Analysis
1. Select ANALYZE from the SPSS menu bar.
2. Click REGRESSION and then BINARY LOGISTIC.
3. Move “Bankrupt” into the DEPENDENT VARIABLE box.
4. Move “CFTD”, “NITD”, “CACL”, & “CANS” into the COVARIATES box.
5. Select ENTER for METHOD (default option).
6. Click OK.
6. Click OK.

52
Regressors: CFTD, CACL, NITD, CANS

53
Regressors: CFTD, CACL, NITD, CANS

54
Regressors: CFTD, CACL

Given the “CFTD” & “CACL” values of a firm, compute:

logit = z = -4.863 + 5.654*CFTD + 2.550*CACL;
& chance of being “sound” = 1/[1 + exp(-z)]
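A minimal Python check of this two-variable model, applied to firm 1 (bankrupt: CF/TD = -0.45, CA/CL = 1.09) and firm 17 (sound: CF/TD = 0.51, CA/CL = 2.49) from the data tables above:

    import math

    def prob_sound(cftd, cacl):
        """Probability of being financially sound from the fitted two-variable logit."""
        z = -4.863 + 5.654 * cftd + 2.550 * cacl
        return 1.0 / (1.0 + math.exp(-z))

    print(round(prob_sound(-0.45, 1.09), 3))   # firm 1 (bankrupt): low probability of being sound
    print(round(prob_sound(0.51, 2.49), 3))    # firm 17 (sound): high probability of being sound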

55
Regressors: CFTD, CACL

56
57
Example: Resort Visits
(Malhotra & Dash, Table 18.2, p.581)

Sample of 30 households
• VISIT = 1 if visited in last 2 years; = 2 otherwise

• INCOME
• TRAVEL (Attitude toward travel, on a 9-point scale)
• VACATION (importance attached to family vacation)
• HOUSEHOLD SIZE
• AGE (of head of household)
……………………………………………………………..
• AMOUNT (has 3 levels of spending on family vacation)
58
Resort Visits
Sample of 30 households
• VISIT = 1 if visited in last 2 years; = 2 otherwise
• INCOME
• TRAVEL (Attitude toward travel, on a 9-point scale)
• VACATION (importance attached to family vacation)
• HOUSEHOLD SIZE
• AGE (of head of household)

1. Which variables are the most important predictors of VISIT?

2. Can we develop a profile of the two groups in terms of these important predictors?
59
Information on Resort Visits:
Analysis Sample
No.  Resort  Annual Family   Attitude Toward  Importance Attached to  Household  Age of Head    Amount Spent on
     Visit   Income ($000)   Travel           Family Vacation         Size       of Household   Family Vacation

1 1 50.2 5 8 3 43 M (2)
2 1 70.3 6 7 4 61 H (3)
3 1 62.9 7 5 6 52 H (3)
4 1 48.5 7 5 5 36 L (1)
5 1 52.7 6 6 4 55 H (3)
6 1 75.0 8 7 5 68 H (3)
7 1 46.2 5 3 3 62 M (2)
8 1 57.0 2 4 6 51 M (2)
9 1 64.1 7 5 4 57 H (3)
10 1 68.1 7 6 5 45 H (3)
11 1 73.4 6 7 5 44 H (3)
12 1 71.9 5 8 4 64 H (3)
13 1 56.2 1 8 6 54 M (2)
14 1 49.3 4 2 3 56 H (3)
15 1 62.0 5 6 2 58 H (3)
60
Information on Resort Visits: Analysis Sample
Table 18.2, cont.
No.  Resort  Annual Family   Attitude Toward  Importance Attached to  Household  Age of Head    Amount Spent on
     Visit   Income ($000)   Travel           Family Vacation         Size       of Household   Family Vacation

16 2 32.1 5 4 3 58 L (1)
17 2 36.2 4 3 2 55 L (1)
18 2 43.2 2 5 2 57 M (2)
19 2 50.4 5 2 4 37 M (2)
20 2 44.1 6 6 3 42 M (2)
21 2 38.3 6 6 2 45 L (1)
22 2 55.0 1 2 2 57 M (2)
23 2 46.1 3 5 3 51 L (1)
24 2 35.0 6 4 5 64 L (1)
25 2 37.3 2 7 4 54 L (1)
26 2 41.8 5 1 3 56 M (2)
27 2 57.0 8 3 2 36 M (2)
28 2 33.4 6 8 2 50 L (1)
29 2 37.5 3 2 3 48 L (1)
30 2 41.3 3 3 2 42 L (1)
61
Regressors: ALL

62
Regressors: ALL

63
Most Important Regressors?

64
Most Important Regressors?

65
Regressors: Income, Hsize

Given the “income” & “hsize” values of a household, compute:

logit = z = 18.074 - 0.270*income - 1.415*hsize;
& chance of visit = 1/[1 + exp(-z)]

66
Regressors: Income, Hsize

67
68
Multinomial Logistic Regression
Sample of 30 households
• INCOME
• TRAVEL (Attitude toward travel, on a 9-point scale)
• VACATION (importance attached to family vacation)
• HOUSEHOLD SIZE
• AGE (of head of household)
……………..
• AMOUNT (has 3 levels of spending on family
vacation)
69
Information on Resort Visits:
Analysis Sample
No.  Resort  Annual Family   Attitude Toward  Importance Attached to  Household  Age of Head    Amount Spent on
     Visit   Income ($000)   Travel           Family Vacation         Size       of Household   Family Vacation

1 1 50.2 5 8 3 43 M (2)
2 1 70.3 6 7 4 61 H (3)
3 1 62.9 7 5 6 52 H (3)
4 1 48.5 7 5 5 36 L (1)
5 1 52.7 6 6 4 55 H (3)
6 1 75.0 8 7 5 68 H (3)
7 1 46.2 5 3 3 62 M (2)
8 1 57.0 2 4 6 51 M (2)
9 1 64.1 7 5 4 57 H (3)
10 1 68.1 7 6 5 45 H (3)
11 1 73.4 6 7 5 44 H (3)
12 1 71.9 5 8 4 64 H (3)
13 1 56.2 1 8 6 54 M (2)
14 1 49.3 4 2 3 56 H (3)
15 1 62.0 5 6 2 58 H (3)
70
Information on Resort Visits: Analysis Sample
Table 18.2, cont.
No.  Resort  Annual Family   Attitude Toward  Importance Attached to  Household  Age of Head    Amount Spent on
     Visit   Income ($000)   Travel           Family Vacation         Size       of Household   Family Vacation

16 2 32.1 5 4 3 58 L (1)
17 2 36.2 4 3 2 55 L (1)
18 2 43.2 2 5 2 57 M (2)
19 2 50.4 5 2 4 37 M (2)
20 2 44.1 6 6 3 42 M (2)
21 2 38.3 6 6 2 45 L (1)
22 2 55.0 1 2 2 57 M (2)
23 2 46.1 3 5 3 51 L (1)
24 2 35.0 6 4 5 64 L (1)
25 2 37.3 2 7 4 54 L (1)
26 2 41.8 5 1 3 56 M (2)
27 2 57.0 8 3 2 36 M (2)
28 2 33.4 6 8 2 50 L (1)
29 2 37.5 3 2 3 48 L (1)
30 2 41.3 3 3 2 42 L (1)
71
Multinomial Logit Model

log(Pj / Pm) = a0j + a1j*X1 + ... + akj*Xk,   j = 1, ..., (m-1)

i.e., Pj = exp(a0j + a1j*X1 + ... + akj*Xk) / {1 + Σ[j = 1..(m-1)] exp(a0j + a1j*X1 + ... + akj*Xk)},   j = 1, ..., (m-1)

72
Spending on Travel/Vacation
• Amount Spent has 3 levels (m = 3) and there are 5 explanatory variables (k = 5).
• We need two equations modeling the logits for levels j = 1, 2 with respect to the “reference level” m:

log(Pj / Pm) = a0j + a1j*X1 + ... + akj*Xk,   j = 1, 2

73
SPSS Windows: Logit Analysis
1. Select ANALYZE from the SPSS menu bar.
2. Click REGRESSION and then MULTINOMIAL LOGISTIC.
3. Move “amount” into the DEPENDENT VARIABLE box.
4. Move “income,” “travel,” “vacation,” “hsize” & “age” into the COVARIATES box.
5. Under STATISTICS: under MODEL select everything except Monotonicity measures; under PARAMETERS select Estimates, Likelihood ratio tests & then ENTER.
6. Click OK.
74
Multinomial Logistic Regression 1

75
Multinomial Logistic Regression 2

76
Multinomial Logistic Regression 3

Logit1 = z1 = 35.771 - 0.854*income + 1.702*hsize
Logit2 = z2 = 14.594 - 0.265*income + 0.125*hsize

p3 = 1/(1 + exp(z1) + exp(z2))
p1 = exp(z1)/(1 + exp(z1) + exp(z2))
p2 = exp(z2)/(1 + exp(z1) + exp(z2))
77
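A minimal Python sketch of these three probabilities, evaluated for a hypothetical household (income 50 thousand, household size 3); level 3 of Amount is taken as the reference category, per the reference-level formulation above:

    import math

    def amount_probs(income, hsize):
        """Multinomial-logit probabilities for the 3 spending levels (level 3 is the reference)."""
        z1 = 35.771 - 0.854 * income + 1.702 * hsize      # Logit1: level 1 vs reference level 3
        z2 = 14.594 - 0.265 * income + 0.125 * hsize      # Logit2: level 2 vs reference level 3
        denom = 1.0 + math.exp(z1) + math.exp(z2)
        p1, p2 = math.exp(z1) / denom, math.exp(z2) / denom
        p3 = 1.0 / denom
        return p1, p2, p3                                  # the three probabilities sum to 1

    print(amount_probs(income=50, hsize=3))   # hypothetical household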
Example: Insurance Requirement

• 100 respondents, [3-level response]: 42 yes, 28 no, 30 maybe
• Predictors are Age, gender, and dependent (whether the respondent has dependents or not).
• We want to estimate the chance of a prospective customer purchasing insurance.

78
Data (3-level Response)

79
SPSS Windows: Logit Analysis
1. Select ANALYZE from the SPSS menu bar.
2. Click REGRESSION and then MULTINOMIAL LOGISTIC.
3. Move “Resp3level” into the DEPENDENT VARIABLE box.
4. Move “Age,” “Dependent,” and “Income” into the COVARIATES box.
5. Under STATISTICS: under MODEL select everything except Monotonicity measures; under PARAMETERS select Estimates, Likelihood ratio tests & then ENTER.
6. Click OK.
80
Output (3-level response)

Logit1 = z1 = 63.913 - 29.447*dependent - 0.877*age + 0.000*income
Logit2 = z2 = 132.935 - 50.587*dependent - 2.403*age + 0.000*income

Prob(“Yes”) = p3 = 1/(1 + exp(z1) + exp(z2))
Prob(“May be”) = p1 = exp(z1)/(1 + exp(z1) + exp(z2))
Prob(“No”) = p2 = exp(z2)/(1 + exp(z1) + exp(z2))
81
Output (3-level response)

82
Output (3-level response)

83
Output (3-level response; w/o Income)

Logit1 = z1 = 63.637 - 29.573*dependent - 0.878*age
Logit2 = z2 = 128.431 - 49.638*dependent - 2.325*age

Prob(“Yes”) = p3 = 1/(1 + exp(z1) + exp(z2))
Prob(“May be”) = p1 = exp(z1)/(1 + exp(z1) + exp(z2))
Prob(“No”) = p2 = exp(z2)/(1 + exp(z1) + exp(z2))
84
85
Output (3-level response; w/o Income)

86
Output (3-level response; w/o Income)

87
