
Seminar Topic

Multinomial Logistic Regression


Name: Asma Parveen
Roll# 8403
MSc 3rd Semester
Supervisor Name: Muhammad Aftab
GC University Faisalabad
OUTLINE
 Introduction
 Definition
 Comparison with simple regression and binary logistic regression
 Model
 Assumptions
 Applications
 Example
Introduction:
 The logistic regression model can be extended to the situation
where the response variable assumes more than two categories that
are not ordinal (i.e. have no natural ordering). In that case we use
multinomial logistic regression to model the dependent variable.
Multinomial logistic regression is also known as multiclass logistic
regression or the multinomial logit model.
Definition:
 Multinomial logistic regression is the regression analysis to
conduct when the dependent variable is nominal with more than
two levels. Thus it is an extension of binary logistic regression,
which analyzes dichotomous (binary) dependent variables.
For example:
 Choice of color (red, yellow, green)
 Choice of profession (doctor, engineer, lawyer)
 Choice of undergraduate program (physics, history, chemistry)
Comparison with regression analysis and binary logistic
regression:

 In regression analysis we estimate the effect of one or more
explanatory variables on a continuous response variable.
 Binary logistic regression is used when the response variable
assumes only two values; for example, when we want to study
whether a customer will buy a product or not, the dependent
variable is binary (1 = yes, 0 = no).
 When the response variable assumes more than two unordered
values, we use multinomial logistic regression. For example, in a
study of mode of transportation to work, the response variable
may be private automobile, bicycle, public transport, or walking
(no natural ordering).
Assumptions:
 The dependent variable should be nominal.
 The independent variables can be continuous or categorical.
 Independence of observations.
 No multicollinearity.
 A linear relationship between any continuous independent
variables and the logit transformation of the dependent
variable.
 No outliers or highly influential points.
 Kerlinger and Pedhazur recommended that at least thirty
observations per variable should be used.
Uses of Multinomial Logistic Regression:
 Medical
 Marketing
 Engineering
 Social sciences
Model:

log(πj / πk) = β0j + β1j X1 + β2j X2 + ··· + βpj Xp

Where

β0j is the intercept, j = 1, 2, …, k−1

β1j, β2j, …, βpj are the regression coefficients

X1, X2, …, Xp are the explanatory variables

πj is the probability of category j and πk is the probability of
the reference category k.
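The model above can be turned into predicted probabilities: exponentiating each of the k−1 logits gives the odds of a category against the reference, and normalizing makes the probabilities sum to one. The sketch below illustrates this for a three-category outcome; all coefficient and predictor values are made up for illustration.

```python
import math

# Hypothetical coefficients for a 3-category outcome (k = 3), giving
# k - 1 = 2 logit equations, each comparing a category j to the
# reference category k. All numbers are illustrative only.
intercepts = [-1.9, -7.6]           # b0j for j = 1, 2
coefs = [[0.03, 0.046, 1.2],        # coefficients for logit 1 (IR, SSPG, RW)
         [0.01, 0.020, 0.8]]        # coefficients for logit 2
x = [10.0, 150.0, 1.0]              # one observation's IR, SSPG, RW values

# Each logit gives log(pi_j / pi_k); exponentiating yields the odds of
# category j versus the reference category.
odds = []
for b0, b in zip(intercepts, coefs):
    eta = b0 + sum(bi * xi for bi, xi in zip(b, x))
    odds.append(math.exp(eta))

# Probabilities: pi_k = 1 / (1 + sum of odds), pi_j = odds_j * pi_k.
denom = 1.0 + sum(odds)
probs = [o / denom for o in odds] + [1.0 / denom]
print(probs)  # the three probabilities sum to 1
```

The same normalization is what software such as SPSS applies internally when it reports predicted category membership.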
Example:
We want to study the factors that have an influence on
diabetes.
 Our response variable is diabetes and its categories are:
1) Chemical diabetes 2) Overt diabetes 3) Normal diabetes
 Many variables have an influence on diabetes, but we take
only three of them:
1) Insulin response (IR) 2) Steady-state plasma glucose (SSPG)
3) Relative weight (RW)
 These measurements were taken on 145 volunteers who
were subjected to the same regimen.
Solution:
N: shows the number of observations.
Marginal percentage - The marginal percentage lists the
proportion of valid observations found in each of the
outcome variable's groups. It is calculated by dividing each
group's N by the total N and multiplying by 100.
Valid - This indicates the number of observations in the
dataset where the outcome variable and all predictor
variables are non-missing.
Missing - This indicates the number of observations in the
dataset where data are missing from the outcome variable
or any of the predictor variables.
Total - This indicates the total number of observations in the
dataset, i.e. the sum of the number of observations with
missing data and the number of observations with valid
data.
Results interpretation - Out of 145 people, 33 have overt
diabetes, 36 have chemical diabetes and 76 have normal
diabetes. Thus the marginal percentage for overt diabetes is
(33/145) × 100 = 22.8%.
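The marginal percentages in the case-processing summary are simple arithmetic; the snippet below reproduces the calculation from the counts given in the text.

```python
# Case-processing summary from the example: 145 valid observations
# split across the three diabetes categories (counts from the text).
counts = {"overt": 33, "chemical": 36, "normal": 76}
total = sum(counts.values())                      # 145

# Marginal percentage: each group's N divided by the total N, times 100.
marginal = {group: 100.0 * n / total for group, n in counts.items()}
print(round(marginal["overt"], 1))    # 22.8, matching the text
```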
Sig: The p-value = 0.000, which is less than 0.05, indicates that
the regression coefficients are not all equal to zero, i.e. the
variables are jointly significant.
This table contains the chi-square goodness-of-fit statistic. This
statistic is intended to test whether the observed data are
consistent with the fitted model.
Sig: The p-value = 1.000 is greater than 0.05,
so we conclude that the observed data are consistent with
the fitted model.
There are three pseudo-R² measures that describe the variation
explained by the model.
Cox and Snell: based on the log-likelihood for the fitted model
compared with the log-likelihood for the intercept-only model. Its
value is less than one.
Nagelkerke: adjusts the Cox & Snell measure so that the range of
possible values extends to one.
McFadden: its value depends on the estimated likelihood. Its
range is 0 to 1, but it never reaches 1.
Here the values R² = 0.667, 0.767 and 0.539 indicate that roughly
66.7%, 76.7% and 53.9% of the variation in the response
variable is accounted for by the explanatory variables.
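The three pseudo-R² measures can all be computed from the log-likelihoods of the fitted model and the intercept-only model. The sketch below shows the standard formulas; the log-likelihood values are hypothetical, not the ones behind this example's output.

```python
import math

# Hypothetical log-likelihoods: LL0 for the intercept-only model,
# LL1 for the fitted model, with n observations. Illustrative only.
LL0, LL1, n = -150.0, -60.0, 145

# Cox & Snell: 1 - (L0 / L1)^(2/n), written with log-likelihoods.
cox_snell = 1.0 - math.exp(2.0 * (LL0 - LL1) / n)

# Nagelkerke: rescales Cox & Snell so its maximum possible value is 1.
nagelkerke = cox_snell / (1.0 - math.exp(2.0 * LL0 / n))

# McFadden: 1 - LL1 / LL0; lies in [0, 1) for a model with an intercept.
mcfadden = 1.0 - LL1 / LL0

print(cox_snell, nagelkerke, mcfadden)
```

Note that Nagelkerke's value is always at least as large as Cox & Snell's, which is why SPSS reports it as the "adjusted" measure.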
Interpretation:
Intercepts: −1.903 and −7.611 are the values of the response
logits when all explanatory variables are zero.
 0.046 means that the average increase in the logit of the
response is 0.046 for a unit increase in SSPG when all other
variables are held constant.
Sig: The p-value of SSPG is 0.000, which is less than our
significance level of 5%, so we reject the null hypothesis and
conclude that SSPG has a significant effect on diabetes.
95% Confidence Interval: This is the confidence interval for
an individual regression coefficient in the model. For a
given predictor, at the 95% confidence level, we say
that we are 95% confident that the "true" population regression
coefficient lies between the lower and upper limits of the
interval.
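A z-based 95% confidence interval of this kind is simply the estimate plus or minus 1.96 standard errors. The sketch below uses the SSPG coefficient 0.046 from the text, but the standard error is made up for illustration.

```python
# 95% CI for a single coefficient: estimate +/- z * SE, with z = 1.96
# for 95% confidence (the z-based interval SPSS reports).
b = 0.046        # SSPG coefficient from the text
se = 0.009       # hypothetical standard error, for illustration only
z = 1.96         # critical value for a 95% confidence level

lower, upper = b - z * se, b + z * se
print(round(lower, 4), round(upper, 4))  # 0.0284 0.0636
```

Since this interval excludes zero, it agrees with the significant p-value for SSPG.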
 References:
 www.researchgate.net
 Chatterjee, S. and Hadi, A. S., Regression Analysis by Example, 5th edition
 statistics.laerd.com
