
This article was downloaded by: [103.141.126.88] On: 18 May 2024, At: 10:36


Publisher: Institute for Operations Research and the Management Sciences (INFORMS)
INFORMS is located in Maryland, USA

Marketing Science
Publication details, including instructions for authors and subscription information:
http://pubsonline.informs.org

Endogeneity and Heterogeneity in a Probit Demand Model: Estimation Using Aggregate Data
Pradeep K. Chintagunta

To cite this article:
Pradeep K. Chintagunta (2001) Endogeneity and Heterogeneity in a Probit Demand Model: Estimation Using Aggregate Data. Marketing Science 20(4):442-456. https://doi.org/10.1287/mksc.20.4.442.9751

Full terms and conditions of use: https://pubsonline.informs.org/Publications/Librarians-Portal/PubsOnLine-Terms-and-Conditions

This article may be used only for the purposes of research, teaching, and/or private study. Commercial use
or systematic downloading (by robots or other automatic processes) is prohibited without explicit Publisher
approval, unless otherwise noted. For more information, contact [email protected].

The Publisher does not warrant or guarantee the article’s accuracy, completeness, merchantability, fitness
for a particular purpose, or non-infringement. Descriptions of, or references to, products or publications, or
inclusion of an advertisement in this article, neither constitutes nor implies a guarantee, endorsement, or
support of claims made of that product, publication, or service.

© 2001 INFORMS

Research Note
Endogeneity and Heterogeneity in a Probit
Demand Model: Estimation Using
Aggregate Data

Pradeep K. Chintagunta
Graduate School of Business, University of Chicago, 1101 East 58th Street, Chicago, Illinois 60637
[email protected]

Abstract

Two issues that have become increasingly important while estimating the parameters of aggregate demand functions to study firm behavior are the endogeneity of marketing activities (typically, price) and heterogeneity across consumers in the market under consideration. Ignoring these issues in the estimation of the demand function parameters can lead to biased and inconsistent estimates for the effects of marketing activities. Endogeneity and heterogeneity have achieved prominence in large measure because of the increasing popularity of logit models to characterize demand functions using aggregate data. The logit model accounts for purchase incidence and brand choice by including a "no-purchase" alternative in the consumer's choice set. This allows for category sales to change as a function of the marketing activities of brands in the category.

There are three issues with using the logit model with the no-purchase option to characterize demand when studying competitive interactions among firms. (1) The marketing literature dealing with brand choice behavior at the consumer level has found that the IIA restriction is not appropriate, as each brand in the choice set is more similar to some brands than it is to others. (2) Studies have found that the purchase incidence decision is distinct from the brand choice decision. Hence, it may not be appropriate to model the no-purchase decision as just another alternative in the choice set with the IIA restriction holding across all brands and the no-purchase option. (3) Even if the distinction between the purchase incidence and brand choice decisions is accounted for via, for example, a nested logit specification, accounting for the purchase incidence decision with aggregate data requires assumptions for computing the share of the no-purchase alternative, which is otherwise unobserved.

In this paper, we propose a probit model as an alternative to the logit model to specify the aggregate demand functions of firms competing in oligopoly markets. The probit model avoids the IIA property that affects the logit model at the individual consumer level. Furthermore, the probit model can naturally account for the distinction between the purchase incidence and brand choice decisions due to the general covariance structure assumed for the utilities of the alternatives. We demonstrate how the parameters of the proposed model can be estimated using aggregate time series data from a product market. In the estimation, we account for the endogeneity of marketing variables as well as for heterogeneity across consumers.

Our results indicate that both endogeneity and heterogeneity need to be accounted for even after allowing for a non-IIA specification at the individual consumer level. Specific to our data, we also find that ignoring endogeneity has a bigger impact on the estimated price elasticities than ignoring the effects of heterogeneity. A comparison of the elasticities obtained from the probit model with those from the corresponding logit specification indicates that the range of elasticities obtained from the probit model across brands is larger than that obtained from the logit. The results have implications for issues such as firm-level pricing.

In addition to specifying a probit model and providing comparisons with the logit model, the paper also addresses the third issue raised above. We propose a simple alternative to the purchase incidence/brand choice specification by decomposing the demand for a brand into a category demand equation and a conditional brand choice share equation. We provide a comparison of results from this specification to those from the specification that includes the no-purchase alternative and find that estimated elasticities are sensitive to the specification used. We also estimate the demand function parameters using a traditional specification such as the double-logarithmic model. Here, we find that the estimated elasticities could be signed in such a manner as to be not useful for firm-level pricing decisions.

One of the key limitations of the proposed model is that while it accounts for the purchase incidence and brand choice decisions of households, it does not account for differences across consumers in their purchase quantities. The model and analysis are best suited for product categories in which consumers typically make single-unit purchases. Another limitation is more practical in nature. While recent advances have been made in computing probit probabilities, it could nevertheless be a challenge to do so when the number of alternatives is large.

(Heterogeneity; Endogeneity; Probit Model; Logit Model)

MARKETING SCIENCE © 2001 INFORMS 0732-2399/01/2004/0442/$05.00
Vol. 20, No. 4, Fall 2001, pp. 442–456 1526-548X electronic ISSN

Introduction

The recent literature in marketing and in economics has seen an explosion of studies dealing with the analysis of firm-level behavior. The principal motivation behind these studies is to measure the market power of firms and to understand interfirm competitive behavior. For example, Kadiyali (1996) studies the competitive interactions between Kodak and Fuji in the U.S. market to investigate whether or not the rivalry between these two firms is as intense as indicated by the popular press. Sudhir (2001) looks at the competitive interactions among firms within various segments of the automobile industry to determine whether these interactions vary significantly across product segments. Nevo (2001) investigates the extent of pricing rivalry in the ready-to-eat breakfast cereal market to determine whether observed prices reflect market power associated with product differentiation or collusion by firms in the industry.

The fundamental building block for the analysis of firm behavior is the demand function for each of the players in the marketplace. The demand functions relate the sales of the brands to their prices, promotions, and other marketing variables. Two issues that have become increasingly important while estimating the parameters of such aggregate demand functions to study firm behavior are the endogeneity of marketing activities (typically, price) and the heterogeneity across consumers in the market under consideration. The endogeneity problem arises when there are variables for which data are not available (such as shelf space allocation, shelf location, store coupons, etc.) that could influence a brand's sales in a given week, and when these variables are correlated with the included marketing variables such as price (lowering the price of a brand in a given week may be accompanied by giving it more shelf facings). These other marketing activities are part of the error term in the estimation, and the correlation between the price variable and the error term results in the endogeneity problem. Not accounting for this correlation will give incorrect estimates for the effects of the included marketing variables. The issue with heterogeneity is the same as it is with household data. If the observed data at the store or market level are the aggregation of consumers with different brand preferences and sensitivities to marketing instruments, then ignoring this heterogeneity in the estimation of the demand function parameters can lead to biased and inconsistent estimates for the effects of marketing activities.

The issues of endogeneity and heterogeneity have achieved prominence in large measure because of the increasing popularity of discrete-choice models to specify the demand functions to study firm behavior. The aforementioned studies by Sudhir and Nevo, along with others by Berry et al. (1995), have used discrete-choice-based demand functions. The main advantages of discrete-choice models are: (i) They are derived from utility maximizing behavior of consumers in the marketplace. (ii) They require estimation of fewer parameters than linear (and log–log and semilog) demand functions (instead of estimating 100 price parameters in a market with 10 brands, usually a single price parameter is estimated). (iii) They seldom result in parameter estimates with incorrect signs for own and cross effects, as is the case with linear demand systems and their variants.

The most widely used specification of discrete-choice demand function in such studies thus far has been the logit model. To allow the total demand for the category to vary over time, the model treats the no-purchase option or the "outside good" as an additional alternative available in the choice set. The specification embodies all the advantages of discrete-choice models noted above. Additionally, it is also easy to estimate. All these advantages appear to justify the model's widespread use in the literature. Hence, researchers have used the logit demand model and have accounted for the issues of endogeneity and heterogeneity while estimating the parameters of these models with aggregate store (Besanko et al. 1998), chain, or market (Sudhir 2001) data. Accounting for heterogeneity in the logit model also alleviates the problem of restrictive cross-elasticities that are obtained from this model because of the "independence of irrelevant alternatives" (IIA) property at the individual consumer level (see the discussion in Nevo 2001).

While the logit model with an outside good (and accounting for endogeneity and heterogeneity in the estimation with aggregate data) has seen widespread application in the marketing and economics literature, little research has been devoted to analyzing the sensitivity of the results obtained to some of the restrictions associated with this specification. Our goal in this paper is to investigate two of these restrictions. The first is the IIA restriction of the logit model at the individual consumer level. Models such as the probit (Currim 1982) do not suffer from the IIA problem at the individual level. As noted above, aggregate elasticities from the logit model that accounts for heterogeneity are indeed not subject to the IIA restriction. Nevertheless, researchers who have estimated logit and probit models with household data have found the aggregate elasticities from these two specifications to be different even after accounting for the effects of heterogeneity (see Chintagunta and Honore 1996). So the question that arises is: Are the aggregate elasticities obtained from the logit and probit models similar when using aggregate data in the estimation and after accounting for the effects of both heterogeneity and endogeneity? An answer to this question is important, as the elasticities directly influence the measure of market power.

The second issue we investigate is the modeling of the "no-purchase" option as an additional alternative in the logit model. Inclusion of the outside good in the estimation requires the shares of each of the alternatives, including that for the outside good or "no-purchase" alternative, to be known. With consumer level data, where one observes whether or not a household purchases the product category, computing the shares is straightforward (Chintagunta 1993). With aggregate data, we only observe the sales or shares of the brands but do not observe the aggregate fraction of consumers not buying a product category in a given week. Hence, we need to assume the total potential size of the market in each week to compute the share of the outside good (Nevo 2001 assumes that everyone living in the market area consumes the equivalent of one helping of cereal each day). In this paper, we present a simple alternative to the approach used in previous studies. Rather than attempt to model the purchase incidence and brand choice decisions simultaneously, we employ the approach proposed by Kim et al. (1995). Category sales are modeled as a regression of the total demand across brands on category level marketing activities. Brand shares are obtained as an aggregation of consumers' conditional (on category purchase) brand choice probabilities as in previous studies such as those by Berry et al. (1995), Nevo (2001), Sudhir (2001), etc. The advantage of this methodology is that we only use information on the brands that is directly obtained from the marketplace, i.e., sales, prices, and other marketing activities of brands. A disadvantage is that the model can no longer be given a fully "structural" interpretation, as the category regression is a reduced-form approach to modeling a piece of the consumer's decision problem (the purchase incidence decision). We provide a comparison of results obtained from the two methods for our data to investigate the sensitivity of the elasticity estimates to the definition of the outside good.

The remainder of this paper is organized as follows. In the next section, we describe the estimation of the aggregate probit model that accounts for heterogeneity as well as price endogeneity. We then provide the results from the empirical analysis using market data on shampoo purchases. A comparison with the logit specification and an investigation of the sensitivity to alternative assumptions on the no-purchase option are provided. The final section concludes with some directions for future research using the methodology.

Model Formulation and Estimation Strategy

We begin with the basic probit model at the household (we use consumer and household interchangeably here) level. We then describe the category level regression model. Our description of the probit model assumes the presence of K "brands." This can be interpreted as K − 1 brands and one no-purchase option (in the case where the outside alternative is part
of the choice process) or as K brands with a separate category sales regression. In the former case, the Kth brand will not have any marketing variables associated with it. Specifically, the indirect utility of consumer i for brand j in week t is given by the following expression.

\[
V_{ijt} = \alpha_{ij} + \beta_i \ln(p_{jt}) + \gamma d_{jt} + \mu_{jt} + \varepsilon_{ijt}, \qquad V_{ijt} = Y_{ijt} + \varepsilon_{ijt}, \tag{1}
\]

where the category consists of K brands $j = 1, 2, 3, \ldots, K$. $V_{ijt}$ is the indirect utility of consumer i for brand j in week t. $\varepsilon_{ijt}$ is a K-variate normal random error term with mean 0 and covariance matrix $\Omega$. $Y_{ijt}$ includes all terms in the indirect utility function excluding $\varepsilon_{ijt}$. $\alpha_{ij}$ is consumer i's intrinsic preference for brand j. $\beta_i$ is the price sensitivity parameter for consumer i. $p_{jt}$ is the price of brand j in week t. $\gamma$ is the deal parameter and $d_{jt}$ is the deal variable for brand j in week t. $\mu_{jt}$ is the unobservable attribute for brand j in week t. The unobservable attribute $\mu_{jt}$ captures the effects of variables other than prices and deals that are not included in the model and that could drive the probability of choosing brand j. These are in-store variables that could vary over time and are correlated with the retail price. This results in the endogeneity problem discussed in the introduction. The probability of brand j being chosen is given by:

\[
\begin{aligned}
P_{ijt} &= \Pr(V_{ikt} - V_{ijt} \le 0,\ \forall k = 1, 2, \ldots, K,\ k \ne j) \\
&= \Pr\{\varepsilon_{ikt} - \varepsilon_{ijt} \le \alpha_{ij} - \alpha_{ik} + \beta_i[\ln(p_{jt}) - \ln(p_{kt})] + \gamma(d_{jt} - d_{kt}) + \mu_{jt} - \mu_{kt},\ k = 1, 2, \ldots, K,\ k \ne j\} \\
&= \Pr(\eta_{j,ikt} \le Z_{j,ikt},\ k = 1, 2, \ldots, K,\ k \ne j) \\
&= \Phi(Z_{ijt}, \Omega_{j,K-1})
\end{aligned} \tag{2}
\]

where $\eta_{j,ikt} = \varepsilon_{ikt} - \varepsilon_{ijt}$, stacked over $k \ne j$, has a (K − 1)-variate normal distribution with mean zero and covariance matrix $\Omega_{j,K-1}$, $\Phi(\cdot, \cdot)$ refers to the CDF of a (K − 1)-variate normal distribution, and $Z_{ijt}$ denotes the matrix $Y_{ijt} - Y_{ikt}$ $\forall k = 1, 2, \ldots, K$, $k \ne j$, with each element denoted by $Z_{j,ikt}$. Note that, unlike the logit model, the probability in Equation (2) does not have a closed form expression and represents a (K − 1)-dimensional integral. However, there are several approaches to computing the integral to a high degree of accuracy. See Hajivassiliou et al. (1996) for a comparison of these methods.

In the above formulation, households are assumed to differ in their preferences as well as in their price sensitivities. To account for such heterogeneity, researchers have proposed several approaches. The two most commonly used specifications are the parametric random effects logit model (Allenby and Rossi 1999) and the semiparametric random effects logit model (see, for example, Jain et al. 1994). Here, we focus on the parametric model. For the latent class approach using aggregate data, see Berry et al. (1998). Heterogeneity in intrinsic preferences ($\alpha_{ij}$) and price sensitivities ($\beta_i$) is accounted for as follows:

\[
\alpha_{ij} = \alpha_j + \varepsilon_{ij} \quad\text{and}\quad \beta_i = \beta + \varepsilon_{i\beta}, \quad\text{where}\quad \varepsilon_{ij} \sim N(0, \sigma_j^2) \quad\text{and}\quad \varepsilon_{i\beta} \sim N(0, \sigma_\beta^2). \tag{3}
\]

$\alpha_j$ is the mean intrinsic preference level for brand j across households, and $\beta$ is the mean value of the price sensitivity parameter. The term $\sigma_j^2$ represents the variance in the intrinsic preference for brand j across consumers. With household level data, one can allow these preferences to be correlated across brands. However, with aggregate data, we do not have information to distinguish between these correlations and those due to the random component of indirect utilities, $\varepsilon_{ijt}$. Hence, while we assume utilities themselves to be correlated across brands, we restrict the preferences to be uncorrelated across brands. Three points are noteworthy at this juncture.

(i) If the brands or products included in the analysis can be represented by their constituent attributes, then allowing for heterogeneity along each attribute as in Equation (3) will allow for brand preferences to be correlated without the problem of identification noted above. Specifically, let $\alpha_{ij} = \sum_{w=1}^{W} \alpha_{iw} I_{jw}$, where $I_{jw}$ is an indicator that takes the value 1 if brand or product j includes attribute w and zero otherwise, and $\alpha_{iw}$ is the preference value that consumer i has for attribute w. Then, Equation (3) can be written as $\alpha_{ij} = \sum_{w=1}^{W} \alpha_w I_{jw} + \sum_{w=1}^{W} \varepsilon_{iw} I_{jw}$. Even if the $\varepsilon_{iw}$ terms are independent across the w's, the preferences for the two brands j and k will be correlated if they share a subset of attributes. By imposing a structure on the nature of preference correlation, we can overcome the identification problem noted previously.

(ii) If we had access to data from multiple markets or multiple stores in a given market, we could additionally exploit the variation in demographic characteristics across the different units (markets or stores) by making $\alpha_{ij}$ and $\beta_i$ functions of these variables. In this way, we can allow for systematic differences in preferences as well as sensitivities to marketing activities across different demographic units.

(iii) If the probabilities in (2) are based on the logit model, we can allow for a general pattern of correlation across preferences and price sensitivities in (3), as the utilities themselves are constrained to be uncorrelated in this case.

When one has access to household data (in the absence of the unobserved attribute term $\mu_{jt}$), we can write out the likelihood of a string of purchases over time for each household. This likelihood would then be integrated over the distribution of heterogeneity. The sample likelihood, which is the product of the unconditional household likelihoods, would then be maximized to arrive at a set of parameters. Villas-Boas and Winer (1999) have recently addressed the issue of accounting for $\mu_{jt}$ and then estimating the parameters of the logit demand model using household data. In dealing with aggregate data, estimation is complicated by two issues: (a) Data are observed only at the aggregate level. In other words, what we observe are $S_{jt}$, the shares of brand j in week t. (b) $p_{jt}$ and $\mu_{jt}$ could potentially be correlated.

The principle underlying the estimation is simple: Obtain estimates that equate the observed shares $S_{jt}$ to the shares predicted by the model, $s_{jt}$. The implementation of this strategy tends to be more complicated because of the correlation mentioned above. The approach used closely parallels that of Berry et al. (1995) and Nevo (2001), with one important difference (that we discuss subsequently).

Step 1. Decompose $Y_{ijt}$ as $Y_{ijt} = (\alpha_j + \beta \ln(p_{jt}) + \gamma d_{jt} + \mu_{jt}) + [\varepsilon_{ij} + \varepsilon_{i\beta} \ln(p_{jt})] = L_{jt} + [\varepsilon_{ij} + \varepsilon_{i\beta} \ln(p_{jt})]$, which follows from substituting Equation (3) into Equation (1). Note that $L_{jt}$ is household invariant, whereas the second term depends on i. Intuitively, the estimation involves two "nested" loops. In the "outer" loop, the parameters corresponding to the household heterogeneity distribution as well as those in $\Omega_{K-1}$ (Equation 2) are computed, whereas the "inner" loop involves computing the unknown parameters embedded in $L_{jt}$. It is important to distinguish between the two loops, because while $L_{jt}$ is linear in the unknown parameters $\alpha_j$, $\beta$, and $\gamma$, the term in the square brackets is nonlinear in the embedded parameters ($\Omega_{K-1}$, $\sigma_j^2$, and $\sigma_\beta^2$).

Step 2. Make R draws for the terms $\varepsilon_{ij}$ and $\varepsilon_{i\beta}$. This requires initial guesses for the unknown (nonlinear) parameters $\sigma_j^2$ and $\sigma_\beta^2$. Hence, given these initial values, the term in the square brackets in the above equation is "known." Additionally, in this step, we also need to pick starting values for the parameters in $\Omega_{K-1}$ (to ensure that the matrix is positive definite, we choose initial values for the Cholesky decomposition of this matrix).

Step 3. Make initial guesses for the $L_{jt}$ terms. Note that if there are 3 brands and 100 time periods, this involves 300 $L_{jt}$ "parameters" in the case of the outside good model and 200 for the category sales/brand choice model. Now, given $L_{jt}$, $[\varepsilon_{ij} + \varepsilon_{i\beta} \ln(p_{jt})]$, and the starting values for $\Omega_{K-1}$, we can compute the probit probability $\Pr_{jt}$ for each of the R draws. The predicted share from the model ($s_{jt}$) is the average probability across the R draws.

Step 4. The "inner loop" computation takes place. In other words, keeping the nonlinear parameters fixed at the initial guesses, we iterate over the (200 or 300) values of $L_{jt}$ to minimize the distance between the predicted share ($s_{jt}$) and the actual share ($S_{jt}$). Given the nonlinearity of the probit probability, the logarithmic transformation to linearity (see Berry et al. 1995) that works for the logit model no longer applies, and we need to use standard nonlinear optimization methods to carry out the minimization. This step can be summarized as follows (note $L_t$ is the vector $\{L_{1t}, L_{2t}, \ldots, L_{Kt}\}$).

\[
\begin{aligned}
\min_{L_t}\,(S_{jt} - s_{jt}) &= \min_{L_t}\,\bigl[S_{jt} - \Phi(L_t \mid Z_{jt}, \sigma_j, \sigma_\beta, \Omega_{j,K-1})\bigr] \\
&= \min_{L_t}\,\Bigl[S_{jt} - \int_{\eta_{j,Kt} \le Z_{j,Kt}} \cdots \int_{\eta_{j,(j+1)t} \le Z_{j,(j+1)t}} \int_{\eta_{j,(j-1)t} \le Z_{j,(j-1)t}} \cdots \int_{\eta_{j,1t} \le Z_{j,1t}} \phi(\eta_{j,1t}, \ldots, \eta_{j,(j-1)t}, \eta_{j,(j+1)t}, \ldots, \eta_{j,Kt})\, d\eta_{j,1t} \cdots d\eta_{j,(j-1)t}\, d\eta_{j,(j+1)t} \cdots d\eta_{j,Kt}\Bigr].
\end{aligned}
\]
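
The share-matching idea behind this inner loop can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's implementation: it handles a single week, omits the heterogeneity draws, normalizes the last alternative's mean utility to zero, approximates the probit shares with a crude frequency simulator, and replaces the "standard nonlinear optimization methods" mentioned in the text with a damped log-share update chosen here for brevity. All names and numerical values (`simulate_shares`, `solve_mean_utilities`, `chol`, `L_true`) are hypothetical.

```python
import math
import random

def simulate_shares(L, chol, draws):
    """Frequency simulator: fraction of draws in which each alternative has the highest utility."""
    K = len(L)
    counts = [0] * K
    for z in draws:
        # Correlated normal errors built from the lower-triangular Cholesky factor.
        utils = [L[j] + sum(chol[j][m] * z[m] for m in range(j + 1)) for j in range(K)]
        counts[utils.index(max(utils))] += 1
    return [c / len(draws) for c in counts]

def solve_mean_utilities(S_obs, chol, draws, damp=0.5, tol=2e-3, max_iter=200):
    """Search for mean utilities L (last element fixed at 0) whose simulated shares match S_obs."""
    K = len(S_obs)
    L = [0.0] * K
    for _ in range(max_iter):
        s = simulate_shares(L, chol, draws)
        if max(abs(a - b) for a, b in zip(S_obs, s)) < tol:
            break
        for j in range(K - 1):
            s_j = max(s[j], 1e-6)  # guard against log(0)
            # Damped update (an assumption of this sketch): raise L_j when the share is too low.
            L[j] += damp * (math.log(S_obs[j]) - math.log(s_j))
    return L

rng = random.Random(0)
K, R = 3, 2000
draws = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(R)]

# Hypothetical Cholesky factor: errors of alternatives 1 and 2 are positively correlated.
chol = [[1.0, 0.0, 0.0], [0.5, 0.866, 0.0], [0.0, 0.0, 1.0]]

# "Observed" shares are generated from known mean utilities so the recovery can be checked.
L_true = [0.5, -0.3, 0.0]
S_obs = simulate_shares(L_true, chol, draws)

L_hat = solve_mean_utilities(S_obs, chol, draws)
s_hat = simulate_shares(L_hat, chol, draws)
print(L_hat)
print(S_obs, s_hat)
```

Reusing the same draws for every evaluation keeps the simulated share function fixed across iterations, which is what allows the search to settle. A smoother simulator, such as those compared in Hajivassiliou et al. (1996), would be preferable in practice because the frequency simulator is a step function of the mean utilities.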

Step 5. Step 4 gives us the values $L_{jt}$ for all j and t. Now, returning to the expression for $L_{jt}$, we note that $L_{jt} = \alpha_j + \beta \ln(p_{jt}) + \gamma d_{jt} + \mu_{jt}$. If $\mathrm{corr}(p_{jt}, \mu_{jt}) = 0$, then we can obtain $\alpha_j$, $\beta$, and $\gamma$ by simply regressing $L_{jt}$ on intercepts, $\ln(p_{jt})$, and $d_{jt}$. However, given the possibility of correlation, instrumental variable methods are used instead. This completes the computation of the linear parameters, conditional on the initial choices of the nonlinear parameters.

Step 6. The error term $\mu_{jt}$ is computed as $L_{jt} - (\alpha_j + \beta \ln(p_{jt}) + \gamma d_{jt})$.

Step 7. The error term is then interacted with the instrument vector used in Step 5 to provide the GMM objective function. This objective function forms the basis of obtaining the nonlinear parameters, i.e., the outer loop.

Step 8. Minimizing the GMM objective function by iterating over the values of $\Omega_{K-1}$, $\sigma_j^2$, and $\sigma_\beta^2$ provides estimates for the nonlinear parameters. The corresponding values of $\alpha_j$, $\beta$, and $\gamma$ computed in Step 5 will give us the values of the linear parameters. The standard errors of the estimates can then be computed. We turn next to the formulation of the category regression model in instances in which it is difficult to quantify the sales of the outside good.

The Category Sales Regression

In the case of the probit model specification with K − 1 brands and no outside good, prices of the various brands have no influence on the total size of the category. To overcome this problem, we propose coupling the probit brand choice model with a category sales model. Denote by $Q_{jt}$ the sales of brand j in week t. Then the sales at the "category" (or subcategory level in our case) is nothing but the aggregation of sales across brands. The category sales in week t is given by $CQ_t = \sum_{j=1}^{K-1} Q_{jt}$. The category sales level will depend on the prices and promotions of the various brands in the category and also on factors such as seasonality. We compute category level price and promotion variables by share-weighting the prices and promotions of the individual brands (see Kim et al. 1995). Rather than use a weekly share weight, however, we compute the average share of each brand over the period of the data and use these as share weights. Therefore, variation in the dependent variable is not being used to create our independent variables. We denote the share-weighted price and promotion variables as $CP_t$ and $CR_t$. Now the category sales regression model is given as follows:

\[
\ln(CQ_t) = \omega + \nu \ln(CP_t) + \rho\, CR_t + \sum_{s=1}^{3} \lambda_s I_{st} + e_t. \tag{4}
\]

In the above equation, $\omega$, $\nu$, $\rho$, and $\lambda_s$ are parameters to be estimated. $I_{st}$ is an indicator variable taking the value 1 if week t is in season s and zero otherwise. The random error term is $e_t$. Estimation of the parameters of the above equation requires recognizing two important points. The first is that the category price is likely to be endogenous, i.e., potentially correlated with the error term. Furthermore, $e_t$ could be correlated with $\mu_{jt}$ from Equation (2). The first issue can be addressed by using instruments for category prices in addition to instruments for brand prices. The second issue can be addressed by augmenting the error vector $\mu_t$ used in the GMM with the error term $e_t$. The prediction of the aggregate sales of brand j in week t is obtained by multiplying the prediction from the category sales model with that obtained from the brand choice model. We have the following expression:

\[
Q_{jt} = \exp\Bigl\{\omega + \nu \ln(CP_t) + \rho\, CR_t + \sum_{s=1}^{3} \lambda_s I_{st} + e_t\Bigr\} \Bigl[\frac{1}{R} \sum_{r=1}^{R} \Phi(Z_{rjt}, \Omega_{j,K-1})\Bigr]. \tag{5}
\]

Note that it is important to account for the error terms $e_t$ and $\mu_{jt}$ when making predictions. This is similar in kind to the issue raised by Christen et al. (1997) in the context of log–log regression models. In the case in which the no-purchase option or the outside good is treated as an additional alternative, the sales of brand j in week t are given by the following expression. M in the equation refers to the total consumption associated with that category in each week (assumed to be invariant over time). For example, as described above, in the case of Nevo (2001), this is the potential consumption of cereal by the population of interest.

\[
Q_{jt} = M \Bigl[\frac{1}{R} \sum_{r=1}^{R} \Phi(Z_{rjt}, \Omega_{j,K})\Bigr]. \tag{6}
\]

Note from the above equation that the covariance matrix $\Omega_{j,K}$ is of dimension K, as opposed to K − 1, as there is one additional alternative, the outside good.

Estimation and Results

The data we use are for the shampoo product category. Because of the proprietary nature of the data, we are unable to reveal the actual identities of the brands. There are three brands in the specific subcategory chosen for the analysis and we refer to them as brands A, B, and C. We chose this subcategory because the managers at the firm releasing the data felt that these brands formed a distinct submarket in the category. The data are aggregated for the entire U.S. market. While it is important to consider issues of aggregation as described in Christen et al. (1997), market level data are routinely used for the investigation of competitive interactions. Weekly information over two years (104 weeks) is available for the three brands. Besides the sales levels of the brands, we also have their levels of prices and promotional activities over the 2-year period. In addition, we used seasonal dummies in the category sales regression. We assume that the only endogenous variable is price. The instruments we used are the following. From the Bureau of Labor Statistics, we obtained price indices for material (packaging as well as certain categories of chemicals used as ingredients in shampoos) and labor. We also used values of one period lag prices for all brands as instruments. Note that lagged prices can be problematic when there is serial correlation in the $\mu_{jt}$ term.

Table 1   Descriptive Statistics

                        Brand A                 Brand B                 Brand C
Variable                Mean       Std. Dev.    Mean       Std. Dev.    Mean       Std. Dev.
Sales (units)           906,697    132,412      1,316,436  1,557,747    1,639,378  214,326
Share                   0.235      0.024        0.341      0.029        0.424      0.036
Price (16-oz. bottle)   2.562      0.094        2.561      0.112        2.109      0.019
Promotion               1,790      6,892        622        2,525        8,822      18,396

Descriptive statistics of the data are in Table 1. The share data are conditional on purchase. They indicate that brand C is the biggest brand in this particular subcategory of the shampoo category. However, the smallest brand, brand A, has the highest coefficient of variation of the three brands. The average


prices of brands A and B are very close in magnitude to each other, although there appears to be greater variation in the prices of brand B. Brand C, the largest share brand has the lowest price. As this category is heavily promoted through manufacturer coupons, we use information on the couponing variable to capture promotional effects on sales. The variable is operationalized as the total value of coupons dropped in each week. Table 1 indicates that brand C drops the most coupons, followed by brands A and B. The low price of brand C coupled with its heavy couponing appear to contribute to its large share in the marketplace. In the estimation, we used current and lagged values of the couponing variable. We found that the only significant variable was the 1-week lagged value of coupons dropped. Hence, this is the only variable included in the subsequent estimation and results.

Estimation

In the estimation, we performed extensive sensitivity analyses pertaining to the number of draws from the heterogeneity distribution required. Based on this, we settled on 100 draws (R = 100) as being reasonable, as increasing the draws beyond this number did not affect the parameter estimates significantly. For the outside good models, we did not use any marketing variables in the utility specification. If the data are available, they can easily be incorporated into the analysis.

Results

The results are discussed as follows. First, we discuss the estimates obtained from the category regression/brand choice probit model. Next, we provide results from the probit model with an outside good included in the specification but without the category regression model. Finally, we discuss the results obtained from the comparison logit models. In Columns 2–5 of Table 2, we provide the results from the probit model with the category regression/brand choice formulation. Four different specifications were estimated. These are (i) without the unobserved attribute μ_jt that results in the endogeneity problem and without accounting for heterogeneity; (ii) accounting for the effects of endogeneity but not for the effects of heterogeneity; (iii) without the unobserved attribute μ_jt but accounting for heterogeneity in preferences as well as the price sensitivity parameter; and (iv) the most general case that accounts for both endogeneity as well as heterogeneity.

From Table 2, we see that brand A is the designated “base” brand with mean intrinsic preference level set to zero. The two specifications that account for endogeneity reveal positive mean intrinsic preferences for the two larger brands, B and C, as their estimates exceed zero. In the models that do not account for endogeneity, brand B has a lower mean intrinsic preference level than brand A. Note that the magnitudes of these and other estimates are not directly comparable because of differences in the estimated covariance matrices across the four specifications. Table 2 also reveals that the coefficients of the two marketing variables, price, and promotion have the right signs and are significant at the 5% level of significance across all the model specifications. In order to interpret the relative magnitudes of the price coefficients, we compute the corresponding elasticities that are presented later. The heterogeneity parameters in Table 2 indicate that there is some heterogeneity in the intrinsic brand preferences in the case of the most general model although the variances are not very large in magnitude. This implies that after one explicitly allows for non-IIA behavior at the individual consumer level via the probit specification, there appears to be little heterogeneity in intrinsic preferences. Later, we will contrast this finding with that obtained from the corresponding logit model specifications. The most heterogeneity we find is for the price sensitivity parameter obtained under the “with endogeneity and with heterogeneity” specification. In this case, the standard deviation of 0.242 is significantly different from zero. Note that there are three covariance parameters estimated as there are three brands. Hence, Ω_{j,K−1} is a 2 × 2 matrix with three unknown parameters in the Cholesky decomposition, of which only two parameters are uniquely identified. Hence, standard errors are not reported for the third parameter, as it is fixed.
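The parameter counts quoted for the covariance matrix follow from simple arithmetic on the Cholesky factor. The sketch below is illustrative only and is not the author's code: it assumes the probit errors are differenced against a reference alternative, so that K alternatives give a (K − 1) × (K − 1) matrix, and that exactly one Cholesky entry is fixed to set the scale. The function names and the numerical values of the factor are hypothetical.

```python
def cholesky_param_counts(n_alternatives):
    """Return (dimension, total, identified) for the differenced-error covariance.

    Assumption: utilities are differenced against a reference alternative,
    so n_alternatives choices give an (n - 1) x (n - 1) covariance matrix.
    """
    dim = n_alternatives - 1
    total = dim * (dim + 1) // 2   # lower-triangular entries, equal to K(K - 1)/2
    identified = total - 1         # one entry is fixed to set the utility scale
    return dim, total, identified


def covariance_from_cholesky(lower):
    """Return Omega = L L^T for a lower-triangular matrix given as nested lists."""
    n = len(lower)
    return [
        [sum(lower[i][k] * lower[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Three brands, no outside good.
counts_three_brands = cholesky_param_counts(3)
# Three brands plus an outside good (four alternatives).
counts_with_outside_good = cholesky_param_counts(4)

# Hypothetical Cholesky factor, chosen only for illustration.
example_factor = [[1.0, 0.0],
                  [0.5, 2.0]]
omega = covariance_from_cholesky(example_factor)

print(counts_three_brands)
print(counts_with_outside_good)
print(omega)
```

With three brands this gives a 2 × 2 matrix with three Cholesky parameters, two of them identified; with the outside good added as a fourth alternative it gives a 3 × 3 matrix with six parameters, five of them identified.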


Table 2 Parameter Estimates and Standard Errors for Shampoo Data*

Category Regression/Brand Choice Model (Columns 2–5)

                         No Endogeneity      With Endogeneity    No Endogeneity      With Endogeneity
Variable                 No Heterogeneity    No Heterogeneity    With Heterogeneity  With Heterogeneity
Brand B                  −0.204 (0.024)      0.028 (0.031)       −0.150 (0.026)      0.034 (0.032)
Brand C                  0.114 (0.019)       0.007 (0.025)       0.062 (0.021)       0.034 (0.026)
Price                    −0.570 (0.121)      −0.298 (0.137)      −0.404 (0.120)      −0.428 (0.135)
Promotion**              0.007 (0.001)       0.004 (0.002)       0.005 (0.001)       0.004 (0.002)
Covariance Parameter 1   0.460***            0.083***            0.236***            0.054***
Covariance Parameter 2   1.383 (0.079)       0.287 (0.111)       0.960 (0.113)       0.279 (0.128)
Covariance Parameter 3   0.548 (0.082)       0.305 (0.129)       0.383 (0.095)       0.384 (0.134)
σβ (Price)               --                  --                  0.049 (0.021)       0.242 (0.029)
σA (brand A)             --                  --                  0.008 (0.010)       0.031 (0.014)
σB (brand B)             --                  --                  0.016 (0.012)       0.005 (0.015)
σC (brand C)             --                  --                  0.007 (0.009)       0.048 (0.016)

Intercept**              0.161 (0.033)       0.159 (0.056)       0.161 (0.033)       0.160 (0.056)
Price                    −0.011 (0.002)      −0.009 (0.003)      −0.011 (0.003)      −0.010 (0.003)
Promotion**              0.0002 (0.00005)    0.0002 (0.0001)     0.002 (0.00006)     0.0002 (0.0001)

Outside Good Model (Columns 7–8)

                            With Endogeneity    With Endogeneity
Variable                    No Heterogeneity    With Heterogeneity
Brand A                     −0.010 (0.004)      0.017 (0.005)
Brand B                     0.002 (0.005)       0.032 (0.004)
Brand C                     −0.007 (0.003)      0.009 (0.004)
Price                       −0.142 (0.059)      −0.171 (0.062)
Promotion                   0.003 (0.001)       0.003 (0.001)
Covariance Parameter 1***   0.196               0.179
Covariance Parameter 2      −0.167 (0.079)      −0.167 (0.081)
Covariance Parameter 3      0.001 (0.065)       −0.0001 (0.071)
Covariance Parameter 4      0.252 (0.081)       0.240 (0.111)
Covariance Parameter 5      0.145 (0.066)       0.144 (0.051)
Covariance Parameter 6      0.143 (0.071)       0.134 (0.066)
σβ (Price)                  --                  0.058 (0.021)
σA (brand A)                --                  0.004 (0.032)
σB (brand B)                --                  0.073 (0.022)
σC (brand C)                --                  0.027 (0.027)

*The estimates for the seasonality parameters are not reported, as they were not significant at the 5% level.
**Promotion variable was multiplied by 1e−4 and log(Category Sales, Table 3) was multiplied by 0.01 in the estimation.
***Fixed in the estimation. Note that only (K·(K − 1)/2) − 1 parameters of the covariance matrix are identified.
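The significance statements made about the heterogeneity parameters can be checked directly from the estimates and standard errors in Table 2. The sketch below is a rough check, not the paper's testing procedure: it assumes a simple two-sided t-ratio with a 1.96 critical value, which is only an approximation for standard deviation parameters that are bounded at zero. The dictionary keys and function names are made up for the illustration; the numbers are copied from the outside good, with-heterogeneity column and from the 0.242 (0.029) price entry of the most general category regression/brand choice specification.

```python
# Heterogeneity (standard deviation) estimates and standard errors,
# copied from the outside good, with-heterogeneity column of Table 2.
outside_good_sigmas = {
    "sigma_beta (price)": (0.058, 0.021),
    "sigma_A (brand A)": (0.004, 0.032),
    "sigma_B (brand B)": (0.073, 0.022),
    "sigma_C (brand C)": (0.027, 0.027),
}

CRITICAL_VALUE = 1.96  # assumed two-sided 5% normal critical value


def t_ratio(estimate, std_error):
    """Ratio of a point estimate to its standard error."""
    return estimate / std_error


def insignificant(params, critical=CRITICAL_VALUE):
    """Names of parameters whose absolute t-ratio is below the critical value."""
    return [
        name
        for name, (estimate, std_error) in params.items()
        if abs(t_ratio(estimate, std_error)) < critical
    ]


not_significant = insignificant(outside_good_sigmas)

# Price standard deviation in the most general category regression/brand
# choice specification: estimate 0.242, standard error 0.029.
price_sd_t = t_ratio(0.242, 0.029)

print(not_significant)
print(round(price_sd_t, 2))
```

Under this assumption, two of the four outside good heterogeneity parameters (brands A and C) fall below the critical value, while the price standard deviation of 0.242 has a t-ratio of about 8.3.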

Table 2 provides the parameter estimates obtained from the category regression model under each specification. A priori, we would expect the parameters from the two “no-endogeneity” specifications to resemble each other and those from the two “with-endogeneity” models to be similar as heterogeneity has no impact on the category sales regressions. Indeed the results reflect this, although results from all four specifications are quite similar to one another. One of the things we also find is that seasonality does not play a major role in this product category.

Table 3 Price Elasticities (Standard Errors) from the Various Models

Columns:
  Probit Models, Category Regression/Brand Choice (Brand Choice | Total*):
    (2) | (3)   No Endogeneity & No Heterogeneity
    (4) | (5)   Only Endogeneity
    (6) | (7)   Only Heterogeneity
    (8) | (9)   Endogeneity & Heterogeneity
  Probit Models, Outside Good:
    (10)        Only Endogeneity
    (11)        Endogeneity & Heterogeneity
  Logit Models, Endogeneity & Heterogeneity:
    (12)        Category Regression/Brand Choice
    (13)        Outside Good

Share of ...    (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)     (11)     (12)     (13)

Effect of Brand A's Price on
Brand A         −1.194   −1.422   −1.491   −1.689   −1.223   −1.487   −1.930   −2.179   −1.457   −1.669   −1.437   −2.004
                (0.089)  (0.157)  (0.117)  (0.168)  (0.091)  (0.155)  (0.118)  (0.170)  (0.155)  (0.167)  (0.133)  (0.151)
Brand B         0.139    −0.092   0.474    0.273    0.155    0.113    0.735    0.479    0.069    0.102    0.146    0.334
                (0.061)  (0.088)  (0.094)  (0.140)  (0.060)  (0.085)  (0.094)  (0.138)  (0.043)  (0.053)  (0.058)  (0.079)
Brand C         0.550    0.317    0.433    0.232    0.552    0.283    0.470    0.214    0.026    0.049    0.182    0.194
                (0.115)  (0.139)  (0.132)  (0.177)  (0.110)  (0.132)  (0.135)  (0.182)  (0.031)  (0.033)  (0.057)  (0.082)

Effect of Brand B's Price on
Brand A         0.206    −0.143   0.708    0.403    0.230    −0.155   1.093    0.724    0.104    0.053    0.216    0.512
                (0.097)  (0.133)  (0.111)  (0.158)  (0.095)  (0.130)  (0.110)  (0.160)  (0.027)  (0.039)  (0.071)  (0.088)
Brand B         −0.434   −0.780   −1.020   −1.320   −0.443   −0.826   −1.381   −1.740   −1.031   −1.230   −1.405   −1.902
                (0.083)  (0.109)  (0.146)  (0.179)  (0.083)  (0.103)  (0.145)  (0.183)  (0.118)  (0.132)  (0.157)  (0.183)
Brand C         0.242    −0.107   0.447    0.142    0.236    −0.149   0.528    0.162    0.049    0.059    0.224    0.365
                (0.119)  (0.156)  (0.139)  (0.188)  (0.118)  (0.155)  (0.139)  (0.186)  (0.024)  (0.028)  (0.073)  (0.081)

Effect of Brand C's Price on
Brand A         0.992    0.562    0.789    0.414    0.997    0.530    0.851    0.410    0.048    0.089    0.329    0.380
                (0.138)  (0.222)  (0.151)  (0.200)  (0.137)  (0.217)  (0.150)  (0.196)  (0.030)  (0.033)  (0.065)  (0.077)
Brand B         0.294    −0.133   0.544    0.171    0.287    −0.177   0.644    0.204    0.059    0.072    0.274    0.453
                (0.126)  (0.208)  (0.144)  (0.181)  (0.126)  (0.211)  (0.143)  (0.183)  (0.029)  (0.035)  (0.069)  (0.074)
Brand C         −0.793   −1.215   −0.882   −1.251   −0.790   −1.249   −1.004   −1.437   −0.925   −1.035   −1.305   −1.697
                (0.099)  (0.157)  (0.103)  (0.159)  (0.100)  (0.161)  (0.105)  (0.161)  (0.135)  (0.159)  (0.198)  (0.211)

*Brand Choice: Brand Choice Elasticity; Total: Category Sales and Brand Choice Elasticity.

In Table 3, Columns 2–9, we present the elasticity estimates from the four specifications. For each specification, we present two sets of elasticities. The first column corresponds to the brand choice elasticities. The second column contains the total sales or demand elasticities. There are several interesting points to note from Table 3:

(1) None of the cross elasticities are subject to the IIA restriction, even those that come from models that do not account for heterogeneity. This is because of the probit model specification at the individual consumer level. The finding is in contrast with studies such as Berry et al. (1995) and Nevo (2001), where heterogeneity is required to break the IIA restriction due to the logit assumption on brand choices.

(2) As expected, the own elasticity of sales is larger (in magnitude) than the own elasticity of brand choice. However, the cross-elasticities of demand, when positive, are larger for brand choice than for sales. Intuitively, this is because in the case of the own sales elasticity, if the price of a brand increases, consumers can switch to another brand or to not buying at all. Hence, the own elasticity of sales is larger than if consumers are forced to switch to one of the other brands (the brand choice elasticity). With the cross-sales elasticities, an increase in price of brand A implies fewer consumers switch to brands B and C because consumers can also switch to not buying at all. This is not the case for the cross brand choice elasticities.

(3) Comparing own elasticities across model specifications, we find that ignoring either endogeneity or heterogeneity tends to bias the elasticities towards zero. Specifically, the following relationships appear:
    (a) Elasticity (No Endogeneity and No Heterogeneity) < Elasticity (With Endogeneity and No Heterogeneity);
    (b) Elasticity (No Endogeneity and No Heterogeneity) < Elasticity (No Endogeneity and With Heterogeneity).
Furthermore, we also find that accounting for either endogeneity or heterogeneity does not suffice and it is important to account for both these issues in the estimation. Specifically,
    (a) Elasticity (With Endogeneity and No Heterogeneity) < Elasticity (With Endogeneity and With Heterogeneity);
    (b) Elasticity (No Endogeneity and With Heterogeneity) < Elasticity (With Endogeneity and With Heterogeneity).

(4) Not accounting for endogeneity seems to have a bigger impact than not accounting for unobserved heterogeneity in this category.

(5) Looking at the own sales elasticities from the most general model (accounting for both endogeneity and heterogeneity), we find that brand A is the most price elastic followed by brands B and C. We find that this ordering is preserved when endogeneity is accounted for. However, in the other two cases we find the ordering of brands B and C interchanged even after accounting for the effects of heterogeneity. This further underscores the need to account for these phenomena when estimating the parameters of demand functions.

(6) Examining the cross-sales elasticities from the with endogeneity and heterogeneity model, we find that brand A prices have a bigger impact on the sales of brand B than on brand C. Brand B’s price, consistent with that of brand A, has a bigger effect on the sales of that brand than on the sales of brand C. Also, brand A sales seem to be affected the most by brand C’s prices with brand B’s sales being affected less. These cross-elasticities provide insights into the nature of interbrand price competition in this market.

(7) Note that some of the total sales cross-elasticities have the wrong signs under the two specifications that do not account for endogeneity. The reason for this is that the corresponding brand choice elasticities are biased toward zero due to not accounting for endogeneity (and, as noted in (4), this has a bigger impact than not accounting for heterogeneity). Recall that the total cross-elasticity is the category sales elasticity + the brand choice cross elasticity. The category elasticity is negatively signed whereas the brand choice cross-elasticity is positively signed. When the latter is biased towards zero, the sum in certain cases turns out to be negative. Note that this is not the case with the brand choice cross-elasticities.

Having discussed the results from the brand choice/category regression specification for the probit


model, we turn next to the specification in which an outside good is included in the individual-level choice model to capture the no-purchase behavior of consumers. This obviates the need for a category regression equation. Hence, the model is identical to the brand choice component of the previous specification with an additional alternative. In Table 2, Columns 7–8, we present the parameter estimates and their standard errors for this formulation. Given the relative importance of accounting for endogeneity found with the previous specification, we focus only on the two formulations that account for endogeneity, with and without accounting for heterogeneity.

Note from Table 2 that we now have three brand intercepts, one for each brand. The reason is that we now have four alternatives, the three brands and the outside good, and so three intercepts are identified. The outside good is specified as the base brand in this case. Also note that we have six covariance parameters (of which five are uniquely identified) rather than three as in the previous formulation. The reason is that Ω_{j,K−1} is now a 3 × 3 matrix with six unknown parameters in the Cholesky decomposition. The results are largely consistent with those from the category regression/brand choice model. We note once again that the price and promotion parameters are not directly comparable across specifications because of differences in the estimated covariance matrices. We do note however, that the parameters corresponding to the heterogeneity distribution are small and are not significantly different from zero in two of the four cases. Even the standard deviation parameter for price that had an estimated coefficient of 0.242 is only 0.058 in this case. Again, we caution that the numbers are not directly comparable. Nevertheless, they seem to indicate a small effect of heterogeneity in this case. To verify this, we provide in Table 3 (Columns 10–11) the price elasticities from the two specifications. We note the following from these estimates.

(1) Consistent with our previous results, we find that not accounting for the effects of heterogeneity does bias the estimated elasticities towards zero in this case as well.

(2) The relative ordering of the own elasticities is also the same as previously found with brand A being the most price sensitive followed by brands B and C.

(3) The own price elasticities seem to be smaller in this case as compared to the most general model under the category regression/brand choice specification (Column 9 in Table 3). In particular, the own elasticities seem closer to zero by roughly 0.4–0.5 for all three brands. What this implies is that the category elasticities corresponding to this sales specification are smaller than those obtained when category sales were modeled explicitly as a function of category level marketing activities.

(4) Furthermore, the cross-price elasticities are also very small in magnitude, especially compared to those in Columns 2–9. It must be noted that previous studies that have examined the purchase incidence and brand choice decisions of households have also obtained small cross-elasticities relative to own elasticities (Chintagunta 1993).

Taken together, these results imply that the estimated price elasticities are sensitive to the model specification. The choice of specification will come down to a trade-off between wanting a fully structural interpretation of the model versus not having to make assumptions that determine the total size of the category. For example, if one does have data on the entire category’s sales, then this information can be exploited in defining the outside good. However, in the absence of such information, the proposed category regression/brand choice model may be preferred.

Model Comparison: Logit Model

Having discussed the results from two different probit specifications, we turn next to the logit model to see whether implications obtained are similar to those obtained for the probit model. Accordingly, in Table 3 (Columns 12–13) we provide the price elasticities obtained from the two logit specifications. The first is a purchase incidence/brand choice model similar to the nested logit model. This is the specification discussed in Chintagunta (1993) except that we allow for the price coefficient to be different from −1. This specification treats the no-purchase option to be distinct from the alternatives in the category under consideration. Hence, even in the absence of heterogeneity the substitution pattern between one of the


brands and the outside good is different from that between two brands. The second specification is the category regression/brand choice model, whose direct probit counterpart we have discussed previously. Under both specifications, we account for endogeneity as well as for heterogeneity.

We draw the following inferences from the elasticities in Table 3 (Columns 12 and 13).

(1) Under both specifications, brand A has the highest elasticity, followed by brands B and C in that order. This is consistent with the results obtained from the probit model specifications.

(2) Comparing across specifications, we find that the own elasticities obtained from the purchase incidence/brand choice model are smaller than those obtained from the category regression/brand choice model. Note that the former specification requires an assumption on category consumption much like the probit model with the outside good. Furthermore, the magnitude of difference in elasticities is roughly comparable to the differences from the corresponding probit models.

(3) Comparing the elasticities from the category regression/brand probit and logit models in Table 3, we find that these elasticities are quite comparable in their magnitudes. The own elasticities for the three brands under the probit specification are −2.179, −1.740, and −1.437. The corresponding elasticities from the logit model are −2.004, −1.902, and −1.679. It appears from these numbers that the logit elasticities vary over a smaller range than the probit elasticities. In other words, optimal margins for the manufacturers under the probit specification will lead to a wider range in margins than under the logit specification. Performing this computation, we find the margin ((price − cost)/price) for the three brands under the probit specification to be 46%, 57%, and 69%. Under the logit specification, we obtain 50%, 53%, and 60%. Similarly, the cross-elasticities range from 0.194 to 0.512 under the logit specification, whereas they range from 0.162 to 0.724 under the probit model.

(4) A comparison of the elasticities from the probit outside good model from Table 3 (Column 11) with the logit purchase incidence/brand choice model in Table 3 (Column 12) reveals a pattern similar to that of the comparison described above. However, in this case it appears that the logit own price elasticities across the three brands are very close to one another, ranging only from −1.305 for brand C to −1.437 for brand A. Hence, it appears that the logit assumption on brand choice probabilities may be restricting the range of elasticities estimated from the data. This provides further motivation for using the probit model to characterize demand when studying competitive interactions among firms.

To summarize, the model comparison results indicate that while the elasticities from the logit and probit specifications are roughly comparable, there are some differences that exist. As these differences have implications for optimal pricing behavior, they are of interest to researchers studying competitive behavior at the firm level. Given the more flexible nature of the choice model under the probit specification, one can, for these data, conclude that this is a more appropriate specification. This is notwithstanding the flexibility imparted to the logit model by the distribution of heterogeneity imposed. We also carried out a predictive validation exercise on four holdout weeks for the logit and probit outside good models. Note that predictions with such models require us to also integrate over the distribution of the unobserved attribute, as we do not observe these terms for the holdout data. We use the empirical distribution for the purpose and make predictions at each of the 104 unobserved attribute values from the estimation sample. The average share for each brand across the 104 values is computed for each hold out week. The mean absolute percentage error from the logit model is 27% and that for the probit model is 23%.

Model Comparison: Log–Log Regression Model

We also estimated the parameters from a log–log regression model that is most comparable to our model specification (i.e., using the same set of variables). We also estimated the linear and semilog regression models, as some of these specifications are more appropriate for pricing purposes. The price elasticities from the log–log model are in Table 4, with those from the other specifications being substantively similar. We find from Table 4 that the own-price elasticity for brand B has the incorrect sign, while that for brand A is not significantly different from zero. Of the cross-elasticities, two are positive, two negative, and two are not significantly different from zero. As noted in the introduction, using these elasticities as the basis for strategic pricing decisions can be problematic.

Table 4 Price Elasticities from Log–Log Regression Model

Sales of / Price of →   Brand A            Brand B            Brand C
Brand A                 −0.129 (0.599)     −1.198 (0.489)     1.232 (0.813)
Brand B                 −1.439 (0.568)     1.799 (0.601)      −0.836 (0.885)
Brand C                 2.205 (0.532)      −0.347 (0.545)     −2.633 (0.821)

Conclusions

In this paper, we have proposed the probit model as an alternative to the logit model to specify the aggregate demand functions of firms competing in oligopoly markets. The primary benefit that accrues from using the probit model is an avoidance of the IIA property at the individual consumer level that enables us to distinguish between the effects of IIA violations and the effects of heterogeneity at the aggregate level. In the estimation of the model parameters, we account for two critical issues that have received recent attention in the marketing literature. These are endogeneity of marketing variables and heterogeneity across consumers. The endogeneity problem arises because of unobserved factors that are firm- and time period-specific (but invariant across consumers) that could be correlated with price. Consumer heterogeneity is accounted for by assuming that brand preferences and price sensitivities vary across consumers following a parametric distribution. The individual level choice probabilities are aggregated across heterogeneous consumers, and the aggregated demand function is taken to the data.

Our results indicate that both endogeneity as well as heterogeneity need to be accounted for even after allowing for a non-IIA specification at the individual consumer level. We also find that ignoring endogeneity has a bigger impact on the estimated price elasticities than ignoring the effects of heterogeneity. A comparison of the elasticities obtained from the probit model with those from the corresponding logit specification indicates that while the elasticities appear to be comparable in magnitude, there is one key difference. We find that the range of elasticities obtained from the probit model across brands is larger than that obtained from the logit. This finding could stem in part from the probit model, allowing for different error variances across brands.

In addition to specifying a probit model and providing comparisons with the logit model, the paper also addresses the issue of the specification of the “outside good” that arises when using discrete choice models to specify demand. We propose a simple alternative to this specification by decomposing the demand for a brand into a category demand equation and a conditional brand choice share equation. We provide a comparison of results from this specification to those from the outside good specification and find that estimated elasticities are sensitive to the specification used.

One of the key limitations of the proposed model is that while it accounts for the purchase incidence and brand choice decisions of households, it does not account for differences across consumers in their purchase quantities. The model and analysis are best suited for product categories in which consumers typically make only single-unit purchases. Another limitation is more practical in nature. While recent advances have been made in computing probit probabilities, it could nevertheless be a challenge to do so when the number of alternatives is large.

In summary, this study has proposed a probit demand model as an alternative to the logit model that can be used as a basis to investigate competitive interactions among firms in a product market. We examine the sensitivity of the estimated price elasticities to the specification of the no-purchase alternative in these models. The estimation of the model parameters accounts for endogeneity and heterogeneity. Our


model results obtained from the analysis of the shampoo product category indicate that the proposed specification is a promising alternative to existing methods used for the purpose.

Acknowledgments

The author thanks the editor, the area editor, and two anonymous reviewers for their comments and suggestions. He is grateful to J. P. Dube for useful discussions. The author thanks the Kilts Center at the University of Chicago for partial funding of this research.

References

Allenby, G. M., P. E. Rossi. 1999. Marketing models of consumer heterogeneity. J. Econometrics 89 57–78.
Berry, S. T., M. Carnall, P. T. Spiller. 1998. Airline hubs: Costs and markups and the implications of consumer heterogeneity. NBER Working Paper 5561.
Berry, S. T., J. Levinsohn, A. Pakes. 1995. Automobile prices in market equilibrium. Econometrica 63 (4) 841–890.
Besanko, D., S. Gupta, D. Jain. 1998. Logit demand estimation under competitive pricing behavior: An equilibrium framework. Management Sci. 44 (11) 1533–1547.
Chintagunta, P. K. 1993. Investigating purchase incidence, brand choice and purchase quantity decisions of households. Marketing Sci. 12 (2) 184–208.
Chintagunta, P. K., B. E. Honore. 1996. Investigating the effects of marketing variables and unobserved heterogeneity in a multinomial probit model. IJRM 13 1–15.
Christen, M., S. Gupta, J. Porter, R. Staelin, D. R. Wittink. 1997. Using market-level data to understand promotion effects in a nonlinear model. J. Marketing Res. 34 322–334.
Currim, I. S. 1982. Predictive testing of consumer choice models not subject to independence of irrelevant alternatives. J. Marketing Res. 19 (2) 208–222.
Hajivassiliou, V. A., D. L. McFadden, P. A. Ruud. 1996. Simulation of multivariate normal rectangle probabilities and their derivatives: Theoretical and computational results. J. Econometrics 72 (1–2) 85–134.
Jain, D. C., N. J. Vilcassim, P. K. Chintagunta. 1994. A random-coefficients logit brand-choice model applied to panel data. J. Bus. Econom. Statist. 12 (July) 317–328.
Kadiyali, V. 1996. Entry, its deterrence, and its accommodation: A study of the U.S. photographic film industry. Rand J. Econom. 27 (3) 452–478.
Kim, B-D., R. C. Blattberg, P. E. Rossi. 1995. Modeling the distribution of price sensitivity and implications for optimal retail pricing. J. Bus. Econom. Statist. 13 (3) 291–303.
Nevo, A. 2001. Measuring market power in the ready-to-eat cereal industry. Econometrica 69 (2) 307–342.
Sudhir, K. 2001. Competitive pricing behavior in the auto market: A structural analysis. Marketing Sci. 20 (1) 42–60.
Villas-Boas, M., R. Winer. 1999. Endogeneity and brand choice models. Management Sci. 45 (10).

This paper was received December 20, 1999, and was with the author 9 months for 4 revisions; processed by Greg Allenby.
