Chintagunta 2001 Endogeneity and Heterogeneity in A Probit Demand Model Estimation Using Aggregate Data
Marketing Science
© 2001 INFORMS
Research Note
Endogeneity and Heterogeneity in a Probit Demand Model: Estimation Using Aggregate Data
Downloaded from informs.org by [103.141.126.88] on 18 May 2024, at 10:36 . For personal use only, all rights reserved.
Pradeep K. Chintagunta
Graduate School of Business, University of Chicago, 1101 East 58th Street, Chicago, Illinois 60637
[...] motivation behind these studies is to measure market power of firms and to understand interfirm competitive behavior. For example, Kadiyali (1996) studies the competitive interactions between Kodak and Fuji in the U.S. market to investigate whether or not the rivalry between these two firms is as intense as indicated by the popular press. Sudhir (2001) looks at the competitive interactions among firms within various segments of the automobile industry to determine whether these interactions vary significantly across product segments. Nevo (2001) investigates the extent of pricing rivalry in the ready-to-eat breakfast cereal market to determine whether observed prices reflect market power associated with product differentiation or collusion by firms in the industry.

The fundamental building block for the analysis of firm behavior is the demand function for each of the players in the marketplace. The demand functions relate the sales of the brands to their prices, promotions, and other marketing variables. Two issues that have become increasingly important while estimating the parameters of such aggregate demand functions to study firm behavior are the endogeneity of marketing activities (typically, price) and the heterogeneity across consumers in the market under consideration. The endogeneity problem arises when there are variables for which data are not available (such as shelf space allocation, shelf location, store coupons, etc.) that could influence a brand’s sales in a given week and these variables are correlated with the included marketing variables such as price (lowering the price of a brand in a given week may be accompanied by giving it more shelf facings). These other marketing activities are part of the error term in the estimation, and the correlation between the price variable and the error term results in the endogeneity problem. Not accounting for this correlation will give incorrect estimates for the effects of the included marketing variables. The issue with heterogeneity is the same as it is with house- [...] estimation of the demand function parameters can lead to biased and inconsistent estimates for the marketing activities.

The issues of endogeneity and heterogeneity have achieved prominence in large measure because of the increasing popularity of discrete-choice models to specify the demand functions to study firm behavior. The aforementioned studies by Sudhir and Nevo, along with others by Berry et al. (1995), have used discrete-choice-based demand functions. The main advantages of discrete-choice models are: (i) They are derived from utility maximizing behavior of consumers in the marketplace. (ii) They require estimation of fewer parameters, as compared to linear (and log–log and semilog) demand functions (instead of estimating 100 price parameters in a market with 10 brands, usually a single price parameter is estimated). (iii) They seldom result in parameter estimates with incorrect signs for own and cross effects, as is the case with linear demand systems and their variants.

The most widely used specification of discrete-choice demand function in such studies thus far has been the logit model. To allow the total demand for the category to vary over time, the model treats the no-purchase option or the “outside good” as an additional alternative available in the choice set. The specification embodies all the advantages of discrete-choice models noted above. Additionally, it is easy to estimate. All these advantages appear to justify the model’s widespread use in the literature. Hence, researchers have used the logit demand model and have accounted for the issues of endogeneity and heterogeneity while estimating the parameters of these models with aggregate store (Besanko et al. 1998), chain, or market (Sudhir 2001) data. Accounting for heterogeneity in the logit model also alleviates the problem of restrictive cross-elasticities that are obtained from this model because of the “independence of irrelevant alternatives” (IIA) property at the individual consumer level (see the discussion in Nevo 2001).
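The IIA restriction on cross-elasticities can be illustrated numerically. The following is a minimal sketch, assuming a homogeneous logit with an outside good and utilities of the form alpha_j + beta * ln(p_j). The intercepts, price coefficient, and prices are hypothetical values chosen only for illustration; they are not estimates from this paper.

```python
import math

def logit_shares(alpha, beta, prices):
    """Logit shares with an outside good whose utility is normalized to 0."""
    v = [a + beta * math.log(p) for a, p in zip(alpha, prices)]
    denom = 1.0 + sum(math.exp(x) for x in v)
    return [math.exp(x) / denom for x in v]

def cross_elasticity(alpha, beta, prices, j, k, h=1e-6):
    """Finite-difference elasticity of brand j's share w.r.t. brand k's price."""
    bumped = list(prices)
    bumped[k] *= 1.0 + h  # raise brand k's price by a fraction h
    s0 = logit_shares(alpha, beta, prices)[j]
    s1 = logit_shares(alpha, beta, bumped)[j]
    return (s1 - s0) / s0 / h

alpha = [0.0, 0.4, 1.0]   # hypothetical intercepts for brands A, B, C
beta = -2.0               # hypothetical price coefficient
prices = [3.0, 3.1, 2.5]  # hypothetical prices

# Response of A's and B's shares to a change in C's price.
e_A_wrt_C = cross_elasticity(alpha, beta, prices, j=0, k=2)
e_B_wrt_C = cross_elasticity(alpha, beta, prices, j=1, k=2)
```

Under these assumptions the two cross-elasticities coincide (both equal -beta times brand C's share), whatever the intercepts of brands A and B. This is the restrictive substitution pattern that heterogeneity, or a probit specification, is meant to relax.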
While the logit model with an outside good (and accounting for endogeneity and heterogeneity in the estimation with aggregate data) has seen widespread application in the marketing and economics literature, little research has been devoted to analyzing the sensitivity of the results obtained to some of the restrictions associated with this specification. Our goal in this paper is to investigate two of these restrictions. The first is the IIA restriction of the logit model at the individual consumer level. Models such as the probit (Currim 1982) do not suffer from the IIA problem at the individual level. As noted above, aggregate elasticities from the logit model that accounts for heterogeneity are indeed not subject to the IIA restriction. Nevertheless, researchers who have estimated logit and probit models with household data have found the aggregate elasticities from these two specifications to be different even after accounting for the effects of heterogeneity (see Chintagunta and Honore 1996). So the question that arises is: Are the aggregate elasticities obtained from the logit and probit models similar when using aggregate data in the estimation and after accounting for the effects of both heterogeneity and endogeneity? An answer to this question is important, as the elasticities directly influence the measure of market power.

The second issue we investigate is the modeling of the “no-purchase” option as an additional alternative in the logit model. Inclusion of the outside good in the estimation requires the shares of each of the alternatives, including that for the outside good or “no-purchase” alternative, to be known. With consumer level data, where one observes whether or not a household purchases the product category, computing the shares is straightforward (Chintagunta 1993). With aggregate data, we only observe the sales or shares of the brands but do not observe the aggregate fraction of consumers not buying a product category in a given week. Hence, we need to assume the total potential size of the market in each week to compute the share of the outside good (Nevo 2001 assumes that everyone living in the market area consumes the equivalent of one helping of cereal each day). In this paper, we present a simple alternative to the approach used in previous studies. Rather than attempt to model the purchase incidence and brand choice decisions simultaneously, we employ the approach proposed by Kim et al. (1995). Category sales are modeled as a regression of the total demand across brands on category level marketing activities. Brand shares are obtained as an aggregation of consumers’ conditional (on category purchase) brand choice probabilities as in previous studies such as those by Berry et al. (1995), Nevo (2001), Sudhir (2001), etc. The advantage of this methodology is that we only use information on the brands that is directly obtained from the marketplace, i.e., sales, prices, and other marketing activities of brands. A disadvantage is that the model can no longer be given a fully “structural” interpretation, as the category regression is a reduced-form approach to modeling a piece of the consumer’s decision problem (the purchase incidence decision). We provide a comparison of results obtained from the two methods for our data to investigate the sensitivity of the elasticity estimates to the definition of the outside good.

The remainder of this paper is organized as follows. In the next section, we describe the estimation of the aggregate probit model that accounts for heterogeneity as well as price endogeneity. We then provide the results from the empirical analysis using market data on shampoo purchases. A comparison with the logit specification and an investigation of the sensitivity to alternative assumptions on the no-purchase option are provided. The final section concludes with some directions for future research using the methodology.

Model Formulation and Estimation Strategy

We begin with the basic probit model at the household (we use consumer and household interchangeably here) level. We then describe the category level regression model. Our description of the probit model assumes the presence of $K$ “brands.” This can be interpreted as $K - 1$ brands and one no-purchase option (in the case where the outside alternative is part of the choice process) or as $K$ brands with a separate category sales regression. In the former case, the $K$th brand will not have any marketing variables associated with it. Specifically, the indirect utility of con- [...]

[...] form expression and represents a $K - 1$ dimensional integral. However, there are several approaches to computing the integral to a high degree of accuracy. See Hajivassiliou et al. (1996) for a comparison of [...]
[...] product $j$ includes attribute $w$ or zero otherwise, and $\alpha_{iw}$ is the preference value that consumer $i$ has for attribute $w$. Then, Equation (3) can be written as $\alpha_{ij} = \sum_{w=1}^{W} \alpha_w I_{jw} + \sum_{w=1}^{W} \epsilon_{iw} I_{jw}$. Even if the $\epsilon_{iw}$ [...]

[...] approach used closely parallels that of Berry et al. (1995) and Nevo (2001), with one important difference (that we discuss subsequently).

Step 1. Decompose $Y_{ijt}$ as $Y_{ijt} = (\alpha_j + \beta \ln(p_{jt}) +$ [...]
[...]plies, and we need to use standard nonlinear optimization methods to carry out the minimization. This step can be summarized as follows (note $L_t$ is the vector $\{L_{1t}, L_{2t}, \ldots, L_{Kt}\}$).

$$
\min_{L_t} \left[ S_{jt} - \int_{\eta_{j,Kt} \le Z_{j,Kt}} \cdots \int_{\eta_{j,(j+1)t} \le Z_{j,(j+1)t}} \int_{\eta_{j,(j-1)t} \le Z_{j,(j-1)t}} \cdots \int_{\eta_{j,1t} \le Z_{j,1t}} \phi(\eta_{j,1t}, \ldots, \eta_{j,(j-1)t}, \eta_{j,(j+1)t}, \ldots, \eta_{j,Kt}) \, d\eta_{j,1t} \cdots d\eta_{j,(j-1)t} \, d\eta_{j,(j+1)t} \cdots d\eta_{j,Kt} \right]
$$
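The idea of this inversion step, finding the mean utility at which the model share matches the observed share, can be sketched in a deliberately simplified setting. The code below assumes only two alternatives, so that the $K - 1$ dimensional integral collapses to a univariate normal CDF, and it averages that CDF over simulated heterogeneity draws. Bisection stands in for the "standard nonlinear optimization methods" mentioned in the text. The function names, the heterogeneity scale `sigma`, and the observed share of 0.35 are all hypothetical.

```python
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def simulated_share(mean_utility, sigma, draws):
    """Average binary-probit choice probability over heterogeneity draws."""
    return sum(PHI(mean_utility + sigma * v) for v in draws) / len(draws)

def invert_share(observed_share, sigma, draws, lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection: find the mean utility whose simulated share matches the data.

    Works because the simulated share is increasing in the mean utility.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if simulated_share(mid, sigma, draws) < observed_share:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(100)]  # R = 100 draws, held fixed

# Hypothetical observed share and heterogeneity scale.
L_hat = invert_share(observed_share=0.35, sigma=0.5, draws=draws)
```

Holding the draws fixed across evaluations keeps the simulated share a smooth, monotone function of the mean utility, which is what makes a simple root-finder adequate in this one-dimensional case. With more than two alternatives the shares must be matched jointly, which is why the text calls for general nonlinear optimization.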
Step 5. Step 4 gives us the values $L_{jt}$ for all $j$ and $t$. Now, returning to the expression for $L_{jt}$, we note that $L_{jt} = \alpha_j + \beta \ln(p_{jt}) + \gamma d_{jt} + \xi_{jt}$. If $\mathrm{corr}(p_{jt}, \xi_{jt}) = 0$, then we can obtain $\alpha_j$, $\beta$, and $\gamma$ by simply regressing $L_{jt}$ on intercepts, $\ln(p_{jt})$ and $d_{jt}$. However, given the possibility of correlation, instrumental variable methods are used instead. This completes the computation of the linear parameters, conditional on the initial choices of the nonlinear parameters.

Step 6. The error term $\xi_{jt}$ is computed as $L_{jt} - (\alpha_j + \beta \ln(p_{jt}) + \gamma d_{jt})$.

Step 7. The error term is then interacted with the instrument vector used in Step 5 to provide the GMM objective function. This objective function forms the basis of obtaining the nonlinear parameters, i.e., the outer loop.

Step 8. Minimizing the GMM objective function by iterating over the values of $\Omega_{K-1}$, $\sigma_j^2$, and $\sigma_\beta^2$ provides estimates for the nonlinear parameters. The corresponding values of $\alpha_j$, $\beta$, and $\gamma$ computed in Step 5 [...]

[...]pling the probit brand choice model with a category sales model. Denote by $Q_{jt}$ the sales of brand $j$ in week $t$. Then the sales at the “category” (or subcategory level in our case) is nothing but the aggregation of sales across brands. The category sales in week $t$ is given by $CQ_t = \sum_{j=1}^{K-1} Q_{jt}$. The category sales level will depend on the prices and promotions of the various brands in the category and also on factors such as seasonality. We compute category level price and promotion variables by share-weighting the prices and promotions of the individual brands (see Kim et al. 1995). Rather than use a weekly share weight, however, we compute the average share of each brand over the period of the data and use these as share weights. Therefore, variation in the dependent variable is not being used to create our independent variables. We denote the share-weighted price and promotion variables as $CP_t$ and $CR_t$. Now the category sales regression model is given as follows:

$$
\ln(CQ_t) = \lambda + \mu \ln(CP_t) + \kappa \, CR_t + \sum_{s=1}^{3} \tau_s I_{st} + e_t. \tag{4}
$$
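The construction of the share-weighted category variables can be sketched as follows. This is a minimal illustration, assuming that "average share" means the mean of each brand's weekly share over the sample. The function names and the three weeks of sales and prices are hypothetical and are not the paper's data.

```python
def average_shares(sales):
    """sales[t][j] -> mean over weeks of brand j's weekly share."""
    n_weeks, n_brands = len(sales), len(sales[0])
    weekly = [[q / sum(week) for q in week] for week in sales]
    return [sum(w[j] for w in weekly) / n_weeks for j in range(n_brands)]

def share_weighted(values, weights):
    """values[t][j] -> weekly category-level index using fixed weights."""
    return [sum(w * v for w, v in zip(weights, week)) for week in values]

# Hypothetical data: 3 weeks, 3 brands (A, B, C).
sales = [[10.0, 20.0, 70.0], [20.0, 20.0, 60.0], [30.0, 20.0, 50.0]]
prices = [[3.0, 3.2, 2.5], [2.8, 3.2, 2.6], [3.1, 3.0, 2.4]]

weights = average_shares(sales)       # fixed over time
CQ = [sum(week) for week in sales]    # category sales per week
CP = share_weighted(prices, weights)  # category price per week
```

Because the weights are fixed over the sample, week-to-week movement in the category price comes only from the brand prices and not from the weekly shares, which is the point made in the text about not using variation in the dependent variable to build the regressors.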
$$
Q_{jt} = \exp\left\{ \lambda + \mu \ln(CP_t) + \kappa \, CR_t + \sum_{s=1}^{3} \tau_s I_{st} + e_t \right\} \times \left[ \frac{1}{R} \sum_{r=1}^{R} \Phi(Z_{rjt}, \Omega_{j,K-1}) \right]. \tag{5}
$$

Note that it is important to account for the error terms $e_t$ and $\xi_{jt}$ when making predictions. This is similar in kind to the issue raised by Christen et al. (1997) in the context of log–log regression models.

In the case in which the no-purchase option or the outside good is treated as an additional alternative, the sales of brand $j$ in week $t$ are given by the following expression. $M$ in the equation refers to the total consumption associated with that category in each week (assumed to be invariant over time). For example, as described above, in the case of Nevo (2001), this is the potential consumption of cereal by the population of interest. [...]

[...] category. The data are aggregated for the entire U.S. market. While it is important to consider issues of aggregation as described in Christen et al. (1997), market level data are routinely used for the investigation of competitive interactions. Weekly information over two years (104 weeks) is available for the three brands. Besides the sales levels of the brands, we also have their levels of prices and promotional activities over the 2-year period. In addition, we used seasonal dummies in the category sales regression. We assume that the only endogenous variable is price. The instruments we used are the following. From the Bureau of Labor Statistics, we obtained price indices for material (packaging as well as certain categories of chemicals used as ingredients in shampoos) and labor. We also used values of one period lag prices for all brands as instruments. Note that lagged prices can be problematic when there is serial correlation in the $\xi_{jt}$ term.
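The role of the cost-based instruments can be illustrated with a small simulation. This is a sketch under assumed conditions, not the paper's estimation: there is a single endogenous regressor (log price), a single instrument (a cost shifter), and no promotion variable, so that two-stage least squares reduces to a ratio of covariances. All numbers, including the true price coefficient of -2, are hypothetical.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(1)
n = 20000
beta_true = -2.0  # hypothetical price coefficient

z = [rng.gauss(0, 1) for _ in range(n)]   # cost shifter (instrument)
xi = [rng.gauss(0, 1) for _ in range(n)]  # unobserved demand shock
u = [rng.gauss(0, 1) for _ in range(n)]   # other price variation
# Log price responds to cost and also to the unobserved demand shock.
ln_p = [0.5 * zi + 0.5 * x + 0.3 * ui for zi, x, ui in zip(z, xi, u)]
# Mean utility: intercept + price effect + unobserved attribute.
L = [1.0 + beta_true * p + x for p, x in zip(ln_p, xi)]

beta_ols = cov(ln_p, L) / cov(ln_p, ln_p)  # ignores corr(ln_p, xi)
beta_iv = cov(z, L) / cov(z, ln_p)         # uses the cost shifter
```

In this setup the least-squares estimate is pulled toward zero because price is positively correlated with the unobserved shock, while the instrumental variable estimate stays close to the assumed true value. A lagged-price instrument would fail the same check if the simulated shock were serially correlated, which is the caveat raised in the text.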
[...] prices of brands A and B are very close in magnitude to each other, although there appears to be greater variation in the prices of brand B. Brand C, the largest share brand, has the lowest price. As this category is heavily promoted through manufacturer coupons, we use information on the couponing variable to capture promotional effects on sales. The variable is operationalized as the total value of coupons dropped in each week. Table 1 indicates that brand C drops the most coupons, followed by brands A and B. The low price of brand C coupled with its heavy couponing appears to contribute to its large share in the marketplace. In the estimation, we used current and lagged values of the couponing variable. We found that the only significant variable was the 1-week lagged value of coupons dropped. Hence, this is the only variable included in the subsequent estimation and results.

Estimation

In the estimation, we performed extensive sensitivity analyses pertaining to the number of draws from the heterogeneity distribution required. Based on this, we settled on 100 draws ($R = 100$) as being reasonable, as increasing the draws beyond this number did not affect the parameter estimates significantly. For the outside good models, we did not use any marketing variables in the utility specification. If data are available, they can easily be incorporated into the analysis.

Results

The results are discussed as follows. First, we discuss the estimates obtained from the category regression/brand choice probit model. Next, we provide results from the probit model with an outside good included in the specification but without the category regression model. Finally, we discuss the results obtained from the comparison logit models. In Columns 2–5 of Table 2, we provide the results from the probit model with the category regression/brand choice formulation. Four different specifications were estimated. These are (i) without the unobserved attribute $\xi_{jt}$ that results in the endogeneity problem and without accounting for heterogeneity; (ii) accounting for the effects of endogeneity but not for the effects of heterogeneity; (iii) without the unobserved attribute $\xi_{jt}$ but accounting for heterogeneity in preferences as well as the price sensitivity parameter; and (iv) the most general case that accounts for both endogeneity and heterogeneity.

From Table 2, we see that brand A is the designated “base” brand with mean intrinsic preference level set to zero. The two specifications that account for endogeneity reveal positive mean intrinsic preferences for the two larger brands, B and C, as their estimates exceed zero. In the models that do not account for endogeneity, brand B has a lower mean intrinsic preference level than brand A. Note that the magnitudes of these and other estimates are not directly comparable because of differences in the estimated covariance matrices across the four specifications. Table 2 also reveals that the coefficients of the two marketing variables, price and promotion, have the right signs and are significant at the 5% level across all the model specifications. In order to interpret the relative magnitudes of the price coefficients, we compute the corresponding elasticities that are presented later. The heterogeneity parameters in Table 2 indicate that there is some heterogeneity in the intrinsic brand preferences in the case of the most general model, although the variances are not very large in magnitude. This implies that after one explicitly allows for non-IIA behavior at the individual consumer level via the probit specification, there appears to be little heterogeneity in intrinsic preferences. Later, we will contrast this finding with that obtained from the corresponding logit model specifications. The most heterogeneity we find is for the price sensitivity parameter obtained under the “with endogeneity and with heterogeneity” specification. In this case, the standard deviation of 0.242 is significantly different from zero. Note that there are three covariance parameters estimated as there are three brands. Hence, $\Omega_{j,K-1}$ is a $2 \times 2$ matrix with three unknown parameters in the Cholesky decomposition, of which only two parameters are uniquely identified. Hence, standard errors are not reported for the third parameter, as it is fixed.
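The covariance parameter counts can be checked with a short calculation. The sketch below assumes only what the text states: the covariance matrix of utility differences has dimension one less than the number of alternatives, its Cholesky factor is lower triangular, and one element is fixed for identification. The function name is hypothetical.

```python
def cholesky_param_counts(n_alternatives):
    """Return (matrix dimension, Cholesky parameters, identified parameters)."""
    dim = n_alternatives - 1       # utility differences
    total = dim * (dim + 1) // 2   # free entries of a lower-triangular factor
    return dim, total, total - 1   # one entry fixed to set the scale

three_brands = cholesky_param_counts(3)
```

With three alternatives this gives a 2 by 2 matrix with three Cholesky parameters, two of them identified, matching the counts in the text.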
Table 2 provides the parameter estimates obtained from the category regression model under each specification. A priori, we would expect the parameters from the two “no-endogeneity” specifications to resemble each other and those from the two “with-endogeneity” models to be similar as heterogeneity has no impact on the category sales regressions. Indeed the results reflect this, although results from all four specifications are quite similar to one another. One of the things we also find is that seasonality does not play a major role in this product category.

In Table 3, Columns 2–9, we present the elasticity estimates from the four specifications. For each specification, we present two sets of elasticities. The first column corresponds to the brand choice elasticities. The second column contains the total sales or demand elasticities. There are several interesting points to note from Table 3:

(1) None of the cross elasticities are subject to the IIA restriction, even those that come from models that [...]

[...] (b) Elasticity (No Endogeneity and With Heterogeneity) < Elasticity (With Endogeneity and With Heterogeneity).

(4) Not accounting for endogeneity seems to have a [...]
[...] model, we turn next to the specification in which an outside good is included in the individual-level choice model to capture the no-purchase behavior of consumers. This obviates the need for a category regression equation. Hence, the model is identical to the brand choice component of the previous specification with an additional alternative. In Table 2, Columns 7–8, we present the parameter estimates and their standard errors for this formulation. Given the relative importance of accounting for endogeneity found with the previous specification, we focus only on the two formulations that account for endogeneity, with and without accounting for heterogeneity.

Note from Table 2 that we now have three brand intercepts, one for each brand. The reason is that we now have four alternatives, the three brands and the outside good, and so three intercepts are identified. The outside good is specified as the base brand in this case. Also note that we have six covariance parameters (of which five are uniquely identified) rather than three as in the previous formulation. The reason is that $\Omega_{j,K-1}$ is now a $3 \times 3$ matrix with six unknown parameters in the Cholesky decomposition. The results are largely consistent with those from the category regression/brand choice model. We note once again that the price and promotion parameters are not directly comparable across specifications because of differences in the estimated covariance matrices. We do note, however, that the parameters corresponding to the heterogeneity distribution are small and are not significantly different from zero in two of the four cases. Even the standard deviation parameter for price that had an estimated coefficient of 0.242 is only 0.058 in this case. Again, we caution that the numbers are not directly comparable. Nevertheless, they seem to indicate a small effect of heterogeneity in this case. To verify this, we provide in Table 3 (Columns 10–11) the price elasticities from the two specifications. We note the following from these estimates.

(1) Consistent with our previous results, we find that not accounting for the effects of heterogeneity does bias the estimated elasticities towards zero in this case as well.

(2) The relative ordering of the own elasticities is also the same as previously found, with brand A being the most price sensitive followed by brands B and C.

(3) The own price elasticities seem to be smaller in this case as compared to the most general model under the category regression/brand choice specification (Column 9 in Table 3). In particular, the own elasticities seem closer to zero by roughly 0.4–0.5 for all three brands. What this implies is that the category elasticities corresponding to this sales specification are smaller than those obtained when category sales were modeled explicitly as a function of category level marketing activities.

(4) Furthermore, the cross-price elasticities are also very small in magnitude, especially compared to those in Columns 2–9. It must be noted that previous studies that have examined the purchase incidence and brand choice decisions of households have also obtained small cross-elasticities relative to own elasticities (Chintagunta 1993).

Taken together, these results imply that the estimated price elasticities are sensitive to the model specification. The choice of specification will come down to a trade-off between wanting a fully structural interpretation of the model versus not having to make assumptions that determine the total size of the category. For example, if one does have data on the entire category’s sales, then this information can be exploited in defining the outside good. However, in the absence of such information, the proposed category regression/brand choice model may be preferred.

Model Comparison: Logit Model

Having discussed the results from two different probit specifications, we turn next to the logit model to see whether implications obtained are similar to those obtained for the probit model. Accordingly, in Table 3 (Columns 12–13) we provide the price elasticities obtained from the two logit specifications. The first is a purchase incidence/brand choice model similar to the nested logit model. This is the specification discussed in Chintagunta (1993) except that we allow for the price coefficient to be different from $-1$. This specification treats the no-purchase option as distinct from the alternatives in the category under consideration. Hence, even in the absence of heterogeneity the substitution pattern between one of the brands and the outside good is different from that between two brands. The second specification is the category regression/brand choice model, whose direct probit counterpart we have discussed previously.

[...] Table 3 (Column 12) reveals a pattern similar to that of the comparison described above. However, in this case it appears that the logit own price elasticities across the three brands are very close to one another, ranging only from −1.305 for brand C to −1.437 for [...]
Table 4 Price Elasticities from Log–Log Regression Model

Sales of / Price of →    Brand A    Brand B    Brand C
Brand A                  −0.129     −1.198     1.232
[...]

[...] as heterogeneity need to be accounted for even after allowing for a non-IIA specification at the individual consumer level. We also find that ignoring endogeneity has a bigger impact on the estimated price elas- [...]
[...] model results obtained from the analysis of the shampoo product category indicate that the proposed specification is a promising alternative to existing methods used for the purpose.

Chintagunta, P. K., B. E. Honore. 1996. Investigating the effects of marketing variables and unobserved heterogeneity in a multinomial probit model. IJRM 13 1–15.

Christen, M., S. Gupta, J. Porter, R. Staelin, D. R. Wittink. 1997. Using market-level data to understand promotion effects in a [...]
This paper was received December 20, 1999, and was with the author 9 months for 4 revisions; processed by Greg Allenby.