
The Historical Development of The Market Segmentation Concept

This document discusses the history and evolution of the market segmentation concept in both academia and practice. It outlines key developments in market segmentation theory starting in the early 20th century. In academia, seminal works in the 1950s and 1960s established market segmentation as a central marketing concept. Methodological approaches to segmentation have evolved through five phases since the 1960s, moving from classification methods to developing techniques tailored for marketing contexts like clusterwise regression and finite mixture models. Market segmentation involves segmenting a heterogeneous market into homogeneous subgroups (S), targeting specific segments (T), and developing unique positioning for products/services (P), and is considered a core component of modern marketing strategy.


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/279384714

The Historical Development of the Market Segmentation Concept

Chapter · January 2000


DOI: 10.1007/978-1-4615-4651-1_1


2 authors:

Michel Wedel, University of Maryland, College Park
Wagner A Kamakura, Rice University

All content following this page was uploaded by Michel Wedel on 07 July 2021.


Chapter 23
The History of Methods for Market
Segmentation

Michel Wedel
Distinguished University Professor
Pepsico Chair in Consumer Science
Robert H. Smith School of Business
University of Maryland
College Park MD 20742
http://scholar.rhsmith.umd.edu/wedel/

Wayne S. DeSarbo
Distinguished Emeritus Professor of Marketing
Smeal College of Business
Marketing Department
The Pennsylvania State University
University Park, PA. 16802
[email protected]

The History of Methods for Market Segmentation

23.1 Introduction: The Evolution of the Market Segmentation Concept

Market segmentation is one of the least disputed concepts in marketing. Both academics and

practitioners assert that it is the concept that has had the most impact on marketing practice (Roberts,

Kayande, and Stremersch 2014). As a case in point, about 85% of some 30,000 new product launches

have been reported to fail because of inadequate market segmentation (Christensen, Cook, and Hall

2005). Yet, surveys among managers have revealed that the use of segmentation modeling is more

prevalent among larger companies with more frequent customer contact (Verhoef, Spring, Hoekstra, et

al. 2003). Market segmentation guides a firm's marketing strategy and the allocation of resources

towards products and markets. In marketing practice, the idea of targeting products and services to

subgroups of consumers has been documented to occur as early as 1820. German and British

booksellers targeted segments in the market for books based on price, geography, demographics, and

psychographics (Fullerton 1988). This strategy was systematically deployed, which spurred growth in

book sales in Germany, elsewhere in Europe, and in the US. In “The Story of Mass Marketing in

America,” Tedlow (1996, pp. 6-9) points to the pioneering deployment of market segmentation strategy

by General Motors. Circa 1920, GM responded to the highly successful mass marketing strategy by Ford

via a segmentation strategy based on price: “A car for every purse and purpose,” where each car in its

product line was intended to appeal to a different segment of customers. This marked one of the largest

deployments of segmentation strategies in practice to date.

23.1.1 Market Segmentation in Academia

Classic theories on price discrimination (Pigou 1920), product differentiation (Ekelund 1970),

and imperfect competition (Robinson 1938) show that, when facing heterogeneous markets or customers, a

firm employing a market segmentation strategy can expect to increase profitability. These theories

therefore provide the major rationales for market segmentation (Frank, Massy, and Wind 1972). In

marketing academia, the identification of market segments had already been recognized as a major marketing

research problem by Shaw (1916) in his book “An Approach to Business Problems.” Shaw saw markets as

being made up of segments, or “strata”, based on economic and social factors such as population, race,

buying behaviors, and mental attitudes (Bartels 1988, p. 127). The conceptual foundations of market

segmentation were first laid out by Smith (1956). He defined it as follows: “Market Segmentation involves viewing a

heterogeneous market as a number of smaller homogeneous markets in response to differing

preferences, attributable to the desires of customers for more precise satisfaction of their varying

wants.” Smith’s definition has stood the test of time, reflecting a market orientation as opposed to a

product orientation. Since his seminal article, market segmentation has become a central concept in

marketing theory and the key component of marketing strategies of companies (Johnson 1971; Dickson

and Ginter 1987), as well as one of the most researched areas in Marketing, accumulating over

seventeen million hits on Google to date. Key theoretical developments of market segmentation can be

found in the works of Smith (1956), Claycamp and Massy (1968), Wind (1978), and Dickson and Ginter

(1987). The first comprehensive treatments of market segmentation were provided by Frank, Massy,

and Wind (1972) in their book “Market Segmentation,” and later by Wedel and Kamakura (2000) in their

book “Market Segmentation: Conceptual and Methodological Foundations.”

23.1.2 STP: Segmentation-Targeting-Positioning

Market Segmentation has become the central tenet in marketing strategy in practice and in

academia across the globe as a key component of the S(Segmentation)-T(Targeting)-P(Positioning)

process (Baker 1988; Kotler 1988; Myers 1996). STP involves selecting basis variables (Kim, Fong, and

DeSarbo 2012), profiling the segments for accessibility (Wedel and DeSarbo 2002), selecting one or

more of the most profitable segments to target (Mahajan and Jain 1978; Winter 1979; DeSarbo and

Mahajan 1984; DeSarbo and DeSarbo 2001), and developing positioning concepts for the firm’s

products/services (DeSarbo, Blanchard, and Atalay 2008). Various methods (citations in parentheses

above) have been developed to assist in this process.

Importantly, the Marketing literature has enumerated various criteria for market segmentation to

be effective (Frank, Massy, and Wind 1972; Baker 1988; Kotler 1988; Wedel and Kamakura 2000;

DeSarbo and DeSarbo 2001; DeSarbo, Stadler-Blank, and Chen 2017). These criteria include the

Identifiability of segments; Differentiation of customer behavior between segments; Substantiality of

the derived segments from a marketing or cost-effectiveness perspective; Stability of the segments in

terms of consumer preferences and competitive entry; Reachability of segments by a distinct marketing

mix strategy; Responsiveness of the market segments to the marketing efforts targeted at them;

Actionability of segmentation results by providing insights into effective marketing actions; Feasibility of

implementation of segmentation strategy based on technical and managerial constraints; Profitability of

the segmentation strategy; and, Projectability of the results to the entire relevant marketplace. Most

extant approaches to market segmentation do not accommodate all these criteria, and segmentation

studies should thus incorporate multiple segmentation bases and managerial constraints, and be tailored to

the specific application context to meet these criteria for effective segmentation. Note that in the

formulation of segments one often needs to trade off statistical goodness-of-fit against managerial and

efficacy criteria (DeSarbo and Mahajan 1984; Krieger and Green 1996; DeSarbo and Grisaffe 1998;

Brusco, Cradit, and Stahl 2002; Brusco, Cradit, and Tashchian 2003; Liu, Ram, Lusch, et al. 2010).

The marketing science literature on market segmentation methodology has passed through roughly five

different phases. A genealogical map of this literature is provided in Figure 1. Phase 1 (classification

methods) involved the application of grouping methods from statistics, classification, and numerical

taxonomy to segmentation problems, as well as the further development of some of these methods to

tailor them better to marketing contexts. Here, the conceptual development of normative segmentation

methods incorporated profit-maximizing segmentation schemes. Phase 2 (clusterwise regression

methods) involved the development of novel methods specifically tailored to the substantive issue of

response-based segmentation, which derived segments of consumers and estimated segment specific

response models at the same time. A variety of techniques that allowed for constraints and multiple

criteria were developed as well. Phase 3 (finite mixture models) involved the application of parametric

latent class models initially developed (and later extended) in statistics to segmentation problems, and

the recognition of finite mixture models as a model-building approach to segmenting markets. In Phase

4 several novel extensions were developed, including mixture regression models, mixture and

clusterwise joint space models, and dynamic (Hidden Markov) segmentation models. There was a

proliferation of the development of new models for a variety of applications, including scanner panel

data, conjoint analysis, and multidimensional unfolding. In Phase 5, sparked by the application of

Bayesian methods, a debate emerged as to the effectiveness of “discrete” approaches to segmentation

versus models that allowed for personalization based on continuous heterogeneity distributions.

Research showed which approach worked best in which situation, and accommodating unobserved

heterogeneity became a mainstay of models of consumer behavior. Finally, there is a continued

application and development of these methods both in academia and practice. We discuss these phases

in the sequel and provide an outlook for the future of market segmentation research.

23.2 Segmentation Bases

23.2.1 Consumer Markets

Early segmentation research focused not only on the conceptual foundations, but

also on the identification of empirical data that could form the basis for the estimation of market

segments. On the consumer side, geo-demographic and socioeconomic variables were initially preferred.

Nielsen Claritas developed a geo-demographic approach called PRIZM that classified US residential

neighborhoods into 66 distinct segments based on education, affluence, family life cycle, urbanization,

race/ethnicity, and mobility. Companies like Barnes and Noble, Hyundai, and Yelp have successfully

utilized this segmentation framework in their strategy implementation.

Unfortunately, differences in actual purchase behaviors between such geo-demographic-based

segments were shown to be relatively small (McCann 1974). This led to the use of observable product-

specific bases such as usage frequency (Twedt 1964), loyalty, repeat purchase, and usage situations

(Dickson 1982), as it was felt that such behavioral approaches were more closely related to consumers’

actual buying behavior. For example, many applications of usage segmentation utilize the 80/20 rule -

the adage that 80% of sales accrue from approximately 20% of the customers. In such cases, the idea

would be to target the 20% heavy users. Such an approach was utilized for the Ultrabooks (MacBook Air)

launched by Intel and Apple, based on the long sitting hours of some of their customers.
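The usage-based logic of the 80/20 rule can be made concrete with a small sketch. The function below ranks customers by purchase volume and returns the smallest group of top customers that reaches a chosen share of total sales. The function name `heavy_users`, the customer labels, and all sales figures are hypothetical and serve only as an illustration; they do not come from any study cited here.

```python
def heavy_users(sales_by_customer, sales_share=0.8):
    """Return the top customers needed to reach `sales_share` of total sales,
    together with the share of sales they actually account for."""
    total = sum(sales_by_customer.values())
    # Rank customers from heaviest to lightest user.
    ranked = sorted(sales_by_customer.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for customer, amount in ranked:
        if running >= sales_share * total:
            break
        selected.append(customer)
        running += amount
    return selected, running / total


# Hypothetical purchase volumes for ten customers (illustrative numbers only).
sales = {"c1": 500, "c2": 320, "c3": 40, "c4": 35, "c5": 30,
         "c6": 25, "c7": 20, "c8": 15, "c9": 10, "c10": 5}
segment, share = heavy_users(sales)
print(segment, share)  # two of the ten customers (20%) cover 82% of sales
```

In real data the split is rarely exactly 80/20, so the threshold is left as a parameter.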

The need for a more actionable picture of consumers caused researchers to resort to

psychographic variables, such as personality (Claycamp 1965; Brody and Cunningham 1968), value

systems (Kahle, Beatty, and Homer 1986; Novak and MacEvoy 1990; Kamakura and Mazzon 1991) and

lifestyle (Plummer 1974; Lastovicka 1982; Lesser and Hughes 1986; Wells 1975). For example, the VALS

framework, developed by Strategic Business Insights, combines demographic and psychological

measurements to devise some eight primary segments. Here, the Goya company benefited from the

utilization of such demographic and psychographic information in marketing to the Hispanic

marketplace, finding that its target market shares strong family values, often has multiple generations

residing in one household, and has strong ties with the country of origin.

When it became clear, however, that these psychographic bases often lacked responsiveness

and actionability, researchers turned to product attribute perceptions (Dhalla and Mahatoo 1976;

Gensch 1978), preferences (Sewall 1978), and elasticities (Massy and Frank 1965; Claycamp and Massy

1968; Tollefson and Lessig 1978; Elrod and Winer 1982). Haley (1968) first proposed product benefits as

a basis for segmentation, operationalized as self-stated or statistically estimated (for example in conjoint

analysis; Green and Krieger 1991) importance weights (Currim 1981; Moriarty and Venkatesan 1978;

Myers 1976; Wind 1978; Calantone and Sawyer 1978). Benefit segmentation has long been considered

a favored mode of segmentation meeting many of the effectiveness criteria. For example, Constellation

Brands successfully deployed benefit segmentation to identify six segments in the US premium wine

market: Enthusiasts, Image Seekers, Savvy Shoppers, Traditionalists, Satisfied Shoppers, and

Overwhelmed/Confused.

Nonetheless, in modern segmentation studies, multiple segmentation bases are often used

where each basis is employed according to its own strength. DeSarbo, Carroll, Clark, and Green (1984),

DeSarbo and Grisaffe (1998), and DeSarbo and Wu (2016) provide approaches for using such multiple

batteries of variables for segmenting customers. For example, pharmaceutical companies with their

direct-to-consumer (DTC) marketing efforts will use combinations of different household basis variables

to segment customers within a certain malady to maximize interest, compliance, and affordability.

23.2.2 Business Markets

In general, business markets comprise a smaller number of larger customers than the

consumer market. In fact, some focused businesses may only face a handful of potential business

customers. For example, Cummins engines, Delphi control systems, and other automotive parts

suppliers depended on getting large contracts from just a few major auto manufacturers. In such cases,

one can make a good case for micro-marketing where each customer is a separate segment. This is,

however, not the typical case. While business markets can be segmented utilizing some of the same

bases as in the consumer market such as geography, usage rate, and benefits sought, business markets

can use a variety of other variables for segmentation. According to Winer and Dhar (2010) and Kotler

and Keller (2016), these include: firmographics (e.g., type of industry, company size, location), operating

variables (e.g., technology, user status, customer lifetime value), purchase framework (e.g., power

structure, business relationships, purchase criteria), situational factors (e.g., types of applications,

degree of purchase urgency, size of the orders), and personal characteristics (e.g., loyalty to the

supplier, buyer attitudes towards risk, congruency of values). Doyle and Saunders (1985) provide one of

the earliest published studies of industrial market segmentation, using a combination of factor and

cluster analysis of end-user survey data for a multinational raw material company. They contribute to a

prior debate in the literature on whether or not business and consumer market segmentation are

fundamentally different by concluding that although underlying concepts and methods are similar to

consumer market segmentation, they do require significant modification. Bolton and Myers (2003)

segment an international business market based on dimensions of service quality as antecedents of

business customers’ price elasticities.

Examples of the application of business segmentation bases include those by Siemens, which

utilized type of business classifications (such as the SIC code in the US) for segmentation. PPG Industries

utilized a hierarchical multiple base segmentation scheme. The first level was by their various

businesses, the second level involved geography (country/continent), and the third level involved type

and size of business. Oracle used functionality and needs for the diverse set of technological products

and services they provide as the major basis for segmentation.

23.3 Phase 1: Classification Methods

Green (1977) proposed two distinct types of segmentation. In a priori segmentation, the

segmentation scheme was pre-specified, and the marketer looked at variables which could be utilized to

predict which segment a consumer belonged to. In post-hoc segmentation studies, nothing was known

about the nature of unknown segments and the marketer had to derive this information from suitable

empirical methods. A-priori segmentation methodology initially consisted of standard statistical

approaches such as cross-tabulation (Bass, Tigert, and Lonsdale 1968; Morrison 1973), linear regression

(Wildt 1976; Wildt and McCann 1980; Beckwith and Sasieni 1976) and Multinomial logit and Probit

models (Rao and Winter 1978; Gensch 1985). It soon appeared, however, that post-hoc methods of

market segmentation provide a better fit to segmentation problems.

23.3.1 Decision Trees

Classification and Regression Trees (CART) (Breiman, Friedman, Olshen, et al. 1984) is an

umbrella term for a group of techniques that partition a sample by sequentially splitting a set of

variables one-by-one, to create a hierarchical tree that represents these variables in the order of their

importance for predicting a specified dependent variable. In classification trees, the dependent variable

is categorical; in regression trees, it is continuous. Each end-node of the derived tree thus represents a

market segment that is defined by a higher-order interactive effect of the segmentation bases employed

in the analysis. One of the earliest CART-like methods that was applied to market segmentation was AID,

the Automatic Interaction Detector (Assael 1970; Assael and Roscoe 1976). AID was generalized to

predict categorical dependent variables (CHAID) (Kass 1980), multiple dependent variables (MacLachlan

and Johansson 1981; Magidson 1994), and profit as a dependent variable (Martin and Wright 1974).

Such CART methods, however, proved to have drawbacks. Doyle and Fenwick (1975) showed via cross-

validation that the resulting trees are typically unstable due to overfitting unless the sample size is very

large, while Doyle and Hutchinson (1976) showed that clustering methods generally perform better for

segmentation problems.
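To illustrate the splitting step that such tree methods repeat at every node, the sketch below searches all candidate binary splits and keeps the one that minimizes the within-node sum of squared errors of a continuous dependent variable, which is the regression-tree case. It is a minimal illustration under stated assumptions, not a reconstruction of AID or CHAID: the function names, the variable names (`age`, `income`, `spend`), and the data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_binary_split(rows, target, predictors):
    """Return (score, predictor, threshold) for the split with the lowest
    total within-node SSE of `target`."""
    best = None
    for name in predictors:
        levels = sorted({row[name] for row in rows})
        # Candidate thresholds lie halfway between adjacent observed values.
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2
            left = [r[target] for r in rows if r[name] <= threshold]
            right = [r[target] for r in rows if r[name] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, name, threshold)
    return best


# Hypothetical consumers: spending depends on income, not on age.
consumers = [
    {"age": 25, "income": 20, "spend": 10},
    {"age": 60, "income": 30, "spend": 12},
    {"age": 35, "income": 40, "spend": 11},
    {"age": 50, "income": 70, "spend": 50},
    {"age": 30, "income": 80, "spend": 52},
    {"age": 55, "income": 90, "spend": 51},
]
split = best_binary_split(consumers, "spend", ["age", "income"])
print(split)  # the income split separates low and high spenders
```

A full tree would apply the same search recursively to each resulting subgroup. The instability reported by Doyle and Fenwick (1975) is plausible here: with few cases per node, the chosen split can change when a handful of observations change.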

CART has recently seen a revival as a machine learning approach because of the development of

random forests (Breiman 2001), an ensemble method (that is, one based on a series of alternative

models rather than a single one) that extends CART by constructing a large number of decision

trees based on random samples of cases and features, and outputting the modal class (classification) or

average prediction (regression) of these multiple trees. It overcomes some of the limitations of CART

related to overfitting. Recently, Aouad, Ali, Elmachtoub, Ferreira, and McNellis (2020) develop a tree

construction algorithm that splits a sample into segments based on customer features, and within each

segment fits a Multinomial Logit model to predict customers’ choices.
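The ensemble idea behind random forests can be sketched in a few lines. The code below is a deliberately simplified stand-in, not Breiman's algorithm: each "tree" is a one-split stump fitted on a bootstrap sample of cases using one randomly chosen feature, and the forest outputs the modal class. A real random forest grows deep trees and draws a random feature subset at every split. All names and data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """One-split classifier on `feature` with the fewest training errors."""
    values = sorted({r[feature] for r in rows})
    majority = Counter(labels).most_common(1)[0][0]
    # Baseline: no split, always predict the majority class.
    best = (sum(l != majority for l in labels), feature, None, majority, majority)
    for low, high in zip(values, values[1:]):
        t = (low + high) / 2
        left = [l for r, l in zip(rows, labels) if r[feature] <= t]
        right = [l for r, l in zip(rows, labels) if r[feature] > t]
        lc = Counter(left).most_common(1)[0][0]
        rc = Counter(right).most_common(1)[0][0]
        errors = sum(l != lc for l in left) + sum(l != rc for l in right)
        if errors < best[0]:
            best = (errors, feature, t, lc, rc)
    return best[1:]


def predict_stump(stump, row):
    feature, t, left_class, right_class = stump
    if t is None:
        return left_class
    return left_class if row[feature] <= t else right_class


def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    features = list(rows[0].keys())
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample of cases
        feature = rng.choice(features)                   # random feature
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Modal class over all trees (the classification case)."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]


# Hypothetical light and heavy users described by two features.
rows = [{"visits": v, "spend": s} for v, s in
        [(1, 10), (2, 12), (3, 9), (2, 11), (1, 10),
         (9, 90), (10, 95), (8, 88), (9, 92), (10, 91)]]
labels = ["light"] * 5 + ["heavy"] * 5
forest = fit_forest(rows, labels)
print(predict_forest(forest, {"visits": 1, "spend": 10}))
print(predict_forest(forest, {"visits": 10, "spend": 93}))
```

Averaging over many trees built on perturbed samples is what dampens the overfitting of any single tree.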

23.3.2 Cluster Analysis

There are basically four different types of cluster analysis: 1. Hierarchical clustering: Allows clusters to

exist within bigger clusters to form a tree; 2. Partition clustering: A division of the set of data objects into

non-overlapping clusters such that each object is in exactly one and only one cluster; 3. Overlapping

clustering: Reflects that an object can simultaneously belong to more than one cluster; 4. Fuzzy

clustering: Every object belongs to every cluster with a membership weight that ranges between 0 (if it

absolutely does not belong to the cluster) and 1 (if it absolutely belongs to the cluster).
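The difference between partition, overlapping, and fuzzy clustering shows up directly in the membership vector of a single object. The helper below, with the hypothetical name `membership_type`, classifies such a vector. It is only an illustration of the definitions above; hierarchical clustering is not covered because it produces a nested sequence of partitions rather than a single vector.

```python
def membership_type(m, tol=1e-9):
    """Classify one object's membership vector over the clusters."""
    if any(w < -tol or w > 1 + tol for w in m):
        raise ValueError("memberships must lie between 0 and 1")
    binary = all(abs(w) < tol or abs(w - 1) < tol for w in m)
    ones = sum(1 for w in m if abs(w - 1) < tol)
    if binary and ones == 0:
        raise ValueError("object is not assigned to any cluster")
    if binary and ones == 1:
        return "partition"    # exactly one cluster
    if binary:
        return "overlapping"  # several clusters at once
    return "fuzzy"            # graded membership weights


print(membership_type([0, 1, 0]))
print(membership_type([1, 1, 0]))
print(membership_type([0.2, 0.7, 0.1]))
```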

Frank and Green (1968) and Lessig and Tollefson (1971) proposed the use of hierarchical clustering

methods to identify market segments. Hierarchical clustering algorithms start with each individual

consumer in its own cluster, and link clusters successively based on some measure of their proximity

or similarity such as City Block, Euclidean, or Mahalanobis distance measures (Wedel and Kamakura

2000, pp. 46-47). The linking rules involved include single, complete, and average linkage (the distance

between two clusters is based on the smallest, largest, and average distance, respectively, between any of their

members). Ward’s linkage algorithm joins clusters such that the increase in the within-cluster variance is

minimized. Partitioning or Nonhierarchical clustering algorithms start from some initial assignment of

consumers to segments and reassign consumers iteratively until a certain criterion is optimized.

K-means (or relatedly, Partitioning Around Medoids) is the most popular of the nonhierarchical methods

and minimizes the within-segment mean square deviation from the mean (medoid).
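The reassignment logic of nonhierarchical clustering can be sketched with a plain K-means routine. The version below is a minimal illustration: it assumes squared Euclidean distance and takes the initial centers as an argument, whereas practical implementations use multiple random starts. The function names and the six data points are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, initial_centers, max_iter=100):
    """Alternate between assigning each consumer to the nearest segment mean
    and recomputing the means, until no consumer changes segment."""
    centers = [list(c) for c in initial_centers]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centers)), key=lambda s: squared_distance(p, centers[s]))
            for p in points
        ]
        if new_assignment == assignment:  # converged
            break
        assignment = new_assignment
        for s in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == s]
            if members:  # keep the old center if a segment becomes empty
                centers[s] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers


# Hypothetical consumers measured on two segmentation variables.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
assignment, centers = k_means(points, initial_centers=[(1, 1), (8, 8)])
print(assignment)
print(centers)
```

Each pass can only lower the within-segment sum of squared deviations, which is why the procedure converges, although possibly to a local optimum that depends on the starting assignment.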

There are many applications of clustering methods: for example, hierarchical clustering was

applied by Greeno, Summers, and Kernan (1973) to identify segments on the basis of personality and

non-hierarchical methods were applied by Moriarty and Venkatesan (1978) for benefit segmentation.

Punj and Stewart (1983) reviewed clustering applications for market segmentation and concluded that

nonhierarchical methods are more robust than hierarchical methods to outliers and the inclusion of

irrelevant variables. Among hierarchical methods, they found that Ward’s method performed best.

Several extensions of these standard clustering methods were proposed. Sexton (1974) clustered

households based on market share response models using a two-step procedure in which first segments

were identified based on sales data, and secondly market share response models were estimated within

each segment. Similar two-step procedures for segmentation later became popular in conjoint studies.

In the psychometrics and classification literatures, several extensions of these clustering

methods were proposed - some of which found their way into marketing applications. Procedures were

developed that automatically weigh the variables used for hierarchical (De Soete, DeSarbo, and Carroll

1985) and nonhierarchical clustering algorithms (DeSarbo, Carroll, Clark, et al. 1984). Techniques that

allow for segments to overlap, that is, that allow a consumer to belong to more than one segment, were

developed as well (Arabie, Carroll, DeSarbo, and Wind 1981; DeSarbo and Mahajan 1984). Hruschka

(1986) applied fuzzy classification methods to market segmentation, and showed that they performed

better than overlapping and non-overlapping methods. Wedel and Kamakura (2000, p.72-73) provide an

overview of more than twenty applications of clustering methods to market segmentation, in areas such

as banking, media exposure, opinion leadership, product usage and buying behavior, customer

complaints, and drug prescriptions.



23.4 Phase 2: Clusterwise Regression

Two-stage procedures like the one proposed by Sexton (1974) were initially used for response-based

segmentation (Hauser and Urban 1977; Moriarty and Venkatesan 1978; Moore 1980; Currim 1981). In

the context of benefit segmentation and conjoint analysis, in the first stage importance weights were

estimated at the individual level, and these were subjected to a hierarchical or nonhierarchical

clustering procedure in the second stage. However, this procedure was shown to suffer from a low

reliability of the individual-level estimates which carried over to the segments derived from them, while

the criteria used by the clustering procedures did not maximize predictive accuracy (Kamakura 1988;

Wedel and Kistemaker 1989). Ogawa (1987) proposed a ridge-regression like estimation procedure to

address the issue of stability of individual-level logit model estimates in a two-step approach.

Kamakura (1988) proposed a hierarchical clustering procedure tailored to conjoint analysis that

simultaneously grouped respondents into segments based on a criterion that maximized the predictive

fit of the conjoint regression model within each segment. This procedure presented the first integrated

method for grouping and regression and set the scene for developments in clusterwise regression.

Rather than using a hierarchical aggregation procedure, Wedel and Kistemaker (1989) used a

nonhierarchical procedure for clusterwise regression that used an exchange algorithm to swap

observations between segments until a predictive fit criterion was maximized. This work extends the

early work by Späth (1979), who coined the term clusterwise regression. These clusterwise regression

models take the linear form:

𝒚𝑖 = 𝑿𝑖 𝑩𝒎𝑖 + 𝜺𝑖 , (1)

with 𝑿𝑖 a (𝐽 × 𝑃) matrix of independent variables, where 𝑗 = 1, … , 𝐽 are repeated measures, and 𝑩 is

a (𝑃 × 𝑆) matrix of parameters for the 𝑝 = 1, … , 𝑃 independent variables and segments 𝑠 = 1, … , 𝑆.

𝒎𝑖 is a (𝑆 × 1) vector of memberships of individual i in the S segments that can have only one nonzero

element: 𝑚𝑖,𝑠 = 0/1. The cluster-wise regression procedures optimize likelihood (or least squares)

criteria of the form

$$\min_{\boldsymbol{m}_i} \sum_{i=1}^{N} \boldsymbol{\varepsilon}_i(\boldsymbol{m}_i)'\,\boldsymbol{\varepsilon}_i(\boldsymbol{m}_i).$$

While Kamakura (1988) finds the segment membership

indicators 𝒎𝑖 through a hierarchical clustering algorithm, Wedel and Kistemaker (1989) do so through

an exchange algorithm and estimate the standard errors of the estimates with a Bootstrap procedure.

DeSarbo, Oliver, and Rangaswamy (1990) generalize clusterwise regression by allowing for overlapping

segments (estimated through simulated annealing) and constraints, where the vector 𝒎𝑖 can contain

multiple nonzero (1) elements. Wedel and Steenkamp (1989, 1991) generalize cluster-wise regression to

allow for fuzzy membership, that is 0 ≤ 𝑚𝑖,𝑠 ≤ 1, estimated using alternating least squares (ALS)

algorithms. Brusco, Cradit, and Stahl (2002) and Brusco, Cradit, and Tashchian (2003) generalize

clusterwise regression to multicriteria settings.
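The joint estimation of memberships and segment-level regressions in equation (1) can be sketched as follows. This is a simplified alternating scheme, not the exchange algorithm of Späth (1979) or Wedel and Kistemaker (1989): it refits one regression per segment and then moves every individual to the segment whose regression gives the lowest sum of squared residuals, stopping when memberships no longer change. For brevity it assumes one predictor plus an intercept (P = 2) and hard 0/1 memberships. The data are hypothetical and noise-free.

```python
def ols(xs, ys):
    """Intercept and slope of a simple least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def individual_sse(obs, coef):
    """Squared residuals of one individual's J observations under `coef`."""
    a, b = coef
    return sum((y - (a + b * x)) ** 2 for x, y in obs)


def fit_segments(data, membership, n_segments):
    coefs = []
    for s in range(n_segments):
        pooled = [pair for obs, m in zip(data, membership) if m == s for pair in obs]
        coefs.append(ols([x for x, _ in pooled], [y for _, y in pooled]))
    return coefs


def clusterwise_regression(data, membership, max_iter=50):
    """data[i] holds the (x, y) pairs of individual i; membership[i] is the
    index of the single segment with m_{i,s} = 1."""
    n_segments = max(membership) + 1
    membership = list(membership)
    coefs = fit_segments(data, membership, n_segments)
    for _ in range(max_iter):
        updated = [min(range(n_segments), key=lambda s: individual_sse(obs, coefs[s]))
                   for obs in data]
        # Stop at convergence, or if a segment would become empty.
        if updated == membership or len(set(updated)) < n_segments:
            break
        membership = updated
        coefs = fit_segments(data, membership, n_segments)
    return membership, coefs


# Hypothetical data: three individuals follow y = 1 + 2x, three follow y = 10 - x.
type_a = [(1, 3), (2, 5), (3, 7)]
type_b = [(1, 9), (2, 8), (3, 7)]
data = [type_a, type_a, type_a, type_b, type_b, type_b]
membership, coefs = clusterwise_regression(data, membership=[0, 0, 1, 1, 1, 0])
print(membership)
print(coefs)
```

Starting from a deliberately wrong assignment, the sketch recovers both the grouping and the two response functions, which is the point of estimating them together rather than in two separate stages.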

23.5 Phase 3: Finite Mixture Models

23.5.1 Latent Class Models

Mixture models date back as far as the work by Newcomb (1886) and Pearson (1894). Latent

Class Analysis, a finite mixture for observed categorical variables that is used for clustering of

dichotomous variables, was first proposed by Lazarsfeld (1950) and extended to polytomous variables

by Goodman (1974), who also developed a maximum likelihood estimation algorithm. A latent class

model for 𝑘 = 1, … , 𝐾 observed categorical variables with categories 𝑐𝑘 = 1, … , 𝐶𝑘 is formulated as:

$$\pi_{c_1,\ldots,c_K} = \sum_{s=1}^{S} \pi_s \prod_{k=1}^{K} \pi_{c_k}^{k|s}. \qquad (2)$$

Here, $\pi_{c_1,\ldots,c_K}$ is the probability of an observation falling into cell $c_1, \ldots, c_K$ of the K-way contingency

table. $\pi_s$ is the proportion of latent class (segment) s, and $\pi_{c_k}^{k|s}$ is the conditional probability of category

$c_k$ of variable k given latent class s. The latent class model is based on the assumption of local

independence, that is, in equation (2), the K categorical variables are assumed independent within each

latent class s. Green, Carmone, and Wachspress (1976) first recognized the applicability of the latent

class model to market segmentation and applied it to data on the adoption of a telecommunication

service. Lehmann, Moore, and Elrod (1982) used latent class analysis to identify segments of consumers

with limited problem solving versus routinized response behavior. Poulsen (1990) recognized the

applicability of latent class models to consumer choice behavior and identified latent classes based on

zero order and first order Markov switching using binomial and multinomial models. Grover and

Srinivasan (1987) used a similar approach to identify segments based on how consumers switch

between coffee brands based on latent class analysis of a two-way contingency table, assuming zero-

order purchase behavior within segments. They extended this approach to analyze choice behavior over

time through latent class analysis of repeated cross-tables of brand switching (Grover and Srinivasan

1989). Kamakura and Mazzon (1991) and Kamakura and Novak (1992) used Multinomial mixtures for

consumer value segmentation. Kamakura and Wedel (1995) developed a tailored interviewing

procedure for life-style segmentation that classifies consumers into life-style segments using a latent-

class model. They also developed a latent class model for the purpose of fusing two or more

independent data sets (Kamakura and Wedel 1997). Wedel and Kamakura (2000, p. 97) provide an

overview of latent class applications in marketing. None of these models or applications included

predictor variables and, therefore, did not enable response-based segmentation.
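Equation (2) is straightforward to evaluate once class sizes and conditional probabilities are given. The sketch below computes the probability of one cell of the contingency table as the class-size-weighted sum of products of conditional probabilities, which is exactly what local independence implies. The function name, the nested-list layout of `conditional`, and all probabilities are hypothetical, and categories are indexed from 0 rather than 1.

```python
def cell_probability(cell, class_sizes, conditional):
    """Equation (2): sum over classes s of pi_s times the product over
    variables k of P(variable k = c_k | class s).

    conditional[s][k][c] is the probability that variable k takes
    category c within latent class s."""
    total = 0.0
    for s, pi_s in enumerate(class_sizes):
        prob = pi_s
        for k, c in enumerate(cell):
            prob *= conditional[s][k][c]  # local independence within class s
        total += prob
    return total


# Two latent classes, two binary variables (illustrative numbers only).
class_sizes = [0.6, 0.4]
conditional = [
    [[0.9, 0.1], [0.8, 0.2]],  # class 0
    [[0.2, 0.8], [0.3, 0.7]],  # class 1
]
p00 = cell_probability((0, 0), class_sizes, conditional)
table_total = sum(cell_probability((a, b), class_sizes, conditional)
                  for a in (0, 1) for b in (0, 1))
print(p00, table_total)
```

The cell probabilities sum to one over the whole table, which is a quick check that class sizes and conditional probabilities are properly normalized.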

23.5.2 Mixture Regression Models

DeSarbo and Cron (1988) formulated the clusterwise regression problem in a maximum likelihood

framework. Their model is a finite mixture of univariate normal densities in which the expectations of

these densities are specified as linear functions of a set of explanatory variables. The likelihood is:

$$L(\boldsymbol{y}_i \mid \boldsymbol{\Theta}) = \sum_{s=1}^{S} \pi_s \prod_{j=1}^{J} f(y_{i,j} \mid \boldsymbol{\theta}_s), \qquad (3)$$

where: $E[y_{i,j} \mid s] = \boldsymbol{x}_{i,j}\boldsymbol{\beta}_s$. Here, $f(\cdot)$ is the Normal density function. $\boldsymbol{\Theta}$ denotes all model

parameters, and $\boldsymbol{\theta}_s$ the vector of segment-specific parameters; that is, in this case, $\pi_s$ which is the prior

probability or the size of segment s, $\boldsymbol{\beta}_s$ which is the vector of regression parameters, and $\sigma_s^2$ which is the

segment-specific variance of the Normal distribution. The expected memberships, $p_{i,s}$, are known as the

posterior membership probabilities which are useful for assigning consumers to segments after the

model is estimated:

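$$p_{i,s} = \frac{\pi_s \prod_{j=1}^{J} f(y_{i,j} \mid \boldsymbol{\beta}_s, \sigma_s^2)}{\sum_{s'=1}^{S} \pi_{s'} \prod_{j=1}^{J} f(y_{i,j} \mid \boldsymbol{\beta}_{s'}, \sigma_{s'}^2)}. \tag{4}$$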

This model set the stage for the development and application of mixture and mixture regression

models to market segmentation problems. Many finite mixture regression models were developed in

the late 1980s and 1990s (Wedel and Kamakura 2000, p. 118). De Soete and DeSarbo (1991) and Wedel

and DeSarbo (1993) developed finite mixture binomial probit and logit regression models. The finite

mixture multinomial logit regression model developed by Kamakura and Russell (1989) revolutionized

scanner panel data analysis. Wedel, DeSarbo, Bult, and Ramaswamy (1993) proposed a Poisson mixture

regression model for the application to direct mail response. The Poisson distribution was compounded

with a Gamma distribution to account for within-segment heterogeneity by Ramaswamy, Anderson, and

DeSarbo (1994). Rosbergen, Pieters, and Wedel (1997) used a mixture of gamma distributions to

describe gaze durations on advertisements. DeSarbo, Wedel, Vriens, and Ramaswamy (1992) developed

multivariate normal regression mixtures and applied these to metric conjoint analysis. All these models

use the same likelihood as in equation (3), but with 𝑓(∙) indicating these various distribution functions.

Wedel and DeSarbo (1995) provided a general framework for mixtures of generalized linear models

that contains all these models as special cases, where the conditional expectation related to equation

(3) is replaced by 𝑔(𝐸[𝑦𝑖,𝑗 |𝑠]) = 𝒙𝑖,𝑗 𝜷𝑠 , with 𝑔(∙) a link function, such as the logit for the Binomial, the

log for the Poisson and Negative Binomial, and the inverse for the Gamma and Exponential distributions.

Finite mixture models have been estimated via direct numerical maximization of the likelihood function, or via the EM algorithm (Dempster, Laird, and Rubin 1977). The EM algorithm uses a data augmentation approach: it introduces indicator variables for segment membership ($z_{i,s} = 0/1$) into the likelihood function and alternates between taking the expectation of these memberships and maximizing the conditional likelihood given them. Later, these models were expressed in a Bayesian framework. Formulating prior distributions for all parameters, $p(\boldsymbol{\Theta}) = p(\boldsymbol{\pi}_{1:S}) \cdot p(\boldsymbol{\theta}_{1:S})$, using Dirichlet, Normal, and other conjugate distributions allows the posterior distribution to be formulated: $p(\boldsymbol{\Theta} \mid \mathbf{y}) \propto p(\boldsymbol{\Theta}) \prod_i L(\mathbf{y}_i \mid \boldsymbol{\Theta})$. MCMC techniques such as Metropolis-Hastings or data augmentation can be utilized for estimation.
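
The EM alternation described above can be illustrated with a minimal sketch for the Normal mixture regression of equations (3) and (4). This is not the authors' implementation: the function name `em_mixture_regression`, the array layout (`y` of shape N x J, `X` of shape N x J x K), the random initialization, and the small ridge and variance floors are all assumptions made for the illustration.

```python
import numpy as np

def em_mixture_regression(y, X, S, n_iter=100, seed=0):
    """EM for a finite mixture of Normal regressions (illustrative sketch).

    y: (N, J) responses, X: (N, J, K) predictors, S: number of segments.
    Returns segment sizes pi (S,), coefficients beta (S, K),
    variances var (S,), and posterior memberships p (N, S).
    """
    rng = np.random.default_rng(seed)
    N, J, K = X.shape
    Xf, yf = X.reshape(N * J, K), y.reshape(N * J)
    p = rng.dirichlet(np.ones(S), size=N)  # random initial memberships (assumption)
    pi, beta, var = np.empty(S), np.empty((S, K)), np.empty(S)
    for _ in range(n_iter):
        # M-step: weighted least squares per segment, weights = memberships
        pi = p.mean(axis=0)
        for s in range(S):
            w = np.repeat(p[:, s], J)  # one weight per (consumer, observation)
            XtW = Xf.T * w
            # tiny ridge term keeps the system solvable (assumption)
            beta[s] = np.linalg.solve(XtW @ Xf + 1e-8 * np.eye(K), XtW @ yf)
            resid = yf - Xf @ beta[s]
            var[s] = max((w * resid ** 2).sum() / (w.sum() + 1e-12), 1e-8)
        # E-step: posterior memberships of equation (4), computed in log space
        log_post = np.empty((N, S))
        for s in range(S):
            mean = X @ beta[s]  # (N, J) segment-specific expectations
            log_f = -0.5 * np.log(2 * np.pi * var[s]) - 0.5 * (y - mean) ** 2 / var[s]
            log_post[:, s] = np.log(pi[s] + 1e-300) + log_f.sum(axis=1)
        log_post -= log_post.max(axis=1, keepdims=True)
        p = np.exp(log_post)
        p /= p.sum(axis=1, keepdims=True)
    return pi, beta, var, p
```

Consumers can then be assigned to the segment with the largest posterior membership, for example with `p.argmax(axis=1)`. Because the likelihood may have local optima, running the sketch from several seeds and keeping the best solution would be a reasonable precaution.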

One area where mixture regression modeling has been successfully applied is international

market segmentation. Ter Hofstede, Steenkamp, and Wedel (1999) developed a mixture of item

response theory models (IRT) for international market segmentation, while at the same time dealing

with respondents’ heterogeneous scale usage. Ter Hofstede, Wedel, and Steenkamp (2002) proposed a

mixture model for international market segmentation that incorporated spatial contiguity constraints on

segment membership. They used an MCMC procedure for its estimation. Helsen, Jedidi, and DeSarbo

(1993) developed a mixture model that simultaneously identified segments of countries based on the

Bass diffusion model. Lemmens, Croux, and Stremersch (2012) incorporated Markov dynamics in their

mixture model for international market segmentation.

23.6 Phase 4: Extensions of Mixture Regression Models

In the following decade, mixtures of Generalized Linear Regression models were extended in a

variety of directions such as for scanner panel data, conjoint analysis, concomitant variable mixtures for

segment profiling, and Hidden Markov Models for dynamic segmentation.



23.6.1 Mixture Models for Household-Level Scanner Panel Data

The identification of segments from scanner panel data was one of the important applications of

mixture regression models. The mixture of Multinomial distributions provided by Kamakura and Russell

(1989) was one of the most influential of these models, allowing segmentation based on scanner panel

data where segments differed in, amongst others, the parameter that captures the impact of prices on

purchases. Based on this advance, Bucklin, Gupta, and Siddarth (1998) and Bucklin and Gupta (1992) developed joint mixture segmentation models for purchase incidence, brand choice, and purchase

quantity. Along these lines, Kamakura, Kim, and Lee (1996) proposed a mixture of nested Multinomial

Logit models where the nesting structure differed across segments, reflecting consumers’ different

hierarchical decision processes. Kamakura and Russell (1993) showed that a Multinomial mixture model

applied to scanner panel data could be used to measure brand equity. Andrews and Currim (2002) used

a mixture logistic regression model to identify segments based on cross-category purchase behaviors.

23.6.2 Mixture Models for Store-Level Data

Several studies have used mixture models to attempt to recover segment structure from

aggregate (store-level) data without access to individual-level customer information. Zenor and

Srivastava (1993) first estimate a mixture of MNL regression models on aggregate panel data and

validate the aggregate data estimates with individual household panel data estimates. Besanko, Dubé,

and Gupta (2003) analyze firms’ competitive strategies using a finite mixture model calibrated on

aggregate store-level scanner data. Bodapati and Gupta (2004), however, investigated the extent to which segments can be reliably recovered from store-level (scanner) data. The observed store-level choice data consist of the number of times products $j = 0, 1, \ldots, J$ are purchased in store $k$ at each time point $t$, $y_{k,j,t}$, with $j = 0$ representing no purchase. In the resulting likelihood, the probabilities are formulated as the following mixture:



$$L(\mathbf{y}_{k,t} \mid \boldsymbol{\Theta}) = \prod_{j=0}^{J} \Big[ \sum_{s=1}^{S} \pi_s \, p(y_{k,j,t}) \Big]^{y_{k,j,t}}, \tag{5}$$

where 𝑝(𝑦𝑘,𝑗,𝑡 ) is the (nested) Multinomial choice probability. The authors show that the parameters of

the model are theoretically identifiable from store data. But, based on a large number of simulations

they conclude that parameter estimates from store data have a large bias for realistic sample sizes. They

conclude that it is likely that the estimated parameters are far from their true values, thus putting into

question this modeling approach in cases other than those where sample sizes are very large.

23.6.3 Mixture Models for Conjoint Analysis

Next to panel data applications, the application to conjoint analysis was particularly important.

DeSarbo, Wedel, Vriens, and Ramaswamy (1992) developed multivariate normal regression mixtures for

metric conjoint analysis, and Kamakura, Wedel, and Agrawal (1994) developed a Multinomial mixture

for rankings data collected in conjoint analysis. DeSarbo, Ramaswamy, and Cohen (1995) developed a

similar model for choice-based conjoint (CBC), and DeSarbo, Ramaswamy, and Chatterjee (1995)

proposed a Dirichlet mixture for constant sum data collected by asking respondents to allocate points

whose sum was fixed. These models are all special cases of the general regression mixture model

formulated in equation (3) but applied to different types of conjoint analysis studies. In a simulation

study, Vriens, Wedel, and Wilms (1996) showed that the mixture modeling approach to conjoint

analyses outperformed classical two-stage clustering approaches. Finite mixture regression conjoint

analysis has now gained substantial traction in practice.

23.6.4 Concomitant Variable Mixture Regression Models for Segment Profiling

Based on the work by Dayton and MacReady (1988), concomitant-variable mixture models

allowed for the simultaneous profiling of the segments with consumer descriptor variables by

formulating the prior probabilities (or mixing parameters, see equations (2) and (3)) as a logit function of

a vector of exogenous consumer descriptor variables 𝒛𝑖 :



$$\pi_{i,s} = \frac{\exp(\mathbf{z}_i' \boldsymbol{\alpha}_s)}{1 + \sum_{w=1}^{S-1} \exp(\mathbf{z}_i' \boldsymbol{\alpha}_w)}. \tag{6}$$

These models were extended by Kamakura, Wedel, and Agrawal (1994) for conjoint choice analysis and

by Gupta and Chintagunta (1994) for scanner panel data. These and later approaches to concomitant

variable mixtures by Wedel and DeSarbo (2002) allow for the simultaneous profiling of segments in terms

of demographic and socioeconomic variables, integrating an important step in the STP process into the

mixture model framework.
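
As a small illustration of equation (6), the following sketch computes consumer-specific prior probabilities from descriptor variables. The function name `concomitant_priors` and the array shapes are assumptions for the example. The last segment is treated as the reference segment, whose coefficients are fixed at zero, which is what produces the 1 in the denominator of equation (6).

```python
import numpy as np

def concomitant_priors(Z, alpha):
    """Prior segment probabilities as a logit function of descriptors.

    Z: (N, Q) consumer descriptors z_i; alpha: (S-1, Q) coefficients.
    Segment S is the reference segment with linear predictor fixed at 0.
    Returns an (N, S) array whose rows sum to one.
    """
    eta = Z @ alpha.T                                   # (N, S-1) linear predictors
    eta = np.hstack([eta, np.zeros((Z.shape[0], 1))])   # append the reference segment
    eta -= eta.max(axis=1, keepdims=True)               # stabilize the exponentials
    expo = np.exp(eta)
    return expo / expo.sum(axis=1, keepdims=True)
```

In an estimation routine, these consumer-specific priors would take the place of the constant $\pi_s$ in the likelihood, so that the coefficients describe how segment membership varies with the descriptors.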

23.6.5 Mixture Regression Models with Variable Selection

Other methodological advances included those by Kim, Fong, and DeSarbo (2012) who

developed an approach that allows for simultaneous selection of predictor variables within each

segment, thus relaxing the assumption of standard mixture regression models that each segment has

the same set of predictor variables. They used an MCMC procedure for the estimation of this model.

Kim, Fong, Blanchard, and DeSarbo (2013) extended this approach to include managerial constraints on

the segment structure and applied it to consumers’ perceptions of service quality.

23.6.6 Simultaneous Segmentation and Positioning

Empirical methods for STP started with two-stage approaches that employed multidimensional

scaling (MDS) and cluster analysis sequentially, to first portray the relationship between brands and

consumers in a spatial map, and then cluster consumers’ locations on the map to form market segments

(Cooper 1983). However, the two steps in this analysis optimize different and often incongruent loss

functions (Holman 1972), while estimation errors in the MDS step carry over to the clustering stage

rendering the segment solutions unreliable (DeSarbo, Grewal, and Scott 2008). These problems resulted

in the development of finite mixture multidimensional joint space models (DeSarbo, Manrai, and Manrai

1994; Wedel and DeSarbo 1996), which employ either vector (Slater 1960; Tucker 1960) or unfolding

(Coombs 1964) representations. In such finite mixture multidimensional joint space models, in addition to the coordinates of the brands on a spatial map, vectors or ideal points are estimated for each segment rather than for every individual consumer, which significantly reduces the number of parameters estimated.

Many mixture multidimensional joint space models for the analysis of preference/dominance data

have been proposed over the past thirty years. For example, DeSarbo, Howard, and Jedidi (1991)

developed a mixture multidimensional scaling vector model (MULTICLUS) for normally distributed data.

DeSarbo, Jedidi, Cool, and Schendel (1990) extended this model to a mixture weighted ideal point

model. De Soete and Heiser (1993) and De Soete and Winsberg (1993), respectively, extended the

MULTICLUS model to accommodate linear restrictions on the stimulus coordinates. Böckenholt and

Böckenholt (1991) developed simple and weighted ideal-point mixture scaling models for binary data.

DeSarbo, Ramaswamy, and Lenk (1993) developed a mixture vector model for Dirichlet-distributed data.

Chintagunta (1994) developed a mixture vector model for scanner panel choice data that also included

the effects of marketing mix variables, and Wedel, Vriens, Bijmolt, et al. (1998) developed a mixture

unfolding model to map brand positions using conjoint choice data, while also including the effects of

attributes in the conjoint design. Wedel and Kamakura (2000, p. 141) provide an overview of mixture

unfolding applications in marketing.

Wedel and DeSarbo (1996) provided a general framework for two-way preference/dominance data

in the exponential family, that includes both vector and weighted ideal-point representations, and

profiling of the brand coordinates and segment membership (see equation 6) in terms of exogenous

variables. It nests many of the previously proposed models mentioned above. In mixture vector models

the conditional expectation of 𝑦𝑖,𝑗 related to equation (3) is replaced by:

$$g(E[y_{i,j} \mid s]) = c_{j,s} + \sum_{t=1}^{T} u_{j,t} v_{s,t}, \tag{7}$$

while for a mixture unfolding model, it is replaced by:



$$g(E[y_{i,j} \mid s]) = c_{j,s} - \sum_{t=1}^{T} w_{s,t} (u_{j,t} - v_{s,t})^2. \tag{8}$$

Here 𝑔(∙) is a link function akin to these authors’ earlier work on generalized linear model mixtures, 𝑐𝑗,𝑠

are brand constants for each segment s, 𝑢𝑗,𝑡 are brand coordinates on 𝑡 = 1, … , 𝑇 latent dimensions,

𝑤𝑠,𝑡 are dimension weights for each segment (simple unfolding models arise for 𝑤𝑠,𝑡 = 1) and 𝑣𝑠,𝑡

represent segment-specific vectors (equation 7), or ideal points (equation 8) on the T-dimensional

spatial map. These mixture multidimensional joint space models are traditionally estimated using the

method of maximum likelihood using either gradient-based maximization or the EM-algorithm.
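
To make the difference between the two representations concrete, the sketch below evaluates the right-hand sides of equations (7) and (8) for given brand coordinates and segment vectors or ideal points. The function names and array shapes are assumptions for the illustration, and the sketch only computes the linear predictors; it does not estimate any parameters.

```python
import numpy as np

def vector_predictor(c, u, v):
    """Equation (7): c[j, s] + sum_t u[j, t] * v[s, t].

    c: (J, S) brand constants, u: (J, T) brand coordinates,
    v: (S, T) segment vectors. Returns a (J, S) array.
    """
    return c + u @ v.T

def unfolding_predictor(c, u, v, w=None):
    """Equation (8): c[j, s] - sum_t w[s, t] * (u[j, t] - v[s, t]) ** 2.

    v: (S, T) segment ideal points, w: (S, T) dimension weights.
    With w omitted, all weights equal 1 (the simple unfolding model).
    """
    if w is None:
        w = np.ones_like(v)
    diff = u[:, None, :] - v[None, :, :]          # (J, S, T) brand minus ideal point
    return c - (w[None, :, :] * diff ** 2).sum(axis=2)
```

In the vector model the predictor grows without bound along the direction of the segment vector, whereas in the unfolding model it is largest for a brand located exactly at the segment's ideal point.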

Mixture multidimensional joint space models have several limitations: they require distributional assumptions that may be violated, they impose identification restrictions that may be cumbersome, and they require intensive computation that often yields only locally optimal solutions.

DeSarbo, Grewal, and Scott (2008) propose a clusterwise vector model that simultaneously estimates

market segments, their composition, a brand space, and preference vectors per market segment. The

ALS framework that they develop does not require distributional assumptions and renders conditionally

(within an iterate) globally optimum results. DeSarbo, Blanchard, and Atalay (2008) present a

constrained clusterwise unfolding procedure that simultaneously identifies consumer segments, derives

a joint space of brand coordinates and segment-level ideal points, and creates a link between specified

product attributes and brand locations in the derived joint space. This latter feature permits a variety of

brand policy simulations, as well as subsequent positioning optimization and targeting. DeSarbo,

Blanchard, LeBaron and Atalay (2008) generalize this procedure to also accommodate the estimation of multiple segment ideal points across different contexts, but its estimation is much more computationally

intensive. This class of methods produces managerially useful representations for STP.

23.6.7 HMM Models for Dynamic Segmentation



Several approaches have been proposed in the literature that enable segments’ structure to

change over time. Two main categories of these models are models for manifest change and those that

capture latent segment change (Wedel and Kamakura 2000, p. 159-160). Models for manifest change

included (linear or higher order) functions of time into latent class models (Grover and Srinivasan 1989),

or in the regression functions (Poulsen 1990; Ramaswamy 1997; Wedel, Kamakura, DeSarbo, and Ter

Hofstede 1995) or the concomitant variable equation (Kamakura, Kim, and Lee 1996) of finite mixtures

or finite mixture regression models.

Models for latent change utilize Hidden Markov Model (HMM) formulations. HMMs go back to

the work by Baum and Petrie (1966) and Baum, Petrie, Soules, et al. (1970), and conceptually extend

finite mixture models and finite mixture regression models by allowing unobserved classes (segments)

to evolve over time according to a Markov process. That is, the joint probability of being in segment $r(t-1)$ at time $t-1$ and segment $s(t)$ at time $t$ is $\pi_{r(t-1),s(t)} = \pi_{r(t-1)} \pi_{s(t) \mid r(t-1)}$. The likelihood of this model

(Netzer, Ebbes, and Bijmolt 2016) extends equation (3):

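$$L(\mathbf{y}_{i,1:T} \mid \boldsymbol{\Theta}) = \sum_{r=1}^{S} \pi_{r(1)} f(y_{i,1} \mid \boldsymbol{\beta}_{r(1)}) \prod_{\tau=2}^{T} \pi_{s(\tau) \mid r(\tau-1)} f(y_{i,\tau} \mid \boldsymbol{\beta}_{s(\tau)}). \tag{9}$$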

Netzer, Ebbes, and Bijmolt (2016) provide a review of HMM applications in marketing. Specific

applications of the HMM to dynamic segmentation are those by Poulsen (1990), who used it to describe

how customers’ membership in latent segments based on purchase behavior changed over time.

Brangule-Vlagsma, Pieters, and Wedel (2002) used a HMM to describe how value system segments

changed over time. Montgomery, Srinivasan, and Liechty (2004) identified segments dynamically based

on internet browsing behavior, and Du and Kamakura (2006) investigated how household lifecycle

segments evolve. Ebbes, Grewal, and DeSarbo (2009) develop a HMM to identify homogeneous

strategic segments of firms. Netzer, Lattin, and Srinivasan (2008) used a HMM, Romero, van der Lans,

and Wierenga (2013) a partially Hidden HMM, and Zhang, Watson, Palmatier, and Dant (2016) a

multivariate HMM to characterize dynamics in customer relationships. Park and Gupta (2011) looked at

dynamic segmentation based on purchase cycles. Lemmens, Croux, and Stremersch (2012) studied how

segments of countries evolve in the context of new product growth. Shi and Zhang (2014) identified

dynamic segments based on store loyalty and promotion sensitivity, and Ebbes and Netzer (2021)

identified segments based on consumers’ job seeking status.

HMMs have been extended to incorporate consumer variables, marketing actions, and unobserved heterogeneity into the transition probabilities among segments, leading to non-homogeneous HMMs

(Netzer, Lattin, and Srinivasan 2008; Montoya, Netzer, and Jedidi 2010; Li, Sun, and Montgomery 2011;

Zhang, Watson, Palmatier, and Dant 2016). The formulation used is similar to that of the concomitant

variable mixture model (see equation 6), but the exogenous variables 𝒛𝑖,𝜏 specific to consumer i and

time period 𝜏 are now included in the segment transition probabilities:

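$$\pi_{i,s(\tau) \mid r(\tau-1)} = \frac{\exp(\mathbf{z}_{i,\tau}' \boldsymbol{\alpha}_{i,s})}{1 + \sum_{w=1}^{S-1} \exp(\mathbf{z}_{i,\tau}' \boldsymbol{\alpha}_{i,w})}, \tag{10}$$

where possibly $\boldsymbol{\alpha}_{i,s} \sim \Phi(\boldsymbol{\alpha}_{i,s} \mid \bar{\boldsymbol{\alpha}}_s, \boldsymbol{\Delta}_s)$, a Normal distribution to account for unobserved heterogeneity in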

the segment dynamics. These models have been estimated using numerical maximization of the

likelihood or Metropolis-Hastings MCMC algorithms, or the Baum-Welch algorithm (a special case of the

EM algorithm; Baum and Petrie 1966) and data augmentation MCMC algorithms which require forward-

backward recursions over time to evaluate the likelihood (Netzer, Ebbes, and Bijmolt 2016).
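
The forward part of the forward-backward recursions mentioned above can be sketched as follows for a homogeneous HMM. The sketch evaluates the log-likelihood for one consumer by summing over all possible segment paths, using the segment-conditional densities, the initial segment probabilities, and the transition probabilities. The function name `hmm_loglik` and the input layout are assumptions for the illustration; the rescaling at each step is a common device to avoid numerical underflow.

```python
import numpy as np

def hmm_loglik(emission, init, trans):
    """Log-likelihood of one observed sequence under a homogeneous HMM.

    emission: (T, S) array with f(y_t | beta_s) for each period and segment,
    init: (S,) initial segment probabilities,
    trans: (S, S) transition matrix, trans[r, s] = P(s at t | r at t-1).
    """
    alpha = init * emission[0]                 # forward variable at the first period
    loglik = 0.0
    for t in range(emission.shape[0]):
        if t > 0:
            # propagate through the Markov chain, then weight by the densities
            alpha = (alpha @ trans) * emission[t]
        scale = alpha.sum()
        loglik += np.log(scale)                # accumulate the log of the scale factors
        alpha = alpha / scale                  # rescale to avoid underflow
    return loglik
```

The recursion costs on the order of T times S squared operations, whereas a direct sum over all segment paths would require S to the power T terms. For a non-homogeneous HMM, `trans` would be replaced by a period-specific matrix built from equation (10).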

More recently, Bernstein, Modaresi, and Sauré (2019) developed a dynamic clustering approach that adaptively adjusts customer segments and customizes the assortment for each customer as more transaction data become available. The dynamic segmentation procedure is based on a Dirichlet Process Prior (essentially a mixture model with an unknown number of classes, see below) and a bandit algorithm to determine the assortment for each customer.

23.7 Phase 5: Alternative Representations of Consumer Heterogeneity

Finite mixture models are based on the assumption that individual-level model parameters can

take on only a finite number of S values with Multinomial probabilities: 𝑃(𝜷𝑖 = 𝜷𝑠 ) = 𝜋𝑠 . But some

researchers in the 1990s argued that tastes, preferences, and responsiveness to marketing variables are

continuously distributed across the population (Allenby and Lenk 1994; Allenby and Ginter 1995). Those

researchers envisioned market segmentation as an artificial partitioning of a continuous heterogeneity

distribution into discrete segments. In this stream of research, behavioral parameters (e.g., regression

parameters) were estimated at the individual level, mostly assuming a Normal distribution for them, $\boldsymbol{\beta}_i \sim \Phi(\boldsymbol{\beta}_i \mid \bar{\boldsymbol{\beta}}, \boldsymbol{\Sigma})$ (Allenby and Lenk 1994; Allenby and Rossi 1998).

A debate ensued on whether a discrete (finite mixture) or continuous (random coefficients)

heterogeneity distribution was the most appropriate to capture individual differences/heterogeneity

and guide marketing actions (Wedel, Kamakura, Arora, et al. 1999). In particular, the criticisms levied

against the finite mixture approach to market segmentation were the following. 1. The assumption that

within each market segment the parameters capturing the behavior of consumers are identical seems

overly restrictive; 2. If the true underlying heterogeneity distribution is continuous, assuming a finite

mixture distribution of heterogeneity leads to inconsistent parameter estimates; 3. The predictive

power of finite mixture models is limited because individual-level estimates are constrained to lie inside

the convex hull of the segment-level estimates (Wedel, Kamakura, Arora, et al. 1999). In addition, finite

mixtures have associated technical disadvantages related to identifiability of the model parameters, the

existence of local optima in the likelihood, and label switching in the case of MCMC estimation

(Gonçalves-Dias and Wedel 2004). Indeed, for some applications, researchers found that models with

continuous heterogeneity distributions could outperform models that involved a finite mixture model

representation (Vriens, Wedel, and Wilms 1996; Lenk, DeSarbo, Green, and Young 1996).

In contrast, it was claimed that models with a continuous heterogeneity distribution would

enable a more accurate representation of the tails of consumers’ heterogeneity distribution (Allenby

and Rossi 1998). Nonetheless, these models suffer from drawbacks as well, as they are sensitive to the

specific form of the heterogeneity distribution that is assumed (most often the Normal) and a

misrepresentation of that distribution leads to bias in parameter estimates and loss of predictive power.

While individual-level estimates are often easily obtained, the estimates are often unreliable because of

limited individual-level data. In addition, it was shown that finite mixture models can closely

approximate any continuous heterogeneity distribution by letting the number of support points of the

finite mixture grow large (Wedel and Kamakura 2000, p. 326-327).

The extensive simulation studies in conjoint and panel data settings by Andrews, Ansari, and

Currim (2002), and Andrews, Ainslie, and Currim (2002) provided some resolution of this debate.

Simulating both continuous (linear regression) and discrete (Multinomial logit) outcome variables, their

studies revealed that there are no substantial differences between models with continuous and discrete

heterogeneity distributions in terms of parameter recovery and prediction of hold-out data. However,

models with continuous representations of heterogeneity generally provide a better fit to the data but

perform poorly when the number of observations per individual consumer is small. The conclusion from these studies is that, for all practical purposes, neither approach clearly dominates the other in its statistical properties.

The selection of the most adequate representation could thus primarily be driven by substantive

arguments or profitability of the marketing actions in question. As a case in point, Zhang and Wedel

(2009) investigated the profitability of customized promotions targeted at the mass market, segment,

and individual level in online and offline stores using a finite mixture model and an analytical

optimization algorithm to customize promotions. They found that in brick-and-mortar stores, the differences in profit from promotions at these three levels of granularity are negligible, so retailers are best served by undifferentiated promotion strategies. Nonetheless, they found that in online stores

promotions customized at the individual level can lead to meaningful increases in profit over segment-

and mass market–level promotions. Thus, the idea that one-to-one marketing necessarily enhances

profitability seems misguided, as it often also comes with increased costs. When there are economies of scale in the design or implementation of marketing instruments, targeting segments is more beneficial.

Subsequently, research was aimed at combining the two approaches to capturing heterogeneity

in behavior. Initial research did so by compounding the distribution of the dependent variable with a

conjugate heterogeneity distribution and formulating a finite mixture for the compound distribution.

Examples are the mixture of Negative Binomial distributions, arising by compounding the Poisson

distribution for counts with a conjugate Gamma distribution (Ramaswamy, Anderson, and DeSarbo

1994) and a mixture of Dirichlet-Multinomial distributions, arising by compounding the Multinomial

distribution of the dependent variable with a conjugate Dirichlet Distribution (Böckenholt 1993).

Allenby, Arora, and Ginter (1998) proposed the use of finite mixture of Normal distributions to

capture heterogeneity in a Multinomial logit model of consumer choice, which takes the form $\boldsymbol{\beta}_i \sim \sum_{s=1}^{S} \pi_s \Phi(\boldsymbol{\beta}_i \mid \bar{\boldsymbol{\beta}}_s, \boldsymbol{\Sigma}_s)$. By combining discrete and continuous heterogeneity distributions, this approach relaxed the restrictive assumption of mixture regression models that all consumers in a segment have identical parameter values. Indeed, applications revealed substantial

within-segment heterogeneity (Rossi, Allenby, and McCulloch 2005, p. 142-154). A similar model was

later proposed by Varki and Chintagunta (2004). These authors showed that their model outperformed

both the standard finite mixture and the random coefficients logit models. Lenk and DeSarbo (2000)

generalized these models to finite mixtures of random coefficients generalized linear models. The

likelihood of this class of models is:

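$$L(\mathbf{y}_i \mid \boldsymbol{\Theta}) = \sum_{s=1}^{S} \pi_s \int \prod_{j=1}^{J} f(y_{i,j} \mid \boldsymbol{\beta}_i) \, \Phi(\boldsymbol{\beta}_i \mid \bar{\boldsymbol{\beta}}_s, \boldsymbol{\Sigma}_s) \, d\boldsymbol{\beta}_i. \tag{11}$$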

Here $\boldsymbol{\Theta}$ contains all parameters. Simulated likelihood can be used for the estimation (Pakes and Pollard 1989), which takes a fixed number of draws from $\Phi(\boldsymbol{\beta}_i \mid \bar{\boldsymbol{\beta}}_s, \boldsymbol{\Sigma}_s)$ and replaces the integral in (11) with a (normalized) sum over the draws. Alternatively, MCMC procedures can be used, which involve a Metropolis-Hastings step to estimate the individual-level parameters. These models are important because they enable a very flexible representation of heterogeneity.
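
The simulated likelihood idea can be sketched for a single consumer as follows. The sketch assumes, purely for illustration, that $f$ is a Normal density with a known common variance `sigma2` and a linear mean $\mathbf{x}_{i,j} \boldsymbol{\beta}_i$; the function name `simulated_likelihood`, its arguments, and the number of draws `R` are hypothetical choices, not part of the original models.

```python
import numpy as np

def simulated_likelihood(y, X, pi, beta_bar, Sigma, sigma2, R=500, seed=0):
    """Monte Carlo approximation of equation (11) for one consumer.

    y: (J,) responses, X: (J, K) predictors, pi: (S,) segment sizes,
    beta_bar: (S, K) segment means, Sigma: (S, K, K) segment covariances,
    sigma2: variance of the Normal density f (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for s in range(len(pi)):
        # draw R individual-level parameter vectors from the segment's Normal
        draws = rng.multivariate_normal(beta_bar[s], Sigma[s], size=R)   # (R, K)
        mean = draws @ X.T                                               # (R, J)
        dens = np.exp(-0.5 * (y - mean) ** 2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
        # average the product over observations across draws: replaces the integral
        total += pi[s] * dens.prod(axis=1).mean()
    return total
```

When all covariance matrices are zero, every draw equals the segment mean and the expression reduces to the finite mixture likelihood of equation (3), which is one way to see that model (11) nests the standard mixture regression model.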

An even more flexible approach is the so-called Dirichlet Process Prior (DPP), a Bayesian

nonparametric approach in which a prior Dirichlet distribution is placed on the class of possible

distribution functions describing the heterogeneity of parameters. Briefly, one assumes 𝛽𝑖 ~𝐹, with an

unknown distribution function, where 𝐹~𝐷𝑃(𝛾, 𝐹0 ) follows a Dirichlet Process (DP), in which 𝛾

determines the concentration of the distribution around the prior 𝐹0 which is often assumed to be

Normal. Some of the applications of this method predate the use of the mixture of Normal distributions

to represent heterogeneity, and circumvent their problems associated with selecting which number of

segments best represents the data. The result is a nonparametric representation of heterogeneity that

conceptually is a Dirichlet mixture of Normal distributions similar to equation (11), but with an unknown

number of classes (Escobar 1994; Escobar and West 1995). The applications of this Bayesian

nonparametric representation of heterogeneity include those by Ansari and Mela (2003) to e-

customization, by Wedel and Zhang (2004) to analyzing brand competition across categories, by Kim,

Menzefricke, and Feinberg (2004) to consumer choice decisions, by Braun and Bonfrer (2011) to

modeling consumer interactions in networks, and by Bruce (2019) in dynamic models of advertising.

Bruce (2019) shows that the DPP substantially outperforms competing approaches.
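
One way to see why a Dirichlet Process leaves the number of classes unknown is through the Chinese restaurant process, the sequential partition scheme implied by a DP with concentration parameter $\gamma$. The sketch below is an illustration only, not any of the cited authors' estimation procedures, and the function name `crp_partition` is hypothetical. Each new consumer joins an existing class with probability proportional to its current size, or opens a new class with probability proportional to $\gamma$.

```python
import numpy as np

def crp_partition(n, gamma, seed=0):
    """Draw a random partition of n consumers from a Chinese restaurant process.

    gamma: concentration parameter of the Dirichlet Process.
    Returns class labels (n,) and class sizes (number of classes,).
    """
    rng = np.random.default_rng(seed)
    labels = np.zeros(n, dtype=int)
    counts = [1]                                   # the first consumer opens class 0
    for i in range(1, n):
        # existing classes in proportion to size, a new class in proportion to gamma
        probs = np.array(counts + [gamma], dtype=float) / (i + gamma)
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)                       # open a new class
        else:
            counts[k] += 1
        labels[i] = k
    return labels, np.array(counts)
```

Larger values of $\gamma$ tend to produce more classes, and the number of classes grows slowly with the number of consumers rather than being fixed in advance.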

23.8 Future Research

Even after more than 75 years of research, academic and practitioner efforts to further the conceptual and methodological development of market segmentation are called for. Examining the evolution of

market segments as a function of changes in consumer behavior and market structure, new product

entry, dynamic competitive forces, and shifts in the economy remains a challenging problem.

Segmentation approaches using multiple segmentation bases, incorporating meaningful managerial

constraints, and expressed mathematically to obtain unique solutions that satisfy effectiveness criteria

also deserve further research. Segmentation research should be extended to current marketing

problems, including CRM, the customer journey, multichannel, social media, and multimedia behaviors.

An example is the work by Konuş, Verhoef, and Neslin (2008) who used mixture models to identify

segments based on consumers’ multichannel shopping behavior (DeKeyser, Schepers, and Konuş 2015).

Social media data, mobile location data, blogs, online reviews, keyword search data, and image and

video data provide rich multimodal sources of data for psychographic, sociographic, and behavioral

segmentation but pose unique challenges as well (Wedel and Kannan 2016). Natural language

processing and computer vision are already extensively deployed to process such data, and novel

machine learning methods, such as random forests, artificial neural networks, support vector machines,

and deep learning, offer potential for application to large scale, multicriteria, dynamic segmentation

(Wedel and Kannan 2016; Ngai, Xiu, and Chau 2009).

We hope that the historical perspective and review provided in this chapter will help stimulate such further research on market segmentation. We also hope that this research will increasingly involve academic-industry collaboration and the exchange of ideas, theories, data, and software.

Figure 1. Genealogy Map of the Market Segmentation Literature.



References
Allenby, Greg M., Neeraj Arora, and James L. Ginter. 1998. “On the Heterogeneity of Demand.” Journal
of Marketing Research 35(3): 384-89.
Allenby, Greg M. and James L. Ginter. 1995. “Using Extremes to Design Products and Segment Markets.”
Journal of Marketing Research 32 (4): 392-403.
Allenby, Greg M. and Peter J. Lenk. 1994. “Modeling Household Purchase Behavior with Logistic Normal
Regression.” Journal of the American Statistical Association 89 (428): 1218-31.
Allenby, Greg M. and Peter E. Rossi. 1998. “Marketing Models of Consumer Heterogeneity.” Journal of
Econometrics 89 (1–2): 57–78.
Andrews, Rick L, Andrew Ainslie, and Imran S. Currim. 2002. “An Empirical Comparison of Logit Choice
Models with Discrete versus Continuous Representations of Heterogeneity.” Journal of Marketing
Research 39 (4): 479-87.
Andrews, Rick L., Asim Ansari, and Imran S. Currim. 2002. “Hierarchical Bayes versus Finite Mixture
Conjoint Analysis Models: A Comparison of Fit, Prediction, and Partworth Recovery.” Journal of
Marketing Research 39 (1): 87–98.
Andrews, Rick L. and Imran S. Currim. 2002. “Identifying Segments with Identical Choice Behaviors
across Product Categories: An Intercategory Logit Mixture Model.” International Journal of Research in
Marketing 19 (1): 65 – 80.
Ansari, Asim and Carl F. Mela. 2003. "E-Customization." Journal of Marketing Research 40 (2): 131-45.
Aouad, Ali, Adam Elmachtoub, Kris Ferreira, and Ryan McNellis. 2020. "Market Segmentation Trees."
Working Paper, Harvard Business School January 2020.
Arabie, Phipps, J. Carroll, Wayne S. DeSarbo, and Yoram Jerry Wind. 1981. “Overlapping Clustering: A
New Method for Product Positioning.” Journal of Marketing Research 18 (3): 310–17.
Assael, Henry. 1970. “Segmenting Markets by Group Purchasing Behavior: An Application of the AID
Technique.” Journal of Marketing Research 7 (2): 153-8.
Assael, Henry and A. Marvin Roscoe, Jr. 1976. “Approaches to Market Segmentation Analysis.” Journal of
Marketing 40 (4): 67-76.
Baker, M.J. 1988. “Marketing Strategy and Management.” McMillan Education, New York.
Bartels, Robert. 1988. “The History of Marketing Thought.” Publishing Horizons, Columbus, OH.
Bass, Frank M., Douglas J. Tigert, and Ronald T. Lonsdale. 1968. “Market Segmentation: Group versus
Individual Behavior.” Journal of Marketing Research 5(3): 264-70.
Baum, Leonard E. and Ted Petrie. 1966. "Statistical inference for Probabilistic Functions of Finite State
Markov Chains." The Annals of Mathematical Statistics 37 (6): 1554-63.
Baum, Leonard E., Ted Petrie, George Soules, and Norman Weiss. 1970. "A Maximization Technique
Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains." The Annals of
Mathematical Statistics 41 (1): 164-71.
Beckwith, Neil E. and Maurice W. Sasieni. 1976. “Criteria for Market Segmentation Studies.”
Management Science 22 (8): 892–903.

Bernstein, Fernando, Sajad Modaresi, and Denis Sauré. 2019. “A Dynamic Clustering Approach to Data-
Driven Assortment Personalization.” Management Science 65 (5): 2095-115.
Besanko, David, Jean-Pierre Dubé, and Sachin Gupta. 2003. “Competitive Price Discrimination in a
Vertical Channel Using Aggregate Retail Data.” Management Science 49 (9): 1121–38.
Böckenholt, Ulf. 1993. “Estimating Latent Distributions in Recurrent Choice Data.” Psychometrika 58 (3):
489–509.
Böckenholt, Ulf and Ingo Böckenholt. 1991. “Constrained Latent Class Analysis: Simultaneous
Classification and Scaling of Discrete Choice Data.” Psychometrika 56 (December): 699–716.
Bodapati, Anand V. and Sachin Gupta. 2004. “The Recoverability of Segmentation Structure from Store-
Level Aggregate Data.” Journal of Marketing Research 41 (3): 351-64.
Bolton, R.N. and M.B. Myers. 2003. “Price-Based Global Market Segmentation for Services.” Journal of
Marketing 67 (3): 108-28.
Brangule-Vlagsma, Kristine, Rik G. Pieters, and Michel Wedel. 2002. “The Dynamics of Value Segments:
Modeling Framework and Empirical Illustration.” International Journal of Research in Marketing 19 (3):
267-85.
Braun, Michael and André Bonfrer. 2011. "Scalable Inference of Customer Similarities from Interactions
Data Using Dirichlet Processes." Marketing Science 30 (3): 513-31.
Breiman, Leo. 2001. "Random Forests." Machine Learning 45 (1): 5–32.
Breiman, Leo, Jerome H. Friedman, Richard A. Olshen, and Charles J. Stone. 1984. “Classification and
Regression Trees (Wadsworth Statistics/Probability).” Chapman & Hall/CRC, New York.
Brody, Robert P. and Scott M. Cunningham. 1968. “Personality Variables and the Consumer Decision
Process.” Journal of Marketing Research 5 (1): 50–7.
Bruce, Norris I. 2019. “Bayesian Nonparametric Dynamic Methods: Applications to Linear and Nonlinear
Advertising Models.” Journal of Marketing Research 56 (2): 211-29.
Brusco, Michael J., J. Dennis Cradit, and Stephanie Stahl. 2002. “A Simulated Annealing Heuristic for a
Bicriterion Partitioning Problem in Market Segmentation.” Journal of Marketing Research 39 (1): 99-109.
Brusco, Michael J., J. Dennis Cradit, and Armen Tashchian. 2003. “Multicriterion Clusterwise Regression
for Joint Segmentation Settings: An Application to Customer Value.” Journal of Marketing Research 40
(2): 225-34.
Bucklin, Randolph E. and Sunil Gupta. 1992. “Brand Choice, Purchase Incidence, and Segmentation: An
Integrated Modeling Approach.” Journal of Marketing Research 29 (2): 201–15.
Bucklin, Randolph E., Sunil Gupta, and S. Siddarth. 1998. “Determining Segmentation in Sales Response
across Consumer Purchase Behaviors.” Journal of Marketing Research 35 (2): 189-97.
Calantone, Roger J. and Alan G. Sawyer. 1978. “The Stability of Benefit Segments.” Journal of Marketing
Research 15 (3): 395-404.
Chintagunta, P.K. 1994. “Heterogeneous Logit Model Implications for Brand Positioning.” Journal of
Marketing Research 31 (2): 304–11.
Christensen, Clayton M., Scott Cook, and Taddy Hall. 2005. “Marketing Malpractice: The Cause and the
Cure.” Harvard Business Review 83 (12): 74-83.
Claycamp, Henry J. 1965. “Characteristics of Owners of Thrift Deposits in Commercial Banks and Savings
and Loan Associations.” Journal of Marketing Research 2 (2): 163–70.
Claycamp, Henry J. and William F. Massy. 1968. “A Theory of Market Segmentation.” Journal of
Marketing Research 5 (4): 388-94.
Coombs, C.H. 1964. “A Theory of Data.” John Wiley & Sons, New York.
Cooper, L. G. 1983. “A Review of Multidimensional Scaling in Marketing Research.” Applied Psychological
Measurement 7 (4): 427-50.
Currim, Imran S. 1981. “Using Segmentation Approaches for Better Prediction and Understanding from
Consumer Mode Choice Models.” Journal of Marketing Research 18 (3): 301-9.
Dayton, C. Mitchell and George B. MacReady. 1988. “Concomitant-Variable Latent-Class Models.”
Journal of the American Statistical Association 83 (401): 173–8.
De Keyser, Arne, Jeroen Schepers, and Umut Konuş. 2015. “Multichannel customer segmentation: Does
the after-sales channel matter? A replication and extension.” International Journal of Research in
Marketing 32 (4): 453-56.
Dempster, A.P., N.M. Laird, and D.B. Rubin. 1977. “Maximum Likelihood from Incomplete Data via the
EM Algorithm.” Journal of the Royal Statistical Society, Series B 39 (1): 1–38.
DeSarbo, Wayne S., A. Selin Atalay, David LeBaron, and Simon J. Blanchard. 2008. “Estimating Multiple
Consumer Segment Ideal Points from Context-Dependent Survey Data.” Journal of Consumer Research
35 (1): 142-53.
DeSarbo, Wayne S., Simon J. Blanchard, and Selin Atalay. 2008. “A New Spatial Classification
Methodology for Simultaneous Segmentation, Targeting, and Positioning (STP Analysis) for Marketing
Research.” Review of Marketing Research 5: 75-103.
DeSarbo, Wayne S., Ashley Stadler Blank, and Qian Chen. 2017. “A Parametric Constrained
Segmentation Methodology for Application in Sport Marketing.” Customer Needs and Solutions 4 (4): 37-
55.
DeSarbo, Wayne S., J. Douglas Carroll, Linda A. Clark, and Paul E. Green. 1984. “Synthesized Clustering: A
Method for Amalgamating Alternative Clustering Bases with Differential Weighting of Variables.”
Psychometrika 49 (1): 57–78.
DeSarbo, Wayne S. and William L. Cron. 1988. “A Maximum Likelihood Methodology for Clusterwise
Linear Regression.” Journal of Classification 5 (2): 249-89.
DeSarbo, Wayne S. and Christian F. DeSarbo. 2001. “A Generalized Normative Segmentation
Methodology Employing Conjoint Analysis.” In Conjoint Measurement, Eds. Anders Gustafsson, Andreas
Hermann, and Frank Huber. Springer, London: 447-8.
DeSarbo, Wayne S., R. Grewal, and C. Scott. 2008. “A Clusterwise Bilinear Multidimensional Scaling
Methodology for Simultaneous Segmentation and Positioning Analyses.” Journal of Marketing Research
45 (3): 280–92.
DeSarbo, Wayne S. and Douglas B. Grisaffe. 1998. “Combinatorial Optimization Approaches to
Constrained Market Segmentation: An Application to Industrial Market Segmentation.” Marketing
Letters 9 (2): 115-34.
DeSarbo, Wayne S., Daniel J. Howard, and Kamel Jedidi. 1991. “MULTICLUS: A New Method for
Simultaneously Performing Multidimensional Scaling and Cluster Analysis.” Psychometrika 56 (March):
121–36.
DeSarbo, Wayne S., K. Jedidi, K. Cool, and D. Schendel. 1990. “Simultaneous Multidimensional Unfolding
and Cluster Analysis: An Investigation of Strategic Groups.” Marketing Letters 2: 129–46.
DeSarbo, Wayne S. and Vijay Mahajan. 1984. “Constrained Classification: The Use of A Priori Information
in Cluster Analysis.” Psychometrika 49 (2): 187-216.
DeSarbo, Wayne S., Ajay K. Manrai, and L.A. Manrai. 1994. “Latent Class Multidimensional Scaling
Approaches: A Review of the Recent Developments in the Marketing and Psychometric Literature.” In
Advanced Methods of Marketing Research, ed. Richard P. Bagozzi. Blackwell Publishers, Oxford, UK: 190-
222.
DeSarbo, Wayne S., R. O. Oliver, and A. Rangaswamy. 1990. “A Simulated Annealing Methodology for
Clusterwise Linear Regression.” Psychometrika 54 (4): 707-36.
DeSarbo, Wayne S., Venkatram Ramaswamy, and Rabikar Chatterjee. 1995. “Analyzing Constant-Sum
Multiple Criterion Data: A Segment-Level Approach.” Journal of Marketing Research 32 (2): 222-32.
DeSarbo, Wayne S., Venkatram Ramaswamy, and Steven H. Cohen. 1995. “Marketing Segmentation with
Choice-Based Conjoint Analysis.” Marketing Letters 6 (2): 137-47.
DeSarbo, Wayne S., Venkatram Ramaswamy, and Peter Lenk. 1993. “A Latent Class Procedure for the
Structural Analysis of Two-Way Compositional Data.” Journal of Classification 10 (2): 159–93.
DeSarbo, Wayne S., Michel Wedel, K. Vriens, and V. Ramaswamy. 1992. “Latent Class Metric Conjoint
Analysis.” Marketing Letters 3 (3): 273-88.
DeSarbo, Wayne S. and Jianan Wu. 2016. “The Joint Spatial Representation of Multiple Variable
Batteries Collected in Marketing Research.” Journal of Marketing Research 38 (2): 244-53.
De Soete, Geert and Wayne S. DeSarbo. 1991. “A Latent Class Probit Model for Analyzing Pick any/N
data.” Journal of Classification 8 (1): 45-63.
De Soete, Geert, Wayne S. DeSarbo, and J. Carroll. 1985. “Optimal variable weighting for hierarchical
clustering: An alternating least-squares algorithm.” Journal of Classification 2: 173-92.
De Soete, Geert and Willem Heiser. 1993. “A Latent Class Unfolding Model for Analyzing Single Stimulus
Preference Ratings.” Psychometrika 58 (4): 545-65.
De Soete, Geert and S. Winsberg. 1993. “A Latent Class Vector Model for Preference Ratings.” Journal of
Classification 10 (December): 192–218.
Dhalla, Nariman K. and Winston H. Mahatoo. 1976. “Expanding the Scope of Segmentation Research:
Segmentation Research Must Cover More of the Total Marketing Problem if it is to be Operational and
Profitable.” Journal of Marketing 40 (2): 34-41.
Dias, José Gonçalves and Michel Wedel. 2004. “An Empirical Comparison of EM, SEM and MCMC
Performance for Problematic Gaussian Mixture Likelihoods.” Statistics and Computing 14 (4): 323-32.
Dickson, Peter R. 1982. “Person-Situation: Segmentation’s Missing Link.” Journal of Marketing 46 (4): 56-
64.
Dickson, Peter R. and James L. Ginter. 1987. “Market Segmentation, Product Differentiation, and
Marketing Strategy.” Journal of Marketing 51 (2): 1-10.
Doyle, Peter and Ian Fenwick. 1975. “The Pitfalls of AID Analysis.” Journal of Marketing Research 12 (4):
408-13.
Doyle, Peter and P. Hutchinson. 1976. “The Identification of Target Markets.” Decision Sciences 7: 152-
61.
Doyle, Peter and John Saunders. 1985. “Market Segmentation and Positioning in Specialized Industrial
Markets.” Journal of Marketing 49 (2): 24-32.
Du, Rex Yuxing and Wagner A. Kamakura. 2006. “Household Life Cycles and Lifestyles in the United
States.” Journal of Marketing Research 43 (1): 121-32.
Ebbes, Peter, Rajdeep Grewal, and Wayne S. DeSarbo. 2009. “Modelling Strategic Group Dynamics: A
Hidden Markov Approach.” Quantitative Marketing and Economics 8 (2): 241-74.
Ebbes, Peter and O. Netzer. 2021. “Using social media data to identify and target job seekers.”
Management Science: forthcoming.
Ekelund, Robert B. 1970. “Price Discrimination and Product Differentiation in Economic Theory: An Early
Analysis.” The Quarterly Journal of Economics 84 (2): 268-278.
Elrod, Terry and Russell S. Winer. 1982. “An Empirical Evaluation of Aggregation Approaches for
Developing Market Segments.” Journal of Marketing 46 (4): 65-74.
Escobar, Michael D. 1994. "Estimating Normal Means with a Dirichlet Process Prior." Journal of the
American Statistical Association 89 (425): 268-77.
Escobar, Michael D. and Mike West. 1995. "Bayesian Density Estimation and Inference Using Mixtures."
Journal of the American Statistical Association 90 (430): 577-88.
Frank, Ronald E. and Paul E. Green. 1968. “Numerical Taxonomy in Marketing Analysis: A Review
Article.” Journal of Marketing Research 5 (1): 83-94.
Frank, Ronald E., William F. Massy, and Yoram Wind. 1972. “Market Segmentation.” Prentice Hall Inc,
Englewood Cliffs, NJ.
Fullerton, Ronald A. 1988. “How Modern Is Modern Marketing? Marketing's Evolution and the Myth of
the "Production Era".” Journal of Marketing 52 (1): 108-125.
Gensch, Dennis H. 1978. “Image-Measurement Segmentation.” Journal of Marketing Research 15 (3):
384-94.
Gensch, Dennis H. 1985. “Empirically Testing a Disaggregate Choice Model for Segments.” Journal of
Marketing Research 22 (4): 462-67.
Goodman, Leo A. 1974. “Exploratory Latent Structure Analysis using both Identifiable and Unidentifiable
Models.” Biometrika 61: 215–31.
Green, Paul E. 1977. “A New Approach to Market Segmentation.” Business Horizons 20 (1): 61-73.
Green, Paul E., Frank J. Carmone, and David P. Wachspress. 1976. “Consumer Segmentation via Latent
Class Analysis.” Journal of Consumer Research 3 (3): 170–4.
Green, Paul E. and Abba M. Krieger. 1991. “Segmenting Markets with Conjoint Analysis.” Journal of
Marketing 55 (4): 20-31.
Greeno, Daniel W., Montrose S. Sommers, and Jerome B. Kernan. 1973. “Personality and Implicit
Behavior Patterns.” Journal of Marketing Research 10 (1): 63–9.
Grover, Rajiv and V. Srinivasan. 1987. “A Simultaneous Approach to Market Segmentation and Market
Structuring.” Journal of Marketing Research 24 (2): 139–53.
Grover, Rajiv and V. Srinivasan. 1989. “An Approach for Tracking Within-Segment Shifts in Market
Share.” Journal of Marketing Research 26 (May): 230–6.
Gupta, Sachin and Pradeep K. Chintagunta. 1994. “On Using Demographic Variables to Determine
Segment Membership in Logit Mixture Models.” Journal of Marketing Research 31 (1): 128-36.
Haley, Russell I. 1968. "Benefit Segmentation: A Decision-Oriented Research Tool." Journal of Marketing
32 (3): 30-5.
Hauser, John R. and Glen L. Urban. 1977. “A Normative Methodology for Modeling Consumer Response
to Innovation.” Operations Research 25 (4): 579–619.
Helsen, Kristiaan, Kamel Jedidi, and Wayne S. DeSarbo. 1993. “A New Approach to Country
Segmentation Utilizing Multinational Diffusion Patterns.” Journal of Marketing 57 (4): 60-71.
Holman, Eric W. 1972. “The Relation Between Hierarchical and Euclidean Models for Psychological
Distances.” Psychometrika 37 (4): 417–23.
Hruschka, H. 1986. “Market Definition and Segmentation using Fuzzy Clustering Methods.” International
Journal of Research in Marketing 3 (2): 117-34.
Johnson, Richard M. 1971. “Market Segmentation: A Strategic Management Tool.” Journal of Marketing
Research 8 (1): 13-8.
Kahle, Lynn R., Sharon E. Beatty, and Pamela Miles Homer. 1986. “Alternative Measurement Approaches
to Consumer Values: The List of Values (LOV) and Values and Life Style (VALS).” Journal of Consumer
Research 13 (3): 405–9.
Kamakura, Wagner A. 1988. “A Least Squares Procedure for Benefit Segmentation with Conjoint
Experiments.” Journal of Marketing Research 25 (2): 157-67.
Kamakura, Wagner A., Byung-Do Kim, and Jonathan Lee. 1996. "Modeling Preference and Structural
Heterogeneity in Consumer Choice." Marketing Science 15 (2): 152-72.
Kamakura, Wagner A. and José A. Mazzon. 1991. "Values Segmentation: A Model for the Measurement
of Values and Value Systems." Journal of Consumer Research 18 (2): 208-18.
Kamakura, Wagner A. and Thomas P. Novak. 1992. “Value-System Segmentation: Exploring the Meaning
of LOV.” Journal of Consumer Research 19 (1): 119–32.
Kamakura, Wagner A. and Gary J. Russell. 1989. “A Probabilistic Choice Model for Market Segmentation
and Elasticity Structure.” Journal of Marketing Research 26 (4): 379-90.
Kamakura, Wagner A. and Gary J. Russell. 1993. “Measuring Brand Value with Scanner Data.”
International Journal of Research in Marketing 10 (1): 9-22.
Kamakura, Wagner A. and Michel Wedel. 1995. “Life-Style Segmentation with Tailored Interviewing.”
Journal of Marketing Research 32 (3): 308-17.
Kamakura, Wagner A. and Michel Wedel. 1997. “Statistical Data Fusion for Cross-Tabulation.” Journal of
Marketing Research 34 (4): 485-498.
Kamakura, Wagner A., Michel Wedel, and Jagadish Agrawal. 1994. “Concomitant Variable Latent Class
Models for Conjoint Analysis.” International Journal of Research in Marketing 11 (5): 451-64.
Kass, G. V. 1980. “An Exploratory Technique for Investigating Large Quantities of Categorical Data.”
Applied Statistics 29 (2): 119–27.
Kim, Sunghoon, Duncan K.H. Fong, Simon J. Blanchard, and Wayne S. DeSarbo. 2013. “Implementing
Managerial Constraints in Model-Based Segmentation: Extensions of Kim, Fong, and DeSarbo (2012)
with an Application to Heterogeneous Perceptions of Service Quality.” Journal of Marketing Research 50
(5): 664-73.
Kim, Sunghoon, Duncan K.H. Fong, and Wayne S. DeSarbo. 2012. “Model-Based Segmentation Featuring
Simultaneous Segment-Level Variable Selection.” Journal of Marketing Research 49 (5): 725-36.
Kim, Jin Gyo, Ulrich Menzefricke and Fred M. Feinberg. 2004. “Assessing Heterogeneity in Discrete
Choice Models Using a Dirichlet Process Prior.” Review of Marketing Science 2 (1): 1-1.
Konuş, Umut, Peter C. Verhoef, and Scott A. Neslin. 2008. “Multichannel Shopper Segments and Their
Covariates.” Journal of Retailing 84 (4): 398-413.
Kotler, P. 1988. “Marketing Management: Analysis, Planning, Implementation, and Control.” Englewood
Cliffs, Prentice-Hall, NJ.
Kotler, Philip and Kevin L. Keller. 2016. “Marketing Management, 15th ed.” Pearson, NY.
Krieger, Abba M. and Paul E. Green. 1996. “Modifying Cluster Based Segments to Enhance Agreement
with an Exogenous Response Variable.” Journal of Marketing Research 33 (3): 351-63.
Lastovicka, John L. 1982. “On the Validation of Lifestyle Traits: A Review and Illustration.” Journal of
Marketing Research 19 (1): 126–38.
Lazarsfeld, Paul F. 1950. “The Logical and Mathematical Foundation of Latent Structure Analysis and the
Interpretation and Mathematical Foundation of Latent Structure Analysis.” In Measurement and
Prediction, Eds. S.A. Stouffer, L. Guttman, E. A. Suchman, et al. Princeton University Press, Princeton, NJ:
362–472.
Lehmann, Donald R., William L. Moore, and Terry Elrod. 1982. “The Development of Distinct Choice
Process Segments Over Time: A Stochastic Modeling Approach.” Journal of Marketing 46 (2): 48–59.
Lemmens, Aurélie, Christophe Croux, and Stefan Stremersch. 2012. “Dynamics in the International
Market Segmentation of New Product Growth.” International Journal of Research in Marketing 29 (1):
81-92.
Lenk, Peter J. and Wayne S. DeSarbo. 2000. "Bayesian Inference for Finite Mixtures of Generalized Linear
Models with Random Effects." Psychometrika 65 (1): 93-119.
Lenk, Peter J., Wayne S. DeSarbo, Paul E. Green, and Martin R. Young. 1996. “Hierarchical Bayes Conjoint
Analysis: Recovery of Partworth Heterogeneity from Reduced Experimental Designs.” Marketing Science
15 (2): 173–91.
Lesser, Jack A. and Marie Adele Hughes. 1986. “The Generalizability of Psychographic Market Segments
across Geographic Locations.” Journal of Marketing 50 (1): 18–27.
Lessig, V. Parker and John O. Tollefson. 1971. “Market Segmentation Through Numerical Taxonomy.” Journal of
Marketing Research 8 (4): 480–7.
Li, Shibo, Baohong Sun, and Alan L. Montgomery. 2011. “Cross-Selling the Right Product to the Right
Customer at the Right Time.” Journal of Marketing Research 48 (4): 683-700.
Liu, Ying, Sudha Ram, Robert Lusch, and Michael Brusco. 2010. “Multicriterion Market Segmentation: A
New Model, Implementation, and Evaluation.” Marketing Science 29 (5): 880-94.
MacLachlan, Douglas L. and Johny K. Johansson. 1981. “Market Segmentation with Multivariate AID.”
Journal of Marketing 45 (1): 74-84.
Magidson, J. 1994. “The CHAID Approach to Segmentation Modelling: Chi-Square Automatic Interaction
Detection.” In Advanced Methods of Marketing Research, Ed. Richard P. Bagozzi. Blackwell Publishers,
Cambridge, MA: 118-59.
Mahajan, Vijay and Arun K. Jain. 1978. “An Approach to Normative Segmentation.” Journal of Marketing
Research 15 (3): 338-45.
Martin, Claude R. and Roger L. Wright. 1974. “Profit-Oriented Data Analysis for Market Segmentation:
An Alternative to AID.” Journal of Marketing Research 11 (3): 237-42.
Massy, William F. and Ronald E. Frank. 1965. “Short Term Price and Dealing Effects in Selected Market
Segments.” Journal of Marketing Research 2 (2): 171-85.
McCann, John M. 1974. “Market Segment Response to the Marketing Decision Variables.” Journal of
Marketing Research 11 (4): 399-412.
Montgomery, Alan L., Shibo Li, Kannan Srinivasan, and John C. Liechty. 2004. “Modeling Online Browsing
and Path Analysis using Clickstream Data.” Marketing Science 23 (4): 579-95.
Montoya, Ricardo, Oded Netzer, and Kamel Jedidi. 2010. “Dynamic Allocation of Pharmaceutical
Detailing and Sampling for Long-Term Profitability.” Marketing Science 29 (5): 909-24.
Moore, William L. 1980. “Levels of Aggregation in Conjoint Analysis: An Empirical Comparison.” Journal
of Marketing Research 17 (4): 516-23.
Moriarty, Mark and M. Venkatesan. 1978. “Concept Evaluation and Market Segmentation.” Journal of
Marketing 42 (3): 82–6.
Morrison, Donald G. 1973. “Evaluating Market Segmentation Studies: The Properties of R².”
Management Science 19 (11): 1213-21.
Myers, James H. 1976. “Benefit Structure Analysis: A New Tool for Product Planning.” Journal of
Marketing 40 (4): 23-32.
Myers, James H. 1996. “Segmentation and Positioning for Strategic Marketing Decisions.” American
Marketing Association Publishing, Chicago.
Netzer, Oded, Peter Ebbes, and Tammo Bijmolt. 2016. “Hidden Markov Models in Marketing.” In
Advanced Methods in Modeling Markets, Eds. Peter S.H. Leeflang, Jaap E. Wieringa, Tammo H.A.
Bijmolt, and Koen H. Pauwels. Springer International Series in Quantitative Marketing, New York: 405-
52.
Netzer, Oded, James M. Lattin, and V. Srinivasan. 2008. “A Hidden Markov Model of Customer
Relationship Dynamics.” Marketing Science 27 (2): 185-204.
Newcomb, Simon. 1886. “A Generalized Theory of the Combination of Observations so as to Obtain the
Best Result.” American Journal of Mathematics 8 (4): 343-66.
Ngai, Eric W. T., Li Xiu, and Dorothy C. K. Chau. 2009. “Review: Application of Data Mining Techniques in
Customer Relationship Management: A Literature Review and Classification.” Expert Systems with
Applications 36 (2): 2592–602.
Novak, Thomas P. and Bruce MacEvoy. 1990. "On Comparing Alternative Segmentation Schemes: The
List of Values and Values and Life Styles." Journal of Consumer Research 17 (1): 105-9.
Ogawa, Kohsuke. 1987. “An Approach to Simultaneous Estimation and Segmentation in Conjoint
Analysis.” Marketing Science 6 (1): 66-81.
Pakes, Ariel and David Pollard. 1989. “Simulation and the Asymptotics of Optimization Estimators.”
Econometrica 57 (5): 1027–57.
Park, Sungho and Sachin Gupta. 2011. “A Regime-Switching Model of Cyclical Category Buying.”
Marketing Science 30 (3): 469-80.
Pearson, Karl. 1894. “Contributions to the Mathematical Theory of Evolution.” Philosophical Transactions
of the Royal Society of London A (185): 71-110.
Pigou, Arthur C. 1920. “The Economics of Welfare.” Macmillan, London.
Plummer, Joseph T. 1974. “The Concept and Application of Life Style Segmentation.” Journal of
Marketing 38 (1): 33–7.
Poulsen, Carsten Stig. 1990. “Mixed Markov and Latent Markov Modelling Applied to Brand Choice
Behaviour.” International Journal of Research in Marketing 7 (1): 5–19.
Punj, Girish and David W. Stewart. 1983. “Cluster Analysis in Marketing Research: Review and
Suggestions for Application.” Journal of Marketing Research 20 (2): 134-48.
Ramaswamy, Venkatram. 1997. “Evolutionary Preference Segmentation with Panel Survey Data: An
application to New Products.” International Journal of Research in Marketing 14 (1): 57–80.
Ramaswamy, Venkatram, Eugene W. Anderson, and Wayne S. DeSarbo. 1994. “A Disaggregate Negative
Binomial Regression Procedure for Count Data Analysis.” Management Science 40 (3): 405–17.
Rao, Vithala R. and Frederick W. Winter. 1978. “An Application of the Multivariate Probit Model to
Market Segmentation and Product Design.” Journal of Marketing Research 15 (3): 361-8.
Roberts, John H., Ujwal Kayande, and Stefan Stremersch. 2014. “From academic research to marketing
practice: Exploring the marketing science value chain.” International Journal of Research in Marketing 31
(2): 127-140.
Robinson, Joan. 1938. “The Economics of Imperfect Competition.” Macmillan, London.
Romero, Jaime, Ralf van der Lans, and Berend Wierenga. 2013. “A Partially Hidden Markov Model of
Customer Dynamics for CLV Measurement.” Journal of Interactive Marketing 27 (3): 185-208.
Rosbergen, Edward, Rik Pieters, and Michel Wedel. 1997. “Visual Attention to Advertising: A Segment-
Level Analysis.” Journal of Consumer Research 24 (3): 305–14.
Rossi, P.E., Greg M. Allenby, and Robert McCulloch. 2005. “Bayesian Statistics and Marketing.” Wiley,
New York.
Sewall, Murphy A. 1978. “Market Segmentation Based on Consumer Ratings of Proposed Product
Designs.” Journal of Marketing Research 15 (4): 557-64.
Sexton, Donald E., Jr. 1974. “A Cluster Analytic Approach to Market Response Functions.” Journal of
Marketing Research 11 (1): 109–14.
Shi, Wei and Jie Zhang. 2014. “Usage Experience with Decision Aids and Evolution of Online Purchase
Behavior.” Marketing Science 33 (6): 763-884.
Slater, Patrick. 1960. “The Analysis of Personal Preferences.” British Journal of Statistical Psychology 13
(2): 119–35.
Smith, Wendell R. 1956. "Product Differentiation and Market Segmentation as Alternative Marketing
Strategies." Journal of Marketing 21 (1): 3-8.
Späth, H. 1979. “Clusterwise Linear Regression.” Computing 22: 367-73.
Tedlow, Richard S. 1996. “New and Improved: The Story of Mass Marketing in America.” Harvard
Business School Press, Boston, MA.
Ter Hofstede, Frenkel, Jan-Benedict Steenkamp, and Michel Wedel. 1999. “International Market
Segmentation Based on Consumer–Product Relations.” Journal of Marketing Research 36 (1): 1-17.
Ter Hofstede, Frenkel, Michel Wedel, and Jan-Benedict E.M. Steenkamp. 2002. “Identifying Spatial
Segments in International Markets.” Marketing Science 21 (2): 160-77.
Tollefson, John O. and V. Parker Lessig. 1978. “Aggregation Criteria in Normative Market Segmentation
Theory.” Journal of Marketing Research 15 (3): 346-55.
Tucker, L.R. 1960. “Intra-individual and Inter-individual Multidimensionality.” In Psychological Scaling,
Eds. H. Gulliksen and S. Messick. John Wiley & Sons, New York: 155-67.
Twedt, D.W. 1964. “How Important to Marketing Strategy is the Heavy User?” Journal of Marketing 28
(1): 71-2.
Varki, Sajeev and Pradeep K. Chintagunta. 2004. “The Augmented Latent Class Model: Incorporating
Additional Heterogeneity in the Latent Class Model for Panel Data.” Journal of Marketing Research 41
(2): 226-33.
Verhoef, Peter C., Penny N. Spring, Janny C. Hoekstra, and Peter S.H. Leeflang. 2003. “The Commercial
Use of Segmentation and Predictive Modeling Techniques for Database Marketing in the Netherlands.”
Decision Support Systems 34 (4): 471-81.
Vriens, Marco, Michel Wedel, and Tom Wilms. 1996. “Metric Conjoint Segmentation Methods: A Monte
Carlo Comparison.” Journal of Marketing Research 33 (1): 73–85.
Wedel, Michel and Wayne S. DeSarbo. 1993. “A Latent Class Binomial Logit Methodology for the
Analysis of Paired Comparison Choice Data.” Decision Sciences 24 (6): 1157-70.
Wedel, Michel and Wayne S. DeSarbo. 1995. “A Mixture Likelihood Approach for Generalized Linear
Models.” Journal of Classification 12 (1): 21-55.
Wedel, Michel and Wayne S. DeSarbo. 1996. “An Exponential-Family Multidimensional Scaling Mixture
Methodology.” Journal of Business and Economic Statistics 14 (4): 447-59.
Wedel, Michel and Wayne S. DeSarbo. 2002. “Market Segment Derivation and Profiling via a Finite
Mixture Model Framework.” Marketing Letters 13 (1): 17-25.
Wedel, Michel, Wayne S. DeSarbo, J. R. Bult, and V. Ramaswamy. 1993. “A Latent Class Poisson
Regression Model for Heterogeneous Count Data.” Journal of Applied Econometrics 8 (4): 397-411.
Wedel, Michel and Wagner A. Kamakura. 2000. “Market Segmentation: Conceptual and Methodological
Foundations.” Kluwer Academic Publishers, Norwell, MA.
Wedel, Michel, Wagner A. Kamakura, N. Arora, A. Bemmaor, J. Chiang, T. Elrod, R. Johnson, P. Lenk, S.
Neslin, and C.S. Poulsen. 1999. “Discrete and continuous representation of heterogeneity.” Marketing
Letters 10 (3): 217–30.
Wedel, Michel, Wagner A. Kamakura, Wayne S. DeSarbo, and Frenkel Ter Hofstede. 1995. “Implications
for Asymmetry, Nonproportionality, and Heterogeneity in Brand Switching from Piece-wise Exponential
Mixture Hazard Models.” Journal of Marketing Research 32 (4): 457-62.
Wedel, Michel and P.K. Kannan. 2016. “Marketing Analytics for Data-Rich Environments.” Journal of
Marketing 80 (6): 97-121.
Wedel, Michel and C. Kistemaker. 1989. “Consumer Benefit Segmentation Using Clusterwise Linear
Regression.” International Journal of Research in Marketing 6: 45-9.
Wedel, Michel and Jan-Benedict E. M. Steenkamp. 1989. “Fuzzy Clusterwise Regression Approach to
Benefit Segmentation.” International Journal of Research in Marketing 6 (4): 241-58.
Wedel, Michel and Jan-Benedict E. M. Steenkamp. 1991. “A Clusterwise Regression Method for
Simultaneous Fuzzy Market Structuring and Benefit Segmentation.” Journal of Marketing Research 28
(4): 385-96.
Wedel, Michel, Marco Vriens, Tammo H.A. Bijmolt, Wim Krijnen, and Peter S.H. Leeflang. 1998.
“Assessing the Effects of Abstract Attributes and Brand Familiarity in Conjoint Choice Experiments.”
International Journal of Research in Marketing 15 (1): 71-8.
Wedel, Michel and Jie Zhang. 2004. “Analyzing Brand Competition across Subcategories.” Journal of
Marketing Research 41 (4): 448-56.
Wells, William D. 1975. "Psychographics: A Critical Review." Journal of Marketing Research 12 (2): 196-
213.
Wildt, Albert R. 1976. “On Evaluating Market Segmentation Studies and the Properties of R².”
Management Science 22 (8): 904-8.
Wildt, Albert R. and John M. McCann. 1980. “A Regression Model for Market Segmentation Studies.”
Journal of Marketing Research 17 (3): 335-40.
Wind, Yoram. 1978. “Issues and Advances in Segmentation Research.” Journal of Marketing Research 15
(3): 317-37.
Winer, Russ and Ravi Dhar. 2010. “Marketing Management, 4th edition.” Pearson, NY.
Winter, Frederick W. 1979. “A Cost-Benefit Approach to Market Segmentation.” Journal of Marketing
43 (4): 103-11.
Zenor, Michael J. and Rajendra K. Srivastava. 1993. “Inferring Market Structure with Aggregate Data: A
Latent Segment Logit Approach.” Journal of Marketing Research 30 (3): 369–79.
Zhang, Jonathan Z., George F. Watson, Robert W. Palmatier, and Rajiv P. Dant. 2016. “Dynamic
Relationship Marketing.” Journal of Marketing 80 (5): 53-75.
Zhang, Jie and Michel Wedel. 2009. “The Effectiveness of Customized Promotions in Online and Offline
Stores.” Journal of Marketing Research 46 (2): 190-206.
