Abstract
Discrete choice experiments are widely used across research fields to capture the preference structure of respondents. The design of such experiments will determine to a large extent the accuracy with which the preference parameters can be estimated. This paper presents a new R package, idefix, which enables users to generate optimal designs for discrete choice experiments. Besides Bayesian D-efficient designs for the multinomial logit model, the package includes functions to generate Bayesian adaptive designs, which can be used to gather data for the mixed logit model. In addition, the package provides the necessary tools to set up actual surveys and collect empirical data. After data collection, idefix can be used to transform the data into the format required by existing estimation software in R.
Keywords: optimal designs, discrete choice experiments, adaptive, mixed logit, shiny, R.
1. Introduction
Discrete choice experiments (DCEs) are used to gather stated preference data. In a typical DCE, the respondent is presented with several choice sets. In each choice set, the respondent is asked to choose between two or more alternatives, where each alternative consists of specific attribute levels. By analyzing the stated preference data, one can gain information on the preferences of respondents. The use of stated preference data, and DCEs in particular, has strongly increased over the past decades in fields such as transportation, environmental economics, health economics, and marketing.
To analyze stated preference data, a choice model needs to be assumed. The large majority of discrete
choice models is derived under the assumption of utility-maximizing behavior by the decision maker
(Marschak 1950). This family of models is known as random utility maximization (RUM) models,
and its best-known members are the multinomial logit model (MNL) (McFadden 1974) and the mixed logit model (MIXL) (Hensher and Greene 2003; McFadden and Train 2000; Train 2003).
The idea behind such models is that people maximize their utility, which is modeled as a function of
the preference weights and attribute levels. The deterministic part of the utility is most often specified linearly in the parameters, but the corresponding logit probabilities relate nonlinearly to the observed utility.
When setting up a DCE, the researcher is usually restricted in the number of choice sets that can be presented to a respondent, and in the number of respondents that can be questioned. Moreover, the set of all possible choice sets one could present (i.e., the full factorial design) grows rapidly as more attributes and attribute levels are included. The researcher is therefore obliged to select the choice sets to include in the experimental design.
At first, choice designs were built using concepts from the general linear model design literature, neglecting the fact that most choice models are nonlinear (Huber and Zwerina 1996). Afterwards, different design approaches were proposed, focusing on one or more design properties such as utility balance, task complexity, response efficiency, attribute balance, and statistical efficiency (see Rose and Bliemer 2014, for an overview). Where orthogonal designs were mostly used at first, statistically optimal designs have now acquired a prominent place in the literature on discrete choice experiments (Johnson et al. 2013). The latter aim to select those choice sets that force the respondent to make trade-offs, thereby maximizing the information gained from each observed choice, or, alternatively phrased, minimizing the confidence ellipsoids around the parameter estimates.
Optimal designs maximize the expected Fisher information. For choice designs, this information
depends on the parameters of the assumed choice model. Consequently, the efficiency of a choice
design is related to the accuracy of the guess of the true parameters before conducting the experi-
ment. In order to reduce this sensitivity, Bayesian efficient designs were developed. In the latter, the
researcher acknowledges his uncertainty about the true parameters by specifying a prior preference
distribution. It has been repeatedly proven that optimal designs outperform other design approaches
when the researcher has sufficient information on the preference structure a priori (Kessels, Goos,
and Vandebroek 2006; Rose, Bliemer, Hensher, and Collins 2008; Rose and Bliemer 2009). On the
other hand, several researchers have pointed out that even Bayesian efficient designs are sensitive to
a misspecification of the prior distribution (see Walker, Wang, Thorhauge, and Ben-Akiva (2018) for
an elaborate study on the robustness of different design approaches).
Serial designs (Bliemer and Rose 2010b), and Individually Adaptive Sequential Bayesian (IASB) de-
signs (Yu, Goos, and Vandebroek 2011; Crabbe, Akinc, and Vandebroek 2014) have gone one step
further in order to ensure that the prior guess is reasonable by sequentially updating it during the
survey. In serial efficient designs, the design is changed across respondents based on the updated in-
formation. With the IASB approach, individual designs are constructed during the survey within each
respondent, based on the posterior distribution of the individual preference parameters. By sequen-
tially updating this distribution, those methods have proven to be less vulnerable to a misspecification
of the initial prior. In addition, the IASB approach performed especially well when preference heterogeneity was large, and resulted in high-quality designs for estimating the MIXL model (Danthurebandara, Yu, and Vandebroek 2011; Yu et al. 2011).
So far, no R (R Development Core Team 2008) package is suited to generate optimal designs for DCEs, and no software implements any kind of adaptive designs for DCEs. Some general experimental design R packages exist (e.g., AlgDesign (Wheeler 2014) and OptimalDesign (Harman and Filova 2016)), which can be used for the generation of optimal designs. However, none of them can be used to apply methods designed for nonlinear models, which are necessary for choice models (Train 2003). To our knowledge, two R packages exist that are promoted for designing DCEs. The package support.CEs (Aizaki 2012) provides functions for generating orthogonal main-effect arrays, but does not support optimal designs for discrete choice models. The package choiceDes (Horne 2014), which depends on AlgDesign, is able to generate D-optimal designs for linear models and makes use of DCE terminology, but does not take into account the dependency on the unknown preference parameters. Furthermore, it is limited to effects-coded designs and does not allow the user to specify alternative specific constants. Such design packages are still often used, because some linearly optimized designs are also optimal for MNL models when the preference parameters are assumed to be zero.
We believe that efficient designs deserve a place in the toolbox of the DCE researcher, and adaptive efficient designs appear to be a promising extension. Since implementing such procedures is time consuming, we believe the current package can pave the way for researchers and practitioners to become familiar with these techniques. The package therefore implements D-efficient and Bayesian D-efficient designs, as well as the IASB approach, to generate optimal designs for the MNL and MIXL models. Furthermore, it provides functions that allow researchers to set up simulation studies in order to evaluate the performance of efficient and/or adaptive designs for certain scenarios. In addition, a function is included that generates a shiny application in which designs are presented on screen, which supports both pregenerated and adaptive designs, and allows the researcher to gather empirical data. The idefix data format can easily be transformed in order to use existing estimation packages in R such as bayesm (Rossi 2015), ChoiceModelR (Sermas 2012), RSGHB (Dumont and Keller 2015), mlogit (Croissant 2013) and Rchoice (Sarrias 2016).
The outline of this paper is as follows: in the next section some guidance is provided on gathering and
specifying prior information, essential to produce efficient designs. Section 3 explains how to generate
statistically optimal designs for the MNL model using the R package idefix. In Section 4, the IASB
methodology is discussed together with the functions related to that approach. Here one can also find
an example of how the different functions can be combined to set up simulation studies. Section 5
describes the SurveyApp function which enables the researcher to gather empirical choice data by
launching an interactive shiny application. In Section 6, it is shown how to transform the idefix data
format into the format desired by the estimation package of interest. Lastly, in Section 7, we discuss
planned future development of the package.
2. Gathering and specifying prior information
Consult the literature: often, research concerning similar (choice) topics is available in the literature. The statistics of interest in economics frequently involve the amount of money someone is
willing to pay (WTP) for a certain product or service. For example in transportation research, nu-
merous choice experiments have been published on the value of time (VOT), in health economics the
value of a statistical life (VOSL) is of interest, whereas in environmental economics the focus lies
on WTP for improvement of environmental quality and conservation of natural resources. If one is
interested in conducting a choice experiment involving price and time attributes, an option would be
to first consult for example Abrantes and Wardman (2011). In the latter, the authors provide a com-
prehensive meta-analysis that covers 226 British studies in which 1749 valuations of different types of
time (e.g., in-vehicle time, walk time, wait time, departure time shift, ...) are included. For a similar
meta-analysis covering 389 European studies one could consult Wardman, Chintakayala, and de Jong
(2016).
One must be careful when copying coefficient estimates from other studies, since those are dependent
on the context and sensitive to design and model specifications. Nevertheless, the goal is not to
extract a point estimate, but rather to define a reasonable interval which most likely contains the true
coefficient. As Wardman et al. (2016) write: “These implied monetary values serve as very useful
benchmarks against which new evidence can be assessed and the meta-model provides parameters
and values for countries and contexts where there is no other such evidence”.
Invest in a pilot study: given the cost of collecting data, efficient designs have become more popular. The more efficient a design is, the fewer participants one needs to question to obtain a given level of estimation accuracy. An efficient way of spending resources is therefore to first run a pilot study on a sample of the total pool of respondents. This way, meaningful parameter estimates can be obtained, which can serve as prior information for constructing the efficient design that will be given to the remaining respondents. For the pilot study one can use an orthogonal design or an efficient design assuming zero parameters.
Expert judgement and focus groups: another method is to consult focus groups. By debating and
questioning a representative sample, one can select relevant attributes, understand their importance
and incorporate meaningful attribute levels in the survey. The same information can be used to define
reasonable expectations about the influence each attribute will have on the utility. For an example in which attribute prioritization exercises, attribute consolidation exercises and semi-structured interviews are described, see Pinto, Danilovich, Hansen, Finn, Chang, Holl, Heinemann, and Bockenholt (2017). Besides focus groups, one can also consult an expert on the topic.
In theory, any distribution can serve as a prior distribution as long as the probability mass is distributed over the parameter space proportionally to one's degree of belief. In practice, it is advised to use a distribution from which draws can conveniently be taken. Several types of distributions are commonly used to express prior belief: the (truncated) Normal, the Uniform and the Lognormal distribution. Let us assume one believes that the true coefficient of price will be somewhere between 0 and -2, with all values in between being equally likely; then specifying βprice ∼ U(−2, 0) would be in accordance with that belief. Similarly, specifying βprice ∼ N(−1.5, 1) would mean that the researcher believes the true coefficient will most likely be -1.5, but also gives credibility to values near -1.5. In case one has absolutely no idea about the value of a coefficient in advance, a large variance and a mean equal to zero would reflect that uncertainty.
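For instance, draws consistent with these beliefs can be generated as follows (a minimal sketch; the truncnorm package is one convenient option for truncated Normal draws):

R> library("truncnorm")
R> set.seed(123)
R> pd.unif <- runif(n = 500, min = -2, max = 0)      # beta_price ~ U(-2, 0)
R> pd.norm <- rnorm(n = 500, mean = -1.5, sd = 1)    # beta_price ~ N(-1.5, 1)
R> pd.tnorm <- rtruncnorm(n = 500, a = -Inf, b = 0,  # only negative values
+    mean = -1.5, sd = 1)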
To come up with specific numbers for the mean and variance, we recommend starting by defining, for each coefficient, a lower and an upper boundary between which the true coefficient should be located. These boundaries can be set by making use of any of the previously mentioned methods. Often, one of the two boundaries is known a priori, since for most attributes one can tell whether they will have a positive or negative effect on the utility. When only the boundaries are known, specifying a Uniform distribution that spans that region is possible. When only one boundary is known, a truncated Normal can be appropriate. When one is able to make a more informed guess, usually a Normal distribution with mean equal to the best guess is specified. By specifying the variance, one modifies the range of credible values around the best guess. A way to decide upon the variance could be to ensure that the pre-established boundaries are, for example, near µ ± 2σ, with µ the mean and σ the standard deviation of the Normal distribution. The larger the variance, the more robust the design solution will be. On the other hand, the smaller the variance, the more informative the prior is, and thus the more efficient the design solution will be if the true parameters are close to the best guess. In the next section we will show how one can evaluate this trade-off.
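As an illustration of this rule of thumb, suppose the boundaries for βprice were set at -2 and 0 (numbers chosen for illustration):

R> lower <- -2
R> upper <- 0
R> mu <- (lower + upper) / 2      # best guess halfway: -1
R> sigma <- (upper - lower) / 4   # boundaries fall near mu +/- 2*sigma: 0.5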
A Bayesian D-efficient design is optimized by minimizing the mean D-error, in which each D-error is computed with a draw from the prior. To ensure that the solution is based on a representative sample of the prior, it is important to use enough draws. It is hard to define a minimum number of draws, since it depends on the dimension of the prior and the quality of the draws (for a comparison, see Bliemer, Rose, and Hess 2008). Since the computation time of generating a DB-efficient design depends on the number of draws, we advise generating designs with a sample size such that the computation remains feasible. Afterwards, one can test the robustness of the design solution by recalculating the DB-error with a larger sample from the prior (see the DBerr function in Section 3.3).
If the implied choice probabilities seem implausible, chances are that either the prior or the coding of the attribute levels is wrong. If the prior truly reflects your belief, you should prefer alternatives that have a high probability and dislike alternatives with a low probability.
In what follows, we cover an example similar to the one used for the robustness analysis in Walker et al. (2018), and consider the implications of the prior on the robustness of a design. In this choice experiment we are interested in the value of time, i.e., the amount of money a consumer would be willing to pay in order to reduce waiting time by one unit. There are two continuous attributes, price and time, each containing 5 attribute levels. We used the idefix package to select three designs, each containing 20 choice sets with two alternatives. Each design was based on a different prior belief. For each belief we fixed βtime = −1. In the first condition, the researcher had no information about the true parameters. We refer to this condition as the naive prior, in which βprice ∼ N($0/hour, $60/hour). In the second condition, the researcher knew the sign of the VOT. We call this the semi-informative condition, in which βprice ∼ TN($0/hour, $60/hour), restricting the support such that only positive VOTs have credibility. In the last case, the researcher did a pilot study in which the estimated VOT equaled $20/hour. He was therefore highly confident that the true VOT should be somewhere between $10/hour and $30/hour, with $20/hour as his best guess. We call this the informative condition, with βprice ∼ N($20/hour, $2/hour).
We used the CEA algorithm to optimize 12 initial random designs (the default in CEA) for each condition, and then selected for each condition the design with the lowest DB-error. Afterwards we used the DBerr function to evaluate the robustness of each design given a range ($0 − $100/hour) of possible true VOTs. The complete R code can be found in Section 3.3 under the DBerr function.
Figure 1: Comparison of the D-error for designs generated with different priors given a range of true
Value of Time (VOT). The VOT range 10 - 30 $/h indicates the preset interval that is believed to
enclose the true value, prior to the experiment.
From Figure 1 we can tell that the more informative the prior is, the more efficient the design will be, given that the prior was correct (true VOT = $20/hour). We can also see that this holds for the region that the researcher defined (indicated by vertical lines in Figure 1). Above $35/hour, the semi-informative prior would perform best, because that prior gave more credibility to this region than the informative one did. The uninformative prior performed worst, since no information could be provided. This small example merely aims to show that, given a reasonably bounded parameter space, one can test whether the specification of one's prior belief is reflected by the range in which the design is most efficient.
Let us now assume that we are less confident that the true VOT is situated between $10 and $30/hour, but rather believe it could vary from $10 to $80/hour. We could then, for example, adjust the informative prior to βprice ∼ N($45/hour, $20/hour).
Figure 2: Comparison of the D-error for designs generated with different priors given a range of true
Value of Time (VOT). The VOT range 10 - 80 $/h indicates the preset interval that is believed to
enclose the true value, prior to the experiment.
As can be seen in Figure 2, the updated informative prior results in the most efficient design for true VOTs between $15 and $75/hour. The difference with the semi-informative prior becomes smaller, since the degree of belief is more spread out over the parameter space. One can use the DBerr function to calculate mean D(B)-efficiencies for plausible regions in the parameter space. This way, one can incorporate prior knowledge without losing too much robustness. When in doubt between different design approaches, we encourage users to add other designs to the comparison in order to choose the design that best fits their needs.
2.4. Summary
There are multiple ways of gathering prior information. It should be feasible to at least decide upon
reasonable boundaries in which the coefficients of interest should be located. One can evaluate a prior
specification by inspecting the choice probabilities it produces. When little information is available,
knowing the sign of most parameters can already improve the design substantially. Given predefined
boundaries, the robustness of a design can be evaluated for that region by using the DBerr function.
3. Optimal designs for the MNL model
In the MNL model¹, the utility that respondent n derives from alternative k in choice set s is modeled as

U_{ksn} = x_{ksn}^{\top} \beta + \varepsilon_{ksn}.  (1)

The p-dimensional preference vector β denotes the importance of all attribute levels, represented by the p-dimensional vector x_{ksn}. Unobserved influences on the utility are captured by the independent error terms ε_{ksn}, which are assumed to follow a Gumbel distribution. The probability that individual n chooses alternative k in choice set s can then be expressed in closed form:

p_{ksn}(\beta) = \frac{\exp(x_{ksn}^{\top} \beta)}{\sum_{i=1}^{K} \exp(x_{isn}^{\top} \beta)}.  (2)
In general, standard maximum likelihood techniques are applied to estimate the preference vector β.
The Fisher information matrix equals the negative expectation of the second-order derivatives of the log-likelihood L with respect to the parameters:

I_{FIM} = -E\left[ \frac{\partial^2 L}{\partial \beta \, \partial \beta^{\top}} \right].  (3)
For the MNL model, this yields the following expression, where N is the number of respondents, X_s = (x_{1sn}, \ldots, x_{Ksn})^{\top} the design matrix of choice set s, P_s = \mathrm{diag}(p_{1sn}, p_{2sn}, \ldots, p_{Ksn}), and p_s = (p_{1sn}, p_{2sn}, \ldots, p_{Ksn})^{\top}:

I_{FIM}(\beta|X) = N \sum_{s=1}^{S} X_s^{\top} (P_s - p_s p_s^{\top}) X_s.  (4)

¹ Originally this model was known as the conditional logit model; the term MNL is, however, used more frequently nowadays.
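Continuing the sketch above, the information matrix of equation 4 for a design with S choice sets can be computed as follows (again illustrative code, not part of idefix):

R> info_mnl <- function(des, beta, n.alts) {
+    sets <- split.data.frame(des,
+      rep(seq_len(nrow(des) / n.alts), each = n.alts))
+    terms <- lapply(sets, function(Xs) {
+      p <- mnl_probs(Xs, beta)                  # probabilities in set s
+      t(Xs) %*% (diag(p) - p %*% t(p)) %*% Xs   # one term of equation 4
+    })
+    Reduce(`+`, terms)                          # sum over the S sets
+  }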
Several efficiency measures have been proposed based on the information matrix, among which the D-efficiency has become the standard approach (Kessels et al. 2006; Rose and Bliemer 2014). A D-optimal design approach maximizes the determinant of the information matrix, thereby minimizing the generalized variance of the parameter estimates. The criterion is scaled to the power 1/p, with p the number of parameters the model contains. With

\Omega = I_{FIM}(\beta|X)^{-1},  (5)

the D-error to be minimized is

D-error = \det(\Omega)^{1/p}.  (6)
In order to calculate the D-error of a design, one must assume a model and parameter values. Since there is uncertainty about the parameter values, a prior distribution π(β) can be defined on the preference parameters. In this case the expected D-error is minimized over the prior preference distribution and is referred to as the DB-error:

D_B-error = \int \det(\Omega)^{1/p} \, \pi(\beta) \, d\beta.  (7)
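Equations 6 and 7 can then be sketched as follows, where the DB-error is approximated by averaging the D-error over draws from the prior (illustrative code):

R> d_error <- function(des, beta, n.alts) {
+    inf <- info_mnl(des, beta, n.alts)
+    det(solve(inf))^(1 / ncol(des))   # equation 6, p = number of parameters
+  }
R> db_error <- function(des, draws, n.alts) {
+    mean(apply(draws, 1, function(b) d_error(des, b, n.alts)))  # equation 7
+  }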
To find the design that minimizes such criteria, different algorithms have been proposed (see Cook and Nachtsheim 1980, for an overview). We chose to implement both a modified Fedorov algorithm, adapted from the classical Fedorov exchange algorithm (Fedorov 1972), and a coordinate exchange algorithm. The former swaps profiles from an initial design matrix with candidate profiles in order to minimize the D(B)-error. The latter changes individual attribute levels in order to optimize the design. For more details, see the Modfed and CEA functions in Section 3.3.
3.3. Optimal designs for the MNL model with package idefix
Two main algorithms are implemented to generate optimal designs for the MNL model: the Modfed function and the CEA function. Modfed is a modified Fedorov algorithm, whereas CEA is a coordinate exchange algorithm. The latter is substantially faster and better suited for large design problems; due to its less exhaustive search, however, it can produce slightly less efficient designs than Modfed when there are few choice sets. Modfed is much slower, but has the advantage that the user can specify a candidate list of alternatives, and is thus able to put restrictions on the design. We will start by explaining how to generate a candidate list of alternatives with the Profiles function, and continue with the Modfed function. Afterwards we show how to decode a design with the Decode function. Lastly, we explain the CEA algorithm and end with an example of the DBerr function, which can be used to evaluate the robustness of different design solutions.
Profiles
The first step in creating a discrete choice design is to decide which attributes, and how many levels of each, will be included. This is often not a straightforward choice and highly depends on the research question. In general, excluding relevant attributes will result in increased error variance, whereas including too many can have a similar effect due to the increase in task complexity. For more elaborate guidelines on choosing attributes and levels we refer to Bridges et al. (2011).
Afterwards, one can start creating profiles as combinations of attribute levels. Most often, all possible combinations are valid profiles, and then the Profiles function can be used to generate them. It could be that some combinations of attribute levels are not allowed for a variety of reasons; in that case, the list of possible profiles can be restricted afterwards by deleting the profiles that are not allowed.
Table 1: Dummy and effects coding in the idefix package for an attribute with four levels.
The function Profiles has three arguments of which one is optional. In the lvls argument one
can specify how many attributes should be included, and how many levels each one should have.
The number of elements in vector at.lvls indicates the number of attributes. The numeric values
of that vector indicate the number of levels each attribute contains. In the example below there are
three attributes, the first one has three, the second four, and the last one has two levels. The type of
coding should be specified for each attribute, here with the vector c.type, with one character for
each attribute. Attributes can be effects coded "E", dummy coded "D" or treated as a continuous
variable "C". In this case all attributes will be effects coded. In Table 1, the different coding schemes
in the idefix package are depicted for an attribute containing four levels.
R> library("idefix")
R> at.lvls <- c(3, 4, 2)
R> c.type <- c("E", "E", "E")
R> Profiles(lvls = at.lvls, coding = c.type)
When continuous attributes are desired, the levels of those attributes should be specified in c.lvls,
with one numeric vector for each continuous attribute, and the number of elements should equal the
number of levels specified in lvls for each attribute. In the example below there are two continuous
attributes, the first one contains four levels (i.e., 4, 6, 8, and 10) and the second one two levels (i.e., 7
and 9).
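A sketch of such a call, assuming the first attribute is dummy coded with three levels:

R> at.lvls <- c(3, 4, 2)
R> c.type <- c("D", "C", "C")
R> con.lvls <- list(c(4, 6, 8, 10), c(7, 9))
R> Profiles(lvls = at.lvls, coding = c.type, c.lvls = con.lvls)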
The output is a matrix in which each row is a possible profile. The last two columns represent the
continuous attributes.
Modfed
A modified Fedorov algorithm is implemented and can be used with the Modfed function. The
function consists of eleven arguments of which seven are optional. The first argument cand.set is
a matrix containing all possible profiles that could be included in the design. This can be generated
with the Profiles function as described above, but this is not necessary. Furthermore the desired
number of choice sets n.sets, the number of alternatives n.alts in each choice set, and the
draws from the prior distribution, for which the design should be optimized, should be specified in
par.draws. By entering a numeric vector, the D-error will be minimized given the parameter values and an MNL likelihood. By specifying a matrix in par.draws, in which each row is a draw from a multivariate prior distribution, the DB-error will be minimized. We recommend always using the DB-error, since a D-efficient design is more sensitive to a misspecification of the prior.
If alternative specific constants are desired, the argument alt.cte should be specified. In order to do this, a binary vector should be given with length equal to n.alts, indicating for each alternative whether an alternative specific constant should be present (1) or not (0). Whenever alternative specific constants are present, the par.draws argument requires a list containing two matrices as input.
The first matrix contains the parameter draw(s) for the alternative specific constant parameter(s). The
second matrix contains the draw(s) for the remaining parameter(s). To verify, the total number of
columns in par.draws should equal the number of nonzero parameters in alt.cte + the number
of parameters in cand.set.
For some discrete choice experiments, a no choice alternative is desired. This is usually an alternative
containing one alternative specific constant and zero values for all other attribute levels. If such
an alternative should be present, the no.choice argument can be set to TRUE. When this is the case, the design will be optimized given that the last alternative of each choice set is a no choice alternative. Note that when no.choice = TRUE, alt.cte[n.alts] should be 1, since the no choice alternative has an alternative specific constant.
The algorithm will swap profiles from an initial design with profiles from cand.set in order to maximize the D(B)-efficiency. To avoid local optima, the algorithm repeats this procedure for several random starting designs. The default of n.start = 12 can be changed to any integer. By default, best = TRUE, so that the output only shows the design with the lowest D(B)-error. If best = FALSE, all optimized starting designs are shown in the output. One can also provide one's own starting design(s) in start.des, in which case n.start is ignored.
The modified Fedorov algorithm is fairly fast, but for complex design problems (with many attributes and attribute levels) the computation time can be high. Therefore we make use of parallel computing through the parallel package (R Core Team 2017). By default parallel = TRUE, and detectCores will detect the number of available CPU cores. The optimization of the different starting designs will then be distributed over the available cores minus one.
The algorithm converges when an iteration occurs in which no profile can be swapped to further decrease the D(B)-error. A maximum number of iterations can be specified in max.iter, which is infinite by default.
In the example below, a DB-optimal design is generated for a scenario with three attributes, having four, two and three levels, respectively. All of them are dummy coded. The matrix M, containing draws from the multivariate prior distribution with mean mean and covariance matrix sigma, is specified in par.draws. The mean vector contains six elements: the first three are the parameters for the first attribute, the fourth is the parameter for the second attribute, and the last two are the ones for the third attribute.
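A sketch of this example follows below; the prior mean values are illustrative, and the result is stored in D for later use:

R> set.seed(123)
R> cs <- Profiles(lvls = c(4, 2, 3), coding = c("D", "D", "D"))
R> mean <- c(0.8, 0.2, -0.3, -0.2, 0.7, 0.4)   # illustrative prior mean
R> sigma <- diag(length(mean))
R> M <- MASS::mvrnorm(n = 500, mu = mean, Sigma = sigma)
R> D <- Modfed(cand.set = cs, n.sets = 8, n.alts = 2,
+    alt.cte = c(0, 0), par.draws = M)
R> D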
The output consists of the optimized design that resulted in the lowest DB-error, since best is TRUE by default. Besides the final D(B)-error $error, $inf.error denotes the percentage of draws for which the design resulted in an infinite D-error. This can happen for extremely large parameter values, which result in probabilities of one or zero for all alternatives in all choice sets. In that case the elements of the information matrix will be zero, and the D-error will be infinite. This percentage should preferably be close to zero when calculating the DB-error, and exactly zero when generating a D-optimal design. Lastly, $probs shows the average probabilities for each alternative in each choice set given the sample from the prior preference distribution par.draws.
$`design`
Var12 Var13 Var14 Var22 Var32 Var33
set1.alt1 1 0 0 1 1 0
set1.alt2 0 1 0 0 0 1
set2.alt1 1 0 0 1 0 1
set2.alt2 0 0 0 0 0 0
set3.alt1 1 0 0 0 0 1
set3.alt2 0 0 0 1 1 0
set4.alt1 0 0 0 1 0 0
set4.alt2 0 1 0 0 1 0
set5.alt1 0 0 0 1 1 0
set5.alt2 0 0 1 0 0 0
set6.alt1 1 0 0 0 0 0
set6.alt2 0 0 1 1 0 1
set7.alt1 0 1 0 1 0 0
set7.alt2 0 0 0 1 0 1
set8.alt1 0 1 0 1 0 0
set8.alt2 0 0 1 0 1 0
$error
[1] 2.302946
$inf.error
[1] 0
$probs
Pr(alt1) Pr(alt2)
set1 0.3516382 0.6483618
set2 0.4503424 0.5496576
set3 0.7077305 0.2922695
set4 0.4842112 0.5157888
set5 0.6790299 0.3209701
set6 0.7231303 0.2768697
set7 0.1682809 0.8317191
set8 0.4540432 0.5459568
Decode
The Decode function allows the user to decode a coded design matrix into a readable choice design
with labeled attribute levels, as they would appear in a real survey. Comparing the decoded alternatives
with their predicted choice probabilities is a simple way of checking whether the prior specification is
reasonable. It can for example happen that an efficient design contains some dominant alternatives. To
what extent these are undesired depends on the researcher. Some argue that dominant alternatives can serve as a test of whether the respondent takes the choice task seriously; others argue that they may lead to lower response efficiency. In any case, it is important that the predicted choice probabilities,
which are a direct consequence of the prior, seem plausible. Otherwise either the prior, or the used
coding is presumably wrongly specified (Crabbe and Vandebroek 2012; Bliemer, Rose, and Chorus
2017).
The design that needs to be decoded has to be specified in des. This is a matrix in which each row is an alternative. Furthermore, the number of alternatives n.alts and the applied coding coding should be indicated. The labels of the attribute levels that will be used in the DCE should be specified in lvl.names. This should be a list containing one vector per attribute, where each vector has as many elements as the corresponding attribute has levels. In the example below, we decode the design previously generated with the Modfed function into a readable design, mimicking a transportation DCE, and specify all attribute level labels in the list lvls. The first attribute is the "cost", with four different values: 15, 20, 30 and 50 dollar. The second attribute represents "travel time", which can take 2 or 15 min as attribute levels. Lastly, the attribute "comfort" has three levels: the comfort can be bad, moderate or good. In des we specify the optimized design that resulted in the lowest DB-error from the output of the Modfed function. In coding, the same type of coding used to generate the design matrix should be specified; in this case all attributes were dummy coded.
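A sketch of the corresponding call, using the design D obtained above:

R> lvls <- list(c("$15", "$20", "$30", "$50"),
+    c("2 min", "15 min"),
+    c("bad", "moderate", "good"))
R> Decode(des = D$design, n.alts = 2, lvl.names = lvls,
+    coding = c("D", "D", "D"))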
The output of Decode contains two components. The first one, $design, shows the decoded design matrix. The second one, $lvl.balance, shows the frequency of each attribute level in the design. As previously mentioned, besides statistical efficiency, other criteria such as attribute level balance can be of importance too.
$`design`
V1 V2 V3
set1.alt1 $20 15 min moderate
set1.alt2 $30 2 min good
set2.alt1 $20 15 min good
set2.alt2 $15 2 min bad
set3.alt1 $20 2 min good
set3.alt2 $15 15 min moderate
set4.alt1 $15 15 min bad
set4.alt2 $30 2 min moderate
set5.alt1 $15 15 min moderate
set5.alt2 $50 2 min bad
set6.alt1 $20 2 min bad
set6.alt2 $50 15 min good
set7.alt1 $30 15 min bad
set7.alt2 $15 15 min good
set8.alt1 $30 15 min bad
set8.alt2 $50 2 min moderate
$lvl.balance
$lvl.balance$`attribute 1 `
$15 $20 $30 $50
  5   4   4   3

$lvl.balance$`attribute 2 `
15 min  2 min
     9      7

$lvl.balance$`attribute 3 `
     bad     good moderate
       6        5        5
In the second example we optimize several starting designs for the same attributes as the ones in the
first example. As before, all possible profiles are generated using the Profiles function. This
time we include an alternative specific constant (asc) for the first alternative and we add a no choice
alternative. A no choice alternative is coded as an alternative that contains one asc and zeros for all
other attribute levels. The vector c(1, 0, 1) specified in alt.cte indicates that there are three
alternatives of which the first and the third (the no choice alternative) have an asc. The mean vector
m contains eight entries, the first one corresponds to the asc of the first alternative, the second one to
the asc of the no choice alternative. The third, fourth and fifth elements correspond to the prior mean
of the coefficients of the levels of the first attribute. The sixth element indicates the prior mean of
the coefficient of the levels for the second attribute and the last two elements to the levels of the third
attribute. In this example all attributes are effects coded (see Table 1 for an example). A sample is
drawn from the multivariate normal prior distribution with mean m and covariance matrix v which is
passed to par.draws. In order to avoid confusion the draws for the alternative specific constants
are separated from the draws for the coefficients and passed on to par.draws in a list.
R> set.seed(123)
R> code = c("E", "E", "E")
R> cs <- Profiles(lvls = c(4, 2, 3), coding = code )
R> alt.cte <- c(1, 0, 1)
R> m <- c(0.1, 1.5, 1.2, 0.8, -0.5, 1, -1.5, 0.6)
R> v <- diag(length(m))
R> ps <- MASS::mvrnorm(n = 500, mu = m, Sigma = v)
R> ps <- list(ps[, 1:2], ps[, 3:8])
R> D.nc <- Modfed(cand.set = cs, n.sets = 10, n.alts = 3,
+ alt.cte = alt.cte, par.draws = ps, no.choice = TRUE,
+ best = FALSE)
R> for (i in 1:length(D.nc)) print(D.nc[[i]]$error)
[1] 1.230047
[1] 1.277368
[1] 1.241679
[1] 1.241667
[1] 1.250804
[1] 1.276689
[1] 1.252373
[1] 1.266141
[1] 1.248671
[1] 1.249621
[1] 1.274209
[1] 1.253853
Because best was set to FALSE, the outcome (i.e., the optimized design matrix, the DB-error and the probabilities) of each initial design was stored. We print all the DB-errors and decide which design we would like to decode. Another option is to optimize more starting designs. In the example below, the first design is decoded, since it resulted in the lowest DB-error. Note that we now have to specify which alternative is the no choice alternative in the no.choice argument, as shown in the sketch below.
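A sketch of that call (the alt.cte and no.choice argument names follow the descriptions above); the probs column shown in the output can be obtained by binding the decoded design to the probabilities returned by Modfed:

R> lvls <- list(c("$15", "$20", "$30", "$50"),
+    c("2 min", "15 min"),
+    c("bad", "moderate", "good"))
R> dec <- Decode(des = D.nc[[1]]$design, n.alts = 3, lvl.names = lvls,
+    coding = c("E", "E", "E"), alt.cte = c(1, 0, 1), no.choice = 3)
R> cbind(dec$design, probs = as.vector(t(D.nc[[1]]$probs)))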
V1 V2 V3 probs
set1.alt1 $50 2 min good 0.3214916
set1.alt2 $15 15 min moderate 0.3133349
no.choice <NA> <NA> <NA> 0.3651735
set2.alt1 $15 2 min moderate 0.4697414
set2.alt2 $20 2 min good 0.3856513
no.choice.1 <NA> <NA> <NA> 0.1446072
set3.alt1 $20 15 min moderate 0.2784902
set3.alt2 $15 2 min bad 0.3145071
no.choice.2 <NA> <NA> <NA> 0.4070027
set4.alt1 $30 2 min good 0.4284075
set4.alt2 $50 2 min moderate 0.2310475
no.choice.3 <NA> <NA> <NA> 0.3405450
set5.alt1 $20 2 min good 0.3565410
set5.alt2 $15 2 min good 0.4518309
no.choice.4 <NA> <NA> <NA> 0.1916281
set6.alt1 $20 2 min bad 0.3174667
set6.alt2 $50 15 min good 0.1360625
no.choice.5 <NA> <NA> <NA> 0.5464708
set7.alt1 $15 15 min good 0.4273929
set7.alt2 $30 2 min bad 0.1274950
no.choice.6 <NA> <NA> <NA> 0.4451121
set8.alt1 $50 2 min moderate 0.3294313
set8.alt2 $30 15 min good 0.1941744
no.choice.7 <NA> <NA> <NA> 0.4763942
set9.alt1 $15 2 min bad 0.2896765
set9.alt2 $30 2 min moderate 0.3206922
no.choice.8 <NA> <NA> <NA> 0.3896313
set10.alt1 $30 2 min moderate 0.2405066
set10.alt2 $20 2 min moderate 0.4750211
no.choice.9 <NA> <NA> <NA> 0.2844724
CEA
It is known that the computation time of the Modfed algorithm increases exponentially with the number of attributes. Therefore we also implemented a coordinate exchange algorithm (CEA), which is particularly effective for larger design problems (Tian and Yang 2017; Meyer and Nachtsheim 1995). Since the CEA approach runs in polynomial time, computation time is expected to be reduced by one or two orders of magnitude, while producing designs that are as efficient as those of the Modfed algorithm. Only when using few initial starting designs for small design problems (e.g., < 10 choice sets) can the CEA function produce designs that are slightly less efficient than the ones obtained from Modfed. This is because the CEA algorithm performs a less exhaustive search. To overcome this problem, one can increase the number of random initial designs n.start, but we advise using the modified Fedorov algorithm for such problems. Similarly to the Modfed function, the CEA procedure improves random initial design matrices in which a row represents a profile and columns represent attribute levels. The latter, however, considers changes on an attribute-by-attribute basis, instead of swapping each profile with each possible candidate profile. It is thus no longer needed to generate a candidate set of all possible profiles. The downside is that one can no longer eliminate particular profiles in advance. All arguments, and the output that CEA produces, are exactly the same as the ones documented in the Modfed function. The only difference is that the lvls and coding arguments now directly serve as input for the CEA function.
R> set.seed(123)
R> lvls = c(4, 2, 3)
R> coding = c("E", "E", "E")
R> alt.cte <- c(1, 0, 1)
R> m <- c(0.1, 1.5, 1.2, 0.8, -0.5, 1, -1.5, 0.6)
R> v <- diag(length(m))
R> ps <- MASS::mvrnorm(n = 500, mu = m, Sigma = v)
R> ps <- list(ps[, 1:2], ps[, 3:8])
R> D.nc_cea <- CEA(lvls = lvls, coding = coding, n.alts = 3,
+ n.sets = 10, alt.cte = alt.cte, par.draws = ps,
+ no.choice = TRUE, best = TRUE)
R> D.nc_cea
$`design`
alt1.cte alt3.cte Var11 Var12 Var13 Var21 Var31 Var32
set1.alt1 1 0 0 0 1 -1 -1 -1
set1.alt2 0 0 1 0 0 -1 0 1
no.choice 0 1 0 0 0 0 0 0
set2.alt1 1 0 1 0 0 1 0 1
set2.alt2 0 0 0 1 0 1 -1 -1
no.choice 0 1 0 0 0 0 0 0
set3.alt1 1 0 -1 -1 -1 1 -1 -1
set3.alt2 0 0 0 0 1 -1 0 1
no.choice 0 1 0 0 0 0 0 0
set4.alt1 1 0 1 0 0 1 -1 -1
set4.alt2 0 0 0 0 1 1 -1 -1
no.choice 0 1 0 0 0 0 0 0
set5.alt1 1 0 1 0 0 1 1 0
set5.alt2 0 0 0 1 0 1 0 1
no.choice 0 1 0 0 0 0 0 0
set6.alt1 1 0 -1 -1 -1 1 1 0
set6.alt2 0 0 1 0 0 -1 -1 -1
no.choice 0 1 0 0 0 0 0 0
set7.alt1 1 0 0 0 1 1 0 1
set7.alt2 0 0 1 0 0 1 1 0
no.choice 0 1 0 0 0 0 0 0
set8.alt1 1 0 0 1 0 -1 -1 -1
set8.alt2 0 0 0 0 1 1 -1 -1
no.choice 0 1 0 0 0 0 0 0
set9.alt1 1 0 0 1 0 -1 0 1
set9.alt2 0 0 -1 -1 -1 -1 -1 -1
no.choice 0 1 0 0 0 0 0 0
set10.alt1 1 0 0 1 0 1 1 0
set10.alt2 0 0 -1 -1 -1 1 0 1
no.choice 0 1 0 0 0 0 0 0
$error
[1] 1.237644
$inf.error
[1] 0
$probs
Pr(alt1) Pr(alt2) Pr(no choice)
set1 0.1870938 0.3265117 0.4863944
set2 0.4697414 0.3856513 0.1446072
set3 0.3605579 0.1517753 0.4876668
set4 0.6058905 0.1608716 0.2332379
set5 0.2186997 0.4996534 0.2816469
set6 0.1087114 0.4204388 0.4708497
set7 0.3627346 0.2461984 0.3910670
set8 0.2609301 0.3730199 0.3660500
set9 0.3204917 0.1179716 0.5615366
set10 0.2599124 0.2614408 0.4786468
DBerr
As previously mentioned, the choice for an efficient design approach is related to the confidence one has in the prior specification of the parameters that are to be estimated (Walker et al. 2018). Other design approaches often rely on different assumptions regarding the true data generating process. Foreseeing the design implications when some assumptions are violated, or when the prior deviates from the true parameters, can be difficult. The DBerr function allows designs resulting from different assumptions to be evaluated and compared over a range of plausible true parameters. The function simply calculates the D(B)-error, given a design des and parameter values par.draws. The latter can be a vector, in which case the D-error is calculated, or a matrix in which each row is a draw. If a matrix is supplied and mean = TRUE, the DB-error is calculated; if mean = FALSE, a vector of D-errors is returned. Lastly, the number of alternatives in each choice set needs to be specified in n.alts.
In what follows we show the code used to generate the robustness plots in Section 2.3. We compared
different design solutions, each generated with different prior beliefs.
R> library("truncnorm")
R> set.seed(123)
R> N <- 500
R> U <- rnorm(n = N, mean = 0, sd = 3)
R> S <- rtruncnorm(n = N, a = -Inf, b = 0, mean = 0, sd = 3)
R> I <- rtruncnorm(n = N, a = -Inf, b = 0, mean = -1, sd = 0.1)
R> I2 <- rtruncnorm(n = N, a = -Inf, b = Inf, mean = -2.25, sd = 1)
We draw a sample from each of the prior distributions. The sample of the uninformative prior is stored in U, S contains the semi-informative sample, and I and I2 are the informative samples for βprice. We will continue by showing how we generated the informative design and check the robustness of that design. The same can be done for the other priors by replacing I with the desired sample.
We used the CEA algorithm to generate an efficient design given sample I. Afterwards, the design matrix is stored in des. There were two attributes, price and time, each containing 5 levels. If we label the levels of price as 1 = $20, 2 = $40, 3 = $60, 4 = $80, 5 = $100, and the levels of the time attribute as 1 = 1h, 2 = 2h, 3 = 3h, 4 = 4h, 5 = 5h, then if βprice = −1 and βtime = −1, one is willing to pay $20 to reduce time by one hour, or VOT = $20/h. Since the coefficient of time was fixed at -1, we can define a range of possible true values of time by varying βprice from 0 to -5. Finally, the DBerr function can be used to calculate the efficiency of the design given each possible true VOT; these steps are sketched below. Specifying mean = TRUE would return the mean, which equals the DB-error over the specified range. One can thus test multiple designs for different plausible parameters.
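A sketch of these steps, assuming CEA accepts continuous attribute levels through a c.lvls argument analogous to Profiles, and that the design columns are ordered (price, time):

R> levels <- list(c(1, 2, 3, 4, 5), c(1, 2, 3, 4, 5))
R> des <- CEA(lvls = c(5, 5), coding = c("C", "C"), c.lvls = levels,
+    n.sets = 20, n.alts = 2, par.draws = cbind(I, -1))$design
R> b.price <- seq(0, -5, by = -0.05)   # implied true VOT range: 0 - 100 $/h
R> true.pars <- cbind(b.price, -1)     # beta_time fixed at -1
R> errors <- DBerr(par.draws = true.pars, des = des, n.alts = 2,
+    mean = FALSE)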
As can be seen in Figure 1, the very informative prior performs well when the best guess ($20/h)
equals the true VOT, but becomes quickly inefficient when the true VOT deviates.
4. Adaptive designs for the MIXL model
The MIXL model allows the preference parameters β to vary across respondents according to a density f(β). The probability that alternative k in choice set s is chosen then becomes

\pi_{ks} = \int p_{ks}(\beta) f(\beta) \, d\beta.  (8)

The logit probability p_{ks}(\beta), as specified in equation 2, is weighted over the density function f(β) describing the preference heterogeneity. The likelihood of the observed responses y_N for N respondents can then be written as

L(y_N | X_N, \theta) = \prod_{n=1}^{N} \int \left( \prod_{s=1}^{S} \prod_{k=1}^{K} p_{ksn}(\beta_n)^{y_{ksn}} \right) f(\beta_n | \theta) \, d\beta_n,  (9)
where yN is a binary vector with elements yksn which equal 1 if alternative k in choice set s was
chosen by respondent n, and 0 otherwise. XN is the full design matrix containing the subdesigns
for N respondents, and θ contains the parameters of the population distribution f (β). MIXL models
are either estimated with a hierarchical Bayesian estimation procedure or with a simulated likelihood
approach.
Figure 3: Flow chart of the IASB algorithm, in which an individual response y_{sn} on choice set X_{sn} is used to update the prior preference distribution g(β). The next choice set is selected based on the updated preference distribution and previously selected choice sets.
The flow chart in Figure 3 represents the idea of the IASB approach. At the top we see the distribution of the individual preferences of a respondent β_n. Initially, a sample is drawn from the prior distribution β_n ∼ N(µ, Σ). Based on that sample, one or more initial choice sets X will be selected by minimizing the D(B)-error. After observing the response(s) y_{sn}, the prior is updated in a Bayesian way and samples are drawn from the posterior distribution g(β). This process is repeated as many times as there are individual sequential choice sets, resulting in an individualized design for each respondent. It has been proven that generating designs according to this IASB approach results in choice data of high quality for estimating the MIXL model (Danthurebandara et al. 2011; Yu et al. 2011).
4.3. Optimal designs for the MIXL model with package idefix
In this part, the functions necessary to generate adaptive choice sets in a sequential way, as explained in Section 4.2, are described. The two main building blocks of the IASB approach are an importance sampling algorithm, ImpsampMNL, to sample from the posterior, and the SeqMOD algorithm, to select the most efficient choice set given a posterior sample. Furthermore, a function to generate responses, RespondMNL, is included; this way, all elements to set up a simulation study, as depicted in Figure 4, are present. If the reader has no interest in evaluating an adaptive design approach with response simulation, but rather in collecting empirical data, one can proceed to Section 5.
Figure 4: Simulation setup for the IASB approach. ImpsampMNL, SeqMOD and RespondMNL are functions included in the idefix package.
In what follows an example is given of the workflow, together with a description of each of the
functions involved. In the example, choice data for one respondent will be generated, while part of
the choice sets are selected making use of the IASB methodology. We will assume a scenario with
three attributes. The first attribute has four levels, the second attribute has three levels and the last one
has two levels. We choose to use dummy coding for all attributes and not to include an alternative
specific constant. As a prior we use draws from a multivariate normal distribution with mean vector
m and covariance matrix v. In each choice set there are two alternatives.
First we will generate a DB-optimal design containing eight initial fixed choice sets, in the same way as explained in Section 3.3.
R> set.seed(123)
R> cs <- Profiles(lvls = c(4, 3, 2), coding = c("D", "D", "D"))
R> m <- c(0.25, 0.5, 1, -0.5, -1, 0.5)
R> v <- diag(length(m))
R> ps <- MASS::mvrnorm(n = 500, mu = m, Sigma = v)
R> init.des <- Modfed(cand.set = cs, n.sets = 8, n.alts = 2,
+ alt.cte = c(0, 0), par.draws = ps)$design
R> init.des
...
set5.alt1 0 0 0 0 0 0
set5.alt2 1 0 0 1 0 1
set6.alt1 0 1 0 1 0 0
set6.alt2 0 0 0 0 1 0
set7.alt1 0 0 1 1 0 0
set7.alt2 1 0 0 0 0 1
set8.alt1 0 1 0 0 1 1
set8.alt2 0 0 1 0 0 0
The next step is to simulate choice data for the initial design. We assume the following true preference parameter vector (the values below are chosen purely for illustration):
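R> # Illustrative true parameter values: 3 + 2 + 1 dummy coefficients
R> truePREF <- c(0.8, 1, 1.2, -0.4, -0.75, 0.9)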
RespondMNL
To simulate choices based on the logit model the RespondMNL function can be used. The true
(individual) preference parameters can be set in par, in this case truePREF. In des, a matrix
should be specified in which each row is a profile. This can be a single choice set or a design matrix
containing several choice sets, in this example our initial design init.des is used. The number of
alternatives per choice set should be set in n.alts.
R> set.seed(123)
R> y.sim <- RespondMNL(par = truePREF, des = init.des, n.alts = 2)
R> y.sim
[1] 1 0 1 0 1 0 0 1 0 1 1 0 0 1 0 1
The output is a binary vector indicating the simulated choices. Given that K = 2, the alternatives that
were chosen in each choice set were respectively the first, first, first, second, second, first, second and
second alternative.
At this point we have information about the true preferences captured in the responses y.sim, given
to the choice sets in init.des. We can now take this information into account by updating our prior
distribution.
ImpsampMNL
Since the posterior has no closed form, an importance sampling algorithm is provided with the ImpsampMNL function, which can be used to take draws from the posterior after each observed response. As importance density, a multivariate Student t distribution with degrees of freedom equal to the dimensionality of the parameter space is used. The mean of the importance density is the mode of the posterior distribution, and its covariance matrix is −H^{-1}, with H the Hessian matrix of the log-posterior distribution. The draws are taken systematically using extensible shifted lattice points (Yu et al. 2011).
As prior distribution, the multivariate normal is assumed, for which the mean vector can be specified in prior.mean and the covariance matrix in prior.covar. Here we use the same prior mean vector m and covariance matrix v as the ones we used to generate the initial design. Previously presented choice sets, in this case init.des, can be passed through des, whereas the simulated responses y.sim can be passed through y. The number of draws can be specified in n.draws; here we take 200 draws from the posterior distribution. Note that the ImpsampMNL function includes three more arguments which are not required. If alternative specific constants are present, those should be specified in the alt.cte argument, which is NULL by default. The last two arguments, lower and upper, allow the user to sample from a truncated normal distribution. This can be desired when the researcher is certain about the sign of one or more parameters. For example, when there are three parameters and the researcher knows the first parameter should be positive, a lower boundary of zero can be specified for that parameter with lower = c(0, -Inf, -Inf). Equivalently, when only negative draws are desired for the first parameter, an upper boundary can be specified by stating upper = c(0, Inf, Inf). In this case, however, we do not know the signs of the parameters and apply no boundaries.
R> set.seed(123)
R> draws <- ImpsampMNL(n.draws = 200, prior.mean = m,
+ prior.covar = v, des = init.des, n.alts = 2, y = y.sim)
R> draws
The output contains the sample $sample in which each row is a draw from the posterior. The
importance weight of each draw is given in $weights, the mode and the covariance matrix of the
importance density are given respectively in $max and $covar.
$`sample`
Var12 Var13 Var14 Var22 Var23 Var32
[1,] 1.070865236 0.638213757 2.049646085 -1.334767e+00 -2.14475738 0.09857508
[2,] 0.662390619 0.457390508 0.571839377 -1.842649e-01 -0.55735903 1.87854394
[3,] 2.137729327 1.737021903 1.640171096 -3.223393e-01 -1.56516482 0.83483329
...
[200,] 0.584339280 0.275411214 1.928374787 -1.710515e+00 -1.36622933 1.51836041
$weights
[1] 0.0064215784 0.0048492541 0.0061226039 0.0048566461 0.0068167857 0.0044063350
[7] 0.0054392379 0.0067403456 0.0033642025 0.0028369841 0.0043626513 0.0059685074
...
[199] 0.0040685285 0.0044834598
$max
[1] 0.8666279 0.5478021 1.3107427 -0.7595160 -1.3510582 0.9885595
$covar
Var12 Var13 Var14 Var22 Var23 Var32
Var12 0.588263554 0.06313684 0.07088819 0.007553871 -0.005398027 0.005892646
Var13 0.063136836 0.63202630 0.07933276 -0.015982465 -0.040937948 -0.049903465
Var14 0.070888189 0.07933276 0.67186035 -0.062542644 -0.025917076 0.044022941
Var22 0.007553871 -0.01598247 -0.06254264 0.533996254 0.064650234 -0.031610052
Var23 -0.005398027 -0.04093795 -0.02591708 0.064650234 0.612430932 -0.072343996
Var32 0.005892646 -0.04990347 0.04402294 -0.031610052 -0.072343996 0.462797374
Given the draws from the posterior distribution we can now select the next optimal choice set.
SeqMOD
The SeqMOD function can be used to select the next optimal choice set. The algorithm will evaluate
each possible choice set in combination with the previously selected choice sets and select the one
which maximizes efficiency, given the posterior preference distribution.
In the SeqMOD algorithm, the previously selected choice sets can be specified in des; in the example these are stored in init.des. The candidate set is the same as the one we used to generate init.des, namely cs. The sample we obtained from the posterior by using ImpsampMNL is saved in draws. The draws themselves, dr, are specified in par.draws and their importance weights, w, in weights. Our prior covariance matrix v is passed through prior.covar. Optional arguments such as alt.cte, no.choice and parallel are the same as previously explained for the Modfed function. Lastly, reduce is TRUE by default and reduces the set of all potential choice sets to a subset of choice sets that have a unique information matrix.
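A sketch of the corresponding call, with the argument names as just described:

R> dr <- draws$sample
R> w <- draws$weights
R> set <- SeqMOD(des = init.des, cand.set = cs, n.alts = 2,
+    par.draws = dr, prior.covar = v, weights = w)
R> set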
The output contains the optimal next choice set among all possible choice sets, together with the associated error.
$`set`
Var12 Var13 Var14 Var22 Var23 Var32
[1,] 1 0 0 1 0 0
[2,] 0 1 0 0 1 1
$error
[1] 0.002874806
It should be noted that the criterion used to evaluate the efficiency in the SeqMOD algorithm is slightly different from the one specified in equation 7. Since the adaptive approach is completely Bayesian, we wish to approximate the covariance matrix of the posterior preference distribution with the generalized Fisher information matrix (GFIM). The latter is computed as minus the Hessian matrix of the log-posterior density and is given by:

I_{GFIM}(\beta|X) = I_{FIM}(\beta|X) - E_Y\left[ \frac{\partial^2 \ln \pi_0(\beta)}{\partial \beta \, \partial \beta^{\top}} \right].  (10)
Here I_{FIM}(β|X) represents the Fisher information matrix of the MNL model as described in equation 4. The second part represents the Fisher information contained in the prior density π_0(β). When a multivariate normal prior preference distribution with covariance matrix Σ_0 is assumed, the second term simplifies to

E_Y\left[ \frac{\partial^2 \ln \pi_0(\beta)}{\partial \beta \, \partial \beta^{\top}} \right] = -\Sigma_0^{-1},  (11)

i.e., minus the inverse of the covariance matrix of the prior preference distribution. The expression for the generalized Fisher information matrix then becomes:

I_{GFIM}(\beta|X) = I_{FIM}(\beta|X) + \Sigma_0^{-1}.  (12)
The D_B^A-error, i.e., the DB-error used in an adaptive scenario, is then calculated as:

D_B^A-error = \int \det\left( I_{GFIM}(\beta|X) \right)^{-1/p} \pi_0(\beta) \, d\beta.  (13)

Note that the less informative the prior preference distribution is, i.e., the closer Σ_0^{-1} is to the zero matrix, the smaller the difference between the D_B^A-error and the D_B-error.
To set up a simulation study, the previous steps can be repeated as many times as additional adaptive
choice sets are required. This can be done for N different participants, each time drawing a unique
truePREF vector from an a priori specified population preference distribution. The researcher can
vary the heterogeneity of this population distribution along with the number of adaptive choice sets
and other specifications. The simulated choice data can be stored and prepared for estimation using
existing R packages for which we refer to Section 6.
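A minimal sketch of such a loop for a single respondent, continuing the objects from the example above (the number of adaptive sets and all settings are illustrative):

R> n.adapt <- 4                 # illustrative number of adaptive sets
R> des <- init.des
R> y <- y.sim
R> for (i in seq_len(n.adapt)) {
+    # update the posterior with all data observed so far
+    draws <- ImpsampMNL(n.draws = 200, prior.mean = m, prior.covar = v,
+      des = des, n.alts = 2, y = y)
+    # select the next most efficient choice set
+    nxt <- SeqMOD(des = des, cand.set = cs, n.alts = 2,
+      par.draws = draws$sample, prior.covar = v,
+      weights = draws$weights)$set
+    # simulate the response and append the new set to the design
+    y <- c(y, RespondMNL(par = truePREF, des = nxt, n.alts = 2))
+    des <- rbind(des, nxt)
+  }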
R> data("example_design")
R> xdes <- example_design
R> xdes
set1.alt2 0 1 0 0 1 0
set2.alt1 0 0 0 0 1 0
set2.alt2 1 0 1 0 0 0
set3.alt1 0 1 0 0 0 1
set3.alt2 0 0 0 1 0 0
set4.alt1 0 1 0 0 0 0
set4.alt2 1 0 1 0 1 0
set5.alt1 0 0 0 1 0 1
set5.alt2 0 1 0 1 0 0
set6.alt1 1 0 0 1 1 0
set6.alt2 0 1 1 0 0 0
set7.alt1 0 0 1 0 0 1
set7.alt2 1 0 0 0 1 0
set8.alt1 0 1 1 0 1 0
set8.alt2 1 0 0 1 0 1
The total number of choice sets that need to be presented is defined in n.total. If no choice sets besides those in the specified design are desired, n.total equals the number of choice sets in des. If n.total is larger than the number of sets provided in des, adaptive sets will be generated as explained in the third example. In alts, the names of the alternatives have to be
specified. In this case there are two alternatives, named Alt A and Alt B. In atts, the names of the
attributes are specified, here price, time, and comfort.
The attribute levels are specified in lvl.names, which is a list containing one character vector for
each attribute. The levels for the first attribute are 10, 5, and 1 dollar. The attribute time can take
values 20, 12 and 3 minutes. Lastly, the comfort attribute can vary between bad, average, and good.
The type of coding used in des should be specified in coding. This is the same argument as
explained in the Profiles function in Section 3.3. Here all attributes are dummy coded.
There are three arguments where some text can be provided. The character string b.text will appear
above the options where the participant can indicate his or her choice. In this example the text "Please
choose the alternative you prefer" appears in the choice task as can be seen in Figure 5. Before the
discrete choice task starts, some instructions can be given. This text can be provided to intro.text
in the form of a character string. In this case "Welcome, here are some instructions ... good luck!"
will appear on screen before the survey starts. In the same way some ending note can be specified
in end.text. The character string "Thanks for taking the survey" will appear on screen when the
survey is completed.
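Collecting these arguments, the corresponding call could look as follows. This is a sketch: the objects alternatives, attributes, labels, b.text, i.text and e.text simply hold the values described above, n.total = 8 matches the eight choice sets in xdes, and the formal argument names buttons.text, intro.text and end.text are assumptions based on the package documentation.

R> alternatives <- c("Alt A", "Alt B")
R> attributes <- c("price", "time", "comfort")
R> labels <- list(c("$10", "$5", "$1"), c("20 min", "12 min", "3 min"),
+    c("bad", "average", "good"))
R> b.text <- "Please choose the alternative you prefer"
R> i.text <- "Welcome, here are some instructions ... good luck!"
R> e.text <- "Thanks for taking the survey"
R> SurveyApp(des = xdes, n.total = 8, alts = alternatives, atts = attributes,
+    lvl.names = labels, coding = c("D", "D", "D"), buttons.text = b.text,
+    intro.text = i.text, end.text = e.text)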
When running the SurveyApp function, a screen will pop up, starting with the initial text provided
in intro.text. Next all the choice sets in the design provided in des will be presented on screen
one after the other as can be seen in Figure 5.
Figure 5: Example of a discrete choice experiment, generated with the SurveyApp function.
Lastly, the directory to store the observed responses, together with the presented design, can be specified in data.dir. The default is NULL, in which case no data will be stored. If a directory is specified, two text files will be written to that directory at the end of the survey. The first contains the decoded design as it was presented to the participant, together with the sequence of alternatives that were chosen; this is similar to the Decode output previously described in Section 3.3. The second file has the same file name, which starts with the same number (based on Sys.time()), except that "char" is replaced with "num". This file contains the coded design matrix and the binary response vector, and can be used to estimate the preference parameters (see Section 6 for more information). The file containing the decoded design can be used to inspect the choice sets as they were perceived by the respondents. An example of both files can be seen in Figure 6.
Figure 6: Example of the two data files that get stored by the SurveyApp function.
If a no choice option is desired, the specified design should already contain a no choice alternative in each choice set. In this example we use a toy design, generated with the Modfed function and available in the idefix package. Except for the inclusion of a no choice alternative, this design has the same properties as the previous example design.
R> data("nochoice_design")
R> ncdes <- nochoice_design
R> ncdes
set5.alt1 0 0 0 0 0 0 1
set5.alt2 0 0 1 0 1 0 0
no.choice 1 0 0 0 0 0 0
set6.alt1 0 0 0 0 1 1 0
set6.alt2 0 0 1 0 0 0 0
no.choice 1 0 0 0 0 0 0
set7.alt1 0 0 0 1 0 0 1
set7.alt2 0 1 0 0 0 1 0
no.choice 1 0 0 0 0 0 0
set8.alt1 0 0 0 1 0 1 0
set8.alt2 0 1 0 0 1 0 1
no.choice 1 0 0 0 0 0 0
Now three alternative names are required; the last one indicates the no choice option. Since there is an alternative specific constant, the alt.cte argument should be specified. As before, this is done with a vector in which each element indicates whether an asc is present for the corresponding alternative or not. Furthermore, the option no.choice, which is NULL by default, should be set to an integer indicating the no choice alternative.
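A sketch of the corresponding call, reusing the objects from the previous example: the label "None" for the third alternative is hypothetical, alt.cte = c(0, 0, 1) assumes an alternative specific constant for the no choice alternative only (matching the first column of ncdes), and no.choice = 3 marks the third alternative as the no choice option.

R> SurveyApp(des = ncdes, n.total = 8, alts = c("Alt A", "Alt B", "None"),
+    atts = attributes, lvl.names = labels, coding = c("D", "D", "D"),
+    buttons.text = b.text, intro.text = i.text, end.text = e.text,
+    alt.cte = c(0, 0, 1), no.choice = 3)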
Figure 7: Example of a discrete choice experiment with a no choice option, generated with the
SurveyApp function.
As can be seen in Figure 7, the no choice option is not part of the decoded alternatives; it is, however, possible to select it instead of alternative A or B.
The same files as in the previous example are saved in data.dir if a directory is specified. The design will now contain the initial choice sets and the four additional adaptive sets.
set5.alt2 0 1 0 1 0 0 1
set6.alt1 1 0 0 1 1 0 0
set6.alt2 0 1 1 0 0 0 1
set7.alt1 0 0 1 0 0 1 1
set7.alt2 1 0 0 0 1 0 0
set8.alt1 0 1 1 0 1 0 0
set8.alt2 1 0 0 1 0 1 1
set9.alt1 0 1 0 1 1 0 1
set9.alt2 1 0 0 0 0 1 0
set10.alt1 0 1 0 1 1 0 0
set10.alt2 1 0 1 0 0 1 1
set11.alt1 0 1 0 1 1 0 0
set11.alt2 1 0 1 0 0 1 1
set12.alt1 0 1 0 1 1 0 1
set12.alt2 1 0 0 0 0 1 0
6. Data management
The idefix package always uses the same data format. An individual choice design is represented as
a matrix in which each row is an alternative, and subsequent rows form choice sets (see for example
the output of Modfed in Section 3.3). Individual choice data is contained in a binary vector with
length equal to the number of alternatives (rows) in the design matrix. For each choice set the chosen
alternative is indicated with a 1 and all others with a 0. Individual design matrices as well as the choice
vectors can be stacked together to form aggregate choice designs and an aggregate data vector. The
function LoadData can be used to do this. To estimate the preference parameters on the acquired data, several R packages can be used. Some of those packages, however, require different data formats. The function Datatrans allows one to switch between the format of idefix and that of the desired estimation package.
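A call along the following lines loads the stored data. This is a sketch: data.dir is a hypothetical directory in which SurveyApp stored the response files, and the type argument selecting the coded ("num") files is an assumption based on the file naming described earlier.

R> data.dir <- getwd()  # hypothetical directory with the stored files
R> agg <- LoadData(data.dir = data.dir, type = "num")
R> agg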
7 set6.alt2 0 1 1 0 0 0 1
7 set7.alt1 0 0 1 0 0 1 0
7 set7.alt2 1 0 0 0 1 0 1
7 set8.alt1 0 1 1 0 1 0 1
7 set8.alt2 1 0 0 1 0 1 0
Datatrans
To illustrate the data transformation, we make use of an aggregate dataset included in idefix. The dataset aggregate_design is the same one as shown at the end of Section 6.1. There are seven respondents who each responded to eight choice sets containing two alternatives. The alternatives consist of three dummy coded attributes with three levels each. In order to use the Datatrans function, we need to separate the design matrix from the responses. To get the design matrix, we remove the ID column (the first column), the column indicating the set number (the second column), and the response column (column nine). What remains is the design matrix X.
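One way to do this, assuming aggregate_design is organized in the nine columns just described:

R> data("aggregate_design")
R> # Columns 3 to 8 hold the six dummy coded attribute levels.
R> X <- as.matrix(aggregate_design[, 3:8])
R> # Column 9 holds the binary responses.
R> y <- aggregate_design[, 9]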
All responses are captured in a binary vector containing the choice data of all respondents. Since
there were seven respondents who each responded to eight choice sets containing two alternatives, y
contains 112 elements.
[1] 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1
[30] 0 0 1 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 1 0 0 1 1 0 0 1 1 0
[59] 1 0 1 0 1 0 1 0 0 1 1 0 0 1 0 1 0 1 0 1 0 1 1 0 0 1 1 0 0
[88] 1 1 0 0 1 1 0 1 0 0 1 0 1 1 0 1 0 0 1 0 1 0 1 1 0
The function Datatrans can be used to transform the data into the correct format in order to use existing estimation packages in R. In the pkg argument one can specify the package for which the data needs to be transformed. There are six options: "bayesm" indicates the rhierMnlRwMixture function of package bayesm (Rossi 2015), "ChoiceModelR" should be used for the choicemodelr function of package ChoiceModelR (Sermas 2012), "RSGHB" for the doHB function of the RSGHB (Dumont and Keller 2015) package, "Mixed.Probit" for the rbprobitGibbs function of the bayesm package, "mlogit" for the mlogit function of package mlogit (Croissant 2013), and lastly "Rchoice" for the Rchoice function of the Rchoice package (Sarrias 2016).
Furthermore, the number of alternatives per choice set n.alts, the number of choice sets per respondent n.sets, the number of respondents n.resp, and the number of parameters n.par should be specified. If the response vector y is not in binary format but in discrete format (one integer indicating the chosen alternative per choice set), the bin argument has to be set to FALSE. The alt.var argument, which is NULL by default, can be specified if the names of the alternatives should be different. This requires a character vector with length equal to the number of alternatives in each choice set. This only has influence on the output for mlogit and Rchoice.
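For the data described above, the transformation for bayesm could be carried out as follows. This is a sketch: the argument name des for the design matrix is an assumption, and the argument values reflect the seven respondents, eight choice sets, two alternatives and six parameters of the example.

R> bayesm.data <- Datatrans(pkg = "bayesm", des = X, y = y, n.alts = 2,
+    n.sets = 8, n.resp = 7, n.par = 6, bin = TRUE)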
In the example above, the data is transformed in order to use the rhierMnlRwMixture function of package bayesm. The data format required is a list with elements $lgtdata and $p, where $lgtdata is a list of nlgt = length(lgtdata) lists, each containing the MNL data of one cross-section unit. lgtdata[[i]]$y is a vector of responses, with one element per choice set indicating the alternative chosen by respondent i. lgtdata[[i]]$X is the design matrix of respondent i, and p is the number of alternatives in each choice set. The output of Datatrans is the following (only the first respondent is shown):
$p
[1] 2
$lgtdata
$lgtdata[[1]]
$lgtdata[[1]]$y
[1] 1 2 1 2 1 2 1 2
$lgtdata[[1]]$X
par.1 par.2 par.3 par.4 par.5 par.6
[1,] 1 0 1 0 0 0
[2,] 0 1 0 0 1 0
[3,] 0 0 0 0 1 0
[4,] 1 0 1 0 0 0
[5,] 0 1 0 0 0 1
[6,] 0 0 0 1 0 0
[7,] 0 1 0 0 0 0
[8,] 1 0 1 0 1 0
[9,] 0 0 0 1 0 1
[10,] 0 1 0 1 0 0
[11,] 1 0 0 1 1 0
[12,] 0 1 1 0 0 0
[13,] 0 0 1 0 0 1
[14,] 1 0 0 0 1 0
[15,] 0 1 1 0 1 0
[16,] 1 0 0 1 0 1
In the following example, the same data is transformed in order to use the mlogit package for estimation. The output is a data.frame in long format that contains a boolean choice variable Choice, the index of the alternative alt.names, and an individual index id.var.
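A sketch of the corresponding call, with the same data and argument values as in the bayesm example above:

R> mlogit.data <- Datatrans(pkg = "mlogit", des = X, y = y, n.alts = 2,
+    n.sets = 8, n.resp = 7, n.par = 6, bin = TRUE)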
7. Future development
As previously mentioned, the package is mainly concerned with providing D(B)-efficient designs. Of course, any statistic or feature that is useful to practitioners in selecting an appropriate design can be included. We therefore encourage users to give feedback and suggestions to further improve the package. The most recent updates can be followed on https://github.com/traets/idefix, and we plan to implement further extensions in the near future.
References
Abrantes PA, Wardman MR (2011). "Meta-Analysis of UK Values of Travel Time: An Update." Transportation Research Part A: Policy and Practice, 45(1), 1–17. doi:10.1016/j.tra.2010.08.003.
Aizaki H (2012). "Basic Functions for Supporting an Implementation of Choice Experiments in R." Journal of Statistical Software, Code Snippets, 50(2), 1–24. doi:10.18637/jss.v050.c02.
Bliemer MC, Rose JM (2010a). "Construction of Experimental Designs for Mixed Logit Models Allowing for Correlation Across Choice Observations." Transportation Research Part B: Methodological, 44(6), 720–734. doi:10.1016/j.trb.2009.12.004.
Bliemer MC, Rose JM (2010b). "Serial Choice Conjoint Analysis for Estimating Discrete Choice Models." Chapter 6, pp. 137–161. doi:10.1108/9781849507738-006.
Bliemer MC, Rose JM, Chorus CG (2017). "Detecting Dominance in Stated Choice Data and Accounting for Dominance-based Scale Differences in Logit Models." Transportation Research Part B: Methodological, 102, 83–104. doi:10.1016/j.trb.2017.05.005.
Bliemer MC, Rose JM, Hess S (2008). "Approximation of Bayesian Efficiency in Experimental Choice Designs." Journal of Choice Modelling, 1(1), 98–126. doi:10.1016/S1755-5345(13)70024-1.
Bridges JF, et al. (2011). "Conjoint Analysis Applications in Health-a Checklist: A Report of the ISPOR Good Research Practices for Conjoint Analysis Task Force." Value in Health, 14(4), 403–413. doi:10.1016/j.jval.2010.11.013.
Crabbe M, Vandebroek M (2012). "Using Appropriate Prior Information to Eliminate Choice Sets with a Dominant Alternative from D-efficient Designs." Journal of Choice Modelling, 5(1), 22–45. doi:10.1016/S1755-5345(13)70046-0.
Croissant Y (2013). mlogit: Multinomial Logit Model. R package version 0.2-4, URL https://CRAN.R-project.org/package=mlogit.
Dumont J, Keller J (2015). RSGHB: Functions for Hierarchical Bayesian Estimation: A Flexible Approach. R package version 1.1.2, URL https://CRAN.R-project.org/package=RSGHB.
Fedorov VV (1972). Theory of Optimal Experiments. Probability and Mathematical Statistics 12. Academic Press, New York.
Harman R, Filova L (2016). OptimalDesign: Algorithms for D-, A-, and IV-Optimal Designs. R package version 0.2, URL https://CRAN.R-project.org/package=OptimalDesign.
Hensher DA, Greene WH (2003). "The Mixed Logit Model: The State of Practice." Transportation, 30(2), 133–176. doi:10.1023/A:1022558715350.
Horne J (2014). choiceDes: Design Functions for Choice Studies. R package version 0.9-1, URL https://CRAN.R-project.org/package=choiceDes.
Huber J, Zwerina K (1996). "The Importance of Utility Balance in Efficient Choice Designs." Journal of Marketing Research, 33(3), 307–317. doi:10.2307/3152127.
Johnson FR, et al. (2013). "Constructing Experimental Designs for Discrete-Choice Experiments: Report of the ISPOR Conjoint Analysis Experimental Design Good Research Practices Task Force." Value in Health, 16(1), 3–13. doi:10.1016/j.jval.2012.08.2223.
Marschak J (1950). "Rational Behavior, Uncertain Prospects, and Measurable Utility." Econometrica, 18(2), 111–141. doi:10.2307/1907264.
Train KE (2003). Discrete Choice Methods with Simulation. Cambridge University Press. doi:10.1017/CBO9780511753930.
Wardman M, Chintakayala VPK, de Jong G (2016). "Values of Travel Time in Europe: Review and Meta-analysis." Transportation Research Part A: Policy and Practice, 94, 93–111. doi:10.1016/j.tra.2016.08.019.
Wheeler B (2014). AlgDesign: Algorithmic Experimental Design. R package version 1.1-7.3, URL https://CRAN.R-project.org/package=AlgDesign.
Affiliation:
Martina Vandebroek
Leuven Statistics Research Centre
KU Leuven
Celestijnenlaan 200B
3001 Leuven, Belgium
E-mail: Martina.Vandebroek@kuleuven.be
Frits Traets
Faculty of Economics and Business
KU Leuven
Naamsestraat 69 bus 3500
3000 Leuven, Belgium
E-mail: frits.traets@gmail.com