Hello! I'm trying to create an orthogonal design for a choice experiment in which participants choose between 4 mobility alternatives. I want to fix one of the 4 alternatives as a status quo for the individual, but I don't know whether it is correct to define a reference scenario in an orthogonal design. If it is possible, how do I program it in my case? (I have only found answers to this question for efficient designs.)

An orthogonal design does not optimise for the specified model, so you can simply leave out the reference alternative and create an orthogonal design without it; in your survey, you then add the reference alternative back in. Note that your data will not be orthogonal because of the reference alternative. Note also that there is NO benefit in using an orthogonal design for choice experiments.

I generally recommend using an efficient design with zero priors instead, which maximises the information the design captures, accounts for the reference alternative, and offers much more flexibility.
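For illustration only, a sketch of what such a syntax might look like in Ngene, with a constant-only status-quo alternative and zero priors (the [0] values). The alternative names, attributes, levels and number of rows here are my own assumptions, not taken from your study:

```
design
;alts = car_sq, bus, train, bike
;rows = 12
;eff = (mnl, d)
;model:
U(car_sq) = asc_car[0] /
U(bus)    = asc_bus[0] + b_cost[0]*cost_bus[1,2,4] + b_time[0]*time_bus[10,20,30] /
U(train)  = asc_train[0] + b_cost[0]*cost_train[1,2,4] + b_time[0]*time_train[10,20,30] /
U(bike)   = b_time[0]*time_bike[15,30,45]
$
```

The status quo enters the utility specification with only a constant (no design attributes), and one alternative (here bike) has its constant normalised to zero for identification.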

Hi, sorry for bothering you again! I followed your recommendation to use an efficient design with priors = 0, and the following is the code I created (the status quo for the respondent is using the car as a means of transport):

However, when I run the code in Ngene, it spends a long time making evaluations and does not converge to a final result. Reading the manual, I found that if a non-Bayesian MNL design does not generate quickly, this usually means there is a specification problem, but I can't see where the error in my code is. I even tried running an example from the manual to check whether the problem comes from the software itself, and it is also taking too long to generate a final result (after almost an hour it is still in the same status). This is the code I used from the manual:

The software doesn't converge in the way it does when estimating a model, unfortunately. In estimation, we have algorithms to locate the maximum of the LL function (e.g., Newton-Raphson). When generating a design, whilst we have an objective function (say, minimise the D-error), there is no algorithm to continually improve the design iteratively as you do in estimation with the LL function (unless you allow continuous, non-discrete levels, or make a number of very restrictive assumptions). Hence, most algorithms test a set of random designs (some in smarter ways than others), as there are probably billions of billions of billions of possible designs for any given problem. Only for very small design problems can you check the objective function for all possible designs. Hence, the software will keep running (checking more designs) until either a) the user stops it and selects the best design found at that time, b) we finally as a species destroy the world and hence the computer running the software, or c) the sun finally explodes and takes with it the earth (and also the computer).

This is why we talk about efficient rather than optimal designs. Unless you check the objective function for every possible design, you won't know if it is optimal - just that it is efficient.
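To make concrete why the search keeps running rather than converging, here is a minimal, hypothetical sketch in Python (not Ngene's actual algorithm; all function and variable names are my own) of random design search for an MNL model with fixed priors: each candidate design is scored by its D-error, and the best design found so far is kept.

```python
import numpy as np

def d_error(design, priors):
    """D-error of an MNL design.
    design: (S, J, K) array of attribute levels for S choice situations,
    J alternatives and K attributes; priors: (K,) prior coefficients."""
    S, J, K = design.shape
    info = np.zeros((K, K))
    for X in design:                       # X is the J x K matrix of one choice situation
        v = X @ priors                     # utilities under the priors
        p = np.exp(v - v.max())
        p /= p.sum()                       # MNL choice probabilities
        info += X.T @ (np.diag(p) - np.outer(p, p)) @ X   # Fisher information
    # D-error: determinant of the inverse information matrix,
    # normalised by the number of parameters K
    return np.linalg.det(np.linalg.inv(info)) ** (1.0 / K)

def random_search(levels, n_rows, n_alts, priors, n_iter=2000, seed=0):
    """Evaluate n_iter random designs and keep the most efficient one found."""
    rng = np.random.default_rng(seed)
    best_design, best_err = None, np.inf
    for _ in range(n_iter):
        design = rng.choice(levels, size=(n_rows, n_alts, len(priors)))
        try:
            err = d_error(design, priors)
        except np.linalg.LinAlgError:      # singular information matrix: skip
            continue
        if err < best_err:
            best_design, best_err = design, err
    return best_design, best_err
```

With zero priors the choice probabilities are equal across alternatives, which is the utility-neutral case discussed above. In principle the loop could run forever over ever more candidate designs, which is exactly why you stop it manually once the D-error stabilises.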

Finding the optimal (most efficient) design with the lowest D-error is a so-called NP-hard problem, i.e. the time it takes to find the optimal design increases exponentially with the dimensions of the design. There are typically trillions of possible experimental designs, and evaluating all of them to find the best one takes a very long time, as John points out.

As you will see in the output window in Ngene, the D-error initially decreases quickly (say from 0.04 to 0.03), but after some time it only improves marginally (e.g. from 0.0283 to 0.0282). One can argue that a design with a D-error of 0.0282 is sufficiently efficient, so you simply stop the algorithm manually once the D-error no longer decreases much. The D-error of the optimal design is unknown, but will be around, say, 0.0280. Ngene will eventually find the optimal design, but it could take days, months, or even years, and the benefit of an optimal design with a D-error of 0.0280 over an efficient design with a D-error of 0.0282 is negligible.
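The manual stopping rule described above ("stop once the D-error no longer decreases much") can be sketched as a small helper. This is an illustrative Python function, not an Ngene feature, and the tolerance values are arbitrary assumptions:

```python
def stop_when_stable(d_errors, rel_tol=1e-3, patience=500):
    """Scan a sequence of D-errors (one per design evaluated) and return
    (index, best) at the point where the best D-error found so far has not
    improved by more than rel_tol (relative) for `patience` evaluations."""
    best = float("inf")
    since_improvement = 0
    for i, err in enumerate(d_errors):
        if err < best * (1 - rel_tol):     # a meaningful improvement
            best, since_improvement = err, 0
        else:
            since_improvement += 1
            if since_improvement >= patience:
                return i, best             # stall detected: stop here
    return len(d_errors) - 1, best         # ran out of designs first
```

Applied to the numbers above (0.04, 0.03, 0.0283, 0.0282, 0.0282, ...), this rule would stop shortly after the D-error stalls at 0.0282.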

So how long do you need to run the algorithm? For computationally simple syntax (e.g. an MNL model with fixed priors) it only takes seconds or minutes for the D-error to stabilise at an acceptable level, while for computationally complex syntax (e.g. an RPPANEL model with Bayesian priors) it may take hours or days. Many users simply run the syntax overnight and stop the algorithm the next morning.