How to: Multinomial regression models in R

In my last post I looked at binomial choice modelling in R, i.e. how to predict a yes/no decision from other data. Now, however, I want to look at modelling a more complicated choice between more than two options. This is known as multinomial choice modelling, and R can perform these analyses using the nnet package.

Let’s start by making up some data. The following code creates 1000 data points and assigns each one an arbitrary three-way choice value using some if-else statements.
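A minimal sketch of such a data set, assuming two uniform predictors and a noisy threshold rule on x1 + x2. The seed, the thresholds, the labels A/B/C and the data frame name df are illustrative assumptions rather than the original values.

```r
set.seed(42)                       # assumed seed, only for reproducibility
n  <- 1000
x1 <- runif(n)
x2 <- runif(n)

# Hypothetical rule: a little noise blurs the boundaries between regions
score  <- x1 + x2 + rnorm(n, sd = 0.1)
choice <- ifelse(score < 0.8, "A",
          ifelse(score < 1.2, "B", "C"))

df <- data.frame(x1 = x1, x2 = x2, choice = factor(choice))

# Quick look at the data, coloured by choice
plot(df$x1, df$x2, col = as.integer(df$choice), pch = 19,
     xlab = "x1", ylab = "x2")
```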

The original data set. Colours represent different choices as a function of variables x1 and x2.

Next we need to fit the model, which is done with the multinom function in R’s nnet package for neural networks. The code below says that we believe the probability of choosing each option to be a function of the variables x1 and x2. The actual model structure is somewhat more complicated (multinom fits a multinomial log-linear model via a neural network) but, for simple choice problems, you’re probably safe to use the formula interface with the default options.
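A sketch of the fitting step using nnet's multinom function and its formula interface. The data generation at the top is an assumed stand-in (hypothetical rule, seed and names), included only so the block runs on its own.

```r
library(nnet)

# Assumed made-up data: three-way choice driven by x1 + x2 plus noise
set.seed(42)
n  <- 1000
df <- data.frame(x1 = runif(n), x2 = runif(n))
score <- df$x1 + df$x2 + rnorm(n, sd = 0.1)
df$choice <- factor(ifelse(score < 0.8, "A", ifelse(score < 1.2, "B", "C")))

# Probability of each option as a function of x1 and x2, default options.
# trace = FALSE only silences the iteration log.
fit <- multinom(choice ~ x1 + x2, data = df, trace = FALSE)

summary(fit)   # coefficients are relative to the first level ("A" here)
```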

As with the binomial choice model, predicting new values from such a model can be a little tricky. If we naively use the predict method, each data point is simply assigned its most probable option, so the results will be the same every time we run the following command:
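For illustration, a naive call might look like the sketch below. The data and the fitted object fit are assumed stand-ins, rebuilt here so the block is self-contained.

```r
library(nnet)

# Assumed made-up data and model (hypothetical rule and names)
set.seed(42)
n  <- 1000
df <- data.frame(x1 = runif(n), x2 = runif(n))
score <- df$x1 + df$x2 + rnorm(n, sd = 0.1)
df$choice <- factor(ifelse(score < 0.8, "A", ifelse(score < 1.2, "B", "C")))
fit <- multinom(choice ~ x1 + x2, data = df, trace = FALSE)

# The default type is "class": the most probable option for each row
pred1 <- predict(fit, newdata = df)
pred2 <- predict(fit, newdata = df)

identical(pred1, pred2)   # no randomness involved, so the two runs agree
```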

But what we really want is to predict the probabilities associated with each option and then draw a random number to make our actual selection. That way, each time the code runs we will get slightly different choices, which better reflects how people actually make decisions. The choice probabilities can be predicted as follows:
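A sketch of the probability prediction, using the type = "probs" argument of predict for multinom models. As before, the data and the object names are assumptions made so the example runs on its own.

```r
library(nnet)

# Assumed made-up data and model (hypothetical rule and names)
set.seed(42)
n  <- 1000
df <- data.frame(x1 = runif(n), x2 = runif(n))
score <- df$x1 + df$x2 + rnorm(n, sd = 0.1)
df$choice <- factor(ifelse(score < 0.8, "A", ifelse(score < 1.2, "B", "C")))
fit <- multinom(choice ~ x1 + x2, data = df, trace = FALSE)

# One row per data point, one column per option
probs <- predict(fit, newdata = df, type = "probs")

dim(probs)
head(round(probs, 3))
```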

The result of this command is an n by k matrix, where n is the number of data points being predicted and k is the number of options.

Notice that the sum of each row equals 1, as each matrix entry gives the probability of selecting a given option. Therefore, to make a choice, we need to calculate the cumulative probabilities associated with each option. We can then draw a random value between 0 and 1; the first option whose cumulative probability exceeds our draw value is our choice. Equivalently, we count how many cumulative probabilities lie below the draw and add one. This can be written into a function for easier use.
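One way such a function could look is sketched below. It keeps the name predictMNL but is a reconstruction under stated assumptions, not necessarily the original listing: it assumes a multinom fit with three or more options, and the made-up data at the top is only there to make the block runnable.

```r
library(nnet)

# Assumed made-up data and model (hypothetical rule and names)
set.seed(42)
n  <- 1000
df <- data.frame(x1 = runif(n), x2 = runif(n))
score <- df$x1 + df$x2 + rnorm(n, sd = 0.1)
df$choice <- factor(ifelse(score < 0.8, "A", ifelse(score < 1.2, "B", "C")))
fit <- multinom(choice ~ x1 + x2, data = df, trace = FALSE)

# Simulate one choice per row of newdata from the fitted probabilities.
# Assumes three or more options, so that "probs" has one column per option.
predictMNL <- function(model, newdata) {
  stopifnot(inherits(model, "multinom"))
  probs <- predict(model, newdata = newdata, type = "probs")
  # A single row comes back as a plain vector; restore the matrix shape
  if (is.null(dim(probs))) {
    probs <- matrix(probs, nrow = 1, dimnames = list(NULL, names(probs)))
  }
  cum.probs <- t(apply(probs, 1, cumsum))   # row-wise cumulative probabilities
  draws <- runif(nrow(probs))               # one uniform draw per data point
  # Count the cumulative probabilities below each draw, then add one
  idx <- 1 + rowSums(cum.probs < draws)
  idx <- pmin(idx, ncol(probs))             # guard against rounding in the last column
  factor(colnames(probs)[idx], levels = colnames(probs))
}

sim <- predictMNL(fit, df)
table(observed = df$choice, simulated = sim)
```

Because every call draws fresh random numbers, repeated calls give slightly different choices, mostly near the boundaries between regions where the predicted probabilities are far from 0 or 1.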

Multinomial choice model results. Note how the model captures the uncertainty lying at the interface of the different choice regions.

This method might not provide sufficiently robust results with more complicated choice models involving many choices and lots of predictor variables. But for most basic problems, you can easily fit multinomial choice models and simulate choices from them using R’s nnet package and the predictMNL function above.