Introduction

Assume now that the observed data take their values in a fixed and finite set of nominal categories $\{c_1, c_2, \ldots, c_K\}$. Considering the observations $(y_{ij},\ 1 \leq j \leq n_i)$ for any individual $i$ as a sequence of conditionally independent random variables given the individual parameters $\psi_i$, the model is completely defined by the probability mass functions $\mathbb{P}(y_{ij} = c_k \mid \psi_i)$ for $k = 1, \ldots, K$ and $1 \leq j \leq n_i$. For a given $(i,j)$, the sum of the $K$ probabilities is 1, so in fact only $K-1$ of them need to be defined. Any model can be considered as long as it defines a probability distribution, i.e., for each $k$, $\mathbb{P}(y_{ij} = c_k \mid \psi_i) \in [0,1]$, and $\sum_{k=1}^{K} \mathbb{P}(y_{ij} = c_k \mid \psi_i) = 1$. Ordinal data further assume that the categories are ordered, i.e., there exists an order $\prec$ such that

$$c_1 \prec c_2 \prec \ldots \prec c_K .$$

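We can think, for instance, of levels of pain (low $\prec$ moderate $\prec$ severe) or of scores on a discrete scale, e.g., from 1 to 10. Instead of defining the probabilities of each category, it may be convenient to define the cumulative probabilities $\mathbb{P}(y_{ij} \preceq c_k \mid \psi_i)$ for $k = 1, \ldots, K-1$, or in the other direction, $\mathbb{P}(y_{ij} \succeq c_k \mid \psi_i)$ for $k = 2, \ldots, K$. Any model is possible as long as it defines a probability distribution, i.e., it satisfies

$$0 \leq \mathbb{P}(y_{ij} \preceq c_1 \mid \psi_i) \leq \mathbb{P}(y_{ij} \preceq c_2 \mid \psi_i) \leq \ldots \leq \mathbb{P}(y_{ij} \preceq c_K \mid \psi_i) = 1 .$$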
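As an illustration of the cumulative formulation, the sketch below converts cumulative logits into category probabilities. The logit link and the parameterization by a first logit plus nonnegative increments are assumptions chosen for the example (one common way to keep the cumulative probabilities ordered), and the function names are hypothetical.

```python
import math


def category_probabilities(cumulative_logits):
    """Turn K-1 cumulative logits into K category probabilities.

    cumulative_logits[k-1] is logit P(y <= c_k) for k = 1..K-1 and must be
    nondecreasing so that the cumulative probabilities are ordered.
    """
    pairs = zip(cumulative_logits, cumulative_logits[1:])
    if any(b < a for a, b in pairs):
        raise ValueError("cumulative logits must be nondecreasing")
    # P(y <= c_k) for k = 1..K-1, followed by P(y <= c_K) = 1.
    cumulative = [1.0 / (1.0 + math.exp(-x)) for x in cumulative_logits] + [1.0]
    # P(y = c_k) = P(y <= c_k) - P(y <= c_{k-1}), with P(y <= c_0) = 0.
    previous = [0.0] + cumulative[:-1]
    return [c - p for c, p in zip(cumulative, previous)]


def logits_from_increments(first_logit, increments):
    """Hypothetical parameterization: a free first logit plus nonnegative steps."""
    logits = [first_logit]
    for step in increments:
        if step < 0:
            raise ValueError("increments must be nonnegative")
        logits.append(logits[-1] + step)
    return logits


# Illustrative values only: three cumulative logits define K = 4 categories.
probs = category_probabilities(logits_from_increments(-1.0, [1.5, 2.0]))
```

Constraining the increments to be nonnegative guarantees the ordering condition on the cumulative probabilities without any further check on the parameters.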

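It is possible to introduce dependence between observations from the same individual by assuming that $(y_{ij},\ j = 1, 2, \ldots, n_i)$ forms a Markov chain. For instance, a Markov chain with memory 1 assumes that all that is required from the past to determine the distribution of $y_{ij}$ is the value of the previous observation $y_{i,j-1}$, i.e., for all $k = 1, 2, \ldots, K$,

$$\mathbb{P}(y_{ij} = c_k \mid y_{i,j-1}, y_{i,j-2}, y_{i,j-3}, \ldots, \psi_i) = \mathbb{P}(y_{ij} = c_k \mid y_{i,j-1}, \psi_i) .$$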
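Under the memory-1 assumption, the probability of an individual's whole sequence factorizes into an initial probability and a product of one-step transition probabilities. The sketch below computes the corresponding log-likelihood; the function name, the integer coding 1..K, and the numerical values are assumptions made for the example.

```python
import math


def markov_log_likelihood(sequence, initial, transition):
    """Log-likelihood of a category sequence under a memory-1 Markov chain.

    sequence   : observed categories coded 1..K
    initial    : initial[k-1] = P(y_1 = k)
    transition : transition[l-1][k-1] = P(y_j = k | y_{j-1} = l)
    """
    log_lik = math.log(initial[sequence[0] - 1])
    # Each later observation depends on the past only through its predecessor.
    for prev, cur in zip(sequence, sequence[1:]):
        log_lik += math.log(transition[prev - 1][cur - 1])
    return log_lik


# Illustrative two-state chain; each row of `transition` sums to 1.
initial = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
ll = markov_log_likelihood([1, 1, 2, 2], initial, transition)
```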

Formatting of categorical data in the MonolixSuite

For categorical data, the observations at each time point can only take values in a fixed and finite set of nominal categories. In the data set, the output categories must be coded as integers.
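If the raw observations are stored as text labels, they have to be recoded before use. The sketch below shows one possible recoding; the labels, the function name, and the choice of codes 1 to K are assumptions made for the example, not a requirement stated here.

```python
def encode_categories(labels, ordered_levels):
    """Map category labels to integer codes 1..K following ordered_levels."""
    codes = {level: code for code, level in enumerate(ordered_levels, start=1)}
    unknown = sorted(set(labels) - set(codes))
    if unknown:
        raise ValueError(f"labels not in ordered_levels: {unknown}")
    return [codes[label] for label in labels]


# Hypothetical pain levels, listed from lowest to highest.
levels = ["low", "moderate", "severe"]
observed = ["low", "low", "severe", "moderate"]
coded = encode_categories(observed, levels)
```

Listing the levels explicitly, rather than sorting the labels, keeps the integer codes consistent with the intended order of the categories.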

One parameter follows a normal distribution, while log-normal distributions for two others ensure that these parameters are positive (even without variability). For noncontinuous data, residuals reduce to NPDEs (normalized prediction distribution errors), whose empirical distribution can be compared with the standard normal distribution.
VPCs for categorical data compare the observed and predicted frequencies of each category over time.

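Observations in markov2_data.txt take their values in {1, 2, 3}. Six transition probabilities therefore need to be defined in the model: two for each of the three possible previous states, the third probability in each case following from the constraint that the probabilities sum to 1.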
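The count of six can be made concrete with a small sketch that completes a 3 x 3 transition matrix from six free probabilities. The function name and the numerical values are hypothetical and serve only to illustrate the sum-to-one constraint.

```python
def transition_matrix(free_probs, tol=1e-12):
    """Complete each row of free transition probabilities so it sums to 1.

    free_probs[l-1] holds the K-1 probabilities P(y_j = k | y_{j-1} = l) for
    k = 1..K-1; the probability of the last category is the remainder.
    """
    matrix = []
    for row in free_probs:
        remainder = 1.0 - sum(row)
        if any(p < 0 for p in row) or remainder < -tol:
            raise ValueError("each row must define a probability distribution")
        # Clamp tiny negative rounding errors to zero.
        matrix.append(list(row) + [max(remainder, 0.0)])
    return matrix


# Hypothetical values: 3 previous states x 2 free probabilities = 6 numbers.
free = [[0.7, 0.2],
        [0.3, 0.4],
        [0.1, 0.3]]
P = transition_matrix(free)
```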

Continuous-time Markov chain

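The previous situation can be extended to the case where the time intervals between observations are irregular by modeling the sequence of states as a continuous-time Markov process. The difference is that, rather than transitioning to a new (possibly the same) state at each time step, the system remains in the current state for some random amount of time before transitioning. This process is now characterized by transition rates instead of transition probabilities:

$$\mathbb{P}(y_i(t+h) = c_k \mid y_i(t) = c_\ell, \psi_i) = h\, \rho_{\ell k}(t, \psi_i) + o(h), \qquad k \neq \ell .$$

The probability that no transition happens between $t$ and $t+h$ is

$$\mathbb{P}(y_i(s) = c_\ell,\ \forall s \in (t, t+h) \mid y_i(t) = c_\ell, \psi_i) = e^{h\, \rho_{\ell\ell}(t, \psi_i)} .$$

Furthermore, for any individual $i$ and time $t$, the transition rates $(\rho_{\ell k}(t, \psi_i))$ satisfy, for any $1 \leq \ell \leq K$,

$$\sum_{k=1}^{K} \rho_{\ell k}(t, \psi_i) = 0 .$$

Constructing a model therefore means defining parametric functions of time $(\rho_{\ell k})$ that satisfy this condition.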
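The zero row-sum condition means that only the off-diagonal rates are free: each diagonal rate is minus the sum of the other rates in its row. The sketch below builds such a rate matrix and evaluates the probability of staying in a state, assuming for simplicity that the rates are constant over the interval considered. Function names and values are hypothetical.

```python
import math


def generator_matrix(off_diagonal_rates):
    """Build a K x K transition-rate matrix whose rows sum to zero.

    off_diagonal_rates[src][dst] is the rate from state src+1 to state dst+1
    for dst != src; entries on the diagonal of the input are ignored.
    """
    size = len(off_diagonal_rates)
    matrix = []
    for src, row in enumerate(off_diagonal_rates):
        rates = [row[dst] if dst != src else 0.0 for dst in range(size)]
        if any(r < 0 for r in rates):
            raise ValueError("off-diagonal rates must be nonnegative")
        rates[src] = -sum(rates)  # diagonal entry makes the row sum to zero
        matrix.append(rates)
    return matrix


def stay_probability(matrix, state, h):
    """P(no transition out of `state` during an interval of length h),
    assuming the rates are constant over that interval."""
    return math.exp(h * matrix[state - 1][state - 1])


# Illustrative three-state example (states coded 1..3).
Q = generator_matrix([[0.0, 0.5, 0.1],
                      [0.2, 0.0, 0.3],
                      [0.0, 0.4, 0.0]])
p_stay = stay_probability(Q, state=1, h=2.0)
```

Since the diagonal rate is nonpositive, the stay probability decreases from 1 toward 0 as the interval length grows.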