I am running a mixed-effects linear model, with one random effect that is a factor with multiple levels and is an intercept-only model. Like this:

lmer(y ~ x1 + x2 + x3 + (1|factor(x4)), data=data)

The output of my model in R includes an intercept (fixed), my fixed-effect coefficients, and an intercept for x4 (random). I've typically never used regressions to predict out-of-sample values, only for description. But let's say I had a new observation whose level is either:

1. Not among the levels of x4, or

2. Unknown entirely — I know nothing about which level this new observation belongs to, but I have a gun to my head and have to make a prediction anyway.

To come up with a prediction, I would compute

intercept (fixed) + b1*new1 + b2*new2 + b3*new3,

but what do I do with the random intercept? In my case, is it reasonable to simply add the intercept (x4) term that my model outputs to come up with my estimate of y?

1 Answer

The mixed effects model assumes that the parameters associated with your random intercept factor $x_4$ are distributed as

$$ N(0, \sigma_4) $$

This assumption is built into the estimation procedure of the model. Additionally, the dispersion parameter $\sigma_4$ is estimated from your data along with all of the other parameters of the linear model.

Now for your question(s).

If you do not know anything about the value of $x_4$, you have two options.

If you need one number as a prediction, you should think as follows. If someone gave you a distribution, say a $N(154, 7)$, and wanted you to predict what number will be sampled from it, what would you say? The best choice by many metrics is the mode, as it is the value that is most likely(*). Your situation is the same: you don't know the value of the $x_4$ parameter, but you know it's drawn from $N(0, \sigma_4)$. The most likely value is zero(**).
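If the modal (zero) prediction is all you need, lme4's `predict` can produce it directly via its `re.form` and `allow.new.levels` arguments. A minimal self-contained sketch, with simulated stand-in data (the variable names mirror the question, but the data and coefficients are made up):

```r
library(lme4)
set.seed(1)

# Simulated stand-in data: 10 levels of x4, 20 observations each.
d <- data.frame(
  x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200),
  x4 = factor(rep(letters[1:10], each = 20))
)
group_eff <- rnorm(10, sd = 0.8)  # the random intercepts
d$y <- 1 + 0.5 * d$x1 - 0.3 * d$x2 + 0.2 * d$x3 +
  group_eff[as.integer(d$x4)] + rnorm(200, sd = 0.5)

fit <- lmer(y ~ x1 + x2 + x3 + (1 | x4), data = d)

# A new observation whose x4 level ("zzz") was never seen during fitting.
new_obs <- data.frame(x1 = 0.1, x2 = -0.2, x3 = 0.3, x4 = "zzz")

# re.form = NA drops the random effects entirely (i.e., sets them to
# their modal value of zero), leaving only the fixed-effect part.
p1 <- predict(fit, newdata = new_obs, re.form = NA)

# allow.new.levels = TRUE also uses zero for any level not seen in
# fitting, so for an unseen level it agrees with re.form = NA.
p2 <- predict(fit, newdata = new_obs, allow.new.levels = TRUE)
```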

If you want something more sophisticated, you can draw samples from the random effect distribution $N(0, \sigma_4)$, make predictions with each of the drawn random effects, and get a simulated distribution of possible predictions. This lets you quantify the possibilities given the information that you do have.
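The sampling approach above can be sketched in lme4 as well: extract the estimated $\sigma_4$ with `VarCorr`, draw random intercepts from $N(0, \sigma_4)$, and add each draw to the fixed-effect prediction. Again a self-contained sketch with simulated stand-in data:

```r
library(lme4)
set.seed(2)

# Simulated stand-in data, as in the question's setup.
d <- data.frame(
  x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200),
  x4 = factor(rep(letters[1:10], each = 20))
)
group_eff <- rnorm(10, sd = 0.8)
d$y <- 1 + 0.5 * d$x1 - 0.3 * d$x2 + 0.2 * d$x3 +
  group_eff[as.integer(d$x4)] + rnorm(200, sd = 0.5)

fit <- lmer(y ~ x1 + x2 + x3 + (1 | x4), data = d)
new_obs <- data.frame(x1 = 0.1, x2 = -0.2, x3 = 0.3, x4 = "zzz")

# Estimated standard deviation of the x4 random intercepts (sigma_4).
vc <- as.data.frame(VarCorr(fit))
sd_x4 <- vc$sdcor[vc$grp == "x4"]

# Fixed-effect-only prediction (random effects set to zero).
fixed_part <- as.numeric(predict(fit, newdata = new_obs, re.form = NA))

# Draw plausible random intercepts and form a distribution of predictions.
u <- rnorm(5000, mean = 0, sd = sd_x4)
pred_dist <- fixed_part + u

# Summarize the simulated predictive distribution, e.g. a 95% interval.
quantile(pred_dist, c(0.025, 0.5, 0.975))
```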

(*) For a continuous distribution, every individual value technically has probability zero of being drawn, so no single value is "most likely" in the strict sense. The mode is better conceptualized as the value that the sample is most likely to be near.

(**) Does it seem weird that the most likely value of the $x_4$ intercept is always zero? Keep in mind that you also estimated a (fixed) intercept for your model, which has absorbed the center of the random effect distribution.

$\begingroup$So, if I do assume that my new sample has the most-likely value of x4 (a value of zero), and the random intercept absorbs the center of the distribution, then I should be adding the random intercept to my prediction, correct?$\endgroup$
– BobbyJohnsonOG Sep 1 '16 at 22:00

$\begingroup$It's the fixed intercept that absorbs the center of the random intercept distribution. Sorry, the terminology in this situation turns into quite a nightmare. So in the case that you want to infer the mode, your prediction would just ignore the random intercept part: (fixed intercept) + b_1*x_1 + b_2*x_2 + b_3*x_3.$\endgroup$
– Matthew Drury Sep 1 '16 at 22:03

$\begingroup$On the flip side, when would I include a random intercept in my model? If I know what x4 is for a new sample, what would that get multiplied by?$\endgroup$
– BobbyJohnsonOG Sep 1 '16 at 22:08

$\begingroup$That's a deeper question, and really depends on what you're after. Most of my work is focused on prediction, so I think of it from a shrinkage/regularization perspective. I use a random effect when I think the parameters are likely to be dominated by noise, and I want to estimate them conservatively by shrinking them towards estimates I'm more sure about. I'm sure scientists more focused on explanatory modeling have other perspectives, I'm just not very literate on what they are.$\endgroup$
– Matthew Drury Sep 1 '16 at 22:09

$\begingroup$Thank you so much for your help. Please let me know if this should be a new question:$\endgroup$
– BobbyJohnsonOG Sep 1 '16 at 23:51