Introduction

In principle, a "gold standard" for estimating $p$ values for the significance of a more complex model relative to a simpler (nested) one is: (1) fit the simpler model to the data; (2a) simulate new data from the fitted simpler model many (e.g. $N=1000$) times; (2b) each time, fit the more complex model to the simulated data and extract summary statistics (e.g. $t$ or $Z$ statistics for the parameters of interest); (3) estimate $p$ values or critical values from the null distribution of simulated statistics, e.g. the $p$ value is the frequency with which $|t_{\mbox{sim}}| \ge |t_{\mbox{obs}}|$. Pinheiro and Bates [1] do this to illustrate the issue of boundary effects, and the simulate.lme method in the nlme package is set up to run precisely this procedure: see ?simulate.lme. This procedure will generally be very slow, so it would be better to have workable approximations for $p$ values (or to dispense with them entirely, if your editor/reviewers/supervisor/co-authors will let you …). There are two problems with the analogous procedure for GLMMs fitted with [g]lmer in the lme4 package: (1) a simulate method is not implemented for GLMM fits; and (2) in the case of quasilikelihood fits, it is not even clear how to simulate "quasi" data: how does one generate data that have the right mean-variance relationship and still make sense (e.g. as non-negative counts)?
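The steps above can be sketched in base R using a pair of nested linear models, where a simulate method does exist (the variable names and the toy data are my own, for illustration; the same logic is what simulate.lme automates for lme fits):

```r
## Parametric bootstrap of the null distribution of a t statistic:
## simulate from the simpler (null) fit, refit the richer model each time.
set.seed(101)
d <- data.frame(x = runif(100))
d$y <- 1 + 0.1 * d$x + rnorm(100)          # weak true effect of x
fit0 <- lm(y ~ 1, data = d)                 # (1) fit the simpler model
fit1 <- lm(y ~ x, data = d)                 # the more complex model
t_obs <- summary(fit1)$coefficients["x", "t value"]
N <- 1000
ysim <- simulate(fit0, nsim = N)            # (2a) simulate from the null fit
t_sim <- numeric(N)
for (i in seq_len(N)) {
  d$ynull <- ysim[[i]]
  ## (2b) refit the more complex model to the simulated data
  t_sim[i] <- summary(lm(ynull ~ x, data = d))$coefficients["x", "t value"]
}
## (3) p value = frequency with which |t_sim| >= |t_obs|
p_boot <- mean(abs(t_sim) >= abs(t_obs))
```

Even in this tiny example the loop dominates the run time, which is why the full procedure is slow for mixed models.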

Simulating from quasi-distributions

To answer the second question first: there are (at least) two different ways to do this; neither feels entirely satisfactory, and neither is well tested. (From a simulation point of view we would be better off with compound distributions: negative binomial or lognormal-Poisson for counts, logit-normal-binomial or beta-binomial for proportions.)

Method 1: generate data from a Gamma distribution with parameters adjusted to give the appropriate mean-variance relationship. That is, for shape parameter $a$ and scale parameter $s$, and quasi-Poisson parameters $\lambda$ (mean) and $\phi$ (overdispersion parameter), knowing that $\mbox{mean} = as$ and $\mbox{var}= as^2$ gives us (by the method of moments) $a=\mbox{mean}^2/\mbox{var} = \lambda^2/(\lambda \phi) = \lambda/\phi$ and $s=\mbox{var}/\mbox{mean}=\lambda \phi/\lambda=\phi$. The only other question is whether to round the values to integers, which will make them more "realistic" as counts but will slightly distort the mean and variance.
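A minimal sketch of Method 1 (the function name rqpois.gamma is my own, not part of any package), drawing from a Gamma with shape $\lambda/\phi$ and scale $\phi$:

```r
## Quasi-Poisson deviates via Gamma: mean = (lambda/phi)*phi = lambda,
## variance = (lambda/phi)*phi^2 = lambda*phi, as required.
rqpois.gamma <- function(n, lambda, phi, round = TRUE) {
  x <- rgamma(n, shape = lambda / phi, scale = phi)
  if (round) x <- round(x)   # optional: force integer "counts"
  x
}
set.seed(101)
x <- rqpois.gamma(1e5, lambda = 5, phi = 3, round = FALSE)
## mean(x) should be close to lambda = 5, var(x) close to lambda*phi = 15
```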

Method 2: alternatively, one can use a negative binomial with overdispersion parameter $k$ depending on the mean. For the NB, $\mbox{mean}=\mu$ and $\mbox{var}=\mu(1+\mu/k)$. For the QP, $\mbox{mean}=\lambda$ and $\mbox{var}=\lambda \phi$. So $\lambda=\mu$ and $\phi \mu = \mu(1+\mu/k)$, i.e. $\phi = 1+\mu/k$, or $k=\mu/(\phi-1)$. Therefore, this approach will only work if $\phi >1$ (overdispersion), which may be a problematic restriction if the data are under- or equi-dispersed …
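A minimal sketch of Method 2 (again, rqpois.nb is my own name), using the fact that rnbinom's size argument is the NB overdispersion parameter $k$:

```r
## Quasi-Poisson deviates via negative binomial with k = mu/(phi - 1):
## mean = mu, variance = mu*(1 + mu/k) = mu*phi, as required.
rqpois.nb <- function(n, mu, phi) {
  if (phi <= 1) stop("negative binomial method requires phi > 1")
  rnbinom(n, mu = mu, size = mu / (phi - 1))
}
set.seed(101)
x <- rqpois.nb(1e5, mu = 5, phi = 3)
## mean(x) should be close to mu = 5, var(x) close to mu*phi = 15
```

Unlike Method 1, this always produces integer counts, but it fails outright when $\phi \le 1$.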

Simulations

Read in a file containing (among other things) my.mer.sim, a drop-in replacement for the simulate method in lme4 that can handle GLMMs.

> source("glmmfuns.R")

Now to test this out. First simulate some Poisson and some binomially distributed "data" (the eta1 variable, which is the value on the linear predictor scale plus some error variance, will serve as a test of normally distributed data):
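A sketch of what such a simulation might look like (the structure here, one random-effects grouping factor f and a linear predictor eta, is an assumption on my part; only the eta1 variable name comes from the text above):

```r
## Simulated test "data": one grouping factor with random intercepts,
## plus normal (eta1), Poisson, and binomial responses built on the
## same linear predictor.
set.seed(1001)
ngrp <- 10; nrep <- 20
d <- data.frame(f = factor(rep(seq_len(ngrp), each = nrep)),
                x = runif(ngrp * nrep))
u <- rnorm(ngrp, sd = 1)                    # random intercepts
eta <- 1 + 2 * d$x + u[d$f]                 # linear predictor
d$eta1 <- eta + rnorm(nrow(d), sd = 0.5)    # normal response (test case)
d$ypois <- rpois(nrow(d), exp(eta))         # Poisson response
d$ybin <- rbinom(nrow(d), size = 1, prob = plogis(eta))  # binomial response
```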