Sunday, February 3, 2013

For descriptive statistics, values below LLOQ set to ...

That is what I read the other day. For calculation of descriptive statistics, values below the LLOQ (lower limit of quantification) were set to.... Then I wondered, wasn't there a trick in JAGS to incorporate the presence of missing data while estimating parameters of a distribution. How would that compare with standard methods such as imputing LLOQ at 0, half LLOQ, LLOQ or just ignoring the missing data? A simulation seems to be called for.

Somewhat informative prior

As the original model did not perform very well at some points, I created an informative prior on the standard deviation. It is not the best prior, but just to give me an idea. To defend having a prior, once a method has a LLOQ, there should be some idea about precision. Otherwise, how is LLOQ determined? Also, in a real world there may be data points with complete data, which allows estimation of the standard deviation in a more hierarchical model.

Simulations

The script below is used to the methods with some settings I chose. Basically, the LLOQ is set at 1. The SD is 1. The mean ranges fro 1 to 3.5. The data is cut a a level of 1. The data is simulated from a normal distribution. If more than half of the data is below LLOQ then new samples are drawn until more than half of the data is above LLOQ.

Results for means

After all simulations, a difference is calculated between the simulation mean and the estimated means. This made interpreting the plots much more easy.dataout$dif <- dataout$mu_out-dataout$muin

Uninformed prior did not perform with very little data

With very little data, the standard deviation gets large and the mean is low. This may just be because there is no information stating they are not large and small respectively, and the chain just wanders of. Anyway, reason enough to find that with only few samples it is not a feasible method. Adding information about the standard deviation much improves this.

At low number of observations variation of the mean swamps bias.

The variation in estimates is rather large. Even so, there is a trend visible that removing missing data and imputing LLOQ give too high estimates when mu is low and hence the number of missings is high. When mu is high then all methods obtain the same result.

At large number of observations bias gets far too large

At this point the missing option is not plotted any more, it is just not competitive. Setting at LLOQ gets a big positive bias, while 0 gets a negative bias, especially for 80 observations. There is no point in setting up an informed prior. Half LLOQ seems least biased out of the simple methods.

Standard deviation is negative biased for simple imputation

The uninformed prior estimation is too bad to plot and retain reasonable axes (and setting limits on y means the data outside those limits get eliminated from the smoother too, so that is not an option). Especially setting missing as LLOQ or ignoring them gets too low estimates of standard deviation. The informed prior gets a bit too large standard deviation.

Conclusion

At low number of observations using a Bayesian model and a good prior on the standard deviation is the best way to obtain descriptive statistics. Lacking those setting <LLOQ values to 1/2 LLOQ for estimation of means and to 0 for estimation of standard deviations seems best. But some care and more extensive simulations for the problem a hand are needed.

Finally, setting data to <LLOQ is something that has implications. As a statistician, I think using <LLOQ to display data in listings and tables is a good thing. However, when subsequent calculations by statisticians are needed, they may cause more trouble than they resolve. I have seen simple PCA go down the drain by <LLOQ, which is, truth to say, a shame given the effort in getting data.

No comments:

Post a Comment

Wiekvoet

Wiekvoet is about R, JAGS, STAN, and any data I have interest in. Topics range from sensometrics, statistics, chemometrics and biostatistics. For comments or suggestions please email me at wiekvoet at xs4all dot nl.