ABSTRACT:
We investigate mean field variational approximate Bayesian inference for models that use continuous distributions, Horseshoe, Negative-Exponential-Gamma and Generalized Double Pareto, for sparse signal shrinkage. Our principal finding is that the most natural, and simplest, mean field variational Bayes algorithm can perform quite poorly due to posterior dependence among auxiliary variables. More sophisticated algorithms, based on special functions, are shown to be superior. Continued fraction approximations via Lentz’s Algorithm are developed to make the algorithms practical.
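The continued fraction evaluation mentioned above can be sketched with the modified Lentz method. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `lentz`, the callable arguments `a` and `b`, and the tolerance settings are hypothetical, and the continued fraction for tan(x) is used only as a familiar test case rather than one of the special functions the paper needs.

```python
import math


def lentz(a, b, tol=1e-12, max_iter=10_000, tiny=1e-30):
    """Evaluate b(0) + a(1)/(b(1) + a(2)/(b(2) + ...)) by the modified Lentz method.

    a and b are callables returning the j-th partial numerator and denominator.
    """
    f = b(0)
    if f == 0.0:
        f = tiny  # avoid starting from an exact zero
    c = f
    d = 0.0
    for j in range(1, max_iter + 1):
        d = b(j) + a(j) * d
        if d == 0.0:
            d = tiny
        c = b(j) + a(j) / c
        if c == 0.0:
            c = tiny
        d = 1.0 / d
        delta = c * d
        f *= delta
        # Successive convergents agree once the multiplicative update is near 1.
        if abs(delta - 1.0) < tol:
            return f
    raise RuntimeError("continued fraction did not converge")


# Illustrative use: tan(x) = x / (1 - x^2 / (3 - x^2 / (5 - ...)))
x = 1.0
approx_tan = lentz(
    a=lambda j: x if j == 1 else -x * x,
    b=lambda j: 0.0 if j == 0 else 2.0 * j - 1.0,
)
exact_tan = math.tan(x)
```

The appeal of this scheme is that it updates a single running value by a ratio at each step, so no separate numerator and denominator recurrences can overflow.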

ABSTRACT:
A fast mean field variational Bayes (MFVB) approach to nonparametric regression when the predictors are subject to classical measurement error is investigated. It is shown that applying such technology to the measurement error setting achieves reasonable accuracy. In tandem with the methodological development, a customized Markov chain Monte Carlo method is developed to facilitate evaluation of the accuracy of the MFVB method.

ABSTRACT:
We derive the precise asymptotic distributional behavior of Gaussian
variational approximate estimators of the parameters in a single-predictor
Poisson mixed model. These results are the deepest yet obtained concerning the
statistical properties of a variational approximation method. Moreover, they
give rise to asymptotically valid statistical inference. A simulation study
demonstrates that Gaussian variational approximate confidence intervals possess
good to excellent coverage properties, and have a similar precision to their
exact likelihood counterparts.

ABSTRACT:
We develop strategies for mean field variational Bayes approximate inference for Bayesian
hierarchical models containing elaborate distributions. We loosely define elaborate
distributions to be those having more complicated forms compared with common distributions
such as those in the Normal and Gamma families. Examples are Asymmetric Laplace, Skew Normal
and Generalized Extreme Value distributions. Such models suffer from the difficulty that the
parameter updates do not admit closed form solutions. We circumvent this problem through a
combination of (a) specially tailored auxiliary variables, (b) univariate quadrature schemes
and (c) finite mixture approximations of troublesome density functions. An accuracy assessment
is conducted and the new methodology is illustrated in an application.

ABSTRACT:
Bayesian hierarchical models are attractive structures for conducting regression analyses when the data are subject to missingness. However, the requisite probability calculus is challenging and Monte Carlo methods typically are employed. We develop an alternative approach based on deterministic variational Bayes approximations. Both parametric and nonparametric regression are considered. Attention is restricted to the more challenging case of missing predictor data. We demonstrate that variational Bayes can achieve good accuracy, but with considerably less computational overhead. The main ramification is fast approximate Bayesian inference in parametric and nonparametric regression models with missing data. Supplemental materials accompany the online version of this article.

ABSTRACT:
We develop Mean Field Variational Bayes methodology for fast approximate inference in Bayesian Generalized Extreme Value additive model analysis. Such models are useful for flexibly assessing the impact of continuous predictor variables on sample extremes. The new methodology allows large Bayesian models to be fitted and assessed without the significant computing costs of Markov Chain Monte Carlo methods. We illustrate our new methodology with maximum rainfall data from the Sydney, Australia, hinterland. Comparisons are made between the Mean Field Variational Bayes and Markov Chain Monte Carlo approaches.

ABSTRACT:
We demonstrate and critique the new Bayesian inference package Infer.NET in terms of its capacity for statistical analyses. Infer.NET differs from the well-known BUGS Bayesian inference packages in that its main engine is the variational Bayes family of deterministic approximation algorithms rather than Markov chain Monte Carlo. The underlying rationale is that such deterministic algorithms can handle bigger problems due to their increased speed, despite some loss of accuracy. We find that Infer.NET is a well-designed computational framework and offers significant speed advantages over BUGS. Nevertheless, the current release is limited in terms of the breadth of models it can handle, and its inference is sometimes inaccurate. Supplemental materials accompany the online version of this article.

ABSTRACT:
We investigate general kernel density derivative estimators, that is, kernel estimators of multivariate density derivative functions using general (or unconstrained) bandwidth matrix selectors. These density derivative estimators have been less thoroughly researched than their density estimator analogues. A major obstacle to progress has been the intractability of the matrix analysis when treating higher order multivariate derivatives. With an alternative vectorization of these higher order derivatives, these mathematical intractabilities are surmounted in an elegant and unified framework. The finite sample and asymptotic analysis of squared errors for density estimators are generalized to density derivative estimators. Moreover, we are able to exhibit a closed form expression for a normal scale bandwidth matrix for density derivative estimators. These normal scale bandwidths are employed in a numerical study to demonstrate the gain in performance of unconstrained selectors over their constrained counterparts.

ABSTRACT:
Likelihood-based inference for the parameters of generalized linear mixed models is hindered by the presence of intractable integrals. Gaussian variational approximation provides a fast and effective means of approximate inference. We provide some theory for this type of approximation for a simple Poisson mixed model. In particular, we establish consistency at rate $m^{-1/2} + n^{-1}$, where $m$ is the number of groups and $n$ is the number of repeated measurements.
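To make the idea of a Gaussian variational approximation concrete, the following sketch writes out the lower bound on the log-likelihood for one possible form of a simple Poisson mixed model. The random intercept specification, the symbols $\beta_0$, $\beta_1$, $\sigma^2$, $x_{ij}$, and the variational parameters $\mu_i$, $\lambda_i$ are assumptions introduced here for illustration; the abstract itself does not state the model.

```latex
% Assumed model: y_{ij} | u_i ~ Poisson(exp(\beta_0 + \beta_1 x_{ij} + u_i)),
%                u_i ~ N(0, \sigma^2),  1 <= i <= m,  1 <= j <= n.
% Assumed variational family: q(u_i) = N(\mu_i, \lambda_i), independent over i.
\begin{align*}
\log p(\mathbf{y};\beta_0,\beta_1,\sigma^2)
 \;\ge\;
 & \sum_{i=1}^{m}\sum_{j=1}^{n}
   \Bigl\{ y_{ij}\,(\beta_0+\beta_1 x_{ij}+\mu_i)
          - \exp\bigl(\beta_0+\beta_1 x_{ij}+\mu_i+\tfrac12\lambda_i\bigr)
          - \log(y_{ij}!) \Bigr\} \\
 & + \sum_{i=1}^{m}
   \Bigl\{ \tfrac12 + \tfrac12\log\bigl(\lambda_i/\sigma^2\bigr)
          - \frac{\mu_i^2+\lambda_i}{2\sigma^2} \Bigr\}.
\end{align*}
```

The first sum uses $E_q\{\exp(u_i)\} = \exp(\mu_i + \lambda_i/2)$, and the second combines $E_q\{\log p(u_i)\}$ with the entropy of $q(u_i)$. Maximizing the right-hand side over all parameters replaces the intractable integral with an explicit expression.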

ABSTRACT:
Variational approximation methods have become a mainstay of contemporary Machine Learning methodology, but currently have little presence in Statistics. We devise an effective variational approximation strategy for fitting generalized linear mixed models (GLMM) appropriate for grouped data. It involves Gaussian approximation to the distributions of random effects vectors, conditional on the responses. We show that Gaussian variational approximation is a relatively simple and natural alternative to Laplace approximation for fast, non-Monte Carlo, GLMM analysis. Numerical studies show Gaussian variational approximation to be very accurate in grouped data GLMM contexts. Finally, we point to some recent theory on consistency of Gaussian variational approximation in this context.

ABSTRACT:
We introduce the concept of penalized wavelets to facilitate seamless embedding of wavelets into semiparametric regression models. In particular, we show that penalized wavelets are analogous to penalized splines; the latter being the established approach to function estimation in semiparametric regression. They differ only in the type of penalization that is appropriate. This fact is not borne out by the existing wavelet literature, where the regression modelling and fitting issues are overshadowed by computational issues such as efficiency gains afforded by the Discrete Wavelet Transform and partially obscured by a tendency to work in the wavelet coefficient space. With penalized wavelet structure in place, we then show that fitting and inference can be achieved via the same general approaches used for penalized splines: penalized least squares, maximum likelihood and best prediction within a frequentist mixed model framework, and Markov chain Monte Carlo and mean field variational Bayes within a Bayesian framework. Penalized wavelets are also shown to have a close relationship with wide data (“p≫n”) regression and benefit from ongoing research on that topic.

ABSTRACT:
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been made, a relatively simple and straightforward one, based on penalized splines, has not. After describing our approach, we explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.

ABSTRACT:
Variational approximations facilitate approximate inference for the parameters in complex statistical models and provide fast, deterministic alternatives to Monte Carlo methods. However, much of the contemporary literature on variational approximations is in Computer Science rather than Statistics, and uses terminology, notation and examples from the former field. In this article we explain variational approximation in statistical terms. In particular, we illustrate the ideas of variational approximation using examples that are familiar to statisticians.
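As an illustration of the kind of example a statistician might find familiar, the sketch below runs mean field variational Bayes for a Normal random sample with unknown mean and variance. The choice of example, the priors (a Normal prior on the mean and an Inverse-Gamma prior on the variance), the hyperparameter defaults and the function name `mfvb_normal` are all assumptions made here, not details taken from the abstract.

```python
def mfvb_normal(x, mu0=0.0, sigsq0=1e8, A=0.01, B=0.01, n_iter=100):
    """Coordinate ascent for q(mu) q(sigma^2) approximating p(mu, sigma^2 | x).

    Assumed model: x_i | mu, sigma^2 ~ N(mu, sigma^2),
                   mu ~ N(mu0, sigsq0), sigma^2 ~ Inverse-Gamma(A, B).
    Returns the parameters of q(mu) = N(mu_q, sigsq_q) and
    q(sigma^2) = Inverse-Gamma(A_q, B_q).
    """
    n = len(x)
    sum_x = sum(x)
    A_q = A + 0.5 * n          # shape of q(sigma^2); fixed across iterations
    e_recip_sigsq = 1.0        # starting value for E_q[1 / sigma^2]
    for _ in range(n_iter):
        # Update q(mu) given the current E_q[1 / sigma^2].
        sigsq_q = 1.0 / (n * e_recip_sigsq + 1.0 / sigsq0)
        mu_q = (e_recip_sigsq * sum_x + mu0 / sigsq0) * sigsq_q
        # Update q(sigma^2) given the current q(mu).
        resid_ss = sum((xi - mu_q) ** 2 for xi in x)
        B_q = B + 0.5 * (resid_ss + n * sigsq_q)
        e_recip_sigsq = A_q / B_q
    return mu_q, sigsq_q, A_q, B_q


# Illustrative use on a small deterministic data set.
data = [1.0, 2.0, 3.0, 4.0, 5.0] * 20
mu_q, sigsq_q, A_q, B_q = mfvb_normal(data)
approx_post_mean_sigsq = B_q / (A_q - 1.0)  # mean of the Inverse-Gamma q-density
```

Each update has a closed form because the factorized approximation decouples the mean from the variance; the price is that any posterior dependence between them is ignored.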

ABSTRACT:
We devise a classification algorithm based on generalized linear mixed model (GLMM) technology. The algorithm incorporates spline smoothing, additive model-type structures and model selection. For reasons of speed we employ the Laplace approximation, rather than Monte Carlo methods. Tests on real and simulated data show the algorithm to have good classification performance. Moreover, the resulting classifiers are generally interpretable and parsimonious.

ABSTRACT:
High-throughput flow cytometry experiments produce hundreds of large multivariate samples of cellular characteristics. These samples require specialized processing to obtain clinically meaningful measurements. A major component of this processing is a form of cell subsetting known as gating. Manual gating is time-consuming and subjective. Good automatic and semi-automatic gating algorithms are very beneficial to high-throughput flow cytometry.
We develop a statistical procedure, named curvHDR, for automatic and semi-automatic gating. The method combines the notions of significant high negative curvature regions and highest density regions and has the ability to adapt well to human-perceived gates. The underlying principles apply in arbitrary dimension, although we focus on dimensions up to three. Accompanying software, compatible with contemporary flow cytometry informatics, is developed.
The method is seen to adapt well to nuances in the data and, to a reasonable extent, match human perception of useful gates. It offers big savings in human labour when processing high-throughput flow cytometry data whilst retaining a good degree of efficacy.

ABSTRACT:
We provide several illustrations of Bayesian semiparametric regression analyses in the BRugs package. BRugs facilitates use of the BUGS inference engine from the R computing environment and allows analyses to be managed using scripts. The examples are chosen to represent an array of non-standard situations, for which mixed model software is not viable. The situations include: the response variable being outside of the one-parameter exponential family, data subject to missingness, data subject to measurement error and parameters entering the model via an index.