Abstract

This article considers a new approximation to the log-likelihood surface in mixture models. The approximation is based on both the mean and the variance of the full-data log-likelihood over imputations of the assignments of observations to components. It is accurate to second order and holds for general missing-data problems. The approximation provides a new method for calculating the observed information using the EM algorithm, and motivates a Gauss-Newton method for finding the MLE. This Gauss-Newton method is implemented together with the ideas behind the SAGE algorithm. The resulting algorithm outperforms EM, CEMM, and a further Gauss-Newton algorithm when analyzing data from three-component Gaussian mixtures.
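
As an illustration only, one way to read "mean and variance of the full-data log-likelihood over imputations" is as a second-order cumulant expansion of the exact identity l(theta) - l(theta0) = log E[exp(D)], where D is the change in full-data log-likelihood and the expectation is over component assignments imputed under theta0. The sketch below assumes that reading, which may differ from the article's own construction, and uses a two-component Gaussian mixture with known weights and unit variances to keep it short. The function names, the data, and the parameter values are all hypothetical.

```python
import numpy as np


def log_joint(x, means, weights):
    """log(pi_k * N(x_i; mu_k, 1)) for every observation i and component k."""
    return (np.log(weights)[None, :]
            - 0.5 * np.log(2.0 * np.pi)
            - 0.5 * (x[:, None] - means[None, :]) ** 2)


def log_sum_exp(a):
    """Row-wise log-sum-exp, shifted by the row maximum for stability."""
    m = a.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True)))[:, 0]


def loglik_change(x, theta0, theta, weights):
    """Return (exact, first_order, second_order) values of l(theta) - l(theta0).

    first_order  : mean of D over imputations (the EM-style lower bound)
    second_order : mean of D plus half its variance (assumed reading)
    """
    a0 = log_joint(x, theta0, weights)
    a1 = log_joint(x, theta, weights)
    # Posterior assignment probabilities under theta0 (the imputation law).
    r = np.exp(a0 - log_sum_exp(a0)[:, None])
    d = a1 - a0  # change in full-data log-likelihood per (i, k)
    # Assignments are independent across observations, so means and
    # variances of D add up over i.
    mean_i = (r * d).sum(axis=1)
    var_i = (r * d ** 2).sum(axis=1) - mean_i ** 2
    exact = (log_sum_exp(a1) - log_sum_exp(a0)).sum()
    first = mean_i.sum()
    second = first + 0.5 * var_i.sum()
    return exact, first, second


# Hypothetical data: 100 draws around -1 and 100 around +1.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-1.0, 1.0, 100), rng.normal(1.0, 1.0, 100)])
weights = np.array([0.5, 0.5])
theta0 = np.array([-1.0, 1.0])
theta = theta0 + np.array([0.05, -0.05])  # a small step away from theta0

exact, first, second = loglik_change(x, theta0, theta, weights)
print(exact, first, second)
```

For a small step away from theta0, the mean-only value sits below the exact change (Jensen's inequality), and adding half the variance should close most of that gap, which is the sense in which such an expansion is second-order accurate.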