Efficient Monte Carlo Methods for Conditional Logistic Regression

ABSTRACT: Exact inference for the logistic regression model is based on generating the permutation distribution of the sufficient statistics for the regression parameters of interest, conditional on the sufficient statistics for the remaining nuisance parameters. Despite the availability of fast numerical algorithms for the exact computations, there are numerous instances where a data set is too large to be analyzed by the exact methods, yet too sparse or unbalanced for the maximum likelihood approach to be reliable. What is needed is a Monte Carlo alternative to the exact conditional approach that can bridge the gap between the exact and asymptotic methods of inference. The problem is technically hard because conventional Monte Carlo methods lead to massive rejection of samples that do not satisfy the linear integer constraints of the conditional distribution. We propose a network sampling approach to the Monte Carlo problem that eliminates rejection entirely. Its advantages over alternative saddlepoint and Markov chain Monte Carlo approaches are also discussed.
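
To make the idea of rejection-free sampling concrete, the following is a minimal sketch in Python. It is not the algorithm of the paper. It is a simplified illustration under stated assumptions: binary responses, integer-valued nuisance covariates, and the null hypothesis that the parameter of interest is zero, in which case the conditional distribution is uniform over all response vectors that reproduce the observed nuisance sufficient statistics. The names make_network_sampler, z, and s are hypothetical. Each node of the network is a pair (stage, partial sum); a backward pass counts the completions from each node that satisfy the constraints, and a forward pass chooses each response with probability proportional to those counts, so every draw is feasible by construction.

```python
import random
from functools import lru_cache


def make_network_sampler(z, s):
    """Build a rejection-free sampler for y in {0,1}^n with sum_i y_i * z[i] == s.

    z : sequence of integer tuples (hypothetical nuisance covariate rows,
        including a leading 1 for the intercept)
    s : integer tuple, the observed nuisance sufficient statistics
    Assumes the null hypothesis, so the target distribution is uniform.
    """
    z = [tuple(row) for row in z]
    s = tuple(s)
    n, k = len(z), len(s)

    @lru_cache(maxsize=None)
    def count(i, partial):
        # Number of ways to complete y[i:], given the partial sum so far,
        # such that the final sum equals s (backward pass, memoized).
        if i == n:
            return 1 if partial == s else 0
        with_one = tuple(p + c for p, c in zip(partial, z[i]))
        return count(i + 1, partial) + count(i + 1, with_one)

    total = count(0, (0,) * k)
    if total == 0:
        raise ValueError("no response vector satisfies the constraints")

    def sample(rng):
        # Forward pass: pick y[i] with probability proportional to the
        # number of feasible completions, so no draw is ever rejected.
        y, partial = [], (0,) * k
        for i in range(n):
            with_one = tuple(p + c for p, c in zip(partial, z[i]))
            n0 = count(i + 1, partial)
            n1 = count(i + 1, with_one)
            if rng.randrange(n0 + n1) < n1:
                y.append(1)
                partial = with_one
            else:
                y.append(0)
        return y

    return sample, total


# Toy data: intercept plus one integer nuisance covariate (made up for illustration).
z = [(1, 0), (1, 1), (1, 2), (1, 1), (1, 0), (1, 2)]
s = (3, 3)  # three successes, nuisance covariate total of 3
sample, total = make_network_sampler(z, s)
draws = [sample(random.Random(seed)) for seed in range(200)]
```

A naive sampler would draw response vectors without regard to the constraints and discard those that fail them, and the fraction discarded grows quickly with the number of constraints. In the sketch, the statistic for the parameter of interest can be computed from each draw to approximate its permutation distribution. Extending the approach to alternatives other than the null would require weighting the network arcs, which this sketch does not attempt.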