Count data with excess zeros are common in experiments for improving electronics manufacturing quality, in medical research on HIV patients with high-risk behaviors, and in agricultural studies of the number of insects per leaf. Yip (1988) and Lambert (1992) proposed the zero-inflated Poisson distribution, and Heilbron (1989) used zero-altered Poisson and negative binomial distributions to model this type of data. Li, Lu, Park, Kim, Brinkley and Peterson (1999) derived a multivariate version of the zero-inflated Poisson distribution and applied it to detect equipment problems in electronics manufacturing processes.
Zero-inflated distributions assume that with probability 1 - p the only possible observation is 0, and with probability p a random variable describing defect counts in the imperfect state is observed. For example, when manufacturing equipment is properly aligned (the perfect state), there may be no defects. Otherwise, defects may occur according to the distribution of the imperfect state. The defect counts in the imperfect state could follow a Poisson, negative binomial, or other distribution, but most current research uses the Poisson distribution. Although the maximum likelihood (ML) method is widely used to estimate the parameters of zero-inflated distributions, there has been no theoretical study of the properties of the ML estimates.
In Chapter 1, we propose a general framework for generalized zero-inflated models (ZIM), which assume only that the distribution of the imperfect state has support on the nonnegative integers and satisfies appropriate regularity conditions. We study the properties of the ML estimates of the ZIM parameters, including their existence, uniqueness, strong consistency and asymptotic normality under regularity conditions. Focusing on the univariate ZIM, we give detailed, rigorous proofs of the lemmas and theorems stated in the thesis. We then study covariate effects in the univariate and multivariate zero-inflated regression models. Because the zero-inflated model involves both the Bernoulli parameter p and the imperfect-state parameter lambda, building the model for each parameter separately does not use the information efficiently, and the resulting model is more complicated than needed. This problem gets worse in the multivariate ZIM, where the number of model terms increases drastically. Our procedure selects a limited number of important model terms to maximize the ZIM likelihood function.
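
As a rough illustration of the zero-inflated construction described above, the following Python sketch takes the imperfect state to be Poisson with mean lambda and computes ML estimates of p and lambda with a standard EM iteration. The EM scheme, the starting values and the function names (zip_pmf, zip_loglik, fit_zip_em, sample_zip) are assumptions made for this illustration and are not taken from the thesis.

```python
import math
import random


def zip_pmf(y, p, lam):
    """P(Y = y): perfect state (always 0) w.p. 1 - p, Poisson(lam) w.p. p."""
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    point_mass = (1.0 - p) if y == 0 else 0.0
    return point_mass + p * poisson


def zip_loglik(counts, p, lam):
    """Log-likelihood of independent zero-inflated Poisson counts."""
    return sum(math.log(zip_pmf(y, p, lam)) for y in counts)


def fit_zip_em(counts, max_iter=1000, tol=1e-12):
    """ML estimates of (p, lam) by EM; the hidden label is the state."""
    n = len(counts)
    total = sum(counts)
    n_zero = sum(1 for y in counts if y == 0)
    if total == 0:
        raise ValueError("all counts are zero; lambda is not identifiable")
    n_pos = n - n_zero
    # Assumed starting values: treat every zero as coming from the perfect state.
    p, lam = n_pos / n, total / n_pos
    for _ in range(max_iter):
        # E-step: probability that an observed zero came from the imperfect state.
        w0 = p * math.exp(-lam) / ((1.0 - p) + p * math.exp(-lam))
        n_imperfect = n_pos + n_zero * w0
        # M-step: closed-form updates given the expected state counts.
        p_new, lam_new = n_imperfect / n, total / n_imperfect
        converged = abs(p_new - p) + abs(lam_new - lam) < tol
        p, lam = p_new, lam_new
        if converged:
            break
    return p, lam


def sample_zip(n, p, lam, rng):
    """Draw n zero-inflated Poisson counts (Knuth's method for the Poisson part)."""
    limit = math.exp(-lam)
    out = []
    for _ in range(n):
        if rng.random() >= p:  # perfect state: the count is always 0
            out.append(0)
            continue
        k, prod = 0, rng.random()
        while prod > limit:
            k += 1
            prod *= rng.random()
        out.append(k)
    return out


if __name__ == "__main__":
    data = sample_zip(5000, p=0.6, lam=3.0, rng=random.Random(0))
    p_hat, lam_hat = fit_zip_em(data)
    print(p_hat, lam_hat, zip_loglik(data, p_hat, lam_hat))
```

Any other imperfect-state distribution on the nonnegative integers could be substituted for the Poisson term in zip_pmf, although the M-step would then generally need a numerical maximization instead of the closed-form update used here.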
In Chapter 2, we review current research on zero-inflated Poisson models. Some new results on the multivariate Poisson and multivariate zero-inflated Poisson distributions are given. By generalizing the results in Lambert (1992) and Li et al. (1999), we propose a multivariate zero-inflated Poisson regression model. An example from Nortel process development research is used to illustrate the model selection procedure for zero-inflated regression models and the computational details.