May I ask for some guidance on an EFA I am doing using dichotomous variables? The correlation matrix is non-positive definite (observed in R). However, the model runs in Mplus (dependent on rotation choice and number of factors).

I assume that Mplus calculates pairwise tetrachoric correlations for the data. Does Mplus apply any smoothing method to the correlation matrix to allow the model to fit? If so, do the model fit statistics need to be adjusted?

Is there a reason why ML cannot be used in this EFA, or is WLSMV preferable? It would be great if you could point me to a reference for either of these.

Maximum likelihood can be used if there are not too many factors. The issue with maximum likelihood and categorical variables is that numerical integration is required, one dimension for each factor. More than four dimensions can be computationally unrealistic.
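To see why the number of factors matters so much here, a rough sketch of how a full integration grid grows may help. The function name grid_size and the figure of 15 points per dimension are assumptions chosen for illustration, not a statement of what Mplus does internally.

```python
def grid_size(n_factors, points_per_dim=15):
    """Number of quadrature nodes in a full product grid.

    points_per_dim is an illustrative assumption; the real setting
    depends on the software and the integration options chosen.
    """
    return points_per_dim ** n_factors


if __name__ == "__main__":
    # One integration dimension per factor, so the grid grows exponentially.
    for k in range(1, 7):
        print(f"{k} factor(s): {grid_size(k):,} nodes per likelihood evaluation")
```

Every node has to be evaluated for each response pattern at each iteration, which is why more than about four dimensions quickly becomes impractical.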

I also have an issue with zero cells. Is there a facility to add a continuity correction when calculating the tetrachorics? If not, I have seen previous posts where you recommend removing one of the variables. Do you think this is necessary when only one table has a zero cell out of, say, 465 correlations?

For binary variables, a zero cell implies a correlation of one. The two variables should not both be used in the analysis. For polytomous items, some zero cells should be okay. Mplus automatically adds a constant to the frequency of a zero cell. See the ADDFREQUENCY option.

So the default is to add 0.5/(sample size) to the zero cell. I had been wondering why I wasn't seeing a correlation of -1 between the relevant variables. With this added, the correlation between the two variables is not equal to 1 (or -1).

But I am a bit confused. Why do you advocate removing one of the variables when the continuity correction has removed some of the bias that tetrachorics have towards -1 when zero cells are present?

You don't see a correlation of plus/minus one. That is the problem. The implied correlation is not what is estimated. The addition of a constant does not necessarily work well, particularly in the case of binary items. With binary items, we recommend using only one of the items when they correlate one, just as we would recommend if two continuous variables correlated one.
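The zero-cell point can be illustrated with a small standard-library sketch. It is not Mplus's implementation: it uses a simple two-step estimator (thresholds from the margins, then a bisection search for the correlation), and the function names, the example table, and the add parameter are all hypothetical. The amount 0.5/n follows the description given earlier in this thread.

```python
import math
from statistics import NormalDist


def bvn_cdf(h, k, rho, steps=2000):
    """P(X <= h, Y <= k) for a standard bivariate normal with correlation rho."""
    upper = math.asin(rho)
    width = upper / steps  # signed step, so negative rho works too

    def g(theta):
        c2 = math.cos(theta) ** 2
        return math.exp(-(h * h + k * k - 2 * h * k * math.sin(theta)) / (2 * c2))

    # Composite Simpson's rule on [0, asin(rho)].
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * width)
    nd = NormalDist()
    return nd.cdf(h) * nd.cdf(k) + (total * width / 3) / (2 * math.pi)


def tetrachoric(table, add=0.0, bound=0.9999):
    """Two-step tetrachoric estimate for a 2x2 table [[n00, n01], [n10, n11]].

    add is the amount added to each zero cell (hypothetical parameter).
    Assumes no margin is empty. The search is capped at +/- bound, so a
    result at the cap means the table implies a correlation of +/- 1.
    """
    cells = [[c if c > 0 else c + add for c in row] for row in table]
    (a, b), (c, d) = cells
    n = a + b + c + d
    nd = NormalDist()
    h = nd.inv_cdf((a + b) / n)  # threshold for the row variable
    k = nd.inv_cdf((a + c) / n)  # threshold for the column variable
    target = a / n               # observed P(both variables = 0)
    lo, hi = -bound, bound
    for _ in range(60):          # bvn_cdf is increasing in rho
        mid = (lo + hi) / 2
        if bvn_cdf(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    table = [[40, 0], [30, 30]]  # made-up counts with one zero cell
    n = sum(sum(row) for row in table)
    print("no correction:   ", round(tetrachoric(table), 4))
    print("0.5/n added:     ", round(tetrachoric(table, add=0.5 / n), 4))
```

Without the added constant the search runs up against its cap, which corresponds to the implied correlation of one. With the constant, the estimate drops below that, and how far it drops depends on the constant chosen rather than on the data.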