How can we predict latent class membership for a new data set that was not part of the data used for the LCA? For example, if we conduct the LCA on a male sample, can we get an output file with the predicted class membership for females?

You can do this. It makes the assumption that the new sample comes from the same population as the original sample. This may not be justified in your case.

To do so, you need to fix all of the parameters at the values estimated from the original sample. You can use the SVALUES option of the OUTPUT command to get your MODEL command with those estimates as starting values. You can then do a global replace of * with @ to fix the values. Finally, run the model on the new data set, asking for CPROBABILITIES in the SAVEDATA command.

I conducted an LCA based on data collected up to 2009 (Sample 1). In 2010 we collected a new sample (Sample 2). I want to investigate whether the class membership predicted for Sample 2 from the Sample 1 LCA model is consistent with the results of an LCA run on Sample 2 itself. If there were no missing data in Sample 2, I would use the formula for the posterior probability to predict class membership. But we do have missing data on some indicators. What you advised makes sense to me.
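For reference, the posterior probability formula mentioned above extends to incomplete records by dropping the unobserved indicators from the product. The block below is a sketch under stated assumptions, not Mplus's documented computation: binary indicators, conditional independence within class, and data missing at random. All symbols are notation introduced here: \pi_k is the prevalence of class k, \rho_{jk} is the probability of endorsing item j in class k, and O_i is the set of items observed for person i.

```latex
P(c_i = k \mid u_i^{\mathrm{obs}}) =
\frac{\pi_k \prod_{j \in O_i} \rho_{jk}^{u_{ij}} (1 - \rho_{jk})^{1 - u_{ij}}}
     {\sum_{l=1}^{K} \pi_l \prod_{j \in O_i} \rho_{jl}^{u_{ij}} (1 - \rho_{jl})^{1 - u_{ij}}}
```

With complete data, O_i contains every item and this reduces to the usual posterior probability.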

Do you have any sample Mplus syntax or a similar example of this kind of prediction that I can follow?

I think you want to compare the posterior probabilities for sample 2 obtained in two ways: using the sample 1 estimates, and using a new analysis of sample 2. To do the first, ask for SVALUES in the analysis of sample 1. Then change each * to @ and use that input with the sample 2 data. All parameters in the model should be fixed. Ask for CPROBABILITIES in the SAVEDATA command.
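The two runs described above might be sketched as follows. This is an illustration under assumptions, not output from a real analysis: the file names, the four binary indicators u1-u4, the two-class solution, the missing value code -999, and every numeric value are hypothetical placeholders. In practice the MODEL command of the second input is pasted from the SVALUES section of the first output, with * replaced by @.

```
! ----- Input 1: estimate the LCA on sample 1 -----
TITLE:    Step 1 - LCA on sample 1
DATA:     FILE = sample1.dat;          ! hypothetical file name
VARIABLE: NAMES = u1-u4;
          CATEGORICAL = u1-u4;
          MISSING = ALL (-999);        ! hypothetical missing code
          CLASSES = c(2);
ANALYSIS: TYPE = MIXTURE;
OUTPUT:   SVALUES;                     ! prints MODEL with estimates

! ----- Input 2: score sample 2 with everything fixed -----
TITLE:    Step 2 - sample 2, all parameters fixed
DATA:     FILE = sample2.dat;          ! hypothetical file name
VARIABLE: NAMES = u1-u4;
          CATEGORICAL = u1-u4;
          MISSING = ALL (-999);
          CLASSES = c(2);
ANALYSIS: TYPE = MIXTURE;
          STARTS = 0;                  ! nothing is estimated
MODEL:
  %OVERALL%
  [ c#1@0.25 ];                        ! placeholder values below
  %C#1%
  [ u1$1@-1.10 ];
  [ u2$1@-0.85 ];
  [ u3$1@-1.40 ];
  [ u4$1@-0.60 ];
  %C#2%
  [ u1$1@1.20 ];
  [ u2$1@0.95 ];
  [ u3$1@1.05 ];
  [ u4$1@1.50 ];
SAVEDATA: FILE = sample2_cprob.dat;    ! hypothetical file name
          SAVE = CPROBABILITIES;
```

The two inputs are shown in one block for brevity but are run as separate files. The saved file from the second run can then be compared with the class probabilities saved from a freely estimated LCA on sample 2.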

If in a first step you have created an observed binary variable of compliance status using the most likely class, you can then in a second step simply run a multiple-group analysis using that binary variable as the GROUPING variable (see the UG index) in a standard mediation model:

y ON M X; M ON X;

Note, however, that the entropy in the first step should be at least 0.8 for this to give trustworthy results. If less than 0.8, you want to study 3-step techniques in Web Note 15.
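The second step described above could look roughly like the input below. It is a sketch under assumptions: the file name, the variable order in the saved file, the name mlc for the most likely class variable, and the group labels are all hypothetical. The actual order of the saved variables should be taken from the output of the first step.

```
TITLE:    Step 2 - mediation model by most likely class
DATA:     FILE = step1_cprob.dat;      ! hypothetical file name
VARIABLE: NAMES = x m y cprob1 cprob2 mlc;   ! assumed order
          USEVARIABLES = x m y;
          ! mlc = most likely class saved in step 1
          GROUPING = mlc (1 = complier 2 = noncomp);
MODEL:    y ON m x;
          m ON x;
```

This treats the most likely class as if it were observed, which is why the entropy guideline above matters.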

Thank you, Dr. Muthen, for such a prompt response! That's the thing: I have never created an observed binary variable based on compliance status, and I do not know how to create such a variable in Mplus.

I know that once you have them you use SAVE=CPROBABILITIES; to save them. But how do I create them? I don't see an example in the Mplus User's Guide.

And once I create them, can I actually open the data set and see them (as in Stata, where you can browse your data set)?

SAVE = CPROB not only gives the posterior class probabilities for each person but also gives the most likely class membership (the class for which the person's posterior probability is highest). You can also save any other variable you want into this data set by using the AUXILIARY option of the VARIABLE command.
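A minimal sketch of such a first-step input is shown below. The file names, the variable names, the indicators u1-u4, and the number of classes are hypothetical and would need to be replaced with those of the actual compliance model.

```
TITLE:    Step 1 - save class probabilities
DATA:     FILE = mydata.dat;           ! hypothetical file name
VARIABLE: NAMES = id x m y u1-u4;
          USEVARIABLES = u1-u4;        ! class indicators
          CATEGORICAL = u1-u4;
          AUXILIARY = id x m y;        ! carried into the saved file
          CLASSES = c(2);
ANALYSIS: TYPE = MIXTURE;
SAVEDATA: FILE = step1_cprob.dat;      ! hypothetical file name
          SAVE = CPROBABILITIES;
```

The saved file is a plain text file, so it can be opened in a text editor or imported into another package such as Stata. The end of the Mplus output lists the saved variables and their order.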

One more thing to add to the above. While my fit indices look perfect, I have a very large residual variance for my Y, so the model does not seem to explain much of the variance in the outcome. I know fit indices are not directly comparable to the residual variance or the amount of variance explained, but it still seems strange to have such perfect fit in this situation.