It is a well-known problem that in some models, even as the number of observations becomes large, econometric estimators fail to converge to the true parameter values; that is, they are inconsistent. The leading case is the binary response model estimated with panel data when "fixed effects" correlated with the explanatory variables are present in the population.

One approach researchers typically take is to observe individuals or organizations over multiple periods of time. Data of this kind is called panel data. Within the context of panel data, it is often assumed that the unobserved effects (genetics, motivation, business structure, and whatever other unobservables might be correlated with the explanatory variables) are unchanging over time.

If we make this assumption, then it is relatively easy to remove the effect of the unobservables from our analysis. In my previous post I demonstrated three distinct but equivalent methods for accomplishing this task when our structural model is linear.
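The previous post's code is not reproduced here, but the idea behind one of those methods, the within (demeaning) transformation, can be sketched on hypothetical simulated data (all names and parameter values below are assumptions for illustration):

```r
# Sketch of the within (demeaning) estimator on simulated panel data.
set.seed(101)
nperson <- 500; nperiod <- 5
id <- rep(1:nperson, each = nperiod)
fe <- rep(rnorm(nperson), each = nperiod)        # time-invariant unobserved effect
x  <- fe + rnorm(nperson * nperiod)              # regressor correlated with the effect
y  <- 1 + 2 * x + fe + rnorm(nperson * nperiod)  # linear model, true slope = 2

# Naive pooled OLS is biased because x is correlated with fe:
coef(lm(y ~ x))["x"]
# Demeaning within person removes fe entirely:
xd <- x - ave(x, id)
yd <- y - ave(y, id)
coef(lm(yd ~ xd - 1))   # recovers a slope near 2
```

Because the fixed effect is constant within a person, subtracting each person's mean wipes it out of both sides of the equation, and OLS on the demeaned data is consistent.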

However, when our model has a binary response variable (graduate from college or not, get married or not, take the job or not, etc.) it is usually no longer logically consistent to stick with a linear model.
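One way to see the problem: a linear model's fitted "probabilities" are not confined to [0, 1], while a probit or logit's are. A quick illustration on hypothetical simulated data:

```r
# Linear probability model vs. probit on a simulated binary outcome.
set.seed(102)
x <- rnorm(1000)
y <- as.numeric(x + rnorm(1000) > 0)  # binary response
lpm <- lm(y ~ x)                      # linear probability model
range(fitted(lpm))                    # fitted values stray outside [0, 1]
pro <- glm(y ~ x, family = binomial(link = "probit"))
range(fitted(pro))                    # always strictly inside (0, 1)
```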

In addition, not all of the remedies that worked for the linear model provide consistent estimators for non-linear models. Let us see this in action.

First we will start by generating the data as we did in the December 4th post.
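That post's exact data-generating process is not reproduced here; the names below (`yp`, `x`, `fulldata`, the group index `id`) are assumptions chosen to match the later code. A sketch of a panel binary-response DGP with a correlated fixed effect might look like:

```r
# Hypothetical panel probit DGP: a latent index with a person-specific
# fixed effect that is correlated with the regressor x.
set.seed(1204)
nperson <- 300; nperiod <- 10
id <- rep(1:nperson, each = nperiod)
fe <- rep(rnorm(nperson), each = nperiod)   # person-specific fixed effect
x  <- 0.5 * fe + rnorm(nperson * nperiod)   # x correlated with the effect
# Observed response is 1 when the latent index crosses zero:
yp <- as.numeric(x + fe + rnorm(nperson * nperiod) > 0)
fulldata <- data.frame(id, x, yp)
head(fulldata)
```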

# Add a time index. First define a group-apply function
# that applies a function by group index.
gapply <- function(x, group, fun) {
  returner <- numeric(length(group))
  for (i in unique(group)) returner[i == group] <- fun(x[i == group])
  returner
}
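A quick check of the group-apply function (its definition is repeated so the snippet runs on its own), computing group means, which is how a group-mean variable like `xmean` can be built:

```r
# gapply fills each observation with fun() applied to its group's values.
gapply <- function(x, group, fun) {
  returner <- numeric(length(group))
  for (i in unique(group)) returner[i == group] <- fun(x[i == group])
  returner
}

id <- rep(1:3, each = 2)
x  <- c(1, 3, 10, 20, 5, 5)
xmean <- gapply(x, id, mean)  # group means of x: 2 2 15 15 5 5
```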

glm(yp ~ x + xmean, data = fulldata, family = binomial(link = "probit"))
# Interestingly, including the Chamberlain-Mundlak device in the probit,
# though theoretically inconsistent, does seem to produce estimates
# comparably good to including the device with the logit, at least
# in the sample sizes simulated here.
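The comparison can be reproduced in miniature with a self-contained simulation (the DGP below is an assumption, not the original post's):

```r
# Probit vs. logit, each with the Chamberlain-Mundlak device (group mean of x).
set.seed(7)
nperson <- 500; nperiod <- 10
id <- rep(1:nperson, each = nperiod)
fe <- rep(rnorm(nperson), each = nperiod)
x  <- 0.5 * fe + rnorm(nperson * nperiod)
yp <- as.numeric(x + fe + rnorm(nperson * nperiod) > 0)
xmean <- ave(x, id)                 # Mundlak device: person-level mean of x
fulldata <- data.frame(yp, x, xmean)

probit <- glm(yp ~ x + xmean, data = fulldata, family = binomial(link = "probit"))
logit  <- glm(yp ~ x + xmean, data = fulldata, family = binomial(link = "logit"))
coef(probit)["x"]; coef(logit)["x"]
# The logit slope runs larger than the probit slope (roughly a factor of 1.6)
# purely because of the different error scales of the two links.
```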