Re: st: too many base levels specified?

thanks, jeff,
this is incredibly helpful!
traci
On Mon, Dec 6, 2010 at 12:19 PM, Jeff Pitblado, StataCorp LP
<jpitblado@stata.com> wrote:
> Traci Schlesinger <traci.schlesinger@gmail.com> is using -suest- with the
> results from two -logistic- model fits and is getting the "too many base
> levels specified" error:
>
>> I am analyzing racial disparities in criminal justice outcomes using
>> both logistic and reg on individual-level data split by sentencing
>> structure. I first run the models on states with traditional codes
>> and then on states with presumptive guidelines. I am using the SUEST
>> command so that I can compare the values of the variables "black" and
>> "latino" across these two models. In other words, I want to know if
>> the value of "black", for example, is different in states with
>> traditional codes than in states with presumptive guidelines.
>>
>> here is (part of) my code:
>>
>> logistic incarcerate black latino young1 young2 drugt totchgs2 chg1att
>> second_fel cjstatus priarr prifconv primconv pripris y1998 y2000 y2002
>> y2004 y2006 i.county if guidelines_cat == 0;
>> est store a;
>> logistic incarcerate black latino young1 young2 drugt totchgs2 chg1att
>> second_fel cjstatus priarr prifconv primconv pripris y1998 y2000 y2002
>> y2004 y2006 i.county if guidelines_cat == 2;
>> est store b;
>> suest a b;
>> test [a_incarcerate]black=[b_incarcerate]black;
>> test [a_incarcerate]latino=[b_incarcerate]latino;
>>
>> While this code is very similar to codes I have used in the past, I am
>> getting the error code "too many base levels specified r(198);" when
>> stata gets to the "suest a b;" line. What does this mean? And, is
>> there anything I can do about it?
>
> It appears that the smallest value of the 'county' variable is different when
> 'guidelines_cat' is equal to 0 compared to when it is equal to 2. This
> results in a different base level being chosen for the 'county' factor
> variable in the two -logistic- model fits. When these two models are combined
> by -suest-, Stata complains about the two different base level choices.
>
> Traci can -fvset- the base for the 'county' factor variable so that it is
> consistent between model fits.
>
> Here is a fabricated example using the auto dataset. Output is omitted for
> the sake of brevity.
>
> . sysuse auto
> . tabulate rep78 foreign, nolabel
>
> Reviewing the output from -tabulate- shows that 'rep78' takes on the integer
> values from 1 to 5, but there are no observations where 'rep78' is 1 or 2 and
> 'foreign' is 1.
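>
> The same check can be made without reading the table. As a minimal sketch,
> assuming the auto dataset is already in memory, -levelsof- lists the values
> of 'rep78' observed within each group:
>
> ```stata
> * levels of rep78 observed among foreign cars
> levelsof rep78 if foreign == 1
> * levels of rep78 observed among domestic cars
> levelsof rep78 if foreign == 0
> ```
>
> Any level that appears in both lists is a candidate for a common base.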
>
> Let's use -fvset- to fix the base level for 'rep78' to 5:
>
> . fvset base 5 rep78
>
> Now we can fit two linear regression models (or any models we want) without
> having to also specify the common base level.
>
> . regress mpg turn i.rep78 if foreign==1
> . estimates store Foreign
> . regress mpg turn i.rep78 if foreign==0
> . estimates store Domestic
>
> Thus we can use -suest- with the stored estimation results.
>
> . suest Foreign Domestic
> . test [Foreign_mean]turn = [Domestic_mean]turn
>
> Notice that we chose a base of 5 for 'rep78' instead of 1, because level 1
> does not occur among the foreign cars. The above -test- result is the same
> whichever base is used, but had each fit used its own default base, the
> 'rep78' coefficients would not have a consistent interpretation between the
> model fits. This would only matter if we were interested in comparing the
> 'rep78' coefficients between the model fits.
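>
> As an alternative to -fvset-, the base level can be written into each command
> with the ib#. factor-variable operator. The following is only a sketch on the
> same auto data; the stored-result names Foreign2 and Domestic2 are made up
> for illustration:
>
> ```stata
> sysuse auto, clear
> * ib5. requests level 5 of rep78 as the base in this fit
> regress mpg turn ib5.rep78 if foreign == 1
> estimates store Foreign2
> * the same base must be requested in the second fit
> regress mpg turn ib5.rep78 if foreign == 0
> estimates store Domestic2
> suest Foreign2 Domestic2
> test [Foreign2_mean]turn = [Domestic2_mean]turn
> ```
>
> With this approach the base has to be repeated in every model, whereas
> -fvset- records it once with the variable.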
>
> --Jeff
> jpitblado@stata.com
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>