Re: st: When number of regressors greater than the number of clusters in OLS regression

Date

Mon, 1 Sep 2008 21:08:15 -0400 (EDT)

Thank you all for your invaluable suggestions. I really appreciate it.
Divya.
---- Original message ----
>Date: Mon, 1 Sep 2008 19:19:48 -0400
>From: Steven Samuels <sjhsamuels@earthlink.net>
>Subject: Re: st: When number of regressors greater than the number of clusters in OLS regression
>To: statalist@hsphsun2.harvard.edu
>
>Thanks Mark. I've been thinking that the data were not *sampled* as
>clusters. Since they were not, I erroneously assumed that there would
>not be cluster effects. I agree clustered effects should be
>considered. As Vince Wiggins stated in
>http://www.stata.com/statalist/archive/2005-10/msg00594.html ,
>"We can use the [robust] covariance
>matrix to test any subset of joint hypotheses that does not exceed
>its rank." Thus Divya can get valid standard errors for single
>coefficients, if she adds states as clusters, and can probably make
>most of the inferences she is interested in.
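
The rank point can be tried on a small scale. The following is only a sketch on Stata's bundled auto data, not on the data discussed in this thread; the choice of regressors and of rep78 as the cluster variable is an assumption made purely for illustration. With seven parameters and five clusters, the overall model test is not expected to be available, while single-coefficient tests still are.

```stata
* Illustration only: bundled auto data, not the data from this thread.
sysuse auto, clear

* Six regressors plus a constant, but rep78 defines only five clusters.
regress price mpg weight length turn displacement gear_ratio, vce(cluster rep78)

* One restriction, which does not exceed the rank of the robust VCE.
test mpg

* Numerical rank of the estimated covariance matrix, computed in Mata.
mata: rank(st_matrix("e(V)"))
```

Comparing the reported rank with the number of coefficients shows how many joint restrictions could be tested at once.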
>
>-xtreg- offers some intriguing possibilities, for it would
>distinguish between state-level and district-level predictors of the
>same kind. Of course statistics from neighboring districts may be
>spatially correlated, opening up a completely different area of
>analysis.
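
One possible reading of the -xtreg- suggestion, sketched on simulated data: enter a district-level predictor together with its state mean in a random-effects model, so that the two coefficients are estimated separately. All variable names (state, x_district, x_state, y) and the data-generating values are hypothetical, and this is not necessarily the specification Steve had in mind.

```stata
* Simulated data; every name and parameter here is hypothetical.
clear
set seed 2008
set obs 20                          // 20 states
gen state = _n
gen u_state = rnormal()             // unobserved state effect
gen m_state = rnormal()             // state-level shift in the predictor
expand 15                           // 15 districts per state
gen x_district = m_state + rnormal()
bysort state: egen x_state = mean(x_district)
gen y = 1 + 0.5*x_district + 0.3*x_state + u_state + rnormal()

* Random-effects model with states as the panel (group) variable.
xtset state
xtreg y x_district x_state, re
```

Spatial correlation between neighboring districts is not modeled here.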
>
>Perhaps the best advice to Divya that I can give, in addition to Mark's:
>
>Clarify your purpose--is the study exploratory ("find a good
>predictive model")? Or are you testing hypotheses about certain
>predictors? If your analysis is exploratory, consider holding out a
>random set of districts or states on which to test the fit of your
>"best" models. If you are interested in certain predictors, then
>others are potential effect modifiers and confounders. You probably
>don't need them all. Do you have 25 predictors because you know they
>are all important from other studies? The more unnecessary
>predictors you have in one model, the more difficult it will be to
>tease out the truly important ones.
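
The holdout idea for an exploratory analysis might look like the sketch below. It uses simulated data with hypothetical variable names, holds out whole states rather than individual districts, and the 25% holdout share is an arbitrary assumption.

```stata
* Simulated data; names, sizes, and the 25% share are hypothetical.
clear
set seed 2008
set obs 20                               // 20 states
gen state = _n
gen u_state = rnormal()                  // unobserved state effect
gen byte holdout = runiform() < 0.25     // drawn once per state, before expand
expand 15                                // 15 districts per state
gen x1 = rnormal()
gen x2 = rnormal()
gen y = 1 + 0.5*x1 + u_state + rnormal()

* Fit on the training states only.
regress y x1 x2 if !holdout, vce(cluster state)

* Out-of-sample predictions and squared prediction error on held-out states.
predict yhat if holdout
gen sqerr = (y - yhat)^2 if holdout
summarize sqerr
```

The mean of sqerr can then be compared across candidate models fitted to the same training states.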
>
>-Steve
>
>
>
>On Sep 1, 2008, at 6:00 PM, Schaffer, Mark E wrote:
>
>>
>> Whether or not you need to use cluster-robust depends on whether you
>> think your data have a problem that cluster-robust can address, namely
>> (1) the error terms in your equation are correlated within states
>> because of unobserved heterogeneity (so the iid assumption fails), but
>> (2) the error terms are not correlated across states.
>>
>> A good example would be whether you are looking at something that is
>> affected by state-level regulation, i.e., the laws regulating it vary
>> from state to state, but you don't have variables that control for
>> this somehow.
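
A small simulation can illustrate the situation Mark describes: an unobserved state-level component makes errors correlated within states but not across them. The variable names and parameter values are hypothetical. The cluster-robust standard error on a state-level regressor would typically come out larger than the default one in a setup like this.

```stata
* Simulated data; names and parameter values are hypothetical.
clear
set seed 2008
set obs 30                     // 30 states
gen state = _n
gen u_state = rnormal()        // unobserved state-level heterogeneity
gen x = rnormal()              // state-level regressor
expand 20                      // 20 districts per state
gen y = 1 + 0.5*x + u_state + rnormal()

* Default standard errors treat the errors as iid.
regress y x

* Cluster-robust standard errors allow correlation within states.
regress y x, vce(cluster state)
```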
>
>*
>* For searches and help try:
>* http://www.stata.com/help.cgi?search
>* http://www.stata.com/support/statalist/faq
>* http://www.ats.ucla.edu/stat/stata/
=======================================
Divya Balasubramaniam
Economics PhD Student
Terry College of Business
University of Georgia
Athens -30602.