Re: st: A layman question on model building

Thanks Maarten
On Thu, Mar 7, 2013 at 5:22 PM, Maarten Buis <maartenlbuis@gmail.com> wrote:
> --- On 07.03.2013 08:07, James Bernard wrote:
>>> We often add control variables that turn out to be insignificant. Does
>>> that mean that I can remove that variable from my model without being
>>> concerned with omitted variable bias?
>
> --- On Thu, Mar 7, 2013 at 8:22 AM, John Antonakis wrote:
>> If you have a sufficiently large sample size and the regressors of interest
>> are significant predictors, then it is best to leave in the controls; they
>> do not harm but help consistency (even if only a tad). <snip> I would
>> (mostly) always err on the side of caution and include the controls.
>
> I agree with John if the predictor belonged in the model. However,
> many predictors don't belong in the model to begin with. I often see
> researchers including variables only because they think they influence
> the explained/dependent/response/left-hand-side/y variable. This is a
> necessary but _not_ a sufficient condition. The researcher also needs
> to think about the relationship between the control variable and the
> predictor of interest. This leads to two common mistakes:
>
> Say you have a response that is influenced by someone's social status
> and you approximate social status with someone's occupation. Should you
> then control for someone's education? If you think that someone's
> education is also an approximation of someone's social status, and that
> that is the mechanism through which education influences your response,
> then the answer should be no. If you do, then you measure the impact of
> one measure of social status while adjusting for another measure of
> that same social status. What you could do is either choose one of the
> two measures of social status or combine the two measures into a
> single status measure, for instance using -sheafcoef- and -propcnsreg-
> (both from SSC) or -sem-. Maybe you need to rethink your theory: it
> could be that it is really just the possibilities and constraints
> associated with the occupation and the knowledge and socialization
> received from education that influence your response, in which case
> both might (see below) belong in your model.
>
> Say you want to know the impact of someone's education on a response.
> Should you control for that person's occupation? In most cases I would
> say no, as that way you would filter out one of the important
> mechanisms through which education influences the response. In most
> cases it is reasonable to assume that someone's education influences
> someone's occupation and not the other way around. So if occupation
> influences a response, then that represents a causal pathway through
> which education influences that response: education influences
> someone's occupation, which in turn influences the response. This is not
> a spurious effect that you want to filter out. If anything, the causal
> claim for this part of the effect of education is stronger than the
> residual effect you would obtain if you controlled for occupation,
> since we know why that effect is there and how it works. It could be
> meaningful to use -sem- in combination with -estat teffects- to
> decompose the total effect of education into effects that can be
> explained by intervening variables (occupation) and a
> residual/unexplained effect of education.
>
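
A minimal sketch of the decomposition described above, on simulated
data rather than anything from a real study. The variable names (educ,
occ, y), the sample size, the seed, and the coefficients are all
assumptions chosen for illustration. By construction the direct effect
of educ on y is 0.5 and the indirect effect through occ is
0.8*0.7 = 0.56, so the total effect is 1.06.

```stata
* Simulated data: education affects the response directly and via occupation
* (all names and coefficients are hypothetical)
clear
set seed 12345
set obs 5000
generate educ = rnormal()
generate occ  = 0.8*educ + rnormal()
generate y    = 0.5*educ + 0.7*occ + rnormal()

* Total effect of education: occupation is left out on purpose
regress y educ

* Controlling for occupation leaves only the direct (residual) effect
regress y educ occ

* Same relationships as a path model, then decompose the effect of educ
sem (occ <- educ) (y <- occ educ)
estat teffects
```

The coefficient on educ in the first -regress- should estimate the total
effect and the one in the second -regress- the direct effect. -estat
teffects- then reports direct, indirect, and total effects from the -sem-
fit, so the part that runs through occ is shown explicitly rather than
being filtered out.
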
> So, once I have decided that a variable belongs in my model I would agree
> with John and almost always leave it in, but many variables do not
> belong in the model to begin with.
>
> Hope this helps,
> Maarten
>
> ---------------------------------
> Maarten L. Buis
> WZB
> Reichpietschufer 50
> 10785 Berlin
> Germany
>
> http://www.maartenbuis.nl
> ---------------------------------
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/