st: RE: Interaction and squared effects in a probit (pa) model

--- Andrea and Laura wrote me privately:
> We're working on a model and, when trying to solve some econometrical
> issues, we found your name on the statalist and thought you may be
> able to help us.

The rule is that you send questions to Statalist, not to individual
members:
<http://www.stata.com/support/faqs/res/statalist.html#private>

> We're estimating a panel data model with 90 cross-section observations
> (country pairs) and 10 time series observations (years), using the
> -xtgee- command with the -family(bin) link(probit) corr(ar1) robust
> force- options.
>
> We recently included interactive terms in our model and we're finding
> difficulties in estimating the correspondent marginal effects, as the
> command -margins- is not suitable for nonlinear estimations,
> especially when our variables of interest are combined with each other.
>
> To make things even more complex, we've also got variables interacted
> with themselves (quadratic terms).
>
> Searching for a command that would be suitable in our case, we found
> -inteff- (http://www.stata-journal.com/sjpdf.html?articlenum=st0063),
> but are a little confused because of the mention of the squared
> variables.
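
-margins- is actually exactly right when you want the marginal effect
after a model that includes squared terms, as long as you specify
those terms with factor-variable notation (c.mpg##c.mpg in the example
below) so that -margins- knows both terms belong to the same variable.
As you can see in the first part of the example, the marginal effect
returned by -margins- matches the marginal effect computed by hand.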
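
For reference, the by-hand computation in the example is just the chain
rule applied to the probit probability. The symbols \beta_0, \beta_1,
\beta_2 and \gamma below are notation introduced here for the constant,
the coefficients on mpg and its square, and the rep78 coefficient; they
do not appear in the original example.

```latex
\[
\Pr(\text{foreign}=1) = \Phi(xb), \qquad
xb = \beta_0 + \beta_1\,\text{mpg} + \beta_2\,\text{mpg}^2 + \gamma_{\text{rep78}}
\]
\[
\frac{\partial \Pr(\text{foreign}=1)}{\partial\,\text{mpg}}
  = \phi(xb)\,\bigl(\beta_1 + 2\,\beta_2\,\text{mpg}\bigr)
\]
```

Evaluated at mpg = 20 this gives \phi(xb)(\beta_1 + 40\,\beta_2), which
is what the -normalden()- line in the example computes.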

The real question should be: Do you really want marginal effects?
A marginal effect can be thought of as a linear model fitted on top of
your original model. In the graph produced by the example below, the
predicted probabilities follow a strongly non-linear pattern. This
raises the question: Do you believe that a single straight line can
meaningfully summarize the pattern in the predicted probabilities?
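
One way to make "a linear model on top of your previous model" concrete:
a marginal effect evaluated at mpg = x_0 is the slope of the tangent
line to the predicted-probability curve at that point. The notation
(x_0, m_0, c, xb_0, \beta_1, \beta_2) is introduced here for
illustration, with \beta_1 and \beta_2 the coefficients on mpg and its
square and xb_0 the linear predictor at x_0.

```latex
\[
m_0 = \phi(xb_0)\,\bigl(\beta_1 + 2\,\beta_2\,x_0\bigr)
\]
\[
\hat{p}(x) \approx \Phi(xb_0) + m_0\,(x - x_0)
           = \underbrace{\Phi(xb_0) - x_0\,m_0}_{c} + m_0\,x
\]
```

With x_0 = 20, c and m_0 correspond to the intercepts and slopes of the
grey "regression lines" drawn in the example.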

For the example below my answer would be: no, that pattern is just too
non-linear. This should come as no surprise: the quadratic term was
added because we believed there to be substantial non-linearity.
So there are two possibilities. Either we believe that a straight line
is a good-enough approximation, in which case we can use marginal
effects, but that raises the question of why we added the quadratic
term. Or we believe that the non-linearity is substantial, in which
case the quadratic term may be justified, but now marginal effects
lose their meaning.

If you are in the latter case, I would leave those cells empty in the
table of marginal effects and add a footnote saying that the effect is
too non-linear to be meaningfully summarized by a marginal effect.
Then I would add a graph of the predicted probability against that
variable.

*-------------------- begin example -------------------------
sysuse auto, clear
recode rep78 1/2=3    // merge the sparse categories 1 and 2 into 3
probit foreign c.mpg##c.mpg i.rep78

// do it with margins
margins, dydx(*) at(mpg=20 rep78=4)

// do it by hand
tempname xb
scalar `xb' = _b[_cons] + _b[mpg]*20 + _b[c.mpg#c.mpg]*400 + ///
              _b[4.rep78]
di normalden(`xb') * (_b[mpg] + 2*20*_b[c.mpg#c.mpg])

//============================= do you really want marginal effects?
// create predicted probabilities by repair status
predict pr
separate pr, by(rep78)

// create the "regression lines" implied by marginal effects:
// slope b = marginal effect at mpg=20, intercept c chosen so that
// the line touches the predicted probability at mpg=20
scalar `xb' = _b[_cons] + _b[mpg]*20 + _b[c.mpg#c.mpg]*400

// rep78 == 3 (reference category)
local b3 = normalden(`xb') * ///
           (_b[mpg] + 2*20*_b[c.mpg#c.mpg])
local c3 = normal(`xb') - 20*`b3'
sum mpg if rep78 == 3, meanonly
local l3 = r(min)
local u3 = r(max)

// rep78 == 4
local b4 = normalden(`xb' + _b[4.rep78]) * ///
           (_b[mpg] + 2*20*_b[c.mpg#c.mpg])
local c4 = normal(`xb' + _b[4.rep78]) - 20*`b4'
sum mpg if rep78 == 4, meanonly
local l4 = r(min)
local u4 = r(max)

// rep78 == 5
local b5 = normalden(`xb' + _b[5.rep78]) * ///
           (_b[mpg] + 2*20*_b[c.mpg#c.mpg])
local c5 = normal(`xb' + _b[5.rep78]) - 20*`b5'
sum mpg if rep78 == 5, meanonly
local l5 = r(min)
local u5 = r(max)

// display them in a graph
twoway line pr3 mpg, sort lpattern(solid) lcolor(black) ||       ///
       function y = `c3' + `b3'*x,                                ///
           range(`l3' `u3') lpattern(solid) lcolor(gs8) ||        ///
       line pr4 mpg, sort lpattern(dash) lcolor(black) ||         ///
       function y = `c4' + `b4'*x,                                ///
           range(`l4' `u4') lpattern(dash) lcolor(gs8) ||         ///
       line pr5 mpg, sort lpattern(shortdash)                     ///
           lcolor(black) ||                                       ///
       function y = `c5' + `b5'*x,                                ///
           range(`l5' `u5') lpattern(shortdash) lcolor(gs8)       ///
       ytitle(predicted probability) xline(20)                    ///
       xtitle(miles per gallon)                                   ///
       legend( cols(1) pos(4)                                     ///
               order( - "probit predictions"                      ///
                      1 "rep78=3"                                 ///
                      3 "rep78=4"                                 ///
                      5 "rep78=5"                                 ///
                      - "marginal effects"                        ///
                        `""predictions""'                         ///
                      2 "rep78=3"                                 ///
                      4 "rep78=4"                                 ///
                      6 "rep78=5" ))
*------------------ end example ------------------------
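
If all you need is the graph of predicted probabilities recommended
above, without the tangent lines, a shorter route may be to let
-margins- compute the predictions over a grid and plot them with
-marginsplot-. This is only a sketch under some assumptions: it needs
Stata 12 or later (where -marginsplot- is available), and the grid
15(5)35 is an arbitrary choice made here for illustration that stays
within the range of mpg in the auto data.

```stata
sysuse auto, clear
recode rep78 1/2=3
probit foreign c.mpg##c.mpg i.rep78

// predicted probabilities on a grid of mpg values (grid chosen for
// illustration), separately for each repair category
margins rep78, at(mpg=(15(5)35))

// plot the predictions without confidence intervals
marginsplot, noci ytitle(predicted probability) ///
    xtitle(miles per gallon)
```

Unlike the example above, this evaluates every repair category over the
whole grid, so some points extrapolate beyond the mpg values actually
observed in that category.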
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )
Hope this helps,
Maarten
--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany
http://www.maartenbuis.nl
--------------------------