insignificant interaction term - should we still look at the marginal effects?

06 Nov 2015, 09:45

Dear Statalist,

I have a question that is maybe not so much about Stata as it is about statistics in general.

I run a regression model (-prais-) and include an interaction effect between two continuous variables. Let’s call them X and Z (c.X#c.Z).
The output tells me the interaction term c.X#c.Z is not significant: P>|t| = 0.470

Can I/should I proclaim that there is no interaction, or should I still look at the marginal effects?

When I look at the margins, I get the following:
margins, dydx(X) at(Z=(0(1)10)) vsquish

Or should I say, based on this, that the two variables actually do interact; that at some values of Z (3-8), the effect of X on Y is moderated/influenced by Z?
I had thought that after seeing the insignificant interaction term in the output, there would be no need to investigate further.

Would the conclusion/interpretation change if _all 11_ values (not just 3-8 but 0-10) were significant, while the interaction term in the regression remains insignificant?

The answer depends on your specific problem and context. The failure of the interaction term to achieve statistical significance may be due to the actual absence of effect modification. But it may also reflect inadequate sample size, inadequate variation in either X or Z, or too much noise in the outcome measurement.

If the theory of the domain you are working in says that an X#Z interaction is to be expected, then I would ignore the statistical significance and go with the output you have. If you were originally motivated to include c.X#c.Z because you had personal reasons to anticipate effect modification but wanted to find out, then you should revisit your original motivation. If you really had good reason to anticipate interaction, I wouldn't necessarily be dissuaded by the non-significance. But if it was more or less a whim, or part of a fishing expedition, then by all means go back and re-model without the interaction. If the whole purpose of the study was to determine whether there is interaction between X and Z, then you have a negative result, but the reasons for the negative result are, as already noted, potentially quite diverse.

Another thing that would influence my thinking about whether to retain these is the size of the effects you got from your -margins- command. Without knowing what X, Z, and your outcome variable are, I have no way to know if the difference between -0.0272084 and -0.0071655 is large enough to matter from a practical perspective (and whether the range of Z from 1 to 10 is the appropriate range to look for such interaction). You can be the judge of that. If you think that difference in effects is of practical significance, then, again, I would retain the interaction in the model.

Finally, I don't think I would pay much attention at all to the p-values in the -margins- output. They are telling you whether the effect of X on your outcome is significantly different from zero at each of the particular values of Z. There are a lot of aspects of the distribution of X, Z, and the outcome that affect these p-values. The fact that some are statistically significant and others are not is not particularly useful information. And even if all 11 were statistically significant, that would only tell me that throughout the range of Z from 0 to 10, X has a statistically significant non-zero effect on the outcome--but it doesn't say anything at all about whether these effects differ significantly from each other.
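If the question is whether the effects differ from each other, -margins- can test that directly with its pwcompare() option, rather than testing each effect against zero. A sketch, assuming an outcome Y and the other covariates (W, V, Q) that appear later in this thread:

```stata
* Hypothetical model following the thread's setup (prais with c.X##c.Z).
prais Y c.X##c.Z W V Q

* Pairwise comparisons of the marginal effect of X across values of Z;
* pwcompare(effects) reports each difference with a test and a CI.
margins, dydx(X) at(Z=(0 5 10)) pwcompare(effects)
```

This tests the quantity that matters for interaction: whether dy/dx at one value of Z differs from dy/dx at another, not whether each is different from zero.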

Comment

This sentence from you makes me write a reply: “ They are telling you whether the effect of X on your outcome is significantly different from zero at each of the particular values of Z.”

Let’s assume that the sample size is big enough, that variation in X and Z is sufficient, and that noise is small enough. This is not a question about research design in that sense.
I am doing research on political parties in Europe, looking at policy position/preference drift.

My argument is that the main predictor X has an effect on the DV Y (Y measures change in policy position on a certain issue; each party has its own value on Y).
I find that it does.

Now some theorize that the effect of X on Y should differ depending on the characteristics of the parties in question (for example, whether they are Left wing (Z = 1) or Right wing (Z = 10), or anything in between). Others theorize that the effect should hold only for some parties but not others.

I run a regression and find that:
Coefficient of X* (significant)
Coefficient of Z* (significant)
Coefficient of W (no sig)
Coefficient of V (no sig)
Coefficient of Q (no sig)
c.X#c.Z (no sig)
Other interactions also not significant.

And I go on to argue, on the basis of this, that the effect of X on Y does not depend on Z (it doesn’t matter if the party is Left or Right wing).

Along comes somebody, and they do the exact same thing. Same data, same everything. c.X#c.Z (no sig).
Then they show and plot the marginal effects (marginsplot; a similar graph is attached to this) and show that the effect of X on Y is stronger when Z is in the range of 3-7. Or that X only has an effect on Y when the party is Center-Left/Center. (Ignore for a moment the small values, e.g. 0.0272084.)

That goes against what I argued. I said “c.X#c.Z (no sig)”, hence there is a ceteris paribus effect of X on Y that is similar for all parties, regardless of whether they are left or right wing.

Comment

Then they show and plot the marginal effects (marginsplot) and show that the effect of X on Y is stronger when Z is in the range of 3-7. Or that X only has an effect on Y when the party is Center-Left/Center. (Ignore for a moment the small values, e.g. 0.0272084.)

Well, assuming they got the same results as you did, they could only "show" that by completely misinterpreting the results. The margins results you posted show no such effect at all. It is true that the effect of X is statistically significant when Z is in the 3-7 range, and not statistically significant outside that range. But the effect of X is clearly at its most negative when Z = 0 and at its most positive when Z = 10, and it increases monotonically throughout that range. The fluctuations in statistical significance arise as artifacts of peculiarities in the distributions of X, Z, and the outcome. (For example, maybe there are just more data with Z in the 3-7 range than at the extremes.)
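One way to see the monotonic pattern (and why the dropping in and out of significance is incidental) is to plot the effects with their confidence intervals. A sketch using the -margins- call from earlier in the thread; the axis titles are just placeholders:

```stata
* Effect of X at each value of Z, then plot with confidence intervals.
margins, dydx(X) at(Z=(0(1)10)) vsquish
marginsplot, recast(line) recastci(rarea) yline(0) ///
    ytitle("Marginal effect of X on Y") xtitle("Z")
```

The effect line rises steadily across Z; the confidence band simply straddles zero at the extremes, which is all the non-significance there means.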

It all boils down to a really important principle to bear in mind--and it's universally applicable:

The difference between statistically significant and not statistically significant is itself not statistically significant!


Comment

Thank you.
The negative and positive effects when Z = 0 and Z = 10 are OK; that is expected, because values below 0 on Y indicate change in one direction and values above 0 indicate change in the other.
The population is "all parties in Western Europe", so obviously there are more parties close to the center than there are at the extremes.

So if I get this straight: the bottom line is that even if an interaction effect is insignificant, I can still draw conclusions from the marginal effects, as the other researcher does? Is that the lesson for today?

Comment

I'm not sure exactly which conclusions drawn from the marginal effects you mean. You can't conclude that your analysis affirms the existence of an interaction; it doesn't. But the non-significance of the interaction term doesn't exclude that possibility either, unless you can rule out all the other reasons why the interaction term might fail to attain statistical significance. Based only on what you show, I would say that the data are compatible with either interpretation.

If you find the effect given by -margins- at Z = 0 to be different from that at Z = 10, then certainly you can report that finding, even if it is not buttressed by a p-value. (Note, by the way, that the confidence intervals for those effects do not overlap--though I wouldn't put too much stock in that either.)
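If you do want a formal test of that Z = 0 versus Z = 10 comparison, note that in a linear model it collapses back onto the interaction coefficient itself, since the effect of X at a given Z is b_X + Z*b_XZ. A sketch, again assuming the model setup from earlier in the thread:

```stata
* The difference between the effect of X at Z = 10 and at Z = 0 is 10*b_XZ,
* so this test is equivalent to the test of the interaction term itself.
prais Y c.X##c.Z W V Q
lincom 10*c.X#c.Z
```

That equivalence is one more reason why "significant at some values of Z but not at others" adds nothing beyond the interaction term's own test.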

But one conclusion you definitely cannot draw is that the effect modification by Z is stronger in the 3 to 7 range than at the extremes. Not only is asserting that claim a misinterpretation of the p-values; it actually contradicts the data in the dy/dx column of the output.