There are 16 people in the dataset (identified by subjectID). Each year, when these people were aged 11, 12, 13, 14, and 15, they each gave us a score y. An overview of the dataset looks like this:

1 Answer

Short answer: No, you cannot say that there is "evidence of a significant difference". Your test shows that there is statistically significant evidence of a non-zero difference in the slopes for men and women, which is a different thing.

Longer answer: Firstly, it is important to get the terminology of hypothesis testing correct. Classical hypothesis tests do not measure any such thing as a "significant difference" between groups. Statistical "significance" in this context is a measure of the strength of the evidence in favour of the alternative hypothesis of a non-zero difference between the unknown slopes for these two groups. The "significance" here refers to the magnitude of the evidence of a difference, not the magnitude of the difference itself. Hence, you should refer to there being "statistically significant evidence of a difference", not a "significant difference" in the slopes.

In this particular case, you have conducted an ANOVA test of the nested models, where the null hypothesis is that there is no interaction term (i.e., zero difference in the slopes for males and females) and the alternative hypothesis is that there is a non-zero interaction term (i.e., a non-zero difference in the slopes for males and females). This test gives a p-value of $2.2 \times 10^{-16}$, which is extremely low, and well below any significance level that would be likely to be chosen for testing. Provided that this nested model form is reasonable (which you can assess by, e.g., checking diagnostic plots), you have indeed identified statistically significant evidence of a difference in the slopes between the male and female groups.
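To make the hypotheses concrete, the test can be written out in symbols. This is a sketch under assumptions rather than the fitted model itself: it takes the full model to be y ~ Age*Sex + (Age|id), codes Sex as a 0/1 indicator, and uses $\beta$, $b$, and $\varepsilon$ as notation introduced here for illustration, with $i$ indexing subjects and $j$ indexing measurement occasions.

$$
y_{ij} = \beta_0 + \beta_1\,\text{Age}_{ij} + \beta_2\,\text{Sex}_i + \beta_3\,(\text{Age}_{ij} \times \text{Sex}_i) + b_{0i} + b_{1i}\,\text{Age}_{ij} + \varepsilon_{ij}
$$

$$
H_0: \beta_3 = 0 \quad \text{versus} \quad H_A: \beta_3 \neq 0
$$

Under this coding, $\beta_1$ is the slope for the reference group and $\beta_1 + \beta_3$ is the slope for the other group, so $\beta_3$ is the difference in slopes. Setting $\beta_3 = 0$ gives the reduced model y ~ Age + Sex + (Age|id), which is nested within the full model.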

On this point, it is again important to stress that there is a huge difference between "significant evidence of a difference" and "evidence of a significant difference". These are two very different things, and conflating them is an example of the quantifier-shift fallacy. It is the same error as confusing "significant evidence of a crime" (e.g., the police collect strong evidence that a motorist ran a stop-sign) with "evidence of a significant crime" (e.g., evidence of arson, rape, or murder).

I think the models are not nested, the degrees of freedom are the same in the two models.
– Dimitris Rizopoulos, Mar 28 at 5:31

Although I wrote the R command, I don't know how to write the null hypothesis of the test.
– tieguanyin, Mar 28 at 22:38

Having looked at your R commands, I think Dimitris is correct that you have not used nested models. The full model should be y ~ Age*Sex + (Age|id), not y ~ Age:Sex + (Age|id).
– Ben, Mar 28 at 23:24

Thanks, Ben. Now I am confused about how to write the null hypothesis for such a nested model test: y ~ Age*Sex + (Age|id) and y ~ Age + Sex + (Age|id)?
– tieguanyin, Mar 28 at 23:40