Possible (!) explanation

Simplify then seems to cache the fact that simplifying to rule is better, because the result has a lower LeafCount. Once that is cached, the rule keeps being applied, even if rule is later set to an expression that is more complex than the original one.
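One way to probe this guess is to compare leaf counts directly and then clear the internal caches before simplifying again. This is only a sketch: `expr` and `candidate` are hypothetical stand-ins, not the expression or the rule from the question, and whether the cache is really responsible is exactly what the check is meant to find out.

```mathematica
(* Hypothetical stand-ins; substitute the expression and rule from your own session *)
expr = a b + a c;
candidate = a (b + c);

(* Compare sizes; Simplify's default complexity measure is closely related to LeafCount *)
{LeafCount[expr], LeafCount[candidate]}

(* Discard internally cached results, then simplify again in the same session *)
ClearSystemCache[];
Simplify[expr]
```

If the result changes after `ClearSystemCache[]`, that would support the caching explanation. If it does not, something else is going on.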
