I tried again with a recent Clang and can confirm your finding. Clang does indeed produce faster compiles than GCC (by 4.5%) on my code, which relies heavily on cross-module inlining and makes zero use of "static inline" function implementations in header files.

For my draughts program (template heavy, header-only C++ code), GCC consistently produces 10% faster binaries than Clang. I suspect that Clang is not aggressive enough with inlining all my thin wrappers. I am too lazy to figure out the exact threshold parameter that will trigger Clang to inline as well as GCC.
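
For anyone who wants to test the inlining hypothesis without hunting for the threshold, here is a minimal sketch that forces inlining on the wrappers themselves. The `FORCE_INLINE` macro and the `BitSet` class are hypothetical names made up for illustration, not taken from my program; the `always_inline` attribute is the one that both GCC and Clang accept.

```cpp
#include <cstdint>

// Hypothetical macro name. The always_inline attribute asks the compiler
// to inline the function regardless of its usual inlining heuristics.
#if defined(__GNUC__) || defined(__clang__)
#define FORCE_INLINE __attribute__((always_inline)) inline
#else
#define FORCE_INLINE inline
#endif

// Hypothetical thin wrapper around a 64-bit board representation,
// standing in for the kind of small forwarding functions described above.
class BitSet {
public:
    constexpr explicit BitSet(std::uint64_t bits) : bits_(bits) {}

    // True if the bit for the given square (0..63) is set.
    FORCE_INLINE constexpr bool test(int square) const {
        return ((bits_ >> square) & 1u) != 0;
    }

    // Intersection of two sets.
    FORCE_INLINE constexpr BitSet operator&(BitSet other) const {
        return BitSet(bits_ & other.bits_);
    }

    // Access to the underlying word.
    FORCE_INLINE constexpr std::uint64_t raw() const { return bits_; }

private:
    std::uint64_t bits_;
};
```

If the gap to GCC shrinks with the attribute in place, that would support the idea that Clang's default inlining heuristics are the cause. The other route is raising the limit globally; as far as I know Clang accepts `-mllvm -inline-threshold=<n>` for this, but I have not worked out a value, so treat that as an assumption to verify.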