You are giving your test subjects 1 s of wall-clock time, not CPU time (the time they actually spend running). On a loaded machine, or for a subject that blocks on I/O, that can make quite a difference.
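
To make the distinction concrete, here is a minimal sketch in Python (not necessarily the language of your harness). The helper name `measure` is made up for illustration; `time.perf_counter` and `time.process_time` are standard library calls that report elapsed wall-clock time and CPU time of the current process, respectively.

```python
import time

def measure(fn):
    """Run fn() and return (wall_seconds, cpu_seconds) for the call.

    'measure' is a hypothetical helper, not part of any harness.
    """
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time used by this process
    fn()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu

# A subject that mostly waits: it uses up wall-clock time but almost no CPU time.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s")
```

A subject like this would exhaust a wall-clock budget while having done next to no work, which is the gap a fixed 1 s limit can hide.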

Why run them only for a fixed time instead of building in a finish criterion? As it stands, there is no real assurance that whatever was to be tested was actually done (especially if longer-running tests join the set later).
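
One possible shape for this, sketched in Python under the assumption that each test subject is a separate process that exits with status 0 when it has finished its work: let the exit status be the finish criterion and keep the time limit only as a generous safety net. The function name `run_subject` and the status strings are invented for this example; `subprocess.run` with `timeout` and `subprocess.TimeoutExpired` are standard library.

```python
import subprocess
import sys

def run_subject(cmd, timeout_s):
    """Run cmd until it exits; return 'finished', 'failed', or 'timed out'.

    Hypothetical helper: the exit status is the finish criterion,
    timeout_s is only a safety net against hung subjects.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_s, capture_output=True)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "timed out"
    return "finished" if proc.returncode == 0 else "failed"

# A subject that completes, and one that would outlive its limit.
quick = run_subject([sys.executable, "-c", "print('done')"], timeout_s=30)
slow = run_subject([sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.5)
print(quick, slow)
```

With this arrangement a timeout is reported as its own outcome rather than silently counted as a completed run, so a longer test added later shows up as a problem instead of passing unnoticed.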