Hi,
I wonder whether others have already noticed that allocations can be,
surprisingly, slower on 64-bit platforms than on 32-bit ones.
I compiled the following code with an OCaml compiler that generates 32-bit code:
-------------------------
let () =
  for i = 1 to 100000000 do
    ignore (Int32.add 42l 24l)
  done
-------------------------
I ran it on a 64-bit platform (Intel(R) Pentium(R) D CPU 2.80GHz), and
it took 0.65 seconds to finish. Then I recompiled it on the same
platform with an OCaml compiler that generates 64-bit code.
The resulting executable took 0.72 seconds to run!
This is only a difference of about 10%, but I have seen more complex
cases where there are timing differences in excess of 50%, which is
already pretty substantial.
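
For anyone who wants to reproduce the comparison, here is a minimal sketch that times the same loop in-process. It is not what I used for the figures above. `time_loop` is a name made up for this example, and `Sys.opaque_identity` (OCaml 4.03 or later) is only there on the assumption that a newer compiler might otherwise optimise the boxed result away:

```ocaml
(* Sketch: time the allocation loop with Sys.time (processor time in
   seconds). [time_loop] is a hypothetical helper, not part of any library. *)
let time_loop iterations =
  let start = Sys.time () in
  for _i = 1 to iterations do
    (* Sys.opaque_identity keeps the compiler from removing the boxed
       Int32 result, so every iteration still allocates. *)
    ignore (Sys.opaque_identity (Int32.add (Sys.opaque_identity 42l) 24l))
  done;
  Sys.time () -. start

let () =
  let elapsed = time_loop 100_000_000 in
  Printf.printf "word size: %d bits, elapsed: %.2fs\n" Sys.word_size elapsed
```

Building this once with the 32-bit compiler and once with the 64-bit one, and running both on the same machine, should give a comparable pair of numbers.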
Looking at the assembly, there is really no difference in the loop
other than the use of the quad word instructions, which should not
take longer on the exact same platform (i.e. same CPU-frequency). But
there is a suspicious call to "caml_alloc2", which might cause these
differences. Could there be alignment problems or something similar
in the runtime?
In the considerably more complex code I'm currently working on, it also
looks as though allocations (i.e. the runtime) cause the
performance difference.
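
One way to check the allocation hypothesis would be to count how much the loop allocates on each platform. The sketch below uses `Gc.minor_words` from the standard library; `words_allocated` is a hypothetical helper name. My understanding, which I have not verified on both builds, is that a boxed int32 takes the same number of words either way, so the 64-bit executable would allocate about twice as many bytes:

```ocaml
(* Sketch: count the minor-heap words allocated by the loop.
   [words_allocated] is a hypothetical helper name. *)
let words_allocated iterations =
  let before = Gc.minor_words () in
  for _i = 1 to iterations do
    (* Sys.opaque_identity forces the boxed Int32 to be materialised. *)
    ignore (Sys.opaque_identity (Int32.add (Sys.opaque_identity 42l) 24l))
  done;
  Gc.minor_words () -. before

let () =
  let n = 1_000_000 in
  let words_per_iter = words_allocated n /. float_of_int n in
  let bytes_per_word = float_of_int (Sys.word_size / 8) in
  Printf.printf "%d-bit: %.1f words (%.1f bytes) per iteration\n"
    Sys.word_size words_per_iter (words_per_iter *. bytes_per_word)
```

If the word counts match on both builds while the byte counts differ, that would point at memory traffic rather than at the loop instructions themselves.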
Regards,
Markus
--
Markus Mottl http://www.ocaml.info markus.mottl@gmail.com