System.nanoTime and multiple cpus/cores

Discussion in 'Java' started by transpendence@googlemail.com, Mar 17, 2006.

Guest

I've tried to use System.nanoTime to measure timing intervals
precisely. It works great, but only as long as the program runs on
a single CPU. If there are multiple CPUs/cores, the running thread seems
to switch between CPUs, and each CPU seems to have a different timer
base: the result of nanoTime jumps forward and backward in time,
depending on which CPU the thread is currently running on.

Is there a way to pin threads to a single CPU directly in
Java, or to use another high-precision timer (I need ms resolution and
it should work on Windows too)?
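
A minimal sketch of how the symptom described above could be checked: take many consecutive System.nanoTime() readings on one thread and count how often a reading is smaller than the one before it. The class and method names (NanoTimeJumpDetector, sample, countBackwardSteps) are made up for illustration and are not from the original post. On a healthy system the count should be zero; on an affected multi-core machine it may not be.

```java
// Hypothetical helper, not part of the original post.
final class NanoTimeJumpDetector {

    // Collects 'count' consecutive System.nanoTime() readings on the calling thread.
    static long[] sample(int count) {
        long[] readings = new long[count];
        for (int i = 0; i < count; i++) {
            readings[i] = System.nanoTime();
        }
        return readings;
    }

    // Counts how many readings are earlier than their predecessor.
    // Differences are compared by subtraction, as the nanoTime() docs recommend,
    // so the check stays correct even if the counter wraps around.
    static int countBackwardSteps(long[] readings) {
        int backward = 0;
        for (int i = 1; i < readings.length; i++) {
            if (readings[i] - readings[i - 1] < 0) {
                backward++;
            }
        }
        return backward;
    }

    public static void main(String[] args) {
        long[] readings = sample(1_000_000);
        System.out.println("backward steps: " + countBackwardSteps(readings));
    }
}
```

Since the scheduler decides when a thread migrates, a single run that reports zero does not prove the machine is unaffected; running it several times under load gives a better picture.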

wrote:
> I've tried to use System.nanoTime to make precise measures of timing
> intervalls. I't works great - but only as long as the program runs on
> one cpu only. If there are multiple cpus/cores, the running thread seem
> to switch between different cpus and each cpu seem to have a different
> timer base the result of nanoTime is jumping forward and backward in
> time, depending on which cpu the thread is currently running.

Are you sure it is due to multithreading? There are these small
particles that are known to be able to jump back in time (have to do
some reading up on quantum mechanics before posting stuff like this).

Guest

wrote:
> I've tried to use System.nanoTime to make precise measures of timing
> intervalls. I't works great - but only as long as the program runs on
> one cpu only.

I recall an excellent thread (or was it an article?) explaining why
you can't rely on System.nanoTime() to give ultra-precise results
when running on multiple CPUs/cores, but I can't find it again.

You can't entirely reject the possibility of an error in the value
given back by System.nanoTime() either: if you search the
web you'll find at least one such bug (I think it was one
version of Windows that was at fault).

> Is there a possibility to force threads to a single cpu directly in
> java

Directly in Java no, even if years ago there have been talks
about this at Sun. They said that "maybe one day" you'd have
the ability to call a method like:

setCPUAffinity()

to define the "affinity mask". This would have solved your
problem, but AFAIK it has never been implemented (and may
not even be possible to implement at the JVM level).

That said, maybe you can force the affinity mask at the OS level
if you *really* need it (I'd rather let the OS scheduler decide
how the cores/cpus are used).

Guest

> > If there are multiple cpus/cores, the running thread seem
> > to switch between different cpus and each cpu seem to have a different
> > timer base the result of nanoTime is jumping forward and backward in
> > time, depending on which cpu the thread is currently running.
>
> Are you sure it is due to multithreading? There are these small
> particles that are known to be able to jump back in time (have to do
> some reading up on quantummechanics before posting stuff like this).
>
>

But... Don't you think his explanation may be *exactly* what
is going on? To me nanoTime() is not very precise when
running with several cores/cpus, so the OP's explanation doesn't
seem far-fetched at all (but I may be wrong).

Heck, even the famous assembly "rdtsc" instruction (mentioned
on Roedy's site, btw) could only be used to measure timing
accurately if the pipeline was flushed first, to prevent
out-of-order instruction execution. This required hacks and...
serious performance drops (flushing the pipeline could be done
by using cpuid).

That said, I can't wait to have Java 9874 which implements
System.picoTime(): this time it *really* is accurate... Then two
months later Intel starts selling the new virtual-multi-transparent
-woozing-buzz-architreadhed-cored-processor and picoTime()
isn't really that precise anymore. Repeat ad nauseam.

Not that I personally need a sub-nanosecond precision timer
or anything

Guest

I can force it to a single cpu via the windows task manager (the
problems are gone then) - but only after the program has started. And
it limits the whole process to a single cpu.

But it seems I've found a solution:

After some searching, I've found that System.nanoTime() uses
QueryPerformanceCounter() on Windows and that this function is known
to have problems on Athlon64 multicore systems. I've found a hint to
use the /usepmtimer switch in boot.ini. I don't know if it always
works, but after I changed it, the problems were gone.

On 17 Mar 2006 05:53:03 -0800, wrote,
quoted or indirectly quoted someone who said :
>I've tried to use System.nanoTime to make precise measures of timing
>intervalls. I't works great - but only as long as the program runs on
>one cpu only. If there are multiple cpus/cores, the running thread seem
>to switch between different cpus and each cpu seem to have a different
>timer base the result of nanoTime is jumping forward and backward in
>time, depending on which cpu the thread is currently running.

Roedy Green wrote:
> On 17 Mar 2006 05:53:03 -0800, wrote,
> quoted or indirectly quoted someone who said :
>
>
>>I've tried to use System.nanoTime to make precise measures of timing
>>intervalls. I't works great - but only as long as the program runs on
>>one cpu only. If there are multiple cpus/cores, the running thread seem
>>to switch between different cpus and each cpu seem to have a different
>>timer base the result of nanoTime is jumping forward and backward in
>>time, depending on which cpu the thread is currently running.
>
>
> that's a bug. Java is supposed to compensate for that.

I agree it's a bug, but I'm not sure Java can compensate for it. The JVM
does not necessarily know when a thread moves. The operating system does
know, and should be providing a consistent timer at the syscall, or
equivalent, level.
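
As an illustration of what limited compensation is possible in Java code (not something proposed in this thread), a wrapper could clamp readings so that callers never see time go backward. This is only a sketch: the class MonotonicClock and its methods are hypothetical names, and the approach merely hides backward jumps. It does not correct forward jumps or the underlying skew between per-CPU counters, which remains the operating system's job.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical wrapper: returns the largest reading seen so far.
final class MonotonicClock {
    private final AtomicLong last;

    MonotonicClock() {
        this(System.nanoTime());
    }

    // Lets the starting point be supplied explicitly (useful for testing).
    MonotonicClock(long initialReading) {
        last = new AtomicLong(initialReading);
    }

    // Non-decreasing replacement for System.nanoTime(), shared by all threads.
    long nanoTime() {
        return observe(System.nanoTime());
    }

    // Records a raw reading and returns it, unless an equal or later
    // reading was already handed out, in which case that one is returned.
    long observe(long rawReading) {
        while (true) {
            long previous = last.get();
            if (rawReading - previous <= 0) {
                return previous; // raw reading went backward: clamp
            }
            if (last.compareAndSet(previous, rawReading)) {
                return rawReading;
            }
            // another thread advanced the clock concurrently: retry
        }
    }
}
```

The price is a shared atomic variable on every call, and an interval measured across a backward jump comes out as zero rather than as its true length.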

On Fri, 17 Mar 2006 19:26:30 GMT, Patricia Shanahan <>
wrote, quoted or indirectly quoted someone who said :
>I agree its a bug, but I'm not sure Java can compensate for it. The JVM
>does not necessarily know when a thread moves. The operating system does
>know, and should be providing a consistent timer at the syscall, or
>equivalent, level.

hmm. You would have to enqueue a request to a fixed timer thread.
That of course defeats the fine grain resolution.
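
A rough sketch of that "fixed timer thread" idea, assuming a single-thread executor stands in for the timer thread; TimerThreadClock is a hypothetical name. Note that Java alone cannot pin this thread to one CPU, so the OS may still migrate it; the sketch only guarantees that all readings are taken by the same thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper: every reading is taken on one dedicated thread.
final class TimerThreadClock {
    private final ExecutorService timer = Executors.newSingleThreadExecutor();
    private final Callable<Long> read = () -> System.nanoTime();

    // Enqueues a request on the timer thread and blocks until it is answered.
    long nanoTime() {
        try {
            return timer.submit(read).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    // Must be called when done, or the non-daemon timer thread keeps the JVM alive.
    void shutdown() {
        timer.shutdown();
    }
}
```

Each call pays for a queue hand-off and a thread wake-up in both directions, which is typically far more than the interval being measured at nanosecond scale; that is the loss of fine-grained resolution mentioned above.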

Is there at least an integer index of CPU you could grab at the same
time as the RDTSC? Intels have a serial number, which can be
disabled. AMDs don't.
--
Canadian Mind Products, Roedy Green.http://mindprod.com Java custom programming, consulting and coaching.

Guest

> hmm. You would have to enqueue a request to a fixed timer thread.
> That of course defeats the fine grain resolution.

Indeed.

Do you know how I can find (out of curiosity) how nanoTime() is
implemented in Java 1.5 (and/or 1.6) ? (Say under Windows XP
and under Linux).

RDTSC is flawed anyway... As I wrote in another post in this thread,
to have a "real" fine-grained RDTSC you have to flush the pipeline
before using the instruction, which in itself kind of defeats the
purpose.

Moreover, with all the CPUs that throttle their speed (such as many
notebook CPUs), RDTSC is basically useless.

And apparently on some hyper-threading systems, functions like
Windows' QueryPerformanceCounter sometimes fall back to
RDTSC...

Guest

Hi Patricia,

Patricia Shanahan wrote:
....
> I agree its a bug, but I'm not sure Java can compensate for it. The JVM
> does not necessarily know when a thread moves. The operating system does
> know, and should be providing a consistent timer at the syscall, or
> equivalent, level.

but apparently doesn't provide it.

I found a thread from 2003 on an Intel forum... (I'm pretty sure
the hundreds of megabytes of patches Windows has had since then
didn't fix that problem, and the situation on Linux doesn't seem
any better):

"You could set up the OS to support high precision
virtual timers or virtual TSC's (it's fairly trivial)
but it's not currently there in any OS"

Now I'm all ears: if someone can show me how to cleanly have a
Java high precision timer on a multi-cored-multi-cpu-hyper-
threaded-(insert latest CPU feature)-system providing nanosecond
(or sub-nanosecond) accuracy without side effect (for example
without flushing any pipeline), I'll read very carefully. It has to
work on Intel, AMDs, and all the others and also, of course, on
various OSes.

Until then, I'll code my Java apps without relying on System.nanoTime()
giving very meaningful values (i.e. without hoping it'll really provide
a high-precision timer on the various architectures the JVM runs on).

Chris Uppal wrote:
>
>>And apparently on some hyper-threading systems, methods like
>>Window's QueryPerformanceCounter sometimes falls back to
>>RDTSC...
>
>
> Do you have a link/reference for that ?
>

The implementation of QueryPerformanceCounter seems to be in the part of
the kernel that differs between single processor or multi processor
implementations. In my experience on multiprocessors it is always
implemented by the RDTSC instruction. On single processors QPC is
implemented via the timer counter.

Guest

Hi Chris,

Chris Uppal wrote:
....
> > And apparently on some hyper-threading systems, methods like
> > Window's QueryPerformanceCounter sometimes falls back to
> > RDTSC...
>
> Do you have a link/reference for that ?

Sadly, I've no handy link... But I recall reading this in more than
one place and even seeing a nice little program somehow "proving"
this. Googling and browsing endless threads in obscure forums should
eventually turn up some interesting information on the subject, but it
seems I can't find it that easily (I found some other stuff
though).

That said, Patricia was right (as usual) when she said that it's the
OS that should be providing a consistent timer.

And I wasn't entirely correct when I said that none of the OSes do
this today...

The "new way" of doing it on modern hardware is apparently
called HPET (working on both newer Intel and AMD systems):

It is not supported yet in any Windows version (apparently
Dell even disables it in some BIOSes that would otherwise provide
the functionality, on the basis that no desktop Windows uses it
yet).

But... it is already working on some other systems. For example,
some Linux kernels (if I read correctly) now have a gettimeofday()
that uses the underlying "this-time-really-high-precision-and-
consistent-amongst-threads-and-cpus-we-promise-you" HPET
timer (now that's redundant, as the 'T' is for "timer").

So it may be that some people using System.nanoTime()
are already benefiting from this new high precision event timer
after all.

Regarding my previous question "how to know how
System.nanoTime() is implemented?", I somehow
expected that the answer was "RTFS" (Read The Fine
Source), but, concretely, how do I do this?

Do I have access to all the native code too ? (and if
I want to see how some JNI method is done on
Windows but I've got a Linux JDK, does it mean I've
got to download a Windows JDK ?)

Mark Thornton wrote:
> The implementation of QueryPerformanceCounter seems to be in the part of
> the kernel that differs between single processor or multi processor
> implementations. In my experience on multiprocessors it is always
> implemented by the RDTSC instruction. On single processors QPC is
> implemented via the timer counter.

> Regarding my previous question "how to know how
> System.nanoTime() is implemented?", I somehow
> expected that the answer was "RTFS" (Read The Fine
> Source), but, concretely, how do I do this?
>
> Do I have access to all the native code too ? (and if
> I want to see how some JNI method is done on
> Windows but I've got a Linux JDK, does it mean I've
> got to download a Windows JDK ?)

You can download the entire platform source from the normal download page:

Even more than normal, check the license /very/ carefully before accepting it.
It is /not/ the same licence as the JDK or JRE. In fact, there are two
licences you can opt for, one of which is entirely abominable, the other of
which might be acceptable.

That contains (afaik) the entire source for Windows, Linux, and Solaris builds,
including the C++ source for the JVM and the native methods. Pretty big[*]
and, though it's not badly structured, it may take you a while to learn your
way around it. It helps if you are reasonably familiar with JNI.

([*] around 200 meg, nearly 20K files.)

You'll find (if you do accept the licence) the C++ method
os::elapsed_counter(), which is where the nano timer gets its data, defined in
several files (according to OS). The Windows implementation is in:
<root>/hotspot/src/os/win32/vm/os_win32.cpp
