FaqProfilerSampledInstrumentation

Which method of tracking methods should I use, exact or sampled?

During CPU profiling, a "method entry" call is injected at the beginning of each profiled method, and a "method exit" call before each return. These "method entry"/"method exit" calls generate a timestamp. You can choose from the following options to track methods:

Exact Call Tree and Timing

Exact Call Tree, Sampled Timing

Which do I choose?

The difference between exact and sampled instrumentation is when the timestamp on the instrumented methods is taken. Our advice is to start with exact instrumentation. If you observe that the overhead is 100 percent or greater and the call intensity is more than 10,000 calls per second (the two usually go together), consider switching to sampled instrumentation. For highly call-intensive applications, the results produced by exact and sampled instrumentation may differ slightly. In this case, sampled instrumentation typically produces more accurate results for the top 10-20 methods, whereas exact instrumentation is more accurate for the rest of the application code.

Exact Call Tree and Timing

When this option is used, both the "method entry" and "method exit" injected calls record the timestamp every time each of them is invoked. The execution time for a target application method is calculated as the difference between the two timestamps. In this way, you get locally precise timings.
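The bookkeeping described above can be sketched roughly as follows. This is a minimal, single-threaded illustration and not the profiler's actual runtime: the class name `ExactTimingSketch`, the string method identifiers, and the injectable clock are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical sketch of exact timing for one thread; names are illustrative only. */
class ExactTimingSketch {
    private final LongSupplier clock;                      // e.g. System::nanoTime
    private final Deque<String> methods = new ArrayDeque<>();
    private final Deque<Long> entryTimes = new ArrayDeque<>();
    final Map<String, Integer> invocations = new HashMap<>();
    final Map<String, Long> totalNanos = new HashMap<>();

    ExactTimingSketch(LongSupplier clock) {
        this.clock = clock;
    }

    /** Stands in for the injected "method entry" call: one timestamp per entry. */
    void methodEntry(String method) {
        methods.push(method);
        invocations.merge(method, 1, Integer::sum);
        entryTimes.push(clock.getAsLong());
    }

    /** Stands in for the injected "method exit" call: one timestamp per exit. */
    void methodExit() {
        long now = clock.getAsLong();
        String method = methods.pop();
        // Execution time is the difference between the exit and entry timestamps.
        totalNanos.merge(method, now - entryTimes.pop(), Long::sum);
    }
}
```

Constructing it with `new ExactTimingSketch(System::nanoTime)` makes every entry and every exit call the high-resolution timer, so the cost of the timer is paid twice per invocation.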

The drawback is that the OS call that returns a high-resolution timestamp is expensive: on the order of a few hundred nanoseconds on machines running Solaris or Linux, and over 1 microsecond on machines running Windows. Thus, if your application contains many small methods (a few lines of code each) that are executed frequently, the overhead of exact instrumentation can be very high: anywhere from a few tens of percent to a few thousand percent, depending on the application.
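To get a feel for these numbers, here is a back-of-the-envelope estimate. It counts only the two timer calls per invocation and ignores every other instrumentation cost, so it gives a lower bound at best. The class and method names are hypothetical, and the inputs are assumed values: 1,000,000 calls per second, with 300 ns standing in for "a few hundred nanoseconds" and 1,000 ns for the 1 microsecond figure.

```java
/** Hypothetical estimate of the overhead caused by timer calls alone. */
class TimestampOverheadEstimate {

    /**
     * Overhead as a percentage of the unprofiled run time, assuming two
     * timestamps (entry and exit) per method invocation.
     */
    static double overheadPercent(double callsPerSecond, double timestampNanos) {
        // Seconds of timer calls added per second of original execution.
        double addedSecondsPerSecond = 2.0 * callsPerSecond * timestampNanos * 1e-9;
        return addedSecondsPerSecond * 100.0;
    }

    public static void main(String[] args) {
        // Assumed inputs; real timer costs vary by OS and hardware.
        System.out.printf("300 ns timer:  %.0f%%%n", overheadPercent(1_000_000, 300));
        System.out.printf("1000 ns timer: %.0f%%%n", overheadPercent(1_000_000, 1_000));
    }
}
```

Under these assumptions the timer calls alone add about 60 percent with a 300 ns timer and about 200 percent with a 1,000 ns timer.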

Exact Call Tree, Sampled Timing

When you choose this option, timestamps are taken only on those "method entry"/"method exit" calls that fall approximately at the end of each specified sampling period. This option is a hybrid that combines the advantage of traditional instrumentation (counting the exact number of method invocations) with the advantage of traditional sampling (small overhead). When this method is used, the "method entry" and "method exit" calls count the number of invocations, but do not take a timestamp every time they are called. Instead, they check a per-thread flag that a separate thread of execution, managed by the IDE, sets at a specified period.

When the flag for a thread is true, the next call to "method entry" or "method exit", whichever comes first, takes a timestamp. It then charges the difference between this timestamp and the previous one recorded in the same way to the method that is currently on top of the thread's stack. In this way, the number of calls to the OS high-precision timer is reduced dramatically.
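The flag-based scheme can be illustrated with the following sketch. It models a single profiled thread and is not the IDE's actual implementation: `SampledTimingSketch`, the `sampleDue` flag, the 10 ms period, and the use of a `ScheduledExecutorService` as the flag-setting thread are all assumptions chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongSupplier;

/** Hypothetical sketch of sampled timing for one thread; names are illustrative only. */
class SampledTimingSketch {
    final AtomicBoolean sampleDue = new AtomicBoolean(false); // set by the sampler thread
    final Map<String, Integer> invocations = new HashMap<>();
    final Map<String, Long> chargedNanos = new HashMap<>();
    int timestampsTaken;                                      // timer calls made while sampling

    private final Deque<String> stack = new ArrayDeque<>();
    private final LongSupplier clock;
    private long lastTimestamp;

    SampledTimingSketch(LongSupplier clock) {
        this.clock = clock;
        this.lastTimestamp = clock.getAsLong();
    }

    void methodEntry(String method) {
        chargeIfDue();                       // charge the caller before pushing the callee
        stack.push(method);
        invocations.merge(method, 1, Integer::sum); // exact count on every call
    }

    void methodExit() {
        chargeIfDue();                       // charge the method that is about to return
        stack.pop();
    }

    private void chargeIfDue() {
        // Cheap flag check on every call; the timer is read only when the flag is set.
        if (!sampleDue.compareAndSet(true, false)) {
            return;
        }
        long now = clock.getAsLong();
        timestampsTaken++;
        String top = stack.peek();
        if (top != null) {
            chargedNanos.merge(top, now - lastTimestamp, Long::sum);
        }
        lastTimestamp = now;
    }

    public static void main(String[] args) {
        SampledTimingSketch p = new SampledTimingSketch(System::nanoTime);
        ScheduledExecutorService sampler = Executors.newSingleThreadScheduledExecutor();
        // Assumed 10 ms sampling period.
        sampler.scheduleAtFixedRate(() -> p.sampleDue.set(true), 10, 10, TimeUnit.MILLISECONDS);
        p.methodEntry("main");
        for (int i = 0; i < 5_000_000; i++) {
            p.methodEntry("work");
            p.methodExit();
        }
        p.methodExit();
        sampler.shutdownNow();
        System.out.println(p.invocations + " timestamps=" + p.timestampsTaken);
    }
}
```

In a run like the one in `main`, the invocation counts stay exact, while the number of timer reads depends on how many sampling periods elapse rather than on how many calls are made.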

For highly call-intensive applications that make around 10,000 – 1,000,000 calls per second, this translates into a very significant overhead reduction (10 times or more). Furthermore, it appears that for such applications this profiling method may actually give more precise results. This is because it tends to create less disruption to optimizations that the dynamic compiler in the JVM and the CPU may apply in the course of the program execution.

The only drawback of this scheme is that it will not give you precise results for methods that are executed infrequently and for a short time. However, unlike traditional sampling, it will at least record the exact number of invocations for these methods.