Re: [Sbcl-devel] cpu load

On Sun, Oct 26, 2008 at 1:34 AM, David De La Harpe Golden
<david@...> wrote:
Just to clarify:
CPU percentage is just a ratio between real time the process has been
running, and the total run time (system + user) it has spent. There is
no way to ask "how much CPU is this process consuming right now": the
best you can do is ask "how much CPU has this process spent over
period X (which might be the lifetime of the process of something
else)".
You can measure this simply enough even with just
GET-INTERNAL-REAL-TIME and GET-INTERNAL-RUN-TIME:
(defun measure-cpu (function &rest arguments)
"Call FUNCTION with ARGUMENTS, and return the approximate
CPU percentage spent as a single float."
(declare (dynamic-extent arguments))
(let ((start-real (get-internal-real-time))
(start-run (get-internal-run-time)))
(apply function arguments)
(let* ((end-real (get-internal-real-time))
(end-run (get-internal-run-time))
(real (- end-real start-real))
(run (- end-run start-run)))
(* 100.0 (/ run real)))))
;; Just for testing: busy-wait SPIN seconds, and sleep SLEEP seconds.
(defun spin-and-sleep (&key (spin 0.0) (sleep 0.0))
(let ((target (truncate
(+ (* spin internal-time-units-per-second)
(get-internal-real-time)))))
(loop while (> target (get-internal-real-time)))
(sleep sleep)))
(measure-cpu (lambda () (spin-and-sleep :spin 0.0 :sleep 1.0))) ; => 0.0
(measure-cpu (lambda () (spin-and-sleep :spin 0.1 :sleep 0.9))) ; => 9.700001
(measure-cpu (lambda () (spin-and-sleep :spin 0.5 :sleep 0.5))) ; => 49.1
(measure-cpu (lambda () (spin-and-sleep :spin 0.9 :sleep 0.1))) ; => 88.0
or by using TIME, or SB-EXT:CALL-WITH-TIMING. These are, however,
times for the whole process.
If you need per-thread measurement, the gettid() call David explained
can be use to obtain the task id, after which you can read
/proc/<pid>/stat/task/<tid> to get the per-thread timings from which
to compute the CPU percentage. Consulting man proc should provide the
details, and Google probably is of help there too.
However, if I understand correctly, you want to figure out where your
CPU is spinning -- not how much it is spinning. For this, SB-SPROF
should do admirably.
http://www.sbcl.org/manual/Statistical-Profiler.html#Statistical-Profiler
Run it in the default :CPU mode, and you should see where your time is
being spent. (Note, however, the difference between START-PROFILING
and WITH-PROFILING regarding which threads are profiled.)
Cheers,
-- Nikodemus

Thread view

I'm running a hunchtentoot server with emacs - slime - sbcl (amd64).
From time to time 'top' shows 99,x % cpu (user) load
for this process.
The application works fine without delay, I can work in the REPL
send http requests.
With list-all-threads et.al. I do not get the load.
I haven't a clue how to debug this problem.
How can I request the load of the threads,etc?
Helmut

Helmut G. Enders wrote:
> I'm running a hunchtentoot server with emacs - slime - sbcl (amd64).
> From time to time 'top' shows 99,x % cpu (user) load
> for this process.
>
> The application works fine without delay, I can work in the REPL
> send http requests.
>
> With list-all-threads et.al. I do not get the load.
> I haven't a clue how to debug this problem.
>
> How can I request the load of the threads,etc?
>
Linux-only:
Well.... (though the profiler might help if different threads are
running different functions ...which they probably aren't if you're
doing web serving), note that "top -H" on linux should show individual
threads and their load as far as the system is concerned. Once you have
the kernel TID (not the same as the pthread id, BTW) of the offending
thread (On linux, kernel TIDs are basicially just special PIDs, since
threads are just special processes, it's found in the "PID" column in
"top -H"), you "just" have to match it to the corresponding lisp thread...
But I couldn't find it immediately visible anywhere in the sb-threads
implementation*. So making threads do their own gettid syscall and
stash the result somewhere is an (evil) option.
; 186 is the gettid syscall number for linux/amd64, see
; /usr/include/asm/unistd.h - they vary between
; linux ports to different archs
(defun gettid () (sb-unix::int-syscall ("syscall" int) 186))
Note that because of >mumble, handwave< this probably won't work if you
try to use sb-thread:interrupt-thread to interrupt an existing thread to
run gettid. However, if you can arrange for the thread to call it itself
in its normal execution flow it should work and return a number matching
the "PID" column in "top -H" or the "LWP" column in "ps -eLf" output.
(*Of course, kernel tids are considered an implementation detail at the
glibc/pthread level and are not officially exposed by them...)

On Sun, Oct 26, 2008 at 1:34 AM, David De La Harpe Golden
<david@...> wrote:
Just to clarify:
CPU percentage is just a ratio between real time the process has been
running, and the total run time (system + user) it has spent. There is
no way to ask "how much CPU is this process consuming right now": the
best you can do is ask "how much CPU has this process spent over
period X (which might be the lifetime of the process of something
else)".
You can measure this simply enough even with just
GET-INTERNAL-REAL-TIME and GET-INTERNAL-RUN-TIME:
(defun measure-cpu (function &rest arguments)
"Call FUNCTION with ARGUMENTS, and return the approximate
CPU percentage spent as a single float."
(declare (dynamic-extent arguments))
(let ((start-real (get-internal-real-time))
(start-run (get-internal-run-time)))
(apply function arguments)
(let* ((end-real (get-internal-real-time))
(end-run (get-internal-run-time))
(real (- end-real start-real))
(run (- end-run start-run)))
(* 100.0 (/ run real)))))
;; Just for testing: busy-wait SPIN seconds, and sleep SLEEP seconds.
(defun spin-and-sleep (&key (spin 0.0) (sleep 0.0))
(let ((target (truncate
(+ (* spin internal-time-units-per-second)
(get-internal-real-time)))))
(loop while (> target (get-internal-real-time)))
(sleep sleep)))
(measure-cpu (lambda () (spin-and-sleep :spin 0.0 :sleep 1.0))) ; => 0.0
(measure-cpu (lambda () (spin-and-sleep :spin 0.1 :sleep 0.9))) ; => 9.700001
(measure-cpu (lambda () (spin-and-sleep :spin 0.5 :sleep 0.5))) ; => 49.1
(measure-cpu (lambda () (spin-and-sleep :spin 0.9 :sleep 0.1))) ; => 88.0
or by using TIME, or SB-EXT:CALL-WITH-TIMING. These are, however,
times for the whole process.
If you need per-thread measurement, the gettid() call David explained
can be use to obtain the task id, after which you can read
/proc/<pid>/stat/task/<tid> to get the per-thread timings from which
to compute the CPU percentage. Consulting man proc should provide the
details, and Google probably is of help there too.
However, if I understand correctly, you want to figure out where your
CPU is spinning -- not how much it is spinning. For this, SB-SPROF
should do admirably.
http://www.sbcl.org/manual/Statistical-Profiler.html#Statistical-Profiler
Run it in the default :CPU mode, and you should see where your time is
being spent. (Note, however, the difference between START-PROFILING
and WITH-PROFILING regarding which threads are profiled.)
Cheers,
-- Nikodemus

Nikodemus Siivola wrote:
>
> However, if I understand correctly, you want to figure out where your
> CPU is spinning -- not how much it is spinning. For this, SB-SPROF
> should do admirably.
But just to note, neither the classic nor statistical profiler currently
gives you /per-thread-per-function/ timings (Er. Unless they do and I've
just missed them), which might be useful if several threads are running
the same function, but could have significantly variable timings
depending on per-thread conditions (like whether a normal or malicious
client connected).
You can trim the statistical one to sample only particular threads, but
something that might be useful would be "yes please sample all threads,
but tot up and report the call counts on a per-thread basis not
aggregate across all sampled threads when you do it".