Finding the cause of inexplicable warnings in XS code

where the code at that line is a call to an XS subroutine (let’s call it xsub()) and you’re certain that the arguments you’re passing are not undefined.

Somewhere, deep in the XS/C code, an undefined value is being used. But where? And why is perl reporting that line?

Perl is reporting the last line of perl code that was executed at the same or higher level in the stack. So other perl code, such as a callback, may have been executed between entering xsub() and the warning being generated, but that perl code must have returned before the warning was triggered.

Assuming XS/C code is large and complex, like mod_perl, how can you locate the code that’s triggering the warning?

That make the program abort and generate a core dump file at the point the warning is generated. You can then use a debugger, or Devel::CoreStack, to report the C call stack at the time. It’s a savage but effective technique.

If the XS/C code was compiled with options to keep debug info (i.e., -g) then that’ll show you exactly where in the XS/C code the undefined value is being used. If not, then it’ll at least show you the name of the XS/C function and the overall call stack.

(The dump function is a curious vestige of old ways. You could use kill(9, $$). I’m not sure about the portability of either, for this purpose, beyond unix-like systems.)

I suggested the technique to Graham Barr recently and it proved effective in tracking down the source of that warning in a very large mod_perl application. The warning pointed the finger at a $r->internal_redirect($uri) call. The actual cause was a PerlInitHandler returning undef. (The handler was an old version of DashProfiler::start_sample_period_all_profiles.)

Anyway, it dawned on me this morning that I should update the technique. It doesn’t have to be so savage. On modern systems you don’t need to shoot the process dead to get a C stack trace.

A few approaches came to mind:

spawn a “gcore $$” command (or similar) to get a core file from the running process

spawn a “gdb $$ &” (to attach to the running process) followed immediately by kill(17, $$) to send a SIGSTOP to the process to give time for the debugger to attach and for you to investigate the state of the live process.