Now that the kernel stands up fine to running user space probes,
I need to figure out where they are, or what to do with them.

They don't show up in 'dtrace -l', but I suspect that's more of
a misunderstanding on my part about what/how.

The kernel does have the PID provider code linked in (fasttrap.c
and related files), but some stuff is still commented out.

Perusing the Solaris kernel shows I have a number of things missing,
such as static probes on process creation/exit, and other
subsystems. This will require some hacking to get working, but I
*don't think* it is important just now. (Hacking means I am going to
have to disassemble these functions and patch in static probes, unless
I can find good hooks within the kernel which I can daisy chain onto).

There's complexity in handling user programs, since the dtrace libraries
rely on the procfs way of manipulating applications, whereas Linux
has ptrace() along with the new fprobe() (?) interface.

I am tempted to create a procfs() driver for Linux to hide this
complexity. I have never found the ptrace() syscall interface friendly
for multithreaded apps, but then it's very easy to make mistakes and
take a good while to debug them.

Am just poking around at the moment. This is the final lynchpin of
dtrace -- if this can be made to work then the rest is just quality control.

I got my first quantize() graph out of dtrace today (it's always
worked, I had just never gotten around to trying it out).

Any form of programming results in bugs - unexpected behavior.
A driver is no different - but the bugs are harder to fathom
because many of your favorite debugging aids cannot work in
kernel space.

My solution is printk(). It's very low tech but it works...

I am a fan of gdb and have written my own x86 kernel debuggers
(for x386 and x186 processors), but porting them is a pain. Now we
have 64-bit chips, I have to decide if I want to port my ancient
code. (The debugger is powerful but nothing grand).

In a driver you have to crawl - one atom at a time; enabling huge
wads of code and expecting it to work, well, er, won't.
Disabling lots of code, ensuring the structure is there, and then adding
a sprinkling of print statements works well.
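The "sprinkling of print statements" can be sketched as below. This is user-space C so it compiles anywhere; trace_format() and TRACE() are made-up names for illustration, and in a real driver the fprintf() call would be a printk().

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "func:line: message" into buf.
 * Returns the length the full string needs (snprintf convention),
 * or a negative value on a formatting error.
 */
int trace_format(char *buf, size_t len, const char *func, int line,
                 const char *fmt, ...)
{
    va_list ap;
    int n, m;

    n = snprintf(buf, len, "%s:%d: ", func, line);
    if (n < 0 || (size_t)n >= len)
        return n;               /* error, or prefix already truncated */

    va_start(ap, fmt);
    m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
    va_end(ap);
    return m < 0 ? m : n + m;
}

/* Hypothetical macro: tags each message with the calling function and line.
 * In kernel code, replace the fprintf() with printk(). */
#define TRACE(...)                                                     \
    do {                                                               \
        char trace_buf[256];                                           \
        trace_format(trace_buf, sizeof trace_buf, __func__, __LINE__,  \
                     __VA_ARGS__);                                     \
        fprintf(stderr, "%s\n", trace_buf);                            \
    } while (0)
```

A line such as TRACE("fd=%d", fd); dropped into each newly enabled piece of code shows how far execution got before things went wrong.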

I try to use VMware whilst debugging, since bad pointers can corrupt
filesystems; losing hundreds of gigs of filesystem is not nice,
nor is waiting to fsck or reformat/reinstall.

Linux has a nice feature: GPFs (bad pointers) are caught
and logs written to /var/log/messages.
If you are lucky no reboot is needed.

Having a stripped down startup is essential - being able to reboot
in about 10s is ideal - no waiting for GUI startups. (I use rlogin
or telnet or ssh into the vm session).

Mutex debugging is a pain, but I have found that Linux has
a drop dead timer: if the kernel is unresponsive for 10+s,
a message is printed on the console. Next is a reboot (don't
resume a VM snapshot, since you will not have access to what went wrong).

After reboot, /var/log/messages will have your printk() statements
to help track down where you got stuck.

Just a minor update on dtrace and USDT progress.
As detailed in the last couple of entries, USDT work
is progressing; we can now generate and compile ELF binaries
which include user space probes.

I have been working on the /dev/dtrace_helper driver code,
which is used by a user space app, to find where it's not working.
This has led me into a corner of the Linux port which had been
stubbed out or #define'd to compile, saving the work required
to resolve it for a rainy day.

That day arrived: much of the Linux kernel driver code is just
plain ol' C with a few bits of assembler. However, the dtrace helper
ioctl() code needs to be able to find and store attributes of real
processes. So, my process-shadowing veneer needs to be stronger
than it is. This is the key code which negates the need to
change the GPL Linux kernel source; because the dtrace driver
is kept free of GPL code, it needs a way to find stuff which is
not normally of concern to a driver.

This is not so much a GPL vs CDDL licensing issue, but more of an issue
in that the Linux kernel changes, sometimes quite dramatically, from
one release to another. If dtrace is not a part of the kernel, then
it needs to be a good citizen and provide easy adaptation to new
kernels, or kernels compiled with differing compile time options.

Applications like VMware have a similar issue - many times
a new kernel will come out and VMware won't work on it, since the
compiled code doesn't conform to the new headers or functions.

Of the many thousands of lines of code in dtrace, only a very
few (10-20) care about this aspect of the kernel, e.g. convert a PID
to a process structure, store/retrieve attributes affecting scheduling.
But these few lines are the most difficult...or not...

Other tasks have been taking my time this week, so I shall be going
back into the water in a few days....

(This text is located in the doc/usdt.html file in the dtrace/linux
distribution; it will be updated as I can confirm implementation
details).

User Defined Tracing - How it works

USDT is a mechanism for user land applications to embed their
own probes into executables. For example, a Perl or Python
interpreter might use it to gain access to stack traces of applications
which are already started.

The goal of the DTrace team was near zero overhead when not invoked.
This works well - even commercial applications can embed
probes and not worry about performance or run time dependencies.

The DTRACE_PROBEx macros translate into a function call.
To gain near-zero overhead, during linking, the function call
is replaced by a series of NOP instructions. Here's an example disassembly
of the above

When an application is built, dtrace is run on the
object files to rewrite the objects, stubbing out the calls
for probes, and creating a table in the executable of the
places where the stubs are located. (The code
for this is located in libdtrace/dt_link.c).
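The stubbing step can be sketched in a few lines of C. This is a toy, not the code in dt_link.c: it assumes x86, where a near call is the opcode 0xE8 followed by a 4-byte displacement and a NOP is 0x90, and it is handed the call offsets rather than finding them from the object file's relocations. stub_probe_calls() and struct probe_site are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X86_CALL  0xE8   /* near call: opcode + 4-byte displacement */
#define X86_NOP   0x90
#define CALL_LEN  5

/* Hypothetical table entry: where a stubbed-out probe lives in the text. */
struct probe_site {
    size_t offset;
};

/*
 * For each candidate offset, if a 5-byte call starts there, overwrite it
 * with NOPs and record the offset in 'table'. Returns the number of sites
 * recorded (never more than table_len).
 */
size_t stub_probe_calls(uint8_t *text, size_t text_len,
                        const size_t *call_offsets, size_t n_calls,
                        struct probe_site *table, size_t table_len)
{
    size_t i, n = 0;

    for (i = 0; i < n_calls && n < table_len; i++) {
        size_t off = call_offsets[i];

        if (off > text_len || text_len - off < CALL_LEN)
            continue;           /* call would run off the end of the text */
        if (text[off] != X86_CALL)
            continue;           /* not a call instruction: leave it alone */

        memset(text + off, X86_NOP, CALL_LEN);
        table[n++].offset = off;
    }
    return n;
}
```

The table is what makes the probes findable later: without it, the NOPs would be indistinguishable from ordinary padding.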

When the application is started up, a piece of code is
executed (before main() is called). [Code located in
libdtrace/drti.c]. This code looks at the current system
to see if dtrace is loaded into the kernel, and communicates
with the /dev/dtrace/helper driver (/dev/dtrace_helper in the
Linux port) to inform it that new probes are available in this process.
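The startup hook can be sketched as below. This is an illustration only, not the contents of drti.c: it assumes the GCC/Clang constructor attribute as the way to run before main(), and it only checks whether the helper device can be opened. Handing the probe table to the driver is left as a comment. helper_open() and usdt_init() are hypothetical names.

```c
#include <fcntl.h>
#include <unistd.h>

/* Device name used by the Linux port; Solaris uses /dev/dtrace/helper. */
#define HELPER_DEVICE "/dev/dtrace_helper"

/* Hypothetical helper: returns an open fd, or -1 if the device is absent. */
int helper_open(const char *path)
{
    return open(path, O_RDWR);
}

/*
 * Runs before main(). If the device is missing, dtrace is not loaded and
 * the application simply carries on with its probes left as NOPs.
 */
__attribute__((constructor))
static void usdt_init(void)
{
    int fd = helper_open(HELPER_DEVICE);

    if (fd < 0)
        return;                 /* no dtrace: nothing to register */

    /* Real code would pass the probe table to the driver via ioctl() here. */
    close(fd);
}
```

Failing silently matters here: an application with embedded probes must still run on a machine that has no dtrace driver at all.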

Voila! We are done. Or nearly.

At this point, whilst the application is running, 'dtrace -l'
should reveal your new probes.

The Kernel

When a user elects to monitor the probe, the patched (NOP-ed) code
will be changed into a call back into the kernel, to notify it that the
function/probe is being invoked.
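Arming and disarming a probe site can be sketched as a one-byte patch. The mechanism here is an assumption, not something confirmed for this port: the sketch writes the x86 breakpoint trap (int3, 0xCC) over the first NOP, which is a common way for a tracer to get control in the kernel. probe_enable() and probe_disable() are hypothetical names, and a plain byte array stands in for the process text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define X86_NOP   0x90
#define X86_INT3  0xCC   /* one-byte breakpoint trap (assumed mechanism) */

/* Arm a probe site. Returns 0 on success, -1 if the site is not a NOP. */
int probe_enable(uint8_t *text, size_t text_len, size_t offset)
{
    assert(text != NULL);
    if (offset >= text_len || text[offset] != X86_NOP)
        return -1;
    text[offset] = X86_INT3;
    return 0;
}

/* Disarm a probe site. Returns 0 on success, -1 if it was not armed. */
int probe_disable(uint8_t *text, size_t text_len, size_t offset)
{
    assert(text != NULL);
    if (offset >= text_len || text[offset] != X86_INT3)
        return -1;
    text[offset] = X86_NOP;
    return 0;
}
```

Checking the existing byte before patching guards against arming the same site twice or scribbling over code that was never a probe.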

I think the generation of a compiled executable now works.
I can create an executable which links in the dtrace probes,
and have some (hopefully minor) fixes to make to drti.o so that it
can communicate with the driver.

Next step is ensuring the driver helper functions are enabled
and to test it out with a simple example in the release.

It's been a slow week trying to debug just a few lines of code
in dt_link.c, as GNU object files and Sun ones are subtly different
in the way undefined symbols are defined in object files.

I have managed to get back into the swing of things on DTrace.
My focus this week has been user defined dtrace probes (USDT).
I have created a sample script in the usdt/ dir so I can work
through what needs to be done.

It's not finished yet, but here's what I have found:

libdtrace/drti.c needs to be compiled to an object file, and
linked in with target apps. (This now compiles).

"dtrace -G" is the magic used to convert a prototype file to
the object file needed to link with the application to be probed.

We need the dtrace 'helper' device - something I had commented
out early on, since I wasn't sure what it was. This is now
enabled as /dev/dtrace_helper. (If I can work out how to create
a /dev/dtrace/ dir, then I can more closely mimic Solaris; not
a big deal whether this is done).

libdtrace/dt_link.c has needed a few minor mods to fit in with
the new device name, plus some workarounds for assumptions about
/usr/ccs/, which is not the compiler directory under Linux.

A small complication is the need to store state on a per proc/task
basis; the shadow mechanism (par_alloc) is used for this.
(I need to intercept process/task death to do the garbage
collection; subject for another day).
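The shadowing idea, keeping dtrace's per-process state in a side table instead of in the kernel's own process structure, can be sketched as a small hash table keyed by PID. This is user-space C for illustration only: shadow_alloc(), shadow_find() and shadow_free() are hypothetical names, not the par_alloc interface, and shadow_free() stands in for the cleanup that would run on process death.

```c
#include <stdlib.h>
#include <sys/types.h>

#define SHADOW_BUCKETS 64

struct shadow {
    pid_t          pid;
    void          *data;    /* dtrace-private state for this process */
    struct shadow *next;    /* next entry in the same hash bucket */
};

static struct shadow *shadow_tab[SHADOW_BUCKETS];

static unsigned shadow_hash(pid_t pid)
{
    return (unsigned)pid % SHADOW_BUCKETS;
}

/* Look up the shadow for a PID; NULL if none exists. */
struct shadow *shadow_find(pid_t pid)
{
    struct shadow *s;

    for (s = shadow_tab[shadow_hash(pid)]; s != NULL; s = s->next)
        if (s->pid == pid)
            return s;
    return NULL;
}

/* Return the existing shadow, or create one with 'size' zeroed bytes. */
struct shadow *shadow_alloc(pid_t pid, size_t size)
{
    struct shadow *s = shadow_find(pid);
    unsigned h = shadow_hash(pid);

    if (s != NULL)
        return s;
    if ((s = malloc(sizeof *s)) == NULL)
        return NULL;
    if ((s->data = calloc(1, size ? size : 1)) == NULL) {
        free(s);
        return NULL;
    }
    s->pid = pid;
    s->next = shadow_tab[h];
    shadow_tab[h] = s;
    return s;
}

/* Garbage collection on process death: 0 if freed, -1 if not found. */
int shadow_free(pid_t pid)
{
    struct shadow **pp;

    for (pp = &shadow_tab[shadow_hash(pid)]; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->pid == pid) {
            struct shadow *s = *pp;

            *pp = s->next;
            free(s->data);
            free(s);
            return 0;
        }
    }
    return -1;
}
```

A kernel version would also need locking around the table, and something has to call the free routine when a process exits, which is exactly the interception problem noted above.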

This will be a major milestone to get this working; people have
been asking about USDT for Perl/Ruby, and I want to put a
probe into CRiSP - just so I can understand the ins and outs.

I had recently taken a short break from dtrace whilst enhancing
fcterm to include the following features:

Infinite scrollback (sort of; it now spills to files and can
page in from the files, but you don't really want infinite scroll).

Performance: now faster than all the competition.

Auto-restart: if the X server crashes, fcterm will carry all of your
pty state over to the newly started X server, without missing a heartbeat.
Yes, it waits for a new X server to come along and continues from where
you left off; no more lost editing sessions, or shell sessions.

Fixed a typo in dtrace today...time to go back in the water...
stay tuned....