This example is taken from the simple-c example app bundled in
the distribution. At this point, the target app died with a SIGTRAP
since I haven't finished testing.

What made this work?

The code I have ported (my fault) confuses Sun's 'regs' array with
Linux's 'pt_regs' array. I've done some mappings so we get the
correct interrupt level context, but had to comment out a few
references to unsupported registers on Linux (e.g. %gs, %fs, etc).
I assume the references are needed in probe context for D apps
that want them.

Shame that the target user space binary died, but now hopefully I
can make even more progress, e.g. for the other trap types
(which I don't fully understand yet, but then, I am being thick).

I find Linux depressing. It really is depressing.
The whole lot is out of control, all in the interests of
kindness.

I upgraded one laptop last week to Ubuntu 8.10. Worked like a dream.

Until I tried to suspend the laptop: sometimes it wouldn't recover,
and most of the time the stupid, very very stupid NetworkManager
... didn't. After some research, I found a kit to replace this.

But, still the same issues ... not working reliably after suspending.
And definitely not reconnecting to the WPA wifi.

I can't believe how horrible and complicated this has become. In the
old days there were just config files and /etc/rc.d files to worry
about. Now there are unnamed daemons controlling everything, with layers
of too much complexity.

I updated the kernel to 2.6.28.2. I lost my sound. I lose my sound
every time I build a stock kernel, and I still don't know what/where.

Then my master server - Fedora Core 8. I tried to upgrade that
to 9 or 10, and now it too won't restore (non-WiFi) ethernet on
start, without me going to the other room to poke it with
a sharp and hot stick ('ifconfig eth0 down; ifconfig eth0 up; route add default gw ....').

Honestly, I am ready to retire and give up on this.

At this rate, Windows 7 will be my prime operating system, or I
am going to live in a cave where people don't upgrade things that
weren't broken.

Of course, it's all my fault. I thought it was time to get 'real'.
Silly me.

Job done: From now on, only two Linux releases of CRiSP will be
produced - linux-x86_32 and linux-x86_64.

This is presently being produced on a Fedora Core (FC8) box running
glibc2.7, but runs on AS2.1, AS3, AS4, Ubuntu 7/8 (and probably 9+ as well).

Having tracked down what was causing such dependencies, and the
arms-race to keep up with distros, I found that really only a couple
of things caused this. Strangely, the ancient C functions in ctype.h
(isalpha(), isdigit(), etc) were the biggest nuisance, since later
glibcs use GCC smarts and libc versioning to disallow a new binary
running on an earlier release.

I wrote a tool to patch the ELF section headers to remove the enforced
GLIBC dependencies, and it works.

(I wasted a lot of time, because my FC8 box got updated and the X11
libs/headers moved around and I thought my unification was triggering
the bizarre errors I was getting).

(Along the way, my dynamic IP address also changed, which I only
found out when I tried to download from my own site).

And to make matters worse, one of my laptops suffered a "I am going
to update apt-get and break your system badly". This was an old
Knoppix release - which was nice since it was an old GLIBC release,
but an upgrade pushed me into glibc2.7 territory, invalidating the
first rule of software development: don't update things because it
seems like a good idea. It isn't :-)

So, now that laptop gets the Ubuntu treatment. I'm feeling a lot
happier with apt-get, and now two systems (plus 1 vmware) are all
on Ubuntu, with the master being RedHat (which is now a black sheep
because it has no easy upgrade without pain of a big download or
Ubuntu). Still, diversity is good.

What else would I do if I didn't have to fight silly issues ?!
Dtrace maybe ....

Yes, but now I need to fix fcterm - the terminal emulator, which
added to the chores above by core dumping whilst running gdb.
(UTF-8).

I can now intercept application INT3 breakpoint traps, and
pass them into dtrace. It's not quite right yet (and, if you load
dtrace into your kernel, it will presently break gdb and single step /
breakpoints), but I hope to fix that.

So now, we can have a USDT app tell the kernel it has probes, have
/usr/bin/dtrace monitor the probes, have the app hit INT3 to jump
into the kernel, and the next bit is to have the dtrace engine
talk back to the application.

I peeked at FreeBSD again, only to find all this is commented out
over there, so we are ahead in this area compared to FreeBSD.
Next is to work out some details in dtrace_user_probe(), and
just use it for a bit.

For years - since the day CRiSP for Linux was built - I have
been plagued with Linux ABI binary portability, meaning that
CRiSP has had to be built for every combination of glibc (and
now, 32+64 bit) platforms.

Why? Because, if you run a later crisp on an earlier system, the
binaries will refuse to run, complaining about glibc mismatches.

This drives me nuts. For years I had been meaning to see what the
cause was, and I was surprised. Very surprised how the glibc maintainers
could do this.

No other platform - Windows, Mac, or any other Unix - has this problem.
(Well, Mac can be nearly as bad, but definitely not Windows, or any
SVR4/BSD derivative - to my knowledge).

Take the standard C library's <ctype.h>. It has existed since
practically day 1 of the C language, providing useful functions like
isalpha(), isdigit(), etc. Did you realise that this family can cause
binary compatibility problems? Well, it does. Somewhere in glibc 2.[567] they
made these functions Unicode and obscure-aware (e.g. isalpha(EOF) should
not cause an array bounds indexing violation). So, the simple
#defines or array lookups of old are replaced with calls to a function
in libc.so, which may not exist in older libc.so's. Yuk.
This isn't an option that you turn on because you want it, and it's
almost undocumented.

So, one of the most trivial functions in libc.so is being replaced by
a private implementation.
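
To illustrate the kind of private implementation meant here, a minimal
sketch of ASCII-only replacements. They compile to plain comparisons, so
they should pull in no versioned ctype symbol from libc.so. The names
crisp_isalpha() and crisp_isdigit() are made up for this example (they are
not taken from the CRiSP source), and unlike the real <ctype.h> functions
they ignore the locale.

```c
#include <stdio.h>   /* only for EOF */

/* Hypothetical ASCII-only stand-ins for isalpha()/isdigit().
 * They are plain range checks, so nothing here refers to a
 * ctype lookup inside libc.so. Any value outside the ASCII
 * letter/digit ranges (including EOF) simply yields 0. */
static int crisp_isalpha(int c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static int crisp_isdigit(int c)
{
    return c >= '0' && c <= '9';
}
```

The trade-off is obvious: no locale support, but also no dependency on
whichever glibc happened to be on the build machine.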

pthreads is another issue - I am aware that at some point in the
past, the size of structures for pthreads changed, and this caused
portability issues for apps. Instead of hiding this in the implementation,
they use versioning of symbols.

GCC 4.x supports functions for detecting stack frame smashing,
and this is turned on by default. If you compile with -D_FORTIFY_SOURCE=0,
then these API compatibility issues are removed. (I am not advising others
to do that; I test my apps with valgrind and my own builtin memory corruption
detector).

I had to do lots of stuff to find this out, e.g.

objdump -T binary | grep GLIBC

will tell some of the story.

objdump -p binary | grep VER

will tell the rest of the story. The definitions for VERNEED, VERNEEDNUM and
VERSYM stop a later binary running on an old system. When I have
finished writing a tool to strip this out of a binary, then I can run
a glibc2.7 application on an AS2.1 (glibc 2.3 or glibc2.2).
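
The VERNEED, VERNEEDNUM and VERSYM entries that objdump -p prints live in
the ELF dynamic section as DT_VERNEED, DT_VERNEEDNUM and DT_VERSYM tags.
Here is a minimal sketch of spotting them, assuming a glibc system where
<link.h> provides ElfW() and the DT_* constants. The function name
has_version_tags() and the HAS_* flag values are invented for illustration.
This only detects the tags; it is not the stripping tool itself.

```c
#include <link.h>    /* ElfW(), and <elf.h> for the DT_* tags */

#define HAS_VERNEED     1   /* hypothetical flag values */
#define HAS_VERNEEDNUM  2
#define HAS_VERSYM      4

/* Walk a dynamic section (terminated by DT_NULL) and report, as a
 * bitmask, which of the symbol-versioning tags are present. */
static int has_version_tags(const ElfW(Dyn) *dyn)
{
    int found = 0;

    for (; dyn->d_tag != DT_NULL; dyn++) {
        switch (dyn->d_tag) {
        case DT_VERNEED:    found |= HAS_VERNEED;    break;
        case DT_VERNEEDNUM: found |= HAS_VERNEEDNUM; break;
        case DT_VERSYM:     found |= HAS_VERSYM;     break;
        default:            break;
        }
    }
    return found;
}
```

A real tool would first have to locate the dynamic section in the file
(via the program or section headers) before it could walk it like this.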

I will then be able to build just two Linux releases: 32 and 64 bit, and
use my latest development system to create a binary compatible release.

I have to say that doing this means the onus is on me to work around
why such symbol versioning occurs, but it's a nuisance.

I have lots of vmware and systems running a variety of Linux releases, but
it's an annoyance to have customers tell me that Ubuntu 8.10 isn't supported,
even though I use it myself (for dtrace work).

Just reading on the OpenBSD mailing list about ZFS for OpenBSD, and
someone saying wouldn't dtrace be better. Was wondering about that
comment. Yes, porting dtrace to OpenBSD should be easier than for
Linux, given that OpenBSD is a fairly close relative of FreeBSD.
I don't know the relative maturity of one vs the other, although
I think FreeBSD has a bigger user base, but in theory it follows
that it is doable.

Would I do it? Maybe, if someone asked. But before then ... Linux
needs to get a little bit further forward.

With regard to Linux dtrace, I have a piece of glue to place on
the interrupt vector which handles a user-space breakpoint trap.
I can see the code in Solaris, and now need to work out the best
place to put this in Linux, and that should handle the full cycle
from user-to-kernel-to-user-to-kernel which is needed for USDT.
Let me see how I can get on with this, and then some cleanups can
start to happen....

I was wondering if it was doable/viable/workable. To be honest,
I don't see why not.

I am not proposing to attempt this (not unless I am really bored
and Linux dtrace is 'finished').

But technically, most of the dtrace code is just plain-ol-C. There are
bits to hook into the kernel and userspace, but the dtrace code
is so modular and segregated that the Unix specific pieces are
actually relatively small.

For anyone who has tackled Windows device drivers (and they
are not that difficult, although they operate in a more complex
way than Unix), it should be doable.

There are more layers in Windows (core kernel, ntdll.dll, win32, user,
gdi, ...), but the fundamentals of reading/writing memory are what is crucial.

Of course, Windows doesn't support ELF, and I would hate to run
a 'dtrace -l' inside a CMD.EXE window.

Just wanted to take a detour away from dtrace for a moment. I rarely
comment or write on CRiSP, even though it is a mature baby.

Someone asked me about editing/viewing large files in CRiSP.
I thought I would crib some of the mail I sent.

Here's a question: What is the largest file you could edit on a 16-bit
machine? 32-bit? 64-bit? (CRiSP has survived these CPU architectural
changes over the years).

The answer is the same for all: how big is your hard drive.
Naive coding would lead to just loading the file into memory and
hence you would be limited to the size of RAM and addressability
of the CPU. This has never been a good thing: if you spend all
your time in the same editor for small files, you almost certainly
want to use that tool for large files too, e.g. >4GB files.
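
The alternative to loading the whole file is to read only the part that is
currently needed. A minimal sketch of that idea using POSIX pread() follows.
It is not CRiSP's actual buffering code, and the name read_window() is made
up for the example.

```c
#define _FILE_OFFSET_BITS 64   /* ask glibc for a 64-bit off_t on 32-bit Linux */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to len bytes starting at byte offset off, without touching
 * the rest of the file. Returns the number of bytes read (possibly
 * fewer than len near end of file), or -1 on error. */
static ssize_t read_window(const char *path, off_t off, void *buf, size_t len)
{
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    n = pread(fd, buf, len, off);
    close(fd);
    return n;
}
```

With a 64-bit file offset, the size of the window (not the size of the
file) is what has to fit in memory, which is why the CPU word size stops
being the limit.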

The largest file I tried to test in CRiSP is around 16 GB. I didn't
go much further (this was a 32-bit cpu), because it got boring waiting
for the file to page in via the O/S, but it works.

Of course, you can find a weak spot in this: just try taking a huge
file and do a search and replace of every character in the file. CRiSP
will attempt to save the undo information and you will wait a long
time for the I/O. At least CRiSP tries - and tries to be efficient.

So, the answer to the question is: How long do you want to wait?

CRiSP can support almost infinitely large files (up to the size of
your hard disk or filesystem), but what you do next will really
depend.

It's worth reiterating this point: whether your tool of choice can
survive being pushed to extremes, and whether its performance
degrades linearly, exponentially, or catastrophically, is an
interesting topic for technically interested people. Maybe not
for everyone.

Some degree of success! We can now run a USDT enabled
process, run dtrace on the probes of that process, and I can
see the INT 0x3 (0xcc) instruction being written to the
probe points of that proc. The kernel writes a breakpoint
instruction with the goal of /usr/bin/dtrace monitoring the
child for SIGTRAP signals. (And, presumably, to fire the
callback for the process .. not sure what happens next).
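
The breakpoint mechanics can be sketched in ordinary user-space C: save
the byte at the probe point, overwrite it with 0xcc, and keep the original
so it can be put back. The real work happens inside the kernel, on the
target process's address space. The names below (probe_arm(),
probe_disarm()) are hypothetical and are not the fasttrap ones.

```c
#include <stddef.h>

#define INT3_OPCODE 0xcc   /* x86 one-byte breakpoint instruction */

/* Overwrite the byte at text[off] with INT3 and return the byte
 * that was there, so the caller can restore it later. */
static unsigned char probe_arm(unsigned char *text, size_t off)
{
    unsigned char saved = text[off];

    text[off] = INT3_OPCODE;
    return saved;
}

/* Put the original instruction byte back. */
static void probe_disarm(unsigned char *text, size_t off, unsigned char saved)
{
    text[off] = saved;
}
```

Disassembling the target while it runs should show exactly this: a 0xcc
where the first byte of the original instruction used to be.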

Found it! After days/weeks of perusing source code, trying
to understand the PID provider and fasttrap code, and pulling
(what little) hair I have out, I found it.

When a user space app registers itself as a provider, it would not
show up in 'dtrace -l'. Why?

Because I am stupid and missed the blindingly obvious.

Fasttrap.c has a limit on how many user space providers can
be created - to avoid crashing or DoSing the kernel. But I forgot
(or rather, didn't realise) the variable was not set. (In Sun land,
they read the attributes from kernel config variables, but I had
commented that out).
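
The failure mode is easy to reproduce in miniature. This is a sketch with
invented names (max_providers, provider_register()) rather than the real
fasttrap.c ones, and whether the real code refuses in exactly this way is
an assumption. The point is only that a limit which is never initialised
stays at zero, so every registration is turned away.

```c
#include <errno.h>

/* Hypothetical stand-ins for the limit and the running count.
 * Static storage means max_providers starts out as 0 unless
 * something (e.g. a config lookup) assigns it. */
static unsigned int max_providers;
static unsigned int num_providers;

/* Returns 0 on success, ENOMEM once the limit is reached. */
static int provider_register(void)
{
    if (num_providers >= max_providers)
        return ENOMEM;
    num_providers++;
    return 0;
}
```

Commenting out the code that reads the limit from configuration leaves
the check intact but the limit at zero, which is a quiet way to fail.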

Progress is slow at the moment. In the continuing battle
to get USDT to work, I am reaching some roadblocks.

The 'easy' part was getting core dtrace into the kernel - wherever
something was wrong, I would crash the kernel, so, I could track
down where it broke and work backwards.

With USDT it's slightly different. After getting a userland binary
to have probes in it, it runs and tells the kernel it is probeable.
Kernel trace messages show the probe exists, yet 'dtrace -l' doesn't
list the probe. (I am using MacOS to compare what *should* happen
with what *does* happen on Linux).
I am obviously missing something here.

It's a bit of chicken-and-egg trying to work out the flaw, e.g.
it could be the userspace implementation not being complete, or
it could be a silliness in the kernel, or even something I have
forgotten to do.

Interestingly, when running a USDT app, it declares the probes, and
you can see them (e.g. on the Mac) with 'dtrace -l'.

You can run in two ways: run the app on its own and attach to the probe
with dtrace, or do both together, launching dtrace to fire up the app and
monitor the probes.

Interestingly, on the Mac, gcc seems to have some enhancements to allow
the inline probe declarations to work. Statically disassembling
the binary and disassembling whilst the app is running shows
the kernel correctly putting in "INT 3" instructions into the
userspace code area.

It's possible on Linux that dtrace is too divorced from the real
kernel, or I just had something stubbed out.

I also hit a problem with "dtrace -c ..." in Linux. I don't know
if this is a pthreads issue or a Linux issue, but Linux doesn't
allow ptrace(PTRACE_CONT) to be executed from a child thread
when the target process was fork()ed from the main thread.
In Linux, the target proc and the controlling thread are like
siblings instead of parent-child. (I solved this temporarily by
moving fork/exec creation to the monitoring thread, but it's still
a bit flaky).
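
The workaround amounts to keeping fork(), the waits, and every ptrace()
request in one thread. A minimal Linux-only sketch of that arrangement
follows. run_traced() is a made-up name, and the real dtrace code does far
more between stops than simply continuing the child.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork and exec argv[0] under ptrace, then keep continuing it from
 * this same thread until it exits. Returns the child's exit code,
 * or -1 on error or abnormal termination. */
static int run_traced(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: ask to be traced by the parent, then exec. */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execv(argv[0], argv);
        _exit(127);             /* exec failed */
    }

    /* Parent: a traced child stops with SIGTRAP once exec succeeds. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    while (WIFSTOPPED(status)) {
        /* Swallow the SIGTRAP stops, pass any other signal through. */
        int sig = (WSTOPSIG(status) == SIGTRAP) ? 0 : WSTOPSIG(status);

        if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig) < 0)
            return -1;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the PTRACE_CONT were issued from a different thread than the one that
called fork(), this is where Linux would refuse the request.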

I am spending a lot of time statically reviewing the dtrace code
to work out where the problem is. I can find lots of code
I want to be executed to handle USDT, but, am missing a vital
cog to make it hang together...

FreeBSD 7.1 came out this week to a mild amount of fanfare.
That's a good thing. It's great that people spend a lot of effort
on distros for themselves and their own communities.

I grabbed the distro and the source to see what had changed in dtrace.
It looks like "not a lot" from the source snapshots I had earlier
in 2008.

Alas, disappointingly, USDT dtrace doesn't work. (I couldn't get
dtrace to work at all in FreeBSD from the stock download for x86-64;
I guess I need to rebuild the kernel).

Searching the web reveals user land tracing is not complete.
This is a shame, because I have been using the FreeBSD model
of implementation for Linux. I have had a hard time, because it
looks like there are subtle things wrong/broken in FreeBSD/USDT
tracing (e.g. the way a process is launched and ptrace() is used
to attach to the process is missing some key lines of code).

I have spent the last week poring over the subtleties of what
FreeBSD do, along with Sun and Apple. I should be able to get
this bit to work, however I am not sure about
other aspects of the tracing, such as aborting or skipping
over syscalls. (The ptrace() syscall is simply not as
powerful as Sun's /procfs interface).

I know most of the ELF code works for symtab lookups, so I should be
able to make some new progress.
I'll update the blog and put out a new source tarball when I feel
happy with what I have.

As always, things have been slow, but they sped up over the
last few days. (I've been ill with flu over Xmas, which didn't help;
every thought of dtrace made my head explode!)

First, the /proc/$$/ctl driver sort-of-nearly-almost-but-doesn't work.
It hooks into the kernel and can respond to calls, but there's a
problem/difficulty: I haven't figured out how to simply intercept syscall
entry/exit on a per process/thread basis, without lots of kernel hacking
or a brute force patch on entry/exit to the syscall handlers. This
would be against the ideal of dtrace having a zero-impact approach
to monitoring. Maybe it's doable long term (I do so love the
Solaris approach to procfs; ptrace doesn't cut the mustard).

In any case, this may not matter; I have spent more time
understanding how the libdtrace library handles:

dtrace -c prog

and how it grabs a running process. I took a new look at FreeBSD
and noticed it used the only other valid alternative: ptrace, so
I am grabbing ideas and code from FreeBSD to see if I can make progress.

Side note: using the Apple code is rather pointless, since
it relies on the underlying Mach OS calls to do process manipulation
and there's nothing similar in Linux - i.e. an uphill struggle.

The FreeBSD code is nice and simple, except it does rely
on the EVENT subsystem in FreeBSD for inter-thread communication
(not sure I fully follow it). I have stubbed it out for now - just
so I can get something/anything working.

Hopefully when this is done, I can handle the reverse journey
for USDT.

Let's hope 2009 is a better dtrace year. It will be a long slog
to get dtrace reliable, and the more that people try it or comment
on it, the better. I feel comfortable that key parts of dtrace
just work, but I haven't addressed quality. (I am slowly trying
to clean up compiler warnings, for instance, which many times
obscure real silliness on my part).