Just an update - I noticed some minor annoyances in dtrace, which I
want to fix, so I can use it in anger. Here they are:

Can't trace syscalls for a 32-bit app on a 64-bit kernel. Duh! Yes,
but I never finished that work so I forgot!

Access to "cpu" for printing doesn't work unless the /usr/lib/dtrace
files are installed. Sometimes I run from a build dir, and that's annoying,
so I will see if I can modify dtrace to look where the binary is running
from rather than just the lib. (You can override the include dir on the
command line, but it's too much work to read the help that dtrace provides
or that I augmented :-) Double-duh!)
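
For what it's worth, finding "where the binary is running from" on Linux
can be sketched roughly as below. This is only an illustration, not what
dtrace does today: the function name exe_dir() is made up, and it assumes
/proc is mounted so that the /proc/self/exe symlink can be read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Fill 'buf' with the directory that holds the running executable.
 * Linux-specific: relies on the /proc/self/exe symlink.
 * Returns 0 on success, -1 on failure.
 */
static int exe_dir(char *buf, size_t len)
{
    assert(buf != NULL && len > 1);

    ssize_t n = readlink("/proc/self/exe", buf, len - 1);
    if (n <= 0)
        return -1;
    buf[n] = '\0';              /* readlink() does not NUL-terminate */

    char *slash = strrchr(buf, '/');
    if (slash == NULL)
        return -1;
    if (slash == buf)           /* binary lives directly in "/" */
        slash[1] = '\0';
    else
        *slash = '\0';          /* chop off the file name */
    return 0;
}
```

The caller could then try the library files relative to that directory
before falling back to /usr/lib/dtrace.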

I notice some fbt "entry" points don't have corresponding "return"
exits, which is a nuisance. Probably the disassembler is missing some
magic in the instruction sequences, so I will look to tidy that up.

Still not happy with timestamps in dtrace - they appear to work,
yet every time I try to measure entry-to-exit times, the values look "wrong",
e.g. it almost looks like the timestamp (hrtime) code is freezing the
value for all matched probes rather than using the wall clock time. Not sure
if it's me being a dtrace-noobie, or some silliness in the code.
(There's also a bug in reading 64-bit timestamps where the low order 32 bits
can wrap and lead to "negative" time occasionally; either that, or a problem
when a context switch moves between cpus).
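
To make the suspected wrap problem concrete: if a 64-bit counter is read
as two 32-bit halves, and the low half wraps between the two reads, you
can pair an old high half with a new low half and time appears to go
backwards. The usual cure is to re-read the high half and retry. The
sketch below is generic C, not the dtrace code; read_counter64() and
elapsed() are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read a 64-bit counter exposed as two 32-bit halves without tearing.
 * If the high half changed while the low half was being sampled, the
 * low half wrapped in between, so sample again.
 */
static uint64_t read_counter64(const volatile uint32_t *hi,
                               const volatile uint32_t *lo)
{
    uint32_t h1, h2, l;

    assert(hi != NULL && lo != NULL);
    do {
        h1 = *hi;
        l  = *lo;
        h2 = *hi;
    } while (h1 != h2);

    return ((uint64_t)h1 << 32) | l;
}

/* Elapsed time that refuses to go "negative": 0 if end is before start. */
static uint64_t elapsed(uint64_t start, uint64_t end)
{
    return end >= start ? end - start : 0;
}
```

Clamping in elapsed() only hides the symptom, of course; it is there to
show where a "negative" value would otherwise surface.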

And lastly, I really need to provide a mutex provider, since the Linux
kernel inlines the assembler for locks and semaphores, making it impossible
to count or monitor these with probes.

That's enough to keep me busy for a few days, as my crisp fixes slip behind.
(Getting line wrapping to work properly without display surprises is time
consuming, even trying to remember how the code was supposed to work).

I just had to do some home surgery. Son dropped laptop, and hard drive didn't
recover. Shame, since the hard drive was supposed to be able to survive
these (that's what Compaq/HP says on the sales blurb; me - I never believe any
of that :-) )

After a quick, "what next? dash to the stores?", decided to reclaim an unused
spare 80GB laptop drive and go with that : zero cost fix.

So, onto the Windows Vista recovery disks we made when the laptop
was new. Great, after 3 CDROMs (or were they DVDs?), we reboot and
we get a nice "Windows cannot proceed with the installation" type
dialog and it reboots. Nothing, including safe-mode, will work.

Quick google search: you cannot install Vista from the recovery disks.
Pardon? Like most PCs out there, they all ship without recovery media, and
you have to make your own. I do hate that. It's a "con" for the general
public and they don't realise it. I have a copy of Vista on DVD, but
after trying that, that was going nowhere fast.

So I tried the Windows 7 recovery disk - and it worked a treat. It installed
(despite the recovery disk being for a Dell), and a short time later,
a fully functioning Windows 7 install. Wifi worked; laptop screen
resolution was spot on, and absolutely nothing to dislike.

Of course, it doesn't have a license, and sooner or later, Windows 7
will remind us. But I don't really care. I paid for Vista - twice, once
for a machine that never needed or used it, and once for the Compaq, which
lost its hard drive, and for which the recovery disks were a waste of
life spent making them. (Maybe I can salvage my son's core data: iTunes library,
and Firefox bookmarks, but he isn't that fussed).

World of Warcraft is busy downloading updates, and he is using the
other PC until he can get a few seconds to walk across the room and
carry on life as if nothing bad ever happens.

All the time, I am wondering: there must be more to life than
watching progress bars on screens.

I have never really tested this, but there are issues if you
use ustack() on a 32-bit app on a 64-bit kernel. I had previously
written that stack walking on Linux with gcc is problematic because
of the susceptibility to omitted frame pointers, meaning the only correct
way to walk a stack is via the DWARF debug records. (There is some
DWARF support in the kernel code in dtrace, but it's not complete, and
there's a danger, if it's invoked, that bad pointers can generate
kernel GPFs).

When the user space dtrace wakes up, having fetched a buffer of info from
the kernel, it may include references to user processes, and, if "ustack()"
is used, then dtrace will examine the running process to get the
loaded libraries and walk the stack.

In theory, dtrace should handle this (mostly via the ELF libraries),
but dtrace assumes a Solaris style /proc filesystem, and not the Linux
one. (The Linux port attempts to get this "right" but it's not fully proven).

I will look at what/where the "gotchas" are. Would be nice to not
worry about the binary type.

Dtrace (the user command) relies on the libelf library to allow
introspection of target applications and for the USDT code for creating
probe-able libraries.

A naked Ubuntu install (and many other distros) provides a core set of
packages to work with, but not the development packages. The dtrace release
tells you what you need.

What I have found is that there are so many libelfs out there and they
do not all agree. E.g. some include ELF_C_MMAP_READ and some don't. Worse,
the enums for the various values are different, so an app built with one
set of headers but run against another library can get strange and
difficult to debug error codes from elf_begin() and friends.

I need to add some better autodetection to the dtrace code, or, one
possibility is to move totally away from libelf and use the elf library
I put together for the CRiSP/elfrewrite code. (Am loath to do that, but
it would sever any dependencies and provide better support on old/very
old systems).

I put out a new release today to fix some more build regressions, but
I have someone reporting a failure to build the "simple"/USDT example
code. If anyone is trying this out, try:

$ make -i all

as a temporary workaround, assuming it's just the "simple" example which fails
and nothing else.

I posted a new release of dtrace last night, and there are some silly
issues to resolve in that release. It's fine on Ubuntu 10.04 and 10.10
and probably lots of earlier releases, but thought it worth highlighting
some of the blips I found on a RedHat AS4 build:

If yacc is used, especially older yaccs, a bison construct can lead to a
compile time error. Bison allows "string" tokens, so that on an error
message it doesn't print something ugly, like "Syntax error near DT_OPEN_BRACE".
Unfortunately, old yacc doesn't reject this and happily accepts a
token like "sizeof", which then causes a #define of sizeof - the keyword -
and leads to strange and difficult to diagnose compiler errors.
I will need to put in a yacc/bison detect to handle this.

INET_ADDRSTRLEN is not being handled properly for older kernels,
when the #define is in a file we are not including.
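
One way to paper over a missing INET_ADDRSTRLEN is a guarded fallback
definition, sketched below as ordinary user-space C so it can be tried
outside the kernel tree. The value 16 is the standard one (room for
"255.255.255.255" plus the NUL); fmt_ipv4() is just a made-up helper to
show the constant in use, and whether dtrace should fix it this way is
an open question.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Fallback for builds where no included header supplies the constant.
 * 16 = strlen("255.255.255.255") + 1 for the trailing NUL.
 */
#ifndef INET_ADDRSTRLEN
#define INET_ADDRSTRLEN 16
#endif

/* Format a host-order IPv4 address; 'buf' must hold INET_ADDRSTRLEN bytes. */
static int fmt_ipv4(unsigned int addr, char *buf)
{
    assert(buf != NULL);

    int n = snprintf(buf, INET_ADDRSTRLEN, "%u.%u.%u.%u",
                     (addr >> 24) & 0xffu, (addr >> 16) & 0xffu,
                     (addr >> 8) & 0xffu, addr & 0xffu);

    /* Four octets of at most three digits always fit in the buffer. */
    assert(n > 0 && n < INET_ADDRSTRLEN);
    return n;
}
```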

The changes for the lockless ioctl() result in backward compile
time problems, since neither unlocked_ioctl() nor compat_ioctl() is available
on older kernels and the code doesn't default to the old style ioctl(). (The
major difference here is that the old ioctl() callback would have
a 'struct inode' and a 'struct file' argument, whereas both unlocked_ioctl
and compat_ioctl only pass in a struct file. Some #ifdef's should handle
this.)
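
The shape of those #ifdef's might look something like the sketch below.
It is deliberately written with stand-in struct definitions so that it
compiles in user space; in the driver the real kernel headers would be
used. HAVE_UNLOCKED_IOCTL and all the sketch_* names are hypothetical,
and a real build would set the switch from the kernel version or a
configure-time probe.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles outside the kernel tree. */
struct inode { int unused; };
struct file  { int unused; };

/* Hypothetical switch: define to 1 when the newer callbacks exist. */
#ifndef HAVE_UNLOCKED_IOCTL
#define HAVE_UNLOCKED_IOCTL 0
#endif

/* Common handler: needs only the struct file, like the new callbacks. */
static long sketch_ioctl_common(struct file *fp, unsigned int cmd,
                                unsigned long arg)
{
    assert(fp != NULL);
    (void)arg;
    return cmd == 1 ? 0 : -1;   /* placeholder command dispatch */
}

#if HAVE_UNLOCKED_IOCTL
/* New style: no inode argument, kernel does not take the BKL for us. */
static long sketch_unlocked_ioctl(struct file *fp, unsigned int cmd,
                                  unsigned long arg)
{
    return sketch_ioctl_common(fp, cmd, arg);
}
#else
/* Old style: the extra inode argument is simply ignored. */
static int sketch_old_ioctl(struct inode *ip, struct file *fp,
                            unsigned int cmd, unsigned long arg)
{
    (void)ip;
    return (int)sketch_ioctl_common(fp, cmd, arg);
}
#endif
```

The same #if would then pick which member of the file_operations
structure gets filled in.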

Having done various "other" projects (which are still ongoing,
including a TV Guide browser/diary), it's back to dtrace. Suddenly
decided I was annoyed dtrace wasn't running on the latest kernels
(thanks to everyone who prompted and reminded me of this).

It's a bit of a pain: 2.6.26 and 2.6.37 changed enough things to
make life difficult doing a backwards/forwards series of code changes.
Added to which, my current development laptop is missing some of the
older kernels (available on my other machine), so even if I get
it to compile, I cannot easily guarantee I haven't broken a prior
kernel build.

Some things, like the "ioctl()" driver function, changed as Linux
worked to remove the big-kernel-lock (BKL) - which is a good thing, but
enough to complicate code having to handle old and new kernels.

Others are a bit more curious (e.g. kmalloc() prototype not visible
unless <linux/slab.h> is included).

Oh well - it only cost 12 pounds, but it has failed - getting the
same erratic "no route to host" problems with this adaptor as if
it wasn't there. That's: wifi, wired ethernet, airport express, homeplug,
USB wifi - all the same symptoms on the appallingly poor mac mini
hardware.

So, I now get to decide on what to replace the mac mini with.
An Inspiron Zino is my current favorite - but the cpu specs are
a bit on the low side.

Or maybe a small/refurb/cheap desktop. Or maybe a nettop device (which
have very poor cpu performance). Oh well. <sob>.

Had a couple of bugs to fix in elfrewrite - it didn't correctly handle
ELF binaries which had both .hash and .gnu.hash. Found this out when
running on my older Ubuntu system and it caused bad binaries to be generated.
I hadn't realised that was even going on with Ubuntu 10.04 and
earlier (should have been obvious that --hash-style=both was the default).
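
Spotting the "both" case is simple enough once you look at the section
types: the classic table is SHT_HASH (5) and the GNU one is SHT_GNU_HASH
(0x6ffffff6). The sketch below takes a plain array of section type
values rather than real section headers, to keep it independent of any
ELF header file; hash_styles() and the SK_* names are invented here and
this is not the elfrewrite code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Section type values: classic ELF hash and the GNU extension. */
#define SK_SHT_HASH      5u
#define SK_SHT_GNU_HASH  0x6ffffff6u

/* Bits in the returned mask. */
#define SK_HAS_SYSV_HASH 0x1
#define SK_HAS_GNU_HASH  0x2

/* Return a bitmask of the hash-table styles found in 'n' section types. */
static int hash_styles(const uint32_t *sh_types, size_t n)
{
    int mask = 0;

    assert(sh_types != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (sh_types[i] == SK_SHT_HASH)
            mask |= SK_HAS_SYSV_HASH;
        else if (sh_types[i] == SK_SHT_GNU_HASH)
            mask |= SK_HAS_GNU_HASH;
    }
    return mask;
}
```

A result with both bits set is the case that tripped me up: anything
that changes the dynamic symbols has to leave both tables consistent.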

Also, removed a silly dependency on libelf, so that it runs on more systems.
(The elfrewrite is available as part of the crisp install; I may package
it up separately, but it serves my purpose to do that).

The LM Tech nano usb is a teeny weeny USB wifi adaptor. I have had no
end of problems with wired and wifi on my Intel Mac Mini - such a poor
and broken piece of Apple tech. Nothing worked to make either reliable -
with erratic lack of network connectivity. I am close to dumping this
horrible piece of kit, but the LM Tech nano USB wifi seems to be working.
Let's test it out for a few weeks. 150Mbps 802.11n - strong signal strength
(strange - it reports 90+% signal strength whilst the wireless router
reports 23%!). Not hugely fast (because of distance from mac mini to router
and it's hiding behind, rather than in front of the mac - more signal
blockage). But, for 12 GBP - it's a bargain (vs the 90 GBP for an
airport express which has yet to serve the purpose I purchased it for).

Meanwhile, need to get back to some more coding updates. (Added a
CSV macro to crisp, which now allows column-based searching, e.g.
"col select 3==hello" shows all lines where column 3 contains "hello".
Need to extend it more to support AND/OR scenarios, but it meets
my use-case with room to extend.)
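
The matching rule behind something like "3==hello" is easy to sketch in
C. The real thing is a crisp macro, so this is only an illustration of
the idea: col_contains() is a made-up name, columns are counted from 1,
and the split is naive (plain commas, no quoted fields).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if 1-based column 'col' of the comma-separated 'line'
 * contains 'needle', else 0. Naive split: quoted fields are not handled.
 */
static int col_contains(const char *line, int col, const char *needle)
{
    assert(line != NULL && needle != NULL && col >= 1);

    const char *start = line;
    for (int i = 1; i < col; i++) {
        start = strchr(start, ',');
        if (start == NULL)
            return 0;               /* line has fewer columns */
        start++;                    /* step over the comma */
    }

    size_t flen = strcspn(start, ",\n");    /* length of this field */
    size_t nlen = strlen(needle);

    /* Substring search confined to the field. */
    for (size_t i = 0; i + nlen <= flen; i++)
        if (memcmp(start + i, needle, nlen) == 0)
            return 1;
    return 0;
}
```

AND/OR support would then just be a matter of combining several such
tests per line.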