I wanted to recount a little scenario that happened to me, as a warning
to others.

I've been writing a "makefile" generator tool. The makefiles for CRiSP
have become a little unwieldy over the years - specifically, parallelisation
issues, incomplete dependencies, and an inability to adapt them quickly
to a new scenario.

CRiSP runs on MacOS, Linux and Windows. The Mac and Linux makefiles are
shared (as are all the Unix makefiles), with an auxiliary small makefile
to handle the Cocoa part of the build cycle. Unfortunately, for Windows,
nothing is shared - it has a complete set of its own makefiles.

I wanted to compile CRiSP under MinGW - a GCC port to Windows - because I
wanted to use GCC and avoid issues with Microsoft Visual Studio.
MinGW provides a Unix-like environment, but in terms of my makefiles,
Windows is too far from Unix and MinGW is too far from Windows.

So, I built a makefile generator. It has some good features: auto/recursive
header file dependencies, avoidance of "cd", a parallelisable build
tree, and proper cleaning - cleaning only what is built.

The latter feature had a bug in it: the initial version did the
equivalent of "rm -f */*". Unfortunately, when I ran it, it hadn't
cd'ed to the build directory, so it deleted all my files in the subdirectories.
Oops!
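
For what it's worth, a safer clean rule amounts to something like the
sketch below - the file and variable names are illustrative, not the
generator's real output: record what was built, and refuse to clean
unless we really are sitting in a build directory.

    # clean only the files recorded at build time, and only from inside
    # a directory that the generator marked as a build directory
    cd "$BUILD_DIR" || exit 1
    [ -f .build-manifest ] || { echo "refusing: not a build directory" >&2; exit 1; }
    xargs rm -f < .build-manifest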

Not to worry - I do backups every few days and propagate the sources to
other machines. Mount the USB drive and go look at the .tar.bz2 file
containing the current sources. (Alas, about 2-3 days old, but that
didn't matter.)

What I found was a corrupted .tar.bz2 file. My initial thought was:
*how* did this happen? My backup script is used all the time, and I
have validated the backups, but this was strange and new.

Never mind, I reconstituted the missing files from my other backups
and systems.

But I was curious: what could cause this to happen?

On investigation, I found the following worrying sign. I tend to back up
to a USB flash drive. I use Linux. I use swsusp to suspend to RAM. My current
kernel/distro has a bug in it: if you suspend with USB drives mounted, on
wakeup it doesn't realise that the filesystem was mounted or that the hardware
needs to be reprobed. "mount" would show the filesystem as mounted,
but the device appeared totally empty. If I unmounted the rogue device and
remounted it, all the files were present.
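
The symptom looked roughly like this (device, mount point and file names
are illustrative):

    $ mount | grep usb
    /dev/sdb1 on /media/usb type vfat (rw)
    $ ls /media/usb              # after a resume: mounted, but apparently empty
    $ umount /media/usb
    $ mount /dev/sdb1 /media/usb
    $ ls /media/usb              # the files are back
    crisp-src.tar.bz2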

I am guessing that I did a backup and left the device mounted, suspended
the system, and didn't notice for a while (1-2 days), and eventually this
may have led to the corruption.

Moral: even if your backup system is perfect and does validation,
your operating system (or some other component) may work against you -
and you may not know it.

In my case, I may have to strengthen the backup system: apply md5sums to
the files and validate them when writing to the device, or cache the
backups on the local hard disk and verify the copy on the device before
dropping the local one.
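
A sketch of what that might look like (file and mount point names are
illustrative); the remount before the final check matters, otherwise the
verification could be satisfied from the page cache rather than from the
flash device:

    md5sum crisp-src.tar.bz2 > crisp-src.md5
    cp crisp-src.tar.bz2 crisp-src.md5 /media/usb/
    sync
    umount /media/usb && mount /dev/sdb1 /media/usb   # force a re-read from the device
    ( cd /media/usb && md5sum -c crisp-src.md5 )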

By the way, seeing the kernel marked as tainted when dtrace is loaded is
normal - it is simply because of the differing licenses (GPL vs CDDL).

Let's do a little history lesson.

In the beginning, Solaris gave rise to a tool called "truss". "truss"
is a tool for tracing system calls. It was a breakthrough tool (versions
for other OSes existed previously, but the truss implementation was simple
and easy to use; the BSD kernels had their own tool ... but I digress). truss
lets you understand what calls a tool makes, and you can see
the parameters to each syscall. Great for educational purposes or for
diagnosing performance problems in an application.

On Linux systems (and other Unixes), strace was created to emulate
truss - a system call tracing tool.
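
For example, to trace a command and its children and save the output to
a file:

    truss -f -o /tmp/ls.out ls /etc        # Solaris
    strace -f -o /tmp/ls.out ls /etc       # Linux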

"ltrace" is another tool, which can be used to trace any dynamic
library function call. It is sometimes very useful, but at the library
level the number of calls executed can be huge - something as simple
as starting an X Windows program can involve lots of nested calls
to fopen, the string library and other functions - and that level
of detail can make it hard to understand what an application is
doing.
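
The summary mode is often more digestible than the raw firehose:

    ltrace ls /etc          # every library call, one per line
    ltrace -c ls /etc       # just a per-function count/summary on exit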

A few years back - dtrace was born. dtrace is a tracing tool which
runs at the system level - this means you can trace all processes, into
and out of the kernel, rather than selecting a single process to trace.

There are two common modes of operation: tracing system calls (where you
can see who executes the call - whereas with strace/truss, you pick
a single process and see what calls it makes), and tracing
kernel function calls (seeing which process triggers the probe).
[dtrace does other things, such as tracing user-land shared libraries,
just like ltrace, but again, you can find out which processes invoke a
function rather than having to know up front which one to trace.]
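
A couple of stock one-liners, using the standard syscall provider syntax,
show the difference in perspective:

    # which processes are making which system calls, system wide?
    dtrace -n 'syscall:::entry { @[execname, probefunc] = count(); }'

    # which processes call read()?
    dtrace -n 'syscall::read:entry { @[execname] = count(); }'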

When tracing in the kernel (and in user space), the basic mechanism is
that of a breakpoint: one or more breakpoint instructions are placed
at the places you choose (using the dtrace probe syntax), and
dtrace converts these breakpoint traps into the events and callbacks that
a D script can trigger on.

DTrace works at the function level: you pick a function to trace, and
dtrace computes the start of the function (very easy - the symbol table
gives us the start address of each function) and also the (multiple)
end addresses of the function. (Each function can have multiple exit
points, so dtrace needs to "find" each of these.) Finding the exit
points of a function involves disassembling the function and trapping
the RET instruction. (Solaris dtrace traps the LEAVE instruction which
precedes the RET instruction, but this is not very useful on Linux, when
using gcc, since the compiler rarely emits a LEAVE/RET instruction sequence.)
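
A user-space way to see what is involved - not dtrace's actual code, just
an illustration with binutils on a binary of your own:

    # list the return instruction(s) inside main() of ./a.out
    objdump -d ./a.out | awk '/<main>:/ {f=1} f && /\tret/ {print} /^$/ {f=0}'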

It's possible for the disassembler to get confused, and Linux's dtrace
is tuned for the type of output typical of gcc. Confusion can result in not
finding the end of a function, or in being misled by inlined
constant tables (gcc used to create switch-statement jump tables
and inline them in the code body; fortunately, it no longer does this - all
read-only data is put in its own ELF section, and code in memory is exactly
that -- code -- nothing else).

dtrace needs to intercept kernel module loading - so that it can expose
traceable functions - and also module unloading; otherwise it
could leave a trap planted in a part of memory that has been freed or
reused for something else.

Let's switch to ftrace: ftrace is a Linux kernel subsystem, similar to dtrace,
and an integral part of the kernel. Because of the licensing conflict with
dtrace, ftrace exists in its own right. It works similarly to dtrace - intercepting
module loads and letting you place traps in the kernel. (ftrace differs
in many respects from dtrace - it provides a richer API
for kernel tracing, but it can allow dangerous user scripts to crash
the kernel.)
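
ftrace is driven through debugfs rather than through a compiled script;
for example, the function tracer is enabled like this:

    mount -t debugfs nodev /sys/kernel/debug             # if not already mounted
    echo function > /sys/kernel/debug/tracing/current_tracer
    head /sys/kernel/debug/tracing/trace
    echo nop > /sys/kernel/debug/tracing/current_tracer  # turn it off again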

ftrace is quite vocal - if it detects anything it cannot handle in a loadable
module, it will log a message indicating what it couldn't handle in the
instruction sequence it meets, and why. This results in a kernel log
message, such as the following when dtrace is loaded:

Here, the code in ftrace.c at line 1509 is complaining.
It's not obvious what it is complaining about, but the last two lines
highlight what ftrace was confused by. The "actual" message
shows the instruction it didn't like. Those bytes correspond to:

e8 0f 62 03 00 call 0x00036214
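
(For reference: 0xe8 is the opcode of a relative call, the next four bytes
are the little-endian displacement 0x0003620f, and the call target is the
end of the 5-byte instruction plus that displacement - hence the 0x00036214
shown.)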

I don't know what it doesn't like about that, but it is possibly suggesting
that ftrace, in computing the call-graph of the kernel, may have found
a code path which could result in a kernel hang (this might be
dtrace violating a kernel API constraint, or ftrace not
realising how dtrace works).

To date, I have not observed any side effect of this warning. (I haven't
tried to use ftrace on a system running dtrace, so it's possible that
ftrace or dtrace may cause a hang when they are enabled together.)

By the way - always treat kernel stack traces (such as this one), or
even those from dtrace itself, with suspicion. Because of the way GCC
compiles code, it is not possible to create a 100% accurate stack
trace in all scenarios. The stack walker looks at all the words on the
stack and lists out the potentially likely active stack frames. (The "?"
marks in the trace are the stack walker flagging questionable
stack entries; walking the stack has enough gotchas and complexity that
maybe I will return to the subject in a future blog topic.)

CRiSP has had an impressive life - starting out as an editor on
MSDOS-3.0. It has been updated to run on all CPUs, operating systems
and platforms - from 16-bit to 80-bit processors (yes, 80, it was ported
to the CRAY-1).

I am just finalising some new features (document map, annotation window),
along with a new feature which some people may find useful - file-saving.

Until recently, you could use CRiSP to view files and explore aspects
of the files and directories, such as cross-referencing. You had
the ability to do infinite-undo (years before anyone else could do the
same), and CRiSP was the first syntax coloring editor on the Unix platform.

But the new feature means that, as well as editing your files, you can now
save them. Previously, you could load a file and make changes. In your
next session, you could make additional changes, but only as long as you
re-made the prior session's edits.

When storing files in source code control systems, and building new
versions of software, having to type in your edits repeatedly was
a slight problem. (CRiSP has a fast autorepeat, so this helped enormously
when scrolling to the relevant parts of a buffer.)

With the "File/Save" feature (which no other editor on the market yet
supports, although I am guessing some will try to copy CRiSP's unique traits),
your files are saved, permanently. When you return to editing, CRiSP
will be able to retrieve the last modified file.

It's taken a while to get this feature working properly, and it had me
doing some research into various algorithms which allow for file saving.
(Many of the algorithms are quite technical, and caused me much *grief*
trying to get all the permutations working.)

The first file-saving implementation is geared around Cellular automata
(otherwise known as "Paper"). With this, you commit your changes to paper,
and when you restart CRiSP, you no longer have to remember what changes
you have made, since the "Paper" copy will let you see immediately. The
advantage of this "cellular" technology is that you can keep older versions
of the files you have saved.