Not terribly much progress on the cpplib front. Neil's
algorithm didn't work right the first time around, got
revised, and now there's a new version sitting on my hard
drive waiting to be finished up.

I did some work in other areas, like the 'specs' that
tell /bin/cc how to run the real compiler (which is
hiding in a dark corner of /usr/lib). These are a
little language in their own right, and not terribly
comprehensible - here's a snippet:
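A representative fragment in the specs style - reconstructed from
memory of gcc's specs file, so possibly not verbatim:

```
%{C:%{!E:%eGNU C does not support -C without using -E}}
 %{nostdinc*} %{C} %{v} %{A*} %{I*} %{P} %I
```

Roughly: %{x} copies -x through to the subprocess if it was given,
%{x*} copies every -x option along with its argument, and %e reports
an error. It only gets hairier from there.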

With some magic, that turns into an argument vector for
one of the programs run during compilation. Not
surprisingly, people avoid the stuff as much as possible.

I also stomped the irritating warning bug with built-in
functions. You didn't use to get warned if you forgot to
include <string.h> and used strcpy,
because gcc Knows Things about strcpy before it
sees any headers. Not anymore. (It was intended to get
this right all along, but one if statement went the wrong
way in the mess that is the Yacc grammar for C.)

Neil came up with an ingenious algorithm for expanding
macros, which should get the C standard's semantics just
right, but avoid having to scan any token more than once.
It's remarkably simple to implement, but difficult to
describe and not easy to comprehend from reading the code.
There's going to be a long comment explaining it somewhere.

Anyway, I've implemented all of it except stringification,
which is presenting some difficulties. I'm a wee bit
concerned about the way the algorithm interacts
with the macro stack as I designed it - we may be losing
critical information. But it's late and I'm tired, and
it'll probably all make sense tomorrow.

I punted the lexer glue and am busily grinding through a
rewrite of the macro expander. The goal here is not to
forget about the tokenization of the original macro, but
preserve as much of it as possible. This will dramatically
reduce the amount of data that has to be copied around,
reexamined, etc.

So far, object-like macros work, and I'm starting on
function-like macros. (These are the terms the standard
uses.)

Function-like macros take more work, because you have to
substitute arguments. In the example above, a
might be replaced by a big hairy expression.

I took some time out and stomped on about a hundred
compiler warnings. We're now sure that string constants are
always treated as, well, constant. I've also got as far as
the Valley of the Dead in NetHack, which has never
happened before (still only about halfway through the game,
though).

To blizzard:
If you're going to improve gmon/mcount, please teach it that
if there's an existing gmon.out in the working directory,
then it should augment that file instead of clobbering it.
That way, if you want to profile a program that runs for a
short time, you could just run it a few thousand times in a
shell loop. Right now you have to do that, plus rename the
reports so they all get saved, and then crunch them together
at the end. This takes much longer than it has to, and
throws your results off because disk cache is wasted on the
huge gmon.out files which all have to stay around until the
end.
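The workaround today looks something like this - myprog is a stand-in
name, though gprof's -s option really does merge profiles into
gmon.sum:

```sh
# run the program many times, squirreling away each profile
for i in `seq 1 1000`; do
    ./myprog                     # writes gmon.out, clobbering the last
    mv gmon.out gmon.out.$i
done
# crunch them together at the end
gprof -s ./myprog gmon.out.*    # merges everything into gmon.sum
gprof ./myprog gmon.sum > report
```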

To make this change safely, you should probably save the
identity of the executable in gmon.out, and start over if it
changes. (This should be done anyway.)

I'd also like to see better kernelside support for
profiling. setitimer(2) has a lot of overhead, and
the ticks don't come nearly often enough. SVR4 has a
profil(2) system call that pushes the histogram
updates into the kernel, which gets rid of the overhead but
doesn't help with the granularity. Also, I don't think it
can handle gaps in the region to be profiled, so your
program has to be statically linked.

I'd rather not add system calls. Instead, I envision a
pseudo-device which you map several different times,
specifying the window of the address space to profile. It
can use the high-resolution timer in the RTC to get ticks
more often than the normal timer interrupt. Updates happen
in the driver, so no more 30% of execution time spent in
__mcount_internal.

GCC/i386 has a stupid bug where it clobbers %edx
on every function entry when compiling with profiling.
This breaks -mregparm. Okay, that doesn't affect
very many people, but it still needs to get fixed.

Position one: Unicode is the One True Character Set, and the
answer to all our problems, or at least the ones having to do
with text encoding. Advocates of this position usually have a
specific format that they prefer - UTF-8 or
UCS-2.

Position two: Unicode is an abomination in the sight of God, and
must be stamped out wherever it occurs. The usual reason given is
that it's not a strict superset of all existing encodings.
E.g. the conversion from various Chinese/Japanese charsets
to Unicode and back is said to lose information.

The truth, as usual, will be somewhere in the middle. I
don't know enough about the issues to judge. I would
appreciate it if anyone who does know enough to judge would
contact me and give me some clues. Email: zack@wolery.cumb.org.

Neil Booth came through with the new lexer for cpplib. It's
much, much cleaner than the old one, and ~500 lines shorter
to boot. Now I just have to knock together a glue layer so
it will talk to the rest of the program, which expects a
completely different interface. Then I can convert the old
code to the new design over the next couple weeks instead of
all at once.

The todo list is way out of date. I'd post a link, but
it's so outdated it's actively misleading. Updating it goes
on this week's queue, right after the glue
works.

All the routines in cpp that I haven't gotten around to
rewriting (there aren't too many left, thank ghod) look
remarkably similar. They are at least five hundred lines
long. They have at least ten levels of nested braces. They
have at least ten variables that are used all over the
entire function, plus at least twenty more declared in the
inner blocks. And they have obvious places where they can
be broken into smaller functions with ease.

I would like to know who it was that wrote all the functions
like that. They're not just in the preprocessor. They're
everywhere in gcc. I mean, everywhere. And it's not
like a monster function gets optimized better than four
reasonable ones; in fact, just the opposite. Nor is it
easier to debug a monster function, or profile it. And it
is certainly not easier to edit it. So someone must have
been absolutely in love with the things. And I want to know
who, and why.

I have been notified that no, I may not remove support for
grotty pre-ANSI macro tricks from the preprocessor. It
turns out that the ickiness can be confined in a few small
places, which is better than we had it before. But I really
wanted it to go away entirely. *grumble*

Several people have pointed me at Netscape's app-defaults
file, which lists a bazillion things you can usefully
tweak. I now have no splash screen, no blinking text, and
no useless toolbar buttons. Unfortunately, there doesn't
seem to be anything related to the Energizer-bunny bookmarks
headers.

Here's how it's done: add this to
.X(resources|defaults) and restart X:
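
From memory - the exact resource names live in Netscape.ad, and some
of these may be spelled differently there:

```
Netscape*noAboutSplash:                  True
Netscape*blinkingEnabled:                False
Netscape*toolBar.destinations.isEnabled: False
Netscape*toolBar.search.isEnabled:       False
```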

I just love the way the toolbar resource names have
nothing to do with their visible labels.

"But the Security button is useful!" I hear you
scream. Yeah, but you can get the same thing by clicking on
the little lock icon in the left corner of the status bar.
I want to keep the status bar; it's actually useful...

There are some other interesting resources in there, like
*dontForceWindowStacking which may disable
javascript's ability to create popup windows that can't be
got rid of. (Too bad it doesn't disable javascript's
ability to create popups, period.) I have javascript turned
off anyway, so it's irrelevant.

The file to dig through is Netscape.ad, which
will probably be in /usr/lib/netscape or wherever
the installation directory wound up. It has amusing bitter
comments by JWZ.