For people unfamiliar with this concept, I should explain briefly.
The C standard is full of places that say "if the program contains
x, the behavior is undefined", which really means "C programs
do not contain x, so if the
program contains x, it is not written in C, and, as this
standard only defines the meaning of programs in C, it has nothing to
say about the meaning of your program." There are around a couple of
hundred of these phrases, and a larger number of places where it is
implied.

For example, everyone knows what it means when you write x =
4;, but what does it mean if you write 4 = x;?
According to clause 6.3.2.1[#1], it means nothing, and this is not a C
program. The non-guarantee in this case is extremely strong. The C
compiler, upon encountering this locution, is allowed to abort and
spontaneously erase all your files, and in doing so it is not
violating the requirements of the standard, because the standard does
not require any particular behavior in this case.

The memorable phrase that the comp.lang.c folks use is that
using that construction might cause demons to fly out of your nose.

[ Addendum 20071030: I am informed that I misread the standard here,
and that the behavior of this particular line is not undefined, but
requires a compiler diagnostic. Perhaps a better example would have
been x = *(char *)0. ]

Here the pointer p starts at the end of the string s,
and the loop might stop when p points to the position just
before s. Except no, that is forbidden, and the program might
at that moment cause demons to fly out of your nose. You are allowed
to have a pointer that points to the position just after an
object, but not one that points just before.

Well anyway, I seem to have digressed. My point was that M. Gould
says that one advantage of languages like Perl that are defined wholly
by their (one) implementation is that you never have "undefined
behavior". If you want to know what some locution does, you type it
in and see what it does. Poof, instant definition.

Although I think this is a sound point, it occurred to me that that is
not entirely correct. The manual is a specification of sorts, and
even if the implementation does X in situation Y, the
manual might say "The implementation does X in situation
Y, but this is unsupported and may change without warning in
the future." Then what you have is not so different from Y
being undefined behavior. Because the manual is (presumably) a
statement of official policy from the maintainers, and, as a
communiqué from the people with the ultimate authority to
define the future meaning of the language, it has some of the
same status that a formal specification would.

Such disclaimers do appear in the Perl documentation.
Probably the most significant example of this is the static variable
hack. For various implementation reasons, the locution my $static if
0 has a strange and interesting effect:
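The code in question looks something like this; the function name foo and the value 42 are taken from the discussion below, and note that recent versions of Perl (5.30 and later) reject the construct outright:

```perl
sub foo {
    my $static = 42 if 0;    # the "if 0" means the assignment never
                             # runs, but $static persists across calls
    print "static is now $static\n";
    $static++;
}

foo() for 1 .. 5;
```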

This makes $static behave as a "static" variable, and persist
from call to call of foo(). Without the ... if 0,
the code would print "static is now 42" five times. But with
... if 0, it prints:

static is now
static is now 1
static is now 2
static is now 3
static is now 4

This was never an intentional feature. It arose accidentally, and
then people discovered it and started using it. Since the behavior
was the result of a strange quirk of the implementation, caused by the
surprising interaction of several internal details, it was officially
decided by the support group that this behavior would not be supported
in future versions. The manual was amended to say that this behavior
was explicitly undefined, and might change in the future. It can be
used in one-off programs, but not in any important program, one that
might have a long life and need to be run under several different
versions of Perl. Programs that use pointers that point outside the
bounds of allocated storage in C are in a similar position. It might
work on today's system, with today's compiler, today, but you can't do
that in any larger context.

Having the "undefined behavior" be determined by the manual, instead
of by a language standard, has its drawbacks. The language standard
is fretted over by experts for months. When the C standard says that
behavior is undefined, it is because someone like Clive Feather or
Doug Gwyn or P.J. Plauger, someone who knows more about C than you
ever will, knows that there is some machine somewhere on which the
behavior is unsupported and unsupportable. When the Perl manual says
that some behavior is undefined, you might be hearing from the Perl
equivalent of Doug Gwyn, someone like Nick Clark or Chip Salzenberg or
Gurusamy Sarathy. Or you might be hearing from a mere nervous-nellie
who got their patch into the manual on a night when the release
manager had stayed up too late.

Here is an example of this that has bothered me for a long time. One
can use the each() operator to loop lazily over the contents
of a hash:

while (my $key = each %hash) {
    # do something with $key and $hash{$key}
}

What happens if you modify the hash in the middle of the loop? For
various implementation reasons, the manual forbids this.

For example, suppose the loop code adds a new key to the hash. The
hash might overflow as a result, and this would trigger a
reorganization that would move everything around, destroying the
ordering information. The subsequent calls to each() would
continue from the same element of the hash, but in the new order,
making it likely that the loop would visit some keys more than once,
or some not at all. So the prohibition in that case makes sense:
The each() operator normally guarantees to produce each key
exactly once, and adding elements to a hash in the middle of the loop
might cause that guarantee to be broken in an unpredictable way.
Moreover, there is no obvious way to fix this without potentially
wrecking the performance of hashes.

But the manual also forbids deleting keys inside the loop, and there
the issue does not come up, because in Perl, hashes are never
reorganized as the result of a deletion. The behavior is easily
described: Deleting a key that has already been visited will not
affect the each() loop, and deleting one that has not yet
been visited will just cause it to be skipped when the time comes.

Some people might find this general case confusing, I suppose. But
the following code also runs afoul of the "do not modify a hash
inside of an each loop" prohibition, and I don't think
anyone would find it confusing:
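A sketch of the code in question; is_bad is a hypothetical predicate standing in for whatever test identifies the bad items:

```perl
use strict;
use warnings;

# hypothetical predicate: here, "bad" just means an odd value
sub is_bad { $_[0] % 2 != 0 }

my %hash = (a => 1, b => 2, c => 3, d => 4);

# one pass over the hash, deleting the current item whenever it is bad
while (my ($key, $value) = each %hash) {
    delete $hash{$key} if is_bad($value);
}

# %hash is now (b => 2, d => 4)
```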

Here we want to delete all the bad items from the hash. We do this by
scanning the hash and deleting the current item whenever it is bad.
Since each key is deleted only after it is scanned by each,
we should expect this to visit every key in the hash, as indeed it
does. And this appears to be a useful thing to write. The only
alternative is to make two passes, constructing a list of bad keys on
the first pass, and deleting them on the second pass. The code would
be more complicated and the time and memory performance would be much
worse.

There is a potential implementation problem, though. The way that
each() works is to take the current item and follow a "next"
pointer from it to find the next item. (I am omitting some
unimportant details here.) But if we have deleted the current item,
the implementation cannot follow the "next" pointer. So what
happens?

In fact, the implementation has always contained a bunch of
code, written by Larry Wall, to ensure that deleting the current key
will work properly, and that it will not spoil the each().
This is nontrivial. When you delete an item, the delete()
operator looks to see if it is the current item of an each()
loop, and if so, it marks the item with a special flag instead of
deleting it. Later on, the next time each() is invoked, it
sees the flag and deletes the item after following the "next"
pointer.

So the implementation takes some pains to make this work. But someone
came along later and forbade all modifications of a hash inside an
each loop, throwing the baby out with the bathwater. Larry
and perl paid a price for this feature, in performance and memory and
code size, and I think it was a feature well bought. But then someone
patched the manual and spoiled the value of the feature. (Some years
later, I patched the manual again to add an exception for this case.
Score!)

Another example is the question of what happens when you modify an
array inside a loop over the array, as with:

@a = (1..3);
for (@a) {
    print;
    push @a, $_ + 3 if $_ % 2 == 1;
}

(This prints 12346.) The internals are simple, and the semantics are
well-defined by the implementation, and straightforward, but the
manual has the heebie-jeebies about it, and most of the Perl community
is extremely superstitious about this, claiming that it is "entirely
unpredictable". I would like to support this with a quotation from
the manual, but I can't find it in the enormous and disorganized mass
that is the Perl documentation.

[ Addendum: Tom Boutell found it. The perlsyn page says "If
any part of LIST is an array, foreach will get very confused
if you add or remove elements within the loop body, for example with
splice. So don't do that." ]

The behavior, for the record, is quite straightforward: On the first
iteration, the loop processes the first element in the array. On the
second iteration, the loop processes the second element in the array,
whatever that element is at the time the second iteration starts,
whether or not that was the second element before. On the third
iteration, the loop processes the third element in the array, whatever
it is at that moment. And so the loop continues, terminating the
first time it is called upon to process an element that is past the
end of the array. We might imagine the following pseudocode:
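```
i = 0;
while (i < length(a)) {
    process a[i];
    i = i + 1;
}
```

(A sketch; the real iterator carries a bit more internal state.) The length is re-tested on every pass, which is why elements pushed onto the end during the loop are seen, and why the loop terminates as soon as i runs past the current end of the array.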

There is nothing subtle or difficult about this, and claims that the
behavior is "entirely unpredictable" are probably superstitious
confessions of ignorance and fear.

Let's try to predict the "entirely unpredictable" behavior of the
example above:

@a = (1..3);
for (@a) {
    print;
    push @a, $_ + 3 if $_ % 2 == 1;
}

Initially the array contains (1, 2, 3), and so the first iteration
processes the first element, which is 1. This prints 1, and, since 1
is odd, pushes 4 onto the end of the array.

The array now contains (1, 2, 3, 4), and the loop processes the second
element, which is 2. 2 is printed. The loop then processes the third
element, printing 3 and pushing 6 onto the end. The array now
contains (1, 2, 3, 4, 6).

On the fourth iteration, the fourth element (4) is printed, and on the
fifth iteration, the fifth element (6) is printed. That is the last
element, so the loop is finished. What was so hard about that?

My blog was recently inserted into the feed for planet.haskell.org, and
of course I immediately started my first streak of posting code-heavy
articles about C and Perl. This is distressing not just because the
articles were off-topic for Planet Haskell—I wouldn't give the
matter two thoughts if I were posting my usual mix of abstract math
and stuff—but it's so off-topic that it feels weird to
see it sitting there on the front page of Planet Haskell. So I
thought I'd make an effort to talk about Haskell, as a friendly
attempt to promote good relations between tribes. I'm not sure what
tribe I'm in, actually, but what the heck. I thought about Haskell a
bit, and a Haskell example came to mind.

Here is a definition of the factorial function in Haskell:

fact 0 = 1
fact n = n * fact (n-1)

I don't need to explain this to anyone, right?

Okay, now here is another definition:

fact 0 = 1
fact (n+1) = (n+1) * fact n

Also fine, and indeed this is legal Haskell. The pattern n+1
is allowed to match an integer that is at least 1, say 7, and doing so binds n to
the value 6. This works by a rather peculiar special case in the
specification of Haskell's pattern-matcher. (It is section 3.17.2#8
of Haskell 98 Language and Libraries: The Revised
Report, should you want to look it up.) This peculiar
special case is known sometimes as a "successor pattern" but more
often as an "n+k pattern".

The spec explicitly deprecates this feature:

Many people feel that n+k patterns should not be
used. These patterns may be removed or changed in future versions of
Haskell.

(Page 33.) One wonders why they put it in at all, if they were going
to go ahead and tell you not to use it. The Haskell committee is
usually smarter than this.

I have a vague recollection that there was an argument between people
who wanted to use Haskell as a language for teaching undergraduate
programming, and those who didn't care about that, and that this was
the compromise result. Like many compromises, it is inferior to both
of the alternatives that it interpolates between. Putting the feature
in complicates the syntax and the semantics of the language, disrupts
its conceptual purity, and bloats the
spec—see the Perlesque yikkity-yak on pages 57–58 about
how x + 1 = ... binds a meaning to +, but (x +
1) = ... binds a meaning to x. Such complication is
worth while only if there is a corresponding payoff in terms of
increased functionality and usability in the language. In this case,
the payoff is a feature that can only be used in one-off programs.
Serious programs must avoid it, since the patterns "may be removed or
changed in future versions of Haskell". The Haskell committee
purchased this feature at a certain cost, and it is debatable whether
they got their money's worth. I'm not sure which side of that issue I
fall on. But having purchased the feature, the committee then threw
it in the garbage, squandering their sunk costs. Oh well. Not even
the Haskell committee is perfect.

I think it might be worth pointing out that the version of the program
with the n+k pattern is technically superior to the
other version. Given a negative integer argument, the first version
recurses forever, possibly taking a long time to fail and perhaps
taking out the rest of the system on which it is running. But the
n+k version fails immediately, because the n+1
pattern will only match an integer that is at least 1.

The "nasal demons" of the C standard are a joke, but a serious one.
The C standard defines what C compilers must do when presented with C
programs; it does not define what they do when presented with
other inputs, nor what other software does when presented with C
programs. The authors of the C standard clearly understood the standard's
role in the world.

XML documents may, and should, begin with an XML declaration which
specifies the version of XML being used. For example, the following is
a complete XML document, well-formed but not valid:

<?xml version="1.0"?>
<greeting>Hello, world!</greeting>

...

The version number "1.0" should be used to indicate conformance to
this version of this specification; it is an error for a document to
use the value "1.0" if it does not conform to this version of this
specification.

(Emphasis is mine.) The XML 1.0 spec is just a document. It has no power,
except to declare that certain files are XML 1.0 and certain files are
not. A file that complies with the requirements of the spec is XML 1.0;
all other files are not XML 1.0. But in the emphasized clause, the spec
says that certain behavior "is an error" if it is exhibited by
documents that do not conform to the spec. That is, it is
declaring certain non-XML-1.0 documents "erroneous". But within the
meaning of the spec, "erroneous" simply means that the documents are
not XML 1.0. So the clause is completely redundant. Documents that do
not conform to the spec are erroneous by definition, whether or not
they use the value "1.0".

It's as if the Catholic Church issued an edict forbidding all rabbis
from wearing cassocks, on pain of excommunication.

I am happy to discover that this dumb error has been removed from the
most recent edition of the XML 1.0 spec.