June 2003 Archives

Editor's note: Perl 6 Essentials is the first book to offer a peek into the next major version of the Perl language. It covers the development of Perl 6 syntax as well as Parrot, the language-independent interpreter developed as part of the Perl 6 design strategy. In this excerpt from Chapter 3 of the book, the authors take an in-depth look at some of the most important principles of natural language and their impact on the design decisions made in Perl 6.

Introduction

At the heart of every language is a core set of ideals that give
the language its direction and purpose. If you really want to understand the
choices that language designers make--why they choose one feature over another
or one way of expressing a feature over another--the best place to start is
with the reasoning behind the choices.

Perl 6 has a unique set of influences. It has deep roots in Unix
and the children of Unix, which gives it a strong emphasis on utility and
practicality. It's grounded in the academic pursuits of computer science and
software engineering, which gives it a desire to solve problems the right way,
not just the most expedient way. It's heavily steeped in the traditions of
linguistics and anthropology, which gives it the goal of comfortable
adaptation to human use. These influences and others like them define the
shape of Perl and what it will become.

Linguistic and Cognitive Considerations

Perl is a human language. Now, there are significant differences
between Perl and languages like English, French, German, etc. For one, it is
artificially constructed, not naturally occurring. Its primary use, providing
a set of instructions for a machine to follow, covers a limited range of human
existence. Even so, Perl is a language humans use for communicating. Many of
the same mental processes that go into speaking or writing are duplicated in
writing code. The process of learning to use Perl is much like learning to
speak a second language. The mental processes involved in reading are also
relevant. Even though the primary audience of Perl code is a machine, as often
as not humans have to read the code while they're writing it, reviewing it, or
maintaining it.

Many Perl design decisions have been heavily influenced by the
principles of natural language. The following are some of the most important
principles, the ones we come back to over and over again while working on the
design and the ones that have had the greatest impact.

The Waterbed Theory of Complexity

The natural tendency in human languages is to keep overall
complexity about equivalent, both from one language to the next, and over time
as a language changes. Like a waterbed, if you push down the complexity in one
part of the language, it increases complexity elsewhere. A language with a
rich system of sounds (phonology) might compensate with a simpler syntax. A
language with a limited sound system might have a complex way of building
words from smaller pieces (morphology). No language is complex in every way,
as that would be unusable. Likewise, no language is completely simple, as too
few distinctions would render it useless.

The same is true of computer languages. They require a constant
balance between complexity and simplicity. Restricting the possible operators
to a small set leads to a proliferation of user-defined methods and
subroutines. This is not a bad thing, in itself, but it encourages code that
is verbose and difficult to read. On the other hand, a language with too many
operators encourages code that is heavy in line noise and difficult to read.
Somewhere in the middle lies the perfect balance.

The Principle of Simplicity

In general, a simple solution is preferable to a complex one. A
simple syntax is easier to teach, remember, use, and read. But this principle
is in constant tension with the waterbed theory. Simplification in the wrong
area is one danger to avoid. Another is false simplicity or
oversimplification. Some problems are complex and require a complex solution.
Perl 6 grammars aren't simple. But they are complex at the language level in a
way that allows simpler solutions at the user level.

The Principle of Adaptability

Natural languages grow and change over time. They respond to
changes in the environment and to internal pressure. New vocabulary springs up
to handle new communication needs. Old idioms die off as people forget them,
and newer, more relevant idioms take their place. Complex parts of the system
tend to break down and simplify over time. Change is what keeps language
active and relevant to the people who use it. Only dead languages stop
changing.

The plan for Perl 6 explicitly includes plans for future
language changes. No one believes that Perl 6.0.0 will be perfect, but at the
same time, no one wants another change process quite as dramatic as Perl 6. So
Perl 6 will be flexible and adaptable enough to allow gradual shifts over
time. This has influenced a number of design decisions, including making it
easy to modify how the language is parsed, lowering the distinctions between
core operations and user-defined operations, and making it easy to define new
operators.

The Principle of Prominence

In natural languages, certain structures and stylistic devices
draw attention to an important element. This could be emphasis, as in "The dog stole my wallet" (the dog, not the man), or extra
verbiage, as in "It was the dog who stole my wallet," or a shift to an unusual
word order, "My wallet was stolen by the dog" (my wallet, not my shoe, etc.),
or any number of other verbal tricks.

Perl is designed with its own set of stylistic devices to mark
prominence, some within the language itself, and some that give users
flexibility to mark prominence within their code. The NAMED blocks use all capitals to draw attention to the
fact that they're outside the normal flow of control. Perl 5 has an alternate
syntax for control structures like if and for, which moves them to the end to serve as statement
modifiers (because Perl is a left-to-right language, the left side is always a
position of prominence). Perl 6 keeps this flexibility, and adds a few new
control structures to the list.
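The statement-modifier form mentioned above looks like this in Perl 5 (a small sketch; the variable names are invented):

```perl
# The condition can lead, as a full control structure...
if ($verbose) {
    print "scanning $file\n";
}

# ...or trail as a statement modifier, keeping the action in
# the prominent left-hand position.
print "scanning $file\n" if $verbose;
print for @log_lines;    # 'for' works as a modifier too
```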

The balance for design is to decide which features deserve to be
marked as prominent, and where the syntax needs a little flexibility so the
language can be more expressive.

The Principle of End Weight

Natural languages place large complex elements at the end of
sentences. So, even though "I gave Mary the book" and "I gave the book to
Mary" are equally comfortable, "I gave the book about the history of
development of peanut-based products in Indonesia to Mary" is definitely less
comfortable than the other way around. This is largely a mental parsing
problem. It's easier to interpret the major blocks of the sentence all at once
than to start with a few, work through a large chunk of minor information, and
then go back to fill in the major sentence structure. Human memory is
limited.

End weight is one of the reasons regular expression modifiers
were moved to the front in Perl 6. It's easier to read a grammar rule when you
know things like "this rule is case insensitive" right at the start. (It's
also easier for the machine to parse, which is almost as important.)
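The contrast is easy to see side by side (a sketch; the Perl 6 form follows the design documents of the time):

```perl
# Perl 5: the /i modifier trails the pattern, so you learn that
# the match is case insensitive only after reading all of it.
$line =~ m/^\s*select/i;

# Perl 6 (as then designed): the modifier leads.
#   $line ~~ m:i/^ \s* select/;
```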

End weight is also why there has been some desire to reorder the
arguments in grep to:

grep @array { potentially long and complex block };

But that change causes enough cultural tension that it may not
happen.
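For contrast, Perl 5's current argument order puts the block first, which is exactly what the proposed reordering would reverse (a sketch with invented data):

```perl
# Perl 5 today: the block leads, so a long test pushes the data
# list far to the right, against the principle of end weight.
my @overdue = grep {
    $_->{status} eq 'open'
        and $_->{age_in_days} > 30
} @tickets;
```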

The Principle of Context

Natural languages use context when interpreting meaning. The
meanings of "hot" in "a hot day," "a hot stereo," "a hot idea," and "a hot
debate" are all quite different. The implied meaning of "it's wet" changes
depending on whether it's a response to "Should I take a coat?" or "Why is the
dog running around the kitchen?" The surrounding context allows us to
distinguish these meanings. Context appears in other areas as well. A painting
of an abstract orange sphere will be interpreted differently depending on
whether the other objects in the painting are bananas, clowns, or basketball
players. The human mind constantly tries to make sense of the universe, and it
uses every available clue.

Perl has always been a context-sensitive language. It makes use
of context in a number of different ways. The most obvious use is scalar and
list contexts, where a variable may return a different value depending on
where and how it's used. These have been extended in Perl 6 to include string
context, boolean context, numeric context, and others. Another use of context
is the $_ defaults, like print, chomp, matches, and now
when.
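Scalar and list context, and the $_ defaults, look like this in Perl 5 (a minimal sketch):

```perl
my @pets = ('cat', 'dog', 'fish');

my @copy  = @pets;   # list context: a copy of the three elements
my $count = @pets;   # scalar context: 3, the element count

# $_ defaults: print falls back to $_ when given no argument.
for (@pets) {
    print "$_\n";
}
```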

Context-dependent features are harder to write an interpreter
for, but they're easier on the people who use the language daily. They fit in
with the way humans naturally think, which is one of Perl's top goals.

The Principle of DWIM

In natural languages there is a notion called "native speaker's
intuition." Someone who speaks a language fluently will be able to tell
whether a sentence is correct, even if they can't consciously explain the
rules. (This has little to do with the difficulty English teachers have
getting their students to use "proper" grammar. The rules of formal written
English are very different from the rules of spoken English.)

As much as possible, features should do what the user expects.
This concept of DWIM, or "Do What I Mean," is largely a matter of intuition.
The user's experiences, language exposure, and cultural background all
influence their expectations. This means that intuition varies from person to
person. An English speaker won't expect the same things as a Dutch speaker,
and an Ada programmer won't expect the same things as a COBOL programmer.

The trick in design is to use the programmer's intuitions
instead of fighting against them. A clearly defined set of rules will never
match the power of a feature that "just seems right."

Perl 6 targets Perl programmers. What seems right to one Perl
programmer may not seem right to another, so no feature will please everyone.
But it is possible to catch the majority cases.

Perl generally targets English speakers. It uses words like
"given," which gives English speakers a head start in understanding its
behavior in code. Of course, not all Perl programmers are English speakers. In
some cases idiomatic English is toned down for broader appeal. In grammar
rules, ordinal modifiers have the form 1st, 2nd, 3rd, 4th, etc., because those are most natural for native
English speakers. But they also have an alternate form 1th, 2th, etc., with the
general rule Nth, because the English endings for
ordinal numbers are chaotic and unfriendly to non-native speakers.

The Principle of Reuse

Human languages tend to have a limited set of structures and
reuse them repeatedly in different contexts. Programming languages also employ
a set of ordinary syntactic conventions. A language that used { } braces to delimit loops but paired keywords to delimit if statements (like if ... then ... end if) would be incredibly annoying. Too many rules make it hard to find the pattern.

In design, if you have a certain syntax to express one feature,
it's often better to use the same syntax for a related feature than to invent
something entirely new. It gives the language an overall sense of consistency,
and makes the new features easier to remember. This is part of why grammars
are structured as classes. Grammars could use any syntax, but classes already
express many of the features grammars need, like inheritance and the concept
of creating an instance.
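A sketch of what this reuse looks like, following the Perl 6 design documents of the time (the grammar names are invented):

```perl
# Perl 6 (as then designed): a grammar is declared with
# class-like syntax and can inherit like a class.
grammar Letter {
    rule greeting { Dear <name> ',' }
}

grammar Letter::Formal is Letter {
    # Overrides the inherited rule, just as a subclass
    # overrides a method.
    rule greeting { 'To whom it may concern:' }
}
```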

The Principle of Distinction

The human mind has an easier time identifying big differences
than small ones. The words "cat" and "dog" are easier to tell apart than
"snore" and "shore." Usually context provides the necessary clues, but if
"cats" were "togs," we would be endlessly correcting people who heard us wrong
("No, I said the Johnsons got a new dog, not tog, dog.").

The design consideration is to build in visual clues to subtle
contrasts. The language should avoid making too many different things similar.
Excessive overloading reduces readability and increases the chance for
confusion. This is part of the motivation for splitting the two meanings of
eval into try and eval, the two meanings of for
into for and loop, and
the two uses of sub into sub and method.
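The eval split is the easiest of these to illustrate; in Perl 5 one keyword covers two unrelated jobs (a sketch):

```perl
# Perl 5: block eval traps exceptions...
eval { risky_operation() };
warn "caught: $@" if $@;

# ...while string eval compiles code at runtime.
eval q{print "generated at runtime\n"};

# Perl 6 (as then designed) splits these into 'try' for the
# first meaning and 'eval' for the second.
```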

Distinction and reuse are in constant tension. If too many
features are reused and overloaded, the language will begin to blur together.
Far too much time will be spent trying to figure out exactly which use is
intended. But, if too many features are entirely distinct, the language will
lose all sense of consistency and coherence. Again, it's a balance.

Language Cannot Be Separated from Culture

A natural language without a community of speakers is a dead
language. It may be studied for academic reasons, but unless someone takes the
effort to preserve the language, it will eventually be lost entirely. A
language adds to the community's sense of identity, while the community keeps
the language relevant and passes it on to future generations. The community's
culture shapes the language and gives it a purpose for existence.

Computer languages are equally dependent on the community behind
them. You can measure it by corporate backing, lines of code in operation, or
user interest, but it all boils down to this: a programming language is dead
if it's not used. The final sign of language death is when there are no
compilers or interpreters for the language that will run on existing hardware
and operating systems.

For design work this means it's not enough to only consider how
a feature fits with other features in the language. The community's traditions
and expectations also weigh in, and some changes have a cultural price.

The Principle of Freedom

In natural languages there is always more than one way to
express an idea. The author or speaker has the freedom, and the
responsibility, to pick the best phrasing--to put just the right spin on the
idea so it makes sense to their audience.

Perl has always operated on the principle that programmers
should have the freedom to choose how to express their code. It provides easy
access to powerful features and leaves it to the individuals to use them
wisely. It offers customs and conventions rather than enforcing laws.

This principle influences design in several ways. If a feature
is beneficial to the language as a whole, it won't be rejected just because
someone could use it foolishly. On the other hand, we aren't above making some
features difficult to use, if they should be used rarely.

Another part of the design challenge is to build tools that will
have many uses. No one wants a cookbook that reads like a Stephen King novel,
and no one wants a one-liner with the elaborate structure of a class
definition. The language has to be flexible to accommodate freedom.

The Principle of Borrowing

Borrowing is common in natural languages. When a new technology
(food, clothing, etc.) is introduced from another culture, it's quite natural
to adopt the original name for it. Most of the time borrowed words are adapted
to the new language. In English, no one pronounces "tortilla," "lasagna," or
"champagne" exactly as in the original languages. They've been altered to fit
the English sound system.

Perl has always borrowed features, and Perl 6 will too. There's
no shame in acknowledging that another language did an excellent job
implementing a particular feature. It's far better to openly borrow a good
feature than to pretend it's original. Perl doesn't have to be different just
for the sake of being different. Most features won't be adopted without any
changes, though. Every language has its own conventions and syntax, and many
aren't compatible. So, Perl borrows features, but uses equivalent structures
to express them.

Architectural Considerations

The second set of principles governs the overall architecture of
Perl 6. These principles are connected to the past, present, and future of
Perl, and define the fundamental purpose of Perl 6. No principle stands alone;
each is balanced against the others.

Perl Should Stay Perl

Everyone agrees that Perl 6 should still be Perl, but the
question is, what exactly does that mean? It doesn't mean Perl 6 will have
exactly the same syntax. It doesn't mean Perl 6 will have exactly the same
features. If it did, Perl 6 would just be Perl 5. So, the core of the question
is what makes Perl "Perl"?

True to the original purpose

Perl will stay true to its designer's original intended purpose.
Larry wanted a language that would get the job done without getting in his
way. The language had to be powerful enough to accomplish complex tasks, but
still lightweight and flexible. As Larry is fond of saying, "Perl makes the
easy things easy and the hard things possible." The fundamental design
philosophy of Perl hasn't changed. In Perl 6, the easy things are a little
easier and the hard things are more possible.

Familiarity

Perl 6 will be familiar to Perl 5 users. The fundamental syntax
is still the same. It's just a little cleaner and a little more consistent.
The basic feature set is still the same. It adds some powerful features that
will probably change the way we code in Perl, but they aren't required.

Learning Perl 6 will be like American English speakers learning
Australian English, not English speakers learning Japanese. Sure, there are
some vocabulary changes, and the tone is a little different, but it is
still--without any doubt--English.

Translatable

Perl 6 will be mechanically translatable from Perl 5. In the
long term, this isn't nearly as important as what it will be like to write
code in Perl 6. But during the transition phase, automatic translation will be
important. It will allow developers to start moving ahead before they
understand every subtle nuance of every change. Perl has always been about
learning what you need now and learning more as you go.

Important New Features

Perl 6 will add a number of features such as exceptions,
delegation, multi-method dispatch, continuations, coroutines, and currying, to
name a few. These features have proven useful in other languages and provide a
great deal of power for solving certain problems. They improve the stability
and flexibility of the language.

Many of these features are traditionally difficult to
understand. Perl takes the same approach as always: provide powerful tools,
make them easy to use, and leave it up to the user to decide whether and how
to use them. Most users probably won't even know they're using currying when
they use the assuming method.
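The assuming method, as described in the design documents of the time, curries a routine by pre-binding some of its arguments (a sketch; the subroutine and its names are invented):

```perl
# Perl 6 (as then designed):
sub taxed(Num $rate, Num $price) { $price * (1 + $rate) }

# Currying without the jargon: fix the rate once, reuse it.
my &vat = &taxed.assuming(rate => 0.175);

print vat(100);    # same as taxed(rate => 0.175, 100)
```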

Features like these are an important part of preparing Perl for
the future. Who knows what development paradigms might develop in a language
that has this combination of advanced features in a form easily approachable
by the average programmer. It may not be a revolution, but it's certainly
evolution.

Long-Term Usability

Perl 6 isn't a revision intended to last a couple of years and
then be tossed out. It's intended to last 20 years or more. This long-range
vision affects the shape of the language and the process of building it. We're
not interested in the latest fad or in whipping up a few exciting tricks. We
want strong, dependable tools with plenty of room to grow. And we're not
afraid to take a little extra time now to get it right. This doesn't mean Perl
6.0 will be perfect, any more than any other release has been perfect. It's
just another step of progress.

Welcome to the third of my US tour Perl 6 summaries. Once again I'm pleased
to report that the denizens of the Perl 6 mailing lists continue to make the
life of a touring summarizer an easy one by not posting all that much to the
lists. So, I can sit here in my room at the Shaker Inn in Enfield and marvel at
the traffic noise outside, wonder about the car next door with the New
Hampshire plates reading PERLFAN, and just generally appreciate the loveliness
of the room.

At the end of last week, Dan outlined his thoughts on how exception handling
will work in Parrot. This week, people talked about it. Discussion revolved
around how much information should be attached to an exception and how/whether
we should support resumable exceptions.

Last week I said that "I get the strong feeling that Leo Tötsch isn't
entirely happy with the new Continuation Passing Style". This week Leo
corrected me; I hadn't noticed that the speed issues had been addressed by the
latest changes to parrot (in fact the current CPS implementation is faster than
the old invoke/ret scheme).

Sean O'Rourke addressed Leo's problem with the Perl 6 Compiler tests failing
by saying that the compiler should really be ported to use CPS rather than
implementing a new variant of the Sub PMC that uses the old scheme. Leo
reckoned that such a port wasn't currently doable because IMCC needed to be
modified to use the CPS scheme, which would also involve reworking the register
allocator. Given Leo's prodigious rate of implementation, this may have already
happened.

Clinton A. Pierce had reported a memory leak in Parrot, but tracked it down
to a situation where he was doing:

.arg 0
call _foo

and forgetting to take the 0 off the stack. However, even after
he'd fixed that, he had segfault issues, and posted a (largish) code fragment
that tweaked the bug.

It appears that Parrot wasn't throwing warnings when stacks got too big, just
failing silently. Leo added a check for too-deeply-nested stacks, which at
least avoids segfaulting on logic bugs.

Leo and Dan discussed other places where such limit checking should be put
in place. Dan also muttered something about turning stack chunks into PMCs,
allowing for the garbage collection of stack frames. Leo also muttered about
the proliferation of stack implementations in Parrot (there are five) and
thinks it should be possible to have one general stack engine.

Jürgen Bömmels is in the process of porting the IO subsystem from
its current mem_sys_alloc/free based implementation to the sunny,
garbage-collected uplands of a PMC based implementation. However, he's run into
a problem; some of the operations in op.ops use integer File
Descriptors, grabbing information from a table in the interpreter structure.
This gets in the way of garbage collection, since any integer could be a file
descriptor.

Jürgen proposed removing the integer file descriptors and mandating
that ParrotIO PMCs be the only way to access IO (including the standard
STDIN, STDOUT, and STDERR). He proposed
adding get_std[in|out|err] ops to get at the standard streams.

Dan suggested that Jürgen Just Do It; the current IO system is more than
slightly hackish, having essentially been put in place until something better
came along.

Want to get involved in the Parrot development process? Don't know much
about Virtual Machine design and implementation? Do know Perl? Dan has a small
but interesting task for you.

At present, Parrot gets built without any compiler level optimizations
turned on because files like tsq.c can't have any optimizations
turned on (tsq.c is the thread safe queue module, which is
"annoyingly execution-order-dependent because it has to operate safely as
interrupt code potentially interrupting itself").

Dan would like a version of Configure.pl which can build a
Makefile (or whatever build tool we end up using) with per-C-file
compiler flags, and it needs to be possible to override those flags, per file,
by the platform configuration module.

Interested? David Robins seems to be, and he asked whether the build system
had to be Makefile based. Dan says not, but the really important
thing is that the resulting build script, or the config system that generates
the script, be adequately understandable and maintainable.

Leo explained what was going on with Clint's segfault: essentially it boils
down to issues with register allocation not being aware of .local scopes. He
recommended that Clint use either true globals or lexicals instead of
.local. Clint isn't so sure that this is a good idea, pointing out
that there are occasions when having lexically scoped names at the IMCC level
as well as at the level of lexical pads would be very useful.

Luke Palmer has been thinking about value and reference objects. He wondered
if there was any value in a valclone operator alongside
set and clone which would allow the target PMC to
decide whether to use set or clone semantics. He also
offered a patch implementing the operator if people thought it would be useful.
Leo Tötsch wasn't sure the new operator was necessary.

Klaas-jan Stol noted that he'd encountered problems with reference/value
confusion when he'd been working on his Lua compiler, but he wondered if the
problem couldn't be solved by having a general, language independent "argument"
PMC class. (I'm not sure I understood what he meant by this so I'm hoping for
an explanation with code fragments).

There is a story that UK prime minister Harold MacMillan was asked by a
student what it was that concerned him most as Prime Minister. Mac replied
"Events dear boy, events."

Leo Tötsch laid out his thoughts and ensuing questions about
Exceptions, events, and threads, and how they played together. There has been a
small amount of discussion in response to this, but I think everyone's
currently thinking hard about the issue....

Luke Palmer wondered if there would be a standard way of inspecting the call
stack (for debugging/caller/etc). (I think I'm going to switch to
using the phrase 'call chain' rather than call stack, as the presence of
continuations makes the call 'stack' look pretty unstacklike....).

Leo and Dan both thought that this would be a high level language issue
rather than a Parrot issue, though Dan did note that there might be useful
things that Parrot could do to make such introspection easier/possible.

Leo Tötsch has been thinking about occasions when one might need to
monkey with the internals of an existing continuation (he was thinking about
the warnings state) and proposed several solutions. Dan favoured
his new opcode, updatecc and thought it would be good to be able
to broaden the scope of what one could update in a continuation/context. This
scared Leo somewhat, but Dan came up with some examples of where it might prove
to be useful.

Miko O'Sullivan engaged in some summer daydreaming by asking what everyone
was looking forward to most from Perl 6. Miko himself is looking forward to
more Pure Perl modules. If Perl 6 delivers on its performance promises then
there are going to be more and more things where implementing directly in Perl
will be fast enough, and Perl is so much easier to implement in than C....

Jonathan Scott Duff incurred Cozeny when he said that he's hoping that by
this time next year we'll have an 85% complete Perl 6 that will be usable in
production (by brave people). Simon Cozens noted that we already have such a
beast and it's called Perl 5. For some reason this led to a new marketing
slogan being proposed: Perl 6, the reconstituted cheeseburger of programming
languages. Somehow I don't think that one's going to fly. (I just read this bit
out loud to my wife and she says that she really doesn't like the thought of a
flying reconstituted cheeseburger, so I think we'd best leave it at that.

Tcha! I announce the retirement of Leon Brocard from his post as Perl 6
Summary Running Joke and put the right to choose the next joke up for auction
at YAPC. And what do you know, the winner of the auction nominates Leon Brocard
as the new running joke. So, settle in for another year of desperate
rationalizations for mentioning Leon in these summaries. Who knows, maybe
Leon's Parrot related workrate will go up to such an extent that it'll be easy,
but somehow I doubt it.

Thanks to, in chronological order, Adam Turoff and Lisa Wolfisch; Walt
Mankowski, Mark-Jason and Laurie Dominus; Dave Adler; Dan and Karen Sugalski;
and Uri and Linda Guttman for being such fine hosts in Washington,
Philadelphia, New York, Hartford, and Boston respectively. Next time we do
this, we will not be attempting to visit quite so many cities on the Eastern
Seaboard in such a short time. At one point all we were seeing was Perl nerds
and freeways in rapid succession.


Welcome to my first anniversary issue of the Perl 6 Summary. Hopefully there
won't be too many more anniversaries to celebrate before we have a real, running
Perl 6, but there's bound to be ongoing development after that. My job is
secure!

Because I can't think of anything better to do, I'll start with the action
on the perl6-internals list.

Klaas-Jan Stol wondered what he'd missed; last time he looked Parrot wasn't
doing continuation passing. He asked why Dan had chosen to go down that route.
Dan answered that he had realized that "we had to save off so much state that
we essentially had a continuation anyway". Explicitly going with continuation
passing just made things more formal, and wrapped up all the context saving
behind a straightforward interface. He promised a more detailed explanation
later.

Portable way of finding libc, unbuffered reads

just works, which simultaneously pleases and scares him silly. He wondered
if there was a good way of finding the standard C library on a Unix system
without scary hardwiring as in the fragment above. He also wondered if there
was an "official" way of getting an unbuffered read via parrot.

Jens Rieks came up with a gloriously evil way of finding libc. The theory
goes that Parrot is linked against libc, so you just have to
dlopen the running image and you can call libc functions to your
heart's content. To dlopen the running image you need to pass a NULL pointer to
the underlying loadlib so he offered a patch to core.ops
which interpreted an empty string as a pointer to NULL. Leo and Dan were
impressed and the patch (or something similar) was applied. I get the feeling
that Dan wants to do something a little less hacky to access the current
executable though....

Clint noted that the dlopen-the-running-image-by-passing-a-null-pointer
trick doesn't work on Windows, but outlined a workaround for that too. Jens
Rieks suggested a better Windows workaround.

Nobody came up with an approved way of doing getc, but once you
have libc loaded you can just use its getc.

OO, Objects

If you look in a fresh from CVS parrot directory you'll now find
object.ops, which will be the cause of much rejoicing in many
places. Dan's nailed the object spec down enough that he's started implementing
a few of the required ops. As he points out, what we have is "hardly
sufficient", but everyone's got to start somewhere, the journey of a thousand
miles begins with but a single step, etc.

Judging by the number of comments (none), everyone was stunned into silence.

More CPS shenanigans

I get the strong feeling that Leo Tötsch isn't entirely happy with the
new Continuation Passing Style regime. He's worried that the P6C tests break,
and that CPS subs are some 3 times slower for calling the sub. This led into a
discussion of what context really must go into a continuation, whether we can
get away with different classes of continuation (hold more or less contextual
information) and other ways of possibly speeding things up.

I'm not sure Leo has been entirely convinced, but I'm confident that Dan's
not going to change his mind about this.

Leo later submitted a large patch which unifies the various subroutine
related PMCs to take into account CPS.

Exceptions

Now that the rules for subs/methods etc are settling down, Dan outlined his
thoughts on exception handlers. If I'm understanding him correctly, an
exception handler is just a continuation that you invoke with the exception as
its only argument. There were no comments by the end of the week.

Meanwhile in perl6-language

The language list was quiet again. Maybe everyone was doing face-to-face
things at YAPC. Or on holiday. Or something.

printf like formatting in interpolated strings

Remember last week I mentioned that Luke Palmer had made a cool suggestion
about printf like formatting in string interpolation? (He
suggested a syntax like rx/<expression> but
formatted(<formatspec>)/, which I for one quite liked).

Edwin Steiner wasn't so keen, noting that Luke's suggestion was actually
more verbose than rx/sprintf <formatspec>,
<expression>/. He wasn't entirely sure that having a formatting
rule attached to a value with a 'but' was really the right thing to do (it does
rather violate the whole model/view/controller abstraction for instance).
Edwin's favoured interpolation syntax was,

(or something along those lines). Edwin went on to extend his idea, allowing
for all sorts of clever interpolation rules, leading Dave Storrs to comment
that the Obfuscated Perl people would certainly thank him if the suggestions
went in.

Arcadi Shehter came up with yet another suggested syntax involving
: (neglecting the important rule that, whilst one's heart may
belong to Daddy, the : belongs to Larry. And I'm really trying not
to think about the images that conjures up).

At this point, we ended up in a philosophical discussion about when was the
right time to do stuff, generality of solutions and Perl remaining Perl. I
remain confident that come the appropriate time, Larry and/or Damian (more
likely Damian given some of the stuff he was showing off to do with formatting
at YAPC) will nail things down and we'll all go "Of course!" and move onto the
next thing.

Dispatching, Multimethods and the like

Adam Turoff noted that, in his YAPC opening talk, Damian had mentioned the
catchall DISPATCH sub, which will allow for altering the dispatch behaviour to
do any magic you choose. The 'problem' with DISPATCH is defining its
interaction with the likes of AUTOLOAD and other built-in dynamic dispatch
behaviours, which will need to be nailed down.

Dan Sugalski jetted over from perl6-internals to give the lowdown on what
would be available at the parrot level (which may or may not be exposed at the
Perl 6 language level). Essentially, what we know is that there will be the
capability to insert any dispatch method you like, but the details of how you'd
do it aren't thrashed out yet. It almost certainly won't be easy, but that's a
good thing.

Type Conversion Matrix, Pragmas (Take 4)

Acknowledgements, Announcements and Apologies

Whee! My first anniversary! I confess that when I started writing these
things I didn't expect to keep going for this long. Now I don't expect to ever
stop.

After due and careful consideration of a short shortlist, I should like to
award an anniversary virtual white parrot award to Leopold "Patchmonster"
Tötsch for his astonishing contribution to the Parrot core. Other mental
nominees for this award were: Clinton A. Pierce, for BASIC and the associated
bug finding; Leon Brocard, for humorous reasons; and Robert Spier and Ask
Bjørn Hansen, for invaluable and invisible work on websites, CVS, and
mailing list maintenance.

I eliminated the core design team from consideration for the above award,
but I'd like to formally thank Larry, Damian, Allison and Dan, without
whom...

As I said last week, Leon Brocard is no longer the summaries' running joke.
However, I auctioned off the right to specify the next running joke at YAPC
last week; next week should see the unveiling of the new, improved Perl 6
Summary Running Joke.

If you've appreciated this summary, please consider one or more of the
following options:

In the previous hidden
treasures article, we looked at some easy-to-use (but not well-known) modules in
the Perl Core. In this article, we dig deeper to uncover some of the truly precious
and unique gems in the Perl Core.

Wow, that's a lot of work! I've already given up on my program, not to
mention the syntax error in the declaration of TUESDAY. Now let's
try this again using the multiple declaration syntax, new to the
constant pragma for Perl 5.8.0.
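The multiple declaration syntax takes a single hash reference of name/value
pairs. A sketch of the day-of-week constants discussed above (the actual
names and values here are assumptions for illustration):

```perl
# Several constants in one declaration -- a hash reference of
# name/value pairs. This form is new in Perl 5.8.0.
use constant {
    SUNDAY  => 0,
    MONDAY  => 1,
    TUESDAY => 2,
};

print TUESDAY, "\n";    # prints 2
```

Each name becomes an inlinable constant subroutine, just as with the older
one-at-a-time form.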

The only caveat is that this syntax is new to Perl 5.8.0. If you
intend to distribute a program using multiple constant declarations,
then remember that it won't run on older perls. You may want to specify what
version of Perl is required for your program to work.

use 5.8.0;

Perl will throw a fatal error if the version is anything less than
5.8.0.

This module allows us to play with Perl's subroutine attribute syntax
by defining our own attributes. This is a powerful module with a
rich feature set. Here I'll give you an example of writing a minimal
debugger using subroutine attributes.

First, we need to create an attribute. An attribute handler is any
subroutine that is itself marked with the :ATTR attribute. Setting up
our debug attribute is easy.

use Attribute::Handlers;

sub debug :ATTR {
    my (@args) = @_;
    warn "DEBUG: @args\n";
}

Now we have a simple debug attribute named :debug. Using
our attribute is also easy.

sub table :debug {
    # ...
}
table(%data);
table(%other_data);

Now, since attributes are compiled just before runtime, in the CHECK
phase, our debugging output will only be sent to STDERR once. For the
code above, we might get output like this:

That debug string represents some of the information we get in an attribute
handler. The first argument is the name of the package the attribute
was declared in. Next is a reference to the symbol table entry for the
subroutine, followed by a reference to the subroutine itself. Next comes
the name of the attribute, followed by any data associated with the attribute
(none in this case). Finally, we get the name of the phase in which the
handler was invoked.

At this point, our debugging attribute isn't useful, but the parameters
we are given to work with are promising. We can use them to invoke
debugging output each time the subroutine is called. Put on your hard hat,
this is where things get interesting.

First, let us take a look at how we want to debug our subroutine. I think we'd
like different levels of debugging output. At the lowest level (1), the name
of the subroutine being invoked should be sent to STDERR. At the next
level (2), it would be nice to be notified of entry and exit of the subroutine. Going further (level 3), we might want to see the arguments passed to the
subroutine. Even more detail can be done, but we'll save that for later and
stop at three debug levels.

In order to do this voodoo, we need to replace our subroutine with one doing
the debugging for us. The subroutine doing the debugging must then invoke
our original code with the parameters passed to it, and return the proper output
from it. Here is the implementation for debug level one (1).
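A sketch of what that level-one handler can look like; the message format
and wrapper details are assumptions, but the symbol-table pieces match the
fragments explained next:

```perl
use Attribute::Handlers;

# A level-one :debug handler: look up the sub's fully qualified name,
# then replace the sub with a wrapper that reports each call.
sub debug :ATTR {
    my ($package, $symbol, $code, $attr, $data, $phase) = @_;

    # Build "Package::name" from the symbol table entry.
    my $name = join '::', *{$symbol}{PACKAGE}, *{$symbol}{NAME};

    no warnings 'redefine';
    *{$symbol} = sub {
        warn "DEBUG: $name called\n";   # level 1: report the call
        return $code->(@_);             # invoke the original code
    };
}
```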

There are some sticky bits in the debug subroutine that I need to explain
in more detail.

my $name = join '::', *{$symbol}{PACKAGE}, *{$symbol}{NAME};

This line is used to find the name and package of the subroutine we're debugging.
We do the lookups from the symbol table, using the reference to the symbol that
our attribute is given.

no warnings 'redefine';

Here we turn off warnings about redefining a subroutine, because we're going
to redefine a subroutine on purpose.

*{$symbol} = sub { ... };

This construct simply replaces the CODE slot in the symbol table entry with
our anonymous subroutine (a code reference).

In this example, we set the default log level to one (1), set up some helper variables, and replace our table() subroutine with a debugging closure. I
call the anonymous subroutine a closure because we are reusing some variables
that are defined in the debug() subroutine. Closures are explained in greater
detail in perlref (perldoc perlref from the command line).

To set the debug level for a subroutine, just pass a number in the
:debug attribute, as in sub table :debug(2).

In this example, we use sprintf to make our debugging statements a little
more readable as complexity grows. This time, we cannot return directly from
the original code reference. Instead, we have to capture the output and return
it at the end of the routine. When the table() subroutine defines its debug
level as :debug(2), the output looks like this.

Attribute::Handlers can do quite a lot more than what I've shown you already.
If you like what you see, then you may want to add attributes to variables or worse.
Please read the thorough documentation provided with the module.

This module is a well-known Perl debugging module. It generates Perl source
code from Perl source code provided to it. This may seem useless to some,
but to the aspiring obfuscator, it's useful in understanding odd code.

perl -snle'$w=($b="bottles of beer")." on the wall";$i>=0?print:last
LINE for(map "$i $_",$w,$b),"take one down, pass it around",
do{$i--;"$i $w!"}' -- -i=100

That is an example of an obfuscated program. It could be worse, but it's
pretty bad already. Understanding this gem is as simple as adding
-MO=Deparse to the command line. This will use B::Deparse to turn that mess into more readable Perl source code.

To use B::Deparse in the everyday example, just run your program using
it on the command line.

perl -MO=Deparse prog.pl

But if you want to have some real fun, then dig into the object-oriented
interface for B::Deparse. There you will find an amazing method called
coderef2text(). This method turns any code reference into text, just like
the command-line trick does for an entire program. Here is a short example.
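A short example of the sort described; the sample subroutine is my own:

```perl
use B::Deparse;

# Turn an anonymous subroutine back into Perl source text.
my $deparse = B::Deparse->new;
my $add     = sub { my ($x, $y) = @_; return $x + $y };
my $source  = $deparse->coderef2text($add);

print "sub $source\n";
```

coderef2text() returns the body of the subroutine as a brace-delimited
block, so printing it with a leading "sub" gives you something you could
paste back into a program.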

There are more methods in the B::Deparse class that you can use to muck
around with the results of coderef2text(). This module is powerful
and useful for debugging. I suggest you at least use the simple version if
code becomes ambiguous and incomprehensible.

While B::Deparse is good at what it does, it's not complete. Each
version of Perl has made it better, and it's good in Perl 5.8.0. Don't
trust B::Deparse to get everything right, though. For instance, I
wouldn't trust it to serialize code for later use.

This module, just like the constant pragma, is well-known. The
difference is that Class::Struct is not often used. For many programs,
setting up a class to represent data would be ideal, but overkill.
Class::Struct gives us the opportunity to live in our ideal world
without the pain of setting up any classes by hand. Here
is an example of creating a class with Class::Struct. In this
example, we're going to use compile-time class declarations, a new feature
in Perl 5.8.0.

Here we've created a class called Person with three attributes.
name can contain a simple scalar value, represented by the dollar
sign ($). mom and dad are both objects of type Person.
Using our class within the same program is the same as using any
other class.

Class::Struct classes are simple by design, and can get more complex
with further creativity. For instance, to add a method to the Person
class you can simply declare it in the Person package. Here is a
method named birth() which should be called on a Person object. It
takes the name of the baby as an argument, and optionally the father
(a Person object). Returned is a new Person object representing
the baby.
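A sketch of such a method; the Person declaration is repeated here so the
example stands alone, and the method body is an assumption consistent with
the description above:

```perl
use Class::Struct Person => {
    name => '$',
    mom  => 'Person',
    dad  => 'Person',
};

package Person;

# birth(): call on the mother; returns a new Person for the baby.
sub birth {
    my ($self, $name, $father) = @_;
    my %init = ( name => $name, mom => $self );
    $init{dad} = $father if defined $father;   # father is optional
    return Person->new(%init);
}

package main;

my $mom  = Person->new( name => 'Eve' );
my $baby = $mom->birth('Ana');
```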

Encode is Perl's interface to Unicode. An explanation of Unicode itself
is far beyond the scope of this article. In fact, it's far beyond the scope
of most of us. This module is powerful. I'm going to provide some examples
and lots of pointers to the appropriate documentation.

The first function of the API to learn is encode(). encode() will
convert a string from Perl's internal format to a series of octets in the
encoding you choose. Here is an example.

use Encode;
my $octets = encode( "utf8", "Hello, world!" );

Here we have turned the string Hello, world! into a utf8 string, which is now in $octets. We can also decode strings using the decode() function.

my $string = decode( "utf8", $utf8_string );

Now we've decoded a utf8 string into Perl's internal string representation.
Since utf8 is a common encoding to deal with, there are two helper functions:
encode_utf8() and decode_utf8(). Both of these functions take a string
as their only argument.

A list of supported encodings can be found in Encode::Supported, or by
using the encodings() method.

my @encodings = Encode->encodings;

For even more Unicode fun, dive into the documentation in Encode
(perldoc Encode on the command line).

This module gives us an easy way to write source-code filters. These
filters may change the behavior of calling Perl code, or implement new
features of Perl, or do anything else they want. Some of the more
infamous source-filter modules on the CPAN include Acme::Bleach,
Semi::Semicolons, and even Switch.

In this article, I'm going to implement a new comment syntax for Perl.
Using the following source-filter package will allow you to comment
your code using SQL comments. SQL comments begin with two consecutive
dashes (--). For our purposes, these dashes cannot be directly
followed by a semicolon (;) or be preceded by anything other than
whitespace or the beginning of a line.

In this example, we create an anonymous subroutine that is passed on
to Filter::Simple. The entire source of the calling program is
in $_, and we use a regular expression to search for our SQL comments
and change them to Perl comments.

Using B::Deparse on the command line, we can see what the code
looks like after it's filtered. Just remember that B::Deparse
doesn't preserve comments.

use SQLComments;
my $i = 100;
while ($i) {
    --$i;
}

The output is exactly as we expect. Filtering source code is a complex art.
If your filters are not perfect, then you can break code in unexpected ways.
Our SQLComments filter will break the following code.

print "This is nice -- I mean really nice!\n";

It will turn into this.

print "This is nice# I mean really nice!\n";

Not exactly the results we want. This particular problem can be avoided,
however, by using Filter::Simple in a slightly different way. You can
specify filters for different sections of the source code; here is how
we can limit our SQLComments filter to just code, leaving quote-like
constructs alone.
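With Filter::Simple's FILTER_ONLY, the same substitution is applied only to
the code sections of the source (again, the regex itself is an assumed
sketch):

```perl
package SQLComments;
use Filter::Simple;

# FILTER_ONLY applies the substitution only to the 'code' parts of the
# source, so the contents of quoted strings pass through untouched.
FILTER_ONLY
    code => sub { s/(?:^|\s)--(?!;)(?=\s)/#/mg };

1;
```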

There are some functions that are repeated in hundreds (probably thousands)
of programs. Think of all the sorting functions written in C programs. Perl
programs have them, too, and the following utility modules try to clean up our
code, eliminating duplication in simple routines.

There are a number of useful functions in each of these modules. I'm going to
highlight a few, but be sure to read the documentation provided with each
of them for a full list.

blessed() will return the package name that the variable is blessed into,
or undef if the variable isn't blessed.

my $baby = Person->new;
my $class = blessed $baby;

$class will hold the string Person. weaken is a function that
takes a reference and makes it weak. This means that the variable will
not hold a reference count on the thing it references. This is useful
for objects, where you want to keep a copy but you don't want to stop
the object from being DESTROY-ed at the right time.
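Both functions live in Scalar::Util. Here is weaken() in action (the
variable names are my own):

```perl
use Scalar::Util qw(weaken);

# A weak reference does not keep its referent alive.
my $object = { name => 'Person' };
my $copy   = $object;
weaken($copy);

undef $object;    # the only strong reference is gone...

# ...so the weak reference is now undef, and nothing leaks:
print defined $copy ? "still here\n" : "gone\n";   # prints "gone"
```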

Hash::Util has a slightly different purpose than the previously
discussed variable utility modules. This module implements restricted
hashes, which replace the undesirable (and now obsolete)
pseudo-hashes.

lock_keys() is a function that will restrict the allowed keys of a
hash. If a list of keys is given, the hash will be restricted to that
set, otherwise the hash is locked down to the currently existing keys.
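A small sketch (the contents of %person are assumed):

```perl
use Hash::Util qw(lock_keys);

my %person = ( name => 'Gnat', age => 30 );
lock_keys(%person);               # restrict to the existing keys

$person{name} = 'Nat';            # fine: the key already exists
eval { $person{wife} = 'x' };     # fatal: no new keys allowed
print $@ ? "restricted\n" : "open\n";   # prints "restricted"
```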

The %person hash is now restricted. Any keys currently in the
hash may be modified, but no keys may be added. The following code
will result in a fatal error.

$person{wife} = $wife;

You can use the unlock_keys() function to release your restricted
hash.

You can also lock (or unlock) a value in the hash.

lock_value( %person, "name" );
$person{name} = "Bozo"; # Fatal error!

Finally, you can lock and unlock an entire hash, making it read only
in the first case.

lock_hash( %person );

Now our %person hash is really restricted. No keys can be added or
deleted, and no values can be changed. I know all those OO folks out there
wishing Perl made it easy to keep class and instance data private are
smiling.

You can specify any type of code, but if it's not the default two-character
representation, then you must supply an extra argument to say which code set
you are using.

my $name = code2country( "120", LOCALE_CODE_NUMERIC ); # Cameroon

Just as before, you can get a full list of codes and countries using the
two query functions: all_country_codes(), and all_country_names().
Both of these functions accept an optional argument specifying the code
set to use for the resulting list.
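These functions all come from Locale::Country; a quick sketch of the default
two-letter code set:

```perl
use Locale::Country;

# Alpha-2 codes are the default code set.
my $country = code2country('de');       # "Germany"
my $code    = country2code('Japan');    # "jp"
my @codes   = all_country_codes();      # every alpha-2 code
```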

Memoize is a module that performs code optimization for you. In a
general sense, when you memoize a function, it is replaced by a memoized
version of the same function. OK, that was too general. More specifically,
every time your memoized function is called, the calling arguments are
cached and anything the function returns is cached as well. If the function
is called with a set of arguments that has been seen before, then the cached
return value is sent back and the actual function is never called. This makes
the function faster.

Not all functions can be memoized. For instance, if your function would return
a different value on two calls, even for the exact same set of calling arguments,
then it will be broken: only the first call's return values will be returned for
every subsequent call. Many functions do not act this way, and that's what makes
Memoize so useful.

Here is an example of a memoizeable function.

sub add {
    my ($x, $y) = @_;
    return $x + $y;
}

Every time this function is called as add( 2, 2 ), the result will be 4.
Rather than compute the value 4 in every case, we can cache it away the first
time and retrieve it from the cache every other time we need to compute 2 + 2.
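Memoizing the function is a one-liner; the call counter here is mine, added
to show the cache at work:

```perl
use Memoize;

my $calls = 0;

sub add {
    my ($x, $y) = @_;
    $calls++;                 # count real invocations
    return $x + $y;
}

memoize('add');               # replace add() with a caching version

add(2, 2);    # computed: $calls is now 1
add(2, 2);    # served from the cache: $calls is still 1
```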

We've just made add() faster, without any work. Of course, our addition
function isn't slow to begin with. The documentation of Memoize gives a much
more detailed look into this algorithm. I highly suggest you invest time in
learning about Memoize; it can give you wonderful speed increases if you know
how and when to use it.

I currently don't have a Microsoft operating system running on any of my
networks, but when perusing the Perl core, I happened upon the Win32 module.
I wanted to bring it up because if I were using a Microsoft OS, then I would
find the functions in this module invaluable. Please, if you are running in
that environment, then look at the documentation for Win32 for dozens of
helpful functions (perldoc Win32 on the command line).

Just as before, I've still not covered all of the Perl core. There is much more to
explore and a full list can be found by reading perlmodlib. The benefit of having
these modules in the core is great. Lots of environments require programmers to be
bound to using only code that is distributed with Perl. I hope I've been able to lighten
the load for anyone who has been put in that position (even by choice).

Welcome to the last Perl 6 Summary of my first year of summarizing. If
I were a better writer (or if I weren't listening with half an ear to
Damian telling YAPC about Perl 6 in case anything's changed), then this
summary might well be a summary of the last year in Perl 6. But I'm
not, so it won't. Instead, I'm going to try and keep it short
(summaries generally take me about eight hours on an average day, and I
really don't want to lose eight hours of YAPC, thank you very much).

Clinton Pierce wanted to know how to go about writing language level
debuggers in Parrot. (This man is unstoppable, I tell you.) He offered
some example code to show what he was trying to do. Benjamin Goldberg
had a style suggestion for the code, but nobody had much to say about
Clint's particular issue.

Much of this week's effort was involved in getting support for the
continuation-passing style function calling into Parrot. Jonathan
Sillito posted a patch. This led to a certain amount of confusion
about what needs to be stashed in the continuation and a certain
amount of bemusement about the implications of caller saves rather
than callee saves (in a nutshell, a calling context only has to save
those registers that it cares about; it doesn't have to worry about
saving any other registers, because its callers will already have
saved them if they cared.)

Dan ended up rewriting the calling conventions PDD to take into
account some of the confusion.

I think the upshot of this is that the Parrot core now has everything
we need to support the documented continuation-passing calling
conventions. But I could be wrong.

Clint Pierce's BASIC implementation efforts continue to be one of the
most effective bug-hunting (in code and/or docs) efforts the Parrot
team has. This time, Clint managed to segfault IMCC by trying to
declare nested .subs using the wrong sorts of names. Leo
Tötsch explained how to fix the problem. It seems that fixing
IMCC to stop it from segfaulting on this issue is hard, since the segfault
happens at runtime.

Last week, Ziggy worried about multimethod dispatch not being good
enough. This week at YAPC, Damian announced DISPATCH, a scary magic
subroutine that allows you to define your own dispatch
rules. Essentially, it gets called before the built-in dispatch rules do; beyond that, I know nothing.

Last week, I mentioned that Adam Turoff had worried a little about
multimethod dispatch, and wanted to know whether it would be possible to
easily override the dispatch system. This week, he outlines
the types of things he might want to do.

See above for the resolution. Details don't exist just yet, but we'll
get there.

Edwin Steiner wondered about having some way to do printf-like
formatting of numbers in interpolated strings. Luke Palmer (who just
told me he's embarrassed about something I wrote about something he
said last week, but I'd forgotten it) came up with a cool-looking
suggestion in response.

Send feedback, flames, money, photographic and writing commissions, or
a nice long US power cable to plug into my Mac power-brick to
p6summarizer@bofh.org.uk.

In 1995, Design Patterns was published, and during the intervening years, it
has had a great influence on how many developers write
software. In this series of articles, I present my take on how the
Design Patterns book (the so-called Gang of Four book, which I will call
GoF) and its philosophy applies to Perl. While Perl is an OO language -- you could code the examples from GoF directly in Perl -- many of the
problems the GoF is trying to solve are better solved in Perl-specific
ways, using techniques not open to Java developers or those C++
developers who insist on using only objects. Even if developers in other
languages are willing to consider procedural approaches, they can't, for
instance, use Perl's immensely powerful built-in pattern support.

Though these articles are self-contained, you will get more out of them
if you are familiar with the GoF book (or better yet have it open on
your desk while you read). If you don't have the book, then try searching
the Web - many people talk about these patterns. Since the
Web and the book have diagrams of the object versions of the patterns,
I will not reproduce those here, but can direct you to
this fine site.

I will show you how to implement the highest value patterns in Perl, most
often by using Perl's rich core language. I even include some objects.

For the object-oriented implementations, I need you to understand
the basics of Perl objects. You can learn that from printed sources
like the Perl Cookbook by Tom Christiansen and Nat Torkington or Object
Oriented Perl by Damian Conway. But the simplest way to learn the
basics is from perldoc perltoot.

As motivation for my approach, let me start with a little object-oriented
philosophy. Here are my two principles of objects:

When you are working for a company that rents cars (as I do), an object
to represent a rental agreement makes sense. The data on the agreement
is tightly bound to the methods you need to perform. To calculate the
amount owed, you take the various rates and add them together, etc.
This is a good use of an object (or actually several aggregated objects).

Consider a few examples from other languages. Java has the java.lang.Math
class. It provides things such as sine and cosine. It only provides class
methods and a couple of class constants. This should not be forced into
an object-oriented framework, since there are no Math objects. Rather
the functions should be put in the core, left out completely, or made into
non-object-oriented functions. The last option is not even available
in Java.

Or think of the C++ standard template library. The whole templating
framework is needed to make C++ backward compatible with C and to handle
strong static-type checking. This makes for awkward object-oriented
constructs for things that should be simple parts of the core language.
To be specific, why shouldn't the language just have a better array type
at the outset? Then a few well-named built-in operations take care of
stacks, queues, deques and many other structures we learned in school.

So, in particular, I take exception to one consistent GoF trick: turning an
idea into a full-blown class of objects. I prefer the Perl way of
incorporating the most important concepts into the core of the language.
Since I prefer this Perl way, I won't be showing how to objectify
things that could more easily be a simple hash with no methods or
a simple function with no class. I will invert the GoF trick:
implement full-blown pattern classes with simpler Perl concepts.

The patterns in this first article rely primarily on built-in
features of Perl. Later articles will address other groups of
patterns. Now that I've told you what I'm about to do, let's start.

There are many structures that you need to walk one element at a time.
These include simple things such as arrays, moderate things such as the keys
of a hash, and complex things such as the nodes of a tree.

The Gang of Four suggest solving this problem with the above mentioned
trick: turn a concept into an object. Here that means you should make
an iterator object. Each class of objects that can reasonably be
walked should have a method that returns an iterator object. The object
itself always behaves in a uniform way. For example, consider the following
code, which uses an iterator to walk the keys of a hash in Java.

The HashMap object has something that can be walked: its keys. You
can ask it for this keySet. That Set will give you an Iterator
on request to its iterator method. The Iterator responds to
hasNext with a true value if there are more things to be walked,
and false otherwise. Its next method delivers the next object in
whatever sequence the Iterator is managing. With that key, the
HashMap delivers the next value in response to get(key).
This is neat and tidy in the completely OO framework of a language
with limited operators and built-in types. It also perfectly exhibits
the GoF iterator pattern.

In Perl, any built-in or user-defined object that can be walked
has a method that returns an ordered list of the items to be walked.
To walk the list, simply place it inside the parentheses of a
foreach loop. So the Perl version of the above hash-key walker is:

foreach my $key (keys %hash) {
    print "$key\t$hash{$key}\n";
}

I could implement the pattern exactly as it is diagrammed in GoF,
but Perl provides a better way. In Perl 6, it will even be possible
to return a list that expands lazily, so the above will be more
efficient than it is now. In Perl 5, the keys list is built completely
when I call keys. In the future, the keys list will be built on
demand, saving memory in most cases, and time in cases where the loop
ends early.

The inclusion of iteration as a core concept represents Perl design
at its finest. Instead of providing a clumsy mechanism in non-core
code, as Java and C++ (through its standard template library) do,
Perl incorporates this pattern into the core of the language.
As I alluded to in the introduction, there is a Perl principle here:

If a pattern is really valuable, then it should be part of the core language.

The above example is from the core of the language. To see that foreach
fully implements the iterator pattern, even for user-defined modules,
consider an example from CPAN: XML::DOM. The DOM for XML was specified
by Java programmers. One of the methods you can call on a DOM Document is
getElementsByTagName. In the DOM specification this returns a
NodeList,
which is a Java Collection. Thus, the NodeList works like the Set in the
Java code above. You must ask it for an Iterator, then walk the Iterator.

When Perl people implemented the DOM, they decided that
getElementsByTagName
would return a proper Perl list. To walk the list one says something like:
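The walk itself is just a foreach. Here is a sketch using a stand-in
document object; MockDocument fakes just enough of the XML::DOM interface
to show the shape of the call, without requiring the module:

```perl
# A tiny stand-in: like XML::DOM's documents, it returns a plain Perl
# list from getElementsByTagName, so callers can foreach over it.
package MockDocument;

sub new { return bless { para => [ 'one', 'two' ] }, shift }

sub getElementsByTagName {
    my ($self, $tag) = @_;
    return @{ $self->{$tag} || [] };
}

package main;

my $doc = MockDocument->new;
foreach my $node ($doc->getElementsByTagName('para')) {
    print "$node\n";
}
```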

One beauty of Perl is its ability to combine procedural, object-oriented,
and core concepts in such powerful ways. The facts that GoF
suggests implementing a pattern with objects and that object-only languages
like Java require it do not mean that Perl programmers should ignore
the non-object features of Perl.

Perl succeeds largely by excellent use of the principle of promotion.
Essential patterns are integrated into the core of the language. Useful
things are implemented in modules. Useless things are usually missing.

So the iterator pattern from GoF is a core part of Perl we hardly think
about. The next pattern might actually require us to do some work.

In normal operation, a decorator wraps an object, responding to the same
API as the wrapped object. For example, suppose I add a compressing
decorator to a file writing object. The caller passes a file
writer to the decorator's constructor, and calls write on the decorator.
The decorator's write method first compresses the data, then calls the
write method of the file writer it wraps. Any other type of
writer could be wrapped with the same decorator, so long as all writers
respond to the same API. Other decorators can also be used in a chain.
The text could be converted from ASCII to unicode by one decorator and
compressed by another. The order of the decorators is important.

In Perl, I can do this with objects, but I can also use a couple of
language features to obtain most of the decorations I need, sometimes
relying solely on built-in syntax.

I/O is the most common use of decoration. Perl provides I/O decoration
directly. Consider the above example: compressing while writing. Here are
two ways to do this.

Now everything I write is passed through gzip on its way to output.gz.
This works great so long as (1) you are willing to use the shell, which
sometimes raises security issues; and (2) the shell has a tool to do
what you need done. There is also an efficiency concern here. The
operating system will spawn a new process for the gzip step. Process
creation is about the slowest thing the OS can do without performing I/O.

If you need more control over what happens to your data, then you can decorate
it yourself with Perl's tie mechanism. It will be even faster, easier
to use, and more powerful in Perl 6, but it works in Perl 5. It does
work within Perl's OO framework; see perltie for more information.

Suppose I want to preface each line of output on a handle with a time stamp.
Here's a tied class to do it.
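A minimal version consistent with the description that follows (the
timestamp format is an assumption):

```perl
package AddStamp;

# Constructor for the tied handle: bless a reference to the real,
# already-open output handle.
sub TIEHANDLE {
    my ($class, $handle) = @_;
    return bless \$handle, $class;
}

# Prepend a time stamp, then pass everything to the wrapped handle.
sub PRINT {
    my ($self, @args) = @_;
    my $stamp = localtime;
    print { $$self } "$stamp ", @args;
}

sub CLOSE {
    my $self = shift;
    return close $$self;
}

1;
```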

This class is minimal; in real life you need more code to make the
decorator more robust and complete. For example, the above code does
not check that the handle is writable, nor does it provide PRINTF,
so calls to printf will fail. Feel free to fill in the details.
(Again, see perldoc perltie for more information.)

Here's what these pieces do. The constructor for a tied file
handle class is called TIEHANDLE. Its name is fixed and uppercase,
because Perl calls this for you. This is a class method, so the
first argument is the class name. The other argument is an open
output handle. The constructor merely blesses a reference to this
handle and returns that reference.

The PRINT method receives the object constructed in TIEHANDLE plus
all the arguments supplied to print. It calculates the time stamp
and sends that together with the original arguments to the handle
using the real print function. This is typical decoration at work.
The decorating object responds to print just like a regular handle
would. It does a little work, then calls the same method on the
wrapped object.

The CLOSE method closes the handle. I could have inherited from
Tie::StdHandle to gain this method and many more like it.
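Putting it to use looks something like this (a self-contained sketch carrying a compact copy of the class so it runs on its own; the log filename is an assumption):

```perl
use strict;
use warnings;

# Compact copy of the AddStamp class described in the text, so this
# sketch runs on its own.
package AddStamp;
sub TIEHANDLE { my ($class, $h) = @_; return bless \$h, $class }
sub PRINT     { my ($self, @a) = @_; return print { $$self } scalar localtime, ": ", @a }
sub CLOSE     { return close ${ $_[0] } }

package main;

open LOG, '>', 'app.log' or die "Couldn't open log: $!";
tie *STAMPED_LOG, 'AddStamp', \*LOG;

print STAMPED_LOG "server starting\n";   # comes out time-stamped
close STAMPED_LOG;                       # closes the underlying LOG too
```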

After opening the file for writing as usual, I use the built-in tie
function to bind the LOG handle to the AddStamp class under the name
STAMPED_LOG. After that, I refer exclusively to STAMPED_LOG.

If there are other tied decorators, then I can pass the tied handle to
them. The only downside is that Perl 5 ties are slower than normal
operations. Yet, in my experience, disks and networks are my bottlenecks,
so in-memory inefficiency like this tends not to matter. Even
if I make the script code execute 90 percent faster, I don't save a noticeable
amount of time, because it wasn't using much time in the first place.

This technique works for many of the built-in types: scalars, arrays,
hashes, as well as file handles. perltie explains how to
tie each of those.

Ties are great since they don't require the caller to understand the
magic you are employing behind their back. That is also true of GoF
decorators with one clear exception: In Perl, you can change the
behavior of built-in types.

One of the most common tasks in Perl is to transform a list in some
way. Perhaps you need to skip all entries in the list that start with
underscore. Perhaps you need to sort or reverse the list. Many built-in
functions are list filters. They take a list, do something to it and
return a resultant list. This is similar to Unix filters, which expect
lines of data on standard input, which they manipulate in some way, before
sending the result to standard output. Just as in Unix, Perl list filters
can be chained together. For example, suppose you want a list of all
subdirectories of the current directory in reverse alphabetical order.
Here's one possible solution.
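One possible reconstruction, hedged (the error message and exact layout are assumptions), arranged so that the pipeline falls on line 6 of the listing:

```perl
#!/usr/bin/perl
use strict;

opendir DIR, '.' or die "Couldn't read the current directory: $!\n";

my @files = reverse sort map { -d $_ ? $_ : undef } readdir DIR;

print "$_\n" for @files;
```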

Perl 6 will introduce a more meaningful notation for these operations,
but you can learn to read them in Perl 5, with a little effort. Line 6
is the interesting one. Start reading it on the right (this is backward
for Unix people). First, it reads the directory. Since map expects
a list, readdir returns a list of all files in the directory. map
generates a list with the name of each file which is a directory (or
undef if the -d test fails). sort puts the list in
ASCII-betical order. reverse reverses that. The result is stored in
@files for later printing.

You can make your own list filter quite easily. Suppose you want to
replace the ugly map usage above (I tend to think map is always ugly)
with a special-purpose function. Here's how:
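One way to write it (a sketch; dirs_only is the name used in the text, and the body shown here is an assumption):

```perl
use strict;

# A special-purpose list filter: keep only the directories.
sub dirs_only {
    return grep { -d $_ } @_;
}

opendir DIR, '.' or die "Couldn't read the current directory: $!\n";

my @files = reverse sort { lc($a) cmp lc($b) } dirs_only readdir DIR;

print "$_\n" for @files;
```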

The new dirs_only routine replaces map above, leaving out the
entries we don't want to see.

The sort now has an explicit comparison subroutine. This is to prevent it
from thinking that dirs_only is its comparison routine. Since I had to
include this, I chose to take advantage of the situation and sort with
more finesse: ignoring case.

You can make such list filters to your heart's content.

I have now shown you the most important types of decoration. Any others
you need could be implemented in the traditional GoF way.

The next pattern feels like cheating, but then Perl often gives me that
feeling.

The idea of reusing objects is the essence of the flyweight pattern.
Thanks to Mark-Jason Dominus, Perl takes this far beyond what the GoF
had in mind. Further, he did the work once and for all. Larry Wall
likes this idea so much he's promoting it to the core for Perl 6
(there's that promotion concept again).

What I want is this:

For objects whose instances don't matter (they are constants or random), those requesting a new object should be given the same one they already received whenever possible.

This pattern fails dramatically if separate instances matter. But if
they don't, then it would save time and memory.

Here's an example of how this works in Perl. Suppose I want to provide
a die class for games like Monopoly or Craps. My die class might look
like this:
(Warning: This example is contrived to show you the technique.)
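A version matching that description (the class and method names follow the text; the one-line method bodies are assumptions):

```perl
package Die;
use strict;
use warnings;

use Memoize;
memoize('new');    # cache constructed dice by their arguments

sub new {
    my ($class, $sides) = @_;    # $sides is a subroutine lexical
    return bless \$sides, $class;
}

sub roll {
    my $self = shift;
    return int(rand $$self) + 1;    # 1 .. number of sides
}

1;
```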

At first glance, this looks like many other classes. It has a constructor
called new. The constructor stores the received number of sides into a
subroutine lexical variable (a.k.a. a my variable), returning a blessed
reference to it. The roll method calculates a random number,
scales it according to the number of sides, and returns the result.

The only strange things here are these two lines:

use Memoize;
memoize('new');

These exploit Perl's magic extraordinarily well. The memoize
function modifies the calling package's symbol table so that new is
wrapped. The wrapping function examines the incoming arguments (the
number of sides in this case). If it has not seen those arguments before,
then it calls the function as the user intended, storing the result in
a cache and returning it to the user. This takes more time and memory than
if I had not used the module.

The savings come when the method is called again. When the wrapper notices
a call with the same arguments it has seen before, it does not call the method.
Rather, it sends the cached object instead. We don't have to do anything
special as a caller or as an object implementor. If your object is
big, or slow to construct, then this technique would save you time and
memory. In my case, it wastes both since the objects are so small.

The only thing to keep in mind is that some methods don't benefit from
this technique. For example, if I memoize roll, then it would return
the same number each time, which is not exactly the desired result.

Note too that Memoize can be used in non-object situations - in fact
the documentation for it doesn't seem to contemplate using it for object
factories.

Not only do languages such as Java not have core functions for caching
method returns, they don't allow clever users to implement them.
Mark-Jason Dominus did a fine thing implementing Memoize, but
Larry Wall did a
better thing by letting him. Imagine Java letting a user write a
class that manipulated the caller's symbol table at run time - I
can almost hear the screams of terror. Of course, these techniques
can be abused, but precluding them is a greater loss than rejecting
poor code on the few occasions that some less-than-stellar programmer
improperly adjusts the symbol table.

In Perl all things are legal, but some are best left to modules with
strong development communities. This allows regular users to take advantage
of magic manipulations without worrying about whether their own magic
will work. Memoize is an example. Instead of rolling your own wrapped
call and caching scheme, use the well-tested one that ships with Perl
(and look for the 'is cached' trait to do this for routines in Perl 6).

The next pattern is related to this one, so you can use flyweight to
implement it.

In the flyweight pattern, we saw that there are sometimes resources that
everyone can share. GoF calls the special case when there is a single
resource that everyone needs to share the singleton pattern. Perhaps the
resource is a hash of configuration parameters. Everyone should be able to
look there, but it should only be built on startup (and possibly rebuilt on
some signal).

In most cases, you could just use Memoize. That seems most reasonable to me.
(See the flyweight section above.) In that case, everyone who wants access
to the resource calls the constructor. The first person to do so causes
the construction to happen and receives the object. Subsequent people call
the constructor, but they receive the originally constructed object.

There are many other ways to achieve this same effect. For instance, if
you think your callers might pass you unexpected arguments, then Memoize would
make multiple instances, one for each set of arguments. In this case,
managing the singleton with modules like Cache::FastMemoryCache from CPAN
may make more sense. You could even
use a file lexical, assigning it a value in a BEGIN block. Remember
bless doesn't have to be used in a method. You could say:
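Something like this (a sketch; the package name and the configuration contents are assumptions):

```perl
package MyConfig;
use strict;
use warnings;

# File-scoped lexical holding the single instance, blessed at compile
# time in a BEGIN block -- note that bless is not inside any method.
my $instance;
BEGIN {
    $instance = bless { log_level => 'info' }, __PACKAGE__;
}

# Every caller of the constructor gets the very same hash back.
sub new { return $instance }

1;
```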

This avoids some of the overhead of Memoize and shows what I'm doing more
directly. I made no attempt to take subclassing into account here. Maybe
I should, but the pattern says a singleton should always belong to one class.
The fundamental statement about singletons is:

All four of the patterns shown in this article use built-in features,
or standard modules. The iterator is implemented with foreach. The
decorator is implemented for I/O with Unix pipe and redirection
syntax or with a tied file handle. For lists, decorators are just
functions which take and return lists. So, I might call decorators
filters. Flyweights are shared objects easily implemented with the
Memoize module. Singletons can be implemented as flyweights or with
simple object techniques.

The next time some uppity OO programmer starts going on about patterns,
rest assured, you know how to use them. In fact, they are built into
to the core of your language (at least if you have the sense to use Perl).

Next time, I will look at patterns which rely on code references or
data containers.

I wrote these articles after taking a training course using GoF from a well-known training and consulting company. My writing is also informed by
many people in the Perl community, including Mark-Jason Dominus, who
showed at YAPC 2002, using his unique flair, how Perl deals with the
iterator pattern. Though the writing here is mine, the inspiration
comes from Dominus and many others in the Perl community, most of all Larry
Wall, who have incorporated patterns into the heart of Perl over the
years. As these patterns show, time and time again, Perl employs the
principle of promotion carefully
and well. Instead of adding a collection framework in source code modules,
as Java and C++ do, Perl has only two collections: arrays and hashes.
Both are core to the language. I think Perl's greatest strength
is the community's choices of what to include in the core, what to ship
along with the core, and what to leave out. Perl 6 will only make Perl
more competitive in the war of language design ideas.

It's another Monday, it's another summary and I need to get this
finished so I can start getting the house in order before we head
off to Boca Raton and points north and west on the long road to
Portland, Oregon. Via Vermont. (I'm English and as the poem comments,
the rolling English road is ``A rare road, a rocky road and one that we
did tread // The day we went to Birmingham by way of Beachy Head.''
Just because I'm in America doesn't mean I can't take an English route
to OSCON.)

We'll start with the internals list this week (and, given that there are
only 18 or so messages in my perl6-language inbox, we may well stop
there).

It's been pretty much decided that IMCC will soon become 'the' parrot
executable. Josh Wilmes, Robert Spier and Leo ``Perl Foundation grant
recipient'' Tötsch are looking into what needs to be done to make
this so. It's looking like the build system may well see some vigorous
cleanup action in this process.

Clint Pierce continued to expand on the internals of his BASIC
implementation. The more I see of his pathological examples, the
gladder I am that I escaped BASIC as quickly as possible. Still, kudos
to Clint once more for the effort, even if it is a tad embarrassing
that the most advanced language hosted on Parrot is BASIC. (On IRC
Leon Brocard and others have been heard to remark that they're
unlikely to go all out at a real language until Parrot has
objects. Dan?)

The timely destruction thread still doesn't want to go away. Dan has
been heard muttering about this on IRC. Eventually, he did more than
mutter on IRC -- he stated clearly on list that 'We aren't doing
reference counting' and that as far as he is concerned the matter is
closed.

Dan's blog also has another of his excellent ``What The Heck Is'' posts,
this time about Garbage Collection.

Jonathan Sillito posted a longish meditation on Parrot's new
continuation passing calling conventions. He wondered if, now we have
continuation passing, we really needed the various register stacks
that were used in the old stack based calling conventions. Warnock's
Dilemma currently applies.

Over the past couple of weeks Clint Pierce has been porting his BASIC
implementation over to run on IMCC. In the process of doing so he's
been finding and reporting all sorts of IMCC bugs and/or
misunderstandings and Leo Tötsch (usually) has either been
correcting Clint's assumptions or fixing the bugs he's found. I've
mentioned a few of these exchanges that generated longish threads in
the past, but that hasn't covered everything that's been found,
discussed and fixed. It's been great to see this sort of dialogue
driving the design and implementation forward based on the needs of a
real program.

The thread I've linked to below is another exchange in this ongoing
dialogue. Clint found a way of reliably segfaulting IMCC. Leo fixed
it. And on to the next.

Jürgen Bömmels is still working away at the Parrot IO
(PIO) subsystem. In this particular patch, he's gone through the
Parrot source replacing occurrences of PIO_fprintf(interpreter,
PIO_STDERR(interpreter), ...) with the better factored
PIO_eprintf(interpreter, ...), which as well as eliminating
repetition, helps to keep the IO code slightly easier to maintain.

Leo applied the patch. (Although it's not mentioned explicitly
elsewhere, Leo continues to keep up his astonishing productivity with
various other patches to Parrot.)

Bryan C. Warnock continued to discuss issues of the size of Parrot's
various types, particularly the integer types that get used within a
running Parrot. Bryan argues that these should ideally use a given
platform's native types, worrying about guaranteed sizes only at the
bytecode loading/saving stage. Dan and others commented on this (Dan
essentially said that he understood what Bryan was driving at but
wasn't quite sure of the way forward, and outlined his
options). Discussion continues.

Jonathan Sillito submitted a patch which changes invoke to
call, adds some PMC access macros and updates the tests. He and Leo
Tötsch discussed things for a while and I think the patch is in
the process of being rewritten as result of that discussion.

One of the good things about a simple minded reference counting
Garbage Collector is that object destructors generally get called in a
sensible order; if you have a tree of objects, the various node
destructors will generally get called in such a way that a given
node's children won't have been destroyed already. Garrett Goebel
asked if we could keep this behaviour with the Parrot GC system. Dan
was minded to say ``Yes'', as he's been wrestling with issues of
non-deterministic destruction order in another of his projects. (So have
I; it's a very long way from being fun. If I had the C chops, I'd be
trying to fix Perl 5's 'at exit' mark-and-sweep garbage collector to
do something similar.)

Klaas-Jan Stol announced that he's turned in his project implementing
a Lua compiler that targets Parrot. He hasn't actually finished the
compiler, his deadline being what it was, but he did post a link to
his project report and commented that ``[Parrot is] a really cool
project and VM to target'' and thanked everyone on the mailing list for
their help. I think the parrot-internals people will echo my best
wishes to Klaas-Jan; it's great to see someone who comes to a list with a
project and, instead of saying ``Write this for me!'', asks sensible
questions and makes a useful contribution to the ongoing task.

Adam Turoff asked if multimethod dispatch (MMD) was really the Right
Thing (it's definitely a Right Thing) and suggested that it would
be more Perlish to allow the programmer to override the dispatcher,
allowing for all sorts of more or less cunning dispatch mechanisms
(which isn't to say we couldn't still have MMD tightly integrated, but it
wouldn't be the only alternative to simple single dispatch). Luke
Palmer gets the ``Pointy End Grandma'' award for pointing out that Perl
6 is a '``real'' programming language now' (as Adam pointed out, Perl's
been a 'real' programming language for years), inspiring a particularly
pithy bit of Cozeny. As far as I can tell, Adam wants to be able to
dispatch on the runtime value of a parameter as well as on its runtime
type (he's not alone in this). Right now you either have to do this
explicitly in the body of the subroutine, or work out the correct
macromantic incantations needed to allow the programmer to use 'nice'
syntax for specifying such dispatch.

Assuming I'm not misunderstanding what Adam is after, this has come up
before (I think I asked about value based dispatch a few months back)
and I can't remember if the decision was that MMD didn't extend to
dispatching based on value, or if that decision hasn't been taken
yet. If it's not been taken, I still want to be able to do
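Something along these lines, in the Perl 6 syntax being designed at the time (a hypothetical sketch of value-based dispatch, not necessarily the original example):

```perl6
multi sub fib (Int $n where { $n < 2 }) { return $n }
multi sub fib (Int $n)                  { return fib($n - 1) + fib($n - 2) }
```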

It seems to me that if MMD is flexible enough to do this, then it
becomes easy to express any other set of dispatch rules as special
cases of this more general mechanism. (That said, I'm not sure how one
would go about expressing Smalltalk like message specifiers, which is
a shame, I like Smalltalk message specifiers).

Well, that's about it for another week. Next week's summary will be
coming to you from YAPC in Boca Raton. Then there should be one from
chez Turoff in Washington DC (As far as I can tell the Washington
summary will be the first summary of the second year of my summary
writing, if you're going to be in the Greater Washington area around
that time, consider getting in touch with either me or Ziggy and we'll
see about having a celebratory something or other that evening). After
Washington I'll be in Boston for the next summary, and at OSCON for
the one after that. I fully expect to be writing either enormously
long summaries or drastically curtailed ones while I'm over in the
States. After OSCON, there'll be a summary from Seattle and then I'll
be off back home. If you're in any of those localities at the
appropriate times drop me a line, we'll try and arrange meet-ups to wet
the appropriate summaries' heads.

If you've appreciated this summary, please consider one or more of the
following options:

Everyone knows that Perl works particularly well as a text processing
language, and that it has a great many tools to help the programmer
slice and dice text files. Most people know that Perl's regular
expressions are the mainstay of its text processing capabilities, but do
you know about all of the features which regexps provide in order to
help you do your job?

In this short series of two articles, we'll take a look through some of
the less well-known or less understood parts of the regular expression
language, and see how they can be used to solve problems with more power
and less fuss.

If you're not too familiar with the basics of the regexp language, a
good place to start is perlretut, which comes as part of the Perl
distribution. We're going to assume that you know about anchors,
character classes, repetition, bracketing, and alternation. Where can we
go from here?

Matching multi-line strings is one thing that I have to admit confuses
me every time. I remember that it has something to do with the /m and
/s modifiers, so when I think my strings will contain embedded
newlines, I just slap both /ms on the end of my regular expression
and hope for the best.

This is inexcusable behavior, especially since the distinction is
pretty simple. /m has to do with anchors. /s has to do with dots.
Let's start by looking at /s. The ``any'' character, ., does not
actually match any character; by default, it matches any character
except for a newline. So for instance, this won't match:

"This is my\nmulti-line string" =~ /This.*string/;

Don't just take my word for it. Get into the habit of trying out these
things for yourself; with Perl's -e switch, it's very easy to make up
a quick test of regular expression behavior if you're unsure:

This newline-phobia only relates to the . operator. It's nothing to
do with regular expressions in general. If we use something other than a
. to match the stuff in the middle, it will work:

"This is my\nmulti-line string" =~ /This\D+string/;

This matches the first This, then more than one thing that isn't
a digit, and then string. Because \n isn't a digit - and nor is
anything else between This and string - the regular expression will
match.

So the dot operator won't match a newline. If we want to change the
behavior of the dot operator, we can use the /s modifier to the
regular expression.

"This is my\nmulti-line string" =~ /This.*string/s;

This time, it matches. If you're using the . operator in your regular
expressions and you want it to be able to cross over newline boundaries,
use the /s modifier. However, you can sometimes get the same result
without using /s by choosing another way of matching.

What about anchors? Well, there are two possible things that we might
want anchors to do with a multi-line string. We might want them to match the
start or end of any line in the string, or we might want them to match
the start or end of the whole thing. Let's back up a little, and then
see how the /m modifier can be used to choose between these two
possible behaviors.

First, let's try something we know doesn't work.

"This is my\nmulti-line string" =~ /^(.*)$/;

This wants to match the start of the string, any amount of stuff that's
not a newline and the end of the string. But we know that there is a
newline between the start of the string and the end, so it won't match.
We could, of course, allow . to match a newline using the /s trick
we've just learnt, and then we can capture the whole lot:
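Both steps might look like this (a sketch; the article's own listing is elided — first the /s version, then the /m version):

```perl
use strict;
use warnings;

my $string = "This is my\nmulti-line string";

# With /s, the dot may cross the newline, so the anchors still see
# one big "line" and $1 captures the whole string:
my ($all) = $string =~ /^(.*)$/s;

# With /m instead, the anchors match at internal line boundaries,
# so $1 captures just the first line:
my ($first) = $string =~ /^(.*)$/m;

print "$first\n";   # prints "This is my"
```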

Aha! This time, we've changed the meanings of the anchors - instead of
matching just the start and end of the string, they now match the start
of any line in the string.

What happens when Perl runs this regular expression? Let's pretend we're
the regular expression engine for a brief, mad moment.

We start at the beginning of the string. The ^ anchor tells us to
match the beginning of a line, which is handy, since we're at one of
those right now. Now we match and capture any amount of stuff - so long
as it isn't a newline. This takes us up to This is my, and as the
next character is a newline, that is where we must stop. Next, we have
the $ anchor. Now without the /m modifier, this would want to find
the end of the string. We're not at the end of the string - there's
\nmulti-line string left to go - so without the /m modifier this
match would fail. That's what happened just above.

However, this time we do have the /m modifier, so the meaning of
$ has changed. This time, it means the end of any line in the string.
As we've had to stop at the \n, that would mean we're at the end of a
line. So that means that our $ matches, and the whole expression
matches and all is well.
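And what if we use both modifiers at once? (A sketch; the article's listing is elided.)

```perl
use strict;
use warnings;

my $string = "This is my\nmulti-line string";

# Both modifiers together: /s lets the dot cross the newline, and /m
# lets the anchors match at line boundaries as well.
my ($captured) = $string =~ /^(.*)$/ms;
```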

Well, it looks the same as when we had just used /s. Why? Because we
do have /s, the .* can eat up absolutely everything right up to the
end of the string. Now our /m-enabled $ matches the end of any
line in the string, and indeed we are at the end of the second line in
the string, so this matches too. In this case, the /m is superfluous.

Another trick to avoid confusion is to use explicit newlines in your
expression. For instance, if you're dealing with data like this:
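(The article's own sample data is elided; as a hedged stand-in, suppose each record carries labelled lines like these.)

```perl
use strict;

# Hypothetical record data.
my $record = "Name: Bilbo\nAge: 111\n";

# Explicit newlines in the pattern; no /s or /m needed, since the
# unmodified .* stops at each \n for us.
my ($name, $age) = $record =~ /Name: (.*)\nAge: (.*)\n/;

print "$name is $age\n";   # prints "Bilbo is 111"
```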

This time we don't need any modifiers at all - we want the .* to stop
before the newline, and then the explicit newlines themselves obviate
the need for start-of-line or end-of-line anchors. In our next article,
we'll see how to use the /g modifier to read in multiple records.

So those are the two rules for dealing with multi-line strings: /s
changes the behavior of the dot operator. Without /s, . will not
match a newline. With /s, . truly matches anything. On the other
hand /m changes the behavior of the anchors ^ and $; without
/m, these anchors only match the start and end of the whole string.
With /m, they match the start or end of any line inside the string.

Another modifier like /s and /m is /x; /x changes the
behavior of whitespace inside a regular expression. Without /x,
a literal space inside a regex matches a space in the string. This makes
sense:

"A string" =~ /A string/;

You would expect this to match, and without /x, it does match. Phew.
With /x, however, the match fails. Why is this? /x strips literal
whitespace of any meaning. If we want to match A string, we have to
use either the \s whitespace character class or some other
shenanigans:

"A string" =~ /A\sstring/x;
"A string" =~ /A[ ]string/x;

How can this conceivably be useful? Well, for a start, by removing the
meaning of white space inside a regular expression, we can use
whitespace at will; this is particularly useful to help us space out
complicated expressions. The rather unpleasant
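(The article's original example, a UK postcode matcher, is elided here; as a hedged stand-in, compare a cramped date pattern with its /x rewrite.)

```perl
use strict;

my $date = "2003-06-30";

# Cramped, but correct:
my ($y, $m, $d) = $date =~ /^(\d{4})-(\d{2})-(\d{2})$/;

# The same expression under /x, spaced out and commented:
($y, $m, $d) = $date =~ /
    ^ ( \d{4} )    # year
    -
    ( \d{2} )      # month
    -
    ( \d{2} )      # day
    $
/x;

print "$y-$m-$d\n";   # prints "2003-06-30"
```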

Perl 5.6.0 introduced the ability to package up regular expressions
into variables using the qr// operator. This acts just like q//
except that it follows the quoting, escaping and interpolation rules of
the regular expression match operator. In our example above, we had to
use single quotes for the ``basic'' components, and then double quotes to
get the interpolation when we wanted to string them all together into
$uk_postcode. Now, we can use the same qr// operator for all the
parts of our regular expression:
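A hedged reconstruction, building the postcode match out of qr// parts (this simplified postcode grammar is an assumption, not the article's original):

```perl
use strict;

# Components of a (simplified!) UK postcode pattern, each packaged
# up with qr//.
my $outward = qr/[A-Z]{1,2}\d{1,2}[A-Z]?/;
my $inward  = qr/\d[A-Z]{2}/;

my $uk_postcode = qr/$outward \s+ $inward/x;

print "looks like a postcode\n" if "SW1A 1AA" =~ $uk_postcode;
```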

Because the modifiers are packaged up inside their own little component,
we can ``mix and match'' modifiers inside a single regular expression. If,
for instance, we want to match part of it case-insensitively and some
case-sensitively:

my $prefix = qr/zip code: /i;
my $code = qr/[A-Z][A-Z][ \t]+\d{5}/;

$address =~ /$prefix $code/x;

In this example, the prefix part ``knows'' that it has to match
case-insensitively and the code part ``knows'' that it should match
case-sensitively like any other normal regular expression.

Another boon of using quoted regular expressions is a little
off-the-wall. We can actually use them to create recursive regular
expressions. For instance, an old chestnut is the question ``How do I
extract parenthesized text?''. Well, such a simple problem turns out to
be quite nasty to solve using regular expressions. Here's a
simple-minded approach:
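For instance (a sketch of the not-quite-working first attempt):

```perl
use strict;

my $text = "a (b (c) d) e";

# Simple-minded: grab from an open paren to the first close paren.
my ($extracted) = $text =~ /( \( [^)]* \) )/x;

print "$extracted\n";   # prints "(b (c)" -- truncated at the first ")"
```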

Oops. Our expression sees the first closing paren and stops. We need to
find a way to tell it to count the number of opening and closing parens
and make sure they're balanced before finishing. This actually turns out
to be tremendously difficult, and the solution is too messy to show
here. Regular expressions are not meant for iterative solutions.

Regular expressions aren't really meant for recursive solutions
either, but if we have recursive regular expressions, we can define our
balanced-paren expression like this: first match an opening paren; then
match a series of things that can be non-parens or another
balanced-paren group; then a closing paren. Turned into Perl code, this
becomes:
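Something like this (a sketch of the version the next paragraph picks apart):

```perl
use strict;

my $paren;
$paren = qr/
    \(              # an opening paren
    (?:
        [^()]+      # things which are not parens ...
      | $paren      # ... or, we hope, another balanced group
    )*
    \)              # a closing paren
/x;
```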

This is almost there, but it's not quite correct. Because qr//
compiles a regular expression, it does the interpolation right there and
then. And when our expression is compiled $paren isn't defined yet,
so it's interpolated as an empty string, and we don't get the recursion.

That's OK. We can tell the expression not to interpolate the $paren
quite yet with the super-secret regular expression ``don't interpolate
this bit yet'' operator: (??{ }). (It has two question marks to remind
you that it's doubly secret.) Now we have
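A working sketch, along the lines of the postponed-subexpression example in perlre:

```perl
use strict;

my $paren;
$paren = qr/
    \(                   # an opening paren
    (?:
        [^()]+           # things which are not parens ...
      | (??{ $paren })   # ... or another balanced group, found by
                         # interpolating $paren only at match time
    )*
    \)                   # a closing paren
/x;

my ($match) = "(lambda (x) (append x '(hacker)))" =~ /($paren)/;
print "$match\n";   # the whole balanced expression
```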

When this is run on some text like (lambda (x) (append x '(hacker))),
the following happens: we see our opening paren, so all is well. Then we
see some things which are not parens (lambda ) and all is still well.
Now we see (, which definitely is a paren. Our first alternative
fails, we try the second alternative. Now it's finally time to
interpolate what's inside the double-secret operator, which just happens
to be $paren. And what does $paren tell us to match? First, an
open paren - ooh, we seem to have one of those handy. Then some things
which are not parens, such as x, and then we can finish this part of
the match by matching a close paren. This polishes off the
sub-expression, so we can go back to looking for more things that aren't
parens, and so on.

Of course, if we need to get this confusing, you might wonder why we're
using a regular expression at all. Thankfully, there's a much easier way
of doing things: the Text::Balanced module helps extract all kinds of
balanced, quoted and tagged texts, and this is one of the things we'll
look at in our next article, next month.

Regular expressions are like a microcosm of the Perl language itself:
it's simple to use them to do simple things with, and most of the time
you only need to do simple things with them. But sometimes you need to
do more complex things, and you have to start digging around in the dark
corners of the language to pull out the slightly more complex tools.

Hopefully this article has shed a little light on some of the dark
corners: for dealing with multi-line strings and making expressions more
readable with quoting and interpolation. In the next article, we'll look
at the dreaded look-ahead and look-behind operators, splitting up text
with more than just split, and some CPAN modules to help you get all
this done.

The discussion of how to get timely destruction with Parrot's Garbage
Collection system continued. Last week Dan announced that, for
languages that make commitments to call destructors as soon as a thing
goes out of scope, there would be a set of helpful ops to allow
languages to trigger conditional DOD (Dead Object Detection) runs at
appropriate times. People weren't desperately keen on the performance
hit that this entails (though the performance hit involved with
reference counting is pretty substantial...) but we didn't manage to
come up with a better solution to the issue.

Bryan C. Warnock seems to be attempting to outdo Leo Tötsch in
the patchmonster stakes this week. He put in a miscellany of patches
dealing with the Perl based assembler, opcode sizes, debugging flags
and probably others. Most of them were applied with alacrity.

Dan Sugalski gave a rundown of how the Perl 6 Essentials book came
about, what's in it and all that jazz. He started by apologizing for
not mentioning it before, but he thought he had. This led Clint Pierce
to wonder if there was something up with Dan's Garbage Collection
system. The existence of the book probably goes some way to explaining
Leo Tötsch's relative silence over the last few weeks. Nicholas
Clark wondered if it explains why Parrot doesn't have objects
yet. Brent Dax wondered when it would be available (by OSCON this year
apparently).

Clint Pierce had some big headaches with moving his BASIC interpreter
over to IMCC owing to problems with .constant which is legal for
the assembler, but not for IMCC. Leo Tötsch pointed Clint at
IMCC's .const operator. Bryan Warnock wondered if IMCC and the
assembler's syntax couldn't be unified. Leo noted that it wasn't quite
that straightforward because .constant declares an untyped
constant, but .const requires a type as well. It turns out that
.const wasn't quite what Clint needed, so Leo pointed him at
.sym and .local which do seem to do what he needs.

Leo Tötsch's work on the new PMC layout continues apace. I'm
afraid I don't quite understand what's going on in this area, which
does make it rather tricky to summarize things. It seems to have a
good deal to do with memory allocation and garbage collection... Leo
thinks that it's the right thing, but there seem to be issues involved
with good ways of allocating zeroed memory.

Mitchell N Charity has put up an experimental Wiki for Parrot and
primed it with a few things. Stéphane Payrard pointed out that
it's rather hard to make a WikiWord from, for example, PMC. (10 points
to the first person to email p6summarizer@bofh.org.uk with the
expansion of PMC).

While toying with pbc2c.pl, Luke Palmer discovered that it doesn't
want to play with IMCC generated .pbc files. Apparently this is
because we currently have two bytecode file formats. Leo Tötsch
thought the problem lay with assemble.pl which is old and slow and
doesn't produce 'proper' parrot bytecode. Leo also thought that the
way pbc2c.pl worked wasn't actually any use. Dan reckoned the time
had come to ditch assemble.pl too, and thought there was a case
for renaming IMCC as parrot since it can run either .pbc or
assembly files. Leo liked the idea, but is concerned about the state
of the Tinderbox.

Dan tantalized all those waiting eagerly for objects in Parrot by
discussing how to make method calls. This, of course, means a few new
ops, called findmeth, callmeth and callccmeth for the time
being. Jonathan Sillito had a few naming consistency issues with the
ops. Dan agreed there were issues and asked for suggestions for an
opcode naming convention.
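
Purely by way of illustration, usage might look something like the
following. I should stress that this is my own guess at how such ops
could hang together; the names were explicitly provisional and the
actual signatures hadn't been nailed down:

```
# Hypothetical sketch only: look up the "greet" method on the
# object in P2, leaving the method PMC in P0...
findmeth P0, P2, "greet"

# ...then invoke it with the usual Parrot calling conventions,
# either plainly or with a return continuation taken for you.
callmeth
callccmeth
```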

Leo Tötsch liked the idea, modified it slightly, and added it to
the code base, albeit disabled. Apparently there are problems with it,
but it's a good starting framework. There need to be lots more tests
though...

Bryan Warnock (in his own words) popped in to 'waffle on Parrot's core
sizes'. He proposed a way of drastically simplifying Parrot's type
system. He and Gopal V had a long discussion that I didn't quite
follow. I think Leo thinks that what Bryan proposes is doable, but I'm
not entirely sure whether he thinks it's a good idea...

Clint Pierce had some problems with IMCC's register allocation. He
posted an example that gave problems and wondered if the problem was
with him or with IMCC. Leo Tötsch confirmed that it was a
bug. Luke Palmer pointed Clint at find_global and friends as the
'correct' way to solve the problem. For bonus points, Clint showed off
a pathological example of why BASIC should not be anyone's favourite
language.

As if the Coroutine thread wasn't confusing enough, we now have the
Cothread thread, in which Michael Lazzaro argued that we should blur
the distinctions between coroutines and threads. Dave Whipp pointed
everyone at 'Austin Hastings' draft for A17 (threads)' and argued
that, whilst coroutines, threads, closures, and the various other
things Michael had argued were aspects of the same thing were indeed
related, they are sufficiently different that bundling them all up
behind a single class would lead to badness (``a bloated blob with a
fat interface'' was
the phrase he used).

This thread saw even more unrestrained speculation than usual and saw
the first use on the Perl 6 lists of the adjective 'Cozeny', from
Simon Cozens, possibly meaning ``feeling that what is being discussed
is over fussy and generally trying to take the language a long way
from what Real Programmers need''. This would seem to imply a verb form
'to Cozen', ``To more or less forcibly express one's Cozeny feelings''.

I'm afraid this was another thread I had a hard time following. I
reckon there are some interesting ideas in there, but I'm hoping that
someone will pull it all together in an RFC type document so I can go
``Remember that Cothreads thread last week? Leon Brocard summarized it
all neatly in a single proposal, you can find it here.'' (Except it
almost certainly won't be Leon Brocard, it'll be Mike Lazzaro, Leon
doesn't seem to do perl6-language very much).

In an effort to learn about Perl 6, Luke Palmer has been reading about
Haskell. For reasons he doesn't understand, this set him to wondering
what ::= is supposed to mean -- it means 'compile time
binding', but what does that mean?

Damian Conway came through with the goods, summarizing his answer as
::= is to := as a macro call is to a subroutine call.
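
If I can gloss that analogy (and the gloss, like the sketch below, is
mine rather than Damian's, using Perl 6 syntax as then specified):

```
my $x = 42;

# := binds at run time: the binding happens when this
# statement executes, like a subroutine call.
my $alias := $x;

# ::= binds at compile time: the binding is fixed when the
# code is compiled, like a macro expansion.
my $early ::= $x;
```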

Dave Whipp had some more thread questions, and wondered what would be
a good Perl 6ish way of implementing a threaded progress
monitor. Whilst the discussion of all this was interesting, I'm not
sure that it's really much to do with the language, more something
that one would implement according to taste and the particular
requirements of a given project.

Thanks once again are due to all the good people on the Perl 6
lists. Apologies will almost certainly be due to the organizers of
YAPC North America as I still haven't started writing the talks I'm
supposed to be giving.

As I noted last week, I'm awarding points (and points mean prizes) to
those kind people who spotted the deliberate mistake. Smylers gets 100
points for spotting the accidental mistake (last week was not in
2004.) Sam Smith, David Wheeler, David Cantrell and Leon Brocard all
earned 50 points for spotting the deliberate mistake of not mentioning
Leon Brocard. But they've helped me make up for it this week by
mentioning him twice, so the karmic balance is restored.

The points I have awarded can be redeemed for the following,
wonderful prizes:

A lifetime subscription to the Perl 6 summaries.

Er...

That's it

If you've appreciated this summary, please consider one or more of the
following options:

Send feedback, flames, money, photographic and writing commissions, or
patches to CamelBones making it possible for Perl classes to
inherit from Objective-C classes (heck, if Ruby and Python can do it)
to p6summarizer@bofh.org.uk.