March 2005 Archives

Over time, I've accumulated a list of Emacs customizations I wanted to
implement when I got the chance. For example, I'd like macros to perform
certain global replaces just within a marked block, and I'd like a macro to
reformat an Outlook-formatted date to an ISO 8601 formatted date. I'm not
overly intimidated by the elisp language used to customize Emacs behavior; I've
copied elisp code and modified it to make some tweaks before, I had a healthy
dose of Scheme and LISP programming in school, and I've done extensive work
with XSLT, a descendant of these grand old languages. Still, as with a lot of
postponed editor customization work, I knew I'd have to use these macros many,
many times before they earned back the time invested in creating them, because
I wasn't that familiar with string manipulation and other basic operations in a
LISP-based language. I kept thinking to myself, "This would be so easy if I
could just do the string manipulation in Perl!"

Then, I figured out how I could write Emacs functions that called Perl to
operate on a marked block (or, in Emacs parlance, a "region"). Many Emacs users
are familiar with the Escape+| keystroke, which invokes the
shell-command-on-region function. It brings up a prompt in the
minibuffer where you enter the command to run on the marked region, and after
you press the Enter key Emacs puts the command's output in the minibuffer if it
will fit, or into a new "*Shell Command Output*" buffer if not. For example,
after you mark part of an HTML file you're editing as the region, pressing
Escape+| and entering wc (for "word count") at the
minibuffer's "Shell command on region:" prompt will feed the text to this
command line utility if you have it in your path, and then display the number of
lines, words, and characters in the region at the minibuffer. If you enter
sort at the same prompt, Emacs will run that command instead of
wc and display the result in a buffer.

Entering perl /some/path/foo.pl at the same prompt will run the
named Perl script on the marked region and display the output appropriately.
This may seem like a lot of keystrokes if you just want to do a global replace
in a few paragraphs, but remember: Escape+| calls Emacs's built-in
shell-command-on-region function, and you can call this same
function from a new function that you define yourself. My recent great
discovery was that along with parameters identifying the region boundaries and
the command to run on the region, shell-command-on-region takes an
optional parameter that lets you tell it to replace the input region with the
output region. When you're editing a document with Emacs, this allows you to
pass a marked region outside of Emacs to a Perl script, let the Perl script do
whatever you like to the text, and then Emacs will replace the original text
with the processed version. (If your Perl script mangled the text, Emacs'
excellent undo command can come to the rescue.)

Consider an example. When I take notes about a project at work, I might
write that Joe R. sent an e-mail telling me that a certain system won't need any
revisions to handle the new data. I want to make a note of when he told me
this, so I copy and paste the date from the e-mail he sent. We use Microsoft
Outlook at work, and the dates have a format following the model "Tue 2/22/2005
6:05 PM". I already have an Emacs macro bound to alt+d to insert
the current date and time (also handy when taking notes) and I wanted the date
format that refers to e-mails to be the same format as the ones inserted with
my Alt+d macro: an ISO 8601 format of the form
"2005-02-22T18:05".

The .emacs startup file holds customized functions that you want
available during your Emacs session. The following shows a bit of code that I
put in mine so that I could convert these dates:
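Here is a minimal sketch of that function, reconstructed from the
description below (the script's path is an assumption; point it at wherever
you keep the script):

(defun OLDate2ISO ()
  "Convert an Outlook-format date in the region to ISO 8601."
  (interactive)
  (shell-command-on-region
   (point) (mark)                  ; the boundaries of the region
   "perl ~/bin/OLDate2ISO.pl"      ; assumed location of the script
   nil t))                         ; nil: default output; t: replace the region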

The (interactive) declaration tells Emacs that the function
being defined can be invoked interactively as a command. For example, I can
enter "OLDate2ISO" at the Emacs minibuffer command prompt, or I can press a
keystroke or select a menu choice bound to this function. The
point and mark functions are built into Emacs to
identify the boundaries of the currently marked region, so they're handy for
the first and second arguments to shell-command-on-region, which
tell it which text is the region to act on. The third argument is the actual
command to execute on the region; enter any command available on your operating
system that can accept standard input. To define your own Emacs functions
that call Perl scripts, just change the script name in this third argument
to shell-command-on-region to point at your own Perl script.

Leave the last two arguments as nil and t. Don't
worry about the fourth parameter, which controls the buffer where the shell
output appears. (Setting it to nil means "don't bother.") The
fifth parameter is the key to the whole trick: when non-nil, it tells Emacs to
replace the marked text in the editing buffer with the output of the command
described in the third argument instead of sending the output to a buffer.

If you're familiar with Perl, there's nothing particularly interesting about
the OLDate2ISO.pl script. It does some regular expression matching to
split up the string, converts the time to a 24 hour clock, and rearranges the
pieces:
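Here is a sketch of such a script (the exact regular expression is an
approximation):

#!/usr/bin/perl
# OLDate2ISO.pl: convert "Tue 2/22/2005 6:05 PM" to "2005-02-22T18:05".
use strict;
use warnings;

my $date = <STDIN>;
if ( $date =~ m{\w+ (\d+)/(\d+)/(\d+) (\d+):(\d+) (AM|PM)} ) {
    my ( $month, $day, $year, $hour, $minute, $ampm )
        = ( $1, $2, $3, $4, $5, $6 );
    $hour += 12 if $ampm eq 'PM' and $hour != 12;   # convert to 24-hour clock
    $hour = 0   if $ampm eq 'AM' and $hour == 12;
    printf "%04d-%02d-%02dT%02d:%02d", $year, $month, $day, $hour, $minute;
}
else {
    print $date;    # leave unrecognized text alone
}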

When you start up Emacs with a function definition like the defun
OLDate2ISO one shown above in your .emacs file, the function is
available to you like any other in Emacs. Press Escape+x to bring
up the Emacs minibuffer command line and enter "OLDate2ISO" there to execute it
on the currently marked region. Like any other interactive command, you can
also assign it to a keystroke or a menu choice.

There might be a more efficient way to do the Perl coding shown above, but I
didn't spend too much time on it. That's the beauty of it: with five minutes of
Perl coding and one minute of elisp coding, I had a new menu choice to quickly
do the transformation I had always wished for.

Another example of something I always wanted is the following
txt2htmlp.pl script, which is useful after plugging a few paragraphs
of plain text into an HTML document:
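Here is a sketch of it, assuming the trailing "p" stands for "paragraph" and
that the script wraps each blank-line-separated paragraph in <p> tags:

#!/usr/bin/perl
# txt2htmlp.pl: wrap each paragraph from standard input in <p> tags.
use strict;
use warnings;

local $/ = '';                  # paragraph mode: read blank-line-separated chunks
while ( my $paragraph = <STDIN> ) {
    $paragraph =~ s/\s+\z//;    # trim trailing newlines
    print "<p>$paragraph</p>\n\n";
}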

Again, it's not a particularly innovative Perl script, but with the
following bit of elisp in my .emacs file, I have something that
greatly speeds up the addition of hastily written notes into a web page,
especially when I create an Emacs menu choice to call this function:
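The function is a near-copy of the earlier one; only the script name changes
(again, the path is an assumption):

(defun txt2htmlp ()
  "Wrap each paragraph in the region in HTML p tags."
  (interactive)
  (shell-command-on-region
   (point) (mark)
   "perl ~/bin/txt2htmlp.pl"    ; assumed location of the script
   nil t))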

Sometimes when I hear about hot new editors, I wonder whether they'll ever
take the place of Emacs in my daily routine. Now that I can so easily add the
power of Perl to my use of Emacs, it's going to be a lot more difficult for any
other editor to compete with Emacs on my computer.

Programmers often find themselves adding print statements to a script to
output information that helps them analyze what went wrong when running it.
However, including these statements verbatim in the script is not such
a good idea. If not promptly removed, these statements can have all kinds of
side-effects: slowing down the script, destroying the correct format of its
output (possibly ruining test-cases), littering the code, and confusing the
user. It would be a better idea not to place them within the code in the first
place. How, though, can you debug without debugging?

Enter Devel::LineTrace, a
Perl module that can assign portions of code to execute at arbitrary lines
within the code. That way, the programmer can add print statements in relevant
places in the code without harming the program's integrity.

Verifying That use lib Has Taken Effect

One example I encountered recently: I wanted to use a module I wrote from
the specialized directory where I placed it, even though it was already
installed in Perl's global include path. I used a use lib "./MyPath"
directive to make sure this was the case, but now had a problem. What if there
was a typo in the path of the use lib directive, and as a result,
Perl loaded the module from the global path instead? I needed a way to verify
it.

To demonstrate how Devel::LineTrace can do just that, consider
a similar script that tries to use a module named CGI from the
path ./MyModules instead of the global Perl path. (It is a bad idea to
name your modules after names of modules from CPAN or from the Perl
distribution, but this is just for the sake of the demonstration.)
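A minimal sketch of such a script, laid out so that line 8 is the first line
after the use CGI line (the statements around the two use lines
are assumptions):

#!/usr/bin/perl

use strict;
use warnings;

use lib "./MyModules";
use CGI;
my $q = CGI->new();    # this is line 8, the first line after "use CGI"

print $q->header();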

Name this script good.pl. To test that Perl loaded the CGI module
from the ./MyModules directory, direct Devel::LineTrace to print the
relevant entry from the %INC internal variable, at the first line
after the use CGI one.

To do so, prepare this file and call it test-good.txt:

good.pl:8
    print STDERR "\$INC{CGI.pm} == ", $INC{"CGI.pm"}, "\n";

Place the filename and the line number at which the trace should be inserted
on the first line. Then comes the code to evaluate, indented from the start
of the line. After the first trace, you can add more traces by starting a
new line with the filename and line number and putting the code in the
following (indented) lines. This example is simple enough not to need that,
though.

After you have prepared test-good.txt, run the script
through Devel::LineTrace by executing the following command:

$ PERL5DB_LT="test-good.txt" perl -d:LineTrace good.pl

(This assumes a Bourne-shell derivative.) The PERL5DB_LT
environment variable contains the path of the file to use for debugging, and
the -d:LineTrace directive to Perl instructs it to debug the
script through the Devel::LineTrace package.

As a result, if everything worked, you should see the following output on
standard error:

$INC{CGI.pm} == MyModules/CGI.pm

meaning that Perl indeed loaded the module from the MyModules
sub-directory of the current directory. Otherwise, you'll see something
like:

$INC{CGI.pm} == /usr/lib/perl5/vendor_perl/5.8.4/CGI.pm

...which means that it came from the global path and something went wrong.

Limitations of Devel::LineTrace

Devel::LineTrace has two limitations:

Because it uses the Perl debugger interface and stops at every line (to
check whether it contains a trace), program execution is considerably slower
when the program is being run under it.

It assigns traces to line numbers, and therefore you must update it if the
line numbering of the file changes.

Nevertheless, it is a good solution for keeping those pesky
print statements out of your programs. Happy LineTracing!

What if you could test your program's use of the DBI just by creating a set
of rules to guide the DBI's behavior—without touching a database (unless you
want to)? That is the promise of Test::MockDBI, which by
mocking-up the entire DBI API gives you unprecedented control over every aspect
of the DBI's interface with your program.

Test::MockDBI uses Test::MockObject::Extends
to mock all of the DBI transparently. The rest of the program knows nothing
about using Test::MockDBI, making Test::MockDBI ideal for testing programs that
you are taking over, because you only need to add the Test::MockDBI invocation code—
you do not have to modify any of the other program code. (I have found this
very handy as a consultant, as I often work on other people's code.)

Rules are invoked when the current SQL matches the rule's SQL pattern. For
finer control, there is an optional numeric DBI testing type for each rule, so
that a rule only fires when the SQL matches and the current DBI
testing type is the specified DBI testing type. You can specify this numeric
DBI testing type (a simple integer matching /^\d+$/) from the
command line or through Test::MockDBI::set_dbi_test_type(). You
can also set up rules to fail a transaction if a specific
DBI::bind_param() parameter is a specific value. This means there
are three types of conditions for Test::MockDBI rules:

The current SQL

The current DBI testing type

The current bind_param() parameter values

Under Test::MockDBI, fetch*() and select*()
methods default to returning nothing (the empty array, the empty hash, or undef
for scalars). Test::MockDBI lets you take control of their returned data with
the methods set_retval_scalar() and
set_retval_array(). You can specify the returned data directly in
the set_retval_*() call, or pass a CODEREF that generates a return
value to use for each call to the matching fetch*() or
select*() method. CODEREFs let you both simulate DBI's
interaction with the database more accurately (as you can return a few rows,
then stop), and add in any kind of state machine or other processing
needed to precisely test your code.
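In sketch form, the rule setup might look like this (the get_instance()
singleton accessor and the argument order of testing type, SQL pattern, and
return value are assumptions; check the module's perldoc):

use Test::MockDBI;

# Grab the mock-DBI control object before creating any DBI handles.
my $mock = Test::MockDBI::get_instance();

# Rule: when the current SQL matches /COUNT/, scalar fetches return 42.
# The leading 0 is a DBI testing type that matches any testing type.
$mock->set_retval_scalar( 0, "COUNT", 42 );

# Rule: when the SQL matches /FROM users/, array fetches return this row.
$mock->set_retval_array( 0, "FROM users", [ 'alice', 'bob' ] );

The rest of the program connects, prepares, and fetches through the ordinary
DBI calls; only these setup lines are test-specific.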

When you need to test that your code handles database or DBI failures,
bad_method() is your friend. It can fail any DBI method, with the
failures dependent on the current SQL and (optionally) the current DBI testing
type. This capability is necessary to test code that handles bad database
UPDATEs, INSERTs, or DELETEs, along with
being handy for testing failing SELECTs.

Test::MockDBI extends your testing capabilities to testing code that is
difficult or impossible to test on a live, working database. Test::MockDBI's
mock-up of the entire DBI API lets you add Test::MockDBI to your programs
without having to modify their current DBI code. Although it is not finished
(not all of the DBI is mocked-up yet), Test::MockDBI is already a powerful tool
for testing DBI programs.

A great joy in a programmer's life is removing useless code, especially
when its absence improves the program. Often this happens in old codebases
or codebases thrown together hastily. Sometimes it happens in code written
by novice programmers who try several different ideas all together and fail
to undo their changes.

One such persistent idiom is wholesale, program-wide unbuffering, which
can take the form of any of:

local $| = 1;
$|++;
$| = 1;

Sometimes this is valuable. Sometimes it's vital. It's not the default
for very good reason, though, and at best, including one of these lines in
your program is useless code.

What's Unbuffering?

By default, modern operating systems don't send information to output
devices directly, one byte at a time, nor do they read information from
input devices directly, one byte at a time. IO is so slow compared to
processors and memory, especially over networks, that adding buffers and
trying to fill them before sending and receiving information can improve
performance.

Think of trying to fill a bathtub from a hand pump. You could
pump a little water into a bucket and walk back and forth to the bathtub,
or you could fill a trough at the pump and fill the bucket from the trough.
If the trough is empty, pumping a little bit of water into the bucket will
give you a faster start, but it'll take longer in between bucket loads than
if you filled the trough at the start and carried water back and forth
between the trough and the bathtub.

Information isn't exactly like water, though. Sometimes it's more
important to deliver a message immediately even if it doesn't fill up a
bucket. "Help, fire!" is a very short message, but waiting to send it when
you have a full load of messages might be the wrong thing.

That's why modern operating systems also let you unbuffer specific
filehandles. When you print to an unbuffered filehandle, the operating
system will handle the message immediately. That doesn't guarantee that
whoever's on the other side of the handle will respond immediately; there
might be a pump and a trough there.

What's the Damage?

According to Mark-Jason Dominus' Suffering from
Buffering?, one sample showed that buffered reading was 40% faster than
unbuffered reading, and buffered writing was 60% faster. The latter number
only grows when you consider network communications, where the overhead of
sending and receiving a single packet of information can overwhelm short
messages.

In simple interactive applications though, there may be no benefit. When
attached to a terminal, such as a command line, Perl operates in
line-buffered mode. Run the following program and watch the output
carefully:
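Here is a sketch of such a program; the loop_print helper reappears below,
and the greeting text is a placeholder:

#!/usr/bin/perl

use strict;
use warnings;

# Print $message $count times, pausing a second between prints, then
# print a final newline.
sub loop_print
{
    my ($count, $message) = @_;

    for (1 .. $count)
    {
        print $message;
        sleep 1;
    }
    print "\n";
}

loop_print( 5, "Greetings with newline\n" );  # flushed at each newline
loop_print( 5, "Greetings, no newline "   );  # flushed only after the loop
$| = 1;                                       # unbuffer STDOUT
loop_print( 5, "Unbuffered greetings "    );  # flushed at every print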

The first five greetings appear individually and immediately. Perl flushes
the buffer for STDOUT when it sees the newlines. The second set appears after
five seconds, all at once, when it sees the newline after the loop. The third
set appears individually and immediately because Perl flushes the buffer after
every print statement.

Terminals are different from everything else, though. Consider the case of
writing to a file. In one terminal window, create a file named
buffer.log and run tail -f buffer.log or its equivalent
to watch the growth of the file in real time. Then add the following lines to
the previous program and run it again:
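A sketch of those lines (the log messages are placeholders):

# Send subsequent prints to buffer.log instead of the terminal.
open( my $log, '>', 'buffer.log' ) or die "Cannot open buffer.log: $!";
select( $log );

loop_print( 5, "Buffered log line\n" );    # arrives in one batch
$| = 1;                                    # unbuffer the selected filehandle
loop_print( 5, "Unbuffered log line\n" );  # arrives line by line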

The first five messages appear in the log in a batch, all at once, even
though they all have newlines. Five messages aren't enough to fill the buffer.
Perl only flushes it when it unbuffers the filehandle on assignment to
$|. The second set of messages appears individually, one second
after another.

Finally, the STDERR filehandle is hot by default. Add the following lines to
the previous program and run it yet again:

select( STDERR );
loop_print( 5, "Unbuffered STDERR " );

Though no code disables the buffer on STDERR, the five messages should print
immediately, just as in the other unbuffered cases. (If they don't, your OS is
weird.)

What's the Solution?

Buffering exists for a reason; it's almost always the right thing to do.
When it's the wrong thing to do, you can disable it. Here are some rules of
thumb:

Never disable buffering by default.

Disable buffering when and while you have multiple sources writing to
the same output and their order matters.

Never disable buffering for network outputs by default.

Disable buffering for network outputs only when the expected time
between full buffers exceeds the expected client timeout length.

Welcome to yet another fortnight's summary. I believe this is the highest
volume I have ever seen the three lists at simultaneously. Hopefully they will
keep it up, because they're doing good work. To aid the epic endeavor of
summarizing all this, I have had to add some new Jazz to my playlist. We will
see how it works out. If it doesn't work well, blame Seton.

Perl 6 Language

Luke Palmer has tasted the forbidden fruit of Haskell, and now he wants more
of it in Perl 6. In particular he wants even more powerful pattern matching of
arguments for MMD. Rod Adams speculated that Larry had decided Perl 6 would not
be ML. In the end there was no real consensus, but the feeling seems to be
"don't hold your breath".

The question of how decorating objects with roles interacted with low-level types arose.
Larry came to the conclusion that it was okay, unless you wanted to decorate a single element in a primitive array.

Rod Adams pointed out that it's possible to implement much of logic
programming using the rules engine. Unfortunately, the syntax gets a little
hairy and cumbersome. Larry said that this particular goal might be something
that 6.0 does not address, deferring it instead. Ovid rumbled about porting a
Warren Abstract Machine to Parrot. I would like it.

Locale-Keyed Text

Darren Duncan finished up the first non-core Perl 6 module. Being properly
hubristic, he asked for a critique. His questions touched on subjects including
subtypes, module loading, and strictness.

Rod Adams wondered what would happen if he had both a sub and a method named
bar. What would $f.bar and bar $f do?
Jonathan Scott Duff explained that $f.bar would call the method
while bar $f would call the sub.

Rod Adams wants a single object to represent all of the possible multi
methods associated with a particular short name. It seems that Rod drank some
of the Lisp Kool-Aid (although in this case, I agree). He explained how this
allowed the dispatch scheme to be changed on a multi by multi basis, and also
allowed for nice introspection. This led to a discussion of how this would work
with lexically installed multi methods, and if this would trip people up. No
real consensus appeared.

Sam Vilain fixed up the SEND + MORE example to work correctly with
junctions. Unfortunately, the hoops through which he had to jump are pretty
horrendous. Larry mumbled that the option of autothreading all conditionals
might work, but would send too many lynch mobs after him. I for one like both
Twin Peaks and that idea.

Thomas Sandlaß wondered when arguments to a function would be decorated
with roles from the function signature if they didn't exist. Larry conjectured
about allowing different views on objects versus mixing in various roles. This
led people to talk about covariant typing. An array of ints will always return
you a number and an array of numbers will always accept an int, but an array of
ints will not necessarily accept a number and an array of numbers will not
necessarily return an int. Thus, changing your view can be valid when writing
and not when reading, or vice versa.

Andrew Savige noticed that closing a file handle in Pugs did not force
all the thunks associated with the file. While this was a bug in Pugs, it led
to conversation about whether = should be lazy or eager. Larry
thinks that it will be safer to start eager and become lazy than vice
versa.

Rod Adams wondered how he would define the signature of exists
and delete as they do not evaluate the subscripted variables in
their arguments. Larry explained that they are now methods on the hash, so
someone will have to do a little macro magic to get it to work the old way.

Juerd put out a plea for lists in string context not to provide spaces
between elements automatically. Larry pointed out various ways to join on the
empty string, which I think is his way of saying "too bad".

Rod Adams wondered what it meant to pop a multidimensional array. Larry
agreed that it should pop off entire dimensions. Does this mean that popping
such an array in a loop will pop dimensions until there is only one left, at
which point it will switch to popping elements?

Markus Laire wondered what index("Hello", "", 999) would
return. Larry explained that it is not as simple as Markus thinks, because strings
use magic indices that do Unicode stuff, but it would probably throw an
exception.

Gaal Yahas wondered how ::() would react to undefined
variables. Larry explained that it might be either legal or illegal as an
lvalue depending on whether or not the scope had finished being compiled, and
that it would be undefined as an rvalue.

Aaron Sherman posted a rough draft of a better POD. This led to many people
passionately discussing the merits and demerits of POD and kwid. Fortunately,
as the summarizer endowed with the power of double speaking, I can definitively
report that the conclusion was that everybody prefers both kwid over Pod and vim
over emacs.

Darren Duncan wants to protect his classes from their malicious enemies who
would use his references against him. Thus, he wants to know if his accessor
methods return references or copies. Larry explained that they would probably
return lazy copies, to provide the requisite protection, except when used
inside that class.

As originally specified, .method means $_.method.
This sets it apart from $.foo, @.foo, and
%.foo, which all refer to $self. Much discussion
ensued. I think the pendulum is slowly swinging toward switching the meaning of
.method to refer to $self.method.

Rod Adams wondered what would happen to study. Because I never did it in
high school or college, I doubt I will begin now. Other people seem to think it
would be easier to leave it as a no-op in case we want to do it eventually.

Rod Adams thought that perhaps chr and ord should be restricted to working
only at the code point level. Larry was less sure.

Perl 6 Compilers

Last week, I tried to link to many of the Pugs patches. I now think that was
a mistake for two reasons: first, there are a great many; and second, many more occur
off-list where I miss them. Therefore, I will not provide links for specific
patches unless they pass this arbitrary test: Are they as important as my pizza?

Pugs 6.0.12

Autrijus released Pugs 6.0.11 and 6.0.12. The features are plentiful and
awesome. For a more complete list (which is long) as well as daily blow-by-blow
of the Pugs development (which is fast) check out Autrijus's journal.

Anthony Kilna knew that one of the best ways to help Pugs was to write tests,
but didn't know if there was a database of tests that needed to be written or
were written. Stevan Little pointed him to the in-progress attempt to build
just such a database, and said that would be a good place to help.

Stevan Little compiled a list of bugs for Pugs. By the time you read this,
many will probably have been fixed.

Parrot

I will start this part with a very large announcement. Dan has decided to
step down as Parrot's Chief Architect. Chip Salzenberg (who just earned first
name-only status) has taken up the burning parrot...err...torch. To forestall
questions/outrage/grumbling, Dan explained that Leo did not get the position
because he did not want it. I know that I personally have learned a lot from
Dan and Squawks of the Parrot (including how to turn crystal sugar into baker's
sugar), and want to thank him a great deal for the work he put into Parrot.
This means the responsibility of returning the pie to Guido now falls on Chip's
shoulders.

Leo put out a request to revive the Parrot tinderboxen. Steve Peters
suggested integrating it with the current Perl smoke reporting process. Peter
Sinnot put up a server on a spare machine in the meantime.

Leo noticed that some aggregates do deep copies while others do shallow.
All should do shallow. Takers welcome.

*Arrays TODO

Fixed*Arrays should have a limited form of splice
available to them. Also, the Resizable*Arrays should have their
allocation schemes adapted to that of the ResizablePMCArray, and
*BooleanArray should store just bits. Bernhard Schmalhofer
offered to take on the *BooleanArrays.

Bernhard Schmalhofer committed a few TODO tests for generating and running
PASM from PIR. Jens Rieks pointed out that this does not work and is only a
debugging aid. I don't see anything wrong with wanting it to work, though.

François Perrad noticed that MinGW was very particular about how you
exec OS commands. He wondered if this should be fixed at the configure layer
or the Parrot_Exec_OS_Command layer. Dan explained that he never
intended the latter to be language independent, and that a language independent
version should go in a library.

The Usual Footer

Posting via the Google Groups interface does not work. To post to any of
these mailing lists please subscribe by sending email to
perl6-internals-subscribe@perl.org,
perl6-language-subscribe@perl.org, or
perl6-compiler-subscribe@perl.org. If you find these summaries
useful or enjoyable, please consider contributing to the Perl Foundation to
help support the development of Perl. You might also like to send feedback to
ubermatt@gmail.com.

Driving Windows DNS Server

If you happen to manage a DNS server running on Windows 2000 or Windows 2003
with more than just a couple of dozen resource records on it, you've probably
already hit the limits of the MMC DNS plugin, the Windows administrative GUI
for the DNS server implementation. Doing mass operations like creating 20 new
records at once, moving a bunch of A records from one zone to another, or just
searching for the next free IP in a reverse zone, can challenge your patience.
To change the name part of an A record, you have to delete the entire record and
re-create it from scratch using the new name. You've probably thought to yourself,
"There MUST be another way to do this." There is!

Silently, almost shyly, behind the scenes and without the usual bells and
whistles, Microsoft has arrived at the power
of the command shell. For the DNS services[1], the command line utility
dnscmd is available as part of the AdminPak for the server
operating systems. dnscmd is a very solid command line utility,
with lots of options and subcommands, that allows you to do almost every
possible operation on your DNS server. These include starting and stopping the
server, adding and deleting zones and resource records, and controlling a lot
of its behavior.

This article explores how to run dnscmd from Perl. In that
respect, it is a classic "Perl-as-a-driver" script, invoking
dnscmd with various options and working on its outputs.

DNSCMD

Invoke dnscmd /? to see a top-level list of available subcommands
and type dnscmd /subcommand /? for more specific help
for this subcommand. dnscmd /? shows that there is a subcommand
RecordDelete, and dnscmd /recorddelete /? (case not
significant) explains that you need a zone name (like "my.dom.ain"), a node name
within this zone (like "host1"), a record type (like "A"), and the data part of
the resource record to delete (like "10.20.40.5").

The first argument to dnscmd, if you actually want to do
something with it, is the name of the DNS server to use. A full working
command looks something like:

dnscmd dnssrv.my.dom.ain /RecordDelete my.dom.ain host1 A 10.20.40.5

This opens the very welcome opportunity to run dnscmd
remotely—from your workstation, for example—which frees you from the need to
log in to your DNS server.

The current version of the script handles only A and PTR records.
There is no handling of CNAME records, for example. Within this limitation, it
is also very A record-oriented: you can add or delete A records, change the IP or name
of an A record, or move the A record to a different zone. It
keeps PTR records in sync with these changes, creating or deleting a
corresponding PTR record with its A record. This is mostly what you want.

The most important thing to understand with this script is the format and
the meaning of the input data it takes ("Show me your data ...")[2]. The
format is simple, just:

<name1> <target1>
<name2> ...
... ...

Separate name and target by whitespace. A name is a
relative or fully-qualified domain name. A target can be one of these:

another domain name, meaning to rename to that name;

an IP, meaning to change to this IP;

nothing or undef, meaning to delete this name.

This mirrors the basic functions of the script. To add some extra candy,
the target parameter has two other possibilities which have proven very useful
in my environment. A target can also be:

a C net, given as the first three octets of an IP address (like "10.20.40");

a net segment identifier ("v1", "v2", and so on, in my example).

In both cases, the script will give the name a free IP (if possible) from
either the C net or the net segment specified by its identifier. I'll return
to this idea soon.

To pull all of the various possibilities together, here is a list of sample
input lines, each representing one of the mentioned possibilities for the
target:
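Something like the following, with made-up host names:

host1    host2.my.dom.ain
host3    10.20.40.99
host4
host5    10.20.40
host6    v2

The first line renames host1 to host2, the second gives host3 a new IP, the
third deletes host4, the fourth assigns host5 the next free IP in the C net
10.20.40, and the fifth gives host6 a free IP from net segment "v2".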

Pass this data to the script through either an input file or STDIN[3].

init()

Now to the code. At startup, wdns.pl pulls in the list of primary
zones from the given DNS server, both forward (names as lookup keys) and
reverse (IPs as lookup keys) zones. This is handy, because it will use this
list again and again.

mv_ip()

The main worker routine is the sub mv_ip. (Don't think too
much about the name; it's from the time when the only function of the script
was to change the IP of a given name). For any given name/target pair, it does
the following: First it tries to find a FQDN for the name. If it finds a host
for the given name, it uses the FQDN as a basis to construct the name part of
the targeted record. If it cannot find a name, it assumes that it should
create an entirely new record. If the options permit (-c), it
constructs one.

Then it inspects the target. Depending on its type, the
program prepares to assign a new IP to the name, rename an existing A record
while retaining the IP, search for a free IP in a certain range, or just
delete existing records. When everything settles, the actual changes take
place, using dnscmd to delete and add A and PTR records as
appropriate. (There is no UpdateRecord function in
dnscmd, so updating is in fact a combination of delete and
create).

That's it! The rest of the code is lower-level functions that help to
achieve this.

create_* and delete_*

The four subs create_A, create_PTR,
delete_A, and delete_PTR are wrapper functions
around the respective invocations of dnscmd. An additional issue
of interest is that Windows DNS will delete a PTR record once you delete the
corresponding A record, so you don't have to do so explicitly.

get_rev_zone() and get_fwd_zone()

One of the major issues when manipulating DNS resource records is picking
the right zone to do the change in. If you have just one forward and one
reverse zone, this is simple. However, if you are maintaining a lot of zones
with domains and nested subdomains, while other subdomains of the same parent
have their own zones, this might be tedious.

wdns.pl can offload this task for you. The subs
get_rev_zone and get_fwd_zone use the initially
retrieved list of primary zones from your server. They take an IP or a fully
qualified domain name respectively, and split it into the node part and the
zone part. So the IP 10.20.40.5 might split into 10.20.40 and 5 (if the proper
zone of this IP is 40.20.10.in-addr.arpa) or 10 and 20.40.5 (if
10.in-addr.arpa happens to be the enclosing zone), depending on your zone
settings. The same applies for domain names. Other routines use this
information to add or delete resource records in their appropriate zones.

IP Lookup Functions

There is a set of subs I called "IP lookup functions". They all help to
find a free IP in an appropriate range. Depending on the target specification,
they will search a certain C net or a whole net segment for an unused address. This
searching breaks down to finding the appropriate zone, the appropriate node
("subdomain"), and then listing the already existing leaf nodes in this range.
Once it has the list of used nodes, it starts scanning for gaps or unused
nodes off the end of the list.

An additional feature of these routines is that they honor certain reservations in
the ranges, either through fixed directives ("leave the first 50 addresses
free at the beginning of each net segment") or through inspecting dedicated
TXT records on the DNS server that contain the RESERVED keyword.
(The actual format of these records is
RESERVED:<range-spec>:<free text>, where
range-spec is a colon-separated list of IPs or IP ranges. An
example is RESERVED:1,3,5,10-20,34:IPs reserved for the VPN
switches). This helps avoid re-using reserved IPs accidentally through
the automatic script, and also helps avoid messing things up when time is short.

In the case of these TXT records, I used dnscmd to retrieve
them, not nslookup, which would have been equally possible.

A Word About the Net Segments

If your IPs reside in a segmented network, which is likely to be the case
for most sites, make sure that your hosts have addresses for the segments to
which they attach. For this script I have chosen a poor man's approach to
represent the segments just by the list of their respective C nets in the
script itself (see the hash %netsegs in the "Config section").
There might be a more clever way to do this. If you are going to run the
script in your environment, edit this hash to reflect your network
topology.

The dns_lookup sub looks up the current DNS entries. It runs
the external command nslookup and parses its output. If you need
more sophisticated DNS lookups (and nslookup's options just won't
do), you might want to resort to dig (which has a Windows
version) or Net::DNS (which
runs on Windows in any case). This simple way of doing it was just enough for
my needs.

Footnotes

In the Windows world, server processes are usually referred to as
"services"; I tend to mix this term with "server" every now and then.

"Show me your functions, and I will be confused. Show me your data, and
your functions will be obvious", to re-coin a famous quote from Frederick
Brooks' The Mythical Man-Month.

Depending on your Windows command shell, you might have to tinker a bit to
get the STDIN input to work as desired. Cygwin's bash works like a breeze and
takes Ctrl-Z<RET> as the EOF sequence.

Having almost achieved the state of perfect laziness, one of my favorite
modules is Class::DBI::mysql.
It makes MySQL database tables seem like classes, and their rows like objects.
This completely relieves me from using SQL in most cases. This article explains
how Class::DBI::mysql carries out its magic. Instead of delving
into the complexities of Class::DBI::mysql, I will use a simpler
case study: Class::Colon.

Introduction

One of my favorite modules from CPAN is Class::DBI::mysql.
With it, I can almost forget I'm working with a database. Here's an
example:
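A sketch, with an invented music database (the table and column names are
made up):

package Music::CD;
use base 'Class::DBI::mysql';

# The only place the database shows through: connection information.
__PACKAGE__->set_db( 'Main', 'dbi:mysql:music', 'user', 'password' );
__PACKAGE__->set_up_table( 'cd' );    # column names become accessors

package main;

my $cd = Music::CD->retrieve( 1 );    # the row whose primary key is 1
print $cd->title(), "\n";             # read a column
$cd->title( 'Blue Train' );           # change it
$cd->update();                        # write the row back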

Except for the MySQL connection information, no trace of SQL or databases
remains.

My purpose here is not really to introduce you to this beautiful module.
Instead, I'll explain how to build façades like this. To do so, I'll work
through another, simpler CPAN module called Class::Colon. It turns
colon-delimited files into classes and their lines into objects. Here's an
example from a checkbook application. This program computes the balance of an
account on a user-supplied date or the end of time if the user doesn't supply
one.
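Here is a sketch of that program. The payee and amount fields, and the Date
class's is_before method, are stand-ins for whatever the real file and Date
class provide:

#!/usr/bin/perl
use strict;
use warnings;

use Class::Colon Trans => [ qw( date=Date payee amount ) ];

my $file      = shift or die "usage: $0 account_file [date]\n";
my $user_date = shift;    # optional cutoff date

my @transactions = Trans->READ_FILE( $file );

my $total = 0;
foreach my $trans (@transactions) {
    if ( not defined $user_date or $trans->date()->is_before( $user_date ) ) {
        $total += $trans->amount();
    }
}
print "Balance: $total\n";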

In the use statement for Class::Colon, I told it the name of
the class to build (Trans), followed by a list of fields in the
order they appear in the file. The date field is really an object itself, so I
used =Date after the field name. This told
Class::Colon that a class named Date will handle the
date field. If the Date class constructor were not named
new, I would have written
date=Date=constructor_name. My Date class is primitive
at best, it only provides comparisons like greater than. It only does that for
dates in one format. I won't embarrass myself further by showing it.

After shifting in the name of the account file, the code calls
READ_FILE through Trans, which
Class::Colon defined. This returns a list of Trans
objects. The fields in these objects are the ones given in the
Class::Colon use statement. They are easy to access through their
named subroutines.

The rest of the program loops through the transactions list checking dates.
If the user didn't give a date, or the current transaction happened before the
user's date, the program adds that amount to the total. Finally, it reports the
balance.

Though the example shows only the lookup access, you can easily change
values. All of the accessors retrieve and store. Calling
WRITE_FILE puts the updated records back onto the disk.

Other methods help with colon-delimited records. Some let you work with
handles instead of file names. Others help you parse and produce strings so
that you can drive your own input and output. See the Class::Colon
perldoc for details. (No, colon is not the only delimiter.)

Let the Games Begin

Both Class::DBI::mysql and Class::Colon build
classes at run time which look like any other classes. How do they do this?
They manipulate symbol tables directly. To see what this means, I want to start
small. Suppose I have a variable name like:

my $extremely_long_variable_indicator_string;

That's not something I want to type often. I could make an alias in
two steps like this:

our $elvis;

First, I declare an identifier with a better name. I must make it global. If
strict is on, I should use our to do this (though there are other
older ways that also work). Lexical variables (the ones declared with my) don't
live in symbol tables, so the tricks below won't work with them.

*elvis = \$extremely_long_variable_indicator_string;

Now I can point $elvis to the longer version. The key is the
* sigil. It refers to the symbol table entry for
elvis (the name without any sigils). This line stores a reference
to $extremely_long_variable_indicator_string in the symbol table
under $elvis, but it doesn't affect other entries like
@elvis or %elvis. Now, both scalars point to the same
data, so $elvis is a genuine alias for the longer name. It is not
just a copy.

Unless you work with mean-spirited colleagues, or are into self-destructive
behavior, you probably don't need an alias just to gain a shorter name. However,
the technique works in other situations you might actually encounter. In
particular, it is the basis for the API simplification provided by
Class::Colon.

To understand what Class::Colon does, remember that the
subroutine is a primitive type in Perl. You can store subs just as you do
variables. For instance, I could store a subroutine reference like this (the
sigil for subs is &):

my $abbr;
$abbr = \&some_long_sub_name;

and use it to call the subroutine:

my @answer = $abbr->();

Here, I have made a new scalar variable, $abbr, which holds a
reference to the subroutine. This is not quite the same as directly
manipulating the symbol table, but you can do that too:

*alias = \&some_long_sub_name;
my @retval = alias();

Instead of storing a reference to the subroutine in a variable, this code
stores the subroutine in the symbol table itself. This means that subsequent
code can access the subroutine as if it had declared the subroutine with its
new name itself. Adjusting the symbol table is not really easier to read or
write than storing a reference, but, in modules like Class::Colon,
symbol table changes are the essential step to simplifying the caller's
API.

Classes from Sheer Magic

The previous example demonstrated how to make symbol table entries whenever
you want. These can save typing and/or make things more readable. The standard
module English uses this
technique to give meaningful English names to the standard punctuation
variables (like $_). You want more, though. You want to build
classes out of thin air during run time.

The key to fabricating classes is to realize that a class is just a package
and a package is really just a symbol table (more or less). That, and the fact
that symbol tables autovivify, is all you need to carry off hugely helpful
deceptions like Class::DBI::mysql.

What use really does

This subsection explains how to pass data during a use
statement. If you already understand the import subroutine, feel
free to skip to the next section.

When you use a module in Perl, you can provide information for that module
to use during loading. While Class::DBI::mysql waits for you to
call routines before setting up classes, Class::Colon does it
during loading by implementing an import method.

Whenever someone uses your module, Perl calls its
import method (if it has one). import receives the
name of the class the caller used, plus all of the arguments provided by the
caller.

In the checkbook example above, the caller used Class::Colon
with this statement:

After shifting the arguments into meaningful variable names, the main loop
walks through each requested class (the list of fakes). Inside the loop it
disables strict, because the necessary uses of so many symbolic
references would upset it.

There are four steps in the fabrication of each class:

Make the constructor

Make the class methods

Make the accessor methods

Store the attribute names in order

The constructor is about as simple as possible and the same for every
fabricated class. It returns a hash reference blessed into the requested class.
The cool thing is that you can insert code into a symbol table that doesn't
exist in advance. This constructor will be NEW. (By convention,
Class::Colon uses uppercase names for its methods to avoid name
collisions with the user's fields).
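In sketch form, using $fake for the name of the class under construction, as
the discussion below does:

no strict 'refs';
*{"$fake\::NEW"} = sub {
    my $class = shift;
    return bless {}, $class;    # an empty hash, blessed into e.g. Trans
};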

This code requires a little bit of careful quoting. Saying
*{"$fake\::NEW"} tells Perl to make an entry in the new package's
symbol table under NEW. The backslash suppresses variable
interpolation. While $fake needs interpolation, interpolating
$fake::NEW would just yield undef, because this is
its first definition here.

Perl has already done the hard part by the time it stores the constructor.
It has brought the package into existence. Now it's just a matter of making
some aliases.

For each provided method, the code makes an entry in the symbol table of the
fabricated class. Those entries point to the methods of the
Class::Colon package, which serve as permanent shared delegates
for all fabricated classes.
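Sketched, with @methods as an assumed name for the list of shared methods:

# Alias each shared class method into the fabricated package.
foreach my $method (@methods) {    # READ_FILE, WRITE_FILE, and friends
    *{"$fake\::$method"} = \&{"Class::Colon::$method"};
}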

Similarly, it builds an accessor for each attribute supplied by the caller
in the use statement. These routines require a bit of
customization to look up the proper attribute name and to deal with object
construction. Hence, there is a small routine called
_make_accessor which returns the proper closure for each
accessor.

Finally, it makes an entry for the new class in the master list of simulated
classes. This allows easy lookup by name when calling class methods through the
fabricated names. Note that there is nothing in the import routine
that limits the caller to one invocation. Further use statements
can bring additional classes to life. Alternatively, the caller can request
several new classes with a single use statement by including
multiple hash keys.

The actual routine is a bit more complex, so it can handle construction of
attributes which are objects. Note that the value of $attribute,
which is in scope when the closure is created, will be kept with the sub and
used whenever it is called. The actual code is a fairly standard Perl dual-use
accessor. It assigns a new value to the attribute if the caller has passed it
in. It always returns the value of the attribute.
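Stripped of that object-construction handling, the closure maker reduces to
something like this sketch:

sub _make_accessor {
    my $attribute = shift;
    return sub {
        my $self = shift;
        $self->{$attribute} = shift if @_;    # store, when given a value
        return $self->{$attribute};           # always retrieve
    };
}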

What Class::Colon provides

Just for the sake of completeness, here is how Class::Colon turns a
string into a set of objects. Note the heavy use of methods through their
previously-entered symbol table names.
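A simplified sketch (%ATTRIBUTES is an assumed name for the master list of
attribute names stored during import; the real method also handles other
delimiters and object attributes):

sub objectify {
    my ($class, $string) = @_;
    my $new_object = $class->NEW();
    my @values     = split /:/, $string;

    # Fill each attribute, in order, through its accessor.
    foreach my $attribute ( @{ $ATTRIBUTES{$class} } ) {
        $new_object->$attribute( shift @values );
    }
    return $new_object;
}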

All fabricated classes share this method (and the other class methods of
Class::Colon).

Recall that NEW returns a blessed hash reference with nothing
in it. In objectify, the loop fills in the attributes by calling
their accessors. This ensures the proper construction of any object attributes.
Callers access objectify indirectly when they call
READ_FILE and its cousins. They can also use it directly through
its OBJECTIFY alias.

Summary

By making entries into symbol tables, you can create aliases for data that is
hard to name. Further, you can create new symbol tables simply by referring to
them. This allows you to build classes on the fly. Modules like
Class::DBI::mysql and Class::Colon do this to provide
classes representing tabular data.

There are other uses of these techniques. For example, Memoize wraps an original
function with a cache scheme, storing the wrapped version in place of the
original in the caller's own symbol table. For functions which return the same
result whenever the arguments are the same, this can save time. Exporter does even more
sophisticated work to pollute the caller's symbol table with symbols from a
used package. At heart, these schemes are similar to the one shown above. By
carefully performing symbol table manipulations in modules, you can often
greatly simplify an API, making client code easier to read, write, and
maintain.

Welcome to yet another fortnight summary, once again brought to you by
chocolate chips. This does have the distinction of being the first summary
written on a Mac, so if I break into random swear words, just bear with me.

Off-list Development

In more related news, someone pointed out to me that development goes on off-list on places like IRC. I briefly contemplated quitting my job and tracking
such things full time, but then I decided that it would be better to accept
brief submissions for the summary. Thus I will be adding a fourth section to
the summaries based on contributions. If you would like to make a contribution,
email me with a brief summary. Please include the name by which you would like
me to attribute you (though sadly the process I use will likely mangle any
Unicode characters). Please make all links full. I will shorten them.
Thanks!

Perl 6 Language

It turns out that not() (with no arguments) made Perl 5 core
dump for a while, and it took us five years to figure that out. In Perl 6 it
will be a list op. Calling it with no arguments will return a null list or an
undef depending on context.

I had hoped that last week someone would have addressed the concerns about
threading. I was disappointed in this. A new crop of concerns surfaced and died down fairly
quickly (as the chief proponent, Damian, was away).

Somehow the discussion of junctions morphed into a discussion of sets, which
morphed back into junctions, which morphed into a discussion of serialization
to different languages. Interesting stuff, but I wouldn't hold my breath for
it.

Adam Preble posted an offer to develop some benchmarks for Perl 6.
Unfortunately, I think he posted it to Google Groups. Also, he probably should
have posted it to p6c or p6i as the language folk tend to wave their hands and
say "magic occurs but correctness is preserved" when it comes to
optimization.

Autrijus posted an example using junctions, instead of parents, to solve the
classic

  SEND
+ MORE
======
 MONEY

problem. Markus Laire asked for a clarification, and Rod Adams pointed out
that he felt that it would not work as it did not capture the interdependence
of the "e"s. This lead to the question of how to write Prolog-like code
(including unification and backtracking) in Perl 6. No one offered answers.

Autrijus wanted to know if hash keys were still just strings or if they
could be more. The answer is that by default they are strings, but you can
declare them as having a different shape. This led to a
discussion of hashing techniques such as the .bits,
.canonicalize, or .hash methods.

Dave Whipp wanted to make "dynamically-scoped dynamic scopes". My head hurt,
but apparently Larry's didn't. He replied, "Piece of cake, the syntax [and
implementation] are left as an exercise for the would-be module author."

Rod Adams asked how he could specify arguments to rules so they could be
more function-like. Larry explained that there were several syntaxes, each of
which can coerce its arguments in slightly different ways. He then mused that
perhaps there were too many. I agree. There are too many.

Abhijit Mahabal wondered how type checking will work for cases where it is
not easy to determine the types at compile time. The answer: checking will be
deferred to run time. In the end it seems that Perl 6 will blur the line between
run time and compile time heavily. Perhaps it will provide nifty support for
staged programming. Meta-Perl 6 here we come.

Brian Ingerson asked about the CONFIG hash and what sort of
secondary sigil it would have. Larry explained that $?CONFIG holds
the config for the machine compiling the program and $*CONFIG
holds the config for the machine running the program. Then he made some noise
about parsing, compiling, and running all on different machines. Then he
suggested that this way led to drug induced madness.

Luke Palmer wondered how optional arguments and slurpy ones would interact.
Brent and Larry explained that they would snap up whatever arguments they
could, but you can always beat them back by piping in your slurpy stuff with
==>.

Thomas Sandlaß wants to know how the type system and the class system
interrelate. He drew a happy tree of A, B, and its junctions. Really it
confused me, and I agree with him that I don't understand the value of the one
junction in the context of types.

Wolverian does not like any of the ways he can indent his long function
declaration when it uses traits. He wants to allow a comma in them to solve
this dilemma. Larry and others suggested a few alternatives. This led to a
discussion of module loading and header/module files. Larry admitted that he
would not mind if Perl 6 developed Ada-like module files.

Perl 6 Compiler

Pugs Releases and patches

Various Pugs Patches

Luke Palmer added more qq delimiters and fixed a unary
- bug. Yuval Kogman posted a fix that made anonymous blocks both
parse and run. Stevan Little un-TODOed a bunch of tests that started working;
he went on to add some new tests that do not yet pass. I suspect that he is
just providing more for him to un-TODO later. Yuval Kogman submitted several
patches including array interpolation, a CATCH {} test, a test for
an assignment bug, and a fix for a conditional of expected. Garrett Rooney
cleaned up given.t, added a test for %hash.kv, one for
declaring variables in a loop, and another for $?LINE and
$?FILE.

Abhijit Mahabal wondered if p6c was the correct place to post questions
about Pugs and bugs in Pugs. Patrick and Autrijus agreed that p6c was indeed
the correct place for most initial questions. Things will escalate to p6l only
when the Apocalypses|Exegeses|Synopses are not clear.

Garrett Rooney was having trouble using the &?SUB variable in pointy
subs. That is because they don't use it. &?SUB is only for full-fledged
subs. That way you can call &?SUB from within a for loop in a sub and get
the nice recursive behavior you likely want.

Autrijus asked Larry for clarification of which circumstances set
$_. Larry explained that -> topicalizes its first
argument but full subs undefine it until something else sets it, while methods
bind it to their first invocant.

Luke Palmer was having trouble getting for %hash.keys { ... }
to parse correctly. Larry replied that it is problematic if methods parse in
the same manner as subs. Fortunately, the parens are mandatory when there are
arguments in addition to invocants.

Darren Duncan has offered to start the ball rolling with Perl 6 integration
testing. He will translate a few modules he has written to Perl 6 so that they
can act as more holistic tests for Pugs and Perl 6. There is an interesting
conversation about CPAN and Perl 6 involved too.

Bernhard Schmalhofer asked about adding heredoc support to PIR. This led to
Melvin ranting that PIR is not a language for people to write. PIR's goal was
to be an intermediate language targeted by compilers and was not supposed
to have human niceties like heredoc. Of course, for PIR to reach that state, we
need a high level language that actually targets it.

Leo announced that he has merged Dan's string patch into the current CVS
head. Thanks go to Will Coleda for doing all the heavy lifting. String content
in assemblers now assumes the iso-8859-1 charset, unless you specify
otherwise.

chromatic: I've followed your journal from the beginning, but it didn't start
from the start. Where did you come up with this crazy idea?

Autrijus: Ok. The story is that I hacked SVK for many months with clkao. SVK worked, except it is not
very flexible. There is a VCS named darcs, which is much more flexible,
but is specced using quantum physics language and written in a scary language
called Haskell. So, I spent one month
doing nothing but learning Haskell, so I could understand darcs. Which worked well;
I convinced a crazy client (who paid me to develop Parse::AFP) that Perl 5 is
doomed because it has no COW (which, surprisingly, it now has), and to fund me to
develop an alternate library using Haskell.

(I mean "Perl 5 is doomed for that task", not "Perl 5 is doomed in
general".)

chromatic: Copy-on-Write?

Autrijus: Yeah.

chromatic: So that's a "sort-of has".

Autrijus: Yeah. As in, sky suddenly worked on it and
claims it mostly works. Haven't checked the code, though.

chromatic: It's been in the works for years. Or "doesn't works"
perhaps.

Autrijus: But I digress. Using Haskell to develop
OpenAFP.hs led to programs that eat constant 2MB memory, scale
linearly, and are generally 2OOM faster than my Perl library.

Oh, and the code size is 1/10.

chromatic: Okay, so you picked up Haskell to look at darcs to borrow ideas from
for svk, then you convinced a client to pay you to write in Haskell and you
started to like it. What type of program was this? It sounds like it had a
bit of parsing.

Autrijus: AFP is IBM's PDF-like format, born 7 years
before PDF. It's unlike PDF in that it's all binary, very bitpacked, and is
generally intolerant of errors. There was no free library that parses or
munges AFP.

chromatic: Darcs really impressed you then.

Autrijus: The algorithm did. The day-to-day slowness and
fragility for anything beyond mid-sized projects did not. But darcs is
improving. But yeah, I was impressed by the conciseness.

chromatic: Is that the implementation of darcs you consider slow or the use of
Haskell?

Autrijus: The implementation. It basically caches no info
and recalculates all unnecessary information. Can't be fast that way.

chromatic: Hm, it seems like memoization is something you can add to a
functional program for free, almost.

Autrijus: Yeah, and there are people working on that.

chromatic: But not you, which is good for us Perl people.

Autrijus: Not me. Sorry.

Anyway. So, I ordered a bunch of books online including TaPL and ATTaPL so I could
learn more about mysterious things like Category Theory and Type Inference and
Curry-Howard Correspondence.

chromatic: How far did you get?

Autrijus: I think I have a pretty solid idea of the basics
now, thanks to my math-minded brother Bestian, but TaPL is a very
information-rich book.

chromatic: Me, I'm happy just to recognize Haskell Curry's name.

Autrijus: I read the first two chapters at a relaxed pace. By the end of the
second chapter it starts to implement languages for real, and usually by that
time the profs using TaPL as a textbook will tell the students to pick a toy
language to implement.

chromatic: I haven't seen you pop up much in Perl 6 land though. You seemed
amazingly productive in the Perl 5 world. Where'd Perl 6 come
in?

Autrijus: As an exercise. I started using Perl 6 as the
exercise. I think that answers the first question.

Oh. p6 land.

chromatic: More of a playground than a full land, but we have a big pit full
of colorful plastic balls.

Autrijus: Yeah, I was not in p6l, p6i or p6c. However, the weekly summary
really helped. Well, because I kept hitting the limits of p5.

chromatic: It seems like an odd fit, putting a language with a good static
type system to use with a language with a loose, mostly-optional type system
though.

Autrijus: Most of the more useful modules under my name (including the ones
Ingy and I inherited from Damian) were forced to be done in klugy ways because
the p5 runtime is a mess.

chromatic: You should see Attributes::Scary. Total sympathy
here.

Autrijus: Template::Extract uses (?{}) as a nondet engine; PAR comes with its
own perlmain.c; let me not mention source filtering. All these techniques are
unmaintainable without a large dosage of caffeine.

chromatic: Yeah, I fixed some of the startup warnings in B::Generate a couple of
weeks ago...

Autrijus: Cool. Yeah, B::Generate is abstracted klugery and may pave the way
for Pugs to produce Perl 5 code.

chromatic: Parrot has the chance to make some of these things a lot nicer. I'm
looking forward to that. Yet you took off down another road.

Autrijus: Actually, I think Pugs and Parrot will meet in
the middle. Where Pugs AST meets Parrot AST and the compiler is written in
Perl 6 that can then be run on Parrot.

Autrijus: The easier plan is simply for Pugs to have a Compile.hs that emits
Parrot AST. Which, I was happy to discover yesterday, is painless to write.
(Ingy and I did a KwidAST->HtmlAST compiler in an hour, together with parser
and AST.)

chromatic: Kwid and HTML, the markup languages?

Autrijus: Yeah.

Ok. So back to p6. P5's limits are apparent and not easily fixable.

chromatic: It sounds like you wanted something more, and soon.

Autrijus: Parrot is fine except every time I build it, it
fails.

chromatic: Try running Linux PPC sometime.

Autrijus: FreeBSD may not be a good platform for Parrot, I gathered. Or my
CVS luck is really bad. But I'm talking about several months ago.

Autrijus: I was very interested in Ponie. I volunteered to Sky to handle svn
and source organization and stuff, but svn was not kind to Ponie.

obra: Well, that was before svn 1.0.

Autrijus: Right. Now it all works just fine, except
libsvn_wc, but we have svk now, and I learned that Sky has been
addicted to svk.

But anyway. The beginning stage of Ponie is XS hackery, which is by far not
my forte. I've read Lathos' book, so I can do XS hackery when forced to, but
not on a volunteer basis. Oh no.

chromatic: That's a special kind of pain. It's like doing magic tricks,
blindfolded, when you have to say, "Watch me push and pop a rabbit out of this
stack. By the way, don't make a reference to him yet...."

Autrijus: So, on February 1, when I had too much caffeine and couldn't sleep,
I didn't imagine that Pugs would be anything near a complete implementation of
Perl 6. I was just interested in modeling junctions, and some other nifty
things like subroutine signatures, but things quickly went out of control.
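
A junction, for the curious, is a single value that acts as a superposition
of several, with comparisons "autothreading" across its members. Purely as a
hypothetical sketch of the modeling idea, and emphatically not Pugs source,
one might start in Haskell with something like:

    -- Hypothetical sketch: a junction as a quantifier plus its members.
    data Junction a = Any [a] | All [a]

    -- Collapse a junction under a predicate, mimicking how a comparison
    -- autothreads across the junction's members.
    test :: (a -> Bool) -> Junction a -> Bool
    test p (Any xs) = any p xs
    test p (All xs) = all p xs

    main :: IO ()
    main = do
      print (test (> 3) (Any [1, 4, 2]))  -- True: 4 > 3
      print (test (> 3) (All [1, 4, 2]))  -- False: 1 and 2 fail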

chromatic: There's a fuzzy connection in the back of my head about Haskell's
inferencing and pattern matching being somewhat similar.

chromatic: As long as they do the right thing with regard to roles, go
ahead.

Autrijus: They do. :)

chromatic: This was an academic exercise though?

Autrijus: Yeah. It stayed as an academic exercise I think for
two days.

chromatic: "Hey, this Perl 6 idea is interesting. I wonder how it works in
practice? I bet I could do it in Haskell!"

Autrijus: Yup. Using it as nothing more than a toy language to experiment
with, initially targeting a reduced set of Perl 6 that is purely functional.
But by day three, I found that doing this is much easier than I thought.

Autrijus: Well, Parsec and Happy. Happy is more traditional; you write in a
yacc-like grammar thing and it generates a parser in Haskell for you. Parsec
is pure Haskell. You just write Haskell code that defines a parser. The term
is "parser combinator".

chromatic: Haskell is its own mini-language there.

Autrijus: It's a popular approach, yes. When you see
"blah combinator library", think "blah mini-language".

chromatic: I looked at the parser. It's surprisingly short.

Autrijus: And yet quite complete. Very maintainable,
too.
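
To give a flavor of why a combinator parser stays short, here is a small
illustrative example in Parsec (my own, not taken from Pugs): a parser for a
parenthesized, comma-separated list of integers, written directly as Haskell
code.

    import Text.ParserCombinators.Parsec

    -- Parse one integer: one or more digits, converted with read.
    integer :: Parser Int
    integer = fmap read (many1 digit)

    -- Parse e.g. "(1,22,333)" by composing off-the-shelf combinators.
    intList :: Parser [Int]
    intList = between (char '(') (char ')') (integer `sepBy` char ',')

    main :: IO ()
    main = print (parse intList "" "(1,22,333)")  -- Right [1,22,333]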

chromatic: Now I've also read the Perl 5 parser, in the sense that I picked
out language constructs that I recognized by name. Is it a combination
parser/lexer, or how does that work? That's the tricky bit of Perl 5, in that
lexing depends on the tokens seen and lots of context.

Autrijus: Yup. It does lexing and parsing in one pass,
with infinite lookahead and backtracking. Each lexeme can define a new parser
that works on the next lexeme.
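
That "each lexeme can define a new parser" trick is natural in a monadic
combinator library: a value the parser has just read can choose the parser
for the rest of the input. A toy sketch (mine, not Pugs code) of a Perl-style
quoting construct with a user-chosen delimiter:

    import Text.ParserCombinators.Parsec

    -- The delimiter character we just read decides how the remaining
    -- input is parsed.
    quoted :: Parser String
    quoted = do
      _     <- char 'q'
      delim <- anyChar
      manyTill anyChar (char delim)

    main :: IO ()
    main = do
      print (parse quoted "" "q!hello!")  -- Right "hello"
      print (parse quoted "" "q/world/")  -- Right "world"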

chromatic: Does that limit what it can do? Is that why it's purely functional
Perl 6 so far?

Autrijus: The purely functional Perl 6 plan stops at day 3.
We are now fully IO. Started with say(), and mutable variables,
and return(), and &?CALLER_CONTINUATION. So
there's nothing functional about the Perl 6 that Pugs targets now :).

chromatic: Does Haskell support continuations and all of those funky
things?

Autrijus: Yes. And you can pick and choose the funky things you want for a
scope of your code. "In this lexical scope I want continuations"; dynamic
scope, really. "In that scope I want a logger." "In that scope I want a pad."
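
In Haskell, this per-scope opt-in is typically done by stacking monad
transformers. A small illustrative stack (my example; Pugs' internals are
surely more elaborate) that grants one scope both a logger and a mutable
counter:

    import Control.Monad.State
    import Control.Monad.Writer

    -- Inside this stack we get a logger (Writer) and a mutable counter
    -- (State); code outside the stack stays pure.
    step :: WriterT [String] (State Int) ()
    step = do
      n <- get
      tell ["counter was " ++ show n]
      put (n + 1)

    main :: IO ()
    main = do
      let ((_, logLines), final) = runState (runWriterT (step >> step)) 0
      mapM_ putStrLn logLines  -- counter was 0 / counter was 1
      print final              -- 2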

chromatic: Performance penalty?

Autrijus: Each comes with its own penalty, but they are generally small.
GHC, again, compiles to very fast C code.

chromatic: Can you instrument scopes at runtime too?

Autrijus: Sure. &?CALLER::SUB works. And
$OUTER::var.

chromatic: Are you compiling it to native code now? I remember that being a
suggestion a few days ago.

Autrijus: Pugs itself is compiled to native code; it is
still evaluating Perl 6 AST, though.

Autrijus: Cool. So yeah, it's like Perl 5 now. The difference is that B::*
is trivial to write in Pugs.

chromatic: Except maintainable.

Autrijus: And yeah, there's the maintainable bit. Pugs is
<4k lines of code. I think porting Pugs to Perl 6 will take about the same
number of lines, too.

chromatic: You already have one module, too.

Autrijus: Yup. And it's your favorite module.

chromatic: I've started a few attempts to write Test::Builder in
Parrot, but I'm missing a few pieces. How far along are classes and objects in
Pugs?

Autrijus: They don't exist. 6.2.x will do that, though. But the short-term
task is to get all the todo_() tests cleaned up, which will give us an
interpreter that really agrees with all the Synopses. At least in the places
we have implementations of, that is.

chromatic: I see in the dailies that you are producing boatloads of runnable
Perl 6 tests.

Autrijus: Yup, thanks to #Perl6. I seldom write tests now
:) The helpful committers do that for me.

chromatic: How do you know your code works then?

Autrijus: I just look at the newest todo_ and start working on it.

chromatic: Oh, they write tests for those before you implement
them?

Autrijus: Yup. It's all test-first.

chromatic: Okay, I'll let you continue then.

Autrijus: Ha. So yeah, the cooperation has been wonderful. Camelfolks write
tests and libraries, and lambdafolks make those tests pass. If a camelfolk
wants a particular test to pass sooner, then that person can learn from a
lambdafolk :). Things are easy to fix, and because of the coverage there's
little chance of breaking things. If lambdafolks want to implement new things
that may or may not agree with the Synopses or p5 norms, then they learn from
camelfolks.

chromatic: Have you started giving Haskell tutorials? I know Larry and
Patrick have started to pick up some of it. I'm pretty sure Luke and Damian
have already explored it (or something from the same family
tree).

Autrijus: I think I've read a paper from Damian that says he taught Haskell
at Monash. That was before the monadic revolution, though.

chromatic: It sounds like you're attracting people from both sides of the
fence then.

Autrijus: It indeed is. I get svn/svk patches and darcs
patches.

chromatic: Is there a lot of overlapping interest? Where does it come
from?

Autrijus: Well, ever since the monadic revolution of '98
Haskell people have started to do real world apps.

chromatic: Now that they can do IO, for example.

Autrijus: Yeah. It was only 7 years ago. And recently the Haskell world has
gained its own native version control system, a Perl Review-like magazine,
CPAN/MakeMaker-like infrastructure, etc. So it's growing fast.

chromatic: There's still a lot of attraction there for real world
applications, of which Pugs is one?

Autrijus: Pugs is a practical project in that working on it has a chance of
solving real problems, and is very fun to boot. And although p5 gets no
respect, p6 in general is very slick. So the mental barrier is lower for
lambdafolks to join, I think.

chromatic: The lambdafolks like what they see in Perl 6?

Autrijus: Yup. I quoted Abigail on #Haskell a while
ago.

chromatic: I saw something earlier about access to libraries and such. Do you
have a plan for the XS-alternative?

Autrijus: Yeah, Ingy is working on it in ext/Kwid/; eventually it will
inline Haskell code. And with luck, it will inline other kinds of code as
well through Haskelldirect (the Haskell equiv of Inline).

chromatic: Is this within Pugs or Perl 6 atop Pugs?

Autrijus: It's within Pugs. The Parrot side has not been discussed much
yet.

chromatic: Yeah, the Parrot AST needs more documentation.

You're devoting a lot of time to it. Obra mentioned that you've
cleared most of your paying projects out of the way for the time being. What's
the eventual end?

Autrijus: And whither then? I cannot say :). As you
mentioned, I've diverted most of my paying projects away so I should have at
least 6 months for Pugs.

chromatic: How about in the next month?

Autrijus: This month should see robust semantics for basic operations, the
beginning of classes and objects, and many real modules with hooks to
Haskell-side libraries.

chromatic: I'll do T::B then.

Autrijus: Oh, and Pugs hands out the committer bit liberally, so if you want
to do T::B, I'll make you a committer :). You can actually start now. Just
write imaginary Perl 6 code, and we'll figure out how to make it run. Most of
the examples/* started that way.

chromatic: Ah, I'll take a look.

Autrijus: Oh. Right. I was quoting Abigail.

"Programming in Perl 5 is like exploring a large medieval castle,
surrounded by a dark, mysterious forest, with something new and unexpected
around each corner. There are dragons to be conquered, maidens to be rescued,
and holy grails to be quested for. Lots of fun."

"Perl 6 looks like a Louis-XVI castle and garden to me. Straight, symmetric,
and bright. There are wigs to be powdered, minuets to be danced, all quite
boring."

I, for one, am happy for Perl to move from the dark age to the age of
enlightenment. I think many camelfolks and lambdafolks share the same
sentiment :).