Thoughts on Perl and Emacs, technology and writing

Posts Tagged ‘perl’

I wouldn’t get into google (their interviews don’t select for my brand of genius). But that’s fine as I wouldn’t be able to write perl there anyway.

I was trying to think of a few reasons why Google would have chosen Python over Perl as their primary scripting language. A number of the ones I came up with are less relevant than they used to be.

Python integrates better with C++

Boost.Python promises seamless interoperability between the two languages. It has been a while since I’ve been tempted to embed a scripting language in my C++ apps so I’m not sure what the counterpart is in Perl if any.

Python "threads"

Back in the day, these were better than the Perl offering. More recently, on Unix at least; with Coro, AnyEvent, Twisted, etc. it’s a wash.

Windows Implementation

A few years ago, the various Perl Windows implementations were not very high quality. Strawberry Perl has changed all that though. It’s awesome.

Jython et el.

Was Jython around when Google was choosing Python? I guess it was.

Perl (5) hasn’t escaped its VM as well as Python and Ruby. Having said that, the Jythons and JRubys of the world are always second class citizens compared to the original C VM as no-one can be bothered to rewrite all the C extensions.

Easier to avoid making a mess as a secondary language?

Experts in either language can write clean, efficient code. But what about folks who are expert in C++ and Java and use either Perl or Python on the side and only occasionally?

I think such usage for a large system will result in unwieldy code in either language, unless the large system is curated by experts, but for small to medium systems in the absence of experts, python may have the edge.

Rabid anti-Perlism

Even smart folks I know seem to think Perl is lacking in some way in comparison to Python. I haven’t managed to get them to enumerate any reasons though so I figure they are speaking from a position of ignorance. This could have been the case for the folks making the decisions for google too or, more likely, they may have considered some of the other reasons on my list.

The underlying reason for this is that structures are really vectors. (person-age object) translates into (aref object 1) which should hopefully be pretty fast. Another structure doesn’t need to keep its age member at an offset of 1, so I can’t say it is (generic-structure-age object).

In basic perl objects, because they are really just a hash reference, I don’t need to specify the type. When it is needed, the VM figures it out at runtime.

For a long time now I have suspected that calling perl subroutines is slow. And I couldn’t figure out from the language shootout which benchmark tested subroutine calls (did it use to be Ackermann?) so I made up my own benchmark.

Perl in comparison to its closest rival – CPython.

I tested a few things:

First, a loop that does nothing, to see how much is loop overhead
Next a zero parameter function
Then a function called with two integers
And finally, declaring those two integers inline.

I’m assuming that an optimiser doesn’t come along and remove code that does nothing. From the results, it seems like a safe assumption.

And, I’m not running benchmarks multiple times or with many iterations because, frankly, I don’t care that much. I just want to get an idea as to how Perl stacks up.

So I’m perhaps 1% of the way to a poorly thought out middleware component like CORBA1. No, it’s more light-weight, maybe a messaging layer, sorry I mean wire-level protocol specification implementation such as AMQP.

And then I think (like hundreds have probably thought before me), you know, this would be more useful if it had authentication. After all, I don’t want just anyone to be able to send kill signals to any processes. That would be like everyone being root. Which terrifies me.

Don’t invent your own authentication mechanism

And the golden rule about authentication, as far as I can work out, is don’t invent your own authentication mechanism. You’ll get it wrong and leave gaping vulnerabilities for the bad guys to have their wicked way with you. That is, if anyone besides you ever uses your code. And besides, I don’t want to waste any of my 1500 lines on coming up with Yet Another Broken Authentication System.

A quick trip to CPAN

Then I’m looking through the Authen::XXX modules on CPAN and none of them behave in exactly the way I want. But somehow I find a perl server that includes authentication and perhaps does everything I want and I should definitely put it on my list of things to look into even though I’m having a lot of fun with AnyEvent right now.

But by the time I come to look again, I can’t find it. And I’ve complained about documentation before, but Emacs really does deserve it, and I know of no system or language that is better documented than Perl. But I guess the classification problem is a bit tricky to overcome.

ARCv2

Anyway, long story short, I found it.

authenticated perl server -http

An Authenticating Perl Server

The first link (warning PDF) is a paper about using Authen::SASL in client/servers and it mentions ARCv2 which sounds like what I’m looking for.

The first thing to do is find out if it does what I want. The second is to check if it works on Windows.

I’ve been playing around with Plack and Twiggy recently and that motivated me to take a look at AnyEvent, the eventing library that Twiggy is built upon.

Now, it seems to me that AnyEvent is useful in a similar problem-space to POE (or Coro, on Unix at least). POE has more documentation but AnyEvent can actually use a couple of Event Loops that were implemented in C: EV and libevent which is the backbone of the hugely successful memcached. Sounds good to me.

As usual, first things first. How do you make a simple TCP server? I took a look at how Twiggy does it – the code is available in Twiggy::Server. I won’t need everything in there of course.

The Preamble

I like the way that Twiggy sets a constant called DEBUG from an environment variable. Now I can call the script like this: $ SERVER_DEBUG=1 ./ae.pl and get debugging output.

Basic Perl Objects

The watch variables only watch while they are in scope. If we have a simple object, we can dump them into the underlying blessed hash reference to keep ’em in scope. I’m lamenting the non-standardness of Moose here, but that’s another post.

Bottom-up or Top-down I’m never quite sure how to present my code. Hmmm…

AnyEvent::tcp_server

So, we’re assuming here that a server object will be called (with my $server Server->new()=) and then we will call $server->start_listen(...).

The example tcp_server call in the AnyEvent::Socket documentation rather unhelpfully demonstrates closing the socket the moment it has connected with a The internet is full, $host:$port. Go away! message. Oh well, maybe my examples are equally flawed. Thank goodness for Twiggy.

prepare_handler just logs a basic message on start-up. The accept handler returns a closure to maintain access to $self. The closure sets the socket options and then creates a watcher using watch_socket.

AnyEvent IO watcher

The watcher is setup to echo whatever it received, back to the sender. If it receives EOF (sent when a telnet client hits CTRL-D), then it terminates the connection.

Now, I didn’t manage to get this working immediately. If the watcher goes out of scope, it doesn’t end up watching anything. And I originally omitted the undef $headers_io_watcher statements. As the closure wasn’t referring to the watcher variable, it went out of scope immediately. Adding them added a reference which stopped that happening. Nice, if a little subtle.