perlmeditation
Ovid
<h3>How to gain a 50% performance increase over the Schwartzian Transform</h3>
<p><strong>Note:</strong> Where else could I get away with a title as cheeky as this one? :) If anyone can make suggestions to improve the C code, that would be welcome. I'd also like to point out that the title is a joke. There is nothing revolutionary about using C to resolve Perl performance bottlenecks. Also, this mode was more an exploration of [cpan://Inline::C] and not an attempt to speed up the Schwartzian Transform.</p>
<p>I hate forgetting things that I once knew. I returned to college when I was
in my mid-twenties because I hated the fact that I had forgotten how to speak
French<a href="#note1">&sup1;</a>. You can understand, therefore, why it bugs
me that I no longer know how to program in C. I started programming in the early
80s, writing games in BASIC. BASIC, as most of you know, runs as fast and
elegantly as a dead gazelle. In order to speed up the critical portions of my
programs, I taught myself assembler. It was tedious, but I had to do it. You
can imagine how delighted I was when I learned C in college.</p>
<p>That was 15 years ago. I haven't programmed in C since. I've been trying
to improve my resume and started Java classes. I'm going to keep taking them
because knowing Java will tremendously expand my
marketability, but I really don't care for Java<a href="#note2">&sup2;</a>.
So, I was browsing through Powell's technical bookstore here in Portland,
Oregon, and picked up a copy of O'Reilly's "Practical C". It was a godsend.</p>
<readmore>
<p>Relearning C is fine and dandy, but putting it to some practical use is finer
and, um, dandier. So, I went to the CPAN and downloaded [kobe://Inline]. The
Inline module allows you to embed other programming languages, such as C, Java,
or Python, in Perl. If you have extensive libraries written in another
language, it's great to be able to simply reuse them rather than rewrite them
in Perl. In the case of Inline::C (which is bundled with Inline), you can also
gain significant performance benefits.</p>
<p>If you've ever tried to embed C in Perl without Inline, I feel sorry for you.
I never considered it due to the daunting nature of XS and SWIG. With Inline,
it's almost trivial. I installed Inline using Cygwin on a Windows 98 machine.
The install was flawless, though it took longer to install than any previous
module. One of the sections of <tt>make test</tt> took 504 wallclock seconds!
However, having such a large module install without problems on a Cygwin box
told me that someone had done an <em>excellent</em> job of things.</p>
<p>After the install, I ran pod2html on Inline.pod, C.pod, and C-Cookbook.pod.
Reading through them was fairly straightforward. Inline.pod tells you to go
ahead and skip to the Cookbook if you want to get started right away, but I
wouldn't. The documentation can be a bit dry at times, but it's very useful and
the first time some of your embedded C fails to compile, you'll be grateful for
reading the documentation because that is chock full 'o tips and tricks of which
you should be aware.</p>
<p>After working through many of the examples in the Cookbook, I decided it was
time to try writing something myself. My first inclination, to be fair, was to
try writing CGI::Inline. Many people balk at the performance of the CGI module,
but since I need to walk before I can run (and I don't have as much free time as
I would like), I figured I would would start with something small. The
[Schwartzian Transform] (ST) seemed like a perfect candidate for this. The ST
is one of the fastest sorting methods available, so trying to speed it up a bit
with C seemed like a nice challenge. The basic structure of the ST is this:</p>
<code>
@new_list =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [ $_, expensive_function( $_ ) ] }
@old_list;
</code>
<p>Basically, you take an array of data and turn every element into a 2 item
array reference with first item in the reference being the original data and the
second item in the array being the data you want to sort on (in a sortable form,
of course). Then, you sort the array by the second item and map the original
data back to an array.</p>
<p>Hmm... what should I optmize? Well, if <tt>&expensive_function</tt> isn't
crying out for some help, I shouldn't be programming - of course, some would
argue that I shouldn't be programming anyway, but I try not to listen to the
voices coming from the toaster.</p>
<p>I decided to tackle the traditional problem of sorting strings with an
embedded date. I created an arbitrary data format like the following:</p>
<code>
a_bunch_of_random_letters|19670620|and_some_more_random_letters
</code>
<p>I need to grab that date as efficiently as possible. I used the following
Perl snippet:</p>
<code>
sub get_date_perl
{
my $date = substr( $_[0], index( $_[0], '|' ) + 1, 8 );
return join '', unpack( "A4A2A2",$date );
}
</code>
<p>Using [substr], [index], and [unpack] is about as efficient as you can get.
Now, I need to figure out how to do that with Inline. Passing a scalar to an
Inline function and getting back a scalar is fairly simple. I merely define
the the correct return type of the function (SV*), pass it a string and return
a new scalar value. I do this with this:</p>
<code>
SV* get_date( char* str )
{
/* some stuff goes here */
return newSVpv(date,8);
}
</code>
<p>I merely call the function with a <tt>char* foo</tt> argument and padding it
with a NUL is handled for me. Getting the date from this is trivial. Here's
the full function as I would embed it in Perl.</p>
<code>
#!/usr/bin/perl -w
use strict;
use Inline C =><<'END_OF_C_CODE';
#include <string.h>
SV* get_date( char* str )
{
char date[9]; /* the date to return (with an extra byte for the NUL) */
int index = 0; /* index of character in string */
/* note the lack of error checking. I have a guaranteed date format (in
generated test data). You wouldn't do this in the real world. */
while ( str[index++] != '|' );
strncpy( date, &str[index], 8);
return newSVpv( date, 8 );
}
END_OF_C_CODE
</code>
<p>If you prefer, you can also add the C code after the <tt>__DATA__</tt> token
or in another module or file. Now, the first time this is run, it will take
quite some time as Inline has to go through all of the steps necessary to
compile the C and make it accessible to Perl. On subsequent runs, Inline
generates a checksum for the C code and, if the checksum is the same, Inline
will not recompile, as it assumes that the C has not changed. You can override
this behavior if you wish.</p>
<p>Once this has been done, you call the C function like you would call any
other function:</p>
<code>
my $date = get_date( $string );
</code>
<p>Now that I have my Perl code and C code to test against, I run some benchmarks.
There's no point in repeating them here as the two ran neck and neck.
Repeatedly. I had taken too simple of a task and Perl was too efficient. I
decided to complicate things a bit. Many Americans use a <tt>MMDDYYYY</tt> date
format. Now, I not only have to grab the date, I need to rearrange the digits
into a sortable <tt>YYYYMMDD</tt> format. Here's the Perl:</p>
<code>
sub get_date_perl
{
my $date = substr( $_[0], index( $_[0], '|' ) + 1, 8 );
# skipping the scalars and doing return join '', unpack...
# did not result in a performance improvement.
my ( $month, $day, $year ) = unpack( "A2A2A4",$date );
return $year.$month.$day;
}
</code>
<p>Here's the C code I whipped up:</p>
<code>
SV* get_date (char* str)
{
char date[9]; /* the date to return */
char new_date[9]; /* this will be the date after reordering */
int index = 0; /* index of character in string */
while ( str[index++] != '|' );
strncpy( date, &str[index], 8);
new_date[0] = date[4];
new_date[1] = date[5];
new_date[2] = date[6];
new_date[3] = date[7];
new_date[4] = date[0];
new_date[5] = date[1];
new_date[6] = date[2];
new_date[7] = date[3];
return newSVpv(new_date,8);
}
</code>
<p>Now we have something just a tad more interesting. It turns out that we also
have something that's over 20% faster in C:</p>
<code>
Benchmark: timing 50 iterations of Inline, Perl...
Inline: 41 wallclock secs (40.72 usr + 0.00 sys = 40.72 CPU) @ 1.23/s (n=50)
Perl: 53 wallclock secs (53.21 usr + 0.00 sys = 53.21 CPU) @ 0.94/s (n=50)
</code>
<p>Now, that's nice and all, but I was hoping for something a bit more dramatic. Taking a closer look at the ST, we see that the last two steps are a sort, which returns a list, and a map, which returns a list. Wouldn't it be nice if the sort just returned what we wanted? Imagine doing this:</p>
<code>
@new_list =
map_sort { $a->[1] <=> $b->[1] }
map { [ $_, expensive_function( $_ ) ] }
@old_list;
</code>
<p>Of course, that won't work, but I started thinking a bit more about this. My initial idea was to write a bit of C code that would read the array off the stack, create a struct with the original data and the date, sort on the date, and just return the original data. Then, I had a weird inspiration. Now, my original idea is probably better, as the following is filled with dangers and will not always be appropriate for your task, but it should work in many cases.</p>
<p>I remembered that [bikeNomad] had asked [id://92331]. After some discussion, and taking an idea from [tye], he came up with [id://92365|this code snippet]. It occurred to me that I could strip the date, make the original value bi-modal, sort on the numeric value and return the string value all in one go! I call this (pardon the hubris) "The Ovidian Transform" :) It just has one map and one sort. Here are the benchmarks:</p>
<code>
Benchmark: timing 50 iterations of Inline, Ovidian, Perl...
Inline: 41 wallclock secs (40.72 usr + 0.00 sys = 40.72 CPU) @ 1.23/s (n=50)
Ovidian: 34 wallclock secs (34.71 usr + 0.00 sys = 34.71 CPU) @ 1.44/s (n=50)
Perl: 53 wallclock secs (53.21 usr + 0.00 sys = 53.21 CPU) @ 0.94/s (n=50)
</code>
<p>I ran them several times and got consistent results. A performance improvement of about 50% <em>over</em> the Schwartzian Transform. Full code is below. Enjoy!</p>
<code>
#!/usr/bin/perl -w
use strict;
use Inline 'C';
use Benchmark;
use vars qw/ $test_data @perl_data @inline_data @ovid_data /;
my $date = 'abc|03152001|';
my $c_date = get_date( $date );
my $p_date = get_date_perl( $date );
if ( $c_date eq $p_date )
{
print "Dates match: '$c_date'\n";
}
else
{
print "Dates do not match.\nC: '$c_date'\nPerl: '$p_date'\nTerminating program.\n";
exit;
}
print "Generating test data.\n";
$test_data = rand_data();
timethese(50, {
'Perl' => '
@main::perl_data =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [ $_, main::get_date_perl( $_ ) ] }
@$main::test_data;
',
'Inline' => '
@main::inline_data =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [ $_, main::get_date( $_ ) ] }
@$main::test_data;
',
'Ovidian' => '
@main::ovid_data =
sort { $a+0 <=> $b+0 }
map { main::get_date_c( $_ ) }
@$main::test_data;
'
});
# change to 1 if you want to print this to verify results
if ( 0 )
{
open PERL, "> perl.txt" or die $!;
open INLINE, "> inline.txt" or die $!;
open OVID, "> ovid.txt" or die $!;
print PERL join "\n", @perl_data;
print INLINE join "\n", @inline_data;
print OVID join "\n", @ovid_data;
close OVID;
close INLINE;
close PERL;
}
sub get_date_c
{
my $data = shift;
my $date = get_date( $data );
my $dual_var;
set_both( $dual_var, $data, $date+0 );
return $dual_var;
}
sub get_date_perl
{
my $date = substr( $_[0], index( $_[0], '|' ) + 1, 8 );
my ( $month, $day, $year ) = unpack( "A2A2A4",$date );
return $year.$month.$day;
}
sub rand_data
{
my @years = qw/ 1999 2000 2001 /;
my @months = ( '01' .. '12' );
my @days = ( '01' .. '28' ); # since this is just sample data,
# I'm not too worried about getting
# this perfect
my @sample;
for ( 1 .. 10000 )
{
my $garbage = rand_letters().'|';
$garbage .= $months [ random( 12 ) ];
$garbage .= $days [ random( 28 ) ];
$garbage .= $years [ random( 3 ) ];
$garbage .= '|'.rand_letters();
push @sample, $garbage;
}
return \@sample;
}
sub rand_letters
{
my $foo = '';
for ( 1 .. int( 100 * rand ) + 1 )
{
$foo .= ( 'a'..'z' )[ random( 26 ) ];
}
return $foo;
}
sub random
{
my $int = shift;
return int ( $int * rand );
}
__END__
__C__
void set_both(SV* variable, SV* string, SV* numeric)
{
SvPV(string, PL_na);
if(!SvPOKp(string) || (!SvNOKp(numeric) && !SvIOKp(numeric)) )
{
croak("Usage: set_both variable,string,numeric");
}
sv_setsv(variable,string);
if(SvNOKp(numeric))
{
sv_setnv(variable,SvNV(numeric));
}
else
{
sv_setiv(variable,SvIV(numeric));
}
SvPOK_on(variable);
}
SV* get_date (char* str)
{
char date[9]; /* the date to return */
char new_date[9]; /* this will be the date after reordering */
int index = 0; /* index of character in string */
while ( str[index++] != '|' );
strncpy( date, &str[index], 8);
new_date[0] = date[4];
new_date[1] = date[5];
new_date[2] = date[6];
new_date[3] = date[7];
new_date[4] = date[0];
new_date[5] = date[1];
new_date[6] = date[2];
new_date[7] = date[3];
return newSVpv(new_date,8);
}
</code>
<p><small><a name="note1">1.</a>Perversely, I have forgotten French again. I
was in Paris buying a pack of cigarettes and asked the cashier if she had any
matches (les allumettes). I said "Pardon, avez-vous des alouttes?" (Pardon
me, do you you have any larks?). It was most confusing.</small></p>
<p><small><a name="note2">2.</a>It's bloated, slow, has irritating limitations
like not being able to subclass String to add methods, etc. Very annoying. On
the other hand, it has taught me quite a bit about OO programming and has a
tighter type-checking system than Perl. And when was the last time you wrote
an applet in Perl?</small></p>
<p>Cheers,<br />
<a href="/index.pl?node=Ovid&lastnode_id=1072">Ovid</a></p>
<p><strong>Update:</strong> I'm trying to remember which monks helped me with some C questions. [wog] was one. <tt>/msg Ovid ...</tt> if you also helped. I'd like to give credit where credit is due!</p>
<p><small>Join the <a href="http://setiathome.ssl.berkeley.edu/stats/team/team_86606.html">Perlmonks Setiathome Group</a> or just click on the the link and check out our stats.</small></p>