Well, I survived talking about A/B testing

I gave my talk. I was more than a little dazed while I gave it, but it seemed to go well. Most of my audience was still there at the end, and they were still alive enough to laugh in the right places.

I won't really know how it went until I get my feedback in a couple of weeks.

Ideally this would be the time when I could sit back and enjoy the conference. Unfortunately child care responsibilities made me come home after one day, so I'll miss it. However, in a short time I saw a lot of familiar faces (e.g. Schwern, Robert Spiers, Damian, merlyn, etc.) and met some interesting people (e.g. I had lunch with the CTO of Wikipedia). Plus I enjoyed Damian's talk on scripting vim. (Tip: Do not let him near your vim configuration anywhere near April 1. Just... don't.)

For those who are interested, my slides are online. I still have a couple of edits to make (mainly that I want to put in some of the actual questions I was asked), and they should be posted on the OSCON site some time this week.

Wednesday July 16, 2008

11:20 AM

Please give feedback on my OSCON tutorial

I will be presenting about A/B testing on Monday. I now have a rough draft of my talk. Any feedback, typo corrections, etc would be appreciated. Note that it is supposed to take 3 hours to present, so it is kind of long.

And heck, you might learn something from reading the presentation. :-)

Friday June 27, 2008

09:27 AM

How do I share JavaScript?

I will be presenting about A/B testing at OSCON. Since I'm presenting in the web track, I can't assume that people have Perl. Besides, I've strongly recommended that people create a significance calculator as a web page, so I wanted to demonstrate what that looks like. So I decided to kill two birds with one stone and write that in JavaScript.

For this purpose I've ported Statistics::Distributions from Perl to JavaScript. Which is fine and dandy for my purposes, but now I'm wondering whether I can contribute this code to some project somewhere. With Perl it is easy - everyone knows that CPAN is the place to do things like that. With JavaScript, what are my options?

Monday June 23, 2008

03:26 PM

I hate context

I've hated Perl's notion of context for a long time. So this weekend was just confirmation.

The problem is simple. Perl's notion of context requires that we think about what we want to do in list and scalar contexts, and potentially do different interesting things in each. This automatically doubles all APIs. Now it is true that sometimes there is something useful you can hang on this hook. But in my experience it is more often true that nothing really is obvious. And in that case, with depressing frequency, you get design decisions that age poorly.

For example at one point I read an article suggesting that it was a good idea to return a reference to an array in scalar context. I was briefly convinced and did this in Text::xSV. And now I curse myself every time I write:
my $name = $csv->extract("name");

and it does what I don't want.
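To make the trap concrete, here is a minimal sketch of the pattern. The extract function and the %row data below are hypothetical stand-ins that I made up for illustration; this is not Text::xSV's actual implementation, just the same "array reference in scalar context" idea built on wantarray.

```perl
use strict;
use warnings;

# Hypothetical data and accessor, invented for this illustration.
my %row = (name => 'Alice', city => 'Paris');

sub extract {
    my @values = map { $row{$_} } @_;
    # List context: return the values themselves.
    # Scalar context: return a reference to an array of them.
    return wantarray ? @values : \@values;
}

my ($name_list) = extract('name');   # parentheses force list context
my $name_scalar = extract('name');   # scalar context: an array reference

print "list context:   $name_list\n";
print "scalar context: ", ref($name_scalar), "\n";
```

The only difference between the two calls is a pair of parentheses on the left-hand side, which is exactly the kind of thing that is easy to forget when you only want one field.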

Now it is easy to say that this is a poorly thought through design decision. And it was. But I've noticed that attempts to be clever with context frequently lead to bad design decisions. And result in making APIs more complex than they need to be. Sure, context is occasionally handy. But when I compare Perl to Ruby or JavaScript, I find that on balance context doesn't seem worthwhile to me.

This is old hat. What prompted this is something specific. On the Rose::DB::Object list we had a discussion about a bug between Rose::DB::Object and Template Toolkit. Here is the bug in simplest possible form:
#!perl -w

It turns out that Template::Stash::Context added the ability to do list operations only to native scalar types, and not to scalar objects. This fixes that.

Of course there is the real underlying problem, which is why returning 1 thing is so different from returning many things. Digging further the cause of that problem is that TT does not internally maintain a notion of context. Because of that, it winds up with the same internal data structure for the return out of scalar context and the return of one thing in list context. Then when it has to thaw it out, it has no choice but to treat them as being the same thing. I found at least one place that assumption was encoded, but there are more, and it doesn't look easy to fix.

Now it is easy to say "poor design decision within TT". It is. However there is an old design principle. Which is that when people make repeated mistakes involving one part of the interface, at some point you have to ask whether the real mistake is in the design of the interface itself. (This is not, incidentally, a software principle. Read The Design of Everyday Things for more on it. I also saw it repeatedly emphasized in a book about industrial disasters.)

It seems to me that there are repeated mistakes made by good people involving the notion of context. Which I take as evidence that the problem is with the idea of context itself. And then I conduct a sanity check. There are a lot of languages out there that deliberately borrowed a lot of ideas from Perl. How many have chosen to borrow the idea of context? Why not? :-)

Saturday May 24, 2008

12:02 AM

Thank you CPAN

Everyone's favorite reason to use Perl just came through for me again. Let me give some background.

At $work I am in charge of reporting. Our business has something called events. Different events are treated differently, and sometimes we make money from them and sometimes we don't. Sometimes we make more money from them, and sometimes less. One of the things we want reports for is to figure out what factors make events make more money. (Obviously because we want to make more events make money for us!)

Now I have a fairly flexible report that we'll call revenue per event, because that is its name. It allows us to see revenue per event at various ages broken out by various combinations of factors. Such as whether passwords are needed to login, whether gift certificates were added, whether specific promotions were run, that kind of thing. This is a very useful report. We can see that, for instance, running a promotion brings in money (duh) and tell about how much more money events with that promotion make.

But we have a problem. You see, we have a pretty good idea what factors make events make more money. (Unfortunately we don't control how the event is set up, we're handling them as a service for our direct customers.) So when people take our advice they do several things that are good. How can we tell how good each individual thing they are doing is?

Hmm... let's see. Sounds like we need to do some sort of multi-variable linear regression. Why don't we look on CPAN and find Statistics::Regression and see if that works? Oh look, it does! I've used it in the past. But look, it added a method called standarderrors, what is that? (Run some tests, make hypothesis, email author, get confirmation.) Goody, we not only can find a linear regression, but we can get estimates of how much random noise in the data might be throwing off the coefficients! I had been bothered by the fact that people tend to take the numbers I produce as gospel with no eye to whether there was any statistical validity to the numbers.

Hrm, do I trust the module? I take a legitimate pride in knowing a fair amount of math, but I know I don't know how to do this. Well, look at the source code and... holy crap! OK, for me to learn this to my satisfaction would take a long time. What programmer contributed it and do I trust him? Rummage around and do some research... oh, he's a professor at Brown. He teaches courses on multi-variable statistics. I think I can trust that he knows his stuff! :-)

Getting my report to do multi-variable regression and display it the way I wanted still wasn't easy. But at least it was hard for programming reasons (some day I need to write something explaining why it may be important to put a condition in an ON clause rather than a WHERE clause; I spent an hour tracking down the resulting bug), and not because I couldn't figure out the math.

I'll be presenting at OSCON :-)

I just got notified that my tutorial on A/B testing has been accepted. I'll be presenting at 1:30 on Monday, July 21. Now I just have to extend and improve a 2 hour presentation into a 3 hour one.

That, and figure out where I'm supposed to have the break. And decide whether I'm going to port Statistics::Distributions to JavaScript so that I can port my code samples to JavaScript. That way the code that I present can be the interface that I recommend developing for your users...

Wednesday January 23, 2008

12:57 AM

Why make classes you can't subclass?

Let me give the background. I'm using Template Toolkit. Because I want to be able to write things like [% cgi.popup_menu(...).scalar %] and get reasonable output, I am using Template::Stash::Context. And then I decided that I'd like to also have it check to catch any use of unknown variables in the template.

If you look at the design then it is obvious that I need a different stash. And there is no stash on CPAN that does what I want. (Both strict checks and context.)

Obviously I should just subclass it, right? Wrong. If you look in Template::Stash::Context you'll find checks in key places that check whether you're at the root by testing whether the reference type is __PACKAGE__. Which a subclassed object is not.
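For anyone who hasn't been bitten by this, here is a minimal sketch of why an exact package comparison defeats subclassing. The Base and Derived classes and both method names are hypothetical, made up for the illustration; this is not the code in Template::Stash::Context, only the general failure mode.

```perl
use strict;
use warnings;

package Base;   # hypothetical class, invented for this illustration

sub new { my ($class) = @_; return bless {}, $class }

# Exact comparison: true only for objects blessed directly into Base.
sub check_exact { my ($self) = @_; return ref($self) eq __PACKAGE__ ? 1 : 0 }

# Inheritance-aware comparison: true for Base and anything derived from it.
sub check_isa { my ($self) = @_; return $self->isa(__PACKAGE__) ? 1 : 0 }

package Derived;
our @ISA = ('Base');

package main;

my $obj = Derived->new;
print "exact match: ", $obj->check_exact, "\n";   # 0
print "isa match:   ", $obj->check_isa, "\n";     # 1
```

Inside Base, __PACKAGE__ is always the string "Base", while ref() on a subclassed object reports "Derived", so the exact test quietly takes the wrong branch.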

OK, can I cut and paste? I got that working, but then threw that solution away because I really don't like checking open source code with its licenses into proprietary codebases. Sure, I know what is OK to do, but I don't want anyone else to lose track. And I don't like giving lawyers heart attacks.

OK, let's just put a proxy class in front of it.

Nope. Didn't work. I didn't track it down, but I assume that it didn't work because it is sometimes passing stashes in calls to other stashes. There is more wrapping/unwrapping needed than I wanted to figure out.

Final solution?
# I do this because I don't want to copy someone else's copyrighted
# code into our codebase, and the class was not written to be easily
# subclassed.
no warnings 'redefine';
my $old_get = \&Template::Stash::Context::get;
*Template::Stash::Context::get = sub {
my ($self, $ident) = @_;

I'm unhappy that I don't give more context, but I'm able to store that in another place in my code so that my debugging messages have all the context I could need.
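The general shape of that trick, as a self-contained sketch: save a reference to the original sub, then assign a wrapper to the same glob. My::Stash, its get method, and the @unknown list are hypothetical stand-ins that I invented so the example runs on its own; the body of the wrapper is my assumption about what a strictness check could look like, not the code I actually deployed. Note that this sketch always calls the original in scalar context, which a real wrapper around a context-sensitive method would have to handle.

```perl
use strict;
use warnings;

package My::Stash;   # hypothetical class standing in for the real stash

sub new {
    my ($class, %vars) = @_;
    my $self = { %vars };
    return bless $self, $class;
}

sub get { my ($self, $ident) = @_; return $self->{$ident} }

package main;

my @unknown;   # names that resolved to undef
{
    no warnings 'redefine';
    my $old_get = \&My::Stash::get;      # keep the original implementation
    *My::Stash::get = sub {
        my ($self, $ident) = @_;
        my $value = $old_get->(@_);      # delegate to the original
        push @unknown, $ident unless defined $value;
        return $value;
    };
}

my $stash = My::Stash->new(title => 'Report');
$stash->get('title');
$stash->get('titel');   # misspelled name: recorded as unknown
print "unknown: @unknown\n";
```

Because the replacement happens on the package's symbol table, every caller of the method sees the wrapper, with no subclass or proxy involved.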

That was far harder than it should have been. And it took me longer to come up with that answer than it should have. But that's what happens when I go back to thinking about Perl after spending all of my time writing SQL.

Well I gave the A/B testing talk

A/B Testing

Well I seem to have volunteered to talk for an indeterminate time on A/B testing at LA.pm on Thursday next week.

I've never given a presentation like this before. And my first one will basically be a math talk directed towards programmers? I could have picked an easier topic!

Ah well. I've long thought that the techniques of A/B testing weren't widely enough understood. This will be my chance to correct that. I'll just have to find somewhere to put my slides when I'm done. As for presentation software, S5 looks like it does what I want.