Note

This is the column as published in the magazine, after I quickly edited
the original to remove the references to Hooters.

Copyright Notice

This text is copyright by CMP Media, LLC, and is used with
their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in
WebTechniques magazine.
However, the version you are reading here is as the author
originally submitted the article for publication, not after their
editors applied their creativity.

I heard about the ``Am I Hot or Not?'' site a little while back, and it
was all the rage (and might still be, when you read this), being able
to put a random picture up and coming back later to see how people had
voted. And with all that presidential election news last year, the
word ``voting'' just kept coming up, and I thought it was time to talk
about how tough (or easy) it is to take a poll yourself.

So, I quickly hacked up a voting scheme using DBI, including a
mechanism to reject duplicate votes (well, at least someone
immediately hitting reload to vote the same), and made it possible to
run it under mod_perl's Apache::Registry. But what to vote on?

Well, imagine you ran across a large database of images, 500 of them
in fact, that just happened to be presented in 5 different sizes, with
regular names like Disk1/Small/005.jpg for the 5th image of the
first disk (of 5) in its small size. Not very likely, but let's say
you did it in particular on the www.forgetmenots.comm website.
How hard would it be to create Am I a Forget-me-Not? using a generic
voting scheme? Pretty durn easy.

First, let's say the URL to an image looks like
http://www.forgetmenots.comm/images/DiskN/SIZE/NUM.jpg,
where N is 1 to 5, NUM is 001 to 100, and SIZE is one
of Thumb, Small, Medium, Large, or Exlarge. We decide
the voting script will present the Small image to vote on, provide
links for the others, then present the Thumb size to confirm the
vote. We can do this without disturbing the owners of the ``Forget Me
Nots'' site, since we're just generating the URLs. Of course, this all
breaks down if they move things around, but that's the nature of deep
linking.

So, let's take a look at this voting script, in [listing one below].

Lines 1 and 2 start nearly all the Perl programs I present here. In
Apache::Registry, the first line is actually ignored, but I put it
here in case I move this to a normal CGI directory (and it would still
work).

Line 4 pulls in the CGI module, importing all symbols.

Line 6 defines a top-level URL for all the URLs to which we deep link
over at the Forget Me Nots website.

Lines 8 to 20 capture the SQL I need to have MySQL create the table.
It's a simple record, with the disk/image number (together
representing a unique image on the remote site), the timestamp of
when then vote was made (both for statistics and to reject most
ballot-stuffing attempts), the IP address of the vote (again, for
statistics and to reject duplicate votes), and the vote itself
(something between 5 and 11).

Line 22 handles the CGI header, title, and visible heading.

Lines 24 to 26 declare three mostly-global variables, described as
needed below.

Lines 28 to 88 handle the incoming vote mechanics. I put these in a
separate block because I was going to create a bunch of variables and
didn't want them to collide with variables later, and they logically
go together.

Lines 29 to 34 validate the incoming vote. If we don't have a proper
disk number, image number, or rating, we don't have a vote, and we act
like it's a new call to the page (generating a random image
selection). Note the use of \A and \z as beginning-of-string
and end-of-string anchors on the regular expressions.

If we have a valid vote, we need to connect to the vote database.
Lines 36 to 38 do that, setting the RaiseError attribute which will
cause us to die if anything goes wrong. This is easier than
checking the result of every database access, but requires an eval
block around anything that might fail.

Lines 40 to 45 note whether this particular IP address has voted for
this image within the past 60 minutes. If so, we ignore this vote.
And this is the heart of our simple-minded ballot-stuffing avoider.

Only one vote for hour is permitted from a given IP address. Why IP
address? Well, it's a distinguishing characteristic, although flawed
for people voting behind firewalls or from dialups. Why only an hour?
Same reasons. I don't want to lock out two different people behind
the same firewall voting for the same picture, so the hour-long
timeout lets the next person come along and vote as well. Note that
no indication is given that a vote was not accepted, especially the
timeframe, so a ballotstuffer would have to work very hard to figure
out that a new vote could be triggered once an hour.

Once the vote is legal, we store it into the database in lines 47 to
50. Note that we let MySQL fill in the timestamp, to keep it
consistent with the query from the previous section. If the system
clock for the MySQL machine was off by five minutes, ``one hour ago'' is
still correct because it's relative to the times already gathered with
the incorrect clock.

Finally, regardless of whether or not we have a new vote, we calculate
the average of the old votes, using the code in lines 53 to 58. Note
in all three of these database accesses, we've used placeholders so
that I don't have to worry about quoting conventions in the data.

Lines 60 to 63 pretty-up the ``number of votes'' data, trying to avoid
warnings from undef as much as possible.

Lines 65 to 73 create the ``previous vote'' display to the left of the
main display. Initially, the $previous_vote value is empty, but if
we have a previous vote, we set it to a table. The table is there for
layout (sorry, purist friends). Notice that we build an inline image
link pointing at the remote website, in thumbnail size (it's
convenient that every image already comes scaled in five sizes). I
copied the layout and wording here directly from the ``Am I Hot Or Not''
site.

Lines 75 to 85 compute the ``next'' image to vote on. I originally had
just chosen a random image each time, but I was getting anomolies by
calling rand so many times without reseeding the data, and couldn't
quickly get enough bits from the random generator to do the job. So
after each image, I pick the next one. Image numbers go from 001 to
100, and disk numbers go from 1 to 5, wrapping around on both.

Line 87 is needed to prevent warnings if we're running as a CGI
script, but is ignored under mod_perl if we've pulled in
Apache::DBI (which attempts to cache database connections for
speed).

Lines 90 through 98 pick the new image to vote on, if needed. When we
compile this code, we'll spin the dial once (with srand). Then if
we aren't coming back from a previous vote, we'll get a random
starting point for a disk and image. The three calls to param
preset the ``sticky fields'' for form generation.

And then comes the messy ``tables within tables'' layout code for the
main output, starting in line 100. This started simple, and I
kept patching it. But it works. Line 103 dumps the previous vote
as a table cell on the left, if any.

Lines 104 to 121 dump the ``right'' cell, which is where all the output
happens, as a a two-by-two table. The upper left cell is the voting
area, again with wording taken from ``Am I Hot or Not?''. A radio group
for the rating, and values from 1 to 10 get their buttons. The
rows parameter in line 111 makes yet-another nested layout table to
ensure the buttons are horizontally placed. Two hidden fields in
lines 112 and 113 ensure that we'll know who the vote is for. The
upper right cell is the submit button.

The lower left cell starting in line 115 is the image itself, once
again an IMG link pointing directly at the Forget Me Nots website.
This time, we pick size Small, which is a bit larger than the thumbnail
used earlier, and again, just luckily about the same size as the
voting images on ``Am I Hot or Not?''.

Finally, the lower right cell is a list of links of all five sizes for
this particular image, again, all pointing directly toward the remote
website for fulfillment. Notice that at no time am I bringing an
image from their site and resending it along to the voter's browser:
this is all handled by the deep links directly to their site.

Lines 124 to 130 are an attempt to ensure that the ``Forget Me Nots''
site doesn't get objection oriented, nor the allusion to ``Am I Hot or
Not?'' gets lost. One word: ``parody''. And that's it.

Now, in order for this to be useful, people have to know about it, so
I inserted this program into my Apache::Registry-enabled area on my
website at http://www.stonehenge.com/perl/amiforgetmenot, and then
announced it from my home page and on a few select IRC channels.
Within a few days, I had accumulated enough votes to start getting
some interesting data.

Once data has accumulated, it's a matter of extracting what I want,
such as the top ten average votes for which at least five votes were
cast:

SELECT disk, image, avg(vote) AS average, count(*) AS votes
FROM votes
GROUP BY disk, image
HAVING votes > 5
ORDER BY average DESC
LIMIT 10

Or the bottom 10, which is the same without the DESC above. A
small matter of programming to turn the results into a nice table for
display, but at this point, I think you can see where to take it from
here. Until next time, vote early, vote often, and enjoy!