from the scratch-and-sniff dept

Jonah Lehrer has a fantastically entertaining article in Wired about cracking scratch card lottery tickets. It kicks off with a story about Mohan Srivastava, a statistician in Toronto who quickly realized that scratch tickets can't actually be random, even though they're designed to look that way. Since the lotteries want to control how often people win, the numbers have to be chosen by a careful algorithm -- and if that's the case, there are almost always ways to reverse engineer the algorithm. With the first game he tried it on, Srivastava was able to increase his odds to the point that he could pick a "winning" card 90% of the time. He ended up alerting the Ontario Lottery and Gaming Corporation -- though he admits he did so not for moral reasons, but because he quickly calculated that he probably wouldn't make that much money just by gaming the system:

His next thought was utterly predictable: "I remember thinking, I'm gonna be rich! I'm gonna plunder the lottery!" he says. However, these grandiose dreams soon gave way to more practical concerns. "Once I worked out how much money I could make if this was my full-time job, I got a lot less excited," Srivastava says. "I'd have to travel from store to store and spend 45 seconds cracking each card. I estimated that I could expect to make about $600 a day. That's not bad. But to be honest, I make more as a consultant, and I find consulting to be a lot more interesting than scratch lottery tickets."

Instead of secretly plundering the game, he decided to go to the Ontario Lottery and Gaming Corporation. Srivastava thought its top officials might want to know about his discovery. Who knows, maybe they'd even hire him to give them statistical advice. "People often assume that I must be some extremely moral person because I didn't take advantage of the lottery," he says. "I can assure you that that's not the case. I'd simply done the math and concluded that beating the game wasn't worth my time."

At first the Ontario Lottery ignored him. Apparently, lots of crackpots claim to have beaten the lottery, but haven't. So he sent a member of the Ontario Lottery's security team a package of unscratched cards, sorted into piles he thought were winners and losers. Apparently he was quite accurate, because they called him back soon after that and then pulled the game.
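The Wired article explains the tell he used on that first game, a tic-tac-toe style ticket: count how often each visible number appears across the ticket, and look for "singletons" -- numbers that appear exactly once. Three singletons lined up on a board flagged a winner roughly 90% of the time. Here's a toy Python sketch of that heuristic (the layout here is a simplification; the real tickets had multiple boards and a separate set of player numbers):

    from collections import Counter

    # All eight winning lines on a 3x3 board, as (row, col) coordinates.
    LINES = [
        [(0, 0), (0, 1), (0, 2)], [(1, 0), (1, 1), (1, 2)], [(2, 0), (2, 1), (2, 2)],
        [(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)], [(0, 2), (1, 2), (2, 2)],
        [(0, 0), (1, 1), (2, 2)], [(0, 2), (1, 1), (2, 0)],
    ]

    def looks_like_winner(boards):
        """boards: the 3x3 grids of visible numbers printed on one ticket."""
        # Count how often each number appears anywhere on the ticket.
        counts = Counter(n for board in boards for row in board for n in row)
        # A line of three "singletons" (numbers appearing exactly once
        # on the whole ticket) was the tell marking a likely winner.
        return any(
            all(counts[board[r][c]] == 1 for r, c in line)
            for board in boards
            for line in LINES
        )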

Of course, as often happens in these situations, the Lottery insisted that this was just a one-off error, and that the rest of its games were secure. Srivastava correctly noted that this was unlikely, since the chances that the ticket he'd randomly been given was the only one with a flaw seemed remote. And even if he didn't think it was worth his time to game the system, that's not necessarily true for others. In fact, Srivastava notes that there are ways to profitably game the system:

I then ask Srivastava how a criminal organization might plunder the lottery. He lays out a surprisingly practical plan for what he would do: "At first glance, the whole problem with plundering is one of scale," he says. "I probably couldn't sort enough tickets while standing at the counter of the mini-mart. So I'd probably want to invent some sort of scanning device that could quickly sort the tickets for me." Of course, Srivastava might look a little suspicious if he started bringing a scanner and his laptop into corner stores. But that may not be an insurmountable problem. "Lots of people buy lottery tickets in bulk to give away as prizes for contests," he says. He asked several Toronto retailers if they would object to him buying tickets and then exchanging the unused, unscratched tickets. "Everybody said that would be totally fine. Nobody was even a tiny bit suspicious," he says. "Why not? Because they all assumed the games are unbreakable. So what I would try to do is buy up lots of tickets, run them through my scanning machine, and then try to return the unscratched losers. Of course, you could also just find a retailer willing to cooperate or take a bribe. That might be easier." The scam would involve getting access to opened but unsold books of tickets. A potential plunderer would need to sort through these tickets and selectively pick the winners. The losers would be sold to unwitting customers--or returned to the lottery after the game was taken off the market.

Lehrer then goes on to point out that there is statistical evidence suggesting that, at the very least, some gaming of the system has gone on in various places.

Consider a series of reports by the Massachusetts state auditor. The reports describe a long list of troubling findings, such as the fact that one person cashed in 1,588 winning tickets between 2002 and 2004 for a grand total of $2.84 million. (The report does not provide the name of the lucky winner.) A 1999 audit found that another person cashed in 149 tickets worth $237,000, while the top 10 multiple-prize winners had won 842 times for a total of $1.8 million. Since only six out of every 100,000 tickets yield a prize between $1,000 and $5,000, the auditor dryly observed that these "fortunate" players would have needed to buy "hundreds of thousands to millions of tickets." (The report also noted that the auditor's team found that full and partial ticket books were being abandoned at lottery headquarters in plastic bags.)
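The auditor's back-of-the-envelope math is easy to reproduce: if only 6 in 100,000 tickets pay out in that range, honestly accumulating that many wins takes an implausible number of tickets. A rough sketch (treating every win as falling in that prize band, which is an assumption):

    # At 6 winning tickets per 100,000, how many tickets would an honest
    # player need to buy to expect this many $1,000-$5,000 wins?
    rate = 6 / 100_000
    for wins in (149, 1_588):
        print(f"{wins:>5} wins -> ~{wins / rate:,.0f} tickets expected")
    #   149 wins -> ~2,483,333 tickets
    #  1588 wins -> ~26,466,667 tickets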

Then there's the example of a woman in Texas, Joan Ginther, who has apparently won more than $1 million from the Texas lottery four separate times. There are also some indications that organized crime groups have regularly used such lottery tickets as a way to launder money -- and if they can crack the code, that makes the laundering process a lot more profitable.

Finally, the article notes that the lottery industry around the world seems more or less in denial about the whole thing, frequently insisting that its new games are perfectly secure while remaining largely unaware of previous cracks and problems. The whole thing is well worth reading.

from the random-ain't-so-random dept

You may have heard by now that the political website Daily Kos has come out and explained that it believes the polling firm it has used for a while, Research 2000, was faking its data. While it's nice to see a publication come right out and bluntly admit that it had relied on data it now believes was not legit, what's fascinating if you're a stats geek is how a team of stats geeks figured out there were problems with the data. As any good stats nerd knows, the concept of "randomness" isn't quite as random as some people think, which is why faked randomness almost always leaves tell-tale signs. For example, one very common test is to apply Benford's Law to the first digit of each number in a data set: in many naturally occurring data sets, the leading digit follows a logarithmic distribution (a 1 appears about 30% of the time), not the even spread people intuitively produce when making numbers up.
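To make that concrete, here's a quick sketch of a first-digit Benford check (a generic illustration, not the specific test run on the R2K data):

    import math
    from collections import Counter

    def benford_check(values):
        """Compare a data set's observed first-digit frequencies with
        the Benford's Law prediction of log10(1 + 1/d)."""
        # Leading nonzero digit of each value (zeros are skipped).
        digits = [int(str(abs(v)).lstrip("0.")[0]) for v in values if v]
        counts = Counter(digits)
        for d in range(1, 10):
            observed = counts[d] / len(digits)
            expected = math.log10(1 + 1 / d)
            print(f"digit {d}: observed {observed:.3f}, Benford predicts {expected:.3f}")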

In this case, the three guys who had problems with the data (Mark Grebner, Michael Weissman, and Jonathan Weissman) zeroed in on just a few clues that the data was faked or manipulated. The first thing they noticed was that when R2K did polls that tested how men and women viewed certain politicians or political parties (favorable/unfavorable), there was an odd pattern: if the percentage of men rating a particular politician favorable or unfavorable was an even number, so was the percentage for women. The two numbers always seemed to match in parity: if the male percentage was even, the female percentage was even; if the male percentage was odd, the female percentage was odd. Yet these are independent samples, not influenced by each other. That 34% of men find a particular politician favorable should have no bearing on whether the percentage of women who do is even or odd. In fact, this parity match happened in almost every such poll R2K did, at a rate about as close to impossible as you can imagine:

Common sense says that that result is highly unlikely, but it helps to do a more precise calculation. Since the odds of getting a match each time are essentially 50%, the odds of getting 776/778 matches are just like those of getting 776 heads on 778 tosses of a fair coin. Results that extreme happen less than one time in 10^228. That's one followed by 228 zeros. (The number of atoms within our cosmic horizon is something like 1 followed by 80 zeros.) For the Unf, the odds are less than one in 10^231. (Having some Undecideds makes Fav and Unf nearly independent, so these are two separate wildly unlikely events.)

There is no remotely realistic way that a simple tabulation and subsequent rounding of the results for M's and F's could possibly show that detailed similarity. Therefore the numbers on these two separate groups were not generated just by independently polling them.
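Their arithmetic is easy to verify: the chance of 776 or more parity matches in 778 fair 50/50 trials is an exact binomial tail, which Python's arbitrary-precision integers can evaluate directly:

    from math import comb

    # P(at least 776 matches out of 778 trials, each an even-odds coin flip)
    n, k = 778, 776
    tail = sum(comb(n, i) for i in range(k, n + 1))
    print(tail / 2**n)   # ~1.9e-229 -- less than one time in 10^228, as stated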

The other statistical analysis that I found fascinating concerns week-to-week changes in favorability ratings: the R2K numbers almost always moved at least a little, but in comparable polling data, no change at all is the most common result. As they point out, if you look at, say, Gallup data, you get this nice typical bell curve:

But if you look at the R2K data, you get things like the following:

Notice the substantial dip at the 0% mark. That's a strong hint of faked or manipulated data, from someone who thinks "random" data means the numbers have to keep changing. Back to Grebner, Weissman and Weissman:

How do we know that the real data couldn't possibly have many changes of +1% or -1% but few changes of 0%? Let's make an imaginative leap and say that, for some inexplicable reason, the actual changes in the population's opinion were always exactly +1% or -1%, equally likely. Since real polls would have substantial sampling error (about +/-2% in the week-to-week numbers even in the first 60 weeks, more later) the distribution of weekly changes in the poll results would be smeared out, with slightly more ending up rounding to 0% than to -1% or +1%. No real results could show a sharp hole at 0%, barring yet another wildly unlikely accident.
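Their argument is easy to check by simulation, using the quote's own assumptions (true weekly shifts of exactly +1% or -1%, with about +/-2% of sampling error on the measured week-to-week change):

    import random
    from collections import Counter

    random.seed(1)
    TRIALS = 100_000
    counts = Counter()
    for _ in range(TRIALS):
        true_shift = random.choice([-1, +1])        # opinion really moves +/-1%
        measured = true_shift + random.gauss(0, 2)  # ~+/-2% sampling noise
        counts[round(measured)] += 1                # the change the poll reports

    for delta in sorted(counts):
        print(f"{delta:+d}%: {counts[delta] / TRIALS:.3f}")
    # 0% comes out as the single most common rounded change -- the noise
    # smears the distribution out, so a sharp hole at 0% can't be real.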

Kos is apparently planning legal action, and so far R2K hasn't responded in much detail, other than to claim that its polls were conducted properly. I'm not all that interested in that part of the discussion, however. I just find it neat how the "faux randomness" may have exposed the problems with the data.

from the a-weaker-man-might-be-moved-to-reexamine-his-faith dept

One of my all-time favorite scenes in a play and movie is the scene in Tom Stoppard's Rosencrantz & Guildenstern Are Dead where every coin toss comes up heads, leading to a bit of a philosophical discussion on probability. The coin toss is, of course, the quintessential example of a random event, used regularly in all sorts of situations where randomness is required, or at least expected. Except... it turns out the common wisdom may be wrong. Paul Kedrosky has the news of a test showing that if you ask people to flip a coin and try to get more heads than tails, they will, and not by a small margin either. In the test, 13 people were each asked to flip a coin 300 times, trying to get as many heads as possible. All 13 participants got more heads than tails, and seven of the thirteen had statistically significant margins of heads over tails (meaning the results were almost certainly not a matter of chance). The best performer got heads on 68% of the flips. In other words, a coin toss isn't particularly random.
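To put a number on "statistically significant" (a sketch, since the test's actual analysis isn't detailed here): 68% heads is 204 heads in 300 flips, and the exact probability of a fair coin doing at least that well is vanishingly small:

    from math import comb

    def p_at_least(heads, flips=300):
        """Exact one-sided probability that a fair coin yields at least
        this many heads in the given number of flips."""
        tail = sum(comb(flips, k) for k in range(heads, flips + 1))
        return tail / 2**flips

    print(p_at_least(204))   # on the order of 1e-10 -- essentially never by chance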