Mitigating Cheating & Voter Fraud in Online Contests…

We run online contests of various sorts that involve users voting on entries (usually one vote per user per day). The prizes range from hundreds to thousands of dollars. Over the last four years we have encountered a number of ways people try to cheat, and have implemented countermeasures in each case. As it stands, we use the following measures:

Authentication
A user must create an account and authenticate (log in). This rules out anonymous vote stuffing.

Email Confirmation
A user must confirm their email address by clicking a link in a system email to confirm they own and have access to their address. This rules out creating accounts en masse using random (not necessarily valid) email addresses. It also slows down the process a little for one account, and a lot if you're trying to create many.

No Gmail Address Aliases
Users cannot use instant alias addresses such as localpart+suffix@gmail.com. That slows down potential cheaters.
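A minimal sketch of how such an alias check might look, assuming addresses arrive as plain strings. The function names are hypothetical, and the choice to strip plus-suffixes for every domain (not just Gmail) is an assumption; ignoring dots applies only to Gmail-style domains, where the local part is dot-insensitive.

```python
def has_plus_alias(address: str) -> bool:
    """Return True if the local part contains a '+suffix' alias."""
    local, _, _domain = address.rpartition("@")
    return bool(local) and "+" in local


def canonical_address(address: str) -> str:
    """Collapse an address to a canonical form for duplicate checks.

    Assumption: plus-suffixes are dropped for every domain, and dots in the
    local part are ignored only for gmail.com / googlemail.com.
    """
    local, _, domain = address.strip().lower().rpartition("@")
    local = local.split("+", 1)[0]
    if domain in {"gmail.com", "googlemail.com"}:
        local = local.replace(".", "")
    return f"{local}@{domain}"
```

Rejecting at signup when has_plus_alias is true matches the rule above; storing canonical_address alongside the raw address would additionally let two aliases of one mailbox be matched later.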

Additional Measures
We routinely audit our signups and voting rosters for strings of email addresses that come from the same private domain (user1@smithfamily.com, user2@smithfamily.com, etc.). We also look for similar names, usernames, and "local-parts" of email addresses across domains.
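Part of this audit could be automated roughly as follows. This is only a sketch: the function names, the list of ignored public mail domains, and the thresholds are all assumptions, and the pairwise comparison is quadratic, so it would suit a periodic batch audit rather than a per-request check.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Assumed list of large public providers where a shared domain means nothing.
PUBLIC_DOMAINS = frozenset({"gmail.com", "yahoo.com", "hotmail.com"})


def group_by_private_domain(addresses, min_size=3):
    """Return {domain: [addresses]} for private domains with many signups."""
    groups = defaultdict(list)
    for addr in addresses:
        _local, _, domain = addr.lower().rpartition("@")
        if domain not in PUBLIC_DOMAINS:
            groups[domain].append(addr)
    return {d: a for d, a in groups.items() if len(a) >= min_size}


def similar_local_parts(addresses, threshold=0.8):
    """Return pairs of addresses whose local parts look alike across domains."""
    pairs = []
    for a, b in combinations(addresses, 2):
        local_a = a.lower().rpartition("@")[0]
        local_b = b.lower().rpartition("@")[0]
        if SequenceMatcher(None, local_a, local_b).ratio() >= threshold:
            pairs.append((a, b))
    return pairs
```

Flagged groups and pairs would still need a human look, since families and coworkers legitimately share domains.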

We also show voting result updates on a daily basis, so there's no instant feedback. That way, if someone is trying to cheat, it will take a day to see any results, and unless they went big, they won't know for sure if their method was successful. We try to be as much of a "black box" as possible.

Needless to say, this is all exhausting and getting harder and harder to scale up. We need an easier solution to ensure that we get a lot closer to "one person, one vote" in our contests, while not burdening the user beyond need in the process.

Some suggestions have included using credit cards, mail-in verification, reputation points … but these are all much too onerous for our target users.

What more can we do automatically in the back end to 1) identify cheaters, and more importantly 2) prevent them from even cheating?

UPDATE

We decided not to make our fraud protection leak-proof because one of the developers pointed out "the harder you make it for people to cheat, the harder it is to detect cheating." Instead, we are utilizing a medieval Chinese hunting technique called a three-sided Battue. By making it very difficult to cheat in most ways, but relatively easy to slip through in other ways, we know exactly what to look for and eliminate before the voting results are updated.

We look for patterns in votes, such as one contestant receiving a string of evenly timed votes, then look at the accounts associated with those votes. If there's a pattern to the accounts, we eliminate those accounts, and the associated votes disappear with them.
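One way to flag "evenly timed" votes is to look at how little the gaps between consecutive votes vary. The sketch below is an illustration under assumptions, not our production check: timestamps are taken to be seconds, and the minimum vote count and the coefficient-of-variation cutoff are made-up defaults.

```python
from statistics import mean, pstdev


def looks_evenly_timed(timestamps, min_votes=5, max_cv=0.1):
    """Return True if the votes for one entry arrive at near-constant intervals.

    timestamps: vote times in seconds (any order).
    max_cv: largest allowed ratio of gap standard deviation to mean gap.
    """
    if len(timestamps) < min_votes:
        return False
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap == 0:
        return True  # every vote landed at the same instant
    return pstdev(gaps) / average_gap <= max_cv
```

A string of votes that trips this check is only a lead; the accounts behind those votes still have to be examined for a common pattern before anything is removed.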

An interesting method I've seen is to use multiple types of storage to track users. "Evercookies" is a name for them, I think: samy.pl/evercookie They just have to use the same browser.
– WalterJ89, Jan 17 '12 at 6:58

While that sounds tempting, the privacy ramifications are a little too dim.
– tajmo, Jan 17 '12 at 15:47

7 Answers

I think there is a simple and low-tech solution. You introduce three requirements into the terms of the contest:

Each user must specify their name and address when they create the account.

You restrict the contest to one entry per household. You use the address to eliminate multiple entries with the same address.

You put in the terms of the contest that, when a winner is selected, the check will only be issued to a person of the name and address found in the winning account. If there is no person of that name at that address, then the prize is re-allocated to someone else. When someone wins, you do something to verify the address of the winner: e.g., send registered or certified mail to that address and require the winner to prove they received it; or require them to show a copy of their driver's license showing that address.

What this does is prevent someone from cheating. They can create as many accounts as they want, but if they use their own address, they'll only be able to submit one entry per contest; and if they use some other address, they won't be able to collect the prize if their account is chosen (since it's not their address, and the address of the winner will be verified before they get their prize).

It is not perfect -- depending upon your procedure for verifying the address of the winner, a cheater may still be able to create several accounts, each one listing the address of a different friend -- but it might be good enough in practice.

Many legitimate voters in our contests live in the same household: a girl enters, and her mother, father, sister, and brother all vote for her. That's behavior we want. Her boyfriend the hacker, who creates accounts, bots, or whatever to juke her numbers: that's what we don't want. The actual winner will not be the person who votes the most, but the one who garners the most votes.
– tajmo, Feb 23 '12 at 18:37

@tajmo, You could modify my proposal to accommodate your circumstances: confirm the identity of the winner as part of the prize-awarding process (e.g., check their ID; make out a check to that name, so it can't be cashed by anyone else).
– D.W., Feb 24 '12 at 5:25

@tajmo, it sounds like you haven't absorbed my proposal. I apologize if I didn't communicate it clearly. The beauty of my proposal is you don't need to be concerned with confirming voters, because that is not actually your real goal. Your real goal is to prevent cheating and multiple voting. My answer demonstrates that you can prevent cheating and multiple voting without confirming voters, if instead you confirm winners. Confirming winners is a lot easier than confirming voters, because there are many more voters than winners.
– D.W., Feb 24 '12 at 18:07

I can attest that confirming winners doesn't prevent cheating; in our contests, applicants show up in person before the contest begins. They are confirmed physically, with ID, and by affidavit. And yet there is voter fraud. One thing has nothing to do with the other.
– tajmo, Feb 24 '12 at 18:14

How does this differ from OpenID? Or is this OpenID plus some tie into government identification?
– Gilles, Jan 16 '12 at 19:13

My understanding is that it's based on SAML, WS-Fed / WS-Trust where the personal data (or hash) is sent to a 3rd party
– makerofthings7, Jan 16 '12 at 21:40

@Gilles It is quite different from OpenID in that it has much better security, and can be configured so that the user is anonymous and can't be matched with accounts created by the same U Prove user on other sites. Whereas with OpenID the user reveals a globally unique id. Look up the "laws of identity" - identityblog.com/?p=354 OpenID only allows for an "omni-directional" identifiers, but U Prove also supports "unidirectional" identifiers which are only revealed to a specific relying party.
– nealmcb, Jan 17 '12 at 2:42

@makerofthings7 It is more than a concept in planning. But the problem is there aren't enough folks demanding this - either web users looking for more anonymity, or web sites looking for a way to identify physical individuals via e.g. a government claim, and/or willing to give up the ability to track individuals across web sites (which brings in more advertising bucks). So this contest use case is the sort of thing we need to get to a better identity system
– nealmcb, Jan 17 '12 at 3:00

I agree with @makerofthings7 that U-Prove, together with appropriately verifiable claims from a single provider like the government, is the closest thing we have to what you eventually want. Users will (hopefully) want support for "unidirectional identities" rather than a globally unique ID like an OpenID, which allows relying sites to collaborate and track you across the internet. See more at the Laws of Identity.

In the meantime, requiring that users register via something like (pick one) Facebook Connect or Google Plus would allow you to leverage the (controversial) efforts of those big players to get folks to use real names, or at least "one-to-a-customer".

But of course as others have said, no matter what you do, given enough incentive folks will find a way to register multiple times. So it really comes down to the other factors in your threat model - e.g. what is the risk of excluding people who don't want to go thru the hassle of registering, what is your ultimate bottom line based on, etc.

Requiring either FB or G+ will make you "co-responsible" for anything evil these services do. And in the end, requiring a "real identity" this way just encourages people to make up cheap, fake, online "real life" identities.
– curiousguy, Jun 26 '12 at 1:03

Say you have a contest, e.g. "What is your favourite day of the week?" You create a copy of your user table, let's name it cont_101, and a table with the contest answers, let's name it cont_ans_101.

When a user votes, a flag in cont_101 is set to true to record that the user has voted, and the counter in cont_ans_101 is incremented by one.

When a user tries to vote, you simply check the flag in the copied user table (cont_101): if it is false, you count the vote; otherwise you could ban the user.

That would require you to uniquely identify a user. The problem is that it's easy to create a large number of accounts, and vote once with each of them.
– CodesInChaos, Jan 14 '12 at 23:35

This is also somewhat terrible from a database design point of view - It's much better to store a cross-ref table between user and answer. Data-set size is smaller, and it's much harder to 'forget' somebody (for example, a user who signed up after the user table was copied).
– Clockwork-Muse, Jan 18 '12 at 18:53

Yes, you are right; from a database design point of view it is not elegant.
– jkarr, Jan 19 '12 at 6:13

Why do you need to copy the user table? As far as I can see, you just need a table that records which user participated in which contest: user_contest (id PK, user_id FK, contest_id FK). And of course the contest_vote table (id PK, contest_id FK, answer ??).
– Hendrik Brummermann♦, Jan 24 '12 at 7:19
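The two-table layout suggested in the comment above could be sketched like this with SQLite. Several details are assumptions made only so the example runs: the answer column is stored as TEXT (the comment leaves its type open), foreign-key constraints are omitted because no users or contests table is defined here, and the helper names are hypothetical. The UNIQUE constraint is what enforces one vote per user per contest; a per-day rule would need an extra date column in that constraint.

```python
import sqlite3

SCHEMA = """
CREATE TABLE user_contest (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    contest_id INTEGER NOT NULL,
    UNIQUE (user_id, contest_id)   -- one participation row per user per contest
);
CREATE TABLE contest_vote (
    id INTEGER PRIMARY KEY,
    contest_id INTEGER NOT NULL,
    answer TEXT NOT NULL           -- assumed type; not specified in the thread
);
"""


def open_db():
    """Create an in-memory database with the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def cast_vote(conn, user_id, contest_id, answer):
    """Record a vote; return False if this user already voted in the contest."""
    try:
        with conn:  # commits on success, rolls back both inserts on error
            conn.execute(
                "INSERT INTO user_contest (user_id, contest_id) VALUES (?, ?)",
                (user_id, contest_id),
            )
            conn.execute(
                "INSERT INTO contest_vote (contest_id, answer) VALUES (?, ?)",
                (contest_id, answer),
            )
    except sqlite3.IntegrityError:
        return False
    return True
```

Keeping the vote row separate from the participation row means the tally does not record who chose which answer, while the database itself still rejects a second vote.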

You're attempting to uniquely identify something through a system that was somewhat designed to not require a unique identity. Short of tying and validating to a 3rd party (usually physical) identifier, this is impossible. Instead, your best bet is to restrict voting altogether, in a manner that discourages automation and/or quick results.
If your system supports it, see about weighting the votes based on some criteria:

'Age' of the account (since signup). Disallow votes for anything under some age (say, since the start of the contest).
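As a rough illustration of age-based weighting: the sketch below gives zero weight to accounts created on or after the contest start, as suggested above, and otherwise ramps the weight up linearly with account age. The linear ramp, the 30-day figure, and the function name are assumptions for the example, not part of the suggestion itself.

```python
from datetime import datetime, timedelta


def vote_weight(account_created, contest_start, voted_at,
                full_weight_age=timedelta(days=30)):
    """Return a weight in [0.0, 1.0] for a vote based on account age.

    Accounts created on or after the contest start get no weight at all;
    older accounts reach full weight once they are full_weight_age old.
    """
    if account_created >= contest_start:
        return 0.0
    age = voted_at - account_created
    return min(1.0, age / full_weight_age)
```

For example, with a contest starting on January 1, an account created two weeks earlier would count for about half a vote on opening day and a full vote a couple of weeks later.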

Really good answer with a really creative approach. With restrictions like these, users will be discouraged from cheating. I would even dare to add "paying" to the list (in cases where it makes sense).
– Alpha, Jan 18 '12 at 21:46

If I am correct that you issue a physical paper check to the winner, one measure to put in place is the address. If you structure it the way radio stations do and allow only one entry per household/address, this could prevent some cheating. As for preventing cheating in the first place, you probably want to look at the online gaming industry, though even tools such as PunkBuster and VAC have failed to prevent cheating in games completely. A good practice is to run weekly audits and evaluate the time a user spends on a page before voting; this is available in GA (Google Analytics), as far as I recall. Set a threshold for how long a user must be on a page; this could also help to detect scripts that are clicking links.

You're going to want to check IPs (though obviously this isn't a large hurdle, you should still do it)

You'll want a captcha to prevent (at least weaker) automated cheating

You're going to want to (in the end) either manually or automatically check for large groupings of signups during small time frames (automation/stuffing)

Disallowing "free" email services would obviously take a large chunk out of cheating, but it would probably take an even bigger chunk out of your user base, depending on your audience

If you made them enter more personal information, it would at least help with verification (name + address + phone adds some verification if the number is listed anywhere, even if you don't call or SMS it)

If at all possible you're going to want to have similarity detection across the board, not just for emails (to catch lazy cheaters who maybe repeat a similar email later, a name in an email, or vice versa)

But as far as preventing cheating on what is ostensibly a web form you're going to need some sort of unique identifier which is of course going to be a much larger overhead for your users, more security needed for your site, and will cause a number of users to become leery

SMS is more verifiable than email, though, since it really isn't that hard to sign up for 20,000+ Hotmail/Gmail/Yahoo etc. email accounts, and that's ignoring throwaway email accounts and forwarders.

There's also the option of automated calls (which rules out some, but not all of the SMS security issues)
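The automatic check for large groupings of signups in small time frames, mentioned in the list above, could be sketched with a sliding window. The function name, the 10-minute window, and the threshold of 20 are assumptions; timestamps are taken to be seconds.

```python
from collections import deque


def find_signup_bursts(signup_times, window_seconds=600, threshold=20):
    """Return the signup times at which the trailing window is overfull.

    A time t is flagged when at least `threshold` signups fall within the
    window_seconds ending at t (including the signup at t itself).
    """
    recent = deque()
    flagged = []
    for t in sorted(signup_times):
        recent.append(t)
        # Drop signups that have fallen out of the trailing window.
        while recent[0] <= t - window_seconds:
            recent.popleft()
        if len(recent) >= threshold:
            flagged.append(t)
    return flagged
```

The accounts created at the flagged times are the ones worth auditing by hand; a legitimate spike (say, after a contest is promoted) would trip the same check.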