A website I'm working on includes a contest where users are allowed to vote for their favorite entry. The customer wants some form of vote-fraud prevention.

The idea of recording the IP addresses of all those placing votes has come up, but I'm not sure that's a good solution. Internal IP addresses are of course a bad idea, but what about external addresses? The problem I see is that a whole college computer lab, maybe even the whole campus, could be running off of a single external IP address. If only a limited number of votes are allowed from an IP address, that limit could be reached very quickly by completely legitimate votes.

Does anyone know of a better way to implement this? Surely this has been done before.

3 Answers

You can use a CAPTCHA to make automated voting infeasible, but it doesn't prevent a single dedicated human from voting repeatedly.

You can require users to be authenticated before voting.

You can require the user to supply an email address, then send a unique one-time link for submitting the vote. This is similar to registering a user only if they have a valid email address. From there you can deny multiple votes from the same email address.
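A minimal sketch of the one-time link idea, assuming in-memory stores in place of a real database. The names `issue_token`, `redeem_token`, `pending_tokens` and `voted_emails` are made up for illustration, and actually sending the email is left out.

```python
import secrets

# Hypothetical in-memory stores; a real site would keep these in its database.
pending_tokens = {}   # token -> normalized email address
voted_emails = set()  # addresses that have already cast a vote


def normalize_email(email):
    """Lowercase and trim so trivial variations map to the same address."""
    return email.strip().lower()


def issue_token(email):
    """Return a token for the emailed link, or None if this address already voted."""
    address = normalize_email(email)
    if address in voted_emails:
        return None
    token = secrets.token_urlsafe(32)  # unguessable random value
    pending_tokens[token] = address
    return token


def redeem_token(token, entry_id, tally):
    """Count the vote if the token is valid and unused; return True on success."""
    address = pending_tokens.pop(token, None)  # pop makes the token single use
    if address is None or address in voted_emails:
        return False
    voted_emails.add(address)
    tally[entry_id] = tally.get(entry_id, 0) + 1
    return True
```

Requesting several links for the same address still yields only one counted vote, because the address is checked again at redemption time. Someone who controls many addresses can of course still vote once per address.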

Save a cookie once a user votes to indicate that user already voted. This is easy to bypass but might stop a less technically inclined user.
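For illustration, here is roughly what the cookie check could look like using only Python's standard `http.cookies` module. The cookie name `contest_voted` and the 30-day lifetime are assumptions, and wiring this into a particular web framework is not shown.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "contest_voted"  # hypothetical cookie name


def has_voted(cookie_header):
    """Return True if the request's Cookie header carries the marker."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    return COOKIE_NAME in cookie


def voted_cookie_header(max_age_days=30):
    """Build the Set-Cookie header value to send once a vote is recorded."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = "1"
    cookie[COOKIE_NAME]["max-age"] = max_age_days * 24 * 60 * 60
    cookie[COOKIE_NAME]["path"] = "/"
    cookie[COOKIE_NAME]["httponly"] = True
    return cookie[COOKIE_NAME].OutputString()
```

Clearing cookies or using a private browsing window defeats this, so it only makes sense as one layer alongside a stronger check.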

Use OpenID, for example with DotNetOpenAuth, to authenticate against a third party such as Google or Yahoo before granting access to the vote area, then record the user's OpenID identifier for future requests.

Record the user's keystroke timing (keyboard dynamics), HTTP context data and vote completion time, then use a statistical clustering algorithm such as k-means to estimate the likelihood that several votes came from the same person.
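To make the clustering idea concrete, here is a small pure-Python k-means (Lloyd's algorithm) run on made-up data. The two features, mean milliseconds between keystrokes and seconds taken to complete the vote, are assumptions chosen for the example. Seeding the centroids with the first k points is a simplification, and real features would need scaling so that one axis does not dominate the distance.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical sessions: (mean ms between keystrokes, seconds to finish voting)
sessions = [
    (110.0, 4.0), (310.0, 21.0), (112.0, 4.2),
    (108.0, 3.9), (290.0, 18.5), (111.0, 4.1),
]
labels, centroids = kmeans(sessions, k=2)
```

A tight cluster of near-identical sessions is only a hint that one person may be voting repeatedly. It is not proof, so results like these are better used to flag votes for review than to reject them automatically.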

What you want is to open the vote only to users who have been registered for at least a week/month/etc -- but this probably won't help you, because if you're bringing up IP checks, it sounds like you aren't authenticating your users at all.
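If accounts do exist, the age check itself is small. This sketch assumes each account stores a timezone-aware registration timestamp, and the one-week threshold and the name `may_vote` are placeholders.

```python
from datetime import datetime, timedelta, timezone

MIN_ACCOUNT_AGE = timedelta(days=7)  # assumed threshold; could be a month, etc.


def may_vote(registered_at, now=None):
    """Allow a vote only if the account is at least MIN_ACCOUNT_AGE old."""
    now = now or datetime.now(timezone.utc)
    return now - registered_at >= MIN_ACCOUNT_AGE
```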

If you are trying to eliminate fraud, and you aren't authenticating your users, you probably get to throw up your hands and walk away. If you can be satisfied with a minimal reduction in casual fraud, there are a few things you can do.

Voter IP checks are a relatively low-cost mechanism to try to keep multiple-voting down if you have nothing else to go on, but as you've noticed they can incur a cost in terms of legitimate users prevented from voting, and they do nothing whatsoever to prevent vote-stacking by recruitment of non-users, each of whom will have a legitimately different IP address (e.g. "Hey Slashdot/Reddit/Pharyngula/IRC-Channel-X, everyone go vote for this option on this poll!"). Single-user padding can still be done with botnets, large proxy lists, or accounts on multiple machines (e.g. any university lab not behind a NAT).

What you're looking for is some bit of information that will be unique to a user. If you don't have that currently, you can create it by sending single-use tokens out to users in what is basically a vote-by-invite. This is still problematic, of course, if your users haven't already given you permission to e-mail them, or you don't know who your users are.

If you can lure your users into registering on the website, then you can authenticate votes that way, but it's dangerous to start that at the same time as a vote, because once again you face the possibility of a recruitment effort.

You can ask questions along with your poll that any real users should be able to answer easily, but non-users would not, and filter results that way, but you may discover that your users know less than you think, and nothing prevents fake voters from being provided with a cheat-sheet containing all the answers.

What you seem to be looking for is a way to restrict input to an unknown number of completely unknown people, while preventing them from providing input more than once or changing their apparent identity. That's not going to happen.

Short of limiting it per IP, there's not much else you can do if you really care about those votes being unique. You may want to look at storing the IPs that have voted in a non-relational database to increase query speed and decrease resource usage.
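As a rough sketch of per-IP limiting, the following uses a `Counter` as a stand-in for whatever key-value store you pick. The cap of 5 votes and the function name `try_record_vote` are assumptions. With a shared store you would want the check and the increment to happen atomically, which this single-process example does not address.

```python
import ipaddress
from collections import Counter

MAX_VOTES_PER_IP = 5  # assumed cap; tune for shared addresses such as campus NATs

# Stand-in for a key-value store keyed by the voter's external IP address.
votes_by_ip = Counter()


def try_record_vote(remote_addr):
    """Record a vote for this address unless it has reached the cap."""
    # Parsing normalizes equivalent spellings and raises ValueError on garbage.
    key = str(ipaddress.ip_address(remote_addr))
    if votes_by_ip[key] >= MAX_VOTES_PER_IP:
        return False
    votes_by_ip[key] += 1
    return True
```

A cap above one is a compromise: it lets a few people behind the same address vote, at the price of letting one person vote that many times.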