uDG08 Voting Results and Methods

mattness Wrote:One game conspicuously absent from the list is Reclaimed by Andy Korth. It was an ambitious project, but I think what he accomplished was amazing.

I agree! There were several that I felt should have placed a little better in at least some of the categories.

Speaking of those who didn't place in the top three, does anyone know if any runners-up are going to be divulged? What are the plans? Are all the vote results to be released eventually? What about the raw voting data? I haven't heard anything myself, but maybe I haven't paid enough attention.

Just my opinion, and maybe not well thought out, but I think it might be cool to maybe show entries out to about 8th or 10th place. It just seems to me that if I put in all the work some of these games took, I'd like to see if I at least got close!

I added another dimension to a little program I wrote to help analyze votes, and I have some observations you guys might be interested in. What I did was add up the actual place value that each game scored in each category. Example: Laserface Jones scored 1,1,1,1,3,1,15 <-- yeah, 15th in originality.

Adding all those places up yields a total score of 23. So the lower the score, the higher the "grand ranking" in my little experiment. Surprisingly to me, going through the entire field of entries shows that the top eight games happen to be the only contestants who placed third or higher amongst all the games, in all the categories (IOW, everyone already happens to know who they are because they won). I already gave the total points for Laserface in all categories, but since other scores haven't been released yet (that I know of), I won't show the other total points to avoid implications.
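For the curious, here's roughly what my little tally does, as a Python sketch. Laserface's placements are the real ones I quoted above; the other game names and their placements are made up, since I'm not disclosing the real totals:

```python
# "Grand ranking" sketch: sum each game's per-category placements;
# a lower total means a better overall rank.
# Only Laserface's placements are real; the rest are hypothetical.
placements = {
    "Laserface Jones": [1, 1, 1, 1, 3, 1, 15],  # totals 23, as in the post
    "Hypothetical A":  [2, 3, 2, 4, 1, 2, 10],
    "Hypothetical B":  [5, 4, 6, 3, 7, 5, 2],
}

# Sort games by total place score, ascending (lower = better).
grand_ranking = sorted(placements.items(), key=lambda kv: sum(kv[1]))

for game, places in grand_ranking:
    print(f"{game}: total place score = {sum(places)}")
```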

I will say that the next two games came in just 1 and 2 points below BostonMouse, which is very close. Below those undisclosed two is an eleven-point gap to the other half of the entrants.

Quote:Just my opinion, and maybe not well thought out, but I think it might be cool to maybe show entries out to about 8th or 10th place. It just seems to me that if I put in all the work some of these games took, I'd like to see if I at least got close!

I'd like to see this; I think my entry may have been in the top 10. My artist and I worked really hard on our entry (as I know everyone else did).

I think all the winners deserved their victory, but I also think some people got nerfed as far as scoring goes. (this includes some of the winners)

I'm guessing this is because some developers are harsher critics and gave lower scores in general. But if the harsher critics aren't judging their own entry, there's an error in fairness. Now if someone gets a 7 from 18 devs and a 0 from one dev, the votes average to 6.6. (Is this fair? Shouldn't we be taking the mean and not the average?)
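A quick Python sketch of that example, just the hypothetical numbers above:

```python
# 18 devs score a game 7, one dev scores it 0: a single outlier
# drags the mean well below the typical vote, while the median ignores it.
from statistics import mean, median

votes = [7] * 18 + [0]
print(round(mean(votes), 2))  # 6.63 -- roughly the 6.6 quoted above
print(median(votes))          # 7
```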
Anyway, enough of that, just some advice for next year's comp scoring.

Congratulations to everyone who entered the contest, especially to those who finished their games.

tcIgnatius Wrote:... I'm guessing this is because some developers are harsher critics and gave lower scores in general. But if the harsher critics aren't judging their own entry, there's an error in fairness. ....

I investigated that aspect and spent a lot of time shuffling the numbers around to see if I could find an angle on the voting that maybe could shift things around a bit, and I came up with... nothing (well, very little). Even dropping a few abnormally low scores out of peer voting on an (unfairly) selective basis, trying to rig it, just didn't produce enough difference to significantly affect the outcome of the contest (e.g. pulling out a few dismal peer votes on the lowest scorers didn't place them in the top five in any category).

Sure, a person can find a few hairs to split here and there (and there were indeed some close ones), but at the end of the day they're all votes, and they all count (even if someone intentionally tries to sink someone else's game), and even if you fiddle with them they don't change the picture enough to argue that there's an issue. At least that's the conclusion I came to. I'm not a statistician, but after trying things out, I can tell you that a crazy peer vote or two here or there really doesn't change things in this contest.
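The kind of check I ran looks roughly like this in Python. All the vote data here is invented for illustration, not the real peer votes:

```python
# Drop each game's single lowest peer vote and see whether the
# ranking changes. Vote lists below are hypothetical.
from statistics import mean

peer_votes = {
    "Game A": [8, 8, 7, 9, 2],   # one abnormally low vote
    "Game B": [6, 7, 6, 6, 7],
    "Game C": [4, 5, 4, 3, 5],
}

def rank(scores):
    """Return game names ordered best-first by score."""
    return [g for g, _ in sorted(scores.items(), key=lambda kv: -kv[1])]

raw     = {g: mean(v) for g, v in peer_votes.items()}
trimmed = {g: mean(sorted(v)[1:]) for g, v in peer_votes.items()}

# In this made-up example the order is the same either way,
# which mirrors what I found with the real votes.
print(rank(raw), rank(trimmed))
```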

What I did find was that the overall number of votes in the contest was pretty low IMHO. There were 872 total votes in the contest -- 516 public, 356 peer. The top three games look like this in terms of voting numbers:

(uDeadGame was one of the largest downloads in the contest BTW, so file size clearly did not significantly affect the number of votes relative to the rest of the field; we can safely throw the file size argument out the door for future contests)

Peer voting was pretty steady in terms of numbers of votes, as one would expect. Then it's semi-linear down to the bottom end for public votes:

Code:

Total Public Peer
24 6 18

One can make what they want out of that in terms of the contest in general, but I can say that the total number of votes for each game does appear to correlate loosely with the total score of that game (by the way I calculated it a couple posts up).

I picked up all the results off the uDG website and stuck them in a spreadsheet, which is here: http://www.pyramid-productions.net/downl...esults.xls. I've also put in the ranking in each category, which makes it easier to compare. I also added in the number of prizes each entry was eligible for.

I noticed a few things:

* The results across the categories are highly correlated. In one sense this makes sense, as 'good' teams tend to produce high quality across the board. However, it's also possible (and indeed likely) that the different aspects of a game influence each other, i.e. if a game has a good atmosphere, the gameplay, graphics and sound would all be rated highly, as they all contribute to an overall effect which may be greater than the sum of its parts. Consider if Laserface had no sound, or simple block graphics (for example): then I would expect the gameplay score to also drop, even though the gameplay hasn't changed at all. Not sure if this is really something that needs to be fixed, but it was just an observation.

* The result is that the vast majority of the prize pool goes to a small number of entries, and while the idea of "ask for the prize you want and someone might give it to you" is admirable, it will always have a level of potential unfairness - what if two devs ask for the same prize? The prize-giver then has to judge which of the two is 'more worthy' of getting it.

* I thought the rules were to do prize distribution down to the 5th place entry. Please note I'm not just saying this because I placed 5th in a category! But by my calculations, currently we are distributing the $20,000 between 9 of the 20 teams. Extending this down to 5th place would mean prizes for 13 of the devs (but still reward the top entries most heavily). Just a thought.

I'm not really advocating changing anything now, but these are some things to think about for the next contest.

I like the idea of a "Technical Achievement" category in the next contest. It's a good category for those ambitious games that don't have time to polish in other areas, but are nonetheless impressive. I would have scored Reclaimed tops in such a category. I also like the idea of prizes for the top five. Looking at Iain's list, top 5 would have allowed every entry that I thought deserved something to be an official winner.

IBethune Wrote:... Extending this down to 5th place would mean prizes for 13 of the devs (but still reward the top entries most heavily). Just a thought.
...
Anyone agree/disagree?

Distributing amongst the top three might have made most sense because one would not want to thin the prize pool out too much, but I too noticed that the ones at the top tended to be near the top in all categories, and so they would still win most prizes anyway. In reality, I suppose it's hard to say ahead of time how the prizes should be awarded until all the sponsors are lined up. If you say in the next contest that prizes will be awarded out to fifth place and there are only 10 prizes that were able to be gathered from sponsors, what do you do then?

IBethune Wrote:But by my calculations, currently we are distributing the $20,000 between 9 of the 20 teams.

Ooh, yes you're right. I originally thought it was only eight, but I missed SurroundedbyDeath in my experimental calcs. In my experimental grand ranking, Reclaimed would have been 8th and FIDRIS would have been 9th.

BTW, now that all the results are out, I'd like to mention that Reclaimed was a mere 0.033 of a point away from taking 3rd place away from Laserface in the Story category. That was one of the close ones!

mattness Wrote:I like the idea of a "Technical Achievement" category in the next contest. It's a good category for those ambitious games that don't have time to polish in other areas, but are nonetheless impressive. I would have scored Reclaimed tops in such a category.

I'm totally up with the technical achievement award too. We talked about this previously. Off the top of my head, I'd probably have voted SpaceProto first place there, Reclaimed second, and probably uDeadGame third, but that's just me.

mattness Wrote:I like the idea of a "Technical Achievement" category in the next contest. It's a good category for those ambitious games that don't have time to polish in other areas, but are nonetheless impressive. I would have scored Reclaimed tops in such a category. I also like the idea of prizes for the top five. Looking at Iain's list, top 5 would have allowed every entry that I thought deserved something to be an official winner.

-Matt

While I think that both Reclaimed and StealthMan deserved a prize, the problem with this is that in an effort to make the competition more fair, it becomes less fair. The line must be drawn somewhere, and wherever it's drawn, the entries just below it will have a case for drawing it lower still. Of course, if I had gotten 4th or 5th in something rather than 10th and 11th, my tune might change.
That said, maybe the line should be drawn lower for the next competition. That's something to be decided by the guys in charge.

Quote:Sure, a person can find a few hairs to split here and there (and there were indeed some close ones), but at the end of the day they're all votes, and they all count (even if someone intentionally tries to sink someone else's game), and even if you fiddle with them they don't change the picture enough to argue that there's an issue. At least that's the conclusion I came to. I'm not a statistician, but after trying things out, I can tell you that a crazy peer vote or two here or there really doesn't change things in this contest.

I was more inclined to forget this, but I just watched Flash of Genius and I'm in a justice-bringing mood.

I disagree that a couple of damaging votes don't play an important role. They could mean whole-integer differences in some cases, and some entrants were decimal points away from winning something. The same goes for nerfed votes that are positive for the entrant. I just think a mean average would suit this competition better because it would account for nerfed votes.

Seems overly political to me, to have other developers vote for their competitors' games to begin with. I'm assuming all the developers are pretty honorable, and I'm not saying this actually went down. Just some things to consider for next year...

Quote:Seems overly political to me, to have other developers vote for their competitors' games to begin with. I'm assuming all the developers are pretty honorable and I'm not saying this actually went down. Just some things to consider for next year...

I don't get what you're saying here. It looks to me like they both deserve a 6.47.

I think that it is a good idea for the developers to rate the games. I, as a developer, am inclined to actually put more effort into getting into a game than I would be as a casual gamer, because my own game has bugs that I would want other gamers to put up with.

Quote:I don't get what you're saying here. It looks to me like they both deserve a 6.47.

I really meant to say the median, but I kept saying mean.

p1 has ten 7's, so it would only seem fair that this game get a 7.
p2 has six 6's, so it would only seem fair that this game get a 6.

Their mean average is the same though.
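Since the full vote lists for p1 and p2 aren't shown here, here's a hypothetical pair in Python that behaves the same way: identical means, but different medians:

```python
# Two hypothetical vote lists with the same mean but different medians,
# showing how a median-based score resists a few low outliers.
from statistics import mean, median

p1 = [7] * 10 + [2] * 5            # mostly 7's, nerfed by a few low votes
p2 = [6] * 8 + [5] * 4 + [4] * 3   # mostly 6's

print(round(mean(p1), 2), median(p1))  # 5.33 7
print(round(mean(p2), 2), median(p2))  # 5.33 6
```

Under the mean, both games score the same; under the median, p1 gets the 7 the majority of its voters gave it.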

Quote:I think that it is a good idea for the developers to rate the games. I, as a developer, am inclined to actually put more effort into getting into a game than I would be as a casual gamer, because my own game has bugs that I would want other gamers to put up with.

I agree, but I think that in this context, it's an overall bad idea to score games based on the mean.

What about X-Factor/Pop Idol style judging? A group of judges picks a top 10, and over the final 5 weeks of the competition, 2 games are eliminated each week, with the judges picking a final winner. Surviving games can be tweaked and updated between each weekly judging session to try to impress the judges more the next week.

Doing the whole thing as judged by impartial and respected judges is probably the only way to avoid any issues of a public and dev voting system ending up with bias.

Quote:Yeah, I got that.
What I'm wondering is why we should do it that way rather than the way we have it now.

Because that would mean a majority of the voters felt it deserved a 7 or 6 respectively. We should base it on the majority, and not on the fact that p1 got nerfed with a 3 by a voter who felt their game didn't fulfill category "X" the way they wanted it to. Would you disagree that p2 deserved a six? The majority of that entrant's votes were 5's and 6's.