Well, this is just my interpretation of MECHFrost's approach. I'm not quite sure I've understood it correctly, but I think he's saying it works like this:

1. Add the new vote to the full list.

2. Calculate the average vote of the full list.

3. Initialise the new smooth list as an identical copy of the full list.

4. Discard all extreme votes from the smooth list (but ONLY from this list).

5. The final rating is the average vote of the smooth list.

So the full list never discards any votes - some votes may not be included in the current final rating, but they're still stored, and may be included later if the rating rises or falls far enough (as more votes could be added at any time).
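As I understand it, the steps above can be sketched like this. The thread never pins down what counts as an "extreme" vote, so the fixed distance threshold here is purely my own assumption:

```python
def current_rating(full_list, new_vote, threshold=25):
    """One pass of the scheme: votes are 0-100 percentages.

    `threshold` (how far a vote may sit from the average before it is
    treated as "extreme") is an assumption, not part of the proposal.
    """
    # Step 1: add the new vote to the full list (the full list keeps
    # every vote forever).
    full_list.append(new_vote)
    # Step 2: average of the full list.
    avg = sum(full_list) / len(full_list)
    # Steps 3-4: copy the full list, then discard "extreme" votes from
    # the copy ONLY.  Fall back to the full list if everything looks
    # extreme, to avoid an empty smooth list.
    smooth_list = [v for v in full_list if abs(v - avg) <= threshold] or full_list
    # Step 5: the displayed rating is the average of the smooth list.
    return sum(smooth_list) / len(smooth_list)
```

So a lone bad vote among good ones simply gets ignored: `current_rating([80, 85, 90], 10)` discards the 10 as extreme and returns 85.0, while the 10 stays stored in the full list for later recalculations.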

Thus in the example you gave, where a load of bad votes are given early on, the rating would initially be very bad - and good votes would be ignored as extreme values. But as the number of good votes increased (assuming there were no more bad votes coming in) the average (in step 2) would gradually increase, and eventually it would reach the point where the bad votes were considered "extreme" and all of the good votes were counted.

This could cause a bit of a yo-yo effect though...perhaps I've misunderstood the suggestion.

Yes, this is correct - but since the average calculated in step 2 is not the true average, steps 2 to 5 need to be repeated several times, each time with a new average that moves closer to the true value. Otherwise, high votes that are correct might get discarded for being too far from the first (inaccurate) average. From my tests, the steps would only need to be repeated once unless the level of "corruption" is high.
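A minimal sketch of that repetition, again assuming a fixed distance threshold of my own choosing: each round re-filters the full list against the latest smooth average, so votes discarded in an earlier round can come back if the average moves towards them.

```python
def iterated_rating(full_list, new_vote, threshold=25, max_rounds=10):
    full_list.append(new_vote)                    # step 1
    avg = sum(full_list) / len(full_list)         # step 2: first rough estimate
    for _ in range(max_rounds):                   # repeat steps 2-5
        # Steps 3-4: re-filter the FULL list against the latest average.
        smooth = [v for v in full_list if abs(v - avg) <= threshold] or full_list
        new_avg = sum(smooth) / len(smooth)       # step 5
        if new_avg == avg:                        # stable: the value stopped moving
            break
        avg = new_avg
    return avg
```

For example, `iterated_rating([70]*4 + [95]*4 + [10], 10)` starts with a raw average of 68, which wrongly discards the 95s; the estimate then climbs 68 → 70 → 82.5 across rounds as the correct high votes get re-admitted, and 82.5 is returned.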

Today, I calculated how many votes it must accept before displaying accurate results. My calculations are based on statistics so they are not themselves perfectly accurate, but they give a good idea.

For each level of corruption, I generated 1000 lists of votes with a random average between 60% and 100%, and I checked how many votes it takes for the average to be stable (i.e. within a decently close distance of the true average) for 100 turns/votes (changing that value didn't change the result; once it's stable for a few turns, it's stable forever - with a fixed amount of corruption).

The following screenshots show the results for:

1) how many votes on average it takes for the result to be stable (decently low)
2) how many votes at most it took for the result to be stable (can be very high, with bad luck) - result 2) is not accurate at all, since the average is calculated from 1000 samples whilst the max comes from just 1
3-4) the same number, but with a reliability %, just to discard the "exceptional" results, i.e. what is the number of votes required for a certain % of the 1000 lists to be stable.
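For what it's worth, the experiment might be set up roughly like this. The vote noise model, the shape of the "corrupt" votes, the stability tolerance, and the thresholded smoothing are all my own guesses, not MECHFrost's actual parameters:

```python
import random

def smooth_rating(votes, threshold=25):
    # The "smooth list" average: discard votes far from the raw average
    # (threshold is an assumed definition of "extreme").
    avg = sum(votes) / len(votes)
    kept = [v for v in votes if abs(v - avg) <= threshold] or votes
    return sum(kept) / len(kept)

def votes_until_stable(true_avg, corruption, rng,
                       rounds=100, tol=5.0, max_votes=500):
    # Feed votes one at a time; report the vote count at which the
    # rating first stays within `tol` of the true average for
    # `rounds` consecutive votes.
    votes, stable_for = [], 0
    for n in range(1, max_votes + 1):
        if rng.random() < corruption:
            votes.append(rng.uniform(0, 20))              # a corrupt low-ball vote
        else:
            honest = rng.gauss(true_avg, 10)              # an honest, noisy vote
            votes.append(min(100.0, max(0.0, honest)))
        if abs(smooth_rating(votes) - true_avg) <= tol:
            stable_for += 1
            if stable_for >= rounds:
                return n - rounds + 1                     # first vote of the stable run
        else:
            stable_for = 0
    return None                                           # never stabilised

# Sampled vote lists with true averages between 60% and 100%,
# mirroring results 1) and 2) above on a smaller scale:
rng = random.Random(1)
samples = [votes_until_stable(rng.uniform(60, 100), 0.1, rng)
           for _ in range(50)]
stable = [s for s in samples if s is not None]
print("avg:", sum(stable) / len(stable), "max:", max(stable))
```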

Here's a solution inspired by PhysMUD: how about using bots to rate the difficulty of content?

Imagine the scenario: you have 10 "auditor" bots, each using different tactics to complete areas. When you submit an area, each of the bots attempts to complete it, and each one that dies increases the difficulty of your area by 1. If no bots die at all, your area is rejected. Your area is then assigned a difficulty rating out of 10 - for example, if 8 of the bots failed, the area would have a difficulty rating of 8 out of 10. The difficulty would then be used to determine the completion reward.
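The core of that scheme is tiny in code. Here's a sketch with stand-in bots - the `attempt` method is a made-up interface for illustration, not anything that exists:

```python
def rate_area(area, bots):
    # Every bot runs the area; each death adds 1 to the difficulty.
    deaths = sum(1 for bot in bots if not bot.attempt(area))
    if deaths == 0:
        return None          # no bot died: the area is rejected as too easy
    return deaths            # difficulty out of len(bots), e.g. 8/10

# Stand-in bots for illustration: each one simply survives or dies.
class StubBot:
    def __init__(self, survives):
        self.survives = survives

    def attempt(self, area):
        return self.survives

auditors = [StubBot(False)] * 8 + [StubBot(True)] * 2
print(rate_area("my-area", auditors))   # 8 of the 10 bots died: difficulty 8
```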

One problem that I can see is that the bots will rate the difficulty as too high. For example, if the players always run invisible past monsters and don't fight them unless they have to, the bot might not do the same, and instead fight every monster in its path - rating the area as difficult whilst all the players have to do is run invisible past the monsters and grab the treasure from a chest at the end. Or, more generally and pertinently, if there are multiple path options (or similarly, dialogues with NPCs), one easy path and one difficult: depending on the path that the bot chooses, it will give a rating that doesn't reflect the combination of the two paths - unless the bot divides itself each time it has a decision to make, and evaluates every option. The bot can always be refined, but will it be able to keep up with the new tactics that the players come up with?

This is an interesting idea though. The bot would have to be compared to the players' rating to see which one is the most reliable.

MECHFrost wrote:

One problem that I can see is that the bots will rate the difficulty as too high. For example, if the players always run invisible past monsters and don't fight them unless they have to, the bot might not do the same, and instead fight every monster in its path - rating the area as difficult whilst all the players have to do is run invisible past the monsters and grab the treasure from a chest at the end.

Well, the idea is that the bots would use different tactics, so some might indeed use invisibility or stealth. If you used a larger number of bots, you could even break the difficulty down into different categories - so in the above example, the dungeon would fail, as it would have a "stealth difficulty" of 0.

MECHFrost wrote:

Or, more generally and pertinently, if there are multiple path options (or similarly, dialogues with NPCs), one easy path and one difficult: depending on the path that the bot chooses, it will give a rating that doesn't reflect the combination of the two paths - unless the bot divides itself each time it has a decision to make, and evaluates every option. The bot can always be refined, but will it be able to keep up with the new tactics that the players come up with?

Once again the bots would each act differently - so some would pick the hard path while others would pick the easy path. NPC dialogues would be more tricky, and you'd probably need to have your bots cheat there (but as players could do the same by sharing information I don't think it's a big issue).

MECHFrost wrote:

This is an interesting idea though. The bot would have to be compared to the players' rating to see which one is the most reliable.

Or you could have them rate different things - the bots could rate the difficulty, while the players could rate the quality.

ide wrote:

It seems like you would want to make some of the bots influence the difficulty ranking more than others; for example, an invisible bot dying should add more to the rank than a vanilla tank dying.

You could, but I think it would be better if you didn't, as it would make the rating less linear. I'd like the difference between difficulty 8 and 9 to be much more noticeable than the difference between difficulty 5 and 6, for example.

It would also be easier to abuse - players might design their areas to catch out very specific builds, knowing that doing so would knock out one of the top bots and give their area a passable difficulty rating, despite it being a breeze for most players.

If each bot has an equal influence on the difficulty, then it's easier to increase the difficulty rating by knocking out the weaker bots.

Assuming that you want player-generated content at all levels, having a few static bots using preset strategies probably won't work: the hard dungeons will all be 0s; the lowbie dungeons will all be 10s. I propose the following amendment:

Whenever a bot dies, it increases its level (in some preset manner) before re-entering the dungeon. Its rating then becomes the lowest level at which it completes the dungeon. The dungeon difficulty is then the average of the minimum level required for completion across the different strategy bots. This also allows sensibly categorised difficulties, as MECHFrost suggests above.
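If I've read the proposal right, it could be sketched like this - again the `attempt(dungeon, level)` interface and the stand-in bots are made up for illustration:

```python
def min_clear_level(bot, dungeon, start_level, max_level=50):
    # The bot retries the dungeon, levelling up after every death;
    # its rating is the lowest level at which it gets through.
    for level in range(start_level, max_level + 1):
        if bot.attempt(dungeon, level):
            return level
    return None                           # never completed it

def dungeon_difficulty(bots, dungeon, start_level=1):
    # Average of each strategy-bot's minimum completion level.
    levels = [min_clear_level(b, dungeon, start_level) for b in bots]
    cleared = [lv for lv in levels if lv is not None]
    return sum(cleared) / len(cleared)

# Stand-ins: each bot's strategy clears the dungeon from some level up.
class ThresholdBot:
    def __init__(self, needed):
        self.needed = needed

    def attempt(self, dungeon, level):
        return level >= self.needed

# One strategy needs level 5, another needs level 9: difficulty 7.0.
print(dungeon_difficulty([ThresholdBot(5), ThresholdBot(9)], "crypt"))
```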

Unfortunately, this will probably only work for straightforward hack & slash dungeons. Bots will be unable to gauge the difficulty of a dungeon in which there is, for example, a puzzle (assume a puzzle whose solution is not identical in each instance). If bots are given free passes through puzzles, then the rating is instead thrown off for any puzzle which involves some combat (or other form of resource use).

I am a big fan of ratings (difficulty, for example) which are consistent. Even if they're imperfect, very much can be said for consistency, so this bot scheme sounds delightful (puzzles excluded).

Kernal wrote:

Assuming that you want player-generated content at all levels, having a few static bots using preset strategies probably won't work: the hard dungeons will all be 0s; the lowbie dungeons will all be 10s.

Well I admit I was making certain assumptions that perhaps I didn't properly clarify.

Let's assume that the content creator selects a level for their area, which restricts the level range of the mobs. Let's also assume that the level of the bots is adjusted based on the level of the area, with some slightly below and some slightly above, so that some bots really are tougher (and act smarter) than others.

I was also imagining the bots themselves being initially modelled on something rather like CRobots, possibly even being created by players as some sort of bot-building contest. They'd be required to prove their adaptability by completing a handful of predesigned areas before being accepted into the "test team", so each would likely carry a selection of different weapons and tools.

Kernal wrote:

Whenever a bot dies, it increases its level (in some preset manner) before re-entering the dungeon. Its rating then becomes the lowest level at which it completes the dungeon. The dungeon difficulty is then the average of the minimum level required for completion for each different strategy bot.

That's an interesting approach as well, but it suggests the builder has free rein to pick whatever mob levels they wish - that's something I'd prefer to avoid, as it makes it difficult to categorise areas as being appropriate for a specific level (and also makes them harder to balance with bots, as you'd have players creating "keep left" dungeons to wipe out most of the bots).

If the builder is limited to a small level range, then I don't think it's necessary to use bots outside of those levels - if all the bots can complete the area, then it's clearly too easy, and there's no need to test it with lower-level bots (we don't want to downgrade the level of the dungeon, because that would make it a high-exp area for lower-level players).

Kernal wrote:

Unfortunately, this will probably only work for straightforward hack & slash dungeons. Bots will be unable to gauge the difficulty of a dungeon in which there is, for example, a puzzle (assume a puzzle whose solution is not identical in each instance). If bots are given free passes through puzzles, then the rating is instead thrown off for any puzzle which involves some combat (or other form of resource use).

A fair point, but if I were to do something like this I'd make the puzzles fairly limited in nature, and not combat oriented. Most likely there would be a list of different puzzle types which the builder could use - a locked door or chest (requires the lockpicking minigame to bypass), a trap (varied types rather like in D&D, each requiring disarm minigames to bypass), secret exits (requires a searching minigame to locate), and so on. A reasonable selection of puzzles, which the builder could further customise cosmetically, but enough to break up the HnS feel of the area.