Bugs and Suggestions

MUD Reviews

This is probably most unholy threadomancy, but everything seemed to cut off there, so I was wondering if there was a final word, and if not, whether it was still worth discussing.

After all, regardless of whether they are good, bad or indifferent, those review links are still stuck up there on the front page.

Perhaps one way would be to have a system whereby there is a team of review moderators (separate from forum moderators), and submitted reviews wait for 48 hours or so, and if no moderator reports an objection, they get posted. If there's an objection, then the moderators can discuss it, vote on it, investigate further etc.
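The 48-hour hold described above could be sketched roughly like this (all names here are made up for illustration; this is not an actual TMS feature, just one way the flow might work):

```python
from datetime import datetime, timedelta

HOLD_PERIOD = timedelta(hours=48)

class ReviewQueue:
    """Holds submitted reviews for 48 hours; publishes them
    unless a review moderator objects in the meantime."""

    def __init__(self):
        self.pending = []

    def submit(self, review, submitted_at):
        self.pending.append({"review": review,
                             "submitted_at": submitted_at,
                             "objections": []})

    def object(self, review, moderator):
        # A moderator flags a pending review for discussion.
        for entry in self.pending:
            if entry["review"] == review:
                entry["objections"].append(moderator)

    def publish_due(self, now):
        """Publish reviews whose hold period expired with no
        objections; objected reviews stay pending for a mod vote."""
        published, still_pending = [], []
        for entry in self.pending:
            if entry["objections"]:
                still_pending.append(entry)  # goes to moderator vote
            elif now - entry["submitted_at"] >= HOLD_PERIOD:
                published.append(entry["review"])
            else:
                still_pending.append(entry)
        self.pending = still_pending
        return published
```

An unobjected review simply falls out of the queue after 48 hours; an objected one never auto-publishes, which matches the "moderators discuss and vote" step.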

There's never going to be a perfect solution though, because even professional critics have their own biases. It might be best to implement this basic spam/quality filter and then let it go as before, opt-out and all. At least then there's something out there, and people have some way of contributing in a much more controlled manner than the forum threads that have the potential to get out of hand. I don't think the opt-out is a big deal personally, as long as you see the "Top MUD Listing" as "these MUDs got the most people to vote for them" rather than "these MUDs are officially the best", because that's all it is really.

This is probably most unholy threadomancy, but everything seemed to cut off there, so I was wondering if there was a final word, and if not, whether it was still worth discussing.

Good question. Thread necromancy indeed and I'm about to start some serious "code necromancy" here one of these days.

The "new" TMS has been up almost 6 months now, and I got buried on another project right after it went up. From what I recall we had pretty much decided to use the forum for reviews, which gives us a certain level of authentication on who is writing them and allows comments to be added. The forum itself will need some hacking to allow this to be done, although I believe there are some third-party vBulletin tools that help with it.

We didn't really come up with a good rating system - too many variables. It wouldn't be fair to have a mud's total rating dragged down because they score low on PK when they're not a PK mud, same with roleplay, etc etc.

I like that a lot. It provides a good structure rather than having rambling reviews that don't say much, but it also allows people to put their comments where they make sense. One thing that no one's mentioned is that many people find reviews hard to write due to their unstructured nature, and just putting a comment about a pre-determined facet makes them a lot easier to write. It also helps a bit in reducing things to facts, rather than "the admins are all cheaters" or something. It's like one of those "we welcome your comments" cards that you get at hotels and places. There could then be a general comment space below, for things not covered.

There could be 3 columns, one for the different categories, one for the reviewer's comments, and one for the admin's comments if any. I think the structured approach is a lot easier and a lot more useful than just writing essays, since people who are looking through thousands of MUDs can quickly get the gist of any review, rather than having to read a whole new story for each one.

We didn't really come up with a good rating system - too many variables. It wouldn't be fair to have a mud's total rating dragged down because they score low on PK when they're not a PK mud, same with roleplay, etc etc.

If you're going to use a rating system, I'd rather not have something where people can give 10 (or 1) in everything, otherwise that's what most of them will do. Instead, I'd rather see one of the following:

1. Each review has 100 points to distribute among the different categories (and all 100 MUST be distributed). This doesn't represent how good a mud is compared to other muds, but instead represents how good certain aspects of a mud are compared with other aspects of the same mud.

2. Each reviewer can choose to rate up to 3 categories as 'strong', but must also rate the same number of other categories as 'weak'. Negative reviewers who aren't willing to recognise any good things about the mud are therefore not allowed to rate what they perceive as the bad things, and vice versa. Once again the 'weak' ratings don't necessarily mean that the mud is bad in that category compared to other muds, only that the mud is stronger in other areas.
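A minimal sketch of how the two constraints above might be enforced when a review is submitted (the function names and category labels are made up for illustration):

```python
def validate_points(points):
    """Scheme 1: the review must distribute exactly 100 points
    across the categories, no more and no less."""
    return sum(points.values()) == 100

def validate_strong_weak(ratings):
    """Scheme 2: up to 3 categories rated 'strong', matched by
    an equal number rated 'weak'; everything else is neutral."""
    strong = sum(1 for r in ratings.values() if r == "strong")
    weak = sum(1 for r in ratings.values() if r == "weak")
    return strong <= 3 and strong == weak
```

Either check forces the reviewer to express relative strengths within the one game, rather than handing out straight tens or straight ones across the board.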

If you're going to use a rating system, I'd rather not have something where people can give 10 (or 1) in everything, otherwise that's what most of them will do. Instead, I'd rather see one of the following:

1. Each review has 100 points to distribute among the different categories (and all 100 MUST be distributed). This doesn't represent how good a mud is compared to other muds, but instead represents how good certain aspects of a mud are compared with other aspects of the same mud.

2. Each reviewer can choose to rate up to 3 categories as 'strong', but must also rate the same number of other categories as 'weak'. Negative reviewers who aren't willing to recognise any good things about the mud are therefore not allowed to rate what they perceive as the bad things, and vice versa. Once again the 'weak' ratings don't necessarily mean that the mud is bad in that category compared to other muds, only that the mud is stronger in other areas.

I agree with the problem of rating gamesmanship, but your system does assume that all games are roughly of equal quality. It is possible that Game X does all things badly or well.

Still, this could be covered in the introductory paragraph, and/or by an overall rating. If a game gets a 2/10 overall, it's pretty clear that I didn't think much of it, though your system would force me to identify at least a few redeeming features. (For example, a largely stock MUD based on a mature codebase might still have very high stability, and responsive staff.) This should lead to slightly more mature reviewing, and it's a standard sort of format for movie reviews, etc.

My larger concern is that you're going to have a very hard time coming up with a list of characteristics that apply to all games. 'Roleplaying' or 'PvP' might either be the core concept a game is built around, or something that isn't possible, or actively discouraged. It is unfair to rate a game 'unsatisfactory' or whatever if the game wasn't designed to include the feature in question, but it is equally unfair to omit those categories if it's the main purpose of the game.

I agree with the problem of rating gamesmanship, but your system does assume that all games are roughly of equal quality.

The point of my system is that it doesn't compare the mud with other games at all - the ratings are all relative to other features of the same game.

Quote:

Originally Posted by Valg

It is possible that Game X does all things badly or well.

It is, but if any player can freely rate categories then the vast majority will give either top marks or bottom marks to every category, regardless of how well (or badly) the mud in question has actually implemented the corresponding features. This can provide some insight into the number of fan boys and disgruntled ex-players each mud has, but sadly offers very little else - while my approach at least forces the players to give some insight into the strengths and weaknesses of each game.

The point of my system is that it doesn't compare the mud with other games at all - the ratings are all relative to other features of the same game.

It is, but if any player can freely rate categories then the vast majority will give either top marks or bottom marks to every category, regardless of how well (or badly) the mud in question has actually implemented the corresponding features. This can provide some insight into the number of fan boys and disgruntled ex-players each mud has, but sadly offers very little else - while my approach at least forces the players to give some insight into the strengths and weaknesses of each game.

I dunno, KaVir, it seems to me that a rating system like yours could easily be misinterpreted by the readers. (Just as many readers misinterpret the rankings on this list as being about quality, when in fact they are only about how many players click the vote button every day.)

Most of them would probably just skip the parts where the rating system is explained, and then interpret the ratings literally. They'd think the part rated as 'weak' was weak in comparison to other Muds as well, when in fact - if the Mud in question was a really good one - it might be way above average, even though other features in that same Mud were even better.

And aren't reviews actually ABOUT comparing a game with other games?

I think it would be a good thing to have a number of categories that the review would have to comment on to be accepted, but I'm not really sure about rating, in any form. I think it would be better to describe everything in words instead.

Anyhow - it WOULD be good to have a new and improved review system implemented, if nothing else to get rid of those old reviews that have been on the front page for ages.

Molly asks: And aren't reviews actually ABOUT comparing a game with other games?

No. That isn't what reviews are about. When was the last time you read a book review, where Harry Potter was compared to Deception Point by Dan Brown? Did you ever see anyone review a Broadway production of Oklahoma and compare it to Godspell as performed on the back 90 of Red Ranch in Greenfork Tennessee?

Reviews are reviews. They are to offer an opinion, based on observation, of a thing. To find out whether the thing is worth experiencing or not, on its own merit. A book isn't good just because it compares favorably with another book, a TV show isn't bad just because it compares unfavorably with another TV show. If something can't be determined "worthwhile" or "not worthwhile" on its own merit, then the reviewer has failed.

Molly asks: And aren't reviews actually ABOUT comparing a game with other games?

No. That isn't what reviews are about. When was the last time you read a book review, where Harry Potter was compared to Deception Point by Dan Brown? Did you ever see anyone review a Broadway production of Oklahoma and compare it to Godspell as performed on the back 90 of Red Ranch in Greenfork Tennessee?
/snip/
If something can't be determined "worthwhile" or "not worthwhile" on its own merit, then the reviewer has failed.

Point taken - partly. It only holds true, however, when you talk about things that are very different. You don't compare apples with carrots, but you might compare Cox Orange to Red Delicious. You don't compare Harry Potter to Brown's Deception Point, but you might well compare it to C.S. Lewis' Narnia series. And actually I have quite often seen plays or films compared to legendary performances from 20 years back.

You might not compare a RP enforced MUSH to an unrestricted PK hack'n'slash. But between those extremes most text Muds have a lot in common.

To write a good review you need some kind of references. If you've only read one book in your life, or just seen one film, how could you review it? If you've only ever played one Text Mud, how would you know that it's better than all the other text Muds out there?

People who love text Muds tend to argue that they are better than graphical games because of more depth and detail, better RP or whatever. But that isn't really true. They only have the POTENTIAL for more depth and better RP. There are lots of text Muds that are just as shallow as any graphical game. If you don't have any reference points, and make at least some comparisons, the review is likely to be useless.

I think that each category to be rated should also have a text box below it, for the reviewer to explain the grade and whether they intend it as positive or negative. The grades at the extreme ends could have minimum character lengths for the explanations, so that people would have to give their reasoning for an especially good or bad grade.

If you were reviewing a product in a market with a limited number of choices - say, a few decades ago, Coca-Cola in the world of caffeinated caramel-coloured drinks - you could make comparisons between that product and the others and somehow come up with a measure of which product is superior.

The other option is to have a standard, in which case you compare to this standard (or even an average) and determine if the product does better or worse compared to this standard (average).

The problem we have is that we are trying to measure something that is not specifically defined, so there is a blanket definition that pools together thousands of inherently different game styles/types under the same banner. In such a scenario it is unrealistic to expect a review that pits one game against the rest, because there are no standards and there is no way to create an average to compare to. Furthermore, even if the average was somehow defined, it would still be the reviewer's job to use that information correctly when grading/rating the game they are reviewing.

In response to KaVir,

I think this is a good workaround, except that it would be difficult to grade very good/bad games, as pointed out already. If I attempted to say that everything is equally good/bad and tried to assign a blanket grade to the game (say, with 10 categories, 10 points to each), it would be impossible to distinguish a good opinion from a bad one. I suggest two things to improve this approach, though. Split the likes and dislikes into two different polls, and let the reviewer determine which aspects to consider when setting values: you have 100 points to distribute as KaVir suggested, but now you "grade" which features rank as important in what you like about the game and which rank as unimportant. Statistics can be done on this type of data (though it is still rough, and more difficult than plain grades for each feature).

Now, the idea of just having a grade of 1-10 for each feature is not necessarily bad. Think of it as a measuring stick with too few divisions: as the number of measurements increases, the average approaches the value of a carefully taken measurement.

Code:

Error_x = sqrt(sum_i((x_avg - x_i)^2) / (N * (N - 1)))

Check out (ref) for a discussion. But basically, you can run a quick test, and after a rather small number of reviews even under these extreme conditions you arrive at a relatively low error and an accurate enough average.

So, if the absolute grade approach was taken, I would suggest only posting results after a certain number of reviews have been submitted (10 or so at least), and then the numbers can be shown.
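The error formula quoted above is the standard error of the mean. A minimal sketch of computing it for a set of submitted 1-10 grades (the function name and the example numbers are just for illustration):

```python
import math

def standard_error(ratings):
    """Standard error of the mean for a list of numeric grades:
    Error_x = sqrt(sum_i((x_avg - x_i)^2) / (N * (N - 1)))"""
    n = len(ratings)
    mean = sum(ratings) / n
    return math.sqrt(sum((mean - x) ** 2 for x in ratings) / (n * (n - 1)))
```

Even with the "extreme" voting pattern of all 1s and all 10s, the error shrinks as 1/sqrt(N), which is why hiding the average until some minimum number of reviews (10 or so) is in makes sense.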

Since we haven't had the ability for players to review for well over a year, my vote would be to just bring back the original review ability first, then argue about changes. Those who frequent NW do not care about the method of review, just the ability to review: good, bad, or indifferent. I mean, the massive advances in the games, the modifications, and the changes in player base make the current reviews completely obsolete. It is like reading a review of Windows XP to decide whether you want to buy Windows Vista.