In the coming week of Spoiler Warning, we have a conversation about how gradual some of the upgrades are in Mass Effect 3. I’m talking about the things that give 10% more damage, or increase the radius of area damage by half a meter, or other baby-step upgrades. This conversation reminded me of another problem with complex damage systems, one that has bothered me as a programmer for ages.

For our worst-case scenario, let’s consider the system for incoming damage in your typical online game of the massive variety. The game is set up so that the enemy (usually called a mob) is trying to kill the player character. To do this, it has to reduce the player’s hitpoints to zero. There are a lot of layers of mechanics between the mob and the player death state:

1. Control

You knock the monster down, knock it back, stick it in place, confuse it, transform it into a sheep, or otherwise stop the mob from initiating an attack. This is actually a pretty straightforward behavior and since the outcome is binary (it either attacks or it doesn’t) it’s not terribly difficult to see the impact it has on the fight. However, things get increasingly murky from here on…

2. Prevention

Players can inflict conditions onto mobs to make them attack more slowly, hit less hard, recharge their magical energy less quickly, limit their attack range, or otherwise limit their combat effectiveness. The mob will still attack, but those attacks will be some percent less potent.

However, the attack does take place unless the player has…

3. Evasion

You might be partly invisible, or dodge due to being very nimble. Maybe the game has a formally recognized dodge mechanic. Whatever. The mob takes a shot that may fail to connect. Maybe your ability simply makes the “next” attack miss, or perhaps this is a probability roll of your ability to dodge vs. their ability to hit.

If the player doesn’t dodge, then perhaps they will use…

4. Blocking

This typically only applies to characters with shields, but some games also offer it in a magical sense, where the player is protected by a conjured barrier rather than a physical shield. Again: Maybe there’s a shield that can absorb a single attack, regardless of strength. Maybe the player is under an effect that will automatically block X attacks. Maybe this is a probability roll of your [fortitude] vs. their ability to batter aside your shield. Whatever. If the attack is blocked, then a majority (perhaps even all) of the incoming damage is discarded.

Any damage that makes it through blocking will then proceed to…

5. Mitigation

Here is where your “toughness” stat comes into play, or whatever stat they’re using for extremely durable characters in this game. If you’ve got great armor or high stats, then the incoming damage numbers will be reduced by some factor.

Once the damage has been mitigated, it finally gets to the health bar and it’s time for…

6. Absorption

Finally we’re subtracting some health away from the player’s hit point bar. If the mob can inflict sustained damage faster than the player is being healed, then eventually the player will drop dead.

This is a very bare-bones look at the system. We’re ignoring magical resistance, elemental damage, damage over time, damage reflection, and conditions on the player. We’re also ignoring the similar path followed by outgoing damage and aggro mechanics. Basically, we’re talking about the simplest form of damage: A mob hitting you with its pointy-stick of choice. Even this simplistic interaction has many layers that might have a lot of funny math going on.

In just one example: You’ve got innate stats and the armor bonuses of perhaps a dozen or so pieces of armor. How do those stats stack up? If your stats reduce incoming damage by 20% and your armor reduces it by another 30%, are those stats additive or compounding? If a mob has just hit the player for 100 damage then perhaps the 20% is added to the 30% reduction for 50% damage reduction. Result: The player suffers 50 damage. On the other hand, perhaps the 20% is applied first, reducing the damage to 80. Then the 30% is applied, reducing it to 56. Result: The player suffers 56 damage.
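
The difference between those two readings is a one-line difference in code. Here’s a minimal sketch (the function names are mine, and no real game necessarily uses exactly this) of the two interpretations:

```python
def reduce_additive(damage, reductions):
    """Sum all reduction fractions first, then apply them once."""
    return damage * (1 - sum(reductions))

def reduce_compounding(damage, reductions):
    """Apply each reduction in turn to the running total."""
    for r in reductions:
        damage *= (1 - r)
    return damage

# A 100-damage hit against 20% stats and 30% armor:
hit = 100
print(round(reduce_additive(hit, [0.20, 0.30])))     # 50
print(round(reduce_compounding(hit, [0.20, 0.30])))  # 56
```

Both versions are defensible designs; the bug hazard is that nothing on the character screen tells you which one the game actually implements.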

Most players just ignore the number-crunch. They look at two bits of armor, and pick whichever one has the higher numbers. No need to worry about what’s going on under the hood. This is usually my approach to playing a game.

But as a programmer, my problem is this:

How does anyone know if the system is actually working properly?

This system is simply too chaotic for normal QA testing to detect bugs. Sure, big problems will show up easily enough, but this system is made almost entirely of small systems chained together. Even if it were feasible to have a team of QA testers examine every power against every mob with every piece of gear at every level, and even if they understood the mechanics completely, I seriously doubt it would be possible for them to spot problems without some serious statistical analysis tools.

If that sword that offers 3% more critical damage was actually doing 3% less damage, how could you possibly tell? If the armor bonus of your boots was never applied to melee attacks, ever – would anyone know? What if that condition you repeatedly stack on an enemy to make them weaker is actually only effective the first time, and otherwise ignored? What if, instead of adding the critical chance of both of your equipped weapons, the system only uses the main-hand bonus unless you’re using pistols in which case it uses the lower of the two? What if the condition the enemy puts on you to make you slightly more likely to miss is never actually cleared until you log out or change zones?

Now, you might say that you’d notice because the numbers in your character screen wouldn’t look right. But you’re assuming the character screen is being honest with you. Josh found a bug in Guild Wars 2 where an Elementalist can select a trait that will make certain powers cool down faster. After selecting the trait, the listed cooldown of Mist Form dropped from 90 seconds to 75 seconds. However, when you actually use the power, the cooldown is still 90 seconds. The tooltip is lying. The trait is probably completely useless. Note that the only reason Josh is aware of this bug is because you can set the game to display the cooldown timer. Without this, he’d need to use a stopwatch to see the problem. If this trait impacted random damage rolls instead of a fixed timer, then it would be extremely difficult and time-consuming to conclusively observe this bug.
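
One plausible way a cooldown tooltip ends up lying like this: the displayed number and the enforced number are computed in two different places. A hypothetical sketch (the trait name and reduction factor here are invented for illustration, not GW2’s actual data):

```python
BASE_COOLDOWN = 90  # seconds

def tooltip_cooldown(traits):
    """What the character screen shows the player."""
    cd = BASE_COOLDOWN
    if "faster_cantrips" in traits:
        cd *= 5 / 6  # the UI dutifully applies the trait's reduction...
    return round(cd)

def server_cooldown(traits):
    """What the server actually enforces."""
    return BASE_COOLDOWN  # ...but the mechanic itself was never updated

traits = {"faster_cantrips"}
print(tooltip_cooldown(traits))  # 75 -- what the player is told
print(server_cooldown(traits))   # 90 -- what actually happens
```

Nothing forces these two functions to agree, so nothing flags it when they drift apart.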

If you’re fighting alone, then fights are generally too short to observe anomalies. If you’re part of a group, then any anomalies are easily attributed to the activity of other players. All you can see are damage numbers, which are derived from random rolls and run through the numeric meatgrinder above. Even if by some miracle you do notice a problem with the numbers, are you really going to be able to detect this as a bug? Maybe you’re lagging. Maybe another player has thrown some other conditions or buffs into the mix. Maybe you’re just noticing a few bad rolls in all the noise. Maybe a recent patch changed things. Maybe you’re just failing to understand the undocumented mechanics.

In your typical MMO, think about all the observable bugs you see. The one corner where people fall out of the level by hopping. The way magical swords don’t generate particle effects after you get out of the water. The way the “salute” animation is incorrectly applied to characters in motion. The way on-screen chat bubbles get clipped if you have the camera zoomed all the way in and your game resolution is different from your desktop resolution. Bugs can often be rare, situational, and obscure.

If bugs that can be easily observed and reproduced end up lingering for months, then how much worse is it for bugs that people might not be able to notice? Any system that can’t be properly tested is bound to have bugs in it. And unlike visible bugs, there might be incentives to avoid looking for or fixing mechanical bugs.

Consider a programmer who is scrolling through the code one day and finds this:
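
(The actual snippet isn’t preserved here, but based on the description that follows, it presumably looked something like this hypothetical reconstruction:)

```python
from dataclasses import dataclass

@dataclass
class Ring:
    fire_resist: float   # 0.1 is shown to the player as "10% fire resistance"
    exotic: bool = False
    crafted: bool = False

def apply_fire_resist(damage, ring):
    if ring.exotic or ring.crafted:
        # Correct path: scale the damage down by the resistance fraction.
        return damage * (1 - ring.fire_resist)
    # BUG: for plain jewelry the fraction is subtracted as a flat amount,
    # so "10% protection" shaves off a grand total of 0.1 hitpoints.
    return damage - ring.fire_resist

print(apply_fire_resist(100, Ring(fire_resist=0.1)))  # 99.9, not the intended 90.0
```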

So your ring that granted 10% fire protection was actually just reducing all fire damage by 0.1 hitpoints. This bug only applies to fire resistance bonuses from non-exotic jewelry that isn’t player crafted. In this game, perhaps fire damage isn’t a big deal until you get near the level cap, so here we have a bug that only applies to certain characters with certain gear of a certain quality, and even then the bug is only important if they happen to be facing a lot of fire damage.

Nobody has noticed this bug in the year or so since it was introduced. Now for the really tricky questions for the programmer:

Do you fix this bug? Is that going to introduce balance problems with PvP? Will making the required change introduce a bug elsewhere? Do you really want to announce to players that the precious gear they bought and quested for actually didn’t work right until now? Or maybe you should just fix the bug quietly?

Quest of the Dungeon Swords Online. Patch notes for build 1.133:

Fixed bug where players could fall out of level in The Citadel of Dragon Fighting.

Lorik Questsayer will only give the quest to kill the Dredrat King if the player has opened the way to the Dredrat Warrens.

Fixed the issue where bloom lighting would cause slowdowns with certain NVIDIA cards.

Made slight change to how fire damage is calculated in some situations.

Players can now re-bind the key to display scoreboards in Battlewatch PvP arenas.

This doubt grows on me the more I play these games. As the mechanics become ever more impenetrable, I can’t shake the notion that maybe I just spent five gold on a sword for no reason because the numbers don’t work the way the tooltips claim.

GW2 also has the problem that the combat log is not complete. If I apply 10 stacks of bleeding to a monster, my screen shows little numbers flying off it like a water fountain, but my combat log just shows that I did X damage directly by swinging my sword. I want to know how much damage I’m actually doing, not just how much my sword does directly when it hits. How am I supposed to compare, from a pure damage perspective, a critical-focused build to a condition-damage-focused one? And that totally ignores other modifiers like vulnerability or cold/rooted/slowed/quickened that can have second-order effects.

Additionally, there isn’t really a way to analyze the combat log. At least not in near-real-time (give or take 5 seconds) like in Rift or WoW, for example, where there are pretty advanced tools for analyzing the log (DPS meters, but certainly not limited to those).

Furthermore, the lack of those analytical tools leads to a lack of (good) theorycrafters. In (early) WoW and in Rift, the theorycrafters were a lot more reliable than the developers, and without them many bugs, imbalances, etc. would never have been found, or might even have been quietly abused by the select few people who stumbled over them.

Furthermore, I don’t really understand how, in many games, tooltips can get so messed up that the results are different from what the tooltip says. Now, I only know a little bit of Java, but shouldn’t there be a variable for, let’s say, 5% more damage? So why isn’t that same variable used in the tooltip output? To me it looks like bugs such as the mentioned cd-reduction one can only happen, and be overlooked in testing, if the tooltips aren’t even using their abilities’ own variables.
Or am I mistaken on this?

Yeah, but is the coder the one writing the tooltip? Is the writer a coder? Those two jobs and skillsets overlap a lot less than you’d think. I’d bet the tooltips are written by technical writers from the spec.

I think what happens is that the variables are correct but used wrong. So in the 10% fire resistance example, the variable is .1, which corresponds to 10%, and the display takes the number, multiplies by 100, truncates at some point (probably selected by some special algorithm to detect the last nonzero value and cut out all zeros past both that and the decimal point) and appends ‘%’. But then the math calculating the effect of fire resistance on damage doesn’t work right, possibly an artifact from an experiment with fixed damage reduction from resistance.
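
That display path is easy to sketch; something like this hypothetical snippet would produce exactly the behavior described:

```python
def format_percent(value):
    """Turn a stored fraction like 0.1 into display text like '10%'."""
    text = "%f" % (value * 100)           # 0.1 -> "10.000000"
    text = text.rstrip("0").rstrip(".")   # strip trailing zeros, then the dot
    return text + "%"

print(format_percent(0.1))    # 10%
print(format_percent(0.125))  # 12.5%
```

Note that the display side is entirely correct, which is exactly why the bug hides: the tooltip faithfully shows the variable while the combat formula misuses it.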

They might also just write the tooltip to contain what the effect is supposed to be, but that would be time-consuming and dumb.

I would suspect that the tooltip is written by a tech writer as plain text, and the client software actually has no clue how much of what is happening. I haven’t worked on MMO type games for real, but if I were, exchanges like the following would happen:

Client says to server: “My user would like to use a skill called whirlydoo of doom.”

Server says to client: “Okay, you’re allowed to do that. By the way, remember the monsters I told you about earlier? Each of them just took 32 damages. Oh, and here’s where they’re all standing now. And here’s where you’re standing now.”

Client -> Server: “Exciting. My user’s a lazy bum and hasn’t mashed any buttons in a while. Got any interesting facts about the world for me?”

Server -> Client: “As a matter of fact, I have. Your player was hit for 9,000 damage. Your player has died. Your dead player is in this location. Monster number 24601 was killed while standing 3 meters due North of you. Monster number 86502 acquired some nifty new status effects. Chitter chatter.”

The client has no real need to know exactly what the effect of a skill is, except to the extent that it can let it predict which shiny graphics it should be showing next. Meanwhile, the server is the arbiter of all truth and so knows the correct value of that variable describing how much fire resistance a particular skill bestows.

But if everything were written by tech writers, someone would have to hand-write all item numbers and manually update them every time they got patched. Much easier to create an auto-generator of descriptive text that creates them from the raw data of the item. That would seriously take less than an hour, including debugging.

Actually, if the last few patch notes from STO are any indication, that’s exactly how it’s done.

That means that actual human beings are writing these tooltips and half of them are still useless number mash.

The only screen in that game where I am confident that I’m seeing derived values displays different numbers in situations where they should be the same, and has for at least a year. (If you level up an ability, its values change, but if you confirm the point expenditure and then reselect the skill box, the numbers will be different.)

I remember in NWN 1 using the level editor to test things & gather stats. That was to see if it followed PnP D&D or not, but the same applies to bugs (and indeed the differences I saw could have been bugs, or could have been design choices).

It’s particularly bad if the code was written so UI elements (tooltips, timers) are client side, but actual mechanics are server side and in a different module/library/class/etc.

“It’s particularly bad if the code was written so UI elements (tooltips, timers) are client side, but actual mechanics are server side and in a different module/library/class/etc.”

This is how software ought to be written, actually. There’s a really complicated explanation for that which I won’t get into, but for now I’ll just explain that logic, presentation, and input all need to be completely separated in both the client and server (which should be separate as well).

Not to say that the broken tooltips in NWN weren’t bogus, though. They totally were, and I feel your pain. :/

3-and-a-half years later, I click on a “from the archives” link, think “hey, this is a cool article I don’t remember”, “the comments look interesting”, “hey, there is someone with the same username”, “wait, hang on a minute”…. doh!

I tend to give the underlying game systems the benefit of the doubt – innocent until proven guilty and all that. For instance, I’ve always assumed that my GW2 trait to reduce shout cooldowns worked as advertised, but after reading that Josh’s elementalist has a similar trait that doesn’t, I’m going to go actually test it.

Unit testing would definitely help, but given the complexity of the system, writing comprehensive unit tests might be prohibitive. Unit tests should catch simple bugs, like the one Shamus described, but what about complex bugs involving how multiple systems interact with one another?

Writing unit tests to cover every test case might take almost as much time as it took to write the game in the first place.

There’s only so much you can do to ensure your code’s reliability. Unit testing will give the most bang for your buck. There’s a reason why Google mandates all of its software development be unit tested (and code reviewed).

Unit testing is awesome and mandatory for a system as complex as WoW. But it can’t possibly catch everything.

Even if we pretend there are only 500 skills in the game, that’s 250,000 combinations that need to be tested against each other – and that doesn’t take stacking or random effects into account, or that more than two skills can be affecting a mob at once. In reality, there are thousands of skills, and dozens or hundreds of effects can be placed on a mob at once, particularly raid bosses.

And sure, you could generate some of the unit tests programmatically, but you’re back where you started now. What’s to stop your unit test having the exact same bug? What’s to stop a buggy test failing on a bug-free skill?

Sorry, missed a ‘not’ in there. Should’ve been that unit tests are “certainly not an all-purpose solution”. As you note, there are limits to what unit testing can catch. It’s a really grand place to start, though.

That’s ridiculous; you don’t need to write an individual test for every spell in the game! You need a test for each mechanic your mechanic might interact with. You don’t need a hundred different tests for each individual fire damage spell; you just write a test that checks that fire resistance reduces fire damage by the right amount. Since you don’t need to write this for every specific item that grants fire resistance, and you don’t need tests for spells that don’t involve fire damage (e.g. healing), this comes to far fewer tests than the inflated number quoted.
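
That per-mechanic approach might look like this in practice — a hypothetical sketch using Python’s unittest, where the resistance formula is assumed to be a simple multiplier:

```python
import unittest

def apply_fire_resist(damage, resist):
    """The mechanic under test: resist is a fraction like 0.10."""
    return damage * (1 - resist)

class FireResistTests(unittest.TestCase):
    # One suite per mechanic, not per item: every item that grants
    # 10% fire resistance flows through this same code path.
    def test_ten_percent_reduces_100_to_90(self):
        self.assertAlmostEqual(apply_fire_resist(100, 0.10), 90.0)

    def test_zero_resist_changes_nothing(self):
        self.assertEqual(apply_fire_resist(100, 0.0), 100.0)

    def test_full_resist_negates_everything(self):
        self.assertAlmostEqual(apply_fire_resist(100, 1.0), 0.0)
```

Run it with `python -m unittest` pointed at the file. A suite like this would have caught the flat-subtraction bug from the post the moment it was introduced.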

Okay, I can see how that would be a problem. Couldn’t they try these things out in beta testing? I know some bugs are never going to be spotted until the world is live, but don’t they have private servers in their building to run the full world beforehand?

Well, you have to bear in mind that we don’t hear about nearly all of the bugs that get fixed in beta unless they’re exceptionally hilarious or happen at E3. Having made relatively simple programs, I can tell you that there’s a lot of bugs that never reach market. I mean, the game compiles and runs, so a lot of bugs had to get fixed before it can move from literally unplayable to virtually unplayable. Then the fixing of logic errors, the sorts of bugs you actually see, can begin. But it’s a huge piece of software, so there will be a lot of those. And there’s a complex web of interdependencies where fixing something might break another thing. And the ship date is looming in the distance like the icy hand of death.

CCP does: the next patch/expansion is usually thrown onto Singularity (the PTR) for open testing by the players, and there are also Duality and the private test servers. Sure, there are bugs and issues they miss, usually because the under-the-hood mechanics can be upwards of 7 years old.

Also, of the MMOs I’ve played (GW, GW2, WoW, Rift), EVE Online has the most transparent mechanics, complete with tools used to optimize tanking, damage, CC, etc., along with a very open approach to telling people what they change.

Private servers wouldn’t catch the kind of bugs Shamus is focusing on. The root cause is a disparity of information given to the human vs given to the computer. A private server or other beta testing would continue to tell the human one thing while doing another.

One solution is to provide more info to the player to let them double-check what the computer is doing, i.e. complete combat logs breaking down every nuance of every roll. Really, that info should always be available to a player. (Off by default, but able to be turned on.) One player (out of millions) will be much more likely to notice the error quickly.

THIS is why it drives me absolutely INSANE that games virtually never tell you how the system under the hood actually works. +10 toughness does what now? Lowers damage? By how much? Is +10 toughness better or worse than +10HP? Why the hell am I playing, trying to make a good character when I’m utterly blind to how the system works, and I only get fuzzy feedback as to what should work?

I’ve seen you (Shamus) say before that PnP DnD rules are bad for computer games, but I’ve never understood why. At least with the 3.5 rules, it’s defined what stacks and what doesn’t and how, so when I make a character in NWN I know when I’m being shortchanged on my build. Games like GW2 are just impenetrable…

This issue right here is why I personally wish more games used pen and paper rules. Well, barring draconian copyright laws, of course–I’m sure WotC would love to sic their lawyers on any game foolish enough to venture into their territory, but it would be nice to have a transparent video game system to play with for once.

It’s better to design a system suited for the environment you’re using. In tabletop, the game is paced so that each roll / turn is exciting and there are decisions to be made every round. In a fast-paced computer game, those turns fly by too quickly for the deeper mechanics to shine.

A lot of D&D mechanics get dropped simply because they would be too hard to portray / animate in a game. Grapple rules, for example, are right out. So are about half of the wizard spells, and a lot of the jumping and tumbling moves. There’s also the grid-based nature of the combat, the 5-foot step, and that sort of thing. By the time you adapted all of that to a real-time world of free movement, you’d have thrown away almost as much as you’d retained.

It’s true that D&D rules are well-documented, and that is an advantage. But there’s still going to be a lot of compromise going on.

Having said that: I think I’d rather have badly mangled D&D rules than a slate of possibly buggy mystery mechanics.

Temple of Elemental Evil (by Troika) is by far the best translation of DnD mechanics to a computer game I have ever seen. It still throws away most of the wizard spells, but it retains a lot of those other things.

One of the important points, of course, is that combat is turn-based in that game. You really cannot take a turn-based system, like DnD, and make it real-time. It just doesn’t work.

Just think of it as a “tip” to a team that is actually trying to fight against DRM. And to provide whatever patches are needed to get it working on newer OS. I’ve re-bought a few games from them just to save me from the headaches of trying to install an old game (that’s even if I can find the disks again).

It’s also only 6 dollars and you get a copy you know will work forever without needing a cd.

The main thing, though, is if the game is that good that you want to play it again, why not chuck a few bucks the publisher’s way to say ‘good job sir, bring me another’? Consider it a donation, like you would a free-to-play game.

I consider it less a tip and more a fee. The GoG guys are taking older games and making them compatible with modern systems and tearing out the DRM as well as various other minor and sometimes major modifications. I’m happy to give them a few bucks for that work.

Yep, I am quite happy to pay someone else a minimal fee to provide me with a game that no longer needs a customised Dosbox setup and funky graphics drivers and a cd drive and can be installed somewhere other than c:\ORIGINALFOLDER

GoG is great. Plus they come with the manual in pdf, which saves a heap of scrambling through old boxes when you pull out something classic. Although I do miss my old handwritten notes on how to beat *that one damned section* that lurk all through my old manuals.

So, do the original publishers get a cut? If not, I suppose I could justify paying for it. But then, how is GOG not mired in infinity+ lawsuits?

I appreciate what GOG does, and when I come across a game that doesn’t work because it’s just plain too old I will buy it from them, but I refuse to give jackass publishers a financial incentive to employ DRM.

The truth is that most original publishers don’t get the cut because they no longer exist, or they’re very different people working under the same name. This means that for the games GOG sells, it’s very rare for your money to be going to whoever came up with the DRM.

So you’d be paying whoever owns the copyright for those games (fights over copyrights are the reason games like System Shock 1 & 2 are not on GOG or anywhere else for sale) and, of course, GOG, who get a well-deserved cut. Sure, those copyright owners might be getting free money for a game they didn’t make, but at least they didn’t make the DRM either.

I feel I should point out that (unless GOG have done some major work bug-fixing) the game never ever worked ‘fine’ – it was always pretty buggy. It also had a difficulty level that was all over the place and no coherent plot or characterisation (which you could argue was Gary Gygax’s fault, but then, Troika didn’t have to choose this module to base their game on).

But it did have a brilliant, brilliant implementation of D&D 3.5 edition rules. Wish other games had been made which used it.

True, I posted that and thought about all the mechanics from DnD that definitely won’t work in a video game, but still… that’s just because there are things you can do in DnD that you can’t in a video game, right? Like Diplomacy: the Diplomacy skill purely as written is silly, even in a pen-and-paper setting, but even more so when attempted in a CRPG, so just drop it.

Drop what doesn’t apply to your mechanics and keep what works. Grapple mechanics byzantine and difficult to animate? Don’t use them. Want attacks to hit and miss based on player skill, and not some abstract armor mechanic? Then don’t use AC. But if your hit chance is calculated entirely from character stats and line of effect at the time of attack (like, say, City of Heroes does it), then why does it have to be any more complicated than attack roll vs. armor class?
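
That attack-roll-vs-AC check really is about as simple as it sounds. A sketch of the classic d20 version (the natural 1/20 rules follow D&D 3.5; the rng parameter is my own addition so the roll can be controlled in tests):

```python
import random

def attack_hits(attack_bonus, armor_class, rng=random):
    """Classic d20 to-hit: roll 1d20, add the attack bonus,
    and compare against the target's armor class."""
    roll = rng.randint(1, 20)
    if roll == 1:
        return False   # a natural 1 always misses
    if roll == 20:
        return True    # a natural 20 always hits
    return roll + attack_bonus >= armor_class
```

The whole mechanic fits in a screenful, and a player can reason about it in their head: +5 to hit vs. AC 15 means you hit on a 10 or better.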

I don’t even think it has to be pure DnD, either; it’s just that the DnD system is the most widespread and the most widely tested for balance. I just think video game systems should be simple enough that you could write them out in pseudocode, grab a bag of dice, and go play in your kitchen (and, subsequently, write them out for the rest of us to see so we don’t have to scratch our heads). And hey, since TSR/WotC already did all the balancing and play testing for a robust system before now, you shouldn’t even have to start from scratch.

For starters, the grappling example is not a good argument against DnD rules: any rule that people around the gaming table hesitate to use (see the relevant DMoTR comic) can probably be left out of the computer simulation.

Second, the rules of a PnP game are just simpler: even the most byzantine rules of the most complicated games have to be settled with a few dice rolls and very simple arithmetic (plus a table or two). Even these days, I don’t think that people are willing to sit around a table with a laptop and several spreadsheets to play their favorite PnP game.

And though I am very willing to give the “different environment, different rules” argument its due, you cannot convince me that the complicated rules that MMO’s and CRPG’s use are truly necessary in context. I think programmers do this because they can: giving the computer mind-numbing number-crunching to perform seems like the obvious thing to do.

But does it lead to good games? On top of the fact that this opens the door to countless hard-to-track bugs, what of the effect on the player? This has been mentioned in this thread before: what do any of the stats DO? How can I possibly figure out what to prioritize in these increasingly complex systems?

Even an easily understood PnP system can lead to hard mathematical problems (the old World of Darkness comes to mind). So what chance does a player have to understand a closed system with as many stats as today’s MMOs?

“Even these days, I don't think that people are willing to sit around a table with a laptop and several spreadsheets to play their favorite PnP game.”
You would be wrong; my friends and I currently do that for D&D 5th ed playtesting (or whatever is the game of the week).
Books are better than laptops for passing around the table, but spreadsheets definitely help in character creation for games like Eclipse Phase.

“And though I am very willing to give the “different environment, different rules” argument its due, you cannot convince me that the complicated rules that MMO's and CRPG's use are truly necessary in context. I think programmers do this because they can: giving the computer mind-numbing number-crunching to perform seems like the obvious thing to do.”
I’m quite sure it would be the designers saying “We need shields, and blocking, and resistance, and elemental damage. In fact, I saw this one game with this awesome damage-over-time thing; we should use it, but make it so it does more at first.” And so on. The poor programmers would’ve been asked to figure out how to make all these things work together, and when you’ve got that many variables it’ll always come out messy.

Do keep in mind, though, that many PnP games look easy because the human brain is just such a good decision-making machine (i.e. it’s good at making many decisions fast, not always at making good decisions). The problem is that not all those decisions can be communicated to a computer game as easily as they can to a DM.

Take the 3.5 power attack feat. It’s easy for a player to say “I power attack 5”, wait for the DM nod and roll the dice, adjusting the numbers accordingly. But how do you tell the computer?

NWN allowed you to activate Power Attack mode, but it only used one of the up to twenty possible input values (a second feat added a second value). The choice was taken away from the player to make the game flow faster.

ToEE had a separate menu with a slider so you could set your power attack value exactly. It was mildly annoying to reach and slow to use; it would not have worked in a real-time combat system. Furthermore, once set, you had to set it back manually. At the gaming table, you just say how you power attack each turn. In ToEE, the designers correctly decided that having to go to that menu every time you wanted to use it would be annoying. The downside, of course, is that if you forget you’re power attacking, you may miss a lot before figuring out why.

That is but one of the many feats in DnD. Any game that wants to implement the d20 system in real time will end up taking away a lot of player choice.

About the most accurate translation of a turn-based tabletop game into a real-time computer game was Starfleet Command 2, and even there quite a few choices were made for the player (energy allocation, for example).

Talking about feats brings you already way further into the process than I was. At the most basic level, a system like d20, even though I never played it around a table (because I am an old fart), is straightforward enough that I can look at, say, two magical rings and have a pretty accurate sense of which one will protect me better in which situation, even if their nature is different (say, AC bonus vs. Dex bonus).

I can’t say that the ad hoc systems that I’ve seen in other CRPGs ever communicated clearly what an armor score was for, how a high Dex might help me dodge attacks (I presume it does, but what do I know anyway), or, another favorite gripe of mine, whether it makes any sense for me to invest any points in characteristics that are not directly related to my character class. (Of course, that info may be out there, but I’m not interested in min-maxing, so I don’t go scouring the forums for it; I just want to feel like my decisions are based on something.)

(On a somewhat related note: I don't necessarily think the PnP systems are easy to work out; I did mention that they can lead to difficult and counterintuitive probability questions. But, by design, the principles themselves have to be easy for our brains to grasp, or the system is doomed to failure. It sounds like the mechanics most computer games use are NOT of this type: not the kind of thing you would want to explain to a friend in 15 minutes in a coffee shop; not if you want to keep that person as a friend, anyway…)

They are easy in some ways (calculations) but difficult in others (presentation and information and input of choices).

CPUs are great at math but terrible at making decisions, while the human brain is the opposite. So games run by each tend to lean towards those strengths respectively.

The problem is that because our brains are so good at comparing information and making decisions based on it, we tend to want extra details and depth added to our games to increase the mental challenge, and we sometimes push that way past the point of being able to keep up with calculating all of it. That's why computer games get so complex: we don't have to keep calculating, we just get to do the comparing part, which we like. How much point there is to that when we don't know the actual calculations being done depends on how much the game tells us.

Though even IF you had a game that was BASED perfectly upon the DnD rules system, that is still no guarantee that those rules would make it into the code correctly or that there wouldn't be bugs in your enchanted ring of awesomeness.

I’m pretty sure that using one system of rules vs. another has almost nothing to do with the frightening scenario that Shamus has presented to us.

The difference between DnD rules and the rules usually in these games is that I can go to a website and read up on exactly how the rules work, and not be met with “well, best we can guess is, the to-hit percentage formula is [(attack statistic-enemy evasion statistic)^2-3*enemy armor]/(attack statistic)^2.” It’s “roll a d20, add your attack; if you beat their armor, you hit.” It’s designed to be human-understandable, it’s relatively balanced, and it works. It has problems aplenty, but at least when I make a character with 20 AC, I have some idea of what that means besides “it’s higher than 19 AC.” Conversely, GW2 has damage numbers attached to its weapons like “150-200,” damage numbers on skills like “350,” and altogether it adds up to… 30-50 damage? Wha?
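Part of why the d20 rule is so legible is that it fits in a couple of lines of code and can be checked by inspection. A minimal sketch (the bonus and AC values are illustrative, not from any particular ruleset):

```python
import random

def d20_attack_hits(attack_bonus, target_ac, rng=random):
    """Classic d20 to-hit: roll 1d20, add the attack bonus, meet or beat AC."""
    return rng.randint(1, 20) + attack_bonus >= target_ac

# With +5 to hit against AC 16 you need an 11 or better on the die,
# which is a 50% chance -- verifiable by inspection or by simulation.
trials = 100_000
hit_rate = sum(d20_attack_hits(5, 16) for _ in range(trials)) / trials
print(round(hit_rate, 2))
```

The opaque squared-and-divided formula quoted above would take a spreadsheet and a lot of guesswork to verify; this one you can eyeball.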

I greatly enjoy trying to optimize my own character without leaning on a wiki and the suppositions of strangers, which means I don’t enjoy virtually every CRPG out there as much as I could.

To continue with my blatant XCOM fanboyism, this is one thing I really liked about the new version vs. the original. As an example, the original modeled bullet flight physically. This means there has to be an equation for accuracy, which is not given to you (all you see is shot accuracy, not factoring in the accuracy stat, the kneeling bonus, the two-hand bonus (which I didn’t know about), etc.). Say you work that out: now you know how likely a shot is to land exactly on target. Then you need to figure out how badly it might deviate, which is murky but at least directly tied to the to-hit percent. Then you have to figure out the maximum possible angle that still hits a certain hitbox at a certain distance. Then cover comes into the equation and you have no idea what your odds actually are. EU just says: you’ve got a 70% chance to hit, here’s why, do you take the shot? It does lose something even aside from complexity and realism (panicking makes less sense, no friendly fire), and it’s not perfect (I’d love to be able to test a square for LOS/range/flanking/etc.), but it makes it a lot easier to make a decision with all the information.

Nope, the listed number is what you get when all of those are accounted for. What probably confused you is that all of those modify the weapon accuracy in the UFOpedia, which IIRC is the accuracy a person with 100 firing accuracy with no bonuses or penalties would have with that weapon. What you see when you click on the weapon and the shot type menu comes up is what you get.

EDIT: Wait, checking the wiki, what’s going on is that the listed accuracy is actually a control on deviation instead of raw to-hit. But all those factors do feed into both the formula and the listed to-hit.

I would not be at all surprised to find bugs, and the ‘to hit’ descriptors often feel off (they’re nonfunctional for psionics). But I don’t think it’s a big problem; I tend to notice lucky streaks more than bad luck. (Also, just to be sure, you aren’t savescumming, right? A surprising number of people out there refuse to learn how a PRNG works.)

In some of the Fire Emblem games, when calculating attacks the RNG actually generates two numbers between 1 and 99 and averages them, then compares that to the listed percentage. This makes things with odds greater than 50% more likely than listed, and things with lower odds less likely. This is good for the player, because the player has fewer and better units, so usually the player’s units will have high hit odds and enemies will have low hit odds.

To fully explain that, one must get into Fire Emblem math. Basically, everything has fixed effects on the rated percentage; for example, each point of skill adds 5% to your to-hit and subtracts 5% from the enemy’s. So changing things up to get different bonuses would make the actual percentage change in weird ways that would confuse players a lot, leaving people who didn’t know about True Hit completely in the dark and not really helping the people who did. The thing is, unlike in many video games, hit odds in Fire Emblem are not calculated via magical mystery box; all the numbers one needs to calculate the odds of hitting in a fight are available in both simple (hit and avoid) and complex (several stats, relevant skills, possibly items) forms on the character sheets. Although showing the true odds would mean not needing to take True Hit into account, it would make it much harder to convince people that the Attk and Avoid values on the character sheets are not complete gibberish. If you don’t know about True Hit, it’s easy to qualitatively understand how a fight between someone with Attack 140 and Avoid 40 and someone with Attack 95 and Avoid 35 will go, and if you do, it’s simple multiplication to find out exactly how it will go.

Also, humans are bad at probability. True Hit makes probability work more like people expect it to than how it actually works. People intuitively expect something with 95% odds to fail far less often than 1 in 20 times, even though that’s wrong.
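The two-roll averaging described above is easy to simulate. This sketch follows the comment's description (rolls from 1 to 99; actual Fire Emblem titles differ in small details like 0-99 ranges and strict vs. non-strict comparisons):

```python
import random

def plain_hit(listed, rng):
    """One roll against the listed percentage."""
    return rng.randint(1, 99) <= listed

def true_hit(listed, rng):
    """'True Hit': average two rolls, then compare to the listed value."""
    return (rng.randint(1, 99) + rng.randint(1, 99)) / 2 <= listed

rng = random.Random(1)
trials = 200_000
listed = 75
plain_rate = sum(plain_hit(listed, rng) for _ in range(trials)) / trials
true_rate = sum(true_hit(listed, rng) for _ in range(trials)) / trials
# A listed 75 lands roughly 75% of the time with one roll,
# but closer to 88% with the two-roll average.
print(round(plain_rate, 2), round(true_rate, 2))
```

The averaging squeezes the distribution of the effective roll toward the middle, which is exactly why high listed odds overperform and low ones underperform.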

KoTOR had a pretty ‘open’ system- aside from the few bugs such as Sith Fury not granting extra attacks, it was really easy to see what you were doing and how to improve. Now, that’s not to say it didn’t have tons of problems, but at least I can say exactly what they were.

Firaxis has another very interesting angle on this situation: a conscious design decision to lie to the players about the numbers. So when Civ tells you that you have a 2:3 chance of winning a battle, you actually don’t.

Basically, when they playtested they found the human brain judges probability so poorly that players expected to win a 1/3 battle more than 1 in 3 times, and expected the enemy to lose a 2/3 battle _more_ than 2 in 3 times, so they changed the odds to suit, because otherwise it’s natural to feel frustrated by the mismatch between expectations and reality. They even take previous battle history into account.

So there’s absolutely no chance with them of a player spotting if something is a bug or not =D
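For illustration only, an "expectation matching" adjustment of the kind described might look like the sketch below. The function name and every number in it are invented for this example; this is not Firaxis's actual formula:

```python
def actual_win_chance(displayed, recent_player_losses):
    """Hypothetical fudge: the real odds sit a little above the displayed
    ones, and climb further after a streak of player losses, so outcomes
    feel like what players expect. Purely illustrative numbers."""
    bonus = 0.05 + 0.05 * min(recent_player_losses, 3)
    return min(1.0, displayed + bonus)

# A displayed "2/3" battle is actually better than 2/3,
# and better still if the player just lost twice in a row.
print(actual_win_chance(0.66, 0))
print(actual_win_chance(0.66, 2))
```

The point of the commenter stands either way: once the displayed numbers and the real ones diverge by design, a player has no way to distinguish a fudge from a bug.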

First was the idea that designers should not do X, Y, Z. Like at [11:34], where he refers to unexpected moral choices, or presenting a certain kind of initial gameplay and then changing gears later. I would argue the opposite: players absolutely want these things. A lot of what he argues against are the hallmarks of great games if done correctly. They are also the hallmarks of terrible games if done poorly. It’s a high-risk/high-reward situation. Designers just need to be aware that it’s very difficult to pull off successfully and can single-handedly destroy a game if they fail. (Spec Ops and Jade Empire do it well; Fable does it poorly.)

Bringing it back to the topic of “trusting the system,” Sid counsels having the game cheat in favor of the player in an effort to match player expectations. But then he talks about how players don’t trust the AI and think it’s out to get them. Sid seems not to understand that players have good reason not to trust the AI. Once trust in the system is gone, it’s gone, and it needs to be earned back by the game.

Distrust, not trust, is the game’s starting point for good reason. The first reason is bugs, like the ones we’ve been talking about here. The second is that Sid just talked at length about how his games cheat using hidden mechanics; this is doubly true of his Rise/Fall ideas @10:59. And the third is that it’s common practice: a lot of games do in fact cheat. Some have elastic difficulty curves, and it’s obvious.

Sid’s solution is to manipulate both the mechanics and the player into liking/accepting the outcome so that trust doesn’t matter. That solution works, sure. But an alternative solution is giving the player enough info and tools to trust the system. I believe the latter is a better overall approach. Harder to successfully pull off, but can result in an overall better game. (Original XCOM did it well, XCOM 2012 does it poorly due to bugs.)

In fact, according to Sid’s video, the original XCOM would be a perfect example of what not to do. It goes against pretty much every separate point he makes: hard first 15 minutes, high randomization, surprising AI tactics, game mechanics that change later (psi), easy save/load, etc. But it is one of the best games of all time because it successfully pulls those things off.

I fully understand the desire to make the rules fully transparent. Sadly, though, tabletop games have their own “bug” issues too. For years, possibly decades, it was assumed magic items were a necessity for higher level play in D&D, especially for non-spellcasters. Then the guys behind the Trailblazer supplement actually crunched the numbers and calculated high-level fighter types hit Challenge-Rating appropriate enemies about 2/3 of the time without hit bonuses from magic weapons. Too many pluses actually breaks the game, but the official rules still tell you “Nth level character should probably have a +X weapon and armour”.
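The Trailblazer-style claim is easy to sanity-check with basic d20 math. The attack bonus and AC below are assumed illustrative values chosen to show the shape of the calculation, not numbers taken from the supplement:

```python
def d20_hit_chance(attack_bonus, ac):
    """P(1d20 + bonus >= AC), clamped for the usual auto-miss-on-1
    and auto-hit-on-20 rules."""
    raw = (21 - (ac - attack_bonus)) / 20
    return min(max(raw, 0.05), 0.95)

# An assumed high-level fighter type with a +27 total attack bonus
# (no magic weapon) against an assumed CR-appropriate AC of 36:
print(d20_hit_chance(27, 36))  # 0.6 -- in the "about 2/3" neighborhood
```

The interesting part is how flat the function is: each +1 from a magic weapon only moves the needle 5%, which is why stacking pluses past a point distorts the game more than it helps any single roll.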

This realization is a boon for DMs and players who prefer magic items to be rare and special like they are in most fantasy novels and who dislike the magic item commodity market implied by early editions of D&D and thoroughly embraced by 3rd and 4th editions. But can you imagine trying to sell a computer RPG with little to no loot?

With the first attack out of a full attack, or the whole lot? And how do they measure up damage-wise, without magical enhancements or Power Attack, compared to spellcasters? Is this assuming two-handed, or sword and shield?

I’m not saying I don’t believe that, but there can be a difference between hitting 2/3 of the time and actually meaningfully contributing to the combat.

Basically, instead of being applied iteratively, a lot of those stats move values around on a big table. So your initial phase of testing is to write debug code that outputs the table. Equip a piece of gear, inspect attack and damage tables directly, repeat for every property in your power/item system.

That’s strictly deterministic, and will catch everything but the really weird corner cases (+might only fails to work when on an item that also grants +toughness, that kind of thing).

For everything else, there’s aggregation. This is how players figure out the combat systems for games. One fight in isolation may be too little to tell reliably whether or not an item is working. But log 10,000 fights, and the math can point you in the right direction.
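The aggregation approach can be shown with a toy damage roll. The 2d6 formula and the +2 item are assumptions for illustration, not any real game's numbers:

```python
import random
import statistics

def attack_damage(rng, might_bonus=0):
    """Toy damage roll: 2d6 plus a flat bonus from gear."""
    return rng.randint(1, 6) + rng.randint(1, 6) + might_bonus

rng = random.Random(7)
fights = 10_000
without_item = [attack_damage(rng) for _ in range(fights)]
with_item = [attack_damage(rng, might_bonus=2) for _ in range(fights)]

# Over thousands of logged fights a working +2 item separates cleanly
# from the baseline; a silently broken item would show two nearly
# identical means instead.
print(round(statistics.mean(without_item), 1))
print(round(statistics.mean(with_item), 1))
```

One fight tells you almost nothing (a 2d6 roll spans 2 to 12 on its own), but the means over 10,000 fights sit close enough to their true values that a missing +2 sticks out immediately.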

Of course, sanctioned testers work under different constraints than outside users. They don’t have as many hands, but they can do things like spawn a monster with one trillion hit points and hit it for the next hour.

Of course, testing an MMO is still a huge undertaking… And “item/power failed to provide the stated effect” is pretty common in MMO patch notes.

At least for combat stats you can test by finding some big enemy groups and carefully observing the numbers. But what if the bug is in the % chance to find magical loot? How are you supposed to see the difference between the base 10% chance to find magical loot and a bugged +5% (so instead of the new 15% you still get 10%) that doesn’t actually exist? You’d have to know exactly how the RNG works in this game in order to test this, because sometimes that 10% can give you 10 magical items in a row, and 15% can end up giving you only 1 out of 100, even when everything works properly.

So yes, I don’t trust the system. Because the base state of any piece of software is “bugged as hell.”
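The instinct above is right: telling a 10% drop rate apart from 15% takes far more kills than casual play provides. A rough sample-size estimate using the standard normal approximation for proportions (illustrative only; it ignores statistical power and just asks when the gap exceeds the ~95% confidence noise floor):

```python
import math

def trials_per_group(p1, p2, z=1.96):
    """Rough number of kills per condition needed to distinguish two
    drop rates at ~95% confidence, via the normal approximation."""
    p_bar = (p1 + p2) / 2
    return math.ceil(2 * z**2 * p_bar * (1 - p_bar) / (p1 - p2) ** 2)

# Hundreds of logged kills per case just to *suspect* the bonus is missing:
print(trials_per_group(0.10, 0.15))
```

And that is per condition, with and without the item, under otherwise identical circumstances, which is exactly the kind of controlled grind no ordinary player will ever do.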

Any time you ever introduce the concept of an RNG into a program, you’re going to have bugs. Guaranteed. Doesn’t matter how simple your program is or how rigorously you test it, I maintain that there will still be some weird case where the RNG throws out something you didn’t expect and everything goes haywire.

A developer can try to work around this by testing edge/corner cases (at the upper and lower bounds of allowed values) directly or by using the same seed for the RNG every time until you “know” it works, but there’s still going to be something they didn’t foresee, and it’s certainly not possible for the user to do this sort of testing.

Ah, the infinite abyss that is bug testing. The proper way to do this is to write up an exhaustive table of exactly what values might be output in any conceivable situation, then log fights in every one of those situations, noting down any case where a number not in the table comes up, and use actual stopwatches to time cooldowns. Or you could swap out the RNG for a known, set order of values and test that way.

The improper way is to not do those things. This is also the less difficult way.
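Swapping the RNG for a known sequence, as suggested above, works cleanly if the combat code takes its randomness as a parameter. A minimal sketch (the crit rule here is invented for illustration):

```python
class ScriptedRNG:
    """Test stand-in for the real RNG: returns a fixed sequence of values."""
    def __init__(self, values):
        self._values = iter(values)

    def randint(self, lo, hi):
        value = next(self._values)
        assert lo <= value <= hi, "scripted value outside requested range"
        return value

def crit_damage(base, crit_chance_pct, rng):
    """Double damage when the percentile roll lands at or under the crit chance."""
    return base * 2 if rng.randint(1, 100) <= crit_chance_pct else base

# Force one crit and one normal hit, deterministically:
rng = ScriptedRNG([10, 90])
print(crit_damage(50, 25, rng))  # 100: a roll of 10 crits at 25%
print(crit_damage(50, 25, rng))  # 50: a roll of 90 is a normal hit
```

With the randomness scripted, every branch of the damage code can be exercised on purpose instead of waiting for the dice to cooperate.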

Now, there are a lot of ways to test components in isolation and see if they work, but for any large piece of software that doesn’t quite cut it. The whole thing needs to be tested.

This isn’t limited to MMOs. IIRC, Mana Cleanse from DA:O does exactly the opposite of what it says. It’s saved from uselessness by Mana Clash, which works as intended.

DA:O is truly frightening in the amount of stuff that flat-out doesn’t work even close to as advertised. It doesn’t help that it’s running on its own ruleset, so there’s no good way to figure out the math behind stuff.

The Blood Mage AoE gives it a run for its money, because some enemies don’t have mana, and also it’s a huge no-friendly-fire AoE with high damage and a chance to stun.

But yeah, Mana Clash causes the most difficult fights in the game to just die. If it goes off at the start of combat, there’s maybe one mage in the entire game who doesn’t just die (discounting demons), and there are very few fights where the AoE won’t engulf every hostile mage.

I remember that on launch, cone of cold was pretty much the only skill you would ever need. It froze everything in a fairly large AoE, even the toughest bosses, for a long duration and had a short cooldown. And did lots of damage on top of that. It turned even the hardest fights into a cakewalk.

Probably. At present it only briefly freezes bosses, though it still does decent damage and non-bosses hit with it can be nailed with high-damage attacks or certain special spells and literally explode.

Automated unit testing would help here. First test all the parts of the chain, then combinations. And this can be done fast enough to get useful statistics (for those +/- percent effects). Any decently designed system will have the damage calculations separated enough to facilitate this (though of course shortcuts happen, especially at crunch time).

Another point is that unit testing is often done by the programmers, not by the QA department, so there is less internal politics to consider.
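A unit-test pass over a damage chain might look like this. The formula is a toy stand-in, assumed for illustration, not any particular game's:

```python
def apply_resistance(damage, resist_pct):
    """One link of the chain: reduce damage by a percentage resistance."""
    return damage * (1 - resist_pct / 100)

def apply_armor(damage, armor):
    """Another link: flat reduction, never below zero."""
    return max(damage - armor, 0)

def incoming_damage(base, resist_pct, armor):
    """The combined chain, applied in a fixed, documented order."""
    return apply_armor(apply_resistance(base, resist_pct), armor)

# Test each link in isolation, then the combination -- the kind of
# check that catches a 2% resistance coded where 20% was intended.
assert apply_resistance(100, 20) == 80
assert apply_armor(80, 30) == 50
assert incoming_damage(100, 20, 30) == 50
print("damage chain OK")
```

Testing the links separately is what makes the combination testable: if the composed number is wrong, the isolated tests immediately tell you which link to blame.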

In my opinion, the amount and severity of bugs, and also the way they are handled, says a lot about a developer’s attitude towards the development process. A good developer will have a good system in place for handling bugs, an understanding that tests are a necessary part of the development process, and the attitude that bugs need to be fixed in the correct way. A bad developer, not so much. It’s amazing how often this turns out to be true. (I also consider this a lot when I’m thinking about switching jobs.)

Perhaps you could liken this to the classic “Identify” spell. So, the wizard casts identify on this nifty glowing sword you just found in that chest. If the spell is considered successful, does the wizard (the in-game character, not the player) now have a fully detailed and itemized stat sheet appear in his hand? Why no, of course not. He gets some indication of the magic of the sword, and that it isn’t cursed, and that should be it.

So, perhaps NOT knowing everything about a game system would be considered part of the realism…???

It would be really interesting to make a game built around this idea. A game where you don’t have (visible) hitpoints, but instead you just have to judge based on how bloody and injured someone looks. You can’t see how much damage you’re doing, but instead just get icons for “glancing hit”, “massive hit”, etc. You can’t see the level / class of foes, you can only see vague descriptions: Weak, formidable, perilous, etc.

It would mostly kill the looting mechanics, but it would make the world feel a little more uncertain, which might bring something new.
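The "no visible numbers" idea amounts to bucketing the exact values the engine already has into coarse descriptions. A sketch, with invented thresholds:

```python
def describe_hit(damage, target_max_hp):
    """Map an exact damage number to the vague feedback a player would see."""
    fraction = damage / target_max_hp
    if fraction < 0.05:
        return "glancing hit"
    if fraction < 0.20:
        return "solid hit"
    return "massive hit"

def describe_health(hp, max_hp):
    """Show bloodiness instead of a health bar."""
    fraction = hp / max_hp
    if fraction > 0.7:
        return "unhurt"
    if fraction > 0.3:
        return "bloodied"
    return "near death"

print(describe_hit(3, 100))      # glancing hit
print(describe_health(25, 100))  # near death
```

Note that this only changes the presentation layer: the exact numbers still exist underneath, which is why (as a later comment points out) the same bugs can still lurk there, just harder to spot.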

Maybe take the same ‘taking damage’ feel from modern first-person shooters? “Ugh, the screen is all red and my characters heartbeat is really loud. That last hit was really bad so I should probably heal soon.”

I like this idea.

Maybe even have enemies LOOK formidable so you know not to mess with them. You could probably get away with not having ANY text on the screen to indicate anything. That has the potential to be really immersive… “That guy is huge! I don’t think I want to mess with him.”

Some games have done exactly that. Pretty sure King Kong did. But the type of game that can work in is limited to games where the player doesn’t get to make any decisions involving choices between mechanics.

For example, as soon as the programmer offers a player a choice between “Boots of Agility” or “Boots of Toughness” with no additional info (no numbers nor what Agility or Toughness even mean) the programmer still has to make sure that it’s working as -he- intended. Assuming he wanted them to reduce damage taken then he has to check and make sure that it’s not increasing damage taken or some other weird unintentional effect. Same thing if it wasn’t an item but some kind of trait or ability given to the player. But at that point the decision itself becomes worthless because the player is just flipping a coin without enough info to make a rational choice.

It can work, but it has to be a specific kind of game which excludes MMOs. Uncharted comes to mind.

You could have a skill that identifies, so all you get are “Boots” at the beginning and increasing identify gives you “Magic Boots”, “Magic Boots of Toughness” and then more information like: ‘Increases the amount of damage you can take before dying’ or ‘Reduces the amount of damage you take from melee attacks’

That still requires that Magic Boots be empirically better than plain Boots. If it says “Increases the amount of damage you can take before dying” and that’s its designed intent, it actually has to do that. So someone still has to check that it’s true, and it’s still a bug if it doesn’t do it, or does it to the wrong extent.

The boots of vagueness don’t get around the issue. As soon as you describe something one way to a player (or even just to the designer), the computer has to agree. The only way to avoid the potential for a bug that skews the relationship between Choice A and Choice B is to remove the choice entirely.

It’s merely frustrating and annoying in a game. But in other situations, such as a voting machine, it can be truly terrible.

I would be in total support of this. I just think some early modders would figure out the math anyway, post it online, and then players will just refer to the wiki that translates “Uncertain RPG: the Game with Qualitative Rules” into quantitative rules.

On a related note, do you think it would be possible to sell an RPG, MMO or otherwise, that de-emphasized or outright eliminated loot?

I’d say that Mass Effect 2 de-emphasized it some: there were still credits and upgrade schematics to be found, but it wasn’t central, and you had a significant source of income separate from loot. (I liked it more than the median poster here, I think, but I’ve never been hugely into the details of gear customization.)

A related question: level progression is a purely game-based mechanic, one of the many things ultimately inherited from D&D, which has no real counterpart in reality or most fiction. People and characters become more skilled and experienced over time, but not with that kind of regularity or on that kind of scale. Other genres of games don’t necessarily have it– Pac Man always had one hit point; in Portal the player learns ever more complex ways of thinking with portals, but Chell doesn’t become tougher, etc.

Would it be viable to have a CRPG in which weapons and enemies remain mostly at the same power level throughout the game, and character advancement isn’t a primary mechanic? (Or has it been done? I’ve been in and out of computer gaming over the years, and could easily have missed it.)

Shooters are largely about loot, though. Your ability is entirely dependent on what weapons you find to use, and your ability to keep using them is reliant on your ability to find ammo for them – both things are loot.

City of Heroes, before Inventions, was a pretty good example of a loot-free (ish) game, however.

I’d have liked to see more varied role-playing choice in ME2 (and in CRPGs in general), but that strikes me as orthogonal to issues like character customization, leveling, and the role of loot. In principle, you could have a linear shooter with none or all of the above or a sprawling plot full of meaningful character choices ditto.

“RPG” tends to get applied to things like customization and inventory in CRPGs, which strikes me as odd given that most of my history is on the tabletop; but that may just be a difference in dialects. (There may be some potential ambiguity if “RPG” can mean either a game where choices of dialog and action define a character’s path, one with D&D-like stat and equipment options, or some combination of both.)

Just to give some tabletop RPG comparisons: a game like Champions will generally have no inventory or looting at all (since both the genre and the mechanics discourage keeping enemies’ or found equipment past a single adventure), and for many game systems tournament games with prerolled characters aren’t uncommon.

(Hm– how did Champions Online handle loot?)

Character advancement (whether via levels, point buy, or improve-through-use) does tend to be standard across all but experimental indie games. Though some designers have been known to bemoan that, and at least some systems (e.g., Chaosium’s Basic Role Playing) make it somewhat slow and self-limiting.

Champions Online had standard MMO loot – equipment for various body slots with attribute bonuses upon them. They had no effect on appearance, but were otherwise pretty much exactly like EQ/WoW/any other MMO.

Dark Souls comes pretty close. There is a levelling mechanic, but it is entirely possible to beat the game without levelling at all. Your equipment matters much more to how powerful you are than your level does, and in fact levelling too high can make things harder by limiting who you can summon to help and allowing players of higher level to invade you.
Enemies and equipment do get better as you go through the game (and really, that sense of progression you get from good RPGs is a large part of the enjoyment), but your character him/herself stays at a pretty low level of power; they just find better stuff and learn new tactics.

The problem with the learning-new-tactics form of progression is that it can pretty much only work once. If Portal 2 hadn’t introduced new puzzle mechanics but was just portals again, you would lose all sense of progression, because many players would start off fully skilled with portals and have nowhere to go.

And the same goes for shooters: when someone plays CoD8, they aren’t going to be a noticeably better shot by the end of it, because they’ve been good since all the way back in CoD4, so you can’t rely on genuine improvement in player skill to give a sense of progression. (I reckon; this is a relatively new line of thought for me.)

The one sense of progression all CRPGs could have, non-stat-based and sustainable, is becoming familiar with a region or area. If you design a game so that knowledge of places, people, and locations is the main skill in completing challenges, you can just build a new world for your next game.

Actually, does anyone know of games where the progression mechanics are specifically centred around that? So you need to help a person who needs guns, which you can do by going to the shop and buying them, but with knowledge of the city you realise that there’s an illegal gun smuggler in alleyway X. Or there could be lots of quests involving finding things, making judgements based on people’s personalities, etc. Or you have to assassinate people for quests, but as you become more familiar with the game you know people’s personal habits better and so can find places to take them out more easily.

Probably very hard to design as your main focus and requires a lot to make :( Would be fun though

And there’s the problem. Computers are, at their heart, really big calculators. They’re all about numbers. So there’s no real way to avoid them, unless you used severely randomized stats behind the scenes (to keep anyone from being able to figure out anything).

Fallout 1 and 2 were a little like this. Without a perk you couldn’t see how many hitpoints the enemy had, though you could see how much damage you were doing and adjust your tactics if you didn’t do much.

Lots of good comments in response to this, so I’m just going to respond in-line.

You’re getting close to an issue that’s rarely addressed: the perceptive layer. People don’t perceive exact things, they perceive approximations. Even our ideas are approximate.

We are very used to computer games telling us EXACTLY what is happening in a game, merely because exact information is available. In real life it’s extra work to generate precise information. How heavy is this sword? Um, about five pounds. Oh, I need more precision, I guess I’ll have to get a scale. Let’s see, my bathroom scale says it’s 4.2lb. Need more accuracy? The guy down at the post office says “Hey, you can’t bring a sword in here!” In computer games the weight of the sword is stored down to eight (or more) places (4.18753664 lb which is silly since at that precision the weight of the sword would fluctuate depending on whether it has oil from your hand on it, or how often you sharpen it, or how much rust there is under the binding on the handle… but I digress) and it’s actually harder for the programmer to tell you the approximate value than it is to just give you the exact number.

There’s also error of perception. You may think that the sword weighs five pounds when it’s really ten, or three. You may think you got a good hit, when you actually just dinged the armor. Error and imprecision of perception all takes extra work to put into games, but it’s a part of life. As games get closer and closer to “real life” we’ll see more and more of this kind of thing.

But there’s also error of conception. In the example of weight above, the computer stores an unrealistically precise value and makes unrealistically precise calculations. How much do I weigh? About 140lb. But how much do I really weigh? Exactly? Well, after about four decimal places the very idea of “weight” starts to break down. Do you mean with or without the air in my lungs? At what time of day? Does the dirt on my skin count? How about the air in the pores of my skin? The water in the dead cells on my skin? This is semantics, and the semantics of the word “weight” break down eventually. Even God couldn’t tell me my weight to twenty places, because the very idea of “weight” breaks down long before that. Eventually games will start modeling semantic ambiguity as well, but it’s going to take a while.

Hey, the old Resident Evil games did this, in a fashion. In the pause menu you could see a generic Fine/Caution/Danger bar that didn’t precisely tell you how many more hits you could take and your player moved slower if hurt or limped if badly hurt.

Though, of course, it only did this for the player and not the enemies, so it wasn’t much use for this purpose.

Sounds similar to the issues that tabletop roleplaying games have with complexity and character options. A lot of people have been rediscovering the virtues of simpler older-edition rules, particularly those of original D&D or Metagaming’s Melee/Wizard.

Not to say that games like GURPS, Hero System, D&D 4.0 and D&D 3.X are not fun, but the added choices and complexity of those systems come at a price. One of those costs is weird corner cases where slavishly following the rules produces counter-intuitive results.

The advent of retro-clones and simpler rule systems has caused many in the tabletop industry to reconsider the idea that more choices and more complexity are always a good thing.

With the various recent releases of computer RPGs, it’s worth asking which ones need a complex combat system in order to be interesting and which ones could have had a simple combat system and still been interesting to play.

I don’t think it’s really that similar. The question of game balance certainly exists (and there’s a whole sub-discussion revolving around how much you should even worry about that, or whether it’s just theoretically impossible to make happen, or what). But the actual mechanics are not operating differently from how you think they are, because you’re the one making them happen and you’d notice. If the computer subtracts 2% damage for a resistance instead of 20%, you might never know, because you can’t tell whether it rolled what it rolled for damage or a number 18% higher. But if you get hit for 100 damage and you say brightly to the GM, “Ah, but I have a 20% resistance!”, then you know whether you noted down 80 or 98 damage on your character sheet.

As to the point you’re making, which doesn’t relate too closely to what Shamus is talking about . . . whatever. You can rules-lawyer even the simplest fudgiest game if you really want to. And a strong GM will, presented with an odd corner case in GURPS or some other relatively rules-heavy game, say “OK, so this happens instead because it makes sense” or even “OK, so this happens instead because it’s way more dramatic”. Something which, again, computers don’t really do. And something which if you look at those heavy rules, they tend to repeatedly advocate.
Certainly I don’t think the solution to problems with current rules-heavy games is to go back to old-school games. There are modern games designed to be simple, flexible and dramatic (Fudge, Somewhat etc) which give you freedom with the simplicity. Melee/Wizard is just limiting. Like other older games, rather than a simple system it’s an incomplete system. It’s not flexible, it just has ironclad rules for a very small problem space, and beyond that you’re playing cops and robbers. It would be ridiculously complicated if you added enough options to let you do much.

The problems of tabletop games are not really very analogous to the problems of computer games.

“How does anyone know is the system is actually working properly?” Should it be “IF the system” ?

Actually, I remember doing a lot of testing in MMORPGs on certain powers or pieces of gear, to see if one was really better than another.

The fact that you know of someone having found a bug like this shows that, at least in MMORPGs, the players are an extended free QA team: power gamers exist and will find out if something is not working as intended, even by a few points.
If this ring against fire doesn’t seem to protect me and has some value, I will test it with a friend or on a mob. I’ll take a large chunk of fire damage and see how, statistically, the ring really affected the numbers.

Probably wouldn’t trust, no. On the programming side, I think I’d want to keep the randomizing damage part very separate from all the other stuff and have something where I could run pseudo-combats (with, like, infinite hit points or whatever so I could observe stuff for a while) and look at the flat numbers I got before the random factor was added. Compare it to spreadsheets with all the numbers and percentages. That’d make it a lot easier to tell if something was off.

Seconded. The first thing I wanted to see was a debug window where I could toggle on and off different parts of the combat math. That would solve Shamus’ specific problem, but really you need unit tests and probably an expensive automated test setup to catch stuff like +magic find.

I think this is exactly why I have little patience for fussing with my character’s numerical abilities and stats in a game. As a QA person I know damn well that it’s nearly impossible to test every possible permutation, so I have little trust that my choices matter.

I spend WAY more time determining what color of bunny slippers my character is going to wear than whether or not I’ve put enough points in one of her passive abilities. I can SEE the bunny slippers. I can’t see how that extra point in bullet deflection actually affects my character’s mortality rate in combat.

‘zactly. It doesn’t matter if it’s trustworthy or not. I’m still gonna play it. None of the subtle interaction type bugs are going to really affect that in the long run, because they can’t have any really deep, profound, long-ranging effects or limit game play without becoming unsubtle.

And the little things, which plus or minus 5%, aren’t going to affect me much. Because I don’t look at the numbers on the tool-tips that much. I tend to mostly look at the white ones and how fast things go down. If it’s not fast enough, I try something else. If it is, I stay with it until it’s not.

I do the same, but for a different reason. Most of these games have too high a turnover on equipment. There’s literally no reason to care; just make sure it looks good and the numbers are vaguely higher. You’re going to replace it soon anyway, so even considering what’s better almost seems like a waste.

When I read this article yesterday, I had Torchlight 2 on hold in the background. Your comment nicely matches my experience. I’m only lvl 16ish and I’m already shift-click-unloading everything onto my pet (sending it off just to get rid of all that junk) that’s not a) part of a set or b) something obviously cool like +120 health. Just… can’t be arsed ;) (But then again, what is called the ‘normal’ difficulty does not exactly invite me to care about getting better. I just hold the mouse button and stare at the clusterfuck of stuff until Mr. Berserker reappears and the fight is over.)

That whole “Recalculated fire damage slightly” comment reminds me of the many huge balance updates that I’ve seen referred to with tiny one-line updates in patch notes. “Balance changes” could be anything from “The Eye of Death item now functions as advertised” to “Reduced damage on Lightning +4 weapons by 2%” to “Reduced duration of Magic Shield spell by 80%.” All three examples above are brought to you by the same Dark Souls patch.

This isn’t just an issue for combat systems in MMORPGs. The same sorts of complexities exist in any game that has bonus/leveling/upgrade/research mechanics.

One bit of upgrade vagueness that frequently bugs me is when you see “2% increased chance of magic find” (or critical hit or whatever), but there is no indication anywhere of what the base chance is, or even whether the bonus is additive or a multiplier on that base chance. I.e.: if the base chance was 10%, is it now 12% (0.1 + 0.02) or is it 10.2% (0.1 * 1.02)? IIRC, Diablo 2 did something like the latter, which annoyed the hell out of me when I found it out.
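For what it’s worth, the gap between the two readings is easy to see in a few lines (the 10% base and 2% bonus are just the hypothetical numbers from above):

```python
# Hypothetical numbers: 10% base chance, a "+2%" bonus read two different ways.
base_chance = 0.10
bonus = 0.02

additive = base_chance + bonus               # 0.10 + 0.02 = 0.12  -> 12%
multiplicative = base_chance * (1 + bonus)   # 0.10 * 1.02 = 0.102 -> 10.2%

print(f"additive: {additive:.1%}")              # 12.0%
print(f"multiplicative: {multiplicative:.1%}")  # 10.2%
```

Either reading is defensible, which is exactly why a tooltip that doesn’t say which one it means is useless.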

The old turn-based strategy game Galactic Civilizations II was really bad about this sort of stuff. The research technologies were full of things like “+15% bonus on x”, when it wasn’t a percentage bonus at all; it was a flat +15 to some hidden internal number. Or worse, it sometimes did oddball things like use the square of the percentage bonus (i.e. +4% actually gives 0.04 * 0.04 = 0.0016 = 0.16% bonus).

Damnit Shamus. I did trust the system, because I’m a naive fool at heart. But now you’ve sown the seeds of distrust.

No, really, it’s an interesting notion. Besides the usual more obvious game-impacting bugs, though, I think that if problems like this exist wide-scale (I have zero doubt they exist small-scale) detecting and solving them is unlikely. Both for reasons of cost effectiveness and because of how easy it can be to work around.

Also, the tooltip for that lolita bunny game that reads ‘Terra Online’ should be ‘TERA Online’ instead. Nobody cares about this, at all.

On a side note, if the squad screen can be believed, ME3 is a game where bonuses do stack–or at least the little bars at the bottom of the screen that fill up as you choose one upgrade path or another seem to imply that.

No, your sword does *up to* 95 damage. Which means one time in some number, the plus 5 might matter. But that’ll get lost in the one time in some much smaller number that you do a critical hit and get some bonus amount piled onto your result. This is why Shamus talked about this as being very complicated, and why mostly (to me) it doesn’t much matter.

True, but I don’t think it invalidates the basic point: A relatively small bonus can be disproportionately effective if your damage levels are in certain zones relative to your opponents’ capacity to take damage, such as tending to do damage that’s *close* to enough to kill something, but not quite there. Bump it up a little bit and you’re often one-shotting things instead of two-shotting.
Compare to a situation where you’re tending to do a bit more than half; small bonus takes you from two-shotting to . . . two-shotting, making no difference.
Or if there’s a threshold for knockdown, stunning or whatnot, a small bonus could tip you over the edge. Whether you want to worry about that kind of thing a lot is another question.
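That threshold effect can be sketched with made-up numbers (a 100 HP mob and a few damage values; none of this comes from any particular game):

```python
import math

def hits_to_kill(hp: float, damage: float) -> int:
    """How many hits it takes to bring hp to zero."""
    return math.ceil(hp / damage)

# Hypothetical 100 HP mob:
print(hits_to_kill(100, 95))   # 2 -- just short of a one-shot
print(hits_to_kill(100, 105))  # 1 -- a small bonus tipped you over the edge
print(hits_to_kill(100, 60))   # 2
print(hits_to_kill(100, 66))   # 2 -- the same +10% bonus changes nothing here
```

The bonus only matters when it moves you across a ceiling; everywhere else it’s invisible.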

But in the majority of cases, it doesn’t matter. And in most of the cases where you one-shot them instead of two-shot them, the increased kill speed doesn’t have much impact. It’s only that once in a long while moment where your character is taken down to effectively 1 HP before you finish off a monster, and you actually needed that little bonus.

Reminds me of all of the times when I was playing LOTRO and would get a big Devastating attack for 5000+ damage on an enemy that had 100 Morale left, and realized that the effective difference between me and someone with sub-par gear was nil at that point -we would have both killed the enemy in the exact same time despite the difference in our gear.

It really depends on the (type of) game and the numbers, plus the level you’re playing at.
Diablo II comes to mind – there were very precise break points in a lot of skills; and for highest-level play, the difference between, say, +176% attack speed and +177% attack speed could be the difference between kicking Baal’s butt and having a VERY hard time, due to what’s been discussed above – technical limitations, rounding, imprecision. That difference could bring your attack from taking 5 frames to 4 frames (at 25 frames per second), meaning a 25% increase in attacks per second. Adding another 40% of attack speed wouldn’t change a thing.

I almost entirely trust Guild Wars 1; can recall one case where the description was vague, and that skill has been changed already (Aura of Holy Might, for reference). I’ve also manually tested a ton of stuff (including crit damage, chance, armor values, skill bonuses, armor-ignoring damage…), and saying the two wikis are comprehensive is an understatement.

Other than that? Only the simple/deterministic ones, such as FTL’s :P Hit/miss, shield, system, hull, augments. Nothing really complex going on there. Fire’s probably the least obvious due to spread.

Unrelated; I sooner recognised Sovereign by the text than by the image.

This is the reason why, when I play these kinds of games, I always choose my gear and stat loadouts based on actual gameplay rather than the comparison tooltips.

This is also why I like games like DDO where bonuses are BIG so you can usually tell whether they’re working properly or not. When you’re doing 25-30 damage on a swing, you can tell relatively quickly whether that +5 damage is working or not.

As for testing, some of the players go to the trouble of looking up enemy HP in memory, executing an attack, then checking the HP again to see if the actual damage equals the expected… a couple of bugs have been found this way. I’m sure QA with the developer tools have better methods of finding such bugs.

But yeah, the “tooltip doesn’t match actual value” is pretty crazy, one would think they would both be from the same data/variable so that what is displayed has to be what is actually applied… so that’s weird.

They’re not coming from the same data, though. Tooltips are stored client-side, generally. The actual skill is ‘stored’ server-side. This is because letting the client have access to the actual abilities makes it far more susceptible to hacks and exploits.

So it goes like this:
Client: This guy just used Sinister Strike on Oscar the Orc.
Server: That guy has Sinister Strike and can use it right now, Oscar is in range of Sinister Strike, so he uses it. Since he has stats of X, he deals Y damage.

This is why you’ll often see hotfix notes like “So and so skill now does 320 to 560 bonus damage. The tooltip will not be updated until the next patch.” They can change your skill whenever they want. They have to patch your client to change tooltips, though.

Except that the client/server divide does not necessarily mean they aren’t coming from the same data; the data on the client could be exactly the same as the data on the server, and I would expect that it is most of the time. There’s no reason to have two independent copies of the data, rather than them both coming from the same place in the development data.
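One way to square that circle (a hypothetical sketch, not how any particular MMO actually does it): keep a single skill table in the development data, and generate both the server’s math and the client’s tooltip text from it at build time, so they can’t silently drift apart. All the names and numbers here are invented:

```python
# One shared skill record drives both sides.
SKILLS = {
    "sinister_strike": {"base": 320, "per_str": 1.5},
}

def server_damage(skill: str, strength: int) -> int:
    """What the server actually deals."""
    s = SKILLS[skill]
    return int(s["base"] + s["per_str"] * strength)

def client_tooltip(skill: str) -> str:
    """What the client displays -- generated from the same record."""
    s = SKILLS[skill]
    return f"Deals {s['base']} damage, plus {s['per_str']} per point of Strength."

print(server_damage("sinister_strike", 100))  # 470
print(client_tooltip("sinister_strike"))
```

The client still only ships the generated strings, so the anti-hacking argument holds; the fix is upstream, in where the strings come from.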

This discussion reminds me of a bug discovered in Diablo 2 a few years ago (note: more than 10 years after its release).

Basically it boils down to: one skill automatically misses every other attack, *if* the player is in werebear form, *or* under some, but not all, circumstances in werewolf form (I’m not sure whether these circumstances have been quantified; if so I can’t remember what they are).

Now this is something you might think would be relatively easy to spot, since you can see your target’s hit bar. And it *still* managed to go undetected for ages, even by players quite dedicated to testing the system in modded and single-player mode. Partly because the skill, Hunger, is widely considered underpowered and was little used anyway, but still.

Unit testing is your friend. Instantiate a generic crash-test dummy, equip it with nothing but your item, and throw a swathe of attacks with fixed amounts of different types of damage. Measure results. Do all of this automatically, for every item in the game, for every release.

Still won’t catch the strange interactions between different items, but at least you find the obvious “this armour isn’t protecting me” kind of errors.
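A rough sketch of what that harness might look like (the item data, the resistance model, and every name here are invented for illustration):

```python
# Hypothetical item data: each item advertises its resistances.
ITEMS = [
    {"name": "Ring of Fire Warding", "resist": {"fire": 0.20}},
    {"name": "Plain Iron Ring", "resist": {}},
]

def engine_damage_taken(item: dict, dmg_type: str, amount: float) -> float:
    # Stand-in for the real combat engine under test.
    return amount * (1 - item["resist"].get(dmg_type, 0.0))

def test_all_items(attack: float = 100.0) -> None:
    for item in ITEMS:
        for dmg_type in ("fire", "cold", "physical"):
            advertised = item["resist"].get(dmg_type, 0.0)
            actual = engine_damage_taken(item, dmg_type, attack)
            expected = attack * (1 - advertised)
            assert actual == expected, f"{item['name']}: {dmg_type} resist broken"

test_all_items()
print("all items protect as advertised")
```

In a real game, `engine_damage_taken` would be the live damage pipeline; the point is that the expected value is derived straight from the item’s advertised stats, so a “this armour isn’t protecting me” bug fails loudly on every build.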

If you don’t trust the system, then you do the only logical thing: extensive testing. Break out the spreadsheets!

Your basic auto-attack swings for 100 damage every 3.0 seconds. You found a ring with +5 haste. When you equip it, your tooltip shows you now attack every 2.95 seconds instead of 3.0 seconds; the formula is obviously Base-(Speed/100). Assuming no other stats have changed, this should be a 1.69% increase in damage output. The average length of a dungeon boss fight is 180 seconds, so unequipped you should do 6,000 damage; with the ring equipped, you should do 6,102 damage.

You set off on an epic adventure to find something to serve as a punching bag. Maybe it’s another player, maybe it’s a really tough but weak enemy, or maybe the designers have training dummies for you to hit.

After 100 trials (because you want to reduce the margin of error), you find something puzzling: you’re doing 6,300 damage with the ring equipped. Huh? Well, now you start stacking haste to start gathering data points. At 50 haste, your tooltip numbers say you should be doing 7,200 damage, but your trials show you doing 9,000 damage. At 200 haste, your tooltip numbers say you should be doing 18,000 damage, but your trials show you as doing 18,000 damage. What’s going on here?

There are a few possibilities. The tooltip designer may have implemented an older formula, or perhaps wrote their own formula while working off of data given with 200 haste. Maybe the design team decided they wanted to simplify the tooltip for fear that the actual formula would have been too confusing to understand. Maybe the designers are deliberately lying because they figured no one is going to notice a difference of 200 damage on an enemy with 43,100 health that requires 5 players to kill; turns out, it’s actually a difference of about 2,000 damage when you have 5 players (each of them have 5 haste).

Going by your data, you discover that the actual formula is not the Base-(Speed/100) you thought it was, but rather Base/(1 + Speed/100)! You also realize that with the incorrect formula, you could ostensibly have a swing timer of 0 seconds, while this new formula prevents that from ever happening. Wondering how low you can get your swing timer, you decide to add more haste. You wrangle together every bit of haste the game has to offer, 357, and your tooltip does in fact say your swing timer should be -0.57 seconds; your spreadsheet says you should be doing 27,420 damage. In your trials, you discover something bizarre: you’re doing only 18,000 damage!

Congratulations, you’ve hit an undocumented cap. The designers made it so that your swing timer can never go lower than 1 second, which means any haste over 200 points is a wasted stat.
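The whole scenario can be reproduced in a few lines of simulation, which is exactly the kind of harness that makes these discrepancies jump out (using only the hypothetical numbers from this comment):

```python
BASE_SWING = 3.0   # seconds per 100-damage swing
FIGHT_LEN = 180.0  # seconds per boss fight

def tooltip_swing(haste: float) -> float:
    return BASE_SWING - haste / 100  # what the tooltip claims

def actual_swing(haste: float) -> float:
    # The "real" formula, plus the undocumented 1-second floor.
    return max(1.0, BASE_SWING / (1 + haste / 100))

for haste in (0, 5, 50, 200, 357):
    dmg = round(FIGHT_LEN / actual_swing(haste) * 100)
    print(f"haste {haste}: {dmg} damage per fight")
# -> 6000, 6300, 9000, 18000, 18000
```

Running both formulas side by side reproduces every data point above: the 6,300 surprise at 5 haste, the coincidental agreement at 200, and the flat 18,000 wall past the cap.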

Granted, theorycrafting requires that you’re able to collect data and fine tune your stats so that you can isolate variables and all that. If the game you’re playing doesn’t allow you to collect data, then you might not notice that something is off. You can still crunch the numbers and do analog testing (am I killing this as quickly as my spreadsheet says I can?), but when you hit a -0.57 second swing timer you might realize something is wrong.

Really, the best thing you can do is just ask the designers to clarify things for you, if they don’t already run a wiki documenting the various formulas. Like, maybe you have a really complex spell that grows in power the further it travels; I can’t imagine anyone being able to reverse engineer a formula that looks like this. The key is being able to communicate with the designers and having access to data, be that formulae, timers, combat logs, and more. This is where forums and community managers shine because then you have those channels to open these kinds of dialogues.

As you get higher up in levels and get more specialised complex weapons and items with a larger number of attributes, it gets practically impossible to tell sometimes whether one weapon or piece of armour is better than another. A lot of it comes down to “feel”. Do I feel more powerful, tougher, etc. with this item? After a few mob fights you get the hang of what items work for you. I get the need for munchkins to try to game the stats, but I think for most people it just becomes too much of a drag as the game gets to the higher levels.

Shamus, excellent post. One of the things that keeps drawing me back to WoW as an MMOG is how the game designers keep iterating on mechanics and dropping ones that make less sense. The latest expansion, for example, removed armor penetration, greatly simplified to-hit calculations, and took out the vast majority of the talent choices that were +% to a random stat.
EQ2 and LOTRO take the opposite approach, and just keep bolting on more new mechanics with each expansion, which makes it harder for me to get into either game.

There’s also the issue of how much data gets exposed to theorycrafters. WoW has lots of APIs that can be accessed by addons to provide very detailed information; hence you have lots of 3rd party sites, such as World of Logs, that can automatically parse combat log information to help calculate performance. While the issue of how open a game should be to this kind of analysis is certainly open, I like being able to deep-dive if I need to.

Agreed, the way combat mechanics in WOW work is well understood by the community, and many in-game and external tools are available to analyse the fairly comprehensive combat log.

In the small number of cases where mechanics are not explicitly explained by Blizzard, the algorithms used are very quickly data-mined by the community, and there are enough people poring over it that bugs are usually detected very quickly.

Is it better than a closed system? Very subjective, but bugs like the cooldown reduction not working that you mentioned would never make it past beta in WOW these days, though it does still suffer from occasionally inaccurate tooltips between the time they actually fix something on the servers and the time they next patch the client.

This is why I’d really like it if people actually told us the formulas they’re using. It doesn’t take much, and it helps both the community (with finding optimal gear) and the developer (because of people like those above finding bugs in the system).

“In the coming week of Spoiler Warning, we have a conversation about how gradual some of the upgrades are in Mass Effect 3. I'm talking about the things that give 10% more damage, or increase the radius of area damage by half a meter, or other baby-step upgrades.”

To be fair, that’s been a problem in *all* the Mass Effects. ME3 actually probably has it the least bad in some regards. ME1 frequently had increases of a few %. Effect increases for ME3 powers at least tend to never drop below +30%. (The %’s for armour in SP, otoh, are really low unless you stack a bunch together. Honestly, I’d almost prefer no stat increases and have them just be purely cosmetic :/ That’s how I used ’em. Of course, I played SP on Casual…)

I did like how each upgrade is a clearly defined boost to a particular stat, often with a choice of 2 upgrades. Alpha Protocol had a similar system, and I liked it a lot more than ME1’s “Just killed another 100 geth, should I go for 4 more health or 2% more shotgun damage?”

+1 this. I think ME3 found a nice midground as far as levelling up went. Every time, it felt like there was a fairly large and clear bump, and often the choices weren’t pure stat upgrades but something visible. Like Chain Overloads.

And all felt testable. Okay the guns had very specific numbers on them, but after 5 minutes of playing round it should be possible to figure if it’s working out for you or not.

Yes, levelling up powers was *much* better done in ME3. (Although I do tend to find that I always stick to certain upgrades, even when respeccing in MP. Frex, I haven’t actually tried double chain overload on a character. The rank 6 boost against shields/barriers just feels too necessary when my teammates are going to be relying on my taking down banshee/atlas/prime/praetorian shields. And the rank 4 chain (+ the rank 5 neural shock) seems to do well enough against mooks, especially with ~200% cooldown.)

However, having to level up weapons was dumb and stupid. Trying out a new gun at its actual potential is just a big credit sink, which discourages you from trying out different guns. (Also, I imagine it would be annoying for heavy-gun-toting players to have to spend a bunch more creds than single-light-weapon-fast-recharge players.)

Basically, what we needed was a system like ME2’s, where guns were fixed, but with *tons* more guns. (To be fair, we did get a lot more guns, and there’s way more variation between them than between guns in ME1 or in ME2. Still, even more would’ve been better ;] )

But I suppose that that wouldn’t have been good for monetizing MP, so…..

Yes, but you’re probably only ever going to be carrying one at a time. And failing to upgrade it is less hard on you. (Also, that gun is likely to be a pistol, SMG, or shotgun, and those are cheaper to upgrade.) Whereas a SP soldier-y type is going to want to load up with at least a sniper (most expensive per upgrade) and a shotgun, and probably more (if nothing else, to have all three ammo powers at the ready; plus assault rifles probably work better with adrenaline rush than other guns). *And* they need to keep up with the weapons treadmill as the game goes on, whereas a pure power player can usually make do even with a crappy gun.

Oh I forgot about the gun upgrades. They were really bad, I think I ended up pretty much ignoring them. The nice thing about gun selection was you didn’t have to number crunch, because the feel of the gun told you enough to make the decisions.

I don’t know what I’d have done. If there were a ton more guns, each one would feel less unique and it’d be hard to decide between them. If you found guns at certain levels (kinda like MP) it would still make the decision awkward. Is a lvl 4 Phalanx better than a level 1 Scorpion, even though the Phalanx suits your play style more?

The system they had was anti-good though, because as you said, it discouraged playing around with the different weapons. Maybe I would have preferred to scrap it all and have general damage, weight, and fire rate upgrades (on top of the mods), so you’re making a decision about whether you want more skill cooldown or more DPS instead. And you can fine-tune that on later playthroughs when you’re aiming for specific weapons, which should be fun.

Mentioning neural shock reminds me…
Neural Shock was a pretty fun Loyalty/Bonus power in ME2, which I used for a while during my first playthrough. In fact, I used a number of bonus powers in that first playthrough, as I unlocked them. Kasumi’s flashbang, Mordin’s NS, Zaeed’s grenade, and a few more, finally ending up with Jacob’s Barrier (which was fitting, considering that I was a vanguard, and had spent the prior game using barrier more than any other power, even lift :P ). (Barrier was also pretty necessary for me in both those games, even when playing on Casual. I was really sucky at shooters back then.) Note how I avoided implying that I’m not still sucky now :P

But on my second playthrough… I just used Barrier from the beginning. It was good for the class, of course, but it was a lot more boring. Kasumi’s flashbang grenades were HELLA hard to use as Shep, but they also had a certain charm, and getting to use lots of different powers was fun. I know that there was nothing stopping me other than my own conceptions, but I still couldn’t… it just didn’t feel the same the second time around, and wasn’t as fun…

So a lot of people are bringing up the difficulty of testing something with an RNG. That’s not really true.

See, usually when you use an RNG, you just call a random number function of some sort. If you want to know whether something that boosts hit odds by 2% actually means an attack that would hit on a 55% now hits on a 57%, you can replace all instances of the random call with something that returns 57% every time. If you do this and the attack hits every time, then change it to return 58% and the attack misses every time, it works as intended. Actually, if you’re a serious QA testing group, you probably have a program that lets you pick the values at runtime.
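In code, the trick is just to make the random roll injectable (a minimal sketch; the 57% threshold is the one from this comment, and all function names are invented):

```python
import random

def attack_hits(hit_chance: float, roll=random.random) -> bool:
    """An attack lands when the roll falls under the hit chance."""
    return roll() < hit_chance

# Normal play uses the real RNG:
attack_hits(0.57)

# Tests pin the roll just below and just above the threshold:
assert attack_hits(0.57, roll=lambda: 0.569)      # always hits
assert not attack_hits(0.57, roll=lambda: 0.571)  # always misses
print("hit threshold behaves as intended")
```

Because the roll is a parameter with the real RNG as its default, the shipping code path is untouched while the test path becomes fully deterministic.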

Finding out if your RNG actually works as intended is more complicated, but the math to determine how random a series of numbers is actually exists, so you could use your RNG to generate a million values and then run a randomness determiner program of some kind on the list.

That being said, MMOs are extremely large and complex and weird interactions happen all the time, so exhaustive testing, while entirely possible, is extremely time-consuming and therefore expensive.

Ah yes the programming side. Having done my own tiny share of it on Udacity I understand how easily things get screwed up. :)

So many Debug calls in the programs I was building, and recursive calls.
Though why don’t the programmers, from the ground up, give the tooltip writers short function calls in a format that pulls the current variables directly?

Wouldn’t there be some kind of unit testing? Admittedly, I’ve never tried to make a game (except for a copy of Plants vs. Zombies using python), but can’t you include tests to make sure that the damage numbers come out correctly?

You could, and I imagine the better companies do. But those take time to write, and I expect they also need a human to run and evaluate them, because although you could create a system that compared actual values to expected values, you’d need to debug that system to make sure it did the comparison right and had been written with the correct expected values, and that’s a hole with no bottom.

It seems like it’d just be an “assert” statement… It doesn’t seem like it would take that much time to write out. All you have to do is copy and paste, and then replace the variables. For example, pulling out one of my old JUnit tests: if you have some function, “frac”, that turns integers into fractions and you want to test that the numerator is in lowest terms, all you say is:
Rational r = frac(36, 48);
assertEquals("numer() should be in lowest terms", 3, r.numer());

Seems like a copy/paste kind of problem. It also seems like it would actually save time to write out tests, rather than have problems later on and wonder where they’re coming from. Is there a way to look at the code itself for games? It’d be really interesting to see how much testing they actually do.

etc.
but these unit tests are now potentially buggy – maybe I typed in the wrong exception name in test2 (or, more likely, someone changed the DB class to regularize all the exceptions in some new way and now the exception thrown has changed from under me…)

Okay, maybe I don’t quite understand the gist of the article, but here’s my comment on it:

…and in order to write it, I’m donning my Cape of Peirce, which gives a 23% boost to Pragmatism. Now, do you, fellow commenters, need to know my base Pragmatism? No, I think not; what you can nevertheless safely assume is that my comment will be pragmatic in nature. See what I did there?

Do I trust the system? I don’t know, I have never thought about it, because: Do I ever think about the system and/or care about it so much that the question would even be raised? No. I understand there is an underlying system of number crunching translating my clicks into input applied to the game world but I hardly feel the need to understand it to its tinytiny bits. When commandeering Gordon Freeman, do I wonder whether there’s a difference in damage done between shooting a combine in his legs or in his shoulders? And why should I?

For me, the reason why I like video games is how they are unmatched in creating the sense of Immersion. Thus, when I play a game, I am not only me, the potentially analytic person in front of the screen, I’m also my character. How much is he/she supposed to know?

And now excuse me, I have to decide whether my armor should be Midnight Green or Ruby.

I have always found the spreadsheets in these games impenetrable – which is funny, because data analysis is my day job. I figure it’s because data analysis is my job – and therefore I really don’t want to do it when I play games.

So the way I figure out whether something works is I strap it on and go out into the game.

I recall my brother playing Heretic. He was clobbering stuff with his staff (because he was out of stars or whatever). He found these gauntlets of something or other. Put them on and said “muahahahaha, now I have lightning guantlets!” And went out to kill a skeleton.

And got clobbered.

“Actually, they’re not much better than the stick.”

So he went back to the stick.

And this is pretty much the reason I hardly ever use shotguns in Mass Effect 2 and 3, never use the Incisor at all, and save the Mattock for special occasions.

It’s also possible that these games just have very badly formatted data structures, and your professional sense tells you “don’t go there, it will drive you mad!”

And, as you said, the subconscious is a very powerful tool in evaluating complex systems. Unfortunately, it’s not very good at divining causes, and getting a few rare crit-hits just after switching equipment could strongly affect your assessment, even if there’s no correlation.

Honestly, if the system is failing in such a small way that I can’t actually perceive it, I don’t really care. Obviously, I would prefer it if things worked as they say they do, but acquiring a sword that does 1% more damage and acquiring a sword that I *think* does 1% more damage is basically the same event in terms of the experience I have as a player. Paranoia about the possibility that the system is broken will damage your enjoyment far more than the actual brokenness of it.

And from the development side of things, I think that’s the only thing you can do too. If there’s a bug that no-one ever notices, then it’s not big enough of a problem to be worth the concern.

Yes, this would be awesome! See the discourse on the “perceptive layer” above.
What if your character actually thought it was a +1 sword? What if the character’s belief that the sword was better made him hit harder in combat?
And, of course, what happens when you find Excaliper?

I think the underlying problem from a player’s perspective isn’t too much complexity, rather a lack of transparency. In a PnP RPG, the player knows exactly what any given bonus or mechanic means. If I get a +2 sword, I know specifically how it’s going to improve my abilities over my current +1 model. Likewise, if I get a Keen sword, or a flaming sword, or what have you, I know not just that it’s vaguely better, but the specifics. A keen sword will double my critical threat range. A flaming sword will do an additional 1d6 fire damage. Now when I make the decision to use the keen sword or the flaming sword, I know what decision I’m making. It gives me all the information I need to make decisions about what I want to use.

Skyrim does a similar thing graphically. If I get a fire-enchanted sword, I see the red glowing runes etched into it and I see my enemies burst into flames when I hit them with it. Likewise, if my hit gets blocked on the enemy’s shield, I see the hit on the shield, hear the sound of it hitting wood rather than draugr, get a little bit of a rebound animation, and generally get good instant feedback on it. It gives me less mathematical detail, but I still have plenty of information to make meaningful decisions about the system. I might not always be able to know exactly how much damage each swing will deal, but it gives me enough information to move in the direction of “better,” and still lets me make decisions about how I put my character together and what I want to do in the system.

The biggest problem I have with GW2 is that it doesn’t give me that kind of information. The best feedback I get is “I stood in the red circle, and took a lot of damage.” I have no idea what my stats actually mean, how much damage my abilities do, how damage mitigation works, or what any of the numbers I’m given actually translate into damage. There isn’t enough information given to make meaningful decisions about what stats to build up, what traits to pick up, what kinds of weapons or skills are best in what situations.

It’s the same problem that Shamus pointed out about dungeons. The difficulty doesn’t come from playing out the consequences of your decisions. If I choose to stack nothing but Power and Precision on my thief, the challenge should come from trying to stay alive long enough to do the tons of damage I’m capable of; if I stack Vitality/Toughness instead, it should come from trying to maximize the damage I get from comboing my abilities together before the enemies can eat through my health. Instead, the game is challenging because you’re supposed to make decisions without understanding what they really mean in the context of the system you’re dealing with. The game feels designed around some kind of optimized play without giving your average player enough information to understand what their goals should be, or how best to reach them.

It’s not that I don’t trust the system; I just don’t care much for these kinds of systems. I’m not against numbers, so long as they mean something. If you only have 1hp (as in most games in the 80s), +1hp is a huge difference. Not only is it a 100% increase in stats, it radically alters your play style and allows for strategies you previously thought impossible. Even in StarCraft 2, a +1 upgrade to Protoss Ground Weapons means zealots can kill zerglings in 2 hits instead of 3, which makes a noticeable enough difference, and at higher levels of play a skilled tactician can use it to win a battle.

It doesn’t matter how simple or complex your system is, so long as you kinesthetically feel the impact. (That’s why simplicity is easier: the impact is more direct. But more wheels can mean more varied flavours.) If you have such a system, finding the bugs is child’s play.

The problem with most MMOs though, is that they rely on feeding players better gear. Better gear means more stats, and then you have power creep that destroys the meaning behind all of your carefully arranged numbers.

If there is really good system encapsulation, then testing it wouldn’t be that difficult. You could just put the “combat damage” module into a “combat testing environment” which would throw lots of test cases at it to make sure it’s working properly. Obviously this wouldn’t take care of the more arcane UI problems, but it would solve nearly all of the mechanical issues.

But sadly, nearly all of the code we write is inextricably tangled in all the other code. We all do it, reusing this little bit of the rendering engine in the combat system, slipping a dependency on the save file interface into the character collision detection. This makes it nearly impossible to lift individual modules out of context and test them (brain-in-a-jar style) for internal robustness and consistency.

The adoption of “Object Oriented Programming” was supposed to solve all these problems, but as you’ve said before, it only made them worse. If only people weren’t so darned lazy and sloppy! We would have perfect bug free service-driven self-documenting programs!

That utopia aside, a well documented mechanic (like the combat system) would be a lot easier to test, because we would know how it was supposed to work. Good documentation can be hard to do, and frustrating at times, but it will save your butt every time. Now if only we could get ArenaNET to share their combat system documentation with us! (though I suspect that it doesn’t exist and they’re as lost as the rest of us)

On the topic of NCSoft games where the devs don’t quite know what’s going on, I’m reminded of City of Heroes and its defense stat(s).

See, initially they had defense that was specific to the type of damage (blunt, lethal (slice/stab), hot, cold, ‘energy,’ and ‘negative energy’) and defense that was specific to the type of attack (melee, ranged, and area of effect.) Since any attack would only be one type of damage and be only one type of attack, the designers told the programming team to have the applicable defenses stack.

This was fine right up until they added attacks like an icicle blast, which did both lethal and cold damage, because it twigged to ranged, cold, and lethal defenses. So the designers told the programming team that only the best ‘damage typed’ defense should count, so the icicle blast would be defended by the sum of your ranged defense and the better of your cold or lethal defense.

Unfortunately, what the programming team heard was “Defenses don’t stack; just use the best one,” so that’s what they implemented. Which meant that if you were really good at dodging icicles (say, 50% ranged defense) a force field that deflected some icicles (20% cold defense) did absolutely nothing to protect you.

It was eventually discovered by players after they got their hands on the game’s to-hit algorithm and parses weren’t showing the expected results. And initially the devs claimed the players must be doing it wrong, because that’s not what the design documents claimed.

Eventually, the bug was fixed by making any defense-granting ability that was supposed to stack with others offer both a ‘damage typed’ and an ‘attack typed’ defense; they weren’t willing to risk breaking the combat module to fix the bug, but ability effects were just database entries. So your force field that originally gave a 20% defense to hot and cold attacks also gained a 20% defense to ranged attacks.

Really good breakdown of this. I found that in MMOs this magical “you will be protected” stat always applied to my current level; level up ten more times and you have to find another thing that protects you. For instance, in Torchlight 1/2 you get tons of items, and those items give you bonuses to things. Is that +10 HP good for now, or for the long run? This Dex won’t help me now, but maybe it will in a few levels? The opposite can be said as well: you have items that make you awesome for the level you are at, but keep the gear for 10 or 20 levels (sometimes less) and you’ll definitely see the “magic” stat not helping as much, or even at all.

(This damn board keeps eating my comments (comments #308183 and #308185, if that helps Shamus). I had to reproduce the text below, and then it ate it again. Fortunately I kept a copy the second time. Trying this AGAIN without the links in the text. Links have been changed to bold, and I’ll try to get the links in a reply to this.)

I skimmed through the comments, but I didn’t see anyone mention the ridiculous amount of work a few players put into testing the system in various games. Some games, like Dwarf Fortress, allow players to do ridiculously precise testing due to allowing the players to modify the creatures and weapons, giving them an arena to test them, and producing detailed combat logs which can then be analyzed. For example, take this recent thread in the DF forums: Dwarven Research: A Comparison Study on the Effectiveness of Bolts vs Armors. It’s an amazingly detailed study of exactly what the odds are of certain types of bolts penetrating or reflecting off of certain types of armor.

But even in games that don’t lend themselves to such detailed research, such as most MMORPGs, a few players are willing to do the kind of research necessary to test the system. For example, in Perfect World International, a few of the players have worked tirelessly, performing hundreds, maybe thousands, of tests to figure out how the damage and resistances work. The results can be found in their wiki’s Damage page, which includes equations that allow players to figure out exactly how much damage they can expect to do against various targets. (I don’t remember any specific examples, but I do seem to recall that it has shown that some skills do not work as advertised, and in a few cases that has led to the skill or its description being fixed.)

So, players not only don’t trust the system, but a few of them are willing to work damn hard to figure out the reality and they deserve our applause.

1) Built-in WordPress spam filter. This is just a bare-bones system that grabs obvious posts. Comments that are nothing but twenty links, or comments which contain a few keywords. (If we ever tried to discuss the various names for Viagra it would devour whole conversations.)

2) Akismet plug-in: This is the “real” spam filter used by most WordPress blogs. When something is flagged as spam, details about the comment are sent to the central Akismet site, and other WordPress blogs benefit from it. Without this, I’d have to wade through a LOT of crap.

3) Bad behavior plug-in: This is usually only used by larger blogs, and is basically a self-defense system. Bad Behavior looks for shenanigans like comments being posted by visitors who have never loaded the page to which they are supposedly leaving a comment. It also begins eating comments from someone if they start leaving comments too fast. Without this, a bad spam bot would bury me in work or bring down my site. (This used to happen a couple of times a month. Someone would aim their spamutator at my site and the deluge would slow down my host to the point where it was basically a denial of service attack.)

These three systems are created and maintained by different teams and don’t always work well together.

Having said all that, I have no idea why bay12forums is blacklisted, or by who.

I’ve come to seriously distrust any MMO that doesn’t give me a proper combat log, mainly because I’ve discovered that GW2 is a lying piece of junk and that NONE of the very few stats in the hero screen can be trusted.

That very much looks like something that I would try to catch with unit tests, rather than in QA. Much better control over the sequence of actions. It does mean that someone has to sit down and figure out what’s supposed to happen in advance of the testing, though.

Not sure what happened to all the programmers here, but I can’t be the only one who noticed that this is wrong: “So your ring that granted 10% fire protection was actually just reducing all fire damage by 0.1 hitpoints.”

A modifier that reduces by 10% would have the value of 0.9; a modifier of 0.1 would equal a 90% reduction.

This would be correct: “So your ring that granted 10% fire protection was actually just reducing all fire damage by 0.9 hitpoints.”

Did your example have a bug, Shamus, or was this bait to see if anyone noticed? ;)

Also, in my opinion it’s kinda odd to do modifiers this way.
I would instead add up all the modifiers, then apply a cap (if needed), and then apply the summed modifier to the summed damage.

Stacking and capping could be easily adjusted from a single point in the code, and new modifiers and damage types could be just slotted in without breaking anything.

Doing the damage modification in one step is just begging for bugs, as a simple -, +, or * can not only mess up applying the modifier but also mess up the damage value, making it very difficult to find what messed up the value.

B is also what most people expect; it is possible to "add" the bonuses/stats in your head or with a notepad.
A is a nightmare to do in your head, and calling it painful with a notepad is an understatement.

In example A the damage result is 36.3375 (originally 50)
In example B the damage result is 35 (originally 50)

Also, example B could be done with floats only, or entirely in integers:
by simply changing "damage*=((100-modifier)*0.01)"
to "damage=(damage*(100-modifier))/100"
it can be all integer, although with 1% precision; expressing the modifier in tenths of a percent and changing both 100s to 1000 would give 0.1% precision.

Myself, I like floats, and I try to avoid mixing floats and integers, as the conversions can slow the CPU down, so I keep everything float or everything integer until the very last step (or the very first).
Two multiplies and one subtract (if all floats) is also faster than the all-integer subtract, multiply, and divide, since the divide is the slow one.

But I'm getting sidetracked now.

The point I'm making is that with Example B it makes sense to show the -10% damage or +10% critical hit on items; tooltips won't lie, and any cap limits will be obvious during use. (And showing calculated damage on, for example, the status/stats/equipment window is a nice bonus.)

With Example A, on the other hand, don't even bother showing +10% or -10%; they would be meaningless. Instead show the actual calculated damage percentage. (One place for this would be the character status/stats/equipment window.)

Example B is a known linear behavior (stack another +/- in there and you only reach the cap faster, if there is one: 5% plus 5% equals 10%).
Example A is an opaque multiplicative behavior (stack another +/- in there and the whole calculation changes: 5% plus 5% does not equal 10%).

Some here may wonder why anyone would do Example A in the first place.
Well, it is very tempting for a programmer to simply do things like
result=ringobject.modifier(damage)
and leave the calculation in the "ring" code/object.

This may or may not be a procedural vs. object-oriented approach issue, I don't know, but with
result=ringobject.percentage()
a benefit would be that the percentage could be 0 for "normal" items, so it could double as a "magic item" flag.
A procedural approach would be to fetch the ring percentage from the "equipped" list the ring is in, which is my preferred approach.

Programming isn't easy; if it was, then everybody would be able to do it.
The issue, however, is that many programmers fail to think logically (one would assume a programmer is always logical, but this is not true).
Temporary "duh" moments aside, Example A is, plainly stated, a stupid and illogical way to do modifiers.

Crap, this turned into one of "those" long-winded comments, oops, sorry folks!

FireProtection is probably the combined fire protection of all equipment. By using values like 0.1 for 10% fire protection, as Shamus did, FireProtection can be calculated by simply adding all the active fire protection values together (similar to your Example B). Then it would probably be capped at 1 so you can’t take negative damage.

And so on, and that’s when things start to go pear-shaped, as the value of Damage will or will not be affected by one of those, and 10% Melee protection may be applied after the fire modification or not.
So those 10% would not be “the same 10%” (I can’t believe I said that; it sounds illogical, and it is, but it’s possible, as you see).
And adding a protection type or damage type would change the way that 10% is calculated.

“FireProtection is probably the combined fire protection of all equipment.” That is fine, and if that is the case, Shamus’ example is correct on that point. (Correct or not is not the point of his example; it was that a tiny change could have a huge impact.)

My point is that the example code should be only that (an example), and should point out that it’s not smart to code like that with multiple modifiers, as it makes tooltips pointless, and a 10% ring against fire may not be 10% against the fire damage depending on the order it’s checked (which is just illogical, IMO).

So if that is combined damage, then that still leaves the issue of total damage.

If the FireDamage and MeleeDamage were added later, then that would be better.
Then again, it’s called Damage and not FireDamage, and
if (DamageType == DAMAGE_FIRE) {
FireDamage -= (FireDamage * FireProtection);
}
also makes little sense; the if (DamageType == DAMAGE_FIRE) check would not be needed, since FireDamage and FireProtection seem to be set elsewhere already.

Oh and Shamus, I know this is just an example and that those few lines may be the entire code for the sake of the example. But there will always be a copy’n’paste moron out there who takes it as gospel.

So if you ever see example code, or see that the author states “this is just an example” or “error checking not included for brevity” or similar, make sure you really, fully understand the code before using the example.

I’ve posted code examples of ideas or concepts myself in the past on a programming forum, and though it’s luckily rare I’ve had someone say they had issues with it. Regardless if it says “rough example” or “concept test” or whatever, somebody will copy’n’paste the darn thing and expect it to work. :P

You may think I’m complicating Shamus’ example code.
But I’m not: in a game, some other dev (especially in an MMO) may need to add another type of fire damage, like ExplosionDamage, to the if block.
Hopefully the lead programmer, or whoever reviews code submissions, will catch a mistake, but this is not always possible.

Just because something works or compiles does not mean it’s correct.
And a unit test would not catch the design issue I was pointing out. (Not unless you want to create ASSERT hell for everyone; IMO, if you start adding ASSERTs, your code needs to be re-written in the first place, as you obviously do not trust all the values that might show up.) Now I’m straying off again, darnit…

Shamus’ example is both right and wrong depending on its context; sorry if I didn’t point that out in the long comment above. I did ask Shamus whether it was a bug in the example or not, though (and with an easy out where he could save face and say “yup, I was testing you all”), but that’s kinda ruined now, ain’t it? :P

So those 10% would not be “the same 10%” (I can't believe I said that, sounds so illogical and it is, but possible as you see).
And adding a protection type or damage type would change the way that 10% is calculated.

Maybe I need to use a different example or something next time.
Point I’m making is that it is recursive math (Example A) and the logical one (if you are to show percentage bonus in tooltips or just want the player to be able to do the math themselves) is Example B.

The “combinatorically large number of combinations” problem is in no way unique to gaming. Consider testing computer hardware – can you test every processor with every brand and amount of RAM, plugged into every keyboard, using every monitor… and so on, to infinity?

The central premise is that most bugs are solitary (this RAM is broken with every processor). Those are easy to detect – test it once and you’ll see it. Most of the rest of the bugs come from a specific combination (processor X doesn’t work with RAM Y). So design your tests so that every pair gets tested at least once.

It would be rare to have a bug that requires three or more factors (processor X doesn’t work with ram Y if we’re using keyboard Z, but works fine with every other keyboard).

I just wanted to note that GW2 actually has “Steady” weapons with fixed damage so that players can test out how traits, abilities, and stats work. There’s three(?) dummies of varying armor ratings to whale on, and players have already come up with a few theoretical formulas for things as well as possible ranges of armor the dummies have.

Of course, this is only with relation to dealing damage, but there is theorycrafting going on, some of which goes onto the official forum bug lists.

This drives me INSANE. I constantly notice in some games how leveling up and acquiring a new ability makes me weaker. I kept thinking it was an illusion on my side but when I did some small testing I noticed it was true.

I tested this on Champions Online, in which a fairly recently introduced ability chart lets you pick, at every new level, between many passive abilities to boost your stats and such. One of the abilities, for instance, is supposed to augment your energy bar, but instead it makes it smaller. I thought I was just imagining it, but I actually checked the numbers and the amount of energy I spent while fighting, and it’s definitely a mistake.

I stopped using that ability, so I don’t know if it was fixed or not, and it’s only one of many bugged abilities that hinder gameplay.

Did you ever play the Paper Mario RPGs? The damage system featured no dice rolls, very simple maths (Power – Target Armour = Damage) and used small integer values for attack and defense (I don’t think I ever dealt more than 9 damage in a single attack), so you knew how it was all working just by observation and it was very easy for any player to theorycraft.

Then, of course, you realise the best strategy was to stack power onto jump attacks. +2 damage (double jumps) for the price of one. Twice the penalty for armour of course but, meh.

One Trackback

[…] ugly head in initial release of Frayed Knights: The Skull of S’makh-Daon, well-explained by Shamus Young yesterday about trusting the system. When there are too many moving parts, it can be really hard to find or fix problems – or, […]