As a major pain in your belly (from a huge fit of laughter), if you were part of the team involved in the situation

As a major heads-up to everyone who thinks that working beyond your limits is the way to deploy a project

The story is about 4 hours wasted by a team of 5 during the deployment of a critical project.

“What was the cause of that?!” you may ask… and the answer is: “A blank line!”. Actually I could add some bad words to that answer, but I’ll just leave it at that.

The project was successfully deployed after a week of day-and-night effort by the team, and it was finally time to load test it and get some actual readings of the infrastructure's capacity. For the test I was using Visual Studio 2008, with a test project, all nicely configured with dynamic data sources, CSV files, a nice set of webtests and the load tests to go with it.

The project had been tested against the quality environment and everything worked smoothly. But when I started hitting the production environment with the simplest test (load the homepage, including the login process), 50% of the requests came back as 401s (access denied). OOPS! What the heck is going on?? I tried the webtest by itself (1 shot) and it went smoothly. Then I tried the load test and 50% were access denied. I lowered the number of concurrent users and the percentage was the same!? So we started going through the obvious and not-so-obvious places:

Event log

SharePoint logs

IIS Logs

None of these came up with unexpected information. Well, in the end that wasn’t correct, but I’ll fill in the gaps later on.

It’s time to bring in the artillery, I thought. Fired up WinDbg, which revealed nothing new, just loads of it… have you ever tried to debug a system while it was being load tested? Well, let’s just say: DON’T.

OK, by this time I was throwing in the towel. “Let’s hit another server! Probably there is some problem with this machine…”. And so we did. And the crap hit the fan again.

By now it was official: panic was setting in. Some of the members of the team were starting to roll back code, and the chaos was just around the corner.

In a desperate move, I tried hitting the quality environment again and the problem persisted. OK, we have a common denominator: my machine. This suspicion was confirmed when we used another machine to perform the test. The test came back clean. And suddenly … a light at the end of the tunnel! (and it wasn’t a train heading for us…).

I use a CSV with the list of users that will be logging in to the application. We were using just 1 user at the time, so what would happen if there was 1 line feed too many??? Exactly! One in every 2 tests would try to log in with A BLANK LINE!!! There it was: the magic number – 50% of access denied.
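A quick pre-flight check on the users file would have caught this in seconds. Here is a minimal sketch in Python of what such a check could look like. It is not part of the Visual Studio test project; the function names and the sample data are made up for illustration, and it assumes the data source hands out every row of the file (blank ones included) evenly, which is what the 50% figure suggests.

```python
import csv
import io


def is_blank(row):
    """A row is blank if it has no cells or only whitespace cells."""
    return all(not cell.strip() for cell in row)


def find_blank_rows(csv_text):
    """Return the 1-based line numbers of blank rows in the CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return [reader.line_num for row in reader if is_blank(row)]


def blank_login_fraction(csv_text, has_header=True):
    """Fraction of data rows that are blank (the expected share of failed logins)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    data = rows[1:] if has_header else rows
    if not data:
        return 0.0
    return sum(1 for row in data if is_blank(row)) / len(data)


# Hypothetical users file: one header, one real user, one stray blank line.
sample = "username,password\nalice,secret\n\n"

print(find_blank_rows(sample))
print(blank_login_fraction(sample))
```

With one real user and one stray blank line, half of the data rows are blank, which matches the 50% of access denied we were seeing.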

The thing is, in the IIS log files, there were some lines with the access denied and there was no username!!! But, tired as we were, we just missed it.
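Tired eyes miss things that a few lines of script will not. Below is a rough sketch in Python of how the IIS logs could have been filtered for exactly that pattern. It assumes the logs are in W3C extended format, where a `#Fields:` line names the columns (`cs-username` and `sc-status` among them) and a missing value is written as `-`. The function name and the sample log lines are invented for the example. One caveat: with integrated Windows authentication the initial challenge responses are also logged as 401s with no username, so a hit is a hint to look closer, not proof.

```python
def anonymous_401s(log_lines):
    """Return log entries with status 401 and no username, as dicts keyed by field name."""
    fields = []
    hits = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive lists the column names for the entries that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        entry = dict(zip(fields, line.split()))
        if entry.get("sc-status") == "401" and entry.get("cs-username", "-") == "-":
            hits.append(entry)
    return hits


# Made-up log excerpt, only to show the shape of the data.
sample_log = [
    "#Software: Microsoft Internet Information Services",
    "#Fields: date time cs-method cs-uri-stem cs-username sc-status",
    "2009-01-01 10:00:00 GET /default.aspx EXAMPLE\\loadtest 200",
    "2009-01-01 10:00:01 GET /default.aspx - 401",
]

for hit in anonymous_401s(sample_log):
    print(hit["date"], hit["time"], hit["cs-uri-stem"], hit["sc-status"])
```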

This whitepaper by Craig Schwandt has been published on MSDN. Learn how to operationally deploy policies through Microsoft Office SharePoint Server 2007 by using feature activation event handlers and the SharePoint Server object model. It also includes code samples that demonstrate the topics covered in the article. You can access the content here.

While creating a stress test project, I came across a problem: the test recorder toolbar wouldn’t appear in IE. As it happens, you have to add the site to be tested to the trusted sites in the browser…