Tim Mitchell is a business intelligence consultant, author, trainer, and SQL Server MVP with over a decade of experience. Tim is the principal of Tyleris Data Solutions and is a Linchpin People teammate.
Tim has spoken at international, regional, and local venues including the SQL PASS Summit, SQLBits, SQL Connections, SQL Saturday events, and various user groups and webcasts. He is a board member at the North Texas SQL Server User Group in the Dallas area. Tim is coauthor of the book SSIS Design Patterns, and is a contributing author on MVP Deep Dives 2.
You can visit his website and blog at TimMitchell.net or follow him on Twitter at twitter.com/Tim_Mitchell.

Ok, I have two admissions that I must bare to the world in this post. The first is that I’ve been a little lazy. I’ve admired – from a distance – this T-SQL Tuesday thing ever since it started. It’s been great! A bunch of people blogging on the same day about their own interpretation of some common topic – brilliant! Sadly, I’ve been too busy playing Angry Birds, er, keeping up with my other responsibilities and haven’t made the time to add my own submissions to this phenomenon. However, I’ve finally killed the last of those damned pigs, er, caught up on my other work, so I get to let my hair down for a bit and join the rest of you fine folks in this thing.

The second admission is that I’ve rolled out some, uh, less than optimal solutions in the past. Any developer who is truthful will admit to a few instances where we’ve rolled out solutions with poorly performing cursors, improperly indexed tables, and convoluted code, all accompanied by little if any documentation – quite simply, it happens from time to time. But in the interest of the T-SQL Tuesday topic for this month, participants are asked to share about a time when they rolled out some truly…

Crap Code

Without a doubt, I could find a number of instances where I’ve written code that wasn’t worth the electricity it took to store it, but one such occasion really sticks out. This was early in my IT career, close to 10 years and several jobs ago, and I was writing my first real production database application. I fancied myself a bit of an expert – after all, I was completing my first relational database theory class in college, and I had a solid 18 months’ experience in SQL Server administration.

I had talked my boss into letting me architect (!?!) a solution that would allow educators to create short web-based assessments, to replace the manual process (as in literal cut and paste – with scissors and glue) they were currently using. With the requisite approval in hand, I began my discovery. Yes, me – not only was I the architect, I was also the business analyst, developer, tester, and system admin. #winning

The Development

In meeting with the educators, I learned that the assessments would be very brief – at most, 10 questions each. Educators would be able to share assessments with others, but could only edit or delete their own. The final product would allow the educators to save to PDF and print the forms – there was no interest in actually delivering the exams online, which made things a bit easier for me.

So as I set out into the design phase, I envisioned the entities I would need. Since the number of questions would be very limited, I decided to use a single table to store the questions directly with the assessments. (Don’t get ahead of me here – it gets better.) I did manage to store the answers to the questions in a separate table – not because it was a best practice, but simply because I couldn’t figure out an easy way to squeeze them into dbo.InsanelyWideAssessmentAndQuestionTable. Using my freshly minted ASP.NET skills – I had recently read two entire books about C# on ASP.NET – I started coding the front end. A simple yet ugly interface, rudimentary file upload capabilities, and slow response time, but during my solo testing sessions, it did manage to do what I intended.
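For contrast, here is roughly what a normalized layout for the same entities could look like. This is only a sketch: the original schema isn't shown in this post, so every table and column name below is hypothetical, and it uses Python's built-in sqlite3 module rather than SQL Server purely so it can run anywhere.

```python
import sqlite3

# Hypothetical schema: all names are illustrative, not the original application's.
SCHEMA = """
CREATE TABLE Assessment (
    AssessmentId INTEGER PRIMARY KEY,
    OwnerUserId  INTEGER NOT NULL,
    Title        TEXT    NOT NULL
);
CREATE TABLE Question (
    QuestionId   INTEGER PRIMARY KEY,
    AssessmentId INTEGER NOT NULL REFERENCES Assessment(AssessmentId),
    Position     INTEGER NOT NULL,
    Prompt       TEXT    NOT NULL
);
CREATE TABLE Answer (
    AnswerId   INTEGER PRIMARY KEY,
    QuestionId INTEGER NOT NULL REFERENCES Question(QuestionId),
    AnswerText TEXT    NOT NULL,
    IsCorrect  INTEGER NOT NULL DEFAULT 0
);
"""


def build_demo():
    """Create an in-memory database with one assessment, two questions, three answers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Assessment VALUES (1, 42, 'Fractions quiz')")
    conn.executemany(
        "INSERT INTO Question VALUES (?, 1, ?, ?)",
        [(1, 1, "What is 1/2 + 1/4?"), (2, 2, "What is 3/4 - 1/2?")],
    )
    conn.executemany(
        "INSERT INTO Answer VALUES (?, ?, ?, ?)",
        [(1, 1, "3/4", 1), (2, 1, "2/6", 0), (3, 2, "1/4", 1)],
    )
    return conn


def question_count(conn, assessment_id):
    """Questions are rows, not columns, so counting them is a plain query."""
    row = conn.execute(
        "SELECT COUNT(*) FROM Question WHERE AssessmentId = ?", (assessment_id,)
    ).fetchone()
    return row[0]
```

With this layout, `question_count(build_demo(), 1)` returns 2, and an answer pointing at a question that doesn't exist is rejected by the database itself instead of being silently stored.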

The Deployment

I’m certain I violated every test and deployment best practice ever written. Source code consisted of a bunch of .zip files created at irregular intervals, and parallel testing involved two laptops on my desk, connecting to the web/database server under my desk. Deployment was the easiest part – I just manually copied the web application files to the web server and restored the database from my desktop dev machine to the server as soon as everything seemed to function without errors. What could go wrong?

The Meltdown

Did I mention that I was still in college at the time? Two mornings a week, I drove to university about an hour away. I had deployed the whole thing about 9pm the night before and e-mailed the instructions for system use to the key personnel. It was no surprise that, in the middle of class the next morning, my cell started ringing. “We’ve got about 100 assessments in the system,” the voice said, “but we just discovered that the answers are associated with the wrong assessments!” Further, as the educators entered data into the system, their entries were often associated with someone else, so they couldn’t go back and edit or delete them.

After clearing 60 miles of 2-lane road back to my office in record time, I started some very quick triage while trying to avoid a few dozen dirty looks from my users. The problem, at least the main one, was that I was using incorrectly scoped variables to track user IDs in the ASP.NET application, which caused the assessments to be associated with the last user who had saved any item. With more than 20 people entering data, there were more wrong than right. Further, since tests/questions and answers were entered in two different steps, most of the answers were also incorrectly linked.
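The original C# isn't reproduced in this post, so the following is only a Python sketch of this general class of bug, not the actual code. It assumes the user ID lived in a single variable shared by every session (much like a static field); the class and function names are made up for illustration.

```python
class BuggyApp:
    # Class-level attribute: ONE value shared by every session,
    # similar in spirit to a static field in a web application.
    current_user_id = None

    def login(self, session, user_id):
        BuggyApp.current_user_id = user_id  # overwrites whoever was there before

    def save_assessment(self, session, title):
        return {"title": title, "owner": BuggyApp.current_user_id}


class FixedApp:
    def login(self, session, user_id):
        session["user_id"] = user_id  # stored per session, not globally

    def save_assessment(self, session, title):
        return {"title": title, "owner": session["user_id"]}


def simulate(app):
    """Two users log in, then the FIRST one saves. Who ends up owning the record?"""
    alice_session, bob_session = {}, {}
    app.login(alice_session, "alice")
    app.login(bob_session, "bob")
    return app.save_assessment(alice_session, "Alice's quiz")["owner"]
```

Here `simulate(BuggyApp())` returns `"bob"` even though Alice did the saving, while `simulate(FixedApp())` returns `"alice"`. A bug like this never shows up when one person tests alone, because the shared value always happens to be that one person's own ID.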

In true rookie fashion, I tried feverishly to fix the error on the fly. Now stationed in the midst of the users trying to enter data, I would stand up every 15 minutes or so and announce, “I’m recompiling, please log off!” This went on for maybe 90 minutes, at which point I – in probably the only wise decision I made that day – stopped the bloodshed and asked that we reconvene later after we were able to correct the development issues.

The Aftermath

I was ready to fall on my sword over the debacle. After all, it was me and my ego that had caused the whole mess. Fortunately, I worked in a flexible environment that allowed us to introduce reasonable risk, even if it meant the occasional failure. Along the same lines, I was given the time to make it right: after two rounds of much more rigorous testing, I successfully deployed the updated (and this time, properly functioning) application several weeks later. Still, despite the eventual positive ending, I was embarrassed to say the least, and lost a little bit of cred that day.

The Lesson

Was it ever. I learned some hard lessons that day, lessons that I carry still to this day.

How did it change me? Let me count the ways:

Database normalization. Learn it, live it.

Don’t be afraid to admit when you’re over your head, even if it’s late in the process.

Test, test, test. A successful compilation isn’t the end of testing – it’s just the beginning.

Testing should involve business users, not just technical staff, and should simulate realistic usage as much as possible.

Never implement anything that you can’t be there to support on the day of go-live.

Don’t rush deployment. Missed deadlines will be forgotten, but crappy applications can live on forever.

And the most important lesson I learned that day:

Mistakes are a part of life, but you must a) own your mistakes and b) learn from them.

To that end, the catastrophic deployment described here was actually one of the highlights of my career. Never before or since has a single incident taught me so much.

Comments

Posted by Jason Brimhall on 10 August 2011

Good story. The best part comes near the end - own your mistakes and learn from them.

You can't get any better without recognition and acceptance of your own failures.

Posted by Andy Arena on 11 August 2011

:D I think any REAL developer has stories (yes, plural) like these. One of my favorites is releasing an ALPHA DEMO to management for scope REVIEW, then receiving phone calls from users "about this new system that's been DEPLOYED ..."

Posted by Dennis Wagner on 12 August 2011

I suppose it's forgivable because you were still a college student. I went back to grad school 5+ years after getting my degree and working in the real world. I ran into plenty of "cocky" college seniors who were very sharp, had great book knowledge, but simply had no real-world experience. As a project we were writing simulation software for the Viking Lander and you'd have thought I was speaking in Vulcan when I mentioned, "what happens when you get inconsistently bad data from one of your flight sensors?" I suppose real-world experience (failures) is the best way to learn these types of lessons.

Posted by jim.shaffer 25435 on 12 August 2011

I was a young COBOL developer working for a financial services company in 1976 (yes, there was a 1976....) I rewrote the monthly statement program that produced statements for all of our thousands of accounts every month. I made a similar mistake. I ignored an end-of-file situation and ended up sending statements to Customer B showing trading positions (like stock holdings) for Customer A. Not all of them, just a few hundred.

I was so embarrassed, I didn't go to work the next day. Amazingly, no lawsuits, and few customers even reported the error (out of hundreds....hmmmmm) back to us.

Learned a lot that day....

Posted by stephanie.sullivan on 12 August 2011

One of the many little debacles I've had in my short but lively career stuck with me the most...

We were migrating from SSRS 2005 to SSRS 2008 (I had been in the MI team maybe 4 months at this point and my degree was in philosophy) and I was responsible for ensuring that everything was tested and worked well in our reporting wrapper.

So I hopped to it with vigour - I got the IP of the new server, made a test copy of our report front-end, imported all the reports and tested. I tested the hell out of that thing - I tested on my access level, I tested on marketing's access level, I got other team members to test also. It was laborious and turned up a number of issues, but we got there in the end.

Go live day and we start getting calls from the internal users sitting in another building and our external customers who used the front-end. They couldn't access the reports! Uh-oh, major SLA violations here so it was a gloomy, gloomy day for me.

The issue was that the network guy, when asked for the server IP, gave me an internal IP, and little old barely computer-literate me did not know there was a difference in IP forms, so I used it. I didn't pick this up, and my team didn't pick this up, even though URLs showed at the bottom of the page with the IP. All in all a really stupid little error that caused havoc.

I learnt some important lessons then -

1) never trust other people to correctly gauge your level of knowledge

3) write test cases and get the QA/Test department to look over them before you start

4) even people who write pretty reports for a living need to know about networks

Posted by rf44 on 12 August 2011

My story is an old one too. In 1984 I worked for the Department of Culture, Museums and Tourism of a large city, here in Europe. This department had just bought its first PC and I decided to use it to perform statistical computations (number of visitors in the museums, number of visitors at major cultural events, impact of investments on touristic activity, etc.).

At the time, memory was scarce and limited (640K, and the O.S. had to fit in it too), Excel did not exist, and Lotus (the PC version) was in its infancy, so I decided to write a program in C, which was a rather new language at the time. Because of the amount of data to be processed, it could not all fit into memory so, very cleverly (or so I thought), I wrote a system that used temporary files to store intermediate computations. The processing was performed on large arrays that were reloaded from these files.

The first month everything seemed to work fine and my system was able to produce the reports in a very short time. I must explain here that the main part of the process was formerly done manually and could take hours if not days. The I.T. Department of the municipality had a mainframe (Honeywell-Bull I think) but it was already overworked and was not available for such mundane tasks. The first results were impressive: the figures were strangely high but everybody thought it was because there were several major cultural events going on that attracted many visitors.

The next month the figures were even higher while no special events could explain it. The third month, the figures were even higher (huge actually) and people began to look at me in a funny (though not amused) way.

I finally realised that I was not zeroing the arrays between each cycle of computing, so the new results were added to the preceding ones. It took me years to stop explicitly initialising my variables, even in languages where it's not strictly necessary.
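The original program was in C and isn't shown, so this is just a small Python sketch of the same kind of mistake, with invented function names and visitor numbers: an accumulator array that is created once and never reset between monthly runs.

```python
def monthly_totals_buggy(months):
    """Each month's report silently includes every earlier month's visitors."""
    totals = [0, 0]  # created once, never zeroed between cycles
    reports = []
    for visits in months:
        for i, count in enumerate(visits):
            totals[i] += count
        reports.append(list(totals))  # copy, so later months don't alter this report
    return reports


def monthly_totals_fixed(months):
    """The accumulator is re-initialised at the start of every cycle."""
    reports = []
    for visits in months:
        totals = [0] * len(visits)  # fresh zeros for each month
        for i, count in enumerate(visits):
            totals[i] += count
        reports.append(totals)
    return reports


# Made-up figures: visitors per month for two museums.
months = [[100, 50], [110, 40], [90, 60]]
```

With these numbers the buggy version reports `[[100, 50], [210, 90], [300, 150]]`, climbing every month just as in the story, while the fixed version reports `[[100, 50], [110, 40], [90, 60]]`.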