Tonight... We Lunch... In HELL!

I saw that coming a mile away. Which of course means that I have done the same exact thing.

My only excuse was that our mailing system required getting an email address to display and an email address to send the email to. I was testing the email address that would get displayed, and the email address that the email got sent to was jacked up.

Which is exactly why I never stick in that sort of code without a TODO comment, just like Holden's, and why I never check in code without doing a diff on the file and looking specifically for any TODO comments I may have added (unlike Holden).
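For what it's worth, the "diff and look for TODOs" step can be scripted. This is only a rough sketch: it assumes the diff is in unified format, and the function name, marker list, and sample diff are all made up for illustration.

```python
import re
from typing import List

# Markers to flag; this list is an assumption, adjust to your own habits.
TODO_PATTERN = re.compile(r"\b(TODO|FIXME|XXX)\b")


def added_todo_lines(diff_text: str) -> List[str]:
    """Return lines added by a unified diff that contain a TODO-style marker."""
    hits = []
    for line in diff_text.splitlines():
        # Added lines start with '+'; lines starting with '+++' are file headers.
        if line.startswith("+") and not line.startswith("+++"):
            if TODO_PATTERN.search(line):
                hits.append(line[1:].strip())
    return hits


# Hypothetical diff, loosely modeled on the filter described in the story.
sample_diff = """\
--- a/mailer.txt
+++ b/mailer.txt
@@ -10,3 +10,5 @@
 send_message(address, body)
+# TODO: remove this filter before check-in
+if address == "holden@example.com":
-old line
"""

for hit in added_todo_lines(sample_diff):
    print("Leftover marker:", hit)
```

It only catches hacks that were actually marked with a TODO, which is the whole point of the habit.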

Q: What was the need for the debugging statement to filter on the programmer's email address?

A: The programmer was testing with real customer data and didn't want clients to get emails from his testing.

I do not allow testing with production data (even a copy of it), and we always have test email accounts set up for QA. Debugging email is done by setting a constant like SEND_EMAIL = true|false and wrapping all email functions (there should only be ONE) with:

if (SEND_EMAIL) {
    message.send();
}

Then you can set up your QA environment to send or not send emails. And when you want to test the email functions, you don't have customers' addresses in the database.
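A slightly fuller sketch of the same idea, in Python. Reading the flag from an environment variable, the Message class, and the function names are all assumptions for illustration. The one thing it is meant to show is a single choke point that every outgoing email passes through.

```python
import os


def email_enabled(environ=os.environ) -> bool:
    """Read the hypothetical SEND_EMAIL setting; anything but 'true' means off."""
    return environ.get("SEND_EMAIL", "false").strip().lower() == "true"


class Message:
    """Stand-in for a real mail message; send() only records that it ran."""

    def __init__(self, to: str, body: str) -> None:
        self.to = to
        self.body = body
        self.sent = False

    def send(self) -> None:
        self.sent = True  # a real implementation would hand off to a mail server


def send_message(message: Message, environ=os.environ) -> bool:
    """The ONE place that sends email. Returns True if the message was sent."""
    if not email_enabled(environ):
        return False
    message.send()
    return True
```

Defaulting to "off" when the setting is missing is a deliberate choice here: a misconfigured QA box stays quiet rather than mailing people.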

I don't get exactly what has gone on here.
If the address is Holden's then send the message to Holden. That bit I get.
It does not say what happens in any real cases, i.e. when the address is not Holden's.
Since there was an issue with the email, we can infer that something was going wrong with it. But was it not going out at all, was it going to the wrong people, or were they getting empty or incomplete messages?

You're almost there, Freddy.
If the email address is Holden's then it sends email to Holden.
If the email address is not Holden's, then nothing happens because there was no ELSE statement. The system simply eats the event and continues like nothing happened.

The sendMessage() function was the only thing that invokes the sending of mail, so with the if statement he effectively stopped sending email to anyone other than himself (probably to stop customers from getting spurious test emails, as someone noted).
So nobody else got email anymore. The code snippet says as much: do not send email unless it goes to Holden.
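To make the missing ELSE concrete, here is a small Python sketch of the behavior described above. It is not the code from the article (which was pseudocode anyway); the address and the function name are invented for illustration.

```python
HOLDEN = "holden@example.com"  # hypothetical address


def send_message_with_leftover_filter(address, deliver):
    """Only delivers when the recipient is the developer's own address."""
    # Test filter that was meant to be temporary.
    if address == HOLDEN:
        deliver(address)
    # No else branch: every other address falls through and nothing is sent.


outbox = []
for recipient in [HOLDEN, "customer@example.com"]:
    send_message_with_leftover_filter(recipient, outbox.append)

print(outbox)  # only Holden's address ends up in the outbox
```

No error, no log entry, no exception. The call returns normally either way, which is why it looked like a mail server problem.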

Our customer has their own proprietary login validation server. That way their users have the same password for all their services. I had to write the code to interface with that.

Some problem came up that was only reproducible on the customer's site. The system would crash randomly when users were trying to login.
System wasn't live yet, so I employed the "friendly" debugging method, wherein I put hacked up code on their system trying to isolate the problem.

First things first, I need to see if it's actually my code that's breaking. Hardcode "return true;" at the top of the login function, so that every PIN is considered valid. If problem persists, it's not my problem.

Before deploying this fix, I happened to find the real bug, fixed that and dropped the fix on the customer's box. Tried logging in several times, each one successful. Hurray, it works. Check in code, pat self on back.

The next morning as I embarked on my 6-mile bike ride to work, I came to the realization that it only worked because I forgot to take out my hack. I've never pedaled that fast in my life. When I finally got to work, I looked at the code and made a few calls. Evidently I did remember to undo it. Hauled ass for nothing.

You might want to haul ass back home today, I think your oven is on. :-) I love it when I think I made a mistake and forgot something only to find out that I was more on the ball than I thought. I dread the day when I can hide my own easter eggs.

These kinds of events are common in shops that:
a) Do not have any sort of QA or secondary testing aside from unit testing.
b) Do not have proper infrastructure to allow unit testing and QA testing on a non-live, non-production environment.

The savings enjoyed by not having a dedicated test environment and test databases available are offset by the frequency of the above kinds of problems. They are amplified when peer review/unit testing isn't also afforded.

The WTF is still humorous. When cross-departmental grenades are lobbed, it's always a good thing. I'm sure for every development fubar, there is likely a similar number of IT/MIS fubars to allow for equal jabbing. My bet is J has had to buy lunch for a week as well if the company policies also extend to other departments. :)

Unfortunately, I've never heard of an SCM package that works on Pick. The closest I've come is a policy of copying changed programs into a backup directory before overwriting them in production.
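For anyone stuck with the same copy-to-backup policy, here is a rough Python sketch of it. It assumes the programs exist as ordinary files on disk, which may well not hold on a Pick system, and the function name and timestamp format are invented.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_then_replace(new_file: Path, live_file: Path, backup_dir: Path):
    """Copy the live program into backup_dir, then overwrite it with new_file.

    Returns the backup path, or None if there was no live file to back up.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup_path = None
    if live_file.exists():
        # Timestamp suffix so repeated deployments do not overwrite old backups
        # (two deployments within the same second would still collide).
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        backup_path = backup_dir / f"{live_file.name}.{stamp}"
        shutil.copy2(live_file, backup_path)
    shutil.copy2(new_file, live_file)
    return backup_path
```

It is no substitute for real source control (no history of who changed what or why), but it does make it possible to compare yesterday's program against today's.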

Looking at the code, that doesn't look like Pick. Pick uses * for comments, uses just one = to test equality, and the function calls aren't usually that clean. However, I also have not seen an SCM for Pick. Our company rolled its own, which does help a little.

Good for you, but that's not always an option. If part of your process is cleaning/validating external data, it's *very* hard and time-consuming to produce sufficient test cases to cover the same breadth of possibilities that a simple copy of the production data would.

I assume that you've also never been in the position of having to deploy a major database schema update. You'd better pray to whatever deity you believe in if you have a blanket policy against cloning the production environment. But then again, maybe unnecessary downtime is one of the things that you DO allow?

Not to mention if you or your company provides a UAT environment. I'm sure that the customers won't mind testing on irrelevant garbage data - right?

Agree 100%. Not being allowed to test on a copy of production data is asking for trouble.

More than one person has pointed out that the Real WTF is the lack of testing... I don't see how more testers would have helped. All the testers would have said, "it's broken."
Holden says, no it works. It works on my machine.

Holden might have found the problem using unit testing that used a different email address.

To me the "well this stopped working a week ago" would have been a HUGE light bulb turning on over my head.

Code reviews and QA aside, why didn't anyone just check the damn email server logs? It'd take about 3 nanoseconds to identify that no email had been sent to anyone other than Holden. And this went on for two MONTHS?!

Ugh... I know J & Holden (obviously not their real names, duh!). It seems that something got lost in the translation here. Magically the WTF servers changed days into months and obfuscated various other facts. Most important of all, to my knowledge, Holden has not bought J lunch ever. And for the record, I have never known J to "Enjoy the sweet, tangy nectar of vindication." If you are going to mash up someone's story, do it like a man, not a limp wristed fairy.

I am the original submitter of this article. I submitted it several months ago, and quite honestly, the only way I recognized it was because they used the same pseudonyms as I did in my original submission.

I want to say first that the framework of the story:
Coder leaves in email testing statement
Coder blames it on email server
Coder finds problem, is much chagrined at lunch
is true.

However, I was actually quite disappointed to see just how much creative license was taken with the original text. Basically all the dialogue is fabricated, as were a few other details. It's still a good story, just not exactly what happened.

Also, for the person that noted that the code wasn't from PICK, you're quite right. I am the network engineer, not a coder, and my "code snippet" I submitted was pseudocode and I claimed as much in my email to Alex. I do understand cleaning that section up for understandability, but for the nit-pickers (and I'm glad nit-pickers exist), that is not correct PICK code.

"Holden, I'm telling you, I've tested this every possible way that the email could fail. The mail server's configuration hasn't changed in over a month, and this only became a problem like a week ago. I'm assigning it back to your team; I'm sorry." Neither J nor Holden liked the tension that was building.

This should have been the clue to look at right there. Go back to just before the problem started - a bit more than a week ago - and check for changes to code there.

<Rant>
This is standard troubleshooting, whether you're coding, working on the network, or fixing cars even! So, "TheRealWTF" (I hate having to use this phrase, but it's sooo appropriate here) was not following the "outpoint" back to its source, starting with when the bug appeared as mentioned above.

"Outpoint": something that is there but should not be, or something that is not there that should be.
</Rant>

There, I feel better now. I hope the above proves useful to someone out there. ;)

The proper way to design such a system would have been to send all email to a single, centrally-configured email account. And then set up rules that, based on the subject line, relay the emails to the correct recipient. This has the added benefit of providing a backup and an audit trail for the emails as they will remain in the centrally-configured email account.

Dear God! I hope you are joking. If not, you should read your post again slowly and send me a copy of your resume so I don't accidentally hire you.

Pick is the real WTF. It's a database operating system from the '60s (like MUMPS, but for the Army), originally called GIRLS, written by Dick Pick, using ENGLISH as its query language. Also Cache is derived from Pick and MUMPS.

So you arrived at work - all sweaty and perspiring. Then sat there stinky and all for at least eight hours .... now that's disgusting.

"Top Cod3r" must be an accountant in disguise.

And as for the auditing - just code to BCC every mail sent to a dedicated mail box where all such mails are being kept for auditing and logging purposes if so required. Simple.
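A minimal sketch of the BCC approach using Python's standard email library. The audit mailbox address and the function name are assumptions for illustration.

```python
from email.message import EmailMessage

AUDIT_MAILBOX = "mail-audit@example.com"  # hypothetical dedicated mailbox


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Build an outgoing message that is always blind-copied to the audit mailbox."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Bcc"] = AUDIT_MAILBOX  # every message gets the same audit copy
    msg["Subject"] = subject
    msg.set_content(body)
    return msg
```

If the message is then handed to `smtplib.SMTP.send_message()` without an explicit recipient list, the Bcc address is included among the recipients but the header itself is not transmitted, so the customer never sees the audit address.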

If even more auditing is needed, send the mails with delivery and/or read notification options enabled.

If after that you want to be really fancy, then have an automated batch process check the above-mentioned dedicated mailbox every 10 minutes or so and integrate all received e-mails into the application database, linking them to the original (invoicing) business transaction. That way you can even run reports and similar research on customer rep response times, invoice emails without corresponding delivery and/or read receipts, and such.

I don't get exactly what has gone on here.
If the address is Holden's then send the message to Holden. That bit I get. It does not say what happens in any real cases, i.e. when the address is not Holden's.

It does nothing. Which is exactly what Holden wanted at the time, because he didn't want to annoy real customers with dozens of "fake" messages while he was testing the functionality.

Only thing is, he forgot to remove that line when he was done testing.