The Hardcode to Success

The second and third lines imply that Rodrigo would volunteer for the tasks that "mainly Justin" would want to do.
Calling it now from the second paragraph: the network connection goes down, and no message can go out indicating that it's down.
End of the article now, and while Justin strikes me as a jerk, he was correct to blame Rodrigo for that thoughtless approach. Still, since Rodrigo was "his" intern, he should have done a code review before pushing. But then we wouldn't have had a WTF today.

You don't tell the server to send you 99 bytes; the server tells you that it is sending you 99 bytes. No truncation should happen, because your application still gets the whole response. So that "offending line" has nothing to do with anything, unless this is about the server producing the status report, and not about the application that processes it.

I'm guessing that the real wtf here is that someone probably made up this whole story.

All on Justin, this. Rodrigo did what he was supposed to do, and offered his code up to be peer-reviewed, or as it happens manager-reviewed. Manager gives it the okay. As a result something doesn't happen that, er, already wasn't happening before this code was released.

So an intern learns a lesson about making assumptions and pushing untested code to production... haven't we all done that at some time or another? Besides, he didn't break anything; the network admins are really at fault for immediately accepting and trusting this new app to do their job for them. They should have continued doing whatever they had done before until the app had gone through some bake-in period. Making Rodrigo the scapegoat was just a jerk move all around.

Agreed. This means that Pushbullet would be expecting a complete JSON string, but it wouldn't actually get one, since the body was truncated. It likely wouldn't decode it properly and would not send the alert email.
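For what it's worth, here is a minimal sketch of the obvious alternative to a hard-coded length: derive Content-Length from the encoded body. The function name and the payload fields are made up for illustration, and I'm assuming the request body was JSON; none of this is from the article's actual code.

```python
import json


def build_request(payload):
    """Return (headers, body) with Content-Length computed from the body.

    Hypothetical helper for illustration only.
    """
    # Encode first: the header counts bytes, not characters.
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body)),
    }
    return headers, body


# A test message may happen to fit a fixed size; a real error report may not.
headers, body = build_request({"title": "Network down", "body": "x" * 300})
```

Most HTTP client libraries set this header for you, so the simplest fix is usually not to set it by hand at all.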

Most WTFs are written as "submitter is right, subject is wrong". Or, rarely, "submitter did a bad long ago". It's fresh to see one where two sides messed up in two different ways (in code and in person).

I thought that this site had a thing where they didn't post the mistakes of interns and students, since they aren't particularly expected to know better. I get that TRWTF was Justin not actually reviewing the code, then pushing an intern's code direct to prod, while IT trusted un-reviewed intern code to work perfectly first attempt.

That said, hard-coding the length is the title, and Rodrigo was also lax in not finding a way to actually test a real error condition, and the title implies that those are the WTFs.

Agreed. It was a rookie mistake and Justin should've paid more attention.

Reminds me of a boss I had once. The company ran reports against a mainframe via a process that practically exploded when said mainframe was in a maintenance window. Rather than fix the reporting process to fail gracefully, I was supposed to detect a maintenance window and not run the reports. Because of the spaghetti design, the check had to be in two places. I tested it, checked it in, and forgot about it. A week later, this mook comes yelling at me about how I screwed up the reports. Turns out that he didn't like how I did it so he deleted one check and moved/changed the other without telling me. The condition failed, the reports ran during a maintenance window, and errors exploded all over. When I pointed this out, his response was "you shouldn't have an answer for everything." Needless to say, my contract was cut short soon after.

Looking at the big picture, if you have a system that falls down and loses hours of production effort so regularly that you're piling (more) monitoring/notification tools on top, you have bigger problems than you're willing to admit. I've seen it over and over and over. People like Justin need to be let go so people who are actually interested in development can do something and learn from it.

Right, as mentioned above the network admins should have allowed the system to prove itself before they assumed it was working, and Justin should have paid more attention in the code review.

The real WTF is that Rodrigo was let go because of a rookie error. I guess the real lesson learned is that some people (e.g. Justin) and companies (wherever they were working) are nasty AF and he should look for a better internship.

The real failure was that Rodrigo did not specifically tell the manager that he never actually tested this in a failure condition. And the manager failed by not requiring Rodrigo to "show me how it works when a failure occurs". Then, the manager failed again by not running this in a staging environment for a while to make sure that it actually worked. Then the manager failed by not monitoring closely after putting it in production. And the manager failed by not notifying the network admins that even though there was a new monitoring tool, it is still their responsibility to monitor the network and report on downtime. And the manager failed by assuming that everything was hunky-dory and going out to get a beer with his buddies.

Sure, the intern failed to produce something that actually worked; but then relying on that tool without it being proven to work is entirely on the manager.

There is no such thing as a WTF by an intern. You should always assume that interns' code doesn't work and have it go through code review and QA - especially for a critical system and especially especially for monitoring for a critical system. This is entirely Justin's fault.

It definitely wouldn't decode a truncated JSON element properly. Gratuitous trailing whitespace aside, the only JSON that could remain parseable after being truncated to only 99 characters would be a ridiculously large/precise number. Everything else would have an orphaned opening delimiter.
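A quick sketch to back that up, using Python's standard json module. The payload shape is invented for the demo; only the 99-character cutoff comes from the story.

```python
import json


def parses(text):
    """Return True if text is a complete, valid JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical error report, comfortably longer than 99 characters.
report = json.dumps({"status": "error", "detail": "x" * 200})
truncated_report = report[:99]

# The one survivor: a very long number cut in the middle of its digits.
long_number = "3." + "1" * 200
truncated_number = long_number[:99]

print(parses(report))            # full object
print(parses(truncated_report))  # object cut off inside a string
print(parses(truncated_number))  # number cut off mid-digits
```

The truncated object fails on the unterminated string, while the truncated number still reads as a (shorter) number.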

TRWTF is an alerting system that only sends a message if an error is reported. It seriously bothers me how many people assume that "no reported errors" is the same as "Nothing is broken". Seriously, there are so many ways that something can break before the system can shout out that something is wrong.
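The usual answer is a heartbeat (a.k.a. dead man's switch): the monitored system checks in on a schedule, and silence itself triggers the alert. A rough sketch, with a class name and timeout that are purely illustrative and not from the article:

```python
import time


class HeartbeatMonitor:
    """Treats a missing check-in as a failure (hypothetical sketch)."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock  # injectable so it can be tested without sleeping
        self._last_beat = clock()

    def beat(self):
        """Called by the monitored system whenever it is alive and well."""
        self._last_beat = self._clock()

    def is_stale(self):
        """True when no heartbeat has arrived within the timeout."""
        return self._clock() - self._last_beat > self.timeout_seconds


monitor = HeartbeatMonitor(timeout_seconds=60)
monitor.beat()
if monitor.is_stale():
    print("No heartbeat received: assume something is broken")
```

The watcher has to run somewhere that does not share a failure mode with the thing it watches, otherwise you are back to "no news is good news".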

Yes, this does happen. I'm amazed by how many developers write code to implement a given feature and don't even try it. And I don't mean lack of proper thorough testing, I mean that they don't actually run the code they've just produced. It's almost universal in interns and new hires, but not totally uncommon among "seniors".

What kind of idiot would put software into production on a Friday? Wednesday evening is the cutoff, unless you're scheduling something to go in over the weekend and are paying people extra to hang around.