Good Programmers Are Good Writers

I’ve had a lot of people tell me they’d love to learn how to code, but that they wouldn’t be good at it because they aren’t good at math.

After a few times having this conversation, I started keeping track of how much math I use in my day-to-day work and spoiler alert: it’s not much.1 I sometimes use statistics and probability, but I certainly don’t use Linear Algebra or Differential Equations. And if you asked me to, I doubt I could without a textbook.

Instead of those high-level math courses, I really wish I had taken more classes on writing. It turns out the real world involves a lot of writing – emails, Slack messages, and meeting agendas – and if your job is programming, it involves a heck of a lot more than that. We just don’t think of it that way.

Source Code

Here’s the dirty secret. Writing code is just that: writing!

Sure, you’re writing instructions to a machine. Humans are much more forgiving conversation partners. But once you get used to the sometimes strange syntax, it’s amazing how closely it resembles the way we talk to each other.

There are nouns (variables) and verbs (functions) & they’re broken up into sentences (statements). Nouns may be described with adjectives (types or interfaces). And programmers often break independent thoughts into “paragraphs” separated by an empty line.

There’s a lot of information communicated there. When writing code in object oriented languages, we’re describing things and the ways they interact with each other. If you read that block of code aloud, it actually doesn’t sound too awkward:

Let Dave be a new user and set his email address to [email protected]. Then, let “welcome email” be a new email and send it to Dave.

It may not win points for eloquence, but it’s pretty easy to follow! This is invaluable for getting your bearings when going back to change some behavior, debug an issue, or onboard yourself into a new codebase.

Unlike us, the computer wouldn’t mind if I re-wrote the code like this:

That’d be awfully frustrating to work with, though, and I think that proves an important point: we write code for humans, more than for the machines that run it. Otherwise we’d all still be writing assembly! While the computer only reads those lines long enough to optimize them into the second form (and beyond), humans interact with the high-level code again and again. Whether through code review, refactoring, fixing bugs, or integrating with other systems - it’s almost guaranteed that code is read many more times than it’s written.

The parallels to written English don’t end once the computer gets involved. As part of parsing & running code, computers build abstract syntax trees to describe the relationship between words and symbols, and enforce grammatical rules for how they can be combined. Linguists do the same with phrase structure rules for how we can form valid English sentences.2

Of course, sometimes we still write code that doesn’t work. While both these tools provide some guardrails for effective communication, being grammatically correct doesn’t always mean you’ll make sense:

“Colorless green ideas sleep furiously.”

The linguist Noam Chomsky composed that sentence as an example of a phrase that makes sense grammatically but not semantically. For computer programs, this would be code that compiles but doesn’t do what you expect - a bug!

Comments

To help make sense of complicated logic, there’s another part of code that’s just for humans: comments. There’s a frustratingly widespread belief in programming circles that code should be self-documenting and therefore doesn’t need comments. But I think that’s missing the point.

The source code itself should explain what it does, but the comments explain why it does that. There’s often a lot of context that goes into a solution for a problem – business constraints, hacks to work around browser or framework quirks, or even just cutting a corner to meet a deadline. Comments are a way to leave a note to future readers so they know what you know.

Developers always complain about working on legacy code (and often, about how poorly written it is). I think an overwhelming reason for this is that context deteriorates over time - either the people working on a project change or the needs change and old solutions no longer make sense. Or the same people continue working on the same problem, but just forget things over time! I know I do.

Comments keep track of the things that aren’t important to the computer, but are essential for understanding why the computer is doing that task. In doing so, they preserve institutional memory which is crucially important to the long-term health of an organization.

Commit Messages & Pull Requests

A huge part of writing code is collaboration, but even if you’re working alone it’s helpful to keep a history of your work. This is where version control comes in. Unfortunately, commit messages often end up as an afterthought as deadlines or other frustrations ramp up the pressure:

With just a bit of discipline, however, commit messages become an amazing tool. It’s not unusual for a developer to look at a file and wonder why a change was made. The git commit log is a great tool to jump back in time and ask the author. Like code comments, they’re a way to maintain context.

The aggressively named git blame command allows you to trace when a particular line was last changed, which can often help explain its use now. And in GitHub, the commit that a change is introduced in links back to the containing pull request, adding even more valuable context.

Pull requests are where it all comes together. A pull request groups together code changes, comments, and commits into a neatly packaged solution to a problem. When reviewing a pull request, the diff tells you what changed but the comments and commit messages should tell you why.

To remind us of any other questions we should be answering, our team at DoSomething.org uses a pull request template. It’s a great reminder of the kinds of questions your teammates might have before handing code off for review.

One of the nice side-effects of writing more thoughtful commit messages and pull requests is that it forces you to stop and think about why you’re writing that line of code. Is this the most efficient way of solving this problem? Am I even solving the problem I set out to? By “making a case” for your code in your comments and commits, you’ll find you catch bugs and logic errors, and find solutions you may have missed had you been in more of a hurry.

It’s all communication.

There’s a myth of the “genius” that writes code so clever that nobody else on their team can understand it. I think that a truly great programmer writes code so clear that nobody can misunderstand it.

It’s often said that “you don’t really know it unless you can teach it”, and I think that’s especially true with programming. What’s simple to you may be complex to me, and vice versa. The most interesting (and challenging) programming work is taking something complex and making it seem simple.

Writing code is rarely about finding the most clever solution. It’s about finding the most understandable solution. The more code I write, the more I wish I’d gotten a degree in English.

Full disclosure, I write web apps. If you’re working on machine learning or computer graphics, you probably do need math. But you’re still writing code, so don’t close the tab just yet! ↩

I took a bunch of Psychology classes as electives and it’s fascinating how much overlap there is between Linguistics and compiler & type theory in Computer Science. ↩

Written on January 17, 2018.
Thanks to Matt Holford & Jason Wong for reviewing drafts of this post.