
Undebt: How We Refactored 3 Million Lines of Code

Evan H., Software Engineering Intern

Aug 23, 2016

Peter Seibel wrote that to maximize engineering effectiveness, “Let a thousand flowers bloom. Then rip 999 of them out by the roots.” For us, the flowers are code patterns: the myriad functions, classes, styles, and idioms that developers use when writing code. At first, new flowers are welcome. A new pattern may seem easier to use, more scalable, more efficient, or better suited to a particular task than the old one.

As a code base grows and the flowers proliferate, however, it becomes clear which patterns work and which don’t. Code patterns that were once beautiful new flowers become technical debt in need of removal. When that happens, it’s time to start ripping. Otherwise, since developers learn by reading (and occasionally copying and pasting) existing code, the bad flowers and the technical debt that comes with them will continue to grow unchecked.

Removing a bad code pattern by hand, especially in a code base as large as ours, is a Herculean task that drains developer time that could be better spent working on new features and shipping new code. That’s why other members of the Core Backend team and I built Undebt, an elegant, fast, reliable tool for performing large-scale, automated code refactoring.

Undebt works on code in any language. You define complex find-and-replace rules in standard, straightforward Python, then apply them quickly to an entire code base with a single command.

Since its inception at our most recent hackathon, Undebt has become a key tool for performing en masse code refactoring. Using it along with our open-source debt tracker, we can now efficiently monitor and remove technical debt before it becomes a serious problem.

Usage of a deprecated method going to zero with the help of Undebt

The graph above, generated using our open-source debt tracker, shows the usage of a particular deprecated method across the 3 million lines of Python that make up Yelp’s code base. The effect of Undebt can be seen near the end. We were able to completely remove two years of accumulated debt using a variant of the included method_to_function example. We have also made heavy use of the sqla_count example in refactoring our SQLAlchemy code to remove inefficient subqueries.

How it works

To use Undebt, you define a custom pattern that specifies what code pattern to look for and how to replace it. Undebt leverages the popular open-source tool pyparsing to make writing a pattern file as easy as writing plain Python. Undebt also comes loaded with tools and examples to make this process as painless as possible.
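To make the idea of a pattern concrete, here is a minimal sketch of a method-to-function rewrite: one piece describes what to look for, and one function builds the replacement. This is not Undebt's API. It uses only the standard library's re module instead of pyparsing, so it cannot handle nested parentheses or other real Python syntax that a pyparsing grammar can. The names old_method, new_function, PATTERN, replace, and refactor are hypothetical and chosen for illustration.

```python
import re

# Assumption for illustration: `old_method` is a deprecated method and
# `new_function` is the function that should replace it.
PATTERN = re.compile(
    r"(?P<obj>[A-Za-z_][A-Za-z0-9_.]*)"  # receiver, e.g. `user` or `self.cart`
    r"\.old_method\("                    # the deprecated method call
    r"(?P<args>[^()]*)"                  # flat argument list (no nested parens)
    r"\)"
)


def replace(match):
    """Build the replacement text for a single match of PATTERN."""
    obj = match.group("obj")
    args = match.group("args").strip()
    if args:
        return "new_function({}, {})".format(obj, args)
    return "new_function({})".format(obj)


def refactor(source):
    """Apply the find-and-replace rule to a whole source string."""
    return PATTERN.sub(replace, source)


example = "total = self.cart.old_method(tax, tip) + other.old_method()\n"
result = refactor(example)
print(result)
```

Keeping the "what to find" and "how to rewrite it" parts separate is what lets the same rule run unchanged over every file in a code base. A real grammar-based pattern would additionally cope with nested calls, strings, and comments, which this regex sketch deliberately ignores.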