Piotr is a hacker scientist: an assistant professor at Warsaw University of Technology and Cheap Science Officer at Rebased. Attempts to pry him from his mechanical keyboard end up in some random blabber and Warsaw Ruby Users Group meetings.

Give Codeship a try

Want to learn more?

Your Ruby code smells. And it’s okay — mine does, too! Maybe it has somelongmethods with too many parameters, a few classes that try to do too much, an uncommunicative name here or there. No codebase is perfect, but it’s worthwhile to be aware of its deficiencies and refactorings that could improve the state of things — so let’s look at a few example code smells, potential ways to fix them and a tool that helps find them in our applications.

Duplicate Method

One of the easiest to notice (and simplest to fix) code smells is duplicate method: a situation when a given method call is made two or more times in the same context.

Let’s assume we have a class that represents an office which has a method assessing whether the current weather helps concentrate on programming tasks. To assess the weather, the Office class collaborates with a weather source and asks it for current weather at the office’s city:

In the above implementation, the Office#weather_assessment method makes two separate WeatherSource#weather_at calls (probably resulting in firing synchronous requests over the network), each time passing the same argument. While there might be cases where this is warranted — the method’s results are volatile or explicitly calling it twice significantly improves readability — in the vast majority of cases making a single call and storing its result locally is a better solution:

Data Clump

Let’s create a few more methods for our Office class — this time ones that use a navigation source collaborator to calculate distance, directions, and whether one needs a passport to get to the office’s geographical coordinates given a starting latitude and longitude:

Notice how all of these methods take the same pair of parameters? This smell is called data clump, and — except for times when the arrangement is accidental — it usually means there is a higher-level object missing from the implementation. In this case such an object could be a Location struct representing the given pair of latitude/longitude coordinates:

Repeated Conditional

Let’s assume that (in our remote-friendly organization) we don’t exactly know the location of some of our offices. In this case we could be tempted to guard the NavSource calls with a conditional check:

class Office
def directions(source_location)
if @location
@nav_source.directions(from: source_location, to: @location)
end
end
def distance(source_location)
if @location
@nav_source.distance(from: source_location, to: @location)
end
end
def needs_passport?(source_location)
if @location
@nav_source.borders?(from: source_location, to: @location)
end
end
end

This smell is called repeated conditional – doing the same check over and over again suggests that there might be a better solution. Unless such a set of checks has exceptional readability, refactoring usually yields a better solution –but whether the refactoring should consist of introducing a separate LocatedOffice class or teaching the NavSource about handling NullLocations usually highly depends on the codebase in question.

Boolean/Control Parameter

Let’s now add a predicate that says whether the developers at the given Office program in Ruby — and let’s make it parameterizable so the caller can choose whether this should be a strict question (‘only Ruby developers’) or a loose one (‘Ruby developers among others’).

This kind of code is called the boolean parameter smell (a case of the more general control parameter smell); the caller knows exactly which code path the method will take and controls its execution from the outside. The typical refactorings are to either split the parameterized predicate into dedicated methods or to split the class into more classes, each of which would implement one of the code paths.

Let’s add an Office#good_fit? method that assesses whether a given office is a good fit for a given employee (passed in an argument); we consider the office a good fit when the employee is sociable or likes the city:

Notice how this method is much more interesting in interrogating its parameter; if it weren’t for the reference to the @city instance variable it would be a utility function — a method that operates solely on its arguments and can be moved anywhere in the system without any change to its body.

Refactoring feature envy smells is more involved, but the general notion is the same: There is a chance that this method would work better if it was implemented in the context of its parameter:

Finding Smells with Reek

Now that we know a few common code smells, how can we find them in our own Ruby codebases? This is where Reek comes in — Reek is a code smell detector that can analyze Ruby files and pinpoint the smells, and you can think about it as a RuboCop for your architecture and code quality.

Notice how Reek reports all of the code smells discussed above, as well as suggests that the original implementation of Office#ruby_developers? also smells of feature envy; this suggests that maybe the languages collection is the object that should implement the predicate. Reek also by default flags any classes and modules that lack code comments as irresponsible.

Reek can also be used on whole directories and comes with a configurable Rake task and a set of RSpec matchers, so it’s easy to integrate it into your tests or make your CI server flag any newly introduced code smells.

Configuring Reek

Reek’s opinions about what constitutes a code smell are by necessity very subjective; while it’s relatively easy to make blanket decisions and enforce a certain syntax style, opinions on whether a given piece of code is a smell and whether it should be refactored usually vary much more and often have to be contrasted with the readability and blunt obviousness of the original, ‘smelly’ versions.

Fortunately, this can be addressed via Reek’s high configurability — both on the project-wide level and when it comes to particular code pieces.

A project-wide (and, for a bit more fine-grained cases, directory-wide) configuration can be added by creating a YAML file ending with .reek. For example, the default configuration of the data clump smell is to aggressively flag even the smallest clumps as soon as they’re repeated:

DataClump:
max_copies: 2
min_clump_size: 2

Conversely, a method is considered to have a long parameter list when it has four or more of them — except for constructors, which can take up to five without being flagged:

LongParameterList:
max_params: 3
overrides:
initialize:
max_params: 5

Likewise, a method is considered to have too many statements when it has more than five of them — except for constructors, which are a bit harder to split in practice:

TooManyStatements:
max_statements: 5
exclude:
- initialize

The exclude setting can be tuned as needed — from just a single case of AClass#some_method to a regular expression matching a set of methods, like /meth/.

At the other end of the configuration applicability any method can be decorated with a set of magic comments, altering Reek’s flagging on a per-method basis — e.g., when we’re okay with a method call being made twice in a particular case:

Such a comment will both have a higher chance to be moved along with the method when it’s refactored and keeps guarding the implementation from adding a third call (at which point the method gets smelly again).

Whither Now?

Now that you know how to find — and, potentially, refactor — smells in your code it’s super important to remember that smells do not necessarily mean that the code is bad; to quote the venerable Content Creation Wiki:

Note that a code smell is a hint that something might be wrong, not a certainty. A perfectly good idiom may be considered a code smell because it’s often misused, or because there’s a simpler alternative that works in most cases. Calling something a code smell is not an attack; it’s simply a sign that a closer look is warranted.

While it might be tempting to go all out and try to fix each and every code smell until Reek no longer points anything out, this is not the point. To quote the Wiki again:

Rules are for the guidance of the wise and the obedience of fools. There are two major approaches to programming:

Pragmatic: code smells should be considered on a case by case basis

Purist: all code smells should be avoided, no exceptions

…and therefore a code smell is a hint of possible bad practice to a pragmatist, but a sure sign of bad practice to a purist.

I highly recommend investigating your code smells and refactoring the ones that are indeed problematic — but also learning how to accept the ones that are not; both experiences will let you level up as a developer. You can read about more code smells (and potential refactorings) in Reek’s docs.

Subscribe via Email

Over 60,000 people from companies like Netflix, Apple, Spotify and O'Reilly are reading our articles. Subscribe to receive a weekly newsletter with articles around Continuous Integration, Docker, and software development best practices.

We promise that we won't spam you. You can unsubscribe any time.

Join the Discussion

Leave us some comments on what you think about this topic or if you like to add something.

I found best usage of Reek and RuboCop by integrating them with editor. In my case atom, which, has great linter package that check your code on fly and gives you immediate warnings. I feel this is much better than getting warning from CI.

Greg Navis

I also prefer a shorter feedback cycle. I want to keep me Vim config simple so what I do instead is a git pre-commit hook that runs rubocop, reek, etc.

Hey, apologies for getting back to you so late and thanks so much for your patience!

I think RuboCop and Reek are quite different – Reek, rather than concentrating on the syntax, tries to point out potential architectural problems in a codebase on an OOP level. Reek’s results are much less rules and much more suggestions where to look for potential issues, and can’t really be auto-corrected (but rather require – often quite extensive – refactoring).

There are relatively few areas covered by both RuboCop and Reek; RuboCop is more of a syntax style enforcer, while Reek is an architectural tool. There are some common areas of interest –e.g., RuboCop tries to eradicate long methods because they’re a readability issue, while Reek points them out as potential code smells that mix the level of abstraction – but in general the best approach is to figure out what are acceptable thresholds for a given project (e.g., maximum number of lines in a method or maximum number of instance variables) and adjust them as necessary.

In general, RuboCop is much more opinionated because it usually deals with much cleaner choices – ‘this kind of syntax should be indented this many spaces’ or ‘single quotes where possible vs double quotes everywhere’ are relatively easily made choices that should indeed be uniformly enforced across the whole codebase if possible. Reek seldom deals in absolutes; there are often cases where a given piece of logic *technically* should live closer to class A, but *in practice* is much more useful when put in class B.

My general rule of thumb for RuboCop is ‘try to please it as much as possible and only reconfigure it on the most blatant disagreements’, while my rule of thumb for Reek is ‘this are fascinating suggestions, let me look at the smells one-at-a-time and I’ll probably whitelist quite a few of them – and that’s perfectly OK’.