Why Rust's ownership/borrowing is hard

Working with pure functions is simple: you pass arguments, you get a result, and no side effects happen. If, on the other hand, a function does have side effects, like mutating its arguments or global objects, it's harder to reason about. But we've got used to those too: if you see something like player.set_speed(5) you can be reasonably certain that it's going to mutate the player object in a predictable way (and maybe send some signals somewhere, too).

Simple example

Nothing in the experience of most programmers would prepare them for point suddenly ceasing to work after being passed to is_origin()! The compiler won't let you use it on the next line. This is the side effect I'm talking about: something has happened to the argument, but it's not the kind of thing you've seen in other languages.

Here it happens because point gets moved (instead of being copied) into the function, so the function becomes responsible for destroying it, and the compiler prevents you from using it after that point. The way to fix it is to either pass the argument by reference or to teach it how to copy itself. It makes total sense once you've learned about "move by default". But these things tend to jump out at you in a seemingly random fashion while you're doing some innocent refactoring or, say, adding logging.
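The move and both fixes can be sketched roughly like this. It is a reconstruction under assumptions, not the original listing: the fields of Point and the names is_origin_ref, CopyPoint and is_origin_copy are made up for illustration.

```rust
// Hypothetical reconstruction: field names and helper names are assumptions.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

// Takes the point by value, so the caller's variable is moved into the call.
fn is_origin(point: Point) -> bool {
    point.x == 0 && point.y == 0
}

// Fix 1: borrow the point instead of taking ownership of it.
fn is_origin_ref(point: &Point) -> bool {
    point.x == 0 && point.y == 0
}

// Fix 2: a type that has been taught how to copy itself.
#[derive(Debug, Clone, Copy)]
struct CopyPoint {
    x: i32,
    y: i32,
}

fn is_origin_copy(point: CopyPoint) -> bool {
    point.x == 0 && point.y == 0
}

fn main() {
    let point = Point { x: 0, y: 0 };
    assert!(is_origin_ref(&point)); // only borrowed, `point` is still usable
    assert!(is_origin(point)); // `point` is moved into the function here
    // println!("{:?}", point); // rejected: borrow of moved value `point`

    let p = CopyPoint { x: 1, y: 0 };
    assert!(!is_origin_copy(p)); // copied, not moved
    println!("{:?}", p); // still usable after the call
}
```

Uncommenting the println! on point should reproduce the surprise described above.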

Complicated example

Consider a parser that takes some bits of data from an underlying lexer and maintains some state:

The seemingly unnecessary consume_lexeme() is just a convenience wrapper around a somewhat longer string of calls that I have in the actual code.

lexer.next() returns a self-sufficient lexeme by copying data from the lexer's internal buffer. Now, we want to optimize it so that lexemes only hold references into that data and avoid copying. We change the method declaration to:

pub fn next<'a>(&'a mut self) -> Lexeme<'a>

The 'a thingy effectively says that the lifetime of a lexeme is now tied to the lifetime of the lexer reference on which we call .next(). It can't live all by itself but depends on data in the lexer's buffer. The 'a just spells it out explicitly here.
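To make the tie between a lexeme and the lexer concrete, here is a minimal sketch. The actual Lexer and Lexeme are not shown in this post, so the buffer and pos fields, the text field and the whitespace splitting are all assumptions; only the signature of next() is taken from above.

```rust
// Hypothetical lexer: the fields and the whitespace splitting are assumptions.
struct Lexer {
    buffer: String,
    pos: usize,
}

// A lexeme that borrows its text from the lexer's buffer instead of copying it.
#[derive(Debug, PartialEq)]
struct Lexeme<'a> {
    text: &'a str,
}

impl Lexer {
    fn new(input: &str) -> Self {
        Lexer { buffer: input.to_string(), pos: 0 }
    }

    // The returned lexeme keeps the `&'a mut self` borrow alive while it is in use.
    pub fn next<'a>(&'a mut self) -> Lexeme<'a> {
        let rest = &self.buffer[self.pos..];
        // Skip leading whitespace, then read up to the next whitespace or the end.
        let start = self.pos + (rest.len() - rest.trim_start().len());
        let end = self.buffer[start..]
            .find(char::is_whitespace)
            .map_or(self.buffer.len(), |i| start + i);
        self.pos = end;
        Lexeme { text: &self.buffer[start..end] }
    }
}

fn main() {
    let mut lexer = Lexer::new("let x");
    let first = lexer.next();
    assert_eq!(first.text, "let");
    // Calling lexer.next() again is fine here only because `first` is not
    // used afterwards; keeping `first` alive past this point would be rejected.
    let second = lexer.next();
    assert_eq!(second.text, "x");
}
```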

In plain English, Rust tells us that as long as we have lexeme available in this block of code it won't let us change self.state — a different part of the parser. And this does not make any sense whatsoever!

The culprit here is the consume_lexeme() helper. Although it only actually needs self.lexer, to the compiler we say that it takes a reference to the entire parser (self). And because it's a mutable reference, the compiler won't let anyone else touch any part of the parser lest they change the data that lexeme currently depends on.
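The difference between borrowing the whole parser and borrowing one field can be sketched as follows. This is not the real parser: the field types, the trivial next() and the step_via_helper/step_direct methods are assumptions made up to show the shape of the problem.

```rust
// Hypothetical reconstruction of the shape described above.
struct Lexer {
    buffer: String,
}

struct Lexeme<'a> {
    text: &'a str,
}

impl Lexer {
    // Stand-in for the real lexer: returns the trimmed buffer as one lexeme.
    fn next<'a>(&'a mut self) -> Lexeme<'a> {
        Lexeme { text: self.buffer.trim() }
    }
}

struct Parser {
    lexer: Lexer,
    state: usize,
}

impl Parser {
    // Borrows the *whole* parser mutably for as long as the lexeme is in use.
    fn consume_lexeme(&mut self) -> Lexeme<'_> {
        self.lexer.next()
    }

    fn step_via_helper(&mut self) -> usize {
        let lexeme = self.consume_lexeme();
        // self.state += 1; // rejected here: `*self` is still mutably borrowed
        let len = lexeme.text.len();
        self.state += 1; // accepted only after the last use of `lexeme`
        len
    }

    fn step_direct(&mut self) -> usize {
        let lexeme = self.lexer.next(); // borrows only the `lexer` field
        self.state += 1; // a different field, so the compiler accepts it
        lexeme.text.len()
    }
}

fn main() {
    let mut parser = Parser { lexer: Lexer { buffer: " abc ".to_string() }, state: 0 };
    assert_eq!(parser.step_via_helper(), 3);
    assert_eq!(parser.step_direct(), 3);
    assert_eq!(parser.state, 2);
}
```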

So here we have this nasty side effect again: though we didn't change actual types in the function signature and the code is still sound and should work correctly, a different ownership dynamic suddenly doesn't let it compile anymore.

Even though I understood the problem in general it took me no less than two days until it all finally clicked and the fix became obvious.

Rusty fix

Changing consume_lexeme() to accept a reference to just the lexer instead of the whole parser fixed the problem, but the code looked a bit non-idiomatic, having changed from dot-method notation into a plain function call:
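A minimal sketch of that plain-function version, with the same made-up stand-ins for the lexer and the parser state (none of these definitions come from the actual code):

```rust
// Hypothetical stand-ins for the real lexer and parser.
struct Lexer {
    buffer: String,
}

struct Lexeme<'a> {
    text: &'a str,
}

impl Lexer {
    fn next<'a>(&'a mut self) -> Lexeme<'a> {
        Lexeme { text: self.buffer.trim() }
    }
}

// A plain function: it borrows only the lexer, not the parser around it.
fn consume_lexeme(lexer: &mut Lexer) -> Lexeme<'_> {
    lexer.next()
}

struct Parser {
    lexer: Lexer,
    state: usize,
}

impl Parser {
    fn step(&mut self) -> usize {
        let lexeme = consume_lexeme(&mut self.lexer); // function-call notation
        self.state += 1; // allowed: only `self.lexer` is borrowed
        lexeme.text.len()
    }
}

fn main() {
    let mut parser = Parser { lexer: Lexer { buffer: " abc ".to_string() }, state: 0 };
    assert_eq!(parser.step(), 3);
    assert_eq!(parser.state, 1);
}
```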

Luckily, Rust actually makes it possible to have it the right way, too. Since in Rust the definition of data fields (struct) is separate from the definition of methods (impl), I can define my own local methods for any struct, even if it's imported from a different module of the same crate (for a type that comes from another crate, this takes an extension trait instead):
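Roughly, that looks like the sketch below. The module names lexer and parser and everything inside them are assumptions for illustration; the point is only that the second impl block for Lexer lives next to the parser code.

```rust
mod lexer {
    // Hypothetical lexer module; the real one is not shown in the post.
    pub struct Lexer {
        pub buffer: String,
    }

    pub struct Lexeme<'a> {
        pub text: &'a str,
    }

    impl Lexer {
        pub fn next<'a>(&'a mut self) -> Lexeme<'a> {
            Lexeme { text: self.buffer.trim() }
        }
    }
}

mod parser {
    use super::lexer::{Lexeme, Lexer};

    // A second `impl` block for `Lexer`, local to the parser module.
    // This is allowed because both modules belong to the same crate.
    impl Lexer {
        pub(crate) fn consume_lexeme(&mut self) -> Lexeme<'_> {
            self.next()
        }
    }

    pub struct Parser {
        pub lexer: Lexer,
        pub state: usize,
    }

    impl Parser {
        pub fn step(&mut self) -> usize {
            let lexeme = self.lexer.consume_lexeme(); // dot-method notation again
            self.state += 1; // still only `self.lexer` is borrowed
            lexeme.text.len()
        }
    }
}

fn main() {
    let mut p = parser::Parser {
        lexer: lexer::Lexer { buffer: " abc ".to_string() },
        state: 0,
    };
    assert_eq!(p.step(), 3);
    assert_eq!(p.state, 1);
}
```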

Rust's borrow checker is a wonderful thing that forces you into designing code to be more robust. But as it is so unlike anything you're used to, it takes time to develop a certain knack for working with it efficiently.

Comments: 9

Thank you so much for writing this! I've banged my head against this wall several times without finding a good solution!

Chris Winn

For what it's worth, I never once thought you were anything but a native English speaker. Nice post.

Jack O'Connor

Did you mean for the original version of next() to return a lexeme? It looks like you might've?

Ivan Sagalaev

Did you mean for the original version of next() to return a lexeme?

The original version of Lexer::next() is not shown and Parser::next() actually returns parser events. I just fixed this bit, thanks!

Juarez

I have the impression that Rust is adding unnecessary complexity by adopting "move by default" for non-raw types.
The programmer has the extra burden of 'boxing' references everywhere.
In my view, it seems natural to think of:
a) 'copy by default' for raw types
b) 'reference by default' for complex types (structs, traits, etc...) in synchronous methods
c) 'move by default' for complex types in asynchronous methods, and case by case.
What am I missing?

Ralf

Rust's ownership/borrowing system is hard because it creates a whole new class of side effects.

Notice however that what you call an "effect" here is actually very, very different from the "effects" people usually mean when they say "side-effect". All this business about ownership and moving is purely a compile-time notion, it does not change at all what your code does. Hence it doesn't make reasoning about your code's behavior any harder, since the behavior is not changed. Actually, it makes that reasoning much simpler (provided the program passes the borrow checker) because the side-effects are now much more controlled.

So, maybe it helps to think about the borrow checker like that: As you said, reasoning about effects is hard. This applies in particular to reasoning about unrestricted effects like C++ has them, where there could be aliases to all sorts of data pretty much everywhere. This can get hard to keep an overview of (just think of how many programs out there will use a pointer into a std::vector that has since then grown or shrunk). The borrowing and ownership business is not a new side effect, it is about restricting the existing side-effects: If you own something or have a mutable reference (which is necessarily unique), you know there are no surprising (non-local) side-effects to this object because nobody else can have an alias. By this I mean that calling some function on some other data will never, magically, mutate the data you own. If you have a shared reference, you know there are no side-effects because nobody is allowed to perform mutation of the data. (With the exception of interior mutability, which allows mutation in very controlled circumstances - with RefCell, you can pretty much end up in the same mess of uncontrolled side-effects as with C++, but at least there is run-time checking.)

When the compiler tells you that data is moved and you cannot use it, that's not a new side-effect. That's "just" the compiler understanding the side-effects we all know about, so that it can make sure they are kept under control. In C++, if you pass the Point above to some other function, the compiler makes a shallow copy, and if Point contains pointers, that can result in a mess. Now, Point in particular is safe to copy around, but you have to be explicit here and tell the compiler that you want it to be treated as such:

#[derive(Copy,Clone)] struct Point { ... }

You may wonder why the compiler does not figure this out automatically. It could, just like it does for Send. The problem here is interface stability: If you write a library that exports a type that just happens to be Copy, then the library is bound to always keep this type Copy in the future. It should be a conscious choice of the library author to guarantee that this type is and always will be Copy, hence the explicit annotation.

Jean-Pierre

Seconding Martin here: this was very, very helpful.

Jan

I think your use of "diverged" in "Since in Rust definition of data fields (struct) is diverged from definition of methods (impl)" is not correct. I would use "divorced" or "separate". And you're missing "the" before both uses of "definition". But I know that's difficult for eastern-europeans.

Note I'm not a native English speaker either, so take this with a grain of salt.

Ivan Sagalaev

I think your use of "diverged" in "Since in Rust definition of data fields (struct) is diverged from definition of methods (impl)" is not correct. I would use "divorced" or "separate". And you're missing "the" before both uses of "definition". But I know that's difficult for eastern-europeans.