Explore the ownership system in Rust

Jan 19, 2015

Updated for Rust 1.0.

This guide is for a reader who knows the basic syntax and
building blocks of Rust but does not quite grasp how
ownership works.

We will start very simply and then gradually increase the
complexity, exploring and discussing every new bit
of detail. This guide assumes a very
basic familiarity with the let, fn, struct, trait and
impl constructs.

Our goal is to learn how to write a new Rust program
and not hit any walls related to ownership.

Consider an integer i that is created in main and passed to a function foo.
It will “die” twice. First, at the end of foo,
and then at the end of main. If you modify it in foo,
it will not affect the value in main.

The value gets copied at the call of foo(i).
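
Here is a minimal sketch of the kind of program this describes. Only the names foo and i come from the text; the body of foo and its return value are assumptions added so the effect of the copy is visible.

```rust
// Hypothetical body for `foo`: it works on its own copy of the integer.
fn foo(mut i: i64) -> i64 {
    i += 1; // changes only the copy owned by `foo`
    i       // this copy "dies" when `foo` returns
}

fn main() {
    let i: i64 = 1;
    let changed = foo(i); // the value is copied at the call
    println!("in main: {}, returned by foo: {}", i, changed);
    // prints "in main: 1, returned by foo: 2"
    // the original `i` "dies" here, at the end of `main`
}
```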

In Rust, like in C++ (and some other languages), it is possible to use
your own type instead of an integer. The value will be allocated on the current
stack and it will be destroyed (the destructor will be called) when it goes
out of scope.

However, the Rust compiler follows different ownership rules unless
the type implements the Copy trait. Therefore we need to talk about the Copy
trait first, and get it out of the way.

Copy Trait

The Copy trait makes your type behave in a very familiar way:
the bits will be copied to another location when assigned, or when
used as a function argument. Exactly like a built-in integer.

For example, this simple struct is copyable because it derives Copy:

    #[derive(Copy, Clone)]
    struct Info {
        value: i64,
    }

Note that we had to tell the compiler that it is Copy - otherwise
it would always be moved to another
location and would follow the ownership rules.
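
To see what that means in practice, here is a small sketch that assigns and passes an Info value and keeps using the original afterwards. The bump function and the extra Debug and PartialEq derives are assumptions added for illustration; they are not part of the original struct.

```rust
// Debug and PartialEq are added here only so the values can be printed and compared.
#[derive(Copy, Clone, Debug, PartialEq)]
struct Info {
    value: i64,
}

// Hypothetical helper: takes `Info` by value and changes its own copy.
fn bump(mut info: Info) -> Info {
    info.value += 1;
    info
}

fn main() {
    let a = Info { value: 1 };
    let b = a;       // the bits are copied, `a` stays usable
    let c = bump(a); // copied again at the call
    println!("{:?} {:?} {:?}", a, b, c);
}
```

Removing Copy and Clone from the derive list should turn both uses of a after the first assignment into compile errors, which is the behaviour the rest of this guide is about.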

But we are actually interested in ownership, so from now on we will
concentrate on non-Copy types!

Ownership

Ownership rules ensure that at any point, for a single non-copyable
value, there is only one owner that can change it.

Therefore, if a function is responsible for deleting this value,
it can be sure that there are no other users that will try to
access, change or delete it in the future.

Let’s see some examples!

Say hello to Bob, our brave new dummy structure

To demonstrate how the data is moving around, we will create
a new struct and call it Bob.
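
A minimal sketch of such a struct follows. It is reconstructed from the output and error messages shown in this guide, so treat the details as assumptions: the name field, the wording of the printed messages, and the body of the black_hole function are illustrative.

```rust
use std::fmt;

struct Bob {
    name: String,
}

impl Bob {
    // Announce construction so we can follow the value's life.
    fn new(name: &str) -> Bob {
        println!("new bob {:?}", name);
        Bob { name: name.to_string() }
    }
}

impl Drop for Bob {
    // Called automatically when the value is destroyed.
    fn drop(&mut self) {
        println!("del bob {:?}", self.name);
    }
}

impl fmt::Debug for Bob {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "bob {:?}", self.name)
    }
}

// Takes ownership of `bob`; the value is destroyed when the function returns.
fn black_hole(bob: Bob) {
    println!("imminent shrinkage {:?}", bob);
}

fn main() {
    let bob = Bob::new("A");
    black_hole(bob);
    // black_hole(bob); // uncommenting this fails: use of moved value `bob`
}
```

Bob is deliberately not Copy, so passing it to black_hole moves it. Calling black_hole with the same variable a second time is rejected by the compiler with a "use of moved value" error.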

<anon>:33:16: 33:19 error: use of moved value: `bob`
<anon>:33     black_hole(bob);
                         ^~~
<anon>:32:16: 32:19 note: `bob` moved here because it has type `Bob`, which is non-copyable
<anon>:32     black_hole(bob);
                         ^~~

Simple! The compiler makes sure that we cannot use moved values,
and explains nicely what happened.

There is no Magic - just some rules

To implement “memory safety without garbage collection”, the compiler
does not need to go chasing your values around the code. It can
decide what is destroyed in a function simply by looking
at the function body.

You can easily do that too, if you know the rules. So far, we saw
a few of them:

Unused return values are destroyed.

All values bound with let are destroyed at the end of the
scope, unless they are moved.

Here you go, memory safety based on the fact that there can only be
a single owner of a value.
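
Both rules can be checked with a small sketch. The LIVE counter and the live helper below are hypothetical additions used only to observe how many Bob values exist at a given moment; the Bob type itself is a simplified stand-in.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many `Bob` values are currently alive (illustration only).
static LIVE: AtomicUsize = AtomicUsize::new(0);

struct Bob {
    name: String,
}

impl Bob {
    fn new(name: &str) -> Bob {
        LIVE.fetch_add(1, Ordering::SeqCst);
        println!("new bob {:?}", name);
        Bob { name: name.to_string() }
    }
}

impl Drop for Bob {
    fn drop(&mut self) {
        LIVE.fetch_sub(1, Ordering::SeqCst);
        println!("del bob {:?}", self.name);
    }
}

fn live() -> usize {
    LIVE.load(Ordering::SeqCst)
}

// Takes ownership, so the value is destroyed when this function returns.
fn black_hole(_bob: Bob) {}

fn main() {
    Bob::new("unused"); // unused return value: destroyed right away
    assert_eq!(live(), 0);

    {
        let _kept = Bob::new("scoped");
        let moved = Bob::new("moved");
        assert_eq!(live(), 2);

        black_hole(moved); // moved out: destroyed inside black_hole
        assert_eq!(live(), 1);
    } // `_kept` was not moved, so it is destroyed at the end of this scope
    assert_eq!(live(), 0);
}
```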

However, so far we have talked only about immutable let bindings -
the rules get slightly more complicated when the value
can be changed.

Mutable Ownership

Any owned value can be mutated: we just need to put it into a
mut slot with let. For example, we can mutate some
part of bob, like its name:

    fn main() {
        let mut bob = Bob::new("A");
        bob.name = "mutant".to_string();
    }

new bob "A"
del bob "mutant"

We created it with name “A”, but deleted a “mutant”.

If we give this value to another function, mutate, we can also
assign it to a mut slot there:
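
A minimal sketch of such a function, assuming a simplified Bob with a name field and a destructor that prints; the exact original listing may have differed.

```rust
struct Bob {
    name: String,
}

impl Bob {
    fn new(name: &str) -> Bob {
        println!("new bob {:?}", name);
        Bob { name: name.to_string() }
    }
}

impl Drop for Bob {
    fn drop(&mut self) {
        println!("del bob {:?}", self.name);
    }
}

// `value` arrives in an immutable slot...
fn mutate(value: Bob) {
    // ...and is moved into a `mut` slot, which makes it changeable here.
    let mut bob = value;
    bob.name = "mutant".to_string();
} // `bob` is destroyed here, with its new name

fn main() {
    let bob = Bob::new("A");
    mutate(bob); // ownership moves into `mutate`
}
```

Mutability is a property of the slot, not of the value: main holds bob immutably, yet the new owner is free to change it.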

Useful to know: function arguments can also be upgraded to mutable,
because they are also bindable slots that work the same way as a let slot.
So the function from the previous example can be shortened:

    fn mutate(mut value: Bob) { // use mut directly before the arg name
        value.name = "mutant".to_string();
    }

Replacing a value in a mutable slot

What happens if we try to overwrite a value in some mut slot? Let’s see:

    fn main() {
        let mut bob = Bob::new("A");
        println!(""); // skip line to make output nicer

        // First overwrite using name "B", and then "C"
        for &name in &["B", "C"] {
            println!("before overwrite");
            bob = Bob::new(name);
            println!("after overwrite");
            println!(""); // skip line
        }
    }

The old value gets deleted. The newly assigned value will be deleted
at the end of scope - unless it is moved or overwritten again.

Mutable Ownership rules

So, there is one additional rule, for the mutable slots:

Unused return values are destroyed.

All values bound with let are destroyed at the end of the
scope, unless they are moved.

Replaced values are destroyed.

Kind of obvious. The point is, in Rust, we are sure
nothing else owns or references them - so it is possible to
do that.

The power of the Ownership system

These ownership rules might seem a tad limiting at first, but
only because we are used to a different set of rules. They
do not limit what is actually possible, they simply give us a
different foundation for building higher-level constructions.

Some of these constructions are way harder to make safe in other
languages. Even if they are made safe, they do not necessarily
provide compile-time safety guarantees.

We will now look at some of them, available in the standard library.

Memory Allocation

So far we have talked about integer-like values that live on the stack.
Our test dummy Bob was such a value. While some popular languages can also
keep values only on the stack (struct in C#, or
value instantiation without new in C++), many do not.

Instead, a newly constructed object instance (in many languages - with a new
operator) is created in what is called the heap memory.

The heap memory has some advantages. First, it is not limited by a stack size.
Placing a huge structure on the stack might simply overflow it.
Second, its memory location does not change, unlike the location of a stack
value. Every time a stack-allocated value is moved or copied, the actual
bits need to be copied from one place of the stack to another.
While it is very efficient for a small structure
(the values are always “nearby”), it can become slower if the structure
grows bigger.

Box solves this by moving our created value to the heap, while
keeping only a small pointer to the heap location on the stack.

For example, we can create our Bob in the heap memory like this:

    fn main() {
        let bob = Box::new(Bob::new("A"));
    }

new bob "A"
del bob "A"

The type of value bob returned from Box::new is Box<Bob>.
This generic type makes the Bob lifecycle managed by this Box<Bob>
wrapper and deleted when the Box is deleted.

Box is not copyable, and follows the same ownership rules discussed
previously. When it reached the end of life on the stack, its destructor drop
was called, which subsequently called the drop on the Bob, as well
as cleaned up the memory on the heap.
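
The following sketch shows the same ownership rules applied to a Box. It assumes a simplified Bob and a hypothetical black_hole function that accepts Box<Bob>; neither is the original listing.

```rust
use std::mem::size_of;

struct Bob {
    name: String,
}

impl Bob {
    fn new(name: &str) -> Bob {
        println!("new bob {:?}", name);
        Bob { name: name.to_string() }
    }
}

impl Drop for Bob {
    fn drop(&mut self) {
        println!("del bob {:?}", self.name);
    }
}

// Takes ownership of the box; dropping it drops Bob and frees the heap memory.
fn black_hole(bob: Box<Bob>) {
    println!("imminent shrinkage {:?}", bob.name);
}

fn main() {
    let bob: Box<Bob> = Box::new(Bob::new("A"));

    // The handle on the stack is just a pointer, whatever the size of Bob.
    println!("Box<Bob> takes {} bytes on the stack", size_of::<Box<Bob>>());

    // Box dereferences to its content, so fields are reachable directly.
    println!("name is {:?}", bob.name);

    black_hole(bob); // moves the pointer; the Bob on the heap is not copied
    // `bob` can no longer be used here, exactly as with a plain Bob
}
```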

The triviality of this implementation is a big deal. If we compare this
to the solutions in other languages, they mostly do one of two things.
They either leave it up to you to clean up the memory (with some horrible
delete statement someone will forget or call twice), or rely on
garbage collection to track memory pointers and
clean up memory when those pointers are no longer referenced.

In Rust, ownership tracking has no runtime penalty and is ensured to be
correct at compile-time. This simple memory deallocation over Box
builds directly on ownership tracking, is small, safe and quite often
sufficient.

When it is not sufficient, there are other tools that can help with that.

Reference Counting

Rust has enough low-level tools for reference counting to be implemented as
a library. It can be used in the rare cases when a value has several owners,
so its end of life cannot be determined statically at compile-time.

Rust has a better name for it: shared ownership.
The std::rc library provides a way to share ownership of the
same value between different Rc handles. The value remains alive
as long as there is at least one handle for it.

For example, we can make a bob instance managed by an Rc handle this way:
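
A minimal sketch, assuming the simplified Bob used in this guide (a name field plus a destructor that prints):

```rust
use std::rc::Rc;

struct Bob {
    name: String,
}

impl Bob {
    fn new(name: &str) -> Bob {
        println!("new bob {:?}", name);
        Bob { name: name.to_string() }
    }
}

impl Drop for Bob {
    fn drop(&mut self) {
        println!("del bob {:?}", self.name);
    }
}

fn main() {
    // Rc::new takes ownership of the Bob and returns a handle to it.
    let bob = Rc::new(Bob::new("A"));

    // Like Box, the handle dereferences to its content.
    println!("handle points at bob {:?}", bob.name);
} // the only handle goes away here, so Bob is destroyed
```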

We can change our black_hole function to accept Rc<Bob> and check whether
bob is destroyed by it. But it would be more convenient to make it
accept any type T that implements the Debug trait (so we can print it).
We are going to make it generic:
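
One way to write it is sketched below. The message text is an assumption, and Debug is derived on this stand-in Bob for brevity, so the printed format is Rust's default rather than a hand-written one.

```rust
use std::fmt::Debug;
use std::rc::Rc;

#[derive(Debug)] // derived for brevity in this sketch
struct Bob {
    name: String,
}

impl Drop for Bob {
    fn drop(&mut self) {
        println!("del bob {:?}", self.name);
    }
}

// Accepts anything printable, takes ownership of it, and lets it die.
fn black_hole<T: Debug>(value: T) {
    println!("imminent shrinkage {:?}", value);
}

fn main() {
    let bob = Rc::new(Bob { name: "A".to_string() });

    black_hole(bob.clone()); // only a clone of the handle is consumed
    println!("still alive: {:?}", bob);

    black_hole(bob); // the last handle is consumed, so Bob is dropped here
    black_hole(12);  // works for any Debug type
}
```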

Once wrapped by an Rc handle, bob will live as long as there is a live Rc clone
somewhere. The Rc handle internally uses Box to place the new value in heap
memory, together with a reference count (RC).

Every time a new handle clone is created (by calling clone on Rc), the RC
is increased, and when a handle reaches the end of its life, the RC is
decreased. When the RC reaches zero, the object itself is dropped and its
memory is deallocated.
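
The count can be observed with Rc::strong_count. The sketch below uses a stand-in Bob with a printing destructor; the variable names are illustrative.

```rust
use std::rc::Rc;

struct Bob {
    name: String,
}

impl Drop for Bob {
    fn drop(&mut self) {
        println!("del bob {:?}", self.name);
    }
}

fn main() {
    let first = Rc::new(Bob { name: "A".to_string() });
    println!("handles: {}", Rc::strong_count(&first)); // 1

    let second = first.clone(); // new handle, same Bob; the count goes up
    println!("handles: {}", Rc::strong_count(&first)); // 2

    drop(second); // a handle reaches the end of its life; the count goes down
    println!("handles: {}", Rc::strong_count(&first)); // 1
} // the last handle goes away here, so Bob is dropped and its memory freed
```

Note that clone on an Rc copies only the handle; the Bob itself is never duplicated.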

Note that the Rc above is not mutable. If the contents of Bob need to be
mutated, it can be additionally wrapped in the RefCell type, which allows a
mutable borrow of a reference to our single bob instance. In the following
example it will be mutated in the mutate function.
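
A minimal sketch of that combination, assuming a stand-in Bob with only a name field and a hypothetical mutate function that receives a clone of the handle:

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Bob {
    name: String,
}

// Receives a handle clone and changes the shared Bob through a mutable borrow.
fn mutate(bob: Rc<RefCell<Bob>>) {
    bob.borrow_mut().name = "mutant".to_string();
}

fn main() {
    let bob = Rc::new(RefCell::new(Bob { name: "A".to_string() }));

    mutate(bob.clone());

    // The change is visible through the original handle.
    println!("name is now {:?}", bob.borrow().name); // name is now "mutant"
}
```

RefCell checks the borrowing rules at runtime instead of at compile-time: asking for borrow_mut while another borrow is still active panics.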

RefCell is used to provide what is called interior mutability.
It is just one of the tools in the Rust toolbox to solve a specific problem.

So, the point is: different low-level utilities in Rust can be combined
to achieve precisely what is needed with minimal overhead.

For example, Rc can only be used within a single thread. But there is an
Arc type, with an atomic RC, that is usable between threads. A
mutable Rc might create cycles when multiple objects reference each other.
However, an Rc can be downgraded into a Weak reference, which does not keep
the value alive and so can be used to break such cycles. More information can
be found in the official documentation.
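
Both tools can be tried out with a short sketch. The function names weak_demo and arc_demo are hypothetical, and a plain String stands in for Bob to keep the example small.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

// Returns what a Weak handle sees before and after the last Rc is dropped.
fn weak_demo() -> (Option<String>, Option<String>) {
    let strong = Rc::new("A".to_string());
    let weak = Rc::downgrade(&strong); // does not keep the value alive

    let before = weak.upgrade().map(|rc| (*rc).clone());
    drop(strong); // last strong handle gone: the value is dropped
    let after = weak.upgrade().map(|rc| (*rc).clone());

    (before, after)
}

// Shares one value with another thread through an atomically counted handle.
fn arc_demo() -> usize {
    let shared = Arc::new("A".to_string());
    let for_thread = Arc::clone(&shared);

    let child = thread::spawn(move || for_thread.len());

    child.join().unwrap() + shared.len()
}

fn main() {
    println!("{:?}", weak_demo()); // (Some("A"), None)
    println!("{}", arc_demo());    // 2
}
```

Replacing Arc with Rc in arc_demo should be rejected at compile-time, because Rc is not allowed to cross a thread boundary.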

Most importantly, more advanced memory management mechanisms can (and will)
be implemented later, and they can be done as libraries.

Concurrency

It is interesting to see how Rust changes the way we work with threads.
The default mode here is no data races. It is not because there are some
special safety walls around threads, no. With Rust you can build
your own threading library with similar safety properties, simply
because the ownership model is in itself thread-safe.

Consider what happens when we send two values into a new Rust thread, a
Bob (movable) and an integer (copyable):

    use std::thread;

    fn main() {
        let bob = Bob::new("A");
        let i: i64 = 12;

        let child = thread::spawn(move || {
            println!("From thread, {:?} and {:?}!", bob, i);
        });

        println!("waiting for thread to end");
        child.join().unwrap(); // join returns a Result; fail loudly if the thread panicked
    }

new bob "A"
waiting for thread to end
From thread, bob "A" and 12!
del bob "A"

What is happening there? First, we create two values:
bob and i. Then we create a new thread with thread::spawn
and pass a closure for it to execute. This closure is going to
capture our variables bob and i.

Capturing means different things for bob and i. Because Bob is
non-Copy, bob will be moved to the new thread. The i will be copied
there. While the thread is running, we can modify the original i
(if needed). It does not influence the copy that was passed to the thread.
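
The copy can be demonstrated with a small sketch. The run function and its returned pair are assumptions made for illustration.

```rust
use std::thread;

// Returns (value left in the spawning thread, value seen by the child thread).
fn run() -> (i64, i64) {
    let mut i: i64 = 12;

    let child = thread::spawn(move || {
        // `i` is Copy, so the closure works on its own copy.
        i
    });

    i += 1; // changes only the original
    let seen_by_thread = child.join().unwrap();

    (i, seen_by_thread)
}

fn main() {
    let (in_main, in_thread) = run();
    println!("main has {}, thread saw {}", in_main, in_thread);
    // prints "main has 13, thread saw 12"
}
```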

Bob, however, is now owned by this new thread, and cannot be modified unless
the thread returns it somehow. If we wanted, we could
return it to the main thread through child.join() (join waits for
the thread to finish).

    use std::thread;

    fn main() {
        let mut bob = Bob::new("A");

        let child = thread::spawn(move || {
            mutate(&mut bob);
            bob // hand the value back as the thread's result
        });

        println!("waiting for thread to end");

        if let Ok(bob) = child.join() {
            println!("{:?}", bob);
        }
    }

    fn mutate(bob: &mut Bob) {
        bob.name = "mutant".to_string();
    }

One could say that this does not change much about the way we used to work
with threads - we already know not to share the same memory location
between threads without some kind of synchronisation. The difference
here is that Rust can enforce these rules at compile-time.