A blog about programming, and tiny ways to improve it.

We’ve recently been doing a lot of work on Rust’s orphan rules,
which are an important part of our system for guaranteeing trait
coherence. The idea of trait coherence is that, given a trait and
some set of types for its type parameters, there should be exactly one
impl that applies. So if we think of the trait Show, we want to
guarantee that if we have a trait reference like MyType : Show, we
can uniquely identify a particular impl. (The alternative to coherence
is to have some way for users to identify which impls are in scope at
any time. It has its own complications; if you’re curious for
more background on why we use coherence, you might find this
rust-dev thread from a while back to be interesting
reading.)

The role of the orphan rules in particular is basically to prevent
you from implementing external traits for external types. So
continuing our simple example of Show, if you are defining your own
library, you could not implement Show for Vec<T>, because both
Show and Vec are defined in the standard library. But you can
implement Show for MyType, because you defined MyType. However,
if you define your own trait MyTrait, then you can implement
MyTrait for any type you like, including external types like
Vec<T>. To this end, the orphan rule intuitively says “either the
trait must be local or the self-type must be local”.
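
To make the rule concrete, here is a minimal sketch. I am using today’s standard library names (Display stands in for the post’s Show, which is my substitution), but the pattern of what is and is not allowed is exactly what the paragraph above describes:

use std::fmt;

struct MyType;

// OK: the trait (Display) is external, but the self-type (MyType) is local.
impl fmt::Display for MyType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "MyType")
    }
}

// OK: the trait (MyTrait) is local, so the self-type may be external.
trait MyTrait {}
impl<T> MyTrait for Vec<T> {}

// Rejected by the orphan rule: both Display and Vec are defined in the
// standard library, so neither the trait nor the self-type is local here.
// impl<T> fmt::Display for Vec<T> { ... }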

More precisely, the orphan rules are targeting the case of two
“cousin” crates. By cousins I mean that the crates share a common
ancestor (i.e., they link to a common library crate). This would be
libstd, if nothing else. That ancestor defines some trait. Both of the
crates are implementing this common trait using their own local types
(and possibly types from ancestor crates, which may or may not be in
common). But neither crate is an ancestor of the other; if one were,
the problem would be much easier, because the descendant crate can see the
impls from the ancestor crate.

When we extended the trait system to support multidispatch, I confess that I originally didn’t give the
orphan rules much thought. It seemed like it would be straightforward
to adapt them. Boy was I wrong! (And, I think, our original rules were
kind of unsound to begin with.)

The purpose of this post is to lay out the current state of my
thinking on these rules. It sketches out a number of variations and
possible rules and tries to elaborate on the limitations of each
one. It is intended to serve as the seed for a discussion in the
Rust discussion forums.

The so-called “unboxed closure” implementation in Rust has reached the
point where it is time to start using it in the standard library. As
a starting point, I have a
pull request that removes proc from the language. I started
on this because I thought it’d be easier than replacing closures, but
it turns out that there are a few subtle points to this transition.

I am writing this blog post to explain what changes are in store and
give guidance on how people can port existing code to stop using
proc. This post is basically targeted at Rust devs who want to adapt
existing code, though it also covers the closure design in general.

To some extent, the advice in this post is a snapshot of the current
Rust master. Some of it is specifically targeting temporary
limitations in the compiler that we aim to lift by 1.0 or shortly
thereafter. I have tried to mention when that is the case.
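
As a rough illustration of the kind of change involved, here is a modern sketch of my own, using Box<dyn FnOnce()> and a move closure as the nearest present-day equivalent of the old proc type (that equivalence is my framing, not a quote from the post):

// Old (pre-1.0) style, no longer accepted:
//     fn spawn_task(body: proc()) { ... }
//     spawn_task(proc() { println!("{}", msg); });

// New style: a boxed, once-callable closure that takes ownership of the
// variables it captures.
fn spawn_task(body: Box<dyn FnOnce()>) {
    body();
}

fn main() {
    let msg = String::from("hello");
    spawn_task(Box::new(move || println!("{}", msg)));
}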

There has been a lot of discussion lately about Rust’s allocator
story, and in particular our relationship to jemalloc. I’ve been
trying to catch up, and I wanted to try and summarize my understanding
and explain for others what is going on. I am trying to be as
factually precise in this post as possible. If you see a factual
error, please do not hesitate to let me know.

I’ve been working on a branch that implements both multidispatch
(selecting the impl for a trait based on more than one input type) and
conditional dispatch (selecting the impl for a trait based on where
clauses). I wound up taking a direction that is slightly different
from what is described in the trait reform RFC, and I
wanted to take a chance to explain what I did and why. The main
difference is that in the branch we move away from the crate
concatenability property in exchange for better inference and less
complexity.
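
To pin down the two terms with a small, self-contained sketch (the traits and impls below are my own illustrations in today’s syntax, not code from the branch):

// Multidispatch: which impl applies depends on more than one input type.
// Here both the Self type (u32) and the Rhs parameter participate in
// impl selection.
trait Combine<Rhs> {
    type Output;
    fn combine(self, rhs: Rhs) -> Self::Output;
}

impl Combine<u32> for u32 {
    type Output = u32;
    fn combine(self, rhs: u32) -> u32 { self + rhs }
}

impl<'a> Combine<&'a str> for u32 {
    type Output = String;
    fn combine(self, rhs: &'a str) -> String { format!("{}{}", self, rhs) }
}

// Conditional dispatch: an impl that only applies where its where-clauses
// hold, so selecting it means reasoning about those clauses.
trait Describe {
    fn describe(&self) -> String;
}

impl<T> Describe for Vec<T>
where
    T: std::fmt::Debug,
{
    fn describe(&self) -> String { format!("{:?}", self) }
}

fn main() {
    assert_eq!(2u32.combine(3u32), 5);
    assert_eq!(2u32.combine("x"), "2x".to_string());
    assert_eq!(vec![1, 2].describe(), "[1, 2]");
}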

A few weeks back pcwalton introduced a PR that aimed to move the
attribute and macro syntax to use a leading @ sigil. This means that
one would write macros like:

@format("SomeString: {}", 22)

or

@vec[1, 2, 3]

One would write attributes in the same way:

@deriving(Eq)
struct SomeStruct {
}
@inline
fn foo() { ... }

This proposal was controversial. This debate has been sitting for a
week or so. I spent some time last week reading every single comment
and I wanted to lay out my current thoughts.

Why change it?

There were basically two motivations for introducing the change.

Free the bang. The first was to “free up” the ! sign. The
initial motivation was aturon’s error-handling RFC, but I think that
even if we decide not to act on that specific proposal, it’s still
worth trying to reserve ! and ? for something related to
error-handling. We are very limited in the set of characters we can
realistically use for syntactic sugar, and ! and ? are valuable
“ASCII real-estate”.

Part of the reason for this is that ! has a long history of being
the sigil one uses to indicate something dangerous or
surprising. Basically, something you should pay extra attention
to. This is partly why we chose it for macros, but in truth macros are
not dangerous. They can be mildly surprising, in that they don’t
necessarily act like regular syntax, but having a distinguished macro
invocation syntax already serves the job of alerting you to that
possibility. Once you know what a macro does, it ought to just fade
into the background.

Decorators and macros. Another strong motivation for me is that I
think attributes and macros are two sides of the same coin and thus
should use similar syntax. Perhaps the most popular attribute –
deriving – is literally nothing more than a macro. The only
difference is that its “input” is the type definition to which it is
attached (there are some differences in the implementation side
presently – e.g., deriving is based off the AST – but as I discuss
below I’d like to erase that distinction eventually). That said, right
now attributes and macros live in rather distinct worlds, so I think a
lot of people view this claim with skepticism. So allow me to expand
on what I mean.

How attributes and macros ought to move closer together

Right now attributes and macros are quite distinct, but looking
forward I see them moving much closer together over time. Here are
some of the various ways.

Attributes taking token trees. Right now attribute syntax is kind
of specialized. Eventually I think we’ll want to generalize it so that
attributes can take arbitrary token trees as arguments, much like
macros operate on token trees (if you’re not familiar with token
trees, see the appendix). Using token trees would allow more complex
arguments to deriving and other decorators. For example, it’d be great
to be able to say:

@deriving(Encodable(EncoderTypeName<foo>))

where EncoderTypeName<foo> is the name of the specific encoder that
you wish to derive an impl for, vs today, where deriving always
creates an encodable impl that works for all encoders. (See
Issue #3740 for more details.) Token trees seem like the
obvious syntax to permit here.

Macros in decorator position. Eventually, I’d like it to be possible
for any macro to be attached to an item definition as a decorator. The
basic idea is that @foo(abc) struct Bar { ... } would be syntactic
sugar for (something like) @foo((abc), (struct Bar { ... }))
(presuming foo is a macro).

An aside: it occurs to me that to make this possible before 1.0 as I
envisioned it, we’ll need to at least reserve macro names so they
cannot be used as attributes. It might also be better to have macros
declare whether or not they want to be usable as decorators, just so
we can give better error messages. This has some bearing on the
“disadvantages” of the @ syntax discussed below, as well.

Using macros in decorator position would be useful for those cases
where the macro is conceptually “modifying” a base fn
definition. There are numerous examples: memoization, some kind of
generator expansion, more complex variations on deriving or
pretty-printing, and so on. A specific example from the past was the
externfn! wrapper that would both declare an extern "C" function
and some sort of Rust wrapper (I don’t recall precisely why). It was
used roughly like so:

externfn! {
    fn foo(...) { ... }
}

Clearly, this would be nicer if one wrote it as:

@extern
fn foo(...) { ... }
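
The “macro that modifies a base fn definition” pattern is already expressible today with an item-taking macro-rules macro; decorator position would essentially be sugar over it. A sketch of my own (the with_logging macro is made up for illustration):

// Wraps a fn definition so that it logs on entry before running its body.
macro_rules! with_logging {
    (fn $name:ident() $body:block) => {
        fn $name() {
            println!("entering {}", stringify!($name));
            $body
        }
    };
}

with_logging! {
    fn foo() {
        println!("in foo");
    }
}

fn main() {
    foo();
}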

Token trees as the interface to rule them all. Although the idea
of permitting macros to appear in attribute position seems to largely
erase the distinction between today’s “decorators”, “syntax
extensions”, and “macros”, there remains the niggly detail of the
implementation. Let’s just look at deriving as an example: today,
deriving is a transform from one AST node to some number of AST
nodes. Basically it takes the AST node for a type definition and emits
that same node back along with various nodes for auto-generated impls.
This is completely different from a macro-rules macro, which operates
only on token trees. The plan has always been to move deriving out
of the compiler proper and make it “just another” syntax extension
that happens to be defined in the standard library (the same applies
to other standard macros like format and so on).

In order to move deriving out of the compiler, though, the interface
will have to change from ASTs to token trees. There are two reasons
for this. The first is that we are simply not prepared to standardize
the Rust compiler’s AST in any public way (and have no near term plans
to do so). The second is that ASTs are insufficiently general. We
have syntax extensions that accept all kinds of inputs, not just Rust
ASTs.

Note that syntax extensions, like deriving, that wish to accept Rust
ASTs can easily use a Rust parser to parse the token tree they are
given as input. This could be a cleaned up version of the libsyntax
library that rustc itself uses, or a third-party parser module
(think Esprima for JS). Using separate libraries is advantageous for
many reasons. For one thing, it allows other styles of parser
libraries to be created (including, for example, versions that support
an extensible grammar). It also allows syntax extensions to pin to an
older version of the library if necessary, allowing for more
independent evolution of all the components involved.

What are the objections?

There were two main objections. The first is aesthetic: some felt that
the @ sigil is heavyweight or otherwise less attractive than the
existing ! syntax. The second is that there is an inherent ambiguity,
since @id() can serve as both an attribute and a macro.

The first point seems to be a matter of taste. I don’t find @
particularly heavyweight, and I think that choosing a suitable color
for the emacs/vim modes will probably help quite a bit in making it
unobtrusive. In contrast, I think that ! has a strong connotation
of “dangerous” which seems inappropriate for most macros. But neither
syntax seems particularly egregious: I think we’ll quickly get used to
either one.

The second point regarding potential ambiguities is more
interesting. The ambiguities are easy to resolve from a technical
perspective, but that does not mean that they won’t be confusing to
users.

Parenthesized macro invocations

The first ambiguity is that @foo() can be interpreted as either an
attribute or a macro invocation. The observation is that @foo() as a
macro invocation should behave like existing syntax, which means that
either it should behave like a method call (in a fn body) or a tuple
struct (at the top-level). In both cases, it would have to be followed
by a “terminator” token: either a ; or a closing delimiter (one of
), ], or }). Therefore, we can simply peek at the next token to
decide how to interpret @foo() when we see it.
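
A toy version of that one-token lookahead might look like the following (the token names and the helper function are mine, purely for illustration; this is not rustc’s parser):

#[allow(dead_code)] // not every variant is exercised in this tiny demo
enum Tok { Semi, CloseParen, CloseBracket, CloseBrace, Ident }

// After seeing `@foo()`, peek at the next token: if it is a terminator
// (a `;` or a closing delimiter), treat `@foo()` as a macro invocation;
// otherwise treat it as an attribute on whatever comes next.
fn is_macro_invocation(next: Option<&Tok>) -> bool {
    matches!(
        next,
        Some(Tok::Semi | Tok::CloseParen | Tok::CloseBracket | Tok::CloseBrace)
    )
}

fn main() {
    assert!(is_macro_invocation(Some(&Tok::Semi)));   // e.g. `@foo();`
    assert!(!is_macro_invocation(Some(&Tok::Ident))); // e.g. `@foo() struct S;`
}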

I believe that, using this disambiguation rule, almost all existing
code would continue to parse correctly if it were mass-converted to
use @foo in place of the older syntax. The one exception is
top-level macro invocations. Today it is common to write something
like:

declaremethods!(foo, bar)
struct SomeUnrelatedStruct { ... }

where declaremethods! expands out to a set of method declarations or
something similar.

If you just transliterate this to @, then the macro would be parsed
as a decorator applied to the struct that follows:

@declaremethods(foo, bar)
struct SomeUnrelatedStruct { ... }

To preserve the macro interpretation, the invocation has to be
terminated explicitly, either with a semicolon (@declaremethods(foo, bar);)
or by switching to curly braces (@declaremethods { foo, bar }).

Note that both of these are more consistent with our syntax in
general: tuple structs, for example, are always followed by a ; to
terminate them. (If you replace @declaremethods(foo, bar) with
struct Struct1(foo, bar), then you can see what I mean.) However,
today if you fail to include the semicolon, you get a parser error,
whereas here you might get a surprising misapplication of the macro.

Macro invocations with square or curly brackets

Until recently, attributes could only be applied to items. However,
recent RFCs have proposed extending attributes so that they can be
applied to blocks and expressions. These RFCs introduce additional
ambiguities for macro invocations based on [] and {}:

@foo{...} could be a macro invocation or an annotation @foo
applied to the block {...},

@foo[...] could be a macro invocation or an annotation @foo
applied to the expression [...].

These ambiguities can be resolved by requiring inner attributes for
blocks and expressions. Hence, rather than @cold x + y, one would
write (@!cold x) + y. I actually prefer this in general, because it
makes the precedence clear.

OK, so what are the options?

Using @ for attributes is popular. It is the use with macros that is
controversial. Therefore, as I see it, there are three things on the
table:

Use @foo for attributes, keep foo! for macros (status quo-ish).

Use @foo for both attributes and macros (the proposal).

Use @[foo] for attributes and @foo for macros (a compromise).

Option 1 is roughly the status quo, but moving from #[foo] to @foo
for attributes (this seemed to be universally popular). The obvious
downside is that we lose ! forever and we also miss an opportunity
to unify attribute and macro syntax. We can still adopt the model
where decorators and macros are interoperable, but it will be a little
more strange, since they look very different.

The advantages of Option 2 are what I’ve been talking about this whole
time. The most significant disadvantage is that adding a semicolon can
change the interpretation of @foo() in a surprising way,
particularly at the top-level.

Option 3 offers most of the advantages of Option 2, while retaining a
clear syntactic distinction between attributes and macro usage. The
main downside is that @deriving(Eq) and @inline follow the
precedent of other languages more closely and arguably look cleaner
than @[deriving(Eq)] and @[inline].

What to do?

Currently I personally lean towards options 2 or 3. I am not happy
with Option 1 both because I think we should reserve ! and because I
think we should move attributes and macros closer together, both in
syntax and in deeper semantics.

Choosing between options 2 and 3 is difficult. It seems to boil down
to whether you feel the potential ambiguities of @foo() outweigh the
attractiveness of @inline vs @[inline]. I don’t personally have a
strong feeling on this particular question. It’s hard to say how
confusing the ambiguities will be in practice. I would be happier if
placing or failing to place a semicolon at the right spot yielded a
hard error.

So I guess I would summarize my current feeling as being happy with
either Option 2, but with the proviso that it is an error to use a
macro in decorator position unless it explicitly opts in, or Option 3,
without that proviso. This seems to retain all the upsides and avoid
the confusing ambiguities.

Appendix: A brief explanation of token trees

Token trees are the basis for our macro-rules macros. They are a
variation on token streams in which tokens are basically uninterpreted
except that matching delimiters ((), [], {}) are paired up. A
macro-rules macro is then “just” a translation from one token tree to
another token tree. This output token tree is then parsed as
normal. Similarly, our parser is actually not defined over a stream
of tokens but rather a token tree.
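
As a tiny illustration of “uninterpreted except for delimiter matching” (my own example, not from the post), a macro-rules macro can accept any sequence of token trees and, say, count them without ever parsing them as Rust:

// `tt` matches exactly one token tree: a single token, or a whole
// delimited group like (...), [...], {...} treated as one unit.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($rest:tt)*) => { 1usize + count_tts!($($rest)*) };
}

fn main() {
    // Five token trees: `a`, `+`, `b`, `(c, d)`, and `[e]`.
    assert_eq!(count_tts!(a + b (c, d) [e]), 5);
}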

Our current implementation deviates from this ideal model in some
respects. For one thing, macros take as input token trees with
embedded ASTs, and the parser parses a stream of tokens with embedded
token trees, rather than token trees themselves, but these details are
not particularly relevant to this post. I also suspect we ought to
move the implementation closer to the ideal model over time, but
that’s the subject of another post.

While on vacation, I’ve been working on an alternate type inference
scheme for rustc. (Actually, I got it 99% working on the plane, and
have been slowly poking at it ever since.) This scheme simplifies the
code of the type inferencer dramatically and (I think) helps to meet
our intuitions (as I will explain). It is however somewhat less
flexible than the existing inference scheme, though all of rustc and
all the libraries compile without any changes. The scheme will (I
believe) make it much simpler to implement proper one-way matching
for traits (explained later).

Note: Changing the type inference scheme doesn’t really mean much to
end users. Roughly the same set of Rust code still compiles. So this
post is really mostly of interest to rustc implementors.

The new scheme in a nutshell

The new scheme is fairly simple. It is based on the observation that
most subtyping in Rust arises from lifetimes (though the scheme is
extensible to other possible kinds of subtyping, e.g. virtual
structs). It abandons unification and the H-M infrastructure and takes
a different approach: when a type variable V is first related to
some type T, we don’t set the value of V to T directly. Instead,
we say that V is equal to some type U where U is derived by
replacing all lifetimes in T with lifetime variables. We then relate
T and U appropriately.

Let me give an example. Here are two variables whose type must be
inferred:

'a: { // 'a --> name of block's lifetime
    let x = 3;
    let y = &x;
    ...
}

Let’s say that the type of x is $X and the type of y is $Y,
where $X and $Y are both inference variables. In that case, the
first assignment generates the constraint that int <: $X and the
second generates the constraint that &'a $X <: $Y. To resolve the
first constraint, we would set $X directly to int. This is because
there are no lifetimes in the type int. To resolve the second
constraint, we would set $Y to &'0 int – here '0 represents a
fresh lifetime variable. We would then say that &'a int <: &'0 int,
which in turn implies that '0 <= 'a. After lifetime inference is
complete, the types of x and y would be int and &'a int as
expected.
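
The core “replace every lifetime with a fresh variable” step is simple enough to sketch directly. This is a toy model of my own, not rustc’s actual representation:

// A toy type grammar: int, references, and inference variables.
#[allow(dead_code)] // the type-variable case is included for completeness
#[derive(Debug)]
enum Ty {
    Int,
    Ref(Lifetime, Box<Ty>), // &'r T
    Infer(u32),             // a type variable like $X
}

#[derive(Debug)]
enum Lifetime {
    Named(&'static str), // e.g. 'a
    Infer(u32),          // a lifetime variable like '0
}

struct InferCx {
    next_lifetime_var: u32,
}

impl InferCx {
    fn fresh_lifetime(&mut self) -> Lifetime {
        let lt = Lifetime::Infer(self.next_lifetime_var);
        self.next_lifetime_var += 1;
        lt
    }

    // Derive U from T by replacing every lifetime with a fresh variable.
    // The caller then relates T and U (for the example above, that is
    // where the constraint '0 <= 'a gets recorded).
    fn generalize(&mut self, t: &Ty) -> Ty {
        match t {
            Ty::Int => Ty::Int,
            Ty::Infer(v) => Ty::Infer(*v),
            Ty::Ref(_, inner) => {
                Ty::Ref(self.fresh_lifetime(), Box::new(self.generalize(inner)))
            }
        }
    }
}

fn main() {
    let mut cx = InferCx { next_lifetime_var: 0 };
    let t = Ty::Ref(Lifetime::Named("a"), Box::new(Ty::Int)); // &'a int
    let u = cx.generalize(&t);                                // &'0 int
    println!("{:?} generalizes to {:?}", t, u);
}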

Without unification, you might wonder what happens when two type
variables are related that have not yet been associated with any
concrete type. This is actually somewhat challenging to engineer, but
it certainly does happen. For example, there might be some code like:

let mut x;        // x: $X, not yet known
let mut y = None; // y: Option<$0>, where $0 is not yet known
...
x = y.unwrap();

Here, at the point where we process x = y.unwrap(), we do not yet
know the values of either $X or $0. We can say that the type of
y.unwrap() will be $0 but we must now process the constraint that
$0 <: $X. We do this by simply keeping a list of outstanding
constraints. So neither $0 nor $X would (yet) be assigned a
specific type, but we’d remember that they were related. Then, later,
when either $0 or $X is set to some specific type T, we can go
ahead and instantiate the other with U, where U is again derived
from T by replacing all lifetimes with lifetime variables. Then we
can relate T and U appropriately.

If we wanted to extend the scheme to handle more kinds of inference
beyond lifetimes, it can be done by adding new kinds of inference
variables. For example, if we wanted to support subtyping between
structs, we might add struct variables.

I am on vacation for a few weeks. I wanted to take some time to jot
down an idea that’s been bouncing around in my head. I plan to submit
an RFC at some point on this topic, but not yet, so I thought I’d
start out by writing a blog post. Also, my poor blog has been
neglected for some time. Consider this a draft RFC. Some important
details about references are omitted and will come in a follow-up blog
post.

The high-level summary of the idea is that we will take advantage of
bounds declared in type declarations to avoid repetition in fn and
impl declarations.
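
A small example of the repetition in question, written in today’s Rust (this is my own illustration of the problem the idea targets; the proposed elision appears only in comments, since it is not current syntax):

use std::hash::Hash;

// The bound is declared once, on the type...
struct Set<K: Hash + Eq> {
    elems: Vec<K>,
}

// ...but today it must be repeated on every impl and fn that uses the type:
impl<K: Hash + Eq> Set<K> {
    fn contains(&self, k: &K) -> bool {
        self.elems.iter().any(|e| e == k)
    }
}

fn is_member<K: Hash + Eq>(set: &Set<K>, k: &K) -> bool {
    set.contains(k)
}

// Under the idea summarized above, the repeated Hash + Eq bounds could be
// elided, since Set<K> already requires them:
//
//     impl<K> Set<K> { ... }
//     fn is_member<K>(set: &Set<K>, k: &K) -> bool { ... }

fn main() {
    let s = Set { elems: vec![1, 2, 3] };
    assert!(is_member(&s, &2));
}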