Saturday, January 10, 2015

Part of the sales pitch for Rust is that it's "as
bare metal as C".[1] Rust can do anything C can do, run anywhere C
can run,[2] with code that's just as efficient, and at least as safe
(but usually much safer).

I'd say this claim is about 95% true, which is pretty good by the standards of
marketing claims. A while back I decided to put it to the test, by making the
smallest, most self-contained Rust program possible. After resolving a
few issues along the way, I ended up with a 151-byte, statically linked
executable for AMD64 Linux. With the release of Rust 1.0-alpha, it's time
to show this off.

The program uses my syscall library, which provides the syscall! macro. We
wrap the underlying system calls with Rust functions, each exposing a safe
interface to the unsafe syscall! macro. The main function uses these two safe
functions and doesn't need its own unsafe annotation. Even in such a small
program, Rust allows us to isolate memory unsafety to a subset of the code.
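
To make the wrapper pattern concrete, here is a minimal sketch for a current stable toolchain. It is not the program described here: it assumes x86-64 Linux and uses the stable `std::arch::asm!` macro instead of the syscall! macro. The name `say_hello` and the fallback `write` for other targets are hypothetical additions, there only so the sketch compiles everywhere.

```rust
/// Safe wrapper around the raw write(2) system call (x86-64 Linux only).
/// Returns the kernel's result: bytes written, or a negative errno value.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
pub fn write(fd: usize, buf: &[u8]) -> isize {
    const SYS_WRITE: isize = 1; // syscall number for write on x86-64 Linux
    let ret: isize;
    // SAFETY: `buf` is a valid slice, so the pointer/length pair we hand to
    // the kernel describes readable memory for the duration of the call.
    unsafe {
        std::arch::asm!(
            "syscall",
            inlateout("rax") SYS_WRITE => ret,
            in("rdi") fd,
            in("rsi") buf.as_ptr(),
            in("rdx") buf.len(),
            lateout("rcx") _, // clobbered by the `syscall` instruction
            lateout("r11") _, // clobbered by the `syscall` instruction
            options(nostack),
        );
    }
    ret
}

/// Assumption: a portable stand-in for other targets. It ignores `fd` and
/// always writes to standard output through the standard library.
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
pub fn write(_fd: usize, buf: &[u8]) -> isize {
    use std::io::Write;
    std::io::stdout().write(buf).map_or(-1, |n| n as isize)
}

/// The caller needs no `unsafe`: all of it is confined to `write`.
pub fn say_hello() -> isize {
    write(1, b"Hello!\n")
}
```

The point of the design is that `say_hello` cannot pass a bad pointer or length, because the only way to reach the system call is through a `&[u8]` that the type system already guarantees is valid.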

Because of crate_type="rlib", rustc will build this as a static library, from
which we extract a single object file tinyrust.o:

Note that main doesn't end in a ret instruction. The exit function
(which gets inlined) is marked with a "return type" of !, meaning "doesn't
return". We make
good on this by invoking the unreachable
intrinsic after
syscall!. LLVM will optimize under the assumption that we
can never reach this point, making no guarantees about the program behavior if
it is reached. This represents the fact that the kernel is actually going to
kill the process before syscall!(EXIT, n) can return.
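
A rough modern equivalent of this diverging function is sketched below. It assumes x86-64 Linux and substitutes the stable `std::hint::unreachable_unchecked` for the intrinsic and `std::arch::asm!` for the syscall! macro. The fallback for other targets is a hypothetical addition so the sketch compiles everywhere.

```rust
/// Syscall number for exit on x86-64 Linux.
pub const SYS_EXIT: isize = 60;

/// Terminates the process. The return type `!` promises the caller that
/// control never comes back from this function.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
pub fn exit(code: usize) -> ! {
    unsafe {
        std::arch::asm!(
            "syscall",
            inlateout("rax") SYS_EXIT => _,
            in("rdi") code,
            lateout("rcx") _, // clobbered by the `syscall` instruction
            lateout("r11") _, // clobbered by the `syscall` instruction
            options(nostack),
        );
        // SAFETY: the kernel kills the process inside the system call, so
        // this point is never reached. If it ever were, the behavior would
        // be undefined, which is exactly the bargain described above.
        std::hint::unreachable_unchecked()
    }
}

/// Assumption: a portable stand-in for other targets.
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
pub fn exit(code: usize) -> ! {
    std::process::exit(code as i32)
}
```

Because the compiler is told the code after the system call is unreachable, it has no reason to emit a function epilogue or a ret for `exit`. On current Rust, `options(noreturn)` on the `asm!` block is another way to express the same thing.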

Because we use inline assembly and intrinsics, this code is not going to work
on a stable-channel
build of Rust 1.0. It
will require an alpha or nightly build until such time as inline assembly and
intrinsics::unreachable are added to the stable language of Rust 1.x.

Note that I didn't even use #![no_std]! This program is so tiny that
everything it pulls from libstd is a type definition, macro, or fully inlined
function. As a result there's nothing of libstd left in the compiler output.
In a larger program you may need #![no_std], although its role is greatly
reduced following the removal
of Rust's runtime.

Linking

This is where things get weird.

Whether we compile from C or Rust,[3] the standard linker toolchain is
going to include a bunch of junk we don't need. So I cooked up my own linker
script:

Finally we stick this on the end of a custom ELF header. The header is written
in NASM syntax but contains no instructions, only data
fields. The base address 0x400078 seen above is the end of this header, when
the whole file is loaded at 0x400000. There's no guarantee that ld will
put main at the beginning of the file, so we need to separately determine the
address of main and fill that in as the e_entry field in the ELF file
header.
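
The 0x78 offset can be checked with a little arithmetic. The sketch below assumes the hand-written header is a standard ELF64 file header followed by a single program header; the text does not spell that out, but it is consistent with the address given. The struct layouts follow the ELF64 format; `code_base` and `entry_address` are hypothetical helper names.

```rust
use std::mem::size_of;

/// ELF64 file header (64 bytes).
#[repr(C)]
pub struct Elf64Ehdr {
    pub e_ident: [u8; 16],
    pub e_type: u16,
    pub e_machine: u16,
    pub e_version: u32,
    pub e_entry: u64, // virtual address where execution starts
    pub e_phoff: u64,
    pub e_shoff: u64,
    pub e_flags: u32,
    pub e_ehsize: u16,
    pub e_phentsize: u16,
    pub e_phnum: u16,
    pub e_shentsize: u16,
    pub e_shnum: u16,
    pub e_shstrndx: u16,
}

/// ELF64 program header (56 bytes).
#[repr(C)]
pub struct Elf64Phdr {
    pub p_type: u32,
    pub p_flags: u32,
    pub p_offset: u64,
    pub p_vaddr: u64,
    pub p_paddr: u64,
    pub p_filesz: u64,
    pub p_memsz: u64,
    pub p_align: u64,
}

/// Address at which the whole file is loaded.
pub const LOAD_ADDR: u64 = 0x400000;

/// First address after the headers, where the linked code begins.
pub fn code_base() -> u64 {
    LOAD_ADDR + (size_of::<Elf64Ehdr>() + size_of::<Elf64Phdr>()) as u64
}

/// Value for e_entry, given the offset of main within the linked code.
pub fn entry_address(main_offset: u64) -> u64 {
    code_base() + main_offset
}
```

With 64 + 56 = 120 = 0x78 bytes of header, `code_base()` comes out at 0x400078, and e_entry is that base plus wherever ld happened to place main.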

The final trick

To get down to 151 bytes, I took inspiration from this classic
article, which
observes that padding fields in the ELF header can be used to store other data.
Like, say, a string
constant.
The Rust code changes to access this constant:

A Rust slice
like &[u8] consists of a pointer to some memory, and a length indicating the
number of elements that may be found there. The module
std::raw exposes this as an
ordinary struct that we build, then
transmute to the actual
slice type. The transmute function generates no code; it just tells the type
checker to treat our raw::Slice<u8> as if it were a &[u8]. We return this
value out of the unsafe block, taking advantage of the "everything is an
expression" syntax, and then print the message as before.
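
The std::raw API was never stabilized, so here is a sketch of the same idea using the stable `std::slice::from_raw_parts`, which builds a slice from a pointer and a length. To stay runnable it points at an ordinary static (the hypothetical `MESSAGE`) rather than at bytes inside the ELF header.

```rust
use std::slice;

/// Stand-in for the bytes stored in the ELF header padding.
static MESSAGE: [u8; 7] = *b"Hello!\n";

/// Builds a `&[u8]` by hand from its two components.
pub fn message() -> &'static [u8] {
    let ptr: *const u8 = MESSAGE.as_ptr(); // where the bytes live
    let len: usize = MESSAGE.len(); // how many elements are there
    // SAFETY: `ptr` points at `len` initialized bytes inside a static,
    // which stays alive and unmodified for the whole program.
    unsafe { slice::from_raw_parts(ptr, len) }
}
```

As with transmute, the compiler has to take our word that the pointer and length are right, which is why the construction sits inside an unsafe block while the resulting slice can be used from safe code.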

The object code is the same as before, except that the relocation for the
string constant has become an absolute address. The binary is smaller by 7
bytes (the size of "Hello!\n") and it still works!

You can find the full code on
GitHub. The code in this article
works on rustc 1.0.0-dev (44a287e6e 2015-01-08). If I update the code on GitHub,
I will also update the version number printed by the included build script.

@HugoDaniel - That looks like 71 bytes of text, just with a hashbang so bash can execute it. This article is talking about ELF binaries, which are directly executable. Your example would require a Haskell compiler or interpreter to run.

Graeme, the shebang is the executable magic number (located in the first 2 bytes of the file) for the OS interpreter (followed by at most 128 bytes for the location of the interpreter). I guess not many people know this. Shebang is actually an executable format, like AOUT, ELF or gzipped ELF. A bit more info: https://en.wikipedia.org/wiki/Shebang_%28Unix%29#Magic_number

Anders Eurenius, binary size doesn't really matter much, since the OS can demand-page it as needed (the whole thing is not loaded into memory, even for a small statically linked binary). A bit more info here: https://en.wikipedia.org/wiki/Demand_paging

We should have someone check the execution times and memory overhead of both the Haskell wrapper and the binary code; then the difference might be noticeable even for HD. Regarding binary size, try using both implementations on an 8-bit microcontroller and see who succeeds.

It looks like there's actually a missed optimisation in the LLVM code gen: in the first example it should produce xor %esi,%esi instead of mov $0x0,%esi, which would reduce the code size by 4 bytes and make it run quicker.

It's probably a bug in the way constants are passed into an inline asm, although I'd have thought the peephole pass would have picked it up.