Reducing the size of a Rust GStreamer plugin

Guillaume Desmottes April 28, 2020

A common complaint heard about Rust is the size of the binaries it produces. There are various reasons why Rust binaries are generally bigger than ones produced with lower level languages such as C. The main one is that Cargo, Rust's package manager and build tool, produces static binaries by default. While larger binaries are generally not much of an issue for desktop or server applications, they may become more of a problem on embedded systems where storage and/or memory may be very limited.

GStreamer is used extensively at Collabora to help our clients to build embedded multimedia solutions. With Rust gaining traction among the GStreamer community as an alternative to C to write GStreamer applications and plugins, we began wondering if the size of such Rust plugins would be a problem for embedded systems, and what could be done to reduce sizes as much as possible.

As the plugin is statically built, the symbol information of all the crates (dependencies) used by the plugin ended up in our binary. Stripping the binary removed it and so saved us a lot of space.

| build | modifications | size (bytes) | size (human) | % change |
|---|---|---|---|---|
| dev | none | 32248640 | 31M | 0% |
| dev | stripped | 604512 | 591K | -98% |
| release | none | 2740472 | 2.7M | 0% |
| release | stripped | 305504 | 299K | -88% |

These numbers look much better: we already have something that should be usable on most systems. But we can still save some space by tweaking Cargo's build flags. All these settings are set using Cargo's profile sections.

From this point we'll consider only the size of release builds, as that's what actually matters when distributing software in production. So we'll set our build flags in the profile.release section of our Cargo manifest.

Use LLVM's full LTO

By using the LTO setting and reducing the number of compilation units, we can request the compiler to generate smaller binaries at the cost of a higher compile time. This is done by editing our Cargo.toml and setting the lto and codegen-units settings in the release profile:

[profile.release]
lto = true
codegen-units = 1

These changes reduced the plugin size quite a lot, but once stripped we notice that we actually saved only 44K.

Abort on Panic

By default, Rust unwinds the stack when panicking and can provide a nice backtrace. This can be quite handy when debugging but consumes some space, which may not be worthwhile in production builds. Aborting on panic! instead of unwinding can save us the size of the unwinding code in our plugin.

panic = 'abort'

Disabling this feature saved us some extra bytes as well:

| build | modifications | size (bytes) | size (human) | % change |
|---|---|---|---|---|
| release | none | 2740472 | 2.7M | 0% |
| release | lto | 888560 | 868K | -67.6% |
| release | lto + opt-level | 855096 | 836K | -68.8% |
| release | lto + opt-level + panic abort | 792136 | 774K | -71.2% |
| release | lto + opt-level + panic abort + stripped | 207024 | 203K | -92.4% |
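
As a small worked example, the % change column can be reproduced from the byte counts, taking the unmodified release build as the baseline. This is only a sketch of the arithmetic: the percent_change helper is a hypothetical name introduced here, not something from the original tooling.

```rust
/// Relative size change in percent; a negative value means the binary got smaller.
/// (Hypothetical helper, introduced only for this illustration.)
fn percent_change(baseline: u64, size: u64) -> f64 {
    (size as f64 - baseline as f64) / baseline as f64 * 100.0
}

fn main() {
    // Sizes in bytes, taken from the table above.
    let baseline = 2_740_472; // release build without modifications
    let builds = [
        ("lto", 888_560),
        ("lto + opt-level + panic abort + stripped", 207_024),
    ];

    for (name, size) in builds {
        println!("{}: {:.1}%", name, percent_change(baseline, size));
    }
}
```

Always comparing against the same baseline, rather than against the previous row, is what makes the rows of the table directly comparable with each other.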

It's important to note that setting panic = 'abort' will not only remove the panic stacktrace but also affect the behavior of GStreamer Rust plugins.

gstreamer-rs provides a macro converting panics to proper GStreamer error messages that can be handled by the application.

When such a panic occurs, the element will be marked as unusable but the application will continue running and have a chance to gracefully handle the problem.

By setting panic = 'abort' this whole system is disabled and the application process will abort right away.
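
To illustrate why aborting disables this kind of recovery, here is a minimal sketch using only the standard library. It is not the gstreamer-rs macro: process and process_guarded are hypothetical names, and the sketch only shows the general mechanism of catching an unwinding panic and turning it into an error value.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical stand-in for an element's processing function.
fn process(buffer: &[u8]) -> usize {
    if buffer.is_empty() {
        panic!("empty buffer");
    }
    buffer.len()
}

/// Runs `process` and converts a panic into an `Err` instead of letting it propagate.
fn process_guarded(buffer: &[u8]) -> Result<usize, String> {
    panic::catch_unwind(AssertUnwindSafe(|| process(buffer))).map_err(|payload| {
        // A panic payload is usually a `&str` or a `String`.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic".to_string()
        }
    })
}

fn main() {
    // With the default unwinding strategy, the second call reports an error
    // and the program keeps running.
    println!("{:?}", process_guarded(&[1, 2, 3]));
    println!("{:?}", process_guarded(&[]));
}
```

std::panic::catch_unwind only catches unwinding panics. When the code is built with panic = 'abort', the process terminates at the panic! call and the error branch is never reached.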

Reducing even further

At this point we have used all the options available with the stable Rust version. To reduce the size even further, we would have to switch to Rust nightly, the unstable version of the compiler. One interesting option would be to manually build libstd so it can benefit from our optimized build settings.

We reached a size reasonable enough to be used in lots of embedded use cases. But how does it compare to a C implementation? We could have used the existing identity element as a comparison, but it's bundled in the coreelements plugin and provides more features than rsidentity.

For the sake of the experiment, we re-implemented rsidentity in C using the exact same features and APIs. It weighs 48K, reduced to 15K once stripped.

So the Rust size overhead seems to be around 190K for this simple plugin (203K versus 15K once stripped). That's not unexpected, as the Rust version statically links Rust's standard library and contains all the binding code for GLib and GStreamer.

We can use cargo bloat to list the biggest dependencies. Note that those numbers are for a pre-stripped build:

Actual plugins size

The plugin we used for our experiments was very minimal. It would be interesting to look at the sizes of real Rust GStreamer plugins. We therefore built all the gst-plugins-rs plugins using the same build settings:

| plugin | size (bytes) | size (human) | stripped size (bytes) | stripped size (human) |
|---|---|---|---|---|
| libgstcdg.so | 2876960 | 2.8M | 334208 | 327K |
| libgstclaxon.so | 2795840 | 2.7M | 354656 | 347K |
| libgstfallbackswitch.so | 2964136 | 2.9M | 412000 | 403K |
| libgstgif.so | 2793224 | 2.7M | 342392 | 335K |
| libgstlewton.so | 2985256 | 2.9M | 420192 | 411K |
| libgstrav1e.so | 4511504 | 4.4M | 1571208 | 1.5M |
| libgstreqwest.so | 6762648 | 6.5M | 3230480 | 3.1M |
| libgstrsaudiofx.so | 815104 | 796K | 223408 | 219K |
| libgstrsclosedcaption.so | 3447056 | 3.3M | 741240 | 724K |
| libgstrsdav1d.so | 2748928 | 2.7M | 313752 | 307K |
| libgstrsfile.so | 1403832 | 1.4M | 739592 | 723K |
| libgstrsflv.so | 1007672 | 985K | 321712 | 315K |
| libgstrusoto.so | 7412336 | 7.1M | 3734024 | 3.6M |
| libgstsodium.so | 3050656 | 3.0M | 572432 | 560K |
| libgstthreadshare.so | 4530280 | 4.4M | 1448376 | 1.4M |
| libgsttogglerecord.so | 3012008 | 2.9M | 436552 | 427K |

It's interesting to notice that, once stripped, most plugins stay in the range of a few hundred kilobytes, with some notable exceptions. The plugins reaching the megabyte range seem to be the ones relying on big Rust crates such as rav1e or reqwest. Those are "pure" Rust elements as they don't rely on external C libraries to actually process the data, like C plugins generally do.

The AV1 encoder and decoder are a good example here. The former, libgstrav1e.so, uses the rav1e crate, which is also written in Rust and so is statically linked into the plugin. On the other hand, libgstrsdav1d.so wraps the dav1d C decoder, to which it's dynamically linked, so the actual decoding code isn't counted in the plugin size.

What about ARM binaries?

So far we have only considered x86_64 binaries; however, embedded devices are generally based on ARM SoCs. We were interested in comparing the size of Rust plugins when built for this architecture, and wondered if we would observe any significant difference.

We therefore rebuilt all the plugins using the armv7-unknown-linux-gnueabihf toolchain as we would do to build for the Raspberry Pi, for example.

| plugin | size (bytes) | size (human) | stripped size (bytes) | stripped size (human) |
|---|---|---|---|---|
| libgstcdg.so | 2810512 | 2.7M | 251460 | 246K |
| libgstclaxon.so | 2815844 | 2.7M | 263732 | 258K |
| libgstfallbackswitch.so | 2893712 | 2.8M | 321076 | 314K |
| libgstgif.so | 2820696 | 2.7M | 255556 | 250K |
| libgstlewton.so | 2912520 | 2.8M | 316980 | 310K |
| libgstrav1e.so | 4376676 | 4.2M | 1287752 | 1.3M |
| libgstreqwest.so | 6213712 | 6.0M | 2336548 | 2.3M |
| libgstrsaudiofx.so | 902320 | 882K | 165424 | 162K |
| libgstrsclosedcaption.so | 3347412 | 3.2M | 563580 | 551K |
| libgstrsfile.so | 1348440 | 1.3M | 501248 | 490K |
| libgstrsflv.so | 1094828 | 1.1M | 243248 | 238K |
| libgstrusoto.so | 6818928 | 6.6M | 2754168 | 2.7M |
| libgstsodium.so | 3026284 | 2.9M | 435892 | 426K |
| libgstthreadshare.so | 4419844 | 4.3M | 1132128 | 1.1M |
| libgsttogglerecord.so | 2954476 | 2.9M | 337452 | 330K |

We notice here that ARM binaries are slightly lighter than their x86_64 equivalents and the gain from stripping is very similar on both architectures.
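
To put a rough number on this comparison, the sketch below computes how much smaller the stripped ARM build is for a few plugins, using byte counts copied from the two tables. The reduction_percent helper and the choice of plugins are assumptions made for this illustration only.

```rust
/// Size reduction of the ARM build relative to the x86_64 build, in percent.
/// (Hypothetical helper, introduced only for this illustration.)
fn reduction_percent(x86_64: u64, arm: u64) -> f64 {
    (1.0 - arm as f64 / x86_64 as f64) * 100.0
}

fn main() {
    // (plugin, x86_64 stripped size, armv7 stripped size), in bytes,
    // copied from the two tables.
    let plugins = [
        ("libgstcdg.so", 334_208, 251_460),
        ("libgstrav1e.so", 1_571_208, 1_287_752),
        ("libgstreqwest.so", 3_230_480, 2_336_548),
    ];

    for (name, x86_64, arm) in plugins {
        println!(
            "{}: ARM build is {:.1}% smaller",
            name,
            reduction_percent(x86_64, arm)
        );
    }
}
```

The same helper can be applied to the unstripped sizes to check whether the difference holds before stripping as well.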

Conclusion

We have to keep in mind that each size reduction technique comes at a cost: binaries that are less debug friendly, higher build times, etc. Depending on our actual needs and constraints, we need to consider the tradeoff between ease of debugging and binary size.

It's also important to note that we considered only a single Rust plugin in our setup. The total size would grow rapidly if we had to ship multiple Rust plugins, as each one would statically ship the GStreamer and GLib Rust glue code. In a future blog post, we'll discuss and analyze the options to reduce the total size in such multi-plugin scenarios, such as linking all Rust elements into a single larger Rust plugin so they can share common code.

Based on this research, we think that Rust is ready to deploy in embedded systems with limited memory resources. Rust brings numerous benefits to embedded systems: it's as fast as C/C++ while offering zero-cost abstractions and advanced memory safety, which enable rapid development, easier multi-threaded programming and fearless concurrency. As the GStreamer community is embracing the Rust language for its memory safety while handling untrusted multimedia data, Collabora is happy to help you bring Rust to your embedded projects.

Comments (2)

Reqwest depends on tokio and the whole async stack. If the plugin only needs a single (or a few) HTTP requests at the same time, using instead a lightweight HTTP client library like attohttpc would probably provide huge savings.

Indeed, it would be nice to have another plugin using such a lighter HTTP crate so users can pick the one that best fits their use case.
Feel free to try writing one if you're interested in contributing to gst-plugins-rs. :)