Federico's Blog

Last night, the linux.org DNS was hijacked and redirected to a page
that doxed Coraline. Coraline is doing extremely valuable work with
the Contributor Covenant code of conduct, which many free software
projects have already adopted.

Coraline has been working for years in making free software, and
computer technology circles in general, a welcome place for
underrepresented groups.

While in the middle of converting librsvg's code that processes XML from C
to Rust, I went into a digression that has to do with the way librsvg
decides which files are allowed to be referenced from within an SVG.

Resource references in SVG

SVG files can reference other files, i.e. they are not
self-contained. For example, there can be an element like <image
xlink:href="foo.png">, or one can request that a sub-element of
another SVG be included with <use xlink:href="secondary.svg#foo">.
Finally, there is the xi:include mechanism to include chunks of text
or XML into another XML file.

Since librsvg is sometimes used to render untrusted files that come from
the internet, it needs to be careful not to allow those files to
reference any random resource on the filesystem. We don't want
something like
<text><xi:include href="/etc/passwd" parse="text"/></text>
or something equally nefarious that would exfiltrate a random file
into the rendered output.

Also, we want to catch malicious SVGs that try to "phone home" by
referencing a network resource like
<image xlink:href="http://evil.com/pingback.jpg">.

So, librsvg is careful to have a single place where it can load
secondary resources, and first it validates the resource's URL to see
if it is allowed.

The actual validation rules are not very important for this
discussion; they are something like "no absolute URLs allowed" (so you
can't request /etc/passwd), and "only siblings or (grand)children of
siblings allowed" (so foo.svg can request bar.svg and
subdir/bar.svg, but not ../../bar.svg).
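A minimal sketch of that kind of check, just to make the rules concrete (this is an illustration, not librsvg's actual allow_load(), which works on URLs and handles more cases):

```rust
use std::path::{Component, Path};

// Illustrative only; the real validation in librsvg is more involved.
fn is_allowed_reference(href: &str) -> bool {
    let path = Path::new(href);

    // "no absolute URLs allowed": reject /etc/passwd and friends
    if path.is_absolute() {
        return false;
    }

    // "only siblings or (grand)children of siblings": reject anything
    // that walks up the tree with ".."
    !path.components().any(|c| matches!(c, Component::ParentDir))
}
```

So foo.svg may reference bar.svg or subdir/bar.svg, but neither ../../bar.svg nor /etc/passwd.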

The code

There was a central function rsvg_io_acquire_stream() which took a
URL as a string. The code assumed that the URL had already been
validated with a function called allow_load(url). While the
structure of the C code guaranteed that every place which might
acquire a stream did in fact go through allow_load() first,
restructuring the code in Rust makes it possible to go further: to
make acquiring a disallowed URL impossible by construction.

The parts of the code that absolutely require a fragment id now take a
Fragment. Parts which require a PlainUri can unwrap that case.
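To make that concrete, the types might look something like this sketch (the names follow the post, but the actual librsvg definitions may differ in detail):

```rust
// Sketch only: a reference with a fragment id, e.g. "secondary.svg#foo"
struct Fragment {
    uri: Option<String>,  // "secondary.svg", or None for same-document "#foo"
    fragment: String,     // "foo"
}

// Sketch only: either a plain URI or a URI plus fragment
enum Href {
    PlainUri(String),        // e.g. "foo.png"
    WithFragment(Fragment),  // e.g. "secondary.svg#foo"
}

// Code that absolutely requires a fragment id can match on the enum
// and reject the PlainUri case.
fn require_fragment(href: Href) -> Result<Fragment, &'static str> {
    match href {
        Href::WithFragment(f) => Ok(f),
        Href::PlainUri(_) => Err("a fragment id is required here"),
    }
}
```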

The next step is making those structs contain an AllowedUrl
directly, instead of just strings, so that for callers, obtaining a
fully validated name is a one-step operation.
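One way to sketch that is a newtype whose only constructor performs the validation, so that merely holding an AllowedUrl is proof that the check already ran. This is hypothetical code; allow_load() here is a stand-in for librsvg's real rules:

```rust
// Hypothetical sketch: the only way to obtain an AllowedUrl is to pass
// validation, so a stream-acquiring function can demand one and thus
// become impossible to call with an unvalidated URL.
pub struct AllowedUrl(String);

#[derive(Debug)]
pub enum UrlError {
    NotAllowed,
}

// Stand-in for the real validation rules.
fn allow_load(url: &str) -> bool {
    !url.starts_with('/') && !url.contains("..")
}

impl AllowedUrl {
    pub fn new(url: &str) -> Result<AllowedUrl, UrlError> {
        if allow_load(url) {
            Ok(AllowedUrl(url.to_string()))
        } else {
            Err(UrlError::NotAllowed)
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

With that, a function like rsvg_io_acquire_stream() can take an AllowedUrl instead of a bare string, and the compiler enforces that validation happened first.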

In general, the code is moving towards a scheme where all file I/O is
done at loading time. Right now, some of those external references
get resolved at rendering time, which is somewhat awkward (for
example, at rendering time the caller has no chance to use a
GCancellable to cancel loading). This refactoring to do early
validation is leaving the code in a very nice state.

A couple of weeks ago we had the fourth GNOME+Rust hackfest, this time
in Thessaloniki, Greece. This is the beautiful city that will host
next year's GUADEC, but fortunately GUADEC will be in summertime!

We held the hackfest at the CoHo coworking space, a small, cozy
office between the University and the sea.

At every such hackfest I am overwhelmed by the kind hackers who work
on gnome-class, the code generator for GObject implementations in
Rust.

Mredlek has been working on generalizing the code generators in
gnome-class, so that we can have the following from the same run:

Rust code generation, for the GObject implementations themselves.
Thanks to mredlek, this is much cleaner than it was before; now both
classes and interfaces share the same code for most of the
boilerplate.

GObject Introspection (.gir) generation, so that language bindings
can be generated automatically.

C header files (.h), so the generated GObjects can be called from
C code as usual.

So far, Rust and GIR work; C header files are not generated yet.

Mredlek is a new contributor to gnome-class, but unfortunately was not
able to attend the hackfest. Not only did he rewrite the gnome-class
parser using the new version of syn; he also added support for
passing owned types to GObject methods, such as String and
Variant. But the biggest thing is probably that mredlek made it a
lot easier to debug the generated Rust source; see the documentation
on debugging for details.

Speaking of which, thanks to Jordan Petridis for making the
documentation be published automatically from Gitlab's Continuous
Integration pipelines.

Alex Crichton kindly refactored our error propagation code, and even
wrote docs on it! Along with Jordan, he updated the
code for the Rust 2018 edition, and generally wrangled the build
process to conform with the latest Rust nightlies. Alex also made
code generation a lot faster, by offloading auto-indentation to an
external rustfmt process, instead of using it as a crate: using the
rustfmt crate meant that the compiler had a lot more work to do.
During the whole hackfest, Alex was very helpful with Rust questions
in general. While my strategy to see what the compiler does is to
examine the disassembly in gdb, his strategy seems to be to look at
the LLVM intermediate representation instead... OMG.

And we can derive very simple GtkWidgets now!

Saving the best for last... Antoni Boucher, the author of relm, has
been working on making it possible to derive from gtk::Widget. Once
this merge request is done, we'll have an example of
deriving from gtk::DrawingArea from Rust with very little code.

Normally, the gtk-rs bindings work as a statically-generated binding
for GObject, which really is a type hierarchy defined at runtime. The
static binding really wants to know what is a subclass of what: it
needs to know in advance that Button's hierarchy is Button → Bin →
Container → Widget → Object, plus all the GTypeInterfaces supported
by any of those classes. Antoni has been working on making
gnome-class extract that information automatically from GIR files, so
that the gtk-rs macros that define new types will get all the
necessary information.

Future work

There are still bugs in the GIR pipeline that prevent us
from deriving, say, from gtk::Container, but hopefully these will be
resolved soon.

Sebastian Dröge has been refactoring his Rust tools to create GObject
subclasses with very idiomatic and refined Rust code. This is now at
a state where gnome-class itself could generate that sort of code,
instead of generating all the boilerplate from scratch. So, we'll
start doing that, and integrating the necessary bits into gtk-rs as
well.

Finally, during the last day I took a little break from gnome-class to
work on librsvg. Julian Sparber has been updating the code to use new
bindings in cairo-rs, and is also adding a new API to
fetch an SVG element's geometry precisely.

Thessaloniki

Oh, boy, I wish the weather had been warmer. The city looks
delightful to walk around, especially in the narrow streets on the
hills. Can't wait to see it in summer during GUADEC.

Thanks

Finally, thanks to CoHo for hosting the hackfest, and to the GNOME
Foundation for sponsoring my travel and accommodation. And to
Centricular for taking us all to dinner!

Special thanks to Jordan Petridis for being on top of everything
build-wise all the time.

Lately, I have been converting the code in librsvg that handles XML
from C to Rust. For many technical reasons, the library still uses
libxml2, GNOME's historic XML parsing library, but some of the
callbacks to handle XML events like start_element, end_element,
characters, are now implemented in Rust. This has meant that I'm
running into all the cases where the original C code in librsvg failed
to handle errors properly; Rust really makes it obvious when that
happens.

In this post I want to talk a bit about propagating errors. You call
a function, it returns an error, and then what?

What can fail?

It turns out that this question is highly context-dependent. Let's
say a program is starting up and tries to read a configuration file.
What could go wrong?

The file doesn't exist. Maybe it is the very first time the program
is run, and so there isn't a configuration file at all? Can the
program provide a default configuration in this case? Or does it
absolutely need a pre-written configuration file to be somewhere?

The file can't be parsed. Should the program warn the user and
exit, or should it revert to a default configuration (should it
overwrite the file with valid, default values)? Can
the program warn the user, or is it a user-less program that at best
can just shout into the void of a server-side log file?

The file can be parsed, but the values are invalid. Same questions
as the case above.

Etcetera.

At each stage, the code will probably see very low-level errors ("file
not found", "I/O error", "parsing failed", "value is out of range").
What the code decides to do, or what it is able to do at any
particular stage, depends both on the semantics you want from the
program and on the code structure itself.

Structuring the problem

This is an easy, but very coarse way of handling things:

gboolean
read_configuration (const char *config_file_name)
{
    /* open the file */
    /* parse it */
    /* set global variables to the configuration values */

    /* return true if success, or false if failure */
}

What is bad about this? Let's see:

The calling code just gets a success/failure condition. In the case
of failure, it doesn't get to know why things failed.

If the function sets global variables with configuration values as
they get read... and something goes wrong and the function returns
an error... the caller ends up possibly in an inconsistent state,
with a set of configuration variables that are only halfway-set.

If the function finds parse errors, well, do you really want to call
UI code from inside it? The caller might be a better place to make
that decision.

A slightly better structure

Let's add an enumeration to indicate the possible errors, and a
structure of configuration values.

enum ConfigError {
    ConfigFileDoesntExist,
    ParseError,  // config file has bad syntax or something
    ValueError,  // config file has an invalid value
}

struct ConfigValues {
    // a bunch of fields here with the program's configuration
}

fn read_configuration(filename: &Path) -> Result<ConfigValues, ConfigError> {
    // open the file, or return Err(ConfigError::ConfigFileDoesntExist)
    // parse the file, or return Err(ConfigError::ParseError)
    // validate the values, or return Err(ConfigError::ValueError)
    // if everything succeeds, return Ok(ConfigValues)
}

This is better, in that the caller decides what to do with the
validated ConfigValues: maybe it can just copy them to the
program's global variables for configuration.

However, this scheme doesn't give the caller all the information it
would like to present a really good error message. For example, the
caller will get to know if there is a parse error, but it doesn't know
specifically what failed during parsing. Similarly, it will just get
to know if there was an invalid value, but not which one.

Ah, so the problem is fractal

We could have new structs to represent the little errors, and then
make them part of the original error enum:
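In Rust, that could be sketched as follows (the variant payloads are my illustration; the error_reason fields match the description below):

```rust
// Per-stage detail structs, wrapped by the variants of the error enum.
struct ParseError {
    error_reason: String,  // e.g. "unexpected token"
}

struct ValueError {
    error_reason: String,  // e.g. "cannot be a negative number"
}

enum ConfigError {
    ConfigFileDoesntExist,
    ParseError(ParseError),  // variant carries the detailed struct
    ValueError(ValueError),
}
```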

The ParseError and ValueError structs have individual
error_reason fields, which are strings. Presumably, one could have
a ParseError with error_reason = "unexpected token", or a
ValueError with error_reason = "cannot be a negative number".

One problem with this is that if the low-level errors come with error
messages in English, then the caller has to know how to localize them
to the user's language. Also, if they don't have a machine-readable
error code, then the calling code may not have enough information to
decide what to do with the error.

Let's say we had a ParseErrorKind enum with variants like
UnexpectedToken, EndOfFile, etc. This is fine; it lets the
calling code know the reason for the error. Also, there can be a
gimme_localized_error_message() method for that particular type of
error.

How can we expand this? Maybe the ParseErrorKind::UnexpectedToken
variant wants to contain data that indicates which token it got that
was wrong, so it would be UnexpectedToken(String) or something
similar.
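As a sketch (hypothetical names, following the prose above):

```rust
enum ParseErrorKind {
    UnexpectedToken(String),  // carries the offending token
    EndOfFile,
}

impl ParseErrorKind {
    // Stand-in for gimme_localized_error_message(); real code would run
    // these messages through gettext or similar for localization.
    fn gimme_localized_error_message(&self) -> String {
        match self {
            ParseErrorKind::UnexpectedToken(t) => format!("unexpected token '{}'", t),
            ParseErrorKind::EndOfFile => String::from("unexpected end of file"),
        }
    }
}
```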

But is that useful to the calling code? For our example program,
which is reading a configuration file... it probably only needs to
know if it could parse the file, but maybe it doesn't really need any
additional details on the reason for the parse error, other than
having something useful to present to the user. Whether it is
appropriate to burden the user with the actual details... does the app
expect to make it the user's job to fix broken configuration files?
Yes for a web server, where the user is a sysadmin; probably not for a
random end-user graphical app, where people shouldn't need to write
configuration files by hand in the first place (should those have a
"Details" section in the error message window? I don't know!).

Maybe the low-level parsing/validation code can emit those detailed
errors. But how can we propagate them to something more useful to the
upper layers of the code?

Translation and propagation

Maybe our original read_configuration() function can translate the
low-level errors into high-level ones:
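For example, something like this sketch, where parse() and validate() are hypothetical lower-level helpers with their own narrow error types, and each map_err() translates a low-level error into a high-level one:

```rust
use std::fs;
use std::path::Path;

struct ConfigValues {
    verbose: bool,
}

enum ConfigError {
    ConfigFileDoesntExist,
    ParseError(String),  // low-level reason, carried upward
    ValueError(String),
}

// Hypothetical low-level parser: extracts the value of a "verbose=..." line.
fn parse(text: &str) -> Result<String, String> {
    text.trim()
        .strip_prefix("verbose=")
        .map(str::to_string)
        .ok_or_else(|| String::from("expected a verbose=... line"))
}

// Hypothetical low-level validator: checks that the value makes sense.
fn validate(value: &str) -> Result<ConfigValues, String> {
    match value {
        "true" => Ok(ConfigValues { verbose: true }),
        "false" => Ok(ConfigValues { verbose: false }),
        other => Err(format!("invalid value '{}'", other)),
    }
}

fn read_configuration(filename: &Path) -> Result<ConfigValues, ConfigError> {
    // each map_err() translates a low-level error into a high-level one
    let text = fs::read_to_string(filename)
        .map_err(|_| ConfigError::ConfigFileDoesntExist)?;
    let value = parse(&text).map_err(ConfigError::ParseError)?;
    validate(&value).map_err(ConfigError::ValueError)
}
```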

Etcetera. It is up to each part of the code to decide what to do with
lower-level errors. Can it recover from them? Should it fail the
whole operation and return a higher-level error? Should it warn the
user right there?

Language facilities

C makes it really easy to ignore errors, and pretty hard to present
detailed errors like the above. One could mimic what Rust is actually
doing with a collection of union and struct and enum, but this
gets very awkward very fast.

Rust provides these facilities at the language level, and the idioms
around Result and error handling are very nice to use. There are
even crates like failure that go a long way towards
automating error translation, propagation, and conversion to strings
for presenting to users.

Infinite details

I've been recommending The Error Model to anyone who
comes into a discussion of error handling in programming languages.
It's a long, detailed, but very enlightening read on recoverable
vs. unrecoverable errors, simple error codes vs. exceptions
vs. monadic results, the performance/reliability/ease of use of each
model... Definitely worth a read.

I want to write a braindump on the stuff that I remember from
gdk-pixbuf's history. There is some talk about replacing it with
something newer; hopefully this history will show some things that
worked, some that didn't, and why.

The beginnings

Gdk-pixbuf started as a replacement for Imlib, the image loading and
rendering library that GNOME used in its earliest versions. Imlib
came from the Enlightenment project; it provided an easy API around
the idiosyncratic libungif, libjpeg, libpng, etc., and it maintained
decoded images in memory with a uniform representation. Imlib also
worked as an image cache for the Enlightenment window manager, which
made memory management very inconvenient for GNOME.

Imlib worked well as a "just load me an image" library. It showed
that a small, uniform API to load various image formats into a common
representation was desirable. And in those days, hiding all the
complexities of displaying images in X was very important indeed.

The initial API

Gdk-pixbuf replaced Imlib, and added two important features:
reference counting for image data, and support for an alpha channel.

Gdk-pixbuf appeared with support for RGB(A) images. And although in
theory it was possible to grow the API to support other
representations, GdkColorspace never acquired anything other than
GDK_COLORSPACE_RGB, and the bits_per_sample argument to some
functions only ever supported being 8. The presence or absence of an alpha
channel was indicated with a gboolean argument in conjunction with that
single GDK_COLORSPACE_RGB value; we didn't have something like
cairo_format_t which actually specifies the pixel format in single
enum values.

While all the code in gdk-pixbuf carefully checks that those
conditions are met (RGBA at 8 bits per channel), some applications
inadvertently assume that this is the only possible case, and would get
into trouble really fast if gdk-pixbuf ever started returning pixbufs
with different color spaces or depths.

One can still see the battle between bilevel-alpha
vs. continuous-alpha in the GdkPixbufAlphaMode enum.

Fortunately, only the "render this pixbuf with alpha to an Xlib
drawable" functions take values of this type: before the Xrender
days, it was a Big Deal to draw an image with alpha to an X window,
and applications often opted to use a bitmask instead, even if they
had jagged edges as a result.

Pixel formats

The only pixel format that ever got implemented was unpremultiplied
RGBA on all platforms. Back then I didn't understand premultiplied
alpha! Also, the GIMP followed that scheme, and copying
it seemed like the easiest thing.

After gdk-pixbuf, libart also copied that pixel format, I think.

But later we got Cairo, Pixman, and all the Xrender stack. These
prefer premultiplied ARGB. Moreover, Cairo prefers it if each pixel
is actually a 32-bit value, with the ARGB values inside it in
platform-endian order. So if you look at a memory dump, a Cairo pixel
looks like BGRA on a little-endian box, while it looks like ARGB on a
big-endian box.

The loading API

The public loading API in gdk-pixbuf, and its relationship to loader
plug-ins, evolved in interesting ways.

At first the public API and loaders only implemented load_from_file:
you gave the library a FILE * and it gave you back a GdkPixbuf.
Back then we didn't have a robust MIME sniffing framework in the form
of a library, so gdk-pixbuf got its own. This lives in the
mostly-obsolete GdkPixbufFormat machinery; it
even has its own little language for sniffing file headers!
Nowadays we do most MIME sniffing with GIO.

After the initial load_from_file API... I think we got progressive
loading first, and animation support afterwards.

Progressive loading

This is where the calling program feeds chunks of bytes to the library,
and at the end a fully-formed GdkPixbuf comes out, instead of having
a single "read a whole file" operation.

We conflated this with a way to get updates on how the image area gets
modified as the data gets parsed. I think we wanted to support the
case of a web browser, which downloads images slowly over the network,
and gradually displays them as they are downloaded. In 1998, images
downloading slowly over the network was a real concern!

It took a lot of very careful work to convert the image loaders, which
parsed a whole file at a time, into loaders that could maintain some
state between each time that they got handed an extra bit of buffer.

It also sounded easy to implement the progressive updating API by
simply emitting a signal that said, "this rectangular area got updated
from the last read". It could handle the case of reading whole
scanlines, or a few pixels, or even area-based updates for progressive
JPEGs and PNGs.

The internal API for the image format loaders still keeps a
distinction between the "load a whole file" API and the "load an image
in chunks" API. Not all loaders got redone to just use the second
one: io-jpeg.c still implements loading whole files by calling the
corresponding libjpeg functions. I think it could remove that code
and use the progressive loading functions instead.

Animations

Animations: we followed the GIF model for animations, in which each
frame overlays the previous one, and there's a delay set between each
frame. This is not a video file; it's a hacky flipbook.

However, animations presented the problem that the whole gdk-pixbuf
API was meant for static images, and now we needed to support
multi-frame images as well.

We defined the "correct" way to use the gdk-pixbuf library as to
actually try to load an animation, and then see if it is a
single-frame image, in which case you can just get a GdkPixbuf for
the only frame and use it.

Or, if you got an animation, that would be a GdkPixbufAnimation
object, from which you could ask for an iterator to get each frame as
a separate GdkPixbuf.

However, the progressive updating API never got extended to really
support animations. So, we have awkward functions like
gdk_pixbuf_animation_iter_on_currently_loading_frame() instead.

Necessary accretion

Gdk-pixbuf got support for saving just a few formats: JPEG, PNG,
TIFF, ICO, and some of the formats that are implemented with the
Windows-native loaders.

Over time gdk-pixbuf got support for preserving some metadata-ish
chunks from formats that provide it: DPI, color profiles, image
comments, hotspots for cursors/icons...

While an image is being loaded with the progressive loaders, there is
a clunky way to specify that one doesn't want the actual size of the
image, but another size instead. The loader can handle that situation
itself if the image format actually embeds different sizes in it;
otherwise, the main loading code will rescale the fully loaded image
into the size specified by the application.

Historical cruft

GdkPixdata - a way to embed binary image data in executables, with a
funky encoding. Nowadays it's just easier to directly store a PNG or
JPEG or whatever in a GResource.

contrib/gdk-pixbuf-xlib - to deal with old-style X drawables.
Hopefully mostly unused now, but there's a good number of mostly old,
third-party software that still uses gdk-pixbuf as an image loader and
renderer to X drawables.

gdk-pixbuf-transform.h - Gdk-pixbuf had some very high-quality
scaling functions, which the original versions of EOG used for the
core of the image viewer. Nowadays Cairo is the preferred way of
doing this, since it not only does scaling, but general affine
transformations as well. Did you know that
gdk_pixbuf_composite_color takes 17 arguments, and it can composite
an image with alpha on top of a checkerboard? Yes, that used to be
the core of EOG.

Debatable historical cruft

gdk_pixbuf_get_pixels(). This lets the program look into the actual
pixels of a loaded pixbuf, and modify them. Gdk-pixbuf just did not
have a concept of immutability.

Back in GNOME 1.x / 2.x, when it was fashionable to put icons beside
menu items, or in toolbar buttons, applications would load their icon
images, and modify them in various ways before setting them onto the
corresponding widgets. Some things they did: load a colorful icon,
desaturate it for "insensitive" command buttons or menu items, or
simulate desaturation by compositing a 1x1-pixel checkerboard on the
icon image. Or lighten the icon and set it as the "prelight" one onto
widgets.

The concept of "decode an image and just give me the pixels" is of
course useful. Image viewers, image processing programs, and all
those, of course need this functionality.

However, these days GTK would prefer to have a way to decode an image,
and ship it as fast as possible to the GPU, without intermediaries.
There is all sorts of awkward machinery in the GTK widgets that
can consume either an icon from an icon theme, or a user-supplied
image, or one of the various schemes for providing icons that GTK has
acquired over the years.

It is interesting to note that gdk_pixbuf_get_pixels() was available
pretty much since the beginning, but only much later did we get
gdk_pixbuf_get_pixels_with_length(), the "give me the guchar *
buffer and also its length" function, so that calling code has a
chance of actually checking for buffer overruns. (... and it is one
of the broken "give me a length" functions that returns a guint
rather than a gsize. There is a better
gdk_pixbuf_get_byte_length() which actually returns a gsize,
though.)

Problems with mutable pixbufs

The main problem is that as things are right now, we have no
flexibility in changing the internal representation of image data to
make it better for current idioms: GPU-specific pixel formats may not
be unpremultiplied RGBA data.

We have no API to say, "this pixbuf has been modified", akin to
cairo_surface_mark_dirty(): once an application calls
gdk_pixbuf_get_pixels(), gdk-pixbuf or GTK have to assume that the
data will be changed and they have to re-run the pipeline to send
the image to the GPU (format conversions? caching? creating a
texture?).

Also, ever since the beginnings of the gdk-pixbuf API, we had a way to
create pixbufs from arbitrary user-supplied RGBA buffers: the
gdk_pixbuf_new_from_data functions. One problem with this scheme is
that memory management of the buffer is up to the calling application,
so the resulting pixbuf isn't free to handle those resources as it
pleases.

A relatively recent addition is gdk_pixbuf_new_from_bytes(), which
takes a GBytes buffer instead of a random guchar *. When a pixbuf
is created that way, it is assumed to be immutable, since a GBytes
is basically a shared reference into a byte buffer, and it's just
easier to think of it as immutable. (Nothing in C actually enforces
immutability, but the API indicates that convention.)

Internally, GdkPixbuf actually prefers to be created from a
GBytes. It will downgrade itself to a guchar * buffer if
something calls the old gdk_pixbuf_get_pixels(); in the best case,
that will just take ownership of the internal buffer from the
GBytes (if the GBytes has a single reference count); in the worst
case, it will copy the buffer from the GBytes and retain ownership
of that copy. In either case, when the pixbuf downgrades itself to
pixels, it is assumed that the calling application will modify the
pixel data.

What would immutable pixbufs look like?

I mentioned this a bit in "Reducing Copies". The
loaders in gdk-pixbuf would create immutable pixbufs, with an internal
representation that is friendly to GPUs. In the proposed scheme, that
internal representation would be a Cairo image surface; it can be
something else if GTK/GDK eventually prefer a different way of
shipping image data into the toolkit.

Those pixbufs would be immutable. In true C fashion we can call it
undefined behavior to change the pixel data (say, an app could request
gimme_the_cairo_surface and tweak it, but that would not be
supported).

I think we could also have a "just give me the pixels" API, and a
"create a pixbuf from these pixels" one, but those would be one-time
conversions at the edge of the API. Internally, the pixel data that
actually lives inside a GdkPixbuf would remain immutable, in some
preferred representation, which is not necessarily what the
application sees.

What worked well

A small API to load multiple image formats, and paint the images
easily to the screen, while handling most of the X awkwardness
semi-automatically, was very useful!

A way to get and modify pixel data: applications clearly like doing
this. We can formalize it as an application-side thing only, and keep
the internal representation immutable and in a format that can evolve
according to the needs of the internal API.

Pluggable loaders, up to a point. Gdk-pixbuf doesn't support all the
image formats in the world out of the box, but it is relatively easy
for third-parties to provide loaders that, once installed, are
automatically usable for all applications.

What didn't work well

Having effectively two pixel formats supported, and nothing else:
gdk-pixbuf does packed RGB and unpremultiplied RGBA, and that's it.
This isn't completely terrible: applications which really want to
know about indexed or grayscale images, or high bit-depth ones, are
probably specialized enough that they can afford to have their own
custom loaders with all the functionality they need.

Pluggable loaders, up to a point. While it is relatively easy to
create third-party loaders, installation is awkward from a system's
perspective: one has to run the script to regenerate the loader cache,
there are more shared libraries running around, and the loaders are
not sandboxed by default.

I'm not sure if it's worthwhile to let any application read "any"
image format if gdk-pixbuf supports it. If your word processor lets
you paste an image into the document... do you want it to use
gdk-pixbuf's limited view of things and include a high bit-depth image
with its probably inadequate conversions? Or would you rather do some
processing by hand to ensure that the image looks as good as it can,
in the format that your word processor actually supports? I don't
know.

The API for animations is very awkward. We don't even support
APNG... but honestly I don't recall actually seeing one of those in
the wild.

The progressive loading API is awkward. The "feed some bytes into the
loader" part is mostly okay; the "notify me about changes to the pixel
data" is questionable nowadays. Web browsers don't use it; they
implement their own loaders. Even EOG doesn't use it.

I think most code that actually connects to GdkPixbufLoader's
signals only uses the size-prepared signal — the one that gets
emitted soon after reading the image headers, when the loader gets to
know the dimensions of the image. Apps sometimes use this to say,
"this image is W*H pixels in size", but don't actually decode the
rest of the image.

The gdk-pixbuf model of static images, or GIF animations, doesn't work
well for multi-page TIFFs. I'm not sure if this is actually a problem.
Again, applications with actual needs for multi-page TIFFs are
probably specialized enough that they will want a full-featured TIFF
loader of their own.

Awkward architectures

Thumbnailers

The thumbnailing system has slowly been moving towards a model where
we actually have thumbnailers specific to each file format, instead of
just assuming that we can dump any image into a gdk-pixbuf loader.

If we take this all the way, we would be able to remove some weird
code in, for example, the JPEG pixbuf loader. Right now it supports
loading images at a size that the calling code requests, not only at
the "natural" size of the JPEG. The thumbnailer can say, "I want to
load this JPEG at 128x128 pixels" or whatever, and in theory the
JPEG loader will do the minimal amount of work required to do that.
It's not 100% clear to me if this is actually working as intended, or
if we downscale the whole image anyway.

We had a distinction between in-process and out-of-process
thumbnailers, and it had to do with the way pixbuf loaders are used;
I'm not sure if they are all out-of-process and sandboxed now.

Non-raster data

There is a gdk-pixbuf loader for SVG images which uses librsvg
internally, but only in a very basic way: it simply loads the SVG at
its preferred size. Librsvg jumps through some hoops to compute a
"preferred size" for SVGs, as not all of them actually indicate one.
The SVG model would rather have the renderer say that the SVG is to be
inserted into a rectangle of certain width/height, and
scaled/positioned inside the rectangle according to some other
parameters (i.e. like one would put it inside an HTML document, with a
preserveAspectRatio attribute and all that). GNOME applications
historically operated with a different model, one of "load me an
image, I'll scale it to whatever size, and paint it".

This gdk-pixbuf loader for SVG files gets used for the SVG
thumbnailer, or more accurately, the "throw random images into a
gdk-pixbuf loader" thumbnailer. It may be better/cleaner to have a
specific thumbnailer for SVGs instead.

Even EOG, our by-default image viewer, doesn't use the gdk-pixbuf
loader for SVGs: it actually special-cases them and uses librsvg
directly, to be able to load an SVG once and re-render it at different
sizes if one changes the zoom factor, for example.

GTK reads its SVG icons... without using librsvg... by assuming that
librsvg installed its gdk-pixbuf loader, so it loads them as any
normal raster image. This is kind of dirty, but I can't quite pinpoint
why. I'm sure it would be convenient for icon themes to ship a single
SVG with tons of icons, and some metadata on their ids, so that GTK
could pick them out of the SVG file with rsvg_render_cairo_sub() or
something. Right now icon theme authors are responsible for splitting
out those huge SVGs into many little ones, one for each icon, and I
don't think that's their favorite thing in the world to do :)

Exotic raster data

High bit-depth images... would you expect EOG to be able to load them?
Certainly; maybe not with all the fancy conversions from a real RAW
photo editor. But maybe this can be done as EOG-specific plugins,
rather than as low in the platform as the gdk-pixbuf loaders?

(Same thing for thumbnailing high bit-depth images: the loading code
should just provide its own thumbnailer program for those.)

Non-image metadata

The gdk_pixbuf_set_option / gdk_pixbuf_get_option family of
functions is so that pixbuf loaders can set key/value pairs of strings
onto a pixbuf. Loaders use this for comment blocks, or ICC profiles
for color calibration, or DPI information for images that have it, or
EXIF data from photos. It is up to applications to actually use this
information.

It's a bit uncomfortable that gdk-pixbuf makes no promises about the
kind of raster data it gives to the caller: right now it is raw
RGB(A) data that is not gamma-corrected nor in any particular color
space. It is up to the caller to see if the pixbuf has an ICC profile
attached to it as an option. Effectively, this means that
applications don't know if they are getting sRGB, or linear RGB, or
what... unless they specifically care to look.

The gdk-pixbuf API could probably make promises: if you call this
function you will get sRGB data; if you call this other function,
you'll get the raw RGBA data and we'll tell you its
colorspace/gamma/etc.

The various set_option / get_option pairs are also usable by the
gdk-pixbuf saving code (up to now we have just talked about
loaders). I don't know enough about how applications use the saving
code in gdk-pixbuf... the thumbnailers use it to save PNGs or JPEGs,
but other apps? No idea.

What I would like to see

Immutable pixbufs in a useful format. I've started work on
this in a merge request; the internal code is now ready
to take in different internal representations of pixel data. My goal
is to make Cairo image surfaces the preferred, immutable, internal
representation. This would give us a
gdk_pixbuf_get_cairo_surface(), which pretty much everything that
needs one reimplements by hand.

Find places that assume mutable pixbufs. To gradually deprecate
mutable pixbufs, I think we would need to audit applications and
libraries to find places that cause GdkPixbuf structures to degrade
into mutable ones: basically, find callers of
gdk_pixbuf_get_pixels() and related functions, see what they do, and
reimplement them differently. Maybe they don't need to tint icons by
hand anymore? Maybe they don't need icons anymore, given our
changing UI paradigms? Maybe they are using gdk-pixbuf as an image
loader only?

Reconsider the loading-updates API. Do we need the
GdkPixbufLoader::area-updated signal at all? Does anything break
if we just... not emit it, or just emit it once at the end of the
loading process? (Caveat: keeping it unchanged more or less means
that "immutable pixbufs" as loaded by gdk-pixbuf actually mutate while
being loaded, and this mutation is exposed to applications.)

Sandboxed loaders. While these days gdk-pixbuf loaders prefer the
progressive feed-it-bytes API, sandboxed loaders would maybe prefer a
read-a-whole-file approach. I don't know enough about memfd or how
sandboxes pass data around to know how either would work.

Move loaders to Rust. Yes, really. Loaders are
security-sensitive, and while we do need to sandbox them, it would
certainly be better to do them in a memory-safe language. There are
already pure Rust-based image loaders: JPEG,
PNG, TIFF, GIF, ICO.
I have no idea how featureful they are. We can certainly try them
with gdk-pixbuf's own suite of test images. We can modify them to add
hooks for things like a size-prepared notification, if they don't
already have a way to read "just the image headers".

Rust makes it very easy to plug in micro-benchmarks,
fuzz testing, and other modern amenities. These would be
perfect for improving the loaders.

I started sketching a Rust backend for gdk-pixbuf
loaders some months ago, but there's nothing useful
yet. One mismatch between gdk-pixbuf's model for loaders, and the
existing Rust codecs, is that Rust codecs generally take something
that implements the Read trait: a blocking API to read bytes from
abstract sources; it's a pull API. The gdk-pixbuf model is a push
API: the calling code creates a loader object, and then pushes bytes
into it. The gdk-pixbuf convenience functions that take a
GInputStream basically pull bytes from the stream in a loop and push
each chunk into the loader until end-of-file.
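In plain Rust, with a hypothetical Loader type standing in for
GdkPixbufLoader, the pull-then-push pattern looks like this:

```rust
use std::io::Read;

// Hypothetical stand-in for GdkPixbufLoader's push API.
struct Loader {
    bytes: Vec<u8>,
}

impl Loader {
    fn new() -> Self {
        Loader { bytes: Vec::new() }
    }

    // Like gdk_pixbuf_loader_write(): push one chunk into the decoder.
    fn write(&mut self, chunk: &[u8]) {
        self.bytes.extend_from_slice(chunk);
    }

    // Like gdk_pixbuf_loader_close(): no more data is coming.
    fn close(self) -> Vec<u8> {
        self.bytes
    }
}

// What the GInputStream convenience functions do, in spirit: pull
// chunks from the blocking source and push them into the loader
// until the source reports end-of-file.
fn load_from_reader<R: Read>(mut source: R) -> std::io::Result<Vec<u8>> {
    let mut loader = Loader::new();
    let mut buf = [0u8; 4096];
    loop {
        let n = source.read(&mut buf)?;
        if n == 0 {
            break; // end of stream
        }
        loader.write(&buf[..n]);
    }
    Ok(loader.close())
}
```

Here the "decoded image" is just the accumulated bytes; a real loader
would be parsing them into pixels as they arrive.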

However, this cannot be flipped around easily. We could probably use
a second thread (easy, safe to do in Rust) to make the reader/decoder
thread block while the main thread pushes bytes into it.
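A minimal sketch of that second-thread scheme, using a channel to turn
pushed chunks into a blocking Read for the decoder thread. ChannelReader
and decode_pushed_bytes are my inventions for illustration, not
gdk-pixbuf or gtk-rs API:

```rust
use std::io::Read;
use std::sync::mpsc::{channel, Receiver};
use std::thread;

// Blocking Read adapter fed by a channel: the decoder thread blocks
// inside read() until the main thread pushes another chunk of bytes.
struct ChannelReader {
    rx: Receiver<Vec<u8>>,
    pending: Vec<u8>,
}

impl Read for ChannelReader {
    fn read(&mut self, out: &mut [u8]) -> std::io::Result<usize> {
        while self.pending.is_empty() {
            match self.rx.recv() {
                Ok(chunk) => self.pending = chunk,
                Err(_) => return Ok(0), // sender dropped: end of stream
            }
        }
        let n = out.len().min(self.pending.len());
        out[..n].copy_from_slice(&self.pending[..n]);
        self.pending.drain(..n);
        Ok(n)
    }
}

fn decode_pushed_bytes(chunks: Vec<Vec<u8>>) -> Vec<u8> {
    let (tx, rx) = channel();
    // The "decoder" thread: any Rust codec that consumes a Read impl
    // would sit here; we just slurp the bytes to show the plumbing.
    let decoder = thread::spawn(move || {
        let mut reader = ChannelReader { rx, pending: Vec::new() };
        let mut all = Vec::new();
        reader.read_to_end(&mut all).unwrap();
        all
    });
    for chunk in chunks {
        tx.send(chunk).unwrap(); // the push side, as in GdkPixbufLoader
    }
    drop(tx); // dropping the sender signals EOF to the reader
    decoder.join().unwrap()
}
```

Dropping the sender is what turns the push API's "close" operation into
an EOF on the pull side.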

Also, I don't know how the Rust bindings for GIO present things like
GInputStream and friends, with our nice async cancellables and all
that.

Deprecate animations? Move that code to EOG, just so one can look
at memes in it? Do any "real apps" actually use GIF animations for
their UI?

Formalize promises around returned color profiles, gamma, etc. As
mentioned above: have an "easy API" that returns sRGB, and a "raw API"
that returns the ARGB data from the image, plus info on its ICC
profile, gamma, or any other info needed to turn this into a
"good enough to be universal" representation. (I think all the
Apple APIs that pass colors around do so with an ICC profile attached,
which seems... pretty much necessary for correctness.)

Remove the internal MIME-sniffing machinery. And just use GIO's.

Deprecate the crufty/old APIs in gdk-pixbuf.
Scaling/transformation, compositing, GdkPixdata,
gdk-pixbuf-csource, all those. Pixel crunching can be done by
Cairo; the others are better done with GResource these days.

Figure out if we want blessed codecs; fix thumbnailers. Link those
loaders statically, unconditionally. Exotic formats can go in their
own custom thumbnailers. Figure out if we want sandboxed loaders for
everything, or just for user-side images (not ones read from the
trusted system installation).

Have GTK4 communicate clearly about its drawing model. I think we
are having a disconnect between the GUI chrome, which is CSS/GPU
friendly, and graphical content generated by applications, which by
default right now is done via Cairo. And having Cairo as a to-screen
and to-printer API is certainly very convenient! You Wouldn't Print a
GUI, but certainly you would print a displayed document.

It would also be useful for GTK4 to actually define what its preferred
image format is if it wants to ship it to the GPU with as little work
as possible. Maybe it's a Cairo image surface? Maybe something else?

Conclusion

We seem to change imaging models every ten years or so. Xlib, then
Xrender with Cairo, then GPUs and CSS-based drawing for widgets.
We've gone from trusted data on your local machine, to potentially malicious data that
rains from the Internet. Gdk-pixbuf has spanned all of these periods
so far, and it is due for a big change.

The bug that caused two brown-paper-bag releases of librsvg —
because it was leaking all the SVG nodes — has been interesting.

Memory leaks in Rust? Isn't it supposed to prevent that?

Well, yeah, but the leaks were caused by the C side of things, and by
unsafe code in Rust, which does not prevent leaks.

The first part of the bug was easy: C code started calling a
function implemented in Rust, which returns a newly-acquired reference
to an SVG node. The old code simply got a pointer to the node,
without acquiring a reference. The new code was forgetting to
rsvg_node_unref(). No biggie.

The second part of the bug was trickier to find. The C code
was apparently calling all the functions to unref nodes as
appropriate, and even calling the rsvg_tree_free() function in the
end; this is the "free the whole SVG tree" function.

There are these types:

// We take a pointer to this and expose it as an opaque pointer to C
pub enum RsvgTree {}

// This is the real structure we care about
pub struct Tree {
    // This is the Rc that was getting leaked
    pub root: Rc<Node>,
    ...
}

Tree is the real struct that holds the root of the SVG tree and some
other data. Each node is an Rc<Node>; the root node was getting
leaked (... and all the children, recursively) because its reference
count never went down from 1.

RsvgTree is just an empty type. The code does an unsafe cast of
*const Tree as *const RsvgTree in order to expose a raw pointer to
the C code.

The rsvg_tree_free() function, callable from C, looked like this:

#[no_mangle]
pub extern "C" fn rsvg_tree_free(tree: *mut RsvgTree) {
    if !tree.is_null() {
        let _ = unsafe { Box::from_raw(tree) };
        // ^ this returns a Box<RsvgTree>, which is an empty type!
    }
}

When we call Box::from_raw() on a *mut RsvgTree, it gives us back
a Box<RsvgTree>... which is a box of a zero-sized type. So, the program
frees zero memory when the box gets dropped.

The code was missing this cast:

let tree = unsafe { &mut *(tree as *mut Tree) };
// ^ this casts to the actual type inside the Box
let _ = unsafe { Box::from_raw(tree) };

So, tree as *mut Tree gives us a value which will cause
Box::from_raw() to return a Box<Tree>, which is what we intended.
Dropping the box will drop the Tree, reduce the last reference count
on the root node, and free all the nodes recursively.
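The whole bug fits in a self-contained sketch. Node, Tree, RsvgTree,
and the function names here are simplified stand-ins for the librsvg
types, not the real code:

```rust
use std::rc::Rc;

struct Node;

// Simplified stand-in for librsvg's Tree.
struct Tree {
    root: Rc<Node>,
}

// Opaque, empty type exposed to C, as in the post.
enum RsvgTree {}

// Hand the Tree to C as an opaque raw pointer.
fn tree_into_raw(tree: Box<Tree>) -> *mut RsvgTree {
    Box::into_raw(tree) as *mut RsvgTree
}

// The fixed rsvg_tree_free(): cast back to the real type *before*
// reconstructing the Box, so that Tree's destructor actually runs.
fn tree_free(raw: *mut RsvgTree) {
    if !raw.is_null() {
        let _ = unsafe { Box::from_raw(raw as *mut Tree) };
    }
}
```

With the `as *mut Tree` cast removed from tree_free(), the Box would be
a box of a zero-sized type again: dropping it frees nothing, the root's
reference count stays put, and the whole tree leaks.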

Monitoring an Rc<T>'s reference count in gdb

So, how does one set a gdb watchpoint on the reference count?

First I set a breakpoint on a function which I knew would get passed
the Rc<Node> I care about:
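The session looks something like this; rsvg_node_add_child() is a
hypothetical stand-in for whatever function receives the node, and the
exact field path inside Rc<T> depends on your Rust/std version:

```
(gdb) break rsvg_node_add_child
(gdb) run
(gdb) print node.ptr.pointer.strong.value.value
(gdb) watch -location node.ptr.pointer.strong.value.value
(gdb) continue
```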

At this point I can print a stack trace and see if it makes sense,
check that the refs/unrefs are matched, etc.

TL;DR: dig into the Rc<T> until you find the reference count, and
watch it. It's wrapped in several layers of Rust-y types; NonNull
pointers, an RcBox for the actual container of the refcount plus the
object it's wrapping, and Cells for the refcount values. Just dig
until you reach the refcount values and they are there.

So, how did I find the missing cast?

Using that gdb recipe, I watched the reference count of the toplevel
SVG node change until the program exited. When the program
terminated, the reference count was 1 — it should have dropped to 0 if
there was no memory leak.

The last place where the toplevel node loses a reference is in
rsvg_tree_free(). I ran the program again and checked if that
function was being called; it was being called correctly. So I knew
that the problem must lie in that function. After a little
head-scratching, I found the missing cast. Other functions of the
form rsvg_tree_whatever() had that cast, but rsvg_tree_free() was
missing it.

I think Rust now has better facilities to tag structs that are exposed
as raw pointers to extern code, to avoid this kind of perilous
casting. We'll see.

Over in this issue we are discussing how to add debug logging
for librsvg.

A popular way to add logging to Rust code is to use the log crate.
This lets you sprinkle simple messages in your code:

error!("something bad happened: {}", foo);
debug!("a debug message");

However, the log crate is just a facade, and by default the
messages do not get emitted anywhere. The calling code has to set up
a logger. Crates like env_logger let one set up a logger, during
program initialization, that gets configured through an environment
variable.

And this is a problem for librsvg: we are not the program's
initialization! Librsvg is a library; it doesn't have a main()
function. And since most of the calling code is not Rust, we can't
assume that they can call code that can initialize the logging
framework.
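The shape of the problem shows up in a std-only sketch of such a
facade. Logger, LOGGER, and debug() here are simplified stand-ins for
what the log crate provides; debug() returns a bool only to make the
behavior visible:

```rust
use std::sync::{Mutex, OnceLock};

// Simplified stand-in for the log crate's facade: a global logger
// slot that stays empty until the host program installs something.
trait Logger: Send + Sync {
    fn log(&self, msg: &str);
}

static LOGGER: OnceLock<Box<dyn Logger>> = OnceLock::new();

// What a library-side debug!() boils down to: if no logger was ever
// installed, the message silently goes nowhere.
fn debug(msg: &str) -> bool {
    if let Some(logger) = LOGGER.get() {
        logger.log(msg);
        true
    } else {
        false
    }
}

// A logger that collects messages; env_logger would write to stderr.
struct VecLogger(Mutex<Vec<String>>);

impl Logger for VecLogger {
    fn log(&self, msg: &str) {
        self.0.lock().unwrap().push(msg.to_string());
    }
}
```

If nothing ever calls LOGGER.set(), every debug() call is silently
dropped — which is exactly what happens to librsvg's messages when the
host program never initializes a logger.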

Why not use glib's logging stuff?

Currently this is a bit clunky to use from Rust, since glib's
structured logging functions are not bound yet in glib-rs. Maybe it
would be good to bind them and get this over with.

What user experience do we want?

In the past, what has worked well for me to do logging from libraries
is to allow the user to set an environment variable to control the
logging, or to drop a log configuration file in their $HOME. The
former works well when the user is in control of running the program
that will print the logs; the latter is useful when the user is not
directly in control, like for gnome-shell, which gets launched through
a lot of magic during session startup.

For librsvg, it's probably enough to just use an environment
variable. Set RSVG_LOG=parse_errors, run your program, and get
useful output. Push button, receive bacon.
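A sketch of that check: the variable name RSVG_LOG comes from above,
but the comma-separated category syntax is just my assumption, not
anything librsvg has committed to:

```rust
use std::env;

// True if the given debug category appears in the RSVG_LOG value,
// treated as a comma-separated list (an assumed syntax).
fn log_enabled_in(value: Option<&str>, category: &str) -> bool {
    value.map_or(false, |v| v.split(',').any(|c| c.trim() == category))
}

// The library-side entry point: consult the environment variable.
fn log_enabled(category: &str) -> bool {
    log_enabled_in(env::var("RSVG_LOG").ok().as_deref(), category)
}
```

Calls like log_enabled("parse_errors") would then gate the actual log
statements inside the library.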

Other options in Rust?

There is a slog crate which looks promising. Instead of using
context-less macros which depend on a single global logger, it
provides logging macros to which you pass a logger object.

For librsvg, this means that the basic RsvgHandle could create its
own logger, based on an environment variable or whatever, and pass it
around to all its child functions for when they need to log something.

I am incredibly happy because of three big things that are going
on in librsvg right now:

Paolo Borelli finished porting all the CSS properties to Rust.
What was once a gigantic RsvgState struct in C is totally gone,
along with all the janky C code to parse individual properties.
The process of porting RsvgState to Rust has been going on for
about two months, and has involved many multi-commit
merge requests and refactorings. This is a tremendous amount of
really good work! The result is all in Rust now in a State
struct, which is opaque from C's viewpoint. The only places in C
that still require accessors to the State are in the filter
effects code. Which brings me to...

Ivan Molodetskikh, my Summer of Code student, submitted his first
merge request and it's merged to master now. This ports
the bookkeeping infrastructure for SVG filters to Rust, and also
the feOffset filter is ported now. Right now the code doesn't do
anything fancy to iterate over the pixels of Cairo image surfaces;
that will come later. I am very happy that filters, which were a
huge barrier, are now starting to get chipped away into nicer code.

I have started to move librsvg's old representation of CSS
properties into something that can really represent properties that
are not specified, or explicitly set to inherit from an SVG
element's parent, or set to a normal value. Librsvg never had a
representation of property values that actually matched the SVG/CSS
specs; it just knew whether a property was specified or not for an
element. This worked fine for properties which the spec mandates
should inherit automatically, but those that don't were handled
through special hacks. The new code makes this a lot
cleaner. It should also make it easier to copy Servo's idioms for
property inheritance.

In ye olden days

In the context of GIMP/GNOME, the only thing that knew how to draw RGB
images to X11 windows (doing palette mapping for 256-color graphics
cards and dithering if necessary) was the GIMP. Later, when GTK+ was
written, it exported a GtkPreview widget, which could take an RGB
image buffer supplied by the application and render it to an X window
— this was what GIMP plug-ins could use in their user interface to
show, well, previews of what they were about to do with the user's
images. Later we got some obscure magic in a GdkColorContext
object, which helped allocate X11 colors for the X drawing primitives.
In turn, GdkColorContext came from the port that Miguel and I did of
XmHTML's color context object (and for those that remember, XmHTML
became the first version of GtkHtml; later it was rewritten as a port
of KDE's HTML widget). Thankfully all that stuff is gone now; we can
now assume that video cards are 24-bit RGB or better everywhere, and
there is no need to worry about limited color palettes and color
allocation.

Later, we started using the Imlib library, from the Enlightenment
project, as an easy API to load images — the APIs from libungif,
libjpeg, libpng, etc. were not something one really wanted to use
directly — and also to keep images in memory with a uniform
representation. Unfortunately, Imlib's memory management was
peculiar, as it was tied to Enlightenment's model for caching and
rendering loaded/scaled images.

A bunch of people worked to write GdkPixbuf: it kept Imlib's concepts
of a unified representation for image data, and an easy API to load
various image formats. It added support for an alpha channel (we only
had 1-bit masks before), and it put memory management in the hands of
the calling application, in the form of reference counting. GdkPixbuf
obtained some high-quality scaling functions, mainly for use by Eye Of
Gnome (our image viewer) and by applications that just needed scaling
instead of arbitrary transformations.

Later, we got libart, the first library in GNOME to do antialiased
vector rendering and affine transformations. Libart was more or less
compatible with GdkPixbuf: they both had the same internal
representation for pixel data, but one had to pass the
pixels/width/height/rowstride around by hand.

Mea culpa

Back then I didn't understand premultiplied alpha,
which is now ubiquitous. The GIMP made the decision to use
non-premultiplied alpha when it introduced layers with transparency,
probably to "avoid losing data" from transparent pixels. GdkPixbuf
follows the same scheme.

(Now that the GIMP uses GEGL for its internal representation of
images... I have no idea what it does with respect to alpha.)

Cairo and afterwards

Some time after the libart days, we got Cairo and pixman. Cairo had a
different representation of images than GdkPixbuf's, and it supported
more pixel formats and color models.

GTK2 got patched to use Cairo in the toplevel API. We still had a
dichotomy between Cairo's image surfaces, which are ARGB premultiplied
data in memory, and GdkPixbufs, which are RGBA non-premultiplied.
There are utilities in GTK+ to do these translations, but they are
inconvenient: every time a program loads an image with GdkPixbuf's
easy API, a translation has to happen from non-premul RGBA to premul
ARGB.
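Per pixel, that translation amounts to the following sketch (pixman's
real conversion code is rather more optimized than this):

```rust
// GdkPixbuf pixel: non-premultiplied bytes in R, G, B, A order.
// Cairo ARGB32 pixel: premultiplied, packed as 0xAARRGGBB in a
// native-endian u32.
fn rgba_to_premul_argb32(px: [u8; 4]) -> u32 {
    let [r, g, b, a] = px;
    // Scale each color channel by alpha, rounding to nearest.
    let premul = |c: u8| ((c as u32 * a as u32 + 127) / 255) as u32;
    ((a as u32) << 24) | (premul(r) << 16) | (premul(g) << 8) | premul(b)
}
```

The reverse direction has to divide by alpha again, which is where the
round-trips lose precision for partially transparent pixels.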

Having two formats means that we inevitably do translations back and
forth of practically the same data. For example, when one embeds a
JPEG inside an SVG, librsvg will read that JPEG using GdkPixbuf,
translate it to Cairo's representation, composite it with Cairo onto
the final result, and finally translate the whole thing back to a
GdkPixbuf... if someone uses librsvg's legacy APIs to output pixbufs
instead of rendering directly to a Cairo surface.

Who uses that legacy API? GTK+, of course! GTK+ loads scalable SVG
icons with GdkPixbuf's loader API, which dynamically links librsvg at
runtime: in effect, GTK+ doesn't use librsvg directly. And the SVG
pixbuf loader uses the "gimme a pixbuf" API in librsvg.

GPUs

Then, we got GPUs everywhere. Each GPU has its own preferred pixel
format. Image data has to be copied to the GPU at some point.
Cairo's ARGB needs to be translated to the GPU's preferred format and
alignment.

Summary so far

Libraries that load images from standard formats have different
output formats. Generally they can be coaxed into spitting out ARGB or
RGBA, but we don't expect them to support any random representation
that a GPU may want.

GdkPixbuf uses non-premultiplied RGBA data, always in that order.

Cairo uses premultiplied ARGB in platform-endian 32-bit chunks: if
each pixel is 0xaarrggbb, then the bytes are shuffled around
depending on whether the platform is little-endian or big-endian.

Cairo internally uses a subset of the formats supported by pixman.

GPUs use whatever they damn well please.

Hilarity ensues.

What would we like to do?

We would like to reduce the number of translations between image
formats along the loading-processing-display pipeline. Here is a
plan:

Make sure Cairo/pixman support the image formats that GPUs generally
prefer. Have them do the necessary conversions if the rest of the
program passes an unsupported format. Ensure that a Cairo image
surface can be created with the GPU's preferred format.

Make GdkPixbuf just be a wrapper around a Cairo image surface.
GdkPixbuf is already an opaque structure, and it already knows how
to copy pixel data in case the calling code requests it, or wants to
turn a pixbuf from immutable to mutable.

Provide GdkPixbuf APIs that deal with Cairo image surfaces. For
example, deprecate gdk_pixbuf_new() and
gdk_pixbuf_new_from_data(), in favor of a new
gdk_pixbuf_new_from_cairo_image_surface(). Instead of
gdk_pixbuf_get_pixels() and related functions, have
gdk_pixbuf_get_cairo_image_surface(). Mark the "give me the pixel
data" functions as highly discouraged, and only for use really by
applications that want to use GdkPixbuf as an image loader and
little else.

Remove calls in GTK+ that cause image conversions; make them use
Cairo image surfaces directly, from GdkTexture up.

Audit applications to remove calls that cause image conversions.
Generally, look for where they use GdkPixbuf's deprecated APIs and
update them.

Is this really a performance problem?

This is in the "excess work" category of performance
issues. All those conversions are not really slow (they don't make up
the biggest part of profiles), but they are nevertheless things
that we could avoid doing. We may get some speedups, but it's
probably more interesting to look at things like power consumption.

Right now I'm seeing this not so much as a cool, minor optimization,
but more as a way to gradually modernize our image API.

We seem to change imaging models every N years (X11 -> libart
-> Cairo -> render trees in GPUs -> ???). It is very hard to change
applications to use different APIs. In the meantime, we can provide a
more linear path for image data, instead of doing unnecessary
conversions everywhere.