Introducing the Distance Field Generator

At least from the perspective of rendering, text is often the most complex part of a traditional two-dimensional user interface. In such an interface, the two main components are rectangular images and text. The rectangular images are often quite static, and can be represented by two triangles and four indexes into a texture atlas that is uploaded to graphics memory once and then retained. This is something that has low complexity and which the graphics hardware has been optimized to handle quickly.
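As a rough illustration of the "two triangles and four indexes" idea, the following sketch builds the geometry for one image stored in a texture atlas. The type and function names (Vertex, Quad, AtlasRect, makeQuad) are hypothetical and are not Qt API; the sketch only assumes that atlas coordinates are normalized to the 0..1 range, as is usual for textures.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// A vertex with a screen position (x, y) and a normalized atlas coordinate (u, v).
struct Vertex { float x, y, u, v; };

// Two triangles sharing four vertices: 4 vertices plus 6 indices.
struct Quad {
    std::array<Vertex, 4> vertices;
    std::array<std::uint16_t, 6> indices;
};

// Pixel rectangle of one image inside the atlas texture.
struct AtlasRect { int x, y, width, height; };

// Builds the quad that draws `src` (a region of an atlasWidth x atlasHeight
// texture) unscaled, with its top-left corner at (dstX, dstY).
Quad makeQuad(const AtlasRect &src, int atlasWidth, int atlasHeight,
              float dstX, float dstY)
{
    assert(atlasWidth > 0 && atlasHeight > 0);

    // Normalize the pixel rectangle to texture coordinates.
    const float u0 = float(src.x) / float(atlasWidth);
    const float v0 = float(src.y) / float(atlasHeight);
    const float u1 = float(src.x + src.width) / float(atlasWidth);
    const float v1 = float(src.y + src.height) / float(atlasHeight);
    const float x1 = dstX + float(src.width);
    const float y1 = dstY + float(src.height);

    Quad q;
    // Corners in order: top-left, top-right, bottom-right, bottom-left.
    q.vertices = {{ {dstX, dstY, u0, v0}, {x1, dstY, u1, v0},
                    {x1, y1, u1, v1},     {dstX, y1, u0, v1} }};
    // Two triangles: (0, 1, 2) and (0, 2, 3).
    q.indices = {{0, 1, 2, 0, 2, 3}};
    return q;
}
```

Once the atlas is uploaded, drawing another image only means emitting another small quad like this one, which is why this path is cheap.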

Text starts as a series of indexes into an international database of writing systems (Unicode). It is then, based on some selection algorithm, combined with one or more fonts, each of which is in principle a collection of shapes plus some lookup tables and executable programs that convert said indexes into shapes and relative positions. These shapes, basically filled paths made out of Bézier curves, then have to be rasterized at a specified size, and they can range from simple and neat outlines to complex ones with lots of detail. (By rasterization, I mean finding out how much of each target pixel, or subpixel in some cases, is covered by the shape.)

Objectively the most beautiful character in the Latin alphabet. Here represented by a rasterized image of three channels, respectively giving the coverage of the red, green and blue target subpixels. Scaled by 400% to make the pixels visible.

All combined, it is a heavy process. But luckily, instead of redoing every step for every string, we can often cache intermediate results and reuse them later.

For instance, it is possible to rasterize the glyphs the first time they are used, keep the result in memory, and then at each subsequent use render these glyphs the same way images are rendered as described above: by putting the rasterized glyphs in a texture atlas and representing the glyphs by two triangles and indexes into this atlas. In fact, when the Text.NativeRendering render type is in use in Qt Quick, this is precisely what happens. In this case, we will ask the underlying font system (CoreText on macOS, GDI/DirectWrite on Windows and FreeType on Linux) to rasterize the glyphs at a specific size, and then we will upload these to a texture atlas which can later be referenced by the triangles we put on the screen.

Contents of texture atlas in a typical Qt application.

There are some limitations to this approach, however: Since the font size has to be known before the glyphs are rasterized, we may end up rasterizing and caching the same glyphs multiple times if the text in the UI comes in many different sizes. For some UIs that can be too heavy both for frame rate and memory consumption. Animations on the font size, for instance, can cause us to rasterize the shapes again for every frame. This rasterization is also done on the CPU, which means we are not using the resources of the device to their full potential when preparing the glyphs.

Additionally, transformations on the NativeRendering text will give pixelation artifacts, since they will be applied to the pre-rasterized image of the glyph, not its actual mathematical shape.

So what is the alternative?

For a more flexible approach, we want to actually do the rasterization on the GPU, while rendering our frame. If we can somehow get the shapes into the graphics memory and rasterize them quickly using a fragment shader, we free up CPU resources and allow both transformations and size changes without any additional penalty on performance.

There are several approaches to this problem. The way it is done in Qt is by using so-called distance fields. Instead of storing the rasterized glyphs in texture memory, we store a representation of the shapes in a texture atlas where each texel contains the distance to the nearest point on the glyph outline rather than the coverage.

A distance field of the same Q, as an 8-bit map where each value is set to the distance to the nearest point on the outline of the glyph
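To make the idea of "distance per texel" concrete, here is a minimal brute-force sketch that derives an 8-bit distance field from a binary inside/outside bitmap. It is not how Qt's generator works (Qt starts from the font outlines, and a real implementation would be much faster); the function name makeDistanceField, the `spread` parameter and the convention that values above 128 mean "inside" are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Brute-force signed distance field from a binary bitmap (non-zero = inside
// the glyph). Each output texel holds the distance to the nearest texel of the
// opposite state, mapped to 8 bits: about 128 near the outline, larger inside,
// smaller outside. `spread` is the distance in texels that maps to the ends of
// the 8-bit range.
std::vector<std::uint8_t> makeDistanceField(const std::vector<std::uint8_t> &inside,
                                            int width, int height, float spread)
{
    assert(width > 0 && height > 0 && spread > 0.0f);
    assert(inside.size() == std::size_t(width) * std::size_t(height));

    std::vector<std::uint8_t> field(inside.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const bool self = inside[y * width + x] != 0;
            float nearest = std::numeric_limits<float>::max();
            // Exhaustive search over all texels: fine for a sketch only.
            for (int sy = 0; sy < height; ++sy) {
                for (int sx = 0; sx < width; ++sx) {
                    if ((inside[sy * width + sx] != 0) == self)
                        continue; // same state, not across the outline
                    const float dx = float(sx - x);
                    const float dy = float(sy - y);
                    nearest = std::min(nearest, std::sqrt(dx * dx + dy * dy));
                }
            }
            // Positive inside, negative outside, clamped to the spread.
            const float clamped = std::min(nearest, spread);
            const float signedDistance = self ? clamped : -clamped;
            const float normalized = 0.5f + 0.5f * signedDistance / spread; // 0..1
            field[y * width + x] = std::uint8_t(std::lround(normalized * 255.0f));
        }
    }
    return field;
}
```

Because the stored values vary smoothly across the outline, they survive texture filtering and scaling much better than a coverage bitmap does.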

Once these distance fields are created and uploaded to texture memory, we can render glyphs at any font size and scale quickly on the GPU. But the process of converting the shapes from the fonts into distance fields is still a bottleneck for startup time, and that in particular is what this blog is about.
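The per-pixel work on the GPU can be very small. The sketch below is a CPU-side stand-in for what a fragment shader could do with one filtered distance sample: threshold it around the outline value with a smooth ramp to get anti-aliased coverage. The 0.5 outline value and the `halfWidth` parameter are assumptions for the example, not the exact constants Qt uses.

```cpp
#include <algorithm>
#include <cassert>

// Same definition as the GLSL smoothstep() built-in: 0 below edge0,
// 1 above edge1, and a smooth Hermite ramp in between.
float smoothStep(float edge0, float edge1, float x)
{
    assert(edge0 < edge1);
    const float t = std::min(std::max((x - edge0) / (edge1 - edge0), 0.0f), 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

// Converts one sampled distance (0..1, with 0.5 on the outline and larger
// values inside the glyph) into coverage. `halfWidth` controls how soft the
// edge is; a renderer would shrink it as the glyph is scaled up on screen.
float coverageFromDistance(float distance, float halfWidth)
{
    return smoothStep(0.5f - halfWidth, 0.5f + halfWidth, distance);
}
```

Changing the font size or applying a transformation only changes where the texture is sampled and how wide the ramp is, so no new rasterization is needed.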

So what is the problem?

Creating the distance fields is CPU-bound, and, especially on lower-end hardware, it may be very costly. By setting the QT_LOGGING_RULES environment variable to "qt.scenegraph.time.glyph=true" we can gain some insight into what that cost is. Let us for instance say that we run an example that displays 50 unique Latin characters with the Deja Vu Sans font (the simple and neat outlines further up). With the logging turned on, and on an NXP i.MX6 we have for testing in our QA lab, we get the following output:
qt.scenegraph.time.glyph: distancefield: 50 glyphs prepared in 25ms, rendering=19, upload=6

From this output we can read that generating the necessary assets for these 50 glyphs took 19 ms, over one whole frame, whereas uploading the data to the graphics memory took 6 ms. It is the 19 ms for converting into distance fields that we will be able to reduce. These 19 ms may not seem like a lot, but they will cause the rendering to skip a frame at the point where it happens. If the 50 glyphs are displayed at startup, then those 25 ms may not be as noticeable, but if it is done during an animation, it would be something a user could notice. It is worth mentioning again, though, that it is a one-time cost as long as the font remains in use.
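If you want to track these numbers automatically, a log line of the form shown above can be parsed with a few lines of code. This is only a sketch: the struct and function names are hypothetical, the exact log format may differ between Qt versions, and the 60 Hz refresh rate used in the test is an assumption about the display.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Values reported by one "qt.scenegraph.time.glyph" distance field line.
struct GlyphTiming { int glyphs, totalMs, renderingMs, uploadMs; };

// Parses a line such as
//   qt.scenegraph.time.glyph: distancefield: 50 glyphs prepared in 25ms, rendering=19, upload=6
// Returns false if the line does not match that pattern.
bool parseGlyphTiming(const std::string &line, GlyphTiming &out)
{
    return std::sscanf(line.c_str(),
                       "qt.scenegraph.time.glyph: distancefield: %d glyphs "
                       "prepared in %dms, rendering=%d, upload=%d",
                       &out.glyphs, &out.totalMs,
                       &out.renderingMs, &out.uploadMs) == 4;
}

// Number of frame intervals needed to fit `ms` milliseconds of blocking work
// at the given refresh rate. Anything above 1 means at least one skipped frame.
int frameIntervalsNeeded(int ms, double refreshHz)
{
    return int(std::ceil(ms * refreshHz / 1000.0));
}
```

At an assumed 60 Hz, the 19 ms of distance field generation needs two frame intervals, which matches the skipped frame described above.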

Running the same for the HVD Peace font (linked as the complex font above), we get the following output:
qt.scenegraph.time.glyph: distancefield: 50 glyphs prepared in 1016ms, rendering=1010, upload=6

In this case, we can see that rendering the distance fields takes a full second, due to the high complexity of the outlines in use.

Another use case where we may see high costs of generating distance fields is if the number of unique glyphs is very high. So let us test an arbitrary, auto-generated “lorem ipsum” text in Chinese with 592 distinct characters:
qt.scenegraph.time.glyph: distancefield: 592 glyphs prepared in 1167ms, rendering=1107, upload=60

Again, we see that generating the distance fields takes over one second. In this case, the upload also takes a bit longer, since there is more data to be uploaded into graphics memory. There is not much to be done about that though, other than making sure it is done at startup time and not while the user is watching a smooth animation. As mentioned, though, I will focus on the rendering part in this blog.

So what is the solution?

In Qt 5.12, we will release a tool to help you improve on this for your application. It is called “Qt Distance Field Generator” and you can already find the documentation in our documentation snapshot.

The way this works is that it allows you to pregenerate the distance fields for either a selection of the glyphs in a font or all of them. Then you can append these distance fields as a new font table at the end of the font file. Since this custom font table follows SFNT conventions, the font will still be usable as a normal TrueType or OpenType file (SFNT mandates that unsupported font tables are ignored).
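The reason an extra table is harmless is visible in the SFNT container layout: the file starts with a table directory of (tag, checksum, offset, length) records, and a reader looks up only the tags it understands. The sketch below reads that directory from a byte buffer. The names are hypothetical, the "demo" tag used in the test is made up for illustration (it is not the tag Qt uses), and checksums are not validated.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One entry of the SFNT table directory.
struct TableRecord { std::string tag; std::uint32_t offset; std::uint32_t length; };

// SFNT stores all integers in big-endian byte order.
static std::uint16_t readU16(const std::vector<std::uint8_t> &d, std::size_t pos)
{
    return std::uint16_t((d[pos] << 8) | d[pos + 1]);
}

static std::uint32_t readU32(const std::vector<std::uint8_t> &d, std::size_t pos)
{
    return (std::uint32_t(d[pos]) << 24) | (std::uint32_t(d[pos + 1]) << 16)
         | (std::uint32_t(d[pos + 2]) << 8) | std::uint32_t(d[pos + 3]);
}

// Reads the table directory: a 12-byte header (version, numTables and three
// search hints) followed by numTables 16-byte records. Returns an empty list
// if the buffer is too short to hold the directory.
std::vector<TableRecord> readTableDirectory(const std::vector<std::uint8_t> &font)
{
    std::vector<TableRecord> tables;
    if (font.size() < 12)
        return tables;
    const std::size_t numTables = readU16(font, 4);
    if (font.size() < 12 + numTables * 16)
        return tables;
    for (std::size_t i = 0; i < numTables; ++i) {
        const std::size_t pos = 12 + i * 16;
        TableRecord record;
        record.tag.assign(reinterpret_cast<const char *>(font.data() + pos), 4);
        record.offset = readU32(font, pos + 8);  // bytes 4..7 hold the checksum
        record.length = readU32(font, pos + 12);
        tables.push_back(record);
    }
    return tables;
}

// A consumer asks for the tags it knows; records with other tags are skipped.
bool hasTable(const std::vector<TableRecord> &tables, const std::string &tag)
{
    for (const TableRecord &record : tables) {
        if (record.tag == tag)
            return true;
    }
    return false;
}
```

A font system that has never heard of the custom table simply never asks for it, so the rest of the font behaves as before.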

So the font can be used as normal and is still compatible with e.g. Qt Widgets and Text.NativeRendering, where the rasterization will still go through the system.

When the font is used in Qt Quick with Text.QtRendering, however, the special font table will be detected, and its contents will be uploaded directly to graphics memory. The cache will therefore be prepopulated with the glyphs you have selected, and the application will only have to create distance fields at runtime if they are missing from this set.
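The runtime fallback amounts to a set difference: only glyphs that are requested but not already in the prepopulated cache still need a distance field generated on the CPU. A minimal sketch of that bookkeeping, with a hypothetical function name and plain glyph indexes standing in for whatever the real cache keys on, could look like this:

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Returns the glyph indexes from `requested` that are not covered by
// `pregenerated`, in first-use order and without duplicates. Only these
// would still need a distance field generated at run time.
std::vector<std::uint32_t> glyphsToGenerate(const std::vector<std::uint32_t> &requested,
                                            const std::set<std::uint32_t> &pregenerated)
{
    std::vector<std::uint32_t> missing;
    std::set<std::uint32_t> seen;
    for (std::uint32_t glyph : requested) {
        if (pregenerated.count(glyph) != 0)
            continue; // already in the prepopulated cache
        if (seen.insert(glyph).second)
            missing.push_back(glyph); // first time this glyph is requested
    }
    return missing;
}
```

If the pregenerated set covers everything the UI displays, the returned list is empty and only the upload cost remains.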

The result of this can be impressive. I repeated the experiments, but with fonts where I had pregenerated distance fields for all the glyphs that were used in the example.

As we can see, there is a great improvement when a lot of time is spent on creating the distance fields. In the case of the complex font, we got from 1016 ms to 4 ms. When more data is uploaded, that will still take time, but in the case of the Chinese text, the upload was actually faster than when the distance fields were created on the fly. This is most likely a pure coincidence, however, caused by the order of the glyphs in the cache causing slightly different layouts and sizes.

Another peculiar thing we can see is that the complex font is faster to load than the simple one. This is simply because the glyphs in that font are square and compact, so there is not a lot of unused space in the cache. Therefore the texture atlas is a little bit smaller than for the simpler font. The complexity of the outlines does not affect the loading time of the atlas of course.

Running the same tests on my Windows Desktop workstation, we see that there is not as much overhead for generating the distance fields, but there is still some performance gain to be seen in some cases.

For 50 Latin glyphs with Deja Vu Sans, both tests clocked in at 3 ms, which was mainly spent uploading the data. For HVD Peace, however, generating the distance fields took 131 ms (versus 1 ms for just the upload) and for the Chinese text it took 146 ms (versus 11 ms).

Hopefully this can help some of you get even better performance out of your Qt devices and applications. The feature is already available in the Qt 5.12 beta, so download the package and take it for a test drive right away.

Its library has some Distance Field and Packing functionality which you can integrate into your project using a C++ interface. Also, there is a command-line interface. The GUI is built with Qt, but the rest is independent of Qt.

Nice improvement. But honestly it would be a pain to use the tool manually for all projects/fonts. Wouldn’t it be easier for everyone to just create an on-disk cache where the distance field is created at first startup and then reused automatically? There already is some on-disk caching for QML files… That way it’s automatically enabled for all projects and there is no need to manually use another tool.

This is a good idea, but it would require some more logic around it: At which point do you generate the disk cache and call it done, for instance? If it is continuously updated throughout the application history and the feature is enabled for all text, then you may end up preloading a lot more glyphs than you actually want, and this will cause a regression in startup performance rather than an improvement.

The idea here is that if you are at the point where you do the types of optimizations this allows (for most use cases, just generating the distance fields at run time will be fine), then you will prefer control over convenience. This approach also allows you to build something which does not require “pre-heating” to generate caches before leaving the factory.

And if we assume a device UI uses maybe ten different fonts and generating the distance fields is a one-time cost, I don’t think pre-generating these caches will be a significant part of the development process.

That is not to say we will not explore this idea down the road, as I think it could definitely be a convenient addition, and it should be possible to build on the same mechanisms that are now already in place.

Yes, I found the env vars. The problem is that this overrides everything, so even fonts that don’t need the extra detail use up the extra memory. It is a waste.

Ideally, this should be settable on a per-font basis, right from QML preferably, so that the engine knows to provide a particular font with enough resolution. In the font loader element, there could be an SDF resolution related hint or two that get taken into account when generating the SDF.

And there should be an option to set that in the SDF generator as well.

And while throwing in ideas, it would be quite easy to modify the tool to also accept a collection of vector glyphs and consolidate them in an SDF font format, which will make it very easy to employ the SDF rendering for single flat color icons as well – quite useful IMO.

The original SDF technique (as made popular by Valve), which is also used in Qt Quick, has some problems. It distorts the glyph shapes when zoomed: sharp corners become rounded, giving all fonts a kind of “comic sans” look. I was wondering if you guys looked into some other techniques, namely Multi-channel SDF, or perhaps the Slug library?

At large sizes, yes, the level of precision in the default distance fields can cause artifacts such as rounded corners. We have researched different approaches to this issue in the past, but none of these have landed in Qt at this point.

There were also some performance issues and anti-aliasing issues at smaller sizes which I was not able to sort out (though there are still many possible optimizations to be made to this code). At least for my current implementation it made it unsuitable as a replacement for distance fields, though it could possibly be usable for large scale text. There is a patent pending for that algorithm, however, which makes it problematic to add an implementation of it in Qt.

Thanks for your response, Eskil. Cool to know, that you are considering more options.

Yea, I know about the Slug patent. I’m sure however, Eric would be happy to at least talk about licensing his algorithm.

Have you also checked the Multi-channel SDF technique by Viktor Chlumsky? I didn’t actually try it myself, but by looking at his paper and GitHub, it would seem that he is able to reconstruct the glyph shapes more accurately from an even lower resolution texture.

A few years ago when I was debugging shaping of Malayalam language in Qt 4.x, the same text was being run through the shaper over and over (whenever I switch the window, that is). Today Harfbuzz fixed the shaping issues once and for all, but I wonder whether Qt can optimize there.

Thanks for the feedback. In the case you refer to, my guess is that the example used QPainter::drawText() in a paint event. In this case, whenever the widget repainted, you would feed the same Unicode string into the paint engine and it would have to do all the steps to convert this into graphics on the screen: The shaping is definitely a significant part, but so are itemizing and font selection.

This is not an issue in Qt Quick, since its declarative nature makes it possible to only update parts which have changed during the repaint.

We introduced QStaticText as a convenience for working around this problem in Qt 4, which is basically a way to store the QString with all the shaping and font selection artifacts, so that these could be reused as long as the text did not change. The idea was to give an API which was symmetric to the drawText() API in order to be as easy to use as possible.

QTextLayout is similar, with a bit more power to control the layout but with a lower-level API. And this can also be used together with QGlyphRun/QRawFont for low-level access to the results.