When a feature needs to be labelled, the following (simplified) steps are undertaken:

if needed, the string goes through iconv to be converted to UTF-8

the string goes through fribidi to reorder the glyphs from “logical order” to “visual order”,
to support languages that are written from right to left. For example, supposing capital letters
stand for Arabic letters, the text stored in the feature as thisissomeARABICtext is
transformed into thisissomeCIBARAtext so that it can be fed transparently to the
renderers, which currently lay out glyphs from left to right

the string goes through line breaking, which is currently broken for RTL
scripts, as we render (for ARABIC\nTEXT)

TXET
CIBARA

instead of the required

CIBARA
TXET

the string goes through alignment, which is currently hacky and imprecise (thanks
to yours truly), as we (try to) achieve alignment by padding with spaces instead
of using precise offsets per line

the string is then passed as-is to the renderers, which are responsible for
laying out each glyph
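
The interaction between the reordering and line-breaking steps above can be reproduced with a toy model. The sketch below is not fribidi: it reuses the convention that capital letters stand for Arabic letters, and simply reverses every run of capitals. The function names are hypothetical. It shows that reordering the whole string before breaking lines yields the broken output, while breaking first and reordering each line yields the required one.

```python
import re

# Toy convention from the text: capital letters stand in for Arabic (RTL) letters.
# A maximal run of capitals, possibly joined by whitespace, is treated as one RTL run.
_RTL_RUN = re.compile(r"[A-Z]+(?:\s+[A-Z]+)*")


def to_visual(text):
    """Reverse every RTL run so that a left-to-right renderer displays it correctly."""
    return _RTL_RUN.sub(lambda m: m.group(0)[::-1], text)


def reorder_then_break(text):
    """Reorder the whole string first, then break lines (the problematic order)."""
    return to_visual(text).split("\n")


def break_then_reorder(text):
    """Break lines first, then reorder each line on its own."""
    return [to_visual(line) for line in text.split("\n")]


print(to_visual("thisissomeARABICtext"))   # thisissomeCIBARAtext
print(reorder_then_break("ARABIC\nTEXT"))  # ['TXET', 'CIBARA']
print(break_then_reorder("ARABIC\nTEXT"))  # ['CIBARA', 'TXET']
```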

While the current situation is simple to understand, and works reasonably well for Latin
languages, it has a number of shortcomings that this RFC aims to resolve:

All the renderers are duplicating layout logic:

splitting the utf8 string into individual glyphs

looking up a given glyph inside a font file

laying out a glyph with respect to the previous one (this includes spacing between
subsequent characters, or determining how many pixels should be vertically added
when a linebreak is encountered)

While Latin languages have a 1-to-1 mapping between string characters and font glyphs,
this is not the case for complex languages, where ligature glyphs need to be inserted
depending on context (see the Wikipedia complex text layout page for a more complete explanation).
While this is partly taken care of for us by fribidi for Arabic (which has a shaping
algorithm that inserts ligatures for us), we currently fail hard for other complex
text languages.

We don’t support double spacing or line spacing at all, and our text alignment
(center, right) is imprecise, as implementation for these would need to be added
to each individual renderer.
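
To make the alignment imprecision concrete, here is a minimal sketch comparing exact per-line offsets with what can be achieved by padding with whole space characters. The function names, the line widths and the space width are made up for illustration; they are not MapServer values.

```python
def line_offsets(line_widths, align):
    """Return the exact x offset of each line inside the label's bounding box."""
    box_width = max(line_widths)
    factor = {"left": 0.0, "center": 0.5, "right": 1.0}[align]
    return [(box_width - width) * factor for width in line_widths]


def padded_offsets(line_widths, align, space_width):
    """Approximate the same offsets using only a whole number of space characters."""
    exact = line_offsets(line_widths, align)
    return [int(offset // space_width) * space_width for offset in exact]


widths = [100.0, 63.0, 82.0]                 # hypothetical line widths in pixels
print(line_offsets(widths, "center"))        # [0.0, 18.5, 9.0]
print(line_offsets(widths, "right"))         # [0.0, 37.0, 18.0]
print(padded_offsets(widths, "right", 5.0))  # [0.0, 35.0, 15.0]
```

With a 5 pixel space, the right-aligned lines end up 2 and 3 pixels short of the edge, which is the kind of error that precise offsets avoid.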

In short, instead of passing a string of text and a starting position to the renderers,
we pass a list of glyphs (a glyph being a specific entry inside a specific font file)
along with their precise positioning, very similarly to what we currently do when
rendering FOLLOW labels, as is already outlined in bug 3611. The actual shaping and layout
happens in a single mapserver function, and the renderers just dumbly place glyphs where
they are told to.

Architecturally, this modification is significant, as the whole label rendering chain
needs to be refactored in order to take a list of positioned glyphs into account instead
of just plain text strings.

Freetype becomes a central mapserver dependency, as we need the information about glyph
sizes at a higher level than currently (where freetype is a dependency of the individual
renderers). We also need a single central glyph cache to be used across harfbuzz and the
renderers. fontsetObj will likely need to be extended to be more than a simple
filename hashtable.

Fribidi (which has some thread safety issues) is kept as the library that handles the
bidi algorithm for us. We do not use it for shaping anymore.

Harfbuzz is brought in to support complex script shaping (to simplify: the tool that will
insert ligatures between characters). Harfbuzz algorithms will only be applied to text strings
that have been determined not to be Latin (i.e. the slowdown induced by harfbuzz is limited
to those languages that actually require shaping).

UTHash (http://troydhanson.github.io/uthash/) is a header only hashtable
implementation that works with arbitrary keys and values, and is used for
accessing cached fonts and glyphs. Given some performance testing, we might
want to replace our own hashtable implementation with this one some day.

Note that the Fribidi and Harfbuzz dependencies will remain optional and can be
disabled at compile time for those only handling Latin scripts. Harfbuzz and Fribidi will
need to be enabled or disabled together.

(fontcache.c)
A global glyph and font cache (thread protected if needed) will ensure that
cached glyphs are reusable across multiple requests in the fastcgi case.
In turn, it requires some thread-level protection, and probably some pruning
so that it remains of reasonable size. Some APIs have changed in order to make
the fontcache accessible.
The fontcache contains caches for:
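
Independently of what exactly is cached, the thread protection and pruning mentioned above can be sketched as follows. This is a minimal illustration and not the fontcache.c implementation: the class name, the (font, size, codepoint) key, the least-recently-used pruning policy and the fake loader are all assumptions.

```python
import threading
from collections import OrderedDict


class GlyphCache:
    """Thread-protected, size-bounded cache keyed on (font, size, codepoint)."""

    def __init__(self, max_entries=1024):
        self._lock = threading.Lock()
        self._entries = OrderedDict()
        self._max_entries = max_entries

    def get(self, font, size, codepoint, load):
        """Return the cached value, calling load() only on a cache miss."""
        key = (font, size, codepoint)
        # The lock is held while loading, which is simple but serializes misses.
        with self._lock:
            if key in self._entries:
                self._entries.move_to_end(key)      # mark as recently used
                return self._entries[key]
            value = load(font, size, codepoint)
            self._entries[key] = value
            if len(self._entries) > self._max_entries:
                self._entries.popitem(last=False)   # prune the least recently used
            return value

    def __len__(self):
        with self._lock:
            return len(self._entries)


calls = []


def fake_load(font, size, codepoint):
    """Stand-in for a font library lookup; the metrics are made up."""
    calls.append((font, size, codepoint))
    return {"advance": 8.0}


cache = GlyphCache(max_entries=2)
cache.get("arial", 10, ord("a"), fake_load)
cache.get("arial", 10, ord("a"), fake_load)   # served from the cache
cache.get("arial", 10, ord("b"), fake_load)
cache.get("arial", 10, ord("c"), fake_load)   # prunes the entry for "a"
print(len(calls), len(cache))                 # 3 2
```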

This step will be added to mapserver, and consists of coordinating the outputs from freetype,
the bidi algorithm, harfbuzz, line-breaking, text alignment, and in the future text and line
spacing. This step could for the most part be implemented transparently through pango. However,
pango is hard to support across platforms and, to my knowledge, has a hard dependency on
fontconfig, which is incompatible with our fontset approach.

Text is represented by a “textPathObj”, which is basically a list of
positioned glyphs (e.g. the word “Label” at size 10 for an arial font
is represented by the arial font’s glyph “L” at position (x,y) = (0,0),
glyph “a” at position (10,0), “b” at (18,0), etc.). Multiline text
is handled transparently by having glyphs positioned at different y
values. A textPath can be either “absolute” (i.e. the glyph positions
are in absolute image coordinates, used to position glyphs for angle
follow labels), or “relative”, in which case they must be offset by
their labeling point.
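
As an illustration only, the structure described above could be modelled as below. This is a sketch, not the actual textPathObj definition: the class and function names, the advances beyond those implied by the “Label” example (10 for “L”, 8 for “a”), and the line height are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Glyph:
    font: str
    char: str   # stands in for the glyph index inside the font file
    x: float
    y: float


@dataclass
class TextPath:
    glyphs: list = field(default_factory=list)
    absolute: bool = False   # False: positions are relative to the labeling point


# Made-up advances for an "arial" font at size 10.
ADVANCES = {"L": 10.0, "a": 8.0, "b": 8.0, "e": 8.0, "l": 4.0}
LINE_HEIGHT = 12.0  # hypothetical vertical step between lines (y grows downwards)


def layout(text, font="arial"):
    """Position each character using its advance; a newline starts a new line."""
    path = TextPath()
    x = y = 0.0
    for ch in text:
        if ch == "\n":
            x, y = 0.0, y + LINE_HEIGHT
            continue
        path.glyphs.append(Glyph(font, ch, x, y))
        x += ADVANCES[ch]
    return path


def to_absolute(path, label_x, label_y):
    """Offset a relative path by its labeling point."""
    if path.absolute:
        return path
    moved = [Glyph(g.font, g.char, g.x + label_x, g.y + label_y) for g in path.glyphs]
    return TextPath(moved, absolute=True)


path = layout("Label")
print([(g.char, g.x, g.y) for g in path.glyphs][:3])
# [('L', 0.0, 0.0), ('a', 10.0, 0.0), ('b', 18.0, 0.0)]
```

A renderer receiving such a path only has to draw each glyph at its (x, y) position, which is what keeps the per-renderer code small.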

All the shaping happens in textlayout.c,
whose principal role is to take a string of text as input, and return
a list of positioned glyphs as output. The input string goes through
multiple steps, and is split into multiple runs. Each run will have a
distinct line number, bidi direction, and script “language”.

As an example, we’ll be working with the input unicode string “this is
some text in english, ARABIC and JAPANESE”. Capital letters are used
to denote non-Latin glyphs. Note also that ARABIC is stored in logical
(=reading) order, whereas it would be rendered as CIBARA.

iconv encoding conversion to convert the string to unicode

run1="this is some text in english, ARABIC and JAPANESE",line=0

line wrapping: break on wrap character, break long lines on spaces

run1="this is some text in english,",line=0
run2="ARABIC and JAPANESE",line=1

bidi levels, each run has a single bidi direction (i.e. left-to-right or right-to-left)

run1="this is some text in english,",line=0,direction=LTR
run2="ARABIC",line=1,direction=RTL
run3=" and JAPANESE",line=1,direction=LTR

script detection is applied to enable language dependent shaping,
and also to refine which fonts will be used (more on that later)

run1="this is some text in english,",line=0,direction=LTR,script=latin
run2="ARABIC",line=1,direction=RTL,script=arabic
run3=" and ",line=1,direction=LTR,script=latin
run4="JAPANESE",line=1,direction=LTR,script=hiragana

for each run, we select which font should be used, so that the same font
is used throughout a given run. MS RFC 80: Font Fallback Support made it possible to specify
multiple fonts for a LABEL; this has been extended so that the fonts
to be preferred for a given script can be fine-tuned:

Each run is then fed into harfbuzz, which returns a list of
positioned glyphs. The number of returned glyphs is not necessarily
identical to the number of characters in the unicode string, and the
glyphs are ordered from left to right. As a speedup, a run whose script is
“latin” is not fed through harfbuzz, but instead uses cached glyph
advances.

The glyphs of each run are reassembled to account for line numbers
and run positions (e.g. run 3 is offset down by one line, and placed
to the right of run 2)

Each line is horizontally offset to account for ALIGN. LABEL ALIGN
no longer defaults to LEFT, so right-to-left runs will be right
aligned instead of left aligned as they are now.
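
The run-splitting part of the steps above can be sketched with a toy model. The code below is only an illustration: it takes already wrapped lines as input, merges the bidi and script-detection steps into a single lookup, and classifies whole capitalized words through a hypothetical table instead of using real Unicode properties. The Run class and split_runs function are assumptions, not MapServer names.

```python
import re
from dataclasses import dataclass


@dataclass
class Run:
    text: str
    line: int
    direction: str
    script: str


# Toy lookup table: real text would be classified per Unicode character.
TOY_SCRIPTS = {"ARABIC": ("RTL", "arabic"), "JAPANESE": ("LTR", "hiragana")}


def split_runs(lines):
    """Split wrapped lines into runs sharing one line, direction and script."""
    runs = []
    for line_no, line in enumerate(lines):
        # A token is either a block of capitals (non-Latin) or anything else (Latin).
        for token in re.findall(r"[A-Z]+|[^A-Z]+", line):
            direction, script = TOY_SCRIPTS.get(token, ("LTR", "latin"))
            runs.append(Run(token, line_no, direction, script))
    return runs


lines = ["this is some text in english,", "ARABIC and JAPANESE"]
for run in split_runs(lines):
    print(run)
```

The four runs printed correspond to run1 through run4 of the script-detection step.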

The labelcache and the renderers will need to be updated to work with a list of glyphs.
Changes here are extensive but should remain conceptually simple. Individual renderers are
substantially simplified.

Work has been done to trim down the labelcache computations as much as possible:

When inserting features into the labelcache:

We’ll insert a reference to the original labelObj instead of a copy
if the labelObj and its child styleObjs don’t contain any bindings.
This cuts down on memory usage when attribute bindings aren’t in use.

We don’t insert features that will never get rendered (e.g. out of
scale, or too large for their feature (minfeaturesize keyword))

At the msDrawLabelCache phase:

We delay computation of the label text bounding box until after we have
checked conditions that would cause it not to be rendered, i.e.

if they have a MINDISTANCE set and a neighbouring label with
identical text has already been rendered

for labels without markers, we first check that the labelpoint
doesn’t collide with an existing label.

Collision detection has been optimized:

We keep a list of rendered labels and loop through those instead of
checking the status of all the labels in the labelcache for each
member

The bounding metrics for a label have been cut down from a full
shapeObj to a struct containing a bounding rect and an optional
lineObj. For non-rotated labels, no information beyond the bounding
box is needed, which makes intersection detection much easier
for two such labels (i.e. the overlapping of these two labels is
the same as the overlapping of their bounding boxes, no need to go
into further geometric intersection primitives).
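
For non-rotated labels, the two optimizations above reduce to a rectangle overlap test run against the list of labels already rendered. The sketch below illustrates that idea under simplifying assumptions: the Rect class and function names are hypothetical, rotated labels (which would need the optional line geometry) are ignored, and rectangles that merely touch are counted as colliding.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    minx: float
    miny: float
    maxx: float
    maxy: float


def rects_overlap(a, b):
    """Axis-aligned overlap test; touching edges count as a collision."""
    return (a.minx <= b.maxx and b.minx <= a.maxx and
            a.miny <= b.maxy and b.miny <= a.maxy)


def place_labels(candidates):
    """Keep each candidate that does not collide with an already rendered label."""
    rendered = []
    for rect in candidates:
        # Only the labels rendered so far are checked, not every cache member.
        if any(rects_overlap(rect, other) for other in rendered):
            continue
        rendered.append(rect)
    return rendered


candidates = [Rect(0, 0, 10, 5), Rect(8, 2, 18, 7), Rect(20, 0, 30, 5)]
print(len(place_labels(candidates)))   # 2
```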

Existing performance on Latin-only text will need to be kept at
the current level, by detecting and shortcutting these cases. If done correctly, we may
actually be able to speed up Latin text rendering marginally.

For other scripts, going through harfbuzz will come with a performance
penalty.