Abstract

Web Open Font Format (WOFF) 2.0 is a proposed update to the existing WOFF 1.0 with improved
compression. This report lists the requirements for successful deployment,
evaluates how each requirement may be met, and examines the compression
gains and the tradeoffs against code complexity, encoding time, and decoding time.
This document is non-normative.

Status of this document

This section describes the status of this document at the time of
its publication. Other documents may supersede this document. A list of
current W3C publications and the latest revision of this technical
report can be found in the W3C
technical reports index at http://www.w3.org/TR/.

This is a First Public Working Draft of the WOFF 2.0 Evaluation Report.
This document was developed by the WebFonts
Working Group.

Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Requirements

A successor to WOFF 1.0 must be deployable in
parallel with WOFF 1.0, without relying on style sheet switching,
content negotiation, user-agent version sniffing, or other similarly
fragile methods.

An improved compression method used in WOFF must:

Be freely implementable in a Royalty-free manner
consistent with the W3C Patent Policy

Have a freely available specification which is
sufficient by itself to implement the compression method (it must not
depend on opaque 'reference code')

Produce significantly better compression than WOFF 1.0
in the best case

Produce the same (or, ideally, better) compression
than WOFF 1.0 in the worst case

Produce notably better median compression than WOFF
1.0

Not take significantly longer than WOFF 1.0 to
decompress

Not take significantly more RAM than WOFF 1.0 to
decompress

It should be noted that pathological fonts can be found which compress
badly. Equally, some poorly constructed fonts compress trivially well,
due to needless duplication. The main goal of this work is to compress
already well-optimized fonts as effectively as possible.

Deployment

In WOFF 1.0 [WOFF 1.0], fonts are deployed
using the @font-face feature of CSS3 [CSS3
Fonts]. The format part of the src descriptor identifies the font
as WOFF 1.0. Multiple formats are allowed, and form a prioritized list;
the font with the first supported format in the list will be loaded.
This mechanism thus allows fonts in WOFF 2.0 to be deployed in parallel
with those in WOFF 1.0, using the same style sheet and without any
user-agent sniffing or server configuration.
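
For illustration, a rule of the following form (a sketch: the family
name and URLs are hypothetical, and a ‘woff2’ format identifier is
assumed alongside the existing ‘woff’ identifier of [CSS3 Fonts])
serves WOFF 2.0 to user agents that support it, while all others fall
back to WOFF 1.0:

    @font-face {
      font-family: "Example Sans";
      src: url("examplesans.woff2") format("woff2"),
           url("examplesans.woff") format("woff");
    }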

Preprocessing

WOFF 2.0 adds a preprocessing step before entropy coding. The reason
for this is that many fonts contain redundant or duplicate information,
or information that can be deduced from other items of data also
present. Data may also be padded to a convenient multiple of 8 bits,
because aligned data is faster to access. Such fonts are thus optimized
for speed of access at the expense of size. Preprocessing detects and
eliminates redundant data, encodes derived data more efficiently, and
stores data in the minimal number of bits. Following decompression, a
reconstruction step is needed to re-align data and to compute derived
values.
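
As an illustration of storing data in a minimal number of bits (a
generic base-128 variable-length integer sketch in Python; this is not
the actual encoding used by the WOFF 2.0 preprocessor), a value that
would occupy a padded 32-bit field often fits in one or two bytes:

    def encode_varint(value):
        # Emit 7 bits per byte, low-order group first; the high bit
        # of each byte signals that more bytes follow.
        out = bytearray()
        while True:
            byte = value & 0x7F
            value >>= 7
            if value:
                out.append(byte | 0x80)
            else:
                out.append(byte)
                return bytes(out)

    # 300 fits in two bytes rather than a padded 4-byte field.
    assert encode_varint(300) == b"\xac\x02"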

This step is based on the MicroType Express-style [MTX]
submission, but uses a subset of the methods in that specification,
chosen for maximal gain balanced against minimal processing cost.

The preprocessing steps used for initial explorations are summarized in
Appendix D.

The placement of knots in a glyph path follows a set of rules, which
can in some cases allow a point to be predicted based on nearby points
and the overall position on the curve (for example, at local curve
minima or maxima). A novel preprocessing stage was considered which
removes predictable points before entropy compression, and restores them
in the reconstruction step. Experiments showed that a modest reduction
in filesize could be obtained, but that the prediction capabilities of
the entropy coder were already accounting for these points, and thus the
size of the compressed result was not significantly reduced. These
experiments are summarized in Appendix
C.

Lossless compression

The compression in WOFF 1.0 is (subject to certain preconditions, such
as prior removal of extraneous space between tables) bitwise
lossless; the exact same bitstream is produced on
decompression as was input to the compression process.

The use of a preprocessing step means that the overall
compression in WOFF 2.0 will not be bitwise lossless. The preprocessing
converts some data to the most space-efficient form, and removes
redundant information or information which could be regenerated on
decompression. This is not a disadvantage; the resulting font functions
the same as the original one and in some
cases will be more consistent between rendering implementations.
The secondary, entropy coding step is bitwise lossless.

Thus, WOFF 2.0 is said to be functionally lossless.

Continuation streams

In WOFF 1.0, each OFF table is separately compressed. This was
originally done to allow individual tables to be decompressed as needed,
although that facility is rarely used in practice. For WOFF 1.0 with
Flate compression, resetting the compression stream like this also turns
out to give a slightly smaller filesize compared to compressing all the
tables in a single stream. This is because Flate compression uses a small
buffer and cannot cache previous states.

For WOFF 2.0, the impact of per-table compression versus entire-font
compression with continuation streams was examined. For the better
entropy coding schemes, there was an overall benefit to entire-font
streams. As an example, WOFF 2.0 with Brotli compression and continuation
streams gave an overall size reduction of 29.21% compared to WOFF 1.0, while
resetting the compressor on a per-table basis gave a smaller reduction
of 28.26%.
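
The effect is straightforward to measure. A minimal sketch (Python's
standard zlib module stands in for the entropy coder, and the table
payloads are hypothetical placeholders) compares the two arrangements:

    import zlib

    # Hypothetical stand-ins for a font's table payloads.
    tables = [b"glyf data " * 100, b"loca data " * 100, b"cmap data " * 100]

    # Per-table compression: the compressor is reset for each table.
    per_table = sum(len(zlib.compress(t, 9)) for t in tables)

    # Entire-font compression: one continued stream over all tables.
    whole_font = len(zlib.compress(b"".join(tables), 9))

    print(per_table, whole_font)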

Candidate A: MicroType Express-style plus LZMA

Specification

LZMA is defined primarily by reference code, portions of which are (in
the words of the originator) “hard to explain”. A specification would
need to reverse engineer this code. Work was started on this, but it was
abandoned when other difficulties indicated that the LZMA approach was
no longer worth pursuing.

Intellectual Property

The WebFonts Working Group was unable, despite reaching out to LZMA
developers, to secure a commitment to the development of a
re-implementable, Royalty-Free specification.

Compression compared to WOFF 1.0

Compression compared to WOFF 1.0 was impressive. (Part of this was due
to the MicroType Express-style preprocessing step.) Several complete
font collections were examined; each font in each collection was compressed
and the compression gains tabulated. The results are presented in detail
in Appendix A.

Decompression time and memory requirements

Decompression performance (
for example, a rate of 36.5 MB/s on the Google Fonts corpus)
was considerably worse than for WOFF 1.0 (194.3 MB/s on the same
hardware with the same corpus). This was primarily due to the
LZMA step; the MicroType Express-style step is an optimization and removal
of redundancy, so the impact of reconstruction on decompression time
was minimal (in fact, omitting the preprocessing step decreased
decompression throughput slightly, to 36.0 MB/s, as the
entropy coder was working on a larger data stream).

The memory required for LZMA decompression was around
twice that needed for gzip decompression.

Analysis

Compression gains over WOFF 1.0 are largely consistent (within a few
percent) across all of the font collections studied. This implies that
they would be applicable to other collections too. In general, median
compression improvements of 20-26% were observed for TrueType fonts, and
12-13% for CFF fonts. This satisfies the requirement to produce notably
better median compression than WOFF 1.0.

Improvements of up to 94.51% were observed in some cases (in the Monotype
collection, for fonts with exceptionally large kern tables). The
smallest improvements seen were of the order of 1% (again, in the
Monotype collection); most other collections had smallest improvements
of 4-6%. It is notable that in no case did a font compress to the same
size as WOFF 1.0, and in no case did a font become larger.

Decompression time precluded use of this candidate if multiple fonts
were decompressed in parallel, or if fonts were decompressed on
resource-constrained mobile devices. The requirement for decompression
time was therefore not satisfied.

After extensive discussion with the original LZMA developer, and
examination of the sample code which is the primary documentation, the
Working Group concluded that the Royalty-Free and
reimplementable algorithm requirements could not
be met.

Accordingly, the WebFonts Working Group has discontinued efforts
on this candidate.

Candidate B: MicroType Express-style plus Brotli

Specification

The specification [Brotli] is being developed
by the Google Brotli team in Zürich. Being a derivative of the widely
deployed Flate [Flate] algorithm, it proceeds
from a well-understood basis. The improvements are well documented,
and open-source sample code is freely available.

Intellectual Property

The proposers have stated their intention to make the specification
freely available and to pursue its standardization through an
appropriate body such as the IETF.

Compression compared to WOFF 1.0

Compression compared to WOFF 1.0 was good. It was initially less
impressive than candidate A, but improved significantly over the course
of the study as the algorithm was tuned. (Part of this was due to the
MicroType Express-style preprocessing step.) To date, only a single font
collection has been examined. Each font in the collection was
compressed and the compression gains tabulated. The initial results are
presented in detail in Appendix B. The Working
Group plans to continue with analysis of other font corpora.

Decompression time and memory requirements

Decompression performance (for example, a rate of
61.1 MB/s on the Google Fonts corpus) was considerably worse than for
WOFF 1.0 (194.3 MB/s on the same hardware with the same corpus) but
nearly twice as fast as WOFF 2.0 with LZMA (36.5 MB/s). Omitting the
preprocessing step made decompression throughput worse (56.4 MB/s).

The memory required for Brotli decompression was around ???
that needed for gzip decompression.

Analysis

Initial results are promising but relate to only a single corpus. The Working Group
will continue to examine other data sets.

Conclusions

Initial results indicate that the final compression gains achieved by
Brotli (median 23.94% on the Google Fonts corpus) significantly
outperform the initial goal of achieving half the gains of WOFF 2.0
Candidate A, LZMA (median 24.96% on the same corpus). This has been
achieved while significantly increasing decompression speed and lowering
decompression memory requirements.

Appendix C: Curve point prediction

Due to the regular placement of curve points on the outline of a glyph, it is possible to
reliably predict the coordinates of some points. A study of 37,000 fonts (the Monotype font corpus) was performed
to test the feasibility of this. While the results varied substantially from font to font,
the average proportion of points that could be predicted (with their coordinates
eliminated as redundant information) was 2.0%, corresponding to an average of 3.11 bytes saved
per eliminated point.

A second study on a small corpus (the fonts supplied with Windows 7) identified and
eliminated on-curve points that were exactly
midway between their preceding and following off-curve points. No flag was added to indicate
removal. The size of the compressed, altered font was compared with the compressed size
of the unaltered version. The results varied substantially: in a few cases the font was
significantly smaller (around 2% in the best case). In around one-third of the fonts tested,
the reduction was insignificant (a small fraction of a percent). However, in two-thirds
of the fonts tested, the compressed size was larger than that of the unaltered font
(by around 0.1%).
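
A minimal sketch of the midpoint test used in this study (the
tuple-based point representation is hypothetical; TrueType rendering
implies an on-curve point midway between two consecutive off-curve
points, which is what makes such points removable):

    def is_implied_midpoint(prev_off, on_curve, next_off):
        # The on-curve point is redundant if it lies exactly midway
        # between its neighbouring off-curve points. Doubling the
        # midpoint avoids fractional arithmetic on integer font units.
        return (2 * on_curve[0] == prev_off[0] + next_off[0] and
                2 * on_curve[1] == prev_off[1] + next_off[1])

    # The off-curve points (0, 0) and (10, 4) imply the on-curve point (5, 2).
    assert is_implied_midpoint((0, 0), (5, 2), (10, 4))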

These results indicate that, for predictable coordinates, the entropy coder does a better
job of compressing the redundant data than a content-aware heuristic for point removal.
The Working Group thus concluded that this approach was not worth pursuing further.

Appendix D: MTX-like Preprocessing

This appendix summarizes the preprocessing steps used for initial
testing, and is non-normative. The Working Group plans to refine this
over time. The preprocessing steps eventually chosen will be fully
documented in the WOFF 2.0 specification.

Table-specific transforms

In some cases, the TrueType tables contain significant redundancy. For
these, WOFF 2.0 defines transforms that strip out the redundancy, and
then allow the table to be reconstituted. Currently, effort is focused
on the ‘glyf’ table, but applying transforms to other tables is also
under consideration.

When the glyf table is transformed, both the ‘glyf’ and ‘loca’ tables
should be listed in the directory with the applyTransform flag set. The
transformSize of the loca table should be zero; the data in the
transformed ‘glyf’ table will be used to reconstruct both the ‘glyf’
and ‘loca’ tables.

The ‘glyf’ table

When the applyTransform flag is set on the ‘glyf’ table, the contents
of the glyf table are transformed using an algorithm designed to
optimize the compressibility of the resulting stream. This transform is
based on the one in [MTX], but has been updated to achieve even better
performance.

The transformed glyf table consists of seven substreams. There is a
header consisting of a version, the number of glyphs, and the sizes of
each of the substreams, followed by the data of the substreams.

The individual glyphs are interleaved across all these streams. Thus,
each stream contains
some number of bytes for glyph 0, followed by some number of bytes for
glyph 1, etc. The
reconstruction process is defined in terms of reading from the various
streams.
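
A parsing sketch follows. The field widths shown (big-endian 32-bit
values for the version, the glyph count, and each substream size) are
assumptions made for illustration; the exact header layout will be
fixed by the WOFF 2.0 specification.

    import struct

    NUM_SUBSTREAMS = 7

    def split_transformed_glyf(data):
        # Header: version, number of glyphs, then one size per substream.
        version, num_glyphs = struct.unpack_from(">II", data, 0)
        sizes = struct.unpack_from(">%dI" % NUM_SUBSTREAMS, data, 8)
        # The substream payloads follow the header back to back.
        offset = 8 + 4 * NUM_SUBSTREAMS
        streams = []
        for size in sizes:
            streams.append(data[offset:offset + size])
            offset += size
        return version, num_glyphs, streams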

The ‘loca’ table

The origLength for the ‘loca’ table should be large enough to hold the
reconstructed loca table. If the indexFormat is short, this means
2 * (numGlyphs + 1) bytes; if long, 4 * (numGlyphs + 1) bytes.
The numGlyphs value in the ‘maxp’ table must match the one in the
transformed ‘glyf’ table. (Rationale for duplicating the value: the
transformed table should contain sufficient information for
reconstructing the final table; otherwise, it is much harder to do
incremental processing.)
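
Once the per-glyph data lengths are known from the transformed ‘glyf’
table, rebuilding ‘loca’ is a running sum, as this sketch shows
(glyph_lengths is a hypothetical list of reconstructed glyph sizes in
bytes):

    import struct

    def rebuild_loca(glyph_lengths, short_format):
        # loca holds numGlyphs + 1 running offsets into glyf.
        offsets = [0]
        for length in glyph_lengths:
            offsets.append(offsets[-1] + length)
        if short_format:
            # Short format stores offset / 2 as a USHORT:
            # 2 * (numGlyphs + 1) bytes in total.
            return b"".join(struct.pack(">H", o // 2) for o in offsets)
        # Long format stores the offset as a ULONG: 4 * (numGlyphs + 1) bytes.
        return b"".join(struct.pack(">I", o) for o in offsets)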

Bounding boxes

Each glyph can optionally have an explicitly specified bounding box.
Since a bounding box takes 8 bytes, it should be omitted where possible.
However, reconstructing a bounding box in the case of arbitrary glyph
transformations is non-trivial. Further, it is possible for the source
font to have a bounding box inconsistent with the actual glyph data. In
this case, it is the role of the compression algorithm to represent the
font as accurately as possible, inconsistencies included, so that the
use of the font does not vary at all depending on whether compression
was applied.

The bounding box data stream is defined as follows. First, there is a
bitmask, consisting of 4
* ((nGlyphs + 31) / 32) bytes. Bits are packed big-end-first. Thus,
glyph 0 is represented as a
value of 0x80 in the first byte, glyph 1 as 0x40, and glyph 8 as a value
of 0x80 in the second
byte. A bit of 1 indicates that the bbox is present, and a bit of 0
indicates its absence.
For each glyph with a 1 bit set, the bounding box is represented as 4
SHORT values, xMin,
yMin, xMax, yMax, and the interpretation of these values is the same as
in the header of an
individual glyph.
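
A sketch of the lookup implied by this layout (reading the four SHORT
values per present bounding box is omitted):

    def bitmask_size(n_glyphs):
        # 4 * ((nGlyphs + 31) / 32) bytes: one bit per glyph,
        # padded up to a multiple of 4 bytes.
        return 4 * ((n_glyphs + 31) // 32)

    def bbox_present(bitmask, glyph_index):
        # Bits are packed big-end-first: glyph 0 is the 0x80 bit of
        # byte 0, glyph 8 is the 0x80 bit of byte 1.
        return bool(bitmask[glyph_index // 8] & (0x80 >> (glyph_index % 8)))

    assert bbox_present(b"\x40", 1)  # glyph 1 maps to the 0x40 bit of byte 0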

Note: In the present implementation, bounding boxes must be specified
for all composite glyphs.
Under consideration is reconstructing bounding boxes for composite
glyphs with offset-only
transforms. For this reason, the bbox reconstruction is described (and
implemented in the reference code) as a separate pass rather than as
part of the stream processing for glyphs.