Eliminate Private Use Encoding in Revised Fonts?

I’m cross-posting this with the OpenType mailing list to try to get a wider cross-section of views.

As has been mentioned here and elsewhere, in new fonts Adobe is moving away from using Unicode Private Use Area (PUA) encodings for glyphs that are alternates or variants of another glyph that is encoded as the default form for a character. About the only thing we’d use PUA for in new fonts would be ornaments or dingbats that really don’t have their own codepoints.
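For readers who want to see which glyphs this affects in a given font, here is a minimal sketch of how PUA-encoded entries could be picked out of a character map. The `cmap` dictionary, its glyph names, and the specific PUA slots are hypothetical illustrations, not taken from any shipping font; the only real API used is Python's `unicodedata.category`, which reports `Co` for Private Use code points.

```python
import unicodedata


def is_private_use(codepoint: int) -> bool:
    """Return True if the code point has Unicode general category 'Co' (Private Use)."""
    return unicodedata.category(chr(codepoint)) == "Co"


# Hypothetical cmap: code point -> glyph name. Names and PUA slots are made up.
cmap = {
    0x0067: "g",
    0x0031: "one",
    0xE000: "one.oldstyle",  # hypothetical PUA slot for an alternate of "one"
    0xE001: "g.swash",       # hypothetical PUA slot for an alternate of "g"
    0x2766: "floralheart",   # an ornament that has its own code point
}

# Collect only the entries that sit in a Private Use Area.
pua_entries = {cp: name for cp, name in cmap.items() if is_private_use(cp)}

for cp, name in sorted(pua_entries.items()):
    print(f"U+{cp:04X} -> {name}")
```

In this toy data, the two alternates are the entries a PUA cleanup would touch, while the ornament at its regular code point is unaffected.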

We’re working on a general tune-up of our whole type library, and one of the questions which arose is, should we make such a change in revising already shipping fonts?

All of us who have discussed it at length (David Lemon, Christopher Slye and I) are quite ambivalent about this. As of our last poll I think David and I were leaning mildly against, and Christopher was mildly in favor.

Why are we angst-ridden on the topic?

Well, the reason to do it is to set an example, and to "put our money where our mouth is." We’ve been saying for several years at least that if we had it to do over again, we would not encode anything in PUA except for dingbats and the like. If we change it even in already shipping fonts, we’re sending a strong signal to other font developers, and not doing it kind of undermines the message that it’s a bad idea.

But if we make this change, we’re trashing backwards compatibility for any people who have made documents in, say, Word for Windows, and used the Windows Character Map to get at some special alternate they couldn’t access directly (oldstyle figures? some neat ornament? a swash cap? I dunno). We really don’t think there are a lot of these people, but for them to get notdefs instead of their desired glyphs if they open an old doc with a newer version of the font… well, that’s pretty ugly.

Although fonts have version numbers, in my experience software apps don’t look at them and they are not stored in documents that use the fonts. This makes revving a font in any substantial way impractical. Changing character encodings is a major change and will invalidate existing documents. I think it would be better to come up with new names for the fonts that are clearly derived from the old names. This way users can substitute the new for the old, being assured that it doesn’t change the look of their document, while letting them check for altered characters.

Paul

Will,

They don’t get encoded at all. Instead they are accessed by the base character plus appropriate OpenType layout features. For fine typography (as opposed to default behaviors for complex writing systems) these additional layout features are essentially user-applied, as formatting. This model of character/glyph processing is really fundamental to OpenType. The reason that we could look at removing the PUA encodings is that as far as we can tell, next to nobody is using the PUA encodings to access these glyphs anyway – they’re doing it the “right way” already.

Regards,
T
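The "base character plus layout features" model described above can be sketched as a toy shaper. This is only an illustration under stated assumptions: the `CMAP` and `FEATURES` tables and all glyph names are hypothetical, and a real OpenType layout engine is far more involved. The feature tags `onum` (oldstyle figures) and `swsh` (swash) are registered OpenType tags; everything else is made up for the example.

```python
# Hypothetical font data: character -> default glyph name.
CMAP = {"1": "one", "2": "two", "Q": "Q"}

# Hypothetical single-substitution tables, keyed by OpenType feature tag.
FEATURES = {
    "onum": {"one": "one.oldstyle", "two": "two.oldstyle"},  # oldstyle figures
    "swsh": {"Q": "Q.swash"},                                # swash capital
}


def shape(text, active_features=()):
    """Map characters to default glyphs, then apply each active feature in order."""
    glyphs = [CMAP.get(ch, ".notdef") for ch in text]
    for tag in active_features:
        table = FEATURES.get(tag, {})
        glyphs = [table.get(glyph, glyph) for glyph in glyphs]
    return glyphs


print(shape("Q12"))                    # default glyphs only
print(shape("Q12", ["onum", "swsh"]))  # same text, alternates chosen as formatting
```

The stored text is "Q12" in both calls; only the formatting differs, which is why the alternates need no code points of their own in this model.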

Paul,

On the one hand, you are absolutely right that today few if any applications pay attention to font version numbers.

On the other hand, it has been longstanding practice of OS vendors to make major changes to their fonts without changing the font names in the least. For better or worse, font vendors have mostly followed suit.

(Now, in most cases changes made by OS and font vendors are backwards compatible. But this is a small consolation: in the real world, files may go in either direction, not just from older versions of the font to newer versions. We do have some plans to ameliorate this problem somewhat, but it’s not easy.)

I certainly respect the point of view that we should change the font names. Probably we would sooner leave the PUA encodings in place than change the font names, however.

This leads to an interesting question: how big does a change need to be before it is worth changing font names? Fixing a bug in a single glyph outline? Adding a single glyph to cover a needed character?

The problem with changing font names is that it creates a whole bunch of work for users, who then have reason to keep every version of the font around, even if they are otherwise unaffected by the change.

How small a percentage of users should be affected before we *don’t* change font names?

I’m not saying I have a pat answer to these questions. In every case, going down either route has some significant disadvantages. It’s a classic problem of deciding which approach stinks less.

Regards,
T

Personally, I would be upset if an existing font was changed, and no backward compatibility was implemented. Imagine running a job that ran fine last year, but fails this year because the printer updated fonts.

Make changes to new releases, but not to the current versions.

What would the impact be of leaving them in the font? Sure it might be considered better practice not to add them into new fonts, but now they are already there, and the font works as it is, perhaps it’s more of a risk to take them out.

Next time you make a significant enough change to the font that it would require a name change, then they could be taken out.

A font, out in the wild, shouldn’t be updated unless it’s a critical fix. Minor updates just cause too much confusion and lead to too many potential problems. It would be easier to update fonts if they were like other applications (check for updates weekly), but they aren’t and that’s just the business we’re in.

So when should an existing font be updated? My take is:

1. For a major technical problem (glyphs display improperly)
2. For inclusion of new/special characters (adding a Euro)
3. On a set schedule. So every five or ten years, the fonts are given a checkup and re-released.

I don’t think that Private Use Encoding is a big enough problem to fix for existing/distributed fonts. They will still function just fine with or without them. It may bother you, but that’s because you know what’s going on behind the scenes.

Besides, there is all sorts of technical stuff in fonts that never quite panned out (correct me if I’m wrong, but does any application call on PANOSE settings?).

Mark, you wrote:

> It would be easier to update fonts if they were like other applications (check for updates weekly), but they aren’t and that’s just the business we’re in.

That’s actually a key problem we’re trying to deal with, instead of just accepting it. I’ll be writing about this more in the future, I’m sure.

Regards,
T

Although a very low proportion of Word users make use of Adobe’s PUA codes, there are a lot of Word (and other WP program) users. At the moment, we can use a VBA macro to insert any glyph that has its own code point, including non-exotic things such as OS figures and SCs; more to the point, we can replace the standard glyphs associated with Unicode. There’s no other way for us, unless Word etc. can recognize all the glyphs available in some OpenType fonts, which is not likely to happen soon (if ever). The PUA is far from full, so please continue to supply glyphs for alternative figures, small capitals, and extra ligatures in the PUA as well as in their ‘proper’ places.

This wouldn’t affect glyph names, unless those were of the uniXXXX form. This should be rare at best.

We do understand the issue for Word users. The public beta of Word 2007 for Windows has shown us another version of Word that still doesn’t support OpenType typographic layout features, except where required for language support.

Although it seems there are not that many folks using this approach, we do get semi-regular inquiries to tech support about it. Overall, I think we are likely to leave PUA values intact in existing fonts, but not put them into new fonts.

Regards,
T

Why not have the version number in the name of the font? This way you would know exactly which font was being used. It would also help in Prepress to make sure you load the correct version and not get any typeflow problems.

Brian

Brian,

Hmmm. I think that would make pre-press and output folks happy, and would irk everybody on the design side. It would also make it much more necessary to keep every version of the font on-hand.

The best solution would be to store version info in the document and have the application verify that you have the same version that was used to author the document. Maybe one of these days….

T
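The "store version info in the document and verify it" idea above could look roughly like the following sketch. Everything here is hypothetical: no document format or application is known to work this way, and the font names and version strings are invented for illustration.

```python
def find_version_mismatches(document_fonts, installed_fonts):
    """Compare font versions recorded in a document with the installed versions.

    Both arguments map font name -> version string. Returns a list of
    (name, authored_version, installed_version) tuples for every font whose
    installed version differs; installed_version is None if the font is missing.
    """
    mismatches = []
    for name, authored in document_fonts.items():
        installed = installed_fonts.get(name)  # None when the font is not installed
        if installed != authored:
            mismatches.append((name, authored, installed))
    return mismatches


# Hypothetical data: versions recorded when the document was authored...
document_fonts = {"Example Serif": "1.006", "Example Sans": "2.010"}
# ...and the versions present on the machine opening it.
installed_fonts = {"Example Serif": "2.000", "Example Sans": "2.010"}

for name, authored, installed in find_version_mismatches(document_fonts, installed_fonts):
    print(f"{name}: authored with {authored}, installed is {installed}")
```

A check like this would only warn the user; it does not by itself solve the problem of needing the older font version on hand.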

> As has been mentioned here and elsewhere, in new fonts Adobe is moving away from using Unicode Private Use Area (PUA) encodings for glyphs that are alternates or variants of another glyph that is encoded as the default form for a character.

Would it be possible with OpenType fonts, to use some Private Use Area characters such as U+EF01 ALTERNATE GLYPH OF THE FIRST KIND and U+EF02 ALTERNATE GLYPH OF THE SECOND KIND and so on, and have items in the glyph substitution table such that, say, the sequence U+0067 U+EF02 would display the second alternate glyph for a lowercase g? U+EF01, U+EF02 and so on, maybe up to, say, U+EF07, could be included in the font as zero-width glyphs so that if any alternate glyph requested was not available in the particular font then the display would be of the basic glyph of the printing character and a zero-width glyph, so the display would not be disrupted. This method could also be used to display alternate forms of glyphs for ligatures, such as, say, a swash version of a ct ligature.

As far as my limited knowledge of OpenType goes it appears to me that this system could be implemented in individual fonts at a font designer level as OpenType technology would permit the glyph substitution: is that correct please?

William Overington
19 August 2006
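The selector scheme proposed above can be simulated in a few lines to make its fallback behavior concrete. This is a sketch of the proposal as described, not of any real font or layout engine: the `ALTERNATES` and `CMAP` tables and the glyph names `g.alt2` and `zerowidth` are hypothetical, and in an actual font the pairing of base glyph and selector would presumably be expressed as a substitution of a two-glyph sequence in the GSUB table.

```python
# Proposed selectors U+EF01..U+EF07, mapped to "kind" numbers 1..7.
SELECTORS = {0xEF01 + i: i + 1 for i in range(7)}

# Hypothetical font data.
CMAP = {0x0067: "g", 0x0061: "a"}   # code point -> default glyph name
ALTERNATES = {("g", 2): "g.alt2"}   # (base glyph, kind) -> alternate glyph name


def shape(codepoints):
    """Turn a code point sequence into glyph names, honoring selector pairs."""
    glyphs = []
    for cp in codepoints:
        if cp in SELECTORS:
            kind = SELECTORS[cp]
            if glyphs and (glyphs[-1], kind) in ALTERNATES:
                # The base glyph plus selector becomes one alternate glyph.
                glyphs[-1] = ALTERNATES[(glyphs[-1], kind)]
            else:
                # No such alternate: the selector shows as a zero-width glyph.
                glyphs.append("zerowidth")
        else:
            glyphs.append(CMAP.get(cp, ".notdef"))
    return glyphs


print(shape([0x0067, 0xEF02]))  # g followed by "second kind" selector
print(shape([0x0061, 0xEF02]))  # a has no second alternate in this toy font
```

Note that the fallback only works because this toy font defines the selectors; a font without them would map U+EF02 to `.notdef` instead.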

The mechanism you suggest sounds a lot like Unicode variation selectors. The only advantage of it is that there is a representation in the plain text that there is something more than the normal character. Of course, this same “advantage” will break spell checkers and many other things that operate on the underlying text. Also, if the user switches to a font that does not have the special characters in question, they will likely become visible as undefined characters (“notdefs”).

This approach also lacks the advantage of a regular PUA assignment: the ability to have the desired visual display in an application that is Unicode savvy but does not do OpenType layout.

Perhaps there is some advantage I’m missing to this approach. But failing that, I’m not convinced it’s an improvement.

Regards,
T

> The mechanism you suggest sounds a lot like Unicode variation selectors.

I have not used variation selectors. They are used in variation sequences which are defined in regular Unicode and should not be used otherwise. My suggested selectors, perhaps we could call them Alternate Glyph Selectors, would be usable by any font designer who so chose to provide an indication of the use of an alternate glyph in any particular font.

> The only advantage of it is that there is a representation in the plain text that there is something more than the normal character.

Well, that seems an advantage worth having.

> Of course, this same “advantage” will break spell checkers and many other things that operate on the underlying text.

This is true. I am thinking that if some text, say a poem, has been spelling-checked and then some alternate glyphs added, then reformatting with a different font would use alternate glyphs in the same places where the different font had such alternate glyphs available. However, that is for the situation of using Private Use Area Alternate Glyph Selectors. If tests were otherwise successful using PUA alternate glyph selectors then maybe an application to include some alternate glyph selectors in Unicode could be made, maybe with the suggestion that they be located in plane 10 of the Unicode code space with a rule that they are ignored by spelling checkers.

> Also, if the user switches to a font that does not have the special characters in question, they will likely become visible as undefined characters (“notdefs”).

This is true. If there were a number of OpenType fonts which had zero width glyphs at U+EF01 to U+EF07 and appropriate entries in the Glyph Substitution Table, then the system could perhaps become regarded as useful, though it could go wrong with other fonts. However, it might perhaps be useful with setting poems and then trying a different font.

> This approach also lacks the advantage of a regular PUA assignment: the ability to have the desired visual display in an application that is Unicode savvy but does not do OpenType layout.

Well, this approach is orthogonal to that: an alternate glyph for a g could also be mapped into the Private Use Area for direct access, indeed that would be a useful additional facility to have available.

I cannot make an OpenType font at present. However, I have added U+EF01 to U+EF07 (and, in fact, U+EF00) as zero width characters into one of my fonts, Sonnet to a Renaissance Lady. This is available for free download from our family webspace. There is also a support font where U+EF01 to U+EF07 (and, in fact, U+EF00) have visible glyphs so as to help in entering the alternate glyph selector characters into applications.

The Sonnet to a Renaissance Lady font could perhaps be found useful for experiments, it has, for example, five alternate glyphs for g encoded in the Unicode Private Use Area, from U+E421 to U+E426.

William Overington
21 August 2006

I have been rereading through this thread.

In 2002 I introduced the golden ligatures collection of Private Use Area code points for ligatures.

The documents from that time are introduced and indexed at the following web page.

http://www.users.globalnet.co.uk/~ngo/golden.htm

Since that time I have learned more about various aspects of Unicode and have also learned some of the skills of fontmaking and produced a number of fonts.

http://www.users.globalnet.co.uk/~ngo/fonts.htm

These fonts are all TrueType fonts.

A feature of some of these fonts is that they include glyphs for ligatures encoded within the Unicode Private Use Area. There is often consistency of the Private Use Area mapping used from one font to another. For example, a glyph for a ct ligature is encoded at U+E707 in various of my fonts, such as Quest text, Chronicle Text, the 10000 font, Pixel Polka and Sonnet to a Renaissance Lady.

[Righto. That’s the same general approach we were using, making our PUA assignments consistent across fonts, at least for the glyphs that we had in a fair number of fonts. – TP]

I am aware that an OpenType font can include a glyph for a ct ligature and have it automatically used using glyph substitution, yet that depends upon glyph substitution being acted upon by the application program which is using the font.

So, maybe far from removing existing Private Use Area encodings from fonts, perhaps Adobe could please consider including more Private Use Area mappings in fonts, including those from the golden ligatures collection. As I understand the situation it would be straightforward for Adobe to add a mapping to U+E707 to a glyph for a ct ligature in an Adobe font. It would cost little to do and would make the glyph available to more applications. I can access a glyph for a ct ligature from those of my fonts which include one by using Alt 59143 in WordPad and using the Insert | Symbol facility in Word 97 and using the Insert | Special Character facility in Open Office 2 Writer.

I recognize that use of Private Use Area codes has its limitations as regards interchange of information, yet interchange is possible within the limits either of an agreement or of making use of a known list of Private Use Area codepoint assignments. However, the key advantages of using the Private Use Area code mappings are that a local screen display can be made, hardcopy prints can be produced and the ligatures can be used in text within graphic art design applications, none of which involve interchange of code points.

It may be the case that as technology proceeds that the golden ligatures collection will be of no use to anyone as everything could be done at that time using OpenType fonts in conjunction with applications which apply glyph substitution. This would include use for adding text in graphic art programs and desktop publishing packages as well as in wordprocessors.

However, at present, using a Private Use Area codepoint as an additional way to access a glyph for a ligature in an OpenType font seems to be a feature which is worth Adobe considering adding into its OpenType fonts.

[I appreciate that you think this is a good solution to the problem. But we’ve been making fonts with PUA assignments for over six years, and have decided that very few people are using the PUA as an access route for these glyphs in our fonts, and that the PUA usage causes more problems than it’s worth. Of course, it’s up to each font developer to decide how to handle this issue, but we know which way we’re going. Additionally, something like your “golden ligatures” collection is of limited applicability. For example, it probably wouldn’t cover even a tenth of the unencoded alternates and ligatures in the typeface I’m working on right now. – TP]

Another potential use of the golden ligatures collection is as a convenient way to include the artwork for glyphs for ligatures in a font for font producers who do not yet produce OpenType fonts, not only for use in application packages but also as a convenient way to store artwork for a potential future OpenType version of the font: maybe an automated way to convert from a TrueType font which includes some glyphs for ligatures encoded using the golden ligatures collection codepoint assignments to an OpenType font with a Glyph Substitution Table could be produced.

So, started in 2002 and still of use in early 2007, the golden ligatures collection could yet have uses as more fonts have glyphs for ligatures included within them yet not every application with which those fonts might be used can access those glyphs for ligatures without using a Private Use Area codepoint.

William Overington
5 January 2007
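The "Alt 59143" keystroke mentioned in the comment above is simply the decimal form of the hexadecimal code point U+E707. A small sketch of the conversion follows; the helper names are hypothetical, and the range check covers only the Basic Multilingual Plane PUA (U+E000 to U+F8FF).

```python
def alt_code_for(codepoint: int) -> str:
    """Return the decimal digits to type for a code point (e.g. after Alt)."""
    return str(codepoint)


def codepoint_label(decimal_digits: str) -> str:
    """Return the U+XXXX label for a decimal Alt code."""
    return f"U+{int(decimal_digits):04X}"


def in_bmp_pua(codepoint: int) -> bool:
    """True if the code point lies in the BMP Private Use Area."""
    return 0xE000 <= codepoint <= 0xF8FF


ct_ligature = 0xE707  # the ct ligature slot named in the comment above

print(alt_code_for(ct_ligature))   # decimal digits for the Alt keystroke
print(codepoint_label("59143"))    # back to the U+XXXX form
print(in_bmp_pua(ct_ligature))
```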