(In reply to comment #0)
> Loading fonts can be very expensive on some systems.
To be more specific, on a real user's system without any "superfluous" fonts, we load 423 font files on startup. This results in atleast 32mb of data being read from those files(not counting filesystem overhead).
This is a problem because for many users firefox is one of the first font-hungry programs to get started so it pays the price of loading fonts into OS cache.

We've talked about doing this for a while, but the actual hit was never large enough to justify the effort. Did that change? I'm all for doing this, fwiw; we did some work to do the cmap loading on a thread, but if we ever have to actually do char fallback, then we block until we read all of them.

(In reply to comment #3)
> We've talked about doing this for a while, but the actual hit was never large
> enough to justify the effort. Did that change? I'm all for doing this, fwiw;
> we did some work to do the cmap loading on a thread, but if we ever have to
> actually do char fallback, then we block until we read all of them.
This bug isn't asking to load them async though - it's asking for the information to be cached. I'd love to see both happen, fwiw though.

(In reply to comment #3)
> We've talked about doing this for a while, but the actual hit was never large
> enough to justify the effort. Did that change? I'm all for doing this, fwiw;
> we did some work to do the cmap loading on a thread, but if we ever have to
> actually do char fallback, then we block until we read all of them.
Lots of font io is one of the symptoms experienced by users with slow startup eg 593614. Not clear whether it's the cause of it, but it's certainly #2 contributor to IO behind places, except that font io seems to happen on the ui thread.

the full set of character maps for each font are needed to correctly do CSS fallback; no good way around it, unfortunately.
Like I said, font io is async unless the data is actually needed -- at which point the main thread blocks (during layout) for the remaining fonts to be read in.

(In reply to comment #9)
> Do we have any idea how much time is spent loading those 423 fonts?
> Percentages/rough numbers OK.
Almost 9 seconds. The suspect thing is that this is continuous io. So 99% of the fonts are opened with no other io interleaved between them, so it smells like rest of firefox is blocking on it.
As to timeframes, this happens right after sessionrestore is written out 3.7 seconds into startup.
So if xperf is to be believed, font loading takes 2x as much time as the rest of startup.
xperf times seem a bit optimistic compared to what the user that provided the trace reports, but they sound bad enough.

(In reply to comment #1)
> > Loading fonts can be very expensive on some systems.
>
> To be more specific, on a real user's system without any "superfluous" fonts,
> we load 423 font files on startup. This results in atleast 32mb of data being
> read from those files(not counting filesystem overhead).
What's the methodology here? Is it something like "measure performance for the first 30 seconds when starting up"? If so, that will definitely show us going through the font files but that *doesn't* equate with "we load xxx font files on startup".

(In reply to comment #14)
> (In reply to comment #1)
> > > Loading fonts can be very expensive on some systems.
> >
> > To be more specific, on a real user's system without any "superfluous" fonts,
> > we load 423 font files on startup. This results in atleast 32mb of data being
> > read from those files(not counting filesystem overhead).
>
> What's the methodology here? Is it something like "measure performance for the
> first 30 seconds when starting up"? If so, that will definitely show us going
> through the font files but that *doesn't* equate with "we load xxx font files
> on startup".
I believe this was the "wait long enough for the browser to become responsive" definition of startup.

(In reply to comment #15)
> (In reply to comment #14)
> > What's the methodology here? Is it something like "measure performance for the
> > first 30 seconds when starting up"? If so, that will definitely show us going
> > through the font files but that *doesn't* equate with "we load xxx font files
> > on startup".
>
> I believe this was the "wait long enough for the browser to become responsive"
> definition of startup.
The browser -is- responsive here. We're doing some stuff on a thread, doesn't mean the browser is not responsive...

So the trick here is to run soon after windows starts, which is a reasonable thing to do for someone who wants to surf the web. Even a fast machine gets into 30s startup territory then. Slower computers(ie netbooks) run into this post-startup.
I thoroughly printf()ed my build. On my i7 laptop with a 7200 rpm drive io seems to happen in RunLoader. It seems to take up to 10seconds using your style of instrumentation. A user with a japanese version of windows suffers a lot more.
The basic problem is that the default timeout is way too small. Due to slow IO, wonderful anti-viral slowdowns, extensions, windows indexing services, etc 8seconds can be really early in the startup. Looks like in some cases this fontloading loop takes off before the window shows and io-starves everything else.
btw 2/3 variables are dead: http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxPlatformFontList.cpp#75
I tried chrome, it doesn't load any extra fonts for a few min that i was profiling it after startup. What does chrome do differently?
bandaid: For ff4 I would recommend setting the initial delay to 60s. Instead of scanning N font files in every timer run we should limit the number of files scanned based on how much time elapsed. Ie if < 150ms elapsed continue scanning.
longerterm bandaid: I think we should start an io-queue in firefox. We have a bunch of large independent IO operations going on: ff sync, font loading, vacuuming, urlclassifier. Instead of having these step on each other's toes they should wait in line when possible. Moving operations off the main thread is not a solution because hard drives do not like to executing things in parallel. This should also make it easier to ensure that the crazy io-train starts after the ui, even on slow mobile platforms.

> So the trick here is to run soon after windows starts, which is a
> reasonable thing to do for someone who wants to surf the web. Even a
> fast machine gets into 30s startup territory then.
Huh? So you're claiming our startup takes a total of 30seconds? Ouch.
Is this loading huge numbers of tabs and/or extensions?
> I thoroughly printf()ed my build. On my i7 laptop with a 7200 rpm
> drive io seems to happen in RunLoader. It seems to take up to
> 10seconds using your style of instrumentation. A user with a japanese
> version of windows suffers a lot more.
10secs total in RunLoader, or 10secs from the first time it's call until
the cmap loading completes? And what's different about Japanese
Windows? All my Windows machines are Japanese OS machines, I'm not sure
what behavior is different. You're thinking they have lots and lots of
big CJK fonts on them or something like that?
> The basic problem is that the default timeout is way too small. Due to
> slow IO, wonderful anti-viral slowdowns, extensions, windows indexing
> services, etc 8seconds can be really early in the startup. Looks like
> in some cases this fontloading loop takes off before the window shows
> and io-starves everything else.
So we load font cmaps in intervals, we don't load them all at once. So
I don't quite follow why this "io-starves everything".
I'd really like to try and reproduce the behavior you're seeing. Could
you set up a simple testcase/set of steps along with a quick description
of the tools you're using? Are you seeing system IO queues with large
wait queues throughout the time that font cmaps are loading? If you
bump the delay timer, do all the delays go away?
> bandaid: For ff4 I would recommend setting the initial delay to 60s.
If we need to do system font fallback (i.e. find a font for character X)
we need to access the cmaps, at which points you'll need to load them
all at once. Doing this early in spurts hopefully ameliorates the delay
if a page needs to do system font fallback. So increasing the delay
increases the chance where we'll need to pull in all cmaps at once,
which is just the sort of blocking IO loop you're looking to avoid.
> Instead of scanning N font files in every timer run we should limit
> the number of files scanned based on how much time elapsed. Ie if <
> 150ms elapsed continue scanning.
That seems reasonable.

(In reply to comment #20)
> http://twitter.com/robarnold/status/26220099907
>
> How can this bug not block if startup work whose wins it masks blocks?
If Rob is claiming font loading blocks important startup improvement
work, that really needs to be explained more clearly. With a specific
testcase and environment in which font loading prevents us from seeing a
speedup in area XYZ, not just a tweet.
A simple way to test this is to simply bump up the
kDelayBeforeLoadingCmaps delay in the locations listed in comment 12
from 8s to 60s and use a startup page for which font fallback doesn't
occur (e.g. www.google.com not www.facebook.com).
I also wonder if there are configurations that suffer the sorts of
DWrite factory initialization times all the time not just after a
reboot (see timings in comment 13).

I realize not everyone sees the bug, but it seems clear that a combination of one or more bugs, plus asymptotic complexity in the design (even assuming no bugs), are at work.
Every time I do something unbounded in cost, compounding O(n^2) or greater, it bites back on the web -- hard. I keep telling folks (at least on the JS team) to learn from this and avoid such designs. It's why hashing and direct lookup tables (arrays) are so common (but even then you can have problems).
The reason I came a little uncorked in comment 23 is because the bugs and design issue seem *obvious* to me, yet this bug is not a blocker -- but users (not just Rob) are telling us startup is loading all fonts on their Windows systems. That is worse than any startup perf hit in other code I see being worked on by Taras et al., so it looks like a blocker.
/be

(In reply to comment #21)
> If we need to do system font fallback (i.e. find a font for character X)
> we need to access the cmaps, at which points you'll need to load them
> all at once.
Reading the CSS spec, font fallback is used to choose a similar font when the specified font is unavailable, selecting the first available user agent default to match ‘serif’, ‘sans-serif’, ‘cursive’, ‘fantasy’, and ‘monospace’. If none of the specified fallbacks contain the required glyph, 'some form of "missing glyph" character is rendered by the user agent'.
The default fonts for the current locale/encoding (which can be set in Content -> Fonts & Colors -> Advanced...) could be loaded when necessary, and I assume a missing glyph character is built in. So why do we have to load all possible system fonts when font fallback is required?

(In reply to comment #21)
> > So the trick here is to run soon after windows starts, which is a
> > reasonable thing to do for someone who wants to surf the web. Even a
> > fast machine gets into 30s startup territory then.
>
> Huh? So you're claiming our startup takes a total of 30seconds? Ouch.
> Is this loading huge numbers of tabs and/or extensions?
Yes to 30s startup :( Nope, user initially lead me onto this opens his browser to facebook + gmail. 30s claim was surprising to me to, but if you consider that we load 30mb of stuff (before fonts, 70-300mb with) if a system is under any io strain our startup gets impacted dramatically.
>
> 10secs total in RunLoader, or 10secs from the first time it's call until
> the cmap loading completes? And what's different about Japanese
> Windows? All my Windows machines are Japanese OS machines, I'm not sure
> what behavior is different. You're thinking they have lots and lots of
> big CJK fonts on them or something like that?
10secs IN RunLoader, I summed up time spent in RunLoader. Asian fonts are much larger, causing even more IO.
I think the problem is that most of us don't test cold startup due to us restarting firefox all the time and having all relevant files in the OS cache.
>
> > The basic problem is that the default timeout is way too small. Due to
> > slow IO, wonderful anti-viral slowdowns, extensions, windows indexing
> > services, etc 8seconds can be really early in the startup. Looks like
> > in some cases this fontloading loop takes off before the window shows
> > and io-starves everything else.
>
> So we load font cmaps in intervals, we don't load them all at once. So
> I don't quite follow why this "io-starves everything".
The current loading method is not much better "loading all at once". It effectively is loading all at once since fonts are loaded in batches of 10 with a small interval between them.
>
> I'd really like to try and reproduce the behavior you're seeing. Could
> you set up a simple testcase/set of steps along with a quick description
> of the tools you're using? Are you seeing system IO queues with large
> wait queues throughout the time that font cmaps are loading? If you
> bump the delay timer, do all the delays go away?
As I mentioned before, running firefox first thing after startup will hurt in RunLoader. On my clean windows install + i7 cpu i can't reproduce a stalled browser, but I can see how 10+ seconds of nonstop IO will ruin interactivity in less fortunate configurations.
>
> > bandaid: For ff4 I would recommend setting the initial delay to 60s.
>
> If we need to do system font fallback (i.e. find a font for character X)
> we need to access the cmaps, at which points you'll need to load them
> all at once. Doing this early in spurts hopefully ameliorates the delay
> if a page needs to do system font fallback. So increasing the delay
> increases the chance where we'll need to pull in all cmaps at once,
> which is just the sort of blocking IO loop you're looking to avoid.
Right. I'm saying that the current workaround is actually causing the same behavior for users as what you are trying to avoid. Chrome seems to avoid eager font enumeration, we should too.
>
> > Instead of scanning N font files in every timer run we should limit
> > the number of files scanned based on how much time elapsed. Ie if <
> > 150ms elapsed continue scanning.
>
> That seems reasonable.
Lets start landing workarounds for this.

So this is indeed worse with directwrite.
setting gfx.font_rendering.directwrite.enabled to true causes font loading to happen in initfontlist as in your example. In meantime the browser is completely locked up while loading 260mb of fonts. Not sure why it loads more fonts in directwrite mode in my system.

Based on discussions in irc, below are two scenarios where font loading is reported to be blocking effective startup:
Case 1: slow netbook
Example: eee box b204, 1.6GHz Atom N270, 1GB RAM, 160GB HD, Win7
1. Set up a profile with gmail/facebook opening on startup
2. Startup with FF4, DirectWrite enabled
Result: blank white page is all that appears for 30seconds
Case 2: cold startup on older laptop
Example: Lenovo T60, 2.1Ghz Core2 Duo, 3GB RAM, 5400rpm HD, Win7
1. Set up a profile with twenty tabs, with facebook/twitter as the first two tabs
2. Reboot and startup FF4, DirectWrite enabled
Result: time before user can flip between first two tabs is roughly 30seconds
Questions:
1. Are these startup times the same running 3.6?
2. Do we see similar times running XP or with DirectWrite disabled?

(In reply to comment #29)
> Questions:
>
> 1. Are these startup times the same running 3.6?
Doubtful, but we aren't competing with 3.6 -- we're competing with Chrome, IE9, Opera, and Safari, latest betas or finals.
It's not clear CSS font fallback requires loading fonts so aggressively. Cc'ing dbaron for some guidance, but again: if the competition doesn't do it, there is either a way per-spec, or the spec needs fixing and we should work with the other vendors to fix it. In either case, we must compete.
We should not compartmentalize our code and design so much that we end up doing something like (at the limit) loading all the fonts that the OS knows about on (some) startup, just because CSS might want a glyph in a hard case. This either is a CSS spec bug, a CSS spec misreading, or at best a gray area -- but in any event, it is too pessimistic of a design on its face.
The worst case happens on the web, count on it. Murphy was an optimist.
/be

(In reply to comment #30)
> > 1. Are these startup times the same running 3.6?
>
> Doubtful, but we aren't competing with 3.6 -- we're competing with
> Chrome, IE9, Opera, and Safari, latest betas or finals.
Right, understood. But if FF4 is significantly worse than 3.6 it's
also possible that something else related to new code (e.g. our new
DirectWrite implementation) is also affecting IO performance. Several
comments on bug 600713 seem to imply that it is (comment 8, comment
12) and one person has reported that the DirectWrite backend seems to
be reading in fonts in their entirety (comment 17) as opposed to just
particular tables.
Whatever the problem is we need to find it and fix it and not assume
what the answer is before tracking down the source(s). If this means
we have to be less aggressive about font loading and come up with
other ways of doing system font fallback to be better than
Chrome/Webkit, fine, but let's do it with information in hand and not
go off half-cocked.

(In reply to comment #31)
> (In reply to comment #30)
> > > 1. Are these startup times the same running 3.6?
> >
> > Doubtful, but we aren't competing with 3.6 -- we're competing with
> > Chrome, IE9, Opera, and Safari, latest betas or finals.
>
> Right, understood. But if FF4 is significantly worse than 3.6 it's
> also possible that something else related to new code (e.g. our new
> DirectWrite implementation) is also affecting IO performance. Several
> comments on bug 600713 seem to imply that it is (comment 8, comment
> 12) and one person has reported that the DirectWrite backend seems to
> be reading in fonts in their entirety (comment 17) as opposed to just
> particular tables.
>
> Whatever the problem is we need to find it and fix it and not assume
> what the answer is before tracking down the source(s).
John, I have not assumed what the "answer" is. I am contending that this bug is a blocker, and that in any event CSS font fallback shouldn't require the hazard of enumerating all fonts to render a certain code point.
Both points contradict comments here: comment 17 and others re: blocker or not; comment 3 re: fallback requiring enumerating all fonts. Can someone comment on why can't we use OS font-matching code on Windows, in particular with DWrite?
As for half-cocked, the way this bug was going before recent comments, it was "uncocked" and looking likely to be unfixed. That is troubling given the user complaints and worst-case behavior.
/be

Brendan, your last couple of comments haven't been terribly helpful. John's trying to figure out what the issues are, not trying to make the problem "go away."
I agree, now that I know the scope of this bug, that we should not release without at least some mitigation of it.

Believe me, (a) I'm trying to be helpful; (b) I'm representing a lot of "silent majority" types watching this bug, who deserve better than they were getting till yesterday. Let's call it even and get back to technical content.
To repeat something from my last comment, a technical question that could be helpful, which no one has answered: Can someone comment on why can't we use OS font-matching code on Windows, in particular with DWrite?
/be

I'd be interested in killing two birds with one stone: only load a hard-coded set of standard fonts, don't load all system-installed fonts. That will help solve the startup issue *and* the font-fingerprinting issue all at once. Now that sites have the ability to use downloadable fonts, I don't see why we shouldn't do this.

FYI, I ran that past the font folks a while back (in the context of fingerprinting), and they about leapt out of their skins. It *is* a big change, and will definitely impact foreign locales disproportionately. (We floated the idea of having a per-locale list, but maybe that does the web a disservice by not offering a predictable set of fonts.)
I'd be for having a broader discussion on it, FWIW...

Hard-coding seems a little extreme - why not use the fonts set in 'Options... -> Content -> Fonts & Colors -> Advanced...'? That dialog appears to be missing options to match the CSS ‘cursive’ and ‘fantasy’ keywords, but other than that it seems perfect for what the font fallback spec describes (with the exception of falling back to specific font families, but those should only be loaded when needed).

(In reply to comment #36)
> FYI, I ran that past the font folks a while back (in the context of
> fingerprinting), and they about leapt out of their skins. It *is* a big change,
> and will definitely impact foreign locales disproportionately. (We floated the
> idea of having a per-locale list, but maybe that does the web a disservice by
> not offering a predictable set of fonts.)
Where is the list available in the Content section of the prefs dialog coming from? If we can get that list from the OS font fallback mechanisms, I don't see why this change should affect non-en-US locales (which is what I assume you meant by foreign) disproportionately.
> I'd be for having a broader discussion on it, FWIW...
Should changing our font matching behavior something which we should consider this late in the cycle? If I'm reading this correctly, this has the potential to break lots of big sites (Persian Wikipedia is one example I'm familiar with personally). But if we're looking at Firefox.Next, then by all means, a broader discussion is what we should be having here!

(In reply to comment #39)
> > I'd be for having a broader discussion on it, FWIW...
>
> Should changing our font matching behavior something which we should consider
> this late in the cycle? If I'm reading this correctly, this has the potential
> to break lots of big sites (Persian Wikipedia is one example I'm familiar with
> personally). But if we're looking at Firefox.Next, then by all means, a
> broader discussion is what we should be having here!
This bug was opened with the intention of caching font data (cmaps and such), not changing the font matching behavior. I would prefer a new bug for changing the font matching behavior. Caching should be sufficient to get our startup perf where it needs to be without introducing too much risk at this point.

(In reply to comment #39)
> Should changing our font matching behavior something which we should consider
> this late in the cycle?
Absolutely not. It should be on the table for next, though -- we'll be doing a lot of other privacy work for that release too.
Didn't mean to divert things here (sorry!), let's save all that part for a later discussion and keep this bug focused on unbreaking 4. ;)

John,
So the bug supernova was seeing is that facebook's "lucida grande" font choice causes gfxPlatformList::InitOtherFamilyNames() to read in a lot of fonts. Takes 5seconds on my i7 on cold start.
Facebook being in the initial set of tabs is pretty likely :(.

Above comment was for directwrite being disabled case. The bug is the same in directwrite-enabled case except for some reason a lot more fonts get enumerated. ie 300mb vs 30mb worth of fonts, this causes a stall of a few minutes.

We should just cache the list of fonts and their cmaps and just load that on startup. Add some code to validate that list shortly after startup, and if there is a difference invalidate the textruns? You'd generally pay the cost once. People are unlikely to remove fonts which would change the behavior a bit. Adding fonts would have pretty small impact on which font got chosen as well, but probably not generally noticable. Flushing the textruns and reflowing would fix it if we found a mismatch of our cache.

> To repeat something from my last comment, a technical question that
> could be helpful, which no one has answered: Can someone comment on
> why can't we use OS font-matching code on Windows, in particular
> with DWrite?
Font fallback is typically a black box for system APIs, including DWrite. You give the API a run of text with a font and style attributes set and hope for the best. Plus they're not all that speedy. With cmap data we can do a much better job of font matching.

I'm still wondering why we need information on all system fonts when the CSS (2&3) spec says to fallback to a default font for several font families. What advantage does checking every single font on the system have? If the problem is related to locales, there are already default fonts for each locale.

Font matching is described in section 15.2 of the CSS 2.1 spec:
http://www.w3.org/TR/CSS2/fonts.html#algorithm
Step 5 describes system font fallback:
"If there is no font within the family selected in 2, then use
a UA-dependent default 'font-family' and repeat step 2, using
the best match that can be obtained within the default font. If
a particular character cannot be displayed using this font,
then the UA may use other means to determine a suitable font
for that character. The UA should map each character for which
it has no suitable font to a visible symbol chosen by the UA,
preferably a "missing character" glyph from one of the font
faces available to the UA."
How system font fallback is done is up to the UA, it's *not* a CSS spec
requirement to display missing glyph characters. There's also no strict
requirement to know about all fonts on the system but even if an author
just assigns "Helvetica, Arial" as the font list, modern browsers,
including Firefox, will display characters if there are fonts that
support them on the system. That's why the main Facebook page
(http://www.facebook.com ) will display just fine even though fonts for
Arabic, Devanagari, Japanese and Chinese are not explicitly specified in
the stylesheet. Same goes for the front page of Wikipedia
(http://www.wikipedia.com ).
The goal of the font matching system is to avoid system font fallback
whenever possible. Pref fonts cover most situations but not all.

Thanks. I understand wanting to display the right thing for all these different locales (when nothing is specified), but still - aren't the Fonts & Colors defaults (for the various locales) enough to do that? Do we really need to enumerate and read in every single font on the system just for a bit of text on Facebook (or say, Wikipedia)?

Can we take discussion of font matching strategies to a list, and other bugs to other bugs? This caching work is important, we all agree; let's not drown it out. We need to do more than cache to avoid a 30-sec first start, but that's for other (likely blocking) bugs.

I watched the firefox.exe process on Windows 7 (64b) using Process Monitor, because it's unresponsive for several seconds ca. 10-20s after startup (e.g. after an EM restart) in the middle of session restore.
While Firefox hangs I see (solely) loads of "ReadFile"s for the file
ProfileDir\startupCache\startupCache.4.little
but that's after the font files were read. The reads go on for 18sec (don't know how much ProcMon slows this down). If anyone is interested in the log, I saved it.
Are there other open bugs about the startupcache in Bugzilla where I should take this?

(In reply to comment #57)
> Is this bug Windows-only or would Linux and Mac be able to take advantage of
> fonts in the startup cache too?
Caching system font information will help on Mac also since the code is shared with Windows. The Linux code is different.

(In reply to comment #58)
> (In reply to comment #57)
> > Is this bug Windows-only or would Linux and Mac be able to take advantage of
> > fonts in the startup cache too?
>
> Caching system font information will help on Mac also since the code is shared
> with Windows. The Linux code is different.
Is Linux different in a good/neutral way or different as in it'll give you a headache to describe its differences? In other words, could Linux be brought into the fold for a boost or is there no possible gain there?
In any case, if this affects Mac: OS -> all (I really wish Bugzilla let you pick multiple instead of going straight to all)

(In reply to comment #61)
> The main problem here is bug 602792, we have to figure out a way around that
> first.
In Bug 602792 comment 22 you said my statement about needing that bug to fix this was incorrect. Can you please clarify?

(In reply to comment #63)
> The comment was in response to Dietrich asking for an update. We figure out a
> fix for the dwrite io problem, then I work on this bug. They are only
> dependent from a scheduling perspective.
So bug 602792 is blocked on waiting from microsoft. A possible solution for it is going back to gdi-based font lookup. So it is logical to work on this bug while we wait for communication from Microsoft.
Replying to comment in 602792 in this bug
(In reply to comment #35)
> (In reply to comment #34)
> > So while we are waiting for this, can you figure out why chrome does not need
> > to enumerate fonts the way we do?
>
> In the context of this bug, Chrome doesn't use dwrite yet so they don't hit any
> of the problems here.
>
> For family name lookups, Chrome does lookups with straight GDI calls so it has
> all the problems of GDI font name lookup and fails tests in the CSS 2.1 test
> suite because of this. For example, the page below will render differently
> between Windows/OSX versions of Chrome even though the same basic fonts exist
> in both places:
>
> http://people.mozilla.org/~jdaggett/tests/arialvariations.html
>
> For font fallback, Chrome relies on using hard coded font fallback settings
> (i.e. "for this codepoint, try these fonts..."). For codepoints that fall
> outside those it just punts; even with a font that supports a given codepoint
> it displays a missing character. This method doesn't really scale as more
> codepoints are added in Unicode (e.g. Unicode 6.0 "oh joy" addition: codepoints
> for emoticons, including U+1F4A9 == Pile of Poo).
>
> So the main reason we do font enumeration is for font fallback but we can
> definitely come up with ways that avoid the need to do expensive operations at
> startup in the majority of cases, including caching fallback info. This will
> be dealt with in bug 600713, not this bug.
So can we cache font family lookup and defer cmap-fallback stuff until it's needed(then cache the cmap fallback info too)?
This bug severely impairs our startup and covers up any startup improvements we have achieved elsewhere, so we have to fix this for Firefox 4 if we are to claim improved startup speed on Windows.

(In reply to comment #64)
> This bug severely impairs our startup and covers up any startup improvements we
> have achieved elsewhere, so we have to fix this for Firefox 4 if we are to
> claim improved startup speed on Windows.
And I'd be happy to provide instructions on how to use xperf to get stacks of where we end up loading all these fonts during startup. It's rather impressive to see a bunch of fonts get read in in the summary tables I've been looking at. When it happens, it usually dominates any other reading we do.

(In reply to comment #59)
> Is Linux different in a good/neutral way or different as in it'll give you a
> headache to describe its differences? In other words, could Linux be brought
> into the fold for a boost or is there no possible gain there?
Linux gets system font information for font-selection from an OS cache rather than from the fonts, so I wouldn't expect such a speed "boost" for linux here.
A solution here could perhaps be used to let us choose different kinds of information (such as PostScript name or measuring character support in a different way) though arguably that would be better added to the OS cache. Perhaps it could save a little by being more specific about the information cached, but it would be at the expense of having to generate/update the cache more often.

** PRODUCT DRIVERS PLEASE NOTE **
This bug is one of 19 being moved from blocking2.0:betaN+ to blocking2.0:- as we reached the endgame of Firefox 4. The rationale for the move is:
- the bug had been identified as a "soft" blocker which could be fixed in some follow up release
- the bug had been identified as one requiring beta coverage, thus is not appropriate for a ".x" stability & security release
The owner of the bug may wish to renominate for .x consideration.

(In reply to comment #43)
> John,
> So the bug supernova was seeing is that facebook's "lucida grande" font choice
> causes gfxPlatformList::InitOtherFamilyNames() to read in a lot of fonts. Takes
> 5seconds on my i7 on cold start.
>
> Facebook being in the initial set of tabs is pretty likely :(.
Since part 1 in bug 602792 landed, this no longer happens. I just tested under GDI on Win7 and confirmed that with Facebook as the startup page, we _don't_ read extra family names (and we _don't_ read the cmaps from the 400+ fonts I have installed but only from 21 of them). So the amount of font i/o on the startup path is pretty small. The remaining 400 cmaps do eventually get loaded, but by the deferred loader task that runs after the browser has started up; this does not block the UI.
I believe the main thing that would still cause a lot of font table i/o during startup, and block the initial page load, would be if that page includes "exotic" characters that are _not_ found via the fonts specified in Prefs for the relevant Unicode ranges. This would trigger loading of all the cmaps - but it should be a relatively rare occurrence.

(In reply to comment #69)
> (In reply to comment #43)
> > John,
> > So the bug supernova was seeing is that facebook's "lucida grande" font choice
> > causes gfxPlatformList::InitOtherFamilyNames() to read in a lot of fonts. Takes
> > 5seconds on my i7 on cold start.
> >
> > Facebook being in the initial set of tabs is pretty likely :(.
>
> Since part 1 in bug 602792 landed, this no longer happens. I just tested under
> GDI on Win7 and confirmed that with Facebook as the startup page, we _don't_
> read extra family names (and we _don't_ read the cmaps from the 400+ fonts I
> have installed but only from 21 of them). So the amount of font i/o on the
> startup path is pretty small. The remaining 400 cmaps do eventually get loaded,
> but by the deferred loader task that runs after the browser has started up;
> this does not block the UI.
>
> I believe the main thing that would still cause a lot of font table i/o during
> startup, and block the initial page load, would be if that page includes
> "exotic" characters that are _not_ found via the fonts specified in Prefs for
> the relevant Unicode ranges. This would trigger loading of all the cmaps - but
> it should be a relatively rare occurrence.
With the latest nightly build:
I set facebook.com as my homepage(logged out). Turned off directwrite.
xpref records io on over 100fonts in gdi case. (only 14mb total since i guess it's just reading in the headers). That's hundreds of fonts just to render facebook.com, not delayed cmap reading.

(In reply to comment #70)
> With the latest nightly build:
> I set facebook.com as my homepage(logged out). Turned off directwrite.
> xpref records io on over 100fonts in gdi case. (only 14mb total since i guess
> it's just reading in the headers). That's hundreds of fonts just to render
> facebook.com, not delayed cmap reading.
Does _that_ take "5 seconds on your i7" (as per comment 43)? Or how long does it take now? Yes, it'll be iterating over all font families (using EnumFontFamilies) during startup, so I can easily believe it touches 100 font files, but I wouldn't expect that to take anywhere near as long as it used to take when it was also calling InitOtherFamilyNames().

(In reply to comment #71)
> (In reply to comment #70)
>
> > With the latest nightly build:
> > I set facebook.com as my homepage(logged out). Turned off directwrite.
> > xpref records io on over 100fonts in gdi case. (only 14mb total since i guess
> > it's just reading in the headers). That's hundreds of fonts just to render
> > facebook.com, not delayed cmap reading.
>
> Does _that_ take "5 seconds on your i7" (as per comment 43)? Or how long does
> it take now? Yes, it'll be iterating over all font families (using
> EnumFontFamilies) during startup, so I can easily believe it touches 100 font
> files, but I wouldn't expect that to take anywhere near as long as it used to
> take when it was also calling InitOtherFamilyNames().
No, the io behavior of font enumeration is better now, but still too high. seems to take somewhere between 0.5-4s depending on io load(usually on the lower side). There a lot less data read from these font files now(5x less?). By the way webkit browsers figure out font families from registry.

Michael, the tricky part of this is not the actual caching, it's figuring out whether the cached data is invalid or not. Obviously 99% of the time it will be valid but we have to have a way to deal with the 1% case when fonts change or are updated. Windows doesn't make it easy to map <face> ==> <file>, which is what's required so that we can stat the file to see confirm changes haven't occurred.
I think we'll see a much bigger win by revamping our fallback to consolidate cmap data per family and use hardcoded fallbacks so that the need to rely on having a 100% accurate cmap cache is eliminated. And whatever we cache would be better to cache as one big blob to avoid doing lots of zipfile i/o with relatively small fragments.

Created attachment 528143[details]
xperf capture of gdi font overhead
JP wanted an update on current gdi hit on startup. On my system we read around 45fonts on startup to start with facebook.com as homepage. Due to noise and non-deterministic nature of seeks it's hard to get an exact cost for IO however we can make a pretty good estimate:
A good ballpark figure for the perf hit is 45*30ms=1.35seconds.
According to xperf csv dump of file io, 35% of startup IO is due to fonts. This particular startup took 7seconds, so 2.45 seconds went to IO. It's safe to say that our font overhead is between 15-35%.

(In reply to Paul Wright from comment #77)
> Is this bug a WONTFIX at this point, or has it just gone dormant for a while?
This still needs to be fixed. We now have telemetry data showing that font enumeration can be pretty horrible

(In reply to Taras Glek (:taras) from comment #78)
> (In reply to Paul Wright from comment #77)
> > Is this bug a WONTFIX at this point, or has it just gone dormant for a while?
>
> This still needs to be fixed. We now have telemetry data showing that font
> enumeration can be pretty horrible
So, this bug was debated as being a FF4 (or not) blocker over a year ago. Armed with the new telemetry data, what is the plan and priority to address going forward? It would now seem to be a data-driven decision to fix it sooner rather than later. Also, as comment 42 asked, is there a spinoff bug existing for the font-matching idea(s)?

If this is something that still affects users, we should make improvements to it. When I looked at the telemetry data a few weeks after mwu checked in the probes it didn't seem like a high-priority problem. I'll look at the data again when the telemetry front end is up again.

(In reply to Taras Glek (:taras) from comment #78)
> This still needs to be fixed. We now have telemetry data showing that font
> enumeration can be pretty horrible
Could you point us to where we can look at this data? Thanks.

(In reply to John Daggett (:jtd) from comment #84)
> (In reply to Taras Glek (:taras) from comment #83)
>
> To start with, we're not reading cmaps in the routines that have
> Telemetry metrics, so these numbers have *nothing* to with the subject
> of this bug, namely caching cmaps.
I'm not sure why you look at things that take over >180ms, >=180 makes more sense as that's 3x more than UX's 50ms response rate target. Is there anything we can do to speed this up, spawn this call of in a thread early in init?
FONTLIST_INITOTHERFAMILYNAMES and FONTLIST_INITFACENAMELISTS do capture overall time to read cmaps.
FONTLIST_INITOTHERFAMILYNAMES(5114 counts) takes >50ms 47% of the time, >500ms 31%, > 1s 26%
FONTLIST_INITFACENAMELISTS(18745 counts) takes > 50ms 36%, > 500ms 22%, > 1s 20%
Now if you cut out obvious warm startups, percentages of slowness are 87%, 50% 40%. ie 40% of cold startups(conservative number, i'm including some semi-warm ones) feature >1second pause to read cmaps.
Note I would not read these percentages to be representative of user's pain. Our telemetry opt in rates are fairly low at the moment(2%) and this was a sample for a single day. The main thing these numbers show is the range of possible values to help us focus on how to make this smoother.

(In reply to Taras Glek (:taras) from comment #85)
> I'm not sure why you look at things that take over >180ms, >=180 makes more
> sense as that's 3x more than UX's 50ms response rate target. Is there
> anything we can do to speed this up, spawn this call of in a thread early in
> init?
I was simply using that as a cutoff to show that really bad times in this area are completely due to time spent calling GetSystemFontCollection. I am in no way implying that 180ms is acceptable.
One thing we might consider is simply blacklisting DirectWrite builds prior to the one where Microsoft fixed the font cache bug in KB 2505438 and force those users to use the GDI path.
> FONTLIST_INITOTHERFAMILYNAMES and FONTLIST_INITFACENAMELISTS do capture
> overall time to read cmaps.
No, these routines are reading the name table not the cmaps. Let's open up a separate bug to address this. Filed bug 699331.
> FONTLIST_INITOTHERFAMILYNAMES(5114 counts) takes >50ms 47% of the time,
> >500ms 31%, > 1s 26%
>
> FONTLIST_INITFACENAMELISTS(18745 counts) takes > 50ms 36%, > 500ms 22%, > 1s
> 20%
These don't necessarily occur at startup, we try to avoid calling these as much as possible.
> Now if you cut out obvious warm startups, percentages of slowness are 87%,
> 50% 40%. ie 40% of cold startups(conservative number, i'm including some
> semi-warm ones) feature >1second pause to read cmaps.
How do you separate warm/cold startup times in the telemetry data? Is there a field (or set of fields) from which you can infer this.
And again, these routines are *not* loading cmaps.

(In reply to Taras Glek (:taras) from comment #87)
> John, thanks for clarification. Can you check if we have telemetry that
> covers reading cmaps?
We don't appear to have telemetry covering reading cmaps, if I'm reading the code right. Is the right thing to do to time gfxFontFamily::ReadCMAP, or is it to time individual *FontEntry::ReadCMAP functions?

(In reply to Nathan Froyd (:froydnj) from comment #89)
> We don't appear to have telemetry covering reading cmaps, if I'm reading the
> code right. Is the right thing to do to time gfxFontFamily::ReadCMAP, or is
> it to time individual *FontEntry::ReadCMAP functions?Bug 705590 added timing to the system font fallback routine which will lazy load cmaps that haven't been loaded.
http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxPlatformFontList.cpp#410
The data for this is stored in SYSTEM_FONT_FALLBACK_FIRST (in ms) and SYSTEM_FONT_FALLBACK (in us). This will be revamped as part of bug 705594.

This has been superceded by bug 705594. Rather than cache all cmaps, the system font fallback routine is going to be revamped to use a combination of hard-coded fallbacks and bitflags to indicate Unicode blocks supported by a specific font family.