A quick Firefox startup update

Recently I’ve been working on a project to improve desktop Firefox’s startup time during “cold starts” where none of the Firefox binaries or data are cached in memory (e.g. the first launch of the browser after a reboot). I’ve been paying special attention to the time required to reach the “first paint” startup milestone: the point in time when the first Firefox window becomes visible.

The analysis has mostly consisted of profiling the latest Firefox Nightlies using XPerf on a reference dual-core Windows 7 laptop with a magnetic HDD. I’ve been working on several bugs arising from the investigation (bug 881575, bug 881578, bug 827976, bug 879957, bug 873640) and I have many more coming. This is an overview of a few challenges I’ve run into over the last month.

Making Startup Times Reproducible

I wanted to evaluate the impact of my experimental code changes by comparing startups, but I quickly discovered that there is a tremendous amount of variation in startup times in my test environment. I then turned off Windows Prefetching & Windows SuperFetch, two performance features responsible for pre-fetching files from disk based on the user’s usage patterns, but I still recorded excessive variation in start times.

I then turned off a plethora of 3rd-party and Windows services that were running in the background and accessing the disk: Windows Update & Indexing Service, OEM “boot optimizer” software, Flash & Chrome automatic updaters, graphics card configuration & monitoring software bundled with drivers, etc. After rebooting the laptop several times and disabling any remaining programs causing disk activity, I was finally able to achieve reproducible startup times. I expected that cold starts would be dominated by disk I/O, but I was surprised by just how heavily I/O operations dominated startup time in a vanilla Firefox install.
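One way to decide whether a batch of cold-start measurements is stable enough to compare builds is to look at the spread relative to the mean. Below is a minimal Python sketch of that check. The function name `summarize_startups` and the idea of feeding it first-paint times in milliseconds are assumptions for illustration, not the tooling used in this investigation.

```python
import statistics

def summarize_startups(first_paint_ms):
    """Return (mean, stdev, coefficient of variation) for a list of timings.

    first_paint_ms is a hypothetical list of time-to-first-paint samples
    in milliseconds, one per cold start.
    """
    if len(first_paint_ms) < 2:
        raise ValueError("need at least two samples to estimate variation")
    mean = statistics.mean(first_paint_ms)
    stdev = statistics.stdev(first_paint_ms)  # sample standard deviation
    return mean, stdev, stdev / mean
```

A low coefficient of variation across repeated reboots suggests the environment is quiet enough that a difference between two builds is not just noise.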

Startup time has improved almost 30% over the last year

In an attempt to reproduce the startup regression reported in bug 818257, I compared time to first paint for Firefox 13.0.1 and Firefox 21.0 using my test setup. To my surprise, I found Firefox 21.0 (current release channel) requires roughly 4.6 seconds to reach first paint during cold starts, while Firefox 13.0.1 (release channel from a year ago) required ~6.4 seconds! This is almost a 30% reduction in startup time.
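The percentage is measured relative to the older release. As a quick check of the arithmetic (the function and variable names are just for illustration, and the inputs are the approximate timings quoted above):

```python
def percent_reduction(old_seconds, new_seconds):
    """Reduction in startup time, as a percentage of the old time."""
    return (old_seconds - new_seconds) / old_seconds * 100

# Approximate cold-start times to first paint quoted above.
firefox_13 = 6.4
firefox_21 = 4.6
print(f"{percent_reduction(firefox_13, firefox_21):.1f}% reduction")
```

This works out to roughly 28%, hence “almost 30%”.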

I was surprised by this result because I expected increases in code size and the overhead from initializing new components added over the course of a year to cause regressions in startup. On the other hand, many people have landed patches to improve startup by postponing component initialization and generally reducing the amount of work done before the first-paint milestone. I haven’t tried to identify the patches responsible yet, but from a quick look at the XPerf profiles for each version, it looks like there were gains from fixing bug 756313 (“Don’t load homepage before first paint”) and from changing the list of Mozilla libraries pre-loaded at startup (see dependentlibs.list).

We are still FSYNC-ing too much at startup

Apparently, the FlushFileBuffers function on Windows causes the OS to flush everything in the write-back cache as it “does not know what part of the cache belongs to your file”. As you can imagine, calling FlushFileBuffers is bad news for Firefox startups even if it’s done off the main thread: other I/O requests will be delayed while the disk is busy writing data. Unfortunately, we are currently calling this function on browser startup to write out the webapps.json file, the sessionstore.js file, and several SafeBrowsing files. The flush function isn’t being called directly; rather, it’s the SafeFileOutputStream and OS.File.writeAtomic() implementations that force flushes for maximum reliability. In general, we should avoid calling methods that fsync/FlushFileBuffers unless such reliability is explicitly required, and I’ve asked Yoric to change OS.File.writeAtomic() behavior to forgo flushing by default.
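To make the trade-off concrete, here is a minimal Python sketch of an atomic write where the flush is opt-in rather than the default. It is not the OS.File.writeAtomic() or SafeFileOutputStream implementation; the function name `write_atomic` and its `flush` parameter are hypothetical. `os.fsync` is the call that forces the data to disk (the Python documentation says it uses the MS `_commit()` function on Windows).

```python
import os
import tempfile

def write_atomic(path, data, flush=False):
    """Write bytes to path via a temporary file followed by a rename.

    With flush=False the expensive fsync is skipped. Other processes still
    see either the old or the new file, but after a crash or power loss
    the most recent contents may be missing or incomplete.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            if flush:
                tmp.flush()             # push Python's buffer to the OS
                os.fsync(tmp.fileno())  # ask the OS to write it to disk
        os.replace(tmp_path, path)      # swap in the new file (atomic on POSIX)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Under this design, only callers that genuinely need durability would pass `flush=True`; everything else takes the cheaper path.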

Next steps

I’m continuing to work on reducing the number of DLL loads triggered at startup and I’ll soon be filing bugs for fixing some of the smaller sources of startup I/O.

Ah! Only the feed of all your posts categorized as “Planet Mozilla” is on Planet Mozilla. And you posted it without a category.
Issue found. (And hopefully soon fixed since nobody should miss the awesome work you’re doing.)

Very interesting post, and thank you for the work put into making FF fast. I can say that I love Firefox, and I am really glad about MemShrink, Ion and the performance improvements they bring all over the place in FF.

I understand that I/O at startup is slow, so the main step right now is to move most of the loading off the main thread. Couldn’t FF do something like “Prefetch” itself by tracking the session? I mean that data which is expensive to read from “omni.ja” but is known to be useful next time could be saved explicitly from the previous “recorded” session, so that most of the data is read from this “cache”. I used “omni.ja” as an example file, but I mean in general all the relevant data that is read first at startup. I ask because an important part of your testing seems to be “disabling Prefetch” (probably because it improves the running time at times), so why not go for a solution like this (at least for the items which are still on the main thread)?

Hi, I’m worried that having a plethora of 3rd-party and Windows services slowing down the machine and slowing startup time is exactly the situation many users are in, and they blame it on Firefox (this sounds exactly like my mum’s computer).

So it seems that coping with such a situation as well as possible unfortunately needs to be a target of the performance effort.

Which is why I’m worried that if your procedure starts by removing all of them, what you’re optimizing, whilst being a lot more reproducible, is also less comparable to the kind of setup that users with big performance problems really have.

My intent is to reduce the amount of I/O Firefox does on startup so that Firefox startup times are less dependent on disk characteristics and other programs’ behavior. I think less Firefox startup I/O is generally better, so I’m not too concerned about evaluating the changes in a more “realistic” environment.

Just a minor nit, but that quote about FlushFileBuffers being stupid was referring to Windows CE, a completely different OS that happens to share some APIs (and that Mozilla doesn’t even compile for anymore). I seriously doubt it would just throw away the handle parameter on anything vaguely modern.

It’s a good point and I did notice the article I linked refers to Windows CE. I’m currently running an experiment to get an idea of the performance impact of calling FlushFileBuffers in a couple of different scenarios.