goals for multi-process firefox

Since the release of Firefox 4 we’ve been working again to bring multi-process content support to Firefox. I thought that it would be good to write a post to try to lay out some of the reasons why we’re doing this. Although it might be obvious to many people, it’s good to actually lay it out on paper so that we have a clear understanding of why we’re doing something. It helps us determine what to prioritize as well as helps us measure when we’re ready to ship.

There are several areas listed below – performance, multi-core support, memory behaviour, etc. For each of these areas there’s still a lot of work to do outside the scope of the multi-process work. What this means is that every release of Firefox will get faster and more stable, and will have better interactive performance across multiple web pages, even if it doesn’t include support for multi-process. But we know that in order to get across some hurdles we’re going to need to invest in a multi-process model. That’s what this post is about. Multi-process is not a panacea for any of these problems, but it does give us a leg up on some of the more systemic ones.

Performance

In the case of Electrolysis we’re not talking about the kind of performance that’s usually referenced in the press or is the subject of benchmarks. What we’re really talking about with multi-process performance is responsiveness:

How long does it take for a mouse click to be recognized?

When you resize the window does it feel smooth?

Does the browser mysteriously pause from time to time?

Are animations smooth, without pauses?

These are all examples of measurements that matter when building a responsive browser. At a basic level we’re talking about making sure that the main UI of the browser isn’t away from the mainloop for more than fifty milliseconds. We’ve made great strides here, and Firefox 5 is a great browser from a responsiveness standpoint. But we know that if we want to separate chrome and content concerns, we’re going to have to go multi-process.

There are two reasons for this:

The cost of garbage collection goes up as the heap size of your process goes up.

In a single-process model, all web pages share the same heap. They also share memory with the chrome that controls them. This means that whenever you have to garbage collect (look for unused objects in JavaScript) or cycle collect (look for cycles between C++ and JavaScript) you need to scan the entire JS heap. As the amount of heap you have allocated increases, the time you spend collecting increases with it. For this reason, it’s better for pause times to have many small processes instead of one big process.

Now, this isn’t entirely true. We can garbage collect individual compartments where memory and objects are collected. We actually did this in Firefox 4, and it really helped our interactive performance, especially when running complex animations. But there are still cycles that exist across C++ and JavaScript, and we still need to cycle collect across the entire process. We do a lot of our GCing on other threads, including cycle collection, but it still stops the main thread. And because each GC affects the main thread, it causes pauses that can be felt in the UI. This means we have lots of little garbage collection events instead of one big one, but they still all block the main UI.
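The scaling effect described above isn’t specific to Gecko – any tracing collector has to walk the object graph. Here’s a rough Python illustration, with Python’s cycle collector standing in for SpiderMonkey’s GC (the object counts are arbitrary):

```python
import gc
import time

def time_full_collection(num_objects):
    """Build a heap of reference cycles, then time one full collection pass."""
    graph = []
    for _ in range(num_objects):
        a, b = [], []
        a.append(b)   # a -> b
        b.append(a)   # b -> a: a cycle the collector must traverse
        graph.append(a)
    start = time.perf_counter()
    gc.collect()      # walks every tracked object, so cost scales with heap size
    return time.perf_counter() - start

small = time_full_collection(10_000)
large = time_full_collection(1_000_000)
print(f"small heap: {small * 1000:.1f} ms, large heap: {large * 1000:.1f} ms")
```

The collection pass over the larger heap takes roughly proportionally longer, which is why many small heaps (one per process) pause for less time than one big shared heap.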

Garbage collection in content causes pauses in the main UI

Sometimes content gets large. Big web applications like gmail, facebook and twitter (yes, twitter is actually a pretty big web app) cause allocation and garbage collection events to happen often. When they do, for the reasons stated above, they still block the chrome. Compartments mitigate much of the pain here, but even if each pause is short, little pauses add up, and the user can feel them. We’d like to make sure that garbage collection for pages doesn’t affect the main UI at all.

You can start to see a preview of tools for measuring responsiveness in one of Ted’s posts. Our investment in tools is happening alongside the multi-process work so we’re able to measure whether we’re making progress in overall browser responsiveness. Those tools still need to be “productized,” but since responsiveness is our primary metric and purpose for multi-process Firefox, we need to measure so we know we’re actually making forward progress.
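The core measurement such tools make is simple: how long does any single turn of the event loop take? A toy version of a responsiveness monitor might look like this in Python (the 50 ms budget comes from the post; everything else is illustrative – a real tool would hook the browser’s native event loop):

```python
import time

def measure_event_loop_lag(tasks, budget_ms=50):
    """Time each 'turn of the event loop' and flag turns over the budget.

    Each callable in `tasks` stands in for one event-loop turn.
    """
    hangs = []
    for task in tasks:
        start = time.perf_counter()
        task()
        elapsed_ms = (time.perf_counter() - start) * 1000
        if elapsed_ms > budget_ms:
            hangs.append(elapsed_ms)
    return hangs

# One fast turn, one janky turn (80 ms of simulated blocking work).
hangs = measure_event_loop_lag([lambda: None, lambda: time.sleep(0.08)])
print(f"{len(hangs)} turn(s) exceeded the 50 ms budget")
```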

Support for multi-core machines

There’s a basic problem with the web: the DOM is single-threaded. JavaScript execution, CSS resolution, and the way objects are added to and removed from the content model all assume that there’s a single view of a document.

This doesn’t mean that we don’t use threads throughout the browser. The networking stack, image decoding, much of our I/O, video and audio decoding and all kinds of other things are threaded and off the main loop of the browser. But the content itself is required to be single threaded.

Computing is quickly moving to a multi-core model. The speeds of processors aren’t increasing as much as they have been in the past, largely due to the constraints imposed by power and heat as well as the move to mobile. Basically everyone at this point has a multi-core processor on their desktop or laptop. And multi-core processors are starting to show up in mobile devices as well.

So one of the easiest ways to take advantage of multiple processors is to have each DOM assigned to its own processor, and the easiest way to do that is to have a few processes that can each be assigned to their own CPU.
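As a sketch of the idea (in Python rather than Gecko’s C++, with made-up per-page work), giving each page its own process lets the OS scheduler spread the pages across cores even though each page’s DOM work stays single-threaded:

```python
import multiprocessing as mp
import os

def layout_and_script(page):
    """Stand-in for one page's DOM work: single-threaded within its process."""
    checksum = sum(i * i for i in range(200_000))  # pretend layout/JS work
    return page, checksum, os.getpid()

if __name__ == "__main__":
    pages = ["mail", "social", "news", "video"]
    # One process per page: the OS can schedule each on its own core,
    # without the pages ever sharing a single DOM thread.
    with mp.Pool(processes=min(len(pages), mp.cpu_count())) as pool:
        for page, _, pid in pool.map(layout_and_script, pages):
            print(f"{page} handled by process {pid}")
```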

Memory behaviour

Although we’ve made vast improvements to memory handling since the release of Firefox 4, we’re still faced with the fundamental problem of memory fragmentation. Because we’re based on C and C++, objects in our graph are often not relocatable. Over the long term, heap allocation will grow and cause memory to “leak.” This isn’t a problem that’s specific to Firefox. Just about every long-running process with even mildly complex allocation patterns suffers from this problem.

You can see this in the difference between what the system memory reporting tools show and what the internal allocator reports as allocated. That “missing” memory is sometimes memory held in reserve, but often enough it is holes brought about by memory fragmentation. We also do some larger allocations in anonymous memory maps, but most small allocations still happen in pools allocated on the heap.

Physical pages of memory are allocated at the operating system layer and handed to user processes as virtual pages. The most reliable way to return those to the operating system is to exit the process. That’s a pretty coarse granularity for recycling memory, but for very long-running browser sessions it’s the only way to get predictable memory behaviour. This is why content processes offer a better model for returning memory to the operating system over time.
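A small Python sketch of the same pattern: let a short-lived worker process do the fragmentation-prone allocation, and reclaim everything at once when it exits (the allocation sizes here are arbitrary):

```python
import multiprocessing as mp

def render_page(result_q):
    """Do allocation-heavy 'page' work in a throwaway process."""
    junk = [bytearray(4096) for _ in range(10_000)]   # ~40 MB of small blocks
    result_q.put(len(junk))
    # Returning ends the process; the OS reclaims every physical page at
    # once, no matter how fragmented this process's heap became.

if __name__ == "__main__":
    results = mp.Queue()
    worker = mp.Process(target=render_page, args=(results,))
    worker.start()
    print("worker allocated", results.get(), "blocks")
    worker.join()   # process exit is the coarse-grained 'free' described above
```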

Crash protection

We introduced protection from crashes in plugins with the release of Firefox 3.6.4. We implemented it because of the reliability problems that plugins – in particular Flash – were suffering from. Crashes in Flash were causing overall browser stability problems, and reflecting poorly on Firefox’s perceived reliability.

Although the number of crashes caused by content is relatively small – on the order of 1-2 crashes per 100 users per day – crashes that can be contained to the content processes are easier to identify, easier to diagnose and don’t take down the entire browser.

There’s also another nice benefit to having content processes. When there’s a crash, it’s much easier to tell what site caused it. In a single-process model, you can guess based on all of the sites that a person has open, but it could be any of them, and you have to look at a large sample of data and correlate sites to crash signatures to see patterns. With a single tab (or a small group of tabs) per process, the number of candidate sites is reduced, so the crash can be identified much more easily.
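The containment idea can be sketched in a few lines of Python (`os.abort()` stands in for a native crash, and the site names are made up):

```python
import multiprocessing as mp
import os

def load_site(site):
    """Content-process stand-in: one of these sites takes down its process."""
    if site == "crashy.example":
        os.abort()   # simulate a native crash inside the content process

if __name__ == "__main__":
    for site in ["mail.example", "crashy.example", "news.example"]:
        proc = mp.Process(target=load_site, args=(site,))
        proc.start()
        proc.join()
        if proc.exitcode != 0:
            # The chrome process survives and knows exactly which site died.
            print(f"tab for {site} crashed (exit code {proc.exitcode})")
```

The parent keeps running through the child’s crash, and the crashing site is identified directly rather than by correlating crash signatures across users.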

Sandboxing for security

The last goal we have for adding support for multiple content processes to Firefox is security. Some operating systems now have the ability to put a process into a “low rights mode” where it can’t access a lot of system resources. This means that even if there is a security problem in a content process, the amount of damage that process can do is limited to what the sandbox allows.

This system is imperfect, of course. Having the ability to talk to the “more privileged” chrome process can still result in exploits that have raised permissions. And it doesn’t protect one web site from another malicious web site. But it is a positive step forward, and is well-worth the investment.
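The “talk to the more privileged chrome process” relationship is essentially a broker pattern: the sandboxed process can’t touch resources directly, so it asks the privileged side, which validates every request. A minimal Python sketch (the file-reading request API here is invented purely for illustration):

```python
import multiprocessing as mp

def content_process(request_q, response_q):
    """Sandboxed side: no direct file access, so it asks the chrome process."""
    request_q.put(("read_file", "bookmarks.txt"))
    print("content received:", response_q.get())

def chrome_broker(request_q, response_q, allowed):
    """Privileged side: validates each request before acting on it."""
    op, path = request_q.get()
    if op == "read_file" and path in allowed:
        response_q.put(f"<contents of {path}>")
    else:
        response_q.put("denied")

if __name__ == "__main__":
    req, resp = mp.Queue(), mp.Queue()
    child = mp.Process(target=content_process, args=(req, resp))
    child.start()
    chrome_broker(req, resp, allowed={"bookmarks.txt"})
    child.join()
```

This is also why the broker boundary itself becomes the attack surface: a bug in the validation logic is exactly the kind of privilege-escalation path described above.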

33 Responses to goals for multi-process firefox

50ms to respond to a user action – it’s not the frame rate, exactly. It’s more “when I click this button something should change in 50ms.” That’s getting to the bottom edge of human perception. Things like video look better at slightly higher frame rates, of course.

You are assuming that the input loop and the draw loop are synchronous. When you deal with multithreaded applications, that is not a valid assumption, even with video games these days. A game running at 122 FPS might only be updating the game loop at 30 Hz, and the input loop might not even be queried on every game-loop tick. The same idea applies here: several frames can go by before an action is registered, but some frame in the next 50 ms will reflect my input.
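The decoupling this comment describes can be sketched with simple frame counting (the 120 Hz render rate and 30 Hz update rate are illustrative, not taken from any real engine):

```python
def run(frames, render_hz=120, update_hz=30):
    """Render every frame; tick the simulation and poll input less often."""
    updates = inputs = 0
    frames_per_update = render_hz // update_hz   # 4 rendered frames per sim tick
    for frame in range(1, frames + 1):
        # render(frame) would happen here, on every single frame
        if frame % frames_per_update == 0:
            updates += 1        # fixed-timestep game-state update
        if frame % 4 == 0:
            inputs += 1         # input polled only every few frames
    return updates, inputs

print(run(120))  # one second's worth of frames -> prints (30, 30)
```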

I liked this writeup about what we’re doing and why. But I have a few technical points about multiprocess and UI responsiveness.

* It’s not just GCs on the main thread that make the UI unresponsive. The whole browser locks up when you load the HTML5 spec not because of GCs, but because *all* action on the content page takes place on the same thread as the UI. (I think the HTML5 spec in particular makes a lot of expensive modifications to the DOM.)

* Image decoding currently happens on the main thread, and this is a serious responsiveness issue for us. See bug 666352.

* But even if we had no GC pauses, and even if all our expensive media and layout routines lived off the main thread, and even if all our DOM routines ran in 0 time, content’s JS would still run on the main thread. So when content does an expensive computation, it would still freeze up the browser UI.

Will more processes mean more memory used? Isn’t there a way to achieve the same goals using only threads, so that memory can be shared quickly and simply, avoiding process context switches? Isn’t a single process with many threads faster than many separate processes?

It’s likely that more processes will mean more memory used. There’s actually quite a bit of memory shared between processes, though. If you’re using a library, the actual binary code can be shared via DLLs. We also do a lot of sharing across web pages for things like parser atoms and whatnot. So we’ll see some more overhead. We’ll know more once we’re beyond the prototype stage.

Using threads would give us much of the responsiveness, but wouldn’t give us the crash protection or the low-rights mode, so we’d rather use processes.

Would this multiprocess breakdown also work toward making Gecko easier to integrate? It seems to be increasingly hard for third-party developers of things like SongBird to keep Gecko updated, versus the ton of web browsers like RockMelt and Flock that are taking Chromium and creating browsers like it’s nothing.

If Mozilla were to get more companies using Gecko as a base again, there might be a large inflow of developers.

I imagine this could be done if, during the redesign, Gecko’s core were broken into engines rather than a giant heap. I’m not sure if this has been done already, but if not it should be worth the investment.

Gecko has always had Firefox as its main target. I’m not sure if this will help the embedding story you’re referring to or not. It’s not in our plans. (Note that I would not characterize our engine as a giant heap – it’s got modules, but most of them are in service of the whole.)

Except for when it was operated by Netscape (Navigator)? I didn’t mean heap in that way… I’m not trying to insult Gecko – I love it – but it concerns me that it’s so contained, focus-wise, to Firefox. I hope embedding is taken more seriously; it’s better to fix it sooner than later. My best comparison is closing a child’s candy store – while the others are still open – and then expecting the children to use that candy store again when you reopen it.

Please get this feature out soon
This is one of the much needed things to counter other browsers.
They’re not bad, but if Firefox doesn’t keep its 30%+ market share, who’s going to ensure our web stays free? Exactly: no one. Get to it, guys!

While I like careful thought, planning and proper groundwork are more important than rushing into a bad implementation of e10s. But it was 2009 when we first talked about it, and it was once supposed to be a feature of Fx4. We are now near 2011 and looking at Fx8, and we are very unlikely to have e10s by the end of this year.

I just want to point out that your (or Mozilla’s) fans and evangelists out there are having a hard time generating noise and defending against users migrating to Chrome. e10s should now be a top priority on your features list.

Actually, both Chrome and Firefox do this already. I can’t remember what version we introduced it in, but we back off timers in pages that aren’t in the foreground. We’re also doing decode-on-draw in newer versions, which means that we won’t keep memory around for images on pages you aren’t currently viewing.
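The timer back-off behaviour might be sketched like this (the 1000 ms background clamp is an assumption for illustration, not Firefox’s actual value):

```python
def effective_timeout(requested_ms, tab_in_foreground, background_min_ms=1000):
    """Clamp setTimeout-style timers for background tabs (illustrative values)."""
    if tab_in_foreground:
        return requested_ms
    # Background pages don't need fast animation timers; wake them less often.
    return max(requested_ms, background_min_ms)

print(effective_timeout(10, tab_in_foreground=False))  # prints 1000
```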

There’s a lot of other things that need to happen, but that’s just a start.

Coincidentally, I had a conversation a few days ago with some friends who’d switched to Chrome. Their single biggest reason: when a site causes the browser to crash, it doesn’t take down the entire browser, just a tab.

Every single one said this is why they switched to Chrome from Firefox. Every single one.

Mozilla should seriously prioritise multi-process Firefox. Otherwise market share will continue to bleed over to Chrome. (Seriously, this isn’t the first time I’ve had that conversation either, and the reason for switching was the same before that: a single tab crashes instead of the entire browser.)
The only reason I stay on Firefox is the addons, and I’m not a big fan of Chrome’s ultra-minimalist UI.

What’s strange about this argument to me is that I simply haven’t had a web page crash in YEARS… and even plugin crashing has been much more rare recently than in the past. I’ve actually purposefully killed the plugin-container process a few times just to make sure this feature is still working!

I sincerely hope you don’t make it like Google Chrome. For all the praise heaped on its process model, Google Chrome gets very, very, very sluggish after something like 10 tabs or so. It also uses a lot of memory while delivering a lot less on the user-experience side. By comparison, I use Firefox with several tens of tabs (and sometimes even a hundred or more) and I like the fact that Firefox uses a lot less memory than browsers like Chrome or Safari. Plus, it’s not sluggish like the others after 10–20 tabs.

I really don’t want a process model in Firefox if it’s going to become like Chrome or Safari for heavy tab users like me. Currently I don’t have the Flash plugin installed and I’d rather live without plugins like Flash (hey, the iOS guys have been living this dream for 4+ years without issues).

Coming to my needs, will Firefox with the process model be a lot different from Chrome for users who open tons of tabs?