MemShrink progress, week 32

There wasn’t much MemShrink activity this week in terms of bugs fixed, just bug 718100 and bug 720359. So I’m going to take the opportunity this week to talk about the bigger picture.

Bug Counts

As a prelude, here are this week’s bug counts.

P1: 20 (-4/+0)

P2: 131 (-3/+3)

P3: 74 (-2/+7)

Unprioritized: 4 (-3/+4)

The drop in P1s was just due to bug re-classification; in particular, three bugs relating to long cycle collector pauses were un-MemShrink’d because they are more about responsiveness, and they are being tracked by Project Snappy.

The Big Ticket Items

David Mandelin asked me today what the big ticket items were for MemShrink. I’d been looking a lot at the MemShrink:P1 list recently (which is why some were re-classified) and so I was able to break it down into six main areas that cover most of the P1s and various P2s. I’ll list these from what I think is least important to most important.

#6: Better Script Handling

Internally, a JSScript represents (more or less) the code of a JS function, including things like the internal bytecode that SpiderMonkey generates for it. The memory used by JSScripts is measured by the “gc-heap/scripts” and “script-data” entries in about:memory.

Luke Wagner did some measurements that showed that most (70–80%) JSScripts created in the browser are never run. In hindsight, this isn’t so surprising — many websites load libraries like jQuery but only use a fraction of the functions in those libraries. If SpiderMonkey could be changed to generate bytecode for scripts lazily, it could reduce “script-data” memory usage by 60–70%. This would also allow the decompiler to be removed, which would be great.
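To illustrate the lazy approach, here is a toy sketch in Python rather than SpiderMonkey’s C++ (the `LazyScript` class and its behaviour are invented for illustration): bytecode is only generated the first time a script runs, so the 70–80% of scripts that never run never pay the compilation cost.

```python
class LazyScript:
    """Toy model of lazy bytecode generation: compile on first call only."""
    compiles = 0  # counts how many scripts actually get compiled

    def __init__(self, source):
        self.source = source      # cheap: just the source text
        self.bytecode = None      # the expensive part, deferred

    def run(self):
        if self.bytecode is None:             # first call: compile now
            LazyScript.compiles += 1
            self.bytecode = compile(self.source, "<script>", "exec")
        exec(self.bytecode)

# A "library" of many functions, of which only a fraction is ever called.
scripts = [LazyScript(f"x = {i}") for i in range(100)]
for s in scripts[:25]:    # only 25% are run, as with jQuery-style libraries
    s.run()

print(LazyScript.compiles)  # 25, not 100: 75% of compilation never happens
```

The memory saving follows the same shape: the uncompiled scripts cost only their source text, not their bytecode.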

Both of these changes potentially will make the browser faster as well, because SpiderMonkey will spend less time compiling JavaScript source code to bytecode.

No-one is assigned to work on these bugs. The lazy script creation can be done entirely within the JS engine; the script sharing requires assistance from Necko. Luke is currently busy with some other righteous refactorings, but I’m quietly hoping once they’re done he might find time for one or both of these bugs.

#5: Better Memory Reporting

Before you can reduce memory consumption you have to measure it. about:memory is the critical tool that has facilitated much of MemShrink’s work. (For example, we never would have known about zombie compartments without it.) It’s in pretty good shape now but there are two major improvements that can be made.

First, the “heap-unclassified” number (a.k.a. “dark matter”) is still typically around 20–25%. My goal is to reduce that to 10%. This won’t require any great new insights; we already have the tools and data required. Rather, it’s just a matter of grinding through the list of memory reporters that need to be added and improved.

Second, the resources used by each browser tab are reported in an unwieldy fashion: JS memory on a per-compartment basis; layout memory on a per-docshell basis; DOM memory on a per-window basis. Only a few internal architectural changes stand in the way of uniting these to provide the oft-requested feature of per-tab memory reporting. This will be great for users, because if Firefox is using more memory than they’d like, it tells them which tabs they should close in order to free up memory.
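As a rough sketch of what uniting those reports might look like (the report structure and tab names here are invented; real Firefox would first need the internal changes that map compartments, docshells and windows to tabs):

```python
# Hypothetical memory reports, each tagged with the tab that owns it.
reports = [
    {"tab": "news.example.com", "kind": "js/compartment",  "bytes": 8_000_000},
    {"tab": "news.example.com", "kind": "layout/docshell", "bytes": 2_000_000},
    {"tab": "news.example.com", "kind": "dom/window",      "bytes": 3_000_000},
    {"tab": "mail.example.com", "kind": "js/compartment",  "bytes": 5_000_000},
]

def per_tab_totals(reports):
    """Sum JS, layout and DOM reports under the tab that owns them."""
    totals = {}
    for r in reports:
        totals[r["tab"]] = totals.get(r["tab"], 0) + r["bytes"]
    return totals

totals = per_tab_totals(reports)
# The user can now see which tab to close to free the most memory.
print(max(totals, key=totals.get))  # news.example.com (13 MB vs. 5 MB)
```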

I am actively working on both these improvements, and I’m hoping that within a couple of months they’ll be mostly done.

#4: Better Memory Consumption Tracking

One thing we haven’t done well in MemShrink is to improve the state of tracking Firefox’s memory consumption. We have plenty of anecdotes but not much hard data about the improvements we’ve made, and we don’t have good ways to detect any regressions. A couple of ideas haven’t gone very far, but some good news is that John Schoenick is making great progress on a proper areweslimyet.com implementation. John has demonstrated preliminary versions of the site at two MemShrink meetings and it’s looking very promising. It uses the endurance test framework to make the measurements, and opens lots of pages from the Talos tp5 pageset.

We also hope to use telemetry data to analyze how the memory consumption of each released version of Firefox stacks up. That analysis would come with a significant delay — weeks or months after each release — but it would be much more comprehensive than any oft-run benchmark, coming from the real-world usage patterns of thousands of users.

#3: Compacting Generational GC

If you look in about:memory, JavaScript memory usage usually dominates. In particular, the “js-gc-heap” is usually large. There’s also the “js-gc-heap-unused-fraction” number, often 30% or higher, which tells you how much of that space is unused because of fragmentation. That percentage overstates things somewhat, because often a good proportion of that unused space (see “js-gc-heap-decommitted”) is decommitted, which means that it’s costing nothing but address space… but that is cold comfort if you’re suffering out-of-memory aborts on Windows due to virtual memory exhaustion.
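The arithmetic behind those numbers can be sketched as follows, using invented figures in the same ballpark as a real about:memory dump:

```python
# Illustrative numbers only; real values come from about:memory.
gc_heap     = 100 * 1024 * 1024   # total js-gc-heap
used        =  65 * 1024 * 1024   # space occupied by live GC things
decommitted =  20 * 1024 * 1024   # unused AND decommitted (costs only address space)

unused = gc_heap - used
unused_fraction = unused / gc_heap                   # the scary-looking 35%
committed_waste = (unused - decommitted) / gc_heap   # actually costing RAM: 15%

print(round(unused_fraction, 2), round(committed_waste, 2))
```

So the headline fragmentation number overstates the physical-memory cost, but every byte of it still counts against the 32-bit address space limit on Windows.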

A compacting garbage collector is one that can move objects around the heap, filling up all those little gaps that constitute fragmentation. The JS team (especially Bill McCloskey and Terrence Cole) is implementing a compacting generational garbage collector, which is a particular kind that tends to have good performance. In particular, many objects die young and generational collectors find these quickly, which means that the heap will grow at a significantly slower rate than it currently does. I could be wrong, but I’m convinced this will be a big win for both memory consumption and speed.
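A toy model of the nursery half of such a collector (all names invented, and vastly simplified compared to the real SpiderMonkey design): a minor collection copies the few survivors out to the tenured heap and resets the nursery, which is therefore always compact for new allocations.

```python
class Nursery:
    """Toy generational-GC nursery: minor collections copy survivors to the
    tenured heap and reset the nursery, leaving it compact (no fragmentation)."""

    def __init__(self):
        self.objects = []    # recently allocated objects
        self.tenured = []    # objects that survived a minor collection

    def alloc(self, obj):
        self.objects.append(obj)

    def minor_gc(self, roots):
        # Only objects still reachable survive; most objects die young.
        survivors = [o for o in self.objects if o in roots]
        self.tenured.extend(survivors)    # "promote" by copying out
        self.objects = []                 # nursery is empty and compact again
        return len(survivors)

heap = Nursery()
for i in range(1000):
    heap.alloc(i)
live = {3, 141, 592}                      # only a handful stay reachable
promoted = heap.minor_gc(live)
print(promoted, len(heap.objects))        # 3 survivors, empty nursery
```

Because only the three survivors are copied, the cost of a minor collection is proportional to the live data, not to the 1000 allocations, which is why generational collectors tend to be fast as well as compact.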

#2: Better Foreground Tab Image Handling

Images are stored in a compressed format (e.g. JPEG, PNG, GIF) on disk. In order to display them, a browser must decompress (a.k.a. decode) the compressed form into a raw pixel form that can easily be ten times larger. This decoded form can be discarded and regenerated as necessary, and there are trade-offs to be made — for example, if you are too aggressive in discarding decoded images, you might have to decode them again, which will take CPU cycles and the user might see flickering if the decoding occurs in the visible part of the page.
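The size blow-up is easy to quantify: a decoded image costs four bytes per pixel (RGBA) regardless of how well its compressed form squeezed down. A sketch, using a plausible but invented file size:

```python
def decoded_size(width, height, bytes_per_pixel=4):
    """Raw RGBA size of a decoded image, independent of the compressed size."""
    return width * height * bytes_per_pixel

# A fairly ordinary 1920x1080 JPEG might be ~300 KB on disk...
compressed = 300 * 1024
raw = decoded_size(1920, 1080)       # ...but ~8 MB once decoded
print(raw, round(raw / compressed))  # roughly a 27x blow-up for this image
```

Multiply that by a page with dozens of large images, all decoded eagerly and kept alive, and the memory numbers get ugly fast.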

However, Firefox goes way too far in the other direction. If you open a page in the foreground tab, every single image in that page will be immediately decoded, and none of the decoded data will be discarded unless you switch away to another tab. For pages that contain many images, this is a recipe for horrific memory consumption, and Firefox does much worse than all the other browsers. So this is a problem that doesn’t rear its head for all users, but it’s terrible for those that are affected.

(See this discussion on the dev-platform mailing list for more details about this topic.)

#1: Better Detection and Notification of Leaky Add-ons

It’s been the case for several months that when a user complains about Firefox consuming an excessive amount of memory, it’s usually because of one or more add-ons, and the “can you try that again in safe mode?” / “oh yeah, that fixes it” dance is getting tiresome.

Many add-ons leak. Even popular, well-written ones: in the past few months leaks have been found in Adblock Plus, Video DownloadHelper, GreaseMonkey and Firebug. That’s four of the top five add-ons on AMO! We’re now getting several reports about leaky add-ons a week; in this week’s MemShrink meeting there were four: TorButton, NoSquint, Customize Your Web, and 1Password. I strongly suspect the leaks we know about are just the tip of the iceberg.

Although leaks in add-ons are not Mozilla’s fault, they are Mozilla’s problem: Firefox gets blamed for the sins of its add-ons. And it’s not just memory consumption; the story is the same for performance in general. Here’s the quote of the week, from a user of 1Password:

I only use a handful of extensions and honestly never suspected 1P, however after disabling it I noticed my FireFox performance increased very noticibly. I’ve been running for 48 hours now without the 1P extension in Firefox and wow what a difference. Browsing is faster, switching is faster, memory usage is way down.

I’ve lost count of the number of stories like this that I’ve heard. How many users have we lost to Chrome because of these issues, I wonder?

(And it’s not just leaks. See this analysis of 16 add-ons and their effect on memory consumption when Firefox starts.)

One small step towards improving this situation was made this week: Jorge Villalobos and Andrew Williamson added a “check for memory leaks” item to the AMO review checklist (under “Memory leaks from content”). And Kris Maglione added some support for this checking in his Extension Test add-on. This means that add-ons with obvious memory leaks (and many of them are obvious if you are actively looking for them) will not be accepted by AMO.

So that will prevent leaks in some new add-ons and new versions of established add-ons. What about existing add-ons? One idea is that AMO could also have a flag that indicates add-ons that have known memory problems (and other performance problems). (This flag wouldn’t be an automatic thing; it would only be set once a leak has been confirmed, and after giving the author notification and some time to fix the problem.) So that would also improve things a bit.

But lots of add-ons aren’t hosted on AMO. Another idea is to have a stronger mechanism, one that informs the user if they have any add-ons installed that are known to cause high memory consumption (or other bad performance problems). There is an existing mechanism for blocking add-ons that are known to be malware or exceptionally crashy, so hopefully the warnings could piggy-back on top of that.

Then, we need a better way to detect leaky add-ons. Currently this is entirely done manually — and a couple of excellent contributors have found leaks on multiple add-ons — but I’m hoping that it’ll be possible to do a much more thorough job by analyzing telemetry data to find out which add-ons are correlated with high memory consumption. That information could be used to trigger manual checking.
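A very rough sketch of that kind of correlation analysis, on made-up telemetry data (a real analysis would need far more statistical care, and a large gap only flags an add-on for manual checking; it proves nothing by itself):

```python
# Made-up telemetry submissions: installed add-ons and peak memory (MB).
submissions = [
    ({"AdBlock", "Foo"}, 900),
    ({"AdBlock"},        850),
    ({"Foo"},            400),
    (set(),              380),
]

def mean_memory_with_and_without(addon, submissions):
    """Compare mean peak memory across sessions with/without a given add-on."""
    withm   = [m for addons, m in submissions if addon in addons]
    without = [m for addons, m in submissions if addon not in addons]
    return sum(withm) / len(withm), sum(without) / len(without)

with_ab, without_ab = mean_memory_with_and_without("AdBlock", submissions)
print(with_ab - without_ab)  # 485.0 MB higher on average with this add-on
```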

Finally, once you know an add-on leaks, it’s not always easy to work out why. Tools could help a lot here, if they can be made to work well.

Conclusion

I listed six big areas for improvement. If we fixed all of these I think we’d be in a fantastic position.

Three of them (#5 better memory reporting, #4 better memory consumption tracking, #3 compacting generational GC) have people working on them and are in a good state.

Three of them (#6 better script handling, #2 better foreground tab image handling, #1 better detection and notification of leaky add-ons) don’t have people working on them, as far as I know. If you are willing and have the skills to contribute to any of these areas, please contact me!

And if you think I’ve overestimated or underestimated the importance of any issue, I’d love to hear about it. Thanks!

Thank you for looking into the NoSquint problems. It will surely encourage the author(s) to address the problem.

Offtopic:
NoSquint is an extension I cannot possibly do without, which is partly web authors’ fault and partly Firefox’s fault for letting the web authors do what they want. I would really love it if Firefox would allow me to zoom the page automatically to ensure that the smallest font on the page is a certain minimum size. Although Firefox presently allows you to control the minimum font size, this breaks many web page layouts and that is the reason NoSquint was invented.

I have to kill my Firefox about every hour on Windows. Suddenly it starts eating lots of memory (it becomes the highest consumer) and FF becomes non-responsive. Is there a way to send this data and see who is doing that? It’s a dev machine; I have Firebug installed, and a few others too.

Typically it happens to me with Firebug in the following situation.
If I dump the text representation of a big (or huge) object for debugging purpose, Firefox with Firebug gets crazy and memory increases to the limit of available memory.
Example in php: var_dump($myHugeObject); Then display the resulting page.

I mentioned this via email to Jan Honza Odvarko, the lead Firebug developer. He said: “Firebug memory problems are top priority for me and FWG so, I am definitely interested. Could you please provide (or the commenter) a simple test case (php page), that I could use to repro the problem on my machine? Having such a test case is essential for me to find the actual problem so, that would be great help!”

If you want to analyse it, you’ll have to figure out a way to notice the problem without spending an entire hour, disable one add-on and restart (starting with the ones you feel are suspect), and repeat until the problem seems to disappear. Then list the add-ons you had enabled before, and the add-ons you had to disable (there’s a list in about:support that can be copy-pasted; the before list is useful because the problem could be due to several add-ons interacting). The add-ons you still need for web dev, like Firebug, can be installed to a separate profile.

Sure; just disable all your extensions but one, then use your browser for a while. If it’s OK, then re-enable one extension, and use it again. Proceed until you find an add-on which is misbehaving (keeping in mind that you may have multiple bad add-ons).
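A bisection variant of this procedure can cut the number of restart cycles from linear to logarithmic. This sketch (all names invented) assumes a single misbehaving add-on and treats a browsing session as a yes/no oracle:

```python
def find_bad_addon(addons, misbehaves):
    """Bisect the add-on set: misbehaves(subset) reports whether the problem
    appears with only that subset enabled. Assumes exactly one bad add-on."""
    while len(addons) > 1:
        half = addons[: len(addons) // 2]
        addons = half if misbehaves(half) else addons[len(addons) // 2 :]
    return addons[0]

# Pretend "Ext7" is the leaky one; in practice the oracle is a real session.
installed = [f"Ext{i}" for i in range(10)]
culprit = find_bad_addon(installed, lambda subset: "Ext7" in subset)
print(culprit)  # Ext7, found in ~log2(10) sessions instead of 10
```

With multiple bad add-ons the oracle gets noisier and a linear sweep, as described above, is the safer approach.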

You can also look for zombie compartments, which are a common kind of memory leak.

This isCrappy button would only be added after the add-on author had been given notification and a reasonable amount of time to fix the problem.

What else would you suggest? Just let users suffer from bad add-ons with no way of knowing? We’ve been trying that approach for pretty much as long as add-ons have existed, and I think it’s not working.

As a user I’d be a lot more willing to install and recommend add-ons if I knew they would have negligible memory impact. Currently I only have very indirect hints like freshness, focus on doing one thing well, unobtrusiveness, popularity. As a result I install only the bare minimum. When I see an add-on is restartless I’m a lot more willing (most of it will show up in about:memory).

Anyway, I expect competent add-on authors already care about memory usage. Others may be unaware of it, but will care once the review process gently nudges them.

I think the importance of every issue is estimated correctly. Just to confirm this from the standpoint of a user: please, if possible, try to report the memory used by each add-on so that we, users, can disable or find alternatives for the add-ons that use lots of memory. I think if you find a way to do it, then 1) users reporting memory problems will be able to provide you with more detailed and useful information than they can now by copying info from about:memory, and 2) add-on developers themselves will have a ready tool to measure how much memory their add-on consumes and thus, if necessary, fix it before release.

Nicholas, thank you for your comment.
1) I think you just have to start it somewhere and somehow. As you mentioned yourself, it’s the number 1 issue in your project. I think without finding a solution to it, you can’t say that the MemShrink project has reached its goal, no matter how much Firefox itself may improve its memory efficiency. In this case, the Mozilla community (not a few people, but everyone) should think about how Firefox’s architecture could/should be changed to accommodate the need for reporting the memory used by add-ons. I think you should invite everybody to propose ideas on how to deal with this issue, and take a more proactive stand.
2) Please try to do it. The most difficult step is to start doing it. Once you start it you will find the solution.

I hope all these issues will be addressed in a Mozilla meeting.
The fact that no one is working on some of them is just sad.
JS memory usage: how many Facebook, Twitter, and Google Plus buttons do we have in memory?
( Wall Flower Addon isn’t perfect in stopping those running )

Did you read the linked bug? It’s going to be implemented differently. It has been exposed to the wide world of js scripts for many releases, has legitimate uses and isn’t deprecated, which makes it an unlikely candidate for removal.

It would be nice to be able to see addon’s memory usage reported per addon in about:memory so it’s easier for users to see what their addons are using and enable easier testing by them to see which are the problematic ones for their usage case.

is there a pattern when looking at the problematic extensions? For instance, are extensions which cross the chrome/content boundary more likely to cause zombie compartments involuntarily or are extensions which do not touch content equally susceptible? Knowing that might help with diagnosis when trying to help other users with memory issues.

I’m hoping that it’ll be possible to do a much more thorough job by analyzing telemetry data to find out which add-ons are correlated with high memory consumption.

Doing this via RSS numbers would require a powerful statistical analysis, and I’m not convinced our data is reliable or plentiful enough.

However, if we could send a “has zombie compartments” bit along with telemetry, that might be useful. We’ve always said that some extensions may keep compartments around legitimately, but I haven’t seen one yet which intentionally does this, so they can’t be particularly common.

Of course, a prerequisite for this is being able to identify automatically what’s a zombie.

> JS memory on a per-compartment basis; layout memory on a per-docshell basis; DOM memory on a per-window basis

Can you make a blog post expanding on what compartments, docshells, windows (I’m assuming you mean the js |window| object as opposed to HWNDs), etc. mean for those of us unfamiliar with the mozilla high level architecture?

Compartments can currently be shared between tabs, which make things tricky for per-tab reporting. However, https://bugzilla.mozilla.org/show_bug.cgi?id=650353 will change that — each compartment will end up belonging to a single tab. Once that’s done, JS code can be reported on a per-tab basis pretty easily.

The windows mentioned in relation to the DOM are just the DOM |window| objects, as you suggested. And a docshell is just an internal data structure within the implementation of the |window.document| object, so there’s a 1-to-1 correspondence between |window| objects and docshells. And because each |window| object clearly belongs to a tab, per-tab reporting of windows and docshells can be done right now. I just have to find time to do it 🙂

Not all JS data is stored within compartments. For example, some stuff is stored in the “runtime”, which you can see in about:memory. Stuff that’s stored in the runtime can be shared between tabs, which can be good for reducing memory consumption, and that’s what would happen with the shared parts of scripts. Depending on the nature of that sharing, it might be possible to apportion blame to each tab, or the blame might just have to fall into a “system” bucket.

I have just started reading your posts and I am enjoying them. I don’t know what proportion of memory the JSScript bugs consume, but they sound like they should be prioritised. I suspect the impact will be bigger than you estimate, simply because the effect is hidden across pages.

There’s nothing really hidden about them — the “js-total-scripts” number near the bottom of about:memory tells you how much memory scripts account for. E.g. in my current session it’s 54.02 MB out of 718.53 MB explicit. Fixing both of those bugs would get rid of most of that 54.02 MB.

Not sure if this is worth filing a bug for (that’s why I’m asking here first), as it is a pretty hefty page, so it might be normal behavior. I’ve seen it cause heap-unclassified values of 45%, but it does seem to go down again after a while.

That page crushes browsers in general. It took FF9 about a minute to load on my i7 desktop; and that was the best result out of the 3 browsers I tried.

IE9 will render the current viewport in about 5-10 seconds but won’t scroll via wheel/arrows and stalls for equally long between refreshes if you try page up/dn. After going down a few pages that way it appears to have gotten stuck since attempts to scroll down farther just result in it going blank and then showing the same lines of code again. Other tabs in IE at least remained functional while the one with it open remained paralyzed.

Opera 11.61 was frozen for >16 minutes when I finally lost patience and killed the process.

Traditional (non-JetPack) add-ons can hook their claws extremely deep into Firefox. There’s no clear dividing line between JS memory that’s due to Firefox and that which is due to the add-on. And in lots of cases both Firefox and the add-on will hold references to structures like DOM nodes; in that case, who do you blame for it?

I don’t see a problem with that.
In cases where both Firefox and the add-on hold references to a memory structure, there are 2 options:
1. It should be blamed on Firefox, since the add-on is not the only cause of holding the memory.
2. Or, it can be counted as shared memory for the add-on. Each add-on can have memory reported in 2 columns: private and shared memory. Private is memory that is held only by the add-on, where neither another add-on nor Firefox holds references to it, so the add-on is the only one to blame for it.
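Option 2 could be sketched like this, assuming (hypothetically) that each memory block records which parties hold references to it; the block structure and names below are invented for illustration:

```python
# Toy attribution: each memory block records who holds references to it.
blocks = [
    {"bytes": 10, "holders": {"addon"}},             # private to the add-on
    {"bytes": 30, "holders": {"addon", "firefox"}},  # shared with the browser
    {"bytes": 50, "holders": {"firefox"}},           # nothing to do with the add-on
]

def addon_report(blocks, name="addon"):
    """Split an add-on's memory into private and shared portions."""
    private = sum(b["bytes"] for b in blocks if b["holders"] == {name})
    shared = sum(b["bytes"] for b in blocks
                 if name in b["holders"] and len(b["holders"]) > 1)
    return private, shared

print(addon_report(blocks))  # (10, 30)
```

The hard part in practice is not the arithmetic but computing the holder sets, which requires tracing who references what across the chrome/content boundary.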

Factors like this inability to distinguish Add-On memory from core browser memory are why I cringe when you claim broad problems like leaky Add-Ons are not Mozilla’s fault.

Mozilla chose the open slather approach to attracting third party customization of Firefox, whether deliberately or through neglect. Hell, Mozilla even allowed the most brazen of user-compromising background unauthorized Add-On installation until recently! This is how desperate Mozilla was to play nicely with 3rd parties. Add-On developers just use what Mozilla allowed them to use. Initially I doubt there were even simple ‘best practices’ guidelines available to Add-On authors. Common sense would suggest that before any application opens itself to third party customization, that application would have set up reasonable boundaries. The developers of that application should have enough foresight to at least look at potential scalability consequences. Not so for Mozilla! From not having a ceiling on Firefox memory consumption relative to available system memory, to not having a means to even measure Add-On resource usage separately from core browser usage, it is Mozilla’s fault that these massive holes in its 3rd party customization (Add-Ons) model exist. What was a poorly conceived strength thus has quickly become an Achilles heel, exacerbated by the atrocious regressions in Firefox 4. These are undoubtedly the biggest reasons why Firefox has lost users to Chrome.

Mozilla has to take the good with the bad as well. It is a stated Mozilla policy to let Add-On authors do a lot of Firefox’s innovation. Apart from the ‘creative’ designers who wake up every day asking themselves how they can copy other popular browsers (first IE, now Chrome), the JavaScript engine developers and those implementing new content rendering code such as HTML5 support, it is largely Add-On authors who do Firefox’s innovation. Usually this is in the critical field of user-friendliness. Session restore; re-opening closed tabs; textarea resize; developer tools; centralized bookmark synching; the combined Firefox button (ick); keeping background tabs unloaded … all these are examples of current (or arriving) native features developed by Add-On authors first. Add-Ons like NoSquint, Textarea Cache and Context Search are examples of simple user-friendly innovations provided by Add-Ons that should have gone native years ago but as yet have not. Mozilla cannot have it both ways. Mozilla cannot accept 3rd party innovation and also criticize those developers who provide this innovation for using Mozilla’s flawed ecosystem. The Jetpack ecosystem (Add-On ecosystem Mark II) still provides for access to chrome interfaces and components. There is no reason why (except perhaps the lamentable lack of Jetpack maturity) that Mozilla couldn’t start a campaign to encourage Add-On developers to migrate to Jetpack and then at least there would be better identification of memory consumption for most Add-Ons, right Nich? Mozilla formerly did this sort of thing before it went berserk for mobile and strange campaigns like web makers (I still don’t understand what that is). In short:

You want 3rd party innovation? You got it. You want slim 3rd party innovation? Take the more considered approach (like Jetpack apparently has) since day one!

Playing games with memory consumption limits to keep some free for other applications is something that sounds clever, but blows up readily as soon as 2 or more programs with different rules try running concurrently.

We do drop some discardable stuff on memory pressure at the moment — specifically, we do it on Windows when virtual memory gets low. It’s throttled in a way that ensures it doesn’t happen too often, something like once every 10 seconds. There’s a bug open to do similar things for Mac and Linux (https://bugzilla.mozilla.org/show_bug.cgi?id=664291).

Chrome’s addons’ isolation makes them powerless. And so does Jetpack if you don’t use chrome interfaces. And once you do, you can forget about memory measurement. Oh and did I tell you that Chrome’s addon memory accounting is bogus?

Sometimes, the focus seems to be on implementation-/addon-level memory use and snapshots. What about JS developers wanting to improve their code’s memory-use-over-time profile?

In terms of possible sources to draw inspiration from, are you aware of the existing work on heap profiling for functional programs? Moving from snapshots to graphs, showing usage by user-defined types and along several possible dimensions, adding post-mortem lifetime information (is memory allocated long before first use, or retained long after last use?) are example ideas (here from Haskell heap profiling).

I’m aware of the heap profiling work, it inspired me to write Massif several years ago; early versions of Massif even used the same output format and hp2ps tool that the Haskell profiler used. A lot of that stuff (esp. the lag/drag/null/void stuff) is tailored towards the uniquely odd behaviours caused by laziness; in my experience that kind of time profile is much less interesting for strict languages, and heap snapshots at interesting times (e.g. peak memory usage) are more interesting. That’s what the more recent versions of Massif do.

As for the more general suggestion of providing tools to JS authors; yes it would be a good thing. I believe Chrome has some good tools for this. But doing that well is a lot of work, and while there is low-hanging fruit within the browser itself, IMO it’s better to focus on that, because that benefits every piece of JS code on the web. In contrast, better heap profiling tools for webdevs will only help with code written by good webdevs who know how to use the tools.

Great! Btw, the biographical profiling isn’t as specific to lazy evaluation as you make it sound; also, every practical language has both strict and non-strict constructs, not to mention hand-coded laziness patterns in JS.

As for the focus: yes, with one browser running dozens of tabs, any progress you make in the browser is shared/multiplied; but if you could help to enable progress in the JS code that runs in the tabs, that progress would be distributed (lots of good JS coders improving their code). And as the low-hanging fruits in the browser disappear (or you don’t have the resources to optimize further), the latter category is going to have the greater gains (and is going to pressure webdevs into improving their code). Just wanted to make sure you’re aware of the options.

Completely different topic: I’m wondering how many instances of JQuery (etc) there are in a typical browser session and, if there was a way to have immutable modules, whether those instances (or their code sections) could be shared.

Yes I would also like to support heap snapshots.
Chrome has pretty good support for this.
We learned a lot with the development of the Eclipse Memory Analyzer (for Java), which could be applied to JavaScript in many cases.
See my blog for some examples: http://kohlerm.blogspot.com/search/label/memory

I wonder how hard it would be to support Chrome’s heap snapshot format; then theoretically we could use the same tools to analyze the snapshots.

Regarding #5 and #1: it seems like all of the effort to find leaks in add-ons is focused on finding zombie compartments. However, there’s a Firebug leak I found with the help of Honza that’s represented pretty much entirely in the main [System Principal] compartment. While about:memory is very helpful for testing whether or not the leak is occurring, the information it provides wasn’t detailed enough to make it easy to catch the leak in the first place.

The leak I was experiencing significantly increased cycle collection times for me, which noticeably impacted responsiveness, so if other add-ons are doing something similar, it may be more important to catch that than zombies.

On a more constructive note, I think another helpful thing for add-on developers would be better documentation on how to use weak references. All of the documentation on devmo relates to using C++ for weak references. The only JS reference is here: https://developer.mozilla.org/en/Components.utils.getWeakReference
I added some text based on what I learned, but I was very surprised at how difficult it was to get things right in JavaScript.

I suspect using more weak references would prevent some of these leaks.