They say Marc Andreessen, co-founder of Netscape and co-author of the Mosaic browser, once said:

[An operating system] is just a bag of drivers.

People have been fantasizing about the web as an application platform for as long as we’ve had it. Nearly a decade later, we’re really just getting started at realizing this vision–of truly reproducing the power of traditional operating system APIs inside the browser.

While some have had this vision of browser-as-application-runtime since the beginning, most of us have traditionally viewed the browser as a web page renderer. It’s only been in the past few years that some have begun to push hard on changing this status quo. Google stands out in this group both with the creation of boundary-pushing “desktop-quality” applications like Gmail and in describing Google Chrome as an application run-time, not a page viewer. [1]

Here in the Mozilla Developer Tools Lab, we’ve been pondering the various gaps in the tool-chain when you treat the browser as a serious, OS-grade application run-time. We’ll talk more about the landscape of tools and what’s available in a different post. In this one, we’d like to talk about one of the gaps we’ve found: memory tools.

If an application’s appetite for memory crosses over into gluttony, it can put a developer’s snappiness ambitions at risk. There are at least a couple of reasons why.

First, applications have a finite amount of memory available to them. When the operating system runs out of physical memory, a cool trick lets it substitute disk space for memory (virtual memory, or swapping), but when this happens, performance hits the floor–hard drives, being mechanical, are orders of magnitude slower than RAM.

While web applications don’t directly ask the operating system for memory, the browser does–both for its own internal functions and on behalf of the web applications it is displaying. As a web application’s memory consumption grows, so does the browser’s.

Therefore, if an individual web application’s memory needs grow large enough, it can force the operating system to start dipping into disk space, and when that happens, kiss any semblance of responsiveness goodbye.

Since there’s no way for a web application to know how much memory is available before this performance doomsday occurs, it’s good practice to keep your memory footprint as svelte as possible.

Garbage Collection and You

But there’s another, much more important reason why small web application memory footprints are good. It has to do with the way memory is handled in a browser. Like Java and pretty much any scripting language, JavaScript manages memory allocation for developers. This frees developers from having to deal with the tedious bookkeeping associated with manual memory management, but it comes at a cost.

That cost is embodied by the garbage collector. As a web application executes, it is constantly creating new objects, most of which are fairly transient–they are part of a transaction that has completed, like creating some short-lived jQuery objects to look-up some DOM elements. These objects consume memory. Eventually, the web application has created enough objects and is therefore consuming enough memory that the collector needs to wade through all the objects to see which ones are no longer being used and therefore represent memory that can be released.
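As a concrete (and hypothetical) sketch of that kind of transient allocation: a function like the one below creates an intermediate array and several temporary strings, all of which become garbage the moment it returns–exactly the sort of debris the collector periodically sweeps up.

```javascript
// Each call allocates short-lived objects: the `parts` array and every
// intermediate string are dead as soon as the result is returned, so
// they become work for the garbage collector.
function formatNames(users) {
  const parts = users.map(u => u.first + ' ' + u.last);
  return parts.join(', ');
}
```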

This is where the performance implication comes in. To do its work, the collector stops the web application’s execution. Typically, this happens so fast that the user doesn’t notice. But when a web application creates lots and lots of objects, and these objects aren’t transient, the collector has a lot of work to do–it must go through all of these objects to ferret out the ones that are no longer used. This in turn results in delays that the user can perceive–and impairs the application’s responsiveness.

Leaks

To be clear, most web pages and web applications don’t push the browser’s memory limitations enough to cause performance problems related to either of the scenarios above. As stated at the outset, this blog entry is about those web applications that need to treat the browser as a high-performance run-time, which in the context of this entry means that they have much-larger-than-average memory requirements.

However, these issues apply to more than just those web apps that are designed to use large amounts of memory; they can also apply to long-running applications which, over time, gradually consume small amounts of memory until the footprint grows to be quite large. When an application consumes more memory than its designers intended, it is said to leak memory. [2]
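To make the mis-designed-cache example concrete, here is a sketch of what such an unintentional leak can look like in JavaScript, alongside a bounded variant–all names here are hypothetical:

```javascript
// A "leak" in the broader sense: every computed result is stored forever,
// so the heap footprint climbs with every unique key and the GC can never
// reclaim any of it.
function makeLeakyCache(compute) {
  const store = {};
  return function (key) {
    if (!(key in store)) {
      store[key] = compute(key); // never evicted
    }
    return store[key];
  };
}

// A bounded variant: cap the number of entries and evict the oldest, so
// dropped results become collectible again.
function makeBoundedCache(compute, maxEntries) {
  const store = new Map(); // Map preserves insertion order
  return function (key) {
    if (!store.has(key)) {
      if (store.size >= maxEntries) {
        const oldest = store.keys().next().value;
        store.delete(oldest); // releasing the reference lets the GC reclaim it
      }
      store.set(key, compute(key));
    }
    return store.get(key);
  };
}
```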

And this leads in turn to a third way in which memory can drag down performance: when the browser itself leaks memory. It turns out that mere mortals have created web browsers, and every so often they’ve made mistakes which can trigger either of the two scenarios described above.

Diagnosing the Problem

So how do you as a developer go about troubleshooting these sorts of problems? Today, there’s really only one good way to do it: use the operating system’s tools. Unfortunately, this option doesn’t provide the right level of detail; you can either see how much memory the browser is consuming in aggregate (which tells you that your memory use is increasing, but not why), or you can see which data structures in the browser itself are consuming the memory (which is fine if you understand the guts of the browser, but it’s pretty hard for anyone else to understand how this maps onto the web application they’ve developed).

What’s missing is a tool targeted at web developers that makes it easy to understand what’s happening with their application’s memory usage. We propose to create such a tool.

Start Small, Start Focused

Our plan is to start small and address two key needs that are presently unmet by any of the existing, developer-friendly, easy-to-use tools we’ve seen on any browser. These needs are:

Understand the memory usage of an application

Understand the garbage collector’s behavior

While here in the Developer Tools Lab we’re most interested in creating developer tools for the entire web community (i.e., not just Firefox users), this tool will need some pretty deep integration with the browser, so we’re going to start with Firefox (we sit close to the engineers who work on it).

We plan for the initial implementation of this tool to be simple. For memory usage, we want to introduce the ability to visualize the current set of non-collectible JavaScript objects at any point in time (i.e., the heap) and give you the ability to understand why those objects aren’t collectible (i.e., trace any object to a GC root). For the garbage collector, we want to give you a way to understand when a collection starts and when it finishes, and thus how long it took.
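To illustrate what “tracing an object to a GC root” means, here is a small sketch (the names are hypothetical): as long as a chain of references reaches an object from a GC root–below, the long-lived `retained` array stands in for something reachable from a root, like a property on `window`–the collector must keep that object alive.

```javascript
const retained = []; // reachable for the lifetime of the page/script

function doWork() {
  // `transient` becomes garbage as soon as doWork() returns...
  const transient = { data: new Array(1000).fill(0) };
  // ...but `kept` never does: root -> retained -> kept is a live path.
  const kept = { data: new Array(1000).fill(0) };
  retained.push(kept);
  return transient.data.length + kept.data.length;
}
```

A heap tool would show you exactly that `retained` path, which is what tells you why `kept` can’t be collected.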

Help Us!

This is obviously a small step into a large world. Is it a good first step? What do you think we should do differently? We’d love to hear from you, and thanks for reading!

[1] Of course, Firefox does a fine job of acting as application run-time; my point is that Google was the first to call out web applications as a distinct class of web content and to talk in terms of supporting these for their mainstream browser. Incidentally, Mozilla Labs’ Prism project sought to pioneer this idea years before.

[2] I’m using the term “leak” in a much more general way than is common in most developer communities. Traditionally, the term is applied to an application that allocates memory and then neglects to deallocate it when done. Because a language like JavaScript doesn’t allow developers to manually allocate or deallocate memory, it is impossible to leak at the JS level in this sense. But in my broader sense, any time a developer unintentionally creates memory footprint (e.g., by continuously storing objects in a hash in a mis-designed cache, etc.), I consider it a leak. This broader definition is borrowed from the Java community.


Comments

Ben. You are right on. This is one of the most painful aspects of building scalable Ajax apps. We have been tackling it as part of the development of feedly for about 6 months now. Here are some of the lessons we learned.

The most important leaks are related to the use of closures and the binding of JS code to DOM elements. A lot of frameworks (jQuery included) make developers think of binding event handlers to DOM elements as a one-way thing. In our experience, the Firefox collector has a hard time unbinding those relationships, which results in memory not getting freed.

The solution we have found is twofold: a lightweight binding framework which helps developers bind and unbind JS-DOM relationships, and a set of best practices to perform the unbind as soon and as aggressively as possible (i.e., as soon as a DOM fragment is replaced, all the references should be undone so that both the JS and the DOM can be GC’ed).
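In code, that kind of lightweight binding helper might look something like the following–a sketch with hypothetical names, not feedly’s actual framework. Every handler attached through the binder is recorded, so one call can drop all of a fragment’s JS-DOM references at replacement time:

```javascript
function makeBinder() {
  const bindings = [];
  return {
    bind(target, type, handler) {
      target.addEventListener(type, handler);
      bindings.push({ target, type, handler }); // remember for later cleanup
    },
    unbindAll() {
      for (const { target, type, handler } of bindings) {
        target.removeEventListener(type, handler);
      }
      bindings.length = 0; // release our own references too
    },
  };
}
```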

The other technique we have put in place is what we call inflating objects: when we know that there is a leak, we systematically go through each kind of JS object used in our app and inflate it by 50MB, then see whether the leak is aggravated or not. This way, by iteration, we can more easily find which references are to blame and create reproducible test cases.
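A sketch of that inflation technique (the helper and field names here are hypothetical): pad every instance of a suspect kind of object with ballast, so that if those instances are leaking, the leak becomes dramatically more visible in the OS-level memory graphs.

```javascript
// Attach a large padding array to an object; each number in a filled
// array costs roughly 8 bytes, so `bytes / 8` elements approximates the
// requested ballast size.
function inflate(obj, bytes) {
  obj.__ballast = new Array(Math.floor(bytes / 8)).fill(0);
  return obj;
}
```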

This is a huge pain. It would be awesome if you could provide a set of tools to automate and provide diagnostics. We would love to help in any way we can.

Being able to have a map, graph or list of items taking up memory (similar to a process/task manager, perhaps?) could be pretty handy for developers trying to track or manage the references between objects in their apps.

Memory leaks in Javascript were a big factor for me when working on the redesign of Yahoo! Photos in 2005, given we were trying to create a desktop-like experience with pagination, drag and drop etc. in a single page. We ended up hitting the “wall” in IE 6 from creating too much “stuff” (JS/DOM), and had to throw away objects to keep the overall working set down because the GC was apparently going into overtime trying to manage everything, slowing the whole browser down as a result.

I also implemented a manual form of object destructors to unhook event handlers, remove DOM references and so on to prevent leaks, which proved to be crucial in maintaining performance in IE 6 at the time.
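A minimal sketch of that manual-destructor pattern (hypothetical names, not the actual Yahoo! Photos code): each widget exposes a destroy() that unhooks its handler and drops its DOM reference, so neither side keeps the other alive.

```javascript
function makeWidget(element) {
  let el = element;
  const onClick = () => el && el.setAttribute('data-clicked', 'true');
  el.addEventListener('click', onClick);
  return {
    destroy() {
      if (el) {
        el.removeEventListener('click', onClick); // unhook the handler
        el = null; // break the JS-to-DOM reference so both can be GC'ed
      }
    },
  };
}
```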

Looking forward to seeing progress on this very promising tool. Ideally it would be on the same level as the Firebug object analyzer, but at this point anything that displays the heap will be of great value. Thanks.

A suggestion: sometimes when I’m wearing my user hat instead of my developer hat, I notice that my browser (FF3) uses more and more memory as time goes on. I suspect this may be due to add-ons–specifically Extensions. But I don’t know how to tell what extension is the culprit.

It would be nice to have a way to distinguish between memory allocated by the page and the browser vs. memory allocated by extensions.

I’m not sure what the developer benefit is, though, and I realize you’re mainly talking about a tool for developers. I’m talking about a tool that sophisticated users could use to determine which extension is eating all their RAM.

I suppose the developer tool would be one that lets you examine how much memory your FF extension is using.

The only downside I see is that it will take all the speculation out of dealing with a black-box garbage collector.

Where’s the fun in that?

What am I going to do with all my free time if I don’t have to scour through code line by line, dereferencing objects, crossing my fingers, refreshing the page and hoping to see a change in Task Manager?:D

In a large application there could easily be hundreds or even thousands of JS objects. Being able to view usage in terms of classes would be very handy. Maybe something similar to Firebug’s Profiler Report.

What stage of the development process is this in? Previously I attempted to implement a similar feature as a Firebug add-in, but shelved the project due to what appeared to be an inability to enumerate the closure scopes (statically) via the JSD interface. I’d be quite interested in discussing the architecture of this project if it is at a place where such discussions can occur.

This is great to read, especially because this is being tackled by you Mozilla guys. Exciting.

Last year when we had a memory-eating web app, we started a tool to basically watch the memory and the CPU; we called it xray :-). You can see a couple of screenshots in these slides: http://www.slideshare.net/nonken/dojo-and-adobe-air-presentation (after slide 17). We haven’t released anything yet, simply because … we are developers, we never get anything done enough to be confident enough to release it :-).
On the screenshots you can see the graphs for the memory and CPU load for any browser; additionally, the web app can send events which are then mapped into the graph, so you can see which event has what effect on the CPU or memory. The graphing app is an AIR app communicating with the web app to retrieve the events. It’s an external AIR app so as not to blur the actual browser stats. We have ideas about integrating dtrace into it and some other ideas of course.
What do you think, does this sound interesting? It is not yet as advanced as your thoughts on GC and stuff, but both sound like pieces of a puzzle that fit and just need to be put in place. We would be interested to see if you guys think this is interesting.

This is good, but I’m wondering when Mozilla will rise to the larger challenge pointed out by Google Chrome, that is the need for protected memory or browser window “sandboxing” or whatever you want to call it.

Forgive me if this has been covered in other blog posts, but there’s no obvious way to search your blog on this screen.

I am a little surprised to hear that no modern garbage collection is used as of yet in JavaScript/the browser. It sounds as if they are still using a stop-and-copy algorithm rather than a more modern one. It’s good to hear that they are doing something about it.

Thanks to everyone for the great feedback. I’m working with Wolfram and his collaborators to see if xray can be adapted to our needs.

Several folks have asked how to participate / contribute; I’ll do another post on that soon.

@Scott Schiller: Yes, definitely want both a graph and a list/table.

@Dennis Decker Jensen: Stop-the-world collection is still state-of-the-art. Java 6 has an experimental (i.e., not-on-by-default) concurrent collector, but it’s by no means mainstream and the serial and parallel stop-the-world collectors are used by 99% of Java apps out there.

ALL GCs I know have some “stop the world” phase. It’s just that JVM implementations use generational collectors, which usually stop the world only very briefly. I know people who use the Concurrent Mark and Sweep collector (which has even shorter stop-the-world phases) in production. On 64-bit machines with a lot of memory and a lot of cores, everything else is not really practicable.
