Links

Many people use OProfile for system measurements on Linux. It is a very handy profiling tool (it supports hardware performance counters and has very low runtime overhead), although it is still in alpha phase. Measuring application performance with OProfile is straightforward on Linux: get the latest OProfile for your machine or device, build it, configure and run the daemon, and start measuring an application or your whole system. Finally, you can run several scripts that report the performance numbers according to your reporting requests. It sounds easy, doesn't it?

The main focus of my next experiment is yet another allocator, TLSF (Two-Level Segregate Fit). According to the information published on its website, TLSF has bounded response time, fast and efficient allocation methods, and efficient memory usage. Furthermore, the site promises quite good memory fragmentation values. Can it demonstrate all this inside WebKit?

I found Emery Berger's allocator, Hoard. Hoard's homepage promises some attractive general qualities (fast, scalable, and memory-efficient). Do we need more than this? I tried out how it performs in WebKit.

After I benchmarked JavaScriptCore with our new participant, DLmalloc, it was suggested that I test it with QtLauncher as well. I compiled DLmalloc in thread-safe mode (USE_LOCKS=1), so it became capable of serving WebCore's memory requests. Perhaps another solution could have been to turn off every use of threads in QtLauncher/WebCore, but I think this would be a lucky approach...

The central data structure in JavaScriptCore is JSValue. It represents the value of a JavaScript variable. Since there are no type restrictions in JavaScript, a JSValue can contain anything from a simple bool constant to an object. Most operations work on these JSValues, so handling them efficiently is essential for a fast JavaScript engine. In JavaScriptCore there are three JSValue representations: JSValue32 and JSValue32_64 for 32-bit machines, and JSValue64 for 64-bit machines. Although maintaining all of these representations is a heavy burden, the feeling of having a fast JS engine is rewarding.
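To make the idea of "one value type that can hold anything" more concrete, here is a minimal sketch of a tagged 64-bit value. It is only an illustration: the class name SketchValue, its tag assignment, and its bit layout are my own assumptions and do not reproduce the actual layout of JSValue32, JSValue32_64, or JSValue64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch, NOT JavaScriptCore's real JSValue layout.
// One 64-bit word; the low 2 bits say what the remaining bits hold.
class SketchValue {
public:
    enum Tag : std::uint64_t { PointerTag = 0, IntTag = 1, BoolTag = 2 };

    static SketchValue fromInt(std::int32_t i) {
        // Store the 32-bit pattern above the tag bits.
        std::uint64_t payload = static_cast<std::uint32_t>(i);
        return SketchValue((payload << 2) | IntTag);
    }
    static SketchValue fromBool(bool b) {
        return SketchValue((static_cast<std::uint64_t>(b) << 2) | BoolTag);
    }
    static SketchValue fromObject(void* p) {
        std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
        // Assumption: objects are at least 4-byte aligned, so the
        // low 2 bits of the address are free to carry the tag.
        assert((bits & 3) == 0);
        return SketchValue(static_cast<std::uint64_t>(bits) | PointerTag);
    }

    bool isInt() const { return (m_bits & 3) == IntTag; }
    bool isBool() const { return (m_bits & 3) == BoolTag; }
    bool isObject() const { return (m_bits & 3) == PointerTag; }

    std::int32_t asInt() const {
        return static_cast<std::int32_t>(static_cast<std::uint32_t>(m_bits >> 2));
    }
    bool asBool() const { return (m_bits >> 2) != 0; }
    void* asObject() const {
        return reinterpret_cast<void*>(static_cast<std::uintptr_t>(m_bits));
    }

private:
    explicit SketchValue(std::uint64_t bits) : m_bits(bits) {}
    std::uint64_t m_bits;
};
```

With this sketch, SketchValue::fromInt(-7).asInt() gives back -7 without any heap allocation, which is the kind of cheap handling that makes a compact value representation attractive.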

Perhaps the first question to pop into somebody's mind is: why do we need a fast JavaScript parser? A browser does so many things, so why focus on JavaScript parsing, which probably consumes only a small amount of time? Earlier, we thought the same, but OProfile yielded some surprising results. The JavaScript parser was often responsible for 15-20% of the runtime of libWebKit. Further analysis has shown that popular websites usually come with big JavaScript source files (about 200-400k). These JS files contain a lot of source code, much of it browser-specific. The expensive operation here is the syntax checking of the whole source code, as the browser must reject the execution of invalid JS files.

There are a lot of custom allocators in the world, so let's try another promising one, called DLmalloc. It was made by Doug Lea. I've put DLmalloc into JavaScriptCore with the help of the custom allocation framework. I tested DLmalloc only in JavaScriptCore, because it didn't work well with QtWebKit's multi-threaded features. The measurements were made with QtWebKit, running on x86 Debian Lenny.

Since two of our tested allocators (TCmalloc and JEmalloc) benefit from multi-threaded environments, I did some benchmarking that uses threading effectively. I ran two instances of each popular benchmark simultaneously with the help of JavaScript workers. I benchmarked the Linux Qt port of WebKit, using official revision r54475. All measurements ran in QtLauncher and did minimal painting only.

JEmalloc is a highly scalable memory allocator made by Jason Evans. This is the default allocator of the FreeBSD operating system and Firefox's Linux/Windows versions, but how does it perform in WebKit?

It took a while, but now I'm happy to announce that all core classes in WebCore inherit from FastAllocBase. Together with the previous changes in JavaScriptCore, by now almost the whole world is a subclass of FastAllocBase. :-)
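For readers who wonder what inheriting from such a base class buys, here is a minimal sketch of the general technique: a base class that overrides operator new and operator delete so that every subclass is allocated through one chosen allocator. The names SketchAllocBase, sketchMalloc, sketchFree, and Node are hypothetical and stand in for the real ones; this is not the FastAllocBase source, and the counter exists only to make the routing visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-ins for the engine's allocator entry points.
// A custom allocator could be plugged in here in place of malloc/free.
static std::size_t g_sketchAllocations = 0;

static void* sketchMalloc(std::size_t size) {
    void* p = std::malloc(size);
    if (!p)
        throw std::bad_alloc();
    ++g_sketchAllocations; // count calls so the routing can be observed
    return p;
}

static void sketchFree(void* p) {
    std::free(p);
}

// Every class derived from this base is allocated and freed through
// sketchMalloc/sketchFree whenever it is created with new/delete.
class SketchAllocBase {
public:
    static void* operator new(std::size_t size) { return sketchMalloc(size); }
    static void operator delete(void* p) { sketchFree(p); }
    static void* operator new[](std::size_t size) { return sketchMalloc(size); }
    static void operator delete[](void* p) { sketchFree(p); }
};

// Example subclass: no allocation code of its own is needed.
class Node : public SketchAllocBase {
public:
    int value = 0;
};
```

The appeal of this design is that swapping the allocator for a whole class hierarchy becomes a change in one place, which is what makes allocator experiments like the ones above practical.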