Monday, December 28, 2009

Over the past month further work was done to improve ABCL startup times. This effort is especially directed at supporting ABCL on Google App Engine (GAE).

The trivial GAE example application in ABCL's source tree takes 19 seconds to start up, as mentioned in an earlier blog item. Although this is only a "Google theoretical time" - the page is actually served in 12 seconds - it's clearly a lot. Many GAE applications are said to have startup times between 5 and 10 seconds; it would surely be nice if our trivial application could get closer to that.

A number of different solutions have been evaluated:

Reducing the number of classes loaded at startup by making better use of ABCL's auto-loader facility

Supporting binary fasls

Creating a system for finer-grained auto-loading support

These scenarios in particular were evaluated because I looked for help with this performance issue on irc (irc://irc.freenode.net/#...); the answer was "you're doing too much during setup or at the first request". My first reaction was that we didn't have any options: that's how ABCL is designed. After some discussion, these scenarios came up though.

The first and third scenarios are the result of many profiling sessions of ABCL startup. The conclusion was that 35 to 45% of ABCL startup time is spent in Java reflection: when loading function classes, ABCL needs to look up a class's constructors in order to instantiate an object of that class.

Scenario 1 is about delaying loading of FASLs until a function in them is required. Scenario 3 goes into more detail about the use of a function: even when a FASL is loaded, not all functions in it will be used (immediately or ever). The idea behind scenario 3 is to delay reflection API access until a function is actually used.

Using scenario (1), startup times could be reduced somewhat, especially in the case of our minimal servlet application: it uses relatively few Lisp functions, and the ones it does use are related to printing and streams. Those are concentrated in a limited number of FASLs.

Implementing scenario (3) required quite a bit more effort. The basic idea - as explained above - is that many functions in a FASL won't be used until a later stage of the application, if ever. To be able to delay resolution of a function's bytecode, we introduced an object which - like the auto-loader - acts as a proxy for the unresolved function. This proxy class doesn't exhibit the same overhead, because it is resolved only once.

Upon the first call to the function, its bytecode gets resolved and the proxy in the function slot gets replaced with the actual function. After that, the first call is forwarded to the real function, as if it had been called directly. Although the actual implementation is a bit more complex to account for the loading of nested functions, that's basically it.
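The proxy mechanism described above can be sketched in Java roughly as follows. The names here are invented for illustration and don't match ABCL's actual classes; the point is only that the reflective lookup is deferred until the first call and then paid exactly once:

```java
import java.util.function.Supplier;

// Stand-in for ABCL's compiled-function base class.
abstract class LispFunction {
    abstract Object execute(Object arg);
}

final class FunctionProxy extends LispFunction {
    private final Supplier<LispFunction> resolver; // defers the reflection work
    private LispFunction resolved;                 // filled in on the first call

    FunctionProxy(Supplier<LispFunction> resolver) {
        this.resolver = resolver;
    }

    @Override
    Object execute(Object arg) {
        if (resolved == null) {
            // The expensive constructor lookup happens here, once; in ABCL the
            // proxy in the function slot is then replaced by the real function,
            // so later calls don't even reach this check.
            resolved = resolver.get();
        }
        return resolved.execute(arg); // forward the first call transparently
    }
}
```

Every call after the first goes straight to the resolved function, so FASLs whose functions are never invoked never trigger reflection at all.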

With scenario (3) applied to function definitions only, we were able to reduce the startup time of the first request on GAE from 19 seconds to 11 seconds (roughly 40%). Today, we started to apply the same strategy to macro functions too. The result - measured on my local PC, not on GAE - is a saving of roughly another 13%. Assuming the same applies to GAE (as it did with the earlier 40%), we've realized a saving of 50% in startup time!

Binary FASLs - scenario (2) - were an attempt to reduce the amount of work needed at startup: because the normal FASL loading process is driven by a text file containing Lisp code, that could have been one of the causes. We didn't remove support for them, but they didn't turn out to be a big saver. That can be explained by the fact that a binary FASL is just another ABCL function object which needs to be loaded using reflection.

All in all, we saved 50% of startup time. Let this be an invitation to start experimenting with ABCL on GAE.

Tuesday, November 10, 2009

Last spring, the Maxima challenge for ABCL was to get it to complete its test suite. We've had that bit of Maxima mastered for some months now. However, as it turned out soon after, Maxima runs rather slowly on ABCL. To some extent - being limited by the JVM - that's to be expected. The performance observed was way off base though: much too slow.

Through analysis, the cause was established to be the fact that Maxima declares lots of symbols to be special [and that ABCL doesn't offer a way to remove that specialness].

Last summer we found that allowing Maxima to undeclare specials increases ABCL performance immensely (roughly 35%). However, the final goal for Maxima is to use specials more sparingly, and declaring a variable unspecial (undeclaring it special) isn't defined in the spec. For these two reasons it didn't feel right to implement the solution at the time: it would have been a very Maxima-specific solution to a problem Maxima intends to fix in the long run.

Peter Graves noted that he had converted the special bindings storage in XCL to use the same scheme as SBCL/CCL: an array of active bindings instead of a linked list of bindings. He observed a performance gain of 10% in his tests.

Last weekend, I implemented the same scheme in ABCL, and although the general speed-up doesn't reach 10% in our tests [which may very well differ from Peter's], we observed a roughly 40% performance gain on Maxima's test suite!
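The array-based scheme can be sketched roughly like this; the names are illustrative and this is not ABCL's actual code. Each special variable gets a fixed index into a per-thread array, so looking up the current binding is a single array access instead of a walk down a chain, while an undo stack restores the previous value when a binding form is left:

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SpecialBindings {
    private final Object[] values;                // current value per symbol index
    private final Deque<Object[]> undo =
            new ArrayDeque<>();                   // saved {index, old value} pairs

    SpecialBindings(int symbolCount) {
        values = new Object[symbolCount];
    }

    Object lookup(int index) {
        return values[index];                     // O(1): no chain to walk
    }

    void bind(int index, Object newValue) {       // entering a binding form
        undo.push(new Object[] { index, values[index] });
        values[index] = newValue;
    }

    void unbind() {                               // leaving the binding form
        Object[] frame = undo.pop();
        values[(Integer) frame[0]] = frame[1];
    }
}
```

The win comes from the lookup path: a linked-list scheme pays a search proportional to the number of active bindings on every read of a special variable, while the array scheme pays a constant cost.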

Saturday, November 7, 2009

On behalf of the developers of ABCL (Armed Bear Common Lisp) I'm glad to be able to announce the 0.17.0 release.

This release features - among lots of other things - performance improvements, a fix for unexpected thread termination due to uncaught exceptions and example code for running ABCL on Google App Engine. Please refer to the release notes for the full list.

If you have questions regarding use or licensing, or you find issues, please report back to the development list:

Monday, October 26, 2009

Triggered by the interest of one of our users, last week was mostly dedicated to finding out whether - and how well - ABCL runs on Google App Engine (GAE). This is what we found out:

To those readers who don't know: GAE is an environment for hosting web applications, backed by Google's storage and server clouds. It supports running applications written in Python and Java.

GAE's Java environment turns out to be a servlet environment. This means there is a single servlet instance per JVM, which gets a chance to initialize itself in an init() method. The first request waits for this method to complete before it is processed. Google makes no guarantees regarding the number of JVMs your application might be running on concurrently, or the lifetime of a single JVM: when GAE needs memory to run other apps, your JVM might get torn down while not serving any requests.

Knowing the above, getting ABCL to run on GAE involved several steps:

Getting the Java SDK for GAE (don't forget to get Java's SDK too!)

Implementing Java classes wrapping ABCL

Writing a minimal servlet in ABCL

Step (1) turns out to be rather easy: just get it from the GAE website, unzip it and - if you want your local paths to match the examples on their site - rename it to remove the version number at the end (appengine-java-sdk-1.2.6 becomes appengine-java-sdk).

Step (2) turned out to be a bit more involved, but after some twiddling, we found that we needed a minimum of 2 classes: at least one servlet class and a singleton class which loads a single ABCL instance into the JVM. [Note: a web application may contain any number of servlets, with a minimum of 1.] The resulting application classes were committed to the ABCL repository.
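The singleton side of this can be sketched as follows. The names are invented and a plain Object stands in for the ABCL interpreter instance; the point is that however many servlet instances exist, only one Lisp runtime is ever loaded per JVM, and only the first request pays for it:

```java
final class RuntimeHolder {
    private static Object runtime;   // stand-in for the ABCL interpreter object
    static int initializations = 0;  // exposed here only to illustrate the point

    static synchronized Object instance() {
        if (runtime == null) {
            initializations++;
            runtime = new Object();  // the expensive ABCL load would happen here
        }
        return runtime;
    }
}
```

A servlet's init() method would simply call RuntimeHolder.instance() and hold on to the result.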

From there, a minimal "Hello world" web app was easy, making step (3) a quick one. The end result was entirely committed to the examples directory in ABCL's repository.

GAE offers a performance dashboard to monitor your application through an administration web interface. From there, you can check the application logs and response times (called latency in the dashboard) and see how much CPU your application is using. For the latter, they use an indicative measure: the time it would have taken to handle the request on an unloaded 1.2GHz Core2 processor. This compensates for many of the variances in the Google infrastructure which influence how long it actually takes to handle a request.

With a working application in place, the next step was performance. Most notably that of the first request: all subsequent requests are handled within milliseconds (7 to 15 milliseconds), so there's no issue there. This is the part that dominated last week: it turned out that although the latency was around 12 seconds, the CPU consumption was around 19 seconds [1], both very high and said to be close to some upper limit which remains unspecified.

We're striving to get these figures down: even though they wouldn't really impact operation of a servlet in a regular hosting situation, GAE's routine servlet restarting makes these times more important. The best way to reduce figures like these is to first bring down the figures the application scores on your local system. One of the first things that comes to mind is ABCL's "long" startup time: on my local machine (1.6GHz Core2, 1GB RAM) it takes 1.7 seconds.

More on the steps we took to optimize this startup time in a later blog post.

Conclusion: ABCL - if you accept the long initial request response - is definitely an option for writing your web applications in Common Lisp on a Java/JVM based infrastructure. It'll even run on Google App Engine. We'll keep you posted on how we fare on supporting that even better!

[1] In comparison: On the Clojure mailing list, 5.5 seconds is mentioned for Clojure and on #appengine (on irc.freenode.net), 7 to 10 seconds are said to be normal for JVM based apps.

ABCL's CLOS is demonstrably improving and, while possibly still its weakest spot, it's definitely becoming usable. If you find performance issues, preferably with example code to show the issue, please report them to the Armed Bear mailing list (address information is on the project front page).

[Note added: Though 90% improvement is hard to achieve, another 30% (24 seconds to 16 seconds) was realised last weekend.]

Tuesday, October 6, 2009

At first glance, compilation of TAGBODY and BLOCK forms seems simple: the tags in the TAGBODY are all static and the exit point for the BLOCK is also well defined. Summarizing: no complexities with the dynamic environment of any kind.

However, taking a closer look and adding closures to the picture, things get a little more complicated: GO or RETURN-FROM can mean transfers of control to exit points outside the current function. Consider the snippet below:

(block NIL (funcall (lambda () (return-from NIL 3))) 4)

Apart from the fact that the example is a bit too obvious - the lambda form can simply be inlined - it demonstrates what I mean by "RETURN-FROM will cause a transfer of control to a non-local exit point": the exit point to jump to is located outside the lambda.

So far, so good: the above has existed in ABCL for a long time and is achieved by raising Java exceptions. Now things get more contrived: we'll use a closure created inside a BLOCK form which is part of a recursively called function, for example:

(defun foo (f) (block B (if f (funcall f) (foo (lambda () (return-from B 1)))) 2))

To which exit point would you expect the lambda to return when (foo nil) is evaluated? You should expect it to return to the exit point associated with the B block of the first FOO call. Now it's becoming clear what's so hairy about compiling BLOCK (and TAGBODY, which has the same issue): there are 2 B blocks on the stack, so which one to return to?

The way this last issue - recursive BLOCK and TAGBODY forms - was solved in ABCL just last weekend was to create a (hidden) variable which gets set to a certain unique value upon entry of the block at run time. Then, any non-local transfer of control uses that block identifier value to find the right block to jump to.

Actually, this solution has a slight advantage for TAGBODY over the pre-existing solution: the old solution checked all symbols in a TAGBODY before concluding a GO wasn't meant for the given tagbody. With the new approach, a single variable test (equality of object pointers) determines whether the GO is meant for any given TAGBODY or stack unwinding should continue.
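The mechanism can be modeled in plain Java along these lines; the names are invented and this is not ABCL's real code. Each run-time entry of a block allocates a fresh identity object - the hidden variable - and a non-local exit carries that identity, so every enclosing block needs only a single pointer-equality test to decide whether the exit is meant for it:

```java
final class NonLocalExit extends RuntimeException {
    final Object blockId;   // identity of the block activation to unwind to
    final Object value;

    NonLocalExit(Object blockId, Object value) {
        this.blockId = blockId;
        this.value = value;
    }
}

final class BlockDemo {
    // Models a recursive FOO with a block B: the first call creates the
    // closure's target identity, the second call throws past its own block B.
    static Object foo(Object outerId) {
        Object myId = new Object();              // hidden variable, set on entry
        try {
            if (outerId == null) {
                return foo(myId);                // recurse: two Bs on the stack
            }
            throw new NonLocalExit(outerId, 1);  // RETURN-FROM the *outer* B
        } catch (NonLocalExit e) {
            if (e.blockId == myId) {
                return e.value;                  // ours: stop unwinding here
            }
            throw e;                             // not ours: keep unwinding
        }
    }
}
```

Here foo(null) recurses once, so two activations of "block B" are live; the exception carries the outer activation's identity and sails past the inner block with a single pointer comparison.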

But then there's another issue: closures can be assigned to variables which outlive the extent of the originating block or tagbody. Like the snippet below:

(progn (block B (setq a (lambda () (return-from B 3)))) (funcall a))

As indicated above, ABCL uses Java exceptions for non-local transfers of control. Now, if the (funcall a) form threw its Go exception regardless of the fact that there's no longer a matching try/catch block, the exception would remain unhandled and the processing thread would exit. This was the situation in ABCL as it existed before last weekend.

Now, the solution to the problem with the recursive function calls has a nice additional benefit. Since there's storage shared between the closure and the BLOCK - they now share a variable - that variable can be used to let the block communicate to the closure that its extent has ended, by setting it to a specific value. The GO form can then check for that condition before it throws the actual Go exception, making sure an exception is only thrown when there's a matching try/catch block. If there's no such block, ABCL now generates a call to ERROR, allowing interactive error handling and selection of restarts, where it used to unwind the stack to some location which happened to catch the exception - or terminate the thread if that didn't happen. Quite an improvement, I'd say.
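The extent check can be sketched like this, again with invented names. On normal exit the block overwrites the shared identifier with a sentinel; an escaping closure tests for the sentinel before throwing, and signals an error (an IllegalStateException stands in here for CL's ERROR) instead of throwing an exception that nothing will catch:

```java
final class BlockExit extends RuntimeException {
    final Object blockId;
    BlockExit(Object blockId) { this.blockId = blockId; }
}

final class ExtentDemo {
    static final Object ENDED = new Object();    // sentinel: extent is over

    static Runnable escaper(Object[] idCell) {
        return () -> {
            if (idCell[0] == ENDED) {            // check before throwing
                throw new IllegalStateException("block extent has ended");
            }
            throw new BlockExit(idCell[0]);
        };
    }

    static Runnable runBlockAndReturnClosure() {
        Object[] idCell = { new Object() };      // shared with the closure
        try {
            return escaper(idCell);              // block body: just make closure
        } finally {
            idCell[0] = ENDED;                   // extent ends with the block
        }
    }
}
```

Calling the returned closure after the block has exited now produces a proper error instead of an exception that unwinds to an arbitrary handler or kills the thread.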

As a consequence of the changes described above, the code presented in the lisp paste at http://paste.lisp.org/display/88240 - which tests the full requirements of the CL spec - does succeed with today's ABCL (whereas it didn't last week!)...

Tuesday, September 15, 2009

Currently ABCL is a pretty decent Common Lisp that runs on the JVM, but we haven't really started to add the features which cater to making that "special" relationship easier. One of these rough spots involves how one packages ABCL applications for distribution. Since JAR files are currently the natural base unit for the distribution of JVM packages, it would make sense if one could load ABCL FASLs from JAR files. Indeed, this is an often-requested feature on the armedbear-devel mailing list. Right after we published abcl-0.16.0, one of the first features to hit the trunk was the ability to load FASLs from JAR files, which we would like to explain a little bit about here.

ABCL has long had an extension to the semantics of Common Lisp PATHNAME that allowed one to specify entries of JAR files. Typing the following at an ABCL REPL:

CL-USER> (defvar *jar-entry* #p"jar:file:/home/evenson/foo.jar!/bar.abcl")

would create a reference in *JAR-ENTRY* to a PATHNAME that carries the path of the JAR in its DEVICE field, with the remainder of the PATHNAME referring to the actual JAR entry. I think these semantics were used by J to load extension functions, but they had been unused in the ABCL codebase.

Now, *JAR-ENTRY* has the following parts:

CL-USER> (pathname-device *jar-entry*)

#P"/home/evenson/foo.jar"

CL-USER> (pathname-name *jar-entry*)

"foo"

Note that DEVICE is actually a reference to another PATHNAME at this point.

Evaluating (load *jar-entry*) would then load the code compiled into the 'bar.abcl' FASL that was packed into the JAR. The ".abcl" extension is not strictly necessary: we check for entries ending in ".abcl" and ".lisp" for you, in the same way LOAD works on the filesystem.
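Under the hood, loading such an entry boils down to opening the JAR and streaming the named entry. A minimal sketch using only the JDK - this is an illustration, not ABCL's actual loading code:

```java
import java.io.InputStream;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

final class JarEntryReader {
    // Returns the raw bytes of an entry such as "bar.abcl" inside foo.jar.
    static byte[] read(String jarPath, String entryName) throws Exception {
        try (JarFile jar = new JarFile(jarPath)) {
            JarEntry entry = jar.getJarEntry(entryName);
            if (entry == null) {
                throw new IllegalArgumentException("no such entry: " + entryName);
            }
            try (InputStream in = jar.getInputStream(entry)) {
                return in.readAllBytes();
            }
        }
    }
}
```

ABCL's LOAD then only needs to hand the resulting stream to its normal FASL machinery.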

What doesn't work yet is MERGE-PATHNAMES with these special PATHNAMEs, or looking for the JAR file on the current CLASSPATH. With these sorts of conveniences it will be possible to easily include ASDF-packaged systems in a JAR. Stay tuned!

Sunday, September 6, 2009

On behalf of the developers of ABCL (Armed Bear Common Lisp) - a lisp implementation running on the JVM - I'm glad to be able to announce the 0.16.0 release.

This release features - among lots of other things - performance improvements, better type checking for the THE form and ANSI test fixes. Starting with this release, JSR-223 support is delivered in the sources (this corrects an error in the procedure for earlier releases). You can find the release notes at:

Sunday, August 23, 2009

ABCL's compiler stores properties of blocks of code in a structure. Although it recognizes a number of different types of blocks (BLOCK, TAGBODY, etc.), there's only one structure type - called BLOCK-NODE.

While cleaning up this situation, separating the different block-node uses into different structures, I found that ABCL didn't verify the arguments passed to the accessor functions for structure slots.

Cutting a long story short, we had to implement:

Structure type verification in the accessor functions

THE special operator type verification in the interpreter

THE special operator type verification for other policies than a *safety* value of 3 in the compiler

The first point was an issue with the accessor functions generated by ABCL: they didn't generate code to verify the argument passed. The effect was that a different structure with the same (or a larger) number of slots could be passed in without an error occurring.
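An illustrative sketch of what the fix adds - the representation and names here are made up, not ABCL's generated code. The accessor verifies the structure's type tag before indexing into the slot vector, so passing a different structure with the same number of slots now signals an error instead of silently returning a foreign slot value:

```java
final class StructureObject {
    final String typeName;   // the structure's type tag
    final Object[] slots;

    StructureObject(String typeName, Object... slots) {
        this.typeName = typeName;
        this.slots = slots;
    }
}

final class Accessors {
    // Models an accessor generated for the first slot of a BLOCK-NODE.
    static Object blockNodeName(StructureObject o) {
        if (!o.typeName.equals("BLOCK-NODE")) {  // the added verification
            throw new ClassCastException("not a BLOCK-NODE: " + o.typeName);
        }
        return o.slots[0];
    }
}
```

Without the type test, a TAGBODY-NODE with the same slot layout would be accepted and its first slot returned as if it were a block name.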

The second point was the issue that - even if there was a THE form - the interpreter would never verify the specified type; it acted as if the form wasn't there. Talking to Peter Graves, I found that he had never intended the interpreter to be a full Common Lisp interpreter, meaning ABCL was designed as a compiler-only system; the interpreter was merely there as a bootstrapping mechanism. With all the energy spent last year to get it to the same level of CL conformance as the compiler, this point just had to be ironed out.

The third point was the issue that the compiler would treat THE as TRULY-THE for any *safety* value other than 3. This is clearly not strict enough: it means no type verification takes place at all at those levels, while the user may expect some level of type verification at any *safety* level other than zero.

Now, I can continue the reorganization of the compiler code with a safety-belt on: with the right *safety* setting, I know my structure types (and their changes) are being verified!

As a general benefit, this applies to all code running in ABCL, of course. Should you want to prevent type verification in your code (for example, for speed reasons), just use a *safety* value of zero. In that case, the compiler simply assumes the type fits.

Monday, August 10, 2009

Well, we've had a nice breakthrough on our self-set applicability target. As explained before, a Common Lisp implementation isn't of much use if it's not able to run much of the already-existing software available for the language. As an indicator we use ABCL's ability to run a number of software packages. Maxima is one of them.

Last autumn, ABCL wasn't able to complete the test suite delivered with Maxima. This spring ABCL would run it, but with many errors. Due to continued efforts - from both sides - and mainly the fact that Maxima changed their number comparisons to be CL-compliant (EQL), there are only 3 failures remaining - out of over 4500.

The remaining 3 failures are clearly ABCL issues, where the outcome returned is of lesser precision than Maxima's test suite expects. So, even though we're not completely there yet, I'd say we're definitely usable with Maxima.

A note to add would be that our performance with Maxima is less than acceptable: a lot slower than any other Lisp implementation. The performance issue will also be addressed from both sides: we'll research how to improve performance (generally) on our side. One thing we know to be a performance issue with Maxima is its over-use of special variables. This is - according to Robert Dodier - being addressed on their side.

With the specials over-use fixed, ABCL can be over 40% faster, as shown by a local hack compensating for part of the over-use. Unfortunately, the hack can't make it into the ABCL repository as a general solution.

Work is under way though to improve repeated special binding access. Such repetitions occur if a loop uses a special variable as a looping variable, or if a special variable is used to collect results. The former is a use-case in Maxima's code; the latter is a pattern regularly seen in ABCL's compiler.

In ABCL's current code, each time a special variable is accessed, be it for reading or writing, the binding is looked up. The work focuses on reusing a binding after it has been looked up once. In most cases this won't make a performance difference - the binding will be used only once - but in some applications it may improve performance quite a bit.
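A sketch of the optimization, with invented names: look the binding up once outside the loop and reuse the binding object, instead of paying a lookup on every iteration. A binding is modeled here as a one-element value cell:

```java
import java.util.HashMap;
import java.util.Map;

final class DynamicEnv {
    private final Map<String, Object[]> bindings = new HashMap<>();

    // Returns the (lazily created) value cell bound to the given name.
    Object[] lookup(String name) {
        return bindings.computeIfAbsent(name, k -> new Object[1]);
    }
}

final class CachedBindingDemo {
    // Models a loop that collects a result in a special variable: the
    // binding is looked up once and the cell is reused inside the loop.
    static long collect(DynamicEnv env, int n) {
        Object[] cell = env.lookup("*total*");   // single lookup, reused below
        cell[0] = 0L;
        for (int i = 1; i <= n; i++) {
            cell[0] = (Long) cell[0] + i;        // no per-iteration lookup
        }
        return (Long) cell[0];
    }
}
```

This mirrors both use-cases named above: a special used as a looping variable and a special used to collect results.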

Sunday, July 12, 2009

Last time was about how the team is working to advance ABCL's correctness and conformance; these two are mostly measured by the ANSI tests. Next to those characteristics, we also introduced a measure called 'applicability'. It's a qualitative measure which we say increases when (more) existing Lisp code is made to (or is proven to) work with ABCL.

So far, we've been using the following software to test (and improve) the applicability of ABCL:

Next to the above, we know a number of people are using ABCL to write their own applications; see the testimonials page. If you have software to complete the list of known-working software, please contact our developers list. If you have software which is almost working, please help us improve our applicability by at least reporting it; patches to resolve the situation are appreciated even more.

Another aspect of our user experience is performance. Being an implementation on the JVM, we're of course restricted in many ways in which code we can generate and how we store our data. As a result, we start from a disadvantageous position.

That doesn't mean there's no room for improvement in ABCL at the moment. The development team has gradually moved into the area of profiling and improving the implementation over the last 2 or 3 months. As it turns out, there's still a lot that can be improved; for example:

Support for unboxed single-floats and double-floats in the compiler

Careful creation of new objects - focussing on re-use - in the support library

Last year around August, Peter Graves asked me (Erik Huelsmann) to take over ABCL (Armed Bear Common Lisp) development. I gladly accepted.

Since then we - a small team of Lisp hackers - have been working hard to improve ABCL. Yesterday it was pointed out to me that the Lisp community was probably unaware of the renewed energy being put into ABCL's development.

Our efforts have concentrated on 3 areas of improvement:

Correctness - doing right what is already implemented

Conformance - implementing what the spec says should be

Features - things the spec doesn't require, but which are too handy not to provide standard

With respect to correctness and conformance we are glad to be able to say that the ANSI test failures have been drastically reduced to below 40 (coming from hundreds).

With respect to the features, we're proud of our achievements, which are two-fold. First of all, we have added some Lisp features most implementations provide:

compiler improvements such as unboxed local variables

MACROEXPAND-ALL

COMPILER-LET and

locking primitives for threading

lots of other stuff;

second, we added some features which matter for a Lisp in a Java environment:

JSR-223 support: being a scripting engine for any Java application

Threading primitives (different ones) which interoperate with the Java world

Improved Ant-based build system

more...

But we did more: ABCL was separated out of the source tree of the J editor (and moved to common-lisp.net). A wiki, mailing lists, new project pages, a repository and a defect tracker have been set up. A bi-monthly release schedule has been put into place - source releases only for now, but we're working toward binary releases too.

On this blog, I will discuss the advances and difficulties that make up the ABCL development experience.