Monday, July 26, 2010

So, EuroPython 2010 is over, I am flying home and it's time to write a report
about the conference from the PyPy point of view.

As usual, the conference was very interesting and went very well. The quality
of the talks I attended to was high on average and most importantly I could
meet a lot of interesting people to discuss various things.

On the first day, Armin, Amaury and I presented the usual PyPy status talk
(here are the slides):
the talk is an extended version of the one that I and Armin presented at
Pycon Italia in May and is divided in three parts: first I talked about the
current status of the project, what is the content of the recent 1.2 and 1.3
releases and showed a demo of a simple Django application that renders a
Mandelbrot fractal and is measurably faster on PyPy than on CPython. In the
second part of the talk, Armin gave an introduction about the ideas that stand
behind the JIT. Finally, in the third part Amaury explained how the new
cpyext module lets PyPy to compile and load existing CPython extensions
written in C.

I think that the talk was well received: the only drawback is that there was
no time to answer questions at the end of the presentation. However, we
received a lot of "offline" questions after the talk finished and thorough the
whole conference: it is always great to see that people are interested in our
work, and I'd like to thank everybody for the feedback that they gave to us.

PyPy was also mentioned in the interesting Mark Shannon's talk, where he
compared the optimization techniques used by PyPy, Unladen Swallow and
HotPy, which is Mark's own PhD project. Moreover, Henrik Vendelbo
gave a talk about how to tweak PyPy to produce a standalone
executable which embeds a whole python application to make deployment easier,
while Andrew Francis explained his implementation of the Go select
statement based on the stackless.py module implemented in PyPy. Personally,
I am glad to see that people start to think of PyPy as a useful starting
point to experiment with new features and use cases that we did not think
about: after all, one of PyPy explicit goals is to be "flexible and easy to
experiment with".

After the conference there were the usual post EuroPython sprints: this
year we had not planned a PyPy sprint, but some people showed interest
in it and since Armin and I happened to be still around the day after the
conference, we decided to do a mini 1-day sprint, with 6 or 7 people
present. Since there were only two core developers it was impossible to use
our usual pairing scheme, in which every newcomer pairs with someone who is
experienced with the source code to gain knowledge of it. However, I think it
was still a successful day of work, and we managed to fix a couple of bugs
that was standing in our issue tracker. Again, I'd like to thank all the
people that came and worked with us during the sprint.

In conclusion I really enjoyed the EuroPython 2010 experience: the fact that I
managed to find a place in Birmingham where to eat a good Italian-style "gelato"
helped a lot :-).

So, EuroPython 2010 is over, I am flying home and it's time to write a report
about the conference from the PyPy point of view.

As usual, the conference was very interesting and went very well. The quality
of the talks I attended to was high on average and most importantly I could
meet a lot of interesting people to discuss various things.

On the first day, Armin, Amaury and I presented the usual PyPy status talk
(here are the slides):
the talk is an extended version of the one that I and Armin presented at
Pycon Italia in May and is divided in three parts: first I talked about the
current status of the project, what is the content of the recent 1.2 and 1.3
releases and showed a demo of a simple Django application that renders a
Mandelbrot fractal and is measurably faster on PyPy than on CPython. In the
second part of the talk, Armin gave an introduction about the ideas that stand
behind the JIT. Finally, in the third part Amaury explained how the new
cpyext module lets PyPy to compile and load existing CPython extensions
written in C.

I think that the talk was well received: the only drawback is that there was
no time to answer questions at the end of the presentation. However, we
received a lot of "offline" questions after the talk finished and thorough the
whole conference: it is always great to see that people are interested in our
work, and I'd like to thank everybody for the feedback that they gave to us.

PyPy was also mentioned in the interesting Mark Shannon's talk, where he
compared the optimization techniques used by PyPy, Unladen Swallow and
HotPy, which is Mark's own PhD project. Moreover, Henrik Vendelbo
gave a talk about how to tweak PyPy to produce a standalone
executable which embeds a whole python application to make deployment easier,
while Andrew Francis explained his implementation of the Go select
statement based on the stackless.py module implemented in PyPy. Personally,
I am glad to see that people start to think of PyPy as a useful starting
point to experiment with new features and use cases that we did not think
about: after all, one of PyPy explicit goals is to be "flexible and easy to
experiment with".

After the conference there were the usual post EuroPython sprints: this
year we had not planned a PyPy sprint, but some people showed interest
in it and since Armin and I happened to be still around the day after the
conference, we decided to do a mini 1-day sprint, with 6 or 7 people
present. Since there were only two core developers it was impossible to use
our usual pairing scheme, in which every newcomer pairs with someone who is
experienced with the source code to gain knowledge of it. However, I think it
was still a successful day of work, and we managed to fix a couple of bugs
that was standing in our issue tracker. Again, I'd like to thank all the
people that came and worked with us during the sprint.

In conclusion I really enjoyed the EuroPython 2010 experience: the fact that I
managed to find a place in Birmingham where to eat a good Italian-style "gelato"
helped a lot :-).

Friday, July 9, 2010

The last five days we have been sprinting in a meeting room in the Computing
Center at CERN in Genève, Switzerland. Present are Armin Rigo, Antonio Cuni,
Carl Friedrich Bolz and Wim Lavrijsen (LBL). The goal of the sprint was to use
some of the C++ technology developed at CERN to make it possible to use C++
libraries from PyPy's Python interpreter. For this we used the Reflex
library, which provides reflection information for C++ classes. We discussed
using Reflex in PyPy during the Düsseldorf sprint of 2008, please read
that blog post if you want some more details on how Reflex works. There is
support for this sort of C++/Python integration also for CPython, using the
PyROOT module.

The sprint was very successful. On Monday we had a few discussion about how
Reflex could best be integrated with PyPy. One of the goals of the sprint was to
make the approach JIT-friendly from the start, so that calls to C++ libraries
can be reasonably fast. After the discussion we started coding on the
reflex-support branch. This branch adds a new cppyy builtin module to
PyPy's Python interpreter (why we chose that name is left as an exercise to the
reader). This module can be used to load C++ classes, construct instances and
call static and instance methods on them.

The work has just started, as of now, the argument and return types of the
methods are restricted to some simple C types, such as int, double and
char* and pointers to class instances. Most of the work necessary to
properly resolve overloaded methods is done, but default arguments are not.

We also did some very early JIT work and some early performance measurements.
The rough figures are that cppyy is two times faster at calling a simple C++
method from Python than PyROOT. To get a feeling for how fast things could
go in the end, we also implemented a proof-of-concept for some more advanced JIT
technology (which requires a patch for Reflex and uses a GCC extension). With
this, the speedup over PyROOT is a factor of 20. Of course, this is still a
lot slower than a C++ to C++ method call (probably by at least an order of
magnitude).

The sprint was very productive because we managed to get the right people into
the same room working together. Wim has a lot of experience with C++ and Reflex,
and is the author of PyROOT, and of course the others know a lot about PyPy
(at the end of the sprint, Anto was very glad that he stopped using C++ a long
time ago). Also, working at CERN was very cool. The atmosphere is amazing, and
we got to visit the ATLAS control room. Extremely advanced technology, and
also research on a completely different scale than what we are used to.

The last five days we have been sprinting in a meeting room in the Computing
Center at CERN in Genève, Switzerland. Present are Armin Rigo, Antonio Cuni,
Carl Friedrich Bolz and Wim Lavrijsen (LBL). The goal of the sprint was to use
some of the C++ technology developed at CERN to make it possible to use C++
libraries from PyPy's Python interpreter. For this we used the Reflex
library, which provides reflection information for C++ classes. We discussed
using Reflex in PyPy during the Düsseldorf sprint of 2008, please read
that blog post if you want some more details on how Reflex works. There is
support for this sort of C++/Python integration also for CPython, using the
PyROOT module.

The sprint was very successful. On Monday we had a few discussion about how
Reflex could best be integrated with PyPy. One of the goals of the sprint was to
make the approach JIT-friendly from the start, so that calls to C++ libraries
can be reasonably fast. After the discussion we started coding on the
reflex-support branch. This branch adds a new cppyy builtin module to
PyPy's Python interpreter (why we chose that name is left as an exercise to the
reader). This module can be used to load C++ classes, construct instances and
call static and instance methods on them.

The work has just started, as of now, the argument and return types of the
methods are restricted to some simple C types, such as int, double and
char* and pointers to class instances. Most of the work necessary to
properly resolve overloaded methods is done, but default arguments are not.

We also did some very early JIT work and some early performance measurements.
The rough figures are that cppyy is two times faster at calling a simple C++
method from Python than PyROOT. To get a feeling for how fast things could
go in the end, we also implemented a proof-of-concept for some more advanced JIT
technology (which requires a patch for Reflex and uses a GCC extension). With
this, the speedup over PyROOT is a factor of 20. Of course, this is still a
lot slower than a C++ to C++ method call (probably by at least an order of
magnitude).

The sprint was very productive because we managed to get the right people into
the same room working together. Wim has a lot of experience with C++ and Reflex,
and is the author of PyROOT, and of course the others know a lot about PyPy
(at the end of the sprint, Anto was very glad that he stopped using C++ a long
time ago). Also, working at CERN was very cool. The atmosphere is amazing, and
we got to visit the ATLAS control room. Extremely advanced technology, and
also research on a completely different scale than what we are used to.

Saturday, July 3, 2010

Recently, I've become aware of the SPUR project of Microsoft Research and
read some of their papers (the tech report "SPUR: A Trace-Based JIT Compiler
for CIL" is very cool). I found the project to be very interesting and since
their approach is in many ways related to what PyPy is doing, I now want to
compare and contrast the two projects.

A Tracing JIT for .NET

SPUR consist of two parts: On the one hand it is a VM for CIL, the
bytecode of the .NET VM. This VM uses a tracing JIT compiler to compile the
programs it is running to machine code. As opposed to most existing VMs that
have a tracing JIT it does not use an interpreter at all. Instead it
contains various variants of a JIT compiler that produce different versions of
each method. Those are:

a profiling JIT which produces code that does lightweight profiling when
running the compiled method

a tracing JIT which produces code that produces a trace when running the
compiled method

a transfer-tail JIT which is used to produce code which is run to get from a
failing guard back to the normal profiling version of a method

an optimizing JIT that actually optimizes traces and turns them into machine code

Optimizations Done by the Optimizing JIT

SPUR's optimizing JIT does a number of powerful optimizations on the traces before it
turns them into machine code. Among them are usual compiler optimizations such
as register allocation, common subexpression elimination, loop invariant code
motion, etc.

It also performs some optimizations that are specific to the tracing context and
are thus not commonly found in "normal" compilers:

guard implication: if a guard is implied by an earlier guard, it is removed

guard strengthening: if there is a sequence of guards that become stronger
and stronger (i.e. each guard implies the previous one), the first guard in
the sequence is replaced by the last one, and all others are removed. This can
greatly reduce the number of guards and is generally safe. It can shift a
guard failure to an earlier point in the trace, but the failure would have
occurred at some point in the trace anyway.

load/store optimizations: this is an optimization for memory reads/writes.
If several loads from the same memory location occur without writes in
between, all but the first one are removed. Similarly, if a write to a memory
location is performed, this write is delayed as much as possible. If there is
a write to the same location soon afterwards, the first write can be removed.

escape analysis: for allocations that occur in a loop, the optimizer checks
whether the resulting object escapes the loop. If not, the allocation is moved
before the loop, so that only one object needs to be allocated, instead of one
every loop iteration.

user-controlled loop unrolling: not exactly an optimization, but an
interesting feature anyway. It is possible to annotate a CIL method with a
special decorator [TraceUnfold] and then the tracing JIT will fully unroll
the loops it contains. This can be useful for loops than are known to run a
small and fixed number of iterations for each call-site.

user controlled tracing: The user can also control tracing up to a point.
Methods can be annotated with [NativeCall] to tell the tracer to never
trace their execution. Instead they appear as a direct call in the trace.

A JavaScript Implementation

In addition to the tracing JIT I just described, SPUR also contains a JavaScript
implementation for .NET. The approach of this implementation is to translate
JavaScript to CIL bytecode, doing some amount of type inference to detect
variables that have fixed types. All operations where no precise type could be
determined are implemented with calls to a JavaScript runtime system, which does
the necessary type dispatching. The JavaScript runtime is implemented in C#.

The JavaScript implementation and the CLI VM with a tracing JIT sound quite
unrelated at first, but together they amplify each other. The tracing JIT traces
the JavaScript functions that have been translated to CLI bytecode. Since the
JavaScript runtime is in C#, it exists as CLI bytecode too. Thus it can be
inlined into the JavaScript functions by the tracer. This is highly beneficial,
since it exposes the runtime type dispatching of the JavaScript operations to
the optimizations of the tracing JIT. Particularly the common expression
elimination helps the JavaScript code. If a series of operations is performed on
the same object, the operations will all do the same type checks. All but the
type checks of the first operation can be removed by the optimizer.

Performance Results

The speed results of the combined JavaScript implementation and tracing JIT are
quite impressive. It beats TraceMonkey for most benchmarks in SunSpider (apart
from some string-heavy benchmarks that are quite slow) and can compete with V8
in many of them. However, all this is steady-state performance and it seems
SPUR's compile time is rather bad currently.

Further Possibilities

A further (so far still hypothetical) advantage of SPUR is that the approach can
optimize cases where execution crosses the border of two different systems. If
somebody wrote an HTML layout engine and a DOM in C# to get a web browser and
integrated it with the JavaScript implementation described above, the tracing
JIT could optimize DOM manipulations performed by JavaScript code as well as
callbacks from the browser into JavaScript code.

Of course the approach SPUR takes to implement JavaScript is completely
generalizable. It should be possible to implement other dynamic languages in the
same way as JavaScript using SPUR. One would have to write a runtime system for
the language in C#, as well as a compiler from the language into CIL bytecode.
Given these two elements, SPUR's tracing JIT compiler would probably do a
reasonable job at optimizing this other language (of course in practise, the
language implementation would need some tweaking and annotations to make it
really fast).

Comparison With PyPy

The goals of PyPy and SPUR are very similar. Both projects want to implement
dynamic languages in an efficient way by using a tracing JIT. Both apply the
tracing JIT "one level down", i.e. the runtime system of the dynamic language is
visible to the tracing JIT. This is the crucial point of the approach of both
projects. Since the runtime system of the dynamic language is visible to the
tracing JIT, the JIT can optimize programs in that dynamic language. It does not
itself need to know about the semantics of the dynamic language. This makes the
tracing JIT usable for a variety of dynamic languages. It also means that the
two halves can be implemented and debugged independently.

In SPUR, C# (or another language that is compilable to CIL) plays the role of
RPython, and CIL is equivalent to the intermediate format that PyPy's
translation toolchain uses. Both formats operate on a similar abstraction level,
they are quite close to C, but still have support for the object system of their
respective language and are garbage-collected.

SPUR supports only a JavaScript implementation so far, which could maybe change in
the future. Thus JavaScript in SPUR corresponds to Python in PyPy, which was the
first dynamic language implemented in PyPy (and is also the reason for PyPy's
existence).

There are obviously also differences between the two projects, although many of
them are only skin-deep. The largest difference is the reliance of SPUR on
compilers on all levels. PyPy takes the opposite approach of using interpreters
almost everywhere. The parts of PyPy that correspond to SPUR's compilers are (I
will use the Python implementation of PyPy as an example):

the JavaScript-to-CIL compiler corresponds to the Python interpreter of PyPy

the profiling JIT corresponds to a part of PyPy's translation toolchain
which adds some profiling support in the process of turning RPython code into
C code,

the tracing JIT corresponds to a special interpreter in the PyPy JIT which
executes an RPython program and produces a trace of the execution

the optimizing JIT corresponds to the optimizers and backends of PyPy's JIT

PyPy's Optimizations

Comparing the optimizations that the two projects perform, the biggest
difference is that PyPy does "trace stitching" instead of fully supporting trace
trees. The difference between the two concerns what happens when a new trace
gets added to an existing loop. The new trace starts from a guard in the
existing loop that was observed to fail often. Trace stitching means that the
loop is just patched with a jump to the new trace. SPUR instead recompiles the
whole trace tree, which gives the optimizers more opportunities, but also makes
compilation a lot slower. Another difference is that PyPy does not perform
loop-invariant code motion yet.

Many of the remaining optimizations are very similar. PyPy supports guard
implication as well as guard strengthening. It has some load/store
optimizations, but PyPy's alias analysis is quite rudimentary. On the other
hand, PyPy's escape analysis is very powerful. PyPy also has support for the
annotations that SPUR supports, using some decorators in the pypy.rlib.jit
module. User-controlled loop unrolling is performed using the unroll_safe
decorator, tracing of a function can be disabled with the dont_look_inside
decorator.

PyPy has a few more annotations that were not mentioned in the SPUR tech report.
Most importantly, it is possible to declare a function as pure, using the
purefunction decorator. PyPy's optimizers will remove calls to a function
decorated that way if the arguments to the call are all constant. In addition it
is possible to declare instances of classes to be immutable, which means that
field accesses on constant instances can be folded away. Furthermore there is
the promote hint, which is spelled x = hint(x, promote=True). This will
produce a guard in the trace, to turn x into a constant after the guard.

Summary

Given the similarity between the projects' goals, it is perhaps not so
surprising to see that PyPy and SPUR have co-evolved and reached many similar
design decisions. It is still very good to see another project that does many
things in the same way as PyPy.

Recently, I've become aware of the SPUR project of Microsoft Research and
read some of their papers (the tech report "SPUR: A Trace-Based JIT Compiler
for CIL" is very cool). I found the project to be very interesting and since
their approach is in many ways related to what PyPy is doing, I now want to
compare and contrast the two projects.

A Tracing JIT for .NET

SPUR consist of two parts: On the one hand it is a VM for CIL, the
bytecode of the .NET VM. This VM uses a tracing JIT compiler to compile the
programs it is running to machine code. As opposed to most existing VMs that
have a tracing JIT it does not use an interpreter at all. Instead it
contains various variants of a JIT compiler that produce different versions of
each method. Those are:

a profiling JIT which produces code that does lightweight profiling when
running the compiled method

a tracing JIT which produces code that produces a trace when running the
compiled method

a transfer-tail JIT which is used to produce code which is run to get from a
failing guard back to the normal profiling version of a method

an optimizing JIT that actually optimizes traces and turns them into machine code

Optimizations Done by the Optimizing JIT

SPUR's optimizing JIT does a number of powerful optimizations on the traces before it
turns them into machine code. Among them are usual compiler optimizations such
as register allocation, common subexpression elimination, loop invariant code
motion, etc.

It also performs some optimizations that are specific to the tracing context and
are thus not commonly found in "normal" compilers:

guard implication: if a guard is implied by an earlier guard, it is removed

guard strengthening: if there is a sequence of guards that become stronger
and stronger (i.e. each guard implies the previous one), the first guard in
the sequence is replaced by the last one, and all others are removed. This can
greatly reduce the number of guards and is generally safe. It can shift a
guard failure to an earlier point in the trace, but the failure would have
occurred at some point in the trace anyway.

load/store optimizations: this is an optimization for memory reads/writes.
If several loads from the same memory location occur without writes in
between, all but the first one are removed. Similarly, if a write to a memory
location is performed, this write is delayed as much as possible. If there is
a write to the same location soon afterwards, the first write can be removed.

escape analysis: for allocations that occur in a loop, the optimizer checks
whether the resulting object escapes the loop. If not, the allocation is moved
before the loop, so that only one object needs to be allocated, instead of one
every loop iteration.

user-controlled loop unrolling: not exactly an optimization, but an
interesting feature anyway. It is possible to annotate a CIL method with a
special decorator [TraceUnfold] and then the tracing JIT will fully unroll
the loops it contains. This can be useful for loops than are known to run a
small and fixed number of iterations for each call-site.

user controlled tracing: The user can also control tracing up to a point.
Methods can be annotated with [NativeCall] to tell the tracer to never
trace their execution. Instead they appear as a direct call in the trace.

A JavaScript Implementation

In addition to the tracing JIT I just described, SPUR also contains a JavaScript
implementation for .NET. The approach of this implementation is to translate
JavaScript to CIL bytecode, doing some amount of type inference to detect
variables that have fixed types. All operations where no precise type could be
determined are implemented with calls to a JavaScript runtime system, which does
the necessary type dispatching. The JavaScript runtime is implemented in C#.

The JavaScript implementation and the CLI VM with a tracing JIT sound quite
unrelated at first, but together they amplify each other. The tracing JIT traces
the JavaScript functions that have been translated to CLI bytecode. Since the
JavaScript runtime is in C#, it exists as CLI bytecode too. Thus it can be
inlined into the JavaScript functions by the tracer. This is highly beneficial,
since it exposes the runtime type dispatching of the JavaScript operations to
the optimizations of the tracing JIT. Particularly the common expression
elimination helps the JavaScript code. If a series of operations is performed on
the same object, the operations will all do the same type checks. All but the
type checks of the first operation can be removed by the optimizer.

Performance Results

The speed results of the combined JavaScript implementation and tracing JIT are
quite impressive. It beats TraceMonkey for most benchmarks in SunSpider (apart
from some string-heavy benchmarks that are quite slow) and can compete with V8
in many of them. However, all this is steady-state performance and it seems
SPUR's compile time is rather bad currently.

Further Possibilities

A further (so far still hypothetical) advantage of SPUR is that the approach can
optimize cases where execution crosses the border of two different systems. If
somebody wrote an HTML layout engine and a DOM in C# to get a web browser and
integrated it with the JavaScript implementation described above, the tracing
JIT could optimize DOM manipulations performed by JavaScript code as well as
callbacks from the browser into JavaScript code.

Of course the approach SPUR takes to implement JavaScript is completely
generalizable. It should be possible to implement other dynamic languages in the
same way as JavaScript using SPUR. One would have to write a runtime system for
the language in C#, as well as a compiler from the language into CIL bytecode.
Given these two elements, SPUR's tracing JIT compiler would probably do a
reasonable job at optimizing this other language (of course in practise, the
language implementation would need some tweaking and annotations to make it
really fast).

Comparison With PyPy

The goals of PyPy and SPUR are very similar. Both projects want to implement
dynamic languages in an efficient way by using a tracing JIT. Both apply the
tracing JIT "one level down", i.e. the runtime system of the dynamic language is
visible to the tracing JIT. This is the crucial point of the approach of both
projects. Since the runtime system of the dynamic language is visible to the
tracing JIT, the JIT can optimize programs in that dynamic language. It does not
itself need to know about the semantics of the dynamic language. This makes the
tracing JIT usable for a variety of dynamic languages. It also means that the
two halves can be implemented and debugged independently.

In SPUR, C# (or another language that is compilable to CIL) plays the role of
RPython, and CIL is equivalent to the intermediate format that PyPy's
translation toolchain uses. Both formats operate on a similar abstraction level,
they are quite close to C, but still have support for the object system of their
respective language and are garbage-collected.

SPUR supports only a JavaScript implementation so far, which could maybe change in
the future. Thus JavaScript in SPUR corresponds to Python in PyPy, which was the
first dynamic language implemented in PyPy (and is also the reason for PyPy's
existence).

There are obviously also differences between the two projects, although many of
them are only skin-deep. The largest difference is the reliance of SPUR on
compilers on all levels. PyPy takes the opposite approach of using interpreters
almost everywhere. The parts of PyPy that correspond to SPUR's compilers are (I
will use the Python implementation of PyPy as an example):

the JavaScript-to-CIL compiler corresponds to the Python interpreter of PyPy

the profiling JIT corresponds to a part of PyPy's translation toolchain
which adds some profiling support in the process of turning RPython code into
C code,

the tracing JIT corresponds to a special interpreter in the PyPy JIT which
executes an RPython program and produces a trace of the execution

the optimizing JIT corresponds to the optimizers and backends of PyPy's JIT

PyPy's Optimizations

Comparing the optimizations that the two projects perform, the biggest
difference is that PyPy does "trace stitching" instead of fully supporting trace
trees. The difference between the two concerns what happens when a new trace
gets added to an existing loop. The new trace starts from a guard in the
existing loop that was observed to fail often. Trace stitching means that the
loop is just patched with a jump to the new trace. SPUR instead recompiles the
whole trace tree, which gives the optimizers more opportunities, but also makes
compilation a lot slower. Another difference is that PyPy does not perform
loop-invariant code motion yet.

Many of the remaining optimizations are very similar. PyPy supports guard
implication as well as guard strengthening. It has some load/store
optimizations, but PyPy's alias analysis is quite rudimentary. On the other
hand, PyPy's escape analysis is very powerful. PyPy also has support for the
annotations that SPUR supports, using some decorators in the pypy.rlib.jit
module. User-controlled loop unrolling is performed using the unroll_safe
decorator, tracing of a function can be disabled with the dont_look_inside
decorator.

PyPy has a few more annotations that were not mentioned in the SPUR tech report.
Most importantly, it is possible to declare a function as pure, using the
purefunction decorator. PyPy's optimizers will remove calls to a function
decorated that way if the arguments to the call are all constant. In addition it
is possible to declare instances of classes to be immutable, which means that
field accesses on constant instances can be folded away. Furthermore there is
the promote hint, which is spelled x = hint(x, promote=True). This will
produce a guard in the trace, to turn x into a constant after the guard.

Summary

Given the similarity between the projects' goals, it is perhaps not so
surprising to see that PyPy and SPUR have co-evolved and reached many similar
design decisions. It is still very good to see another project that does many
things in the same way as PyPy.