John Nagle <nagle at animats.com> writes:
> Python is defined by what a naive interpreter with late binding
> and dynamic name lookups, like CPython, can easily implement. Simply
> emulating the semantics of CPython with generated code doesn't help
> all that much.
Indeed.
> Because you can "monkey patch" Python objects from outside the
> class, a local compiler, like a JIT, can't turn name lookups into hard
> bindings. Nor can it make reliable decisions about the types of
> objects.
But it /can/ make guesses. A dynamic runtime doesn't have to predict
everything right in advance; it only has to predict most things sort of
well enough, and fix up the things it got wrong before anyone notices.
For example, A Python compiler could inline a function call if it makes
a note to recompile the calling function if the called function is
modified. Most functions aren't redefined, so this is probably a pretty
good guess.
> That adds a sizable performance penalty. Short of global program
> analysis, the compiler can't tell when code for the hard cases needs
> to be generated.
The right approach is to guess that things are going to be done the easy
way, and then detect when the guess is wrong.
> So the hard-case code, where you figure out at run-time, for ever use
> of "+", whether "+" is addition or concatenation, has to be generated
> every time. Making that decision is far slower than doing an add.
There's an old trick here called `inline caching'. The first time a
function is called, compile it so as to assume that types of things are
as you found this time: inline simple methods, and so on. Insert some
quick type checks at the top: is this going to work next time? If not,
take a trap back into the compiler. The traditional approach is to
replace the mispredictions with full dispatches (`monomorphic inline
caching'); the clever approach tolerates a few different types,
dispatching to optimized code for each (`polymorphic inline caching'),
unless there are just too many decision points and you give up.
There are time/space tradeoffs to be made here too. Fortunately, you
don't have to compile everything super-optimized from the get-go: you
can dynamically identify the inner loops which need special attention,
and get the compiler to really stare hard at them. The rest of the
program might plausibly be left interpreted much of the time for all
anyone will care.
> I've referred to this problem as "gratuitous hidden dynamism".
> Most things which could be changed dynamically in a Python program
> usually aren't.
This is one of the crucial observations for making a dynamic language go
fast; the other is that you still have the compiler around if you
guessed wrong.
An aggressively dynamic runtime has two enormous advantages over batch
compilers such as are traditionally used for C: it gets the entire
program in one go, and it gets to see the real live data that the
program's meant to run against. Given that, I'd expect it to be able to
/beat/ a batch compiler in terms of performance.
> This has been pointed out many times by many people. There's
> even a PhD thesis on the topic. Without a few restrictions, so
> that a compiler can at least tell when support for the hard cases
> is needed, Python cannot be compiled well.
This assumes static compilation. It's the wrong approach for a dynamic
language like Python.
-- [mdw]