LuaJIT performance

I am in the early stages of deciding on a fast scripting
language for a new C++ project, and obviously Lua is a candidate. When I
say fast, I mean fast, so we’d probably be using LuaJIT rather than
interpreted Lua.

Does anyone know of a performance comparison of JIT’ed
Lua versus something like V8 JavaScript? Lua has the reputation as the
fastest scripting language (in fact that’s how I came across it) but does
the JIT compiler used in modern JavaScript implementations like V8 greatly
narrow the performance gap? JavaScript is clearly a much more
comprehensive and complex language so I would be surprised if it could be
executed faster than Lua but it’s the speed that we really need over
language features.

Re: LuaJIT performance

John C. Turnbull wrote:
[...]
> Does anyone know of a performance comparison of JIT’ed Lua versus something
> like V8 JavaScript? Lua has the reputation as the fastest scripting
> language (in fact that’s how I came across it) but does the JIT compiler
> used in modern JavaScript implementations like V8 greatly narrow the
> performance gap? JavaScript is clearly a much more comprehensive and
> complex language so I would be surprised if it could be executed faster
> than Lua but it’s the speed that we really need over language features.

I did some quick-and-dirty benchmarks as part of Clue. Back then, V8
wasn't really set up for running command-line JavaScript applications, so
I was never able to integrate it into the benchmark suite (I should
check again), and the benchmarks are astonishingly artificial, but I saw
that while V8 was way faster than any other JavaScript interpreter out
there, LuaJIT was still a lot better.

However, I wasn't quite testing like-for-like, so I'm not sure that this
was a meaningful result. I need to go and have another look to see if V8
has a proper command-line driver these days.

Re: LuaJIT performance

On Fri, Aug 7, 2009 at 8:34 AM, John C. Turnbull wrote:
>
> Does anyone know of a performance comparison of JIT’ed Lua versus something
> like V8 JavaScript? Lua has the reputation as the fastest scripting
> language (in fact that’s how I came across it) but does the JIT compiler
> used in modern JavaScript implementations like V8 greatly narrow the
> performance gap?

Re: LuaJIT performance

John C. Turnbull wrote:
> I am in the early stages of deciding on a fast scripting language for a new
> C++ project and obviously Lua is a candidate. When I say fast I mean fast
> so we'd probably be using LuaJIT as opposed to interpreted Lua.

You should also consider the size of scripting engine you're
embedding and how easy it is to bind to it. Lua and LuaJIT are
more than ten times smaller than V8 and IMHO much easier to embed.

Well, we can find out ... so I fetched today's V8 trunk and ran
some standard benchmarks. Unfortunately the V8 standalone shell is
very limited and is unable to run quite a few of them. And there's
no JavaScript translation for some others. :-(

All ratios are normalized relative to the performance of the
standard Lua interpreter. E.g. 5.0 means something is five times
faster than Lua. Higher numbers are better:

Summary: Ok, so V8 is catching up. But LuaJIT 1.x still beats it
on 6 out of 10 benchmarks. V8 is mainly faster on object allocation.
But, surprisingly, V8 is slower for nbody, even though its complex
logic for managing object shapes should make this go really fast.

Not surprisingly, Lua and LuaJIT still have the lead on numeric
benchmarks (unboxed floating point numbers pay off here). And
LuaJIT 2.x will completely change the game (sorry, still no ETA).

But as others have said: please compare the different VMs with
benchmarks that best match *your* performance needs.
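
As an illustration of the normalization described above, here is a minimal C sketch. It is not the script used for these measurements; the function name and any timings fed to it are hypothetical. It assumes both arguments are run times, in seconds, of the same benchmark.

```c
#include <assert.h>

/* Speedup of a VM relative to the baseline (the standard Lua interpreter
 * in the text above). Both arguments are run times in seconds for the
 * same benchmark. A result of 5.0 means "five times faster than the
 * baseline"; higher is better. */
static double speedup_vs_baseline(double baseline_seconds, double vm_seconds)
{
    assert(baseline_seconds > 0.0 && vm_seconds > 0.0);
    return baseline_seconds / vm_seconds;
}
```

For example, a hypothetical benchmark that takes 10 seconds in the baseline interpreter and 2 seconds in another VM would be reported as 5.0.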

Re: LuaJIT performance

Mike Pall wrote:
> Not surprisingly, Lua and LuaJIT still have the lead on numeric
> benchmarks (unboxed floating point numbers pay off here). And
> LuaJIT 2.x will completely change the game (sorry, still no ETA).

Those times for LuaJIT 2.x are simply mind-boggling. A JIT compiler beating
GCC in more than one standardised, non-hand-picked test? It also made me smile
to hear it's still being worked on; it has been quiet for a while now. Excited
would be an understatement...

Re: LuaJIT performance

Will there eventually be an ARM port? Would be great to hear ...

Best Regards
Michael

Alex Davies wrote:

> Mike Pall wrote:
>> Not surprisingly, Lua and LuaJIT still have the lead on numeric
>> benchmarks (unboxed floating point numbers pay off here). And
>> LuaJIT 2.x will completely change the game (sorry, still no ETA).
>
> Those times for LuaJIT 2.x are simply mindboggling. A JIT compiler
> beating GCC in more than one standardised/non hand-picked tests? Also
> made me smile to hear it's still being worked on, has been quiet for a
> while now. Excited would be an understatement...
>
> - Alex

Re: LuaJIT performance

> Will there eventually be an ARM port? Would be great to hear ...
>
> Best Regards
> Michael
>
> Alex Davies wrote:
>>
>> Mike Pall wrote:
>>>
>>> Not surprisingly, Lua and LuaJIT still have the lead on numeric
>>> benchmarks (unboxed floating point numbers pay off here). And
>>> LuaJIT 2.x will completely change the game (sorry, still no ETA).
>>
>> Those times for LuaJIT 2.x are simply mindboggling. A JIT compiler
>> beating GCC in more than one standardised/non hand-picked tests? Also made
>> me smile to hear it's still being worked on, has been quiet for a while now.
>> Excited would be an understatement...

Is there a 64bit version coming when version 2 is done? That would be
so great!!!
--
Regards,
Ryan

Re: LuaJIT performance

> On Sat, Aug 8, 2009 at 7:34 AM, Michael Bauroth wrote:
>> Will there eventually be an ARM port? Would be great to hear ...
>>
>> Best Regards
>> Michael
>>
>> Alex Davies wrote:
>>>
>>> Mike Pall wrote:
>>>>
>>>> Not surprisingly, Lua and LuaJIT still have the lead on numeric
>>>> benchmarks (unboxed floating point numbers pay off here). And
>>>> LuaJIT 2.x will completely change the game (sorry, still no ETA).
>>>
>>> Those times for LuaJIT 2.x are simply mindboggling. A JIT compiler
>>> beating GCC in more than one standardised/non hand-picked tests?
>>> Also made me smile to hear it's still being worked on, has been
>>> quiet for a while now.
>>> Excited would be an understatement...
>
> Is there a 64bit version coming when version 2 is done? That would be
> so great!

My customer would also be interested in an x64 LuaJIT. Maybe we can
come up with some company sponsorship for Mike on this? He sure
deserves some!

Re: LuaJIT performance

RJP Computing wrote:
> Is there a 64bit version coming when version 2 is done?

The goal is to release the x86 version and stabilize it a bit
before starting the x64 port.

Most of the LJ2 VM is already "64 bit ready". E.g. it has an
arch-independent 32 bit pointer abstraction for all GC objects.
This keeps tagged values at 8 bytes on all platforms. But several
major areas need more work: porting DynASM to x64, porting the
interpreter (which is 100% x86 assembler), dealing with the
different x64 calling conventions (WIN64 is different than the
rest of the world) and a couple more open issues.
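
To make the 8-byte point concrete, here is a minimal C sketch of a tagged value that uses a 32 bit reference for GC objects instead of a native pointer. This is an illustration only, not LJ2's actual data layout: the names GCRef32 and TaggedValue are made up, and the sketch leaves out both how a 32 bit reference is mapped to a real address and how a number is told apart from a tagged object.

```c
#include <stdint.h>

/* Hypothetical 32 bit reference to a GC object. It has the same size on
 * 32 bit and 64 bit platforms, unlike a native pointer. */
typedef struct {
    uint32_t ref32;
} GCRef32;

/* Hypothetical tagged value: either a double, or a type tag plus a
 * 32 bit payload. Both alternatives occupy 8 bytes. */
typedef union {
    double n;             /* number */
    struct {
        GCRef32 gc;       /* reference to a GC object */
        uint32_t tag;     /* type tag */
    } obj;
} TaggedValue;

/* Compile-time check (C11): the value stays at 8 bytes on any platform
 * where double is 8 bytes. */
_Static_assert(sizeof(TaggedValue) == 8, "tagged value must stay at 8 bytes");
```

With a native pointer in place of GCRef32, the same union would typically grow to 16 bytes on a 64 bit platform once a tag is added.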

Michael Bauroth wrote:
> Will there eventually be an ARM port?

Given the market share and the estimated demand, that's most
likely the next port after the x64 port. But it's much more
complicated, since there is no uniform ARM platform. The choice of
the number type for Lua is the main difficulty.

Using double-precision floating-point numbers is one option. But
it needs really fast FP arithmetic (x86/x64 provides that).
Unfortunately most older ARM devices have no FPU at all and NEON
doesn't do double precision FP. And about VFP ... well, some
vendors like to hide the fact that most of their gadgets only
contain something called "VFPlite". The high latencies and the low
throughput make softfp suddenly look like an attractive option.

Another option is to use 32 bit integers only. Certainly easier to
implement, but I'm not so sure everyone would be happy with it.

I've also considered using 32.31 fixed-point numbers. Yes, it's a
bit of an awkward choice. But you'd get fractional numbers at the
speed of integer arithmetic. Again, I'm not sure about the needs
of developers who'd like to have LJ2 ported to ARM.
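
For readers unfamiliar with fixed-point numbers, here is a minimal C sketch of the general idea. Note that it deliberately uses a smaller 16.16 format stored in a 32 bit integer, not the 32.31 format mentioned above, so that the intermediate product fits in a standard 64 bit integer. All names are made up for illustration.

```c
#include <stdint.h>

/* 16.16 fixed-point: 16 integer bits (including sign), 16 fractional bits. */
typedef int32_t fix16;

#define FIX16_ONE ((fix16)1 << 16)   /* the value 1.0 */

/* Convert an integer; the caller must keep i within -32768..32767. */
static fix16 fix16_from_int(int32_t i)
{
    return (fix16)(i * FIX16_ONE);
}

/* Addition is plain integer addition (overflow is not checked here). */
static fix16 fix16_add(fix16 a, fix16 b)
{
    return a + b;
}

/* Multiplication needs a wider intermediate, then a rescale by 2^16.
 * Division is used instead of a shift so that negative values are
 * handled portably (truncation toward zero). */
static fix16 fix16_mul(fix16 a, fix16 b)
{
    return (fix16)(((int64_t)a * (int64_t)b) / FIX16_ONE);
}
```

All three operations map to integer instructions only, which is where the speed advantage on FPU-less hardware comes from.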

[Note that support for multiple number types per platform is not
an option. A JIT compiler needs to emit very different instruction
sequences for each number type. Switching number types is not as
easy as changing a couple of C macros.]

Re: LuaJIT performance

> >> Is there a 64bit version coming when version 2 is done? That would be
> >> so great!
>
> > My customer would also be interested in an x64 LuaJIT. Maybe we can come up
> > with some company sponsorship for Mike on this? He sure deserves some!
>
> Actually, there is a chance our company would be able to participate
> in such sponsorship as well.
>
> We'd love to use 64bit LuaJIT 2 in our products!

Sure, I can put up a proposal for a sponsorship program for the x64
port (*after* the x86 release of course). Since I'm an independent
consultant, you'll get a proper bill/invoice/tax receipt or
whatever it's called in your country. Companies should be able to
deduct the expenses.

But before going to the effort, I'd like to know a rough estimate
of what your respective companies might be willing to spend on this.
You can email me privately about this and any amounts are of
course not binding (yet). Depending on the outcome I may go for it.

Re: LuaJIT performance

> Given the market share and the estimated demand, that's most
> likely the next port after the x64 port. But it's much more
> complicated, since there is no uniform ARM platform. The choice of
> the number type for Lua is the main difficulty.

<...>

I, personally, am interested in LJ2 for iPhone (which is ARM-based).
As I intend to reuse existing (x86) game logic code on it, I would
need floating point support.

Re: LuaJIT performance

> Sure, I can put up a proposal for a sponsorship program for the x64
> port (*after* the x86 release of course). Since I'm an independent
> consultant, you'll get a proper bill/invoice/tax receipt or
> whatever it's called in your country. Companies should be able to
> deduct the expenses.

> But before going to the effort, I'd like to know a rough estimate
> what your respective companies might be willing to spend on this.
> You can email me privately about this and any amounts are of
> course not binding (yet). Depending on the outcome I may go for it.

Cool! But it looks like we'd need to sponsor x86 release first! :-)

You've said there is no ETA yet for x86, but, perhaps, you may share
some information on the amount of work left to do?

Re: LuaJIT performance

> Another option is to use 32 bit integers only. Certainly easier to
> implement, but I'm not so sure everyone would be happy with it.

If you go this way eventually, please backport to x86. We use stock Lua
with 32-bit integers on x86 and PowerPC (Xbox 360) for very peculiar reasons
(synchronicity of calculations across the network) and are generally
happy with it.
I was unpleasantly surprised when I learned there's no support for
that in LuaJIT 1.x.

Re: LuaJIT performance

Alexander Gladysh wrote:
> I, personally, am interested in LJ2 for iPhone (which is ARM-based).
> As I intend to reuse existing (x86) game logic code on it, I would
> need floating point support.

Ok, but you may be in for a nasty surprise: the 3GS has an ARM
Cortex-A8 CPU which only has VFPlite. This is actually a step back
from the previous models which had an ARM 1176JZ(F)-S with a full
VFP unit. And since the vector mode of VFP is officially deprecated,
you're in for more surprises in the future.

Not to squash your hopes, but I suggest you try to measure whether
the iPhone FP performance can keep up with your requirements.
Maybe try some simple double-precision FP benchmarks in C (don't
compile as Thumb code or you get softfp).
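
A simple double-precision test of the kind suggested above could look like the following C sketch. It is only a starting point under stated assumptions: the function names are made up, the kernel is a single dependent multiply-add chain (so it mostly reflects FP latency, not throughput), and clock() measures CPU time with limited resolution. A driver that calls it with a large iteration count and prints the result is left out.

```c
#include <time.h>

/* Dependent chain of double-precision multiply-adds. Starting from 0,
 * x converges towards 2.0. The result is returned so the compiler
 * cannot simply drop the loop. */
static double fp_kernel(long iterations)
{
    double x = 0.0;
    for (long i = 0; i < iterations; i++)
        x = x * 0.5 + 1.0;
    return x;
}

/* Runs the kernel and returns the elapsed CPU time in seconds.
 * The kernel's result is stored in *result. */
static double time_fp_kernel(long iterations, double *result)
{
    clock_t start = clock();
    *result = fp_kernel(iterations);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Dividing the iteration count by the measured time gives a rough multiply-add rate that can be compared between devices or compiler settings (e.g. ARM vs. Thumb code generation).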

Timm S. Mueller wrote:
> Just for the record: a port of LuaJIT to ARM would be very welcome, and
> I'd be perfectly happy with a 32 bit numerical datatype on this
> architecture.

Umm, so one probably needs at least two different VMs for ARM (FP
vs. int-only). Then combine this with the options for ARM vs.
Thumb vs. Thumb2 code and with ARMv4-ARMv7 support and soon we'll
have an exponential number of targets to support ... *sigh*

Re: LuaJIT performance

Alexander Gladysh wrote:
> You've said there is no ETA yet for x86, but, perhaps, you may share
> some information on the amount of work left to do?

Well, I'm already cutting corners everywhere wrt. features for the
first alpha. But issues with correctness and completeness keep me
busy (the coordination between the JIT code and the GC is currently
a minefield). And the code needs to be cleaned up a lot before it's
ready for public consumption. Then I'll need to work on the
packaging, the docs, the web site reorganization and so on ...

Thankfully I've recently removed the last major stumbling block
(better trace linking) and the benchmark results demonstrate that
going for a trace compiler was a sound design decision after all.

But I have to say it was an expensive decision: I've considerably
underestimated the amount of research and trial-and-error which
was needed to convert a research toy into a production compiler.
There are some important implementation details which the few
papers about trace compilers completely fail to mention ... :-|

Re: LuaJIT performance

> > Just for the record: a port of LuaJIT to ARM would be very welcome, and
> > I'd be perfectly happy with a 32 bit numerical datatype on this
> > architecture.
>
> Umm, so one probably needs at least two different VMs for ARM (FP
> vs. int-only). Then combine this with the options for ARM vs.
> Thumb vs. Thumb2 code and with ARMv4-ARMv7 support and soon we'll
> have an exponential number of targets to support ... *sigh*

Please make a sensible decision; the sky won't fall if I don't
get LuaJIT/ARM for free.

My deployment needs are in the no-FPU, no-2nd-level-cache, ~200MHz
range. That's where throughput for user interfaces is scarce and
desperately needed. But Lua is up to the task. I have designed my
libraries to work with integers (using fixed-point arithmetic in places),
so that I can use them in these contexts, and ARM-7 is an architecture
I am frequently concerned with.

Re: LuaJIT performance

> Umm, so one probably needs at least two different VMs for ARM (FP
> vs. int-only). Then combine this with the options for ARM vs.
> Thumb vs. Thumb2 code and with ARMv4-ARMv7 support and soon we'll
> have an exponential number of targets to support ... *sigh*

Ignore Thumb; I'm not sure anybody would want to run LuaJIT on a
Thumb-only device (like a Cortex M3). ARM's hardware FP has always
left something to be desired, so a combined approach like Asko's
integer patch might be a solution. And for the most part (excluding
floating point), it should be quite easy to produce code that will run
on both ARMv3/ARMv4 and ARMv7.