LTR (Lua Tiny RAM) in eLua

Modules and LTR

LTR (Lua Tiny RAM) is a Lua patch (written specifically for eLua by Bogdan Marinescu) that significantly decreases the RAM usage of Lua scripts,
thus making it possible to run large Lua programs on systems with limited RAM. This section gives a full description of LTR. If you're writing eLua
modules, this page will certainly be of interest to you, as it shows how to interact with LTR in a portable and easy-to-configure way.

Motivation

The main thing that drove me to write this patch is the relatively high Lua memory consumption at startup (obtained by running
lua -e "print(collectgarbage'count')"). It's about 17k for regular Lua 5.1.4, and more than 25k for some of eLua's platforms. These figures are
mainly a result of registering many different modules to Lua. Each time you register a module (via luaL_register) you create a new table and populate
it with the module's methods. But a table is a read/write datatype, so luaL_register is quite inefficient if you don't plan to do any write operations
on that table later (adding new elements or manipulating existing ones). I found that I almost never have to do any such operations on a module's
table after it was created. I just query it for its elements. So, from the perspective of someone worried about memory usage, I'd rather have a
different type of table in this case, one that wouldn't need any RAM at all, since it would be read only, so it could reside entirely in ROM.

There's one more thing related to this context: Lua's functions. While Lua does have the concept of C functions, they still require data structures
that need to be allocated (see lua_pushcclosure in lapi.c for details), as they can have upvalues or environments. Once again, this isn't something I
use often with eLua. Most of the time my functions (especially the ones exported by a C module) are very simple, and they don't need upvalues or
environments at all. In conclusion, having a "simpler" function type would also improve memory usage.

Details

The patch adds two new data types to Lua. Both of them are based on the lightuserdata type already found in Lua, and they share the same basic
attributes: they don't need to be dynamically allocated (as they're just pointers on steroids) and they're compared in the same way lightuserdatas
are compared (by value). And of course, they are not collectable, so the garbage collector won't have anything to do with them. The new types are:

lightfunctions: these are "simple" functions, in the sense that they can't have upvalues or environments. They are just pointers to regular
C functions. Other than that, you can use them from Lua just as you'd use any other function.

rotables: these are read-only tables, but unlike the read-only tables that one can already implement in Lua with metamethods, they have a
very specific property: they don't need any RAM at all. They are fully constant, so they can be read directly from ROM. They have a number of
special features and limitations when compared with a regular table:

rotables can only contain values of type "lightfunction", lua_Number or pointers to other rotables.

you can't add/delete/modify elements from rotables (obviously). However, rotables will honour the "__newindex" metamethod.

you can use rotables as metatables for both "regular" tables and for Lua types (via debug.setmetatable)

a rotable can have another rotable (or itself) as a metatable

you can iterate over rotables with pairs/ipairs/next just as you do with "regular" tables.

Just as with lightuserdata, you can only create lightfunctions and rotables from C code and never from Lua itself.
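
To make the two new types more concrete, here is a minimal, self-contained sketch of the idea: a constant array of key/value entries that can live in ROM, holding a lightfunction-like function pointer, a number and a pointer to another read-only table. All names and layouts below (the sk_ prefix, sk_entry, sk_value, sk_rotable_get) are hypothetical and chosen only for illustration; they are not the structures the patch actually uses.

```c
#include <stddef.h>
#include <string.h>

typedef struct sk_state sk_state;            /* opaque stand-in for lua_State */
typedef int (*sk_lightfunc)(sk_state *S);    /* a lightfunction: a bare C function pointer */

typedef enum { SK_NIL, SK_FUNC, SK_NUM, SK_ROTABLE } sk_type;

struct sk_entry;                             /* forward declaration for nested tables */

typedef struct {
  sk_type type;
  union {
    sk_lightfunc func;                       /* used when type == SK_FUNC */
    double num;                              /* used when type == SK_NUM */
    const struct sk_entry *rotable;          /* used when type == SK_ROTABLE */
  } u;
} sk_value;

typedef struct sk_entry {
  const char *key;                           /* NULL marks the end of the table */
  sk_value value;
} sk_entry;

int sk_answer(sk_state *S) { (void)S; return 42; }

/* Both arrays are const, so an embedded toolchain can place them in ROM.
   The union members are set with C99 designated initializers. */
const sk_entry sk_limits_map[] = {
  { "max", { SK_NUM, { .num = 255.0 } } },
  { NULL,  { SK_NIL, { .num = 0.0 } } }
};

const sk_entry sk_mod_map[] = {
  { "answer", { SK_FUNC,    { .func = sk_answer } } },
  { "limits", { SK_ROTABLE, { .rotable = sk_limits_map } } },
  { NULL,     { SK_NIL,     { .num = 0.0 } } }
};

/* Linear search over the constant array; returns NULL when the key is absent. */
const sk_value *sk_rotable_get(const sk_entry *t, const char *key) {
  for (; t->key != NULL; t++) {
    if (strcmp(t->key, key) == 0) return &t->value;
  }
  return NULL;
}
```

Nothing in this sketch is allocated at run time: looking up "answer" in sk_mod_map yields a function pointer that can be called directly, and looking up "limits" yields a pointer to the second constant table.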

Testing

I tested my patch with the Lua 5.1 test suite. The test suite
was an excellent testing tool. I thought I had the patch ready until I found the test suite and ran it. After another week of work, I had something
that could be called functional :)

I tested everything via "make generic", which is how I always build Lua for my embedded environments. This means (among other things) that I didn't
test pipes and dynamic module loading, although I don't see why they wouldn't work.

I never tested the patch in a multithreaded environment with threads running different lua_States. I never even used regular Lua like this,
so I can't make assumptions about how my patch would behave in a multithreaded environment. It doesn't use any global or static variables, but you
might encounter other problems with it.

Results

The table below summarizes the RAM usage in KBytes (as obtained by running lua -e "print(collectgarbage'count')" from the eLua shell).
OPT=0 is LTR's "compatibility mode" (basically this means that the patch is disabled, so you're running plain Lua) and OPT=2 is the
patch in action.

Platform     OPT=0    OPT=2
AVR32        23.75    5.42
AT91SAM7X    25.16    5.42
STR7         24.92    5.42
STR9         22.23    5.42
LPC2888      22.23    5.42
i386         16.90    5.42
LM3S         27.14    5.42

As you can see, the differences are significant, and (more importantly) it doesn't matter how many modules you load in eLua: the RAM consumption
doesn't change.

Currently, there aren't any performance measurements related to LTR. It's clear from the implementation that the patch slows down the virtual machine,
but a precise performance penalty figure is not known. Experience suggests that the performance penalty is minimal, and it certainly can't be observed
with "regular" (non-computationally intensive) Lua programs.

How to enable LTR

Enabling LTR is very easy: all you need to do is specify optram=1 as a parameter to scons when building eLua, as explained
in the eLua build instructions. You don't even need to specify this explicitly, as LTR is enabled by default for all eLua targets.

When optram is 0, LTR is not active. In this mode, the patch just tries to keep the modified version as close as possible to the unpatched version
in terms of speed and functionality. You might want to use this if you want full Lua compatibility (although this is rarely an issue in practice),
or need to overcome the read-only limitations of rotables (but check the module configuration section at the end of this page first). If your program behaves strangely and you
suspect that LTR might be the cause of your problems, recompiling with optram=0 is a quick way to eliminate or confirm your suspicions.

When optram is 1 (default), all the LTR optimizations are enabled. The implementation of the Lua standard libraries is modified to take advantage
of the new datatypes. In particular, the IO library is modified to use the registry instead of environments, thus making it more resource-friendly,
the side effect being that this mode doesn't support pipes in the io module (which isn't an issue for eLua).
It also leaves the _G (globals) table with a single method (__index) and sets it as its own metatable, so all accesses to globals are
now slightly slower because of the __index metamethod call.
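
The cost of that extra step can be pictured with a small, self-contained sketch. In standard Lua, __index is consulted only when a key is missing from the table itself, so the sketch below looks in a writable RAM table first and falls back to a constant ROM list on a miss. The assumption that the fallback searches a list of module rotables, and every name used here (sk_global, sk_rom_globals, sk_get_global, sk_fallback_calls), is hypothetical; the patch's real lookup code is not shown in this document.

```c
#include <stddef.h>
#include <string.h>

typedef struct { const char *name; const void *value; } sk_global;

/* ROM side: constant list of module tables, never written at run time.
   The two strings are placeholders standing in for real rotables. */
const char sk_pio_map[] = "pio";
const char sk_tmr_map[] = "tmr";
const sk_global sk_rom_globals[] = {
  { "pio", sk_pio_map },
  { "tmr", sk_tmr_map },
  { NULL, NULL }
};

/* RAM side: the ordinary, writable globals table. */
#define SK_MAX_GLOBALS 8
sk_global sk_ram_globals[SK_MAX_GLOBALS];
size_t sk_ram_count = 0;

/* Counts how many lookups needed the slower fallback path. */
unsigned sk_fallback_calls = 0;

/* Appends a global; returns 0 on success, -1 when the RAM table is full. */
int sk_set_global(const char *name, const void *value) {
  if (sk_ram_count == SK_MAX_GLOBALS) return -1;
  sk_ram_globals[sk_ram_count].name = name;
  sk_ram_globals[sk_ram_count].value = value;
  sk_ram_count++;
  return 0;
}

/* Looks in RAM first; on a miss, searches the ROM list
   (the step an __index metamethod would perform). */
const void *sk_get_global(const char *name) {
  size_t i;
  for (i = 0; i < sk_ram_count; i++) {
    if (strcmp(sk_ram_globals[i].name, name) == 0) return sk_ram_globals[i].value;
  }
  sk_fallback_calls++;
  for (i = 0; sk_rom_globals[i].name != NULL; i++) {
    if (strcmp(sk_rom_globals[i].name, name) == 0) return sk_rom_globals[i].value;
  }
  return NULL;
}
```

In this model a global stored in RAM is found without touching the fallback, while a module name such as "pio" is resolved from the constant list at the price of one extra search.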

Writing LTR-compatible modules

The LTR patch introduces a specific method for writing modules in such a way that they're fully compatible with both optram=0 and optram=1.
If you're writing a new eLua module you should use this method, as it keeps the code consistent.

We'll show this method using a simple example. Let's assume that you want to register a simple module called "mod" that has a single function named "f".
For regular Lua, you'd do something like this:
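
A minimal sketch of such a registration, using only the standard Lua 5.1 C API (luaL_Reg and luaL_register from lauxlib.h). The body of f is a placeholder, and the array name mod_map is chosen to match the name used later in this section. It needs the Lua headers to compile.

```c
#include "lua.h"
#include "lauxlib.h"

/* The single function exported by the module (placeholder body). */
static int f(lua_State *L) {
  lua_pushinteger(L, 42);
  return 1;                       /* one result left on the Lua stack */
}

/* Name/function pairs, terminated by a { NULL, NULL } sentinel. */
static const luaL_Reg mod_map[] = {
  { "f", f },
  { NULL, NULL }
};

/* Creates the "mod" table in RAM at run time and fills it from mod_map. */
int luaopen_mod(lua_State *L) {
  luaL_register(L, "mod", mod_map);
  return 1;                       /* the module table is left on the stack */
}
```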

for values: LRO_FUNCVAL(f) defines a lightfunction value, LRO_NUMVAL(f) defines a number value, LRO_RO(p) defines a
rotable value (p is the pointer to the rotable) and LRO_NILVAL defines a NULL (empty) value.

all the "global" rotables in the system (the ones that must be visible from _G, like the rotables of all the modules exported to Lua) must be
included in a special array, called lua_rotable (defined in linit.c). Simply including the rotable's definition array (mod_map in this case)
in the lua_rotable array makes it visible globally, thus you don't need to call any kind of register function. This is why luaopen_mod now returns
0.

The two forms above (for regular tables and for rotables) are clearly different, but we want to keep them both to be able to work at both optram=0
and optram=1. You can use #ifdefs to differentiate between the two cases in different optimization levels, but this becomes really annoying after
a (short) while. This is why I added another file called lrodefs.h (src/lua) that can be used to give a "universal" definition to our map
arrays. Here's how our example looks after rewriting it to take advantage of lrodefs.h:
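
The macro technique behind such a "universal" definition can be sketched in a self-contained way. The block below is not the contents of lrodefs.h: the SK_ macros, the two entry layouts and sk_register are hypothetical stand-ins, and only LUA_OPTIMIZE_MEMORY, MIN_OPT_LEVEL, mod_map and luaopen_mod are names taken from this page. SK_REGISTER plays the role described here for LREGISTER.

```c
#include <stddef.h>
#include <string.h>

/* From the text: 0 when optram=0, 2 when optram=1. Defaulted here for the sketch. */
#ifndef LUA_OPTIMIZE_MEMORY
#define LUA_OPTIMIZE_MEMORY 2
#endif
#define MIN_OPT_LEVEL 2

typedef struct lua_State lua_State;           /* opaque stand-in for Lua's state */
typedef int (*sk_cfunc)(lua_State *L);

/* Hypothetical entry layouts for the two forms. */
typedef struct { const char *name; sk_cfunc func; } sk_reg_entry;            /* regular form */
typedef struct { const char *name; int vtype; sk_cfunc func; } sk_ro_entry;  /* read-only form */
enum { SK_VT_NIL, SK_VT_FUNC };

/* Stand-in for run-time registration; only counts how often it is called. */
int sk_registered = 0;
void sk_register(lua_State *L, const char *name, const sk_reg_entry *map) {
  (void)L; (void)name; (void)map;
  sk_registered++;
}

/* One set of macro names, two expansions selected at compile time. */
#if LUA_OPTIMIZE_MEMORY >= MIN_OPT_LEVEL
  #define SK_REG_TYPE                sk_ro_entry
  #define SK_STRKEY(k)               (k)
  #define SK_FUNCVAL(fn)             SK_VT_FUNC, (fn)
  #define SK_NILKEY                  NULL
  #define SK_NILVAL                  SK_VT_NIL, NULL
  #define SK_REGISTER(L, name, map)  (void)(L); return 0
#else
  #define SK_REG_TYPE                sk_reg_entry
  #define SK_STRKEY(k)               (k)
  #define SK_FUNCVAL(fn)             (fn)
  #define SK_NILKEY                  NULL
  #define SK_NILVAL                  NULL
  #define SK_REGISTER(L, name, map)  sk_register((L), (name), (map)); return 1
#endif

int f(lua_State *L) { (void)L; return 0; }

/* The module author writes this once; it compiles in either form. */
const SK_REG_TYPE mod_map[] = {
  { SK_STRKEY("f"), SK_FUNCVAL(f) },
  { SK_NILKEY, SK_NILVAL }
};

int luaopen_mod(lua_State *L) {
  SK_REGISTER(L, "mod", mod_map);
}
```

Compiled as is, the sketch takes the read-only branch, so luaopen_mod registers nothing and returns 0. Compiling with -DLUA_OPTIMIZE_MEMORY=0 selects the other branch, where the same source calls the registration stand-in and returns 1.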

Now, if LUA_OPTIMIZE_MEMORY (a macro defined by the system as 0 when optram=0 and as 2 when optram=1) is less than
MIN_OPT_LEVEL, the above definition will compile in its "regular table" format. If LUA_OPTIMIZE_MEMORY is 2, it compiles to the
rotables format. Problem solved :) LREGISTER will also take care of calling luaL_register and return 1 when optram=0 and do
absolutely nothing when optram=1. You can see more examples of this in any module from src/modules, and you're encouraged to do so,
as this is only a very basic example; src/modules contains real-life examples that can serve as a good basis for a new module.

As you know by now, rotables can have metatables, and also you can set a rotable as a metatable for a regular table. If a rotable must have a
metatable, then it needs a "__metatable" field to point to its metatable (which must also be a rotable, though not necessarily a different one) and the usual
metatable functions. For example, let's make our mod rotable its own metatable and declare an __index function. Moreover, let's do
this for both optram=0 and optram=1.

If you want to register a module using a regular Lua table, but use lightfunctions instead of regular functions, use luaL_register_light instead
of luaL_register (same syntax).

More important things to keep in mind when working with LTR:

currently, MIN_OPT_LEVEL should always be set to 2

you need a C99-compatible compiler to use LTR (because of the compile-time explicit union initialization that's needed to declare const rotables).
Fortunately this isn't an issue right now, as all current eLua targets use GCC and GCC knows how to handle this.

your linker command file should export two symbols: stext and etext. They should be declared before and after the .rodata* section
placement (generally you'd declare stext at the beginning of the .text definition and etext at the end of the .text definition, see for example
src/lua/at91sam7x256/flash256.lds). These are needed by the patch to differentiate between a regular table and a rotable (although this is likely
to change in a future version of the patch).

remember to declare all your rotable definition arrays as 'const'! Forgetting to do so will not only increase
memory usage, it will also make the patch nonfunctional, because of the way it recognizes rotables (see above).
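
The last two points can be illustrated with a small sketch of an address-range test, which is presumably how the stext and etext symbols are used: a pointer that falls between them refers to read-only data, anything else is treated as a regular table. The function name sk_in_rom, the half-open range and the two host-side buffers are assumptions made for this illustration, not the patch's actual code.

```c
#include <stdint.h>

/* Returns 1 when p lies inside [rom_start, rom_end), 0 otherwise.
   On a real target the bounds would come from the linker symbols,
   for example: extern char stext[], etext[]; */
int sk_in_rom(const void *p, const void *rom_start, const void *rom_end) {
  uintptr_t a = (uintptr_t)p;
  return a >= (uintptr_t)rom_start && a < (uintptr_t)rom_end;
}

/* Host-side stand-ins so the sketch runs without a linker script:
   one buffer plays the read-only region, the other plays RAM. */
const char sk_fake_rom[64] = { 0 };
char sk_fake_ram[64];
```

Under this model it is easy to see why a missing 'const' is fatal: a non-const array is placed in RAM, so its address falls outside the range and the table is no longer recognized as a rotable.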

LTR and module configuration at build time

With unpatched Lua, you can specify which modules are part of the Lua image by modifying src/lua/linit.c. In the particular case of eLua,
one has to declare a list of the modules that must be compiled in src/platform/<name>/platform_conf.h like this:
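
As an illustration, a declaration of this kind might look like the self-contained sketch below. Only LUA_PLATFORM_LIBS_ROM, the _ROM macro and the meaning of its 3rd parameter come from this page; the module names, the AUXLIB_ constants, the meaning assumed for the first two parameters and the two consumer tables are hypothetical. The sketch also shows one plausible reason for leaving out commas: the code that consumes the list defines _ROM itself and lets each expansion supply its own separator.

```c
#include <stddef.h>
#include <string.h>

typedef struct lua_State lua_State;              /* opaque stand-in */
typedef struct { const char *key; } sk_entry;    /* stand-in for a map element */

/* Hypothetical modules: an open function and a definition array each. */
int luaopen_pio(lua_State *L) { (void)L; return 0; }
int luaopen_tmr(lua_State *L) { (void)L; return 0; }
const sk_entry pio_map[] = { { NULL } };
const sk_entry tmr_map[] = { { NULL } };

#define AUXLIB_PIO "pio"
#define AUXLIB_TMR "tmr"

/* The list itself: entries are simply juxtaposed, NOT separated by commas. */
#define LUA_PLATFORM_LIBS_ROM \
  _ROM(AUXLIB_PIO, luaopen_pio, pio_map) \
  _ROM(AUXLIB_TMR, luaopen_tmr, tmr_map)

/* Consumer 1: a table of definition arrays. Each _ROM expansion carries
   its own trailing comma, so a comma in the list would produce ", ,". */
typedef struct { const char *name; const sk_entry *map; } sk_rom_lib;
#define _ROM(name, openf, map) { name, map },
const sk_rom_lib sk_rom_libs[] = { LUA_PLATFORM_LIBS_ROM { NULL, NULL } };
#undef _ROM

/* Consumer 2: the same list expanded differently, into open functions. */
typedef struct { const char *name; int (*open)(lua_State *); } sk_open_lib;
#define _ROM(name, openf, map) { name, openf },
const sk_open_lib sk_open_libs[] = { LUA_PLATFORM_LIBS_ROM { NULL, NULL } };
#undef _ROM
```

Because the list is expanded more than once with different definitions of _ROM, the same platform_conf.h entry can feed both tables without being written twice.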

(IMPORTANT NOTE: the fact that there are no commas between two different _ROM declarations (as seen above) is NOT an error;
on the contrary, this is intentional. Try using commas and you'll get in trouble very soon :) ).

Note the 3rd parameter of the _ROM macro, which is the name of the definition array for the (ro)table. That's it. The code in linit.c will take
care of everything else, including initializing the list of modules in LUA_PLATFORM_LIBS_ROM with regular tables instead of rotables at optram=0
(to maintain compatibility with regular Lua). You can also have a list of modules that you want to use with regular tables no matter what the
optimization level is. In that case, list them in the LUA_PLATFORM_LIBS_REG macro via the old syntax for LUA_PLATFORM_LIBS, as shown
above (the regular Lua syntax for defining a module to be registered with luaL_register). If you want such a module to use lightfunctions instead of
regular functions (at optram=1), use luaL_register_light instead of luaL_register.