(I guess I better post something here because it’s been a while.) So over the last two weekends I made a simple oFono client in javascript, meaning that it’s browser-based or “web-based”. To do that I needed a way to talk to D-bus over HTTP. I’ll try to set up a demo instance of the client later, but for now I’ll just mention the HTTP to D-bus gateway. Even though the whole thing is a hack, maybe the gateway will be useful to someone. It’s also possible that there are already fifteen similar programs out there; I haven’t really checked.

The idea is rather simple: it’s a 10-kilobyte python script called gateway.py. You run it in some directory and it starts a primitive web server on a given port using python’s built-in http library, serving the files from the current directory and its subdirectories. It also understands a couple of fake requests that enable web applications to talk to D-bus: it connects to the system bus and relays messages to and from D-bus services using the following three types of (GET) requests:

/path/to/object/Interface.Method(parameters) – This makes a regular D-bus call to a given method on a given interface of an object. It’s synchronous and the HTTP response will contain the D-bus response written as JSON. The D-bus types correspond very neatly to JSON types so the response is easy to use in javascript on the web.

/path/to/object/Interface.Signal/subscribe/<n> – This subscribes the application to a given D-bus signal. Applications identify themselves with a number (<n>); this can be any integer, but it should be reasonably unique, for example a random number generated when the application loads.

/idle/<n> – This waits for any signal that application <n> has subscribed to. When one arrives, its arguments are sent to the client, again as JSON, in the HTTP response. This way the browser keeps a socket open to the server and signals are delivered over it.

Here are some example calls to make it clearer, along with their return values:
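For instance, the request URLs can be sketched with a small Python helper (the ModemManager interface and the method and signal names below are illustrative oFono names from memory; check your oFono version for the real ones):

```python
# Sketch: the three kinds of gateway requests as URL strings.  The
# oFono object paths, interfaces and methods below are illustrative;
# check your oFono version for the real names.

def method_url(object_path, interface, method, *args):
    """A synchronous D-bus method call:
    /path/to/object/Interface.Method(parameters)"""
    params = ",".join(repr(a).replace("'", '"') for a in args)
    return "%s/%s.%s(%s)" % (object_path.rstrip("/"), interface,
                             method, params)

def subscribe_url(object_path, interface, signal, app_id):
    """Subscribe application <app_id> to a D-bus signal."""
    return "%s/%s.%s/subscribe/%d" % (object_path.rstrip("/"),
                                      interface, signal, app_id)

def idle_url(app_id):
    """Long-poll for any signal application <app_id> subscribed to."""
    return "/idle/%d" % app_id

# A GET to each of these would go to the gateway, which answers with
# the D-bus reply encoded as JSON:
print(method_url("/", "ModemManager", "GetProperties"))
print(subscribe_url("/", "ModemManager", "PropertyChanged", 12345))
print(idle_url(12345))
```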

It’s easy enough to write a little javascript class that hides the HTTP machinery away, so you can make plain js calls, get return values back and have handlers invoked for the signals. And since ajax doesn’t sit blocked waiting for an HTTP response, your application doesn’t become synchronous in any way.

You’ll notice that the interface names are shortened to just the last part of the name. Since the part before the last dot is usually the same as the service name, you can skip it and it’ll be added automatically. So you can write either /org.ofono.ModemManager or just /ModemManager.

To check out the repository, run:

git clone http://openstreetmap.pl/balrog/webfono.git

It’s python 3 and uses the D-bus and glib bindings, so getting these dependencies installed may be a bit of a challenge at this point.

So anyway, here’s a cheap trick I came up with, though you might already know it. If you’re indexing any georeferenced data, such as when doing fun stuff with OpenStreetMap data, you’ve probably wanted to index by location among other things, and location is two- or three-dimensional (without loss of generality, assume two, as in GIS). Obviously you can combine latitude and longitude into one key and index by that, but that’s only good for looking up exact pairs of values. If your index is for a hash table you can’t hope for anything more, but if it’s for sorting an array you can do a little better. Here’s the trick: convert the two numbers to fixed point and interleave their bits to make one number. This is better because two positions that are close to each other in an array sorted by this number are probably also close to each other on the map. You could use floating point too, if you stuff the exponent into the most significant bits, and get a somewhat similar result. With fixed point you can then compare only the top couple of bits when searching the array, to locate something with a desired accuracy.

Converting to and from the interleaved-bits form is straightforward: you can easily come up with an O(log(number of bits)) procedure (5 steps for 32-bit lat/lon) or use lookup tables, as suggested by the Bit Twiddling Hacks page, where I learnt these are called Morton numbers. 32-bit lat/lon gives you a 64-bit number, and that should be accurate enough for most uses if you map the whole -90 to 90 / -180 to 180 deg range to integers. Even 20-bit lat/lon (5 bytes for the index) gives you 0.0003 deg accuracy.
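The O(log n) conversion can be sketched like this in Python, assuming 32-bit coordinates packed into a 64-bit Morton number (the magic masks are the standard Bit Twiddling Hacks constants):

```python
M = 0x5555555555555555  # mask of the even bit positions

def spread(x):
    """Spread the 32 bits of x over the even bit positions of a
    64-bit word, in 5 shift-and-mask steps (O(log n) in the bits)."""
    x &= 0xFFFFFFFF
    x = (x | (x << 16)) & 0x0000FFFF0000FFFF
    x = (x | (x << 8))  & 0x00FF00FF00FF00FF
    x = (x | (x << 4))  & 0x0F0F0F0F0F0F0F0F
    x = (x | (x << 2))  & 0x3333333333333333
    x = (x | (x << 1))  & M
    return x

def compact(x):
    """Inverse of spread(): gather the even bits back into 32 bits."""
    x &= M
    x = (x | (x >> 1))  & 0x3333333333333333
    x = (x | (x >> 2))  & 0x0F0F0F0F0F0F0F0F
    x = (x | (x >> 4))  & 0x00FF00FF00FF00FF
    x = (x | (x >> 8))  & 0x0000FFFF0000FFFF
    x = (x | (x >> 16)) & 0x00000000FFFFFFFF
    return x

def interleave(lat, lon):
    """Morton number: lat in the even bits, lon in the odd bits."""
    return spread(lat) | (spread(lon) << 1)

def deinterleave(m):
    return compact(m), compact(m >> 1)
```

The lookup-table variant trades the five steps for one table index per byte.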

So what else can you do with this notation? Obviously you can compare two numbers and use bisection search in arrays or in the various kinds of trees. You cannot add or subtract them directly (or rather, you won’t get useful results), but you can add or subtract the individual coordinates without converting to normal notation and back, here’s how:
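The per-coordinate arithmetic can be sketched like this in Python (spread() is a slow reference converter, only for the demo; the result of sub_even() is deliberately left unmasked):

```python
FULL = (1 << 64) - 1
M = 0x5555555555555555          # the even bit positions (one coordinate)

def spread(v):
    """Slow reference converter: put the 32 bits of v on the even
    bit positions of a 64-bit word."""
    r = 0
    for i in range(32):
        r |= ((v >> i) & 1) << (2 * i)
    return r

def add_even(a, b):
    """Add the even-bit coordinates of two Morton numbers.  Filling
    the odd bits of a with ones lets each carry ripple straight
    across them to the next even bit; the final mask drops the
    carry residue."""
    return ((a | (~M & FULL)) + (b & M)) & M

def sub_even(a, b):
    """Difference of the even-bit coordinates, left unmasked: the
    borrows propagate through the zeroed odd bits on their own, and
    the caller masks the result with M afterwards."""
    return (a & M) - (b & M)
```

For the coordinate stored in the odd bits, shift the masks left by one.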

The result is signed two’s complement with two sign bits in the top bits.

Now, something much less obvious: if you want to calculate the absolute difference, you can call abs() directly on the result of the subtraction and only mask out the unused bits afterwards. How does this work? The top bit of (ax - bx) always equals the sign bit, even if ax and bx only use the even bits (the top bit being odd), so this part is fine. If the number is positive, there’s nothing to do. If it’s negative, then abs negates it again (strips the minus). Conveniently, -x equals ~(x - 1) in two’s complement, so let’s see what these two operations do to a negative (ax - bx). The ~ (bitwise negation) just works, because it inverts all bits, including the ones we’re interested in. The x - 1 part also works, because it flips all the bits from the lowest bit up to the first 1 bit, and you’ll find, although it may be tricky to see, that the lowest set bit in (ax - bx) is always at an even position (or always at an odd one, for the other coordinate).
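A quick Python check of the abs() shortcut (spread() is again a slow reference converter; Python integers are arbitrary-precision, so abs() here stands in for the two’s-complement negation):

```python
M = 0x5555555555555555

def spread(v):
    # slow reference bit-spreader, for the demo only
    r = 0
    for i in range(32):
        r |= ((v >> i) & 1) << (2 * i)
    return r

def abs_diff_even(a, b):
    """abs() first, mask the unused (odd) bits afterwards."""
    return abs((a & M) - (b & M)) & M

for ax, bx in [(5, 2), (2, 5), (123456, 654321), (0, 1)]:
    assert abs_diff_even(spread(ax), spread(bx)) == spread(abs(ax - bx))
```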

There’s a set of macros for gdb, described in a comment on this page, that will let you attach to a running python program using gdb and inspect its python call stack and python objects using the familiar interface of gdb. I’m a complete stranger to python and couldn’t figure out how to enable the python debugger, and it would get me lost even if I managed to enable it. Additionally I was trying to find out when and why a python program uses a particular syscall and I’m not sure the python debugger can help with this. For the record that python program blocks all signals so I couldn’t just send it a signal and have it print the stack.

I’m wondering if you can do the same thing with Java, and who’ll be the first to implement the gdb macros. I’ve not coded java for years but it makes me want to have a look at it again considering there’s source code for it now (I just wish I had the time). How about swi-prolog?

Practical note: for this to work you will need to rebuild python with debug information. If you’re on Gentoo, whose default package manager uses python, and you still have python2.4 installed, then if you screw up your python2.5 installation you can revive emerge by running it explicitly with python2.4 (python2.4 /usr/bin/emerge blah blah). To rebuild python with custom options, edit /usr/portage/dev-lang/python/python-2.5.2-r8.ebuild to add --with-pydebug, and run ebuild /usr/portage/dev-lang/python/python-2.5.2-r8.ebuild digest unpack. Then edit /var/tmp/portage/dev-lang/python-2.5.2-r8/work/Python-2.5.2/Objects/unicodeobject.c to remove the assert on line 372, which seems to be a typo, and run ebuild /usr/portage/dev-lang/python/python-2.5.2-r8.ebuild compile install qmerge to let it finish. You may need to re-emerge some of the packages that have installed into /usr/lib/python2.5/site-packages for your program to work again.

Background: I try lots of new things when I cook, and while most of my experiments fail miserably, some come out well enough that I even repeat them, so I thought I could share (open-source) one of these results; this is an attempt. But open-sourcing food, strangely, isn’t so easy, because the “code-base” is very ugly: everything is an undocumented hack (or “spaghetti code”) and needs to be documented. My optimisation flags are always set for minimising the number of dishes to wash and the cost of ingredients.

Gazpacho: gazpacho is a Spanish dish (or drink) originating from Andalusia. It’s made mainly of vegetables, is almost liquid, is consumed cold and doesn’t involve cooking. It’s eaten in the summer only, especially on hot days. (At latitudes higher than Spain’s, there’s the added practical argument that vegetables are about 3 times cheaper in the summer.)

There are a zillion types of gazpacho and some Spaniards are very religious about preparing it, especially those who write cookbooks. Every region has its own type, but there’s also the generic type that you can buy in supermarkets or at McDonald’s.

Now, I’m not Spanish and I allow myself to break some of the rules. If you’re Spanish, stop reading here, because you’ll find that I’m committing various terrible crimes against Mediterranean cuisine.

The recipe: today I completed a quest for all the ingredients and made a gazpacho again, and it turned out edible again, so it must not be extremely difficult. Here’s the list.

0.5 to 1kg of tomatoes (canned tomatoes will also do, even a box of juice – here’s where I commit the first crime).

2 red peppers/paprikas, optionally add one green.

2 or so slices of bread.

one half bulb of garlic (maybe less).

a medium-size onion.

a cucumber.

1 tbsp or so of salt, some pepper (or none).

half a glass of oil (here’s my second crime: use any oil, it really doesn’t matter that much. It appears that if you’re Spanish, a single drop of oil that is not of the absolute highest quality immediately spoils your dinner. You will never ever see or hear the word oil (aceite) go alone when you’re in Spain – it is in 100% of cases accompanied by the words virgen and extra, as in aceite de oliva virgen extra – it might as well be a single word in the vocabulary because it always appears together; I don’t think anyone is even able to pronounce aceite alone).

4 tbsp or so of vinagre and half a glass of water.

Place the bread on a plate with water and let it soak a bit. Cut all the vegetables into pieces of sizes that will make your electric blender happy. The peppers and tomatoes are fine as they are (another crime!) – a real gazpacho recipe tells you to peel them and remove the seeds, but the seeds are the best part: they’ll get blended anyway and they’ll just make the texture nicer, and peeling is just too much work. Blend the bread, vegetables and water until liquid. Then throw in the oil, vinagre and salt and mix again until the oil can’t be seen.

The colour is between orange and pink and is most influenced by the red peppers. The taste is most affected by the salt and vinagre and you need to adjust their amounts but it will probably be a lot of vinagre and a lot of salt. Too much onion or garlic makes the gazpacho spicy but at some point the smell is too strong.

When it’s done, just store in a fridge and serve cold optionally with pieces of toasted bread.

I was going to make a small trip this weekend but I missed my plane and have to wait until next week. But that means I already have a good excuse for not spending the weekend studying for this week’s exams and I have finally put the time into making gllin behave under Schwartz.

Gllin is a closed-source driver for the Global Locate (now Broadcom) GPS chip known as Hammerhead. It’s been said it didn’t work when the folks compiled it for ARM EABI (i.e. what is used on most ARM systems currently), so they only released the OABI binary (OABI being the ad-hoc ABI that was used on Linux until ARM came up with a standard ABI and hired people to implement it). So the downloadable gllin package comes with an OABI rootfs which will run under chroot if you have OABI support in your kernel. It seemed wrong to me to have a second rootfs on my phone just to run a single program, and that approach has several other drawbacks.

With the Schwartz loader/linker you can run OABI-compiled programs natively on Linux systems that use different ABIs. This is achieved through translation of library calls, which I mentioned previously. Schwartz is by no means complete, and more than anything it’s a proof-of-concept, but it seems to be usable, and today my Neo1973 had an actual 3D fix and gave me real coordinates as well as satellite time/date and other info. I took my Neo for an excursion to the shopping mall (not so much to show off, but) to make my first GPS trace for OpenStreetMap. It ran quite stably for the whole 2h and I uploaded the trace here. So here’s how to use it.

Download the schwartz binary from here or here (minimal version). The sources are in this git tree, but building them is not exactly straightforward. Upload the file to your Neo1973 (or qemu-neo1973). Also upload the gllin binary if you don’t have it there already. In the openmoko package the binary is named gllin.real, because gllin is a wrapper script that runs the whole chroot thing. You only need the “.real” binary. You can also safely leave OABI support out of your kernel. Next, make the named pipe for your NMEA data, the same way the openmoko package does. After that we’re ready to run gllin and then your favourite gps software.

You can modify the scripts from the package to do all that. ld4 is quite verbose and will print lots of stuff to the console, which just shows how far it is from completeness. The minimal ld4 differs from the full binary in that the “strace” code is not compiled in. With the full binary, if you append --trick-strace to the command line options you will get an strace-like (but prettier!) log of all functions being called and their parameters. This may potentially be useful for the folks reverse-engineering the Hammerhead protocol, but I’m not really sure. In the ld4 output you can see a lot of debugging messages and other output that gllin doesn’t normally print. I have not noticed any anomalies when running gllin under Schwartz, but it’s entirely possible that the floating-point precision is reduced or something else is broken. gllin is a pretty tough test case for the ABI translation thing for various reasons: all the floating-point arithmetic, heavy usage of memory/files/sockets, C++ libraries, C++ exceptions, real-time constraints and more.

Among the other things schwartz enables is running gllin without root privileges (chroot normally requires them). It’s also interesting to compare the strace (the real, traditional strace) output of gllin running under a chroot with an OABI-compiled libc against the strace output of the same gllin running under schwartz and using the EABI libc. You’ll see two different sequences of syscalls being made, but with pretty much the same end effect.

I probably won’t have time to hack on schwartz further, but improvements from others are welcome. I just wish I had the thing running earlier – ironically, I already have a GTA02 on my desk, and the GTA02 has a different GPS chip in it which needs no driver on the OS side. There’s very little time left until mass-production and selling of the GTA02 starts and gllin slides into oblivion. (It seems that the TomTom Go is using the same or a similar driver, though.)

First, why would we want to do that? Most architectures have a single popular ABI accepted by the kernel and supported by binutils; on Linux this is usually the ABI defined by System V R4. This is the case for i386. x86-64 also has a single standard ABI based on the i386 one, but it’s not a System V standard, because System V doesn’t seem to have one for x86-64 yet. The ARM case is different because there is more than one ABI in use, and you can get a mismatch when pairing user-space and kernel images, or libraries, for a program. The older, unstandardised one is called OABI, and Schwartz can (attempt to) translate between OABI calls issued by an OABI-compiled program and whatever ABI the host uses. This is enabled automatically when an OABI executable is detected, no command line switch needed.

Why does it seem this hasn’t been done before? Because it’s non-trivial. Currently people resort to keeping an entire OABI rootfs in a subdirectory of the host rootfs and chrooting into it whenever they need to run an OABI binary on a system that uses EABI.

Why is it non-trivial, and how does Schwartz do it? In a nutshell: if an executable is compiled with a different ABI than the host, we need to translate everything that’s being passed between the program and the libraries it uses, and the format of this interaction is precisely what ABIs define. (This assumes the executable is dynamically linked and issues no syscalls directly – otherwise only the syscalls would have to be translated, which cannot be done in user-space, so we’re not concerned with that case.) Two types of interaction occur that I know of: through data and through control. Control is always passed to and from libraries in the same way, through jumps a.k.a. branches, and there isn’t any room for differences between ABIs there, so we’ll concentrate on the data. Data is passed on various occasions. I will divide all the data interaction into three parts:

1. Static chunks of data shared between program and library. This means mainly global variables, in C terms. The format of a variable depends on its type and the ABI. The most basic types are always encoded the same way, while data types constructed of sub-elements, like structs, have a format governed by the ABI. The ABI usually specifies how elements are packed inside an object, and there may be important differences between ABIs. Fortunately, global objects are not usually shared by libraries, and those that are shared are almost always of simple types, so we don’t perform any translation. It would in any case be very difficult, because we would have to react to every access to such variables, and in some cases it is completely impossible, for example for C union types: the data has more than one interpretation there, and we can’t tell which interpretation is used in which access.

2. On program entry. Entry happens only once, when control is passed to the program at start, and it is accompanied by some data being passed too (for example, the command line arguments). This part is easy because we can have a separate entry for each ABI, and some ABIs just don’t specify any requirements for the entry point (this is the case for OABI and EABI, and the Linux implementation is exactly identical for both). So currently there’s only one main() call per architecture in Schwartz.

3. On function calls. This is responsible for the biggest part of the ABI translation in Schwartz. A function call between a program and a library is accompanied by data passed both ways: from caller to callee in the call arguments, and from callee to caller in the return value. We will see below that a library can be both a callee and a caller, for different functions. Function parameters, as well as return values, can be passed differently depending on the ABI. The ABI usually specifies when and which parameter values (or parts of them) are passed in registers (of the CPU or FPU), which are marshalled on the stack, and possibly which are passed as pointers. They can also have different types, ranging from simple to compound, where the packing is important again, as it was in 1.

How does Schwartz handle function calls across ABIs? We simply generate a wrapper for every library function that we suspect may be used, and we resolve function symbols to our wrappers instead of the original functions. Again, this is not a generic solution if we want to load arbitrary executables, but in practice it’s good enough. If there is an executable that uses symbols we don’t have a wrapper for, we can easily add information about the new function and recompile. The information is generated automatically from the system headers and a list of symbol names (and the list is in turn extracted automatically from a list of executables). Such a wrapper accepts parameters in the program’s ABI format, adapts them to the library ABI if needed, and calls the real function, passing the same parameters but now in the library’s ABI. The same has to be done with the return value, just in the reverse order.

But here’s the trick: a function pointer is also a data type, so it can be passed as a parameter or a return value of a library function, and we have to handle it very carefully. Example library functions that take a function pointer as a parameter are signal(), qsort() or __libc_start_main() (specified in the Linux Standard Base). An example of a function that returns a function pointer is signal() again. So how do we handle translation of the function pointer data type? We have to generate a wrapper for every function-pointer value passed, and since different values may be passed in successive calls to the same function, we have to do it dynamically at run-time, for every value separately. Fortunately there’s only a finite number of such values, because the only valid values are those that point at functions in the program (plus, optionally, NULL, which we pass through intact), and there is a finite number of functions; they aren’t generated dynamically. Now, the wrappers will be of two types: those for parameters and those for return values. To see the difference between the two, let’s look at what the callee can do with the value it is passed in a parameter, and what a caller can do with a value returned from a call. It can do two things:

1. It can make a call to the function pointed to by the function pointer. If we’re a callee and we got a function pointer in a parameter, we will want to make the call in our ABI, while the function was passed from the caller, so it expects parameters in the caller’s ABI, and we need translation again. But this time the callee (we) becomes a caller, and the target of the call is a function passed from the other ABI, so the translation needs to go in the reverse direction. If we are the library and the caller was the program, we now need a wrapper that translates from the library ABI to the program’s ABI. The converse case is easier: we’re now the caller, we called a function and it returned another function pointer. The function which is pointed at will expect parameters in the callee’s ABI, so the translation occurs in the “same direction” as before.

2. It can remember the value somewhere, and the value can later be returned or passed as a parameter back to the other side. Since the function pointer is a value we got in a return or in a parameter, we know that it is already wrapped appropriately by Schwartz. But we are now passing it back to the other side, precisely where it came from. If we follow the logic from 1. we will be unnecessarily wrapping it again (wrapping the wrapper) in a translator of the opposite direction. Schwartz has to notice the double wrapping and “annihilate” the two translators, passing the original pointer instead, in order to rule out the possibility of DoS’ing ourselves by generating an infinite series of wrappers. To see this better, here’s an example of when this happens in a C piece:

sighandler_t original_handler; /* Function pointer */
...
/* Let's set up a handler for SIGUSR1 */
original_handler = signal(SIGUSR1, &my_sigusr1_handler);
/* An external function is being returned; it is
   wrapped in an ABI translator, so that we can
   safely call it (but we don't in this example). */
...
/* Let's restore the original handler */
signal(SIGUSR1, original_handler); /* The wrapped external function is
                                      being passed as a parameter; normally
                                      it would be wrapped again so that the
                                      callee can safely call it. But instead
                                      we "unwrap" it and get the same
                                      effect. */

The bottom line of 1. is that if we decide to do ABI translation from ABI X to Y, we occasionally have to translate from Y to X as well, so the two directions are tied together, and we have to be able to do both dynamically. The bottom line of 2. is that we also need to cache the pointers to the untranslated functions. If we add the fact that pointers can point at functions which themselves take or return function pointers (see man xdr_union(3)), that struct or array elements can be function pointers too, and that there can be a variable number of parameters of unknown types, we get a pretty complex task.
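The function-pointer bookkeeping from 1. and 2. can be modelled in a few lines of Python; this is only a conceptual sketch of the idea, not how Schwartz actually works (the real wrappers are generated trampolines in the machine’s calling conventions, not Python closures):

```python
# Conceptual model of the wrap/unwrap bookkeeping.  One cache maps an
# original function to its wrapper, the other maps a wrapper back to
# its original, so a pointer passed back where it came from is
# unwrapped ("annihilated") instead of being wrapped a second time.
wrapped_of = {}     # original -> wrapper
original_of = {}    # wrapper -> original

def translate_fnptr(fn):
    if fn is None:
        return None                  # NULL passes through intact
    if fn in original_of:
        return original_of[fn]       # already a wrapper: annihilate
    if fn in wrapped_of:
        return wrapped_of[fn]        # finite set of values: reuse
    def wrapper(*args):
        # a real wrapper would convert the arguments and the return
        # value between the two ABIs here
        return fn(*args)
    wrapped_of[fn] = wrapper
    original_of[wrapper] = fn
    return wrapper
```

In the signal() example above, the handler returned by the first call is a wrapper; passing it back into signal() hits the unwrap branch and restores the original, untranslated pointer.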

There’s another case of functions like dlsym() that return a-void-pointer-but-we-know-it’s-a-lie, for which we need a totally custom translator, but this is more easily doable.

It seems everyone needs to code at least one ELF loader of their own, so here’s mine. Schwartz is yet another ELF loader and linker, one that can do a couple of tricks other linkers can’t (names not included – any similarity is purely coincidental), like ABI translation. I started it when the gllin binary was released to the public in November but never had the time to finish it. It aims to be a generic linker not tied to any architecture or host ABI, but gllin was a good reason to start coding. My next couple of posts will be related to Schwartz as well, so you’d better be interested!

Schwartz doesn’t use the ELF interpreter mechanism like the ld-linux linker – it compiles to a normal user-space program that needs no special privilege level. Typically the user just runs the linker (the executable name is ld4) passing as a parameter the name of the executable to load and run. Supported architectures are at the moment x86-64, ARM and i386 (the last one untested).

For that to work we have to use tricks at every level, starting with the loader. Because every hack has its limits (that’s what makes it a hack), if you take The Schwartz code and try to extend it, you may hit one of those limits and see things stop working. There’s nothing inherently unfixable in it, but you may need to come up with a new hack.

The loader

Its task is loading the contents of an ELF executable into memory, at the right locations, where the ELF will feel especially comfortable. In other words, we construct the memory image of the program from the image in the executable file. This at first seemed like an easy task, because I had zero experience with ELF executables and my last experience with executables was from MS-DOS times, when all executables were relocatable. So, in my endless ignorance, I was thinking I’d just reserve a piece of memory, dump the contents there and relocate the code. Obviously this didn’t work, because it turns out operating systems stopped using relocatable binaries for normal programs about twenty years ago, while I wasn’t paying attention. So to make the program feel at home you have to place the code at the exact addresses it wants.

To run fully in user-space we use a linker script that moves our own code to a non-standard location in the memory image, so that the standard location becomes free and we can load the executable there. Such a linker script can pretty much be generated automatically for every platform. Obviously, the target executable could itself have used a linker script and chosen an address colliding with our non-standard addresses. In this case the dungeon collapses and we don’t support such executables. The user has to go and modify the script (which is fairly trivial) to be able to run such an executable. The user can even go further and support only a single executable, linking ld4 with her target program into one file, if she only wants to take advantage of (say) the ABI translation feature for this single program.

By doing that we have both programs in a single memory space / single process, happily coexisting, and we gain one interesting feature: if we attach a debugger to the process, we will have the symbols from both executables in place. This means we can load the debug info for either of the programs into the debugger, and the debugger will see the symbols in the right places and not get confused. In GDB you can switch the debugged binary at runtime without detaching from the process.

Linker

The linker is used only for dynamic executables. It looks at the list of symbols from external libraries that are used by our target program and resolves each of them by loading the necessary library and finding the symbol. Again, we have both programs (ld4 and the target) in a single process, so we can share the libraries instead of loading them twice. I use libdl for external symbols rather than resolving them manually, but there’s no reason Schwartz couldn’t recursively load the libraries as well. Currently we support only a very small subset of the defined relocation types, but this seems to be more than enough for programs built with binutils (i.e. all programs).

Because we control what we resolve every symbol to, we can override the library symbols with our own when we want. This allows us to play different kinds of tricks on the program.

One such trick is strace-like tracing of the calls made by the program to library functions. I’ve implemented that for most of the <string.h> calls as an example; this functionality is turned on with the --trick-strace switch.
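The idea can be modelled in Python (a conceptual sketch only; ld4 does this by resolving the program’s ELF symbols to C wrappers, not by decorating functions):

```python
def traced(fn):
    """Resolve a symbol to this wrapper instead of the real function,
    and every call gets logged with its arguments and return value."""
    def wrapper(*args):
        result = fn(*args)
        print("%s(%s) = %r" % (fn.__name__,
                               ", ".join(repr(a) for a in args),
                               result))
        return result
    return wrapper

# pretend we resolved the program's calls to len() to a traced wrapper:
strlen = traced(len)
strlen("hello")     # prints: len('hello') = 5
```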

Another feature is a fake chroot, done by simply mangling the path strings passed back and forth between the program and the libraries. This is of course not as secure as a real chroot if you allow arbitrary executables, because an executable may use libraries or library functions that we haven’t provided a wrapper for, or use syscalls directly. However, it has the advantage that any user can use it, while a normal chroot requires root privileges. This is enabled with --trick-chroot <path>.
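The path mangling can be sketched like this (hypothetical helper names and root directory; the real thing would hook the path-taking library calls such as open() and stat() in the wrappers):

```python
import posixpath

ROOT = "/home/user/oabi-rootfs"    # hypothetical --trick-chroot argument

def mangle(path):
    """Rewrite an absolute path used by the program into a path under
    ROOT.  Normalising first keeps '..' from escaping the fake root."""
    return ROOT + posixpath.normpath("/" + path.lstrip("/"))

def unmangle(path):
    """Rewrite a path coming back from a library (e.g. getcwd())."""
    if path == ROOT or path.startswith(ROOT + "/"):
        return path[len(ROOT):] or "/"
    return path
```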

Yet another trick could be a user-space implementation of a poor man’s debugger, with the capability to set breakpoints, inspect data, etc., but perhaps not watchpoints (at least not easily) and other fancies. I’m not implementing this.

And yet another trick based on overriding library symbols is C++ exception model translation and ABI translation. More about this in the next post. Look out!