
Most VMs, such as the JVM and the .NET runtime, use a stack-based approach to interpret code. Some, however, have started using register-based instructions, because programs tend to need fewer instructions and the operands are explicit rather than implied by the stack.
One reason I'm interested in this is that it would make JIT compilation of my scripts (and perhaps even full compilation) easier. However, I don't know a good way of doing it.
Today I tried something that 'simulates' stack-based instructions using registers, treating the register file as if it were a stack. Register 1 is the bottom, register 2 is above that, etc.
Here's the code (Python):

Give it a shot.
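For anyone following along, the "register file as a stack" idea can be sketched roughly like this. This is not the code from the post, only a minimal illustration; the tuple-based instruction format and the name `stack_to_registers` are assumptions made up for the example.

```python
def stack_to_registers(stack_code):
    """Translate stack bytecode into register instructions, treating
    register N as stack slot N (register 1 is the bottom of the stack).
    The instruction format here is hypothetical."""
    out = []
    depth = 0  # number of values currently on the simulated stack
    for op, *args in stack_code:
        if op == "push":
            depth += 1
            out.append(("load", f"r{depth}", args[0]))
        elif op in ("add", "sub", "mul", "div"):
            if depth < 2:
                raise ValueError(f"{op} needs two operands on the stack")
            # Operands sit in the top two slots; the result replaces them.
            out.append((op, f"r{depth - 1}", f"r{depth - 1}", f"r{depth}"))
            depth -= 1
        else:
            raise ValueError(f"unknown op: {op}")
    return out


# (2 + 3) * 4 written as stack code
program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
result = stack_to_registers(program)
for instr in result:
    print(instr)
```

The register number is always just the current stack depth, which is why dependencies are trivial to track with this scheme: a value is dead as soon as it is "popped".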
However, we learned in our computing systems lecture that this is just the easy way, and that 'real' compilers do it differently. My way IS easy because it works like a stack, which makes dependencies simple to deal with.
The 'other' way seems to produce code whose structure is completely different from that of the source, and seems to make dependencies trickier to handle.
Does anyone know a good method of doing this?
Thanks


You've still got more of a bytecode interpreter than a compiler there, since you're mapping high-level operations pretty much one-to-one onto instructions. A real compiler typically has to decompose a single high-level statement into several low-level instructions.
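To make the decomposition point concrete, here is a rough sketch of lowering one assignment into several three-address instructions that use an unlimited supply of temporaries. The nested-tuple expression format, the `t1`, `t2`, ... temporary names and the `mov` instruction are all assumptions invented for the example, not anything from your VM.

```python
import itertools


def lower(expr, code, temps):
    """Lower a nested (op, lhs, rhs) expression into three-address
    instructions appended to `code`. Returns the name (or constant)
    that holds the expression's value."""
    if isinstance(expr, (int, str)):
        return expr  # constant or variable: nothing to emit
    op, lhs, rhs = expr
    a = lower(lhs, code, temps)
    b = lower(rhs, code, temps)
    dest = f"t{next(temps)}"  # fresh temporary for this sub-result
    code.append((op, dest, a, b))
    return dest


def lower_assign(target, expr):
    """Lower `target = expr` into a list of low-level instructions."""
    code = []
    temps = itertools.count(1)
    value = lower(expr, code, temps)
    code.append(("mov", target, value))
    return code


# x = a * b + c * d
instructions = lower_assign("x", ("add", ("mul", "a", "b"), ("mul", "c", "d")))
for instr in instructions:
    print(instr)
```

One source statement becomes four instructions here, and the temporaries still have to be mapped onto whatever registers actually exist.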

Instead of being able to use as many registers as you like, you're typically limited to a fixed number of them. (For a real machine that number is down to the hardware; for your own VM you get to choose it.) The compiler then decides which registers are free at each instruction and allocates them accordingly in the generated code. (It's possible to run out of registers, but in that case you can always spill values onto the stack.) These algorithms are supposedly quite complex. Maybe the Wikipedia page on register allocation will give you some pointers.
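As a rough idea of what such an allocator does, here is a much-simplified sketch in the spirit of linear-scan allocation. It assumes the live range of each value is already known as a (start, end) pair of instruction indices, and it spills a whole value rather than splitting its range; the function name and data layout are made up for the example.

```python
def linear_scan(intervals, num_regs):
    """Assign registers to values given their live ranges.

    intervals: {name: (start, end)}, inclusive instruction indices.
    Returns {name: "rN"} or {name: "spill"} for values kept on the stack.
    """
    free = [f"r{i}" for i in range(num_regs, 0, -1)]  # pop() yields r1 first
    active = []  # (end, name) pairs currently holding a register
    where = {}
    ordered = sorted(intervals.items(), key=lambda kv: kv[1][0])
    for name, (start, end) in ordered:
        # Release registers whose values are dead before this point.
        for item in [a for a in active if a[0] < start]:
            active.remove(item)
            free.append(where[item[1]])
        if free:
            where[name] = free.pop()
            active.append((end, name))
            continue
        # No register free: spill whichever live value ends last.
        victim = max(active, default=None)
        if victim is not None and victim[0] > end:
            where[name] = where[victim[1]]  # take over the victim's register
            where[victim[1]] = "spill"
            active.remove(victim)
            active.append((end, name))
        else:
            where[name] = "spill"
    return where


allocation = linear_scan(
    {"a": (0, 5), "b": (1, 2), "c": (3, 4), "d": (3, 6)}, num_regs=2
)
print(allocation)
```

With two registers, `b` and `c` can share one because their live ranges don't overlap, while `d` has to live on the stack. A real allocator would also emit the load and store instructions for spilled values.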

Sometimes register choice is further complicated since instructions are hard-wired (literally, at least in older hardware) to use certain registers, meaning the compiler sometimes has to shuffle things around to get them to the correct register. This is probably less of a problem on RISC architectures than on CISC and not a problem at all on soft architectures such as a bespoke VM because you can simply avoid writing such instructions should you so choose.

The compiler also has to generate instructions to save and restore registers around function calls, so that the called function doesn't overwrite values the caller still needs. You could save and restore every register on every call, but strictly a function only has to preserve the registers it actually uses and restore them before it returns, so there is scope for optimisation there.
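The "only preserve what you use" optimisation might look something like the sketch below. It assumes a made-up calling convention in which r4 to r6 must be preserved by the called function, and instructions of the form (op, dest, sources...); the `push`, `pop` and `ret` instruction names are also hypothetical.

```python
def add_save_restore(body, callee_saved):
    """Wrap a function body with push/pop for only those callee-saved
    registers that the body actually writes to.

    body: list of (op, dest, *sources) tuples.
    callee_saved: registers the convention says a function must preserve.
    """
    written = {instr[1] for instr in body}
    to_save = [r for r in callee_saved if r in written]
    prologue = [("push", r) for r in to_save]
    # Restore in reverse order so the pops mirror the pushes.
    epilogue = [("pop", r) for r in reversed(to_save)]
    return prologue + body + epilogue + [("ret",)]


body = [
    ("load", "r4", 10),
    ("add", "r1", "r1", "r4"),
    ("mul", "r5", "r1", "r4"),
]
wrapped = add_save_restore(body, callee_saved=["r4", "r5", "r6"])
for instr in wrapped:
    print(instr)
```

Here r6 is never touched, so no save or restore is emitted for it, and r1 is left alone because this convention doesn't require it to be preserved.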

However, many of these issues are extra concerns that machine code compilers have to address in order to work with the specific hardware in question. Your VM doesn't have the same physical restrictions and therefore much of this complexity can be ignored.

I hope all of that's correct, as it's a long time since I last worked with compilers or assembly.