Friday, 30 December 2011

Some days ago, I finished the first public version of my audiovisual virtual machine, IBNIZ. I also showed it off on YouTube with the following video:

As demonstrated by the video, IBNIZ (Ideally Bare Numeric Impression giZmo) is a virtual machine and a programming language that generates video and audio from very short strings of code. Technically, it is a two-stack machine somewhat similar to Forth, but with the major execption that the stack is cyclical and also used at an output buffer. Also, as every IBNIZ program is implicitly inside a loop that pushes a set of loop variables on the stack on every cycle, even an empty program outputs something (i.e. a changing gradient as video and a constant sawtooth wave as audio).

How does it work?

To illustrate how IBNIZ works, here's how the program ^xp is executed, step by step:

So, in short: on every loop cycle, the VM pushes the values T, Y and X. The operation ^ XORs the values Y and X and xp pops off the remaining value (T). Thus, the stack gets filled by color values where the Y coordinate is XORed by the X coordinate, resulting in the ill-famous "XOR texture".

The representation in the figure was somewhat simplified, however. In reality, IBNIZ uses 32-bit fixed-point arithmetic where the values for Y and X fall between -1 and +1. IBNIZ also runs the program in two separate contexts with separate stacks and internal registers: the video context and the audio context. To illustrate this, here's how an empty program is executed in the video context:

The colorspace is YUV, with the integer part of the pixel value interpreted as U and V (roughly corresponding to hue) and the fractional part interpreted as Y (brightness). The empty program runs in the so-called T-mode where all the loop variables -- T, Y and X -- are entered in the same word (16 bits of T in the integer part and 8+8 bits of Y and X in the fractional). In the audio context, the same program executes as follows:

Just like in the T-mode of the video context, the VM pushes one word per loop cycle. However, in this case, there is no Y or X; the whole word represents T. Also, when interpreting the stack contents as audio, the integer part is ignored altogether and the fractional part is taken as an unsigned 16-bit PCM value.

Also, in the audio context, T increments in steps of 0000.0040 while the step is only 0000.0001 in the video context. This is because we need to calculate 256x256 pixel values per frame (nearly 4 million pixels if there are 60 frames per second) but suffice with considerably fewer PCM samples. In the current implementation, we calculate 61440 audio samples per second (60*65536/64) which is then downscaled to 44100 Hz.

The scheduling and main-looping logic is the only somewhat complex thing in IBNIZ. All the rest is very elementary, something that can be found as instructions in the x86 architecture or as words in the core Forth vocabulary. Basic arithmetic and stack-shuffling. Memory load and store. An if/then/else structure, two kinds of loop structures and subroutine definition/calling. Also an instruction for retrieving user input from keyboard or pointing device. Everything needs to be built from these basic building blocks. And yes, it is Turing complete, and no, you are not restricted to the rendering order provided by the implicit main loop.

The full instruction set is described in the documentation. Feel free to check it out experiment with IBNIZ on your own!

So, what's the point?

The IBNIZ project started in 2007 with the codename "EDAM" (Extreme-Density Art Machine). My goal was to participate in the esoteric programming language competition at the same year's Alternative Party, but I didn't finish the VM at time. The project therefore fell to the background. Every now and then, I returned to the project for a short while, maybe revising the instruction set a little bit or experimenting with different colorspaces and loop variable formats. There was no great driving force to insppire me to finish the VM until mid-2011 after some quite succesful experiments with very short audiovisual programs. Once some of my musical experiments spawned a trend that eventually even got a name of its own, "bytebeat", I really had to push myself to finally finishing IBNIZ.

The main goal of IBNIZ, from the very beginning, was to provide a new platform for the demoscene. Something without the usual fallbacks of the real-world platforms when writing extremely small demos. No headers, no program size overhead in video/audio access, extremely high code density, enough processing power and preferrably a machine language that is fun to program with. Something that would have the potential to displace MS-DOS as the primary platform for sub-256-byte demoscene productions.

There are also other considerations. One of them is educational: modern computing platforms tend to be mind-bogglingly complex and highly abstracted and lack the immediacy and tangibility of the old-school home computers. I am somewhat concerned that young people whose mindset would have made them great programmers in the eighties find their mindset totally incompatible with today's mainstream technology and therefore get completely driven away from programming. IBNIZ will hopefully be able to serve as an "oldschool-style platform" in a way that is rewarding enough for today's beginninng programming hobbyists. Also, as the demoscene needs all the new blood it can get, I envision that IBNIZ could serve as a gateway to the demoscene.

I also see that IBNIZ has potential for glitch art and livecoding. By taking a nondeterministic approach to experimentation with IBNIZ, the user may encounter a lot of interesting visual and aural glitch patterns. As for livecoding, I suspect that the compactness of the code as well as the immediate visibility of the changes could make an IBNIZ programming performance quite enjoyable to watch. The live gigs of the chip music scene, for example, might also find use for IBNIZ.

About some design choices and future plans

IBNIZ was originally designed with an esoteric programming language competition in mind, and indeed, the language has already been likened to the classic esoteric language Brainfuck by several critical commentators. I'm not that sure about the similarity with Brainfuck, but it does have strong conceptual similarities with FALSE, the esoteric programming language that inspired Brainfuck. Both IBNIZ and FALSE are based on Forth and use one-character-long instructions, and the perceived awkwardness of both comes from unusual, punctuation-based syntax rather than deliberate attempts at making the language difficult.

When contrasting esotericity with usefulness, it should be noted that many useful, mature and well-liked languages, such as C and Perl, also tend to look like total "line noise" to the uninitiated. Forth, on the other hand, tends to look like mess of random unrelated strings to people unfamiliar with the RPN syntax. I therefore don't see how the esotericity of IBNIZ would hinder its usefulness any more than the usefulness of C, Perl or Forth is hindered by their syntaxes. A more relevant concern would be, for example, the lack of label and variable names in IBNIZ.

There are some design choices that often get questioned, so I'll perhaps explain the rationale for them:

The colors: the color format has been chosen so that more sensible and neutral colors are more likely than "coder colors". YUV has been chosen over HSV because there is relatively universal hardware support for YUV buffers (and I also think it is easier to get richer gradients with YUV than with HSV).

Trigonometric functions: I pondered for a long while whether to include SIN and ATAN2 and I finally decided to do so. A lot of demoscene tricks depend, including all kinds of rotating and bouncing things as well as more advanced stuff such as raycasting, depends on the availability of trigonometry. Both of these operations can be found in the FPU instruction set of the x86 and are relatively fundamental mathematical stuff, so we're not going into library bloat here.

Floating point vs fixed point: I considered floating point for a long while as it would have simplified some advanced tricks. However, IBNIZ code is likely to use a lot of bitwise operations, modular bitwise arithmetic and indefinitely running counters which may end up being problematic with floating-point. Fixed point makes the arithmetic more concrete and also improves the implementability of IBNIZ on low-end platforms that lack FPU.

Different coordinate formats: TYX-video uses signed coordinates because most effects look better when the origin is at the center of the screen. The 'U' opcode (userinput), on the other hand, gives the mouse coordinates in unsigned format to ease up pixel-plotting (you can directly use the mouse coordinates as part of the framebuffer memory address). T-video uses unsigned coordinates for making the values linear and also for easier coupling with the unsigned coordinates provided by 'U'.

Right now, all the existing implementations of IBNIZ are rather slow. The C implementation is completely interpretive without any optimization phase prior to execution. However, a faster implementation with some clever static analysis is quite high on the to-do list, and I expect a considerable performance boost once native-code JIT compilers come into use. After all, if we are ever planning to displace MS-DOS as a sizecoding platform, we will need to get IBNIZ to run at least faster than DOSBOX.

The use of externally-provided coordinate and time values will make it possible to scale a considerable portion of IBNIZ programs to a vast range of different resolutions from character-cell framebuffers on 8-bit platforms to today's highest higher-than-high-definition standards. I suspect that a lot of IBNIZ programs can be automatically compiled into shader code or fast C-64 machine language (yes, I've made some preliminary calculations for "Ibniz 64" as well). The currently implemented resolution, 256x256, however, will remain as the default resolution that will ensure compatibility. This resolution, by the way, has been chosen because it is in the same class with 320x200, the most popular resolution of tiny MS-DOS demos.

At some point of time, it will also become necessary to introduce a compact binary representation of IBNIZ code -- with variable bit lengths primarily based on the frequency of each instruction. The byte-per-character representation already has a higher code density than the 16-bit x86 machine language, and I expect that a bit-length-optimized representation will really break some boundaries for low size classes.

An important milestone will be a fast and complete version that runs in a web brower. I expect this to make IBNIZ much more available and accessible than it is now, and I'm also planning to host an IBNIZ programming contest once a sufficient web implementation is on-line. There is already a Javascript implementation but it is rather slow and doesn't support sound, so we will still have to wait for a while. But stay tuned!

79 comments:

Looks like a tinier more esoteric parallell spin of a lingering project of mine, lyd, which primarily is my first audio hacking experiment but has an example where the generated signal is instead fed as video into a framebuffer shown with SDL. Lyd uses an in-fix expression parser whose AST is compiled to a data flow machine code. Great stuff and writeup :)

Why does each pixel depend on the contents of the stack from the last pixel? that makes ibniz extremely difficult/impossible speed up by parallelizing it, or converting the kernel to a GPU shader. If only the ibniz program depended only on the stack containing x,y,t and nothing else.

Pixel colors only depend on one another in special cases. Any optimizer or parallelizer for IBNIZ code should do an internal dependency check for the code in order to rule out what kind of optimizations are possible. Quite a lot of sane IBNIZ code should parallelize quite trivially (including most of the examples in the video).

Viz, any plans on adding fixed-cycle execution to your language/virtual machine? (Because without it, your language is only useful for democoding for size, not size and speed. Which may be your intention, but I don't know which is why I'm asking.)

Trixter: Yes, the support for a fixed execution speed is on the development roadmap (I haven't yet decided the default value, however). Still, IBNIZ is primarily a sizecoding platform and probably won't end up being very friendly to a lot of speed optimization tricks.

Im not a code junky, but i've been trying my hand at getting IBNIZ to run on my macbookpro. running leopard and X11 3.1.1 I'm stuck at updating MacPorts. Also I have no idea what im doing. If some really awesome coder could create a walk through I'd be very happy.

another Fractal variations (slow speed)\dw+arp&/ \Fractal Flower GardenThis is very beautiful on iPhone.

Javascript implementation is rather slow. However, it is possible to observe a complicated change in detail. In MS-DOS and Javascript, displays may differ, so that it does not seem to be the same codes. For example, it is this. \d*v*vv* \Striped pattern in MS-DOS and Tower of Chaos in Javascript

I'm just experimenting without knowing what I'm doing. I't be cool to have some sort of tutorial for *complete* beginners, including the c operators and how they work (though they're not the most difficult thing to google out) so that it doesn't stay a coder-only thing. For instance i don't understand where the 5-1-1 come from in the first place in the first example... It's probably clear to coders but I just don't get it :( pls help!

The bad thing is that "Jupiter storm" is not working after bug fix (as it's not working on js implementation). Or it must be "d.FFFF&8rv++/" instead of 2-character "+/"

By the way, I have implemented some ibniz programs on fpga with 1280x1024x60Hz video. It looks nice, and it can be helpful in debugging IP-cores, because you can see with your own eyes how your math hw-module works. (Though, I "compiled" ibniz to verilog by hands). And one more thing. One can make "Algorithmic symphonies" in hardware too. Actually, "t * ((t>>12|t>>8)&63&t>>4)" is not only C and js valid expression, it is verilog valid expression also. Here is my post with implementation (in Russian): http://marsohod.org/index.php/projects/184-jukeboxMy point that for demoscene world fpga-s can be not only utility equipment, but new platform as well.

Really enjoying Ibniz! I have to ideas for features for a future version that would greatly extend the fun 1) allow for one to make edits to the program when the code display overlay is turned off. 2) definable macros assigned to key combinations say ctrl-1 for that allow for scripted start/stop/reset/cursor movement and key input/delete

with these two, realtime uses would be greatly expanded. thanks so much for sharing!

What a cool audio visual tool! It's pretty complex and I think I understood most of the programming, but not everything. The effect that that must add to the presentation or concert or whatever must be amazing.

Thank you so much for creating this! I created a music video for my band recently with it (https://www.youtube.com/watch?v=9fd0lL32HC8) and created made live projections with it for our performance as well. Very good work Viznut.