The discussion of hardware graphics acceleration on TIB prompted me to return to this side of Geminus, and I have some promising code that accelerates sprite plotting, particularly window texture backgrounds, toolsprites and icons in Filer windows.

Coupled with better use of the nVIDIA card's hardware acceleration and my favourite acceleration, caching window contents for much faster redraw, this should make for a much faster desktop.

Some of you may remember that this code was prototyped and basically working some time ago. True, but it now understands multiple screens and is much closer to being working, usable code. All I need to do now is accelerate plotting of masked sprites in hardware and, if possible, font rendering, and we'll finally be using the nVIDIA card the way it was always intended to be used.

Some more prototype code - hardware acceleration of horizontal and vertical lines, in particular the OS-exported HLine routine which is used for DrawFile rendering and plotting large fonts. If this sounds familiar, then it's probably because SIMON was demonstrated doing something similar on the RiscPC (with special hardware) at the Guildford Show last year, presumably for inclusion in the A9home at a future date.

I've also got rectangle inversions working, and the redraw caching is now included in Geminus and working as well as it did separately. (Note the qualified wording, because it still needs a fair bit of work to make it allocate cache memory intelligently to those windows which benefit most.)

And, for good measure, there are some patches in there to accelerate hardware scrolling and plotting of text outside the desktop. Well, why not?

The basic idea, of course, is to move as many operations into hardware as possible and then feed them to the nVIDIA's FIFO/DMA controller without waiting for their completion. The remaining unaccelerated operations are plotting of masked sprites and - the biggie - font rendering.
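As a rough illustration of "queue it and carry on", here is a minimal sketch in C of a command ring buffer. It is not Geminus code and does not reflect the nVIDIA card's real FIFO format; the names gfx_cmd, gfx_queue, gfx_submit and gfx_retire, the record layout and the queue size are all assumptions made up for the example. The point is only that the submitter returns as soon as the command is queued, and has to wait or fall back only when the ring is full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SLOTS 8u  /* hypothetical size; must be a power of two */

/* Hypothetical command record; a real card defines its own format. */
typedef struct {
    uint32_t opcode;
    uint32_t x, y, w, h;
} gfx_cmd;

typedef struct {
    gfx_cmd  slots[QUEUE_SLOTS];
    uint32_t put;  /* advanced by the CPU when it queues a command */
    uint32_t get;  /* advanced by the consumer (the hardware, in reality) */
} gfx_queue;

/* Queue a command and return at once; fails only when the ring is full. */
bool gfx_submit(gfx_queue *q, gfx_cmd cmd)
{
    assert(q != NULL);
    if (q->put - q->get == QUEUE_SLOTS)
        return false;  /* full: caller must retry or plot in software */
    q->slots[q->put % QUEUE_SLOTS] = cmd;
    q->put++;
    return true;
}

/* Stand-in for the hardware retiring the oldest queued command. */
bool gfx_retire(gfx_queue *q, gfx_cmd *out)
{
    assert(q != NULL && out != NULL);
    if (q->put == q->get)
        return false;  /* nothing pending */
    *out = q->slots[q->get % QUEUE_SLOTS];
    q->get++;
    return true;
}
```

In this sketch the CPU only ever blocks when gfx_submit returns false, which is why operations that need the result straight away, such as reads back from the screen, are the awkward cases.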

If nothing else, I have some more code that can temporarily mark the screen as cacheable, which doubles the alpha plotting speed of Richard Wilson's Tinct module, for example (operation would be similar to the way screen caching works on RO4 with a StrongARM). I'd really like to find a way to achieve this in hardware too, though.

"rectangle inversions" ... hmm. I wonder if this will help Photodesk? The biggest problem with PD on the Iyonix is the display of selections, which for anything of reasonable complexity or size is hideously slow (unusably so, in fact). I think it uses an xor plot on the entire window to achieve the 'crawling ants' effect.

I have some prototype code that takes the first step towards removing Geminus's 1KB-width limitation (currently, screen widths must be a multiple of 256 pixels in 32bpp modes).
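The arithmetic behind that limit can be sketched in a few lines of C. This assumes the rule is simply "each screen row must be a whole number of kilobytes", which is how 256 pixels at 32bpp works out; the function name row_meets_1kb_rule is made up for the example and is not part of Geminus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if one row of the mode is a whole multiple of 1KB (assumed rule). */
bool row_meets_1kb_rule(uint32_t width_pixels, uint32_t bits_per_pixel)
{
    assert(width_pixels > 0 && bits_per_pixel > 0);
    /* Work in bits so that depths below 8bpp are handled too. */
    uint64_t row_bits = (uint64_t)width_pixels * bits_per_pixel;
    return row_bits % (1024u * 8u) == 0;
}
```

Under that assumption a 1280-pixel-wide 32bpp mode passes (5120 bytes per row), while a 1600-pixel-wide one does not (6400 bytes per row).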

There's still a lot of work to be done before this can become usable code, however, since at the moment all we've demonstrated (with Castle's cooperation) is that it's possible to suitably configure the nVIDIA card and that applications are sufficiently well-behaved for this to work.

Geminus itself needs extensive modifications to support such an arrangement, but ultimately it should allow you to use, say, two CRTs at 1600 x 1200. It should also mean that druck can buy his dream screen; so start saving now, Dave.

Accelerated sprite plotting and caching are now working in rotated screen modes, with the 'software' screen removed.

Geminus currently achieves rotation by having two copies of the screen. For performance reasons it only intercepts screen writes and translates the co-ordinates and/or swaps the R/B components before writing to the actual display buffer in the nVIDIA memory. Reads come directly from the untranslated, RISC OS RGB order screen image in the machine's main memory.

Now that Geminus performs more graphics operations using hardware acceleration it becomes increasingly burdensome to update this second screen image in main memory, and the incidence of screen reads is of course reduced anyway. It therefore makes sense to eliminate this second screen and perform the co-ordinate and colour component translations on both reads and writes.
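To give a feel for the translations involved, here is a minimal sketch in C. It is not the Geminus implementation: it assumes a 32bpp pixel with red and blue in the lowest and third bytes, a 90-degree clockwise rotation with the origin at the top left, and the names swap_red_blue, logical_to_physical and physical_to_logical are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the red and blue bytes of a 32bpp pixel; green and the top
 * byte are left alone. Applying it twice restores the original. */
uint32_t swap_red_blue(uint32_t pixel)
{
    return (pixel & 0xFF00FF00u)
         | ((pixel & 0x00FF0000u) >> 16)
         | ((pixel & 0x000000FFu) << 16);
}

typedef struct { uint32_t x, y; } point;

/* Writes: map a point on the logical (unrotated) screen to the
 * physical screen, assumed rotated 90 degrees clockwise. */
point logical_to_physical(point p, uint32_t logical_height)
{
    assert(p.y < logical_height);
    point out = { logical_height - 1u - p.y, p.x };
    return out;
}

/* Reads: the inverse mapping, from physical back to logical. */
point physical_to_logical(point q, uint32_t logical_height)
{
    assert(q.x < logical_height);
    point out = { q.y, logical_height - 1u - q.x };
    return out;
}
```

With only one screen image, a write goes through logical_to_physical and swap_red_blue on the way out, and a read goes through physical_to_logical and swap_red_blue on the way back, so the application still sees an untranslated RISC OS order screen.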

The result is a smoother desktop and an extra 16MB of available memory (the default amount claimed by Geminus for emulated screen modes), which can't be bad.

Now that most desktop operations can be accelerated, the R/B components can be swapped at effectively zero cost, meaning that unmodified graphics cards[1] can be used and will perform as fast as those which have been tweaked for RO use.

Geminus still needs to colour swap the screen writes and reads performed by applications that draw their own windows directly (and also font rendering, currently), but as we showed with the rotation code, this can be done with little visible effect upon performance. Plus, with all other operations being accelerated now, the net result is greater speed.

So I now have a desktop with one screen in RO colour order, and one screen in PC order.