Thursday, September 27, 2007

I was bored a few hours ago, coding and debugging stuff on the DS (not emulation related), and I remembered that I wanted to do some type of "that desmume one year ago" post. That's what you get when I'm too lazy to spend time doing some real work on desmume :P

So... not much to talk about, I just searched for the first public screenshots that were released of a commercial game (I remembered they looked BAD, because I added an ugly hack for transparency, and I didn't know that version of the 3D core would be used for screenshots...), and made new screenshots today with the lastest version. Nothing really new shown on that screenshots, but might be fun for those that saw the original screenshots one year ago.

The screenshots that I'm talking about are on this post on emutalk (my nickname there is synch, that'll see referenced on that post), and you can compare them with this ones:

Speed has increased about 2x-5x (depending on game) in a year, while having way better (or almost perfect) 3D, so I guess it hasn't been a total waste of time... Have fun :)

Saturday, September 01, 2007

After abandoning desmume development for a few months, I've been working on it quite a lot lately. Of course, not as much as when I started working on the project, as motivation to develop desmume is rare this days (just check the official desmume CVS, and you'll see what I'm talking about). But anyway, I've done some improvements on a very specific side of the project that I've avoided working on for the last 6 months: speed.

I've to be clear: desmume current way of handling most of the interrupts sucks. For example, timers, dmas and other, are checked PER opcode emulated. In an ideal world, you'd run a batch of instructions (the exact ammount would be determined by some heuristics or whatever), and then you'd do the processing needed by the pending interrupt / timer handling, then again back to running a batch of instructions. Of course, that'd be a less exact than current implementation, but it'd be WAY faster. I did something similar on my Gameboy emulator, and it gave good results, we'll see how it works on desmume, whenever I find time to implement it.

Anyway, I'll start with two screenshots, one before the speed ups I implemented the past 3 days (left), and another one with them (right):

As you can see, it's a practical 33% speed up on this particular case. I must say that it affects every game that I've tested on a considerable ammount, being the minimum a 14% (I'll explain later why). Just as a real world example, tested "Castlevania: Dawn of Sorrow" and it's 30% faster now... So let's talk about the speed ups.

First, desmume used/s GDI to draw the DS screens onto the main window. GDI setup is as easy as it can get, but seems that it's not as fast as I would wish (more keeping in mind that it's only task is to take an image and draw it on the screen). I considered a few options, namely DDraw, openGL or GDI+. I excluded openGL at the beggining, because it would be quite a lot of work to make the current 3D core work correctly with it. So after some discussing it with some friends (more exactly, hearing some DDraw bashing and why I should stop using DirectX7 features), I started working on a GDI+ blitter. In the end it was easier than I expected. Reading the documentation, implementing the blitter and a bit of adjusting was done in an hour. In fact, searching through the documentation was what took longer. The good part, is that it gave a free 14% perfomance gain.

The rest was a matter of fixing some tidbits of how memory is accesed, avoiding a lot of conversions from floating point to integer (which is way more expensive that you'd guess), changing the way the texture cache works, and more stuff I can't remember. I won't be any more concise about this stuff, as it would take quite a few pages, which I'm not willing to write now. Maybe I'll go back to them in a future post.

Just as an ending, it seems that my obsession about fixing all regression bugs that I see is giving good results, as most games that I test have perfect 3D. Here's some proof, Hotel Dusk running perfectly, even with the motion blur effect (I rotated the image with an image editor, because I don't have screen rotation implemented):