I'm no 2D wizard, but touching memory 16 bits at a time is going to slow down any loop. You should try to read 32 bits at a time and also write 32 bits at a time, which lets you work on two pixels at once. The lpSprite and lpBack pointers should point to 32-bit values, so lpSprite++ moves the pointer 4 bytes at a time. This also saves you from using array indices. Sorta like this:
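A rough sketch of that idea, assuming 16-bit pixels where the value 0 is the transparent color key and each row holds an even number of pixels. The function name blit_row_keyed and the pairs parameter are made up for illustration; only lpSprite and lpBack come from the discussion:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy one row of sprite pixels over the background,
   reading and writing two 16-bit pixels per 32-bit access.
   Assumption: a pixel value of 0 means "transparent".
   `pairs` is the row width in pixels divided by two. */
static void blit_row_keyed(const uint32_t *lpSprite, uint32_t *lpBack, size_t pairs)
{
    while (pairs--) {
        uint32_t s = *lpSprite++;               /* two sprite pixels at once */
        if (s != 0) {                           /* both transparent: nothing to do */
            if ((s & 0xFFFF0000u) && (s & 0x0000FFFFu)) {
                *lpBack = s;                    /* both opaque: one 32-bit write */
            } else if (s & 0x0000FFFFu) {
                /* only the low pixel is opaque; keep the background's high pixel */
                *lpBack = (*lpBack & 0xFFFF0000u) | s;
            } else {
                /* only the high pixel is opaque; keep the background's low pixel */
                *lpBack = (*lpBack & 0x0000FFFFu) | s;
            }
        }
        lpBack++;
    }
}
```

Both buffers need to be 4-byte aligned for this to be safe, and a row with an odd pixel count would need its last pixel handled separately.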

Mmmh, again, I'm no 2D expert (and I'm doing this off the top of my head), but this should be a bit faster. Obviously, you have to adjust your initialization code and the places where you update lpSprite and lpBack. There's probably some MMX instruction that automatically does what the if-construct does. BTW, with all these ifs this might look slower, but I'd bet my life that this inner loop is just as fast as your code, and it works on two pixels at a time, whereas your code only deals with one pixel!

Also, some more helpful coding tips (for speed):

instead of doing:

i = 2*j;

do:

i = j << 1;

instead of doing:

i = j%2;

do:

i = j&1;
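A small self-check of the two substitutions above, as a sketch. The helper names twice, is_odd and check_shift_and_mask are made up for illustration. It also shows the one case where the forms disagree: a negative signed value.

```c
#include <assert.h>

/* For unsigned values the shift and mask forms are exact equivalents. */
static unsigned twice(unsigned j)  { return j << 1; }   /* same as 2 * j */
static unsigned is_odd(unsigned j) { return j & 1u; }   /* same as j % 2 */

/* Hypothetical self-check comparing each trick with the plain expression. */
static void check_shift_and_mask(void)
{
    for (unsigned j = 0; j < 1000u; ++j) {
        assert(twice(j) == 2u * j);
        assert(is_odd(j) == j % 2u);
    }
    /* With a negative signed value the results differ (C99 and later):
       -3 % 2 is -1, while -3 & 1 is 1 on two's-complement machines. */
    assert(-3 % 2 == -1);
    assert((-3 & 1) == 1);
}
```

So the mask is a safe replacement when j is unsigned or known to be non-negative, or when you only compare the result against 0.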

These tricks might give you a bit more speed (though a decent optimizing compiler usually makes these substitutions by itself, and j&1 only matches j%2 when j isn't negative), but combined with the loop above, you should really be on the right track. MMX will help even more, but I don't know enough about it. Using 16-bit graphics can REALLY slow you down. Sometimes it's better to work with 32-bit graphics internally and then just convert down to 16-bit when copying to the backbuffer.
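A minimal sketch of that last conversion step, assuming the internal pixels are stored as 0x00RRGGBB and the backbuffer uses the common RGB565 layout. Both formats are assumptions (check what your surface actually reports), and pack_rgb565 and copy_row_to_565 are made-up names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack one 0x00RRGGBB pixel into RGB565 by dropping the low bits
   of each channel (5 bits red, 6 bits green, 5 bits blue). */
static uint16_t pack_rgb565(uint32_t xrgb)
{
    uint32_t r = (xrgb >> 16) & 0xFFu;
    uint32_t g = (xrgb >> 8) & 0xFFu;
    uint32_t b = xrgb & 0xFFu;
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Hypothetical helper: convert one row while copying it to the backbuffer. */
static void copy_row_to_565(uint16_t *dst, const uint32_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; ++i)
        dst[i] = pack_rgb565(src[i]);
}
```

All the blending and color-key work then happens on 32-bit pixels, and the 16-bit format is only touched once per pixel on the final copy.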