Greetings! For those who don't know me, I have a project to perfectly recreate Doom64 on the PC, and I also have full reverse-engineering documentation on how the game works on the N64 hardware. Lately I've been interested in porting the game to the Nintendo DS. Most of the basics are in, and the renderer has been rebuilt to better suit the DS hardware (there is a screenshot of the working prototype on my WordPress main page).

To summarize, Doom64 is a fully 3D engine; unlike the other Doom games it is not software-rendered, so I had to fully leverage the 3D hardware on the DS. While my custom renderer is working, I've run into several issues with texture management. I am kinda disappointed by the 512 KB limit on video memory, because the sprites are the #1 culprit. Doom64's sprites are VERY large, especially the monster sprites. Fortunately, all world textures and simple sprites use 16-color palettes, so I can cram a lot of them into VRAM. The monster sprites, however, use 8-bit color (256-color palettes) and are usually ~64x128 or 128x128 in size, which makes it MUCH harder to cram in all rotations and frames. One monster alone eats up ~256 KB of VRAM, and that's just for 5 frames at most.

I've already seen this thread and now I am really worried. I wanted to upload textures on the fly and flush them out after the frame has been drawn, but apparently that's not an option. I tried calling glDeleteTextures on textures that are no longer visible in the player's view, but that function doesn't seem to work in libnds. I don't think it has ever worked for me in any situation (possibly a bug?). Anyway, I am looking for ways to cram these sprites in, and I am not sure what options I have other than downscaling the sprites to ridiculous resolutions. I was thinking about adding some kind of mip-mapping, but that goes back to that thread again, which gives me the impression that it is not even possible. I even tried a form of double buffering: upload textures to VRAM_A on the first frame and draw them, then upload new textures to VRAM_B on the next frame, draw, dump the textures in VRAM_A, and repeat. Unfortunately I get horrendous screen tearing as a result on my DS Lite.

Any thoughts or comments on what I can do? I appreciate the feedback! Thanks

You cannot dump the textures in VRAM until the frame has finished rendering. I used this approach for a 3D hardware Heretic port that I have been working on and it works quite well with no tearing: banks A and B on one frame, C and D on the next. One bank was not enough for Heretic and I suspect the same is true for Doom. I ended up double buffering the VRAM writes. You will need to compensate for the bank swapping, but it is not difficult: just track the offset from the beginning of the cache and add it to VRAM_A or VRAM_C when calculating the texture address.
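The offset tracking described above could look roughly like the sketch below. It is not taken from the Heretic port; the `TexCache` struct, the function names and the two offset constants are made up for illustration, and the offsets are plain numbers relative to the start of texture memory rather than real bank addresses. The only hardware detail assumed is that texture addresses are expressed in 8-byte units, hence the alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: two 256 KB caches (banks A+B and banks C+D),
   expressed as offsets into texture memory, not CPU addresses. */
#define CACHE_SIZE   (256u * 1024u)
#define CACHE_AB_OFS 0u
#define CACHE_CD_OFS CACHE_SIZE

typedef struct {
    uint32_t frame;  /* frame counter; its parity selects the cache */
    uint32_t used;   /* bytes already handed out in the current cache */
} TexCache;

/* Start a new frame: flip to the other cache and reset the bump offset. */
static void texcache_begin_frame(TexCache *c) {
    c->frame++;
    c->used = 0;
}

/* Base offset of the cache being filled this frame. */
static uint32_t texcache_base(const TexCache *c) {
    return (c->frame & 1u) ? CACHE_CD_OFS : CACHE_AB_OFS;
}

/* Reserve `size` bytes in the current cache.
   Returns the texture-memory offset, or UINT32_MAX if the cache is full. */
static uint32_t texcache_alloc(TexCache *c, uint32_t size) {
    assert(size > 0);
    uint32_t aligned = (size + 7u) & ~7u;  /* keep offsets 8-byte aligned */
    if (aligned > CACHE_SIZE - c->used) {
        return UINT32_MAX;
    }
    uint32_t ofs = texcache_base(c) + c->used;
    c->used += aligned;
    return ofs;
}
```

On real hardware the returned offset would still have to be turned into whatever the texture setup code expects, and the copy itself has to wait until the banks being written are no longer in use by the renderer.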

What's weird is that I actually tried doing that before, after reading that other thread. I would throw all textures into VRAM_A on the first frame, then throw textures into VRAM_B on the second frame, and then start over. The tearing was even worse than what I had before.

It seems like the tearing happens whenever I call glBindTexture.

After a bit of fudging with my code, I got it to work without getting any lines. When I was inspecting your code, though, I saw that you never used swiWaitForVBlank. Even when I had a setup similar to yours, I was still seeing the lines. What fixed it on my end was to call glFlush, wait for VBlank, and THEN DMA the texture cache to VRAM A/B on one frame and to VRAM C/D on the other.

I've also noticed that now the game runs very slowly when drawing scenes with a lot of textures. I am assuming that I am going to have to keep track of what textures are still visible and what textures are not and dma only the ones that are new to the scene.

Maybe it might be possible for you to use 2 main VRAM banks (say banks A&B) for 'static' assets and alternate between the other 2 (say C&D) as the bank where you upload textures for the next frame... each bank is 128KB and should be enough, hopefully.

Theoretically (as I haven't tried it myself), you can begin to copy over new textures for the next frame as early as scanline 144, as that is the earliest time when the rendering engine finishes the last scanline (for line 192) to be sent to the scanline buffer. You then have until about scanline 214 to finish copying, as that is when the rendering engine kicks in to start buffering scanlines for the new frame. So, upwards of about 70 scanlines maximum without possible tearing from copying in new data. That isn't taking into account that the polygon/vertex buffers need to be swapped for the new frame, which is executed at the start of the Vertical Blank (VBlank), but is called before then. I don't remember if code execution is halted after calling that (like swiWaitForVBlank), but assuming it does, you'd need to check after each texture is copied that between the current Vertical Counter (VCount) and the start of the VBlank is enough time for the next texture to be copied. When it isn't, then call glFlush, and then continue copying textures until about scanline 214.
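The "is there enough time left for the next texture" check could be written as a small pure function, sketched below. The window limits 144 and 214 come straight from the post above and are just as untested; `bytes_per_line` is a made-up parameter standing for a copy rate you would have to measure yourself, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define COPY_WINDOW_START 144  /* from the post: earliest safe scanline */
#define COPY_WINDOW_END   214  /* from the post: renderer starts buffering again */

/* Scanlines left in the copy window, or 0 when vcount is outside it. */
static int copy_lines_left(int vcount) {
    if (vcount < COPY_WINDOW_START || vcount >= COPY_WINDOW_END) {
        return 0;
    }
    return COPY_WINDOW_END - vcount;
}

/* True if a texture of tex_bytes can be copied before the window closes.
   bytes_per_line is an assumed, measured copy rate (bytes per scanline). */
static bool copy_fits(int vcount, uint32_t tex_bytes, uint32_t bytes_per_line) {
    assert(bytes_per_line > 0);
    uint32_t lines_needed = (tex_bytes + bytes_per_line - 1) / bytes_per_line;
    return lines_needed <= (uint32_t)copy_lines_left(vcount);
}
```

In a real build the `vcount` argument would come from the vertical counter register, and the result would decide whether to start the next copy or stop for this frame.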

Maybe it might be possible for you to use 2 main VRAM banks (say banks A&B) for 'static' assets and alternate between the other 2 (say C&D) as the bank where you upload textures for the next frame... each bank is 128KB and should be enough, hopefully.

Doom64's sprites range from small 16-color sprites up to 256-color sprites. The monster sprites in general are HUGE. After padding the texture sizes, almost all of them end up at 128x128, so there's a lot of thrashing / fighting when it comes to cramming in textures and removing those that are not used, which kills performance due to the constant memcpys and DMA transfers.

I seriously don't know how they did it on the N64 since that system only has a 4kb texture cache.

I have thought about breaking up the sprites into smaller tiles like breaking up a non-padded 56 x 82 sprite into 3 tiles, so that would be three 64 x 32 textures (after padding) for a single sprite to lighten up the workload when uploading them to the VRAM bank.
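To put numbers on the tiling idea, here is a quick sketch of the arithmetic. It assumes 8-bit (one byte per texel) sprites and power-of-two texture dimensions with a minimum of 8; the helper names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Round a dimension up to the next power of two (minimum 8). */
static uint32_t pad_pow2(uint32_t v) {
    uint32_t p = 8;
    while (p < v) {
        p <<= 1;
    }
    return p;
}

/* Number of horizontal strips of height strip_h needed to cover `height`. */
static uint32_t strip_count(uint32_t height, uint32_t strip_h) {
    assert(strip_h > 0);
    return (height + strip_h - 1) / strip_h;
}

/* Bytes needed for an 8-bit sprite stored as one padded texture. */
static uint32_t padded_bytes(uint32_t w, uint32_t h) {
    return pad_pow2(w) * pad_pow2(h);
}

/* Bytes needed for the same sprite split into strips of height strip_h. */
static uint32_t tiled_bytes(uint32_t w, uint32_t h, uint32_t strip_h) {
    return pad_pow2(w) * strip_h * strip_count(h, strip_h);
}
```

For the 56 x 82 example this gives three 64 x 32 strips at 6144 bytes in total, versus 8192 bytes for a single padded 64 x 128 texture, so the saving comes from not padding the height all the way to 128.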

I could try what Discostew suggested as well, though I thought doing a 'while(REG_VCOUNT > ##)' loop seems impractical...

Forgive me if my knowledge of the NDS is rusty. I haven't done any serious programming for it in a long time, so if something sounds off, it's because I haven't been refreshing my memory of it.

I would imagine that with the use of interrupts, you won't have to hold the system in a while loop like that. Perhaps set the VCount interrupt to 144, as that is the earliest possible scanline to begin copying in textures. Keep the system in an Interrupt Halt state (assuming you can do game calculations before scanline 144) until the VCount interrupt fires. Then, within that interrupt, check whether copying to VRAM is safe (in that no more scanline buffering is being done). If it is safe, simply start the copy procedure. If not, activate the HBlank interrupt, enter the Interrupt Halt state again, and check back at the next line whether it is safe. Deactivate the HBlank interrupt once it is safe.

There is always the last resort if you really want to get Doom64 running on the NDS. I assume texture streaming from game data -> RAM -> VRAM is not being done, but is pre-loaded into RAM when the game is first loaded (for general re-used textures) and in between area transitions (for textures unique to those areas). If it is streaming, then ignore this option I'm about to suggest. If it is pre-loading, then perhaps you could spend a bit more pre-load time, and simply scale down the textures. The game won't look as good, but sometimes sacrifices must be made. Scaling them down would also help with copy time if you still require copying to VRAM in between frames. Can also pick and choose which textures will and will not be scaled down.

Doom64's sprites range from small 16-color sprites up to 256-color sprites. The monster sprites in general are HUGE. After padding the texture sizes, almost all of them end up at 128x128, so there's a lot of thrashing / fighting when it comes to cramming in textures and removing those that are not used, which kills performance due to the constant memcpys and DMA transfers.

a 128x128 pixel 256 colors texture is 16KB, so you can put 8 of them into a 128KB bank. If you think this isn't enough, you should consider switching to compressed textures (Format 5), which may be / may not be suitable for you, I don't know that.
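The arithmetic behind that figure, as a tiny sketch (the function names are made up; the 128 KB bank size is the one mentioned in this thread):

```c
#include <assert.h>
#include <stdint.h>

#define BANK_BYTES (128u * 1024u)  /* one main VRAM bank, as stated above */

/* Size in bytes of an uncompressed paletted texture.
   bpp is bits per texel: 4 for 16-color, 8 for 256-color. */
static uint32_t tex_bytes(uint32_t w, uint32_t h, uint32_t bpp) {
    assert(bpp == 4 || bpp == 8);
    return w * h * bpp / 8;
}

/* How many textures of that size fit into one bank. */
static uint32_t textures_per_bank(uint32_t w, uint32_t h, uint32_t bpp) {
    return BANK_BYTES / tex_bytes(w, h, bpp);
}
```

A 128x128 256-color texture comes out at 16384 bytes, so 8 per bank; the same texture with a 16-color palette is half that, so 16 per bank.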

I assume texture streaming from game data -> RAM -> VRAM is not being done, but is pre-loaded into RAM when the game is first loaded (for general re-used textures) and in between area transitions (for textures unique to those areas)

I've kinda done that already by tracking whether a texture has been used recently or not. If the current tick is greater than the tick at which the texture was last used, it gets discarded. This alone works just fine... but when you have a ton of unique monsters on screen, each eating up 16 KB of VRAM, it becomes a constant battle between textures fighting for space, so stuff is constantly getting thrashed around and freed. I have made it simply not draw the sprite if there isn't any room at all to cram it in (meaning all textures are currently in use). I know this can be managed better; I just need to make better use of the hardware (if that's even possible).
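For reference, the tick-based scheme described above boils down to something like the following sketch. This is not the actual port code: `Slot`, `cache_init` and `cache_acquire` are hypothetical names, and each slot stands for one fixed-size region of VRAM. A texture last used on an earlier tick may be evicted; if every slot was already used on the current tick, the function returns -1 and the caller skips the sprite.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    int      tex_id;     /* -1 means the slot is free */
    uint32_t last_used;  /* tick on which the texture was last drawn */
} Slot;

static void cache_init(Slot *slots, int n) {
    for (int i = 0; i < n; i++) {
        slots[i].tex_id = -1;
        slots[i].last_used = 0;
    }
}

/* Find or make room for tex_id on the given tick.
   Returns the slot index and sets *needs_upload when the texture must be
   copied to VRAM. Returns -1 if every slot is in use on this tick. */
static int cache_acquire(Slot *slots, int n, int tex_id, uint32_t tick,
                         bool *needs_upload) {
    assert(tex_id >= 0);
    *needs_upload = false;
    int victim = -1;
    for (int i = 0; i < n; i++) {
        if (slots[i].tex_id == tex_id) {         /* already resident */
            slots[i].last_used = tick;
            return i;
        }
        if (slots[i].tex_id < 0) {               /* a free slot always wins */
            if (victim < 0 || slots[victim].tex_id >= 0) {
                victim = i;
            }
        } else if (slots[i].last_used < tick) {  /* stale: prefer the oldest */
            if (victim < 0 || (slots[victim].tex_id >= 0 &&
                               slots[i].last_used < slots[victim].last_used)) {
                victim = i;
            }
        }
    }
    if (victim < 0) {
        return -1;                               /* nothing can be evicted */
    }
    slots[victim].tex_id = tex_id;
    slots[victim].last_used = tick;
    *needs_upload = true;
    return victim;
}
```

Only calls that set `needs_upload` require a copy to VRAM, which is the "upload only what is new to the scene" idea mentioned earlier in the thread.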

Discostew wrote:

If it is pre-loading, then perhaps you could spend a bit more pre-load time, and simply scale down the textures. The game won't look as good, but sometimes sacrifices must be made. Scaling them down would also help with copy time if you still require copying to VRAM in between frames. Can also pick and choose which textures will and will not be scaled down.

I am looking into that as well but I can't find any simple image scaling library written in C that I can quickly implement and iterate on. I was considering using Mesa3D's gluScaleImage but the code itself looks daunting.
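For paletted sprites a full scaling library may be more than is needed. A minimal sketch, assuming 8-bit indexed data and even dimensions: halve the image by keeping one texel out of every 2x2 block (nearest neighbour). Averaging would not work directly, because the values are palette indices rather than colors. The function name is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Halve an 8-bit paletted image in both dimensions by keeping the
   top-left texel of each 2x2 block. dst must hold (sw/2)*(sh/2) bytes. */
static void halve_indexed(const uint8_t *src, uint32_t sw, uint32_t sh,
                          uint8_t *dst) {
    assert(sw % 2 == 0 && sh % 2 == 0);
    uint32_t dw = sw / 2;
    uint32_t dh = sh / 2;
    for (uint32_t y = 0; y < dh; y++) {
        for (uint32_t x = 0; x < dw; x++) {
            dst[y * dw + x] = src[(y * 2) * sw + (x * 2)];
        }
    }
}
```

Halving a 128x128 sprite this way cuts it from 16 KB to 4 KB, at an obvious cost in detail.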

sverx wrote:

a 128x128 pixel 256 colors texture is 16KB, so you can put 8 of them into a 128KB bank. If you think this isn't enough, you should consider switching to compressed textures (Format 5), which may be / may not be suitable for you, I don't know that.

In some cases, even 8 is not enough. The final level, for example, has you fighting a large wave of monsters, and they are usually the ones with 128x128 sprites... in addition to their unique frames and rotations.

I've researched the compression method, but after testing it on one sprite I noticed that the result was larger than the uncompressed one, because it uses three different types of data (index, palette, data). I am not sure this will even help, though I could still try it. Unfortunately the only utility for compressing textures is written in Python, and I would love to get my hands on code written in C so I could quickly implement it and iterate on it.
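For comparing sizes, the sketch below does the arithmetic for Format 5 as the layout is usually documented: 4x4 texel blocks at 2 bits per texel (4 bytes per block), plus a 16-bit palette index entry per block, plus the palette itself at 2 bytes per color. How many palette colors an encoder emits varies, so that is left as a parameter; the function name is made up, and the layout is an assumption worth checking against the hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated VRAM footprint of a 4x4-compressed texture.
   palette_colors is whatever the encoder produced for this texture. */
static uint32_t fmt5_bytes(uint32_t w, uint32_t h, uint32_t palette_colors) {
    assert(w % 4 == 0 && h % 4 == 0);
    uint32_t blocks = (w / 4) * (h / 4);
    uint32_t texel_data = blocks * 4;  /* 16 texels x 2 bits per block */
    uint32_t index_data = blocks * 2;  /* one 16-bit entry per block */
    return texel_data + index_data + palette_colors * 2;
}
```

Under these assumptions a 128x128 texture takes 6144 bytes before the palette is counted, compared with 16384 bytes uncompressed, so a result larger than the original would suggest an unusually large palette or an encoder issue.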
