I have found why it blinks. The SMB NonMaskableInterrupt code sets the name table address to $2800 at the beginning, and then it does its sprite zero detection. But it does not restore the name table when setting the scrolling values; it is only later, when rendering with bit 0 of PPU_CTRL_REG1 set to 1, that the next nametable bank is used. So it renders fine on the NES, but in my current Super NES emulation, the bit in use at the moment of the sprite zero position is the one that gets used. Therefore I must also update the scrolling registers whenever that bit changes in PPU_CTRL_REG1.
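To make the dependency explicit, here is a minimal Python sketch of how the nametable select bit and the 8-bit scroll value combine into one 9-bit horizontal position. The function name `effective_hscroll` is hypothetical and is not taken from my conversion code; it only illustrates why a write to PPU_CTRL_REG1 has to trigger a scroll register update too.

```python
def effective_hscroll(ppu_ctrl: int, scroll_x: int) -> int:
    """Combine the NES X scroll with the nametable select bit.

    Bit 0 of the value written to $2000 acts as bit 8 of the
    horizontal scroll, so the result is a 9-bit position (0..511).
    """
    return ((ppu_ctrl & 0x01) << 8) | (scroll_x & 0xFF)


if __name__ == "__main__":
    # Same scroll value, different nametable bit: the picture moves by 256 pixels.
    print(effective_hscroll(0x10, 0))  # nametable bit clear
    print(effective_hscroll(0x11, 0))  # nametable bit set
```

With a scroll value of 0, writing $10 gives position 0 and writing $11 gives position 256, which is a whole screen to the side.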

Edit: I have far fewer glitches with this, but it is still going to horizontal position 0.

It seems that I have a speed problem with whatever runs before the sprite 0 hit flag. But maybe not: the top still glitches somehow if I make it collide at 96. That means PPUCTRL1 does not always point to the first name bank at the end of vblank. But it is much better if the HScroll value is below 256 (no glitch at all).

Is there a way to profile SNES code, to know how many cycles and scanlines were used between two points?

My way would be to read the V counter right before PPUCTRL1 is accessed.

Anyway, it seems (from the scroll after 256) that the number of available cycles is not enough to complete the NMI routine before the end of vblank.

I had many bugs in the vblank bits, and I had a wait-for-vblank in my sprite DMA IO emulation.

Finally, instead of a profiler, I used the V line counter in the bsnes+ debugger view to look at how much time was used in the NES NMI routine.

I tried to enable fast mode by writing to the register and setting the ROM type in the header ($30). I do not know if it worked. Does anyone know how to test whether a ROM is recognised as FastROM?

               lda Mirror_PPU_CTRL_REG1  ;load mirror of $2000,
               ora #%00000100            ;set ppu to increment by 32 by default
               bcs SetupWrites           ;if d7 of third byte was clear, ppu will
               and #%11111011            ;only increment by 1
SetupWrites:   jsr WritePPUReg1          ;write to register
               pla                       ;pull from stack and shift to left again
               asl
               bcc GetLength             ;if d6 of thir

It writes $11 to PPUCTRLREG1 during vblank, and this "pushes" the score bar to the left, because bit zero is the high bit of HScroll. It is weird; maybe the screen is disabled on the NES when this UpdateScreen routine is called?

However, the SNES seems to have just enough CPU power to run the emulation. I hope to remove this scrolling glitch.

The UpdateScreen routine sets the bank to $2400 at line 290, and therefore the score bar does not show up. The score bar is in bank zero, so PPUCTRLREG1 should be $10, as configured at line 239 at the beginning of the NMI.

However, the problem does not show up on the NES. The score bar does not disappear there.

Can anyone explain why the score bar does not blink on the NES?

It is as if the PPUCTRL1 register write is ignored when rendering is disabled in PPUCTRL2.

It does both: it writes to PPU_ADDRESS and calls WritePPUReg1. It enters the routine at the UpdateScreen label.

Code:

WriteBufferToScreen:
               sta PPU_ADDRESS           ;store high byte of vram address
               iny
               lda ($00),y               ;load next byte (second)
               sta PPU_ADDRESS           ;store low byte of vram address
               iny
               lda ($00),y               ;load next byte (third)
               asl                       ;shift to left and save in stack
               pha
               lda Mirror_PPU_CTRL_REG1  ;load mirror of $2000,
               ora #%00000100            ;set ppu to increment by 32 by default
               bcs SetupWrites           ;if d7 of third byte was clear, ppu will
               and #%11111011            ;only increment by 1
SetupWrites:   jsr WritePPUReg1          ;write to register  <----------------- It writes $11 or $15 here
               pla                       ;pull from stack and shift to left again
               asl
               bcc GetLength             ;if d6 of third byte was clear, do not repeat byte
               ora #%00000010            ;otherwise set d1 and increment Y
               iny
GetLength:     lsr                       ;shift back to the right to get proper length
               lsr                       ;note that d1 will now be in carry
               tax
OutputToVRAM:  bcs RepeatByte            ;if carry set, repeat loading the same byte
               iny                       ;otherwise increment Y to load next byte
RepeatByte:    lda ($00),y               ;load more data from buffer and write to vram
               sta PPU_DATA
               dex                       ;done writing?
               bne OutputToVRAM
               sec
               tya
               adc $00                   ;add end length plus one to the indirect at $00
               sta $00                   ;to allow this routine to read another set of updates
               lda #$00
               adc $01
               sta $01
               lda #$3f                  ;sets vram address to $3f00
               sta PPU_ADDRESS
               lda #$00
               sta PPU_ADDRESS
               sta PPU_ADDRESS           ;then reinitializes it for some reason
               sta PPU_ADDRESS
UpdateScreen:  ldx PPU_STATUS            ;reset flip-flop
               ldy #$00                  ;load first byte from indirect as a pointer
               lda ($00),y
               bne WriteBufferToScreen   ;if byte is zero we have no further updates to make here
InitScroll:    sta PPU_SCROLL_REG        ;store contents of A into scroll registers
               sta PPU_SCROLL_REG        ;and end whatever subroutine led us here
               rts
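For anyone following along, this is how I read the buffer format that the routine consumes, written as a small Python sketch. The function name `decode_vram_buffer` and the tuple layout are my own illustration, not part of the game or of my conversion; the sketch also assumes the length field is never zero (on the 6502 the `dex`/`bne` loop would wrap around in that case).

```python
def decode_vram_buffer(buf: bytes):
    """Decode an SMB-style VRAM update buffer into a list of writes.

    Each record is: address high, address low, control byte, data.
    Control byte: d7 = PPU increments by 32 (column) instead of 1,
    d6 = repeat a single data byte, d5-d0 = number of bytes to write.
    A zero in the first byte of a record ends the buffer.
    """
    updates = []
    pos = 0
    while pos < len(buf) and buf[pos] != 0:
        addr = (buf[pos] << 8) | buf[pos + 1]
        control = buf[pos + 2]
        increment = 32 if control & 0x80 else 1
        repeat = bool(control & 0x40)
        length = control & 0x3F  # assumed non-zero in this sketch
        if repeat:
            # One data byte, written `length` times.
            data = bytes([buf[pos + 3]]) * length
            pos += 4
        else:
            # `length` distinct data bytes follow the control byte.
            data = bytes(buf[pos + 3 : pos + 3 + length])
            pos += 3 + length
        updates.append((addr, increment, data))
    return updates


if __name__ == "__main__":
    sample = bytes([0x20, 0x42, 0x03, 1, 2, 3,   # 3 bytes in a row at $2042
                    0x23, 0xC0, 0x44, 0xAA,      # $AA repeated 4 times at $23C0
                    0x00])                       # terminator
    for addr, inc, data in decode_vram_buffer(sample):
        print(hex(addr), inc, data.hex())
```

The important point for the glitch is that every record goes through WritePPUReg1 with the unmasked mirror value, so the nametable bit is rewritten once per record.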

We have the "reinitializes it for some reason" part, which looks dubious.

Usually the mirror value of PPUCTRL1 (the copy in RAM) is ANDed with $FE to remove the lowest bit at the beginning of the vblank NMI routine. But here it is simply written without masking.

The going theory is that a programmer saw that leaving rendering off with the VRAM address pointed at $3F01-$3F1F caused that color to be sent to the composite output block instead of the color at $3F00. The programmer internalized a (wrong but close enough) model of the hardware in which the CGRAM had a separate address pointer, and writing $3F00 then $0000 initialized both the CGRAM address and the VRAM address, just as it (actually) has separate OAM and VRAM pointers.

In 1999, loopy discovered the skinny on why the last two writes on $2006 keep the status bar from flickering. Let me summarize:

Bits 1-0 of the value written to $2000 get copied into bits 11-10 of t, the top-left corner address. This address is used to reset vertical parts of v, the VRAM address, during the pre-render line's hsync pulse and the horizontal parts of v at the start of hblank. But a pair of writes to $2006 overwrites both t and v. Here, writing $0000 clears both t and v to 0, causing the PPU to read from the first nametable.
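The summary above can be sketched as a tiny Python model of the registers involved. The class name `LoopyRegisters` is hypothetical, and the model is deliberately incomplete: it leaves out $2005 writes, fine X and the per-scanline copies from t to v, and only shows how the $2000 and $2006 writes interact.

```python
class LoopyRegisters:
    """Minimal model of the PPU's t, v and write-toggle registers."""

    def __init__(self):
        self.t = 0  # temporary address (top-left corner of the screen)
        self.v = 0  # current VRAM address
        self.w = 0  # first/second write toggle shared by $2005/$2006

    def write_2000(self, value: int) -> None:
        # Bits 1-0 of the value go to bits 11-10 of t (nametable select).
        self.t = (self.t & ~0x0C00) | ((value & 0x03) << 10)

    def read_2002(self) -> None:
        # Reading the status register resets the write toggle.
        self.w = 0

    def write_2006(self, value: int) -> None:
        if self.w == 0:
            # First write: high 6 bits of t, bit 14 cleared.
            self.t = (self.t & 0x00FF) | ((value & 0x3F) << 8)
            self.w = 1
        else:
            # Second write: low byte of t, then t is copied into v.
            self.t = (self.t & 0x7F00) | (value & 0xFF)
            self.v = self.t
            self.w = 0


if __name__ == "__main__":
    regs = LoopyRegisters()
    regs.write_2000(0x11)          # buffer routine selects nametable 1
    print(hex(regs.t))             # nametable bits are now set in t
    regs.read_2002()
    for byte in (0x3F, 0x00, 0x00, 0x00):
        regs.write_2006(byte)      # $3F00, then $0000
    print(hex(regs.t), hex(regs.v))
```

After the write of $11, t holds the nametable bits; after the four $2006 writes, both t and v are back to 0, so the nametable bit written through $2000 no longer has any effect on where rendering starts.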

The score bar glitch problem is solved. I have found a speed problem causing a scrolling glitch: the code does not finish before the end of vblank, and it skips a frame while the scrolling is at (0, 0). It recovers on the sprite 0 hit flag assertion on line 30 of the next frame.

It seems to happen during scrolling when columns are updated in the nametables, but I have also seen that behaviour at the end of the map near the flag pole while not moving. An IO emulation routine may take a lot of time (it takes 35 lines to complete), so maybe I should make an array of counters, one per IO call, and reset it at vblank start. It would show what is being called.

I checked the calls to IO emulation, and sometimes I find 26 writes to PPUMEMDATA or 42 writes to PPUMEMADDR. These calls caused a missed vblank end.

My code was not in bank $80! Therefore fast mode was not used. Now it is faster, and the 26 writes to PPUDATA do not cause a glitch. But the 42 PPUMEMADDR writes still cause a frame miss.

This function changes the address of the routine called when writing. Maybe I could put it in RAM and only call the code in the other bank when the toggle is 0 after writing. This will cut in half the number of calls to bank 0, so it will surely work here.

But I doubt that there are enough cycles left to add sound and also have the scrolling working. I will try it anyway. The ratio between available cycles and IO call cost is low.
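The idea of only crossing banks on the second write can be sketched like this in Python. The class `PpuAddrLatch` and the `far_handler` callback are hypothetical stand-ins for the RAM stub and the routine in the other bank; it is a model of the plan, not the 65816 code itself.

```python
class PpuAddrLatch:
    """Model of a $2006 handler that buffers the first write locally."""

    def __init__(self, far_handler):
        self.toggle = 0               # 0 = next write is the high byte
        self.high = 0                 # latched high byte
        self.far_handler = far_handler
        self.far_calls = 0            # how many times we crossed banks

    def write(self, value: int) -> None:
        if self.toggle == 0:
            # First write: just remember the high byte, no far call.
            self.high = value & 0x3F
            self.toggle = 1
        else:
            # Second write: the address is complete, so do the far call.
            self.toggle = 0
            self.far_calls += 1
            self.far_handler((self.high << 8) | (value & 0xFF))


if __name__ == "__main__":
    seen = []
    latch = PpuAddrLatch(seen.append)
    for _ in range(21):
        latch.write(0x20)
        latch.write(0x00)
    print(latch.far_calls, len(seen))
```

Each pair of writes costs one far call instead of two, which is where the halving comes from.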

I moved part of "sta $2006" into RAM, reducing it from 42 calls to 21. The missing frame is still there, even if it is faster with the latest optimisation. It does not glitch when returning directly for all 26 writes to PPU RAM, and therefore it is close to working on Super Mario Bros. Maybe a 2KB jump table in the WRAM bank, indexed by PPUADDR, would spare a ton of cycles in both the $2006 and $2007 emulation.

But it looks like the SNES does not have enough power to emulate scrolling NES games at 100%. I have cut a lot of cycles, and optimisation makes the code less clear. I could get more cycles during vblank by changing the background update to a FIFO instead of a rolling DMA transfer. But all in all, if everything must be optimised, the development will be very slow. Also, on the console, it shows small rendering mistakes.

Anyway, it works with non-scrolling games, and Super Mario Bros can be played directly from the conversion.

I will take a look at what Memblers did with the APU emulation, in order to see how it could fit in, and I will end here. It was interesting and it went further than I expected, but it is not an aesthetic conversion where everything fits (that was my goal, if possible). However, it is fast; despite the few missing frames it really feels like the NES. I am not going to squeeze the cycle count for each IO access. There is no room for a correct PPU emulation and therefore no room for improvement.

Thanks for your help and for the amazing emulators and their integrated debugger. It is impressive that even on such code, bsnes behaves like the real hardware.

Edit: I can't stop thinking... the Super FX or a custom FPGA program should be able to do PPU emulation. ...someday maybe.

I could not stop thinking about this cycle problem, and I found something really effective for PPUADDRESS. I use a table of routines in WRAM (4KB) to be able to jump quickly to the routine for the current PPU address. It removes the cost of the address increment routine. And I moved the PPUADDRESS IO write access to the RAM code. This removed a shitload of cycles: I gained 20 rendering lines, but it still glitches. 2 more lines needed...

I could solve it by moving the sprite 0 hit routine to the RAM code area (using the timers or the DMA to update the flags). This last change will remove the cost of 10 calls and spare 10 rendering lines. It will remove the glitch on SMB1.

However, I doubt that I can add sound to it. Any opinion on that topic? The sound takes 5 or 6 calls per frame, and the cost of bank switching is already included in the current code. It goes to the sound routine, but the routine does nothing.

Sound emulation could be fine if we can update the registers in the SPC700 when we want.

On SMB1 we have from line 80 to line 239 to do it. The timer will call the update routine on a given line taken from the romname.txt file, like SoundLine: 120. LOLZ

I managed to gain enough cycles to eradicate the glitches. It was not easy, but I used the IRQ to emulate the PPUSTATUS flag. It was not easy because it turns out the IRQ must not occur close to the NMI interrupt, or the program bank will be lost. The exception is when the first instruction of the NMI handler is an sei, but in that case you never get an IRQ, so you would need to add a variable saying that an IRQ is due during the NMI and then enable interrupts... I just inserted the PPUSTATUS update in the NMI.

The trace recording was very useful.

I need a way to update a column of tiles from RAM to VRAM. A line update is easy with the DMA, but I do not see how to do it with an increment of 32 to transfer a column. With HDMA? Any idea?
