I'm really excited to finally share this with you. I've been working on implementing the SNES in an FPGA using Verilog HDL. In the past I've implemented an NES, SNES APU, HQ2X filter, etc in an FPGA so I figured full SNES was the next logical step. I've been making lots of progress and I wanted to share with you all the current state of the design.

Hardware setup:- The entire SNES is hooked up to my TLA715 logic analyzer for investigation purposes. I have attached some images below for you to check out. My TLA715 supports up to 256 simultaneous channels @ 250MHz sample rate with a 64M sample buffer depth.- I've primarily been using the SD2SNES for running my own custom test ROMs in addition to original carts for the most popular games.- I'm using a Cyclone 4 FPGA, specifically the Terasic DE2-115 board.

Logic blocks that are completed:- S-SMP and S-DSP - I completed this some time ago. As far as I know it's the most accurate FPGA-based APU core ever created. I ran my Verilog implementation against Blargg's C-based APU core using an automated script for all 35,000+ SPCs on snesmusic.org and my Verilog implementation matched Blargg's output bit for bit.

- S-CPU 65816 - This is 100% complete and fully verified. I used SystemVerilog Assertions (SVAs) and a formal equivalence tool to verify proper behavior of every combination of internal register, opcode, and addressing mode on every microcycle. I wrote over 3,200 SVAs by hand from scratch for the formal equivalence checking. For those of you who don't know what formal equivalence checking is you can google it - but essentially it will literally create a formal proof which states that no matter what combination of inputs are given to a particular logic block that said logic block will never violate a particular assertion. If anyone is really interested I suppose I could show you what a simple assertion looks like, but without understanding the syntax it would be pretty useless. My 65816 core is probably the most accurate hardware implementation of an 816 ever created aside from the original ASIC. The CPU is fully microcoded for all instructions so it is very resource efficient.

- S-CPU DMA and HDMA - Both of these are complete and working very well. My logic analyzer setup was extremely useful for implementing these blocks. The MDMA implementation was pretty easy, but the HDMA was an incredibly huge complicated pain in the ass. The way HDMA interacts with MDMA (stalling in-progress MDMAs or killing them), the multiple modes of operation, corner cases, etc, etc, etc. Suffice it to say, it sucked! There are probably a few clock cycle differences here an there in my HDMA implementation versus a real SNES but I got tired of working on it - I do intend on going back and fixing them but for now I wanted to move on to get some graphics working. It's at least 98% accurate which is good enough for me for now.

- S-PPU graphics background/sprites = 1% (i.e. one) done. Haha. I've only just spent a few hours on this so far. You can see the very buggy initial results for mode 0 in the video below.

- S-WRAM - Both A-bus and B-bus interfaces are 100% complete and verified.

- Standalone FPGA-based affine transformation and perspective projection (i.e. mode 7) implementation is complete. I made this just to see if I could do it and it turned out to be super cool and I learned a lot. The code will be extremely useful and portable to the mode 7 logic in the SNES. I am 100% confident in implementing mode 7 once I get to that point. I've provided a video link to a demo of this code.

Miscellaneous items:- Created over 30 original test ROMs with pass/fail conditions. Approximately 8 of those test ROMs (about half are HDMA tests) will only pass on my VeriSNES and a real SNES, no other emulators can pass them including bsnes/higan with accuracy profile - this is primarily thanks to my logic analyzer setup where I can see everything happening on a per clock-cycle basis. This might sound cool but it has actually made development quite a bit more difficult since I can no longer rely on BSNES/Higan for the accuracy I need when comparing tracelogs (for one example). Instead I'm basically forced to use my logic analyzer for almost everything...but in the end I suppose that's probably better anyway.

- I'm actually able to load and run nearly every single game that I test out (as long as it doesn't require a special enhancement chip). Even though I have no graphics output I can still hear the attractors/demos running (since my APU core is 100% complete). I can even interact with the games using the joypad, for example in SMW I can press the start button, select a level, jump around, collect coins, kill enemies, etc I just can't see anything. Lol. I actually also played LoZ blind while running a software emulator simultaneously and matching button presses. I was actually able to get all the way to the dungeon where Link meets his father and picks up the sword - at that point I just stopped playing but it was super fun to do!

- I've captured 100s upon 100s of GBytes of logic analyzer traces covering nearly every aspect of the SNES hardware and its behavior including odd corner cases and other esoteric behaviors.

- I am actually implementing each of the S-PPU chips individually just as they are implemented in the original SNES as opposed to one combined humongous PPU1+PPU2 blob of logic. This forces me to critically think about exactly how the two PPUs are communicating with each other in order complete a particular task.

Next to be implemented:- Most common video/background modes, then sprites.- I started with Mode 0 because it was the simplest and most straight forward so I'll probably finish that off first and then move on to Mode 1.

Request: If anyone has any clues or ideas whatsoever what is causing the Bad Apple demo to run so slowly I would love to hear them. I'm hoping it's as simple as just being some register/feature I haven't implemented yet which is causing the issue. Remember that this is a hardware emulator, not software, so it's not an issue with the VeriSNES slowing down under load or anything - hardware always runs at a constant rate regardless of what it's doing. Here is a link to the exact time index where I start the Bad Apple demo.

That's all for now!

Attachments:

tla_setup2.png [ 723.67 KiB | Viewed 29428 times ]

tla_setup1.png [ 527.2 KiB | Viewed 29428 times ]

Last edited by jwdonal on Tue Nov 22, 2016 1:48 pm, edited 2 times in total.

On the whole, congrats and best of luck in the future with your project!

Quote:

- S-SMP and S-DSP - I completed this some time ago. As far as I know it's the most accurate FPGA-based APU core ever created. I ran my Verilog implementation against Blargg's C-based APU core using an automated script for all 35,000+ SPCs on snesmusic.org and my Verilog implementation matched Blargg's output bit for bit.

Then it's probably very wrong, because although blargg's DSP is basically perfect in the digital realm, his SMP has several very severe issues. To the point where even with 50+ hacks, Snes9X had serious SMP regressions upon using it. A big part of the recent release was replacing his SMP core with mine.

(The only imperfection in the DSP core is that the DSP mute flag controls an analog component that sits beyond the output of the digital audio, and is done after mixing in the cartridge and expansion port audio. As a result, the effect is neither immediate nor instantaneous.)

Quote:

- Created over 30 original test ROMs with pass/fail conditions. Approximately 8 of those test ROMs (about half are HDMA tests) will only pass on my VeriSNES and a real SNES, no other emulators can pass them including bsnes/higan with accuracy profile - this is primarily thanks to my logic analyzer setup where I can see everything happening on a per clock-cycle basis. This might sound cool but it has actually made development quite a bit more difficult since I can no longer rely on BSNES/Higan for the accuracy I need when comparing tracelogs (for one example). Instead I'm basically forced to use my logic analyzer for almost everything...but in the end I suppose that's probably better anyway.

I sincerely hope you'll share them, along with some basic notes on what's going on. I'll be happy to try and pass all of your tests :D

I would be very surprised to learn of bugs that exist outside of the PPU per-cycle timing conditions. Especially the HDMA ... I even know about an edge case where if you have HDMA indirect enabled on the last channel, once HDMA terminates for the rest of the frame, it skips loading one of the address reload bytes and leaves the low byte in the high byte and zeroes out the low byte. That is some extremely obsessive attention to detail. And of course, I was the one that clocked out all of the CPU<>DMA clock synchronization stuff, the HDMA init trigger position changing between CPU revision 1 and 2, etc.

I would actually love to work with you to ensure you can use higan to debug things against your real hardware logic analyzer logs. I'm especially very interested in doing this for the PPU, because that's something where I can't work out the timings from test ROMs alone.

I think we worked well together with the HQ2x simplification in the past. And I'm certain I could return the favor by explaining the PPU operation as best I can to you while you work on that part of your implementation.

Then it's probably very wrong, because although blargg's DSP is basically perfect in the digital realm, his SMP has several very severe issues. To the point where even with 50+ hacks, Snes9X had serious SMP regressions upon using it. A big part of the recent release was replacing his SMP core with mine.

It's unlikely that it's very wrong. I should have mentioned that I fixed well over 20 different bugs in both the SMP and DSP cores (moreso in the SMP as you say) based on my logic analyzer observations. So it's not Blargg's original source. I think I sent about half of the fixes to Blargg via email but didn't seem interested in fixing them at the time so eventually I stopped bothering to send them. :/ But Blargg was a great help to me when I was learning about the APU logic and helped me out a lot with understanding the C code.

byuu wrote:

I even know about an edge case where if you have HDMA indirect enabled on the last channel, once HDMA terminates for the rest of the frame, it skips loading one of the address reload bytes and leaves the low byte in the high byte and zeroes out the low byte.

Yeah, I've got that implemented already as observed on the logic analyzer. It's one of the absolute weirdest things that I've ever seen. I spent 30min trying to work out on paper how in the heck the HDMA controller could have possibly been designed to have that behavior and couldn't come up with anything. It makes absolutely no sense whatsoever.

byuu wrote:

I would actually love to work with you to ensure you can use higan to debug things against your real hardware logic analyzer logs. I'm especially very interested in doing this for the PPU, because that's something where I can't work out the timings from test ROMs alone.

I think we worked well together with the HQ2x simplification in the past. And I'm certain I could return the favor by explaining the PPU operation as best I can to you while you work on that part of your implementation.

Yeah, that sounds cool for sure. But first I want to knock out some more of the graphics logic cause I'm so excited to work on it - FINALLY!

I should have mentioned that I fixed well over 20 different bugs in both the SMP and DSP cores (moreso in the SMP as you say) based on my logic analyzer observations.

Ooooh, okay. My apologies then.

Quote:

I think I sent about half of the fixes to Blargg via email but didn't seem interested in fixing them at the time so eventually I stopped bothering to send them. :/

Yeah, he unfortunately fell off the radar it seems. Along with TRAC, anomie, neviksti, and most of the true MVPs of the SNES emulation community.

At any rate, this is the first I'm hearing of test ROMs you have that I'm not passing.

So please let me stress that SNES emulation is my magnum opus in life. I've worked on many emulators simply because I've hit the limits of what I could do through my software testing.

I will always be extremely interested in working with you or anyone polite to fix bugs in bsnes/higan as promptly as possible and will prioritize it above all my other projects :D

Also, if you want, I can tell you about some hardware edge cases you're probably unaware of that will ruin your month, heheh ;)

Quote:

It makes absolutely no sense whatsoever.

Yep. That's the SNES in a nutshell.

No matter what I try, I can never predict how the SNES will actually behave. Even with all I know about how crazy it behaves, and trying to predict crazy things in return ... somehow, someway, the SNES always manages to surprise me. Each and every time.

Quote:

Yeah, that sounds cool for sure. But first I want to knock out some more of the graphics logic cause I'm so excited to work on it - FINALLY!

Alright then, as soon as you're ready, please reach out to me when you can, especially about those test ROMs. I'm very excited to take a look at them, but I won't rush you. Have fun with the start of the PPU :)

A big part of the recent release was replacing his SMP core with mine.

Do you have a list or a revision log of the various bugs that you fixed in the SMP? I'm wondering if I can dig up my SVN revision logs for the fixes that I made to Blargg's SMP and compare them against your list. And/or maybe gmail still has the emails I sent to Blargg in my message history/archive...

Unfortunately, no. I didn't fix up his core, I simply rewrote a new one from scratch based on knowledge gained from my own emulator's core. BearOso then optimized out the cycle switch cases to merge those that lacked side-effects (and thus didn't need to synchronize with the CPU.)

Ill be looking forward to seeing the progression of this project, and hopefully a final product I can purchase one day.

I'm not trying to derail this topic, but I couldn't help but think while reading this new topic, you appear to have the knowledge and resources to implement some of the special chips not available for the SD2SNES yet. Just saying.

Do you plan to make this public once it reaches some state of completion, or are you keeping it private for a product that you plan to sell?

I would love for this community to have open source hardware .. However I understand and respect the author's decision either way. There has been a closed sourced SNES FPGA variant shown in the past. Early this year I was working on the 65816 in verilog hoping to throw it on github but haven't got around to finishing it yet.

Do you plan to make this public once it reaches some state of completion, or are you keeping it private for a product that you plan to sell?

I definitely have no plans on selling any products. All of my FPGA projects I do purely for three reasons: 1) It's something I enjoy. 2) It improves my FPGA design skills and keeps existing skills sharp. 3) I get to learn new things that I wouldn't have learned otherwise. As far as making it public I probably wouldn't have any issue with that if I could be sure that some company/person wasn't going to steal the code and use it to make money - but let's face it, that's probably exactly what would happen.

Most importantly, I have no understanding of the legal issues surrounding the release of free cloned Nintendo hardware, cloned WDC processors (816 core), or cloned Sony processors (SPC700). That's a headache that I wouldn't want to deal with.

marvelus10 wrote:

I'm not trying to derail this topic, but I couldn't help but think while reading this new topic, you appear to have the knowledge and resources to implement some of the special chips not available for the SD2SNES yet.

I wouldn't be totally opposed to that as long as the source code was protected somehow (see previous comments) - the simplest way to do this would be to provide ikari with a pre-synthesized netlist of the enhancement chip logic targetted to his device which he could then integrate into his full design. The primary issue is that the SD2SNES uses a much older FPGA and the tools that support that chip don't support SystemVerilog only Verilog. So I would need to massage the code a little to only use Verilog keywords. This wouldn't be a huge effort but it's something that would need to be done before I could even synthesize, for example, the 65816 for that older FPGA.

defparam wrote:

I would love for this community to have open source hardware

As far as I understand all of the SD2SNES FPGA code is open source. As well as ludde's FPGA NES, that's open source and is up on github.

Last edited by jwdonal on Tue Nov 22, 2016 5:30 pm, edited 3 times in total.

I sincerely hope you'll share them, along with some basic notes on what's going on.

Sorry for delayed response. I actually answered this question on #nesdev when kevtris asked me so I think I got wires mixed up in my brain like I had already answered this question - except an answer on #nesdev doesn't help people that are only on the forums.

The answer is yes, I will be sharing them. But I don't have an ETA on when this will be. I will need to clean up the code and most likely provide annotated logic analyzer traces so people even understand what's going on. That will take time and right now I want to spend time on super fun graphics logic!!!!

Do you have any of the HDMA/DMA test ROMs that you made that you could share as well?

Who is online

You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forumYou cannot post attachments in this forum