Does there exist a test ROM that validates the cycle timings of 68K instructions?

The timing documents out there are a bit tough to read, and even tougher to verify for every instruction, given the way my core tries to move each cycle penalty to exactly where it's used (e.g. inside the bus read/write handlers, inside the effective address pre-decrement code, etc.).

I would prefer a Genesis test ROM, but if there's one that isn't tied to any specific system (i.e. I won't have to emulate an entire Amiga or Neo Geo), that would also be great.

Kinda. If you want to be able to verify perfect cycle accuracy for all bus timings for all instructions, you're out of luck. I did work on making one that measured bus timings within an opcode for particular sets of instructions, but getting repeatable, stable results out of it while avoiding infinite spirals of death was hard. You've got to remember that on the Mega Drive, bus cycles are being stolen for RAM refresh. My technique involved watching for torn reads on hcounter transition points to set a known base point to the cycle, then spinning repeated instructions keyed off hcounter to fill a few KB of results, and taking a high/low measure of the counter change and comparing it to measured values from the system. This required sub-cycle timing accuracy, RAM refresh cycle, and hcounter torn reads to be supported in order to pass in an emulator. Problem was, it only worked (on hardware) about 98% of the time, and I wasn't happy with that. I shelved it late last year. I could dust it off and see where it ended up.

Apart from that, there are a few things out there to measure timing over large spans of opcodes (i.e. execute 1000 instructions and take a measure), but they're not overly helpful, as you can't easily narrow down where things went wrong, and the RAM refresh cycles throw things off, as I mentioned.

Last edited by Nemesis on Wed Jun 05, 2019 10:24 am, edited 1 time in total.

All that said, I see the Microcode analysis (viewtopic.php?f=2&t=3023) as the way forward. Reading the microinstructions tells you what you need to know regarding cycle timing. I hope to extract simple, verified data tables at the end of that process to make things easier for everyone, but ultimately I think that's a more logical route right now than hardware testing. The alternative is probably driving the 68000 with a microcontroller or FPGA and sampling timing for every instruction variation, then sticking your 68000 core in a similar software-defined box and comparing with the expected results.

The RAM refresh does indeed sound like a challenge. At least on the SNES, the RAM refresh is predictable, and it's easy to test instruction timings around it. I haven't looked into how to emulate the version on the Mega Drive yet.

Perhaps we could use a timing test ROM for a 68K system that doesn't use DRAM, if one even exists? As long as I don't have to emulate an entirely new system and video chipset to get the results ...

A coarse test is fine as well. I guess the best I can do for now is to build a little framework that tries to generate those weird A(B/C) tables that are all the rage on the 68K. But they're a bit hard to read.
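For anyone else squinting at that notation: A(B/C) is total clock cycles, then bus reads and bus writes in the parentheses. A minimal sketch of a helper for such a framework, assuming only that the entries look like "4(1/0)" (the `Timing` class and its `parse` method are made-up names, not from any existing emulator):

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """One A(B/C) entry: total cycles, bus reads, bus writes."""
    cycles: int
    reads: int
    writes: int

    def __add__(self, other):
        # Base instruction timing and effective-address timing simply add up.
        return Timing(self.cycles + other.cycles,
                      self.reads + other.reads,
                      self.writes + other.writes)

    def __str__(self):
        return f"{self.cycles}({self.reads}/{self.writes})"

    @staticmethod
    def parse(text):
        # Hypothetical helper: accepts strings such as "8(2/0)".
        match = re.fullmatch(r"\s*(\d+)\((\d+)/(\d+)\)\s*", text)
        if match is None:
            raise ValueError(f"not an A(B/C) timing: {text!r}")
        return Timing(*(int(group) for group in match.groups()))


# ADD.W (An),Dn: base 4(1/0) plus 4(1/0) for the (An) word operand fetch.
total = Timing.parse("4(1/0)") + Timing.parse("4(1/0)")
print(total)
```

Keeping reads and writes as separate fields makes it easy to diff a core's measured bus activity against a table entry, rather than comparing only the cycle total.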

…the 68000 user manual doesn't give a vague "maximum value"; look into that (section 8, if you're wondering where the timings are). The only one where it's vague is division (it just says it's around 140 cycles), and I think that's in large part because the underlying algorithm is complex enough that an exact estimate would be hell to explain. BlastEm actively attempts to emulate the algorithm proper, so you may want to look at that code (m68k_core_x86.c seems to have the implementation in it, at least, as divu()/divs()).

I recently noticed that one of these fixed instructions had a noticeable effect on one MD game (Clue) once you implement accurate VRAM access (FIFO/DMA) timings, as it is used heavily in a big loop executed just before some VRAM routines that can cause screen corruption if started too close to Vint. This part is very timing sensitive and might break each time you start improving emulation timings (like adding periodic bus refresh delays) if some of your other timings are off... or get fixed by accident because of inaccurate timing emulation.

Does there exist a test ROM that validates the cycle timings of 68K instructions?
...
I would prefer a Genesis test ROM, but if there's one that isn't tied to any specific system (i.e. I won't have to emulate an entire Amiga or Neo Geo), that would also be great.

I can't comment on the Genesis, but I doubt there is a generic test suite like that. How would you measure cycles without some specific hardware?

Anyway, first of all, I recommend you check the official Motorola documentation. It is comprehensive, and the later versions are very accurate (though they still have a couple of mistakes).

As somebody recommended, get YACHT. It still has a couple of mistakes, but it is very complete and includes bus access timing, not just number of cycles.

The DIV algorithm isn't defined. MULS sounds ... really weird.

MUL timing might sound weird, but it is accurate as documented. It has to do with how the microcode works.
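As a rough illustration of the documented rule, here is a sketch based on the multiply note in the Motorola timing tables as I read it: 38+2n clocks, where n is the number of ones in the source for MULU, and the number of "01" or "10" bit pairs in the source with a zero appended below bit 0 for MULS. The function names are made up, and effective address time is not included:

```python
def mulu_cycles(source):
    """MULU: 38 + 2n clocks, n = number of 1 bits in the 16-bit source."""
    ones = bin(source & 0xFFFF).count("1")
    return 38 + 2 * ones


def muls_cycles(source):
    """MULS: 38 + 2n clocks, n = number of 01/10 pairs in (source << 1)."""
    value = source & 0xFFFF
    # Bit i of the XOR is set when bit i differs from bit i-1 (bit -1 is 0).
    transitions = (value ^ (value << 1)) & 0xFFFF
    return 38 + 2 * bin(transitions).count("1")


print(mulu_cycles(0xFFFF))  # sixteen 1 bits: the worst case for MULU
print(muls_cycles(0xFFFF))  # -1 has a single 0-to-1 transition
print(muls_cycles(0x5555))  # alternating bits: the worst case for MULS
```

The MULS rule looks odd until you notice it only counts where the bit pattern changes, which is why multiplying by -1 is cheap and multiplying by 0x5555 is not.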

There is no simple formula for computing DIV timing. Get the sources I published years ago for code that computes the exact timing.

Bit manipulation has an "* indicates maximum value" note, but doesn't explain what that means.

If the bit is located in the lowermost word (bit number < 16), then it is two cycles less.
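In code form, a minimal sketch of that rule for the register-destination, dynamic (bit number in Dn) forms. The maximum values in the table are the starred entries as I remember them from the manual, so treat them as assumptions and check them against your copy; `bit_op_cycles` is a made-up name:

```python
# Starred ("maximum value") entries for the Dn,Dn forms. Assumed from the
# Motorola timing tables; verify against your own copy before relying on them.
MAX_CYCLES = {"BCHG": 8, "BSET": 8, "BCLR": 10}


def bit_op_cycles(op, bit_number):
    """Cycles for a bit operation on a data register, bit number taken from Dn."""
    bit = bit_number % 32          # register destinations use the bit number modulo 32
    maximum = MAX_CYCLES[op]
    # Bits in the low word finish two cycles earlier than the listed maximum.
    return maximum - 2 if bit < 16 else maximum


print(bit_op_cycles("BCHG", 3))    # low word
print(bit_op_cycles("BCLR", 20))   # high word, so the full maximum
```

BTST is left out because its register form carries no star in the table, so there is nothing to subtract.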

Slowly working through yacht still ... had to emulate the prefetch to make sense of JMP/JSR timings. The trick is JMP (d16,An) can peek ahead to IRC without having to fetch another byte, whereas JMP (An) has no use for it, hence why they're both 8(2/0).
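To make the read counts concrete, here is a toy model of that idea. It tracks only bus reads (the "2" in the parentheses), not total cycles, and the `Core` class, its methods, and the memory layout are all invented for illustration:

```python
def sign16(value):
    """Sign-extend a 16-bit word."""
    return value - 0x10000 if value & 0x8000 else value


class Core:
    def __init__(self, memory, pc):
        self.memory = memory       # dict of word address -> 16-bit word
        self.reads = 0
        self.pc = pc
        self.refill()              # IR = opcode, IRC = the word after it
        self.reads = 0             # from here on, count only the instruction's reads

    def read_word(self, address):
        self.reads += 1
        return self.memory.get(address, 0)

    def refill(self):
        """Refill the prefetch queue at the current PC: always two reads."""
        self.ir = self.read_word(self.pc)
        self.irc = self.read_word(self.pc + 2)

    def jmp_an(self, an):
        self.pc = an               # IRC is simply discarded
        self.refill()

    def jmp_d16_an(self, an):
        # The displacement is already sitting in IRC, so no extra read here.
        self.pc = (an + sign16(self.irc)) & 0xFFFFFFFF
        self.refill()


memory = {0x100: 0x4EE8, 0x102: 0x0010, 0x210: 0x4E71, 0x212: 0x4E71}
core = Core(memory, 0x100)         # JMP (d16,A0) with d16 = 0x0010
core.jmp_d16_an(0x200)
print(hex(core.pc), core.reads)
```

In this model the only reads either form performs are the two that refill IR and IRC at the jump target, which matches the observation that both modes show two reads.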

According to this, IRD will always be equal to IR. Is there any reason for an emulator to have an IRD, or is that just a detail for microcode/transistor-level simulation?

IR is used for decoding the next instruction. IRD holds the opcode currently being executed. They have the same content most of the time, but not always. The above steps are not performed all at the same cycle. IR and IRD are never loaded at the same microinstruction. That means that IR changes at least two cycles before IRD.

E.g., during instructions that take four cycles (like NOP), IR and IRD will be the same for two cycles, and for the other two cycles they might not be (depending on whether the next instruction has the same opcode or not).
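A toy trace of that behaviour, assuming every instruction in the stream takes four cycles (two 2-cycle microinstructions) and that IRD is loaded in the first microinstruction and IR in the second. That split is a simplification for illustration, not taken from the actual microcode, and `trace_ir_ird` is a made-up name:

```python
NOP = 0x4E71
MOVEQ_0_D0 = 0x7000   # another 4-cycle instruction, used as a different opcode


def trace_ir_ird(words):
    """Return (cycle, IR, IRD) samples, one per 2-cycle microinstruction."""
    stream = iter(words)
    ir = next(stream)          # opcode about to execute
    irc = next(stream)         # prefetched word following it
    samples = []
    cycle = 0
    for _ in range(len(words) - 2):
        ird = ir                           # first microinstruction: IRD <- IR
        samples.append((cycle, ir, ird))
        cycle += 2
        ir, irc = irc, next(stream)        # second: IR <- IRC, refill IRC
        samples.append((cycle, ir, ird))
        cycle += 2
    return samples


for cycle, ir, ird in trace_ir_ird([NOP, MOVEQ_0_D0, NOP, NOP]):
    print(f"cycle {cycle}: IR={ir:04X} IRD={ird:04X} same={ir == ird}")
```

For an emulator that only steps whole instructions, this difference is invisible, which is why a single opcode register is usually enough unless you model individual microinstructions.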