I've been going through a lot of ROM archives lately and, with the help of tepples, through the headers of a lot of converted Game Doctor disks...

It appears that headers are still a bit of an issue when it comes to emulators and ROM archivers. It seems that ROM archiver tools authored by one person stagnate due to lack of interest or because life gets in the way. Community-maintained ROM DATs sometimes have the same issue.

Why not have a database of "header" information in (or around) the emulator, much like NEStopia, where also the emulator can "fix" the header information for the user given a prompt? The emulator could also delete "bad" or "overdump" ROMs given the same prompt. (I've always found it quite odd that people want to keep verified bad dumps or overdumps...)

A database of ROM CRCs with the header information could be up on a GitHub-ish website. The emulator could update its database definitions when they are published, or even allow the user to supply a path to the database they wish to use if the emulator author does not update the definitions in a timely manner.

In summary, anyone with cursory Git knowledge could help fix the issue of NES ROM headers. This could also be a tool to push NES 2.0 implementation and revision and to deprecate UNIF usage. Three types of NES headers for one console format is a bit overkill.

You are correct that some emulators support the IPS patch format. But some patches are more practical to express in a binary patch format other than IPS. These include patches to files larger than 16 MiB (about which an NES emulator doesn't need to worry) and patches that move a lot of data around in ROM. One example of the latter is a patch that expands PRG ROM. Because IPS is limited to overwriting byte strings with new data at the same position, such an IPS patch would contain almost the entire ROM. This problem showed itself with mikaelmoizt's patch of Balloon Fight. Other binary patch formats, such as xdelta and BPS, are capable of moving data around in a file. However, emulators tend to lack support for binary patch formats other than IPS, instead relying on the patch already having been applied to the ROM before it runs.

Because of this, I have edited my previous post to express that my intent is not limited to IPS.
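To illustrate the point about moved data, here is a rough Python sketch. It is not an IPS encoder; it only counts how many bytes a format that compares files position by position would have to store. The function name `bytes_to_overwrite` is made up for this example.

```python
def bytes_to_overwrite(old: bytes, new: bytes) -> int:
    """Count bytes of `new` that a same-position overwrite patch must carry."""
    # Bytes that differ at the same offset in both files.
    changed = sum(1 for a, b in zip(old, new) if a != b)
    # Anything past the end of the old file has to be stored in full.
    appended = max(len(new) - len(old), 0)
    return changed + appended


# Expanding a ROM by inserting data at the front shifts every later byte,
# so nearly the whole file ends up in the patch.
old_rom = b"ABCDEFGH"
new_rom = b"\xff\xff" + old_rom
print(bytes_to_overwrite(old_rom, new_rom), "of", len(new_rom), "bytes")
```

A format that can express "copy this range from the source" would instead need only the two inserted bytes plus a copy instruction.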

It's quite literally just a mapping from a hash of PRG+CHR(+Trainer+PC10) to the 10 bytes that make up the variable part of the NES 2.0 header. If you aren't worried about UNIF upgrades, the program is just something like

Get file size, decrease by 16. Quantize to lower multiple of 8192.

Read 16 bytes for header

Check if Trainer, increase file size by 512 if so

Read 8192·n+512·t bytes

Hash chunk in memory

Do database lookup

If match, replace first 16 bytes with "NES\x1A", value from database, "\0\0". Truncate off extra crap at the end.
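The steps above can be sketched in Python roughly as follows. This is only a sketch under assumptions: the post does not name a hash function, so SHA-1 is used here, and `database` is a hypothetical dict mapping the digest to the 10 variable header bytes. The name `fix_header` is also made up.

```python
import hashlib
from typing import Dict, Optional

TRAINER_FLAG = 0x04  # bit 2 of header byte 6 signals a 512-byte trainer


def fix_header(rom: bytes, database: Dict[bytes, bytes]) -> Optional[bytes]:
    """Return the ROM with a rebuilt header, or None if the hash is unknown."""
    if len(rom) < 16:
        return None
    # File size minus the header, quantized down to a multiple of 8192.
    payload_size = (len(rom) - 16) // 8192 * 8192
    header = rom[:16]
    # A trainer adds 512 bytes in front of PRG ROM.
    if header[6] & TRAINER_FLAG:
        payload_size += 512
    payload = rom[16:16 + payload_size]
    # Assumption: SHA-1 digest as the database key.
    variable_part = database.get(hashlib.sha1(payload).digest())
    if variable_part is None:
        return None
    # "NES\x1A" + 10 bytes from the database + two zero bytes = 16 bytes.
    # Slicing the payload also drops any trailing junk.
    return b"NES\x1a" + variable_part + b"\0\0" + payload
```

Note that trusting the trainer bit of a header you already suspect is wrong is a weak point; a more careful tool might try both with and without the extra 512 bytes.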

The hard part is making the database for the long tail of random crap. Not the bad dumps, the overdumps, the translations... It's a political problem, more than a technical one.

It would be simple to automatically fix incorrect iNES headers, but fixing overdumps or warning about bad dumps would require hash values for those (I believe GoodTools contains such a DB, in theory, but it is not something that can easily be integrated into another program).

But then, on the other hand, what if the database is incorrect for some game and you end up breaking the rom's header instead? This is mostly why I've avoided adding such a feature to Mesen so far.

Also, the vast majority of users will never redistribute any of those fixed roms - so in the end, people are still going to be getting the same old incorrect roms off the internet.

Some overdumps can be detected automatically by comparing the first and second halves of PRG ROM. In a recent project for another member here, I included a tool to detect and correct overdumps of NROM-128 as NROM-256.
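A minimal Python sketch of that half-comparison idea follows. It is not the tool mentioned above, just an illustration; the function names and the 16 KiB floor (the NROM-128 PRG size) are assumptions for the example.

```python
def halves_match(prg: bytes) -> bool:
    """True if the first and second halves of PRG ROM are identical."""
    half = len(prg) // 2
    return len(prg) >= 2 and len(prg) % 2 == 0 and prg[:half] == prg[half:]


def shrink_overdump(prg: bytes, min_size: int = 16384) -> bytes:
    """Halve PRG ROM for as long as it is just two copies of the same data."""
    while len(prg) > min_size and halves_match(prg):
        prg = prg[:len(prg) // 2]
    return prg
```

After shrinking, the PRG size field in the header would need to be updated to match, which this sketch leaves out.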

The problem of bad dump with bad header (and/or with IPS) could be addressed by letting you manually look up entries by name and not just hash. Could automatically start finding candidates just by searching fuzzy terms from the filename.
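A rough sketch of the filename search, using Python's standard `difflib`. The list of known names would come from the database; here it is just a hypothetical list passed in, and the tag-stripping regex is an assumption about how ROM filenames are usually decorated.

```python
import difflib
import os
import re
from typing import List


def clean_title(filename: str) -> str:
    """Strip the extension and (U)/[b1]-style tags, then lowercase."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    stem = re.sub(r"\([^)]*\)|\[[^\]]*\]", "", stem)
    return " ".join(stem.split()).lower()


def candidate_names(filename: str, known_names: List[str], limit: int = 5) -> List[str]:
    """Return known game names that look similar to the filename, best first."""
    by_lower = {name.lower(): name for name in known_names}
    hits = difflib.get_close_matches(
        clean_title(filename), list(by_lower), n=limit, cutoff=0.6
    )
    return [by_lower[hit] for hit in hits]
```

For example, `candidate_names("Baloon Fight (U) [b1].nes", names)` should put "Balloon Fight" first if it is in `names`, and the user could then confirm or pick another entry.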

Hashing PRG and CHR separately might help find prospective matches too where only one of those is corrupted. You could even hash individual banks, or otherwise hash smaller segments, and bad dumps would become very easy to find a match for. Really just a matter of how much data you want to store. (Also might obviate the need to try and store every bad dump.)
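As a sketch of the per-bank idea, assuming 8 KiB banks and SHA-1 (both choices are just for illustration, as are the function names):

```python
import hashlib
from typing import List


def bank_hashes(data: bytes, bank_size: int = 8192) -> List[bytes]:
    """Hash each bank-sized slice of the ROM data separately."""
    return [
        hashlib.sha1(data[offset:offset + bank_size]).digest()
        for offset in range(0, len(data), bank_size)
    ]


def matching_banks(candidate: bytes, reference: bytes, bank_size: int = 8192) -> int:
    """Count banks whose hashes agree at the same position in both ROMs."""
    pairs = zip(bank_hashes(candidate, bank_size), bank_hashes(reference, bank_size))
    return sum(1 for a, b in pairs if a == b)
```

A dump with one corrupted bank would still match the reference in every other bank, so it could be identified without the database storing a hash of the bad dump itself.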

Operations should be interactive and transparent, e.g. list the matched game being used for the header, and all changes being made to the header in human-readable form. After the best guess is made, let the user search candidates (with the match prediction displayed, etc.)

maseter wrote:

Who would this please, except the byuu's of the world (no offense)?

Never mind that it would be a never-ending sisyphean task...

Yeah, I'd agree. Just because an idea is simple to describe doesn't mean it's actually easy to implement. People can always wish, though.

Sour: Hrm... I've never tried Mesen before until now. It's really impressive... Gonna hunker down and get used to it! Lot of good features...

Regarding ROMs, it appears at the moment that no one community has decided unanimously on a proper method. Some of them even just blatantly disregard headers. Moving forward, a database slowly built over time would help...

In fact the database could be created by the users given some redundancy checks. Let's say there's a ROM that Mesen doesn't recognize from its database. Once the user configures the proper NES 2.0 header, or one is supplied by another user... If the same information is independently submitted maybe 3-4 times from different sets of IPs, then it could be put into the pending database. There could be a "submit info to database" button, etc. This is somewhat in the same vein as CD ripping info databases.

In the same manner overdump detection or bad dump detection could be flagged by a user.

Nice word. But mapper assignment is "sisyphean" regardless of who you push the responsibility onto. Pushing emulator updates is far easier than pushing rom updates due to the illicit nature of roms and the sheer number of them. So I'm gonna say that rom-side mapping was a bad idea from day 1.

The occasional bad header is a frequent enough problem for you that you'd throw away all homebrew ROMs to fix it?

Not really trying to pick a fight, just trying to gauge the magnitude of this problem for you that you'd insist on a solution like that. Like, it's easy to imagine a tool that will fix it automatically, but in my perspective it's a relatively small problem? ...especially compared to the amount of work required to robustly automate a fix.

I've had hands in creating a few ROM archives myself since ~1999. I know how difficult it can be. I worked with NEStoy, GoodNES, a tool a friend and I made, etc... The thing is, there are people who like doing this kind of work. Source it to them. Make the back-end for them and they will do it with the free time you don't have. No sense just considering it a mess and just living with it. I would have thought the NES/Famicom community would have stagnated by now, but ROM dumps are STILL emerging. Let's fix it now before it gets worse.

"Header ROM size" is the total of PRG ROM size and CHR ROM size from the header. It is valid only if the file is at least 16 bytes larger than this size.

"File ROM size" the file size, minus 16 bytes, rounded down to an 8 KiB boundary, and rounded further down to the sum of two powers of two. This may produce a wrong result in the case of a PlayChoice ROM whose CHR ROM is 8 KiB or nonexistent, as it will assume the instruction ROM is part of CHR ROM. But it will produce the correct for the most notable game with a non-power-of-two PRG ROM size, as the total size of Action 52 is still 2 MiB.

A ROM's hash is a size followed by the SHA-1 hash value of the concatenated PRG ROM and CHR ROM data. (SHA-1 is used because NesCartDB offers it. I concede it is insecure against constructed collisions since SHAttered, but it is still secure against preimages.) The header correction tool contains a database of hashes extracted from an XML dump of NesCartDB. This list maps each hash to the correct NES 2.0 header. Each entry will thus need 32 bytes: 20 for the hash, and 12 for the header excluding the initial "NES\x1A" (from which the size can be calculated). The latest dump has 3179 <cartridge> elements; even if I pessimistically assume they're all unique (which they aren't), that's still only 100 KiB.
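A small sketch of the key and the 32-byte entry layout described above. The function names are hypothetical, and the 12-byte header tail is treated as an opaque blob here.

```python
import hashlib
from typing import Tuple


def rom_key(rom_data: bytes) -> Tuple[int, bytes]:
    """Key for lookup: the data size plus the SHA-1 of PRG ROM + CHR ROM."""
    return len(rom_data), hashlib.sha1(rom_data).digest()


def pack_entry(rom_data: bytes, header_tail: bytes) -> bytes:
    """One stored entry: 20-byte SHA-1 followed by 12 header bytes."""
    if len(header_tail) != 12:
        raise ValueError("expected the 12 header bytes after 'NES\\x1A'")
    return hashlib.sha1(rom_data).digest() + header_tail


# Size estimate from the post: 3179 entries of 32 bytes each.
print(3179 * 32, "bytes")  # 101728 bytes, a little under 100 KiB
```

The size does not need to be stored separately, since it can be recomputed from the header bytes in the entry.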

When attempting to correct a ROM image, the tool calculates its header ROM size and file ROM size. For each distinct valid size, it calculates the hash and compares it to the hashes in the database. If it's found, it returns the header associated with that hash.

If header correction fails to find a hash match, the emulator falls back to the existing iNES or NES 2.0 header. This means homebrew, hacks, obscure games, and the like will still run so long as they already have a valid header. So if you're releasing your own stuff, don't release crap.
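Putting the lookup and the fallback together, a sketch could look like the following. Same caveats as before: the names are made up, `database` is a hypothetical dict keyed by (size, SHA-1 digest) and holding the 12 header bytes, only the plain iNES size bytes are read, and trainers are ignored.

```python
import hashlib
from typing import Dict, List, Tuple


def candidate_sizes(rom: bytes) -> List[int]:
    """Distinct valid sizes to try: header ROM size, then file ROM size."""
    sizes = []
    declared = rom[4] * 16384 + rom[5] * 8192
    if declared and len(rom) >= declared + 16:
        sizes.append(declared)
    size = (len(rom) - 16) // 8192 * 8192
    if size:
        high = 1 << (size.bit_length() - 1)
        rest = size - high
        size = high + (1 << (rest.bit_length() - 1) if rest else 0)
        if size not in sizes:
            sizes.append(size)
    return sizes


def choose_header(rom: bytes, database: Dict[Tuple[int, bytes], bytes]) -> bytes:
    """Return the database header on a hash match, else the existing header."""
    if len(rom) < 16:
        return rom[:16]
    for size in candidate_sizes(rom):
        digest = hashlib.sha1(rom[16:16 + size]).digest()
        tail = database.get((size, digest))
        if tail is not None:
            return b"NES\x1a" + tail
    return rom[:16]  # no match: homebrew, hacks, etc. keep their own header
```

Because the fallback returns the header untouched, a ROM that is not in the database behaves exactly as it would without the correction step.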
