Topic: Making hacking easier (Read 11503 times)

As someone who has zero skills related to hacking, I think somehow facilitating the learning process would probably be the best way forward, but I'm not exactly sure how you would do that with a hobby that takes place almost 100% over the internet. I still would encourage every single person not skilled at hacking but still interested in doing hacks to start off with the pretty amazing selection of game-specific utilities on this site. I honestly think if people spent more time practicing with those and starting small before getting discouraged by the amount of effort a complete-game hack takes, there would probably be a few more members of the community.

That said, most skills take practice and hard work, and you can find structured classes for help training in most of them. You can find classes on programming too, but I guess I don't know how comparable romhacking and the stuff you find in proper programming classes are. For all I know, the classroom stuff isn't always applicable to the actual practice of romhacking.

Maybe if someone gets some spare time they could create some kind of learning document for aspiring hackers to learn the basics? Provided there isn't one out there already--I admittedly wouldn't know because I know I don't have time to learn.

It would probably be a lot of work, so it would have to be a labor of love (much like everything else on this site). In my experience though, that's the only real way to streamline a skill-learning process.

I think the real reason you guys don't want to help is because you don't want it to be easy, because you enjoy the challenge of pushing your ability and wrestling with the experience of your own limits (and, I assume, it makes you feel good to see "lesser" minds fail).

I have looked into this sort of thing for a variety of fields*, mainly because secret sauce knowledge and security by obscurity are things I particularly dislike.

*one of the more well studied possibly being the history of bump keys in lockpicking; Danish locksmiths supposedly knew of it for decades before it got popular elsewhere.

I have seen attempts at keeping some kind of ROM hacking knowledge an exclusive skill, usually in cheats but occasionally in things like sprite ripping, as I believe I mentioned in a previous post in one of these topics. It tends to be seen in game-specific circles, and I am comfortable in saying anybody trying that here would get laughed off the boards. At worst people occasionally have to pull themselves up by their own bootstraps, as there are not enough people to hold hands through everything; though it results in fine hackers, few would call that situation ideal. If you search hard enough you might find something around here that people did not overview at the time of posting; those would be the ROM hacker equivalents of hackmes or obfuscation challenges/contests ( http://www.ioccc.org/ ).

Anyway, I cannot speak for others, but I will pretend I can and say there is a certain amusement in having the mad skills that others might not. However, if said skills subsequently become common, it only means a new level has been opened up, and said new level probably means a lot more scope for change, or projects previously not viable becoming viable for ease or time reasons (or if you prefer, "the possible is now possible, how can I optimise it a bit more"). This will probably continue until such a time as we have natural language programming and some flavour of AI, and then it will be limited only by imagination.

Maybe I am the exception though and I will have to return you to the pack of professional 6502, z80, 68K, 65c816 and ARMv3 programmers that inhabit the boards.

Back on topic: you speak of code section identification, and yeah, having something tell me what the main loops and subroutines and whatnot are would be nice. However, the methods you seem to want to explore are going to leave very fuzzy edges and thus are borderline useless compared to the accuracy of a conventional breakpoint. Similarly, I am not sure how such a thing breaks down the most barriers, unless you think said identification is going to act as a prebuilt tracing session/make a kind of quasi filesystem/data lookup table, in which case you are dreaming. There might be something to having some more soft analysis (relative search is hardly universal but it is a fundamental technique).

A few years ago I tried writing a tutorial on grasping Hexadecimal pertaining to ROM hacking from the perspective of a 100% inexperienced nooblet, covering real-world questions like "what the fuck am I looking at?" and "why the fuck would anyone do it this way?" It's here: http://www.baddesthacks.net/?p=1118

Good documentation and lessons on the fundamentals combined with the reader's drive to learn and master a skill is the only realistic answer to this thread's great question, but Zonk is against this in favor of a one size fits all magic ROM hack button that will never exist.

Now if you'll excuse me I have a global warming deniers rally to attend.

I know I have weird tastes, but a hex editor in binary with table support for characters that are not standard 8 bit would be really interesting for cases like Battle of Olympus (5 bits/letter), but more commonly JP games with 10/12/32 bit/letter. We have tile editors allowing for sub 8x8 tiles (thankfully), why not hex/text too? If I ever begin coding my own hex editor (hopefully soon, once I'm done with C++ and Qt) I'm planning this sort of thing (maybe hira/kata tbl switching too).

One of the things I don't like about romhacking, as someone who is still kind of new to assembly / hex / binary operations and who probably isn't aware of the best tools out there, is how much I have to repeat the same operations over and over again. For example, sure, I can find the binary value of "A5" and the result of AND'ing it with "43" all in my head...

The way I like to remember how it works: wouldn't it be better to look at how AND works like a coffee filter? All the bits that would match a zero bit on the other end would all get zeroed out. And bits that would match a one bit slip to the other side...

For example with SMB3, when the value from $0578 (setting Mario's form but also other things) gets sent to $00ED (Mario's actual form), it gets AND'd with #$0F (0000 1111) and the value has its leftmost 4 bits zeroed out (filtered out) after this. In fact, that line's purpose in SMB3 is to filter out values outside an acceptable range.
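To make the filter picture concrete, here is a minimal Python sketch of the two cases mentioned above. The helper names `and_filter` and `show` and the sample value 0x37 are made up for illustration; only the $A5/$43 operands and the #$0F mask come from the posts, and nothing here reads from an actual SMB3 ROM.

```python
def and_filter(value: int, mask: int) -> int:
    """Keep only the bits of value where mask has a 1 (8-bit values)."""
    return value & mask & 0xFF


def show(value: int, mask: int) -> str:
    """Render the operation in binary so the filtering is visible."""
    result = and_filter(value, mask)
    return f"{value:08b} AND {mask:08b} = {result:08b} (${result:02X})"


# The operands from the earlier post: $A5 AND $43
print(show(0xA5, 0x43))  # 10100101 AND 01000011 = 00000001 ($01)

# A hypothetical value masked with #$0F: the upper nibble is filtered out
print(show(0x37, 0x0F))  # 00110111 AND 00001111 = 00000111 ($07)
```

Reading the mask column by column is the whole trick: a 0 in the mask blocks that bit, a 1 lets it through.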

I'm sure for other logical operations one can make up their own little memorization trick. I reckon looking at the problem from a binary standpoint would be tiresome, but it's one of the things that come with this stuff. Maybe if one is used to making cheat codes using bit flags (like status effects in RPGs, or e-Switches in SMA4) it will be more natural and some notable hex values (01, 02, 04, 08, 10, 20, 40, 80) will stand out, and eventually everything will seem easier as you go.
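As a rough illustration of why those eight values stand out, here is a small Python sketch of bit flags. The status names are invented for the example and do not come from any particular game; the point is only that each notable hex value is a single bit, so OR sets it, AND with the inverse clears it, and AND tests it.

```python
# Hypothetical status effects, one bit each (names are illustrative only)
POISON  = 0x01
SLEEP   = 0x02
BLIND   = 0x04
SILENCE = 0x08


def set_flag(status: int, flag: int) -> int:
    """Turn a flag on with OR; other bits are untouched."""
    return (status | flag) & 0xFF


def clear_flag(status: int, flag: int) -> int:
    """Turn a flag off by AND'ing with the inverted flag."""
    return status & ~flag & 0xFF


def has_flag(status: int, flag: int) -> bool:
    """Test a flag with AND; a non-zero result means it is set."""
    return (status & flag) != 0


status = 0x00
status = set_flag(status, POISON)
status = set_flag(status, SILENCE)
print(f"${status:02X}")            # $09
print(has_flag(status, SLEEP))     # False
status = clear_flag(status, POISON)
print(f"${status:02X}")            # $08
```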

Apologies for not replying earlier, but this is one really excellent suggestion and I'm pleased every time AutoHotkey is mentioned. First heard of it with people discussing how to deal with a difficult Flash platformer (it's "Pause Ahead" btw, very nice game) by automating certain hard jumps. It's really useful for lots of things, and with it being open-source, it could be useful.

A few years ago I tried writing a tutorial on grasping Hexadecimal pertaining to ROM hacking from the perspective of a 100% inexperienced nooblet, covering real-world questions like "what the fuck am I looking at?" and "why the fuck would anyone do it this way?" It's here: http://www.baddesthacks.net/?p=1118

I still recommend something like Rodnay Zaks "Programming the Z80" as a book for learning the fundamentals, including binary and hex.

When you're done with the first 3 chapters of that, you'll have a good set of basic information as to how computers work, and what all this weird stuff means. From my POV, if you can get your head around that, then everything else becomes much, much easier.

Then, if you start by experimenting on a simple machine, like the GameBoy, you'll have a chance to really learn some solid fundamentals before you try to dive into 20-30 years of successive development and specialization that have led to the complexity of modern machines.

I know I have weird tastes, but a hex editor in binary with table support for characters that are not standard 8 bit would be really interesting for cases like Battle of Olympus (5 bits/letter), but more commonly JP games with 10/12/32 bit/letter.

You propose a hex editor which uses a virtual address space to map 5-bit sequences (fivelets?) across a range of bytes. It sounds hard to decode though. Would be better to do it in sets of five bytes, with the fifth byte containing the 5th bits for the nibble pairs in each. (or is this what you were considering?)

A good slave does not realize he is one; the best slave will not accept that he has become one.

Dunno why but the way you say it sounds overly complicated. Anyways.

Text data usually encodes data with one byte per character. One byte has only 256 possible values, so Japanese games often (not always) use 2 bytes per character. Some even use three bytes.

2 bytes means 65536 possible values. Most retro Japanese games have around 650 kanji characters. Recent ones have around 2800 kanji characters. The ones that include all possible characters have around 8000.

So 65536 is a bit of an overkill. But it's far easier to just use 2 whole bytes for the job, and so most devs do just that.
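For anyone who wants to check the arithmetic, here is a quick Python sketch. The function name `bits_needed` is made up for the example, and the character counts are just the rough figures quoted above. It shows how many bits are strictly needed to give every character its own code, compared with the 2**16 = 65536 codes that two whole bytes provide.

```python
def bits_needed(symbol_count: int) -> int:
    """Smallest number of bits that gives each symbol a distinct code."""
    if symbol_count < 1:
        raise ValueError("need at least one symbol")
    # (n - 1).bit_length() is the ceiling of log2(n) for n >= 2
    return max(1, (symbol_count - 1).bit_length())


print(2 ** 8, 2 ** 16)  # 256 65536

for count in (650, 2800, 8000):
    bits = bits_needed(count)
    print(f"{count} characters: {bits} bits ({2 ** bits} codes)")
# 650 characters: 10 bits (1024 codes)
# 2800 characters: 12 bits (4096 codes)
# 8000 characters: 13 bits (8192 codes)
```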

But devs in the nineties, who had big troubles fitting the text in a tiny ROM, disagreed about that. So instead of using whole bytes (each byte being 8 bits), they'd encode each character on a number of bits that is not a multiple of 8. That's a simple compression.

Lagrange Point had less than 0x80 characters. And those characters were two identical sets (like upper case and lower case). So they used 6 bits per character.

Battle of Olympus only needed 26 English letters and a few control codes. So they used 5 bits per character.

As Rotwang said in his tutorial linked here, that turd of a game for NES based on the Simpsons used 4 bits (half a byte, also called a "nibble") per character for the most needed letters, and 8 bits (one byte) for the other letters. Huffman compression takes this idea further and is widely used, but its character length in bits is variable and thus wouldn't interest us.

There are games using even 12 bits per character (Zelda Kagami no Triforce and Tengai Makyou Zero for example).

Well. Some hex editors offer options to modify the "unit" from a byte (8 bit) to a nibble (4 bit) or a half word (16 bits/2 bytes) or a word (32 bits/4 bytes). But all of these are powers of two. If these options were freely configurable, so that a table could use values that are not a whole number of bytes per character (be it 5, 6, 10, 12...), it would help for a few games (I bet the main usefulness would be for stuff like the slightly more common 3-byte characters though).

If you're just gonna encode these characters as continuous bits streams, then the streams must always be multiples of 8, in terms of bytes, to be even.

for 5-bit characters you don't hit a multiple of 8 until 40 bits (every 8 characters)
for 6-bit characters, 48 bits is 8 characters (though 24 bits, 4 characters, already lands on a byte boundary)
for 7-bit characters not until 56 (8 characters)

So below 8 bits, it's always an 8 character sequence.... and I think this continues on... it's always eight characters. But after 10/80, the length of the sequence (in bytes) must be the same as the bit length to always be even.

(13 x 8 = 104, 17 x 8 = 136)

In other words, any sequence so coded/compressed must be a multiple of the binary size x 8 bytewise. If you have a five bit sequence, it must be sized 5 bytes, 10 bytes, 15 bytes etc; for a 6 bit sequence, 6 bytes, 12 bytes, 18 bytes, etc.; and so on. Any file produced by such an editor will naturally have a size that is a multiple of its bit/character range.
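The boundary arithmetic above can be checked with a short Python sketch. The function name `alignment` is made up for the example. It computes the shortest run of characters that ends exactly on a byte boundary, which is the least common multiple of the character width and 8. Eight characters always line up, but for even widths a shorter run does too, so treat the "always eight" rule as a safe upper bound rather than the minimum.

```python
from math import gcd


def alignment(bits_per_char: int) -> tuple:
    """Return (characters, bytes) in the shortest run ending on a byte boundary."""
    # Least common multiple of the character width and 8 bits
    total_bits = bits_per_char * 8 // gcd(bits_per_char, 8)
    return total_bits // bits_per_char, total_bits // 8


for width in (5, 6, 7, 10, 12, 13):
    chars, size = alignment(width)
    print(f"{width}-bit: {chars} characters fill {size} bytes")
# 5-bit: 8 characters fill 5 bytes
# 6-bit: 4 characters fill 3 bytes
# 7-bit: 8 characters fill 7 bytes
# 10-bit: 4 characters fill 5 bytes
# 12-bit: 2 characters fill 3 bytes
# 13-bit: 8 characters fill 13 bytes
```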

I'm pretty sure that Dragon Warrior 2 uses 5 bits for its tiles... I do remember having enormous difficulty following the DWEdit's viewer code for its maps.

« Last Edit: April 12, 2016, 03:13:34 am by zonk47 »

If you're just gonna encode these characters as continuous bits streams, then the streams must always be multiples of 8, in terms of bytes, to be even.

Err... I'm not the one who encoded them that way, the game programmers did (and made it work so it's obviously not impossible). Also, it's entirely possible to make the hex editor parse units of less than 8 bits to interpret as characters. There's that nice thing called bit shifting, and logical operations...

Just one example of a possible implementation. Supposing the current setting is 7 bits / character.

Hex editor reads the first byte. We need the first 7 bits.

So, we have one leftover bit. We copy him, and him alone, with an "AND #$01" (0x01 = 0000 0001) to a value called DATA_LEFTOVER (which uses a whole byte). We have a variable HOWMANY_LEFTOVER telling us how many leftover bits there are. We put 1 in that variable. DATA_LEFTOVER is shifted to the left "7 minus HOWMANY_LEFTOVER" times.

For the first byte, we copy it to our first character, shift its bits to the right as many times as HOWMANY_LEFTOVER (so, only once), and do an "AND #$7F" (0x7F = 0111 1111). We get our first character.

Now, the hex editor reads the first 7 bits minus HOWMANY_LEFTOVER. So only 6 bits. Now those 6 bits are copied to our second character after some bit shifting to the right, then OR'd with DATA_LEFTOVER (which has the remaining bit of that character that was in the first byte).

And so on. It's very doable without major changes to how the hex editor works.
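A minimal Python sketch of the shift-and-mask idea, for anyone who finds code easier to follow than prose. This is not the exact DATA_LEFTOVER / HOWMANY_LEFTOVER bookkeeping described above but an equivalent variant that keeps the leftover bits in one integer buffer. The name `unpack_chars` is made up, and reading the most significant bit first is an assumption that matches the 7-bit walkthrough; a given game may order its bits differently.

```python
def unpack_chars(data: bytes, bits_per_char: int) -> list:
    """Split data into bits_per_char-wide values, most significant bit first."""
    mask = (1 << bits_per_char) - 1
    buffer = 0    # leftover bits carried over from earlier bytes
    buffered = 0  # how many bits the buffer currently holds
    out = []
    for byte in data:
        # Append the new byte to the right of the leftover bits
        buffer = (buffer << 8) | byte
        buffered += 8
        # Peel off as many whole characters as the buffer now contains
        while buffered >= bits_per_char:
            buffered -= bits_per_char
            out.append((buffer >> buffered) & mask)
        # Keep only the bits that were not consumed
        buffer &= (1 << buffered) - 1
    return out


# 7 bits/character: 1000001 then 1000000 (last 2 bits are unused padding)
print([hex(v) for v in unpack_chars(bytes([0x83, 0x00]), 7)])  # ['0x41', '0x40']

# 5 bits/character: 00001 00010 00011 (last bit is unused padding)
print(unpack_chars(bytes([0x08, 0x86]), 5))  # [1, 2, 3]
```

Any trailing bits that do not make a full character are simply dropped here; a real editor would have to decide how to display them.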

The size of the stream produced nonetheless ought to be as many bytes as there are bits per character. For 13 bit character, 13 bytes etc.

My method would be to make an array of as many bits in a minimum sequence (say, 48 for 6 bit characters/6 byte sequence) and use multiplication to find the first bit of the character, then just start setting 1s and 0s. This way you have your data model for the writing. To do the writing, set the bits based on the values in the array using a per-byte/per-bit loop.
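For comparison, the bit-array approach can be sketched in Python as follows. This is only one reading of the description: the name `pack_chars`, the most-significant-bit-first order, and the zero padding of the final byte are all assumptions made for the example.

```python
def pack_chars(values, bits_per_char: int) -> bytes:
    """Pack values into a byte string via an explicit list of 0/1 bits."""
    # The data model: one list entry per bit, initially all zero
    bits = [0] * (len(values) * bits_per_char)
    for index, value in enumerate(values):
        first = index * bits_per_char  # multiplication finds the first bit
        for offset in range(bits_per_char):
            shift = bits_per_char - 1 - offset
            bits[first + offset] = (value >> shift) & 1
    # Assumption: pad with zero bits up to the next whole byte
    bits += [0] * (-len(bits) % 8)
    # The writing step: a per-byte, per-bit loop over the array
    out = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


print(pack_chars([1, 2, 3], 5).hex())    # 0886
print(pack_chars([0x41, 0x40], 7).hex()) # 8300
```

The explicit bit list costs more memory than shifting into an integer, but each step can be inspected on its own, which may be the easier trade-off while learning.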

Your way is probably more space efficient, though it's very difficult to follow. You'll have to do a comprehensive write up.

The vb8086 debugger is coming along well; however, I'm having difficulty understanding the extension mechanism (that is, opcodes beyond $FF that are accessible via the GRP instructions). How do these things work, in terms of bytecode?

« Last Edit: April 12, 2016, 04:14:14 am by zonk47 »

A few years ago I tried writing a tutorial on grasping Hexadecimal pertaining to ROM hacking from the perspective of a 100% inexperienced nooblet, covering real-world questions like "what the fuck am I looking at?" and "why the fuck would anyone do it this way?" It's here: http://www.baddesthacks.net/?p=1118

Good documentation and lessons on the fundamentals combined with the reader's drive to learn and master a skill is the only realistic answer to this thread's great question.

I totally agree 100% on this. I think the community is pretty good about helping people when they need specific help, but I think it might be asking too much for folks who are doing this as a hobby to step in and take the time to teach people on a consistent basis. Or at least on a basis that would be useful for new hackers--that's why I think the type of doc that Rotwang made is really really valuable.

If there are more tutorials of this kind, I think just getting a good compilation of them (and maybe even reviews so folks don't end up starting with bad ones) is probably the best way to ease the growing pains. Is there a place like this we could maybe redirect people to if they want learning materials? Or how much work do you guys think it would take to put something accessible together?

The size of the stream produced nonetheless ought to be as many bytes as there are bits per character. For 13 bit character, 13 bytes etc.

My method would be to make an array of as many bits in a minimum sequence (say, 48 for 6 bit characters/6 byte sequence) and use multiplication to find the first bit of the character, then just start setting 1s and 0s. This way you have your data model for the writing. To do the writing, set the bits based on the values in the array using a per-byte/per-bit loop.

Your way is probably more space efficient, though it's very difficult to follow. You'll have to do a comprehensive write up.

The vb8086 debugger is coming along well; however, I'm having difficulty understanding the extension mechanism (that is, opcodes beyond $FF that are accessible via the GRP instructions). How do these things work, in terms of bytecode?

I agree with GHANMI. The bit-to-byte matching proposed here is totally unnecessary. Bit shifting + bit masking handles this nicely. You can even map to bitfield structures if you really want that (there are compiler compatibility limitations for those, so you might want to stick to shifting)... Unless you're writing this with VB in mind and VB has some language limitations with regard to bit handling? (I don't know VB that much.)

A few years ago I tried writing a tutorial on grasping Hexadecimal pertaining to ROM hacking from the perspective of a 100% inexperienced nooblet, covering real-world questions like "what the fuck am I looking at?" and "why the fuck would anyone do it this way?" It's here: http://www.baddesthacks.net/?p=1118

Good documentation and lessons on the fundamentals combined with the reader's drive to learn and master a skill is the only realistic answer to this thread's great question, but Zonk is against this in favor of a one size fits all magic ROM hack button that will never exist.

Now if you'll excuse me I have a global warming deniers rally to attend.

Those two tutorials you wrote are both hilarious and informative. I always wondered what people were talking about when they mentioned "nybbles."

And the method you mentioned for creating table files was a great tip! I'll keep it in mind the next time I'm text-hacking a game and can't track down its font graphics (my usual method of finding values to make a table file with). Damn fine reads!

There is a fairly commonly used phrase along the lines of 'Explain it to me like I'm a 5 year old' which is probably why the tutorials at BaddestHacks are so good, because the average contributor there has the mentality of a 5 year old. ZIIINNNNGGGGGGG!

Yes, but let the record show that you registered there last week

I honestly love writing documents like this because it forces me to verify everything I think I already know about the subject and in the end I wind up learning some things myself or correcting any misunderstandings I might have had.

Yeah, I've heard the saying that you don't really understand something completely unless you can explain it to a 5 year old. That must be a really good exercise that makes sure you understand each concept fully. I think the same goes for the readers too, even the more advanced might have some misunderstandings that wouldn't be corrected if they were not taken for "dummies"