Adventures in C++ Hotloading: Changing Data Structures at Runtime

I’ve been working on a miniature game engine (lovingly called Mingine) as a little side-project lately. It’s very barebones right now – pretty much just a platform layer based on SDL with a sprinkle of WinAPI, a mostly-functional graphics API abstraction layer (with an OpenGL backend), some image loading code, input handling…

The glClear is always greener somewhere else!

Oh, and it supports C++ hot-reloading, which is actually the whole reason I started Mingine in the first place. In this article, I’ll be talking a little bit about the early state of that architecture, as well as some of my takeaways from it. This will be a little longer than usual, so buckle up!

Background

Like most of the other implementations of C/C++ hot-reloading I’ve seen, my implementation borrows many ideas from Casey Muratori’s Handmade Hero series. The basis of my architecture is very similar – all of the platform code is located in an executable, while all game code is stored in a shared library during development (ideally, it would be statically linked for release), along with a static library which hosts a bunch of “engine” code. However, unlike Casey, I would rather not reinvent the wheel for literally everything – sometimes external libraries are pretty handy!

The biggest problem I wanted to address was the fact that Handmade Hero’s implementation can’t change data structures at runtime, at least as far as I’ve seen. While there is certainly value in allowing certain parts of memory to exist, untouched, through a code reload, I would really like to be able to change certain structures without having to worry too much about what it might affect. With that in mind, I had to handle memory a bit differently.

Changing Data Structures at Runtime

For our purposes, there are three major types of in-game memory that we have to manage:

Memory that has to survive an entire frame and can then be safely deleted. In some architectures this could all be allocated on the stack, but since I plan to allow for interleaved frame processing, it needs to be handled a bit more intricately. A valid name for this is scratch memory, but for style, we’ll call it short-term memory.

Memory whose structure is rigid, persists for a very long time, manages itself, and needs to safely live through reloads untouched. This type of memory includes things like asset data, graphics objects, etc. We’ll call this long-term memory.

Memory whose structure might change at runtime. This would likely come into play as you’re building out new gameplay systems, fundamentally changing a boss fight, things like that. We’ll call this type of memory Crazy Eddie.

As you can imagine, each type of memory has drastically different storage requirements based on its usage and lifetime.

For short-term memory, Mingine employs something very similar to the concept of a tagged heap from Naughty Dog’s 2015 GDC talk (if you haven’t watched it, you really should – it changed my life). This is partly because I intend on handling threading very similarly to the way they do it, but also because the concept of tagging memory just fits this type of problem very well. Furthermore, since this memory is short-lived, it can actually handle some changes in data structures as long as you’re really careful about it. Freeing of this memory is controlled (almost) entirely by the platform layer, which means that it’s generally not the game’s problem to worry about.
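To make the tagging idea concrete, here is a deliberately tiny sketch. It is not Mingine's allocator or Naughty Dog's: the class name TaggedHeap and its methods are made up for illustration, and it does one malloc per allocation where a production version would typically carve allocations out of larger blocks. The point is only that every allocation carries a tag (for example, a frame number), and that freeing is done per tag rather than per pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified tagged heap: allocations are grouped by a tag
// and released all at once when that tag is retired.
class TaggedHeap {
public:
    TaggedHeap() = default;
    TaggedHeap(const TaggedHeap&) = delete;             // owns raw blocks, so no copies
    TaggedHeap& operator=(const TaggedHeap&) = delete;

    ~TaggedHeap() {
        for (auto& entry : blocks_)
            for (void* block : entry.second) std::free(block);
    }

    // Allocate `bytes` and remember the block under `tag`.
    void* Allocate(std::uint64_t tag, std::size_t bytes) {
        assert(bytes > 0);
        void* block = std::malloc(bytes);
        if (block != nullptr) blocks_[tag].push_back(block);
        return block;
    }

    // Free every block that was allocated under `tag`.
    void FreeTag(std::uint64_t tag) {
        auto it = blocks_.find(tag);
        if (it == blocks_.end()) return;
        for (void* block : it->second) std::free(block);
        blocks_.erase(it);
    }

    // Number of blocks still alive under `tag` (handy for debugging).
    std::size_t LiveBlockCount(std::uint64_t tag) const {
        auto it = blocks_.find(tag);
        return it == blocks_.end() ? 0 : it->second.size();
    }

private:
    std::unordered_map<std::uint64_t, std::vector<void*>> blocks_;
};
```

In a setup like this, the platform layer would call something like FreeTag(frameNumber) once a frame is completely finished, so game code never frees short-term memory itself.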

For long-term memory, nothing particularly special has to be done. The platform layer passes a game_memory struct which includes a userdata pointer. The game layer, on initialization, assigns that userdata pointer to a new instance of MemoryHeader, defined in the game code, and stores a copy of that pointer in global memory. Since it’s all located in the heap, you can unload/reload the DLL as many times as you want and, as long as you don’t stupidly change the structure, you don’t have to worry about a thing. The MemoryHeader contains all of the allocated graphics objects, the physics world, cached asset data – all of the stuff that we know for a fact at compile time will be there, and that we need to live across reloads.
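A rough sketch of that handshake, under stated assumptions: only the names game_memory, userdata, MemoryHeader, g_MemoryHeader and GameLoadDLL come from this article. The fields, the function signature, the isReload flag as a parameter, and the SimulateDLLUnload helper are all invented here for illustration.

```cpp
#include <cassert>

// Placeholder contents; the real MemoryHeader holds graphics objects,
// the physics world, cached asset data, and so on.
struct MemoryHeader {
    int loadedTextureCount = 0;
};

// Passed in by the platform layer. The pointed-to data lives on the heap,
// so it outlives any single copy of the game DLL.
struct game_memory {
    void* userdata = nullptr;
};

// Global inside the game DLL: lost on every reload, so it must be
// reassigned each time the DLL is loaded.
static MemoryHeader* g_MemoryHeader = nullptr;

// Assumed signature. Called by the platform layer before any other game code.
void GameLoadDLL(game_memory* memory, bool isReload) {
    assert(memory != nullptr);
    if (!isReload) {
        memory->userdata = new MemoryHeader();  // first load: create long-term memory
    }
    // First load or reload, re-point the DLL-local global at the heap data.
    g_MemoryHeader = static_cast<MemoryHeader*>(memory->userdata);
}

// Hypothetical stand-in for what unloading the DLL does to its globals.
void SimulateDLLUnload() {
    g_MemoryHeader = nullptr;
}
```

The heap allocation is never touched by the unload, which is why the structure has to stay rigid: the new DLL reinterprets the same bytes using its own definition of MemoryHeader.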

Crazy Eddie

Crazy Eddie is different. That guy’s batshit crazy and totally unpredictable. While we know the lifetime of our short term memory and the exact structure of our long term memory way in advance, we have absolutely no idea what Crazy Eddie is going to do once the game is running. Will he erroneously gain boatloads of memory and crash the game? Will he suddenly decide he doesn’t want to store anything anymore and get rid of all of his fields before a reload? Will he go streaking? Who knows! We just know that we have to deal with him. As long as we’re smart about this, it turns out he’s not such a bad guy. His insanity can be incredibly useful.

The key behind dealing with Crazy Eddie is twofold. Firstly, we give him an indirection (that was the most pertinent link I could find). We keep track of him with a pointer on the MemoryHeader, created any time the DLL is loaded and freed any time the DLL is unloaded. However, before we unload, we execute step two of dealing with Crazy Eddie: serialization. The general flow works like this:

The platform layer detects that the DLL has been updated

It sets up a cereal::BinaryOutputArchive, backed by a std::stringstream (a string is just a sequence of bytes, after all), and passes it, along with some other important arguments, to the game code’s exported GameUnloadDLL function.

GameUnloadDLL knows that we’re about to reload, so it tells Crazy Eddie to write all of his data to the archive so that he can be resurrected later on. Crazy Eddie is then deleted.

The platform layer seeks the stringstream back to position 0, creates a BinaryInputArchive, and passes it to GameLoadDLL. It also tells GameLoadDLL that this is a reload so it knows to actually do something with the archive.

Since this is a reload, GameLoadDLL knows that Eddie used to exist. It assigns g_MemoryHeader->CrazyEddie (definitely the name in-engine) to a new, uninitialized CrazyEddie, hands him the BinaryInputArchive, and tells him to relearn all that he had learned.
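The steps above can be sketched end to end. To keep the example self-contained, it writes raw bytes to a std::stringstream by hand instead of going through cereal's archives, so treat the Save/Load functions as stand-ins. The fields on CrazyEddie, the function signatures, and PlatformReload are assumptions made for illustration, not Mingine's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>

// Hypothetical volatile game state.
struct CrazyEddie {
    std::int32_t health = 0;
    float cameraYaw = 0.0f;

    // Stand-ins for cereal's archive calls: write/read fields in a fixed order.
    void Save(std::ostream& out) const {
        out.write(reinterpret_cast<const char*>(&health), sizeof health);
        out.write(reinterpret_cast<const char*>(&cameraYaw), sizeof cameraYaw);
    }
    void Load(std::istream& in) {
        in.read(reinterpret_cast<char*>(&health), sizeof health);
        in.read(reinterpret_cast<char*>(&cameraYaw), sizeof cameraYaw);
    }
};

struct MemoryHeader {
    CrazyEddie* crazyEddie = nullptr;  // the indirection: recreated on every load
};

// Old DLL: serialize Eddie, then delete him.
void GameUnloadDLL(MemoryHeader& header, std::ostream& archive) {
    assert(header.crazyEddie != nullptr);
    header.crazyEddie->Save(archive);
    delete header.crazyEddie;
    header.crazyEddie = nullptr;
}

// New DLL: create a fresh Eddie and, on a reload, restore his data.
void GameLoadDLL(MemoryHeader& header, std::istream& archive, bool isReload) {
    header.crazyEddie = new CrazyEddie();
    if (isReload) header.crazyEddie->Load(archive);
}

// Platform layer: runs once it has detected an updated DLL.
void PlatformReload(MemoryHeader& header) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    GameUnloadDLL(header, buffer);
    // ... the real platform layer would unload the old DLL and load the new one here ...
    buffer.seekg(0);  // rewind so the new DLL reads from the start
    GameLoadDLL(header, buffer, /*isReload=*/true);
}
```

Everything Eddie owned is freed by code from the old DLL and reallocated by code from the new one, so nothing with a stale layout survives the swap.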

This serialization doesn’t require the same data structure at all – it just needs to know the order in which data was saved (which it implicitly knows). This allows us to add, remove, and rearrange fields at runtime to our heart’s content, so long as we’re mindful about how we do it. And if we make a mistake, we can just restart the program. No big deal.
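Here is one way that can play out, as a sketch rather than a description of how Mingine actually does it. EddieOld stands for the struct compiled into the outgoing DLL and EddieNew for the one in the incoming DLL. Both names, their fields, and the WriteRaw/ReadRaw helpers are invented for this example. The new layout reorders the fields and adds one, and its Load reads the stream in the order the old code saved it.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>

// Small helpers standing in for a real serialization library.
template <typename T>
void WriteRaw(std::ostream& out, const T& value) {
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
}
template <typename T>
void ReadRaw(std::istream& in, T& value) {
    in.read(reinterpret_cast<char*>(&value), sizeof value);
}

// Layout compiled into the old DLL. Saved order: health, ammo.
struct EddieOld {
    std::int32_t health = 0;
    std::int32_t ammo = 0;
    void Save(std::ostream& out) const {
        WriteRaw(out, health);
        WriteRaw(out, ammo);
    }
};

// Layout compiled into the new DLL: fields rearranged, one field added.
struct EddieNew {
    float stamina = 1.0f;    // new field: not in the old stream, keeps its default
    std::int32_t ammo = 0;
    std::int32_t health = 0;
    // Reads in the order the OLD code saved, not in declaration order.
    void Load(std::istream& in) {
        ReadRaw(in, health);
        ReadRaw(in, ammo);
    }
};

// Simulates one reload across the layout change.
EddieNew ReloadAcrossLayouts(const EddieOld& old) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    old.Save(buffer);
    buffer.seekg(0);
    EddieNew fresh;
    fresh.Load(buffer);
    assert(!buffer.fail());  // every read found the bytes it expected
    return fresh;
}
```

The "mindful" part is that the new code's save order becomes the contract for the following reload: once stamina is written out, the next version's Load has to read it back in the same position.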

Another benefit of doing things this way is that the system can very easily be extended to save game state to disk for later use. This game state would only be usable for very similar versions of the game (since the data isn’t currently versioned), but even a limited form of this is incredibly valuable for debugging purposes. I think Casey Muratori does something kind of similar with his looped live code editing, but I don’t personally think that the looping workflow would be useful for me.

Overall, it’s a fairly simple way of solving the problem, but it works fantastically for my purposes.

Limitations

Now, all of this works within some limitations. Firstly, it should be noted that objects of classes with virtual functions, if created from the DLL, can’t be safely stored in the MemoryHeader, even if they’re never used polymorphically. This is because, in practice, each such object holds a pointer to a vtable that lives inside the DLL image, and that pointer may (and probably will) be left dangling by a DLL reload. This might not sound like a big deal, but I’ve seen a lot of code that adds a virtual destructor for seemingly no reason. In that case, the program would crash on shutdown as it tries to call an invalid destructor (which doesn’t sound like the worst thing in the world, but it’s still undesirable behavior).
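One cheap guard against this mistake is a compile-time check. The sketch below is an assumption about how such a check could look, not something Mingine is stated to contain: SurvivesReload, IsReloadSafe, Texture and Widget are hypothetical names. It relies on std::is_polymorphic, which is true for any class that declares or inherits a virtual function, including a lone virtual destructor.

```cpp
#include <type_traits>

// True when T has no virtual functions (and therefore no vtable pointer).
template <typename T>
constexpr bool IsReloadSafe() {
    return !std::is_polymorphic<T>::value;
}

// Hypothetical wrapper for anything stored in long-term memory.
// Note: this only inspects T itself, not the types of T's members.
template <typename T>
struct SurvivesReload {
    static_assert(IsReloadSafe<T>(),
                  "types with virtual functions must not live across a DLL reload");
    T value;
};

// Plain data: fine to keep in the MemoryHeader.
struct Texture {
    unsigned id = 0;
};

// The "virtual destructor for seemingly no reason" case.
struct Widget {
    virtual ~Widget() = default;
};

SurvivesReload<Texture> g_exampleTexture{};   // compiles
// SurvivesReload<Widget> g_exampleWidget{};  // would be rejected at compile time
```

Since the check is shallow, a struct that merely contains a Widget member would slip through, so it catches the accidental cases rather than proving safety.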

Furthermore, my serialization logic does nothing to serialize polymorphic types that it can’t identify at compile time. For example, if you have a class Dog which inherits from Animal, you would have to store a pointer to Dog instead of a pointer to Animal. I have a pretty good idea of how to extend Crazy Eddie to support polymorphic types generically if that ever becomes necessary, but I can’t think of many cases in which I would need a polymorphic type with absolutely no foreknowledge of what it would be at compile time, or where a little bit of extra serialized metadata couldn’t solve the problem.
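For illustration only, here is one common shape the "extra serialized metadata" approach can take: write a small type tag before the object's fields and switch on it when loading. This is a generic technique, not necessarily the extension hinted at above. Dog and Animal come from the example in the text. Cat, AnimalKind, and every function name are invented here, and raw stream writes again stand in for a real archive.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <sstream>

enum class AnimalKind : std::uint8_t { Dog = 1, Cat = 2 };

struct Animal {
    virtual ~Animal() = default;
    virtual AnimalKind Kind() const = 0;
    virtual void SaveFields(std::ostream& out) const = 0;
    virtual void LoadFields(std::istream& in) = 0;
};

struct Dog : Animal {
    std::int32_t barkVolume = 0;
    AnimalKind Kind() const override { return AnimalKind::Dog; }
    void SaveFields(std::ostream& out) const override {
        out.write(reinterpret_cast<const char*>(&barkVolume), sizeof barkVolume);
    }
    void LoadFields(std::istream& in) override {
        in.read(reinterpret_cast<char*>(&barkVolume), sizeof barkVolume);
    }
};

struct Cat : Animal {
    std::int32_t lives = 9;
    AnimalKind Kind() const override { return AnimalKind::Cat; }
    void SaveFields(std::ostream& out) const override {
        out.write(reinterpret_cast<const char*>(&lives), sizeof lives);
    }
    void LoadFields(std::istream& in) override {
        in.read(reinterpret_cast<char*>(&lives), sizeof lives);
    }
};

// Write the one-byte type tag first, then the concrete type's fields.
void SaveAnimal(const Animal& animal, std::ostream& out) {
    out.put(static_cast<char>(animal.Kind()));
    animal.SaveFields(out);
}

// Read the tag, construct the matching concrete type, then load its fields.
// Returns nullptr on an empty stream or an unknown tag.
std::unique_ptr<Animal> LoadAnimal(std::istream& in) {
    const int raw = in.get();
    if (!in) return nullptr;
    std::unique_ptr<Animal> result;
    switch (static_cast<AnimalKind>(static_cast<std::uint8_t>(raw))) {
        case AnimalKind::Dog: result.reset(new Dog()); break;
        case AnimalKind::Cat: result.reset(new Cat()); break;
        default: return nullptr;
    }
    result->LoadFields(in);
    return result;
}
```

Objects restored this way are constructed by the freshly loaded DLL, so their vtable pointers refer to the new image. That only holds as long as they live in memory that is torn down and rebuilt on each reload.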

Also, we have to operate with the knowledge that all global data will be lost on a DLL reload. This is the entire purpose of the different memory tiers, but it still seems important to say. The only things stored globally are pointers to data that either lives in the platform layer or is accessible via the MemoryHeader. These pointers are assigned in the GameLoadDLL function (called before any other DLL code), which ensures that they are always valid. Unless something goes horribly, horribly wrong.

In closing…

Overall, this system has worked quite well so far. I’m hardly far enough along to say with complete confidence that this experiment has been worth it, but given how neat and handy it’s been so far, I’m definitely leaning that way. Little things like building out a basic first person camera controller without ever closing the game have already shown me that this methodology has a lot of potential, especially when it comes to tweaking.

Thanks to SDL, the code is also very portable. I started off using the WinAPI functions for DLL handling, but after porting all of that over to SDL’s equivalent functionality, I got it to compile on Linux with very little fuss (other than some weirdness with a particularly misleading gcc error regarding linking static and shared libraries). Hot reloads don’t currently seem to work correctly on Linux, which I’ll have to look into, but given that the code otherwise runs identically and I tend to do my development on Windows anyway, that’s not a huge deal for me.

Hopefully this has been somewhat useful to somebody out there. I’ll probably be posting more about it as I continue to work on the engine, so if you did find it useful, stay tuned!