Event driven game engine

This is a discussion on Event driven game engine within the Game Programming forums, part of the General Programming Boards category.

Since the subject has been discussed recently, I thought I would post a few of the considerations to keep in mind when writing an event driven game engine.

The core concept in an event driven game engine is the message loop. It is similar to the Windows message loop, in that the game keeps a list of things to do each iteration of the game. One big difference, though, is that modern event driven game engines use a second message loop for things that will occur in the future but do not need to be checked every iteration. For example, if you click on a factory to build a unit, the event to create the unit will occur at some point in the future, let's say in 10 seconds. You don't need to check whether those 10 seconds are up every game tick, which would mean checking some 300 times (at 30 fps) before it was actually time to create the unit. So there are two event loops, or lists: the events that must be processed every game tick, such as updating a unit's position or rendering it on screen, and the events that only need to be processed when their timers expire, such as unit creation.

So the game engine has a linked list of events called the main event loop. Each event has a type. The event loop pulls the first event off the list and processes it. It then deletes the event object and proceeds to the next event in the list. It continues to do this until it either reaches the end of the event list or processes a QUIT event. As I will explain, a properly written event loop will never reach the end of the event list.
The first event type I will explain is the Render Game State event. It sounds pretty obvious that this event renders the buffer onto the screen, but it does more than that. It also adds a new Render Game State event to the back of the event list, so that there is always an event on the list that will render the current game state. This is a self refreshing event, and many events are of this type. The QUIT event will cause the game to terminate immediately, as if it had received a WM_QUIT message. This is used to shut down the game in response to, e.g., the player hitting the exit button on a menu. So when the application starts up, the initialization code would place the Render Game State event on the loop.
The next event type that some engines use is the Process Windows Messages event, which iterates through all pending WM_ messages and processes them. Personally I dislike this, as it locks the mouse and keyboard processing to your frame rate. I prefer to write the message loop so that it processes any pending Windows messages each iteration, then processes a game event, then checks again for pending Windows messages. This does create a small amount of overhead, as you are polling for Windows events much more often, but the overhead is very minor and it creates just awesome responsiveness, which the player will appreciate.
The next event I'd like to discuss is Process Timed Event. This is also a self refreshing event, and it causes the engine to check the timed events (e.g. the create unit event mentioned earlier).
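The main loop described above, with a self refreshing Render event and a QUIT event, can be sketched roughly as follows. This is only an illustration: `Event`, `EventType`, `run_loop` and the `frames_before_quit` parameter are made-up names, a `std::deque` stands in for the linked list, and the Quit event is queued after a fixed number of frames as a stand-in for the player pressing Exit.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical event types; a real engine would have many more.
enum class EventType { Render, Quit };

struct Event {
    EventType type;
};

// Runs the main event loop until a Quit event is processed.
// Returns a log of the processed events so the behaviour can be inspected.
std::vector<std::string> run_loop(std::deque<Event>& events, int frames_before_quit) {
    assert(frames_before_quit > 0);
    std::vector<std::string> log;
    int frames = 0;
    while (!events.empty()) {
        // A Windows build could drain pending window messages at this point.
        Event ev = events.front();  // pull the first event off the list
        events.pop_front();         // and delete it from the list
        switch (ev.type) {
        case EventType::Render:
            log.push_back("render");
            ++frames;
            // Stand-in for the player choosing Exit after a few frames.
            if (frames == frames_before_quit) events.push_back({EventType::Quit});
            // Self refresh: always leave a Render event at the back of the list.
            events.push_back({EventType::Render});
            break;
        case EventType::Quit:
            log.push_back("quit");
            return log;             // terminate immediately
        }
    }
    return log;  // not reached while Render keeps refreshing itself
}
```

Seeding the deque with a single Render event and calling `run_loop(events, 3)` processes three renders and then the quit; one refreshed Render event is still on the list when the loop exits, which is why the loop never runs dry on its own.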
The timed event list must always be kept sorted so that the first item on the list will expire first. This way the engine needs only check the first item; if it hasn't expired, the engine can simply ignore all other events on the timed event list, saving a lot of processing time. There are several variations in how the engine handles an expired event. Some engines process the item immediately, some move it to the top of the main event list so that it gets processed later. There is also some variation in what the engine does next. Some engines only process the first item, others continue checking for expired items. The benefit of processing one item only is that if a large number of events expire simultaneously it will not cause a noticeable lag in the game's frame rate, as only a single event is processed per frame. The drawback is that if the game is generating more than 1 timed event per frame for an extended period of time, the backlog can cause anomalies like units not being created for several seconds after they are finished. I could write a PhD thesis on all the trade offs and optimizations that are possible, but I'll save that for my PhD thesis.
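Here is a minimal sketch of that head-only check. Note that it swaps in a binary heap (`std::priority_queue`) for the sorted linked list described above; the property that matters is the same, namely that the earliest event is always the one you look at first. `TimedEvent`, `check_timed` and the `drain` flag are hypothetical names; `drain` switches between the "one expired event per frame" and "all expired events" policies.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <string>
#include <vector>

// Hypothetical timed event: it fires once the clock reaches due_ms.
struct TimedEvent {
    std::uint64_t due_ms;
    std::string name;
};

// Comparator that puts the earliest due time at the top of the heap.
struct LaterFirst {
    bool operator()(const TimedEvent& a, const TimedEvent& b) const {
        return a.due_ms > b.due_ms;
    }
};

using TimedQueue =
    std::priority_queue<TimedEvent, std::vector<TimedEvent>, LaterFirst>;

// Looks only at the head of the queue. If the head has not expired,
// nothing behind it can have expired either, so the check stops there.
// drain == false: return at most one expired event per call (one per frame).
// drain == true:  return every event that has expired by now_ms.
std::vector<std::string> check_timed(TimedQueue& q, std::uint64_t now_ms, bool drain) {
    std::vector<std::string> fired;
    while (!q.empty() && q.top().due_ms <= now_ms) {
        fired.push_back(q.top().name);
        q.pop();
        if (!drain) break;
    }
    return fired;
}
```

With `drain` set to false, two events that expire in the same tick come out on two consecutive calls, which is exactly the backlog behaviour described above.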
But what about the units themselves? Well, each unit gets its own event, which is the Update Unit event. This usually self refreshing event updates the physics of the object, renders it to the buffer, and conditionally creates a new event on the loop. I say conditionally because if the unit is dead or needs to be deleted, it adds a different event, the Delete Object event, and does not refresh the Update Unit event. In a similar manner, the Create Object event does not refresh itself when processed, but instead adds an Update Object event.
Now I need to take a moment here to explain what the events themselves look like in memory. The event loop is basically a linked list of pointers to an Event structure. The Event structure is just a value for the event type, plus a pointer (or other data) giving any information necessary to process that event.
Anyhow, as the events are processed and refreshed, the loop will always have some events in it. The frame rate is basically how fast it can process all the events in the loop between one occurrence of the Render event and the next. If we had 3 objects in the list, the loop might look like this to begin -
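As a rough illustration of that layout and of the conditional refresh, here is a sketch. The `Unit` fields, the `Event` struct and `process_update` are all assumptions made for the example; rendering is left out, and a `std::deque` again stands in for the linked list.

```cpp
#include <cassert>
#include <deque>

// Hypothetical unit state, just enough to show an update.
struct Unit {
    int id;
    float x;
    float vx;
    int health;
};

enum class EventType { UpdateUnit, DeleteObject };

// An event is a type tag plus a pointer to the data needed to process it.
struct Event {
    EventType type;
    Unit* target;  // non-owning; the engine owns the unit elsewhere
};

// Processes one Update Unit event: advance the physics, then either
// refresh the event or queue a Delete Object event in its place.
void process_update(const Event& ev, std::deque<Event>& events, float dt) {
    assert(ev.type == EventType::UpdateUnit && ev.target != nullptr);
    Unit& u = *ev.target;
    u.x += u.vx * dt;
    if (u.health <= 0)
        events.push_back({EventType::DeleteObject, ev.target});  // no refresh
    else
        events.push_back({EventType::UpdateUnit, ev.target});    // self refresh
}
```

A live unit therefore keeps exactly one Update Unit event circulating in the loop, while a dead one leaves behind a single Delete Object event and then drops out.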

Update 1
Update 2
Update 3
Check Timed Events
Render

and after it processed one event it would look like -

Update 2
Update 3
Check Timed Events
Render
Update 1

and would continue. It is important to note that the UI code can insert and delete events based on the player's inputs. This loop basically just processes the physics; it doesn't handle the UI, which would be done through the Windows message loop.

Until you can build a working general purpose reprogrammable computer out of basic components from radio shack, you are not fit to call yourself a programmer in my presence. This is cwhizard, signing off.

I've been thinking about this for a while, mostly since I started using Asio. I'm either going to use Asio or design something similar to Asio, and use it as the core of my engine and its components (graphics, networking, etc). The nice thing is that Asio has intervals built in, among other things. You're right about processing input all at once, and that goes for any subsystem. That's why I like poll_one() from Asio: you can poll_one() an event from each subsystem in a loop (first input, then networking, then graphics, etc) so it's all in sync. It does have a little overhead, as you said. Another great thing about an asynchronous event system is that it can be entirely callback driven (which really pays off with lambdas), meaning you can pass off the workload to threads (which only benefits you if you have multiple cores, 1 thread per core preferable - one for graphics, one for physics, one for networking, one for the rest). This is good for abstraction, of course.

The actual design of the event system itself eludes me. A doubly linked list is definitely an option. I was considering a vector where each index is a millisecond in the timeline: you simply add/remove events at the millisecond you want the event fired, then loop through them in the main loop. Events could renew themselves, and you could keep track of the event timeline, go backward, go forward, or save a chunk of the timeline. Say, saving from index 102980 to 102985 would be saving 5 milliseconds of events (graphics, input, everything), which you could use for playbacks or reversing time, whatever. It's kind of impractical, and you can do that with linked lists too; you just have to set a timestamp when you need it.
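One way to get the same "slice of the timeline" idea without allocating a slot for every millisecond is a sparse, ordered container keyed by timestamp. The sketch below uses `std::multimap`; the `Timeline` alias, the `slice` function and the event strings are hypothetical, and events are reduced to plain strings for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Sparse timeline: only milliseconds that actually contain events take
// up space. Several events may share the same millisecond.
using Timeline = std::multimap<std::uint64_t, std::string>;

// Returns the events with from_ms <= timestamp < to_ms, in time order.
std::vector<std::string> slice(const Timeline& tl,
                               std::uint64_t from_ms, std::uint64_t to_ms) {
    assert(from_ms <= to_ms);
    std::vector<std::string> out;
    for (auto it = tl.lower_bound(from_ms);
         it != tl.end() && it->first < to_ms; ++it) {
        out.push_back(it->second);
    }
    return out;
}
```

Saving the chunk from 102980 to 102985 then becomes a single `slice(timeline, 102980, 102985)` call, and the result can be replayed forward or walked in reverse.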

I'm in the middle of a game and I'm just using a quick std::deque until I decide what I'm doing.

You don't really need a doubly linked list for the main event list, since new or refreshed events are only added to the end. For the timed event list you could use either, since you may need to insert items in the middle of the list. Then again, when you walk the list to do the insert, you could just keep track of the previous link and avoid the overhead of a doubly linked list.
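The "keep track of the previous link" insert can be sketched with `std::forward_list`, which is singly linked and exposes `insert_after` for exactly this purpose. `TimedEvent` and `insert_sorted` are illustrative names, not from any engine.

```cpp
#include <cassert>
#include <cstdint>
#include <forward_list>

// Hypothetical timed event, reduced to a due time and an id.
struct TimedEvent {
    std::uint64_t due_ms;
    int id;
};

// Inserts ev into a singly linked list that is kept sorted by due time.
// The walk remembers the previous node, so no back pointers are needed.
// Using <= keeps events with equal due times in insertion order.
void insert_sorted(std::forward_list<TimedEvent>& list, const TimedEvent& ev) {
    auto prev = list.before_begin();
    for (auto it = list.begin();
         it != list.end() && it->due_ms <= ev.due_ms; ++it) {
        prev = it;
    }
    list.insert_after(prev, ev);
}
```

The insert is still a linear walk, but each node carries one pointer instead of two, and the head of the list is always the next event to expire.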

As for the typos, Cboard was having problems when I was trying to post, so I ended up having to come back later and post using Ctrl+V, which had the unfixed version, and Google Chrome doesn't show typos for pasted text.

What do you think games like Braid do, where you can reverse time/events, or games with playbacks (COD, NHL)?

I had to implement something similar just recently. Not on a game, but on a ZX Spectrum emulator I'm working on (which will indirectly give it record/playback and undo/redo features).

The solution for me was the Command Pattern. I serialize the emulator state and simply keep a command queue complete with timings. Both can be dumped to a single file for later playback. On modern, more complex games a similar solution can be applied, I suppose. You'll find the state serialization (do not forget the almighty seed values) and the command queue(s) to be remarkably small in size. You just make provisions in your game engine to accept this functionality, link them to your engine and voilà: a magic record/playback.
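A toy version of that record/playback idea is sketched below. It is not the emulator's actual code: `Command`, `Sim` and `replay` are invented for the example, the "state" is a single integer, and the commands are assumed to be stored sorted by tick. The point is only that the same seed plus the same timed commands reproduces the same final state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical recorded input: what happened and on which tick.
struct Command {
    std::uint32_t tick;
    int dx;
};

// Minimal simulation state: a position plus a seeded random generator.
struct Sim {
    std::mt19937 rng;
    int x = 0;
    explicit Sim(std::uint32_t seed) : rng(seed) {}
    void step(int dx) {
        // Random jitter, so playback only matches if the seed was saved too.
        x += dx + static_cast<int>(rng() % 3);
    }
};

// Replays a recording from its saved seed. Commands must be sorted by tick.
int replay(std::uint32_t seed, const std::vector<Command>& commands,
           std::uint32_t total_ticks) {
    Sim sim(seed);
    std::size_t next = 0;
    for (std::uint32_t tick = 0; tick < total_ticks; ++tick) {
        int dx = 0;
        // Apply every command that was recorded for this tick.
        while (next < commands.size() && commands[next].tick == tick)
            dx += commands[next++].dx;
        sim.step(dx);
    }
    return sim.x;
}
```

Calling `replay` twice with the same seed and command list yields the same position, so a recording only needs the seed, the initial state and the command queue rather than a snapshot per frame.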

I'm not saying it's easy on modern games by any stretch. But it's certainly doable and you don't need to mess with your message loop in any way.

which only benefits you if you have multiple cores, 1 thread per core preferable

I'm thinking that being conservative doesn't help. Sure, if there's no waiting, 1 thread per core seems to be the right thing to do. But how often will you have that chance in a game engine? Instead, all that networking, graphics, I/O and whatnot is better distributed among several threads, letting the CPU deal with the details.

With so much thread yielding on those subsystems, your game would crawl if you were to define the number of threads of each by the number of cores. Yes, you get better performance if you have more threads than cores.

Well, I have found that because the Render event must be synchronous with all the Updates, you can't really assign more than 1 thread to the event loop. It is also bad practice to have multiple threads rendering to the backbuffer at the same time. While you could use a critical section to allow multiple threads to execute updates, you still need to make sure that every event is processed before rendering. I'm sure there is some elegant way of doing this, but it's late and I'm not seeing it at the moment: some way of ensuring that an arbitrary number of tasks, which may increase or decrease during handling, are all executed before allowing the Render event to execute.

Perhaps have multiple threads, where one thread is the master thread and will execute the Render event, and the worker threads will simply SuspendThread() themselves if they encounter it. Then the master thread needs to ResumeThread() the workers after it finishes Render. But I see two issues. First, the master thread would have to check that all the workers had suspended themselves before executing the Render. Second, you would have to use a critical section when accessing the event loop, which puts a finite upper limit on how many events you can handle per second, something like 15 million for a 3.2GHz P4. That probably wouldn't be an issue at all, but it's a consideration.
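One possible shape for "make sure everything has run before Render" is a shared task queue with a count of running tasks, where the render thread waits on a condition variable until the queue is empty and nothing is in flight. The sketch below uses portable `std::thread`, `std::mutex` and `std::condition_variable` rather than the Win32 SuspendThread()/ResumeThread() calls mentioned above; `FrameWork` and `run_frame` are made-up names, and this is one approach among several, not a claim about how any particular engine does it.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

class FrameWork {
public:
    // Queue a task. Tasks may call post() themselves to add more work.
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push_back(std::move(task));
        }
        cv_.notify_all();
    }
    // Worker body: run tasks until close() is called and the queue is empty.
    void work() {
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            cv_.wait(lock, [this] { return closed_ || !tasks_.empty(); });
            if (tasks_.empty()) return;       // closed and nothing left
            auto task = std::move(tasks_.front());
            tasks_.pop_front();
            ++running_;
            lock.unlock();
            task();                           // run outside the lock
            lock.lock();
            assert(running_ > 0);
            --running_;
            if (tasks_.empty() && running_ == 0) cv_.notify_all();
        }
    }
    // Render thread: block until no task is queued or still running.
    void wait_idle() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return tasks_.empty() && running_ == 0; });
    }
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> tasks_;
    int running_ = 0;
    bool closed_ = false;
};

// Usage: 8 updates run on 3 workers, and each one adds a follow-up task
// while it is being handled. Returns the count the "Render" step sees.
int run_frame() {
    FrameWork fw;
    std::atomic<int> updates{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < 3; ++i) workers.emplace_back([&fw] { fw.work(); });
    for (int i = 0; i < 8; ++i) {
        fw.post([&fw, &updates] {
            ++updates;
            fw.post([&updates] { ++updates; });  // work added during handling
        });
    }
    fw.wait_idle();                // all 16 updates are done at this point
    int seen_by_render = updates;  // the Render event would run here
    fw.close();
    for (auto& t : workers) t.join();
    return seen_by_render;
}
```

Because a task is counted as running while it posts its follow-up work, the idle condition cannot become true until the follow-up has also finished. The mutex around the queue is the same serialization point as the critical section discussed above, so the upper limit on events per second still applies.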

I'm thinking that being conservative doesn't help. Sure, if there's no waiting, 1 thread per core seems to be the right thing to do. But how often will you have that chance in a game engine? Instead, all that networking, graphics, I/O and whatnot is better distributed among several threads, letting the CPU deal with the details.

With so much thread yielding on those subsystems, your game would crawl if you were to define the number of threads of each by the number of cores. Yes, you get better performance if you have more threads than cores.

Not what I read, but it's all good. I'm not nearly concerned with that aspect yet. For now everything is fine without threads. There wouldn't be that much thread yielding. Basically the main thread would build up a list of jobs for each subsystem and pass them off at an interval (say every frame), then receive the responses from the last interval and fire off the callbacks. Each subsystem would be abstracted so they wouldn't interact with each other other than through this method. If you took a normal, single-threaded program that hasn't been abstracted, you'd likely see a lot of interaction and hundreds or thousands of times more yielding. In the case of those programs, threads, even with multiple cores, can run SLOWER than without. I saw tests showing them running at about 65-85% of normal speed because of the overhead of yielding and transferring between cores and whatnot. I think the recommendation was 4-8 threads for 4 cores. Even with this design, it was only running at something like 180% speed, not a dramatic benefit, again due to overhead. It was in a few game engine design articles. I could find links. Kind of old by now, like 2008. Sorry for the vagueness.

According to the UDK, Unreal 3 has a multi-threaded renderer and event system. However, I don't buy the multi-threaded rendering stuff. Rendering is not complete until all aspects of the scene have been rendered. Also, having multiple threads touch the same back buffer and/or off-screen texture is probably not a good idea either, and getting it to work would require a significant number of mutexes or at the least some critical sections. Well, if each thread is waiting for the other to get done with the backbuffer... how is that any different from each item rendering when the scene graph tells it to? It's not. You are still essentially rendering linearly, which does not lend itself well to threads.

I can, however, buy the idea that the event system is threaded b/c it is a perfect candidate for threading. It does not touch the back buffer and probably does not touch any other code that would be modifying data that another thread was modifying. I actually have a working system such as this in my demo space game/tech demo that will never be completed in my lifetime.
It works quite well and it follows most of what abachler has posted, except that I wait 50 ms on a WaitForSingleObject. If the specific event does not fire in that time frame, it proceeds to process a small portion of the messages, if there are any at all. When a message is sent via a broadcast or via a specific route, this fires the event, which causes the event loop to process the message immediately or very close to it. I found that in practice there were not as many messages building up as I had initially feared there might be. In fact the system works quite well, and the major problems I had to solve were more related to thread safety than to the event system becoming bogged down. My super cheese-ball AI objects all function on this same type of event system, and although the AI does manipulate the graphical objects' game state, it does not alter the graphical state, so it's simple to thread.
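A portable approximation of that "wait up to 50 ms, then process a small batch" loop is sketched below. It uses `std::condition_variable::wait_for` where the original uses the Win32 WaitForSingleObject on an event object; `MessageQueue`, `send`, `pump` and the `max_batch` limit are hypothetical names chosen for the example, and messages are plain strings.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>

class MessageQueue {
public:
    // Queue a message and wake the pump; the notify plays the role of
    // signalling the event object.
    void send(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(m_);
            messages_.push_back(std::move(msg));
        }
        cv_.notify_one();
    }

    // Waits up to `timeout` for a message to arrive, then handles at most
    // `max_batch` pending messages. Returns how many were handled.
    template <class Handler>
    std::size_t pump(std::chrono::milliseconds timeout, std::size_t max_batch,
                     Handler handle) {
        assert(max_batch > 0);
        std::deque<std::string> batch;
        {
            std::unique_lock<std::mutex> lock(m_);
            // Returns early if a message is already pending or arrives.
            cv_.wait_for(lock, timeout, [this] { return !messages_.empty(); });
            while (!messages_.empty() && batch.size() < max_batch) {
                batch.push_back(std::move(messages_.front()));
                messages_.pop_front();
            }
        }
        for (auto& msg : batch) handle(msg);  // handle outside the lock
        return batch.size();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<std::string> messages_;
};
```

An event thread would call something like `pump(std::chrono::milliseconds(50), 4, handler)` in a loop: a sent message wakes it almost immediately, and on a quiet queue it simply times out and goes around again.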

Items that are event based and/or threaded in my system are:

Input

Game world updates

Sound

Networking

Game event system

Lua scripting (Lua is abstracted through another object that is threaded)

One could argue that input need not be threaded as well. I tend to agree, since input can only happen at one specific point in the main game loop. Also, if you respond to input but don't render the results, then the input is never communicated to the user. This boils down to having input happen prior to updating the game world and prior to rendering. If input processing is not done prior to these, it is very possible that input could be out of sync with what is displayed. If it is done after these two steps, then the result of the input would not be seen until the next frame. So the last frame's input would be displayed in the next frame, which is not ideal. Does it really matter if someone pushed a key during a render? It will be picked up the next time through the loop, and we can't really render the results of the key press until we first update the game world according to the input.

I decided not to multi-thread rendering b/c I could not see any net benefit of it that would be worthy of the re-structuring it would require to pull it off correctly. I'm not saying it's impossible to do it but I fail to see how rendering is not a linear process that should be done inside the main game loop. Rendering, in my estimation, would produce far more yielding than anything else which would ultimately slow the system down.

Not everything in a game is capable of infinite parallelization. Amdahl's law kicks in pretty quickly. You can't blit the backbuffer to the screen before every object has been rendered. You can't render multiple objects at the same time; well, you CAN, but you don't gain anything by it since the bottleneck is the GPU bus. Therefore you are stuck with a single rendering thread. Other tasks are trivially parallelizable. AI, for example: you can run as many AI threads as you have game objects. But again, you are ultimately bottlenecked by the rendering speed.
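For reference, Amdahl's law puts a number on that limit. If a fraction p of a frame's work can be spread over n threads and the rest (such as the final render and present) stays serial, the best possible speedup is:

```latex
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}},
\qquad
\lim_{n \to \infty} S(n) = \frac{1}{1 - p}
```

As an illustration with made-up numbers (not measurements from any engine): with p = 0.75 and n = 4, S = 1 / (0.25 + 0.1875), which is about 2.29, and no number of threads can push it past 4.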

Ah, right, parallelism. To clarify, I mentioned each subsystem having its own thread only for the purpose of utilizing all of the cores available. Of course, there are different methods available for that (OpenMP?). I didn't really take into account parallelism inside each subsystem. I don't really think it'd be necessary. You're right, especially for rendering, you wouldn't. You could keep it in the main thread, or in ONE different thread so that it's hopefully on a different core. Though, I don't think graphics are too CPU intensive unless you're software rendering.

Oh yeah, I think this was the article I read a while ago that made me think about abstracting each subsystem in its own thread/core, with an event system connecting them together.