23 Feb 2009

Dammit, what’s that EA-login bullshit in Burnout Paradise? Didn’t they get the memo from Konami? Turns out my usual email address for shit like this at fuck@you.com was already taken by somebody else, oh well… Crap like this should be prohibited by Microsoft’s certification process, it’s completely working against the whole idea of Xbox Live.

home:export/db/static.db4: This contains all “static” data which doesn’t change over the course of a game session.

home:export/db/game.db4: This contains all the data which may change during a game session.

Those files can be inspected and modified with the SQLiteSpy tool, but since some of the contained data are blobs it may be a good idea to access them with custom tools (N3’s db addon offers wrapper classes for this, but of course it’s also possible to use SQLite directly to access the database).

Both database files are normally populated by various exporter tools in our in-house asset pipeline.

The game.db4 file is important for N3’s standardized NewGame/Continue/Save/Load feature:

on New Game, game.db4 is copied to the application’s user: directory; changes during gameplay are committed back to this copy (usually when a level is left)

on Continue Game, N3 will simply open the existing game.db4 file under “user:”

on Save Game, the game.db4 file under “user:” is copied into the save-game directory

on Load Game, the game.db4 in “user:” will be overwritten with a copy from the save-game directory
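The four cases above all boil down to copying game.db4 between the export location, the user: directory and the save-game directory. Here is a minimal sketch of that flow using std::filesystem; the concrete paths and function names are made up for illustration (N3 resolves aliases like “export:” and “user:” through its own IO layer):

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical path layout, standing in for N3's "export:" and "user:" aliases.
static const fs::path exportDb = "export/db/game.db4";
static const fs::path userDb   = "user/game.db4";
static const fs::path saveDir  = "user/savegames";

void NewGame() {
    // start from the pristine exported database
    fs::create_directories(userDb.parent_path());
    fs::copy_file(exportDb, userDb, fs::copy_options::overwrite_existing);
}

void ContinueGame() {
    // nothing to copy: simply open the existing game.db4 under "user:"
    // (a real implementation would hand userDb to SQLite here)
}

void SaveGame(const std::string& slot) {
    // snapshot the working copy into the save-game directory
    fs::create_directories(saveDir);
    fs::copy_file(userDb, saveDir / (slot + ".db4"),
                  fs::copy_options::overwrite_existing);
}

void LoadGame(const std::string& slot) {
    // overwrite the working copy with the saved snapshot
    fs::copy_file(saveDir / (slot + ".db4"), userDb,
                  fs::copy_options::overwrite_existing);
}
```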

The main chunk of data in both database files is game-entity related. A game entity is completely described by a collection of key/value pairs called attributes, which live in a single row of a database table.

Application layer subsystems may store additional data in the database by managing their own tables in both databases. Usually, save-game relevant data lives in the game.db4 database, while read-only data lives in the static.db4 database file. Those custom-tables are usually read/written by derived Manager::OnLoad() and Manager::OnSave() methods.

The static.db4 database contains at least the following tables:

_Attributes: This table contains a description of all attribute types used in other database tables. This lets an application load a database in a meaningful way even if the attributes haven’t been defined in its C++ code. The columns of _Attributes are:

AttrName (string, primary, unique, indexed): the attribute’s name, doh

AttrType (string): the type of the attribute; this is how the different types are stored in the database:

int: stored as SQL type INTEGER

bool: stored as SQL type INTEGER

float: stored as SQL type REAL

string: stored as SQL type TEXT

vector3: a float[3] stored as SQL type BLOB

vector4: a float[4] stored as SQL type BLOB (a raw Math::float4 written to the database)

matrix44: a float[16] stored as SQL type BLOB (a raw Math::matrix44 written to the database)

blob: can be used to efficiently store “anything else” in the database; hardly used though, Drakensang uses this to store the fog-of-war data for instance

AttrReadWrite (bool), AttrDynamic (bool): these two columns are currently not used in any meaningful way
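The type mapping above is simple enough to capture in a small lookup function. This is purely illustrative (the real N3 attribute system does not work through string comparison like this):

```cpp
#include <string>

// Maps an AttrType string (as stored in the _Attributes table) to the
// SQLite storage class used for that attribute's column.
// Illustrative sketch only, not actual N3 code.
std::string SqlTypeForAttr(const std::string& attrType) {
    if (attrType == "int" || attrType == "bool")      return "INTEGER";
    if (attrType == "float")                          return "REAL";
    if (attrType == "string")                         return "TEXT";
    if (attrType == "vector3" || attrType == "vector4" ||
        attrType == "matrix44" || attrType == "blob") return "BLOB";
    return "TEXT"; // fallback for unknown attribute types
}
```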

_Categories: Contains a description of every entity category which shows up in the database. This is basically the lookup-table for the CategoryManager. The _Categories table has the following columns:

CategoryName (string, primary, unique, indexed): The name of the category (e.g. Camera, NPC, Actor, Monster, etc…). Some category names are hard-coded and standardized across games:

Levels: a meta-table which contains one row for every level in the game

IsVirtualCategory (bool, indexed): a virtual category only has a ‘template table’ but no ‘instance table’; virtual categories can be used to provide read-only data (lookup tables, etc.) to the game application without having to write a custom exporter tool

IsSpecialCategory (bool, indexed): only the above mentioned hard-coded categories are marked as “special”

CategoryTemplateTable (string): The name of the database table with entity templates; a template table contains “blueprints” for entities with their initial attribute values. The method FactoryManager::CreateEntityByTemplate() is used to instantiate a new entity from such a template. Template tables live in the static.db4 database.

CategoryInstanceTable (string): The name of the instance table (only if this is not a virtual category). Instance tables live in the game.db4 database file and contain one row per actual game entity.

Template tables: Template tables contain blueprints for entities which are created through the FactoryManager::CreateEntityByTemplate() method. A row in the template table contains the starting attribute values for a new entity of that type. A template table must contain at least the Id column, which must be (string, primary, unique, indexed).
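Conceptually, instantiating from a template just means copying a template row’s attribute values into a fresh entity. A rough sketch of that idea (the real FactoryManager does considerably more, and the names here are placeholders; values are plain strings for brevity while N3 uses typed attributes):

```cpp
#include <map>
#include <string>

// An entity is a collection of key/value attributes -- conceptually one
// row of a category's instance table.
using AttrSet = std::map<std::string, std::string>;

// Hypothetical sketch of FactoryManager::CreateEntityByTemplate():
// look up the template row by its Id column and copy its attributes
// into a new entity, then stamp the per-instance identifier.
AttrSet CreateEntityByTemplate(const std::map<std::string, AttrSet>& templateTable,
                               const std::string& templId,
                               const std::string& instanceId)
{
    AttrSet entity = templateTable.at(templId); // start from the blueprint
    entity["_ID"] = instanceId;                 // per-instance identifier
    return entity;
}
```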

The game.db4 database has the following tables:

_Attributes: this has the exact same structure as the _Attributes table in static.db4

_Globals: this is a simple global variable storage managed by the GlobalAttrsManager.

Instance tables: the instance tables as listed in the _Categories table of static.db4. Every game entity in the entire game has a row in one of the instance tables. The general structure of an instance table is the same as the associated template table, but with the following additional columns:

_ID (string): a human-readable identifier, unique only within the level

_Level (string, indexed): the level where this entity currently resides (may change during game)

_Layers (string): The layers this entity is a member of. Entity layers can be used to “overlay” a specific group of entities at level load time. This can be used for day/night, intact/destroyed versions of a level without duplicating the entire level. An empty _Layers attribute means that this entity is always loaded into a level.

Transform (matrix44): the current world space transform of the entity

The _Instance_Levels instance table is special in that it doesn’t have the above-mentioned hardcoded attributes. Instead it contains one row per level in the game with the following attributes:

StartLevel (bool): true if this is the start level which should be loaded on New Game, false for all other levels

_Layers (string): the currently active entity layers

Center (vector3): the midpoint of the level in world space

Extents (vector3): the extents vector of the level (Center and Extents are used to describe the bounding box of the level)

NavMesh, NavMeshTransform, NoGoArea, NoGoAreaTransform: used by the navigation subsystem (not currently implemented in Nebula3)
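Center and Extents together describe the level’s axis-aligned bounding box, so the min/max corners fall out directly. A sketch with a made-up Float3 type rather than N3’s actual math classes; that Extents is a half-size vector is an assumption here:

```cpp
// Minimal stand-in for a 3-component vector type.
struct Float3 { float x, y, z; };

// The level's AABB corners from its Center/Extents attributes,
// assuming Extents holds the half-size of the box.
Float3 BoxMin(Float3 c, Float3 e) { return { c.x - e.x, c.y - e.y, c.z - e.z }; }
Float3 BoxMax(Float3 c, Float3 e) { return { c.x + e.x, c.y + e.y, c.z + e.z }; }
```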

Most entity-related database accesses are wrapped by the CategoryManager, which currently caches most of the data in memory (all template tables, and all entity instance data of the current level). This may change in the future once we generally run SQLite in in-memory mode. If the in-memory database accesses are fast enough we might drop the caching (which isn’t as good as I’d like because of the huge number of memory allocations going on while reading from the database).

7 Feb 2009

I think I have a pretty good plan now for Nebula3’s asynchronous job subsystem, and (most importantly) I have a nifty name for it: EVA (Eingabe-Verarbeitung-Ausgabe, engl.: Input-Processing-Output). A job object is basically a piece of input data plus some code which asynchronously processes that data, resulting in the output data. Hence EVA. The main motivation is of course to make use of the PS3 stream processing units (I know, I know, the S in SPU actually stands for synergistic, but I think that’s wishful thinking at least in the context of a game engine; processing of individual data streams makes more sense than chaining the SPUs together IMHO, but I digress…).

I really don’t want an entire, important subsystem in N3 which only makes sense on the PS3 though. EVA jobs should also work on the CPU or GPU if desired.

The 2 main inspirations are Insomniac’s “SPU Shaders” (treat an SPU job like a GPU shader), and DX11’s Compute Shaders (use GPU shaders for non-rendering purposes).

The main problem is to provide a simple, generic interface for getting data to and from “external computation units” with as little synchronization as possible. GL and D3D had to solve that problem years ago in order to let the GPU work in parallel to the CPU. Vertex buffers and textures provide the input data, which is processed by shader code, and the output is written to render targets. It’s a simple and intuitive pattern of how to communicate with an asynchronous processing unit, and best of all, every programmer knows (or should know) those concepts in and out.

EVA will simply wrap existing ideas under a common subsystem, with the emphasis on data compatibility, not code compatibility. It should be relatively easy to “port” a job between the CPU, GPU or SPU. The main problem (how to structure the input and output data) should only be solved once (with the general rule-of-thumb that related data should be placed near each other), while the processing code needs to be rewritten for each processing-unit type (FX/HLSL for GPU jobs; simple, self-contained C or C++ code for CPU and SPU jobs).

Thus an EVA job object would have the following properties:

one or more input buffers

one output buffer

a small set of input parameters

the actual processing code

Buffers usually contain a stream of uniform data elements (similar to vertices or pixels), the input parameters can be used to cheaply tweak the behavior of an existing job object.

For a CPUJob, the input and output buffers would be simple system memory buffers, and the processing code would be standard C/C++ code which is running in a thread of a thread-pool. The processing code should not call any “non-trivial” external functions so as to remain somewhat portable to the other job types.
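A CPUJob along these lines could look like the following sketch, with a single worker thread standing in for the thread pool. All names here are invented for illustration; this is not the eventual EVA interface:

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Minimal CPU job: a uniform input stream, a uniform output stream,
// one input parameter, and a kernel run on a worker thread.
struct CpuJob {
    std::vector<float> input;   // input buffer (stream of uniform elements)
    std::vector<float> output;  // output buffer
    float scale = 1.0f;         // an example input parameter
    void (*kernel)(const float*, float*, std::size_t, float) = nullptr;
    std::thread worker;

    void Start() {
        output.resize(input.size());
        worker = std::thread(kernel, input.data(), output.data(),
                             input.size(), scale);
    }
    void Wait() { worker.join(); } // sync point: output is valid after this
};

// Example kernel: self-contained, no external calls, so the same logic
// could be re-expressed as an SPU or GPU version of the job.
void ScaleKernel(const float* in, float* out, std::size_t num, float s) {
    for (std::size_t i = 0; i < num; i++) {
        out[i] = in[i] * s;
    }
}
```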

A “DX9PixelShaderJob” would use textures as input buffers, and a render target as output buffer, the input parameters and the processing code would be described by an FX shader file.

SPU jobs would need a way to manually pull data from the input buffers to local memory, and to write blocks of processed data back to the output buffer, possibly using some sort of double buffering.

A way to cheaply convert/map buffers between the different job types would be desirable (for instance to use an output buffer directly as a DX texture, etc…). DX10 provides a pattern for this with its “resource views”.

A great plan is the simple part of course, the devil is in the implementation. And once EVA is ready for action the true challenge will be to “de-fragment” the game-loop in order to identify and isolate jobs which can be handled asynchronously.

1 Feb 2009

In the 3 months before Christmas, we assembled a small tiger team of 4 programmers to port a prototype demo we created with the N2-based Drakensang engine on the PC over to the 360 running on top of N3. We had the demo up and running in 720p rendering resolution with 4xMSAA at 30+ fps right before Christmas, exactly as planned. The primary goal was to have something nice to show to publishers running on the 360, but we also wanted to bring our technology a few steps forward. We could have opted for doing a quick’n’dirty port: just use the existing N2-based source code, and make it run on the 360. This would have solved the primary task - have something to show on the 360 - but underneath the hood the demo would still run on an “old” PC engine, hacked to run on the 360. We wouldn’t have benefited on a technological level from this. Instead we decided to start a real “engineering project”, and re-build the demo from scratch on top of N3, only reusing the existing art assets.

In hindsight, this was exactly the right decision. Nebula3 was developed with the 360 in mind from day one and the multithreaded rendering pipeline was already up and running on the 360. To my slight surprise we had very good performance right from the start. Due to the relatively complex pixel shaders and postprocessing effects in the PC demo I was expecting to see a framerate of somewhere between 15 and 25 frames, and then optimize from that point on. Instead we were very well north of 30 frames through the whole project with a 1280x720 render resolution and 4xMSAA. For a 3 year old graphics chip, this is remarkable. The prototype is very light on the game logic side so the main thread is basically idling on the frame synchronization point all the time. The rendering is fully limited by the GPU’s fill-rate, and we have a nice chunk of CPU time free on the render-thread side. We basically have a perfect graphical benchmark now to play around with and get a feeling for what the 360 can do, with most of the CPU still free for game logic, physics and dynamic rendering effects.

There’s still a lot of opportunity to fine-tune the CPU/GPU interaction in the render thread, but with the rendering performance out of the way this early we could concentrate on adding missing features to N3, like the new animation, character rendering and particle subsystems. We had the luxury to do a complete and clean rewrite for those, and I’m quite happy how they turned out (well, except for some parts of the particle system where we had to be bug-compatible with N2 to make the rendering result look identical using the existing source parameters). We have SQLite running on the 360 using the in-memory-database feature now, which we will probably back-port to our PC projects as well since it’s generally a nice-to-have feature. zlib, TinyXML and Lua had already been brought over before.

Finally, we now have a completely identical new PC-version of the prototype as a “side effect”. Our entire build-pipeline is now multiplatform-capable, a specific target platform is selected by a simple command line switch when running the MSBuild script. This is very nice even for a pure 360-project. Game logic code and even most of the engine code is platform-agnostic and can be implemented and tested on the PC, without the programmer hogging one of the ever-precious devkits. Switching to the 360-build for testing, debugging and optimization work is just a matter of seconds.

The next (and, I think, final) big thing for N3 is a proper asynchronous resource streaming system specifically optimized for console platforms. N3 already has the concept of loading resources asynchronously through the managed resource system, but that is only one fairly low-level building block. What’s missing is a resource streaming system which basically acts as a fixed-size memory cache between the graphics chip and the disc.

The main disadvantage of consoles (compared to the PC) is the slow data rate and poor seek times of DVD and Blu-ray. But the good thing on consoles is that the resource setup process has been opened up to the programmer. On the PC, DirectX and OpenGL are both very black-boxy when it comes to resource handling. The programmer never really knows what happens inside the API and graphics driver when a new texture is created or prepared for rendering. This black box is unlocked and documented on pretty much all console platforms, so it makes very much sense to write platform-specific resource streaming systems.

All in all I’m now very confident that N3 can handle a real-world project on the PC and/or 360. We have a very good feeling for what the 360 can and can’t do, the 360 is now integrated into our build pipeline, and N3 itself is pretty much feature-complete from our point of view.

I’d really love to put up a screenshot of our prototype since it looks pretty sweet on the 360. But it’s currently under cover and for publisher’s eyes only, so unfortunately I can’t.

I’ll try to get a new N3 SDK out soon (of course without the 360 stuff, as usual).