Developing Your Own Replay System

Having a replay feature in a game can definitely
enhance the player experience. It's an opportunity to show your
polished graphics from a different view, it gives the player more
time to appreciate how intelligent and naturally your AI behaves,
and it's a component that can enhance the gamer's experience.

During the past year I worked on a racing game
for PS2 called Cyclone Circus at Playlogic Games Factory,
a two-year-old company in the Netherlands. The game is based on
vehicles propelled by wind which progress in complex levels by jumping,
flying, and doing stunts in the air--which makes for some unique
gameplay (see Figure 1).

My main responsibilities for this project were
to design and implement a complete AI solution adapted to the game,
handling all the tasks concerning the animation of vehicles and
characters, and to make significant contributions to the gameplay.
Relatively late in the project, the replay feature lay implemented,
and no one had yet put any thought into it at that time. For that
matter, no one on the project had concrete experience implementing
a replay system, and nobody was even sure it could be done, since
we were already short on memory and we hadn't designed the game
to support a replay system. Still, the decision had been made to
keep this feature in the game, and we had no other choice but to
implement it.

I took the responsibility of making the core
replay system, trying to validate that it could be done in spite
of the project constraints. In our case, we needed to provide a
replay system for a PS2 racing game, but the techniques described
in this article can be applied to any game genre on any console,
as the methodology we used is quite common for different games.

The main goal of this article is to provide
a good insight of how to make a replay system, how to avoid risks
by knowing the problems you will face earlier in the project, how
to better estimate how much memory your replay system will need,
and how much time needs to be allocated during the planning phase.
Finally, I'll explain how to save huge amount of debugging time
by developing the right tools for the task.

So what are the different approaches we can
take to implement a replay system? Imagine a crash of two cars occurring
in the street--and that we want to reproduce the scene, making a
movie out of it. The first solution is to ask two stuntmen in two
new cars to drive the same way, and to turn, accelerate and brake
the way original drivers did. If a given number of conditions are
met--having almost identical cars, the same traffic conditions alongside
the two vehicles, and so on--we can expect that the replay crash
will look almost like the original one. The second solution would
be to log the important details every few seconds: the position
of each car, the position and orientation of all wheels, the state
of the body of the car, and the state of any damage--then use this
information to reproduce the scene again every few seconds. I suppose
you get the point that the second approach is not well-suited for
this purpose and would need to store a lot of information about
the state of the car at frequent intervals to reproduce the scene
accurately.

In the context of a game replay system, the
first solution uses a deterministic approach. It's based on the
fact that if only a restricted number of variables are recorded,
like the state of the game controller, if initial conditions are
the same, and if the code behavior is deterministic, then the simulation
will be exactly the same again over time. In this case, only initial
conditions and a limited number of variables need to be saved. The
second approach, previously illustrated by the crash example, would
be implemented by saving the full state of one or many objects in
a game each frame.

Here we can already show off the first big risk
for a replay system: either you put in the energy needed to make
your game engine deterministic, or you will have to put in the energy
needed to minimize the amount of data used to save the state of
the game objects each frame. For most games, there are no other
alternatives. Even worse, before you choose between the two options,
I would like to suggest that the second alternative is almost never
an option for any complex game. Allow me to convince you.

Memory Requirements

How many frames are there in a usual game that
must be reproduced in a replay? It depends, of course, on the frame
rate and the game duration. Let's take our example of a racing game
on PS2 to get some concrete numbers: if the average frame rate is
60Hz and the average duration of a race is 300 seconds, we will
need to generate 18,000 frames to reproduce the gameplay. For some
game genres, the game duration might even be a lot longer.

If you have worked on a PS2 game, then you know
how hard it is to fit everything into about 32MB of memory. We could
not afford to reserve more than 1MB for storing the replay in any
case, and we preferred to use less than 500K. If we were to use
30 bytes per frame, we would need 540,000 bytes to regenerate those
18,000 frames-exceeding our desired memory budget.

Again, let's provide some concrete numbers about
how much memory would be needed in our project to save the full
state (what we call a "snapshot") of the game at any given
moment. For each vehicle we had to save its physics properties,
like inertia matrices, velocity vector, frame matrix, wheels grip,
vehicle and character animation state, AI parameters, scores, current
and active bonuses, and so on. Storing the state of one vehicle
requires something like 1K. We could optimize this by a factor of
2/3 or so, but not easily. In our game, a race has six vehicles.
Storing the state of the circuit took another 4K. We ended up with
a snapshot that required approximately 10K to store the state of
the race.

It's clear that storing a 10K snapshot per frame
is impractical. It would require about 200MB to store 18,0000 frames.
Even on a PC game, it's better to avoid allocating so much memory
and accessing the hard disk so extensively during replay. I hope
I have convinced you that the deterministic approach is the proper
route to take.

Pros And Cons

Each approach has some positive and negative
consequences that we need to keep in mind. We already know that
the deterministic approach usually takes a lot less memory than
the snapshot approach, because only a few variables are saved in
order to do the simulation step from the previous state to the new
one. However, one drawback to this method is that we can't jump
from one point to another in the replay. The only way is to compute
all the simulation steps in between. Later in this article, I'll
explain how to jump to some point in the replay using a deterministic
approach as shown on the replay interface in the Figure 2.

Another drawback to the deterministic approach
is that if any aspect of the game is not deterministic, then the
result of the simulation step might differ from what transpired
in the actual game, and the full simulation could deviate after
just a few frames. Even the compression of a float value at the
wrong moment could make a difference of a small decimal in the replay,
and make vehicles follow a completely different trajectory. Particularly
with physics, a very small change in position can introduce a completely
different collision response, followed by a completely different
trajectory. But that's just one part of the challenge of making
a replay system and a deterministic game.

Figure
2. Screenshot during replay showing interface to jump forward
and backward in time along the race as described in the article.

Identifying The Deterministic Part Of Your Game

To make a deterministic game, usually you don't
have to concentrate on all parts of your game. Knowing the exact
border of the deterministic parts can save you a huge amount of
time. Let's take a few examples. In many games the sound engine
does not need to be deterministic, because sound will never influence
back any game object. An exception, of course, is if your sound
engine sends back messages at some given time in the playback that
triggers different behaviours. On the other hand, in our project,
it was important that some animations are replayed exactly the same
way, because sometimes a behaviour was triggered when the end of
an animation was reached. This makes the animation system a part
that needs to be deterministic as well.

Special effects and the rendering parts usually
don't need to be deterministic, and it's advisable to keep them
out of the deterministic border. Keep in mind, for example, that
you will certainly want to change your camera perspective for the
replay, which means you are not going to display the exact same
picture again. It's also better to have your special effects optimized
for the new camera position in the replay. If your special effects
don't influence your game and if it's not important for the game
to replay them at the exact same position, then you should put them
outside the deterministic border. If you decide that the rendering
should not be completely deterministic, then you should be careful
that this rendering part has no influence to any element inside
the deterministic border. For example, if you have game object behaviors
being activated when entering the view frustum and you do a replay
from a different camera, then you should expect that your game won't
be deterministic anymore (See Figure 3).

Figure
3. Overall design of a replay system with identification of
the deterministic part of the game.

Finally, expect to have a few non-obvious situations
that could make your replay non-deterministic. For example, it is
imperative that your physics engine is deterministic. It's better
to pay attention to it as soon as possible in the project to avoid
being stuck later because of the approach taken so far. Pay particular
attention if your physics engine uses a cache system. In the replay,
the cache will need to be in the exact same state as it was during
the game. Having 1MB needed to save the cache state and restore
it is certainly unacceptable.

Let's now describe the two most common tasks
needed to make a game deterministic. Variability is usually introduced
in a game by computing random values, as well as by the inputs from
the player. To make a replay, the same sequence of random values
needs to be reproduced again, along with the inputs from the player.
To handle random values, you must not forget to restore the seed,
so that you get the same sequence again in the replay. And the deterministic
parts of your game should use a different random-number generator
than the non-deterministic parts. For example, if your special effects
need some random values, they should not influence the sequence
of random values taken by the deterministic part of the game. Different
implementations of random number generators can be found easily
on the web.

Most of the games also require that you save
the delta time for each frame. Usually updating the game with a
step of 20ms and again with a second step of 20ms doesn't have the
same result as updating your game with one step of 40ms. This is
often the case because of the way physics engines handle penetrating
objects. If special care has been taken from the start to ensure
the game is deterministic independently of the update step durations,
then you are lucky and you can rely on this to do a replay at a
different frame rate. Otherwise, we usually just skip frames in
order to synchronize the replay results if the replay frame rate
is not exactly the same as the frame rate during the game.

Saving Replay Information

Before we study how to save replay information,
we said earlier that a major drawback of saving only delta information
is that we won't be able to skip a part of the replay. In our case,
we wanted to be able to jump to a few sections in the replay. To
do this we need to combine saving delta information with saving
the full state of the game a few times-not unlike the structure
that MPEG encoding uses.

We found it very handy to use one big memory
stream to store replay information (see listing 1). For example,
each time a minute of the game elapsed; we store the full state
of the game and also store the position in the memory stream at
this point. Using a memory stream makes it possible to avoid using
a predefined data structure to store information, and also allows
interleaving different types of information in the same stream,
so long as you read it back in the correct manner.

//
// The TMemoryStream class uses a block of memory instead of a file
// but implement IStreamReader and IStreamWriter interfaces.
//
class TMemoryStream : public IStreamReader, public IStreamWriter
{
public:
TMemoryStream();
virtual ~TMemoryStream();
// IStreamReader / IStreamWriter
virtual int ReadBytes(void* pBuffer, int numBytes);
virtual void WriteBytes(const void* pBuffer, int numBytes);
// Seek methods
bool SetPos(int nPos);
int GetPos();
// PreAllocation
void PreAllocate(int nSize);
void AllocationStep(int nStepSize);
long GetAllocatedSize();
// This method uses ReadBytes to see if the current value of the
// variable is the same as the one previously stored in the memory
// stream. Otherwise it adds a line to the log with information about
// the variable name, the given filename, and both current value and
// previously stored value.
bool LogAssertValue(const void* value, size_t nSize,
const char* name, const char* filename = NULL);
};
// This macro is used to easily give information about the variable name
// and the filename needed for the log if the variable doesn't have the
// value that was previously stored.
#define AssertValue(x) LogAssertValue(&x, sizeof(x), #x, __FILE__)

Lisiting
1. MemoryStream class header

To save the full state of the game we introduce two methods
for each object called SaveState and RestoreState,
both of which include a reference to the memory stream. The SaveState
method writes all needed information in the stream so that the RestoreState
method will be able to read back this information later on to restore
the object in its previous state (See listing 4).

Besides this, we have to save the delta time
and the state of the input controller each frame (We will detail
this later). The update method of your game should be something
like this:

When Recording:

Save delta time

Read input controller

Save input controller state

Update game

if one minute elapsed save full state of
the game

During replay….

Restore delta time

Restore input controller state

Update game with restored delta time

If asked to jump in the replay, set position
in memory stream and restore full state of the game

Drastic Optimization For Delta Information

The only information that needs to be saved
each frame for a replay based on a deterministic approach is the
delta time for each frame (unless the game's frame rate is fixed)
and the state of all input controllers. This is the first place
you should look to optimize the memory taken by the replay (See
Figure 3).

In a first-draft approach, we could save a float
value for the delta time, a float value for each analog input value,
and a Boolean value for each input button. Let's evaluate the size
needed to store the state of a PS2 gamepad with this approach. There
are eight analog buttons, six digital buttons, and four analog axis.
In other words, we would need to store 12 float values and 6 Boolean
values. Storing a float value takes 4 bytes and all 6 Boolean values
can be stored on one byte. Even then, this will require 49 bytes
(4 × 12 + 1) to store for each frame, and double that in two-player
mode. Since we said earlier that our budget was ideally lower than
20 bytes per frame, we must optimize this data storage in a more
drastic way.

We can save a lot of memory by knowing the purpose
of each button in our game, instead of storing the full gamepad
in a generic way. I would definitively advise you to do some game-specific
optimization here, and to compress information after interpreting
the mapping of your input controller. In this case you can save
only the information you really use for your game. Many buttons
are only used as digital ones even if they are physically analog.
Even if you use analog buttons, how many levels do you really need?
Do you need a float, 256 levels, or would even 16 levels be enough
for some features? That's really the place were you shouldn't waste
any byte--or even any bit--of memory.

Let's take the optimization even further. How
many times does a player press or release a button? Certainly not
60 times per second, and much of the time the player will only be
able to press or release one button per frame. So you can likely
save only information when the state of a button has changed, and
minimize the data stored if nothing has changed. The same can be
done for analog values. Most of the case, an analog stick will just
stay at a given value--usually an extreme value or the central value.
So storing the delta of the analog value also helps save memory,
and optimizing the amount of data stored when the value did not
change will help you optimize even further.

Finally, you can also compress the delta of
an analog value in the range 0 to 255 to three bits by using a table
like [-128, -48, -12, -4, +4, +12, +48, +128]. Then during the game
you also force the delta of the input analog value to fit one of
these values, to save the exact same value for the replay as was
used during the game. This is usually an acceptable compression
since using big values allows for fast reaction when moving over
the full range and using small values allows a precision of 4 units
to be reached over two or three frames. We used this compression
scheme for features like steering and pitching on the analog stick
without any noticeable effect on gameplay.

When dealing with data being stored in only
a few bits, or even encoded with a variable number of bits, I strongly
advise you to create a small class named something like BitPacker.
Imagine this class as a stream of bits instead of bytes, where you
have methods for saving a boolean value taking the next bit, then
saving an integer value taking the next three bits, and so on (see
Listing 2). In the end, you call a Write method with
a reference to the stream used for the replay, and a few bytes will
be written to encode your bit level information. This will make
your code a lot cleaner and you won't have to deal with bit alignment
or fixed bit structures.

//
// Container to store multiple bit information in one or more bytes.
//
class TBitPacker
{
public:
TBitPacker();
virtual ~TBitPacker();
// Clear all information to get an empty container.
void Clear();
// Add a boolean information.
void AddBit(bool bValue);
// Add an integer stored on a given number of bits only.
void AddInteger(int nValue, int nBitCount);
// Get the next boolean information and read the next byte
// from the given stream if needed.
bool GetBit(IStreamReader& stream);
// Get the next integer and read the next bytes from the
// given stream if needed.
int GetInteger(int nBitCount, IStreamReader& stream);
// Get the number of bytes needed to save the container content.
int GetSizeToWrite();
// Get the number of bits stored in the container.
int GetBitStoredCount();
// Write the content of the container to the stream, by saving
// the number of bytes needed for the container content.
void Write(IStreamWriter& stream);
};

Figure 3 shows how we used this class and the
way replay information was stored each frame. We use a bit for each
analog value to indicate if a delta value is encoded or if the value
reminds unchanged. We also use a bit to indicate if there is one
more boolean value that need to be inversed, in which case we encode
the index of the boolean variable. Using these tricks we save only
a very small amount of data when values stay unchanged--in other
words, for most cases.

Figure
4. Example of data stored each frame for replay on a few bits.

Finally by applying the optimizations described
above and detailed in the listing 3, we could save the full state
of a PS2 input controller in an average of 1.2 bytes. We achieved
a compression ratio of more than 1:40 mainly because most of the
time the input controller varies little from the state of the previous
frame. Adding one byte to store the delta time for the frame update
gives us a total of 2.2 bytes per frame. Multiply this by 18,000
frames, and our storage requirement is now around 39.6K for one
player, or 61.2K for two players--a huge savings over our initial
budget. Even if we save 6 full states of the game, to enable jumping
to different sections during the replay, we still use less than
150K of memory.

//
// To make the code as simple as possible, we
// only save two boolean and one analog
// values from the input controller. In the article
// however, explanations and memory size are based
// on a real case with much more saved values.
//
void InputControllerUpdate(RwReal deltaTime)
{
TMemoryStream* pMemStream = GetReplayStream();
// Store the previous state of all input values
unsigned char nPreviousSteer = m_nSteer;
bool bPreviousJumpPressed = m_bJumpPressed;
bool bPreviousOpenWingsPressed = m_bOpenWingsPressed;
// Replay system need to store and restore the
// exact same input controller state
if (g_bRecord)
{
// Update all analog and boolean input values
// based on the input controller
m_nSteer = FloatToUnsignedChar(m_pControllerBinding->Steer.GetValue());
m_bJumpPressed = m_pControllerBinding->Jump.BecamePressed();
m_bOpenWingsPressed = m_pControllerBinding->OpenWings.BecamePressed();
// Compress the state difference of the input
//controller compared to the previous frame
TBitPacker bitPacker;
// For each changing boolean input value, we store a bit with value true,
// followed by the index of the changed value.
if (m_bJumpPressed != bPreviousJumpPressed)
{ bitPacker.AddBit(true); bitPacker.AddInteger(0,4); }
if (m_bOpenWingsPressed != bPreviousOpenWingsPressed)
{ bitPacker.AddBit(true); bitPacker.AddInteger(1,4); }
// A bit with value false is added to indicate the end.
bitPacker.AddBit(false);
// Based on the previous and new analog values, we find the
// compressed delta that get the value nearer from the goal.
// We first store a bit to indicate if we store a delta or if we
// just keep the previous value because the delta is too small.
int nDeltaSteer = DeltaCompress(nPreviousSteer, m_nSteer);
bitPacker.AddBit(nDeltaSteer != -1);
if (nDeltaSteer != -1) bitPacker.AddInteger(nDeltaSteer, 3);
// Save the packed delta information to the memory stream
// You can log the value of bitPacker.GetSizeToWrite() to see
// how many bytes are needed to save the input controller state.
bitPacker.Write(*pMemStream);
}
else if (g_bReplay)
{
TBitPacker bitPacker;
// As long as we read a bit with value true, we read an index
// stored on 4 bits that indicates the boolean input value that
// has changed. We inverse the boolean state of this variable.
while(bitPacker.GetBit(*pMemStream))
{
int nVariableIndex = bitPacker.GetInteger(4, *pMemStream);
switch(nVariableIndex)
{
case 0: m_bJumpPressed = !m_bJumpPressed; break;
case 1: m_bOpenWingsPressed = !m_bOpenWingsPressed; break;
}
}
// Test if a compressed delta has been stored for the steer value
if (bitPacker.GetBit(*pMemStream))
{
m_nSteer = DeltaDecompress(nPreviousSteer, (unsigned char)
(bitPacker.GetInteger(3, *pMemStream)));
}
}
// Do update here according to the inputs...
}

Listing
3. Structure used to compress and restore the InputController
state.

How To Guarantee Your Game Is Really Deterministic

Making a game fully deterministic is not an
easy task, and the amount of time needed to test and debug non-deterministic
issues could take you months without some tools to help you out.
Finding the last few bugs is usually quite challenging, since they
are often difficult to reproduce and detect. Just looking at the
replay is a poor approach to see if something replayed differently.
First, you will miss a lot of subtle differences even if you watch
it very carefully. And even if you see that the replay simulation
deviates, you won't have precise information about the cause of
deviation--the first variable that was different between the gameplay
and the replay. Believe me that spending some time here to create
some better debug tools will save you a huge amount of time and
could be what lets you keep the replay feature.

We saw earlier that we can either save the full
state of the game object by calling a function like SaveState,
or that we can save the delta information for each frame containing
information about the delta time and the input controllers. Let's
introduce a new REPLAY_ASSERT mode in the game that
we will use when debugging the replay. In this mode, we also save
the delta information just as in the normal mode at the start of
each frame, then do the full simulation update and at the end of
each frame we save the full state of the game. In other words, in
the same memory stream we will alternate saving delta information
and saving full state of the game continuously each frame.

During a replay in this assert mode, we will
restore delta information the normal way, then we do the full simulation
update and at the end of the frame we compare the full game state
with the state previously saved at the end of the frame. If the
full state is exactly the same as the one we saved previously, it
means the frame has been replayed and all saved variables have evolved
exactly the same way. If after a frame any deterministic variable
is different, then we know that the variable has evolved differently
during the last update, even if all deterministic variables were
still exactly identical when starting the frame update. This means
that we have to look at how this variable is updated and its dependencies
with any non-deterministic variable. Perhaps we forgot one variable
that should be deterministic as well.

The comparison between the two full states can
be done by introducing a function AssertState beside
each function SaveState as illustrated in Listing 4.
In this method we will log any variable which has different content
and display information about the class or file where the variable
is defined. Looking at this log will give us immediate information
about which variable is not deterministic, and thus the source of
deviation in the replay.

// This method saves the state of an object to the memory stream.
void SaveState(TMemoryStream& MemStream)
{
MemStream.WriteValue(m_vLastPosition);
MemStream.WriteValue(m_fLastProgression);
MemStream.WriteValue(m_fHeightInTrack);
// Save AiVehicle
m_pAiVehicle->SaveState(MemStream);
}
// This method restores the state of an object from the memory stream.
void RestoreState(TMemoryStream& MemStream)
{
MemStream.ReadValue(m_vLastPosition);
MemStream.ReadValue(m_fLastProgression);
MemStream.ReadValue(m_fHeightInTrack);
// Restore AiVehicle
m_pAiVehicle->RestoreState(MemStream);
}
// This method assert the state of an object is the same as the one
// described in the memory stream and log differences.
void AssertState(TMemoryStream& MemStream)
{
MemStream.AssertValue(m_vLastPosition);
MemStream.AssertValue(m_fLastProgression);
MemStream.AssertValue(m_fHeightInTrack);
// Restore AiVehicle
m_pAiVehicle->AssertState(MemStream);
}

Lisiting
4. Methods to Save, Restore and Assert the state of an object.

Don't forget that this approach saves the full state of the
game each frame, and so you should expect a few hundred megabytes
of memory to be used. That's usually not a problem if you can debug
your game on PC. If you must debug on the console, you will have
to play for a short time only, or regularly save data on a local
or remote hard disk. Also, expect some acceptable frame rate drop
when using this replay assert mode.

Using this approach we managed to keep the game
deterministic, without spending too much time tracking down the
source of bugs. Don't get me wrong, non-deterministic issues are
not always easy to fix in a game, but with such a tool it becomes
a lot easier. You should fully expect to be faced with a list of
bugs that are difficult to track and reproduce--just as we did--and
that's when you will be happy to have spent the time creating this
debugging tool.

Conclusion

As you can see in this article, the task of
implementing a replay system for a game is quite risky. There aren't
many approaches that can be used, and in any case it's hard to stick
to a low memory budget for storing the replay data. Making a game
deterministic relatively late in the project is quite challenging
and often requires a code review of many parts of the game. It's
also a difficult task to find which parts of the game need to be
deterministic, and which variables need to be saved within the state
of an object.

I have shown you how we implemented a replay
system for a PS2 title with drastic memory constrains. It took us
about two weeks to implement the core replay system, and another
two weeks to debug and solve the deterministic issues. Many other
decisions would have brought us to a dead-end. I hope this article
will help you make good choices when implementing your own replay
system, and avoid bad surprises late in the game. In the end we
were really happy to see a perfectly deterministic replay working
on PS2, all fitting in a small memory budget of only 150K of memory.

For More Information

This article does not cover how to design good
camera for replay but I can refer you to [Kharkar01] for an article
covering this subject. There was also no space to discuss how to
select the best camera and how to identify the best sequence to
show in a replay. This could be the purpose of an other article.