Animation in SDL: Hardware Surfaces

The Simple DirectMedia Layer (SDL) provides three kinds of
surfaces for rendering graphics: software, hardware, and OpenGL. Software
surfaces are stored in the computer's main memory. Hardware surfaces are
stored in memory on your video card. OpenGL surfaces are handled in
whatever way OpenGL does things on your system. My previous SDL article
provides basic information about using SDL and details of software
surfaces. This article explores the promise and problems of using
hardware surfaces.

People assume that because hardware buffers live on the video
hardware, programs that use them will be faster than programs
that use software buffers. That assumption is often not true. In fact,
some applications will run slower with hardware than with software
buffers. The decision to use a hardware buffer must be based on testing
and not on assumptions. People also seem to forget that the hardware is a
limited resource. Just because you can get enough memory to make the
program fly on your development machine does not mean it will perform as
well on another computer. When they work for you, hardware surfaces are
fast, easy to use, and give you smooth animation without tearing and other
unpleasant visual artifacts. When they don't, they can lead to a set of
bewildering problems.

Working With Hardware Surfaces

A cross-platform development tool like SDL is designed to let you write
programs that work without change on many different kinds of hardware and
operating systems. Hardware-dependent programming is the opposite of
cross-platform development. The conflict between the realities of
hardware-dependent development and the goals of SDL is at the root of many
of the problems you may encounter when using hardware surfaces. SDL hardware
surfaces are both cross-platform and hardware-dependent.

While writing about SDL hardware surfaces, I have used more weasel
words than a pork-barrel politician the week before election day. I do
that because most features have a few special cases where they don't quite
work the way you expect them to. The special cases come from the many
factors that affect how hardware surfaces really work:

The actual hardware installed in your computer. There are
many ways to build a video card. Your video card may have hundreds of
megabytes of dedicated super fast DDR RAM. Then again, it may not have any
dedicated memory, instead sharing the computer's main memory. Your
hardware may have a blindingly fast graphics accelerator or it may let the
CPU do all the work. Using hardware surfaces doesn't tell you much about
how they will perform on any given system. In fact, they may perform
poorly on a system with a high end graphics accelerator while performing
remarkably well on a system with a weak graphics system.

How your computer talks to your video card. Most computers
talk to the video card over a data bus, like the AGP bus, that is designed
to send data from the CPU to the video card, not the other way
around. It is almost always the case that the CPU can write to the video
card much faster than it can read from the video card. It is usually the
case that the CPU can read and write its main memory faster than it can
read or write video memory. That all means that using
the CPU to copy images (or anything) around in graphics memory is
going to be slow.

The version of the device drivers you are using. Your
hardware may be capable of providing hardware surfaces and hardware
acceleration, but the drivers you are using may not make those abilities
available to user programs. You can write a program that works great on
your computer but that will fail on a similar computer with the same OS
and the same graphics card just because it has a different version of the
device driver.

The way the operating system controls access to hardware.
Operating systems control access to the physical hardware and try to keep
programs from messing with the hardware in ways that can crash the
computer. Because of differences in their design, Windows allows normal
programs to get hardware surfaces while on Linux and other Unix-like
operating systems, a program must have root privileges to access the
hardware.

Having listed so many problems with SDL hardware surfaces, you might
think they are not worth using. However, if you are writing a
two-dimensional game on a platform with good support for SDL hardware
surfaces, they may be the correct choice. You just have to know enough to
know when they are a bad choice.

Using Hardware Surfaces

The easiest way to show the differences between hardware and software
surfaces is to convert the softlines.cpp program, which I
wrote for my last article, from using software to hardware surfaces. The
new program is called hardlines.cpp. Converting the code did
not require me to change many lines of code. Of course, what seemed like
tiny details kept the code from working as expected. The closer you get to
the hardware, the pickier the work gets.

hardlines.cpp has the same sections with the same
functions as softlines.cpp. Working from the top of the
program down, I didn't have to make any changes in the program until I
reached the main() function.

Selecting The Driver

The first change I had to make was to add some Linux-specific code just
before the call to SDL_Init():

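#ifdef __linux__
// putenv() keeps the pointer it is given, so pass a writable buffer that
// outlives the call rather than a string literal.
static char video_driver[] = "SDL_VIDEODRIVER=dga";
putenv(video_driver);
#endif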

SDL checks the value of the SDL_VIDEODRIVER environment
variable to decide which driver to use. To get hardware surfaces while
running on Linux under X, you have to specify which driver to use. I've
chosen the DGA driver because the default X11 driver does not support
hardware surfaces. The SDL
FAQ has more information about selecting drivers on Linux and Windows.
There is also a
detailed list of SDL environment variables and their use. The number
of different drivers that you have to choose from is staggering and shows
the range of applications for which SDL could be used.

Setting The Video Mode

The changes to the call to SDL_SetVideoMode() are small, but the reasons for the changes aren't. The
options tell SDL that I want a full screen (SDL_FULLSCREEN),
double buffered (SDL_DOUBLEBUF), hardware surface
(SDL_HWSURFACE). The part that isn't obvious is that on my
desktop system if I want a hardware surface, it has to be full screen. I
can't get a hardware surface for a window. This is one of those things
that is operating system and device driver specific. Some systems let you
have a hardware surface for a window. Even if you can get a hardware
surface for a window, you may not be able to get a double buffered
hardware surface for a window.

There are good reasons to refuse a hardware surface for a
window. SDL_SetVideoMode() returns a pointer to an SDL_Surface.
Inside that structure is a pointer to the pixel data for the
surface. Without that pointer you can't draw anything. The demo program
uses that pointer to draw lines. Having a pointer to a window on the
screen means there is a good chance that you can write to any pixel on the
screen, not just the ones in your window. You can probably read from any
pixel on the screen, which creates a nasty security hole. A bug in your
program can scramble the whole desktop, not just your window.

Because you have a pointer to the data in the window, you also have to
worry about what happens when the window is moved, resized, or
obscured. When the window moves, the address of the image data for that
window also moves. If it changes and you use an old copy of the pointer,
your program winds up drawing in the wrong place. If another window
partially covers your window, who is responsible for keeping you from
writing to the covered parts of your window? How does an SDL application
even find out what those are? Double buffering introduces another set of
problems. You may be able to get a hardware surface in a window, and not
be able to get a double buffered surface for that window, because the
entire desktop is not double buffered.

All of these problems can be, and in fact have been, solved in many different
ways. By far the easiest solution is just to require that applications
that directly access the screen run as full screen applications. If you
want to use SDL hardware surfaces, assume that your application will have
to run in full screen mode.

To make sure that the program actually got a hardware surface I added
code that tests the surface type right after I set the video mode:

If SDL can't give you what you ask for, it will give you what it can. If
it can't give you a hardware surface, SDL will give you a software
surface. We have to check to see if we really got a hardware surface.

Other Hardware Surfaces

After you have set the video mode, you can use SDL_CreateRGBSurface() to create more hardware surfaces
and SDL_FreeSurface() to release them. These surfaces are
used to hold image data, such as sprites or fonts, that you want to draw
onto the screen. If your screen and your graphics are both in hardware
surfaces, SDL can use the graphics hardware to copy from one surface to
another. Using the graphics hardware gives you a significant performance
boost.

This may sound obvious, but if the video card has 32 megabytes of
memory you aren't going to store more than 32 megabytes of data in it. You
won't get the full 32 megabytes because the windowing system and other
applications may also be storing information in graphics memory. When you
use hardware surfaces, you have to set a budget for graphics memory use
and then stick to that budget.
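One way to keep such a budget is to add up the size of every surface before creating it. The sketch below is illustrative only: surface_bytes() and VideoMemoryBudget are hypothetical names, and the size estimate ignores any row padding or alignment that a real driver may add.

```cpp
#include <cstddef>

// Approximate bytes needed for one surface. Real drivers may pad each row,
// so treat this as a lower bound.
std::size_t surface_bytes(std::size_t width, std::size_t height,
                          std::size_t bits_per_pixel)
{
    return width * height * ((bits_per_pixel + 7) / 8);
}

// Hypothetical bookkeeping class: refuses reservations that would exceed
// the amount of video memory the program has decided it may use.
class VideoMemoryBudget {
public:
    explicit VideoMemoryBudget(std::size_t limit_bytes)
        : limit_(limit_bytes), used_(0) {}

    // Returns false, and reserves nothing, if the request does not fit.
    bool reserve(std::size_t bytes)
    {
        if (bytes > limit_ - used_) return false;
        used_ += bytes;
        return true;
    }

    // Give back memory when a surface is freed.
    void release(std::size_t bytes)
    {
        used_ = (bytes > used_) ? 0 : used_ - bytes;
    }

    std::size_t remaining() const { return limit_ - used_; }

private:
    std::size_t limit_;
    std::size_t used_;
};
```

For example, a double buffered 1024x768 screen at 32 bits per pixel takes two buffers of 3,145,728 bytes each, which already uses about 6 of the 32 megabytes before a single sprite is loaded.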

Hardware Locking

Graphics hardware is a shared resource. Operating systems generally
require that we lock shared resources before we use them and unlock them
after we are done. SDL provides SDL_LockSurface() and SDL_UnlockSurface() to lock and unlock hardware
surfaces. It is possible to have a hardware surface that should not be
locked and SDL provides the SDL_MUSTLOCK() macro so that we can tell them apart.

Failing to lock a hardware surface can cause unexpected results or even
program crashes. Locking the surface ensures that all pending graphics
hardware operations are completed before you can touch the buffer. In
hardlines.cpp the call to SDL_FillRect()
may be performed by the graphics hardware and run in parallel with your
code. In fact, there could be several graphics operations that are queued
up waiting for the graphics accelerator to perform them. If we don't wait
for those operations to complete, the program can be drawing lines in
software while the background is being filled by the graphics
accelerator. No matter what happens, the results are unpredictable and
certainly not what you want. Further, the pointer stored in the surface
record can change. If you are using double buffering and you swap the
buffers, the current buffer is a different block of video memory. The
pointer can also change if the window was moved. The pointer is only
guaranteed to be valid while the surface is locked.

After learning why you have to lock hardware surfaces, you might think
that you should just lock them at the beginning of the program and leave
them locked. We can't do that because while the hardware is locked, we
cannot safely make any system calls. System calls may not be able to
complete until the hardware is unlocked.

To make the sample program work with hardware surfaces, I have wrapped
the code that updates the screen in calls that lock and unlock the
hardware screen surface.

To be as portable and fast as possible, I only lock the surface if
SDL_MUSTLOCK() says it must be locked. There is a real cost
to locking the surface, so we don't want to lock it if we don't have
to. Using SDL_MUSTLOCK() also lets the code work with
software buffers.

Screen Flipping

At the very end of the original animation loop, we had two lines of
code:

SDL_Flip(screen);
SDL_Delay(10);

When using a double buffered display, graphics are drawn into the back
buffer and only become visible after the call to SDL_Flip(). When
used with software surfaces, SDL_Flip() copies the contents of
the back buffer to the display and returns immediately. The story is more
complicated with hardware surfaces.

The version of SDL_Flip() used for hardware surfaces can
be implemented in at least two different ways. It can copy the back buffer
to the front buffer, or it can tell the hardware to stop displaying the
current surface and start displaying what is in the back buffer. In the
second case it just changes the value of a pointer that tells the hardware
where the graphics are. At that point the display surface (also called
the front buffer) becomes the back buffer and the back buffer becomes the
display buffer. No copying is done at all.

Copying and swapping both get the next frame on the screen. You only
care about the difference if you are doing incremental updates of the
frames. If SDL_Flip() is copying buffers, the back buffer
always has a copy of the last frame that was drawn. If
SDL_Flip() is doing page swapping, the back buffer usually
contains the next-to-last frame. I say usually because double buffering
can be implemented using a hidden third buffer to reduce the time spent
waiting for the buffer swap to happen. You can find out what kind of
swapping is being done by watching the value of the back buffer pointer
(screen->pixels in hardlines.cpp) to see if it
changes and how many different values it has. If it never changes, then
SDL_Flip() is copying the pixels. If it toggles back and
forth between two values, then page swapping is being used.
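The pointer-watching test can be sketched without any SDL calls by recording one pointer value per frame and counting the distinct values. The names FlipKind and classify_flip() are hypothetical; the assumption is that the caller records screen->pixels once per frame while the surface is locked, since that is the only time the pointer is guaranteed to be valid.

```cpp
#include <cstddef>
#include <set>
#include <vector>

enum FlipKind {
    FLIP_UNKNOWN,       // not enough frames observed
    FLIP_COPY,          // pointer never changed
    FLIP_PAGE_SWAP,     // pointer alternates between two buffers
    FLIP_THREE_OR_MORE  // more than two buffers, e.g. a hidden third buffer
};

// Classify the behavior of SDL_Flip() from the back buffer addresses seen
// on successive frames.
FlipKind classify_flip(const std::vector<const void *> &pixels_per_frame)
{
    if (pixels_per_frame.size() < 2) return FLIP_UNKNOWN;

    const std::set<const void *> distinct(pixels_per_frame.begin(),
                                          pixels_per_frame.end());
    if (distinct.size() == 1) return FLIP_COPY;
    if (distinct.size() == 2) return FLIP_PAGE_SWAP;
    return FLIP_THREE_OR_MORE;
}
```

Observing a few dozen frames before classifying gives a more trustworthy answer than looking at the first two.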

Using hardware surfaces changes the timing behavior of
SDL_Flip() and lets us get rid of tearing. Image tearing
results from changing the display buffer while the video hardware is
drawing what you see on the screen. The video hardware is constantly
reading the contents of video memory, your animation frame, and converting
it to a video signal that your monitor then turns into a pattern of
colored light that you see. The process of painting an image on the
screen takes time. At 85 frames per second, it takes just under 12
milliseconds to draw the frame on your screen. The process is broken up
into several phases, but the ones we are interested in are the frame time
and the video retrace period. The frame time is the length of time from
when the hardware starts displaying the current image on the screen until
it starts displaying the next image on the screen. The video retrace period
is a brief period at the end of the frame time when the video system has
finished displaying one image but hasn't started displaying the next
image.
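The frame time figure is just the reciprocal of the refresh rate. A minimal sketch of the arithmetic, using a hypothetical helper named frame_time_ms():

```cpp
// Length of one frame in milliseconds for a given refresh rate in frames
// per second. Returns 0 for a nonsensical rate instead of dividing by zero.
double frame_time_ms(double refresh_hz)
{
    return refresh_hz > 0.0 ? 1000.0 / refresh_hz : 0.0;
}
```

At 85 frames per second this gives 1000 / 85, or roughly 11.76 milliseconds, and at 60 frames per second roughly 16.67 milliseconds. All of the drawing and other work for a frame has to fit inside that window.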

If we change the content of the display buffer during the frame time,
the hardware will display part of the front buffer at the top of the
screen and part of the back buffer at the bottom of the screen. Splitting
the image like that is called tearing. We want the buffers to switch
during the vertical retrace period so we never see parts of two frames on
the screen at the same time.

We want our animation programs to

Draw a new animation frame

Wait until the vertical retrace period

Swap the frames

Repeat

Unfortunately, that wait can be very long. There is a lot of work that
we could be doing instead of waiting for the buffer swap. What we really
want to do is

Draw a new animation frame

Tell the hardware to swap the buffers at the next video retrace period

Do other work such as processing user input and network traffic so that we are ready to draw the next frame

When there is nothing else left to do, wait for the buffers to swap

Repeat

This is precisely what SDL tries to do. The call to
SDL_Flip() tells the hardware to swap buffers at the next
video retrace, but it does not wait for the retrace. When you try to lock
the surface, or when one of the SDL graphics routines tries to, SDL waits
until the buffers have swapped. Delaying the wait lets you keep working
after calling SDL_Flip() but prevents tearing and prevents
you from writing to a buffer that is being displayed. This design lets
your program do all the set up work needed for drawing the next frame
while waiting for the buffers to swap.

There is, of course, a caveat. On some systems it is not possible to
implement SDL_Flip() to work the way I just described. On
those systems, SDL_Flip() may wait until the buffers have
swapped or it may never wait and give you tearing. I have never
encountered these problems, but you need to test SDL_Flip() on
your target system before depending on a specific behavior.

SDL_Delay() is rarely needed when using hardware
surfaces. The wait for the hardware buffer swap keeps the program from
generating frames faster than they can be drawn on the screen and forces
the program to give up time to the operating system. Thus the next-to-last
change to hardlines.cpp was to remove that line. Removing the
call to SDL_Delay() is not always correct. It would have been
more correct to time the animation loop and call SDL_Delay()
if we were drawing an unreasonable number of frames per second.
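A frame cap of that kind might look like the sketch below. It is not part of hardlines.cpp; delay_for_frame() is a hypothetical helper, and the cap of 200 frames per second in the usage comment is an arbitrary example value.

```cpp
// Milliseconds to sleep so that a frame which took elapsed_ms lasts at
// least one frame period at max_fps. Returns 0 if the frame was already
// slow enough, or if no cap is wanted (max_fps == 0).
unsigned delay_for_frame(unsigned elapsed_ms, unsigned max_fps)
{
    if (max_fps == 0) return 0;
    const unsigned period_ms = 1000 / max_fps;  // integer milliseconds
    return elapsed_ms < period_ms ? period_ms - elapsed_ms : 0;
}

// Typical use in the animation loop, with SDL_GetTicks() as the clock:
//   Uint32 start = SDL_GetTicks();
//   /* draw the frame */
//   SDL_Flip(screen);
//   unsigned wait = delay_for_frame(SDL_GetTicks() - start, 200);
//   if (wait > 0) SDL_Delay(wait);
```

When SDL_Flip() really does wait for the retrace, the computed delay is zero and SDL_Delay() is never called; the cap only takes effect on systems where the flip returns immediately.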

The Last Change

I added code to compute the average frame rate of the animation and
print it out at the end. I just count the number of frames that were drawn
and divide by the time it took to draw them. If the program is working
correctly the frame rate should be very close to the frames per second
setting on your display.
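The calculation is a single division, sketched here with a hypothetical helper named average_fps(). The assumption is that elapsed time is measured in milliseconds, as returned by SDL_GetTicks().

```cpp
// Average frames per second over a run, given the number of frames drawn
// and the elapsed time in milliseconds. Returns 0 if no time has elapsed.
double average_fps(unsigned long frames, unsigned long elapsed_ms)
{
    if (elapsed_ms == 0) return 0.0;
    return frames * 1000.0 / elapsed_ms;
}

// Typical use:
//   Uint32 start = SDL_GetTicks();
//   unsigned long frames = 0;
//   /* animation loop: ++frames after each SDL_Flip(screen) */
//   double fps = average_fps(frames, SDL_GetTicks() - start);
```

A result well above the display's refresh rate suggests that SDL_Flip() is not waiting for the retrace, and a result well below it suggests that drawing a frame takes longer than one frame time.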

Conclusion

This article covered details of using SDL hardware surfaces along with
the problems and incompatibilities that interfere with their use. As there
are no standards for hardware, device drivers, and operating systems that
cross the range of platforms that are supported by SDL, there are bound to
be incompatibilities and inconsistencies. This is another case where SDL
isn't amazing because it works so well; SDL is amazing because it works at
all.

Next time I'll be looking at how to use OpenGL from within SDL. The combination
of a portable 3D API like OpenGL with the portable input and multimedia
capabilities of SDL makes it possible to write high-performance commercial
games that run on Linux, Windows, and the Mac.

Bob Pendleton
has been fascinated by computer games ever since his first paid programming
job -- porting games from an HP 2100 minicomputer to a UNIVAC 1108
mainframe.