Techniques for determining speed / efficiency

I want to make a spaceship that looks like a diamond move around the screen.

I want neon colors and fading 'trails' or 'ghosting' as the ship moves.

As I'm doing this for a dissertation, I need proof of which way of doing something is the most efficient, and I have to back that up with examples.

So, do I draw the ship(s) using lines i.e. like vector graphics?

Or do I 'fake' them using .png's and texture them onto a quad?

Finally, how do I PROVE which is the most efficient? I've heard that back in the old gpu days, cards were optimized for vector graphics, but now they are optimized for textures and doing real 'vector' graphics will slow them down.

How can I PROVE this is the case on the PowerVR MBX chip in the iPhone?

So you're saying you want to test performance in drawing a simple spaceship with lines vs. flat-shaded vs. textured? And you want faded trails behind it? There doesn't seem to be much to do there except do it and measure your performance by calculating frames per second. FPS is a pretty common metric for real-time graphics. [edit] Unfortunately, the iPhone is locked to a 60 Hz refresh, so measuring FPS on it isn't going to be a straightforward way to get good differentiation between the data sets. [/edit]

For lines I would stick with GL_LINES or GL_LINE_LOOP for this, although if I were making production-quality stuff and wanted vector graphics I'd use arekkusu's glAArg, which is a really slick little library for that stuff. Unfortunately it doesn't appear that he's updated it to use arrays yet, so you'd have to make a few modifications to run it on iPhone... although maybe he could be bugged to make those modifications for you.

The thing is, using textured quads for your lines is the same thing as using a textured quad for your spaceship, except that it will be slower since you'll have multiple quads for each line and multiple lines for the ship.

Anyway, if it were me, and I could choose, I'd probably pick something more interesting that is currently hotly debated: proving whether a tight loop on iPhone gets better graphical performance than a game loop driven by a timer. If you could prove or disprove that empirically for everyone to see, with a good set of data from a realistic test, you'd be pretty popular in the iPhone community.

Elphaba Wrote:Yeah, I mean for the texturing section - that I would draw the entire spaceship in photoshop, export as png and texture that spaceship (made to look like vector line graphics) onto a quad.

Yes, that'd be the fastest way to draw a nice vector-graphic-looking spaceship.

Elphaba Wrote:My lecturers tell me that it's ALWAYS faster to use textures than to do everything with lines (lineloops or whatever)...

Hmm... I've never heard that before, but I'm not sure I'm clear on the statement. Line primitives aren't used all that often though, and I can't say I've ever really compared performance myself. Textured triangles are used most often by far. I can say that if you're using multiple textured quads for nice vector graphics in place of line primitives, they will for sure be slower (e.g. glAArg). However, if you're drawing an object with a single texture representing multiple vector graphics lines, yes that will be as fast as a single "regular" (i.e. not looking like a vector graphic) texture of a spaceship (surprise, surprise!), and likely faster than equivalent drawing using line primitives.

[edit] To clarify a bit more: I'd say that drawing one line using a texture is more than likely going to be slower than drawing the same line as a GL_LINE. If you want to draw multiple lines, if you can draw those using one texture, yes the texture is likely going to be faster than the equivalent of multiple GL_LINES. This of course also depends on how many lines, and also your geometry submission being immediate or retained (i.e. glBegin(GL_LINES) vs. glDrawArrays). [/edit]
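To make the two options concrete, here is a rough sketch of what each path would submit for a diamond ship using vertex arrays (OpenGL ES has no glBegin, so arrays are the only route on iPhone anyway). The array names, the vertex_count_2d helper, the HAVE_GLES1 guard and the two draw functions are all made up for illustration; the GL calls inside the guard are ordinary OpenGL ES 1.1 fixed-function calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical data: a unit diamond as a 4-vertex outline (x, y pairs),
   meant to be drawn with GL_LINE_LOOP. */
const float ship_outline[] = { 0.0f, 1.0f,  1.0f, 0.0f,  0.0f, -1.0f,  -1.0f, 0.0f };

/* The same ship as one textured quad, in triangle-strip order. */
const float ship_quad_pos[] = { -1.0f, -1.0f,  1.0f, -1.0f,  -1.0f, 1.0f,  1.0f, 1.0f };
const float ship_quad_uv[]  = {  0.0f,  0.0f,  1.0f,  0.0f,   0.0f, 1.0f,  1.0f, 1.0f };

/* Number of 2D vertices stored in a flat float array of the given byte size. */
size_t vertex_count_2d(size_t bytes)
{
    assert(bytes % (2 * sizeof(float)) == 0);
    return bytes / (2 * sizeof(float));
}

#ifdef HAVE_GLES1   /* assumption: defined only when building against OpenGL ES 1.1 */
#include <OpenGLES/ES1/gl.h>

/* Path 1: line primitives, no texture. */
void draw_ship_lines(void)
{
    glDisable(GL_TEXTURE_2D);
    glDisableClientState(GL_TEXTURE_COORD_ARRAY);
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(2, GL_FLOAT, 0, ship_outline);
    glDrawArrays(GL_LINE_LOOP, 0, (GLsizei)vertex_count_2d(sizeof ship_outline));
}

/* Path 2: one quad carrying a texture that already contains the "lines". */
void draw_ship_textured(GLuint texture)
{
    glEnable(GL_TEXTURE_2D);
    glBindTexture(GL_TEXTURE_2D, texture);
    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);
    glVertexPointer(2, GL_FLOAT, 0, ship_quad_pos);
    glTexCoordPointer(2, GL_FLOAT, 0, ship_quad_uv);
    glDrawArrays(GL_TRIANGLE_STRIP, 0, (GLsizei)vertex_count_2d(sizeof ship_quad_pos));
}
#endif
```

Both paths submit four vertices, so any difference you measure comes from rasterizing lines vs. filling and texturing the quad, not from geometry submission.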

Elphaba Wrote:So doing both and having an accurate FPS counter is the only way?

I don't know if that's the only way to do it, but it seems most logical to me, except for the issue of being locked to the 60 Hz refresh that I mentioned earlier.

Elphaba Wrote:How can one be sure that one's FPS counter is then 'accurate'?

Use mach_time; it's accurate. You can keep your FPS counter samples in an array in memory and write them out to disk when the test is over, so as not to interfere with the test much at all.

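// Requires #include <mach/mach_time.h>.
// Calculate the time base for this platform only on the first time through.
// timebase is assumed to be a double declared elsewhere; it converts
// mach_absolute_time() ticks to nanoseconds.
if (timebase == 0)
{
    mach_timebase_info_data_t timebaseInfo;
    mach_timebase_info(&timebaseInfo);
    // Cast before dividing: numer and denom are integers, so a plain
    // numer / denom would truncate the ratio.
    timebase = (double)timebaseInfo.numer / (double)timebaseInfo.denom;
}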
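As a minimal sketch of the "samples in an array, write to disk afterwards" idea: the FrameLog type and the frame_log_* functions below are hypothetical names, not part of any Apple API. Only mach_absolute_time() and mach_timebase_info() are real calls, and they are used only inside the __APPLE__ guard, so the rest can be tried off-device by feeding in tick values by hand.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#ifdef __APPLE__
#include <mach/mach_time.h>
#endif

#define MAX_FRAMES 4096

/* Raw timestamps, one per frame; converted to milliseconds only after the run. */
typedef struct {
    uint64_t ticks[MAX_FRAMES];
    size_t   count;
} FrameLog;

/* Store one raw timestamp. No I/O and no conversion happens during the test;
   frames beyond MAX_FRAMES are silently dropped. */
void frame_log_push(FrameLog *log, uint64_t ticks)
{
    assert(log != NULL);
    if (log->count < MAX_FRAMES)
        log->ticks[log->count++] = ticks;
}

/* Mean frame time in milliseconds; numer/denom convert ticks to nanoseconds. */
double frame_log_mean_ms(const FrameLog *log, uint32_t numer, uint32_t denom)
{
    assert(log != NULL && denom != 0);
    if (log->count < 2)
        return 0.0;
    uint64_t span = log->ticks[log->count - 1] - log->ticks[0];
    double ns = (double)span * (double)numer / (double)denom;
    return ns / 1e6 / (double)(log->count - 1);
}

/* After the test: write one frame time (ms) per line. Returns lines written. */
size_t frame_log_write(const FrameLog *log, FILE *out, uint32_t numer, uint32_t denom)
{
    assert(log != NULL && out != NULL && denom != 0);
    size_t written = 0;
    for (size_t i = 1; i < log->count; ++i) {
        double ns = (double)(log->ticks[i] - log->ticks[i - 1])
                  * (double)numer / (double)denom;
        fprintf(out, "%.3f\n", ns / 1e6);
        ++written;
    }
    return written;
}

#ifdef __APPLE__
/* On device: call once per frame. */
void frame_log_mark(FrameLog *log)
{
    frame_log_push(log, mach_absolute_time());
}
#endif
```

On device you would call frame_log_mark() once per frame, then pass the numer and denom from mach_timebase_info() to frame_log_write() once the run is over.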

At a simplified level, you can just implement it both ways (textured quads and lines) and see which one is faster. On the iPhone, the framebuffer swap is always limited to the 60 Hz refresh rate, which means that a naive measurement of "more fps = faster" is the wrong approach. It is better to measure "more objects at a target framerate". Or, flush the rendering but don't swap the framebuffer, to measure offscreen rendering and ignore the swap interaction.

At a deeper level, proving performance is a hard task; be prepared for some work if you really need to do that.

Elphaba Wrote:So I'm guessing I have to simulate the most complex scene my game could display, and output the FPS - first in true vector graphics (LineLoops etc) and then again as textured quads...

Right?

It's not quite that simple. As we've been saying (you might've missed my edit earlier), you're limited by the fact that the swap will occur when the screen refreshes, which happens at 60 Hz, which cannot be changed. If all you do is simulate a complex scene and simply take an FPS measurement, that won't necessarily reflect reality. What *might* make more sense is to increase or decrease the complexity of your scene until you get a particular frame rate and say you can draw x amount of such and such at y frame rate. Still, that isn't going to be easy either, since you'll have to be certain that a given amount of x is indeed being rendered at said rate, and not just happening to be dictated by the screen refresh.
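One way to turn "x objects at y frame rate" into a procedure is to search for the largest object count whose measured frame time still fits the budget. The sketch below assumes you supply a measurement callback (MeasureFn, a hypothetical name) that renders the given number of objects without swapping and returns a representative time in milliseconds. It also assumes the cost does not go down as objects are added. linear_cost_ms is only a stand-in cost model so the search can be tried off-device; it is not a claim about how the MBX behaves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: render object_count objects (no buffer swap) and
   return a representative frame time in milliseconds. */
typedef double (*MeasureFn)(int object_count, void *ctx);

/* Largest count in [0, limit] whose measured time is within budget_ms.
   Assumes cost never decreases as objects are added. Returns 0 if no
   count above zero fits. */
int max_objects_within_budget(MeasureFn measure, void *ctx,
                              double budget_ms, int limit)
{
    assert(measure != NULL && limit >= 0);
    int lo = 0, hi = limit;
    while (lo < hi) {
        int mid = lo + (hi - lo + 1) / 2;   /* round up so the loop terminates */
        if (measure(mid, ctx) <= budget_ms)
            lo = mid;                       /* mid fits, try more objects */
        else
            hi = mid - 1;                   /* mid is too slow, try fewer */
    }
    return lo;
}

/* Stand-in cost model for dry runs: fixed overhead plus a per-object cost. */
typedef struct { double base_ms, per_object_ms; } LinearCost;

double linear_cost_ms(int object_count, void *ctx)
{
    const LinearCost *c = ctx;
    return c->base_ms + c->per_object_ms * object_count;
}
```

Run it once with a callback that draws line ships and once with one that draws textured quads, using the same budget (about 16.7 ms for 60 Hz), and compare the two counts. Repeating each measurement and taking the median inside the callback should make the result less noisy.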

Drawing lines and points is generally slower than drawing triangles. Graphics hardware these days is optimized for triangles, and lines and points are actually implemented by converting them to triangles before rendering.

For the problem of the ghosting of the lights, I would generate a curve for the path of the light and extrude it. The simplest case is a line: if you extrude that line, it becomes a rectangle. In this case, you can sample points for the last n frames (probably skipping some frames between each sample), and treat each consecutive pair of points as a line segment. For each segment, you can then form a quad, though you will want to make sure that you change the angle of the left and right edges based on the angle between the line segments. For example, let's say you have 2 line segments in this configuration:

Code:

\
 \___

You will want to generate the 2 quads like this:

Code:

/\
\ \___
 \/___|

Notice the boundary between the two quads is slanted. This will prevent overlaps. Once you have this mesh, you can simply put on a square texture where the top and bottom fade out to nothing, and fade the alpha of each vertex down to 0 the further down the curve it is. Assuming you optimize it so you re-use the geometry from previous frames, just taking off old segments that have faded out and adding new segments to your existing list, that should be the most efficient way to render it.

As for why it's the fastest, let's quickly look at the possible ways to tackle this problem. There are two main ways I can think of: generate geometry to render, or accumulate the images of previous frames for an image-based solution. Both are incremental, which means that they can be done with 1 operation each frame. However, the image-based solution needs a separate render each frame to accumulate the light trail, which is going to be slower than drawing many triangles for numerous reasons. That means it's between rendering lines and rendering triangles. Assuming the hardware must convert lines to triangles, submitting triangles directly will be a bit faster, since the line path has to regenerate the triangle info every frame it draws a line.

As for the spaceship, for a simple diamond shape I'd say that drawing it using pure geometry would be faster than using a texture, as the texture has to be sampled for each pixel the ship occupies on the screen, and it will be even slower if it needs alpha blending because there's a hole in the center. However, if the geometry were complex enough, then the texture would be faster.

Note that I'm not sure about all of my assumptions of the video card on the iPhone, but I'd imagine it's likely it's similar enough to desktop (integrated) cards that they should hold.

As for timing your code, that's notoriously tricky for stuff like graphics operations. This is because operations are accumulated on a buffer and then passed to the graphics card on a flush. In order to make it work, you would have to flush the command buffer, wait for the GPU to finish processing, issue your commands for what you want to time, flush the command buffer, and wait again for the GPU to finish processing. I think you can do that with the glFinish() command, assuming it's in OpenGL ES. (just make sure to not swap buffers)
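The flush, wait, draw, flush, wait sequence could look roughly like this. time_draw_ns and the function-pointer types are hypothetical; on device you would presumably pass glFinish (which is part of OpenGL ES 1.1) as the finish callback and a mach_time-based function as the clock. The stub_* functions are only stand-ins so the sequence can be exercised without a GL context.

```c
#include <assert.h>
#include <stdint.h>

typedef void     (*VoidFn)(void);
typedef uint64_t (*ClockFn)(void);

/* Times `repeats` calls to draw(), bracketed by finish() so that work queued
   earlier is excluded and the timed work has really completed.
   No buffer swap happens in here. */
uint64_t time_draw_ns(VoidFn finish, VoidFn draw, ClockFn clock_ns, int repeats)
{
    assert(finish && draw && clock_ns && repeats > 0);
    finish();                        /* drain anything already queued */
    uint64_t start = clock_ns();
    for (int i = 0; i < repeats; ++i)
        draw();                      /* issue the commands under test */
    finish();                        /* wait for the GPU to finish them */
    return clock_ns() - start;
}

/* Stand-ins for a dry run without OpenGL: each fake draw "costs" 250 ns. */
int      stub_finish_calls = 0;
int      stub_draw_calls   = 0;
uint64_t stub_time_ns      = 0;

void     stub_finish(void)   { ++stub_finish_calls; }
void     stub_draw(void)     { ++stub_draw_calls; stub_time_ns += 250; }
uint64_t stub_clock_ns(void) { return stub_time_ns; }
```

Dividing the result by repeats gives a per-draw figure. Issuing many repeats per measurement helps amortize the cost of the two finish calls themselves.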

akb825 Wrote:Note that I'm not sure about all of my assumptions of the video card on the iPhone, but I'd imagine it's likely it's similar enough to desktop (integrated) cards that they should hold.

It isn't. The iPhone's GPU is a tile-based, deferred renderer, as opposed to stream renderers on the desktop.

Additionally, a lot of what actually is the fastest way to do a given thing depends on the particular drivers, e.g. the fast path, optimizations the driver may or may not do behind the scenes depending on OpenGL state, etc. If you want to know what is the fastest, write tests. Since you don't have the information about what the hardware and drivers do exactly, you cannot "prove" performance in the mathematical sense, but you can benchmark it, which should be good enough.

And then you have to deal with the fact that benchmarks cannot account for every possible combination of things you can render. As academic research goes, a correct methodology goes a long way.