I'm using XNA to test an image analysis algorithm for a robot. I made a simple 3D world that has grass, a robot, and white lines (that represent the course). The image analysis algorithm is a modification of the Hough line detection algorithm.

I have the game render two camera views to a render target in memory. One camera is a top-down view of the robot going around the course, and the second camera is the view from the robot's perspective as it moves along. I take the render target of the robot camera and convert it to a Color[,] so that I can do image analysis on it.

I want to overlay the results of the image analysis on the robot camera view. The first part of the image analysis is finding the white pixels. When I find the white pixels, I create a bool[,] array showing which pixels were white and which were black. Then I want to convert it back into a texture so that I can overlay it on the robot view.

When I try to create the new texture showing which pixels were white, the game goes super slow (around 10 Hz).

Can you give me some pointers on what to do to make the game go faster?
If I comment out this algorithm, it goes back up to 60 Hz.

1 Answer

To know the exact problem you should use a profiler, but I can make an educated guess that the performance penalty comes from the call to texture.GetData(colors1D); in TextureTo2DArray.

As usual the excellent Shawn Hargreaves has written something about this:

Normally, the CPU and GPU run in parallel. Framerate = max(CPU time,
GPU time).

If your code causes a pipeline stall, however, the processors must
take turns to run while the other one sits idle. Yikes! Now framerate
= CPU time + GPU time. In other words, programs that stall can be both CPU and GPU bound at the same time.

The easiest way to cause a stall is to draw some graphics into a
rendertarget, then GetData on the rendertarget texture to read the
results back to the CPU. Think about what happens if you do this:
Charles (the CPU) is processing your drawing calls. He has filled a
piece of paper with instructions for his brother George (the GPU).
Charles reaches an instruction that says "copy data from George back
into this array". But the drawing instructions haven't actually been
processed by George yet! Charles cannot just note down the GetData
call on his piece of paper. The next instruction might use values from
the array, so he needs that data right away. Charles has no option
but to immediately hand the incomplete list of drawing instructions
over to George, then wait around twiddling his thumbs in boredom until
George has finished drawing everything, at which point Charles can
resume processing the GetData instruction while George becomes idle.

One of the great successes of the Direct3D API is how it hides the
asynchronous nature of GPU hardware. Many graphics programmers are
writing parallel code without even realizing it! But as soon as you
try to read data back from GPU to CPU, all this parallelism is lost.

So it is generally not advisable to pipe data from the GPU back to the CPU.
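
The timing model in the quote can be made concrete with a little arithmetic, reading "framerate" there as time per frame. The sketch below is in Python purely for brevity; the function names and the 10 ms and 12 ms per-frame costs are made-up values for illustration, not measurements from the game in the question.

```python
def frame_time_ms(cpu_ms, gpu_ms, stalled):
    """Time per frame under the model described in the quote."""
    if stalled:
        # CPU and GPU take turns, so their costs add up.
        return cpu_ms + gpu_ms
    # CPU and GPU overlap, so the slower of the two sets the pace.
    return max(cpu_ms, gpu_ms)


def frames_per_second(frame_ms):
    return 1000.0 / frame_ms


# Hypothetical costs: 10 ms of CPU work and 12 ms of GPU work per frame.
parallel_ms = frame_time_ms(10.0, 12.0, stalled=False)
stalled_ms = frame_time_ms(10.0, 12.0, stalled=True)

print(parallel_ms, frames_per_second(parallel_ms))
print(stalled_ms, frames_per_second(stalled_ms))
```

With these assumed numbers the stalled frame takes 22 ms instead of 12 ms, so the frame rate drops by nearly half before any extra per-pixel CPU work is counted.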

Another problem is that processing individual pixels on the CPU is expensive. Modern GPUs process thousands of pixels at the same time, but on the CPU it's back to 1, 2, or 4 pixels at a time. Remember that inspecting a 1920x1080 image means 2,073,600 series of operations.
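
To put a number on that, here is a rough back-of-the-envelope calculation (again in Python for brevity). It assumes the 1920x1080 image above and the 60 Hz target mentioned in the question; the variable names are only illustrative, and the real per-pixel cost depends on what the analysis does.

```python
width, height = 1920, 1080
target_hz = 60  # frame rate the game reaches without the analysis

# Pixels the CPU must visit per frame and per second.
pixels_per_frame = width * height
pixels_per_second = pixels_per_frame * target_hz

# Time available per pixel if the whole frame budget went to one pass
# over the image on a single core (it never does in practice).
budget_ns_per_pixel = 1e9 / pixels_per_second

print(pixels_per_frame)
print(pixels_per_second)
print(budget_ns_per_pixel)
```

That leaves roughly 8 nanoseconds per pixel for everything: reading the color, testing it, and writing the result, before the readback and the rest of the game are accounted for.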

Now, I don't know what underlying problem you're trying to solve by rendering a texture and inspecting whether pixels are white, but you're probably better off finding some other way of abstracting this problem.