ATI ups the GPGPU ante

ATI TODAY IS announcing its new Stream transcoding paradigm, and unlike some others, it makes quite a bit of sense. Stream now parcels out workloads between the CPU and the GPU, and does things much more intelligently.

There are some who think that the GPU is the only thing that matters in a computer, mainly because that is all they sell. There is a term for that point of view: wrong. Most workloads tend to have serial and parallel portions, and GPUs are very good at the parallel side, but are horrible at the serial parts. CPUs are the exact opposite.

Use what makes sense

Some problems have a serial portion that takes so much time that the parallel portion is almost free by comparison. Others are so parallel that the time needed for the entire workload just depends on how many cores you throw at them. Most are more balanced, so if you stick all your compute on the GPU, things go much slower than if you parcel the work out sanely.
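To put rough numbers on that, here is a toy model of a balanced workload. Every figure in it is made up for illustration: the throughput rates, the work units and the helper name `run_time` are assumptions, not measurements of any real CPU or GPU. It also ignores the cost of moving data between the two chips.

```python
# Toy model: time = serial work / serial rate + parallel work / parallel rate.
# All rates below are invented, in arbitrary "work units per second".

def run_time(serial_work, parallel_work, serial_rate, parallel_rate):
    """Total time when the serial and parallel portions run back to back."""
    return serial_work / serial_rate + parallel_work / parallel_rate

# Hypothetical chips: the CPU is quick at serial code, the GPU at parallel code.
CPU_SERIAL, CPU_PARALLEL = 10.0, 40.0
GPU_SERIAL, GPU_PARALLEL = 1.0, 400.0

# A balanced workload: half serial, half parallel.
SERIAL_WORK, PARALLEL_WORK = 50.0, 50.0

all_cpu = run_time(SERIAL_WORK, PARALLEL_WORK, CPU_SERIAL, CPU_PARALLEL)
all_gpu = run_time(SERIAL_WORK, PARALLEL_WORK, GPU_SERIAL, GPU_PARALLEL)
# Split: serial portion on the CPU, parallel portion on the GPU.
split = run_time(SERIAL_WORK, PARALLEL_WORK, CPU_SERIAL, GPU_PARALLEL)

for label, seconds in [("all CPU", all_cpu), ("all GPU", all_gpu), ("split", split)]:
    print(f"{label}: {seconds:.3f} s")
```

With these invented rates, the all-GPU run is the slowest of the three because the serial half crawls, and the split run is the fastest. Change the ratio of serial to parallel work and the gap moves accordingly.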

Not a UN org chart, just GPU compute

That is what makes the new Stream APIs so interesting. They pull in not only the CPU and GPU, but also the video decoders on the GPU. If a step in a problem needs lots of serial number crunching, it is shunted to the CPU, parallel parts go to the GPU, and the video decoder is used if needed. Instead of using one or two parts, the latest Stream can do it all, at once, theoretically well.

For transcoding, this can make a huge difference: one of the major steps, the initial decode and scaling, is essentially free. This allows both of the main chips to do the things that they should be doing without interruption.
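The routing idea can be sketched in a few lines. This is not the Stream API, which the announcement does not detail; `Unit`, `ROUTING`, `assign_units` and the stage list are all hypothetical names invented here, and which encoder stages count as serial or parallel is an assumption made for the example.

```python
from enum import Enum


class Unit(Enum):
    """The three pieces of hardware the article says Stream can use."""
    CPU = "CPU"
    GPU = "GPU shaders"
    VIDEO_DECODER = "GPU video decoder"


# Hypothetical routing table: the kind of work decides where it runs.
ROUTING = {
    "serial": Unit.CPU,
    "parallel": Unit.GPU,
    "decode": Unit.VIDEO_DECODER,
}

# Illustrative transcode stages; the serial/parallel labels are assumptions.
TRANSCODE_STAGES = [
    ("decode and scale source video", "decode"),
    ("motion estimation", "parallel"),
    ("entropy coding", "serial"),
]


def assign_units(stages):
    """Pair each (name, kind) stage with the unit its kind is routed to."""
    return [(name, ROUTING[kind]) for name, kind in stages]


for name, unit in assign_units(TRANSCODE_STAGES):
    print(f"{name} -> {unit.value}")
```

A real scheduler would also have to overlap the stages and account for transfer costs between the chips, which this sketch leaves out.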

As is the norm with all of these releases, ATI put out a bunch of graphs showing Stream transcoding pummeling Nvidia cards with the same software in H.264 and MPEG-2 transcoding. They are comparing some low-end cards to other low-end cards, so we will have to wait and see how well it does when the usual suspects compare things across a wider range of cards.

The new transcoder should work on most 4000 series GPUs. The older version only used the higher-end cards, but now the compatibility matrix goes all the way down to the lowest-end 4350 - almost integrated territory there. The lowest-end cards don't do GPU encode though; they simply don't have the horsepower, and would be slower than a CPU.

You can get the new Stream bits for free in the Catalyst 9.5 'hotfix', available now, but only for the Broken OS™. It works with Cyberlink Espresso, with a lot more to come soon. There will be XP versions as well, but those are a few months off. Basically, all of the usual encoding and HD playback wares will support the new encoder on the next rev. µ