Picking up from where we left off yesterday afternoon, word comes from AMD that they have finally squashed the installation bug on their Catalyst 14.1 beta drivers, resolving the last showstopper bug preventing the release of these drivers. As such, albeit a day behind schedule, AMD has finally posted the 14.1 beta drivers on their website for public use.

Under most circumstances we don’t place a great deal of importance on any given driver release. Both AMD and NVIDIA have 3-4 “major” driver releases each year, with a major driver release typically being a driver from a new branch that incorporates a mix of new features and low-level performance optimizations. Consequently it’s the major driver releases that are the most notable releases for each GPU vendor, and even then those are generally regular releases. But even by those standards Catalyst 14.1 is going to be a more significant driver than virtually any other, due to the unprecedented number of features being rolled out all at once.

Altogether AMD is rolling out 3 major new features and a heap-ton of bug fixes with Catalyst 14.1, the first release of the 13.35 driver branch. The marquee feature is of course Mantle, with 14.1 being the first public release with Mantle API support. But 14.1 also includes HSA support for AMD’s recently launched Kaveri (A-7000 series) APUs, and at long last AMD’s suite of “phase 2” frame pacing fixes for all pre-GCN 1.1 (pre-XDMA) hardware. So this is a release that should have something for everyone, and further cuts down on AMD’s feature debt by implementing some long-awaited improvements.

Mantle

The marquee feature for Catalyst 14.1 is of course Mantle, AMD’s new low-level graphics API. Designed specifically around the shared GCN architecture in all of AMD’s current-generation graphics products, Mantle is intended to improve GPU efficiency and overall graphics performance by offering a lower level of hardware access than APIs like Direct3D offer, and in the process bypassing the overhead Direct3D can impose.

Since we have already written a small tome on Mantle from shortly after its announcement back in September, we won’t go into too much depth on Mantle for today’s driver release, but please see our first Mantle article for further details on how Mantle works and what AMD’s goals for the API are. Meanwhile we’re hard at work on a full write-up on Mantle’s performance with the new Catalyst 14.1 drivers, but in the meantime you can also see our Mantle performance preview for a quick glimpse of Mantle’s performance under Battlefield 4 on a Radeon R9 290X.

For this driver release Mantle is supported for all GCN discrete video cards and high performance APUs – Radeon 7000, 8000, and 200 series, and A-7000 series APUs (Kaveri). However at the moment the largest gains are to be found on AMD’s GCN 1.1 parts, the Radeon 290 series, Radeon 260 series (and 7790), and the A-7000 series APUs. For older GCN 1.0 hardware Mantle does work fine, as evidenced by the Star Swarm technical demo in particular, but Battlefield 4 performance isn’t showing the same gains on those parts as it is on GCN 1.1 parts. This indicates that AMD’s Mantle drivers and/or Battlefield 4 are in need of further optimizations, which is something AMD is already committing to follow through on.

On the software side of matters, Battlefield 4 and Star Swarm will be the launch titles for Mantle. AMD is ultimately shooting for much broader support – Thief is due with Mantle support this month, for example – but they will be starting small with just these two titles. Star Swarm itself is a technical demo of an engine that will be used in future games, rather than being a game or game demo itself, so of those two titles Battlefield 4 will carry the banner as the launch title for Mantle.

Finally, it’s important to note that AMD is being especially emphatic in pointing out that the Mantle component of the Catalyst 14.1 driver set is in a beta state. The SDK and the drivers are still a work in progress, hence the lack of GCN 1.0 optimizations in Battlefield 4 for example, so while AMD’s beta drivers are typically of reasonable quality, AMD is making sure everyone is aware that the Mantle component has more limitations and known issues than a typical beta release. AMD’s list of Mantle known issues is posted below, with the most notable being that the state of Crossfire support is especially raw. The low level nature of Mantle means that driver-enforced AFR for multi-GPU setups is absent under Mantle, so proper multi-GPU scaling is largely in the hands of game developers, a significant change from how things work under Direct3D.

Mantle performance for the AMD Radeon™ HD 7000/HD 8000 Series GPUs and AMD Radeon™ R9 280X and R9 270X GPUs will be optimized for BattleField 4™ in future AMD Catalyst™ releases. These products will see limited gains in BattleField 4™ and AMD is currently investigating optimizations for them.

Multi-GPU support under DirectX® and Mantle will be added to StarSwarm in a future application patch.

Notebooks based on AMD Enduro or PowerXpress™ technologies are currently not supported by the Mantle codepath in Battlefield 4™

AMD Eyefinity configurations utilizing portrait display orientations are currently not supported by the Mantle codepath in Battlefield 4™

Graphics hardware in the AMD A10-7850K and A10-7700K may override the presence of a discrete GPU under the Mantle codepath in Battlefield 4™

While we have yet to test Crossfire performance, for all other testing we’ve done we have not run into any issues with Mantle. Battlefield 4 for its part has a reputation for bugs – one that’s not unearned – and we’ve still run into the occasional Battlefield 4 bug. But Mantle itself has proven stable in our testing thus far.

Phase 2 Frame Pacing & Dual Graphics

AMD’s second major feature release with Catalyst 14.1 is the long-awaited set of “phase 2” frame pacing improvements for pre-GCN 1.1 hardware. Since March of 2013 AMD has been working on resolving some systematic issues in their multi-GPU frame pacing mechanisms, which under closer scrutiny had proven to be inferior to NVIDIA’s frame pacing mechanisms. In the interim period AMD has released their GCN 1.1 hardware, whose XDMA engine has essentially solved the problem going forward for new hardware, but there is still the matter of existing pre-GCN 1.1 hardware that AMD has been contending with.

For these pre-GCN 1.1 cards, AMD’s frame pacing fixes have been released in multiple phases. Phase 1 was released back in August of 2013 with Catalyst 13.8, and implemented AMD’s vastly improved frame pacing algorithms for D3D10+ on single display configurations. Phase 2 was to follow, and after having been delayed multiple times has finally been released as part of Catalyst 14.1.

Whereas phase 1 focused on single-display issues, phase 2 contains AMD’s multi-display frame pacing fixes. This includes both pure Eyefinity multi-monitor setups, and also fixes for tiled 4K displays, which behave as multiple monitors from the point of view of the video card. With their phase 2 fixes now in place, AMD has an improved frame pacing mechanism for these very high resolution configurations, which is especially important for cards such as the Radeon 7990, which was advertised in part on its suitability for these configurations.

Catalyst 14.1 Frame Pacing

API       Single Display     Eyefinity
D3D11     Y                  Y
D3D10     Y                  Y
D3D9      N                  N
OpenGL    N                  N
Mantle    Dev Implemented    Dev Implemented

One thing to note with phase 2 is that like phase 1, these frame pacing improvements are solely for Direct3D 10+. Direct3D 9 and OpenGL are not covered by phase 2 under single-monitor or multi-monitor configurations, and as such still use AMD’s old frame pacing algorithms. AMD will ultimately be releasing a phase 3 driver to handle these APIs, though AMD isn’t providing a schedule for phase 3 at this time.
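For readers who want the coverage in a machine-checkable form, the table above can be encoded as a small lookup. This is only an illustrative Python sketch: the names `FRAME_PACING_SUPPORT` and `uses_new_frame_pacing` are hypothetical and are not part of any AMD tool or API. It simply restates which API and display combinations get the new pacing algorithms in Catalyst 14.1.

```python
# Illustrative only: encodes the Catalyst 14.1 frame pacing table from this article.
# All names here are hypothetical; this is not an AMD API.
FRAME_PACING_SUPPORT = {
    "D3D11": {"single": True, "eyefinity": True},
    "D3D10": {"single": True, "eyefinity": True},
    "D3D9": {"single": False, "eyefinity": False},
    "OpenGL": {"single": False, "eyefinity": False},
}


def uses_new_frame_pacing(api, display="single"):
    """Return True or False for driver-paced APIs, or None when pacing is left to the developer."""
    if api == "Mantle":
        # Under Mantle, frame pacing is implemented by the game developer, not the driver.
        return None
    return FRAME_PACING_SUPPORT[api][display]
```

Here `uses_new_frame_pacing("D3D9")` returns `False`, reflecting that Direct3D 9 titles still fall back to the old algorithms until phase 3.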

Besides the immediate frame pacing benefits from the phase 2 fixes, phase 2 is also extremely interesting from a technical perspective, as the high resolutions this covers are so high that they overwhelm the limited bandwidth available via AMD’s Crossfire Bridge Interconnect. As a result AMD is having to ship a significant amount of frame traffic over the PCIe bus, which is notable since AMD doesn’t have the XDMA engine to lean on in these older parts. We’ll be digging into these frame pacing improvements a bit later this month once we’re done with Mantle.
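To get a rough sense of the traffic involved, here is a back-of-the-envelope sketch in Python. The assumptions are ours rather than AMD's figures: 4 bytes per pixel, a two-GPU alternate frame rendering setup in which every other finished frame has to be copied to the card driving the display, and an illustrative 60 frames per second target. The function name is hypothetical.

```python
def afr_transfer_rate_mb_s(width, height, fps, bytes_per_pixel=4, gpus=2):
    """Approximate MB/s of finished frames copied to the display GPU under AFR."""
    frame_bytes = width * height * bytes_per_pixel
    # Frames rendered by the display GPU itself need no copy,
    # so only the share rendered on the other GPU(s) is transferred.
    remote_fraction = (gpus - 1) / gpus
    return frame_bytes * fps * remote_fraction / 1e6


# Assumed example resolutions: a 4K display and a 3x1 1080p Eyefinity group.
uhd = afr_transfer_rate_mb_s(3840, 2160, 60)
eyefinity = afr_transfer_rate_mb_s(5760, 1080, 60)
print(f"4K: {uhd:.0f} MB/s, 5760x1080: {eyefinity:.0f} MB/s")
# prints "4K: 995 MB/s, 5760x1080: 746 MB/s"
```

Under these assumptions a two-card setup at 4K would need to move roughly 1 GB/s of finished frames, which gives a feel for why a narrow dedicated bridge runs out of headroom at these resolutions.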

Finally, alongside the phase 2 dGPU fixes, AMD is also including their frame pacing fixes for their Dual Graphics technology, which leverages their multi-GPU capabilities to pair up an iGPU with a dGPU for similar performance improvements. Similar to AMD’s Crossfire technology, Dual Graphics has suffered from frame pacing issues for many of the same reasons, and with this release AMD is finally applying their new frame pacing algorithms to Dual Graphics. Though we haven’t had a chance to test this yet, AMD tells us that Catalyst 14.1 should significantly improve frame pacing for a subset of existing Dual Graphics setups, and meanwhile this is also the first driver to properly support Dual Graphics with the recently released Kaveri APUs. Dual Graphics has been nearly an afterthought for most buyers due to outstanding issues such as frame pacing, so hopefully AMD is better able to deliver on its potential with these fixes.

HSA

Last but not least, Catalyst 14.1 also includes the first runtime drivers for AMD’s Heterogeneous System Architecture (HSA). In the works for years and finally delivered with AMD’s recently launched Kaveri APU, HSA is AMD’s end game for integrating their CPUs and GPUs into a single product, coupling the two tightly enough to allow for the GPU component to be an efficient processor for various compute tasks. HSA requires a runtime to compile the pseudo-ISA to the architecture’s native ISA, hence the need for a driver component, which is finally being delivered via Catalyst 14.1.

More so than even Mantle, HSA is a work in progress. It'll still be a matter of years before we see widespread adoption of heterogeneous programming and software, but the release of HSA capable hardware and its associated runtime lays the groundwork for the development of the first HSA applications.

Closing Thoughts

As always, you can grab the driver from AMD’s driver download page. The Windows desktop driver weighs in at 286MB, while the mobility driver weighs in at 270MB.

AMD’s driver download page has extensive release notes on this driver release, including specific supported (and unsupported) products, and lists of known issues and notable bug fixes. Meanwhile AMD’s gaming sub-site has a blog post up on these drivers, reiterating the expected benefits and offering AMD’s own performance numbers.

Finally, as these are beta drivers AMD has been asking for bug reports for any and all issues encountered with 14.1. Bug reports may be submitted via AMD’s issue reporting form.

If you want low-level access to hardware (=Mantle), don't be surprised if you don't get complex high-level features. If you want the API and the driver to do stuff for you, DirectX still works.

Mantle is not for the everyday indie developer; it's first targeted at AAA titles that can afford the extra cost of using a practically beta-level environment, which is probably still full of bugs.

Oh I know, I'm just saying that even large devs like DICE push out products with a shitload of bugs and problems, so expecting them to actually take care with frame pacing (which is, in the big picture, a rather minor thing) when a lot of configs tank to <30 fps is a bit foolish.

Many game developers, including indies, use 3rd-party engines like UE, CryEngine, Unity, OGRE (anyone use it?) etc. Assuming these engines already have good platform-abstraction code, I believe Mantle could become widespread if it shows a performance benefit by a wide margin.

Though it might not be very popular with developers using in-house engines, especially those who don't do platform abstraction.