AMD has revealed an API that gives developers direct access to GPUs using the GCN architecture.

Now that AMD is powering all four major gaming platforms – namely the Wii U, Xbox One, PlayStation 4 and PC – the company has finally revealed its secret weapon to bind these platforms together: the low-level high-performance "Mantle" graphics API. This will allow developers to "speak the native language" of AMD's Graphics Core Next architecture used in modern AMD-based graphics cards and APUs.

My question is: how will this affect OpenGL and the future of the APIs? Could it not have been abstracted under OpenGL? Or maybe the GCN-supported cards actually have their OpenGL implementation on top of Mantle, and having access to raw Mantle gives additional benefits.

The information is kinda lacking or I just don't see it.

“The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet.” - Michael A. Jackson

Mantle is mostly targeted at titles that will be deployed on both next-gen consoles and PC. Console developers take advantage of the direct hardware access they enjoy on those platforms and both PS4 and Xbox One have AMD hardware under the hood. Mantle will enable the reuse of the same optimized paths on the PC. Everyone else will also be able to use it of course, but I'm not sure many will.

Both OpenGL and Direct3D need a lot more work to enable low-level access to the GPU and, in any case, abstracting over the hardware differences will be hard and probably not as optimal as a proprietary API. But we'll definitely see more work in that direction for both APIs and the gap will close.

As for Java, Mantle will be just another API to bind with JNI to. But you need to keep in mind that this will not be just another set of function pointers that, by calling them, magically increase your draw calls per second. Instead of handles, there will probably be a lot of direct pointers to GPU-specific data structures and misc buffers and it's going to be awkward to work with in Java. The power will be great; better reuse of buffers for reduced memory/bandwidth pressure and you'll be able to do novel things that enable new algorithms and effects. It's just going to be horribly unnatural and unsafe to use from Java. There's also been some info during JavaOne that the sun.* package will be inaccessible in Java 9 with Jigsaw modularization. Not sure what that means exactly yet, but it's scary to think how we're going to be able to deal with all that without sun.misc.Unsafe.
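To make the awkwardness concrete, here is a minimal sketch of what juggling raw native pointers from Java looks like today. The "GPU descriptor" is entirely made up for illustration; the point is that with sun.misc.Unsafe you get plain longs as pointers, manual offset arithmetic, and nothing preventing a use-after-free.

```java
import java.lang.reflect.Field;

public class UnsafePointerDemo {
    public static void main(String[] args) throws Exception {
        // The usual reflective grab of the Unsafe singleton.
        Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);

        // Pretend this is a pointer to some GPU-side descriptor handed to us
        // by a native API; here we just allocate 16 bytes of native memory.
        long descriptor = unsafe.allocateMemory(16);
        unsafe.putInt(descriptor, 42);        // e.g. a "resource id" field
        unsafe.putLong(descriptor + 8, 1024); // e.g. a "size in bytes" field

        System.out.println(unsafe.getInt(descriptor));
        System.out.println(unsafe.getLong(descriptor + 8));

        unsafe.freeMemory(descriptor); // nothing stops a use-after-free here
    }
}
```

Every field access is an untyped offset into raw memory; one wrong offset and you corrupt the heap instead of getting an exception.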

Anyway, AMD is obviously taking full advantage of the unique opportunity it has with so much hardware being deployed with its GCN architecture underneath. As much as I don't appreciate proprietary APIs (see CUDA), this is a decent step towards moving the PC to the living room, which I really want to see happening. Imagine a Mantle-optimized game running on SteamOS: direct GPU access on top of a very lightweight platform that minimizes kernel/user-space context switches. The efficiency will be awfully close to that of a console, and it'd still be an open(-ish) platform.

Quote

There's also been some info during JavaOne that the sun.* package will be inaccessible in Java 9 with Jigsaw modularization. Not sure what that means exactly yet, but it's scary to think how we're going to be able to deal with all that without sun.misc.Unsafe.

I'm a bit worried. Sun and Oracle have already refused several times to provide a public API to release the native memory reserved by direct NIO buffers; they are almost unusable in performance-critical code without a means to trigger their "destruction".


Sorry, I forgot to follow up on this. There's been additional info on Unsafe specifically: it's going to be cleaned up and moved into the official JDK in Java 9. It's obviously too popular to kill.

I'm not sure what kind of trouble you're having with buffer deallocation. Have you tried using a large buffer as a memory pool and using .slice()d buffers out of it? Also, direct buffers are simple wrappers around Unsafe.allocateMemory() and Unsafe.freeMemory(). Why not use those directly and construct the ByteBuffer instances out of the pointer .allocateMemory() returns?
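The slice() approach mentioned above looks roughly like this: allocate one big direct buffer up front and carve independent sub-buffers out of it, so its native memory lives exactly as long as the pool is reachable and there is nothing to "destroy" per allocation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SlicePoolDemo {
    public static void main(String[] args) {
        // One direct allocation; all "allocations" below are views into it.
        ByteBuffer pool = ByteBuffer.allocateDirect(1024)
                                    .order(ByteOrder.nativeOrder());

        // Hand out bytes [0, 256) as an independent buffer.
        pool.position(0).limit(256);
        ByteBuffer a = pool.slice();

        // Hand out bytes [256, 512) as another one.
        pool.position(256).limit(512);
        ByteBuffer b = pool.slice();

        a.putInt(0, 7);
        b.putInt(0, 9); // writes to pool offset 256, not offset 0
        System.out.println(a.getInt(0));
        System.out.println(b.getInt(0));
    }
}
```

A real pool would track which regions are free, but the lifetime problem disappears either way: only the single backing buffer ever needs releasing.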

As I understand it there are moves to getting "structs" into Java at some point, so there's hope it'll be useful for someone yet. Of course the idea of writing to the metal (and indeed, only AMD's metal, at that), is kinda useless to most people who aren't building bespoke hardware and software solutions. Even a console writer would be loathe to bother because it means a lot of extra effort for a minimal performance gain and then having to ignore it all anyway when the inevitable PC port (read: Steambox) comes along.


Quote

Sorry, I forgot to follow up on this. There's been additional info on Unsafe specifically: it's going to be cleaned up and moved into the official JDK in Java 9. It's obviously too popular to kill.

Thanks. I feel better now. Where can I get some more information about those public APIs? We'll have to use different code paths depending on the Java version but as long as it works, it's ok for me.

Quote

I'm not sure what kind of trouble you're having with buffer deallocation. Have you tried using a large buffer as a memory pool and using .slice()d buffers out of it? Also, direct buffers are simple wrappers around Unsafe.allocateMemory() and Unsafe.freeMemory(). Why not use those directly and construct the ByteBuffer instances out of the pointer .allocateMemory() returns?

In some cases it is difficult to plan how much native memory I'm going to use when launching my applications; I can vaguely estimate it, but my own code isn't the only one using the native heap. I often use the direct NIO buffers obtained from the public API and use the Oracle/Sun/Apache proprietary APIs only to destroy them. Most of the direct NIO buffers I use are created by third-party libraries; I can override a mechanism to get them and destroy them quite easily, but using Unsafe.allocateMemory would require some modifications of the architecture, and some maintainers don't want to expose any API to release the native memory (it has been a source of debate for JMonkeyEngine 3, and I handle it in my own code for Ardor3D).

Yesterday's presentation: AMD Mantle Technical Deep Dive. There were two more but those are not available publicly yet. No code was shown either, but it's still a nice abstraction layer above the hardware; it won't be much more code than using OpenGL. Highlights:

- Multi-vendor and multi-platform. Obviously it'll start as AMD and Windows 7/8 only, but it's not tied to any of that. They're hoping it will become an industry standard.
- It completely eliminates CPU bottlenecks. An FX-8350 underclocked to 2GHz can drive an R9 290X with no loss in frame rate. They claim an FX-8350 is just as fast as an i7-4770K.
- They're targeting 100k draw calls (compared to the 3k-10k cap today). RTS games with 3000 units on screen, 25000 discrete objects simulating and moving.
- Explicit multi-threading. Basically your application's threads are the rendering threads; there's no threading at the driver level. This is a huge deal considering Java's concurrency features.
- Multiple threads, driving multiple command queues, executing on multiple GPUs. Full application-level control; multi-GPU setups will be able to get to 90%+ scaling. Can do AFR, SFR or your own custom scheme.
- GPU page table remapping exposed. Used to implement AMD's Partial Resident Textures, can now be used however you like.
- Generalized resources: everything is either "memory" or "texture". Usage hints are gone (and so is driver behavior unpredictability as a result); you now explicitly tell the driver how you're using a resource and when that usage changes. The driver then handles low-level details like flushing, compression/decompression, etc. When you do have to stall, you can control when it happens and that it happens only once.
- New binding model that's as flexible as bindless, but without complicating the shaders.
- Monolithic pipelines. You build (almost) all state in advance; binding is done without validation. Some state is left out of pipelines to avoid combinatorial explosion (they profiled existing engines to determine which state should be left out).
- Pipeline serialization, which includes pre-compiled shader code, for low start-up times.
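Since no Mantle code has been shown publicly, here is a purely hypothetical sketch of what "explicit multi-threading" could look like from Java: the CommandBuffer class is a stand-in, not a real API. The key idea is that the application's own threads record command buffers in parallel, and the application alone decides submission order; no hidden driver thread is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExplicitThreadingSketch {
    // Stand-in for a Mantle-style command buffer (hypothetical).
    static class CommandBuffer {
        final List<String> commands = new ArrayList<>();
        void draw(int objectId) { commands.add("draw " + objectId); }
    }

    public static void main(String[] args) throws Exception {
        int threads = 4, objectsPerThread = 5;
        ExecutorService exec = Executors.newFixedThreadPool(threads);

        // Each worker records into its own command buffer: no locks,
        // no synchronization with a driver thread.
        List<Future<CommandBuffer>> recorded = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int base = t * objectsPerThread;
            recorded.add(exec.submit(() -> {
                CommandBuffer cb = new CommandBuffer();
                for (int i = 0; i < objectsPerThread; i++) cb.draw(base + i);
                return cb;
            }));
        }

        // "Submit" the buffers to a queue in an order the app chooses.
        int total = 0;
        for (Future<CommandBuffer> f : recorded) total += f.get().commands.size();
        exec.shutdown();
        System.out.println(total);
    }
}
```

This is exactly the shape of workload Java's executors and fork/join pools are already good at, which is why the "no driver-level threading" point matters so much here.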

It's a valid concern, but I think you missed the first point in my post:

Quote

Multi-vendor and multi-platform. Obviously it'll start as AMD and Windows 7/8 only, but it's not tied to any of that. They're hoping it will become an industry standard.

I intentionally mentioned it before everything else. It's very important that both AMD's Mantle and Nvidia's G-Sync are to be open (or at least licenseable) and eventually become industry standards. Unlike, say, Nvidia's CUDA and PhysX.

Do you know how one programs the GPU with Mantle? GLSL or something new?

They did not say (unless I missed something). I don't think it's going to be something new though. I did read something about HLSL last month and they should easily provide GLSL extensions if necessary. Also, @grahamsellers mentioned OpenGL extensions that will provide Mantle-like functionality & performance.

What about all the low-level rasterization stuff (stencil, depth, blending, ...)? Is it just a copy from OpenGL or something new?

Yes, such standard GPU features are identical across all APIs. Depth-testing and blending were actually mentioned in the presentation as two pieces of state that are not included in the monolithic pipelines.

Info from past couple of days: There is a validation layer on top of Mantle which is "indeed extremely important and very valuable. also makes it significantly easier than on PS4" according to DICE. Seems AMD is taking profiling, debugging and integration with other tools (e.g. PerfStudio) very seriously.

I used to be extremely skeptical of Mantle, since I thought Nvidia wouldn't be able to implement it. If Mantle failed it'd be a loss for AMD, which could increase Nvidia's market share and kill competition; if it succeeded it'd split up the PC between Nvidia and AMD, since in that case Nvidia and Intel might make their own Mantle-like APIs. If Nvidia actually swallows their stupid pride and makes a Mantle driver (assuming they can), I would definitely start using it immediately. Multithreaded rendering? Precompiled pipelines? Much more flexible state? Just to mention a few? Sign me the f*ck up.

Without having seen any code, I'm sure I'll add it to LWJGL the same day it's out. Interestingly, we might have to start worrying about JNI. It's only noise with OpenGL, but if we're moving to 100k-1M draw calls per frame, JNI overhead could easily add up to a few milliseconds. An FFI mechanism would be nice in Java 9...
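A quick back-of-the-envelope on that: assuming some fixed per-call JNI crossing cost (the ~20ns figure below is an assumption for illustration, not a measurement), the total overhead scales linearly with draw calls per frame and becomes visible at Mantle-scale call counts.

```java
public class JniOverheadEstimate {
    public static void main(String[] args) {
        double nsPerCall = 20.0; // assumed JNI crossing cost, for illustration
        long[] drawCalls = {10_000, 100_000, 1_000_000};
        for (long calls : drawCalls) {
            // nanoseconds -> milliseconds of pure binding overhead per frame
            double ms = calls * nsPerCall / 1_000_000.0;
            System.out.println(calls + " calls -> " + ms + " ms/frame");
        }
    }
}
```

At 10k calls the overhead is noise; at 1M calls it would eat most of a 60fps frame budget on its own, which is why a proper FFI mechanism starts to matter.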

I think a multi-vendor, multi-platform Mantle would be a great opportunity for Linux to directly compete with Windows as a gaming platform. I also can't wait for lightweight drivers vs the 200+ MB monstrosities we have to download now every couple of weeks.

The biggest hurdle for Mantle is going to be getting Nvidia, Intel and Apple to support it; without that, I'm not sure it's going to have widespread enough appeal. For that to happen it'll need to be an open standard, and AMD will have to give up its exclusive control of the API, probably to a party like Khronos.

Having said that, AMD is in a pretty strong position at the moment, as its chips are used in all the next-gen consoles (PS4, XB1 & Wii U) and in about a third of the gaming PCs out there. Therefore it could become pretty common, as the code is supposed to be shareable across hardware.

HSA seems to be an architecture (or concept?) that allows the CPU and GPU to access the same address space, i.e. passing pointers instead of copying data, among other things. The new third-generation AMD APU (Kaveri) A10-7850K will have this. Kaveri will include AMD's Mantle API [2].

The article above says that many companies, including Oracle, are members of the HSA Foundation. It also says that by joining the HSA Foundation, Oracle is striving for better utilization of the GPU with Java.

Other interesting things:
[1] Java 8 Sumatra processing demo (Sumatra is an OpenJDK project to allow taking advantage of GPUs and APUs)
[2] a biased article with more details about the APUs (read also the comments of user Ken Luskin for more background info)
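For context, this is the kind of code Sumatra aims to offload: a data-parallel stream with no shared mutable state. On a Sumatra-enabled JVM with an HSA-capable APU the map/sum below could run on the GPU; on a stock JVM it just runs on the fork/join pool, with the same result either way.

```java
import java.util.stream.IntStream;

public class SumatraStyleStream {
    public static void main(String[] args) {
        // Pure, side-effect-free per-element work: offloadable in principle.
        int sumOfSquares = IntStream.range(0, 1000)
                                    .parallel()
                                    .map(i -> i * i)
                                    .sum();
        System.out.println(sumOfSquares);
    }
}
```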

Nvidia and Java have announced they will be allowing speed boosts using the GPU; this covers even array-scanning functions, with up to 48x performance on some tasks.

Missed that, but it seems like the acceleration is done using Nvidia's CUDA library.

It's not clear from the links I've found whether it'll be automatic behind-the-scenes acceleration, like what AMD is working on for Java 9 (using OpenCL), or just a simple binding to Nvidia's CUDA libraries which developers will have to work with directly. Since the articles state Java 8, I'm assuming it's probably just a binding to CUDA.

It's neither difficult nor expensive to add Mantle support to modern engines (probably expect all of the major ones to have it, and maybe even Nvidia could use it; it seems the decision to use it or not is still up to Nvidia).

It's in alpha stage; they have not fully optimized Mantle yet and they are already hitting GPU bottlenecks instead of CPU ones, but there is still much room for optimization. Underclocked to 2GHz, the FX-8350 was still GPU-bottlenecked with an R9 290X. They have not even looked much at GPU performance yet, just CPU, and there are already amazing gains.

Support for fully utilizing CPU cores means the FX-8350 is comparable in power to an i7-4770K. High-end CPUs are not needed to support high-end GPUs. Everything scales amazingly well with more cores.

The frame rate jumped up to 3x when switching from Direct3D (DirectX) to (alpha) Mantle. The tech demo shown is supposed to come out in Q1 2014 and will be moddable. It looks damn pretty (starts at minute 25 in the video).

Future: GPU might be used as a co-processor even for non multimedia applications.
