
The Linux Graphics Driver Stack Remains Insecure

02-02-2013, 01:00 PM

Phoronix: The Linux Graphics Driver Stack Remains Insecure

The Linux graphics driver stack remains insecure, with some fundamental issues that jeopardize the Linux desktop's integrity, but improvements are still being made to address them...

The ideal approach is to isolate users in separate VMs, restricting each GPU user to its own data by abstracting the memory address space. The Nouveau driver already uses this method for NVIDIA GeForce 8 hardware and newer, and it could be supported on the AMD Radeon HD 7000 series and newer as well as Intel Sandy Bridge graphics and newer.

Believe we are already doing this for GCN hardware. It's not actually a separate VM (didn't think Nouveau does that either), just a separate virtual address space, but we do use the GPUVM page tables to control what the GPU can access. GPUVM is also implemented for Cayman but don't remember if it's enabled by default yet.
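To make the "separate virtual address space" idea concrete, here is a minimal sketch, not the radeon driver's actual code. It models a per-process GPU page table as a plain dictionary; the names `GpuAddressSpace`, `map_page`, `translate`, and `GpuFault`, and the 4 KiB page size, are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class GpuFault(Exception):
    """Raised when a virtual address has no mapping in this address space."""


class GpuAddressSpace:
    """Hypothetical model of one process's GPU virtual address space."""

    def __init__(self):
        # virtual page number -> physical page number
        self.page_table = {}

    def map_page(self, virt_page, phys_page):
        self.page_table[virt_page] = phys_page

    def translate(self, virt_addr):
        virt_page, offset = divmod(virt_addr, PAGE_SIZE)
        if virt_page not in self.page_table:
            # An unmapped access faults instead of reaching someone else's memory.
            raise GpuFault(f"unmapped virtual address {virt_addr:#x}")
        return self.page_table[virt_page] * PAGE_SIZE + offset
```

Two processes can use the same virtual address and land on different physical pages, and neither can name a page that was never mapped into its own table.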

The problem with this separate VM approach, though, is that it increases the context-switching delay, which could particularly cause problems when using DRI2 and Qt5, among other cases. Right now what the Radeon and Intel open-source drivers do is command submission validation: making sure submitted commands aren't accessing bad areas of RAM. This method yields a lower context-switching delay and doesn't have any specific hardware requirements, but it comes at the cost of higher CPU usage, since the CS packets need to be checked for validity.

AMD GPUs have hardware support to minimize context switching delays in Cayman, GCN and beyond... basically the ability to have multiple page tables, each associated with a "VM ID" or VMID, then have the hardware automatically switch between them as needed. The HW supports a finite number of VMIDs so they do need to be managed carefully but starting with Cayman there are enough VMIDs to minimize the overhead.
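One way to picture "a finite number of VMIDs that need to be managed carefully" is a small pool with least-recently-used eviction. This is only a sketch under assumptions: the `VmidPool` class, its `acquire` method, and the LRU policy are hypothetical and are not taken from the driver.

```python
from collections import OrderedDict


class VmidPool:
    """Hypothetical allocator for a fixed number of hardware VMIDs."""

    def __init__(self, num_vmids):
        self.free = list(range(num_vmids))
        # address space id -> VMID, ordered from least to most recently used
        self.owner = OrderedDict()

    def acquire(self, space_id):
        """Return (vmid, rebound).

        rebound is True when the VMID had to be pointed at this address
        space's page table (the slow path), False when it was already bound.
        """
        if space_id in self.owner:
            self.owner.move_to_end(space_id)
            return self.owner[space_id], False
        if self.free:
            vmid = self.free.pop(0)
        else:
            # No free VMID left: take one from the least recently used space.
            _, vmid = self.owner.popitem(last=False)
        self.owner[space_id] = vmid
        return vmid, True
```

With enough VMIDs for the active address spaces, almost every `acquire` takes the fast path, which is the sense in which more VMIDs reduce the overhead.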

As mentioned, it is possible to use GPUVM in earlier hardware but without VMID support the context switching delays make it impractical so we continue to use command submission validation for older hardware.
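For comparison, command submission validation can be sketched roughly as below. The packet format is an assumption: real CS packets are hardware-specific, and here each one is reduced to a hypothetical `(address, size)` pair checked against a list of `(start, length)` buffers the submitter owns. `validate_command_stream` is an illustrative name, not a driver function.

```python
def validate_command_stream(packets, allowed_buffers):
    """Raise ValueError if any packet touches memory outside the allowed buffers.

    packets:         list of (address, size) pairs, one per GPU memory access
    allowed_buffers: list of (start, length) ranges owned by the submitter
    """
    for index, (address, size) in enumerate(packets):
        if size <= 0:
            raise ValueError(f"packet {index}: invalid size {size}")
        end = address + size
        # The whole access must fit inside a single buffer the submitter owns.
        fits = any(start <= address and end <= start + length
                   for start, length in allowed_buffers)
        if not fits:
            raise ValueError(
                f"packet {index}: access {address:#x}..{end:#x} is out of bounds")
    return True
```

Every packet is checked on the CPU at submission time, which is where the extra CPU usage mentioned above comes from.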

Comment

I hope they sort this out, especially given the rise of WebGL and how much closer it lets potentially malicious code get to these kinds of issues.

I think Linux users should really welcome WebGL, as it has the potential to be revolutionary in terms of the number of games which are accessible to Linux users. The vast majority of gaming is in casual games, and those are a great target for WebGL. Yay for cross-platform standards!

Comment

Believe we are already doing this for GCN hardware. It's not actually a separate VM (didn't think Nouveau does that either), just a separate virtual address space, but we do use the GPUVM page tables to control what the GPU can access. GPUVM is also implemented for Cayman but don't remember if it's enabled by default yet.

AMD GPUs have hardware support to minimize context switching delays in Cayman, GCN and beyond... basically the ability to have multiple page tables, each associated with a "VM ID" or VMID, then have the hardware automatically switch between them as needed. The HW supports a finite number of VMIDs so they do need to be managed carefully but starting with Cayman there are enough VMIDs to minimize the overhead.

As mentioned, it is possible to use GPUVM in earlier hardware but without VMID support the context switching delays make it impractical so we continue to use command submission validation for older hardware.

Anyway, good to know radeon switched to using them instead of command submission validation. We always used hw contexts on Nouveau and switching delay isn't that high.

Comment

Before this thread goes off the rails like the rest of them, it's probably worth mentioning that code implemented for "security" also provides increased "stability" by intercepting bad accesses caused by driver or app bugs, rather than letting them go through and possibly cause eventual crash/hang problems.

The security measures being discussed are aimed at making sure the graphics stack is not *worse* than the rest of the Linux stack, not aiming for some abstract higher level of security.

Comment

Before this thread goes off the rails like the rest of them, it's probably worth mentioning that code implemented for "security" also provides increased "stability" by intercepting bad accesses caused by driver or app bugs, rather than letting them go through and possibly cause eventual crash/hang problems.

Yeah, that kind of damage limitation is good... makes it easier to fix future problems, by ensuring the symptoms occur closer to whatever is causing them.