Thursday, August 21, 2014

What's the deal with VGA arbitration?

Let's take a step back on the VFIO VGA train and look at what exactly VGA is, why it needs to be arbitrated, and why we can't seem to get that arbitration working upstream.

VGA (Video Graphics Array) is a remnant of early PCs and certainly falls within the category of a legacy interface. If you take a look at my slides from KVM Forum 2013, you can get a little taste of the history. VGA came after things like EGA and CGA and incorporated many of their features for compatibility, but it was effectively the dawn of the PC era. We didn't have interfaces like PCI. Devices lived at known, fixed addresses, and there was only ever intended to be one device. VGA devices are initialized through proprietary code provided as the VBIOS, i.e. the ROM on PCI VGA cards of today. How does the VBIOS know where to find the VGA device for initialization? It's always at a known, fixed address: the MMIO region 0xa0000-0xbffff and a couple of sets of I/O port ranges. Remember ISA NICs with jumpers that could only be programmed to a couple of addresses and required specifying which address to the driver? VGA devices don't have jumpers.
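
To make the "known, fixed address" idea concrete, here is a minimal Python sketch that tests whether an address falls in legacy VGA space. The MMIO window is the one given above. The post doesn't spell out the I/O port ranges, so the two used here (0x3b0-0x3bb and 0x3c0-0x3df) are the ranges the PCI bridge specification associates with VGA, and the function name is made up for illustration.

```python
# Legacy VGA address ranges (inclusive). The memory window comes from the
# text above; the I/O port windows are an addition, taken from the ranges
# the PCI-to-PCI bridge specification ties to VGA.
VGA_MEM_RANGES = [(0xA0000, 0xBFFFF)]
VGA_IO_RANGES = [(0x3B0, 0x3BB), (0x3C0, 0x3DF)]


def is_legacy_vga(address: int, space: str) -> bool:
    """Return True if `address` in `space` ("mem" or "io") is legacy VGA."""
    if space == "mem":
        ranges = VGA_MEM_RANGES
    elif space == "io":
        ranges = VGA_IO_RANGES
    else:
        raise ValueError(f"unknown address space: {space!r}")
    return any(lo <= address <= hi for lo, hi in ranges)
```

For example, `is_legacy_vga(0xB8000, "mem")` is true, while `is_legacy_vga(0xC0000, "mem")` is not, since 0xc0000 is just past the end of the VGA window.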

In a modern PC, we're no longer using an ISA bus and we can clearly support more than a single VGA device, but the mechanisms to do so are via layers of compatibility. PCI bridges actually have a VGA Enable bit in their configuration space which defines whether the bridge will do positive decode on transactions to the VGA spaces, effectively defining whether it will take ownership of a transaction to the VGA area. Each PCI bridge has one of these enable bits, but the PCI specification only defines the results when a single bridge per bus enables VGA. Therefore, at any given time, VGA access can only be directed to a single PCI bus. It's the responsibility of system software to manage which bus VGA is routed to. Managing the VGA routing is what we call VGA arbitration.
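
As a rough illustration of where that bit lives, the sketch below checks the VGA Enable bit in a raw dump of a bridge's configuration space. The offsets (Header Type at 0x0e, Bridge Control at 0x3e, VGA Enable as bit 3) follow the standard type 1 header layout and are not taken from this post. The helper names are hypothetical.

```python
import struct

PCI_HEADER_TYPE = 0x0E          # 8-bit Header Type register
PCI_HEADER_TYPE_MASK = 0x7F     # bit 7 is the multi-function flag
PCI_HEADER_TYPE_BRIDGE = 0x01   # type 1 header: PCI-to-PCI bridge
PCI_BRIDGE_CONTROL = 0x3E       # 16-bit Bridge Control register
PCI_BRIDGE_CTL_VGA = 0x08       # VGA Enable bit


def bridge_vga_enabled(config: bytes) -> bool:
    """Return True if `config` is a bridge header with VGA Enable set."""
    if len(config) < 0x40:
        raise ValueError("need at least the 64-byte standard config header")
    header_type = config[PCI_HEADER_TYPE] & PCI_HEADER_TYPE_MASK
    if header_type != PCI_HEADER_TYPE_BRIDGE:
        return False  # not a bridge, so there is no Bridge Control register
    (control,) = struct.unpack_from("<H", config, PCI_BRIDGE_CONTROL)
    return bool(control & PCI_BRIDGE_CTL_VGA)


def bridges_routing_vga(bridges):
    """Given {name: config_bytes}, return the sorted names that forward VGA."""
    return sorted(name for name, cfg in bridges.items()
                  if bridge_vga_enabled(cfg))
```

On a Linux host the input bytes would typically come from reading a bridge's `config` file under `/sys/bus/pci/devices/`. Walking from the root towards a VGA device, every bridge on the path needs the bit set for legacy VGA accesses to reach it.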

In an ideal configuration, each VGA device is on a separate, leaf bus, where there are no further VGA devices downstream. System software then needs only to disable the VGA enable bit on one set of bridges and enable it on another. Things get tricky when we have multiple VGA devices on the same bus or one VGA device downstream of another. In this case we need to prevent the other VGA devices from claiming the transactions so that they reach the intended VGA device. One way to do this is via the PCI MMIO and I/O port space enable bits on the device. The PCI specification defines that a device can only claim a transaction to an address space owned by the device when the appropriate bit is set in the PCI COMMAND register. By disabling these bits, we can guarantee that the device won't claim the transaction, allowing the intended device on the same bus or downstream bus to do so. Another option to do this is to use device specific mechanisms to configure the controller to ignore these transactions, in effect putting the device into a legacy-free operating state.
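
The COMMAND register approach can be sketched in a few lines of Python. The bit positions (I/O space enable as bit 0, memory space enable as bit 1 of the 16-bit COMMAND register) are the standard PCI definitions rather than anything stated in this post, and the function names are only for illustration.

```python
PCI_COMMAND_IO = 0x0001      # I/O space enable
PCI_COMMAND_MEMORY = 0x0002  # memory (MMIO) space enable


def can_claim(command: int, space: str) -> bool:
    """Return True if a device with this COMMAND value may claim `space`."""
    if space == "io":
        return bool(command & PCI_COMMAND_IO)
    if space == "mem":
        return bool(command & PCI_COMMAND_MEMORY)
    raise ValueError(f"unknown address space: {space!r}")


def disable_decode(command: int) -> int:
    """Return COMMAND with both decode enables cleared, other bits kept."""
    return command & ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY) & 0xFFFF
```

Note that clearing these bits stops the device from claiming any I/O port or MMIO transaction, not only those in the VGA ranges, which is what makes this a blunt instrument for a device with a running driver.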

When a PC boots, the system BIOS selects a primary VGA device. Some BIOS/motherboard manufacturers allow the user to select the primary VGA device. This configures the chain of PCI bridges to one of the VGA devices, disabling the others. The ROM is copied from the PCI option ROM space of the VGA controller and stored in the shadow ROM area at 0xc0000 to be compatible with legacy PCs. The VBIOS executes from there and relies on being able to reach its device at the known, fixed VGA address spaces.
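
For a feel of what "copied to the shadow ROM area" means in address terms, here is a small sketch that validates an option ROM image and computes the range it would occupy starting at 0xc0000. It assumes the usual legacy ROM header (a 0x55 0xaa signature followed by a size byte in 512-byte units), which this post does not describe, and the function name is hypothetical.

```python
SHADOW_ROM_BASE = 0xC0000     # shadow ROM address from the text above
ROM_SIGNATURE = b"\x55\xaa"   # assumed legacy option ROM signature
ROM_BLOCK_SIZE = 512          # assumed unit of the size byte at offset 2


def shadow_rom_range(rom: bytes):
    """Return the inclusive (start, end) addresses the ROM would shadow."""
    if len(rom) < 3 or rom[:2] != ROM_SIGNATURE:
        raise ValueError("missing 0x55 0xaa option ROM signature")
    size = rom[2] * ROM_BLOCK_SIZE
    if size == 0:
        raise ValueError("ROM header reports zero length")
    return SHADOW_ROM_BASE, SHADOW_ROM_BASE + size - 1
```

A ROM whose size byte is 0x80 (128 blocks, 64KB) would occupy 0xc0000-0xcffff under these assumptions.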

The operating system may continue to use VGA access to the device for compatibility, and boot loaders like GRUB and SYSLINUX don't have device specific video drivers. Instead they make use of other standards layered on top of VGA, like VESA, that allow them to switch modes and use the VGA device with standard drivers. In general though, we expect that once we're using device specific drivers in the operating system, the VGA space sees little to no use. This means that when we want to switch VGA routing to the non-primary device so that we can repeat the above boot process in a virtual machine, we generally don't have many conflicts and can even boot two VMs simultaneously, switching VGA routing between VGA devices, with bearable performance.

Where we get into trouble (AIUI) is that DRI support in Xorg wants to create a fixed mapping (mmap) to the VGA MMIO space. This fixed mapping implies that we can no longer change the VGA routing on the bridges since the mmap would suddenly target a different device. Xorg therefore disables DRI support when there are multiple participants in VGA arbitration. People like DRI support [citation needed] and multiple VGA devices are fairly common, especially when VGA support is built into the processor, such as with Intel IGD. As a result, host drivers want to opt out of VGA arbitration as quickly as they can by disabling legacy interfaces on the device so that there are never multiple VGA arbitration participants and DRI remains enabled.

For a typical plug-in PCIe VGA card, the device can be guaranteed that there are no downstream VGA devices and there are no other devices on the same bus, simply because the point-to-point topology of PCI Express does not allow it. Such devices don't need to do anything special to opt out of VGA arbitration; they simply need to not rely on VGA address spaces and notify system software that arbitration is no longer necessary.

Intel IGD isn't that lucky. The IGD VGA device lives on the PCIe root complex where nearly all of the other devices in the system are either sibling devices on the same bus or devices on subordinate buses from the root complex. In order for VGA to be routed to any other device in the system, the IGD device needs to be configured to not claim the transaction. As noted earlier, this can be done either by using standard PCI disabling of MMIO and I/O or via device specific methods. The native IGD driver would really like to continue running when VGA routing is directed elsewhere, so the PCI method of disabling access is sub-optimal. At some point Intel knew this and included mechanisms in the device that allowed VGA MMIO and I/O port space to be disabled separately from PCI MMIO and I/O port space. These mechanisms are what the i915 driver uses today when it opts out of VGA arbitration.

The problem is that modern IGD devices don't implement the same mechanisms for disabling legacy VGA. The i915 driver continues to twiddle the old bits and opt out of arbitration, but the IGD VGA is still claiming VGA transactions. This is why many users who claim they don't need the i915 patch finish their proclamation of success with a note about the VGA color palette being corrupted or noting some strange drm interrupt messages in the host dmesg. They've ignored the fact that the VGA device is only working further into guest boot, when the non-legacy parts of the device drivers have managed to activate the device. The errors they see are a direct result of VGA accesses by the guest being claimed by the IGD device rather than the assigned VGA device.

So why can't we simply fix the i915 driver to correctly disable legacy VGA support? Well, in their infinite wisdom, hardware designers have removed (or broken) that feature of IGD. The only remaining option is therefore the PCI disabling mechanism, but the host driver can't run permanently with PCI I/O space disabled, so it must participate in VGA arbitration, enabling I/O space only when needed, since the device will always claim VGA transactions as well as PCI space transactions whenever I/O space is enabled. This has the side-effect that when another VGA device is present, we now have multiple VGA arbitration participants, and Xorg disables DRI.

How can we resolve this?

a) Have a mode switch in i915 that allows it to behave correctly with VGA arbitration.

This is the currently available i915 patch. The problems here are that DRI will be disabled and the maintainer is not willing to accept a driver mode switch option, instead pushing for a solution in other layers.

b) Remove Xorg dependency on mapping VGA space for DRI.

Honestly, I have no idea how difficult this is. The problem is one of compatibility and deprecation. If we could fix Xorg today to decouple DRI from VGA arbitration, how long would it be before we could fix the i915 driver to work correctly with VGA arbitration? Could we ever do it or would we need to maintain compatibility with older Xorg?

c) Next-gen VGA arbitration

We can do tricks with page fault handlers to allow user space to have an apparently consistent mmap of VGA MMIO space. On switching VGA routing we can invalidate the mmap pages and re-route VGA in the fault handler. The trouble is again one of compatibility and deprecation. If we provided Xorg with a v2 VGA arbitration interface today, could the i915 driver ever rely on it being used?

d) Don't use VGA

This is actually a promising path, and one that we'll talk more about in the future...

On the FAQ (https://vfio.blogspot.co.nz/2014/08/vfiovga-faq.html), question 3: "I have Intel host graphics, when I start the VM I don't get any output on the assigned VGA monitor and my host graphics are corrupted. I also see errors in dmesg indicating unexpected drm interrupts."

You mention using the patch; this is for VMs using SeaBIOS, right? Is it always required? I've been trying to get this working on a laptop that meets all hardware requirements but lacks the EFI vbios support on the nvidia 860M, so OVMF is not an option from what I understand. From what I can tell the HDMI and VGA ports also belong to the Intel IGP, but VNC should still be an option, correct?

When using SeaBIOS I'm able to install Windows and see my gpu listed as "3D Video Controller" in Device Manager reporting Code 28, nvidia drivers refuse to install though.

I've not applied the patch you mention as I'm not experiencing the issues you're talking about; there is no graphic corruption and no relevant errors in dmesg. Is the situation with laptops different here due to the hardware setup where often the nvidia/amd gpu output is handled via Intel?

I'm having difficulty understanding what is wrong with my setup since the gpu does appear to be passed through to the VM. I have not had much luck finding anything about passthrough with Code 28, I could only wish it were Code 43 :) Would you be able to clarify what's wrong? I can only assume it has something to do with laptop hardware design being different from desktops.

I have more details on my specific setup here: https://www.reddit.com/r/VFIO/comments/41ohss/laptop_with_nvidia_gpu_passed_through_but_cannot/

As I replied to your other comment, laptops are not likely to work. Optimus support is highly integrated into the system BIOS in ways that we have no idea how to duplicate. I don't think it's likely to help, but the i915 vga arbiter patch is necessary any time you have host Intel graphics and make use of the x-vga=on option.