AMD Ryzen VFIO GPU pass-through – gaming VM on new hardware

This is kind of a follow-up to my USB3 card pass-through post from some years ago. I finally bought some new hardware and want to share my configuration for VFIO GPU and USB pass-through to a Windows 10 VM. The host system is Debian 10 with libvirt. With my former Intel-based PC I was lucky that everything worked so well; this time I took some time to select the hardware.

New and existing hardware

  • Mainboard: Gigabyte X570 Aorus Pro
  • CPU: AMD Ryzen 5 3600
  • RAM and a CPU cooler
  • … and a new power supply because the old one lacked an ATX_12V CPU power connector

Existing hardware:
  • Gainward GTX 1070 Phoenix GS
  • Intel X520 10GbE PCIe SFP+ card
  • Renesas USB3 card to pass through to the Windows 10 gaming VM
  • a USB headset I bought to get rid of the crappy and troublesome sound pass-through from VM to host

I went for the Gigabyte X570 Aorus Pro because it has three PCIe x16 slots, configured as x16/x8/x4 (attached to CPU/CPU/chipset) when only the first x16 slot is used, or x8/x8/x4 when both CPU slots are used, and because it lets you select which slot holds the boot GPU.
So my initial plan was to put the guest GPU in slot 1, the X520 NIC in slot 3 and the USB3 card (PCIe x1) somewhere in between, and everything would be fine. But the AMD Ryzen, unlike most Intel CPUs, has no integrated graphics, and a second graphics card is needed for GPU pass-through to a VM because the boot GPU can't be passed to a VM.
I had an old nVidia GT640 lying around doing nothing, and the boot GPU can be set to slot 2. Perfect.
During assembly I noticed that the USB3 card wouldn't fit, as both graphics cards are two slots wide.
The solution, and what is and isn't possible with this board, is the subject of this blog entry.

VFIO and IOMMU

A requirement for VFIO pass-through to a VM are good IOMMU groups. These are used to isolate hardware so it can safely be passed through to a guest system.
A restriction of IOMMU groups is that all devices in a given group have to be passed to the guest system together, with the exception of PCI bridges, which are never passed to the guest OS. (This is not entirely true: the ACS override patch for the Linux kernel removes this restriction, but you should know about the risks.) So the smaller an IOMMU group is (fewer devices), the better. More info can be found here: VFIO tips and tricks: IOMMU Groups, inside and out.
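A group's membership can be inspected directly via sysfs; a quick diagnostic sketch, using the PCI address of my GTX 1070 (0000:0b:00.0) as the example device:

```shell
# Print the IOMMU group number the GPU at 0000:0b:00.0 belongs to ...
readlink -f /sys/bus/pci/devices/0000:0b:00.0/iommu_group
# ... and every device that would have to be passed through with it
ls /sys/bus/pci/devices/0000:0b:00.0/iommu_group/devices/
```

If the iommu_group symlink doesn't exist, IOMMU support isn't active.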

The number of IOMMU groups and the mapping of devices to groups can vary based on mainboard, BIOS version and CPU model. During my research into which board to buy, the Gigabyte X570 Aorus Pro (and also the Aorus Elite) were reported to have good IOMMU groups.

One of the best resources for VFIO pass-through is the Arch Wiki: PCI passthrough via OVMF.

The IOMMU groups

Back to the Gigabyte X570 Aorus Pro board's IOMMU groups with BIOS version F11. To enable SVM and IOMMU on the X570 board with the AMD Ryzen CPU, these BIOS settings have to be changed:

  • SVM -> Enable
  • IOMMU -> Enable
  • ACS Enable -> Enable
  • Enable AER cap -> Enable

and the kernel command line in the boot loader needs to be extended with amd_iommu=on iommu=pt.
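On Debian with GRUB this goes into /etc/default/grub; a minimal sketch (your existing GRUB_CMDLINE_LINUX_DEFAULT options may differ):

```shell
# /etc/default/grub -- append the IOMMU options to the existing defaults
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"
```

Then regenerate the boot loader configuration with update-grub and reboot.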

The resulting IOMMU groups are:

IOMMU group 17
	01:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961 [144d:a804]
IOMMU group 35
	0f:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU group 7
	00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU group 25
	04:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961 [144d:a804]
IOMMU group 15
	00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 61)
	00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU group 33
	0e:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]
IOMMU group 5
	00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU group 23
	03:09.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:57a4]
	09:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU group 13
	00:08.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]
IOMMU group 31
	0e:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP [1022:1485]
IOMMU group 3
	00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU group 21
	03:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:57a3]
IOMMU group 11
	00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU group 1
	00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU group 28
	0b:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81] (rev a1)
	0b:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
IOMMU group 18
	02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:57ad]
IOMMU group 36
	10:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU group 8
	00:05.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU group 26
	05:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
IOMMU group 16
	00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 0 [1022:1440]
	00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 1 [1022:1441]
	00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 2 [1022:1442]
	00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 3 [1022:1443]
	00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 4 [1022:1444]
	00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 5 [1022:1445]
	00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 6 [1022:1446]
	00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 7 [1022:1447]
IOMMU group 34
	0e:00.4 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller [1022:1487]
IOMMU group 6
	00:03.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU group 24
	03:0a.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:57a4]
	0a:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU group 14
	00:08.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]
IOMMU group 32
	0e:00.1 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP [1022:1486]
IOMMU group 4
	00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU group 22
	03:08.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:57a4]
	08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP [1022:1485]
	08:00.1 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]
	08:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]
IOMMU group 12
	00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]
IOMMU group 30
	0d:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function [1022:148a]
IOMMU group 2
	00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU group 20
	03:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:57a3]
IOMMU group 10
	00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]
IOMMU group 29
	0c:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK107 [GeForce GT 640] [10de:0fc1] (rev a1)
	0c:00.1 Audio device [0403]: NVIDIA Corporation GK107 HDMI Audio Controller [10de:0e1b] (rev a1)
IOMMU group 0
	00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU group 19
	03:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:57a3]
IOMMU group 9
	00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU group 27
	07:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)

The IOMMU groups and their mapped devices can be listed with the following script. If there is no output, either IOMMU support is not enabled or the hardware doesn't support it.

#!/usr/bin/env bash
for iommu_group in $(find /sys/kernel/iommu_groups/ -maxdepth 1 -mindepth 1 -type d); do
    echo "IOMMU group $(basename "$iommu_group")";
    for device in $(ls -1 "$iommu_group"/devices/); do
        echo -n $'\t'; lspci -nns "$device";
    done;
done

What do we get from this?
IOMMU group 28 has the GTX 1070 mapped together with its nVidia audio device, so there should be no problem passing these to the guest OS. This proved to be true. Yay!

IOMMU group 33 has a single USB3 controller mapped to it, so it should be possible to pass this one through to the guest OS. This didn't work as expected.
I found some posts [1] on the web about freezes and lockups of the host system immediately after starting a VM with this single USB controller passed through. I experienced the same freezes and lockups.
[1] Passing though a USB controller on a Gigabyte X570 Aorus Master Causing Freezing

IOMMU group 22, with two more USB controllers mapped to it, could be the remedy, but it also contains a PCI bridge and another device, and from reading [2] it was not clear whether these USB controllers could be used without the ACS override patch. Looked like a dead end.
[2] Integrated USB controller passthrough

Prepare to pass-through devices

Devices to be passed through need to be bound to the vfio-pci driver at boot so they are not claimed by their regular drivers (nouveau for the nVidia GPU and xhci_hcd for the USB3 controllers).

On Debian 10 ‘Buster’ this is done in initramfs with some modprobe and initramfs configuration.

As mentioned before, the kernel command line is extended with amd_iommu=on iommu=pt to enable IOMMU support in the kernel. Don't forget to run update-grub afterwards.

GPU

For the guest GPU, the file /etc/modprobe.d/vfio.conf defines the PCI IDs the vfio-pci module should bind to (the nVidia GPU and its audio device from IOMMU group 28):

options vfio-pci ids=10de:1b81,10de:10f0

and /etc/initramfs-tools/modules specifies the modules to include in the initramfs; in my case:

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
irqbypass
kvm
kvm_amd
xhci_hcd

After that, regenerate the initramfs with update-initramfs -k all -u and reboot. Whether vfio-pci has claimed the devices can be checked with lspci -nnk ("Kernel driver in use" line).

USB controllers

The onboard AMD USB3 controllers all share the same PCI ID 1022:149c, and letting vfio-pci bind to this ID at boot would leave me with no usable USB ports on the host. So that's not an option.

Investigating ways to selectively bind vfio-pci at boot to devices by PCI address yielded some possibilities [3][4], but nothing usable. Also, the modprobe.d man page says the modprobe install command is deprecated, so it's not something I want to invest time in. I think it could be made to work with some more effort.
[3] Arch Wiki: PCI passthrough via OVMF – Using identical guest and host GPUs
[4] Re: [vfio-users] How to set PCI address in “options vfio-pci” ?

USB headset / audio

The USB headset was another problem without a USB controller in the guest OS. I bought it to get rid of the crappy and troublesome sound pass-through from guest to host. That mostly works right after booting the guest, but at some point the sound gets choppy, glitches mix into the audio stream, and eventually sound output stops completely, or the quality gets so bad you have to take the headphones off.
Trying USB device pass-through for the headset (I'm using libvirt / virt-manager) seems to work in plain Windows 10 but doesn't work with any game. So not usable.

Input devices

For input devices like keyboard and mouse there is a latency-free alternative to passing through a USB controller: evdev [5]. This is a seamless event pass-through method built into the Linux kernel, and there are even virtio input drivers for Windows.
[5] Evdev Passthrough Explained — Cheap, Seamless VM Input

As explained in the previous link, enabling evdev is simple. Usable devices are exposed in two places, /dev/input and /dev/input/by-id, where the latter is preferred because the device names are constructed from the manufacturer and model names and are stable across reboots. First list all input devices in /dev/input/by-id:

# ls -1 /dev/input/by-id
usb-Dell_Dell_USB_Keyboard-event-if01
usb-Dell_Dell_USB_Keyboard-event-kbd
usb-Logitech_USB_Gaming_Mouse-event-mouse
usb-Logitech_USB_Gaming_Mouse-mouse
usb-Sharkoon_Technologies._Skiller_SGH2_3rd_Jun_2017-event-if03

The usable devices here are the keyboard usb-Dell_Dell_USB_Keyboard-event-kbd and the mouse usb-Logitech_USB_Gaming_Mouse-event-mouse. Refer to [5] on how to identify the correct devices.

After that, add the following snippet to the libvirt domain (VM) XML definition, replacing MOUSE_NAME and KEYBOARD_NAME with the device names found above. If there are <qemu:env> entries, place the <qemu:arg> lines above them.

<qemu:commandline>
  <qemu:arg value='-object'/>
  <qemu:arg value='input-linux,id=mouse1,evdev=/dev/input/by-id/MOUSE_NAME'/>
  <qemu:arg value='-object'/>
  <qemu:arg value='input-linux,id=kbd1,evdev=/dev/input/by-id/KEYBOARD_NAME,grab_all=on,repeat=on'/>
</qemu:commandline>
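Note that libvirt only accepts <qemu:commandline> elements if the qemu XML namespace is declared on the root element of the domain definition; otherwise the snippet is silently dropped on save. The opening domain tag should look like this:

```xml
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
```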

If it’s not working refer to the troubleshooting section of the above link [5].

The good thing about evdev is that when it works, it works well: pressing both Ctrl keys at the same time switches the input devices between host and guest. Fine.

One problem I found was that the Mouse4 button of my mouse wasn't recognized in some games. That's bad.

…back to USB controllers

With both the USB headset and the mouse / keyboard via evdev only partially working, I was back to searching for a way to pass a real USB controller through to the guest OS.
One option was to buy a new secondary graphics card for the host that occupies only one slot. Not something I had planned.

So back to trying to pass through the onboard USB3 controllers. The posts in Integrated USB controller passthrough suggested that it should be possible to pass through the two USB3 controllers from my IOMMU group 22.

I found some information about the driver_override sysfs file but didn't manage to get it to work at boot time. The last post in the forum thread Vfio driver override not working, about rebinding a device to a different driver at runtime, was an idea worth testing.
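For reference, driver_override targets a single device by PCI address instead of a PCI ID, which would sidestep the shared-ID problem. A sketch of how it is meant to be used (as root, with the address of one of the controllers from my IOMMU group 22):

```shell
# Make vfio-pci the only eligible driver for the device at 0000:08:00.1,
# regardless of its PCI ID
echo vfio-pci > /sys/bus/pci/devices/0000:08:00.1/driver_override
# Release the device from its current driver and let the kernel re-probe it;
# driver_override now steers it to vfio-pci
echo 0000:08:00.1 > /sys/bus/pci/drivers/xhci_hcd/unbind
echo 0000:08:00.1 > /sys/bus/pci/drivers_probe
```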

Unbind the two USB3 controllers in IOMMU group 22 from the xhci_hcd driver and bind them to vfio-pci, then add the PCI devices to the VM in virt-manager.
Run the first block before starting the VM and the second block after the VM has stopped. This worked and the host didn't freeze. Yay!

echo "0000:08:00.1" > /sys/bus/pci/drivers/xhci_hcd/unbind
echo "0000:08:00.3" > /sys/bus/pci/drivers/xhci_hcd/unbind
echo "1022 149c" > /sys/bus/pci/drivers/vfio-pci/new_id

echo "0000:08:00.1" > /sys/bus/pci/drivers/vfio-pci/unbind
echo "0000:08:00.3" > /sys/bus/pci/drivers/vfio-pci/unbind
echo "1022 149c" > /sys/bus/pci/drivers/vfio-pci/remove_id
echo "0000:08:00.1" > /sys/bus/pci/drivers/xhci_hcd/bind
echo "0000:08:00.3" > /sys/bus/pci/drivers/xhci_hcd/bind

Now integrate this into libvirt, which can execute hook scripts at different stages of VM operation. This is explained in Libvirt: Hooks for specific system management. For qemu, the script /etc/libvirt/hooks/qemu does all the work (make it executable and restart libvirtd so the hook is picked up):

#!/bin/bash
# match the desired domain and stage for execution
# see: Hooks for specific system management https://libvirt.org/hooks.html

# domain win10
# pass-through usb controllers with pci addresses 0000:08:00.1 and
# 0000:08:00.3 to guest OS.
# unbind from xhci_hcd driver and bind to vfio-pci before VM is started
# unbind from vfio-pci and bind to xhci_hcd when VM is stopped so devices
# are usable on host again

if [[ $1 == "win10" ]] && [[ $2 == "prepare" || $2 == "release" ]]
  then
    if [[ $2 == "prepare" ]]
      then
        # unbind usb controllers from driver xhci_hcd
        echo "0000:08:00.1" > /sys/bus/pci/drivers/xhci_hcd/unbind
        echo "0000:08:00.3" > /sys/bus/pci/drivers/xhci_hcd/unbind
        # add the pci id to the vfio-pci driver; all unbound devices with
        # this id are then claimed by it, manual binding is not necessary
        echo "1022 149c" > /sys/bus/pci/drivers/vfio-pci/new_id
      else
        # unbind usb controllers from driver vfio-pci
        # remove the pci id from the driver vfio-pci
        echo "0000:08:00.1" > /sys/bus/pci/drivers/vfio-pci/unbind
        echo "0000:08:00.3" > /sys/bus/pci/drivers/vfio-pci/unbind
        echo "1022 149c" > /sys/bus/pci/drivers/vfio-pci/remove_id
        # bind the now freed usb controllers to driver xhci_hcd
        echo "0000:08:00.1" > /sys/bus/pci/drivers/xhci_hcd/bind
        echo "0000:08:00.3" > /sys/bus/pci/drivers/xhci_hcd/bind
    fi
fi

I chose to swap the drivers for these devices before the domain starts (stage = prepare) and after the domain has stopped (stage = release).

Everything works now. As before, I use a hardware USB switch to switch my USB devices (keyboard, mouse, USB headset) to the computer I'm currently working on, be it the host, the guest or a laptop with a docking station on the same desk.

2 thoughts on “AMD Ryzen VFIO GPU pass-through – gaming VM on new hardware”

  1. I have been trying to achieve a similar configuration for days now, with no success. For some reason, I am getting the “Code:43” error no matter what. I followed your guide, added couple of modules in “vfio.conf” that I hadn’t before, but still the problem persists.
    I had it working before on an MSI X370 Gaming Plus + Intel i9 9900K, but as soon as I switched to GB Aorus X570 Master + Ryzen9 3900X, I cannot make it work…
    This is my thread on L1F: hxxps://forum.level1techs.com/t/cannot-resolve-nvidia-code-43-error-in-vfio/160590
    And here on Reddit: hxxps://www.reddit.com/r/VFIO/comments/i7u6sz/nvidia_error_code43/
    Any input/suggestion is welcome.

    • I read your threads on L1T and Reddit; it seems you'd already tried everything.

      What BIOS version are you on?
