James Cameron
Cadet
- Joined
- Jul 23, 2022
- Messages
- 9
I've been attempting to create a Windows VM and pass through a GTX 1070, but I'm running into an issue. The VM runs perfectly fine without the GPU, but fails to boot once I pass through the GPU to the VM. I don't understand what the error message is telling me or how I can resolve the issue.
I did some Googling before posting and saw references to IOMMU groups. I ran the "lspci -v" command in shell to provide a list of PCI devices in hopes that it provides useful info. I've tried the ACS patch as described here and in other similar threads, but the issue persists: https://www.truenas.com/community/threads/how-to-edit-grub_cmdline_linux_default.95245/#post-678380.
I'm not sure if this is relevant, but I'll note this motherboard has two PCIe x16 slots with what I understand to be shared PCIe lanes between them (shown as X16/NA or X8/X8 on page 20 of the motherboard manual linked below). I have the GPU in the top x16 slot and the Dell H310 HBA in the bottom x16 slot (it's an x8 device, but I don't have any x8 slots), so my assumption is that each device should have access to 8 PCIe lanes. Could sharing these PCIe lanes put both devices in the same IOMMU group, and is it a problem to pass through only one device from that group to the VM?
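To make the IOMMU group question concrete, here is a minimal Python sketch of how group membership could be read directly from sysfs. It assumes the usual Linux layout (`/sys/kernel/iommu_groups/<group>/devices/<address>` and a `driver` symlink under `/sys/bus/pci/devices/<address>`). The function name `list_iommu_groups` and its parameters are hypothetical, and this is not something TrueNAS ships.

```python
import os
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups", pci_root="/sys/bus/pci/devices"):
    """Return {group_number: [(pci_address, driver_name_or_None), ...]}.

    Both default paths are assumptions about a standard Linux sysfs layout.
    """
    groups = {}
    root_path = Path(root)
    if not root_path.is_dir():
        # IOMMU not enabled, or sysfs is laid out differently on this system.
        return groups
    for group_dir in root_path.iterdir():
        if not group_dir.name.isdigit():
            continue
        devices = []
        for dev in sorted((group_dir / "devices").iterdir()):
            driver_link = Path(pci_root) / dev.name / "driver"
            driver = None
            if driver_link.is_symlink():
                # The symlink target's last component is the bound driver name.
                driver = os.path.basename(os.readlink(driver_link))
            devices.append((dev.name, driver))
        groups[int(group_dir.name)] = devices
    return groups


if __name__ == "__main__":
    for number, devices in sorted(list_iommu_groups().items()):
        for address, driver in devices:
            print(f"group {number}: {address} driver={driver or '(none)'}")
```

Run on the host shell, this would print one line per device, which makes it easy to see whether the GPU and the HBA end up in the same group.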
I'd appreciate any support that this fantastic community can offer!
Configuration
OS: TrueNAS Scale 22.02.4 (current)
Motherboard: Supermicro C7Z270-CG-L (manual: https://www.supermicro.com/manuals/motherboard/Z270/MNL-1913.pdf)
CPU: Intel Core i7-7700K
PCIe Device #1 (GPU): GTX 1070
PCIe Device #2 (HBA): Dell H310
Error Message
Code:
[EFAULT] internal error: qemu unexpectedly closed the monitor: 2022-09-26T21:55:43.089676Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.0,addr=0x7: vfio 0000:01:00.0: group 1 is not viable Please ensure all devices within the iommu_group are bound to their vfio bus driver.
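The useful part of that message is the PCI address and the group number. As a small illustration, the sketch below pulls both out with a regular expression. The pattern and the helper name `parse_not_viable` are my own assumptions based on the wording of this one message, not a documented QEMU format.

```python
import re

# Pattern inferred from the single error message in this post (assumption).
VFIO_ERROR = re.compile(
    r"vfio (?P<address>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"group (?P<group>\d+) is not viable"
)


def parse_not_viable(message):
    """Return (pci_address, group_number), or None if the text does not match."""
    match = VFIO_ERROR.search(message)
    if match is None:
        return None
    return match.group("address"), int(match.group("group"))


message = (
    "qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,"
    "bus=pci.0,addr=0x7: vfio 0000:01:00.0: group 1 is not viable"
)
print(parse_not_viable(message))  # ('0000:01:00.0', 1)
```

So the failing device is the GPU at 0000:01:00.0, and the group to inspect is IOMMU group 1.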
Error Message - More Info
Code:
Error: Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/supervisor/supervisor_base.py", line 165, in start
    if self.domain.create() < 0:
  File "/usr/lib/python3/dist-packages/libvirt.py", line 1353, in create
    raise libvirtError('virDomainCreate() failed')
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2022-09-26T21:55:43.089676Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.0,addr=0x7: vfio 0000:01:00.0: group 1 is not viable Please ensure all devices within the iommu_group are bound to their vfio bus driver.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 176, in call_method
    result = await self.middleware._call(message['method'], serviceobj, methodobj, params, app=self)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1293, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1272, in nf
    return await func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1140, in nf
    res = await f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/vm_lifecycle.py", line 39, in start
    await self.middleware.run_in_thread(self._start, vm['name'])
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1208, in run_in_thread
    return await self.run_in_executor(self.thread_pool_executor, method, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1205, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/vm_supervisor.py", line 62, in _start
    self.vms[vm_name].start(vm_data=self._vm_from_name(vm_name))
  File "/usr/lib/python3/dist-packages/middlewared/plugins/vm/supervisor/supervisor_base.py", line 174, in start
    raise CallError('\n'.join(errors))
middlewared.service_exception.CallError: [EFAULT] internal error: qemu unexpectedly closed the monitor: 2022-09-26T21:55:43.089676Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.0,addr=0x7: vfio 0000:01:00.0: group 1 is not viable Please ensure all devices within the iommu_group are bound to their vfio bus driver.
lspci -v Output
Code:
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05)
    Subsystem: Super Micro Computer Inc Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
    Flags: bus master, fast devsel, latency 0, IOMMU group 0
    Capabilities: [e0] Vendor Specific Information: Len=10 <?>
    Kernel driver in use: skl_uncore

00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 05) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 122, IOMMU group 1
    Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
    I/O behind bridge: 0000e000-0000efff [size=4K]
    Memory behind bridge: de000000-df0fffff [size=17M]
    Prefetchable memory behind bridge: 00000000c0000000-00000000d1ffffff [size=288M]
    Capabilities: [88] Subsystem: Super Micro Computer Inc 6th-9th Gen Core Processor PCIe Controller (x16)
    Capabilities: [80] Power Management version 3
    Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
    Capabilities: [a0] Express Root Port (Slot+), MSI 00
    Capabilities: [100] Virtual Channel
    Capabilities: [140] Root Complex Link
    Capabilities: [d94] Secondary PCI Express
    Kernel driver in use: pcieport

00:01.1 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) (rev 05) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 123, IOMMU group 1
    Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
    I/O behind bridge: 0000d000-0000dfff [size=4K]
    Memory behind bridge: df100000-df2fffff [size=2M]
    Prefetchable memory behind bridge: [disabled]
    Capabilities: [88] Subsystem: Super Micro Computer Inc Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8)
    Capabilities: [80] Power Management version 3
    Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
    Capabilities: [a0] Express Root Port (Slot+), MSI 00
    Capabilities: [100] Virtual Channel
    Capabilities: [140] Root Complex Link
    Capabilities: [d94] Secondary PCI Express
    Kernel driver in use: pcieport

00:02.0 VGA compatible controller: Intel Corporation HD Graphics 630 (rev 04) (prog-if 00 [VGA controller])
    Subsystem: Super Micro Computer Inc HD Graphics 630
    Flags: bus master, fast devsel, latency 0, IRQ 160, IOMMU group 2
    Memory at dd000000 (64-bit, non-prefetchable) [size=16M]
    Memory at b0000000 (64-bit, prefetchable) [size=256M]
    I/O ports at f000
    Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
    Capabilities: [40] Vendor Specific Information: Len=0c <?>
    Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
    Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
    Capabilities: [d0] Power Management version 2
    Capabilities: [100] Process Address Space ID (PASID)
    Capabilities: [200] Address Translation Service (ATS)
    Capabilities: [300] Page Request Interface (PRI)
    Kernel driver in use: i915
    Kernel modules: i915

00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
    Subsystem: Super Micro Computer Inc Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
    Flags: fast devsel, IRQ 11, IOMMU group 3
    Memory at df64f000 (64-bit, non-prefetchable) [disabled] [size=4K]
    Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
    Capabilities: [dc] Power Management version 2
    Capabilities: [f0] PCI Advanced Features

00:14.0 USB controller: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller (prog-if 30 [XHCI])
    Subsystem: Super Micro Computer Inc 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller
    Flags: bus master, medium devsel, latency 0, IRQ 129, IOMMU group 4
    Memory at df630000 (64-bit, non-prefetchable) [size=64K]
    Capabilities: [70] Power Management version 2
    Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
    Kernel driver in use: xhci_hcd
    Kernel modules: xhci_pci

00:14.2 Signal processing controller: Intel Corporation 200 Series PCH Thermal Subsystem
    Subsystem: Super Micro Computer Inc 200 Series PCH Thermal Subsystem
    Flags: fast devsel, IRQ 5, IOMMU group 4
    Memory at df64e000 (64-bit, non-prefetchable) [size=4K]
    Capabilities: [50] Power Management version 3
    Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-

00:16.0 Communication controller: Intel Corporation 200 Series PCH CSME HECI #1
    Subsystem: Super Micro Computer Inc 200 Series PCH CSME HECI
    Flags: bus master, fast devsel, latency 0, IRQ 159, IOMMU group 5
    Memory at df64d000 (64-bit, non-prefetchable) [size=4K]
    Capabilities: [50] Power Management version 3
    Capabilities: [8c] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Kernel driver in use: mei_me
    Kernel modules: mei_me

00:17.0 SATA controller: Intel Corporation 200 Series PCH SATA controller [AHCI mode] (prog-if 01 [AHCI 1.0])
    Subsystem: Super Micro Computer Inc 200 Series PCH SATA controller [AHCI mode]
    Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 158, IOMMU group 6
    Memory at df648000 (32-bit, non-prefetchable) [size=8K]
    Memory at df64c000 (32-bit, non-prefetchable)
    I/O ports at f090
    I/O ports at f080
    I/O ports at f060
    Memory at df64b000 (32-bit, non-prefetchable) [size=2K]
    Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
    Capabilities: [70] Power Management version 3
    Capabilities: [a8] SATA HBA v1.0
    Kernel driver in use: ahci
    Kernel modules: ahci

00:1b.0 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #19 (rev f0) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 124, IOMMU group 7
    Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
    I/O behind bridge: [disabled]
    Memory behind bridge: [disabled]
    Prefetchable memory behind bridge: [disabled]
    Capabilities: [40] Express Root Port (Slot+), MSI 00
    Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
    Capabilities: [90] Subsystem: Super Micro Computer Inc 200 Series PCH PCI Express Root Port
    Capabilities: [a0] Power Management version 3
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [140] Access Control Services
    Capabilities: [220] Secondary PCI Express
    Kernel driver in use: pcieport

00:1b.3 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #20 (rev f0) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 125, IOMMU group 8
    Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
    I/O behind bridge: [disabled]
    Memory behind bridge: [disabled]
    Prefetchable memory behind bridge: [disabled]
    Capabilities: [40] Express Root Port (Slot+), MSI 00
    Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
    Capabilities: [90] Subsystem: Super Micro Computer Inc 200 Series PCH PCI Express Root Port
    Capabilities: [a0] Power Management version 3
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [140] Access Control Services
    Capabilities: [220] Secondary PCI Express
    Kernel driver in use: pcieport

00:1b.4 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #21 (rev f0) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 126, IOMMU group 9
    Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
    I/O behind bridge: [disabled]
    Memory behind bridge: df500000-df5fffff [size=1M]
    Prefetchable memory behind bridge: [disabled]
    Capabilities: [40] Express Root Port (Slot+), MSI 00
    Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
    Capabilities: [90] Subsystem: Super Micro Computer Inc 200 Series PCH PCI Express Root Port
    Capabilities: [a0] Power Management version 3
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [140] Access Control Services
    Capabilities: [220] Secondary PCI Express
    Kernel driver in use: pcieport

00:1c.0 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #1 (rev f0) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 127, IOMMU group 10
    Bus: primary=00, secondary=06, subordinate=06, sec-latency=0
    I/O behind bridge: [disabled]
    Memory behind bridge: df400000-df4fffff [size=1M]
    Prefetchable memory behind bridge: [disabled]
    Capabilities: [40] Express Root Port (Slot+), MSI 00
    Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
    Capabilities: [90] Subsystem: Super Micro Computer Inc 200 Series PCH PCI Express Root Port
    Capabilities: [a0] Power Management version 3
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [140] Access Control Services
    Capabilities: [220] Secondary PCI Express
    Kernel driver in use: pcieport

00:1d.0 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #9 (rev f0) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 128, IOMMU group 11
    Bus: primary=00, secondary=07, subordinate=07, sec-latency=0
    I/O behind bridge: [disabled]
    Memory behind bridge: df300000-df3fffff [size=1M]
    Prefetchable memory behind bridge: [disabled]
    Capabilities: [40] Express Root Port (Slot+), MSI 00
    Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
    Capabilities: [90] Subsystem: Super Micro Computer Inc 200 Series PCH PCI Express Root Port
    Capabilities: [a0] Power Management version 3
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [140] Access Control Services
    Capabilities: [220] Secondary PCI Express
    Kernel driver in use: pcieport

00:1f.0 ISA bridge: Intel Corporation 200 Series PCH LPC Controller (Z270)
    Subsystem: Super Micro Computer Inc 200 Series PCH LPC Controller (Z270)
    Flags: bus master, medium devsel, latency 0, IOMMU group 12

00:1f.2 Memory controller: Intel Corporation 200 Series/Z370 Chipset Family Power Management Controller
    Subsystem: Super Micro Computer Inc 200 Series/Z370 Chipset Family Power Management Controller
    Flags: bus master, fast devsel, latency 0, IOMMU group 12
    Memory at df644000 (32-bit, non-prefetchable) [size=16K]

00:1f.3 Audio device: Intel Corporation 200 Series PCH HD Audio
    Subsystem: Super Micro Computer Inc 200 Series PCH HD Audio
    Flags: bus master, fast devsel, latency 32, IRQ 161, IOMMU group 12
    Memory at df640000 (64-bit, non-prefetchable) [size=16K]
    Memory at df620000 (64-bit, non-prefetchable) [size=64K]
    Capabilities: [50] Power Management version 3
    Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel

00:1f.4 SMBus: Intel Corporation 200 Series/Z370 Chipset Family SMBus Controller
    Subsystem: Super Micro Computer Inc 200 Series/Z370 Chipset Family SMBus Controller
    Flags: medium devsel, IRQ 16, IOMMU group 12
    Memory at df64a000 (64-bit, non-prefetchable)
    I/O ports at f040
    Kernel driver in use: i801_smbus
    Kernel modules: i2c_i801

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V
    DeviceName: Intel Ethernet i219-V
    Subsystem: Super Micro Computer Inc Ethernet Connection (2) I219-V
    Flags: bus master, fast devsel, latency 0, IRQ 133, IOMMU group 13
    Memory at df600000 (32-bit, non-prefetchable) [size=128K]
    Capabilities: [c8] Power Management version 3
    Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [e0] PCI Advanced Features
    Kernel driver in use: e1000e
    Kernel modules: e1000e

01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: ASUSTeK Computer Inc. GP104 [GeForce GTX 1070]
    Flags: fast devsel, IRQ 11, IOMMU group 1
    Memory at de000000 (32-bit, non-prefetchable) [disabled] [size=16M]
    Memory at c0000000 (64-bit, prefetchable) [disabled] [size=256M]
    Memory at d0000000 (64-bit, prefetchable) [disabled] [size=32M]
    I/O ports at e000 [disabled]
    Expansion ROM at df000000 [disabled] [size=512K]
    Capabilities: [60] Power Management version 3
    Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [78] Express Legacy Endpoint, MSI 00
    Capabilities: [100] Virtual Channel
    Capabilities: [250] Latency Tolerance Reporting
    Capabilities: [128] Power Budgeting <?>
    Capabilities: [420] Advanced Error Reporting
    Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Capabilities: [900] Secondary PCI Express
    Kernel driver in use: vfio-pci
    Kernel modules: nouveau, nvidia_current_drm, nvidia_current

01:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
    Subsystem: ASUSTeK Computer Inc. GP104 High Definition Audio Controller
    Flags: bus master, fast devsel, latency 0, IRQ 10, IOMMU group 1
    Memory at df080000 (32-bit, non-prefetchable) [size=16K]
    Capabilities: [60] Power Management version 3
    Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [78] Express Endpoint, MSI 00
    Capabilities: [100] Advanced Error Reporting
    Kernel driver in use: vfio-pci
    Kernel modules: snd_hda_intel

02:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
    Subsystem: Dell 6Gbps SAS HBA Adapter
    Flags: bus master, fast devsel, latency 0, IRQ 17, IOMMU group 1
    I/O ports at d000
    Memory at df240000 (64-bit, non-prefetchable) [size=64K]
    Memory at df200000 (64-bit, non-prefetchable) [size=256K]
    Expansion ROM at df100000 [disabled] [size=1M]
    Capabilities: [50] Power Management version 3
    Capabilities: [68] Express Endpoint, MSI 00
    Capabilities: [d0] Vital Product Data
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [138] Power Budgeting <?>
    Kernel driver in use: mpt3sas
    Kernel modules: mpt3sas

05:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. Device 500f (rev 03) (prog-if 02 [NVM Express])
    Subsystem: Kingston Technology Company, Inc. Device 500f
    Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0, IOMMU group 14
    Memory at df500000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [40] Power Management version 3
    Capabilities: [50] MSI: Enable- Count=1/8 Maskable+ 64bit+
    Capabilities: [70] Express Endpoint, MSI 00
    Capabilities: [b0] MSI-X: Enable+ Count=16 Masked-
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [158] Secondary PCI Express
    Capabilities: [178] Latency Tolerance Reporting
    Capabilities: [180] L1 PM Substates
    Kernel driver in use: nvme
    Kernel modules: nvme

06:00.0 USB controller: ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller (prog-if 30 [XHCI])
    Subsystem: Super Micro Computer Inc ASM1142 USB 3.1 Host Controller
    Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 15
    Memory at df400000 (64-bit, non-prefetchable) [size=32K]
    Capabilities: [50] MSI: Enable- Count=1/8 Maskable- 64bit+
    Capabilities: [68] MSI-X: Enable+ Count=8 Masked-
    Capabilities: [78] Power Management version 3
    Capabilities: [80] Express Endpoint, MSI 00
    Capabilities: [100] Virtual Channel
    Capabilities: [200] Advanced Error Reporting
    Capabilities: [280] Secondary PCI Express
    Capabilities: [300] Latency Tolerance Reporting
    Kernel driver in use: xhci_hcd
    Kernel modules: xhci_pci

07:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. Device 500f (rev 03) (prog-if 02 [NVM Express])
    Subsystem: Kingston Technology Company, Inc. Device 500f
    Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0, IOMMU group 16
    Memory at df300000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [40] Power Management version 3
    Capabilities: [50] MSI: Enable- Count=1/8 Maskable+ 64bit+
    Capabilities: [70] Express Endpoint, MSI 00
    Capabilities: [b0] MSI-X: Enable+ Count=16 Masked-
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [158] Secondary PCI Express
    Capabilities: [178] Latency Tolerance Reporting
    Capabilities: [180] L1 PM Substates
    Kernel driver in use: nvme
    Kernel modules: nvme
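To read that output by IOMMU group rather than by bus address, a small parser helps. The sketch below assumes the usual multi-line `lspci -v` layout (device header at the start of a line, details indented beneath it) and uses a trimmed excerpt of the output above as sample input. The helper names `parse_devices` and `blockers` are hypothetical. Treating bridges bound to `pcieport` and devices with no driver as non-blocking is my assumption about how the vfio viability check behaves, so take the result as a hint rather than a diagnosis.

```python
import re

DEVICE_LINE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) ")
GROUP = re.compile(r"IOMMU group (\d+)")
DRIVER = re.compile(r"Kernel driver in use: (\S+)")

# Trimmed excerpt of the lspci -v output from this post (most lines omitted).
SAMPLE = """\
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 05) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 122, IOMMU group 1
    Kernel driver in use: pcieport
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1) (prog-if 00 [VGA controller])
    Flags: fast devsel, IRQ 11, IOMMU group 1
    Kernel driver in use: vfio-pci
02:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
    Flags: bus master, fast devsel, latency 0, IRQ 17, IOMMU group 1
    Kernel driver in use: mpt3sas
05:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. Device 500f (rev 03) (prog-if 02 [NVM Express])
    Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0, IOMMU group 14
    Kernel driver in use: nvme
"""


def parse_devices(text):
    """Return one dict per device with its address, IOMMU group and driver."""
    devices = []
    for line in text.splitlines():
        header = DEVICE_LINE.match(line)
        if header:
            devices.append({"address": header.group(1), "group": None, "driver": None})
        elif devices:
            group = GROUP.search(line)
            if group:
                devices[-1]["group"] = int(group.group(1))
            driver = DRIVER.search(line)
            if driver:
                devices[-1]["driver"] = driver.group(1)
    return devices


def blockers(devices, group):
    """Devices in `group` still bound to a host driver.

    Assumption: vfio-pci, pcieport (bridges) and unbound devices do not block.
    """
    allowed = (None, "vfio-pci", "pcieport")
    return [d["address"] for d in devices
            if d["group"] == group and d["driver"] not in allowed]


print(blockers(parse_devices(SAMPLE), 1))  # ['02:00.0']
```

On the sample, the only device in group 1 still bound to a host driver is the HBA at 02:00.0 (`mpt3sas`), which would be consistent with the "group 1 is not viable" error.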