Hi
I have two NVIDIA cards and have reserved one of them for passthrough in the advanced options.
When I add it to a VM and then start the VM, I get:
Code:
kernel: VFIO - User Level meta-driver version: 0.3
kernel: NVRM: Attempting to remove device 0000:03:00.0 with non-zero usage count!
After that, the whole libvirtd freezes, so it can't even be restarted.
Code:
# fuser -v /dev/nvidia0
                     USER        PID ACCESS COMMAND
/dev/nvidia0:        root      29724 F.... nvidia-device-p
# fuser -v /dev/nvidia1
                     USER        PID ACCESS COMMAND
/dev/nvidia1:        root      29724 F.... nvidia-device-p
# ps -ef |grep 29724
root     29724 29700  1 16:06 ?        00:00:00 nvidia-device-plugin
Code:
Thu Feb  2 16:17:48 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:03:00.0 Off |                  N/A |
| 35%   30C    P8    N/A /  75W |      0MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:04:00.0 Off |                  N/A |
| 22%   31C    P8     1W /  38W |      0MiB /  1024MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Code:
# dmesg |grep -i iommu
[    0.416758] iommu: Default domain type: Passthrough (set via kernel command line)
[    0.501256] DMAR: IOMMU feature sc_support inconsistent
[    0.501257] DMAR: IOMMU feature dev_iotlb_support inconsistent
[    0.501348] pci 0000:ff:0b.0: Adding to iommu group 0
[    0.501363] pci 0000:ff:0b.1: Adding to iommu group 0
[    0.501377] pci 0000:ff:0b.2: Adding to iommu group 0
[    0.501442] pci 0000:ff:0c.0: Adding to iommu group 1
[    0.501456] pci 0000:ff:0c.1: Adding to iommu group 1
[    0.501470] pci 0000:ff:0c.2: Adding to iommu group 1
[    0.501483] pci 0000:ff:0c.3: Adding to iommu group 1
[    0.501496] pci 0000:ff:0c.4: Adding to iommu group 1
[    0.501511] pci 0000:ff:0c.5: Adding to iommu group 1
[    0.501567] pci 0000:ff:0f.0: Adding to iommu group 2
[    0.501581] pci 0000:ff:0f.1: Adding to iommu group 2
[    0.501595] pci 0000:ff:0f.4: Adding to iommu group 2
[    0.501609] pci 0000:ff:0f.5: Adding to iommu group 2
[    0.501623] pci 0000:ff:0f.6: Adding to iommu group 2
[    0.501679] pci 0000:ff:10.0: Adding to iommu group 3
[    0.501693] pci 0000:ff:10.1: Adding to iommu group 3
[    0.501708] pci 0000:ff:10.5: Adding to iommu group 3
[    0.501723] pci 0000:ff:10.6: Adding to iommu group 3
[    0.501737] pci 0000:ff:10.7: Adding to iommu group 3
[    0.501767] pci 0000:ff:12.0: Adding to iommu group 4
[    0.501782] pci 0000:ff:12.1: Adding to iommu group 4
[    0.501863] pci 0000:ff:13.0: Adding to iommu group 5
[    0.501879] pci 0000:ff:13.1: Adding to iommu group 5
[    0.501894] pci 0000:ff:13.2: Adding to iommu group 5
[    0.501908] pci 0000:ff:13.3: Adding to iommu group 5
[    0.501923] pci 0000:ff:13.4: Adding to iommu group 5
[    0.501937] pci 0000:ff:13.5: Adding to iommu group 5
[    0.501952] pci 0000:ff:13.6: Adding to iommu group 5
[    0.501967] pci 0000:ff:13.7: Adding to iommu group 5
[    0.502030] pci 0000:ff:14.0: Adding to iommu group 6
[    0.502046] pci 0000:ff:14.1: Adding to iommu group 6
[    0.502061] pci 0000:ff:14.2: Adding to iommu group 6
[    0.502075] pci 0000:ff:14.3: Adding to iommu group 6
[    0.502090] pci 0000:ff:14.6: Adding to iommu group 6
[    0.502105] pci 0000:ff:14.7: Adding to iommu group 6
[    0.502153] pci 0000:ff:15.0: Adding to iommu group 7
[    0.502169] pci 0000:ff:15.1: Adding to iommu group 7
[    0.502185] pci 0000:ff:15.2: Adding to iommu group 7
[    0.502200] pci 0000:ff:15.3: Adding to iommu group 7
[    0.502237] pci 0000:ff:16.0: Adding to iommu group 8
[    0.502254] pci 0000:ff:16.6: Adding to iommu group 8
[    0.502270] pci 0000:ff:16.7: Adding to iommu group 8
[    0.502325] pci 0000:ff:17.0: Adding to iommu group 9
[    0.502342] pci 0000:ff:17.4: Adding to iommu group 9
[    0.502358] pci 0000:ff:17.5: Adding to iommu group 9
[    0.502375] pci 0000:ff:17.6: Adding to iommu group 9
[    0.502391] pci 0000:ff:17.7: Adding to iommu group 9
[    0.502445] pci 0000:ff:1e.0: Adding to iommu group 10
[    0.502462] pci 0000:ff:1e.1: Adding to iommu group 10
[    0.502480] pci 0000:ff:1e.2: Adding to iommu group 10
[    0.502497] pci 0000:ff:1e.3: Adding to iommu group 10
[    0.502513] pci 0000:ff:1e.4: Adding to iommu group 10
[    0.502542] pci 0000:ff:1f.0: Adding to iommu group 11
[    0.502560] pci 0000:ff:1f.2: Adding to iommu group 11
[    0.502573] pci 0000:00:00.0: Adding to iommu group 12
[    0.502589] pci 0000:00:01.0: Adding to iommu group 13
[    0.502604] pci 0000:00:01.1: Adding to iommu group 14
[    0.502617] pci 0000:00:02.0: Adding to iommu group 15
[    0.502630] pci 0000:00:03.0: Adding to iommu group 16
[    0.502645] pci 0000:00:05.0: Adding to iommu group 17
[    0.502658] pci 0000:00:05.1: Adding to iommu group 18
[    0.502672] pci 0000:00:05.2: Adding to iommu group 19
[    0.502686] pci 0000:00:05.4: Adding to iommu group 20
[    0.502700] pci 0000:00:11.0: Adding to iommu group 21
[    0.502722] pci 0000:00:11.4: Adding to iommu group 22
[    0.502736] pci 0000:00:14.0: Adding to iommu group 23
[    0.502757] pci 0000:00:16.0: Adding to iommu group 24
[    0.502772] pci 0000:00:19.0: Adding to iommu group 25
[    0.502785] pci 0000:00:1a.0: Adding to iommu group 26
[    0.502799] pci 0000:00:1b.0: Adding to iommu group 27
[    0.502812] pci 0000:00:1c.0: Adding to iommu group 28
[    0.502827] pci 0000:00:1c.2: Adding to iommu group 29
[    0.502841] pci 0000:00:1c.4: Adding to iommu group 30
[    0.502854] pci 0000:00:1d.0: Adding to iommu group 31
[    0.502892] pci 0000:00:1f.0: Adding to iommu group 32
[    0.502913] pci 0000:00:1f.2: Adding to iommu group 32
[    0.502932] pci 0000:00:1f.3: Adding to iommu group 32
[    0.502947] pci 0000:02:00.0: Adding to iommu group 33
[    0.502981] pci 0000:03:00.0: Adding to iommu group 34
[    0.503001] pci 0000:03:00.1: Adding to iommu group 34
[    0.503033] pci 0000:04:00.0: Adding to iommu group 35
[    0.503053] pci 0000:04:00.1: Adding to iommu group 35
[    0.503067] pci 0000:06:00.0: Adding to iommu group 36
[    0.503083] pci 0000:07:00.0: Adding to iommu group 37
What am I doing wrong here?
One thing I can think of that might play a role is that I installed one of the cards AFTER the system was installed. Maybe I need to trigger an update of the initramfs or something? But the card seems to work according to nvidia-smi.
EDIT:
OK, I killed the PID that I saw with fuser. All hell broke loose and libvirtd crashed, but after a reboot it was working.
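For anyone hitting the same NVRM message: the "non-zero usage count" suggests the nvidia driver was still bound to 0000:03:00.0 and in use when the VM tried to take the card. A rough sketch for checking which kernel driver a PCI device is bound to before starting the VM is below. The function name `bound_driver` and the `SYSFS_PCI` override are my own names for illustration, not part of any tool; the only thing it relies on is the standard `driver` symlink under /sys/bus/pci/devices.

```shell
# Print the kernel driver currently bound to a PCI device, read from sysfs.
# bound_driver and SYSFS_PCI are hypothetical names used for this sketch;
# SYSFS_PCI only exists so the lookup can be pointed at a different directory.
bound_driver() {
    dev="$1"
    link="${SYSFS_PCI:-/sys/bus/pci/devices}/$dev/driver"
    if [ -L "$link" ]; then
        # The symlink points at the driver directory; its last component is the driver name.
        basename "$(readlink "$link")"
    else
        # No driver symlink: nothing is bound, or the device does not exist.
        echo "none"
    fi
}

# Address taken from the NVRM message above.
bound_driver 0000:03:00.0
```

If this prints nvidia rather than vfio-pci for the card meant for passthrough, something on the host (in my case nvidia-device-plugin, as fuser showed) may still be holding it.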