July 25, 2019

Can't get this GPU to work with a Windows 10 VM. Using X370 Taichi with the latest 5.60 BIOS. The VM is set to use Q35 3.1. After a boot attempt unRAID can't be stopped or restarted via the usual ways and a hard restart or shutdown is necessary. PCI-stubbing and ACS override modes don't change the situation. Any suggestions on a fix?

Libvirt logs:

2019-07-25 01:33:46.190+0000: 12488: info : libvirt version: 5.1.0
2019-07-25 01:33:46.190+0000: 12488: info : hostname: helion
2019-07-25 01:33:46.190+0000: 12488: warning : qemuDomainObjTaint:7986 : Domain id=1 name='Windows 10' uuid=6458b0b6-a653-8a09-e831-77c6393c21ff is tainted: high-privileges
2019-07-25 01:33:46.190+0000: 12488: warning : qemuDomainObjTaint:7986 : Domain id=1 name='Windows 10' uuid=6458b0b6-a653-8a09-e831-77c6393c21ff is tainted: host-cpu
2019-07-25 01:33:50.014+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:50.014+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:51.015+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:51.015+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:52.017+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:52.017+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:53.009+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:53.009+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virHostdevReAttachPCIDevices:1071 : Failed to reset PCI device: internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virHostdevReAttachPCIDevices:1071 : Failed to reset PCI device: internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'

Kernel logs:

[ 418.335657] pcieport 0000:00:03.1: broadcast mmio_enabled message
[ 418.335660] pcieport 0000:00:03.1: broadcast resume message
[ 418.335665] pcieport 0000:00:03.1: AER: Device recovery successful
[ 418.489685] AMD-Vi: Completion-Wait loop timed out
[ 418.591165] virbr0: port 2(vnet0) entered forwarding state
[ 418.591167] virbr0: topology change detected, propagating
[ 418.627225] AMD-Vi: Completion-Wait loop timed out
[ 418.764387] AMD-Vi: Completion-Wait loop timed out
[ 418.907004] AMD-Vi: Completion-Wait loop timed out
[ 419.327643] iommu ivhd0: AMD-Vi: Event logged [
[ 419.327648] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd10]
[ 419.336519] pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:00.0
[ 419.336524] pcieport 0000:00:03.1: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[ 419.336526] pcieport 0000:00:03.1: device [1022:1453] error status/mask=00200000/04400000
[ 419.336528] pcieport 0000:00:03.1: [21] ACSViol (First)
[ 419.336531] pcieport 0000:00:03.1: broadcast error_detected message
[ 419.336562] pcieport 0000:00:03.1: broadcast mmio_enabled message
[ 419.336564] pcieport 0000:00:03.1: broadcast resume message
[ 419.336567] pcieport 0000:00:03.1: AER: Device recovery successful
[ 420.327558] iommu ivhd0: AMD-Vi: Event logged [
[ 420.327563] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd40]
[ 420.338660] pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:00.0
[ 420.338667] pcieport 0000:00:03.1: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[ 420.338670] pcieport 0000:00:03.1: device [1022:1453] error status/mask=00200000/04400000
[ 420.338673] pcieport 0000:00:03.1: [21] ACSViol (First)
[ 420.338677] pcieport 0000:00:03.1: broadcast error_detected message
[ 420.338730] pcieport 0000:00:03.1: broadcast mmio_enabled message
[ 420.338732] pcieport 0000:00:03.1: broadcast resume message
[ 420.338737] pcieport 0000:00:03.1: AER: Device recovery successful
[ 421.327525] iommu ivhd0: AMD-Vi: Event logged [
[ 421.327530] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd70]
[ 421.330873] pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:00.0
[ 421.330880] pcieport 0000:00:03.1: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[ 421.330884] pcieport 0000:00:03.1: device [1022:1453] error status/mask=00200000/04400000
[ 421.330886] pcieport 0000:00:03.1: [21] ACSViol (First)
[ 421.330890] pcieport 0000:00:03.1: broadcast error_detected message
[ 421.330942] pcieport 0000:00:03.1: broadcast mmio_enabled message
[ 421.330944] pcieport 0000:00:03.1: broadcast resume message
[ 421.330949] pcieport 0000:00:03.1: AER: Device recovery successful
[ 422.327738] iommu ivhd0: AMD-Vi: Event logged [
[ 422.327744] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bda0]
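For anyone comparing a similar capture, here is a minimal sketch that tallies the two repeating kernel messages per PCI address, which makes it easier to see that every IOTLB timeout points at the same device. The function name `summarize` and the `SAMPLE` excerpt are illustrative choices, not part of any unRAID tooling; the regular expressions assume the exact message wording shown in the kernel log above.

```python
import re
from collections import Counter

# Patterns assume the message wording seen in the kernel log above.
IOTLB_RE = re.compile(r"IOTLB_INV_TIMEOUT device=([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])")
AER_RE = re.compile(
    r"pcieport ([0-9a-f:.]+): AER: Uncorrected \(Non-Fatal\) error received"
)

def summarize(log_text):
    """Count IOTLB invalidation timeouts and uncorrected AER reports per PCI address."""
    return {
        "iotlb_timeouts": dict(Counter(IOTLB_RE.findall(log_text))),
        "aer_uncorrected": dict(Counter(AER_RE.findall(log_text))),
    }

# A short excerpt copied from the kernel log above (hypothetical variable name).
SAMPLE = """\
[ 419.327648] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd10]
[ 419.336519] pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:00.0
[ 419.336567] pcieport 0000:00:03.1: AER: Device recovery successful
[ 420.327563] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd40]
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Feeding it the full output of `dmesg` instead of `SAMPLE` should work the same way, provided the kernel prints these messages in the same format.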
July 25, 2019 (Author)

@limetech, can you please have a look? I have seen the same error (device [1022:1453] error status/mask=00200000/04400000) reported by other users on the forum, and this article states that downgrading the kernel makes everything work fine: https://www.micropissed.com/2018/05/amd-vi-completion-wait-loop-timed-out

I will attempt downgrading the BIOS in case the newest one broke something.
July 25, 2019 (Author)

I was only able to downgrade from 5.60 to 5.50 on X370 Taichi (https://www.asrock.com/mb/AMD/X370 Taichi/index.asp#BIOS) and it did not change the situation. Downgrading to an earlier BIOS version is not possible, as described here: https://forum.level1techs.com/t/attention-amd-vfio-users-do-not-update-your-bios/142685

@limetech, is it possible for the next version of unRAID to include the patch mentioned in the topic above? The diff can be seen here: https://clbin.com/VCiYJ. According to users on the L1 forums, this is a fully working fix. Any tips on applying this before a new unRAID release are welcome.