realies Posted July 25, 2019

Can't get this GPU to work with a Windows 10 VM. Using X370 Taichi with the latest 5.60 BIOS. The VM is set to use Q35 3.1. After a boot attempt unRAID can't be stopped or restarted via the usual ways and a hard restart or shutdown is necessary. PCI-stubbing and ACS override modes don't change the situation. Any suggestions on a fix?

Libvirt logs:

2019-07-25 01:33:46.190+0000: 12488: info : libvirt version: 5.1.0
2019-07-25 01:33:46.190+0000: 12488: info : hostname: helion
2019-07-25 01:33:46.190+0000: 12488: warning : qemuDomainObjTaint:7986 : Domain id=1 name='Windows 10' uuid=6458b0b6-a653-8a09-e831-77c6393c21ff is tainted: high-privileges
2019-07-25 01:33:46.190+0000: 12488: warning : qemuDomainObjTaint:7986 : Domain id=1 name='Windows 10' uuid=6458b0b6-a653-8a09-e831-77c6393c21ff is tainted: host-cpu
2019-07-25 01:33:50.014+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:50.014+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:51.015+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:51.015+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:52.017+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:52.017+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:53.009+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:53.009+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virHostdevReAttachPCIDevices:1071 : Failed to reset PCI device: internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virHostdevReAttachPCIDevices:1071 : Failed to reset PCI device: internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'

Kernel logs:

[ 418.335657] pcieport 0000:00:03.1: broadcast mmio_enabled message
[ 418.335660] pcieport 0000:00:03.1: broadcast resume message
[ 418.335665] pcieport 0000:00:03.1: AER: Device recovery successful
[ 418.489685] AMD-Vi: Completion-Wait loop timed out
[ 418.591165] virbr0: port 2(vnet0) entered forwarding state
[ 418.591167] virbr0: topology change detected, propagating
[ 418.627225] AMD-Vi: Completion-Wait loop timed out
[ 418.764387] AMD-Vi: Completion-Wait loop timed out
[ 418.907004] AMD-Vi: Completion-Wait loop timed out
[ 419.327643] iommu ivhd0: AMD-Vi: Event logged [
[ 419.327648] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd10]
[ 419.336519] pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:00.0
[ 419.336524] pcieport 0000:00:03.1: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[ 419.336526] pcieport 0000:00:03.1: device [1022:1453] error status/mask=00200000/04400000
[ 419.336528] pcieport 0000:00:03.1: [21] ACSViol (First)
[ 419.336531] pcieport 0000:00:03.1: broadcast error_detected message
[ 419.336562] pcieport 0000:00:03.1: broadcast mmio_enabled message
[ 419.336564] pcieport 0000:00:03.1: broadcast resume message
[ 419.336567] pcieport 0000:00:03.1: AER: Device recovery successful
[ 420.327558] iommu ivhd0: AMD-Vi: Event logged [
[ 420.327563] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd40]
[ 420.338660] pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:00.0
[ 420.338667] pcieport 0000:00:03.1: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[ 420.338670] pcieport 0000:00:03.1: device [1022:1453] error status/mask=00200000/04400000
[ 420.338673] pcieport 0000:00:03.1: [21] ACSViol (First)
[ 420.338677] pcieport 0000:00:03.1: broadcast error_detected message
[ 420.338730] pcieport 0000:00:03.1: broadcast mmio_enabled message
[ 420.338732] pcieport 0000:00:03.1: broadcast resume message
[ 420.338737] pcieport 0000:00:03.1: AER: Device recovery successful
[ 421.327525] iommu ivhd0: AMD-Vi: Event logged [
[ 421.327530] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd70]
[ 421.330873] pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:00.0
[ 421.330880] pcieport 0000:00:03.1: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[ 421.330884] pcieport 0000:00:03.1: device [1022:1453] error status/mask=00200000/04400000
[ 421.330886] pcieport 0000:00:03.1: [21] ACSViol (First)
[ 421.330890] pcieport 0000:00:03.1: broadcast error_detected message
[ 421.330942] pcieport 0000:00:03.1: broadcast mmio_enabled message
[ 421.330944] pcieport 0000:00:03.1: broadcast resume message
[ 421.330949] pcieport 0000:00:03.1: AER: Device recovery successful
[ 422.327738] iommu ivhd0: AMD-Vi: Event logged [
[ 422.327744] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bda0]
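A note on the libvirt error above: the value '127' is what you get when the header-type byte of PCI config space (offset 0x0E) reads back as 0xFF and the multi-function bit is masked off (0xFF & 0x7F = 127). That pattern usually suggests the device has stopped answering config reads, which would fit the IOTLB_INV_TIMEOUT entries for 0d:00.0. Below is a minimal Python sketch for checking this on the host. It assumes the GPU sits at 0000:0d:00.0 (the 0000 domain is a guess, the log only shows 0d:00.0) and that the standard Linux sysfs `config` file is readable; the function names are made up for illustration.

```python
from pathlib import Path

PCI_HEADER_TYPE_OFFSET = 0x0E  # header-type byte in PCI config space
PCI_HEADER_TYPE_MASK = 0x7F    # bit 7 is the multi-function flag


def decode_header_type(config: bytes) -> int:
    """Return the header type (0 = endpoint, 1 = bridge, 2 = CardBus)."""
    if len(config) <= PCI_HEADER_TYPE_OFFSET:
        raise ValueError("config space dump is too short")
    return config[PCI_HEADER_TYPE_OFFSET] & PCI_HEADER_TYPE_MASK


def looks_unresponsive(config: bytes) -> bool:
    """True if the standard 64-byte header is all 0xFF (device not answering)."""
    head = config[:64]
    return len(head) > 0 and all(b == 0xFF for b in head)


def read_config(bdf: str) -> bytes:
    """Read raw config space from sysfs; bdf is e.g. '0000:0d:00.0' (assumed)."""
    return Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()
```

On the affected host, `decode_header_type(read_config("0000:0d:00.0"))` returning 127 after a failed VM start would point at the card having dropped off the bus rather than at a libvirt configuration problem.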
realies Posted July 25, 2019

@limetech can you please have a look? I have seen the same error (device [1022:1453] error status/mask=00200000/04400000) reported by other users on the forum, and this article states that downgrading the kernel makes everything work fine: https://www.micropissed.com/2018/05/amd-vi-completion-wait-loop-timed-out

I will attempt downgrading the BIOS in case the newest one broke something.
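For anyone comparing their own logs against this one, the status/mask pair can be decoded by hand: each set bit in the status word is one AER error, and the kernel log itself names bit 21 as ACSViol. Here is a rough Python sketch that pulls the two hex words out of a log line and lists which bits are set and which reported errors are not masked. Only the bit-21 name is taken from the log; the helper names are hypothetical and the other bits are deliberately left unnamed.

```python
import re
from typing import List, Tuple

ACS_VIOLATION_BIT = 21  # the kernel log prints "[21] ACSViol" for this bit

STATUS_MASK_RE = re.compile(r"error status/mask=([0-9a-fA-F]{8})/([0-9a-fA-F]{8})")


def parse_status_mask(line: str) -> Tuple[int, int]:
    """Extract (status, mask) as integers from an AER 'status/mask=' log line."""
    match = STATUS_MASK_RE.search(line)
    if match is None:
        raise ValueError("no 'error status/mask=' field in line")
    return int(match.group(1), 16), int(match.group(2), 16)


def set_bits(value: int) -> List[int]:
    """Return the positions of all set bits in a 32-bit register value."""
    return [bit for bit in range(32) if (value >> bit) & 1]


def unmasked_errors(status: int, mask: int) -> int:
    """Errors that are flagged in status and not suppressed by mask."""
    return status & ~mask & 0xFFFFFFFF


line = "pcieport 0000:00:03.1: device [1022:1453] error status/mask=00200000/04400000"
status, mask = parse_status_mask(line)
print("status bits:", set_bits(status))
print("mask bits:", set_bits(mask))
print("ACS violation reported:", ACS_VIOLATION_BIT in set_bits(unmasked_errors(status, mask)))
```

If two systems show the same unmasked bit on the same root port device ID, that is a reasonable hint (not proof) that they are hitting the same issue.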
realies Posted July 25, 2019

I was only able to downgrade from 5.60 to 5.50 on the X370 Taichi (https://www.asrock.com/mb/AMD/X370 Taichi/index.asp#BIOS) and it did not change the situation. Downgrading to an earlier BIOS version is not possible, as described here: https://forum.level1techs.com/t/attention-amd-vfio-users-do-not-update-your-bios/142685

@limetech is it possible for the next version of unRAID to include the patch mentioned in the topic above? The diff can be seen at https://clbin.com/VCiYJ and, according to users on the L1 forums, it is a fully working fix. Any tips on applying it before a new unRAID release are welcome.