RX580 passthrough on X370



I can't get this GPU to work with a Windows 10 VM. I'm using an X370 Taichi with the latest 5.60 BIOS, and the VM is set to use Q35 3.1. After a boot attempt, unRAID can't be stopped or restarted in the usual ways, so a hard reset or power-off is necessary. Neither PCI stubbing nor the ACS override modes change the situation. Any suggestions for a fix?

 

Libvirt logs

Quote

2019-07-25 01:33:46.190+0000: 12488: info : libvirt version: 5.1.0
2019-07-25 01:33:46.190+0000: 12488: info : hostname: helion
2019-07-25 01:33:46.190+0000: 12488: warning : qemuDomainObjTaint:7986 : Domain id=1 name='Windows 10' uuid=6458b0b6-a653-8a09-e831-77c6393c21ff is tainted: high-privileges
2019-07-25 01:33:46.190+0000: 12488: warning : qemuDomainObjTaint:7986 : Domain id=1 name='Windows 10' uuid=6458b0b6-a653-8a09-e831-77c6393c21ff is tainted: host-cpu
2019-07-25 01:33:50.014+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:50.014+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:51.015+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:51.015+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:52.017+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:52.017+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:53.009+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:53.009+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virHostdevReAttachPCIDevices:1071 : Failed to reset PCI device: internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:56.799+0000: 12493: error : virHostdevReAttachPCIDevices:1071 : Failed to reset PCI device: internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
2019-07-25 01:33:57.034+0000: 12770: error : virPCIGetHeaderType:3238 : internal error: Unknown PCI header type '127'
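
A note on the '127' in these errors: the PCI header type is a single byte at config-space offset 0x0E, where bit 7 is the multi-function flag and the low seven bits select the header layout. A device that no longer responds typically reads back as all ones (0xFF), which becomes 127 once the flag bit is masked off. So this value probably means the card stopped answering config reads after the failed reset, not that it has an exotic header. Below is a minimal Python sketch of that decoding; the function name and lookup table are illustrative only and are not libvirt's actual code.

```python
# Sketch: decode the PCI Header Type register (config space offset 0x0E).
# Names here are hypothetical; this only mirrors the masking logic.
PCI_HEADER_TYPE_MASK = 0x7F   # bits 6:0 select the header layout
PCI_HEADER_MULTIFUNC = 0x80   # bit 7 flags a multi-function device

KNOWN_LAYOUTS = {
    0: "endpoint",
    1: "PCI-to-PCI bridge",
    2: "CardBus bridge",
}


def decode_header_type(raw_byte: int) -> str:
    """Return a readable description of a raw header-type byte."""
    layout = raw_byte & PCI_HEADER_TYPE_MASK
    name = KNOWN_LAYOUTS.get(layout)
    if name is None:
        return f"unknown header type {layout}"
    if raw_byte & PCI_HEADER_MULTIFUNC:
        return name + " (multi-function)"
    return name


print(decode_header_type(0x80))  # endpoint (multi-function)
print(decode_header_type(0xFF))  # unknown header type 127
```

If that reading is right, the libvirt errors are a symptom of the device dropping off the bus, and the kernel log is the better place to look for the cause.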

 

Kernel logs:

Quote

 

[  418.335657] pcieport 0000:00:03.1: broadcast mmio_enabled message
[  418.335660] pcieport 0000:00:03.1: broadcast resume message
[  418.335665] pcieport 0000:00:03.1: AER: Device recovery successful
[  418.489685] AMD-Vi: Completion-Wait loop timed out
[  418.591165] virbr0: port 2(vnet0) entered forwarding state
[  418.591167] virbr0: topology change detected, propagating
[  418.627225] AMD-Vi: Completion-Wait loop timed out
[  418.764387] AMD-Vi: Completion-Wait loop timed out
[  418.907004] AMD-Vi: Completion-Wait loop timed out
[  419.327643] iommu ivhd0: AMD-Vi: Event logged [
[  419.327648] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd10]
[  419.336519] pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:00.0
[  419.336524] pcieport 0000:00:03.1: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[  419.336526] pcieport 0000:00:03.1:   device [1022:1453] error status/mask=00200000/04400000
[  419.336528] pcieport 0000:00:03.1:    [21] ACSViol                (First)
[  419.336531] pcieport 0000:00:03.1: broadcast error_detected message
[  419.336562] pcieport 0000:00:03.1: broadcast mmio_enabled message
[  419.336564] pcieport 0000:00:03.1: broadcast resume message
[  419.336567] pcieport 0000:00:03.1: AER: Device recovery successful
[  420.327558] iommu ivhd0: AMD-Vi: Event logged [
[  420.327563] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd40]
[  420.338660] pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:00.0
[  420.338667] pcieport 0000:00:03.1: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[  420.338670] pcieport 0000:00:03.1:   device [1022:1453] error status/mask=00200000/04400000
[  420.338673] pcieport 0000:00:03.1:    [21] ACSViol                (First)
[  420.338677] pcieport 0000:00:03.1: broadcast error_detected message
[  420.338730] pcieport 0000:00:03.1: broadcast mmio_enabled message
[  420.338732] pcieport 0000:00:03.1: broadcast resume message
[  420.338737] pcieport 0000:00:03.1: AER: Device recovery successful
[  421.327525] iommu ivhd0: AMD-Vi: Event logged [
[  421.327530] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd70]
[  421.330873] pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:00.0
[  421.330880] pcieport 0000:00:03.1: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[  421.330884] pcieport 0000:00:03.1:   device [1022:1453] error status/mask=00200000/04400000
[  421.330886] pcieport 0000:00:03.1:    [21] ACSViol                (First)
[  421.330890] pcieport 0000:00:03.1: broadcast error_detected message
[  421.330942] pcieport 0000:00:03.1: broadcast mmio_enabled message
[  421.330944] pcieport 0000:00:03.1: broadcast resume message
[  421.330949] pcieport 0000:00:03.1: AER: Device recovery successful
[  422.327738] iommu ivhd0: AMD-Vi: Event logged [
[  422.327744] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bda0]
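
The kernel log repeats the same pattern about once per second: an IOTLB invalidation timeout for device 0d:00.0, followed by an ACS violation reported on root port 0000:00:03.1. For anyone comparing their own log against this one, here is a rough Python sketch that tallies both event types. The regular expressions are assumptions based only on the lines quoted above, and `summarize` is a hypothetical helper, so other kernel versions may need different patterns.

```python
import re
from collections import Counter

# Patterns assumed from the log excerpt in this post; adjust as needed.
TIMEOUT_RE = re.compile(
    r"IOTLB_INV_TIMEOUT device=([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])"
)
AER_RE = re.compile(r"pcieport (\S+): AER: Uncorrected")


def summarize(log_text):
    """Count IOTLB timeouts per device and uncorrected AER errors per port."""
    timeouts = Counter(TIMEOUT_RE.findall(log_text))
    aer_ports = Counter(AER_RE.findall(log_text))
    return timeouts, aer_ports


# Three lines copied from the kernel log above, used as sample input.
sample = """\
[  419.327648] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd10]
[  419.336519] pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:00.0
[  420.327563] iommu ivhd0: IOTLB_INV_TIMEOUT device=0d:00.0 address=0x0000000ffe06bd40]
"""

timeouts, aer_ports = summarize(sample)
print(dict(timeouts))   # {'0d:00.0': 2}
print(dict(aer_ports))  # {'0000:00:03.1': 1}
```

To run it against a real log, save the `dmesg` output to a file and pass the file's contents to `summarize` instead of `sample`.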

 


@limetech can you please have a look? I have seen the same error (device [1022:1453] error status/mask=00200000/04400000) reported by other users on the forum, and this article states that downgrading the kernel makes everything work: https://www.micropissed.com/2018/05/amd-vi-completion-wait-loop-timed-out

 

I will try downgrading the BIOS in case the newest one broke something.


I was only able to downgrade from 5.60 to 5.50 on the X370 Taichi (https://www.asrock.com/mb/AMD/X370 Taichi/index.asp#BIOS), and it did not change the situation. Downgrading to an earlier BIOS version is not possible, as described here: https://forum.level1techs.com/t/attention-amd-vfio-users-do-not-update-your-bios/142685

 

 

@limetech is it possible for the next version of unRAID to include the patch mentioned in the topic above? The diff can be seen here: https://clbin.com/VCiYJ. According to users on the L1 forums, it is a fully working fix. Any tips on applying it before a new unRAID release are welcome.

 

 

 

