NVIDIA vfio-pci: "Refused to change power state, currently in D3"?


GPU passthrough has been working fine on my Unraid servers (I have three, all running Linux and Windows passthrough VMs for CUDA), but now one system will not boot any VM with passthrough.

It works (randomly?) on the first boot after installing a VM, but then the GPU stops working (no CUDA calls succeed) at a random point. The errors are below.

 

Oct 31 15:06:07 Tower kernel: vfio-pci 0000:af:00.1: Refused to change power state, currently in D3
Oct 31 15:06:09 Tower kernel: vfio-pci 0000:af:00.0: Refused to change power state, currently in D3
Oct 31 15:06:09 Tower kernel: vfio-pci 0000:af:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Oct 31 15:06:09 Tower kernel: vfio-pci 0000:af:00.1: Refused to change power state, currently in D3
Oct 31 15:06:09 Tower kernel: pciback 0000:af:00.1: Refused to change power state, currently in D3
root@Tower:~# lspci -nn | grep NVIDIA
3b:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
3b:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
af:00.0 VGA compatible controller [0300]: NVIDIA Corporation GV100 [TITAN V] [10de:1d81] (rev ff)
af:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f2] (rev ff)
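
One detail in the `lspci` output above may be worth noting: both TITAN V functions report `(rev ff)`, while the Quadro P4000 reports `(rev a1)`. A revision of `ff` usually means that reads of the device's PCI config space are returning all ones, i.e. the card is not responding on the bus, which would be consistent with the "currently in D3" messages. The following is only a minimal Python sketch for flagging such devices in saved `lspci -nn` output; the function name `unresponsive_functions` and the regular expression are illustrative assumptions, not part of Unraid or pciutils.

```python
import re
from typing import List

# Matches `lspci -nn` lines such as:
#   af:00.0 VGA compatible controller [0300]: ... [10de:1d81] (rev ff)
# An optional PCI domain prefix (e.g. "0000:") is accepted as well.
LSPCI_LINE = re.compile(
    r"^(?P<addr>(?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+"
    r".*\[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
    r"(?:\s+\(rev (?P<rev>[0-9a-f]{2})\))?\s*$"
)


def unresponsive_functions(lspci_output: str) -> List[str]:
    """Return the addresses of PCI functions that report revision ff.

    Assumption: a revision of ff is treated as a hint that config space
    reads return all ones (device not responding). It is not proof.
    """
    flagged = []
    for line in lspci_output.splitlines():
        match = LSPCI_LINE.match(line.strip())
        if match and match.group("rev") == "ff":
            flagged.append(match.group("addr"))
    return flagged


# Sample input copied from the lspci output in this post.
SAMPLE = """\
3b:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
3b:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
af:00.0 VGA compatible controller [0300]: NVIDIA Corporation GV100 [TITAN V] [10de:1d81] (rev ff)
af:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f2] (rev ff)
"""

if __name__ == "__main__":
    for addr in unresponsive_functions(SAMPLE):
        print(addr, "reports rev ff (possibly not responding)")
```

To use it on a live system, the output of `lspci -nn` could be saved to a file and passed to `unresponsive_functions` as a string. A flagged device is only a starting point for further checks (reseating the card, power cabling, slot, BIOS settings), not a diagnosis.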

 

 

IOMMU group 65:
[10de:1d81] af:00.0 VGA compatible controller: NVIDIA Corporation GV100 [TITAN V] (rev a1)
[10de:10f2] af:00.1 Audio device: NVIDIA Corporation Device 10f2 (rev a1)

IOMMU group 34:
[10de:1bb1] 3b:00.0 VGA compatible controller: NVIDIA Corporation GP104GL [Quadro P4000] (rev a1)
[10de:10f0] 3b:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)

 

[Attached screenshot, 2019-10-31 15:12:33]

 

 

Any idea what could be causing this? I even upgraded to the latest Unraid RC, but no luck.

Any troubleshooting advice?

After rebooting, I also got this error:

 

internal error: qemu unexpectedly closed the monitor: 2019-10-31T16:26:57.823841Z qemu-system-x86_64: -device pcie-pci-bridge,id=pci.7,bus=pci.1,addr=0x0: Bus 'pci.1' not found

 

syslog:

Oct 31 16:26:58 Tower kernel: vfio-pci 0000:af:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none

 
