Looks like my Google-Fu has failed me, but maybe someone can offer some insight.
My XPG S70 Blade keeps dropping its connection with no rhyme or reason that I can find. It'll be fine for a few hours, then all of a sudden I get a notification that the device is missing, and the syslog shows:
Mar 29 11:00:27 TheRedQueen kernel: nvme nvme2: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xFFFF
I tried passing it through to a VM to check its firmware and see if there's an update, but to no avail. Also, since the passthrough, the error has changed to:
TheRedQueen kernel: vfio-pci 0000:02:00.0: VPD access failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update
I should mention that in the VM the drive never actually drops out of the OS, but in Unraid it definitely does, and a reboot is necessary to get it back on track.
I've reseated it once just in case, but the issue still seems to happen once every 24 hours. Thoughts?
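To pin down whether the dropouts really land on a 24-hour cycle, a rough Python sketch like the one below could pull the "controller is down" timestamps out of the syslog and print the gaps between them. The function names and the regex are my own illustration, built only from the log line quoted above. The year has to be passed in because syslog lines don't carry one, and the log location is an assumption on my part.

```python
import re
from datetime import datetime

# Pattern modeled on the single kernel line quoted in this post (an assumption:
# other kernel versions may word the message differently).
EVENT = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"nvme (?P<ctrl>nvme\d+): controller is down"
)

def dropout_times(lines, year):
    """Return a list of (controller, datetime) for each 'controller is down' line."""
    events = []
    for line in lines:
        match = EVENT.match(line)
        if match:
            # Collapse the padding syslog uses for single-digit days ("Mar  9").
            stamp = " ".join(match.group("stamp").split())
            when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            events.append((match.group("ctrl"), when))
    return events

def gaps_in_hours(events):
    """Return the time between consecutive dropouts, in hours."""
    times = sorted(when for _, when in events)
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]
```

Feeding it the lines of the syslog (assumed here to be /var/log/syslog) with `year=2022` and printing `gaps_in_hours(...)` should show at a glance whether the gaps cluster around 24 or are scattered. Keep in mind the log may not survive a reboot unless it is being mirrored somewhere persistent.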
Supermicro M12SWA-TF
AMD Ryzen Threadripper PRO 3955WX
NVIDIA GTX 1060 6GB (For Transcoding Purposes)
2x LSI 9202-16e HBAs
LSI 9272-8i HBA
2x T-Force Cardea 1TB (Cache) in an ASUS Hyper M.2 Expansion (Bifurcated x4x4x4x4)
Seasonic PRIME 1000W Platinum PSU.
theredqueen-diagnostics-20220329-1129.zip