VMs/Containers randomly "disappearing" until reboot



Hi there! For the last week or two I've been running into an issue where everything works fine, then a day or two later I try to access a service and notice that VMs and containers are either missing or stopped, while several btrfs-related errors appear in the log. A server reboot brings everything back online, apart from a couple of new log errors showing in some containers (e.g. sonarr).

I'm unsure whether this is related, but I'm passing through an external USB HDD to a Win10 VM using the Unassigned Devices plugin. The HDD initially appears as expected, but after a couple of hours it shows as missing in Unraid, even though the Win10 VM still recognizes it at the same time. The HDD is used for long-term storage of BlueIris videos, so there is constant activity on it.

I've only been playing around with Unraid for a year or so, so I'm still trying to figure out how to troubleshoot this.

 

The last two diagnostics archives are attached. The first, tower-diagnostics-20201102-1149.zip, relates to the containers disappearing; the second, tower-diagnostics-20201031-0419.zip, relates to the VMs disappearing while half of the containers were automatically stopped.

 

Thanks for your time!


Cache device is dropping offline:

Nov  2 09:36:58 Tower kernel: pcieport 0000:00:06.0: AER: Uncorrected (Fatal) error received: 0000:00:00.0
Nov  2 09:36:58 Tower kernel: nvme 0000:05:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
Nov  2 09:37:29 Tower kernel: nvme nvme0: I/O 710 QID 14 timeout, aborting
Nov  2 09:37:29 Tower kernel: nvme nvme0: I/O 711 QID 14 timeout, aborting
Nov  2 09:37:29 Tower kernel: nvme nvme0: I/O 712 QID 14 timeout, aborting
Nov  2 09:37:29 Tower kernel: nvme nvme0: I/O 713 QID 14 timeout, aborting
Nov  2 09:37:29 Tower kernel: nvme nvme0: I/O 714 QID 14 timeout, aborting
Nov  2 09:37:59 Tower kernel: nvme nvme0: I/O 710 QID 14 timeout, reset controller
Nov  2 09:38:29 Tower kernel: nvme nvme0: I/O 0 QID 0 timeout, reset controller
Nov  2 09:38:34 Tower kernel: nvme nvme0: Device shutdown incomplete; abort shutdown

 

There have been previous reports of issues with Adata devices; if you can, try a different brand.
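If you want to check whether the same pattern shows up in your other diagnostics, a rough Python sketch like the one below can count the relevant kernel messages. The patterns are copied from the syslog lines quoted above; the helper name `scan_syslog` and the pattern labels are made up for illustration, and other kernels or devices may word these messages differently.

```python
import re

# Patterns based on the syslog excerpt quoted in this thread.
# The dictionary keys are illustrative labels, not kernel terminology.
PATTERNS = {
    "aer_fatal": re.compile(r"AER: Uncorrected \(Fatal\) error received"),
    "pcie_bus_error": re.compile(r"PCIe Bus Error: severity=Uncorrected \(Fatal\)"),
    "nvme_timeout": re.compile(r"nvme nvme\d+: I/O \d+ QID \d+ timeout"),
    "nvme_shutdown": re.compile(r"nvme nvme\d+: Device shutdown incomplete"),
}


def scan_syslog(lines):
    """Count how many lines match each pattern (hypothetical helper)."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

To use it, pass any iterable of lines, for example an open syslog file: `scan_syslog(open(path, errors="replace"))`, where `path` is wherever your syslog lives. Non-zero counts for the NVMe timeouts alongside the PCIe errors would suggest the same cache device dropout.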
