juan11perez, posted January 4, 2021 (edited January 24, 2021 by juan11perez: closed)

Good day,

About 4 days ago Docker crashed. All Docker containers reside in /mnt/cache, the cache being ADATA_SX8200PNP_2J3420080645 (nvme0n1). When I checked the Main tab, the drive had disappeared. Everything else was fine.

I restarted the server a couple of times, with no luck. I presumed a faulty drive, so I removed it and installed another NVMe drive in the same slot, and the new drive showed up. That seemed to confirm my suspicion. Nevertheless, I reinstalled the presumed failed drive in its old location, restarted, and it appeared again. The server ran fine for the last 4 days.

This morning Docker was dead again. I pulled the attached log and restarted Unraid. Everything is working again. According to the log, the drive disconnected at 1:48 am.

Should I assume the drive has had it? It's 2 years old.

Thank you

Attachment: syslog.txt

JorgeB, posted January 4, 2021

The NVMe device dropped offline. This can sometimes help:

Some NVMe devices have issues with power states on Linux. Try this: on the Main GUI page click on Flash, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right), and add the following to your default boot option, after "append" and before "initrd=/bzroot":

nvme_core.default_ps_max_latency_us=0

Reboot and see if it makes a difference. If it doesn't, look for a BIOS update and/or try a different brand/model device.
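
For readers who want to see exactly where the parameter goes, here is a minimal Python sketch of the edit described above. Setting `nvme_core.default_ps_max_latency_us=0` disables the NVMe driver's autonomous power state transitions, which is why it is suggested for power-state problems. The helper names `add_param_to_append` and `param_is_active` are hypothetical, and the sample line assumes a stock boot entry of the form `append initrd=/bzroot`; a customized entry may carry other options.

```python
# Hypothetical helpers illustrating the syslinux edit; not part of Unraid.
PARAM = "nvme_core.default_ps_max_latency_us=0"


def add_param_to_append(line: str, param: str = PARAM) -> str:
    """Insert `param` right after 'append', ahead of 'initrd=/bzroot'.

    Lines that are not 'append' lines, or that already contain the
    parameter, are returned unchanged. Leading indentation is kept;
    a trailing newline, if any, is dropped on modified lines.
    """
    tokens = line.split()
    if not tokens or tokens[0] != "append":
        return line  # not a boot 'append' line, leave it alone
    if param in tokens:
        return line  # already present, nothing to do
    indent = line[: len(line) - len(line.lstrip())]
    return indent + " ".join([tokens[0], param] + tokens[1:])


def param_is_active(cmdline: str, param: str = PARAM) -> bool:
    """Return True if `param` appears as a whole token in a kernel command line."""
    return param in cmdline.split()
```

After a reboot, passing the contents of `/proc/cmdline` to `param_is_active` is one way to confirm that the kernel actually booted with the option. Matching on whole tokens rather than substrings avoids a false positive from a differently valued setting such as `...=5500`.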

juan11perez (author), posted January 4, 2021

@JorgeB Thank you for the advice. I'll implement it and monitor.

The drive worked fine for 2 years and only started acting up after I added a third GPU in the third PCIe slot. I'm not sure it has any bearing on the problem, but it's the only change. I have another identical drive in the second M.2 slot and it's fine; then again, that one is only 4 months old.

Once again, thank you.

juan11perez (author), posted January 24, 2021

@JorgeB A short note to update, and thank you again for your advice.

Adding "nvme_core.default_ps_max_latency_us=0" to syslinux did not resolve the issue; the drive kept disconnecting. I replaced it with a Samsung 970 EVO Plus and that stopped the problem. The server has now been running for over 10 days with no issues.

I'm not sure whether this means the ADATA drive failed, but if it did, it's quite disappointing: it's only 2 years old and the internet is full of praise for this product.