stealth007 Posted February 15, 2022

Hi, I am an absolute newbie to UNRAID, and I am just trying to set up my server. I've only just got the system booted into the UNRAID OS, but I am facing an issue where my Samsung 970 Evo Plus 1TB NVMe SSD randomly disappears and only reappears when I shut down and turn on the server again.

My Samsung 850 Evo 1TB SATA SSD and WD 2TB HDD (old reused devices) appear under unassigned devices just fine, but for some reason, after starting the server, the Samsung 970 Evo Plus NVMe SSD appears for a short while and then disappears. It is a brand new drive, so I'm not sure what the issue is. The 970 Evo is in an MSI Z590i UNIFY motherboard, which supports 2 NVMe drives, though I am only using one.

I have attached my Diagnostics ZIP here. I appreciate any help and thank you in advance!

diagnostics-20220215-2353.zip
JorgeB Posted February 15, 2022

The log is being spammed with Bluetooth-related errors; look for a BIOS update for the board. The below also helps sometimes.

Some NVMe devices have issues with power states on Linux. Try this: on the main GUI page click on flash, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (on the top right), and add this to your default boot option, after "append initrd=/bzroot":

nvme_core.default_ps_max_latency_us=0

e.g.:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0

Reboot and see if it makes a difference.
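For anyone who would rather script this edit than use the GUI, here is a minimal Python sketch of the same change. It works on the config text only and adds the parameter to the `append` line of whichever entry is marked `menu default`. The function name, the sample config, and the file location (commonly `/boot/syslinux/syslinux.cfg` on the flash drive) are assumptions for illustration, not taken from the diagnostics; it also assumes `menu default` appears before `append` inside the entry.

```python
PARAM = "nvme_core.default_ps_max_latency_us=0"

# Hypothetical sample config, shaped like a typical syslinux menu.
SAMPLE_CFG = (
    "label Unraid OS\n"
    "  menu default\n"
    "  kernel /bzimage\n"
    "  append initrd=/bzroot\n"
    "label Unraid OS GUI Mode\n"
    "  kernel /bzimage\n"
    "  append initrd=/bzroot,/bzroot-gui\n"
)


def add_param_to_default_entry(cfg_text: str, param: str = PARAM) -> str:
    """Append `param` to the `append` line of the entry marked `menu default`.

    Other entries are left untouched, and the parameter is not added twice.
    """
    out = []
    in_default = False
    for line in cfg_text.splitlines():
        stripped = line.strip()
        lowered = stripped.lower()
        if lowered.startswith("label "):
            # A new menu entry starts; it is not the default until proven so.
            in_default = False
        elif lowered == "menu default":
            in_default = True
        elif (
            in_default
            and lowered.startswith("append ")
            and param not in stripped.split()
        ):
            line = line.rstrip() + " " + param
        out.append(line)
    trailing_newline = "\n" if cfg_text.endswith("\n") else ""
    return "\n".join(out) + trailing_newline


if __name__ == "__main__":
    print(add_param_to_default_entry(SAMPLE_CFG))
```

Keep a copy of the original file before writing anything back, and reboot afterwards as described above.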
stealth007 Posted February 16, 2022 Author (edited)

Hi JorgeB, thank you for your reply! I had already updated the BIOS to its latest version before booting into UNRAID. I'm not sure what the Bluetooth errors are, but could they be related to the wireless keyboard and mouse USB dongles I have attached?

I tried the fix that you suggested, but I can't even "see" the NVMe device in UNRAID, although it appears in the BIOS. Also, under my "Syslinux Configuration" there is additional content after "append initrd=/bzroot"; please see below:

append initrd=/bzroot,/bzroot-gui unraidsafemode

UNRAID is started in normal mode, not safe mode, so I'm not sure why that suffix is shown. I appended "nvme_core.default_ps_max_latency_us=0" after it, as in:

append initrd=/bzroot,/bzroot-gui unraidsafemode nvme_core.default_ps_max_latency_us=0

This does not seem to fix the issue. The NVMe drive does not appear under unassigned devices even after a full shutdown and startup. I also tried removing the ",/bzroot-gui unraidsafemode" part to make it as you suggested:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0

However, the NVMe drive still does not appear under unassigned devices. What should I do in this case? Attached are the new diagnostic files. Thank you for your help!
JorgeB Posted February 16, 2022

5 hours ago, stealth007 said: The NVMe drive does not appear under unassigned devices even after full shutdown and startup.

This is a different problem, post new diags.

5 hours ago, stealth007 said: Not sure what the Bluetooth errors

See if there's an option to disable the Bluetooth module in the BIOS.
stealth007 Posted February 16, 2022 Author

6 hours ago, JorgeB said: This is a different problem, post new diags.

Okay, the new diagnostics are attached to this reply.

6 hours ago, JorgeB said: See if there's an option to disable the Bluetooth module in the BIOS.

Sure, I will check the BIOS and try to disable Bluetooth.

diagnostics-20220216-1010.zip
JorgeB Posted February 16, 2022

Feb 15 18:04:11 TheSentinel kernel: nvme nvme0: pci function 0000:04:00.0
Feb 15 18:04:11 TheSentinel kernel: nvme 0000:04:00.0: can't change power state from D3hot to D0 (config space inaccessible)
Feb 15 18:04:11 TheSentinel kernel: nvme nvme0: Removing after probe failure status: -19

The device is failing to initialize. I don't think there's much you can do other than trying a different NVMe device (or a different board). You can also try v6.10-rc2, but I doubt that it would help.
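To spot these two messages in a longer syslog without reading it line by line, a small Python sketch along these lines could help. The regular expressions are written against the three lines quoted above only, so other kernel versions may word things differently; the function name and the returned dictionary layout are made up for this example. On Linux, status -19 corresponds to errno 19, ENODEV ("No such device").

```python
import errno
import re

# Patterns modelled on the kernel messages quoted in this thread.
POWER_STATE = re.compile(
    r"nvme (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"can't change power state from (?P<src>\w+) to (?P<dst>\w+)"
)
PROBE_FAILURE = re.compile(
    r"nvme (?P<ctrl>nvme\d+): Removing after probe failure status: "
    r"(?P<status>-?\d+)"
)


def find_nvme_init_errors(log_text: str) -> list:
    """Return one dict per NVMe power-state or probe-failure message found."""
    findings = []
    for line in log_text.splitlines():
        match = POWER_STATE.search(line)
        if match:
            findings.append({
                "kind": "power_state",
                "pci": match.group("pci"),
                "from": match.group("src"),
                "to": match.group("dst"),
            })
            continue
        match = PROBE_FAILURE.search(line)
        if match:
            status = int(match.group("status"))
            findings.append({
                "kind": "probe_failure",
                "controller": match.group("ctrl"),
                "status": status,
                # Kernel statuses are negative errno values.
                "errno_name": errno.errorcode.get(abs(status), "unknown"),
            })
    return findings
```

Feeding it the syslog from a diagnostics ZIP would list each affected PCI address and controller, which makes it easier to tell which slot the failing device sits in.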
stealth007 Posted February 16, 2022 Author

Thank you for all your help and advice so far! I've turned off Bluetooth and Wi-Fi in the BIOS itself.

As for the NVMe troubles, I actually managed to "fix" it in what is probably the dumbest way possible. Since my motherboard has 2 slots, I simply put the drive in the other slot, and it is now detected and works fine, at least for now. I'll just have to keep in mind that if I get a second NVMe drive for the original slot, it probably should not be a 970 Evo Plus, so as to avoid this issue again.

As for a potential explanation of why it works in one slot and not the other: according to my motherboard specs, the slot I had it in initially is controlled by the motherboard chipset, whereas the slot I have it in now is controlled by the CPU, so perhaps there are some shenanigans going on there that I'm not familiar with.

I hope this helps anyone else who stumbles across a similar issue.
stealth007 Posted February 18, 2022 Author

Hi JorgeB, I bought a Sabrent Rocket NVMe drive instead, and I am facing the same issue as before: the drive appears online for a while, then drops offline. I still have the line you provided in syslinux:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0

Any idea what could be causing this issue? Is it really an issue with the motherboard? The drive appears fine in the BIOS. Thank you for your help.

diagnostics-20220218-1330.zip
JorgeB Posted February 18, 2022

This is usually board or NVMe device related, or both together.
stealth007 Posted February 20, 2022 Author

On 2/18/2022 at 5:20 PM, JorgeB said: This is usually board or NVMe device related, or both together.

Hi JorgeB, sorry for the late reply. I was doing some testing in Windows instead.

Right now I have a Samsung 970 Evo Plus Gen 3 drive installed in the M2_1 slot and a Sabrent Rocket Gen 3 drive installed in the M2_2 slot. The M2_2 slot is the one that keeps dropping offline in UNRAID. When I had the Samsung drive installed in that slot it would drop offline, and now the Sabrent drive does the same thing.

I loaded Windows on the Samsung drive in the M2_1 slot and added the Sabrent drive in the M2_2 slot as a secondary (D:) drive. I ran the PC in Windows for about an hour, and every so often I would access the Sabrent "D:" drive and copy files to and from the Samsung "C:" drive. In this scenario the drive never dropped offline and remained accessible throughout.

The issue where the drive is first detected and then suddenly drops offline within 5 minutes of booting only occurs in UNRAID. Could this then be an issue unique to UNRAID? Thank you for your help.
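On the Linux side, one rough way to pin down exactly when the drive drops is to poll the list of NVMe controllers and record a timestamp whenever one disappears, so it can be matched against the syslog. The Python sketch below assumes controllers show up as entries such as `nvme0` under `/sys/class/nvme` and that a Python 3 interpreter is available on the server; the function names and parameters are hypothetical, and nothing here was run against the hardware in this thread.

```python
import os
import time


def list_nvme_controllers(sysfs_dir: str = "/sys/class/nvme") -> set:
    """Return the controller names currently listed in sysfs (empty if none)."""
    try:
        return set(os.listdir(sysfs_dir))
    except FileNotFoundError:
        return set()


def watch_nvme(list_fn=list_nvme_controllers, interval_s=5.0, max_polls=60,
               sleep=time.sleep, now=time.time):
    """Poll `list_fn` and return (timestamp, event, name) tuples for changes.

    `sleep` and `now` are injectable so the loop can be exercised without
    waiting or touching real hardware.
    """
    events = []
    previous = list_fn()
    for _ in range(max_polls):
        sleep(interval_s)
        current = list_fn()
        for name in sorted(previous - current):
            events.append((now(), "removed", name))
        for name in sorted(current - previous):
            events.append((now(), "added", name))
        previous = current
    return events


if __name__ == "__main__":
    # With the defaults this watches for five minutes (60 polls, 5 s apart).
    for stamp, event, name in watch_nvme():
        print(time.strftime("%H:%M:%S", time.localtime(stamp)), event, name)
```

A "removed" entry a few minutes after boot would line up with the behaviour described above, and its timestamp shows which kernel messages to read around that moment.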
JorgeB Posted February 21, 2022

9 hours ago, stealth007 said: Could this then be an issue unique to UNRAID?

Could be a Linux issue with that board/device combo.
stealth007 Posted February 21, 2022 Author

3 hours ago, JorgeB said: Could be a Linux issue with that board/device combo.

I understand. Thank you again for your help in trying to resolve this issue.