September 30, 2023
I have noticed an ongoing NVMe error (it is the one spammed at the bottom of the syslog). I am uncertain whether this is anything to be concerned about, or whether it could be related to the primary problem bringing me here: my system randomly becomes unavailable. I cannot ping it, my monitor does not detect it, the case fans are all spun down, and the hard drive light does not flicker at all. However, the power button is still on. The only way I can regain access is to hard shut down via the button and power back up.
Edit for additional information: The system will generally stay up anywhere between 2 and 4 days before this occurs. I am not doing anything specific on the system when it does. I run no virtual machines and generally use it as a Plex server, but this has never occurred while watching something on Plex.
theden-syslog-20230930-0216.zip
Edited September 30, 2023 by Diavolui: More information
September 30, 2023
For the PCIe error try this: https://forums.unraid.net/topic/118286-nvme-drives-throwing-errors-filling-logs-instantly-how-to-resolve/?do=findComment&comment=1165009
September 30, 2023 (Author)
17 hours ago, JorgeB said: "For the PCIe error try this: https://forums.unraid.net/topic/118286-nvme-drives-throwing-errors-filling-logs-instantly-how-to-resolve/?do=findComment&comment=1165009"
This solved my NVMe/PCIe error! I greatly appreciate that. I assumed it was part of the greater error, so I was searching too broadly to find this.
I was told I should also include the following information: My server ran flawlessly for a month before I (gracefully) shut it down and installed a fan. It may be irrelevant, but I am uncertain what information is valuable.
I also ran a SMART test on the cache, and it came up with no errors. The NVMe will not run a SMART test; for whatever reason, it just immediately goes black. However, the log for it shows no errors.
Edit: Memtest86 also showed no errors.
Edited October 1, 2023 by Diavolui: More tests
October 1, 2023
10 hours ago, Diavolui said: "The NVME will not run a SMART"
NVMe devices don't support SMART tests. If the log spam is resolved, enable the syslog server and post that after a crash.
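Once the syslog server has captured a crash, the interesting part is whatever was logged just before the next boot. The following is only a minimal sketch of one way to pull that out, not an Unraid tool. It assumes the kernel's "Linux version" banner appears near the start of each boot and treats that line as the restart marker; the function name `lines_before_each_boot` and the `context` parameter are made up for illustration.

```python
from typing import Iterable, List

# Assumption: the kernel logs a "Linux version ..." banner early in every boot,
# so a line containing this text is treated as "the server just restarted".
BOOT_MARKER = "Linux version"


def lines_before_each_boot(lines: Iterable[str], context: int = 20) -> List[List[str]]:
    """For each boot marker that has earlier lines in front of it,
    return up to `context` lines that were logged immediately before it."""
    history: List[str] = []      # lines seen since the previous boot marker
    tails: List[List[str]] = []  # one entry per detected restart
    for raw in lines:
        line = raw.rstrip("\n")
        if BOOT_MARKER in line and history:
            # Keep only the tail of what was logged before this restart.
            tails.append(history[-context:] if context > 0 else [])
            history = []
        history.append(line)
    return tails
```

Passing an open file handle for the saved syslog to `lines_before_each_boot` gives one list of lines per restart. If those lines show only routine activity, that matches the "nothing relevant logged" pattern that tends to point toward hardware.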
October 2, 2023 (Author)
On 10/1/2023 at 2:20 AM, JorgeB said: "NVMe devices don't support SMART tests. If the log spam is resolved, enable the syslog server and post that after a crash."
Log spam is resolved. The server used to stay up a few days between crashes; now it does not last 24 hours between crashes.
theden-syslog-192.168.1.234-20231002-1237.zip
theden-diagnostics-20231002-0536.zip
October 2, 2023
Unfortunately, there's nothing relevant logged, which usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers and VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
October 14, 2023 (Author)
Update: The system continued to crash in safe mode. A friend is currently stress testing it for hardware faults.