August 12, 2023 I have been running this exact USB key on an older Supermicro board for years with zero issues. Very stable. I have since moved all the drives and the key over to a newer system. (I did run the new system on a trial of unRAID for 30 days, and it was stable.) After moving everything over and adding 2 NVMe drives for a VM pool, I'm now having a stability issue. Over the last 3 days I can't get the server to stay up for 24 straight hours. While it is up, it's solid and I see no errors in the logs. I might pull the NVMe drives out for testing, but it's odd, as I see no real errors and the VMs seem to run just fine. When it locks up I lose the GUI and VGA. IPMI still works, so I know the board isn't "down". I can reset via IPMI and then I'm back up and running for another 24ish hours. Any thoughts? Thank you in advance! smc-unraid-diagnostics-20230811-2355.zip
August 12, 2023 Author 6 hours ago, JorgeB said: Enable the syslog server and post that after a crash.
I had mirror to flash enabled; is that different from the local syslog server? That mirrored flash should be in the attachment.
August 12, 2023 1 minute ago, SmallwoodDR82 said: That mirrored flash should be in the attachment
Not unless you added it! The standard diagnostics only include the RAM copy of the syslog, not the one mirrored to flash.
August 12, 2023 Author 1 minute ago, itimpi said: Not unless you added it! The standard diagnostics only include the RAM copy of the syslog, not the one mirrored to flash.
My fault, everyone. I thought it was added to the diags zip. See the attached mirrored syslog. I was changing some switches around on Aug 10, so those link-down messages can be ignored. The crash was around Aug 11 23:15, I believe. syslog
August 13, 2023 There's nothing relevant logged, which suggests a hardware issue. Since it started after a move, I would begin by checking that the power supply cables are all correctly plugged in and latched to the board.
August 14, 2023 Author It's a dual power supply server (Supermicro CSE-836). Update: I've looked in IPMI, and the syslog has zero errors logged; while unRAID crashes, IPMI stays up. It crashed again today with the NVMe drives removed and the VMs turned off because of that, so it's not NVMe or VM related. I'm currently running a memory test and am halfway through it with no errors. Edited August 14, 2023 by SmallwoodDR82
August 16, 2023 Author Update: I had another crash on Monday. I ran memtest and everything passed. Then I was digging around in the forums and came across this post. I moved all my containers to a separate NIC via this guide, and since then it's been stable. 48 hours so far! Fingers crossed.