kf0cqx Posted May 28
Hello, sorry to bug you guys. Twice in the last couple of weeks unRAID has crashed at random. I cannot wake up the local screen, and I can't access any of my dockers because they have crashed too. I know this is a vague issue, but I was hoping someone here might notice something in the diagnostics. Thanks for any help.
voltznet-1-diagnostics-20240528-0741.zip
JorgeB Posted May 28
You can enable the syslog server and post that log after a crash, in case there's something logged there.
kf0cqx Posted June 22 Author
On 5/28/2024 at 9:13 AM, JorgeB said: You can enable the syslog server and post that after a crash, in case there's something logged there.
Still having crashes at random. These are all from this evening. The system went completely unresponsive although it continued to run. I tried pinging it and got no response either. It just happened again after running for roughly an hour.
voltznet-1-diagnostics-20240621-2022.zip syslog syslog-previous
JorgeB Posted June 22
There are multiple call traces and segfaults, and lots of these:
Jun 21 21:00:35 VoltzNet-1 kernel: mce: [Hardware Error]: Machine check events logged
Jun 21 21:02:09 VoltzNet-1 kernel: mce: [Hardware Error]: Machine check events logged
Jun 21 21:04:47 VoltzNet-1 kernel: mce: [Hardware Error]: Machine check events logged
Jun 21 21:05:50 VoltzNet-1 kernel: mce: [Hardware Error]: Machine check events logged
Jun 21 21:08:27 VoltzNet-1 kernel: mce: [Hardware Error]: Machine check events logged
So it looks like a hardware issue; start by running memtest.
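For anyone who wants to check their own syslog for the same pattern, here is a minimal sketch that counts the machine check lines per hour. It assumes the log lines have the same shape as the excerpt above; the function name `count_mce_per_hour` and the file path passed on the command line are illustrative, not part of unRAID.

```python
import re
import sys
from collections import Counter

# Matches lines shaped like the excerpt in this thread, e.g.
# "Jun 21 21:00:35 VoltzNet-1 kernel: mce: [Hardware Error]: Machine check events logged"
MCE_PATTERN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: mce: \[Hardware Error\]"
)


def count_mce_per_hour(lines):
    """Return a Counter mapping 'Mon DD HH' to the number of MCE lines in that hour."""
    counts = Counter()
    for line in lines:
        match = MCE_PATTERN.match(line)
        if match:
            # Drop the trailing ":MM:SS" so events are grouped by hour.
            counts[match.group("stamp")[:-6]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage: python count_mce.py /path/to/syslog
    with open(sys.argv[1], errors="replace") as handle:
        for hour, total in sorted(count_mce_per_hour(handle).items()):
            print(f"{hour}:00  {total} machine check notice(s)")
```

A steady stream of these notices across several hours points the same way as the advice above: test the hardware first, starting with the memory.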
kf0cqx Posted June 23 Author
18 hours ago, JorgeB said: There are multiple call traces and segfaults, and lots of these:
Jun 21 21:00:35 VoltzNet-1 kernel: mce: [Hardware Error]: Machine check events logged
Jun 21 21:02:09 VoltzNet-1 kernel: mce: [Hardware Error]: Machine check events logged
Jun 21 21:04:47 VoltzNet-1 kernel: mce: [Hardware Error]: Machine check events logged
Jun 21 21:05:50 VoltzNet-1 kernel: mce: [Hardware Error]: Machine check events logged
Jun 21 21:08:27 VoltzNet-1 kernel: mce: [Hardware Error]: Machine check events logged
So looks like a hardware issue, start by running memtest
Memtest failed on the 2nd pass (2 bits) at 92 GB out of 128 GB. I reseated the DIMMs and noticed the temps were a bit high on the VRM and chipset, so I added a couple of fans as well. I reran memtest for 4 passes and it passed. I'm thinking some temperature issues may be at play. I will stop back if this continues.