February 4, 2024
We had a power outage last week and the server wouldn't come back up. I tried removing different parts of the server to get it to POST, and finally got it to POST after removing the USB flash drive, then verifying and fixing problems on it.
Today we had another outage. The server came back up pretty quickly but then went back down, and it hasn't been able to stay up with any regularity since. I've enabled writing logs to the flash drive for the next time it goes down, but I also got a 'machine check events' notification telling me to post the logs here.
Can anyone take a look and see what's up? I'm guessing the flash drive is on its way out, but maybe something else is too?
In bullet format:
- Wouldn't reboot (mid-December; it finally came up, but I don't know why)
- Power outage
- Checked components, fixed flash drive
- Power outage
- Rebooted, won't stay up
- Logs uploaded
tower-diagnostics-20240204-1535.zip
February 4, 2024 · Author
Here is another log, taken after it shut down and I had to restart it. I've also removed the Dynamix System Stats plugin.
syslog-192.168.0.200.log
Edited February 4, 2024 by sl0pz
February 4, 2024 · Author
Diagnostics from after the crash, from Tools > Diagnostics:
tower-diagnostics-20240204-2147.zip
February 4, 2024 · Community Expert
7 hours ago, sl0pz said: "but then went back down."
A server shutting down on its own is almost always a hardware issue or bad power, and unsurprisingly, there's nothing relevant logged in the syslog.
February 5, 2024 · Author
For clarity, the server still has HDD lights, Ethernet lights, and fans running when it's down, so the box still seems to be powered. It takes a hard shutdown and reboot to make Unraid accessible again.
February 5, 2024 · Author
There are some chunky errors in this one, which hopefully will give me a good starting point!
syslog-192.168.0.200.log
February 5, 2024 · Community Expert
There are errors with the flash drive. It may not be the only issue, but it is an issue.
February 6, 2024 · Author
I have a UPS and a new flash drive on order. I'll swap over to the new drive, and the UPS will trigger a clean shutdown if there are power issues.
February 12, 2024 · Author
So I added a UPS and switched to a new flash drive, and the server went down again an hour later. I then got some new RAM; everything started up and stayed up pretty well for 12 hours, but now it's down again.
What's the next thing I could check or replace? Are there different logs to look at?
February 12, 2024 · Author
I just popped the new RAM in, booted up, and followed with a parity check. What do you think would be the most effective test?
Edited February 12, 2024 by sl0pz
February 13, 2024 · Author
It couldn't POST with the new memory, the old memory, or single DIMMs. I've ordered a new power supply, since the old one may have been damaged in the power outages. I'm not really sure what else it could be.
February 15, 2024 · Author
syslog-192.168.0.200.log
OK, so the new PSU is giving more uptime. It did crash last night, it seems, but only once, so maybe something has helped. However, I'm still not getting video output at POST, so I can't get to memtest.
Looking over this log, I'm guessing I should still try to find a way to get to memtest? Or is it pointing at something else now?
February 16, 2024 · Author
While I can reach the server from a networked computer, the video output on the motherboard isn't working with or without the GPU plugged in. I can't access the BIOS or get any video output (memtest or Unraid) via the HDMI ports...
Edit: got video output through the GPU and am running memtest.
Edited February 16, 2024 by sl0pz
February 16, 2024 · Author
OK, a 4-pass memtest had zero errors (this is the new memory from last week).
I'm not sure why I can't get video output from the motherboard HDMI port when the GPU isn't attached; I believe I have before.
I got into the BIOS with no problem this time and changed IOMMU from Auto to Enabled, as I saw another person say this worked for them.
I also removed the GPU Statistics plugin to see if that gets rid of the log errors.
Will post back if/when it crashes overnight!
February 16, 2024 · Author
It doesn't seem to have crashed last night, but it is still giving me a 'machine check events' error. I guess we'll continue to wait and see if the crash comes up again?
syslog-192.168.0.200.log
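For anyone wanting to find where those 'machine check events' notices sit in a saved syslog, here is a minimal Python sketch. It only does a case-insensitive text search; the marker strings, the `machine_check_lines` helper, and the sample lines are assumptions for illustration and are not taken from the attached log.

```python
def machine_check_lines(lines):
    """Return log lines that mention a machine check or hardware error.

    The marker strings are an assumption about how the kernel words these
    messages; adjust them to match what your own syslog contains.
    """
    markers = ("machine check", "hardware error")
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line.lower() for marker in markers)
    ]


# Hypothetical sample lines, written only to exercise the helper.
sample = [
    "Feb 16 08:12:01 Tower kernel: mce: [Hardware Error]: Machine check events logged\n",
    "Feb 16 08:12:05 Tower emhttpd: example unrelated line\n",
]
matches = machine_check_lines(sample)
```

To run it against a real file, pass an open file object instead of the sample list, for example `machine_check_lines(open("syslog-192.168.0.200.log"))`.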
February 18, 2024 · Author
OK, so it crashed again last night. I'm not sure when: the last entry in the log was at 4pm and I found it dead at 11:25, with no entries in the log between those two times.
The NVIDIA log entries are also gone since I removed the GPU Statistics plugin.
So I'm just looking for more tests to run to find out why it's turning off.
syslog-192.168.0.200.log
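A silent stretch like the one described above can be located by comparing consecutive timestamps in the saved syslog. The sketch below is one way to do that in Python, assuming each line starts with a classic syslog timestamp such as `Feb 17 16:00:01`; the `find_gaps` helper, the one-hour threshold, and the sample lines are hypothetical. The year is passed in explicitly because that timestamp style does not include one.

```python
from datetime import datetime, timedelta


def find_gaps(lines, min_gap=timedelta(hours=1), year=2024):
    """Return (previous, current, gap) tuples for silent stretches in a log.

    Assumes each line begins with a 15-character timestamp like
    "Feb 17 16:00:01"; lines that do not parse are skipped.
    """
    gaps = []
    prev = None
    for line in lines:
        try:
            ts = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped line
        if prev is not None and ts - prev >= min_gap:
            gaps.append((prev, ts, ts - prev))
        prev = ts
    return gaps


# Hypothetical sample lines, written only to exercise the helper.
sample = [
    "Feb 17 15:59:58 Tower kernel: example line",
    "Feb 17 16:00:01 Tower kernel: example line",
    "Feb 17 23:27:10 Tower kernel: example line after reboot",
    "not a timestamped line",
]
gaps = find_gaps(sample)
```

Each returned tuple holds the last timestamp before the silence, the first one after it, and the length of the gap, which narrows down when the box actually stopped logging. For a real file, pass an open file object such as `open("syslog-192.168.0.200.log")` in place of `sample`.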
February 18, 2024 · Author
The crash in my previous post is still the latest one. Since then I changed the BIOS setting from auto to typical idle current, as I read that the system may have been dropping out by failing to come back from a low-power state.
Current uptime since that change is 23 hours.
February 21, 2024 · Author
About 4 days of uptime at this point.
For anyone reading this thread and thinking "this sounds like me", my solutions were:
- cleaning up the power source (UPS + new power supply)
- new flash drive
- upgrading memory (could have been unnecessary)
My guess, because it was the last thing I tried fixing, is that the home power going on and off damaged some part of the 8-year-old power supply, which then was not giving clean power to the system.