Froberg Posted June 21, 2019 Hi all, I came home today to a sudden server crash. I moved the whole build to a new case last week for better cooling, and there had been no issues since. Today the whole thing was unresponsive; IPMI still responded but showed only the tiniest amount of error data. I had to force a reboot before Unraid came back to life. The logs only cover the reboot onward, and it's running a parity check now. Can anyone tell me where I can find the logs from before the crash, if any exist, so I have a chance of locating the underlying cause? Cheers.
JorgeB Posted June 21, 2019 Logs are always from after rebooting. If you're on v6.7, you can enable the syslog server.
Froberg Posted June 21, 2019 A write error has popped up on a disk. Is it better to wait for the parity check to complete (24 hours) before replacing it, or should I stop the parity check and replace it ASAP? I don't see how this could explain my crash, though.
Froberg Posted June 21, 2019 johnnie.black said: "Logs are always from after rebooting. If you're on v6.7, you can enable the syslog server." Syslog enabled, with mirroring to flash. I didn't think it could log to a local syslog.
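Once a syslog copy survives a crash, the interesting part is usually the last few lines written before the machine came back up. A minimal sketch of pulling those out, assuming the saved log is plain text and that each boot begins with a kernel line containing "Linux version" (the marker, the function name, and the sample lines are all illustrative assumptions, not something taken from this server):

```python
from typing import Iterable, List


def tail_before_last_boot(
    lines: Iterable[str],
    boot_marker: str = "Linux version",  # assumed first kernel line of a boot
    count: int = 20,
) -> List[str]:
    """Return up to `count` lines logged just before the most recent boot."""
    if count <= 0:
        return []
    all_lines = [line.rstrip("\n") for line in lines]

    # Find the last line that looks like the start of a boot.
    boot_index = None
    for i, line in enumerate(all_lines):
        if boot_marker in line:
            boot_index = i

    # No boot marker at all: fall back to the end of the log.
    if boot_index is None:
        return all_lines[-count:]

    start = max(0, boot_index - count)
    return all_lines[start:boot_index]


if __name__ == "__main__":
    # Hypothetical log content, for illustration only.
    sample = [
        "Jun 20 23:58:01 tower kernel: something ordinary",
        "Jun 20 23:59:40 tower kernel: last message before the hang",
        "Jun 21 15:40:02 tower kernel: Linux version (next boot)",
        "Jun 21 15:40:03 tower kernel: booting normally",
    ]
    for line in tail_before_last_boot(sample, count=2):
        print(line)
```

To run it against a real file, pass the open file object, e.g. `tail_before_last_boot(open("/path/to/saved/syslog"))`, where the path is a placeholder for wherever your syslog copy ends up.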
JorgeB Posted June 21, 2019 Please post current diagnostics.
Froberg Posted June 21, 2019 Well, it just crashed again, so clearly something is up. I'll report back once I know more about what's going on.
Froberg Posted June 21, 2019 I found an uncorrectable ECC error in the IPMI event log from the time of the great crash, around midnight last night. For now I'm assuming that one of the memory sticks came slightly unseated while the server was being moved from the old chassis to the new one, since the server was unbootable until I took it down and reseated the RAM. (No new ECC entry in the event log to confirm, though.) I'm assuming the disk write error will come back, but it's odd given that I ran a full parity check just a few days ago. Maybe a RAM seating problem could cause issues with parity checking, too? The disk is only six months old. We'll see how it behaves. Diagnostics attached: fortytwo-diagnostics-20190621-1547.zip
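For anyone wanting to check their own event log for the same thing: a rough sketch of filtering an exported IPMI event log for ECC entries. It assumes the log has been saved as pipe-separated text, one event per line, which is the general shape that tools such as ipmitool print; the exact columns vary by board, and the sample lines below are made up for illustration rather than taken from this server.

```python
from typing import Iterable, List


def find_ecc_events(sel_lines: Iterable[str]) -> List[List[str]]:
    """Return the fields of every event line that mentions ECC in any column."""
    events = []
    for line in sel_lines:
        # Split on the pipe separator and trim padding around each field.
        fields = [field.strip() for field in line.split("|")]
        if any("ecc" in field.lower() for field in fields):
            events.append(fields)
    return events


if __name__ == "__main__":
    # Hypothetical event log lines; real column layout is an assumption.
    sample = [
        "1 | 06/21/2019 | 00:02:11 | Memory | Uncorrectable ECC | Asserted",
        "2 | 06/21/2019 | 15:40:00 | System Event | OEM System boot event | Asserted",
    ]
    for event in find_ecc_events(sample):
        print(" / ".join(event))
```

Matching on any column rather than a fixed position keeps the sketch tolerant of boards that order the fields differently.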
Froberg Posted June 21, 2019 The parity check is chugging along nicely with no errors reported so far; during the last boot-up they came very rapidly. The issue might be solved. Looks like I'll need an IPMI-capable motherboard next time I go for an upgrade, too.
Froberg Posted June 22, 2019 Yeah, parity check complete. The RAM reseat seems to have solved it. May still be of interest to some. 🙂