Lime1028 Posted January 8, 2022
Hello all, I'm hoping you can help me. A power outage took down my Unraid server, and now it crashes every time it tries to complete a parity check, which it starts every time the drives are mounted. In other words, the server is dead in the water at the moment. It gets about 1-3% of the way through the parity check (it doesn't consistently stop at the same point), then the whole thing freezes. Even checking the logs or trying to pause the check is impossible. I set up an external log server to capture the syslog, which I have attached. I ran memtest and some drive tests, all of which came back clean. I'm at a loss at this point. Thanks to anyone taking the time to read this!
Attached: syslog.txt
itimpi Posted January 8, 2022
You may have a power supply issue, as a parity check is one time when all drives are being accessed at the same time. You can edit the config/disk.cfg file on the flash drive to change the startArray option to "no" to avoid starting the array during the boot sequence. That would give you the chance to disable the Docker and VM services to see if that makes a difference, and to investigate further. You can also try booting in Safe Mode (which stops any plugins from loading) to see if that has any effect.
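For anyone who would rather script the disk.cfg change than edit it by hand, here is a minimal sketch in Python. It assumes the file stores settings as one key="value" pair per line, with a line for startArray; the function name set_start_array and the sample text are made up for illustration, so check the actual file on your flash drive (and keep a backup copy) before writing anything back.

```python
def set_start_array(cfg_text: str, enabled: bool) -> str:
    """Return cfg_text with the startArray line set to "yes" or "no".

    Assumes one key="value" setting per line (an assumption about the
    disk.cfg layout, not taken from official documentation).
    """
    value = "yes" if enabled else "no"
    out = []
    found = False
    for line in cfg_text.splitlines(keepends=True):
        key, sep, _ = line.partition("=")
        if sep and key.strip() == "startArray":
            # Preserve the original line ending (\n or \r\n).
            ending = line[len(line.rstrip("\r\n")):]
            line = f'startArray="{value}"{ending}'
            found = True
        out.append(line)
    if not found:
        raise ValueError("no startArray line found; file left unchanged")
    return "".join(out)


# Hypothetical sample content, for illustration only.
sample = 'startArray="yes"\notherSetting="1"\n'
print(set_start_array(sample, enabled=False))
```

To use it on a real file, read config/disk.cfg from the flash drive into a string, pass it through set_start_array(text, enabled=False), and write the result back. Raising an error when the key is missing is deliberate, so the script never silently writes a file it did not understand.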
trurl Posted January 8, 2022
Since you don't have many drives, the PSU should be adequate unless it is actually failing, barring cable/connection issues. You should post your diagnostics; they give us a lot more information about your hardware and configuration than we can get from just a syslog. I usually don't even look at the syslog without diagnostics, since they can tell me what to look for in the syslog.
Lime1028 (Author) Posted January 9, 2022
11 hours ago, trurl said: "Since you don't have many drives unless the PSU is actually failing it should be adequate barring cable/connection issues. You should post your diagnostics, it gives us a lot more information about your hardware and configuration than we can get from just a syslog. I usually don't even look at syslog without diagnostics since they can tell me what to look for in syslog"
Thanks for the reply! As requested, I've attached diagnostics. The PSU is definitely not underpowered, as it ran for over a year in this system. However, it is possible it was damaged in the power outage, since the problems only began after that. I tried spinning up the disks on the server and it didn't freeze or have any issues, but that might not be the same level of load as a parity check.
Attached: tower-diagnostics-20220108-1730.zip
Lime1028 (Author) Posted January 10, 2022
13 hours ago, trurl said:
I've attempted everything in the highlighted comment regarding C-states and RAM speeds; unfortunately, the problem still persists. I'm not particularly surprised, seeing as those settings were at their default values before and it was working fine. Thanks for the info though.
Lime1028 (Author) Posted January 13, 2022
Does anyone else have any ideas about what might be causing these crashes/freezes? It still happens every time a parity check is attempted; the check usually runs for 5-10 minutes before the server freezes.
trurl Posted January 13, 2022
What is the exact model of your power supply? Any splitters?
Lime1028 (Author) Posted January 13, 2022
42 minutes ago, trurl said: "What is the exact model of your power supply? Any splitters?"
It's a Corsair RM750x. There are some splitters in use. Perhaps I'll open it up tomorrow morning and try redistributing the drives across the power connectors.
geeksheikh Posted November 1, 2023
@Lime1028 -- I'm having the same issue, did you ever figure out what the issue was?
geeksheikh Posted November 1, 2023
nvm -- I figured out the issue -- it was crashing because my CPU fan died -- hence overheating and shutting off. Replaced the fan / heat sink, all is good.
Lime1028 (Author) Posted November 2, 2023
8 hours ago, srfnmnk said: "nvm -- I figured out the issue -- it was crashing because my cpu fan died -- hence overheating and shutting off. replaced fan / heat sink, all is good."
Glad you were able to sort it out. In my case it turned out to be a dead CPU core. The core priority order was such that the dead core didn't get any of the normal load and was only put to use when the system was under heavy load, like a parity check. When the system sent something to that core, it would fail, and the system would crash.
AMD was a bit annoying to deal with on the RMA, as I had to ship the CPU to the US at my own cost (a bit crazy to ask for international shipping on an RMA), but they did confirm that it was broken and sent a replacement CPU. They didn't pre-pay the import tax on the replacement, though, and refused to when I asked about it, so I had to cover it. In the end, RMAing the CPU cost almost as much as buying a new one, as this was just a cheap Ryzen 3600. When I RMAed my 3080 after a shunt resistor blew, Asus just sent me a shipping label and a week later a new card was on my doorstep.
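For anyone who lands here with similar symptoms: a fault like this can hide because the scheduler rarely touches the bad core, so one way to look for it is to force work onto each core in turn. The following is a rough sketch in Python, not a substitute for a proper stress tool. It assumes a Linux host with Python 3 available (os.sched_setaffinity is Linux-only), and the function names burn and check_cores are made up for this example. It pins the process to one core at a time, runs a deterministic hashing loop, and compares the result against the first core's result.

```python
import hashlib
import os


def burn(rounds: int) -> str:
    """Deterministic CPU-bound work: repeatedly hash a fixed seed."""
    digest = b"seed"
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def check_cores(rounds: int = 200_000) -> dict:
    """Run burn() pinned to each allowed core; map core id -> result matched."""
    original = os.sched_getaffinity(0)  # cores this process may use
    results = {}
    reference = None
    try:
        for core in sorted(original):
            os.sched_setaffinity(0, {core})  # pin to a single core
            print(f"testing core {core}", flush=True)
            value = burn(rounds)
            if reference is None:
                # The first core tested is taken as the reference (an assumption).
                reference = value
            results[core] = value == reference
    finally:
        os.sched_setaffinity(0, original)  # restore the original affinity
    return results


if __name__ == "__main__":
    for core, ok in check_cores().items():
        print(f"core {core}: {'ok' if ok else 'MISMATCH'}")
```

A genuinely faulty core may well freeze the machine instead of returning a wrong value, which is why each core is announced before its test runs: the last "testing core N" line that made it to the screen or a remote log points at the suspect. A short loop like this is also a much lighter load than a parity check, so a clean pass does not rule the CPU out.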
geeksheikh Posted November 2, 2023
wow -- well, at least you took it all the way to the end. Good work. Dead CPU core, zoinks, super unlucky.