September 16, 2023 So my Unraid system keeps locking up. Under normal use it will work for a while, then all the drives disappear, even the device list goes blank, and the system won't shut down. So the next time it boots up it runs a parity check, which also locks up. The point in the check where it locks up keeps going down. I have upgraded/replaced ALL the hardware in my system, including moving it over to a new chassis, a new HBA, a new USB drive, and new NVMe cache drives, and the problem persists. I have even reset my Unraid OS back to the stock OS, keeping only the array drives/parity, and I still cannot get it to quit. I can't complete a long health test on the drives before it locks up. I'm at a loss at this point. The only thing I can think to do is transfer the data from the drives to new drives a few at a time, do a health check, and stress test the old drives to see if something is wrong. But that is a F-ton of work. I would really appreciate some help. 20230915_195433.mp4 unraid-diagnostics-20230915-2015.zip
September 16, 2023 Community Expert There's a segfault in the log. To rule out any plugin issues, reboot in safe mode; if the problem is the same, post new diags.
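For anyone following along, here is a minimal sketch of one way to look for crash messages such as a segfault in a syslog extracted from a diagnostics zip. The pattern list is an assumption about common Linux kernel messages, not an official Unraid list, and the function name `find_crash_lines` is hypothetical. The log path is whatever you pass on the command line.

```python
import re
import sys

# Assumed set of strings that commonly mark a crash in a Linux syslog.
# This is illustrative only, not an exhaustive or official list.
CRASH_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (
        r"segfault",
        r"general protection fault",
        r"kernel panic",
        r"call trace",
    )
]


def find_crash_lines(log_text):
    """Return (line_number, line) pairs for lines matching any crash pattern."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in CRASH_PATTERNS):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_crashes.py <path-to-syslog-file>
    with open(sys.argv[1], errors="replace") as handle:
        for number, line in find_crash_lines(handle.read()):
            print(f"{number}: {line}")
```

Comparing the hits from a normal boot against those from a safe-mode boot is one way to see whether the same crash shows up without plugins loaded.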
September 16, 2023 Author It still locked up in safe mode. Here is the diagnostic file from after the array locked up. unraid-diagnostics-20230916-0942.zip
September 17, 2023 Community Expert The Unraid driver is crashing, which suggests a hardware problem. Start by using just two RAM sticks; if the result is the same, try the other two. That would basically rule out a RAM issue.
September 17, 2023 Author I tried that. I tried 1 RAM stick, all 3 other RAM sticks individually, as well as pairs of them. The problem is that I have replaced all the hardware since this started happening. It is all new Intel 13th gen. I've even tried going back to the old hardware (6800K) and still no luck. I replaced the HBA and the cables, moved everything to a new, larger rack-mount enclosure, installed a new power supply and new NVMe cache drives, removed the Nvidia GPU and the SFP+ network card from the system, and switched to a new Unraid USB stick. The only things I haven't tried replacing are the hard drives and Unraid itself, although I did reinstall Unraid fresh. As of yesterday I took the parity drives out (didn't format them, just removed them) and the system has been running the longest so far (19 hours) without crashing. The problem is that I can't make any changes or my parity will be invalid. I'm actually not even sure it is still valid, since the system could have made a change to something without my realizing it. All my critical data is backed up to other locations, so it's just movies, music, and TV shows here that I could lose. I'm currently trying to transfer the bulk of the data off the server onto new drives so I can try more aggressive tactics and don't have to worry about losing 100+ TB of data.
September 18, 2023 Community Expert The errors I'm seeing still suggest a hardware issue to me, though of course I cannot be certain.
September 19, 2023 Author I had an issue with the motherboard and had to RMA it, but that seems to have solved the problem. The RAM came back fine in memtest, but I'm trying 1 stick at a time at the moment. The only other hardware would be the CPU. I guess that will be my next thing to check if the RAM doesn't solve the problem.