I had a disk go into an error state a few days ago. I went through the spaceinvaderone video on XFS repair, ran SMART tests and all that jazz, and I can't actually find anything wrong with the disk. No errors or anything, so I re-added it to the array and started a parity rebuild of the drive data. Standard procedure, per my understanding - and I got through all that without any issues or questions.
However, I keep coming back to my server to check in on the status of that parity rebuild only to find the array is stopped and the parity rebuild never completed. It has happened 3 times now. No idea what's going on.
I do see a lot of weird errors in the syslog about a USB device not responding, and the GUI notifications show warnings that UPS communication is dropping in and out. My server is connected to its UPS via a USB cable, so I wonder if that could be related - however, other things plugged into that UPS appear fine, so I don't think the UPS is failing to supply power. And if that's not the issue, I'm not sure what else to do with the UPS to troubleshoot it.
I am also on a new USB boot drive. The previous one was the original drive I built the first iteration of my Unraid server with in like 2017, and it finally gave up the ghost. The replacement is a USB drive recommended for Unraid by spaceinvaderone, so I don't think that's what's causing the USB errors.
I added a Coral M.2 TPU and 2x 80mm exhaust fans recently. The Coral seems to be working fine, and I mention the exhaust fans because I saw in another thread that Squid mentioned CPU overheating can cause this kind of random shutdown. My CPU hasn't overheated before, and it has more fans now, so I'm pretty sure that's not it. The new fans went in because the drives can run a little hot on hot days, but recent ambient temperature has been low.
All told, I've made a fair number of changes recently, which makes this a little harder to troubleshoot, and I have yet to find the actual crash or shutdown or whatever is happening in the logs.
Still looking - I have a feeling the answer is in here somewhere - I'm just not super familiar with the log format and haven't tracked it down yet.
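One way to narrow the search would be to filter the syslog down to just the USB, UPS, and shutdown-related lines and see what shows up right before the array stops. Below is a minimal Python sketch of that idea. The keyword patterns and the function names (scan_syslog, summarize) are assumptions made up for illustration, not anything Unraid provides, and the patterns will probably need adjusting to match the real messages.

```python
import re
import sys
from collections import Counter

# Hypothetical keyword groups (assumptions, not an official list);
# tune these to whatever actually appears in your own syslog.
PATTERNS = {
    "usb": re.compile(r"\busb\b", re.IGNORECASE),
    "ups": re.compile(r"\b(ups|apcupsd)\b", re.IGNORECASE),
    "shutdown": re.compile(r"shutdown|shutting down|power ?off", re.IGNORECASE),
}


def scan_syslog(lines):
    """Return (line_number, label, text) for every line matching a pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label, line.rstrip("\n")))
    return hits


def summarize(hits):
    """Count how many matching lines fall under each label."""
    return Counter(label for _, label, _ in hits)


if __name__ == "__main__":
    # Usage: python scan_syslog.py <path to a syslog text file>
    with open(sys.argv[1], errors="replace") as handle:
        found = scan_syslog(handle)
    for number, label, text in found:
        print(f"{number:>7} [{label}] {text}")
    print(dict(summarize(found)))
```

Pointing it at the syslog text file extracted from the diagnostics zip prints each matching line with its line number, so USB or UPS messages that cluster just before a shutdown-related line are easier to spot.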
Diagnostics attached. Let me know if y'all have any ideas - thank you!
greenplanet-diagnostics-20240401-0755.zip