jryan324 Posted March 11
At a bit of a loose end; never had an issue in 5 years but alas, here we are! Any help would be appreciated.

Unraid Version: 6.12.6

I had issues with the same 3 drives throwing errors at the same time, and it turned out they were the only ones plugged directly into the motherboard (whilst the others on the expansion card were fine), so I decided to replace the motherboard. That seemed to have fixed the issue, but it happened again a few weeks later. Now, after a reboot, everything comes back and works for an unknown amount of time, anywhere between a few days and a few weeks. When it happens, the mover seems to be stalled and it won't let me reboot, so I ran mover stop from the terminal, after which I could reboot.

I hope it's not just that all 3 drives are dead, but I might be that unlucky. Any advice on how to proceed would be great! I'm very rarely doing things on the server, just the usual stuff working away in the background like Nextcloud/Plex. I have a lot of information stored on Nextcloud that I would hate to lose or corrupt.

Let me know if anyone can help! Diagnostics attached from when I realised the drives had gone bad, taken before the reboot. Thanks!!
Attached: tower-diagnostics-20240311-1009.zip
JorgeB Posted March 11
There are ATA errors for multiple disks simultaneously. This can sometimes happen with Ryzen onboard SATA controllers, especially under load; the solution would be to use an add-on controller. It could also be a power-related issue.
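As a rough illustration of what "ATA errors for multiple disks simultaneously" can look like in a syslog, here is a minimal Python sketch that groups kernel ATA error lines by minute and reports the minutes in which two or more ports logged errors. The sample lines are made up for illustration and are not taken from the attached diagnostics; the hostname, the list of error phrases, and the function names are all assumptions.

```python
import re
from collections import defaultdict

# Matches traditional syslog lines such as
# "Mar 11 09:58:01 Tower kernel: ata3.00: exception Emask ..."
ATA_LINE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+kernel:\s+(ata\d+)(?:\.\d+)?:\s+(.*)$"
)

# Phrases treated as errors here; an assumed, non-exhaustive list.
ERROR_HINTS = ("exception Emask", "hard resetting link", "failed command")


def ata_errors_by_minute(lines):
    """Map 'Mon D HH:MM' to the set of ATA ports that logged an error then."""
    by_minute = defaultdict(set)
    for line in lines:
        m = ATA_LINE.match(line)
        if m and any(hint in m.group(3) for hint in ERROR_HINTS):
            by_minute[m.group(1)].add(m.group(2))
    return dict(by_minute)


def simultaneous(by_minute, min_ports=2):
    """Keep only the minutes in which at least min_ports ports had errors."""
    return {
        minute: sorted(ports)
        for minute, ports in by_minute.items()
        if len(ports) >= min_ports
    }


# Illustrative lines only (hypothetical host "Tower", hypothetical ports).
SAMPLE = [
    "Mar 11 09:58:01 Tower kernel: ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0xe frozen",
    "Mar 11 09:58:02 Tower kernel: ata4: hard resetting link",
    "Mar 11 09:58:03 Tower kernel: ata5.00: failed command: READ FPDMA QUEUED",
    "Mar 11 10:15:40 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
    "Mar 11 10:20:00 Tower kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0xe frozen",
]

if __name__ == "__main__":
    for minute, ports in simultaneous(ata_errors_by_minute(SAMPLE)).items():
        print(minute, ports)
```

Several ports failing within the same minute points more towards something shared (controller, power, cabling) than towards independent drive failures, which is the reasoning in the reply above.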
jryan324 Posted March 11
Hi! Thanks for the reply. I presume there would be no way to diagnose which of the two is the problem without trial and error? I do have a UPS supplying the server, so could it be that? Cheers
JorgeB Posted March 11
57 minutes ago, jryan324 said: "I do have a UPS supplying the server so could it be that?"
Unlikely. It could be a PSU issue; if you can, try with a different PSU. It could also be a power splitter issue if you are using them.
jryan324 Posted May 13
Hi! Sorry to revive this. I thought I had it sorted by replacing the SATA add-on controller I had, but it failed again yesterday. Here are the logs. Going to replace the PSU now and see if that helps. Cheers
Attached: QuickShare_2405131021.zip
jryan324 Posted May 13
I have attached the latest logs. Any assistance would be great, in case anything has changed.
JorgeB Posted May 13
The diags don't show the start of the problem. They show constant errors for sdd, but there's no SMART report for sdd. Any idea which device that is?
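One way to answer "which device is sdd?" on a Linux system is to look at the symlinks under /dev/disk/by-id, which usually carry the model and serial number. The following is a minimal sketch, assuming that directory exists on the server; the function name and the example link name in the usage note are hypothetical.

```python
import os


def ids_for_device(dev_name, by_id_dir="/dev/disk/by-id"):
    """Return the by-id link names that resolve to /dev/<dev_name>."""
    matches = []
    for entry in sorted(os.listdir(by_id_dir)):
        # Each entry is normally a symlink such as "../../sdd".
        target = os.path.realpath(os.path.join(by_id_dir, entry))
        if os.path.basename(target) == dev_name:
            matches.append(entry)
    return matches


if __name__ == "__main__":
    # Prints whatever identifiers the system exposes for sdd, if any.
    for name in ids_for_device("sdd"):
        print(name)
```

On a typical system the output would be names of the form ata-MODEL_SERIAL, which can then be matched against the drive assignments shown in the Unraid web interface. Device letters such as sdd can change between boots, so the lookup only describes the current boot.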
jryan324 Posted May 13
I have reattached the zip file just in case I broke it whilst extracting. The sdd device is the cache drive. Whenever the other two drives throw errors, everything on the server just stops working and all dockers stop responding, but then a reboot fixes everything and clears the errors.
Attached: tower-diagnostics-20240512-1052.zip
JorgeB Posted May 13
Those look similar. Reboot to clear the logs and post new diags as soon as there are issues logged with the cache device.
jryan324 Posted May 13
The cache drive has never logged anything wrong. I think the first time this all happened, it reported that the UDMA CRC error count had increased, but that stopped after a while.
JorgeB Posted May 13
But it's spamming the log. If it still does that after a reboot, post new diags.
jryan324 Posted May 13
Rebooted and everything seems fine. Logs attached.
Attached: tower-diagnostics-20240513-1405.zip
JorgeB Posted May 13
Keep an eye on the log. If new errors start appearing, post new diags.
jryan324 Posted Friday at 12:34 PM
It had been stable for a while, but I did a big search on Sonarr overnight and woke up to some drives throwing errors. Diags attached.
Attached: tower-diagnostics-20240531-1332.zip
JorgeB Posted Friday at 12:46 PM
The syslog has already rotated, so I cannot see the beginning of the problem, but based on what I can see it looks like a power/connection issue. Check/replace the cables for cache and disk3.