May 3, 2025 So, in all honesty and transparency, I'm about at my wit's end with Unraid and about to move to TrueNAS due to this issue. I recently lost 15TB of data when 3 of my hard drives started to fail on my Unraid server. I initially thought it was bad cabling or a power issue. However, I have not only replaced the 3 hard drives, I have also replaced all cabling from the hard drives to the HBA, and the HBA itself. I also tested with a known working power supply from another rig and saw the same issues.

Unfortunately, I keep getting the same read errors during parity sync that eventually killed the file table and forced the drives into an "Unmountable: No file system" state that wouldn't recover even after using the xfs repair tool.

I have run a full suite of preclears on each of the disks: 3 cycles of pre-read, erase, zeroing, and post-read per drive. Each preclear finished with no errors or issues at all. I ran both short and extended SMART tests on each hard drive, and those also completed with no issues detected.

I have even tried pulling my two parity drives from the array and running a New Config so that I could preclear them as well, in case the parity was corrupted and a fresh parity sync after re-adding them would help. This also failed, with the same drives dropping out and spinning down during the sync due to read errors. And they are literally brand new.

I have no idea what is causing this issue. In my most recent attempt, 1 drive started throwing errors and the other 2 soon followed. Again, all cabling is new, the HBA is new, and the hard drives are new, but I'm having the same issue as with the old hard drives and cabling. Does anyone have any idea what may be causing this? Syslogs and diagnostics are attached. (Diagnostics were made after removing the failed drives and parity drives from the array again.)

multimediaserv-syslog-20250503-0011.zip multimediaserv-diagnostics-20250502-2125.zip Edited May 3, 2025 by ozma64
May 3, 2025 Author I just added them to the original post, as I forgot to attach them initially. Here they are again, though. multimediaserv-diagnostics-20250502-2125.zip Edited May 3, 2025 by ozma64
May 3, 2025 Community Expert Disk14 dropped offline. This is typically a power/connection issue, but post new diags after a reboot to check SMART.
May 3, 2025 Author Well, I started a new preclear to reset the drives and try again. I did reboot the server already, though. Once the preclear is over, I'll run another extended SMART test and provide you with an update. It should be done in about 43 hours or so. I'll also check the power again; maybe one of the SATA power extensions I have is going bad...
May 3, 2025 Community Expert 58 minutes ago, ozma64 said: "I'll run another extended SMART test and provide you with an update." Most likely it's not a disk problem, so you can just post the SMART report.
May 3, 2025 Author While this preclear runs to completion, I would like to ask something regarding the disk dropping offline. Why would this be happening only during parity syncs/checks? It doesn't happen at all during preclears, which read, erase, and zero the entire disk. The disk stays online the whole time without even a hiccup, but the moment I put it into the array and run a sync/check, it pops offline and errors out. Any ideas off the top of your head? I'll get the SMART reports as soon as I can; I don't want to disturb it until the preclears finish. Edited May 3, 2025 by ozma64
May 4, 2025 Community Expert Solution 4 hours ago, ozma64 said: "Why would this be happening only during the Parity Syncs/Checks?" All disks are involved in parity sync. Possibly a power problem?
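To illustrate why "all disks are involved in parity sync" matters here, below is a minimal sketch of single XOR parity. It is a simplified model, not Unraid's actual implementation (and a second parity drive uses a different calculation); the function name `xor_parity` and the sample bytes are made up for illustration. The point is only that computing parity for any stripe needs a read from every data disk at the same time, so every drive is busy at once.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks, one block per disk."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must be the same size")
    # zip(*blocks) walks the blocks column by column: one byte from each disk.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical 4-byte stripe from three data disks (illustrative values only).
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xff\x00\xff\x00"]

# Building parity reads the same offset on every data disk.
parity = xor_parity(stripe)

# Rebuilding one missing disk also needs parity plus every remaining disk.
rebuilt = xor_parity([parity, stripe[0], stripe[2]])
```

Here `rebuilt` equals `stripe[1]`, the block that was left out. A sync or check repeats this across the full length of every disk, so all drives are reading for the whole run.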
May 4, 2025 Author Well, I tested with another PSU, but it was an older one that was working in another rig. I'll order a new one and see what happens then. Still waiting on the preclears to end; I'll update once I've got the SMART reports.
May 6, 2025 Author Here is the SMART report for the drive that dropped offline in the diagnostics I provided. Sorry for the long wait, life pulled me away. Please let me know if there is anything unusual. multimediaserv-smart-20250505-1947.zip Edited May 6, 2025 by ozma64
May 6, 2025 Author Thank you for confirming this for me. Since I have replaced all the cabling and the HBA, I'm going to get a new and larger PSU for the JBOD box I made. I was using a small 400 Watt SFX form factor PSU for it. I thought that would be more than enough to run 6 of my hard drives on, so either it wasn't, or it is failing. I'll let you know if that fixes it once I receive it in the mail and get it hooked up.
May 31, 2025 Author Life kept me busy, but I was recently able to get a brand new PSU swapped in, and now parity checks are completing without issues. Thank you for the assistance.

I am going to rework my JBOD setup and get a more powerful PSU for my drives. Right now I'm using two 400 Watt SFX form factor PSUs to power 12 drives (6 on each). I think I figured out why the "known working" PSU from the other JBOD rig was not working: the drives it was powering were all 1TB drives, whereas the rig with failing drives held four 6TB drives. I suppose the higher-capacity drives took more power to spin than the 1TB drives. I think this model of PSU, though theoretically able to support the total wattage for 6 drives by far, just isn't able to push enough power out on its two SATA cables. Welp, buy cheap, get cheap.

I bought a 1000 Watt PSU with 4 cables for SATA and PREF. I think I'm going to add 4 more drives and split the power to 4 drives per cable. This beefier PSU "should" handle more power per line than the tiny PSUs... hopefully.
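For anyone wanting to sanity-check a similar layout, here is a rough sketch of the per-cable arithmetic described above. Every number in it is an assumption, not a measurement from this thread: the spin-up current per drive and the per-cable limits are placeholders that should be replaced with the figures from the drive labels/datasheets and the PSU documentation. The helper `check_layout` is hypothetical, and it only looks at 12 V current per cable, not the PSU's total rail capacity.

```python
def check_layout(total_drives, cables, spinup_amps_12v, cable_limit_amps):
    """Split drives as evenly as possible across cables.

    Returns one (drives_on_cable, worst_case_amps, within_limit) tuple per
    cable, assuming every drive on a cable spins up at the same moment.
    """
    if cables <= 0:
        raise ValueError("need at least one cable")
    base, extra = divmod(total_drives, cables)
    report = []
    for i in range(cables):
        drives = base + (1 if i < extra else 0)
        amps = drives * spinup_amps_12v
        report.append((drives, amps, amps <= cable_limit_amps))
    return report


# ASSUMED placeholder values; read the real ones from your hardware's specs.
ASSUMED_SPINUP_AMPS = 2.0       # 12 V draw of one drive while spinning up
ASSUMED_SMALL_PSU_LIMIT = 5.0   # what one cable of the small PSU can deliver
ASSUMED_LARGE_PSU_LIMIT = 10.0  # what one cable of the larger PSU can deliver

# Layout from the thread: 6 drives on 2 SATA cables of one small PSU.
old_layout = check_layout(6, 2, ASSUMED_SPINUP_AMPS, ASSUMED_SMALL_PSU_LIMIT)

# Planned layout: 16 drives on 4 cables of the larger PSU.
new_layout = check_layout(16, 4, ASSUMED_SPINUP_AMPS, ASSUMED_LARGE_PSU_LIMIT)

for drives, amps, ok in old_layout + new_layout:
    print(f"{drives} drives -> {amps:.1f} A worst case, within limit: {ok}")
```

Note that 4 drives per cable is a heavier per-cable load than the 3 per cable in the old layout, so the plan only works if each cable on the new PSU really can deliver more than the old ones could. Total wattage on the label does not answer that by itself.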