
Need Diag help with constant failing drives


Short version of the story....



Been using Unraid for a few years now... LOVE it.  Older hardware, but it has been running like a top, and I don't feel the need for more speed or power.  Fits my use case perfectly.


Four data drives, single parity.  Two SSDs, one for cache, one for VMs and random use...  The VM SSD lives in Unassigned Devices.  Parity and the SSDs are plugged into the motherboard SATA ports; the data drives are on an LSI controller.  Been running like this for years, never a hiccup.


Last week I had my first "Red X" on a drive.  Slot 4 was occupied by a WD 10TB.  Got the X, checked the SMART data, looked OK... rebuilt parity.  Ran fine for about two days, then dropped out again.  Decided to swap it out, so I ordered a refurb Exos 12TB.  Plugged the Exos into the controller and it showed up in Unassigned Devices, but it kept failing to format and failed pre-clear within 20 minutes.  Swapped the cables around on the controller, same result.  Plugged the Exos directly into the motherboard SATA ports and it seemed to work just fine: formatted, added to the array, rebuilt, 24+ hours, and all OK.  I then put the Exos back on the controller, but on the other controller port's cables, to test.  Seemed fine with the DiskSpeed docker and SMART.


Yesterday I put the old 10TB on motherboard SATA to run pre-clear.  The Exos showed up in Unassigned Devices while still being listed in the array... weird.  Pre-clear is still running on the 10TB without issue, 20 hours later.  The Exos has now dropped out again...  No more than two drives are on a single line from the power supply.


I just don't know enough about reading diagnostics...  lots of errors from sr0 (I don't have, and have never had, an optical drive).
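For anyone else who finds the syslog hard to read, one rough way to get a first overview is to count how many "trouble" lines mention each device. This is only a minimal Python sketch under stated assumptions: the device-name patterns (sdX, srN, ataN), the keyword list, and the sample lines are all made up for illustration and are not taken from the diagnostics in this thread.

```python
import re
from collections import Counter

# Assumption: devices appear as sdX / srN / ataN in the log text.
DEVICE_RE = re.compile(r"\b(sd[a-z]{1,2}|sr\d+|ata\d+)(?![a-z])")
# Assumption: these keywords are a rough guess at what "trouble" looks like.
TROUBLE_RE = re.compile(r"error|fail|reset|timeout", re.IGNORECASE)


def tally_trouble(lines):
    """Count lines containing a trouble keyword, grouped by device name."""
    counts = Counter()
    for line in lines:
        if TROUBLE_RE.search(line):
            # set() so a device named twice on one line is counted once
            for dev in set(DEVICE_RE.findall(line)):
                counts[dev] += 1
    return counts


# Made-up lines for illustration only, not real output from this system.
sample_lines = [
    "kernel: ata5.00: failed command: READ DMA EXT",
    "kernel: sd 1:0:0:0: [sdb] tag#0 I/O error",
    "kernel: sdc: all good",
]
for device, hits in tally_trouble(sample_lines).most_common():
    print(device, hits)
```

To try it on a real log, pass the lines of the syslog file instead of `sample_lines`. A device that dominates the tally is only a place to start looking, not a diagnosis.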



Please help ;)


Edited by Sniff

First thing to look at would be the connections.  SATA isn't particularly known as a robust connector, and in many cases the non-locking variety actually works better than the locking ones.  Reseat all cabling (SATA and power) at both ends.  Don't try any cable management (random cabling is better for EMI than nice neat wires), and minimize (or better yet, don't use) power splitters.





The Preclear failed at the final stage.




After the failure, I can no longer see or access disk 4 (Exos); it has completely dropped out...  lsblk doesn't show the disk attached either....
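A quick way to double-check which disks have gone missing is to compare the serial numbers you expect against what `lsblk` reports. This is a rough Python sketch, assuming a util-linux `lsblk` with `--json` support; the function names and the serials in the usage comment are hypothetical placeholders, not values from this system.

```python
import json
import subprocess


def parse_lsblk(json_text):
    """Map serial number -> device name for whole disks in lsblk JSON."""
    data = json.loads(json_text)
    disks = {}
    for dev in data.get("blockdevices", []):
        if dev.get("type") == "disk":
            # Fall back to the device name if no serial is reported.
            disks[dev.get("serial") or dev["name"]] = dev["name"]
    return disks


def missing_disks(expected_serials, json_text):
    """Return the expected serials that lsblk does not list."""
    present = parse_lsblk(json_text)
    return sorted(s for s in expected_serials if s not in present)


def current_lsblk_json():
    """Run lsblk for top-level devices only (-d) and return its JSON."""
    result = subprocess.run(
        ["lsblk", "--json", "-d", "-o", "NAME,TYPE,SERIAL"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Usage on the server itself (serials are placeholders):
# print(missing_disks(["SERIAL_OF_EXOS", "SERIAL_OF_10TB"], current_lsblk_json()))
```

If a serial is missing here as well as in the Unraid GUI, the kernel is not seeing the drive at all, which points toward power, cabling, the port, or the drive itself rather than anything in the array configuration.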



I unplugged all the power cables and rearranged them... no splitters used, no more than two drives on any one line from the PSU.  I removed the LSI card and plugged all six drives into the motherboard SATA ports using all new SATA cables.


Rebooted...  still no disk 4; it doesn't show in Unraid or on the command line.  The 10TB that failed pre-clear won't show either.  The original parity, the other three data disks, and the cache all show up and function just fine.


Ran memtest, all passed without issue....  At this point, am I looking at more of a power or motherboard problem?  I don't have another system to put the Exos or the 10TB drive into for testing...


When the Exos was attached and functioning (for a short time), this was taken from diskspeed docker.




Small update.....


Swapped power and SATA cables, rebooted, and the Exos showed up again.



Installed a new power supply and all new SATA cables...  ditched the LSI card for now, all drives plugged into the motherboard.  Tried a file check on the Exos, no good...  running a rebuild now, 18 hours in, still going...
