Multiple Disk Failures - Cause?


rts.empire

Hi all, 

 

A couple of days ago, I had a disk failure: disk 5 was disabled. No problem, I installed a new drive and went to rebuild, but the rebuild was estimated to take 180+ days. I checked my disk logs and found disk 3 was coming up with a bunch of IO errors. I cancelled the rebuild, changed out the cables and checked the connections for disk 3 in case I had bumped something when I put the new drive in.

Now disk 3 is active but states "unmountable / no file system". I've done a few restarts while trying a few other things with the dead disk to see if data was recoverable. Disk 3 will sometimes mount and be readable (but with a bunch of IO errors and very slow read rates), and it is now back to "unmountable".

 

My plan is to replace the two failed disks with two new ones arriving tomorrow and reconfigure (accepting the lost data). But after a couple of read errors on a third disk, I want to know whether these issues are likely caused by something else (LSI card or PSU), by my cancelling the rebuild when it was running slow, or whether it's just bad luck and two drives failed at once.

 

I have attached two diagnostics. The first one is from the initial failure. The second one is from just now, with the second disk's issues (the first failed disk is not mounted).

 

unraid-diagnostics-first.zip unraid-diagnostics-second.zip

Link to comment

There are still issues with disk3:

 

Sep 28 12:02:02 UNRAID kernel: sd 4:0:2:0: [sdk] tag#2187 UNKNOWN(0x2003) Result: hostbyte=0x0b driverbyte=DRIVER_OK cmd_age=0s
Sep 28 12:02:02 UNRAID kernel: sd 4:0:2:0: [sdk] tag#2187 CDB: opcode=0x88 88 00 00 00 00 00 00 00 00 40 00 00 00 08 00 00
Sep 28 12:02:02 UNRAID kernel: I/O error, dev sdk, sector 64 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Sep 28 12:02:02 UNRAID kernel: md: disk3 read error, sector=0

 

Try swapping cables with another disk to see where the problem follows. There's also no SMART report for that disk, so see if you can get one manually after that.
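
To see whether the errors follow the disk or stay with the port after a cable swap, it can help to tally the I/O errors in the syslog per device and note which SCSI address each device was on. Here is a minimal Python sketch, assuming the kernel line formats match the excerpt above. `summarize_io_errors` is a hypothetical helper name, not part of Unraid.

```python
import re
from collections import Counter

# Kernel block-layer error, e.g. "I/O error, dev sdk, sector 64 op 0x0:(READ) ..."
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")
# SCSI address (host:channel:target:lun) and device, e.g. "sd 4:0:2:0: [sdk]"
SCSI_ADDR = re.compile(r"sd (\d+:\d+:\d+:\d+): \[(\w+)\]")


def summarize_io_errors(lines):
    """Return {device: (last seen SCSI address or None, I/O error count)}."""
    errors = Counter()
    addresses = {}
    for line in lines:
        addr = SCSI_ADDR.search(line)
        if addr:
            addresses[addr.group(2)] = addr.group(1)
        err = IO_ERROR.search(line)
        if err:
            errors[err.group(1)] += 1
    return {dev: (addresses.get(dev), count) for dev, count in errors.items()}


# Sample lines copied from the syslog excerpt in this thread.
SAMPLE = [
    "Sep 28 12:02:02 UNRAID kernel: sd 4:0:2:0: [sdk] tag#2187 UNKNOWN(0x2003) Result: hostbyte=0x0b driverbyte=DRIVER_OK cmd_age=0s",
    "Sep 28 12:02:02 UNRAID kernel: sd 4:0:2:0: [sdk] tag#2187 CDB: opcode=0x88 88 00 00 00 00 00 00 00 00 40 00 00 00 08 00 00",
    "Sep 28 12:02:02 UNRAID kernel: I/O error, dev sdk, sector 64 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2",
    "Sep 28 12:02:02 UNRAID kernel: md: disk3 read error, sector=0",
]

print(summarize_io_errors(SAMPLE))
```

Running it on the syslog before and after the swap gives two summaries to compare. If the errors stay on the same SCSI address but move to a different device, that points at the cable or port rather than the disk.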

Link to comment

I've removed disk5 and changed cables on disk3, and it still says unmountable. It did run a short SMART test though (attached).

At this stage I can accept that the drive is probably dead, but since two went in quick succession I want to be sure that's all it is (and not other hardware or faults) before I pop more drives in.

 

Thanks for your help

unraid-smart-20230928-1920 (1).zip unraid-diagnostics-20230928-1930.zip

Link to comment

Once you have the cloned disk (it needs to be the same capacity as the original or it won't mount) you can try this. Note that it will only work if parity is valid:

 

-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and assign any missing disk(s) if needed, including the cloned disk3 and the new disk5, replacement disk5 should be same size or larger than the old one
-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on parity disk(s) will be overwritten, this is normal as it doesn't account for the checkbox, but it won't be as long as it's checked)
-Stop array
-Unassign disk5
-Start array (in normal mode now) and post new diags
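
Since the clone has to match the original's capacity, a quick check before going through the steps above might look like the sketch below. It assumes a Linux host and reads the sysfs `size` file, which reports the device size in 512-byte units. `device_bytes` and `same_capacity` are hypothetical helper names, and the device names in the usage comment are placeholders.

```python
from pathlib import Path

# /sys/block/<dev>/size is expressed in 512-byte units,
# regardless of the drive's logical sector size.
SECTOR_BYTES = 512


def device_bytes(dev, sysfs_root="/sys/block"):
    """Return the capacity of a block device (e.g. "sdk") in bytes."""
    size_file = Path(sysfs_root) / dev / "size"
    return int(size_file.read_text().strip()) * SECTOR_BYTES


def same_capacity(original, clone, sysfs_root="/sys/block"):
    """True if both devices report exactly the same capacity."""
    return device_bytes(original, sysfs_root) == device_bytes(clone, sysfs_root)


# Usage (placeholder device names, substitute the real ones):
#   same_capacity("sdk", "sdX")
```

The `sysfs_root` parameter only exists so the check can be tried against a test directory instead of the live system.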

 

Link to comment

Currently trying to complete the clone using ddrescue. My array is stopped and all the disks are unmounted so I can clone disk 3 to another disk of the same size.

However, while doing so my parity drive (which was fine when I restarted unRAID and first stopped the array) suddenly dropped out of the array. The parity spot now shows a "missing disk", and the parity drive is in unassigned devices, can't be assigned to the array and has a heap of read errors.

Meanwhile, the new disk I am attempting to clone Disk3 to is dropping in and out of the unassigned devices.

I could accept bad luck with two drives (which are older), but surely the parity drive (which is near new) hasn't given up the ghost too?

unraid-diagnostics-20230929-2359.zip

Capture.PNG

Capture2.PNG

Edited by rts.empire