Possible Drive Failure



OK, I have successfully migrated to a full unRAID Pro license on the new USB, and re-organized my drives with the grouping and empty slots for planned additions (as I migrate data off the UD mounted devices, they will be pre-cleared and then added to the array pool). 

 

HOWEVER: one of my 8TB drives won't mount under unRAID; it shows errors and a red X beside its slot. The logs state that the XFS filesystem isn't clean. I moved that drive into a USB enclosure and it won't mount under UD either. I then attached it to my Linux Mint laptop, which is able to mount it, and I can see the data.

 

As I can likely re-copy the data to unRAID over the network, and I have enough free space to allow this, is that my best option rather than letting it try to rebuild? If I hadn't already added the empty but new 8TB drive to the array, I suspect I could have inserted it and let the drive rebuild from parity.

 

Something tells me that it will be faster just to leave the array as is and do the copy over the network. Can I use fsck from my Mint laptop to check and repair the XFS filesystem and then try re-inserting it into unRAID? Suggestions?
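A side note on the fsck question: for XFS the usual tool is xfs_repair rather than fsck (fsck.xfs is a stub that does nothing). Below is a minimal Python sketch of a read-only check, assuming the drive is unmounted, xfs_repair is installed on the laptop, and the script runs as root. The device path /dev/sdX1 is a placeholder, not something from this thread, and the helper names are made up for illustration.

```python
import subprocess


def xfs_check_command(device):
    """Build a no-modify xfs_repair command for an unmounted XFS partition."""
    # -n is no-modify mode: problems are reported, nothing is written to disk.
    return ["xfs_repair", "-n", device]


def run_xfs_check(device):
    """Run the dry-run check and return True if no corruption was reported."""
    result = subprocess.run(
        xfs_check_command(device), capture_output=True, text=True
    )
    print(result.stdout + result.stderr)
    # In -n mode, xfs_repair exits 0 when it detects no corruption
    # and 1 when it does.
    return result.returncode == 0


# Example (placeholder device, needs root and an unmounted filesystem):
# run_xfs_check("/dev/sdX1")
```

An actual repair would mean dropping the -n flag. Any write made to the disk outside the array would not be reflected in parity, so it is probably safer to stick to the dry run here.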

 

Dale

 

 


Yes, I did physically re-arrange the drives. I can try another slot on the hot-swap SATA backplane in my case, but other channels on that same row are functional. Still, could be one bad channel in the SFF-8087 to SFF-8087 cable for that row. I do have a new spare cable so I can try swapping it out as well.

 

Any reason to check the drive/filesystem since it's mountable on the Linux laptop?

 

11 minutes ago, Squid said:

My gut says no, but I'm not the filesystem guy here.

Dang it... the drive was seen after I re-inserted it into the system, but now unRAID says 'replacement disk inserted'. And if I start the array, it's going to do a full parity check or rebuild, based on the messages I'm seeing. I think I'll run out and grab another new 8TB drive, attach both to my laptop and try to copy the data off before doing anything else. It is an older 8TB that occasionally threw UDMA CRC errors, and I don't think I want to trust it through a parity check or data rebuild in unRAID.

 

I'd feel more comfortable making an 'offline backup'. My other option is to consider it 'lost data', delete the drive from the array and go ahead with replacing the parity drive (a 2-month-old 8TB) with the new 10TB Ironwolf, then let the parity rebuild occur, which has to happen anyway since I'm upgrading the parity drive.

 

While parity is rebuilding on the new parity drive, I could use the old 8TB parity drive (still almost new) as the recovery destination, with the potentially failing and now untrustworthy 8TB as the source. With luck this would let me recover all the data to a known good and trusted drive, which I can then attach via UD to migrate the data back to the array. Of course this would mean waiting for the parity rebuild on the new 10TB Ironwolf to finish before I'm able to copy the data back. A long process either way.

 

A few options.... think I'll take a break and think about it.

 

Dale

 

 


For anyone that cares, I decided on the following:

 

1. Created a new config with the new 10TB parity drive, without re-assigning the potentially failing disk.

2. Started the array and the parity rebuild on the new 10TB drive is underway.

3. Mounted the potentially failing disk in UD. This time it let me... not sure why it wouldn't previously.

4. Slowly copying data off the potentially failing disk while the parity rebuild is in progress. I realize that the entire array is unprotected until the parity rebuild completes - current estimate is about 24 hrs.

5. Started a pre-clear on the 2-month-old 8TB drive that was previously used for parity, just to ensure it's OK. I'll add it to the array pool once the pre-clear is done.

 

So for now I have a working unRAID that is actually running 6.7.1 stable. I'll slowly re-install any necessary plugins that aren't working, and the Docker containers and VM can wait until the parity rebuild is complete. A little painful, but that's the risk I took when adding the older 8TB drive that is potentially failing. It did pass the pre-clear when I originally added it to the array, but since it's 4.5 years old, I shouldn't have used it. Lesson learned.

 

So far the data is copying off the old drive with no errors, so with any luck I'll get it completed by the time the parity rebuild finishes.
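For anyone wanting to double-check a copy like the one in step 4, here is a rough Python sketch that copies a directory tree and compares SHA-256 checksums of each source and destination file. It is only an illustration, not what I actually ran; the function names are made up, and the mount points in the usage comment are placeholders.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files don't need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src_root, dst_root):
    """Copy every file under src_root to dst_root.

    Returns the list of source paths that could not be read, copied,
    or whose checksum did not match after copying.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    failed = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        try:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copies data plus timestamps
            if sha256_of(src) != sha256_of(dst):
                failed.append(src)
        except OSError:
            # A read error on the failing disk lands here; keep going
            # so one bad file doesn't stop the rest of the copy.
            failed.append(src)
    return failed


# Example with placeholder mount points:
# bad = copy_and_verify("/mnt/disks/old8tb", "/mnt/user/recovered")
# print(len(bad), "files need another look")
```

Note that hashing the source means reading each file a second time, which adds load on a disk that may be failing, and the second read may be served from the OS cache rather than the platters. Treat a clean result as reassurance, not proof.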
