Unmountable: wrong or no FS (AFTER data rebuild)



Hrm...

 

So I had a disk fail (disk2 - 6TB); physically installed the new disk (14TB) in an open tray, leaving the 6TB attached (array is down), and cleared the new 14TB.

I then replaced the 6TB in the disk2 entry for the array with the new 14TB and started the data rebuild. Seems normal.

The rebuild finished, but the new 14TB disk2 now reads "Unmountable: Wrong or no file system", even though the entry still shows btrfs as the filesystem.

 

I've searched every sequence of terms I can think of, but can't seem to find anything covering this particular sequence of events.

 

What really worries me is that after the data rebuild finished, my docker img file is still not found (since I didn't configure it correctly to be on my SSD pool for containers at the start).

 

I have NOT used the "Format all unmountable disks" option on the array page.

 

Have I lost data in this rebuild, or is there something I'm missing to get this disk online correctly?


Thanks for the quick response, trurl!

 

I am quite sure the disk was failing; it is a ~5yr old HGST that has been in multiple server iterations and started throwing read errors in the week leading up to the replacement. The last scheduled scrub failed due to errors.

 

As for the connections, I also replaced a few older SATA cables with new ones during the replacement, and I have tried different cables/ports for this new 14TB drive, but it is the only one having this unmountable issue.

 

For clarity, here is the drive config for the main array:

  • Parity 1 - WD Gold 14TB
  • Disk 1 - Seagate X14 14TB
  • Disk 2 - Seagate X16 14TB (new, unmountable)
  • Disk 3 - HGST 6TB

I also have a second X16 14TB to replace the other HGST 6TB soon, as it is the same age and has seen the same use as the one that just failed.

 

Sorry, I forgot to attach diagnostics to the main post; they are attached here.

 

Thank you!

knowhere-diagnostics-20230319-0837.zip


Thanks. I had seen that entry in the FAQ, but was hoping it wouldn't come down to that and that there would be a root cause for the issue that could be resolved. At least I have the 2nd replacement drive to act as a temp store for this.

 

I'm curious, though: what's the likelihood of this happening on a data rebuild in unRAID? This was only the second rebuild so far (the first was upgrading disk 1). Is this just a coin flip of risk with btrfs, or with unRAID in general? Or are there additional steps in a disk upgrade/replacement that aren't documented? I didn't see anything specifying that I should create a filesystem on the new disk after preclearing it, so I didn't, assuming that was part of the rebuild process behind the scenes. Could that have contributed to this issue?

 

Is there anything I can do to prevent this from happening?


While a failing disk doesn't usually corrupt the filesystem, it can happen; it depends in part on how the disk fails and on the hardware involved, and it can happen with any filesystem. Note that you can see how the rebuilt disk will turn out by looking at the emulated disk: whatever is showing on the emulated disk is what is going to be on the rebuilt disk.
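A quick way to check that from the console before committing to the rebuild: with the array started, the emulated disk should be mounted under /mnt by slot number, and the syslog records why a mount failed. A rough sketch (disk2 is used here as an example; adjust for your own slot):

    # If the emulated disk2 mounts and its contents look sane,
    # the rebuild should produce the same result
    ls /mnt/disk2

    # If it is unmountable, the mount point will be missing or empty;
    # the syslog usually shows the btrfs mount error
    grep -iE 'disk2|btrfs' /var/log/syslog | tail -n 50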

Link to comment
  • Solution
2 minutes ago, nedpool said:

I'm curious, though: what's the likelihood of this happening on a data rebuild in unRAID? This was only the second rebuild so far (the first was upgrading disk 1). Is this just a coin flip of risk with btrfs, or with unRAID in general? Or are there additional steps in a disk upgrade/replacement that aren't documented? I didn't see anything specifying that I should create a filesystem on the new disk after preclearing it, so I didn't, assuming that was part of the rebuild process behind the scenes. Could that have contributed to this issue?

A rebuild will put on the replacement disk exactly what is showing on the emulated drive before the rebuild starts.  If the emulated drive is showing as unmountable, then the rebuilt one will be as well.   You do not create a new file system on the disk being rebuilt, because the rebuild process restores exactly what was on the emulated drive.

 

You can attempt a repair on the emulated drive before starting a rebuild.  At that point you still have the original disabled disk unchanged, in case it was just a glitch that caused it to be disabled, which gives you extra recovery options.
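In practice that means starting the array in Maintenance mode and running a read-only check against the emulated device first. A rough sketch (the md device number follows the slot, e.g. /dev/md2 for disk2, or /dev/md2p1 on newer releases; confirm yours before running anything, and only consider --repair after the read-only pass):

    # Array started in Maintenance mode, disk2 still emulated
    btrfs check --readonly /dev/md2

    # Only if the read-only check suggests the damage is repairable:
    # btrfs check --repair /dev/md2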

46 minutes ago, itimpi said:

If the emulated drive is showing as unmountable, then the rebuilt one will be as well.

 

Gotcha. That definitely explains it. As nice as it would be to automatically repair in such a situation, I completely understand the reasoning for not doing so and I should not have assumed it would.


Follow up:

The original 6TB must have failed spectacularly; I could only salvage ~50GB of the 4.5TB of data that was on the drive, and the same junk got synced to the new drive.

btrfs check --repair could not repair the filesystem on the new drive.

I ended up just formatting the new drive and accepting the loss of the media (98% of it replaceable).
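For anyone who lands here with the same symptoms: before formatting, a read-only salvage pass is worth attempting. btrfs restore copies whatever the damaged filesystem can still walk onto another disk without writing to the damaged device; a rough sketch (device and destination paths are illustrative, not my exact ones):

    # Read-only salvage from the damaged btrfs device to a healthy disk
    mkdir -p /mnt/disk3/salvage
    btrfs restore -v /dev/md2 /mnt/disk3/salvage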

 

I also now have a properly configured docker directory on the SSD pool thanks to this ordeal. Some lessons learned and, hopefully, a better system.

 

Thanks for the help!

