[6.9.2] Disk Unmountable: not mounted

September 4, 2021

Hi,

My server has 8 data disks and 1 parity disk. A couple of weeks ago I started the process of upgrading my disks from 8TB to 16TB. The parity disk, disk 1, disk 2 and disk 3 went fine. Just after finishing the rebuild on disk 3, disk 8 became unmountable. Since I was upgrading anyway, I went ahead and pulled disk 8, installed a new disk and kicked off the rebuild. It finished, but the new 16TB disk 8 was unmountable. I ran xfs_repair with no luck, then ran xfs_repair -L. The disk became mountable, but I lost 6TB of data. I paused on adding new drives while I tried to figure out which data were lost and how to recover them; a significant amount was in the lost+found directory.

All was stable for a couple of days, but now disk 3 has become unmountable. I've run both short and extended SMART self-tests and they come back clean. Running xfs_repair without parameters comes back with:

Quote

ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair. Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.

I do have the old 8TB disk 3, so if need be I can reformat and restore from the old disk.

Since this error has occurred on two different disk types on two different cables and two different controllers, I'm concerned that something needs correcting or this behavior may continue with other disks.

Any advice appreciated, diagnostics attached.

Many thanks,

Redbear

pretzel-diagnostics-20210903-2016.zip
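
The order of operations that the xfs_repair message above asks for can be written down as a short Python sketch. The device and mount point used here (/dev/md3 and /mnt/disk3) are assumptions for illustration, not values taken from the attached diagnostics, and the function only builds the command strings without running anything. On Unraid the same check is normally started from the GUI with the array in maintenance mode.

```python
import shlex
from typing import List


def repair_plan(device: str, mount_point: str) -> List[str]:
    """Return commands in the order the xfs_repair message recommends.

    Nothing is executed here; the caller decides whether to run each step.
    """
    dev = shlex.quote(device)
    mnt = shlex.quote(mount_point)
    return [
        f"mount -t xfs {dev} {mnt}",  # 1. mounting replays the log
        f"umount {mnt}",              # 2. unmount before running a repair
        f"xfs_repair -n {dev}",       # 3. no-modify mode: report only
        f"xfs_repair {dev}",          # 4. normal repair
        f"xfs_repair -L {dev}",       # 5. last resort: destroys the log
    ]


if __name__ == "__main__":
    # Hypothetical paths, assumed for illustration only.
    for step in repair_plan("/dev/md3", "/mnt/disk3"):
        print(step)
```

Step 5 is only reached if the mount in step 1 fails, which is the situation described in this post.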

September 4, 2021

00:17.0 RAID bus controller [0104]: Intel Corporation SATA Controller [RAID mode] [8086:2822]
    Subsystem: Gigabyte Technology Co., Ltd SATA Controller [RAID mode] [1458:b005]
02:00.0 RAID bus controller [0104]: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller [1b4b:9485] (rev c3)
    Subsystem: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller [1b4b:9480]

RAID mode is not recommended, and Marvell controllers are not recommended.

Which controller are you having problems with?

September 4, 2021

Author

Thanks for your time and response.

The first unmountable disk issue with disk 8 occurred on the Marvell controller.

The second/current unmountable disk issue with disk 3 is occurring on the Intel controller.

I can reboot the machine and place the controllers into AHCI mode instead of RAID mode. If I can stabilize the machine, I can move all of the data to four 16TB data drives while I look into replacing the Marvell-based controller. For what it's worth, the machine has been rock solid for three years.

Edited September 4, 2021 by redbear

September 4, 2021

4 hours ago, redbear said:

I can reboot the machine and place the controllers into AHCI mode instead of RAID mode

RAID mode is fine with Intel fakeRAID, it still uses the AHCI driver.

4 hours ago, redbear said:

For what it's worth the machine has been rock solid for three years.

I would run memtest; multiple filesystem corruptions without an apparent reason could be the result of bad RAM.

September 4, 2021

Author

Ok, kicked off the memtest a couple of hours ago. So far so good. Also found a couple of controllers lying about (Startech and LSI). I'll check them against the recommended list and potentially swap one in for the Marvell tonight after the memtest.

September 6, 2021

Author

The memtest ran overnight with three passes, no errors. I figured out the Startech is also Marvell-based, so I'll swap in the LSI as soon as I get a new breakout cable. Not sure I want to turn the array back on until I get rid of the Marvell-based card.

September 8, 2021

Author

Ok, I've removed the Marvell card (swapped with an LSI/Broadcom board) and changed the Intel controller to AHCI mode (@JorgeB, just to be safe).

The array is back online, disk3 is still unmountable. I no longer have the option to run an xfs_repair, since the disk's format now reads as "auto".

At this point I'm willing to move forward and replace or rebuild the disk. Based on the diags (new set attached), should I rebuild it from parity, replace it and rebuild it, or something else?

pretzel-diagnostics-20210907-2241.zip

September 8, 2021

A rebuild will never fix an unmountable state as the rebuild simply makes the physical drive match the emulated one.

If the format for the drive is set to ‘auto’ and you know what it was before, then you can set it explicitly, which will cause the repair option to be offered again. With it set to ‘auto’ the system cannot offer a repair if the file system type is not recognised, as it does not know which tool to use.
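
Why a rebuild cannot fix an unmountable disk can be shown with a toy model of single parity. This is a minimal Python sketch that assumes parity is a plain byte-wise XOR across the data disks; the disk contents and the helper name xor_blocks are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy array: three data disks of equal size (contents are made up).
disk1 = b"good data 1"
disk2 = b"good data 2"
disk3 = b"good data 3"

# Filesystem corruption is written through the array like any other write,
# so parity is updated to match the corrupted contents.
disk3_corrupt = b"BAD! data 3"
parity = xor_blocks([disk1, disk2, disk3_corrupt])

# Rebuilding disk3 (or emulating it) combines parity with the other disks.
rebuilt = xor_blocks([parity, disk1, disk2])

print(rebuilt == disk3_corrupt)  # the rebuild reproduces the corrupt state
print(rebuilt == disk3)          # the earlier good contents are not recovered
```

In this model the rebuilt disk is byte-for-byte the emulated one, so the filesystem damage comes along with it and still has to be repaired with the filesystem tool.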

September 8, 2021

Author

Thanks for the tip to set explicitly. I did that, and I was able to run xfs_repair -L.

The disk mounts now, and it's down about 6TB of data. Oddly similar amount to the first disk that became unmountable.

There were 8 8TB data drives in the machine originally. My goal is to end up with 5 16TB data drives. I have upgraded the parity drive and 4 of the data drives. Currently parity is valid.

My current thought is:

1. shut down docker and mover,

2. use Unbalance to copy data from disk4 up to disk1.

3. remove disk 4,

4. use its port to attach my old disk3

5. use Unassigned Devices & MC to copy the lost data back to the new disk3

6. use Unbalance to copy data from my remaining 8TB drives to the new 16TB drives

7. remove the remaining 8TB drives

8. add the fifth 16TB disk

9. rebuild parity
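
The capacity side of this plan is simple arithmetic, sketched below in Python. The disk counts and sizes come from this post; the 10% headroom and the function names are assumptions for illustration, and the figures are raw sizes rather than formatted usable space.

```python
OLD_DISKS, OLD_SIZE_TB = 8, 8    # original layout from the post
NEW_DISKS, NEW_SIZE_TB = 5, 16   # target layout from the post


def raw_capacity_tb(count, size_tb):
    """Raw capacity of `count` data disks of `size_tb` each."""
    return count * size_tb


def fits(used_tb, count, size_tb, headroom=0.10):
    """True if `used_tb` fits while keeping `headroom` free (assumed 10%)."""
    return used_tb <= raw_capacity_tb(count, size_tb) * (1 - headroom)


old_total = raw_capacity_tb(OLD_DISKS, OLD_SIZE_TB)  # 64
new_total = raw_capacity_tb(NEW_DISKS, NEW_SIZE_TB)  # 80

print(old_total, new_total)
print(fits(old_total, NEW_DISKS, NEW_SIZE_TB))
```

Even if all eight original disks had been full, the five larger disks would hold the data with room to spare under these assumptions.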

Thoughts?

September 8, 2021

Seems confusing to me and probably wrong. How does disk4 play into any of this?

Maybe new diagnostics would clarify.

September 8, 2021

Author

Fair, disk4 frees up a slot to bring back the old disk3 for the restore.

pretzel-diagnostics-20210908-1509.zip

Edited September 8, 2021 by redbear

September 8, 2021

Still seems wrong. With a missing disk and single parity you will no longer have parity protection. Then when you replace the missing disk you will have to rebuild it.

Also unclear how the new disks figure into this either. Are they going to replace other disks that already have data on them? No need to move or copy any data to replace a disk with a larger disk, just replace/rebuild.

I will be away for a few hours. See if you can explain what you want in more detail and we will see if we can come up with a plan.

September 9, 2021

Seems like a cheap USB dock or enclosure for attaching your original disk and accessing it with Unassigned Devices would simplify things.

September 9, 2021

Maybe I was just getting confused because you wanted to move data around for some reason.

Reviewing the first post obviously you know how to upsize disks by replace/rebuild.

I guess you could go with your plan of removing disk4, use its port to access old disk3 unassigned, then when done rebuild disk4 to a larger disk.

Not sure how moving things around with unbalance fits into all this though. Are you planning to remove some of these smaller disks and not replace them? And that is why you need to rebuild parity at the end?

September 9, 2021

Author

Yes, exactly, the idea is to remove the smaller disks and not replace them. That wasn't the original plan, but now I'm worried about another one of the drives becoming unmountable and not having a recent copy of the data.

September 9, 2021

If you are going to rebuild disk4 to a larger disk after you are finished stealing its port you shouldn't really need to move its data, but if it makes you feel better.

Do you have backups of everything important and irreplaceable?
