Workflow for recovering data from a corrupted XFS RAID volume


corgan

Hello

 

Today my server crashed and I needed to reboot.

After the reboot, the first array disk was missing and corrupted.

 

Disk:

ST16000NM001G-2KK103_ZL20FTBK (sdb)

No SMART errors

 

XFS, 68 TB array with 7 disks + 1 parity

 

Dec 12 17:10:48 serva4 kernel: SGI XFS with ACLs, security attributes, no debug enabled
Dec 12 17:10:48 serva4 kernel: XFS (md1): Mounting V5 Filesystem
Dec 12 17:10:48 serva4 kernel: XFS (md1): Ending clean mount
Dec 12 17:10:48 serva4 kernel: XFS (md1): Metadata CRC error detected at xfs_agi_read_verify+0x86/0xcf [xfs], xfs_agi block 0x67fffff9a 
Dec 12 17:10:48 serva4 kernel: XFS (md1): Unmount and run xfs_repair
Dec 12 17:10:48 serva4 kernel: XFS (md1): First 128 bytes of corrupted metadata buffer:
Dec 12 17:10:48 serva4 kernel: 00000000: 91 c7 3f e9 3d 0a f6 09 ca d3 88 6e 72 47 6c 25  ..?.=......nrGl%
Dec 12 17:10:48 serva4 kernel: 00000010: 65 72 51 16 6b 30 5d c4 34 ad 5d 56 bb f1 52 a6  erQ.k0].4.]V..R.
Dec 12 17:10:48 serva4 kernel: 00000020: 45 16 39 a1 ac c2 6e 03 a5 ab 9e c3 ce fb 23 e0  E.9...n.......#.
Dec 12 17:10:48 serva4 kernel: 00000030: 7a 27 f4 f6 51 34 ce 8e f1 e3 6e 04 02 5d 41 4a  z'..Q4....n..]AJ
Dec 12 17:10:48 serva4 kernel: 00000040: 44 27 61 60 6c e5 02 6b 6a 29 fb 58 f5 06 98 f1  D'a`l..kj).X....
Dec 12 17:10:48 serva4 kernel: 00000050: 54 90 4f 6b bd 55 3c 71 c9 61 10 a0 47 69 c9 22  T.Ok.U<q.a..Gi."
Dec 12 17:10:48 serva4 kernel: 00000060: 60 62 c6 25 26 60 59 92 65 ee 58 64 3b c4 fb 3e  `b.%&`Y.e.Xd;..>
Dec 12 17:10:48 serva4 kernel: 00000070: bb 41 07 79 60 3a fd 3f e0 fe 81 80 31 8d 9e 02  .A.y`:.?....1...
Dec 12 17:10:48 serva4 kernel: XFS (md1): metadata I/O error in "xfs_read_agi+0x7c/0xc8 [xfs]" at daddr 0x67fffff9a len 1 error 74
Dec 12 17:10:48 serva4 kernel: XFS (md1): Error -117 reserving per-AG metadata reserve pool.
Dec 12 17:10:48 serva4 kernel: XFS (md1): xfs_do_force_shutdown(0x8) called from line 540 of file fs/xfs/xfs_fsops.c. Return address = 000000009ffc26d9
Dec 12 17:10:48 serva4 kernel: XFS (md1): Corruption of in-memory data detected.  Shutting down filesystem
Dec 12 17:10:48 serva4 kernel: XFS (md1): Please unmount the filesystem and rectify the problem(s)
Dec 12 17:10:48 serva4 root: mount: /mnt/disk1: mount(2) system call failed: Structure needs cleaning.
Dec 12 17:10:48 serva4 emhttpd: shcmd (40): exit status: 32
Dec 12 17:10:48 serva4 emhttpd: /mnt/disk1 mount error: not mounted
Dec 12 17:10:48 serva4 emhttpd: shcmd (41): umount /mnt/disk1
Dec 12 17:10:48 serva4 root: umount: /mnt/disk1: not mounted.
Dec 12 17:10:48 serva4 emhttpd: shcmd (41): exit status: 32
Dec 12 17:10:48 serva4 emhttpd: shcmd (42): rmdir /mnt/disk1
Dec 12 17:10:48 serva4 emhttpd: shcmd (43): mkdir -p /mnt/disk2
Dec 12 17:10:48 serva4 emhttpd: shcmd (44): mount -t xfs -o noatime /dev/md2 /mnt/disk2
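
For anyone reading the log above: the two numeric codes ("error 74" and "Error -117") are ordinary Linux errno values, and the second one is where the "Structure needs cleaning" mount message comes from. Below is a minimal sketch in Python that pulls such codes out of syslog lines and names them. The function name `explain_errors` and the small lookup table are my own illustration, not part of Unraid or xfsprogs, and the table only covers the two codes that appear here.

```python
import re

# Linux errno values seen in the log above. They are hard-coded so the
# sketch does not depend on the platform it runs on (illustrative table,
# deliberately limited to these two codes).
LINUX_ERRNO = {
    74: ("EBADMSG", "bad message; XFS reports a failed metadata CRC check this way"),
    117: ("EUCLEAN", "structure needs cleaning; XFS reports detected corruption this way"),
}

# Matches "error 74" as well as "Error -117" (the sign is dropped).
ERROR_RE = re.compile(r"\b[Ee]rror -?(\d+)\b")


def explain_errors(log_lines):
    """Return a list of (code, name, meaning) for every 'error N' found."""
    found = []
    for line in log_lines:
        for match in ERROR_RE.finditer(line):
            code = int(match.group(1))
            name, meaning = LINUX_ERRNO.get(code, ("unknown", "not in this table"))
            found.append((code, name, meaning))
    return found
```

Feeding it the two XFS lines from the log yields EBADMSG for the AGI read and EUCLEAN for the failed per-AG reservation, which fits the "Metadata CRC error detected" message at the top.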

 

- Then I stopped the array

- Activated Maintenance Mode

- Checked the corrupted drive from the GUI, in test (no-modify) and verbose mode: -nv

No changes were written to the disk.
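
For reference, the GUI check described above corresponds to running xfs_repair with `-n` (no-modify) and `-v` (verbose). A minimal sketch of the same dry run from Python follows. The helper names are hypothetical, and the device path `/dev/md1` in the usage note is an assumption taken from the "XFS (md1)" lines in the log; adjust it to your own disk.

```python
import subprocess


def dry_run_command(device):
    """Build the argv for a read-only xfs_repair check of `device`."""
    # -n: no-modify mode (only report what would be done)
    # -v: verbose output
    return ["xfs_repair", "-n", "-v", device]


def run_dry_run(device):
    """Run the check and return (exit_status, combined_output).

    Requires root, xfsprogs installed, and an unmounted filesystem
    (on Unraid: array started in Maintenance Mode).
    """
    result = subprocess.run(
        dry_run_command(device), capture_output=True, text=True
    )
    return result.returncode, result.stdout + result.stderr
```

Calling `run_dry_run("/dev/md1")` would only report problems; because of `-n`, nothing is written to the disk.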

 

I got thousands of log lines like this:

entry "corporate bag Layers" in directory inode 14739857867 points to non-existent inode 28614441717, would junk entry

 

and around 75,000 of these:

 disconnected dir inode 19665158868, would move to lost+found

 

Complete Log
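
With a log that long, a quick tally by message type helps to judge how much would end up in lost+found before committing to a real repair. Here is a minimal sketch, assuming the dry-run output has been saved to a text file. The two patterns are based only on the two message types quoted above; everything else is counted as "other". The names `PATTERNS` and `tally` are just for illustration.

```python
import re
from collections import Counter

# Patterns for the two xfs_repair -n message types quoted in this post.
PATTERNS = {
    "would_junk_entry": re.compile(
        r"points to non-existent inode \d+, would junk entry"
    ),
    "would_move_to_lost_found": re.compile(
        r"disconnected (?:dir )?inode \d+, would move to lost\+found"
    ),
}


def tally(lines):
    """Count non-blank lines per message type; unmatched lines go to 'other'."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break
        else:
            counts["other"] += 1
    return counts
```

Usage would be something like `tally(open("xfs_repair.log"))`, where the file name is a placeholder for wherever the output was saved.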

 

So I read through the forum and the documentation. [1] [2]

But now I'm more confused than before.

What is the correct workflow to avoid losing any data, or ending up with thousands of files with wrong names in the lost+found folder?

 

Should I first repair the disk with xfs_repair, or simply swap the disk for a new one and let the array rebuild?

 

Array and Cache Disks:

grafik.thumb.png.1d2859be70709513b51e317c9f69245c.png

 


Depends upon the value to you of the data.

 

If it's 100% irreplaceable and your wife will never let you forget that the baby pictures are gone, then EaseUS is the way to go.  But in that case you should have a backup of it anyway, in which case do the repair.

 

If it's simply media that can be redownloaded, then I'd do the repair.

 

If you do the repair, then you can't use EaseUS afterwards, as the file structure will have been changed.

 

But if you rebuild onto another disk then you have the best of both worlds.  You can run the repair against the replacement disk and still have the original to run EaseUS on.

14 hours ago, Squid said:

if you rebuild onto another disk then you have the best of both worlds.  You can run the repair against the replacement disk and still have the original to run with easeus.

But note that the rebuild itself won't fix corruption. The rebuilt disk will likely have the same corruption as the original.

 

15 hours ago, Squid said:

If that works with Unraid filesystems, they don't make it apparent from a quick glance at the website.

3 hours ago, Squid said:

It supports XFS.  You run the trial version first to confirm before actually purchasing.

 

Thanks so much for the clarification. I will try the free version and hope for the best.

And there are indeed some baby photos of my kid on the drive (of which I luckily have a backup :) ).

9 hours ago, corgan said:

ATM I get the "Stopped. Missing disk." Message.

Why is there a missing disk? You were supposed to run xfs_repair on the replacement, and EaseUS on the original if that didn't work out.

On 12/12/2021 at 8:05 PM, Squid said:

if you rebuild onto another disk then you have the best of both worlds.  You can run the repair against the replacement disk and still have the original to run with easeus.

 

 

45 minutes ago, corgan said:

The replacement is "on the way"..

 

11 hours ago, corgan said:

Is it safe to start the array? ATM I get the "Stopped. Missing disk." Message.

With single parity, it should let you start the array if you only have a single missing disk. The usual method is to repair the emulated filesystem before rebuilding, though it doesn't matter much if you are not rebuilding to the same disk.

 

Of course, there is no protection from parity since there is already a missing disk, so it is only as safe as your backup plan.
