unmountable disk, xfs_repair not working

daemian · May 21

So I have been having some issues with my Unraid server recently (crashing, likely because of macvlan). I *think* we may have gotten to the bottom of that. I started a new thread as I don't think this issue is related to that (though I could be wrong).

I started my array yesterday, it was running through the parity check and all seemed to be well.

Fast forward to this morning and suddenly in the main page I see:

Unmountable disk present:
Disk 2 • HGST_HDN726040ALE614_N8H1H2EZ (sde)

This disk is not disabled (no red icon). I am doing my best to follow these instructions in the docs.

I tried to stop the array, and it got stuck on syncing filesystem. Tried several things, eventually had to hard power it down. Brought it back up and started it in maintenance mode and ran xfs_repair on disk 2 with the -n flag and got:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_fdblocks 458152498, counted 459133020
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
inode 2584779900 - bad extent starting block number 4503567558284715, offset 0
correcting nextents for inode 2584779900
bad data fork in inode 2584779900
would have cleared inode 2584779900
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
entry "4096-4096-max.png" in shortform directory 2584779899 references free inode 2584779900
would have junked entry "4096-4096-max.png" in directory inode 2584779899
inode 2584779900 - bad extent starting block number 4503567558284715, offset 0
correcting nextents for inode 2584779900
bad data fork in inode 2584779900
would have cleared inode 2584779900
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
entry "4096-4096-max.png" in shortform directory inode 2584779899 points to free inode 2584779900
would junk entry
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
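
For anyone skimming a long -n run like the one above, here is a rough way to pull out just the changes xfs_repair says it would make. This is only a sketch: summarize_dry_run is a made-up helper, it assumes the output has been saved as text, and it keys off the "would ..." wording seen in this particular output rather than any documented format.

```python
import re


def summarize_dry_run(output: str) -> dict:
    """Collect planned actions and inode numbers from `xfs_repair -n` text."""
    planned = []
    inodes = set()
    log_dirty = False
    for line in output.splitlines():
        text = line.strip()
        # The ALERT/ERROR about the log both contain this phrase.
        if "valuable metadata changes in a log" in text:
            log_dirty = True
        # In -n mode, intended changes are reported as "would ..." lines.
        if text.startswith("would "):
            planned.append(text)
            inodes.update(int(n) for n in re.findall(r"inode (\d+)", text))
    return {"log_dirty": log_dirty, "planned": planned, "inodes": sorted(inodes)}


# A few lines copied from the output above.
sample = """\
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
inode 2584779900 - bad extent starting block number 4503567558284715, offset 0
would have cleared inode 2584779900
would have junked entry "4096-4096-max.png" in directory inode 2584779899
would junk entry
"""
summary = summarize_dry_run(sample)
print(summary)
```

Here that boils down to one file ("4096-4096-max.png") whose inode would be cleared and whose directory entry would be dropped, plus a dirty log.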

Then tried running it again without -n and got:


Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

Not sure if I should try the -L option. I googled and saw another post that asked about the SMART report for the drive, as it may be failing, so I am attaching that here as well.

Looking for any advice on how to proceed. I don't know if this is related to the recent crashes or purely coincidental. Perhaps one of the hard crashes caused the issue and it took a bit to manifest. I don't know.

Thank you for your help. Greatly appreciated!

HGST_HDN726040ALE614_N8H1H2EZ-20240521-1119.txt

Edited May 21 by daemian

JorgeB · May 21

12 minutes ago, daemian said:

Not sure if I should try the -L option.

Yes, that's the only option, and SMART looks OK.
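
To spell out the escalation the error message describes, here is a small sketch that only builds the command lines and does not run anything. The device name /dev/md2p1 is an assumption for disk 2 in maintenance mode (the naming differs between Unraid releases, so check yours), and repair_command is a hypothetical helper; -n and -L are the flags already discussed in this thread.

```python
def repair_command(device: str, stage: str) -> list[str]:
    """Return the xfs_repair argument list for one stage of the escalation."""
    flags = {
        "check": ["-n"],     # read-only: report problems, change nothing
        "repair": [],        # normal repair; stops if the log still needs replay
        "zero-log": ["-L"],  # last resort: destroy the log, then repair
    }
    if stage not in flags:
        raise ValueError(f"unknown stage: {stage}")
    return ["xfs_repair", *flags[stage], device]


# Assumption: array device for disk 2; adjust to match your system.
DEVICE = "/dev/md2p1"

for stage in ("check", "repair", "zero-log"):
    print(" ".join(repair_command(DEVICE, stage)))
```

The -L step only makes sense once mounting to replay the log has failed, which is the case here since the disk shows as unmountable.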

daemian · May 21

Thanks as always, JorgeB.

I ran it with -L and here is the output:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
sb_fdblocks 458152498, counted 459133020
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
inode 2584779900 - bad extent starting block number 4503567558284715, offset 0
correcting nextents for inode 2584779900
bad data fork in inode 2584779900
cleared inode 2584779900
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 0
        - agno = 3
        - agno = 2
entry "4096-4096-max.png" in shortform directory 2584779899 references free inode 2584779900
junking entry "4096-4096-max.png" in directory inode 2584779899
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (15:874858) is ahead of log (1:2).
Format log to cycle 18.
done
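
As a side note on the "sb_fdblocks 458152498, counted 459133020" line: it compares the free-block count stored in the superblock with what xfs_repair counted itself. A quick back-of-the-envelope sketch of the size of that gap follows. The 4096-byte block size is an assumption (the usual XFS default; xfs_info would show the real value for this disk).

```python
SB_FDBLOCKS = 458152498  # free-block count recorded in the superblock
COUNTED = 459133020      # free blocks xfs_repair counted during the scan
BLOCK_SIZE = 4096        # assumption: default XFS block size in bytes

diff_blocks = COUNTED - SB_FDBLOCKS
diff_bytes = diff_blocks * BLOCK_SIZE

print(f"{diff_blocks} blocks, about {diff_bytes / 2**30:.2f} GiB")
```

A stale counter like this is not surprising after an unclean shutdown, and the follow-up -n run no longer reports it.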

Out of curiosity I ran it again with -n:


Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

Looks good to my eye.

Started the array successfully now and things seem to be working 🤞

Appreciate the help as always. I will report back if I have more trouble.
