bonmot Posted October 9, 2021

Hi all - I'm on the second day of my Unraid trial. All seemed to be going well, but after a reboot (to clear the error log), two of my disks showed up as unmountable. Somehow I was able to get one of them to remount after xfs_repair and a reboot, but I still have one that won't mount. Here are the results of the latest xfs_repair:

Quote

root@Honeycrisp:~# xfs_repair -v /dev/md3
Phase 1 - find and verify superblock...
        - block cache size set to 536480 entries
Phase 2 - using internal log
        - zero log...
* ERROR: mismatched uuid in log
*            SB : f20cef9b-f18e-434f-9033-02501b097fcc
*            log: 73c6bb68-39f3-443a-95ec-370d865dc353
zero_log: head block 891157 tail block 891157
        - scan filesystem freespace and inode maps...
sb_icount 64, counted 32
sb_ifree 61, counted 29
sb_fdblocks 976277671, counted 976277675
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 3
        - agno = 2
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
SB summary counter sanity check failed
Metadata corruption detected at 0x47518b, xfs_sb block 0x0/0x200
libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x200
xfs_repair: Releasing dirty buffer to free list!
xfs_repair: Refusing to write a corrupt buffer to the data device!
xfs_repair: Lost a write to the data device!

fatal error -- File system metadata writeout failed, err=117. Re-run xfs_repair

I'm something of a newbie to the CLI - I'm definitely more comfortable with a GUI. I ran xfs_repair from the command line while the array was in maintenance mode. This particular drive was a new purchase. The only strange thing I can report is that when I tried to run preclear on it, it finished instantly. I tried this two or three times, then decided to just add it to the array and hope for the best.

Diagnostics attached. If anyone can provide advice, I'd be grateful. I'd prefer not to have to start over with a new array.

Cheers,
bonmot

honeycrisp-diagnostics-20211009-1231.zip
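For anyone who hits the same "mismatched uuid in log" and "Metadata corruption detected" messages, it can help to look at the on-disk superblock and do a no-modify pass before letting xfs_repair write anything. A minimal sketch, not from the thread itself - it assumes the same Unraid device name (/dev/md3) and that the array is in maintenance mode so nothing else has the filesystem mounted:

# read-only look at the superblock UUID (the "SB" value in the error above)
xfs_db -r -c "sb 0" -c "p uuid" /dev/md3

# no-modify pass: report what xfs_repair would change without writing to the disk
xfs_repair -n /dev/md3

If the no-modify pass reports only the counter mismatches but the real repair keeps dying at the same writeout step, that is at least consistent with the suggestion later in the thread that the problem was in the xfs_repair build rather than further filesystem damage.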
trurl Posted October 9, 2021

Start the array in normal mode and attach new diagnostics to your NEXT post in this thread.
bonmot (Author) Posted October 9, 2021

Oops - sorry about that. Here are the diagnostics with the array started in normal mode. Thanks!

honeycrisp-diagnostics-20211009-1307.zip
JorgeB Posted October 10, 2021

This looks like an xfs_repair problem. Update to v6.10-rc1, which includes a newer xfs-progs, and re-run xfs_repair.
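A rough outline of that suggestion as shell commands - a sketch only, since the thread doesn't state which xfs_progs version v6.10-rc1 ships:

# check which xfs_progs version the running Unraid release provides
xfs_repair -V

# after upgrading the OS, start the array in maintenance mode and re-run the repair
xfs_repair -v /dev/md3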
bonmot (Author) Posted October 11, 2021

Thanks, JorgeB. I got impatient and decided to blow away my array and start again, and I'm almost back to where I was (parity is being rebuilt now). I'm a little wary of moving to a potentially unstable release (RC1), but if it happens again I will do so.

Cheers!