Unmountable xfs disk - XFS (dm-2): Internal error xfs_efi_item_recover


Solved by JorgeB

The Problem:
I experienced a lockup while stopping the array, followed by an unclean shutdown, and now have an unmountable disk. There were no known issues prior to this error.

What Happened: 

I was preparing for a manual parity check and followed my normal procedure: shut down running services, stop the array, reboot, and run the parity check. I shut down all VMs and containers and clicked Stop Array, but the system hung while unmounting the disks. I left the system overnight and came back in the morning to no progress. I did a hard shutdown via the power button, unplugged the server power, then booted back up. I started the array in maintenance mode and ran a parity check. There was a notification for the unclean shutdown. The parity check completed ~24 hours later (a normal duration) with no issues. I stopped the array and started it back in normal mode. Disk 3 in my array was detected as unmountable. I stopped the array, shut down, unplugged power, and booted back up. Disk 3 was still unmountable. I stopped the array again, pulled the diagnostic log, and now I'm here.

 

syslog entry for disk 3: 
 

Jul  4 11:49:06 solidsnake emhttpd: mounting /mnt/disk3
Jul  4 11:49:06 solidsnake emhttpd: shcmd (115): mkdir -p /mnt/disk3
Jul  4 11:49:06 solidsnake emhttpd: shcmd (116): mount -t xfs -o noatime,nouuid /dev/mapper/md3p1 /mnt/disk3
Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): Mounting V5 Filesystem
Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): Starting recovery (logdev: internal)
Jul  4 11:49:06 solidsnake kernel: 00000000: 36 12 01 00 01 00 00 00 40 d4 b7 3c 84 88 ff ff  6.......@..<....
Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): Internal error xfs_efi_item_recover at line 614 of file fs/xfs/xfs_extfree_item.c.  Caller xlog_recover_process_intents+0x9c/0x25e [xfs]
Jul  4 11:49:06 solidsnake kernel: CPU: 12 PID: 14282 Comm: mount Tainted: P           O       6.1.64-Unraid #1
Jul  4 11:49:06 solidsnake kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi Ultimate, BIOS P3.10 04/25/2019
Jul  4 11:49:06 solidsnake kernel: Call Trace:
Jul  4 11:49:06 solidsnake kernel: <TASK>
Jul  4 11:49:06 solidsnake kernel: dump_stack_lvl+0x44/0x5c
Jul  4 11:49:06 solidsnake kernel: xfs_corruption_error+0x63/0x83 [xfs]
Jul  4 11:49:06 solidsnake kernel: ? xlog_recover_process_intents+0x9c/0x25e [xfs]
Jul  4 11:49:06 solidsnake kernel: xfs_efi_item_recover+0x92/0x1a8 [xfs]
Jul  4 11:49:06 solidsnake kernel: ? xlog_recover_process_intents+0x9c/0x25e [xfs]
Jul  4 11:49:06 solidsnake kernel: xlog_recover_process_intents+0x9c/0x25e [xfs]
Jul  4 11:49:06 solidsnake kernel: ? preempt_latency_start+0x2b/0x46
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Jul  4 11:49:06 solidsnake kernel: xlog_recover_finish+0x2b/0x290 [xfs]
Jul  4 11:49:06 solidsnake kernel: ? xfs_ag_resv_init+0x164/0x1af [xfs]
Jul  4 11:49:06 solidsnake kernel: xfs_log_mount_finish+0x5a/0x111 [xfs]
Jul  4 11:49:06 solidsnake kernel: xfs_mountfs+0x5c6/0x73b [xfs]
Jul  4 11:49:06 solidsnake kernel: xfs_fs_fill_super+0x683/0x761 [xfs]
Jul  4 11:49:06 solidsnake kernel: ? xfs_open_devices+0x184/0x184 [xfs]
Jul  4 11:49:06 solidsnake kernel: get_tree_bdev+0x1d5/0x229
Jul  4 11:49:06 solidsnake kernel: vfs_get_tree+0x1c/0x8a
Jul  4 11:49:06 solidsnake kernel: path_mount+0x62f/0x70d
Jul  4 11:49:06 solidsnake kernel: do_mount+0x5c/0x8d
Jul  4 11:49:06 solidsnake kernel: __do_sys_mount+0x100/0x12e
Jul  4 11:49:06 solidsnake kernel: do_syscall_64+0x6b/0x81
Jul  4 11:49:06 solidsnake kernel: entry_SYSCALL_64_after_hwframe+0x64/0xce
Jul  4 11:49:06 solidsnake kernel: RIP: 0033:0x14780a0c9eea
Jul  4 11:49:06 solidsnake kernel: Code: 48 8b 0d 31 1f 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fe 1e 0d 00 f7 d8 64 89 01 48
Jul  4 11:49:06 solidsnake kernel: RSP: 002b:00007ffcbd4cdd18 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
Jul  4 11:49:06 solidsnake kernel: RAX: ffffffffffffffda RBX: 000000000040f380 RCX: 000014780a0c9eea
Jul  4 11:49:06 solidsnake kernel: RDX: 000000000040f5d0 RSI: 000000000040f650 RDI: 000000000040f5b0
Jul  4 11:49:06 solidsnake kernel: RBP: 0000000000000000 R08: 000000000040f610 R09: 0000000000000060
Jul  4 11:49:06 solidsnake kernel: R10: 0000000000000400 R11: 0000000000000206 R12: 000000000040f5b0
Jul  4 11:49:06 solidsnake kernel: R13: 000000000040f5d0 R14: 000014780a25efa4 R15: 000000000040f498
Jul  4 11:49:06 solidsnake kernel: </TASK>
Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): Corruption detected. Unmount and run xfs_repair
Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): Failed to recover intents
Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): Filesystem has been shut down due to log error (0x2).
Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): Please unmount the filesystem and rectify the problem(s).
Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): Ending recovery (logdev: internal)
Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): log mount finish failed
Jul  4 11:49:06 solidsnake root: mount: /mnt/disk3: mount(2) system call failed: Structure needs cleaning.
Jul  4 11:49:06 solidsnake root:        dmesg(1) may have more information after failed mount system call.
Jul  4 11:49:06 solidsnake emhttpd: shcmd (116): exit status: 32
Jul  4 11:49:06 solidsnake emhttpd: /mnt/disk3 mount error: Unsupported or no file system
Jul  4 11:49:06 solidsnake emhttpd: shcmd (117): rmdir /mnt/disk3
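
For anyone reading a similar syslog, the call trace is mostly noise; the lines that matter are the ones the XFS driver prints itself. A minimal sketch of pulling those out, assuming the syslog line format shown above (the function name `xfs_messages` and the regular expression are illustrative, not part of any Unraid tool):

```python
import re

# Matches kernel lines of the form "... kernel: XFS (dm-2): message".
# The pattern is an assumption based on the syslog excerpt in this post.
XFS_LINE = re.compile(r"kernel: XFS \((?P<dev>[^)]+)\): (?P<msg>.*)$")


def xfs_messages(syslog_lines):
    """Return (device, message) pairs for XFS kernel lines, skipping the call trace."""
    found = []
    for line in syslog_lines:
        match = XFS_LINE.search(line)
        if match:
            found.append((match.group("dev"), match.group("msg").strip()))
    return found


# A few lines copied from the syslog excerpt above.
sample = [
    "Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): Starting recovery (logdev: internal)",
    "Jul  4 11:49:06 solidsnake kernel: dump_stack_lvl+0x44/0x5c",
    "Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): Corruption detected. Unmount and run xfs_repair",
    "Jul  4 11:49:06 solidsnake kernel: XFS (dm-2): Failed to recover intents",
]

for dev, msg in xfs_messages(sample):
    print(f"{dev}: {msg}")
```

Read this way, the excerpt boils down to: log recovery started, corruption was detected while recovering intents, and the mount was aborted.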

 

Next steps:
Looking through some of the other posts, it would appear that I need to run a file system check. The logs mention running xfs_repair. Before I proceed with anything else, I wanted to confirm what exactly my next steps should be to maximize chances of recovering the disk intact. 

Thank you for any help you can provide. 

solidsnake-diagnostics-20240704-1153.zip

20 minutes ago, itimpi said:

First thing is to run a check filesystem via the GUI.   If run with -n (the default) nothing is done but the check results might give a clue as to how well a repair would go so post those results here for feedback.


I started the array in maintenance mode per the XFS instructions. Here is the output from the filesystem check -n on Disk 3. If I'm reading this correctly, it looks like a single file has an issue, and would be rebuilt if modify was allowed? 

 

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_fdblocks 1489953900, counted 1503578523
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
inode 15068100473 - bad extent starting block number 4503567550935200, offset 0
correcting nextents for inode 15068100473
bad data fork in inode 15068100473
would have cleared inode 15068100473
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 7
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 6
        - agno = 5
entry "Phil Hine - Tantrum Magick.pdf" at block 0 offset 2672 in directory inode 15066855983 references free inode 15068100473
	would clear inode number in entry at offset 2672...
inode 15068100473 - bad extent starting block number 4503567550935200, offset 0
correcting nextents for inode 15068100473
bad data fork in inode 15068100473
would have cleared inode 15068100473
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
entry "Phil Hine - Tantrum Magick.pdf" in directory inode 15066855983 points to free inode 15068100473, would junk entry
bad hash table for directory inode 15066855983 (no data entry): would rebuild
would rebuild directory inode 15066855983
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
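
To double-check the reading that only a single file is affected, the dry-run report can be summarized mechanically. This is a rough sketch that assumes the message wording shown in the output above; the function name `summarize_check` and the patterns are illustrative, and other xfs_repair versions may phrase things differently:

```python
import re


def summarize_check(report):
    """Collect what an `xfs_repair -n` report says it would change."""
    # Inodes the repair would clear (reported once per phase, so de-duplicate).
    cleared = {int(n) for n in re.findall(r"would have cleared inode (\d+)", report)}
    # Directory entries that point at a free inode and would be junked.
    junked = set(
        re.findall(
            r'entry "([^"]+)" in directory inode \d+ points to free inode \d+, would junk entry',
            report,
        )
    )
    # The ALERT text means the log was not replayed, so some findings may be spurious.
    log_dirty = "valuable metadata changes in a log" in report
    return {
        "cleared_inodes": sorted(cleared),
        "junked_entries": sorted(junked),
        "log_dirty": log_dirty,
    }


# Lines copied from the check output above.
sample = """ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
inode 15068100473 - bad extent starting block number 4503567550935200, offset 0
would have cleared inode 15068100473
entry "Phil Hine - Tantrum Magick.pdf" in directory inode 15066855983 points to free inode 15068100473, would junk entry
would have cleared inode 15068100473
"""

print(summarize_check(sample))
```

On this report the summary lists one inode and one directory entry, with the dirty-log flag set, which is consistent with a single damaged file plus an unreplayed log.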

 


Reran the check without the -n option, but it would not proceed. Did as suggested and ran it with -L. The repair appears successful. I stopped the array and started it back in normal mode. Disk 3 mounted successfully. Everything looks normal.

Thank you both for your responses and help. Unless further checks are necessary, I'll mark this as solved.
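
For reference, the order of attempts described in this thread can be written down as a dry-run plan. This sketch only prints the commands and does not execute anything. The device path is taken from the mount line in the syslog earlier in the thread and is an assumption for any other system; on Unraid the check is normally launched from the GUI with the array in maintenance mode, and `repair_plan` is a hypothetical helper name.

```python
import shlex


def repair_plan(device):
    """Return the xfs_repair invocations in the order they were tried here."""
    return [
        ("check only, makes no changes", ["xfs_repair", "-n", device]),
        ("repair; will not proceed while the log is dirty", ["xfs_repair", device]),
        ("last resort: zero the log, then repair", ["xfs_repair", "-L", device]),
    ]


# Device path as it appears in the syslog mount command (assumed for illustration).
for why, argv in repair_plan("/dev/mapper/md3p1"):
    print(f"{shlex.join(argv)}  # {why}")
```

Zeroing the log with -L discards any metadata updates that were still pending in it, which is why it comes last, after a normal mount and a plain repair have both failed.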
