
Help understanding this error message "XFS (dm-3): Internal error !xfs_dir2_namecheck(dep->name, dep->namelen) at line 462 of file fs/xfs/xfs_dir2_readdir.c. Caller xfs_dir2_leaf_getdents+0x213/0x30f [xfs]"


je82


Today I saw this in the log from Unraid:

 

Quote

Oct 20 11:18:11 NAS kernel: <TASK>
Oct 20 11:18:11 NAS kernel: dump_stack_lvl+0x46/0x5a
Oct 20 11:18:11 NAS kernel: xfs_corruption_error+0x64/0x7e [xfs]
Oct 20 11:18:11 NAS kernel: xfs_dir2_leaf_getdents+0x23e/0x30f [xfs]
Oct 20 11:18:11 NAS kernel: ? xfs_dir2_leaf_getdents+0x213/0x30f [xfs]
Oct 20 11:18:11 NAS kernel: xfs_readdir+0x123/0x149 [xfs]
Oct 20 11:18:11 NAS kernel: iterate_dir+0x95/0x146
Oct 20 11:18:11 NAS kernel: __do_sys_getdents64+0x6b/0xd4
Oct 20 11:18:11 NAS kernel: ? filldir+0x1a3/0x1a3
Oct 20 11:18:11 NAS kernel: do_syscall_64+0x80/0xa5
Oct 20 11:18:11 NAS kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Oct 20 11:18:11 NAS kernel: RIP: 0033:0x1485fc4d55c3
Oct 20 11:18:11 NAS kernel: Code: ef b8 ca 00 00 00 0f 05 eb b3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 ff ff ff 7f 48 39 c2 48 0f 47 d0 b8 d9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 71 48 10 00 f7 d8
Oct 20 11:18:11 NAS kernel: RSP: 002b:00007ffec5d47228 EFLAGS: 00000293 ORIG_RAX: 00000000000000d9
Oct 20 11:18:11 NAS kernel: RAX: ffffffffffffffda RBX: 0000000000453e00 RCX: 00001485fc4d55c3
Oct 20 11:18:11 NAS kernel: RDX: 0000000000008000 RSI: 0000000000453e00 RDI: 0000000000000007
Oct 20 11:18:11 NAS kernel: RBP: ffffffffffffff88 R08: 0000000000006240 R09: 00000000004958d0
Oct 20 11:18:11 NAS kernel: R10: fffffffffffff000 R11: 0000000000000293 R12: 0000000000453dd4
Oct 20 11:18:11 NAS kernel: R13: 0000000000000000 R14: 0000000000453dd0 R15: 0000000000001092
Oct 20 11:18:11 NAS kernel: </TASK>
Oct 20 11:18:11 NAS kernel: XFS (dm-3): Corruption detected. Unmount and run xfs_repair
Oct 20 11:18:23 NAS kernel: XFS (dm-3): Internal error !xfs_dir2_namecheck(dep->name, dep->namelen) at line 462 of file fs/xfs/xfs_dir2_readdir.c.  Caller xfs_dir2_leaf_getdents+0x213/0x30f [xfs]
Oct 20 11:18:23 NAS kernel: CPU: 17 PID: 30521 Comm: find Tainted: P        W  O      5.15.46-Unraid #1
Oct 20 11:18:23 NAS kernel: Hardware name: Supermicro Super Server/X12SPI-TF, BIOS 1.4 07/11/2022
Oct 20 11:18:23 NAS kernel: Call Trace:
Oct 20 11:18:23 NAS kernel: <TASK>
Oct 20 11:18:23 NAS kernel: dump_stack_lvl+0x46/0x5a
Oct 20 11:18:23 NAS kernel: xfs_corruption_error+0x64/0x7e [xfs]
Oct 20 11:18:23 NAS kernel: xfs_dir2_leaf_getdents+0x23e/0x30f [xfs]
Oct 20 11:18:23 NAS kernel: ? xfs_dir2_leaf_getdents+0x213/0x30f [xfs]
Oct 20 11:18:23 NAS kernel: xfs_readdir+0x123/0x149 [xfs]
Oct 20 11:18:23 NAS kernel: iterate_dir+0x95/0x146
Oct 20 11:18:23 NAS kernel: __do_sys_getdents64+0x6b/0xd4
Oct 20 11:18:23 NAS kernel: ? filldir+0x1a3/0x1a3
Oct 20 11:18:23 NAS kernel: do_syscall_64+0x80/0xa5
Oct 20 11:18:23 NAS kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Oct 20 11:18:23 NAS kernel: RIP: 0033:0x14ccdcd7e5c3
Oct 20 11:18:23 NAS kernel: Code: ef b8 ca 00 00 00 0f 05 eb b3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 ff ff ff 7f 48 39 c2 48 0f 47 d0 b8 d9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 71 48 10 00 f7 d8
Oct 20 11:18:23 NAS kernel: RSP: 002b:00007fff742a4de8 EFLAGS: 00000293 ORIG_RAX: 00000000000000d9
Oct 20 11:18:23 NAS kernel: RAX: ffffffffffffffda RBX: 0000000000453e00 RCX: 000014ccdcd7e5c3
Oct 20 11:18:23 NAS kernel: RDX: 0000000000008000 RSI: 0000000000453e00 RDI: 0000000000000007
Oct 20 11:18:23 NAS kernel: RBP: ffffffffffffff88 R08: 0000000000000030 R09: 0000000000450ad0
Oct 20 11:18:23 NAS kernel: R10: 0000000000000100 R11: 0000000000000293 R12: 0000000000453dd4
Oct 20 11:18:23 NAS kernel: R13: 0000000000000000 R14: 0000000000453dd0 R15: 0000000000001092
Oct 20 11:18:23 NAS kernel: </TASK>
Oct 20 11:18:23 NAS kernel: XFS (dm-3): Corruption detected. Unmount and run xfs_repair
Oct 20 11:18:23 NAS kernel: XFS (dm-3): Internal error !xfs_dir2_namecheck(dep->name, dep->namelen) at line 462 of file fs/xfs/xfs_dir2_readdir.c.  Caller xfs_dir2_leaf_getdents+0x213/0x30f [xfs]
Oct 20 11:18:23 NAS kernel: CPU: 10 PID: 30521 Comm: find Tainted: P        W  O      5.15.46-Unraid #1
Oct 20 11:18:23 NAS kernel: Hardware name: Supermicro Super Server/X12SPI-TF, BIOS 1.4 07/11/2022
Oct 20 11:18:23 NAS kernel: Call Trace:
Oct 20 11:18:23 NAS kernel: <TASK>
Oct 20 11:18:23 NAS kernel: dump_stack_lvl+0x46/0x5a
Oct 20 11:18:23 NAS kernel: xfs_corruption_error+0x64/0x7e [xfs]
Oct 20 11:18:23 NAS kernel: xfs_dir2_leaf_getdents+0x23e/0x30f [xfs]
Oct 20 11:18:23 NAS kernel: ? xfs_dir2_leaf_getdents+0x213/0x30f [xfs]
Oct 20 11:18:23 NAS kernel: xfs_readdir+0x123/0x149 [xfs]
Oct 20 11:18:23 NAS kernel: iterate_dir+0x95/0x146
Oct 20 11:18:23 NAS kernel: __do_sys_getdents64+0x6b/0xd4
Oct 20 11:18:23 NAS kernel: ? filldir+0x1a3/0x1a3
Oct 20 11:18:23 NAS kernel: do_syscall_64+0x80/0xa5
Oct 20 11:18:23 NAS kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Oct 20 11:18:23 NAS kernel: RIP: 0033:0x14ccdcd7e5c3
Oct 20 11:18:23 NAS kernel: Code: ef b8 ca 00 00 00 0f 05 eb b3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 ff ff ff 7f 48 39 c2 48 0f 47 d0 b8 d9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 71 48 10 00 f7 d8
Oct 20 11:18:23 NAS kernel: RSP: 002b:00007fff742a4de8 EFLAGS: 00000293 ORIG_RAX: 00000000000000d9
Oct 20 11:18:23 NAS kernel: RAX: ffffffffffffffda RBX: 0000000000453e00 RCX: 000014ccdcd7e5c3
Oct 20 11:18:23 NAS kernel: RDX: 0000000000008000 RSI: 0000000000453e00 RDI: 0000000000000007
Oct 20 11:18:23 NAS kernel: RBP: ffffffffffffff88 R08: 0000000000006240 R09: 00000000004958d0
Oct 20 11:18:23 NAS kernel: R10: fffffffffffff000 R11: 0000000000000293 R12: 0000000000453dd4
Oct 20 11:18:23 NAS kernel: R13: 0000000000000000 R14: 0000000000453dd0 R15: 0000000000001092
Oct 20 11:18:23 NAS kernel: </TASK>
Oct 20 11:18:23 NAS kernel: XFS (dm-3): Corruption detected. Unmount and run xfs_repair

 

My log server captured nearly 25,000 lines of errors like this, spanning nearly 1.5 hours! I don't know whether it took that long just to generate the log, or whether something was actually happening for 1.5 hours. Anyway, I don't understand this message; I only just noticed it, everything works fine as far as I can tell, and the web GUI shows no disk errors. What am I looking at here? What could be the cause? Should I be worried? The server has had no issues for years.


I'm trying to understand what XFS dm-3 is. When I run this from the CLI:

dmsetup info -c dm-3
Device does not exist.

What am I looking at here?

 

dmsetup ls returns:


md1     (254:0)
md2     (254:1)
md3     (254:2)
md4     (254:3)
md5     (254:4)
md6     (254:5)
md7     (254:6)
md8     (254:7)
md9     (254:8)
sdb1    (254:9)
sdc1    (254:10)
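
For reference: dmsetup info expects the device-mapper name (md4, sdb1, ...), not the kernel's dm-N alias, which is why dm-3 comes back as "Device does not exist." The N in dm-N is just the device-mapper minor number, so matched against the (254:N) pairs above, dm-3 should be 254:3, i.e. md4 (Unraid disk 4). A quick sketch to confirm the mapping, using standard device-mapper paths (nothing here is Unraid-specific; verify the result on your own system):

cat /sys/block/dm-3/dm/name   # prints the mapper name behind dm-3, expected "md4"
ls -l /dev/mapper/            # the mapper names are symlinks pointing at ../dm-N nodes
dmsetup info -c md4           # works once the name is used instead of the dm-N alias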


Here are the results:

 

Quote

Phase 1 - find and verify superblock...
        - block cache size set to 6071808 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2864850 tail block 2864850
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 5
        - agno = 8
        - agno = 3
        - agno = 6
        - agno = 7
        - agno = 2
        - agno = 9
        - agno = 10
        - agno = 4
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary  Fri Oct 20 20:18:17 2023

Phase           Start           End             Duration
Phase 1:        10/20 20:13:45  10/20 20:13:45
Phase 2:        10/20 20:13:45  10/20 20:13:49  4 seconds
Phase 3:        10/20 20:13:49  10/20 20:16:40  2 minutes, 51 seconds
Phase 4:        10/20 20:16:40  10/20 20:16:41  1 second
Phase 5:        Skipped
Phase 6:        10/20 20:16:41  10/20 20:18:17  1 minute, 36 seconds
Phase 7:        10/20 20:18:17  10/20 20:18:17

Total run time: 4 minutes, 32 seconds
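
Worth noting: "No modify flag set" in that output means the check ran in no-modify mode (xfs_repair -n, which is what the Unraid GUI check defaults to), so this pass only reported and repaired nothing. The actual repair is the same command without -n. A minimal sketch, assuming the dm-3 = md4 mapping above and an encrypted array where the filesystem sits on the mapper device; the array has to be started in maintenance mode so the filesystem is unmounted:

xfs_repair -n /dev/mapper/md4   # dry run: report problems, change nothing
xfs_repair -v /dev/mapper/md4   # actual repair, verbose output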

 

Any expert advice? Should I replace the drive, or should I let it go for now and monitor the situation?
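
On the replace-or-monitor question, one useful data point is whether the physical drive behind that filesystem is itself reporting problems. A hedged sketch; /dev/sdX is a placeholder, substitute the real device for disk 4 from the Main tab:

smartctl -a /dev/sdX | grep -iE 'realloc|pending|uncorrect|crc'   # failure-predicting SMART counters
smartctl -t long /dev/sdX                                         # queue an extended self-test; read results later with -a

Zero reallocated/pending sectors plus a clean extended test would point away from the drive hardware and toward a one-off filesystem corruption.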


I ran the filesystem check again with just the -v flag:

 

Quote

Phase 1 - find and verify superblock...
        - block cache size set to 6071808 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2864861 tail block 2864861
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 10
        - agno = 1
        - agno = 2
        - agno = 6
        - agno = 5
        - agno = 7
        - agno = 9
        - agno = 8
        - agno = 4
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

        XFS_REPAIR Summary  Fri Oct 20 21:39:52 2023

Phase           Start           End             Duration
Phase 1:        10/20 21:35:12  10/20 21:35:12
Phase 2:        10/20 21:35:12  10/20 21:35:15  3 seconds
Phase 3:        10/20 21:35:15  10/20 21:38:07  2 minutes, 52 seconds
Phase 4:        10/20 21:38:07  10/20 21:38:07
Phase 5:        10/20 21:38:07  10/20 21:38:14  7 seconds
Phase 6:        10/20 21:38:14  10/20 21:39:51  1 minute, 37 seconds
Phase 7:        10/20 21:39:51  10/20 21:39:51

Total run time: 4 minutes, 39 seconds

done
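
One difference from the first run: this output shows "Phase 5 - rebuild AG headers and trees" and "verify and correct link counts", so this pass ran with the no-modify flag off and was allowed to write its fixes, even though the summary doesn't make it obvious whether it changed much. Anything it disconnected would have been moved to lost+found at the root of that filesystem; a quick check, assuming disk 4 mounts at the stock Unraid path:

ls -la /mnt/disk4/lost+found   # orphaned files land here; missing or empty means nothing was moved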

 

I cannot see that it fixed anything. Should I just relax and monitor? Could it perhaps be a RAM issue? But then why did it spew errors directed only at dm-3? Also, as far as I can tell, there were no real read/write operations going on when it started; the server was pretty much in a relaxed state.
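
On the RAM theory: bad memory tends to corrupt more than one filesystem over time, so errors confined to a single device point more toward that one filesystem or its disk than toward RAM; still, both checks are cheap. A sketch, assuming the stock syslog path and the md4 mapping from earlier (the xfs_repair line again needs the array in maintenance mode):

grep -iE 'edac|mce|machine check' /var/log/syslog   # ECC / machine-check complaints, if the kernel logged any
xfs_repair -n /dev/mapper/md4                       # read-only re-check; clean output means the repair held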

Your feedback is appreciated.
