
XFS (md8): Metadata CRC error


Recommended Posts

Hello,

 

I woke up with a message that a disk had errors, and this message has been spamming my logs all day:

Jul 12 15:42:59 ivpiter kernel: XFS (md8): metadata I/O error in "xfs_da_read_buf+0x9e/0xfe [xfs]" at daddr 0x456ab2c0 len 8 error 74
Jul 12 15:42:59 ivpiter kernel: XFS (md8): Metadata CRC error detected at xfs_dir3_data_read_verify+0x7d/0xc6 [xfs], xfs_dir3_data block 0x456ab2c0
Jul 12 15:42:59 ivpiter kernel: XFS (md8): Unmount and run xfs_repair
Jul 12 15:42:59 ivpiter kernel: XFS (md8): First 128 bytes of corrupted metadata buffer:
Jul 12 15:42:59 ivpiter kernel: 00000000: 19 fc 7c 6f db aa 4d 76 ad 39 6e 81 40 47 fd 0a  ..|o..Mv.9n.@G..
Jul 12 15:42:59 ivpiter kernel: 00000010: 59 dc dd c9 db 7d 95 dc dc 77 a8 77 42 63 35 9e  Y....}...w.wBc5.
Jul 12 15:42:59 ivpiter kernel: 00000020: 3c 49 71 43 25 53 09 44 12 6a 07 ef 46 a0 da b8  <IqC%S.D.j..F...
Jul 12 15:42:59 ivpiter kernel: 00000030: 2f 65 72 ba 15 54 73 5b 10 0a af 4a 28 ae 22 60  /er..Ts[...J(."`
Jul 12 15:42:59 ivpiter kernel: 00000040: e5 c2 e2 9c 08 18 ef c9 e0 76 18 1a 5f e5 93 eb  .........v.._...
Jul 12 15:42:59 ivpiter kernel: 00000050: 24 68 21 fe f3 e8 96 cb 75 93 56 c6 f3 6a 56 2b  $h!.....u.V..jV+
Jul 12 15:42:59 ivpiter kernel: 00000060: 51 97 f7 59 6c 3c e8 bb 6f 5c a4 3c 74 62 3b ab  Q..Yl<..o\.<tb;.
Jul 12 15:42:59 ivpiter kernel: 00000070: e2 c3 67 0a 73 49 49 5d be bd 38 46 e0 9a 45 8c  ..g.sII]..8F..E.
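If I understand the kernel message correctly, the check it is asking for would be something like this (just a sketch on my part; it assumes the array is in maintenance mode so md8 is unmounted, and -n only reports problems without writing anything):

# Check-only pass against the md device named in the log
# (assumption: array in maintenance mode, /dev/md8 unmounted)
xfs_repair -n /dev/md8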

 

 

 

I also attached the logs and the diagnostics files.

 

What should I do?

ivpiter-diagnostics-20210712-1544.zip ivpiter-syslog-20210712-1343.zip

Link to comment

OK, I just changed the SATA cable and ran the test, and got what I believe is a bunch of errors:

imap claims a free inode 1164611144 is in use, would correct imap and clear inode (a lot of these)

 

inode identifier 8918548065924071487 mismatch on inode 1320212291
bad CRC for inode 1320212292
bad magic number 0x54be on inode 1320212292
bad version number 0xffffffa1 on inode 1320212292

 

imap claims inode 1320212288 is present, but inode cluster is sparse, correcting imap

 

entry "fanart-180.jpg" in shortform directory 12189153274 references free inode 12189169408 would have junked entry "fanart-180.jpg" in directory inode 12189153274 would have corrected i8 count in directory 12189153274 from 7 to 6

 

entry "17507" at block 3 offset 736 in directory inode 1320088957 references free inode 1320212295 would clear inode number in entry at offset 736...

 

 

And at the end:

 

No modify flag set, skipping phase 5
Inode allocation btrees are too corrupted, skipping phases 6 and 7
Maximum metadata LSN (4:2483963) is ahead of log (4:2481638).
Would format log to cycle 7.
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Mon Jul 12 19:20:08 2021

Phase           Start           End             Duration
Phase 1:        07/12 19:19:21  07/12 19:19:21
Phase 2:        07/12 19:19:21  07/12 19:19:22  1 second
Phase 3:        07/12 19:19:22  07/12 19:20:08  46 seconds
Phase 4:        07/12 19:20:08  07/12 19:20:08
Phase 5:        Skipped
Phase 6:        Skipped
Phase 7:        Skipped

Total run time: 47 seconds
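If the next step is to let xfs_repair actually write these fixes, I assume it is the same command without the -n flag (again just a sketch; device and maintenance mode as above):

# Run the repair for real this time (no -n), still against the unmounted md device
# (assumption: the affected disk is /dev/md8)
xfs_repair /dev/md8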

 

Link to comment

Phase 1 - find and verify superblock...
        - block cache size set to 3010232 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2481638 tail block 2481624
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
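So as I read it, the message wants a sequence along these lines (a sketch; the mount point /mnt/temp is just an example I picked, and -L is only the last resort the message describes):

# Mount the filesystem once so the journal gets replayed, then unmount it again
# (assumption: /mnt/temp is a scratch mount point)
mkdir -p /mnt/temp
mount /dev/md8 /mnt/temp
umount /mnt/temp

# Re-run the repair after the log has been replayed
xfs_repair /dev/md8

# Only if the mount fails: zero the log and repair, accepting possible data loss
# xfs_repair -L /dev/md8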

Link to comment
19 minutes ago, jcarre said:

Right now the UDMA CRC error count is 1329.

You need to see whether this keeps increasing by anything more than a nominal amount. CRC errors are connection issues, and more often than not they are cabling related rather than the drive itself. CRC errors never get reset, so you just need to check that the count is not increasing much.
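If you want to watch the raw counter from the command line rather than the Dashboard, something like this should show it (a sketch; replace sdX with the drive's actual device, and the attribute is usually ID 199, UDMA_CRC_Error_Count):

# Print the drive's SMART attribute table and pick out the CRC error counter
# (assumption: the drive in question is /dev/sdX)
smartctl -A /dev/sdX | grep -i crc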

Link to comment

You can acknowledge the current CRC count for that drive by clicking on its warning on the Dashboard page, and it will warn again if it increases.

 

1 hour ago, jcarre said:

looking for signs of failure

Do you have Notifications set up to alert you immediately by email or another agent as soon as a problem is detected?

 

 

Link to comment
