
XFS corrupted metadata


alexxx


Hello!

 

I'm very new to unRAID and I like the experience so far, but I've hit a snag in my migration. I have been using Proxmox for about 5 years, but I was running out of disk space and thought I would try unRAID instead, since adding more drives to RAID-Z is a hassle.

 

As part of the migration I'm transferring a large amount of data over SMB to three new 8 TB Western Digital Red drives, one as parity and two in the array. Everything looks good and no SMART errors are showing up, but after a while I get this in the log:

 

May  3 04:54:36 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9c/0xa2 [xfs], xfs_dir3_block block 0x92c66aa8
May  3 04:54:36 Tower kernel: XFS (md1): Unmount and run xfs_repair
May  3 04:54:36 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
May  3 04:54:36 Tower kernel: ffff880231fd9000: 58 44 42 33 e5 ac fb 07 00 00 00 00 92 c6 6a a8  XDB3..........j.
May  3 04:54:36 Tower kernel: ffff880231fd9010: 00 00 00 01 00 06 5c 85 dc 12 b4 a6 45 ad 49 da  ......\.....E.I.
May  3 04:54:36 Tower kernel: ffff880231fd9020: 97 73 80 02 67 0c 26 9f 00 00 00 00 92 c6 6b 36  .s..g.&.......k6
May  3 04:54:36 Tower kernel: ffff880231fd9030: 0a 70 04 00 00 60 00 20 00 98 00 20 00 00 00 00  .p...`. ... ....
May  3 04:54:36 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9c/0xa2 [xfs], xfs_dir3_block block 0x92c66aa8
May  3 04:54:36 Tower kernel: XFS (md1): Unmount and run xfs_repair
May  3 04:54:36 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
May  3 04:54:36 Tower kernel: ffff880231fd9000: 58 44 42 33 e5 ac fb 07 00 00 00 00 92 c6 6a a8  XDB3..........j.
May  3 04:54:36 Tower kernel: ffff880231fd9010: 00 00 00 01 00 06 5c 85 dc 12 b4 a6 45 ad 49 da  ......\.....E.I.
May  3 04:54:36 Tower kernel: ffff880231fd9020: 97 73 80 02 67 0c 26 9f 00 00 00 00 92 c6 6b 36  .s..g.&.......k6
May  3 04:54:36 Tower kernel: ffff880231fd9030: 0a 70 04 00 00 60 00 20 00 98 00 20 00 00 00 00  .p...`. ... ....
May  3 04:54:36 Tower kernel: XFS (md1): metadata I/O error: block 0x92c66aa8 ("xfs_trans_read_buf_map") error 74 numblks 8
May  3 05:47:28 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_data_read_verify+0x9c/0xa2 [xfs], xfs_dir3_data block 0x5562e4a8
May  3 05:47:28 Tower kernel: XFS (md1): Unmount and run xfs_repair
May  3 05:47:28 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
May  3 05:47:28 Tower kernel: ffff8802324c7000: 58 44 44 33 fe b7 e8 e6 00 00 00 00 55 62 e4 a8  XDD3........Ub..
May  3 05:47:28 Tower kernel: ffff8802324c7010: 00 00 00 01 00 0f cf 09 dc 12 b4 a6 45 ad 49 da  ............E.I.
May  3 05:47:28 Tower kernel: ffff8802324c7020: 97 73 80 02 67 0c 26 9f 00 00 00 00 54 cc c3 71  .s..g.&.....T..q
May  3 05:47:28 Tower kernel: ffff8802324c7030: 02 c8 00 18 03 d8 00 18 05 90 00 18 00 00 00 00  ................
May  3 05:47:28 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_data_read_verify+0x9c/0xa2 [xfs], xfs_dir3_data block 0x5562e4a8
May  3 05:47:28 Tower kernel: XFS (md1): Unmount and run xfs_repair
May  3 05:47:28 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
May  3 05:47:28 Tower kernel: ffff8802324c7000: 58 44 44 33 fe b7 e8 e6 00 00 00 00 55 62 e4 a8  XDD3........Ub..
May  3 05:47:28 Tower kernel: ffff8802324c7010: 00 00 00 01 00 0f cf 09 dc 12 b4 a6 45 ad 49 da  ............E.I.
May  3 05:47:28 Tower kernel: ffff8802324c7020: 97 73 80 02 67 0c 26 9f 00 00 00 00 54 cc c3 71  .s..g.&.....T..q
May  3 05:47:28 Tower kernel: ffff8802324c7030: 02 c8 00 18 03 d8 00 18 05 90 00 18 00 00 00 00  ................
May  3 05:47:28 Tower kernel: XFS (md1): metadata I/O error: block 0x5562e4a8 ("xfs_trans_read_buf_map") error 74 numblks 8
May  3 05:47:28 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_data_read_verify+0x9c/0xa2 [xfs], xfs_dir3_data block 0x5562e4a8
May  3 05:47:28 Tower kernel: XFS (md1): Unmount and run xfs_repair
May  3 05:47:28 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
May  3 05:47:28 Tower kernel: ffff8802324c7000: 58 44 44 33 fe b7 e8 e6 00 00 00 00 55 62 e4 a8  XDD3........Ub..
May  3 05:47:28 Tower kernel: ffff8802324c7010: 00 00 00 01 00 0f cf 09 dc 12 b4 a6 45 ad 49 da  ............E.I.
May  3 05:47:28 Tower kernel: ffff8802324c7020: 97 73 80 02 67 0c 26 9f 00 00 00 00 54 cc c3 71  .s..g.&.....T..q
May  3 05:47:28 Tower kernel: ffff8802324c7030: 02 c8 00 18 03 d8 00 18 05 90 00 18 00 00 00 00  ................
May  3 05:47:28 Tower kernel: XFS (md1): metadata I/O error: block 0x5562e4a8 ("xfs_trans_read_buf_map") error 74 numblks 8
May  3 05:55:12 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9c/0xa2 [xfs], xfs_dir3_block block 0x230210e0
May  3 05:55:12 Tower kernel: XFS (md1): Unmount and run xfs_repair
May  3 05:55:12 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
May  3 05:55:12 Tower kernel: ffff88023179d000: 58 44 42 33 30 1e d9 52 00 00 00 00 23 02 10 e0  XDB30..R....#...
May  3 05:55:12 Tower kernel: ffff88023179d010: 00 00 00 01 00 09 1d 56 dc 12 b4 a6 45 ad 49 da  .......V....E.I.
May  3 05:55:12 Tower kernel: ffff88023179d020: 97 73 80 02 67 0c 26 9f 00 00 00 00 23 02 0f fa  .s..g.&.....#...
May  3 05:55:12 Tower kernel: ffff88023179d030: 0d e0 00 80 00 60 00 28 00 a8 00 28 00 00 00 00  .....`.(...(....
May  3 05:55:12 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9c/0xa2 [xfs], xfs_dir3_block block 0x230210e0
May  3 05:55:12 Tower kernel: XFS (md1): Unmount and run xfs_repair
May  3 05:55:12 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
May  3 05:55:12 Tower kernel: ffff88023179d000: 58 44 42 33 30 1e d9 52 00 00 00 00 23 02 10 e0  XDB30..R....#...
May  3 05:55:12 Tower kernel: ffff88023179d010: 00 00 00 01 00 09 1d 56 dc 12 b4 a6 45 ad 49 da  .......V....E.I.
May  3 05:55:12 Tower kernel: ffff88023179d020: 97 73 80 02 67 0c 26 9f 00 00 00 00 23 02 0f fa  .s..g.&.....#...
May  3 05:55:12 Tower kernel: ffff88023179d030: 0d e0 00 80 00 60 00 28 00 a8 00 28 00 00 00 00  .....`.(...(....
May  3 05:55:12 Tower kernel: XFS (md1): metadata I/O error: block 0x230210e0 ("xfs_trans_read_buf_map") error 74 numblks 8
May  3 05:57:12 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9c/0xa2 [xfs], xfs_dir3_block block 0x383dc8838
May  3 05:57:12 Tower kernel: XFS (md1): Unmount and run xfs_repair
May  3 05:57:12 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
May  3 05:57:12 Tower kernel: ffff8800163ea000: 58 44 42 33 ad f7 bd b3 00 00 00 03 83 dc 88 38  XDB3...........8
May  3 05:57:12 Tower kernel: ffff8800163ea010: 00 00 00 01 00 02 29 54 dc 12 b4 a6 45 ad 49 da  ......)T....E.I.
May  3 05:57:12 Tower kernel: ffff8800163ea020: 97 73 80 02 67 0c 26 9f 00 00 00 03 83 dc 88 98  .s..g.&.........
May  3 05:57:12 Tower kernel: ffff8800163ea030: 0a 10 04 c0 00 60 00 58 01 08 00 58 00 00 00 00  .....`.X...X....
May  3 05:57:12 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9c/0xa2 [xfs], xfs_dir3_block block 0x383dc8838
May  3 05:57:12 Tower kernel: XFS (md1): Unmount and run xfs_repair
May  3 05:57:12 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
May  3 05:57:12 Tower kernel: ffff8800163ea000: 58 44 42 33 ad f7 bd b3 00 00 00 03 83 dc 88 38  XDB3...........8
May  3 05:57:12 Tower kernel: ffff8800163ea010: 00 00 00 01 00 02 29 54 dc 12 b4 a6 45 ad 49 da  ......)T....E.I.
May  3 05:57:12 Tower kernel: ffff8800163ea020: 97 73 80 02 67 0c 26 9f 00 00 00 03 83 dc 88 98  .s..g.&.........
May  3 05:57:12 Tower kernel: ffff8800163ea030: 0a 10 04 c0 00 60 00 58 01 08 00 58 00 00 00 00  .....`.X...X....
May  3 05:57:12 Tower kernel: XFS (md1): metadata I/O error: block 0x383dc8838 ("xfs_trans_read_buf_map") error 74 numblks 8
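
For anyone reading the hex dumps above: the first 64 bytes of each buffer can be decoded by hand. The sketch below is only an illustration and assumes the XFS v5 directory block header is laid out as magic (4 bytes), CRC (4), block number (8), LSN (8), filesystem UUID (16) and owner inode (8), all big-endian except the CRC, which is shown as raw bytes. The names `HEXDUMP`, `HEADER` and `decode_dir3_header` are made up for this example.

```python
import struct
import uuid

# First 64 bytes of the buffer from the 04:54:36 log entry, copied from the dump.
HEXDUMP = """
58 44 42 33 e5 ac fb 07 00 00 00 00 92 c6 6a a8
00 00 00 01 00 06 5c 85 dc 12 b4 a6 45 ad 49 da
97 73 80 02 67 0c 26 9f 00 00 00 00 92 c6 6b 36
0a 70 04 00 00 60 00 20 00 98 00 20 00 00 00 00
"""

# Assumed layout: magic(4) crc(4) blkno(8) lsn(8) uuid(16) owner(8), big-endian.
HEADER = struct.Struct(">4s4sQQ16sQ")


def decode_dir3_header(hexdump):
    """Decode the leading 48 header bytes of a dumped directory block."""
    raw = bytes.fromhex("".join(hexdump.split()))
    magic, crc, blkno, lsn, fs_uuid, owner = HEADER.unpack_from(raw)
    return {
        "magic": magic.decode("ascii"),
        "crc_bytes": crc.hex(),  # left as raw bytes, byte order not interpreted
        "blkno": blkno,
        "lsn": lsn,
        "uuid": str(uuid.UUID(bytes=fs_uuid)),
        "owner_inode": owner,
    }


header = decode_dir3_header(HEXDUMP)
for name, value in header.items():
    shown = hex(value) if isinstance(value, int) else value
    print(f"{name:12} {shown}")
```

Under that assumed layout, the block number stored inside the buffer (0x92c66aa8) is the same as the address in the corresponding log line, and the magic reads XDB3.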

Watching the console output, it tells me to run xfs_repair on the drive in question, md1. I started the array in maintenance mode and did just that. I don't have the output from that command at the moment, but it showed an inode CRC error and said it was being repaired. When I ran the command again before starting the array, another inode CRC error appeared, this time for a different inode number. This repeated a couple of times. I then started the array and resumed the migration, but the same messages keep appearing in the log. There are not a lot of them, but nevertheless they worry me.
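
Since the kernel repeats each message several times, it can help to boil the log down to the distinct blocks involved. This is a minimal sketch, not an official tool: `SAMPLE_LOG` holds a few lines copied from the kernel log above, and the regular expression and the name `summarize_crc_errors` are my own and only match the message format shown in this thread.

```python
import re
from collections import Counter

# A few "Metadata CRC error" lines copied from the kernel log above.
SAMPLE_LOG = """\
May  3 04:54:36 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9c/0xa2 [xfs], xfs_dir3_block block 0x92c66aa8
May  3 04:54:36 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9c/0xa2 [xfs], xfs_dir3_block block 0x92c66aa8
May  3 05:47:28 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_data_read_verify+0x9c/0xa2 [xfs], xfs_dir3_data block 0x5562e4a8
May  3 05:57:12 Tower kernel: XFS (md1): Metadata CRC error detected at xfs_dir3_block_read_verify+0x9c/0xa2 [xfs], xfs_dir3_block block 0x383dc8838
"""

# Matches only the message format seen in this thread.
CRC_ERROR = re.compile(
    r"XFS \((?P<dev>\w+)\): Metadata CRC error detected at "
    r"(?P<verifier>\w+)\+\S+ \[xfs\], \w+ block (?P<block>0x[0-9a-f]+)"
)


def summarize_crc_errors(log_text):
    """Count CRC error messages per (device, verifier, block address)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CRC_ERROR.search(line)
        if match:
            counts[(match["dev"], match["verifier"], match["block"])] += 1
    return counts


summary = summarize_crc_errors(SAMPLE_LOG)
for (dev, verifier, block), count in sorted(summary.items()):
    print(f"{dev} {block} {verifier} x{count}")
```

To run it against a real log, read the syslog file into a string and pass that to `summarize_crc_errors` instead of `SAMPLE_LOG`.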

 

I'm running the server on consumer-grade hardware at the moment, since I have to migrate the data before adding the drives to my normal server. Is this anything to worry about? Is my data safe?

 

Best regards, Alex


I stopped the array, started it in maintenance mode again, and ran xfs_repair twice. This is the result:

 

root@Tower:~# xfs_repair /dev/md1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
Metadata CRC error detected at xfs_dir3_data block 0x14314710/0x1000
bad CRC for inode 653022083
bad CRC for inode 653022083, will rewrite
cleared inode 653022083
        - agno = 1
        - agno = 2
        - agno = 3
bad CRC for inode 7185617787
bad CRC for inode 7185617787, will rewrite
cleared inode 7185617787
        - agno = 4
        - agno = 5
        - agno = 6
bad CRC for inode 13380699939
bad CRC for inode 13380699939, will rewrite
cleared inode 13380699939
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 0
        - agno = 2
        - agno = 3
entry "27.jpg" at block 0 offset 1584 in directory inode 339076709 references non-existent inode 339734932
        clearing inode number in entry at offset 1584...
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
Metadata CRC error detected at xfs_dir3_data block 0xffcece0/0x1000
bad hash table for directory inode 268213615 (hash value mismatch): rebuilding
rebuilding directory inode 268213615
bad hash table for directory inode 339076709 (no data entry): rebuilding
rebuilding directory inode 339076709
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 339472788, moving to lost+found
Phase 7 - verify and correct link counts...
done
root@Tower:~# xfs_repair /dev/md1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
bad CRC for inode 1301342819
bad CRC for inode 1301342819, will rewrite
cleared inode 1301342819
        - agno = 1
bad CRC for inode 2462978563
bad CRC for inode 2462978563, will rewrite
cleared inode 2462978563
bad CRC for inode 2481424819
bad CRC for inode 2481424819, will rewrite
cleared inode 2481424819
        - agno = 2
        - agno = 3
        - agno = 4
Metadata CRC error detected at xfs_dir3_data block 0x203848b58/0x1000
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 3
        - agno = 0
        - agno = 2
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
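
To check whether the second pass was cleaning up leftovers from the first or finding new damage, the cleared inode numbers from the two runs can be compared. A rough sketch follows. `RUN_1` and `RUN_2` contain only the "cleared inode" lines copied from the two outputs above, and `cleared_inodes` is a hypothetical helper name.

```python
import re

# "cleared inode" lines copied from the first and second xfs_repair runs above.
RUN_1 = """\
cleared inode 653022083
cleared inode 7185617787
cleared inode 13380699939
"""

RUN_2 = """\
cleared inode 1301342819
cleared inode 2462978563
cleared inode 2481424819
"""


def cleared_inodes(repair_output):
    """Return the set of inode numbers that xfs_repair reported as cleared."""
    found = re.findall(r"^cleared inode (\d+)$", repair_output, re.MULTILINE)
    return {int(number) for number in found}


first = cleared_inodes(RUN_1)
second = cleared_inodes(RUN_2)
print("cleared in both runs:", sorted(first & second))
print("only in second run:", sorted(second - first))
```

For the two runs above the sets do not overlap, which matches what I described earlier: every pass reports different inodes rather than the same ones again.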

 

