XFS bad blocks



I left a file transfer running yesterday, copying 3+ TB from disk2 to disk3, and came in this morning to find disks 3 and 4 unresponsive and unRAID complaining that they had failed some SMART check. I stopped the array, and both drives disappeared from the device list. I reseated both and performed a clean reboot (the attached log is from before the reboot). Both drives show up now, but disk 3 is disabled (disk 4 appears to be fine) and, judging by the logs, has some filesystem corruption. I started the array in maintenance mode, ran the filesystem check, and got this.

xfs_repair /dev/md3 -n 2>&1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan (but don't clear) agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
bad directory block magic # 0x5b444042 in block 0 for directory inode 17074
corrupt block 0 in directory inode 17074
would junk block
no . entry for directory 17074
no .. entry for directory 17074
problem with directory contents in inode 17074
would have cleared inode 17074
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 1
bad directory block magic # 0x5b444042 in block 0 for directory inode 17074
corrupt block 0 in directory inode 17074
would junk block
no . entry for directory 17074
no .. entry for directory 17074
problem with directory contents in inode 17074
would have cleared inode 17074
entry ".actors" at block 0 offset 120 in directory inode 21523196102 references free inode 17074
would clear inode number in entry at offset 120...
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
- traversing filesystem ...
entry ".actors" in directory inode 21523196102 points to free inode 17074, would junk entry
bad hash table for directory inode 21523196102 (no data entry): would rebuild
- traversal finished ...
- moving disconnected inodes to lost+found ...
disconnected inode 17075, would move to lost+found
disconnected inode 17076, would move to lost+found
disconnected inode 17077, would move to lost+found
disconnected inode 17078, would move to lost+found
disconnected inode 17079, would move to lost+found
disconnected inode 17080, would move to lost+found
disconnected inode 17081, would move to lost+found
disconnected inode 17082, would move to lost+found
disconnected inode 17083, would move to lost+found
disconnected inode 17084, would move to lost+found
disconnected inode 17085, would move to lost+found
disconnected inode 17086, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 21523196102 nlinks from 8 to 7
No modify flag set, skipping filesystem flush and exiting.

 

For disk4

 

xfs_repair /dev/md4 -n 2>&1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan (but don't clear) agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 5
- agno = 4
- agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
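
To compare the two dry runs above at a glance, here is a minimal Python sketch that tallies what xfs_repair -n says it would change. The function name summarize_dry_run and the regular expressions are my own, keyed only to the message wording that appears in the output above; other xfs_repair versions may word things differently, so treat it as a starting point rather than a reliable parser.

```python
import re

# Patterns keyed only to the message wording seen in the pasted output (an assumption).
DISCONNECTED = re.compile(r"disconnected inode (\d+), would move to lost\+found")
DIR_INODE = re.compile(r"directory inode (\d+)")


def summarize_dry_run(text):
    """Summarize the text output of an `xfs_repair -n` (no-modify) run."""
    disconnected = []      # inodes that would be moved to lost+found
    dir_inodes = set()     # directory inodes mentioned in any problem line
    would_lines = 0        # lines describing a change a real repair would make
    for raw in text.splitlines():
        line = raw.strip()
        match = DISCONNECTED.search(line)
        if match:
            disconnected.append(int(match.group(1)))
        for ino in DIR_INODE.findall(line):
            dir_inodes.add(int(ino))
        if "would " in line:
            would_lines += 1
    return {
        "disconnected_inodes": disconnected,
        "directory_inodes": sorted(dir_inodes),
        "would_change_lines": would_lines,
        # "clean" here only means no line announced a pending change.
        "clean": would_lines == 0,
    }


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python summarize.py < disk3.txt
    print(summarize_dry_run(sys.stdin.read()))
```

Fed the disk3 output, this should list directory inodes 17074 and 21523196102 plus the twelve disconnected inodes; fed the disk4 output, it should report clean.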


 

What now? Is my data in danger?

syslog-20150107-082426.zip

disk3.txt

disk4.txt


Well, I ran memtest overnight just to rule out bad memory: no errors after ~16 hours.

 

Since the SMART report was fine, I ran xfs_repair with no options (i.e. without -n) to actually fix the filesystem. Some files were put in lost+found, but they are small, and the repair referenced the .actors folder, so they are probably just metadata/images. Everything else looks okay and I haven't found any corrupted files, but it's over 3.5 TB of TV episodes, so who knows.
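
For anyone wanting to review what a repair dropped into lost+found, here is a rough Python sketch that lists the recovered files largest first, so anything bigger than a thumbnail or metadata file stands out. The function name list_recovered and the example path /mnt/disk3/lost+found are assumptions for illustration; point it at wherever the repaired disk is actually mounted.

```python
import os


def list_recovered(lost_found_dir):
    """Return (size_bytes, path) for each regular file under the directory, largest first."""
    entries = []
    for root, _dirs, files in os.walk(lost_found_dir):
        for name in files:
            path = os.path.join(root, name)
            # Skip anything that is not a regular file (e.g. broken symlinks).
            if os.path.isfile(path):
                entries.append((os.path.getsize(path), path))
    entries.sort(reverse=True)
    return entries


if __name__ == "__main__":
    # Hypothetical mount point; adjust to your own system.
    for size, path in list_recovered("/mnt/disk3/lost+found"):
        print(f"{size:>12}  {path}")
```

Files in lost+found are named after their inode numbers, so the sizes are usually the quickest hint as to whether an entry is an image, an .nfo file, or an actual episode.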

 

I am currently rebuilding to the same drive.

 


http://lime-technology.com/forum/index.php?topic=36295.0;topicseen

 

I believe it is related to the issue in the thread above. Nobody has quite pinned down the cause, and different fixes seem to work for some people and not others: disabling spindown, Xen boot, not mixing expansion-card and motherboard ports, having all drives on the same filesystem, etc. The rebuild is finished, so I'm going to convert the last drive to XFS. The issue occurred while moving a large number of files from an RFS drive to an XFS drive, though that shouldn't cause problems. Next I'll try removing two of the 2TB drives and running off only the SASLP or only the motherboard ports.
