flamegrilled Posted August 12, 2015
XFS (md6): Corruption detected. Unmount and run xfs_repair
Aug 12 10:05:00 Tower kernel: XFS (md6): Metadata corruption detected at xfs_dir3_block_read_verify+0xa6/0xb1, block 0x58
Aug 12 10:05:00 Tower kernel: XFS (md6): Unmount and run xfs_repair
Aug 12 10:05:00 Tower kernel: XFS (md6): First 64 bytes of corrupted metadata buffer:
I have not done an "Unmount and run xfs_repair" before. Is it as simple as "xfs_repair -v /dev/md6"?
syslog_shares.zip
trurl Posted August 12, 2015
I have not done it either, but the wiki article was recently updated to include XFS: Check Disk Filesystems
Squid Posted August 12, 2015
You can do everything in the WebUI once you stop the array and restart it in maintenance mode. Just click the disk number, and you'll have options to run the check and/or repair (I highly suggest you hit "help" to see what everything means).
flamegrilled Posted August 13, 2015 Author
I followed "http://lime-technology.com/wiki/index.php/Check_Disk_Filesystems#Running_xfs_repair" on /dev/md6 using the command xfs_repair -v /dev/md6, but nothing happened. I did it three times. I stopped maintenance mode and started the array in normal mode, and md6 still moaned about corruption.

I thought the swap of the parity drive (2TB) for a 4TB one caused the issue, as it did not ask me to format the 2TB after I assigned it to a data slot (6). I then did a new config, as it did not like the larger parity-data swap exercise. After the parity check all was well, and I physically moved the box to its permanent position. I then tried to create a new share, as I saw that the original shares had disappeared. I tried creating different shares and it replied that "yoursharename" is deleted. I did notice that the original share folders were present on the disk. It was when I checked the log that I saw md6 moaning about corruption.

In an attempt to rectify the error, I decided to do another new config, shifting the data drives around. MD6 became MD1, and the error still persists after using the GUI this time (thanks to Squid, and a reminder to RTFM first before asking questions).

Below is the latest log snapshot. The array starts in normal mode but kicks out the message below. I have powered down the box to reseat the data connections.
Aug 13 08:28:07 Tower kernel: ffff880098912000: 58 44 42 33 7f 35 b4 55 00 00 00 00 00 00 00 58 XDB3.5.U.......X
Aug 13 08:28:07 Tower kernel: ffff880098912010: 00 00 00 01 00 00 00 9b c5 06 e0 29 17 fc 4b 42 ...........)..KB
Aug 13 08:28:07 Tower kernel: ffff880098912020: af 1e e9 94 80 d5 75 12 00 00 00 00 00 00 00 66 ......u........f
Aug 13 08:28:07 Tower kernel: ffff880098912030: 02 70 0d 28 00 00 00 00 00 00 00 00 00 00 00 00 .p.(............
Aug 13 08:28:07 Tower kernel: XFS (md1): Metadata corruption detected at xfs_dir3_block_read_verify+0xa6/0xb1, block 0x58
Aug 13 08:28:07 Tower logger: Starting NFS server daemons:
Aug 13 08:28:07 Tower logger: /usr/sbin/exportfs -r
Aug 13 08:28:07 Tower kernel: XFS (md1): Unmount and run xfs_repair
Aug 13 08:28:07 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
Aug 13 08:28:07 Tower kernel: ffff880098912000: 58 44 42 33 7f 35 b4 55 00 00 00 00 00 00 00 58 XDB3.5.U.......X
Aug 13 08:28:07 Tower kernel: ffff880098912010: 00 00 00 01 00 00 00 9b c5 06 e0 29 17 fc 4b 42 ...........)..KB
Aug 13 08:28:07 Tower kernel: ffff880098912020: af 1e e9 94 80 d5 75 12 00 00 00 00 00 00 00 66 ......u........f
Aug 13 08:28:07 Tower logger: /usr/sbin/rpc.nfsd 8
Aug 13 08:28:07 Tower kernel: ffff880098912030: 02 70 0d 28 00 00 00 00 00 00 00 00 00 00 00 00 .p.(............
Aug 13 08:28:07 Tower kernel: XFS (md1): metadata I/O error: block 0x58 ("xfs_trans_read_buf_map") error 117 numblks 8
Aug 13 08:28:07 Tower logger: /usr/sbin/rpc.mountd
Aug 13 08:28:07 Tower rpc.mountd[27107]: Version 1.2.8 starting
Aug 13 08:28:07 Tower emhttp: shcmd (676): /etc/rc.d/rc.atalk status
Aug 13 08:28:07 Tower emhttp: Starting Docker...
Aug 13 08:28:08 Tower kernel: BTRFS info (device loop0): disk space caching is enabled
Aug 13 08:28:08 Tower kernel: BTRFS: has skinny extents
Aug 13 08:28:08 Tower avahi-daemon[26595]: Service "Tower" (/services/smb.service) successfully established.
Aug 13 08:28:08 Tower logger: Resize '/var/lib/docker' of 'max'
Aug 13 08:28:08 Tower logger: starting docker ...
Aug 13 08:28:08 Tower kernel: BTRFS: new size for /dev/loop0 is 16106127360
Aug 13 08:28:10 Tower rc.unRAID[27238][27242]: Processing /etc/rc.d/rc.unRAID.d/ start scripts.
Aug 13 08:28:12 Tower kernel: XFS (md1): Metadata corruption detected at xfs_dir3_block_read_verify+0xa6/0xb1, block 0x58
Aug 13 08:28:12 Tower kernel: XFS (md1): Unmount and run xfs_repair
Aug 13 08:28:12 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
Aug 13 08:28:12 Tower kernel: ffff88025374f000: 58 44 42 33 7f 35 b4 55 00 00 00 00 00 00 00 58 XDB3.5.U.......X
Aug 13 08:28:12 Tower kernel: ffff88025374f010: 00 00 00 01 00 00 00 9b c5 06 e0 29 17 fc 4b 42 ...........)..KB
Aug 13 08:28:12 Tower kernel: ffff88025374f020: af 1e e9 94 80 d5 75 12 00 00 00 00 00 00 00 66 ......u........f
Aug 13 08:28:12 Tower kernel: ffff88025374f030: 02 70 0d 28 00 00 00 00 00 00 00 00 00 00 00 00 .p.(............
Aug 13 08:28:12 Tower kernel: XFS (md1): Metadata corruption detected at xfs_dir3_block_read_verify+0xa6/0xb1, block 0x58
Aug 13 08:28:12 Tower kernel: XFS (md1): Unmount and run xfs_repair
Aug 13 08:28:12 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
Aug 13 08:28:12 Tower kernel: ffff88025374f000: 58 44 42 33 7f 35 b4 55 00 00 00 00 00 00 00 58 XDB3.5.U.......X
Aug 13 08:28:12 Tower kernel: ffff88025374f010: 00 00 00 01 00 00 00 9b c5 06 e0 29 17 fc 4b 42 ...........)..KB
Aug 13 08:28:12 Tower kernel: ffff88025374f020: af 1e e9 94 80 d5 75 12 00 00 00 00 00 00 00 66 ......u........f
Aug 13 08:28:12 Tower kernel: ffff88025374f030: 02 70 0d 28 00 00 00 00 00 00 00 00 00 00 00 00 .p.(............
Aug 13 08:28:12 Tower kernel: XFS (md1): metadata I/O error: block 0x58 ("xfs_trans_read_buf_map") error 117 numblks 8
tower-diagnostics-20150813-0825.zip
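Because the same report repeats many times in a log like the one above, it can help to collapse it into one line per device, verifier, and block. The following is only a minimal sketch: the function name xfs_corruption_summary and the default path /var/log/syslog are assumptions for illustration, and it expects one syslog entry per line.

```shell
#!/bin/sh
# Summarise XFS metadata-corruption reports found in a syslog-style file.
# Prints "<count> <device> <verifier> <block>" for each distinct combination.
# The function name and the default log path are illustrative assumptions.
xfs_corruption_summary() {
    # $1: path to the log file (defaults to /var/log/syslog)
    grep 'Metadata corruption detected at' "${1:-/var/log/syslog}" |
        # Keep only the device name, the verifier function, and the block number.
        sed -n 's/.*XFS (\([^)]*\)): Metadata corruption detected at \([A-Za-z0-9_]*\)[^,]*, block \(0x[0-9a-fA-F]*\).*/\1 \2 \3/p' |
        sort | uniq -c |
        # Normalise the padded count column that uniq -c emits.
        awk '{print $1, $2, $3, $4}'
}
```

If every line of the summary names the same device and block, as in the excerpts in this thread (always block 0x58), that suggests a single damaged directory block rather than damage spread across the disk.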
itimpi Posted August 13, 2015
Did you try and run xfs_repair via the GUI or via the command line? If run via the command line there should always be output on progress even if it does nothing.
flamegrilled Posted August 13, 2015 Author
itimpi: Did you try and run xfs_repair via the GUI or via the command line?
I ran it with -v, with -n, and without any switches. I have attached the xfs_repair output and the latest log from starting the array normally.
xfsrepair_1.zip screenlog_1.zip
flamegrilled Posted August 13, 2015 Author
Well, I have attempted xfs_repair -L /dev/md1 as a last resort, and even that produced no results.
Aug 13 15:01:47 Tower kernel: XFS (md1): Metadata corruption detected at xfs_dir3_block_read_verify+0xa6/0xb1, block 0x58
Aug 13 15:01:47 Tower kernel: XFS (md1): Unmount and run xfs_repair
Aug 13 15:01:47 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
Aug 13 15:01:47 Tower kernel: ffff880253feb000: 58 44 42 33 7f 35 b4 55 00 00 00 00 00 00 00 58 XDB3.5.U.......X
Aug 13 15:01:47 Tower kernel: ffff880253feb010: 00 00 00 01 00 00 00 9b c5 06 e0 29 17 fc 4b 42 ...........)..KB
Aug 13 15:01:47 Tower kernel: ffff880253feb020: af 1e e9 94 80 d5 75 12 00 00 00 00 00 00 00 66 ......u........f
Aug 13 15:01:47 Tower kernel: ffff880253feb030: 02 70 0d 28 00 00 00 00 00 00 00 00 00 00 00 00 .p.(............

root@Tower:~# xfs_repair -L /dev/md1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
Metadata corruption detected at block 0x58/0x1000
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
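For reference, the escalation tried in this thread (-n to check only, -v for a verbose repair, -L as the last resort) can be written down as a small dry-run helper. This is a sketch under stated assumptions, not a recommended procedure: the function name xfs_repair_plan and the RUN and FORCE_LOG_ZERO variables are hypothetical, and the device must be unmounted (on unRAID, array started in maintenance mode) before anything is actually run.

```shell
#!/bin/sh
# Print (default) or run (RUN=1) an escalating series of xfs_repair calls.
# xfs_repair_plan, RUN and FORCE_LOG_ZERO are names invented for this sketch.
xfs_repair_plan() {
    dev=$1    # e.g. /dev/md1; the filesystem must not be mounted

    # Helper: echo the command unless RUN=1 is set, in which case execute it.
    run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }

    run xfs_repair -n "$dev"    # 1. check only, makes no changes
    run xfs_repair -v "$dev"    # 2. actual repair with verbose output

    # 3. Last resort: -L zeroes the log, which can discard recent metadata updates.
    [ "${FORCE_LOG_ZERO:-0}" = 1 ] && run xfs_repair -L "$dev"
    return 0
}
```

Called as xfs_repair_plan /dev/md1, it only prints the first two commands; the -L step is added only when FORCE_LOG_ZERO=1 is set, so it cannot be reached by accident.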
RobJ Posted August 13, 2015
Well, this is disappointing. My best guess is that xfs_repair fixed everything it could on the first pass and hasn't found anything else since, but is INCAPABLE of fixing the metadata corruption. The log was already fine, so the -L option had nothing to do. It does keep reporting the metadata corruption, so it knows about it, but apparently does not know how to fix it. At the same time, the XFS file system thinks xfs_repair IS capable of fixing it, so it keeps telling you to use it.

All I can say is wait for an XFS update in a future kernel update; perhaps an updated xfs_repair will be able to fix it then. Or you can copy everything off and reformat. Disappointing ...

Apart from this block of metadata, everything else appears fine and the tree is fine, so your data is probably fine and the drive should be usable. If it were me, though, I'd reformat it. We don't yet know the full ramifications of this one failure; it might be harmless, it might be dangerous.

Was anything moved into the 'lost+found' folder?
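One way to answer the lost+found question from the command line is to count the entries in that folder once the disk is mounted again. A minimal sketch, assuming the disk is mounted at /mnt/disk1 (an assumed mount point) and using a hypothetical function name, lost_found_count:

```shell
#!/bin/sh
# Count the top-level entries that xfs_repair moved into lost+found.
# lost_found_count is a made-up name; /mnt/disk1 is an assumed mount point.
lost_found_count() {
    dir="${1:-/mnt/disk1}/lost+found"
    if [ -d "$dir" ]; then
        # ls -A lists everything except "." and ".."; wc -l counts the entries.
        ls -A "$dir" | wc -l | tr -d ' '
    else
        # No lost+found directory means nothing was moved there.
        echo 0
    fi
}
```

A result of 0 would be consistent with RobJ's reading that the rest of the tree is intact.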
flamegrilled Posted August 14, 2015 Author
RobJ, I think I have solved this one. I have attached the "screenlogs" and pasted the latest one below.
Aug 14 07:09:01 Tower kernel: ffff8800995ac010: 00 00 00 01 00 00 00 9b c5 06 e0 29 17 fc 4b 42 ...........)..KB
Aug 14 07:09:01 Tower kernel: ffff8800995ac020: af 1e e9 94 80 d5 75 12 00 00 00 00 00 00 00 66 ......u........f
Aug 14 07:09:01 Tower kernel: ffff8800995ac030: 02 70 0d 28 00 00 00 00 00 00 00 00 00 00 00 00 .p.(............
Aug 14 07:09:01 Tower kernel: XFS (md1): Metadata corruption detected at xfs_dir3_block_read_verify+0xa6/0xb1, block 0x58
Aug 14 07:09:01 Tower kernel: XFS (md1): Unmount and run xfs_repair
Aug 14 07:09:01 Tower kernel: XFS (md1): First 64 bytes of corrupted metadata buffer:
Aug 14 07:09:01 Tower kernel: ffff8800995ac000: 58 44 42 33 7f 35 b4 55 00 00 00 00 00 00 00 58 XDB3.5.U.......X
Aug 14 07:09:01 Tower kernel: ffff8800995ac010: 00 00 00 01 00 00 00 9b c5 06 e0 29 17 fc 4b 42 ...........)..KB
Aug 14 07:09:01 Tower kernel: ffff8800995ac020: af 1e e9 94 80 d5 75 12 00 00 00 00 00 00 00 66 ......u........f
Aug 14 07:09:01 Tower kernel: ffff8800995ac030: 02 70 0d 28 00 00 00 00 00 00 00 00 00 00 00 00 .p.(............
Aug 14 07:09:01 Tower kernel: XFS (md1): metadata I/O error: block 0x58 ("xfs_trans_read_buf_map") error 117 numblks 8
Aug 14 07:09:05 Tower emhttp: shcmd (1218): mv '/mnt/user/Video' '/mnt/user/avs' |& logger
Aug 14 07:09:05 Tower emhttp: shcmd (1219): mv '/boot/config/shares/Video.cfg' '/boot/config/shares/avs.cfg' &> /dev/null
Aug 14 07:09:05 Tower emhttp: shcmd (1220): :>/etc/samba/smb-shares.conf
Aug 14 07:09:05 Tower emhttp: shcmd (1221): cp /etc/exports- /etc/exports
Aug 14 07:09:05 Tower avahi-daemon[10229]: Files changed, reloading.
Aug 14 07:09:05 Tower emhttp: shcmd (1222): echo '"/mnt/user/avs" -async,no_subtree_check,fsid=100 *(sec=sys,ro,insecure,anongid=100,anonuid=99,all_squash)' >>/etc/exports
Aug 14 07:09:05 Tower emhttp: Restart SMB...
Aug 14 07:09:05 Tower emhttp: shcmd (1223): killall -HUP smbd
Aug 14 07:09:05 Tower emhttp: shcmd (1224): cp /etc/avahi/services/smb.service- /etc/avahi/services/smb.service
Aug 14 07:09:05 Tower avahi-daemon[10229]: Files changed, reloading.
Aug 14 07:09:05 Tower avahi-daemon[10229]: Service group file /services/smb.service changed, reloading.
Aug 14 07:09:05 Tower emhttp: shcmd (1225): pidof rpc.mountd &> /dev/null
Aug 14 07:09:05 Tower emhttp: Restart NFS...
Aug 14 07:09:05 Tower emhttp: shcmd (1226): exportfs -ra |& logger
Aug 14 07:09:05 Tower emhttp: shcmd (1227): /etc/rc.d/rc.atalk status
Aug 14 07:09:06 Tower avahi-daemon[10229]: Service "Tower" (/services/smb.service) successfully established.
Aug 14 07:10:09 Tower emhttp: /usr/bin/tail -n 42 -f /var/log/syslog 2>&1
Aug 14 07:12:10 Tower emhttp: /usr/bin/tail -n 42 -f /var/log/syslog 2>&1
Aug 14 07:23:06 Tower kernel: mdcmd (26): spindown 3
Aug 14 07:23:06 Tower kernel: mdcmd (27): spindown 4
Aug 14 07:23:06 Tower kernel: mdcmd (28): spindown 5
Aug 14 07:23:06 Tower kernel: mdcmd (29): spindown 6
Aug 14 07:24:43 Tower emhttp: shcmd (1228): /usr/sbin/hdparm -y /dev/sdc &> /dev/null
Aug 14 07:24:44 Tower emhttp: shcmd (1229): /usr/sbin/hdparm -y /dev/sdd &> /dev/null
Aug 14 07:26:06 Tower kernel: mdcmd (30): spindown 1
Aug 14 07:26:06 Tower kernel: mdcmd (31): spindown 2
Aug 14 07:41:06 Tower kernel: mdcmd (32): spindown 0
Aug 14 08:00:01 Tower logger: mover started
Aug 14 08:00:01 Tower logger: skipping "plexmediaserver"
Aug 14 08:00:01 Tower logger: mover finished
Aug 14 08:32:54 Tower emhttp: /usr/bin/tail -n 42 -f /var/log/syslog 2>&1

We had a power outage two weeks ago. I had no UPS on the box at the time. That could have caused this corruption issue. I had not touched the box at all until Monday, when I tried to do a parity swap from a 2TB to a 4TB drive. The preclear on two 4TB drives took days, and I only did a 2x preclear. I still need to figure out why the 2TB drive that was moved to a data slot, and which happened to be the "faulty disk", did not ask to be formatted to XFS.

After the xfs_repair runs (starting with -v from a remote ssh session, then a GUI repair using -n, and finally -L as the last resort), I stopped the array, which was in maintenance mode, removed md1 (disk1), and started the array without it. The array detected the missing disk and started normally. I did not check the shares or the presence of SMB shares at this stage. I then stopped the array and started it with disk1 again; it detected disk1 as a new disk and rebuilt it into the array, which took a couple of hours. I tried xfs_repair again with -n, then -v, and again -L, with no change. The system just continued to kick out the XFS error text every few seconds.

I started the array in normal mode and found that all the shares that had been missing, and the new ones I had created initially, were present in the list of shares. The system continued to kick out XFS errors, so I decided to move the valid data and delete all shares. I could delete all but one, the video share. I stopped and started the array, as this would have restarted SMB. The "video" share was still present after that. I decided to rename the share in the GUI, and that solved the share creation problem. The XFS dump issue stopped after the rename, i.e. "mv '/boot/config/shares/Video.cfg' '/boot/config/shares/avs.cfg' &> /dev/null", as shown in the attached log.

I still cannot delete the renamed share on disk1, and a new XFS message has popped up: "Structure needs cleaning". This appears at the second folder level in that broken share. I have removed disk1 from global shares to delete the share structure, but it keeps on saying "Structure needs cleaning". I have decided to format the drive with reiserfs.
avs_shares.zip
Archived
This topic is now archived and is closed to further replies.