Squiggley Posted July 26, 2019
I suspect my server cupboard became too hot yesterday, as the server started behaving badly. It tells me that the newest 8TB disk, which I put in a few weeks ago, is "Unmountable: No file system". I could not run any SMART tests on it, so I rebooted the system. After the reboot the disk is still missing from my array, but I can now run the SMART tests on it, and they all seem to be fine. Is the disk faulty, or should I just rebuild the array? Thanks
Attachment: heart-of-gold-diagnostics-20190726-0317.zip
Squiggley (Author) Posted July 26, 2019
So I took the array down and ran xfs_repair. I had to use the -L option, with the following results:

# xfs_repair -vL /dev/sdi1
Phase 1 - find and verify superblock...
        - block cache size set to 1092552 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 1343446 tail block 1343435
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
sb_fdblocks 64531954, counted 66679050
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 4
        - agno = 7
        - agno = 6
        - agno = 5
        - agno = 3
        - agno = 1
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (11:1343435) is ahead of log (1:2).
Format log to cycle 14.

        XFS_REPAIR Summary    Fri Jul 26 18:43:47 2019

Phase           Start           End             Duration
Phase 1:        07/26 18:42:47  07/26 18:42:47
Phase 2:        07/26 18:42:47  07/26 18:42:59  12 seconds
Phase 3:        07/26 18:42:59  07/26 18:43:13  14 seconds
Phase 4:        07/26 18:43:13  07/26 18:43:13
Phase 5:        07/26 18:43:13  07/26 18:43:13
Phase 6:        07/26 18:43:13  07/26 18:43:24  11 seconds
Phase 7:        07/26 18:43:24  07/26 18:43:24

Total run time: 37 seconds
done

It still says it's "Unmountable: No file system". Can I rebuild this disk from the parity drive? Thanks
JorgeB Posted July 26, 2019
You need to run xfs_repair on the mdX device to maintain parity. Having said that, the filesystem itself should have been fixed; running it on the sdX device would just put parity out of sync. Run xfs_repair one more time, on the md device, then start the array and post new diags if the disk is still unmountable. Either way, parity can't help with filesystem corruption, and you'll need to run a correcting parity check since parity is now out of sync.
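As an illustration of the advice above, here is a minimal dry-run sketch that only prints the commands rather than running them. It assumes array disk N is exposed as /dev/mdN and that the array is started in maintenance mode so the md devices exist; the helper names md_device and repair_plan are made up for this example, and disk number 5 is only a placeholder. The -n flag makes xfs_repair check without modifying anything, and -v makes it verbose.

```shell
#!/bin/sh
# Sketch only: prints the xfs_repair commands, it does not execute them.
# Assumption: array disk N is exposed as /dev/mdN (the parity-protected device).

# Hypothetical helper: map an array slot number to its md device path.
md_device() {
    printf '/dev/md%s\n' "$1"
}

# Hypothetical helper: print a check-then-repair command sequence for one disk.
repair_plan() {
    dev=$(md_device "$1")
    # First a no-modify check (-n) to see what would be changed.
    printf 'xfs_repair -n %s\n' "$dev"
    # Then the actual repair with verbose output (-v), run against the
    # md device so that parity is updated along with the disk.
    printf 'xfs_repair -v %s\n' "$dev"
}

# Placeholder disk number; substitute the slot of the affected disk.
repair_plan 5
```

Running the commands against the raw partition (for example an sdX1 device) instead of the md device would repair the filesystem but bypass parity, which is why a correcting parity check is needed afterwards in that case.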
Squiggley (Author) Posted July 26, 2019
Thanks for the reply. I have now run xfs_repair on the md5 device, and these are the results: xfs_repair_results.txt
The array is now mounting fine, but I am missing a LOT of data: I have 2.5TB of stuff in my lost+found folder. Is there any decent way of recovering these 10477 files? Also, did I cause this by running xfs_repair on sdi1 instead of md5? Thanks
Squiggley (Author) Posted July 26, 2019
Oh, I just noticed: while disk 5 is mounting and no longer giving the "Unmountable: No file system" error, it does say "Device is disabled, contents emulated".
JorgeB Posted July 26, 2019
That explains why running fsck on the device didn't fix it: the disk is being emulated. You didn't mention that. Since the disk itself looks fine, if you think the emulated disk is missing data you can instead do a new config to re-enable the disk and then re-sync parity.
Squiggley (Author) Posted July 26, 2019
johnnie.black said: "That explains why running fsck on the device didn't fix it: the disk is being emulated. You didn't mention that. Since the disk itself looks fine, if you think the emulated disk is missing data you can instead do a new config to re-enable the disk and then re-sync parity."
I didn't mention it being emulated because it has only just started saying it, I think. I want to make sure I get this right. Sorry if this is a dumbass question, but I am rather worried now about doing the wrong thing. How would I do a new config and re-sync parity? Thanks for all your help!
JorgeB Posted July 27, 2019
- Tools -> New Config -> Retain current configuration: All -> Apply
- Check that all disks are assigned, then start the array to begin the parity sync.
Note: doing this assumes all disks are OK. If you want to play it safer, first rebuild the disabled disk to a new disk, then do a new config with the old one.