darkwolf Posted April 8, 2021

So I have had some cabling and controller issues and kept losing parity, so I disabled parity (I know, I know) until I could get my replacement card in. I re-cabled with new cables, then had an issue with md1 (and the physical drive as well). I rechecked the power connections, same result, then swapped in a whole new power connection for four drives and md1 came back fine. But somewhere in the ups and downs, the md4 device began having I/O errors, even though an xfs check of the underlying device (currently /dev/sdg1) looks fine (run with -n so it wouldn't change anything).

I tried the 'shrink array' method of rebuilding the array with a new config, keeping the data, to see if that would 'fix' the error. Still no. Since I do not have parity right now, I am guessing it is safe to xfs_repair the non-md device (i.e. /dev/sdg1), recover what I can, then just make a new config (keep files) and go from there?

Output from xfs_repair:

    root@media:~# xfs_repair -vn /dev/sdg1
    Phase 1 - find and verify superblock...
            - block cache size set to 6157048 entries
    Phase 2 - using internal log
            - zero log...
    zero_log: head block 1233498 tail block 1233498
            - scan filesystem freespace and inode maps...
            - found root inode chunk
    Phase 3 - for each AG...
            - scan (but don't clear) agi unlinked lists...
            - process known inodes and perform inode discovery...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
            - process newly discovered inodes...
    Phase 4 - check for duplicate blocks...
            - setting up duplicate extent list...
            - check for inodes claiming duplicate blocks...
            - agno = 0
            - agno = 2
            - agno = 1
            - agno = 3
    No modify flag set, skipping phase 5
    Phase 6 - check inode connectivity...
            - traversing filesystem ...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
            - traversal finished ...
            - moving disconnected inodes to lost+found ...
    Phase 7 - verify link counts...
    No modify flag set, skipping filesystem flush and exiting.
    XFS_REPAIR Summary    Thu Apr  8 01:18:04 2021

    Phase           Start           End             Duration
    Phase 1:        04/08 01:17:39  04/08 01:17:39
    Phase 2:        04/08 01:17:39  04/08 01:17:40  1 second
    Phase 3:        04/08 01:17:40  04/08 01:17:56  16 seconds
    Phase 4:        04/08 01:17:56  04/08 01:17:57  1 second
    Phase 5:        Skipped
    Phase 6:        04/08 01:17:57  04/08 01:18:04  7 seconds
    Phase 7:        04/08 01:18:04  04/08 01:18:04

    Total run time: 25 seconds

    root@media:~# xfs_repair -vn /dev/md4
    Phase 1 - find and verify superblock...
    superblock read failed, offset 0, size 524288, ag 0, rval -1
    fatal error -- Input/output error

media-diagnostics-20210408-0054.zip
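For anyone following the same path, the sequence above (a read-only check on the bare partition, then a real repair while parity is disabled) can be sketched as a dry-run shell plan. The device name is taken from this thread and is an assumption for anyone else's system; the commands are only echoed, so nothing is written until you remove the `echo`s.

```shell
# A minimal sketch, assuming /dev/sdg1 is the partition that passed the -n check.
# Echo-only dry run: drop the echoes to actually execute the commands.
DEV=/dev/sdg1
CHECK="xfs_repair -n $DEV"   # read-only: reports problems, changes nothing
REPAIR="xfs_repair $DEV"     # real repair, only once the -n pass looks clean
echo "$CHECK"
echo "$REPAIR"
```

Running against the partition (`/dev/sdg1`) rather than the md device sidesteps the I/O error seen on `/dev/md4`, which is exactly why the -n check succeeded on one and failed on the other.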
darkwolf Posted April 8, 2021 (Author)

Oh, and then of course adding my parity drive back in (now that it is looking fine) and doing parity checks.
darkwolf Posted April 8, 2021 (Author)

So I did it, because most of the data is either backed up somewhere or media I can re-rip from disc. It looks like I have a few disk errors now. Suggestions?
darkwolf Posted April 8, 2021 (Author)

The drives with read errors are on separate SAS cables and different power segments, so I am thinking it is a mix of old drives and funkiness from my old controller. My plan at the moment is to remove the drives with the read errors from the drive pool, make a new config with the remaining drives, then mount -o ro,norecovery the other drives and move the data over to the array. I have some spare drives to throw in to make more space, so that should work out space-wise.

I know I may have some file corruption; anything of importance, like I said, is backed up already, so I may just restore the shares that are important and worry about the rest on a case-by-case basis. I am still open to feedback though, as this process will take a while and I won't start on it for a few days.
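The migration plan described above can be sketched as another echo-only dry run. The device names, mount point, and destination path here are illustrative assumptions, not taken from the diagnostics; `norecovery` mounts the XFS filesystem read-only without replaying its log, which is the safe way to read a drive you suspect is failing.

```shell
# Sketch of the "mount read-only, copy onto the array" plan. All paths and
# device names below are assumptions for illustration; commands are echoed,
# not executed.
MNT=/mnt/recover
DEST=/mnt/user0                      # assumed array destination
for DEV in /dev/sdg1 /dev/sdh1; do   # the drives being retired (assumed names)
    echo "mkdir -p $MNT"
    echo "mount -o ro,norecovery $DEV $MNT"   # read-only, skip XFS log replay
    echo "rsync -avP $MNT/ $DEST/"            # archive mode, resumable copy
    echo "umount $MNT"
done
```

rsync's `-avP` keeps permissions and timestamps and shows progress, so a copy interrupted by a fresh read error can be restarted without starting over.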
JorgeB Posted April 8, 2021

1 hour ago, darkwolf said: "Suggestions?"

Post new diags after the errors; the previous one didn't have those.
darkwolf Posted April 10, 2021 (Author)

I just took the faulty drives out of the array, rebuilt the new config, and copied the data over. Everything looked good. Once I get my new drives in for double parity and everything checks out, I will pre-clear the 'faulty' drives and see how that goes. If I have issues with them I will make a new post. Thanks for doing your awesome jobs! Very much appreciate everyone who helps out here in the community! Much love!
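Before trusting the pulled drives again, a SMART extended self-test is a reasonable complement to a pre-clear pass. This is a hedged sketch with a placeholder device name (`/dev/sdX` is not a real device); commands are echoed rather than run.

```shell
# Vetting a pulled drive before reuse. /dev/sdX is a placeholder assumption.
# Echo-only dry run: remove the echoes to execute for real.
DRIVE=/dev/sdX
START_TEST="smartctl -t long $DRIVE"   # extended self-test; can take hours
REPORT="smartctl -a $DRIVE"            # full attributes and self-test results
echo "$START_TEST"
echo "$REPORT"
```

In the report, growing Reallocated_Sector_Ct or Current_Pending_Sector counts are the usual signs a drive is genuinely failing rather than a victim of bad cabling or a flaky controller.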