Jerky_san Posted August 19, 2018

So I upgraded to a Threadripper today: a Taichi X399 and a 2990WX. Come to find out, the Taichi and an LSI 9201-16i DO NOT PLAY WELL! I had read that you should disable the boot ROM images, but what ended up happening is that two of my disks "failed", and then, after I brought the array down to figure out what was going on, the HBA disappeared on me entirely. I went back to my old setup and am trying to rebuild. I have a dual-parity system, and a single parity drive and a single data disk failed. Scared that I would lose the data on that data disk, I started the array back up with a fresh new disk so I could preserve that disk's data if something happened. Well, something did happen. I am getting "Parity is invalid", and the pre-cleared data drive I put in shows "Unmountable: No file system". I am not getting any emulated data like in previous rebuilds I've been through. I am starting to get pretty nervous at this point. I still have the old data drive, so I debated starting a new config and rebuilding parity, but I think that might hose me further. I'd rather stick with having at least one valid parity drive, but I guess if you lose a parity drive and a data drive you're hosed even in a dual-parity system?

tower-diagnostics-20180818-2350.zip
JorgeB Posted August 19, 2018

You need to check the file system on disk6, either after the rebuild finishes or by canceling the current rebuild:

https://lime-technology.com/wiki/Check_Disk_Filesystems#Drives_formatted_with_XFS
or
https://lime-technology.com/wiki/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui
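For reference, a safe first step is a read-only check, which reports problems without changing anything on disk. A minimal sketch, assuming disk6 maps to /dev/md6 (Unraid's parity-protected device for that slot; adjust the number for a different disk):

```shell
# Read-only XFS sanity check; DEV is an assumption, defaulting to disk6.
# Always use the /dev/mdX device on Unraid, not the raw /dev/sdX device,
# so parity stays in sync with any later repair.
DEV="${DEV:-/dev/md6}"

if [ -e "$DEV" ]; then
  # -n = no-modify mode: report problems but write nothing to the disk
  xfs_repair -n "$DEV"
  msg="checked $DEV"
else
  msg="device $DEV not found; run this on the Unraid console"
fi
echo "$msg"
```

The array must be started in maintenance mode (or the disk unmounted) before running this.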
JorgeB Posted August 19, 2018

P.S. If xfs_repair asks for -L, don't use it yet, so we can try another thing first.
Jerky_san Posted August 19, 2018 (Author)

27 minutes ago, johnnie.black said: You need to check the file system on disk6, either after the rebuild finishes or by canceling the current rebuild...

OK, I'll do it right after the check is complete. Do you believe it will just rebuild the drive and the parity disk, and after the repair be "good" in that scenario? Thanks a lot for your help.
JorgeB Posted August 19, 2018

If the filesystem repair works, all should be good. I usually recommend doing it before rebuilding in case it doesn't work, but it usually does.
JorgeB Posted August 19, 2018

Since the rebuild is going to take a while and I'm going on vacation for a couple of weeks later today, this is what I would like you to try if, when running xfs_repair -v on disk6, you get an error like this:

The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair.

Start the array in maintenance mode, then on the console type:

mkdir /x
mount -vt xfs -o noatime,nodiratime /dev/md6 /x

If it mounts, see below; if it doesn't, try:

mount -vt xfs /dev/md6 /x

If it mounts with the 1st or 2nd option, now unmount:

umount /x

And run xfs_repair again:

xfs_repair -v /dev/md6

If it doesn't mount with either option, use -L:

xfs_repair -vL /dev/md6
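The steps above can be sketched as one guarded script. The idea is that a successful mount lets XFS replay its journal, after which a normal repair suffices; zeroing the log with -L is the last resort because it discards any pending metadata changes. DEV and MNT are assumptions matching the commands above:

```shell
# Sketch of the mount-to-replay-the-log sequence; adjust DEV/MNT as needed.
DEV="${DEV:-/dev/md6}"
MNT="${MNT:-/x}"

if [ ! -e "$DEV" ]; then
  status="skipped: $DEV not present (run on the Unraid console)"
  echo "$status"
else
  mkdir -p "$MNT"
  # Try mounting so XFS can replay its journal; fall back to default options.
  if mount -vt xfs -o noatime,nodiratime "$DEV" "$MNT" \
     || mount -vt xfs "$DEV" "$MNT"; then
    umount "$MNT"
    xfs_repair -v "$DEV"     # log replayed, normal repair
  else
    xfs_repair -vL "$DEV"    # last resort: -L destroys the log
  fi
  status="done"
fi
```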
Jerky_san Posted August 19, 2018 (Author)

4 hours ago, johnnie.black said: Since the rebuild is going to take a while and I'm going on vacation for a couple of weeks later today, this is what I would like you to try...

Thanks for taking the time to write all this up even though you have a vacation to prepare for. I'll run it right after the rebuild finishes. Currently it's showing 10 hours 40 minutes remaining, but I'm guessing it's more like 12 based on previous checks, as it slows down towards the end.

Edited August 19, 2018 by Jerky_san
Jerky_san Posted August 20, 2018 (Author)

The rebuild finished, so I did an xfs_repair. Below is what it showed, but the disk sadly doesn't mount.

root@Tower:~# xfs_repair -v /dev/md6
Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!
attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
        - block cache size set to 3026704 entries
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 96
resetting superblock root inode pointer to 96
sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 97
resetting superblock realtime bitmap ino pointer to 97
sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 98
resetting superblock realtime summary ino pointer to 98
Phase 2 - using internal log
        - zero log...
zero_log: head block 8 tail block 8
        - scan filesystem freespace and inode maps...
sb_icount 0, counted 64
sb_ifree 0, counted 61
sb_fdblocks 1952984865, counted 1952984857
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 6
        - agno = 4
        - agno = 0
        - agno = 3
        - agno = 5
        - agno = 2
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Note - stripe unit (0) and width (0) were copied from a backup superblock.
Please reset with mount -o sunit=<value>,swidth=<value> if necessary

        XFS_REPAIR Summary    Sun Aug 19 20:28:13 2018

Phase           Start           End             Duration
Phase 1:        08/19 20:28:12  08/19 20:28:12
Phase 2:        08/19 20:28:12  08/19 20:28:12
Phase 3:        08/19 20:28:12  08/19 20:28:12
Phase 4:        08/19 20:28:12  08/19 20:28:12
Phase 5:        08/19 20:28:12  08/19 20:28:12
Phase 6:        08/19 20:28:12  08/19 20:28:12
Phase 7:        08/19 20:28:12  08/19 20:28:12

Total run time:
done
Jerky_san Posted August 20, 2018 (Author)

It mounted fine with the command johnnie gave earlier, and I ran the repair again; it yet again said everything is fine... but it still won't mount in Unraid.

Edit: Well, it mounted, and the disk is completely empty... that's a sad face. So what's next? I assume it involves copying all the data off the old disk onto disk6 again?

Edited August 20, 2018 by Jerky_san
JorgeB Posted September 1, 2018

If I understand correctly, the rebuild was done to a new disk, i.e., you still have the old disk. If so, that disk should still be OK; you can check it with UD (Unassigned Devices), and if it's good, do a new config with it and the remaining disks.
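The equivalent of the UD check from the console would be a read-only mount of the old disk, so nothing on it can change while you verify the contents. This is a sketch; /dev/sdX1 is a placeholder for the old data disk's partition, not a real device name:

```shell
# Hypothetical read-only check of the old disk before trusting it in a
# new config. OLD is a placeholder; find the real device with lsblk.
OLD="${OLD:-/dev/sdX1}"
MNT="${MNT:-/mnt/olddisk}"

if [ -e "$OLD" ]; then
  mkdir -p "$MNT"
  mount -o ro "$OLD" "$MNT"   # read-only, so the disk is never written
  ls "$MNT"                   # eyeball the contents
  umount "$MNT"
  result="checked $OLD"
else
  result="skipped: $OLD not present"
fi
echo "$result"
```

Note that a new config invalidates parity, so it will need to rebuild afterwards; verify the old disk's data first.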