Jerky_san Posted August 19, 2018

So I upgraded to a Threadripper today: a Taichi X399 and a 2990WX. Come to find out, the Taichi and an LSI 9201-16i DO NOT PLAY WELL! I had read that you should disable the boot ROM images, but what ended up happening is that two of my disks "failed", and then, after I brought the array down to figure out what was going on, the HBA disappeared on me entirely. I went back to my old setup and am trying to rebuild. I have a dual-parity system, and a single parity drive and a single data disk failed. Scared that I would lose the data on that data disk, I started the array back up with a fresh new disk so I could preserve that disk's data if something happened. Well, something did happen. I am getting "Parity is invalid", and the pre-cleared data drive I put in shows "Unmountable: No file system". I am not getting any emulated data like in previous rebuilds I've been through. I am starting to get pretty nervous at this point. I still have the old data drive, so I debated starting a new config and rebuilding parity, but I think that might hose me further. I'd rather stick with having at least one valid parity drive, but I guess if you lose a parity drive and a data drive you're hosed even in a dual-parity system?

tower-diagnostics-20180818-2350.zip
JorgeB Posted August 19, 2018

You need to check the file system on disk6, either after the rebuild finishes or by canceling the current rebuild:

https://lime-technology.com/wiki/Check_Disk_Filesystems#Drives_formatted_with_XFS
or
https://lime-technology.com/wiki/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui
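For reference, a safe first step is a read-only check, which reports problems without changing anything on disk. A minimal sketch, assuming disk6 maps to /dev/md6 (Unraid's parity-protected device for that slot; adjust the number for a different disk):

```shell
# Read-only XFS sanity check; DEV is an assumption, defaulting to disk6.
# Always use the /dev/mdX device on Unraid, not the raw /dev/sdX device,
# so parity stays in sync with any later repair.
DEV="${DEV:-/dev/md6}"

if [ -e "$DEV" ]; then
  # -n = no-modify mode: report problems but write nothing to the disk
  xfs_repair -n "$DEV"
  msg="checked $DEV"
else
  msg="device $DEV not found; run this on the Unraid console"
fi
echo "$msg"
```

The array must be started in maintenance mode (or the disk unmounted) before running this.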
JorgeB Posted August 19, 2018

P.S. If xfs_repair asks for -L, don't use it yet, so we can try another thing first.
Jerky_san Posted August 19, 2018 (Author)

27 minutes ago, johnnie.black said: You need to check the file system on disk6, either after the rebuild finishes or by canceling the current rebuild...

OK, I'll do it right after the check is complete. Do you believe it will just rebuild the drive and the parity disk, and after the repair be "good" in that scenario? Thanks a lot for your help.
JorgeB Posted August 19, 2018

If the filesystem repair works, all should be good. I usually recommend doing it before rebuilding in case it doesn't work, but it usually does.
JorgeB Posted August 19, 2018

Since the rebuild is going to take a while and I'm going on vacation for a couple of weeks later today, this is what I would like you to try if, when running xfs_repair -v on disk6, you get an error like this:

The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair.

Start the array in maintenance mode, then on the console type:

mkdir /x
mount -vt xfs -o noatime,nodiratime /dev/md6 /x

If it mounts, see below; if it doesn't, try:

mount -vt xfs /dev/md6 /x

If it mounts with the 1st or 2nd option, now unmount:

umount /x

And run xfs_repair again:

xfs_repair -v /dev/md6

If it doesn't mount with either option, use -L:

xfs_repair -vL /dev/md6
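The steps above can be sketched as one guarded script. The idea is that a successful mount lets XFS replay its journal, after which a normal repair suffices; zeroing the log with -L is the last resort because it discards any pending metadata changes. DEV and MNT are assumptions matching the commands above:

```shell
# Sketch of the mount-to-replay-the-log sequence; adjust DEV/MNT as needed.
DEV="${DEV:-/dev/md6}"
MNT="${MNT:-/x}"

if [ ! -e "$DEV" ]; then
  status="skipped: $DEV not present (run on the Unraid console)"
  echo "$status"
else
  mkdir -p "$MNT"
  # Try mounting so XFS can replay its journal; fall back to default options.
  if mount -vt xfs -o noatime,nodiratime "$DEV" "$MNT" \
     || mount -vt xfs "$DEV" "$MNT"; then
    umount "$MNT"
    xfs_repair -v "$DEV"     # log replayed, normal repair
  else
    xfs_repair -vL "$DEV"    # last resort: -L destroys the log
  fi
  status="done"
fi
```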
Jerky_san Posted August 19, 2018 (Author)

4 hours ago, johnnie.black said: Since the rebuild is going to take a while and I'm going on vacation for a couple of weeks later today, this is what I would like you to try...

Thanks for taking the time to write all this up even though you have a vacation to prepare for. I'll run it right after the rebuild finishes. Currently it's showing 10 hours 40 minutes remaining, but I'm guessing it's more like 12 based on previous checks, as it slows down towards the end.

Edited August 19, 2018 by Jerky_san
Jerky_san Posted August 20, 2018 (Author)

The rebuild finished, so I did an xfs_repair. Below is what it showed, but the disk sadly doesn't mount.

root@Tower:~# xfs_repair -v /dev/md6
Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!
attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
        - block cache size set to 3026704 entries
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 96
resetting superblock root inode pointer to 96
sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 97
resetting superblock realtime bitmap ino pointer to 97
sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 98
resetting superblock realtime summary ino pointer to 98
Phase 2 - using internal log
        - zero log...
zero_log: head block 8 tail block 8
        - scan filesystem freespace and inode maps...
sb_icount 0, counted 64
sb_ifree 0, counted 61
sb_fdblocks 1952984865, counted 1952984857
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 6
        - agno = 4
        - agno = 0
        - agno = 3
        - agno = 5
        - agno = 2
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Note - stripe unit (0) and width (0) were copied from a backup superblock.
Please reset with mount -o sunit=<value>,swidth=<value> if necessary

        XFS_REPAIR Summary    Sun Aug 19 20:28:13 2018

Phase           Start           End             Duration
Phase 1:        08/19 20:28:12  08/19 20:28:12
Phase 2:        08/19 20:28:12  08/19 20:28:12
Phase 3:        08/19 20:28:12  08/19 20:28:12
Phase 4:        08/19 20:28:12  08/19 20:28:12
Phase 5:        08/19 20:28:12  08/19 20:28:12
Phase 6:        08/19 20:28:12  08/19 20:28:12
Phase 7:        08/19 20:28:12  08/19 20:28:12

Total run time:
done
Jerky_san Posted August 20, 2018 (Author)

It mounted fine with the command johnnie gave earlier, and I ran the repair again; it yet again said everything is fine... but it still won't mount in Unraid.

Edit: Well, it mounted, and the disk is completely empty... that's a sad face. So what's next? I assume it involves copying all the data off the old disk onto disk6 again?

Edited August 20, 2018 by Jerky_san
JorgeB Posted September 1, 2018

If I understand correctly, the rebuild was done to a new disk, i.e., you still have the old disk. If so, that disk should still be OK; you can check it with UD (Unassigned Devices), and if it's good, do a new config with it and the remaining disks.
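The equivalent of the UD check from the console would be a read-only mount of the old disk, so nothing on it can change while you verify the contents. This is a sketch; /dev/sdX1 is a placeholder for the old data disk's partition, not a real device name:

```shell
# Hypothetical read-only check of the old disk before trusting it in a
# new config. OLD is a placeholder; find the real device with lsblk.
OLD="${OLD:-/dev/sdX1}"
MNT="${MNT:-/mnt/olddisk}"

if [ -e "$OLD" ]; then
  mkdir -p "$MNT"
  mount -o ro "$OLD" "$MNT"   # read-only, so the disk is never written
  ls "$MNT"                   # eyeball the contents
  umount "$MNT"
  result="checked $OLD"
else
  result="skipped: $OLD not present"
fi
echo "$result"
```

Note that a new config invalidates parity, so it will need to rebuild afterwards; verify the old disk's data first.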