cliewmc Posted January 29, 2018 Posted January 29, 2018 Hi there, this is my issue: Parity (Disk DSBL) and one data disk2 (Unmountable No File System) Is there any way I can recover the Parity? I believe disk2 is completely dead based on the error report. All the file systems are xfs. I have attached logs but can provide more if they are not the correct ones. Any advice I can get would be greatly appreciated. Regards. cL NAS1_1main.pdf NAS1_2parity_settings.pdf NAS1_3disk2_settings.pdf clnasty-smart-20180129-1455.zip clnasty-smart-20180129-1432.zip
JorgeB Posted January 29, 2018 Posted January 29, 2018 Please post the diagnostics: Tools -> Diagnostics. Disk2 looks fine; it probably only needs a filesystem repair. The parity disk dropped offline, so grab the current diags, reboot, grab new ones, then post both.
cliewmc Posted January 29, 2018 Author Posted January 29, 2018 Hey jb, thanks for the swift response. I’ll work on the diags. Thanks. cL
cliewmc Posted January 29, 2018 Author Posted January 29, 2018 Here are the diag reports, before & after reboot. Thanks. cL clnasty-diagnostics-20180130-0921.zip clnasty-diagnostics-20180130-0928.zip
trurl Posted January 29, 2018 Posted January 29, 2018 Moved to V6 General Support since obviously the OP isn't running V5.
JorgeB Posted January 29, 2018 Posted January 29, 2018 Parity disk has some reallocated sectors, but the way it dropped offline is more consistent with a cable issue. Replace the cables to rule them out. You should then at least run an extended SMART test, and possibly replace the disk even if it passes, especially if those reallocated sectors are new. Disk2 looks fine; there is filesystem corruption that should be fixed by running xfs_repair, though you'll most likely need to use -L: http://lime-technology.com/wiki/index.php/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui
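For readers who prefer a terminal to the webGui, the check-then-repair sequence described above can be sketched as follows. This is only an illustration: the helper name `xfs_repair_command` is hypothetical, and the device path `/dev/md2` for disk2 is an assumption (it presumes the array is started in maintenance mode); verify the device for your own system first. The sketch only builds the argument list and does not run anything. The flags `-n` (check only), `-L` (zero the log) and `-v` (verbose) are standard xfs_repair options.

```python
from typing import List


def xfs_repair_command(device: str, dry_run: bool = True,
                       zero_log: bool = False) -> List[str]:
    """Build (but do not run) an xfs_repair argument list.

    device is an assumption supplied by the caller, e.g. "/dev/md2".
    """
    # This sketch treats the two modes as mutually exclusive: a check-only
    # run should not be asked to destroy the log.
    if dry_run and zero_log:
        raise ValueError("choose either a check-only run or a -L repair")
    args = ["xfs_repair"]
    if dry_run:
        args.append("-n")  # no-modify mode: report problems only
    if zero_log:
        args.append("-L")  # zero the log; metadata changes held in it are lost
    args.append("-v")      # verbose output
    args.append(device)
    return args


# Check first, then repair with -L only if a plain repair asks for it.
check_cmd = xfs_repair_command("/dev/md2")
repair_cmd = xfs_repair_command("/dev/md2", dry_run=False, zero_log=True)
print(" ".join(check_cmd))
print(" ".join(repair_cmd))
```

Running the check-only form first is the cautious choice, since `-L` discards whatever metadata changes were still sitting in the log.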
cliewmc Posted January 29, 2018 Author Posted January 29, 2018 Thanks JB, will do, will report my results later. Appreciate the support. [phew! sigh of relief] cL
cliewmc Posted January 30, 2018 Author Posted January 30, 2018 12 hours ago, johnnie.black said: Parity disk has some reallocated sectors, but the way it dropped offline is more consistent with a cable issue, replace cables to rule them out, you should then at least run an extended SMART test and possibly replacing it even if it passes the extended test, especially if those reallocated sectors are new. Disk2 looks fine, there is filesystem corruption that should be fixed by running xfs_repair, you'll most likely need to use -L though: http://lime-technology.com/wiki/index.php/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui Parity: Tried to reconnect cable but same issue so I will need to get new cables to test. Disk2: Running the xfs_repair -L and it's taking a long time - still running after 2 hours... with this message: Phase 1 - find and verify superblock... so I suspect the disk has a lot of errors and may not be recoverable. I'll wait until tomorrow to see if we will see the end of the repair. I remember reading that it's only a few minutes to half an hour.
JorgeB Posted January 30, 2018 Posted January 30, 2018 9 minutes ago, cliewmc said: Parity: Tried to reconnect cable but same issue so I will need to get new cables to test. You'll need to resync parity to re-enable the disk. 10 minutes ago, cliewmc said: Disk2: Running the xfs_repair -L and it's taking a long time - still running after 2 hours... with this message: Phase 1 - find and verify superblock. That's not a very good sign, are you running xfs_repair from the GUI?
cliewmc Posted January 30, 2018 Author Posted January 30, 2018 Hi JB, the xfs_repair was from the GUI. I found that when I clicked on Main and then went back to the disk2 screen, the results were displayed. It's not good: it's 2,700 pages long in a Word document. Extracts from it look like this:
--- start ---
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
agf_freeblks 98235484, counted 98235466 in ag 0
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
data fork in regular inode 2153876768 claims used block 269477115
correcting nextents for inode 2153876768
---
entry ".master through with-08.06.17.pdf" at block 0 offset 2312 in directory inode 180545041 references free inode 5368712851
clearing inode number in entry at offset 2312...
clearing inode number in entry at offset 640...
---
rebuilding directory inode 3227471378
entry ".." in directory inode 3227471387 points to free inode 4294969678
bad hash table for directory inode 3227471387 (no data entry): rebuilding
rebuilding directory inode 3227471387
entry ".." in directory inode 3227471388 points to free inode 4294969678
bad hash table for directory inode 3227471388 (no data entry): rebuilding
rebuilding directory inode 3227471388
---
list!
releasing dirty buffer (bulk) to free list!
releasing dirty buffer (bulk) to free list!
releasing dirty buffer (bulk) to free list!
releasing dirty buffer (bulk) to free list!
releasing dirty buffer (bulk) to free list!
releasing dirty buffer (bulk) to free list!
releasing dirty buffer (bulk) to free list!
releasing dirty buffer (bulk) to free list!
releasing dirty buffer (bulk) to free list!
releasing dirty buffer (bulk) to free list!
releasing dirty buffer (bulk) to free list!
releasing dirty buffer (bulk) to free list!
done
--- end ---
Disk2 is really dead. My next hope is that I can rescue the parity drive with a new cable and try to rebuild a new drive to replace disk2. I might still be in trouble as the parity has some errors (see attached files):
# | Attribute Name | Flag | Value | Worst | Threshold | Type | Updated | Failed | Raw Value
5 | Reallocated sector count | 0x0033 | 100 | 100 | 050 | Pre-fail | Always | Never | 1240
Will keep you informed. Thank you for your advice. cL NAS1_2parity_settings.pdf
JorgeB Posted January 30, 2018 Posted January 30, 2018 1 hour ago, cliewmc said: Disk2 is really dead Filesystem corruption is not the same as a failed disk; if parity were in sync, it would rebuild the same corrupt filesystem. 1 hour ago, cliewmc said: list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!done --- end --- Is this how xfs_repair ends? 1 hour ago, cliewmc said: 5 Reallocated sector count 0x0033 100 100 050 Pre-fail Always Never 1240 Yes, those are the reallocated sectors I told you about, and I would replace that disk.
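The SMART row quoted above shows why the normalized value alone can be misleading: it still reads 100 against a threshold of 050, while the raw count is 1240. A minimal sketch of pulling those fields out of such a row follows. The names `SmartAttribute`, `parse_smart_row` and `has_reallocations` are hypothetical, the column order is taken from the header in the post, and treating any non-zero raw count as worth attention is an assumption that mirrors the advice in this thread rather than a formal rule.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    flag: str
    value: int
    worst: int
    threshold: int
    attr_type: str
    updated: str
    failed: str
    raw_value: int


def parse_smart_row(row: str) -> SmartAttribute:
    """Parse one attribute row laid out like the one posted in the thread."""
    tokens = row.split()
    # The attribute name may contain spaces, so take the ID from the front
    # and the eight fixed columns from the back; the rest is the name.
    attr_id = int(tokens[0])
    flag, value, worst, threshold, attr_type, updated, failed, raw = tokens[-8:]
    name = " ".join(tokens[1:-8])
    return SmartAttribute(attr_id, name, flag, int(value), int(worst),
                          int(threshold), attr_type, updated, failed, int(raw))


def has_reallocations(attr: SmartAttribute) -> bool:
    """Assumed heuristic: attribute 5 with a non-zero raw count."""
    return attr.attr_id == 5 and attr.raw_value > 0


row = "5 Reallocated sector count 0x0033 100 100 050 Pre-fail Always Never 1240"
attr = parse_smart_row(row)
print(attr.name, attr.raw_value, has_reallocations(attr))
```

Comparing the raw count between two diagnostics snapshots would show whether the reallocated sectors are new, which is the case JorgeB flags as the strongest reason to replace the disk.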
cliewmc Posted January 31, 2018 Author Posted January 31, 2018 disk2: Is this how xfs_repair ends? Answer: Yes. In this case, what is the sequence for recovery? I rebooted the server and found disk2 is now mounted! parity: Yes, those are the reallocated sectors I told you about and I would replace that disk. Answer: Okay, noted. I will try a new cable first, rebuild (now that disk2 is recovered), then replace it. Thanks JB, it's a relief that disk2 is up! Progress is positive. cL
cliewmc Posted January 31, 2018 Author Posted January 31, 2018 Hi JB, as an update, disk2 is up and I found hundreds of thousands of pdf, xlsx and docx files in the 'lost+found' folder. I have deleted them. It turned out these were created by Ransomware Protection. I would advise not setting "Recreate Bait Files" if you are doing reboots, because the files take some time to create. If a reboot happens while they are being created, they get trapped in no-man's land and end up unlinked. In subsequent xfs_repair checks, more of these have been found. I have taken out the parity disk and set it to 'no device'. I have proceeded to preclear it to use as a normal data disk - don't trust it to be a parity disk. I am readying a replacement in the meantime. My NAS is now running without parity for the time being. Regards.
JonathanM Posted February 1, 2018 Posted February 1, 2018 20 minutes ago, cliewmc said: I have proceeded to preclear it to use as a normal data disk - don't trust it to be a parity disk. Of all the drives you need to trust, it's a data disk. All disks are required to rebuild a faulty disk, so a questionable data drive is more likely to cause data loss than a parity disk. Consider the scenario where you have single parity and 2 disks fail. If one of those dead disks is the parity drive, you've only lost 1 drive's worth of data; if 2 data drives fail, you lose both data drives, even if the parity drive is fine.
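The two-failure scenario above can be illustrated with a toy model of single parity, where the parity block is the XOR of the data blocks. This is a sketch under stated assumptions: the disk names and byte values are made up, and the helpers `xor_blocks` and `data_disks_lost` are hypothetical names, not part of any Unraid tool.

```python
from functools import reduce
from operator import xor
from typing import Dict, List, Set


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def data_disks_lost(failed: Set[str]) -> Set[str]:
    """Data disks that cannot be rebuilt under single parity."""
    if len(failed) <= 1:
        return set()              # one failure is always rebuildable
    return failed - {"parity"}    # two or more: every failed data disk is lost


# Made-up contents for three data disks.
data: Dict[str, bytes] = {
    "disk1": b"\x01\x02",
    "disk2": b"\x10\x20",
    "disk3": b"\xaa\xbb",
}
parity = xor_blocks(list(data.values()))

# Single failure: rebuilding disk2 needs every other disk plus parity.
rebuilt = xor_blocks([data["disk1"], data["disk3"], parity])
print(rebuilt == data["disk2"])

# Double failures, as in the post above.
print(data_disks_lost({"parity", "disk2"}))  # one drive's worth of data
print(data_disks_lost({"disk1", "disk2"}))   # both data drives
```

The rebuild line shows why every surviving disk has to be readable: the missing block is recovered from all of them together, so a flaky data disk puts the rebuild of any other disk at risk.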
trurl Posted February 1, 2018 Posted February 1, 2018 Parity is the least important disk, since it doesn't contain any of your data. As jonathanm said.