Request help with Disk Issues: How do I recover?


cliewmc


Hi there, this is my issue: the parity disk is disabled (Disk DSBL) and one data disk, disk2, shows as Unmountable (No File System).

 

Is there any way I can recover the Parity? I believe disk2 is completely dead based on the error report. All the file systems are xfs. 

 

I have attached logs but can provide more if they are not the correct ones. 

 

Any advice I can get would be greatly appreciated. Regards. cL

 

 

NAS1_1main.pdf

NAS1_2parity_settings.pdf

NAS1_3disk2_settings.pdf

clnasty-smart-20180129-1455.zip

clnasty-smart-20180129-1432.zip


Parity disk has some reallocated sectors, but the way it dropped offline is more consistent with a cable issue. Replace the cables to rule them out. You should then at least run an extended SMART test, and consider replacing the disk even if it passes, especially if those reallocated sectors are new.

Disk2 looks fine. There is filesystem corruption that should be fixed by running xfs_repair; you'll most likely need to use -L though:

 

http://lime-technology.com/wiki/index.php/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui

 

 


 

12 hours ago, johnnie.black said:

Parity disk has some reallocated sectors, but the way it dropped offline is more consistent with a cable issue. Replace the cables to rule them out. You should then at least run an extended SMART test, and consider replacing the disk even if it passes, especially if those reallocated sectors are new.

Disk2 looks fine. There is filesystem corruption that should be fixed by running xfs_repair; you'll most likely need to use -L though:

 

http://lime-technology.com/wiki/index.php/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui

 

 

Parity: Tried to reconnect cable but same issue so I will need to get new cables to test.

Disk2: I am running xfs_repair -L and it's taking a long time. It is still running after 2 hours with this message: Phase 1 - find and verify superblock... so I suspect the disk has a lot of errors and may not be recoverable. I'll wait until tomorrow to see whether the repair finishes. I remember reading that it usually takes only a few minutes to half an hour.

9 minutes ago, cliewmc said:

Parity: Tried to reconnect cable but same issue so I will need to get new cables to test.

You'll need to resync parity to re-enable the disk.

 

10 minutes ago, cliewmc said:

Disk2: Running the xfs_repair -L and it's taking a long time - still running after 2 hours... with this message: Phase 1 - find and verify superblock.

That's not a very good sign. Are you running xfs_repair from the GUI?


Hi JB, the xfs_repair was run from the GUI. I found that when I clicked on Main and then went back to the disk2 screen, the results were displayed. It's not good: the output is 2,700 pages long when pasted into a Word document. Extracts from it look like this:

--- start ---

Phase 1 - find and verify superblock...

Phase 2 - using internal log

        - zero log...

ALERT: The filesystem has valuable metadata changes in a log which is being

destroyed because the -L option was used.

        - scan filesystem freespace and inode maps...

agf_freeblks 98235484, counted 98235466 in ag 0

        - found root inode chunk

Phase 3 - for each AG...

        - scan and clear agi unlinked lists...

        - process known inodes and perform inode discovery...

        - agno = 0

        - agno = 1

        - agno = 2

data fork in regular inode 2153876768 claims used block 269477115

correcting nextents for inode 2153876768

---

entry ".master through with-08.06.17.pdf" at block 0 offset 2312 in directory inode 180545041 references free inode 5368712851

                    clearing inode number in entry at offset 2312...

                    clearing inode number in entry at offset 640...

---

rebuilding directory inode 3227471378

entry ".." in directory inode 3227471387 points to free inode 4294969678

bad hash table for directory inode 3227471387 (no data entry): rebuilding

rebuilding directory inode 3227471387

entry ".." in directory inode 3227471388 points to free inode 4294969678

bad hash table for directory inode 3227471388 (no data entry): rebuilding

rebuilding directory inode 3227471388

---

list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!done

--- end ---

 

Disk2 is really dead :(. My next hope is that I can rescue the parity drive with a new cable and then rebuild disk2 onto a new replacement drive. I might still be in trouble, as the parity drive has some errors (see attached files):

 

#  Attribute Name            Flag    Value  Worst  Threshold  Type      Updated  Failed  Raw Value
5  Reallocated sector count  0x0033  100    100    050        Pre-fail  Always   Never   1240

 

 

Will keep you informed. Thank you for your advice. cL

NAS1_2parity_settings.pdf

1 hour ago, cliewmc said:

Disk2 is really dead

Filesystem corruption is not the same as a failed disk. If parity had been in sync, a rebuild would have reproduced the same corrupt filesystem.
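As a rough illustration of why a rebuild would not have helped, here is a minimal Python sketch of single parity. It assumes parity is a plain byte-wise XOR across the data disks; the 8-byte "disks" and the helper name `xor_blocks` are made up for the example. A rebuild returns exactly the bytes that were on the disk when parity was computed, corruption included.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical 8-byte "disks"; disk2 already holds corrupt filesystem metadata.
disk1 = b"GOODDATA"
disk2 = b"C0RRUPT!"
disk3 = b"MOREDATA"

# Parity is computed over whatever is on the data disks, good or bad.
parity = xor_blocks([disk1, disk2, disk3])

# Rebuild disk2 from parity plus the other data disks.
rebuilt_disk2 = xor_blocks([parity, disk1, disk3])

# The rebuilt contents are bit-for-bit the same corrupt bytes.
print(rebuilt_disk2 == disk2)
```

Parity works below the filesystem, so it has no notion of whether the bytes it protects form a valid XFS structure. That is why the fix here is xfs_repair rather than a rebuild.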

 

1 hour ago, cliewmc said:

list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!done

--- end ---

Is this how xfs_repair ends?

 

1 hour ago, cliewmc said:

5  Reallocated sector count  0x0033  100  100  050  Pre-fail  Always  Never  1240

Yes, those are the reallocated sectors I told you about and I would replace that disk.
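For anyone who wants to check a row like this with a script, here is a small Python sketch. It assumes the columns in the pasted row are separated by two or more spaces, and the column labels in `COLUMNS` are my own names for the table headings, not anything produced by a SMART tool.

```python
import re

# Labels for the ten columns of the pasted table (names chosen for this sketch).
COLUMNS = ["id", "name", "flag", "value", "worst", "threshold",
           "type", "updated", "failed", "raw"]

def parse_smart_row(row):
    """Split one attribute row whose columns are separated by 2+ spaces."""
    fields = re.split(r"\s{2,}", row.strip())
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
    return dict(zip(COLUMNS, fields))

row = "5  Reallocated sector count  0x0033  100  100  050  Pre-fail  Always  Never  1240"
attr = parse_smart_row(row)

# The normalized value is still above the threshold, so the attribute
# is not flagged as failed...
above_threshold = int(attr["value"]) > int(attr["threshold"])
# ...but the raw value shows how many sectors have been reallocated.
reallocated = int(attr["raw"])

print(attr["name"], "above threshold:", above_threshold, "raw:", reallocated)
```

The point of separating the two checks is that a disk can show "Never" under Failed while the raw count is already large, which is the situation with this parity disk.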

 

disk2: Is this how xfs_repair ends? Answer: Yes.  In this case, what is the sequence for recovery? 

Rebooted the server and found disk2 is now mounted!

 

parity: Yes, those are the reallocated sectors I told you about and I would replace that disk. Answer: Okay, noted. I will try a new cable first, rebuild parity (now that disk2 is recovered), then replace the disk.

 

Thanks JB, it's a relief that disk2 is up! Progress is positive. cL

 


Hi JB, as an update: disk2 is up, and I found hundreds of thousands of pdf, xlsx and docx files in the 'lost+found' folder, which I have deleted. It turned out these were created by Ransomware Protection. I would advise against setting "Recreate Bait Files" if you are doing reboots, because the files take some time to create. If a reboot happens while they are being created, they get trapped in no-man's land and end up unlinked. Subsequent xfs_repair checks have found more of these.

 

I have removed the parity disk from the array and set its slot to 'no device'. I am now preclearing it to use as a normal data disk, since I don't trust it as a parity disk. I am readying a replacement in the meantime. My NAS is running without parity for the time being.

 

Regards. 

20 minutes ago, cliewmc said:

I have proceeded to preclear it to use as a normal data disk - don't trust it to be a parity disk.

If there is any drive you need to trust, it's a data disk. All the other disks are required to rebuild a faulty disk, so a questionable data drive is more likely to cause data loss than a questionable parity disk.

 

Consider the scenario where you have single parity and 2 disks fail. If one of those dead disks is the parity drive, you've only lost 1 drive's worth of data. If 2 data drives fail, you lose both data drives, even if the parity drive is fine.
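The scenario above can be written out as a small Python sketch. It assumes single parity, which can reconstruct at most one missing disk, and it assumes each surviving data disk stays readable on its own. The function and disk names are hypothetical.

```python
def data_disks_lost(failed, parity_name="parity"):
    """Return the failed data disks that cannot be rebuilt with single parity."""
    failed = set(failed)
    # One failure (data or parity) is always recoverable with single parity.
    if len(failed) <= 1:
        return []
    # With two or more failures nothing can be rebuilt, so every failed
    # data disk is lost. A failed parity disk holds no user data itself.
    return sorted(disk for disk in failed if disk != parity_name)

print(data_disks_lost(["disk2"]))
print(data_disks_lost(["parity", "disk1"]))
print(data_disks_lost(["disk1", "disk2"]))
```

In the two-failure cases, losing the parity disk costs one data disk's contents while losing two data disks costs both, which is why a questionable drive does less harm in the parity slot.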
