Potential drive failure while performing "Parity-Sync / Data-Rebuild" for new drive


arcotton

Hello, I'm currently performing a data rebuild for a new replacement drive (DISK1). During the rebuild, I noticed some concerning warning messages about disk sectors appearing for a different drive in my array (DISK5).

This prompted me to click on the drive in question, and I noticed that the last SMART test for this drive resulted in errors. I became concerned and paused the data rebuild. I looked over the SMART report (attached below; the information is a little over my head) and saw that at the very beginning it says "SMART overall-health self-assessment test result: PASSED".

I'm now confused, because I'm not sure whether the drive is failing and needs to be replaced or whether it is safe for me to continue rebuilding. The dashboard seems to indicate the disk is not healthy, and these are some of the attributes I noticed (not sure what they mean):

Reallocated sector count: 360
Reported uncorrect: 88
Current pending sector: 496
Offline uncorrectable: 496

 

Looking at what other folks have experienced, it appears that when the SMART report returns errors for a drive, the drive most likely needs to be replaced. I'm not sure if that is the case in my situation, but if it is, what would be the best course of action, given that I'm currently in the middle of a Parity-Sync/Data-Rebuild?

Do I simply keep the rebuild paused, spin down the drive, replace the drive with a new one, and then continue the rebuild? I feel like that might be a little too easy...

 

In addition to the SMART report, I attached my server's diagnostic information and a couple of screenshots. I'm currently running UNRAID Version: 6.8.2.

 

If any more information is needed, please let me know :)

 

 

Dashboard Screen Shot:

Server_Dashboard_Array_View.png

 

 

More Failing Drive Information (DISK5):

Disk5 information.png

DISK5 SMART Report.txt Server_Diagnostic_Info.zip

Edited by arcotton
2 hours ago, arcotton said:

"SMART overall-health self-assessment test result: PASSED". 

This is basically useless. The important part is the SMART test, which failed, so the disk is failing and needs to be replaced.
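As a rough illustration of why the attribute values matter more than the overall "PASSED" line, here is a minimal Python sketch that flags the raw values quoted in the first post. The rule "any nonzero raw value is a warning sign" is a common rule of thumb and an assumption here, not an official threshold, and the names `DISK5_ATTRIBUTES` and `failing_attributes` are made up for this example.

```python
# Raw attribute values exactly as reported in the first post for disk5.
DISK5_ATTRIBUTES = {
    "Reallocated sector count": 360,
    "Reported uncorrect": 88,
    "Current pending sector": 496,
    "Offline uncorrectable": 496,
}


def failing_attributes(attributes):
    """Return the attributes whose raw value is above zero.

    Assumption: a healthy disk shows 0 for all of these counters.
    """
    return {name: raw for name, raw in attributes.items() if raw > 0}


def looks_unhealthy(attributes):
    """True if at least one of the watched counters is nonzero."""
    return bool(failing_attributes(attributes))


if __name__ == "__main__":
    for name, raw in failing_attributes(DISK5_ATTRIBUTES).items():
        print(f"{name}: {raw} (nonzero)")
    print("unhealthy:", looks_unhealthy(DISK5_ATTRIBUTES))
```

All four counters from the post are nonzero, so a check like this would flag the disk regardless of what the overall-health line says.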

 

2 hours ago, arcotton said:

Do I simply keep the rebuild paused, spin down the drive, replace the drive with a new one, and then continue the rebuild? I feel like that might be a little too easy...

You can't do that, since you only have single parity and disk1 is already emulated. Do you still have an intact old disk1? I.e., was this an upgrade, or a replacement because it failed?
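To make the single-parity point concrete, here is a toy Python sketch of XOR parity. The disk contents are made-up bytes and the unreadable sector is modelled as zeros, which is a simplification and not how Unraid itself is implemented. It only shows that an emulated disk is computed from parity plus every other data disk, so a second disk that cannot be read correctly ends up corrupting the rebuilt one.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy array with three data disks (contents are invented for illustration).
disk1 = b"\x11\x22\x33\x44"
disk2 = b"\x0f\x0e\x0d\x0c"
disk5 = b"\xa0\xb1\xc2\xd3"
parity = xor_blocks([disk1, disk2, disk5])

# Case 1: only disk1 is missing. Parity plus all other disks recovers it.
rebuilt_disk1 = xor_blocks([parity, disk2, disk5])

# Case 2: disk1 is missing AND disk5 has an unreadable sector
# (modelled here as a zero byte at position 1).
bad_disk5 = b"\xa0\x00\xc2\xd3"
corrupt_disk1 = xor_blocks([parity, disk2, bad_disk5])

if __name__ == "__main__":
    print("clean rebuild matches:", rebuilt_disk1 == disk1)
    print("rebuild with bad disk5 matches:", corrupt_disk1 == disk1)
```

With one parity disk there is only one equation per position, so it can solve for one unknown disk. Pulling disk5 while disk1 is still emulated would leave two unknowns.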

7 hours ago, arcotton said:

replacement drive (DISK1). During the rebuild, I noticed some concerning warning messages about disk sectors appearing for a different drive in my array (DISK5).

Do you have Notifications set up to alert you immediately by email or other agent as soon as a problem is detected? Disk5 was probably already indicating problems before you decided to replace disk1.

Thank you for the quick replies,

9 hours ago, JorgeB said:

You can't do that, since you only have single parity and disk1 is already emulated. Do you still have an intact old disk1? I.e., was this an upgrade, or a replacement because it failed?

 

This was a replacement because disk1 had failed. I still have the old drive, but it is toast as far as I know.

 

4 hours ago, trurl said:

Do you have Notifications set up to alert you immediately by email or other agent as soon as a problem is detected? Disk5 was probably already indicating problems before you decided to replace disk1.


No, I do not have notifications set up and don't receive alerts by email. In hindsight that would have been nice... I need to look into that and get it working going forward.

OK, so given this current mess, what are my best options? Is it safe for me to continue the rebuild for disk1, let that finish, replace the failing disk5, and then perform a rebuild again?
I know DISK5 is failing, but does that mean the data is corrupt? I'm not sure whether it is safe to continue rebuilding.
 

 

Edited by arcotton
17 hours ago, arcotton said:

Ok so given this current mess what are my best options?

You can let the rebuild finish, but there will be some, possibly a lot of, corrupt data on the rebuilt disk, and there are not many other options. You can then rebuild disk5, but that will also result in corrupt data. Alternatively, you can use ddrescue on disk5: there will still be corrupt data, but at least you can find out which files are affected and need replacing. You can also try that on the old disk1, if it's not completely dead.
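For anyone wondering how ddrescue helps pin down the damage: it records which regions of the disk it could and could not read in a mapfile. The Python sketch below summarises the regions not marked as finished (`+`). The sample mapfile contents are invented for illustration, and the helper names `parse_mapfile` and `unrecovered` are hypothetical. Mapping those byte ranges back to actual files is a separate step that this sketch does not cover.

```python
# Hypothetical mapfile contents: positions and sizes are made up.
SAMPLE_MAPFILE = """\
# Mapfile. Created by GNU ddrescue (invented sample)
# current_pos  current_status  current_pass
0x00120000     +               1
#      pos        size  status
0x00000000  0x00100000  +
0x00100000  0x00001000  -
0x00101000  0x0001F000  +
"""


def parse_mapfile(text):
    """Return (pos, size, status) tuples for the data blocks of a mapfile.

    Comment lines start with '#'. The first non-comment line is the
    status line and is skipped; the remaining lines describe data blocks.
    """
    blocks = []
    status_line_seen = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if not status_line_seen:
            status_line_seen = True
            continue
        pos, size, status = line.split()[:3]
        # int(x, 0) accepts both decimal and 0x-prefixed hexadecimal.
        blocks.append((int(pos, 0), int(size, 0), status))
    return blocks


def unrecovered(blocks):
    """Return (pos, size) for every block not marked as finished ('+')."""
    return [(pos, size) for pos, size, status in blocks if status != "+"]


if __name__ == "__main__":
    bad = unrecovered(parse_mapfile(SAMPLE_MAPFILE))
    for pos, size in bad:
        print(f"unrecovered: offset {pos:#x}, {size} bytes")
    print("total unrecovered bytes:", sum(size for _, size in bad))
```

Knowing the offsets of the unreadable regions is what makes it possible to identify the affected files afterwards, instead of ending up with silent corruption somewhere on the rebuilt disk.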
