
Data rebuild @ 3515 days, new disk showing as DISK_INVALID in diag



Hello

I am trying to replace a disk but am getting extremely slow rebuild times, up to 3515 days at 25 KB/s. I stopped the rebuild after pulling a diagnostic.
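For scale, that estimate is just arithmetic: bytes remaining divided by the observed rate. A quick sketch (the disk size isn't stated here, so an 8 TB disk is assumed purely for illustration):

```python
# Sanity check of the rebuild ETA: time = bytes to rebuild / observed rate.
# The 8 TB size is an assumption for illustration; the post doesn't state it.
disk_bytes = 8 * 10**12      # hypothetical 8 TB data disk
observed = 25 * 1024         # the reported 25 KB/s, in bytes per second

days = disk_bytes / observed / 86400
print(f"At 25 KB/s: {days:,.0f} days")    # ~3,617 days, the same ballpark as 3515

# A healthy rebuild typically runs at ~100-200 MB/s, i.e. hours, not years.
healthy = 150 * 10**6
print(f"At 150 MB/s: {disk_bytes / healthy / 3600:.1f} hours")   # ~14.8 hours
```

In other words, the rate itself is the symptom; a correctly working rebuild should never be anywhere near that range.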

I then pulled the new disk out of the array, restarted with the disk being emulated, and pulled a second diagnostic.

 

I'm not sure what I have done wrong, but I'm sure that someone much smarter than me at this stuff will see my error.

Thanks in advance for any assistance.

 

JorgeB, if you're out there, please bail me out again.

 

Chas

tower-diagnostics-20220318-1959.zip tower-diagnostics-20220318-2010.zip


JorgeB

I hope that this isn't in bad form, but you've helped me so much before that I was hoping you could tell me what I've messed up this time.

 

 


 

tower-diagnostics-20220318-2010.zip tower-diagnostics-20220318-1959.zip

11 hours ago, kysdaddy said:

JorgeB

To ping someone (which is what you were trying to do), you begin by typing an @ followed immediately by the user name. As soon as you start typing, a dropdown list of possible choices will appear. Continue typing until the user you want to ping appears. Then select his name on that list.

 

Example:

@kysdaddy


JorgeB, I was hoping that I could get a bit of information/direction from you. I currently have 19 disks in my array, including my parity, and one of the disks is missing; it failed last week. I only have one parity.

When I try to add the 20th disk to the array to replace the missing one is when I get the 3515 days to rebuild. Disk #23 is now showing as failed, but there is nothing on it.

 

I am currently trying the "Clear Drive Then Remove Drive" method to shrink the array: https://wiki.unraid.net/Shrink_array

Disk 23 is about 6 hours from zeroing out.
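For anyone wondering why that method keeps parity valid: single parity in Unraid is a bytewise XOR across the data disks, and an all-zero disk contributes nothing to the XOR, so it can be dropped without touching parity. A toy sketch (the 4-byte "disks" are made up for illustration):

```python
# Why "clear then remove" works with XOR parity: an all-zero disk contributes
# nothing to the XOR, so removing it leaves parity valid. Toy 4-byte "disks".
from functools import reduce

disks = [b"\x12\x34\x56\x78", b"\xab\xcd\xef\x01", b"\x00\x00\x00\x00"]  # last one cleared

def parity(ds):
    # XOR each column of bytes across all disks
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*ds))

assert parity(disks) == parity(disks[:-1])  # parity unchanged without the zeroed disk
print("Zeroed disk can be removed; parity still valid.")
```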

 

My question is: am I better off doing this, or simply removing 22 (there were only like two movies on it before it died) and 23 and then rebuilding the parity from scratch, or continuing what I am doing: remove 23, tell the system that the parity is good, and then replace 22?

 

Or am I way off base and going in the completely wrong direction?

I thought about adding a second parity but can't because of the 3515-day issue.

 

Any suggestions?

 

Chas

7 minutes ago, kysdaddy said:

Disk #23 is now showing as failed, but there is nothing on it.

It's still part of the array, and with single parity just one failed disk means you can't replace another. If there are multiple failed disks you also can't add a new parity now; that would need to have been done before. You can do a new config with the remaining good disks and/or use ddrescue to try to recover as much data as possible from the failed disk(s).
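In case it helps someone later, a minimal sketch of the usual two-pass ddrescue approach, wrapped in Python to match the other snippets here; the device names are hypothetical placeholders, so confirm them with lsblk before running anything like this:

```python
# Two-pass GNU ddrescue recovery sketch. /dev/sdX (failing source) and
# /dev/sdY (replacement target) are hypothetical; verify with lsblk first.
import subprocess

mapfile = "/boot/ddrescue.map"  # the mapfile lets an interrupted run resume

# Pass 1 (-n): copy everything that reads cleanly, skipping bad areas quickly.
subprocess.run(["ddrescue", "-f", "-n", "/dev/sdX", "/dev/sdY", mapfile], check=True)

# Pass 2 (-r3): go back and retry the remaining bad areas up to three times.
subprocess.run(["ddrescue", "-f", "-r3", "/dev/sdX", "/dev/sdY", mapfile], check=True)
```

The mapfile is the important part: it records which sectors were recovered, so the second pass (and any rerun) only touches what's still missing.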

 

9 minutes ago, kysdaddy said:

The "Clear Drive Then Remove Drive" Method,  to shrink the array.

That's also not a good option because of the other failed disks; you can use the other method.


So if I am hearing this correctly, I should stop the clearing script/process.

 

and

1. Make sure that the drive or drives you are removing have been removed from any inclusions or exclusions for all shares, including in the global share settings. Shares should be changed from the default of "All" to "Include". This include list should contain only the drives that will be retained.

2. Make sure you have a copy of your array assignments, especially the parity drive. You may need this list if the "Retain current configuration" option doesn't work correctly.

3. Stop the array (if it is started)

4. Go to Tools then New Config

5. Click on the Retain current configuration box (says None at first), click on the box for All, then click on close

6. Click on the box for Yes I want to do this, then click Apply then Done

7. Return to the Main page, and check all assignments. If any are missing, correct them. Unassign the drive(s) you are removing. Double check all of the assignments, especially the parity drive(s)!

8. Do not click the check box for Parity is already valid; make sure it is NOT checked; parity is not valid now and won't be until the parity build completes

9. Start the array to commit the changes; system is usable now, but it will take a long time rebuilding parity

 

Correct?


I have a comment to make at this point. A few years ago (OK, half a decade), I made a statistical analysis of hard disk failures using several different assumed annual failure rates. You can find that here:

 

    https://forums.unraid.net/topic/50504-dual-or-single-parity-its-your-choice/

 

Another point was made later by @SSD, and you can find that here:

 

     https://forums.unraid.net/topic/50504-dual-or-single-parity-its-your-choice/#comment-552912

 

So using a Parity2 setup makes a lot of sense when one considers the possibility of encountering a second hard drive with problems during the rebuild of another disk. But then there is the cost of implementing a Parity2 setup. I (personally) feel that anyone who has more than about eight or nine array data disks should definitely be considering adding Parity2 to their array. With 20+ data disks, you are well over that...
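A back-of-the-envelope version of that risk, assuming independent failures, the ~2% annual failure rate mentioned in the PS below, and a hypothetical two-day rebuild window (the linked analysis is the more careful treatment):

```python
# Chance that at least one of the OTHER disks fails while a rebuild is running.
# Assumptions (not from this thread): independent failures, constant failure rate.
afr = 0.02          # ~2% annual failure rate, per the PS below
n_other = 19        # disks still exposed while one disk is being rebuilt
rebuild_days = 2    # hypothetical rebuild window

p_disk = 1 - (1 - afr) ** (rebuild_days / 365)  # one given disk fails in the window
p_any = 1 - (1 - p_disk) ** n_other             # at least one of the others does
print(f"{p_any:.2%} risk per rebuild")          # ~0.21%
```

Small per rebuild, but it compounds across every rebuild over the array's life and grows with disk count and rebuild length, which is exactly the argument for Parity2.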

 

PS---- Having followed the forum for about a dozen years, I think that the true annual failure rate for disks used in Unraid servers is probably under 2%. Now, there are a lot of 'Disk disabled' problems (which will require that an array disk be rebuilt using parity) that are not actual hard disk failures. These are often corrected by simply rebuilding the data back onto the original hard disk.

9 minutes ago, Frank1940 said:

These are often corrected by simply rebuilding the data back onto the original hard disk.  

I have never had a failed drive in an unRAID server in 10 years of use; however, I have rebuilt drives back onto themselves twice. Both times it was a cabling issue that led to a lot of CRC errors, which eventually disabled the disk.

 

My oldest drives were used for eight years before I sold them.  They all still passed S.M.A.R.T. tests with no errors.

