Disk Read Errors and "Notice [X] - array health report [FAIL]"



I am working through getting my first Unraid server off the ground.  So far things are going well, but I have two old WD Red drives that are apparently failing and out of warranty.  They have been running in a Windows box for years without issue, but are unable to complete the extended SMART tests and are showing read errors.  Both drives are currently empty and I was not planning to use them with any irreplaceable data.
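(Side note for anyone scripting this kind of health check: `smartctl`'s exit status is a bitmask documented in smartctl(8), so the verdict can be decoded without parsing text output. The decoder below is a sketch; the bit positions are taken from the man page, and the example exit value is made up for illustration.)

```shell
# smartctl's exit status is a bitmask (see smartctl(8)).  Two bits are
# especially relevant to failing drives:
#   bit 3 (value 8):   SMART status check reported "DISK FAILING"
#   bit 7 (value 128): self-test log contains records of errors
decode_smart_exit() {
  local s=$1 msg=""
  [ $(( s & 8 ))   -ne 0 ] && msg="$msg DISK_FAILING"
  [ $(( s & 128 )) -ne 0 ] && msg="$msg SELFTEST_ERRORS"
  [ -z "$msg" ] && msg=" ok"
  echo "exit=$s:$msg"
}

# Real-world usage (as root, against an actual device):
#   smartctl -q silent -H /dev/sdb; decode_smart_exit $?
decode_smart_exit 136   # illustrative value: bits 3 and 7 set
```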

 

What is the risk of keeping these disks in the array until they actually fail?  I am running single parity currently and have plenty of extra capacity at the moment, but it feels wrong just throwing away questionable disks.

 

General system specs:

  • Supermicro 846 chassis with dual 920W SQ PSUs
  • Supermicro X9DR3-F
  • Intel Xeon E5-2695v2 x2
  • 128GB Samsung ECC RAM
  • Drives follow

 

[screenshots: drive listings]

Link to comment

You can do it, but the problem is that after a drive failure ALL other disks must be read 100% successfully to rebuild the data. Say you had parity, these two bad disks, and two good disks (where you stored important files): if one of the good disks died, the rebuild would have to rely on the two known-bad disks, and if that read fails then even the important data is gone. I mean you can do it, but know the risks. (In other words, while a rebuild is running your data is only as safe as your least reliable disk.)

Edited by PeteAsking
  • Like 1
  • Thanks 1
Link to comment
5 minutes ago, PeteAsking said:

You can do it, but the problem is that after a drive failure ALL other disks must be read 100% successfully to rebuild the data. Say you had parity, these two bad disks, and two good disks (where you stored important files): if one of the good disks died, the rebuild would have to rely on the two known-bad disks, and if that read fails then even the important data is gone. I mean you can do it, but know the risks. (In other words, while a rebuild is running your data is only as safe as your least reliable disk.)

Ah, alright.  If I understand it correctly, I would be looking at two primary risks:

  • One of the failing disks dies, then the other failing disk dies on the rebuild > I lose the array
  • An apparently 'good' disk dies, then one of the failing disks dies on the rebuild > I lose the array

(If so, looks like I'm pulling them out)
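To put rough numbers on those two paths, here is a back-of-envelope sketch. The per-rebuild failure probabilities are invented purely for illustration (not measurements of these drives); the point is only how the two scenarios combine.

```shell
# Made-up illustrative probabilities -- NOT measurements of any real drive.
p_bad=0.30    # chance a known-bad disk dies or misreads during a ~48h rebuild
p_good=0.03   # same chance for an apparently healthy disk

awk -v pb="$p_bad" -v pg="$p_good" 'BEGIN {
  # Path 1: one bad disk dies, then the other bad disk fails mid-rebuild
  printf "bad-then-bad:  %.4f\n", pb * pb
  # Path 2: a good disk dies, then at least one of the two bad disks
  # fails while the rebuild is reading them
  printf "good-then-bad: %.4f\n", pg * (1 - (1 - pb) * (1 - pb))
}'
```

With these toy numbers the bad-then-bad path dominates simply because both events involve an unreliable disk.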

Link to comment

"An apparently 'good' disk dies, then one of the failing disks dies on the rebuild > I lose the array"


I think that's the only risk if one of the good disks is the parity. Otherwise you can just copy the files off the disks like they are regular disks, or even plug them into another machine to copy files off. 
 

If both disks that fail are the bad disks you would just lose the shares on those two disks. 
 

So I believe the issues revolve mainly around the parity (as that may not reliably be rebuilt). 

Edited by PeteAsking
Link to comment

Another thing you could do with the disks, instead of making them part of the array, is to keep them as unassigned disks and pass them to a VM or something to use as a dumping location. That way they still have a use but are not part of the array as such. Perhaps as a OneDrive or Dropbox drive for a VM, or some other thing you might use them for. 
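For the non-VM variant of this idea, a disk can simply be mounted outside the array as scratch space from the console. This is a sketch only: `/dev/sdX1` and `/mnt/scratch` are placeholders, so substitute the real partition (check `lsblk` or `ls /dev/disk/by-id/` first).

```shell
# Sketch: mount a pulled disk outside the array as expendable scratch space.
# DEV is a placeholder -- replace it with the real partition on your box.
DEV=/dev/sdX1
MNT=/mnt/scratch

if [ -b "$DEV" ]; then
  mkdir -p "$MNT"
  mount "$DEV" "$MNT"   # dumping location only -- nothing irreplaceable here
else
  echo "DEV is a placeholder: $DEV is not a block device on this machine"
fi
```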

  • Thanks 1
Link to comment
18 minutes ago, Unraiding said:

Ah, alright.  If I understand it correctly, I would be looking at two primary risks:

  • One of the failing disks dies, then the other failing disk dies on the rebuild > I lose the array
  • An apparently 'good' disk dies, then one of the failing disks dies on the rebuild > I lose the array

(If so, looks like I'm pulling them out)

It's not simply a matter of drive failure during a rebuild. Bad disks can also return bad data, which makes the rebuild no good.

  • Like 1
  • Thanks 1
Link to comment
8 hours ago, Unraiding said:

Ah, alright.  If I understand it correctly, I would be looking at two primary risks:

  • One of the failing disks dies, then the other failing disk dies on the rebuild > I lose the array
  • An apparently 'good' disk dies, then one of the failing disks dies on the rebuild > I lose the array

Actually you only lose the contents of the failed drives, not the whole array, since with Unraid each drive is a discrete, self-contained file system (one of its strengths in severe failure scenarios). However, the gist is correct: with untrustworthy drives in the array you are at significant risk of data loss.

  • Like 1
  • Thanks 1
Link to comment

Thanks for the replies everyone.  I think PeteAsking's unassigned suggestion will help me get over the mental block.

 

Now the follow-up question: what is the best way to implement this change?  I reordered the drives slightly once to confirm the issue wasn't a bad backplane connection, and that triggered a full parity rebuild (~48 hours).  Is there any way to avoid that again, given the drives are empty?
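Whatever removal route you take, it's worth confirming from the console that the disks really are empty first. A minimal check, assuming the drives sit in hypothetical array slots `disk5` and `disk6` (substitute your actual mount points under `/mnt/`):

```shell
# Count files on the disks slated for removal.  disk5/disk6 are
# placeholder slot names -- use the real ones from your array.
for d in /mnt/disk5 /mnt/disk6; do
  if [ -d "$d" ]; then
    n=$(find "$d" -type f | wc -l)
    echo "$d: $n file(s)"
  else
    echo "$d: not mounted (placeholder path)"
  fi
done
```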

Edited by Unraiding
  • Like 1
Link to comment
