Best Practice for adding a new disk to array



Hi,

 

I have read lots of threads about pre-clearing, formatting, parity sync etc.

 

But at this point it is still not clear to me what the best practice is for adding a new drive.

By adding a new drive I don't just mean adding a single drive to a running unRAID array with lots of existing drives.

The question is more general: it starts with the very first drive used to set up a new unRAID server.

 

I have read that pre-clear only does a stress-test of the drive. Therefore my question is, how can I make sure that any potential bad sectors of a new drive get diagnosed and marked as such right from the moment I add it to the system, so I won't run into surprises later.

 

Clarification is much appreciated...  :)

 

Cheers,

Marcel

 

 

Link to comment

Pre-clear does not provide 100% protection so you need to get the expectation right.

So perhaps the drive got a bad knock in transit and is a bit wonky. It may take months of normal usage for it to go bad. A preclear / stress test just pushes that forward so you identify it quickly. Obviously, if the drive is already shot, it will identify that too.

 

No matter how many times you preclear, there is always some probability that a drive passes preclear only to fail later. Hence the commonly suggested best practice of preclearing 3 times.

 

My opinion is that 3 is very arbitrary. Why not 1, 2, 4, 5, etc.? I have not seen anything beyond anecdotal evidence (i.e. no statistical analysis) to support any particular value. Given the many different brands, conditions, models, etc. in use among unRAID users, I highly doubt a proper statistical analysis can be done.

 

Hence I preclear once.

If probability is against me, then it can be against me even if I preclear a gazillion times. In that case, I rely on parity and unRAID's idiosyncratic architecture to restore.

Failing that, I have Crashplan for my most critical data.

Failing that, I will just look up to the sky and say "Hasa Diga Eebowai!".

 

Link to comment

This is how I settled on running at least 3 preclear passes, not based on what others said but based on what I experienced.

 

I have personally experienced the following:

A) drives cleared pass 1 but failed on pass 2.

B) drives cleared pass 1 and 2 but failed on pass 3.

 

I have never personally experienced the following:

W) drives cleared pass 3 but failed on pass 4

T) drives cleared pass 4 but failed on pass 5

F) drives cleared pass 5 but failed on pass 6

 

 

Way early on I had a couple of spare 2TB drives and didn't need the space immediately so I ran preclear on them for a little over 2 weeks.

Link to comment

Others have already discussed the testing aspect of preclear, but...

...I have read that pre-clear only does a stress-test of the drive...

Actually, the main purpose of preclear is to make a new drive consistent with existing parity. By setting all the bits on the drive to zero, the drive can be added to a parity array without affecting parity. If you don't preclear a drive before you add it to the parity array, unRAID will have to clear the drive itself.
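A minimal sketch of why an all-zero drive leaves parity untouched, assuming a simplified single-parity model in which parity is the XOR of the bytes at the same position on every data drive. The drive contents and the `parity` helper are made up for illustration; this is not unRAID's actual code.

```python
from functools import reduce


def parity(drives):
    """Byte-wise XOR across all data drives (simplified single-parity model)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*drives))


# Two hypothetical 4-byte "drives" already in the array.
existing = [bytes([0x12, 0xA0, 0xFF, 0x07]), bytes([0x34, 0x0B, 0x00, 0xF0])]
before = parity(existing)

zeroed_drive = bytes(4)                             # cleared: every byte is 0
uncleared_drive = bytes([0x00, 0x01, 0x00, 0x00])   # one stray non-zero byte

after_zeroed = parity(existing + [zeroed_drive])
after_uncleared = parity(existing + [uncleared_drive])

print(after_zeroed == before)     # True: x XOR 0 == x, so parity is still valid
print(after_uncleared == before)  # False: parity would have to be recalculated
```

Because XOR with zero changes nothing, the existing parity stays correct the moment the cleared drive joins the array.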

 

 

Link to comment

Before I started this thread, after reading through existing ones, I got the feeling that there isn't a single community-approved best practice.

Everybody has their own opinions. Seems that is just the case  ;)

 

I do agree very much with what testdasi and trurl have pointed out.

 

1) The process of clearing a drive in unRAID actually has nothing to do with testing the drive. It is just a necessary step (writing zeros to every bit) to keep the parity mechanism working.

2) A nice side effect of the clear process is that it acts as a kind of stress test: under normal conditions (apart from a reconstruct) a drive would never have all its bits written in one session.

3) Using the pre-clear plug-in, and potentially increasing the number of passes, strengthens the stress-test character of the procedure.

4) Another benefit of pre-clearing a drive with the plug-in before keeping it as a cold spare is that, when the time comes for it to replace a failed drive, the process is much faster, and you are unlikely to get a bad surprise with the drive failing right in the middle of unRAID's clear process.

 

I hope that sums it up correctly (?)

 

Maybe I didn't make it clear enough where my own confusion has been in my original post.

My question is whether the unRAID clear and/or the pre-clear plug-in actually mark bad sectors in some way, so that during normal operation unRAID won't try to read from or write to them (like chkdsk does on Windows systems).

 

 

@BRIT: I can understand your rationale given your own experience. I think in the end it comes down to how much effort/time you want to spend on the testing.

Link to comment

With modern disks, the drive firmware is meant to reallocate bad sectors so that at the logical level every sector appears to be 'perfect'. As such you do not 'mark' bad sectors in the sense I think you mean. The pre-clear process is designed to help the firmware detect problem sectors so the firmware can handle the reallocation process.

 

Another point to note is that the unRAID built-in clear process does no check as to whether the writes were actually successful, whereas the pre-clear one has an additional read phase that checks the sectors read back have the expected values.

Link to comment

The built-in unRAID clear does not test the drive at all. All it does is write 0s to the sectors. It does not verify that the sectors have been correctly zeroed.

 

The preclear script achieves the same end result (a zeroed drive carrying the signature that marks it as already cleared), but it takes explicit steps to read, write, and reread the sectors to verify them. It also exercises the drive heads by seeking to other spots on the drive while reading, writing, and verifying. During this process, the drive itself will mark a sector as bad if it's unable to read or write it, or as sketchy if it was unable to read it. Preclear also issues SMART tests and examines the SMART attributes at the end of each pass.
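The difference between a plain clear and a clear followed by a read-back check can be sketched roughly as follows. This is not the preclear script itself: it uses an in-memory `io.BytesIO` buffer as a stand-in for a disk, the helper names and `BLOCK_SIZE` are hypothetical, and it leaves out the seeks and SMART tests described above.

```python
import io

BLOCK_SIZE = 4096  # hypothetical chunk size for this sketch


def zero_fill(dev, size):
    """Write zeros over the first `size` bytes (what a plain clear does)."""
    dev.seek(0)
    zeros = bytes(BLOCK_SIZE)
    remaining = size
    while remaining > 0:
        n = min(BLOCK_SIZE, remaining)
        dev.write(zeros[:n])
        remaining -= n


def verify_zeroed(dev, size):
    """Read everything back; return the offset of the first bad byte, or None."""
    dev.seek(0)
    offset = 0
    while offset < size:
        chunk = dev.read(min(BLOCK_SIZE, size - offset))
        if not chunk:
            return offset  # short read: treat as a verification failure
        if any(chunk):
            return offset + next(i for i, b in enumerate(chunk) if b)
        offset += len(chunk)
    return None


size = 3 * BLOCK_SIZE + 100
fake_disk = io.BytesIO(b"\xff" * size)  # stand-in for a drive full of old data

zero_fill(fake_disk, size)
clean_result = verify_zeroed(fake_disk, size)

# Simulate a write that silently did not stick at byte 5000.
fake_disk.seek(5000)
fake_disk.write(b"\x01")
bad_result = verify_zeroed(fake_disk, size)

print(clean_result, bad_result)  # None 5000
```

Without the `verify_zeroed` step, the bad byte at offset 5000 would go unnoticed, which is the gap the read phase is meant to close.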

 

It is through the extensive testing and exercising of the drive during preclear, and by examining the SMART reports, that we are able to determine whether a drive has failed. If the drive has any "pending reallocation" or any "reallocated" sectors, then I consider the drive to have failed.
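As a rough illustration of that pass/fail rule, the sketch below scans a SMART attribute table for non-zero raw values on the reallocated and pending counters. The `SAMPLE` text is made up and only laid out like the attribute table printed by `smartctl -A`; the attribute names `Reallocated_Sector_Ct` and `Current_Pending_Sector` are the common ones, but vendors can differ, and the helper functions are hypothetical.

```python
# Attributes treated as disqualifying when their raw value is above zero.
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector"}

# Made-up sample, laid out like a `smartctl -A` attribute table.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       412
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
"""


def watched_raw_values(report):
    """Return {attribute_name: raw_value} for the watched attributes."""
    values = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns; the raw value is the 10th.
        if len(fields) >= 10 and fields[1] in WATCHED:
            values[fields[1]] = int(fields[9])
    return values


def drive_failed(report):
    """Apply the strict rule: any reallocated or pending sector means 'failed'."""
    return any(v > 0 for v in watched_raw_values(report).values())


print(watched_raw_values(SAMPLE))
print(drive_failed(SAMPLE))  # True: 8 sectors are pending reallocation
```

Comparing these values before and after each preclear pass shows whether the stress test pushed the drive into remapping sectors.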

Link to comment

@BritT: Thanks a lot for the detailed explanation!  :)

 

Just for clarification: if the drive does the marking of bad sectors itself I assume this would also happen during the unRAID clear process? Or is this somehow actively triggered by the pre-clear script?

 

Given your explanation I'll pre-clear any drive before adding it to the array in the future.

 

Cheers,

Marcel

Link to comment

Just for clarification: if the drive does the marking of bad sectors itself I assume this would also happen during the unRAID clear process? Or is this somehow actively triggered by the pre-clear script?

 

It's the drive that does it. More specifically, it's a write to a sector that forces the drive to decide whether to keep or replace the sector. If a sector has been detected as questionable, then the drive thoroughly tests it with test patterns and measures the percentage of error correction bits needed. Too much and it's replaced (remapped to a spare sector).

 

I wonder if Check Disk still does that, with NTFS drives.  They all used to, but there's no point now.  If you mark it as bad, then it's not in use, and you can write something to it, forcing the drive to decide whether it's good or needs to be replaced.  Either way, the end result is a good sector.

Link to comment