Preclear and Failed Drives




The problem with elevated heat is alignment. It reminds me of my older systems, where we used a special controller to double our capacity (I'm referring to the early 80s). We had to do a low-level format every 6 months as the seasons changed, or you would have problems.

 

So elevated heat over an extended period of writing could trigger failures in drives when you do not later use them under the same conditions.

 

I would suggest using badblocks, which does 4 passes of specific bit-pattern tests.

It's a tried-and-true method of testing and documenting the failed blocks on a drive.

It's been in Linux for a very long time. The output can be fed to reiserfs so that the filesystem will avoid the bad blocks, but this doesn't come into play for a parity operation.

 

Badblocks does have a way of weeding out a marginal drive.
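
For anyone who wants to try it, a destructive four-pattern run looks roughly like this (this wipes the drive, so only use it on a disk with no data, and /dev/sdX is a placeholder for the actual device):

      badblocks -wsv -b 4096 /dev/sdX

The -w flag selects the destructive write test (the 0xaa, 0x55, 0xff, 0x00 patterns, each written and then read back), -s shows progress, -v is verbose, and -b 4096 sets the block size. That is my understanding of the standard e2fsprogs badblocks options; check the man page on your build before running it.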

 

The standard dd write/read is not all that taxing on the drive itself; it's a sequential operation that moves the head in small increments. It's the elevated heat and alignment that could kill a drive.

 

50C is way too hot to be running a drive for 10 passes. If the average temperature inside a case is mid-30s to mid-40s C, then that's the temperature range the drive should be exercised at.


 

I agree that 50C is too hot - never thought much about alignment problems due to heat, but it does make a lot of sense.  I don't really have control over the hard drive temp until I remove it from its external case - which I can't do because then the warranty is voided - so I test it thoroughly first.

 

Even if I did badblocks and then a preclear, I am certain the drive would reach over 50C due to the lousy heat dissipation of the overall design. (Just for reference, the drives I am dismantling are LaCie d2 Quadra: all-aluminum housing, but poor heat transfer to it, and a hole for a fan, but none installed... not sure why they're offering a 3-year warranty!)

 


My vote went to "I run one or more preclear cycles, and I have never seen a drive fail," which is true for new drives that I have precleared, but not true for older drives. I have had first-pass failures on some 1TB Seagates that were already out of warranty. I don't usually run any more passes on a used drive that can make it through the first one. New ones usually get 3 passes.

 

I have also had new drives that ran a slow preclear on the first pass, but would then run the next 2 passes at normal preclear speed.


I was hoping that this thread would tell me that I only needed to do one cycle. Now I need to do 2, maybe 3 cycles. The hardest thing is definitely waiting. I'm just setting up for the first time, and to wait 4 days or so just to do anything is so difficult. But I realize it's probably worth it in the long run. I'm starting with 3TB drives since that's what I had on hand already and it's taking forever!


I do one cycle. Had one disk fail in preclear out of 30 or so, and one other disk fail at about a year old. Otherwise, everything has been solid. I would not do more than 1 cycle personally. You are just wearing it out. :P

 

I partially disagree. If you see any pending sectors, run the preclear for another pass or two.

 

This is why I like badblocks: if you use the -o parameter, the bad blocks are saved to a file.

If there are any entries, you know to run badblocks again until there are none.

 

It's important to have no pending sectors.
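
As a rough sketch of that workflow (assuming the same destructive write test as above; /dev/sdX and the log path are placeholders, and on unRAID the flash drive at /boot is a convenient place to keep the log):

      badblocks -wsv -o /boot/badblocks-sdX.txt /dev/sdX
      [ -s /boot/badblocks-sdX.txt ] && echo "bad blocks logged, run it again" || echo "no bad blocks logged"

If the log file ever comes back non-empty, repeat the run until it stays empty, or return the drive.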


The 50C-plus environment is not by choice - it happens in external drive enclosures where I cannot control cooling well enough. If I want to RMA one of those drives before it causes data loss, I basically need it to fail while it's still new (or new to me). As soon as I take it out of the case I can no longer RMA it... and wouldn't you know, where I am it is often the case that an external USB drive is cheaper than an internal one? Not always, but there ya go... the world is full of strange occurrences.

 

Get a small desktop AC powered fan and aim it directly at the USB drive. You'll notice drive temperatures stay much more reasonable.
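
If you want to see what that does for temperatures, smartctl (from smartmontools) can read the drive's reported temperature while a preclear runs. The attribute name varies by vendor, so the grep is deliberately loose; /dev/sdX is a placeholder, and for some USB enclosures you may need to add -d sat:

      smartctl -A /dev/sdX | grep -i temp

Most drives report it as attribute 194 (Temperature_Celsius).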


Totally agree that if the first preclear shows potential drive problems, more diagnosis (or a replacement) is required before adding it to an array.

 

I like the badblocks option and would consider using it when indicated.  I've just been lucky (knock on wood) to not have had many drive failures.  But I also have good cooling, stable wiring, and I tend to buy the most stable "consumer" drives (my array is almost totally Hitachi).  All my drives are in removable trays, so I almost never open the case and risk knocking something loose.  If users are having many failures, and feel the need to run many preclear cycles, they might want to look at some of these types of factors to improve array reliability. 

 

IMO running a preclear at 50C would be the worst thing to do.  I'd direct a fan at it (as has been suggested), and if that didn't keep temps under 45C, try preclearing in a fridge via a laptop!  If nothing works to keep temps down, I'd check the SMART stats, and if all is good, pull the drive and add it to the array.  If the drive fails, unRAID's protection is still there.  But chances are good the drive won't fail, and you won't have weakened it through such a stressful test.


Maybe I am a noob - but what is this "pre-clear" for?

To stress test a hard drive to make sure it is not bad right out of the gate.

 

My unRAID has been running for two years (10 HDDs), none of them had a preclear, and no HDD failure so far.

You have been extremely lucky.

 

So please, can someone explain what the preclear is exactly doing?

It has 2 functions.  It stress tests the new HDD to try to catch early failure.  It also writes a special signature to the drive so that unRAID believes it has already been cleared.  When you add a precleared drive, the only thing that has to be done is to add it to the array and, once added, format it.  Formatting takes considerably less time than letting unRAID itself clear the drive and add it to the array (if unRAID does the clearing, the array will be offline and unavailable the entire time).


In addition, some history of the clear vs. preclear.

 

When new drives are added to the array (without being precleared), emhttp clears the drive by writing 0's to all sectors.

While this happens, your array cannot be used until it finishes.

After the 0's, a signature is written to the drive, the drive is added to the slots, and parity is assumed to be good because all 0's have been written to that slot.
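
For illustration only, that zeroing pass amounts to the same thing as a plain sequential zero-fill with dd (do not run this by hand on a drive with data; it destroys everything and writes no preclear signature, and /dev/sdX is a placeholder):

      dd if=/dev/zero of=/dev/sdX bs=2048k

The clearing that emhttp and the preclear script perform amounts to this same kind of sequential zero write, with the signature and SMART checks layered on top.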

 

The preclear executes these steps (and more) in parallel while your array is active.

 

Once the preclear is finished and you add a drive to the array, emhttp sees the signature and notices the drive is already (pre) cleared. emhttp adds it to the slots and starts the array.

 

Preclear also stresses the drive by:

1. reading the drive first. Any questionable sectors will be marked as pending_sectors.

2. writing 0's to the drive. Any pending sectors are re-written and tested or reallocated.

3. reading the drive again to see if any pending_sectors still exist.

4. writing a signature.

5. comparing the SMART logs from before and after to show variances.
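
If you want to spot-check the same attributes the script compares, something like this before and after a pass (with /dev/sdX as a placeholder) shows the counts that matter most:

      smartctl -A /dev/sdX | grep -iE 'pending|reallocated'

A non-zero Current_Pending_Sector after the final read is the signal to run another pass or return the drive.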

 

At this point you can choose to redo the preclear if there are still pending_sectors.

 

Some people choose to execute multiple passes of preclear.

 

It's my feeling that badblocks in destructive write mode is a better test, but this is only a recent observation.

badblocks was written specifically to test sectors with multiple patterns and capture the bad blocks.

 

The preclear script has been doing well for everyone for quite a long time, so it's a good tool to use whenever you get a new drive.

It tests the drive and reveals any potential issues with the drive before you add it to the array.

 

If a person chooses to skip the manual preclear, emhttp will do a clear anyway and blindly add the drive to the array if no significant errors were detected.

 

Note: a pending sector is not significant enough to stop this; however, it is a future failure waiting to be dealt with.

 

Drives with pending sectors will eventually reveal issues, perhaps at the worst time, i.e. when you are rebuilding a failed drive.

 

If a sector cannot be read while a rebuild is occurring, you run the risk of the drive timing out, being kicked out of the array, and ending up with a multiple-drive failure.


I've been running 22 drives for almost a year now in my first unRAID build. I've never run a preclear and never had an issue with any of my drives.

Although I do perform a parity check every month.

But I've also used at least 200 drives at home since 2001 and never had an issue with any of them. (Well, I did have the SATA connector break off on one of my drives a few years ago, but that was my fault. WD did replace it under warranty.)


lol! I needed this comment to cheer me a bit  :D

 

I stumbled upon unRAID a week ago. Searching for a case for my home server, I got a 5-year-old HP Pavilion from a friend of mine last Friday, and on Saturday morning I bought a 2TB drive to join my existing saturated 1TB in what seemed to be a relaxing weekend DIY project :P

Went to sleep Sunday at 4am, having just succeeded in making my MB boot from USB :P

 

As a 33-year-old Linux virgin, you can imagine my frustration with the speed of progress in finishing my first NAS...

 

Luckily, the 1st of 3 pre-clears of my 2TB is hitting its 25th hour, which leaves me just enough time to confuse myself even more...

 

As I can't find a straight answer to the next problem, I'll dare to pose this question here, where my adventure ends for today...

 

How (and should I) transfer data from my 1TB to the pre-cleared/formatted 2TB once the process is finished?

 

My first thought was to use Hiren's Boot CD to copy the files under the provided Win XP environment, but reading some posts here I'm getting the impression of how naive I am...

 

It's an option to buy another 2TB this weekend...

 

Even an appropriate link would be very appreciated!

Thanks in advance!

 

 


please...

 

I want to do a 2nd preclear of my HD, and I tried the option with email notifications

 

./preclear_disk.sh -M 4 /dev/sdX ("root" email address)

 

what comes back is:

 

bash: syntax error near unexpected token `('

 

What is that? I didn't find a similar post to guide me in this case...

 

 

 


To run preclear with email notifications, use something like:

 

./preclear_disk.sh -m [email protected] -M 4 /dev/sdX

That will work, but ONLY if you installed and configured MAIL on your server.  If you did not, the mail option will not work.
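
If you're not sure whether mail is set up, a quick test from the console will tell you (this assumes a mail/mailx package is installed and configured, and the address is just a placeholder):

      echo "preclear mail test" | mail -s "unRAID mail test" [email protected]

If that test message never arrives, the -m/-M options of preclear_disk.sh won't be able to notify you either.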

 

As far as moving your data...

Once you assign the disks to your array, and the array shares are visible on your LAN, moving data is as simple as copying the files with your file explorer on your PC.  I strongly suggest assigning a parity drive on the unRAID array first, before populating it with files.  It makes recovery much easier if a disk were to fail in its first few hours/days of operation.


Is it possible to run preclear without writing any data (read-only) to test drives on a quarterly basis for possible read failures, especially if those drives aren't being accessed with any regular frequency?

 

Basically, this would allow for some amount of data integrity checking and possible drive failure in the future. Trying to catch potential drive failures as early as possible.

To run the post-read-verify only on a drive (the drive will not be written to):

      preclear_disk.sh  -V [-A|-a] /dev/???

 

Options to preclear_disk.sh can be seen by typing

preclear_disk.sh -?

 

Joe L.


The suggested monthly parity checks help too.


This may be an old topic, but since I just voted, I figured I'd weigh in here with my rationale too.

 

I used to only pre-clear once.  Then I bought a WD30EZRX drive from Newegg.

 

I pre-cleared it as usual.  Reading the output of the pre-clear script, I noticed it said that I had 3 sectors pending re-allocation before the run, and 4 after pre-read cycle 1.  It ended with 35 sectors pending re-allocation.

 

So I fired up round 2.  After it was done, I had 42 sectors pending re-allocation.

 

Round 3, and the drive vomited all over my syslog and got yanked.  I don't have the results, as it never finished, but if I recall it was in the multiple tens of thousands range.

 

The replacement drive survived 3 rounds of pre-clearing fine.

 

I just purchased 2 more WD30EZRX drives, one of which survived the 3 rounds of pre-clearing fine, and the other was so DOA it never even started the pre-clear cycle.

 

I think I'll be switching to the badblocks + pre-clear method from here on out though.
