jonp

Native Pre-Clear Support

14 posts in this topic Last Reply


With today's large hard disks, a pre-clear is almost obligatory: having your array unprotected for an extended period of time when adding or changing disks simply isn't acceptable.

 


With today's large hard disks, a pre-clear is almost obligatory: having your array unprotected for an extended period of time when adding or changing disks simply isn't acceptable.

 

While I agree it's best to use pre-clear to minimize the time to add a new drive, the array is NOT unprotected when adding a new drive without doing that.  It's simply unavailable for a long time, while it clears the new drive.  But parity is still actively protecting all of the other drives during this time.  Once a new drive is cleared, THEN it is added to the array ... just like it would have been immediately if it was pre-cleared.

 


This would be nice to have in the GUI, but it isn't hard to run preclear from the terminal either. 

 

Nice feature idea.

 

craigr


Any chance of rolling in a badblocks routine (it could be optional) instead of the typical read and write tests? This would give a new disk an awesome workout.


Any chance of rolling in a badblocks routine (it could be optional) instead of the typical read and write tests? This would give a new disk an awesome workout.

 

I think this is a great idea.

 

I know the tools use dd in a blind write to clear the drive.

Preclear then reads it.

 

badblocks can go further to use multiple pass pattern tests which can go a long way towards really exercising a drive.

While it's been said that badblocks alone cannot detect drive problems, it's pretty much the same with a blind dd write then read.

At least with badblocks, if there's an issue, there's a file that contains the list of bad blocks.

 

Plus it does multipass with any set of characters you choose.

 

As far as drive testing goes, perhaps some of this logic could be borrowed to do drive read tests.


I'm cross posting a pointer here to show how I do my own exercise pre-clear and provide ideas for a future native preclear.

http://lime-technology.com/forum/index.php?topic=32564.msg335021#msg335021

 

Each of these 'steps' could be a checkbox to enable on demand.

 

1. save the smartctl log - Marks a point in time in the log directory.

 

2. I usually do a smartctl -t long to pre-read the drive and mark a point in time within the self test logs.

(equivalent to JoeL's pre-read)

 

3. Save the smartctl log again and compare for issues.

 

4. badblocks

-- here is where I differ. I'll usually use shifting bit patterns with badblocks.

Here's an example command line.

badblocks -sv -w -c 512 -t0xaa -t0x55 -t0xff -t0x00 -o /boot/logs/ST4000DX001-1CE168_Z3014JCE.201410221000.badblocks  /dev/sdd

 

http://linux.die.net/man/8/badblocks

Sometimes I add -t0x7f -t0xf7 for extra passes/patterns on drives I may question, but the final pattern is always -t0x00.

I do not adjust the block size, as that has been shown to sometimes shorten the test's final blocks.

I adjust the count of blocks to be 2x the largest cache size. This can be adjusted as needed.

There's also the ability to microsleep between blocks. I suppose if the block count were adjusted to cache size, then add in the micro sleeps it would do bursty writes to cache and limit interference with other array activities. 

This has always been my beef with a massive dd down a drive. Sure, I want it to be fast, but I do not want it to flush my cache and make the array unusable.

 

A checkbox for each of these patterns (With a mandatory 0x00 at the end) would be neat.

 

I'm sure there will be a comment about wearing the drive out. I prefer to call it burning the drive in.

I see many people running multiple clear passes anyway.

Therefore the multiple pattern tests of badblocks are my equivalent to that.

(depending on my confidence in the drive)

 

5. Save the smartctl log again and compare for issues once more. I also keep these logs because the mtime can be used to calculate durations from the prior steps.

 

6. Another smartctl -t long to post-read the drive and mark a point in time within the self test logs.

(equivalent to JoeL's post-read)

 

7. Save the smartctl log again and compare for issues.  Again saving the log for historical review and to calculate duration from the prior step.

 

After review and ensuring there are no pending sectors, I'll use JoeL's script to add the precleared signature.
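
The seven steps above could be sketched as a dry-run plan. This is only a minimal illustration, not the poster's actual script: the names `badblocks_cmd` and `burn_in_plan`, the log file naming, and the use of `smartctl -a` to capture "the log" are all assumptions. The badblocks flags are the ones from the example command line in the post. Nothing is executed; the sketch only builds and prints the commands.

```python
from typing import List, Optional, Sequence, Tuple

Step = Tuple[List[str], Optional[str]]  # (command, file to save its output in)

def badblocks_cmd(device: str, log_path: str,
                  patterns: Sequence[int] = (0xAA, 0x55, 0xFF),
                  block_count: int = 512) -> List[str]:
    """Build a destructive badblocks write-test command (not executed here)."""
    cmd = ["badblocks", "-sv", "-w", "-c", str(block_count)]
    # Drop any zero pattern the caller passed, then append it exactly once,
    # so the final pass always leaves the drive zeroed.
    for pattern in [p for p in patterns if p != 0x00] + [0x00]:
        cmd.append(f"-t0x{pattern:02x}")
    return cmd + ["-o", log_path, device]

def burn_in_plan(device: str, log_prefix: str) -> List[Step]:
    """Return the seven steps in order as (command, output file) pairs."""
    def snapshot(n: int) -> Step:
        # Assumption: 'smartctl -a' output is what gets saved as "the log".
        return (["smartctl", "-a", device], f"{log_prefix}.smart.{n}")
    # 'smartctl -t long' only starts the self-test; a real script must wait
    # for it to finish before taking the next snapshot.
    long_test: Step = (["smartctl", "-t", "long", device], None)
    return [
        snapshot(1), long_test, snapshot(2),
        (badblocks_cmd(device, f"{log_prefix}.badblocks"), None),
        snapshot(3), long_test, snapshot(4),
    ]

# Hypothetical device and log prefix, for display only.
for command, output in burn_in_plan("/dev/sdX", "/boot/logs/example"):
    print(" ".join(command), f"> {output}" if output else "")
```

In a GUI, each entry of the plan would map naturally to one of the checkboxes suggested above, with the trailing 0x00 pattern kept mandatory.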

I'm sure there will be a comment about wearing the drive out. I prefer to call it burning the drive in.
A hard drive is meant to be read and written. It's not like you are doing anything out of the ordinary; you're just doing it all in a short period of time. The only caveat would be that temperatures can get out of hand if you keep the drive active and don't have good cooling. Perhaps a native preclear test should also log temperature changes and give a warning if temps climb too much during the exercise.

 

I'd MUCH prefer to find any issues prior to committing real data to the drive.
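
The temperature logging idea could look roughly like this. It is a minimal sketch under stated assumptions: the function name `temperature_warnings`, the 45C limit and the 15C allowed rise are all hypothetical values picked for illustration, and the samples are assumed to have already been collected (for example by polling the drive's SMART temperature attribute).

```python
from typing import Iterable, List, Tuple

def temperature_warnings(samples: Iterable[Tuple[float, int]],
                         max_celsius: int = 45,
                         max_rise: int = 15) -> List[str]:
    """Flag samples that hit an absolute limit or climb too far above the start.

    samples: (seconds_since_start, temperature_celsius) pairs in time order.
    The first sample is used as the baseline temperature.
    """
    warnings: List[str] = []
    baseline = None
    for elapsed, temp in samples:
        if baseline is None:
            baseline = temp
        if temp >= max_celsius:
            warnings.append(f"{elapsed:.0f}s: {temp}C reached limit {max_celsius}C")
        elif temp - baseline >= max_rise:
            warnings.append(f"{elapsed:.0f}s: {temp}C is {temp - baseline}C above start")
    return warnings

# Made-up readings taken every ten minutes during a test run.
readings = [(0, 30), (600, 38), (1200, 46), (1800, 44)]
for message in temperature_warnings(readings):
    print(message)
```

A real implementation would also have to decide whether a warning should pause the test or only be logged; that choice is left open here.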


I do similar things with my new drives.  I run WD's Data Lifeguard and do a quick test; then an extended test; then write zeroes to the full drive; then repeat the quick & extended tests.

 

If it's a drive I don't need right away, I then stick it on my "testing box" [a spare old PC I use ONLY for drive testing and data recovery] and do a full Level 4 run of Spinrite on the drive (this takes a LONG time).

 

I've actually never had a drive that passed the Data Lifeguard tests fail the Spinrite run, but a Level 4 run of Spinrite is about as intense as you can get -- multiple data patterns;  every block re-written multiple times; and if it did find any read errors it would do a statistical analysis of the errors and attempt to fix the block before doing a reallocation.

 

Spinrite, by the way, DOES track the drive temps and will both warn you and stop the test if the temps get out of line.

 

 


Please keep the following settable options.

 

      -w size  = write block size in bytes.  If not specified, default is 2048k.

      -r size  = read block size in bytes, default is one cylinder at a time (heads * sectors * 512)

      -b count = number of blocks to read at a time.  If not specified, default is 200 blocks at a time.

 

I have found on my system that if I don't run it with these settings, "preclear_disk.sh -c 1 -w 32768 -r 32768 -b 1000 /dev/sdx", when it starts writing zeros it consumes all of the CPU cycles and lags my system terribly.
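
To put rough numbers on those options, here is a small arithmetic sketch. It assumes the amount of data covered per read step is simply block size times block count, and it uses a 255-head, 63-sector geometry purely as an illustrative assumption; the function names are hypothetical. It shows only the sizes involved and does not establish why the smaller settings reduce the load.

```python
def cylinder_bytes(heads: int, sectors: int, sector_size: int = 512) -> int:
    """Size of one cylinder as described for the -r default: heads * sectors * 512."""
    return heads * sectors * sector_size

def bytes_per_step(block_size: int, block_count: int) -> int:
    """Data covered by one read step: block size (-r) times block count (-b)."""
    return block_size * block_count

# Settings from the command line in the post: -r 32768 -b 1000
tuned = bytes_per_step(32768, 1000)

# Defaults: one cylinder per block, 200 blocks at a time.
# The 255/63 geometry is an assumption, not a value from the post.
default = bytes_per_step(cylinder_bytes(255, 63), 200)

print(f"tuned:   {tuned:,} bytes per step")
print(f"default: {default:,} bytes per step")
print(f"ratio:   {default / tuned:.1f}x")
```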


It would be nice if you could keep the post-read times down like bjp999 did for his version of preclear.  In his version, the post-read takes approximately as long as the pre-read and the write.


It would be nice if you could keep the post-read times down like bjp999 did for his version of preclear.  In his version, the post-read takes approximately as long as the pre-read and the write.

 

badblocks does this. Each sweep is predictable and the same. Plus you can use varying bit shifting patterns for some of the passes.

The default patterns for a write test, in the order badblocks applies them, are

x'aa'  10101010

x'55'  01010101

x'ff'  11111111

x'00'  00000000

 

Yes, it's 4 passes, which equates to 8 sweeps. But it's a controlled write, read, write, read, write, read, write, read.

 

Although you can easily program any character pattern and only do a 1 pass write/read of x'00'

There's also a random pattern too.

 

As I remember from reading details in the old spinrite days, varying these bit shifting patterns helps to find track alignment issues.

It has weeded out some bad drives in my arsenal over the years.

 

I don't know that it's going to be as fast as dd though. I've yet to modify the code to provide the MB/s during the updates.

It's slightly faster than a SMART long test, whose duration can be estimated by looking at the Extended self-test routine entry in the smartctl log:

recommended polling time: e.g. (652) minutes for a new 6TB drive.
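
The pass and sweep arithmetic above can be worked through in a few lines. This is a rough sketch: it treats the 652-minute polling time as an upper bound for one sweep (since badblocks is described as slightly faster), takes 6TB as 6e12 bytes, and lists the default patterns in the order given in the badblocks man page. The function names are hypothetical.

```python
DEFAULT_PATTERNS = [0xAA, 0x55, 0xFF, 0x00]  # order from the badblocks man page

def bits(pattern: int) -> str:
    """Render a one-byte test pattern as eight binary digits."""
    return format(pattern, "08b")

def sweeps(passes: int) -> int:
    """Each write-mode pass is one write sweep plus one read sweep."""
    return passes * 2

def throughput_mb_s(capacity_bytes: float, minutes: float) -> float:
    """Average MB/s needed to cover the whole drive in the given time."""
    return capacity_bytes / (minutes * 60) / 1e6

def total_hours(passes: int, minutes_per_sweep: float) -> float:
    """Upper-bound duration of a multi-pass run, in hours."""
    return sweeps(passes) * minutes_per_sweep / 60

for pattern in DEFAULT_PATTERNS:
    print(f"0x{pattern:02x}  {bits(pattern)}")

print(sweeps(len(DEFAULT_PATTERNS)), "sweeps")
print(f"{throughput_mb_s(6e12, 652):.1f} MB/s average for one sweep")
print(f"{total_hours(4, 652):.1f} hours at most for four passes")
```

Under these assumptions, a full four-pattern run on a 6TB drive is a multi-day job, which is worth knowing before ticking every pattern checkbox.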


The problem with badblocks is that it actually doesn't detect bad blocks. This program was relevant when media errors were tracked by the filesystem. Nowadays, bad blocks are usually retired by S.M.A.R.T. routines and their position remapped by the drive's firmware. If a sector is unreadable, either dd or badblocks will abort.

The trick is to write data, read it back and, as you said, compare some critical S.M.A.R.T. attributes from the beginning with those same attributes at the end.
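
That before/after comparison could be sketched as follows. It assumes the attribute table layout printed by `smartctl -A` (ten columns, raw value last) and watches three attribute names commonly associated with failing sectors; which attributes count as "critical" is a judgment call, and the two sample reports are made up for illustration.

```python
from typing import Dict, List, Sequence

WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

def parse_attributes(report: str) -> Dict[str, int]:
    """Map attribute name to raw value for rows of a SMART attribute table."""
    attrs: Dict[str, int] = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # rows whose raw value is not a plain integer are skipped.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs

def regressions(before: str, after: str,
                watched: Sequence[str] = WATCHED) -> List[str]:
    """List watched attributes whose raw value increased between two reports."""
    old, new = parse_attributes(before), parse_attributes(after)
    return [f"{name}: {old[name]} -> {new[name]}"
            for name in watched
            if name in old and name in new and new[name] > old[name]]

# Made-up sample reports, not output from a real drive.
BEFORE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""
AFTER = BEFORE.replace(
    "197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0",
    "197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8")

print(regressions(BEFORE, AFTER))
```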

 

 


I would love to see this added natively within unraid.

 

It would definitely improve the user experience.

