Re: preclear_disk.sh - a new utility to burn-in and pre-clear disks for quick add


I should be getting my WD 2TB EARS drive today, and I want to preclear it 3 times. Would this be correct?

 

cd /boot

preclear_disk.sh /dev/sdf-3

 

Not quite.  The options (e.g. -c, -n) always go BEFORE the /dev/sdX argument.  To clear the drive for 3 passes, the command would be:

cd /boot
preclear_disk.sh -c 3 /dev/sdf

Link to comment

No, that is not the correct syntax.

 

It would be

preclear_disk.sh -c 3 /dev/sdX

where sdX is the correct three-letter device name for your specific disk.

 

To see all the options type

preclear_disk.sh -?

 

Link to comment

Hi, I'm just pre-clearing a 2TB WD20EARS drive (jumper in place).  I've noticed a few bits of info in the syslog which don't look great, such as...

 

Jan 6 02:15:49 Tower kernel: end_request: I/O error, dev sdd, sector 3814563928

Jan 6 02:15:49 Tower kernel: __ratelimit: 22 callbacks suppressed

Jan 6 02:15:49 Tower kernel: Buffer I/O error on device sdd, logical block 476820491

Jan 6 02:15:49 Tower kernel: Buffer I/O error on device sdd, logical block 476820492

Jan 6 02:15:49 Tower kernel: Buffer I/O error on device sdd, logical block 476820493

Jan 6 02:15:49 Tower kernel: Buffer I/O error on device sdd, logical block 476820494

Jan 6 02:15:49 Tower kernel: Buffer I/O error on device sdd, logical block 476820495

 

and

 

Jan 6 02:15:48 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen

Jan 6 02:15:48 Tower kernel: ata4.00: failed command: READ DMA EXT

Jan 6 02:15:48 Tower kernel: ata4.00: cmd 25/00:00:58:a0:5d/00:01:e3:00:00/e0 tag 0 dma 131072 in

Jan 6 02:15:48 Tower kernel: res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)

Jan 6 02:15:48 Tower kernel: ata4.00: status: { DRDY }

Jan 6 02:15:48 Tower kernel: ata4: hard resetting link

 

Anything to worry about or should I just sit tight till it completes??

 

Also, once the pre-clear is complete and assuming all is well, can I just upgrade my current parity drive with the new drive by re-assigning the devices in the settings page, or do I need to physically unplug the old drive and move the cabling to the new one?

 

Thanks, Matt.

Link to comment

We won't really know until we see the SMART reports.  The "timeout" error and hard reset could point to a cabling issue, a power supply issue, the disk itself, or the disk controller.

 

The buffer I/O errors look more like unreadable sectors... I've not seen them before.  We'll know for sure when you get the final SMART report.
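
As a side check, the failing sector and the first failing logical block in the quoted syslog appear to describe the same spot on the disk. The sketch below only does the arithmetic; it assumes 512-byte sectors and 4096-byte buffer blocks (8 sectors per block), which the log itself does not state.

```shell
#!/bin/bash
# Values copied from the syslog excerpt in this thread.
sector=3814563928        # "end_request: I/O error, dev sdd, sector ..."
first_block=476820491    # first "Buffer I/O error ... logical block ..."

# Assumption: 512-byte sectors and 4096-byte buffer blocks.
sectors_per_block=$((4096 / 512))

# Integer division maps a sector number to the block that contains it.
block_from_sector=$((sector / sectors_per_block))

echo "sector $sector -> logical block $block_from_sector"
if [ "$block_from_sector" -eq "$first_block" ]; then
    echo "same location"
fi
```

Under that assumption the sector maps to block 476820491, which matches the first block reported, so the two kinds of message are likely reporting one failed read rather than separate problems.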

 

Joe L.

Link to comment

I saw the same errors with 1 of my 3 WD20EARS disks in the first preclear cycle. They were gone the second time. The SMART reports were fine both times.

Link to comment

You have 37 pending sectors.  Not a good thing.

 

197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       37

 

Pending sectors mean that the drive had issues reading from these sectors, and it has therefore marked them for potential reallocation the next time that sector is written.  I say "potential reallocation" because before it remaps them it will try one more time to read the sector.  I have seen occasional cases where there will be some pending sectors and they go away and never repeat.  Can't explain the behavior but there it is.  But more often these pending sectors become reallocated.

 

There have also been a few cases where, no matter what the owner does, a few pending sectors remain, and don't get better or worse.  Also can't explain that, but people usually just accept it and monitor the drive.
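
For anyone who wants to watch that number between runs, here is a minimal sketch that pulls the raw value (the last column) of one attribute out of a report line. The helper name `smart_raw` is made up for this example, and the sample line is the one quoted above; on a live system the input would presumably come from `smartctl -A /dev/sdX` instead.

```shell
#!/bin/bash
# Hypothetical helper: print the RAW_VALUE (last column) of the SMART
# attribute whose ID (first column) matches the argument.
smart_raw() {
    awk -v id="$1" '$1 == id { print $NF }'
}

# Sample line copied from the report quoted in this thread.
report='197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       37'

pending=$(printf '%s\n' "$report" | smart_raw 197)
echo "pending sectors: $pending"
```

The same helper with ID 5 would give the reallocated sector count, so the two numbers can be noted down after each cycle.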

 

Will be interesting to see the results after your second preclear.

Link to comment

Actually, since all un-readable sectors should have been identified in the pre-read, it indicates the "pending sectors" were probably identified in the post-read.  That is not a good thing. 

 

As bjp999 said, we'll see what happens in the next clear cycle.  I do not expect those errors to go away.

 

 

Link to comment

I will be adding more logic to the preclear_disk analysis.  Your output shows one item I had not considered checking.

 

Expect a newer version of preclear_disk.sh shortly to catch and report on that type of error.

 

In the interim you can always type

diff /tmp/smart_start_sdX /tmp/smart_finish_sdX

 

to see all the differences.
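
If the full diff is too noisy, one option is to compare only the sector-related attributes. This is a rough sketch, not part of preclear_disk.sh: it assumes the saved reports contain the usual smartctl attribute table, and the two sample files are stand-ins built from lines quoted in this thread rather than a real start/finish pair.

```shell
#!/bin/bash
# Illustrative stand-ins for /tmp/smart_start_sdX and /tmp/smart_finish_sdX.
workdir=$(mktemp -d)
cat > "$workdir/smart_start" <<'EOF'
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       37
EOF
cat > "$workdir/smart_finish" <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       35
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       65
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
EOF

# Keep only attributes 5, 196, 197 and 198 (reallocated / pending / uncorrectable).
sector_attrs() {
    grep -E '^ *(5|196|197|198) ' "$1"
}

sector_attrs "$workdir/smart_start"  > "$workdir/start.filtered"
sector_attrs "$workdir/smart_finish" > "$workdir/finish.filtered"

# diff exits 0 when the filtered reports match and 1 when they differ.
if diff "$workdir/start.filtered" "$workdir/finish.filtered"; then
    changed=no
else
    changed=yes
fi
echo "sector attributes changed: $changed"
rm -rf "$workdir"
```

Filtering first keeps attributes that change on every run, such as Power_On_Hours, out of the comparison.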

Link to comment

Thanks for the replies.  Drive is only 2 days old too.  And now it's $10 cheaper >:(.  I'll post again in about 26 hours.  Thanks.

 

Edit - should I stop preclearing and wait for your new version?

The newer version is attached to the preclear thread.  Give it a try.

 

The difference is in the output report only.  It will print the number of re-allocated sectors and pending-re-allocation sectors for both the beginning and end of the preclear process.

 

Joe L.

Link to comment

When I run ...

 

preclear_disk.sh -l

 

I get the following results:

 

========================================

Disks not assigned to the unRAID array

  (potential candidates for clearing)

========================================

ls: cannot access /dev/disk/by-path/e: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

ls: cannot access /dev/disk/by-path/-: No such file or directory

ls: cannot access /dev/disk/by-path/public: No such file or directory

    /dev/sdo = ata-WDC_WD20EADS-00S2B0_WD-WCAVY5796260

    /dev/sdu = ata-WDC_WD20EADS-00S2B0_WD-WCAVY5796928

 

This is with 5.0b2.

 

The bottom two are right.

Link to comment

Here's the log after the new run with the new preclear -A and no jumper.  I stopped it originally after a couple of hours to switch to your new report version (just in case something looked odd with the original smart report).  Let me know your thoughts?  Thanks

syslog_-_16jan11_-_new_preclear_without_jumper.zip

Link to comment

I'll fix it when I get a moment this evening and post a newer version.  When I tested, my disk did not have an existing partition.

 

As you said, just a few extra messages when it went looking for the partitions that were not in the assigned array.

There is another shell error that will show up in the output report for some disks that have multiple "Undefined" attributes.  Those errors are harmless too and do not affect the output.  I've already fixed that one, so I'm happy you found the other before I posted my current fix.

 

The "-l" option was an extra to make it easier on new users of unRAID.    Don't want to confuse them with errors.  ;D

 

Joe L.

Link to comment

Here are just the SMART reports, in case you want to see how they came out.  I think this time is better, but I don't know if that makes a difference.

Preclear_smart_reports.zip

Link to comment

This should now be fixed.  New version of preclear_disk.sh attached to the first post in this thread.

 

Joe L.

Link to comment

When I use the -l option, it shows drives that are part of my array. Is this correct?  All of the drives below are in the array. Version .9.9b

 

root@Tower:/boot# preclear_disk.sh -l

========================================

Disks not assigned to the unRAID array

  (potential candidates for clearing)

========================================

    /dev/sdu = ata-ST32000542AS_5XW1BRQF

    /dev/sdj = ata-ST32000542AS_5XW21SQH

    /dev/sdb = ata-WDC_WD10EADS-00L5B1_WD-WCAU49838186

    /dev/sdt = ata-WDC_WD20EADS-00R6B0_WD-WCAVY0225668

    /dev/sds = ata-WDC_WD20EADS-00R6B0_WD-WCAVY2274690

    /dev/sdr = ata-WDC_WD20EARS-00S8B1_WD-WCAVY2440611

 

thanks

 

Josh

 

Link to comment

Well that's certainly not right if they are assigned to your array. 

 

Can you post the output of the following commands, or send it in a PM, so I can try to figure out what the script is not doing correctly?

ls -l /dev/disk/by-id/*

 

ls -l /dev/disk/by-path/*

 

cat /boot/config/disk.cfg

 

Joe L.

Link to comment

Attached

 

Thanks....

 

I've narrowed it down to the date format.

 

On my older server running 4.7, I see this style of date in the "ls" command:

lrwxrwxrwx 1 root root  9 Jan 15 10:06 /dev/disk/by-path/pci-0000:00:1d.7-usb-0:3:1.0-scsi-0:0:0:0 -> ../../sda

 

On the 5.0beta version, the date in the "ls" command looks like this:

lrwxrwxrwx 1 root root 9 2011-01-07 09:03 /dev/disk/by-path/pci-0000:00:1f.2-scsi-4:0:0:0 -> ../../sdg

Because of that, there are fewer whitespace delimited "fields" in the output and the preclear script does not properly extract the last field which holds the device name.
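
To make the field-count difference concrete, here is a small sketch using the two `ls` lines quoted above. It is not the actual fix in preclear_disk.sh, just one way to avoid depending on a fixed field position: taking the last field (`$NF` in awk) works for both date formats.

```shell
#!/bin/bash
# The two sample lines are copied from this post (4.7 style, then 5.0beta style).
old='lrwxrwxrwx 1 root root  9 Jan 15 10:06 /dev/disk/by-path/pci-0000:00:1d.7-usb-0:3:1.0-scsi-0:0:0:0 -> ../../sda'
new='lrwxrwxrwx 1 root root 9 2011-01-07 09:03 /dev/disk/by-path/pci-0000:00:1f.2-scsi-4:0:0:0 -> ../../sdg'

# Number of whitespace-delimited fields in a line.
field_count() { printf '%s\n' "$1" | awk '{ print NF }'; }

# Last field of a line, whatever its position.
last_field() { printf '%s\n' "$1" | awk '{ print $NF }'; }

for line in "$old" "$new"; do
    target=$(last_field "$line")
    # ${target##*/} strips everything up to the final slash: ../../sda -> sda
    echo "$(field_count "$line") fields, device: ${target##*/}"
done
```

The old format yields 11 fields and the new one 10, yet the last field is the symlink target in both. Another option that skips parsing `ls` altogether would be `readlink -f` on each entry under /dev/disk/by-path, assuming GNU coreutils is available on the server.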

 

I'm working on the fix... but for the next few minutes I've removed the preclear_disk.sh attachment so that new users don't get confused.

 

Joe L.

Link to comment

Not good ...

 

You now have 35 reallocated sectors + 3 new pending sectors.  You can keep running preclear cycles in hopes that the reallocated sectors stabilize, but based on experience here I don't believe it will happen.  Every run or two will produce a few more reallocated sectors, and you'll never be able to trust the drive with data.  My advice would be to RMA the disk.

 

Remember that it is far better to learn this BEFORE you add a disk to your array.

 

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   193   177   051    Pre-fail  Always       -       1938
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always       -       1358
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       13
  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       35
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       65
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       11
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       8
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       35
196 Reallocated_Event_Count 0x0032   199   199   000    Old_age   Always       -       1
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       131
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       142

 

Link to comment
