Re: preclear_disk.sh - a new utility to burn-in and pre-clear disks for quick add

Joe L. · October 15, 2009

root@Tower:~# hdparm -i /dev/hdc

/dev/hdc:

Model=ST31500341AS, FwRev=CC1H, SerialNo=9VS2L1L4

Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }

RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4

BuffType=unknown, BuffSize=0kB, MaxMultSect=16, MultSect=16

CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=18446744072344861488

IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}

PIO modes: pio0 pio1 pio2 pio3 pio4

DMA modes: mdma0 mdma1 mdma2

UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5 udma6

AdvancedPM=no WriteCache=enabled

Drive conforms to: unknown: ATA/ATAPI-4,5,6,7

* signifies the current active mode

Is it possible your BIOS has an option like this (in red) for the last two SATA ports? (take note how it defaults to IDE mode)

I started preclear to get an idea of speeds and everything looks normal (I think).

unRAID server Pre-Clear disk /dev/hdc

= cycle 1 of 1

= Disk Pre-Read in progress: 0% complete

= ( 1,645,056,000 bytes of 1,500,301,910,016 read ) 116 MB/s

=

Disk Temperature: 30C, Elapsed Time: 0:00:15

Speed looks good. I would not worry too much about how it presents itself to the Linux OS, but I'll bet it is pretending to be a PATA drive.
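As a rough cross-check on the figures in the progress display, the byte count and the rate can be turned into a time estimate for one full pass over the disk. This is only a sketch: it assumes the script's "MB/s" means 10^6 bytes per second and that the rate stays constant, which it will not, since drives slow down toward the inner tracks. The function name is made up for illustration.

```python
def estimate_pass_hours(total_bytes: int, rate_mb_per_s: float) -> float:
    """Hours for one full read or write pass at a constant rate.

    Assumes 1 MB = 1,000,000 bytes. Real passes slow down toward the
    end of the disk, so treat the result as a lower bound.
    """
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    seconds = total_bytes / (rate_mb_per_s * 1_000_000)
    return seconds / 3600


# Figures taken from the progress display in this post.
hours = estimate_pass_hours(1_500_301_910_016, 116)
print(f"about {hours:.1f} hours per pass")  # about 3.6 hours per pass
```

A full cycle also includes the zeroing pass and the post-read, so the total will be several times this figure; the elapsed times reported further down the thread are correspondingly longer.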

Joe L.

spinbot · October 15, 2009

I would have never even thought of anything like that

I suspect there is an option in the BIOS to change this.

Assuming I could find this option in the BIOS and if I changed this AFTER the preclear completed, would that create any issues?

Joe L. · October 15, 2009

If you change it, and a disk was already assigned to the array, you might need to re-assign it on the devices page if unRAID was unable to figure it out on its own. It probably will not care at all, or work any faster... so you might just leave it for now.

Joe L.

spinbot · October 15, 2009

I've got some form of OCD/perfectionism in me and it's hard to just leave it.

My pre_clear died on me last night ( Windows can be thanked for that as it did an automatic update and rebooted my computer ... that's fixed and won't happen again! ).

So, I got adventurous this morning and went into the BIOS. I've located the part of the BIOS that relates to this.

There are two options (defaults listed after):

OnChip SATA Type - Native IDE

OnChip SATA Port4/5 Type - IDE (this cannot be changed unless the option before is changed)

My options for OnChip SATA Type include:

Native IDE

RAID

AHCI

Once I pick either RAID or AHCI, I then can edit OnChip SATA Port4/5 Type and my options are:

As SATA Type

IDE

I'm googling and I have found similar questions, but the answers are for Windows and most say if I select RAID or AHCI I would need drivers.

===============================================================

EDIT: So I came across an article here:

Here’s a possible solution if you’re in this kind of "SATA hard drives not found" situation. The trick is related to AHCI (Advanced Host Controller Interface), a hardware mechanism designed to allow software to communicate with Serial ATA (SATA) devices such as host bus adapters, with features not offered by Parallel ATA (PATA) controllers such as higher speeds, hot-plugging and native command queuing (NCQ). Windows XP (and any OS older than Vista or Linux kernel 2.6.19) does not come pre-packaged with a driver to support AHCI/SATA mode, thus creating a very common error.....

As the Linux kernel that unRAID uses is newer than that, the AHCI driver should be built in. I changed the BIOS settings to "AHCI" and "As SATA Type".

Now, it comes up with the right name. Hopefully this change doesn't create any other issues. Opinions?

My drive output:

root@Tower:~# ls -l /dev/disk/by-id

total 0

lrwxrwxrwx 1 root root 9 Oct 15 06:10 ata-ST31000528AS_9VP0H3TD -> ../../sdd

lrwxrwxrwx 1 root root 10 Oct 15 06:10 ata-ST31000528AS_9VP0H3TD-part1 -> ../../sdd1

lrwxrwxrwx 1 root root 9 Oct 15 06:10 ata-ST31500341AS_9VS2L1L4 -> ../../sde

lrwxrwxrwx 1 root root 9 Oct 15 06:10 ata-ST31500341AS_9VS2LXAF -> ../../sda

lrwxrwxrwx 1 root root 10 Oct 15 06:10 ata-ST31500341AS_9VS2LXAF-part1 -> ../../sda1

lrwxrwxrwx 1 root root 9 Oct 15 06:10 ata-ST320011A_3HT0GTF1 -> ../../hda

lrwxrwxrwx 1 root root 10 Oct 15 06:10 ata-ST320011A_3HT0GTF1-part1 -> ../../hda1

lrwxrwxrwx 1 root root 9 Oct 15 06:10 ata-ST3750330AS_5QK00GT5 -> ../../sdc

lrwxrwxrwx 1 root root 10 Oct 15 06:10 ata-ST3750330AS_5QK00GT5-part1 -> ../../sdc1

lrwxrwxrwx 1 root root 9 Oct 15 06:10 ata-WDC_WD15EADS-00P8B0_WD-WMAVU0072618 -> ../../sdb

lrwxrwxrwx 1 root root 10 Oct 15 06:10 ata-WDC_WD15EADS-00P8B0_WD-WMAVU0072618-part1 -> ../../sdb1

lrwxrwxrwx 1 root root 9 Oct 15 06:10 scsi-SATA_ST31000528AS_9VP0H3TD -> ../../sdd

lrwxrwxrwx 1 root root 10 Oct 15 06:10 scsi-SATA_ST31000528AS_9VP0H3TD-part1 -> ../../sdd1

lrwxrwxrwx 1 root root 9 Oct 15 06:10 scsi-SATA_ST31500341AS_9VS2L1L4 -> ../../sde

lrwxrwxrwx 1 root root 9 Oct 15 06:10 scsi-SATA_ST31500341AS_9VS2LXAF -> ../../sda

lrwxrwxrwx 1 root root 10 Oct 15 06:10 scsi-SATA_ST31500341AS_9VS2LXAF-part1 -> ../../sda1

lrwxrwxrwx 1 root root 9 Oct 15 06:10 scsi-SATA_ST3750330AS_5QK00GT5 -> ../../sdc

lrwxrwxrwx 1 root root 10 Oct 15 06:10 scsi-SATA_ST3750330AS_5QK00GT5-part1 -> ../../sdc1

lrwxrwxrwx 1 root root 9 Oct 15 06:10 scsi-SATA_WDC_WD15EADS-00_WD-WMAVU0072618 -> ../../sdb

lrwxrwxrwx 1 root root 10 Oct 15 06:10 scsi-SATA_WDC_WD15EADS-00_WD-WMAVU0072618-part1 -> ../../sdb1

lrwxrwxrwx 1 root root 9 Oct 15 06:10 usb-Lexar_JD_FireFly_AA04011000035322-0:0 -> ../../sdf

lrwxrwxrwx 1 root root 10 Oct 15 06:10 usb-Lexar_JD_FireFly_AA04011000035322-0:0-part1 -> ../../sdf1

Joe L. · October 15, 2009

You should have had AHCI all along... there should not be any problems.

spinbot · October 15, 2009

Good guess then!

Thanks for the confirmation. I think I can safely say I will leave this thread for the next person with questions now!

Talos · October 16, 2009

Ran preclear on the first disk (a brand new WD10EADS) in my new system last night.

Came back with a few errors. Below is a C&P from the telnet session.

System specs are as follows:

Asus P5Q-Deluxe

E6400 underclocked to 1600 MHz

2 GB G.Skill DDR2-800

2x Adaptec 1430SA RAID cards

3x Norco SS-500 hot-swap modules

Seasonic M12-700w

Lian Li PC-A17 case

AHCI enabled in BIOS

Adaptec RAID BIOSes disabled

===========================================================================

= unRAID server Pre-Clear disk /dev/sda

= cycle 1 of 1

= Disk Pre-Clear-Read completed DONE

= Step 1 of 10 - Copying zeros to first 2048k bytes DONE

= Step 2 of 10 - Copying zeros to remainder of disk to clear it DONE

= Step 3 of 10 - Disk is now cleared from MBR onward. DONE

= Step 4 of 10 - Clearing MBR bytes for partition 2,3 & 4 DONE

= Step 5 of 10 - Clearing MBR code area DONE

= Step 6 of 10 - Setting MBR signature bytes DONE

= Step 7 of 10 - Setting partition 1 to precleared state DONE

= Step 8 of 10 - Notifying kernel we changed the partitioning DONE

= Step 9 of 10 - Creating the /dev/disk/by* entries DONE

= Step 10 of 10 - Testing if the clear has been successful. DONE

= Disk Post-Clear-Read completed DONE

Disk Temperature: 27C, Elapsed Time: 15:01:48

============================================================================

==

== Disk /dev/sda has been successfully precleared

==

============================================================================

S.M.A.R.T. error count differences detected after pre-clear

note, some 'raw' values may change, but not be an indication of a problem

54c54
< 1 Raw_Read_Error_Rate 0x002f 100 253 051 Pre-fail Always - 0
---
> 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
63c63
< 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 6
---
> 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 7
67c67
< 199 UDMA_CRC_Error_Count 0x0032 200 253 000 Old_age Always - 0
---
> 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0

============================================================================

These errors are bad right?

Joe L. · October 16, 2009

Where do you see any errors?

Did you read how to interpret the SMART report results? See here for some help http://en.wikipedia.org/wiki/S.M.A.R.T

There are three SMART parameters that changed from the report taken before the pre-clearing process.

One of those, the "Load Cycle Count", increased from 6 to 7. This indicates the disk heads were loaded from their parked position onto the disk surface so it could be read and written. If it had NOT incremented, it would have been a problem.

The other two have merely moved off their factory-initialized values (the "worst" column went from 253 to 200) now that statistics are being collected. Neither is anywhere near its "failure threshold" (the third of the three-digit columns in each row: 051 and 000 here).
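The comparison described above can be sketched in a few lines. This is a minimal illustration, not part of preclear_disk.sh: it assumes the usual `smartctl -A` column order (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw), and the names `SmartAttr`, `parse_attr_row` and `is_failing` are made up for this example.

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int   # current normalized value
    worst: int   # lowest normalized value recorded so far
    thresh: int  # failure threshold set by the manufacturer
    raw: str     # vendor-specific raw value, kept as text


def parse_attr_row(row: str) -> SmartAttr:
    """Parse one attribute row in the assumed `smartctl -A` layout.

    Leading diff markers ("<" or ">") are stripped so rows can be
    pasted straight from the preclear report.
    """
    fields = row.lstrip("<> ").split()
    if len(fields) < 10:
        raise ValueError(f"not a complete attribute row: {row!r}")
    return SmartAttr(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        thresh=int(fields[5]),
        raw=" ".join(fields[9:]),
    )


def is_failing(attr: SmartAttr) -> bool:
    """True once the normalized value has dropped to or below the threshold."""
    # A threshold of 0 is treated as "never trips".
    return attr.thresh > 0 and attr.value <= attr.thresh


after = parse_attr_row("> 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0")
print(after.name, after.value, after.thresh, is_failing(after))
```

For the row taken from the report above, the value of 200 is far above the threshold of 51, so the check reports no failure.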

Enjoy your new disk...

Joe L.

Talos · October 16, 2009

Heh.. Thanks heaps for the explanation Joe.. Guess I saw the CRC error count line, panicked and assumed it was bad... I will have a bit more of a read of the wiki link..

Joe L. · October 20, 2009

Hi Joe L.,

I am wondering if it would be easy for you to include an option for completely wiping the disk instead of preclearing it.

Or can the preclearing method in its current form be used for this purpose? (At least the readback seems unnecessary in this case.)

That would be extremely useful on a disk replacement, when the old disk is going to be sold.

Thank you in advance for your feedback.

I think you intended this post to be in the thread on the preclear_disk.sh script...

If you are just interested in clearing a disk prior to selling it, you can just use the existing "-n" option to cause it to skip the pre-reading and post-reading phases of the pre-clear script. It will then run in about a third of the time.

You would invoke it as

preclear_disk -n /dev/???

(with ??? being the designation of the drive to be cleared: sda, sdb, ... hda, hdb, etc.)

olympia · October 20, 2009

Oops, yes, it certainly belongs here. Sorry for messing this up.

Can clearing with the -n option be considered safe against undelete/unerase/unformat tools?

Joe L. · October 20, 2009

The ReiserFS file system is not friendly to undelete/unerase/unformat tools to begin with. So writing all zeros on top of that should be safe enough.

As for recovering deleted files from a ReiserFS file system, it is actually pretty easy if you have not overwritten the data by writing other files to the disk.

However... the preclear_disk.sh script does not write to the reiser file-system. It works directly on the raw disk itself. As far as I know, there is no un-erase when you write all zeros to a raw disk, regardless of the file-system on it.

But if you are paranoid enough, then you can always zero the disk twice

You could use -c 20 to perform 20 cycles of zeroing the disk if you wanted. It will just take 20 times longer...

BTW, would be nice if we could use /dev/random instead of /dev/zero for this. But from what I gather they don't work quite the same, so it may not be trivial.

If you are absolutely certain of the disk device you can simply type

dd if=/dev/urandom of=/dev/your_device

But be VERY careful... one error and you WILL overwrite the wrong disk... and there is no un-do command.
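One way to lower the odds of picking the wrong device is to look the disk up by its serial number under /dev/disk/by-id, as in the listing earlier in the thread, and only pass the resolved name to dd once exactly one device matches. The following is a minimal sketch under that assumption (a Linux system that populates /dev/disk/by-id with names ending in "_SERIAL"); the function name and the `by_id_dir` parameter are made up for illustration.

```python
import os
from typing import Set


def devices_for_serial(serial: str, by_id_dir: str = "/dev/disk/by-id") -> Set[str]:
    """Return the whole-disk device paths whose by-id link name ends with the serial."""
    matches: Set[str] = set()
    if not os.path.isdir(by_id_dir):
        return matches
    for name in os.listdir(by_id_dir):
        # Skip partition links such as "...-part1"; only whole disks are wanted.
        if "-part" in name:
            continue
        if name.endswith("_" + serial):
            # Several links (ata-..., scsi-...) may point at the same device.
            matches.add(os.path.realpath(os.path.join(by_id_dir, name)))
    return matches


if __name__ == "__main__":
    found = devices_for_serial("9VS2L1L4")
    if len(found) == 1:
        print("serial maps to", found.pop())
    else:
        print("refusing to guess, matches:", sorted(found))
```

The script only reports what it finds and never writes to a disk; anything other than a single match is a reason to stop and check by hand.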

One more thing... for the "really" paranoid. If a disk has re-allocated a sector because it was unable to read it, then the disk probably still has the original contents of that sector somewhere on its disk. Now, it was re-located because it could not be read, so odds are you'll have a really tough time reading it in its original location on the disk, even if you could somehow get to it using low-level disk diagnostics.

One way to ensure the data on the entire disk is un-readable (including original locations of re-allocated sectors) is to heat the platters within it to a temperature above a point known as the Curie temperature. At this point, the energy being put into the magnetic material on the platters from the heat will permanently disrupt the magnetic domain structure of the material, turning it into a paramagnetic material. See here for detailed instructions, if you are really paranoid.

Now, the disk might lose some of its re-sale value if you use heat to erase it, but if you are really paranoid, the drop in value will not be a significant issue.

Joe L.

(The Curie temperature of iron is 768 °C, well above the normal temperature range recommended for disks in an unRAID server. Combine that with 660.32 °C being the melting point of aluminum, and heating that far will guarantee both the erasure of the data and the loss of value on eBay. Best to sell as-is, untested.)

Edited to add links...

olympia · October 20, 2009

OK, thank you for all the hints!

Talos · October 24, 2009

I've just whacked in a Seagate 1.5 TB 11-series drive to use as my parity drive and ran preclear over the drive overnight. Thought I'd get the results below checked before I assign the drive, as it's reported back a few more things than my WD10EADS drives did when I ran them through preclear.

Does the drive appear OK to use or is there something I should be looking at?

===========================================================================

= unRAID server Pre-Clear disk /dev/sdg

= cycle 1 of 1

= Disk Pre-Clear-Read completed DONE

= Step 1 of 10 - Copying zeros to first 2048k bytes DONE

= Step 2 of 10 - Copying zeros to remainder of disk to clear it DONE

= Step 3 of 10 - Disk is now cleared from MBR onward. DONE

= Step 4 of 10 - Clearing MBR bytes for partition 2,3 & 4 DONE

= Step 5 of 10 - Clearing MBR code area DONE

= Step 6 of 10 - Setting MBR signature bytes DONE

= Step 7 of 10 - Setting partition 1 to precleared state DONE

= Step 8 of 10 - Notifying kernel we changed the partitioning DONE

= Step 9 of 10 - Creating the /dev/disk/by* entries DONE

= Step 10 of 10 - Testing if the clear has been successful. DONE

= Disk Post-Clear-Read completed DONE

Disk Temperature: 32C, Elapsed Time: 19:23:44

============================================================================

==

== Disk /dev/sdg has been successfully precleared

==

============================================================================

S.M.A.R.T. error count differences detected after pre-clear

note, some 'raw' values may change, but not be an indication of a problem

54c54

< 1 Raw_Read_Error_Rate 0x000f 100 100 006 Pre-fail Always - 17417

---

> 1 Raw_Read_Error_Rate 0x000f 118 100 006 Pre-fail Always - 181712693

58c58

< 7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 732

---

> 7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 141171

63,66c63,66

< 188 Unknown_Attribute 0x0032 100 253 000 Old_age Always - 0

< 189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0

< 190 Airflow_Temperature_Cel 0x0022 069 069 045 Old_age Always - 31 (Lifetime Min/Max 26/31)

< 195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always

---

> 188 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0

> 189 High_Fly_Writes 0x003a 088 088 000 Old_age Always - 12

> 190 Airflow_Temperature_Cel 0x0022 068 060 045 Old_age Always - 32 (Lifetime Min/Max 26/40)

> 195 Hardware_ECC_Recovered 0x001a 052 044 000 Old_age Always

69,72c69,72

< 199 UDMA_CRC_Error_Count 0x003e 200 253 000 Old_age Always - 0

< 240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 256289288486912

< 241 Unknown_Attribute 0x0000 100 253 000 Old_age Offline - 0

< 242 Unknown_Attribute 0x0000 100 253 000 Old_age Offline - 776

---

> 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0

> 240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 210333138419731

> 241 Unknown_Attribute 0x0000 100 253 000 Old_age Offline - 3172977585

> 242 Unknown_Attribute 0x0000 100 253 000 Old_age Offline - 120762575

============================================================================

Joe L. · October 25, 2009

There are no re-allocated sectors, or sectors pending re-allocation, or rather, if there are, no additional sectors were re-allocated in the pre-clear script. Everything else looks very normal.

Talos · October 25, 2009

Thanks Joe.. Just what I wanted to hear. Was a bit worried by the rather large numbers at the end of the read and seek error lines. Still don't know enough about interpreting these SMART reports...

Joe L. · October 25, 2009

Nobody but the manufacturer knows how to interpret the "raw" values.

The normalized "read" value is 118, the worst it has ever been is 100, and if it ever gets down to 6 the drive will be considered to be in a failure mode by the SMART tests. So, in reality, during the preclear, the raw-read-rate improved. (118 is a better value than 100. The "100" is probably a factory initialized starting value, as are all the "253" values on other variables you see.)

Joe L.

purko · October 26, 2009

Just FYI, I have 8 WD-EADS, and none of them has ever shown such crazy numbers. They've always had those raw numbers at 0. (so far).

Purko

Joe L. · October 26, 2009

Interesting... as I said, only the manufacturer knows... and it might differ by firmware version even within the same drive. Some drives only start keeping statistics after they have some number of running hours logged on them.

Many manufacturers show the raw numbers, many do not, some only show them on some parameters. About the only raw value that can be understood (most of the time) is the temperature. It is not always correctly calibrated, as at least one drive has reported a temperature "lower" than the ambient room temperature, and that is just not possible, no matter how many fans are used... (unless evaporative cooling is in effect, and if the drives are "wet" there's another problem that needs attention that the "SMART" report might not alert you to.)

Joe L.

RobJ · October 27, 2009

Just FYI, I have 8 WD-EADS, and none of them has ever shown such crazy numbers. They've always had those raw numbers at 0. (so far).

This looks like a classic case of whether you show lots of info and risk confusing and alarming the large base of non-technical users, or show very little info and frustrate your power users. Seagate chose to show more info, probably assuming this is usually only visible to more technical and knowledgeable users. WD went the other way, like Microsoft with dumbed-down interfaces, and Detroit with idiot lights. This is the difference between an oil gauge and a Service Engine indicator. Both are valid at times. I as a more technical user like to know what is really happening behind the scenes, but I don't apply that to everything. While I prefer more info when it is related to computers, I'm tired of auto info, and don't need to know the current compression ratios and oil pressure readings, I'd rather let the mechanic get greasy. Before crossing a bridge, I don't need the current metal and concrete stress readings or thermal expansion/contraction measurements.

We need a balance between the two choices of more info or less info. The best interface would be one that provides a simple and intuitive initial display, plus a link or button for the detailed technical info, PLUS links to explanatory info (what do these numbers mean). There are a growing number of tools able to read SMART info, but in general they seem to go to one extreme or the other. They either are limited to a PASS or FAIL indicator only, or they show large complex tables of raw information, with little to no explanation. I think most users would like to see more information, but only if it is accompanied with an appropriate explanation that provides some perspective on the numbers.

All drives have soft errors (often in the millions), and read and seek error rates, whether they choose to reveal them or not. If the VALUE or WORST numbers are still near 100 or 200 (or higher), then the manufacturer considers them completely nominal, within their expected range, whatever the size or behavior of the RAW_VALUEs.

purko · October 28, 2009

We need a balance between the two choices of more info or less info. The best interface would be one that provides a simple and intuitive initial display, plus a link or button for the detailed technical info, PLUS links to explanatory info (what do these numbers mean).

I very much agree with that.

Maybe we can write some kind of script that goes through smartctl's output and makes sense of it...

Here is what I wish to have:

root@Tower:~#
root@Tower:~# smartchk /dev/sda

/dev/sda:
4: This disk has little wear and is still in good condition.
root@Tower:~#
root@Tower:~#

...where the code returned is on Purko's scale from 0 to 5:

5: This disk is in perfectly healthy condition.

4: This disk has little wear and is still in good condition.

3: This disk is past half of its life expectancy.

2: This disk is showing all signs of old age. Consider replacing it now.

1: This disk is in critical condition. Should be replaced immediately

0: This code is not even needed: When the disk is dead you'll know it.

Can somebody write such a tool?

If we had this tool, then preclear_disk.sh could be much more understandable and helpful upon completion.

Purko

Joe L. · October 28, 2009

If we had this tool (to interpret the results), then preclear_disk.sh could be much more understandable and helpful upon completion.

Purko

If it were only so easy...

If you are lucky, you might be able to narrow a given drive down to three or four categories:

1. Dead... no sign of life... Cannot read or write.

2. Use for write only operations... since you'll never read it back.

3. Failing in some way now... could be mechanical, electrical, or comically aural (making a funny noise)

4. Working now, but extremely high odds of failure within the next 175000 hours.

Most drives will fall into category 4, with high odds of failure at some future point. ;D

Joe L.

purko · October 28, 2009

If you are lucky, you might be able to narrow a given drive down to three or four categories:

1. Dead... no sign of life... Cannot read or write.

You don't need any tool to tell you that. You see it yourself.

Seriously, if these numbers are different for different manufacturers, and if only manufacturers know what these numbers mean, and if no mortal can interpret these numbers, then what's the whole point of having smartctl?

Purko

IG82 · October 28, 2009

Thanks for the script Joe, just used for the first time but plan on getting into the habit.

I have just cleared a brand new 1.5TB EADS WD drive, took a little over 21 hours!

Here are the details from the SMART tests: the raw read error rate has increased during the test from 0 to 1. None of my other WD non-EADS drives have any errors of this type, so I don't think it is an anomaly due to first use. Is it something to be concerned about?

> Offline data collection status:  (0x84)       Offline data collection activity
>                                       was suspended by an interrupting command from host.
54c54
<   1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail  Always       -       0
---
>   1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
58c58
<   7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
---
>   7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
63c63
< 193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       14
---
> 193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       16
67c67
< 199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       0
---
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
=========================================================================

Joe L. · October 28, 2009

About the only RAW values you can interpret yourself are those for re-allocated sectors, sectors pending re-allocation, and drive temperature.
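Those three raw values can be picked out of a report with a short script. The sketch below is an illustration under assumptions, not part of preclear_disk.sh: it assumes the common attribute IDs 5 (reallocated sectors), 197 (sectors pending re-allocation) and 194 (temperature), which not every drive reports, and the usual `smartctl -A` column layout. The sample rows are made up and do not come from any drive in this thread.

```python
# Attribute IDs assumed for the three raw values that can be read directly.
WATCHED = {
    5: "reallocated sectors",
    197: "sectors pending re-allocation",
    194: "temperature (C)",
}


def watched_raw_values(report: str) -> dict:
    """Pick the raw values of the watched attributes out of `smartctl -A` text."""
    found = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        # Only the first token of the raw column is used; trailing notes
        # such as "(Lifetime Min/Max 26/31)" are ignored.
        if attr_id in WATCHED and fields[9].isdigit():
            found[WATCHED[attr_id]] = int(fields[9])
    return found


# Made-up sample rows for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   120   117   000    Old_age   Always       -       30
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
"""

for label, raw in watched_raw_values(SAMPLE).items():
    print(f"{label}: {raw}")
```

Running it on the reports taken before and after a preclear and comparing the two results would show whether any sectors were re-allocated during the run.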

Your drive is currently in category 4 of those listed in this post
