Preclear.sh results - Questions about your results? Post them here.



Well, I stopped the current run of preclear and went into the BIOS to check and this is what it shows (* = current setting)

 

OnChip SATA Channel - Enabled*
                      Disabled

OnChip SATA Type - Native IDE*
                   RAID
                   AHCI
                   Legacy IDE
                   IDE->AHCI

SATA IDE Combined Mode - Enabled*
                         Disabled

SATA-III Mode - Auto*
                Force Max Gen2

You want AHCI, not "Native IDE". You'll know you have it right when all the disks show up as /dev/sdX.

Hi, I just ran preclear twice on a disk that I am a bit suspicious of (it may have had a hit or two before I started using it). Here are the results of the two runs: same disk, run twice in a row without a reboot. I am aware that my system runs a bit hot and I am going to look into it (but most of the time the disks are spun down). I see a bunch of Hardware_ECC_Recovered in the first run. Could that be physical damage that has now been recovered? Should I run it more times to look for changes?

 

/Lars Olof

 

========================================================================1.5
== invoked as: ./preclear_disk.sh /dev/sdc
==  ST31000340AS    9QJ13D2F
== Disk /dev/sdc has been successfully precleared
== with a starting sector of 64
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time  : 3:57:16 (70 MB/s)
== Last Cycle's Zeroing time   : 3:24:03 (81 MB/s)
== Last Cycle's Post Read Time : 7:17:10 (38 MB/s)
== Last Cycle's Total Time     : 14:39:30
==
== Total Elapsed Time 14:39:30
==
== Disk Start Temperature: 36C
==
== Current Disk Temperature: -->43<--C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdc  /tmp/smart_finish_sdc
               ATTRIBUTE   NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS      RAW_VALUE
     Raw_Read_Error_Rate =   119     115            6        ok          234687296
        Spin_Retry_Count =   100     100           97        near_thresh 4
        End-to-End_Error =   100     100           99        near_thresh 0
 Airflow_Temperature_Cel =    57      64           45        In_the_past 43
     Temperature_Celsius =    43      36            0        ok          43
  Hardware_ECC_Recovered =    51      26            0        ok          234687296
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
   the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
   the number of sectors re-allocated did not change.
============================================================================
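
For anyone wanting to sanity-check the MB/s figures in a report like this, here is a minimal Python sketch. It assumes the ST31000340AS exposes 1,000,204,886,016 bytes (the usual capacity of a 1 TB drive, not something the report states), that 1 MB means 10^6 bytes, and that the script truncates rather than rounds. The helper names are made up for illustration.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string such as '3:57:16' to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def mb_per_second(total_bytes: int, elapsed: str) -> int:
    """Whole megabytes (10**6 bytes) per second, truncated."""
    return total_bytes // hms_to_seconds(elapsed) // 1_000_000


# Assumed capacity of a 1 TB drive; not taken from the report itself.
DISK_BYTES = 1_000_204_886_016

for phase, elapsed in [("pre-read", "3:57:16"),
                       ("zeroing", "3:24:03"),
                       ("post-read", "7:17:10")]:
    print(phase, mb_per_second(DISK_BYTES, elapsed), "MB/s")
```

Under those assumptions the three phases come out at 70, 81 and 38 MB/s, which agrees with the figures printed in the report.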

 

========================================================================1.5
== invoked as: ./preclear_disk.sh /dev/sdc
==  ST31000340AS    9QJ13D2F
== Disk /dev/sdc has been successfully precleared
== with a starting sector of 64
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time  : 3:32:13 (78 MB/s)
== Last Cycle's Zeroing time   : 3:24:03 (81 MB/s)
== Last Cycle's Post Read Time : 7:18:57 (37 MB/s)
== Last Cycle's Total Time     : 14:16:16
==
== Total Elapsed Time 14:16:16
==
== Disk Start Temperature: 43C
==
== Current Disk Temperature: -->42<--C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdc  /tmp/smart_finish_sdc
               ATTRIBUTE   NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS      RAW_VALUE
     Raw_Read_Error_Rate =   116     119            6        ok          102094578
        Spin_Retry_Count =   100     100           97        near_thresh 4
        End-to-End_Error =   100     100           99        near_thresh 0
 Airflow_Temperature_Cel =    58      57           45        In_the_past 42
     Temperature_Celsius =    42      43            0        ok          42
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
   the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
   the number of sectors re-allocated did not change.
============================================================================

 

 


Changing it to AHCI made the system not boot.

 

Went back into BIOS and saw all drives reading as IDE and the only drive I could select to boot from was one of the data drives and no longer the USB.

 

I am beginning to hate this board.

 

That sounds like the problem I had. Double check the boot device is still your USB drive. It may have changed.


Well, I stopped the current run of preclear and went into the BIOS to check and this is what it shows (* = current setting)

 

OnChip SATA Channel - Enabled*
                      Disabled
This setting is fine.

OnChip SATA Type - Native IDE*
                   RAID
                   AHCI
                   Legacy IDE
                   IDE->AHCI
This should be AHCI.

SATA IDE Combined Mode - Enabled*
                         Disabled
This should be disabled.


Thanks guys. Starting preclear.... again.

 

If you want to avoid stopping your jobs (and also run several at the same time), you should look into the "screen" package in unMENU. It lets you start virtual sessions and keep them running even if your telnet connection drops for some reason. Just saying...

 

/Lars Olof


Unfortunately, even "screen" will not help when you are changing BIOS options.

I figure some of you might get a kick out of this...

 

I purchased a 2TB drive just before Christmas and had never un-packed it.    I did not have a spare port in the server to connect it to, and I did not need the space.  I just put the disk on the shelf as a spare.  It is/was a 2TB ST32000542AS drive...

 

Recently I ordered some very cheap 2-port SATA disk controllers from eBay. They arrived from China the other day, and I finally installed the 2TB drive in my newer server.   The good news: the 2-port disk controller seems to work just fine, and 6 ports for under $25 is a pretty good deal.

It connects at 3Gb/s.

Feb 25 16:11:56 Tower2 kernel: ata9: SATA max UDMA/133 abar m8192@0xfe9fe000 port 0xfe9fe100 irq 16
Feb 25 16:11:56 Tower2 kernel: ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Feb 25 16:11:56 Tower2 kernel: ata9.00: ATA-8: ST32000542AS, CC34, max UDMA/133
Feb 25 16:11:56 Tower2 kernel: ata9.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)
Feb 25 16:11:56 Tower2 kernel: ata9.00: configured for UDMA/133

 

The bad news... the Seagate disk is not doing so well in its initial pre-clear.

 

In the syslog is a constant series of messages like this:

Feb 25 18:55:09 Tower2 kernel: ata9.00: error: { UNC }
Feb 25 18:55:09 Tower2 kernel: ata9.00: configured for UDMA/133
Feb 25 18:55:09 Tower2 kernel: ata9: EH complete
Feb 25 18:55:13 Tower2 kernel: ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 25 18:55:13 Tower2 kernel: ata9.00: irq_stat 0x48000000
Feb 25 18:55:13 Tower2 kernel: ata9.00: failed command: READ FPDMA QUEUED
Feb 25 18:55:13 Tower2 kernel: ata9.00: cmd 60/08:00:b8:93:ad/00:00:00:00:00/40 tag 0 ncq 4096 in
Feb 25 18:55:13 Tower2 kernel:          res 41/40:08:b8:93:ad/00:00:00:00:00/00 Emask 0x409 (media error) <F>
Feb 25 18:55:13 Tower2 kernel: ata9.00: status: { DRDY ERR }
Feb 25 18:55:13 Tower2 kernel: ata9.00: error: { UNC }
Feb 25 18:55:13 Tower2 kernel: ata9.00: configured for UDMA/133
Feb 25 18:55:13 Tower2 kernel: ata9: EH complete
Feb 25 18:55:17 Tower2 kernel: ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 25 18:55:17 Tower2 kernel: ata9.00: irq_stat 0x48000000
Feb 25 18:55:17 Tower2 kernel: ata9.00: failed command: READ FPDMA QUEUED
Feb 25 18:55:17 Tower2 kernel: ata9.00: cmd 60/08:00:b8:93:ad/00:00:00:00:00/40 tag 0 ncq 4096 in
Feb 25 18:55:17 Tower2 kernel:          res 41/40:08:b8:93:ad/00:00:00:00:00/00 Emask 0x409 (media error) <F>
Feb 25 18:55:17 Tower2 kernel: ata9.00: status: { DRDY ERR }
Feb 25 18:55:17 Tower2 kernel: ata9.00: error: { UNC }
Feb 25 18:55:17 Tower2 kernel: ata9.00: configured for UDMA/133
Feb 25 18:55:17 Tower2 kernel: ata9: EH complete
Feb 25 18:55:20 Tower2 kernel: ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 25 18:55:20 Tower2 kernel: ata9.00: irq_stat 0x48000000
Feb 25 18:55:20 Tower2 kernel: ata9.00: failed command: READ FPDMA QUEUED
Feb 25 18:55:20 Tower2 kernel: ata9.00: cmd 60/08:00:b8:93:ad/00:00:00:00:00/40 tag 0 ncq 4096 in
Feb 25 18:55:20 Tower2 kernel:          res 41/40:08:b8:93:ad/00:00:00:00:00/00 Emask 0x409 (media error) <F>
Feb 25 18:55:20 Tower2 kernel: ata9.00: status: { DRDY ERR }
Feb 25 18:55:20 Tower2 kernel: ata9.00: error: { UNC }
Feb 25 18:55:20 Tower2 kernel: ata9.00: configured for UDMA/133
Feb 25 18:55:20 Tower2 kernel: sd 9:0:0:0: [sdj] Unhandled sense code
Feb 25 18:55:20 Tower2 kernel: sd 9:0:0:0: [sdj]  Result: hostbyte=0x00 driverbyte=0x08
Feb 25 18:55:20 Tower2 kernel: sd 9:0:0:0: [sdj]  Sense Key : 0x3 [current] [descriptor]
Feb 25 18:55:20 Tower2 kernel: Descriptor sense data with sense descriptors (in hex):
Feb 25 18:55:20 Tower2 kernel:         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Feb 25 18:55:20 Tower2 kernel:         00 ad 93 b8
Feb 25 18:55:20 Tower2 kernel: sd 9:0:0:0: [sdj]  ASC=0x11 ASCQ=0x4
Feb 25 18:55:20 Tower2 kernel: sd 9:0:0:0: [sdj] CDB: cdb[0]=0x28: 28 00 00 ad 93 b8 00 00 08 00
Feb 25 18:55:20 Tower2 kernel: end_request: I/O error, dev sdj, sector 11375544
Feb 25 18:55:20 Tower2 kernel: Buffer I/O error on device sdj, logical block 1421943
Feb 25 18:55:20 Tower2 kernel: ata9: EH complete
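
The hex bytes in the "cmd" lines and the sector number in the final I/O error describe the same spot on the disk. Here is a small Python sketch of the conversion, assuming the kernel prints the taskfile as command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device (my reading of the libata error output, so treat the field order as an assumption). The function name is made up for illustration.

```python
def lba_from_libata_cmd(cmd: str) -> int:
    """Extract the 48-bit LBA from a libata 'cmd' taskfile dump.

    Assumed layout (not confirmed by the thread):
    command/feature:nsect:lbal:lbam:lbah/hob_*:...:hob_lbah/device
    """
    _command, low, high, _device = cmd.split("/")
    _feature, _nsect, lbal, lbam, lbah = low.split(":")
    _hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = high.split(":")
    # Most significant byte first.
    hex_digits = hob_lbah + hob_lbam + hob_lbal + lbah + lbam + lbal
    return int(hex_digits, 16)


lba = lba_from_libata_cmd("60/08:00:b8:93:ad/00:00:00:00:00/40")
print(lba)       # 512-byte sector number
print(lba // 8)  # 4096-byte block number (8 sectors per block)
```

With the bytes from the log this gives sector 11375544 and block 1421943, the same numbers the kernel reports in the "end_request" and "Buffer I/O error" lines, so every retry in the excerpt is hitting the same unreadable sector.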

 

The initial SMART report looked like this (note: 0 Power_On_Hours and no re-allocated sectors, although the table already shows 20 sectors pending re-allocation):

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 1 Raw_Read_Error_Rate     0x000f   100   100   006    Pre-fail  Always       -       28166
 3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
 4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       4
 5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
 7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       74
 9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       0
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       4
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   253   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   068   068   045    Old_age   Always       -       32 (Lifetime Min/Max 26/32)
194 Temperature_Celsius     0x0022   032   040   000    Old_age   Always       -       32 (0 26 0 0)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       28166
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       20
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       20
199 UDMA_CRC_Error_Count    0x003e   200   253   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       141815525146627
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       0
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       2407

 

 

The disk has been pre-clearing for about 2 hours now...  The smart report shows this:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 1 Raw_Read_Error_Rate     0x000f   072   070   006    Pre-fail  Always       -       9462713
 3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
 4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       4
 5 Reallocated_Sector_Ct   0x0033   094   094   036    Pre-fail  Always       -       282
 7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       4727
 9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       2
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       4
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       2204
188 Command_Timeout         0x0032   099   099   000    Old_age   Always       -       8590065666
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   069   064   045    Old_age   Always       -       31 (Lifetime Min/Max 26/36)
194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31 (0 26 0 0)
195 Hardware_ECC_Recovered  0x001a   048   046   000    Old_age   Always       -       9462713
197 Current_Pending_Sector  0x0012   095   095   000    Old_age   Always       -       209
198 Offline_Uncorrectable   0x0010   095   095   000    Old_age   Offline      -       209
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       12854837116933
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       0
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3544051280
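
Comparing two reports like these by eye is error-prone, so here is a rough Python sketch that pulls the raw values out of attribute rows and shows what changed. It assumes the ten-column layout shown in the tables above and takes only the first token of RAW_VALUE. The function names are hypothetical, and the sample rows are copied from the two reports.

```python
def parse_smart_rows(report: str) -> dict:
    """Map attribute name -> raw value from smartctl-style attribute rows."""
    rows = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            rows[fields[1]] = int(fields[9])
    return rows


def raw_deltas(before: str, after: str) -> dict:
    """Return the change in raw value for every attribute that moved."""
    old, new = parse_smart_rows(before), parse_smart_rows(after)
    return {name: new[name] - old[name]
            for name in new if name in old and new[name] != old[name]}


before = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       20
"""
after = """\
  5 Reallocated_Sector_Ct   0x0033   094   094   036    Pre-fail  Always       -       282
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       2204
197 Current_Pending_Sector  0x0012   095   095   000    Old_age   Always       -       209
"""

print(raw_deltas(before, after))
```

For these three rows the deltas are 282 re-allocated sectors, 2204 reported uncorrectable errors and 189 additional pending sectors in roughly two power-on hours.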

 

What I find interesting is not that there are 209 sectors pending re-allocation, but that 282 have already been re-allocated?

 

I cannot figure out how that has happened, since I've not written to the disk at all.  The pre-read has not even read 1% of the disk.

================================================================== 1.7
=                unRAID server Pre-Clear disk /dev/sdj
=               cycle 1 of 1, partition start on sector 63
= Disk Pre-Read in progress: 0% complete
= ( 4,935,168,000  bytes of  2,000,398,934,016  read )
=
Disk Temperature: 31C, Elapsed Time:  2:16:22
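
To put that progress display in perspective, here is a quick back-of-the-envelope calculation in Python using only the numbers shown above. It assumes the read rate stays constant, which a failing drive will not honour, so take the projection as an illustration rather than a prediction.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string such as '2:16:22' to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


bytes_read = 4_935_168_000
bytes_total = 2_000_398_934_016
elapsed = hms_to_seconds("2:16:22")

percent_done = 100 * bytes_read / bytes_total
rate_mb_s = bytes_read / elapsed / 1_000_000
# Constant-rate assumption: total time = total bytes / observed bytes per second.
projected_days = bytes_total / (bytes_read / elapsed) / 86_400

print(f"{percent_done:.2f}% done at {rate_mb_s:.2f} MB/s")
print(f"a full pre-read at this rate would take about {projected_days:.0f} days")
```

That works out to about a quarter of one percent read at roughly 0.6 MB/s, or something like 38 days for the pre-read alone, compared with the several hours a healthy drive of this size would be expected to need.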

 

I've got nothing to lose if I let the preclear run its course.  One thing is for sure: by the time it is done, I'll be looking at an RMA.

My big mistake was putting the disk on the shelf rather than installing it right away; so much for returning it to Newegg.   Now it will have to go back to Seagate...

 

In the time it has taken me to write this post I now have

  5 Reallocated_Sector_Ct   0x0033   092   092   036    Pre-fail  Always       -       333
197 Current_Pending_Sector  0x0012   095   095   000    Old_age   Always       -       238
198 Offline_Uncorrectable   0x0010   095   095   000    Old_age   Offline      -       238

 

I wish I knew how it is figuring out what to put in the sectors it has already re-allocated.  I've still never written to the disk at all.  It is still less than 1% through the pre-read having read about 6Gig out of the 2000Gig.

= Disk Pre-Read in progress: 0% complete

= ( 6,580,224,000  bytes of  2,000,398,934,016  read )

 

Ouch...

 

Edit:  It has been 10 hours.  It is now 1% through the pre-read.  I've still not written to the drive at all....

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   072   070   006    Pre-fail  Always       -       38629196
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       4
  5 Reallocated_Sector_Ct   0x0033   065   065   036    Pre-fail  Always       -       1440
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       18372
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       10
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       4
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       8932
188 Command_Timeout         0x0032   099   099   000    Old_age   Always       -       30065229831
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   073   064   045    Old_age   Always       -       27 (Lifetime Min/Max 26/36)
194 Temperature_Celsius     0x0022   027   040   000    Old_age   Always       -       27 (0 26 0 0)
195 Hardware_ECC_Recovered  0x001a   048   046   000    Old_age   Always       -       38629196
197 Current_Pending_Sector  0x0012   081   081   000    Old_age   Always       -       794
198 Offline_Uncorrectable   0x0010   081   081   000    Old_age   Offline      -       794
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       66559108186125
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       0
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3572622799

 

Still have no idea how it is re-allocating sectors...  it cannot know what to put in the re-allocated sector, no writes have been made to the drive at all.

 

Joe L.


Joe.

 

Excuse an amateur in the field of disk farming, but is this not the time to stop this run and put the disk in another known good slot for testing? You are running a new unknown disk on a new unknown controller with new unknown cabling. Isn't it better to have at least something known good in the equation? Sorry if I didn't read it thoroughly and you already did it.

 

/Lars Olof


larson might have a point.

Although I could not think of a practical solution, I have a possible conspiracy-theory explanation: is it possible that this drive was actually used before, returned, and then sent out again as new? That would explain your problems...

I would doubt it had been used before I put it in service since the initial SMART report showed zero hours run-time.

Now, it might have been drop-kicked by the shipping company on its trip from Newegg to my house.  Used, no... abused, possibly.

 

It is now at 15 hours of run-time.  I do not know if anybody has seen a disk slowly fail like this, so I'm just letting it run.  It is still VERY SLOWLY reading the disk.  After 15 hours it is still only at 1% in the pre-read.

 

I figure in a few more hours the re-allocated sectors will have reached the failure threshold and the disk will be considered as failing SMART tests.

 

As far as suspecting the new disk controller card. No... I do not suspect it at all, nothing on it would cause the SMART data and everything else to continuously report media errors.    Cable problems would show as timeouts, or ICRC errors.

 

Once the SMART data shows FAILING_NOW I'll swap cables around, but for now, as far as I can see, the disk controller is working exactly as it should.  (Oh yes, the disk made a lot of clicking noises when I first powered the server on after installing it... even before it was being accessed by the Linux OS.  I had a bad feeling even then.)

 

Joe L.

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   071   070   006    Pre-fail  Always       -       57557914
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       4
  5 Reallocated_Sector_Ct   0x0033   046   046   036    Pre-fail  Always       -       2238
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       27314
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       15
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       4
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       13219
188 Command_Timeout         0x0032   100   098   000    Old_age   Always       -       42950328330
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   073   064   045    Old_age   Always       -       27 (Lifetime Min/Max 26/36)
194 Temperature_Celsius     0x0022   027   040   000    Old_age   Always       -       27 (0 26 0 0)
195 Hardware_ECC_Recovered  0x001a   048   046   000    Old_age   Always       -       57557914
197 Current_Pending_Sector  0x0012   072   072   000    Old_age   Always       -       1166
198 Offline_Uncorrectable   0x0010   072   072   000    Old_age   Offline      -       1166
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       209435490254866
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       0
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3591162722
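
The "few more hours" guess above can be checked with a simple straight-line extrapolation. This Python sketch uses the normalized Reallocated_Sector_Ct values from the reports posted so far (100 at 0 hours, 94 at 2, 65 at 10, 46 at 15) and the threshold of 36. The linear model and the function name are my own assumptions; nothing says the drive degrades linearly.

```python
def hours_until_threshold(samples, threshold):
    """Linearly extrapolate from the last two (hours, value) samples."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    slope = (v1 - v0) / (t1 - t0)  # normalized points per hour (negative here)
    return t1 + (threshold - v1) / slope


# (Power_On_Hours, normalized Reallocated_Sector_Ct) from the posted reports.
samples = [(0, 100), (2, 94), (10, 65), (15, 46)]

print(round(hours_until_threshold(samples, 36), 1))
```

The last two samples give a slope of -3.8 points per hour, which puts the crossing at about 17.6 power-on hours.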

 

 


The disk finally died... Interestingly, it stopped responding to SMART reports, or just about anything.

 

I powered down, swapped it to a different disk controller, (for those who suspected it might be a disk controller thing) and after powering up got this smart report:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   072   070   006    Pre-fail  Always       -       66340428
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       5
  5 Reallocated_Sector_Ct   0x0033   036   036   036    Pre-fail  Always   FAILING_NOW 2625
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       31627
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       18
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       5
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       15308
188 Command_Timeout         0x0032   099   098   000    Old_age   Always       -       60130459662
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   072   064   045    Old_age   Always       -       28 (Lifetime Min/Max 28/28)
194 Temperature_Celsius     0x0022   028   040   000    Old_age   Always       -       28 (0 26 0 0)
195 Hardware_ECC_Recovered  0x001a   049   046   000    Old_age   Always       -       66340428
197 Current_Pending_Sector  0x0012   068   068   000    Old_age   Always       -       1350
198 Offline_Uncorrectable   0x0010   068   068   000    Old_age   Offline      -       1350
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       10423885627416
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       0
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3599756433

 

I don't like that it re-allocated the sectors without them being written to... To me, that is a mistake in the disk's firmware.

I'll just go through the RMA process now.

 

Joe L.


Come to think of it, isn't that supposed to be one of the Seagate drives that have to have their firmware updated because of the "click of death" issue? If it came with the old firmware, that might be your problem.

 

I hope you now understand how important and useful preclear is, Joe (just kidding)


I don't like that it re-allocated the sectors without them being written to... To me, that is a mistake in the disk's firmware.

 

I believe the drive is supposed to do pre-emptive reallocations when it has trouble reading data but is ultimately successful.  Without such a feature, a parity check would almost never result in a reallocated sector.  Am I missing something?


Yes, a "feature" of the "md" driver is to re-construct the failed "read" (by use of the other disks in the array), supply it to the OS program requesting the data, and then write the same sector back to the disk.  It is that "write" that allows the sector pending-reallocation to be re-allocated.

 

The block of code is in unraid.c

        /* If we're trying to read a failed disk, then we must read
         * parity and all the "other" disks and compute it.
         */
        if ((col->read_bi || (failed && sh->col[failed_num].read_bi)) &&
            !buff_uptodate(col) && !buff_locked(col)) {
                if (disk_valid( col)) {
                        dprintk("Reading col %d (sync=%d)\n", i, syncing);
                        set_buff_locked( col);
                        locked++;
                        set_bit(MD_BUFF_READ, &col->state);
                }
                else if (uptodate == disks-1) {
                        dprintk("Computing col %d\n", i);
                        compute_block(sh, i); /* also sets it Uptodate */
                        uptodate++;

                        /* if failed disk is enabled, write it */
                        if (disk_enabled( col)) {
                                dprintk("Writing reconstructed failed col %d\n", i);
                                set_buff_locked( col);
                                locked++;
                                set_bit(MD_BUFF_WRITE, &col->state);
                        }

                        /* this stripe is also now in-sync */
                        if (syncing)
                                set_bit(STRIPE_INSYNC, &sh->state);
                }
        }

 

Without a subsequent write, or a successful read, I see no way a sector can be re-allocated with the correct contents.
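
For readers who have not seen how the array can supply the missing data, here is a toy Python sketch of the reconstruct-then-write-back idea. It assumes plain XOR (single-parity) reconstruction; the byte values are made up and the write-back is modelled as a list assignment. It is an illustration of the principle, not the unraid.c code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three made-up data "disks" holding one sector each, plus their parity.
disks = [b"\x11\x22\x33\x44", b"\xaa\xbb\xcc\xdd", b"\x01\x02\x03\x04"]
parity = xor_blocks(disks)

# Pretend the read of disk 1 failed: rebuild its sector from the others.
failed = 1
original = disks[failed]
survivors = [block for index, block in enumerate(disks) if index != failed]
rebuilt = xor_blocks(survivors + [parity])

# Writing the rebuilt sector back is the step that hands the drive
# known-good contents for a sector it could not read.
disks[failed] = rebuilt
print(rebuilt == original)
```

A stand-alone disk being pre-read has no parity or sibling disks to draw on, which is why the re-allocations seen during the pre-read above are puzzling.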

 

 


Hi, I have just run preclear for the first time on a couple of new 2TB drives. I had them running simultaneously, and the reports differ a fair bit, especially the G-sense value; the drive must have been dropped at some point by the looks of it? It still says it passed, though.

 

Anyway, here's a copy-paste from unMENU:

 

drive 1 - 26hrs 39mins to clear

 

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       8
 2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
 3 Spin_Up_Time            0x0023   068   068   025    Pre-fail  Always       -       9971
 4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       3
 5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
 7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
 8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
 9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       33
10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3
181 Program_Fail_Cnt_Total  0x0022   100   100   000    Old_age   Always       -       154883
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       29
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   050   000    Old_age   Always       -       30 (Lifetime Min/Max 25/50)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       7
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       3

 

drive 2 - 27hrs 18mins to clear

 

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       52
 2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
 3 Spin_Up_Time            0x0023   068   068   025    Pre-fail  Always       -       9927
 4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       3
 5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
 7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
 8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
 9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       33
10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3
181 Program_Fail_Cnt_Total  0x0022   100   100   000    Old_age   Always       -       30707
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       254
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   054   000    Old_age   Always       -       30 (Lifetime Min/Max 24/46)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       13
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       3
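
Comparing two reports like these by eye is error-prone, so here is a minimal Python sketch that parses the pasted attribute rows and lists only the raw values that differ between the two drives. The function names (`parse_attributes`, `diff_raw_values`) and the row pattern are illustrative assumptions based on the layout of the tables above, not part of preclear or unMENU; only three rows per drive are copied in to keep it short.

```python
import re

# One row of the attribute table as pasted above:
# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value.
# The pattern is an assumption derived from that layout.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(\d+)"
)


def parse_attributes(text):
    """Return {attribute_name: raw_value} for every table row found in text."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attrs[m.group(2)] = int(m.group(6))
    return attrs


def diff_raw_values(a, b):
    """Return {name: (raw_a, raw_b)} for attributes whose raw values differ."""
    return {n: (a[n], b[n]) for n in a if n in b and a[n] != b[n]}


# Three rows copied from each drive's report above.
DRIVE1 = """
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       29
194 Temperature_Celsius     0x0002   064   050   000    Old_age   Always       -       30 (Lifetime Min/Max 25/50)
"""
DRIVE2 = """
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       254
194 Temperature_Celsius     0x0002   064   054   000    Old_age   Always       -       30 (Lifetime Min/Max 24/46)
"""

if __name__ == "__main__":
    d1 = parse_attributes(DRIVE1)
    d2 = parse_attributes(DRIVE2)
    for name, (r1, r2) in diff_raw_values(d1, d2).items():
        print(f"{name}: drive 1 = {r1}, drive 2 = {r2}")
```

With the three sample rows, only G-Sense_Error_Rate is reported (29 versus 254). Pasting the full tables into `DRIVE1` and `DRIVE2` should work the same way, as long as the rows keep this column layout.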

Link to comment

If these are the results I get after two preclears on a new 2 TB EARS drive, should I RMA it?

 

** Changed attributes in files: /tmp/smart_start_sdb  /tmp/smart_finish_sdb
                ATTRIBUTE   NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS      RAW_VALUE
      Raw_Read_Error_Rate =   143     100           51        ok          5916
      Temperature_Celsius =   132     133            0        ok          18
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
    the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
    the number of sectors re-allocated did not change.

 

** Changed attributes in files: /tmp/smart_start_sdg  /tmp/smart_finish_sdg
                ATTRIBUTE   NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS      RAW_VALUE
      Raw_Read_Error_Rate =   166     143           51        ok          7198
          Seek_Error_Rate =   100     200            0        ok          0
      Temperature_Celsius =   132     131            0        ok          18
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
    the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
    the number of sectors re-allocated did not change.

 

Thanks.
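
For anyone who wants to check the sector counts across several runs without reading each report, here is a rough Python sketch that pulls the "pending re-allocation" and "re-allocated" counts at the start and end of a preclear report and computes the change. The patterns assume the exact wording shown in the reports above; other preclear versions may word these lines differently, and the function names (`sector_counts`, `changes`) are made up for illustration.

```python
import re

# Patterns assume the wording of the report lines shown above.
PENDING = re.compile(
    r"^\s*(\d+) sectors (?:were|are) pending re-allocation "
    r"(before the start|at the end) of the preclear"
)
REALLOC = re.compile(
    r"^\s*(\d+) sectors (?:had been|are) re-allocated "
    r"(before the start|at the end) of the preclear"
)


def sector_counts(report):
    """Return {(kind, when): count}, kind in pending/reallocated, when in start/end."""
    counts = {}
    for line in report.splitlines():
        for kind, pattern in (("pending", PENDING), ("reallocated", REALLOC)):
            m = pattern.match(line)
            if m:
                when = "start" if m.group(2) == "before the start" else "end"
                counts[(kind, when)] = int(m.group(1))
    return counts


def changes(counts):
    """Return {kind: end - start} for kinds where both counts were found."""
    out = {}
    for kind in ("pending", "reallocated"):
        if (kind, "start") in counts and (kind, "end") in counts:
            out[kind] = counts[(kind, "end")] - counts[(kind, "start")]
    return out


# Summary lines copied from the report above.
REPORT = """
0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
    the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
    the number of sectors re-allocated did not change.
"""

if __name__ == "__main__":
    for kind, delta in changes(sector_counts(REPORT)).items():
        print(f"{kind}: change of {delta} sectors during the preclear")
```

The intermediate "after pre-read" and "after zero" lines are deliberately skipped; only the start and end counts are compared, which matches the summary the report itself prints.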

Link to comment

I have a Seagate 1.5TB drive that I was using in my DirecTV HD DVR.  The video was freezing frequently.  I replaced it with an EVDS drive, and there have been no more freezing issues.

 

I ran it through preclear and got the following results.

 

Disk Temperature: 31C, Elapsed Time:  18:35:57

========================================================================1.7
==  ST31500341AS    9VS09RZ8
== Disk /dev/sdm has been successfully precleared
== with a starting sector of 63
============================================================================
** Changed attributes in files: /tmp/smart_start_sdm  /tmp/smart_finish_sdm
                ATTRIBUTE   NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS      RAW_VALUE
      Raw_Read_Error_Rate =   102     115            6        ok          3967947
         Spin_Retry_Count =   100     100           97        near_thresh 2
         End-to-End_Error =   100     100           99        near_thresh 0
          High_Fly_Writes =     1       1            0        near_thresh 284
  Airflow_Temperature_Cel =    69      75           45        near_thresh 31
      Temperature_Celsius =    31      25            0        ok          31
   Hardware_ECC_Recovered =    49      22            0        ok          3967947
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
    the number of sectors pending re-allocation did not change.
44 sectors had been re-allocated before the start of the preclear.
44 sectors are re-allocated at the end of the preclear,
    the number of sectors re-allocated did not change.

 

Is this drive ok?  I am not planning to use it in my array.  HTPC perhaps.

 

 

Link to comment

Looks OK to me.
Link to comment
