
Slow Pre-Clear



Posted

I recently ran a pre-clear on a WD20EARX 2 TB drive and it took around 25-30 hours.

 

I just received a replacement WD20EARS 2 TB drive that I am currently running a preclear on.  It's been 70 hours, and the Post-Read is at 12% complete, running at 2.4 MB/s.
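
(For scale, assuming the post-read were to stay at that 2.4 MB/s: 2,000,398,934,016 bytes is roughly 2,000,400 MB, and 2,000,400 MB ÷ 2.4 MB/s ≈ 833,500 s, or about 231 hours / 9.6 days for the post-read pass alone, versus roughly 5-6 hours at a healthy ~100 MB/s.)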

 

Is this normal?

 

Here's a SMART report I ran while the drive was still going through the pre-clear:

smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)

Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

 

=== START OF INFORMATION SECTION ===

Device Model:    WDC WD20EARS-60MVWB0

Serial Number:    WD-WCAZAC335127

Firmware Version: 51.0AB51

User Capacity:    2,000,398,934,016 bytes

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:  8

ATA Standard is:  Exact ATA specification draft version not indicated

Local Time is:    Mon Jan  9 15:33:21 2012 Local time zone must be set--see zic m

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x80) Offline data collection activity

was never started.

Auto Offline Data Collection: Enabled.

Self-test execution status:      (  0) The previous self-test routine completed

without error or no self-test has ever

been run.

Total time to complete Offline

data collection: (40200) seconds.

Offline data collection

capabilities: (0x5b) SMART execute Offline immediate.

Auto Offline data collection on/off support.

Suspend Offline collection upon new

command.

Offline surface scan supported.

Self-test supported.

No Conveyance Self-test supported.

Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

power-saving mode.

Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

General Purpose Logging supported.

Short self-test routine

recommended polling time: (  2) minutes.

Extended self-test routine

recommended polling time: ( 255) minutes.

SCT capabilities:       (0x303d) SCT Status supported.

SCT Feature Control supported.

SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate    0x002f  200  200  051    Pre-fail  Always      -      0

  3 Spin_Up_Time            0x0027  100  253  021    Pre-fail  Always      -      0

  4 Start_Stop_Count        0x0032  100  100  000    Old_age  Always      -      5

  5 Reallocated_Sector_Ct  0x0033  200  200  140    Pre-fail  Always      -      0

  7 Seek_Error_Rate        0x002f  100  253  051    Pre-fail  Always      -      0

  9 Power_On_Hours          0x0032  100  100  000    Old_age  Always      -      71

10 Spin_Retry_Count        0x0033  100  253  051    Pre-fail  Always      -      0

11 Calibration_Retry_Count 0x0032  100  253  000    Old_age  Always      -      0

12 Power_Cycle_Count      0x0032  100  100  000    Old_age  Always      -      4

184 End-to-End_Error        0x0033  100  100  097    Pre-fail  Always      -      0

187 Reported_Uncorrect      0x0032  100  100  000    Old_age  Always      -      0

188 Command_Timeout        0x0032  100  100  000    Old_age  Always      -      0

190 Airflow_Temperature_Cel 0x0022  075  070  040    Old_age  Always      -      25 (Lifetime Min/Max 23/30)

192 Power-Off_Retract_Count 0x0032  200  200  000    Old_age  Always      -      2

193 Load_Cycle_Count        0x0032  200  200  000    Old_age  Always      -      2

196 Reallocated_Event_Count 0x0032  200  200  000    Old_age  Always      -      0

197 Current_Pending_Sector  0x0032  200  200  000    Old_age  Always      -      0

198 Offline_Uncorrectable  0x0030  100  253  000    Old_age  Offline      -      0

199 UDMA_CRC_Error_Count    0x0032  200  200  000    Old_age  Always      -      0

200 Multi_Zone_Error_Rate  0x0008  100  253  000    Old_age  Offline      -      0

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.
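
For anyone who wants to pull the same kind of report, a minimal sketch of the smartctl commands involved (the /dev/sdd device name is an assumption taken from the syslog later in the thread; substitute whatever device unRAID shows for this disk):

smartctl -a /dev/sdd          # full report: identity, attributes, error log, self-test log
smartctl -t short /dev/sdd    # queue a short self-test (about 2 minutes on this drive)
smartctl -t long /dev/sdd     # queue an extended self-test (several hours)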

 

 

Posted

I found these errors in the syslog for the disk - note the lines tagged (Errors).  Does anyone know what these mean?

 

Jan  6 16:46:00 MLDataServer kernel: ------------[ cut here ]------------ (Drive related)

Jan  6 16:46:00 MLDataServer kernel: ---[ end trace 4eaa2a86a8e2da22 ]--- (Drive related)

Jan  6 16:46:00 MLDataServer kernel: scsi4 : ahci (Drive related)

Jan  6 16:46:00 MLDataServer kernel: ata4: SATA max UDMA/133 abar m1024@0xfeb4f000 port 0xfeb4f280 irq 19 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related)

Jan  6 16:46:00 MLDataServer kernel: ata4.00: ATA-8: WDC WD20EARS-60MVWB0, 51.0AB51, max UDMA/100 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: ata4.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA (Drive related)

Jan  6 16:46:00 MLDataServer kernel: ata4.00: configured for UDMA/100 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: scsi 4:0:0:0: Direct-Access    ATA      WDC WD20EARS-60M 51.0 PQ: 0 ANSI: 5 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: sd 4:0:0:0: [sdd] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB) (Drive related)

Jan  6 16:46:00 MLDataServer kernel: sd 4:0:0:0: [sdd] 4096-byte physical blocks (Drive related)

Jan  6 16:46:00 MLDataServer kernel: sd 4:0:0:0: [sdd] Write Protect is off (Drive related)

Jan  6 16:46:00 MLDataServer kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA (Drive related)

Jan  6 16:46:00 MLDataServer kernel:  sdd: (Drive related)

Jan  6 16:46:00 MLDataServer kernel: sd 4:0:0:0: [sdd] Attached SCSI disk (Drive related)

Jan  6 16:46:00 MLDataServer emhttp: pci-0000:00:11.0-scsi-3:0:0:0 host4 (sdd) WDC_WD20EARS-60MVWB0_WD-WCAZAC335127 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: md: unRAID driver 1.1.1 installed (System)

Jan  6 16:46:00 MLDataServer kernel: md: import disk0: [3,0] (hda) WDC WD20EARS-00S8B1 WD-WCAVY3915117 size: 1953514552 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: md: import disk1: [8,80] (sdf) ST31500541AS    6XW020BT size: 1465138552 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: md: import disk2: [8,96] (sdg) ST31500541AS    6XW00EZS size: 1465138552 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: md: import disk3: [8,112] (sdh) ST31500541AS    6XW020P2 size: 1465138552 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: md: import disk4: [8,128] (sdi) ST31500541AS    6XW00HSV size: 1465138552 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: md: import disk5: [8,16] (sdb) WDC WD20EARX-00P WD-WMAZA5535730 size: 1953514552 (Drive related)

Jan  6 16:46:00 MLDataServer kernel: md: recovery thread woken up ... (Drive related)

Jan  6 16:46:00 MLDataServer kernel: md: recovery thread has nothing to resync (Drive related)

Jan  6 16:56:31 MLDataServer kernel: md: recovery thread woken up ... (Drive related)

Jan  6 16:56:31 MLDataServer kernel: md: recovery thread has nothing to resync (Drive related)

Jan  6 16:57:13 MLDataServer kernel:  sdd: unknown partition table (Drive related)

Jan  7 07:01:56 MLDataServer kernel:  sdd: sdd1 (Drive related)

Jan  9 19:42:02 MLDataServer kernel: ata4.00: exception Emask 0x10 SAct 0x1 SErr 0x40d0202 action 0xe frozen (Errors)

Jan  9 19:42:02 MLDataServer kernel: ata4.00: irq_stat 0x00400040, connection status changed (Drive related)

Jan  9 19:42:02 MLDataServer kernel: ata4: SError: { RecovComm Persist PHYRdyChg CommWake 10B8B DevExch } (Errors)

Jan  9 19:42:02 MLDataServer kernel: ata4.00: failed command: READ FPDMA QUEUED (Minor Issues)

Jan  9 19:42:02 MLDataServer kernel: ata4.00: cmd 60/00:00:38:3e:77/02:00:1e:00:00/40 tag 0 ncq 262144 in (Drive related)

Jan  9 19:42:02 MLDataServer kernel: ata4.00: status: { DRDY } (Drive related)

Jan  9 19:42:02 MLDataServer kernel: ata4: hard resetting link (Minor Issues)

Jan  9 19:42:03 MLDataServer kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related)

Jan  9 19:42:03 MLDataServer kernel: ata4.00: configured for UDMA/100 (Drive related)

Jan  9 19:42:03 MLDataServer kernel: ata4: EH complete (Drive related)

Posted

I've seen this sort of thing twice with brand new WD 2TB drives.  Both times there were no issues in the SMART or syslogs, so I took them back to my dealer and got a replacement.  I think some people have stated that once the first preclear pass was done the second pass ran at normal speed, but that was not the case for me.

 

Regards,

 

Stephen

 

Posted

I tried all sorts of hardware changes and different cables, none of which worked.

 

I noticed a "Disabling IRQ #19" error being reported, which led me to do some more searching.
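
In case it helps anyone chasing the same thing, a couple of quick checks from the unRAID console (a sketch; IRQ 19 is just what showed up in my syslog, yours may differ):

cat /proc/interrupts    # per-IRQ interrupt counts, and which drivers share each line
dmesg | grep -i irq     # look for "Disabling IRQ #19" and the "nobody cared" trace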

 

I went into the BIOS, set the SATA mode to AHCI, and also set the CPU to run at maximum performance.

 

This last Pre-Clear on the same drive is doing much better.  I'm running at 98 MB/s with 24% of the Post-Read complete, and it's about 17 hours in so far.

 

So I'm hoping one of those two things solved the problem.

 

Fingers crossed!  If this one works, I'll try running the 2nd pre-clear on this drive and the Seagate that also had a slow pre-clear.

 

Posted

In my experience the ports should always be set to AHCI for unRAID use.  The AHCI Linux driver seems solid; some of the others might not be so good.

Posted

Doh.  Woke up this morning to the Pre-Clear Post-Read at 77% complete and 2.5 MB/s.

 

I have 9 drives in the system including parity and cache. 

Posted

Well, I'm happy to report that I finally made it through the PreClear at normal speeds after updating the BIOS to the latest version.

 

So hopefully that was it.  I'm running it a 2nd time to see if it makes it through again.

Posted

OK, spoke too soon.

 

The 2nd Pre-Clear on the same drive slowed to 2.4 MB/s after the Post-Read was 30% complete. It was about 30 hours in.

 

So I decided to try a brand new WD20EARX in the same slot.

 

It too suffered the same fate and slowed to 2.4 MB/s during the Post-Read.
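
One way to tell whether it's the drive itself or the controller/IRQ path that's slow is to read the raw device directly, outside of preclear, and compare; a minimal sketch (the device name is an example, and the drive should not be assigned to the array while you test):

hdparm -t /dev/sdd                              # buffered sequential read benchmark
dd if=/dev/sdd of=/dev/null bs=1M count=1024    # read ~1 GiB and report the transfer rate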

 

It seems a lot of users are having similar problems.  Are we sure there isn't something bigger at work here?  Is there another BIOS setting that could need changing?

Or even possibly something with the Pre-Clear software?

 

Can I still use these drives if I just let them make their way through the Pre-Clear?

 

 

Here are some of the errors in the syslog:

 

Jan 16 14:47:33 MLDataServer kernel: ACPI Error: No handler for Region [sACS] (f74c21b8) [PCI_Config] (20090903/evregion-319) (Errors)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error: Region PCI_Config(2) has no handler (20090903/exfldio-295) (Errors)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error (psparse-0537): Method parse/execution failed [\PRID.P_D0._STA] (Node f741e7b0), AE_NOT_EXIST (Minor Issues)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error (uteval-0250): Method execution failed [\PRID.P_D0._STA] (Node f741e7b0), AE_NOT_EXIST (Minor Issues)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error: No handler for Region [sACS] (f74c21b8) [PCI_Config] (20090903/evregion-319) (Errors)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error: Region PCI_Config(2) has no handler (20090903/exfldio-295) (Errors)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error (psparse-0537): Method parse/execution failed [\PRID.P_D1._STA] (Node f741e858), AE_NOT_EXIST (Minor Issues)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error (uteval-0250): Method execution failed [\PRID.P_D1._STA] (Node f741e858), AE_NOT_EXIST (Minor Issues)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error: No handler for Region [sACS] (f74c21b8) [PCI_Config] (20090903/evregion-319) (Errors)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error: Region PCI_Config(2) has no handler (20090903/exfldio-295) (Errors)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error (psparse-0537): Method parse/execution failed [\SECD.S_D0._STA] (Node f741e9c0), AE_NOT_EXIST (Minor Issues)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error (uteval-0250): Method execution failed [\SECD.S_D0._STA] (Node f741e9c0), AE_NOT_EXIST (Minor Issues)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error: No handler for Region [sACS] (f74c21b8) [PCI_Config] (20090903/evregion-319) (Errors)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error: Region PCI_Config(2) has no handler (20090903/exfldio-295) (Errors)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error (psparse-0537): Method parse/execution failed [\SECD.S_D1._STA] (Node f741ea68), AE_NOT_EXIST (Minor Issues)

Jan 16 14:47:33 MLDataServer kernel: ACPI Error (uteval-0250): Method execution failed [\SECD.S_D1._STA] (Node f741ea68), AE_NOT_EXIST (Minor Issues)

 

Jan 17 11:37:13 MLDataServer kernel: irq 19: nobody cared (try booting with the "irqpoll" option) (Errors)

Jan 17 11:37:13 MLDataServer kernel: Pid: 24180, comm: sum Tainted: G        W  2.6.32.9-unRAID #8 (Errors)

Jan 17 11:37:13 MLDataServer kernel: Call Trace: (Errors)

Jan 17 11:37:13 MLDataServer kernel:  [<c10451cf>] __report_bad_irq+0x2e/0x6f (Errors)

Jan 17 11:37:13 MLDataServer kernel:  [<c1045305>] note_interrupt+0xf5/0x13c (Errors)

Jan 17 11:37:13 MLDataServer kernel:  [<c1045a14>] handle_fasteoi_irq+0x5f/0x9d (Errors)

Jan 17 11:37:13 MLDataServer kernel:  [<c1004a82>] handle_irq+0x1a/0x24 (Errors)

Jan 17 11:37:13 MLDataServer kernel:  [<c1004285>] do_IRQ+0x40/0x96 (Errors)

Jan 17 11:37:13 MLDataServer kernel:  [<c1002f29>] common_interrupt+0x29/0x30 (Errors)
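
As the kernel message itself suggests, one workaround to try is booting with the irqpoll option. On unRAID that means adding it to the kernel append line in syslinux.cfg on the flash drive; a hedged sketch, assuming the stock layout (label and file names can vary by version):

label unRAID OS
  kernel bzimage
  append initrd=bzroot irqpoll

Note that irqpoll has a performance cost of its own, so it's more of a diagnostic stopgap than a fix.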

Posted

Your problem has nothing to do with the hard drives themselves.  It has to do with whatever is messing with the IRQs and disabling the IRQ associated with the drives. 

 

In my case, all the drives on the PCIe SATA controller get assigned to IRQ #16, as that is where the BIOS puts the sata_mv Linux driver.  Something happens to disable that IRQ and then performance tanks.  Just like in your case, I go from >100 MB/s preclears to 2.6 MB/s preclears when the IRQ gets disabled.  A BIOS update did nothing for me, but at least (after many BIOS tweaks) the cause in my case seems to be related to switching video inputs on my monitor.  Why that affects IRQs I don't know, but it is now 100% repeatable.
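
If it helps anyone narrow down what else is sitting on the same interrupt line as their SATA controller, a quick sketch:

lspci -v    # each device's "Flags:" line shows the IRQ it's routed to; look for anything else
            # (USB, NIC, audio, ...) landing on the same IRQ as the SATA controller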

Posted

Interesting.  I'm anxious to see what you come up with to solve the problem.  I've already completely rebuilt my system due to other issues and I was hoping it was going to be smooth sailing.

 

Apparently it's extremely difficult to find pieces that play nicely together.

 

As a side note, I've disabled the onboard Realtek NIC in the BIOS.  I didn't think it mattered, since I was already using a PCI-E Rosewill NIC, based on the forum saying that the Realtek has issues.

 

I'm about 74% through the Post-Read with speeds of 72.3 MB/s currently.  I'm hoping it makes it through this Pre-Clear and 2 more.

 

I have my unRAID system connected to a 100 Mbit switch (while it sits on my desk, out of the rack), which runs to a Gigabit switch... that wouldn't be causing issues, would it?

Posted

OK.  I've successfully finished the first Pre-Clear at normal speeds since disabling the Realtek NIC.  I'm currently on the 2nd Pre-Clear, and it is still running at 100 MB/s through the Post-Read, 25% complete.  Fingers crossed. :)

Posted

I successfully pre-cleared the same drive 3x.

 

Then I tried another 2TB drive and it slowed down again. :(

 

Are all 2 TB drives supposed to start on sector 64, or do older drives still start on sector 63?  I'm running unRAID 4.7.

Posted

All new drives should be 4k-aligned. Existing non-AF drives can be left non-aligned. There is no reason to change them.

 

Does it matter, though, whether they are 4k-aligned?  I have several older drives that have come back refurbished.  How do I tell if they are non-AF drives?

And would it do any harm if all future drives added to the system are 4k-aligned, since that's the default setting I have for Pre-Clear?
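
For checking how an existing disk is laid out, a rough sketch (the device name is an example; some early AF drives are known to report a 512-byte physical sector to the OS even though they are 4k internally):

fdisk -lu /dev/sdd                              # partition table in sectors; look at the start sector (63 vs 64)
cat /sys/block/sdd/queue/physical_block_size    # 4096 if the kernel sees the drive as Advanced Format, 512 otherwise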

 

Archived

This topic is now archived and is closed to further replies.
