WD 20 EARS DOA


Hi,

Just wanted to know how many of the WD20EARS drives are usually DOA (Dead On Arrival) and have to be RMA'd.

I ordered a batch of 10 disks and two of them are dead. One doesn't make it past controller initialisation; the other shows no reaction to preclear (or, more precisely, preclear says to try power cycling the drive, which didn't help anyway...)

So it is 1 out of 5 for me  :o - 2 have precleared without any errors, and 6 are currently preclearing without any problems so far.

Is 1 out of 5 OK? Have you seen similar numbers?

Please tell me, I'm curious ....

Link to comment

Did you buy them recently? If you check out Newegg, you'll see that a lot of people in the last month or so have complained about DOA disks, but before that the reviews were pretty good. I think maybe WD shipped a really big bad batch of drives.

I'm in the middle of preclearing 3 of those drives. They all seem to respond, but I am doing heavy clearing/testing to ensure there are no issues down the road.

Link to comment

In my experience it's very common to have a high fall-out rate, which is why I hate providing hard drives for purchase - it's just a big hassle - we only do it as a service to help server customers.  Our last hard drive purchase was for 17 WD EADS drives - 2 of them were DOA.  Right now it looks like WD has quality issues; in the past it's been Samsung, and before that, Seagate.  Looking at how HDs are packaged, I'd also guess a large percentage of drive problems arise from rough handling during shipment.

Link to comment

Hi,

well, I think I have two other "problem childs" (thank you AC/DC) ...

Looking at the serial numbers, I found out that:

All working discs start with "WMAZA32...."

Some drives report a longer serial which includes "MVWB0"; those are working.

All drives with "WCASA" failed during preclear, and my syslog grew to 2GB ...

The two DOA drives also had a "WMAZA1..." serial. They did not work at all.

Seems that the faulty drives can be determined by serial numbers.
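
To check a hunch like this across a whole batch, a minimal sketch along the following lines pulls the serial prefix out of `smartctl -i` output so drives can be grouped by it. The helper name `serial_prefix` and the default prefix length of 6 are assumptions for illustration only (they are not part of smartctl or the preclear script), and it expects a `Serial Number:` line in the format smartctl prints for these drives.

```shell
#!/bin/sh
# serial_prefix: hypothetical helper (not part of smartctl or preclear).
# Reads `smartctl -i` output on stdin and prints the first N characters
# (default 6) of the serial number, with any leading "WD-" removed.
serial_prefix() {
    awk -v n="${1:-6}" '
        /^Serial Number:/ {
            s = $3                 # third field is the serial itself
            sub(/^WD-/, "", s)     # drop the vendor prefix if present
            print substr(s, 1, n)
        }
    '
}

# Example (needs root and real devices, so it is left commented out):
# for d in /dev/sd[a-z]; do
#     printf '%s ' "$d"; smartctl -i "$d" | serial_prefix
# done
```

Listing the prefix next to each drive's preclear result would show whether the pattern holds beyond a handful of drives.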

 

I'm going to test the two "WCASA" drives with WinXP; the DOAs don't work at all.

BTW: The two dd tasks on sdg and sdf (the two "WCASA" drives) are still running, but not responding. Can I kill them and restart preclear? Which tasks should I kill?

I'm a little fed up, 4 out of 10 is a PITA  >:(

[Attached image: serials_WD20EARS.jpg]

Link to comment

In my last order of Seagate HDDs, 1 of 2 failed on me right off the bat. After a single preclear pass the drive wouldn't even pass a SMART short test. It would fail during read. Looking at the way the drives were packaged, it's a surprise they weren't both dead. They were in their anti-static bags, then wrapped in a giant ball of bubble wrap and placed loosely inside a larger box for shipping.

The way Seagate shipped the replacement drive is how all drives should be shipped. It was wrapped in its anti-static bag, then placed between two large foam pieces that were firmly set inside a box. There is no chance for movement inside that package.

In the 8 separate orders of HDDs before that, I had no issues with the WD or Seagate drives. All passed preclear cycles without issue.

Link to comment

I recently (Boxing Day sale) bought a drive with serial WMAZA1 543717. It worked, but the preclear was taking over twice as long as normal, so I took it back and got a replacement, which is also a WMAZA1 drive; it precleared normally.  This was the only issue I've had with WD 2TB green drives, so that's 1 bad out of 8 for me.

Stephen

Link to comment

I just recently got 4 WD 2TB EARS drives from different vendors:

1 from Amazon (late Dec)
2 from NewEgg (one from Nov, the other from late Dec)
1 from TigerDirect (early Jan)

(I guess I ordered from different places to heighten the chances that they would be from different batches.)

Unfortunately, 2 of these drives had immediate issues.  One would consistently fail to preclear, and although the other passed preclear, it recently redballed after I put a few GB of data on it.  Both the short and long SMART tests fail with 90% of the test remaining and an unknown failure, and Raw_Read_Error_Rate is flagged as "FAILING_NOW".  I RMA'd the first one already and am in the process of RMA'ing the other.

Interestingly enough, the two that failed both had serials starting with WMAZA1.  And unfortunately, out of the two remaining, I have one other "WMAZA1", but this one appears to be good (so far...).  Out of my recent purchases, this one has been in use since Nov, so it's the "oldest".  Really sounds like a bad batch, tho...

In case anyone's interested, here's the SMART status of the 2nd one, the one that redballed. It's still in my server until the spare WD (WCAYY0) I'm preclearing now can take its place:

smartctl -a -d ata /dev/sdb
smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WMAZA1075191
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sat Jan 22 12:22:24 2011 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
See vendor-specific Attribute list for failed Attributes.

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
				was suspended by an interrupting command from host.
				Auto Offline Data Collection: Enabled.
Self-test execution status:      (  73)	The previous self-test completed having
				a test element that failed and the test
				element that failed is not known.
Total time to complete Offline 
data collection: 		 (35760) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
				Auto Offline data collection on/off support.
				Suspend Offline collection upon new
				command.
				Offline surface scan supported.
				Self-test supported.
				Conveyance Self-test supported.
				Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
				power-saving mode.
				Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
				General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 255) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
				SCT Feature Control supported.
				SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   037   036   051    Pre-fail  Always   FAILING_NOW 23474
  3 Spin_Up_Time            0x0027   171   171   021    Pre-fail  Always       -       6416
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       25
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       123
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       19
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       14
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       502
194 Temperature_Celsius     0x0022   129   112   000    Old_age   Always       -       21
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       55

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: unknown failure    90%       118         -
# 2  Short offline       Completed: unknown failure    90%       117         -
# 3  Short offline       Completed: unknown failure    90%       105         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
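
As a rough way to pull the failing attributes out of a table like the one above, here is a minimal sketch. It flags any attribute whose normalized VALUE has dropped to or below a non-zero THRESH, which is the condition behind FAILING_NOW. The helper name `failing_attrs` is made up for illustration, and it assumes the ten-column attribute layout shown in this report.

```shell
#!/bin/sh
# failing_attrs: hypothetical helper (not part of smartctl).
# Reads a smartctl attribute table on stdin and prints every attribute
# whose normalized VALUE is at or below its non-zero THRESH.
failing_attrs() {
    awk '
        # Attribute rows start with a numeric ID and have >= 10 columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            value  = $4 + 0
            thresh = $6 + 0
            if (thresh > 0 && value <= thresh)
                printf "%s value=%d thresh=%d raw=%s\n", $2, value, thresh, $10
        }
    '
}

# Example (needs root and a real device, so it is left commented out):
# smartctl -A /dev/sdb | failing_attrs
```

Fed the attribute table from this report, it would single out Raw_Read_Error_Rate (value 37 against a threshold of 51) and nothing else.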

Link to comment

I don't think you can necessarily read too much into the serial numbers.

One point of significance about the WD 2TB Green drives is that they recently went from 500GB/platter to 667GB/platter (and thus down from 4 platters to 3 platters).

There have been a number of revisions of the WD20EARS drive:

WD20EARS-00S8B1 (I have 3 of these with serial numbers that start with WCAVY)
WD20EARS-00J2GB0 (I don't have any of these, but this vr-zone article describes the cosmetic differences between the S8B1 and J2GB0 revisions)

and the latest 3-platter version:

WD20EARS-00MVWB0 (I have 1 of these with a serial number that starts with WCAZA)

It looks to me like everyone above is talking about the 3-platter MVWB0 revision with different serial number prefixes, so I want to draw the distinction between comparing different revisions of the same drive and comparing different serial number prefixes within the same revision. I expect differences between my new 3-platter drive and my older 4-platter drives.

Link to comment

I just picked up 3 off of Newegg and 1 of the 3 died on me while formatting.  It sucks to have to RMA it, but they turned out better than the two Hitachis I got, which both showed up dead and had to be returned.

Link to comment

OK, so I just restarted the preclear of three disks which were "misbehaving".

So far they are through 35% of prereading ... keep fingers crossed!

Maybe I've stressed my Tower too much - but with 4GB RAM it should be possible to do 6 parallel preclears, shouldn't it?  ::)

The interesting thing is - the ten disks came in one tray...

OK, so the two are going back for RMA tomorrow, and maybe one or two of the ones currently preclearing. I still don't know which of them are 3-platter disks and which are 4-platter ones?  ???

Link to comment

Darn!

The two disks which fooled me yesterday stopped again during preclear - both at 51%!!!

After stopping preclear:

SMART status Info for /dev/sdf

smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

The drives are not there anymore until I reboot the machine.

I'm trying to preclear them again directly on the server, not via telnet (which, btw, worked for all other drives). Trying to power cycle them...

Link to comment

Try writing zeros to the first sector holding the MBR.  It has been known to help in some cases, especially if the jumper setting was changed.

 

dd if=/dev/zero count=8 of=/dev/sdf

 

I don't know about you, but a drive that decides to become un-responsive does not give me great confidence.

 

What specific power supply are you using?

 

Joe L.

Link to comment

Hi Joe,

my power supply is a SilverStone single-rail 12V 60A 750W one. It should be more than sufficient for even 20 drives.

My drives are caged in three SuperMicro 5-in-3 bays, so the double Molex connections should prevent power failure. They are all tight.

I've just switched the non-working drives to another bay and therefore to another part of my controller (SuperMicro SASL), just to check if there is a problem with one of my controllers or the cabling. But if there was, why did the disks stop twice at 51% without any previous error?

5 drives are currently working and 5 are not, but one of the non-working drives was preclearing OK when I stopped for the reboot... so there is still a chance of 6 out of 10... which is also very bad.

The jumpers went on the drives the moment they left their plastic bags ... but if they fail again this time, I'll try writing zeros to the MBR. Thanks for the hint.

Herbert

Link to comment

When I came back to the office this morning, 2 discs had stopped at 61% of prereading (4:05), and the third had started to zero but stopped (14:00). The Tower was barely reacting to anything (no web interface, login took almost 30 secs).

It seems the syslog had filled up.

I'm trying to preclear the third disk on its own, without the two "problem childs"...

I'm going to try the "zeroing" that Joe mentioned afterwards.

I'll keep reporting back!

BTW: Should I try to zero them with the -A parameter? Just to be sure it is not an alignment error?

Link to comment

The "-A" indicates the type of pre-clear signature.  It changes the values of two of the bytes in the signature. (One that indicates the starting sector for the partition unRAID will use, and one other that indicates its length) 

 

It will do nothing to fix the issue you are having.

 

I would power cycle the drive, write zeros to its MBR, and then try another pre-clear cycle.

To write zeros to the MBR of /dev/sdg type:

dd if=/dev/zero count=8 of=/dev/sdg
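
For reference, `dd` uses 512-byte blocks unless `bs=` is given, so `count=8` zeroes the first 4096 bytes: the MBR sector plus the seven sectors after it. Below is a minimal sketch that spells the block size out and can be tried on a scratch image file before it is pointed at a real device. The wrapper name `zero_head` is made up for illustration and is not part of the preclear script.

```shell
#!/bin/sh
# zero_head: hypothetical wrapper, not part of preclear.
# Overwrites the first 8 x 512 = 4096 bytes of the target with zeros.
# WARNING: this destroys the partition table of whatever target is given.
zero_head() {
    # bs=512 spells out dd's default block size; conv=notrunc keeps a
    # regular file (e.g. a test image) at its original length.
    dd if=/dev/zero of="$1" bs=512 count=8 conv=notrunc 2>/dev/null
}

# Example against a real drive (left commented out on purpose):
# zero_head /dev/sdg
```

Double-check the device name before running anything like this; on a live array the letters can change between boots.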

 

I think the high load cycle count is because the disk has errors, and the Linux device driver keeps resetting the drive in an attempt to re-establish communications with it.

 

Each time it resets the drive, the load cycle count is incremented.
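
One way to test that theory would be to read Load_Cycle_Count before and after a preclear attempt and compare the two. A minimal sketch, assuming the attribute table layout smartctl printed earlier in this thread; the helper name `attr_raw` is hypothetical:

```shell
#!/bin/sh
# attr_raw: hypothetical helper (not part of smartctl).
# Reads a smartctl attribute table on stdin and prints the RAW_VALUE
# column (10th field) for the attribute whose name is given as $1.
attr_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Example (commented out; needs root and a real device):
# before=$(smartctl -A /dev/sdg | attr_raw Load_Cycle_Count)
# ... run the preclear ...
# after=$(smartctl -A /dev/sdg | attr_raw Load_Cycle_Count)
# echo "load cycles added: $((after - before))"
```

A jump of hundreds of cycles over a single run, together with reset messages in the syslog, would fit the explanation above.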

 

Joe L.

Link to comment

Out of the 4 WD20EARS drives I have purchased, only one has been DOA (in terms of failing preclear); these were all single-drive purchases.

 

My father recently purchased 4 x WD20EARS from scan.co.uk in the UK and all 4 were DOA!  Don't think this can be blamed on WD though, the four drives were wrapped in a couple of layers of bubblewrap and thrown in a plastic bag to be shipped!

 

I still favour WD despite slow RMA process and use Scan despite their terrible packaging.  Last WD RMA took a horrendous time to complete with them unable to give me any information on where my drive was after it was delivered to them.  Scan replaced all the WD20EARS but one of the jobsworths there seemed to take great pleasure in wasting my time when I visited their actual store.

Link to comment
