WD Datacenter Gold 12TB issues

Hello,

I am trying to ready a Western Digital Datacenter Gold 12TB drive to replace my current parity drive. I ran the preclear for over 58 hours making it through 4 of 5 steps before failing on step 5. Here are my questions:

 

1. Am I dealing with a bad drive?

2. Is there a way to start the preclear without going through steps 1 - 4?

3. Could there be a BIOS issue here?

 

Here is the preclear report:

 

############################################################################################################################
#                                                                                                                          #
#                                         unRAID Server Preclear of disk 8DG3KEVD                                          #
#                                       Cycle 1 of 1, partition start on sector 64.                                        #
#                                                                                                                          #
#                                                                                                                          #
#   Step 1 of 5 - Pre-read verification:                                                  [17:14:50 @ 193 MB/s] SUCCESS    #
#   Step 2 of 5 - Zeroing the disk:                                                        [41:09:23 @ 80 MB/s] SUCCESS    #
#   Step 3 of 5 - Writing unRAID's Preclear signature:                                                          SUCCESS    #
#   Step 4 of 5 - Verifying unRAID's Preclear signature:                                                        SUCCESS    #
#   Step 5 of 5 - Post-Read verification:                                                                          FAIL    #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
############################################################################################################################
#                              Cycle elapsed time: 58:33:47 | Total elapsed time: 58:33:47                                 #
############################################################################################################################


############################################################################################################################
#                                                                                                                          #
#                                               S.M.A.R.T. Status default                                                  #
#                                                                                                                          #
#                                                                                                                          #
#   ATTRIBUTE                    INITIAL  STATUS                                                                           #
#   5-Reallocated_Sector_Ct      0        -                                                                                #
#   9-Power_On_Hours             0        -                                                                                #
#   194-Temperature_Celsius      34       -                                                                                #
#   196-Reallocated_Event_Count  0        -                                                                                #
#   197-Current_Pending_Sector   0        -                                                                                #
#   198-Offline_Uncorrectable    0        -                                                                                #
#   199-UDMA_CRC_Error_Count     131      -                                                                                #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
############################################################################################################################
#   SMART overall-health self-assessment test result: PASSED                                                               #
############################################################################################################################

--> FAIL: Post-Read verification failed. Your drive is not zeroed.


root@Tower:/usr/local/emhttp#

Thanks!

 

Dale

 

A non-zero UDMA_CRC_Error_Count usually indicates a bad cable, or a bad connection on the existing cable.

Edited by tdallen

I concur, looks like a bad cable to me.
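
For anyone checking this later: attribute 199 is a lifetime counter and does not reset, so the useful test after swapping a cable is whether the raw value keeps climbing. Here is a minimal sketch, assuming two snapshots of the attribute table in the same "ID-Name value" layout as the preclear report above. The function names and the second snapshot are made up for illustration.

```python
import re

# Matches rows such as "#   199-UDMA_CRC_Error_Count     131      -   #"
ROW = re.compile(r"^\W*(\d+)-(\w+)\s+(\d+)\b")


def parse_attributes(report: str) -> dict[str, int]:
    """Collect attribute name -> raw value from a preclear-style SMART table."""
    attrs = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            attrs[m.group(2)] = int(m.group(3))
    return attrs


def crc_increase(before: str, after: str) -> int:
    """How much UDMA_CRC_Error_Count grew between two snapshots."""
    name = "UDMA_CRC_Error_Count"
    return parse_attributes(after)[name] - parse_attributes(before)[name]


# First snapshot: row copied from the report above.
before = "#   199-UDMA_CRC_Error_Count     131      -   #"
# Second snapshot: hypothetical, taken after replacing the cable.
after = "#   199-UDMA_CRC_Error_Count     131      -   #"

print(crc_increase(before, after))  # 0 means no new CRC errors since the swap
```

A result of 0 points away from the cable as a continuing problem; a growing count means the link is still dropping data.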

  • Author

I replaced the cable and started preclear again. This time it was going much faster, and the CRC error count did not increase from where it was before the cable was replaced. I got through steps 1 through 4, but it has been hung up on step 5 for several hours at 18%. I really need to determine whether the drive is faulty even though SMART shows it is fine, whether there is a problem with preclear, or whether it is something else.

 

Help!

 

Thanks


Dale

Not a good sign. If the cabling is good and the drive locks up, seems like a bad drive.

 

Using fancy names like "gold" and "datacenter" may give you a warm fuzzy feeling in your psyche that this drive is going to have a long and problem-free life, but truth is drives are drives, and commercial and enterprise drives have similar failure rates. The "bathtub curve" phenomenon is real, meaning that early drive fatality is more common than fatality after a break-in period.

 

12TB drives are relatively new and nowhere near the sweet spot on price. We old timers tend to be very price conscious, because the premium drives we bought in the 2T and 3T days are long gone or in backup servers, and we realize that it is costly to do the refresh cycles. 8T have been the way to go for drives for the past 6-8 months or so at ~$20/T. 12T are about $35/T.

 

So you are one of the few I've seen with 12s. They keep saying they are pushing the laws of physics to make higher capacity drives, but somehow they keep doing it anyway. I guess HAMR is coming and maybe we'll see a jump in sizes. But these 12s may be eking out the last bit of capacity from the current tech, and it could be that you are a bit out there on the bleeding edge, and more failures are going to be normal. Or it could just be that this is a bad drive, and a replacement will work just fine.

 

BTW, a failure in the 3rd pass is a bad thing. That literally means that the drive read something other than a zero somewhere on the disk. There is a file that gets generated that tells you where. Probably just a couple of bytes. Cable crosstalk could induce a non-zero signal AFTER the read, or the marginal cable connection could have done something similar. But I will say that this is extremely rare. I've only seen it a small handful of times. The drive's ECC will usually not let a bad read escape the drive. I call it spewing garbage when a drive returns data that is different from what was written to the disk. There are those that would argue that it is impossible - but it does happen, as you've proven. Bit rot is sometimes blamed when it happens in the real world, but you've got some very fast rotting happening if this problem develops between the 2nd and 3rd stage of a preclear!
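
To make that concrete, here is roughly what "read something other than a zero" amounts to. This is only a sketch of the idea, not the preclear script's actual code: it reads any binary stream in chunks and reports the offset of the first non-zero byte. The function name and chunk size are my own choices, and the demo uses an in-memory buffer rather than a real disk.

```python
import io


def first_nonzero_offset(stream, chunk_size=1 << 20):
    """Return the byte offset of the first non-zero byte, or None if all zero."""
    offset = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return None  # reached the end without seeing a non-zero byte
        # Stripping leading zero bytes shows where the first non-zero byte sits.
        stripped = chunk.lstrip(b"\x00")
        if stripped:
            return offset + len(chunk) - len(stripped)
        offset += len(chunk)


# Demo on an in-memory "disk": 5000 zero bytes, one stray byte, 10 more zeros.
fake_disk = io.BytesIO(bytes(5000) + b"\x01" + bytes(10))
print(first_nonzero_offset(fake_disk, chunk_size=4096))  # 5000
```

A single stray byte anywhere is enough to fail this kind of check, which is why one bad transfer over a marginal cable could produce the same result as a bad drive.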

 

You might rule out cabling problems first, but I'd be pretty quick to pull the trigger on a replacement. If you're within the return window of wherever you bought it, you'd be assured of getting a brand new drive, which is better than a possible refurb from WD.

  • Author
 
Seems the fault lies with preclear. It crashed on a segmentation fault. I'm going to reboot and try it from a command line.

Btw, I'm not hung up on the fancy names either. That's just what they call it. I have a 10TB WD Gold that runs like a top, so when they came out with a 12TB for the same price as the 10TB I grabbed it up. Being 62, I think I'm an old timer myself lol!

Did you use the preclear plugin?
It tends to die, i.e. it stops progressing and CPU and memory utilization for the preclear script skyrocket to 100%.


  • Author
I used the preclear plugin. But when I try to run the script, it keeps telling me the drive is busy! Why can't I get this thing to preclear? I'm thinking of just putting the drive in the array and forgetting preclear.

Screenshot (5).png

I had to kill it from the CLI and start over. Luckily it saves progress periodically and resumes where it left off.


  • Author

 

Not sure what there is to kill. I rebooted the unRAID machine and it still says the device is busy. It looks like an error in the script to me.

 
Yep, looks like a different issue.


  • Author

johnnie.black,

 

Thanks! That did the trick! Preclear works again and reports my drive was successfully precleared. I am rebuilding my parity drive now.

  • 2 months later...

How long does it take to preclear a 12TB drive?

It must take forever, at least a few days.

  • Author

It was a couple of days, but that was going through the first 4 phases of preclear. Preclear crashed because of the script problem in phase 5, so I never completed it. I assigned the drive to unRAID and everything has been working fine.

My array consists of the 12TB WD Gold as the parity drive, a 10TB Gold data drive, and three 6TB Red drives, and it takes 1 day 40 minutes to do a parity check at 135.1 MB/s.
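
Those numbers line up with simple arithmetic. Here is a small sketch, assuming the check spans the full 12 TB of the parity drive, decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), and a constant average rate. The function name is just for illustration.

```python
def full_pass_duration(capacity_tb: float, rate_mb_s: float) -> str:
    """Time for one full pass over a drive at a constant rate, as H:MM:SS."""
    seconds = round(capacity_tb * 1e12 / (rate_mb_s * 1e6))
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


print(full_pass_duration(12, 135.1))  # 24:40:23, i.e. 1 day 40 minutes
print(full_pass_duration(12, 193))    # 17:16:16, close to the 17:14:50 pre-read
print(full_pass_duration(12, 80))     # 41:40:00, close to the 41:09:23 zeroing pass
```

The same estimate answers the preclear question: the pre-read and zeroing passes in my report already add up to more than 58 hours, before counting the post-read.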

 

Archived

This topic is now archived and is closed to further replies.
