Slow parity sync


jj0076

I've tried searching the forum but only managed to confuse myself more. A while ago (5+ weeks) my parity drive showed a red ball and a bunch of errors. So I shut down, removed the drive, rebooted, shut down again, reinstalled the same drive, booted up and brought the array online to start a parity sync. It ran overnight and all appeared well the next morning, so I ran a parity check, which ran while I was at work, and all was fine when I returned home.

 

Last night I noticed that it had red balled again. I followed the same procedure and it is now just over 10% into the sync, running at less than 10 MB/sec and looking like it will take just under 2 days to complete.

 

Is it time to pull that drive out and replace with a new one? (and if I do put in a new parity, am I running a risk by putting the old drive back into the array as a data drive?)

 

Thanks in advance.

Link to comment

Run a smart test and post the results.

 

Personally I would have replaced the drive on the first red ball and then investigated the old drive.  Until you've given the old drive a good check out I wouldn't be putting it anywhere near your array as a data drive.

 

Old drives are great for door stops, paper weights etc. :)

Link to comment

The most common cause of red balls is cabling problems. My guess is that's what you have. If that's the problem, replace the SATA cable with a known good one, or at least resecure both ends of the current cable. Also resecure the power connection. This is a very common issue.

 

By the way, the way to know whether the drive is failing or there is a cabling problem is to get a SMART report and post the results.

Link to comment

Thanks for the help, following the parity sync it red balled again so I'm on the way to get some new drives now.

 

No idea how to do a smart test so I'll look that up later to run on the old drive.

 

Time to learn how to get a SMART report. You should preclear the new disk anyhow. The command is smartctl -a /dev/sdX, I believe (where sdX is the device ID of the drive). You could throw in a long self-test too (smartctl -t long /dev/sdX).

Link to comment


Here is a link to the Manual page for smartctl:

 

    http://smartmontools.sourceforge.net/man/smartctl.8.html

 

Look toward the bottom of the page for examples of typical command lines. 

 

Also, both the Dynamix and unMENU plugins contain built-in SMART report apps.  Installing either one of these will give you a virtually idiot-proof series of mouse clicks to run the tests and get the reports.

Link to comment


Unless you need a new drive it is better to get the SMART report first. Failed drives are far less common than loose cables.

Link to comment

 


Well I'm nearly out of storage space anyway so if it's not failed then that solves that issue as well!!

 

I'll get on the task of preclear, SMART report etc. later tonight.

Link to comment

For now just run a SMART report. It's very quick. There are GUI tools like myMain (see my sig), but the quickest and easiest way would be to run it from the command line, e.g., smartctl -A -a /dev/sdz (where sdz is the SATA device ID for the drive in question).
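
For anyone who would rather script this, here is a minimal Python sketch of the same commands mentioned in this thread. It assumes Python 3.7+ and smartmontools are installed on the machine the drive is attached to and that it is run as root. The function names and the /dev/sdb device node are hypothetical examples; only the smartctl options -a (full report) and -t short / -t long (start a self-test) come from smartctl itself.

```python
import subprocess


def smartctl_command(device, test=None):
    """Build the smartctl argument list for a report or a self-test.

    device: a device node such as "/dev/sdb" (hypothetical example).
    test:   None for a full report (-a), or "short" / "long" to start
            a self-test (-t short / -t long).
    """
    if test is None:
        return ["smartctl", "-a", device]
    if test not in ("short", "long"):
        raise ValueError("test must be None, 'short' or 'long'")
    return ["smartctl", "-t", test, device]


def run_smartctl(device, test=None):
    """Run smartctl and return its text output.

    The exit status is not checked here, because smartctl uses it as a
    set of status bits rather than a plain success/failure flag.
    """
    result = subprocess.run(
        smartctl_command(device, test), capture_output=True, text=True
    )
    return result.stdout
```

Calling run_smartctl("/dev/sdb") would return the same report text that smartctl -a /dev/sdb prints, ready to paste into a post. Double-check the device node first so the report comes from the drive you mean.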

Link to comment

I've attached the short smart report for the drive in question. Looking at the reallocated sector count and the current pending sector count, things don't look good. Is there anything that can be done with this drive?

 

If it were me, I would order a new drive ASAP.  Personally, I would shut the server down until I got it.  (The last thing you want is problems with a second drive.)  I would run the new drive through three preclear cycles.  (If you don't have a second computer that you can use to do this, keep the array offline.)  Then install it and allow parity to rebuild.

 

If you really want to see if this drive is salvageable, you could take the old drive and attempt to run several cycles of preclear on it and see what happens.  In some cases, the 'Current_Pending_Sector' and 'Offline_Uncorrectable' counts might drop to zero and stay there, and the 'Reallocated_Sector_Ct' count might remain stable.  If these conditions don't happen, then my next use for the drive would be a doorstop.  (My last failed drive actually prevented two different computers from even booting into unRAID!!!)

Link to comment

Cheers for confirming what I already thought from the smart report. I bought a new drive this afternoon so the unraid box will be shut down overnight and the new one installed tomorrow afternoon. Thanks again.

I strongly urge you to run the new drive through some full surface verification routine, preferably preclear, or at least a SMART long test before trusting it with your data. A new drive isn't necessarily a good drive: a small percentage of new drives fail within the first few days of service.
Link to comment

Yes, roughly speaking.  SMART data isn't 100% foolproof because manufacturers use different values, but any reallocated or pending sectors aren't a good thing.  Some drives fail while their SMART data still looks good, and others give you warning.

 

For example, here's your drive:

 

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   165   162   021    Pre-fail  Always       -       6741
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2207
  5 Reallocated_Sector_Ct   0x0033   181   181   140    Pre-fail  Always       -       372
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       15059
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       706
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       41
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       7956
194 Temperature_Celsius     0x0022   123   112   000    Old_age   Always       -       27
196 Reallocated_Event_Count 0x0032   051   051   000    Old_age   Always       -       149
197 Current_Pending_Sector  0x0032   198   196   000    Old_age   Always       -       880
198 Offline_Uncorrectable   0x0030   076   076   000    Old_age   Offline      -       40606
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   001   001   000    Old_age   Offline      -       275728

 

And here's an old drive I just pre-cleared and added to the array from my PC.

 

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       825
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   067   065   025    Pre-fail  Always       -       10157
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1274
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       10883
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1161
181 Program_Fail_Cnt_Total  0x0022   100   100   000    Old_age   Always       -       3021256
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       145
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   061   000    Old_age   Always       -       33 (Min/Max 16/39)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       90
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       1286

 

It's old but has no current pending sectors or reallocated event counts.  Not bad for a drive with over 10,000 power-on hours, eh?
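
To make that comparison concrete, here is a minimal Python sketch that scans the attribute table from a saved smartctl report and lists the attributes discussed in this thread (IDs 5, 196, 197 and 198) whose raw value is non-zero. The function name, the choice of watched IDs, and the assumption that the raw value is the tenth whitespace-separated column are illustrative only, not something smartctl defines. The sample rows are copied from the two reports in this post.

```python
# IDs of the attributes discussed in this thread (an illustrative choice):
# 5 Reallocated_Sector_Ct, 196 Reallocated_Event_Count,
# 197 Current_Pending_Sector, 198 Offline_Uncorrectable.
WATCHED_IDS = {5, 196, 197, 198}


def flag_attributes(report_text):
    """Return {attribute_name: raw_value} for watched attributes
    whose raw value is greater than zero."""
    flagged = {}
    for line in report_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, name, raw = int(fields[0]), fields[1], fields[9]
        if attr_id in WATCHED_IDS and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged


# Rows taken from the failing parity drive's report.
FAILING_DRIVE = """
  5 Reallocated_Sector_Ct   0x0033   181   181   140    Pre-fail  Always       -       372
196 Reallocated_Event_Count 0x0032   051   051   000    Old_age   Always       -       149
197 Current_Pending_Sector  0x0032   198   196   000    Old_age   Always       -       880
198 Offline_Uncorrectable   0x0030   076   076   000    Old_age   Offline      -       40606
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""

# Rows taken from the older, healthy drive's report.
HEALTHY_DRIVE = """
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    print("failing drive:", flag_attributes(FAILING_DRIVE))
    print("healthy drive:", flag_attributes(HEALTHY_DRIVE))
```

An empty result only means these four counters are zero. As noted above, a drive can still fail with clean SMART data, so treat this as a quick first look and not a verdict.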

 

I would advise you to run a pre-clear with screen on the new drive before you use it.  It will test it properly and format it for immediate use on the array (when you add it, it will take a few seconds to install instead of having to fully format the disk).  http://lime-technology.com/wiki/index.php/Configuration_Tutorial#Preclearing_With_Screen

 

2TB will take roughly 20 hours in my experience.  Just make sure you pre-clear the right drive ;)  And I always take a screenshot of the main menu so that if I do anything with the other drives, like unplugging cables, I can make sure they are put back in the correct order.

Link to comment

Ok, so I'm all up and running again thanks to all the help here. I have pre-cleared and SMART tested the new drive, and the parity sync is complete with the first parity check due to run overnight. I put the old drive into the array (unassigned) with a view to running a few pre-clear cycles on it to see if it's fit for anything, and encountered something odd. I was expecting it to show up as sdf, but it got listed as hdf instead. When it was parity it was sdb, which has now been taken by the new drive.

 

From what I found by searching, if a drive shows as hdx rather than sdx it is a BIOS setting issue? Is that correct? So I guess my next question is, should I be concerned by this, and if so what changes do I need to be looking into?

 

Thanks again for the continued support.

Link to comment

If the disk is showing up as an 'hdX' device, then this normally means that it is configured in the BIOS to run in IDE emulation mode.  This tends to lead to lower performance.  You need to look into your BIOS settings to see what mode the disks are set to run in.

Link to comment

Archived

This topic is now archived and is closed to further replies.
