Current pending sectors, but xfs check ok?


lixe

I started my first server a week ago and put my old external hard drives in it, a WD Red 4TB and a WD Green 2TB.

 

Right now I'm not using a parity disk, but will start soon (by buying another hard drive).

 

Luckily I still have backups of my files, because I recently discovered some problems with my drives:

 

WD Red 4TB (XFS):

#    Attribute name            Flag    Value  Worst  Thresh  Type      Updated  Failed  Raw value
1    Raw read error rate       0x002f  200    199    051     Pre-fail  Always   Never   2361
3    Spin up time              0x0027  179    170    021     Pre-fail  Always   Never   8025
4    Start stop count          0x0032  088    088    000     Old age   Always   Never   12871
5    Reallocated sector count  0x0033  200    200    140     Pre-fail  Always   Never   0
7    Seek error rate           0x002e  100    253    000     Old age   Always   Never   0
9    Power on hours            0x0032  097    097    000     Old age   Always   Never   2508 (3m, 12d, 12h)
10   Spin retry count          0x0032  100    100    000     Old age   Always   Never   0
11   Calibration retry count   0x0032  100    100    000     Old age   Always   Never   0
12   Power cycle count         0x0032  088    088    000     Old age   Always   Never   12871
192  Power-off retract count   0x0032  189    189    000     Old age   Always   Never   8996
193  Load cycle count          0x0032  199    199    000     Old age   Always   Never   5673
194  Temperature celsius       0x0022  122    104    000     Old age   Always   Never   30
196  Reallocated event count   0x0032  200    200    000     Old age   Always   Never   0
197  Current pending sector    0x0032  200    200    000     Old age   Always   Never   22
198  Offline uncorrectable     0x0030  100    253    000     Old age   Offline  Never   0
199  UDMA CRC error count      0x0032  200    200    000     Old age   Always   Never   0
200  Multi zone error rate     0x0008  100    253    000     Old age   Offline  Never   0

 

WD Green 2TB (XFS):

#    Attribute name            Flag    Value  Worst  Thresh  Type      Updated  Failed  Raw value
1    Raw read error rate       0x002f  195    195    051     Pre-fail  Always   Never   4350
3    Spin up time              0x0027  253    237    021     Pre-fail  Always   Never   1941
4    Start stop count          0x0032  059    059    000     Old age   Always   Never   41324
5    Reallocated sector count  0x0033  200    200    140     Pre-fail  Always   Never   0
7    Seek error rate           0x002e  200    200    000     Old age   Always   Never   0
9    Power on hours            0x0032  090    090    000     Old age   Always   Never   8000 (10m, 27d, 8h)
10   Spin retry count          0x0032  100    100    000     Old age   Always   Never   0
11   Calibration retry count   0x0032  100    100    000     Old age   Always   Never   0
12   Power cycle count         0x0032  059    059    000     Old age   Always   Never   41236
192  Power-off retract count   0x0032  200    200    000     Old age   Always   Never   207
193  Load cycle count          0x0032  128    128    000     Old age   Always   Never   218821
194  Temperature celsius       0x0022  121    098    000     Old age   Always   Never   29
196  Reallocated event count   0x0032  200    200    000     Old age   Always   Never   0
197  Current pending sector    0x0032  199    001    000     Old age   Always   Never   406
198  Offline uncorrectable     0x0030  200    200    000     Old age   Offline  Never   238
199  UDMA CRC error count      0x0032  200    200    000     Old age   Always   Never   0
200  Multi zone error rate     0x0008  186    183    000     Old age   Offline  Never   3789

 

I discovered https://lime-technology.com/wiki/index.php/Check_Disk_Filesystems and did a check, but there weren't any problems reported.

 

Is there something I could/should do? Or can I find out which files exactly are concerned by those sectors?

Link to comment

Did you preclear these two disks before you used them for unRAID? It's always a good idea, to see whether an older drive is up to the challenge.

 

I would not use these drives in my array if I were you. With one or two pending relocations you might be able to clear them and the disk would be OK. But 22 is high, and 406 is outrageous. These disks might be usable for backups (better than nothing), but I'd replace them.
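
For anyone who wants to check a pasted attribute table quickly, here is a rough Python sketch. It assumes rows laid out like the ones in the first post (ID, name, flag, value, worst, threshold, type, updated, failed, raw value). The helper names `parse_rows` and `flag_attributes` are made up for this example, and "anything above zero" is only a simple reading of the advice above, not an official threshold.

```python
import re

# Row layout assumed from the tables in the first post:
# ID, name, flag, value, worst, threshold, type, updated, failed, raw value.
ROW = re.compile(
    r"^(?P<id>\d+)\s+(?P<name>.+?)\s+(?P<flag>0x[0-9a-fA-F]{4})\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>Pre-fail|Old[ _]age)\s+(?P<updated>\w+)\s+(?P<failed>\S+)\s+"
    r"(?P<raw>\d+)"
)

# The attributes the replies in this thread focus on.
WATCHED = {
    5: "Reallocated sector count",
    197: "Current pending sector",
    198: "Offline uncorrectable",
}


def parse_rows(text):
    """Return {attribute id: raw value} for every line that matches ROW."""
    raw = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            raw[int(m.group("id"))] = int(m.group("raw"))
    return raw


def flag_attributes(raw):
    """List the watched attributes whose raw value is above zero."""
    return [f"{WATCHED[i]}: {raw[i]}" for i in sorted(WATCHED) if raw.get(i, 0) > 0]


# Three rows copied from the WD Green table in the first post.
sample = """\
5 Reallocated sector count 0x0033 200 200 140 Pre-fail Always Never 0
197 Current pending sector 0x0032 199 001 000 Old age Always Never 406
198 Offline uncorrectable 0x0030 200 200 000 Old age Offline Never 238
"""
print(flag_attributes(parse_rows(sample)))
```

On the sample rows this flags the pending and offline-uncorrectable counts and leaves the reallocated count alone, since its raw value is 0.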

Link to comment

I formatted them with my iMac, and formatted them again with unRAID after putting them in my server.

But no, I didn't preclear the drives (I just googled what it does). So is there any sense in doing it now (of course all data will be gone afterwards)?

 

If I copy a file from those drives without any error, can I be sure that all data is ok (which means if some files were corrupted I would get an error message)?

Link to comment

I agree with bjp999.  You might want to read this:

 

  https://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes

 

I would order two new drives within the next hour! I would start by replacing the 2TB drive with a NEW drive. (I personally preclear all drives three times, but do at least one cycle.) I suspect that drive will have read errors, which would prevent rebuilding the 4TB drive.

Do NOT do anything to those two drives until you have your server working again. You might be able to recover some of the files from them if that becomes an issue. I suspect the WD Red is still under warranty, so after everything is working, I would go that route. (I have never had a drive questioned on a warranty return. The manufacturers have always shipped the new drive within a day of receipt, so they don't automatically inspect returned drives before shipping the replacement.)

 

If I copy a file from those drives without any error, can I be sure that all data is ok (which means if some files were corrupted I would get an error message)?

 

Being able to read one file is no indication that the disk is good. You would have to be able to copy EVERY file...
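
If you want to try that without copying everything by hand, something like the Python sketch below reads every file under a directory from start to end and collects the ones that raise an I/O error. The function name `read_every_file` and the `/mnt/disk1` path in the usage note are assumptions for illustration. Note that a clean pass only shows the drive returned data without reporting an error, and files that were read recently may be served from RAM rather than the disk, so comparing against your backups is still the stronger check.

```python
import os


def read_every_file(root, chunk_size=1024 * 1024):
    """Read each file under root to the end; return (files_ok, failures)."""
    ok = 0
    failures = []  # list of (path, error message)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    while fh.read(chunk_size):
                        pass  # discard the data; only read errors matter here
                ok += 1
            except OSError as exc:
                # Unreadable sectors typically surface as an I/O error here.
                failures.append((path, str(exc)))
    return ok, failures
```

You would call it as `ok, failures = read_every_file("/mnt/disk1")`, substituting whatever path your disk is actually mounted at, and then look through `failures`.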

 

EDIT:  I just looked at your power-on hours and I suspect both drives are in warranty at this point. 

Link to comment

Yes, if the drive returns data, you can depend on it. A pending relocation means that the drive had trouble reading that spot on the disk.  When that sector is WRITTEN, the disk will do a final assessment, and if the sector is found to be bad or marginal, the drive will map a spare sector into that spot - making that spot unusable.
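
To make the bookkeeping concrete, here is a toy Python model of what I just described. It is not how drive firmware actually works, only the two counters and the rule for what happens when a pending sector is written; the names `Counters` and `rewrite_pending` are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    pending: int = 0      # attribute 197, Current pending sector
    reallocated: int = 0  # attribute 5, Reallocated sector count


def rewrite_pending(counters, sector_is_bad):
    """Model one write to a pending sector.

    A bad or marginal sector is swapped for a spare (reallocated goes up);
    a sector that tests fine is kept. Either way it is no longer pending.
    """
    if counters.pending == 0:
        return counters
    return Counters(
        pending=counters.pending - 1,
        reallocated=counters.reallocated + (1 if sector_is_bad else 0),
    )


c = Counters(pending=3)
for bad in (False, False, True):
    c = rewrite_pending(c, bad)
print(c)  # Counters(pending=0, reallocated=1)
```

So a pending count that drops after a full write while the reallocated count stays at 0 would mean the rewritten sectors passed the drive's own check.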

 

But relocations are rare in healthy drives. And experience shows that once they start to happen, more and more sectors start to go. In other words, despite the fact drives are prepared to deal with the problem, it is a harbinger of bad things to come. And you have two advanced cases.

 

Not sure they'd be under warranty if they were pulled from external cases. You can always check on the manufacturer's website.

Link to comment

Thx for the fast replies!

 

I didn't think I would have any warranty left, but my WD Red is still covered (the WD Green isn't; I bought it in May 2011). There is still a problem, though: the WD Red was originally a MyBook Studio, which I recently opened to put the drive in my server. Usually the warranty is void once you open the case, but is that really a problem in my case (since of course I knew what I was doing, or rather how to do it properly)?

Link to comment


If you run the bare drive's serial number and it says it's under warranty, it's under warranty.

 

Keep the fact it was in an external to yourself.

Link to comment

Would have done that, but sadly the serial number is linked to the specific product (just checked).

 

But I did write a support request to WD and hopefully they will change it anyway!

 

By the way, what is the best way to preclear a drive using unRAID?

Link to comment


Preclear plugin using my fast post-read version.

Link to comment

Could someone tell me the fastest/easiest way to remove the drives from my array? I'm not using parity yet, and all the tutorials I could find assume a parity disk. I also don't want to create a new array, because my cache disk and third disk are healthy, although I may preclear them soon to be 100% sure.

Link to comment

In case WD won't replace the disk, I also precleared the WD disk, and I'm a little surprised by the result :D

 

############################################################################################################################

#                                                                                                                          #

#                                        unRAID Server Preclear of disk /dev/sdc                                          #

#                                      Cycle 1 of 1, partition start on sector 64.                                        #

#                                                                                                                          #

#                                                                                                                          #

#  Step 1 of 5 - Pre-read verification:                                                  [2:30:29 @ 147 MB/s] SUCCESS    #

#  Step 2 of 5 - Zeroing the disk:                                                        [9:28:44 @ 117 MB/s] SUCCESS    #

#  Step 3 of 5 - Writing unRAID's Preclear signature:                                                          SUCCESS    #

#  Step 4 of 5 - Verifying unRAID's Preclear signature:                                                        SUCCESS    #

#  Step 5 of 5 - Post-Read verification:                                                  [2:30:52 @ 146 MB/s] SUCCESS    #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

############################################################################################################################

#                              Cycle elapsed time: 14:30:12 | Total elapsed time: 14:30:13                                #

############################################################################################################################

 

 

############################################################################################################################

#                                                                                                                          #

#                                              S.M.A.R.T. Status default                                                  #

#                                                                                                                          #

#                                                                                                                          #

#  ATTRIBUTE                    INITIAL  CYCLE 1  STATUS                                                                  #

#  5-Reallocated_Sector_Ct      0        0        -                                                                      #

#  9-Power_On_Hours            2537    2552    Up 15                                                                  #

#  194-Temperature_Celsius      29      33      Up 4                                                                    #

#  196-Reallocated_Event_Count  0        0        -                                                                      #

#  197-Current_Pending_Sector  22      1        Down 21                                                                #

#  198-Offline_Uncorrectable    0        0        -                                                                      #

#  199-UDMA_CRC_Error_Count    0        0        -                                                                      #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

############################################################################################################################

#  SMART overall-health self-assessment test result: PASSED                                                              #

############################################################################################################################

 

 

############################################################################################################################

#                                                                                                                          #

#                                        unRAID Server Preclear of disk /dev/sde                                          #

#                                        Cycle 2 of 2, partition start on sector 64.                                      #

#                                                                                                                          #

#                                                                                                                          #

#  Step 1 of 5 - Pre-read verification:                                                  [9:38:54 @ 115 MB/s] SUCCESS    #

#  Step 2 of 5 - Zeroing the disk:                                                        [9:28:26 @ 117 MB/s] SUCCESS    #

#  Step 3 of 5 - Writing unRAID's Preclear signature:                                                          SUCCESS    #

#  Step 4 of 5 - Verifying unRAID's Preclear signature:                                                        SUCCESS    #

#  Step 5 of 5 - Post-Read verification:                                                  [9:41:09 @ 114 MB/s] SUCCESS    #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

############################################################################################################################

#                              Cycle elapsed time: 28:48:42 | Total elapsed time: 57:36:47                                #

############################################################################################################################

 

 

############################################################################################################################

#                                                                                                                          #

#                                              S.M.A.R.T. Status default                                                  #

#                                                                                                                          #

#                                                                                                                          #

#  ATTRIBUTE                    INITIAL  CYCLE 1  CYCLE 2  STATUS                                                        #

#  5-Reallocated_Sector_Ct      0        0        0        -                                                              #

#  9-Power_On_Hours            2553    2582    2610    Up 57                                                          #

#  194-Temperature_Celsius      29      34      33      Up 4                                                          #

#  196-Reallocated_Event_Count  0        0        0        -                                                              #

#  197-Current_Pending_Sector  1        0        0        Down 1                                                        #

#  198-Offline_Uncorrectable    0        0        0        -                                                              #

#  199-UDMA_CRC_Error_Count    0        0        0        -                                                              #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

#                                                                                                                          #

############################################################################################################################

#  SMART overall-health self-assessment test result: PASSED                                                              #

############################################################################################################################

 

 

What do you guys think? Can I use it again, or should I still try to get it replaced even though no error is currently detectable? Of course I still wouldn't store my most important data on that disk...

 

Link to comment

The results are both good and bad:

- good in that all sector errors were cleared

- good in that there are no reallocated sectors, so the media under the bad ones is fine. That means it was likely an electrical issue that scrambled the data (power spike or outage while the drive was being written to)

- bad in that the first zeroing did not clear all of the bad sectors, which is not a good sign. There are two possibilities: either one of them took two zeroings to clear, or a new one showed up. Neither possibility is good.

 

Right now, the SMART report looks fine, with no evidence of the previous issues, so the drive will probably pass any drive test and not qualify for an RMA.

 

But I would not feel comfortable using this drive without more testing.  I would Preclear one or two more times, and expect perfection each time.  Then I would still monitor it.
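
As a sketch of what "expect perfection" could look like in practice, the Python below compares the raw values before and after a cycle for the attributes shown in the preclear report. The function name `review_cycle` and the choice of which attributes must stay at zero are my assumptions; the numbers are copied from the first report above.

```python
# Attributes that should stay at zero on a drive that is being re-qualified.
MUST_BE_ZERO = {
    5: "Reallocated_Sector_Ct",
    196: "Reallocated_Event_Count",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def review_cycle(before, after):
    """Compare two {attribute id: raw value} snapshots taken around a cycle."""
    problems = []
    for attr_id, name in MUST_BE_ZERO.items():
        old, new = before.get(attr_id, 0), after.get(attr_id, 0)
        if new > old:
            problems.append(f"{name} rose from {old} to {new}")
        elif new > 0:
            problems.append(f"{name} is still {new}")
    return problems


# Raw values from the first preclear report in this thread.
initial = {5: 0, 196: 0, 197: 22, 198: 0}
cycle_1 = {5: 0, 196: 0, 197: 1, 198: 0}
print(review_cycle(initial, cycle_1))  # ['Current_Pending_Sector is still 1']
```

An empty list after every further cycle is what I would want to see before trusting the drive again.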

Link to comment


And if you get another Current Pending Sector, stop the preclear and RMA the drive immediately.

Of the several drives that I have RMA'd in the past ten or fifteen years, the replacement drives have always shipped the same day or the next day. With that turnaround time, they are not inspecting the drives as they receive them. I have the feeling they realize that pulling a drive loaded with an OS and data from service is not something anyone does on a whim. Arguing with users about whether a drive is defective is a losing proposition and a PR nightmare in this type of situation. It is simply better PR to replace the 5% of returned drives that are actually good than to face negative PR from unhappy customers by refusing to honor the warranty on a drive the customer thinks is defective.

 

However, they probably maintain a database which could be used to identify customers who routinely send back drives that test good within the last two months before the warranty runs out! 

Link to comment

But I would not feel comfortable using this drive without more testing.  I would Preclear one or two more times, and expect perfection each time.  Then I would still monitor it.

 

Agreed. Also, in my experience refurbished drives are a crapshoot: there's at least a 50/50 chance of one failing soon, so it's not an easy decision.

Link to comment

Archived

This topic is now archived and is closed to further replies.
