[SOLVED] Pending Sector and Uncorrectable errors



I received notifications this morning of 8 Pending Sectors and 8 Offline Uncorrectable errors on an... elderly... (4.75 power-on years) 2TB Seagate Barracuda. I just got home, ran a short SMART test (attached) and have started a long SMART test. Reallocated sector count is still at 0.
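For anyone following along, the three counters mentioned above can be pulled out of a SMART attribute table with a few lines of Python. This is only a rough sketch: it assumes the usual ten-column attribute layout that `smartctl -A` prints, and the `SAMPLE` text is made up to mirror the numbers in this post (8 pending, 8 offline uncorrectable, 0 reallocated). It is not copied from the attached report, and `parse_smart_attributes` is a hypothetical helper name.

```python
import re

# Standard SMART attribute IDs worth watching on a suspect disk.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def parse_smart_attributes(text):
    """Return {attribute_id: raw_value} for the watched attributes."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit check.
        if len(fields) >= 10 and fields[0].isdigit():
            attr_id = int(fields[0])
            if attr_id in WATCHED:
                # RAW_VALUE is the 10th column; keep only its leading integer.
                match = re.match(r"\d+", fields[9])
                if match:
                    found[attr_id] = int(match.group())
    return found


# Illustrative sample only (assumed layout, invented values).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8
"""

if __name__ == "__main__":
    for attr_id, raw in sorted(parse_smart_attributes(SAMPLE).items()):
        print(f"{attr_id:>3} {WATCHED[attr_id]}: {raw}")
```

Non-zero raw values for 197 and 198 alongside a zero for 5 is the pattern described in this post: sectors the drive could not read but has not yet remapped.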

 

I was planning on upgrading to a new 8TB parity and moving the current 4TB parity drive into the array as a replacement for an equally (or more) elderly 1TB drive. Should I plan on replacing this one first instead?

 

Extended SMART results will be posted as soon as they're available.

nas-smart-20180710-1845.zip

Edited by FreeMan
resolved
Link to comment

OK, smart report attached. It shows errors but I've not reviewed it, doing this via TeamViewer from the office...

 

I got an additional warning this morning that there was an error, then a notification that it was recovered.

 

Extended SMART and fresh diagnostics attached.

 

Of note, the power went out at the house while we were away on vacation. The UPS did its thing and shut the server down properly. It remained powered off for about a week while we were gone, and these issues cropped up about 48 hours after turning it back on.

nas-diagnostics-20180711-1012.zip

nas-smart-20180711-0841.zip

Edited by FreeMan
spelling/grammar
Link to comment
Extended SMART test failed = disk needs to be replaced
Thanks, Johnnie.

It was bound to happen eventually, I thought I was going to get a jump on it and start replacing the oldest drives before that happened...

Guess I got started a bit too late. Thank goodness for CrashPlan - that disk has all of the family photos on it, and I just got an email from CP this morning that backups are complete.


Link to comment

That was odd...

 

I just got 2 notifications:

 

@14:56

Notice [NAS] - array turned good

{numbers}

Array has 0 disks with read errors

normal

 

@14:57

Notice [NAS] - Parity check started

{numbers}

Size 4TB

Warning

 

I was going to ask about running a non-correcting parity check just to ensure that everything was good (my last one ran with no issues on 1 June; the server was down on 1 July). I don't think I want to run a correcting parity check because disk 7 is known to be failing, and I don't want a read issue there to cause a change to parity and corrupt a (believed/known) good parity in case of complete disk failure.

 

Does running a non-correcting parity check make sense at this point, or should I just pop down to the shop, pick up a new 8TB drive, and do a parity/data-drive switcharoo?

Link to comment

I've just ordered a new WD MyBook external drive from Newegg and I'll stop by the warehouse to pick it up tomorrow. (The bonus of having a warehouse in the city where you live. The downside - I have to pay sales tax.)

 

Since the preclear plugin seems to be out of fashion these days, what's recommended to test the drive for infant mortality prior to shucking and installing internally?

Link to comment
38 minutes ago, FreeMan said:

I've just ordered a new WD MyBook external drive from Newegg and I'll stop by the warehouse to pick it up tomorrow. (The bonus of having a warehouse in the city where you live. The downside - I have to pay sales tax.)

 

Since the preclear plugin seems to be out of fashion these days, what's recommended to test the drive for infant mortality prior to shucking and installing internally?

Plug it into a Windows box and run the WD diagnostics suite on it. http://downloads.wdc.com/windlg/WinDlg_v1_31.zip

Writing zeroes to the whole drive and then running a long SMART test would accomplish something very similar to preclear.

 

Just be sure to keep the drive cool, the externals don't have the best circulation so I'd put a fan blowing on it.
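As a rough illustration of the "write zeroes, then read back" half of that suggestion, here is a small Python sketch. It deliberately runs against a temporary file, not a real disk, and the function names (`write_zeroes`, `is_all_zero`, `demo`) are hypothetical. Pointing something like this at an actual block device would wipe it, so treat it as a model of the idea and not a replacement for the vendor tool. The long SMART test itself would still be started separately, for example with smartmontools' `smartctl -t long` on the device.

```python
import os
import tempfile

CHUNK = 1024 * 1024  # work in 1 MiB pieces


def write_zeroes(path, total_bytes):
    """Overwrite the first total_bytes of an existing file with zeroes."""
    zeros = bytes(CHUNK)
    with open(path, "r+b") as f:
        remaining = total_bytes
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(zeros[:n])
            remaining -= n
        # Push the data out of Python and OS buffers before verifying.
        f.flush()
        os.fsync(f.fileno())


def is_all_zero(path, total_bytes):
    """Read back total_bytes and confirm every byte is zero."""
    with open(path, "rb") as f:
        remaining = total_bytes
        while remaining > 0:
            chunk = f.read(min(CHUNK, remaining))
            if not chunk:
                return False  # target is shorter than expected
            if chunk.count(0) != len(chunk):
                return False  # found a non-zero byte
            remaining -= len(chunk)
    return True


def demo():
    """Fill a temp file with 0xFF, zero it, and verify before and after."""
    size = 3 * CHUNK + 123
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\xff" * size)
        path = tmp.name
    try:
        before = is_all_zero(path, size)
        write_zeroes(path, size)
        after = is_all_zero(path, size)
    finally:
        os.remove(path)
    return before, after


if __name__ == "__main__":
    print(demo())
```

The point of the full write plus read-back is simply to touch every sector once in each direction, which is what tends to shake out early failures.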

Link to comment
11 hours ago, FreeMan said:

I don't think I want to run a correcting parity check because disk 7 is known to be failing, and I don't want a read issue there to cause a change to parity and corrupt a (believed/known) good parity in case of complete disk failure.

Definitely don't, it might corrupt parity.

 

11 hours ago, FreeMan said:

Does running a non-correcting parity check make sense at this point, or should I just pop down to the shop, pick up a new 8TB drive, and do a parity/data-drive switcharoo?

Don't see the point in running a check, just replace the disk.
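To show why a correcting check is risky with a disk that may return bad data, here is a toy Python model. It assumes plain single parity computed as a byte-wise XOR across the data disks and uses made-up six-byte "disks"; `xor_parity` and `rebuild` are hypothetical helpers, not Unraid code.

```python
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks (toy single-parity model)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(parity, surviving_blocks):
    """Reconstruct the one missing block from parity plus the survivors."""
    return xor_parity([parity] + list(surviving_blocks))


# Made-up contents standing in for three data disks.
disk1 = b"family"
disk2 = b"photos"
disk7 = b"movies"

# Parity computed while disk7 was still healthy.
good_parity = xor_parity([disk1, disk2, disk7])

# With good parity, a dead disk7 is rebuilt exactly.
good_rebuild = rebuild(good_parity, [disk1, disk2])

# Now suppose the failing disk returns one wrong byte during a check,
# and a correcting check rewrites parity to match what was read.
bad_read = b"moviEs"
corrected_parity = xor_parity([disk1, disk2, bad_read])

# A later rebuild from that "corrected" parity bakes the bad byte in.
bad_rebuild = rebuild(corrected_parity, [disk1, disk2])

if __name__ == "__main__":
    print(good_rebuild, bad_rebuild)
```

In this model the parity that was good before the check is the only thing that can recover the original data, which is the reason for leaving it alone until the disk is replaced.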

 

 

 

Link to comment
21 hours ago, johnnie.black said:

Definitely don't, it might corrupt parity.

 

Currently, I have a 4TB drive as my parity. I just picked up an 8TB drive, and once initial testing has ruled out infant mortality, it will become my new parity and the existing 4TB will replace the failing 2TB disk7.


Replacing parity with a larger drive is simple - shut down the array, put the larger disk in, assign the biggest disk to the parity slot & let it rebuild parity. However, I'm not 100% convinced this is a good idea, because it's possible I may have a bad file or two on the failing drive.

 

Replacing the data drive is simple, except that I don't have a 4TB or smaller drive to replace it with.

 

I do have just enough space on other drives to be able to scatter the data from the failing disk to other disks, remove the failing disk from the array, then swap parity & add the former parity back into the array.

 

What is the best procedure to do what I need to do?

 

Link to comment

You want the "parity swap" procedure ... https://lime-technology.com/wiki/The_parity_swap_procedure

 

This procedure copies your existing parity to the new (larger) disk.  When that is done it rebuilds the failed data drive onto the disk that was the old parity disk.
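The two phases described above can be modelled in a few lines of Python. This is a toy sketch under stated assumptions: parity is treated as a simple byte-wise XOR, disks are tiny byte strings, and the extra space on the larger parity disk is assumed to end up zero-filled. `xor_blocks` and `parity_swap` are hypothetical names and do not reflect how Unraid implements the procedure internally.

```python
def xor_blocks(blocks, size):
    """XOR blocks together into a result of `size` bytes.

    Blocks shorter than `size` contribute zeroes past their end,
    which models smaller data disks under a larger parity disk.
    """
    out = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def parity_swap(old_parity, new_parity_size, surviving_data):
    """Toy model of the two phases of a parity swap.

    Phase 1: copy old parity onto the larger disk (tail assumed zero).
    Phase 2: rebuild the failed disk's contents onto the old parity disk.
    """
    new_parity = old_parity + bytes(new_parity_size - len(old_parity))
    rebuilt = xor_blocks([old_parity] + list(surviving_data), len(old_parity))
    return new_parity, rebuilt


# Made-up contents: sizes 4, 1 and 2 stand in for differently sized disks.
disk_a = b"\x11\x22\x33\x44"
disk_b = b"\x0f"
failed = b"\xaa\xbb"

# Old parity disk is 4 bytes, the new one is 8 bytes.
old_parity = xor_blocks([disk_a, disk_b, failed], 4)
new_parity, rebuilt = parity_swap(old_parity, 8, [disk_a, disk_b])

if __name__ == "__main__":
    print(rebuilt[:len(failed)] == failed)
```

Note that the rebuild in this model reads the old parity and the surviving disks only, never the failing disk, which matches the description of rebuilding the failed drive onto the former parity disk.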

 

Make sure you understand what this process involves, as you must do all the necessary steps. Ask for help from the experts here if you are unsure or need further clarification on the steps involved.

Link to comment

Thanks, @remotevisitor. I knew the instructions were out there somewhere.

 

The new drive has been tested and zeroed (as part of the testing) and parity swap is in progress at 13% complete on writing parity to the new drive.

 

Guess I should plan on another drive sooner rather than later to replace a few of the senior 1TB drives I've currently got.

Link to comment

If nothing else, you found a bug in CA Backup where it was still attempting to back up even though the destination didn't exist (because the array wasn't started). This had the effect of spamming your logs with all the errors about xattr.

Edited by Squid
Link to comment
11 minutes ago, Squid said:

If nothing else, you found a bug in CA Backup

 

Glad I could help! :)

 

Do I need to manually recreate all my shares, or are they likely to come back on their own after a reboot? I'm sitting at 56% complete on the drive rebuild right now, so I'm not rebooting any time soon, but if I need to manually recreate, I'll get started on it right now so the array is usable again.

Link to comment

Question still stands: Do I have to manually recreate the shares (in which case I'll figure everything out now), or will they recreate on a reboot (or just stop/start the array)?

 

2 hours ago, Squid said:

You still get the prize for the bug find.

 

What do I get? My very own, unautographed, digital image of @Squid in my thread? ;)

 

Link to comment
