Drive Clearly Hurting in Non-Correcting Parity Check; Run Correcting Check or Replace Now?



Figured I’d check in here before I embark on the plan I’ve put together from the threads I’ve read so far. An old disk in my array is showing the orange triangle in its dashboard status with reported uncorrect: 1, and my monthly non-correcting parity check is almost past the end of that drive’s capacity with 310 sync errors detected. The drive is running hot at 51 °C and heating the neighboring drives as well.

 

From what I can tell, I’m rolling the dice either way. If I run a correcting parity check to fix the 310 sync errors, the whole drive could die during that second pass. (I’m still not 100% sure it’ll survive this one, but the drive should spin down in an hour or so, once the check moves on to the portion covered only by the higher-capacity drives.) If I replace the drive, whatever errors I’m sitting on right now will be rebuilt into the replacement drive, which isn’t great, but feels way better than risking the entire drive’s worth of data. Nothing added recently is irreplaceable, but the drive as a whole would be a huge pain to repopulate.
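To make the trade-off above concrete, here is a minimal Python sketch of single-parity XOR reconstruction. It is an illustration under stated assumptions, not Unraid's actual code: the three tiny "disks", the helper names `xor_parity` and `rebuild_missing`, and the single flipped parity byte standing in for a sync error are all hypothetical. It shows that a rebuild reproduces the missing disk exactly where parity is in sync, and carries the mismatch into the replacement only where parity is out of sync.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(parity, surviving_blocks):
    """Reconstruct the one missing block from parity plus the surviving blocks."""
    return xor_parity([parity, *surviving_blocks])


# Three hypothetical data disks, four bytes each (assumed values).
disk1 = bytes([0x10, 0x20, 0x30, 0x40])
disk2 = bytes([0x01, 0x02, 0x03, 0x04])
disk3 = bytes([0xAA, 0xBB, 0xCC, 0xDD])  # the disk that will be replaced

good_parity = xor_parity([disk1, disk2, disk3])

# Simulate one sync error: the parity byte at index 2 no longer matches the data.
stale = bytearray(good_parity)
stale[2] ^= 0xFF
stale_parity = bytes(stale)

# Rebuild disk3 from in-sync parity and from out-of-sync parity.
rebuilt_ok = rebuild_missing(good_parity, [disk1, disk2])
rebuilt_stale = rebuild_missing(stale_parity, [disk1, disk2])

# Positions where the rebuilt disk differs from the original contents.
bad_positions = [i for i, (a, b) in enumerate(zip(disk3, rebuilt_stale)) if a != b]
print(rebuilt_ok == disk3, bad_positions)
```

In this toy model only the out-of-sync position comes back wrong on the rebuilt disk, and everything else is restored intact. Keep in mind that a non-correcting check only reports that parity and data disagree, not which side is wrong. The sketch simply assumes the parity is the stale side.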

 

So despite the sync errors on this parity check, based on what I’ve read, I’m making an educated gamble and planning to throw the replacement drive in there and start a build tomorrow morning when this parity check completes.

 

If I’m absolutely misreading the risks in either approach and should run another correcting parity check no matter what, I’d greatly appreciate anyone warning me off my current path!

 


Diagnostics attached. I was just about to start the replacement / rebuild this morning, but I'll leave this here and wait until I get home again before messing with anything, in case something in the log is throwing up flags I'm oblivious to with my limited knowledge.

 

Thanks!

 

EDIT: Just noticed my docker.img is sitting on the troubled disk (#8) somehow, instead of in the cache. Errors make more sense now since I hadn't been writing much of anything to that drive lately, but I've got a feeling I need to shift that file to my cache drive somehow. Going to research that process when I get home, too.

 

tower3-diagnostics-20190301-1725.zip

Edited by wheel

Yeah, the age alone has had this disk on my "watch list" for a while now. If the errors are related to my docker.img being on the problem drive, does that change the risks of running a correcting parity check vs just replacing the drive now?


I would guess a second check would most likely complete without read errors, but that's just an educated guess. You can also run an extended SMART test beforehand to confirm all is good. The error is typical of Seagate drives, where a UNC error is reported but on a bogus LBA address, so the media itself should be fine.
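For anyone who prefers the command line for the extended SMART test mentioned above, here is a rough Python sketch that only assembles the relevant `smartctl` invocations (from smartmontools) without running them. The device path `/dev/sdX` is a placeholder and the helper name `smart_commands` is hypothetical, so substitute the real device for the disk in question. Actually running the commands needs root and an installed `smartctl`.

```python
def smart_commands(device):
    """Build smartctl argument lists for an extended self-test workflow.

    `device` is a placeholder such as /dev/sdX (an assumption, not taken
    from the attached diagnostics).
    """
    return {
        # Start the extended (long) self-test; it runs on the drive itself.
        "start_extended_test": ["smartctl", "-t", "long", device],
        # Show the self-test log to see whether the test completed or failed.
        "selftest_log": ["smartctl", "-l", "selftest", device],
        # Show the vendor attribute table (where Reported_Uncorrect appears).
        "attributes": ["smartctl", "-A", device],
    }


commands = smart_commands("/dev/sdX")
for name, argv in commands.items():
    # Print each command as it would be typed in a shell.
    print(f"{name}: {' '.join(argv)}")
```

Each list could be passed to `subprocess.run` on the server itself. The sketch stops at printing so that nothing touches a drive by accident.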
