Pls Help: multiple "handle_strip read error" msgs on parity drive


I was just running a monthly parity check (no-correct) on my unRAID 4.7 system (MD1510/LI) and was about to shut down again after it had finished, when I noticed all these errors in the syslog (see attached)...

 

The parity check appears to have finished later and reported no errors, but I don't know how credible that result is now.

 

The drive in question appears to be the parity drive (disk0, 6XW024J9, scsi 11:0:0:0), which has been in service in my array since about Nov 2009 without any issues to date. FYI, I don't leave my array on 24/7; I just power it up and down periodically when I have some specific archiving activities to perform.

 

I'm also including the SMART info for the parity drive and would be very appreciative if somebody can tell me the way ahead from here... should I run more tests (which?) on the parity drive, or just replace it??

 

From what I can see, there is a Reported_Uncorrect count of 6 - this also shows in the unMENU Smart History page for this drive as ATA_Error_Count (6), where it has ALWAYS previously been 0.
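
For anyone wanting to pull the same counters out of a saved SMART report, here is a minimal Python sketch. It assumes the report contains the usual `smartctl -A` attribute table, where the raw value is the tenth column. The function names and the sample table are hypothetical: only the Reported_Uncorrect raw value of 6 comes from this thread, and the other rows are placeholders for illustration.

```python
# Sketch: extract raw values of a few health-related SMART attributes
# from the text of a smartctl attribute table.

WATCHED = (
    "Reallocated_Sector_Ct",
    "Reported_Uncorrect",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)

# Illustrative sample only. The Reported_Uncorrect raw value (6) is the
# figure quoted in the post; every other number here is a placeholder.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   094   094   000    Old_age   Always       -       6
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""


def watched_raw_values(smart_text):
    """Return {attribute_name: raw_value} for the watched attributes."""
    values = {}
    for line in smart_text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) >= 10 and parts[0].isdigit() and parts[1] in WATCHED:
            raw = parts[9]
            if raw.isdigit():
                values[parts[1]] = int(raw)
    return values


def nonzero_attributes(values):
    """Names of watched attributes whose raw value is above zero."""
    return sorted(name for name, raw in values.items() if raw > 0)


if __name__ == "__main__":
    print(nonzero_attributes(watched_raw_values(SAMPLE)))
```

Comparing these raw values between two saved reports (before and after a rebuild, say) shows whether any counter moved.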

 

It's long since past any kind of warranty, so there's no point in trying to pinpoint any specific issues for RMA or the like.

 

Very grateful in advance for any assistance!

 

Thanks,

m.

syslog-2013-05-01.txt

SMART_ST32000542AS_6XW024J9.txt

Link to comment

I'm trying to run a SMART long-test on the drive in the meantime in case the results of that prove illuminating... Unfortunately I think a spin-down timed it out, so I've set the timeout to Never and restarted the test. Again, the waiting.
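
A rough Python sketch of starting the long test and reading back the self-test log, assuming smartmontools' `smartctl` is on the PATH. The device path `/dev/sdX` and the helper names are placeholders, not anything from my system. As noted above, the drive's spin-down timeout should be disabled first, since a spin-down seems to abort the test.

```python
# Sketch: build and run the smartctl commands for an extended self-test.
import subprocess


def long_test_command(device):
    """Command that starts an extended (long) SMART self-test."""
    return ["smartctl", "-t", "long", device]


def selftest_log_command(device):
    """Command that prints the drive's self-test log."""
    return ["smartctl", "-l", "selftest", device]


def run(command):
    """Run a command and return its standard output as text."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    device = "/dev/sdX"  # placeholder: substitute the real device node
    print(run(long_test_command(device)))
    # The test runs inside the drive; check the log once it has had time to finish.
    print(run(selftest_log_command(device)))
```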

 

In the meantime, can anyone confirm the procedure for invalidating parity and forcing a re-write, so that I can recommission this same drive if it proves to be OK (maybe take it offline, run some pre-clear iterations and reinstate it)? Would that be a useful way to go?

Link to comment

Well, in the absence of any advice, I'm rebuilding the parity drive to see if that does any bad-block reallocations, or "just works".

 

I stopped the array, unassigned the parity drive. Restarted (without parity) to force it to forget the parity config. Stopped again. Reassigned the parity drive and re-started the array - it is currently performing a full parity sync.

 

Once it's done, I'll re-run the no-correct parity check and see if it hiccups again or not.

Link to comment

That is exactly what needs to be done to re-construct the drive. Basically, except for the parity errors or unreadable sectors, it is re-writing what is already there. The unreadable sectors will be either re-written in place or re-allocated.
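
To illustrate why a sync mostly re-writes what is already there, here is a toy Python sketch of single-parity arithmetic, assuming plain XOR parity across the data disks. The function names and the tiny byte strings are made up for illustration; the real md driver works on whole stripes inside the kernel.

```python
# Toy model: XOR parity over equal-length blocks from each data disk.
from functools import reduce


def compute_parity(data_blocks):
    """XOR the blocks together byte by byte and return the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_blocks))


def parity_check(data_blocks, stored_parity):
    """No-correct check: report offsets where stored parity is wrong, change nothing."""
    expected = compute_parity(data_blocks)
    return [i for i, (e, s) in enumerate(zip(expected, stored_parity)) if e != s]


def parity_sync(data_blocks):
    """Sync: recompute parity from the data disks; the result is written to the parity drive."""
    return compute_parity(data_blocks)


if __name__ == "__main__":
    disks = [b"\x01\x02", b"\x03\x04", b"\x10\x20"]
    parity = parity_sync(disks)
    print(parity.hex(), parity_check(disks, parity))
```

Where the old parity was already correct, the sync writes back identical bytes. Only positions that were wrong or unreadable end up with different contents, and the write itself is what gives the drive the chance to re-allocate a bad sector.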
Link to comment

Thanks a lot, Joe.

 

I'm a little confused as to why there were no short or long test errors.

What should I do if this rebuild/re-check doesn't throw up anything - i.e. NO reallocations or errors. Was it all just a hiccup or glitch in that case??

 

Any theories from the logs as to what the nature of these errors might have been?

 

I'll post back the results when the rebuild and re-check (no-correct) are done.

Link to comment

Would also be grateful if anybody can explain the nature of the original errors (first post syslog)... Were these transient failures that eventually succeeded (after a number of retries)?

 

If they were permanently unrecoverable failures, how is it that the parity check ultimately succeeded (and why were no reallocations apparently performed when rewriting the disk)?

 

 

EDIT - just for completeness, ran another short and long SMART test on the newly rebuilt drive (see attached).

SMART_ST32000542AS_6XW024J9.txt

Link to comment
