Parity is valid, yet it found 396 errors



OK, that's confusing. I ran a non-correcting parity check last night. One part of the web page tells me I have valid parity, but 'last check completed' says it found 396 errors. Is it telling me that there is a drive with errors, and that those errors would be recreated if a data drive were rebuilt?

 

I have about 100 lines of this in my system log:

Oct 25 18:38:53 Tower2 kernel: md: recovery thread: PQ incorrect, sector=81804808

Oct 25 18:38:53 Tower2 kernel: md: recovery thread: PQ incorrect, sector=81804824

Oct 25 18:38:53 Tower2 kernel: md: recovery thread: PQ incorrect, sector=81804840

.

.

Oct 25 18:38:53 Tower2 kernel: md: recovery thread: PQ incorrect, sector=81808136

Oct 25 18:38:53 Tower2 kernel: md: recovery thread: PQ incorrect, sector=81808152

 

That wording from the parity check is confusing. If I tick the box to correct errors and run it again, would that help anything?
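To get a quick overview of log lines like the ones above, a small script can count them and report the sector range they cover. This is only a sketch: the function name `summarize_sync_errors` and the regular expression are my own assumptions, based purely on the line format shown in this post, not on any unRAID tool.

```python
import re
from collections import Counter

# Pattern assumed from the syslog excerpt above, e.g.
# "md: recovery thread: PQ incorrect, sector=81804808"
SYNC_ERROR = re.compile(r"md: recovery thread: (\w+) incorrect, sector=(\d+)")

def summarize_sync_errors(lines):
    """Count sync-error lines and report the lowest and highest sector seen."""
    kinds = Counter()
    sectors = []
    for line in lines:
        match = SYNC_ERROR.search(line)
        if match:
            kinds[match.group(1)] += 1
            sectors.append(int(match.group(2)))
    if not sectors:
        return None
    return {
        "count": len(sectors),
        "by_kind": dict(kinds),
        "first_sector": min(sectors),
        "last_sector": max(sectors),
    }

# The five lines quoted in this post, used as sample input.
sample = [
    "Oct 25 18:38:53 Tower2 kernel: md: recovery thread: PQ incorrect, sector=81804808",
    "Oct 25 18:38:53 Tower2 kernel: md: recovery thread: PQ incorrect, sector=81804824",
    "Oct 25 18:38:53 Tower2 kernel: md: recovery thread: PQ incorrect, sector=81804840",
    "Oct 25 18:38:53 Tower2 kernel: md: recovery thread: PQ incorrect, sector=81808136",
    "Oct 25 18:38:53 Tower2 kernel: md: recovery thread: PQ incorrect, sector=81808152",
]

if __name__ == "__main__":
    print(summarize_sync_errors(sample))
```

Errors clustered in one narrow sector range, as in the excerpt, at least show that the mismatches are localized rather than spread across the whole array.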


I had a similar thing a while back, when I first implemented dual parity. A non-correcting parity check found a handful of errors (on review, a single error) in both P and Q. That suggested to me the possibility of a real data error, but the problem was how to find which data disk was affected. Some people argue that, with some complex maths, it's possible to work out which data disk is affected using only the two parity values, but I'm not yet completely convinced. Either way, unRAID does nothing about errors in both P and Q. I sought help and was persuaded that, since I didn't know whether some files had been subtly corrupted or whether there had been an error when the original parity values were calculated, my best option was to run a correcting parity check to at least bring the parity back into agreement with the data. I did that and it's been fine ever since. I'm struggling to find the link to the discussion at the moment, but if I succeed I'll add it.

 

EDIT: I found the link (http://lime-technology.com/forum/index.php?topic=48193.msg468265#msg468265) and refreshed my memory. It wasn't a handful of errors, but in fact just the one. Perhaps your problem is more serious. Post your diagnostics.
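On the question of whether P and Q together can point at one data disk: here is a minimal sketch of the usual RAID-6-style argument. It assumes P is the XOR of the data bytes and Q is a Reed-Solomon syndrome over GF(2^8) with generator 2 and polynomial 0x11d. Whether unRAID's dual parity uses exactly this arithmetic is an assumption on my part, and all function names here are hypothetical. If exactly one data byte is off by some error E, the P mismatch equals E and the Q mismatch equals g^z * E, so the disk index z can be recovered by trying each disk.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using the polynomial 0x11d (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result

def gf_pow(a, n):
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def compute_pq(data):
    """Return (P, Q) for one byte position across all data disks."""
    p = q = 0
    for index, byte in enumerate(data):
        p ^= byte
        q ^= gf_mul(gf_pow(2, index), byte)
    return p, q

def diagnose(data, stored_p, stored_q):
    """Classify a stripe byte: consistent, one data disk, or something else."""
    p, q = compute_pq(data)
    p_diff, q_diff = p ^ stored_p, q ^ stored_q
    if p_diff == 0 and q_diff == 0:
        return ("consistent", None)
    if p_diff == 0 or q_diff == 0:
        # Only one of P/Q disagrees: points at a parity disk, not data.
        return ("parity_only", None)
    for z in range(len(data)):
        if gf_mul(gf_pow(2, z), p_diff) == q_diff:
            return ("data_disk", z)
    return ("multiple_errors", None)

if __name__ == "__main__":
    good = [0x12, 0x34, 0x56, 0x78]
    p, q = compute_pq(good)
    bad = list(good)
    bad[2] ^= 0x5A  # corrupt the byte on data disk 2
    print(diagnose(bad, p, q))
```

Note the limit of this: data that was written without parity being updated (for example, during an unclean shutdown) produces the same signature as a corrupted data disk, so the maths can name a disk but cannot say whether the data or the parity is the stale side.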

 


"Parity valid" just means that parity was built successfully at the time and the parity disk is working. What you have are sync errors. Many things can cause them, most commonly unclean shutdowns. If there have been any since the last parity check or build, run a correcting check and all should be fine. If there were none and there's no apparent reason for the errors, you should run a correcting check anyway; if the next check finds more errors, you need to investigate what's causing them. It could be RAM, disks, the controller, etc.


Oh, I certainly do know of one unclean shutdown. A blinking clock radio in the house a few weeks ago tells me we lost power. Yet my other unRAID server has no parity errors. I fully expected that the drives had already spun down by the time of the power outage, since both systems were probably idle. OK, new mental note: it's a good idea to manually run a parity check after an unclean shutdown.


Your servers should be on UPS.
