Parity Errors - recovery thread: stopped logging


weirdcrap
Go to solution Solved by trurl,


UnRAID 6.11-RC4

 

I ran a parity check after replacing a failing disk and now I've got a bunch of parity errors. Last month's parity check had zero errors.

 

My general process when I get parity errors is to let the current check finish, noting the reported incorrect sectors. I then run a second non-correcting check to verify that the reported sectors are the same before I run a correcting check to fix them. If they aren't the same, I assume it's a memory problem and start checking my RAM rather than rewriting my parity.
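The comparison step in that process can be scripted. The following is a minimal sketch, not an Unraid tool: it assumes each incorrect sector is logged on a line containing something like `recovery thread: P incorrect, sector=<number>`. That line shape is an assumption, so adjust the pattern to whatever your syslog actually shows. The function names are hypothetical.

```python
import re

# ASSUMPTION: sector lines look like "... recovery thread: P incorrect, sector=1234".
# Check your own syslog and adjust this pattern if the wording differs.
SECTOR_RE = re.compile(r"recovery thread: \w+ incorrect, sector=(\d+)")
# This marker text is the line quoted in the post.
STOP_MARKER = "recovery thread: stopped logging"


def parse_check(lines):
    """Return (set of reported sectors, truncated flag) for one check's log lines."""
    sectors = set()
    truncated = False
    for line in lines:
        if STOP_MARKER in line:
            # Sector reporting stopped here, so the set is incomplete.
            truncated = True
            continue
        match = SECTOR_RE.search(line)
        if match:
            sectors.add(int(match.group(1)))
    return sectors, truncated


def compare_checks(first_lines, second_lines):
    """Compare the sectors reported by two parity checks."""
    first, first_truncated = parse_check(first_lines)
    second, second_truncated = parse_check(second_lines)
    return {
        "common": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
        "truncated": first_truncated or second_truncated,
    }
```

To use it, save the syslog lines for each check to a separate file and pass the results of `open(path).read().splitlines()` for each one. If `truncated` is true, the comparison only covers the sectors reported before logging stopped.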

 

Looking over my logs today I noticed this line: "recovery thread: stopped logging". As indicated, it no longer appears to be reporting sectors, even though the parity error count continues to increase.

 

Why was this done and how is it helpful to the end user? I do not like having errors suppressed for no good reason. Now when I run my second check I'll only be able to compare the errors up to whatever point the logging stops. Sure, having a few hundred sectors report the same is probably a good indication that it's not RAM flipping bits, but I'd still like to know every sector that generates an error, not just some of them.

 

Can I force UnRAID to log all reported sectors? How else can I go about ensuring my parity errors aren't caused by bad RAM before I overwrite my parity data with a correcting check?

void-diagnostics-20220918-1342.zip

  • Solution
3 minutes ago, weirdcrap said:

Looking over my logs today I noticed this line: "recovery thread: stopped logging". As indicated, it no longer appears to be reporting sectors, even though the parity error count continues to increase.

Parity errors could run into the millions if, for example, someone is checking invalid parity. That would fill the log space, and then nothing else could be logged. The lines it does log should be enough for you to see what you need.

 

Looks like you are having connection problems on disk15. I recommend dual parity with so many disks.

 

1 hour ago, trurl said:

Parity errors could run into the millions if, for example, someone is checking invalid parity. That would fill the log space, and then nothing else could be logged. The lines it does log should be enough for you to see what you need.

 

Looks like you are having connection problems on disk15. I recommend dual parity with so many disks.

 

I suppose that's fair, though I'd still like the ability to decide when enough logging is enough. With a fair amount of RAM, even millions of parity errors shouldn't fill up the log if I increase the size of the /var/log mount.
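As a rough sanity check on that claim, the arithmetic can be sketched like this. The 90 bytes per log line and the 128 MiB capacity in the usage note are illustrative assumptions, not measured Unraid values. Substitute the real line length from your syslog and the real size of your /var/log mount.

```python
def estimated_log_bytes(error_count, bytes_per_line=90):
    """Estimate log space needed if every parity error produced one log line.

    bytes_per_line is an ASSUMED average; measure a real line to refine it.
    """
    return error_count * bytes_per_line


def fits_in_log(error_count, log_capacity_bytes, bytes_per_line=90):
    """True if the estimated log volume fits within the given capacity."""
    return estimated_log_bytes(error_count, bytes_per_line) <= log_capacity_bytes
```

Under these assumptions, one million errors would need about 90 MB, so `fits_in_log(1_000_000, 128 * 1024 * 1024)` is true, while five million errors would need about 450 MB and would not fit in the same space. Anything else writing to the log reduces the headroom further.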

 

Yes, I always have intermittent connection issues during parity checks. I think it's the Norco bays I'm using. I plan on moving this into a new case soon and connecting the disks directly instead of through a shared power backplane as they are now. After that I plan on going dual parity.
