Inconsistent Parity Sync Errors



Hello, I have been having this issue for about a month now and I've tried everything I can think of to fix it, but here's my problem:

 

I have a non-correcting parity check that runs every morning at 1am and takes about 9 hours to complete. Once in a while I get 1-2 parity sync errors, which I know is not good, but I've looked all over the forums for ways to figure out what the problem is, to no avail. Yesterday, however, a parity check came back with 200 sync errors, which is pretty bad, so here I am. I also have not run any parity check with write corrections enabled.

 

I know this is probably a bad assumption, but I've been assuming that the parity sync errors are "read errors" and that the parity is actually correct, which could explain why the check returns no sync errors most days even with automatic correction turned off.

 

Things I've tried:

 - Ran memtest for 24 hours (no errors)

 - Disabled all plugins

 - File system check on all disks (no issues)

 - Extended SMART test on all disks (no issues)

 

Is there any way I can figure out which disk is having problems? I have no read/write errors on any of my disks so I wouldn't even know which one to swap out. Is this something I shouldn't worry about?

 

Thanks for reading. Any advice/help would be very useful, as I'm pretty new to Unraid.

 

Here is my syslog: syslog.txt

 

Here is the result of my parity checks:

1952411849_PartySync.PNG.53442ed615d619a91c728f45b35f16e4.PNG

 

Here is the current status of my disks (system uptime is 15 days, 10 hours):

image.thumb.png.e25fe39636d46bdfbde24747571b306b.png

 

 


@johnnie.black There is a parity check running right now that will be finished soon, so I will grab the diagnostics from that and then do another parity check right after. These should be non-correcting checks right? If these 2 consecutive tests return with no sync errors should I keep running until I get a run with errors? (The sync errors do not appear on every run.)

 

Thanks for the quick reply.

19 minutes ago, John Detter said:

These should be non-correcting checks right?

For now yes.

 

20 minutes ago, John Detter said:

If these 2 consecutive tests return with no sync errors should I keep running until I get a run with errors?

Yes, it would be good to compare whether the errors are in the same sectors or not.
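A minimal sketch of that comparison, for anyone who wants to script it. It assumes the syslog records each mismatch on a line containing `incorrect, sector=<number>`; that format is an assumption here, so check it against your own syslog and adjust the pattern if it differs. The function names and the sample lines are made up for illustration.

```python
import re

# Assumed log format: each mismatch appears on a line containing
# "incorrect, sector=<number>". Verify this against your own syslog.
SECTOR_RE = re.compile(r"incorrect, sector=(\d+)")


def extract_sectors(log_text):
    """Return the set of sector numbers reported as incorrect in log_text."""
    return {int(m.group(1)) for m in SECTOR_RE.finditer(log_text)}


def compare_runs(log_a, log_b):
    """Split the sectors from two checks into shared and run-specific lists."""
    a, b = extract_sectors(log_a), extract_sectors(log_b)
    return {
        "both": sorted(a & b),
        "only_first": sorted(a - b),
        "only_second": sorted(b - a),
    }


# Made-up sample lines, for illustration only.
run1 = """\
kernel: md: recovery thread: P incorrect, sector=1000
kernel: md: recovery thread: P incorrect, sector=2000
"""
run2 = """\
kernel: md: recovery thread: P incorrect, sector=2000
kernel: md: recovery thread: P incorrect, sector=3000
"""

result = compare_runs(run1, run2)
print(result)
```

Sectors that show up in both non-correcting runs point toward a real mismatch on disk, while sectors that change from run to run point toward something transient in the data path.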


(posting parity check result 1/2)

The day got away from me a bit, but here are the diagnostics from the run that just finished: unraid-server-diagnostics-20200710-1418.zip

 

There were 2000+ errors in my latest run:

image.png.f41d2b665c3c4badfe65cfcf89ca0ef2.png

 

I'm a bit surprised that there were so many sync errors; I've never had this many before. I'm running another non-correcting check now and will try to post the result later today when it finishes.

On 7/10/2020 at 12:27 PM, John Detter said:

Ran memtest for 24 hours (no errors)

First thing would be this again. I assume you ran it before, when the check was detecting 1 or 2 errors. Now it appears to be worse, and the errors are on different sectors, which is consistent with a RAM issue. With more errors, it might be easier to detect any RAM issue.


I ran memtest86 for over 24 hours with no issues. It's still running now, and I will probably keep it running until tomorrow. The only thing left I can think of is maybe swapping out the SATA cables? Also, this is non-ECC RAM, which I know is not ideal, but this is the second time I've run this test for over 24 hours and I've never had an issue.

 

I agree that a RAM issue would make a lot of sense, but it seems like memtest can't reproduce the problem. Is there anything else I should try before buying new RAM?

 

image.thumb.png.3bca0475e857b8f8adc00b3a9bcc2c95.png


Small update: I still got errors after removing half of the RAM. I've transplanted the system onto a different mobo/CPU combo that I had, and so far it has no sync errors. I will update again later today/tomorrow with the results of the parity check that's running right now. However, if I do get a small number of parity sync errors, it is possible that those are legitimate errors, right? I suppose I need to do 2 consecutive parity checks and see if the errors are consistent (either no errors, or the same number of errors in the same sectors)?


The first parity check resulted in 0 sync errors. Another parity check is in progress right now, but it looks like there are 3 errors so far, so it seems the problem hasn't gone away. I guess it has to be the RAM? The only things I moved to the new motherboard are the disks, RAM, and PSU. I'm going to run a parity check with the Kingston sticks in to see if I can get consistent results; I believe I was still getting sync errors with the Crucial sticks.


After taking out the Crucial sticks, I keep getting 1 sync error on the same sector, so now I'm running a correcting check, and then I'll run another parity check to see if I get 0 sync errors. It does seem to be one of the Crucial sticks causing the issue. I'm about 90% sure the problem is solved now; I just need to buy some more RAM.

 

Thank you everyone for the help! :)
