
How to correct Unraid Parity Check Errors

I know this has been discussed a lot, but I just built my first Unraid box a month or so ago and I am already getting parity check errors. I bought 5 new 18TB Exos hard drives. One arrived with SMART errors already, so I returned it and replaced it with a new one. Parity was rebuilt after I inserted the new drive. I then changed my settings to run a parity check at the start of each month. The first one finished last night with 528 errors. From other posts I have seen, that doesn't seem like a lot, but it is still concerning to me as this array has precious data. What can I do to fix these errors? Should I run a correcting check and see if more errors reappear Dec. 1?

nas-diagnostics-20231102-1155.zip

Edited by dak1220

Solved by JorgeB

  • Community Expert

Go to memtest86.com and get that memtest since you have ECC then test your RAM.

  • Author
33 minutes ago, trurl said:

Go to memtest86.com and get that memtest since you have ECC then test your RAM.

I have this running now. From what I have seen, with 64GB this may take some time. I will come back when I have something from it. Thank you for your help.

  • Author
20 hours ago, trurl said:

Go to memtest86.com and get that memtest since you have ECC then test your RAM.

So the memtest completed with 0 errors. Is it possible the original failing drive caused some of this? If I recall, one of the issues was some unreadable sectors. Maybe bad data got written to parity or something? If there is anything else to try, please let me know. I would like to get these errors to 0.

  • Community Expert

The log shows constant memory errors being corrected. Try with just one stick of RAM; if the errors are still logged, try the other one.

  • Author
8 minutes ago, JorgeB said:

The log shows constant memory errors being corrected. Try with just one stick of RAM; if the errors are still logged, try the other one.

Try the memtest again? Or do you mean the parity check?

  • Community Expert

Just use the server normally, you can run a parity check, and check if those errors are still being logged
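One way to check whether those errors are still being logged is to filter the syslog for corrected-memory-error lines. This is a minimal sketch: the regular expressions and the sample lines are assumptions for illustration, since the exact wording depends on the kernel and the EDAC/MCE driver in use, and they are not taken from the attached diagnostics.

```python
import re

# Assumed patterns: adjust these to match what your own syslog actually shows.
ECC_PATTERNS = [
    re.compile(r"EDAC .*\bCE\b", re.IGNORECASE),
    re.compile(r"corrected (memory|hardware) error", re.IGNORECASE),
    re.compile(r"machine check", re.IGNORECASE),
]

def find_ecc_lines(lines):
    """Return the log lines that match any of the assumed ECC/MCE patterns."""
    return [ln for ln in lines if any(p.search(ln) for p in ECC_PATTERNS)]

# Made-up sample lines, for illustration only
sample = [
    "Nov  3 10:01:02 nas kernel: EDAC MC0: 1 CE on mc#0csrow#1channel#0",
    "Nov  3 10:01:05 nas kernel: mce: [Hardware Error]: Machine check events logged",
    "Nov  3 10:02:00 nas emhttpd: spinning down /dev/sdb",
]
hits = find_ecc_lines(sample)
print(len(hits), "matching line(s)")
for line in hits:
    print(line)
```

To run it against a real log, replace `sample` with the lines of the syslog file, for example `open(path).read().splitlines()`, where `path` points at the live syslog or at the copy inside a diagnostics zip. Comparing the count before and after swapping a RAM stick shows whether the change made a difference.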

  • Author
On 11/3/2023 at 10:59 AM, JorgeB said:

Just use the server normally, you can run a parity check, and check if those errors are still being logged

It looks like both sticks of memory are giving that ECC memory error in the logs. I have another set of RAM that isn't ECC. I assume I should try it? Can memory errors cause the parity check to show errors?

  • Community Expert

If the errors are being corrected, they should not cause sync issues, but that's still not normal. It's also unclear to me how well ECC RAM is supported with Ryzen and any specific board. Try the other RAM.

  • Author
23 hours ago, JorgeB said:

If the errors are being corrected, they should not cause sync issues, but that's still not normal. It's also unclear to me how well ECC RAM is supported with Ryzen and any specific board. Try the other RAM.

So I installed the other RAM. It seems like the memory errors are missing from the logs now, but I still got the 528 errors on a parity check I ran overnight. I have attached new diagnostics.

nas-diagnostics-20231107-0825.zip

  • Community Expert
  • Solution

Run a correcting check, then a non-correcting one without rebooting. If the second one finds new errors, post new diagnostics.
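To illustrate why this two-pass procedure is informative, here is a toy model of single parity as a plain XOR across data disks. It is a simplified sketch, not Unraid's actual implementation: the function names, the tiny integer "blocks", and the simulated stale positions are all made up for the example.

```python
from functools import reduce

def xor_parity(blocks):
    """XOR the same-position blocks from every data disk."""
    return reduce(lambda a, b: a ^ b, blocks)

def parity_check(data_disks, parity, correct=False):
    """Count positions where stored parity differs from computed parity.

    With correct=True the stored parity is overwritten, as a correcting
    check would do in this model; the data disks are never modified.
    """
    errors = 0
    for i, stripe in enumerate(zip(*data_disks)):
        expected = xor_parity(stripe)
        if parity[i] != expected:
            errors += 1
            if correct:
                parity[i] = expected
    return errors

# Toy array: three data disks with four small integer "blocks" each
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
parity = [xor_parity(stripe) for stripe in zip(*data)]
parity[1] ^= 0xFF  # simulate stale parity at two positions
parity[3] ^= 0x0F

first = parity_check(data, parity, correct=True)    # correcting pass
second = parity_check(data, parity, correct=False)  # non-correcting pass
print(first, second)  # 2 0
```

In this model, a clean second pass means the first pass brought parity back in line with the data and nothing changed in between. If the second pass still reported errors, something would be altering what is read or written between passes, which is the case where new diagnostics are worth posting.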

  • Author
17 minutes ago, JorgeB said:

Run a correcting check, then a non-correcting one without rebooting. If the second one finds new errors, post new diagnostics.

I am running a correcting check now. I assume there might be a small risk of files being corrupted? My parity checks take just over 24 hours, so 2 in a row will take some time. I can report back after both checks have been run. Thank you for your help.

  • Community Expert
24 minutes ago, dak1220 said:

I assume there might be a small risk of files being corrupted?

There is a small chance some files could already be corrupt, but the previous check finding the same errors as before is a good sign, and most likely parity is just out of sync.
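The reasoning here is that repeatable errors point to stale parity, while errors that move around between runs point to a hardware problem. In this thread only the totals were compared (528 both times). As a rough sketch, assuming the positions of the mismatches can be collected from the log for each run, the same comparison can be made per position. The function name and the sector numbers below are hypothetical.

```python
def classify_runs(run_a, run_b):
    """Compare the mismatch positions reported by two non-correcting checks."""
    a, b = set(run_a), set(run_b)
    if not a and not b:
        return "clean"          # neither run found anything
    if a == b:
        return "repeatable"     # same positions both times: parity likely stale
    return "inconsistent"       # positions differ: suspect RAM, cables, controller

# Hypothetical sector numbers, for illustration only
october = [1024, 2048, 4096]
november = [4096, 1024, 2048]
print(classify_runs(october, november))          # repeatable
print(classify_runs(october, [1024, 2048, 8192]))  # inconsistent
```

A "repeatable" result is the good sign described above: a single correcting check should clear it. An "inconsistent" result would suggest fixing the hardware first, before letting a correcting check write to parity.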

  • Author
On 11/7/2023 at 9:42 AM, JorgeB said:

There is a small chance some files could already be corrupt, but the previous check finding the same errors as before is a good sign, and most likely parity is just out of sync.

So the first parity check finished and corrected the 528 errors I have been having. I then ran a second, non-correcting check; it finished a few minutes ago and found 0 errors. While the change in RAM didn't seem to make a difference, I assume I should probably not use the RAM that was causing errors anymore? Are there any further steps I should take to make sure this is resolved?

  • Community Expert
24 minutes ago, dak1220 said:

Are there any further steps I should take to make sure this is resolved?

I would wait for the next scheduled check, and if there are still no errors, consider it resolved.

  • Author
2 minutes ago, JorgeB said:

I would wait for the next scheduled check, and if there are still no errors, consider it resolved.

My next scheduled check is the 1st of next month. If I have no errors then, I will consider it fully resolved. For now, though, I will mark your answer as the solution and reopen something later on if necessary. Thank you for your help.
