Quarterly parity check has detected errors


Go to solution Solved by JorgeB,


I'm quite new to unRAID so any help you can offer would be greatly appreciated. Logs are attached.

 

My first quarterly parity check detected some errors (3). Things had been running smoothly until that point with mostly new hardware. A faulty UPS meant some unclean shutdowns so I thought that may have been the cause.

 

I ran a correcting parity check which picked up five errors.

 

Next I ran another non-correcting check which has picked up four errors. The sectors seem to be the same?

 

Hard drives are connected directly to the motherboard.

 

I have

- run extended SMART and all disks passed

- currently running memtest86+ but clear so far after a few hours

- switched the SATA cables to new ones

 

Assuming the memtest is clear overnight, what should I do next?

 

Edited by grapefruitevening
  • Solution
Apr 12 06:10:23 tower kernel: md: recovery thread: P incorrect, sector=11515649200
Apr 12 06:24:24 tower kernel: md: recovery thread: P incorrect, sector=11797234176
Apr 13 07:55:05 tower kernel: md: recovery thread: P incorrect, sector=19001710760


Apr 13 21:12:23 tower kernel: md: recovery thread: P corrected, sector=9239094240
Apr 13 23:04:06 tower kernel: md: recovery thread: P corrected, sector=11515649200
Apr 14 00:26:27 tower kernel: md: recovery thread: P corrected, sector=13144112592
Apr 14 01:31:14 tower kernel: md: recovery thread: P corrected, sector=14377903704
Apr 14 01:38:38 tower kernel: md: recovery thread: P corrected, sector=14514929456


Apr 14 22:04:40 tower kernel: md: recovery thread: P incorrect, sector=9239094240
Apr 15 01:18:45 tower kernel: md: recovery thread: P incorrect, sector=13144112592
Apr 15 02:58:28 tower kernel: md: recovery thread: P incorrect, sector=14377903704
Apr 15 04:09:49 tower kernel: md: recovery thread: P incorrect, sector=14514929456

 

Not all sectors are the same. Some might have been wrongly corrected, which is why they were detected again. This suggests a hardware issue, most commonly RAM related.
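For anyone who wants to compare the runs without eyeballing sector numbers, here is a rough Python sketch. The names `sectors`, `run1`, `run2` and `run3` are made up for illustration, and the regex assumes the syslog lines look exactly like the excerpts above.

```python
import re

# Assumed line shape, copied from the syslog excerpts in this thread.
PARITY_RE = re.compile(r"P (?:incorrect|corrected), sector=(\d+)")

def sectors(log_text):
    """Collect the sector numbers reported in one parity-check log excerpt."""
    return {int(n) for n in PARITY_RE.findall(log_text)}

# First (non-correcting) check.
run1 = """\
Apr 12 06:10:23 tower kernel: md: recovery thread: P incorrect, sector=11515649200
Apr 12 06:24:24 tower kernel: md: recovery thread: P incorrect, sector=11797234176
Apr 13 07:55:05 tower kernel: md: recovery thread: P incorrect, sector=19001710760
"""

# Second (correcting) check.
run2 = """\
Apr 13 21:12:23 tower kernel: md: recovery thread: P corrected, sector=9239094240
Apr 13 23:04:06 tower kernel: md: recovery thread: P corrected, sector=11515649200
Apr 14 00:26:27 tower kernel: md: recovery thread: P corrected, sector=13144112592
Apr 14 01:31:14 tower kernel: md: recovery thread: P corrected, sector=14377903704
Apr 14 01:38:38 tower kernel: md: recovery thread: P corrected, sector=14514929456
"""

# Third (non-correcting) check.
run3 = """\
Apr 14 22:04:40 tower kernel: md: recovery thread: P incorrect, sector=9239094240
Apr 15 01:18:45 tower kernel: md: recovery thread: P incorrect, sector=13144112592
Apr 15 02:58:28 tower kernel: md: recovery thread: P incorrect, sector=14377903704
Apr 15 04:09:49 tower kernel: md: recovery thread: P incorrect, sector=14514929456
"""

# Sectors "corrected" in run 2 that were flagged again in run 3.
repeated = sectors(run2) & sectors(run3)
# Sectors flagged in run 1 that the correcting run never reported.
never_seen_again = sectors(run1) - sectors(run2)

print("corrected, then flagged again:", sorted(repeated))
print("flagged once, never reported again:", sorted(never_seen_again))
```

On these logs, all four sectors flagged in the third run had already been "corrected" in the second, and two sectors from the first run never show up again. Errors that move around like this point at the hardware rather than at the data on disk.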

9 hours ago, JorgeB said:

Not all sectors are the same. Some might have been wrongly corrected, which is why they were detected again. This suggests a hardware issue, most commonly RAM related.

 

Is this the sort of error memtest should pick up? So far it hasn't picked up anything but I'll leave it running for 24 hours. Is there something else I can do to nail down the cause?

 

Is the best approach from here to replace the RAM, run a correcting check and then a non-correcting check to make sure the problem is solved?

20 hours ago, JorgeB said:
Apr 14 22:04:40 tower kernel: md: recovery thread: P incorrect, sector=9239094240
Apr 15 01:18:45 tower kernel: md: recovery thread: P incorrect, sector=13144112592
Apr 15 02:58:28 tower kernel: md: recovery thread: P incorrect, sector=14377903704
Apr 15 04:09:49 tower kernel: md: recovery thread: P incorrect, sector=14514929456

 

1 hour ago, JorgeB said:

Usually yes, but it's not a guarantee. If it doesn't find anything, remove one of the RAM sticks and run a couple of checks. If the errors are still not consistent, try the other stick.

 

I ran memtest for more than 24 hours and no errors. I've swapped out the RAM for a spare set I had and running a new check.

 

What we hope to see is the same four sectors from the last check?

On 4/17/2023 at 9:57 PM, JorgeB said:

The first check will likely find some errors. If it's a correcting check, the 2nd check should find 0; if it's non-correcting, the 2nd check should find the same ones.

 

1st non-correcting check detected the same 4 errors.

2nd correcting check corrected the same 4 errors.

3rd non-correcting check detected 0 errors.

 

So it seems like the RAM was the issue. I'll run another check in a week or two to confirm.
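The rule of thumb quoted above (a correcting check should be followed by 0 errors, a non-correcting check by the same errors) can be written down as a small Python sketch. The function names are hypothetical, and it assumes "the same 4" means the four sectors from the Apr 14-15 run.

```python
def expected_next_check(previous_errors, previous_was_correcting):
    """Sectors a healthy system should report on the check that follows."""
    # A correcting check fixes parity, so nothing should remain afterwards.
    # A non-correcting check changes nothing, so the same sectors should recur.
    return set() if previous_was_correcting else set(previous_errors)

def looks_consistent(previous_errors, previous_was_correcting, next_errors):
    """True if the follow-up check matches what a healthy system would report."""
    expected = expected_next_check(previous_errors, previous_was_correcting)
    return set(next_errors) == expected

# The four sectors from the last check before the RAM swap.
four = {9239094240, 13144112592, 14377903704, 14514929456}

# After the RAM swap: non-correcting -> correcting -> non-correcting.
after_swap = [
    looks_consistent(four, False, four),   # 1st check repeats the same 4
    looks_consistent(four, False, four),   # 2nd check corrects the same 4
    looks_consistent(four, True, set()),   # 3rd check finds 0
]

# Before the swap: the correcting run was followed by 4 errors instead of 0.
corrected_before = four | {11515649200}
before_swap = looks_consistent(corrected_before, True, four)

print("after swap consistent:", all(after_swap))
print("before swap consistent:", before_swap)
```

With the old RAM the check after a correcting run still found errors, which fails this rule; with the spare set, every step matches it.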

 

Thank you very much for your help tracking down the problem!
