mraneri

[Solved] Replaced failed data drive - "Parity check completed ... finding 183141001 errors"


Update: After the parity rebuild, the old parity drive failed. After replacing and rebuilding the second drive, there are 0 errors.

I have a small server with 3x3TB data drives, 1x3TB parity, and 2x240GB SSD cache. It has run fine for the past couple of years, but last week one of the data drives failed.

It showed SMART failures, so UnRaid kicked it out of the array and started emulating it.

The only spare I had was a 4TB drive, which is larger than the existing parity drive, so I had to use it to replace the parity drive. It passed preclear, so I set it as parity, moved the old parity drive to the array, and let UnRaid start copying the parity and rebuilding the data drive. Everything went smoothly, and the parity check after the rebuild returned 0 errors.

This morning I got a notification that a parity check had started. It wasn't scheduled and I don't know what triggered it. I let it run its course and it finished with 183141001 errors. This is the first time in the two years the server has been running that a parity check has reported any errors, and I am at a loss as to what happened or what I should do next. As of right now everything on the server seems to be working normally.

Thank you for any insight or advice.

Capture.PNG

tower-diagnostics-20191009-2213.zip


Your server restarted this morning. Here is the first line in the syslog:

Oct  9 02:28:12 Tower kernel: microcode: microcode updated early to revision 0x27, date = 2019-02-26

You are also getting segfaults near the end of the syslog. I believe these are usually memory related. You might want to run Memtest (from the boot menu) unless you have ECC memory. I would also double-check that you didn't knock any of the memory sticks loose while you were doing the drive changes.

7 hours ago, mraneri said:

it wasn't scheduled and I don't know what triggered it.

It was triggered by an unclean shutdown:

Oct  9 02:29:05 Tower emhttpd: unclean shutdown detected

I agree with Frank1940 that you should run memtest.
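
For anyone who wants to look for the same clues in their own diagnostics, here is a minimal sketch of a syslog scan. It assumes the syslog has been extracted from the diagnostics zip as plain text. The "unclean shutdown detected" pattern comes from the line quoted above; the "segfault" pattern is an assumption about how the kernel words those messages, and `scan_syslog` is a made-up helper name, not an Unraid tool.

```python
import re

# Patterns to look for. The first is taken from the log line quoted in this
# thread; the second is an assumed substring for kernel segfault messages.
PATTERNS = {
    "unclean_shutdown": re.compile(r"emhttpd: unclean shutdown detected"),
    "segfault": re.compile(r"segfault"),
}


def scan_syslog(lines):
    """Return {label: [matching lines]} for each pattern in PATTERNS."""
    hits = {label: [] for label in PATTERNS}
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[label].append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # Two lines quoted in this thread, used as sample input.
    sample = [
        "Oct  9 02:28:12 Tower kernel: microcode: microcode updated early "
        "to revision 0x27, date = 2019-02-26",
        "Oct  9 02:29:05 Tower emhttpd: unclean shutdown detected",
    ]
    for label, found in scan_syslog(sample).items():
        print(label, len(found))
```

To run it against a real file, pass an open file handle instead of the sample list, for example `scan_syslog(open("syslog.txt"))`, where `syslog.txt` is a placeholder path.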


Not ECC memory. I'm running Memtest now and will report back. Thank you both.

Almost seven hours, four passes, zero errors.


With so many sync errors, most likely something went wrong during the disk replacement, but it's difficult to guess what without logs covering that period. Still, run Memtest for 24 hours, and if there are no errors, run another parity check to see whether you get the same number of errors.


Likely the problem occurred during the replacement, but without the diags from that time, the best option now is probably to run a correcting check. There could still be corruption on the rebuilt disk, though, so if you have checksums of your files, run a check against them.
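
To illustrate why a correcting check clears the sync errors without repairing the rebuilt disk, here is a toy sketch. It assumes single XOR parity, and the byte values and the helper names (`xor_blocks`, `count_sync_errors`) are invented for the example; it is not a model of what actually happened on this server.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR across equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def count_sync_errors(data_disks, parity):
    """Count byte positions where stored parity disagrees with the data."""
    expected = xor_blocks(data_disks)
    return sum(1 for e, p in zip(expected, parity) if e != p)


# Made-up contents for three tiny "data disks".
d1, d2, d3 = b"\x0f\xf0\x11", b"\x33\x55\x22", b"\xaa\x01\x44"
parity = xor_blocks([d1, d2, d3])

# A healthy rebuild of d2 from the other disks plus parity is exact.
assert xor_blocks([d1, d3, parity]) == d2

# Hypothetical bad rebuild: one byte of the rebuilt disk came out wrong.
rebuilt_d2 = b"\x33\x54\x22"
errors = count_sync_errors([d1, rebuilt_d2, d3], parity)

# A correcting check rewrites parity to match the data as it now is.
corrected_parity = xor_blocks([d1, rebuilt_d2, d3])
assert count_sync_errors([d1, rebuilt_d2, d3], corrected_parity) == 0

# The data itself is untouched, so the rebuilt disk is still wrong.
assert rebuilt_d2 != d2
print("sync errors before correction:", errors)
```

Under these assumptions, a clean parity check after a correcting run only says that parity and data agree with each other, which is why file checksums are the suggested way to confirm the data itself.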

28 minutes ago, johnnie.black said:

Likely the problem was during the replacement, but without the diags best option now is probably to run a correcting check, but there could be corruption on the rebuilt disk, if you have checksum of your files run a check.

That was with write corrections checked.

While researching this I just learned about making checksums for files, and I have installed the file integrity plugin.

Is there anything else I can do from here?

11 hours ago, mraneri said:

Is there anything else I can do from here?

Without previous diags or checksums, I can't think of anything else, unless you want to check the files on the rebuilt disk one by one.
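
For future incidents, the general idea behind keeping checksums can be sketched as below. This is not how the file integrity plugin is implemented; it is a minimal stand-alone illustration using SHA-256, and the function names (`sha256_of`, `build_manifest`, `verify_manifest`) are made up for the example. It only helps if the manifest was built while the files were known to be good.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return the relative paths that are missing or whose hash changed."""
    root = Path(root)
    problems = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            problems.append(rel)
    return sorted(problems)
```

A possible workflow would be to call `build_manifest` on a disk's mount point while the array is healthy, store the result somewhere off that disk, and after a rebuild call `verify_manifest` with the same root to list any files that no longer match.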
