CurlyBen Posted May 4, 2022

Hi everyone, I've got a bit of a problem! I recently shut down my server (using the GUI) to remove two unassigned drives, and when I booted up again a parity check started. I think there may be an intermittent issue when booting which then triggers the parity check; I've not got to the bottom of it.

Anyway, parity check, no problem... except it immediately started showing errors. It's still running (non-correcting mode) but, with over half a million sync errors by 30%, I clearly have a problem. There's nothing obviously wrong (to me, anyway) in the SMART data, and I'm not aware of an unclean shutdown since the last successful parity check (late February). What I've read so far also suggests an unclean shutdown wouldn't cause this massive number of errors. Are there any other likely culprits? Or do I have a disk that is failing in a way SMART doesn't detect?

I don't know if it's relevant, but most of the recent file changes on the array have been adding/moving media around. Is it likely the errors will be in this data (which is easily replaceable) or spread throughout the array? I think I have good copies of all my most important data, but I'm not so confident that I want to test it!

Logs are attached but, as mentioned above, the parity check is still running, so I don't know if anything will be included yet. The array is 4x 8TB drives: 1x parity and 3x data.

Attachment: tower-diagnostics-20220504-2331.zip
JorgeB Posted May 5, 2022

Run memtest. If no errors are found after a couple of passes, run a correcting check followed by a non-correcting one. If there are errors on the second run, post new diags without rebooting.
CurlyBen (Author) Posted May 5, 2022

54 minutes ago, JorgeB said: "Run memtest, if no errors are found after a couple of passes run a correcting check followed by a non correcting one, if there are errors on the 2nd run post new diags without rebooting."

Thanks Jorge. I'm assuming that if the errors are on a data drive, rather than the parity drive, a correcting check will lock the errors in by rewriting parity to match the bad data. So if that's what you're suggesting, is there no way to correct them anyway?
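To illustrate the concern in the post above, here is a minimal sketch of single parity modelled as a byte-wise XOR across the data disks. The function names and the tiny two-byte "disks" are hypothetical, made up for the example; this is not Unraid's actual code. It shows that a correcting check recomputes parity from whatever the data disks currently hold, so a later rebuild returns the corrupted bytes rather than the original ones.

```python
from functools import reduce
from operator import xor


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def has_sync_error(data_blocks, parity_block):
    """A check only compares stored parity with parity computed from the data."""
    return xor_parity(data_blocks) != parity_block


# Three toy data disks and the parity computed while everything was healthy.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_parity(data)

# Silent corruption on the second data disk (one bit flipped).
corrupted = [data[0], b"\x33\x32", data[2]]

# A correcting check rewrites parity to agree with the current data.
new_parity = xor_parity(corrupted)

# Rebuilding the second disk from the other disks plus the new parity
# reproduces the corrupted contents, not the original ones.
rebuilt = xor_parity([corrupted[0], corrupted[2], new_parity])
```

In this sketch the check reports a mismatch before the correction and none afterwards, yet `rebuilt` equals the corrupted block. The mismatch alone also does not say which of the four blocks is the wrong one.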
JorgeB Posted May 5, 2022

Unless you have checksums for the existing files, or were using btrfs, there's no way to know if the problem is with parity or data. We can only try to find the current issue and put parity back in sync.
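As a rough illustration of the checksum idea mentioned above, the following sketch records a SHA-256 digest for every file under a directory and later reports the files whose contents no longer match. The function names are hypothetical and it uses only the Python standard library; it is not a specific Unraid plugin or feature.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large media files are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            manifest[os.path.relpath(path, root)] = file_sha256(path)
    return manifest


def changed_files(root, manifest):
    """Return paths from the manifest that are now missing or hash differently."""
    current = build_manifest(root)
    return sorted(p for p, digest in manifest.items() if current.get(p) != digest)
```

A manifest built while the array is known to be healthy (for example with `build_manifest("/mnt/disk1")`, where the path is only an example) could later be passed to `changed_files` to see which files differ, which would point at a data disk rather than parity. Files that were edited on purpose since the manifest was built will also show up, so the manifest needs refreshing after legitimate changes.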
CurlyBen (Author) Posted May 5, 2022

Thanks Jorge. I'm currently running extended SMART tests and I'll do a filesystem check too. Assuming they all come back clean, I'll do as you advised in your first post.