TrondHjertager Posted November 5, 2020 Hi! I've suddenly started to get parity errors after parity checks. The server has been running for over a year with no errors, and now they keep appearing. First there was one, then 4, then 6. Now I'm up to 30. I've done an extended SMART check on all the disks. One of my 14 disks has shown 2 reallocated sectors for about 6 months now, so this is nothing new. Nothing else showed up during the scans. I have 14 disks with dual parity. I am not sure where I should start to solve this problem. I've searched around, but couldn't find anything useful. I am new to Unraid. I've attached the diagnostics file. Thanks in advance! ❤️ unraid02-diagnostics-20201105-1943.zip
JorgeB Posted November 5, 2020 Start by running memtest.
TrondHjertager Posted November 19, 2020 On 11/5/2020 at 7:58 PM, JorgeB said: "Start by running memtest." How long should I let the memtest run? 12 hours? 24? After another parity check this week, the error count is now up to 52. I'm not sure how this would affect a rebuild if one of my disks goes down. Will errors in the parity corrupt the whole parity, or will it just leave some files corrupt after a rebuild?
JorgeB Posted November 19, 2020 Close to 24 hours if possible, the longer the better. 45 minutes ago, TrondHjertager said: "Not sure how this would affect a rebuild if one of my disks goes down. Will errors in the parity corrupt the whole parity, or will it just leave some files corrupt after a rebuild?" It would most likely corrupt a few files during a rebuild.
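To illustrate why a handful of parity errors tends to damage only a few files rather than the whole rebuild, here is a minimal sketch using plain XOR parity. It is a simplified model under stated assumptions: three made-up 8-byte "disks" and a single parity block, not Unraid's actual dual-parity implementation. The names `xor_blocks`, `disks`, and `bad_parity` are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, 8 bytes each (illustrative data only).
disks = [b"AAAAAAAA", b"BBBBBBBB", b"CCCCCCCC"]
parity = xor_blocks(disks)

# Simulate a single parity error: one wrong parity byte at offset 3.
bad_parity = bytearray(parity)
bad_parity[3] ^= 0xFF
bad_parity = bytes(bad_parity)

# Disk 1 fails; rebuild it from the surviving disks plus the faulty parity.
rebuilt = xor_blocks([disks[0], disks[2], bad_parity])

# Only the offset covered by the bad parity byte comes back wrong.
wrong_offsets = [i for i, (a, b) in enumerate(zip(rebuilt, disks[1])) if a != b]
print(wrong_offsets)  # [3]
```

In this model each parity position protects only the matching position on every disk, so a wrong parity value affects just the data rebuilt at that position, and whichever file happens to occupy it.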
TrondHjertager Posted December 14, 2020 So, I finally did a memtest, and it didn't find any errors. But now my parity checks are running amok. Anyone got any tips for the next step in trying to identify the problem here? Diagnostics attached. unraid02-diagnostics-20201214-0924.zip
JorgeB Posted December 14, 2020 If it's not the RAM, it will be harder to track down. It can be the controller, the board/CPU, or just one of the disks. See here for some troubleshooting tips on a recent case.
TrondHjertager Posted December 15, 2020 (edited December 15, 2020) I am thinking of first swapping out the SFF-8087 cables and the power cables for all my drives. While doing this, I will also disconnect all the disks from the backplane. That way I can rule out three potential causes in one go. My question now is: if any of these three things is causing the errors, will the parity check then correct the errors automatically? Or will it just report the same number of errors as my last check?
JorgeB Posted December 15, 2020 Even if the problem is fixed, the first check after you change something might still correct errors, so you should always run it twice. If the second check finds more errors, it's not fixed yet.
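The two-check rule above can be written down as a small decision helper. This is only a sketch of the reasoning in the post: the function name `verdict` and the idea of passing the error counts as a list are assumptions for illustration, not anything Unraid provides.

```python
def verdict(check_errors):
    """Interpret error counts from consecutive parity checks run after a hardware change.

    check_errors[0] is the first check after the change; it may still correct
    errors left over from before, so it proves nothing on its own.
    check_errors[1] is the second check, which decides the outcome.
    """
    if len(check_errors) < 2:
        return "inconclusive: run a second check"
    return "looks fixed" if check_errors[1] == 0 else "not fixed yet"

# Example with the thread's last reported count (52) followed by made-up second results.
print(verdict([52]))      # inconclusive: run a second check
print(verdict([52, 0]))   # looks fixed
print(verdict([52, 7]))   # not fixed yet
```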
TrondHjertager Posted December 16, 2020 So I think I might have more serious problems than I originally thought. Sometime within the last 12 hours my server crashed completely. Everything froze, and I couldn't get so much as a ping reply from it. I had to do a hard power-off. Just before I shut the server down, I got this message on the server screen: Anyone got any idea? New diagnostics attached. unraid02-diagnostics-20201216-0846.zip