November 24, 2016 The monthly parity check ran recently and reported a large number of errors:
Event: unRAID Parity check
Subject: Notice [TOWER] - Parity check finished (244153054 errors)
Description: Duration: 11 hours, 28 minutes, 20 seconds. Average speed: 48.4 MB/s
So I ran it again, thinking that since a check had just run, the parity errors should have been dealt with, but on the second run I got:
Event: unRAID Parity check
Subject: Notice [TOWER] - Parity check finished (244153054 errors)
Description: Duration: 11 hours, 58 minutes, 21 seconds. Average speed: 46.4 MB/s
However, I notice that the number did not change when I ran it again a third time. Getting numbers like this makes me think that I have something set wrong on my system. Or do I need to clear something? I am running 6.2.4 and have a 2 TB parity disk and 8 other disks ranging in size from 7500 GB to 2 TB. Can anyone suggest what I am doing wrong? Thanks in advance. (P.S. I did a number of searches for parity errors and did not find an answer to this problem.)
November 24, 2016 I suspect that your monthly and manual parity check runs are being performed with 'write corrections to parity' unselected. The runs are therefore detecting a disparity between the parity disk and your data disks but are not doing anything to correct it. Assuming you have no problems with any of your data disks, you need to rerun the parity check with 'write corrections to parity' enabled. If, on the other hand, you do have a problem with a data disk, stop and ask for advice before continuing.
November 24, 2016 Community Expert Have you ever had zero parity errors? Exactly zero is the normal situation, and it's hard to imagine getting that many if you ever built parity in the first place, unless you have a serious hardware issue.
November 24, 2016 Author How do I do that from the console? None of the disks seem to be having any problems...
November 24, 2016 Author Here is the diagnostics file. Thanks for looking at it. tower-diagnostics-20161124-0753.zip
November 24, 2016 Community Expert Your 1st check was non-correcting:
Nov 22 17:14:23 Tower kernel: mdcmd (45): check nocorrect
The 2nd was a correcting check:
Nov 23 07:04:56 Tower kernel: mdcmd (55): check correct
There was probably something wrong with how parity was first synced, but if the next check doesn't find any sync errors, all should be well now.
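For anyone who wants to repeat this diagnosis on their own syslog, here is a minimal Python sketch of the same idea: scan the log for the `mdcmd ... check` lines and report whether each check was correcting. The pattern is based only on the two lines quoted above; the function name `find_parity_checks` and the `SAMPLE` list are illustrative assumptions, and other unRAID versions may log these events differently.

```python
import re

# Pattern derived from the two syslog lines quoted in this thread, e.g.
# "Nov 22 17:14:23 Tower kernel: mdcmd (45): check nocorrect".
# Assumption: other releases may format these lines differently.
CHECK_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"mdcmd \(\d+\): check (?P<mode>correct|nocorrect)\b"
)


def find_parity_checks(lines):
    """Return a list of (timestamp, is_correcting) for each check start found."""
    results = []
    for line in lines:
        match = CHECK_RE.match(line.strip())
        if match:
            results.append((match.group("stamp"), match.group("mode") == "correct"))
    return results


# The two log lines from the post above, plus one unrelated (made-up) line.
SAMPLE = [
    "Nov 22 17:14:23 Tower kernel: mdcmd (45): check nocorrect",
    "Nov 22 17:14:24 Tower kernel: some unrelated message",
    "Nov 23 07:04:56 Tower kernel: mdcmd (55): check correct",
]

for stamp, is_correcting in find_parity_checks(SAMPLE):
    kind = "correcting" if is_correcting else "non-correcting"
    print(f"{stamp}: {kind} check started")
```

To use it on a real log, pass the lines of the syslog file from the diagnostics zip instead of `SAMPLE`.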
November 24, 2016 Author Thanks to all. I am running another check as I type, which should be done in a few hours. When I run a parity check from the "Main" tab, is that a "correct" or "nocorrect" one? I see that I can schedule a monthly check to be either correct or nocorrect...
November 24, 2016 Community Expert It depends on whether you check the "write corrections to parity" box next to the parity check button.
November 24, 2016 Author Thanks to all! The third parity check finished and shows no errors:
Last check completed on Thu 24 Nov 2016 11:24:07 AM PST (today), finding 0 errors.
Duration: 11 hours, 24 minutes, 6 seconds. Average speed: 48.7 MB/sec
I found the checkbox! I think the automatic check ran without writing corrections to parity.
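As a quick sanity check on the three reports in this thread, duration multiplied by average speed should come out close to the size of the 2 TB parity disk if each check read the whole disk. The sketch below does that arithmetic with the figures quoted above. It assumes the notifications use decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes), which the thread does not confirm, and the names `CHECKS` and `data_read_tb` are illustrative.

```python
def seconds(hours, minutes, secs):
    """Convert an h/m/s duration to seconds."""
    return hours * 3600 + minutes * 60 + secs


# (label, duration as (h, m, s), average speed in MB/s), copied from the posts.
CHECKS = [
    ("first", (11, 28, 20), 48.4),
    ("second", (11, 58, 21), 46.4),
    ("third", (11, 24, 6), 48.7),
]


def data_read_tb(duration_hms, avg_mb_per_s):
    """Estimate the amount read in TB, assuming decimal MB and TB."""
    total_mb = seconds(*duration_hms) * avg_mb_per_s
    return total_mb / 1_000_000


for label, duration_hms, speed in CHECKS:
    print(f"{label} check: about {data_read_tb(duration_hms, speed):.3f} TB read")
```

All three runs work out to just under 2 TB, which is consistent with each check covering the full parity disk rather than stopping early.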
November 24, 2016 Community Expert The 1st check ran because of an unclean shutdown; when this happens, unRAID starts an automatic check. There was a bug in previous v6.2 releases where this situation wasn't detected. It was fixed in v6.2.4, but the automatic check is non-correcting, which IMO was a mistake and is going to be fixed in an upcoming release. After an unclean shutdown it's important to run a correcting check ASAP, because it's normal to have a few sync errors, but certainly not millions.