tucansam Posted October 14, 2013
From the main status page, I see this:
Last checked on Tue Oct 1 13:14:21 2013 MST (thirteen days ago), finding 58 errors.
I have never had any errors reported with parity checks before. What does it mean?
JonathanM Posted October 14, 2013
It means the parity disk was not in sync with your data disks, and if it was a non-correcting check, it still isn't in sync. I'd pull a syslog and SMART reports on all your drives, run a non-correcting check, then pull another syslog. Zip them all up and post them. It could be nothing, or it could be an indication that you are about to have a problem. It's hard to tell without logs and reports.
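For readers wondering what "in sync" means here: with a single XOR-based parity disk, every position on the parity disk should equal the XOR of the same position on all the data disks, and a sync error is a position where that no longer holds. The following is only a toy sketch under that assumption, using short byte strings in place of real disks. The names `compute_parity` and `count_sync_errors` are made up for illustration and are not part of unRAID, and a real check reports errors at its own granularity rather than per byte.

```python
from functools import reduce

def compute_parity(data_disks):
    """XOR the same byte position across every data disk."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_disks))

def count_sync_errors(data_disks, parity_disk):
    """Count positions where the stored parity differs from the computed parity."""
    expected = compute_parity(data_disks)
    return sum(1 for e, p in zip(expected, parity_disk) if e != p)

# Three toy "data disks" of three bytes each (hypothetical contents).
disks = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xff\x00\x0f"]
parity = compute_parity(disks)

# Simulate a parity disk that has drifted out of sync at one position.
stale = bytearray(parity)
stale[1] ^= 0x01

print(count_sync_errors(disks, parity))        # parity matches the data
print(count_sync_errors(disks, bytes(stale)))  # one position disagrees
```

Note that a mismatch by itself only says that parity and data disagree at that position; it does not say which disk holds the wrong value.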
tucansam Posted October 15, 2013
The syslog is full of thousands of dupe file error messages, messages about the mover running, and spindown commands. Here are the SMART reports for the first four disks:
sda.txt sdb.txt sdc.txt sdd.txt
tucansam Posted October 15, 2013
Last two. A non-correcting check is running now. I'll check the syslog again tomorrow when it's done. Thanks again for the help!
sde.txt sdf.txt
archedraft Posted October 15, 2013
I didn't notice anything concerning in your SMART reports.
garycase Posted October 15, 2013
It's almost certain the errors are on the parity drive. Run a correcting parity check to fix parity -- then repeat it and confirm you now get zero sync errors.
tucansam Posted October 15, 2013
garycase wrote: "It's almost certain the errors are on the parity drive. Run a correcting parity check to fix parity -- then repeat it and confirm you now get zero sync errors."
Will do. Is this something that I should be concerned about, or is this just the cost of doing business? I had errors on the parity drive itself a few weeks ago (red balled and blinking, I think) and started a thread about it; I replaced the SATA cable and life was good again. Am I looking at data corruption, or are the errors largely a non-issue? Thanks again to all.
garycase Posted October 16, 2013
There ARE certain cases where a reported error is not actually on the parity drive -- but those are very rare, and in my experience every sync error I've seen was on the parity drive [I confirmed that by doing a full validation of ALL data on the array against my backups every time]. I'd do the correcting parity check (that's the only kind I ever do) ... and then run it again and confirm it now has zero sync errors. If the 2nd check does NOT have zero sync errors, THEN it's time to be concerned. Post back with the details if that happens.
tucansam Posted October 17, 2013
Parity check with correction complete:
Last checked on Tue Oct 15 21:46:19 2013 MST (today), finding 58 errors.
Duration: 12 hours, 18 minutes, 33 seconds. Average speed: 90.3 MB/sec
I will now run a parity check without correction and post back.
tucansam Posted October 17, 2013
Last checked on Wed Oct 16 22:52:14 2013 MST (today), finding 0 errors.
Duration: 12 hours, 20 minutes, 46 seconds. Average speed: 90.0 MB/sec
Hopefully this means I'm good to go? Should I be doing automatic correcting parity checks monthly? My system is set up for parity checks on the first of every month -- I don't believe it's correcting errors by default.
JonathanM Posted October 17, 2013
tucansam wrote: "Should I be doing automatic correcting parity checks monthly? My system is set up for parity checks on the first of every month -- I don't believe it's correcting errors by default."
Depends on your level of paranoia about your data. A non-correcting check that gives errors allows you to do some detective work to try to pin down the cause before you irreversibly change the parity drive to match the current state of the data drives. Most of the time a correcting check is what is called for, and you move on with life. Every once in a while, consecutive parity checks will come up with very different results, meaning something is actively corrupting data reads. In that case, you really don't want to write ANYTHING to the disks until you figure out what the issue is.
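The difference between the two kinds of check can be sketched in a few lines. This is a toy model only, assuming single XOR parity and using short byte strings in place of disks; `parity_check` and its `correct` flag are hypothetical names for illustration, not unRAID's actual interface. A non-correcting pass only counts mismatches, while a correcting pass also overwrites the parity value, after which the old value can no longer be recovered.

```python
from functools import reduce

def parity_check(data_disks, parity, correct=False):
    """Return the number of mismatched positions; rewrite parity if correct=True."""
    errors = 0
    for i, column in enumerate(zip(*data_disks)):
        expected = reduce(lambda a, b: a ^ b, column)
        if parity[i] != expected:
            errors += 1
            if correct:
                # Irreversible: parity is forced to match the data disks,
                # whether or not the data disks hold the right values.
                parity[i] = expected
    return errors

# Two toy "data disks" and a parity "disk" whose last two positions are stale.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40"]
parity = bytearray(b"\x11\x22\x00\x00")

first = parity_check(data, parity)                 # non-correcting: report only
second = parity_check(data, parity, correct=True)  # correcting: report and rewrite
third = parity_check(data, parity)                 # verification pass
print(first, second, third)
```

In this toy run the correcting pass reports the same count as the non-correcting pass before it, and the follow-up pass reports zero, which is the same pattern as the 58, 58, 0 sequence earlier in the thread. If repeated non-correcting passes over unchanged data returned different counts, that would point at unreliable reads rather than stale parity.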