November 1, 2012

I've always run my monthly parity check without errors, until last night. I woke up this morning and my system is reporting 4 parity errors.

Two are reported between 20 and 24 minutes into the parity check:

Nov 1 00:20:04 Tower kernel: md: parity incorrect: 265136200 (Errors)
Nov 1 00:24:54 Tower kernel: md: parity incorrect: 329704256 (Errors)

and two around five and a half hours into the check:

Nov 1 05:23:20 Tower kernel: md: parity incorrect: 3423061520 (Errors)
Nov 1 05:41:47 Tower kernel: md: parity incorrect: 3560457392 (Errors)

The only unusual thing I'd done before this check was to remove about 500 GB of material that I no longer wanted on the server. I don't know if that could have precipitated the problem, but it was done just a day or two before the parity check.

What is the most sensible next step? Should I rerun parity with the option to correct parity errors (how exactly do I do that), or is there something else to try first?

Thanks for any help.
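For anyone who wants to pull these lines out of a longer syslog, here is a minimal Python sketch rather than an official unRAID tool. It assumes the log lines have the shape quoted above and that the number after "parity incorrect:" is a sector number counted in 512-byte units; both are assumptions, and the names `PARITY_RE`, `parity_errors` and `approx_offset_gb` are made up for illustration.

```python
import re

# Matches kernel lines shaped like the ones quoted in the post, e.g.
# "Nov 1 00:20:04 Tower kernel: md: parity incorrect: 265136200"
PARITY_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"md: parity incorrect: (?P<sector>\d+)"
)

# ASSUMPTION: the reported number is a sector index in 512-byte units.
SECTOR_BYTES = 512


def parity_errors(lines):
    """Return (timestamp, sector) pairs for every parity-incorrect line."""
    found = []
    for line in lines:
        match = PARITY_RE.match(line.strip())
        if match:
            found.append((match.group("stamp"), int(match.group("sector"))))
    return found


def approx_offset_gb(sector, sector_bytes=SECTOR_BYTES):
    """Approximate position on the disk in decimal GB, under the assumption above."""
    return sector * sector_bytes / 1e9


# The four lines from the post, used here as sample input.
sample = [
    "Nov 1 00:20:04 Tower kernel: md: parity incorrect: 265136200 (Errors)",
    "Nov 1 00:24:54 Tower kernel: md: parity incorrect: 329704256 (Errors)",
    "Nov 1 05:23:20 Tower kernel: md: parity incorrect: 3423061520 (Errors)",
    "Nov 1 05:41:47 Tower kernel: md: parity incorrect: 3560457392 (Errors)",
]

for stamp, sector in parity_errors(sample):
    print(f"{stamp}  sector {sector}  ~{approx_offset_gb(sector):.1f} GB into the disk")
```

Knowing roughly where on the disks the mismatches fall can help when comparing the results of a later check, since errors that reappear at the same positions point to a different cause than errors that move around.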
November 1, 2012

"What is the most sensible next step? Should I rerun parity with the option to correct parity errors (how exactly do I do that), or is there something else to try first?"

A correcting check is the only way to bring the parity drive back in sync with the other drives. Problem is, at this point you don't know which drive is wrong.

The first thing I would do is collect SMART reports on all your drives, capture the current syslog, zip it all up and post it here. I wouldn't run a correcting check until I knew for sure all the data drives were healthy.
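The collection step suggested here could be scripted roughly as follows. This is only a sketch, assuming Python 3 and the `smartctl` command from smartmontools are available on the machine; the function names `smart_report` and `collect`, the archive member names, and any device or file paths passed in are illustrative and not part of unRAID.

```python
import subprocess
import zipfile
from pathlib import Path


def smart_report(device, run=subprocess.run):
    """Return the text of `smartctl -a <device>` (full SMART information)."""
    # `run` is injectable so the function can be exercised without real disks.
    result = run(["smartctl", "-a", device], capture_output=True, text=True)
    return result.stdout


def collect(devices, syslog_path, zip_path, run=subprocess.run):
    """Write one SMART report per device, plus the syslog, into a zip file."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for device in devices:
            # e.g. "/dev/sda" becomes "smart-sda.txt" inside the archive
            member = "smart-" + Path(device).name + ".txt"
            archive.writestr(member, smart_report(device, run))
        # Include the syslog only if it exists at the given path.
        if Path(syslog_path).exists():
            archive.write(syslog_path, arcname="syslog.txt")
    return zip_path
```

It might be called as `collect(["/dev/sda", "/dev/sdb"], "/var/log/syslog", "smart-report.zip")`, with the device list and syslog path adjusted to the actual system (both are assumptions here). `smartctl` usually needs to be run as root.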
November 1, 2012 Author

Thanks jonathanm. Attached is a zip containing what you suggested: SMART reports for 9 data drives, a parity drive and a cache drive, plus the current syslog, with the parity errors reported near the bottom.

I'm far from an expert, but the SMART reports seem OK to me. Hopefully someone more knowledgeable than me can tell me if I need to do anything before attempting to correct parity.

Thanks again for your help.

smart-20121101.zip
November 1, 2012 Author

Sorry to reply to my own post, but I think something changed without me doing anything.

When I started this thread, my unRAID menu screen was showing that it had completed its regular monthly parity check and found four errors. I can't post an image of the screen as it was then because it has changed, but it had a line called something like "Sync errors" and showed the number 4. I posted about this situation and was waiting for some guidance from members.

I just went back and looked again at the unRAID menu screen, and it now shows a different summary that says "Parity is valid" and that it last finished checking parity at 6:30 this morning and found four errors. (Image of unRAID menu screen attached.)

Did it correct them? I certainly didn't run a second check with a "CORRECT" option, just the monthly parity check.

Further evidence: the "Disk Management" page says: "Parity is Valid. Last parity check < 1 day ago. Parity updated 4 times to address sync errors."

I'm just confused. Do I currently have valid parity, or do I have current errors that need to be addressed? If my parity is now valid, how did it get that way, given that I believe the monthly check uses the NOCORRECT option?

Thanks for any insight.