May 2, 2011 Greetings, I ran a parity check last night and it finished with 20 errors. I started another check, assuming that the errors would have been fixed during the first check. However, I'm halfway through the second parity check and 20 errors have popped up again. I am running unRAID 4.7 and my syslog is attached. I wondered if it might have been caused by the automatic movement of 15 files from the cache drive to the array in the middle of the check. Any ideas as to what might be going on? Cheers (attachment: syslog.txt)
May 2, 2011 File movement has absolutely nothing to do with parity errors. Parity is maintained while the moves are made. Are you sure you are not doing NOCORRECT parity checks, where it detects but does not correct the errors? (The web-management interface ALWAYS says it corrected them, even in NOCORRECT mode. It was never updated to say something different.) It sure looks like you are doing parity checks in CORRECT mode. I'd say you have either bad memory, or memory set up with the wrong voltage, timing, or clock speed, or a bad motherboard chipset, or a bad disk drive returning random data, or possibly a bad power supply. Start with SMART reports on all your disks, since bad sectors could cause this class of error, then run a memory test, preferably overnight. If those are not the issue, you can go about isolating the cause. Unfortunately, when you have this class of error, it could be almost anything.
May 2, 2011 (Author) Hi Joe, Thanks for your input. The only thing that has changed in the array recently is the addition of a second SATA card, an Adaptec 1430SA, that I installed yesterday. I put it in the second PCIe x16 slot. In the first slot I have a Supermicro AOC-SASLP-MV8 that has been in the server for the past year. I'll remove the Adaptec and run another check. Could that be the cause? Prior to that I have had few issues with the server since I built it last year. Thanks again.
May 2, 2011 Looking closer, since the addresses of the parity errors are the same in both checks, it is possible the first check flipped some bits and the second flipped them back. A third parity check might be in order, but this time do it in nocorrect mode. Joe L.
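The flip-and-flip-back idea can be sketched with a toy XOR parity model. This is only an illustration, not how unRAID is implemented internally: it assumes single-parity XOR across the data disks, and the function names, stripe values, and the one-time bad read on one disk are all made up for the example rather than taken from the syslog.

```python
from functools import reduce
from operator import xor


def compute_parity(blocks):
    """XOR of the values the data disks returned for one stripe address."""
    return reduce(xor, blocks, 0)


def parity_check(data_reads, parity, correct):
    """Compare stored parity against parity computed from the reads.

    data_reads: list of stripes, each a list of per-disk values as read.
    parity: list of stored parity values (overwritten in place if correct=True).
    Returns the list of stripe addresses that mismatched.
    """
    errors = []
    for addr, blocks in enumerate(data_reads):
        computed = compute_parity(blocks)
        if computed != parity[addr]:
            errors.append(addr)
            if correct:
                parity[addr] = computed
    return errors


# Hypothetical array: three data disks, three stripe addresses.
true_data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
stored_parity = [compute_parity(stripe) for stripe in true_data]
original_parity = list(stored_parity)

# Pass 1 (correcting): disk 1 returns one flipped bit at address 1.
glitched = [list(stripe) for stripe in true_data]
glitched[1][1] ^= 0x10
pass1 = parity_check(glitched, stored_parity, correct=True)

# Pass 2 (correcting): reads are clean, so the bad "fix" is undone.
pass2 = parity_check(true_data, stored_parity, correct=True)

# Pass 3 (nocorrect): nothing left to report, nothing written.
pass3 = parity_check(true_data, stored_parity, correct=False)

print(pass1, pass2, pass3)
```

In this model both correcting passes report an error at the same address, and parity ends up where it started, which is the pattern a third pass in nocorrect mode would help confirm.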
May 2, 2011 (Author) How do I run the check in nocorrect mode? I'm using the default interface and I don't see where to set that. Thanks
May 2, 2011 (Author) I ran SMART reports on all the disks, and all reported no errors except for disk 9, which had a UDMA_CRC_ERROR_COUNT of 10. Does that have to do with cable issues? (attachment: smart_sdd.txt)
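For anyone comparing SMART reports across many disks, a small script can pull out the raw values worth a second look. The sketch below assumes the attribute table layout printed by `smartctl -A` (ten whitespace-separated columns with the raw value last); the sample text and the watch list are invented for illustration and are not the contents of smart_sdd.txt. A nonzero UDMA CRC error count is generally read as a sign of transfer errors between the drive and the controller (cabling or connectors) rather than bad sectors, but treat that as a rule of thumb.

```python
# Attributes to flag when their raw value is nonzero (an assumed watch list).
WATCHED = {
    "reallocated_sector_ct",
    "current_pending_sector",
    "offline_uncorrectable",
    "udma_crc_error_count",
}


def parse_smart_attributes(report_text):
    """Return {attribute_name: raw_value} from a `smartctl -A` style table."""
    attrs = {}
    for line in report_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def flag_attributes(attrs):
    """Return the watched attributes whose raw value is nonzero."""
    return {name: raw for name, raw in attrs.items()
            if name.lower() in WATCHED and raw != 0}


# Hypothetical report excerpt, shaped like smartctl output.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       10
"""

attrs = parse_smart_attributes(SAMPLE)
flagged = flag_attributes(attrs)
print(flagged)
```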
May 3, 2011 In your version of unRAID, you must start a nocorrect check from the command line (or from the button on unMENU's array management page). Log in and type: /root/mdcmd check NOCORRECT
May 3, 2011 (Author) Thanks, Joe. Yes, I found it, and it's about halfway through the check. So far no errors. If the check completes without errors, is everything then OK with the array? Cheers
May 3, 2011 (Author) Hi Joe, In nocorrect mode, the parity check completed with no sync errors. Does that mean that the problem is solved, or do I need to do further tests? Cheers
May 3, 2011 "Run one more regular correcting check to make sure." That is BAD advice. If there is a problem with one of your disks or other hardware, a correcting check would write BAD information to the parity disk. (If your system were working properly, it would do no harm at all.) The NOCORRECT parity check exists because we unRAID users requested it to assist in our tests in exactly the situation you had: random parity errors. The more correct advice would be to run several more NOCORRECT parity checks. If they all complete without error, then you are fine. If they detect anything, at least parity is still correct (as best as it is right now) and you can still use it in the event another disk fails.
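The risk described here can be shown on a single stripe. The sketch below is a toy model under stated assumptions: single XOR parity, three data disks, and one disk returning a flipped bit during the check. All names and values are hypothetical. It compares what happens to a later rebuild of a different disk after a correcting check versus a nocorrect check.

```python
from functools import reduce
from operator import xor


def xor_all(values):
    """XOR a sequence of integer block values together."""
    return reduce(xor, values, 0)


def check(reads, parity, correct):
    """Return (error_count, parity_after_check) for one stripe."""
    computed = xor_all(reads)
    if computed == parity:
        return 0, parity
    # Both modes detect the mismatch; only the correcting mode rewrites parity.
    return 1, (computed if correct else parity)


def rebuild(surviving_blocks, parity):
    """Reconstruct the missing disk's block from the survivors and parity."""
    return xor_all(surviving_blocks) ^ parity


true_data = [0x11, 0x22, 0x33]           # what disks 0, 1, 2 really hold
good_parity = xor_all(true_data)
flaky_reads = [0x11, 0x22 ^ 0x04, 0x33]  # disk 1 returns one flipped bit

errs_correct, parity_correct = check(flaky_reads, good_parity, correct=True)
errs_nocorrect, parity_nocorrect = check(flaky_reads, good_parity, correct=False)

# Later, disk 0 fails and is rebuilt from disks 1 and 2 (read cleanly) plus parity.
rebuilt_after_correct = rebuild(true_data[1:], parity_correct)
rebuilt_after_nocorrect = rebuild(true_data[1:], parity_nocorrect)

print(hex(rebuilt_after_correct), hex(rebuilt_after_nocorrect))
```

Both modes report the same error count in this model. The difference is that the correcting run has baked the bad read into parity, so the rebuilt block no longer matches what disk 0 held, while the nocorrect run leaves parity usable.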
May 4, 2011 (Author) Hi Joe, I have run 3 successive NOCORRECT parity checks with no errors. Is there any need to run a regular parity check before resuming use of the array and adding new, precleared drives?
May 4, 2011 No need.
May 4, 2011 A lot of members here normally never run correcting parity checks. I always run nocorrect checks. I don't want the parity drive changing if a data drive is acting up and feeding bad info to the OS. Peter
May 4, 2011 I know my advice was wrong before, but now that you're confident in the system, wouldn't it be prudent to run a correcting check, just to come full circle? A correcting check is what revealed the problem in the first place, so to be absolutely sure that it has been resolved, a successful completion of a correcting check is required.
May 5, 2011 A correcting check will not find errors that a nocorrect check will miss. It makes no difference as long as there are no parity errors when done. Peter
May 5, 2011 This is true in theory; however, a correcting parity check not fixing parity correctly is the basis for this thread.