March 20, 2012

Hi All,

First, my apologies if I've posted this in the wrong section. I'm a new unRAID user. I recently bought a Pro license after evaluating the product and liking the results. Unfortunately, I'm running into parity check errors after copying my data over. I ran back-to-back parity checks with the "Correct any Parity-Check errors by writing the Parity disk with corrected parity." check box selected. What I ended up with was this:

Last checked on Mon Mar 19 03:13:38 2012 EDT, finding 144 errors.
Last checked on Mon Mar 19 13:58:46 2012 EDT, finding 134 errors.

My setup is below. I've already run memtest for about a day and a half with no errors showing up, so I don't think it's a memory issue. I don't think it's a power supply issue either, since 565 watts should be more than enough for 4 drives. Any other thoughts?

Setup:
UnRaid 5.0B14
Motherboard: Asus A8N-SLI Premium
Processor: Athlon X2 4200
2GB of DDR400 (512MB x4)
Rosewill RC-218 PCI Express SATA II Controller Card
Enermax 565W PSU
Seagate 1.5TB HDD x4

If there are any logs you'd like to see, let me know and I will post them. I've attached the syslog as a start.

Thanks for your help!
-noacess

Attachment: syslog32012.txt
March 20, 2012 (Author)

SMART reports attached. Thanks.

Attachments: sdd-disk1.txt, sda-disk2.txt, sde-disk3.txt, sdb-parity.txt
March 20, 2012

Two of the drives have reallocated sectors. They may have caused the problem but should be OK now. Run another parity check to be sure.
March 20, 2012 (Author)

UPDATE: 69% done with the third parity check. Sync errors corrected: 18. I'll report back again when it's finished.
March 21, 2012 (Author)

Sigh... it just finished:

Last checked on Tue Mar 20 19:34:52 2012 EDT, finding 114 errors.

Any suggestions? Thanks for the responses so far.
-noacess
March 21, 2012

It will be difficult to find the actual device causing this. We've seen individual disk drives that, when read, return an occasional random byte even though they do not show any other errors.

You can try running repeated checksums on specific files on specific devices. The same device/file should ALWAYS return the same checksum (as long as you are not writing to the disk).

It has been a specific disk drive in most cases, but at least one disk controller had the issue. It will not be easy to find your issue. Good luck.

Joe L.
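The repeated-checksum test described above can be sketched in Python. This is a minimal illustration, not a tool from the thread: the function names, the choice of MD5, and the run count are all assumptions, and any path passed in (for example something under /mnt/disk1) is a placeholder. One caveat to keep in mind: repeated reads of a small file may be served from the operating system's cache rather than the disk, so the test is more meaningful on files larger than installed RAM.

```python
import hashlib
import sys


def file_digest(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_digests(path, runs=5):
    """Checksum the same file `runs` times and return the list of digests."""
    return [file_digest(path) for _ in range(runs)]


def is_consistent(digests):
    """True when every run produced the same checksum."""
    return len(set(digests)) == 1


if __name__ == "__main__":
    # Hypothetical usage: python repeat_checksum.py /mnt/disk1/some/large/file
    for target in sys.argv[1:]:
        results = repeated_digests(target)
        status = "consistent" if is_consistent(results) else "MISMATCH"
        print(f"{target}: {status} {sorted(set(results))}")
```

A file that reports MISMATCH while nothing is writing to the array points at the disk (or its cable or controller port) holding that file.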
March 21, 2012 (Author)

I think the first thing I'm going to try is moving all of the drives to the Rosewill controller. Right now I believe all 4 drives are connected to the onboard Nforce 2 SATA 2 connectors. If that doesn't change anything, I'm going to put my unRAID setup in a new box. I actually have a second system with the same motherboard (A8N-SLI Premium) and an Athlon 64 4600+ CPU. I'll run a memtest on the second system tonight to prep for that scenario. *fingers crossed*

-noacess
March 22, 2012 (Author)

I moved all of the disk drives to the Rosewill controller and am still experiencing the same issues (two back-to-back parity checks showing sync errors). I'm going to move all the drives to the onboard Nforce2 SATA II controller tonight and repeat the process. If that doesn't fix the issue, I'm kind of at a loss, as none of the disks have given me any indications that they're bad. I could build a whole new system and see if that works, but my hopes are fading fast....

-noacess
March 28, 2012

We've seen many cases where a disk with NO OTHER symptoms returns random data on occasion. These disks are the hardest to isolate since they show absolutely no errors other than that they cannot return consistent data for some sectors.

The technique used to isolate a specific disk is to repeatedly read the blocks that fail parity and compute the checksum for each disk in your array. Unless you are writing to the array, the same blocks should ALWAYS return the same checksum. If one disk occasionally returns a different checksum, it is the one to replace (or, at least, try a different power cable or a different disk controller port to isolate the actual intermittent hardware).

See here in the wiki: http://lime-technology.com/wiki/index.php?title=FAQ#How_To_Troubleshoot_Recurring_Parity_Errors
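As a rough sketch of that technique, the Python below reads the same byte range from each disk several times and reports any disk whose checksum changes between reads. Everything here is an assumption for illustration: the function names, the use of MD5, the run count, and the offset and length, which would have to come from wherever the failing blocks are reported. The device names in the usage note are guessed from the SMART report file names earlier in the thread. Reading raw devices normally requires root, and repeat reads can be satisfied from the system cache unless the range is large or caches are cleared between runs.

```python
import hashlib


def range_digest(path, offset, length):
    """MD5 of `length` bytes starting at byte `offset` of a file or block device."""
    digest = hashlib.md5()
    with open(path, "rb", buffering=0) as handle:
        handle.seek(offset)
        remaining = length
        while remaining > 0:
            chunk = handle.read(min(remaining, 1024 * 1024))
            if not chunk:
                break  # reached end of file/device before `length` bytes
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def find_inconsistent(paths, offset, length, runs=10):
    """Return {path: set_of_digests} for every path whose digest changed between runs."""
    suspects = {}
    for path in paths:
        seen = {range_digest(path, offset, length) for _ in range(runs)}
        if len(seen) > 1:
            suspects[path] = seen
    return suspects
```

A hypothetical call would look like `find_inconsistent(["/dev/sdd", "/dev/sda", "/dev/sde", "/dev/sdb"], offset, length)` with the array otherwise idle; an empty result means every disk returned the same data on every pass for that range.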