June 9, 2011

I ran a parity check today (without correcting errors) and got 15 sync errors, but no errors are reported in the main Disk Status section.

I move all files to the server via TeraCopy, and I don't get errors when the files are MD5 checked against the originals. Is there a way for me to test the files where the errors were reported against the originals, to determine whether the file is bad or the parity data is bad?

Log snippet during parity check:

Jun 8 14:10:14 unRAID kernel: mdcmd (23): check NOCORRECT (unRAID engine)
Jun 8 14:10:14 unRAID kernel: (Routine)
Jun 8 14:10:14 unRAID kernel: md: recovery thread woken up ... (unRAID engine)
Jun 8 14:10:14 unRAID kernel: md: recovery thread checking parity... (unRAID engine)
Jun 8 14:10:14 unRAID kernel: md: using 1152k window, over a total of 1953514552 blocks. (unRAID engine)
Jun 8 14:13:42 unRAID kernel: md: parity incorrect: 41910000 (Errors)
Jun 8 14:13:42 unRAID kernel: md: parity incorrect: 41910008 (Errors)
Jun 8 14:43:13 unRAID kernel: md: parity incorrect: 388294176 (Errors)
Jun 8 14:43:13 unRAID kernel: md: parity incorrect: 388294184 (Errors)
Jun 8 14:47:11 unRAID sSMTP[5550]: Creating SSL connection to host
Jun 8 14:47:11 unRAID sSMTP[5550]: SSL connection using RC4-SHA
Jun 8 14:47:16 unRAID sSMTP[5550]: Sent mail for [email protected]@gmail.com (221 2.0.0 closing connection r32sm616268qcs.14) uid=0 username=root outbytes=1394
Jun 8 15:22:52 unRAID kernel: md: parity incorrect: 837721920 (Errors)
Jun 8 15:37:36 unRAID kernel: md: parity incorrect: 999878152 (Errors)
Jun 8 15:37:36 unRAID kernel: md: parity incorrect: 999878160 (Errors)
Jun 8 15:47:11 unRAID sSMTP[6907]: Creating SSL connection to host
Jun 8 15:47:11 unRAID sSMTP[6907]: SSL connection using RC4-SHA
Jun 8 15:47:14 unRAID sSMTP[6907]: Sent mail for [email protected]@gmail.com (221 2.0.0 closing connection f16sm652812qck.45) uid=0 username=root outbytes=1394
Jun 8 16:02:36 unRAID kernel: md: parity incorrect: 1266340016 (Errors)
Jun 8 16:02:36 unRAID kernel: md: parity incorrect: 1266340024 (Errors)
Jun 8 16:05:10 unRAID sSMTP[7310]: Creating SSL connection to host
Jun 8 16:05:11 unRAID sSMTP[7310]: SSL connection using RC4-SHA
Jun 8 16:05:13 unRAID sSMTP[7310]: Sent mail for [email protected]@gmail.com (221 2.0.0 closing connection j4sm289159vdu.31) uid=0 username=root outbytes=1366
Jun 8 16:20:45 unRAID kernel: md: parity incorrect: 1454859960 (Errors)
Jun 8 16:39:21 unRAID kernel: md: parity incorrect: 1642660072 (Errors)
Jun 8 16:47:11 unRAID sSMTP[8617]: Creating SSL connection to host
Jun 8 16:47:11 unRAID sSMTP[8617]: SSL connection using RC4-SHA
Jun 8 16:47:15 unRAID sSMTP[8617]: Sent mail for [email protected]@gmail.com (221 2.0.0 closing connection t28sm700081qcs.29) uid=0 username=root outbytes=1394
Jun 8 17:47:11 unRAID sSMTP[9969]: Creating SSL connection to host
Jun 8 17:47:11 unRAID sSMTP[9969]: SSL connection using RC4-SHA
Jun 8 17:47:14 unRAID sSMTP[9969]: Sent mail for [email protected]@gmail.com (221 2.0.0 closing connection u15sm746088qcq.12) uid=0 username=root outbytes=1394
Jun 8 17:53:23 unRAID kernel: md: parity incorrect: 2314377664 (Errors)
Jun 8 18:41:28 unRAID kernel: md: parity incorrect: 2676882120 (Errors)
Jun 8 18:41:28 unRAID kernel: md: parity incorrect: 2676882128 (Errors)
Jun 8 18:47:11 unRAID sSMTP[11353]: Creating SSL connection to host
Jun 8 18:47:12 unRAID sSMTP[11353]: SSL connection using RC4-SHA
Jun 8 18:47:14 unRAID sSMTP[11353]: Sent mail for [email protected]@gmail.com (221 2.0.0 closing connection f16sm785107qck.21) uid=0 username=root outbytes=1394
Jun 8 19:37:42 unRAID kernel: mdcmd (24): spindown 4 (Routine)
Jun 8 19:47:12 unRAID sSMTP[12621]: Creating SSL connection to host
Jun 8 19:47:12 unRAID sSMTP[12621]: SSL connection using RC4-SHA
Jun 8 19:47:15 unRAID sSMTP[12621]: Sent mail for [email protected]@gmail.com (221 2.0.0 closing connection bp7sm664982vbb.23) uid=0 username=root outbytes=1393
Jun 8 20:47:12 unRAID sSMTP[13748]: Creating SSL connection to host
Jun 8 20:47:12 unRAID sSMTP[13748]: SSL connection using RC4-SHA
Jun 8 20:47:15 unRAID sSMTP[13748]: Sent mail for [email protected]@gmail.com (221 2.0.0 closing connection q1sm386384vdt.11) uid=0 username=root outbytes=1393
Jun 8 20:52:04 unRAID kernel: md: parity incorrect: 3710486024 (Errors)
Jun 8 21:18:16 unRAID kernel: md: sync done. time=25683sec rate=76062K/sec (unRAID engine)
Jun 8 21:18:16 unRAID kernel: md: recovery thread sync completion status: 0 (unRAID engine)
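Editor's sketch: the block numbers flagged in a syslog like the one above can be pulled out with a short script, which makes it easier to compare one check against a later one. This is only an illustration and not part of unRAID; the function names `parity_error_blocks` and `compare_runs` are made up for this example, and the only assumption about the log is the "md: parity incorrect: <number>" wording shown in the snippet.

```python
import re

# Matches kernel lines such as "... md: parity incorrect: 41910000 (Errors)".
PARITY_RE = re.compile(r"md: parity incorrect: (\d+)")


def parity_error_blocks(log_text):
    """Return the block numbers reported as 'parity incorrect', in log order."""
    return [int(match.group(1)) for match in PARITY_RE.finditer(log_text)]


def compare_runs(first_log, second_log):
    """Group flagged blocks by whether they appear in both checks or only one."""
    first = set(parity_error_blocks(first_log))
    second = set(parity_error_blocks(second_log))
    return {
        "repeated": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


# Three lines copied from the log in this post; the sSMTP line is ignored.
sample = (
    "Jun 8 14:13:42 unRAID kernel: md: parity incorrect: 41910000 (Errors)\n"
    "Jun 8 14:13:42 unRAID kernel: md: parity incorrect: 41910008 (Errors)\n"
    "Jun 8 14:47:11 unRAID sSMTP[5550]: Creating SSL connection to host\n"
)
print(parity_error_blocks(sample))
```

Feeding the text of two saved syslogs to `compare_runs` shows which blocks were flagged in both checks and which were flagged in only one.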
June 10, 2011

> I ran a parity check (but do not correct errors) today, and I had 15 sync errors, but there's no reported errors in the main Disk Status section.

It's pretty typical to see parity check errors without any errors being reported on the main screen. It just means that the check code managed to read the data from all the drives (data and parity) without any indication of error, but when it did the actual parity calculation, the calculated parity (based on the data from the data drives) did not match what it read from the parity drive.

Here are some thoughts:

- Since the errors are not at low block numbers (say, less than about 20000), the cause probably was not an improper shutdown of the server.
- Run another check (of the NOCORRECT kind) and see if the errors show up on the exact same blocks. If the errors move, it might indicate a memory or cabling issue.
- Run a memory test to make sure the RAM is OK.
- Post your logfile; someone always asks for it.
- Get SMART logs (and post them) for all the drives, including parity. These errors might be due to a drive having problems, and there are a couple of fields in these logs that will sometimes show which drive is dying.
- Which version of unRAID are you running?

> I move all files to the server via TeraCopy, and I don't get errors when the files are MD5 checked against the originals. Is there a way for me to test the files where the errors were returned against originals to determine if the file is bad or if the parity data is bad?

Unfortunately not. If we had a tool that could take the block numbers that the parity check says are in error and look up which files on each disk are using those blocks, then we could just check those files for damage. But we don't have such a tool.

Regards,
Stephen
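Editor's sketch: the mismatch described above can be illustrated with a toy example. It assumes plain single-parity XOR across the data drives and uses made-up two-byte "disks"; it is not unRAID's actual code, and the names `xor_parity` and `parity_matches` are hypothetical.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_matches(data_blocks, stored_parity):
    """True when parity recomputed from the data equals the stored parity."""
    return xor_parity(data_blocks) == stored_parity


# Three made-up data "disks", two bytes each.
disk1 = bytes([0xAA, 0x0F])
disk2 = bytes([0x55, 0xF0])
disk3 = bytes([0xFF, 0x00])
parity = xor_parity([disk1, disk2, disk3])
print(parity.hex())  # 00ff

# Case 1: one bit flips on a data disk, the parity disk is untouched.
bad_disk2 = bytes([0x55 ^ 0x01, 0xF0])
print(parity_matches([disk1, bad_disk2, disk3], parity))  # False

# Case 2: the data is fine, but the same bit flips on the parity disk.
bad_parity = bytes([parity[0] ^ 0x01, parity[1]])
print(parity_matches([disk1, disk2, disk3], bad_parity))  # False
```

Both cases fail the check in the same way, and every drive still reads without error. That is why a parity check alone cannot say whether a data file or the parity data is the bad copy.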
June 10, 2011 Author

^^ Thanks Stephen. I noticed today that the main screen read "15 errors were corrected," which is weird because I ran the NOCORRECT parity check. It did not say this the past few days, only that 15 errors were found at the last check, with no mention of correcting them. I'm organizing a bunch of files now, so I'll run another NOCORRECT check later today and report back.
June 10, 2011

It will always say "corrected" even when in NOCORRECT mode. That is a bug in the wording of the message. It did not correct anything.
June 10, 2011 Author

I'm talking about wording that differs from what it showed when it was running the NOCORRECT check. For a day or two afterwards it still showed that 15 errors were found. Today it said that 15 errors were fixed. I wish I had taken a screenshot of it.
June 10, 2011

Use the "md5 - Deep Checksums" package in unMenu to determine which disk/file contains the error. If no files show errors, then the errors are on the parity disk.
June 10, 2011 Author

I downloaded and installed the GCC compiler as well as the MD5 package. Where do I run it from? I can't see anything in unMenu to access it.
June 10, 2011

It is run from the Linux command line once installed. (You only need to compile it once.)
June 10, 2011

Google "MD5deep" for instructions. I keep a .hashes directory at the top level of each of my data disks; each one contains copies of the hashes of all of my disks.
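Editor's sketch: the idea behind a stored-hash workflow can be shown in a few lines of Python. This is a stand-in that uses only the standard library, not md5deep itself, and it does not reproduce md5deep's options or output format. The helper names `md5_of_file`, `build_manifest`, and `changed_files` are hypothetical.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large media files are never fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 hex digest."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            manifest[os.path.relpath(full_path, root)] = md5_of_file(full_path)
    return manifest


def changed_files(root, manifest):
    """List (relative path, reason) for files that are missing or hash differently."""
    problems = []
    for rel_path, expected in sorted(manifest.items()):
        full_path = os.path.join(root, rel_path)
        if not os.path.isfile(full_path):
            problems.append((rel_path, "missing"))
        elif md5_of_file(full_path) != expected:
            problems.append((rel_path, "mismatch"))
    return problems
```

As a usage example, assuming a data disk is mounted at a path such as /mnt/disk1 (an assumption for illustration): call `build_manifest("/mnt/disk1")` while the files are known to be good and save the result, then call `changed_files("/mnt/disk1", saved_manifest)` after a parity check reports errors. An empty list for every data disk would point at the parity disk, in line with the advice earlier in the thread.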
June 10, 2011 Author

Thanks for the help, guys, but md5deep looks like it's over my head. I was hoping for an easy-to-use (GUI/web interface) way to do the hash checking.