garycase
Posted April 10, 2011

Okay, I had my first errors on my UnRAID box after months of error-free operation. At first I thought I had a failed disk, but that's apparently not the case.

Summary:

=> Two days ago (Thursday), while adding some new movies to the "DVDs" share, the system seemed to hang ... so I looked at the UnRAID screen and (Ouch !!) saw ~300 errors on both disk 0 (parity) and disk 5. Interestingly, it was reporting the disk temperature for both of these disks as 0 (NOT a "*" like when they're spun down). My initial thought was I had a disk failure (likely #5) ... so I stopped the array and shut it down. The array shut down okay; and when I rebooted it, it still said parity was okay; but showed the errors on disk 0 & 5 in the "errors" column.

=> I cleared the stats ... so all columns were zeroes. Then I spent many hours running a comparison utility to confirm that everything I've copied to the array in the last couple months was okay -- everything tested perfectly. I keep a complete set of backup disks ... anything I copy to the array is also copied to a backup disk; when a backup gets full, I label it and store it in a fireproof file cabinet. Since backup disk #9 was fairly new, I ran a complete compare on both #8 and #9 ... and all files are okay. Note that a full disk (which #8 was) takes ~12 hours to compare, so this process occupied most of Friday. Note that these backup disks do NOT correlate with specific disks in the array -- I simply fill the backup disks, then store them, whereas the shares on the UnRAID server are filled based on UnRAID's algorithms ... but they do give me a complete backup of all files.

=> Next I ran a parity sync overnight Friday night. This found ~300 sync errors ... which were clearly related to the errors previously displayed.

=> After that was done, I ran UnMenu and selected the "File Check" option for disk #5, which was the disk that had displayed the errors. This completed with no errors found.

=> I then ran a parity sync again -- and it completed with zero errors.

Question: Has anyone seen a similar set of errors? Anything else I should check? I'm leaning towards a "fluke" failure of the controller; perhaps a static anomaly; etc. I ran MemTest for 4 hours just to be sure it's not a failing memory module (no problems there) ... but can't think of anything else to test.

UnRAID Pro v4.6
14 disks; SuperMicro C2SEA
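For anyone who wants to repeat the backup-versus-array comparison described above, here is a minimal sketch in Python. It is not the comparison utility used in the post (that tool is not named); the function names `file_digest` and `compare_trees` are hypothetical, and it assumes the backup disk and the share are both mounted as ordinary directories with matching relative paths.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(backup_root, array_root):
    """Check every file under backup_root against the same relative
    path under array_root.

    Returns (missing, mismatched): two lists of relative paths.
    """
    backup_root = Path(backup_root)
    array_root = Path(array_root)
    missing, mismatched = [], []
    for src in sorted(backup_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(backup_root)
        dst = array_root / rel
        if not dst.is_file():
            # File exists on the backup but not on the array.
            missing.append(rel.as_posix())
        elif (src.stat().st_size != dst.stat().st_size
              or file_digest(src) != file_digest(dst)):
            # Sizes differ, or sizes match but contents do not.
            mismatched.append(rel.as_posix())
    return missing, mismatched
```

Calling `compare_trees("/mnt/backup8", "/mnt/user/DVDs")` (both paths are placeholders) would return two lists; both being empty means every file on the backup disk has a byte-identical copy on the array. Because every file is read in full on both sides, a full disk will still take hours, in line with the ~12 hours mentioned above.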