I am facing an intermittent issue where a file that I move to my server no longer has the correct CRC32 when read back from it. Having noticed this happen a few times (more often with larger files), paranoia set in and I started including the CRC32 in the filenames as well.
When a file compares bad, I can re-check its CRC32 multiple times from the server and it is always wrong. Last night, for example, 2 out of about 300 files that I moved had a bad CRC.
I copied both bad copies back to my machine and compared them against the local source files using HxD's file compare. In both cases, a single bit had been flipped at a random place within the file.
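For anyone wanting to repeat that comparison without a hex editor, here is a minimal Python sketch that lists every differing bit between two files of the same size. The function name `find_bit_flips` and the paths are my own placeholders, not part of any tool mentioned here.

```python
def find_bit_flips(path_a, path_b, chunk_size=1 << 20):
    """Return a list of (byte_offset, bit_index) for every bit that differs.

    bit_index 0 is the least significant bit of the byte.
    Assumes both files have the same length; raises ValueError otherwise.
    """
    flips = []
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break
            if len(a) != len(b):
                raise ValueError("files differ in length")
            if a != b:  # only scan byte by byte when the chunk differs
                for i, (x, y) in enumerate(zip(a, b)):
                    diff = x ^ y
                    for bit in range(8):
                        if (diff >> bit) & 1:
                            flips.append((offset + i, bit))
            offset += len(a)
    return flips


if __name__ == "__main__":
    import sys
    # Usage (hypothetical paths): python bitdiff.py local.bin from_server.bin
    if len(sys.argv) == 3:
        for byte_offset, bit in find_bit_flips(sys.argv[1], sys.argv[2]):
            print(f"offset 0x{byte_offset:X}, bit {bit}")
```

Recording the offset and bit position of each flip over several bad files might show whether the errors really are random or follow a pattern (same bit, or offsets on some regular boundary).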
This happens when running normally with a parity drive. Out of curiosity, I moved a ton of files with no parity drive in place and the problem still occurred, although the files moved a lot quicker.
Interestingly, if I grab all the files that I have tagged with a CRC32 in the filename and read them back from unRAID, they always check out OK. If a file copies correctly, it seems to stay good on the server.
Now, before I include logs and specs: this has been happening since my first unRAID server. I have replaced my server board/case/cooling/cables/memory/CPU three times since I first bought a key some 4 years ago, and I have experienced it from version 4.x through to the latest version 5.
I had always attributed it to maybe a dodgy network card and resigned myself to always performing writes to the array using a write/test approach. I now want to get to the bottom of it, because it has persisted through different motherboards, PSUs and hard drives (originally 1TB and 1.5TB, now all 2, 3 and 4TB).
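To be clear about what I mean by write/test, the idea is roughly the following Python sketch: checksum the source, copy it, then checksum the copy read back from the destination. The helper names `crc32_of` and `copy_and_verify` are illustrative only, and this is not the exact tool I use.

```python
import shutil
import zlib


def crc32_of(path, chunk_size=1 << 20):
    """Compute the CRC32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def copy_and_verify(src, dst):
    """Copy src to dst, then re-read dst and compare CRC32 values.

    Returns (ok, expected_crc, actual_crc).
    """
    expected = crc32_of(src)
    shutil.copyfile(src, dst)
    actual = crc32_of(dst)  # read back from the destination
    return expected == actual, expected, actual
```

One caveat with a sketch like this: a read-back done immediately after the write may be served from a cache rather than from the disk, so a pass here is weaker evidence than a check done later.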
I thought it might have something to do with drives spinning up/down: when I browsed to a new share, the copy in progress would pause for a number of seconds (I assume while a drive spun up), and sometimes that file would end up corrupt. I couldn't reproduce it often, though.
So, I can't afford to keep my drives spinning at all times to avoid the problem. Any suggestions? One thing common to all my servers is the Adaptec 1430 4-port SATA card.
As a test, for the last two nights I have been sequentially reading every file that has a CRC32 in its filename (that is, every file that copied successfully), and there has not been a single error reading them. I am at my wits' end.
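The nightly read test amounts to something like this Python sketch. It assumes the CRC32 is stored in the filename as 8 hex digits in square brackets (for example `movie [1A2B3C4D].mkv`); that tag format and the names `TAG` and `verify_tagged` are assumptions made for illustration, so adjust the pattern to whatever convention you use.

```python
import os
import re
import zlib

# Assumed tag format: 8 hex digits inside square brackets.
TAG = re.compile(r"\[([0-9A-Fa-f]{8})\]")


def crc32_of(path, chunk_size=1 << 20):
    """Compute the CRC32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def verify_tagged(root):
    """Walk root and return paths whose contents do not match the CRC32 tag."""
    bad = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            match = TAG.search(name)
            if not match:
                continue  # untagged file, nothing to check against
            path = os.path.join(dirpath, name)
            if crc32_of(path) != int(match.group(1), 16):
                bad.append(path)
    return bad
```

Pointing `verify_tagged` at a mounted share returns an empty list when every tagged file still matches its filename.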
I thought Ethernet did a CRC32 on each packet. Is it possible unRAID is not checking the packets at all? Even the gigabit switch has been changed out for a different model.