February 20, 2008
Hi all, I have a problem that I am hoping someone can help me solve. Recently I have begun to notice that some of my files break up during playback from my unRAID server. At first I thought it was the file itself, since it was a recorded TV episode (i.e. the signal broke up during recording, etc.). So last night I tried to confirm this. I compared the original file to the one on unRAID, and at the time the unRAID file broke up, the original was perfectly fine. This has really begun to concern me, since it means that data is not being stored bit-for-bit on the server. Note also that this does not happen all the time, maybe once or twice in the whole file. I did a parity check and it passed with 0 errors, which should rule out a bad drive. I thought it could be the network cable (100 ft run), but if that were the case then the file should be completely corrupted. Could a bad NIC cause this? How do I determine if this is the case? Does anyone have any suggestions as to what could be causing this? I am now very concerned that the data on my server is no longer good.
February 20, 2008
Can you repeat the problem such that the glitch always occurs in the same spot, or does it move? The former indicates a bad file, the latter a bad read connection.
Bill
February 20, 2008 Author
Last night, when I checked, it always happened in the same spot. I will test for this again tonight.
February 20, 2008
Are the original and unRAID copy the same file format, or do you have, for example, an ISO as the original and a DivX as the version on unRAID?
Bill
February 20, 2008
On your Windows PC, open up a command window and then type:
comp c:\original_file.avi \\tower\movies\original_file_copy.avi
If the files are different, it will tell you. (Oh yeah... use your own file names and paths, not "original_file...")
Joe L.
February 20, 2008 Author
Will try that also when I get home. What kind of output does that give?
February 21, 2008 Author
Well, now I am confused. I compared the two files on which I noticed the break-up to their originals. On one, the output of compare was "Files compare OK"; on the other, however, I get mixed results. I ran compare on the second file three times. The first two times it reported an error at different locations with different values, and on the third it reported "Files compare OK".
First attempt: Compare error at OFFSET 22FF152A, file1 = DF, file2 = FF
Second attempt: Compare error at OFFSET 3553A52A, file1 = 1A, file2 = 3A
Third attempt: Files compare OK
Which result is to be trusted? I have also noticed that the break-up no longer occurs at the time I noted last night on the files. This is particularly strange because when it happened I immediately checked on another PC, which showed the same behavior at the same time. I really don't know where to begin troubleshooting.
February 21, 2008
This symptom is almost always from a NIC problem. Try replacing your NIC in the unRAID box. If the problem persists, try replacing the NIC in your desktop.
February 21, 2008
I used to see this kind of error way back in the early days of networking, and more recently with an nForce4 board, which was noted for it. This looks like a stuck-on or stuck-off bit, the 0x20 bit within a 1024 byte buffer. I usually saw it in an 8-bit, 16-bit, or 32-bit pattern, indicating a hardware register with a flaky bit, so your pattern would indicate a much larger buffer in RAM. If I remember right, the block size unRAID uses internally was 1024, in some places. I would test your memory thoroughly, with multiple passes. It would also be useful to repeat the file compare test more times for additional data points, to confirm the pattern and perhaps narrow it down more.
February 21, 2008
Correction: that should be the 0x20 bit at offset 0x52a within a 4k buffer, not a 1k buffer. More data points are definitely needed. BubbaQ is absolutely correct, the networking hardware at either end is the top suspect. A 4k buffer could be on the NIC.
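The pattern described above can be checked with a little arithmetic. The following is only a minimal sketch: it takes the two compare errors posted earlier in the thread, XORs the two byte values to find which bit differs, and reduces each offset modulo an assumed 4k (4096-byte) buffer. The `analyze` function name is made up for illustration.

```python
# Compare errors as posted earlier in the thread:
# (file offset, byte in file1, byte in file2)
errors = [
    (0x22FF152A, 0xDF, 0xFF),
    (0x3553A52A, 0x1A, 0x3A),
]

BUFFER_SIZE = 4096  # assumption: a 4k buffer, as suggested in the correction


def analyze(offset, byte1, byte2, buffer_size=BUFFER_SIZE):
    """Return (position within the buffer, mask of the bits that differ)."""
    return offset % buffer_size, byte1 ^ byte2


for offset, b1, b2 in errors:
    pos, mask = analyze(offset, b1, b2)
    print(f"offset {offset:#010x}: position in buffer {pos:#x}, differing bits {mask:#04x}")
```

Both errors land at position 0x52a with only the 0x20 bit differing, and in both cases the bit is set in file2 and clear in file1. Two data points are not much, which is why more compare runs were requested.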
February 21, 2008 Author
I will try replacing the NIC in the unRAID box. I have an extra 3Com gigabit NIC (not sure which model). Will it work in unRAID? By the way, below are the boards in the server and two other machines.
unRAID server: Asus P5B-VM DO (another failing Asus NIC problem?)
Desktop: MSI 6150
HTPC: Abit IP35 Pro
RobJ, before I started using unRAID I tried hardware RAID. Could the nForce4 problem explain why I was getting terrible transfer speeds from my desktop to the server?
February 21, 2008
"RobJ, before I started using unRAID I tried hardware RAID. Could the nForce4 problem explain the reason why I was getting terrible transfer speeds from my desktop to the server?"
I don't think so. The nForce chipsets have a good feature set and great performance, but a buggy implementation in early versions. Mine was very fast, and when it worked, you could see the potential. The problems I and many others had were related to data integrity, IDE detection and boot-up, and compatibility with Maxtor drives, plus possibly other problems I don't recall.
February 23, 2008 Author
Does anyone know if 3Com NICs are supported in unRAID? I have one, but I am not sure what model it is. I also wanted to report that it appears that transferred files are corrupted too. I tested a file on two different machines, and the point where the unRAID copy had a glitch showed up on both machines, but the original file is perfect at that point. So I guess that means whatever data I have transferred to my unRAID box since this problem started (I am not sure when that was) is possibly corrupted. I am really disappointed, to say the least.
Edit: Looking at 3Com's website, I think it is a 3C2000-T. I did a quick search and it uses the sk98lin driver. Is this in unRAID?
February 23, 2008
Didn't you say your file compare once found no differences?
"Third attempt: Files compare OK"
All we really know so far is that the networking is dropping a bit here and there at times. We do not know if it is only in one direction (unRAID to your PC) or in both. You might try to perform a checksum on both copies of your file using local programs on each box. On the unRAID box, log in via telnet and then type:
md5sum /mnt/user/Movies/your_movie_file
Then get an MD5 checksum program for your PC and do the same there. Same checksum = same file... no corruption.
Joe L.
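If no MD5 program is handy on the PC side, a few lines of Python can produce the same kind of checksum. This is just a minimal sketch using the standard `hashlib` module; the `md5_of_file` name and the example path are placeholders, not anything from this thread. It reads the file in chunks so that large video files do not have to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python md5file.py c:\original_file.avi
    for name in sys.argv[1:]:
        print(md5_of_file(name), name)
```

The hex string it prints should match the first field of the `md5sum` output on the server if the two copies are identical on disk.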
February 23, 2008
I'm sorry, I did not realize that your desktop machine was an nForce4. The corruption errors you are seeing are very similar to what I was seeing from mine, and are typical of the data integrity problems I and others have had. Can you determine which files were transferred from the MSI? That would absolutely be my number one suspect! Your unRAID server may be fine, if you can prove that the corrupted files are just those that were copied from the MSI to it. In the past, when you downloaded large files to the MSI, did you have to re-download more often than normal because of a corrupted download? That's quite typical of this problem.
"Edit: Looking at 3com's website I think it is 3C2000-T. I did a quick search and it uses the sk98lin driver. Is this in unRAID?"
Yes it is, and it is used by several users.
February 23, 2008
The tool I used to detect the bad bits and their pattern was the DOS utility FC.EXE, very similar to the COMP.COM you already used. It too is available in most Windows systems, and I used the following batch file to avoid redundant typing:
fc /b %2 %1%2
Just save that line to FCB.BAT in your Windows directory, map a drive letter to your unRAID or remote folder containing the suspect files, open a DOS console, change to the local folder containing the same files, and type 'fcb remotedrive filename'. Using Joe's example (comp c:\original_file.avi \\tower\movies\original_file_copy.avi), map T to \\tower\movies, change to c:\, then type 'fcb t: original_file.avi'.
Repeat the test multiple times on each file to be tested, and note the results. Try the same test from another computer. Huge files (over a gigabyte) work best. You may have to copy them to each computer to be able to test this. If you can NOT get a pair of computers to report any inconsistencies, then you have probably eliminated BOTH of those computers! That should help to determine the suspect machine.
I would be interested in seeing a list of the last 4 bytes of the error offsets, and the changed byte values at those offsets.
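For anyone who prefers a script to the batch file, the same kind of binary compare can be sketched in Python. This is only a rough stand-in for what `fc /b` reports, not a reimplementation of it: `compare_files` and `repeat_compare` are made-up names, the cap of 20 reported differences is arbitrary, and only the common length of the two files is compared.

```python
def compare_files(path_a, path_b, chunk_size=1024 * 1024, max_diffs=20):
    """Return up to max_diffs tuples of (offset, byte_a, byte_b) that differ.

    Only the common length of the two files is compared byte by byte.
    """
    diffs = []
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while len(diffs) < max_diffs:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        diffs.append((offset + i, x, y))
                        if len(diffs) >= max_diffs:
                            break
            offset += max(len(a), len(b))
    return diffs


def repeat_compare(path_a, path_b, passes=3):
    """Run the compare several times and print each pass, as suggested above."""
    for n in range(1, passes + 1):
        diffs = compare_files(path_a, path_b)
        if not diffs:
            print(f"pass {n}: files compare OK")
        for off, x, y in diffs:
            # The low 16 bits of the offset and the XOR of the two bytes
            # make a repeating pattern easy to spot across passes.
            print(f"pass {n}: offset {off:08X} (low {off & 0xFFFF:04X}) "
                  f"file1={x:02X} file2={y:02X} xor={x ^ y:02X}")
```

Calling something like `repeat_compare(r"c:\original_file.avi", r"t:\original_file.avi")` would mirror the 'fcb t: original_file.avi' example, with the drive mapping assumed to be in place.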
February 23, 2008 Author
I have been running my unRAID box headless, so today I connected a keyboard and monitor to do some more testing and possibly replace the NIC if needed. I decided to first run memtest (from the unRAID boot screen) on the system to rule out memory as the cause. My first run on my 2x512MB sticks resulted in 42 errors almost immediately after the test began, so I concluded that the memory was causing the problems. I then tested each stick by itself, and after 5 passes each reported 0 errors. Thinking that maybe a BIOS setting was causing the problem, I put both sticks back in, leaving the BIOS settings as they were, and ran memtest again. So far, after 3 passes, it has reported 0 errors. A possible cause may be that a memory stick was not seated properly. Also, the power has gone out for a few seconds a couple of times recently, so maybe that screwed something up in the system. I have already ordered a UPS for the server because of the power outages. When will UPS support be added to unRAID? I am going to assume the memory was the culprit, and let memtest run for maybe 10 passes. If it passes without any problems, I am going to close up the system and hope that the problem is now gone. Does anyone else have any suggestions?
February 23, 2008
Glad you found something... bad memory can cause any number of problems, all seemingly unrelated. Just a note: your memory could also be heat sensitive. Close your box, let it come up to temperature, and then run memtest again. With memory being as cheap as it is these days, I personally would replace it (unless you previously had BIOS settings that were causing it to fail and you have subsequently changed them to be more conservative).
Joe L.
February 23, 2008
Another thing to check would be that the memory works reliably with that board and your BIOS settings. Voltage/timing issues make some memory sticks behave erratically, with those problems sometimes (as Joe points out) flaring up when the board gets hot. Perhaps go to the mobo supplier's website and see if your memory is on the recommended list, or just google the two parts to see if someone has posted problems with that combination.
Bill
February 23, 2008 Author
The memory is a Corsair 2x512MB CAS 5 kit. In the BIOS I have left everything at its default for the memory. The case is a CM Stacker 810, which has two rear fans and a vented side panel, so I don't think it is a heat problem, but I will close it up and run memtest overnight.