May 11, 2012

Hi guys,

I am really banging my head here (read: going bald at an alarming rate). Long story short: I am a complete newbie on unRAID, so the learning curve is a bit steep.

I am having an issue where data copied to the unRAID share has quite a high chance of being corrupted, and I am not sure why. I have done this a few times already: I set up unRAID with a certain number of disks, build the array with parity, and then copy more than 1TB of test series to it over an SMB network (from a Windows 7 machine to unRAID). Afterwards I use a program called WinMerge to verify that all the files were copied successfully. Every time so far it has found files that are completely corrupted: when I open the video in a media player, the original file opens without any problems, but the new copy fails to open at all.

On the last attempt I went the preclear route (which took over 24 hours for all the drives) and then re-created the array using only disks that didn't report any "new" failing SMART parameters (one drive was showing errors; I will try to have it replaced soon). I then copied about 1.7TB of files over and noticed that the copy speed was not constant: it jumped between 40 and 50MBps and then dropped below 10MBps in some cases, so much so that copying the data took another day. I even tried this with a new array without a parity drive, expecting a more constant speed since no parity calculations or writes were supposed to happen, but the network performance was the same.

The problem is that verifying the data once again showed that about 16GB of files didn't survive the process. The corrupted files are spread across multiple disks, which leads me to believe it's either a network issue or something else went wrong.

I did break the rules a bit: I am running a beta with some add-ons (simplefeatures and Unmenu with some basic packages). I am completely willing to start from a clean slate (and, if possible, use a secondary onboard LAN interface if the current one proves to be the culprit). I am using user shares with a split level of three (resulting in one season of a show being stored per disk).

I plan to start using the unRAID server sometime soon for backing up data (or, probably even better, use it to store the latest data and keep my old rig as a backup; still deciding on that), so resolving this issue is rather important.

Can anybody recommend where I need to start looking to find the cause? Any ideas are welcome, as I am running out of them.

Thx
Neo_x

Syslog: http://pastebin.com/nVMmF1nN
unRAID version: 5.0-beta14

2012/05/28 Solved: it turns out some files on my source server had an incorrect modified date of 2098/01/01 (which got further damaged/modified after copying them to unRAID / Linux), causing me to be unable to open them with VLC media player. I corrected it with an application called touch (which adjusts the modified date/time), obtained from http://www.stevemiller.net/downloads/ctb10w32.zip
May 11, 2012 Author

> What is the type of MB? What does "ifconfig" and "ethtool eth0" show?

I can't find a link with more details, but the MB is an XFX NFORCE 680I SLI INTEL SOCKET 775 DDR2, running a Q6600, plus some additional 4-port Promise SATA cards.

It seems ifconfig will be of more use once I actually copy some data to the server again?

root@Storage:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:04:4b:06:0e:1a
          inet addr:192.168.0.4  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:21205 errors:0 dropped:0 overruns:0 frame:0
          TX packets:19970 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1987411 (1.8 MiB)  TX bytes:28792554 (27.4 MiB)
          Interrupt:41 Base address:0xc000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:130 errors:0 dropped:0 overruns:0 frame:0
          TX packets:130 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:27106 (26.4 KiB)  TX bytes:27106 (26.4 KiB)

The ethtool capture follows:

root@Storage:~# ethtool eth0
Settings for eth0:
        Supported ports: [ MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 1
        Transceiver: external
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: g
        Link detected: yes
root@Storage:~#
May 11, 2012

Did you run memtest at least overnight? The most probable culprit for corruption in many files spread across different disks is faulty memory.
May 11, 2012

> Did you run memtest at least overnight? The most probable culprit for corruption in many files spread across different disks is faulty memory.

And the second most likely suspect is the network card (believe it or not). We've had other members with similar problems, and the network card was involved in several...

As stated though, memory is the most likely.
May 11, 2012

I do not know about your board, but I know some of the earlier NForce motherboards were known for bad nvidia NICs. They were also known for data corruption in unRAID. See here: http://lime-technology.com/wiki/index.php/Hardware_Compatibility#Motherboards_.2F_Processors

I had a 600 series board and it would not negotiate 1Gb unless I unplugged it and plugged it back in. I returned it after I was told that there was no fix from the manufacturer and that nvidia was giving up on motherboard support, so a driver was unlikely.

This may not be your problem at all, but if you have a spare NIC (even a PCI 100Mb one), give it a try and see if it is better.

EDIT: I searched the forums a bit and I don't see anyone who has a server based on an nforce build.
May 12, 2012 Author

> Did you run memtest at least overnight? The most probable culprit for corruption in many files spread across different disks is faulty memory.

Trying this now.

> And second most suspect is the network card (believe it or not) We've had other members with similar problems, and the network card was involved in several... As stated though, memory is the most likely.

OK. Since I have the luxury of still having the data on another machine, I figure I will start from a clean rc3 and test the functionality there. (I will also try using TeraCopy with its CRC check, although I doubt that Windows copy can corrupt files.)

With regard to the extra NIC: I can test it as a temporary measure, but there are only two PCI slots, both taken up by SATA cards. I will have to shell out some cash in order to use the PCI-e slots (although, if memory serves, this board didn't like certain combinations of PCI-e cards).

Thx for the feedback guys. I will report back on memtest and the testing on RC3.
May 12, 2012

> although I doubt that Windows copy can corrupt files

Actually, if you are just using "copy" or "xcopy", I have seen it produce bad copies when moving terabytes of stuff across the network, even Windows server to Windows server. Use robocopy. You can also use Beyond Compare and TeraCopy for more stable writes.
May 28, 2012 Author

Hi guys,

Sorry for taking this long to reply; I have been busy.

After finishing most of the tests mentioned in previous posts (i.e. recreating the array, testing on a different NIC, etc.), I am still facing issues with file corruption.

Strange part: the file size etc. matches, but the detailed view from a Windows client doesn't show a date modified.

EVEN STRANGER PART, AND I TESTED THIS WITH A FEW CORRUPT FILES: if I delete a culprit file and recopy it (it doesn't matter from which client; I tried three: media center, other server and a Core i7 laptop), there is no improvement. TeraCopy with CRC check enabled doesn't pick up anything strange. Trying to open the same video in VLC shows the file is corrupt (it doesn't even start to play).

As mentioned, I tried different NICs and even direct LAN cables, with no change: the same file(s) get corrupted every single time, and this affects roughly 20GB out of 1.2TB of data. For example, I have a complete season of a specific show that refuses to reproduce on the unRAID share.

I am completely at a loss, as I haven't seen consistent failures to copy to another machine in my 18 years of working on PCs. If it's, for example, a memory/board issue, there is always something random that goes wrong; this time around it just gives me the finger when I try to re-copy a file.

Can anyone recommend other ways I can try to isolate the issue? Ideas I have (but no clue how to implement them):

* mount/share a drive directly on the unRAID server (without having it in the array) and test the copy like that (maybe it's specific drives?)
* try to virtualize unRAID? (I notice Johnm managed it, so it shouldn't be all that impossible)
* I don't exactly want to go this route, as it will complicate a few things, but if it guarantees data validity, I am all for it

Otherwise I am really at my wits' end; I will have to try alternative solutions (MDADM?) until such time as I can afford to buy more compatible hardware.

Thx in advance for any constructive ideas
Neo_x
May 28, 2012

You could try working through this to see if an issue with the drives being read shows up: http://lime-technology.com/wiki/index.php/FAQ#How_To_Troubleshoot_Recurring_Parity_Errors
May 28, 2012

Step 1. Get an MD5 checksum of the file from a Windows program.
Step 2. Transfer it to unRAID.
Step 3. Get an MD5 checksum there. If it is the same, there is nothing wrong with the transfer (or the file).
Step 4. Do a memory check on your PC (not the server, but the PC).

We've seen other members' issues like this where it was the network card, and others where it was the PC.

On unRAID, to get an MD5 checksum, type at the command prompt:

md5sum /mnt/user/ShareName/directory/filename

Joe L.
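For Step 1, any Windows MD5 tool will do. As one option, here is a minimal sketch in Python, assuming Python is installed on the PC; the function names `md5_of_file` and `same_contents` are made up for illustration and are not part of any unRAID or Windows tool. It reads the file in chunks so that large video files do not have to fit in memory, and its output can be compared by eye with what `md5sum` prints on the server.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(source_path, copy_path):
    """True when both files hash to the same MD5 digest."""
    return md5_of_file(source_path) == md5_of_file(copy_path)
```

For example, `same_contents(r"J:\sp1.avi", r"\\Storage\ShareName\sp1.avi")` would compare a local file with its copy on a mapped share; both paths here are placeholders. Matching digests only say that the file contents are identical; they say nothing about metadata such as the modified date.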
May 28, 2012 Author

> You could try working through this to see if an issue with the drives being read shows up;
> http://lime-technology.com/wiki/index.php/FAQ#How_To_Troubleshoot_Recurring_Parity_Errors

> Step 1. Get a MD5 checksum of the file from a windows program
> Step 2. Transfer it to unRAID
> Step 3. Get an MD5 checksum there. If the same, there is nothing wrong with the transfer. (or the file)
> Step 4. Do a memory check on your PC. (not the server, but the PC)
> We've seen other members issues like this where it was the network card. And others where it was the PC.
> on unRAID, to get an MD5 checksum, at the command prompt type:
> md5sum /mnt/user/ShareName/directory/filename
> Joe L.

Both excellent ideas. I completely forgot about the MD5 possibility. I will try it and report back once done.

Thx Joe / Lionel, much appreciated
May 28, 2012 Author

*feedback*

original file (from PC): 1df3467443cef371a72f3f2df4a05b6f
copied file (from PC): 1df3467443cef371a72f3f2df4a05b6f
file on mnt (unRAID): 1df3467443cef371a72f3f2df4a05b6f

So it's not at the MD5 / file-contents level; everything matches.

I am not sure that testing the memory of the PC is going to help since, as mentioned in my previous post, I determined it has something to do with the source files (the same files keep failing to copy). Copying from three different machines to unRAID produces the same results with the same files...

I have since noticed a very odd characteristic shared by the failing files: Date modified is 01 January 2098, 12:00:00 AM.

Once I copy such a file over to unRAID, ls -lt shows that its date has been changed to 2038-01-19 03:14. Copying it back to Windows then just makes matters worse: date modified doesn't show up at all (Explorer GUI), and VLC player is no longer able to open the file.

From the Windows PC command prompt, dir /TW sp*.* reports:

 Volume in drive J is BEEEG
 Volume Serial Number is E27F-6551

 Directory of J:\

2098/01/01  12:00 AM       183 437 312 sp1.avi
1601/01/01  02:00 AM       183 437 312 sp1_.avi
               2 File(s)    366 874 624 bytes

sp1.avi - original file
sp1_.avi - file after copying it back from unRAID to Windows (I get the same result from a mapped drive)

This is the first time I have encountered something like this... Yes, most probably the issue starts with the source file having a future modified date.

Any thoughts?

Thx
Neo_x
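One possible reading of these dates, offered as an assumption rather than a confirmed diagnosis: 2038-01-19 03:14:07 UTC is the largest time a signed 32-bit count of seconds since 1970 can hold, so a 2098 date would not fit if the server stored times that way, and 1601-01-01 is the zero point of Windows file times. The sketch below only does the date arithmetic; the clamping behaviour and the names `clamp_to_int32` and `filetime_to_datetime` are illustrative, not taken from unRAID, Samba or Windows.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)  # Windows file-time zero
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def clamp_to_int32(seconds):
    """Clamp a seconds-since-1970 value into the signed 32-bit range (assumed behaviour)."""
    return max(INT32_MIN, min(seconds, INT32_MAX))


def filetime_to_datetime(ticks):
    """Convert a Windows file time (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# The modified date reported for the failing files (treated as UTC for simplicity).
original = datetime(2098, 1, 1, tzinfo=timezone.utc)
seconds = int((original - UNIX_EPOCH).total_seconds())

# The date after squeezing that value into 32 bits.
clamped = UNIX_EPOCH + timedelta(seconds=clamp_to_int32(seconds))

print("seconds since 1970:", seconds, "fits in int32:", seconds <= INT32_MAX)
print("clamped date:", clamped.isoformat())
print("file time zero:", filetime_to_datetime(0).isoformat())
```

If this reading is right, the 02:00 AM shown next to 1601/01/01 would simply be a zero file time displayed in a UTC+2 local time zone, but that is a guess from the listing, not something verified here.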
May 28, 2012 Author

> Is NTP configured?

No, I don't have it enabled. I fixed my time zone now, just in case.

I did manage to solve it (updated my first post):

Solved: it turns out some files on my source server had an incorrect modified date of 2098/01/01 (which got further damaged/modified after copying them to unRAID / Linux), causing me to be unable to open them with VLC media player. I corrected it with an application called touch (which adjusts the modified date/time), obtained from http://www.stevemiller.net/downloads/ctb10w32.zip
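For anyone who would rather script the fix than run a touch utility by hand, here is a rough sketch of the same idea in Python: walk a folder on the source machine, find files whose modified time lies in the future, and reset it to the current time. The function name `reset_future_mtimes` and its `dry_run` parameter are invented for this example, and it has not been run against the files from this thread.

```python
import os
import time


def reset_future_mtimes(root, now=None, dry_run=True):
    """Find files under `root` whose modified time is later than `now`.

    With dry_run=True the files are only reported; with dry_run=False their
    access and modified times are both set to `now`.
    Returns the list of affected paths.
    """
    now = time.time() if now is None else now
    affected = []
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if os.stat(path).st_mtime > now:
                if not dry_run:
                    os.utime(path, (now, now))  # (access time, modified time)
                affected.append(path)
    return affected
```

Running it first with the default `dry_run=True` lists the files that would be changed, which is a cheap way to confirm that only the 2098-dated files are picked up before anything is modified.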