Kandinsky Posted November 26, 2012

Dear Experts,

I have been happily running unRAID for about 6 months with no problems at all. The issue I now face is that writing new files to it over the network times out with nothing being written, and the whole system hangs as a result. I then can't connect via the web interface or log in with PuTTY, and have to hard reset the server to get it back up. Because the hard reset wipes the logs, I can't retrieve any log files from the time of the hang. I have included my current log file from after a hard reset this morning. Parity is OK.

I also upgraded from RC5 to RC8a thinking this might help, but the same behavior occurs. I am wondering whether there is something fundamentally wrong with the install on the flash drive, and whether I should reformat it and start with a fresh copy of unRAID. If so, how can I transfer my current settings (array name etc.)? I assume I can always reinstall the plugins later.

I can't find any mention of this type of problem on the forums, so I would appreciate any advice from people with more knowledge and experience than I have. Many thanks in advance!

Attachment: syslog-2012-11-26.txt
dgaschk Posted November 26, 2012

See my sig to disable add-ons.
jazzysmooth Posted November 26, 2012

I had a similar issue some time ago. It turned out to be caused by the onboard NIC sharing an IRQ with the secondary SATA controller.

A few questions:
- Have you made any hardware changes recently?
- Are you using the onboard NIC? If so, what chipset is it? (Mine was Realtek.)
- Are you using the onboard SATA connectors?
- What size are the files you're attempting to copy?

Some tests:
1) Can you successfully copy 1+ GB of data from disk to disk using Midnight Commander (taking the network out of the equation)?
2) Can you successfully copy small files (2-5 MB) across the network?
3) Can you successfully copy a 1+ GB file across the network?

If test 1 works and you are using the onboard NIC, try adding a dedicated NIC (most people recommend Intel) and see if that makes any difference.
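One way to look for the kind of IRQ sharing described in the post above is to read /proc/interrupts, where devices that share an interrupt line are listed comma-separated on the same row. The following is only a minimal sketch, assuming the usual Linux /proc/interrupts layout; the function name `shared_irqs` is made up for illustration and is not part of unRAID.

```shell
#!/bin/bash
# Hypothetical helper: print the /proc/interrupts rows that list more than one device.
# Assumption: devices sharing an IRQ appear on one row separated by ", ".
shared_irqs() {
    local file="${1:-/proc/interrupts}"
    # Match numbered IRQ rows only (skips the CPU header and named rows such as NMI).
    grep -E '^ *[0-9]+:.*, ' "$file"
}
```

Running `shared_irqs` with no argument on the server reads the live table. A row naming both the network interface (for example eth0) and a SATA driver would match the situation described above; if nothing is shared, nothing is printed and grep returns a non-zero status.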
Kandinsky Posted December 1, 2012 (Author)

No hardware changes recently, other than replacing a 2 TB parity drive with a 3 TB one, but I was still writing files to the system for a couple of weeks after that. The log file shows the network card to be:

Tower kernel: eth0: Identified chip type is 'RTL8168E/8111E'. (Network)

I am using a Foxconn A88GMV AMD 880G (Socket AM3) motherboard with no extra SATA cards at the moment, just the 6 onboard SATA connectors. I have also tried fitting new SATA cables, but to no avail.

I have also just tried a fresh vanilla install of unRAID RC8a with only the unMENU plugin. This morning I tried to copy a 1 GB file, which again ended with the network drive no longer being visible and the web GUI no longer accessible, so I assume it crashed again. The motherboard BIOS is from 2010, but I am not sure whether upgrading it would make any difference.

I will perform the tests you mentioned and report back.
Kandinsky Posted December 1, 2012 (Author)

OK, I have tried all the tests you suggested.

1. Yes, no problem, but it was only running at less than 9 MB/s, which seems slow (I copied an 8 GB file between disks).
2. Yes, they seem to copy, including files of 50-60 MB.
3. As soon as I tried to copy large video files, the system seemed to crash.

Streaming from unRAID is faultless and has never crashed when watching movies etc. It is ONLY when writing files to the array. Do people think it's still a network problem?

Is there a way to capture the log files onto the USB stick so that after the reboot I can see what errors are occurring? They get wiped when I reboot the box.

Thanks!
Frank1940 Posted December 1, 2012

Try this: http://lime-technology.com/forum/index.php?topic=24271.msg211992#msg211992

It is a bit of a long shot, as you had problems prior to rc8, but several people have had Samba problems in rc8 that were fixed by upgrading Samba from 3.6.7 to 3.6.8.
COglesby21 Posted December 1, 2012

I am having this exact same problem, except that I am new to unRAID. I set up my array with 2 data drives and copied all of my media files (1 TB worth) over the network with no problems at 50+ MB/s. I then installed my 3rd disk, configured it as parity, ran a parity sync (which took about 25 hours), and then ran an additional parity check once the sync was finished to double-check everything.

Now, however, when I try to copy any large files over the network, I get 15 MB/s at best, and before the first file finishes transferring I get a network error, my array becomes completely unresponsive, and I have to reboot the system. Reading works fine, so just like the original poster, the problem occurs only when writing to the array.
Kandinsky Posted December 2, 2012 (Author)

Thanks for the advice. I did upgrade to Samba 3.6.8, but the problem still seems to be happening. I did manage to capture the log file from the server while the network connection and web GUI stopped working, using this command:

tail -f --lines=100 /var/log/syslog >/boot/syslogtail.txt

The file doesn't look easy to read, but if anyone has ANY ideas about how I can fix this I would be so grateful, as it can't be good for the system to be hard rebooted all the time. Thanks in advance...

Attachment: syslogtail2.txt
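The capture approach in the post above relies on the flash drive (mounted at /boot) surviving a hard reset, while /var/log/syslog does not. As a rough sketch of the same idea in reusable form, the helper below copies the last N lines of a log to a persistent location; the name `snapshot_log` and its arguments are hypothetical, chosen only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: copy the last N lines of a log file to persistent storage.
# Assumption: on unRAID the syslog is kept in RAM at /var/log/syslog and the
# flash drive is mounted at /boot, as the command in the post above implies.
snapshot_log() {
    local src="$1" dest="$2" lines="${3:-100}"
    tail -n "$lines" "$src" > "$dest"
    # Flush pending writes so the copy reaches the flash drive before a possible hang.
    sync
}

# To follow the log continuously until the machine hangs, as in the post above,
# the same tail command can be left running in the background:
#   tail -f --lines=100 /var/log/syslog > /boot/syslogtail.txt &
```

For example, `snapshot_log /var/log/syslog /boot/syslog-snapshot.txt 200` would save the most recent 200 lines. A one-off snapshot only helps if it is taken before the hang; the continuous `tail -f` form is the one that captures the final messages.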
dgaschk Posted December 2, 2012

Run checkdisk on the flash drive in a PC or Mac. Run reiserfsck --check on all of the data drives. See Check File Systems in my sig.
jazzysmooth Posted December 2, 2012

You have a Realtek NIC (same as I did) and you get the lockup when writing larger files (same as I did). I also never had an issue reading files, only writing them. I'd try a dedicated NIC.
dgaschk Posted December 2, 2012

Disable PnP in the BIOS. See my sig.
Kandinsky Posted December 2, 2012 (Author)

Thanks, that looks like it was a good idea. I get the error below. I understand from the Wiki that I need to be careful about using --rebuild-tree. Any advice on this, please?

Checking internal tree..
\/ 18 (of 18//165 (of 167\/ 1 (of 161|
bad_path: The left delimiting key [11512 11513 0x1448b001 IND (1)] of the node (401319054) must be equal to the first element's key [11510 11511 0xbf89d001 IND (1)] within the node.
/166 (of 167/
block 430735515: The level of the node (0) is not correct, (2) expected
the problem in the internal node occured (430735515)
finished
Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
2 found corruptions can be fixed only when running with --rebuild-tree
dgaschk Posted December 2, 2012

Yes. Run with --rebuild-tree.
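The check-then-repair sequence discussed above can be sketched as a small dry-run helper that only prints the command for a given data disk, so it can be reviewed before anything is run. This is an illustration under stated assumptions, not the documented unRAID procedure: the helper name `fsck_cmd` is hypothetical, and the /dev/mdN device naming and the requirement that the disk be unmounted during the check should be confirmed against the Check File Systems wiki page referenced earlier in the thread.

```shell
#!/bin/bash
# Hypothetical dry-run helper: print (do not run) the reiserfsck command for one data disk.
# Assumptions: unRAID exposes data disk N as /dev/mdN, and the file system is not
# mounted while reiserfsck runs. Verify both against the wiki before use.
fsck_cmd() {
    local mode="$1" disk="$2"
    case "$mode" in
        check)   echo "reiserfsck --check /dev/md${disk}" ;;
        rebuild) echo "reiserfsck --rebuild-tree /dev/md${disk}" ;;
        *)       echo "usage: fsck_cmd check|rebuild DISK_NUMBER" >&2; return 1 ;;
    esac
}
```

For example, `fsck_cmd check 1` prints `reiserfsck --check /dev/md1`. The read-only check comes first, and --rebuild-tree is used only when the check output says it is required, as it did in the output posted above.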
Kandinsky Posted December 5, 2012 (Author)

Thanks to everyone for their input. In the end, the reason the system was dropping off the network and file transfers were timing out was file system problems. I ran reiserfsck --check and the subsequent fix commands on 2 drives. Now the problem has gone away and I am back to writing files with no problems, so it wasn't anything to do with the NIC (Realtek).

There was no indication in the log files of a file system problem, so I assume there is no way to actually know that you need to run the fix? I assume it would be good practice to run reiserfsck periodically on all drives. Are there any plugins to automate the process or report on it?
mr-hexen Posted December 5, 2012

Replying to COglesby21: Those speeds are normal (15 MB/s is maybe a touch low, but normal-ish). The reason is that parity calculations are being done on the fly. Search around for the benefits of a cache drive.