mkny13 Posted July 10, 2011

My server was working smoothly for almost a year. A few months ago it started becoming unresponsive every once in a while: I could not connect to shares, couldn't bring up the standard menu or unMenu in a browser using the name or IP, couldn't ping the IP, and the server showed no signs of activity. A hard power cycle was the only way to bring it back up. Over the past few weeks, though, this has happened with increasing frequency and comes on quicker. I can't even get through a resync before it dies again.

Hardware specs:
Case: Antec 900
Power Supply: Enermax Modu 82+ 525W, Modular
Motherboard: Gigabyte GA-MA74GM-S2
Processor: AMD Athlon 64 LE-1640
RAM: Transcend JETRAM 2GB (2 x 1GB) DDR2 800
Drive Controller: Adaptec 1430SA PCI-E x4
Drives: 6 total, mix of 1 and 2 TB WDEARS plus a 1TB Maxtor and an older 160GB cache
Flash drive: 2GB Cruzer

Software:
Version 4.5.6
unMenu with about half the available plugins enabled
TwonkyMedia

go file:
#!/bin/bash
# Start the Management Utility
/usr/local/sbin/emhttp &
cd /boot/packages && find . -name '*.auto_install' -type f -print | sort | xargs -n1 sh -c
# Directory caching
# sysctl vm.vfs_cache_pressure=1 off, until find out whether should run alongside cache_dirs
/boot/cache_dirs -d 10 -w -e "Backup"
#####################################
### Wait for the array to start ###
#####################################
# (before installing any packages that may expect the array to be fully started)
until `cat /proc/mdcmd 2>/dev/null | grep -q -a "STARTED" ` ; do echo ">>>waiting..." ; sleep 1 ; done ; echo ">>>STARTED."
#unMenu autostart cmd
/boot/unmenu/uu
# Twonky
/boot/custom/twonkymedia-i386-glibc-2.2.5-5.1/twonkymedia -inifile /boot/custom/twonkymedia-i386-glibc-2.2.5-5.1/twonkymedia-server.ini

I've read through the FAQ and the Troubleshooting guide, and done my best to search the forum. I'd appreciate any advice anyone has to offer.

I've attached logs from top and tail operations I left running up to the point of the latest crash, a syslog generated by powerdown at the last successful clean shutdown (20110629), and a syslog from a little after boot but before lockup (20110710). If there's any more info I can provide, please let me know.

Should I go ahead and upgrade to the latest stable version? I considered this and thought it might be better to get my server stabilized before attempting that, but if others advise to the contrary, I can do that right away.

Thanks!
Mike

MK-unraid-log.zip
cyrnel Posted July 10, 2011

Looks like networking problems to me.

Jun 27 15:23:27 Unraid kernel: r8169: eth0: link down
Jun 27 15:23:27 Unraid ifplugd(eth0)[1309]: Link beat lost.
Jun 27 15:23:32 Unraid kernel: r8169: eth0: link up
Jun 27 15:23:33 Unraid ifplugd(eth0)[1309]: Link beat detected.
Jun 27 15:23:35 Unraid kernel: r8169: eth0: link down
Jun 27 15:23:35 Unraid ifplugd(eth0)[1309]: Link beat lost.
Jun 27 15:23:37 Unraid kernel: r8169: eth0: link up
Jun 27 15:23:38 Unraid ifplugd(eth0)[1309]: Link beat detected.

Plenty of alignment errors, too. Incompatibilities usually look much worse than this though, so perhaps check the cable, the switch port it's plugged into, or the switch/router itself.
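For anyone wanting to gauge how often the link is flapping, the kernel lines quoted above can be counted straight from a syslog file. This is only a rough sketch: the function name `count_link_drops` is made up for illustration, and it assumes the kernel messages keep the "eth0: link down" wording shown in the excerpt.

```shell
#!/bin/bash
# Hypothetical helper: count kernel "link down" events for one interface.
# Arguments: path to a syslog file, and optionally the interface name
# (defaults to eth0, the interface seen in the log excerpt).
count_link_drops() {
    local logfile="$1"
    local iface="${2:-eth0}"
    # Matches lines such as "... kernel: r8169: eth0: link down".
    # The ifplugd "Link beat lost." lines are deliberately not counted.
    grep -c "${iface}: link down" "$logfile"
}
```

It could be run as `count_link_drops /var/log/syslog` on the live server, or pointed at one of the saved syslog copies. A count that keeps climbing between checks would support the flaky-link theory.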
mkny13 Posted July 11, 2011

Really? Could networking issues lock up the machine entirely? Even on a local monitor and keyboard the thing is unresponsive.
Joe L. Posted July 11, 2011

mkny13 wrote: "Really? Could networking issues lock up the machine entirely? Even on a local monitor and keyboard the thing is unresponsive."

If the system log fills up all available memory (with those error messages), Linux will terminate processes in an attempt to make more memory available. It will terminate the processes that have been idle the longest... typically, that will end up killing processes used to log in and to supply the management interface.

So yes, network errors can fill the system log and use all available memory, and subsequently lock up the machine.
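One way to check whether the log really is eating memory before a lockup is to record the syslog size alongside free memory at intervals. The following is a minimal sketch, not a tested unRAID tool: the function names are invented for illustration, and it assumes the log lives at /var/log/syslog and that /proc/meminfo is readable.

```shell
#!/bin/bash
# Hypothetical helper: print the size of a file in bytes.
log_size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Hypothetical helper: print one status line with the time, the log size,
# and the MemFree value (in kB) reported by /proc/meminfo.
log_status_line() {
    local logfile="$1"
    local free_kb
    free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo 2>/dev/null) || free_kb="unknown"
    echo "$(date '+%H:%M:%S') syslog=$(log_size_bytes "$logfile")B memfree=${free_kb}kB"
}
```

A loop such as `while true; do log_status_line /var/log/syslog >> /boot/syslog-watch.txt; sleep 60; done &` would leave a trail on the flash drive that survives a hard power cycle. The output path is an assumption, picked only because /boot is where the go file already keeps its packages.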
bkasten Posted July 11, 2011

mkny13 wrote: "Really? Could networking issues lock up the machine entirely? Even on a local monitor and keyboard the thing is unresponsive."

In the past I have had motherboards lock up on me when the on-board NIC was failing. The NIC had not completely failed, but was dying, and it would lock up my computer (not my unRAID server, just another computer). It was somewhat frustrating, but eventually I figured it out, disabled the on-board NIC, and added a PCI NIC I had lying around. That fixed it right up and I have had no problems from it since.

So, yes, networking issues can sometimes lock up a computer.

Bruce
mkny13 Posted July 12, 2011

I could put in a PCI NIC easily enough (that might let me successfully use S3 sleep, too). Would you recommend doing that first, or maybe creating a swap file on a hard disk to avoid running out of memory?
bkasten Posted July 12, 2011

I think the immediate need is to fix your NIC issues. Once that is fixed, you may find that you don't need any more memory. At least that is where I would start.

Bruce
wsume99 Posted July 12, 2011

I'd try the PCI NIC. Adding more RAM isn't going to fix the problem. It might delay the onset of the issue, but eventually you'll run out of space. The analogy here is a leaking roof. If your roof was leaking, would you fix the roof (fix the NIC) or would you just put a bigger bucket in the house to catch the water (add more RAM)?
mkny13 Posted July 12, 2011

OK, solid advice. Anyone know offhand the cheapest gigabit NIC that's known to be unRAID compatible?
mkny13 Posted July 12, 2011

Looks like this is the winner, at least if I'm taking advantage of Prime. Is there any compelling reason to spend an extra $10 on an Intel 1000?
cyrnel Posted July 12, 2011

Did you have a chance to swap cables/ports & such?
mkny13 Posted July 12, 2011

I moved a few weeks ago, which entailed a new cable and port (or at least an 80% chance of a new port, as I made no attempt to reconnect my devices to their previous ports).
cyrnel Posted July 12, 2011

Okay. It was a reach but I had to try it.

I'd still go with an Intel Pro NIC. The driver is in another league. The two I bought through eBay for $12/ea (seller sinowo) arrived in about a week.
mkny13 Posted July 13, 2011

I tried to cancel my D-Link order and get an Intel, but it was too late. Oh well.

Here's something I wish had occurred to me earlier: couldn't I test this theory of network errors causing syslog overrun by unplugging the ethernet cable entirely, then booting and letting the array finish its resync before I ever plug it back in? Giving that a try today. I've got nothing else to do until the NIC gets here tomorrow, anyway.
mkny13 Posted July 14, 2011

Tried that out. I unplugged the ethernet cable, booted unRAID, and left it alone for 24 hours. I just plugged it back in and checked on it, and it looks to be running smoothly. Now I've started a Parity Check. I'm going to unplug the network cable again and check on it tonight. My new NIC should be here today, so I'll install that ASAP.

Thanks for the help everyone.
mkny13 Posted August 1, 2011

Well, hit issues with the new NIC (knew I should've sprung for the Intel). Still, I'm going to mark this one SOLVED. Thanks for the help everyone.