July 10, 2011

My server was working smoothly for almost a year. A few months ago, it started becoming unresponsive every once in a while: I could not connect to shares, could not bring up the standard menu or unMenu in a browser using the name or IP, could not ping the IP, and the server showed no signs of activity. A hard power cycle was the only way to bring it back up. Over the past few weeks, though, this has happened with increasing frequency and comes on more quickly. I can't even get through a resync before it dies again.

Hardware specs:
Case: Antec 900
Power Supply: Enermax Modu 82+ 525W, Modular
Motherboard: Gigabyte GA-MA74GM-S2
Processor: AMD Athlon 64 LE-1640
RAM: Transcend JETRAM 2GB (2 x 1GB) DDR2 800
Drive Controller: Adaptec 1430SA PCI-E x4
Drives: 6 total, mix of 1 and 2 TB WDEARS plus a 1TB Maxtor and an older 160GB cache
Flash drive: 2GB Cruzer

Software:
Version 4.5.6
unMenu with about half the available plugins enabled
TwonkyMedia

go file:

    #!/bin/bash
    # Start the Management Utility
    /usr/local/sbin/emhttp &
    cd /boot/packages && find . -name '*.auto_install' -type f -print | sort | xargs -n1 sh -c
    # Directory caching
    # sysctl vm.vfs_cache_pressure=1 off, until find out whether should run alongside cache_dirs
    /boot/cache_dirs -d 10 -w -e "Backup"
    #####################################
    ### Wait for the array to start ###
    #####################################
    # (before installing any packages that may expect the array to be fully started)
    until `cat /proc/mdcmd 2>/dev/null | grep -q -a "STARTED" ` ; do echo ">>>waiting..." ; sleep 1 ; done ; echo ">>>STARTED."
    #unMenu autostart cmd
    /boot/unmenu/uu
    # Twonky
    /boot/custom/twonkymedia-i386-glibc-2.2.5-5.1/twonkymedia -inifile /boot/custom/twonkymedia-i386-glibc-2.2.5-5.1/twonkymedia-server.ini

I've read through the FAQ and the Troubleshooting guide, and done my best to search the forum. I'd appreciate any advice anyone has to offer.
I've attached logs from top and tail operations I left running up to the point of the latest crash, a syslog generated by powerdown at the last successful clean shutdown (20110629), and a syslog from a little after boot but before lockup (20110710). If there's any more info I can provide, please let me know.

Should I go ahead and upgrade to the latest stable version? I considered this and thought it might be better to get my server stabilized before attempting that, but if others advise to the contrary, I can do that right away.

Thanks!
Mike

MK-unraid-log.zip
July 10, 2011

Looks like networking problems to me.

Jun 27 15:23:27 Unraid kernel: r8169: eth0: link down
Jun 27 15:23:27 Unraid ifplugd(eth0)[1309]: Link beat lost.
Jun 27 15:23:32 Unraid kernel: r8169: eth0: link up
Jun 27 15:23:33 Unraid ifplugd(eth0)[1309]: Link beat detected.
Jun 27 15:23:35 Unraid kernel: r8169: eth0: link down
Jun 27 15:23:35 Unraid ifplugd(eth0)[1309]: Link beat lost.
Jun 27 15:23:37 Unraid kernel: r8169: eth0: link up
Jun 27 15:23:38 Unraid ifplugd(eth0)[1309]: Link beat detected.

Plenty of alignment errors, too. Incompatibilities usually look much worse than this, though, so perhaps check the cable, the switch port it's plugged into, or the switch/router itself.
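One way to put a number on the flapping is to count the link-down messages in a saved syslog. This is only a minimal sketch: it assumes the log uses the r8169 message format shown in the excerpt above and that the interface is eth0 unless told otherwise. The function name count_link_drops is made up for illustration.

```shell
#!/bin/bash
# Count "link down" kernel messages for one interface in a syslog file.
# Usage: count_link_drops /path/to/syslog [interface]
# (count_link_drops is a hypothetical helper name, not part of unRAID.)
count_link_drops() {
    local logfile=$1
    local iface=${2:-eth0}   # assumption: eth0, as in the excerpt above

    # Bail out early if the file cannot be read.
    [ -r "$logfile" ] || { echo "cannot read $logfile" >&2; return 1; }

    # -a treats the file as text even if it contains stray binary bytes;
    # -c prints the number of matching lines. grep exits non-zero when the
    # count is 0, so "|| true" keeps the function's status clean.
    grep -a -c "${iface}: link down" "$logfile" || true
}
```

Something like `count_link_drops /var/log/syslog` would then print a single number. A handful of drops over several days might just be a switch reboot; a steady stream points back at the cable, port, or NIC.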
July 11, 2011 (Author)

Really? Could networking issues lock up the machine entirely? Even on a local monitor and keyboard the thing is unresponsive.
July 11, 2011

"Really? Could networking issues lock up the machine entirely? Even on a local monitor and keyboard the thing is unresponsive."

Because unRAID runs from RAM, the system log is held in memory. If the log fills up all available memory (with those error messages), Linux will terminate processes in an attempt to make more memory available. Which processes get killed is decided by a kernel heuristic based largely on memory use, and the victims can easily include the processes used to log in and to supply the management interface. So yes, network errors can fill the system log and use all available memory, and subsequently lock up the machine.
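To see whether the log is actually eating into memory, the size of the syslog can be compared with the free memory the kernel reports. The following is a rough sketch, not a diagnostic tool: it assumes the log lives at /var/log/syslog on a RAM-backed filesystem and that /proc/meminfo has the usual MemFree line. The function name log_vs_free_kb is hypothetical.

```shell
#!/bin/bash
# Print the size of a log file next to the kernel's free-memory figure.
# Usage: log_vs_free_kb [logfile] [meminfo-file]
# Both defaults are assumptions about a typical layout; pass other paths
# to inspect saved copies. (log_vs_free_kb is a hypothetical helper name.)
log_vs_free_kb() {
    local logfile=${1:-/var/log/syslog}
    local meminfo=${2:-/proc/meminfo}
    local log_bytes log_kb free_kb

    # Size of the log in bytes; fail if the file cannot be read.
    log_bytes=$(wc -c < "$logfile") || return 1

    # Round up to whole kB so a tiny log does not show as 0.
    log_kb=$(( (log_bytes + 1023) / 1024 ))

    # The MemFree line looks like "MemFree:   123456 kB"; field 2 is the value.
    free_kb=$(awk '/^MemFree:/ {print $2}' "$meminfo")

    echo "log_kb=${log_kb} free_kb=${free_kb}"
}
```

Running `log_vs_free_kb` every few minutes while the link is flapping should show whether the log is growing fast enough to matter. If log_kb stays small while the machine still locks up, the log-fills-memory explanation is probably not the whole story.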
July 11, 2011

"Really? Could networking issues lock up the machine entirely? Even on a local monitor and keyboard the thing is unresponsive."

In the past I have had motherboards lock up on me when the on-board NIC was failing. The NIC had not completely failed, but it was dying, and it would lock up my computer (not my unRAID server, just another computer). It was somewhat frustrating, but eventually I figured it out, disabled the on-board NIC, and added a PCI NIC I had lying around. That fixed it right up, and I have had no problems from it since. So, yes, networking issues can sometimes lock up a computer.

Bruce
July 12, 2011 (Author)

I could put in a PCI NIC easily enough (that might let me successfully use S3 sleep, too). Would you recommend doing that first, or maybe creating a swap file on a hard disk to avoid running out of memory?
July 12, 2011

I think the immediate need is to fix your NIC issues. Once that is fixed, you may find that you don't need any more memory. At least that is where I would start.

Bruce
July 12, 2011

I'd try the PCI NIC. Adding more RAM isn't going to fix the problem; it might delay the onset of the issue, but eventually you'll run out of space. The analogy here is a leaking roof. If your roof were leaking, would you fix the roof (fix the NIC), or would you just put a bigger bucket in the house to catch the water (add more RAM)?
July 12, 2011 (Author)

OK, solid advice. Anyone know offhand the cheapest gigabit NIC that's known to be unRAID-compatible?
July 12, 2011 (Author)

Looks like this is the winner, at least if I'm taking advantage of Prime. Is there any compelling reason to spend an extra $10 on an Intel 1000?
July 12, 2011 (Author)

I moved a few weeks ago, which entailed a new cable and port (or at least an 80% chance of a new port, as I made no attempt to reconnect my devices to their previous ports).
July 12, 2011

Okay. It was a reach, but I had to try it. I'd still go with an Intel Pro NIC. The driver is in another league. The two I bought through eBay for $12/ea (seller sinowo) arrived in about a week.
July 13, 2011 (Author)

I tried to cancel my D-Link order and get an Intel, but it was too late. Oh well.

Here's something I wish had occurred to me earlier: couldn't I test this theory of network errors causing syslog overrun by unplugging the ethernet cable entirely, then booting and letting the array finish its resync before I ever plug it back in? Giving that a try today; got nothing else to do until the NIC gets here tomorrow, anyway.
July 14, 2011 (Author)

Tried that out. I unplugged the ethernet cable, booted unRAID, and left it alone for 24 hours. I just plugged it back in and checked on it, and it looks to be running smoothly. Now I've started a Parity Check. I'm going to unplug the network cable again and check on it tonight. My new NIC should be here today, so I'll install that ASAP. Thanks for the help, everyone.
August 1, 2011 (Author)

Well, hit issues with the new NIC (knew I should've sprung for the Intel). Still, I'm going to mark this one SOLVED. Thanks for the help, everyone.