My unRAID server stopped responding, why did this happen?



Hi. Today I was streaming an SD video from my unRAID server when it stopped responding: both the network file share and the HTTP site became very unresponsive. The same thing happened from another PC (both are Windows 7 x64 systems). I was able to telnet to the box and restart it, which fixed the issue, after which unRAID performed fine.

I've attached my latest syslog file. I can't work out why this occurred, though a similar incident may have happened some time ago; I can't recall exactly. I'm on unRAID Server Pro v4.5.1. My server is built from 10 Seagate 1.5TB 7200RPM disks: 6 on the onboard SATA ports of an Asus M4A785T-M motherboard, and the remaining 4 split evenly across two Silicon Image 3114 SATA PCI cards. The disks were at 40C and are set to switch off after 3 hours of inactivity (though the server had only been running for 2 hours when this incident occurred).

 

Any idea what could cause this, or why it happened?

 

Cheers

syslog-20110409-164514.zip

Apr  9 14:37:34 Tower emhttp: shcmd (39): killall -HUP smbd

Apr  9 14:37:34 Tower emhttp: shcmd (40): /etc/rc.d/rc.nfsd restart | logger

Apr  9 14:38:02 Tower init: Re-reading inittab

Apr  9 16:36:42 Tower kernel: r8169: eth0: link down

Apr  9 16:36:43 Tower ifplugd(eth0)[1303]: Link beat lost.

Apr  9 16:36:44 Tower kernel: r8169: eth0: link up

Apr  9 16:36:45 Tower kernel: r8169: eth0: link down

Apr  9 16:36:49 Tower kernel: r8169: eth0: link up

Apr  9 16:36:49 Tower ifplugd(eth0)[1303]: Link beat detected.

Apr  9 16:36:49 Tower kernel: r8169: eth0: link down

Apr  9 16:36:50 Tower ifplugd(eth0)[1303]: Link beat lost.

Apr  9 16:36:53 Tower kernel: r8169: eth0: link up

Apr  9 16:36:53 Tower ifplugd(eth0)[1303]: Link beat detected.

Apr  9 16:36:54 Tower kernel: r8169: eth0: link down

Apr  9 16:36:54 Tower ifplugd(eth0)[1303]: Link beat lost.

Apr  9 16:36:57 Tower ifplugd(eth0)[1303]: Link beat detected.

Apr  9 16:36:57 Tower kernel: r8169: eth0: link up

Apr  9 16:36:58 Tower kernel: r8169: eth0: link down

Apr  9 16:36:59 Tower ifplugd(eth0)[1303]: Link beat lost.

Apr  9 16:37:02 Tower kernel: r8169: eth0: link up

Apr  9 16:37:03 Tower ifplugd(eth0)[1303]: Link beat detected.

Apr  9 16:37:03 Tower kernel: r8169: eth0: link down

Apr  9 16:37:04 Tower ifplugd(eth0)[1303]: Link beat lost.

Apr  9 16:37:06 Tower kernel: r8169: eth0: link up

Apr  9 16:37:06 Tower ifplugd(eth0)[1303]: Link beat detected.

 

Well, not that this is helpful, but the problem looks like a network issue starting at 16:36:42, where eth0 repeatedly drops and re-establishes its link. You said you were able to telnet in, so you didn't lose all connectivity to the box...

 

Any other problems at that time?  Mini-power-outage?  Trouble with NICs?
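
For anyone who wants to check their own syslog for the same flapping pattern (the excerpt above shows six link down/up pairs on eth0 in roughly 24 seconds), here is a rough Python sketch. It assumes kernel lines shaped like the r8169 ones quoted above; the names `LINK_RE`, `link_events` and `summarize` are made up for illustration, and other drivers or syslog formats may need a different pattern.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above:
#   Apr  9 16:36:42 Tower kernel: r8169: eth0: link down
LINK_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"\S+:\s+(?P<iface>\w+):\s+link\s+(?P<state>up|down)\s*$"
)


def link_events(lines):
    """Yield (timestamp, interface, state) for each kernel link-state line."""
    for line in lines:
        match = LINK_RE.match(line.strip())
        if match:
            yield match.group("stamp"), match.group("iface"), match.group("state")


def summarize(lines):
    """Count link up/down events and report the first and last timestamps."""
    events = list(link_events(lines))
    counts = Counter(state for _, _, state in events)
    return {
        "down": counts["down"],
        "up": counts["up"],
        "first": events[0][0] if events else None,
        "last": events[-1][0] if events else None,
    }


if __name__ == "__main__":
    # A few lines copied from the excerpt; in practice, pass an open
    # syslog file instead, e.g. summarize(open("syslog")).
    sample = [
        "Apr  9 14:38:02 Tower init: Re-reading inittab",
        "Apr  9 16:36:42 Tower kernel: r8169: eth0: link down",
        "Apr  9 16:36:43 Tower ifplugd(eth0)[1303]: Link beat lost.",
        "Apr  9 16:36:44 Tower kernel: r8169: eth0: link up",
        "Apr  9 16:36:45 Tower kernel: r8169: eth0: link down",
    ]
    print(summarize(sample))
```

Many down/up pairs within a few seconds, as in this log, point at the link itself (cable, switch port, NIC or driver) rather than at the disks or the file-sharing services.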



"Well, not that this is helpful, but the problem looks like a network issue at 16:36:42. You said you were able to telnet in, so you didn't lose all connectivity to the box..."

Correct. I was able to connect to it, but it was like being on a very slow dial-up connection. The HTTP site loaded but didn't render properly, and the shares were not usable; Windows Explorer kept crashing when I tried to open them.

 

 

 

"Any other problems at that time? Mini-power-outage? Trouble with NICs?"

Nah, the power hadn't gone out, and the NICs seem to be fine all round; everything is working today. Bouncing the server fixed the issue, though. Perhaps a buffer overload on the NIC in my unRAID server? I would say this kind of event would have been recorded in my syslog? The cable connection on the NIC is firm as well.

 

Cheers
