Loss of Network Crashes Server


I've been having some network issues lately, which I thought were fixed.  However when I came home today, I found my network down.  I power cycled my switch and that fixed it, however when I tried to access my server, I found it unresponsive.

 

After connecting the server to a monitor, I found it totally frozen and I had to force a power off to start it back up.  Once I do this it's fine, but every time my network goes down, the server freezes.

 

I have no idea why this would be, since the loss of network shouldn't cause the server to freeze.

 

I did some googling to see if anyone else had a similar problem, but had no success.


Can anyone provide some help here?  What could cause the server to crash because it lost the network?

As a disclaimer, I'm fairly new to UnRAID (though I'm fairly certain I set it up correctly), so I'd be happy to attempt to provide additional information as needed, but may need some assistance learning how to get it (links to guides are great).

  • Community Expert

Let's start by getting a diagnostic file while the system is running.  Granted, this will not be a snapshot of the moment the problem occurs, but there may be a clue in it.

 

Questions:  You say "my network goes down".  What do you mean by that statement?  Why is it going down?  What makes up your network (both devices and configuration), and what makes it different?  What happens after you restore the network?

 

You have to realize that most of us don't experience networks that 'just go down'.  Except for power outages, networks generally just work...

I would start first with investigating why your network isn't stable, perhaps your switch isn't up to par?

 

unRAID won't crash when the network connection is lost. Is it set to DHCP, and did it lose its IP address because it could no longer contact the DHCP server?

 

Have a screen and keyboard attached directly to unRAID; this can help with further examination.

 

  • Author
22 hours ago, Frank1940 said:

Let's start by getting a diagnostic file while the system is running.  Granted, this will not be a snapshot of the moment the problem occurs, but there may be a clue in it.

 

Questions:  You say "my network goes down".  What do you mean by that statement?  Why is it going down?  What makes up your network (both devices and configuration), and what makes it different?  What happens after you restore the network?

 

You have to realize that most of us don't experience networks that 'just go down'.  Except for power outages, networks generally just work...

 

19 hours ago, bonienl said:

I would start first with investigating why your network isn't stable, perhaps your switch isn't up to par?

 

unRAID won't crash when the network connection is lost. Is it set to DHCP, and did it lose its IP address because it could no longer contact the DHCP server?

 

Have a screen and keyboard attached directly to unRAID; this can help with further examination.

 

 

I've attached my diagnostics zip file.  I peeked around at a few things and saw some errors related to the Lubuntu VM I was running, perhaps that had something to do with it?

 

My network setup is:

Modem (Moto Surfboard) > PFSense SG-1000 Firewall > Netgear 5-port Gigabit Switch > [UniFi AC Pro WAP / Windows 10 Gaming Rig / UnRaid server (SkyNet)]

 

The issue the other day was fixed with a switch reboot, so I'm not sure that was related to my original issues.

Unfortunately, I was never able to pinpoint what the issue was with my network.  I've since bought the SG-1000 firewall (it's a Raspberry Pi sized firewall appliance) and the network has been relatively stable since then (aside from the switch reboot), so my thinking is that my old firewall machine had something go wrong (it was running on a used Dell OptiPlex 760, so I wouldn't be surprised).  The issue the other day was the first network problem I've had since I switched to the SG-1000.

 

The actual network problems I was having were basically this:

My firewall (running on the Dell 760 at the time) was getting an external IP address and I was able to VPN into it from my phone (cell data not wifi) and access the webgui, but not anything else on the LAN network.

My gaming rig (my main computer) was connected to the network but not getting an external connection.  As far as I could tell, it was a DNS issue because I was able to set a static IP and use Google's DNS servers to get the external connection working.

Obviously my first thought was a firewall configuration issue (I'll admit I'm not great with networking), but even after a factory reset and a fresh install of PFSense, I was unable to get it working.  I'd also been using that same computer as my firewall for a few months at this point, so it was working when I first set it up, and a fresh install should have fixed whatever configuration I messed up, but it didn't.  I even went so far as reinstalling windows 10 on my computer thinking it may have been a bug or driver issue with my machine.  However, my phone would connect to WiFi as well, but also wasn't getting an external connection, so that didn't really matter.

As I said above, once I replaced the old machine with the SG-1000, it has worked out of the box except for the one issue this week.

 

The UnRaid server is set to DHCP, with a static IP set via my firewall (which is my DHCP server).

 

I did have a monitor, keyboard, and mouse attached to the server, so that was the first thing I checked when I realized it was down.  The server was definitely frozen.  The webGUI was still displayed on the screen, but I could not move the mouse nor use the keyboard.  The shutdown button had no effect, so I was forced to hit the reset switch instead.

 

I appreciate the assistance so far.  I've since installed the CA Fix Common Problems plugin (which is awesome) and it reports no issues.  I've also removed the only virtual machine I was running, which perhaps had something to do with this (maybe because I was also running a large cloud backup at the time too?).  I'd like to have at least one of those running at some point, but I'm not sure why it was throwing any errors to begin with.  It might explain the crashes it was having, which I assumed were caused by my poor attempts at configuring things I didn't fully understand (I was using the VM to run the UniFi webgui stats page and was trying to setup Zabbix for network monitoring, but couldn't quite figure it out).

The UnRaid server is running off the hardware from my old gaming machine (2nd gen i5-2500k (no OC) with 16gb RAM and an old GPU for vga) which I've used to run multiple VMs simultaneously, so I wouldn't think it was a lack of power.

 

Hopefully that mess of information proves helpful.

skynet-diagnostics-20170414-1821.zip

  • 3 weeks later...
  • Author

This has happened a few more times, but I'm thinking it's an issue caused by my firewall.

 

Still, that doesn't really explain why the server freezes when the network goes down...

  • Community Expert

I think it is a bit more complex than you realize.  A simple network failure does not freeze the unRAID server.  (You can test this by unplugging the network cable from either the server or the switch.)   I seem to recall that there have been a few other cases where a firewall has caused an unRAID server to lock up. (I don't think these cases were the simple NAT firewalls but rather the software ones running on separate hardware.  Although one was running on unRAID as a service, and that was an issue because unRAID needed the network before the firewall was running.)  Obviously, something is happening, and I suspect that your server's network interface is locking up because of something happening on the network, which in turn causes the server to freeze while it waits on the interface to respond.  I don't believe any OS is designed to take care of a problem with an outside firewall hanging up...

 

I have had to resolve networking issues by rebooting all of the switches (I have four of these in my network) and the router on my network.  While it is not that common to have networks become flaky, it is not unheard of either.  In my case, the first time it happened, the network was very far down on the list of things I looked at....

I agree with Frank that a simple network disconnection won't cause unRAID to lock up. There is a deeper lying problem.

 

Start with the basics: disable the Docker and VM services and let unRAID start in safe mode without plugins. This gives you a 'maiden' system to test on.

Once stability is confirmed you can start adding stuff and find the possible culprit.

 

  • Author
On 5/1/2017 at 4:58 AM, Frank1940 said:

I think it is a bit more complex than you realize.  A simple network failure does not freeze the unRAID server.  (You can test this by unplugging the network cable from either the server or the switch.)   I seem to recall that there have been a few other cases where a firewall has caused an unRAID server to lock up. (I don't think these cases were the simple NAT firewalls but rather the software ones running on separate hardware.  Although one was running on unRAID as a service, and that was an issue because unRAID needed the network before the firewall was running.)  Obviously, something is happening, and I suspect that your server's network interface is locking up because of something happening on the network, which in turn causes the server to freeze while it waits on the interface to respond.  I don't believe any OS is designed to take care of a problem with an outside firewall hanging up...

 

I have had to resolve networking issues by rebooting all of the switches (I have four of these in my network) and the router on my network.  While it is not that common to have networks become flaky, it is not unheard of either.  In my case, the first time it happened, the network was very far down on the list of things I looked at....

I've lost internet on the server before, so it definitely didn't make sense that the network loss caused it to crash.  It seems far more likely that all the issues are caused by the firewall at this point, although it has been more stable lately.

 

Do you have any more information on the firewall causing the lockup?  I would be curious what they thought caused it, and maybe I could disable/change/fix that on my firewall so this stops occurring.  Since my entire network has issues when this happens, it makes far more sense the issue is with the firewall instead of the server, since the server is otherwise completely stable (aside from when I break things messing with stuff).

 

Bonienl, those are the next steps I will take if I can't fix it from the firewall end.  I still think the issue is caused by the firewall, but I might be able to change something on the server that will at least prevent it from crashing if it happens again.  I have the paid version of UnRAID, but I'm not sure if the array stays online when you lose internet; perhaps that could be part of the reason it crashes when it loses the outside connection?

  • Community Expert

Search for the  name of your firewall software on this forum first and see what you find.  You could also try Google but you will need a good short generic description of your problem to narrow the range of the results.

  • 3 weeks later...
  • Author

Alright so this seemed to have settled down, but now it's happened twice this week.

 

I've looked over my logs for my firewall, and unfortunately I'm not sure when the network went down today, but I didn't see anything that would indicate a problem.  I'm going to keep a better watch and note when I lose the network to see if I can correlate anything.  The last couple times my WAP has needed to be rebooted in addition to the switch, where before it worked automatically.  Not sure if that means anything but it's annoying.  I'm running the UniFi controller as a docker in UnRAID, but that hadn't mattered before.
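For noting when the network is lost, something along these lines could run on another machine on the LAN and record the time of every up/down change, so the timestamps can be lined up against the firewall log afterwards. This is only a rough sketch: the address and port in the usage note are placeholders, and the function names (`is_reachable`, `describe_change`, `watch`) are made up for illustration.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def describe_change(previous, reachable, stamp, target):
    """Return a log line if the state changed since the last probe, else None."""
    if reachable == previous:
        return None
    return f"{stamp} {target} is {'UP' if reachable else 'DOWN'}"


def watch(host, port, interval=10.0, rounds=None):
    """Probe host:port every `interval` seconds and print only state changes.

    rounds=None keeps probing until interrupted with Ctrl+C.
    """
    previous = None
    count = 0
    while rounds is None or count < rounds:
        reachable = is_reachable(host, port)
        stamp = datetime.now().isoformat(timespec="seconds")
        line = describe_change(previous, reachable, stamp, f"{host}:{port}")
        if line is not None:
            print(line, flush=True)
        previous = reachable
        count += 1
        time.sleep(interval)
```

Calling `watch("192.168.1.10", 80)` (a placeholder address, assuming the server's webGUI listens on port 80) and redirecting the output to a file would leave a short list of timestamps to compare against the firewall log.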

 

I'm also going to leave my dockers and VMs off for a while and see if that prevents a crash.  What I don't get is how the server can crash at the same time as the network going down, that is fixed by rebooting an unmanaged switch...

Is there something with UnRAID that relies on a live network connection, or something like that?  I've got the paid version so it shouldn't require the network, and just unplugging it from the server doesn't crash it (I'd test it again now to be sure, but it's doing the parity check), so I'm at a bit of a loss here.

 

If we look at it the other way, is there something UnRAID could be doing that causes it to crash, that then brings down the network?  My firewall is running on separate hardware, not running on UnRAID.  I'm also able to keep my phone VPNed into my network and get external internet, but not reach anything on my LAN aside from my firewall, which remains up and working totally fine when it crashes.  I don't really think it's causing it at this point, although I just cannot be sure.

 

As I've said before, I've swapped out my switch a couple times, so it's unlikely to be the issue.  My modem has worked fine for years, and my WAN connection isn't having an issue.  Whatever is happening is fixed by rebooting the switch, but the switch seems fine otherwise.  The firewall doesn't seem to suffer any ill effects, and I had no log entries up until I came home today and rebooted the switch.

 

The only thing I can think of that may have something to do with it is that I assigned my main devices static IPs in my firewall (my WAP, server, and gaming rig), and this started happening about the same time.  I also saw a firewall log entry about static IP routes

 

 

May 18 17:23:37 php-fpm /rc.linkup: The command '/usr/sbin/arp -s 'unraid server ip' 'unraid server mac'' returned exit code '1', the output was 'arp: writing to routing socket: Cannot allocate memory'

 

 

This is driving me mad and nothing makes sense....

Edited by lordbob75

Try putting a different ethernet card in the unraid server.

  • Author
1 minute ago, jonathanm said:

Try putting a different ethernet card in the unraid server.

FYI, I also added a note about a firewall log related to my server ip route in case you missed it.

 

Now, this is an interesting idea...  What would be the reasoning here?  Is it possible for a bad NIC to cause issues like this?

I think I want to let the Parity check finish first, but I have another NIC I can put in and try it.  The one I'm using currently is built into my motherboard (Asus p8p67 pro, my old gaming rig motherboard/cpu) so I can't remove it but I can disable it in BIOS.  I hope my other NIC works in this machine though, I have had bad luck with NICs lately, though most of those were with an old computer.

 

Not sure there's going to be an easy way to test it aside from waiting to see if it happens again though...

  • Author

I don't like to double post, but I just found this older thread of a guy with pretty much the same issue as me.

 

 

It doesn't look like he ever found the cause, but had some luck reducing the number of issues.  I'll have to check my BIOS for the c-state setting (honestly not sure what that is, doesn't sound familiar), and I want to try unplugging my server from the network to see if it brings the switch back up.

 

Also, could this be caused by 2 dockers on the same port, or maybe some weird broadcasting issue (like someone mentioned)?

C-State was an issue some of the early Skylakes had. I believe it was supposed to be fixed by microcode updates in the bios.

 

I have had issues similar to that thread. It has got a lot better with bios updates but has yet to go away. When it happens, the server is unresponsive yet it appears to be accessing or flooding the network in some fashion.

 

I have tailed the syslog and looked at diagnostics, and there is nothing to indicate a problem. It has all the appearance of a random hard system lockup not triggered by anything, except for the network flooding. It's rather disappointing that the most expensive hardware I have put into the server has been the least reliable. The cheap AMD hardware that cost half the price of the Skylake stuff was completely reliable. I switched from the AMD to the Skylake so I could run VMs. Otherwise, I was running the same Dockers as before, and it still happens with or without the VM running. I have changed the Dockers a bit since, but it didn't help. Actually, it seems to be more reliable if the VM is left running, but I have no hard evidence of that.

 

The only other change I made when I upgraded the CPU/motherboard/RAM was switching the cache from a spinner to a pair of SSDs.

 

I really don't want to run without the Dockers I'm using because they're important to my whole server setup. Besides, with my server operating for months between lockups, I could easily spend > a year testing combinations of dockers and VM's hoping to find one of them as the cause. 

 

In my case, the server is doing it. Unplugging or rebooting it will bring the network back up.

 

12 hours ago, lordbob75 said:

The only thing I can think of that may have something to do with it is that I assigned my main devices static IPs in my firewall (my WAP, server, and gaming rig), and this started happening about the same time.  I also saw a firewall log entry about static IP routes

 

 

May 18 17:23:37 php-fpm /rc.linkup: The command '/usr/sbin/arp -s 'unraid server ip' 'unraid server mac'' returned exit code '1', the output was 'arp: writing to routing socket: Cannot allocate memory'

 

 

This is driving me mad and nothing makes sense....

 

If your intention is to set a fixed IP address for your server then using ARP is the wrong thing to do. In general there is no need to create a fixed ARP entry, it may cause all kinds of unwanted side effects.

 

A fixed IP address can be set from the GUI, see network settings.

  • Author
On 5/19/2017 at 6:29 AM, bonienl said:

 

If your intention is to set a fixed IP address for your server then using ARP is the wrong thing to do. In general there is no need to create a fixed ARP entry, it may cause all kinds of unwanted side effects.

 

A fixed IP address can be set from the GUI, see network settings.

I set it through the DHCP server on my firewall, though I did check the box for "ARP Table Static Entry".  Is it better to not do that?

I did want to set the static IPs on the DHCP server rather than locally, although it should work fine either way.

  • 2 weeks later...
  • Author

Alright so it's been about 2 weeks since my last post.  I've been running my server with no applications (VMs/docker)  running, and this morning it had frozen and the network was down.  Unplugging the network cable from my server instantly brought the network back up.

 

I'm happy it's not my dockers causing this, but what is?!

I wish I knew. Mine appears to be a hard lock-up but I'm not convinced it really is locked-up because that shouldn't cause the network issue. If I get time, I might attempt to snoop the network traffic.
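Short of a full packet capture, a rough first look at whether the network is being flooded could come from the interface counters on another Linux machine plugged into the same switch. The sketch below reads `/proc/net/dev` twice and reports packets per second for each interface. It assumes a Linux host, and the function names are invented for illustration. It only shows traffic that actually reaches that machine (broadcasts, for example), so a quiet result would not rule out a flood.

```python
import time


def parse_net_dev(text):
    """Parse the text of /proc/net/dev into per-interface packet/byte counters."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 10:
            continue  # not a counter line we understand
        counters[name.strip()] = {
            "rx_bytes": int(fields[0]),
            "rx_packets": int(fields[1]),
            "tx_bytes": int(fields[8]),
            "tx_packets": int(fields[9]),
        }
    return counters


def packet_rates(before, after, seconds):
    """Return packets per second received and sent for each interface."""
    rates = {}
    if seconds <= 0:
        return rates
    for name, now in after.items():
        then = before.get(name)
        if then is None:
            continue  # interface appeared between the two samples
        rates[name] = {
            "rx_pps": (now["rx_packets"] - then["rx_packets"]) / seconds,
            "tx_pps": (now["tx_packets"] - then["tx_packets"]) / seconds,
        }
    return rates


def sample(interval=5.0, path="/proc/net/dev"):
    """Read the counters twice, `interval` seconds apart, and return the rates."""
    with open(path) as handle:
        before = parse_net_dev(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        after = parse_net_dev(handle.read())
    return packet_rates(before, after, interval)
```

Running `print(sample())` once while the network is healthy and again while it is down would give two sets of numbers to compare. A large jump in received packets during the outage would fit the flooding theory.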

  • Author
6 hours ago, lionelhutz said:

I wish I knew. Mine appears to be a hard lock-up but I'm not convinced it really is locked-up because that shouldn't cause the network issue. If I get time, I might attempt to snoop the network traffic.

I'd be interested in anything you find out as well.

I've no real idea what could cause this kind of issue, so I'm not sure where to go from here.

 

Is there some way to get the logs from the server crash?  Maybe have it saved to USB drive or something (assuming that doesn't crash it too)?

28 minutes ago, lordbob75 said:

Is there some way to get the logs from the server crash?  Maybe have it saved to USB drive or something (assuming that doesn't crash it too)?

Only if you have the Fix Common Problems plugin installed and set to troubleshooting mode to collect diagnostics continuously.
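Separately from the plugin, the general idea of keeping a copy of the log somewhere that survives a reset can be sketched as below. It is only an illustration, not how the plugin works: the paths in the usage note are assumptions (a syslog at `/var/log/syslog` and the flash drive mounted at `/boot`), and `mirror_new_lines` is a made-up name. Each pass appends whatever is new in the source file to the copy and forces it to disk.

```python
import os


def mirror_new_lines(src_path, dst_path, offset=0):
    """Append the bytes of src_path past `offset` to dst_path.

    Returns the new offset to pass in on the next call.
    """
    size = os.path.getsize(src_path)
    if size < offset:
        offset = 0  # source was rotated or truncated, so start over
    with open(src_path, "rb") as src:
        src.seek(offset)
        chunk = src.read()
    if chunk:
        with open(dst_path, "ab") as dst:
            dst.write(chunk)
            dst.flush()
            os.fsync(dst.fileno())  # push the data to the device before returning
    return offset + len(chunk)
```

A loop that calls `offset = mirror_new_lines("/var/log/syslog", "/boot/logs/syslog-copy.txt", offset)` every few seconds would keep the copy close to current. Frequent writes do wear a flash drive, so it is probably best left running only while chasing the problem.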

  • Author
14 minutes ago, jonathanm said:

Only if you have the Fix Common Problems plugin installed and set to troubleshooting mode to collect diagnostics continuously.

Aha, excellent.  I'll turn that on then.

 

Edit:  I didn't realize this was troubleshooting mode.  Can I safely turn this on for a week or two?  There's no way for me to predict this happening.

Edited by lordbob75

I have tried tailing the syslog and nothing appears in the log when my server locks up.

  • Author
43 minutes ago, lionelhutz said:

I have tried tailing the syslog and nothing appears in the log when my server locks up.

Dang, that's disappointing.  Seriously makes me wonder what's going on.

Hi LordBob,

 

I received your e-mail to our support box on this topic and must say I'm a little at a loss for this one myself.  Hard lockups can typically be diagnosed if we see kernel panics while tailing the log via command line with a locally attached monitor and keyboard, but if there is nothing printing out to the log when the crash occurs, then there is no smoking gun for us to track.  The main things I would try to check would be:

 

1 - BIOS updates.  I know that in the past, BIOS updates to the motherboard have resolved issues regarding stability, especially when leveraging features such as virtualization.  Check to see if there is one available for your system and give it a shot.

 

2 - Try the latest RC release of 6.4.  If you can afford the risk to try an RC-release on your hardware, I would suggest doing so.  You can gain access to the release here:  

3 - Try it on another network / location.  If it is indeed something at the network layer causing a system hangup, it'd be helpful to identify that as the cause for sure by simply moving the system to a new network and see if the same result occurs.  Even just a basic switch by itself sitting between a client machine and the server would be enough.

 

 

Searching for the history of P8P67 Pro boards, there are other cases where a lockup would bring down the network connected to it.  Here, around post #87.  Google will net other similar/same cases as well.  There are some ACPI errors in your log, but I didn't find anything related to the main issue, and they seem to be a common issue with 4.9.x kernels.

Edited by unevent

Archived

This topic is now archived and is closed to further replies.
