Server unreachable constantly


Hi, I need help digging into an ongoing problem I've been having with server outages. These outages happen a couple of times a week. The server is still on with fans running, but the Unraid dashboard is unavailable both locally and through Unraid Connect. All of my reverse proxy Docker containers are also inaccessible, and the router shows the server as offline, which indicates some kind of network outage or an outright crash of the OS.

Typically only a server restart brings it back. Unraid is able to shut down gracefully if I press the power button once.

brulu-diagnostics-20231210-1931.zip

Edited by Dextabrewa

Solved by Dextabrewa

4 hours ago, Dextabrewa said:

These outages happen a couple times a week,

Can you nail down a reason for the "couple" of times? Neighbors coming/going (i.e., devices joining and leaving)?

 

4 hours ago, Dextabrewa said:

the router shows that the server is offline

What router?... Make/Model, Yours/ISPs?

 

4 hours ago, Dextabrewa said:

Typically only a server restart can bring it back

Well... You're in REALLY big trouble when the computer doesn't start at all.

 

We/You will figure it out.

 

MrGrey.

 

  • Community Expert

Unfortunately there's nothing relevant logged, which usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.

  • Community Expert

You might want to try enabling the syslog server (probably using the Mirror to flash option) and post the file it produces after the issue occurs, so we can see if anything was logged leading up to the problem.

  • Author
8 hours ago, MrGrey said:

Can you nail down a reason for the "couple" of times? Neighbors coming/going (i.e., devices joining and leaving)?

 

What router?... Make/Model, Yours/ISPs?

 

Well... You're in REALLY big trouble when the computer doesn't start at all.

 

We/You will figure it out.

 

MrGrey.

 


Router: Ubiquiti Networks UniFi Dream Machine
ISP: Google Fiber 1gig

 

There doesn't seem to be an obvious reason for the crashes. The computer always reboots with no problem and the Unraid dashboard comes back up. I never have to hard restart either; it's always graceful.

  • Author
3 hours ago, itimpi said:

You might want to try enabling the syslog server (probably using the Mirror to flash option) and post the file it produces after the issue occurs, so we can see if anything was logged leading up to the problem.


I have a syslog file that goes back a couple months that I can also share here: 

 

syslog-192.168.1.164.log

  • Author
5 hours ago, JorgeB said:

Unfortunately there's nothing relevant logged, which usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.

 

If it was hardware related, I would expect the server to just shut down or the OS to crash. Instead, the OS seems to be running but the network goes down. I replaced the onboard motherboard networking with a PCI network card and the outage still happens.

I have had the same thing happening since an update somewhere in September/October, IIRC. Usually at these times, the dashboard shows this:

 

[Screenshot: image.png]

The dashboard becomes unresponsive, and so does everything else. I can't even power off; it doesn't respond (even to a single push of the hardware button). The terminal doesn't load either, so I can't use htop to see what is happening...

 

Edit: Over the last few months, I managed to capture diagnostics during one of these hangs, and they show nothing either.

Edit 2: It was before September, actually:

 

Edited by xoC

  • Community Expert
1 hour ago, Dextabrewa said:

 

If it was hardware related, I would expect the server to just shut down or the OS to crash. Instead, the OS seems to be running but the network goes down. I replaced the onboard motherboard networking with a PCI network card and the outage still happens.

 

The syslog seems to show the eth1 connection going up and down frequently.

 

Have you tried changing the network settings so that eth0 is not bonded with eth1 to see if that helps?

  • Community Expert
1 hour ago, Dextabrewa said:

I have a syslog file that goes back a couple months that I can also share here:

There are various call traces, and btrfs is detecting a lot of data corruption. I suggest running memtest.

  • Author
4 minutes ago, JorgeB said:

There are various call traces, and btrfs is detecting a lot of data corruption. I suggest running memtest.

I replaced the SSD with the data corruption yesterday, so those errors should be gone going forward. I had suspected that was the cause of the system crashes/freezes, but the server went offline again just hours after the drive replacement.

Edited by Dextabrewa

  • Community Expert
8 minutes ago, Dextabrewa said:

I replaced the SSD with the data corruption yesterday

Almost certainly the SSD was not the problem.

  • Author
22 minutes ago, JorgeB said:

Almost certainly the SSD was not the problem.

 

The faulty SSD showed errors when I checked its log. My theory is that they were caused by a bad set of RAM I was using months prior, which I have since replaced with a known good set. But I will run memtest again to verify.

Edited by Dextabrewa

  • Author
2 hours ago, JorgeB said:

Almost certainly the SSD was not the problem.

The log lines below seem to be the common error whenever my server is unreachable. eth0 is assigned correctly in the interface rules to the new PCI network card.

Network card: Gigabit Dual NIC with Intel 82576 Chip, 1Gb
Router: Unifi Dream Machine

 

Any suggestions?

I've included updated diagnostics that include a crash.

Dec 11 13:32:10 Brulu kernel: igb 0000:29:00.0 eth0: igb: eth0 NIC Link is Down
Dec 11 13:32:10 Brulu kernel: br0: port 1(eth0) entered disabled state
Dec 11 13:32:14 Brulu ntpd[1812]: Deleting interface #1 br0, 192.168.1.164#123, interface stats: received=12, sent=12, dropped=0, active_time=264 secs

 

brulu-diagnostics-20231211-1338.zip
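
For anyone wanting to quantify how often the link drops, here is a rough Python sketch that counts link up/down events per interface in a syslog file. It is only an illustration: the regular expression is modelled on the igb "NIC Link is Down" line quoted above, and the matching "NIC Link is Up" wording is an assumption that other NIC drivers may not share. The function name `count_link_events` is made up for this example.

```python
import re
from collections import Counter

# Modelled on the kernel line quoted in this thread:
#   "... kernel: igb 0000:29:00.0 eth0: igb: eth0 NIC Link is Down"
# Assumption: link-up events use the same "NIC Link is Up" wording.
LINK_EVENT = re.compile(r"kernel: .*?\b(eth\d+) NIC Link is (Up|Down)")


def count_link_events(lines):
    """Return a Counter keyed by (interface, state) for link up/down events."""
    counts = Counter()
    for line in lines:
        match = LINK_EVENT.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Two of the lines posted above; only the first is a link event.
sample = [
    "Dec 11 13:32:10 Brulu kernel: igb 0000:29:00.0 eth0: igb: eth0 NIC Link is Down",
    "Dec 11 13:32:10 Brulu kernel: br0: port 1(eth0) entered disabled state",
]

if __name__ == "__main__":
    for (iface, state), n in sorted(count_link_events(sample).items()):
        print(f"{iface} {state}: {n}")
```

To run it against a real log, pass an open file instead of the sample list, for example `count_link_events(open("syslog-192.168.1.164.log"))`. A high count of Down events on one interface would point at that port, cable, or switch port rather than at the OS.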

  • Author
3 hours ago, JorgeB said:

There are various call traces, and btrfs is detecting a lot of data corruption. I suggest running memtest.

Ran memtest for 1 hour with no errors. If you feel I should run it longer, I can.

  • Author
  • Solution

Swapping out the Ethernet cable seems to have fixed the issue for now. Will report back in a few days!
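
To keep an eye on the link while waiting to see whether the cable swap holds, a small Python sketch like the one below could be run from the server console. It reads the standard Linux sysfs files `operstate` and `carrier` for an interface. The function name `read_link_state` and the choice of `eth0` are assumptions for illustration, not something taken from the diagnostics.

```python
from pathlib import Path


def read_link_state(iface, sysfs_root="/sys/class/net"):
    """Return (operstate, carrier) for iface, or None if the interface is missing.

    carrier is True/False when readable, or None when the kernel refuses
    the read (which happens while the interface is administratively down).
    """
    base = Path(sysfs_root) / iface
    if not base.is_dir():
        return None
    operstate = (base / "operstate").read_text().strip()
    try:
        carrier = (base / "carrier").read_text().strip() == "1"
    except OSError:
        carrier = None
    return operstate, carrier


if __name__ == "__main__":
    # "eth0" is an assumed interface name; adjust to match your setup.
    print(read_link_state("eth0"))
```

Calling this periodically (from cron or a simple loop) and logging the result with a timestamp would give a record of link drops that does not depend on the network being reachable.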
