Network issue - server sometimes can't be reached without resetting network connection


theone

In the last 24 hours my unRAID server has lost its network connection twice.

When this happens I cannot reach it from any other device (PC, phone) on the local network. The VMs running on the server (Windows and Home Assistant) are affected too: they lose both LAN and internet connectivity.

To recover, I disconnect and reconnect the LAN cable on the server.

This has happened to me in the past, but only once every few weeks or months, not twice in 24 hours.

I see nothing in the logs about the network loss itself. The only thing I see is a Local Master change after the connection is regained.

In the morning (08:48) I realised there was no network connection (Home Assistant was not working), so I reset the network cable. According to the Home Assistant logs, the connection was lost at 04:20.

When I got back from work (19:23) I again realised there was no network connection (Home Assistant was not working), so I reset the network cable again. According to the Home Assistant logs, the connection was lost at 17:48.

What could be the reason for this?

Could it be the Local Master takeover? If so, how do I turn it off on Raspberry Pi OS?

Jan 15 03:32:40 Tower root: /etc/libvirt: 88.1 MiB (92360704 bytes) trimmed on /dev/loop3
Jan 15 03:32:40 Tower root: /var/lib/docker: 6.2 GiB (6674800640 bytes) trimmed on /dev/loop2
Jan 15 03:32:40 Tower root: /mnt/virtual_machines: 173.8 GiB (186667610112 bytes) trimmed on /dev/nvme0n1p1
Jan 15 03:32:40 Tower root: /mnt/downloads: 10.8 GiB (11549663232 bytes) trimmed on /dev/sdf1
Jan 15 03:32:40 Tower root: /mnt/cache: 51.7 GiB (55552176128 bytes) trimmed on /dev/nvme1n1p1
Jan 15 03:40:01 Tower  crond[1439]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
Jan 15 04:40:01 Tower  apcupsd[6374]: apcupsd exiting, signal 15
Jan 15 04:40:01 Tower  apcupsd[6374]: apcupsd shutdown succeeded
Jan 15 04:40:03 Tower  apcupsd[23838]: apcupsd 3.14.14 (31 May 2016) slackware startup succeeded
Jan 15 04:40:03 Tower  apcupsd[23838]: NIS server startup succeeded
Jan 15 06:36:21 Tower  ntpd[1418]: no peer for too long, server running free now
Jan 15 08:48:45 Tower kernel: r8169 0000:06:00.0 eth0: Link is Down
Jan 15 08:48:45 Tower kernel: br0: port 1(eth0) entered disabled state
Jan 15 08:48:49 Tower kernel: r8169 0000:06:00.0 eth0: Link is Up - 1Gbps/Full - flow control rx/tx
Jan 15 08:48:49 Tower kernel: br0: port 1(eth0) entered blocking state
Jan 15 08:48:49 Tower kernel: br0: port 1(eth0) entered forwarding state
Jan 15 08:50:01 Tower  nmbd[10802]: [2023/01/15 08:50:01.932508,  0] ../../source3/nmbd/nmbd_incomingdgrams.c:303(process_local_master_announce)
Jan 15 08:50:01 Tower  nmbd[10802]:   process_local_master_announce: Server LOBBYPI at IP 192.168.1.107 is announcing itself as a local master browser for workgroup WORKGROUP and we think we are master. Forcing election.
Jan 15 08:50:01 Tower  nmbd[10802]: [2023/01/15 08:50:01.932782,  0] ../../source3/nmbd/nmbd_become_lmb.c:151(unbecome_local_master_success)
Jan 15 08:50:01 Tower  nmbd[10802]:   *****
Jan 15 08:50:01 Tower  nmbd[10802]:   
Jan 15 08:50:01 Tower  nmbd[10802]:   Samba name server TOWER has stopped being a local master browser for workgroup WORKGROUP on subnet 192.168.1.104
Jan 15 08:50:01 Tower  nmbd[10802]:   
Jan 15 08:50:01 Tower  nmbd[10802]:   *****
Jan 15 08:50:20 Tower  nmbd[10802]: [2023/01/15 08:50:20.321813,  0] ../../source3/nmbd/nmbd_become_lmb.c:398(become_local_master_stage2)
Jan 15 08:50:20 Tower  nmbd[10802]:   *****
Jan 15 08:50:20 Tower  nmbd[10802]:   
Jan 15 08:50:20 Tower  nmbd[10802]:   Samba name server TOWER is now a local master browser for workgroup WORKGROUP on subnet 192.168.1.104
Jan 15 08:50:20 Tower  nmbd[10802]:   
Jan 15 08:50:20 Tower  nmbd[10802]:   *****
Jan 15 08:58:29 Tower webGUI: Successful login user root from 192.168.1.218
Jan 15 19:23:30 Tower kernel: r8169 0000:06:00.0 eth0: Link is Down
Jan 15 19:23:30 Tower kernel: br0: port 1(eth0) entered disabled state
Jan 15 19:23:33 Tower kernel: r8169 0000:06:00.0 eth0: Link is Up - 1Gbps/Full - flow control rx/tx
Jan 15 19:23:33 Tower kernel: br0: port 1(eth0) entered blocking state
Jan 15 19:23:33 Tower kernel: br0: port 1(eth0) entered forwarding state
Jan 15 19:25:31 Tower  nmbd[10802]: [2023/01/15 19:25:31.349402,  0] ../../source3/nmbd/nmbd_incomingdgrams.c:303(process_local_master_announce)
Jan 15 19:25:31 Tower  nmbd[10802]:   process_local_master_announce: Server LOBBYPI at IP 192.168.1.107 is announcing itself as a local master browser for workgroup WORKGROUP and we think we are master. Forcing election.
Jan 15 19:25:31 Tower  nmbd[10802]: [2023/01/15 19:25:31.349724,  0] ../../source3/nmbd/nmbd_become_lmb.c:151(unbecome_local_master_success)
Jan 15 19:25:31 Tower  nmbd[10802]:   *****
Jan 15 19:25:31 Tower  nmbd[10802]:   
Jan 15 19:25:31 Tower  nmbd[10802]:   Samba name server TOWER has stopped being a local master browser for workgroup WORKGROUP on subnet 192.168.1.104
Jan 15 19:25:31 Tower  nmbd[10802]:   
Jan 15 19:25:31 Tower  nmbd[10802]:   *****
Jan 15 19:25:48 Tower  nmbd[10802]: [2023/01/15 19:25:48.454809,  0] ../../source3/nmbd/nmbd_become_lmb.c:398(become_local_master_stage2)
Jan 15 19:25:48 Tower  nmbd[10802]:   *****
Jan 15 19:25:48 Tower  nmbd[10802]:   
Jan 15 19:25:48 Tower  nmbd[10802]:   Samba name server TOWER is now a local master browser for workgroup WORKGROUP on subnet 192.168.1.104
Jan 15 19:25:48 Tower  nmbd[10802]:   
Jan 15 19:25:48 Tower  nmbd[10802]:   *****
Jan 15 19:36:47 Tower  ntpd[1418]: no peer for too long, server running free now
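
A minimal sketch (not part of the original post) of how a syslog excerpt like the one above could be scanned for link-state changes. The function name `find_link_events` and the regular expression are illustrative assumptions based only on the line format shown here. In this excerpt, the only link transitions are at 08:48 and 19:23, which are the times the cable was reset by hand.

```python
import re

# Matches kernel lines such as:
#   Jan 15 08:48:45 Tower kernel: r8169 0000:06:00.0 eth0: Link is Down
# The pattern is an assumption derived from the excerpt above.
LINK_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:.*?"
    r"(?P<iface>\w+): Link is (?P<state>Up|Down)"
)


def find_link_events(lines):
    """Return (timestamp, interface, state) tuples for each link change."""
    events = []
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            events.append((match["stamp"], match["iface"], match["state"]))
    return events


if __name__ == "__main__":
    sample = [
        "Jan 15 08:48:45 Tower kernel: r8169 0000:06:00.0 eth0: Link is Down",
        "Jan 15 08:48:49 Tower kernel: r8169 0000:06:00.0 eth0: Link is Up - 1Gbps/Full - flow control rx/tx",
    ]
    for stamp, iface, state in find_link_events(sample):
        print(stamp, iface, state)
```

If the kernel logs no "Link is Down" at the time the outage starts, the physical link probably stayed up during the outage, which would point away from a simple cable dropout.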

 

Edited by theone
Link to comment

The master browser does not matter. It is unimportant which machine does the job, and almost no client today uses the browser service anymore.

What I see in your listing, and what I feel could be much more problematic, is that "flow control" is turned on on your 1G (only) card. If the connected switch does not honor this feature, it could lead to port blockages in certain network situations.

Usually, if all devices run at the same speed, flow control is not needed and should be turned off.

It is only useful if your network mixes devices with different LAN speeds, where it can stop faster devices from overrunning slower ones.

But if your cabling is a bit unstable, a "you can continue" packet may get lost, so a sender that was paused does not recognise that the pause is over and that it should continue. The port is then "dead" until someone fixes it (for instance by pulling out the cable, thereby forcing a port reset).

Check your switch's stats for error packets or retransmissions. If the number is high and still counting upwards, check the cabling.

(Turning off flow control will fix it too, but if the lines are bad your network is running suboptimally, so it is better to check the stats and fix the lines if needed.)
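
If the switch offers no statistics, the server's own interface counters can still give a rough hint about bad lines. A minimal sketch, assuming a Linux host that exposes the usual `/sys/class/net/<iface>/statistics` files and an interface named `eth0` as in the log above; the function name and the selection of counters are illustrative, not something from this thread.

```python
from pathlib import Path

# Counters that tend to grow when cabling or a port is unreliable.
# Which of these files exist depends on the kernel and driver (assumption).
ERROR_COUNTERS = ("rx_errors", "tx_errors", "rx_crc_errors", "rx_dropped", "tx_dropped")


def read_error_counters(iface="eth0", root="/sys/class/net"):
    """Return a dict of the error counters that are present for iface."""
    stats_dir = Path(root) / iface / "statistics"
    counters = {}
    for name in ERROR_COUNTERS:
        path = stats_dir / name
        if path.is_file():
            counters[name] = int(path.read_text().strip())
    return counters


if __name__ == "__main__":
    for name, value in read_error_counters().items():
        print(f"{name}: {value}")
```

Running it twice a few hours apart and comparing the numbers shows whether the counters are still climbing.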

 

Link to comment
My switch is unmanaged so I can't get statistics from it.

Also, I have clients with different speeds on the LAN (100Mb and 1Gb).

Why do you say that the card is 1G only? It is a motherboard NIC that supports 10/100/1000.

How do I turn flow control ON/OFF?

 

Link to comment
1 minute ago, theone said:

My switch is unmanaged so I can't get statistics from it.

That's bad, because then you never know whether it supports flow control at all. The best it can do is forward those packets in the hope that the other side knows what to do (100Mbit devices DO NOT KNOW what to do; the feature did not exist yet in those days).

 

1 minute ago, theone said:

Also, I have different speed clients on the LAN (100Mb and 1Gb).

Yeah, I assumed this, so this could be the problem that appears randomly.

1 minute ago, theone said:

Why do you say that the card is 1G only? it is a motherboard NIC that supports 10/100/1000.

Yeah, they all do 🙂 With "only" I meant that it is not a 10, 25, 40 or 100 Gig card. The higher the speed, the more the feature is needed (and you won't get an unmanaged switch at 10 Gig and upwards).

 

1 minute ago, theone said:

How do I turn flow control ON/OFF?

 

The easiest way in unRAID is to install the "Tips and Tweaks" plugin. The topmost setting allows you to disable flow control. Give it a try and watch for a few days to see whether the lockups are gone.

If not, my guess was wrong and you can turn it on again.

It does not hurt anything.
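
For reference, outside the plugin the same setting is commonly changed with `ethtool`. A minimal sketch, assuming `ethtool` is installed, the interface is `eth0` as in the log, and the script is run as root. `pause_command` and `set_flow_control` are hypothetical helper names, and whether the r8169 driver accepts the change is not confirmed anywhere in this thread.

```python
import subprocess


def pause_command(iface="eth0", enable=False):
    """Build the ethtool argv that switches RX/TX pause (flow control) on or off."""
    state = "on" if enable else "off"
    return ["ethtool", "-A", iface, "rx", state, "tx", state]


def set_flow_control(iface="eth0", enable=False):
    """Apply the pause setting, then show the resulting pause parameters."""
    subprocess.run(pause_command(iface, enable), check=True)
    # "ethtool -a" prints the current pause parameters for the interface.
    subprocess.run(["ethtool", "-a", iface], check=True)
```

Calling `set_flow_control("eth0", enable=False)` would turn the feature off and `enable=True` would turn it back on. A change made this way does not survive a reboot.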

 

Link to comment

One more thing to try is to reboot the switch. These things contain a small computer and a fair amount of memory. (That's how they handle data coming in on a 1Gb/s stream and going out on a 100Mb/s stream.) A reboot sometimes cleans up things like memory leaks. You should also try another port on the switch. Occasionally, a port will become flaky and lock up. (I had a 'green' switch at one time that had a couple of ports that would 'go green' and never return from that state.)

Edited by Frank1940
Link to comment
