NIC Bonding 802.3ad



I'm having some issues with my bonding. I have two NICs, which show as:

 

root@ATLAS:/mnt/cache/cache_only# sudo cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 38:ea:a7:a9:2c:f9
Active Aggregator Info:
	Aggregator ID: 1
	Number of ports: 1
	Actor Key: 9
	Partner Key: 3502
	Partner Mac Address: ec:08:6b:e4:f1:36

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 38:ea:a7:a9:2c:f9
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 38:ea:a7:a9:2c:f9
    port key: 9
    port priority: 255
    port number: 1
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: ec:08:6b:e4:f1:36
    oper key: 3502
    port priority: 32768
    port number: 10
    port state: 61

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:24:9b:1a:cf:69
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: churned
Partner Churn State: churned
Actor Churned Count: 2
Partner Churned Count: 2
details actor lacp pdu:
    system priority: 65535
    system mac address: 38:ea:a7:a9:2c:f9
    port key: 9
    port priority: 255
    port number: 2
    port state: 69
details partner lacp pdu:
    system priority: 65535
    system mac address: 00:00:00:00:00:00
    oper key: 1
    port priority: 255
    port number: 1
    port state: 1

However, I am only getting 1 Gbit/s total throughput from three different clients:

 

root      2312  0.7  0.0   6500  1696 pts/1    S+   10:52   0:52 iperf3 -s -B 192.168.1.10 -p 5053
root     23263  1.6  0.0   6500  1764 pts/3    S+   12:20   0:25 iperf3 -s -B 192.168.1.10 -p 5054
root     29199  2.3  0.0   6500  1596 pts/2    S+   10:40   2:58 iperf3 -s -B 192.168.1.10 -p 5202

Three separate iperf3 instances give:

Computer 1
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   390 MBytes   327 Mbits/sec   71             sender
[  4]   0.00-10.00  sec   388 MBytes   326 Mbits/sec                  receiver

Computer 2
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   198 MBytes   166 Mbits/sec  1023             sender
[  4]   0.00-10.00  sec   198 MBytes   166 Mbits/sec                  receiver

Computer 3
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec   861 MBytes   722 Mbits/sec                  sender
[  4]   0.00-10.00  sec   861 MBytes   722 Mbits/sec                  receiver

 

Any ideas? And note the retries!

11 minutes ago, bonienl said:

A single pc to pc connection will not exceed 1 Gb/s. To take advantage of the bonded channel with link aggregation, you need multiple concurrent connections to/from different PCs.

 

There are three different machines connecting to three different iperf3 processes in the test above. The link should reach 2 Gbit/s (I actually need 4 Gbit/s, but I'm testing with three clients).
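For reference, the client side of the test is one iperf3 client per machine, each targeting its own server port from the ps listing above. A minimal sketch (the commands are printed rather than executed here so the snippet is self-contained):

```shell
# One of these per client machine, all started at roughly the same time so
# the streams overlap. Each client targets its own iperf3 server port.
server=192.168.1.10
for port in 5053 5054 5202; do
  echo "iperf3 -c $server -p $port -t 10"
done
```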


I suppose address 192.168.1.10 is your unRAID server.

 

When initiating multiple iperf sessions from different sources, it is actually the switch doing the distribution of data over the aggregate channel towards the server. How is your switch set up?

 

One suspicious item to check: the eth1 partner LACP MAC address is 00:00:00:00:00:00, which means it is unassigned; it looks like the switch is not using both ports.

 

 


@bonienl Thanks for your observations - 192.168.1.10 is indeed my unRAID system.

 

The switch was set up with LAG and STP enabled; I also tested without STP. I have a TP-Link T1600G-28TS.

 

The MAC address is strange too; I hadn't noticed that. If the switch is not setting up the LAG properly, what other settings do I need to consider? AFAIK I should only have to select the two ports to be aggregated.

 


I am not familiar with this TP-link switch model. Does it show the state of the LACP channel and its members?

 

Besides setting the members of the channel, are there other configuration options on the switch, e.g. LACP mode or MII timer?

 

LACP gets precedence over STP, in other words spanning tree is not used on an aggregate port channel.

 


Thanks for the clarification on STP.

The partner MAC address is also fixed now:

details actor lacp pdu:
    system priority: 65535
    system mac address: 38:ea:a7:a9:2c:f9
    port key: 9
    port priority: 255
    port number: 2
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: ec:08:6b:e4:f1:36
    oper key: 2536
    port priority: 32768
    port number: 12
    port state: 61

 

 

 

So, within the TP-LINK:

 

[Screenshot: TP-Link LAG configuration page]

 

 

1 hour ago, bonienl said:

See also this topic about traffic balancing at unRAID side (there is no GUI support).

 

 

Hmm, I have just tried this; it seems to make no difference.

 

I also tested the opposite direction, i.e. unRAID as the iperf3 client against two separate machines, and I get the same result. It does not seem to be balancing across the interfaces!

 

[Screenshots: per-interface throughput during the test]


In addition, I have tried bonding on my workstation and tested the opposite way around; this works fine and I get the full 2 Gbit/s. So I know it's not the switch setup. Could it perhaps be that the onboard Ethernet device on the unRAID server doesn't support this somehow?


The specifications of my motherboard say dual LAN with support of teaming. What does it say for your motherboard?

 

I had a look at "Actor/Partner Churn State" which in your case is "none" for eth0 and "churned" for eth1. I believe the correct state for both interfaces should be "monitoring", perhaps you do have a hardware limitation?
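A quick way to keep an eye on just those churn lines while testing; a minimal sketch, where a saved copy of the output stands in for /proc/net/bonding/bond0 so the snippet runs anywhere (on the live system, point grep at the real file, e.g. under watch):

```shell
# Filter the per-slave churn state lines out of the bonding status.
# The here-doc is a stand-in sample copied from the output in this thread.
cat > /tmp/bond0.sample <<'EOF'
Slave Interface: eth0
Actor Churn State: none
Partner Churn State: none
Slave Interface: eth1
Actor Churn State: churned
Partner Churn State: churned
EOF
grep -E 'Slave Interface|Churn State' /tmp/bond0.sample
```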

 

1 hour ago, bonienl said:

The specifications of my motherboard say dual LAN with support of teaming. What does it say for your motherboard?

 

I had a look at "Actor/Partner Churn State" which in your case is "none" for eth0 and "churned" for eth1. I believe the correct state for both interfaces should be "monitoring", perhaps you do have a hardware limitation?

 

 

Interesting. I had a look around the web - this unRAID server is an HP NL54 - and I can see others have set up bonding on it too, so it should work OK.

In the first post I must have looked at the output too early; I now see "monitoring" in the Actor/Partner churn state:

 

root@ATLAS:~# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 00:24:9b:1a:cf:69
Active Aggregator Info:
	Aggregator ID: 2
	Number of ports: 2
	Actor Key: 9
	Partner Key: 3327
	Partner Mac Address: ec:08:6b:e4:f1:36

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:24:9b:1a:cf:69
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: monitoring
Partner Churn State: monitoring
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 00:24:9b:1a:cf:69
    port key: 9
    port priority: 255
    port number: 1
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: ec:08:6b:e4:f1:36
    oper key: 3327
    port priority: 32768
    port number: 12
    port state: 61

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 38:ea:a7:a9:2c:f9
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: monitoring
Partner Churn State: monitoring
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 00:24:9b:1a:cf:69
    port key: 9
    port priority: 255
    port number: 2
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: ec:08:6b:e4:f1:36
    oper key: 3327
    port priority: 32768
    port number: 10
    port state: 61

 

 

I have swapped the interfaces around - that is to say eth0 <> eth1 in the unRAID configuration - to test. Still the same; I do not get the full speed.

 

From a few tests I can also see that the links are not being shared:

bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
        inet 192.168.1.10  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 00:24:9b:1a:cf:69  txqueuelen 1000  (Ethernet)
        RX packets 3842122  bytes 5830623843 (5.4 GiB)
        RX errors 0  dropped 570  overruns 0  frame 0
        TX packets 1395401  bytes 101876433 (97.1 MiB)
        TX errors 0  dropped 9 overruns 0  carrier 0  collisions 0

eth0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
        ether 00:24:9b:1a:cf:69  txqueuelen 1000  (Ethernet)
        RX packets 65645  bytes 94136919 (89.7 MiB)
        RX errors 20  dropped 0  overruns 0  frame 20
        TX packets 2664714  bytes 199981043 (190.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
        ether 00:24:9b:1a:cf:69  txqueuelen 1000  (Ethernet)
        RX packets 3842029  bytes 5830605446 (5.4 GiB)
        RX errors 0  dropped 570  overruns 0  frame 0
        TX packets 464779  bytes 32597478 (31.0 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 18

It looks like eth0 is not used at all. I will try to swap them around again; it could be something to do with the eth0 interface!
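To put a number on how lopsided that is, a quick sketch using the RX byte counts copied from the ifconfig output above:

```shell
# Per-slave RX byte counts taken from the ifconfig output above.
eth0_rx=94136919
eth1_rx=5830605446
total=$((eth0_rx + eth1_rx))
echo "eth0 carried $((100 * eth0_rx / total))% of RX bytes"
echo "eth1 carried $((100 * eth1_rx / total))% of RX bytes"
```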

3 minutes ago, bonienl said:

In your settings I see Transmit Hash Policy: layer2, which makes balancing purely on MAC addresses.

 

I know you have tried layer2+3, did you change back? Also the interface needs to be set down/up to make the change effective.

 

 

OK, I just tried that and confirmed it was back on layer2 (probably because of the reboot), so I changed it and reset the interface state. What's interesting is that now, having set this:

Transmit Hash Policy: layer2+3 (2)

Actor Churn State: none
Partner Churn State: none


 

root@ATLAS:~# cat /sys/class/net/bond0/bonding/xmit_hash_policy
layer2 0

root@ATLAS:~# ifconfig bond0 down;echo 'layer2+3' >/sys/class/net/bond0/bonding/xmit_hash_policy;ifconfig bond0 up

root@ATLAS:~# cat /sys/class/net/bond0/bonding/xmit_hash_policy
layer2+3 2

root@ATLAS:~# cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2+3 (2)

...
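For what it's worth, here is a rough sketch of how layer2+3 should spread flows. This is a simplification of the formula given in the kernel bonding documentation (XOR of MAC and IP address bits, modulo the number of slaves); the client MAC and IP values below are hypothetical:

```shell
# Simplified layer2+3 hash (assumption: condensed from the bonding docs):
#   ((src_ip XOR dst_ip) AND 0xffff) XOR (src_mac XOR dst_mac), mod slaves
slaves=2
src_mac=$((0x14))        # hypothetical client NIC, last MAC byte
dst_mac=$((0xf9))        # bond MAC 38:ea:a7:a9:2c:f9, last byte
src_ip=$((0xC0A80114))   # 192.168.1.20, hypothetical client
dst_ip=$((0xC0A8010A))   # 192.168.1.10, the unRAID server
hash=$(( (((src_ip ^ dst_ip) & 0xffff) ^ (src_mac ^ dst_mac)) % slaves ))
echo "this flow maps to slave $hash"
```

Clients whose addresses hash to different values land on different slaves; with plain layer2, traffic arriving via the same router MAC can all collapse onto one slave.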

 

  • 3 weeks later...

I believe I have the same sort of issue. 

 

Only one of the NICs in the bond is being used the vast majority of the time. I too have a TP-Link managed switch; I can't remember the model off the top of my head since I'm at work.

eth0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 9000
        ether 00:22:15:aa:95:19  txqueuelen 1000  (Ethernet)
        RX packets 2504938  bytes 372587295 (355.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 160445  bytes 13622029 (12.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 9000
        ether 00:22:15:aa:95:19  txqueuelen 1000  (Ethernet)
        RX packets 158406547  bytes 150482965806 (140.1 GiB)
        RX errors 0  dropped 1234  overruns 0  frame 0
        TX packets 124156063  bytes 75207219897 (70.0 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 20  memory 0xf4800000-f4820000  
root@karmic:~# cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 00:22:15:aa:95:19
Active Aggregator Info:
	Aggregator ID: 2
	Number of ports: 1
	Actor Key: 9
	Partner Key: 1
	Partner Mac Address: 00:00:00:00:00:00

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:22:15:aa:95:19
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: churned
Partner Churn State: churned
Actor Churned Count: 1
Partner Churned Count: 1
details actor lacp pdu:
    system priority: 65535
    system mac address: 00:22:15:aa:95:19
    port key: 9
    port priority: 255
    port number: 1
    port state: 69
details partner lacp pdu:
    system priority: 65535
    system mac address: 00:00:00:00:00:00
    oper key: 1
    port priority: 255
    port number: 1
    port state: 1

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:22:15:aa:95:19
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: none
Partner Churn State: churned
Actor Churned Count: 0
Partner Churned Count: 1
details actor lacp pdu:
    system priority: 65535
    system mac address: 00:22:15:aa:95:19
    port key: 9
    port priority: 255
    port number: 2
    port state: 77
details partner lacp pdu:
    system priority: 65535
    system mac address: 00:00:00:00:00:00
    oper key: 1
    port priority: 255
    port number: 1
    port state: 1
