ryan Posted June 4, 2017 (edited)

I'm having some issues with my bonding. I have two NICs, which show as:

root@ATLAS:/mnt/cache/cache_only# sudo cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 38:ea:a7:a9:2c:f9
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 1
        Actor Key: 9
        Partner Key: 3502
        Partner Mac Address: ec:08:6b:e4:f1:36

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 38:ea:a7:a9:2c:f9
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 38:ea:a7:a9:2c:f9
    port key: 9
    port priority: 255
    port number: 1
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: ec:08:6b:e4:f1:36
    oper key: 3502
    port priority: 32768
    port number: 10
    port state: 61

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:24:9b:1a:cf:69
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: churned
Partner Churn State: churned
Actor Churned Count: 2
Partner Churned Count: 2
details actor lacp pdu:
    system priority: 65535
    system mac address: 38:ea:a7:a9:2c:f9
    port key: 9
    port priority: 255
    port number: 2
    port state: 69
details partner lacp pdu:
    system priority: 65535
    system mac address: 00:00:00:00:00:00
    oper key: 1
    port priority: 255
    port number: 1
    port state: 1

However, I am only getting about 1 Gbit/s throughput in total from three different clients:

root      2312  0.7  0.0  6500  1696 pts/1  S+  10:52  0:52 iperf3 -s -B 192.168.1.10 -p 5053
root     23263  1.6  0.0  6500  1764 pts/3  S+  12:20  0:25 iperf3 -s -B 192.168.1.10 -p 5054
root     29199  2.3  0.0  6500  1596 pts/2  S+  10:40  2:58 iperf3 -s -B 192.168.1.10 -p 5202

Three different instances of iperf3 give:

Computer 1
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   390 MBytes   327 Mbits/sec   71     sender
[  4]   0.00-10.00  sec   388 MBytes   326 Mbits/sec          receiver

Computer 2
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   198 MBytes   166 Mbits/sec   1023   sender
[  4]   0.00-10.00  sec   198 MBytes   166 Mbits/sec          receiver

Computer 3
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec   861 MBytes   722 Mbits/sec   sender
[  4]   0.00-10.00  sec   861 MBytes   722 Mbits/sec   receiver

Any ideas? Note the retries too!?

Edited June 4, 2017 by ryan
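Summing the three sender results quoted above shows what the bond is actually delivering in aggregate; a quick sanity check (the figures are taken from the iperf3 runs above):

```shell
# Aggregate of the three iperf3 sender bandwidths above, in Mbit/s.
# A healthy 2x1G LACP bond with three concurrent clients should get
# close to 2000; here the total falls well short of that.
total=$((327 + 166 + 722))
echo "$total Mbit/s total"   # prints "1215 Mbit/s total"
```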
bonienl Posted June 4, 2017

A single PC-to-PC connection will not exceed 1 Gb/s. To take advantage of the bonded channel with link aggregation, you need multiple concurrent connections to/from different PCs.
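A rough illustration of why a single connection stays on one link: with the default layer2 transmit hash policy, the kernel bonding documentation describes slave selection as roughly (source MAC XOR destination MAC) modulo the slave count, so every frame of a given MAC pair always leaves on the same NIC. A minimal sketch, folded down to the last MAC octet for simplicity; the hex values are illustrative, not taken from this thread's hardware:

```shell
# Simplified layer2 transmit-hash sketch for a 2-slave bond: the slave
# index is (src MAC XOR dst MAC) mod 2. Real hardware hashes the full
# addresses (plus packet type ID), but the last-octet fold shows the idea.
slave_for() {
    # $1, $2: last octet (hex) of the source and destination MAC
    echo $(( (0x$1 ^ 0x$2) % 2 ))
}
slave_for f9 36   # -> 1: this MAC pair always transmits on slave 1
slave_for f9 37   # -> 0: a neighbouring client MAC hashes to slave 0
```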
ryan Posted June 4, 2017 Author

11 minutes ago, bonienl said:
A single PC-to-PC connection will not exceed 1 Gb/s. To take advantage of the bonded channel with link aggregation, you need multiple concurrent connections to/from different PCs.

There are three different machines connecting to two different iperf3 processes within the above test. The link should reach 2 Gbit/s (I actually need 4, but am testing with three).
bonienl Posted June 4, 2017

I suppose address 192.168.1.10 is your unRAID server. When initiating multiple iperf sessions from different sources, it is actually the switch that distributes the data over the aggregate channel towards the server. How is your switch set up?

One suspicious setting to check: the eth1 partner LACP MAC address is 00:00:00:00:00:00, which is unassigned and suggests the switch is not using both ports.
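A quick way to spot that condition from the server side. This sketch uses a simplified inline sample so it is self-contained; against the real /proc/net/bonding/bond0 you would first isolate the "details partner lacp pdu" sections, since the actor sections also contain "system mac address" lines:

```shell
# List each slave with its (partner) MAC. An all-zeros partner MAC means
# no LACPDUs are arriving on that port, i.e. the switch is not running
# LACP towards it. Sample below is condensed from this thread's output.
sample='Slave Interface: eth0
system mac address: ec:08:6b:e4:f1:36
Slave Interface: eth1
system mac address: 00:00:00:00:00:00'
echo "$sample" | awk '/Slave Interface/ {iface=$3} /system mac address/ {print iface, $4}'
```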
ryan Posted June 4, 2017 Author

@bonienl Thanks for your observations. 192.168.1.10 is indeed my unRAID system. The switch was set up with LAG and STP enabled; I also tested without. I have a TP-LINK T1600G-28TS. The MAC address is strange too, I didn't see that. If it is the switch not setting up the LAG properly, what other settings do I need to consider? AFAIK I should only need to select the two ports to be aggregated?
bonienl Posted June 4, 2017

I am not familiar with this TP-Link switch model. Does it show the state of the LACP channel and its members? Besides setting the members of the channel, are there other configuration options on the switch, e.g. LACP mode or MII timer?

LACP takes precedence over STP; in other words, spanning tree is not used on an aggregated port channel.
ryan Posted June 4, 2017 Author

Thanks for the clarification on STP. The MAC address is also fixed:

details actor lacp pdu:
    system priority: 65535
    system mac address: 38:ea:a7:a9:2c:f9
    port key: 9
    port priority: 255
    port number: 2
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: ec:08:6b:e4:f1:36
    oper key: 2536
    port priority: 32768
    port number: 12
    port state: 61

So, within the TP-LINK:
bonienl Posted June 4, 2017

See also this topic about traffic balancing at the unRAID side (there is no GUI support).
ryan Posted June 4, 2017 Author

1 hour ago, bonienl said:
See also this topic about traffic balancing at the unRAID side (there is no GUI support).

Hmm, I have just tried this; it seems to make no difference. I see the same when testing the opposite way around, i.e. unRAID as the iperf3 client against two separate machines. I get the same results, although it does not seem to be balancing across the interfaces!
ryan Posted June 5, 2017 Author

In addition, I have tried bonding on my workstation and tested the opposite way around; this works fine, and I get the full 2 Gb/s. So I know it's not the switch setup. Could it perhaps be that the onboard Ethernet device on the unRAID box doesn't support this somehow?
bonienl Posted June 5, 2017

The specifications of my motherboard say dual LAN with support for teaming. What do they say for your motherboard?

I had a look at "Actor/Partner Churn State", which in your case is "none" for eth0 and "churned" for eth1. I believe the correct state for both interfaces should be "monitoring"; perhaps you do have a hardware limitation?
ryan Posted June 5, 2017 Author Share Posted June 5, 2017 (edited) 1 hour ago, bonienl said: The specifications of my motherboard say dual LAN with support of teaming. What does it say for your motherboard? I had a look at "Actor/Partner Churn State" which in your case is "none" for eth0 and "churned" for eth1. I believe the correct state for both interfaces should be "monitoring", perhaps you do have a hardware limitation? Interesting, i had a look around the web - this unRAID is on a HP NL54 - and i can see others have setup bonding too, so it should work OK. I think the first post i must have looked at the output too eariy, i now see monitoring in the Actor/Partner state: root@ATLAS:~# cat /proc/net/bonding/bond0 Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011) Bonding Mode: IEEE 802.3ad Dynamic link aggregation Transmit Hash Policy: layer2 (0) MII Status: up MII Polling Interval (ms): 100 Up Delay (ms): 0 Down Delay (ms): 0 802.3ad info LACP rate: slow Min links: 0 Aggregator selection policy (ad_select): stable System priority: 65535 System MAC address: 00:24:9b:1a:cf:69 Active Aggregator Info: Aggregator ID: 2 Number of ports: 2 Actor Key: 9 Partner Key: 3327 Partner Mac Address: ec:08:6b:e4:f1:36 Slave Interface: eth0 MII Status: up Speed: 1000 Mbps Duplex: full Link Failure Count: 0 Permanent HW addr: 00:24:9b:1a:cf:69 Slave queue ID: 0 Aggregator ID: 2 Actor Churn State: monitoring Partner Churn State: monitoring Actor Churned Count: 0 Partner Churned Count: 0 details actor lacp pdu: system priority: 65535 system mac address: 00:24:9b:1a:cf:69 port key: 9 port priority: 255 port number: 1 warning: this system does not seem to support IPv6 - trying IPv4 port state: 61 details partner lacp pdu: warning: this system does not seem to support IPv6 - trying IPv4 system priority: 32768 system mac address: ec:08:6b:e4:f1:36 oper key: 3327 port priority: 32768 port number: 12 port state: 61 Slave Interface: eth1 MII Status: up Speed: 1000 Mbps Duplex: full 
Link Failure Count: 0 Permanent HW addr: 38:ea:a7:a9:2c:f9 Slave queue ID: 0 Aggregator ID: 2 Actor Churn State: monitoring Partner Churn State: monitoring Actor Churned Count: 0 Partner Churned Count: 0 details actor lacp pdu: system priority: 65535 system mac address: 00:24:9b:1a:cf:69 port key: 9 port priority: 255 port number: 2 port state: 61 details partner lacp pdu: system priority: 32768 system mac address: ec:08:6b:e4:f1:36 oper key: 3327 port priority: 32768 port number: 10 port state: 61 I have swapped the interfaces around - that is to say eth0 <> eth1 in the unRAID configuration to test.. still the same, i do not get the full speed. A few tests I can also see that the links are not "shared" bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500 inet 192.168.1.10 netmask 255.255.255.0 broadcast 0.0.0.0 ether 00:24:9b:1a:cf:69 txqueuelen 1000 (Ethernet) RX packets 3842122 bytes 5830623843 (5.4 GiB) RX errors 0 dropped 570 overruns 0 frame 0 TX packets 1395401 bytes 101876433 (97.1 MiB) TX errors 0 dropped 9 overruns 0 carrier 0 collisions 0 eth0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500 ether 00:24:9b:1a:cf:69 txqueuelen 1000 (Ethernet) RX packets 65645 bytes 94136919 (89.7 MiB) RX errors 20 dropped 0 overruns 0 frame 20 TX packets 2664714 bytes 199981043 (190.7 MiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 eth1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500 ether 00:24:9b:1a:cf:69 txqueuelen 1000 (Ethernet) RX packets 3842029 bytes 5830605446 (5.4 GiB) RX errors 0 dropped 570 overruns 0 frame 0 TX packets 464779 bytes 32597478 (31.0 MiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 device interrupt 18 Looks like eth0 is not used at all.. I will try and swap around again, it could be something to do with the eth0 interface! Edited June 5, 2017 by ryan Quote Link to comment
ryan Posted June 5, 2017 Author

Not the case: after I swap the interfaces back, I get traffic on eth0 but not eth1.
bonienl Posted June 5, 2017

In your settings I see Transmit Hash Policy: layer2, which balances purely on MAC addresses. I know you have tried layer2+3; did you change back? Also, the interface needs to be set down/up to make the change effective.
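For reference, the kernel bonding documentation describes layer2+3 as folding the IP pair into the MAC-based hash, so clients that would collide on MACs alone can still spread across slaves. A rough sketch with made-up octet values (not this thread's actual addresses), again reduced to last octets for illustration:

```shell
# Simplified layer2+3 transmit-hash sketch for a 2-slave bond: the XOR
# of the IP pair is mixed into the MAC hash before taking modulo the
# slave count. Values are illustrative only.
hash_l23() {
    # $1: XOR of the MAC octets; $2, $3: last octets of src/dst IP
    echo $(( (($2 ^ $3) ^ $1) % 2 ))
}
hash_l23 0 10 21   # -> 1, e.g. server .10 talking to client .21
hash_l23 0 10 22   # -> 0: the next client IP lands on the other slave
```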
ryan Posted June 5, 2017 Author

3 minutes ago, bonienl said:
In your settings I see Transmit Hash Policy: layer2, which balances purely on MAC addresses. I know you have tried layer2+3; did you change back? Also, the interface needs to be set down/up to make the change effective.

OK, I just tried that and confirmed it was back on layer2 (probably because of the reboot), but I changed it and reset the interface state. What is interesting is that now that I have set this, I see:

Transmit Hash Policy: layer2+3 (2)
Actor Churn State: none
Partner Churn State: none

root@ATLAS:~# cat /sys/class/net/bond0/bonding/xmit_hash_policy
layer2 0
root@ATLAS:~# ifconfig bond0 down; echo 'layer2+3' > /sys/class/net/bond0/bonding/xmit_hash_policy; ifconfig bond0 up
root@ATLAS:~# cat /sys/class/net/bond0/bonding/xmit_hash_policy
layer2+3 2
root@ATLAS:~# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2+3 (2)
...
unevent Posted June 5, 2017

Have you considered just doing balance-rr (mode 0) to the switch vs 802.3ad?
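For anyone trying that suggestion, a rough sketch of the change via the sysfs bonding interface. This is a hypothetical config fragment, not something to run as-is on a live box: balance-rr needs a static (non-LACP) trunk on the switch, and the bond mode can only be changed while no slaves are enslaved.

```shell
# Hypothetical switch from 802.3ad to balance-rr (mode 0).
# Release the slaves first, change the mode, then re-add them.
ifconfig bond0 down
echo -eth0 > /sys/class/net/bond0/bonding/slaves
echo -eth1 > /sys/class/net/bond0/bonding/slaves
echo balance-rr > /sys/class/net/bond0/bonding/mode
echo +eth0 > /sys/class/net/bond0/bonding/slaves
echo +eth1 > /sys/class/net/bond0/bonding/slaves
ifconfig bond0 up
# Verify: the first lines of /proc/net/bonding/bond0 should now report
# "Bonding Mode: load balancing (round-robin)".
```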
daze Posted June 23, 2017

I believe I have the same sort of issue. Only one of the NICs in the bond is being used the vast majority of the time. I too have a TP-Link based managed switch; can't remember the model off the top of my head since I'm at work.

eth0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 9000
        ether 00:22:15:aa:95:19  txqueuelen 1000  (Ethernet)
        RX packets 2504938  bytes 372587295 (355.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 160445  bytes 13622029 (12.9 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 9000
        ether 00:22:15:aa:95:19  txqueuelen 1000  (Ethernet)
        RX packets 158406547  bytes 150482965806 (140.1 GiB)
        RX errors 0  dropped 1234  overruns 0  frame 0
        TX packets 124156063  bytes 75207219897 (70.0 GiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device interrupt 20  memory 0xf4800000-f4820000

root@karmic:~# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 00:22:15:aa:95:19
Active Aggregator Info:
        Aggregator ID: 2
        Number of ports: 1
        Actor Key: 9
        Partner Key: 1
        Partner Mac Address: 00:00:00:00:00:00

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:22:15:aa:95:19
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: churned
Partner Churn State: churned
Actor Churned Count: 1
Partner Churned Count: 1
details actor lacp pdu:
    system priority: 65535
    system mac address: 00:22:15:aa:95:19
    port key: 9
    port priority: 255
    port number: 1
    port state: 69
details partner lacp pdu:
    system priority: 65535
    system mac address: 00:00:00:00:00:00
    oper key: 1
    port priority: 255
    port number: 1
    port state: 1

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:22:15:aa:95:19
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: none
Partner Churn State: churned
Actor Churned Count: 0
Partner Churned Count: 1
details actor lacp pdu:
    system priority: 65535
    system mac address: 00:22:15:aa:95:19
    port key: 9
    port priority: 255
    port number: 2
    port state: 77
details partner lacp pdu:
    system priority: 65535
    system mac address: 00:00:00:00:00:00
    oper key: 1
    port priority: 255
    port number: 1
    port state: 1