Network interface rules: duplicate MAC address assignments


Sparktime

Hi everyone,

I'm fairly new to Unraid, love it so far! Very intuitive and easy to work with.
However, I'm encountering some issues with changing the interface rules.
As you can see in the screenshot, I have a duplicated MAC address on eth3/eth4. It comes from a 10Gbit HP PCI NIC.
[Screenshot: interface rules with the duplicated MAC address on eth3/eth4]

The PCI card only has 2 SFP+ slots, but somehow Unraid sees 3 ports. Because of the duplicated MAC address, Unraid does not allow me to change anything in this field. Is there a way to resolve this duplicated MAC issue?

Attached is the diagnostics report if useful.
Server is running Unraid 6.8.2

Best regards, Spark

Sparktime-unraid-diagnostics-20211228-1801.zip

  • 2 months later...

I'll take a stab at this: what model of NIC is it? I had a similar issue with a Mellanox ConnectX-3 MCX312A that I was able to fix. Duplicate MACs are usually the result of a bad or improper firmware flash, so if you bought the card off eBay like that, the seller probably didn't understand how to flash it when upgrading the firmware :) Basically, I just flashed the static MAC and GUID again.

 

https://forums.servethehome.com/index.php?threads/mellanox-connectx-3-en-duplicate-permanent-mac-address-issue.24790/

 

I'm sure that's probably your issue, though I don't know how to flash HP cards (there should be some documentation online).


This could be the problem! When I run "lshw -class network", it reports a "MT27520 Family [ConnectX-3 Pro]" chipset, which seems quite similar to your card.
I bought it from a friend who took it out of an old company server; neither of us knows whether it has been flashed before. When I'm back home, I'll try to flash the card, if I can find out how.
Thanks for the help so far!

Replying to Sparktime's post from 9 hours earlier:

Welp, that's a hard no. I tested with a single-port card and it flashed fine, then put a dual-port card in and Unraid had a hernia. It adds a duplicate MAC every time I reboot, so I am going to file a bug report in the bug report section, as this thread hasn't gotten any replies in... almost ever :) I can replicate it, though: I tested with a dual-port SFP+ card and it does indeed have duplicates in the OS.


OK, I filed a bug report; it's not an Unraid issue. I'm still trying to track down how Unraid identifies the NICs, as I received conflicting information: that Unraid does not use the same MAC for multiple interfaces, and yet a clean install assigns 1 MAC for port 1 and 2 MACs for port 2. I'll report back if I find more info. I'm going to stand up a Debian server to see whether I get the same result, and that will determine 100% whether this is Unraid-related.

 

Edit: see this:

 

Edited by Jclendineng

Unraid uses "udev" to discover the hardware components of your system, including the installed NICs.

 

Your NIC seems to report the wrong information and issues the same MAC address twice.

See if a firmware update exists for your NIC.
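
To see what the kernel itself reports, independently of the Unraid GUI, one option is to read each interface's MAC address from sysfs and group the interfaces that share one. This is a minimal sketch, assuming a Linux host that exposes /sys/class/net; the function names are made up for illustration and are not part of Unraid or udev.

```python
from collections import defaultdict
from pathlib import Path


def read_macs(sys_net="/sys/class/net"):
    """Return {interface: mac} as reported by the kernel through sysfs."""
    root = Path(sys_net)
    if not root.is_dir():
        return {}
    macs = {}
    for iface in sorted(root.iterdir()):
        address_file = iface / "address"
        if address_file.is_file():
            macs[iface.name] = address_file.read_text().strip().lower()
    return macs


def find_duplicate_macs(macs):
    """Return {mac: [interfaces]} for every MAC used by more than one interface."""
    by_mac = defaultdict(list)
    for iface, mac in macs.items():
        by_mac[mac.lower()].append(iface)
    return {mac: sorted(names) for mac, names in by_mac.items() if len(names) > 1}


if __name__ == "__main__":
    for mac, names in find_duplicate_macs(read_macs()).items():
        print(f"{mac} is shared by: {', '.join(names)}")
```

Bonds and bridges normally share a MAC with one of their member ports, so only duplicates between physical ethN ports are interesting here.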

 


Finally had time to fiddle with the NIC again.
I upgraded the firmware to the latest version with Ich777's Mellanox Firmware Tools as suggested; sadly, Unraid still reports an extra port/MAC address.
mstflint only reports 2 MAC addresses, which should be correct. I can manually disable the port in network-rules.cfg, but this is quite a pain and shouldn't be the answer.
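
Before editing network-rules.cfg by hand, it can help to list which interface names in that file point at the same MAC. The sketch below assumes the file uses udev-style rule lines with ATTR{address}=="..." and NAME="..." keys, and that it lives at /boot/config/network-rules.cfg; both are assumptions to check against your own system. The sample rules and helper names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed rule shape (udev persistent-net style); verify against your own file.
ADDRESS_RE = re.compile(r'ATTR\{address\}=="([0-9A-Fa-f:]{17})"')
NAME_RE = re.compile(r'\bNAME="([^"]+)"')

# Hypothetical sample with placeholder MAC addresses.
SAMPLE_RULES = """\
# placeholder comment line
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:02:c9:00:00:01", KERNEL=="eth*", NAME="eth2"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:02:c9:00:00:02", KERNEL=="eth*", NAME="eth3"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:02:c9:00:00:02", KERNEL=="eth*", NAME="eth4"
"""


def parse_rules(text):
    """Return (interface_name, mac) pairs found in udev-style rule lines."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        mac = ADDRESS_RE.search(line)
        name = NAME_RE.search(line)
        if mac and name:
            pairs.append((name.group(1), mac.group(1).lower()))
    return pairs


def duplicated_macs(pairs):
    """Return {mac: [interface names]} for MACs that appear in more than one rule."""
    by_mac = defaultdict(list)
    for name, mac in pairs:
        by_mac[mac].append(name)
    return {mac: names for mac, names in by_mac.items() if len(names) > 1}


if __name__ == "__main__":
    for mac, names in duplicated_macs(parse_rules(SAMPLE_RULES)).items():
        print(f"{mac} appears in rules for: {', '.join(names)}")
```

To run it against the real file, read the file's text and pass it to parse_rules in place of SAMPLE_RULES.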

 

@Jclendineng Any luck with the Debian instance? Unfortunately I don't have a spare server to try this card in.

 

  • 1 month later...

This is definitely an issue I'd love to see an answer for, or at the least a sustainable workaround.

Alas, every time I see this issue raised, I see the response "Update your firmware".

 

I've documented my issues quite well on this forum (as linked above), and how I've tried to work around it, but it seems to be hoops that I have to jump through after each reboot.

Not a great solution for an automated headless remote server.

 

I'd also appreciate seeing the results of the Debian server test, if you got around to it.

Replying to KptnKMan's post of 5/10/2022 at 3:40 AM:

I just don't touch it and it seems to work OK. It still shows multiple dupes, but that's fine: as long as you don't try re-arranging them, it works.

Replying to Jclendineng's post from 14 hours earlier:

I wish I were happy with that. I'm basically in the same position: I've just left it alone and not touched it.

But it means I cannot set up any complex networking until it is solved. I want to set up interface failover on my servers, but I can't because of this instability.

I've documented trying to make this work in my other threads on getting to 10Gbit and the thread linked earlier here as well.

 

I just try not to think about how bummed I am about that, but stability is my priority.

Replying to KptnKMan's post of 5/14/2022 at 10:20 AM:

Was this fixed for you in 6.10.0 stable? I swear it was still broken in the betas but, lo and behold, it is correct for me now.

 

Edit: The only thing I did recently would (should) have had 0 effect on networking. I needed (for long-story reasons) to use my NVMe cache drive in another server, and I had another one to swap in. So I set the cache to "yes" so that everything would be moved to the array. I then stopped the array, disabled VMs and Docker, and started up again so I could move the system folder and the domain folder. I then replaced the cache drive, made sure it was up and running, and set the cache to "prefer" so that everything would be copied back to the cache. When that was done, I set it to "cache only" as it was before, stopped the array, enabled Docker and VMs, and started up again. I then noticed an issue with my 802.3ad LAGG bond, went to the network settings to fix it, and saw that the bond was screwed up: it had dropped my duplicate NIC. So I redid my LAGG, set up the VLANs again, and started the array. I'm not sure what any of those steps had to do with the network setup, except that the Docker and VM services were stopped when I rebooted with the new cache drive, so potentially that cleared something in the config that Docker or the VM manager was doing to create the dupe NIC.

Edited by Jclendineng

Oh shit yo, I didn't even know that 6.10 went gold!

I have some reading to do!

 

I haven't decided when to upgrade, but I may wait a little bit and figure out some stuff or upgrade 1 of my systems and see.

 

As always, stability is my top priority for my particular setup.


Yesterday I took the dive and updated my secondary unRAID to 6.10.1, then later my primary once I saw everything was working well.

With regard to the 10Gbit dual ConnectX-3 cards in particular, I did some rudimentary testing after the upgrade on both systems (I had 164 days of solid uptime on my secondary 😁) and found that it works, but (at least in my case) not perfectly. I'm going to keep an eye on it, but I'll try to explain my findings.

 

So after upgrading my secondary (My less fussy system because I hardly ever mess with it), I saw that the installed cards were all listed but they were listed as eth0 (mlx4_core), eth2 (mlx4_core) and eth3 (igc). So I went to the network.cfg and network-rules.cfg, and changed them to be eth0, eth1, eth2, and rebooted. Upon reboot I found the same issue where the cards were duplicated and had become eth0 (mlx4_core), eth1 (igc), eth2 (mlx4_core duplicate?), eth3 (mlx4_core).

 

So I thought "uh oh" and edited network.cfg and network-rules.cfg to reflect the original setup: eth0, eth2, eth3. That seemed to work again, and I fully rebooted 3 times to confirm that the configuration would persist through restarts. It seemed good, so I decided to upgrade the primary unRAID to see what's really up.

 

Upon upgrading my primary unRAID, I immediately saw that 4 cards were listed: eth0 (mlx4_core), eth1 (r8169), eth2 (mlx4_core duplicate?), eth3 (mlx4_core). So I immediately tried to duplicate the eth0+2+3 configuration of the secondary. After setting network.cfg and network-rules.cfg and rebooting, that seemed to work and no duplicates were present. They showed up as eth0 (mlx4_core), eth2 (mlx4_core), eth3 (r8169).

 

I rebooted my primary server a further 3 times, to check that the card assignments were persistent.

It looks at this moment like the configuration stuck, but I'm still hesitant to change anything. I'm planning in the coming days to reboot my primary a few more times and see if anything switches around, or goes strange.
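
Checking persistence across reboots by eye gets tedious, so the comparison could be scripted: record which name, MAC and driver each physical NIC has, save that, and diff it after the next boot. A rough sketch, assuming a Linux host with /sys/class/net; the helper names and the snapshot file are hypothetical, and this is not an Unraid feature.

```python
import json
import os
from pathlib import Path


def snapshot(sys_net="/sys/class/net"):
    """Return {interface: {"mac": ..., "driver": ...}} for device-backed NICs."""
    result = {}
    root = Path(sys_net)
    if not root.is_dir():
        return result
    for iface in sorted(root.iterdir()):
        driver_link = iface / "device" / "driver"
        address_file = iface / "address"
        if not (driver_link.exists() and address_file.is_file()):
            continue  # skip virtual interfaces such as lo, bonds and bridges
        result[iface.name] = {
            "mac": address_file.read_text().strip().lower(),
            "driver": os.path.basename(os.path.realpath(driver_link)),
        }
    return result


def save_snapshot(path, snap):
    """Write a snapshot to a JSON file."""
    Path(path).write_text(json.dumps(snap, indent=2, sort_keys=True))


def load_snapshot(path):
    """Read a snapshot previously written by save_snapshot."""
    return json.loads(Path(path).read_text())


def diff_snapshots(before, after):
    """Return human-readable differences between two snapshots, sorted by name."""
    changes = []
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old is None:
            changes.append(f"{name}: appeared as {new['driver']} {new['mac']}")
        elif new is None:
            changes.append(f"{name}: disappeared (was {old['driver']} {old['mac']})")
        elif old != new:
            changes.append(
                f"{name}: {old['driver']} {old['mac']} -> {new['driver']} {new['mac']}"
            )
    return changes
```

The idea is to call save_snapshot(path, snapshot()) before a reboot and diff_snapshots(load_snapshot(path), snapshot()) afterwards; an empty list means the assignments stuck.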

 

So far, so good though. 👍

Replying to KptnKMan's post from 2 hours earlier:

Maybe the dupes came from editing network.cfg manually; I don't know that that's recommended. You can try doing it from the GUI, which looks like it works and is persistent... I wonder what they fixed? From the reports, they didn't think it was an issue per se, so I'm assuming they didn't do anything and it was one of the many upstream driver/dependency/kernel updates. In any case, I am resting a bit easier knowing that one inconsistency is fixed and I don't have to worry about why anymore.

Replying to Jclendineng's post from 18 minutes earlier:

The duplicates are created BY THEMSELVES before any editing, and I documented the only ways I found to fix this known issue.

 

If you check my thread, editing the network.cfg and network-rules.cfg manually is the ONLY way to make it work, and the GUI gets broken from edits and refuses to save when duplicate cards are created. This is tested on separate machines independently, and others have reported this all over the forum. So, I followed my previous method and this is the result.

 

I'm saying clearly that this issue is NOT FIXED, but it seems to be tentatively stable in the GUI, at least in my current testing. I'm happy about that so far.

I'm going to try some more poking at this in the coming weeks and post it in the thread I started about this issue, so check in there if you're interested in following. 👍

Replying to KptnKMan's post from 1 hour earlier:

OK, just making sure; I didn't fully understand from your post whether you had tried the GUI post-6.10. Mine is now fixed, but maybe, as you say, it's only because I haven't tried to edit anything.

Replying to Jclendineng's post from 5 hours earlier:

Yeah, I think this happens because the GUI tries to write all the MACs back to the config file, but validation fails because there is a duplicate in the configuration.

So it's doing what it's supposed to do and just fails, and you're stuck staring at the GUI with broken duplicated MAC addresses that you didn't put there.

 

The only way to fix it is to remove the duplicate manually in the files, after which the GUI becomes usable again.
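
To make that guess concrete, here is a small sketch of the kind of uniqueness check that would refuse to save, plus the manual cleanup that gets past it. This only models the behaviour guessed at above; it is not Unraid's actual code, and the function names and data shape (a list of name/MAC pairs) are invented for illustration.

```python
def validate_unique_macs(assignments):
    """Raise ValueError if two interface names map to the same MAC address.

    `assignments` is a list of (interface_name, mac) pairs. This models the
    guessed behaviour only; it is not Unraid's real validation logic.
    """
    seen = {}
    for name, mac in assignments:
        key = mac.lower()
        if key in seen:
            raise ValueError(f"{name} and {seen[key]} share MAC {key}")
        seen[key] = name


def drop_duplicate_macs(assignments):
    """Keep only the first entry for each MAC, preserving the original order."""
    seen = set()
    kept = []
    for name, mac in assignments:
        key = mac.lower()
        if key not in seen:
            seen.add(key)
            kept.append((name, mac))
    return kept
```

Dropping the duplicate leaves the remaining names as they are, so the result can be non-contiguous, much like the eth0, eth2, eth3 layout described earlier in the thread.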

 

But again, I've tried to document as much as I can in my threads about setting up my 10Gbit networking from scratch on unRAID. I had a lot of help along the way by some smart and amazing people. If you want to know what I mean, look at my post history and have a read of those threads. 👍

Edited by KptnKMan
Replying to KptnKMan's post of 5/23/2022 at 4:53 PM:

Cool cool, I'll check it out. I'm lagging two 10Gb connections to my aggregation switch, so I definitely enjoy reading other people's experiences with this. IMO 10Gb is the sweet spot currently, as 40Gb isn't quite ready for *most* home users.

 

Back to this: I updated to the new patch and my duplicates are still gone, so it's already better than 6.8. Glad a permanent fix is coming. :) Thanks to all the Unraid devs for the hard work.
