X520-DA2 Port Flapping


Fizzyade

Hi,

 

Is anybody running an X520-DA2 successfully?  I have had to disconnect mine and revert to onboard Ethernet because the link keeps flapping.

 

I've switched SFP modules and cables and it still happens. Another device plugged into the same switch does not exhibit this issue.

 

My card appears to be one of the older ones, with a sub-device ID of 7a11, and I'm wondering if this is the issue. It has the latest firmware available for that particular model (16.5.20).

 

I have a card arriving tomorrow which is a newer one to try.

 

My log file is full of "port is up" and "port is down" messages, and if you keep refreshing the dashboard you can see the device go from connected to disconnected.
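For anyone who wants to put a number on the flapping rather than eyeball the log, here is a minimal Python sketch that counts link up/down events. The regex and the default path /var/log/syslog are assumptions, not something taken from this thread's logs, so adjust the pattern to whatever wording your own log uses.

```python
import re
from collections import Counter

# Assumed wording: kernel drivers commonly log "Link is Up" / "Link is Down".
# Change this pattern if your log says something else (e.g. "port is up").
LINK_EVENT = re.compile(r"link is (up|down)", re.IGNORECASE)


def count_link_events(lines):
    """Return a Counter with how many 'up' and 'down' events appear in lines."""
    counts = Counter()
    for line in lines:
        match = LINK_EVENT.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


def count_link_events_in_file(path="/var/log/syslog"):
    """Count link events in a log file (the default path is an assumption)."""
    with open(path, errors="replace") as fh:
        return count_link_events(fh)
```

Running count_link_events_in_file() before and after a change (new module, autoneg off, and so on) gives a rough before/after comparison.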

Link to comment
9 hours ago, Benson said:

No problems on DA1 (DA2 single port version) with Unraid.

Thanks, I have a DA1 coming in addition to a newer DA2.  
 

Can you post the output (snipped to show just the DA1 card) from this command:

 

lspci -vmn

 

(The card you will be looking for will have the VID/DID of 8086:10fb. You can run lspci -vm first to show the human-readable version, but that doesn't display the numeric values for the sub VID/DID, which is what I'm interested in knowing!)
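If you would rather not snip the output by hand, here is a rough Python sketch that pulls out just the 8086:10fb records and their sub IDs. It assumes the usual lspci -vmn layout (one "Key: value" pair per line, records separated by a blank line). The function names are made up for illustration.

```python
import re
import subprocess


def parse_lspci_vmn(text):
    """Split `lspci -vmn` output into one dict per device record."""
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        record = {}
        for index, line in enumerate(block.splitlines()):
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            # The first line of a record is the bus address. Some pciutils
            # versions label it "Device", others "Slot"; store it as "Slot"
            # so it cannot clash with the numeric device ID further down.
            if index == 0:
                key = "Slot"
            record[key] = value
        if record:
            records.append(record)
    return records


def find_cards(records, vendor="8086", device="10fb"):
    """Keep only the records matching the given vendor/device ID."""
    return [r for r in records
            if r.get("Vendor") == vendor and r.get("Device") == device]


def cards_on_this_host():
    """Run lspci locally and return the matching records."""
    out = subprocess.run(["lspci", "-vmn"], capture_output=True,
                         text=True, check=True).stdout
    return find_cards(parse_lspci_vmn(out))
```

Each returned dict then has "SVendor" and "SDevice" keys holding the sub VID/DID, alongside "Slot" for the bus address.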

Link to comment
Device: 06:00.0
Class:  0200
Vendor: 8086
Device: 10fb
SVendor:        8086
SDevice:        0006
Rev:    01

 

Please note, it is not an original Intel card.

Edited by Benson
Link to comment
Ahh cool, thank you very much!  It shows as a similar name to mine, but I suspect yours has a bigger flash part on it and most likely has newer firmware.

 

The question I probably should have asked at the start: do you happen to know/remember what brand & model the card is?

Link to comment
On 9/17/2019 at 9:41 AM, Benson said:

The brand was Unicaca (made in China). I have 3 of these cards and have tried manually updating the firmware on two of them, but I can't confirm whether this one has been updated or not.

Thanks, I've just had a 10Gtek one delivered and it's showing the same vid, did, svid, sdid as yours, will try installing it in a bit. 

Link to comment
3 hours ago, Benson said:

Weird. What switch model do you use?

This particular one is a MikroTik CRS305-1G-4S+. I have auto-negotiation enabled on other ports going to other devices without issue; it just doesn't like the X520 for some reason.

 

There's also a link between a CRS305-1G-4S+ and a CRS309-1G-8S+ which works flawlessly with auto negotiation on the link.

Link to comment
1 hour ago, Benson said:

I use a MikroTik CSS326-24G-2S+RM and a Ubnt switch and haven't had a problem.

 

I have tried ConnectX-3 and Emulex 10G NICs with different SFP+ modules, and they also work normally with Unraid.

Yeah, it's very strange.  My network consists of those 2 switches, Untangle as the router, UniFi 16 port POE, UniFi 8 port POE (60W) * 3, UniFi 8 port POE (150W).  Everything works seamlessly, the only machine having issues is the unraid one.

 

I'm using 10GTek SFP+ modules on the Unraid connection (as per everywhere else) and they're working fine everywhere, it's just this one machine that is giving me grief, and of course it has to be the machine that gets a lot of traffic (4K streaming via Emby for the household).

 

Still, it's fixed now with autoneg off, no great loss as I only want them connected at 10Gb anyway.

 

This thread might be of use to somebody else at some point!

Link to comment

I run them in both my unraid servers as well. China 10GTek ones from Amazon, haven't had any issues. Let me know if I can pull any info for you. Unifi 10Gb backbones via SFP+ (over copper).

 

Brain fart..... I run SuperMicro cards now based on Intel 82599. I USED to run the 10GTek cards....

Edited by cybrnook
Derp
Link to comment
Thanks dude.  The issue occurs with 2 different Dell X520 revision cards and the 10Gtek version. I've changed from LR SMF cable and SFP modules to SR MMF cable and modules and the problem still persisted.

 

I spoke too soon. Despite hours with no fault, it then had a spurt and suddenly hit 950 drops in the space of an hour or two.

 

So, I went through my box of networking equipment and found a genuine Intel SR MMF module, and have switched that in to see if there is any change. I'm running out of ideas otherwise.

 

Link to comment

You sure it's not your switch that's flapping? I have seen where particular settings will trigger bugs on switches.

 

Maybe one test you could do is when the flapping starts to occur, log into unraid and see if you can ping the adapter locally. If it's flapping I would expect you to see, even locally, a response then not a response etc. Another thing, are you running in a LACP bond or anything? What if you run just one of the ports, do they flap when running by themselves (both ports tested?).

 

Trying to think of ways to rule out the adapter/OS as the cause.....
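As a complement to the local ping test, one way to watch the adapter's own view of the link is to poll sysfs from the server itself. This is only a sketch, and it swaps ping for reading /sys/class/net/<iface>/operstate. The interface name eth0 and the 0.5 s interval are assumptions, and a flap shorter than the polling interval can be missed.

```python
import time
from pathlib import Path


def read_operstate(iface="eth0"):
    """Read the kernel's link state for iface (e.g. 'up' or 'down')."""
    return Path(f"/sys/class/net/{iface}/operstate").read_text().strip()


def transitions(states):
    """Yield (previous, current) each time two consecutive samples differ."""
    previous = None
    for current in states:
        if previous is not None and current != previous:
            yield previous, current
        previous = current


def poll(iface="eth0", interval=0.5):
    """Endless generator of operstate samples, one per interval."""
    while True:
        yield read_operstate(iface)
        time.sleep(interval)


def watch(iface="eth0", interval=0.5):
    """Print a timestamped line for every link state change (runs forever)."""
    for old, new in transitions(poll(iface, interval)):
        print(f"{time.strftime('%H:%M:%S')} {iface}: {old} -> {new}")
```

Calling watch("eth0") in a terminal while the flapping is happening shows whether the host itself sees the link drop, independent of anything the switch reports.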

Link to comment
Yeah, I've swapped module positions to a port which isn't flapping, and it immediately starts flapping with Unraid.

 

Since putting the original Intel module in I've had no link downs. It's too early to judge, as I thought I'd fixed it earlier today by disabling autoneg. I will check in the morning.

 

There's currently a fair amount of traffic going through it, and has been since I installed the Intel module, so it's a good test.

 

Not running LACP, don't need 20Gb of throughput! 

 

Thanks for the suggestions, always appreciated.  Somebody will always come up with something different to try.

 

Got everything crossed at the moment.

Link to comment

I just don't think it's the card, especially if you have gone through three so far. Mathematically/Statistically speaking, you are out of bounds for thinking it's the card at this point 🙂 Unless, of course, they are all using the same ix driver and you are triggering a bug somewhere (I have seen things like tcpdump trigger port flapping in LACP bonds before, but that was a Cisco switch and a Broadcom SR-IOV card). With that said, I have Intel-based NICs and never had an issue.....

 

Do you have any other 10Gb switches other than the MikroTik? What's the opposite 10Gb device you have on your network? If I look at your topology correctly, you're gigabit everywhere else except for the link between the MikroTik and your server? 10Gb won't help you if the links from your UniFi switches (which appear to be all 1Gb) are 1GbE. Unless you are using SFP1 and 2 on your 16 port PoE as a bonded pair (that needs to be set up properly in your UCK as well, and the MikroTik would also need to support it) into your MikroTik? But even then, the rest of your devices are 1GbE, so even if all your clients were saturating the bond, you would max out at 2Gb since those are not SFP+ ports.

 

Sorry, I know I am looking more at your layout than answering your question, but I am curious how you have it all hooked up....

Edited by cybrnook
Link to comment
At this point I actually think it's a compatibility issue between the X520 and the 10Gtek module for some reason. No drops since 9pm, when I found an Intel SFP module and plugged it in.  I have the generic version of the same module plugged into Aquantia NICs and they work fine.  They're all X520 cards, so they will all be using the ixgbe driver.

 

There are 4 devices using the 10Gb network and there's an uplink at 10Gb from downstairs to upstairs. The rest of the devices are all at 1Gb (i.e. UniFi switches hanging off the 10Gb switches).  The 10Gb is there to connect the various machines downstairs and upstairs which transfer large amounts of data between them.

 

There are about 55 devices in total on the network.

 

As I said, it's definitely not the switch, because I can plug Unraid into another 10Gb port on the same switch which doesn't flap and it starts flapping. I have switched modules, and a non-flapping one suddenly flaps when put into the X520. All my 10Gb modules are 10Gtek ones, until I remembered that I had some Intel ones. Luckily one of them was an MMF LC SR module, so my last shot was to try that.  So far it's hanging in there.

Link to comment

It's definitely a compatibility issue between the 10Gtek Intel modules and the X520. No link flaps since I installed the genuine Intel module.

 

I just migrated my ESXi server to Proxmox, using a Thunderbolt 3 enclosure with another 10Gtek module plugged into an X520. After it installed I was pinging hosts on the network and the link was going up and down like nobody's business, with a 60% loss rate on ping.

 

I went searching in my magic networking box, as I knew I had another Intel module (but wasn't sure what type it was). Luckily it was also an MMF SR module. I switched that in and straight away ping was behaving and the link stopped flapping.

 

I've not had any issues with the 10Gtek modules in MikroTik, UniFi or Aquantia devices, just the Intel ones.
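If you are unsure which brand of module is actually sitting in a given port, ethtool -m <iface> dumps the module's EEPROM on drivers that support it (it typically needs root). Below is a hedged Python sketch that picks the vendor fields out of that text. The "Vendor name" and "Vendor PN" labels are what I'd expect ethtool to print for an SFP+ module, so check them against your own output. The function names and the eth0 default are illustrative only.

```python
import subprocess


def parse_module_info(text):
    """Turn `ethtool -m` style 'Key : value' lines into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as an OUI
        # (which contain colons themselves) stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def module_vendor(text):
    """Return (vendor name, vendor part number), or None for missing fields."""
    info = parse_module_info(text)
    return info.get("Vendor name"), info.get("Vendor PN")


def read_module(iface="eth0"):
    """Run ethtool -m for iface and return its raw text output."""
    return subprocess.run(["ethtool", "-m", iface], capture_output=True,
                          text=True, check=True).stdout
```

module_vendor(read_module("eth0")) then gives a quick way to note down which module was in the port when the flapping did or did not occur.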

 

You live and learn.

Edited by Fizzyade
Link to comment
  • 4 months later...

I found myself in the same situation: X520 card (Intel 82599 based) on an unRAID server and a MikroTik CRS305-1G-4S+IN switch, with 10Gtek transceivers at both ends.  The ping was solid, but the link was going down very briefly every few seconds.  For some reason the MikroTik switch came with RouterOS enabled.  Since I did not need the router functionality, I thought I'd reboot into the switch OS.  Just like that, the flapping on the X520 link is gone.  I just checked, and the link did not go down once overnight.  Maybe just a coincidence, not sure, and as always YMMV, but I thought I'd mention it.  The transceivers are 10Gtek AXS85-192-M3, and the PCIe card is also from 10Gtek, model X520-10G-1S-X8.

 

Link to comment
  • 2 months later...

Did you ever resolve this, Fizzyade? I've just installed 2 x X520-DA1 along with 4 x Intel FTLX8571D3BCV SFP+ modules and the MikroTik 4-port 10Gb switch, with 20 m and 1 m cables from Switch SFP Ltd. The issue I'm having is that the link keeps dropping on the server side. I've attached a screenshot.

Screenshot_20200330-101542_Samsung Internet.jpg

Link to comment
