Eth1 port (10GbE Mellanox card) no longer being detected


Go to solution Solved by MAM59,

(Using OS 6.11.1)
I have an ethernet cable from my server plugged into my router for internet access, but then an SFP+ cable running from a 10GbE Mellanox card in my server straight into my PC, for a faster direct connection between the two. I was trying to get my server to prioritize using the 10GbE connection over the slower ethernet connection and in the process tried deleting the network.cfg file. Now the Mellanox card isn't showing up in my Network Settings at all... And when I try to switch even the regular Ethernet settings back to how they were before, I lose access to the web GUI and have to delete network.cfg to start again. Kind of drawing a blank here...
 

Thanks in advance for any help.

skynet-diagnostics-20221014-0544.zip


This is a common mistake, and there are a lot of articles here that show you how to fix it.

 

In Brief:

* put both cards in Unraid into the same BRIDGE (only one config will show, don't worry) (NO BOND!!! just BRIDGE)

* connect the slow one from Unraid ONLY to your router

* connect the fast one from Unraid ONLY to your PC (and don't connect your PC to anything else)

 

Your PC will use the fast line to Unraid, and Unraid will forward any packets that are meant to leave your LAN on to the router.

That's it.

(the drawback is that if unraid is turned off or not available, your PC cannot access the internet)

(BTW: there will not be an ETH1, both cards are ETH0 then and share the same config)

 

Edited by MAM59

Hmm... Thank you for the response. That sounds like it might be a good solution to what I was trying to accomplish originally. And I tried it just now to be sure, but it appears that my 2nd issue still remains... That I'm not able to use my Mellanox connection to connect to the Unraid server in the first place (caused by whatever happened when I deleted the network.cfg file). Since then, it's stopped showing up in my Network Settings at all, no matter how I've tried to configure it. (I lose access to the Web GUI when I try to connect any way except for through my Wi-Fi and the server's default ethernet<->router connection)

This is what I'm seeing in the Web GUI.
image.thumb.png.b33ccefd0a340cf336af282d9ff29cb1.png

 

And yet it looks like it's still being detected in System Devices...

image.thumb.png.7a0a5e11c4a5115f67d694f0577552c5.png

 

And this is what I'm seeing in Windows Network Connections:

image.png.631d1c5fa48ef487ea211468c8a962fd.png

 

Any other ideas as to what might be causing this?

 

Edited by Intothisworld
  • Solution

Hmm, I am not a motherboard specialist, but what I can see from your list is that the Mellanox card is in the same group as the video card and some SATA connections.

 

This leads me to the idea that some of the PCIe lanes of those slots are shared and Mellanox does not get enough of them to start up. As you have an Intel box, you are always very low on spare lanes.

 

You have an X3, so it needs 4 PCIe lanes of its own to work.

 

Maybe you should try to swap the cards into different slots and see what happens? But of course, this also depends on the slots that your board offers.

 

Anyway, a Mellanox together with a video card (that claims all 16 lanes) is surely a bad idea...

 

(Also, some boards can reassign lanes in the BIOS. Mine, for instance, can either turn on Wi-Fi, use another NVMe slot, or offer 4 more SATA ports. Whichever you select, the other devices stop working.)
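
The lane-sharing idea can be checked with simple arithmetic. This is only a minimal sketch: the 16 lanes for the video card and 4 for the Mellanox are the figures mentioned in this post, while the 16 lanes assumed to be available to the slots is a made-up placeholder, not something read from the poster's board.

```python
# Hypothetical lane budget check. Only the 16 (video card) and 4 (Mellanox)
# figures come from the post; the available-lane count is an assumption.

def lanes_short(available: int, requested: dict[str, int]) -> int:
    """Return how many lanes are missing (0 if everything fits)."""
    return max(0, sum(requested.values()) - available)

# Assumed: the CPU exposes 16 lanes to the expansion slots.
cpu_lanes = 16
cards = {"video card": 16, "Mellanox NIC": 4}

missing = lanes_short(cpu_lanes, cards)
print(f"lanes missing: {missing}")
```

The real numbers have to come from the motherboard manual, since slots that look identical can be wired to different lane counts or share lanes with onboard devices.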

 


Update: Didn't have any luck with deleting the network.cfg and rebooting, but I looked up the specs on my motherboard, and indeed the PCIe lanes weren't enough to power all 3 devices I had plugged in. I decided to just remove my graphics card entirely, and bump my MBA and the Mellanox up a slot. Bingo, worked like a charm. Apparently, the card was only working before because my network settings had already been established when I installed it. Deleting/resetting the network.cfg seems to have revealed my PCIe limitation there.

 

I then went back and tried the solution @MAM59 mentioned to my original problem, bridging the 10GbE and regular ethernet connections. It does work pretty well, and I've been testing it the past couple days. But it still seems like there's a bit of interference between the two connections (heavy server usage affects my internet speed and vice versa). Is there any way to simply isolate the 2 connections from each other? Have one line into my PC for internet, one line out to my server, and have the bandwidth from each not affect the other at all?

 

I appreciate all your guys' help so far.

Edited by Intothisworld
10 minutes ago, Intothisworld said:

is there any way to simply isolate the 2 connections from each other? Have one line into my PC for internet, one line out to my server, and have the bandwidth from each not affect the other at all?

Sure there is, but you need a 10G switch for this. Connect your PC, Unraid, and the router (at 1G) to the switch and leave the 1G card of Unraid unconnected

(disable it in Unraid too)

Funnily enough, this is the normal setup for a network without any tricks

 

But of course, it will cost some money.

 


Huh okay, I'll have to look into that... Just to clarify, so that will result in the server being unconnected to anything except for the 10 GbE into the PC, right? And then the PC will be connected to the Switch? So basically just an inverse of the other set up you mentioned. Any internet needed for Unraid will go through the PC's internet connection.

 

And then the Unraid <-> PC file transfers and stuff will be going exclusively through the 10GbE cable. And the PC <-> internet traffic will be going exclusively through the Switch. With each type of traffic completely independent of and not affecting the other? Am I understanding that all correctly?

I want to also mention, I'm not worried particularly about the independence of the server <-> internet from the PC <-> Internet. I rarely, rarely use internet from the server itself (only downloading plugins and such).

Edited by Intothisworld

No No 🙂

The switch will treat all connections with equal priority. This theoretically slows down the 10G connections by 10% (the 1G reserved for the router out of 10G), but this does not happen in reality (or at least only for some milliseconds that you won't notice).

The switch has enough internal capacity to handle all connections at full speed, and it knows about the difference in speed between the 1G and 10G connections. The traffic from your PC to the internet won't be sent to Unraid, and local Unraid traffic will never be seen at the router.

That's why it is called a "switch": it only sends packets to the ports that it knows the target is connected to. (There were dumb "hubs" in the past, which forwarded all packets to all connected ports, but evolution has replaced them with selective switches 🙂)
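
The difference between a hub and a switch can be sketched as a toy model. This is an illustration of MAC learning in general, not the firmware of any particular switch, and the port and address names (`pc`, `unraid`, `router`, `PC`, `NAS`) are invented for the example.

```python
class LearningSwitch:
    """Toy model of a MAC-learning switch (not real switch firmware)."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src, dst):
        """Return the set of ports the frame is sent out of."""
        self.table[src] = in_port  # learn where the sender lives
        if dst in self.table:
            out = self.table[dst]
            # Known destination: forward to that one port only.
            return set() if out == in_port else {out}
        # Unknown destination: flood to every port except the ingress port,
        # which is what a hub did for every single frame.
        return self.ports - {in_port}


sw = LearningSwitch(ports=["pc", "unraid", "router"])
first = sw.receive("pc", src="PC", dst="NAS")      # NAS not learned yet: flooded
reply = sw.receive("unraid", src="NAS", dst="PC")  # PC already learned: one port
later = sw.receive("pc", src="PC", dst="NAS")      # both learned: one port
print(first, reply, later)
```

Once both addresses are learned, PC <-> Unraid frames never appear on the router port, which is the isolation being described.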

Even with a slowed-down Unraid (btw, why is it slow? That's uncommon), the packets to the router travel at full speed and are not blocked.

Of course UNRAID also can use the internet if it wants to.

As I have already said, this is the "normal" way of cabling.

 

 

On 10/18/2022 at 1:25 AM, JorgeB said:

You could also use the 10GbE NIC just for direct connection between the server and the PC for data transfer, and use separate gigabit for internet access on both.

 

Okay, yeah that's what I thought... And that's exactly what I had set up and was trying to make work (maybe I didn't explain it well). My setup is based off this YouTube video (Windows version & Mellanox) :

 

 

But yeah, for some reason my PC seems to think that the traffic from both of those network connections is linked or something. Seems very strange that the purely local 10GbE traffic would somehow interfere with the speed of the purely online traffic... Is there perhaps a PC setting or something that will tell my computer to stop associating the two?

 

These are the Network Settings I had set up...

image.thumb.png.20ff5107755ff40830ebdd1ae52a370c.png

 

I had a suspicion that it might be the Routing Table that needed adjusting, but it wouldn't let me add/edit anything, which is why I tried deleting the network.cfg originally:

image.thumb.png.b1eac4f8b6a38efac659b8ba53d49359.png
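
For reference, here is roughly how a routing table picks an interface for a destination: the most specific matching network wins, and a lower metric breaks ties. This is a simplified sketch of the general rule, not a dump of my actual table; the interface names, subnets, addresses, and metrics below are all made up for illustration.

```python
import ipaddress

def pick_route(dest, routes):
    """Longest-prefix match; ties are broken by the lowest metric."""
    addr = ipaddress.ip_address(dest)
    matches = [r for r in routes if addr in ipaddress.ip_network(r["net"])]
    return max(
        matches,
        key=lambda r: (ipaddress.ip_network(r["net"]).prefixlen, -r["metric"]),
    )

# Hypothetical table: names, subnets and metrics are invented.
routes = [
    {"net": "0.0.0.0/0",      "dev": "wifi",  "metric": 50},
    {"net": "192.168.1.0/24", "dev": "wifi",  "metric": 50},
    {"net": "192.168.1.0/24", "dev": "10gbe", "metric": 10},
]

# Same subnet on both NICs: the prefix lengths tie, so the metric decides.
print(pick_route("192.168.1.102", routes)["dev"])
# Only the default route matches an outside address.
print(pick_route("8.8.8.8", routes)["dev"])
```

If both adapters sit in the same subnet, local traffic only stays on the 10GbE link as long as that link has the better metric, so putting the direct link in its own subnet is one way to remove the ambiguity.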


And then these are my PC network settings. First one is Wi-Fi [should-be-]internet-only connection:

image.png.bc596306565b28b91892efb3ba2f290d.png

 

And this is for the Mellanox 10GbE card:

image.png.372ce62724c7322d3d83b226b33b9327.png

 

My cabling is as you described. PC has Wi-Fi USB adapter to router, and then 10GbE SFP+ cable running to server. Server has a 1 Gb ethernet cable running to router (via Powerline adapter, if that matters), and then that 10GbE SFP+ cable running into the PC.

 

Anything amiss that you're seeing in any of these screenshots? Let me know if you'd like another diagnostic report or anything. Thank you :)

Edited by Intothisworld
17 minutes ago, Intothisworld said:

But yeah, for some reason my PC seems to think that the traffic from both of those network connections is linked or something.

What do you mean by this? That is, what do you observe?

 

Also how are you accessing the server, using the 10GbE IP or did you add the server name to the hosts file?


Yep, I'm connecting to the WebGUI via that eth1 IP address (192.168.1.102), as well as mapping a network drive in Windows Explorer using that same IP address as well.

 

What I'm observing is that if I try to upload or download files at the same time that I'm doing transfers to/from the server, the speed of one (or both) will often drop off considerably. I've even had my cloud upload speed drop down to less than 100 kilobytes per second or so.

Admittedly it is usually some pretty heavy loads (both online traffic & local traffic), but it's nothing that I haven't been able to do successfully before (pre-having the server), so I'm pretty sure it's something to do with my server setup that's causing the lag. Sometimes these uploads/downloads are straight to or from the array (in which case, the lag is probably understandable to a certain degree), but other times the online traffic is to/from the C drive instead (completely unrelated to the server at all), and yet that speed correlation is still there.

8 minutes ago, Intothisworld said:

What I'm observing is that if I try to upload or download files at the same time that I'm doing transfers to/from the server, the speed of one (or both) will often drop off considerably. I've even had my cloud upload speed drop down to less than 100 kilobytes per second or so.

Understood, but no idea what could cause that, I've never seen that issue and don't remember anyone else complaining of something similar.

 

When you are doing that is everything accessing the same storage? (array or pool)


Your setup contains far too many possible points of lag and failure.

Powerline, WLAN: all of these are subject to interruptions and slowdowns.

And because you are using Google servers for DNS, even your internal 10GbE LAN can come to a full stop if a DNS request is not fulfilled due to a lost WLAN or Powerline connection.

There is nothing really safe in your LAN.
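
Whether DNS is actually the cause here is not established in this thread, but it is easy to measure. A minimal sketch that times a name lookup from the PC; `time_lookup` is a helper name invented for this example, and the loopback address is only a stand-in for whatever name or IP is really used to reach the server.

```python
import socket
import time

def time_lookup(host: str) -> float:
    """Return seconds spent resolving host; raises socket.gaierror on failure."""
    start = time.perf_counter()
    socket.getaddrinfo(host, None)
    return time.perf_counter() - start

# A literal IP address needs no DNS query, so this returns almost immediately.
print(f"{time_lookup('127.0.0.1'):.4f} s")
```

Running the same call with the server's hostname while the Wi-Fi or Powerline link is busy would show whether lookups stall. Shares mapped by IP address do not depend on DNS at all.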

 

  • 2 weeks later...

Hey, thanks for your replies. I took a few days to test some things. At the moment, I'm using the bridging setup that MAM59 originally mentioned in his first reply. To sum up, I'm running nothing but the 10GbE cable out of my PC straight to the server (no direct internet connection at all, either Wi-Fi or ethernet), and the server is the one connected to the internet via its own separate ethernet cable. So here's a good example of what I'm encountering speed-wise, and hopefully this will suggest some causes.

 

I'm using a rom manager program that automatically sorts and renames files in a designated set of folders. The rom manager program is on my C drive. And all of these designated folders are on my server (the array specifically; I don't have any pool devices set up.). I also use Mega's "MegaSync" app to auto-upload certain folders from that folder-set to cloud storage. MegaSync is also running from my C drive. Typically, when I don't have the rom manager running, MegaSync uploads at about 3 to 3.5 MB / second. But as soon as I click "go" on the rom manager, my cloud upload speed drops down to only about 100 KB / second or so...

 

Based on the bridging set up, in an ideal situation at least, I would want those two C drive programs to only send communication through the 10GbE to the server, making decisions about moving files where they need to go, but having the files themselves be moved only internally within the server and outbound to the cloud via that dedicated ethernet cable. But based on the fact that the outbound files and intra-server files seem to be tied together speed-wise, I can only guess that maybe one of those programs is running files to itself through the 10GbE cable and then sending back and out through the ethernet to the cloud?
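
One way to test that guess: any program running on the PC that reads file contents from a network share has to pull those bytes across the 10GbE link first. Below is a rough sketch that measures how fast the PC can read a file. `read_rate_mb_s` is a name made up for this example, and the demo reads a local temporary file only so that the snippet runs anywhere.

```python
import os
import tempfile
import time

def read_rate_mb_s(path: str, chunk: int = 1 << 20) -> float:
    """Read the whole file and return the average rate in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while data := f.read(chunk):
            total += len(data)
    elapsed = time.perf_counter() - start
    return total / 1e6 / max(elapsed, 1e-9)

# Demo on a 4 MiB temporary file; point it at a file on the mapped drive
# to measure the share instead.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (4 << 20))
print(f"{read_rate_mb_s(tmp.name):.1f} MB/s")
os.remove(tmp.name)
```

Comparing the rate for a file on the share with and without the rom manager running would show whether the two programs are competing for the same link or the same disks.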

 

Or on the other hand, what I'm really hoping, is that maybe there's simply a setting in Windows somewhere that is causing it to conflate how it manages local connection bandwidth vs. internet connection bandwidth, not realizing that it can treat them separately... But again, this is all pretty new to me, so take all my guesses and speculating with a grain of salt.
