Move To New Server Hardware - No Network



Hello Unraid Folks, and Happy Friday!

 

I recently upgraded my Unraid server (version 6.11.1) from a mini-ITX Asus build (i7-4790/16GB RAM/6 SATA ports) to a proper ATX Asus build (i7-7700K/32GB RAM/4 SATA ports).  I also wanted to add another hard drive to the mix, and as you can see the new motherboard actually has fewer SATA ports, so I opted for a PCIe x4 8-port SATA controller.  In both builds (meaning I brought it with me to the new build), I also have a dual-port PCIe x1 NIC (this was for a pfSense VM I messed with a while back, and it has been routed exclusively to that VM in the IOMMU settings in Unraid).

 

On the old build, I used the onboard NIC with a static IP for Unraid, and my plan was to do the same here.  After building out this new ATX build, moving my USB thumb drive over, and booting up, I no longer have network connectivity.  Somewhere in the boot-up process of Unraid, there are some pretty healthy errors while it attempts to find the drives on the new SATA card (I assume it is finding them, FWIW), and the onboard network card actually goes dark during that same part of boot-up, never to return.  It definitely has to do with one of the two PCIe cards, because if I power down the server and pull both cards, the onboard NIC works beautifully and I can reach the web GUI like a charm.  I went into the Network settings in the GUI, saw the correct IP and MAC as expected, and even updated a DNS setting just to see if I could solidify the networking against that new onboard NIC, but no good: when I reinstall the cards, the onboard NIC goes dark (meaning no link lights) and I can no longer reach it.
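For what it's worth, here are the sorts of checks I've been running from the local console while the cards are installed (a rough sketch; eth0 and the driver names are placeholders for whatever your system uses):

```
# List every interface the kernel brought up, with link state
ip link show

# Does the onboard NIC see a carrier at all? (eth0 is a placeholder)
ethtool eth0 | grep -i 'link detected'

# Kernel messages from the NIC driver during boot
# (igb/e1000e are common Intel drivers; yours may differ)
dmesg | grep -iE 'eth[0-9]|igb|e1000e'
```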

 

From what I've read, it seems like Unraid runs a udev pass to enumerate devices, and the add-in cards are somehow preventing the onboard NIC from being recognized.  I could be a bit off base here, but that's the only lead I found that seemed relevant to what I'm experiencing.  Anyone have any ideas?  Kinda dead in the water without my Unraid server!
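If udev naming is the culprit, my understanding is that you can pin the onboard NIC's name to its MAC address with a persistent rule, something like this (a sketch; the MAC is a placeholder, and I believe Unraid keeps these rules in /boot/config/network-rules.cfg):

```
# Inspect how udev identifies the onboard NIC (eth0 is a placeholder)
udevadm info /sys/class/net/eth0

# Persistent-net rule pinning the name to the onboard MAC, so an add-in
# card can't grab eth0 during enumeration (MAC is a placeholder)
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="aa:bb:cc:dd:ee:ff", NAME="eth0"
```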

Link to comment

Take a (deep) look into the manual of your new motherboard.

Today it is quite common (especially for poorly equipped Intel CPUs) that some onboard devices share PCIe lanes with one of the slots. You can then usually select in the BIOS which device gets the lanes (and the other one is left hanging in thin air).

These restrictions are printed in the manual, although they may be a bit hard to find, so read carefully.

 

It is also quite common for CPUs with onboard graphics to "steal" lanes from the slots. I don't know about Intel (haven't used them in a decade), but with AMD it is normal that of the first x16 slot only lanes 1-4 and 9-16 are free; the integrated graphics occupies lanes 5-8.
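From Linux you can also check what width each slot really negotiated, something like this (a sketch; run as root, and the device list will differ):

```
# Compare what each device can do (LnkCap) with what it negotiated (LnkSta);
# a device stuck at "Width x0" or missing entirely points at lane sharing
lspci -vv | grep -E '^[0-9a-f]|LnkCap:|LnkSta:'
```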

 

Once again: READ CAREFULLY!

 

Link to comment

Hi MAM59,

 

Thanks for the good suggestions.  I had been thinking that PCIe contention could be an issue, but all I could find in the book was some IRQ-sharing warnings (attached below).  I screencapped a few items here that may lend some more texture to the issue (the IRQ table from the motherboard manual, two different ifconfig dumps, and an lspci from the most recent config).

Yesterday, I had the dual NIC in PCIe x16 Slot 1 and the SATA controller in PCIe x16 Slot 2, and I noticed that I couldn't see the dual NIC at all in ifconfig.  This morning I changed that configuration, putting the SATA controller in PCIe x16 Slot 1 and moving the dual NIC to PCIe x1 Slot 2 (figuring that maybe using both x16 slots was causing issues).  Things seemed to improve somewhat, since ifconfig now shows all three eth interfaces.  However, the ports are dark for both the onboard NIC and the dual NIC now.  The only other config I can think might help is moving the dual NIC to PCIe x1 Slot 1, next to the SATA controller (the SATA controller is x4, so I don't have as many options for it beyond the two x16 slots).
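In case it helps read the attachments, here's roughly how I'm pulling the driver bindings for each card (a sketch; your device names will differ):

```
# Show NICs and SATA controllers with the kernel driver bound to each
# (-nn adds vendor:device IDs, -k shows "Kernel driver in use")
lspci -nnk | grep -A3 -iE 'ethernet|sata'
```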

ifconfig_first_config.jpg

IRQ sharing table.jpg

ifconfig_second_config.jpg

lspci_second_config.jpg

Link to comment

Hi Jorge,

 

Sorry for the long wait in replying.  I was able to test an onboard port to confirm the drives were working (makes sense: the entirety of this array worked great and was recognized in the last server, so it'd be scary if all 7 drives had died during the switch).  I fell down a couple of rabbit holes (isn't that what we do here with homelabs?) and found this on "the other guy's" forum:

TrueNAS Scale with ASM1062 SATA not detecting disks | TrueNAS Community

 

It turns out this card runs four 2-port ASM1061 controllers to multiply the ports out, and I've seen warnings across the forums (and at that link too) to avoid these sorts of multipliers.  So I moved along to find an LSI card cross-flashed to IT-mode firmware (that topic has been a hell of a ride!), and I am now the proud owner of an LSI 9240-8i, flashed to 20.00.07.00 IT firmware before it even got to me.
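For anyone who wants to verify a cross-flash like this, the LSI sas2flash utility can report what's on the card (assuming it's available on your system; the 9240-8i is SAS2008-based):

```
# List every LSI controller sas2flash can see, with firmware/BIOS versions
sas2flash -listall

# Full details (FW version, NVDATA, board name) for controller 0
sas2flash -list -c 0
```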

The good: the drives all came up immediately, recognized and happy...for a minute.

 

The bad: I've now immediately started seeing I/O errors reported on some of the drives, which is again weird (this whole build is new, btw, less than a year old, and I just moved to this new motherboard and case recently).  These I/O errors seem to be severe enough that I can't get my Docker containers started either.  Kinda confusing.

Anecdotally I've seen that this is a "good" FW to be on, but I am trying to see if there is maybe a newer one.  I'll try to attach updated diagnostics after work today; I just wanted to keep the thread alive.  Thanks so much for helping!
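Before the diagnostics land, here's roughly what I'm checking from the console (a sketch; sdX is a placeholder for a suspect drive):

```
# Kernel-side view: I/O errors, resets, timeouts since boot
dmesg | grep -iE 'i/o error|reset|timeout'

# SMART attributes that usually separate a bad cable from a bad drive
smartctl -a /dev/sdX | grep -iE 'reallocated|pending|crc'
```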

Link to comment

JorgeB, you're a gorram genius!  Moved the LSI HBA to PCIe x16 Slot 2 and it's working a treat, zero issues.  (I did try to push/reseat the card while it was in Slot 1, but it was in there snug, so perhaps an issue with that slot?)  Things have been working beautifully since, but perhaps because of all the jacking around earlier, one of my data drives was marked as Disabled.  I'm rebuilding it now, currently at 31%.  Thank you again for all your help with this.  I'll head back down the road of re-adding the dual NIC to the mix once this "new build" has been vetted for a day or two (Uptime Is Law™), and I'll report back if I get stuck there.

Link to comment
2 weeks later...

Hi JorgeB, thanks again for your help.  Things have been working great since changing away from that slot (I've upgraded Unraid twice since then, and reconfigured/upgraded my array & cache drives a couple of times to maximize/add space).  I'd say the build is solid again, so thank you!

 

Two questions for you to get back to one of the purposes of my original post: 

 

- I reinstalled the dual-port NIC, putting it in PCIe x1 Slot 1.  It does start out with both ports "down", but it seems that I can set them to "port up" in the network configuration if I wish.  Eventually I'll be pulling these out of Unraid's hands and binding them to a VM for pfSense (I had done this a while back and I'd like to get back to it now that the build is steady; see the sketch after these questions), but I just want to be sure I'm not experiencing any weird behavior in that slot at this point.

 

- I tried to research this in other threads, but I'm not getting any traction: it seems that, far more than ever before, my boot times have climbed significantly, topping out at nearly 8 minutes (30 seconds on the way down, another minute to regain ping, then almost six minutes to get to the GUI, with an additional 40 seconds to finish starting the array).  The vast majority of that boot time is spent "stuck" on starting winbindd.  I can't see anything strange about my current setup, so I'm not sure what caused it to spike like that.  This obviously makes it difficult to rely on the Unraid box as a pfSense router with 8-minute boot times.  I played with the SMB settings, but that doesn't seem to have helped.  Any thoughts?
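Back on the first question, the plan for handing the dual NIC to the VM is the usual vfio binding (a sketch; the PCI addresses and the 8086:105e vendor:device ID are placeholders for my card's actual values, and I believe Unraid records the binding in /boot/config/vfio-pci.cfg):

```
# Find the dual NIC's PCI address and [vendor:device] ID
lspci -nn | grep -i ethernet

# /boot/config/vfio-pci.cfg (where I believe Unraid keeps the binding;
# both ports of the dual NIC get bound so the VM owns the whole card)
BIND=0000:04:00.0|8086:105e 0000:04:00.1|8086:105e
```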

Thank you again for your help here.

dragonvault-diagnostics-20221121-0934.zip

Link to comment

IP is set statically for Unraid on the onboard NIC.

The server does have internet access during boot (at this point).  One of my issues/frustrations with pfSense is that it is a VM, which doesn't get any love until after the array is started, so if that's the only way internet access is handled, I wouldn't be able to make that work.  I'm going to try to set up an active/passive pfSense configuration when I have more hardware to play with, but that's definitely after Christmas, so at this point I'm not using pfSense at all, and routing is being handled by my Asus router upstream.

Link to comment
7 minutes ago, Dragonwyntir said:

IP is set statically for Unraid on the onboard NIC.

The add-on NIC must have an IP address set, manually or via DHCP, or it will start port down.

 

8 minutes ago, Dragonwyntir said:

The server does have internet access during boot (at this point)

And the DNS server is not set to the pfSense VM?

 

Link to comment

Good to know about the port down - that should fix itself when I get to binding those ports to the VM.

 

It is not set to the pfSense VM, but I do have it set to a Pi-hole Docker container, which is not live at boot time (obviously), so maybe that's the hangup?  Is that why it stops at that winbindd prompt?  Should I just drop in some OpenDNS servers in the Network Settings for Unraid?
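In case I go that route, here's what I think the change looks like (a sketch; I believe Unraid persists these settings in /boot/config/network.cfg, and the addresses are OpenDNS's public resolvers):

```
# /boot/config/network.cfg (where I believe Unraid persists DNS settings);
# these resolvers exist before any container starts, so boot can't hang on them
DNS_SERVER1="208.67.222.222"   # OpenDNS primary
DNS_SERVER2="208.67.220.220"   # OpenDNS secondary

# Sanity check: resolve a name directly against OpenDNS, no Pi-hole involved
nslookup unraid.net 208.67.222.222
```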

Link to comment
17 minutes ago, Dragonwyntir said:

set to a pihole Docker container, which is not live at boot time (obviously)

Yeah that "docker" is really an error in itself. Should have never been offered.

 

As you have already noticed, you create a chicken-and-egg problem with questionable results.

 

Drop it and stay far away from it, at least if you want to use it on the Unraid host itself (it works OK for OTHER machines on the LAN).

Better to get a real Pi and put Pi-hole on it; the name is not accidental 🙂

 

 

Link to comment

@MAM59 Yeah, I get your point, totally.  The Docker implementation here isn't perfect (we use Docker extensively at my day job), but I think it's fine for the occasional container fun and light-duty stuff.  What I'd really like to see is some more direct understanding within Unraid of VMs/containers that are infrastructure-dependent (like pfSense and Pi-hole), so they can be temporarily bypassed during boot and then relied upon once they are awake.  That would be a pretty bespoke thing at the OS level, I'd imagine, but it would help deal with the chicken-and-egg issue.

 

My ideal setup would be a Raspberry Pi or NUC or mini-PC running a hypervisor with the primary infrastructure stuff, paired with VMs/containers in Unraid for active/passive failover whenever the primary infrastructure needs patching/maintenance/whatever.  I'm hoping to give that a whirl when money is more fluid after Christmas.  This is all over-engineered silliness for a home lab, but I assume I'm in good company here for that kind of fun.

Link to comment
16 minutes ago, Dragonwyntir said:

This is all over-engineered silliness for a home lab, but I assume I'm in good company here for that kind of fun.

🙂 Silliness can never be over-engineered.

Pi-hole (on a separate machine) is fine, but it does not really scale well. You run into many problems if you try to double it up to create a failover.

 

Relying on a single Pi is dangerous, but those things are amazingly stable, and if you use a Pi 400 for instance, you get a nice box, a keyboard, and a very good (silent) cooling system. And it's cheaper than buying and assembling the parts separately. Sadly, right now is not the time to buy a Pi; prices are still much too high and they are hard to get...

 

 

Link to comment

I started with Pi-hole as a Docker container on Unraid, using it only for other devices on the LAN and not for Unraid itself.  But if Unraid was not up for whatever reason (I run it 24x7, but stuff happens), the rest of my network would fall back to Cloudflare for DNS.

 

I decided to run Pi-hole on a Raspberry Pi instead, and it has been going strong for over two years.  It never goes down unless I power it down (it has a PoE HAT for power).

Link to comment
