Adam64 Posted January 24, 2016

Hi All,

I'm having a strange problem and need some guidance on how to troubleshoot it. My unRAID box periodically loses its network connection. Sometimes it happens within a couple of days; sometimes a week or two goes by first. The symptom is that nothing can connect to unRAID: file access, telnet, the GUI, and RDP to VMs all stop working. To fix it, all I have to do is disconnect the Ethernet cable and reconnect it. There's nothing in the log. I've replaced the Ethernet switch.

I'm using the built-in Ethernet port on the motherboard:

00:19.0 Ethernet controller: Intel Corporation Ethernet Connection (2) I218-V

My thought is to add a PCIe Ethernet card to see if that changes things. Any other thoughts?

Edit: I also changed from DHCP to a static IP address to see if that changed anything.
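Because nothing shows up in the log, it can help to record exactly when the server becomes unreachable, so the drops can later be lined up with other events. The following is only a minimal sketch, not something from the thread: it assumes Python 3 on another machine on the same LAN and a Linux-style `ping` (the `-c` and `-W` flags). The function names `ping_once` and `watch` and the address in the usage note are hypothetical.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2):
    """Return True if a single ICMP echo to host succeeds (Linux ping flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(probe, interval_s=30, max_checks=None, sleep=time.sleep, log=print):
    """Call probe() repeatedly and log a timestamp only when reachability changes.

    Returns the list of observed states (True = up, False = down), one per change.
    max_checks=None means run until interrupted.
    """
    last = None
    checks = 0
    transitions = []
    while max_checks is None or checks < max_checks:
        up = probe()
        if up != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            log(f"{stamp} network {'UP' if up else 'DOWN'}")
            transitions.append(up)
            last = up
        checks += 1
        sleep(interval_s)
    return transitions
```

For example, `watch(lambda: ping_once("192.168.1.10"))` would poll a server at that (made-up) address every 30 seconds until interrupted. Logging only the state changes keeps the output short enough to compare against the times when the cable had to be reseated.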
jonp Posted January 24, 2016

You'll want to attach a monitor and keyboard to the system when the network drops, then log in, type diagnostics at the command prompt, and press Enter. Then type powerdown and press Enter. Then plug your flash device into your PC, grab the diagnostics zip file from the root of the flash device, and upload it here.
Adam64 Posted January 25, 2016

Thanks Jon. Can you teach me to fish? I looked through the various files and didn't see anything that stood out as an issue (I'm no Linux guru, but I'm reasonably technical). What would you look for?
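One possible starting point for "what to look for" is to filter the syslog from the diagnostics zip for lines that mention link state changes. This is a rough sketch under stated assumptions, not a method anyone in the thread described: the patterns below are kernel message fragments commonly seen for link flaps and for hangs on Intel NICs using the e1000e driver, and they may not match every driver or kernel version. The name `find_link_events` is hypothetical.

```python
import re

# Assumed patterns (not taken from the thread): message fragments that
# commonly accompany link flaps or NIC hangs in a Linux kernel log.
LINK_PATTERNS = re.compile(
    r"NIC Link is (Up|Down)"
    r"|Detected Hardware Unit Hang"
    r"|entered (disabled|forwarding) state",
    re.IGNORECASE,
)


def find_link_events(lines):
    """Return (line_number, text) pairs for lines that look like link events."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if LINK_PATTERNS.search(line)
    ]
```

The function accepts any iterable of lines, so an open file handle for the extracted syslog can be passed in directly. If the filter finds nothing around the time of an outage, that is still useful: it suggests the kernel never saw the link go down, which points away from the cable and switch.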
Frank1940 Posted January 25, 2016

Attach the entire file to your next reply and other folks will do the analysis for you.
Adam64 Posted January 25, 2016

The latest symptom is that my two Windows VMs intermittently lose the ability to connect to the internet and have to be rebooted to get it to work again.

Diagnostics attached: unraid-diagnostics-20160125-0622.zip
ptmuldoon Posted January 25, 2016

Have you tried the basic troubleshooting steps, like replacing the Ethernet cable? Or redoing the cable ends with new plugs and a crimping tool on both ends of the cable? And also trying a different port on any router or switch?
Adam64 Posted January 25, 2016

Yes to both. And a different switch.

Edit: I also turned off spanning tree, which can sometimes cause problems.
ptmuldoon Posted January 25, 2016

Sounds like you've done most of what you can short of trying a PCIe Ethernet card. How old is the motherboard? Do you think you had any strange power outages or lightning strikes that could have damaged the motherboard?
Adam64 Posted January 25, 2016

The motherboard was purchased in the last 2 months, so it's pretty new. I have a PCIe Ethernet card on order from Amazon that'll be here Wednesday, so I am planning on trying that. No lightning strikes (I live in sunny Southern California).
Frank1940 Posted January 25, 2016

Regarding the latest symptom (the two Windows VMs intermittently losing the ability to connect to the internet): I would reboot EVERYTHING that is involved in the network (switches, router, modem, DVD players, TVs, etc.). All of these devices are computers, many of them running Linux, and they can become corrupted after being up for a long period of time. Often a simple reboot will fix a gremlin.
Adam64 Posted January 25, 2016

Agreed, and I've done that. Everything's been rebooted.
Frank1940 Posted January 25, 2016

In that case, let me tell you that I had a similar situation a few months ago. The only difference is that I wasn't losing the unRAID servers but Win7 computers and media boxes. Since I had multiple devices with issues, I deduced that it had to be a network problem. I started rebooting individual network devices until I found the one that fixed the problem. It turned out I had a 'bad' switch. I think the issue was that this was a 'green' switch that powered down inactive ports and also limited the power it used on ports that seemed to be 'closer' to the switch. Apparently, one of these two features was causing the switch not to power up the connection after service was requested. Replacing the switch fixed the problem. (Since it was a 16-port gigabit switch, it was not a cheap repair. But I had reached the point where I had adopted the auto mechanic's method of replacing stuff until it was fixed! Luckily, the switch was the first thing that I tried.)
Adam64 Posted January 25, 2016

Yeah, thanks for that. I actually thought I had the same problem here: when it first happened, I replaced my switch and everything started working again. At that point I didn't know that just unplugging and re-plugging the cable fixed it.
Adam64 Posted January 25, 2016

Quick update: I had a PCIe Ethernet card sitting around (Intel, desktop class, so I probably won't use it long-term) and put it in there. We'll see how it goes.

One thing I have noticed already, though, is that SMB access from my Win10 box to unRAID seems to be working flawlessly. This has always been intermittently problematic since I first started using unRAID, and I assumed it was the SMB2/3 thing. Could be a placebo effect; I don't know yet.
Adam64 Posted January 25, 2016

Well, sadly, the new NIC didn't fix the problem with the VMs intermittently losing access to the internet. I wasn't sure it would help, since I don't see why a hardware NIC would be involved in a virtual bridge, but I was hoping.
Adam64 Posted January 26, 2016

I changed the virtual switch name to br1 and the VMs seem to be working fine now. So, for those who may be following this, I'll say that I think the Intel I218-V NIC on my motherboard is either bad or has a driver issue.

To summarize:

Seemingly solved by replacing the Ethernet card:
- unreliable SMB connections
- lost connections to the server, temporarily resolved by disconnecting and reconnecting the cable

Seemingly solved by renaming the virtual switch from br0 to br1:
- Windows VMs intermittently lose network access