swamiforlife Posted February 22, 2018
I have a new Unraid server, set up fresh with 6.4.1. I have 2 network interfaces: one is a Mellanox ConnectX-2 card and the other is an onboard Intel NIC. Both interfaces are set up with static IPs, with NO bonding and NO bridging. The problem is that every once in a while both network cards stop responding and I can't even ping them. I have to take the cables out and plug them back in before I can access the server again. What could be the cause?
JorgeB Posted February 22, 2018
Please post your diagnostics after this happens and before rebooting: Tools -> Diagnostics
swamiforlife (Author) Posted February 22, 2018
I have uploaded the diagnostics file. This happened mainly on the 21st and 22nd of Feb; the errors before that are probably from me setting up the server and playing with it.
tower-diagnostics-20180222-1851.zip
JorgeB Posted February 22, 2018
Not saying this is related, but your server is having memory errors. You need to sort this out first:
Feb 20 08:15:46 Tower kernel: mce: [Hardware Error]: Machine check events logged
Feb 20 08:15:46 Tower kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
Feb 20 08:15:46 Tower kernel: EDAC sbridge MC1: CPU 8: Machine Check Event: 0 Bank 5: 8c00004000010091
Feb 20 08:15:46 Tower kernel: EDAC sbridge MC1: TSC 28946f898b967
Feb 20 08:15:46 Tower kernel: EDAC sbridge MC1: ADDR 19f535ba40
Feb 20 08:15:46 Tower kernel: EDAC sbridge MC1: MISC 2042268686
Feb 20 08:15:46 Tower kernel: EDAC sbridge MC1: PROCESSOR 0:206d7 TIME 1519094746 SOCKET 1 APIC 20
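For anyone reading along who wants to see what that "Bank 5" status word says, here is a minimal Python sketch that pulls out the architectural flag bits. It assumes the general IA32_MCi_STATUS bit layout described in Intel's Software Developer's Manual; the function names decode_mce_status and mce_time_utc are made up for this illustration, and the model-specific fields are not decoded. The corrected-error count field is only meaningful on CPUs that report it, so treat that value as a hint rather than a fact.

```python
from datetime import datetime, timezone

# Architectural flag bits of IA32_MCi_STATUS (assumed layout, per Intel's SDM).
FLAG_BITS = {
    "VAL": 63,    # register holds a valid error record
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was NOT corrected by hardware
    "EN": 60,     # reporting for this error was enabled
    "MISCV": 59,  # the MISC value is valid
    "ADDRV": 58,  # the ADDR value is valid
    "PCC": 57,    # processor context may be corrupt
}


def decode_mce_status(status_hex):
    """Decode the generic fields of a machine-check status word (hex string)."""
    status = int(status_hex, 16)
    flags = {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}
    return {
        "flags": flags,
        # A valid record without UC set means the hardware corrected the error.
        "corrected": flags["VAL"] and not flags["UC"],
        # Bits 52:38, only meaningful if the CPU reports corrected-error counts.
        "corrected_count": (status >> 38) & 0x7FFF,
        # Bits 15:0, the generic MCA error code.
        "mca_error_code": status & 0xFFFF,
    }


def mce_time_utc(epoch_seconds):
    """Convert the TIME field of the log record to a UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


# Values copied from the log lines in the post above.
info = decode_mce_status("8c00004000010091")
print(info["corrected"], hex(info["mca_error_code"]))
print(mce_time_utc(1519094746).isoformat())
```

Under those assumptions the status word decodes as a valid, corrected error with valid ADDR and MISC values, which fits the advice to track down the DIMM rather than treat it as an immediate crash cause.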
swamiforlife (Author) Posted February 22, 2018
How can I sort out the memory errors? And how harmful are they to the integrity of the data on the server?
JorgeB Posted February 22, 2018
Check the board's SEL (System Event Log); it should show which DIMM is the problem.
swamiforlife (Author) Posted February 22, 2018
Hey Johnnie, thanks for the help. I figured out what the SEL is and will try to fix that, but what do you think about the Mellanox adapter dropping its connection?
JorgeB Posted February 22, 2018
Wasn't the connection dropping on both at the same time? If it's just the Mellanox, check the cables/transceivers. Are you using a switch or a direct connection?
swamiforlife (Author) Posted February 22, 2018
I was using a Dell X1052 switch at first and had the drops. I also tried a direct connection between 2 servers and still had the same type of drops.
JorgeB Posted February 22, 2018
Try a different cable and, failing that, a different NIC. I use the same NIC and never had drop issues.
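When swapping cables and NICs, it can help to know whether the link itself goes down during a drop or whether it stays up while traffic stops. The following is a rough Python sketch, not something from this thread, that reads link state from the standard Linux sysfs files under /sys/class/net. The helper names read_link_state, changed_attrs and watch are hypothetical, as are the interface names eth0 and eth1; substitute whatever the server actually uses.

```python
import time
from pathlib import Path

ATTRS = ("operstate", "carrier", "speed")


def read_link_state(iface, sysfs_root="/sys/class/net"):
    """Read link attributes for one interface; unreadable ones become None."""
    base = Path(sysfs_root) / iface
    state = {}
    for attr in ATTRS:
        try:
            state[attr] = (base / attr).read_text().strip()
        except OSError:
            # Some attributes cannot be read while the interface is down.
            state[attr] = None
    return state


def changed_attrs(old, new):
    """Return the attributes whose value differs between two snapshots."""
    return {k: (old.get(k), new.get(k)) for k in new if old.get(k) != new.get(k)}


def watch(ifaces, interval=5.0, iterations=None):
    """Poll the interfaces and print a timestamped line on every change."""
    last = {name: read_link_state(name) for name in ifaces}
    count = 0
    while iterations is None or count < iterations:
        time.sleep(interval)
        for name in ifaces:
            current = read_link_state(name)
            diff = changed_attrs(last[name], current)
            if diff:
                print(time.strftime("%Y-%m-%d %H:%M:%S"), name, diff)
            last[name] = current
        count += 1
```

Calling watch(["eth0", "eth1"]) from a local console and leaving it running until the next drop would show whether carrier is lost on both NICs at the same moment, which points away from a single cable or transceiver.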
Archived
This topic is now archived and is closed to further replies.