Hannibal Posted July 9, 2017
Hello, this is my first time ever posting on a forum, so if I do anything incorrectly, I'm sorry. I have a strange issue with Mellanox cards. I have 2 ConnectX-2 cards: one is in my unRAID server and the other is in my Windows 10 machine. I set everything up and made sure all drivers and firmware were up to date on both cards on both systems. The card in my unRAID box has an IP of 10.10.10.1 and my Windows machine is 10.10.10.2. Every so often on the Windows side the card will drop the connection, either reporting "network cable unplugged" or disabling the connection itself. That causes a call trace on unRAID, forcing me to perform an unclean shutdown and then a parity check. I've gone into power management and made sure Windows cannot turn off the card to save power, and that didn't fix anything. Last night I disconnected the SFP+ cable between the two machines and a call trace still occurred, which makes me believe this could be a compatibility issue between the cards and unRAID. Mellanox support told me they do not support unRAID as an OS, which confuses me because I see other members on this forum using ConnectX-2 cards without issue. Any help in the matter would be greatly appreciated. Thank you.
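On a direct card-to-card link like the one described above, both ends need addresses in the same subnet. The post gives the two addresses but not the netmask, so the /24 prefix below is an assumption, and the function name is only illustrative. A minimal sketch using Python's standard `ipaddress` module:

```python
import ipaddress

# Addresses taken from the post above.
UNRAID_IP = "10.10.10.1"
WINDOWS_IP = "10.10.10.2"

# Assumption: the post does not state a netmask; /24 (255.255.255.0) is a guess.
ASSUMED_PREFIX = 24


def same_subnet(addr_a, addr_b, prefix):
    """Return True if both addresses fall in the same network for this prefix."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{prefix}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{prefix}").network
    return net_a == net_b


if __name__ == "__main__":
    ok = same_subnet(UNRAID_IP, WINDOWS_IP, ASSUMED_PREFIX)
    print(f"{UNRAID_IP} and {WINDOWS_IP} share a /{ASSUMED_PREFIX} subnet: {ok}")
```

If the two ends turn out to use different masks, the check can be repeated with each machine's actual prefix length.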
JorgeB Posted July 9, 2017
I use the same NICs in the same way and also get "cable unplugged" sometimes, mainly when rebooting the unRAID server, but I just need to disable and re-enable the NIC on my Windows desktop. unRAID has never crashed because of this.
Hannibal Posted July 9, 2017
Hmmm. Could I possibly have a defective card then? During a parity check that started around 2am last night, another call trace was logged and the cause was my ConnectX-2 card. I just removed the card from the system so that my parity check would actually complete, because I'm getting tired of all the unclean shutdowns. This is the 4th or 5th one this week that unRAID says was caused by the Mellanox card.
JorgeB Posted July 9, 2017
That, or some incompatibility. I use it in 3 of my unRAID servers without issues.
Hannibal Posted July 9, 2017 (edited July 9, 2017 by Hannibal: wrong wording)
Then maybe I'll put it into another Windows machine and see if the issue still occurs between two Windows-based computers, to find out whether the card is bad. Kinda sucks, because I wanted the 10Gb connection for large file dumps where a regular everyday gigabit connection is too slow.
klamath Posted July 9, 2017
Fibre or twinax? I use the same cards and have the exact same issues. Pulling out the twinax a few times will force it to connect. The FreeNAS server I have uses fibre and never has a connection issue.
Tim
Hannibal Posted July 9, 2017
Cisco twinax cables. I'm going to put the card I suspect is faulty into another Windows machine to see if the issue is still there. But I just can't deal with the call traces it's causing. I've had at least 4 unclean shutdowns this week because of it, forcing me into unnecessary parity checks.
klamath Posted July 9, 2017
2 hours ago, Hannibal said: "Cisco twinax cables. I'm going to put the card I suspect is faulty into another Windows machine to see if the issue is still there. But I just can't deal with the call traces it's causing. I've had at least 4 unclean shutdowns this week because of it, forcing me into unnecessary parity checks."
I'd check the best practices for these cards and make sure your BIOS is configured correctly: https://community.mellanox.com/docs/DOC-2489
There are others out there; the above is a good example and starting point. TL;DR: some BIOS features can interfere with the card. Also make sure you're up to date on BIOS. I had to do some tricky things with my cards since they came pre-flashed with HP BIOS.
Tim
Hannibal Posted July 9, 2017
Alright. I'm sorry if this is a stupid question, but this is my very first time working with 10Gb cards. How would I go about checking what BIOS is on the cards? I have removed the card from my unRAID box just so I can get through a parity check without another call trace occurring. I've also made sure the BIOS on both the Windows 10 machine and the unRAID box is up to date, but I haven't gone in and checked any BIOS settings on either system in depth.
klamath Posted July 9, 2017
3 minutes ago, Hannibal said: "Alright. I'm sorry if this is a stupid question, but this is my very first time working with 10Gb cards. How would I go about checking what BIOS is on the cards? I have removed the card from my unRAID box just so I can get through a parity check without another call trace occurring. I've also made sure the BIOS on both the Windows 10 machine and the unRAID box is up to date, but I haven't gone in and checked any BIOS settings on either system in depth."
There are CLI tools you can download. A good rubber-meets-the-road post is here: https://community.mellanox.com/thread/1858
Tim
Hannibal Posted July 9, 2017
OK, thank you for that. I have the mlxup tools, but I believe those are just for firmware-related issues.
klamath Posted July 9, 2017
2 minutes ago, Hannibal said: "OK, thank you for that. I have the mlxup tools, but I believe those are just for firmware-related issues."
Depending on how you got the cards, don't be shocked if the firmware versions don't line up with Mellanox's own firmware. Each VAR puts its own "sauce" firmware on them. I'd recommend just reflashing with proper Mellanox firmware.
Hannibal Posted July 9, 2017
I actually purchased the cards off Newegg and they were supposedly new.
Hannibal Posted July 10, 2017 (edited July 10, 2017 by Hannibal)
According to this, I am currently on the latest firmware ever published on Mellanox's website, unless there is something I'm doing wrong here. Both cards produce the same output, posted below. Also, according to the PSID, they are not IBM- or HP-flashed cards.
Device #1:
----------
  Device Type:      ConnectX2
  Part Number:      MNPA19_A1-A3
  Description:      ConnectX-2 Lx EN network interface card; single-port SFP+; PCIe2.0 5.0GT/s; mem-free; RoHS R6
  PSID:             MT_0F60110010
  PCI Device Name:  mt26448_pci_cr0
  Port1 MAC:        6cb3114d0670
  Port2 MAC:        6cb3114d0671
  Versions:         Current        Available
     FW             2.9.1200       N/A
     PXE            3.3.0400       N/A
  Status:           No matching image found
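For anyone who wants to compare the reports from two cards field by field, here is a rough sketch that parses a device block like the one above into dictionaries. The sample text is copied from the post and laid out one field per line; the function names are made up for illustration, and real mlxup output may differ in spacing or fields, so treat this as a starting point rather than a definitive parser.

```python
# Sample copied from the post above, one field per line.
MLXUP_OUTPUT = """\
Device #1:
----------
  Device Type:      ConnectX2
  Part Number:      MNPA19_A1-A3
  Description:      ConnectX-2 Lx EN network interface card; single-port SFP+; PCIe2.0 5.0GT/s; mem-free; RoHS R6
  PSID:             MT_0F60110010
  PCI Device Name:  mt26448_pci_cr0
  Port1 MAC:        6cb3114d0670
  Port2 MAC:        6cb3114d0671
  Versions:         Current        Available
     FW             2.9.1200       N/A
     PXE            3.3.0400       N/A
  Status:           No matching image found
"""


def parse_mlxup(text):
    """Parse one device block into (fields, versions) dictionaries."""
    fields = {}
    versions = {}
    in_versions = False
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the dashed separator under the device header.
        if not line or set(line) == {"-"}:
            continue
        # The "Versions:" line is only a column header for the rows below it.
        if line.startswith("Versions:"):
            in_versions = True
            continue
        key, sep, value = line.partition(":")
        if sep and value.strip():
            # Ordinary "Key: value" line; it also ends the versions table.
            fields[key.strip()] = value.strip()
            in_versions = False
            continue
        if in_versions:
            # Version rows look like: name, current version, available version.
            parts = line.split()
            if len(parts) == 3:
                name, current, available = parts
                versions[name] = {"current": current, "available": available}
    return fields, versions


def format_mac(raw):
    """Turn a bare 12-hex-digit MAC into colon-separated form."""
    return ":".join(raw[i:i + 2] for i in range(0, len(raw), 2))


if __name__ == "__main__":
    fields, versions = parse_mlxup(MLXUP_OUTPUT)
    print(fields["PSID"], versions["FW"]["current"], format_mac(fields["Port1 MAC"]))
```

Running the same function over the output from each card makes it easy to confirm that the PSID and firmware versions really do match.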
klamath Posted July 10, 2017
14 hours ago, Hannibal said: "According to this, I am currently on the latest firmware ever published on Mellanox's website, unless there is something I'm doing wrong here. Both cards produce the same output, posted below. Also, according to the PSID, they are not IBM- or HP-flashed cards." [mlxup output quoted from the previous post]
That looks right. My only other recommendation is to look into BIOS features and power savings that could be interfering.
Tim
Hannibal Posted July 10, 2017
Alright, thank you. I was planning on putting the other card into a Windows machine to see if the issue is still there, because yesterday I removed the other Mellanox card from my unRAID system just so the parity check would complete. The results in my previous post were from my Windows-based system. Is there any possibility the card I had in my unRAID machine could be faulty?
klamath Posted July 10, 2017
Just now, Hannibal said: "Alright, thank you. I was planning on putting the other card into a Windows machine to see if the issue is still there, because yesterday I removed the other Mellanox card from my unRAID system just so the parity check would complete. The results in my previous post were from my Windows-based system. Is there any possibility the card I had in my unRAID machine could be faulty?"
Possible, yes. For an enterprise card the odds of it shipping defective are pretty low, but they are non-zero. If you had a Realtek NIC or some cheap POS, the likelihood of it being defective would go up. Heck, I had a defective twinax cable, so anything is possible.
Hannibal Posted July 10, 2017 (edited July 10, 2017 by Hannibal)
If I remember correctly, I only paid something like $88 a pop for these Mellanox cards off newegg.com. I just want it to work without the call trace issues. Performing large multi-terabyte file dumps over regular old everyday gigabit NICs is painfully slow compared to the 10Gb adapters.
uldise Posted July 10, 2017
18 hours ago, Hannibal said: "Cisco twinax cables"
I have 2 Mellanox ConnectX-2 cards, and with my Cisco 7m ACTIVE DAC they refuse to connect at all. Brocade active DACs work very well. I just finished my 10Gbit home network some weeks ago, and this was the only incompatibility among all my equipment. So I would try a different cable as a test. What is the distance between your PCs?
klamath Posted July 10, 2017
15 minutes ago, Hannibal said: "If I remember correctly, I only paid something like $88 a pop for these Mellanox cards off newegg.com. I just want it to work without the call trace issues. Performing large multi-terabyte file dumps over regular old everyday gigabit NICs is painfully slow compared to the 10Gb adapters."
I hear ya. I bought mine used from Amazon for $20 a card and have 4 of them in my house, all working well apart from the twinax acting a fool on startup sometimes. Too lazy to replace it with fibre.
Tim
Hannibal Posted July 10, 2017
The cards I have can only run off a twinax cable, correct?
uldise Posted July 10, 2017
2 minutes ago, Hannibal said: "The cards I have can only run off a twinax cable, correct?"
No, you can run them with fiber transceivers too.
klamath Posted July 10, 2017
My FreeNAS server uses fibre and has zero issues. My Windows 10 machine and my unRAID servers both have issues on boot sometimes where the link never comes up; disabling the NIC or pulling the twinax out and plugging it back in usually resets it. I've never had the link drop mid-flight like what you're describing.
Tim
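To tell a link that never comes up at boot apart from one that drops mid-transfer, it can help to timestamp the up/down changes and compare them against the times of the call traces. The sketch below is one possible way to do that from either machine using only Python's standard library. The peer address comes from the first post; the TCP port (445, Windows file sharing) and all function names are assumptions for illustration, and a failed probe only shows the peer was unreachable, not why.

```python
import socket
import time


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


def link_transitions(samples):
    """Given (timestamp, is_up) samples in time order, return the state changes."""
    changes = []
    previous = None
    for stamp, is_up in samples:
        if previous is not None and is_up != previous:
            changes.append((stamp, "up" if is_up else "down"))
        previous = is_up
    return changes


def watch(host, port, interval=5.0, rounds=12, probe=tcp_probe):
    """Probe the peer `rounds` times and return the observed transitions."""
    samples = []
    for _ in range(rounds):
        samples.append((time.time(), probe(host, port)))
        time.sleep(interval)
    return link_transitions(samples)
```

For example, `watch("10.10.10.2", 445)` run from the unRAID side would poll the Windows machine for about a minute. The `probe` argument is there so a different check can be swapped in if port 445 is not open on the peer.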
Hannibal Posted July 10, 2017 (edited July 10, 2017 by Hannibal)
It's strange. That's why I pulled the twinax cable, to see if the call trace would still occur, and it did during a parity check. I think I'm gonna just throw the card I took from the unRAID machine into a Windows PC and mess around with it. I've been contemplating swapping my Windows 10 PC over to Linux to see if the issue is still there with two machines running Linux. Idk, kinda at my wits' end here lol. Would there be any benefit to switching over to fibre vs. using the twinax cable?
Hannibal Posted July 10, 2017
26 minutes ago, uldise said: "I have 2 Mellanox ConnectX-2 cards, and with my Cisco 7m ACTIVE DAC they refuse to connect at all. Brocade active DACs work very well. I just finished my 10Gbit home network some weeks ago, and this was the only incompatibility among all my equipment. So I would try a different cable as a test. What is the distance between your PCs?"
The distance between the PCs is less than, I'd say, 2 ft. The PCs connected with the Cisco twinax cable are both in the same rack.