pengrus Posted January 17, 2014

But I care!

I am 90% sure this isn't an unRAID problem at all (hence the lounge posting), but unRAID is the only thing properly logging anything, so I throw myself at your mercy. I recently changed HTPCs to a NUC i5 Haswell, just before Christmas. Twice since then it has frozen, or refused to shut down, or some such thing. When it does this, every other wired client in my house loses its connection. Every one. The PS3, the Xbox, the office computers, and sadly, my unRAID server. The error it throws:

Jan 17 10:21:26 Tower kernel: irq 46: nobody cared (try booting with the "irqpoll" option) (Errors)
Jan 17 10:21:26 Tower kernel: Pid: 8973, comm: sleep Not tainted 3.9.11p-unRAID #4 (Errors)
Jan 17 10:21:26 Tower kernel: Call Trace: (Errors)
Jan 17 10:21:26 Tower kernel: [<c105f922>] __report_bad_irq+0x29/0xb4 (Errors)
Jan 17 10:21:26 Tower kernel: [<c105fae4>] note_interrupt+0x137/0x1ac (Errors)
Jan 17 10:21:26 Tower kernel: [<c105e058>] handle_irq_event_percpu+0x109/0x11a (Errors)
Jan 17 10:21:26 Tower kernel: [<c10880aa>] ? change_protection+0x28/0x2f (Errors)
Jan 17 10:21:26 Tower kernel: [<c105e08e>] handle_irq_event+0x25/0x3c (Errors)
Jan 17 10:21:26 Tower kernel: [<c10604d3>] handle_edge_irq+0xae/0xcf (Errors)
Jan 17 10:21:26 Tower kernel: [<c1003e12>] handle_irq+0x69/0x70 (Errors)
Jan 17 10:21:26 Tower kernel: [<c100362e>] do_IRQ+0x37/0x9b (Errors)
Jan 17 10:21:26 Tower kernel: [<c10883e6>] ? sys_mprotect+0x165/0x177 (Errors)
Jan 17 10:21:26 Tower kernel: [<c14018ac>] common_interrupt+0x2c/0x31 (Errors)
Jan 17 10:21:26 Tower kernel: handlers:
Jan 17 10:21:26 Tower kernel: [<f848197b>] e1000_msix_other [e1000e]
Jan 17 10:21:26 Tower kernel: Disabling IRQ #46

Manually rebooting the NUC (it is completely unresponsive) fixes the issue. Any ideas?

TIA,
Pengrus
doorunrun Posted January 17, 2014

I'll take a shot; these are the things I would try.

Look over http://lime-technology.com/wiki/index.php/Boot_Codes and consider appending the "irqpoll" option to your syslinux.cfg file.

I think you can troubleshoot this by booting from a "live" CD/USB and running "lspci -vv" from the command line (terminal). That should give you a verbose listing of the devices on the PCI bus AND the interrupts they use. I'm guessing it's the Ethernet adapter (e1000). I just don't know if lspci is on the unRAID USB stick.
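To make the "lspci -vv" suggestion above a bit more concrete, here is a minimal Python sketch that picks the IRQ and kernel driver out of that output. It assumes the "Interrupt: pin A routed to IRQ N" and "Kernel driver in use:" lines that lspci -vv normally prints; the SAMPLE text, the function names and the device name are made up for illustration and are not taken from the poster's machine. Note also that a NIC running in MSI-X mode may show different vector numbers in /proc/interrupts than the legacy IRQ lspci reports.

```python
import re

# Patterns for the two indented lspci -vv lines we care about.
IRQ_RE = re.compile(r"routed to IRQ (\d+)")
DRIVER_RE = re.compile(r"Kernel driver in use:\s*(\S+)")


def parse_lspci(text):
    """Return one dict per PCI device with slot, name, irq and driver."""
    devices = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank line between device blocks
        if not line[0].isspace():
            # Unindented line starts a new device: "<slot> <description>"
            slot, _, name = line.partition(" ")
            current = {"slot": slot, "name": name, "irq": None, "driver": None}
            devices.append(current)
        elif current is not None:
            irq = IRQ_RE.search(line)
            if irq:
                current["irq"] = int(irq.group(1))
            driver = DRIVER_RE.search(line)
            if driver:
                current["driver"] = driver.group(1)
    return devices


def devices_on_irq(devices, irq):
    """Filter the parsed devices down to those routed to the given IRQ."""
    return [d for d in devices if d["irq"] == irq]


# Hypothetical output, shaped like lspci -vv, used only to exercise the parser.
SAMPLE = """\
00:19.0 Ethernet controller: Example Gigabit NIC (rev 04)
    Subsystem: Example Vendor Device 0000
    Interrupt: pin A routed to IRQ 46
    Kernel driver in use: e1000e

00:1f.2 SATA controller: Example SATA Controller
    Interrupt: pin B routed to IRQ 19
    Kernel driver in use: ahci
"""
```

On a real machine you would feed it the saved output instead of SAMPLE, for example the contents of a file produced with "lspci -vv > lspci.txt", and then call devices_on_irq(parse_lspci(text), 46).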
pengrus Posted January 18, 2014

I am reasonably confident that it is the e1000e that was on IRQ 46. It is now on 16. I have appended irqpoll to syslinux.cfg, but can't reboot at the moment.

What would cause this to happen? Why would some other computer on the network completely hijack all the wired connections, but leave the Wi-Fi just fine? Literally as soon as I power off the NUC, everything gets its connection back. Other than running Wireshark for a few weeks nonstop, what can I do to diagnose this?

Thanks for your help!
doorunrun Posted January 18, 2014

"...When it does this, every other wired client in my house loses its connection. Every one. The PS3, the XBOX, the office computers, and sadly, my unraid server. The error it throws..."

Sorry, I got confused and shot from the hip; as you said, the problem is with the NUC, not unRAID, so changing IRQs in unRAID isn't really going to do anything to the NUC (duh).

The idea would be to find out why the NUC is flooding the network; a DoS attack within your own house, as it were. Since it's a wired device, there's some explanation for it causing problems with your other wired network devices. But if Wi-Fi is OK, then that points to the way your router handles the two networks... oh, my head hurts...

What software/OS is the NUC HTPC running?

Here are some more simple/basic things to "try":
NUC BIOS update?
Router FW update?
Run the NUC wireless and hope the problem goes away?
Contact Intel Support?

The Wireshark idea sounds good!

...Or have I got the situation totally a$$-backward?
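As a lighter-weight alternative to weeks of Wireshark, one option is to log the receive packet rate on any wired Linux client and see whether it spikes when the outage starts. The sketch below is only an illustration under assumptions: it reads the standard Linux counter at /sys/class/net/<iface>/statistics/rx_packets, the interface name "eth0" is a placeholder, and the 50,000 packets-per-second threshold is an arbitrary number, not a measured one.

```python
import time
from pathlib import Path

# Arbitrary illustrative threshold; tune it to what is normal on your network.
FLOOD_THRESHOLD_PPS = 50_000


def read_counter(iface, name="rx_packets", root="/sys/class/net"):
    """Read one cumulative interface counter from sysfs (Linux only)."""
    return int(Path(root, iface, "statistics", name).read_text().strip())


def packets_per_second(prev, curr, seconds):
    """Convert two cumulative counter readings into an average rate."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (curr - prev) / seconds


def is_flood(rate, threshold=FLOOD_THRESHOLD_PPS):
    """True when the rate is above the (assumed) flood threshold."""
    return rate > threshold


def watch(iface, interval=1.0, samples=None, threshold=FLOOD_THRESHOLD_PPS):
    """Print a timestamped line whenever the receive rate looks like a flood.

    samples=None loops until interrupted; an integer limits the number of reads.
    """
    prev = read_counter(iface)
    taken = 0
    while samples is None or taken < samples:
        time.sleep(interval)
        curr = read_counter(iface)
        rate = packets_per_second(prev, curr, interval)
        if is_flood(rate, threshold):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} {iface}: {rate:.0f} packets/s received")
        prev = curr
        taken += 1
```

Calling watch("eth0") on a wired machine and redirecting the output to a file would give a timestamp to line up with the NUC freezing. It will not say what the traffic is; a short Wireshark capture started at that moment would still be needed for that.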
c3 Posted January 19, 2014

1) Get a better switch. Your network is behaving like you have a hub. You might have overrun the address table size, or the NIC might have gone crazy. A better switch would detect the trouble on the port connected to the NUC and prevent the outage.
2) Change the port and cables. It might be a physical layer event, but since a reboot fixes it, maybe not.
3) Change anything you can find in the NUC's NIC stack: the NIC driver, the NIC firmware, the auto-negotiate setting, etc.