
irq 46: nobody cared


pengrus


But I care!

 

I am 90% sure this isn't an unRAID problem at all (hence the Lounge posting), but unRAID is the only thing properly logging anything, so I throw myself on your mercy.

 

Just before Christmas I switched HTPCs to a Haswell i5 NUC.  Twice since then it has frozen, refused to shut down, or something similar.  When it does this, every other wired client in my house loses its connection.  Every one.  The PS3, the Xbox, the office computers, and sadly, my unRAID server.  The error unRAID throws:

 

Jan 17 10:21:26 Tower kernel: irq 46: nobody cared (try booting with the "irqpoll" option) (Errors)
Jan 17 10:21:26 Tower kernel: Pid: 8973, comm: sleep Not tainted 3.9.11p-unRAID #4 (Errors)
Jan 17 10:21:26 Tower kernel: Call Trace: (Errors)
Jan 17 10:21:26 Tower kernel:  [<c105f922>] __report_bad_irq+0x29/0xb4 (Errors)
Jan 17 10:21:26 Tower kernel:  [<c105fae4>] note_interrupt+0x137/0x1ac (Errors)
Jan 17 10:21:26 Tower kernel:  [<c105e058>] handle_irq_event_percpu+0x109/0x11a (Errors)
Jan 17 10:21:26 Tower kernel:  [<c10880aa>] ? change_protection+0x28/0x2f (Errors)
Jan 17 10:21:26 Tower kernel:  [<c105e08e>] handle_irq_event+0x25/0x3c (Errors)
Jan 17 10:21:26 Tower kernel:  [<c10604d3>] handle_edge_irq+0xae/0xcf (Errors)
Jan 17 10:21:26 Tower kernel:  [<c1003e12>] handle_irq+0x69/0x70 (Errors)
Jan 17 10:21:26 Tower kernel:  [<c100362e>] do_IRQ+0x37/0x9b (Errors)
Jan 17 10:21:26 Tower kernel:  [<c10883e6>] ? sys_mprotect+0x165/0x177 (Errors)
Jan 17 10:21:26 Tower kernel:  [<c14018ac>] common_interrupt+0x2c/0x31 (Errors)
Jan 17 10:21:26 Tower kernel: handlers:
Jan 17 10:21:26 Tower kernel: [<f848197b>] e1000_msix_other [e1000e]
Jan 17 10:21:26 Tower kernel: Disabling IRQ #46

 

Manually rebooting the NUC (it is completely unresponsive) fixes the issue.

 

Any ideas?

 

TIA,

 

Pengrus

Link to comment

I'll take a shot; these are the things I would try. Look over http://lime-technology.com/wiki/index.php/Boot_Codes and consider appending "irqpoll" to the append line in your syslinux.cfg file.

 

I think you can troubleshoot this by booting from a "live" CD/USB and running "lspci -vv" from the command line (terminal). That should give you a verbose listing of which devices are on the PCI bus and the interrupts they use. I'm guessing it's the Ethernet adapter (the e1000e driver named in your log).

 

I just don't know if lspci is on the unRAID USB stick.
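
If lspci turns out not to be available, /proc/interrupts on a running Linux system also lists each IRQ with its per-CPU counts and the device names attached to it. Below is a minimal sketch, assuming Python is available on the box, that pulls out the IRQ lines matching a device name. The function name find_irqs and the default search string are made up for illustration, and the name shown in /proc/interrupts may be the interface (for example eth0) rather than the driver name.

```python
def find_irqs(interrupts_text, needle):
    """Return (irq, total_count, description) for lines whose description contains needle.

    interrupts_text is the content of /proc/interrupts; needle is a
    device or interface name such as "eth0" (an assumed example).
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    # The header line has one column per CPU: CPU0 CPU1 ...
    ncpu = len(lines[0].split())
    matches = []
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = fields[:ncpu]
        if not all(c.isdigit() for c in counts):
            continue
        # Everything after the per-CPU counters: controller type and device names.
        description = " ".join(fields[ncpu:])
        if needle in description:
            total = sum(int(c) for c in counts)
            matches.append((label.strip(), total, description))
    return matches
```

To use it, read /proc/interrupts into a string (for example with open("/proc/interrupts").read()) and call find_irqs(text, "eth0"). Running it before and after a problem shows whether the NIC has moved to a different IRQ.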

Link to comment

I am reasonably confident that it is the e1000e that was on IRQ 46.  It is now on 16.  I have appended irqpoll to syslinux.cfg, but can't reboot at the moment.

 

What would cause this to happen?  Why would some other computer on the network completely hijack all the wired connections, but leave the wifi just fine?

 

Literally as soon as I power off the NUC, everything gets its connection back.  Other than running Wireshark nonstop for a few weeks, what can I do to diagnose this?

 

Thanks for your help!

Link to comment

...When it does this, every other wired client in my house loses its connection.  Every one. The PS3, the XBOX, the office computers, and sadly, my unraid server.  The error it throws..............

Sorry, I got confused and shot from the hip; as you said, the problem is with the NUC, not unRAID, so changing IRQs in unRAID isn't really going to do anything for the NUC (duh). The idea would be to find out why the NUC is flooding the network; a DoS attack within your own house, as it were.  ;)
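
One lightweight way to check the flooding theory without leaving Wireshark running for weeks is to sample the packet counters in /proc/net/dev on any Linux box on the wired segment and log only when the receive rate jumps. This is a rough sketch under assumptions: Python is available on that box, the interface is called eth0, and the 20000 packets/s threshold is a placeholder to be tuned. It counts packets that actually reach that machine, so it will not see traffic the switch never forwards to it.

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_packets, tx_packets)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if len(fields) < 16:
            continue
        # Field 1 is received packets, field 9 is transmitted packets.
        counters[name.strip()] = (int(fields[1]), int(fields[9]))
    return counters


def rx_rate(before, after, seconds):
    """Received packets per second for each interface present in both samples."""
    return {
        iface: (after[iface][0] - before[iface][0]) / seconds
        for iface in after
        if iface in before
    }


def watch(iface="eth0", interval=5.0, threshold=20000.0):
    """Print a timestamped line whenever iface exceeds threshold rx packets/s."""
    with open("/proc/net/dev") as f:
        previous = parse_net_dev(f.read())
    while True:
        time.sleep(interval)
        with open("/proc/net/dev") as f:
            current = parse_net_dev(f.read())
        rate = rx_rate(previous, current, interval).get(iface, 0.0)
        if rate > threshold:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(stamp, iface, f"{rate:.0f} rx packets/s")
        previous = current
```

Calling watch() and redirecting its output to a file leaves a timestamped record that can be lined up against the next freeze of the NUC.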

 

Since it's a wired device, it makes some sense that it would cause problems for your other wired network devices. But if Wi-Fi is OK, then that points to the way your router handles the two networks... oh, my head hurts...

 

What software/OS is the NUC HTPC running? Here are some more simple/basic things to try: a NUC BIOS update? A router firmware update? Running the NUC on wireless and hoping the problem goes away? Contacting Intel Support?  The Wireshark idea sounds good!

 

....Or have I got the situation totally a$$-backward?

Link to comment

1) Get a better switch. Your network is behaving as if you have a hub. You might have overrun the switch's address table, or the NIC might have gone crazy. A better switch would detect the trouble on the port connected to the NUC and prevent the outage.

2) Change the port and cables. It might be a physical-layer event, but since a reboot fixes it, maybe not.

3) Change anything you can find in the NUC's NIC stack: the NIC driver, the NIC firmware, the auto-negotiation setting, etc.

Link to comment
