LaurentHU Posted May 19, 2011

Hello,

I recently upgraded to a Zacate (E35M1-M) system and to unRAID 4.7, both at the same time, and since then I see frequent partial crashes.

First, I had issues with the onboard Realtek NIC; I swapped it for a D-Link DGE-528T, which works perfectly.

BUT now I frequently get this:

May 19 00:51:10 NAS kernel: irq 19: nobody cared (try booting with the "irqpoll" option)
May 19 00:51:10 NAS kernel: Pid: 0, comm: swapper Tainted: G W 2.6.32.9-unRAID #8
May 19 00:51:10 NAS kernel: Call Trace:
May 19 00:51:10 NAS kernel: [] __report_bad_irq+0x2e/0x6f
May 19 00:51:10 NAS kernel: [] note_interrupt+0xf5/0x13c
May 19 00:51:10 NAS kernel: [] handle_fasteoi_irq+0x5f/0x9d
May 19 00:51:10 NAS kernel: [] handle_irq+0x1a/0x24
May 19 00:51:10 NAS kernel: [] do_IRQ+0x40/0x96
May 19 00:51:10 NAS kernel: [] common_interrupt+0x29/0x30
May 19 00:51:10 NAS kernel: [] ? acpi_idle_enter_simple+0xfe/0x12a
May 19 00:51:10 NAS kernel: [] cpuidle_idle_call+0x63/0x9b
May 19 00:51:10 NAS kernel: [] cpu_idle+0x3a/0x4e
May 19 00:51:10 NAS kernel: [] start_secondary+0x195/0x19a
May 19 00:51:10 NAS kernel: handlers:
May 19 00:51:10 NAS kernel: [] (ahci_interrupt+0x0/0x3df [ahci])
May 19 00:51:10 NAS kernel: [] (sil_interrupt+0x0/0x26d [sata_sil])
May 19 00:51:10 NAS kernel: Disabling IRQ #19

It's always the same trace, never at the same time, and afterwards the system never really recovers: load that was <1 goes to >5, and everything is very slow...

Any ideas?
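For anyone sifting through a long syslog for the same symptom, here is a minimal sketch in Python that pulls out each "nobody cared" report together with the driver modules listed under "handlers:". It assumes the log line layout quoted in the post above; the function name `unhandled_irqs` and the regular expressions are illustrative, not part of any unRAID tool.

```python
import re

# Patterns modelled on the kernel messages quoted in the post above.
IRQ_RE = re.compile(r"irq (\d+): nobody cared")
HANDLER_RE = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)")


def unhandled_irqs(log_lines):
    """Return a list of (irq, [module, ...]), one entry per 'nobody cared' report."""
    events = []
    in_handlers = False
    for line in log_lines:
        m = IRQ_RE.search(line)
        if m:
            events.append((int(m.group(1)), []))
            in_handlers = False
            continue
        if not events:
            continue
        if "handlers:" in line:
            in_handlers = True
            continue
        if in_handlers:
            h = HANDLER_RE.search(line)
            if h:
                # group(2) is the module name in square brackets, e.g. "ahci"
                events[-1][1].append(h.group(2))
            else:
                in_handlers = False  # handler list ended
    return events


SAMPLE_LOG = [
    'May 19 00:51:10 NAS kernel: irq 19: nobody cared (try booting with the "irqpoll" option)',
    "May 19 00:51:10 NAS kernel: Call Trace:",
    "May 19 00:51:10 NAS kernel: [] __report_bad_irq+0x2e/0x6f",
    "May 19 00:51:10 NAS kernel: handlers:",
    "May 19 00:51:10 NAS kernel: [] (ahci_interrupt+0x0/0x3df [ahci])",
    "May 19 00:51:10 NAS kernel: [] (sil_interrupt+0x0/0x26d [sata_sil])",
    "May 19 00:51:10 NAS kernel: Disabling IRQ #19",
]

if __name__ == "__main__":
    for irq, modules in unhandled_irqs(SAMPLE_LOG):
        print(f"IRQ {irq} disabled; registered handlers: {', '.join(modules)}")
```

To run it against a real log, pass the lines of the syslog file instead of `SAMPLE_LOG`, for example `unhandled_irqs(open("syslog"))`; the file name is an assumption.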
LaurentHU Posted May 20, 2011

Bump - any ideas, anyone?
dgaschk Posted May 20, 2011

Please see here: http://lime-technology.com/forum/index.php?topic=9880.0

Make sure "AHCI" is selected in BIOS.
LaurentHU Posted May 28, 2011

Here is the complete logfile...

I have AHCI selected in the BIOS for the mainboard SATA controller, but I also have a Sil 3114 PCI controller that can't be set to AHCI, if I understood correctly, and that sits on the same IRQ as my NIC. That's bad, and again it can't be changed in the BIOS (ASUS E35 motherboard).
LaurentHU Posted May 28, 2011

Sorry, correction: the Sil SATA controller is on the same IRQ as the onboard SATA, as far as I can see:

root@NAS:/proc# more interrupts
           CPU0       CPU1
  0:       6504     142202   IO-APIC-edge      timer
  1:          0          2   IO-APIC-edge      i8042
  9:          0          0   IO-APIC-fasteoi   acpi
 12:          0          3   IO-APIC-edge      i8042
 17:        136      11235   IO-APIC-fasteoi   ehci_hcd:usb1, ehci_hcd:usb2, ehci_hcd:usb3
 18:       7028     959268   IO-APIC-fasteoi   ohci_hcd:usb4, ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, eth0
 19:       2265      80198   IO-APIC-fasteoi   ahci, sata_sil
NMI:          0          0   Non-maskable interrupts
LOC:     142490       6746   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
PND:          0          0   Performance pending work
RES:     156778      89635   Rescheduling interrupts
CAL:         23         55   Function call interrupts
TLB:       3595       8214   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          5          5   Machine check polls
ERR:          0
MIS:          0
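Spotting shared lines by eye gets tedious on a busier box. A minimal sketch in Python, assuming the 2.6.32-era /proc/interrupts layout shown in the post above (per-CPU counts, then one controller-type column, then a comma-separated device list); newer kernels add columns, so the parsing would need adjusting. The function name `shared_irqs` is made up for illustration.

```python
import re


def shared_irqs(text):
    """Return {irq: [device, ...]} for numbered IRQ lines that list more than one device."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        m = re.match(r"\s*(\d+):\s+(.*)", line)
        if not m:
            continue  # skip non-numeric rows such as NMI, LOC, ERR
        # Split into: per-CPU counts, controller type, then the rest (device list).
        fields = m.group(2).split(None, ncpu + 1)
        if len(fields) < ncpu + 2:
            continue
        devices = [d.strip() for d in fields[ncpu + 1].split(",")]
        if len(devices) > 1:
            shared[int(m.group(1))] = devices
    return shared


SAMPLE = """
           CPU0       CPU1
  0:       6504     142202   IO-APIC-edge      timer
 17:        136      11235   IO-APIC-fasteoi   ehci_hcd:usb1, ehci_hcd:usb2, ehci_hcd:usb3
 18:       7028     959268   IO-APIC-fasteoi   ohci_hcd:usb4, ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, eth0
 19:       2265      80198   IO-APIC-fasteoi   ahci, sata_sil
NMI:          0          0   Non-maskable interrupts
"""

if __name__ == "__main__":
    for irq, devices in sorted(shared_irqs(SAMPLE).items()):
        print(f"IRQ {irq} is shared by: {', '.join(devices)}")
```

On the server itself the input would come from `open("/proc/interrupts").read()` rather than the pasted sample. With the numbers above it confirms that eth0 shares IRQ 18 with the OHCI USB controllers, while ahci and sata_sil share IRQ 19.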
bcbgboy13 Posted May 28, 2011

You made the mistake of buying a brand-new, just-off-the-shelf hardware platform. I always wonder what people are thinking when they do this.

See, there are usually no problems when you run mainstream software (such as Windows), as the hardware vendors and the software vendors work hand in hand (exchanging beta hardware and software) to ensure that there are no major problems on the "launch" date when the new hardware hits the shelves.

But I am not sure how many Linux developers were given free hardware to test beforehand. And then we have the many Linux "flavors", and Slackware, upon which Unraid is based, is not the most popular or widely used nowadays. And Unraid development itself goes at an even slower pace and does not use the latest Slackware releases.

So for someone to buy a brand-new, shiny hardware platform and expect it to work under custom software like Unraid without any problems is like hoping for a miracle. Miracles sometimes happen, but most of the time they do not. So the early adopters are now in the "sweet" troubleshooting territory all by themselves.

What can you do?

1. Always disable all the unused hardware features on the motherboard (serial and parallel ports, audio, FireWire, floppy, and the IDE controller if you do not use any older PATA drives), as this frees up some resources.

2. Check your motherboard vendor daily for BIOS updates; if one is available, apply it immediately, and then do not forget about "#1" above. You may even complain to them that this hardware does not work with that software...

3. Forget about 4.7 and go with the latest beta. If you just started with Unraid 4.7 then IMHO you should not have any issues with 5.0b6a.

If these do not help, your only choice is to try to run the latest full Slackware distro.

And please do not forget to share your success (or pain) along the way.

PS. Read your PM.
LaurentHU Posted May 28, 2011

Here is the logfile...

syslog-2011-05-28.txt.zip
LaurentHU Posted May 28, 2011

Thanks for the reply... I understand what you're saying; in hindsight it was indeed a mistake.

I have now booted with the irqpoll option and it looks better; I'll just have to assess the impact on performance.

I think it's the Sil 3114 card that is causing the issue, plus the fact that the BIOS doesn't allow you to remap the IRQs...
LaurentHU Posted May 29, 2011

Just as an update:

* Booting with the irqpoll option works, but performance is then abysmal... Not a solution!
* I flashed the Sil 3114 card with the latest "IDE" (= non-RAID) BIOS. Flashing worked OK, but with that BIOS on the card the unRAID server doesn't boot anymore!!
* I flashed the card with the latest RAID BIOS, and it has now been running smoothly for more than 12 hours. Let's see if it stays like this!
cyrnel Posted May 30, 2011

There's a new BIOS from Asus. I just started testing tonight, but so far it's behaving much better than 502.

http://www.asus.com/Motherboards/AMD_CPU_on_Board/E35M1M_PRO/#download

(http://lime-technology.com/forum/index.php?topic=12330.30)
LaurentHU Posted May 30, 2011

Yep, I've seen this. Unfortunately I was too cheap to pay for the Pro version, and the non-Pro didn't get the same BIOS update. Damn!!!

On the current BIOS version my Sil 3114 is back to square one, still giving me the same errors as reported above. How frustrating...
LaurentHU Posted May 30, 2011

Sequel to this oh-so-interesting story: I found the 1002 BIOS on an ASUS FTP server and have updated to it; we'll see if it brings any improvement. So far the IRQ mapping is exactly the same, and I still can't redistribute IRQs manually in the BIOS.
LaurentHU Posted June 1, 2011

The problem with the Sil 3114 seems to be definitely solved with the new BIOS.

Now, should I try to reuse the onboard network card?
cyrnel Posted June 1, 2011

Good to hear. PCIe cards seem better overall. I heard they were getting copious input from RAID-card owners. I'm only using two-port cards, but the old BIOS had a thing about trying to use my Monoprice cards for video.

I'm not sure shared IRQs were really the cause. Granted, it didn't behave normally, but even when I managed to isolate the eth0 or SATA IRQs they would eventually have problems. It seemed more like a routing thing, or just big bugs.

Mine is better as well. Whatever the source, it's calmed down. Raw performance isn't quite as good, unfortunately. Quick disk write numbers, without any work to correct it, look 25% slower to the same array.

The Realtek is still horrible and will still crash hard above ~20MB/s. I haven't tried the Intel cards or other drivers yet.

BTW, if I hadn't mentioned it already, a full Slackware 13.1 install eventually died on the old BIOS. Same symptoms: it seems to work, then the familiar "Disabling IRQ" appears, even when just idling. When time allows I'll move my 13.1 array over for testing. It could mean more options for current drivers.