October 29, 2012

So I decided to post my issue in a new thread to get more eyes on it.

Oct 28 15:21:20 Clara-Belle kernel: sas: ata1: end_device-0:0: dev error handler
Oct 28 15:21:20 Clara-Belle kernel: sas: ata2: end_device-0:1: dev error handler
Oct 28 15:21:20 Clara-Belle kernel: sas: ata3: end_device-0:2: dev error handler
Oct 28 15:21:20 Clara-Belle kernel: sas: ata4: end_device-0:3: dev error handler
Oct 28 15:21:20 Clara-Belle kernel: ata4.00: ATA-8: ST31000528AS, CC38, max UDMA/133
Oct 28 15:21:20 Clara-Belle kernel: ata4.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
Oct 28 15:21:20 Clara-Belle kernel: ata4.00: qc timeout (cmd 0xef)
Oct 28 15:21:20 Clara-Belle kernel: ata4.00: failed to set xfermode (err_mask=0x4)
Oct 28 15:21:20 Clara-Belle kernel: drivers/scsi/mvsas/mv_sas.c 1522:mvs_I_T_nexus_reset for device[3]:rc= 0
Oct 28 15:21:20 Clara-Belle kernel: ata4.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
Oct 28 15:21:20 Clara-Belle kernel: ata4.00: revalidation failed (errno=-5)
Oct 28 15:21:20 Clara-Belle kernel: ata4.00: qc timeout (cmd 0xec)
Oct 28 15:21:20 Clara-Belle kernel: ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Oct 28 15:21:20 Clara-Belle kernel: ata4.00: revalidation failed (errno=-5)
Oct 28 15:21:20 Clara-Belle kernel: ata4.00: disabled
Oct 28 15:21:20 Clara-Belle kernel: ata4: hard resetting link
Oct 28 15:21:20 Clara-Belle kernel: mvsas 0000:0b:00.0: Phy3 : No sig fis
Oct 28 15:21:20 Clara-Belle kernel: drivers/scsi/mvsas/mv_sas.c 1522:mvs_I_T_nexus_reset for device[3]:rc= 0
Oct 28 15:21:20 Clara-Belle kernel: ata4: EH complete
Oct 28 15:21:20 Clara-Belle kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0

On the next boot it was fine. This is not the first time this has happened.

Is it related to IRQs? The reason I ask is that on shutdown I noticed the following:

Disabling IRQ #16

http://i.imgur.com/1rhiU.png

When it boots up fine (detects all disks), I do not see this.
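For anyone comparing a good boot against a bad one, the log excerpt above can be scanned mechanically instead of by eye. This is only a minimal sketch in Python: the function names and the list of "error" substrings are my own assumptions, picked from the messages quoted in this post, not anything libata or unRAID defines.

```python
import re
from collections import defaultdict

# Matches libata lines such as "kernel: ata4.00: qc timeout (cmd 0xef)".
# Lines prefixed with "sas:" or "mvsas" are deliberately not matched.
ATA_MSG = re.compile(r"kernel: (ata\d+(?:\.\d+)?): (.+)$")

# Assumption: substrings that indicate trouble, taken from the excerpt above.
ERROR_MARKERS = ("qc timeout", "failed to", "revalidation failed", "disabled")


def summarize_ata_errors(lines):
    """Return {port: [messages]} for libata lines that look like errors."""
    errors = defaultdict(list)
    for line in lines:
        match = ATA_MSG.search(line.strip())
        if not match:
            continue
        port, message = match.groups()
        if any(marker in message for marker in ERROR_MARKERS):
            errors[port].append(message)
    return dict(errors)


def disabled_ports(lines):
    """Return the sorted ports whose device libata gave up on ("disabled")."""
    summary = summarize_ata_errors(lines)
    return sorted(port for port, msgs in summary.items() if "disabled" in msgs)


if __name__ == "__main__":
    import sys

    # Usage: python scan_syslog.py /path/to/syslog
    with open(sys.argv[1], errors="replace") as handle:
        for port in disabled_ports(handle):
            print(f"{port} was disabled during this boot")
```

Fed the excerpt above, this should flag only ata4.00, which matches what the log shows by eye. It says nothing about why the port was disabled.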
This is what my VM config looks like: http://i.imgur.com/VzX0N.png

1) Possible cause: Passing through both an M1015 and an MV8.

This disk looks to be on the MV8, or is it? I can't tell at this point...

If it is on the MV8, then these are the cables that are connected (I got them off eBay): http://www.ebay.com/itm/110931840838?ssPageName=STRK:MEWNX:IT&_trksid=p3984.m1439.l2649

I will purchase these to test: http://www.newegg.com/Product/Product.aspx?Item=N82E16816133033

2) Possible cause: Another user claims irqpoll seems to help resolve similar issues. I'm not sure if this is relevant for recent kernels (v5rc8a).

append initrd=bzroot irqpoll

http://lime-technology.com/forum/index.php?topic=918.msg6193#msg6193

Not sure what this does...

EDIT: This did not solve it; it happened again with the parameter in syslinux.cfg.

Will try "noirqdebug" as per this thread http://lime-technology.com/forum/index.php?topic=19593.msg175182#msg175182 and report back.

EDIT: "noirqdebug" did not solve it; it happened again with the parameter in syslinux.cfg.

Here is /proc/interrupts:

           CPU0       CPU1       CPU2       CPU3
  0:         23          0          0          0   IO-APIC-edge    timer
  1:          9          0          0          0   IO-APIC-edge    i8042
  6:          0          3          0          0   IO-APIC-edge    floppy
  7:          0          0          0          0   IO-APIC-edge    parport0
  9:          0          0          0          0   IO-APIC-fasteoi acpi
 12:          4          0          0          0   IO-APIC-edge    i8042
 14:         43          0          0          0   IO-APIC-edge    ide0
 15:          0          0          0          0   IO-APIC-edge    ide1
 16:       8822          0          0          0   IO-APIC-fasteoi ehci_hcd:usb1
 17:       6382       7547       7397       7241   IO-APIC-fasteoi ioc0
 18:     434044          0          0          0   IO-APIC-fasteoi uhci_hcd:usb2
 19:    1726732          0          0          0   IO-APIC-fasteoi mvsas
 40:          0          0          0          0   PCI-MSI-edge    PCIe PME
 41:          0          0          0          0   PCI-MSI-edge    PCIe PME
 42:          0          0          0          0   PCI-MSI-edge    PCIe PME
 43:          0          0          0          0   PCI-MSI-edge    PCIe PME
 44:          0          0          0          0   PCI-MSI-edge    PCIe PME
 45:          0          0          0          0   PCI-MSI-edge    PCIe PME
 46:          0          0          0          0   PCI-MSI-edge    PCIe PME
 47:          0          0          0          0   PCI-MSI-edge    PCIe PME
 48:          0          0          0          0   PCI-MSI-edge    PCIe PME
 49:          0          0          0          0   PCI-MSI-edge    PCIe PME
 50:          0          0          0          0   PCI-MSI-edge    PCIe PME
 51:          0          0          0          0   PCI-MSI-edge    PCIe PME
 52:          0          0          0          0   PCI-MSI-edge    PCIe PME
 53:          0          0          0          0   PCI-MSI-edge    PCIe PME
 54:          0          0          0          0   PCI-MSI-edge    PCIe PME
 55:          0          0          0          0   PCI-MSI-edge    PCIe PME
 56:          0          0          0          0   PCI-MSI-edge    PCIe PME
 57:          0          0          0          0   PCI-MSI-edge    PCIe PME
 58:          0          0          0          0   PCI-MSI-edge    PCIe PME
 59:          0          0          0          0   PCI-MSI-edge    PCIe PME
 60:          0          0          0          0   PCI-MSI-edge    PCIe PME
 61:          0          0          0          0   PCI-MSI-edge    PCIe PME
 62:          0          0          0          0   PCI-MSI-edge    PCIe PME
 63:          0          0          0          0   PCI-MSI-edge    PCIe PME
 64:          0          0          0          0   PCI-MSI-edge    PCIe PME
 65:          0          0          0          0   PCI-MSI-edge    PCIe PME
 66:          0          0          0          0   PCI-MSI-edge    PCIe PME
 67:          0          0          0          0   PCI-MSI-edge    PCIe PME
 68:          0          0          0          0   PCI-MSI-edge    PCIe PME
 69:          0          0          0          0   PCI-MSI-edge    PCIe PME
 70:          0          0          0          0   PCI-MSI-edge    PCIe PME
 71:          0          0          0          0   PCI-MSI-edge    PCIe PME
 72:      27511    6575494      40006      18756   PCI-MSI-edge    eth0-rxtx-0
 73:      14498      24692    6116212      16706   PCI-MSI-edge    eth0-rxtx-1
 74:      15964      23463      20818    4689849   PCI-MSI-edge    eth0-rxtx-2
 75:    5796124      23256      15030      20082   PCI-MSI-edge    eth0-rxtx-3
 76:          0          0          0          0   PCI-MSI-edge    eth0-event-4
 77:    1462937     191977     217932     213623   PCI-MSI-edge    mpt2sas0-msix0
 78:          0          0          0          0   PCI-MSI-edge    vmci
 79:          0          0          0          0   PCI-MSI-edge    vmci
NMI:          0          0          0          0   Non-maskable interrupts
LOC:   22483901   22483964   22483853   22483873   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0   Performance monitoring interrupts
IWI:          0          0          0          0   IRQ work interrupts
RTR:          0          0          0          0   APIC ICR read retries
RES:   10282630   12919166   10952486   11598716   Rescheduling interrupts
CAL:      69033      50148     571038     597827   Function call interrupts
TLB:     532814     415962     428039     359889   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:        750        750        750        750   Machine check polls
ERR:          0
MIS:          0

3) Possible cause: PSU related?

Case: the SC933T-R760B has a triple redundant PSU, but only one supply is plugged in at this point. http://www.servethehome.com/supermicro-sc933t-r760b-3u-15x-35-sassata-storage-chassis-review/

Not very likely.

4) Possible cause: Drive related?

Output of "smartctl -a -d ata /dev/sdn" is attached. This doesn't look good, or does it? The reallocated sector count is 0, so the drive should be OK.
In any case, I pulled it from the array and I'm currently running SpinRite on it. Will report back...

  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       144973736
195 Hardware_ECC_Recovered  0x001a   035   020   000    Old_age   Always       -       144973736
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0

EDIT: I now think this is more drive related. I shut down the unRAID VM and the LED on the drive bay has yet to go off... it remains solid. Very weird.

5) Possible cause: Kernel bug or ESXi incompatibility?

Any thoughts would be really appreciated.

ESXi 5.1
unRAID v5rc8a
Mobo: Supermicro X8SIL

Attachments:
smart_output_sdn.txt
syslog.txt.zip
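A note on reading the SMART rows quoted in the post above. smartctl prints each attribute as ID, name, flag, normalized value, worst, threshold, type, updated, when-failed and raw value, and it counts an attribute as failed when the normalized value is at or below the threshold. The large raw number on this Seagate drive is commonly reported (this is community knowledge, not something I can point to in Seagate documentation, so treat it as an assumption) to pack an error count into the upper 16 bits and an operation count into the lower 32 bits of a 48-bit field. The sketch below, in Python with hypothetical helper names, applies both readings to the quoted rows.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int      # normalized value
    worst: int
    threshold: int
    raw: int


def parse_attribute(line):
    """Parse one attribute row from the table printed by "smartctl -a".

    Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    fields = line.split()
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        threshold=int(fields[5]),
        raw=int(fields[9]),
    )


def has_failed(attr):
    """An attribute has failed when its normalized value <= threshold."""
    return attr.value <= attr.threshold


def split_seagate_raw(raw_value):
    """Split a 48-bit raw value into (error_count, operation_count).

    ASSUMPTION: upper 16 bits are errors, lower 32 bits are operations.
    This layout is widely reported for Seagate drives but is unofficial.
    """
    if not 0 <= raw_value < (1 << 48):
        raise ValueError("raw value does not fit in 48 bits")
    return raw_value >> 32, raw_value & 0xFFFFFFFF


if __name__ == "__main__":
    row = "1 Raw_Read_Error_Rate 0x000f 117 099 006 Pre-fail Always - 144973736"
    attr = parse_attribute(row)
    errors, operations = split_seagate_raw(attr.raw)
    print(attr.name, "failed:", has_failed(attr), "errors:", errors)
```

Under that assumption, 144973736 is below 2^32, so the error part is 0 and the number is just an operation counter. Together with normalized values well above their thresholds, that fits the "drive should be OK" reading, although it does not rule out a drive that misbehaves in ways SMART does not record.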
February 25, 2013

Did you ever figure this out? It sounds like the issues you are having are similar to mine.
February 25, 2013 Author

Hi jesseasi, I just noticed your reply. Basically, I did just about everything to try to solve this issue. Ultimately, I solved it by swapping the MV8 for another M1015. I really wish I hadn't had to, as I had plans for that M1015, but alas, I couldn't live with the MV8 in passthrough mode. It was just unreliable. I know it works well for some. As you can see, I tried swapping out cables, I knew that the drive in question was good, and I tried various kernel parameters. The MV8 worked flawlessly on bare metal. To me, it seemed to be a low-level/driver problem that was beyond me. As a consequence, I have a spare MV8...
March 21, 2013

I have this same issue, with random drives randomly not showing up to assign to unRAID. I also use MV8s, and it happens to drives on each of them. Unless someone comes up with a better idea than buying the LSI card, I'll probably just restart the unRAID VM twice to get them to show up. Luckily I don't restart unRAID that often.
This topic is now archived and is closed to further replies.