unRAID loses disk (sporadic) in VM running ESXi


I decided to post my issue in a new thread to get more eyes on it. On one boot, unRAID failed to bring up one of the disks; here is the relevant part of the syslog:

 

Oct 28 15:21:20 Clara-Belle kernel: sas: ata1: end_device-0:0: dev error handler

Oct 28 15:21:20 Clara-Belle kernel: sas: ata2: end_device-0:1: dev error handler

Oct 28 15:21:20 Clara-Belle kernel: sas: ata3: end_device-0:2: dev error handler

Oct 28 15:21:20 Clara-Belle kernel: sas: ata4: end_device-0:3: dev error handler

Oct 28 15:21:20 Clara-Belle kernel: ata4.00: ATA-8: ST31000528AS, CC38, max UDMA/133

Oct 28 15:21:20 Clara-Belle kernel: ata4.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)

Oct 28 15:21:20 Clara-Belle kernel: ata4.00: qc timeout (cmd 0xef)

Oct 28 15:21:20 Clara-Belle kernel: ata4.00: failed to set xfermode (err_mask=0x4)

Oct 28 15:21:20 Clara-Belle kernel: drivers/scsi/mvsas/mv_sas.c 1522:mvs_I_T_nexus_reset for device[3]:rc= 0

Oct 28 15:21:20 Clara-Belle kernel: ata4.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)

Oct 28 15:21:20 Clara-Belle kernel: ata4.00: revalidation failed (errno=-5)

Oct 28 15:21:20 Clara-Belle kernel: ata4.00: qc timeout (cmd 0xec)

Oct 28 15:21:20 Clara-Belle kernel: ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)

Oct 28 15:21:20 Clara-Belle kernel: ata4.00: revalidation failed (errno=-5)

Oct 28 15:21:20 Clara-Belle kernel: ata4.00: disabled

Oct 28 15:21:20 Clara-Belle kernel: ata4: hard resetting link

Oct 28 15:21:20 Clara-Belle kernel: mvsas 0000:0b:00.0: Phy3 : No sig fis

Oct 28 15:21:20 Clara-Belle kernel: drivers/scsi/mvsas/mv_sas.c 1522:mvs_I_T_nexus_reset for device[3]:rc= 0

Oct 28 15:21:20 Clara-Belle kernel: ata4: EH complete

Oct 28 15:21:20 Clara-Belle kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0
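One way to see which port dropped out, and which driver was complaining at the same moment, is to group the kernel messages by ATA port. The following is only a rough Python sketch that assumes the syslog format shown above; the function names and regular expressions are illustrative, not part of unRAID or any tool.

```python
import re
from collections import defaultdict

# Matches lines such as "kernel: ata4.00: disabled" or "kernel: ata4: hard resetting link".
ATA_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)$")
# Matches driver lines such as "kernel: mvsas 0000:0b:00.0: Phy3 : No sig fis".
MVSAS_RE = re.compile(r"kernel: mvsas (\S+): Phy(\d+)")

def summarize(lines):
    """Return (messages per ATA port, set of (pci_address, phy) reported by mvsas)."""
    per_port = defaultdict(list)
    mvsas_phys = set()
    for line in lines:
        line = line.strip()
        m = ATA_RE.search(line)
        if m:
            per_port[m.group(1)].append(m.group(2))
        m = MVSAS_RE.search(line)
        if m:
            mvsas_phys.add((m.group(1), int(m.group(2))))
    return per_port, mvsas_phys

def disabled_ports(per_port):
    """Ports whose device libata marked as 'disabled'."""
    return sorted(port for port, msgs in per_port.items() if "disabled" in msgs)

# A few lines copied from the syslog excerpt above.
sample = [
    "Oct 28 15:21:20 Clara-Belle kernel: ata4.00: qc timeout (cmd 0xef)",
    "Oct 28 15:21:20 Clara-Belle kernel: ata4.00: disabled",
    "Oct 28 15:21:20 Clara-Belle kernel: ata4: hard resetting link",
    "Oct 28 15:21:20 Clara-Belle kernel: mvsas 0000:0b:00.0: Phy3 : No sig fis",
]
per_port, phys = summarize(sample)
print("disabled:", disabled_ports(per_port))
print("mvsas phys seen:", sorted(phys))
```

In the excerpt, the mvsas "Phy3" and mv_sas.c "device[3]" lines sit right next to the ata4 errors, which suggests (but does not prove) that the dropped disk hangs off the MV8 rather than the M1015.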

 

On the next boot it was fine. This is not the first time this has happened. Is it related to IRQs? I ask because I noticed the following on shutdown:

 

Disabling IRQ #16
http://i.imgur.com/1rhiU.png

When it boots up fine (detects all disks), I do not see this message.

 

This is what my VM config looks like: http://i.imgur.com/VzX0N.png

 

1) Possible cause:

Passing through both an M1015 and an MV8. This disk looks like it is on the MV8, or is it? I can't tell at this point... If it is on the MV8, then these are the cables that are connected (bought on eBay):

 

http://www.ebay.com/itm/110931840838?ssPageName=STRK:MEWNX:IT&_trksid=p3984.m1439.l2649

 

Will purchase these to test: http://www.newegg.com/Product/Product.aspx?Item=N82E16816133033

 

2) Possible cause:

Another user claims irqpoll seems to help resolve similar issues. Not sure if this is relevant for recent kernels (v5rc8a).

append initrd=bzroot irqpoll

 

http://lime-technology.com/forum/index.php?topic=918.msg6193#msg6193

Not sure what this does...

EDIT: irqpoll did not solve it; it happened again with the parameter in syslinux.cfg.

 

Will try: "noirqdebug" as per this thread http://lime-technology.com/forum/index.php?topic=19593.msg175182#msg175182 and report back

 

EDIT: "noirqdebug" did not solve it; it happened again with the parameter in syslinux.cfg.
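For anyone trying the same parameters: as far as the kernel's parameter documentation describes them, irqpoll makes the kernel search all handlers when an interrupt goes unhandled, and noirqdebug stops the kernel from disabling an interrupt line it believes nobody is handling (the "Disabling IRQ #16" message). The Python sketch below shows one way to add a parameter to the append line and to confirm it actually reached the booted kernel. The sample syslinux.cfg text and the helper names are assumptions made for illustration.

```python
def add_kernel_param(cfg_text, param):
    """Append `param` to every 'append' line that does not already carry it."""
    out = []
    for line in cfg_text.splitlines():
        words = line.split()
        if words and words[0].lower() == "append" and param not in words:
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"

def active_params(cmdline, wanted):
    """Report which of the wanted parameters appear on the booted command line."""
    present = set(cmdline.split())
    return {p: (p in present) for p in wanted}

# Hypothetical minimal syslinux.cfg; a real one has more entries.
cfg = "label unRAID OS\n  kernel bzimage\n  append initrd=bzroot\n"
new_cfg = add_kernel_param(cfg, "noirqdebug")
print(new_cfg)

# On the running system this string would come from open("/proc/cmdline").read();
# the value here is a made-up example.
booted = "initrd=bzroot noirqdebug"
print(active_params(booted, ["irqpoll", "noirqdebug"]))
```

Checking /proc/cmdline after a reboot rules out the case where the edit went to the wrong boot entry and the parameter was never applied.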

 

Here is /proc/interrupts:

            CPU0       CPU1       CPU2       CPU3
   0:         23          0          0          0  IO-APIC-edge     timer
   1:          9          0          0          0  IO-APIC-edge     i8042
   6:          0          3          0          0  IO-APIC-edge     floppy
   7:          0          0          0          0  IO-APIC-edge     parport0
   9:          0          0          0          0  IO-APIC-fasteoi  acpi
  12:          4          0          0          0  IO-APIC-edge     i8042
  14:         43          0          0          0  IO-APIC-edge     ide0
  15:          0          0          0          0  IO-APIC-edge     ide1
  16:       8822          0          0          0  IO-APIC-fasteoi  ehci_hcd:usb1
  17:       6382       7547       7397       7241  IO-APIC-fasteoi  ioc0
  18:     434044          0          0          0  IO-APIC-fasteoi  uhci_hcd:usb2
  19:    1726732          0          0          0  IO-APIC-fasteoi  mvsas
  40:          0          0          0          0  PCI-MSI-edge     PCIe PME
  41:          0          0          0          0  PCI-MSI-edge     PCIe PME
  42:          0          0          0          0  PCI-MSI-edge     PCIe PME
  43:          0          0          0          0  PCI-MSI-edge     PCIe PME
  44:          0          0          0          0  PCI-MSI-edge     PCIe PME
  45:          0          0          0          0  PCI-MSI-edge     PCIe PME
  46:          0          0          0          0  PCI-MSI-edge     PCIe PME
  47:          0          0          0          0  PCI-MSI-edge     PCIe PME
  48:          0          0          0          0  PCI-MSI-edge     PCIe PME
  49:          0          0          0          0  PCI-MSI-edge     PCIe PME
  50:          0          0          0          0  PCI-MSI-edge     PCIe PME
  51:          0          0          0          0  PCI-MSI-edge     PCIe PME
  52:          0          0          0          0  PCI-MSI-edge     PCIe PME
  53:          0          0          0          0  PCI-MSI-edge     PCIe PME
  54:          0          0          0          0  PCI-MSI-edge     PCIe PME
  55:          0          0          0          0  PCI-MSI-edge     PCIe PME
  56:          0          0          0          0  PCI-MSI-edge     PCIe PME
  57:          0          0          0          0  PCI-MSI-edge     PCIe PME
  58:          0          0          0          0  PCI-MSI-edge     PCIe PME
  59:          0          0          0          0  PCI-MSI-edge     PCIe PME
  60:          0          0          0          0  PCI-MSI-edge     PCIe PME
  61:          0          0          0          0  PCI-MSI-edge     PCIe PME
  62:          0          0          0          0  PCI-MSI-edge     PCIe PME
  63:          0          0          0          0  PCI-MSI-edge     PCIe PME
  64:          0          0          0          0  PCI-MSI-edge     PCIe PME
  65:          0          0          0          0  PCI-MSI-edge     PCIe PME
  66:          0          0          0          0  PCI-MSI-edge     PCIe PME
  67:          0          0          0          0  PCI-MSI-edge     PCIe PME
  68:          0          0          0          0  PCI-MSI-edge     PCIe PME
  69:          0          0          0          0  PCI-MSI-edge     PCIe PME
  70:          0          0          0          0  PCI-MSI-edge     PCIe PME
  71:          0          0          0          0  PCI-MSI-edge     PCIe PME
  72:      27511    6575494      40006      18756  PCI-MSI-edge     eth0-rxtx-0
  73:      14498      24692    6116212      16706  PCI-MSI-edge     eth0-rxtx-1
  74:      15964      23463      20818    4689849  PCI-MSI-edge     eth0-rxtx-2
  75:    5796124      23256      15030      20082  PCI-MSI-edge     eth0-rxtx-3
  76:          0          0          0          0  PCI-MSI-edge     eth0-event-4
  77:    1462937     191977     217932     213623  PCI-MSI-edge     mpt2sas0-msix0
  78:          0          0          0          0  PCI-MSI-edge     vmci
  79:          0          0          0          0  PCI-MSI-edge     vmci
 NMI:          0          0          0          0  Non-maskable interrupts
 LOC:   22483901   22483964   22483853   22483873  Local timer interrupts
 SPU:          0          0          0          0  Spurious interrupts
 PMI:          0          0          0          0  Performance monitoring interrupts
 IWI:          0          0          0          0  IRQ work interrupts
 RTR:          0          0          0          0  APIC ICR read retries
 RES:   10282630   12919166   10952486   11598716  Rescheduling interrupts
 CAL:      69033      50148     571038     597827  Function call interrupts
 TLB:     532814     415962     428039     359889  TLB shootdowns
 TRM:          0          0          0          0  Thermal event interrupts
 THR:          0          0          0          0  Threshold APIC interrupts
 MCE:          0          0          0          0  Machine check exceptions
 MCP:        750        750        750        750  Machine check polls
 ERR:          0
 MIS:          0
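To compare a dump like this between a good boot and a bad boot, it helps to turn it into per-IRQ totals. Here is a minimal Python sketch, assuming the usual /proc/interrupts layout of a label, one count per CPU, then a free-text description; the function names are made up for this example.

```python
def parse_interrupts(text):
    """Map each IRQ label to (list of per-CPU counts, description string)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())            # header row: CPU0 CPU1 ...
    rows = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():         # rows such as ERR/MIS carry fewer counts
                break
            counts.append(int(field))
        rows[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return rows

def totals(rows):
    """Sum the per-CPU counts for every row."""
    return {label: sum(counts) for label, (counts, _) in rows.items()}

# A few rows copied from the dump above.
sample = """\
          CPU0      CPU1      CPU2      CPU3
 16:      8822         0         0         0  IO-APIC-fasteoi  ehci_hcd:usb1
 17:      6382      7547      7397      7241  IO-APIC-fasteoi  ioc0
 19:   1726732         0         0         0  IO-APIC-fasteoi  mvsas
ERR:         0
"""
rows = parse_interrupts(sample)
for label, total in totals(rows).items():
    print(label, total, rows[label][1])
```

In this particular dump, IRQ 16 lists only ehci_hcd:usb1 while mvsas sits alone on IRQ 19, so the "Disabling IRQ #16" message does not obviously point at the MV8; a dump taken on a boot where the disk is missing would be the more telling comparison.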

 

 

3) Possible cause:

PSU related?

Case: SC933T-R760B has a Triple Redundant PSU, but only one is plugged in at this point

http://www.servethehome.com/supermicro-sc933t-r760b-3u-15x-35-sassata-storage-chassis-review/

 

Not very likely.

 

4) Possible cause:

Drive related?

 

Output of "smartctl -a -d ata /dev/sdn" is attached. Does this look bad? Reallocated_Sector_Ct is 0, so the drive should be OK. In any case, I pulled it from the array and I'm currently running SpinRite on it. Will report back...

 

  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       144973736
195 Hardware_ECC_Recovered  0x001a   035   020   000    Old_age   Always       -       144973736
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
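The columns in smartctl's attribute table are ID, name, flag, normalized VALUE, WORST, THRESH, type, update policy, WHEN_FAILED and raw value. smartmontools treats an attribute as failed when the normalized value is at or below its threshold, so that comparison is a more reliable first check than the raw numbers. Below is a small Python sketch of that check; the parser is an illustration that assumes whitespace-separated columns as shown above.

```python
def parse_attr(line):
    """Split one smartctl attribute row into named fields."""
    f = line.split()
    return {
        "id": int(f[0]),
        "name": f[1],
        "value": int(f[3]),    # normalized current value
        "worst": int(f[4]),    # lowest normalized value recorded
        "thresh": int(f[5]),   # failure threshold
        "type": f[6],          # Pre-fail or Old_age
        "raw": " ".join(f[9:]),
    }

def failing(attr):
    """True if the attribute is at or below its threshold now, or ever was."""
    return attr["value"] <= attr["thresh"] or attr["worst"] <= attr["thresh"]

# The three rows quoted above.
rows = [
    "1 Raw_Read_Error_Rate 0x000f 117 099 006 Pre-fail Always - 144973736",
    "195 Hardware_ECC_Recovered 0x001a 035 020 000 Old_age Always - 144973736",
    "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0",
]
attrs = [parse_attr(r) for r in rows]
for a in attrs:
    status = "FAILING" if failing(a) else "ok"
    print(f'{a["id"]:>3} {a["name"]:<24} value={a["value"]} thresh={a["thresh"]} {status}')
```

By this check none of the three attributes is failing. The large raw values for attributes 1 and 195 are commonly reported on Seagate drives and are usually not read as a plain error count, though that interpretation is community knowledge rather than something the output itself confirms.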

 

EDIT: I now think this is more likely drive related. I shut down the unRAID VM and the LED on the drive bay has yet to go off... it remains solid. Very weird.

 

 

5) Possible cause:

Kernel bug or ESXi incompatibility?

 

Any thoughts would be really appreciated.

 

esxi 5.1

unraid v5rc8a

mobo: supermicro x8sil

smart_output_sdn.txt

syslog.txt.zip

  • 3 months later...

Did you ever figure this out? It sounds like the issues you are having are similar to mine.

  • Author

Hi jesseasi,

 

I just noticed your reply.

 

Basically, I did just about everything to try to solve this issue. Ultimately, I solved it by swapping the MV8 for another M1015. I really wish I had not had to, as I had plans for that M1015, but alas, I couldn't live with the MV8 in passthrough mode. It was just unreliable. I know it works well for some.

 

As you can see, I tried swapping out cables, I knew that the drive in question was good, and I tried various kernel parameters. The MV8 worked flawlessly on bare metal. To me, it seemed to be a low-level/driver problem that was beyond me. As a consequence, I have a spare MV8...

  • 4 weeks later...

I have this same issue, with random drives randomly not showing up for assignment in unRAID. I also use MV8s, and it happens to drives on each of them.

Unless someone comes up with a better idea than to buy the LSI card, I'll probably just restart the unraid VM 2x to get them to show up.

Luckily I don't restart unraid that often.

Archived

This topic is now archived and is closed to further replies.
