Msh100

Members · 7 posts
  1. So I've spent a good hour reseating and dusting all the connections, and now it seems okay. I need to run some preclears and re-sync; if that all goes well, I'll assume this issue was simply down to poor connections.
  2. Sorry to bring up such an old thread, but I never got this solved and recently I have had some time on my hands! Basically, I am still getting the same symptoms. In the original post I mentioned VMs, but that isn't the whole story: it happens anyway. I have replaced the SAS card and that has not helped. I have also upgraded to the latest unraid as of today. To trigger this issue I simply change one disk from "unassigned" to an actual drive. When I do that, all the disks go unassigned and similar syslog messages appear. I have attached the diagnostics you asked for before. Any help would be greatly appreciated! Failing drives are of course a possibility, though I just wouldn't expect this outcome.
     homeserver-diagnostics-20221231-1721.zip
  3. I have disabled IOMMU, so it's unrelated to that. However, I have also noticed the following in the logs:

     Jul 31 12:44:10 HomeServer kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 31 12:44:11 HomeServer kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 31 12:44:12 HomeServer kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 31 12:44:13 HomeServer kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 31 12:44:14 HomeServer kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: _base_fault_reset_work: Running mpt3sas_dead_ioc thread success !!!!
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221103000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: removing handle(0x000a), sas_addr(0x4433221103000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: enclosure logical id(0x590b11c01210fd00), slot(0)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221101000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: removing handle(0x0009), sas_addr(0x4433221101000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: enclosure logical id(0x590b11c01210fd00), slot(2)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221104000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: removing handle(0x000b), sas_addr(0x4433221104000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: enclosure logical id(0x590b11c01210fd00), slot(7)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221106000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: removing handle(0x000c), sas_addr(0x4433221106000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: enclosure logical id(0x590b11c01210fd00), slot(5)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221105000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: removing handle(0x000d), sas_addr(0x4433221105000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: enclosure logical id(0x590b11c01210fd00), slot(6)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221107000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: removing handle(0x000e), sas_addr(0x4433221107000000)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: enclosure logical id(0x590b11c01210fd00), slot(4)
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: unexpected doorbell active!
     Jul 31 12:44:15 HomeServer kernel: mpt2sas_cm0: sending diag reset !!
     Jul 31 12:44:16 HomeServer kernel: mpt2sas_cm0: Invalid host diagnostic register value
     Jul 31 12:44:16 HomeServer kernel: mpt2sas_cm0: System Register set:
     Jul 31 12:44:16 HomeServer kernel: mpt2sas_cm0: diag reset: FAILED

     I guess the best lead right now is that something is up with the SAS controller?
  4. [ 97.777213] tun: Universal TUN/TAP device driver, 1.6
     [ 97.824344] mdcmd (36): check
     [ 97.824353] md: recovery thread: recon D1 D3 ...
     [ 98.034791] mpt3sas 0000:06:00.0: invalid VPD tag 0x00 (size 0) at offset 0; assume missing optional EEPROM
     [ 201.059073] br0: port 2(vnet0) entered blocking state
     [ 201.059077] br0: port 2(vnet0) entered disabled state
     [ 201.059107] device vnet0 entered promiscuous mode
     [ 201.059189] br0: port 2(vnet0) entered blocking state
     [ 201.059190] br0: port 2(vnet0) entered forwarding state
     [ 221.602074] mpt2sas_cm0: SAS host is non-operational !!!!
     [ 222.627071] mpt2sas_cm0: SAS host is non-operational !!!!
     [ 223.650069] mpt2sas_cm0: SAS host is non-operational !!!!
     [ 224.674075] mpt2sas_cm0: SAS host is non-operational !!!!
     [ 225.699076] mpt2sas_cm0: SAS host is non-operational !!!!
     [ 226.722072] mpt2sas_cm0: SAS host is non-operational !!!!
     [ 226.722176] mpt2sas_cm0: _base_fault_reset_work: Running mpt3sas_dead_ioc thread success !!!!
     [ 226.727073] blk_update_request: I/O error, dev sdd, sector 41710208 op 0x0:(READ) flags 0x0 phys_seg 72 prio class 0
     [ 226.727080] md: disk0 read error, sector=41710144
     [ 226.727082] md: disk0 read error, sector=41710152
     --- many of the same message ---
     [ 226.727147] md: disk0 read error, sector=41710704
     [ 226.727148] md: disk0 read error, sector=41710712
     [ 226.730101] sd 7:0:1:0: [sde] tag#2947 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=DRIVER_OK cmd_age=5s
     [ 226.730105] sd 7:0:1:0: [sde] tag#2947 CDB: opcode=0x88 88 00 00 00 00 00 02 7c 72 80 00 00 02 40 00 00
     [ 226.730106] blk_update_request: I/O error, dev sde, sector 41710208 op 0x0:(READ) flags 0x0 phys_seg 72 prio class 0
     [ 226.730110] md: disk29 read error, sector=41710144
     [ 226.730111] md: disk29 read error, sector=41710152
     --- many of the same message ---
     [ 226.730166] md: disk29 read error, sector=41710704
     [ 226.730167] md: disk29 read error, sector=41710712
     [ 226.730182] sd 7:0:3:0: [sdg] tag#2948 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=DRIVER_OK cmd_age=5s
     [ 226.730184] sd 7:0:3:0: [sdg] tag#2948 CDB: opcode=0x88 88 00 00 00 00 00 02 7c 72 80 00 00 02 40 00 00
     [ 226.730185] blk_update_request: I/O error, dev sdg, sector 41710208 op 0x0:(READ) flags 0x0 phys_seg 72 prio class 0
     [ 226.730187] md: disk4 read error, sector=41710144
     [ 226.730188] md: disk4 read error, sector=41710152

     So it's clear at the 201-second mark that I am starting the VM. It's a lot of error messages, so I guess nothing too interesting, however there's one line that jumped out:

     [ 227.026975] pci 0000:06:00.0: Removing from iommu group 14

     IOMMU group 14 is indeed the disk controller. So why is this happening? How can I see how this is tied to the VM in any way, because I can't connect the dots?
  5. A couple of days ago I started to get errors on one of my drives. I thought nothing of it and resumed as normal, until I decided to try rebuilding that drive. During the rebuild, a second drive encountered the same issue. Sensing something was off, I decided to rebuild again; it ran for many, many hours without error, until I started a VM and almost instantly began to get errors on all the drives. I can replicate this problem consistently: I boot unraid, the rebuild starts and runs for however long, until I start a VM, at which point errors come from all directions. I have attempted to Google this and a couple of suggestions pointed to passthrough, but as far as I am aware, I am not passing through anything other than the unraid mount itself (which I have also tried removing, with no luck). I have also changed the PCIe ACS override from multifunctional to disabled. As far as I am aware, I am not doing anything out of the ordinary, and this setup had been running for months prior to the first errors without incident. I won't dismiss the possibility that something is failing, but the fact that it is perfectly reproducible simply by starting a VM makes me think there's something else at play. Any advice would be greatly appreciated!
  6. Hey, hopefully a relatively simple question. I have two NICs in an active-backup bond. How does unraid decide which is the primary NIC? Scenario: one NIC is GbE, the other is 10GbE, and I want the latter to take priority where possible. Thanks
  7. I am trying to set up an Ubuntu 20.04 VM with limited success, and I'm not entirely sure where to start debugging this one. The VM setup is nothing special; here's the configuration: The only "funky" thing I have done is set the primary drive to be a partition on an unmanaged volume. I have changed this to Auto but the results are the same. When I boot, the Ubuntu installer simply does the following: Googling this only showed cases of this happening on VirtualBox. Any tips would be appreciated. If any more debugging information is needed, please let me know what to send! Thanks
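One way to triage dumps like the ones in posts 3 and 4 above is to separate the controller-level events from the flood of per-sector read errors, since the read errors are all downstream of the HBA dropping out. A minimal sketch of such a filter (the regex and sample lines are illustrative, not unraid tooling):

```python
import re

# Match mpt2sas/mpt3sas driver messages, e.g. "mpt2sas_cm0: sending diag reset !!",
# and capture the text after the driver-instance prefix.
EVENT = re.compile(r"(mpt[23]sas\w*): (.+)$")

def controller_events(lines):
    """Return only the HBA driver's messages from a list of syslog lines."""
    events = []
    for line in lines:
        m = EVENT.search(line)
        if m:
            events.append(m.group(2).strip())
    return events

# Sample lines taken from the syslog excerpt in post 4 above.
SAMPLE = [
    "[ 226.722072] mpt2sas_cm0: SAS host is non-operational !!!!",
    "[ 226.727080] md: disk0 read error, sector=41710144",
    "[ 226.722176] mpt2sas_cm0: _base_fault_reset_work: Running mpt3sas_dead_ioc thread success !!!!",
]

if __name__ == "__main__":
    for event in controller_events(SAMPLE):
        print(event)
```

Run against the full syslog, this leaves just the fault sequence (non-operational, dead-IOC thread, doorbell, diag reset) in order, which is easier to line up against what the VM was doing at the time.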
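On the IOMMU angle in posts 4 and 5: when a VM claims a passed-through device, VFIO detaches the entire IOMMU group, so a "Removing from iommu group 14" line for the HBA would make sense if anything the VM touches shares group 14 with it. Listing the groups shows whether that is the case. A sketch assuming the standard sysfs layout, with the parsing split into a pure function so it can be checked without real hardware:

```python
import os
from collections import defaultdict

def group_devices(paths):
    """Map IOMMU group id -> PCI addresses, given sysfs paths like
    /sys/kernel/iommu_groups/14/devices/0000:06:00.0."""
    groups = defaultdict(list)
    for p in paths:
        parts = p.strip("/").split("/")
        gid = int(parts[parts.index("iommu_groups") + 1])
        groups[gid].append(parts[-1])
    return {gid: sorted(devs) for gid, devs in groups.items()}

def scan_sysfs(root="/sys/kernel/iommu_groups"):
    """Collect device paths from a live system (empty dict if IOMMU is off)."""
    paths = []
    if os.path.isdir(root):
        for gid in os.listdir(root):
            dev_dir = os.path.join(root, gid, "devices")
            for dev in os.listdir(dev_dir):
                paths.append(os.path.join(dev_dir, dev))
    return group_devices(paths)

if __name__ == "__main__":
    for gid, devs in sorted(scan_sysfs().items()):
        print(f"IOMMU group {gid}: {' '.join(devs)}")
```

If the controller at 0000:06:00.0 turns out to share its group with some device the VM is given, that would explain the whole HBA vanishing the moment the VM starts.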
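On the bonding question in post 6: with the stock Linux bonding driver, active-backup mode prefers the slave named by the `primary` option and otherwise falls back to link order, and the current state is readable from /proc/net/bonding/bond0. A small parser sketch (the interface names and sample status text are illustrative; how unraid's network settings map onto these driver options is an assumption to verify):

```python
# Field names below match the stock Linux bonding driver's
# /proc/net/bonding/<bond> status file; SAMPLE is illustrative text,
# not taken from the poster's system.
def parse_bond_status(text):
    info = {"slaves": []}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Bonding Mode:"):
            info["mode"] = line.split(":", 1)[1].strip()
        elif line.startswith("Primary Slave:"):
            info["primary"] = line.split(":", 1)[1].strip()
        elif line.startswith("Currently Active Slave:"):
            info["active"] = line.split(":", 1)[1].strip()
        elif line.startswith("Slave Interface:"):
            info["slaves"].append(line.split(":", 1)[1].strip())
    return info

SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: eth1 (primary_reselect always)
Currently Active Slave: eth1
Slave Interface: eth0
Slave Interface: eth1
"""

if __name__ == "__main__":
    status = parse_bond_status(SAMPLE)
    print("active:", status["active"], "| primary:", status["primary"])
```

To pin the 10GbE NIC, the driver accepts `echo eth1 > /sys/class/net/bond0/bonding/primary` as root (or `primary=eth1` in the bonding options); `primary_reselect` then controls whether the bond fails back to it when its link returns.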