dnoyeb

Members
  • Posts

    130
  • Joined

  • Last visited

Converted

  • Gender
    Undisclosed

dnoyeb's Achievements

Apprentice (3/14)

10

Reputation

  1. lmao, ok, is that something that should / would have changed when upgrading? Thanks
  2. Upgraded my test machine to the latest rc1 to see how it would do; very basic system, only about 2 plugins. Just noticed docker failed to start, and when looking around I see a cache mount point and a disk1, but no actual "user" anymore... which, if I had to guess, is what is keeping docker from running, since it doesn't have a spot for its files now. Anyways, diag attached. tower-diagnostics-20210813-0950.zip
  3. Thanks man. OK, so then really there is only one drive that has any "real" errors, which are reallocated sectors. Will keep an eye on things going forward and see if anything increases; I pulled a diag from 2019 and that same drive had the same 8 reallocated sectors... so it is probably not what led to the issue. Thanks again so much for your help. Time will tell.
  4. Ah, actually disk 7 has reallocated sectors. Sounds like a candidate to replace... Now to figure out how to upgrade parity to 12tb (from 8tb) without losing my parity, then use the 8tb parity drive as the new disk 7.... Also strange is that just about every newer drive in my system shows nonzero "Raw read error rate" and "Hardware ECC recovered" values... not really sure what to make of that (I haven't looked at those before, so I can't tell whether they have recently increased). Only disk 7 has reallocated sectors, however.
  5. Guess I'll start with doing the file system check on that device... Other than that, I'm at a bit of a loss, since the pool's disks don't show any errors or SMART flags. This thing has been such a rock for so many years!
  6. What's strange is that on that day the system was just completely locked up, the first time that has happened in the 10 years I've had the system.... That, and it did a parity check on startup and didn't find anything.
  7. Needing a little help; got an email today about 3645 parity errors, and ran the check again since I realized my monthly check wasn't set up to auto-correct. Same number, and it shows that it fixed them. This system has been running for almost 10 years at this point; migrated to a new mobo/cpu years ago, but there are drives in there from 2012. Anyways, not sure what to make of it, since the gui doesn't show any drive-specific errors... I did see in my log that the cache drive has some sort of metadata store error and recommends repairing, but I don't see how / why that would result in parity errors. Can someone take a peek at the diag file and advise on how I can tell if there is a drive needing to be replaced? I've got 2 extra 12tb drives sitting in there just waiting to be swapped into the array (precleared already)... Thanks! unraid-diagnostics-20210517-1039.zip
  8. Retested on beta 29, issue still there. Based on further testing and seeing Radek's comment, it occurs in 6.8.3 as well. I tried both cards that were in the box, same issue on each address. Here's the current error that pops up, and the diag is attached:
       Execution error
       internal error: qemu unexpectedly closed the monitor: 2020-09-29T17:39:26.418978Z qemu-system-x86_64: -device vfio-pci,host=0000:03:00.0,id=hostdev0,bus=pci.4,addr=0x0: vfio 0000:03:00.0: Failed to set up TRIGGER eventfd signaling for interrupt INTX-0: VFIO_DEVICE_SET_IRQS failure: Device or resource busy
     One side note I just wanted to bring up: based on Mellanox's website, they're up to driver version 5.x, whereas the diagnostic files I'm looking through seem to show this build is using the 4.0 driver version. Any reason to think that could contribute? Or when these things are passed through, are they completely transparent to unraid? tower-diagnostics-20200929-1351.zip
  9. Oh, I see one I'm going to test tomorrow! "webgui: better handling of multiple nics with vfio-pci". Do you guys prefer that we post testing updates in our bug report thread for record-keeping purposes?
  10. Anonymized version attached; if you need the other version let me know and I'll dm it. Other testing done last night: I tried disabling the usb2.0 ports and had the same issue. I also tried disabling the usb3.0 ports and moved the key over to a 2.0 port, just to make sure it wasn't something funny like that. Neither helped. Thanks for any guidance. tower-diagnostics-20200804-0854.zip
  11. A bit more info: I see these lines in the logs:
       Aug 3 19:04:09 Tower kernel: vfio-pci 0000:03:00.0: enabling device (0100 -> 0102)
       Aug 3 19:04:10 Tower kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19@0x18c
       Aug 3 19:04:10 Tower kernel: genirq: Flags mismatch irq 16. 00000000 (vfio-intx(0000:03:00.0)) vs. 00000080 (ehci_hcd:usb1)
     Hopefully this helps in the troubleshooting.
  12. Trying to get a Mellanox ConnectX-3 card passed through to a VM and having some trouble. To set the stage, I have two of these cards in my unraid server, and I am using one of them for the OS. I used the Tools / System Devices / Bind Selected to VFIO at boot method and have verified that the card is added:
       cat vfio-pci.cfg
       BIND=0000:03:00.0|15b3:1003
     Here is the log showing it was successfully bound at boot of unraid:
       Loading config from /boot/config/vfio-pci.cfg
       BIND=0000:03:00.0|15b3:1003
       ---
       Processing 0000:03:00.0 15b3:1003
       Vendor:Device 15b3:1003 found at 0000:03:00.0
       IOMMU group members (sans bridges):
       /sys/bus/pci/devices/0000:03:00.0/iommu_group/devices/0000:03:00.0
       Binding...
       Successfully bound the device 15b3:1003 at 0000:03:00.0 to vfio-pci
       ---
       vfio-pci binding complete
       Devices listed in /sys/bus/pci/drivers/vfio-pci:
       lrwxrwxrwx 1 root root 0 Aug 3 18:56 0000:03:00.0 -> ../../../../devices/pci0000:00/0000:00:1c.4/0000:03:00.0
       ls -l /dev/vfio/
       total 0
       crw------- 1 root root 249, 0 Aug 3 18:56 12
       crw-rw-rw- 1 root root 10, 196 Aug 3 18:56 vfio
     This card shows up when setting up the VM:
       Other PCI Devices: Mellanox Technologies MT27500 Family [ConnectX-3] | Ethernet controller (03:00.0)
     When that box is checked and the VM is started, this error shows up in the log:
       2020-08-04T00:24:59.369033Z qemu-system-x86_64: -device vfio-pci,host=0000:03:00.0,id=hostdev0,bus=pci.0,addr=0x8: vfio 0000:03:00.0: Failed to set up TRIGGER eventfd signaling for interrupt INTX-0: VFIO_DEVICE_SET_IRQS failure: Device or resource busy
       2020-08-04 00:25:00.476+0000: shutting down, reason=failed
     I'm confused as to how / why it is in use, since it is allocated to VFIO at boot. I have tried enabling "PCIe ACS override" and, for the fun of it, "VFIO allow unsafe interrupts". Neither helped (didn't think they would, but tried anyways). Does anyone have any thoughts on other things to try?
     I am working to set up a RockNSM VM and need to pass through this NIC so the VM can capture the 10G mirror of my uplink to my router. Thanks in advance for any assistance.
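
For anyone reading along, the "genirq: Flags mismatch" kernel line quoted in the posts above can be decoded with a few lines of Python. This is only a rough sketch, not a diagnosis: it assumes the two hex values are the IRQ request flags of the new requester and of the existing holder, in that order, and that 0x80 is the kernel's IRQF_SHARED bit (as defined in include/linux/interrupt.h). The function name parse_flags_mismatch and the field names it returns are made up for illustration.

```python
import re

# Assumption: IRQF_SHARED as defined in the Linux kernel's include/linux/interrupt.h.
IRQF_SHARED = 0x80

# Matches kernel log lines of the form:
#   genirq: Flags mismatch irq 16. 00000000 (vfio-intx(0000:03:00.0)) vs. 00000080 (ehci_hcd:usb1)
MISMATCH_RE = re.compile(
    r"genirq: Flags mismatch irq (?P<irq>\d+)\. "
    r"(?P<new_flags>[0-9a-fA-F]{8}) \((?P<new_name>.+)\) vs\. "
    r"(?P<old_flags>[0-9a-fA-F]{8}) \((?P<old_name>.+)\)"
)


def parse_flags_mismatch(line):
    """Return a dict describing the IRQ conflict, or None if the line does not match.

    Hypothetical helper: the first flags/name pair is treated as the driver
    requesting the IRQ, the second as the driver that already holds it.
    """
    m = MISMATCH_RE.search(line)
    if m is None:
        return None
    new_flags = int(m.group("new_flags"), 16)
    old_flags = int(m.group("old_flags"), 16)
    return {
        "irq": int(m.group("irq")),
        "requester": m.group("new_name"),
        "requester_shared": bool(new_flags & IRQF_SHARED),
        "holder": m.group("old_name"),
        "holder_shared": bool(old_flags & IRQF_SHARED),
    }


if __name__ == "__main__":
    sample = (
        "Aug 3 19:04:10 Tower kernel: genirq: Flags mismatch irq 16. "
        "00000000 (vfio-intx(0000:03:00.0)) vs. 00000080 (ehci_hcd:usb1)"
    )
    print(parse_flags_mismatch(sample))
```

Read this way, the sample line would mean that vfio-intx asked for IRQ 16 without the shared flag while ehci_hcd:usb1 already holds that IRQ as shared, which would fit the "Device or resource busy" error. Comparing against /proc/interrupts on the host should show which devices actually sit on that IRQ.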