johnsanc

Members
  • Content Count

    172

Everything posted by johnsanc

  1. I have the X570 Creator. Here are the IOMMU groups after the BIOS tweaks. Note that USB passthrough with these X570 boards is not great. One of the controllers does not support Reset and the other requires a bit of config editing to pass through properly. Basically everything in this thread applies to this board as well:
      IOMMU group 0: [1022:1482] 00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
      IOMMU group 1: [1022:1483] 00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
      IOMMU group 2: [1022:1483] 00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
      IOMMU group 3: [1022:1482] 00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
      IOMMU group 4: [1022:1482] 00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
      IOMMU group 5: [1022:1483] 00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
      IOMMU group 6: [1022:1483] 00:03.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
      IOMMU group 7: [1022:1482] 00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
      IOMMU group 8: [1022:1482] 00:05.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
      IOMMU group 9: [1022:1482] 00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
      IOMMU group 10: [1022:1484] 00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
      IOMMU group 11: [1022:1482] 00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
      IOMMU group 12: [1022:1484] 00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
      IOMMU group 13: [1022:1484] 00:08.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
      IOMMU group 14: [1022:1484] 00:08.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
      IOMMU group 15: [1022:790b] 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
                      [1022:790e] 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
      IOMMU group 16: [1022:1440] 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 0
                      [1022:1441] 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 1
                      [1022:1442] 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 2
                      [1022:1443] 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 3
                      [1022:1444] 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 4
                      [1022:1445] 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 5
                      [1022:1446] 00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 6
                      [1022:1447] 00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 7
      IOMMU group 17: [1022:57ad] 01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse Switch Upstream
      IOMMU group 18: [1022:57a3] 02:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
      IOMMU group 19: [1022:57a3] 02:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
      IOMMU group 20: [1022:57a3] 02:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
      IOMMU group 21: [1022:57a4] 02:08.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
                      [1022:1485] 2e:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
                      [1022:149c] 2e:00.1 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
                      [1022:149c] 2e:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
      IOMMU group 22: [1022:57a4] 02:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
                      [1022:7901] 2f:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
      IOMMU group 23: [1022:57a4] 02:0a.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
                      [1022:7901] 30:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
      IOMMU group 24: [8086:15ea] 03:00.0 PCI bridge: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] (rev 06)
      IOMMU group 25: [8086:15ea] 04:00.0 PCI bridge: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] (rev 06)
      IOMMU group 26: [8086:15ea] 04:01.0 PCI bridge: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] (rev 06)
      IOMMU group 27: [8086:15ea] 04:02.0 PCI bridge: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] (rev 06)
      IOMMU group 28: [8086:15ea] 04:04.0 PCI bridge: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] (rev 06)
      IOMMU group 29: [8086:15eb] 05:00.0 System peripheral: Intel Corporation JHL7540 Thunderbolt 3 NHI [Titan Ridge 4C 2018] (rev 06)
      IOMMU group 30: [8086:15ec] 07:00.0 USB controller: Intel Corporation JHL7540 Thunderbolt 3 USB Controller [Titan Ridge 4C 2018] (rev 06)
      IOMMU group 31: [1b21:1187] 24:00.0 PCI bridge: ASMedia Technology Inc. Device 1187
      IOMMU group 32: [1b21:1187] 25:01.0 PCI bridge: ASMedia Technology Inc. Device 1187
                      [1b21:0612] 26:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 02)
      IOMMU group 33: [1b21:1187] 25:02.0 PCI bridge: ASMedia Technology Inc. Device 1187
                      [8086:1539] 27:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
      IOMMU group 34: [1b21:1187] 25:03.0 PCI bridge: ASMedia Technology Inc. Device 1187
                      [8086:2723] 28:00.0 Network controller: Intel Corporation Wi-Fi 6 AX200 (rev 1a)
      IOMMU group 35: [1b21:1187] 25:04.0 PCI bridge: ASMedia Technology Inc. Device 1187
                      [1b21:0612] 29:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 02)
      IOMMU group 36: [1b21:1187] 25:05.0 PCI bridge: ASMedia Technology Inc. Device 1187
      IOMMU group 37: [1b21:1187] 25:06.0 PCI bridge: ASMedia Technology Inc. Device 1187
                      [10de:128b] 2b:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)
                      [10de:0e0f] 2b:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
      IOMMU group 38: [1b21:1187] 25:07.0 PCI bridge: ASMedia Technology Inc. Device 1187
      IOMMU group 39: [1d6a:07b1] 2d:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
      IOMMU group 40: [1987:5016] 31:00.0 Non-Volatile memory controller: Phison Electronics Corporation E16 PCIe4 NVMe Controller (rev 01)
      IOMMU group 41: [10de:1e84] 32:00.0 VGA compatible controller: NVIDIA Corporation TU104 [GeForce RTX 2070 SUPER] (rev a1)
                      [10de:10f8] 32:00.1 Audio device: NVIDIA Corporation TU104 HD Audio Controller (rev a1)
                      [10de:1ad8] 32:00.2 USB controller: NVIDIA Corporation TU104 USB 3.1 Host Controller (rev a1)
                      [10de:1ad9] 32:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C UCSI Controller (rev a1)
      IOMMU group 42: [1000:0087] 33:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)
      IOMMU group 43: [1022:148a] 34:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function
      IOMMU group 44: [1022:1485] 35:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
      IOMMU group 45: [1022:1486] 35:00.1 Encryption controller: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP
      IOMMU group 46: [1022:149c] 35:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
      IOMMU group 47: [1022:1487] 35:00.4 Audio device: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller
      IOMMU group 48: [1022:7901] 36:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
      IOMMU group 49: [1022:7901] 37:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
  2. I've moved both SSDs to the AMD controller. Fingers crossed that resolves the issues.
  3. A few hours ago some issues arose, starting from an IO_PAGE_FAULT event in the logs. From there I started getting a ton of errors on my cache pool, and eventually disk9 was kicked from the array even though nothing should have been written to it, since the only activity at that time was going to the cache drives. Likely related to this topic... only now I have the full diagnostics leading up to the errors. Should I move both SSDs to the same controller and off of the ASMedia controller? For example, moving to 30:00.0. Any help or suggestions are appreciated - this seems to keep happening every day or two ever since upgrading to 6.8.2. tower-diagnostics-20200202-1849.zip
  4. I was downloading something in my Windows 10 VM when the connection dropped out. A reboot did not fix the issue and I'm not sure how to get diagnostics. Attached are photos of the syslog after the dropout and after reboot from the local GUI. Any idea what happened and how to resolve it? EDIT: Disregard, I'm an idiot - the laptop I was trying to access it from was on my work VPN. Doh!
  5. I recently had some issues with my cache pool. My SSDs were both beyond their LBAs-written warranty - however I wasn't seeing anything in the SMART reports that indicated a bad disk aside from a handful of CRC errors, which I thought are usually attributed to bad cables. Sometimes one of the disks would just drop out with an unmountable filesystem. And sometimes it seems to have even impacted other disks connected to the motherboard (not sure if related). I ended up replacing these drives and now all of my issues seem to be resolved... I am closely monitoring my logs, so hopefully this was actually the root cause of my issues. If not, then at least I ruled out one possibility. So I guess my question is, what are things to look out for to know when an SSD is dying? I know they don't really die the same way as mechanical drives. Also, can a dying SSD impact other disks, particularly if mover is running?
  6. I get this error on every startup with my new X570 board. Is it anything to be concerned about? Jan 31 19:45:37 Tower ntpd[2286]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized I couldn't find much info on this error, but it sounds like ntpd is starting too early. I also didn't know if it's related to this at all:
  7. Thanks again @johnnie.black - you always seem to come to the rescue for things like this. It's a testament to the community and one reason why I stick with unRAID even when weird issues come up every now and then. Thanks to your helpful FAQ I was able to manually mount one of the drives, copy data to the array, format the cache pool, and move data back. ... Now if only I could figure out why it happened in the first place. If something like this happens again I'll be sure to provide complete diagnostics before rebooting.
  8. Thanks, I'll give that a try. I am really struggling to see the value of a btrfs mirrored cache pool. It seems like every issue that occurs leaves only a slim chance of recovering data from either drive. There really should be an easy way to at least convert a disk from the pool into a single cache drive. The pool really does give a false sense of security.
  9. Unfortunately I don't, but here is the old thread from the last time I had a cache pool issue, which was not long ago (completely different hardware though). In the future I will download diagnostics before any reboots. I had syslog saving to the cache pool... but obviously that won't help in this case.
  10. I am having a bunch of weird issues with 6.8.2 that I have never seen before. I am not sure if it's just a coincidence and I have cables / drives failing, or if there is an issue with my hardware and 6.8.2. Basically I woke up this morning with a disk in my cache pool marked as unmountable with no file system. I rebooted and the issue persisted. I noticed that the syslog says a UUID is missing, and when I run blkid I can see that the disk now has a different UUID than it did previously. Any idea why this happened? What are the steps for me to restore my cache pool, or at least save the data that is on the other disk? tower-diagnostics-20200131-1241.zip
  11. @unbalanced - How did you figure out what was causing the issue? I do not have any additional network cards, but I am still plagued by this issue, which I only recall happening with 6.8.2... Maybe I'll try to downgrade and see if rebooting works normally.
  12. Thanks, I am already using the latest BIOS. I suppose there are still a few kinks to work out with X570. In the meantime I tried this: https://forum.level1techs.com/t/devops-workstation-fixing-nvme-trim-on-linux/148354 I did not have a TRIM job going at the time, but I figured it's worth a shot anyway before I replace cables. Also, I believe all of the errors around 5:30 AM were related to TRIM of my cache pool. It seems as if writes are blocked while TRIM is running. I turned off Docker and re-ran TRIM and got no errors. The weird boot loop also seems to be resolved as long as I use legacy boot instead of UEFI.
  13. I also ran into this, but mine cut off on the "tsc-early" line - I had to do a hard power down and power back on to get past the boot loop. I'm beginning to think there's something about these X570 boards that doesn't reboot like others I've used. It seems like I usually run into problems if I just let it reboot, but things are much more stable with a power down / power on.
  14. Interesting - I wonder why virtualization would cause an issue with that. I obviously am not passing that through to my VM. Are there any BIOS settings or anything I should look into?
  15. Attached are my diagnostics from this morning as well as the relevant portion of my syslog since the time I started mover (a few reboots ago). Just skimming through it I can see there are a few issues... but any interpretation and recommendations are welcome. Also, not sure if it's related or not, but I've had weird issues lately with this new hardware (X570 board) where sometimes unraid gets stuck in a boot loop and won't make it past extracting /bzimage. I never had these issues with my older hardware just a few weeks ago. I suppose the flash could be dying, but it would be a weird coincidence for the flash to start dying right when I get new hardware. syslog-johnsanc.log tower-diagnostics-20200130-0921.zip
  16. I recently invoked the mover script and after a while I noticed 2 disks were marked as disabled due to write errors:
      Disk 7 = 52 writes, 2252 errors (Are these read errors?)
      Disk 15 = 546,971 writes, 731 errors
      However, the mover was only moving files to Disk 15. I do have turbo write enabled; could Disk 7 read errors cause Disk 15 write errors? Also, could a cabling issue cause this? Both disks look fine in the SMART reports. I should also mention that I just completed a parity check with about 90,000 sync errors due to a mysterious reboot that happened in the middle of the night. The server started back up and started a parity check, so I let it run. Would a parity check log errors for individual disks? If it was a cabling issue I'm surprised I didn't get individual disk errors during the check.
      Errors on Disk7:
      Jan 29 22:14:03 Tower kernel: ata2.00: failed command: READ DMA EXT
      Jan 29 22:14:03 Tower kernel: ata2.00: cmd 25/00:00:50:1f:34/00:04:88:01:00/e0 tag 28 dma 524288 in
      Jan 29 22:14:03 Tower kernel: ata2.00: status: { DRDY }
      Jan 29 22:14:03 Tower kernel: ata2: hard resetting link
      Jan 29 22:14:13 Tower kernel: ata2: softreset failed (1st FIS failed)
      Jan 29 22:14:13 Tower kernel: ata2: hard resetting link
      Jan 29 22:14:23 Tower kernel: ata2: softreset failed (1st FIS failed)
      Jan 29 22:14:23 Tower kernel: ata2: hard resetting link
      Jan 29 22:14:58 Tower kernel: ata2: softreset failed (1st FIS failed)
      Jan 29 22:14:58 Tower kernel: ata2: limiting SATA link speed to 3.0 Gbps
      Jan 29 22:14:58 Tower kernel: ata2: hard resetting link
      Jan 29 22:15:03 Tower kernel: ata2: softreset failed (1st FIS failed)
      Jan 29 22:15:03 Tower kernel: ata2: reset failed, giving up
      Jan 29 22:15:03 Tower kernel: ata2.00: disabled
      Jan 29 22:15:03 Tower kernel: ata2: EH complete
      Jan 29 22:15:03 Tower kernel: sd 3:0:0:0: [sdc] tag#29 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
      Jan 29 22:15:03 Tower kernel: sd 3:0:0:0: [sdc] tag#29 CDB: opcode=0x88 88 00 00 00 00 01 88 34 23 50 00 00 05 40 00 00
      Jan 29 22:15:03 Tower kernel: print_req_error: I/O error, dev sdc, sector 6580085584
      Jan 29 22:15:03 Tower kernel: sd 3:0:0:0: [sdc] tag#30 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
      Jan 29 22:15:03 Tower kernel: sd 3:0:0:0: [sdc] tag#30 CDB: opcode=0x88 88 00 00 00 00 01 88 34 28 90 00 00 05 40 00 00
      Jan 29 22:15:03 Tower kernel: print_req_error: I/O error, dev sdc, sector 6580086928
      Jan 29 22:15:03 Tower kernel: sd 3:0:0:0: [sdc] tag#28 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
      Jan 29 22:15:03 Tower kernel: sd 3:0:0:0: [sdc] tag#28 CDB: opcode=0x88 88 00 00 00 00 01 88 34 1f 50 00 00 04 00 00 00
      Jan 29 22:15:03 Tower kernel: print_req_error: I/O error, dev sdc, sector 6580084560
      Errors on Disk15:
      Jan 29 22:12:28 Tower kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
      Jan 29 22:12:28 Tower kernel: ata1.00: failed command: WRITE DMA EXT
      Jan 29 22:12:28 Tower kernel: ata1.00: cmd 35/00:00:30:cb:38/00:04:88:01:00/e0 tag 15 dma 524288 out
      Jan 29 22:12:28 Tower kernel: ata1.00: status: { DRDY }
      Jan 29 22:12:28 Tower kernel: ata1: hard resetting link
      Jan 29 22:12:28 Tower kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      Jan 29 22:12:33 Tower kernel: ata1.00: qc timeout (cmd 0xec)
      Jan 29 22:12:33 Tower kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
      Jan 29 22:12:33 Tower kernel: ata1.00: revalidation failed (errno=-5)
      Jan 29 22:12:33 Tower kernel: ata1: hard resetting link
      Jan 29 22:12:43 Tower kernel: ata1: softreset failed (1st FIS failed)
      Jan 29 22:12:43 Tower kernel: ata1: hard resetting link
      Jan 29 22:12:53 Tower kernel: ata1: softreset failed (1st FIS failed)
      Jan 29 22:12:53 Tower kernel: ata1: hard resetting link
      Jan 29 22:13:28 Tower kernel: ata1: softreset failed (1st FIS failed)
      Jan 29 22:13:28 Tower kernel: ata1: limiting SATA link speed to 3.0 Gbps
      Jan 29 22:13:28 Tower kernel: ata1: hard resetting link
      Jan 29 22:13:33 Tower kernel: ata1: softreset failed (1st FIS failed)
      Jan 29 22:13:33 Tower kernel: ata1: reset failed, giving up
      Jan 29 22:13:33 Tower kernel: ata1.00: disabled
      Jan 29 22:13:33 Tower kernel: ata1: EH complete
      Jan 29 22:13:33 Tower kernel: sd 2:0:0:0: [sdb] tag#12 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
      Jan 29 22:13:33 Tower kernel: sd 2:0:0:0: [sdb] tag#12 CDB: opcode=0x8a 8a 00 00 00 00 01 88 38 cb 30 00 00 04 00 00 00
      Jan 29 22:13:33 Tower kernel: print_req_error: I/O error, dev sdb, sector 6580390704
      Jan 29 22:13:33 Tower kernel: sd 2:0:0:0: [sdb] Read Capacity(16) failed: Result: hostbyte=0x04 driverbyte=0x00
      Jan 29 22:13:33 Tower kernel: sd 2:0:0:0: [sdb] Sense not available.
      Jan 29 22:13:33 Tower kernel: sd 2:0:0:0: [sdb] Read Capacity(10) failed: Result: hostbyte=0x04 driverbyte=0x00
      Jan 29 22:13:33 Tower kernel: sd 2:0:0:0: [sdb] Sense not available.
      Jan 29 22:13:33 Tower kernel: sd 2:0:0:0: [sdb] 0 512-byte logical blocks: (0 B/0 B)
      Jan 29 22:13:33 Tower kernel: sd 2:0:0:0: [sdb] 4096-byte physical blocks
      Jan 29 22:13:33 Tower kernel: sdb: detected capacity change from 10000831348736 to 0
      Jan 29 22:13:33 Tower kernel: print_req_error: I/O error, dev sdb, sector 6580391728
      Jan 29 22:13:33 Tower kernel: print_req_error: I/O error, dev sdb, sector 6580392056
      Jan 29 22:13:33 Tower kernel: print_req_error: I/O error, dev sdb, sector 6580393088
      Jan 29 22:13:33 Tower kernel: print_req_error: I/O error, dev sdb, sector 6580084024
      Jan 29 22:15:03 Tower kernel: print_req_error: I/O error, dev sdb, sector 6580392056
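When a syslog excerpt like the one above gets long, it can help to tally the kernel's "I/O error, dev ..." lines per device before reading the details. The following is only a rough sketch: `count_io_errors` is a made-up helper name, and it assumes the log uses the same `I/O error, dev sdX` wording seen in the post.

```shell
#!/bin/bash
# Hypothetical helper (not part of Unraid): count kernel "I/O error, dev X"
# lines per device. Reads syslog text on stdin, prints "device count" pairs.
count_io_errors() {
    # Keep only the "I/O error, dev sdX" fragment of each matching line,
    # then count occurrences of the last word (the device name).
    grep -o 'I/O error, dev [a-z0-9]*' \
        | awk '{ count[$NF]++ } END { for (dev in count) print dev, count[dev] }' \
        | sort
}
```

Possible usage: `count_io_errors < /var/log/syslog`. This only summarizes what the kernel logged; it does not say whether the cause is the disk, the cable, or the controller.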
  17. Thanks all. Yeah, needing to put in the local IP of the unraid server was not intuitive for the Syslog Server; the help content should be updated for that. I will report back with proper logging if any other crashes occur.
  18. Yes, I do have a UPS. None of the clocks in my house were reset, so I don't think it was power related. I'll keep an eye on the logs to see if anything else weird happens. Is this user script the best way to capture logs leading up to a crash?
      #!/bin/bash
      # Follow the live syslog and mirror it to a timestamped file on the flash drive
      mkdir -p /boot/logs
      FILENAME="/boot/logs/syslog-$(date +%s)"
      tail -f /var/log/syslog > "$FILENAME"
  19. Not sure if this is related to 6.8.2 or just a coincidence. The upgrade went fine, but at 5:30 AM last night my server rebooted on its own and I woke up to a Parity Check running. So far it has found 98 sync errors. I've NEVER seen the server reboot on its own after it's up and running.
  20. The System Devices screen is really helpful for looking at IOMMU groups and devices to pass through to VMs. However, it currently does not display which devices are Reset enabled, which is particularly useful for USB controllers. I use SpaceInvaderOne's command to look at this. Adding this extra piece of info to System Devices would be a simple but welcome addition.
      for iommu_group in $(find /sys/kernel/iommu_groups/ -maxdepth 1 -mindepth 1 -type d); do
        echo "IOMMU group $(basename "$iommu_group")"
        for device in $(\ls -1 "$iommu_group"/devices/); do
          if [[ -e "$iommu_group"/devices/"$device"/reset ]]; then echo -n "[RESET]"; fi
          echo -n $'\t'; lspci -nns "$device"
        done
      done
  21. I have the same issue with 6.8.1 - the vfio-pci.cfg file appears to do nothing. I really need this functionality since my X570 board uses the same ID for the USB controllers in multiple groups. EDIT: After reading several threads about this exact same issue I have a feeling the timing of when the cfg is loaded doesn't work for everyone. I was able to get binding to work properly using the old-school way of passing through devices. I edited my "/boot/config/go" file:
      #!/bin/bash
      # Start the Management Utility.
      /usr/local/sbin/emhttp &
      # Bind USB For VM
      /usr/local/sbin/vfio-bind 0000:35:00.3
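Because several controllers on this board share one vendor:device ID, binding has to be done by PCI address rather than by ID. As a small sketch of how to list every address that carries a given ID, the function below filters `lspci -Dnn` style text; `pci_addresses_for_id` is a hypothetical name, not an Unraid or pciutils tool.

```shell
#!/bin/bash
# Hypothetical helper: print the full PCI address (domain:bus:dev.fn) of every
# device whose line contains the given [vendor:device] ID.
# Expects `lspci -Dnn` style text on stdin, where the address is the first field.
pci_addresses_for_id() {
    local id="$1"
    # index() does a literal substring match, so the brackets are not a regex.
    awk -v id="[$id]" 'index($0, id) { print $1 }'
}
```

Possible usage: `lspci -Dnn | pci_addresses_for_id 1022:149c`. On the board described above this would be expected to list all three Matisse USB controllers, from which the one to bind (0000:35:00.3 in the post) still has to be chosen by hand.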
  22. @bastl - Not sure if there is a configuration somewhere to avoid symlinks. I just divided my external storages up so they didn't include anything that would update frequently or include any symlinks.
  23. FYI - I figured out my issue above.... in short do not have Nextcloud index symlinks that would cause a loop.
  24. I think I figured this out - and it's totally user error: not being careful and mounting something as external storage that I shouldn't have. Basically there were two issues: 1) Nextcloud does NOT index symlinks properly. I had literally millions of rows in the oc_filecache table that were junk looped paths due to symlinks. 2) Something that was mounted was symlinked to system paths where files change basically non-stop, which causes constant modified-date updates. I ended up having well over 400 million rows. Instead of cleaning it up I just truncated the entire oc_filecache table and started fresh with correct external storage mounts. At least 1 TB 860 EVO drives are on sale for $105 at Best Buy for Black Friday...
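One way to catch this kind of problem before mounting a directory as external storage is to look for symlinks that point back at one of their own parent directories, since those are the ones that produce endlessly nested paths. This is a minimal sketch under that assumption: `find_symlink_loops` is a made-up name, it relies on GNU `readlink -f`, and it does not detect links into busy system paths (the second issue above).

```shell
#!/bin/bash
# Hypothetical helper: list symlinks under a directory whose resolved target
# is an ancestor of the link itself (following them would loop forever).
find_symlink_loops() {
    local root="$1" link target dir
    find "$root" -type l -print0 | while IFS= read -r -d '' link; do
        # Resolve the link target and the directory that contains the link.
        target=$(readlink -f -- "$link") || continue
        dir=$(readlink -f -- "$(dirname -- "$link")") || continue
        # If the containing directory sits below the target, the link loops.
        case "$dir/" in
            "${target%/}"/*) printf '%s -> %s\n' "$link" "$target" ;;
        esac
    done
}
```

Possible usage: `find_symlink_loops /mnt/user/some_share` (the path is only an example). An empty result means no self-referencing links were found, not that the directory is otherwise safe to index.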
  25. Hey guys - I hope you can help me. Nextcloud appears to be the culprit as the silent destroyer of my SSD cache pool. In less than one year I am at 482 TBW on my two 1TB SSDs. This is WAY over any normal usage. I did some tests last night and this morning and I can consistently stop the rogue disk writes by turning off Nextcloud. If I turn it back on it's fine for a few hours, then they start again. This is a big issue for me and I suspect this may be happening to others as well, going unnoticed unless you check the stats page often or regularly check the wear level or LBAs written in the SMART reports. Any help is appreciated! (Thread linked for more info) EDIT: OK, I think I'm getting closer... I noticed I have a TON of MariaDB logs starting around the time that the writes kicked in: Here is a sample from the log at the time the disk writes went crazy. Any idea what the issue is? I have a fairly standard setup so I'm not really sure what is causing this.
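For anyone checking their own drives, the LBAs-written figure from a SMART report can be turned into terabytes written with a quick calculation. The sketch below assumes each LBA is 512 bytes, which is common but not universal, so check the drive's documentation; `lbas_to_tb` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: convert a SMART "LBAs written" raw value to decimal
# terabytes. The second argument is the LBA size in bytes (assumed 512 here).
lbas_to_tb() {
    local lbas="$1" sector_bytes="${2:-512}"
    # awk does the floating-point arithmetic; 1e12 bytes = 1 TB.
    awk -v n="$lbas" -v s="$sector_bytes" 'BEGIN { printf "%.2f\n", n * s / 1e12 }'
}
```

Possible usage: `lbas_to_tb 1953125000` for a raw value of 1953125000 LBAs. Comparing the result against the drive's rated TBW gives a rough idea of how far through its write warranty it is.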