Everything posted by JorgeB

  1. Disable VT-d in the BIOS; that error has most likely been there since updating to v6.10, and it can cause other issues. More info below: https://forums.unraid.net/topic/123620-unraid-os-version-6100-available/?do=findComment&comment=1128822
  2. Thanks, the diags confirm that the Samsung device names changed starting with -rc8: there's now just one underscore before the serial where there used to be two. You can correct this by doing the following:
     - unassign all devices from those two pools
     - start the array to make Unraid "forget" the old device names
     - stop the array
     - re-assign the devices to both pools, double-checking that you're assigning the original devices to each pool
     - start the array; the existing pools will be imported and the new names saved
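     As a hypothetical illustration of the naming change (the model and serial below are made up), the new scheme simply collapses the double underscore before the serial; the serial itself is unchanged, so you can match old and new names by normalizing the underscores:

     ```shell
     # Made-up example names: v6.10-rc8+ uses one underscore before the serial,
     # older releases used two; the serial itself does not change.
     old_name="Samsung_SSD_970_EVO_1TB__S466NB0M000000"
     new_name="Samsung_SSD_970_EVO_1TB_S466NB0M000000"

     # Collapse runs of underscores so both forms compare equal
     normalize() { printf '%s\n' "$1" | sed 's/__*/_/g'; }

     if [ "$(normalize "$old_name")" = "$(normalize "$new_name")" ]; then
         echo "same device"
     fi
     ```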
  3. One more update: @RikStigter is helping me confirm whether, as suspected, having IOMMU enabled on these servers is the source of the problem. Preliminary results look positive, since the usual errors logged after updating to v6.10 are gone with VT-d disabled; he will now use the server normally for a few days so we can confirm it remains all good. The issue is possibly caused by the onboard NICs when VT-d is enabled. I can't tell you if it's an HP problem or some Linux issue with the new kernel, but certainly nothing suggests an Unraid problem, and hopefully disabling VT-d fixes this for now. Again, servers with a Pentium or i3 CPU shouldn't have this issue, since those CPUs don't support VT-d, though I would still recommend disabling it in the BIOS, since it's apparently enabled by default; that way, if the server is later upgraded to a Xeon and this issue still exists, there won't be a problem.
  4. That's just the syslog; we need the complete diags.
  5. This should be an easy fix, but first please post the complete diagnostics to confirm the problem is the result of those devices changing ID.
  6. Sorry, I can't really help with that, but from what I understand, what works with macvlan should work the same with ipvlan. You might try posting in the respective docker support threads, if there are any:
  7. It does, but you can either have them together or use the disk you want. There's no right or wrong answer, it just depends on what you want to accomplish.
  8. So if those folders already exist on the 3TB disk, any new data will go there, overriding the allocation method.
  9. For some, but switching to ipvlan should fix it.
  10. Is this an Intel chipset? I've found that the Intel SATA controller is limited to roughly 2GB/s, at least up to 11th gen; I haven't tested with Alder Lake yet.
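      As a rough back-of-the-envelope sketch (the ~2GB/s limit is from the testing above; the drive count and per-drive speed are assumed for illustration), the shared controller limit explains why several fast SATA SSDs can't all run at full speed at once:

      ```shell
      # Assumed numbers for illustration only
      controller_limit=2000   # MB/s, approximate shared limit of the chipset SATA controller
      drive_speed=550         # MB/s, a typical SATA SSD on its own
      drives=6

      # Bandwidth each drive actually gets when all are read in parallel,
      # e.g. during a parity check
      echo $(( controller_limit / drives ))   # 333 MB/s per drive, well under 550
      ```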
  11. It's unlikely this NIC will have good Linux driver support; if possible I would suggest using an add-on NIC, preferably something from Intel.
  12. How is the split level set for that share? The split setting overrides the allocation method.
  13. May 20 03:49:12 TheArk kernel: macvlan_broadcast+0x116/0x144 [macvlan]
      May 20 03:49:12 TheArk kernel: macvlan_process_broadcast+0xc7/0x110 [macvlan]
      Macvlan call traces are usually the result of having dockers with a custom IP address. Switching to ipvlan might fix it (Settings -> Docker Settings -> Docker custom network type -> ipvlan; advanced view must be enabled, top right), or see below for more info: https://forums.unraid.net/topic/70529-650-call-traces-when-assigning-ip-address-to-docker-containers/ See also here: https://forums.unraid.net/bug-reports/stable-releases/690691-kernel-panic-due-to-netfilter-nf_nat_setup_info-docker-static-ip-macvlan-r1356/
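      For anyone managing networks from the command line instead of the Unraid GUI, an ipvlan custom network is created much like a macvlan one. This is a configuration sketch only; the subnet, gateway, parent interface, network name, and static IP below are placeholders you would replace with your own values:

      ```shell
      # Sketch only: subnet, gateway, eth0, and the names are placeholder assumptions
      docker network create -d ipvlan \
          --subnet=192.168.1.0/24 \
          --gateway=192.168.1.1 \
          -o parent=eth0 \
          my_ipvlan_net

      # Then attach a container to it with a static IP, as you would with macvlan
      docker run -d --network my_ipvlan_net --ip 192.168.1.50 nginx
      ```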
  14. This looks like a false positive. If the server is still unstable, one thing you can try is booting the server in safe mode with all dockers/VMs disabled and letting it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
  15. See my reply to your other post; you need to remove device 81 from the VM XML:
      -device vfio-pci,host=0000:81:00.0,id=hostdev0,bus=pci.4,addr=0x0 \
  16. The server was rebooted in the middle, so I can't see everything that happened. The current cache is mounting but it's empty, so possibly there was some error during the replacement. Also note that you have device 81:00 bound to vfio-pci; 81 is now one of the NVMe devices, and although it's a different device and it wasn't bound, it still brought the device offline, so delete that bind and run a correcting scrub on the pool. As for the old data, I assume this device was part of the original pool? C300-CTFDDAC128MAG_00000000102202FBC295 (sdl) If yes, this also suggests the replacement wasn't done correctly, since there's still a btrfs filesystem there and it should have been wiped during the replacement. If this was part of the old pool and the pool was redundant, you should be able to mount it if you assign it alone to a new pool (not with UD); if it doesn't mount in a new pool, post new diags and maybe it can be mounted manually.
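      For reference, the vfio-pci bind is removed from Tools > System Devices in the GUI; the correcting scrub can also be run from the command line. This is a sketch only; the mount point below is an assumption, so adjust it to your pool's name:

      ```shell
      # Assumes the pool is mounted at /mnt/cache; adjust to your pool name
      btrfs scrub start /mnt/cache

      # Check progress and results; corrected errors are fine,
      # uncorrectable errors mean the affected files need restoring
      btrfs scrub status /mnt/cache
      ```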
  17. The log doesn't show the replacement attempt, since the server was rebooted; it only shows both devices without a filesystem. Which one was the original device? I assume it was btrfs formatted?
  18. (Re: CPU 100%) You're using an LSI RAID controller; get an LSI HBA instead, some recommended models here.
  19. If you copied everything you need from the physical disk1 (not the emulated disk), you can use it to rebuild on top.
  20. Before anything else I would recommend running memtest, as there are indications in the log of possible RAM issues. You also have constant OOM errors; maybe try starting one docker at a time to see if one of them is the culprit.
  21. It's an issue with the Samsung 980:
  22. This is a known issue that occurs when there's only one btrfs array device: it makes parity appear to have a btrfs filesystem, and that causes issues during the btrfs scan for filesystems. Don't ask me why the pool works with v6.9; maybe something in the scan order/timings changed, but this kind of config has been known to also cause issues with older releases. This should get fixed in the future, but for now you can use one of these workarounds:
      - convert disk2 to xfs like your other array disks
      - if you really want disk2 to use btrfs, convert it to btrfs encrypted
      - add more btrfs-formatted array devices
      Either of the first two options will require formatting disk2, so you'll need to move/back up the current data there.
  23. That, and this:
      2022-05-21,04:05:23,Information,ColdStation,kern,kernel,NETDEV WATCHDOG: eth0 (igb): transmit queue 0 timed out
      It suggests the NIC stopped responding and was reset, but if the server doesn't come back online, the reset likely failed.
  24. A photo of the screen when it stops might also give some clues.