JorgeB

Moderators
  • Posts

    67,880
  • Days Won

    708

Everything posted by JorgeB

  1. Only one NVMe device is being detected by Linux, so this is not a software problem. How is the other device installed, adapter or board? Those are two partitions on the same device.
  2. Not easy to say for sure without swapping parts, but the memory controller is integrated in the CPU, so unless there are some bad traces in the board the problem will be the RAM or the CPU.
  3. Still seeing constant issues with the HBA. I see there's one LSI HBA connected to 24 disks and what looks like the Intel SCU controller connected to the same expander; can you confirm how many enclosures you have and how they are connected?
  4. If it happens again save the diagnostics before rebooting.
  5. I don't understand how disk 7, which appears to be failing and is still generating errors, was copied with ddrescue without any errors. Are you sure you copied the correct disk? See if the clone mounts with UD in read-only mode.
  6. That suggests a possible issue with the docker network configuration, you can try resetting it or doing what I suggested above.
  7. Log is full of controller/disk-related issues; reboot and post new diags after array start.
  8. There's a newer kernel, so it's worth trying for anyone affected.
  9. Try booting in safe mode. If that doesn't help, I would suggest recreating the flash drive with a stock install: first back up the current one, then restore just the bare minimum to the new flash (super.dat, your key and the pools folder). That will rule out any config issue/conflict, and if it works like that you can then reconfigure the server or restore the other config files, a few at a time, to see if you find the culprit.
  10. Back up the current flash drive, re-create the flash drive with the v6.11.2 release using the USB tool, then restore only the /config folder from the backup.
  11. Nov 2 18:16:53 Trantor kernel: BTRFS info (device dm-4): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 1729, gen 0
      Nov 2 18:16:53 Trantor kernel: BTRFS info (device dm-4): bdev /dev/mapper/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 1256, gen 0
      Btrfs is finding data corruption on both devices; start by running memtest.
  12. You are having this issue: https://forums.unraid.net/bug-reports/stable-releases/crashes-since-updating-to-v611x-r2153/ Since we are still investigating what causes the problem, you can try downgrading to v6.10.3 if you want, to see if it's stable there.
  13. Try switching to ipvlan (Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right)).
  14. Still constant problems with the controller. Make sure it's well seated or try a different slot if available. At least for now, also disable IOMMU since it appears to be causing issues, then post new diags after array start.
  15. Keep the cloned disk7 intact for now, disconnected from the server or connected but unassigned, then check filesystem on emulated disk4 and disk7.
  16. Nothing there or nothing relevant? One thing you can try is to boot the server in safe mode with all docker/VMs disabled and let it run as a basic NAS for a few days. If it still crashes it's likely a hardware problem; if it doesn't, start turning on the other services one by one.
  17. Looks like there's filesystem corruption on cache, but syslog is not complete, please reboot and post new diags after array start.
  18. This was my bad, I missed that you were on an old release, there were some changes done to the detect NICs script on v6.10, though it's still not perfect.
  19. If the failed drive is still assigned, the procedure is the same as replacing. If it's no longer assigned, i.e., the old device is completely dead, you just add a new one; no need to reformat.
  20. Other NVMe device is not decrypting, possibly because the LUKS headers are missing, since the pool is backed up you can try re-formatting, then if there are more issues with that device replace it, or if you have a spare replace it now then test the other one.
  21. Missed those, but they are from before array start; after would be better. SMART does look weird for that device, it's incomplete.
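
The two kernel lines quoted in post 11 follow a fixed pattern (wr, rd, flush, corrupt, gen), so the counters can be pulled out of a syslog with a short script. The following is only a minimal sketch, not part of the original posts: the function names `parse_btrfs_errs` and `devices_with_errors` are hypothetical, and the regular expression assumes the log lines look exactly like the ones quoted above.

```python
import re

# Matches the per-device error summary as quoted in post 11, e.g.
# "... bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 1729, gen 0"
# Assumption: the line format is exactly the one shown in that post.
BTRFS_ERRS = re.compile(
    r"bdev (?P<dev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(text):
    """Return {device: {counter_name: value}} for every summary found in text."""
    result = {}
    for match in BTRFS_ERRS.finditer(text):
        counters = {
            name: int(value)
            for name, value in match.groupdict().items()
            if name != "dev"
        }
        result[match.group("dev")] = counters
    return result


def devices_with_errors(text):
    """Return the sorted devices that have at least one non-zero counter."""
    stats = parse_btrfs_errs(text)
    return sorted(dev for dev, counters in stats.items() if any(counters.values()))


# The two lines from post 11, used here as sample input.
SAMPLE = (
    "Nov 2 18:16:53 Trantor kernel: BTRFS info (device dm-4): "
    "bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 1729, gen 0\n"
    "Nov 2 18:16:53 Trantor kernel: BTRFS info (device dm-4): "
    "bdev /dev/mapper/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 1256, gen 0\n"
)

if __name__ == "__main__":
    for device, counters in parse_btrfs_errs(SAMPLE).items():
        print(device, counters)
```

Run against the sample, this reports non-zero corrupt counters on both pool members. Corruption spread across two separate devices fits the suggestion to test the RAM first rather than suspect either drive.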