After making the ipvlan switch, the server was stable until today. Today I spotted that my containers were effectively unreachable (web servers were responding but either not returning content after login or timing out). I found that running ls /mnt would hang the terminal, but the same on /mnt/disk1 was fine. There was nothing in syslog at the time I saw the issues, but there were errors much earlier in the day (when things still seemed okay), e.g.
PANIC: zfs: removing nonexistent segment from range tree
I couldn't reboot the server because issuing a reboot would also hang, so I eventually had to do a hard reset. After the reset the array got stuck while starting, with the cache pool the apparent culprit. I rebooted (which was now possible) with a plan to mount the cache read-only, but after that reboot the array started fine. This all feels like it's related to the cache pool (single SSD), but again I'd had zero problems before the 6.12.x upgrade when on btrfs, and I only moved to zfs to get around the apparent issues with btrfs on 6.12.x. So my question at this point, assuming that zfs doesn't like something about my hardware that btrfs on 6.11.x was fine with, is whether I should consider reformatting the cache pool to xfs instead.
thanks