Unraid

kabadisha

  1. I switched over to the official Mosquitto container, but that didn't resolve the issue. I realised that I had to go and reconfigure the HA MQTT integration and update the protocol version there. That resolved the issue, and I would bet I could have stayed with the spants container.
  2. Finally resolved! I refactored all my media and download directories and mounts according to the TRaSH guides, which was worth doing anyway, but this also didn't resolve the issue. Eventually I found a deal on a replacement motherboard on eBay. I replaced the motherboard nearly two weeks ago and so far it has been rock solid, even whilst downloading. The thermal paste on the PCH of both my original board and the replacement was very crisp. It's possible that was the issue, but hard to tell. I repasted the new one before swapping it in, and maybe when I get some free time I'll test the original once repasted. It's frustrating not to know exactly what the root cause was, but I'm glad it appears to be resolved. :-)
  3. FFS. System freeze again today while I was at work. I'm going to try swapping out the Crucial SATA SSD for the Samsung one and see if that resolves it. Still clutching at fog though :-(
  4. I have changed a few things, and so far the system seems stable (fingers crossed). My Supermicro motherboard has an NVMe M.2 slot, so I bought a 2TB WD Black one and then changed a number of things:
       • I made the new NVMe drive the primary cache and migrated appdata, docker image etc. to it.
       • I took the Samsung SATA SSD and made it into a separate cache pool called data-cache. This cache is now used for downloads, frigate cctv and other potentially heavy write IO.
       • I completely overhauled my directory structure to adopt the structure suggested by the TRaSH guides. My previous setup involved completed downloads being copied to the array as soon as they were downloaded. Now the move is instantaneous and relies on the mover to transfer files to the array.
       • I configured my paths to bypass FUSE for appdata, docker & downloads and instead write directly to the relevant cache pool.
     My latest theory is that my sub-optimal setup was causing some kind of disk IO bottleneck, and since everything was relying on the same cache pool, the system crapped out. It's an unsatisfactory answer to be honest, but I can't seem to narrow it down any further. The final issue for me now is that both cache pools are using a single disk right now, which I'm not a fan of, so I've ordered a PCIe NVMe adaptor so I can mount two NVMe drives. I'll also try adding the second disk back onto the data-cache pool once I have seen a bit of long-term stability.
  5. Another update: I just had another system freeze while sabnzbd was shut down. That means it isn't the culprit. Interestingly, the issue occurred while I was conducting an internet speedtest from the server using a different containerised service. The theme here is that this involves a reasonably large download. This is smelling increasingly like a hardware issue. Since I have already ruled out RAM and I have high confidence in my power supply, I'm wondering if this might be one of the SSDs in my cache pool struggling to keep up when data is being written. I've got two 2TB SATA SSDs from different manufacturers, both about a year old and both reporting good health and 96+% remaining health:
       • Samsung 870 EVO 2TB
       • Crucial MX500 (CT2000MX500SSD1)
     Feel like I'm clutching at straws though.
  6. Can I ask how you diagnosed this? What tool(s) did you use to check the write throughput of a single drive in a pool?
  7. A further update: The mystery continues. My server was stable for several weeks with Sabnzbd disabled. I brought it back online after the changes above and it was stable for a while, successfully downloading a large number of files. This morning, though: Dead again. This time I booted back up and started downloads whilst tailing /var/log/* via ssh, with htop open alongside the logs for both Sabnzbd and gluetun. I was able to get the system to hang multiple times over the course of an hour. Each time, there were no entries at all in any of the logs, and htop showed no unusual system load. While I'm glad I've narrowed it down to Sabnzbd (or maybe Gluetun), this is proving very difficult indeed to diagnose fully.
  8. Another update in case anyone finds this thread in the future. Shutting down the Sabnzbd container resolved the stability issues. I now have it running again, and so far it seems to be stable. I have changed two things:
       • Previously, I had /downloads/incomplete and /downloads/complete mapped as two different mounts on the container. I have now switched to one mount of the parent directory /downloads instead. My theory is that my previous method was preventing simple file move operations from behaving properly since the move (as far as the container was concerned) was from one mount to another.
       • I have switched from /mnt/user/downloads to /mnt/cache/downloads as the host end of the mount. This avoids the overhead of the FUSE filesystem layer: https://www.reddit.com/r/unRAID/comments/uwxmg6/comment/i9uj9z1/
     I hope this helps someone else in future :-)
  9. I've also been having the same issue. It's been a bitch to diagnose because of how infrequently it happens. Still looking for the solution.
  10. Update: Swapping RAM had no impact. However, I have just managed to trigger the issue several times in quick succession. I started a large download in sabnzbd and it seemed to trigger the failure. I was tailing syslog at the time of failure and there was literally nothing logged. I'm going to disable sabnzbd and see if that leads to stability.
  11. It just died for the first time while I was actively using it. I tried switching from ipvlan to macvlan to give pihole its own IP to see if that had any impact. Apparently not. I also noticed that the error "Error response from daemon: network with name br0 already exists" had reappeared. It seems that if you have "preserve custom networks" enabled, this error will be seen in the log. I've disabled preserving networks for now (as I no longer need that feature) and that seems to have resolved that one. I've now pulled out two of the four RAM sticks. If it fails again, I'll swap the pair. If I still get the failure on both pairs, then I know it's not a RAM issue.
  12. Still having the issue, so BIOS update didn't do the trick. I'm also still not getting anything useful in the logs. This is going to be a bastard to resolve. There's so little to go on :-(
  13. Still no dice. I was getting optimistic, but it died again last night. Pumped the syslog into ChatGPT on the off-chance it might spot something I didn't. It didn't give me much - it seemed to hallucinate a kernel panic in the logs. It did cause me to pay more attention to this line: "kernel: mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 8." This doesn't look like a fatal error to me, but it did cause me to go and check for BIOS and IPMI firmware updates. Both had updates available, so I decided to try upgrading those. Upgrading both hasn't resolved those mce firmware bugs, but maybe it'll help system stability.
  14. Ok, some progress: I've managed to resolve "Error response from daemon: network with name br0 already exists". It appears that somewhere along the line, the custom br0 network was duplicated (or maybe I created it manually and simply can't remember). Since I have "Preserve user defined networks" enabled in my Docker settings, it wasn't being cleaned up, and so was conflicting with the custom network that Unraid was trying to create for me when I enabled IPv4 custom network on interface br0 under Docker Settings. I suspect that dodgy instance of br0 may have been causing the issues. To resolve it, I did the following, in order:
       • Disable autostart on the pi-hole container
       • Stop Docker
       • Disable IPv4 custom network on interface br0 under Docker Settings
       • Start Docker
       • Edit the pi-hole container config and set the network type to host. It will fail to start and complain about port 80 being taken, but that's ok.
       • Stop Docker
       • Enable IPv4 custom network on interface br0 under Docker Settings
       • Start Docker
       • Edit the pi-hole container config and set network type to Custom-br0 and set the static IP I use for pi-hole on my network (192.168.1.12 in my case).
       • Re-enable autostart on the pi-hole container
     Not sure if that's going to resolve the system crashes, but I do now get a clean network startup when starting Docker, and I'm counting that as progress:
       Aug 5 19:39:32 tower rc.docker: Processing... br0
       Aug 5 19:39:32 tower rc.docker: created network ipvlan br0 with subnets: 192.168.1.0/24;
       Aug 5 19:39:32 tower rc.docker: connecting pihole to network br0
       Aug 5 19:39:32 tower rc.docker: ip link add link br0 name shim-br0 type ipvlan mode l2 bridge
       Aug 5 19:39:32 tower rc.docker: ip link set shim-br0 up
       Aug 5 19:39:32 tower rc.docker: ip -6 addr flush dev shim-br0
       Aug 5 19:39:32 tower rc.docker: ip -4 addr add 192.168.1.11/24 dev shim-br0 metric 0
       Aug 5 19:39:32 tower rc.docker: ip -4 route add default via 192.168.1.1 dev shim-br0 metric 0
       Aug 5 19:39:32 tower rc.docker: created network shim-br0 for host access
       Aug 5 19:39:32 tower rc.docker: Network started.
     Time will tell if that resolves the issue.
  15. It died again at some point this morning. Interestingly, this time there does seem to be something in the IPMI health event log, but the error codes don't mean anything to me, so not sure it's very helpful. Screenshot attached. Successfully did a memtest pass with no issues. Going to try disabling PiHole now and see if that improves things...
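
On the single-mount change in item 8: on Linux, rename() does not work across different mount points, even when the same filesystem is mounted on both. A "move" between two separate container mounts therefore tends to become a full copy followed by a delete. Below is a minimal Python sketch of that behaviour. It is only an illustration and makes no claim about how Sabnzbd moves files internally; the function name move_with_report is made up for this example.

```python
import errno
import os
import shutil


def move_with_report(src: str, dst_dir: str) -> str:
    """Move src into dst_dir and report how the move was carried out.

    Returns "rename" when the kernel could simply relink the file (cheap,
    effectively instantaneous) or "copy" when the source and destination are
    on different mounts and the data had to be copied and then deleted.
    """
    dst = os.path.join(dst_dir, os.path.basename(src))
    try:
        # Succeeds only when src and dst are on the same mount point.
        os.rename(src, dst)
        return "rename"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise  # some other failure (permissions, missing dir, ...)
        # Cross-device link error: fall back to copy + delete.
        shutil.move(src, dst)
        return "copy"
```

Run inside the container against a test file, e.g. move_with_report("/downloads/incomplete/test.bin", "/downloads/complete"), a result of "copy" would suggest the two directories are still seen as separate mounts, while "rename" would suggest the single /downloads mount is doing its job.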
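
On the theory in item 5 that an SSD might be struggling to keep up with writes: one rough way to sanity-check sustained write speed at a given path is to write a large file, force it to disk, and time it. The Python sketch below does that. It is an assumption-laden illustration, not the method anyone in the thread used: the function name and default sizes are invented, and it measures the filesystem at that path, so on a multi-device pool it does not isolate a single drive.

```python
import os
import time


def measure_write_throughput(directory: str, total_mb: int = 1024, chunk_mb: int = 4) -> float:
    """Write total_mb of random data into directory and return MB/s.

    The data is flushed and fsync'd before the clock stops, so the figure
    reflects data actually handed to the storage device rather than data
    sitting in the page cache. The temporary file is removed afterwards.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # incompressible payload
    target = total_mb * 1024 * 1024
    path = os.path.join(directory, "throughput_test.tmp")
    written = 0
    start = time.perf_counter()
    try:
        with open(path, "wb") as fh:
            while written < target:
                fh.write(chunk)
                written += len(chunk)
            fh.flush()
            os.fsync(fh.fileno())
        elapsed = time.perf_counter() - start
    finally:
        if os.path.exists(path):
            os.remove(path)
    return written / (1024 * 1024) / elapsed
```

For example, measure_write_throughput("/mnt/cache/downloads", total_mb=8192) would write 8 GB to the cache path mentioned in item 8. Comparing runs of different sizes may show whether throughput drops off on longer writes, but treat the numbers as indicative only.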
