bonzog

Members
  • Posts: 8
Everything posted by bonzog

  1. Just an update based on time elapsed since downgrading: the system has been up for 2 days 14 hours now without a hint of instability on 6.11.5. As an aside, ipvlan works well and seems like a sensible change.
  2. Yes, I guess that's expected when the server is completely frozen and non-responsive. I was just trying to follow the bug report guidelines. As it happens, I did have a syslog server running for the last two crashes before I decided to revert, as I didn't know until this incident that Unraid doesn't rotate logs by default. Here they are if they're useful to anyone:

     Crash3.txt is a complete syslog up to the point of the reboot. The hourly Docker restarts are a scheduled task that starts a one-shot container. External monitoring reported the server down 12 minutes after the penultimate log entry.

     Crash4.txt is also a complete syslog, from a planned reboot (trying the new CPU) to the eventual crash and reboot. There's a lot of extra noise in this one as I was also trying to get ipvlan going. External monitoring reported the server down 3 minutes after the penultimate log entry.

     Hope this is useful for the bug hunters. I'm going to stick with 6.11.5 for now.

     Syslog-Crash3.txt Syslog-Crash4.txt
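For anyone else caught out by logs not surviving long enough to capture a crash, a generic logrotate stanza along these lines would cap the local syslog. This is a sketch for a stock Linux logrotate setup, not a file Unraid ships; the path, size, and rotation count are assumptions to adjust:

```
# Hypothetical /etc/logrotate.d/syslog stanza (generic Linux, not stock Unraid)
/var/log/syslog {
    size 50M        # rotate once the file reaches 50 MB
    rotate 4        # keep four old copies
    compress        # gzip rotated logs
    missingok       # don't error if the file is absent
    notifempty      # skip rotation when the log is empty
}
```

A remote syslog target (as used for the attached crash logs) remains the safer option for hard lockups, since anything buffered locally can be lost at the moment of the freeze.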
  3. (Just filing for info, as I have downgraded to 6.11.5 and am happy.) Since upgrading from 6.11.5 >> 6.12.2, the system arbitrarily locks up at least twice a day. Diagnostics and a screenshot from the last lockup are attached.

     Fix Common Problems highlighted macvlan traces, so I followed the advice to migrate Docker from macvlan to ipvlan; this is functional but didn't stop the crashes. As this is a Ryzen system, I have double-checked all the suggestions re: memory speed and power states. Generally, though, my server has been rock solid on 6.11.x with months of uptime, without any BIOS power tweaks or suchlike needed. There's no obvious pattern to the timing of the lockups. I upgraded to the latest BIOS and switched the processor from a 2600X to a 5600X (a planned upgrade anyway), with no improvement. I have downgraded back to 6.11.5, and the system is solidly stable again.

     Hardware:
     - Ryzen 2600X (then 5600X)
     - B450M chipset
     - 40GB DDR4 (yes, mixed sticks, but it's been stable)
     - LSI SAS2008 controller
     - Nvidia GTX 1080 used by Docker

     unraid-diagnostics-20230712-0831.zip
  4. Thanks both for the hints. Looks like I've found the culprit in some hacky SW TPM scripts I had been playing with a while ago to get Windows 11 VMs running. It wasn't actually the plugins themselves, but a sneaky line in a User Scripts entry set to run on array start, which included 'chmod 0755 -R /var/lib'!

     Steps tried today:
     - Disable Docker and VMs, reboot and check permissions - OK on initial boot, permissions broken when the array is started.
     - Additionally delete NerdPack (didn't realise it was deprecated) - same result, permissions broken when the array is started.
     - Additionally delete the SWTPM hacks in /boot/extras, as I'm not playing with the Win 11 VM any more - same result, fine until the array is started.
     - Started wondering exactly what happens when the 'start' button is pressed, then remembered the User Scripts plugin has schedule options including array start... checked my scripts and realised that the SW TPM script was doing nasty stuff.

     Thanks for the nudge in the right direction. That'll teach me to keep hacks that are no longer needed (especially since the latest release has TPM support built in...) on my 'production' box!
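A rogue `chmod 0755 -R /var/lib` like the one above could be caught with a quick audit of the scripts directory. This is a minimal sketch, assuming the stock User Scripts location of /boot/config/plugins/user.scripts/scripts (an assumption; adjust `SCRIPTS_DIR` if your install differs):

```shell
# Sketch: audit User Scripts for recursive permission changes on system
# paths. SCRIPTS_DIR is the usual User Scripts plugin location on Unraid
# -- an assumption; change it to suit your install.
SCRIPTS_DIR="${SCRIPTS_DIR:-/boot/config/plugins/user.scripts/scripts}"

audit_scripts() {
    dir="$1"
    # flag any chmod/chown -R line that touches /var, /etc, /usr or /lib
    if grep -rnE 'ch(mod|own)\b.*-R' "$dir" 2>/dev/null \
         | grep -E '/(var|etc|usr|lib)\b'; then
        echo "WARNING: recursive permission change on a system path"
    else
        echo "OK: nothing suspicious found"
    fi
}

audit_scripts "$SCRIPTS_DIR"
```

The pattern is deliberately loose (it flags rather than fixes), so a hit just tells you which script and line to go and read.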
  5. For months now, my ability to access Unraid shares from Windows has broken randomly with the dreaded 0x80070035 error. Recently, while trying to get it working again, I ran smbstatus on the CLI and received this message:

     root@unraid:~# smbstatus
     invalid permissions on directory '/var/lib/samba/private/msg.sock': has 0755 should be 0700
     Unable to initialize messaging context!

     I did chmod -R 0700 /var/lib/samba/private/msg.sock - immediately smbstatus worked again and my Windows clients were able to access every share without issue. No reboots or fiddling with 'net use' were necessary - it worked straight away. The system continued to work fine for a few days, until I rebooted following an OS upgrade to 6.11.1 and the same situation occurred, with the permissions needing to be reset.

     I'm happy that I can restore access immediately by changing the permissions back, but I'd like to get to the bottom of why this happens. I guess it's something periodic, on boot, or on Samba startup that causes it - but I've no idea what. I have seen one other post hinting that tdarr might have some influence, but I've never used or installed that. Has anyone any ideas, please? The issue spans multiple Unraid versions from pre-6.9 through 6.11. Cheers. (Diagnostics attached.)

     unraid-diagnostics-20221007-2239.zip
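Until the root cause is found, a workaround along these lines could re-apply the expected mode automatically. This is only a sketch: the path and 0700 mode come from the smbstatus error above, and wiring it to run at boot or array start (e.g. from the go file or a User Script) is my assumption, not official guidance:

```shell
# Sketch: re-apply the mode smbd expects on its messaging socket
# directory. Path and 0700 mode are taken from the smbstatus error;
# where/when to run this is left to you (go file, User Script, etc.).
fix_msgsock_perms() {
    dir="${1:-/var/lib/samba/private/msg.sock}"
    if [ -d "$dir" ] && [ "$(stat -c '%a' "$dir")" != "700" ]; then
        chmod -R 0700 "$dir"
    fi
    # print the resulting octal mode so the fix is easy to verify
    [ -d "$dir" ] && stat -c '%a' "$dir"
}
```

Note this only papers over the symptom; whatever keeps widening the permissions (the rogue User Script found in another thread turned out to be one such culprit) still wants finding.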
  6. My symptoms were quite similar to yours. Working fine, no config changes, just stopped working one night. Never did find a fix and moved to binhex-qbittorrentvpn instead, which is working fine.
  7. Interesting - the log file hasn't been updated since the time I shut down the running container. The timestamp is consistent with when I rebooted my Unraid host to install a new NV driver, and the last entries are:

     1644174856 N worker_rtorrent: Shutting down thread.
     1644174856 N rtorrent main: Shutting down thread.
     1644174856 N rtorrent disk: Shutting down thread.

     Assuming this might be a permissions issue, I created a new system user, changed the PUID & PGID of the binhex-rtorrent container, and deleted perms.txt before starting the container again. The user:group on all files has changed successfully, but the startup failure cycle still persists.
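Since the owner change didn't fix it, one way to double-check that the PUID/PGID swap really propagated everywhere is to list anything under the mapped path that still has the wrong owner. A minimal sketch - 99:100 (nobody:users) is the usual Unraid default and is an assumption here; pass whatever values you gave the container:

```shell
# Sketch: print every file under a mapped path whose owner does not
# match the PUID/PGID given to the container. Defaults of 99:100
# (Unraid's usual nobody:users) are an assumption.
check_ownership() {
    path="$1"; puid="${2:-99}"; pgid="${3:-100}"
    # print only entries whose uid or gid differs from the expected pair
    find "$path" \( ! -uid "$puid" -o ! -gid "$pgid" \) -print
}
```

Empty output means ownership is consistent, which would point the startup loop at something other than file permissions.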
  8. I am currently having a very similar issue. The container works on first run after a barebones install and basic VPN config, but any subsequent restart of the container or the Unraid host causes the startup loop that you posted. /mnt/user/Media is mapped to /data in the container, and permissions look OK. I'm a bit stuck now and would appreciate any advice. Thanks.