Jump to content

Full System Halts


Recommended Posts

Hi All,

 

I wrote some time ago that I was experiencing what appeared to be somewhat random system halts where the only way to resolve it was a full power off as the plug, wait and reboot.

There would be no apparent cause for this but for a while it stopped happening, so I let the issue go.

 

It has recently started flaring up again with a vengeance, so much so now I have a smart plug attached to the machine so I can remotely cut power and reboot.

 

I still don't know for sure why this is happening and I'm not sure where to start.

I've tried looking in the syslog but it only appear to start from the time the system came back online after it's last reboot, which is no good.

 

I think there may be a correlation between the system halts and disk usage - A parity check is scheduled for 2AM every Sunday morning, and that is roughly when the machine stopped responding.

 

image.png.55bd1a96cb2b17e4c8c6db74c3f6356f.png

 

Looking at the power monitoring data my smart plug provides, you can see the halt, shortly after 2AM when the parity check starts, then a long stretch of idling at ~80 watts before I cut power and rebooted earlier this evening.

 

My machine typically idles about the 50 watt mark as I'm the only user, it serves very light NAS duties but does handle downloading and local plex media streaming.

 

All my drives are reporting a healthy state and I've never seem the error count on them above zero.

I've run self tests on all my drives but they all report completed without error.

 

I'm not sure what to do next with this other than bite the bullet and move to a Synology or QNAP box 😟

 

EDIT:

I Turned on the SysLog server and pointed Unraid back at itself to log events, then triggered a Parity check.

I got about 0.8% before the server crashed.

It does not appear to have logged the issue though as the crash occurred at 00:34

 

Quote

Jan  9 00:26:20 Tabris  rsyslogd: [origin software="rsyslogd" swVersion="8.2102.0" x-pid="18525" x-info="https://www.rsyslog.com"] start
Jan  9 00:26:33 Tabris nginx: 2023/01/09 00:26:33 [error] 2829#2829: *50242 open() "/usr/local/emhttp/plugins/dynamix.file.manager/javascript/ace/mode-log.js" failed (2: No such file or directory) while sending to client, client: 192.168.11.14, server: , request: "GET /plugins/dynamix.file.manager/javascript/ace/mode-log.js HTTP/1.1", host: "192.168.11.100", referrer: "http://192.168.11.100/Shares/Browse?dir=/mnt/user/syslog-store"
Jan  9 00:28:44 Tabris nginx: 2023/01/09 00:28:44 [error] 2829#2829: *51521 open() "/usr/local/emhttp/plugins/dynamix.file.manager/javascript/ace/mode-log.js" failed (2: No such file or directory) while sending to client, client: 192.168.11.14, server: , request: "GET /plugins/dynamix.file.manager/javascript/ace/mode-log.js HTTP/1.1", host: "192.168.11.100", referrer: "http://192.168.11.100/Shares/Browse?dir=/mnt/user/syslog-store"
Jan  9 00:29:15 Tabris kernel: mdcmd (44): check correct
Jan  9 00:29:15 Tabris kernel: md: recovery thread: check P ...
Jan  9 00:29:17 Tabris  emhttpd: read SMART /dev/sdh
Jan  9 00:29:17 Tabris  emhttpd: read SMART /dev/sde
Jan  9 00:29:17 Tabris  emhttpd: read SMART /dev/sdb
Jan  9 00:29:17 Tabris  emhttpd: read SMART /dev/sdf
Jan  9 00:29:40 Tabris nginx: 2023/01/09 00:29:40 [error] 2829#2829: *52366 open() "/usr/local/emhttp/plugins/dynamix.file.manager/javascript/ace/mode-log.js" failed (2: No such file or directory) while sending to client, client: 192.168.11.14, server: , request: "GET /plugins/dynamix.file.manager/javascript/ace/mode-log.js HTTP/1.1", host: "192.168.11.100", referrer: "http://192.168.11.100/Shares/Browse?dir=/mnt/user/syslog-store"
Jan  9 00:37:25 Tabris root: Delaying execution of fix common problems scan for 10 minutes
Jan  9 00:37:25 Tabris unassigned.devices: Mounting 'Auto Mount' Devices...
Jan  9 00:37:25 Tabris  emhttpd: Starting services...
Jan  9 00:37:25 Tabris  emhttpd: shcmd (128): /etc/rc.d/rc.samba restart
Jan  9 00:37:25 Tabris  wsdd2[2749]: 'Terminated' signal received.
Jan  9 00:37:25 Tabris  nmbd[2739]: [2023/01/09 00:37:25.112438,  0] ../../source3/nmbd/nmbd.c:59(terminate)
Jan  9 00:37:25 Tabris  nmbd[2739]:   Got SIGTERM: going down...
Jan  9 00:37:25 Tabris  wsdd2[2749]: terminating.
Jan  9 00:37:25 Tabris  winbindd[2754]: [2023/01/09 00:37:25.112685,  0] ../../source3/winbindd/winbindd_dual.c:1957(winbindd_sig_term_handler)
Jan  9 00:37:25 Tabris  winbindd[2754]:   Got sig[15] terminate (is_parent=0)
Jan  9 00:37:25 Tabris  winbindd[2832]: [2023/01/09 00:37:25.113360,  0] ../../source3/winbindd/winbindd_dual.c:1957(winbindd_sig_term_handler)
Jan  9 00:37:25 Tabris  winbindd[2832]:   Got sig[15] terminate (is_parent=0)
Jan  9 00:37:25 Tabris  winbindd[2752]: [2023/01/09 00:37:25.113564,  0] ../../source3/winbindd/winbindd_dual.c:1957(winbindd_sig_term_handler)
Jan  9 00:37:25 Tabris  winbindd[2752]:   Got sig[15] terminate (is_parent=1)
Jan  9 00:37:26 Tabris  rsyslogd: [origin software="rsyslogd" swVersion="8.2102.0" x-pid="3545" x-info="https://www.rsyslog.com"] start
Jan  9 00:37:27 Tabris root: Starting Samba:  /usr/sbin/smbd -D
Jan  9 00:37:27 Tabris  smbd[3600]: [2023/01/09 00:37:27.283810,  0] ../../source3/smbd/server.c:1741(main)
Jan  9 00:37:27 Tabris  smbd[3600]:   smbd version 4.17.3 started.
Jan  9 00:37:27 Tabris  smbd[3600]:   Copyright Andrew Tridgell and the Samba Team 1992-2022
Jan  9 00:37:27 Tabris root:                  /usr/sbin/nmbd -D
Jan  9 00:37:27 Tabris  nmbd[3602]: [2023/01/09 00:37:27.302984,  0] ../../source3/nmbd/nmbd.c:901(main)
Jan  9 00:37:27 Tabris  nmbd[3602]:   nmbd version 4.17.3 started.
Jan  9 00:37:27 Tabris  nmbd[3602]:   Copyright Andrew Tridgell and the Samba Team 1992-2022
Jan  9 00:37:27 Tabris root:                  /usr/sbin/wsdd2 -d 
Jan  9 00:37:27 Tabris  wsdd2[3616]: starting.
Jan  9 00:37:27 Tabris root:                  /usr/sbin/winbindd -D
Jan  9 00:37:27 Tabris  winbindd[3617]: [2023/01/09 00:37:27.388981,  0] ../../source3/winbindd/winbindd.c:1440(main)
Jan  9 00:37:27 Tabris  winbindd[3617]:   winbindd version 4.17.3 started.
Jan  9 00:37:27 Tabris  winbindd[3617]:   Copyright Andrew Tridgell and the Samba Team 1992-2022
Jan  9 00:37:27 Tabris  winbindd[3619]: [2023/01/09 00:37:27.392216,  0] ../../source3/winbindd/winbindd_cache.c:3116(initialize_winbindd_cache)
Jan  9 00:37:27 Tabris  winbindd[3619]:   initialize_winbindd_cache: clearing cache and re-creating with version number 2
Jan  9 00:37:27 Tabris  emhttpd: shcmd (132): /etc/rc.d/rc.avahidaemon restart
Jan  9 00:37:27 Tabris root: Stopping Avahi mDNS/DNS-SD Daemon: stopped
Jan  9 00:37:27 Tabris  avahi-dnsconfd[2778]: read(): EOF
Jan  9 00:37:27 Tabris root: Starting Avahi mDNS/DNS-SD Daemon:  /usr/sbin/avahi-daemon -D
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Found user 'avahi' (UID 61) and group 'avahi' (GID 214).
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Successfully dropped root privileges.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: avahi-daemon 0.8 starting up.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Successfully called chroot().
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Successfully dropped remaining capabilities.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Loading service file /services/sftp-ssh.service.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Loading service file /services/smb.service.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Loading service file /services/ssh.service.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface br0.IPv4 with address 192.168.11.100.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: New relevant interface br0.IPv4 for mDNS.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface lo.IPv6 with address ::1.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: New relevant interface lo.IPv6 for mDNS.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface lo.IPv4 with address 127.0.0.1.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: New relevant interface lo.IPv4 for mDNS.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Network interface enumeration completed.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Registering new address record for 192.168.11.100 on br0.IPv4.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Registering new address record for ::1 on lo.*.
Jan  9 00:37:27 Tabris  avahi-daemon[3636]: Registering new address record for 127.0.0.1 on lo.IPv4.
Jan  9 00:37:27 Tabris  emhttpd: shcmd (133): /etc/rc.d/rc.avahidnsconfd restart
Jan  9 00:37:27 Tabris root: Stopping Avahi mDNS/DNS-SD DNS Server Configuration Daemon: stopped
Jan  9 00:37:27 Tabris root: Starting Avahi mDNS/DNS-SD DNS Server Configuration Daemon:  /usr/sbin/avahi-dnsconfd -D
Jan  9 00:37:27 Tabris  avahi-dnsconfd[3645]: Successfully connected to Avahi daemon.
Jan  9 00:37:27 Tabris  emhttpd: shcmd (144): /usr/local/sbin/mount_image '/mnt/user/system/docker/docker.img' /var/lib/docker 20
Jan  9 00:37:28 Tabris kernel: loop2: detected capacity change from 0 to 41943040
Jan  9 00:37:28 Tabris kernel: BTRFS: device fsid 656d119c-456a-4172-8c6b-bda3bd9887eb devid 1 transid 2981255 /dev/loop2 scanned by mount (3681)
Jan  9 00:37:28 Tabris kernel: BTRFS info (device loop2): using free space tree
Jan  9 00:37:28 Tabris kernel: BTRFS info (device loop2): has skinny extents
Jan  9 00:37:28 Tabris kernel: BTRFS info (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 145, gen 0
Jan  9 00:37:28 Tabris kernel: BTRFS info (device loop2): enabling ssd optimizations
Jan  9 00:37:28 Tabris root: Resize device id 1 (/dev/loop2) from 20.00GiB to max
Jan  9 00:37:28 Tabris  emhttpd: shcmd (146): /etc/rc.d/rc.docker start
Jan  9 00:37:28 Tabris root: starting dockerd ...
Jan  9 00:37:28 Tabris  avahi-daemon[3636]: Server startup complete. Host name is Tabris.local. Local service cookie is 385468981.
Jan  9 00:37:28 Tabris kernel: BTRFS warning (device loop2): csum failed root 5 ino 77454 off 0 csum 0x9a7f5ae4 expected csum 0xa7f388c1 mirror 1
Jan  9 00:37:28 Tabris kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 146, gen 0
Jan  9 00:37:28 Tabris kernel: BTRFS warning (device loop2): csum failed root 5 ino 77454 off 4096 csum 0x16af2102 expected csum 0x5b0e635e mirror 1
Jan  9 00:37:28 Tabris kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 147, gen 0
Jan  9 00:37:28 Tabris kernel: BTRFS warning (device loop2): csum failed root 5 ino 77454 off 0 csum 0x9a7f5ae4 expected csum 0xa7f388c1 mirror 1
Jan  9 00:37:28 Tabris kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 148, gen 0
Jan  9 00:37:28 Tabris kernel: Bridge firewalling registered
Jan  9 00:37:28 Tabris kernel: Initializing XFRM netlink socket
Jan  9 00:37:28 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface docker0.IPv4 with address 172.17.0.1.
Jan  9 00:37:28 Tabris  avahi-daemon[3636]: New relevant interface docker0.IPv4 for mDNS.
Jan  9 00:37:28 Tabris  avahi-daemon[3636]: Registering new address record for 172.17.0.1 on docker0.IPv4.
Jan  9 00:37:28 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface br-cd2cd4596443.IPv4 with address 172.18.0.1.
Jan  9 00:37:28 Tabris  avahi-daemon[3636]: New relevant interface br-cd2cd4596443.IPv4 for mDNS.
Jan  9 00:37:28 Tabris  avahi-daemon[3636]: Registering new address record for 172.18.0.1 on br-cd2cd4596443.IPv4.
Jan  9 00:37:29 Tabris  avahi-daemon[3636]: Service "Tabris" (/services/sftp-ssh.service) successfully established.
Jan  9 00:37:29 Tabris  avahi-daemon[3636]: Service "Tabris" (/services/smb.service) successfully established.
Jan  9 00:37:29 Tabris  avahi-daemon[3636]: Service "Tabris" (/services/ssh.service) successfully established.
Jan  9 00:37:30 Tabris  emhttpd: shcmd (160): /usr/local/sbin/mount_image '/mnt/user/system/libvirt/libvirt.img' /etc/libvirt 1
Jan  9 00:37:30 Tabris kernel: loop3: detected capacity change from 0 to 2097152
Jan  9 00:37:30 Tabris kernel: BTRFS: device fsid c172ccc1-4210-457b-bdf9-0c8ff42244ac devid 1 transid 1357 /dev/loop3 scanned by mount (4682)
Jan  9 00:37:30 Tabris kernel: BTRFS info (device loop3): using free space tree
Jan  9 00:37:30 Tabris kernel: BTRFS info (device loop3): has skinny extents
Jan  9 00:37:30 Tabris kernel: BTRFS info (device loop3): enabling ssd optimizations
Jan  9 00:37:30 Tabris root: Resize device id 1 (/dev/loop3) from 1.00GiB to max
Jan  9 00:37:30 Tabris  emhttpd: shcmd (162): /etc/rc.d/rc.libvirt start
Jan  9 00:37:30 Tabris root: Starting virtlockd...
Jan  9 00:37:30 Tabris root: Starting virtlogd...
Jan  9 00:37:30 Tabris root: Starting libvirtd...
Jan  9 00:37:30 Tabris kernel: tun: Universal TUN/TAP device driver, 1.6
Jan  9 00:37:30 Tabris kernel: mdcmd (42): check correct
Jan  9 00:37:30 Tabris kernel: md: recovery thread: check P ...
Jan  9 00:37:30 Tabris rc.docker: Plex: started succesfully!
Jan  9 00:37:30 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface virbr0.IPv4 with address 192.168.122.1.
Jan  9 00:37:30 Tabris  avahi-daemon[3636]: New relevant interface virbr0.IPv4 for mDNS.
Jan  9 00:37:30 Tabris  avahi-daemon[3636]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
Jan  9 00:37:30 Tabris dnsmasq[4933]: started, version 2.87 cachesize 150
Jan  9 00:37:30 Tabris dnsmasq[4933]: compile time options: IPv6 GNU-getopt DBus no-UBus i18n IDN2 DHCP DHCPv6 no-Lua TFTP conntrack ipset no-nftset auth cryptohash DNSSEC loop-detect inotify dumpfile
Jan  9 00:37:30 Tabris dnsmasq-dhcp[4933]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h
Jan  9 00:37:30 Tabris dnsmasq-dhcp[4933]: DHCP, sockets bound exclusively to interface virbr0
Jan  9 00:37:30 Tabris dnsmasq[4933]: reading /etc/resolv.conf
Jan  9 00:37:30 Tabris dnsmasq[4933]: using nameserver 1.1.1.1#53
Jan  9 00:37:30 Tabris dnsmasq[4933]: using nameserver 8.8.8.8#53
Jan  9 00:37:30 Tabris dnsmasq[4933]: read /etc/hosts - 2 addresses
Jan  9 00:37:30 Tabris dnsmasq[4933]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Jan  9 00:37:30 Tabris dnsmasq-dhcp[4933]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Jan  9 00:37:30 Tabris kernel: r8169 0000:03:00.0: invalid VPD tag 0x00 (size 0) at offset 0; assume missing optional EEPROM
Jan  9 00:37:30 Tabris kernel: br-cd2cd4596443: port 1(veth110943b) entered blocking state
Jan  9 00:37:30 Tabris kernel: br-cd2cd4596443: port 1(veth110943b) entered disabled state
Jan  9 00:37:30 Tabris kernel: device veth110943b entered promiscuous mode
Jan  9 00:37:30 Tabris kernel: br-cd2cd4596443: port 1(veth110943b) entered blocking state
Jan  9 00:37:30 Tabris kernel: br-cd2cd4596443: port 1(veth110943b) entered forwarding state
Jan  9 00:37:30 Tabris kernel: br-cd2cd4596443: port 1(veth110943b) entered disabled state
Jan  9 00:37:31 Tabris kernel: eth0: renamed from vethec3bdda
Jan  9 00:37:31 Tabris kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth110943b: link becomes ready
Jan  9 00:37:31 Tabris kernel: br-cd2cd4596443: port 1(veth110943b) entered blocking state
Jan  9 00:37:31 Tabris kernel: br-cd2cd4596443: port 1(veth110943b) entered forwarding state
Jan  9 00:37:31 Tabris kernel: IPv6: ADDRCONF(NETDEV_CHANGE): br-cd2cd4596443: link becomes ready
Jan  9 00:37:31 Tabris rc.docker: ProxyManager: started succesfully!
Jan  9 00:37:31 Tabris kernel: docker0: port 1(veth4016e87) entered blocking state
Jan  9 00:37:31 Tabris kernel: docker0: port 1(veth4016e87) entered disabled state
Jan  9 00:37:31 Tabris kernel: device veth4016e87 entered promiscuous mode
Jan  9 00:37:31 Tabris kernel: docker0: port 1(veth4016e87) entered blocking state
Jan  9 00:37:31 Tabris kernel: docker0: port 1(veth4016e87) entered forwarding state
Jan  9 00:37:31 Tabris kernel: docker0: port 1(veth4016e87) entered disabled state
Jan  9 00:37:31 Tabris kernel: eth0: renamed from veth52c8d6d
Jan  9 00:37:31 Tabris kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth4016e87: link becomes ready
Jan  9 00:37:31 Tabris kernel: docker0: port 1(veth4016e87) entered blocking state
Jan  9 00:37:31 Tabris kernel: docker0: port 1(veth4016e87) entered forwarding state
Jan  9 00:37:31 Tabris kernel: IPv6: ADDRCONF(NETDEV_CHANGE): docker0: link becomes ready
Jan  9 00:37:31 Tabris rc.docker: Overseerr: started succesfully!
Jan  9 00:37:31 Tabris kernel: br-cd2cd4596443: port 2(veth8a50a7a) entered blocking state
Jan  9 00:37:31 Tabris kernel: br-cd2cd4596443: port 2(veth8a50a7a) entered disabled state
Jan  9 00:37:31 Tabris kernel: device veth8a50a7a entered promiscuous mode
Jan  9 00:37:31 Tabris kernel: br-cd2cd4596443: port 2(veth8a50a7a) entered blocking state
Jan  9 00:37:31 Tabris kernel: br-cd2cd4596443: port 2(veth8a50a7a) entered forwarding state
Jan  9 00:37:31 Tabris kernel: br-cd2cd4596443: port 2(veth8a50a7a) entered disabled state
Jan  9 00:37:31 Tabris kernel: eth0: renamed from veth8b1a00b
Jan  9 00:37:32 Tabris kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth8a50a7a: link becomes ready
Jan  9 00:37:32 Tabris kernel: br-cd2cd4596443: port 2(veth8a50a7a) entered blocking state
Jan  9 00:37:32 Tabris kernel: br-cd2cd4596443: port 2(veth8a50a7a) entered forwarding state
Jan  9 00:37:32 Tabris rc.docker: radarr: started succesfully!
Jan  9 00:37:32 Tabris kernel: br-cd2cd4596443: port 3(veth2dee507) entered blocking state
Jan  9 00:37:32 Tabris kernel: br-cd2cd4596443: port 3(veth2dee507) entered disabled state
Jan  9 00:37:32 Tabris kernel: device veth2dee507 entered promiscuous mode
Jan  9 00:37:32 Tabris kernel: br-cd2cd4596443: port 3(veth2dee507) entered blocking state
Jan  9 00:37:32 Tabris kernel: br-cd2cd4596443: port 3(veth2dee507) entered forwarding state
Jan  9 00:37:32 Tabris kernel: eth0: renamed from veth2403a6f
Jan  9 00:37:32 Tabris kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth2dee507: link becomes ready
Jan  9 00:37:32 Tabris rc.docker: sonarr: started succesfully!
Jan  9 00:37:32 Tabris kernel: r8169 0000:03:00.0 eth0: Link is Down
Jan  9 00:37:32 Tabris kernel: docker0: port 2(vethffda899) entered blocking state
Jan  9 00:37:32 Tabris kernel: docker0: port 2(vethffda899) entered disabled state
Jan  9 00:37:32 Tabris kernel: device vethffda899 entered promiscuous mode
Jan  9 00:37:32 Tabris kernel: docker0: port 2(vethffda899) entered blocking state
Jan  9 00:37:32 Tabris kernel: docker0: port 2(vethffda899) entered forwarding state
Jan  9 00:37:32 Tabris kernel: bond0: (slave eth0): link status definitely down, disabling slave
Jan  9 00:37:32 Tabris kernel: device eth0 left promiscuous mode
Jan  9 00:37:32 Tabris kernel: bond0: now running without any active interface!
Jan  9 00:37:32 Tabris kernel: docker0: port 2(vethffda899) entered disabled state
Jan  9 00:37:32 Tabris kernel: br0: port 1(bond0) entered disabled state
Jan  9 00:37:32 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface br-cd2cd4596443.IPv6 with address fe80::42:f6ff:fe14:796.
Jan  9 00:37:32 Tabris  avahi-daemon[3636]: New relevant interface br-cd2cd4596443.IPv6 for mDNS.
Jan  9 00:37:32 Tabris  avahi-daemon[3636]: Registering new address record for fe80::42:f6ff:fe14:796 on br-cd2cd4596443.*.
Jan  9 00:37:32 Tabris kernel: eth0: renamed from vethd51e7ae
Jan  9 00:37:32 Tabris kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vethffda899: link becomes ready
Jan  9 00:37:32 Tabris kernel: docker0: port 2(vethffda899) entered blocking state
Jan  9 00:37:32 Tabris kernel: docker0: port 2(vethffda899) entered forwarding state
Jan  9 00:37:32 Tabris rc.docker: SABnzbdVPN: started succesfully!
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface veth8a50a7a.IPv6 with address fe80::24b2:4aff:fe39:3447.
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: New relevant interface veth8a50a7a.IPv6 for mDNS.
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: Registering new address record for fe80::24b2:4aff:fe39:3447 on veth8a50a7a.*.
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface veth4016e87.IPv6 with address fe80::94d7:49ff:fef3:b175.
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: New relevant interface veth4016e87.IPv6 for mDNS.
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: Registering new address record for fe80::94d7:49ff:fef3:b175 on veth4016e87.*.
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface docker0.IPv6 with address fe80::42:e1ff:fe48:44d9.
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: New relevant interface docker0.IPv6 for mDNS.
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: Registering new address record for fe80::42:e1ff:fe48:44d9 on docker0.*.
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface veth110943b.IPv6 with address fe80::58b0:caff:fe85:b89b.
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: New relevant interface veth110943b.IPv6 for mDNS.
Jan  9 00:37:33 Tabris  avahi-daemon[3636]: Registering new address record for fe80::58b0:caff:fe85:b89b on veth110943b.*.
Jan  9 00:37:33 Tabris  nmbd[3606]: [2023/01/09 00:37:33.269536,  0] ../../source3/nmbd/nmbd.c:59(terminate)
Jan  9 00:37:33 Tabris  nmbd[3606]:   Got SIGTERM: going down...
Jan  9 00:37:33 Tabris  wsdd2[3616]: 'Terminated' signal received.
Jan  9 00:37:33 Tabris  winbindd[3619]: [2023/01/09 00:37:33.271642,  0] ../../source3/winbindd/winbindd_dual.c:1957(winbindd_sig_term_handler)
Jan  9 00:37:33 Tabris  winbindd[3619]:   Got sig[15] terminate (is_parent=1)
Jan  9 00:37:33 Tabris  wsdd2[3616]: terminating.
Jan  9 00:37:33 Tabris  winbindd[4959]: [2023/01/09 00:37:33.272079,  0] ../../source3/winbindd/winbindd_dual.c:1957(winbindd_sig_term_handler)
Jan  9 00:37:33 Tabris  winbindd[4959]:   Got sig[15] terminate (is_parent=0)
Jan  9 00:37:34 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface veth2dee507.IPv6 with address fe80::4497:bbff:fe61:7b35.
Jan  9 00:37:34 Tabris  avahi-daemon[3636]: New relevant interface veth2dee507.IPv6 for mDNS.
Jan  9 00:37:34 Tabris  avahi-daemon[3636]: Registering new address record for fe80::4497:bbff:fe61:7b35 on veth2dee507.*.
Jan  9 00:37:34 Tabris  avahi-daemon[3636]: Joining mDNS multicast group on interface vethffda899.IPv6 with address fe80::60ed:9cff:fe38:521f.
Jan  9 00:37:34 Tabris  avahi-daemon[3636]: New relevant interface vethffda899.IPv6 for mDNS.
Jan  9 00:37:34 Tabris  avahi-daemon[3636]: Registering new address record for fe80::60ed:9cff:fe38:521f on vethffda899.*.
Jan  9 00:37:35 Tabris  smbd[7770]: [2023/01/09 00:37:35.454717,  0] ../../source3/smbd/server.c:1741(main)
Jan  9 00:37:35 Tabris  smbd[7770]:   smbd version 4.17.3 started.
Jan  9 00:37:35 Tabris  smbd[7770]:   Copyright Andrew Tridgell and the Samba Team 1992-2022
Jan  9 00:37:35 Tabris  nmbd[7819]: [2023/01/09 00:37:35.481414,  0] ../../source3/nmbd/nmbd.c:901(main)
Jan  9 00:37:35 Tabris  nmbd[7819]:   nmbd version 4.17.3 started.
Jan  9 00:37:35 Tabris  nmbd[7819]:   Copyright Andrew Tridgell and the Samba Team 1992-2022
Jan  9 00:37:35 Tabris  wsdd2[7918]: starting.
Jan  9 00:37:35 Tabris  winbindd[7919]: [2023/01/09 00:37:35.580904,  0] ../../source3/winbindd/winbindd.c:1440(main)
Jan  9 00:37:35 Tabris  winbindd[7919]:   winbindd version 4.17.3 started.
Jan  9 00:37:35 Tabris  winbindd[7919]:   Copyright Andrew Tridgell and the Samba Team 1992-2022
Jan  9 00:37:35 Tabris  winbindd[7953]: [2023/01/09 00:37:35.588649,  0] ../../source3/winbindd/winbindd_cache.c:3116(initialize_winbindd_cache)
Jan  9 00:37:35 Tabris  winbindd[7953]:   initialize_winbindd_cache: clearing cache and re-creating with version number 2
Jan  9 00:37:35 Tabris tips.and.tweaks: Tweaks Applied
Jan  9 00:37:35 Tabris unassigned.devices: Mounting 'Auto Mount' Remote Shares...
Jan  9 00:37:35 Tabris  sudo:     root : PWD=/ ; USER=root ; COMMAND=/bin/bash -c /usr/local/emhttp/plugins/unbalance/unbalance -port 6237
Jan  9 00:37:35 Tabris  sudo: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Jan  9 00:37:35 Tabris  ntpd[1017]: Deleting interface #1 br0, 192.168.11.100#123, interface stats: received=6, sent=6, dropped=0, active_time=68 secs
Jan  9 00:37:35 Tabris  ntpd[1017]: 93.93.131.118 local addr 192.168.11.100 -> <null>
Jan  9 00:37:36 Tabris kernel: r8169 0000:03:00.0 eth0: Link is Up - 1Gbps/Full - flow control off
Jan  9 00:37:36 Tabris kernel: bond0: (slave eth0): link status definitely up, 1000 Mbps full duplex
Jan  9 00:37:36 Tabris kernel: bond0: (slave eth0): making interface the new active one
Jan  9 00:37:36 Tabris kernel: device eth0 entered promiscuous mode
Jan  9 00:37:36 Tabris kernel: bond0: active interface up!
Jan  9 00:37:36 Tabris kernel: br0: port 1(bond0) entered blocking state
Jan  9 00:37:36 Tabris kernel: br0: port 1(bond0) entered forwarding state
Jan  9 00:37:37 Tabris  ntpd[1017]: Listen normally on 3 br0 192.168.11.100:123
Jan  9 00:37:37 Tabris  ntpd[1017]: new interface(s) found: waking up resolver
Jan  9 00:37:51 Tabris kernel: mdcmd (43): nocheck cancel
Jan  9 00:37:51 Tabris kernel: md: recovery thread: exit status: -4
Jan  9 00:37:58 Tabris  nmbd[7859]: [2023/01/09 00:37:58.530106,  0] ../../source3/nmbd/nmbd_become_lmb.c:398(become_local_master_stage2)
Jan  9 00:37:58 Tabris  nmbd[7859]:   *****
Jan  9 00:37:58 Tabris  nmbd[7859]:   
Jan  9 00:37:58 Tabris  nmbd[7859]:   Samba name server TABRIS is now a local master browser for workgroup WORKGROUP on subnet 172.17.0.1
Jan  9 00:37:58 Tabris  nmbd[7859]:   
Jan  9 00:37:58 Tabris  nmbd[7859]:   *****
Jan  9 00:37:58 Tabris  nmbd[7859]: [2023/01/09 00:37:58.530181,  0] ../../source3/nmbd/nmbd_become_lmb.c:398(become_local_master_stage2)
Jan  9 00:37:58 Tabris  nmbd[7859]:   *****
Jan  9 00:37:58 Tabris  nmbd[7859]:   
Jan  9 00:37:58 Tabris  nmbd[7859]:   Samba name server TABRIS is now a local master browser for workgroup WORKGROUP on subnet 172.18.0.1
Jan  9 00:37:58 Tabris  nmbd[7859]:   
Jan  9 00:37:58 Tabris  nmbd[7859]:   *****
Jan  9 00:37:58 Tabris  nmbd[7859]: [2023/01/09 00:37:58.530245,  0] ../../source3/nmbd/nmbd_become_lmb.c:398(become_local_master_stage2)
Jan  9 00:37:58 Tabris  nmbd[7859]:   *****
Jan  9 00:37:58 Tabris  nmbd[7859]:   
Jan  9 00:37:58 Tabris  nmbd[7859]:   Samba name server TABRIS is now a local master browser for workgroup WORKGROUP on subnet 192.168.11.100
Jan  9 00:37:58 Tabris  nmbd[7859]:   
Jan  9 00:37:58 Tabris  nmbd[7859]:   *****
Jan  9 00:37:58 Tabris  nmbd[7859]: [2023/01/09 00:37:58.530307,  0] ../../source3/nmbd/nmbd_become_lmb.c:398(become_local_master_stage2)
Jan  9 00:37:58 Tabris  nmbd[7859]:   *****
Jan  9 00:37:58 Tabris  nmbd[7859]:   
Jan  9 00:37:58 Tabris  nmbd[7859]:   Samba name server TABRIS is now a local master browser for workgroup WORKGROUP on subnet 192.168.122.1
Jan  9 00:37:58 Tabris  nmbd[7859]:   
Jan  9 00:37:58 Tabris  nmbd[7859]:   *****
Jan  9 00:38:01 Tabris nginx: 2023/01/09 00:38:01 [error] 2831#2831: *496 open() "/usr/local/emhttp/plugins/dynamix.file.manager/javascript/ace/mode-log.js" failed (2: No such file or directory) while sending to client, client: 192.168.11.14, server: , request: "GET /plugins/dynamix.file.manager/javascript/ace/mode-log.js HTTP/1.1", host: "192.168.11.100", referrer: "http://192.168.11.100/Shares/Browse?dir=/mnt/user/syslog-store"
 

 

 

tabris-diagnostics-20230108-2347.zip

Edited by Saldash
Link to comment
  • 1 year later...

I am also suddenly experiencing complete system lockups as soon as the Docker service starts. I'm running Unraid 6.12.6 with Docker networks ipvlan. When I manually edit Unraid USB /config/docker.cfg to ENABLE_DOCKER="no" and reboot, the system is ok. As soon as I enable Docker, web UI hangs. Same behavior when VMs are enabled or disabled, and whether in safe mode or not. When I reboot I immediately manually start SSH tail streams for tail syslog, zpool iostat, and htop, all are ok and then freeze when Docker starts.

Here is my syslog until it freezes:

 

Jan 30 00:00:37 Tower root: Starting Samba:  /usr/sbin/smbd -D
Jan 30 00:00:37 Tower root:                  /usr/sbin/nmbd -D
Jan 30 00:00:37 Tower root:                  /usr/sbin/wsdd2 -d -4
Jan 30 00:00:37 Tower root:                  /usr/sbin/winbindd -D
Jan 30 00:00:37 Tower wsdd2[8123]: starting.
Jan 30 00:00:37 Tower emhttpd: shcmd (162): /etc/rc.d/rc.avahidaemon restart
Jan 30 00:00:37 Tower root: Stopping Avahi mDNS/DNS-SD Daemon: stopped
Jan 30 00:00:37 Tower avahi-daemon[7498]: Got SIGTERM, quitting.
Jan 30 00:00:37 Tower avahi-dnsconfd[7507]: read(): EOF
Jan 30 00:00:37 Tower avahi-daemon[7498]: Leaving mDNS multicast group on interface br0.IPv4 with address 192.168.1.100.
Jan 30 00:00:37 Tower avahi-daemon[7498]: avahi-daemon 0.8 exiting.
Jan 30 00:00:37 Tower root: Starting Avahi mDNS/DNS-SD Daemon: /usr/sbin/avahi-daemon -D
Jan 30 00:00:37 Tower avahi-daemon[8170]: Found user 'avahi' (UID 61) and group 'avahi' (GID 214).
Jan 30 00:00:37 Tower avahi-daemon[8170]: Successfully dropped root privileges.
Jan 30 00:00:37 Tower avahi-daemon[8170]: avahi-daemon 0.8 starting up.
Jan 30 00:00:37 Tower avahi-daemon[8170]: Successfully called chroot().
Jan 30 00:00:37 Tower avahi-daemon[8170]: Successfully dropped remaining capabilities.
Jan 30 00:00:37 Tower avahi-daemon[8170]: Loading service file /services/sftp-ssh.service.
Jan 30 00:00:37 Tower avahi-daemon[8170]: Loading service file /services/smb.service.
Jan 30 00:00:37 Tower avahi-daemon[8170]: Loading service file /services/ssh.service.
Jan 30 00:00:37 Tower avahi-daemon[8170]: Joining mDNS multicast group on interface br0.IPv4 with address 192.168.1.100.
Jan 30 00:00:37 Tower avahi-daemon[8170]: New relevant interface br0.IPv4 for mDNS.
Jan 30 00:00:37 Tower avahi-daemon[8170]: Network interface enumeration completed.
Jan 30 00:00:37 Tower avahi-daemon[8170]: Registering new address record for 192.168.1.100 on br0.IPv4.
Jan 30 00:00:37 Tower emhttpd: shcmd (163): /etc/rc.d/rc.avahidnsconfd restart
Jan 30 00:00:37 Tower root: Stopping Avahi mDNS/DNS-SD DNS Server Configuration Daemon: stopped
Jan 30 00:00:37 Tower root: Starting Avahi mDNS/DNS-SD DNS Server Configuration Daemon:  /usr/sbin/avahi-dnsconfd -D
Jan 30 00:00:37 Tower avahi-dnsconfd[8179]: Successfully connected to Avahi daemon.
Jan 30 00:00:37 Tower emhttpd: shcmd (171): /usr/local/sbin/mount_image '/mnt/cache/system/docker/docker-xfs.img' /var/lib/docker 128
Jan 30 00:00:37 Tower kernel: loop3: detected capacity change from 0 to 268435456
Jan 30 00:00:37 Tower kernel: XFS (loop3): Mounting V5 Filesystem
Jan 30 00:00:37 Tower kernel: XFS (loop3): Ending clean mount
Jan 30 00:00:37 Tower root: meta-data=/dev/loop3             isize=512    agcount=4, agsize=8388608 blks
Jan 30 00:00:37 Tower root:          =                       sectsz=512   attr=2, projid32bit=1
Jan 30 00:00:37 Tower root:          =                       crc=1        finobt=1, sparse=1, rmapbt=0
Jan 30 00:00:37 Tower root:          =                       reflink=1    bigtime=1 inobtcount=1 nrext64=0
Jan 30 00:00:37 Tower root: data     =                       bsize=4096   blocks=33554432, imaxpct=25
Jan 30 00:00:37 Tower root:          =                       sunit=0      swidth=0 blks
Jan 30 00:00:37 Tower root: naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
Jan 30 00:00:37 Tower root: log      =internal log           bsize=4096   blocks=16384, version=2
Jan 30 00:00:37 Tower root:          =                       sectsz=512   sunit=0 blks, lazy-count=1
Jan 30 00:00:37 Tower root: realtime =none                   extsz=4096   blocks=0, rtextents=0
Jan 30 00:00:37 Tower emhttpd: shcmd (173): /etc/rc.d/rc.docker start
Jan 30 00:00:37 Tower root: starting dockerd ...
Jan 30 00:00:38 Tower avahi-daemon[8170]: Server startup complete. Host name is Tower.local. Local service cookie is 2005410125.
Jan 30 00:00:39 Tower avahi-daemon[8170]: Service "Tower" (/services/ssh.service) successfully established.
Jan 30 00:00:39 Tower avahi-daemon[8170]: Service "Tower" (/services/smb.service) successfully established.
Jan 30 00:00:39 Tower avahi-daemon[8170]: Service "Tower" (/services/sftp-ssh.service) successfully established.
Jan 30 00:00:41 Tower kernel: Bridge firewalling registered
Jan 30 00:00:41 Tower kernel: Initializing XFRM netlink socket

 

Some text anonymized. Diagnostics attached.

This began after setting up Immich with Docker Compose Manager plugin and heavy workload on 3-way mirror importing files.

I'm aware of disk1 ZFS specific file corruption that has been benign for months and I am rather certain is irrelevant. Have yet to run memtest, but RAM sticks are a couple months old out of the manufacturer box.

Tower-diagnostics-20240130-0021.zip

 

EDIT: My Docker image was most likely full. I resolved this by replacing my system/docker.img with a new vdisk double the size, and fresh pulling my template images.

 

2ND EDIT: As is the law of the universe, after an entire evening of re-pulling images, as I'm declaring victory and about to go to bed, the instant after I publish my edit, the full system halt returns at the exact same log line as above. My docker.img is absolutely not full. I manually isolate container CPUs to all except the first core (and its second thread) and this started after running a container with that cpusets change. There is something poisonous about Unraid 6.12.6 ZFS with an XFS vDisk Docker.img. It infuriates and terrifies me. Pray for me.

Edited by ZooMass
Link to comment

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...