(SOLVED) Server sporadically freezes, no networking at all, weird output on monitor


Divi


Unraid version: 6.8.2 official

 

Installed plugins: Preclear, CA Backup/Restore Appdata, Community Applications, Dynamix S3 Sleep (not enabled, though), Dynamix System Information, Dynamix System Statistics, Dynamix System Temperature, Fix Common Problems, Nerd Tools, Network UPS Tools (NUT), Tips and Tweaks, Unassigned Devices, Unassigned Devices Plus

Dockers: MQTT, pihole-template, plex, shinobi_pro, UniFi

VMs: Generic Linux running HassOS 3.11 (Cores 7 / 15, and 4 GB RAM assigned)

 

Hardware:

MoBo: AsRock Fatal1ty AB350 Gaming K4 (Global C-States Control = disabled)

CPU: Ryzen 7 1700 @ stock, cooled by AMD Wraith Prism

RAM: 4x4 GB of Chinese 2400 MHz DDR4

GPU: Asus GT520 Silent

Disks: 2x4TB Seagate Ironwolf, no cache yet

PSU: Corsair VS650

Networking: Intel i350-T4 installed but not in use, as I couldn't make LACP work. Running a single gigabit connection from the integrated Realtek NIC with 4 VLANs.

 

Problem:

The server sometimes freezes. It has happened twice in a couple of days, and both times the server was idling when it crashed. Networking is lost: it doesn't answer ping and doesn't allow SSH. I had a similar problem when it was first installed, but after disabling Global C-States Control it was stable until now. The last thing I installed before the problems started was the CA Backup/Restore plugin, which I configured to back up to an external SMB share mounted via the Unassigned Devices plugin. A crash has never happened at the time the backup is scheduled to run. I have also messed around with Home Assistant dockers and an MQTT docker, and lately switched to Home Assistant running in a VM. The first crash (between February 17 and 18) happened before I switched from docker to the HassOS VM (February 19). I have the syslog from after the first crash up to the second crash, which I noticed last night, February 20, at about 23:00. 192.168.5.69 is my main PC, which was offline at 17:27 as seen in the syslog.

I'm curious what mover was trying to do at Feb 20 03:40:01, as mover should be disabled because no cache is installed.

Oh, and when it has crashed, a hard power-off is the only way out. The power button doesn't do anything, even if held down for a long time. I need to switch the PSU off or pull the power cord from the UPS.

 

Syslog between crashes:

Feb 18 21:18:40 Tower rsyslogd: [origin software="rsyslogd" swVersion="8.1908.0" x-pid="8449" x-info="https://www.rsyslog.com"] start
Feb 18 21:27:50 Tower webGUI: Successful login user root from 192.168.5.69
Feb 18 21:28:08 Tower kernel: mdcmd (44): set md_num_stripes 1280
Feb 18 21:28:08 Tower kernel: mdcmd (45): set md_queue_limit 80
Feb 18 21:28:08 Tower kernel: mdcmd (46): set md_sync_limit 5
Feb 18 21:28:08 Tower kernel: mdcmd (47): set md_write_method
Feb 18 21:28:08 Tower kernel: mdcmd (48): set spinup_group 0 0
Feb 18 21:28:08 Tower kernel: mdcmd (49): set spinup_group 1 0
Feb 18 21:28:08 Tower kernel: mdcmd (50): start STOPPED
Feb 18 21:28:08 Tower kernel: unraid: allocating 15750K for 1280 stripes (3 disks)
Feb 18 21:28:08 Tower kernel: md1: running, size: 3907018532 blocks
Feb 18 21:28:08 Tower emhttpd: shcmd (796): udevadm settle
Feb 18 21:28:08 Tower root: Starting diskload
Feb 18 21:28:09 Tower tips.and.tweaks: Tweaks Applied
Feb 18 21:28:09 Tower emhttpd: Mounting disks...
Feb 18 21:28:09 Tower emhttpd: shcmd (798): /sbin/btrfs device scan
Feb 18 21:28:09 Tower root: Scanning for Btrfs filesystems
Feb 18 21:28:09 Tower emhttpd: shcmd (799): mkdir -p /mnt/disk1
Feb 18 21:28:09 Tower emhttpd: shcmd (800): mount -t xfs -o noatime,nodiratime /dev/md1 /mnt/disk1
Feb 18 21:28:09 Tower kernel: XFS (md1): Mounting V5 Filesystem
Feb 18 21:28:10 Tower kernel: XFS (md1): Ending clean mount
Feb 18 21:28:10 Tower emhttpd: shcmd (801): xfs_growfs /mnt/disk1
Feb 18 21:28:10 Tower root: meta-data=/dev/md1               isize=512    agcount=4, agsize=244188659 blks
Feb 18 21:28:10 Tower root:          =                       sectsz=512   attr=2, projid32bit=1
Feb 18 21:28:10 Tower root:          =                       crc=1        finobt=1, sparse=1, rmapbt=0
Feb 18 21:28:10 Tower root:          =                       reflink=1
Feb 18 21:28:10 Tower root: data     =                       bsize=4096   blocks=976754633, imaxpct=5
Feb 18 21:28:10 Tower root:          =                       sunit=0      swidth=0 blks
Feb 18 21:28:10 Tower root: naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
Feb 18 21:28:10 Tower root: log      =internal log           bsize=4096   blocks=476930, version=2
Feb 18 21:28:10 Tower root:          =                       sectsz=512   sunit=0 blks, lazy-count=1
Feb 18 21:28:10 Tower root: realtime =none                   extsz=4096   blocks=0, rtextents=0
Feb 18 21:28:10 Tower emhttpd: shcmd (802): sync
Feb 18 21:28:10 Tower emhttpd: shcmd (803): mkdir /mnt/user
Feb 18 21:28:10 Tower emhttpd: shcmd (804): /usr/local/sbin/shfs /mnt/user -disks 2 -o noatime,allow_other -o remember=0  |& logger
Feb 18 21:28:10 Tower shfs: use_ino: 1
Feb 18 21:28:10 Tower shfs: direct_io: 0
Feb 18 21:28:10 Tower emhttpd: shcmd (806): /usr/local/sbin/update_cron
Feb 18 21:28:11 Tower root: Delaying execution of fix common problems scan for 10 minutes
Feb 18 21:28:11 Tower unassigned.devices: Mounting 'Auto Mount' Devices...
Feb 18 21:28:11 Tower emhttpd: Starting services...
Feb 18 21:28:11 Tower emhttpd: shcmd (808): /etc/rc.d/rc.samba restart
Feb 18 21:28:12 Tower rsyslogd: [origin software="rsyslogd" swVersion="8.1908.0" x-pid="18333" x-info="https://www.rsyslog.com"] start
Feb 18 21:28:13 Tower root: Starting Samba:  /usr/sbin/smbd -D
Feb 18 21:28:13 Tower root:                  /usr/sbin/nmbd -D
Feb 18 21:28:13 Tower root:                  /usr/sbin/wsdd 
Feb 18 21:28:13 Tower root:                  /usr/sbin/winbindd -D
Feb 18 21:28:13 Tower emhttpd: shcmd (822): /usr/local/sbin/mount_image '/mnt/user/system/docker/docker.img' /var/lib/docker 25
Feb 18 21:28:13 Tower kernel: BTRFS info (device loop2): disk space caching is enabled
Feb 18 21:28:13 Tower kernel: BTRFS info (device loop2): has skinny extents
Feb 18 21:28:14 Tower kernel: BTRFS info (device loop2): new size for /dev/loop2 is 26843545600
Feb 18 21:28:14 Tower root: Resize '/var/lib/docker' of 'max'
Feb 18 21:28:14 Tower emhttpd: shcmd (824): /etc/rc.d/rc.docker start
Feb 18 21:28:14 Tower root: starting dockerd ...
Feb 18 21:28:17 Tower kernel: IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
Feb 18 21:28:19 Tower root: Fix Common Problems Version 2020.02.06
Feb 18 21:28:20 Tower rc.docker: 150b3faf200b8ec6ced31e944297f7dbaede48e9b87890f53c2cbc95e5b693c6
Feb 18 21:28:22 Tower rc.docker: 86a33cc6062a533bac58566ec1bf21267482041c7a7a69c7aa32079555d54d1f
Feb 18 21:28:22 Tower rc.docker: bf0a0b72cc54d961ea6810f8e537b3b131412f844f877d3e71845323b4ffe935
Feb 18 21:28:24 Tower rc.docker: 5a55816f0c8ca1900d9ea00debcfb48bfa5fc540ed269db8d1577d15088d493d
Feb 18 21:28:25 Tower rc.docker: 6b4a99c56c928fa509a1fe4df5800b16cd74b3e6a37bcab4a3f3006a5036555f
Feb 18 21:28:26 Tower kernel: mdcmd (51): check correct
Feb 18 21:28:26 Tower kernel: md: recovery thread: check P ...
Feb 18 21:28:28 Tower unassigned.devices: Mounting 'Auto Mount' Remote Shares...
Feb 18 21:28:28 Tower unassigned.devices: Mount SMB share '//192.168.5.69/unraidbackup' using SMB3 protocol.
Feb 18 21:28:28 Tower unassigned.devices: Mount SMB command: /sbin/mount -t cifs -o rw,nounix,iocharset=utf8,file_mode=0777,dir_mode=0777,uid=99,gid=100,vers=3.0,credentials='/tmp/unassigned.devices/credentials' '//192.168.5.69/unraidbackup' '/mnt/disks/192.168.5.69_unraidbackup'
Feb 18 21:28:28 Tower unassigned.devices: Successfully mounted '//192.168.5.69/unraidbackup' on '/mnt/disks/192.168.5.69_unraidbackup'.
Feb 18 21:28:28 Tower unassigned.devices: Adding SMB share '192.168.5.69_unraidbackup'.'
Feb 18 21:28:30 Tower root: Fix Common Problems: Warning: Syslog mirrored to flash
Feb 18 21:28:40 Tower kernel: eth0: renamed from vethcc69053
Feb 18 21:28:40 Tower kernel: device br0.20 entered promiscuous mode
Feb 18 21:29:07 Tower rc.docker: home-assistant: started succesfully!
Feb 18 21:29:56 Tower kernel: eth0: renamed from veth774cc56
Feb 18 21:30:42 Tower rc.docker: MQTT: started succesfully!
Feb 18 21:31:37 Tower kernel: eth0: renamed from veth45dcefd
Feb 18 21:31:37 Tower kernel: device br0.5 entered promiscuous mode
Feb 18 21:32:05 Tower rc.docker: pihole-template: started succesfully!
Feb 18 21:35:43 Tower kernel: eth0: renamed from veth3735a1e
Feb 18 21:36:42 Tower rc.docker: plex: started succesfully!
Feb 18 21:38:02 Tower root: Fix Common Problems Version 2020.02.06
Feb 18 21:39:13 Tower kernel: eth0: renamed from veth2a7b730
Feb 18 21:39:13 Tower kernel: device br0.15 entered promiscuous mode
Feb 18 21:40:45 Tower rc.docker: shinobi_pro: started succesfully!
Feb 18 21:40:52 Tower root: Fix Common Problems: Warning: Template URL for docker application  is missing.
Feb 18 21:40:53 Tower root: Fix Common Problems: Warning: Syslog mirrored to flash
Feb 18 21:42:46 Tower kernel: eth0: renamed from vethd578a9a
Feb 18 21:42:46 Tower kernel: device br0 entered promiscuous mode
Feb 18 21:44:20 Tower rc.docker: UniFi: started succesfully!
Feb 18 23:01:56 Tower kernel: vethcc69053: renamed from eth0
Feb 18 23:02:27 Tower kernel: eth0: renamed from vethf20d86d
Feb 18 23:29:08 Tower kernel: CIFS VFS: Server 192.168.5.69 has not responded in 180 seconds. Reconnecting...
Feb 19 03:40:01 Tower crond[2144]: exit status 3 from user root /usr/local/sbin/mover &> /dev/null
Feb 19 05:12:34 Tower kernel: md: sync done. time=27848sec
Feb 19 05:12:34 Tower kernel: md: recovery thread: exit status: 0
Feb 19 19:32:27 Tower kernel: CIFS VFS: Server 192.168.5.69 has not responded in 180 seconds. Reconnecting...
Feb 19 20:13:04 Tower emhttpd: Starting services...
Feb 19 20:13:04 Tower emhttpd: shcmd (3557): /etc/rc.d/rc.samba restart
Feb 19 20:13:07 Tower root: Starting Samba:  /usr/sbin/smbd -D
Feb 19 20:13:07 Tower root:                  /usr/sbin/nmbd -D
Feb 19 20:13:07 Tower root:                  /usr/sbin/wsdd 
Feb 19 20:13:07 Tower root:                  /usr/sbin/winbindd -D
Feb 19 20:13:07 Tower emhttpd: shcmd (3566): smbcontrol smbd close-share 'domains'
Feb 19 20:14:22 Tower emhttpd: Starting services...
Feb 19 20:14:22 Tower emhttpd: shcmd (3571): /etc/rc.d/rc.samba restart
Feb 19 20:14:24 Tower root: Starting Samba:  /usr/sbin/smbd -D
Feb 19 20:14:25 Tower root:                  /usr/sbin/nmbd -D
Feb 19 20:14:25 Tower root:                  /usr/sbin/wsdd 
Feb 19 20:14:25 Tower root:                  /usr/sbin/winbindd -D
Feb 19 20:14:25 Tower emhttpd: shcmd (3580): smbcontrol smbd close-share 'domains'
Feb 19 20:29:36 Tower login[15229]: ROOT LOGIN  on '/dev/pts/0'
Feb 19 20:31:50 Tower sudo:     root : TTY=pts/0 ; PWD=/mnt/user/domains/HassOS ; USER=root ; COMMAND=/usr/bin/qemu-img resize hassos_ova-3.11.qcow2 +20G
Feb 19 20:32:18 Tower ool www[18096]: /usr/local/emhttp/plugins/dynamix/scripts/emcmd 'cmdStatus=Apply'
Feb 19 20:32:18 Tower emhttpd: Starting services...
Feb 19 20:32:18 Tower emhttpd: shcmd (3618): /etc/rc.d/rc.samba restart
Feb 19 20:32:20 Tower root: Starting Samba:  /usr/sbin/smbd -D
Feb 19 20:32:20 Tower root:                  /usr/sbin/nmbd -D
Feb 19 20:32:20 Tower root:                  /usr/sbin/wsdd 
Feb 19 20:32:20 Tower root:                  /usr/sbin/winbindd -D
Feb 19 20:32:20 Tower emhttpd: shcmd (3637): /usr/local/sbin/mount_image '/mnt/user/system/libvirt/libvirt.img' /etc/libvirt 1
Feb 19 20:32:20 Tower kernel: BTRFS: device fsid 289b59d2-1e32-43f4-88f5-fa541d3fcb71 devid 1 transid 18 /dev/loop3
Feb 19 20:32:21 Tower kernel: BTRFS info (device loop3): disk space caching is enabled
Feb 19 20:32:21 Tower kernel: BTRFS info (device loop3): has skinny extents
Feb 19 20:32:21 Tower root: Resize '/etc/libvirt' of 'max'
Feb 19 20:32:21 Tower kernel: BTRFS info (device loop3): new size for /dev/loop3 is 1073741824
Feb 19 20:32:21 Tower emhttpd: shcmd (3639): /etc/rc.d/rc.libvirt start
Feb 19 20:32:21 Tower root: Starting virtlockd...
Feb 19 20:32:21 Tower root: Starting virtlogd...
Feb 19 20:32:21 Tower root: Starting libvirtd...
Feb 19 20:32:21 Tower kernel: tun: Universal TUN/TAP device driver, 1.6
Feb 19 20:32:21 Tower kernel: virbr0: port 1(virbr0-nic) entered blocking state
Feb 19 20:32:21 Tower kernel: virbr0: port 1(virbr0-nic) entered disabled state
Feb 19 20:32:21 Tower kernel: device virbr0-nic entered promiscuous mode
Feb 19 20:32:21 Tower kernel: virbr0: port 1(virbr0-nic) entered blocking state
Feb 19 20:32:21 Tower kernel: virbr0: port 1(virbr0-nic) entered listening state
Feb 19 20:32:21 Tower dnsmasq[25551]: started, version 2.80 cachesize 150
Feb 19 20:32:21 Tower dnsmasq[25551]: compile time options: IPv6 GNU-getopt no-DBus i18n IDN2 DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify dumpfile
Feb 19 20:32:21 Tower dnsmasq-dhcp[25551]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h
Feb 19 20:32:21 Tower dnsmasq-dhcp[25551]: DHCP, sockets bound exclusively to interface virbr0
Feb 19 20:32:21 Tower dnsmasq[25551]: reading /etc/resolv.conf
Feb 19 20:32:21 Tower dnsmasq[25551]: using nameserver 192.168.1.1#53
Feb 19 20:32:21 Tower dnsmasq[25551]: read /etc/hosts - 2 addresses
Feb 19 20:32:21 Tower dnsmasq[25551]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Feb 19 20:32:21 Tower dnsmasq-dhcp[25551]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Feb 19 20:32:21 Tower kernel: virbr0: port 1(virbr0-nic) entered disabled state
Feb 19 20:33:42 Tower kernel: br0: port 2(vnet0) entered blocking state
Feb 19 20:33:42 Tower kernel: br0: port 2(vnet0) entered disabled state
Feb 19 20:33:42 Tower kernel: device vnet0 entered promiscuous mode
Feb 19 20:33:42 Tower kernel: br0: port 2(vnet0) entered blocking state
Feb 19 20:33:42 Tower kernel: br0: port 2(vnet0) entered forwarding state
Feb 19 20:37:48 Tower kernel: br0: port 2(vnet0) entered disabled state
Feb 19 20:37:48 Tower kernel: device vnet0 left promiscuous mode
Feb 19 20:37:48 Tower kernel: br0: port 2(vnet0) entered disabled state
Feb 19 20:38:40 Tower kernel: br0.20: port 2(vnet0) entered blocking state
Feb 19 20:38:40 Tower kernel: br0.20: port 2(vnet0) entered disabled state
Feb 19 20:38:40 Tower kernel: device vnet0 entered promiscuous mode
Feb 19 20:38:40 Tower kernel: br0.20: port 2(vnet0) entered blocking state
Feb 19 20:38:40 Tower kernel: br0.20: port 2(vnet0) entered forwarding state
Feb 19 21:04:46 Tower kernel: vethf20d86d: renamed from eth0
Feb 19 21:05:00 Tower kernel: device br0.20 left promiscuous mode
Feb 19 21:05:00 Tower kernel: veth774cc56: renamed from eth0
Feb 19 21:05:13 Tower kernel: eth0: renamed from vethc42cdd2
Feb 19 21:05:13 Tower kernel: device br0.20 entered promiscuous mode
Feb 19 21:24:27 Tower kernel: br0.20: port 2(vnet0) entered disabled state
Feb 19 21:24:27 Tower kernel: device vnet0 left promiscuous mode
Feb 19 21:24:27 Tower kernel: br0.20: port 2(vnet0) entered disabled state
Feb 19 21:25:30 Tower kernel: br0.20: port 2(vnet0) entered blocking state
Feb 19 21:25:30 Tower kernel: br0.20: port 2(vnet0) entered disabled state
Feb 19 21:25:30 Tower kernel: device vnet0 entered promiscuous mode
Feb 19 21:25:30 Tower kernel: br0.20: port 2(vnet0) entered blocking state
Feb 19 21:25:30 Tower kernel: br0.20: port 2(vnet0) entered forwarding state
Feb 19 21:50:29 Tower webGUI: Successful login user root from 192.168.5.69
Feb 19 21:50:30 Tower login[13073]: ROOT LOGIN  on '/dev/pts/1'
Feb 19 21:51:48 Tower CA Backup/Restore: #######################################
Feb 19 21:51:48 Tower CA Backup/Restore: Community Applications appData Backup
Feb 19 21:51:48 Tower CA Backup/Restore: Applications will be unavailable during
Feb 19 21:51:48 Tower CA Backup/Restore: this process.  They will automatically
Feb 19 21:51:48 Tower CA Backup/Restore: be restarted upon completion.
Feb 19 21:51:48 Tower CA Backup/Restore: #######################################
Feb 19 21:51:48 Tower CA Backup/Restore: Stopping MQTT
Feb 19 21:51:49 Tower kernel: device br0.20 left promiscuous mode
Feb 19 21:51:49 Tower kernel: vethc42cdd2: renamed from eth0
Feb 19 21:51:51 Tower CA Backup/Restore: docker stop -t 60 MQTT
Feb 19 21:51:51 Tower CA Backup/Restore: Stopping pihole-template
Feb 19 21:51:55 Tower kernel: veth45dcefd: renamed from eth0
Feb 19 21:51:57 Tower CA Backup/Restore: docker stop -t 60 pihole-template
Feb 19 21:51:57 Tower CA Backup/Restore: Stopping plex
Feb 19 21:52:01 Tower kernel: device br0.5 left promiscuous mode
Feb 19 21:52:01 Tower kernel: veth3735a1e: renamed from eth0
Feb 19 21:52:02 Tower CA Backup/Restore: docker stop -t 60 plex
Feb 19 21:52:02 Tower CA Backup/Restore: Stopping shinobi_pro
Feb 19 21:52:03 Tower kernel: device br0.15 left promiscuous mode
Feb 19 21:52:03 Tower kernel: veth2a7b730: renamed from eth0
Feb 19 21:52:05 Tower CA Backup/Restore: docker stop -t 60 shinobi_pro
Feb 19 21:52:05 Tower CA Backup/Restore: Stopping UniFi
Feb 19 21:52:18 Tower kernel: device br0 left promiscuous mode
Feb 19 21:52:18 Tower kernel: vethd578a9a: renamed from eth0
Feb 19 21:52:20 Tower CA Backup/Restore: docker stop -t 60 UniFi
Feb 19 21:52:20 Tower CA Backup/Restore: Backing up USB Flash drive config folder to 
Feb 19 21:52:20 Tower CA Backup/Restore: Using command: /usr/bin/rsync  -avXHq --delete  --log-file="/var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log" /boot/ "/mnt/disks/192.168.5.69_unraidbackup/USB/" > /dev/null 2>&1
Feb 19 21:52:20 Tower CA Backup/Restore: Changing permissions on backup
Feb 19 21:52:20 Tower CA Backup/Restore: Backing up libvirt.img to /mnt/disks/192.168.5.69_unraidbackup/libvirt/
Feb 19 21:52:20 Tower CA Backup/Restore: Using Command: /usr/bin/rsync  -avXHq --delete  --log-file="/var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log" "/mnt/user/system/libvirt/libvirt.img" "/mnt/disks/192.168.5.69_unraidbackup/libvirt/" > /dev/null 2>&1
Feb 19 21:52:33 Tower CA Backup/Restore: Backing Up appData from /mnt/user/appdata/ to /mnt/disks/192.168.5.69_unraidbackup/Appdata/[email protected]
Feb 19 21:52:33 Tower CA Backup/Restore: Using command: cd '/mnt/user/appdata/' && /usr/bin/tar -cvaf '/mnt/disks/192.168.5.69_unraidbackup/Appdata/[email protected]/CA_backup.tar' --exclude 'docker.img'  * >> /var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log 2>&1 & echo $! > /tmp/ca.backup2/tempFiles/backupInProgress
Feb 19 21:52:38 Tower CA Backup/Restore: Backup Complete
Feb 19 21:52:38 Tower CA Backup/Restore: Verifying backup
Feb 19 21:52:38 Tower CA Backup/Restore: Using command: cd '/mnt/user/appdata/' && /usr/bin/tar --diff -C '/mnt/user/appdata/' -af '/mnt/disks/192.168.5.69_unraidbackup/Appdata/[email protected]/CA_backup.tar' > /var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log & echo $! > /tmp/ca.backup2/tempFiles/verifyInProgress
Feb 19 21:52:43 Tower kernel: eth0: renamed from veth3ab691e
Feb 19 21:52:43 Tower kernel: device br0.20 entered promiscuous mode
Feb 19 21:52:46 Tower kernel: eth0: renamed from veth755cd47
Feb 19 21:52:46 Tower kernel: device br0.5 entered promiscuous mode
Feb 19 21:52:48 Tower kernel: eth0: renamed from veth27dd72e
Feb 19 21:52:52 Tower kernel: eth0: renamed from veth2505066
Feb 19 21:52:52 Tower kernel: device br0.15 entered promiscuous mode
Feb 19 21:52:56 Tower kernel: eth0: renamed from veth0be2d7d
Feb 19 21:52:56 Tower kernel: device br0 entered promiscuous mode
Feb 19 21:52:56 Tower CA Backup/Restore: #######################
Feb 19 21:52:56 Tower CA Backup/Restore: appData Backup complete
Feb 19 21:52:56 Tower CA Backup/Restore: #######################
Feb 19 21:52:56 Tower CA Backup/Restore: Backup / Restore Completed
Feb 19 22:50:49 Tower kernel: device br0.15 left promiscuous mode
Feb 19 22:50:49 Tower kernel: veth2505066: renamed from eth0
Feb 19 22:50:53 Tower kernel: eth0: renamed from vethd4660e8
Feb 19 22:50:53 Tower kernel: device br0.15 entered promiscuous mode
Feb 19 23:11:07 Tower kernel: CIFS VFS: Server 192.168.5.69 has not responded in 180 seconds. Reconnecting...
Feb 20 03:40:01 Tower crond[2144]: exit status 3 from user root /usr/local/sbin/mover &> /dev/null
Feb 20 17:27:19 Tower kernel: CIFS VFS: Server 192.168.5.69 has not responded in 180 seconds. Reconnecting...
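
A log like the one above can be skimmed for the last entry before a freeze, for long silent stretches, and for the repeated CIFS timeouts. Below is a minimal Python sketch of that idea, not a tool from Unraid. The function names (`parse_line`, `summarize`), the 6-hour gap threshold, and the year 2020 are assumptions for illustration: classic syslog lines carry no year, so it has to be supplied. A long gap on an idle server is not proof of a freeze by itself, since an idle box simply logs very little.

```python
from datetime import datetime, timedelta


def parse_line(line, year=2020):
    """Return (timestamp, message) for a classic syslog line, or None.

    The year is an assumption passed in by the caller, because lines
    such as 'Feb 20 17:27:19 Tower kernel: ...' do not include one.
    """
    parts = line.split(None, 4)  # month, day, time, host, rest of line
    if len(parts) < 5:
        return None
    month, day, clock, _host, message = parts
    try:
        stamp = datetime.strptime(
            f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        return None  # not a timestamped line, skip it
    return stamp, message


def summarize(lines, gap=timedelta(hours=6), year=2020):
    """Collect the last entry, long silent gaps, and CIFS timeouts."""
    entries = [e for e in (parse_line(l, year) for l in lines) if e]
    gaps = [
        (a[0], b[0])
        for a, b in zip(entries, entries[1:])
        if b[0] - a[0] >= gap
    ]
    cifs = [
        stamp
        for stamp, msg in entries
        if "CIFS VFS" in msg and "has not responded" in msg
    ]
    return {
        "last": entries[-1] if entries else None,
        "gaps": gaps,
        "cifs_timeouts": cifs,
    }


# Three lines copied from the end of the syslog above.
SAMPLE = [
    "Feb 19 23:11:07 Tower kernel: CIFS VFS: Server 192.168.5.69 has not responded in 180 seconds. Reconnecting...",
    "Feb 20 03:40:01 Tower crond[2144]: exit status 3 from user root /usr/local/sbin/mover &> /dev/null",
    "Feb 20 17:27:19 Tower kernel: CIFS VFS: Server 192.168.5.69 has not responded in 180 seconds. Reconnecting...",
]

report = summarize(SAMPLE)
print("last entry:", report["last"])
print("gaps of 6h or more:", report["gaps"])
print("CIFS timeouts:", report["cifs_timeouts"])
```

On the three sample lines this reports one gap (Feb 20 03:40:01 to 17:27:19) and two CIFS timeouts, which matches the main PC being switched off rather than anything wrong on the server side.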

Diagnostics file. This is from after the reboot, though, because I can't get it out while the server is crashed.

https://www.dropbox.com/s/u67zkfv3plorvat/tower-diagnostics-20200221-2021.zip?dl=0

 

Screen after the second crash (I didn't find a way to scroll up to see the previous lines):

20200221_142416.thumb.jpg.6d3e325a5ad4e46bbaab44aa46dee0ed.jpg

 

HALP! 😵

 

Best regards, Divi from Finland


Like I wrote, at first I had the problem with power control on the Zen architecture: it crashed really fast after starting to idle. That seemed to be solved by disabling Global C-States Control. This was over a month ago, and the server was 100% stable for a month before these two recent crashes. I didn't find "Power Supply Idle Control" or anything similar in the BIOS, other than Global C-States Control.

20200221_212819.thumb.jpg.c6e5b5f64613662f1b83d72f54946eb3.jpg

 

I now dropped memory speed from 2400 to 1866, but again the hardware was stable for a month so I doubt this will have anything to do with the topic.

20200221_212806.thumb.jpg.a5d9c5ec55c2379974ee908d3345e088.jpg

 

BIOS version is also fairly recent, 5.80 from 2019/7/3. Newest is 6.30 from 2020/2/4. And again it was stable for a month... nothing changed there recently.

20200221_212532.thumb.jpg.dd2f5b3c07bdecc402f6614fa65bf85f.jpg

 


Run a memtest on the system with the memory at 2400 MHz and see if you get any errors. 1st gen Ryzen is very picky about memory. It's also possible that something in your build is on its way out, or that you simply got lucky not to have had any crashes up to this point. But like @johnnie.black hinted at, this sounds memory related. With all 4 DIMM slots populated, the max RAM speed for 1st gen Ryzen is 2133 if these are single-rank modules.
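
The rule of thumb above can be written down as a small lookup. This is only a sketch: the 4-DIMM single-rank figure (2133) comes from the post above, while the other three values are recalled from AMD's launch-era guidance for first-gen Ryzen and should be treated as assumptions to verify against the motherboard's memory support list. The names `SUPPORTED_SPEED`, `max_supported_speed` and `is_within_spec` are made up for illustration.

```python
# Assumed maximum supported DDR4 speeds (MT/s) for first-gen Ryzen,
# keyed by (number of DIMMs installed, rank of each DIMM).
# Only the (4, "single") entry is stated in the thread; the rest are
# recalled values and should be double-checked.
SUPPORTED_SPEED = {
    (2, "single"): 2667,
    (2, "dual"): 2400,
    (4, "single"): 2133,
    (4, "dual"): 1866,
}


def max_supported_speed(dimm_count, rank):
    """Look up the assumed maximum speed for a DIMM configuration."""
    return SUPPORTED_SPEED[(dimm_count, rank)]


def is_within_spec(dimm_count, rank, configured_speed):
    """True if the configured speed does not exceed the assumed maximum."""
    return configured_speed <= max_supported_speed(dimm_count, rank)


# Four dual-rank DIMMs at 2400 would be above the assumed limit,
# while 1866 would be within it.
print(is_within_spec(4, "dual", 2400))
print(is_within_spec(4, "dual", 1866))
```

Running within these numbers does not guarantee stability, and running above them does not guarantee crashes. It only shows which configurations are outside what the platform was specified for, so a memtest is still the real check.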


Yep it was down again:

20200222_113949.thumb.jpg.6255df2375ad4ded71c30e7de1f891e4.jpg

 

These came out: dual rank and most likely the cheapest sticks in the world. I'll memtest these in my main rig later. I want them out of the server anyway, so there's no need to memtest there.

20200222_115845.thumb.jpg.ec8fadfcb13fad53c3751284f66ecb3a.jpg

 

2x8 GB single-rank Kingstons are in. I will try safe mode if it still crashes, but I really hope it doesn't, as now there are no janky components left!

20200222_115812.thumb.jpg.cb8914362c6e5c5958b4d008ac2dcbf4.jpg

  • 1 year later...
  • 2 months later...
On 9/30/2021 at 10:23 AM, sevenmp said:

Interested in how you fixed this, as I have the same issue. Was it your RAM? Any changes you performed?

Memory speed. I switched those Alibaba DIMMs to proper Kingston ones (basic low-end Fury), dropped the speed a bit lower, and it has been rock solid ever since!
