The Transplant Posted February 20

I have two VMs configured: one with Home Assistant and the other with Windows. Everything was running for months, and then one morning several weeks ago Home Assistant crashed and I noticed the Windows VM was unresponsive. I tried restarting the VMs and rebooting Unraid, then read threads and started playing around with CPU assignments, etc. Somehow, and I don't think it was because of anything I did, the VMs started to perform fine and I forgot about it. I have read everything I can on CPU pinning, but to be honest the more I read the more I get confused. So perhaps my configuration here is the issue?

This morning I woke up and Home Assistant had crashed again and the Windows VM was unresponsive. So I dug in to do some research, and so far I can't find out what I am doing wrong. My VMs are running on a separate SSD that is set to cache specific. I have taken screenshots of anything I think could be useful. Dockers seem to be running fine. A parity check did start running on one of my reboots, but I paused that and scheduled it at night.

Thanks for any help, and I am happy to post anything else.

Recent logs:

Feb 20 17:31:36 Odin avahi-daemon[29221]: Joining mDNS multicast group on interface br0.IPv4 with address 192.168.2.110.
Feb 20 17:31:36 Odin avahi-daemon[29221]: New relevant interface br0.IPv4 for mDNS.
Feb 20 17:31:36 Odin avahi-daemon[29221]: Network interface enumeration completed.
Feb 20 17:31:36 Odin avahi-daemon[29221]: Registering new address record for 192.168.2.110 on br0.IPv4.
Feb 20 17:31:36 Odin emhttpd: shcmd (457): /etc/rc.d/rc.avahidnsconfd restart
Feb 20 17:31:36 Odin root: Stopping Avahi mDNS/DNS-SD DNS Server Configuration Daemon: stopped
Feb 20 17:31:36 Odin root: Starting Avahi mDNS/DNS-SD DNS Server Configuration Daemon: /usr/sbin/avahi-dnsconfd -D
Feb 20 17:31:36 Odin avahi-dnsconfd[29230]: Successfully connected to Avahi daemon.
Feb 20 17:31:37 Odin emhttpd: shcmd (472): /usr/local/sbin/mount_image '/mnt/user/system/libvirt/libvirt.img' /etc/libvirt 1
Feb 20 17:31:37 Odin kernel: loop2: detected capacity change from 0 to 2097152
Feb 20 17:31:37 Odin avahi-daemon[29221]: Server startup complete. Host name is Odin.local. Local service cookie is 1689060470.
Feb 20 17:31:37 Odin kernel: BTRFS: device fsid 7c8582b0-545c-4478-b3a9-c791bdac3979 devid 1 transid 603 /dev/loop2 scanned by mount (29289)
Feb 20 17:31:37 Odin kernel: BTRFS info (device loop2): using crc32c (crc32c-intel) checksum algorithm
Feb 20 17:31:37 Odin kernel: BTRFS info (device loop2): using free space tree
Feb 20 17:31:37 Odin kernel: BTRFS info (device loop2): enabling ssd optimizations
Feb 20 17:31:37 Odin root: Resize device id 1 (/dev/loop2) from 1.00GiB to max
Feb 20 17:31:37 Odin emhttpd: shcmd (474): /etc/rc.d/rc.libvirt start
Feb 20 17:31:37 Odin root: Starting virtlockd...
Feb 20 17:31:37 Odin root: Starting virtlogd...
Feb 20 17:31:37 Odin root: Starting libvirtd...
Feb 20 17:31:38 Odin dnsmasq[29445]: started, version 2.89 cachesize 150
Feb 20 17:31:38 Odin dnsmasq[29445]: compile time options: IPv6 GNU-getopt DBus no-UBus i18n IDN2 DHCP DHCPv6 no-Lua TFTP conntrack ipset no-nftset auth cryptohash DNSSEC loop-detect inotify dumpfile
Feb 20 17:31:38 Odin dnsmasq-dhcp[29445]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h
Feb 20 17:31:38 Odin dnsmasq-dhcp[29445]: DHCP, sockets bound exclusively to interface virbr0
Feb 20 17:31:38 Odin dnsmasq[29445]: reading /etc/resolv.conf
Feb 20 17:31:38 Odin dnsmasq[29445]: using nameserver 192.168.2.1#53
Feb 20 17:31:38 Odin dnsmasq[29445]: read /etc/hosts - 4 names
Feb 20 17:31:38 Odin dnsmasq[29445]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 names
Feb 20 17:31:38 Odin dnsmasq-dhcp[29445]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Feb 20 17:31:38 Odin kernel: L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.
Feb 20 17:31:38 Odin usb_manager: Info: rc.usb_manager Reset Connected Status
Feb 20 17:31:38 Odin avahi-daemon[29221]: Service "Odin" (/services/ssh.service) successfully established.
Feb 20 17:31:38 Odin avahi-daemon[29221]: Service "Odin" (/services/smb.service) successfully established.
Feb 20 17:31:38 Odin avahi-daemon[29221]: Service "Odin" (/services/sftp-ssh.service) successfully established.
Feb 20 17:31:55 Odin flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
Feb 20 17:32:21 Odin kernel: br0: port 2(vnet0) entered blocking state
Feb 20 17:32:21 Odin kernel: br0: port 2(vnet0) entered disabled state
Feb 20 17:32:21 Odin kernel: device vnet0 entered promiscuous mode
Feb 20 17:32:21 Odin kernel: br0: port 2(vnet0) entered blocking state
Feb 20 17:32:21 Odin kernel: br0: port 2(vnet0) entered forwarding state
Feb 20 17:32:21 Odin usb_manager: Info: rc.usb_manager vm_action Home Assistant prepare begin -
Feb 20 17:45:38 Odin kernel: br0: port 3(vnet1) entered blocking state
Feb 20 17:45:38 Odin kernel: br0: port 3(vnet1) entered disabled state
Feb 20 17:45:38 Odin kernel: device vnet1 entered promiscuous mode
Feb 20 17:45:38 Odin kernel: br0: port 3(vnet1) entered blocking state
Feb 20 17:45:38 Odin kernel: br0: port 3(vnet1) entered forwarding state
Feb 20 17:45:38 Odin usb_manager: Info: rc.usb_manager vm_action Outlook prepare begin -
Feb 20 17:59:16 Odin kernel: br0: port 3(vnet1) entered disabled state
Feb 20 17:59:16 Odin kernel: device vnet1 left promiscuous mode
Feb 20 17:59:16 Odin kernel: br0: port 3(vnet1) entered disabled state
Feb 20 17:59:16 Odin usb_manager: Info: rc.usb_manager vm_action Outlook stopped end -
Feb 20 18:01:58 Odin kernel: br0: port 3(vnet2) entered blocking state
Feb 20 18:01:58 Odin kernel: br0: port 3(vnet2) entered disabled state
Feb 20 18:01:58 Odin kernel: device vnet2 entered promiscuous mode
Feb 20 18:01:58 Odin kernel: br0: port 3(vnet2) entered blocking state
Feb 20 18:01:58 Odin kernel: br0: port 3(vnet2) entered forwarding state
Feb 20 18:01:58 Odin usb_manager: Info: rc.usb_manager vm_action Outlook prepare begin -

The Transplant Posted February 21

Diagnostics are attached. odin-diagnostics-20240220-2144.zip

The Transplant Posted February 21

I don't know if this is significant, but when I look at the domains and system shares I see that they are spread across my two SSD cache drives.

The Transplant Posted February 22

Any thoughts on this? Could it be something really simple and/or stupid that I am doing?

The Transplant Posted February 26 (Solution)

So I found the problem. My cache SSD was failing, and it has now failed completely. I am sure there was some way for me to have seen this coming, but I didn't see any errors, and I didn't see anything connecting the speed of the VMs with an imminent failure of the cache drive.
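
For anyone landing here with similar symptoms, one way to look for early signs is to scan the syslog for kernel lines that mention storage trouble. The following is only a minimal sketch under stated assumptions: it relies on the log layout shown in the first post, the function name find_storage_warnings and the SUSPECT substrings are made up for illustration and are not an official or complete list, and a failing drive may log nothing at all (as it apparently did here).

```python
import re

# Syslog layout as seen in the first post:
#   "Feb 20 17:31:37 Odin kernel: BTRFS info (device loop2): using free space tree"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "  # timestamp
    r"(?P<host>\S+) "                               # hostname, e.g. Odin
    r"(?P<tag>[^\s:\[]+)(?:\[\d+\])?: "             # program tag with optional [pid]
    r"(?P<msg>.*)$"                                 # rest of the message
)

# ASSUMPTION: illustrative substrings only, not an exhaustive or official list.
SUSPECT = ("i/o error", "btrfs error", "btrfs warning", "medium error")


def find_storage_warnings(lines):
    """Return (timestamp, message) pairs for kernel lines that look storage-related."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group("tag") != "kernel":
            continue  # skip unparsable lines and non-kernel sources
        msg = match.group("msg")
        if any(s in msg.lower() for s in SUSPECT):
            hits.append((match.group("ts"), msg))
    return hits
```

To try it, pass the function the lines of a saved syslog, for example `find_storage_warnings(open("/var/log/syslog"))` (the path is an assumption; adjust it to wherever the log lives). None of the lines posted above would be flagged, which matches the experience described in this thread, so checking the drive's SMART data directly is probably still worthwhile.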