Malachi89 Posted February 12, 2019 (edited)

Hey guys, odd issue with VMs. I have been running two Unraid servers for around 2 years now without issue. One is solely storage for my client data (TITAN); the other is my personal media server (ASGARD). ASGARD is running an i7 and has been running a Win 10 VM for 2 years without issue; I use it strictly for compressing files. Both have compatible hardware for VMs.

About a month ago, the VM on ASGARD started randomly shutting off for no apparent reason. Sometimes it will come back on; sometimes it just comes up as failed when I try to start it. When it does come back on, it lasts anywhere from 5 minutes to 5 hours before it shuts down again, a complete crash.

On a whim, thinking maybe the SSD was going, I pulled the VM SSD (set up using Unassigned Devices) from ASGARD, moved it over to TITAN, pointed a VM at it, and booted it up. Runs perfectly. Ran it for a week, compressed files, no issue whatsoever.

So... I'm at a loss. What the hell would cause the VM to do this on one machine but not the other? I am also unsure what kind of info would help here, as I've only posted on forums a handful of times, so if you need me to upload logs or screenshots, let me know. Any help would be greatly appreciated, as TITAN is supposed to be for client data only and I don't want to use it as my compression machine and mix files.

Edited February 12, 2019 by Malachi89
Warrentheo Posted February 12, 2019

Usually this is something filling up, like the SSD or something else... I had something like this when a repeated motherboard error kept filling up my syslog file; once it was full, the VMs would just lock up with no notification... Most likely something similar is happening here... If that doesn't help, we need the diagnostics file and the VM XML file...
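A quick way to test the "something is filling up" theory is to look at how full the relevant filesystems are. The following is only a minimal sketch: the function name `percent_used`, the 90% warning threshold, and the mount points `/var/log` and `/mnt/disks` are assumptions for illustration and may not match your server.

```python
import shutil


def percent_used(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a size of zero.
        return 0.0
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    # Assumed mount points; change them to match your own system.
    for mount in ("/var/log", "/mnt/disks"):
        try:
            pct = percent_used(mount)
        except OSError as exc:
            print(f"{mount}: cannot check ({exc})")
            continue
        flag = "  <-- nearly full" if pct >= 90 else ""
        print(f"{mount}: {pct:.1f}% used{flag}")
```

If the log location shows up as nearly full, the next step would be to find out which messages are being repeated in the log.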
Malachi89 Posted February 12, 2019 (edited)

Yeah, I had read that can happen and deliberately made the VM image slightly smaller to help prevent that. I have the diag file, but how do I get the VM XML?

Edited February 12, 2019 by Malachi89
Malachi89 Posted February 17, 2019

Anyone out there able to help?
GHunter Posted February 17, 2019

On 2/12/2019 at 5:29 PM, Malachi89 said: Yeah I had read that can happen and deliberately made the VM image slightly smaller to help prevent that. I have the diag file but how do I get the VM XML?

The VM XML is included in the diagnostics file, so no need to post it separately.
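If the XML ever needs to be pulled separately, libvirt's `virsh dumpxml` command prints a VM's definition. Below is a rough sketch of wrapping it in Python. The helper names `dumpxml_command` and `dump_vm_xml` are made up for this example, and the VM name "Windows 10" is an assumption based on the folder name in this thread; the actual VM name on your server may differ.

```python
import shutil
import subprocess


def dumpxml_command(domain):
    """Build the virsh command line that prints a VM's XML definition."""
    return ["virsh", "dumpxml", domain]


def dump_vm_xml(domain):
    """Return the VM's XML as text, or None if virsh is not installed."""
    if shutil.which("virsh") is None:
        return None
    result = subprocess.run(
        dumpxml_command(domain), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    try:
        # "Windows 10" is an assumed VM name; replace it with your own.
        xml_text = dump_vm_xml("Windows 10")
    except subprocess.CalledProcessError as exc:
        print(f"virsh failed: {exc.stderr.strip()}")
    else:
        print(xml_text if xml_text is not None else "virsh not found on this machine")
```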
Malachi89 Posted February 17, 2019

Gotcha! Here is the diagnostic file:

asgard-diagnostics-20190217-1629.zip
Warrentheo Posted February 18, 2019

Your system is having several issues... Libvirt keeps losing access to files:

2019-02-17 05:45:16.964+0000: 9822: info : libvirt version: 4.7.0
2019-02-17 05:45:16.964+0000: 9822: info : hostname: Asgard
2019-02-17 05:45:16.964+0000: 9822: error : qemuOpenFileAs:3143 : Failed to open file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory
2019-02-17 05:45:16.964+0000: 9823: error : qemuOpenFileAs:3143 : Failed to open file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory
2019-02-17 05:45:16.972+0000: 9821: error : qemuOpenFileAs:3143 : Failed to open file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory
2019-02-17 05:45:16.973+0000: 9821: error : qemuOpenFileAs:3143 : Failed to open file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory
2019-02-17 05:45:16.974+0000: 9821: error : qemuOpenFileAs:3143 : Failed to open file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory
2019-02-17 05:45:16.975+0000: 9823: error : qemuOpenFileAs:3143 : Failed to open file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory
2019-02-17 05:45:19.244+0000: 9824: error : virStorageFileReportBrokenChain:4776 : Cannot access storage file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory
2019-02-17 05:45:24.100+0000: 9821: error : qemuOpenFileAs:3143 : Failed to open file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory
2019-02-17 05:45:24.108+0000: 9825: error : qemuOpenFileAs:3143 : Failed to open file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory
2019-02-17 05:45:24.110+0000: 9825: error : qemuOpenFileAs:3143 : Failed to open file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory
2019-02-17 05:45:24.111+0000: 9825: error : qemuOpenFileAs:3143 : Failed to open file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory
2019-02-17 05:45:24.112+0000: 9822: error : qemuOpenFileAs:3143 : Failed to open file '/mnt/disks/KINGSTON_SV300S37A120G_50026B774609E873/Windows 10/vdisk1.img': No such file or directory

and your system keeps overheating:

.......
.......
Feb 17 16:15:08 Asgard kernel: CPU6: Package temperature/speed normal
Feb 17 16:15:08 Asgard kernel: CPU7: Package temperature/speed normal
Feb 17 16:15:08 Asgard kernel: CPU5: Package temperature/speed normal
Feb 17 16:15:08 Asgard kernel: CPU3: Package temperature/speed normal
Feb 17 16:15:08 Asgard kernel: CPU1: Package temperature/speed normal
Feb 17 16:15:08 Asgard kernel: CPU2: Package temperature/speed normal
Feb 17 16:15:08 Asgard kernel: CPU1: Core temperature above threshold, cpu clock throttled (total events = 2144810)
Feb 17 16:15:08 Asgard kernel: CPU5: Core temperature above threshold, cpu clock throttled (total events = 2144810)
Feb 17 16:15:08 Asgard kernel: CPU1: Core temperature/speed normal
Feb 17 16:15:08 Asgard kernel: CPU5: Core temperature/speed normal
Feb 17 16:20:28 Asgard nginx: 2019/02/17 16:20:28 [error] 8876#8876: *203316 readv() failed (104: Connection reset by peer) while reading upstream, client: 192.168.1.2, server: , request: "POST /webGui/include/DeviceList.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.1.106", referrer: "http://192.168.1.106/Main"
Feb 17 16:20:49 Asgard nginx: 2019/02/17 16:20:49 [error] 8876#8876: *203381 readv() failed (104: Connection reset by peer) while reading upstream, client: 192.168.1.2, server: , request: "POST /plugins/unassigned.devices/UnassignedDevices.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.1.106", referrer: "http://192.168.1.106/Main"
Feb 17 16:21:10 Asgard nginx: 2019/02/17 16:21:10 [error] 8876#8876: *203455 readv() failed (104: Connection reset by peer) while reading upstream, client: 192.168.1.2, server: , request: "POST /webGui/include/DeviceList.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.1.106", referrer: "http://192.168.1.106/Main"
Feb 17 16:21:51 Asgard nginx: 2019/02/17 16:21:51 [error] 8876#8876: *203589 readv() failed (104: Connection reset by peer) while reading upstream, client: 192.168.1.2, server: , request: "POST /webGui/include/DeviceList.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.1.106", referrer: "http://192.168.1.106/Main"
Feb 17 16:22:33 Asgard nginx: 2019/02/17 16:22:33 [error] 8876#8876: *203724 readv() failed (104: Connection reset by peer) while reading upstream, client: 192.168.1.2, server: , request: "POST /webGui/include/DeviceList.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.1.106", referrer: "http://192.168.1.106/Main"
Feb 17 16:23:57 Asgard nginx: 2019/02/17 16:23:57 [error] 8876#8876: *203990 readv() failed (104: Connection reset by peer) while reading upstream, client: 192.168.1.2, server: , request: "POST /plugins/unassigned.devices/UnassignedDevices.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.1.106", referrer: "http://192.168.1.106/Main"

Notice where it shows:

.....cpu clock throttled (total events = 2144810)

Most likely your log files are filling up with these CPU messages, and this eventually kills libvirt... Either way, check your hardware... Once you get the logs to stop screaming at you, you will probably fix the issues with libvirt and your VMs...

Also, to GHunter: I am not able to find the XML files in the diagnostics... Does it matter if they have the "Anonymous" version of the diagnostics? Or am I just going blind?

On 2/17/2019 at 1:38 PM, GHunter said: The VM XML is included in the diagnostics file so no need to post it separately
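For anyone who wants to spot this pattern in their own logs, the throttle messages can be tallied with a few lines of Python. This is only a sketch: the function name `summarize_throttling` and the regular expression are written for this example and are based solely on the kernel message format shown in this thread, and the sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Matches kernel lines such as:
# "... kernel: CPU1: Core temperature above threshold, cpu clock throttled (total events = 2144810)"
THROTTLE_RE = re.compile(
    r"kernel: (CPU\d+): (Core|Package) temperature above threshold, "
    r"cpu clock throttled \(total events = (\d+)\)"
)


def summarize_throttling(lines):
    """Count throttle messages per CPU and track the highest event total seen."""
    per_cpu = Counter()
    max_events = 0
    for line in lines:
        match = THROTTLE_RE.search(line)
        if match:
            per_cpu[match.group(1)] += 1
            max_events = max(max_events, int(match.group(3)))
    return per_cpu, max_events


# Sample lines taken from the log excerpt in this thread.
SAMPLE = [
    "Feb 17 16:15:08 Asgard kernel: CPU2: Package temperature/speed normal",
    "Feb 17 16:15:08 Asgard kernel: CPU1: Core temperature above threshold, cpu clock throttled (total events = 2144810)",
    "Feb 17 16:15:08 Asgard kernel: CPU5: Core temperature above threshold, cpu clock throttled (total events = 2144810)",
    "Feb 17 16:15:08 Asgard kernel: CPU1: Core temperature/speed normal",
]

per_cpu, max_events = summarize_throttling(SAMPLE)
print(dict(per_cpu), max_events)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `summarize_throttling(open("/var/log/syslog", errors="replace"))`; that path is an assumption and may differ on your system.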
Malachi89 Posted February 18, 2019 (edited)

Interesting! Guess I should learn to read these diagnostics haha. Been running fine for years so not sure why it’s overheating now, but I’ll toss a bigger CPU cooler in there and see if that helps and go from there!

Edited February 18, 2019 by Malachi89
Warrentheo Posted February 18, 2019 (edited)

5 minutes ago, Malachi89 said: Interesting! Guess I should learn to read these diagnostics haha. Been running fine for years so not sure why it’s overheating now, but I’ll toss a bigger CPU cooler in there and see if that helps and go from there!

It could also just be your thermal grease drying out, or the cooler having been bumped and losing good thermal contact... There are several possibilities... If it is a thermal grease issue, you might want to look into this; it is a new product and not widely mentioned yet: https://www.amazon.com/gp/product/B07CKVW18G/ref=ox_sc_act_title_3?smid=A23NVCSO4PYH3S&psc=1

Edit: it could also be something like a new video card overheating the case; again, there are many possible causes... I would not jump straight to buying something to fix it...

Edited February 18, 2019 by Warrentheo
Malachi89 Posted February 19, 2019

In all seriousness, it definitely needs an aftermarket cooler LOL. It’s an i7 at 4.0 GHz, and it gets worked hard with just a stock cooler. I did go out and buy a Cooler Master 212 Evo, which was on sale for about $25 at my local computer store. I will run that for a couple of days with the VM and see how it goes. I also double-checked everything else and found that one of my front fans had stopped working, so I replaced it with a spare I had at home. Fingers crossed this does the trick.