0xjams (Author), posted December 23, 2023 (edited)

9 hours ago, ich777 said: So it started to be unreliable after a power outage, correct?

It started to be unreliable when I changed the BIOS configuration so that the server turns back on once the power outage is over. Today we had an outage with that BIOS option disabled. The server turned itself off automatically via NUT, I turned it on manually after the power was back, and I had no problems.

For today's test, which yielded good results, these were the changes I made:
- I turned off all of my VMs a few hours before the outage.
- I disabled the BIOS option that makes the computer turn back on automatically after the event is over.
- I configured the "force shut down" parameter to 600 seconds.

So this leaves me with these ideas:
- Either something BIOS related is messing things up.
- Maybe something during boot requires internet. If the server turns on as soon as the power is back, the router or the switch the server is connected to might not be ready.
- Having to shut down a few VMs as part of the shutdown process is leaving the server in an inconsistent state. Without those VMs, the server took 2 minutes to shut down.

Edited December 23, 2023 by 0xjams
ich777, posted December 23, 2023

3 hours ago, 0xjams said: I disabled the BIOS option that makes the computer turn back on automatically after the event is over.

I don't think that this option makes your server go crazy.

3 hours ago, 0xjams said: Either something BIOS related is messing things up.

This is more likely the case, but as said above, I can't imagine that the power-on-after-failure option alone will mess things up.

3 hours ago, 0xjams said: Maybe something during boot requires internet.

Yes and no. It is good to have internet when the server boots back up, since it checks for new packages on every boot, but even without internet it will work just fine; the boot process will just take a bit longer.

3 hours ago, 0xjams said: Having to shut down a few VMs as part of the shutting down process is leaving the server in an inconsistent state. Without those VMs, the server took 2 minutes to shut down.

Are you sure that the VMs shut down correctly when they receive the stop command from libvirt? Also, what timeouts have you set in your settings for the VMs (and maybe also for Docker)? What happens when you let everything run and then issue `powerdown` from an Unraid terminal: does the server shut down correctly?

I'm asking again because you haven't answered yet: have you looked into /boot/logs to see whether there are any Diagnostics? The server will create Diagnostics when it fails to shut down.
JonathanM, posted December 23, 2023

10 hours ago, 0xjams said: Having to shut down a few VMs as part of the shutting down process

Install the NUT client in slave mode in all your VMs. Set them to shut down first. For example, you could set the first VM to start shutting down after 2 minutes of power loss, the next at 3, and so on, then have the host shut down after 5 minutes of power failure.

Keep in mind that SLA batteries in most consumer UPS models wear out much quicker when discharged below 50% capacity, so for best life you want to get completely shut down ASAP. Unless you have a commercial grade unit with extra battery racks, the UPS is meant to get you safely shut down, not to continue operating during an outage. This is especially true if you have frequent outages.
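The staggered timing JonathanM describes can be sketched as simple arithmetic. This is only an illustration of the schedule (2 minutes for the first VM, one more minute per additional VM, 5 minutes for the host); it does not read or write any NUT configuration, and the VM names `vm1` and `vm2` and the function and class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ShutdownStep:
    name: str
    delay_seconds: int  # seconds on battery before this machine starts shutting down


def staggered_schedule(vm_names, first_delay=120, step=60, host_delay=300):
    """Return shutdown steps: VMs first at increasing delays, the host last."""
    steps = [
        ShutdownStep(name, first_delay + i * step)
        for i, name in enumerate(vm_names)
    ]
    last_vm_delay = steps[-1].delay_seconds if steps else 0
    # The host must not start shutting down before (or together with) the last VM.
    if host_delay <= last_vm_delay:
        raise ValueError("host delay must be later than every VM delay")
    steps.append(ShutdownStep("host", host_delay))
    return steps


if __name__ == "__main__":
    # Hypothetical VM names, used only for illustration.
    for s in staggered_schedule(["vm1", "vm2"]):
        print(f"{s.name}: start shutdown after {s.delay_seconds} s on battery")
```

With the default values, a fourth VM would land on the same 5 minute mark as the host, so the function raises an error instead of returning an overlapping schedule. In that case either the step between VMs or the host delay would need to change.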
Frank1940, posted December 23, 2023

One more thing to try is to set up the syslog server (write to flash drive mode). After a 'power outage' and auto restart, upload the resulting syslog. That may give one of the Gurus some clue as to what is happening.
0xjams (Author), posted December 24, 2023

18 hours ago, ich777 said: I'm asking again because you haven't yet answered, have you yet looked into /boot/logs if there are any Diagnostics? The server will create Diagnostics when it fails to shutdown.

Hi, I found a file that was created the last day the issue took place:

groudon-diagnostics-20231221-1851.zip
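To check when a Diagnostics archive was created, the timestamp can be read from the file name. The sketch below assumes the name follows the pattern `<hostname>-diagnostics-YYYYMMDD-HHMM.zip`, which is inferred only from the single file name posted above; the function name is hypothetical.

```python
import re
from datetime import datetime

# Assumed pattern, inferred from one example: <hostname>-diagnostics-YYYYMMDD-HHMM.zip
PATTERN = re.compile(r"^(?P<host>.+)-diagnostics-(?P<stamp>\d{8}-\d{4})\.zip$")


def parse_diagnostics_name(filename):
    """Return (hostname, creation time) or None if the name does not match."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    created = datetime.strptime(match.group("stamp"), "%Y%m%d-%H%M")
    return match.group("host"), created


if __name__ == "__main__":
    print(parse_diagnostics_name("groudon-diagnostics-20231221-1851.zip"))
```

For the file above this gives the host name `groudon` and a creation time of December 21, 2023 at 18:51.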
Vr2Io, posted December 25, 2023 (edited)

On 12/24/2023 at 8:44 AM, 0xjams said: Hi I found a file that was created the last day the issue took place.

Below is the log from the last 4 minutes; it does not look like a clean shutdown.

Dec 21 18:47:05 groudon shutdown[13781]: shutting down for system halt
Dec 21 18:51:13 groudon root: umount: /mnt/disk1: target is busy.
Dec 21 18:51:13 groudon emhttpd: shcmd (106): exit status: 32
Dec 21 18:51:13 groudon emhttpd: Retry unmounting disk share(s)...

To test the UPS-triggered server shutdown, please simulate it first instead of actually cutting power to the UPS; otherwise you may kill the battery.

upsmon -c fsd

Once that all works, test with a real power cut.

Edited December 25, 2023 by Vr2Io
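The same check can be done on a longer syslog with a short script. This is a minimal sketch that looks only for the two messages quoted above ("shutting down for system halt" and "target is busy") and measures the time between them. The year is not part of the syslog timestamps, so 2023 is supplied as an assumption; the function names are hypothetical.

```python
from datetime import datetime

# Log lines copied from the post above.
LOG = """\
Dec 21 18:47:05 groudon shutdown[13781]: shutting down for system halt
Dec 21 18:51:13 groudon root: umount: /mnt/disk1: target is busy.
Dec 21 18:51:13 groudon emhttpd: shcmd (106): exit status: 32
Dec 21 18:51:13 groudon emhttpd: Retry unmounting disk share(s)...
"""


def parse_time(line, year=2023):
    # Syslog timestamps omit the year, so the caller has to supply it.
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def summarize(log_text, year=2023):
    """Return the shutdown start time and every 'target is busy' line with its time."""
    start = None
    busy = []
    for line in log_text.splitlines():
        if "shutting down for system halt" in line and start is None:
            start = parse_time(line, year)
        elif "target is busy" in line:
            busy.append((parse_time(line, year), line))
    return start, busy


if __name__ == "__main__":
    start, busy = summarize(LOG)
    for when, line in busy:
        waited = (when - start).total_seconds()
        print(f"{waited:.0f} s after shutdown started: {line}")
```

For the excerpt above, the failed unmount of /mnt/disk1 is reported 248 seconds (4 minutes 8 seconds) after the shutdown started.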