The built-in powerdown script can easily hang, preventing a shutdown



After some testing with the script that includes logging, the time it takes emhttp to process the request on my test server is about 2 seconds.  I think you are all overreacting to the time delay.  The time that emhttp runs is not the complete shutdown of the server.  VMs and Dockers are stopped after emhttp runs.

 

Here is a section of the log on my test server with the logging I've implemented:

Sep 20 18:13:26 Tower Powerdown: Rebooting server...
Sep 20 18:13:26 Tower emhttp: shcmd (799): beep -r 2
Sep 20 18:13:28 Tower emhttp: Spinning up all drives...
Sep 20 18:13:28 Tower emhttp: shcmd (800): /usr/sbin/hdparm -S0 /dev/sdf &> /dev/null
Sep 20 18:13:28 Tower kernel: mdcmd (39): nocheck 
Sep 20 18:13:28 Tower kernel: md: nocheck_array: check not active
Sep 20 18:13:28 Tower kernel: mdcmd (40): spinup 1
Sep 20 18:13:28 Tower kernel: mdcmd (41): spinup 2
Sep 20 18:13:28 Tower kernel: mdcmd (42): spinup 3
Sep 20 18:13:28 Tower Powerdown: Shutdown took 2 seconds...
Sep 20 18:13:28 Tower kernel: usb 3-3.3: USB disconnect, device number 51
Sep 20 18:13:30 Tower kernel: usb 3-3.3: new low-speed USB device number 52 using xhci_hcd
Sep 20 18:13:30 Tower emhttp: Stopping services...
Sep 20 18:13:30 Tower cache_dirs: Stopping cache_dirs process 4379
Sep 20 18:13:30 Tower emhttp: shcmd (802): /etc/rc.d/rc.libvirt stop |& logger
Sep 20 18:13:30 Tower root: Waiting on VMs to shutdown
Sep 20 18:13:30 Tower kernel: usb 3-3.3: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
Sep 20 18:13:30 Tower root: Stopping libvirtd...
Sep 20 18:13:30 Tower kernel: input: Microsoft  Microsoft Basic Optical Mouse v2.0  as /devices/pci0000:00/0000:00:14.0/usb3/3-3/3-3.3/3-3.3:1.0/0003:045E:00CB.0032/input/input54
Sep 20 18:13:30 Tower dnsmasq[6148]: exiting on receipt of SIGTERM
Sep 20 18:13:30 Tower kernel: device virbr0-nic left promiscuous mode
Sep 20 18:13:30 Tower kernel: virbr0: port 1(virbr0-nic) entered disabled state
Sep 20 18:13:30 Tower avahi-daemon[3957]: Interface virbr0.IPv4 no longer relevant for mDNS.
Sep 20 18:13:30 Tower avahi-daemon[3957]: Leaving mDNS multicast group on interface virbr0.IPv4 with address 192.168.122.1.
Sep 20 18:13:30 Tower avahi-daemon[3957]: Withdrawing address record for 192.168.122.1 on virbr0.
Sep 20 18:13:30 Tower kernel: hid-generic 0003:045E:00CB.0032: input,hidraw3: USB HID v1.11 Mouse [Microsoft  Microsoft Basic Optical Mouse v2.0 ] on usb-0000:00:14.0-3.3/input0
Sep 20 18:13:30 Tower root: Network a4007147-6d28-4b27-8a73-0b1a1672c02b destroyed
Sep 20 18:13:30 Tower root: 
Sep 20 18:13:33 Tower emhttp: shcmd (803): umount /etc/libvirt
Sep 20 18:13:33 Tower emhttp: shcmd (805): /etc/rc.d/rc.docker stop |& logger
Sep 20 18:13:33 Tower root: stopping docker ...
Sep 20 18:13:33 Tower kernel: vethd41fd78: renamed from eth0
Sep 20 18:13:33 Tower kernel: docker0: port 1(vethbccff76) entered disabled state
Sep 20 18:13:34 Tower kernel: docker0: port 1(vethbccff76) entered disabled state
Sep 20 18:13:34 Tower kernel: device vethbccff76 left promiscuous mode
Sep 20 18:13:34 Tower kernel: docker0: port 1(vethbccff76) entered disabled state
Sep 20 18:13:35 Tower root: 8db51cbf2599
Sep 20 18:13:36 Tower ntpd[1643]: Deleting interface #6 docker0, 172.17.0.1#123, interface stats: received=0, sent=0, dropped=0, active_time=2819 secs
Sep 20 18:13:36 Tower avahi-daemon[3957]: Interface docker0.IPv4 no longer relevant for mDNS.
Sep 20 18:13:36 Tower avahi-daemon[3957]: Leaving mDNS multicast group on interface docker0.IPv4 with address 172.17.0.1.
Sep 20 18:13:36 Tower avahi-daemon[3957]: Withdrawing address record for 172.17.0.1 on docker0.
Sep 20 18:13:36 Tower emhttp: shcmd (806): umount /var/lib/docker |& logger
Sep 20 18:13:37 Tower emhttp: shcmd (807): /etc/rc.d/rc.samba stop |& logger
Sep 20 18:13:37 Tower emhttp: shcmd (808): rm -f /etc/avahi/services/smb.service
Sep 20 18:13:37 Tower avahi-daemon[3957]: Files changed, reloading.
Sep 20 18:13:37 Tower avahi-daemon[3957]: Service group file /services/smb.service vanished, removing services.
Sep 20 18:13:37 Tower emhttp: shcmd (810): /etc/rc.d/rc.nfsd stop |& logger
Sep 20 18:13:37 Tower rpc.mountd[3947]: Caught signal 15, un-registering and exiting.
Sep 20 18:13:38 Tower kernel: nfsd: last server has exited, flushing export cache
Sep 20 18:13:38 Tower emhttp: shcmd (812): /etc/rc.d/rc.avahidaemon stop |& logger
Sep 20 18:13:38 Tower root: Stopping Avahi mDNS/DNS-SD Daemon: stopped
Sep 20 18:13:38 Tower avahi-daemon[3957]: Got SIGTERM, quitting.
Sep 20 18:13:38 Tower avahi-dnsconfd[3966]: read(): EOF
Sep 20 18:13:38 Tower avahi-daemon[3957]: Leaving mDNS multicast group on interface br0.1.IPv4 with address 169.254.206.15.
Sep 20 18:13:38 Tower avahi-daemon[3957]: Leaving mDNS multicast group on interface br0.IPv4 with address 192.168.1.153.
Sep 20 18:13:38 Tower avahi-daemon[3957]: avahi-daemon 0.6.32 exiting.
Sep 20 18:13:38 Tower emhttp: shcmd (813): /etc/rc.d/rc.avahidnsconfd stop |& logger
Sep 20 18:13:38 Tower root: Stopping Avahi mDNS/DNS-SD DNS Server Configuration Daemon: stopped
Sep 20 18:13:38 Tower emhttp: Sync filesystems...
Sep 20 18:13:38 Tower emhttp: shcmd (814): sync
Sep 20 18:13:40 Tower Recycle Bin: Stopping Recycle Bin
Sep 20 18:13:40 Tower emhttp: Stopping Recycle Bin...
Sep 20 18:13:42 Tower emhttp: Recycle Bin stopped...
Sep 20 18:13:42 Tower Recycle Bin: Recycle Bin stopped
Sep 20 18:13:42 Tower emhttp: 
Sep 20 18:13:42 Tower emhttp: 
Sep 20 18:13:42 Tower emhttp: shcmd (815): set -o pipefail ; umount /mnt/user |& logger
Sep 20 18:13:42 Tower emhttp: shcmd (816): rmdir /mnt/user |& logger
Sep 20 18:13:42 Tower emhttp: shcmd (817): set -o pipefail ; umount /mnt/user0 |& logger
Sep 20 18:13:42 Tower emhttp: shcmd (818): rmdir /mnt/user0 |& logger
Sep 20 18:13:42 Tower emhttp: shcmd (819): rm -f /boot/config/plugins/dynamix/mover.cron
Sep 20 18:13:42 Tower emhttp: shcmd (820): /usr/local/sbin/update_cron &> /dev/null
Sep 20 18:13:42 Tower emhttp: Unmounting disks...
Sep 20 18:13:42 Tower emhttp: shcmd (821): umount /mnt/disk1 |& logger
Sep 20 18:13:42 Tower kernel: XFS (md1): Unmounting Filesystem
Sep 20 18:13:42 Tower emhttp: shcmd (822): rmdir /mnt/disk1 |& logger
Sep 20 18:13:42 Tower emhttp: shcmd (823): umount /mnt/disk2 |& logger
Sep 20 18:13:43 Tower kernel: XFS (md2): Unmounting Filesystem
Sep 20 18:13:43 Tower emhttp: shcmd (824): rmdir /mnt/disk2 |& logger
Sep 20 18:13:43 Tower emhttp: shcmd (825): umount /mnt/disk3 |& logger
Sep 20 18:13:43 Tower kernel: XFS (md3): Unmounting Filesystem
Sep 20 18:13:43 Tower emhttp: shcmd (826): rmdir /mnt/disk3 |& logger
Sep 20 18:13:43 Tower emhttp: shcmd (827): umount /mnt/cache |& logger
Sep 20 18:13:43 Tower kernel: XFS (sdf1): Unmounting Filesystem
Sep 20 18:13:43 Tower emhttp: shcmd (828): rmdir /mnt/cache |& logger
Sep 20 18:13:43 Tower kernel: mdcmd (43): stop 
Sep 20 18:13:43 Tower kernel: md1: stopping
Sep 20 18:13:43 Tower kernel: md2: stopping
Sep 20 18:13:44 Tower kernel: md3: stopping
Sep 20 18:13:46 Tower emhttp: shcmd (829): rmmod md-mod |& logger
Sep 20 18:13:46 Tower kernel: md: unRAID driver removed
Sep 20 18:13:46 Tower emhttp: shcmd (830): modprobe md-mod super=/boot/config/super.dat |& logger
Sep 20 18:13:46 Tower kernel: md: unRAID driver 2.6.6 installed
Sep 20 18:13:46 Tower emhttp: Pro key detected, GUID: 0781-5530-3514-21114FE1B094 FILE: /boot/config/Pro.key
Sep 20 18:13:46 Tower emhttp: Device inventory:
Sep 20 18:13:46 Tower emhttp: shcmd (831): udevadm settle
Sep 20 18:13:46 Tower emhttp: SanDisk_Cruzer_351421114FE1B094-0:0 (sda) 7847900
Sep 20 18:13:46 Tower emhttp: Samsung_SSD_850_PRO_256GB_S1SUNSAF902713T (sdb) 250059064
Sep 20 18:13:46 Tower emhttp: WDC_WD10EADS-00L5B1_WD-WCAU47447856 (sdc) 976762552
Sep 20 18:13:46 Tower emhttp: WDC_WD20EURS-63S48Y0_WD-WCAZAH868851 (sdd) 1953514552
Sep 20 18:13:46 Tower emhttp: WDC_WD10EARS-00Y5B1_WD-WCAV55063796 (sde) 976762552
Sep 20 18:13:46 Tower emhttp: WDC_WD5000AACS-00ZUB0_WD-WCASU6424348 (sdf) 488386552
Sep 20 18:13:46 Tower kernel: mdcmd (1): import 0
Sep 20 18:13:46 Tower kernel: md: import_slot: 0 empty
Sep 20 18:13:46 Tower kernel: mdcmd (2): import 1 sdc 976762552 0 WDC_WD10EADS-00L5B1_WD-WCAU47447856
Sep 20 18:13:46 Tower kernel: md: import disk1: (sdc) WDC_WD10EADS-00L5B1_WD-WCAU47447856 size: 976762552 
Sep 20 18:13:46 Tower kernel: mdcmd (3): import 2 sdd 1953514552 0 WDC_WD20EURS-63S48Y0_WD-WCAZAH868851
Sep 20 18:13:46 Tower kernel: md: import disk2: (sdd) WDC_WD20EURS-63S48Y0_WD-WCAZAH868851 size: 1953514552 
Sep 20 18:13:46 Tower kernel: mdcmd (4): import 3 sde 976762552 0 WDC_WD10EARS-00Y5B1_WD-WCAV55063796
Sep 20 18:13:46 Tower kernel: md: import disk3: (sde) WDC_WD10EARS-00Y5B1_WD-WCAV55063796 size: 976762552 
Sep 20 18:13:46 Tower kernel: mdcmd (5): import 4
Sep 20 18:13:46 Tower kernel: mdcmd (6): import 5
Sep 20 18:13:46 Tower kernel: mdcmd (7): import 6
Sep 20 18:13:46 Tower kernel: mdcmd (8): import 7
Sep 20 18:13:46 Tower kernel: mdcmd (9): import 8
Sep 20 18:13:46 Tower kernel: mdcmd (10): import 9
Sep 20 18:13:46 Tower kernel: mdcmd (11): import 10
Sep 20 18:13:46 Tower kernel: mdcmd (12): import 11
Sep 20 18:13:46 Tower kernel: mdcmd (13): import 12
Sep 20 18:13:46 Tower kernel: mdcmd (14): import 13
Sep 20 18:13:46 Tower kernel: mdcmd (15): import 14
Sep 20 18:13:46 Tower kernel: mdcmd (16): import 15
Sep 20 18:13:46 Tower kernel: mdcmd (17): import 16
Sep 20 18:13:46 Tower kernel: mdcmd (18): import 17
Sep 20 18:13:46 Tower kernel: mdcmd (19): import 18
Sep 20 18:13:46 Tower kernel: mdcmd (20): import 19
Sep 20 18:13:46 Tower kernel: mdcmd (21): import 20
Sep 20 18:13:46 Tower kernel: mdcmd (22): import 21
Sep 20 18:13:46 Tower kernel: mdcmd (23): import 22
Sep 20 18:13:46 Tower kernel: mdcmd (24): import 23
Sep 20 18:13:46 Tower kernel: mdcmd (25): import 24
Sep 20 18:13:46 Tower kernel: mdcmd (26): import 25
Sep 20 18:13:46 Tower kernel: mdcmd (27): import 26
Sep 20 18:13:46 Tower kernel: mdcmd (28): import 27
Sep 20 18:13:46 Tower kernel: mdcmd (29): import 28
Sep 20 18:13:46 Tower kernel: mdcmd (30): import 29
Sep 20 18:13:46 Tower kernel: md: import_slot: 29 empty
Sep 20 18:13:46 Tower emhttp: import 30 cache device: sdf
Sep 20 18:13:46 Tower emhttp: import flash device: sda
Sep 20 18:13:46 Tower emhttp: shcmd (837): /etc/rc.d/rc.nfsd start |& logger
Sep 20 18:13:46 Tower root: Starting NFS server daemons:
Sep 20 18:13:46 Tower root:   /usr/sbin/exportfs -r
Sep 20 18:13:46 Tower root:   /usr/sbin/rpc.nfsd 8
Sep 20 18:13:46 Tower root:   /usr/sbin/rpc.mountd
Sep 20 18:13:46 Tower rpc.mountd[1150]: Version 1.3.3 starting
Sep 20 18:13:46 Tower emhttp: shcmd (839): /etc/rc.d/rc.avahidaemon start |& logger
Sep 20 18:13:46 Tower root: Starting Avahi mDNS/DNS-SD Daemon:  /usr/sbin/avahi-daemon -D
Sep 20 18:13:46 Tower avahi-daemon[1160]: Found user 'avahi' (UID 61) and group 'avahi' (GID 214).
Sep 20 18:13:46 Tower avahi-daemon[1160]: Successfully dropped root privileges.
Sep 20 18:13:46 Tower avahi-daemon[1160]: avahi-daemon 0.6.32 starting up.
Sep 20 18:13:46 Tower avahi-daemon[1160]: WARNING: No NSS support for mDNS detected, consider installing nss-mdns!
Sep 20 18:13:46 Tower avahi-daemon[1160]: Successfully called chroot().
Sep 20 18:13:46 Tower avahi-daemon[1160]: Successfully dropped remaining capabilities.
Sep 20 18:13:46 Tower avahi-daemon[1160]: Loading service file /services/sftp-ssh.service.
Sep 20 18:13:46 Tower avahi-daemon[1160]: Loading service file /services/smb.service.
Sep 20 18:13:46 Tower avahi-daemon[1160]: Loading service file /services/ssh.service.
Sep 20 18:13:46 Tower avahi-daemon[1160]: Joining mDNS multicast group on interface br0.1.IPv4 with address 169.254.206.15.
Sep 20 18:13:46 Tower avahi-daemon[1160]: New relevant interface br0.1.IPv4 for mDNS.
Sep 20 18:13:46 Tower avahi-daemon[1160]: Joining mDNS multicast group on interface br0.IPv4 with address 192.168.1.153.
Sep 20 18:13:46 Tower avahi-daemon[1160]: New relevant interface br0.IPv4 for mDNS.
Sep 20 18:13:46 Tower avahi-daemon[1160]: Network interface enumeration completed.
Sep 20 18:13:46 Tower avahi-daemon[1160]: Registering new address record for 169.254.206.15 on br0.1.IPv4.
Sep 20 18:13:46 Tower avahi-daemon[1160]: Registering new address record for 192.168.1.153 on br0.IPv4.
Sep 20 18:13:46 Tower emhttp: shcmd (840): /etc/rc.d/rc.avahidnsconfd start |& logger
Sep 20 18:13:46 Tower root: Starting Avahi mDNS/DNS-SD DNS Server Configuration Daemon:  /usr/sbin/avahi-dnsconfd -D
Sep 20 18:13:46 Tower avahi-dnsconfd[1169]: Successfully connected to Avahi daemon.
Sep 20 18:13:46 Tower emhttp: shcmd (845): /etc/rc.d/rc.ntpd stop |& logger
Sep 20 18:13:46 Tower ntpd[1643]: ntpd exiting on signal 1 (Hangup)
Sep 20 18:13:46 Tower ntpd[1643]: 127.127.1.0 local addr 127.0.0.1 -> <null>
Sep 20 18:13:46 Tower ntpd[1643]: 45.56.105.98 local addr 192.168.1.153 -> <null>
Sep 20 18:13:46 Tower root: Stopping NTP daemon...
Sep 20 18:13:46 Tower emhttp: shcmd (846): /sbin/reboot
Sep 20 18:13:46 Tower shutdown[1192]: shutting down for system reboot
Sep 20 18:13:47 Tower init: Switching to runlevel: 6
Sep 20 18:13:47 Tower avahi-daemon[1160]: Server startup complete. Host name is Tower.local. Local service cookie is 1980821688.
Sep 20 18:13:48 Tower avahi-daemon[1160]: Service "Tower" (/services/ssh.service) successfully established.
Sep 20 18:13:48 Tower avahi-daemon[1160]: Service "Tower" (/services/smb.service) successfully established.
Sep 20 18:13:48 Tower avahi-daemon[1160]: Service "Tower" (/services/sftp-ssh.service) successfully established.
Sep 20 18:13:49 Tower [1219][1220]: Capture diagnostics to /boot/logs

 

You can see the complete shutdown took 21 seconds.  The time for emhttp to complete its operation was 2 seconds.  Granted, I only have 1 Docker running and no VMs, but this gives you an idea of the timing of the shutdown.

 

So the stopping of VMs and Dockers is not included in the 30 second timer.


This is a test version only and should not be used in production at this time.

 

For those of you wanting to try the new script, I have added an option to the Tips and Tweaks plugin and created a beta test version where you can enable the new script, turn on the log archiving, and turn on the Ctrl-Alt-Del to do the proper shutdown.

 

If I can get a few of you to do some testing, we can refine the script and possibly work out some gremlins before LT goes live with it.

 

Here is the link to install the Tips and Tweaks beta test plugin.  Go to Settings->Tips and Tweaks and set "Enable Powerdown Enhancements" on.  The new powerdown script will be installed and will replace the built-in powerdown.

 

If you have the Tips and Tweaks plugin installed already, uninstall it before installing my beta test version.

 

https://github.com/dlandon/tips.and.tweaks/raw/master/tips.and.tweaks.beta.plg

 

If you have emhttp on a port other than the default port 80, the new script should work fine for you.  If anyone is using a port other than 80, please test and let me know the result.
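For anyone curious how the script copes with a non-default port: it asks 'lsof' which TCP port the emhttp process is listening on.  Here is a small self-contained demo of the same grep pattern, run against a canned lsof-style line (the sample line is made up for illustration; on a live server the input comes from 'lsof' itself):

```shell
# Simulated line of `lsof -i -P -sTCP:LISTEN` output, emhttp on port 8080.
sample='emhttp   1234  root  5u  IPv4  12345  0t0  TCP *:8080 (LISTEN)'

# -P: Perl-compatible regex, -o: print only the match, -m1: first match only.
# \K discards everything matched so far, leaving just the port digits.
port=$(echo "$sample" | grep -Pom1 '^emhttp.*:\K\d+')
echo "$port"    # 8080
```

The same pipeline against the real 'lsof' output is what lets the script work no matter what port you moved emhttp to.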

 

The command line options are:

- 'powerdown' to shut down the server completely.

- 'powerdown -r' to reboot the server.

 

The direct path to powerdown is /usr/local/sbin/powerdown.

 

Other events that will shut down the server are a UPS loss of power, Ctrl-Alt-Del, and a quick power button press.
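The Ctrl-Alt-Del hook is typically wired up through the init system.  A plausible sketch of what enabling it might look like in a Slackware-style /etc/inittab (hypothetical; the plugin's actual mechanism may differ):

```
# Point the ctrlaltdel action at the powerdown script instead of the
# stock shutdown command:
ca::ctrlaltdel:/usr/local/sbin/powerdown
```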

 

One other useful command is 'poweroff', which takes the server down pretty hard and should only be used as a last resort.

 

Don't use any other Linux shutdown or reboot commands as they will bypass the powerdown script.


This is a test version only and should not be used in production at this time.

 


 

Installed the Tips and Tweaks plugin and enabled this option.  emhttp runs on a non-standard port on my server, and powerdown, powerdown -r, a quick power button press, and UPS power down all worked for me.  I don't have a keyboard attached, so I could not try the Ctrl-Alt-Del.



 

Installed the Tips and Tweaks plugin and enabled this option.  emhttp runs on a non-standard port on my server, and powerdown, powerdown -r, a quick power button press, and UPS power down all worked for me.  I don't have a keyboard attached, so I could not try the Ctrl-Alt-Del.

 

Can you post the last log so I can take a look at the shutdown process?


Installed the Tips and Tweaks plugin

enabled powerdown

did a powerdown -r

and success!

I sent a PM to dlandon with the logs location.

This evening the server should auto shut down.

I will send new logs to dlandon again tomorrow.

 

Tom, if you want the logs too, let me know.

Sorry, I don't post logs on the boards any more due to script kiddies.


... You can see the complete shutdown took 21 seconds.  The time for emhttp to complete it's operation was 2 seconds.  Granted, I only have 1 Docker running and no VMs, but this gives you an idea of the timing of shutdown.

 

So the stopping of VMs and Dockers is not included in the 30 second timer.

 

This raises a couple of questions ...

 

=>  If emhttp completes okay, but a VM or Docker hangs and the system never shuts down, is this considered a "clean" shutdown by UnRAID?  (I'd assume so, but just want to confirm)

 

=>  If emhttp hangs and the "hammer" approach executes (poweroff), does that mean none of the Dockers or VMs get a chance to gracefully shut down?

 

=>  How many drives in your test system ... and if none of them are spinning, what does that do to the emhttp shutdown time?



I wish I could answer your questions, but I don't understand the built-in powerdown well enough to answer them.  I don't have enough time to reverse engineer the code to figure all this out.  This is part of the reason I deprecated the powerdown plugin and deferred the powerdown issues to LT.

 

The only thing I can see in the logs is that emhttp starts the shutdown process and then hands it off.


Can you post the last log so I can take a look at the shutdown process?

 

Here is a log snip from a powerdown I did just a few minutes ago to grab a fresh log for you:

 

You have a VM that is not nicely shutting down:

Sep 21 06:51:02 Tower emhttp: shcmd (7073): /etc/rc.d/rc.libvirt stop |& logger
Sep 21 06:51:07 Tower root: Domain 219f0b58-c98c-40ee-2fba-782b0aed902b is being shutdown
Sep 21 06:51:07 Tower root: 
Sep 21 06:52:08 Tower root: Waiting on VMs to shutdown............................................................
Sep 21 06:52:08 Tower root: The following VMs are still running, forcing shutdown: CVS

 

You need to look into this.



You need to look into this.

 

Yah, that thing has never done a nice shutdown.  I think it is the CVS server that causes it, but it has not had a problem restarting, so I have never worried about it.



Yah, that thing has never done a nice shutdown.  I think it is the CVS server that causes it, but it has not had a problem restarting, so I have never worried about it.

 

I would probably get it fixed.  Forcing a VM to shut down is not a good idea.  Is it a Windows VM?  What is a CVS server?



I would probably get it fixed.  Forcing a VM to shut down is not a good idea.  Is it a Windows VM?  What is a CVS server?

 

It is a CentOS VM running CVS (Concurrent Versions System: https://en.wikipedia.org/wiki/Concurrent_Versions_System) that I use for my development projects.


Yeah, this script won't work reliably; no fault of dlandon's, since he doesn't know exactly what's happening inside emhttp.

 

For reference:

#!/bin/bash
#
# Helper script to gracefully power-down unRAID server.
PROG_NAME="Powerdown"
SECONDS=0
TIMEOUT=30

if [ "${1}" == "-r" ]; then
        OPT="reboot"
        logger "Rebooting server..." -t"${PROG_NAME}"
else
        OPT="shutdown"
        logger "Shutting down server..." -t"${PROG_NAME}"
fi

port=$(lsof -i -P -sTCP:LISTEN|grep -Pom1 '^emhttp.*:\K\d+')

# Access a blank page in case this is first request since startup.
/usr/bin/wget -q -O - localhost:$port/update.htm >/dev/null

# Have emhttp do all the work as if user clicked 'shutdown or reboot' in webGui.
/usr/bin/wget -q --timeout=${TIMEOUT} -O - localhost:$port/update.htm?${OPT}=apply >/dev/null

logger "Shutdown took ${SECONDS} seconds..." -t"${PROG_NAME}"

if ((${SECONDS} >= ${TIMEOUT})); then
        logger "emhttp timed out - doing a hard poweroff..." -t"${PROG_NAME}"
        # Take it down hard
        /sbin/poweroff
fi

 

When emhttp gets the 'shutdown' or 'reboot' command, what it does is beep the speaker (if your m/b doesn't have a speaker, no beep for you) and then create a thread (a pthread) that executes an internal 'Stop' command (as if you clicked the Stop button), and then invokes either '/sbin/poweroff' or '/sbin/reboot'.  After it has started the thread it waits 2 seconds and returns.  Before returning it also sets up a "static" page in emhttp: either "shutdown.htm" or "reboot.htm".  Once this static page is set, all further http get requests will return the contents of this page regardless of what page the browser is requesting.
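A rough shell analogue of that sequence (this is an assumption based on the behaviour described above, not emhttp's actual code; the timings are scaled down for the demo):

```shell
# The Stop work runs on a separate "thread" while the request handler
# itself returns after a fixed wait; events are traced to a temp file.
trace=$(mktemp)

stop_sequence() {
  echo "Stop: stopping services, unmounting disks..." >>"$trace"
  sleep 0.4                                  # stands in for the real Stop work
  echo "invoking /sbin/poweroff (simulated)" >>"$trace"
}

stop_sequence &                 # emhttp's pthread
sleep 0.2                       # the 2-second wait in emhttp, scaled down
echo "handler returns; static shutdown.htm now being served" >>"$trace"
wait                            # join only so the demo captures everything

cat "$trace"
```

The key point for the powerdown script is that the wget request normally completes after the short wait, while the real Stop work is still running in the background.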

 

In the normal case the above 'wget' command will complete in 2 seconds.  But meanwhile the system is still in the process of shutting down (which can hang, as described earlier in the topic).

 

It is also possible for the above 'wget' command to hang "forever" if, prior to typing the 'powerdown' command or pressing the power button, a user clicked Stop and that Stop command is 'hanging' (for the same reasons described earlier in the topic).

 

Other items that complicate this further:

a) the rc.libvirt script has its own default 60-second timeout for shutting down VMs (waiting for them all to exit).

b) the rc.docker script has its own 14-second timeout for stopping docker (but I don't think this matters, because I don't believe anything should cause 'docker stop' to hang, nor 'kill <docker-pid>' to hang).

c) when you depress the power button you typically get 2 "events", each of which invokes the 'powerdown' script, separated by about half a second.

d) at present, 'rc.libvirt stop' and 'rc.docker stop' are not included in rc.6, but I planned to add them (forgot).  This will add a bit more safety: if Stop is hanging because a process has an open file on a mounted disk, then when the /sbin/poweroff hammer is dropped, at least a clean VM/docker shutdown will be attempted, though this could add to the total time it takes to shut down.

e) users are impatient; typically, if they click the power down button they expect to see things happening right away, and if not they will press that button a bunch of times before finally just holding it down (which will cause the h/w to turn off the PSU).
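Item (c) means the script can be invoked twice in quick succession.  One way to make that harmless is a simple lock-file guard; a minimal sketch (the lock path and function name are made up for illustration):

```shell
# A second invocation that finds the lock in place is ignored instead of
# re-entering the shutdown path.
LOCK=$(mktemp -u)          # stand-in for e.g. /var/run/powerdown.lock
result=""

powerdown_once() {
  if [ -e "$LOCK" ]; then
    result="${result}ignored;"
    return 0
  fi
  : >"$LOCK"               # take the lock
  result="${result}started;"
}

powerdown_once             # first power-button event
powerdown_once             # duplicate event ~0.5 s later
echo "$result"             # prints: started;ignored;
rm -f "$LOCK"
```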

 

One other note (I guess people are aware of this): very near the end of rc.6, after everything is done but before the USB boot flash is unmounted, the system log is copied to /boot (the root of the flash).  I'm going to tighten that up further so that any logging messages related to stopping the md/unRAID driver are also included.


As I experimented more, I saw that the changes to the powerdown script alone would not address all the possible cases where shutdown can hang.  The one thing the changes did accomplish was handling those cases where a user changed the emhttp port.

 

Adding additional logging to the shutdown process would be excellent, and would help in situations where users have shutdown problems, as these can sometimes be hard to track down.

 

It would be good to also capture diagnostics just before the shutdown starts and, for consistency, save the log in /flash/logs/ rather than /flash/.  That way the logs and diagnostics will be in one place.

  • 2 weeks later...

Fixed in 6.2.1

 

Does this mean the shutdown process is resolved in 6.2.1 (i.e. no further need for the Powerdown script)?

... or that more detailed logging (including logs from just before the shutdown as dlandon requested) will be in 6.2.1 so the issues will be easier to troubleshoot?

 


I'm marking this as solved because 6.2.1 does a very nice shutdown on my test server.  Good job LT!  Thank you for getting this fixed.

 

Does that mean this is deprecated again?  Or is this going to be like Adobe Flash on Linux?  ;D

 

In all seriousness though, it's kind of sad to see this go as a plugin.  It's been the one constant I've had throughout my Unraid usage, like a solid, dependable friend in times of need and trouble: on a new install it's always the first one "invited" round, and it has never failed me in the last 5 years.

 

Thanks dlandon, I've sent a small donation, which in all honesty, I should have sent a long time ago. 


I'm marking this as solved because 6.2.1 does a very nice shutdown on my test server.  Good job LT!  Thank you for getting this fixed.

Good to hear that!  Sorry, this work should have been done a long time ago.

There may still be issues shutting down VMs within the timeout.  That will be addressed better in subsequent releases.



Does that mean this is deprecated again?  Or is this going to be like Adobe Flash on Linux?  ;D

 

In all seriousness though, it's kind of sad to see this go as a plugin.  It's been the one constant I've had throughout my Unraid usage, like a solid, dependable friend in times of need and trouble: on a new install it's always the first one "invited" round, and it has never failed me in the last 5 years.

 

Thanks dlandon, I've sent a small donation, which in all honesty, I should have sent a long time ago.

 

I am not doing anything at this time.  The plugin will install until 6.3, but with the changes LT made, I'm concerned the powerdown plugin will interfere and cause problems.  I would remove it when upgrading to 6.2.1.  I have not checked operation with the plugin installed; it seems moot at this point.

 

I too will probably miss this trusty "friend" plugin.  It has been around a long time - WeeboTech started it in 2008 to address some shutdown issues, and it just morphed into what it is now.  I'm pleased that LT has addressed the issues and, from what I can tell, did a nice job.

 

Thank you for the donation.  I and other developers/contributors do appreciate a little thanks for the work we do that at times seems to be taken for granted.


I and other developers/contributors do appreciate a little thanks for the work we do that at times seems to be taken for granted.

 

You're very much preaching to the converted on that...  ;)



Good to hear that!  Sorry, this work should have been done a long time ago.

There may still be issues shutting down VMs within the timeout.  That will be addressed better in subsequent releases.

 

Tom, sorry to have been so pushy about this, but you are right, it has needed some attention for a long time.

 

I will now turn my attention to having you do more logging and at least save the diagnostics when shutting down.

 

This would do what I'd like to see:

#!/bin/sh
#
# /etc/rc.d/rc.local_shutdown:  Local system shutdown script.
#

# capture the system diagnostics
logger "Capture diagnostics to /boot/logs"
echo "Capture diagnostics to /boot/logs"
rm -f /boot/logs/*diagnostics*.zip
/usr/local/sbin/diagnostics

# Invoke any custom 'stop' script
if [ -f /boot/config/stop ]; then
  echo "Starting stop script"
  fromdos </boot/config/stop >/var/tmp/stop
  chmod +x /var/tmp/stop
  /var/tmp/stop
fi

# Get time-out setting
/usr/bin/fromdos </boot/config/ident.cfg >/var/tmp/ident.cfg
source /var/tmp/ident.cfg
TIMEOUT=${shutdownTimeout:=60}
# Find the port emhttp is listening at.
PORT=$(lsof -i -P -sTCP:LISTEN|grep -Pom1 '^emhttp.*:\K\d+')

# (Try) to stop the array
echo "Waiting $TIMEOUT seconds for graceful shutdown..."
if ! timeout $TIMEOUT /usr/bin/wget -q -O - localhost:$PORT/update.htm?cmdStop=apply >/dev/null ;then
  echo "Forcing shutdown..."
fi

# Shutdown cpuload daemon
if [ -x /etc/rc.d/rc.cpuload ]; then
  echo "Stopping cpuload daemon: /etc/rc.d/rc.cpuload"
  /etc/rc.d/rc.cpuload stop
fi
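One detail worth noting in the sketch above: the line 'TIMEOUT=${shutdownTimeout:=60}' uses shell default-assignment, so a missing or empty shutdownTimeout in ident.cfg falls back to 60 seconds.  A quick demonstration:

```shell
# Unset (or empty) -> the default 60 is assigned and used.
unset shutdownTimeout
TIMEOUT=${shutdownTimeout:=60}
echo "$TIMEOUT"    # 60

# Configured -> the configured value wins.
shutdownTimeout=120
TIMEOUT=${shutdownTimeout:=60}
echo "$TIMEOUT"    # 120
```

Note that ':=' also assigns the default back to shutdownTimeout itself, unlike ':-' which only substitutes it in place.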

This topic is now closed to further replies.