Riot Posted July 21, 2013

working on a new package for unmenu.. it will have the sleep fix included..

PACKAGE_INSTALLATION sed -i -e "s/\$0 stop/\$0 stop\n sleep 3/" /etc/rc.d/rc.apcupsd

please try out and report back: http://lime-technology.com/forum/index.php?topic=27051.msg253098#msg253098

I'm running the newer apcupsd on my newer server. So far, it is working as expected.

just noticed the rollover works fine.

Jul 21 04:40:01 husky apcupsd[1516]: apcupsd exiting, signal 15
Jul 21 04:40:01 husky apcupsd[1516]: apcupsd shutdown succeeded
Jul 21 04:40:06 husky apcupsd[28940]: apcupsd 3.14.10 (13 September 2011) slackware startup succeeded
Jul 21 04:40:06 husky apcupsd[28940]: NIS server startup succeeded

Does for me also
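For anyone wondering what that sed line does: it appends a `sleep 3` line after every `$0 stop` in the rc script, presumably so the old daemon has time to exit before the restart continues. A minimal sketch on a throwaway file follows; the sample file contents are made up for illustration (the real /etc/rc.d/rc.apcupsd may look different), and it assumes GNU sed, which turns `\n` in the replacement into a newline.

```shell
# Work on a temporary copy, never on the real rc script.
demo_file=$(mktemp)

# Hypothetical excerpt of a restart branch; not copied from the real rc.apcupsd.
cat > "$demo_file" <<'EOF'
restart)
    $0 stop
    $0 start
    ;;
EOF

# Same expression as in the post. Inside double quotes, \$ hands sed a literal $,
# and a $ in the middle of a basic regex is matched literally.
sed -i -e "s/\$0 stop/\$0 stop\n sleep 3/" "$demo_file"

patched=$(cat "$demo_file")
printf '%s\n' "$patched"
rm -f "$demo_file"
```

The inserted line keeps the single leading space from the sed expression, so the indentation will not line up with the rest of the script, which is harmless for shell.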
Joe L. Posted July 21, 2013

Same here:

Jul 21 04:40:01 Tower apcupsd[12583]: apcupsd exiting, signal 15
Jul 21 04:40:01 Tower apcupsd[12583]: apcupsd shutdown succeeded
Jul 21 04:40:06 Tower apcupsd[10883]: apcupsd 3.14.10 (13 September 2011) slackware startup succeeded
Jul 21 04:40:06 Tower apcupsd[10883]: NIS server startup succeeded

apcupsd re-started just as expected.
zoggy Posted July 23, 2013

cobbled together a little box so I could do testing with. on 5.0 RC16c.

Did some more testing today with the latest apcupsd package but with the clean powerdown package NOT installed. apcupsd set to issue shutdown after a 20 sec outage. pulled power and checked log and ups status:

(from /sbin/apcaccess status)

APC      : 001,037,0962
DATE     : 2013-07-23 10:30:28 -0500
HOSTNAME : testtower
VERSION  : 3.14.10 (13 September 2011) slackware
UPSNAME  : testtower
CABLE    : USB Cable
DRIVER   : USB UPS Driver
UPSMODE  : Stand Alone
STARTTIME: 2013-07-23 10:25:43 -0500
MODEL    : Back-UPS BR1500G
STATUS   : SHUTTING DOWN
LINEV    : 000.0 Volts
LOADPCT  : 0.0 Percent Load Capacity
BCHARGE  : 100.0 Percent
TIMELEFT : 279.0 Minutes
MBATTCHG : 5 Percent
MINTIMEL : 5 Minutes
MAXTIME  : 20 Seconds
SENSE    : Medium
LOTRANS  : 088.0 Volts
HITRANS  : 147.0 Volts
ALARMDEL : No alarm
BATTV    : 26.1 Volts
LASTXFER : Unacceptable line voltage changes
NUMXFERS : 1
XONBATT  : 2013-07-23 10:28:58 -0500
TONBATT  : 94 seconds
CUMONBATT: 94 seconds
XOFFBATT : N/A
SELFTEST : NO
STATFLAG : 0x07160210 Status Flag
SERIALNO : 4B1120P04730
BATTDATE : 2011-05-09
NOMINV   : 120 Volts
NOMBATTV : 24.0 Volts
NOMPOWER : 865 Watts
FIRMWARE : 865.L2 .D USB FW:L2
END APC  : 2013-07-23 10:30:32 -0500

Jul 23 10:25:44 Tower apcupsd[8781]: NIS server startup succeeded
Jul 23 10:25:44 Tower apcupsd[8781]: apcupsd 3.14.10 (13 September 2011) slackware startup succeeded
Jul 23 10:28:58 Tower apcupsd[8781]: Power failure. (Minor Issues)
Jul 23 10:29:04 Tower apcupsd[8781]: Running on UPS batteries.
Jul 23 10:29:25 Tower apcupsd[8781]: Reached run time limit on batteries.
Jul 23 10:29:25 Tower apcupsd[8781]: Initiating system shutdown!
Jul 23 10:29:25 Tower apcupsd[8781]: User logins prohibited

staring at the console currently. it did not shut down...

looking at "/etc/apcupsd/doshutdown" I do see that the package did put in "/sbin/powerdown" then "exit 99". I would have thought it would work on 5.x, since the powerdown package was really created for 4.x...

I now see that '/sbin/powerdown' doesn't exist on 5.x.. it's actually '/usr/local/sbin/powerdown'. Ran this manually and sure enough the box shut down.

So do we detect '/sbin/powerdown' and '/usr/local/sbin/powerdown' and then use whichever one they have... what if they have both? what about 4.x users?
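When checking a test like this it can be handy to pull single values (STATUS, TIMELEFT and so on) out of the `apcaccess status` dump instead of reading the whole thing. A rough sketch: `apc_field` is a made-up helper name, and the sample lines are copied from the dump above rather than read from a live UPS.

```shell
# Hypothetical helper: print the value of one field from "KEY : value" lines on stdin.
# Splits on the first colon only, because values such as dates contain colons too.
apc_field() {
    awk -v key="$1" '
        {
            i = index($0, ":")
            if (i == 0) next
            k = substr($0, 1, i - 1)
            v = substr($0, i + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim padding around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim padding around the value
            if (k == key) { print v; exit }
        }'
}

# Three lines taken from the status dump in this post.
sample='STATUS   : SHUTTING DOWN
TIMELEFT : 279.0 Minutes
XONBATT  : 2013-07-23 10:28:58 -0500'

printf '%s\n' "$sample" | apc_field STATUS
printf '%s\n' "$sample" | apc_field XONBATT
```

On a real box the input would come from `/sbin/apcaccess status | apc_field STATUS` instead of the sample variable.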
Joe L. Posted July 23, 2013

zoggy wrote:
"cobbled together a little box so I could do testing with. on 5.0 RC16c. Did some more testing today with the latest apcupsd package but with the clean powerdown package NOT installed. I now see that '/sbin/powerdown' doesn't exist on 5.x.. it's actually '/usr/local/sbin/powerdown'. Ran this manually and sure enough the box shut down. So do we detect '/sbin/powerdown' and '/usr/local/sbin/powerdown' and then use whichever one they have... what if they have both? what about 4.x users?"

it is the same on any version of unRAID with a powerdown script at all.

Ideally, the stock powerdown script would have an option to NOT wait for all disks to be idle, but to kill processes as needed in a power failure. It has no such option.

you could probably use the stock powerdown if it exists (the install of the clean powerdown package renames the original to "/usr/local/sbin/unraid_powerdown").

Problem was, at one time there was NO powerdown script. The "clean-powerdown" was created and named /sbin/powerdown. Then, Tom created /usr/local/sbin/powerdown (which invokes a URL from emhttp@localhost).

But... /usr/local/sbin is FIRST in the $PATH, so I was forced to rename the lime-tech supplied command so the other would be executed when someone just typed "powerdown".

Best solution on 5.X would be to invoke the original, which should invoke all the "event" scripts to stop everything, and the array will stop as needed and then finally power off the box.

Problem is... not all add-ons have event related triggers to stop themselves. The disk could be busy from a "telnet" session, or from a process running with it as the current-working-directory.

We basically need a parallel process (in addition to /usr/local/sbin/powerdown) that, if it sees the array is waiting for a busy disk to become idle, will after an appropriate length of time take over and start terminating processes holding disks busy.

That parallel process cannot even be certain emhttp will be listening for the /sbin/powerdown command. (It just invokes the URL equivalent on emhttp to the powerdown button.)

for now... I'd see what happens with something like this:

/usr/local/sbin/powerdown &
test -f /usr/local/sbin/powerdown && sleep 60
/sbin/powerdown
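The three-line idea above (start the stock script in the background, give it a minute, then let the clean powerdown take over) could be wrapped up roughly like this. It is only a sketch: `powerdown_with_fallback` and its arguments are hypothetical names, the 60 second grace period is the value from the post and not a tested recommendation, and nothing here checks whether emhttp is actually responding.

```shell
# Hypothetical wrapper around the approach suggested in the post.
#   $1 = stock script (e.g. /usr/local/sbin/powerdown)
#   $2 = fallback script (e.g. the clean powerdown at /sbin/powerdown)
#   $3 = grace period in seconds (defaults to 60)
powerdown_with_fallback() {
    stock=$1
    fallback=$2
    grace=${3:-60}

    if [ -x "$stock" ]; then
        "$stock" &        # ask the stock script to stop the array and power off
        sleep "$grace"    # if it succeeds, the box is off before this returns
    fi

    if [ -x "$fallback" ]; then
        "$fallback"       # still running, so hand over to the fallback script
    fi
}

# Example call (commented out so that pasting this does not shut anything down):
# powerdown_with_fallback /usr/local/sbin/powerdown /sbin/powerdown 60
```

Unlike the original three lines, this skips the stock script entirely when it is missing, which would matter on 4.x where only the clean powerdown may exist.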
zoggy Posted July 23, 2013

So when the clean powerdown package is installed.. there is a '/sbin/powerdown'. I just thought from the 5.0rc16 thread that '/sbin/powerdown' existed on 5.x from Tom.. but it turns out that it's in /usr/local/sbin.

Anyways, on updating the apcupsd package:

So.. on 4.x we definitely should tell people to install the clean powerdown package, like it does currently. We don't force it or require it as a dependency.

On 5.x we should tell people to install the clean powerdown package (and use it if installed), and if it's NOT installed then we should just use /usr/local/sbin/powerdown instead.. that way it at least does something?

I've seen people saying that the powerdown package needs updating for 5.x... what is that about? It seems to work fine.
zoggy Posted July 23, 2013

ok, on 5.0 rc16c. installed clean powerdown and apcupsd. Set it to shut down after a 20 sec outage and then issue the command to shut down the ups. Unplugged ups from wall. It shut down unraid (array was running), then about 1 min later the actual ups shut down just like it was supposed to.

Relevant syslog entries:

Jul 23 17:14:36 testtower apcupsd[1909]: Power failure. (Minor Issues)
Jul 23 17:14:42 testtower apcupsd[1909]: Running on UPS batteries.
Jul 23 17:15:03 testtower apcupsd[1909]: Reached run time limit on batteries.
Jul 23 17:15:03 testtower apcupsd[1909]: Initiating system shutdown!
Jul 23 17:15:03 testtower apcupsd[1909]: User logins prohibited
Jul 23 17:15:03 testtower logger: Powerdown initiated
Jul 23 17:15:03 testtower rc.unRAID[2129]: Stopping unRAID.

Notice that the apcupsd killups command never makes it to the syslog (since it's issued AFTER unraid shuts down).

Side note, when shutting down via the clean powerdown script it does a 'status' report which sends a whole bunch of stuff to the syslog.

Joe, several things in that show up as 'errors' because of the regex matching. here are the relevant syslog messages if you want to go exclude them:

Jul 23 17:15:03 testtower mounts[2137]: /dev/sda1 /boot vfat rw,noatime,nodiratime,fmask=0000,dmask=0000,allow_utime=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro 0 0 (Errors)
Jul 23 17:15:03 testtower hdparm[2142]: ^I *^ISMART error logging (Errors)
Jul 23 17:15:03 testtower smartctl[2149]: without error or no self-test has ever (Errors)
Jul 23 17:15:03 testtower smartctl[2149]: Error logging capability: (0x01) Error logging supported. (Errors)
Jul 23 17:15:03 testtower smartctl[2149]: 184 End-to-End_Error 0x0033 100 100 090 Pre-fail Always - 0 (Errors)
Jul 23 17:15:03 testtower smartctl[2149]: SMART Error Log Version: 1 (Errors)
Jul 23 17:15:03 testtower smartctl[2149]: No Errors Logged (Errors)
Jul 23 17:15:03 testtower ifconfig[2160]: RX packets:993 errors:0 dropped:0 overruns:0 frame:0 (Errors)
Jul 23 17:15:03 testtower ifconfig[2160]: TX packets:1225 errors:0 dropped:0 overruns:0 carrier:0 (Errors)
Jul 23 17:15:03 testtower ethtool[2165]: tx_errors: 0 (Errors)
Jul 23 17:15:03 testtower ethtool[2165]: rx_errors: 0 (Errors)
Jul 23 17:15:03 testtower ethtool[2165]: align_errors: 0 (Errors)

Let me know if you want me to provide the full syslog if you need to see the context they are in.
zoggy Posted July 23, 2013

Following up on the false 'errors' from my previous post: an easy one to ignore (would eliminate 3 of those above) looks to be: _errors:

and then this one would take care of 4 more (if case insensitive matching): \wErrors?\wLog
Joe L. Posted July 23, 2013

Apparently, if disks are busy, unRAID gets upset if disks are un-mounted by clean-powerdown.

Try this: log in via telnet, cd to /mnt/disk1/, then pull power and see what happens when the disk remains busy because of your telnet session.

Regarding false error highlighting: You can look at /boot/unmenu/syslog_match.conf (I'm not near my server right now... actually about 775 miles from it. I think that is the name of the file with the regex expressions.)

Easiest is to put specific matches for lines that are not errors and color them "black" near the top of that file. The first match is used, so that will get rid of the false hits.

Joe L.
zoggy Posted July 24, 2013

so far these take care of all the ones I see... can't be certain they won't undo any legit ones though. do you have a sample syslog you use to test with?

# testing
match_case||"SMART error logging"||black
match_case||" End-to-End_Error"||black
match_case||" errors:0 dropped:"||black
match_case||",errors=remount-ro"||black
match_case||"_errors:"||black
match_case||"without error or no self-test has ever"||black
any_case||" Errors? Log"||black
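One way to sanity-check a pattern list like this without a full sample syslog is to run it over the lines that were falsely flagged earlier in the thread and see whether anything is left. The sketch below uses grep -E as a stand-in for unmenu's matcher, which is an assumption: unmenu's own regex handling may differ in details. The sample lines are the "program: message" parts of the syslog lines posted above, with the tab markers on the hdparm line left out.

```shell
# Falsely flagged lines from the earlier post (timestamps and host name trimmed).
samples='mounts[2137]: /dev/sda1 /boot vfat rw,noatime,nodiratime,fmask=0000,dmask=0000,allow_utime=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro 0 0
hdparm[2142]: SMART error logging
smartctl[2149]: without error or no self-test has ever
smartctl[2149]: Error logging capability: (0x01) Error logging supported.
smartctl[2149]: 184 End-to-End_Error 0x0033 100 100 090 Pre-fail Always - 0
smartctl[2149]: SMART Error Log Version: 1
smartctl[2149]: No Errors Logged
ifconfig[2160]: RX packets:993 errors:0 dropped:0 overruns:0 frame:0
ifconfig[2160]: TX packets:1225 errors:0 dropped:0 overruns:0 carrier:0
ethtool[2165]: tx_errors: 0
ethtool[2165]: rx_errors: 0
ethtool[2165]: align_errors: 0'

# The match_case patterns, one per line (grep treats each line as an alternative).
case_sensitive='SMART error logging
 End-to-End_Error
 errors:0 dropped:
,errors=remount-ro
_errors:
without error or no self-test has ever'

# The single any_case pattern.
case_insensitive=' Errors? Log'

# Drop every line covered by a "black" rule; whatever remains would still be flagged.
still_flagged=$(printf '%s\n' "$samples" \
    | grep -v -E -e "$case_sensitive" \
    | grep -v -i -E -e "$case_insensitive" || true)

if [ -z "$still_flagged" ]; then
    echo "all sample lines are covered"
else
    echo "still flagged:"
    printf '%s\n' "$still_flagged"
fi
```

This only shows that the known false hits are covered. Checking that no genuine error lines get blacked out as well would still need a syslog that contains real errors.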
zoggy Posted July 24, 2013

Joe L. wrote:
"Try this log in via telnet cd to /mnt/disk1/ then pull power and see what happens when the disk remains busy because of your telnet session."

ok did that.. had a 'ping -c 9999 www.google.com' running as well:

Jul 23 20:40:26 testtower in.telnetd[1869]: connect from 192.168.0.2 (192.168.0.2)
Jul 23 20:40:28 testtower login[1870]: ROOT LOGIN on '/dev/pts/0' from '192.168.0.2'
Jul 23 20:41:51 testtower apcupsd[1837]: Power failure.
Jul 23 20:41:57 testtower apcupsd[1837]: Running on UPS batteries.
Jul 23 20:42:18 testtower apcupsd[1837]: Reached run time limit on batteries.
Jul 23 20:42:18 testtower apcupsd[1837]: Initiating system shutdown!
Jul 23 20:42:18 testtower apcupsd[1837]: User logins prohibited
Jul 23 20:42:18 testtower logger: Powerdown initiated
Jul 23 20:42:18 testtower rc.unRAID[1920]: Stopping unRAID.
...
Jul 23 20:42:19 testtower status[1967]: ACTIVE PIDS on the array
Jul 23 20:42:19 testtower status[1967]: root 1870 1869 0 20:40 pts/0 00:00:00 -bash
Jul 23 20:42:19 testtower status[1967]: root 1901 1870 0 20:41 pts/0 00:00:00 ping -c 9999 www.google.com
Jul 23 20:42:20 testtower rc.unRAID[1995]: Killing active pids on the array drives
Jul 23 20:42:20 testtower rc.unRAID[1998]: root 1870 1869 0 20:40 pts/0 00:00:00 -bash
Jul 23 20:42:20 testtower rc.unRAID[1998]: root 1901 1870 0 20:41 pts/0 00:00:00 ping -c 9999 www.google.com
Jul 23 20:42:21 testtower rc.unRAID[1998]: root 1870 1869 0 20:40 pts/0 00:00:00 -bash
Jul 23 20:42:22 testtower rc.unRAID[1998]: root 1870 1869 0 20:40 pts/0 00:00:00 -bash
Jul 23 20:42:23 testtower rc.unRAID[2018]: Umounting the drives
Jul 23 20:42:23 testtower rc.unRAID[2022]: /dev/md1 umounted
Jul 23 20:42:24 testtower rc.unRAID[2026]: Stopping the Array
Jul 23 20:42:24 testtower kernel: mdcmd (31): stop
Jul 23 20:42:24 testtower kernel: md1: stopping
...
Jul 23 20:42:27 testtower logger: Initiating Shutdown with
Jul 23 20:42:27 testtower shutdown[2046]: shutting down for system halt
Jul 23 20:42:27 testtower init: Switching to runlevel: 0
Jul 23 20:42:29 testtower rc.unRAID[2057]: Stopping unRAID.
...
Jul 23 20:42:32 testtower rc.unRAID[2130]: Killing active pids on the array drives
Jul 23 20:42:32 testtower rc.unRAID[2146]: Umounting the drives
Jul 23 20:42:32 testtower rc.unRAID[2150]: umount: /mnt/disk1: not mounted
Jul 23 20:42:32 testtower rc.unRAID[2150]: Could not find /mnt/disk1 in mtab
Jul 23 20:42:32 testtower rc.unRAID[2152]: Stopping the Array
Jul 23 20:42:32 testtower kernel: mdcmd (32): stop
Jul 23 20:42:32 testtower kernel: md: stop_array: not started

so it looks like it worked?
zoggy Posted July 24, 2013

modified the unmenu apcupsd package a little bit.

- if the user has the clean powerdown package we use that for the apcupsd 'doshutdown'. (same as before)
- if the user doesn't have the clean powerdown but does have the unraid one (5.x only?), we use that at least.

so, installed apcupsd without the clean powerdown. pulled power. it shut down much quicker than the clean powerdown package did.. related syslog messages:

Jul 24 18:10:23 testtower apcupsd[1541]: Power failure. (Minor Issues)
Jul 24 18:10:29 testtower apcupsd[1541]: Running on UPS batteries.
Jul 24 18:10:50 testtower apcupsd[1541]: Reached run time limit on batteries.
Jul 24 18:10:50 testtower apcupsd[1541]: Initiating system shutdown!
Jul 24 18:10:50 testtower apcupsd[1541]: User logins prohibited
Jul 24 18:10:50 testtower emhttp: shcmd (33): beep -r 2 (Other emhttp)
Jul 24 18:10:51 testtower emhttp: shcmd (34): /usr/local/sbin/emhttp_event stopping_svcs (Other emhttp)
Jul 24 18:10:51 testtower kernel: mdcmd (31): nocheck (unRAID engine)
Jul 24 18:10:51 testtower kernel: md: nocheck_array: check not active (unRAID engine)
Jul 24 18:10:51 testtower emhttp_event: stopping_svcs (Other emhttp)
Jul 24 18:10:51 testtower emhttp: Stop AVAHI... (Other emhttp)
Jul 24 18:10:51 testtower emhttp: shcmd (35): /etc/rc.d/rc.avahidaemon stop |& logger (Other emhttp)
Jul 24 18:10:51 testtower logger: Stopping Avahi mDNS/DNS-SD Daemon: stopped
Jul 24 18:10:51 testtower avahi-daemon[1100]: Got SIGTERM, quitting.
Jul 24 18:10:51 testtower avahi-dnsconfd[1114]: read(): EOF
Jul 24 18:10:51 testtower avahi-daemon[1100]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.0.117. (Network)
Jul 24 18:10:51 testtower avahi-daemon[1100]: avahi-daemon 0.6.31 exiting.
Jul 24 18:10:51 testtower emhttp: shcmd (36): /etc/rc.d/rc.avahidnsconfd stop |& logger (Other emhttp)
Jul 24 18:10:51 testtower logger: Stopping Avahi mDNS/DNS-SD DNS Server Configuration Daemon: stopped
Jul 24 18:10:51 testtower emhttp: shcmd (37): ps axc | grep -q rpc.mountd (Other emhttp)
Jul 24 18:10:51 testtower emhttp: _shcmd: shcmd (37): exit status: 1 (Other emhttp)
Jul 24 18:10:51 testtower emhttp: Stop SMB... (Other emhttp)
Jul 24 18:10:51 testtower emhttp: shcmd (38): /etc/rc.d/rc.samba stop |& logger (Other emhttp)
Jul 24 18:10:51 testtower emhttp: shcmd (39): rm /etc/avahi/services/smb.service &> /dev/null (Other emhttp)
Jul 24 18:10:51 testtower emhttp: Spinning up all drives... (Other emhttp)
Jul 24 18:10:51 testtower kernel: mdcmd (32): spinup 1 (Routine)

so from the look of that, it seems to work fine as well.
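The selection rule described in this post (prefer the clean powerdown, otherwise fall back to the stock unRAID script) might look something like the following. This is not the code from the actual unmenu package, just a sketch of the rule; `pick_powerdown` and its optional root-prefix argument are hypothetical and exist only so the logic can be tried out in a scratch directory.

```shell
# Hypothetical helper: print the powerdown script a doshutdown hook should call.
#   $1 = optional root prefix for testing (empty on a real system)
# Order of preference follows the post: clean powerdown first, stock script second.
pick_powerdown() {
    root=${1:-}
    for candidate in /sbin/powerdown /usr/local/sbin/powerdown; do
        if [ -x "$root$candidate" ]; then
            printf '%s\n' "$root$candidate"
            return 0
        fi
    done
    return 1    # neither script is present
}

# Example use inside a shutdown hook (commented out so nothing powers off here):
# script=$(pick_powerdown) && "$script"
```

If both scripts are present the clean powerdown wins, which matches the "same as before" behaviour for users who already installed it.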