swells Posted November 13, 2011 Share Posted November 13, 2011 I searched and saw similar error's posted here, but not quite the same. Anyone know what this error is? I was able to restart apcupsd and it was fine, but im worried this could happen again at the wrong time. Nov 13 04:40:01 Tower apcupsd[1702]: apcupsd exiting, signal 15 Nov 13 04:40:01 Tower apcupsd[8208]: Valid lock file for pid=1702, but not ours pid=8208 Nov 13 04:40:01 Tower apcupsd[8208]: apcupsd FATAL ERROR in apcupsd.c at line 285 Failed to acquire device lock file (Errors) Nov 13 04:40:01 Tower apcupsd[8208]: Valid lock file for pid=1702, but not ours pid=8208 Nov 13 04:40:01 Tower apcupsd[8208]: apcupsd error shutdown completed (Errors) Nov 13 04:40:02 Tower kernel: apcupsd[1702]: segfault at 0 ip 0805e204 sp bf9e8240 error 4 in apcupsd[8048000+38000] (Errors) Quote Link to comment
Joe L. Posted November 13, 2011 Share Posted November 13, 2011 I searched and saw similar error's posted here, but not quite the same. Anyone know what this error is? I was able to restart apcupsd and it was fine, but im worried this could happen again at the wrong time. Nov 13 04:40:01 Tower apcupsd[1702]: apcupsd exiting, signal 15 Nov 13 04:40:01 Tower apcupsd[8208]: Valid lock file for pid=1702, but not ours pid=8208 Nov 13 04:40:01 Tower apcupsd[8208]: apcupsd FATAL ERROR in apcupsd.c at line 285 Failed to acquire device lock file (Errors) Nov 13 04:40:01 Tower apcupsd[8208]: Valid lock file for pid=1702, but not ours pid=8208 Nov 13 04:40:01 Tower apcupsd[8208]: apcupsd error shutdown completed (Errors) Nov 13 04:40:02 Tower kernel: apcupsd[1702]: segfault at 0 ip 0805e204 sp bf9e8240 error 4 in apcupsd[8048000+38000] (Errors) To me it seems to indicate you attempted to start apculsd again when it was already started. (could not acquire the lock file, since the process ID in the lock file was different than the one attempting to read the lock.) The "segfault" was certainly not normal. But who knows what that was about... Joe L. Quote Link to comment
swells Posted November 13, 2011 Author Share Posted November 13, 2011 Not sure how it tried to start when it was already started. I guess we'll see if this happens again. Thanks Quote Link to comment
swells Posted November 20, 2011 Author Share Posted November 20, 2011 Alright, got this again, this time without the "segfault" portion. Happens Sunday mornings at the same time. Nov 20 04:40:01 Tower apcupsd[1701]: apcupsd exiting, signal 15 Nov 20 04:40:01 Tower apcupsd[2502]: Valid lock file for pid=1701, but not ours pid=2502 Nov 20 04:40:01 Tower apcupsd[2502]: apcupsd FATAL ERROR in apcupsd.c at line 285 Failed to acquire device lock file (Errors) Nov 20 04:40:01 Tower apcupsd[2502]: Valid lock file for pid=1701, but not ours pid=2502 Nov 20 04:40:01 Tower apcupsd[2502]: apcupsd error shutdown completed (Errors) Nov 20 04:40:01 Tower apcupsd[1701]: apcupsd shutdown succeeded Quote Link to comment
SeeDrs Posted November 20, 2011 Share Posted November 20, 2011 Alright, got this again, this time without the "segfault" portion. Happens Sunday mornings at the same time. Nov 20 04:40:01 Tower apcupsd[1701]: apcupsd exiting, signal 15 Nov 20 04:40:01 Tower apcupsd[2502]: Valid lock file for pid=1701, but not ours pid=2502 Nov 20 04:40:01 Tower apcupsd[2502]: apcupsd FATAL ERROR in apcupsd.c at line 285 Failed to acquire device lock file (Errors) Nov 20 04:40:01 Tower apcupsd[2502]: Valid lock file for pid=1701, but not ours pid=2502 Nov 20 04:40:01 Tower apcupsd[2502]: apcupsd error shutdown completed (Errors) Nov 20 04:40:01 Tower apcupsd[1701]: apcupsd shutdown succeeded What version of Unraid are you using and what version of APCUPSD are you using? Quote Link to comment
swells Posted November 20, 2011 Author Share Posted November 20, 2011 I'm using 5.0beta13 and the APCUPSD 3.14.10 Plugin. Quote Link to comment
SeeDrs Posted November 20, 2011 Share Posted November 20, 2011 I'm using 5.0beta13 and the APCUPSD 3.14.10 Plugin. I started getting the same error once I switched to 5.0beta13. Quote Link to comment
swells Posted November 20, 2011 Author Share Posted November 20, 2011 So are you just restarting it every time this happens or did you find a way around it? Does yours happen at the same time? Interesting that it always happens Sunday morning at 4:40am. Quote Link to comment
Joe L. Posted November 20, 2011 Share Posted November 20, 2011 So are you just restarting it every time this happens or did you find a way around it? Does yours happen at the same time? Interesting that it always happens Sunday morning at 4:40am. I think it might be a weekly log file rotation feature built into apcupsd that is broken in that release. Do a google search of "apcupsd exiting, signal 15" and you'll find others who have complaining of the same issue. Quote Link to comment
joelones Posted October 10, 2012 Share Posted October 10, 2012 I noticed the same problem while running unraid 5 rc8, APCUPSD 3.14.10 Plugin and a Cyberpower CP1350AVRLCD. Any workarounds? It seems to kill the apcupsd daemon. Oct 7 04:40:01 Clara-Belle apcupsd[2223]: apcupsd exiting, signal 15 Oct 7 04:40:01 Clara-Belle apcupsd[19654]: Valid lock file for pid=2223, but not ours pid=19654 Oct 7 04:40:01 Clara-Belle apcupsd[19654]: apcupsd FATAL ERROR in apcupsd.c at line 285 Failed to acquire device lock file Oct 7 04:40:01 Clara-Belle apcupsd[19654]: Valid lock file for pid=2223, but not ours pid=19654 Oct 7 04:40:01 Clara-Belle apcupsd[19654]: apcupsd error shutdown completed Oct 7 04:40:01 Clara-Belle kernel: apcupsd[2223]: segfault at 0 ip 0805e204 sp bfbc33d0 error 4 in apcupsd[8048000+38000] Quote Link to comment
PeterB Posted April 11, 2013 Share Posted April 11, 2013 I've just reported exactly the same problem/symptoms in another thread, and was directed to this thread. Yes, it happened to me at 4.40 last Sunday morning (April 7). Having seen this thread, I checked previous logs - at 4.40 on Sunday March 31, apcupsd shut down and restarted successfully. Strangely, there was nothing in my log for Sunday March 24 at 4.40. I have to presume that apcupsd is doing this deliberately - log rotation seems a plausible reason ... why it didn't happen on March 24 is a puzzle, perhaps it depends on how long apcupsd has been running. Now, guessing here, but I suspect that the lock file deletion doesn't always occur quickly enough - a race hazard. A simple fix in the apcupsd code would be wait a few seconds - perhaps on startup if a conflicting lock file exists. A simple workaround would be to add a cron job, running at 4.41 every Sunday, starting apcupsd if the lock file does not exist. Quote Link to comment
PeterB Posted April 19, 2013 Share Posted April 19, 2013 Okay, I've spent more time on this. apcupsd does perform log rotation, with a stop and a start (restart). The problem is that there is absolutely no artificial delay between the stop and the start command and I am convinced that it is delays in the pid file deletion cause this problem - a simple "sleep 1" would probably resolve the issue. That would be very easy to implement but would mean modification to a file which is part of the standard apcupsd distribution. I have, therefore, made alterations to the unRAID plugin, along the lines of my previous suggestion - a cron job will run at 4.41 every Sunday morning to restart the daemon if it is not running. I will watch my log for activity on Sunday mornings and once I have evidence that my changes have the desired effect, I will look into making a new distribution. My changes do mean that there is, potentially, a minute when apcupsd is not running, but I don't believe that this is a problem - I'm sure that I've seen a situation where apcupsd started while already running on batteries and it still shut the system down. Looking at the three apcupsd distributions which are commonly used with unRAID, it is clear that 3.14.10 and 3.14.8 both implement log rotation in the same way, but 3.14.3 doesn't. It seems likely that it is the same issue which prompted most users to avoid 3.14.8 in favour of 3.14.3. [b]<FILE Name="/usr/local/emhttp/plugins/apcupsd/apcrestart" Mode="0775"> <INLINE> <![CDATA[ #!/bin/bash if ! ps x | grep -qs '/sbin/[a]pcupsd' then echo not running /etc/rc.d/rc.apcupsd start fi ]]> </INLINE> </FILE>[/b] function startapcupsd() { global $newline, $log; echo("Starting apcupsd..."); exec("/etc/rc.d/rc.apcupsd stop"); sleep(0.5); exec_log("killall -9 apcupsd"); exec_log("/etc/rc.d/rc.apcupsd start"); sleep(5); [b] exec("(crontab -l | grep -v apcrestart; echo -e \42# apcrestart\n# Restart apcupsd after log rotate (Sunday 4:40) (apcrestart)\n41 4 * * 0 /usr/local/emhttp/plugins/apcupsd/apcrestart\42) | crontab -"); [/b] echo("Completed$newline"); } function stopapcupsd() { global $newline, $log; echo("Stopping apcupsd..."); [b] exec("(crontab -l | grep -v apcrestart) | crontab -");[/b] exec_log("/etc/rc.d/rc.apcupsd stop"); sleep(0.5); exec_log("killall -9 apcupsd"); echo("Completed$newline"); } Quote Link to comment
PeterB Posted April 21, 2013 Share Posted April 21, 2013 No logfile rotation was attempted his morning - I think because we had a power outage yesterday, so the system was rebooted. I have seen evidence that log rotation only happens if the system has been up for a certain minimum time. So, I wait for next Sunday! Edit: Well, 'next' Sunday arrived, and all was looking good for my test .... until the power went off at 3am. What's more infuriating, is that it was only off for just over five minutes - guess what length of time I allow before shutdown begins ... yes, five minutes. The power returned ten seconds before the server powered off! Perhaps next Sunday? Quote Link to comment
olympia Posted May 17, 2013 Share Posted May 17, 2013 Hi PeterB, Do you have any update on this issue? Quote Link to comment
PeterB Posted May 17, 2013 Share Posted May 17, 2013 Hi PeterB, Do you have any update on this issue? We've had too many powercuts ... I haven't managed to get a clear run up to a Sunday morning. Currently not had a cut since Thursday afternoon (34 hours ago) - perhaps it will hold for another 28 hours and log rotation might be attempted. If that fails, would you be prepared to run with my modifications and test for me? Quote Link to comment
olympia Posted May 18, 2013 Share Posted May 18, 2013 If that fails, would you be prepared to run with my modifications and test for me? Yes sure, however there will be some tuning on my rig next week, so not sure if I will have the necessary uptime either. Anyhow this is a hit and miss for me. I have the impression that sometimes apcupsd survives log rotation, sometimes not. Quote Link to comment
PeterB Posted May 18, 2013 Share Posted May 18, 2013 Anyhow this is a hit and miss for me. I have the impression that sometimes apcupsd survives log rotation, sometimes not. Indeed, that is true. There is a race between the deletion of the pid file, instigated by the 'old' apcupsd process, and the new apcupsd process starting up and testing for the existence of a pid file. Sometimes the deletion completes in time, sometimes it doesn't. Quote Link to comment
PeterB Posted May 19, 2013 Share Posted May 19, 2013 Well, for the first time in four weeks, power stayed up long enough for log rotation to occur - unfortunately the standard restart went without a hitch, so the restart performed by my modification wasn't invoked. However, after four weeks, I am convinced that my modification does no harm and, in artificial tests, I have proven that my code will effect a (re)start in the event that the apcupsd process is missing. I will put a release together and place it on my website for anyone to download and test. Quote Link to comment
kaiguy Posted June 18, 2013 Share Posted June 18, 2013 And where can we find your website, PeterB? Quote Link to comment
PeterB Posted June 23, 2013 Share Posted June 23, 2013 Ooops! My excuse is that I've been busy! I'll try my best to do it this week! Quote Link to comment
Riot Posted July 7, 2013 Share Posted July 7, 2013 Ooops! My excuse is that I've been busy! I'll try my best to do it this week! Do you still have plans to release your fix? Thanks Quote Link to comment
zoggy Posted July 9, 2013 Share Posted July 9, 2013 working on a new package for unmenu.. it will have the sleep fix included.. PACKAGE_INSTALLATION sed -i -e "s/\$0 stop/\$0 stop\n sleep 3/" /etc/rc.d/rc.apcupsd please try out and report back: http://lime-technology.com/forum/index.php?topic=27051.msg253098#msg253098 Quote Link to comment
Riot Posted July 17, 2013 Share Posted July 17, 2013 Modded SeeDrs plugin with new alternate package and zoggy's fix http://lime-technology.com/forum/index.php?topic=15583.msg254421#msg254421 Quote Link to comment
Joe L. Posted July 17, 2013 Share Posted July 17, 2013 working on a new package for unmenu.. it will have the sleep fix included.. PACKAGE_INSTALLATION sed -i -e "s/\$0 stop/\$0 stop\n sleep 3/" /etc/rc.d/rc.apcupsd please try out and report back: http://lime-technology.com/forum/index.php?topic=27051.msg253098#msg253098 I'm running the newer apcupsd on my newer server. So far, it is working as expected. Quote Link to comment
zoggy Posted July 21, 2013 Share Posted July 21, 2013 working on a new package for unmenu.. it will have the sleep fix included.. PACKAGE_INSTALLATION sed -i -e "s/\$0 stop/\$0 stop\n sleep 3/" /etc/rc.d/rc.apcupsd please try out and report back: http://lime-technology.com/forum/index.php?topic=27051.msg253098#msg253098 I'm running the newer apcupsd on my newer server. So far, it is working as expected. just noticed the rollover works fine. Jul 21 04:40:01 husky apcupsd[1516]: apcupsd exiting, signal 15 Jul 21 04:40:01 husky apcupsd[1516]: apcupsd shutdown succeeded Jul 21 04:40:06 husky apcupsd[28940]: apcupsd 3.14.10 (13 September 2011) slackware startup succeeded Jul 21 04:40:06 husky apcupsd[28940]: NIS server startup succeeded Quote Link to comment
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.