Jump to content
swells

apcupsd error

36 posts in this topic Last Reply

Recommended Posts

I searched and saw similar error's posted here, but not quite the same. Anyone know what this error is? I was able to restart apcupsd and it was fine, but im worried this could happen again at the wrong time.

 

Nov 13 04:40:01 Tower apcupsd[1702]: apcupsd exiting, signal 15

Nov 13 04:40:01 Tower apcupsd[8208]: Valid lock file for pid=1702, but not ours pid=8208

Nov 13 04:40:01 Tower apcupsd[8208]: apcupsd FATAL ERROR in apcupsd.c at line 285 Failed to acquire device lock file  (Errors)

Nov 13 04:40:01 Tower apcupsd[8208]: Valid lock file for pid=1702, but not ours pid=8208

Nov 13 04:40:01 Tower apcupsd[8208]: apcupsd error shutdown completed (Errors)

Nov 13 04:40:02 Tower kernel: apcupsd[1702]: segfault at 0 ip 0805e204 sp bf9e8240 error 4 in apcupsd[8048000+38000] (Errors)

Share this post


Link to post

I searched and saw similar error's posted here, but not quite the same. Anyone know what this error is? I was able to restart apcupsd and it was fine, but im worried this could happen again at the wrong time.

 

Nov 13 04:40:01 Tower apcupsd[1702]: apcupsd exiting, signal 15

Nov 13 04:40:01 Tower apcupsd[8208]: Valid lock file for pid=1702, but not ours pid=8208

Nov 13 04:40:01 Tower apcupsd[8208]: apcupsd FATAL ERROR in apcupsd.c at line 285 Failed to acquire device lock file  (Errors)

Nov 13 04:40:01 Tower apcupsd[8208]: Valid lock file for pid=1702, but not ours pid=8208

Nov 13 04:40:01 Tower apcupsd[8208]: apcupsd error shutdown completed (Errors)

Nov 13 04:40:02 Tower kernel: apcupsd[1702]: segfault at 0 ip 0805e204 sp bf9e8240 error 4 in apcupsd[8048000+38000] (Errors)

To me it seems to indicate you attempted to start apculsd again when it was already started.  (could not acquire the lock file, since the process ID in the lock file was different than the one attempting to read the lock.)

 

The "segfault" was certainly not normal.  But who knows what that was about...

 

Joe L.

Share this post


Link to post

Not sure how it tried to start when it was already started. I guess we'll see if this happens again. Thanks

Share this post


Link to post

Alright, got this again, this time without the "segfault" portion. Happens Sunday mornings at the same time.

 

Nov 20 04:40:01 Tower apcupsd[1701]: apcupsd exiting, signal 15

Nov 20 04:40:01 Tower apcupsd[2502]: Valid lock file for pid=1701, but not ours pid=2502

Nov 20 04:40:01 Tower apcupsd[2502]: apcupsd FATAL ERROR in apcupsd.c at line 285 Failed to acquire device lock file (Errors)

Nov 20 04:40:01 Tower apcupsd[2502]: Valid lock file for pid=1701, but not ours pid=2502

Nov 20 04:40:01 Tower apcupsd[2502]: apcupsd error shutdown completed (Errors)

Nov 20 04:40:01 Tower apcupsd[1701]: apcupsd shutdown succeeded

 

 

Share this post


Link to post

Alright, got this again, this time without the "segfault" portion. Happens Sunday mornings at the same time.

 

Nov 20 04:40:01 Tower apcupsd[1701]: apcupsd exiting, signal 15

Nov 20 04:40:01 Tower apcupsd[2502]: Valid lock file for pid=1701, but not ours pid=2502

Nov 20 04:40:01 Tower apcupsd[2502]: apcupsd FATAL ERROR in apcupsd.c at line 285 Failed to acquire device lock file (Errors)

Nov 20 04:40:01 Tower apcupsd[2502]: Valid lock file for pid=1701, but not ours pid=2502

Nov 20 04:40:01 Tower apcupsd[2502]: apcupsd error shutdown completed (Errors)

Nov 20 04:40:01 Tower apcupsd[1701]: apcupsd shutdown succeeded

 

 

 

 

What version of Unraid are you using and what version of APCUPSD are you using?

Share this post


Link to post

I'm using 5.0beta13 and the APCUPSD 3.14.10 Plugin.

 

I started getting the same error once I switched to 5.0beta13.

 

 

Share this post


Link to post

So are you just restarting it every time this happens or did you find a way around it? Does yours happen at the same time? Interesting that it always happens Sunday morning at 4:40am.

Share this post


Link to post

So are you just restarting it every time this happens or did you find a way around it? Does yours happen at the same time? Interesting that it always happens Sunday morning at 4:40am.

I think it might be a weekly log file rotation feature built into apcupsd that is broken in that release.  Do a google search of "apcupsd exiting, signal 15" and you'll find others who have complaining of the same issue.

Share this post


Link to post

I noticed the same problem while running unraid 5 rc8, APCUPSD 3.14.10 Plugin and a Cyberpower CP1350AVRLCD. Any workarounds? It seems to kill the apcupsd daemon.

 

Oct  7 04:40:01 Clara-Belle apcupsd[2223]: apcupsd exiting, signal 15

Oct  7 04:40:01 Clara-Belle apcupsd[19654]: Valid lock file for pid=2223, but not ours pid=19654

Oct  7 04:40:01 Clara-Belle apcupsd[19654]: apcupsd FATAL ERROR in apcupsd.c at line 285 Failed to acquire device lock file

Oct  7 04:40:01 Clara-Belle apcupsd[19654]: Valid lock file for pid=2223, but not ours pid=19654

Oct  7 04:40:01 Clara-Belle apcupsd[19654]: apcupsd error shutdown completed

Oct  7 04:40:01 Clara-Belle kernel: apcupsd[2223]: segfault at 0 ip 0805e204 sp bfbc33d0 error 4 in apcupsd[8048000+38000]

Share this post


Link to post

I've just reported exactly the same problem/symptoms in another thread, and was directed to this thread.

 

Yes, it happened to me at 4.40 last Sunday morning (April 7).  Having seen this thread, I checked previous logs - at 4.40 on Sunday March 31, apcupsd shut down and restarted successfully.

 

Strangely, there was nothing in my log for Sunday March 24 at 4.40.

 

I have to presume that apcupsd is doing this deliberately - log rotation seems a plausible reason ... why it didn't happen on March 24 is a puzzle, perhaps it depends on how long apcupsd has been running.

 

Now, guessing here, but I suspect that the lock file deletion doesn't always occur quickly enough - a race hazard.

 

A simple fix in the apcupsd code would be wait a few seconds - perhaps on startup if a conflicting lock file exists.

 

A simple workaround would be to add a cron job, running at 4.41 every Sunday, starting apcupsd if the lock file does not exist.

Share this post


Link to post

Okay, I've spent more time on this.  apcupsd does perform log rotation, with a stop and a start (restart).  The problem is that there is absolutely no artificial delay between the stop and the start command and I am convinced that it is delays in the pid file deletion cause this problem - a simple "sleep 1" would probably resolve the issue.  That would be very easy to implement but would mean modification to a file which is part of the standard apcupsd distribution.

 

I have, therefore, made alterations to the unRAID plugin, along the lines of my previous suggestion - a cron job will run at 4.41 every Sunday morning to restart the daemon if it is not running.

 

I will watch my log for activity on Sunday mornings and once I have evidence that my changes have the desired effect, I will look into making a new distribution.

 

My changes do mean that there is, potentially, a minute when apcupsd is not running, but I don't believe that this is a problem - I'm sure that I've seen a situation where apcupsd started while already running on batteries and it still shut the system down.

 

Looking at the three apcupsd distributions which are commonly used with unRAID, it is clear that 3.14.10 and 3.14.8 both  implement log rotation in the same way, but 3.14.3 doesn't.  It seems likely that it is the same issue which prompted most users to avoid 3.14.8 in favour of 3.14.3.

 

 

[b]<FILE Name="/usr/local/emhttp/plugins/apcupsd/apcrestart" Mode="0775">
<INLINE>
<![CDATA[
#!/bin/bash
if ! ps x | grep -qs '/sbin/[a]pcupsd'
then
    echo not running
    /etc/rc.d/rc.apcupsd start
fi
]]>
</INLINE>
</FILE>[/b]

 

  function startapcupsd() {
    global $newline, $log;

    echo("Starting apcupsd...");

    exec("/etc/rc.d/rc.apcupsd stop");
    sleep(0.5);
    exec_log("killall -9 apcupsd");
    exec_log("/etc/rc.d/rc.apcupsd start");
    sleep(5);

[b]    exec("(crontab -l | grep -v apcrestart; echo -e \42#                                        apcrestart\n# Restart apcupsd after log rotate (Sunday 4:40) (apcrestart)\n41 4 * * 0 /usr/local/emhttp/plugins/apcupsd/apcrestart\42) | crontab -");
[/b]
    echo("Completed$newline");

  }
  
  function stopapcupsd() {
    global $newline, $log;

    echo("Stopping apcupsd...");
[b]    exec("(crontab -l | grep -v apcrestart) | crontab -");[/b]
    exec_log("/etc/rc.d/rc.apcupsd stop");
    sleep(0.5);
    exec_log("killall -9 apcupsd");
    echo("Completed$newline");
  }

 

 

 

Share this post


Link to post

No logfile rotation was attempted his morning - I think because we had a power outage yesterday, so the system was rebooted.  I have seen evidence that log rotation only happens if the system has been up for a certain minimum time.

 

So, I wait for next Sunday!

 

Edit:

Well, 'next' Sunday arrived, and all was looking good for my test .... until the power went off at 3am.  What's more infuriating, is that it was only off for just over five minutes - guess what length of time I allow before shutdown begins ... yes, five minutes.  The power returned ten seconds before the server powered off!

 

Perhaps next Sunday?

Share this post


Link to post

Hi PeterB,

 

Do you have any update on this issue?

 

We've had too many powercuts ... I haven't managed to get a clear run up to a Sunday morning.

Currently not had a cut since Thursday afternoon (34 hours ago) - perhaps it will hold for another 28 hours and log rotation might be attempted.

 

If that fails, would you be prepared to run with my modifications and test for me?

Share this post


Link to post

If that fails, would you be prepared to run with my modifications and test for me?

 

Yes sure, however there will be some tuning on my rig next week, so not sure if I will have the necessary uptime either. Anyhow this is a hit and miss for me. I have the impression that sometimes apcupsd survives log rotation, sometimes not.

Share this post


Link to post
Anyhow this is a hit and miss for me. I have the impression that sometimes apcupsd survives log rotation, sometimes not.

 

Indeed, that is true.  There is a race between the deletion of the pid file, instigated by the 'old' apcupsd process, and the new apcupsd process starting up and testing for the existence of a pid file.  Sometimes the deletion completes in time, sometimes it doesn't.

Share this post


Link to post

Well, for the first time in four weeks, power stayed up long enough for log rotation to occur - unfortunately the standard restart went without a hitch, so the restart performed by my modification wasn't invoked.  However, after four weeks, I am convinced that my modification does no harm and, in artificial tests, I have proven that my code will effect a (re)start in the event that the apcupsd process is missing.

 

I will put a release together and place it on my website for anyone to download and test.

Share this post


Link to post

Ooops!  My excuse is that I've been busy! :-[  I'll try my best to do it this week!

Share this post


Link to post

Ooops!  My excuse is that I've been busy! :-[  I'll try my best to do it this week!

 

Do you still have plans to release your fix?

 

Thanks

Share this post


Link to post

working on a new package for unmenu.. it will have the sleep fix included..

PACKAGE_INSTALLATION sed -i -e "s/\$0 stop/\$0 stop\n       sleep 3/" /etc/rc.d/rc.apcupsd

 

please try out and report back:

http://lime-technology.com/forum/index.php?topic=27051.msg253098#msg253098

I'm running the newer apcupsd on my newer server.  So far, it is working as expected.

 

just noticed the rollover works fine.

Jul 21 04:40:01 husky apcupsd[1516]: apcupsd exiting, signal 15
Jul 21 04:40:01 husky apcupsd[1516]: apcupsd shutdown succeeded
Jul 21 04:40:06 husky apcupsd[28940]: apcupsd 3.14.10 (13 September 2011) slackware startup succeeded
Jul 21 04:40:06 husky apcupsd[28940]: NIS server startup succeeded

Share this post


Link to post

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.