
Call traces found on your server


fonzie


I was using the "Fix Common Problems" plugin on my unRAID when I got a call traces notice:

 

Your server has issued one or more call traces. This could be caused by a Kernel Issue, Bad Memory, etc

You should post your diagnostics and ask for assistance on the unRaid forums

 

I went to the diagnostics in the tools section and downloaded the zip file with "anonymize diagnostics" checked.  I just want assurance that it is safe to post that zip online without compromising the security of my server.

Link to comment

The first call trace looks as though it was caused when IRQ 16 was disabled. It's used by a USB controller.

 

Your syslog has a lot of this:

 

Feb  9 21:02:06 media root: ACPI group processor / action LNXCPU:00 is not defined

Feb  9 21:02:06 media root: ACPI group processor / action LNXCPU:01 is not defined

Feb  9 21:02:06 media root: ACPI group processor / action LNXCPU:02 is not defined

 

It starts at this time and continues until you grab your diagnostics the following day. It looks alarming, but a little searching turned up this thread. So I'd check for a newer BIOS first (which might also help with the IRQ 16 issue), then look in the BIOS settings for a power/performance tweak, and, failing both, add the dummy handler as suggested there.
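If you want a rough idea of how many of these messages have piled up, something like the following could count them. This is only a sketch: the function name count_unhandled_acpi is made up for illustration, and the default path /var/log/syslog is an assumption about where the live log sits on your box.

```shell
#!/bin/sh
# Count syslog lines of the form "ACPI group ... is not defined".
# Hypothetical helper; /var/log/syslog is an assumed default,
# pass another file (e.g. a syslog from a diagnostics zip) as $1.
count_unhandled_acpi() {
  log="${1:-/var/log/syslog}"
  # grep -c prints the number of matching lines (0 if there are none)
  grep -c 'ACPI group .* is not defined' "$log"
}
```

Calling count_unhandled_acpi with no argument reads the assumed default log; give it a path to check a saved copy instead.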

 

You probably want to delete the old Dynamix beta plugin (/boot/config/plugins/dynamix.plg) as it isn't needed and seems to cause other benign error messages in your syslog.

 

 

Link to comment

I always find it kind of ironic though that while share names are obfuscated, none of the mover logs are.

You can disable mover logging though. Settings - Scheduler - Mover Settings

I don't think the OP was aware of that.

 

I was not aware of that, but now I have turned that setting off. Thanks.

 

I also followed the instructions in the link you posted. Additionally, I deleted the old dynamix.plg.

 

Hopefully that does the trick.

Link to comment

Your diagnostics show the same issues as before. IRQ 16 is being disabled and the ACPI event is not being handled.

 

Apart from deleting the old Dynamix plugin, what changes have you made? The BIOS is unchanged and so is the kernel. If you found a performance tweak in the BIOS it had no effect.

 

You seem to have deleted the post with your old diagnostics attached so the only reference point I have is what I can remember.

Link to comment

I didn't make any BIOS tweak because last I checked, it was up to date.  Additionally, the posters in the thread said trying tweaks had no effect on resolving this issue.  I didn't want to risk messing up the firmware on the mobo if not needed.

 

I followed the instructions in the link you posted.  I created "acpi_handler.sh" and saved it in the config directory.  The code I used was:

 

#!/bin/sh
# Default acpi script that takes an entry for all actions
# limetech - power off via webGui

IFS=${IFS}/
set $@

case "$1" in
  button)
    case "$2" in
#     power) /sbin/powerdown
      power) /usr/local/sbin/powerdown
         ;;
      *) logger "ACPI action $2 is not defined"
         ;;
    esac
    ;;
  *)
#    logger "ACPI group $1 / action $2 is not defined"
    ;;
esac

 

I then opened the "go" file from the root of the flash drive and added the line:

 

cp /boot/config/acpi_handler.sh /etc/acpi

 

I had my doubts about including the "cp" part of the line but I left it in there.

Link to comment

The /etc directory exists only in RAM and therefore it is recreated after a reboot. You need the "cp" to copy the handler script from the boot flash to /etc after every reboot. Problem is the events are still going unhandled so they still clutter your syslog. It isn't a problem as such, but it will eventually fill up the space available for logs.
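For what it's worth, here is a slightly more defensive version of that copy step, written as a small function so it can be tried against a scratch directory first. It is a sketch, not a confirmed fix for the unhandled events: the function name install_acpi_handler is hypothetical, and the chmod is there on the assumption that the copy from the flash drive might not carry an execute bit.

```shell
#!/bin/sh
# Copy the handler from the flash drive into the RAM-backed /etc/acpi.
# Hypothetical helper; both default paths are taken from this thread.
install_acpi_handler() {
  src="${1:-/boot/config/acpi_handler.sh}"
  dest_dir="${2:-/etc/acpi}"
  # Copy under a fixed name, then make sure the result is executable
  # (assumption: the execute bit may be missing after the copy).
  cp "$src" "$dest_dir/acpi_handler.sh" &&
    chmod +x "$dest_dir/acpi_handler.sh"
}
```

In the go file the equivalent would simply be the existing cp line followed by a chmod +x on /etc/acpi/acpi_handler.sh. To keep an eye on the log space mentioned above, df -h /var/log shows how full it is.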

 

Regarding the lack of a newer BIOS, you can do no more. When a newer version of unRAID is released it may contain a newer kernel and that might fix the IRQ issue, or it might not. I haven't been able to work out what is meant to be using IRQ 16 but it doesn't seem to be anything important. So your choice is either to live with it or to replace your hardware. If the system works well enough for you then leave it at that.
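If you are curious which drivers the kernel has registered on that IRQ, /proc/interrupts lists them per line. A minimal sketch for pulling out one row follows; the function name irq_users is made up, and the optional second argument exists only so it can be run against a saved copy of the file.

```shell
#!/bin/sh
# Print the /proc/interrupts row for one IRQ number.
# Hypothetical helper: $1 is the IRQ number, $2 an optional file path.
irq_users() {
  irq="$1"
  file="${2:-/proc/interrupts}"
  # The first field of each row is the IRQ number followed by a colon,
  # so compare it exactly to avoid matching e.g. 116 when asking for 16.
  awk -v irq="$irq" '$1 == (irq ":") { print }' "$file"
}
```

Running irq_users 16 on the server prints the counters and the device names sharing IRQ 16, if any are registered.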

 

Link to comment

Archived

This topic is now archived and is closed to further replies.
