Help Troubleshooting Recurring Crashes



I've been having problems with my server recently (last 3-4 months). It will run for a number of weeks and then become unresponsive (no web GUI, no ssh access), requiring a hard reset by holding the power button. I captured system logs using the troubleshooting mode of the Fix Common Problems plugin. Syslog files and diagnostic files are attached.

 

Any help would be greatly appreciated. Thanks very much.

syslog-1529765331

bunker-diagnostics-20180630-2104.zip

syslog-1530417311

syslog-1529984970

Link to comment

Same here 

 

and here

 

 

I don't have a solution yet; I'm still looking for one. Cross-posting the threads together means that if a solution gets posted to one of them, we will all be able to find it.

 

Edited by Parax
Link to comment
On 7/1/2018 at 3:55 AM, shooga said:

I've been having problems with my server recently (last 3-4 months). It will run for a number of weeks and then become unresponsive (no web GUI, no ssh access), requiring a hard reset by holding the power button. I captured system logs using the troubleshooting mode of the Fix Common Problems plugin. Syslog files and diagnostic files are attached.

 

Any help would be greatly appreciated. Thanks very much.

syslog-1529765331

bunker-diagnostics-20180630-2104.zip

syslog-1530417311

syslog-1529984970

99% of one of your syslogs (I really only looked at ...4970, which appears to be an FCP tail) is nothing but one call trace after another of

Jun 30 03:56:56 Bunker kernel: INFO: rcu_sched self-detected stall on CPU
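To put a number on how much of a syslog these messages take up, a small script along these lines could help. This is only a sketch: the function name `summarize_stalls` is made up for illustration, and it assumes the stall lines look exactly like the one quoted above and that each line starts with the usual `Mon DD` syslog timestamp.

```python
import re
import sys
from collections import Counter

# Matches the message quoted above; the exact wording is assumed to be stable.
STALL_RE = re.compile(r"rcu_sched self-detected stall on CPU")


def summarize_stalls(lines):
    """Return (non-empty line count, stall line count, stalls per day)."""
    total = 0
    stalls = 0
    per_day = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        total += 1
        if STALL_RE.search(line):
            stalls += 1
            # Classic syslog timestamps start with "Mon DD" (6 characters).
            per_day[line[:6]] += 1
    return total, stalls, per_day


if __name__ == "__main__":
    # Usage: python stalls.py <path-to-syslog>
    with open(sys.argv[1], errors="replace") as fh:
        total, stalls, per_day = summarize_stalls(fh)
    print(f"{stalls} stall messages in {total} non-empty lines")
    for day, count in per_day.items():
        print(f"  {day}: {count}")
```

Running it against each attached syslog in turn would show whether the stalls are confined to one file or one day. Note that it counts only the "self-detected stall" header lines, not the call-trace lines that follow each one.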


 

The diagnostics look like they were taken after you rebooted the system on June 30th at ~20:54.

IIRC, rcu_sched self-detected stalls *seemed* to have disappeared from reports around here somewhere around 6.0-rc something. Whether or not this is relevant to your problems, I'm really not sure (when they were happening in early 6.0, I seem to recall that they were more of an annoyance than anything else). Perhaps @eschultz or @limetech might have better insight into this.

 

 

 

Link to comment

@Squid Thanks for the response. FCP didn't save diagnostics, so I had to grab them after the crash. I'm running it again and this time it's saving diagnostics files.

 

Do you think the stall is causing my problem? I saw another thread where you were playing with the stall timeout and a couple other settings. Do you recommend trying that? Or should the update to Unraid have made that obsolete?
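For anyone wanting to check the current value before changing anything, here is a minimal sketch. It assumes the "stall timeout" mentioned above is the Linux kernel's `rcu_cpu_stall_timeout` parameter and that the running kernel exposes it under `/sys/module/rcupdate/parameters/`; neither is confirmed in this thread. The helper name `read_stall_timeout` is made up for illustration.

```python
from pathlib import Path

# Assumed location of the RCU stall-warning timeout (in seconds).
# Whether this file exists depends on the kernel build.
PARAM = Path("/sys/module/rcupdate/parameters/rcu_cpu_stall_timeout")


def read_stall_timeout(path=PARAM):
    """Return the timeout in seconds, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = read_stall_timeout()
    if value is None:
        print("rcu_cpu_stall_timeout is not readable on this system")
    else:
        print(f"rcu_cpu_stall_timeout = {value} s")
```

As far as I understand it, this value only controls how long the kernel waits before reporting a stall, so changing it would affect the warnings rather than whatever is causing the hang.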

 

Thanks again.

Link to comment
  • 2 weeks later...

I was just on vacation and came home to another crash. I'm attaching the FCPsyslog_tail.txt and diagnostics files. It doesn't look like rcu_sched self-detected stalls are the issue this time; in fact, I don't see what would have caused it.

 

I have a display connected to the server this time. What should I be looking for there? Do I need to be running a specific application or in a specific mode (GUI vs no GUI for example)?

 

Thanks again for any help!

FCPsyslog_tail.txt

bunker-diagnostics-20180706-2332.zip

Link to comment

I decided to try memtest to see if that could be the issue. Ran through 5 passes with no errors, so I think that rules out a memory problem.

 

Any suggestions would be much appreciated. Given the lack of ideas, should I be considering a clean install of Unraid?

Link to comment

Thanks for the response.

 

With the infrequency of crashes (might be weeks between crashes) and the way I use my server, running in safe mode to test would largely mean it's out of service for weeks at a time. Not sure I can do that, but I will try culling plugins and dockers back to a bare minimum to see if that helps. If not, I will continue to cull. If this fails then I will try safe mode eventually, but hopefully I can avoid that.

 

Thanks again for the suggestion.

Link to comment
