shooga Posted July 1, 2018
I've been having problems with my server recently (last 3-4 months). It will run for a number of weeks and then become unresponsive (no web GUI, no ssh access), requiring a hard reset by holding the power button. I captured system logs using the troubleshooting mode of the Fix Common Problems plugin. Syslog files and diagnostic files are attached. Any help would be greatly appreciated. Thanks very much.
syslog-1529765331, bunker-diagnostics-20180630-2104.zip, syslog-1530417311, syslog-1529984970
shooga Posted July 2, 2018
No ideas here? Is there more information I can provide? Thanks again!
Parax Posted July 2, 2018 (edited July 2, 2018 by Parax)
Same here and here. I have not found a solution yet and I'm still looking for one. Cross-posting all the threads together means that if a solution gets posted to one of them, we will all be able to find it.
Squid Posted July 2, 2018
On 7/1/2018 at 3:55 AM, shooga said: I've been having problems with my server recently (last 3-4 months). It will run for a number of weeks and then become unresponsive (no web GUI, no ssh access), requiring a hard reset by holding the power button. I captured system logs using the troubleshooting mode of the Fix Common Problems plugin. Syslog files and diagnostic files are attached. Any help would be greatly appreciated. Thanks very much. syslog-1529765331 bunker-diagnostics-20180630-2104.zip syslog-1530417311 syslog-1529984970
99% of one of your syslogs (I really only looked at ...4970, which appears to be an FCP tail) is nothing but one call trace after another of
Jun 30 03:56:56 Bunker kernel: INFO: rcu_sched self-detected stall on CPU
The diagnostics look like they were taken after you rebooted the system on June 30th at ~20:54.
IIRC, rcu self-detected stalls *seemed* to have disappeared from reports around here somewhere around 6.0-rc something. Whether or not this is relevant to your problems, I'm really not sure (when they were happening in early 6.0, I seem to think that they were more of an annoyance than anything else). Perhaps @eschultz or @limetech might have better insight into this.
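For anyone who wants to check how much of a syslog consists of this message, here is a minimal Python sketch. It assumes the syslog is a plain text file with classic "Mon DD HH:MM:SS" timestamps. The function name `count_stalls_by_hour` is made up for illustration, and only the first sample line comes from the post above; the other two are invented placeholders, not lines from the attached logs.

```python
from collections import Counter

# Message text as quoted in the post above.
STALL_MARKER = "rcu_sched self-detected stall on CPU"


def count_stalls_by_hour(lines):
    """Count stall messages, grouped by the 'Mon DD HH' part of the timestamp."""
    counts = Counter()
    for line in lines:
        if STALL_MARKER in line:
            # "Jun 30 03:56:56 ..." -> first 9 characters are "Jun 30 03"
            counts[line[:9]] += 1
    return counts


# Sample input. Only the first line is taken from the thread;
# the other two are invented placeholders for illustration.
sample = [
    "Jun 30 03:56:56 Bunker kernel: INFO: rcu_sched self-detected stall on CPU",
    "Jun 30 03:57:10 Bunker kernel: placeholder line that is not a stall report",
    "Jun 30 04:01:02 Bunker kernel: INFO: rcu_sched self-detected stall on CPU",
]

stalls = count_stalls_by_hour(sample)
total = sum(stalls.values())
print(f"{total} of {len(sample)} lines are stall reports")
for hour, n in sorted(stalls.items()):
    print(hour, n)
```

To run it against a real file, pass an open file object instead of the sample list, for example `count_stalls_by_hour(open(path, errors="replace"))`, where `path` is wherever the syslog was saved.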
shooga Posted July 3, 2018
@Squid Thanks for the response. FCP didn't save diagnostics, so I had to grab them after the crash. I'm running it again and this time it's saving diagnostics files. Do you think the stall is causing my problem? I saw another thread where you were playing with the stall timeout and a couple of other settings. Do you recommend trying that? Or should the update to Unraid have made that obsolete? Thanks again.
Squid Posted July 3, 2018
Updates should have made it obsolete...
shooga Posted July 12, 2018
I was just on vacation and came home to another crash. I'm attaching the FCPsyslog_tail.txt and diagnostics files. It doesn't look like rcu self-detected stalls are the issue this time; in fact, I don't see what would have caused it. I have a display connected to the server this time. What should I be looking for there? Do I need to be running a specific application or in a specific mode (GUI vs no GUI, for example)? Thanks again for any help!
FCPsyslog_tail.txt, bunker-diagnostics-20180706-2332.zip
shooga Posted July 14, 2018
I decided to try memtest to see if that could be the issue. It ran through 5 passes with no errors, so I think that rules out a memory problem. Any suggestions would be much appreciated. Given the lack of ideas, should I be considering a clean install of Unraid?
JorgeB Posted July 14, 2018
Have you tried running in safe mode without any dockers or plugins? If it's stable, uninstall all plugins and then install them one at a time. Similarly for dockers: turn them on one at a time and wait to see if the server remains stable before turning on the next one.
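The one-at-a-time procedure can be written down as a short Python sketch. Everything here is hypothetical: `find_first_unstable`, the `stays_stable` callback, and the plugin names are placeholders. In practice `stays_stable` is a person watching the server for longer than the usual gap between crashes, not a function call.

```python
def find_first_unstable(items, stays_stable):
    """Enable items one at a time and return the first one after which
    the server is no longer stable, or None if all of them are fine.

    stays_stable is a hypothetical callback that receives the list of
    currently enabled items and reports whether the server survived
    the observation period.
    """
    enabled = []
    for item in items:
        enabled.append(item)
        if not stays_stable(enabled):
            return item
    return None


# Simulated run with placeholder names: pretend "plugin-b" causes the crash.
plugins = ["plugin-a", "plugin-b", "plugin-c"]
culprit = find_first_unstable(plugins, lambda enabled: "plugin-b" not in enabled)
print(culprit)
```

The catch is the cost of each step: with crashes a number of weeks apart, every observation period has to run at least that long before the next item is turned on.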
shooga Posted July 14, 2018
Thanks for the response. With the infrequency of crashes (might be weeks between crashes) and the way I use my server, running in safe mode to test would largely mean it's out of service for weeks at a time. Not sure I can do that, but I will try culling plugins and dockers back to a bare minimum to see if that helps. If not, I will continue to cull. If this fails then I will try safe mode eventually, but hopefully I can avoid that. Thanks again for the suggestion.