john014 Posted September 22, 2021
My Unraid server, seemingly at random, stops responding and hangs, and then needs a hard reboot to get up and running again. During a hang there is no output from the local video adapter, the server's network interface does not respond to ping, and nothing running on it responds.
I've tried troubleshooting by mirroring the syslog to the flash drive, but there is nothing of concern in there; in fact, it doesn't seem to log any warnings about the hang. In the latest example, the last log entry is from the 21st at just after 8pm, yet the server was definitely working until midnight that day. This morning I woke up and it was fully hung, requiring a hard reboot, and that is where the log picks up again. The last few log lines are probably me updating a few containers I have running on it.
Could I get some more pointers on how to troubleshoot this, please? Is there a log I am missing somewhere that might be of some use?
Thanks
Attachment: syslog.txt
ChatNoir Posted September 22, 2021
Your diagnostics might provide more information.
john014 Posted September 22, 2021
Hi, yes, I've already looked at some of the logs this generates. However, they all start after the reboot and don't contain anything from before it, unless I'm missing something?
Thanks
EDIT: attached diag logs: svalbard-diagnostics-20210922-0915.zip
ChatNoir Posted September 22, 2021
Yes, you are right: to have logs that survive a crash or reboot, you have to set up a syslog server. However, the diagnostics provide a lot of information about the system, and our experienced users can dig into them and provide suggestions.
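For anyone without a dedicated syslog server, a very small receiver on any other machine can stand in for one while hunting a crash. The following is a minimal sketch, assuming the Unraid box is configured to forward its syslog over UDP to the machine running this script. The names `LOG_PATH`, `format_record`, `SyslogHandler`, and `run_receiver`, and the port 5514, are illustrative choices, not anything from Unraid itself.

```python
import socketserver
from datetime import datetime

# Hypothetical output file on the receiving machine.
LOG_PATH = "remote-syslog.log"


def format_record(source_ip: str, payload: bytes, received: datetime) -> str:
    """Build one log line: receive time, sender address, raw syslog message."""
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    return f"{received.isoformat(timespec='seconds')} {source_ip} {text}"


class SyslogHandler(socketserver.BaseRequestHandler):
    """Append every received datagram to LOG_PATH."""

    def handle(self) -> None:
        # For UDP servers, self.request is a (data, socket) pair.
        data, _sock = self.request
        line = format_record(self.client_address[0], data, datetime.now())
        # Open, append, and close per message so each line is handed to the
        # OS promptly instead of sitting in a long-lived buffer.
        with open(LOG_PATH, "a", encoding="utf-8") as fh:
            fh.write(line + "\n")


def run_receiver(host: str = "0.0.0.0", port: int = 5514) -> None:
    """Listen until interrupted. 514 is the conventional syslog port but
    usually needs elevated privileges; 5514 is an arbitrary alternative."""
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.serve_forever()
```

Calling `run_receiver()` starts the listener. Because the lines are stored on a different machine, the last messages sent before a hang survive the hard reboot, and the receive timestamp shows when the server went quiet.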
john014 Posted September 24, 2021
Sorry to bump, but has any super-duper Unraid person managed to look at my diag logs and point me in the right direction yet?
Thanks
trurl Posted September 24, 2021
Do you have any syslogs from a syslog server?
john014 Posted September 27, 2021
I mirrored the syslog to flash, as I don't have anything else that I can leave on permanently. It looks like the syslog stops just before the server crashes, as there is nothing of interest in there; an excerpt is in the OP.
Tristankin Posted September 29, 2021
Try downgrading to 6.8.3. I had issues with 6.9.x on Intel hardware, with freezes every 1-2 days; I now have 60 days of uptime back on 6.8.3.
john014 Posted September 30, 2021
Thanks for the suggestion, but my server started on 6.9.2, so I'd need pretty hard evidence that there is a bug in 6.9.2 to roll back that far. I also don't know what configs I would have to redo, so it sounds like a big job to me.
Tristankin Posted October 1, 2021
Config stays the same; you may have to redo your cache drives, but that's it. I downgraded by replacing everything on the USB drive except for the config folder (I ran out of rollbacks trying multiple 6.9 versions). It's cheaper than buying replacement hardware if you don't find another fix.
Make sure you have the RAM at the recommended speed for the CPU, disable C-states, and make sure you don't have anything funky with power supply power control, etc.
hamish_18 Posted March 4, 2022
I have the same thing too. It randomly just reboots, with nothing in the logs that pertains to any issue. @john014, did you ever figure out what was causing this? The last log entry for me is:
Mar 4 03:49:46 UnRaid emhttpd: read SMART /dev/sdo
The next entry is from after I powered it back up, when the system was starting. It has done this for quite some time, and it is super annoying. Is there any sort of additional debugging we could enable?
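When the log simply goes quiet before a hang or reboot, it can help to measure where the silences are, since the last line before a long gap is the best clue available. Here is a minimal sketch that scans a saved syslog for gaps, assuming the classic timestamp prefix shown above (`Mar 4 03:49:46`) and English month abbreviations. The function names and the 30-minute threshold are hypothetical, and the year has to be supplied because this timestamp format does not carry one.

```python
import re
from datetime import datetime, timedelta

# Matches the leading "Mon D HH:MM:SS " stamp of a classic syslog line.
STAMP = re.compile(r"^([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s")


def parse_stamp(line, year):
    """Return the line's timestamp as a datetime, or None if it has none."""
    match = STAMP.match(line)
    if match is None:
        return None
    month, day, clock = match.groups()
    try:
        # %b relies on English month names (the default C locale).
        return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_gaps(lines, year, threshold=timedelta(minutes=30)):
    """Return (gap, last_line_before, first_line_after) for each silence
    of at least `threshold` between consecutive timestamped lines."""
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        when = parse_stamp(line, year)
        if when is None:
            continue  # skip lines without a recognizable timestamp
        if prev_time is not None and when - prev_time >= threshold:
            gaps.append((when - prev_time, prev_line.rstrip(), line.rstrip()))
        prev_time, prev_line = when, line
    return gaps
```

For example, `find_gaps(open("syslog.txt"), 2021)` would list every silence of 30 minutes or more in a saved log. A gap does not prove the server was down for all of that time, as the first post in this thread shows, but it marks the last entry written before the trouble.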
no-thanks Posted March 4, 2022
How long should it take to reboot? A similar thing happened to me: I was logged out, cache drives wouldn't show up, in-progress file transfers aborted, and the login popup wouldn't come back up. So I figured I might as well reboot (I'm just getting things set up and transferred over, so the server isn't really in use yet), and now it's been sitting on the REBOOT screen for "1251" seconds (?). That seems excessive, no?
Edit: I just did the "dangerous" thing and hard powered it off. After rebooting, the logs indicated a problem with the attached USB drive I was copying from, and after I disconnected it, it booted fine. This has been a bit of a finicky setup process, but hopefully things smooth out once it's all done.
Squid Posted March 4, 2022
Yeah, something's holding it up. Do you have a monitor / keyboard attached to the system? A reboot should take around 5 minutes max, depending upon what you've got installed.
no-thanks Posted March 4, 2022
Thanks! I do have an attached monitor, but there was nothing on the screen when I switched inputs over. I didn't have my monitor on the Unraid box's output when I rebooted, so I didn't see what may have happened in the process. I haven't gotten the onboard GUI working properly yet.
It seems something happened to the external HDD during the copy process, because now neither Unraid nor my Mac will recognize or mount it. It's possible it died; it was pretty old. Nothing of import on it, fortunately, just some easily replaceable media files.
john014 Posted March 6, 2022
On 3/4/2022 at 3:11 PM, hamish_18 said: "I too have the same thing. Randomly just reboots, nothing in the logs that pertains to any issue. @john014, did you ever figure out what was causing this? [...] Is there any sort of additional debugging we could enable?"
Hi, I think we had different issues, as mine was a lockup. It seemingly fixed itself with no changes made by me. The only thing that would have changed is the Docker containers I have running, but if a container can bring a system to its knees then something's wrong...
trurl Posted March 6, 2022
12 minutes ago, john014 said: "if a container can bring a system to its knees then somethings wrong"
A misconfigured container can fill rootfs and cause unpredictable OS behavior, since the OS lives in rootfs and needs space to work. Any volume mapping that doesn't point to actual host storage is in rootfs.
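One way to test the rootfs theory is to check how full the root filesystem is while the containers are running. This is a minimal sketch, assuming Python 3 is available on the server (it may need to be installed separately); `usage_percent`, `check_rootfs`, and the 80% warning level are illustrative choices, not Unraid settings.

```python
import shutil


def usage_percent(total: int, used: int) -> float:
    """Percentage of the filesystem in use; 0.0 for an empty total."""
    return 100.0 * used / total if total else 0.0


def check_rootfs(path: str = "/", warn_at: float = 80.0) -> str:
    """Report how full the filesystem holding `path` is."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    pct = usage_percent(usage.total, usage.used)
    status = "WARNING" if pct >= warn_at else "ok"
    used_mib = usage.used // 2**20
    total_mib = usage.total // 2**20
    return f"{status}: {path} is {pct:.1f}% full ({used_mib} MiB of {total_mib} MiB)"


if __name__ == "__main__":
    print(check_rootfs())
```

Running it periodically and watching whether the percentage climbs over time would show whether a container is writing into rootfs rather than into a mapped share.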